online_sv.sql_in
00001 /* ----------------------------------------------------------------------- *//** 
00002  *
00003  * @file online_sv.sql_in
00004  *
00005  * @brief SQL functions for support vector machines
00006  * @sa For an introduction to Support vector machines (SVMs) and related kernel
00007  *     methods, see the module description \ref grp_kernmach.
00008  *
00009  *//* ------------------------------------------------------------------------*/
00010 
00011 m4_include(`SQLCommon.m4')
00012 
00013 /**
00014 @addtogroup grp_kernmach
00015 
00016 @about
00017 
00018 Support vector machines (SVMs) and related kernel methods have been one of 
00019 the most popular and well-studied machine learning techniques of the 
00020 past 15 years, with an amazing number of innovations and applications.
00021 
00022 In a nutshell, an SVM model \f$ f(\boldsymbol x) \f$ takes the form of
00023 \f[
00024     f(\boldsymbol x) = \sum_i \alpha_i k(\boldsymbol x_i, \boldsymbol x),
00025 \f]
00026 where each \f$ \alpha_i \f$ is a real number, each \f$ \boldsymbol x_i \f$ is a
00027 data point from the training set (called a support vector), and
00028 \f$ k(\cdot, \cdot) \f$ is a kernel function that measures how "similar" two
00029 objects are. In regression, \f$ f(\boldsymbol x) \f$ is the regression function
00030 we seek. In classification, \f$ f(\boldsymbol x) \f$ serves as
00031 the decision boundary; so for example in binary classification, the predictor 
00032 can output class 1 for object \f$ \boldsymbol x \f$ if \f$ f(\boldsymbol x) \geq 0 \f$, and class
00033 2 otherwise.
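
As a minimal sketch of the model form described above (illustrative Python only, not the module's C implementation; the names <tt>dot</tt>, <tt>predict</tt>, <tt>classify</tt>, <tt>alphas</tt> and <tt>support_vectors</tt> are hypothetical), the function \f$ f \f$ and the binary decision rule could be written as:

```python
def dot(x, y):
    """Standard inner product, used here as the kernel k."""
    return sum(a * b for a, b in zip(x, y))

def predict(alphas, support_vectors, x, kernel=dot):
    """Evaluate f(x) = sum_i alpha_i * k(x_i, x)."""
    return sum(a * kernel(sv, x) for a, sv in zip(alphas, support_vectors))

def classify(alphas, support_vectors, x, kernel=dot):
    """Binary rule from the text: class 1 if f(x) >= 0, class 2 otherwise."""
    return 1 if predict(alphas, support_vectors, x, kernel) >= 0 else 2

# Hypothetical toy model with two support vectors.
alphas = [0.5, -1.0]
support_vectors = [[1.0, 2.0], [3.0, 0.0]]
```

Any function of two vectors that returns a number can be passed as <tt>kernel</tt>, which mirrors how the learning functions in this module accept a kernel function by name.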
00034 
00035 In the case when the kernel function \f$ k(\cdot, \cdot) \f$ is the standard
00036 inner product on vectors, \f$ f(\boldsymbol x) \f$ is just an alternative way of
00037 writing a linear function
00038 \f[
00039     f'(\boldsymbol x) = \langle \boldsymbol w, \boldsymbol x \rangle,
00040 \f]
00041 where \f$ \boldsymbol w \f$ is a weight vector having the same dimension as
00042 \f$ \boldsymbol x \f$. One of the key points of SVMs is that we can use more
00043 sophisticated kernel functions to efficiently learn linear models in high-dimensional
00044 feature spaces, since \f$ k(\boldsymbol x_i, \boldsymbol x_j) \f$ can be
00045 understood as an efficient way of computing an inner product in the feature
00046 space:
00047 \f[
00048     k(\boldsymbol x_i, \boldsymbol x_j)
00049     =   \langle \phi(\boldsymbol x_i), \phi(\boldsymbol x_j) \rangle,
00050 \f]
00051 where \f$ \phi(\boldsymbol x) \f$ maps \f$ \boldsymbol x \f$ into a
00052 (possibly infinite-dimensional) feature space.
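
The identity above can be checked numerically in a small case. The following sketch (illustrative Python, not part of the module; <tt>poly2_kernel</tt>, <tt>phi</tt> and <tt>inner</tt> are hypothetical names) uses the degree-2 polynomial kernel on 2-dimensional inputs, for which an explicit feature map is known:

```python
import math

def poly2_kernel(x, y):
    """Homogeneous polynomial kernel of degree 2: (<x, y>)^2."""
    return sum(a * b for a, b in zip(x, y)) ** 2

def phi(x):
    """Explicit feature map for 2-D inputs: (x1^2, sqrt(2)*x1*x2, x2^2)."""
    x1, x2 = x
    return [x1 * x1, math.sqrt(2.0) * x1 * x2, x2 * x2]

def inner(u, v):
    """Standard inner product in the 3-D feature space."""
    return sum(a * b for a, b in zip(u, v))

x, y = [1.0, 2.0], [3.0, -1.0]
lhs = poly2_kernel(x, y)      # computed in the 2-D input space
rhs = inner(phi(x), phi(y))   # computed in the 3-D feature space
```

Both values agree up to floating-point rounding, but the kernel never builds the feature vectors explicitly, which is what makes very high-dimensional feature spaces tractable.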
00053 
00054 There are many algorithms for learning kernel machines. This module
00055 implements the class of algorithms for online learning with kernels
00056 described in Kivinen et al. [1]. It also includes the Stochastic
00057 Gradient Descent (SGD) method [3] for learning linear SVMs with the hinge
00058 loss \f$ l(z) = \max(0, 1-z) \f$. See also the book by Scholkopf and Smola [2] for
00059 many more details.
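
To make the hinge loss concrete, here is a minimal sketch of one plain SGD step for a linear SVM without a bias term (illustrative Python; this is a textbook update for the regularised hinge objective, not the scaled-weight scheme of the SGD package this module is based on, and the names <tt>hinge_loss</tt> and <tt>sgd_step</tt> are hypothetical). The parameter names <tt>eta</tt> and <tt>reg</tt> are borrowed from lsvm_classification(); that they play exactly this role in the module is an assumption.

```python
def hinge_loss(z):
    """Hinge loss l(z) = max(0, 1 - z), where z = y * <w, x>."""
    return max(0.0, 1.0 - z)

def sgd_step(w, x, y, eta=0.1, reg=0.001):
    """One SGD step on (reg/2)*||w||^2 + l(y * <w, x>) for label y in {-1, +1}."""
    margin = y * sum(wi * xi for wi, xi in zip(w, x))
    # The regulariser always shrinks the weights slightly.
    shrunk = [(1.0 - eta * reg) * wi for wi in w]
    if margin < 1.0:
        # The example violates the margin, so the hinge loss contributes a gradient.
        return [wi + eta * y * xi for wi, xi in zip(shrunk, x)]
    return shrunk

w1 = sgd_step([0.0, 0.0], [1.0, -2.0], 1.0)
```

Starting from the zero vector, the first example always violates the margin, so <tt>w1</tt> moves in the direction of <tt>y * x</tt>.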
00060 
00061 The SGD implementation is based on Léon Bottou's SGD package
00062 (http://leon.bottou.org/projects/sgd). The methods introduced in [1]
00063 are implemented according to their original descriptions, except that
00064 we only update the support vector model when we make a significant
00065 error. The original algorithms in [1] update the support vector model at
00066 every step, even when no error was made, in the name of
00067 regularisation. For practical purposes, and this is verified
00068 empirically to a certain degree, updating only when necessary is both
00069 faster and better from a learning-theoretic point of view, at least in
00070 the i.i.d. setting.
00071 
00072 Methods for classification, regression and novelty detection are 
00073 available. Multiple instances of the algorithms can be executed 
00074 in parallel on different subsets of the training data. The resultant
00075 support vector models can then be combined using standard techniques
00076 like averaging or majority voting.
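
The two combination schemes mentioned above can be sketched as follows (illustrative Python, assuming each model has already produced a real-valued prediction for the same data point; <tt>average_prediction</tt> and <tt>majority_vote</tt> are hypothetical names, not functions of this module):

```python
def average_prediction(predictions):
    """Combine per-model outputs by taking their mean."""
    return sum(predictions) / len(predictions)

def majority_vote(predictions):
    """Combine per-model outputs by voting on their signs.

    Each model votes +1 if its output is >= 0 and -1 otherwise.
    Ties are broken in favour of +1 (an arbitrary choice for this sketch).
    """
    votes = sum(1 if p >= 0 else -1 for p in predictions)
    return 1 if votes >= 0 else -1
```

Averaging is the natural choice for regression, while voting only makes sense for classification, where the sign of each output carries the decision.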
00077 
00078 Training data points are accessed via a table or a view. The support
00079 vector models can also be stored in tables for fast execution.
00080 
00081 @input
00082 For classification and regression, the training table/view is expected to be of the following form (the array size of <em>ind</em> must not exceed 102,400):\n
00083 <pre>{TABLE|VIEW} <em>input_table</em> (
00084     ...
00085     <em>id</em> INT,
00086     <em>ind</em> FLOAT8[],
00087     <em>label</em> FLOAT8,
00088     ...
00089 )</pre>
00090 For novelty detection, the label field is not required.
00091 
00092 @usage
00093 
00094 - Regression learning is achieved through the following function:
00095   <pre>SELECT \ref svm_regression(
00096     '<em>input_table</em>', '<em>model_table</em>', <em>parallel</em>, '<em>kernel_func</em>', 
00097     <em>verbose DEFAULT false</em>, <em>eta DEFAULT 0.1</em>, <em>nu DEFAULT 0.005</em>, <em>slambda DEFAULT 0.05</em>
00098     );</pre>   
00099 
00100 -  Classification learning is achieved through the following two
00101    functions:
00102      -# Learn linear SVM(s) using SGD [3]:
00103      <pre>SELECT \ref lsvm_classification(
00104     '<em>input_table</em>', '<em>model_table</em>', <em>parallel</em>, 
00105     <em>verbose DEFAULT false</em>, <em>eta DEFAULT 0.1</em>, <em>reg DEFAULT 0.001</em>
00106     );</pre>   
00107      -# Learn linear or non-linear SVM(s) using the method described in [1]:
00108      <pre>SELECT \ref svm_classification(
00109     '<em>input_table</em>', '<em>model_table</em>', <em>parallel</em>, '<em>kernel_func</em>', 
00110     <em>verbose DEFAULT false</em>, <em>eta DEFAULT 0.1</em>, <em>nu DEFAULT 0.005</em>
00111     );</pre>   
00112 
00113 -  Novelty detection is achieved through the following function:
00114     <pre>SELECT \ref svm_novelty_detection(
00115     '<em>input_table</em>', '<em>model_table</em>', <em>parallel</em>, '<em>kernel_func</em>', 
00116     <em>verbose DEFAULT false</em>, <em>eta DEFAULT 0.1</em>, <em>nu DEFAULT 0.005</em>
00117     );</pre>
00118     Assuming the model_table parameter takes the value 'model', each learning function produces two tables 
00119     as output: 'model' and 'model_param'.
00120     The first contains the support vectors of the model(s) learned.
00121     The second contains the parameters of the model(s) learned, including information such as the kernel function
00122     used and the value of the intercept, if there is one.
00123 
00124 - To make predictions on a single data point x using a single model
00125   learned previously, we use the function
00126   <pre>SELECT \ref
00127   svm_predict('<em>model_table</em>',<em>x</em>);</pre>
00128   If the model is produced by the lsvm_classification() function, use
00129   the following prediction function instead
00130   <pre>SELECT \ref
00131   lsvm_predict('<em>model_table</em>',<em>x</em>);</pre>
00132 
00133 - To make predictions on new data points using multiple models
00134   learned in parallel, we use the function
00135   <pre>SELECT \ref
00136   svm_predict_combo('<em>model_table</em>',<em>x</em>);</pre>
00137   If the models are produced by the lsvm_classification() function, use
00138   the following prediction function instead
00139   <pre>SELECT \ref
00140   lsvm_predict_combo('<em>model_table</em>',<em>x</em>);</pre>
00141 
00142 
00143 - Note that, at the moment, we cannot use MADLIB_SCHEMA.svm_predict() and MADLIB_SCHEMA.svm_predict_combo()
00144   on multiple data points. For example, something like the following will fail:
00145   <pre>SELECT \ref svm_predict('<em>model_table</em>',<em>x</em>) FROM data_table;</pre>
00146   Instead, to make predictions on new data points stored in a table using
00147   previously learned models, we use the function:
00148   <pre>SELECT \ref svm_predict_batch('<em>input_table</em>', '<em>data_col</em>', '<em>id_col</em>', '<em>model_table</em>', '<em>output_table</em>', <em>parallel</em>);</pre>
00149   The output_table is created during the function call; an existing table with 
00150   the same name will be dropped.
00151   If the parallel parameter is true, then each data point in the input table will have multiple 
00152   predicted values corresponding to the number of models learned in
00153   parallel.\n\n
00154   Similarly, use the following function for batch prediction if the
00155   model(s) are produced by the lsvm_classification() function:
00156   <pre>SELECT \ref lsvm_predict_batch('<em>input_table</em>', '<em>data_col</em>', '<em>id_col</em>', '<em>model_table</em>','<em>output_table</em>', <em>parallel</em>);</pre>
00157   
00158   
00159 
00160 @implementation
00161 
00162 Currently, three kernel functions have been implemented: dot product (\ref svm_dot), polynomial (\ref svm_polynomial) and Gaussian (\ref svm_gaussian) kernels. To use the dot product kernel function,
00163 simply use '<tt><em>MADLIB_SCHEMA.svm_dot</em></tt>' as the <tt>kernel_func</tt> argument, which accepts any function that takes in two float[] and returns a float. To use the polynomial or Gaussian kernels,
00164 a wrapper function is needed since these kernels require additional input parameters (see online_sv.sql_in for input parameters).
00165 
00166 For example, to use the polynomial kernel with degree 2, first create a wrapper function:
00167 <pre>CREATE OR REPLACE FUNCTION mykernel(FLOAT[],FLOAT[]) RETURNS FLOAT AS $$
00168     SELECT \ref svm_polynomial($1,$2,2)
00169 $$ language sql;</pre>
00170 Then call the SVM learning functions with <tt>mykernel</tt> as the argument to <tt>kernel_func</tt>.
00171 <pre>SELECT \ref svm_regression('my_schema.my_train_data', 'mymodel', false, 'mykernel');</pre>
00172 
00173 To drop all tables pertaining to the model, we can use
00174 <pre>SELECT \ref svm_drop_model('model_table');</pre>
00175 
00176 @examp
00177 
00178 As a general first step, we need to prepare and populate an input 
00179 table/view with the following structure:
00180 \code   
00181 TABLE/VIEW my_schema.my_input_table 
00182 (       
00183         id    INT,       -- point ID
00184         ind   FLOAT8[],  -- data point
00185         label FLOAT8     -- label of data point
00186 );
00187 \endcode    
00188      Note: The label field is not required for novelty detection.
00189     
00190 
00191 <strong>Example usage for regression</strong>:
00192      -# We can randomly generate 1000 5-dimensional data points labelled by the simple target function 
00193 \code
00194 t(x) = if x[5] = 10 then 50 else if x[5] = -10 then 50 else 0;
00195 \endcode
00196 and store them in the my_schema.my_train_data table as follows:
00197 \code
00198 sql> select MADLIB_SCHEMA.svm_generate_reg_data('my_schema.my_train_data', 1000, 5);
00199 \endcode
00200      -# We can now learn a regression model and store the resultant model
00201         under the name 'myexp'.
00202 \code
00203 sql> select MADLIB_SCHEMA.svm_regression('my_schema.my_train_data', 'myexp', false, 'MADLIB_SCHEMA.svm_dot');
00204 \endcode
00205      -# We can now start using it to predict the labels of new data points 
00206         as follows:
00207 \code
00208 sql> select MADLIB_SCHEMA.svm_predict('myexp', '{1,2,4,20,10}');
00209 sql> select MADLIB_SCHEMA.svm_predict('myexp', '{1,2,4,20,-10}');
00210 \endcode
00211      -# To learn multiple support vector models, we replace the learning step above by 
00212 \code
00213 sql> select MADLIB_SCHEMA.svm_regression('my_schema.my_train_data', 'myexp', true, 'MADLIB_SCHEMA.svm_dot');
00214 \endcode
00215 The resultant models can be used for prediction as follows:
00216 \code
00217 sql> select * from MADLIB_SCHEMA.svm_predict_combo('myexp', '{1,2,4,20,10}');
00218 \endcode
00219      -# We can also predict the labels of all the data points stored in a table.
00220         For example, we can execute the following:
00221 \code
00222 sql> create table MADLIB_SCHEMA.svm_reg_test ( id int, ind float8[] );
00223 sql> insert into MADLIB_SCHEMA.svm_reg_test (select id, ind from my_schema.my_train_data limit 20);
00224 sql> select MADLIB_SCHEMA.svm_predict_batch('MADLIB_SCHEMA.svm_reg_test', 'ind', 'id', 'myexp', 'MADLIB_SCHEMA.svm_reg_output1', false); 
00225 sql> select * from MADLIB_SCHEMA.svm_reg_output1;
00226 sql> select MADLIB_SCHEMA.svm_predict_batch('MADLIB_SCHEMA.svm_reg_test', 'ind', 'id', 'myexp', 'MADLIB_SCHEMA.svm_reg_output2', true);
00227 sql> select * from MADLIB_SCHEMA.svm_reg_output2;
00228 \endcode 
00229 
00230 <strong>Example usage for classification:</strong>
00231 -# We can randomly generate 2000 5-dimensional data points labelled by the simple
00232 target function 
00233 \code
00234 t(x) = if x[1] > 0 and  x[2] < 0 then 1 else -1;
00235 \endcode
00236 and store them in the my_schema.my_train_data table as follows:
00237 \code 
00238 sql> select MADLIB_SCHEMA.svm_generate_cls_data('my_schema.my_train_data', 2000, 5);
00239 \endcode
00240 -# We can now learn a classification model and store the resultant model
00241 under the name  'myexpc'.
00242 \code
00243 sql> select MADLIB_SCHEMA.svm_classification('my_schema.my_train_data', 'myexpc', false, 'MADLIB_SCHEMA.svm_dot');
00244 \endcode
00245 -# We can now start using it to predict the labels of new data points 
00246 as follows:
00247 \code
00248 sql> select MADLIB_SCHEMA.svm_predict('myexpc', '{10,-2,4,20,10}');
00249 \endcode 
00250 -# To learn multiple support vector models, replace the model-building and prediction steps above by 
00251 \code
00252 sql> select MADLIB_SCHEMA.svm_classification('my_schema.my_train_data', 'myexpc', true, 'MADLIB_SCHEMA.svm_dot');
00253 sql> select * from MADLIB_SCHEMA.svm_predict_combo('myexpc', '{10,-2,4,20,10}');
00254 \endcode
00255 -# To learn a linear support vector model using SGD, replace the model-building and prediction steps above by 
00256 \code
00257 sql> select MADLIB_SCHEMA.lsvm_classification('my_schema.my_train_data', 'myexpc', false);
00258 sql> select MADLIB_SCHEMA.lsvm_predict('myexpc', '{10,-2,4,20,10}');
00259 \endcode
00260 -# To learn multiple linear support vector models using SGD, replace the model-building and prediction steps above by 
00261 \code
00262 sql> select MADLIB_SCHEMA.lsvm_classification('my_schema.my_train_data', 'myexpc', true);
00263 sql> select MADLIB_SCHEMA.lsvm_predict_combo('myexpc', '{10,-2,4,20,10}');
00264 \endcode
00265 
00266 <strong>Example usage for novelty detection:</strong>
00267 -# We can randomly generate 100 2-dimensional data points (the normal cases)
00268 and store them in the my_schema.my_train_data table as follows:
00269 \code
00270 sql> select MADLIB_SCHEMA.svm_generate_nd_data('my_schema.my_train_data', 100, 2);
00271 \endcode
00272 -# Learning and predicting using a single novelty detection model can be done as follows:
00273 \code
00274 sql> select MADLIB_SCHEMA.svm_novelty_detection('my_schema.my_train_data', 'myexpnd', false, 'MADLIB_SCHEMA.svm_dot');
00275 sql> select MADLIB_SCHEMA.svm_predict('myexpnd', '{10,-10}');  
00276 sql> select MADLIB_SCHEMA.svm_predict('myexpnd', '{-1,-1}');  
00277 \endcode
00278 -# Learning and predicting using multiple models can be done as follows:
00279 \code
00280 sql> select MADLIB_SCHEMA.svm_novelty_detection('my_schema.my_train_data', 'myexpnd', true, 'MADLIB_SCHEMA.svm_dot');
00281 sql> select * from MADLIB_SCHEMA.svm_predict_combo('myexpnd', '{10,-10}');  
00282 sql> select * from MADLIB_SCHEMA.svm_predict_combo('myexpnd', '{-1,-1}');  
00283 \endcode
00284 
00285 
00286 @literature
00287 
00288 [1] Jyrki Kivinen, Alexander J. Smola, and Robert C. Williamson: <em>Online
00289     Learning with Kernels</em>, IEEE Transactions on Signal Processing, 52(8),
00290     2165-2176, 2004.
00291 
00292 [2] Bernhard Scholkopf and Alexander J. Smola: <em>Learning with Kernels:
00293     Support Vector Machines, Regularization, Optimization, and Beyond</em>, 
00294     MIT Press, 2002.
00295 
00296 [3] L&eacute;on Bottou: <em>Large-Scale Machine Learning with Stochastic
00297 Gradient Descent</em>, Proceedings of the 19th International
00298 Conference on Computational Statistics, Springer, 2010.
00299     
00300 @sa File online_sv.sql_in documenting the SQL functions.
00301 
00302 @internal
00303 @sa namespace online_sv (documenting the implementation in Python)
00304 @endinternal
00305     
00306 */
00307 
00308 
00309 
00310 -- The following is the structure to record the results of a learning process.
00311 -- We work with arrays of float8 for now; we'll extend the code to work with sparse vectors next.
00312 --
00313 CREATE TYPE MADLIB_SCHEMA.svm_model_rec AS (
00314        inds int,        -- number of individuals processed 
00315        cum_err float8,  -- cumulative error
00316        epsilon float8,  -- the size of the epsilon tube around the hyperplane, adaptively adjusted by algorithm
00317        rho float8,      -- classification margin
00318        b   float8,      -- classifier offset
00319        nsvs int,        -- number of support vectors
00320        ind_dim int,     -- the dimension of the individuals
00321        weights float8[],       -- the weight of the support vectors
00322        individuals float8[],    -- the array of support vectors, represented as a 1-D array
00323        kernel_oid oid   -- OID of kernel function
00324 );
00325 
00326 -- The following is the structure to record the results of the linear SVM sgd algorithm
00327 --
00328 CREATE TYPE MADLIB_SCHEMA.lsvm_sgd_model_rec AS (
00329        weights float8[], -- the weight vector
00330        wdiv float8,      -- scaling factor for the weights
00331        wbias float8,     -- offset/bias of the linear model
00332        ind_dim int,      -- the dimension of the individuals
00333        inds int,         -- number of individuals processed 
00334        cum_err int   -- cumulative error
00335 );
00336 
00337 
00338 -- The following is the return type of a regression learning process
00339 --
00340 CREATE TYPE MADLIB_SCHEMA.svm_reg_result AS (
00341        model_table text, -- table where the model is stored
00342        model_name text,  -- model name
00343        inds int,         -- number of individuals processed 
00344        cum_err float8,   -- cumulative error
00345        epsilon float8,   -- the size of the epsilon tube around the hyperplane, adaptively adjusted by algorithm
00346        b float8,         -- classifier offset
00347        nsvs int          -- number of support vectors
00348 );
00349 
00350 -- The following is the return type of a classification learning process
00351 --
00352 CREATE TYPE MADLIB_SCHEMA.svm_cls_result AS (
00353        model_table text, -- table where the model is stored
00354        model_name text,  -- model name
00355        inds int,         -- number of individuals processed 
00356        cum_err float8,   -- cumulative error
00357        rho float8,       -- classification margin
00358        b float8,         -- classifier offset
00359        nsvs int          -- number of support vectors
00360 );
00361 
00362 -- The following is the return type of a linear classifier learning process
00363 --
00364 CREATE TYPE MADLIB_SCHEMA.lsvm_sgd_result AS (
00365        model_table text, -- table where the model is stored
00366        model_name text,  -- model name
00367        inds int,         -- number of individuals processed 
00368        ind_dim int,      -- the dimension of the individuals
00369        cum_err float8,   -- cumulative error
00370        wdiv float8,      -- scaling factor for the weights
00371        wbias float8      -- classifier offset
00372 );
00373 
00374 -- The following is the return type of a novelty detection learning process
00375 --
00376 CREATE TYPE MADLIB_SCHEMA.svm_nd_result AS (
00377        model_table text, -- table where the model is stored
00378        model_name text,  -- model name
00379        inds int,         -- number of individuals processed 
00380        rho float8,       -- classification margin
00381        nsvs int          -- number of support vectors
00382 );
00383 
00384 -- The type for representing support vectors
00385 --
00386 CREATE TYPE MADLIB_SCHEMA.svm_support_vector AS ( id text, weight float8, sv float8[] );
00387 
00388 
00389 
00390 -- Kernel functions are a generalisation of inner products. 
00391 -- They provide the means by which we can extend linear machines to work in non-linear transformed feature spaces.
00392 -- Here are a few standard kernels: dot product, polynomial kernel, Gaussian kernel.
00393 --
00394 /**
00395  * @brief Dot product kernel function
00396  *
00397  * @param x The data point \f$ \boldsymbol x \f$
00398  * @param y The data point \f$ \boldsymbol y \f$
00399  * @return Returns dot product of the two data points.
00400  *      
00401  */
00402 CREATE OR REPLACE FUNCTION MADLIB_SCHEMA.svm_dot(x float8[], y float8[]) RETURNS float8 
00403 AS 'MODULE_PATHNAME', 'svm_dot' LANGUAGE C IMMUTABLE STRICT;
00404 
00405 /**
00406  * @brief Polynomial kernel function
00407  *
00408  * @param x The data point \f$ \boldsymbol x \f$
00409  * @param y The data point \f$ \boldsymbol y \f$
00410  * @param degree The degree \f$ d \f$
00411  * @return Returns \f$ K(\boldsymbol x,\boldsymbol y)=(\boldsymbol x \cdot \boldsymbol y)^d \f$
00412  *      
00413  */
00414 CREATE OR REPLACE FUNCTION MADLIB_SCHEMA.svm_polynomial(x float8[], y float8[], degree float8) RETURNS float8 
00415 AS 'MODULE_PATHNAME', 'svm_polynomial' LANGUAGE C IMMUTABLE STRICT;
00416 
00417 /**
00418  * @brief Gaussian kernel function
00419  *
00420  * @param x The data point \f$ \boldsymbol x \f$
00421  * @param y The data point \f$ \boldsymbol y \f$
00422  * @param gamma The spread \f$ \gamma \f$
00423  * @return Returns \f$ K(\boldsymbol x,\boldsymbol y)=\exp(-\gamma \| \boldsymbol x - \boldsymbol y \|^2 ) \f$
00424  *      
00425  */
00426 CREATE OR REPLACE FUNCTION MADLIB_SCHEMA.svm_gaussian(x float8[], y float8[], gamma float8) RETURNS float8 
00427 AS 'MODULE_PATHNAME', 'svm_gaussian' LANGUAGE C IMMUTABLE STRICT; 
00428 
00429 CREATE OR REPLACE FUNCTION MADLIB_SCHEMA.svm_predict_sub(int,int,float8[],float8[],float8[],text) RETURNS float8
00430 AS 'MODULE_PATHNAME', 'svm_predict_sub' LANGUAGE C IMMUTABLE STRICT;
00431 
00432 CREATE OR REPLACE FUNCTION MADLIB_SCHEMA.svm_predict(svs MADLIB_SCHEMA.svm_model_rec, ind float8[], kernel text) 
00433 RETURNS float8 AS $$
00434     SELECT MADLIB_SCHEMA.svm_predict_sub($1.nsvs, $1.ind_dim, $1.weights, $1.individuals, $2, $3);
00435 $$ LANGUAGE SQL;
00436 
00437 -- This is the main online support vector regression learning algorithm. 
00438 -- The function updates the support vector model as it processes each new training example.
00439 -- This function is wrapped in an aggregate function to process all the training examples stored in a table.  
00440 --
00441 CREATE OR REPLACE FUNCTION 
00442 MADLIB_SCHEMA.svm_reg_update(svs MADLIB_SCHEMA.svm_model_rec, ind FLOAT8[], label FLOAT8, kernel TEXT, eta FLOAT8, nu FLOAT8, slambda FLOAT8)
00443 RETURNS MADLIB_SCHEMA.svm_model_rec AS 'MODULE_PATHNAME', 'svm_reg_update' LANGUAGE C STRICT;   
00444 
00445 CREATE AGGREGATE MADLIB_SCHEMA.svm_reg_agg(float8[], float8, text, float8, float8, float8) (
00446        sfunc = MADLIB_SCHEMA.svm_reg_update,
00447        stype = MADLIB_SCHEMA.svm_model_rec,
00448        initcond = '(0,0,0,0,0,0,0,{},{},0)'
00449 );
00450 
00451 -- This is the main online support vector classification learning algorithm. 
00452 -- The function updates the support vector model as it processes each new training example.
00453 -- This function is wrapped in an aggregate function to process all the training examples stored in a table.  
00454 --
00455 CREATE OR REPLACE FUNCTION 
00456 MADLIB_SCHEMA.svm_cls_update(svs MADLIB_SCHEMA.svm_model_rec, ind FLOAT8[], label FLOAT8, kernel TEXT, eta FLOAT8, nu FLOAT8)
00457 RETURNS MADLIB_SCHEMA.svm_model_rec AS 'MODULE_PATHNAME', 'svm_cls_update' LANGUAGE C STRICT;   
00458 
00459 CREATE AGGREGATE MADLIB_SCHEMA.svm_cls_agg(float8[], float8, text, float8, float8) (
00460        sfunc = MADLIB_SCHEMA.svm_cls_update,
00461        stype = MADLIB_SCHEMA.svm_model_rec,
00462        initcond = '(0,0,0,0,0,0,0,{},{},0)'
00463 );
00464 
00465 -- This is the main online support vector novelty detection algorithm. 
00466 -- The function updates the support vector model as it processes each new training example.
00467 -- In contrast to classification and regression, the training data points have no labels.
00468 -- This function is wrapped in an aggregate function to process all the training examples stored in a table.  
00469 --
00470 CREATE OR REPLACE FUNCTION 
00471 MADLIB_SCHEMA.svm_nd_update(svs MADLIB_SCHEMA.svm_model_rec, ind FLOAT8[], kernel TEXT, eta FLOAT8, nu FLOAT8)
00472 RETURNS MADLIB_SCHEMA.svm_model_rec AS 'MODULE_PATHNAME', 'svm_nd_update' LANGUAGE C STRICT;   
00473 
00474 CREATE AGGREGATE MADLIB_SCHEMA.svm_nd_agg(float8[], text, float8, float8) (
00475        sfunc = MADLIB_SCHEMA.svm_nd_update,
00476        stype = MADLIB_SCHEMA.svm_model_rec,
00477        initcond = '(0,0,0,0,0,0,0,{},{},0)'
00478 );
00479 
00480 -- This is the SGD algorithm for linear SVMs. 
00481 -- The function updates the linear model (weight vector and bias) as it processes each new training example.
00482 -- This function is wrapped in an aggregate function to process all the training examples stored in a table.  
00483 --
00484 CREATE OR REPLACE FUNCTION 
00485 MADLIB_SCHEMA.lsvm_sgd_update(svs MADLIB_SCHEMA.lsvm_sgd_model_rec, ind FLOAT8[], label FLOAT8, eta FLOAT8, reg FLOAT8)
00486 RETURNS MADLIB_SCHEMA.lsvm_sgd_model_rec AS 'MODULE_PATHNAME', 'lsvm_sgd_update' LANGUAGE C STRICT;   
00487 
00488 CREATE AGGREGATE MADLIB_SCHEMA.lsvm_sgd_agg(float8[], float8, float8, float8) (
00489        sfunc = MADLIB_SCHEMA.lsvm_sgd_update,
00490        stype = MADLIB_SCHEMA.lsvm_sgd_model_rec,
00491        initcond = '({},1,0,0,0,0)'
00492 );
00493 
00494 
00495 -- This function copies the MADLIB_SCHEMA.svm_model_rec held in model_temp_table into model_table, one row per support vector.
00496 --
00497 CREATE OR REPLACE FUNCTION MADLIB_SCHEMA.svm_store_model(model_temp_table TEXT, model_name TEXT, model_table TEXT) RETURNS VOID AS $$
00498 
00499     sql = "SELECT COUNT(*) FROM " + model_temp_table + " WHERE id = \'" + model_name + "\'";
00500     temp = plpy.execute(sql);
00501     if (temp[0]['count'] == 0):
00502         plpy.error("No support vector model with name " + model_name + " found.");
00503 
00504     sql = "SELECT (model).ind_dim, (model).nsvs" \
00505            + " FROM " + model_temp_table + " WHERE id = '" + model_name + "'";
00506     rv = plpy.execute(sql);
00507     myind_dim = rv[0]['ind_dim'];
00508     mynsvs = rv[0]['nsvs'];
00509 
00510     if (mynsvs == 0):
00511         plpy.error("The specified model has no support vectors and was therefore not processed");
00512 
00513     idx = 0;    
00514     for i in range(1,mynsvs+1):
00515         idx = myind_dim * (i-1);
00516         sql = "INSERT INTO " + model_table \
00517                   + " SELECT \'" + model_name + "\', (model).weights[" + str(i) + "], " \
00518                   + "            (model).individuals[(" + str(idx+1) + "):(" + str(idx) + "+" + str(myind_dim) + ")] " \
00519                   + " FROM " + model_temp_table + " WHERE id = \'" + model_name + "\' LIMIT 1";
00520         plpy.execute(sql);            
00521 
00522 $$ LANGUAGE plpythonu;
00523 
00524 /**
00525  * @brief Drops all tables pertaining to a model
00526  *
00527  * @param model_table The table to be dropped.
00528  */
00529 CREATE OR REPLACE FUNCTION MADLIB_SCHEMA.svm_drop_model(model_table TEXT) RETURNS VOID AS $$
00530        plpy.execute("drop table if exists " + model_table)
00531        plpy.execute("drop table if exists " + model_table + "_param")
00532 $$ LANGUAGE plpythonu;
00533 
00534 CREATE TYPE MADLIB_SCHEMA.svm_model_pr AS ( model text, prediction float8 );
00535 
00536 /**
00537  * @brief Evaluates a support-vector model on a given data point
00538  *
00539  * @param model_table The table storing the learned model \f$ f \f$ to be used
00540  * @param ind The data point \f$ \boldsymbol x \f$
00541  * @return This function returns \f$ f(\boldsymbol x) \f$
00542  */
00543 CREATE OR REPLACE FUNCTION 
00544 MADLIB_SCHEMA.svm_predict(model_table text, ind float8[]) RETURNS FLOAT8 AS $$
00545 
00546     PythonFunctionBodyOnly(`kernel_machines', `online_sv')
00547     
00548     # schema_madlib comes from PythonFunctionBodyOnly
00549     return online_sv.svm_predict(model_table, ind);
00550 
00551 $$ LANGUAGE plpythonu;
00552 
00553 /**
00554  * @brief Evaluates multiple support-vector models on a data point
00555  *
00556  * @param model_table The table storing the learned models to be used.
00557  * @param ind The data point \f$ \boldsymbol x \f$
00558  * @return This function returns a table, a row for each model.
00559  *      Moreover, the last row contains the average value, over all models.
00560  *
00561  * The different models are assumed to be named <tt><em>model_table</em>1</tt>,
00562  * <tt><em>model_table</em>2</tt>, ....
00563  */
00564 CREATE OR REPLACE FUNCTION
00565 MADLIB_SCHEMA.svm_predict_combo(model_table text, ind float8[]) RETURNS SETOF MADLIB_SCHEMA.svm_model_pr AS $$
00566 
00567     PythonFunctionBodyOnly(`kernel_machines', `online_sv')
00568     
00569     # schema_madlib comes from PythonFunctionBodyOnly
00570     return online_sv.svm_predict_combo( schema_madlib, model_table, ind);
00571 
00572 $$ LANGUAGE plpythonu;
00573 
00574 
00575 /**
00576  * @brief This is the support vector regression function
00577  *
00578  * @param input_table The name of the table/view with the training data
00579  * @param model_table The name of the table under which we want to store the learned model
00580  * @param parallel A flag indicating whether the system should learn multiple models in parallel
00581  * @param kernel_func Kernel function
00582  * @return A summary of the learning process
00583  *
00584  * @internal 
00585  * @sa This function is a wrapper for online_sv::svm_regression().
00586  */
00587 CREATE OR REPLACE FUNCTION 
00588 MADLIB_SCHEMA.svm_regression(input_table text, model_table text,  parallel bool, kernel_func text)
00589 RETURNS SETOF MADLIB_SCHEMA.svm_reg_result
00590 AS $$
00591 
00592     PythonFunctionBodyOnly(`kernel_machines', `online_sv')
00593     
00594     # schema_madlib comes from PythonFunctionBodyOnly
00595     return online_sv.svm_regression( schema_madlib, input_table, model_table, parallel, kernel_func);   
00596 
00597 $$ LANGUAGE 'plpythonu';

/**
 * @brief This is the support vector regression function
 *
 * @param input_table The name of the table/view with the training data
 * @param model_table The name of the table under which we want to store the learned model
 * @param parallel A flag indicating whether the system should learn multiple models in parallel
 * @param kernel_func Kernel function
 * @param verbose Verbosity of reporting
 * @param eta Learning rate in (0,1]
 * @param nu Compression parameter in (0,1] associated with the fraction of training data that will become support vectors
 * @param slambda Regularization parameter
 * @return A summary of the learning process
 *
 * @internal
 * @sa This function is a wrapper for online_sv::svm_regression().
 */
CREATE OR REPLACE FUNCTION
MADLIB_SCHEMA.svm_regression(input_table text, model_table text, parallel bool, kernel_func text, verbose bool, eta float8, nu float8, slambda float8)
RETURNS SETOF MADLIB_SCHEMA.svm_reg_result
AS $$

    PythonFunctionBodyOnly(`kernel_machines', `online_sv')

    # schema_madlib comes from PythonFunctionBodyOnly
    return online_sv.svm_regression( schema_madlib, input_table, model_table, parallel, kernel_func, verbose, eta, nu, slambda);

$$ LANGUAGE 'plpythonu';
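
-- Usage sketch for the two svm_regression signatures above (illustrative only,
-- not part of the original file). Assumptions: the library is installed in a
-- schema named madlib; my_reg_data is a hypothetical training table with
-- columns (id int, ind float8[], label float8); madlib.svm_dot is assumed to
-- be a dot-product kernel defined elsewhere in this module. The values passed
-- for eta, nu and slambda are arbitrary picks, not recommendations.
--
--   SELECT * FROM madlib.svm_regression('my_reg_data', 'my_reg_model', false, 'madlib.svm_dot');
--   SELECT * FROM madlib.svm_regression('my_reg_data', 'my_reg_model_v', false, 'madlib.svm_dot',
--                                       true, 0.1, 0.005, 0.05);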

/**
 * @brief This is the support vector classification function
 *
 * @param input_table The name of the table/view with the training data
 * @param model_table The name of the table under which we want to store the learned model
 * @param parallel A flag indicating whether the system should learn multiple models in parallel
 * @param kernel_func Kernel function
 * @return A summary of the learning process
 *
 * @internal
 * @sa This function is a wrapper for online_sv::svm_classification().
 */
CREATE OR REPLACE FUNCTION
MADLIB_SCHEMA.svm_classification(input_table text, model_table text, parallel bool, kernel_func text)
RETURNS SETOF MADLIB_SCHEMA.svm_cls_result
AS $$

    PythonFunctionBodyOnly(`kernel_machines', `online_sv')

    # schema_madlib comes from PythonFunctionBodyOnly
    return online_sv.svm_classification( schema_madlib, input_table, model_table, parallel, kernel_func);

$$ LANGUAGE 'plpythonu';

/**
 * @brief This is the support vector classification function
 *
 * @param input_table The name of the table/view with the training data
 * @param model_table The name of the table under which we want to store the learned model
 * @param parallel A flag indicating whether the system should learn multiple models in parallel
 * @param kernel_func Kernel function
 * @param verbose Verbosity of reporting
 * @param eta Learning rate in (0,1]
 * @param nu Compression parameter in (0,1] associated with the fraction of training data that will become support vectors
 * @return A summary of the learning process
 *
 * @internal
 * @sa This function is a wrapper for online_sv::svm_classification().
 */
CREATE OR REPLACE FUNCTION
MADLIB_SCHEMA.svm_classification(input_table text, model_table text, parallel bool, kernel_func text, verbose bool, eta float8, nu float8)
RETURNS SETOF MADLIB_SCHEMA.svm_cls_result
AS $$

    PythonFunctionBodyOnly(`kernel_machines', `online_sv')

    # schema_madlib comes from PythonFunctionBodyOnly
    return online_sv.svm_classification( schema_madlib, input_table, model_table, parallel, kernel_func, verbose, eta, nu);

$$ LANGUAGE 'plpythonu';
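
-- Usage sketch for svm_classification (illustrative only, not part of the
-- original file). Assumptions: schema name madlib; my_cls_data is a
-- hypothetical table with columns (id int, ind float8[], label float8) whose
-- labels are 1 or -1; madlib.svm_dot is assumed to be a kernel defined
-- elsewhere in this module. The eta and nu values are arbitrary picks from
-- the documented range (0,1].
--
--   SELECT * FROM madlib.svm_classification('my_cls_data', 'my_cls_model', false, 'madlib.svm_dot');
--   SELECT * FROM madlib.svm_classification('my_cls_data', 'my_cls_model_v', false, 'madlib.svm_dot',
--                                           true, 0.5, 0.1);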

/**
 * @brief This is the support vector novelty detection function.
 *
 * @param input_table The name of the table/view with the training data
 * @param model_table The name of the table under which we want to store the learned model
 * @param parallel A flag indicating whether the system should learn multiple models in parallel
 * @param kernel_func Kernel function
 * @return A summary of the learning process
 *
 * @internal
 * @sa This function is a wrapper for online_sv::svm_novelty_detection().
 */
CREATE OR REPLACE FUNCTION
MADLIB_SCHEMA.svm_novelty_detection(input_table text, model_table text, parallel bool, kernel_func text)
RETURNS SETOF MADLIB_SCHEMA.svm_nd_result
AS $$

    PythonFunctionBodyOnly(`kernel_machines', `online_sv')

    # schema_madlib comes from PythonFunctionBodyOnly
    return online_sv.svm_novelty_detection( schema_madlib, input_table, model_table, parallel, kernel_func);

$$ LANGUAGE 'plpythonu';

/**
 * @brief This is the support vector novelty detection function.
 *
 * @param input_table The name of the table/view with the training data
 * @param model_table The name of the table under which we want to store the learned model
 * @param parallel A flag indicating whether the system should learn multiple models in parallel
 * @param kernel_func Kernel function
 * @param verbose Verbosity of reporting
 * @param eta Learning rate in (0,1]
 * @param nu Compression parameter in (0,1] associated with the fraction of training data that will become support vectors
 * @return A summary of the learning process
 *
 * @internal
 * @sa This function is a wrapper for online_sv::svm_novelty_detection().
 */
CREATE OR REPLACE FUNCTION
MADLIB_SCHEMA.svm_novelty_detection(input_table text, model_table text, parallel bool, kernel_func text, verbose bool, eta float8, nu float8)
RETURNS SETOF MADLIB_SCHEMA.svm_nd_result
AS $$

    PythonFunctionBodyOnly(`kernel_machines', `online_sv')

    # schema_madlib comes from PythonFunctionBodyOnly
    return online_sv.svm_novelty_detection( schema_madlib, input_table, model_table, parallel, kernel_func, verbose, eta, nu);

$$ LANGUAGE 'plpythonu';
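
-- Usage sketch for svm_novelty_detection (illustrative only, not part of the
-- original file). Assumptions: schema name madlib; my_nd_data is a
-- hypothetical table with columns (id int, ind float8[]) and no label column;
-- madlib.svm_dot is assumed to be a kernel defined elsewhere in this module.
-- The eta and nu values are arbitrary picks from the documented range (0,1].
--
--   SELECT * FROM madlib.svm_novelty_detection('my_nd_data', 'my_nd_model', false, 'madlib.svm_dot');
--   SELECT * FROM madlib.svm_novelty_detection('my_nd_data', 'my_nd_model_v', false, 'madlib.svm_dot',
--                                              true, 0.1, 0.01);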

/**
 * @brief Scores the data points stored in a table using a learned support-vector model
 *
 * @param input_table Name of table/view containing the data points to be scored
 * @param data_col Name of column in input_table containing the data points
 * @param id_col Name of column in input_table containing the integer identifier of data points
 * @param model_table Name of table where the learned model to be used is stored
 * @param output_table Name of table to store the results
 * @param parallel A flag indicating whether the model to be used was learned in parallel
 * @return Textual summary of the algorithm run
 *
 * @internal
 * @sa This function is a wrapper for online_sv::svm_predict_batch().
 */
CREATE OR REPLACE FUNCTION
MADLIB_SCHEMA.svm_predict_batch(input_table text, data_col text, id_col text, model_table text, output_table text, parallel bool)
RETURNS TEXT
AS $$

    PythonFunctionBodyOnly(`kernel_machines', `online_sv')

    # schema_madlib comes from PythonFunctionBodyOnly
    return online_sv.svm_predict_batch( input_table, data_col, id_col, model_table, output_table, parallel);

$$ LANGUAGE 'plpythonu';
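
-- Usage sketch for svm_predict_batch (illustrative only, not part of the
-- original file). Assumptions: schema name madlib; my_reg_data is a
-- hypothetical table whose data points are in column ind and whose integer
-- identifiers are in column id; my_reg_model is a hypothetical model that was
-- learned with parallel = false, so the same flag is passed here. The layout
-- of the output table is not described in this section, so it is inspected
-- with SELECT * rather than by column name.
--
--   SELECT madlib.svm_predict_batch('my_reg_data', 'ind', 'id', 'my_reg_model', 'my_reg_scores', false);
--   SELECT * FROM my_reg_scores LIMIT 10;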

-- Generate artificial training data
CREATE OR REPLACE FUNCTION MADLIB_SCHEMA.__svm_random_ind(d INT) RETURNS float8[] AS $$
DECLARE
    ret float8[];
BEGIN
    FOR i IN 1..(d-1) LOOP
        ret[i] = RANDOM() * 40 - 20;
    END LOOP;
    IF (RANDOM() > 0.5) THEN
        ret[d] = 10;
    ELSE
        ret[d] = -10;
    END IF;
    RETURN ret;
END
$$ LANGUAGE plpgsql;

CREATE OR REPLACE FUNCTION MADLIB_SCHEMA.__svm_random_ind2(d INT) RETURNS float8[] AS $$
DECLARE
    ret float8[];
BEGIN
    FOR i IN 1..d LOOP
        ret[i] = RANDOM() * 5 + 10;
        IF (RANDOM() > 0.5) THEN ret[i] = -ret[i]; END IF;
    END LOOP;
    RETURN ret;
END
$$ LANGUAGE plpgsql;

CREATE OR REPLACE FUNCTION MADLIB_SCHEMA.svm_generate_reg_data(output_table text, num int, dim int) RETURNS VOID AS $$
    plpy.execute("drop table if exists " + output_table)
    plpy.execute("create table " + output_table + " ( id int, ind float8[], label float8 ) m4_ifdef(`__GREENPLUM__', `distributed by (id)')")
    plpy.execute("INSERT INTO " + output_table + " SELECT a.val, MADLIB_SCHEMA.__svm_random_ind(" + str(dim) + "), 0 FROM (SELECT generate_series(1," + str(num) + ") AS val) AS a")
    plpy.execute("UPDATE " + output_table + " SET label = MADLIB_SCHEMA.__svm_target_reg_func(ind)")
$$ LANGUAGE 'plpythonu';

CREATE OR REPLACE FUNCTION MADLIB_SCHEMA.__svm_target_reg_func(ind float8[]) RETURNS float8 AS $$
DECLARE
    dim int;
BEGIN
    dim = array_upper(ind,1);
    IF (ind[dim] = 10) THEN RETURN 50; END IF;
    RETURN -50;
END
$$ LANGUAGE plpgsql;

CREATE OR REPLACE FUNCTION MADLIB_SCHEMA.svm_generate_cls_data(output_table text, num int, dim int) RETURNS VOID AS $$
    plpy.execute("drop table if exists " + output_table);
    plpy.execute("create table " + output_table + " ( id int, ind float8[], label float8 ) m4_ifdef(`__GREENPLUM__', `distributed by (id)')")
    plpy.execute("INSERT INTO " + output_table + " SELECT a.val, MADLIB_SCHEMA.__svm_random_ind(" + str(dim) + "), 0 FROM (SELECT generate_series(1," + str(num) + ") AS val) AS a")
    plpy.execute("UPDATE " + output_table + " SET label = MADLIB_SCHEMA.__svm_target_cl_func(ind)")
$$ LANGUAGE 'plpythonu';

CREATE OR REPLACE FUNCTION MADLIB_SCHEMA.__svm_target_cl_func(ind float8[]) RETURNS float8 AS $$
BEGIN
    IF (ind[1] > 0 AND ind[2] < 0) THEN RETURN 1; END IF;
    RETURN -1;
END
$$ LANGUAGE plpgsql;

CREATE OR REPLACE FUNCTION MADLIB_SCHEMA.svm_generate_nd_data(output_table text, num int, dim int) RETURNS VOID AS $$
    plpy.execute("drop table if exists " + output_table);
    plpy.execute("create table " + output_table + " ( id int, ind float8[] ) m4_ifdef(`__GREENPLUM__', `distributed by (id)')")
    plpy.execute("INSERT INTO " + output_table + " SELECT a.val, MADLIB_SCHEMA.__svm_random_ind2(" + str(dim) + ") FROM (SELECT generate_series(1," + str(num) + ") AS val) AS a")
$$ LANGUAGE 'plpythonu';
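
-- Sketch of how the three generators above might be called (illustrative
-- only, not part of the original file). Assumptions: schema name madlib; the
-- table names, row count (1000) and dimension (5) are hypothetical. Note that
-- each generator drops output_table first if it already exists.
--
--   SELECT madlib.svm_generate_reg_data('my_reg_data', 1000, 5);
--   SELECT madlib.svm_generate_cls_data('my_cls_data', 1000, 5);
--   SELECT madlib.svm_generate_nd_data('my_nd_data', 1000, 5);
--
-- Per __svm_target_reg_func, a regression label is 50 when the last coordinate
-- of ind is 10 and -50 otherwise; per __svm_target_cl_func, a classification
-- label is 1 when ind[1] > 0 and ind[2] < 0, and -1 otherwise. A quick sanity
-- check on the generated labels:
--
--   SELECT label, count(*) FROM my_reg_data GROUP BY label;
--   SELECT label, count(*) FROM my_cls_data GROUP BY label;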

/**
 * @brief Normalizes the data stored in a table and saves the normalized data in a new table.
 *
 * The new table is named <tt><em>input_table</em>_scaled</tt>; any existing table of that name is dropped first.
 *
 * @param input_table Name of table/view containing the data points to be normalized
 */
CREATE OR REPLACE FUNCTION MADLIB_SCHEMA.svm_data_normalization(input_table TEXT) RETURNS VOID AS $$
    output_table = input_table + "_scaled";
    plpy.execute("DROP TABLE IF EXISTS " + output_table);
    plpy.execute("CREATE TABLE " + output_table + " ( id int, ind float8[], label int ) m4_ifdef(`__GREENPLUM__', `distributed by (id)')");
    plpy.execute("INSERT INTO " + output_table + " SELECT id, MADLIB_SCHEMA.svm_normalization(ind), label FROM " + input_table);
    plpy.info("output table: %s" % output_table)
$$ LANGUAGE plpythonu;
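
-- Usage sketch for svm_data_normalization (illustrative only, not part of the
-- original file). Assumptions: schema name madlib; my_cls_data is a
-- hypothetical table with columns id, ind and label. As the function body
-- shows, the result goes to a table named after the input with the suffix
-- _scaled, and its label column is declared as int.
--
--   SELECT madlib.svm_data_normalization('my_cls_data');
--   SELECT id, ind, label FROM my_cls_data_scaled LIMIT 5;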

/**
 * @brief This is the linear support vector classification function
 *
 * @param input_table The name of the table/view with the training data
 * @param model_table The name of the table under which we want to store the learned model
 * @param parallel A flag indicating whether the system should learn multiple models in parallel
 * @return A summary of the learning process
 *
 * @internal
 * @sa This function is a wrapper for online_sv::lsvm_classification().
 */
CREATE OR REPLACE FUNCTION
MADLIB_SCHEMA.lsvm_classification(input_table text, model_table text, parallel bool)
RETURNS SETOF MADLIB_SCHEMA.lsvm_sgd_result
AS $$
    PythonFunctionBodyOnly(`kernel_machines', `online_sv')
    # schema_madlib comes from PythonFunctionBodyOnly
    return online_sv.lsvm_classification( schema_madlib, input_table, model_table, parallel);
$$ LANGUAGE 'plpythonu';

/**
 * @brief This is the linear support vector classification function
 *
 * @param input_table The name of the table/view with the training data
 * @param model_table The name of the table under which we want to store the learned model
 * @param parallel A flag indicating whether the system should learn multiple models in parallel
 * @param verbose Verbosity of reporting
 * @param eta Initial learning rate in (0,1]
 * @param reg Regularization parameter, often chosen by cross-validation
 * @return A summary of the learning process
 *
 * @internal
 * @sa This function is a wrapper for online_sv::lsvm_classification().
 */
CREATE OR REPLACE FUNCTION
MADLIB_SCHEMA.lsvm_classification(input_table text, model_table text, parallel bool, verbose bool, eta float8, reg float8)
RETURNS SETOF MADLIB_SCHEMA.lsvm_sgd_result
AS $$

    PythonFunctionBodyOnly(`kernel_machines', `online_sv')

    # schema_madlib comes from PythonFunctionBodyOnly
    return online_sv.lsvm_classification( schema_madlib, input_table, model_table, parallel, verbose, eta, reg);

$$ LANGUAGE 'plpythonu';
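
-- Usage sketch for the two lsvm_classification signatures (illustrative only,
-- not part of the original file). Assumptions: schema name madlib;
-- my_cls_data is a hypothetical table with columns
-- (id int, ind float8[], label float8) and labels 1 or -1. Unlike the kernel
-- variants, no kernel function argument is passed. The eta and reg values are
-- arbitrary picks, not recommendations.
--
--   SELECT * FROM madlib.lsvm_classification('my_cls_data', 'my_lsvm_model', false);
--   SELECT * FROM madlib.lsvm_classification('my_cls_data', 'my_lsvm_model_v', false, true, 0.1, 0.01);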

/**
 * @brief Scores the data points stored in a table using a learned linear support-vector model
 *
 * @param input_table Name of table/view containing the data points to be scored
 * @param data_col Name of column in input_table containing the data points
 * @param id_col Name of column in input_table containing the integer identifier of data points
 * @param model_table Name of table where the learned model to be used is stored
 * @param output_table Name of table to store the results
 * @param parallel A flag indicating whether the model to be used was learned in parallel
 * @return Textual summary of the algorithm run
 *
 * @internal
 * @sa This function is a wrapper for online_sv::lsvm_predict_batch().
 */
CREATE OR REPLACE FUNCTION
MADLIB_SCHEMA.lsvm_predict_batch(input_table text, data_col text, id_col text, model_table text, output_table text, parallel bool)
RETURNS TEXT
AS $$

    PythonFunctionBodyOnly(`kernel_machines', `online_sv')

    # schema_madlib comes from PythonFunctionBodyOnly
    return online_sv.lsvm_predict_batch( schema_madlib, input_table, data_col, id_col, model_table, output_table, parallel);

$$ LANGUAGE 'plpythonu';
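
-- Usage sketch for lsvm_predict_batch (illustrative only, not part of the
-- original file). Assumptions: schema name madlib; my_cls_data holds the data
-- points in column ind and integer identifiers in column id; my_lsvm_model is
-- a hypothetical linear model learned with parallel = false. The layout of
-- the output table is not described in this section, hence SELECT *.
--
--   SELECT madlib.lsvm_predict_batch('my_cls_data', 'ind', 'id', 'my_lsvm_model', 'my_lsvm_scores', false);
--   SELECT * FROM my_lsvm_scores LIMIT 10;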

/**
 * @brief Evaluates a linear support-vector model on a given data point
 *
 * @param model_table The table storing the learned model \f$ f \f$ to be used
 * @param ind The data point \f$ \boldsymbol x \f$
 * @return This function returns \f$ f(\boldsymbol x) \f$
 */
CREATE OR REPLACE FUNCTION
MADLIB_SCHEMA.lsvm_predict(model_table text, ind float8[]) RETURNS FLOAT8 AS $$

    PythonFunctionBodyOnly(`kernel_machines', `online_sv')

    # schema_madlib comes from PythonFunctionBodyOnly
    return online_sv.lsvm_predict(schema_madlib, model_table, ind);

$$ LANGUAGE plpythonu;

/**
 * @brief Evaluates multiple linear support-vector models on a data point
 *
 * @param model_table The table storing the learned models to be used.
 * @param ind The data point \f$ \boldsymbol x \f$
 * @return A table with one row per model, followed by a final row that
 *      contains the average value over all models.
 *
 * The different models are assumed to be named <tt><em>model_table</em>0</tt>,
 * <tt><em>model_table</em>1</tt>, ....
 */
CREATE OR REPLACE FUNCTION
MADLIB_SCHEMA.lsvm_predict_combo(model_table text, ind float8[]) RETURNS SETOF MADLIB_SCHEMA.svm_model_pr AS $$

    PythonFunctionBodyOnly(`kernel_machines', `online_sv')

    # schema_madlib comes from PythonFunctionBodyOnly
    return online_sv.lsvm_predict_combo( schema_madlib, model_table, ind);

$$ LANGUAGE plpythonu;
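
-- Usage sketch for lsvm_predict and lsvm_predict_combo (illustrative only,
-- not part of the original file). Assumptions: schema name madlib;
-- my_lsvm_model is a hypothetical single model learned on 5-dimensional data;
-- my_lsvm_pmodel is a hypothetical set of models learned with parallel = true
-- and named my_lsvm_pmodel0, my_lsvm_pmodel1, ... as described above. The
-- data point is an arbitrary 5-dimensional example.
--
--   -- Single model: returns one float8 value f(x).
--   SELECT madlib.lsvm_predict('my_lsvm_model', '{1,2,4,20,10}'::float8[]);
--
--   -- Several models: one row per model, then a final row with the average.
--   SELECT * FROM madlib.lsvm_predict_combo('my_lsvm_pmodel', '{1,2,4,20,10}'::float8[]);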