MADlib
0.7
User Documentation
/* ----------------------------------------------------------------------- *//**
 *
 * @file online_sv.sql_in
 *
 * @brief SQL functions for support vector machines
 * @sa For an introduction to support vector machines (SVMs) and related kernel
 *     methods, see the module description \ref grp_kernmach.
 *
 *//* ------------------------------------------------------------------------*/

m4_include(`SQLCommon.m4')

/**
@addtogroup grp_kernmach

@about

Support vector machines (SVMs) and related kernel methods have been among
the most popular and well-studied machine learning techniques of the
past 15 years, with a large number of innovations and applications.

In a nutshell, an SVM model \f$ f(\boldsymbol x) \f$ takes the form of
\f[
    f(\boldsymbol x) = \sum_i \alpha_i k(\boldsymbol x_i, \boldsymbol x),
\f]
where each \f$ \alpha_i \f$ is a real number, each \f$ \boldsymbol x_i \f$ is a
data point from the training set (called a support vector), and
\f$ k(\cdot, \cdot) \f$ is a kernel function that measures how "similar" two
objects are. In regression, \f$ f(\boldsymbol x) \f$ is the regression function
we seek. In classification, \f$ f(\boldsymbol x) \f$ serves as
the decision boundary; for example, in binary classification, the predictor
can output class 1 for object \f$ \boldsymbol x \f$ if \f$ f(\boldsymbol x) \geq 0 \f$, and class
2 otherwise.

In the case when the kernel function \f$ k(\cdot, \cdot) \f$ is the standard
inner product on vectors, \f$ f(\boldsymbol x) \f$ is just an alternative way of
writing a linear function
\f[
    f'(\boldsymbol x) = \langle \boldsymbol w, \boldsymbol x \rangle,
\f]
where \f$ \boldsymbol w \f$ is a weight vector having the same dimension as
\f$ \boldsymbol x \f$.
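
To make the linear case concrete, here is a small Python sketch. It is an
illustration only and not part of MADlib: the helper names (<tt>dot</tt>,
<tt>svm_eval</tt>, <tt>collapse_to_weights</tt>) and the toy numbers are made up
for this example. It evaluates the kernel expansion from a list of weights and
support vectors, and checks that with the dot-product kernel the result agrees
with the linear form when the weight vector is taken to be the weighted sum of
the support vectors.

```python
# Illustrative sketch only; these helpers are hypothetical and not MADlib functions.

def dot(x, y):
    """Standard inner product of two equal-length vectors."""
    return sum(a * b for a, b in zip(x, y))

def svm_eval(alphas, support_vectors, x, kernel=dot):
    """Evaluate f(x) = sum_i alpha_i * k(x_i, x)."""
    return sum(a * kernel(sv, x) for a, sv in zip(alphas, support_vectors))

def collapse_to_weights(alphas, support_vectors):
    """For the dot-product kernel, fold the model into w = sum_i alpha_i * x_i."""
    dim = len(support_vectors[0])
    return [sum(a * sv[j] for a, sv in zip(alphas, support_vectors))
            for j in range(dim)]

# Toy model with two support vectors (made-up numbers).
alphas = [0.5, -2.0]
support_vectors = [[1.0, 2.0], [0.0, 1.0]]
x = [3.0, 4.0]

f_kernel = svm_eval(alphas, support_vectors, x)   # kernel expansion
w = collapse_to_weights(alphas, support_vectors)  # equivalent weight vector
f_linear = dot(w, x)                              # linear form <w, x>
```

With a non-linear kernel no such finite weight vector is available in general,
which is why the support vectors themselves have to be kept.
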
One of the key points of SVMs is that we can use more
sophisticated kernel functions to efficiently learn linear models in high-dimensional
feature spaces, since \f$ k(\boldsymbol x_i, \boldsymbol x_j) \f$ can be
understood as an efficient way of computing an inner product in the feature
space:
\f[
    k(\boldsymbol x_i, \boldsymbol x_j)
    = \langle \phi(\boldsymbol x_i), \phi(\boldsymbol x_j) \rangle,
\f]
where \f$ \phi(\boldsymbol x) \f$ maps \f$ \boldsymbol x \f$ into a
(possibly infinite-dimensional) feature space.

There are many algorithms for learning kernel machines. This module
implements the class of algorithms for online learning with kernels
described in Kivinen et al. [1]. It also includes the Stochastic
Gradient Descent (SGD) method [3] for learning linear SVMs with the hinge
loss \f$l(z) = \max(0, 1-z)\f$. See also the book by Schölkopf and Smola [2] for many more
details.

The SGD implementation is based on Léon Bottou's SGD package
(http://leon.bottou.org/projects/sgd). The methods introduced in [1]
are implemented according to their original descriptions, except that
we only update the support vector model when we make a significant
error. The original algorithms in [1] update the support vector model at
every step, even when no error was made, in the name of
regularisation. For practical purposes, and this is verified
empirically to a certain degree, updating only when necessary is both
faster and better from a learning-theoretic point of view, at least in
the i.i.d. setting.

Methods for classification, regression and novelty detection are
available. Multiple instances of the algorithms can be executed
in parallel on different subsets of the training data.
The resultant
support vector models can then be combined using standard techniques
like averaging or majority voting.

Training data points are accessed via a table or a view. The support
vector models can also be stored in tables for fast execution.

@input
For classification and regression, the training table/view is expected to be of the following form (the array size of <em>ind</em> must not be greater than 102,400):\n
<pre>{TABLE|VIEW} <em>input_table</em> (
    ...
    <em>id</em>    INT,
    <em>ind</em>   FLOAT8[],
    <em>label</em> FLOAT8,
    ...
)</pre>
For novelty detection, the label field is not required.

@usage

- Regression learning is achieved through the following function:
  <pre>SELECT \ref svm_regression(
    '<em>input_table</em>', '<em>model_table</em>', <em>parallel</em>, '<em>kernel_func</em>',
    <em>verbose DEFAULT false</em>, <em>eta DEFAULT 0.1</em>, <em>nu DEFAULT 0.005</em>, <em>slambda DEFAULT 0.05</em>
  );</pre>

- Classification learning is achieved through the following two
  functions:
  -# Learn linear SVM(s) using SGD [3]:
  <pre>SELECT \ref lsvm_classification(
    '<em>input_table</em>', '<em>model_table</em>', <em>parallel</em>,
    <em>verbose DEFAULT false</em>, <em>eta DEFAULT 0.1</em>, <em>reg DEFAULT 0.001</em>
  );</pre>
  -# Learn linear or non-linear SVM(s) using the method described in [1]:
  <pre>SELECT \ref svm_classification(
    '<em>input_table</em>', '<em>model_table</em>', <em>parallel</em>, '<em>kernel_func</em>',
    <em>verbose DEFAULT false</em>, <em>eta DEFAULT 0.1</em>, <em>nu DEFAULT 0.005</em>
  );</pre>

- Novelty detection is achieved through the following function:
  <pre>SELECT \ref svm_novelty_detection(
    '<em>input_table</em>', '<em>model_table</em>', <em>parallel</em>, '<em>kernel_func</em>',
    <em>verbose DEFAULT
false</em>, <em>eta DEFAULT 0.1</em>, <em>nu DEFAULT 0.005</em>
  );</pre>
  Assuming the model_table parameter takes the value 'model', each learning function will produce two tables
  as output: 'model' and 'model_param'.
  The first contains the support vectors of the model(s) learned.
  The second contains the parameters of the model(s) learned, including information such as the kernel function
  used and the value of the intercept, if there is one.

- To make predictions on a single data point x using a single model
  learned previously, we use the function
  <pre>SELECT \ref
  svm_predict('<em>model_table</em>',<em>x</em>);</pre>
  If the model was produced by the lsvm_classification() function, use
  the following prediction function instead:
  <pre>SELECT \ref
  lsvm_predict('<em>model_table</em>',<em>x</em>);</pre>

- To make predictions on a single data point x using multiple models
  learned in parallel, we use the function
  <pre>SELECT \ref
  svm_predict_combo('<em>model_table</em>',<em>x</em>);</pre>
  If the models were produced by the lsvm_classification() function, use
  the following prediction function instead:
  <pre>SELECT \ref
  lsvm_predict_combo('<em>model_table</em>',<em>x</em>);</pre>


- Note that, at the moment, we cannot use MADLIB_SCHEMA.svm_predict() and MADLIB_SCHEMA.svm_predict_combo()
  on multiple data points.
For example, something like the following will fail:
  <pre>SELECT \ref svm_predict('<em>model_table</em>',<em>x</em>) FROM data_table;</pre>
  Instead, to make predictions on new data points stored in a table using
  previously learned models, we use the function:
  <pre>SELECT \ref svm_predict_batch('<em>input_table</em>', '<em>data_col</em>', '<em>id_col</em>', '<em>model_table</em>', '<em>output_table</em>', <em>parallel</em>);</pre>
  The output_table is created during the function call; an existing table with
  the same name will be dropped.
  If the parallel parameter is true, then each data point in the input table will have
  one predicted value for each of the models learned in
  parallel.\n\n
  Similarly, use the following function for batch prediction if the
  model(s) were produced by the lsvm_classification() function:
  <pre>SELECT \ref lsvm_predict_batch('<em>input_table</em>', '<em>data_col</em>', '<em>id_col</em>', '<em>model_table</em>','<em>output_table</em>', <em>parallel</em>);</pre>



@implementation

Currently, three kernel functions have been implemented: dot product (\ref svm_dot), polynomial (\ref svm_polynomial) and Gaussian (\ref svm_gaussian) kernels. To use the dot product kernel function,
simply use '<tt><em>MADLIB_SCHEMA.svm_dot</em></tt>' as the <tt>kernel_func</tt> argument, which accepts any function that takes two float[] arguments and returns a float. To use the polynomial or Gaussian kernels,
a wrapper function is needed since these kernels require additional input parameters (see online_sv.sql_in for input parameters).
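
The idea behind such a wrapper can be sketched outside the database. The
following Python snippet is an illustration under stated assumptions, not MADlib
code: the function names are hypothetical, the polynomial kernel follows the
formula documented for svm_polynomial, and the Gaussian kernel uses the standard
squared-distance form, which is assumed here rather than checked against the C
implementation of svm_gaussian. It shows how fixing the extra parameter leaves a
function of exactly two vectors, which is the shape <tt>kernel_func</tt> expects.

```python
import math
from functools import partial

# Hypothetical stand-ins for the three kernels; not the MADlib C implementations.

def dot_kernel(x, y):
    """Dot product of two equal-length vectors."""
    return sum(a * b for a, b in zip(x, y))

def polynomial_kernel(x, y, degree):
    """(x . y) ** degree, as documented for svm_polynomial."""
    return dot_kernel(x, y) ** degree

def gaussian_kernel(x, y, gamma):
    """exp(-gamma * ||x - y||^2); standard form, assumed for illustration."""
    sq_dist = sum((a - b) ** 2 for a, b in zip(x, y))
    return math.exp(-gamma * sq_dist)

# "Wrapper" kernels: the extra parameter is fixed, leaving two vector arguments.
poly2 = partial(polynomial_kernel, degree=2)
rbf = partial(gaussian_kernel, gamma=0.5)

x, y = [1.0, 2.0], [3.0, 4.0]
poly2_value = poly2(x, y)  # (1*3 + 2*4) ** 2
rbf_same = rbf(x, x)       # identical points have zero distance
rbf_diff = rbf(x, y)       # distinct points give a value strictly between 0 and 1
```

In SQL the same effect is obtained by defining a two-argument function that
calls the three-argument kernel with a constant.
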

For example, to use the polynomial kernel with degree 2, first create a wrapper function:
<pre>CREATE OR REPLACE FUNCTION mykernel(FLOAT[],FLOAT[]) RETURNS FLOAT AS $$
    SELECT \ref svm_polynomial($1,$2,2)
$$ language sql;</pre>
Then call the SVM learning functions with <tt>mykernel</tt> as the argument to <tt>kernel_func</tt>.
<pre>SELECT \ref svm_regression('my_schema.my_train_data', 'mymodel', false, 'mykernel');</pre>

To drop all tables pertaining to the model, we can use
<pre>SELECT \ref svm_drop_model('model_table');</pre>

@examp

As a general first step, we need to prepare and populate an input
table/view with the following structure:
\code
TABLE/VIEW my_schema.my_input_table
(
    id INT, -- point ID
    ind FLOAT8[], -- data point
    label FLOAT8 -- label of data point
);
\endcode
Note: The label field is not required for novelty detection.


<strong>Example usage for regression</strong>:
-# We can randomly generate 1000 5-dimensional data points labelled by the simple target function
   \code
   t(x) = if x[5] = 10 then 50 else if x[5] = -10 then 50 else 0;
   \endcode
   and store them in the my_schema.my_train_data table as follows:
   \code
   sql> select MADLIB_SCHEMA.svm_generate_reg_data('my_schema.my_train_data', 1000, 5);
   \endcode
-# We can now learn a regression model and store the resultant model
   under the name 'myexp'.
   \code
   sql> select MADLIB_SCHEMA.svm_regression('my_schema.my_train_data', 'myexp', false, 'MADLIB_SCHEMA.svm_dot');
   \endcode
-# We can now start using it to predict the labels of new data points
   as follows:
   \code
   sql> select MADLIB_SCHEMA.svm_predict('myexp', '{1,2,4,20,10}');
   sql> select MADLIB_SCHEMA.svm_predict('myexp', '{1,2,4,20,-10}');
   \endcode
-# To learn multiple support vector models, we replace the learning step above by
   \code
   sql> select MADLIB_SCHEMA.svm_regression('my_schema.my_train_data', 'myexp', true, 'MADLIB_SCHEMA.svm_dot');
   \endcode
   The resultant models can be used for prediction as follows:
   \code
   sql> select * from MADLIB_SCHEMA.svm_predict_combo('myexp', '{1,2,4,20,10}');
   \endcode
-# We can also predict the labels of all the data points stored in a table.
   For example, we can execute the following:
   \code
   sql> create table MADLIB_SCHEMA.svm_reg_test ( id int, ind float8[] );
   sql> insert into MADLIB_SCHEMA.svm_reg_test (select id, ind from my_schema.my_train_data limit 20);
   sql> select MADLIB_SCHEMA.svm_predict_batch('MADLIB_SCHEMA.svm_reg_test', 'ind', 'id', 'myexp', 'MADLIB_SCHEMA.svm_reg_output1', false);
   sql> select * from MADLIB_SCHEMA.svm_reg_output1;
   sql> select MADLIB_SCHEMA.svm_predict_batch('MADLIB_SCHEMA.svm_reg_test', 'ind', 'id', 'myexp', 'MADLIB_SCHEMA.svm_reg_output2', true);
   sql> select * from MADLIB_SCHEMA.svm_reg_output2;
   \endcode

<strong>Example usage for classification:</strong>
-# We can randomly generate 2000 5-dimensional data points labelled by the simple
   target function
   \code
   t(x) = if x[1] > 0 and x[2] < 0 then 1 else -1;
   \endcode
   and store them in the my_schema.my_train_data table as follows:
   \code
   sql> select MADLIB_SCHEMA.svm_generate_cls_data('my_schema.my_train_data', 2000, 5);
   \endcode
-# 
We can now learn a classification model and store the resultant model
   under the name 'myexpc'.
   \code
   sql> select MADLIB_SCHEMA.svm_classification('my_schema.my_train_data', 'myexpc', false, 'MADLIB_SCHEMA.svm_dot');
   \endcode
-# We can now start using it to predict the labels of new data points
   as follows:
   \code
   sql> select MADLIB_SCHEMA.svm_predict('myexpc', '{10,-2,4,20,10}');
   \endcode
-# To learn multiple support vector models, replace the model-building and prediction steps above by
   \code
   sql> select MADLIB_SCHEMA.svm_classification('my_schema.my_train_data', 'myexpc', true, 'MADLIB_SCHEMA.svm_dot');
   sql> select * from MADLIB_SCHEMA.svm_predict_combo('myexpc', '{10,-2,4,20,10}');
   \endcode
-# To learn a linear support vector model using SGD, replace the model-building and prediction steps above by
   \code
   sql> select MADLIB_SCHEMA.lsvm_classification('my_schema.my_train_data', 'myexpc', false);
   sql> select MADLIB_SCHEMA.lsvm_predict('myexpc', '{10,-2,4,20,10}');
   \endcode
-# To learn multiple linear support vector models using SGD, replace the model-building and prediction steps above by
   \code
   sql> select MADLIB_SCHEMA.lsvm_classification('my_schema.my_train_data', 'myexpc', true);
   sql> select MADLIB_SCHEMA.lsvm_predict_combo('myexpc', '{10,-2,4,20,10}');
   \endcode

<strong>Example usage for novelty detection:</strong>
-# We can randomly generate 100 2-dimensional data points (the normal cases)
   and store them in the my_schema.my_train_data table as follows:
   \code
   sql> select MADLIB_SCHEMA.svm_generate_nd_data('my_schema.my_train_data', 100, 2);
   \endcode
-# Learning and predicting using a single novelty detection model can be done as follows:
   \code
   sql> select MADLIB_SCHEMA.svm_novelty_detection('my_schema.my_train_data', 'myexpnd', false, 'MADLIB_SCHEMA.svm_dot');
   sql> select MADLIB_SCHEMA.svm_predict('myexpnd', '{10,-10}');
   sql> select MADLIB_SCHEMA.svm_predict('myexpnd', '{-1,-1}');
   \endcode
-# Learning and predicting using multiple models can be done as follows:
   \code
   sql> select MADLIB_SCHEMA.svm_novelty_detection('my_schema.my_train_data', 'myexpnd', true, 'MADLIB_SCHEMA.svm_dot');
   sql> select * from MADLIB_SCHEMA.svm_predict_combo('myexpnd', '{10,-10}');
   sql> select * from MADLIB_SCHEMA.svm_predict_combo('myexpnd', '{-1,-1}');
   \endcode


@literature

[1] Jyrki Kivinen, Alexander J. Smola, and Robert C. Williamson: <em>Online
    Learning with Kernels</em>, IEEE Transactions on Signal Processing, 52(8),
    2165-2176, 2004.

[2] Bernhard Schölkopf and Alexander J. Smola: <em>Learning with Kernels:
    Support Vector Machines, Regularization, Optimization, and Beyond</em>,
    MIT Press, 2002.

[3] Léon Bottou: <em>Large-Scale Machine Learning with Stochastic
    Gradient Descent</em>, Proceedings of the 19th International
    Conference on Computational Statistics, Springer, 2010.

@sa File online_sv.sql_in documenting the SQL functions.

@internal
@sa namespace online_sv (documenting the implementation in Python)
@endinternal

*/



-- The following is the structure to record the results of a learning process.
-- We work with arrays of float8 for now; we'll extend the code to work with sparse vectors next.
--
CREATE TYPE MADLIB_SCHEMA.svm_model_rec AS (
    inds int, -- number of individuals processed
    cum_err float8, -- cumulative error
    epsilon float8, -- the size of the epsilon tube around the hyperplane, adaptively adjusted by algorithm
    rho float8, -- classification margin
    b float8, -- classifier offset
    nsvs int, -- number of support vectors
    ind_dim int, -- the dimension of the individuals
    weights float8[], -- the weight of the support vectors
    individuals float8[], -- the array of support vectors, represented as a 1-D array
    kernel_oid oid -- OID of kernel function
);

-- The following is the structure to record the results of the linear SVM sgd algorithm
--
CREATE TYPE MADLIB_SCHEMA.lsvm_sgd_model_rec AS (
    weights float8[], -- the weight vector
    wdiv float8, -- scaling factor for the weights
    wbias float8, -- offset/bias of the linear model
    ind_dim int, -- the dimension of the individuals
    inds int, -- number of individuals processed
    cum_err int -- cumulative error
);


-- The following is the return type of a regression learning process
--
CREATE TYPE MADLIB_SCHEMA.svm_reg_result AS (
    model_table text, -- table where the model is stored
    model_name text, -- model name
    inds int, -- number of individuals processed
    cum_err float8, -- cumulative error
    epsilon float8, -- the size of the epsilon tube around the hyperplane, adaptively adjusted by algorithm
    b float8, -- classifier offset
    nsvs int -- number of support vectors
);

-- The following is the return type of a classification learning process
--
CREATE TYPE MADLIB_SCHEMA.svm_cls_result AS (
    model_table text, -- table where the model is stored
    model_name text, -- model name
    inds int, -- number of individuals processed
    cum_err float8, -- cumulative error
    rho float8, -- classification margin
    b float8, -- classifier offset
    nsvs int -- number of support vectors
);

-- The following is the return type of a linear classifier learning process
--
CREATE TYPE MADLIB_SCHEMA.lsvm_sgd_result AS (
    model_table text, -- table where the model is stored
    model_name text, -- model name
    inds int, -- number of individuals processed
    ind_dim int, -- the dimension of the individuals
    cum_err float8, -- cumulative error
    wdiv float8, -- scaling factor for the weights
    wbias float8 -- classifier offset
);

-- The following is the return type of a novelty detection learning process
--
CREATE TYPE MADLIB_SCHEMA.svm_nd_result AS (
    model_table text, -- table where the model is stored
    model_name text, -- model name
    inds int, -- number of individuals processed
    rho float8, -- classification margin
    nsvs int -- number of support vectors
);

-- The type for representing support vectors
--
CREATE TYPE MADLIB_SCHEMA.svm_support_vector AS ( id text, weight float8, sv float8[] );



-- Kernel functions are a generalisation of inner products.
-- They provide the means by which we can extend linear machines to work in non-linear transformed feature spaces.
-- Here are a few standard kernels: dot product, polynomial kernel, Gaussian kernel.
--
/**
 * @brief Dot product kernel function
 *
 * @param x The data point \f$ \boldsymbol x \f$
 * @param y The data point \f$ \boldsymbol y \f$
 * @return Returns dot product of the two data points.
 *
 */
CREATE OR REPLACE FUNCTION MADLIB_SCHEMA.svm_dot(x float8[], y float8[]) RETURNS float8
AS 'MODULE_PATHNAME', 'svm_dot' LANGUAGE C IMMUTABLE STRICT;

/**
 * @brief Polynomial kernel function
 *
 * @param x The data point \f$ \boldsymbol x \f$
 * @param y The data point \f$ \boldsymbol y \f$
 * @param degree The degree \f$ d \f$
 * @return Returns \f$ K(\boldsymbol x,\boldsymbol y)=(\boldsymbol x \cdot \boldsymbol y)^d \f$
 *
 */
CREATE OR REPLACE FUNCTION MADLIB_SCHEMA.svm_polynomial(x float8[], y float8[], degree float8) RETURNS float8
AS 'MODULE_PATHNAME', 'svm_polynomial' LANGUAGE C IMMUTABLE STRICT;

/**
 * @brief Gaussian kernel function
 *
 * @param x The data point \f$ \boldsymbol x \f$
 * @param y The data point \f$ \boldsymbol y \f$
 * @param gamma The spread \f$ \gamma \f$
 * @return Returns \f$ K(\boldsymbol x,\boldsymbol y)=\exp(-\gamma \| \boldsymbol x - \boldsymbol y \|^2 ) \f$
 *
 */
CREATE OR REPLACE FUNCTION MADLIB_SCHEMA.svm_gaussian(x float8[], y float8[], gamma float8) RETURNS float8
AS 'MODULE_PATHNAME', 'svm_gaussian' LANGUAGE C IMMUTABLE STRICT;

CREATE OR REPLACE FUNCTION MADLIB_SCHEMA.svm_predict_sub(int,int,float8[],float8[],float8[],text) RETURNS float8
AS 'MODULE_PATHNAME', 'svm_predict_sub' LANGUAGE C IMMUTABLE STRICT;

CREATE OR REPLACE FUNCTION MADLIB_SCHEMA.svm_predict(svs MADLIB_SCHEMA.svm_model_rec, ind float8[], kernel text)
RETURNS float8 AS $$
    SELECT MADLIB_SCHEMA.svm_predict_sub($1.nsvs, $1.ind_dim, $1.weights, $1.individuals, $2, $3);
$$ LANGUAGE SQL;

-- This is the main online support vector regression learning algorithm.
-- The function updates the support vector model as it processes each new training example.
-- This function is wrapped in an aggregate function to process all the training examples stored in a table.
--
CREATE OR REPLACE FUNCTION
MADLIB_SCHEMA.svm_reg_update(svs MADLIB_SCHEMA.svm_model_rec, ind FLOAT8[], label FLOAT8, kernel TEXT, eta FLOAT8, nu FLOAT8, slambda FLOAT8)
RETURNS MADLIB_SCHEMA.svm_model_rec AS 'MODULE_PATHNAME', 'svm_reg_update' LANGUAGE C STRICT;

CREATE AGGREGATE MADLIB_SCHEMA.svm_reg_agg(float8[], float8, text, float8, float8, float8) (
    sfunc = MADLIB_SCHEMA.svm_reg_update,
    stype = MADLIB_SCHEMA.svm_model_rec,
    initcond = '(0,0,0,0,0,0,0,{},{},0)'
);

-- This is the main online support vector classification learning algorithm.
-- The function updates the support vector model as it processes each new training example.
-- This function is wrapped in an aggregate function to process all the training examples stored in a table.
--
CREATE OR REPLACE FUNCTION
MADLIB_SCHEMA.svm_cls_update(svs MADLIB_SCHEMA.svm_model_rec, ind FLOAT8[], label FLOAT8, kernel TEXT, eta FLOAT8, nu FLOAT8)
RETURNS MADLIB_SCHEMA.svm_model_rec AS 'MODULE_PATHNAME', 'svm_cls_update' LANGUAGE C STRICT;

CREATE AGGREGATE MADLIB_SCHEMA.svm_cls_agg(float8[], float8, text, float8, float8) (
    sfunc = MADLIB_SCHEMA.svm_cls_update,
    stype = MADLIB_SCHEMA.svm_model_rec,
    initcond = '(0,0,0,0,0,0,0,{},{},0)'
);

-- This is the main online support vector novelty detection algorithm.
-- The function updates the support vector model as it processes each new training example.
-- In contrast to classification and regression, the training data points have no labels.
-- This function is wrapped in an aggregate function to process all the training examples stored in a table.
--
CREATE OR REPLACE FUNCTION
MADLIB_SCHEMA.svm_nd_update(svs MADLIB_SCHEMA.svm_model_rec, ind FLOAT8[], kernel TEXT, eta FLOAT8, nu FLOAT8)
RETURNS MADLIB_SCHEMA.svm_model_rec AS 'MODULE_PATHNAME', 'svm_nd_update' LANGUAGE C STRICT;

CREATE AGGREGATE MADLIB_SCHEMA.svm_nd_agg(float8[], text, float8, float8) (
    sfunc = MADLIB_SCHEMA.svm_nd_update,
    stype = MADLIB_SCHEMA.svm_model_rec,
    initcond = '(0,0,0,0,0,0,0,{},{},0)'
);

-- This is the SGD algorithm for linear SVMs.
-- The function updates the support vector model as it processes each new training example.
-- This function is wrapped in an aggregate function to process all the training examples stored in a table.
--
CREATE OR REPLACE FUNCTION
MADLIB_SCHEMA.lsvm_sgd_update(svs MADLIB_SCHEMA.lsvm_sgd_model_rec, ind FLOAT8[], label FLOAT8, eta FLOAT8, reg FLOAT8)
RETURNS MADLIB_SCHEMA.lsvm_sgd_model_rec AS 'MODULE_PATHNAME', 'lsvm_sgd_update' LANGUAGE C STRICT;

CREATE AGGREGATE MADLIB_SCHEMA.lsvm_sgd_agg(float8[], float8, float8, float8) (
    sfunc = MADLIB_SCHEMA.lsvm_sgd_update,
    stype = MADLIB_SCHEMA.lsvm_sgd_model_rec,
    initcond = '({},1,0,0,0,0)'
);


-- This function stores a MADLIB_SCHEMA.svm_model_rec stored in model_temp_table into the model_table.
--
CREATE OR REPLACE FUNCTION MADLIB_SCHEMA.svm_store_model(model_temp_table TEXT, model_name TEXT, model_table TEXT) RETURNS VOID AS $$

    sql = "SELECT COUNT(*) FROM " + model_temp_table + " WHERE id = \'" + model_name + "\'";
    temp = plpy.execute(sql);
    if (temp[0]['count'] == 0):
        plpy.error("No support vector model with name " + model_name + " found.");

    sql = "SELECT (model).ind_dim, (model).nsvs" \
        + " FROM " + model_temp_table + " WHERE id = '" + model_name + "'";
    rv = plpy.execute(sql);
    myind_dim = rv[0]['ind_dim'];
    mynsvs = rv[0]['nsvs'];

    if (mynsvs == 0):
        plpy.error("The specified model has no support vectors and therefore not processed");

    idx = 0;
    for i in range(1,mynsvs+1):
        idx = myind_dim * (i-1);
        sql = "INSERT INTO " + model_table \
            + " SELECT \'" + model_name + "\', (model).weights[" + str(i) + "], " \
            + " (model).individuals[(" + str(idx+1) + "):(" + str(idx) + "+" + str(myind_dim) + ")] " \
            + " FROM " + model_temp_table + " WHERE id = \'" + model_name + "\' LIMIT 1";
        plpy.execute(sql);

$$ LANGUAGE plpythonu;

/**
 * @brief Drops all tables pertaining to a model
 *
 * @param model_table The table to be dropped.
 */
CREATE OR REPLACE FUNCTION MADLIB_SCHEMA.svm_drop_model(model_table TEXT) RETURNS VOID AS $$
    plpy.execute("drop table if exists " + model_table)
    plpy.execute("drop table if exists " + model_table + "_param")
$$ LANGUAGE plpythonu;

CREATE TYPE MADLIB_SCHEMA.svm_model_pr AS ( model text, prediction float8 );

/**
 * @brief Evaluates a support-vector model on a given data point
 *
 * @param model_table The table storing the learned model \f$ f \f$ to be used
 * @param ind The data point \f$ \boldsymbol x \f$
 * @return This function returns \f$ f(\boldsymbol x) \f$
 */
CREATE OR REPLACE FUNCTION
MADLIB_SCHEMA.svm_predict(model_table text, ind float8[]) RETURNS FLOAT8 AS $$

    PythonFunctionBodyOnly(`kernel_machines', `online_sv')

    # schema_madlib comes from PythonFunctionBodyOnly
    return online_sv.svm_predict(model_table, ind);

$$ LANGUAGE plpythonu;

/**
 * @brief Evaluates multiple support-vector models on a data point
 *
 * @param model_table The table storing the learned models to be used.
 * @param ind The data point \f$ \boldsymbol x \f$
 * @return This function returns a table, a row for each model.
 *         Moreover, the last row contains the average value, over all models.
 *
 * The different models are assumed to be named <tt><em>model_table</em>1</tt>,
 * <tt><em>model_table</em>2</tt>, ....
 */
CREATE OR REPLACE FUNCTION
MADLIB_SCHEMA.svm_predict_combo(model_table text, ind float8[]) RETURNS SETOF MADLIB_SCHEMA.svm_model_pr AS $$

    PythonFunctionBodyOnly(`kernel_machines', `online_sv')

    # schema_madlib comes from PythonFunctionBodyOnly
    return online_sv.svm_predict_combo( schema_madlib, model_table, ind);

$$ LANGUAGE plpythonu;


/**
 * @brief This is the support vector regression function
 *
 * @param input_table The name of the table/view with the training data
 * @param model_table The name of the table under which we want to store the learned model
 * @param parallel A flag indicating whether the system should learn multiple models in parallel
 * @param kernel_func Kernel function
 * @return A summary of the learning process
 *
 * @internal
 * @sa This function is a wrapper for online_sv::svm_regression().
 */
CREATE OR REPLACE FUNCTION
MADLIB_SCHEMA.svm_regression(input_table text, model_table text, parallel bool, kernel_func text)
RETURNS SETOF MADLIB_SCHEMA.svm_reg_result
AS $$

    PythonFunctionBodyOnly(`kernel_machines', `online_sv')

    # schema_madlib comes from PythonFunctionBodyOnly
    return online_sv.svm_regression( schema_madlib, input_table, model_table, parallel, kernel_func);

$$ LANGUAGE 'plpythonu';

/**
 * @brief This is the support vector regression function
 *
 * @param input_table The name of the table/view with the training data
 * @param model_table The name of the table under which we want to store the learned model
 * @param parallel A flag indicating whether the system should learn multiple models in parallel
 * @param kernel_func Kernel function
 * @param verbose Verbosity of reporting
 * @param eta Learning rate in (0,1]
 * @param nu Compression parameter in (0,1] associated with the
fraction of training data that will become support vectors
 * @param slambda Regularisation parameter
 * @return A summary of the learning process
 *
 * @internal
 * @sa This function is a wrapper for online_sv::svm_regression().
 */
CREATE OR REPLACE FUNCTION
MADLIB_SCHEMA.svm_regression(input_table text, model_table text, parallel bool, kernel_func text, verbose bool, eta float8, nu float8, slambda float8)
RETURNS SETOF MADLIB_SCHEMA.svm_reg_result
AS $$

    PythonFunctionBodyOnly(`kernel_machines', `online_sv')

    # schema_madlib comes from PythonFunctionBodyOnly
    return online_sv.svm_regression( schema_madlib, input_table, model_table, parallel, kernel_func, verbose, eta, nu, slambda);

$$ LANGUAGE 'plpythonu';

/**
 * @brief This is the support vector classification function
 *
 * @param input_table The name of the table/view with the training data
 * @param model_table The name of the table under which we want to store the learned model
 * @param parallel A flag indicating whether the system should learn multiple models in parallel
 * @param kernel_func Kernel function
 * @return A summary of the learning process
 *
 * @internal
 * @sa This function is a wrapper for online_sv::svm_classification().
 */
CREATE OR REPLACE FUNCTION
MADLIB_SCHEMA.svm_classification(input_table text, model_table text, parallel bool, kernel_func text)
RETURNS SETOF MADLIB_SCHEMA.svm_cls_result
AS $$

    PythonFunctionBodyOnly(`kernel_machines', `online_sv')

    # schema_madlib comes from PythonFunctionBodyOnly
    return online_sv.svm_classification( schema_madlib, input_table, model_table, parallel, kernel_func);

$$ LANGUAGE 'plpythonu';

/**
 * @brief This is the support vector classification function
 *
 * @param input_table The name of the table/view with the training data
 * @param model_table The name of the table under which we want to store the learned model
 * @param parallel A flag indicating whether the system should learn multiple models in parallel
 * @param kernel_func Kernel function
 * @param verbose Verbosity of reporting
 * @param eta Learning rate in (0,1]
 * @param nu Compression parameter in (0,1] associated with the fraction of training data that will become support vectors
 * @return A summary of the learning process
 *
 * @internal
 * @sa This function is a wrapper for online_sv::svm_classification().
 */
CREATE OR REPLACE FUNCTION
MADLIB_SCHEMA.svm_classification(input_table text, model_table text, parallel bool, kernel_func text, verbose bool, eta float8, nu float8)
RETURNS SETOF MADLIB_SCHEMA.svm_cls_result
AS $$

    PythonFunctionBodyOnly(`kernel_machines', `online_sv')

    # schema_madlib comes from PythonFunctionBodyOnly
    return online_sv.svm_classification( schema_madlib, input_table, model_table, parallel, kernel_func, verbose, eta, nu);

$$ LANGUAGE 'plpythonu';

/**
 * @brief This is the support vector novelty detection function.
 *
 * @param input_table The name of the table/view with the training data
 * @param model_table The name of the table under which we want to store the learned model
 * @param parallel A flag indicating whether the system should learn multiple models in parallel
 * @param kernel_func Kernel function
 * @return A summary of the learning process
 *
 * @internal
 * @sa This function is a wrapper for online_sv::svm_novelty_detection().
 */
CREATE OR REPLACE FUNCTION
MADLIB_SCHEMA.svm_novelty_detection(input_table text, model_table text, parallel bool, kernel_func text)
RETURNS SETOF MADLIB_SCHEMA.svm_nd_result
AS $$

    PythonFunctionBodyOnly(`kernel_machines', `online_sv')

    # schema_madlib comes from PythonFunctionBodyOnly
    return online_sv.svm_novelty_detection( schema_madlib, input_table, model_table, parallel, kernel_func);

$$ LANGUAGE 'plpythonu';

/**
 * @brief This is the support vector novelty detection function.
 *
 * @param input_table The name of the table/view with the training data
 * @param model_table The name of the table under which we want to store the learned model
 * @param parallel A flag indicating whether the system should learn multiple models in parallel
 * @param kernel_func Kernel function
 * @param verbose Verbosity of reporting
 * @param eta Learning rate in (0,1]
 * @param nu Compression parameter in (0,1] associated with the fraction of training data that will become support vectors
 * @return A summary of the learning process
 *
 * @internal
 * @sa This function is a wrapper for online_sv::svm_novelty_detection().
 */
CREATE OR REPLACE FUNCTION
MADLIB_SCHEMA.svm_novelty_detection(input_table text, model_table text, parallel bool, kernel_func text, verbose bool, eta float8, nu float8)
RETURNS SETOF MADLIB_SCHEMA.svm_nd_result
AS $$

    PythonFunctionBodyOnly(`kernel_machines', `online_sv')

    # schema_madlib comes from PythonFunctionBodyOnly
    return online_sv.svm_novelty_detection( schema_madlib, input_table, model_table, parallel, kernel_func, verbose, eta, nu);

$$ LANGUAGE 'plpythonu';


/**
 * @brief Scores the data points stored in a table using a learned support-vector model
 *
 * @param input_table Name of table/view containing the data points to be scored
 * @param data_col Name of column in input_table containing the data points
 * @param id_col Name of column in input_table containing the integer identifier of data points
 * @param model_table Name of table where the learned model to be used is stored
 * @param output_table Name of table to store the results
 * @param parallel A flag indicating whether the model to be used was learned in parallel
 * @return Textual summary of the algorithm run
 *
 * @internal
 * @sa This function is a wrapper for online_sv::svm_predict_batch().
 */
CREATE OR REPLACE FUNCTION
MADLIB_SCHEMA.svm_predict_batch(input_table text, data_col text, id_col text, model_table text, output_table text, parallel bool)
RETURNS TEXT
AS $$

    PythonFunctionBodyOnly(`kernel_machines', `online_sv')

    # schema_madlib comes from PythonFunctionBodyOnly
    return online_sv.svm_predict_batch( input_table, data_col, id_col, model_table, output_table, parallel);

$$ LANGUAGE 'plpythonu';

-- Generate artificial training data
CREATE OR REPLACE FUNCTION MADLIB_SCHEMA.__svm_random_ind(d INT) RETURNS float8[] AS $$
DECLARE
    ret float8[];
BEGIN
    FOR i IN 1..(d-1) LOOP
        ret[i] = RANDOM() * 40 - 20;
    END LOOP;
    IF (RANDOM() > 0.5) THEN
        ret[d] = 10;
    ELSE
        ret[d] = -10;
    END IF;
    RETURN ret;
END
$$ LANGUAGE plpgsql;

CREATE OR REPLACE FUNCTION MADLIB_SCHEMA.__svm_random_ind2(d INT) RETURNS float8[] AS $$
DECLARE
    ret float8[];
BEGIN
    FOR i IN 1..d LOOP
        ret[i] = RANDOM() * 5 + 10;
        IF (RANDOM() > 0.5) THEN ret[i] = -ret[i]; END IF;
    END LOOP;
    RETURN ret;
END
$$ LANGUAGE plpgsql;


CREATE OR REPLACE FUNCTION MADLIB_SCHEMA.svm_generate_reg_data(output_table text, num int, dim int) RETURNS VOID AS $$
    plpy.execute("drop table if exists " + output_table)
    plpy.execute("create table " + output_table + " ( id int, ind float8[], label float8 ) m4_ifdef(`__GREENPLUM__', `distributed by (id)')")
    plpy.execute("INSERT INTO " + output_table + " SELECT a.val, MADLIB_SCHEMA.__svm_random_ind(" + str(dim) + "), 0 FROM (SELECT generate_series(1," + str(num) + ") AS val) AS a")
    plpy.execute("UPDATE " + output_table + " SET label = MADLIB_SCHEMA.__svm_target_reg_func(ind)")
$$ LANGUAGE 'plpythonu';

CREATE OR REPLACE FUNCTION MADLIB_SCHEMA.__svm_target_reg_func(ind float8[]) RETURNS float8
AS $$
DECLARE
    dim int;
BEGIN
    dim = array_upper(ind,1);
    IF (ind[dim] = 10) THEN RETURN 50; END IF;
    RETURN -50;
END
$$ LANGUAGE plpgsql;

CREATE OR REPLACE FUNCTION MADLIB_SCHEMA.svm_generate_cls_data(output_table text, num int, dim int) RETURNS VOID AS $$
    plpy.execute("drop table if exists " + output_table);
    plpy.execute("create table " + output_table + " ( id int, ind float8[], label float8 ) m4_ifdef(`__GREENPLUM__', `distributed by (id)')")
    plpy.execute("INSERT INTO " + output_table + " SELECT a.val, MADLIB_SCHEMA.__svm_random_ind(" + str(dim) + "), 0 FROM (SELECT generate_series(1," + str(num) + ") AS val) AS a")
    plpy.execute("UPDATE " + output_table + " SET label = MADLIB_SCHEMA.__svm_target_cl_func(ind)")
$$ LANGUAGE 'plpythonu';

CREATE OR REPLACE FUNCTION MADLIB_SCHEMA.__svm_target_cl_func(ind float8[]) RETURNS float8 AS $$
BEGIN
    IF (ind[1] > 0 AND ind[2] < 0) THEN RETURN 1; END IF;
    RETURN -1;
END
$$ LANGUAGE plpgsql;

CREATE OR REPLACE FUNCTION MADLIB_SCHEMA.svm_generate_nd_data(output_table text, num int, dim int) RETURNS VOID AS $$
    plpy.execute("drop table if exists " + output_table);
    plpy.execute("create table " + output_table + " ( id int, ind float8[] ) m4_ifdef(`__GREENPLUM__', `distributed by (id)')")
    plpy.execute("INSERT INTO " + output_table + " SELECT a.val, MADLIB_SCHEMA.__svm_random_ind2(" + str(dim) + ") FROM (SELECT generate_series(1," + str(num) + ") AS val) AS a")
$$ LANGUAGE 'plpythonu';


/**
 * @brief Normalizes the data stored in a table and saves the normalized data in a new table.
 *
 * @param input_table Name of table/view containing the data points to be normalized
 */
CREATE OR REPLACE FUNCTION MADLIB_SCHEMA.svm_data_normalization(input_table TEXT) RETURNS VOID AS $$
    output_table = input_table + "_scaled";
    plpy.execute("DROP TABLE IF EXISTS " + output_table);
    plpy.execute("CREATE TABLE " + output_table + " ( id int, ind float8[], label int ) m4_ifdef(`__GREENPLUM__', `distributed by (id)')");
    plpy.execute("INSERT INTO " + output_table + " SELECT id, MADLIB_SCHEMA.svm_normalization(ind), label FROM " + input_table);
    plpy.info("output table: %s" % output_table)
$$ LANGUAGE plpythonu;


/**
 * @brief This is the linear support vector classification function
 *
 * @param input_table The name of the table/view with the training data
 * @param model_table The name of the table under which we want to store the learned model
 * @param parallel A flag indicating whether the system should learn multiple models in parallel
 * @return A summary of the learning process
 *
 * @internal
 * @sa This function is a wrapper for online_sv::lsvm_classification().
 */
CREATE OR REPLACE FUNCTION
MADLIB_SCHEMA.lsvm_classification(input_table text, model_table text, parallel bool)
RETURNS SETOF MADLIB_SCHEMA.lsvm_sgd_result
AS $$
    PythonFunctionBodyOnly(`kernel_machines', `online_sv')
    # schema_madlib comes from PythonFunctionBodyOnly
    return online_sv.lsvm_classification( schema_madlib, input_table, model_table, parallel);
$$ LANGUAGE 'plpythonu';



/**
 * @brief This is the linear support vector classification function
 *
 * @param input_table The name of the table/view with the training data
 * @param model_table The name of the table under which we want to store the learned model
 * @param parallel A flag indicating whether the system should learn multiple models in parallel
 * @param verbose Verbosity of reporting
 * @param eta Initial learning rate in (0,1]
 * @param reg Regularization parameter, often chosen by cross-validation
 * @return A summary of the learning process
 *
 * @internal
 * @sa This function is a wrapper for online_sv::lsvm_classification().
 */
CREATE OR REPLACE FUNCTION
MADLIB_SCHEMA.lsvm_classification(input_table text, model_table text, parallel bool, verbose bool, eta float8, reg float8)
RETURNS SETOF MADLIB_SCHEMA.lsvm_sgd_result
AS $$

    PythonFunctionBodyOnly(`kernel_machines', `online_sv')

    # schema_madlib comes from PythonFunctionBodyOnly
    return online_sv.lsvm_classification( schema_madlib, input_table, model_table, parallel, verbose, eta, reg);

$$ LANGUAGE 'plpythonu';


/**
 * @brief Scores the data points stored in a table using a learned linear support-vector model
 *
 * @param input_table Name of table/view containing the data points to be scored
 * @param data_col Name of column in input_table containing the data points
 * @param id_col Name of column in input_table containing the integer identifier of data points
 * @param model_table Name of table where the learned model to be used is stored
 * @param output_table Name of table to store the results
 * @param parallel A flag indicating whether the model to be used was learned in parallel
 * @return Textual summary of the algorithm run
 *
 * @internal
 * @sa This function is a wrapper for online_sv::lsvm_predict_batch().
 */
CREATE OR REPLACE FUNCTION
MADLIB_SCHEMA.lsvm_predict_batch(input_table text, data_col text, id_col text, model_table text, output_table text, parallel bool)
RETURNS TEXT
AS $$

    PythonFunctionBodyOnly(`kernel_machines', `online_sv')

    # schema_madlib comes from PythonFunctionBodyOnly
    return online_sv.lsvm_predict_batch( schema_madlib, input_table, data_col, id_col, model_table, output_table, parallel);

$$ LANGUAGE 'plpythonu';


/**
 * @brief Evaluates a linear support-vector model on a given data point
 *
 * @param model_table The table storing the learned model \f$ f \f$ to be used
 * @param ind The data point \f$ \boldsymbol x \f$
 * @return This function returns \f$ f(\boldsymbol x) \f$
 */
CREATE OR REPLACE FUNCTION
MADLIB_SCHEMA.lsvm_predict(model_table text, ind float8[]) RETURNS FLOAT8 AS $$

    PythonFunctionBodyOnly(`kernel_machines', `online_sv')

    # schema_madlib comes from PythonFunctionBodyOnly
    return online_sv.lsvm_predict(schema_madlib, model_table, ind);

$$ LANGUAGE plpythonu;

/**
 * @brief Evaluates multiple linear support-vector models on a data point
 *
 * @param model_table The table storing the learned models to be used.
 * @param ind The data point \f$ \boldsymbol x \f$
 * @return This function returns a table, a row for each model.
 * Moreover, the last row contains the average value, over all models.
 *
 * The different models are assumed to be named <tt><em>model_table</em>0</tt>,
 * <tt><em>model_table</em>1</tt>, ....
 */
CREATE OR REPLACE FUNCTION
MADLIB_SCHEMA.lsvm_predict_combo(model_table text, ind float8[]) RETURNS SETOF MADLIB_SCHEMA.svm_model_pr AS $$

    PythonFunctionBodyOnly(`kernel_machines', `online_sv')

    # schema_madlib comes from PythonFunctionBodyOnly
    return online_sv.lsvm_predict_combo( schema_madlib, model_table, ind);

$$ LANGUAGE plpythonu;
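
The comment on lsvm_predict_combo describes its result as one row per model followed by a final row holding the average over all models. The standalone Python sketch below illustrates that result shape for linear models of the form f(x) = <w, x>. It is not the code in online_sv: the function names, the in-memory list of weight vectors, the "avg" row label and the absence of a bias term are all assumptions made for illustration. Only the model naming scheme (model_table0, model_table1, ...) comes from the comment above.

```python
from typing import List, Sequence, Tuple


def linear_score(w: Sequence[float], x: Sequence[float]) -> float:
    """Evaluate a linear model f(x) = <w, x> (no bias term in this sketch)."""
    if len(w) != len(x):
        raise ValueError("weight vector and data point must have the same dimension")
    return sum(wi * xi for wi, xi in zip(w, x))


def predict_combo(model_table: str,
                  weights: Sequence[Sequence[float]],
                  x: Sequence[float]) -> List[Tuple[str, float]]:
    """Return one (name, score) row per model, then a final row with the average.

    `weights` is a hypothetical stand-in for the per-model tables
    model_table0, model_table1, ...; the "avg" label is also hypothetical.
    """
    if not weights:
        raise ValueError("at least one model is required")
    rows = [(f"{model_table}{i}", linear_score(w, x)) for i, w in enumerate(weights)]
    avg = sum(score for _, score in rows) / len(rows)
    rows.append(("avg", avg))
    return rows


# Three toy models evaluated on a single two-dimensional data point.
rows = predict_combo("my_model", [[1.0, -2.0], [0.5, 0.5], [-1.0, 1.0]], [2.0, 1.0])
for name, score in rows:
    print(name, score)
```

Averaging the per-model scores is one simple way to combine models trained in parallel on different subsets of the data. The sign of the averaged score can then serve as the combined class prediction.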