User Documentation
cross_validation.sql_in File Reference

SQL functions for cross validation.

Functions

void cross_validation_general (varchar modelling_func, varchar[] modelling_params, varchar[] modelling_params_type, varchar param_explored, varchar[] explore_values, varchar predict_func, varchar[] predict_params, varchar[] predict_params_type, varchar metric_func, varchar[] metric_params, varchar[] metric_params_type, varchar data_tbl, varchar data_id, boolean id_is_random, varchar validation_result, varchar[] data_cols, integer fold_num)
 
void cross_validation_general (varchar modelling_func, varchar[] modelling_params, varchar[] modelling_params_type, varchar param_explored, varchar[] explore_values, varchar predict_func, varchar[] predict_params, varchar[] predict_params_type, varchar metric_func, varchar[] metric_params, varchar[] metric_params_type, varchar data_tbl, varchar data_id, boolean id_is_random, varchar validation_result, varchar[] data_cols)
 
void cv_linregr_train (varchar tbl_source, varchar col_ind_var, varchar col_dep_var, varchar tbl_result)
 Simple interface of cross-validation, which has no limitation on lock number.
 
float8 linregr_predict (float8[] coef, float8[] col_ind)
 
void cv_linregr_predict (varchar tbl_model, varchar tbl_newdata, varchar col_ind_var, varchar col_id, varchar tbl_predict)
 A wrapper for linear regression prediction.
 
void mse_error (varchar tbl_prediction, varchar tbl_actual, varchar id_actual, varchar values_actual, varchar tbl_error)
 
boolean logregr_predict (float8[] coef, float8[] col_ind)
 A prediction function for logistic regression.
 
void cv_logregr_predict (varchar tbl_model, varchar tbl_newdata, varchar col_ind_var, varchar col_id, varchar tbl_predict)
 A prediction function for logistic regression. The result is stored in the table tbl_predict.
 
integer logregr_accuracy (float8[] coef, float8[] col_ind, boolean col_dep)
 Metric function for logistic regression.
 
void cv_logregr_accuracy (varchar tbl_predict, varchar tbl_source, varchar col_id, varchar col_dep_var, varchar tbl_accuracy)
 Metric function for logistic regression.
 

Detailed Description

Date
January 2011
See Also
For a brief introduction to the usage of cross validation, see the module description Cross Validation.

Function Documentation

void cross_validation_general ( varchar  modelling_func,
varchar[]  modelling_params,
varchar[]  modelling_params_type,
varchar  param_explored,
varchar[]  explore_values,
varchar  predict_func,
varchar[]  predict_params,
varchar[]  predict_params_type,
varchar  metric_func,
varchar[]  metric_params,
varchar[]  metric_params_type,
varchar  data_tbl,
varchar  data_id,
boolean  id_is_random,
varchar  validation_result,
varchar[]  data_cols,
integer  fold_num 
)
void cross_validation_general ( varchar  modelling_func,
varchar[]  modelling_params,
varchar[]  modelling_params_type,
varchar  param_explored,
varchar[]  explore_values,
varchar  predict_func,
varchar[]  predict_params,
varchar[]  predict_params_type,
varchar  metric_func,
varchar[]  metric_params,
varchar[]  metric_params_type,
varchar  data_tbl,
varchar  data_id,
boolean  id_is_random,
varchar  validation_result,
varchar[]  data_cols 
)
void cv_linregr_predict ( varchar  tbl_model,
varchar  tbl_newdata,
varchar  col_ind_var,
varchar  col_id,
varchar  tbl_predict 
)
void cv_linregr_train ( varchar  tbl_source,
varchar  col_ind_var,
varchar  col_dep_var,
varchar  tbl_result 
)
Parameters
module_name  Module to be cross-validated.
func_args  Arguments of the module's modelling function, including the name of the data table.
param_to_try  Name of the parameter that CV iterates over.
param_values  Values of that parameter that CV will try.
data_id  Name of the unique ID associated with each row. Provide NULL if the data table has no such column.
id_is_random  Whether the provided ID is randomly assigned to each row.
validation_result  Name of the table that stores the output of the CV function; see the Output for its format. The CV function creates this table automatically.
fold_num  Number of folds to use for cross-validation.

Print the help message for a given module's cross-validation.
Print the supported module names for cross_validation.
A wrapper for linear regression.
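
The parameters above describe a k-fold loop over candidate parameter values. The following Python sketch shows that general technique only; it is not MADlib's implementation. The names `assign_folds` and `cross_validate`, the round-robin fold assignment, and the choice to shuffle when `id_is_random` is false are all assumptions made for illustration. The `train`, `predict` and `metric` callables stand in for `modelling_func`, `predict_func` and `metric_func`.

```python
import random
from statistics import mean


def assign_folds(row_ids, fold_num, id_is_random, seed=0):
    """Map each row ID to a fold index in [0, fold_num).

    Assumption: when the IDs are not already random, shuffle them first
    so that each fold is an unbiased sample of the rows.
    """
    ids = sorted(row_ids)
    if not id_is_random:
        random.Random(seed).shuffle(ids)
    # Round-robin assignment keeps fold sizes within one row of each other.
    return {rid: pos % fold_num for pos, rid in enumerate(ids)}


def cross_validate(rows, explore_values, train, predict, metric,
                   fold_num, id_is_random=False):
    """Return {explored value: metric averaged over the folds}.

    rows maps a row ID to an (independent, dependent) pair.
    """
    if fold_num < 2 or fold_num > len(rows):
        raise ValueError("fold_num must be between 2 and the number of rows")
    fold_of = assign_folds(rows.keys(), fold_num, id_is_random)
    result = {}
    for value in explore_values:
        scores = []
        for k in range(fold_num):
            # Train on every fold except k, then score on fold k.
            training = [rows[i] for i in rows if fold_of[i] != k]
            held_out = [rows[i] for i in rows if fold_of[i] == k]
            model = train(training, value)
            predictions = [predict(model, x) for x, _ in held_out]
            actuals = [y for _, y in held_out]
            scores.append(metric(predictions, actuals))
        result[value] = mean(scores)
    return result
```

In this sketch every row is held out exactly once per explored value, so each averaged score is based on predictions for data the model did not see during training.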
void cv_logregr_accuracy ( varchar  tbl_predict,
varchar  tbl_source,
varchar  col_id,
varchar  col_dep_var,
varchar  tbl_accuracy 
)

It computes the percentage of correct predictions. The result is stored in the table tbl_accuracy.
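
As a rough illustration of this metric, the Python sketch below matches predictions to actual values by row ID and returns the share that agree. The function name `accuracy` and the dictionary inputs are hypothetical stand-ins for the prediction and source tables. Whether the SQL function reports a fraction in [0, 1] or a value out of 100 is not stated here, so the fraction is an assumption.

```python
def accuracy(predicted, actual):
    """Fraction of row IDs whose predicted label equals the actual label.

    predicted and actual map a row ID to a boolean label; only IDs
    present in both are compared (the equivalent of an inner join).
    """
    shared = predicted.keys() & actual.keys()
    if not shared:
        raise ValueError("no common row IDs to compare")
    correct = sum(1 for rid in shared if predicted[rid] == actual[rid])
    return correct / len(shared)
```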

void cv_logregr_predict ( varchar  tbl_model,
varchar  tbl_newdata,
varchar  col_ind_var,
varchar  col_id,
varchar  tbl_predict 
)

This function can be used together with cross-validation.

float8 linregr_predict ( float8[]  coef,
float8[]  col_ind 
)
integer logregr_accuracy ( float8[]  coef,
float8[]  col_ind,
boolean  col_dep 
)
Parameters
coef  Logistic regression coefficients. Note: unlike elastic_net_train, the MADlib logregr_train function does not produce a separate intercept term.
col_ind  Independent variable, an array.
col_dep  Dependent variable.

Returns 1 if the prediction is the same as col_dep, otherwise 0.

boolean logregr_predict ( float8[]  coef,
float8[]  col_ind 
)
Parameters
coef  Coefficients. Note: unlike elastic_net_train, the MADlib logregr_train function does not produce a separate intercept term.
col_ind  Independent variable, which must be an array.
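
The note about the intercept matters when building col_ind. Because no separate intercept is passed, any intercept must be one of the entries of coef, paired with a constant 1 in col_ind. The Python sketch below illustrates this under the assumption that the prediction is true when the fitted probability exceeds 0.5; the threshold and the name `logregr_predict_sketch` are illustrative, not taken from the MADlib source.

```python
def logregr_predict_sketch(coef, col_ind):
    """Predict a boolean label from logistic regression coefficients.

    The intercept, if the model has one, must already be an element of
    coef, matched by a constant 1.0 at the same position in col_ind.
    """
    if len(coef) != len(col_ind):
        raise ValueError("coef and col_ind must have the same length")
    z = sum(c * x for c, x in zip(coef, col_ind))
    # sigmoid(z) > 0.5 is equivalent to z > 0, which avoids calling exp().
    return z > 0


# Model with intercept -1.0 and slope 2.0; the leading 1.0 in col_ind
# is the constant that multiplies the intercept.
print(logregr_predict_sketch([-1.0, 2.0], [1.0, 1.0]))
print(logregr_predict_sketch([-1.0, 2.0], [1.0, 0.25]))
```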
void mse_error ( varchar  tbl_prediction,
varchar  tbl_actual,
varchar  id_actual,
varchar  values_actual,
varchar  tbl_error 
)
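
No description accompanies mse_error here. Judging by its name and arguments, it appears to compute a mean squared error between a prediction table and the actual values, matched by ID. The Python sketch below shows that calculation under this assumption; the function name `mean_squared_error` and the dictionary inputs are hypothetical.

```python
def mean_squared_error(predicted, actual):
    """Mean of squared differences over row IDs present in both inputs.

    predicted and actual map a row ID to a numeric value.
    """
    shared = predicted.keys() & actual.keys()
    if not shared:
        raise ValueError("no common row IDs to compare")
    total = sum((predicted[rid] - actual[rid]) ** 2 for rid in shared)
    return total / len(shared)
```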