This module provides a set of metrics to evaluate the quality of a model's predictions. Unless noted otherwise, each function takes a set of "prediction" and "observation" values and uses them to calculate the desired metric. Grouping is supported by all functions except confusion_matrix.
| Function | Description |
|---|---|
| mean_abs_error(table_in, table_out, prediction_col, observed_col, grouping_cols) | Mean absolute error |
| mean_abs_perc_error(table_in, table_out, prediction_col, observed_col, grouping_cols) | Mean absolute percentage error |
| mean_perc_error(table_in, table_out, prediction_col, observed_col, grouping_cols) | Mean percentage error |
| mean_squared_error(table_in, table_out, prediction_col, observed_col, grouping_cols) | Mean squared error |
| r2_score(table_in, table_out, prediction_col, observed_col, grouping_cols) | R-squared |
| adjusted_r2_score(table_in, table_out, prediction_col, observed_col, num_predictors, training_size, grouping_cols) | Adjusted R-squared |
| binary_classifier(table_in, table_out, prediction_col, observed_col, grouping_cols) | Collection of prediction metrics related to binary classification |
| area_under_roc(table_in, table_out, prediction_col, observed_col, grouping_cols) | Area under the ROC curve (in binary classification) |
| confusion_matrix(table_in, table_out, prediction_col, observed_col, grouping_cols) | Confusion matrix for a multi-class classifier |
Arguments
- table_in: Name of the input table containing the prediction and observation columns.
- table_out: Name of the output table created by the function.
- prediction_col: Name of the column containing the predicted values.
- observed_col: Name of the column containing the observed values.
- grouping_cols (optional): Comma-separated list of columns to group by; the metric is then computed per group.
- num_predictors (adjusted_r2_score only): Number of explanatory variables in the model, not counting the intercept.
- training_size (adjusted_r2_score only): Number of rows in the training set used to fit the model.
R-squared Score
This function returns the coefficient of determination (R2) between the predicted and observed values. An R2 of 1 indicates that the regression line perfectly fits the data, while an R2 of 0 indicates that the line does not fit the data at all. Negative values of R2 may occur when fitting non-linear functions to data. Please refer to reference [1] for more details.
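As a cross-check, the R2 computation can be sketched in plain Python (illustrative only, not MADlib code); the (pred, obs) pairs are the ones used in the examples below.

```python
# Plain-Python sketch of R-squared (coefficient of determination), not MADlib code.
# Same four (pred, obs) pairs as in the examples below.
pred = [37.5, 12.3, 74.2, 91.1]
obs  = [53.1, 34.2, 65.4, 82.1]

mean_obs = sum(obs) / len(obs)
ss_res = sum((o - p) ** 2 for o, p in zip(obs, pred))   # residual sum of squares
ss_tot = sum((o - mean_obs) ** 2 for o in obs)          # total sum of squares
r2 = 1 - ss_res / ss_tot
print(round(r2, 6))  # → 0.279929
```

This matches the r2_score output of 0.27992908844337695865 shown in the examples.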
Adjusted R-squared Score
This function returns the adjusted R2 score in addition to the R2 score described above. The adjusted R2 score counters the tendency of R2 to increase automatically whenever extra explanatory variables are added to the model. It takes two additional parameters: the number of predictors in the model (num_predictors) and the size of the training set over which the model was developed (training_size). Neither of these values can be deduced from the predicted values and the test data alone, which is why they are explicit inputs. Please refer to reference [1] for more details.
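The standard adjusted-R2 formula, adj_r2 = 1 - (1 - r2)(n - 1)/(n - p - 1) with p = num_predictors and n = training_size, reproduces the adjusted_r2_score output shown in the examples below; here is an illustrative plain-Python sketch (not MADlib code).

```python
# Adjusted R-squared from R-squared: a plain-Python sketch, not MADlib code.
# Standard formula: adj_r2 = 1 - (1 - r2) * (n - 1) / (n - p - 1),
# where p = num_predictors and n = training_size.
def adjusted_r2(r2, num_predictors, training_size):
    n, p = training_size, num_predictors
    return 1 - (1 - r2) * (n - 1) / (n - p - 1)

# Values from the adjusted_r2_score example below (p=3, n=100):
print(round(adjusted_r2(0.279929088443375, 3, 100), 9))  # → 0.257426872
```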
Binary Classification
This function returns an output table with a number of metrics commonly used in binary classification.
The definitions of the various metrics, in terms of the true/false positive/negative counts (tp, fp, fn, tn) at a given threshold, are as follows:
- tpr (true positive rate, recall/sensitivity): tp / (tp + fn)
- tnr (true negative rate, specificity): tn / (fp + tn)
- ppv (positive predictive value, precision): tp / (tp + fp)
- npv (negative predictive value): tn / (fn + tn)
- fpr (false positive rate): fp / (fp + tn)
- fdr (false discovery rate): fp / (tp + fp)
- fnr (false negative rate): fn / (tp + fn)
- acc (accuracy): (tp + tn) / (tp + fp + fn + tn)
- f1 (F1 score): 2 * ppv * tpr / (ppv + tpr)
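As an illustration (plain Python, not MADlib code), these metrics can be computed from the four confusion counts at a single threshold; the counts below are taken from the threshold=0.5 record shown in the examples.

```python
# Plain-Python sketch (not MADlib code) of the binary-classification metrics,
# computed from the confusion counts at a single threshold.
def binary_metrics(tp, fp, fn, tn):
    tpr = tp / (tp + fn)            # recall / sensitivity
    tnr = tn / (fp + tn)            # specificity
    ppv = tp / (tp + fp)            # precision
    npv = tn / (fn + tn)
    fpr = fp / (fp + tn)
    fdr = fp / (tp + fp)
    fnr = fn / (tp + fn)
    acc = (tp + tn) / (tp + fp + fn + tn)
    f1  = 2 * ppv * tpr / (ppv + tpr)
    return dict(tpr=tpr, tnr=tnr, ppv=ppv, npv=npv, fpr=fpr,
                fdr=fdr, fnr=fnr, acc=acc, f1=f1)

# Counts from the threshold=0.5 record in the examples below:
m = binary_metrics(tp=29, fp=21, fn=12, tn=38)
print(round(m["acc"], 2), round(m["f1"], 4))  # → 0.67 0.6374
```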
Area Under ROC Curve
This function returns the area under the Receiver Operating Characteristic (ROC) curve for binary classification (the AUC). The ROC curve plots the classifier's TPR against its FPR as the decision threshold varies (see Binary Classification above for definitions of these metrics). Please refer to reference [2] for more details. Note that the binary_classifier function can be used to obtain the data (TPR and FPR values) required to draw the ROC curve.
Confusion Matrix
This function returns the confusion matrix of a multi-class classification. Each column of the matrix represents the instances in a predicted class, while each row represents the instances in an actual class. This permits more detailed analysis than the mere proportion of correct guesses (accuracy). Please refer to reference [3] for more details. Note that grouping is not supported for the confusion matrix.
Examples
Create a test set of predicted and observed values:

DROP TABLE IF EXISTS test_set;
CREATE TABLE test_set(
    pred FLOAT8,
    obs  FLOAT8
);
INSERT INTO test_set VALUES
    (37.5, 53.1),
    (12.3, 34.2),
    (74.2, 65.4),
    (91.1, 82.1);
Mean absolute error:

DROP TABLE IF EXISTS table_out;
SELECT madlib.mean_abs_error( 'test_set', 'table_out', 'pred', 'obs');
SELECT * FROM table_out;

Result:
 mean_abs_error
----------------
         13.825
Mean absolute percentage error:

DROP TABLE IF EXISTS table_out;
SELECT madlib.mean_abs_perc_error( 'test_set', 'table_out', 'pred', 'obs');
SELECT * FROM table_out;

Result:
 mean_abs_perc_error
---------------------
   0.294578793636013
Mean percentage error:

DROP TABLE IF EXISTS table_out;
SELECT madlib.mean_perc_error( 'test_set', 'table_out', 'pred', 'obs');
SELECT * FROM table_out;

Result:
  mean_perc_error
-------------------
 -0.17248930032771
Mean squared error:

DROP TABLE IF EXISTS table_out;
SELECT madlib.mean_squared_error( 'test_set', 'table_out', 'pred', 'obs');
SELECT * FROM table_out;

Result:
 mean_squared_error
--------------------
           220.3525
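For reference, the four error metrics above can be reproduced in plain Python (illustrative only, not MADlib code):

```python
# Plain-Python sketch (not MADlib code) of the four error metrics,
# reproducing the results above for the same (pred, obs) pairs.
pred = [37.5, 12.3, 74.2, 91.1]
obs  = [53.1, 34.2, 65.4, 82.1]
n = len(pred)

mae  = sum(abs(p - o) for p, o in zip(pred, obs)) / n            # mean absolute error
mape = sum(abs(p - o) / abs(o) for p, o in zip(pred, obs)) / n   # mean absolute percentage error
mpe  = sum((p - o) / abs(o) for p, o in zip(pred, obs)) / n      # mean percentage error
mse  = sum((p - o) ** 2 for p, o in zip(pred, obs)) / n          # mean squared error

print(round(mae, 6), round(mse, 6))  # → 13.825 220.3525
```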
R-squared score:

DROP TABLE IF EXISTS table_out;
SELECT madlib.r2_score( 'test_set', 'table_out', 'pred', 'obs');
SELECT * FROM table_out;

Result:
        r2_score
------------------------
 0.27992908844337695865
Adjusted R-squared score:

DROP TABLE IF EXISTS table_out;
SELECT madlib.adjusted_r2_score( 'test_set', 'table_out', 'pred', 'obs', 3, 100);
SELECT * FROM table_out;

Result:
     r2_score      | adjusted_r2_score
-------------------+-------------------
 0.279929088443375 | 0.257426872457231
Create sample data for the binary classifier functions:

DROP TABLE IF EXISTS test_set;
CREATE TABLE test_set AS
    SELECT ((a*8)::integer)/8.0 pred,
           ((a*0.5 + random()*0.5) > 0.5) obs
    FROM (SELECT random() AS a FROM generate_series(1,100)) x;
DROP TABLE IF EXISTS table_out;
SELECT madlib.binary_classifier( 'test_set', 'table_out', 'pred', 'obs');

SELECT threshold, tpr, fpr FROM table_out ORDER BY threshold;

Result (your results for this function and the ones below will differ, because the sample data is generated with random()):

       threshold        |          tpr           |          fpr
------------------------+------------------------+------------------------
 0.00000000000000000000 | 1.00000000000000000000 | 1.00000000000000000000
 0.12500000000000000000 | 1.00000000000000000000 | 0.94915254237288135593
 0.25000000000000000000 | 0.92682926829268292683 | 0.64406779661016949153
 0.37500000000000000000 | 0.80487804878048780488 | 0.47457627118644067797
 0.50000000000000000000 | 0.70731707317073170732 | 0.35593220338983050847
 0.62500000000000000000 | 0.63414634146341463415 | 0.25423728813559322034
 0.75000000000000000000 | 0.48780487804878048780 | 0.06779661016949152542
 0.87500000000000000000 | 0.29268292682926829268 | 0.03389830508474576271
 1.00000000000000000000 | 0.12195121951219512195 | 0.00000000000000000000
-- Set extended display on for easier reading of output
\x on
SELECT * FROM table_out WHERE threshold=0.5;

Result:
-[ RECORD 1 ]---------------------
threshold | 0.50000000000000000000
tp        | 29
fp        | 21
fn        | 12
tn        | 38
tpr       | 0.70731707317073170732
tnr       | 0.64406779661016949153
ppv       | 0.58000000000000000000
npv       | 0.76000000000000000000
fpr       | 0.35593220338983050847
fdr       | 0.42000000000000000000
fnr       | 0.29268292682926829268
acc       | 0.67000000000000000000
f1        | 0.63736263736263736264
Area under the ROC curve:

\x off
DROP TABLE IF EXISTS table_out;
SELECT madlib.area_under_roc( 'test_set', 'table_out', 'pred', 'obs');
SELECT * FROM table_out;

Result:
                area_under_roc
---------------------------------------------
 0.77428689541132699462698842496899545266640
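As a cross-check, applying the trapezoidal rule to the (fpr, tpr) pairs from the binary_classifier output above reproduces this AUC. In that run there were 41 positive and 59 negative examples (tp+fn and fp+tn in the threshold=0.5 record), so the TPR and FPR values are exact multiples of 1/41 and 1/59; the sketch below (plain Python, not MADlib code) uses those fractions.

```python
# Plain-Python sketch (not MADlib code): trapezoidal-rule AUC computed from
# the (fpr, tpr) pairs in the binary_classifier output shown above.
from fractions import Fraction as F

# Points ordered by increasing threshold, i.e. decreasing FPR.
tpr = [F(k, 41) for k in (41, 41, 38, 33, 29, 26, 20, 12, 5)]
fpr = [F(k, 59) for k in (59, 56, 38, 28, 21, 15, 4, 2, 0)]

auc = sum((fpr[i] - fpr[i + 1]) * (tpr[i] + tpr[i + 1]) / 2
          for i in range(len(fpr) - 1))
print(float(auc))  # ≈ 0.77428689541..., matching area_under_roc above
```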
Create sample data for the confusion matrix:

DROP TABLE IF EXISTS test_set;
CREATE TABLE test_set AS
    SELECT (x+y)%5+1 AS pred, (x*y)%5 AS obs
    FROM generate_series(1,5) x, generate_series(1,5) y;
DROP TABLE IF EXISTS table_out;
SELECT madlib.confusion_matrix( 'test_set', 'table_out', 'pred', 'obs');
SELECT * FROM table_out ORDER BY class;

Result:
 row_id | class | confusion_arr
--------+-------+---------------
      1 |     0 | {0,1,2,2,2,2}
      2 |     1 | {0,2,0,1,1,0}
      3 |     2 | {0,0,0,2,2,0}
      4 |     3 | {0,0,2,0,0,2}
      5 |     4 | {0,2,1,0,0,1}
      6 |     5 | {0,0,0,0,0,0}
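Because this sample data is deterministic, the matrix can be reproduced in plain Python (illustrative only, not MADlib code):

```python
# Plain-Python sketch (not MADlib code) reproducing the confusion matrix above
# from the same deterministic sample data: pred = (x+y)%5+1, obs = (x*y)%5.
pairs = [((x + y) % 5 + 1, (x * y) % 5)
         for x in range(1, 6) for y in range(1, 6)]

classes = sorted({c for p, o in pairs for c in (p, o)})   # [0, 1, 2, 3, 4, 5]
index = {c: i for i, c in enumerate(classes)}

# One row per actual (observed) class, one column per predicted class.
matrix = [[0] * len(classes) for _ in classes]
for p, o in pairs:
    matrix[index[o]][index[p]] += 1

print(matrix[0])  # row for class 0 → [0, 1, 2, 2, 2, 2]
```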
References
[1] https://en.wikipedia.org/wiki/Coefficient_of_determination
[2] https://en.wikipedia.org/wiki/Receiver_operating_characteristic
[3] https://en.wikipedia.org/wiki/Confusion_matrix
See the file pred_metrics.sql_in for the list of functions and usage.