MADlib  1.4.1
User Documentation
Warning
This MADlib method is still in early-stage development. There may be some issues that will be addressed in a future version. The interface and implementation are subject to change.

A random forest (RF) is an ensemble classifier that consists of many decision trees and outputs the class that receives the majority of the votes of the individual trees.

It has the following well-known advantages:

This module provides an implementation of the random forest algorithm described in [1].

The implementation supports:

Input

The data to classify is expected to be of the same form as training data, except that it does not need a class column.

Training Function

Run the training algorithm on the source data.

rf_train( split_criterion, 
          training_table_name, 
          result_rf_table_name, 
          num_trees, 
          features_per_node, 
          sampling_percentage, 
          continuous_feature_names, 
          feature_col_names, 
          id_col_name, 
          class_col_name, 
          how2handle_missing_value, 
          max_tree_depth, 
          node_prune_threshold, 
          node_split_threshold, 
          verbosity
        )

Arguments

split_criterion

The name of the split criterion used for tree construction. The valid values are 'infogain', 'gainratio', and 'gini'. It can't be NULL. Information gain (infogain) and the Gini index (gini) are biased toward multivalued attributes. Gain ratio (gainratio) adjusts for this bias, but it tends to prefer unbalanced splits in which one partition is much smaller than the others.

training_table_name

The name of the table/view with the training data. It can't be NULL and must exist.

The training data is expected to be of the following form:

{TABLE|VIEW} trainingSource (
    ...
    id INT|BIGINT,
    feature1 SUPPORTED_DATA_TYPE,
    feature2 SUPPORTED_DATA_TYPE,
    feature3 SUPPORTED_DATA_TYPE,
    ....................
    featureN SUPPORTED_DATA_TYPE,
    class    SUPPORTED_DATA_TYPE,
    ...
)

SUPPORTED_DATA_TYPE can be any of the following: SMALLINT, INT, BIGINT, FLOAT8, REAL, DECIMAL, INET, CIDR, MACADDR, BOOLEAN, CHAR, VARCHAR, TEXT, "char", DATE, TIME, TIMETZ, TIMESTAMP, TIMESTAMPTZ, and INTERVAL.

result_rf_table_name

The name of the table where the resulting trees are stored. The name can't be NULL, and the table must not already exist.

The output table stores an abstract object (representing the model) used for further classification. The table has the following columns:

id
tree_location
feature
probability
ebp_coeff
maxclass
split_gain
live
cat_size
parent_id
lmc_nid
lmc_fval
is_feature_cont
split_value
tid
dp_ids

num_trees

The number of trees to be trained. If it's NULL, 10 will be used.

features_per_node

The number of features to be considered when finding the best split. If it's NULL, sqrt(p) will be used, where p is the number of features.

sampling_percentage

The percentage of records sampled to train a tree. If it's NULL, 0.632 bootstrap will be used.

continuous_feature_names

A comma-separated list of the names of the features whose values are continuous. NULL means there are no continuous features.

feature_col_names

A comma-separated list of the names of the table columns, each of which defines a feature. NULL means all the columns except the ID and class columns will be treated as features.

id_col_name

The name of the column containing the ID of each record. It can't be NULL.

class_col_name

The name of the column containing the correct class of each record. It can't be NULL.

how2handle_missing_value

How missing values are handled. The valid values are 'explicit' and 'ignore'. It can't be NULL.

max_tree_depth

The maximum tree depth. It can't be NULL.

node_prune_threshold

The minimum percentage of the number of records required in a child node. It can't be NULL. The range is [0.0, 1.0]. This threshold applies only to non-root nodes. If the ratio p between the sampled training set size of a tree (the number of rows) and the total training set size is less than or equal to the value of this parameter, then the tree has only one node (the root node). A value of 1 therefore always produces single-node trees, because p can never exceed 1. A value of 0 means that no nodes will be pruned by this parameter.

node_split_threshold

The minimum percentage of the number of records required in a node in order for a further split to be possible. It can't be NULL. The range is [0.0, 1.0]. If the ratio p between the sampled training set size of a tree (the number of rows) and the total training set size is less than the value of this parameter, then the root node will be a leaf, so the trained tree has only one node. If p is equal to the value of this parameter, then the trained tree has only two levels, since only the root node will be split. A value of 0 means that trees can grow extensively.

verbosity

A value greater than 0 means this function runs in verbose mode. It can't be NULL.
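
As a rough illustration of the NULL defaults described in the argument list, the call below leaves num_trees, features_per_node, sampling_percentage, and feature_col_names as NULL. It is a sketch based only on the signature and argument descriptions on this page: the output table name trained_tree_gini is hypothetical, and golf_data is the sample table from the Examples section.

```sql
-- Sketch only: 'trained_tree_gini' is a hypothetical output table name.
SELECT * FROM madlib.rf_train(
    'gini',                  -- split_criterion
    'golf_data',             -- training_table_name
    'trained_tree_gini',     -- result_rf_table_name (must not exist yet)
    NULL,                    -- num_trees: NULL -> 10 trees
    NULL,                    -- features_per_node: NULL -> documented default
    NULL,                    -- sampling_percentage: NULL -> 0.632 bootstrap
    'temperature,humidity',  -- continuous_feature_names
    NULL,                    -- feature_col_names: NULL -> all columns except id and class
    'id',                    -- id_col_name
    'class',                 -- class_col_name
    'explicit',              -- how2handle_missing_value
    10,                      -- max_tree_depth
    0.0,                     -- node_prune_threshold: 0 -> no pruning by this parameter
    0.0,                     -- node_split_threshold: 0 -> no split limit from this parameter
    0                        -- verbosity: 0 -> not verbose
);
```

Arguments documented as "can't be NULL" (for example max_tree_depth and both thresholds) are given explicit values here.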

Classification Function

The classification function creates the result_table with the classification results.

rf_classify( rf_table_name, 
             classification_table_name, 
             result_table_name)
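
Because the data to classify does not need a class column, one way to prepare it is to copy only the ID and feature columns from an existing table. The sketch below assumes the golf_data table and the trained_tree_infogain model from the Examples section; the names golf_data_to_classify and classification_result_no_class are hypothetical.

```sql
-- Sketch only: both new table names below are hypothetical.
-- Build an input table that has the id and feature columns but no class column.
CREATE TABLE golf_data_to_classify AS
SELECT id, outlook, temperature, humidity, windy
FROM golf_data;

-- Classify the unlabeled rows with a previously trained forest.
SELECT * FROM madlib.rf_classify(
    'trained_tree_infogain',          -- rf_table_name
    'golf_data_to_classify',          -- classification_table_name
    'classification_result_no_class'  -- result_table_name
);
```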

Scoring Function

The scoring function returns the ratio of correctly classified items in the validation data set.

rf_score( rf_table_name, 
          validation_table_name, 
          verbosity)

Display Function

The display tree function displays the trained trees in a human-readable format.

rf_display( rf_table_name
          )

Cleaning Function

The clean tree function cleans up the learned model and metadata.

rf_clean( rf_table_name
        )

Examples
  1. Prepare an input table.
    SELECT * FROM golf_data ORDER BY id;
    
    Result:
     id | outlook  | temperature | humidity | windy  |    class     
     ---+----------+-------------+----------+--------+--------------
      1 | sunny    |          85 |       85 |  false |  Do not Play
      2 | sunny    |          80 |       90 |  true  |  Do not Play
      3 | overcast |          83 |       78 |  false |  Play
      4 | rain     |          70 |       96 |  false |  Play
      5 | rain     |          68 |       80 |  false |  Play
      6 | rain     |          65 |       70 |  true  |  Do not Play
      7 | overcast |          64 |       65 |  true  |  Play
      8 | sunny    |          72 |       95 |  false |  Do not Play
      9 | sunny    |          69 |       70 |  false |  Play
     10 | rain     |          75 |       80 |  false |  Play
     11 | sunny    |          75 |       70 |  true  |  Play
     12 | overcast |          72 |       90 |  true  |  Play
     13 | overcast |          81 |       75 |  false |  Play
     14 | rain     |          71 |       80 |  true  |  Do not Play
    (14 rows)
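
The page shows the contents of golf_data but not how the table was created. A minimal sketch that would produce the rows listed above follows. The column types are assumptions chosen from the supported data types (the original definition is not shown here), and any leading whitespace that may be present in the displayed text values is not reproduced.

```sql
-- Sketch only: column types are assumed, not taken from the original schema.
CREATE TABLE golf_data (
    id          INT,
    outlook     TEXT,
    temperature INT,
    humidity    INT,
    windy       TEXT,
    class       TEXT
);

INSERT INTO golf_data (id, outlook, temperature, humidity, windy, class) VALUES
    ( 1, 'sunny',    85, 85, 'false', 'Do not Play'),
    ( 2, 'sunny',    80, 90, 'true',  'Do not Play'),
    ( 3, 'overcast', 83, 78, 'false', 'Play'),
    ( 4, 'rain',     70, 96, 'false', 'Play'),
    ( 5, 'rain',     68, 80, 'false', 'Play'),
    ( 6, 'rain',     65, 70, 'true',  'Do not Play'),
    ( 7, 'overcast', 64, 65, 'true',  'Play'),
    ( 8, 'sunny',    72, 95, 'false', 'Do not Play'),
    ( 9, 'sunny',    69, 70, 'false', 'Play'),
    (10, 'rain',     75, 80, 'false', 'Play'),
    (11, 'sunny',    75, 70, 'true',  'Play'),
    (12, 'overcast', 72, 90, 'true',  'Play'),
    (13, 'overcast', 81, 75, 'false', 'Play'),
    (14, 'rain',     71, 80, 'true',  'Do not Play');
```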
    
  2. Train the random forest.
    SELECT * FROM madlib.rf_clean( 'trained_tree_infogain'
                                 );
    SELECT * FROM madlib.rf_train( 'infogain', 
                                   'golf_data', 
                                   'trained_tree_infogain', 
                                   10, 
                                   NULL, 
                                   0.632, 
                                   'temperature,humidity', 
                                   'outlook,temperature,humidity,windy', 
                                   'id', 
                                   'class', 
                                   'explicit', 
                                   10, 
                                   0.0, 
                                   0.0, 
                                   0
                                 );
    
    Result:
      
     training_time  | num_of_samples | num_trees | features_per_node | num_tree_nodes | max_tree_depth | split_criterion |    acs_time     |    acc_time     |    olap_time    |   update_time   |    best_time    
     ---------------+----------------+-----------+-------------------+----------------+----------------+-----------------+-----------------+-----------------+-----------------+-----------------+-----------------
     00:00:03.60498 |             14 |        10 |                 3 |             71 |              6 | infogain        | 00:00:00.154991 | 00:00:00.404411 | 00:00:00.736876 | 00:00:00.374084 | 00:00:01.722658
    (1 row)
    
  3. Check the table records that hold the random forest.
    SELECT * FROM trained_tree_infogain ORDER BY tid, id;
    
    Result:
     id | tree_location | feature |    probability    | ebp_coeff | maxclass |     split_gain     | live | cat_size | parent_id | lmc_nid | lmc_fval | is_feature_cont | split_value | tid | dp_ids 
     ---+---------------+---------+-------------------+-----------+----------+--------------------+------+----------+-----------+---------+----------+-----------------+-------------+-----+--------
      1 | {0}           |       3 | 0.777777777777778 |         1 |        2 |  0.197530864197531 |    0 |        9 |         0 |      24 |        1 | f               |             |   1 | 
     24 | {0,1}         |       4 |                 1 |         1 |        2 |                  0 |    0 |        4 |         1 |         |          | f               |             |   1 | {3}
     25 | {0,2}         |       4 |                 1 |         1 |        2 |                  0 |    0 |        2 |         1 |         |          | f               |             |   1 | {3}
     26 | {0,3}         |       2 | 0.666666666666667 |         1 |        1 |  0.444444444444444 |    0 |        3 |         1 |      42 |        1 | t               |          70 |   1 | {3}
     42 | {0,3,1}       |       4 |                 1 |         1 |        2 |                  0 |    0 |        1 |        26 |         |          | f               |             |   1 | 
     43 | {0,3,2}       |       4 |                 1 |         1 |        1 |                  0 |    0 |        2 |        26 |         |          | f               |             |   1 | 
      2 | {0}           |       2 | 0.555555555555556 |         1 |        1 |   0.17636684303351 |    0 |        9 |         0 |      11 |        1 | t               |          65 |   2 | 
     11 | {0,1}         |       4 |                 1 |         1 |        2 |                  0 |    0 |        2 |         2 |         |          | f               |             |   2 | 
     12 | {0,2}         |       4 | 0.714285714285714 |         1 |        1 |  0.217687074829932 |    0 |        7 |         2 |      44 |        1 | f               |             |   2 | 
     44 | {0,2,1}       |       3 | 0.666666666666667 |         1 |        2 |  0.444444444444444 |    0 |        3 |        12 |      57 |        1 | f               |             |   2 | {4}
     45 | {0,2,2}       |       3 |                 1 |         1 |        1 |                  0 |    0 |        4 |        12 |         |          | f               |             |   2 | {4}
     57 | {0,2,1,1}     |       2 |                 1 |         1 |        2 |                  0 |    0 |        1 |        44 |         |          | t               |          78 |   2 | {4,3}
     58 | {0,2,1,2}     |       2 |                 1 |         1 |        2 |                  0 |    0 |        1 |        44 |         |          | t               |          96 |   2 | {4,3}
     59 | {0,2,1,3}     |       2 |                 1 |         1 |        1 |                  0 |    0 |        1 |        44 |         |          | t               |          85 |   2 | {4,3}
      3 | {0}           |       2 | 0.777777777777778 |         1 |        2 |  0.197530864197531 |    0 |        9 |         0 |      27 |        1 | t               |          80 |   3 | 
     27 | {0,1}         |       4 |                 1 |         1 |        2 |                  0 |    0 |        6 |         3 |         |          | f               |             |   3 | 
     28 | {0,2}         |       2 | 0.666666666666667 |         1 |        1 |  0.444444444444444 |    0 |        3 |         3 |      46 |        1 | t               |          90 |   3 | 
     46 | {0,2,1}       |       4 |                 1 |         1 |        1 |                  0 |    0 |        2 |        28 |         |          | f               |             |   3 | 
     47 | {0,2,2}       |       4 |                 1 |         1 |        2 |                  0 |    0 |        1 |        28 |         |          | f               |             |   3 | 
      4 | {0}           |       4 | 0.888888888888889 |         1 |        2 | 0.0493827160493827 |    0 |        9 |         0 |      13 |        1 | f               |             |   4 | 
     13 | {0,1}         |       3 |                 1 |         1 |        2 |                  0 |    0 |        6 |         4 |         |          | f               |             |   4 | {4}
     14 | {0,2}         |       3 | 0.666666666666667 |         1 |        2 |  0.444444444444444 |    0 |        3 |         4 |      48 |        1 | f               |             |   4 | {4}
     48 | {0,2,1}       |       2 |                 1 |         1 |        2 |                  0 |    0 |        2 |        14 |         |          | t               |          90 |   4 | {4,3}
     49 | {0,2,2}       |       2 |                 1 |         1 |        1 |                  0 |    0 |        1 |        14 |         |          | t               |          80 |   4 | {4,3}
      5 | {0}           |       2 | 0.888888888888889 |         1 |        2 |  0.197530864197531 |    0 |        9 |         0 |      29 |        1 | t               |          90 |   5 | 
     29 | {0,1}         |       4 |                 1 |         1 |        2 |                  0 |    0 |        8 |         5 |         |          | f               |             |   5 | 
     30 | {0,2}         |       3 |                 1 |         1 |        1 |                  0 |    0 |        1 |         5 |         |          | f               |             |   5 | 
      6 | {0}           |       3 | 0.555555555555556 |         1 |        2 |  0.345679012345679 |    0 |        9 |         0 |      15 |        1 | f               |             |   6 | 
     15 | {0,1}         |       4 |                 1 |         1 |        2 |                  0 |    0 |        3 |         6 |         |          | f               |             |   6 | {3}
     16 | {0,2}         |       4 | 0.666666666666667 |         1 |        2 |  0.444444444444444 |    0 |        3 |         6 |      51 |        1 | f               |             |   6 | {3}
     17 | {0,3}         |       4 |                 1 |         1 |        1 |                  0 |    0 |        3 |         6 |         |          | f               |             |   6 | {3}
     51 | {0,2,1}       |       2 |                 1 |         1 |        2 |                  0 |    0 |        2 |        16 |         |          | t               |          96 |   6 | {3,4}
     52 | {0,2,2}       |       2 |                 1 |         1 |        1 |                  0 |    0 |        1 |        16 |         |          | t               |          70 |   6 | {3,4}
      7 | {0}           |       4 | 0.666666666666667 |         1 |        2 |  0.253968253968254 |    0 |        9 |         0 |      31 |        1 | f               |             |   7 | 
     31 | {0,1}         |       2 | 0.857142857142857 |         1 |        2 |  0.102040816326531 |    0 |        7 |         7 |      36 |        1 | t               |          80 |   7 | {4}
     32 | {0,2}         |       3 |                 1 |         1 |        1 |                  0 |    0 |        2 |         7 |         |          | f               |             |   7 | {4}
     36 | {0,1,1}       |       4 |                 1 |         1 |        2 |                  0 |    0 |        5 |        31 |         |          | f               |             |   7 | 
     37 | {0,1,2}       |       2 |               0.5 |         1 |        2 |                0.5 |    0 |        2 |        31 |      60 |        1 | t               |          95 |   7 | 
     60 | {0,1,2,1}     |       4 |                 1 |         1 |        1 |                  0 |    0 |        1 |        37 |         |          | f               |             |   7 | 
     61 | {0,1,2,2}     |       4 |                 1 |         1 |        2 |                  0 |    0 |        1 |        37 |         |          | f               |             |   7 | 
      8 | {0}           |       3 | 0.777777777777778 |         1 |        2 | 0.0864197530864197 |    0 |        9 |         0 |      18 |        1 | f               |             |   8 | 
     18 | {0,1}         |       4 |                 1 |         1 |        2 |                  0 |    0 |        4 |         8 |         |          | f               |             |   8 | {3}
     19 | {0,2}         |       4 | 0.666666666666667 |         1 |        2 |  0.444444444444444 |    0 |        3 |         8 |      38 |        1 | f               |             |   8 | {3}
     20 | {0,3}         |       2 |               0.5 |         1 |        2 |                0.5 |    0 |        2 |         8 |      53 |        1 | t               |          70 |   8 | {3}
     38 | {0,2,1}       |       2 |                 1 |         1 |        2 |                  0 |    0 |        2 |        19 |         |          | t               |          80 |   8 | {3,4}
     39 | {0,2,2}       |       2 |                 1 |         1 |        1 |                  0 |    0 |        1 |        19 |         |          | t               |          80 |   8 | {3,4}
     53 | {0,3,1}       |       4 |                 1 |         1 |        2 |                  0 |    0 |        1 |        20 |         |          | f               |             |   8 | 
     54 | {0,3,2}       |       4 |                 1 |         1 |        1 |                  0 |    0 |        1 |        20 |         |          | f               |             |   8 | 
      9 | {0}           |       3 | 0.555555555555556 |         1 |        2 |  0.327160493827161 |    0 |        9 |         0 |      33 |        1 | f               |             |   9 | 
     33 | {0,1}         |       4 |                 1 |         1 |        2 |                  0 |    0 |        2 |         9 |         |          | f               |             |   9 | {3}
     34 | {0,2}         |       4 |              0.75 |         1 |        2 |              0.375 |    0 |        4 |         9 |      55 |        1 | f               |             |   9 | {3}
     35 | {0,3}         |       4 |                 1 |         1 |        1 |                  0 |    0 |        3 |         9 |         |          | f               |             |   9 | {3}
     55 | {0,2,1}       |       2 |                 1 |         1 |        2 |                  0 |    0 |        3 |        34 |         |          | t               |          96 |   9 | {3,4}
     56 | {0,2,2}       |       2 |                 1 |         1 |        1 |                  0 |    0 |        1 |        34 |         |          | t               |          70 |   9 | {3,4}
     10 | {0}           |       3 | 0.666666666666667 |         1 |        2 |  0.277777777777778 |    0 |        9 |         0 |      21 |        1 | f               |             |  10 | 
     21 | {0,1}         |       4 |                 1 |         1 |        2 |                  0 |    0 |        1 |        10 |         |          | f               |             |  10 | {3}
     22 | {0,2}         |       4 |                 1 |         1 |        2 |                  0 |    0 |        4 |        10 |         |          | f               |             |  10 | {3}
     23 | {0,3}         |       2 |              0.75 |         1 |        1 |              0.375 |    0 |        4 |        10 |      40 |        1 | t               |          70 |  10 | {3}
     40 | {0,3,1}       |       4 |                 1 |         1 |        2 |                  0 |    0 |        1 |        23 |         |          | f               |             |  10 | 
     41 | {0,3,2}       |       4 |                 1 |         1 |        1 |                  0 |    0 |        3 |        23 |         |          | f               |             |  10 | 
    (60 rows)
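
The per-node rows can be summarized with ordinary SQL. The query below is a sketch that counts the nodes of each tree and reports the length of the longest tree_location path. It assumes that tid identifies the tree and that tree_location is an array column, as the output above suggests.

```sql
-- Sketch only: assumes tree_location is an array and tid is the tree id.
SELECT tid,
       count(*)                            AS num_nodes,
       max(array_upper(tree_location, 1))  AS longest_path
FROM trained_tree_infogain
GROUP BY tid
ORDER BY tid;
```

For the rows listed above, the num_nodes values add up to the 60 rows in the table.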
    
  4. Display the random forest in a human-readable format.
    SELECT * FROM madlib.rf_display( 'trained_tree_infogain'
                                   );
    
    Result:
                                          
                                                 rf_display                                              
     ----------------------------------------------------------------------------------------------------
                                                                                                          
     Tree 1                                                                                              
         Root Node  : class( Play)   num_elements(9)  predict_prob(0.777777777777778)                    
             outlook:  = overcast  : class( Play)   num_elements(4)  predict_prob(1)                     
             outlook:  = rain  : class( Play)   num_elements(2)  predict_prob(1)                         
             outlook:  = sunny  : class( Do not Play)   num_elements(3)  predict_prob(0.666666666666667) 
                 humidity:  <= 70  : class( Play)   num_elements(1)  predict_prob(1)                     
                 humidity:  > 70  : class( Do not Play)   num_elements(2)  predict_prob(1)               
     
     Tree 2                                                                                              
         Root Node  : class( Do not Play)   num_elements(9)  predict_prob(0.555555555555556)             
             humidity:  <= 65  : class( Play)   num_elements(2)  predict_prob(1)                         
             humidity:  > 65  : class( Do not Play)   num_elements(7)  predict_prob(0.714285714285714)   
                 windy:  =  false  : class( Play)   num_elements(3)  predict_prob(0.666666666666667)     
                     outlook:  = overcast  : class( Play)   num_elements(1)  predict_prob(1)             
                     outlook:  = rain  : class( Play)   num_elements(1)  predict_prob(1)                 
                     outlook:  = sunny  : class( Do not Play)   num_elements(1)  predict_prob(1)         
                 windy:  =  true  : class( Do not Play)   num_elements(4)  predict_prob(1)               
                                                                                                          
     Tree 3                                                                                              
         Root Node  : class( Play)   num_elements(9)  predict_prob(0.777777777777778)                    
             humidity:  <= 80  : class( Play)   num_elements(6)  predict_prob(1)                         
             humidity:  > 80  : class( Do not Play)   num_elements(3)  predict_prob(0.666666666666667)   
                 humidity:  <= 90  : class( Do not Play)   num_elements(2)  predict_prob(1)              
                 humidity:  > 90  : class( Play)   num_elements(1)  predict_prob(1)                      
                                                                                                          
     Tree 4                                                                                              
         Root Node  : class( Play)   num_elements(9)  predict_prob(0.888888888888889)                    
             windy:  =  false  : class( Play)   num_elements(6)  predict_prob(1)                         
             windy:  =  true  : class( Play)   num_elements(3)  predict_prob(0.666666666666667)          
                 outlook:  = overcast  : class( Play)   num_elements(2)  predict_prob(1)                 
                 outlook:  = rain  : class( Do not Play)   num_elements(1)  predict_prob(1)              
                                                                                                          
     Tree 5                                                                                              
         Root Node  : class( Play)   num_elements(9)  predict_prob(0.888888888888889)                    
             humidity:  <= 90  : class( Play)   num_elements(8)  predict_prob(1)                         
             humidity:  > 90  : class( Do not Play)   num_elements(1)  predict_prob(1)                   
                                                                                                          
     Tree 6                                                                                              
         Root Node  : class( Play)   num_elements(9)  predict_prob(0.555555555555556)                    
             outlook:  = overcast  : class( Play)   num_elements(3)  predict_prob(1)                     
             outlook:  = rain  : class( Play)   num_elements(3)  predict_prob(0.666666666666667)         
                 windy:  =  false  : class( Play)   num_elements(2)  predict_prob(1)                     
                 windy:  =  true  : class( Do not Play)   num_elements(1)  predict_prob(1)               
             outlook:  = sunny  : class( Do not Play)   num_elements(3)  predict_prob(1)                 
                                                                                                          
     Tree 7                                                                                              
         Root Node  : class( Play)   num_elements(9)  predict_prob(0.666666666666667)                    
             windy:  =  false  : class( Play)   num_elements(7)  predict_prob(0.857142857142857)         
                 humidity:  <= 80  : class( Play)   num_elements(5)  predict_prob(1)                     
                 humidity:  > 80  : class( Play)   num_elements(2)  predict_prob(0.5)                    
                     humidity:  <= 95  : class( Do not Play)   num_elements(1)  predict_prob(1)          
                     humidity:  > 95  : class( Play)   num_elements(1)  predict_prob(1)                  
             windy:  =  true  : class( Do not Play)   num_elements(2)  predict_prob(1)                   
                                                                                                          
     Tree 8                                                                                              
         Root Node  : class( Play)   num_elements(9)  predict_prob(0.777777777777778)                    
             outlook:  = overcast  : class( Play)   num_elements(4)  predict_prob(1)                     
             outlook:  = rain  : class( Play)   num_elements(3)  predict_prob(0.666666666666667)         
                 windy:  =  false  : class( Play)   num_elements(2)  predict_prob(1)                     
                 windy:  =  true  : class( Do not Play)   num_elements(1)  predict_prob(1)               
             outlook:  = sunny  : class( Play)   num_elements(2)  predict_prob(0.5)                      
                 humidity:  <= 70  : class( Play)   num_elements(1)  predict_prob(1)                     
                 humidity:  > 70  : class( Do not Play)   num_elements(1)  predict_prob(1)               
                                                                                                          
     Tree 9                                                                                              
         Root Node  : class( Play)   num_elements(9)  predict_prob(0.555555555555556)                    
             outlook:  = overcast  : class( Play)   num_elements(2)  predict_prob(1)                     
             outlook:  = rain  : class( Play)   num_elements(4)  predict_prob(0.75)                      
                 windy:  =  false  : class( Play)   num_elements(3)  predict_prob(1)                     
                 windy:  =  true  : class( Do not Play)   num_elements(1)  predict_prob(1)               
             outlook:  = sunny  : class( Do not Play)   num_elements(3)  predict_prob(1)                 
                                                                                                          
     Tree 10                                                                                             
         Root Node  : class( Play)   num_elements(9)  predict_prob(0.666666666666667)                    
             outlook:  = overcast  : class( Play)   num_elements(1)  predict_prob(1)                     
             outlook:  = rain  : class( Play)   num_elements(4)  predict_prob(1)                         
             outlook:  = sunny  : class( Do not Play)   num_elements(4)  predict_prob(0.75)              
                 humidity:  <= 70  : class( Play)   num_elements(1)  predict_prob(1)                     
                 humidity:  > 70  : class( Do not Play)   num_elements(3)  predict_prob(1)                
    (10 rows)
    
  5. Classify data with the learned model.
    SELECT * FROM madlib.rf_classify( 'trained_tree_infogain', 
                                      'golf_data', 
                                      'classification_result'
                                    );
    
    Result:
     input_set_size | classification_time 
     ---------------+---------------------
                 14 | 00:00:02.215017
    (1 row)
    
  6. Check the classification results.
    SELECT t.id, t.outlook, t.temperature, t.humidity, t.windy, c.class 
    FROM classification_result c, golf_data t 
    WHERE t.id=c.id ORDER BY id;
    
    Result:
     id | outlook  | temperature | humidity | windy  |    class     
     ---+----------+-------------+----------+--------+--------------
      1 | sunny    |          85 |       85 |  false |  Do not Play
      2 | sunny    |          80 |       90 |  true  |  Do not Play
      3 | overcast |          83 |       78 |  false |  Play
      4 | rain     |          70 |       96 |  false |  Play
      5 | rain     |          68 |       80 |  false |  Play
      6 | rain     |          65 |       70 |  true  |  Do not Play
      7 | overcast |          64 |       65 |  true  |  Play
      8 | sunny    |          72 |       95 |  false |  Do not Play
      9 | sunny    |          69 |       70 |  false |  Play
     10 | rain     |          75 |       80 |  false |  Play
     11 | sunny    |          75 |       70 |  true  |  Do not Play
     12 | overcast |          72 |       90 |  true  |  Play
     13 | overcast |          81 |       75 |  false |  Play
     14 | rain     |          71 |       80 |  true  |  Do not Play
    (14 rows)
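
Since golf_data also carries the true labels, the predictions can be compared with them directly. The query below is a sketch of that check using plain SQL. It relies only on the id and class columns of classification_result that the query above already uses, and it is not a substitute for rf_score.

```sql
-- Sketch only: fraction of rows whose predicted class equals the true class.
SELECT sum(CASE WHEN c.class = t.class THEN 1 ELSE 0 END)::FLOAT8
       / count(*) AS fraction_correct
FROM classification_result c
JOIN golf_data t ON t.id = c.id;
```

In the output shown above, 13 of the 14 predictions match the labels from step 1 (only id 11 differs), so this fraction would be 13/14, or about 0.9286.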
    
  7. Score the data against a validation set.
    SELECT * FROM madlib.rf_score( 'trained_tree_infogain', 
                                   'golf_data_validation', 
                                   0
                                 );
    
    Result:
         rf_score      
     ------------------
     0.928571428571429
    (1 row)
    
  8. Clean up the random forest and other auxiliary information.
    SELECT madlib.rf_clean( 'trained_tree_infogain'
                          );
    
    Result:
     rf_clean 
     ---------
     t
    (1 row)
    

Literature

[1] http://www.stat.berkeley.edu/~breiman/RandomForests/cc_home.htm

[2] http://en.wikipedia.org/wiki/Discretization_of_continuous_features

Related Topics
File rf.sql_in documenting the SQL functions.