MADlib
1.1
User Documentation
SQL functions for naive Bayes.
Functions
void | create_nb_prepared_data_tables (varchar trainingSource, varchar trainingClassColumn, varchar trainingAttrColumn, integer numAttrs, varchar featureProbsDestName, varchar classPriorsDestName) | Precompute all class priors and feature probabilities. |
void | create_nb_classify_view (varchar featureProbsSource, varchar classPriorsSource, varchar classifySource, varchar classifyKeyColumn, varchar classifyAttrColumn, integer numAttrs, varchar destName) | Create a view with columns (key, nb_classification). |
void | create_nb_probs_view (varchar featureProbsSource, varchar classPriorsSource, varchar classifySource, varchar classifyKeyColumn, varchar classifyAttrColumn, integer numAttrs, varchar destName) | Create a view with columns (key, class, nb_prob). |
void | create_nb_classify_fn (varchar featureProbsSource, varchar classPriorsSource, integer numAttrs, varchar destName) | Create a SQL function mapping arrays of attribute values to the Naive Bayes classification. |
Definition in file bayes.sql_in.
void create_nb_classify_fn (varchar featureProbsSource, varchar classPriorsSource, integer numAttrs, varchar destName)
The created SQL function is bound to the given feature probabilities and class priors. Its declaration will be:
FUNCTION destName (attributes INTEGER[], smoothingFactor DOUBLE PRECISION) RETURNS INTEGER[]
The return type is INTEGER[] because the Naive Bayes classification might be ambiguous (in which case all of the most likely candidates are returned).
featureProbsSource | Name of table with precomputed feature probabilities, as created with create_nb_prepared_data_tables() |
classPriorsSource | Name of table with precomputed class priors, as created with create_nb_prepared_data_tables() |
numAttrs | Number of attributes to use for classification |
destName | Name of the function to create |
create_nb_classify_fn can be called in an ad-hoc fashion. See Naive Bayes Classification for instructions.
SELECT create_nb_classify_fn( 'featureProbsSource', 'classPriorsSource', numAttrs, 'destName' );
SELECT destName(attributes, smoothingFactor);
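A minimal usage sketch follows. It assumes that tables named nb_feature_probs and nb_class_priors (hypothetical names) were already produced by create_nb_prepared_data_tables() for 3 attributes, and that the MADlib functions are reachable without a schema prefix, as in the calls above. The function name nb_classify, the attribute values, and the smoothing factor 1.0 are illustrative choices only.

```sql
-- Hypothetical names: nb_feature_probs, nb_class_priors, nb_classify.
-- Bind a classification function to the precomputed tables (3 attributes).
SELECT create_nb_classify_fn('nb_feature_probs', 'nb_class_priors', 3, 'nb_classify');

-- Classify one attribute vector; the second argument is the smoothing factor.
-- The result is an INTEGER[] holding every class tied for most likely.
SELECT nb_classify(ARRAY[1, 0, 2], 1.0) AS classes;

-- Expand the array so that each candidate class becomes its own row.
SELECT unnest(nb_classify(ARRAY[1, 0, 2], 1.0)) AS candidate_class;
```

Expanding the array with unnest() is one way to handle the ambiguous case, since a tie yields more than one row rather than a single value.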
Definition at line 585 of file bayes.sql_in.
void create_nb_classify_view (varchar featureProbsSource, varchar classPriorsSource, varchar classifySource, varchar classifyKeyColumn, varchar classifyAttrColumn, integer numAttrs, varchar destName)
The created relation will be
{TABLE|VIEW} destName (key, nb_classification)
where nb_classification is an array containing the most likely class(es) of the record in classifySource identified by key.
featureProbsSource | Name of table with precomputed feature probabilities, as created with create_nb_prepared_data_tables() |
classPriorsSource | Name of table with precomputed class priors, as created with create_nb_prepared_data_tables() |
classifySource | Name of the relation that contains data to be classified |
classifyKeyColumn | Name of column in classifySource that can serve as unique identifier (the key of the source relation) |
classifyAttrColumn | Name of attributes-array column in classifySource |
numAttrs | Number of attributes to use for classification |
destName | Name of the view to create |
create_nb_classify_view can be called in an ad-hoc fashion. See Naive Bayes Classification for instructions.
SELECT create_nb_classify_view( 'featureProbsSource', 'classPriorsSource', 'classifySource', 'classifyKeyColumn', 'classifyAttrColumn', numAttrs, 'destName' );
SELECT * FROM destName;
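As an illustration of the call above, the following sketch classifies a small hypothetical table. The names nb_to_classify, nb_classified, nb_feature_probs, and nb_class_priors are assumptions, as is the presence of the two precomputed tables (created beforehand with create_nb_prepared_data_tables() for 3 attributes).

```sql
-- Hypothetical input relation: a unique key plus an attributes array.
CREATE TABLE nb_to_classify (id INTEGER, attributes INTEGER[]);
INSERT INTO nb_to_classify VALUES
    (1, ARRAY[1, 0, 2]),
    (2, ARRAY[0, 1, 1]);

-- Create the classification view over the input relation.
SELECT create_nb_classify_view(
    'nb_feature_probs', 'nb_class_priors',
    'nb_to_classify', 'id', 'attributes',
    3, 'nb_classified');

-- One row per input record: its key and the array of most likely classes.
SELECT key, nb_classification FROM nb_classified ORDER BY key;

-- Records whose classification is ambiguous (more than one candidate class),
-- assuming the arrays use PostgreSQL's default lower bound of 1.
SELECT key, nb_classification
FROM nb_classified
WHERE array_upper(nb_classification, 1) > 1;
```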
Definition at line 451 of file bayes.sql_in.
void create_nb_prepared_data_tables (varchar trainingSource, varchar trainingClassColumn, varchar trainingAttrColumn, integer numAttrs, varchar featureProbsDestName, varchar classPriorsDestName)
Feature probabilities are stored in a table of format
TABLE featureProbsDestName ( class INTEGER, attr INTEGER, value INTEGER, cnt INTEGER, attr_cnt INTEGER )
Class priors are stored in a table of format
TABLE classPriorsDestName ( class INTEGER, class_cnt INTEGER, all_cnt INTEGER )
trainingSource | Name of relation containing the training data |
trainingClassColumn | Name of class column in training data |
trainingAttrColumn | Name of attributes-array column in training data |
numAttrs | Number of attributes to use for classification |
featureProbsDestName | Name of feature-probabilities table to create |
classPriorsDestName | Name of class-priors table to create |
SELECT create_nb_prepared_data_tables( 'trainingSource', 'trainingClassColumn', 'trainingAttrColumn', numAttrs, 'featureProbsDestName', 'classPriorsDestName' );
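The sketch below shows one way the training step might look end to end. The table nb_training, its rows, and the destination names nb_feature_probs and nb_class_priors are all hypothetical. Reading class_cnt / all_cnt as the class prior is an interpretation of the column names in the table format above, not something this page states explicitly.

```sql
-- Hypothetical training relation: an integer class label plus an attributes array.
CREATE TABLE nb_training (class INTEGER, attributes INTEGER[]);
INSERT INTO nb_training VALUES
    (1, ARRAY[1, 0, 2]),
    (1, ARRAY[1, 1, 2]),
    (2, ARRAY[0, 1, 1]);

-- Precompute the feature-probabilities and class-priors tables for 3 attributes.
SELECT create_nb_prepared_data_tables(
    'nb_training', 'class', 'attributes',
    3, 'nb_feature_probs', 'nb_class_priors');

-- Inspect the class counts; the ratio is an assumed estimate of the class prior.
SELECT class, class_cnt, all_cnt,
       class_cnt::DOUBLE PRECISION / all_cnt AS assumed_prior
FROM nb_class_priors
ORDER BY class;

-- Inspect the per-class, per-attribute value counts.
SELECT class, attr, value, cnt, attr_cnt
FROM nb_feature_probs
ORDER BY class, attr, value;
```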
Definition at line 402 of file bayes.sql_in.
void create_nb_probs_view (varchar featureProbsSource, varchar classPriorsSource, varchar classifySource, varchar classifyKeyColumn, varchar classifyAttrColumn, integer numAttrs, varchar destName)
The created view will be of the following form:
VIEW destName ( key ANYTYPE, class INTEGER, nb_prob FLOAT8 )
where nb_prob is the Naive-Bayes probability that class is the true class of the record in classifySource identified by key.
featureProbsSource | Name of table with precomputed feature probabilities, as created with create_nb_prepared_data_tables() |
classPriorsSource | Name of table with precomputed class priors, as created with create_nb_prepared_data_tables() |
classifySource | Name of the relation that contains data to be classified |
classifyKeyColumn | Name of column in classifySource that can serve as unique identifier (the key of the source relation) |
classifyAttrColumn | Name of attributes-array column in classifySource |
numAttrs | Number of attributes to use for classification |
destName | Name of the view to create |
create_nb_probs_view can be called in an ad-hoc fashion. See Naive Bayes Classification for instructions.
SELECT create_nb_probs_view( 'featureProbsSource', 'classPriorsSource', 'classifySource', 'classifyKeyColumn', 'classifyAttrColumn', numAttrs, 'destName' );
SELECT * FROM destName;
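A short sketch of how the probabilities view might be queried follows. It assumes hypothetical relations nb_feature_probs and nb_class_priors (from create_nb_prepared_data_tables() with 3 attributes) and nb_to_classify (columns id and attributes). The view name nb_probs is likewise an illustrative choice.

```sql
-- Create the probabilities view over the hypothetical input relation.
SELECT create_nb_probs_view(
    'nb_feature_probs', 'nb_class_priors',
    'nb_to_classify', 'id', 'attributes',
    3, 'nb_probs');

-- All (key, class) pairs, most probable class first within each key.
SELECT key, class, nb_prob
FROM nb_probs
ORDER BY key, nb_prob DESC;

-- Keep only the top-ranked class(es) per key; ties produce several rows.
SELECT key, class, nb_prob
FROM (
    SELECT key, class, nb_prob,
           rank() OVER (PARTITION BY key ORDER BY nb_prob DESC) AS rnk
    FROM nb_probs
) ranked
WHERE rnk = 1
ORDER BY key;
```

Using rank() rather than row_number() keeps every class tied for the highest probability, which mirrors the array returned by the classification view.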
Definition at line 518 of file bayes.sql_in.