User Documentation
bayes.sql_in File Reference

SQL functions for naive Bayes.

Functions

void create_nb_prepared_data_tables (varchar trainingSource, varchar trainingClassColumn, varchar trainingAttrColumn, integer numAttrs, varchar featureProbsDestName, varchar classPriorsDestName)
 Precompute all class priors and feature probabilities.

void create_nb_classify_view (varchar featureProbsSource, varchar classPriorsSource, varchar classifySource, varchar classifyKeyColumn, varchar classifyAttrColumn, integer numAttrs, varchar destName)
 Create a view with columns (key, nb_classification).

void create_nb_probs_view (varchar featureProbsSource, varchar classPriorsSource, varchar classifySource, varchar classifyKeyColumn, varchar classifyAttrColumn, integer numAttrs, varchar destName)
 Create a view with columns (key, class, nb_prob).

void create_nb_classify_fn (varchar featureProbsSource, varchar classPriorsSource, integer numAttrs, varchar destName)
 Create a SQL function mapping arrays of attribute values to the Naive Bayes classification.
 

Detailed Description

Date
January 2011
See Also
For a brief introduction to Naive Bayes Classification, see the module description Naive Bayes Classification.

Definition in file bayes.sql_in.

Function Documentation

void create_nb_classify_fn ( varchar  featureProbsSource,
varchar  classPriorsSource,
integer  numAttrs,
varchar  destName 
)

The created SQL function is bound to the given feature probabilities and class priors. Its declaration will be:

FUNCTION destName (attributes INTEGER[], smoothingFactor DOUBLE PRECISION) RETURNS INTEGER[]

The return type is INTEGER[] because the Naive Bayes classification might be ambiguous, in which case all of the most likely candidates are returned.

Parameters
featureProbsSource  Name of the table with precomputed feature probabilities, as created with create_nb_prepared_data_tables()
classPriorsSource   Name of the table with precomputed class priors, as created with create_nb_prepared_data_tables()
numAttrs            Number of attributes to use for classification
destName            Name of the function to create
Note
Like create_nb_classify_view and create_nb_probs_view, create_nb_classify_fn can also be called in an ad-hoc fashion. See Naive Bayes Classification for instructions.
Usage:
  1. Create classification function:
    SELECT create_nb_classify_fn(
        'featureProbsSource', 'classPriorsSource',
        numAttrs, 'destName'
    );
  2. Run classification function:
    SELECT destName(attributes, smoothingFactor);
Note
On Greenplum, the generated SQL function can only be called on the master.
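
The array return type described above can be illustrated outside the database. The following is a minimal Python sketch, not the generated SQL function itself: the helper name most_likely_classes and the per-class scores are hypothetical, and how the real function compares or orders tied classes is not stated in this reference.

```python
import math


def most_likely_classes(scores):
    """Return every class whose score ties for the maximum, in ascending order.

    `scores` maps an integer class label to its (hypothetical) Naive Bayes score.
    """
    best = max(scores.values())
    return sorted(
        cls for cls, score in scores.items()
        if math.isclose(score, best, rel_tol=1e-12)
    )


# Hypothetical scores for one record: classes 1 and 2 tie for the top score.
ambiguous = most_likely_classes({1: 0.4, 2: 0.4, 3: 0.2})

# Hypothetical scores with a single clear winner.
unique = most_likely_classes({1: 0.7, 2: 0.3})
```

With a tie, the result holds more than one class, which is why a scalar INTEGER return type would not be sufficient.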

Definition at line 585 of file bayes.sql_in.

void create_nb_classify_view ( varchar  featureProbsSource,
varchar  classPriorsSource,
varchar  classifySource,
varchar  classifyKeyColumn,
varchar  classifyAttrColumn,
integer  numAttrs,
varchar  destName 
)

The created relation will be

{TABLE|VIEW} destName (key, nb_classification)

where nb_classification is an array containing the most likely class(es) of the record in classifySource identified by key.

Parameters
featureProbsSource  Name of the table with precomputed feature probabilities, as created with create_nb_prepared_data_tables()
classPriorsSource   Name of the table with precomputed class priors, as created with create_nb_prepared_data_tables()
classifySource      Name of the relation that contains the data to be classified
classifyKeyColumn   Name of the column in classifySource that can serve as a unique identifier (the key of the source relation)
classifyAttrColumn  Name of the attributes-array column in classifySource
numAttrs            Number of attributes to use for classification
destName            Name of the view to create
Note
create_nb_classify_view can be called in an ad-hoc fashion. See Naive Bayes Classification for instructions.
Usage:
  1. Create Naive Bayes classifications view:
    SELECT create_nb_classify_view(
        'featureProbsSource', 'classPriorsSource',
        'classifySource', 'classifyKeyColumn', 'classifyAttrColumn',
        numAttrs, 'destName'
    );
  2. Show Naive Bayes classifications:
    SELECT * FROM destName;

Definition at line 451 of file bayes.sql_in.

void create_nb_prepared_data_tables ( varchar  trainingSource,
varchar  trainingClassColumn,
varchar  trainingAttrColumn,
integer  numAttrs,
varchar  featureProbsDestName,
varchar  classPriorsDestName 
)

Feature probabilities are stored in a table of the following format:

TABLE featureProbsDestName (
    class INTEGER,
    attr INTEGER,
    value INTEGER,
    cnt INTEGER,
    attr_cnt INTEGER
)

Class priors are stored in a table of the following format:

TABLE classPriorsDestName (
    class INTEGER,
    class_cnt INTEGER,
    all_cnt INTEGER
)
Parameters
trainingSource        Name of the relation containing the training data
trainingClassColumn   Name of the class column in the training data
trainingAttrColumn    Name of the attributes-array column in the training data
numAttrs              Number of attributes to use for classification
featureProbsDestName  Name of the feature-probabilities table to create
classPriorsDestName   Name of the class-priors table to create
Usage:
Precompute feature probabilities and class priors:
SELECT create_nb_prepared_data_tables(
    'trainingSource', 'trainingClassColumn', 'trainingAttrColumn',
    numAttrs, 'featureProbsDestName', 'classPriorsDestName'
);
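
To make the two table layouts more concrete, here is a minimal Python sketch that derives rows of the same shape from a toy training set. It rests on assumptions read off the column names rather than on the source code: cnt is taken to be the number of training rows of a class in which attribute attr has the given value, attr_cnt the number of distinct values observed for attr, class_cnt the number of rows per class, and all_cnt the total row count. Attributes are numbered from 1, as in SQL arrays. Only observed (class, attr, value) triples are listed; whether the real table also holds zero-count rows is not stated here. The name prepare_nb_tables is hypothetical.

```python
from collections import Counter


def prepare_nb_tables(rows, num_attrs):
    """Build rows shaped like the feature-probabilities and class-priors tables.

    `rows` is a sequence of (class_label, attributes) pairs, where
    `attributes` is a list of integers with at least `num_attrs` entries.
    """
    rows = list(rows)
    class_counts = Counter(cls for cls, _ in rows)
    triple_counts = Counter()
    distinct_values = {i: set() for i in range(1, num_attrs + 1)}

    for cls, attrs in rows:
        for i in range(1, num_attrs + 1):  # 1-based attribute index
            value = attrs[i - 1]
            triple_counts[(cls, i, value)] += 1
            distinct_values[i].add(value)

    # Rows of the form (class, attr, value, cnt, attr_cnt).
    feature_probs = [
        (cls, attr, value, cnt, len(distinct_values[attr]))
        for (cls, attr, value), cnt in sorted(triple_counts.items())
    ]
    # Rows of the form (class, class_cnt, all_cnt).
    all_cnt = len(rows)
    class_priors = [
        (cls, cnt, all_cnt) for cls, cnt in sorted(class_counts.items())
    ]
    return feature_probs, class_priors


# Toy training data: (class, attributes).
training = [(1, [1, 2]), (1, [1, 3]), (2, [2, 2])]
feature_probs, class_priors = prepare_nb_tables(training, 2)
```

Storing raw counts rather than finished probabilities leaves the choice of smoothing factor open until classification time.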

Definition at line 402 of file bayes.sql_in.

void create_nb_probs_view ( varchar  featureProbsSource,
varchar  classPriorsSource,
varchar  classifySource,
varchar  classifyKeyColumn,
varchar  classifyAttrColumn,
integer  numAttrs,
varchar  destName 
)

The created view will be of the following form:

VIEW destName (
    key ANYTYPE,
    class INTEGER,
    nb_prob FLOAT8
)

where nb_prob is the Naive-Bayes probability that class is the true class of the record in classifySource identified by key.

Parameters
featureProbsSource  Name of the table with precomputed feature probabilities, as created with create_nb_prepared_data_tables()
classPriorsSource   Name of the table with precomputed class priors, as created with create_nb_prepared_data_tables()
classifySource      Name of the relation that contains the data to be classified
classifyKeyColumn   Name of the column in classifySource that can serve as a unique identifier (the key of the source relation)
classifyAttrColumn  Name of the attributes-array column in classifySource
numAttrs            Number of attributes to use for classification
destName            Name of the view to create
Note
create_nb_probs_view can be called in an ad-hoc fashion. See Naive Bayes Classification for instructions.
Usage:
  1. Create Naive Bayes probabilities view:
    SELECT create_nb_probs_view(
        'featureProbsSource', 'classPriorsSource',
        'classifySource', 'classifyKeyColumn', 'classifyAttrColumn',
        numAttrs, 'destName'
    );
  2. Show Naive Bayes probabilities:
    SELECT * FROM destName;
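
As a rough illustration of what nb_prob could contain, the Python sketch below computes a normalized posterior per class from rows shaped like the two precomputed tables. It assumes the standard Laplace-smoothed estimate (cnt + s) / (class_cnt + s * attr_cnt) for each feature probability and class_cnt / all_cnt for each prior. The function name nb_probs, the smoothing default of 1, and the sample rows are all hypothetical, and the view's actual formula should be checked against the module description.

```python
import math


def nb_probs(attributes, feature_probs, class_priors, smoothing=1.0):
    """Return {class: posterior probability} for one attribute array.

    feature_probs rows: (class, attr, value, cnt, attr_cnt)
    class_priors rows:  (class, class_cnt, all_cnt)
    `smoothing` must be positive so that unseen values do not yield log(0).
    """
    counts = {(c, a, v): cnt for c, a, v, cnt, _ in feature_probs}
    attr_cnt = {a: n for _, a, _, _, n in feature_probs}

    log_scores = {}
    for cls, class_cnt, all_cnt in class_priors:
        score = math.log(class_cnt / all_cnt)  # log prior
        for i, value in enumerate(attributes, start=1):
            cnt = counts.get((cls, i, value), 0)
            score += math.log(
                (cnt + smoothing) / (class_cnt + smoothing * attr_cnt[i])
            )
        log_scores[cls] = score

    # Normalize in a numerically stable way (subtract the maximum first).
    top = max(log_scores.values())
    weights = {c: math.exp(s - top) for c, s in log_scores.items()}
    total = sum(weights.values())
    return {c: w / total for c, w in weights.items()}


# Hypothetical table contents for two classes and two attributes.
sample_feature_probs = [
    (1, 1, 1, 2, 2),
    (1, 2, 2, 1, 2),
    (1, 2, 3, 1, 2),
    (2, 1, 2, 1, 2),
    (2, 2, 2, 1, 2),
]
sample_class_priors = [(1, 2, 3), (2, 1, 3)]

probs = nb_probs([1, 2], sample_feature_probs, sample_class_priors)
```

For the attribute array [1, 2], the unnormalized scores are 2/3 * 3/4 * 1/2 = 1/4 for class 1 and 1/3 * 1/3 * 2/3 = 2/27 for class 2, so the normalized probabilities are 27/35 and 8/35.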

Definition at line 518 of file bayes.sql_in.