User Documentation
multilogistic.sql_in File Reference

SQL functions for multinomial logistic regression.

Functions

mlogregr_result mlogregr (varchar source, varchar depvar, varchar indepvar, integer maxnumiterations=20, varchar optimizer="irls", float8 precision=0.0001, integer ref_category)
 Compute logistic-regression coefficients and diagnostic statistics.
 

Detailed Description

Date
July 2012
See Also
For a brief introduction to multinomial logistic regression, see the module description Multinomial Logistic Regression.

Definition in file multilogistic.sql_in.

Function Documentation

mlogregr_result mlogregr ( varchar  source,
varchar  depvar,
varchar  indepvar,
integer  maxnumiterations = 20,
varchar  optimizer = "irls",
float8  precision = 0.0001,
integer  ref_category 
)

To include an intercept in the model, set one coordinate in the independentVariables array to 1.
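As a minimal sketch of the intercept convention described above, the training table can store a constant 1 in each row's array of independent variables. The table name patients_train and its column names are hypothetical, and placing the constant in the first position is only one possible choice.

```sql
-- Hypothetical training table; none of these names are part of the library.
CREATE TABLE patients_train (
    outcome  INTEGER,             -- dependent variable (category code)
    features DOUBLE PRECISION[]   -- independent variables
);

-- The leading 1 in every array is the constant coordinate that acts as
-- the intercept term; the remaining entries are ordinary predictors.
INSERT INTO patients_train (outcome, features) VALUES
    (0, ARRAY[1, 23.0, 0.5]::DOUBLE PRECISION[]),
    (1, ARRAY[1, 41.0, 1.2]::DOUBLE PRECISION[]),
    (2, ARRAY[1, 35.0, 0.9]::DOUBLE PRECISION[]);
```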

Parameters
source - Name of the source relation containing the training data
depvar - Name of the dependent column (of type INTEGER, with values less than the number of categories)
indepvar - Name of the independent column (of type DOUBLE PRECISION[])
maxnumiterations - The maximum number of iterations
optimizer - The optimizer to use ('irls' or 'newton', both of which select iteratively reweighted least squares)
precision - The difference between log-likelihood values in successive iterations that should indicate convergence. Note that a non-positive value here disables the convergence criterion, and execution will only stop after maxnumiterations iterations.
ref_category - The reference category specified by the user
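The following sketch shows how the parameters above might map onto a call with every argument written out in the documented order. The relation and column names are hypothetical, and the values for maxnumiterations, optimizer and precision simply repeat the defaults from the signature; the reference category 0 is an arbitrary choice for illustration.

```sql
-- Sketch only: 'patients_train', 'outcome' and 'features' are hypothetical names.
SELECT *
FROM mlogregr(
    'patients_train',  -- source: relation holding the training data
    'outcome',         -- depvar: INTEGER category column
    'features',        -- indepvar: DOUBLE PRECISION[] column
    20,                -- maxnumiterations
    'irls',            -- optimizer
    0.0001,            -- precision: log-likelihood change that signals convergence
    0                  -- ref_category (illustrative choice)
);
```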
Returns
A composite value:
  • ref_category INTEGER - Reference category
  • coef FLOAT8[] - Array of coefficients, \( \boldsymbol c \)
  • log_likelihood FLOAT8 - Log-likelihood \( l(\boldsymbol c) \)
  • std_err FLOAT8[] - Array of standard errors, \( \mathit{se}(c_1), \dots, \mathit{se}(c_k) \)
  • z_stats FLOAT8[] - Array of Wald z-statistics, \( \boldsymbol z \)
  • p_values FLOAT8[] - Array of Wald p-values, \( \boldsymbol p \)
  • odds_ratios FLOAT8[] - Array of odds ratios, \( \mathit{odds}(c_1), \dots, \mathit{odds}(c_k) \)
  • condition_no FLOAT8 - The condition number of matrix \( X^T A X \) during the iteration immediately preceding convergence (i.e., \( A \) is computed using the coefficients of the previous iteration)
  • num_iterations INTEGER - The number of iterations before the algorithm terminated
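The diagnostic fields in the composite value above can be inspected directly after a fit. The sketch below is one way to do so, with hypothetical relation and column names. It assumes that num_iterations equals maxnumiterations when the run stops at the iteration limit rather than on the precision criterion; reaching the limit does not by itself prove that the fit failed to converge.

```sql
-- Sketch only: relation and column names are hypothetical.
-- maxnumiterations is passed explicitly as 20, so the comparison below
-- refers to the same limit.
SELECT num_iterations,
       num_iterations >= 20 AS reached_iteration_limit,
       condition_no,        -- a very large value suggests an ill-conditioned problem
       log_likelihood
FROM mlogregr('patients_train', 'outcome', 'features',
              20, 'irls', 0.0001, 0);
```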
Usage:
  • Get vector of coefficients \( \boldsymbol c \) and all diagnostic statistics:
    SELECT * FROM mlogregr('sourceName', 'dependentVariable',
       'independentVariables');
  • Get vector of coefficients \( \boldsymbol c \):
    SELECT (mlogregr('sourceName', 'dependentVariable',
       'independentVariables')).coef;
  • Get a subset of the output columns, e.g., only the array of coefficients \( \boldsymbol c \), the log-likelihood \( l(\boldsymbol c) \), and the array of p-values \( \boldsymbol p \):
    SELECT coef, log_likelihood, p_values
       FROM mlogregr('sourceName', 'dependentVariable',
       'independentVariables');
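Because the per-coefficient outputs are parallel arrays, it can be convenient to expand them into one row per coefficient. The sketch below stores the result in a table first, so that the iterative algorithm runs only once, and then expands the arrays with generate_series. All relation and column names are hypothetical, and the sketch assumes the arrays use PostgreSQL's default lower bound of 1.

```sql
-- Sketch only: 'patients_train', 'outcome', 'features' and
-- 'mlogregr_fit_demo' are hypothetical names.
CREATE TABLE mlogregr_fit_demo AS
SELECT *
FROM mlogregr('patients_train', 'outcome', 'features',
              20, 'irls', 0.0001, 0);

-- One output row per coefficient index i.
SELECT i              AS coef_index,
       coef[i]        AS coef,
       std_err[i]     AS std_err,
       z_stats[i]     AS z_stat,
       p_values[i]    AS p_value,
       odds_ratios[i] AS odds_ratio
FROM (
    SELECT *, generate_series(1, array_upper(coef, 1)) AS i
    FROM mlogregr_fit_demo
) AS expanded
ORDER BY coef_index;
```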
Note
This function starts an iterative algorithm. It is not an aggregate function. Source and column names have to be passed as strings (due to limitations of the SQL syntax).

Definition at line 393 of file multilogistic.sql_in.