MADlib 1.0 User Documentation
This module implements a "partial SVD decomposition" method for representing a sparse matrix using a low-rank approximation. Mathematically, for a given matrix A, the algorithm seeks matrices U and V that minimize:
\[ ||\boldsymbol A - \boldsymbol{UV} ||_F \]
subject to \( \operatorname{rank}(\boldsymbol{UV}) \leq k \), where \( ||\cdot||_F \) denotes the Frobenius norm and \( k \leq \operatorname{rank}(\boldsymbol A) \). If A is \( m \times n \), then U is \( m \times k \) and V is \( k \times n \).
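To make the objective concrete, the following minimal sketch evaluates the Frobenius norm of A - UV for small dense matrices stored as Python lists. The helper names `matmul` and `frobenius_error` are hypothetical and are not part of MADlib; the example only illustrates the quantity being minimized.

```python
import math

def matmul(U, V):
    """Multiply an m x k matrix by a k x n matrix (lists of lists)."""
    k = len(V)
    n = len(V[0])
    return [[sum(row[f] * V[f][j] for f in range(k)) for j in range(n)]
            for row in U]

def frobenius_error(A, U, V):
    """Return the Frobenius norm of A - UV."""
    P = matmul(U, V)
    return math.sqrt(sum((a - p) ** 2
                         for row_a, row_p in zip(A, P)
                         for a, p in zip(row_a, row_p)))

# Illustrative data (not from the MADlib docs): a rank-1 matrix A
# that factors exactly with k = 1.
A = [[1.0, 2.0], [2.0, 4.0]]
U = [[1.0], [2.0]]   # m x k
V = [[1.0, 2.0]]     # k x n
exact_error = frobenius_error(A, U, V)
```

Because A here has rank 1, a factorization with k = 1 can reproduce it exactly; for a matrix of higher rank the error would generally be positive.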
This algorithm is not intended to compute the full decomposition, or to be used as part of a matrix inversion procedure. It effectively computes the SVD of a low-rank approximation of A (preferably sparse), with the singular values absorbed into U and V. The code is based on the write-up in [1], with some modifications.
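As a rough illustration of the kind of incremental, gradient-based factorization described in [1], the sketch below trains one feature at a time on the residual left by the features already trained. It is not MADlib's implementation: the function name `funk_factorize`, the learning rate, the epoch count, the initial value, and the omission of regularization are all assumptions made for this example.

```python
def funk_factorize(cells, m, n, k, lrate=0.01, epochs=1000, init=0.1):
    """Factorize a matrix given as (row, col, value) cells, 0-based.

    Returns (U, V) where U is m x k and V is k x n, so that the
    product UV approximates the input. Plain gradient steps only;
    no regularization (an assumption of this sketch).
    """
    U = [[init] * k for _ in range(m)]
    V = [[init] * n for _ in range(k)]
    # Residual that remains after the features trained so far.
    resid = {(i, j): float(val) for i, j, val in cells}
    for f in range(k):
        for _ in range(epochs):
            for (i, j), r in resid.items():
                err = r - U[i][f] * V[f][j]
                u_old = U[i][f]
                U[i][f] += lrate * err * V[f][j]
                V[f][j] += lrate * err * u_old
        # Remove this feature's contribution before training the next.
        for key in resid:
            i, j = key
            resid[key] -= U[i][f] * V[f][j]
    return U, V

# Illustrative rank-1 input, so a single feature can fit it closely.
cells = [(0, 0, 1.0), (0, 1, 2.0), (1, 0, 2.0), (1, 1, 4.0)]
U, V = funk_factorize(cells, m=2, n=2, k=1)
approx = {(i, j): U[i][0] * V[0][j] for i, j, _ in cells}
```

No separate vector of singular values is produced: the scale of each feature ends up split between the corresponding column of U and row of V, which matches the description above of singular values being absorbed into the two factors.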
{TABLE|VIEW} input_table ( col_num INTEGER, row_num INTEGER, value FLOAT )
The input is a table in which the column and row numbers of each cell are sequential; that is, if the data were written out as a matrix, these values would be the actual row and column indices rather than arbitrary identifiers. Every cell must have a value: there must be no missing rows, columns, or values.
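The input requirements above can be sketched as a small pre-processing step: flatten a dense matrix into (col_num, row_num, value) tuples in the same order as the table definition, then check that no cell of the grid is absent. The helper names are hypothetical, and the 1-based numbering is an assumption, since the text only says the numbers must be sequential.

```python
def matrix_to_cells(matrix):
    """Flatten a dense matrix into (col_num, row_num, value) tuples.

    Numbering starts at 1 (an assumption of this sketch).
    """
    return [(c, r, float(v))
            for r, row in enumerate(matrix, start=1)
            for c, v in enumerate(row, start=1)]

def find_missing_cells(cells):
    """Return sorted (col_num, row_num) pairs that have no value.

    A pair counts as missing if it is absent from the grid
    1..max_col by 1..max_row, or if its value is None.
    """
    present = {(c, r) for c, r, v in cells if v is not None}
    n_cols = max(c for c, _, _ in cells)
    n_rows = max(r for _, r, _ in cells)
    return sorted((c, r)
                  for c in range(1, n_cols + 1)
                  for r in range(1, n_rows + 1)
                  if (c, r) not in present)

# Illustrative 2 x 3 matrix (two rows, three columns).
cells = matrix_to_cells([[1, 2, 3], [4, 5, 6]])
missing = find_missing_cells(cells)
```

An empty result from `find_missing_cells` suggests the tuples form a complete grid; any pairs it reports would need to be filled in before the data is loaded into the input table.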
SELECT svdmf_run( 'input_table', 'col_name', 'row_name', 'value', num_features);
The function returns two tables, matrix_u and matrix_v, which represent the matrices U and V in table format.
[1] Simon Funk, Netflix Update: Try This at Home, December 11 2006, http://sifter.org/~simon/journal/20061211.html