MADlib
1.1 A newer version is available
User Documentation
|
The linear systems module implements solution methods for systems of consistent linear equations. Systems of linear equations take the form:
\[ Ax = b \]
where \(x \in \mathbb{R}^{n}\), \(A \in \mathbb{R}^{m \times n} \) and \(b \in \mathbb{R}^{m}\). We assume that there are no rows of \(A\) where all elements are zero. The algorithms implemented in this module can handle large dense linear systems. Currently, the algorithms implemented in this module solve the linear system by a direct decomposition. Hence, these methods are known as direct method.
View short help messages using the following statements:
-- Summary of dense linear systems SELECT madlib.linear_solver_dense(); -- Function syntax and output table format SELECT madlib.linear_solver_dense('usage'); -- Syntax for direct methods SELECT madlib.linear_solver_dense('direct');
SELECT linear_solver_dense(tbl_source, tbl_result, row_id, LHS, RHS, grouping_col := NULL, optimizer := 'direct', optimizer_params := 'algorithm = householderqr');
Text value. The name of the table containing the training data. The input data is expected to be of the following form:
{TABLE|VIEW} sourceName ( ... row_id FLOAT8, left_hand_side FLOAT8[], right_hand_side FLOAT8, ... )
Here, each row represents a single equation using. The right_hand_side refers to the right hand side of the equations while the left_hand_side refers to the multipliers on the variables on the left hand side of the same equations.
Text value. The name of the table where the output is saved.
Text value. The name of the column storing the 'row id' of the equations.
Text value. The name of the column storing the 'left hand side' of the equations stored as an array.
Text value. The name of the column storing the 'right hand side' of the equations.
Text value. Type of optimizer. Default: 'direct'.
For each optimizer, there are specific parameters that can be tuned for better performance.
There are several algorithms that can be classified as 'direct' methods of solving linear systems. MADlib dense linear system solvers provide various algorithmic options for users.
The following table provides a guideline on the choice of algorithm based on conditions on the A matrix, speed of the algorithms and numerical stability.
Algorithm | Contitions on A | Speed | Accuracy ---------------------------------------------------------- householderqr | None | ++ | + partialpivlu | Invertable | ++ | + fullpivlu | None | - | +++ colpivhouseholderqr | None | + | ++ fullpivhouseholderqr | None | - | +++ llt | Pos. Definite | +++ | + ldlt | Pos. or Neg Def | +++ | ++
For speed '++' is faster than '+', which is faster than '-'. For accuracy '+++' is better than '++'.
More details about the individual algorithms can be found on the Eigen documentation. Eigen is an open source library for linear algebra.
The solution is an array (of double precision) with the variables in the same order as that provided as input in the 'left_hand_side' column name of the 'source_table'
Computes the scaled residual norm, defined as \( \frac{|Ax - b|}{|b|} \). This value is an indication of the accuracy of the solution.
The number of iterations required by the algorithm (only applicable for iterative algorithms). The output is NULL for direct' methods.
sql> CREATE TABLE linear_systems_test_data (id INTEGER NOT NULL, lhs DOUBLE PRECISION[], rhs DOUBLE PRECISION); sql> INSERT INTO linear_systems_test_data(id, lhs, rhs) VaLUES (0, ARRAY[1,0,0], 20), (1, ARRAY[0,1,0], 15), (2, ARRAY[0,0,1], 20);
sql> SELECT madlib.linear_solver_dense('linear_systems_test_data', 'output_table', 'id', 'lhs', 'rhs');
sql> SELECT * FROM output_table; --------------------+------------------------------------- solution | {20,15,20} residual_norm | 0 iters | NULL
drop table if exists result_table; select madlib.linear_solver_dense( 'linear_systems_test_data', 'result_table', 'id', 'lhs', 'rhs', NULL, 'direct', 'algorithm=llt' );