New features include:
Utilities - Columns to vector, vector to columns, drop columns.
Multilayer perceptron - Added momentum and Nesterov's accelerated gradient methods to gradient updates.
Statistics - Added grouping support to correlation and covariance.
Decision tree/random forest - Added impurity variable importance.
Decision tree/random forest - Added new helper function to report variable importance values in a more readable way.
Install - Refactored and updated the madpack installation and upgrade tool.
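To illustrate the two gradient-update variants named for the multilayer perceptron above, here is a minimal sketch of classical momentum and Nesterov's accelerated gradient on a toy one-dimensional objective. It is a generic textbook formulation, not MADlib's implementation or API; the function names, the objective, and the default learning rate and momentum values are all assumptions made for the example.

```python
def grad(w):
    # Gradient of the toy objective f(w) = (w - 3)^2 (hypothetical example).
    return 2.0 * (w - 3.0)


def momentum_step(w, v, lr=0.1, mu=0.9):
    # Classical momentum: evaluate the gradient at the current point.
    v_new = mu * v - lr * grad(w)
    return w + v_new, v_new


def nesterov_step(w, v, lr=0.1, mu=0.9):
    # Nesterov: evaluate the gradient at the look-ahead point w + mu * v.
    v_new = mu * v - lr * grad(w + mu * v)
    return w + v_new, v_new


def minimize(step, w0=0.0, iters=500):
    # Run a fixed number of updates starting from w0 with zero velocity.
    w, v = w0, 0.0
    for _ in range(iters):
        w, v = step(w, v)
    return w


if __name__ == "__main__":
    print("momentum:", minimize(momentum_step))
    print("nesterov:", minimize(nesterov_step))
```

The only difference between the two steps is where the gradient is evaluated; both converge to the minimizer w = 3 on this objective.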
New features include: Balanced datasets, personalized PageRank, mini-batch optimizer for multilayer perceptron neural networks (and associated pre-processor function), PostgreSQL 10.2 support.
K-nearest neighbors - Added weighted averaging/voting by distance.
Summary - Added more statistics including number of positive, negative, zero values and 95% confidence intervals.
Multilayer perceptron - Added support for one-hot encoded categorical dependent variable for classification.
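As an illustration of the distance-weighted voting mentioned for k-nearest neighbors above, the sketch below weights each neighbor's vote by the inverse of its Euclidean distance to the query point. The inverse-distance weighting, the function name, and the sample data are assumptions for the example; this is not MADlib's SQL interface or its exact weighting scheme.

```python
import math
from collections import defaultdict


def weighted_knn_predict(train, query, k=3):
    """Classify `query` from `train`, a list of (point, label) pairs.

    Each of the k nearest neighbors votes with weight 1 / distance
    (an assumed weighting chosen for illustration).
    """
    nearest = sorted(train, key=lambda pl: math.dist(pl[0], query))[:k]
    votes = defaultdict(float)
    for point, label in nearest:
        d = math.dist(point, query)
        if d == 0.0:
            # An exact match decides the label outright.
            return label
        votes[label] += 1.0 / d
    return max(votes, key=votes.get)


if __name__ == "__main__":
    train = [((0.1, 0.0), "a"), ((2.0, 0.0), "b"), ((0.0, 2.0), "b")]
    # A plain majority vote over k=3 would pick "b" (two votes to one);
    # with inverse-distance weights the single close neighbor wins.
    print(weighted_knn_predict(train, (0.0, 0.0), k=3))  # prints "a"
```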
New feature: Hyperlink-Induced Topic Search (HITS) link analysis algorithm.
k-nearest neighbors (kNN) - Added more distance metrics and a list of neighbors in the output table.
Multilayer perceptron (MLP) - Now supports grouping.
Cross validation - Improved the stats reporting in the output table.
Correlation - Improved quality of results by ignoring only the NULL value rather than the whole row containing it.
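For readers unfamiliar with the HITS link analysis algorithm named in this release, the following is a minimal sketch of its iteration: a node's authority score is the sum of the hub scores of the nodes linking to it, and its hub score is the sum of the authority scores of the nodes it links to. The function name, the fixed iteration count, and the L2 normalization are assumptions for the example; it does not reflect MADlib's function signature or output tables.

```python
import math


def hits(edges, iters=50):
    """Return (authority, hub) score dicts for a list of (src, dst) edges."""
    nodes = sorted({n for edge in edges for n in edge})
    auth = {n: 1.0 for n in nodes}
    hub = {n: 1.0 for n in nodes}

    def normalize(scores):
        # Scale scores to unit L2 norm (assumed normalization).
        norm = math.sqrt(sum(v * v for v in scores.values())) or 1.0
        return {n: v / norm for n, v in scores.items()}

    for _ in range(iters):
        # Authority update: sum hub scores over incoming links.
        new_auth = {n: 0.0 for n in nodes}
        for src, dst in edges:
            new_auth[dst] += hub[src]
        auth = normalize(new_auth)

        # Hub update: sum the new authority scores over outgoing links.
        new_hub = {n: 0.0 for n in nodes}
        for src, dst in edges:
            new_hub[src] += auth[dst]
        hub = normalize(new_hub)

    return auth, hub


if __name__ == "__main__":
    authority, hub_scores = hits([("a", "c"), ("b", "c"), ("a", "d")])
    print("top authority:", max(authority, key=authority.get))
    print("top hub:", max(hub_scores, key=hub_scores.get))
```

In the sample graph, node c is linked to by both a and b, so it ranks as the strongest authority, while a links to the most authoritative nodes and ranks as the strongest hub.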
MADlib entered incubation in the fall of 2015 and made five releases as an incubating project. Along the way, the MADlib community has worked hard to ensure that the project is being developed according to the principles of the Apache Way. We will continue to do so in the future as a TLP, to the best of our ability.
Thank you to all who have contributed to the project so far, and we look forward to more innovation in machine learning in the future as a TLP!
Downloads for Apache MADlib releases. This also includes links to pre-Apache MADlib releases.