MADlib 1.12 Release (GA)
On Aug 29, 2017, MADlib completed its first release as an Apache Software Foundation Top Level Project.
New features include: All Pairs Shortest Path, Weakly Connected Components, Breadth First Search, Multiple Graph Measures, Stratified Sampling, Train-Test Split, Multilayer Perceptron, and various updates related to becoming an Apache Top Level Project.
Decision tree and random forest - Allow expressions in the feature list, allow array input for features, filter NULL dependent values in out-of-bag (OOB) samples, add an option to treat NULL as a category.
Summary - Allow the user to set the number of columns processed per run, improve computational efficiency by ~35%.
Sketch - Promote cardinality estimators from early stage to a top-level module.