Apache MADlib is an open source project that endeavors to adhere in all respects to the principles of The Apache Way.
MADlib grew out of discussions between database engine developers, data scientists, IT architects and academics interested in new approaches to scalable, sophisticated in-database analytics. These discussions were written up in a paper in VLDB 2009 that coined the term “MAD Skills” for data analysis. The MADlib software project began the following year as a collaboration between researchers at UC Berkeley and engineers and data scientists at EMC/Greenplum (later Pivotal).
In September 2015, MADlib was accepted into the Apache Software Foundation Incubator, and it graduated to a Top Level Project in July 2017.
Some of the past and present participants in this project are:
We need your feedback. If you find a bug, would like to suggest an improvement, or want to make a request, please follow the steps below to let us know.
Our online developer forum is open for discussion of any topic of interest to open source contributors.
Step-by-step instructions guiding you through the MADlib contribution model.
There are several resources aimed at helping new developers understand how MADlib is designed.
Interaction and collaboration with the open source and academic communities continue to be a core foundation of our project. We have published papers at major conferences, including:
There is a growing set of publicly available datasets. Here are some examples: