2.1.0
User Documentation for Apache MADlib
Sampling

Detailed Description

A collection of methods for sampling from a population.

Modules

 Balanced Sampling
 A method for sampling each class independently to produce a balanced data set. It is commonly used when classes are imbalanced, to ensure that every class is adequately represented in the sample.
 
 Stratified Sampling
 A method for independently sampling subpopulations (strata).
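
The difference between the two modules can be illustrated with a minimal in-memory sketch. This is not MADlib's implementation or API: the function names `balanced_sample` and `stratified_sample`, their parameters, the rounding rule, and the example data are all assumptions made for illustration. Balanced sampling draws the same number of rows from every class, whereas stratified sampling draws the same fraction from every stratum.

```python
import random
from collections import defaultdict


def group_by(rows, key):
    """Group rows (dicts) into lists keyed by the value of `key`."""
    groups = defaultdict(list)
    for row in rows:
        groups[row[key]].append(row)
    return groups


def balanced_sample(rows, class_col, per_class, seed=None):
    """Draw the same number of rows from every class (illustrative only).

    Classes with fewer than `per_class` rows are sampled with replacement
    (oversampled); larger classes are sampled without replacement.
    """
    rng = random.Random(seed)
    sample = []
    for members in group_by(rows, class_col).values():
        if len(members) >= per_class:
            sample.extend(rng.sample(members, per_class))
        else:
            sample.extend(rng.choices(members, k=per_class))
    return sample


def stratified_sample(rows, strata_col, proportion, seed=None):
    """Draw the same fraction of rows from every stratum, without replacement.

    The per-stratum count is rounded to the nearest integer; this rounding
    rule is an assumption of the sketch.
    """
    rng = random.Random(seed)
    sample = []
    for members in group_by(rows, strata_col).values():
        k = round(len(members) * proportion)
        sample.extend(rng.sample(members, k))
    return sample


# Hypothetical imbalanced data: 90 rows of class "a", 10 rows of class "b".
rows = [{"id": i, "label": "a" if i < 90 else "b"} for i in range(100)]

balanced = balanced_sample(rows, "label", per_class=20, seed=0)
stratified = stratified_sample(rows, "label", proportion=0.2, seed=0)
```

With this data, the balanced sample holds 20 rows of each class (the minority class is oversampled), while the stratified sample keeps the original 9:1 ratio with 18 and 2 rows.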