MADlib 0.7 User Documentation
The MFV frequent-value UDA comes in two versions, a standard version:
  SELECT mfvsketch_top_histogram(col_name,n) FROM table_name;
and a quick version:
  SELECT mfvsketch_quick_histogram(col_name,n) FROM table_name;
In PostgreSQL the two UDAs are identical. In Greenplum, the quick version should produce good results unless the number of values requested is very small or the distribution of values is very flat.
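One way to judge whether the quick version is adequate for a given column is to run both UDAs over the same data and compare the results. The following is a minimal sketch, not a prescribed procedure. It assumes the quick variant is installed as mfvsketch_quick_histogram and that both functions are reachable through the current search_path; table_name and col_name are the placeholders from the syntax above, and n = 10 is an arbitrary choice.

```sql
-- Standard version (placeholders: table_name, col_name; n = 10 is arbitrary).
SELECT mfvsketch_top_histogram(col_name, 10) FROM table_name;

-- Quick version over the same column, assuming it is installed under this name.
-- Compare its (value, count) pairs with the result of the previous query.
SELECT mfvsketch_quick_histogram(col_name, 10) FROM table_name;
```

If the two results differ noticeably on Greenplum, that suggests the column falls into one of the cases described above (a very small n or a very flat distribution), and the standard version may be the safer choice.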
sql> CREATE TABLE data(class INT, a1 INT);
sql> INSERT INTO data SELECT 1,1 FROM generate_series(1,10000);
sql> INSERT INTO data SELECT 1,2 FROM generate_series(1,15000);
sql> INSERT INTO data SELECT 1,3 FROM generate_series(1,10000);
sql> INSERT INTO data SELECT 2,5 FROM generate_series(1,1000);
sql> INSERT INTO data SELECT 2,6 FROM generate_series(1,1000);
sql> SELECT mfvsketch_top_histogram(a1,5) FROM data;
                   mfvsketch_top_histogram
--------------------------------------------------------------
 [0:4][0:1]={{2,15000},{1,10000},{3,10000},{5,1000},{6,1000}}
(1 row)
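For a table as small as this one, the sketch result can be checked against an exact count. The query below is a plain SQL sketch of that check. It uses only standard GROUP BY aggregation rather than any MADlib function, and the column aliases val and freq are chosen here for illustration.

```sql
-- Exact frequency count over the sample table, for comparison with the
-- histogram above. Ties in frequency are broken by the smaller value.
SELECT a1 AS val, count(*) AS freq
FROM data
GROUP BY a1
ORDER BY freq DESC, val
LIMIT 5;
```

On the sample data this yields the same five (value, count) pairs as the histogram, because the table holds only five distinct values of a1. On large tables the exact query must group over every distinct value, which is the cost the sketch-based UDA is meant to avoid.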