MADlib 1.0 User Documentation
SELECT fmsketch_dcount(col_name) FROM table_name;
This aggregate estimates the number of distinct values in a column. It is comparable to COUNT(DISTINCT x), but faster and approximate. Like any aggregate, it can be combined with a GROUP BY clause to do distinct counts per group.

sql> CREATE TABLE data(class INT, a1 INT);
sql> INSERT INTO data SELECT 1,1 FROM generate_series(1,10000);
sql> INSERT INTO data SELECT 1,2 FROM generate_series(1,15000);
sql> INSERT INTO data SELECT 1,3 FROM generate_series(1,10000);
sql> INSERT INTO data SELECT 2,5 FROM generate_series(1,1000);
sql> INSERT INTO data SELECT 2,6 FROM generate_series(1,1000);
sql> SELECT class,fmsketch_dcount(a1) FROM data GROUP BY data.class;
 class | fmsketch_dcount
-------+-----------------
     2 |               2
     1 |               3
(2 rows)
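
One way to check how close the estimate is on a given data set is to run the exact and approximate aggregates side by side. The query below is a minimal sketch, not part of the original example: it assumes the data table from the statements above and that fmsketch_dcount can be called unqualified, as it is in this page. The column aliases exact_count and approx_count are illustrative names only.

```sql
-- Sketch: compare the exact distinct count with the sketch-based estimate.
-- Assumes the `data` table created above; aliases are hypothetical.
SELECT class,
       COUNT(DISTINCT a1)  AS exact_count,   -- exact, but typically slower on large tables
       fmsketch_dcount(a1) AS approx_count   -- approximate estimate
FROM data
GROUP BY class
ORDER BY class;
```

For the rows inserted above, the exact counts are 3 for class 1 and 2 for class 2, so any difference in approx_count shows the estimation error directly.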