MADlib
1.0 A newer version is available
User Documentation
|
The following aggregates will be called on every integer column:
And these on non-integer columns:
Because the input schema of the table or view is unknown, we need to synthesize SQL to suit. This is done either via the profile
or profile_full
user defined function.
SELECT * FROM profile( 'input_table');
SELECT * FROM profile_full( 'input_table', buckets);
sql> SELECT * FROM profile( 'pg_catalog.pg_tables'); schema_name | table_name | column_name | function | value -------------+------------+-------------+-------------------+------- pg_catalog | pg_tables | * | COUNT() | 105 pg_catalog | pg_tables | schemaname | fmsketch_dcount() | 6 pg_catalog | pg_tables | tablename | fmsketch_dcount() | 104 pg_catalog | pg_tables | tableowner | fmsketch_dcount() | 2 pg_catalog | pg_tables | tablespace | fmsketch_dcount() | 1 pg_catalog | pg_tables | hasindexes | fmsketch_dcount() | 2 pg_catalog | pg_tables | hasrules | fmsketch_dcount() | 1 pg_catalog | pg_tables | hastriggers | fmsketch_dcount() | 2 (8 rows)
sql> SELECT * FROM profile_full( 'pg_catalog.pg_tables', 5); schema_name | table_name | column_name | function | value -------------+------------+-------------+-------------------------------------------------+---------------------------------------------------------------------------------------------------- pg_catalog | pg_tables | * | COUNT() | 105 pg_catalog | pg_tables | schemaname | fmsketch_dcount() | 6 pg_catalog | pg_tables | schemaname | array_collapse(mfvsketch_quick_histogram((),5)) | [0:4]={pg_catalog:68,public:19,information_schema:7,gp_toolkit:5,maddy:5} pg_catalog | pg_tables | schemaname | array_collapse(mfvsketch_top_histogram((),5)) | [0:4]={pg_catalog:68,public:19,information_schema:7,gp_toolkit:5,maddy:5} pg_catalog | pg_tables | tablename | fmsketch_dcount() | 104 pg_catalog | pg_tables | tablename | array_collapse(mfvsketch_quick_histogram((),5)) | [0:4]={migrationhistory:2,pg_statistic:1,sql_features:1,sql_implementation_info:1,sql_languages:1} pg_catalog | pg_tables | tablename | array_collapse(mfvsketch_top_histogram((),5)) | [0:4]={migrationhistory:2,pg_statistic:1,sql_features:1,sql_implementation_info:1,sql_languages:1} pg_catalog | pg_tables | tableowner | fmsketch_dcount() | 2 pg_catalog | pg_tables | tableowner | array_collapse(mfvsketch_quick_histogram((),5)) | [0:1]={agorajek:104,alex:1} pg_catalog | pg_tables | tableowner | array_collapse(mfvsketch_top_histogram((),5)) | [0:1]={agorajek:104,alex:1} pg_catalog | pg_tables | tablespace | fmsketch_dcount() | 1 pg_catalog | pg_tables | tablespace | array_collapse(mfvsketch_quick_histogram((),5)) | [0:0]={pg_global:28} pg_catalog | pg_tables | tablespace | array_collapse(mfvsketch_top_histogram((),5)) | [0:0]={pg_global:28} pg_catalog | pg_tables | hasindexes | fmsketch_dcount() | 2 pg_catalog | pg_tables | hasindexes | array_collapse(mfvsketch_quick_histogram((),5)) | [0:1]={t:59,f:46} pg_catalog | pg_tables | hasindexes | array_collapse(mfvsketch_top_histogram((),5)) | [0:1]={t:59,f:46} pg_catalog | pg_tables | hasrules | fmsketch_dcount() | 1 pg_catalog | pg_tables | hasrules | array_collapse(mfvsketch_quick_histogram((),5)) | [0:0]={f:105} pg_catalog | pg_tables | hasrules | array_collapse(mfvsketch_top_histogram((),5)) | [0:0]={f:105} pg_catalog | pg_tables | hastriggers | fmsketch_dcount() | 2 pg_catalog | pg_tables | hastriggers | array_collapse(mfvsketch_quick_histogram((),5)) | [0:1]={f:102,t:3} pg_catalog | pg_tables | hastriggers | array_collapse(mfvsketch_top_histogram((),5)) | [0:1]={f:102,t:3} (22 rows)
Because some of the aggregate functions used in profile return multi-dimensional arrays, which are not easily handled in pl/python, we are using array_collapse
function to collaps the n-dim arrays to 1-dim arrays. All values of 2 and upper dimensions are separated with ':' character.