MADlib
1.1
User Documentation
Create database tables and import POS/NER training/testing data into the database.
Functions
void crf_train_data (text datapath)
void crf_test_data (text datapath)
Definition in file crf_data_loader.sql_in.
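As a usage sketch, both loaders can be called like any other SQL function with a single text argument. The schema name madlib and the two paths below are assumptions made for illustration only; substitute the schema MADlib is installed in and paths that are valid in your environment.

```sql
-- Hypothetical paths; "madlib" is an assumed installation schema.
-- Each function takes one text argument (datapath) and returns void.
SELECT madlib.crf_train_data('/path/to/crf/traindata');
SELECT madlib.crf_test_data('/path/to/crf/testdata');
```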
void crf_test_data (text datapath)
sql> select * from test_segmenttbl order by doc_id, start_pos;
 start_pos | doc_id |  seg_text   | max_pos
-----------+--------+-------------+---------
         0 |      1 | the         |      26
         1 |      1 | madlib      |      26
         2 |      1 | mission     |      26
         3 |      1 | :           |      26
         4 |      1 | to          |      26
         5 |      1 | foster      |      26
         6 |      1 | widespread  |      26
         7 |      1 | development |      26
         8 |      1 | of          |      26
         9 |      1 | scalable    |      26
        10 |      1 | analytic    |      26
        11 |      1 | skills      |      26
        12 |      1 | ,           |      26
        13 |      1 | by          |      26
 ...
        24 |      1 | open-source |      26
        25 |      1 | development |      26
        26 |      1 | .           |      26
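The segment table holds one row per token, so it can be checked after loading with ordinary aggregates. The queries below are a minimal sketch, assuming the columns shown in the test_segmenttbl sample (doc_id, start_pos, seg_text, max_pos) and a PostgreSQL version that supports string_agg with ORDER BY (9.0 or later). In the sample, positions run from 0 to 26 and max_pos is 26, so the first query would be expected to report matching last_pos and max_pos values for each document.

```sql
-- Per-document summary: token count, last token position, stored max_pos.
SELECT doc_id,
       count(*)       AS n_tokens,
       max(start_pos) AS last_pos,
       max(max_pos)   AS max_pos
FROM test_segmenttbl
GROUP BY doc_id
ORDER BY doc_id;

-- Rebuild each document as a space-separated string, in token order.
SELECT doc_id,
       string_agg(seg_text, ' ' ORDER BY start_pos) AS doc_text
FROM test_segmenttbl
GROUP BY doc_id
ORDER BY doc_id;
```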
Definition at line 231 of file crf_data_loader.sql_in.
void crf_train_data (text datapath)
sql> select * from train_segmenttbl order by doc_id, start_pos;
 start_pos | doc_id |    seg_text     | max_pos
-----------+--------+-----------------+---------
         0 |      1 | madlib          |       9
         1 |      1 | is              |       9
         2 |      1 | an              |       9
         3 |      1 | open-source     |       9
         4 |      1 | library         |       9
         5 |      1 | for             |       9
         6 |      1 | scalable        |       9
         7 |      1 | in-database     |       9
         8 |      1 | analytics       |       9
         9 |      1 | .               |       9
         0 |      2 | it              |      16
         1 |      2 | provides        |      16
         2 |      2 | data-parallel   |      16
         3 |      2 | implementations |      16
 ...
        14 |      2 | unstructured    |      16
        15 |      2 | data            |      16
        16 |      2 | .               |      16
sql> select * from crf_dictionary;
   token    | label | count | total
------------+-------+-------+-------
 freefall   |    11 |     1 |     1
 policy     |    11 |     2 |     2
 measures   |    12 |     1 |     1
 commitment |    11 |     1 |     1
 new        |     6 |     1 |     1
 speech     |    11 |     1 |     1
 's         |    16 |     2 |     2
 reckon     |    30 |     1 |     1
 underlying |    28 |     1 |     1
 ...
sql> select * from labeltbl order by id;
 id | label
----+-------
  0 | CC
  1 | CD
  2 | DT
  3 | EX
  4 | FW
  5 | IN
  6 | JJ
 ...
 42 | ,
 43 | .
 44 | :
sql> select * from crf_regex;
    pattern    |      name
---------------+----------------
 ^[A-Z][a-z]+$ | InitCapital%
 ^[A-Z]+$      | isAllCapital%
 ^.*[0-9]+.*$  | containsDigit%
 ^.+[.]$       | endsWithDot%
 ^.+[,]$       | endsWithComma%
 ^.+er$        | endsWithER%
 ^.+est$       | endsWithEst%
 ^.+ed$        | endsWithED%
 ...
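To see what the patterns in crf_regex capture, they can be tried against a few tokens with PostgreSQL's regular-expression match operator (~). The query below is a self-contained sketch: the four patterns are copied from the sample output, while the token list is made up for illustration. It is not a statement of how MADlib applies these patterns internally.

```sql
-- Match illustrative tokens against four patterns from the crf_regex sample.
-- The tokens are hypothetical; the patterns and names come from the table above.
SELECT t.token, r.name
FROM (VALUES ('Madlib'), ('SQL'), ('v2'), ('faster')) AS t(token)
JOIN (VALUES ('^[A-Z][a-z]+$', 'InitCapital%'),
             ('^[A-Z]+$',      'isAllCapital%'),
             ('^.*[0-9]+.*$',  'containsDigit%'),
             ('^.+er$',        'endsWithER%')) AS r(pattern, name)
  ON t.token ~ r.pattern          -- ~ is the case-sensitive regex match operator
ORDER BY t.token, r.name;
```

Against the real tables, the same join condition could be written with train_segmenttbl.seg_text and crf_regex.pattern in place of the VALUES lists.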
sql> select * from featuretbl order by id;
 id |     name     | prev_label_id | label_id | weight
----+--------------+---------------+----------+--------
  1 | W_chancellor |            -1 |       13 | 2.2322
  2 | E.13         |            13 |        5 | 2.3995
  3 | U            |            -1 |        5 | 1.2164
  4 | W_of         |            -1 |        5 | 2.8744
  5 | E.5          |             5 |        2 | 3.7716
  6 | W_the        |            -1 |        2 | 4.1790
  7 | E.2          |             2 |       13 | 0.8957
 ...
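The numeric label columns in featuretbl are easier to read next to the label names. The query below is a sketch under two assumptions that the sample output suggests but does not state: that label_id and prev_label_id refer to labeltbl.id, and that a prev_label_id of -1 means the feature has no previous label. The LEFT JOIN keeps those rows and leaves prev_label as NULL.

```sql
-- Show each feature with label names instead of numeric ids.
-- Assumes featuretbl.label_id and featuretbl.prev_label_id reference labeltbl.id.
SELECT f.id,
       f.name,
       p.label AS prev_label,   -- NULL when prev_label_id has no match (e.g. -1)
       l.label AS label,
       f.weight
FROM featuretbl AS f
JOIN labeltbl AS l
  ON l.id = f.label_id
LEFT JOIN labeltbl AS p
  ON p.id = f.prev_label_id
ORDER BY f.id;
```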
sql> select * from crf_feature_dic order by id;
 f_index |    f_name    | feature
---------+--------------+---------
       0 | W_chancellor |      -1
       1 | E.13         |      13
       2 | U            |      -1
       3 | W_of         |      -1
       4 | E.5          |       5
       5 | W_the        |      -1
 ...
Definition at line 155 of file crf_data_loader.sql_in.