This function loads custom Python functions into a table for use by deep learning algorithms.
Custom functions can be useful if, for example, you need loss functions or metrics that are not built into the standard libraries. The functions to be loaded must be in the form of serialized Python objects created using Dill, which extends Python's pickle module to the majority of the built-in Python types [3].
Custom functions can also be used to return top k categorical accuracy in the case that you want a different k value than the default from tensorflow.keras. This module includes a helper function to create the custom function automatically for a specified k.
There is also a utility function to delete a function from the table.
load_custom_function( object_table, object, name, description )
Arguments
VARCHAR. Table to load serialized Python objects. If this table does not exist, it will be created. If this table already exists, a new row is inserted into the existing table. Do not specify schema as part of the object table name, since it will be put in the MADlib schema automatically.
BYTEA. PostgreSQL binary data type of the Python object. Object must be created with the Dill package for serializing Python objects.
TEXT. Name of the object. Must be unique identifier in the table, since this name is used when passing the object to Keras.
TEXT, default: NULL. Free text string to provide a description, if desired.
Output table
The output table contains the following columns:
id | SERIAL. Object ID. |
---|---|
name | TEXT PRIMARY KEY. Name of the object. |
description | TEXT. Description of the object (free text). |
object | BYTEA. Serialized Python object stored as a PostgreSQL binary data type. |
Delete by id:
delete_custom_function( object_table, id )
Or alternatively by name:
delete_custom_function( object_table, name )
Arguments
Create and load a custom function for a specific k into the custom functions table. The Keras accuracy parameter 'top_k_categorical_accuracy' returns top 5 accuracy by default [4]. If you want a different top k value, use this helper function to create a custom Python function to compute the top k accuracy that you specify.
load_top_k_accuracy_function( object_table, k )
Arguments
VARCHAR. Table to load serialized Python objects. If this table does not exist, it will be created. If this table already exists, a new row is inserted into the existing table. Do not specify schema as part of the object table name, since it will be put in the MADlib schema automatically.
INTEGER. k value for the top k accuracy that you want.
Output table
The output table contains the following columns:
id | SERIAL. Object ID. |
---|---|
name | TEXT PRIMARY KEY. Name of the object. Generated with the following pattern: top_(k)_accuracy. |
description | TEXT. Description of the object. |
object | BYTEA. Serialized Python object stored as a PostgreSQL binary data type. |
# import database connector psycopg2 and create connection cursor import psycopg2 as p2 conn = p2.connect('postgresql://gpadmin:8000/madlib') cur = conn.cursor() # import Dill and define functions import dill # custom loss def squared_error(y_true, y_pred): import keras.backend as K return K.square(y_pred - y_true) pb_squared_error=dill.dumps(squared_error) # custom metric def rmse(y_true, y_pred): import keras.backend as K return K.sqrt(K.mean(K.square(y_pred - y_true), axis=-1)) pb_rmse=dill.dumps(rmse) # call load function cur.execute("DROP TABLE IF EXISTS madlib.custom_function_table") cur.execute("SELECT madlib.load_custom_function('custom_function_table', %s,'squared_error', 'squared error')", [p2.Binary(pb_squared_error)]) cur.execute("SELECT madlib.load_custom_function('custom_function_table', %s,'rmse', 'root mean square error')", [p2.Binary(pb_rmse)]) conn.commit()List table to see objects:
SELECT id, name, description FROM madlib.custom_function_table ORDER BY id;
id | name | description ----+---------------+------------------------ 1 | squared_error | squared error 2 | rmse | root mean square error
CREATE OR REPLACE FUNCTION custom_function_squared_error() RETURNS BYTEA AS $$ import dill def squared_error(y_true, y_pred): import keras.backend as K return K.square(y_pred - y_true) pb_squared_error=dill.dumps(squared_error) return pb_squared_error $$ language plpython3u; CREATE OR REPLACE FUNCTION custom_function_rmse() RETURNS BYTEA AS $$ import dill def rmse(y_true, y_pred): import keras.backend as K return K.sqrt(K.mean(K.square(y_pred - y_true), axis=-1)) pb_rmse=dill.dumps(rmse) return pb_rmse $$ language plpython3u;Now call loader:
DROP TABLE IF EXISTS madlib.custom_function_table; SELECT madlib.load_custom_function('custom_function_table', custom_function_squared_error(), 'squared_error', 'squared error'); SELECT madlib.load_custom_function('custom_function_table', custom_function_rmse(), 'rmse', 'root mean square error');
SELECT madlib.delete_custom_function( 'custom_function_table', 1); SELECT id, name, description FROM madlib.custom_function_table ORDER BY id;
id | name | description ----+------+------------------------ 2 | rmse | root mean square errorDelete an object by name:
SELECT madlib.delete_custom_function( 'custom_function_table', 'rmse');If all objects are deleted from the table using this function, the table itself will be dropped.
DROP TABLE IF EXISTS custom_function_table; SELECT madlib.load_top_k_accuracy_function('custom_function_table', 3); SELECT madlib.load_top_k_accuracy_function('custom_function_table', 10); SELECT id, name, description FROM madlib.custom_function_table ORDER BY id;
id | name | description ----+-----------------+------------------------- 1 | top_3_accuracy | returns top_3_accuracy 2 | top_10_accuracy | returns top_10_accuracy
[1] https://gpdb.docs.pivotal.io/latest/admin_guide/roles_privs.html
[2] https://www.postgresql.org/docs/current/ddl-priv.html
[3] Python catalog for Dill package https://pypi.org/project/dill/
[4] https://keras.io/api/metrics/accuracy_metrics/#topkcategoricalaccuracy-class