2.1.0
User Documentation for Apache MADlib
Define Model Architectures

This function loads model architectures and weights into a table for use by deep learning algorithms.

Model architecture is in JSON form and model weights are in the form of PostgreSQL binary data types (bytea). If the output table already exists, a new row is inserted into the table so it can act as a repository for multiple model architectures and weights.

There is also a function to delete a model from the table.

MADlib's deep learning methods are designed to use the TensorFlow package and its built-in Keras functions. To ensure consistency, please use tensorflow.keras objects (models, layers, etc.) instead of importing Keras and using its objects.

Load Model
load_keras_model(
    keras_model_arch_table,
    model_arch,
    model_weights,
    name,
    description
    )

Arguments

keras_model_arch_table

VARCHAR. Output table into which the Keras model architecture and weights are loaded.

model_arch

JSON. JSON of the model architecture to load.

Note
Every input layer must state 'input_shape' explicitly in the model architecture. MADlib requires this because the JSON representation does not always include the input shape by default, and fit()-type functions read it from the JSON.

model_weights (optional)

bytea. Model weights to load as a PostgreSQL binary data type.

name (optional)

TEXT, default: NULL. Free text string to provide a name, if desired.

description (optional)

TEXT, default: NULL. Free text string to provide a description, if desired.
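
The note on model_arch can be checked before loading. The following is a minimal sketch, not MADlib's own validation: it assumes a Sequential model and assumes the input shape appears in the JSON as 'batch_input_shape' on the first layer, as it does in the exported JSON in the Examples section. The helper name first_layer_has_input_shape is hypothetical.

```python
import json

def first_layer_has_input_shape(model_arch_json):
    """Return True if the first layer of a Sequential model declares its input shape.

    Hypothetical helper for illustration; MADlib's internal checks may differ.
    """
    arch = json.loads(model_arch_json)
    config = arch["config"]
    # Older Keras versions store the layers as a list directly under "config";
    # newer ones nest them under config["layers"]. Handle both layouts.
    layers = config if isinstance(config, list) else config.get("layers", [])
    if not layers:
        return False
    return layers[0].get("config", {}).get("batch_input_shape") is not None

# Two trimmed-down architectures, written by hand for this sketch.
with_shape = ('{"class_name": "Sequential", "config": [{"class_name": "Dense", '
              '"config": {"units": 10, "batch_input_shape": [null, 4]}}]}')
without_shape = ('{"class_name": "Sequential", "config": [{"class_name": "Dense", '
                 '"config": {"units": 10}}]}')

print(first_layer_has_input_shape(with_shape))     # True
print(first_layer_has_input_shape(without_shape))  # False
```

Running a check like this on the output of to_json() before calling load_keras_model() catches a missing input shape early, on the client side.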

Output table
The output table contains the following columns:

model_id SERIAL PRIMARY KEY. Model ID.
model_arch JSON. JSON blob of the model architecture.
model_weights BYTEA. Weights of the model which may be used for warm start or transfer learning. Weights are stored as a PostgreSQL binary data type.
name TEXT. Name of model (free text).
description TEXT. Description of model (free text).
__internal_madlib_id__ TEXT. Unique id for model arch. This is an id used internally by MADlib.

Delete Model
delete_keras_model(
    keras_model_arch_table,
    model_id
)

Arguments

keras_model_arch_table

VARCHAR. Table containing model architectures and weights.

model_id
INTEGER. The id of the model to be deleted.
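
From a Python client, the delete call can be issued through a database cursor with both arguments bound as parameters. This is a sketch under assumptions: delete_model and RecordingCursor are hypothetical names, and the stand-in cursor only records the call so the example runs without a database. With a real connection, a psycopg2 cursor would be passed instead.

```python
def delete_model(cur, arch_table, model_id):
    """Call madlib.delete_keras_model through a DB-API style cursor (hypothetical helper)."""
    # The table name is an ordinary string argument of the MADlib function,
    # so it can be bound as a parameter like the integer model id.
    cur.execute("SELECT madlib.delete_keras_model(%s, %s)",
                [arch_table, int(model_id)])

class RecordingCursor:
    """Stand-in for a real cursor: records statements instead of running them."""
    def __init__(self):
        self.calls = []

    def execute(self, query, params=None):
        self.calls.append((query, params))

cur = RecordingCursor()
delete_model(cur, "model_arch_library", 1)
print(cur.calls[0])
# ('SELECT madlib.delete_keras_model(%s, %s)', ['model_arch_library', 1])
```

With a real connection, the change takes effect for other sessions only after the transaction is committed.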

Examples
  1. Define model architecture. Use tensorflow.keras to define the model architecture:
    from tensorflow.keras.models import Sequential
    from tensorflow.keras.layers import Dense
    model_simple = Sequential()
    model_simple.add(Dense(10, activation='relu', input_shape=(4,)))
    model_simple.add(Dense(10, activation='relu'))
    model_simple.add(Dense(3, activation='softmax'))
    model_simple.summary()
    
    _________________________________________________________________
    Layer (type)                 Output Shape              Param #
    =================================================================
    dense_1 (Dense)              (None, 10)                50
    _________________________________________________________________
    dense_2 (Dense)              (None, 10)                110
    _________________________________________________________________
    dense_3 (Dense)              (None, 3)                 33
    =================================================================
    Total params: 193
    Trainable params: 193
    Non-trainable params: 0
    
    Export the model to JSON:
    model_simple.to_json()
    
    '{"class_name": "Sequential", "keras_version": "2.1.6", "config": [{"class_name": "Dense", "config": {"kernel_initializer": {"class_name": "VarianceScaling", "config": {"distribution": "uniform", "scale": 1.0, "seed": null, "mode": "fan_avg"}}, "name": "dense_1", "kernel_constraint": null, "bias_regularizer": null, "bias_constraint": null, "dtype": "float32", "activation": "relu", "trainable": true, "kernel_regularizer": null, "bias_initializer": {"class_name": "Zeros", "config": {}}, "units": 10, "batch_input_shape": [null, 4], "use_bias": true, "activity_regularizer": null}}, {"class_name": "Dense", "config": {"kernel_initializer": {"class_name": "VarianceScaling", "config": {"distribution": "uniform", "scale": 1.0, "seed": null, "mode": "fan_avg"}}, "name": "dense_2", "kernel_constraint": null, "bias_regularizer": null, "bias_constraint": null, "activation": "relu", "trainable": true, "kernel_regularizer": null, "bias_initializer": {"class_name": "Zeros", "config": {}}, "units": 10, "use_bias": true, "activity_regularizer": null}}, {"class_name": "Dense", "config": {"kernel_initializer": {"class_name": "VarianceScaling", "config": {"distribution": "uniform", "scale": 1.0, "seed": null, "mode": "fan_avg"}}, "name": "dense_3", "kernel_constraint": null, "bias_regularizer": null, "bias_constraint": null, "activation": "softmax", "trainable": true, "kernel_regularizer": null, "bias_initializer": {"class_name": "Zeros", "config": {}}, "units": 3, "use_bias": true, "activity_regularizer": null}}], "backend": "tensorflow"}'
    
  2. Load into model architecture table:
    DROP TABLE IF EXISTS model_arch_library;
    SELECT madlib.load_keras_model('model_arch_library',  -- Output table,
    $$
    {"class_name": "Sequential", "keras_version": "2.1.6", "config": [{"class_name": "Dense", "config": {"kernel_initializer": {"class_name": "VarianceScaling", "config": {"distribution": "uniform", "scale": 1.0, "seed": null, "mode": "fan_avg"}}, "name": "dense_1", "kernel_constraint": null, "bias_regularizer": null, "bias_constraint": null, "dtype": "float32", "activation": "relu", "trainable": true, "kernel_regularizer": null, "bias_initializer": {"class_name": "Zeros", "config": {}}, "units": 10, "batch_input_shape": [null, 4], "use_bias": true, "activity_regularizer": null}}, {"class_name": "Dense", "config": {"kernel_initializer": {"class_name": "VarianceScaling", "config": {"distribution": "uniform", "scale": 1.0, "seed": null, "mode": "fan_avg"}}, "name": "dense_2", "kernel_constraint": null, "bias_regularizer": null, "bias_constraint": null, "activation": "relu", "trainable": true, "kernel_regularizer": null, "bias_initializer": {"class_name": "Zeros", "config": {}}, "units": 10, "use_bias": true, "activity_regularizer": null}}, {"class_name": "Dense", "config": {"kernel_initializer": {"class_name": "VarianceScaling", "config": {"distribution": "uniform", "scale": 1.0, "seed": null, "mode": "fan_avg"}}, "name": "dense_3", "kernel_constraint": null, "bias_regularizer": null, "bias_constraint": null, "activation": "softmax", "trainable": true, "kernel_regularizer": null, "bias_initializer": {"class_name": "Zeros", "config": {}}, "units": 3, "use_bias": true, "activity_regularizer": null}}], "backend": "tensorflow"}
    $$
    ::json,  -- JSON blob
                                   NULL,                  -- Weights
                                   'Sophie',              -- Name
                                   'A simple model'       -- Descr
    );
    SELECT COUNT(*) FROM model_arch_library;
    
     count
    -------+
         1
    
    Load another model architecture:
    SELECT madlib.load_keras_model('model_arch_library',  -- Output table,
    $$
    {"class_name": "Sequential", "keras_version": "2.1.6", "config": [{"class_name": "Dense", "config": {"kernel_initializer": {"class_name": "VarianceScaling", "config": {"distribution": "uniform", "scale": 1.0, "seed": null, "mode": "fan_avg"}}, "name": "dense_1", "kernel_constraint": null, "bias_regularizer": null, "bias_constraint": null, "dtype": "float32", "activation": "relu", "trainable": true, "kernel_regularizer": null, "bias_initializer": {"class_name": "Zeros", "config": {}}, "units": 10, "batch_input_shape": [null, 4], "use_bias": true, "activity_regularizer": null}}, {"class_name": "Dense", "config": {"kernel_initializer": {"class_name": "VarianceScaling", "config": {"distribution": "uniform", "scale": 1.0, "seed": null, "mode": "fan_avg"}}, "name": "dense_2", "kernel_constraint": null, "bias_regularizer": null, "bias_constraint": null, "activation": "relu", "trainable": true, "kernel_regularizer": null, "bias_initializer": {"class_name": "Zeros", "config": {}}, "units": 10, "use_bias": true, "activity_regularizer": null}}, {"class_name": "Dense", "config": {"kernel_initializer": {"class_name": "VarianceScaling", "config": {"distribution": "uniform", "scale": 1.0, "seed": null, "mode": "fan_avg"}}, "name": "dense_3", "kernel_constraint": null, "bias_regularizer": null, "bias_constraint": null, "activation": "softmax", "trainable": true, "kernel_regularizer": null, "bias_initializer": {"class_name": "Zeros", "config": {}}, "units": 3, "use_bias": true, "activity_regularizer": null}}], "backend": "tensorflow"}
    $$
    ::json,  -- JSON blob
                                   NULL,                  -- Weights
                                   'Maria',               -- Name
                                   'Also a simple model'  -- Descr
    );
    SELECT COUNT(*) FROM model_arch_library;
    
     count
    -------+
         2
    
  3. Load model weights. To load weights from a previous MADlib run, use UPDATE to load them directly into the table. For example, if 'model_weights' are the weights in the output table 'iris_model' from a previous run of 'madlib_keras_fit()':
    UPDATE model_arch_library SET model_weights = iris_model.model_weights FROM iris_model WHERE model_arch_library.model_id = 2;
    SELECT model_id, name, description, (model_weights IS NOT NULL) AS has_model_weights FROM model_arch_library ORDER BY model_id;
    
     model_id |  name  |     description     | has_model_weights 
    ----------+--------+---------------------+-------------------
            1 | Sophie | A simple model      | f
            2 | Maria  | Also a simple model | t
    
  4. To load weights from tensorflow.keras using a PL/Python function, we need to flatten and then serialize the weights so they can be stored as a PostgreSQL binary data type. The byte format uses less space and memory than a numeric array. The model weights are de-serialized when passed to Keras functions.
    CREATE OR REPLACE FUNCTION load_weights() RETURNS VOID AS
    $$
    from tensorflow.keras.layers import Dense
    from tensorflow.keras import Sequential
    import numpy as np
    import plpy
    #
    # create model
    model = Sequential()
    model.add(Dense(10, activation='relu', input_shape=(4,)))
    model.add(Dense(10, activation='relu'))
    model.add(Dense(3, activation='softmax'))
    #
    # get weights, flatten and serialize
    weights = model.get_weights()
    weights_flat = [w.flatten() for w in weights]
    weights1d =  np.concatenate(weights_flat).ravel()
    weights_bytea = weights1d.tobytes()
    #
    # load query
    load_query = plpy.prepare("""SELECT madlib.load_keras_model(
                            'model_arch_library',
                            $1, $2, $3, $4)
                        """, ['json', 'bytea', 'text', 'text'])
    plpy.execute(load_query, [model.to_json(), weights_bytea, 'Ella', 'Model x'])
    $$ language plpython3u;
    -- Call load function
    SELECT load_weights();
    SELECT model_id, name, description, (model_weights IS NOT NULL) AS has_model_weights FROM model_arch_library ORDER BY model_id;
    
     model_id |  name  |     description     | has_model_weights 
    ----------+--------+---------------------+-------------------
            1 | Sophie | A simple model      | f
            2 | Maria  | Also a simple model | t
            3 | Ella   | Model x             | t
    
  5. Load weights from tensorflow.keras using psycopg2. (Psycopg is a PostgreSQL database adapter for the Python programming language.) As above, we need to flatten and then serialize the weights to store them as a PostgreSQL binary data type. Note that the psycopg2.Binary function used below increases the size of the Python object for the weights, so if your model is large it might be better to use a PL/Python function as above.
    import psycopg2
    import numpy as np
    from tensorflow.keras.layers import Dense
    from tensorflow.keras import Sequential
    #
    # connect to the database
    conn = psycopg2.connect('postgresql://gpadmin@35.239.240.26:5432/madlib')
    cur = conn.cursor()
    #
    # create model
    model = Sequential()
    model.add(Dense(10, activation='relu', input_shape=(4,)))
    model.add(Dense(10, activation='relu'))
    model.add(Dense(3, activation='softmax'))
    #
    # get weights, flatten and serialize
    weights = model.get_weights()
    weights_flat = [w.flatten() for w in weights]
    weights1d =  np.concatenate(weights_flat).ravel()
    weights_bytea = psycopg2.Binary(weights1d.tobytes())
    #
    # load query
    query = "SELECT madlib.load_keras_model('model_arch_library', %s, %s, %s, %s)"
    cur.execute(query, [model.to_json(), weights_bytea, 'Grace', 'Model y'])
    conn.commit()
    SELECT model_id, name, description, (model_weights IS NOT NULL) AS has_model_weights FROM model_arch_library ORDER BY model_id;
    
     model_id |  name  |     description     | has_model_weights 
    ----------+--------+---------------------+-------------------
            1 | Sophie | A simple model      | f
            2 | Maria  | Also a simple model | t
            3 | Ella   | Model x             | t
            4 | Grace  | Model y             | t
    
  6. Delete one of the models:
    SELECT madlib.delete_keras_model('model_arch_library',   -- Model architecture table
                                      1                      -- Model id
                                    );
    SELECT model_id, name, description, (model_weights IS NOT NULL) AS has_model_weights FROM model_arch_library ORDER BY model_id;
    
     model_id | name  |     description     | has_model_weights 
    ----------+-------+---------------------+-------------------
            2 | Maria | Also a simple model | t
            3 | Ella  | Model x             | t
            4 | Grace | Model y             | t
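
The flatten-and-serialize steps in examples 4 and 5 can be illustrated without TensorFlow or a database. The sketch below uses only the Python standard library and assumes the weights are 32-bit floats, which is the default dtype of Keras weights; all helper names are hypothetical and the values are placeholders rather than trained weights. It reproduces the 193 parameters reported by model_simple.summary() and shows how a flat byte string can be split back into per-layer chunks.

```python
from array import array

# (inputs, units) for each Dense layer of the three-layer example model.
DENSE_LAYERS = [(4, 10), (10, 10), (10, 3)]

def weight_shapes(dense_layers):
    """Kernel and bias shapes per Dense layer, kernel first, then bias."""
    shapes = []
    for n_inputs, units in dense_layers:
        shapes.append((n_inputs, units))  # kernel
        shapes.append((units,))           # bias
    return shapes

def size(shape):
    """Number of elements in an array of the given shape."""
    n = 1
    for dim in shape:
        n *= dim
    return n

def serialize(flat_values):
    """Pack a flat sequence of numbers as float32 bytes (assumed storage format)."""
    return array("f", flat_values).tobytes()

def deserialize(payload, shapes):
    """Unpack float32 bytes and split them into one flat chunk per shape."""
    flat = array("f")
    flat.frombytes(payload)
    chunks, start = [], 0
    for shape in shapes:
        n = size(shape)
        chunks.append(list(flat[start:start + n]))
        start += n
    return chunks

shapes = weight_shapes(DENSE_LAYERS)
n_params = sum(size(s) for s in shapes)
payload = serialize([0.5] * n_params)  # placeholder values, not trained weights
chunks = deserialize(payload, shapes)

print(n_params)                  # 193
print(len(payload))              # 772 (4 bytes per float32 value)
print([len(c) for c in chunks])  # [40, 10, 100, 10, 30, 3]
```

Because the serialized form carries no shape information, the layer shapes have to come from the model architecture when the bytes are turned back into weights.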
    

Related Topics

See keras_model_arch_table.sql_in