2.1.0
User Documentation for Apache MADlib
In-Out Degree

This function computes the in-degree and out-degree of each node in a directed graph. The degree of a node is the number of edges incident to it. The in-degree is the number of edges pointing into the node, and the out-degree is the number of edges pointing out of it.

In-out degrees
graph_vertex_degrees(
    vertex_table,
    vertex_id,
    edge_table,
    edge_args,
    out_table,
    grouping_cols
)

Arguments

vertex_table

TEXT. Name of the table containing the vertex data for the graph. Must contain the column specified in the 'vertex_id' parameter below.

vertex_id

TEXT, default = 'id'. Name of the column in 'vertex_table' containing vertex ids. The vertex ids can be of type INTEGER or BIGINT with no duplicates. They do not need to be contiguous.

edge_table

TEXT. Name of the table containing the edge data. The edge table must contain columns for source vertex, destination vertex and edge weight. Column naming convention is described below in the 'edge_args' parameter.

edge_args

TEXT. A comma-delimited string containing multiple named arguments of the form "name=value". The following parameters are supported for this string argument:

  • src (INTEGER or BIGINT): Name of the column containing the source vertex ids in the edge table. Default column name is 'src'.
  • dest (INTEGER or BIGINT): Name of the column containing the destination vertex ids in the edge table. Default column name is 'dest'.
  • weight (FLOAT8): Name of the column containing the edge weights in the edge table. Default column name is 'weight'.

out_table

TEXT. Name of the table to store the result. It contains a row for every vertex of every group and has the following columns (in addition to the grouping columns):

  • vertex: The id of the vertex. The column is named after the input vertex id column (the 'vertex_id' parameter, 'id' by default).
  • indegree: Number of incoming edges to the vertex.
  • outdegree: Number of outgoing edges from the vertex.

grouping_cols

TEXT, default = NULL. List of columns used to group the input into discrete subgraphs. These columns must exist in the edge table. When this value is NULL, no grouping is used and a single result is generated.

Examples
  1. Create vertex and edge tables to represent the graph:
    DROP TABLE IF EXISTS vertex, edge;
    CREATE TABLE vertex(
            id INTEGER,
            name TEXT
            );
    CREATE TABLE edge(
            src_id INTEGER,
            dest_id INTEGER,
            edge_weight FLOAT8
            );
    INSERT INTO vertex VALUES
    (0, 'A'),
    (1, 'B'),
    (2, 'C'),
    (3, 'D'),
    (4, 'E'),
    (5, 'F'),
    (6, 'G'),
    (7, 'H');
    INSERT INTO edge VALUES
    (0, 1, 1.0),
    (0, 2, 1.0),
    (0, 4, 10.0),
    (1, 2, 2.0),
    (1, 3, 10.0),
    (2, 3, 1.0),
    (2, 5, 1.0),
    (2, 6, 3.0),
    (3, 0, 1.0),
    (4, 0, -2.0),
    (5, 6, 1.0),
    (6, 7, 1.0);
    
  2. Calculate the in-out degrees for each node:
    DROP TABLE IF EXISTS degrees;
    SELECT madlib.graph_vertex_degrees(
        'vertex',      -- Vertex table
        'id',          -- Vertex id column (NULL means use default naming)
        'edge',        -- Edge table
        'src=src_id, dest=dest_id, weight=edge_weight',
        'degrees');    -- Output table of vertex degrees
    SELECT * FROM degrees ORDER BY id;
    
     id | indegree | outdegree
    ----+----------+-----------
      0 |        2 |         3
      1 |        1 |         2
      2 |        2 |         3
      3 |        2 |         1
      4 |        1 |         1
      5 |        1 |         1
      6 |        2 |         1
      7 |        1 |         0
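
    The definitions of in-degree and out-degree can be cross-checked in plain SQL, without calling MADlib. The query below is a minimal sketch that assumes the vertex and edge tables from step 1. The CTE names (in_deg, out_deg) and the extra degree column are illustrative, and this is not necessarily how graph_vertex_degrees is implemented internally.

```sql
-- Count incoming and outgoing edges per vertex directly from the edge table.
WITH in_deg AS (
    SELECT dest_id AS id, count(*) AS indegree
    FROM edge
    GROUP BY dest_id
),
out_deg AS (
    SELECT src_id AS id, count(*) AS outdegree
    FROM edge
    GROUP BY src_id
)
SELECT v.id,
       COALESCE(i.indegree, 0)  AS indegree,   -- 0 when no edge points in
       COALESCE(o.outdegree, 0) AS outdegree,  -- 0 when no edge points out
       COALESCE(i.indegree, 0) + COALESCE(o.outdegree, 0) AS degree
FROM vertex v
LEFT JOIN in_deg  i ON i.id = v.id
LEFT JOIN out_deg o ON o.id = v.id
ORDER BY v.id;
```

    For the sample data, the indegree and outdegree columns of this query should match the table above. The LEFT JOINs keep vertices that have no incoming or no outgoing edges, such as vertex 7.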
    
  3. Create a graph with 2 groups:
    DROP TABLE IF EXISTS edge_gr;
    CREATE TABLE edge_gr AS
    (
      SELECT *, 0 AS grp FROM edge
      UNION
      SELECT *, 1 AS grp FROM edge WHERE src_id < 6 AND dest_id < 6
    );
    INSERT INTO edge_gr VALUES
    (4,5,-20,1);
    
  4. Find in-out degrees for all groups:
    DROP TABLE IF EXISTS out_gr;
    SELECT madlib.graph_vertex_degrees(
        'vertex',      -- Vertex table
        NULL,          -- Vertex id column (NULL means use default naming)
        'edge_gr',     -- Edge table
        'src=src_id, dest=dest_id, weight=edge_weight',
        'out_gr',      -- Output table of vertex degrees
        'grp'          -- Grouping columns
    );
    SELECT * FROM out_gr WHERE id < 2 ORDER BY grp, id;
    
     grp | id | indegree | outdegree
    -----+----+----------+-----------
       0 |  0 |        2 |         3
       0 |  1 |        1 |         2
       1 |  0 |        2 |         3
       1 |  1 |        1 |         2
    (4 rows)
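
    A quick sanity check on the grouped result: in a directed graph every edge contributes exactly one to some in-degree and one to some out-degree, so within each group the two sums should both equal the number of edges. The query below is a sketch under the assumption that out_gr and edge_gr exist as created in steps 3 and 4, and that every edge endpoint appears in the vertex table. The aliases total_in, total_out and num_edges are illustrative.

```sql
-- Compare per-group degree totals against the per-group edge count.
SELECT d.grp, d.total_in, d.total_out, e.num_edges
FROM (
    SELECT grp,
           sum(indegree)  AS total_in,
           sum(outdegree) AS total_out
    FROM out_gr
    GROUP BY grp
) d
JOIN (
    SELECT grp, count(*) AS num_edges
    FROM edge_gr
    GROUP BY grp
) e ON e.grp = d.grp
ORDER BY d.grp;
```

    If the three numbers differ for a group, the likely cause is an edge whose source or destination id is missing from the vertex table.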