2.1.0
User Documentation for Apache MADlib
Graph Diameter
Contents

Diameter is defined as the longest of all shortest paths in a graph.

Note
This function assumes a valid output from a prior APSP run - both the APSP table and the associated output summary table. APSP is a computationally expensive algorithm because it finds the shortest path between all nodes in the graph. The worst case run-time for this implementation is O(V^2 * E) where V is the number of vertices and E is the number of edges. In practice, run-time will be generally be much less than this, depending on the graph.

Diameter
graph_diameter( apsp_table,
                output_table
               )

Arguments

apsp_table

TEXT. Name of the output table generated by a prior run of all pairs shortest path (APSP).

out_table

TEXT. Name of the table to store the diameter. It contains a row for every group, the diameter value and the two vertices that are the farthest apart.

Examples
  1. Create vertex and edge tables to represent the graph:
    DROP TABLE IF EXISTS vertex, edge;
    CREATE TABLE vertex(
            id INTEGER,
            name TEXT
            );
    CREATE TABLE edge(
            src_id INTEGER,
            dest_id INTEGER,
            edge_weight FLOAT8
            );
    INSERT INTO vertex VALUES
    (0, 'A'),
    (1, 'B'),
    (2, 'C'),
    (3, 'D'),
    (4, 'E'),
    (5, 'F'),
    (6, 'G'),
    (7, 'H');
    INSERT INTO edge VALUES
    (0, 1, 1.0),
    (0, 2, 1.0),
    (0, 4, 10.0),
    (1, 2, 2.0),
    (1, 3, 10.0),
    (2, 3, 1.0),
    (2, 5, 1.0),
    (2, 6, 3.0),
    (3, 0, 1.0),
    (4, 0, -2.0),
    (5, 6, 1.0),
    (6, 7, 1.0);
    
  2. Calculate the all-pair shortest paths:
    DROP TABLE IF EXISTS out_apsp, out_apsp_summary;
    SELECT madlib.graph_apsp('vertex',      -- Vertex table
                             'id',          -- Vertix id column (NULL means use default naming)
                             'edge',        -- Edge table
                             'src=src_id, dest=dest_id, weight=edge_weight',
                                            -- Edge arguments (NULL means use default naming)
                             'out_apsp');        -- Output table of shortest paths
    
  3. Compute the diameter measure for the graph:
    DROP TABLE IF EXISTS out_diameter;
    SELECT madlib.graph_diameter('out_apsp', 'out_diameter');
    SELECT * FROM out_diameter;
    
    diameter | diameter_end_vertices
    ---------+-----------------------
          14 | {{1,4}}
    (1 row)
    
  4. Create a graph with 2 groups and find APSP for each group:
    DROP TABLE IF EXISTS edge_gr;
    CREATE TABLE edge_gr AS
    (
      SELECT *, 0 AS grp FROM edge
      UNION
      SELECT *, 1 AS grp FROM edge WHERE src_id < 6 AND dest_id < 6
    );
    INSERT INTO edge_gr VALUES
    (4,5,-20,1);
    
  5. Find APSP for all groups:
    DROP TABLE IF EXISTS out_gr, out_gr_summary;
    SELECT madlib.graph_apsp(
                             'vertex',      -- Vertex table
                             NULL,          -- Vertex id column (NULL means use default naming)
                             'edge_gr',     -- Edge table
                             'src=src_id, dest=dest_id, weight=edge_weight',
                             'out_gr',      -- Output table of shortest paths
                             'grp'          -- Grouping columns
    );
    
  6. Find the diameter of graph in every group
    DROP TABLE IF EXISTS out_gr_path;
    SELECT madlib.graph_diameter('out_gr', 'out_gr_diameter');
    SELECT * FROM out_gr_diameter ORDER BY grp;
    
    grp | diameter | diameter_end_vertices
    ---—+---------—+----------------------—
      0 |       14 | {{1,4}}
      1 |       14 | {{1,4}}
    (2 rows)