Graph Syntax in SQL 2023

Ryota Yamanaka
7 min read · May 19, 2023

The new developer release, Oracle Database 23c Free, became available on April 3, 2023. This release supports the Property Graph syntax in the latest SQL specification (SQL 2023).

In this article, I will show some example queries based on the data from “Analyzing Bank Transfer Data with Graphs”. That previous article used PGQL, the query language for property graphs; here, I will run the same graph queries using SQL.
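To see how the two languages relate, here is a rough side-by-side sketch of the same pattern, finding a customer's accounts, in both languages. The PGQL form follows the style of the previous article; minor syntax details (such as the graph reference) may differ from your setup.

```sql
-- PGQL (previous article): labels after a colon, projection in SELECT
SELECT c.first_name, a.acc_id
FROM MATCH (c:customer)-[:owns]->(a:account) ON bank_graph
WHERE c.cst_id = 10

-- SQL 2023 (this article): the pattern goes inside GRAPH_TABLE,
-- labels use IS, and output columns are listed in a COLUMNS clause
SELECT *
FROM GRAPH_TABLE (bank_graph
  MATCH (c IS customer)-[IS owns]->(a IS account)
  WHERE c.cst_id = 10
  COLUMNS (c.first_name, a.acc_id)
);
```

The pattern-matching part is nearly identical; the main difference is that SQL wraps the match in the GRAPH_TABLE operator so that the result can be used like any other table expression.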

For more details, see the SQL Property Graphs section on the Property Graph Release 23.2 page of the Oracle Database 23c Free documentation.

Launch Oracle Database 23c

Run the container using Docker or Podman. The container image is about 10 GB after decompression.

docker run -p 1521:1521 --name database-23c \
container-registry.oracle.com/database/free:latest

Once you see this message, the database has started successfully.

...
#########################
DATABASE IS READY TO USE!
#########################
...

If you stopped the container using Ctrl + C or similar, start it again.

docker start database-23c

Set the password for the admin user.

docker exec -it database-23c ./setPassword.sh password123

Log in to the database as the admin user (= sys).

docker exec -it database-23c sqlplus sys/password123@freepdb1 as sysdba

Create a user

Create “graphuser” as a regular database user and grant the CONNECT and RESOURCE roles. Refer to this page for more information on new permissions related to graphs.

CREATE USER graphuser
IDENTIFIED BY password123
DEFAULT TABLESPACE users
TEMPORARY TABLESPACE temp
QUOTA UNLIMITED ON users;
GRANT CONNECT, RESOURCE TO graphuser;

Log out from the SQL client.

exit

Create table dataset

To download the data files into the container, open a shell inside it.

docker exec -it database-23c /bin/bash

Download the files containing INSERT statements for data loading.

curl -O https://raw.githubusercontent.com/ryotayamanaka/oracle-graph/main/bank-transaction/data/scale-100/bank_customer.sql
curl -O https://raw.githubusercontent.com/ryotayamanaka/oracle-graph/main/bank-transaction/data/scale-100/bank_account.sql
curl -O https://raw.githubusercontent.com/ryotayamanaka/oracle-graph/main/bank-transaction/data/scale-100/bank_transaction.sql

Log in to the database as the “graphuser”.

sqlplus graphuser/password123@freepdb1

Create the tables. Constraints such as primary keys and foreign keys are not required to create a graph, but they are useful for maintaining its consistency.

CREATE TABLE bank_customer (
cst_id NUMBER NOT NULL
, first_name VARCHAR2(255)
, last_name VARCHAR2(255)
, CONSTRAINT bank_customer_pk PRIMARY KEY (cst_id)
);

CREATE TABLE bank_account (
acc_id NUMBER NOT NULL
, cst_id NUMBER NOT NULL
, CONSTRAINT bank_account_pk PRIMARY KEY (acc_id)
, CONSTRAINT bank_account_cst_fk FOREIGN KEY (cst_id) REFERENCES bank_customer (cst_id)
);

CREATE TABLE bank_transaction (
acc_id_src NUMBER
, acc_id_dst NUMBER
, txn_id NUMBER
, datetime TIMESTAMP
, amount NUMBER
, CONSTRAINT bank_transaction_pk PRIMARY KEY (txn_id)
, CONSTRAINT bank_transaction_src_fk FOREIGN KEY (acc_id_src) REFERENCES bank_account (acc_id)
, CONSTRAINT bank_transaction_dst_fk FOREIGN KEY (acc_id_dst) REFERENCES bank_account (acc_id)
);

Execute INSERT statements.

set termout off
@bank_customer.sql
@bank_account.sql
@bank_transaction.sql
commit;
Commit complete.

Confirm that the data has been loaded.

set termout on
select count(*) from bank_customer;
select count(*) from bank_account;
select count(*) from bank_transaction;
  COUNT(*)
----------
       160

  COUNT(*)
----------
       200

  COUNT(*)
----------
      6030

Create a graph

Create a graph bank_graph. The CREATE PROPERTY GRAPH syntax used here was introduced in SQL 2023. The statement below defines customer and account as vertices (nodes), and transferred_to and owns as edges.

CREATE PROPERTY GRAPH bank_graph
VERTEX TABLES (
bank_customer
KEY (cst_id)
LABEL customer
PROPERTIES (cst_id, first_name, last_name)
, bank_account
KEY (acc_id)
LABEL account
PROPERTIES (acc_id)
)
EDGE TABLES (
bank_transaction
KEY (txn_id)
SOURCE KEY (acc_id_src) REFERENCES bank_account (acc_id)
DESTINATION KEY (acc_id_dst) REFERENCES bank_account (acc_id)
LABEL transferred_to
PROPERTIES (txn_id, datetime, amount)
, bank_account AS bank_account_owns
KEY (acc_id)
SOURCE KEY (cst_id) REFERENCES bank_customer (cst_id)
DESTINATION KEY (acc_id) REFERENCES bank_account (acc_id)
LABEL owns
);
Property graph created.

Run queries

The commands below are for formatting the query results.

SET LINESIZE 200
SET PAGESIZE 5000
COLUMN first_name FORMAT a10

First, check the accounts held by the customer with CST_ID = 10.

SELECT * FROM GRAPH_TABLE (bank_graph
MATCH (c IS customer)-[e IS owns]->(a IS account)
WHERE c.cst_id = 10
COLUMNS (c.cst_id, c.first_name, a.acc_id)
);

This customer’s name is Laura, and she has two accounts.

    CST_ID FIRST_NAME     ACC_ID
---------- ---------- ----------
        10 Laura              10
        10 Laura              90

Next, let’s query for all destinations of Laura’s transfers.

SELECT DISTINCT * FROM GRAPH_TABLE (bank_graph
MATCH (c IS customer)-[e IS owns]->(a IS account)-[t IS transferred_to]->(a1 IS account)
WHERE c.cst_id = 10
COLUMNS (c.cst_id, c.first_name, a.acc_id, a1.acc_id AS acc_id_1)
);
    CST_ID FIRST_NAME     ACC_ID   ACC_ID_1
---------- ---------- ---------- ----------
        10 Laura              10         97
        10 Laura              10         49
        10 Laura              10         59
        10 Laura              10         75
        10 Laura              10         92
...

Now, focusing only on the transfers between accounts, let’s look for a pattern starting from the account with acc_id = 10 and returning to the same account in 3 hops.

SELECT DISTINCT * FROM GRAPH_TABLE (bank_graph
MATCH (a1)-[t1 IS transferred_to]->(a2)-[t2 IS transferred_to]->(a3)
, (a3)-[t3 IS transferred_to]->(a1)
WHERE a1.acc_id = 10
AND t1.amount > 500 AND t2.amount > 500 AND t3.amount > 500
AND t1.datetime < t2.datetime AND t2.datetime < t3.datetime
COLUMNS (
a1.acc_id AS acc_id_1, t1.amount AS amount_1
, a2.acc_id AS acc_id_2, t2.amount AS amount_2
, a3.acc_id AS acc_id_3, t3.amount AS amount_3
)
);

A matching pattern was found where acc_id goes 10 > 75 > 77 > 10.

  ACC_ID_1   AMOUNT_1   ACC_ID_2   AMOUNT_2   ACC_ID_3   AMOUNT_3
---------- ---------- ---------- ---------- ---------- ----------
        10        600         75        900         77        900
        10        800         75        900         77        900

Similarly, try this for 4 hops.

SELECT DISTINCT * FROM GRAPH_TABLE (bank_graph
MATCH (a1)-[t1 IS transferred_to]->(a2)-[t2 IS transferred_to]->(a3)
, (a3)-[t3 IS transferred_to]->(a4)-[t4 IS transferred_to]->(a1)
WHERE a1.acc_id = 10
AND a1.acc_id != a3.acc_id AND a2.acc_id != a4.acc_id
AND t1.amount > 500 AND t2.amount > 500
AND t3.amount > 500 AND t4.amount > 500
AND t1.datetime < t2.datetime
AND t2.datetime < t3.datetime
AND t3.datetime < t4.datetime
COLUMNS (
a1.acc_id AS acc_id_1, t1.amount AS amount_1
, a2.acc_id AS acc_id_2, t2.amount AS amount_2
, a3.acc_id AS acc_id_3, t3.amount AS amount_3
, a4.acc_id AS acc_id_4, t4.amount AS amount_4
)
);

Only one path was found in this case.

  ACC_ID_1   AMOUNT_1   ACC_ID_2   AMOUNT_2   ACC_ID_3   AMOUNT_3   ACC_ID_4   AMOUNT_4
---------- ---------- ---------- ---------- ---------- ---------- ---------- ----------
        10        800         49        600         81        700         25        700

Let’s look for accounts that received a large number of small remittances (less than 500) between Oct 1 and Dec 1. Relational databases are also good at aggregation operations.

SELECT t.acc_id, COUNT(acc_id) AS num_of_txn
FROM GRAPH_TABLE (bank_graph
MATCH (a1)-[t IS transferred_to]->(a2)
WHERE t.datetime >= TIMESTAMP '2020-10-01 00:00:00'
AND t.datetime < TIMESTAMP '2020-12-01 00:00:00'
AND t.amount < 500.00
COLUMNS (
a2.acc_id
)
) t
GROUP BY t.acc_id
ORDER BY num_of_txn DESC
FETCH FIRST 10 ROWS ONLY
;
    ACC_ID NUM_OF_TXN
---------- ----------
        23         29
        41         27
        33         26
        30         24
       141         23
       159         22
       134         22
       123         22
       130         21
        34         20

Check execution plans

As with other SQL queries, you can obtain the execution plan with EXPLAIN PLAN.

SET LINESIZE 200
SET PAGESIZE 5000
COLUMN plan_table_output FORMAT a120
EXPLAIN PLAN FOR 
SELECT DISTINCT * FROM GRAPH_TABLE (bank_graph
MATCH (a1)-[t1 IS transferred_to]->(a2)-[t2 IS transferred_to]->(a3)
, (a3)-[t3 IS transferred_to]->(a1)
WHERE a1.acc_id = 10
AND t1.amount > 500 AND t2.amount > 500 AND t3.amount > 500
AND t1.datetime < t2.datetime AND t2.datetime < t3.datetime
COLUMNS (
a1.acc_id AS acc_id_1, t1.amount AS amount_1
, a2.acc_id AS acc_id_2, t2.amount AS amount_2
, a3.acc_id AS acc_id_3, t3.amount AS amount_3
)
);
SELECT * FROM TABLE(DBMS_XPLAN.DISPLAY(format=>'ALL'));
PLAN_TABLE_OUTPUT
------------------------------------------------------------------------------------------------------------------------
Plan hash value: 540988490

-----------------------------------------------------------------------------------------
| Id  | Operation            | Name             | Rows  | Bytes | Cost (%CPU)| Time     |
-----------------------------------------------------------------------------------------
|   0 | SELECT STATEMENT     |                  |     1 |   156 |    28   (4)| 00:00:01 |
|   1 |  HASH UNIQUE         |                  |     1 |   156 |    28   (4)| 00:00:01 |
|*  2 |   HASH JOIN          |                  |     1 |   156 |    27   (0)| 00:00:01 |
|*  3 |    HASH JOIN         |                  |     7 |   728 |    18   (0)| 00:00:01 |
|*  4 |     TABLE ACCESS FULL| BANK_TRANSACTION |     9 |   468 |     9   (0)| 00:00:01 |
|*  5 |     TABLE ACCESS FULL| BANK_TRANSACTION |  2943 |  149K |     9   (0)| 00:00:01 |
|*  6 |    TABLE ACCESS FULL | BANK_TRANSACTION |    19 |   988 |     9   (0)| 00:00:01 |
-----------------------------------------------------------------------------------------
...

Since we can apply standard SQL tuning techniques, let’s create indexes that may improve table join operations.

CREATE INDEX bank_account_cst_idx ON bank_account (cst_id);
CREATE INDEX bank_transaction_src_idx ON bank_transaction (acc_id_src);
CREATE INDEX bank_transaction_dst_idx ON bank_transaction (acc_id_dst);
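To check the effect of the indexes, regenerate the plan by repeating the same EXPLAIN PLAN and DBMS_XPLAN statements used above for the 3-hop query.

```sql
EXPLAIN PLAN FOR
SELECT DISTINCT * FROM GRAPH_TABLE (bank_graph
MATCH (a1)-[t1 IS transferred_to]->(a2)-[t2 IS transferred_to]->(a3)
, (a3)-[t3 IS transferred_to]->(a1)
WHERE a1.acc_id = 10
AND t1.amount > 500 AND t2.amount > 500 AND t3.amount > 500
AND t1.datetime < t2.datetime AND t2.datetime < t3.datetime
COLUMNS (
a1.acc_id AS acc_id_1, t1.amount AS amount_1
, a2.acc_id AS acc_id_2, t2.amount AS amount_2
, a3.acc_id AS acc_id_3, t3.amount AS amount_3
)
);
SELECT * FROM TABLE(DBMS_XPLAN.DISPLAY(format=>'ALL'));
```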

From the execution plan, we can confirm that the indexes are used.

PLAN_TABLE_OUTPUT
------------------------------------------------------------------------------------------------------------------------
Plan hash value: 4108979867

-------------------------------------------------------------------------------------------------------------------
| Id  | Operation                              | Name                     | Rows  | Bytes | Cost (%CPU)| Time     |
-------------------------------------------------------------------------------------------------------------------
|   0 | SELECT STATEMENT                       |                          |     1 |   156 |    19   (6)| 00:00:01 |
|   1 |  HASH UNIQUE                           |                          |     1 |   156 |    19   (6)| 00:00:01 |
|*  2 |   HASH JOIN                            |                          |     1 |   156 |    18   (0)| 00:00:01 |
|*  3 |    HASH JOIN                           |                          |     7 |   728 |    16   (0)| 00:00:01 |
|*  4 |     TABLE ACCESS BY INDEX ROWID BATCHED| BANK_TRANSACTION         |     9 |   468 |     7   (0)| 00:00:01 |
|*  5 |      INDEX RANGE SCAN                  | BANK_TRANSACTION_DST_IDX |    18 |       |     1   (0)| 00:00:01 |
|*  6 |     TABLE ACCESS FULL                  | BANK_TRANSACTION         |  2943 |  149K |     9   (0)| 00:00:01 |
|*  7 |    TABLE ACCESS BY INDEX ROWID BATCHED | BANK_TRANSACTION         |    19 |   988 |     2   (0)| 00:00:01 |
|*  8 |     INDEX RANGE SCAN                   | BANK_TRANSACTION_SRC_IDX |    50 |       |     1   (0)| 00:00:01 |
-------------------------------------------------------------------------------------------------------------------

Such graph pattern matching can also be executed on PGX, an in-memory graph processing engine, which may bring significant performance gains (PGX currently uses PGQL). The SQL Property Graphs section of the documentation describes how to load graphs into PGX.
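As a rough illustration, the 3-hop cycle query above could be expressed in PGQL along these lines. This is a sketch following the previous article's style; the exact syntax accepted by your PGX version may differ slightly.

```sql
SELECT DISTINCT a1.acc_id AS acc_id_1, t1.amount AS amount_1
              , a2.acc_id AS acc_id_2, t2.amount AS amount_2
              , a3.acc_id AS acc_id_3, t3.amount AS amount_3
FROM MATCH (a1)-[t1:transferred_to]->(a2)-[t2:transferred_to]->(a3)
   , MATCH (a3)-[t3:transferred_to]->(a1)
WHERE a1.acc_id = 10
  AND t1.amount > 500 AND t2.amount > 500 AND t3.amount > 500
  AND t1.datetime < t2.datetime AND t2.datetime < t3.datetime
```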

We intend to utilize this functionality for various use cases in the future. If you are interested, please follow OracleGraph for more articles.

Please learn more about Oracle Graph from the official documentation and related articles.
