How to recursively search JSON with PL/SQL

When presented with a JSON document you may need to process all of its contents. You’ll need to read each value and if there are any objects or arrays within the document you’ll need to read through the contents of each of them as well.

Fortunately, the JSON structure is, intentionally, simple in its construction. When evaluating each value it will fall into only one of three types – it will either be a scalar, an object, or an array. If it’s a scalar (a string, number, Boolean, or null), you can process it with whatever rules you need for that data type. If it’s an object, you recursively process its key-value pairs. If it’s an array, you iterate through each of its member values.

Within PL/SQL the the JSON_ELEMENT_T object type is a parent type to the JSON_OBJECT_T, JSON_ARRAY_T, and JSON_SCALAR_T types. Thus, a function accepting accepting JSON_ELEMENT_T and taking the appropriate action to either process a scalar or search through the objects and arrays, recursively calling itself for each member value will descend through the entire JSON document, exhausting all of its contents.

The first thing we’ll need is a method of determining what type of element each value is. The JSON_ELEMENT_T has IS_xxxxx methods to check for various types. It supports checking for Oracle-specific subtypes such as is_binary, is_date, is_timestamp; but for the purposes of simply recursing through the structure, all we need to know is if it’s an OBJECT or an ARRAY. If the value is a scalar, we’ll process it simply by declaring it’s general JSON type (boolean, number, string, or null)

CREATE OR REPLACE FUNCTION get_json_element_type(p_element IN json_element_t)
    RETURN VARCHAR2
IS
BEGIN
    RETURN CASE
               WHEN p_element.is_object THEN 'OBJECT'
               WHEN p_element.is_array THEN 'ARRAY'
               WHEN p_element.is_boolean THEN 'boolean'
               WHEN p_element.is_null THEN 'null'
               WHEN p_element.is_number THEN 'number'
               WHEN p_element.is_string THEN 'string'
               ELSE 'unknown element type'
           END;
END;

Using this, the basic structure of our function is shown below. I’ve just stubbed in the object and array processing, but for scalar element values, we’re done. The p_level parameter isn’t used yet, but will be as we descend through the child objects and arrays.

CREATE OR REPLACE PROCEDURE iterate_json(p_element   IN json_element_t,
                                         p_level     IN NUMBER DEFAULT 0)
IS
    v_keylist           json_key_list;
    v_subelement_type   VARCHAR2(20);
BEGIN
    CASE
        WHEN p_element.is_object
        THEN
           null;  -- Iterate through object's child elements
        WHEN p_element.is_array
        THEN
           null;  -- Iterate through the array's child elements
        ELSE
            DBMS_OUTPUT.put_line('SCALAR ' || get_json_element_type(p_element));
    END CASE;
END;

Expanding on this structure, if the element is an object, then we’ll use the TREAT function to process the JSON_ELEMENT_T parameter as a JSON_OBJECT_T type value.

JSON_OBJECT_T has get_keys, get_type, and get methods to allow for iterating through each of the key/value pairs within the JSON object. So, first we’ll get a json_key_list using the get_keys method and iterate through them. If the object is empty, we’ll print that. If it’s populated, then for each sub-element, we will use get_type to determine what it is, and print it’s key along with its type. If it is a scalar we’ll also print out it’s JSON sub-type.

If the sub-element is an OBJECT or ARRAY then we’ll call our function with the child as the element parameter, increasing the p_level parameter by 1 as we go one level deeper into the structure.

Edit

Note: depending on the version of your database, the get_keys method of an empty object will return either NULL (19c and lower) or a collection of 0 elements (21c and higher). I’ll write more about this in a followup article.

The function now looks like this…

CREATE OR REPLACE PROCEDURE iterate_json(p_element   IN json_element_t,
                                         p_level     IN NUMBER DEFAULT 0)
IS
    v_keylist           json_key_list;
    v_subelement_type   VARCHAR2(20);
BEGIN
    CASE
        WHEN p_element.is_object
        THEN
            v_keylist := TREAT(p_element AS json_object_t).get_keys;

            CASE
                WHEN v_keylist IS NULL OR v_keylist.COUNT = 0
                THEN
                    DBMS_OUTPUT.put_line(
                        RPAD('.', p_level * 3, '.') || 'empty object');
                ELSE
                    FOR i IN 1 .. v_keylist.COUNT
                    LOOP
                        v_subelement_type :=
                            TREAT(p_element AS json_object_t).get_type(
                                v_keylist(i));
                        DBMS_OUTPUT.put_line(
                               RPAD('.', p_level * 3, '.')
                            || v_keylist(i)
                            || '  ('
                            || v_subelement_type
                            || CASE
                                   WHEN v_subelement_type = 'SCALAR'
                                   THEN
                                          ' '
                                       || get_json_element_type(
                                              TREAT(p_element AS json_object_t).get(
                                                  v_keylist(i)))
                               END
                            || ')');

                        IF v_subelement_type IN ('OBJECT', 'ARRAY')
                        THEN
                            iterate_json(
                                TREAT(p_element AS json_object_t).get(
                                    v_keylist(i)),
                                p_level + 1);
                        END IF;
                    END LOOP;
            END CASE;
        WHEN p_element.is_array
        THEN
           null;  -- Iterate through the array's child elements
        ELSE
            DBMS_OUTPUT.put_line('SCALAR ' || get_json_element_type(p_element));
    END CASE;
END;
/

To handle array type elements, we follow a similar model, using TREAT to process the JSON_ELEMENT_T as a JSON_ARRAY_T. Unlike the object elements we don’t need to generate a keylist, we simply iterate directly on the array’s sub-elements 0-N. Like the object, if the array is empty, we will indicate that.

If the array is not empty, then for each sub-element, we’ll print it out with its array index and similar to the object’s sub-elements, if it’s a scalar, we’ll also print out its JSON sub-type.

Then, if the sub-element is an OBJECT or an ARRAY, we use recursion as above, passing in the array indexed child as the element and increasing the p_level parameter by 1 as we descend another level into the JSON structure.

This completes the function, with its final form looking like this:

CREATE OR REPLACE PROCEDURE iterate_json(p_element   IN json_element_t,
                                         p_level     IN NUMBER DEFAULT 0)
IS
    v_keylist           json_key_list;
    v_subelement_type   VARCHAR2(20);
BEGIN
    CASE
        WHEN p_element.is_object
        THEN
            v_keylist := TREAT(p_element AS json_object_t).get_keys;

            CASE
                WHEN v_keylist IS NULL OR v_keylist.COUNT = 0
                THEN
                    DBMS_OUTPUT.put_line(
                        RPAD('.', p_level * 3, '.') || 'empty object');
                ELSE
                    FOR i IN 1 .. v_keylist.COUNT
                    LOOP
                        v_subelement_type :=
                            TREAT(p_element AS json_object_t).get_type(
                                v_keylist(i));
                        DBMS_OUTPUT.put_line(
                               RPAD('.', p_level * 3, '.')
                            || v_keylist(i)
                            || '  ('
                            || v_subelement_type
                            || CASE
                                   WHEN v_subelement_type = 'SCALAR'
                                   THEN
                                          ' '
                                       || get_json_element_type(
                                              TREAT(p_element AS json_object_t).get(
                                                  v_keylist(i)))
                               END
                            || ')');

                        IF v_subelement_type IN ('OBJECT', 'ARRAY')
                        THEN
                            iterate_json(
                                TREAT(p_element AS json_object_t).get(
                                    v_keylist(i)),
                                p_level + 1);
                        END IF;
                    END LOOP;
            END CASE;
        WHEN p_element.is_array
        THEN
            CASE
                WHEN TREAT(p_element AS json_array_t).get_size = 0
                THEN
                    DBMS_OUTPUT.put_line(
                        RPAD('.', p_level * 3, '.') || 'empty array');
                ELSE
                    FOR i IN 0 .. TREAT(p_element AS json_array_t).get_size - 1
                    LOOP
                        v_subelement_type :=
                            TREAT(p_element AS json_array_t).get_type(i);
                        DBMS_OUTPUT.put_line(
                               RPAD('.', p_level * 3, '.')
                            || '['
                            || TO_CHAR(i, 'fm999')
                            || ']  ('
                            || v_subelement_type
                            || CASE
                                   WHEN v_subelement_type = 'SCALAR'
                                   THEN
                                          ' '
                                       || get_json_element_type(
                                              TREAT(p_element AS json_array_t).get(
                                                  i))
                               END
                            || ')');

                        IF v_subelement_type IN ('OBJECT', 'ARRAY')
                        THEN
                            iterate_json(
                                TREAT(p_element AS json_array_t).get(i),
                                p_level + 1);
                        END IF;
                    END LOOP;
            END CASE;
        ELSE
            DBMS_OUTPUT.put_line('SCALAR ' || get_json_element_type(p_element));
    END CASE;
END;

Example usage on a JSON document containing scalars, sub-objects, and arrays where the array also contain sub-objects and the sub-objects contain arrays. Using the p_level parameter, each deeper, level is padded to show the structure.

SQL> set serveroutput on
SQL> BEGIN
  2      iterate_json(json_object_t('
  3  {
  4     "a": 123,
  5     "b": "hello",
  6     "c": true,
  7     "d": false,
  8     "e": {
  9        "e_a": 456,
 10        "e_b": "sub_object_string",
 11        "e_c": true,
 12        "e_d": false,
 13        "e_e": {
 14           "e_e_a": 123,
 15           "e_e_b": null
 16        },
 17        "e_f": [222,
 18                "arraystring",
 19                null,
 20                {"array_obj_a": 987, "array_obj_b": "string_inside_object_inside_array_inside_object"}
 21               ]
 22     },
 23     "f": [1, 2, 3, 4, 5],
 24     "g": null
 25  }')
 26     );
 27  END;
 28  /
a  (SCALAR number)
b  (SCALAR string)
c  (SCALAR boolean)
d  (SCALAR boolean)
e  (OBJECT)
...e_a  (SCALAR number)
...e_b  (SCALAR string)
...e_c  (SCALAR boolean)
...e_d  (SCALAR boolean)
...e_e  (OBJECT)
......e_e_a  (SCALAR number)
......e_e_b  (SCALAR null)
...e_f  (ARRAY)
......[0]  (SCALAR number)
......[1]  (SCALAR string)
......[2]  (SCALAR null)
......[3]  (OBJECT)
.........array_obj_a  (SCALAR number)
.........array_obj_b  (SCALAR string)
f  (ARRAY)
...[0]  (SCALAR number)
...[1]  (SCALAR number)
...[2]  (SCALAR number)
...[3]  (SCALAR number)
...[4]  (SCALAR number)
g  (SCALAR null)


PL/SQL procedure successfully completed.

This is, of course, a simplified function merely printing out the types of every element; but hopefully it will help with constructing your own recursive algorithms to process your data values within your business context.

The core logic of determining type, iterating through object keys or array members will be the same regardless of the content. It’s what you do with the end scalar values that will vary based on your needs.

Thank you for reading. Questions and comment, as always, are welcome.

Edit

Leave a ReplyCancel reply