In two prior articles I used PL/SQL and Java to build functions that return the keys of a JSON document as a collection, which could then be queried as a table.
23c includes new Multilingual Engine (MLE) functionality, providing a mechanism to define a stored procedure or function using JavaScript.
This seemed like an ideal solution, and the syntax to return the keys is delightfully simple: merely invoke the Object.keys method on the JSON object, which, unlike in SQL, is a native feature in JavaScript.
Do note that the syntax for creating MLE functions changed between the early 23c releases and the 23.4 (23ai) release. As of 23.4, the JavaScript needs to be enclosed in double curly braces as shown below. Hat tip to Chris Saxon for sending me a note on the syntax change.
CREATE OR REPLACE FUNCTION getkeys(obj IN JSON)
    RETURN JSON
AS MLE LANGUAGE JAVASCRIPT
{{
    return Object.keys(OBJ);
}};
/
Also note that the syntax change is accompanied by changes in how MLE and JavaScript privileges are granted. You need two additional privileges to create the function above:
- GRANT EXECUTE ON JAVASCRIPT TO user/role
- GRANT CREATE MLE TO user/role
If you are using one of the older releases, the JavaScript is entered as a VARCHAR2 value, and you don't need the additional privileges either.
CREATE OR REPLACE FUNCTION getkeys(obj IN JSON)
    RETURN JSON
AS MLE LANGUAGE JAVASCRIPT
    q'[return Object.keys(OBJ);]';
/
In either case, the JavaScript itself is unchanged. It is a single line of code!
However, one difference between the MLE functionality and PL/SQL or Java is that it can't return a SQL collection type. To return our set of values, it must instead return the array of keys as a JSON value.
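As a quick check of what that means, calling the function on its own returns a single JSON array rather than one row per key (a hedged example; the exact display formatting depends on your client):

-- Calling getkeys directly yields one JSON array value,
-- e.g. ["a","b","c"], not a row per key.
SELECT getkeys(json('{"a":"test 1", "b": "test 2", "c" : "test 3"}')) AS key_array
  FROM dual;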
So, given a simple JSON document of {"a":"test 1", "b":"test 2", "c":"test 3"}, we want to get back the three rows a, b, and c. To do that we'll parse the returned JSON value with the JSON_TABLE function:
SQL> SELECT * FROM
  2  JSON_TABLE(getkeys(json('{"a":"test 1", "b": "test 2", "c" : "test 3"}')),
  3  '$[*]' COLUMNS key PATH '$');

KEY
--------------------------------------------------------------------------------
a
b
c
That's a little less convenient than using the table functions, but it works and it's repeatable. The output will always be a simple JSON array, so the JSON_TABLE structure is reusable for any query of this type. We can make it a little easier to use with a SQL macro.
CREATE OR REPLACE FUNCTION getkeys_macro(p_json CLOB)
    RETURN VARCHAR2
    SQL_MACRO (TABLE)
IS
BEGIN
    RETURN q'~SELECT key FROM JSON_TABLE(getkeys(json(p_json)), '$[*]' COLUMNS key PATH '$')~';
END;
/
Using the macro is then just a matter of passing in the JSON text value and querying it like the previous table functions.
SQL> select * from
  2  getkeys_macro('{"a":"test 1", "b": "test 2", "c" : "test 3"}');

KEY
--------------------------------------------------------------------------------
a
b
c
So, that's easy enough. But the point of building these functions was to use them for parsing out the keys of large documents, i.e. those with more than 32K keys.
Using the n_key_json function from the previous article, I'll create a JSON document with 35,000 keys and compare the time it takes to read them with the Java function and the MLE function.
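For readers who don't have the earlier article handy, a minimal sketch of such a generator might look like the following. This is only an assumption about its shape, not the original code; the key and value formats here are invented for illustration.

-- Hypothetical sketch: build a JSON document of the form
-- {"k1":"v1","k2":"v2",...,"kN":"vN"} with p_keys keys.
CREATE OR REPLACE FUNCTION n_key_json(p_keys IN PLS_INTEGER)
    RETURN CLOB
IS
    v_json CLOB := '{';
BEGIN
    FOR i IN 1 .. p_keys
    LOOP
        v_json := v_json
               || CASE WHEN i > 1 THEN ',' END
               || '"k' || i || '":"v' || i || '"';
    END LOOP;

    RETURN v_json || '}';
END;
/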
SQL> set serveroutput on
SQL> DECLARE
  2      v_clob  CLOB;
  3      v_start TIMESTAMP WITH TIME ZONE;
  4  BEGIN
  5      v_clob := n_key_json(35000);
  6
  7      v_start := SYSTIMESTAMP;
  8
  9      FOR x IN (SELECT * FROM java_json_keys(v_clob))
 10      LOOP
 11          NULL;
 12      END LOOP;
 13
 14      DBMS_OUTPUT.put_line(SYSTIMESTAMP - v_start);
 15
 16      v_start := SYSTIMESTAMP;
 17
 18      FOR x IN (SELECT * FROM getkeys_macro(v_clob))
 19      LOOP
 20          NULL;
 21      END LOOP;
 22
 23      DBMS_OUTPUT.put_line(SYSTIMESTAMP - v_start);
 24  END;
 25  /
+000000000 00:00:00.183055000
+000000000 00:00:00.777686000
Both are sub-second, but the MLE version was slower. Next I'll ramp up to ten times the size with 350,000 keys.
SQL> DECLARE
  2      v_clob  CLOB;
  3      v_start TIMESTAMP WITH TIME ZONE;
  4  BEGIN
  5      v_clob := n_key_json(350000);
  6
  7      v_start := SYSTIMESTAMP;
  8
  9      FOR x IN (SELECT * FROM java_json_keys(v_clob))
 10      LOOP
 11          NULL;
 12      END LOOP;
 13
 14      DBMS_OUTPUT.put_line(SYSTIMESTAMP - v_start);
 15
 16      v_start := SYSTIMESTAMP;
 17
 18      FOR x IN (SELECT * FROM getkeys_macro(v_clob))
 19      LOOP
 20          NULL;
 21      END LOOP;
 22
 23      DBMS_OUTPUT.put_line(SYSTIMESTAMP - v_start);
 24  END;
 25  /
+000000000 00:00:07.465496000
+000000000 00:02:38.495631000
The MLE version is significantly slower now, over 20 times slower than the Java version. It might seem artificial to use cursor FOR loops, but I also tried a BULK COLLECT into a collection (a sketch of that variation follows the error message below) and got comparable times. A direct assignment in PL/SQL isn't legal syntax with the SQL macro. So, while I could do
v_keys := java_json_keys(v_clob);
with the Java function, I can't do
v_keys := getkeys_macro(v_clob);
PLS-00382: expression is of wrong type
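For reference, the BULK COLLECT variation mentioned above looked roughly like this. It is a sketch only: the key_list collection type is a hypothetical name not defined in the original articles.

-- Hypothetical collection type to hold the keys.
CREATE OR REPLACE TYPE key_list AS TABLE OF VARCHAR2(4000);
/

DECLARE
    v_clob CLOB;
    v_keys key_list;
BEGIN
    v_clob := n_key_json(350000);

    -- BULK COLLECT the keys from the macro-based query;
    -- timings were comparable to the cursor FOR loop.
    SELECT key
      BULK COLLECT INTO v_keys
      FROM getkeys_macro(v_clob);

    DBMS_OUTPUT.put_line(v_keys.COUNT || ' keys');
END;
/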
So, while the JavaScript syntax for the key extraction is the simplest and most elegant, it does come with a performance penalty. As the Multilingual Engine improves, I expect the JavaScript implementation to improve as well, which may warrant a revisit. Fortunately, JSON documents of this size are rare in my experience, so for the odd exception I'll continue to use my Java stored procedure until a native SQL function is made available.
Thank you for reading. Questions and comments, as always, are welcome.