Java Table Functions – Returning a SQL collection from a Java function.

One of the great features of Oracle is the ability to create a table function. That is, a function returning a collection type that can be used as the source for a query. These functions then allow a developer to provide a simple SQL api to query rows and columns from a variety of data sources. Any piece of information you can retrieve with PL/SQL may then be returned as a collection.

However, as robust as PL/SQL is, there are times when it simply lacks the functionality to achieve a particular purpose. When this happens it maybe useful to turn to Java stored procedures as an option. Alternately, it may be that a working Java class already exists and, even if possible, there could be little or no value in rewriting the class in PL/SQL.

A few weeks ago I wrote an article showing how to use a Java stored procedure to return a single value from a SQL Server database as an Oracle function usable within SQL or PL/SQL statements. In this article I’ll extend the same idea, but instead of returning just one value, the function will return a nested-table collection of objects.

For this function, the goal is to return a list of items ordered from each salesperson.

SELECT p.personid saldpersonid,
       p.fullname salesperson,
       o.customerid,
       o.orderid,
       o.orderdate,
       l.orderlineid,
       l.unitprice,
       l.quantity,
       i.stockitemname
  FROM application.people p
       JOIN sales.orders o ON o.salespersonpersonid = p.personid
       JOIN sales.orderlines l ON l.orderid = o.orderid
       JOIN warehouse.stockitems i ON i.stockitemid = l.stockitemid;
SALESPERSONID SALESPERSON CUSTOMERID ORDERID ORDERDATE ORDERLINEID UNITPRICE QUANTITY STOCKITEMNAME
13
Hudson Hollinworth
185
44773
2015-03-11
141507
2.9
182
Clear packaging tape 48mmx75m
15
Taj Shand
930
18428
2013-12-11
58433
18
24
"The Gu" red shirt XML tag t-shirt (Black) L
6
Sophia Hinton
85
29217
2014-06-19
92634
32
8
Animal with big feet slippers (Brown) L
7
Amy Trefl
1004
8919
2013-06-18
28345
32
2
Dinosaur battery-powered slippers (Green) M
3
Hudson Onslow
531
68540
2016-03-16
215660
18
40
20 mm Double sided bubble wrap 10m
2
Kayla Woodcock
402
44893
2015-03-12
141891
18.5
40
Office cube periscope (Black)

First, we’ll need an object type to hold each row’s data, and a collection type to return a set of those objects.

CREATE OR REPLACE TYPE wwi_salesperson_lineitem_type
AS OBJECT
(
salespersonid INTEGER,
salesperson VARCHAR2(50),
customerid INTEGER,
orderid INTEGER,
orderdate DATE,
orderlineid INTEGER,
unitprice NUMBER,
quantity INTEGER,
stockitemname VARCHAR2(100)
);
/
CREATE OR REPLACE TYPE wwi_salesperson_lineitem_tab
AS TABLE OF wwi_salesperson_lineitem_type;
/

The JDBC and query portion is essentially the same as in the previous article; except instead of single value, we will iterate through a result set. For each row, append STRUCT object to a java array. When the array is full, we then use an ArrayDescriptor to create an oracle.sql.ARRAY from our java array. Also of note, the function will not only have a jdbc connection to the SQL Server database, it will also utilize the local default connection to the Oracle database as well.

CREATE OR REPLACE AND RESOLVE JAVA SOURCE NAMED SDS.WWI_SALESPERSON_LINEITEMS
RESOLVER ((* SDS)(* SQLJDBC)(* PUBLIC))
as import java.sql.*;
import java.util.*;
import com.microsoft.sqlserver.jdbc.*;
import oracle.sql.*;
import oracle.jdbc.driver.*;

public class wwi_salesperson_lineitems {
    public static oracle.sql.ARRAY get_items(
                                             String p_server,
                                             String p_instance,
                                             int p_port,
                                             String p_database,
                                             String p_user,
                                             String p_password,
                                             int p_personid
                                            ) throws Exception
    {
       SQLServerDataSource ds = new SQLServerDataSource();
       ds.setServerName(p_server);
       if (p_instance != null) { ds.setInstanceName(p_instance); }
       ds.setPortNumber(p_port);
       ds.setDatabaseName(p_database);
       ds.setUser(p_user);
       ds.setPassword(p_password);

       Connection localconn    = DriverManager.getConnection("jdbc:default:connection:");
       StructDescriptor oneItem = StructDescriptor.createDescriptor("WWI_SALESPERSON_LINEITEM_TYPE", localconn);
       Object[] salesItem = new Object[9];

       Connection remoteconn = ds.getConnection();

       PreparedStatement pstmt = remoteconn.prepareStatement(
               "SELECT p.personid salespersonid,"
             + "       p.fullname salesperson,"
             + "       o.customerid,"
             + "       o.orderid,"
             + "       o.orderdate,"
             + "       l.orderlineid,"
             + "       l.unitprice,"
             + "       l.quantity,"
             + "       i.stockitemname,"
             + "       count(*) over() cnt"
             + "  FROM application.people   p"
             + "  JOIN sales.orders         o ON o.salespersonpersonid = p.personid"
             + "  JOIN sales.orderlines     l ON l.orderid = o.orderid"
             + "  JOIN warehouse.stockitems i ON i.stockitemid = l.stockitemid"
             + " WHERE ? < 1 or p.personid = ?");

       pstmt.setInt(1, p_personid);
       pstmt.setInt(2, p_personid);

       ResultSet rs = pstmt.executeQuery();
       STRUCT[] temparray = null;
       int i=0;
       while (rs.next()) {
          if (i==0) {
               temparray = new STRUCT[rs.getInt("cnt")];
          }
          salesItem[0] = rs.getInt("salespersonid");
          salesItem[1] = rs.getString("salesperson");
          salesItem[2] = rs.getInt("customerid");
          salesItem[3] = rs.getInt("orderid");
          salesItem[4] = rs.getDate("orderdate");
          salesItem[5] = rs.getInt("orderlineid");
          salesItem[6] = rs.getFloat("unitprice");
          salesItem[7] = rs.getInt("quantity");
          salesItem[8] = rs.getString("stockitemname");

          temparray[i++] = new STRUCT(oneItem, localconn, salesItem);
       }
       rs.close();

       remoteconn.close();

       ArrayDescriptor  SalesItemArray = ArrayDescriptor.createDescriptor("WWI_SALESPERSON_LINEITEM_TAB", localconn);
       return new ARRAY(SalesItemArray, localconn, temparray);
    }
}
/

Then of course we need a pl/sql api to expose the java function.

CREATE OR REPLACE FUNCTION sds.wwi_get_salesperson_items(
            p_server     IN VARCHAR2,
            p_instance   IN VARCHAR2,
            p_port       IN NUMBER,
            p_database   IN VARCHAR2,
            p_user       IN VARCHAR2,
            p_password   IN VARCHAR2,
            p_personid   IN NUMBER)
    RETURN wwi_salesperson_lineitem_tab
IS
    LANGUAGE JAVA
    NAME 'wwi_salesperson_lineitems.get_items(
                  java.lang.String, java.lang.String, int, 
                  java.lang.String, java.lang.String, java.lang.String,
                  int) 
          return oracle.sql.ARRAY' ;

Finally, use the function in a query. I'll look up the first rows for salesid 7.

SELECT *
  FROM TABLE(wwi_get_salesperson_items('my_remote_server',
                                       NULL,
                                       1433,
                                       'WideWorldImporters',
                                       'wwi',
                                       'pa55w0rd',
                                       7))
ORDER BY orderdate
SALESPERSONID SALESPERSON CUSTOMERID ORDERID ORDERDATE ORDERLINEID UNITPRICE QUANTITY STOCKITEMNAME
7
Amy Trefl
949
19
2013-01-01
70
13
7
DBA joke mug - it depends (White)
7
Amy Trefl
949
59
2013-01-01
69
99
60
10 mm Anti static bubble wrap (Blue) 50m
7
Amy Trefl
964
8
2013-01-01
30
13
1
Developer joke mug - old C developers never die (White)
7
Amy Trefl
173
23
2013-01-01
80
18
24
Halloween zombie mask (Light Brown) L
7
Amy Trefl
173
23
2013-01-01
83
13
3
IT joke mug - that behavior is by design (White)
7
Amy Trefl
949
59
2013-01-01
68
18
108
"The Gu" red shirt XML tag t-shirt (White) 7XL
7
Amy Trefl
77
51
2013-01-01
31
18
24
"The Gu" red shirt XML tag t-shirt (Black) XL
7
Amy Trefl
964
8
2013-01-01
27
32
4
USB food flash drive - banana
7
Amy Trefl
77
9
2013-01-01
33
32
4
Animal with big feet slippers (Brown) S
7
Amy Trefl
173
23
2013-01-01
81
32
2
Dinosaur battery-powered slippers (Green) XL
7
Amy Trefl
105
47
2013-01-01
10
230
9
Ride on toy sedan car (Pink) 1/12 scale

Another use case, possibly more common, is reading file lists from the host server. The idea is the same, create an object to hold the rows, a collection to hold the objects, iterate through files creating a STRUCT to add to an array, and finally convert the array to an oracle.sql.ARRAY. By default most users won't have access to read the server directories. If this functionality is required, your DBA will need to grant the appropriate permissions.

Again, first create the SQL objects.

CREATE OR REPLACE TYPE file_type
AS OBJECT
(
file_name VARCHAR2(512),
file_size NUMBER(15, 0),
file_type VARCHAR2(1),
last_modified DATE,
readable VARCHAR2(1),
writeable VARCHAR2(1)
);
/

CREATE OR REPLACE TYPE file_table
AS TABLE OF file_type;
/

Next, create the java class to read the directories and return a collection. I'll overload the class to allow for user specification of file size units (bytes vs kilobytes, megabytes, or gigabytes.) File permissions are protected separately from the permissions needed to view the file name, type, and size. I included a limited exception handler to simply return "?" as unknown if the user receives an error while trying to read them. As with the previous example, the function needs to establish a jdbc connection to the local default connection in order to create the SQL objects.

CREATE OR REPLACE AND RESOLVE JAVA SOURCE NAMED "DirList"
as import java.io.*;
import java.sql.*;
import java.util.*;
import oracle.sql.*;
import oracle.jdbc.driver.*;

public class DirList{
    public static oracle.sql.ARRAY getListasObjects(String p_directory)
    throws Exception
    {
        return getListasObjects(p_directory,"Bytes");
    }

   public static oracle.sql.ARRAY getListasObjects(String p_directory, String p_size_units)
    throws Exception
      {
        Connection conn    = DriverManager.getConnection("jdbc:default:connection:");
        StructDescriptor oneFile = StructDescriptor.createDescriptor("FILE_TYPE", conn);
        Object[] fileattributes = new Object[6];

        File[] files = new File(p_directory).listFiles();
        STRUCT[] temparray = new STRUCT[files.length];
        for( int i = 0; i < files.length; i++ ) {
            fileattributes[0] = files[i].getName();

            String tempunits = p_size_units.toUpperCase();
            if (tempunits.equals("G") || tempunits.equals("GB") )
                 fileattributes[1] = Math.ceil((files[i].length() + 1.0) /1024/1024/1024);
            else if (tempunits.equals("M") || tempunits.equals("MB") )
                 fileattributes[1] =  Math.ceil((files[i].length() + 1.0) /1024/1024);
            else if (tempunits.equals("K")  || tempunits.equals("KB") )
                 fileattributes[1] =  Math.ceil((files[i].length() + 1.0) /1024);
            else {
                 fileattributes[1] = files[i].length();
            }

            fileattributes[2] = (files[i].isDirectory() ? "D" : "F");
            fileattributes[3] = new java.sql.Timestamp(files[i].lastModified());

            // If you don't have permissions granted to read or write
            // you'll get an error simply by looking at the attribute
            // For example...
            // ORA-29532: Java call terminated by uncaught Java exception:
            // java.security.AccessControlException:
            // the Permission (java.io.FilePermission YOURFILE write)
            // has not been granted to YOURUSER.
            // The PL/SQL to grant this is
            // dbms_java.grant_permission('YOURUSER',
            //                            'SYS:java.io.FilePermission',
            //                            'YOURFILE',
            //                            'write')
            try {
               fileattributes[4] = (files[i].canRead() ? "Y" : "N");
            } catch (Exception e) {
               fileattributes[4] = "?";
            }
            try {
               fileattributes[5] = (files[i].canWrite() ? "Y" : "N");
            } catch (Exception e) {
               fileattributes[5] = "?";
            }
            temparray[i] = new STRUCT(oneFile, conn, fileattributes);
        }

        ArrayDescriptor  FileArray = ArrayDescriptor.createDescriptor("FILE_TABLE", conn);
        return new ARRAY(FileArray, conn, temparray);
    }

}

Instead of creating a standalone function, this class will be exposed through a pl/sql package in order to invoke the overloaded methods. This could also have been implemented through a default value in the pl/sql wrapper function. This implementation more directly mirrors the java api which may be more appropriate for other examples.

CREATE OR REPLACE PACKAGE files
AS
    FUNCTION dir_list(p_directory IN VARCHAR2, p_size_units IN VARCHAR2 DEFAULT NULL)
        RETURN file_table;
END;
/

CREATE OR REPLACE PACKAGE BODY files
AS
    FUNCTION dir_list1(p_directory IN VARCHAR2)
        RETURN file_table
    AS
        LANGUAGE JAVA
        NAME 'DirList.getListasObjects( java.lang.String ) return oracle.sql.ARRAY' ;

    FUNCTION dir_list2(p_directory IN VARCHAR2, p_size_units IN VARCHAR2)
        RETURN file_table
    AS
        LANGUAGE JAVA
        NAME 'DirList.getListasObjects( java.lang.String, java.lang.String ) return oracle.sql.ARRAY' ;

    FUNCTION dir_list(p_directory IN VARCHAR2, p_size_units IN VARCHAR2 DEFAULT NULL)
        RETURN file_table
    IS
    BEGIN
        RETURN CASE WHEN p_size_units IS NULL THEN 
                           dir_list1(p_directory) 
                    ELSE 
                           dir_list(p_directory, p_size_units)
               END;
    END;
END;
/

And then finally, use the package to look up the contents of a directory. This does assume permissions have been granted and the directory and files exist.

select * from files.dir_list('/tmp/customer_files');
FILE_NAME FILE_SIZE FILE_TYPE LAST_MODIFIED READABLE WRITEABLE
messages.json
722
F 2019-03-21 01:19:29 Y ?
datadump.csv
3709
F 2019-03-21 01:19:13 Y ?
sales.xml
2473
F 2019-03-21 01:18:56 Y ?

I hope these example help. A natural question to follow up might be "could these functions returned a pipelined collection?" Yes, they can through the Data Cartridge api (ODCI) and I hope to provide examples of those options in a future article.

Questions and comments, as always, are welcome.


Creativity vs Functionality – A Tale of a Query’s Development

Recently, I had a satisfying conclusion to a series of questions that came in randomly over the course of a few weeks.

First the developer was trying to do recursion. I pointed to documentation and gave a couple of small examples showing CONNECT BY and recursive WITH clauses.

    SELECT LPAD(' ', LEVEL * 2) || empno || ' ' || ename || ' ' || job
      FROM emp
CONNECT BY mgr = PRIOR empno
START WITH job = 'MANAGER'
ORDER SIBLINGS BY empno;
WITH
    recursive (lvl, empno, ename, job, mgr)
    AS
        (SELECT 1, empno, ename, job, mgr
           FROM emp
          WHERE job = 'MANAGER'
         UNION ALL
         SELECT recursive.lvl + 1, emp.empno, emp.ename,
                emp.job, emp.mgr
           FROM emp, recursive
          WHERE emp.mgr = recursive.empno)
          SEARCH DEPTH FIRST BY empno SET RN
SELECT LPAD(' ', lvl * 2) || empno || ' ' || ename || ' ' || job
  FROM recursive;
  7566 JONES MANAGER
    7788 SCOTT ANALYST
      7876 ADAMS CLERK
    7902 FORD ANALYST
      7369 SMITH CLERK
  7698 BLAKE MANAGER
    7499 ALLEN SALESMAN
    7521 WARD SALESMAN
    7654 MARTIN SALESMAN
    7844 TURNER SALESMAN
    7900 JAMES CLERK
  7782 CLARK MANAGER
    7934 MILLER CLERK

The next time he reached out to me, he had settled on using a recursive WITH subquery and was aggregating the children of the top level entries – creating a comma-separated list for each.

WITH
    recursive (empno, ename, job, mgr, manager_name)
    AS
        (SELECT  empno, ename, job, mgr, ename
           FROM emp
          WHERE job = 'MANAGER'
         UNION ALL
         SELECT emp.empno, emp.ename,
                emp.job, emp.mgr,  manager_name
           FROM emp, recursive
          WHERE emp.mgr = recursive.empno)
  SELECT manager_name, 
         LISTAGG(ename, ',') 
              WITHIN GROUP (ORDER BY empno) employees
    FROM recursive
   WHERE job != 'MANAGER'
GROUP BY manager_name;
MANAGER  EMPLOYEES
BLAKE	ALLEN,WARD,MARTIN,TURNER,JAMES
CLARK	MILLER
JONES	SMITH,SCOTT,ADAMS,FORD

This solution only worked for a few test rows. He returned a little while later because the aggregated strings were exceeding the VARCHAR2’s 4000 character limit for some groups and thus failing. He was using 12cR2, so the ON OVERFLOW syntax would have been legal; but he needed all of the content. So, neither the ERROR nor the TRUNCATE option would have sufficed. While this isn’t a problem using the small EMP table, it was an issue needing solved for his real data. I showed him how to use the COLLECT function to aggregate into a nested table and then pass that object to a function to iterate through the members and return a delimited CLOB.

WITH
    recursive (empno, ename, job, mgr, manager_name)
    AS
        (SELECT  empno, ename, job, mgr, ename
           FROM emp
          WHERE job = 'MANAGER'
         UNION ALL
         SELECT emp.empno, emp.ename,
                emp.job, emp.mgr,  manager_name
           FROM emp, recursive
          WHERE emp.mgr = recursive.empno)
  SELECT manager_name, 
         tbl2clob(CAST(COLLECT(ename) AS vctab)) employees
    FROM recursive
   WHERE job != 'MANAGER'
GROUP BY manager_name;

The TBL2CLOB function and VCTAB type are objects I’ve used in an earlier series and their source may be found here.

At this point though, I was getting a little concerned his plan was to use the csv strings to construct dynamic sql with large IN-lists. So, before he got a chance to implement that I tell him there are better alternatives using the collections as is instead of converting them into large text values.

The next round of questions weren’t with the syntax itself or database problems; but in testing the queries and constructing various collections he began getting errors for missing objects or inconsistent data types. The problem here was the collection being returned was not explicitly CAST to a known type, thus the client software was not able to render the results because it didn’t know what the results were. The fix is to simply give them a type. CAST was shown before but at that the time it was implied casting was simply a requirement for the tbl2clob function, rather than a practice for all COLLECT output.

  SELECT manager_name, 
         COLLECT(ename) employees  -- error
    FROM recursive
   WHERE job != 'MANAGER'
GROUP BY manager_name
  SELECT manager_name,
         CAST(COLLECT(ename) AS vctab) employees  -- good
    FROM recursive
   WHERE job != 'MANAGER'
GROUP BY manager_name;

Now that his collections were properly typed and he had a query aggregating the expected results he could then use the collection in a join to some other table. I showed samples of MEMBER and IN variations. While the MEMBER syntax is more compact and (in my opinion) cleaner, the optimizer will often make better choices with an IN clause against a subquery. I always recommend testing both.

MEMBER operator
     x.name MEMBER OF m.employees
IN-List Condition
     x.name in (select column_value from table(m.employees))

The next step in his task was materializing the aggregated values from a remote database into a local copy. He didn’t need a materialized view, but did want a static local copy for short-term reuse.

CREATE TABLE managers
NESTED TABLE employees STORE AS manager_employees_nt
AS
WITH
    recursive (empno, ename, job, mgr, manager_name)
    AS
        (SELECT  empno, ename, job, mgr, ename
           FROM emp
          WHERE job = 'MANAGER'
         UNION ALL
         SELECT emp.empno, emp.ename,
                emp.job, emp.mgr,  manager_name
           FROM emp, recursive
          WHERE emp.mgr = recursive.empno)
  SELECT manager_name, CAST(COLLECT(ename) AS vctab) employees
             FROM recursive
            WHERE job != 'MANAGER'
         GROUP BY manager_name

Now that his queries have worked through the basic syntax requirements and I can see some real results starting to come through, I become a little concerned he’s doing a lot of look-ups based on the elements in his collection and this route will be unnecessarily complicated. After a few minutes of wrangling about the details of his requirements and nature of the data I then suggest simplifying the materialized table without a collection.

CREATE TABLE managers
AS       
WITH
    recursive (empno, ename, job, mgr, manager_name)
    AS
        (SELECT  empno, ename, job, mgr, ename
           FROM emp
          WHERE job = 'MANAGER'
         UNION ALL
         SELECT emp.empno, emp.ename,
                emp.job, emp.mgr,  manager_name
           FROM emp, recursive
          WHERE emp.mgr = recursive.empno)
SELECT manager_name, ename
  FROM recursive
 WHERE job != 'MANAGER';

Then do the lookup values and subsequent aggregations using simple equality predicates and groups by.

SELECT m.managers, 
       SUM(e.value1) val1_sum,
       SUM(e.value2) val2_sum
  FROM managers m 
  LEFT JOIN lookup_table e 
      ON e.ename = m.ename;

In retrospect, I suppose it is possible that spending more time digging through his requirements a few weeks ago could have led directly to this solution; but I’m glad we worked through the various combinations. He learned several new syntax techniques and different ways of looking at a problem. So, even though most of them proved unneeded in the end, I feel it was a useful exercise. I’ve given him more tools for his toolbelt. Hopefully I also taught him enough to know when and how to use them.

Moral of the story – It’s not enough to simply know 10 ways to tie a knot, it’s equally important to know why you shouldn’t use 9 of them.

SQL for the fun of it: finding typing patterns in words (Part 1 of 3)

Part 2 Part 3

Yesterday, a coworker asked me a question just out of curiosity: “How would you determine which words can be typed with alternating hands?” He then added an additional criteria that he’d like to find the words that ended on the left hand (presumably, so you could press space or enter with the right.)

Obviously, any word could be typed with alternating hands; but in the spirit of the question, we’ll assume a standard U.S. QWERTY keyboard layout and normal typing technique with left hand resting position on “a-s-d-f” and right hand resting position on “j-k-l-;”. Thus the left-hand letters would be q,w,e,r,t,a,s,d,f,g,z,x,c,v,b and the right-hand letters would then be y,u,i,o,p,h,j,k,l,n,m. Also, the word list to be checked would have no nulls, thus there will always at least one key stroke needed.

He then proceeded to describe how he thought about it: Divide each word into odd and even letters. That is the letters typed on the 1st, 3rd, 5th, etc key stroke are the odd letters, and the 2nd, 4th, 6th, etc. are the even letters. Then check if all odd letters are left and all even letters are right, or vice versa.

I thought that was an interesting question and fairly straight forward approach so I spent a few minutes and came up with this to, more or less, implement exactly what he described.

SELECT word
  FROM (SELECT word,
               odds MULTISET EXCEPT DISTINCT  r x,
               evens MULTISET EXCEPT DISTINCT l y,
               evens MULTISET EXCEPT DISTINCT r w,
               odds MULTISET EXCEPT DISTINCT l z
          FROM (  SELECT word,
                         CAST(
                             COLLECT(
                                 CASE
                                     WHEN MOD(COLUMN_VALUE, 2) = 0 THEN SUBSTR(word, COLUMN_VALUE, 1)
                                 END
                             ) AS vctab
                         ) evens,
                         CAST(
                             COLLECT(
                                 CASE
                                     WHEN MOD(COLUMN_VALUE, 2) = 1 THEN SUBSTR(word, COLUMN_VALUE, 1)
                                 END
                             ) AS vctab
                         ) odds,
                        vctab('y','u','i','o','p','h','j','k','l','n','m') r,
                        vctab('q','w','e','r','t','a','s','d','f','g','z','x','c','v','b') l
                    FROM words,
                         TABLE(
                                 SELECT COLLECT(LEVEL)
                                   FROM DUAL
                             CONNECT BY LEVEL <= LENGTH(word)
                         )
                GROUP BY word)             
         WHERE SUBSTR(word, -1, 1) MEMBER OF l)
 WHERE (x IS EMPTY AND y IS EMPTY) OR (w IS EMPTY AND z IS EMPTY);

"vctab" is a simple collection type I already had defined.

CREATE OR REPLACE TYPE vctab AS TABLE OF VARCHAR2(4000);

The inner query would create the collections of keys needed for each word and each hand.

WORD                EVENS                       ODDS                     R                               L
------------------- --------------------------- ------------------------ ------------------------------  ------------------------------------
ahent               VCTAB(h,n)                  VCTAB(a,t,e)             VCTAB(y,u,i,o,p,h,j,k,l,n,m)    VCTAB(q,w,e,r,t,a,s,d,f,g,z,x,c,v,b)               
aid                 VCTAB(i)                    VCTAB(a,d)               VCTAB(y,u,i,o,p,h,j,k,l,n,m)    VCTAB(q,w,e,r,t,a,s,d,f,g,z,x,c,v,b)               
archaeographical    VCTAB(r,l,c,h,a,g,e,h)      VCTAB(a,a,i,p,r,o,a,c)   VCTAB(y,u,i,o,p,h,j,k,l,n,m)    VCTAB(q,w,e,r,t,a,s,d,f,g,z,x,c,v,b)               
argyrodites         VCTAB(r,e,i,o,y)            VCTAB(a,s,t,d,r,g)       VCTAB(y,u,i,o,p,h,j,k,l,n,m)    VCTAB(q,w,e,r,t,a,s,d,f,g,z,x,c,v,b)               
baklava             VCTAB(a,v,l)                VCTAB(b,a,a,k)           VCTAB(y,u,i,o,p,h,j,k,l,n,m)    VCTAB(q,w,e,r,t,a,s,d,f,g,z,x,c,v,b)               
bib                 VCTAB(i)                    VCTAB(b,b)               VCTAB(y,u,i,o,p,h,j,k,l,n,m)    VCTAB(q,w,e,r,t,a,s,d,f,g,z,x,c,v,b)               
boeuf               VCTAB(o,u)                  VCTAB(b,f,e)             VCTAB(y,u,i,o,p,h,j,k,l,n,m)    VCTAB(q,w,e,r,t,a,s,d,f,g,z,x,c,v,b)               
bullwork            VCTAB(u,k,o,l)              VCTAB(b,r,w,l)           VCTAB(y,u,i,o,p,h,j,k,l,n,m)    VCTAB(q,w,e,r,t,a,s,d,f,g,z,x,c,v,b)               
cacomorphia         VCTAB(a,i,p,o,o)            VCTAB(c,a,h,r,m,c)       VCTAB(y,u,i,o,p,h,j,k,l,n,m)    VCTAB(q,w,e,r,t,a,s,d,f,g,z,x,c,v,b)               
caitives            VCTAB(a,s,v,t)              VCTAB(c,e,i,i)           VCTAB(y,u,i,o,p,h,j,k,l,n,m)    VCTAB(q,w,e,r,t,a,s,d,f,g,z,x,c,v,b)    

Then it was simply a matter of applying the rules. Checking if a word should end with the left hand is simple enough: SUBSTR(word, -1, 1) MEMBER OF l.

To check if all even/odd are left/right or right/left is a little tricker; but not too bad.

  • odds MULTISET EXCEPT DISTINCT r - this will return any odd letter not typed with the right hand
  • evens MULTISET EXCEPT DISTINCT l - this will return any even letter not typed with the left hand
  • evens MULTISET EXCEPT DISTINCT r - this will return any even letter not typed with the right hand
  • odds MULTISET EXCEPT DISTINCT l - this will return any odd letter not typed with the left hand

If a word has a proper alternating hand typing pattern the pairs of odd/even must both be empty for either left/right or right/left (but not both.)

That worked, but it seemed cumbersome, so I wanted to rework it. I thought MATCH_RECOGNIZE might be a good option since this question is essentially just pattern matching.
That thought line took me to this:

SELECT *
  FROM (SELECT word, COLUMN_VALUE n, SUBSTR(word, COLUMN_VALUE, 1) c
          FROM words,
               TABLE(
                       SELECT COLLECT(LEVEL)
                         FROM DUAL
                   CONNECT BY LEVEL <= LENGTH(word)
               )
       )
 MATCH_RECOGNIZE(
     PARTITION BY word
     ORDER BY n
     PATTERN (^l? (r l)* $)
     DEFINE
         l AS (c MEMBER OF vctab('q','w','e','r','t','a','s','d','f','g','z','x','c','v','b')),
         r AS (c MEMBER OF vctab('y','u','i','o','p','h','j','k','l','n','m'))
 );

I liked this one better. For small sample sets of words it won't make much difference but when I ran it on a dictionary of a few hundred thousand words this was much more efficient than my first attempt. The basic idea is to split each word into letters then check if the letters follow a repeating right-left pattern. I chose right-left instead of left-right, to ensure I ended with the left hand. To handle words that start with the left hand I make the first letter an optional left (l?).

I'm happy with that variant, but it does require 12c. So next I figured I would try using some analytic functions usable in older releases. Checking alternating hands would just be a matter of looking to see if the previous letter was not typed with the same hand as the current letter's hand. I also decided to remove the correlated COLLECT subquery and instead do a CONNECT BY directly on the words table. Neither method is significantly better than the other but as long as I was exploring different syntax, I might as well try different driving queries too.

  SELECT word
    FROM (SELECT word,
                 n,
                 hand,
                 CASE WHEN n = 1 OR (hand != LAG(hand) OVER(PARTITION BY word ORDER BY n)) THEN 'yes' ELSE 'no' END
                     alternating,
                 LAST_VALUE(hand) OVER(PARTITION BY word ORDER BY n ROWS BETWEEN CURRENT ROW AND UNBOUNDED FOLLOWING)
                     last_hand
            FROM (    SELECT word,
                             LEVEL
                                 n,
                             CASE
                                 WHEN SUBSTR(word, LEVEL, 1) MEMBER OF
                                           vctab('y','u','i','o','p','h','j','k','l','n','m')
                                 THEN
                                     'right'
                                 ELSE
                                     'left'
                             END
                                 hand
                        FROM words
                  CONNECT BY word = PRIOR word AND LEVEL <= LENGTH(word) AND PRIOR SYS_GUID() IS NOT NULL))
   WHERE last_hand = 'left'
GROUP BY word
HAVING MIN(alternating) = 'yes'

With this approach I categorize each letter as being either right or left and check if each letter alternates hands from the previous letter. Then grouping by each word, if all letters are flagged as alternating from the previous letter I know my word alternates.

I also tried variants of the connect by in earlier queries and also using the correlated COLLECT subquery with the analytics. I didn't find a significant difference in them. I'm sure there is a way to do this with the MODEL clause; but I have to admit I don't use that much and thus will save that for another time when I get a little more practice with it.

This was a fun exercise, exploring several interesting and powerful features of Oracle's SQL syntax, even if the goal was a bit silly.

Using Object Types: User-Defined Aggregates – CSV aggregation and uniqueness

Previously, I’ve shown how to use a user-defined aggregate to process numeric values; but they can be used for text as well.

Recently I was tasked with importing some denormalized text data and aggregating into a consistent form.

The source data consisted of many rows holding identifiers, multiple comma-separated values (CSV,) and several other fields. For the purposes of this article I’m only concerned with the identifiers and the CSV text.

So, after importing the source I ended up with raw data in this form:

CREATE TABLE csv_example
(
    id      NUMBER,
    csv1    VARCHAR2(100),
    csv2    VARCHAR2(100)
);

INSERT INTO csv_example(id, csv1, csv2) VALUES (1, 'a', 'a,c');
INSERT INTO csv_example(id, csv1, csv2) VALUES (1, 'a,b,c,d', 'e,f');
INSERT INTO csv_example(id, csv1, csv2) VALUES (1, 'd,a', 'a,d');
INSERT INTO csv_example(id, csv1, csv2) VALUES (2, 'a,c', 'b');
INSERT INTO csv_example(id, csv1, csv2) VALUES (2, 'b,c', 'd');
INSERT INTO csv_example(id, csv1, csv2) VALUES (2, 'e,a,c', 'e,c');
INSERT INTO csv_example(id, csv1, csv2) VALUES (3, 'x', 'x');
INSERT INTO csv_example(id, csv1, csv2) VALUES (3, 'y', NULL);
INSERT INTO csv_example(id, csv1, csv2) VALUES (3, 'y', 'y');
INSERT INTO csv_example(id, csv1, csv2) VALUES (3, 'z', NULL);

COMMIT;

So there are rows with duplicate values, rows with overlapping csv sets, rows with single values, and rows with multiple values. Then desired results would be one row for each id, where the distinct sub-values of the multiple text columns aggregated into a single csv.

ID  CSV
--  ---------
1   a,b,c,d,e,f
2   a,b,c,d,e
3   x,y,z

One way would be to use a function such as str2tbl on a concatenated list of both values.

SELECT id, LISTAGG(val, ',') WITHIN GROUP (ORDER BY val) csv
  FROM (SELECT DISTINCT id, COLUMN_VALUE val
          FROM (SELECT id, COLUMN_VALUE
                  FROM csv_example, TABLE(str2tbl(csv1||','||csv2))))
GROUP BY id

For a small number of values that are themselves small, this works fine; but as the number of columns increases, and the number of values in each column increases, and the size of the individual values increases, this method breaks down. Eventually we’ll hit the upper limit of a varchar2 when concatenating them to build the initial collection from str2tbl.

To address that we can create a collection for each column and use the MULTISET UNION operator to combine them into one large collection. This method allows for any number of columns with contents of any size. The only limiting factor then is the size of the distinct list of final values. Assuming the final delimited string will fit within a varchar2 value, then we can use the native LISTAGG function.

SELECT id, LISTAGG(val, ',') WITHIN GROUP (ORDER BY val) csv
  FROM (SELECT DISTINCT id, COLUMN_VALUE val
          FROM (SELECT id, str2tbl(csv1) MULTISET UNION str2tbl(csv2) coll
                  FROM csv_example),
               TABLE(coll))
GROUP BY id

If the resulting value will be too big for a varchar2 then we can use tbl2clob to create a clob value from the resulting aggregated set of values.

SELECT id, tbl2clob(SET(CAST(COLLECT(COLUMN_VALUE ORDER BY COLUMN_VALUE) AS vctab)), ',') val
  FROM (SELECT id, str2tbl(csv1) MULTISET UNION str2tbl(csv2) coll
          FROM csv_example),
       TABLE(coll)
GROUP BY id;

While all of those work, are reasonably efficient and not overly complex, I thought it would be even easier if I could simply aggregate the csv columns more directly. To that end I have constructed a user-defined aggregate that will accept a collection of columns, each of which is a comma-separated value list. The aggregation will combine the contents of all of the columns into a single collection of distinct elements, sort them, and then return a single CLOB value.
If a varchar2 result would suffice it would simply be a matter of change clob to varchar2 below and swapping the tbl2clob function with the tbl2str function.

CREATE OR REPLACE TYPE csvtabagg_obj AS OBJECT
(
    v_values vctab,
    STATIC FUNCTION odciaggregateinitialize(ctx IN OUT csvtabagg_obj)
        RETURN NUMBER,
    MEMBER FUNCTION odciaggregateiterate(self IN OUT csvtabagg_obj, p_csvtab IN vctab)
        RETURN NUMBER,
    MEMBER FUNCTION odciaggregatemerge(self IN OUT csvtabagg_obj, ctx2 IN csvtabagg_obj)
        RETURN NUMBER,
    MEMBER FUNCTION odciaggregateterminate(self IN csvtabagg_obj, returnvalue OUT CLOB, flags IN NUMBER)
        RETURN NUMBER
);
/

CREATE OR REPLACE TYPE BODY csvtabagg_obj
IS
    STATIC FUNCTION odciaggregateinitialize(ctx IN OUT csvtabagg_obj)
        RETURN NUMBER
    IS
    BEGIN
        ctx := csvtabagg_obj(NULL);
        RETURN odciconst.success;
    END odciaggregateinitialize;

    MEMBER FUNCTION odciaggregateiterate(self IN OUT csvtabagg_obj, p_csvtab IN vctab)
        RETURN NUMBER
    IS
        v_length   PLS_INTEGER;
        v_start    PLS_INTEGER;
        v_index    PLS_INTEGER;
    BEGIN
        IF v_values IS NULL
        THEN
            v_values := vctab();
        END IF;

        FOR i IN 1 .. p_csvtab.COUNT
        LOOP
            v_length := LENGTH(p_csvtab(i));
            v_start := 1;

            WHILE (v_start <= v_length)
            LOOP
                v_index := INSTR(p_csvtab(i), ',', v_start);

                v_values.EXTEND();

                IF v_index = 0
                THEN
                    v_values(v_values.COUNT) := SUBSTR(p_csvtab(i), v_start);
                    v_start := v_length + 1;
                ELSE
                    v_values(v_values.COUNT) := SUBSTR(p_csvtab(i), v_start, v_index - v_start);
                    v_start := v_index + 1;
                END IF;
            END LOOP;
        END LOOP;

        RETURN odciconst.success;
    END odciaggregateiterate;

    MEMBER FUNCTION odciaggregatemerge(self IN OUT csvtabagg_obj, ctx2 IN csvtabagg_obj)
        RETURN NUMBER
    IS
    BEGIN
        IF v_values IS NULL
        THEN
            v_values := ctx2.v_values;
        ELSE
            v_values := v_values MULTISET UNION ctx2.v_values;
        END IF;

        RETURN odciconst.success;
    END odciaggregatemerge;

    MEMBER FUNCTION odciaggregateterminate(self IN csvtabagg_obj, returnvalue OUT CLOB, flags IN NUMBER)
        RETURN NUMBER
    IS
    BEGIN
        SELECT tbl2clob(CAST(COLLECT(COLUMN_VALUE ORDER BY COLUMN_VALUE) AS vctab), ',')
          INTO returnvalue
          FROM TABLE(SET(v_values));

        RETURN odciconst.success;
    END odciaggregateterminate;
END;
/

CREATE OR REPLACE FUNCTION csvagg2clob(p_csv IN vctab)
    RETURN CLOB
    PARALLEL_ENABLE
    AGGREGATE USING csvtabagg_obj;
/

With the new aggregate type and function, the query is reduced to a simple function call and group by, easily extensible to additional columns simply by adding them to the vctab collection in the aggregate parameter.

SQL> SELECT id, csvagg2clob(vctab(csv1, csv2)) csv
  2    FROM csv_example
  3  GROUP BY id;

        ID CSV
---------- ---------------------------------------
         1 a,b,c,d,e,f
         2 a,b,c,d,e
         3 x,y,z

I thought this was an interesting request when it was first presented. I hope the various options above provide some inspiration in your own parsing and reporting tasks.

Using Object Types: converting Text to Collections and Collections to Text

A common occurrence in sql is to aggregate text values into a larger, delimited value. It can also occur that you have a delimited value and you want treat it as a collection to pull data out of it process each of the sub-values. Oracle has a built-in function LISTAGG to aggregate text values into a larger varchar2 value. Before LISTAGG was available a user-defined aggregate might have been used. Or possibly the native function COLLECT would be used then turn the collection into a string.

Here I’ll try to list several of the popular incarnations that translate the data in either direction.

Probably the first function that gained popularity was Tom Kyte’s STRAGG which was user-defined aggregate first available in 9i and his version looked like this:

CREATE OR REPLACE TYPE string_agg_type AS OBJECT
(
    total VARCHAR2(4000),
    STATIC FUNCTION odciaggregateinitialize(sctx IN OUT string_agg_type)
        RETURN NUMBER,
    MEMBER FUNCTION odciaggregateiterate(self IN OUT string_agg_type, VALUE IN VARCHAR2)
        RETURN NUMBER,
    MEMBER FUNCTION odciaggregateterminate(self IN string_agg_type, returnvalue OUT VARCHAR2, flags IN NUMBER)
        RETURN NUMBER,
    MEMBER FUNCTION odciaggregatemerge(self IN OUT string_agg_type, ctx2 IN string_agg_type)
        RETURN NUMBER
);
/

CREATE OR REPLACE TYPE BODY string_agg_type
IS
    STATIC FUNCTION odciaggregateinitialize(sctx IN OUT string_agg_type)
        RETURN NUMBER
    IS
    BEGIN
        sctx := string_agg_type(NULL);
        RETURN odciconst.success;
    END;

    MEMBER FUNCTION odciaggregateiterate(self IN OUT string_agg_type, VALUE IN VARCHAR2)
        RETURN NUMBER
    IS
    BEGIN
        self.total := self.total || ',' || VALUE;
        RETURN odciconst.success;
    END;

    MEMBER FUNCTION odciaggregateterminate(self IN string_agg_type, returnvalue OUT VARCHAR2, flags IN NUMBER)
        RETURN NUMBER
    IS
    BEGIN
        returnvalue := LTRIM(self.total, ',');
        RETURN odciconst.success;
    END;

    MEMBER FUNCTION odciaggregatemerge(self IN OUT string_agg_type, ctx2 IN string_agg_type)
        RETURN NUMBER
    IS
    BEGIN
        self.total := self.total || ctx2.total;
        RETURN odciconst.success;
    END;
END;
/

CREATE OR REPLACE FUNCTION stragg(input VARCHAR2)
    RETURN VARCHAR2
    PARALLEL_ENABLE
    AGGREGATE USING string_agg_type;
/

The original can be found here.

stragg worked great, and still does; but it does suffer from one significant limitation, and that is the varchar2 limit of 4000 characters. The native function LISTAGG has the same problem. It can be somewhat ameliorated in 12c with the LISTAGG overflow parameters; but those don’t let you exceed the limit. They only determine what to do when an overflow happens. If you really want all of the data then you need to aggregate into a different data type.

With a few simple changes though and we can have an aggregate that concatenates varchar2 values into a clob and the resulting text can be billions of characters long.

CREATE OR REPLACE TYPE vcagg_to_clob_type AS OBJECT
(
    v_result CLOB,
    STATIC FUNCTION odciaggregateinitialize(sctx IN OUT vcagg_to_clob_type)
        RETURN NUMBER,
    MEMBER FUNCTION odciaggregateiterate(self IN OUT vcagg_to_clob_type, p_string IN VARCHAR2)
        RETURN NUMBER,
    MEMBER FUNCTION odciaggregateterminate(self IN vcagg_to_clob_type, returnvalue OUT CLOB, flags IN NUMBER)
        RETURN NUMBER,
    MEMBER FUNCTION odciaggregatemerge(self IN OUT vcagg_to_clob_type, ctx2 IN vcagg_to_clob_type)
        RETURN NUMBER
);
/

CREATE OR REPLACE TYPE BODY vcagg_to_clob_type
IS
    STATIC FUNCTION odciaggregateinitialize(sctx IN OUT vcagg_to_clob_type)
        RETURN NUMBER
    IS
    BEGIN
        sctx := vcagg_to_clob_type(NULL);
        RETURN odciconst.success;
    END;

    MEMBER FUNCTION odciaggregateiterate(self IN OUT vcagg_to_clob_type, p_string IN VARCHAR2)
        RETURN NUMBER
    IS
    BEGIN
        self.v_result := self.v_result || ',' || p_string;
        RETURN odciconst.success;
    END;

    MEMBER FUNCTION odciaggregateterminate(self IN vcagg_to_clob_type, returnvalue OUT CLOB, flags IN NUMBER)
        RETURN NUMBER
    IS
    BEGIN
        returnvalue := LTRIM(self.v_result, ',');
        RETURN odciconst.success;
    END;

    MEMBER FUNCTION odciaggregatemerge(self IN OUT vcagg_to_clob_type, ctx2 IN vcagg_to_clob_type)
        RETURN NUMBER
    IS
    BEGIN
        self.v_result := self.v_result || ctx2.v_result;
        RETURN odciconst.success;
    END;
END;
/

CREATE OR REPLACE FUNCTION clobagg(p_string VARCHAR2)
    RETURN CLOB
    DETERMINISTIC
    PARALLEL_ENABLE
    AGGREGATE USING vcagg_to_clob_type;
/

Another option, introduced into 10g is the COLLECT aggregate function. This will allow the user to aggregate multiple rows into a nested table collection type. This nested table can then be iterated through to create a varchar2 or clob type depending on the size.

First, we’ll create a nested table type to hold our collection.

CREATE TYPE vctab AS TABLE OF VARCHAR2(4000);

Then we can create functions to turn the collection into a varchar2.

CREATE OR REPLACE FUNCTION tbl2str(p_tbl IN vctab, p_delimiter IN VARCHAR2 DEFAULT ',')
    RETURN VARCHAR2
    DETERMINISTIC
IS
    v_str   VARCHAR2(32767);
BEGIN
    IF p_tbl.COUNT > 0
    THEN
        v_str  := p_tbl(1);

        FOR i IN 2 .. p_tbl.COUNT
        LOOP
            v_str  := v_str || p_delimiter || p_tbl(i);
        END LOOP;
    END IF;

    RETURN v_str;
END;

And we’ll create another one to return a clob for large result sets. I’ve added a little extra complexity for performance by using an intermediate varchar2.

CREATE OR REPLACE FUNCTION tbl2clob(p_tbl IN vctab, p_delimiter IN VARCHAR2 DEFAULT ',')
    RETURN CLOB
    DETERMINISTIC
IS
    v_str                VARCHAR2(32767);
    v_clob               CLOB;
    v_delimiter_length   PLS_INTEGER := NVL(LENGTH(p_delimiter), 0);
BEGIN
    DBMS_LOB.createtemporary(v_clob, TRUE);

    IF p_tbl.COUNT > 0
    THEN
        v_str := p_tbl(1);

        FOR i IN 2 .. p_tbl.COUNT
        LOOP
            -- Appending to clobs is slower than appending to varchar2
            -- so use varchar2 until you can't anymore then append one big chunk
            IF NVL(LENGTH(v_str), 0) + v_delimiter_length + NVL(LENGTH(p_tbl(i)), 0) <= 32767
            THEN
                v_str := v_str || p_delimiter || p_tbl(i);
            ELSE
                v_clob := v_clob || v_str || p_delimiter || p_tbl(i);
                v_str := NULL;
            END IF;
        END LOOP;
    END IF;

    IF v_str IS NOT NULL
    THEN
        v_clob := v_clob || v_str;
    END IF;

    RETURN v_clob;
END;

So a simple example comparing all of the above might look like this...

  SELECT owner,
         stragg(DISTINCT object_type) stragg_object_types,
         --LISTAGG(object_type, ',') WITHIN GROUP (ORDER BY object_type) listagg_object_types,
         tbl2str(CAST(COLLECT(DISTINCT object_type) AS vctab)) collect_object_types,
         tbl2clob(CAST(COLLECT(object_name) AS vctab)) collect_object_names
    FROM all_objects
GROUP BY owner;

Note though, the LISTAGG line is commented out. This is because the DISTINCT keyword is not allowed within the LISTAGG syntax. To get around that we'll have to use a subquery to force distinct values before querying.

  SELECT owner,
         stragg(object_type) stragg_object_types,
         LISTAGG(object_type, ',') WITHIN GROUP (ORDER BY object_type) listagg_object_types,
         tbl2str(CAST(COLLECT(object_type ORDER BY object_type) AS vctab)) collect_object_types
    FROM (SELECT DISTINCT owner, object_type
            FROM all_objects)
GROUP BY owner;

Also note, the LISTAGG and COLLECT functions allow for ordering, STRAGG does not, at least not as an aggregate. If you invoke STRAGG as an analytic function then it is possible to do so.

SELECT DISTINCT
       owner,
       stragg(object_type)
           OVER(PARTITION BY owner ORDER BY object_type ROWS BETWEEN UNBOUNDED PRECEDING AND UNBOUNDED FOLLOWING)
           stragg_object_types
  FROM (SELECT DISTINCT owner, object_type
          FROM all_objects);

And last, what if you already have a delimited text field that you would like to turn into a collection?

Depending on the size of the text, we'll either start with a varchar2 or a clob. The corresponding functions will then create an instance of a collection type.

I've posted a clob version earlier with theSPLIT_CLOB function. While it is perfectly legal syntax and functionally feasible to use that function as is on varchar2 values, you may want to use a varchar2 specific version for performance reasons.

CREATE OR REPLACE FUNCTION str2tbl(p_string IN VARCHAR2, p_delimiter IN VARCHAR2 := ',')
    RETURN vctab
    PIPELINED
AS
    v_length   NUMBER := LENGTH(p_string);
    v_start    NUMBER := 1;
    v_index    NUMBER;
BEGIN
    WHILE (v_start <= v_length)
    LOOP
        v_index := INSTR(p_string, p_delimiter, v_start);

        IF v_index = 0
        THEN
            PIPE ROW (SUBSTR(p_string, v_start));

            v_start := v_length + LENGTH(p_delimiter);
        ELSE
            PIPE ROW (SUBSTR(p_string, v_start, v_index - v_start));

            v_start := v_index + LENGTH(p_delimiter);
        END IF;
    END LOOP;

    RETURN;
END str2tbl;

Another option would be to put them into a package with a single overloaded name.

A simple example usage:

select * from table(str2tbl('a|b|c|d','|'));

Using these functions individually or in combination has helped me immensely in parsing documents of various sources. I hope they help you as well.

Questions and comments, as always, are welcome.

Using Object Types: ORDER member functions

In my previous post I showed how to use the MAP member functions to compare two objects of the same class. The MAP functions are useful when an object can be meaningfully represented and compared by a single value, especially if that value might provide functionality or meaning itself (such as the triangle area mapping in that article.)

Sometimes though the ordering of two objects is based more on a rule or series of rules than any one particular value. The Oracle version mapping function provided in my previous post could fall in to that category. While it was possible to create a mapping function, the resulting value is a little clunky. A more logical comparison function would be to look at each component of the version string compare the corresponding values. That is, if I’m comparing 11.2.0.3.0 to 12.1.0.2.0, then I want to compare 11 to 12 and I know the result. If I have 12.1.0.2.0 and 12.2.0.1.0 then I compare 12 to 12, see they are the same and then compare 1 to 2 to see the result. This kind of logic isn’t possible in a MAP function; but can be implemented with an ORDER member function.

To achieve the ordering logic, we’ll use a constructor to parse the version string into individual elements stored in a collection. Then iterate through the collection elements. Unlike the MAP member function which generates a value from its own object, the ORDER function must compare to another object which is passed in as a parameter. The output of the comparison is a number. If the result is negative then the object is less than the input parameter. If 0 then the object has equivalent ordering to the input parameter, and if positive then the object is considered greater than the input parameter.

First, we’ll declare a collection type to hold the elements.

CREATE OR REPLACE TYPE number_table AS TABLE OF NUMBER;
CREATE OR REPLACE TYPE db_version_obj AS OBJECT
(
    version_string VARCHAR2(50),
    elements number_table,
    CONSTRUCTOR FUNCTION db_version_obj(p_version_string IN VARCHAR2)
        RETURN SELF AS RESULT,
    ORDER MEMBER FUNCTION sorter(p_obj IN db_version_obj)
        RETURN INTEGER
);

CREATE OR REPLACE TYPE BODY db_version_obj
AS
    CONSTRUCTOR FUNCTION db_version_obj(p_version_string IN VARCHAR2)
        RETURN SELF AS RESULT
    IS
    BEGIN
        version_string := p_version_string;
        elements := number_table();

        elements.EXTEND(REGEXP_COUNT(p_version_string, '[0-9]+'));

        FOR i IN 1 .. elements.COUNT
        LOOP
            elements(i) :=
                TO_NUMBER(REGEXP_SUBSTR(p_version_string,
                                        '[0-9]+',
                                        1,
                                        i));
        END LOOP;

        RETURN;
    END;

    ORDER MEMBER FUNCTION sorter(p_obj IN db_version_obj)
        RETURN INTEGER
    IS
        v_index   INTEGER := 1;
        v_diff    INTEGER := 0;
    BEGIN
        WHILE v_diff = 0 AND v_index <= LEAST(self.elements.COUNT, p_obj.elements.COUNT)
        LOOP
            v_diff :=
                CASE
                    WHEN self.elements(v_index) IS NOT NULL AND p_obj.elements(v_index) IS NOT NULL
                    THEN
                        self.elements(v_index) - p_obj.elements(v_index)
                    WHEN self.elements(v_index) IS NULL AND p_obj.elements(v_index) IS NULL
                    THEN
                        0
                    WHEN self.elements(v_index) IS NOT NULL
                    THEN
                        1
                    ELSE
                        -- p_obj.elements(v_index) is not null
                        -1
                END;

            v_index := v_index + 1;
        END LOOP;

        -- if all of the pieces match, check if one side has more pieces
        -- example:   11.2  vs 11.2.0.1,  in this case 11.2.0.1 is greater.
        -- this can produce potentially arguable results like 11.2.0 > 11.2
        -- but even if undesirable sometimes they will at least be consistent
        IF v_diff = 0
        THEN
            v_diff := self.elements.COUNT - p_obj.elements.COUNT;
        END IF;

        RETURN v_diff;
    END;
END;

Note the commented section at the end of the sorter function. I wanted the object to be able to sort abbreviated versions correctly, so ‘12.1’ > ‘9.2.0.7’ even though neither one is a full 5-number version string. The following example shows a mix of complete and partial versions sorting correctly.

SQL>   SELECT ver
  2      FROM (SELECT '12.2.0.1.0' ver FROM DUAL
  3            UNION ALL
  4            SELECT '9.2.0.7.0' FROM DUAL
  5            UNION ALL
  6            SELECT '12.2' FROM DUAL
  7            UNION ALL
  8            SELECT '12.1' FROM DUAL
  9            UNION ALL
 10            SELECT '11.1.0.1.0' FROM DUAL
 11            UNION ALL
 12            SELECT '8.1.7.5.3' FROM DUAL
 13            UNION ALL
 14            SELECT '10.2.0.1.0' FROM DUAL
 15            UNION ALL
 16            SELECT '10.1' FROM DUAL
 17            UNION ALL
 18            SELECT '7.3' FROM DUAL)
 19  ORDER BY db_version_obj(ver);

VER
----------
7.3
8.1.7.5.3
9.2.0.7.0
10.1
10.2.0.1.0
11.1.0.1.0
12.1
12.2
12.2.0.1.0

Also note, NULLS must be handled by you when using an ORDER function. If you let the ORDER function return a NULL, you’ll raise an exception.

ORA-22951: NULL returned by ORDER method

In my example, I sort NULLs as less than a populated value. You could reverse that, raise an exception, or change your constructor to cleanse the input data, either by raising an exception or forcing default values when a NULL would be generated. This also means you must hard code the null handling within the object and it can’t be changed at query time using the NULLS FIRST/NULLS LAST modifiers to ORDER BY clauses.

Another thing to consider about the constructor is it isn’t entirely reliable. This is because the elements collection can be manipulated directly. Unfortunately, as of 12cR2, there is no way to hide an attribute in the body of an object. So, an alternate implementation would be to store only the version string itself as an attribute and do the parsing inside the ORDER member.

An implementation of that method might look like this:

CREATE OR REPLACE TYPE db_version_obj AS OBJECT
(
    version_string VARCHAR2(50),
    ORDER MEMBER FUNCTION sorter(p_obj IN db_version_obj)
        RETURN INTEGER
);

CREATE OR REPLACE TYPE BODY db_version_obj
AS
    ORDER MEMBER FUNCTION sorter(p_obj IN db_version_obj)
        RETURN INTEGER
    IS
        v_index          INTEGER := 1;
        v_diff           INTEGER := 0;
        self_elements    number_table := number_table();
        other_elements   number_table := number_table();
    BEGIN
        self_elements.EXTEND(REGEXP_COUNT(version_string, '[0-9]+'));

        FOR i IN 1 .. self_elements.COUNT
        LOOP
            self_elements(i) :=
                TO_NUMBER(REGEXP_SUBSTR(version_string,
                                        '[0-9]+',
                                        1,
                                        i));
        END LOOP;

        other_elements.EXTEND(REGEXP_COUNT(p_obj.version_string, '[0-9]+'));

        FOR i IN 1 .. other_elements.COUNT
        LOOP
            other_elements(i) :=
                TO_NUMBER(REGEXP_SUBSTR(p_obj.version_string,
                                        '[0-9]+',
                                        1,
                                        i));
        END LOOP;

        WHILE v_diff = 0 AND v_index <= LEAST(self_elements.COUNT, other_elements.COUNT)
        LOOP
            v_diff :=
                CASE
                    WHEN self_elements(v_index) IS NOT NULL AND other_elements(v_index) IS NOT NULL
                    THEN
                        self_elements(v_index) - other_elements(v_index)
                    WHEN self_elements(v_index) IS NULL AND other_elements(v_index) IS NULL
                    THEN
                        0
                    WHEN self_elements(v_index) IS NOT NULL
                    THEN
                        1
                    ELSE
                        -- other_elements(v_index) is not null
                        -1
                END;

            v_index := v_index + 1;
        END LOOP;

        -- if all of the pieces match, check if one side has more pieces
        -- example:   11.2  vs 11.2.0.1,  in this case 11.2.0.1 is greater.
        -- this can produce potentially arguable results like 11.2.0 > 11.2
        -- but even if undesirable sometimes they will at least be consistent
        IF v_diff = 0
        THEN
            v_diff := self_elements.COUNT - other_elements.COUNT;
        END IF;

        RETURN v_diff;
    END;
END;

With the end results being the same as earlier, but a little more reliable.

SQL>   SELECT ver
  2      FROM (SELECT '12.2.0.1.0' ver FROM DUAL
  3            UNION ALL
  4            SELECT '9.2.0.7.0' FROM DUAL
  5            UNION ALL
  6            SELECT '12.2' FROM DUAL
  7            UNION ALL
  8            SELECT '12.1' FROM DUAL
  9            UNION ALL
 10            SELECT '11.1.0.1.0' FROM DUAL
 11            UNION ALL
 12            SELECT '8.1.7.5.3' FROM DUAL
 13            UNION ALL
 14            SELECT '10.2.0.1.0' FROM DUAL
 15            UNION ALL
 16            SELECT '10.1' FROM DUAL
 17            UNION ALL
 18            SELECT '7.3' FROM DUAL)
 19  ORDER BY db_version_obj(ver);

VER
----------
7.3
8.1.7.5.3
9.2.0.7.0
10.1
10.2.0.1.0
11.1.0.1.0
12.1
12.2
12.2.0.1.0

As mentioned in the MAP article, ORDER member functions are generally slower than MAP member functions. Mapping occurs once for each object. Then the mapped objects are used to do the sorting. With an ORDER function, their is no predetermined value or set of values to compare directly, so the ORDER function must be executed not just once for every object, but for every comparison! If the code behind your function is complex and expensive it won’t scale well when processing a lot of data. So, if you’ll be working with lots of objects, you may want to consider a mapping, even if it might create a seemingly awkward value.

Sometimes though, speed is irrelevant. If it’s not possible/feasible to create a mapping to a single value, then ORDER functions are the only other option.

One last thing to consider with MAP and ORDER functions is they are mutually exclusive.
A single object type can have only one MAP or one ORDER, but not both, and not more than one of either.

Using Object Types: MAP member functions

If you have 2 instances of an object, A and B, you might want to compare them: is A > B or B > A? By default, two instances can’t be compared for greater or lesser value, only for equality. Where equality is determined by checking each attribute of each instance in order and comparing if they are equal or not.

If you try to check the ordering you’ll get

ORA-22950: cannot ORDER objects without MAP or ORDER method

For example, using the MY_TRIANGLE object defined in my previous post we can attempt the various comparisons, but only equality comparisons are valid.

SQL> select * from dual where my_triangle(2,3,4) = my_triangle(2,3,4);

D
-
X

SQL> select * from dual where my_triangle(2,3,4) = my_triangle(3,4,5);

no rows selected

SQL> select * from dual where my_triangle(2,3,4) > my_triangle(3,4,5);
select * from dual where my_triangle(2,3,4) > my_triangle(3,4,5)
                         *
ERROR at line 1:
ORA-22950: cannot ORDER objects without MAP or ORDER method


SQL> select * from dual where my_triangle(2,3,4) < my_triangle(2,3,4);
select * from dual where my_triangle(2,3,4) < my_triangle(2,3,4)
                         *
ERROR at line 1:
ORA-22950: cannot ORDER objects without MAP or ORDER method

In this article, we'll look at MAP methods.

Mapping translates your object into simple scalar value such as a number, date, or varchar2. For example, I might add a MAP method to my triangle object to yield the sum of the 3 sides, or the area of the triangle. If I had an object of U.S. states, I might map them to a number by population in most recent census, or square miles of area. Maybe map it to date of statehood when officially joining the union, or possibly just map them to a text field of the state name for alphabetic sorting. The MAP method is a pl/sql function so the value returned may be the result of a calculation, a lookup value, or simply returning the value of attribute already known within the object.

Let's add a mapping function to my_triangle based on the area of the triangle. The code is the same except for the newly added MAP functionality in bold.

CREATE OR REPLACE TYPE my_triangle AS OBJECT
(
    x_side NUMBER,
    y_side NUMBER,
    z_side NUMBER,
    x_angle NUMBER,
    y_angle NUMBER,
    z_angle NUMBER,
    CONSTRUCTOR FUNCTION my_triangle(self      IN OUT my_triangle,
                                     p_aside   IN     NUMBER,
                                     p_bside   IN     NUMBER,
                                     p_cside   IN     NUMBER)
        RETURN SELF AS RESULT,
    MAP MEMBER FUNCTION area RETURN NUMBER
);
/

Then, our object includes attributes for the three sides, so we'll use Heron's Formula to calculate the area from those values.

CREATE OR REPLACE TYPE BODY my_triangle
IS
    CONSTRUCTOR FUNCTION my_triangle(self      IN OUT my_triangle,
                                     p_aside   IN     NUMBER,
                                     p_bside   IN     NUMBER,
                                     p_cside   IN     NUMBER)
        RETURN SELF AS RESULT
    IS
        -- sides
        a    NUMBER;
        b    NUMBER;
        c    NUMBER;
        -- angles
        aa   NUMBER;
        bb   NUMBER;
        cc   NUMBER;
    BEGIN
        -- All sides of a triangle must have positive length
        IF p_aside <= 0 OR p_bside <= 0 OR p_cside <= 0
        THEN
            RAISE VALUE_ERROR;
        END IF;

        -- Assign a,b,c in descending order of parameter values
        -- Using Cosine rule. solve for angle AA, opposite side a.
        -- Then using Sine rule, solve for angle BB, opposite side b.
        -- Once we know AA and BB,  CC is calculated simply with 180 - AA - BB
        -- The Oracle "acos" function return values in radians,
        -- so convert all angles to degrees before assigning to object attributes

        CASE GREATEST(p_aside, p_bside, p_cside)
            WHEN p_aside
            THEN
                a := p_aside;
                b := GREATEST(p_bside, p_cside);
                c := LEAST(p_bside, p_cside);
            WHEN p_bside
            THEN
                a := p_bside;
                b := GREATEST(p_aside, p_cside);
                c := LEAST(p_aside, p_cside);
            WHEN p_cside
            THEN
                a := p_cside;
                b := GREATEST(p_aside, p_bside);
                c := LEAST(p_aside, p_bside);
        END CASE;

        -- Check Triangle Inequality Theorem
        --  That is, the sum of the lengths of any two sides must be greater than the length of the third side.
        --  Since we have ordered sides such that a >= b >= c, it is sufficient to test b+c> a
        IF b + c <= a
        THEN
            RAISE VALUE_ERROR;
        END IF;

        aa := ACOS((b * b + c * c - a * a) / (2 * b * c));  -- cosine rule
        bb := ASIN(b * SIN(aa) / a);  -- sine rule

        aa := aa * 180 / ACOS(-1);  -- convert radians to degrees
        bb := bb * 180 / ACOS(-1);  -- convert radians to degrees
        cc := 180 - aa - bb;

        self.x_side := a;
        self.y_side := b;
        self.z_side := c;
        self.x_angle := aa;
        self.y_angle := bb;
        self.z_angle := cc;

        RETURN;
    END;

    MAP MEMBER FUNCTION area
        RETURN NUMBER
    IS
        a   NUMBER := x_side;
        b   NUMBER := y_side;
        c   NUMBER := z_side;
        s   NUMBER := (a + b + c) / 2;
    BEGIN
        -- We know the lengths of the 3 sides
        -- so use Heron's Formula to calculate the area
        RETURN SQRT(s * (s - a) * (s - b) * (s - c));
    END;
END;
/

Now we can compare our triangles

SQL> select * from dual where my_triangle(2,3,4) > my_triangle(3,4,5);

no rows selected

SQL> select * from dual where my_triangle(2,3,4) < my_triangle(3,4,5);

D
-
X

Another nice feature of MAP methods is they can still be invoked as other methods. So, if you simply want to know the area of a given triangle and not necessarily compare it to anything else, the method is still viable.

SQL> select my_triangle(2,3,4).area() from dual;

MY_TRIANGLE(2,3,4).AREA()
-------------------------
               2.90473751

SQL> select my_triangle(3,4,5).area() from dual;

MY_TRIANGLE(3,4,5).AREA()
-------------------------
                        6

Next lets pursue the sorting functionality more directly by fixing a problem with Oracle version numbers. If you have a list of databases, maybe from OEM or some other tool where each database has a different version, you might want to sort them; but version numbers are actually text. So, while it seems natural for 12.2.0.1.0 to follow 9.2.0.7.0 it won't when sorted because the text "9" is greater than the text "1". A simple way to address this is with an object type that can map version number text to a sortable value that is actually numeric.

Oracle versions are made up of 5 numeric parts. Using 12.2.0.1.0 as an example the parts are as follows.

  • Major release (12)
  • Minor release (2)
  • App server number (0)
  • Patch number (1)
  • Platform specific patch number (0)

To produce the map we'll multiply each sub-value within the version to offset them within a single larger numeric value.
Most of the sub-values are 1 or 2 digits, but the platform patch can be up to 6 digits. For simplicity I'll pad all the sub-values to the same length - 6.
Working from the least-significant to most significant sub-value, multiply each by increasing offsets of 6 digits and sum them together.

  1. 10^0 * 0 +
  2. 10^6 * 1 +
  3. 10^12 * 0 +
  4. 10^18 * 2 +
  5. 10^24 * 12

This sum produces the value: 12000002000000000001000000.

Given a version of 9.2.0.7.0 we follow the same process

  1. 10^0 * 0 +
  2. 10^6 * 7 +
  3. 10^12 * 0 +
  4. 10^18 * 2 +
  5. 10^24 * 9

Which yields 9000002000000000007000000

These values are large and unwieldy but; unlike the area method of the triangle object, these map values are meant solely for comparison purposes.

Creating a sortable object is relatively simple, simply extract each part, turn it into a number, multiply, and sum.

CREATE OR REPLACE TYPE db_version_obj AS OBJECT
(
    version_string VARCHAR2(50),
    MAP MEMBER FUNCTION mapvalue
        RETURN INTEGER
);

CREATE OR REPLACE TYPE BODY db_version_obj
AS
    MAP MEMBER FUNCTION mapvalue
        RETURN INTEGER
    IS
    -- Oracle versions are made up of 5 numeric parts
    -- For example  12.2.0.1.0
    --    Major release (12)
    --    Minor release (2)
    --    App server number (0)
    --    Patch number (1)
    --    Platform specific patch number (0)
    --
    --  For the map function, we'll turn the string into single large number
    --   where each part maps to 6 digits within the overall number
    BEGIN
        IF version_string IS NULL
        THEN
            RETURN NULL;
        ELSE
            RETURN   TO_NUMBER(REGEXP_SUBSTR(version_string,'[0-9]+',1,1)) * POWER(1000000, 4)
                   + TO_NUMBER(REGEXP_SUBSTR(version_string,'[0-9]+',1,2)) * POWER(1000000, 3)
                   + TO_NUMBER(REGEXP_SUBSTR(version_string,'[0-9]+',1,3)) * POWER(1000000, 2)
                   + TO_NUMBER(REGEXP_SUBSTR(version_string,'[0-9]+',1,4)) * POWER(1000000, 1)
                   + TO_NUMBER(REGEXP_SUBSTR(version_string,'[0-9]+',1,5));
        END IF;
    END;
END;

And using it is quite simple too

SQL>   SELECT ver
  2      FROM (SELECT '12.2.0.1.0' ver FROM DUAL
  3            UNION ALL
  4            SELECT '9.2.0.7.0' FROM DUAL
  5            UNION ALL
  6            SELECT '11.1.0.1.0' FROM DUAL
  7            UNION ALL
  8            SELECT '8.1.7.5.3' FROM DUAL
  9            UNION ALL
 10            SELECT '10.2.0.1.0' FROM DUAL)
 11  ORDER BY db_version_obj(ver);

VER
----------
8.1.7.5.3
9.2.0.7.0
10.2.0.1.0
11.1.0.1.0
12.2.0.1.0

This particular example is maybe a little contrived since the same functionality could be achieved with a normal pl/sql function returning the same value as the object method. The point of this example was to illustrate creating an arcane value that is still useful for sorting. Similar techniques could be used for more complex objects. For example mapping a car object's make, model, and year to a number or text value which is then sortable.

If some of these mapping seem too convoluted simple to create a strange value that is only usable in sorting, then it's possible a MAP function is not what you need and instead should examine an ORDER member function which will be the topic of my next blog entry. However, it should be noted MAP functions do tend to be more efficient for comparisons because the mapping is performed once for each object and then the mapped values are used in the sorting. So, you will need to consider your use cases. In my example of database versions, maybe I just have a few in a report so it won't make much difference; but if I had to pick for performance I'd go with MAP. If I'm working with some sort of graphics application that uses thousands of triangles for rendering of images it will probably be more efficient to use the MAP function instead of ORDER.

Also note, if the mapping returns a NULL, then those objects will obey the default ordering of NULLS LAST, but can be reversed with NULLS FIRST in the ORDER BY clause.

Using Object Types: constructors

When an object type is instantiated (i.e. assigned a non-null value) it will execute initialization code. This initialization code is called a CONSTRUCTOR function. Every object type includes an implicit constructor which simply assigns values to the object attributes according to the values provided in the object assignment.

For example:

CREATE OR REPLACE TYPE simpleobject AS OBJECT(a NUMBER, b NUMBER, c NUMBER);

I can create an instance of the object by providing it’s name and values for the attributes.

SQL> select simpleobject(1,2,3) example from dual;

EXAMPLE(A, B, C)
----------------------
SIMPLEOBJECT(1, 2, 3)

or, in pl/sql

SQL> set serveroutput on
SQL> DECLARE
  2      v_obj   simpleobject := simpleobject(1, 2, 3);
  3  BEGIN
  4      DBMS_OUTPUT.put_line(v_obj.a);
  5      DBMS_OUTPUT.put_line(v_obj.b);
  6      DBMS_OUTPUT.put_line(v_obj.c);
  7  END;
  8  /
1
2
3

That’s easy enough, but what if we don’t always know all of the values and only want to define some of them. In that case we define our own constructor by creating a function with the CONSTRUCTOR key word and naming it the same as our object. It’s not strictly necessary to declare the first parameter (SELF) as it will be implicit in any constructor, but it is common practice to do so.

So, either of these versions will provide equivalent functionality. First, with the SELF parameter explicitly defined:

CREATE OR REPLACE TYPE simpleobject AS OBJECT
(
    a NUMBER,
    b NUMBER,
    c NUMBER,
    CONSTRUCTOR FUNCTION simpleobject(self IN OUT simpleobject, p_a IN NUMBER)
        RETURN SELF AS RESULT
);

CREATE OR REPLACE TYPE BODY simpleobject
IS
    CONSTRUCTOR FUNCTION simpleobject(self IN OUT simpleobject, p_a IN NUMBER)
        RETURN SELF AS RESULT
    IS
    BEGIN
        self.a := p_a;

        RETURN;
    END;
END;

or using the implicit parameter:

CREATE OR REPLACE TYPE simpleobject AS OBJECT
(
    a NUMBER,
    b NUMBER,
    c NUMBER,
    CONSTRUCTOR FUNCTION simpleobject(a IN NUMBER)
        RETURN SELF AS RESULT
);

CREATE OR REPLACE TYPE BODY simpleobject
IS
    CONSTRUCTOR FUNCTION simpleobject(a IN NUMBER)
        RETURN SELF AS RESULT
    IS
    BEGIN
        self.a := a;
        
        RETURN;
    END;
END;

Now we have the option of instantiating with just the a parameter, or using all three attributes.

SELECT simpleobject(1), simpleobject(1, 2, 3) FROM DUAL;

DECLARE
    v_obj   simpleobject := simpleobject(1, 2, 3);
BEGIN
    DBMS_OUTPUT.put_line(v_obj.a);
    DBMS_OUTPUT.put_line(v_obj.b);
    DBMS_OUTPUT.put_line(v_obj.c);
END;

DECLARE
    v_obj   simpleobject := simpleobject(1);
BEGIN
    DBMS_OUTPUT.put_line(v_obj.a);
    DBMS_OUTPUT.put_line(v_obj.b);
    DBMS_OUTPUT.put_line(v_obj.c);
END;

Those are fairly simple. Let’s get a little more complicated. I’ll create a triangle object. It will have 6 attributes containing the 3 sides and 3 angles. Since the angles can be calculated if we have the 3 sides, we’ll create a constructor that only requires the side values and we’ll calculate the angles ourselves.

The triangle object then would look like this with attributes for the sides X,Y,Z and the corresponding angles opposite those sides, also X,Y, and Z.
The constructor accepts 3 values for the sides.

CREATE OR REPLACE TYPE my_triangle AS OBJECT
(
    x_side NUMBER,
    y_side NUMBER,
    z_side NUMBER,
    x_angle NUMBER,
    y_angle NUMBER,
    z_angle NUMBER,
    CONSTRUCTOR FUNCTION my_triangle(self   IN OUT my_triangle,
                                     p_aside    IN     NUMBER,
                                     p_bside    IN     NUMBER,
                                     p_cside    IN     NUMBER)
        RETURN SELF AS RESULT
);

In the body we can our constructor is quite a bit more complicated than the previous “simpleobject”. First I sort the inputs by size, and then using some rules from trigonometry we can calculate the values of the angles and set the attributes. I also add some parameter checking to make sure the input values can form a valid triangle. If they don’t then raise the VALUE_ERROR exception.

CREATE OR REPLACE TYPE BODY my_triangle
IS
    CONSTRUCTOR FUNCTION my_triangle(self      IN OUT my_triangle,
                                     p_aside   IN     NUMBER,
                                     p_bside   IN     NUMBER,
                                     p_cside   IN     NUMBER)
        RETURN SELF AS RESULT
    IS
        -- sides
        a    NUMBER;
        b    NUMBER;
        c    NUMBER;
        -- angles
        aa   NUMBER;
        bb   NUMBER;
        cc   NUMBER;
    BEGIN
        -- All sides of a triangle must have positive length
        IF p_aside <= 0 OR p_bside <= 0 OR p_cside <= 0
        THEN
            RAISE VALUE_ERROR;
        END IF;

        -- Assign a,b,c in descending order of parameter values
        -- Using Cosine rule. solve for angle AA, opposite side a.
        -- Then using Sine rule, solve for angle BB, opposite side b.
        -- Once we know AA and BB,  CC is calculated simply with 180 - AA - BB
        -- The Oracle "acos" function return values in radians,
        -- so convert all angles to degrees before assigning to object attributes

        CASE GREATEST(p_aside, p_bside, p_cside)
            WHEN p_aside
            THEN
                a := p_aside;
                b := GREATEST(p_bside, p_cside);
                c := LEAST(p_bside, p_cside);
            WHEN p_bside
            THEN
                a := p_bside;
                b := GREATEST(p_aside, p_cside);
                c := LEAST(p_aside, p_cside);
            WHEN p_cside
            THEN
                a := p_cside;
                b := GREATEST(p_aside, p_bside);
                c := LEAST(p_aside, p_bside);
        END CASE;

        -- Check Triangle Inequality Theorem
        --  That is, the sum of the lengths of any two sides must be greater than the length of the third side.
        --  Since we have ordered sides such that a >= b >= c, it is sufficient to test b+c > a
        IF b + c <= a
        THEN
            RAISE VALUE_ERROR;
        END IF;

        aa := ACOS((b * b + c * c - a * a) / (2 * b * c));  -- cosine rule
        bb := ASIN(b * SIN(aa) / a);  -- sine rule

        aa := aa * 180 / ACOS(-1);  -- convert radians to degrees
        bb := bb * 180 / ACOS(-1);  -- convert radians to degrees
        cc := 180 - aa - bb;

        self.x_side := a;
        self.y_side := b;
        self.z_side := c;
        self.x_angle := aa;
        self.y_angle := bb;
        self.z_angle := cc;

        RETURN;
    END;
END;

Trying out the object with some sample values, we see the sides are populated in descending order of length and each sides opposite angle is calculated correctly.

SQL> SELECT my_triangle(3, 4, 2) FROM DUAL;

MY_TRIANGLE(3,4,2)(X_SIDE, Y_SIDE, Z_SIDE, X_ANGLE, Y_ANGLE, Z_ANGLE)
----------------------------------------------------------------------
MY_TRIANGLE(4, 3, 2, 104.477512, 46.5674634, 28.9550244)

SQL> SELECT my_triangle(4, 3, 5) FROM DUAL;

MY_TRIANGLE(4,3,5)(X_SIDE, Y_SIDE, Z_SIDE, X_ANGLE, Y_ANGLE, Z_ANGLE)
----------------------------------------------------------------------
MY_TRIANGLE(5, 4, 3, 90, 53.1301024, 36.8698976)

SQL> SELECT my_triangle(7, 7, 7) FROM DUAL;

MY_TRIANGLE(7,7,7)(X_SIDE, Y_SIDE, Z_SIDE, X_ANGLE, Y_ANGLE, Z_ANGLE)
----------------------------------------------------------------------
MY_TRIANGLE(7, 7, 7, 60, 60, 60)

As you can see, defining your own constructors allows for a lot of versatility in how you can use and populate your object’s attributes and I highly recommend exploring their use, looking for opportunities where they may benefit your object. From simply defaulting to nulls as shown in the simpleobject, to calculated results, or possibly just some formatting, such as using UPPER, LOWER, ROUND, etc. on your values at instantiation time to act as data cleanup.

Using Object Types: consolidating multiple values

A lot of developers see Object Types as confusing, esoteric, or simply unrelated to their SQL work.  To be fair, some objects can be quite complex and some of the data cartridge (ODCI) features will never find a home in some developer’s applications. But, that still shouldn’t preclude an examination of some of the fundamentals of this extremely useful feature.

The most basic feature of an object is that it can be used to consolidate multiple values into attributes of a single entity.  Similar to a record declared in PL/SQL an object type allows you to have a reference to a single value that acts as a container for other values.

Why might that be useful?  After all, a SQL statement can already reference multiple values (columns) from a table, or invoke multiple functions calls, or return values from sub-queries.  So each row returned by a SQL statement can already be viewed as a consolidated set of values.  What does an object type have to offer beyond what we already have?

First – lets look at functions.  Lets say I have list of sales to various customers and I want to call a function that returns detail information about those sales.

I might have a function that returns product name, another function returns weight, and yet another returning price.  So, if I want to gather all of that information I have to call all 3 functions.  Behind the scenes, each of those functions will likely execute a SQL statement to fetch a row from the product table and extract the relevant column.

Obviously that’s not very efficient.  Better would be to have a single function call that could return all 3 values.  But, we all know a function can only return one value, so we’re kind of stuck!  There are, of course, other options, such as caching values within PL/SQL records or collections thereby reducing the IO penalties, but still necessitating multiple function calls and possibly context switches.  While still viable for some problems; those options are outside the scope of this article.

So, what would the Object Type solution look like?

First, let’s create an object to hold our 3 product values.

CREATE OR REPLACE TYPE product_info 
  AS OBJECT(
    product_name   VARCHAR2(50), 
    product_weight NUMBER,
    product_price  NUMBER
);

Next, create a function to return a PRODUCT_INFO value (i.e. an “instance” of the class.)

CREATE OR REPLACE FUNCTION get_product_info(p_product_id IN products.id%TYPE)
    RETURN product_info
IS
    v_info   product_info;
BEGIN
    SELECT product_info(name, weight, price)
      INTO v_info
      FROM products
     WHERE id = p_product_id;

    RETURN v_info;
END;

Now we can use our new function and type within a SQL statement.

SELECT order_date, product_id, quantity, get_product_info(product_id) product
  FROM sales
 WHERE customer_id = 1234 AND order_date = DATE '2017-01-11';

ORDER_DATE          PRODUCT_ID QUANTITY PRODUCT(PRODUCT_NAME, PRODUCT_WEIGHT, PRODUCT_PRICE)
------------------- ---------- -------- -----------------------------------------------------
2017-01-11 00:00:00         50        1 PRODUCT_INFO('Bananas', 1, .86)
2017-01-11 00:00:00         35        3 PRODUCT_INFO('Apples-Gala', .33, .4)

Depending on the tool used, the object type may be represented in different forms. OBJECT_NAME(FIELD_1, FIELD_2, FIELD_3, etc.) is how SQL*Plus displays objects.

The most important at this stage though is confirming our function works. We got the expected results. In particular, the three values we need are returned as a single column containing our object.

Next though, because we’re using SQL, it’s likely we’ll want the results returned in rows and columns of simple, scalar values. So, let’s extract the individual attributes of our object out into their respective columns for final output. We’ll do that with an inline view of the previous query and then pull the attributes out by name. Note, when dereferencing an object to get to the attributes, you must wrap the object name in parentheses to create an object expression.

SQL> SELECT order_date,
  2         product_id,
  3         quantity,
  4         (product).product_name product_name,
  5         (product).product_weight weight,
  6         (product).product_price price
  7    FROM (SELECT order_date,
  8                 product_id,
  9                 quantity,
 10                 get_product_info(product_id) product
 11            FROM sales
 12           WHERE customer_id = 1234 AND order_date = DATE '2017-01-11');

ORDER_DATE          PRODUCT_ID   QUANTITY PRODUCT_NAME   WEIGHT  PRICE
------------------- ---------- ---------- ------------- ------- ------
2017-01-11 00:00:00         50          1 Bananas             1    .55
2017-01-11 00:00:00         35          3 Apples-Gala       .33     .4

You can also use objects within a scalar subquery in order to consolidate multiple values from a lookup table into a single object as the subquery’s result. When doing so you should test if a simple join to pull the related values might be more efficient or equivalent but simpler.

For example, if the function above was not available and I had direct access to the underlying product table then I could write the query like this:

SELECT order_date,
       product_id,
       quantity,
       (product).product_name product_name,
       (product).product_weight weight,
       (product).product_price price
  FROM (SELECT order_date,
               product_id,
               quantity,
               (SELECT product_info(name, weight, price)
                  FROM products
                 WHERE products.id = sales.product_id)
                   product
          FROM sales
         WHERE customer_id = 2576 AND product_id = 35);

Elapsed: 00:00:00.03

Statistics
---------------------------------------------------------
          0  recursive calls
          0  db block gets
       4464  consistent gets
          0  physical reads
          0  redo size
        972  bytes sent via SQL*Net to client
        552  bytes received via SQL*Net from client
          2  SQL*Net roundtrips to/from client
          0  sorts (memory)
          0  sorts (disk)
         16  rows processed

or, with a simple join as shown below (in this case a referential constraint makes an inner join equivalent to the functionality above, but if not then an outer join to products could be used.)

SELECT s.order_date,
       s.product_id,
       s.quantity,
       p.name,
       p.weight,
       p.price
  FROM sales s, products p
 WHERE s.customer_id = 2576 
  AND s.product_id = 35
  AND p.id = s.product_id;

Elapsed: 00:00:00.03

Statistics
----------------------------------------------------------
          0  recursive calls
          0  db block gets
       4464  consistent gets
          0  physical reads
          0  redo size
        964  bytes sent via SQL*Net to client
        552  bytes received via SQL*Net from client
          2  SQL*Net roundtrips to/from client
          0  sorts (memory)
          0  sorts (disk)
         16  rows processed

In this case the join is simpler syntax for equivalent performance and resource consumption. Also consider, this was a simple example. Some data may not be available through a table and may require a function. Either because of api design and user privileges or because the data comes from a non-sql source, such as as a web service, remote file, or other pl/sql routines.

When building such functions it is important to consider use cases that may require multiple pieces of output. When those are found, it might be appropriate to return an XML or JSON document; but it may be easier and more efficient to return an object type.