Close

Creativity vs Functionality – A Tale of a Query’s Development

Recently, I had a satisfying conclusion to a series of questions that came in randomly over the course of a few weeks.

First the developer was trying to do recursion. I pointed to documentation and gave a couple of small examples showing CONNECT BY and recursive WITH clauses.

    SELECT LPAD(' ', LEVEL * 2) || empno || ' ' || ename || ' ' || job
      FROM emp
CONNECT BY mgr = PRIOR empno
START WITH job = 'MANAGER'
ORDER SIBLINGS BY empno;
WITH
    recursive (lvl, empno, ename, job, mgr)
    AS
        (SELECT 1, empno, ename, job, mgr
           FROM emp
          WHERE job = 'MANAGER'
         UNION ALL
         SELECT recursive.lvl + 1, emp.empno, emp.ename,
                emp.job, emp.mgr
           FROM emp, recursive
          WHERE emp.mgr = recursive.empno)
          SEARCH DEPTH FIRST BY empno SET RN
SELECT LPAD(' ', lvl * 2) || empno || ' ' || ename || ' ' || job
  FROM recursive;
  7566 JONES MANAGER
    7788 SCOTT ANALYST
      7876 ADAMS CLERK
    7902 FORD ANALYST
      7369 SMITH CLERK
  7698 BLAKE MANAGER
    7499 ALLEN SALESMAN
    7521 WARD SALESMAN
    7654 MARTIN SALESMAN
    7844 TURNER SALESMAN
    7900 JAMES CLERK
  7782 CLARK MANAGER
    7934 MILLER CLERK

The next time he reached out to me, he had settled on using a recursive WITH subquery and was aggregating the children of the top level entries – creating a comma-separated list for each.

WITH
    recursive (empno, ename, job, mgr, manager_name)
    AS
        (SELECT  empno, ename, job, mgr, ename
           FROM emp
          WHERE job = 'MANAGER'
         UNION ALL
         SELECT emp.empno, emp.ename,
                emp.job, emp.mgr,  manager_name
           FROM emp, recursive
          WHERE emp.mgr = recursive.empno)
  SELECT manager_name, 
         LISTAGG(ename, ',') 
              WITHIN GROUP (ORDER BY empno) employees
    FROM recursive
   WHERE job != 'MANAGER'
GROUP BY manager_name;
MANAGER  EMPLOYEES
BLAKE	ALLEN,WARD,MARTIN,TURNER,JAMES
CLARK	MILLER
JONES	SMITH,SCOTT,ADAMS,FORD

This solution only worked for a few test rows. He returned a little while later because the aggregated strings were exceeding the VARCHAR2’s 4000 character limit for some groups and thus failing. He was using 12cR2, so the ON OVERFLOW syntax would have been legal; but he needed all of the content. So, neither the ERROR nor the TRUNCATE option would have sufficed. While this isn’t a problem using the small EMP table, it was an issue needing solved for his real data. I showed him how to use the COLLECT function to aggregate into a nested table and then pass that object to a function to iterate through the members and return a delimited CLOB.

WITH
    recursive (empno, ename, job, mgr, manager_name)
    AS
        (SELECT  empno, ename, job, mgr, ename
           FROM emp
          WHERE job = 'MANAGER'
         UNION ALL
         SELECT emp.empno, emp.ename,
                emp.job, emp.mgr,  manager_name
           FROM emp, recursive
          WHERE emp.mgr = recursive.empno)
  SELECT manager_name, 
         tbl2clob(CAST(COLLECT(ename) AS vctab)) employees
    FROM recursive
   WHERE job != 'MANAGER'
GROUP BY manager_name;

The TBL2CLOB function and VCTAB type are objects I’ve used in an earlier series and their source may be found here.

At this point though, I was getting a little concerned his plan was to use the csv strings to construct dynamic sql with large IN-lists. So, before he got a chance to implement that I tell him there are better alternatives using the collections as is instead of converting them into large text values.

The next round of questions weren’t with the syntax itself or database problems; but in testing the queries and constructing various collections he began getting errors for missing objects or inconsistent data types. The problem here was the collection being returned was not explicitly CAST to a known type, thus the client software was not able to render the results because it didn’t know what the results were. The fix is to simply give them a type. CAST was shown before but at that the time it was implied casting was simply a requirement for the tbl2clob function, rather than a practice for all COLLECT output.

  SELECT manager_name, 
         COLLECT(ename) employees  -- error
    FROM recursive
   WHERE job != 'MANAGER'
GROUP BY manager_name
  SELECT manager_name,
         CAST(COLLECT(ename) AS vctab) employees  -- good
    FROM recursive
   WHERE job != 'MANAGER'
GROUP BY manager_name;

Now that his collections were properly typed and he had a query aggregating the expected results he could then use the collection in a join to some other table. I showed samples of MEMBER and IN variations. While the MEMBER syntax is more compact and (in my opinion) cleaner, the optimizer will often make better choices with an IN clause against a subquery. I always recommend testing both.

MEMBER operator
     x.name MEMBER OF m.employees
IN-List Condition
     x.name in (select column_value from table(m.employees))

The next step in his task was materializing the aggregated values from a remote database into a local copy. He didn’t need a materialized view, but did want a static local copy for short-term reuse.

CREATE TABLE managers
NESTED TABLE employees STORE AS manager_employees_nt
AS
WITH
    recursive (empno, ename, job, mgr, manager_name)
    AS
        (SELECT  empno, ename, job, mgr, ename
           FROM emp
          WHERE job = 'MANAGER'
         UNION ALL
         SELECT emp.empno, emp.ename,
                emp.job, emp.mgr,  manager_name
           FROM emp, recursive
          WHERE emp.mgr = recursive.empno)
  SELECT manager_name, CAST(COLLECT(ename) AS vctab) employees
             FROM recursive
            WHERE job != 'MANAGER'
         GROUP BY manager_name

Now that his queries have worked through the basic syntax requirements and I can see some real results starting to come through, I become a little concerned he’s doing a lot of look-ups based on the elements in his collection and this route will be unnecessarily complicated. After a few minutes of wrangling about the details of his requirements and nature of the data I then suggest simplifying the materialized table without a collection.

CREATE TABLE managers
AS       
WITH
    recursive (empno, ename, job, mgr, manager_name)
    AS
        (SELECT  empno, ename, job, mgr, ename
           FROM emp
          WHERE job = 'MANAGER'
         UNION ALL
         SELECT emp.empno, emp.ename,
                emp.job, emp.mgr,  manager_name
           FROM emp, recursive
          WHERE emp.mgr = recursive.empno)
SELECT manager_name, ename
  FROM recursive
 WHERE job != 'MANAGER';

Then do the lookup values and subsequent aggregations using simple equality predicates and groups by.

SELECT m.managers, 
       SUM(e.value1) val1_sum,
       SUM(e.value2) val2_sum
  FROM managers m 
  LEFT JOIN lookup_table e 
      ON e.ename = m.ename;

In retrospect, I suppose it is possible that spending more time digging through his requirements a few weeks ago could have led directly to this solution; but I’m glad we worked through the various combinations. He learned several new syntax techniques and different ways of looking at a problem. So, even though most of them proved unneeded in the end, I feel it was a useful exercise. I’ve given him more tools for his toolbelt. Hopefully I also taught him enough to know when and how to use them.

Moral of the story – It’s not enough to simply know 10 ways to tie a knot, it’s equally important to know why you shouldn’t use 9 of them.