Recently, I had a satisfying conclusion to a series of questions that came in randomly over the course of a few weeks.
First the developer was trying to do recursion. I pointed to documentation and gave a couple of small examples showing CONNECT BY and recursive WITH clauses.
SELECT LPAD(' ', LEVEL * 2) || empno || ' ' || ename || ' ' || job FROM emp CONNECT BY mgr = PRIOR empno START WITH job = 'MANAGER' ORDER SIBLINGS BY empno;
WITH recursive (lvl, empno, ename, job, mgr) AS (SELECT 1, empno, ename, job, mgr FROM emp WHERE job = 'MANAGER' UNION ALL SELECT recursive.lvl + 1, emp.empno, emp.ename, emp.job, emp.mgr FROM emp, recursive WHERE emp.mgr = recursive.empno) SEARCH DEPTH FIRST BY empno SET RN SELECT LPAD(' ', lvl * 2) || empno || ' ' || ename || ' ' || job FROM recursive;
7566 JONES MANAGER 7788 SCOTT ANALYST 7876 ADAMS CLERK 7902 FORD ANALYST 7369 SMITH CLERK 7698 BLAKE MANAGER 7499 ALLEN SALESMAN 7521 WARD SALESMAN 7654 MARTIN SALESMAN 7844 TURNER SALESMAN 7900 JAMES CLERK 7782 CLARK MANAGER 7934 MILLER CLERK
The next time he reached out to me, he had settled on using a recursive WITH subquery and was aggregating the children of the top level entries – creating a comma-separated list for each.
WITH recursive (empno, ename, job, mgr, manager_name) AS (SELECT empno, ename, job, mgr, ename FROM emp WHERE job = 'MANAGER' UNION ALL SELECT emp.empno, emp.ename, emp.job, emp.mgr, manager_name FROM emp, recursive WHERE emp.mgr = recursive.empno) SELECT manager_name, LISTAGG(ename, ',') WITHIN GROUP (ORDER BY empno) employees FROM recursive WHERE job != 'MANAGER' GROUP BY manager_name;
MANAGER EMPLOYEES BLAKE ALLEN,WARD,MARTIN,TURNER,JAMES CLARK MILLER JONES SMITH,SCOTT,ADAMS,FORD
This solution only worked for a few test rows. He returned a little while later because the aggregated strings were exceeding the VARCHAR2’s 4000 character limit for some groups and thus failing. He was using 12cR2, so the ON OVERFLOW syntax would have been legal; but he needed all of the content. So, neither the ERROR nor the TRUNCATE option would have sufficed. While this isn’t a problem using the small EMP table, it was an issue needing solved for his real data. I showed him how to use the COLLECT function to aggregate into a nested table and then pass that object to a function to iterate through the members and return a delimited CLOB.
WITH recursive (empno, ename, job, mgr, manager_name) AS (SELECT empno, ename, job, mgr, ename FROM emp WHERE job = 'MANAGER' UNION ALL SELECT emp.empno, emp.ename, emp.job, emp.mgr, manager_name FROM emp, recursive WHERE emp.mgr = recursive.empno) SELECT manager_name, tbl2clob(CAST(COLLECT(ename) AS vctab)) employees FROM recursive WHERE job != 'MANAGER' GROUP BY manager_name;
The TBL2CLOB function and VCTAB type are objects I’ve used in an earlier series and their source may be found here.
At this point though, I was getting a little concerned his plan was to use the csv strings to construct dynamic sql with large IN-lists. So, before he got a chance to implement that I tell him there are better alternatives using the collections as is instead of converting them into large text values.
The next round of questions weren’t with the syntax itself or database problems; but in testing the queries and constructing various collections he began getting errors for missing objects or inconsistent data types. The problem here was the collection being returned was not explicitly CAST to a known type, thus the client software was not able to render the results because it didn’t know what the results were. The fix is to simply give them a type. CAST was shown before but at that the time it was implied casting was simply a requirement for the tbl2clob function, rather than a practice for all COLLECT output.
SELECT manager_name, COLLECT(ename) employees -- error FROM recursive WHERE job != 'MANAGER' GROUP BY manager_name
SELECT manager_name, CAST(COLLECT(ename) AS vctab) employees -- good FROM recursive WHERE job != 'MANAGER' GROUP BY manager_name;
Now that his collections were properly typed and he had a query aggregating the expected results he could then use the collection in a join to some other table. I showed samples of MEMBER and IN variations. While the MEMBER syntax is more compact and (in my opinion) cleaner, the optimizer will often make better choices with an IN clause against a subquery. I always recommend testing both.
MEMBER operatorx.name MEMBER OF m.employeesIN-List Condition
x.name in (select column_value from table(m.employees))
The next step in his task was materializing the aggregated values from a remote database into a local copy. He didn’t need a materialized view, but did want a static local copy for short-term reuse.
CREATE TABLE managers NESTED TABLE employees STORE AS manager_employees_nt AS WITH recursive (empno, ename, job, mgr, manager_name) AS (SELECT empno, ename, job, mgr, ename FROM emp WHERE job = 'MANAGER' UNION ALL SELECT emp.empno, emp.ename, emp.job, emp.mgr, manager_name FROM emp, recursive WHERE emp.mgr = recursive.empno) SELECT manager_name, CAST(COLLECT(ename) AS vctab) employees FROM recursive WHERE job != 'MANAGER' GROUP BY manager_name
Now that his queries have worked through the basic syntax requirements and I can see some real results starting to come through, I become a little concerned he’s doing a lot of look-ups based on the elements in his collection and this route will be unnecessarily complicated. After a few minutes of wrangling about the details of his requirements and nature of the data I then suggest simplifying the materialized table without a collection.
CREATE TABLE managers AS WITH recursive (empno, ename, job, mgr, manager_name) AS (SELECT empno, ename, job, mgr, ename FROM emp WHERE job = 'MANAGER' UNION ALL SELECT emp.empno, emp.ename, emp.job, emp.mgr, manager_name FROM emp, recursive WHERE emp.mgr = recursive.empno) SELECT manager_name, ename FROM recursive WHERE job != 'MANAGER';
Then do the lookup values and subsequent aggregations using simple equality predicates and groups by.
SELECT m.managers, SUM(e.value1) val1_sum, SUM(e.value2) val2_sum FROM managers m LEFT JOIN lookup_table e ON e.ename = m.ename;
In retrospect, I suppose it is possible that spending more time digging through his requirements a few weeks ago could have led directly to this solution; but I’m glad we worked through the various combinations. He learned several new syntax techniques and different ways of looking at a problem. So, even though most of them proved unneeded in the end, I feel it was a useful exercise. I’ve given him more tools for his toolbelt. Hopefully I also taught him enough to know when and how to use them.
Moral of the story – It’s not enough to simply know 10 ways to tie a knot, it’s equally important to know why you shouldn’t use 9 of them.