Goodbye to another great Collaborate.

Thank you to all who attended my presentations this year, whether live or in the virtual conference.

As promised, here are the links to my slides.

https://www.dropbox.com/s/d7kjh5rp70ucul0/2019_352_Stuber_ppt.pptx?dl=0

https://www.dropbox.com/s/lqqhb6m8b3vrv0p/2019_354_Stuber_ppt.pptx?dl=0

For session 354, I mentioned expanded versions of some of the dictionary queries. Those may be found at the following links…

https://seanstuber.com/2018/03/08/who-is-using-temp-and-what-are-they-doing/

https://seanstuber.com/2018/03/04/who-is-blocking-whom-and-what-are-they-doing/

Thank you again to all who came. Questions and comments are always welcome.

Java Table Functions – Returning a SQL collection from a Java function.

One of the great features of Oracle is the ability to create a table function. That is, a function returning a collection type that can be used as the source for a query. These functions then allow a developer to provide a simple SQL api to query rows and columns from a variety of data sources. Any piece of information you can retrieve with PL/SQL may then be returned as a collection.
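As a minimal sketch of the idea (the type and function names here are just for illustration), a table function can be as simple as this:

CREATE OR REPLACE TYPE number_tab AS TABLE OF NUMBER;
/

CREATE OR REPLACE FUNCTION first_n(p_count IN INTEGER)
    RETURN number_tab
IS
    v_result number_tab := number_tab();
BEGIN
    -- build a collection of the integers 1 .. p_count
    FOR i IN 1 .. p_count
    LOOP
        v_result.EXTEND;
        v_result(i) := i;
    END LOOP;

    RETURN v_result;
END;
/

SELECT column_value FROM TABLE(first_n(3));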

However, as robust as PL/SQL is, there are times when it simply lacks the functionality to achieve a particular purpose. When this happens it may be useful to turn to Java stored procedures as an option. Alternately, it may be that a working Java class already exists and, even if a rewrite were possible, there could be little or no value in reimplementing the class in PL/SQL.

A few weeks ago I wrote an article showing how to use a Java stored procedure to return a single value from a SQL Server database as an Oracle function usable within SQL or PL/SQL statements. In this article I’ll extend the same idea, but instead of returning just one value, the function will return a nested-table collection of objects.

For this function, the goal is to return a list of items ordered from each salesperson.

SELECT p.personid salespersonid,
       p.fullname salesperson,
       o.customerid,
       o.orderid,
       o.orderdate,
       l.orderlineid,
       l.unitprice,
       l.quantity,
       i.stockitemname
  FROM application.people p
       JOIN sales.orders o ON o.salespersonpersonid = p.personid
       JOIN sales.orderlines l ON l.orderid = o.orderid
       JOIN warehouse.stockitems i ON i.stockitemid = l.stockitemid;
SALESPERSONID SALESPERSON        CUSTOMERID ORDERID ORDERDATE  ORDERLINEID UNITPRICE QUANTITY STOCKITEMNAME
           13 Hudson Hollinworth        185   44773 2015-03-11      141507       2.9      182 Clear packaging tape 48mmx75m
           15 Taj Shand                 930   18428 2013-12-11       58433        18       24 "The Gu" red shirt XML tag t-shirt (Black) L
            6 Sophia Hinton              85   29217 2014-06-19       92634        32        8 Animal with big feet slippers (Brown) L
            7 Amy Trefl                1004    8919 2013-06-18       28345        32        2 Dinosaur battery-powered slippers (Green) M
            3 Hudson Onslow             531   68540 2016-03-16      215660        18       40 20 mm Double sided bubble wrap 10m
            2 Kayla Woodcock            402   44893 2015-03-12      141891      18.5       40 Office cube periscope (Black)

First, we’ll need an object type to hold each row’s data, and a collection type to return a set of those objects.

CREATE OR REPLACE TYPE wwi_salesperson_lineitem_type
AS OBJECT
(
    salespersonid INTEGER,
    salesperson   VARCHAR2(50),
    customerid    INTEGER,
    orderid       INTEGER,
    orderdate     DATE,
    orderlineid   INTEGER,
    unitprice     NUMBER,
    quantity      INTEGER,
    stockitemname VARCHAR2(100)
);
/
CREATE OR REPLACE TYPE wwi_salesperson_lineitem_tab
AS TABLE OF wwi_salesperson_lineitem_type;
/

The JDBC and query portions are essentially the same as in the previous article, except that instead of returning a single value we iterate through a result set. For each row, we append a STRUCT object to a Java array; when the array is full, we use an ArrayDescriptor to create an oracle.sql.ARRAY from the Java array. Also of note: in addition to its jdbc connection to the SQL Server database, the function also uses the local default connection to the Oracle database.

CREATE OR REPLACE AND RESOLVE JAVA SOURCE NAMED SDS.WWI_SALESPERSON_LINEITEMS
RESOLVER ((* SDS)(* SQLJDBC)(* PUBLIC))
AS
import java.sql.*;
import java.util.*;
import com.microsoft.sqlserver.jdbc.*;
import oracle.sql.*;
import oracle.jdbc.driver.*;

public class wwi_salesperson_lineitems {
    public static oracle.sql.ARRAY get_items(
                                             String p_server,
                                             String p_instance,
                                             int p_port,
                                             String p_database,
                                             String p_user,
                                             String p_password,
                                             int p_personid
                                            ) throws Exception
    {
       SQLServerDataSource ds = new SQLServerDataSource();
       ds.setServerName(p_server);
       if (p_instance != null) { ds.setInstanceName(p_instance); }
       ds.setPortNumber(p_port);
       ds.setDatabaseName(p_database);
       ds.setUser(p_user);
       ds.setPassword(p_password);

       Connection localconn    = DriverManager.getConnection("jdbc:default:connection:");
       StructDescriptor oneItem = StructDescriptor.createDescriptor("WWI_SALESPERSON_LINEITEM_TYPE", localconn);
       Object[] salesItem = new Object[9];

       Connection remoteconn = ds.getConnection();

       PreparedStatement pstmt = remoteconn.prepareStatement(
               "SELECT p.personid salespersonid,"
             + "       p.fullname salesperson,"
             + "       o.customerid,"
             + "       o.orderid,"
             + "       o.orderdate,"
             + "       l.orderlineid,"
             + "       l.unitprice,"
             + "       l.quantity,"
             + "       i.stockitemname,"
             + "       count(*) over() cnt"
             + "  FROM application.people   p"
             + "  JOIN sales.orders         o ON o.salespersonpersonid = p.personid"
             + "  JOIN sales.orderlines     l ON l.orderid = o.orderid"
             + "  JOIN warehouse.stockitems i ON i.stockitemid = l.stockitemid"
             + " WHERE ? < 1 or p.personid = ?");

       pstmt.setInt(1, p_personid);
       pstmt.setInt(2, p_personid);

       ResultSet rs = pstmt.executeQuery();
       STRUCT[] temparray = null;
       int i=0;
       while (rs.next()) {
          if (i==0) {
               temparray = new STRUCT[rs.getInt("cnt")];
          }
          salesItem[0] = rs.getInt("salespersonid");
          salesItem[1] = rs.getString("salesperson");
          salesItem[2] = rs.getInt("customerid");
          salesItem[3] = rs.getInt("orderid");
          salesItem[4] = rs.getDate("orderdate");
          salesItem[5] = rs.getInt("orderlineid");
          salesItem[6] = rs.getFloat("unitprice");
          salesItem[7] = rs.getInt("quantity");
          salesItem[8] = rs.getString("stockitemname");

          temparray[i++] = new STRUCT(oneItem, localconn, salesItem);
       }
       rs.close();

       remoteconn.close();

       ArrayDescriptor  SalesItemArray = ArrayDescriptor.createDescriptor("WWI_SALESPERSON_LINEITEM_TAB", localconn);
       return new ARRAY(SalesItemArray, localconn, temparray);
    }
}
/

Then, of course, we need a pl/sql api to expose the java function.

CREATE OR REPLACE FUNCTION sds.wwi_get_salesperson_items(
            p_server     IN VARCHAR2,
            p_instance   IN VARCHAR2,
            p_port       IN NUMBER,
            p_database   IN VARCHAR2,
            p_user       IN VARCHAR2,
            p_password   IN VARCHAR2,
            p_personid   IN NUMBER)
    RETURN wwi_salesperson_lineitem_tab
IS
    LANGUAGE JAVA
    NAME 'wwi_salesperson_lineitems.get_items(
                  java.lang.String, java.lang.String, int, 
                  java.lang.String, java.lang.String, java.lang.String,
                  int) 
          return oracle.sql.ARRAY';
/

Finally, use the function in a query. I'll look up the first rows for salespersonid 7.

SELECT *
  FROM TABLE(wwi_get_salesperson_items('my_remote_server',
                                       NULL,
                                       1433,
                                       'WideWorldImporters',
                                       'wwi',
                                       'pa55w0rd',
                                       7))
ORDER BY orderdate;
SALESPERSONID SALESPERSON CUSTOMERID ORDERID ORDERDATE  ORDERLINEID UNITPRICE QUANTITY STOCKITEMNAME
            7 Amy Trefl          949      19 2013-01-01          70        13        7 DBA joke mug - it depends (White)
            7 Amy Trefl          949      59 2013-01-01          69        99       60 10 mm Anti static bubble wrap (Blue) 50m
            7 Amy Trefl          964       8 2013-01-01          30        13        1 Developer joke mug - old C developers never die (White)
            7 Amy Trefl          173      23 2013-01-01          80        18       24 Halloween zombie mask (Light Brown) L
            7 Amy Trefl          173      23 2013-01-01          83        13        3 IT joke mug - that behavior is by design (White)
            7 Amy Trefl          949      59 2013-01-01          68        18      108 "The Gu" red shirt XML tag t-shirt (White) 7XL
            7 Amy Trefl           77      51 2013-01-01          31        18       24 "The Gu" red shirt XML tag t-shirt (Black) XL
            7 Amy Trefl          964       8 2013-01-01          27        32        4 USB food flash drive - banana
            7 Amy Trefl           77       9 2013-01-01          33        32        4 Animal with big feet slippers (Brown) S
            7 Amy Trefl          173      23 2013-01-01          81        32        2 Dinosaur battery-powered slippers (Green) XL
            7 Amy Trefl          105      47 2013-01-01          10       230        9 Ride on toy sedan car (Pink) 1/12 scale

Another use case, possibly more common, is reading file lists from the host server. The idea is the same: create an object type to hold each row, a collection type to hold the objects, iterate through the files creating a STRUCT for each to add to an array, and finally convert the array to an oracle.sql.ARRAY. By default most users won't have access to read the server directories; if this functionality is required, your DBA will need to grant the appropriate permissions.
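For example, a read grant might look something like the following sketch (the user and path are placeholders; a trailing "*" in the permission name matches files in the directory, while "-" matches recursively):

BEGIN
    -- grant read access on the directory's files to the application user
    DBMS_JAVA.grant_permission('YOURUSER',
                               'SYS:java.io.FilePermission',
                               '/tmp/customer_files/*',
                               'read');
END;
/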

Again, first create the SQL objects.

CREATE OR REPLACE TYPE file_type
AS OBJECT
(
    file_name     VARCHAR2(512),
    file_size     NUMBER(15, 0),
    file_type     VARCHAR2(1),
    last_modified DATE,
    readable      VARCHAR2(1),
    writeable     VARCHAR2(1)
);
/

CREATE OR REPLACE TYPE file_table
AS TABLE OF file_type;
/

Next, create the java class to read the directories and return a collection. I'll overload the getListasObjects method to allow the caller to specify file size units (bytes vs kilobytes, megabytes, or gigabytes). File permissions are protected separately from the permissions needed to view the file name, type, and size; I included a limited exception handler that simply returns "?" (unknown) if the user receives an error while trying to read them. As with the previous example, the function needs to establish a jdbc connection to the local default connection in order to create the SQL objects.

CREATE OR REPLACE AND RESOLVE JAVA SOURCE NAMED "DirList"
as import java.io.*;
import java.sql.*;
import java.util.*;
import oracle.sql.*;
import oracle.jdbc.driver.*;

public class DirList{
    public static oracle.sql.ARRAY getListasObjects(String p_directory)
    throws Exception
    {
        return getListasObjects(p_directory,"Bytes");
    }

   public static oracle.sql.ARRAY getListasObjects(String p_directory, String p_size_units)
    throws Exception
      {
        Connection conn    = DriverManager.getConnection("jdbc:default:connection:");
        StructDescriptor oneFile = StructDescriptor.createDescriptor("FILE_TYPE", conn);
        Object[] fileattributes = new Object[6];

        File[] files = new File(p_directory).listFiles();
        STRUCT[] temparray = new STRUCT[files.length];
        for( int i = 0; i < files.length; i++ ) {
            fileattributes[0] = files[i].getName();

            String tempunits = p_size_units.toUpperCase();
            if (tempunits.equals("G") || tempunits.equals("GB") )
                 fileattributes[1] = Math.ceil((files[i].length() + 1.0) /1024/1024/1024);
            else if (tempunits.equals("M") || tempunits.equals("MB") )
                 fileattributes[1] =  Math.ceil((files[i].length() + 1.0) /1024/1024);
            else if (tempunits.equals("K")  || tempunits.equals("KB") )
                 fileattributes[1] =  Math.ceil((files[i].length() + 1.0) /1024);
            else {
                 fileattributes[1] = files[i].length();
            }

            fileattributes[2] = (files[i].isDirectory() ? "D" : "F");
            fileattributes[3] = new java.sql.Timestamp(files[i].lastModified());

            // If you don't have permissions granted to read or write
            // you'll get an error simply by looking at the attribute
            // For example...
            // ORA-29532: Java call terminated by uncaught Java exception:
            // java.security.AccessControlException:
            // the Permission (java.io.FilePermission YOURFILE write)
            // has not been granted to YOURUSER.
            // The PL/SQL to grant this is
            // dbms_java.grant_permission('YOURUSER',
            //                            'SYS:java.io.FilePermission',
            //                            'YOURFILE',
            //                            'write')
            try {
               fileattributes[4] = (files[i].canRead() ? "Y" : "N");
            } catch (Exception e) {
               fileattributes[4] = "?";
            }
            try {
               fileattributes[5] = (files[i].canWrite() ? "Y" : "N");
            } catch (Exception e) {
               fileattributes[5] = "?";
            }
            temparray[i] = new STRUCT(oneFile, conn, fileattributes);
        }

        ArrayDescriptor  FileArray = ArrayDescriptor.createDescriptor("FILE_TABLE", conn);
        return new ARRAY(FileArray, conn, temparray);
    }

}
/

Instead of creating a standalone function, this class will be exposed through a pl/sql package in order to invoke the overloaded methods. This could also have been implemented with a default value in the pl/sql wrapper function, but a package more directly mirrors the java api, which may be more appropriate for other examples.
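For comparison, here is a sketch of the default-value alternative: a single call spec mapped to the two-argument Java method, with the unit parameter defaulting to 'Bytes' (matching the Java overload's own default).

CREATE OR REPLACE FUNCTION dir_list(p_directory  IN VARCHAR2,
                                    p_size_units IN VARCHAR2 DEFAULT 'Bytes')
    RETURN file_table
AS
    LANGUAGE JAVA
    NAME 'DirList.getListasObjects( java.lang.String, java.lang.String ) return oracle.sql.ARRAY';
/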

CREATE OR REPLACE PACKAGE files
AS
    FUNCTION dir_list(p_directory IN VARCHAR2, p_size_units IN VARCHAR2 DEFAULT NULL)
        RETURN file_table;
END;
/

CREATE OR REPLACE PACKAGE BODY files
AS
    FUNCTION dir_list1(p_directory IN VARCHAR2)
        RETURN file_table
    AS
        LANGUAGE JAVA
        NAME 'DirList.getListasObjects( java.lang.String ) return oracle.sql.ARRAY' ;

    FUNCTION dir_list2(p_directory IN VARCHAR2, p_size_units IN VARCHAR2)
        RETURN file_table
    AS
        LANGUAGE JAVA
        NAME 'DirList.getListasObjects( java.lang.String, java.lang.String ) return oracle.sql.ARRAY' ;

    FUNCTION dir_list(p_directory IN VARCHAR2, p_size_units IN VARCHAR2 DEFAULT NULL)
        RETURN file_table
    IS
    BEGIN
        RETURN CASE WHEN p_size_units IS NULL THEN 
                           dir_list1(p_directory) 
                    ELSE 
                           dir_list2(p_directory, p_size_units)
               END;
    END;
END;
/

And then finally, use the package to look up the contents of a directory. This does assume permissions have been granted and the directory and files exist.

select * from files.dir_list('/tmp/customer_files');
FILE_NAME     FILE_SIZE FILE_TYPE LAST_MODIFIED       READABLE WRITEABLE
messages.json       722 F         2019-03-21 01:19:29 Y        ?
datadump.csv       3709 F         2019-03-21 01:19:13 Y        ?
sales.xml          2473 F         2019-03-21 01:18:56 Y        ?

I hope these examples help. A natural follow-up question might be "could these functions return a pipelined collection?" Yes, they can, through the Data Cartridge api (ODCI), and I hope to provide examples of those options in a future article.

Questions and comments, as always, are welcome.


Creativity vs Functionality – A Tale of a Query’s Development

Recently, I had a satisfying conclusion to a series of questions that came in randomly over the course of a few weeks.

First the developer was trying to do recursion. I pointed to documentation and gave a couple of small examples showing CONNECT BY and recursive WITH clauses.

    SELECT LPAD(' ', LEVEL * 2) || empno || ' ' || ename || ' ' || job
      FROM emp
CONNECT BY mgr = PRIOR empno
START WITH job = 'MANAGER'
ORDER SIBLINGS BY empno;
WITH
    recursive (lvl, empno, ename, job, mgr)
    AS
        (SELECT 1, empno, ename, job, mgr
           FROM emp
          WHERE job = 'MANAGER'
         UNION ALL
         SELECT recursive.lvl + 1, emp.empno, emp.ename,
                emp.job, emp.mgr
           FROM emp, recursive
          WHERE emp.mgr = recursive.empno)
          SEARCH DEPTH FIRST BY empno SET RN
SELECT LPAD(' ', lvl * 2) || empno || ' ' || ename || ' ' || job
  FROM recursive;
  7566 JONES MANAGER
    7788 SCOTT ANALYST
      7876 ADAMS CLERK
    7902 FORD ANALYST
      7369 SMITH CLERK
  7698 BLAKE MANAGER
    7499 ALLEN SALESMAN
    7521 WARD SALESMAN
    7654 MARTIN SALESMAN
    7844 TURNER SALESMAN
    7900 JAMES CLERK
  7782 CLARK MANAGER
    7934 MILLER CLERK

The next time he reached out to me, he had settled on using a recursive WITH subquery and was aggregating the children of the top-level entries, creating a comma-separated list for each.

WITH
    recursive (empno, ename, job, mgr, manager_name)
    AS
        (SELECT  empno, ename, job, mgr, ename
           FROM emp
          WHERE job = 'MANAGER'
         UNION ALL
         SELECT emp.empno, emp.ename,
                emp.job, emp.mgr,  manager_name
           FROM emp, recursive
          WHERE emp.mgr = recursive.empno)
  SELECT manager_name, 
         LISTAGG(ename, ',') 
              WITHIN GROUP (ORDER BY empno) employees
    FROM recursive
   WHERE job != 'MANAGER'
GROUP BY manager_name;
MANAGER_NAME  EMPLOYEES
BLAKE         ALLEN,WARD,MARTIN,TURNER,JAMES
CLARK         MILLER
JONES         SMITH,SCOTT,ADAMS,FORD

This solution only worked for a few test rows. He returned a little while later because the aggregated strings were exceeding VARCHAR2's 4000-character limit for some groups and thus failing. He was using 12cR2, so the ON OVERFLOW syntax would have been legal; but he needed all of the content, so neither the ERROR nor the TRUNCATE option would have sufficed. While this isn't a problem using the small EMP table, it was an issue that needed solving for his real data. I showed him how to use the COLLECT function to aggregate into a nested table and then pass that object to a function that iterates through the members and returns a delimited CLOB.
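For reference, the 12cR2 overflow clause would have looked something like this (using the same recursive subquery as above); TRUNCATE avoids the ORA-01489 error but silently drops content, which wasn't acceptable here.

  SELECT manager_name,
         LISTAGG(ename, ',' ON OVERFLOW TRUNCATE '...' WITH COUNT)
              WITHIN GROUP (ORDER BY empno) employees
    FROM recursive
   WHERE job != 'MANAGER'
GROUP BY manager_name;

The COLLECT-based approach we used instead follows.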

WITH
    recursive (empno, ename, job, mgr, manager_name)
    AS
        (SELECT  empno, ename, job, mgr, ename
           FROM emp
          WHERE job = 'MANAGER'
         UNION ALL
         SELECT emp.empno, emp.ename,
                emp.job, emp.mgr,  manager_name
           FROM emp, recursive
          WHERE emp.mgr = recursive.empno)
  SELECT manager_name, 
         tbl2clob(CAST(COLLECT(ename) AS vctab)) employees
    FROM recursive
   WHERE job != 'MANAGER'
GROUP BY manager_name;

The TBL2CLOB function and VCTAB type are objects I’ve used in an earlier series and their source may be found here.

At this point, though, I was getting a little concerned his plan was to use the csv strings to construct dynamic sql with large IN-lists. So, before he got a chance to implement that, I told him there are better alternatives that use the collections as is instead of converting them into large text values.

The next round of questions wasn't about the syntax itself or database problems; rather, in testing the queries and constructing various collections he began getting errors for missing objects or inconsistent data types. The problem was that the collection being returned was not explicitly CAST to a known type, so the client software was not able to render the results because it didn't know what the results were. The fix is simply to give them a type. CAST was shown before, but at the time it was implied that casting was simply a requirement of the tbl2clob function, rather than a practice for all COLLECT output.

  SELECT manager_name, 
         COLLECT(ename) employees  -- error
    FROM recursive
   WHERE job != 'MANAGER'
GROUP BY manager_name;
  SELECT manager_name,
         CAST(COLLECT(ename) AS vctab) employees  -- good
    FROM recursive
   WHERE job != 'MANAGER'
GROUP BY manager_name;

Now that his collections were properly typed and he had a query aggregating the expected results, he could use the collection in a join to some other table. I showed samples of the MEMBER and IN variations; see the sketch after the snippets below. While the MEMBER syntax is more compact and (in my opinion) cleaner, the optimizer will often make better choices with an IN clause against a subquery. I always recommend testing both.

MEMBER operator:
     x.name MEMBER OF m.employees

IN-list condition:
     x.name IN (SELECT column_value FROM TABLE(m.employees))
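Put into complete queries against the MANAGERS table created below, the two variations look something like this (SOME_LOOKUP and its columns are hypothetical):

-- MEMBER variation
SELECT m.manager_name, x.val
  FROM managers m, some_lookup x
 WHERE x.ename MEMBER OF m.employees;

-- IN-list variation against the unnested collection
SELECT m.manager_name, x.val
  FROM managers m, some_lookup x
 WHERE x.ename IN (SELECT column_value FROM TABLE(m.employees));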

The next step in his task was materializing the aggregated values from a remote database into a local copy. He didn’t need a materialized view, but did want a static local copy for short-term reuse.

CREATE TABLE managers
NESTED TABLE employees STORE AS manager_employees_nt
AS
WITH
    recursive (empno, ename, job, mgr, manager_name)
    AS
        (SELECT  empno, ename, job, mgr, ename
           FROM emp
          WHERE job = 'MANAGER'
         UNION ALL
         SELECT emp.empno, emp.ename,
                emp.job, emp.mgr,  manager_name
           FROM emp, recursive
          WHERE emp.mgr = recursive.empno)
  SELECT manager_name, CAST(COLLECT(ename) AS vctab) employees
    FROM recursive
   WHERE job != 'MANAGER'
GROUP BY manager_name;

Now that his queries had worked through the basic syntax requirements and I could see some real results coming through, I became a little concerned he was doing a lot of look-ups based on the elements in his collection and that this route would be unnecessarily complicated. After a few minutes of wrangling over the details of his requirements and the nature of the data, I suggested simplifying the materialized table without a collection.

CREATE TABLE managers
AS       
WITH
    recursive (empno, ename, job, mgr, manager_name)
    AS
        (SELECT  empno, ename, job, mgr, ename
           FROM emp
          WHERE job = 'MANAGER'
         UNION ALL
         SELECT emp.empno, emp.ename,
                emp.job, emp.mgr,  manager_name
           FROM emp, recursive
          WHERE emp.mgr = recursive.empno)
SELECT manager_name, ename
  FROM recursive
 WHERE job != 'MANAGER';

Then do the lookups and subsequent aggregations using simple equality predicates and GROUP BYs.

SELECT m.manager_name,
       SUM(e.value1) val1_sum,
       SUM(e.value2) val2_sum
  FROM managers m
  LEFT JOIN lookup_table e
       ON e.ename = m.ename
 GROUP BY m.manager_name;

In retrospect, I suppose it's possible that spending more time digging through his requirements a few weeks ago could have led directly to this solution; but I'm glad we worked through the various combinations. He learned several new syntax techniques and different ways of looking at a problem. So, even though most of them proved unneeded in the end, I feel it was a useful exercise: I've given him more tools for his toolbelt, and hopefully I've also taught him enough to know when and how to use them.

Moral of the story – It’s not enough to simply know 10 ways to tie a knot, it’s equally important to know why you shouldn’t use 9 of them.

A bug in the COLLECT function?

Today I received a request that had me using the COLLECT function to aggregate values for different categories. However, it didn't take long to find extra rows creeping into my results.

To my surprise, I found that using DISTINCT within a COLLECT works unless you also try to order the results with an ORDER BY clause. If the ordering is included, the uniqueness criterion is negated.

I tested it on 11.2.0.4, 12.1.0.2, 12.2.0.1, and 18.3.0.0 databases; all reproduced the same problem. Here is the smallest example I could think of: simply collecting two identical values (the number 1). The resulting collection is not distinct even though DISTINCT is specified in the function call.

Connected to:
Oracle Database 11g Enterprise Edition Release 11.2.0.4.0 - 64bit Production
With the Partitioning, Real Application Clusters and Automatic Storage Management options

SQL> select collect(distinct n order by n) x from (select 1 n from dual union all select 1 n from dual);

X
--------------------------------------
SYSTPgubs6QoXB17gUwowH60Wkw==(1, 1)

1 row selected.
Connected to:
Oracle Database 12c Enterprise Edition Release 12.1.0.2.0 - 64bit Production
With the Partitioning, Automatic Storage Management, OLAP, Advanced Analytics
and Real Application Testing options

SQL> select collect(distinct n order by n) x from (select 1 n from dual union all select 1 n from dual);

X
--------------------------------------
SYSTPgubmYoi2AY7gUwpcDSE2+A==(1, 1)

1 row selected.
Connected to:
Oracle Database 12c Enterprise Edition Release 12.2.0.1.0 - 64bit Production

SQL> select collect(distinct n order by n) x from (select 1 n from dual union all select 1 n from dual);

X
--------------------------------------
ST00001uZWKJPoArbgUwowAbgM2A=(1, 1)

1 row selected.
Connected to:
Oracle Database 18c Enterprise Edition Release 18.0.0.0.0 - Production
Version 18.3.0.0.0

SQL> select collect(distinct n order by n) x from (select 1 n from dual union all select 1 n from dual);

X
--------------------------------------
ST00001uadYnwXOo3gU0tqWArDUg=(1, 1)

1 row selected.

I also tested it on Oracle's own livesql.oracle.com site. Live SQL doesn't support returning an undefined collection type, so in the example I cast the result to a defined public collection and then select from that using the TABLE function. My test script can be found here.
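Using the public SYS.ODCINUMBERLIST type, that version of the test can be written something like this; on an affected database it returns two rows instead of one:

SELECT COLUMN_VALUE
  FROM TABLE(SELECT CAST(COLLECT(DISTINCT n ORDER BY n) AS sys.odcinumberlist)
               FROM (SELECT 1 n FROM DUAL
                     UNION ALL
                     SELECT 1 n FROM DUAL));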

As a workaround, I can use a subquery to find the distinct values first, then wrap that in an outer query to collect the values into an ordered collection.

SELECT COLLECT(n ORDER BY n)
  FROM (SELECT DISTINCT n
          FROM (SELECT 1 n FROM DUAL
                UNION ALL
                SELECT 2 n FROM DUAL
                UNION ALL
                SELECT 2 n FROM DUAL
                UNION ALL
                SELECT 1 n FROM DUAL));

Another option: I can use COLLECT twice, once with DISTINCT and then again with ORDER BY. This is essentially the same as the first workaround, but with more complicated syntax.

SELECT COLLECT(COLUMN_VALUE ORDER BY COLUMN_VALUE)
  FROM TABLE(SELECT COLLECT(DISTINCT n)
               FROM (SELECT 1 n FROM DUAL
                     UNION ALL
                     SELECT 2 n FROM DUAL
                     UNION ALL
                     SELECT 2 n FROM DUAL
                     UNION ALL
                     SELECT 1 n FROM DUAL));

Using the secondary COLLECT can be useful if you are using other aggregates that apply at the lower level.

SELECT object_type,
       count_all,
       count_distinct,
       (SELECT COLLECT(COLUMN_VALUE ORDER BY COLUMN_VALUE) FROM TABLE(owners))
  FROM (SELECT object_type,
               COUNT(*) count_all,
               COUNT(DISTINCT owner) count_distinct,
               COLLECT(DISTINCT owner) owners
          FROM all_objects
      GROUP BY object_type);

I have opened an SR with Oracle Support to investigate the issue. I hope these workarounds help until a fix is available.


Using Java stored procedure to read other databases

Oracle provides a great set of tools called "gateways" to connect an Oracle database to other data sources such as SQL Server or DB2. Most of these gateways are separately licensed, though. There is a no-cost ODBC-based gateway, but that might not fit your needs, and you may want to explore using other technologies to read or modify data in a remote source. Alternately, you may already have java classes from a legacy integration that you want to use from within the Oracle database.

Fortunately, Oracle supports java stored procedures. So, if you load the appropriate jdbc drivers you can have your Oracle sql and pl/sql integrate with other database platforms. In the examples below I’ll use a SQL Server instance running the WideWorldImporters sample database as my remote source. To read data from it, I’ll create a schema to hold the needed jdbc classes, write a java class using them to read the remote data, and then publish the class for use through a pl/sql function interface.

1 – Create the schema to own the jdbc classes. The schema does not need the CREATE SESSION privilege and the account can be locked. If you're using 18c or higher, you can take it a step further and create the user without a password.

CREATE USER sqljdbc IDENTIFIED BY pa55w0rd
DEFAULT TABLESPACE USERS TEMPORARY TABLESPACE TEMP ACCOUNT LOCK;

ALTER USER sqljdbc QUOTA UNLIMITED ON users;

GRANT CREATE PROCEDURE TO sqljdbc;

2 – The schema needs to be granted Java privileges to connect and resolve the host and port of each external database.

DECLARE
    keynum   NUMBER;
BEGIN
    sys.DBMS_JAVA.grant_permission(grantee           => 'SQLJDBC',
                                   permission_type   => 'SYS:java.net.SocketPermission',
                                   permission_name   => 'my_remote_server:1433',
                                   permission_action => 'connect,resolve',
                                   key               => keynum);
END;
/

3 – Load the java classes, use the grant option to allow them to be used by an application schema. You may get errors on the KeyVaultCredential and SQLServerColumnEncryptionAzureKeyVaultProvider classes; these aren’t needed for these examples.

loadjava -genmissing -schema sqljdbc -resolve -grant sds -user system@testdb sqljdbc42.jar

4 – Create a java class in your application schema to read some of the remote data.

CREATE OR REPLACE AND RESOLVE JAVA SOURCE NAMED wwi_lookup 
RESOLVER(( * sds )( * sqljdbc )( * public ))
    AS 
import java.sql.*;
import com.microsoft.sqlserver.jdbc.*;

public class wwi_lookup {
    public static int get_population(
                            String p_server,
                            String p_instance,
                            int p_port,
                            String p_database,
                            String p_user,
                            String p_password,
                            String p_city,
                            String p_state) throws Exception
    {
       SQLServerDataSource ds = new SQLServerDataSource();
       ds.setServerName(p_server);
       if (p_instance != null) { ds.setInstanceName(p_instance); }           
       ds.setPortNumber(p_port);
       ds.setDatabaseName(p_database);
       ds.setUser(p_user);
       ds.setPassword(p_password);

       Connection conn = ds.getConnection();

       PreparedStatement pstmt = conn.prepareStatement(
               "SELECT c.LatestRecordedPopulation "
             + "  FROM      Application.Cities c "
             + " INNER JOIN Application.StateProvinces s "
             + "    ON s.StateProvinceID = c.StateProvinceID "
             + " WHERE c.CityName = ? AND s.StateProvinceCode = ?");

       pstmt.setString(1, p_city);
       pstmt.setString(2, p_state);

       ResultSet rs = pstmt.executeQuery(); 

       int population = -1; 
       if (rs.next()) {
          population = rs.getInt("latestrecordedpopulation");
       }
       rs.close();

       conn.close();

       return population; 
    }
}
/

5 – Publish the class through a pl/sql interface.

CREATE OR REPLACE FUNCTION get_population(p_server   IN VARCHAR2,
                                          p_instance IN VARCHAR2,
                                          p_port     IN NUMBER,
                                          p_database IN VARCHAR2,
                                          p_user     IN VARCHAR2,
                                          p_password IN VARCHAR2,
                                          p_city     IN VARCHAR2,
                                          p_state    IN VARCHAR2)
    RETURN PLS_INTEGER
IS
    LANGUAGE JAVA
    NAME 'wwi_lookup.get_population(java.lang.String, java.lang.String, int,
                                    java.lang.String, java.lang.String, java.lang.String,
                                    java.lang.String, java.lang.String
                                   ) return int';
/

6 – Grant the application owner Java privileges to connect and resolve the remote host and port. These are the same as the jdbc schema needed.

DECLARE
    keynum   NUMBER;
BEGIN
    sys.DBMS_JAVA.grant_permission(grantee           => 'SDS',
                                   permission_type   => 'SYS:java.net.SocketPermission',
                                   permission_name   => 'my_remote_server:1433',
                                   permission_action => 'connect,resolve',
                                   key               => keynum);
END;
/

7 – The function should now be usable within your Oracle applications as any other function.

SQL> select get_population('my_remote_server',null,1433,'WideWorldImporters',
  2                        'wwi','pa55w0rd',
  3                        'New York','NY') city_population
  4    from dual;

CITY_POPULATION
        8175133

Using the same general form, you can create additional procedures and functions to utilize any functionality exposed by the remote system; including inserting, updating, or deleting remote data and invoking remote stored procedures.

If you need to do a lot of integrations with heterogeneous systems, then you may want to invest in the Oracle Gateway technologies; but for smaller, isolated use cases, a few small Java/jdbc objects may be more than enough.


Introduction to ASH Data, part 4

Up to this point in my ASH series all of the output has been simply rows and columns of data, sometimes augmented with an ASCII text-based chart. This time I'll explore a few different tools to construct better visualizations.

One of the first things I ever used "R" for was to create ASH charts as an alternative to those found in Oracle's Enterprise Manager "Top Activity" and "ASH Analytics" pages. I had multiple reasons to pursue the exercise. The first was that the Top Activity page didn't have a mechanism to specify an arbitrary time range. The ASH Analytics page does allow inputs, but on refreshes it often loses them. Furthermore, both pages generate their graphs from averages over periods, which can distort the data, causing peaks and valleys to disappear. I wrote about the missing peaks a couple of years ago.

So, the first visualization I created was a simple stacked bar chart with the ggplot2 package, accessing the database through the ROracle package. This script plots the same half-hour range seen in previous articles, but every sample time is represented. While this is still subject to the sampling issues of all ASH queries, the chart eliminates the missing peaks caused by rounded averages over periods.

library("DBI")
library("ROracle")
library("ggplot2")

drv <- dbDriver("Oracle")

### Connect via credentials stored in wallet
con <- dbConnect(drv, dbname ="testdb")

######################################################################################################
#### Function: get_top_activity
######################################################################################################
get_top_activity <- function(p_con, p_start, p_end) {
    v_sql <- "
        SELECT sample_time, activity, COUNT(*) cnt
          FROM (SELECT sample_time,
                       CASE
                       WHEN session_state = 'ON CPU' THEN
                          'ON CPU'
                       WHEN wait_class IN ('Concurrency',
                                           'System I/O',
                                           'User I/O',
                                           'Configuration',
                                           'Application',
                                           'Commit',
                                           'Queueing',
                                           'Network',
                                           'Administrative',
                                           'Cluster',
                                           'Scheduler') THEN
                          wait_class
                       ELSE
                          'Other'
                       END activity
                  FROM dba_hist_active_sess_history
                 WHERE sample_time >= TO_TIMESTAMP(:start_time,'yyyy-mm-dd hh24:mi:ss')
                   AND sample_time < TO_TIMESTAMP(:end_time,'yyyy-mm-dd hh24:mi:ss'))
          GROUP BY sample_time, activity"

    ### Need to tell R the timezone of the incoming data
    ### Alternately, could use FROM_TZ function to give the timestamp values a timezone in the query.
    Sys.setenv(ORA_SDTZ = "US/Eastern")
    
    return(dbGetQuery(
              p_con,
              v_sql,
              data.frame(
                  start_time = p_start,
                  end_time = p_end
              )
           )
    )
}
######################################################################################################


##############################################################################################
###  Get ASH Data
##############################################################################################
df_activity <- get_top_activity(con, '2018-09-15 13:00:00', '2018-09-15 13:30:00')

### We don't need the db connection anymore
dbDisconnect(con)
dbUnloadDriver(drv)

### Stack the wait classes in this order (CPU on bottom, Other on top)
wait_class_order <- c(
  "Other",
  "Cluster",
  "Queueing",
  "Network",
  "Administrative",
  "Configuration",
  "Commit",
  "Application",
  "Concurrency",
  "System I/O",
  "User I/O",
  "Scheduler",
  "ON CPU"
)
df_activity$ACTIVITY <- factor(df_activity$ACTIVITY, levels = wait_class_order)


## Color hex codes for each activity type shown in the chart
wait_class_colors <- c(
  "ON CPU"         = "#04CE04",
  "Scheduler"      = "#CCFECC",
  "User I/O"       = "#084AD4",
  "System I/O"     = "#0499F4",
  "Concurrency"    = "#8E1B04",
  "Application"    = "#C72D04",
  "Commit"         = "#E46A04",
  "Configuration"  = "#5B460D",
  "Administrative" = "#707251",
  "Network"        = "#9C9376",
  "Queueing"       = "#C5B79A",
  "Cluster"        = "#CCC3AE",
  "Other"          = "#F56CA9"
)

### Build chart, fill in missing values, add titles and labels
plot_activity <- ggplot(df_activity, aes(x = SAMPLE_TIME, y = CNT, fill = ACTIVITY)) +
                        geom_bar(stat = "identity") +
                        scale_fill_manual(values = wait_class_colors) +
                        ggtitle("Active Sessions")  +
                        labs(x = "Sample Time", y =  "Active Sessions")

### Show the chart
print(plot_activity)

Running this script produced the chart below from my test system.

Shortly after I started exploring R, I was asked to help with a project using Googlecharts. I hadn't done a lot with the Googlecharts api before; but having just completed some simple R charting, I started by replicating the ASH charts as a learning exercise. With R, you build a data frame to pass into the ggplot api. With Googlecharts, you build an array of arrays, where each sub-array has a placeholder value for every possible activity type (the R plot function, by contrast, could fill in missing values as needed). Rows in the array will look something like these:

 [new Date(2018,09,15,13,06,00),6,0,0,0,0,0,0,0,0,0,0,0,0]
,[new Date(2018,09,15,13,29,33),0,0,3,0,0,0,0,0,0,0,0,0,0]
,[new Date(2018,09,15,13,19,42),0,0,3,0,0,0,0,0,0,0,0,0,0]
,[new Date(2018,09,15,13,15,51),6,0,0,0,0,0,0,0,0,0,0,0,0]

Using the same query and date range as in the R example above, adding a pivot clause and a little pl/sql, we can spool out an html file containing all of the necessary data, then invoke a browser to load it and generate the chart.

set echo off
set pages 1000
set lines 1000
set feedback off
set define off
set serveroutput on
spool ash.html
DECLARE
      p_start_time TIMESTAMP := TO_TIMESTAMP('2018-09-15 13:00:00','yyyy-mm-dd hh24:mi:ss');
      p_end_time   TIMESTAMP := TO_TIMESTAMP('2018-09-15 13:30:00','yyyy-mm-dd hh24:mi:ss');
BEGIN
    DBMS_OUTPUT.put_line(
        q'[<html>
  <head>
    <script type="text/javascript" src="https://www.gstatic.com/charts/loader.js"></script>
    <script type="text/javascript">
       google.charts.load('current', {'packages':['corechart']});
       google.charts.setOnLoadCallback(drawChart);
       function drawChart() {
           var data = google.visualization.arrayToDataTable([
            ['Activity',
            'ON CPU','Scheduler','User IO', 'System IO', 'Concurrency','Application','Commit',
            'Configuration',    'Administrative',    'Network',  'Queueing', 'Cluster',   'Other'],]'
       );

    FOR x
        IN (
            SELECT    CASE WHEN ROWNUM = 1 THEN ' ' ELSE ',' END
                   || '['
                   || TO_CHAR(sample_time, '"new Date("yyyy,mm,dd,hh24,mi,ss")"')
                   || ','
                   || NVL("ON CPU", 0)
                   || ','
                   || NVL("Scheduler", 0)
                   || ','
                   || NVL("User I/O", 0)
                   || ','
                   || NVL("System I/O", 0)
                   || ','
                   || NVL("Concurrency", 0)
                   || ','
                   || NVL("Application", 0)
                   || ','
                   || NVL("Commit", 0)
                   || ','
                   || NVL("Configuration", 0)
                   || ','
                   || NVL("Administrative", 0)
                   || ','
                   || NVL("Network", 0)
                   || ','
                   || NVL("Queueing", 0)
                   || ','
                   || NVL("Cluster", 0)
                   || ','
                   || NVL("Other", 0)
                   || ']' arrayrow
              FROM (SELECT sample_time,
                           CASE
                               WHEN session_state = 'ON CPU'
                               THEN
                                   'ON CPU'
                               WHEN wait_class IN ('Concurrency',
                                                   'System I/O',
                                                   'User I/O',
                                                   'Configuration',
                                                   'Application',
                                                   'Commit',
                                                   'Queueing',
                                                   'Network',
                                                   'Administrative',
                                                   'Cluster',
                                                   'Scheduler')
                               THEN
                                   wait_class
                               ELSE
                                   'Other'
                           END activity
                      FROM dba_hist_active_sess_history
                     WHERE sample_time >= p_start_time 
                       AND sample_time < p_end_time)
                   PIVOT (COUNT(*)
                         FOR activity
                         IN ('Other' AS "Other",
                            'Cluster' AS "Cluster",
                            'Queueing' AS "Queueing",
                            'Network' AS "Network",
                            'Administrative' AS "Administrative",
                            'Configuration' AS "Configuration",
                            'Commit' AS "Commit",
                            'Application' AS "Application",
                            'Concurrency' AS "Concurrency",
                            'System I/O' AS "System I/O",
                            'User I/O' AS "User I/O",
                            'Scheduler' AS "Scheduler",
                            'ON CPU' AS "ON CPU"))
    )
    LOOP
        DBMS_OUTPUT.put_line(x.arrayrow);
    END LOOP;

    DBMS_OUTPUT.put_line(
           q'[        ]);           

      var options = {
        width: 1200,
        height: 700,
        legend: { position: 'right', maxLines: 3 },
        bar: { groupWidth: '75%' },
        isStacked: true,
        colors: [
          '#04CE04',
          '#CCFECC',
          '#084AD4',
          '#0499F4',
          '#8E1B04',
          '#C72D04',
          '#E46A04',
          '#5B460D',
          '#707251',
          '#9C9376',
          '#C5B79A',
          '#CCC3AE',
          '#F56CA9']
      };
        // Instantiate and draw our chart, passing in some options.
        var chart = new google.visualization.ColumnChart(document.getElementById('chart_div'));
        chart.draw(data, options);
      }
    </script>
  </head>

  <body><div id="chart_div"></div></body></html>]'
    );
END;
/
spool off
host "C:\Program Files\Mozilla Firefox\firefox.exe" .\ash.html
exit

Using the script above, the Googlechart version of the ASH data looks like the chart below.

The last visualization tool I'll explore is perhaps the most obvious one – Application Express (APEX). To create a stacked bar chart similar to the ones above…

Create a new Region
  • Under Identification
    • Set Title – Active Sessions
    • Set Type – Chart
  • Under Attributes
    • Under Chart, Set Type – Bar
    • Under Appearance, Set Stacked – Yes
    • Under Settings, Set Time Axis Type – Enabled
  • Under Series
    • Under Source, Set Type – SQL Query (query is below)
    • Under Column Mapping
      • Set Series Name – ACTIVITY
      • Set Label – SAMPLE_TIME
      • Set Value – CNT
    • Under Appearance, Set Color – &COLOR.

In addition to the settings above, we’ll also need to add the SQL query in the Series Source. Below is the same query used in the R example, except I’ve also added a color attribute.

  SELECT sample_time,
         activity,
         color,
         COUNT(*) cnt
    FROM (SELECT sample_time,
                 CASE
                     WHEN session_state = 'ON CPU'
                     THEN
                         'ON CPU'
                     WHEN wait_class IN ('Concurrency',
                                         'System I/O',
                                         'User I/O',
                                         'Configuration',
                                         'Application',
                                         'Commit',
                                         'Queueing',
                                         'Network',
                                         'Administrative',
                                         'Cluster',
                                         'Scheduler')
                     THEN
                         wait_class
                     ELSE
                         'Other'
                 END activity,
                 CASE
                     WHEN session_state = 'ON CPU' THEN '#04CE04'
                     WHEN wait_class = 'Concurrency' THEN '#8E1B04'
                     WHEN wait_class = 'System I/O' THEN '#0499F4'
                     WHEN wait_class = 'User I/O' THEN '#084AD4'
                     WHEN wait_class = 'Configuration' THEN '#5B460D'
                     WHEN wait_class = 'Application' THEN '#C72D04'
                     WHEN wait_class = 'Commit' THEN '#E46A04'
                     WHEN wait_class = 'Queueing' THEN '#C5B79A'
                     WHEN wait_class = 'Network' THEN '#9C9376'
                     WHEN wait_class = 'Administrative' THEN '#707251'
                     WHEN wait_class = 'Cluster' THEN '#CCC3AE'
                     WHEN wait_class = 'Scheduler' THEN '#CCFECC'
                     ELSE '#F56CA9'
                 END color
            FROM dba_hist_active_sess_history
           WHERE sample_time >= TIMESTAMP '2018-09-15 13:00:00'
             AND sample_time < TIMESTAMP '2018-09-15 13:30:00')
GROUP BY sample_time, activity, color

Running the page with the chart region defined as above produced the following chart. Unfortunately, the options for defining the stacking order are limited to sorting by value or label. Thus, it's not possible (using just the native APEX options) to guarantee the APEX chart will stack the same way as the R or Googlechart variations. It might be possible with some custom JavaScript to manipulate the underlying JET attributes; but I have not explored that route. The overall effect, even with the current limitations, is still quite similar to the other tools.

Hopefully these examples will help you in digging into your own ASH data and maybe give you a little head start in exploring some new tools. I welcome any suggestions, corrections, or additions to any of the methods above.

While all of these examples have hard-coded date ranges, I hope the code arrangement provides a solid framework to parameterize within your own environment and tools. As always, questions and comments are welcome.


Converting ISO8601 week-numbering text to date

Today I was asked how to convert IYYY and IW values into dates, in particular how to convert the day-of-week value since there is no "ID" format mask. As it turns out, it wouldn't matter if there was, because the "I" format masks are not supported in the TO_DATE and TO_TIMESTAMP functions; they can only be used in TO_CHAR conversions from datetime values.
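For example, attempting the conversion directly fails (the exact error text may vary slightly by version):

SQL> select to_date('2019-W03', 'iyyy-"W"iw') from dual;
select to_date('2019-W03', 'iyyy-"W"iw') from dual
                           *
ERROR at line 1:
ORA-01820: format code cannot appear in date input format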

The week-numbering calendar described in the ISO standard starts on the Monday immediately prior to the first Thursday that occurs within the Gregorian calendar year. Furthermore, the standard defines two formats for representing dates in the ISO week-numbering system: an extended form, yyyy-Www-d, and a compact form, yyyyWwwd.

Here W is the literal letter W, ww is the ISO week number 1-53, d is the day of the ISO week 1-7 for Monday through Sunday, and yyyy is the ISO year, which mostly corresponds to the Gregorian calendar year except for the first few and last few days of each year, which can fall into a different ISO year than the Gregorian year.

The first tricky part of this problem was finding the first Thursday. I couldn’t use day-of-week numbers because those vary by NLS_TERRITORY and I couldn’t use names because those vary by NLS_LANGUAGE. The easiest solution to this I could think of was to use a date that I already knew to be a Thursday and then extract the day name for that date. Doing this means I will always be able to refer to the “Thursday” day by the correct name, regardless of the current session’s NLS_LANGUAGE setting. It’s a tiny bit of overhead but neatly solves the problem.
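A quick sketch of the idea: the day name returned changes with the session language, but it always names a Thursday, so NEXT_DAY keeps working.

ALTER SESSION SET NLS_LANGUAGE = 'FRENCH';

SELECT TO_CHAR(DATE '1971-01-07', 'DY') thursday_name,           -- JEU. in a French session
       NEXT_DAY(TRUNC(SYSDATE, 'yyyy') - 1,
                TO_CHAR(DATE '1971-01-07', 'DY')) first_thursday -- first Thursday of the current year
  FROM dual;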

Next was validating the input to make sure it follows one of the two standard formats; a regular expression nicely covers both. Then a few substring extractions pull out the year, week, and day numbers. Again, I validate the values so users can't pass in week 0, 86, or another out-of-range value.

The math to construct a date from the ISO values is then fairly straightforward: start with Thursday, then find Monday, then increment for weeks and days to get the final result.

One last check ensures that a week-53 input applied to a 52-week year will be rejected.

The final function using the above logic ended up looking like this:

CREATE OR REPLACE FUNCTION isoweekdate(p_date IN VARCHAR2)
    RETURN DATE
IS
    --                    .///.
    --                   (0 o)
    ---------------0000--(_)--0000---------------
    --
    --  Sean D. Stuber
    --  sean.stuber@gmail.com
    --
    --             oooO      Oooo
    --------------(   )-----(   )---------------
    --             \ (       ) /
    --              \_)     (_/

    v_result         DATE;

    v_week           PLS_INTEGER;
    v_day            PLS_INTEGER;
    -- Get the weekday name of a known Thursday.  Doing it this way
    -- ensures the function will work regardless of the NLS_LANGUAGE setting
    --- 1971 is first year recognizing this format, so use its first Thursday
    v_nls_thursday   VARCHAR2(3) := TO_CHAR(DATE '1971-1-7', 'DY');
BEGIN
    --- Assert the input string is an ISO8601 week date string in Extended or Compact form
    ---  Extended:  yyyy-Www-d
    ---  Compact:   yyyyWwwd
    --- where "W" is the literal W character, ww is the ISO week number
    IF NOT REGEXP_LIKE(p_date, '^[0-9]{4}-W[0-9]{2}-[1-7]$|^[0-9]{4}W[0-9]{2}[1-7]$')
    THEN
        raise_application_error(-20001, 'Invalid format, must be yyyy-Www-d or yyyyWwwd');
    END IF;

    v_week := TO_NUMBER(SUBSTR(p_date, INSTR(p_date, 'W') + 1, 2), '99');

    IF v_week NOT BETWEEN 1 AND 53
    THEN
        raise_application_error(-20002, 'Illegal week number, must be between 1 and 53');
    END IF;

    v_day := TO_NUMBER(SUBSTR(p_date, -1), '9');

    v_result :=
          NEXT_DAY(TRUNC(TO_DATE(SUBSTR(p_date, 1, 4), 'yyyy'), 'yyyy') - 1, v_nls_thursday) -- find the first Thursday in the Gregorian calendar year
        - 4                                  -- go back four days so that adding the day number makes day 1 fall on Monday
        + ((v_week - 1) * 7)                 -- increment for the number of weeks
        + v_day;                             -- increment for the day number, 1=Monday, 7=Sunday


    -- If someone tries to use Week 53 in a 52-week year
    --   the result will improperly shift into the following year.
    -- Check for that condition and raise exception if it happens
    IF v_week = 53 AND TO_CHAR(v_result, 'iyyy') > SUBSTR(p_date, 1, 4)
    THEN
        raise_application_error(-20003, 'Illegal week 53 in a 52 week year');
    END IF;

    RETURN v_result;
END;

Some sample usage:

SQL> select isoweekdate('2004-W53-6') from dual;
ISOWEEKDATE('2004-W53-6')
2005-01-01

SQL> select isoweekdate('2006-W01-1') from dual;
ISOWEEKDATE('2006-W01-1')
2006-01-02

SQL> select isoweekdate('2019-W03-4') from dual;
ISOWEEKDATE('2019-W03-4')
2019-01-17

SQL> select isoweekdate('2019W041') from dual;
ISOWEEKDATE('2019W041')
2019-01-21

SQL> select isoweekdate('2019W042') from dual;
ISOWEEKDATE('2019W042')
2019-01-22

This should work well for my friend, hopefully it will help others as well.

Introduction to ASH Data, part 3

In the first two chapters of this series I queried the ASH data with respect to distinct sample times. While this is helpful for maximizing the granularity of the analysis, it can sometimes be too much detail: too many data points for the eye to follow or, if run through a graphical engine, so many elements that the resulting chart becomes slow to render. It's also possible the resolution of the resulting chart is too low to display all of the points distinctly, making the abundance of data redundant and useless.

Furthermore, sometimes what you’re looking for isn’t a detailed play-by-play of all activity but a general picture. You know there will be peaks and valleys in the system activity; but how busy is it on average?

Also, any one sample might be unusually high or unusually low, but looking at a wider view, within a given period the overall activity is what you’d expect.

All of these lead to one of the most common aggregations across time with ASH data: Average Active Sessions (AAS). The idea behind the calculation is simple enough: count up the sessions for each second of sampling and then divide by the total number of seconds within the period, yielding the average number of active sessions at any point in time. The math is simple for V$ACTIVE_SESSION_HISTORY because that view's contents are sampled each second. DBA_HIST_ACTIVE_SESS_HISTORY, however, is only populated with samples once every 10 seconds, so creating an average requires extrapolating for the missing time slices by multiplying the counts you do have by 10. For example, 30 rows found within one minute of DBA_HIST samples extrapolate to roughly 300 session-seconds, and 300/60 gives an average of 5 active sessions.

In the following example the active sessions counts are averaged over each minute (thus averaging over 60 seconds) within a half-hour span.

SQL>   SELECT TRUNC(sample_time, 'mi') time,
  2           ROUND(10 * COUNT(*) / 60) avg_sess_per_minute
  3      FROM dba_hist_active_sess_history
  4     WHERE sample_time >= TIMESTAMP '2018-09-15 13:00:00'
  5       AND sample_time < TIMESTAMP '2018-09-15 13:30:00'
  6  GROUP BY TRUNC(sample_time, 'mi')
  7  ORDER BY 1;
TIME                    AVG_SESS_PER_MINUTE
2018-09-15 13:00:00                       5
2018-09-15 13:01:00                       5
2018-09-15 13:02:00                       3
2018-09-15 13:03:00                       3
2018-09-15 13:04:00                       5
2018-09-15 13:05:00                       7
2018-09-15 13:06:00                       5
2018-09-15 13:07:00                       4
2018-09-15 13:08:00                       3
2018-09-15 13:09:00                       5
2018-09-15 13:10:00                       4
2018-09-15 13:11:00                       4
2018-09-15 13:12:00                       7
2018-09-15 13:13:00                       3
2018-09-15 13:14:00                       3
2018-09-15 13:15:00                       5
2018-09-15 13:16:00                       8
2018-09-15 13:17:00                       6
2018-09-15 13:18:00                      10
2018-09-15 13:19:00                       6
2018-09-15 13:20:00                       9
2018-09-15 13:21:00                       6
2018-09-15 13:22:00                       6
2018-09-15 13:23:00                       4
2018-09-15 13:24:00                       3
2018-09-15 13:25:00                       3
2018-09-15 13:26:00                       4
2018-09-15 13:27:00                       4
2018-09-15 13:28:00                       7
2018-09-15 13:29:00                       3

30 rows selected.

Another approach is to count the sessions for each sample and then apply the averages outside. This can be less efficient since it involves multiple aggregations, but it can be a simpler construction.

SQL> WITH
  2      sample_counts
  3      AS
  4          (  SELECT sample_time, COUNT(*) cnt
  5               FROM dba_hist_active_sess_history
  6              WHERE sample_time >= TIMESTAMP '2018-09-15 13:00:00'
  7                AND sample_time < TIMESTAMP '2018-09-15 13:30:00'
  8           GROUP BY sample_time)
  9    SELECT TRUNC(sample_time, 'mi'),
 10           ROUND(10 * SUM(cnt) / 60) avg_sess_per_minute
 11      FROM sample_counts
 12  GROUP BY TRUNC(sample_time, 'mi')
 13  ORDER BY 1;
TRUNC(SAMPLE_TIME,'MI')     AVG_SESS_PER_MINUTE
2018-09-15 13:00:00                           5
2018-09-15 13:01:00                           5
2018-09-15 13:02:00                           3
2018-09-15 13:03:00                           3
2018-09-15 13:04:00                           5
2018-09-15 13:05:00                           7
2018-09-15 13:06:00                           5
2018-09-15 13:07:00                           4
2018-09-15 13:08:00                           3
2018-09-15 13:09:00                           5
2018-09-15 13:10:00                           4
2018-09-15 13:11:00                           4
2018-09-15 13:12:00                           7
2018-09-15 13:13:00                           3
2018-09-15 13:14:00                           3
2018-09-15 13:15:00                           5
2018-09-15 13:16:00                           8
2018-09-15 13:17:00                           6
2018-09-15 13:18:00                          10
2018-09-15 13:19:00                           6
2018-09-15 13:20:00                           9
2018-09-15 13:21:00                           6
2018-09-15 13:22:00                           6
2018-09-15 13:23:00                           4
2018-09-15 13:24:00                           3
2018-09-15 13:25:00                           3
2018-09-15 13:26:00                           4
2018-09-15 13:27:00                           4
2018-09-15 13:28:00                           7
2018-09-15 13:29:00                           3

30 rows selected.

Next, we’ll extend the previous query by adding a maximum value. It’s important to remember that DBA_HIST_ACTIVE_SESS_HISTORY is a collection of samples of samples; the MAX value seen within the retained samples could still be lower than the actual active-session peak your system experienced. So, while these results should not be taken as absolutes, they can still offer an expanded, approximate view of what was happening.

SQL> WITH
  2      sample_counts
  3      AS
  4          (  SELECT sample_time, COUNT(*) cnt
  5               FROM dba_hist_active_sess_history
  6              WHERE sample_time >= TIMESTAMP '2018-09-15 13:00:00'
  7                AND sample_time < TIMESTAMP '2018-09-15 13:30:00'
  8           GROUP BY sample_time)
  9    SELECT TRUNC(sample_time, 'mi'),
 10           ROUND(10 * SUM(cnt) / 60) avg_sess_per_minute,
 11           MAX(cnt) max_sess_per_minute
 12      FROM sample_counts
 13  GROUP BY TRUNC(sample_time, 'mi')
 14  ORDER BY 1;
TRUNC(SAMPLE_TIME,'MI')     AVG_SESS_PER_MINUTE   MAX_SESS_PER_MINUTE
2018-09-15 13:00:00                           5                     9
2018-09-15 13:01:00                           5                     9
2018-09-15 13:02:00                           3                     3
2018-09-15 13:03:00                           3                     6
2018-09-15 13:04:00                           5                     6
2018-09-15 13:05:00                           7                    24
2018-09-15 13:06:00                           5                     6
2018-09-15 13:07:00                           4                     6
2018-09-15 13:08:00                           3                     3
2018-09-15 13:09:00                           5                     6
2018-09-15 13:10:00                           4                     6
2018-09-15 13:11:00                           4                     6
2018-09-15 13:12:00                           7                    24
2018-09-15 13:13:00                           3                     3
2018-09-15 13:14:00                           3                     3
2018-09-15 13:15:00                           5                     9
2018-09-15 13:16:00                           8                    24
2018-09-15 13:17:00                           6                     6
2018-09-15 13:18:00                          10                    27
2018-09-15 13:19:00                           6                     9
2018-09-15 13:20:00                           9                    27
2018-09-15 13:21:00                           6                     9
2018-09-15 13:22:00                           6                     6
2018-09-15 13:23:00                           4                     9
2018-09-15 13:24:00                           3                     3
2018-09-15 13:25:00                           3                     3
2018-09-15 13:26:00                           4                     6
2018-09-15 13:27:00                           4                     6
2018-09-15 13:28:00                           7                    24
2018-09-15 13:29:00                           3                     3

30 rows selected.

As with previous queries, these numbers can then be turned into a text-based bar chart to visualize the system activity. Here the average active sessions are charted, with the high-watermark of maximum sessions indicated by a bracket within each minute. (When the maximum equals the average, LPAD is called with a length of zero and returns NULL, so those rows show no bracket.)

SQL> WITH
  2      sample_counts
  3      AS
  4          (  SELECT sample_time, COUNT(*) cnt
  5               FROM dba_hist_active_sess_history
  6              WHERE sample_time >= TIMESTAMP '2018-09-15 13:00:00'
  7                AND sample_time < TIMESTAMP '2018-09-15 13:30:00'
  8           GROUP BY sample_time)
  9    SELECT time,
 10           RPAD('*', avg_sess_per_minute, '*') ||
 11           LPAD(']',max_sess_per_minute-avg_sess_per_minute) chart2
 12      FROM (  SELECT TRUNC(sample_time, 'mi') time,
 13                     ROUND(10 * SUM(cnt) / 60) avg_sess_per_minute,
 14                     MAX(cnt) max_sess_per_minute
 15                FROM sample_counts
 16            GROUP BY TRUNC(sample_time, 'mi'))
 17  ORDER BY 1;
TIME                  CHART2
2018-09-15 13:00:00   *****   ]
2018-09-15 13:01:00   *****   ]
2018-09-15 13:02:00   ***
2018-09-15 13:03:00   ***  ]
2018-09-15 13:04:00   *****]
2018-09-15 13:05:00   *******                ]
2018-09-15 13:06:00   *****]
2018-09-15 13:07:00   **** ]
2018-09-15 13:08:00   ***
2018-09-15 13:09:00   *****]
2018-09-15 13:10:00   **** ]
2018-09-15 13:11:00   **** ]
2018-09-15 13:12:00   *******                ]
2018-09-15 13:13:00   ***
2018-09-15 13:14:00   ***
2018-09-15 13:15:00   *****   ]
2018-09-15 13:16:00   ********               ]
2018-09-15 13:17:00   ******
2018-09-15 13:18:00   **********                ]
2018-09-15 13:19:00   ******  ]
2018-09-15 13:20:00   *********                 ]
2018-09-15 13:21:00   ******  ]
2018-09-15 13:22:00   ******
2018-09-15 13:23:00   ****    ]
2018-09-15 13:24:00   ***
2018-09-15 13:25:00   ***
2018-09-15 13:26:00   **** ]
2018-09-15 13:27:00   **** ]
2018-09-15 13:28:00   *******                ]
2018-09-15 13:29:00   ***

As shown above, aggregating to the minute, or any of the other native TRUNC levels (hour, day, week, etc.), isn’t too difficult. If you want to aggregate to some other range, for example sampling in windows of 15 minutes, the math is a little trickier but the idea is still the same: modify the truncation logic to produce times on the hour and at 15, 30, and 45 minutes past the hour, and divide the count totals by 900 seconds (15 minutes * 60 seconds per minute.)
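To see what that truncation expression does, consider a single time: for 13:37:42 the minute component is 37, FLOOR(37 / 15) = 2, and 2 * 15 / 1440 adds 30 minutes (dividing by 1,440 converts minutes into the fractional days DATE arithmetic uses), so the sample lands in the 13:30:00 window. As a standalone illustration:

-- Illustration only: maps 13:37:42 to its quarter-hour window start (13:30:00)
SELECT TRUNC(d, 'hh24')
        + FLOOR(TO_NUMBER(TO_CHAR(d, 'mi'), '99') / 15) * 15 / 1440 window_start
  FROM (SELECT TO_DATE('2018-09-15 13:37:42', 'yyyy-mm-dd hh24:mi:ss') d FROM DUAL);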

SQL> WITH
  2      sample_counts
  3      AS
  4          (  SELECT sample_time, COUNT(*) cnt
  5               FROM dba_hist_active_sess_history
  6              WHERE sample_time >= TIMESTAMP '2018-09-15 07:00:00'
  7                AND sample_time < TIMESTAMP '2018-09-15 17:00:00'
  8           GROUP BY sample_time)
  9    SELECT time,
 10           RPAD('*', avg_sess_per_window, '*') ||
 11           LPAD(']',max_sess_per_window-avg_sess_per_window) chart
 12      FROM (  SELECT TRUNC(sample_time, 'hh24')
 13                      + FLOOR((TO_NUMBER(TO_CHAR(sample_time, 'mi'), '99')) / 15) * 15 / 1440 time,
 14                     ROUND(10 * SUM(cnt) / (15*60)) avg_sess_per_window,
 15                     MAX(cnt) max_sess_per_window
 16                FROM sample_counts
 17            GROUP BY TRUNC(sample_time, 'hh24')
 18                      + FLOOR((TO_NUMBER(TO_CHAR(sample_time, 'mi'), '99')) / 15) * 15 / 1440)
 19  ORDER BY 1;
TIME                  CHART
2018-09-15 07:00:00   ******************                                 ]
2018-09-15 07:15:00   ***********                        ]
2018-09-15 07:30:00   **********                             ]
2018-09-15 07:45:00   *********                          ]
2018-09-15 08:00:00   **********                         ]
2018-09-15 08:15:00   **********                         ]
2018-09-15 08:30:00   **********                             ]
2018-09-15 08:45:00   *********                          ]
2018-09-15 09:00:00   ********                           ]
2018-09-15 09:15:00   ******                             ]
2018-09-15 09:30:00   *****                          ]
2018-09-15 09:45:00   *****                          ]
2018-09-15 10:00:00   ********                       ]
2018-09-15 10:15:00   *******                            ]
2018-09-15 10:30:00   ******                             ]
2018-09-15 10:45:00   ********                           ]
2018-09-15 11:00:00   ******                         ]
2018-09-15 11:15:00   ****                               ]
2018-09-15 11:30:00   ******                         ]
2018-09-15 11:45:00   *****                          ]
2018-09-15 12:00:00   *****                  ]
2018-09-15 12:15:00   ****                   ]
2018-09-15 12:30:00   ****                   ]
2018-09-15 12:45:00   ****                   ]
2018-09-15 13:00:00   ****                   ]
2018-09-15 13:15:00   *****                     ]
2018-09-15 13:30:00   ****                   ]
2018-09-15 13:45:00   ****                   ]
2018-09-15 14:00:00   ****                   ]
2018-09-15 14:15:00   *****                     ]
2018-09-15 14:30:00   *****                  ]
2018-09-15 14:45:00   ****                   ]
2018-09-15 15:00:00   ****                      ]
2018-09-15 15:15:00   ****                   ]
2018-09-15 15:30:00   ****                   ]
2018-09-15 15:45:00   ****                   ]
2018-09-15 16:00:00   ****                   ]
2018-09-15 16:15:00   ****                   ]
2018-09-15 16:30:00   ****                   ]
2018-09-15 16:45:00   ****                   ]

This method can easily be modified to use 5-minute, 10-minute, or other windows simply by replacing each “15” above with the corresponding window length in minutes; a 5-minute version is sketched below. The method doesn’t require even division within an hour, either. For example, if you used 13 minutes for your sampling windows you’d get windows beginning at 00, 13, 26, and 39 minutes after the hour, each lasting 13 minutes, plus a final remainder window of 8 minutes starting at 52 minutes after the hour.
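Assuming the same sample_counts subquery as above, the 5-minute variant of the inner aggregation could read:

-- 5-minute windows: only the window length and the divisor change
  SELECT TRUNC(sample_time, 'hh24')
          + FLOOR(TO_NUMBER(TO_CHAR(sample_time, 'mi'), '99') / 5) * 5 / 1440 time,
         ROUND(10 * SUM(cnt) / (5 * 60)) avg_sess_per_window,
         MAX(cnt) max_sess_per_window
    FROM sample_counts
GROUP BY TRUNC(sample_time, 'hh24')
          + FLOOR(TO_NUMBER(TO_CHAR(sample_time, 'mi'), '99') / 5) * 5 / 1440;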

And finally, the same idea can be used with arbitrary time windows. Instead of having a fixed duration for each window, you simply divide each window’s count by the duration (in seconds) of that window. Below I’ve selected 5 different windows ranging in duration from 1 minute up to an hour. The same approach as above still applies.

SQL> WITH
  2      sample_times
  3      AS
  4          (SELECT TO_DATE('2018-09-15 07:00:00', 'yyyy-mm-dd hh24:mi:ss') start_time,
  5                  TO_DATE('2018-09-15 08:00:00', 'yyyy-mm-dd hh24:mi:ss') end_time
  6             FROM DUAL
  7           UNION ALL
  8           SELECT TO_DATE('2018-09-15 08:00:00', 'yyyy-mm-dd hh24:mi:ss') start_time,
  9                  TO_DATE('2018-09-15 08:05:00', 'yyyy-mm-dd hh24:mi:ss') end_time
 10             FROM DUAL
 11           UNION ALL
 12           SELECT TO_DATE('2018-09-15 08:30:00', 'yyyy-mm-dd hh24:mi:ss') start_time,
 13                  TO_DATE('2018-09-15 09:00:00', 'yyyy-mm-dd hh24:mi:ss') end_time
 14             FROM DUAL
 15           UNION ALL
 16           SELECT TO_DATE('2018-09-15 13:00:00', 'yyyy-mm-dd hh24:mi:ss') start_time,
 17                  TO_DATE('2018-09-15 13:01:00', 'yyyy-mm-dd hh24:mi:ss') end_time
 18             FROM DUAL
 19           UNION ALL
 20           SELECT TO_DATE('2018-09-15 13:30:00', 'yyyy-mm-dd hh24:mi:ss') start_time,
 21                  TO_DATE('2018-09-15 14:00:00', 'yyyy-mm-dd hh24:mi:ss') end_time
 22             FROM DUAL),
 23      sample_counts
 24      AS
 25          (  SELECT ash.sample_time, COUNT(ash.sample_id) cnt
 26               FROM dba_hist_active_sess_history ash, sample_times t
 27              WHERE ash.sample_time >= t.start_time
 28                AND ash.sample_time < t.end_time
 29           GROUP BY ash.sample_time)
 30    SELECT start_time,
 31           end_time,
 32           RPAD('*', avg_sess_per_window, '*')
 33            || LPAD(']', max_sess_per_window - avg_sess_per_window) chart
 34      FROM (  SELECT t.start_time,
 35                     t.end_time,
 36                     ROUND(10 * SUM(c.cnt) /
 37                         ((t.end_time - t.start_time) * 86400)) avg_sess_per_window,
 38                     MAX(c.cnt) max_sess_per_window
 39                FROM sample_counts c, sample_times t
 40               WHERE c.sample_time >= t.start_time
 41                 AND c.sample_time < t.end_time
 42            GROUP BY t.start_time, t.end_time)
 43  ORDER BY 1;
START_TIME            END_TIME              CHART
2018-09-15 07:00:00   2018-09-15 08:00:00   ************                                       ]
2018-09-15 08:00:00   2018-09-15 08:05:00   **********                         ]
2018-09-15 08:30:00   2018-09-15 09:00:00   **********                             ]
2018-09-15 13:00:00   2018-09-15 13:01:00   *****   ]
2018-09-15 13:30:00   2018-09-15 14:00:00   ****                   ]

Hopefully these queries and the previous articles give you a good start in pulling ASH data with standard SQL tools, both numerically and with text-based charting. In the next chapter we’ll look at using other tools with graphical capabilities to make more sophisticated charts with finer visual granularity.

Using collections to create variable parameter functions.

If you’re familiar with Java, C, or C# (or, I’m sure, other languages as well) then you’ve probably encountered functions that accept a variable number of input parameters without requiring a fixed set of overloads.

Java

int my_variable_function(String str, int... intlist)

c

int my_variable_function(char *str, ...)

c#

int my_variable_function(string str, params int[] intlist)

PL/SQL does not expose a similar syntax for user-defined functions. However, it is possible to pass in a collection containing a variable number of values to achieve the same functionality, albeit with a slightly bulkier syntax.

For example…

FUNCTION my_variable_function(str IN VARCHAR2,
                              intlist IN inttab := NULL)
       RETURN INTEGER
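To experiment with the idea, here is a minimal sketch of the supporting pieces; the inttab type and the trivial counting body are illustrative only:

CREATE OR REPLACE TYPE inttab AS TABLE OF INTEGER;
/

CREATE OR REPLACE FUNCTION my_variable_function(str IN VARCHAR2,
                                                intlist IN inttab := NULL)
    RETURN INTEGER
IS
BEGIN
    -- Treat a NULL collection (no list supplied) the same as an empty one
    IF intlist IS NULL
    THEN
        RETURN 0;
    END IF;

    RETURN intlist.COUNT;
END;
/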

Such a function could then be used as follows:

SELECT my_variable_function('test one'),
       my_variable_function('test two', inttab(1, 2, 3)),
       my_variable_function('test three', inttab()),
       my_variable_function('test four', NULL),
       my_variable_function('test five', inttab(9,8,7,6,5,4,3,2,1))
  FROM DUAL;

One of the more common requests I’ve received, and one I solved with this technique, is a multi-format date validator and/or date constructor. First, we need a collection type. In the example above I had a collection of integers; in this next one it will be a collection of VARCHAR2 values, defined as follows:

CREATE TYPE vctab AS TABLE OF VARCHAR2(4000);

Then we can define a function with the collection parameter like this:

CREATE FUNCTION dynamic_to_date(p_str IN VARCHAR2,
                                p_formats IN vctab)
    RETURN DATE
    DETERMINISTIC
IS
    v_temp         DATE;
    v_successful   BOOLEAN := FALSE;
    v_index        INTEGER := p_formats.FIRST;
BEGIN
    WHILE v_index <= p_formats.LAST AND NOT v_successful
    LOOP
        BEGIN
            v_temp := TO_DATE(p_str, p_formats(v_index));
            v_successful := TRUE;
        EXCEPTION
            WHEN OTHERS
            THEN
                -- Conversion failed, try next format in the list
                v_index := v_index + 1;
        END;
    END LOOP;

    IF v_successful
    THEN
        RETURN v_temp;
    ELSE
        RAISE VALUE_ERROR;
    END IF;
END;

And then invoke it as follows:

SQL> SELECT dynamic_to_date('21 AUG 2009',
  2                         vctab('DD/MM/YYYY', 'DD/MM/YYYY HH24:MI', 'DD MON YYYY'))
  3    FROM DUAL;

DYNAMIC_TO_DATE('21
-------------------
2009-08-21 00:00:00

SQL> SELECT dynamic_to_date('21 AUG 2009 AD',
  2                         vctab('DD/MM/YYYY', 'DD/MM/YYYY HH24:MI', 'DD MON YYYY'))
  3    FROM dual;
SELECT dynamic_to_date('21 AUG 2009 AD',
       *
ERROR at line 1:
ORA-06502: PL/SQL: numeric or value error
ORA-06512: at "SDS.DYNAMIC_TO_DATE", line 26

Often when I build these types of functions I will include an optional parameter to provide alternate error handling: for example, raising an invoker-defined message on exception, or returning NULL.

CREATE FUNCTION dynamic_to_date(p_str IN VARCHAR2,
                                p_formats IN vctab,
                                p_on_error IN VARCHAR2 := NULL)
    RETURN DATE
    DETERMINISTIC
IS   
    v_temp    DATE;
    v_ok      BOOLEAN := FALSE;
    v_index   INTEGER := p_formats.FIRST;
BEGIN
    WHILE v_index <= p_formats.LAST AND NOT v_ok
    LOOP
        BEGIN
            v_temp := TO_DATE(p_str, p_formats(v_index));
            v_ok := TRUE;
        EXCEPTION
            WHEN OTHERS
            THEN
                -- Conversion failed, try next format in the list
                v_index := v_index + 1;
        END;
    END LOOP;

    IF v_ok
    THEN
        RETURN v_temp;
    ELSIF p_on_error IS NULL
    THEN
        RETURN NULL;
    ELSE
        raise_application_error(-20001, p_on_error, FALSE);
    END IF;
END dynamic_to_date;

Then we can use the function the same way on success, but with different results on failure.

SQL> SELECT dynamic_to_date('21 AUG 2009',
  2                         vctab('DD/MM/YYYY', 'DD/MM/YYYY HH24:MI', 'DD MON YYYY'))
  3    FROM DUAL;

DYNAMIC_TO_DATE('21
-------------------
2009-08-21 00:00:00

SQL> SELECT dynamic_to_date('21 AUG 2009 AD',
  2                         vctab('DD/MM/YYYY', 'DD/MM/YYYY HH24:MI', 'DD MON YYYY'))
  3    FROM dual;

DYNAMIC_TO_DATE('21
-------------------


SQL> SELECT dynamic_to_date('21 AUG 2009 AD',
  2                         vctab('DD/MM/YYYY', 'DD/MM/YYYY HH24:MI', 'DD MON YYYY'),
  3                         'Input string does not match any of the supplied formats')
  4    FROM dual;
SELECT dynamic_to_date('21 AUG 2009 AD',
       *
ERROR at line 1:
ORA-20001: Input string does not match any of the supplied formats
ORA-06512: at "SDS.DYNAMIC_TO_DATE", line 29
ORA-06512: at line 1

Using the same idea you can extend your own functions with input collections of dates, numbers, user-defined types, or even ANYDATA values; one more quick sketch along those lines is below. I hope you find it useful.
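For instance, a date-list variant needs nothing more than another collection type; the names and the body here are illustrative only:

CREATE OR REPLACE TYPE datetab AS TABLE OF DATE;
/

-- Illustrative sketch: return the earliest non-NULL date in the list
CREATE OR REPLACE FUNCTION earliest_date(p_dates IN datetab)
    RETURN DATE
IS
    v_min   DATE;
BEGIN
    FOR i IN 1 .. p_dates.COUNT
    LOOP
        IF p_dates(i) IS NOT NULL
           AND (v_min IS NULL OR p_dates(i) < v_min)
        THEN
            v_min := p_dates(i);
        END IF;
    END LOOP;

    RETURN v_min;
END;
/

SELECT earliest_date(datetab(DATE '2009-08-21', DATE '2009-01-01')) FROM DUAL;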

Introduction to ASH Data, part 2

Previously I showed a few queries using (G)V$ACTIVE_SESSION_HISTORY, but only mentioned the DBA_HIST_ACTIVE_SESS_HISTORY view. Here I’ll extend the query patterns into this historical repository.

Where V$ACTIVE_SESSION_HISTORY is a once-per-second snapshot of sessions, captured from memory and kept in memory, the DBA_HIST_ACTIVE_SESS_HISTORY data is persisted to disk. One out of every ten snapshots of V$ACTIVE_SESSION_HISTORY data is sampled and preserved in the Automatic Workload Repository (AWR.) Thus AWR session data is a sampling of samples. Remembering the partial nature of the data is important: it means you cannot use AWR as a reliable means of capturing all activity in a database, and it can also affect the math of aggregations.
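You can verify the roughly ten-second spacing on your own system by examining the gaps between retained sample times; a quick check over the same window as the examples below might look like this:

SELECT sample_time,
       sample_time - LAG(sample_time) OVER (ORDER BY sample_time) gap
  FROM (SELECT DISTINCT sample_time
          FROM dba_hist_active_sess_history
         WHERE sample_time >= TIMESTAMP '2018-09-20 10:03:00'
           AND sample_time < TIMESTAMP '2018-09-20 10:07:00')
 ORDER BY sample_time;

Each gap should come back as an interval of about ten seconds.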

Before we get into the complications, let’s examine some of the simple uses. First, we can use it much as shown in the previous article, but with less granularity.

SQL>   SELECT sample_time, COUNT(*) cnt, RPAD('*', COUNT(*), '*') chart
  2      FROM dba_hist_active_sess_history
  3     WHERE sample_time >= TIMESTAMP '2018-09-20 10:03:00'
  4       AND sample_time < TIMESTAMP '2018-09-20 10:07:00'
  5  GROUP BY sample_time
  6  ORDER BY sample_time;

SAMPLE_TIME                       CNT CHART
----------------------------- ------- ---------------
2018-09-20 10:03:07.754000000       9 *********
2018-09-20 10:03:17.764000000      10 **********
2018-09-20 10:03:27.784000000       9 *********
2018-09-20 10:03:37.794000000       8 ********
2018-09-20 10:03:47.863000000       8 ********
2018-09-20 10:03:57.873000000       7 *******
2018-09-20 10:04:07.893000000      10 **********
2018-09-20 10:04:17.913000000       7 *******
2018-09-20 10:04:27.923000000       6 ******
2018-09-20 10:04:37.933000000       9 *********
2018-09-20 10:04:48.003000000      12 ************
2018-09-20 10:04:58.023000000       8 ********
2018-09-20 10:05:08.033000000       8 ********
2018-09-20 10:05:18.043000000       7 *******
2018-09-20 10:05:28.063000000      12 ************
2018-09-20 10:05:38.073000000      11 ***********
2018-09-20 10:05:48.142000000      10 **********
2018-09-20 10:05:58.152000000      10 **********
2018-09-20 10:06:08.162000000      12 ************
2018-09-20 10:06:18.182000000       9 *********
2018-09-20 10:06:28.192000000      10 **********
2018-09-20 10:06:38.202000000       9 *********
2018-09-20 10:06:48.272000000       9 *********
2018-09-20 10:06:58.282000000       8 ********

As in my first article, these examples are from a single-instance system, so I don’t need to distinguish between different instance numbers.
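On a RAC system the same queries apply; you would likely just add the instance to the select list and grouping, along these lines:

  SELECT instance_number, sample_time, COUNT(*) cnt
    FROM dba_hist_active_sess_history
   WHERE sample_time >= TIMESTAMP '2018-09-20 10:03:00'
     AND sample_time < TIMESTAMP '2018-09-20 10:07:00'
GROUP BY instance_number, sample_time
ORDER BY instance_number, sample_time;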

The activity chart above is adequate to get a general idea of how much work is on your system; but it’s often more useful to determine what kind of work is happening. Are the sessions on CPU, actively trying to process something, or are they waiting?

SQL>   SELECT sample_time, session_state activity
  2      FROM dba_hist_active_sess_history
  3     WHERE sample_time >= TIMESTAMP '2018-09-20 10:03:00' AND sample_time < TIMESTAMP '2018-09-20 10:07:00'
  4  ORDER BY sample_time, session_state;

SAMPLE_TIME                ACTIVITY
-------------------------- --------
20-SEP-18 10.03.07.754 AM  ON CPU
20-SEP-18 10.03.07.754 AM  ON CPU
20-SEP-18 10.03.07.754 AM  WAITING
20-SEP-18 10.03.07.754 AM  WAITING
20-SEP-18 10.03.07.754 AM  WAITING
20-SEP-18 10.03.07.754 AM  WAITING
20-SEP-18 10.03.07.754 AM  WAITING
20-SEP-18 10.03.07.754 AM  WAITING
20-SEP-18 10.03.07.754 AM  WAITING
20-SEP-18 10.03.17.764 AM  ON CPU
20-SEP-18 10.03.17.764 AM  WAITING
20-SEP-18 10.03.17.764 AM  WAITING
20-SEP-18 10.03.17.764 AM  WAITING
20-SEP-18 10.03.17.764 AM  WAITING
20-SEP-18 10.03.17.764 AM  WAITING
20-SEP-18 10.03.17.764 AM  WAITING
20-SEP-18 10.03.17.764 AM  WAITING
20-SEP-18 10.03.17.764 AM  WAITING
20-SEP-18 10.03.17.764 AM  WAITING
...

Given these two pieces of information, you can produce charts similar to those shown before, but indicating which type of activity is happening at each snapshot. Rather than simply using RPAD, we’ll use LISTAGG with a CASE expression to determine which character to concatenate into each line. Here I use an asterisk (*) for sessions on CPU, and a dash (-) for sessions waiting.

SQL>   SELECT sample_time,
  2           LISTAGG(CASE WHEN session_state = 'ON CPU' THEN '*' ELSE '-' END)
  3                    WITHIN GROUP (ORDER BY session_state) activity
  4      FROM dba_hist_active_sess_history
  5     WHERE sample_time >= TIMESTAMP '2018-09-20 10:03:00'
  6       AND sample_time < TIMESTAMP '2018-09-20 10:07:00'
  7  GROUP BY sample_time
  8  ORDER BY sample_time;
SAMPLE_TIME                     ACTIVITY
2018-09-20 10:03:07.754000000   **-------
2018-09-20 10:03:17.764000000   *---------
2018-09-20 10:03:27.784000000   ****-----
2018-09-20 10:03:37.794000000   **------
2018-09-20 10:03:47.863000000   ****----
2018-09-20 10:03:57.873000000   **-----
2018-09-20 10:04:07.893000000   **--------
2018-09-20 10:04:17.913000000   *------
2018-09-20 10:04:27.923000000   ****--
2018-09-20 10:04:37.933000000   ****-----
2018-09-20 10:04:48.003000000   *********---
2018-09-20 10:04:58.023000000   **------
2018-09-20 10:05:08.033000000   *****---
2018-09-20 10:05:18.043000000   *****--
2018-09-20 10:05:28.063000000   *******-----
2018-09-20 10:05:38.073000000   *********--
2018-09-20 10:05:48.142000000   ****------
2018-09-20 10:05:58.152000000   *********-
2018-09-20 10:06:08.162000000   **********--
2018-09-20 10:06:18.182000000   ********-
2018-09-20 10:06:28.192000000   *********-
2018-09-20 10:06:38.202000000   ****-----
2018-09-20 10:06:48.272000000   *****----
2018-09-20 10:06:58.282000000   ****----

ASH carries even more descriptive information about waiting sessions, including the wait classes and specific events for each session. However, it becomes harder to create a text-based chart with quickly identifiable symbols that captures all of the activity types.

Here, I generally limit the chart to 3 character types at most: one for CPU, one for some specific wait class or event, and the last for everything else. For example, one class I frequently want to track is User I/O, so I will add a plus sign (+) for each session in that wait class. In this particular sampling window, we can see some I/O waits, but not a lot, indicating there probably wasn’t an I/O problem during that time.

SQL>   SELECT sample_time,
  2           LISTAGG(CASE WHEN session_state = 'ON CPU' THEN '*'
  3                        WHEN wait_class = 'User I/O' THEN '+'
  4                        ELSE '-' END)
  5               WITHIN GROUP (ORDER BY session_state,
  6                             CASE WHEN wait_class = 'User I/O' THEN 1 ELSE 2 END) activity
  7      FROM dba_hist_active_sess_history
  8     WHERE sample_time >= TIMESTAMP '2018-09-20 10:03:00'
  9       AND sample_time < TIMESTAMP '2018-09-20 10:07:00'
 10  GROUP BY sample_time
 11  ORDER BY sample_time;
SAMPLE_TIME                     ACTIVITY
2018-09-20 10:03:07.754000000   **+------
2018-09-20 10:03:17.764000000   *---------
2018-09-20 10:03:27.784000000   ****+----
2018-09-20 10:03:37.794000000   **------
2018-09-20 10:03:47.863000000   ****----
2018-09-20 10:03:57.873000000   **+----
2018-09-20 10:04:07.893000000   **--------
2018-09-20 10:04:17.913000000   *------
2018-09-20 10:04:27.923000000   ****--
2018-09-20 10:04:37.933000000   ****+----
2018-09-20 10:04:48.003000000   *********++-
2018-09-20 10:04:58.023000000   **+-----
2018-09-20 10:05:08.033000000   *****++-
2018-09-20 10:05:18.043000000   *****--
2018-09-20 10:05:28.063000000   *******-----
2018-09-20 10:05:38.073000000   *********--
2018-09-20 10:05:48.142000000   ****------
2018-09-20 10:05:58.152000000   *********-
2018-09-20 10:06:08.162000000   **********+-
2018-09-20 10:06:18.182000000   ********+
2018-09-20 10:06:28.192000000   *********+
2018-09-20 10:06:38.202000000   ****++---
2018-09-20 10:06:48.272000000   *****----
2018-09-20 10:06:58.282000000   ****+---

And finally, as in the first article, if you’re using Oracle’s SQLcl tool with ansiconsole formatting, you can add color to your chart to help highlight the different activity types. Here I’ll color the ON CPU sessions green and the User I/O sessions red.

  SELECT sample_time,
         regexp_replace(regexp_replace(
         LISTAGG(CASE WHEN session_state = 'ON CPU' THEN '*' 
                      WHEN wait_class = 'User I/O' THEN '+' 
                      ELSE '-' END)
             WITHIN GROUP (ORDER BY session_state, 
                           CASE WHEN wait_class = 'User I/O' THEN 1 ELSE 2 END)
         ,'(\*+)','@|bg_green \1|@')
         ,'(\++)','@|bg_red \1|@')      
          activity
    FROM dba_hist_active_sess_history
   WHERE sample_time >= TIMESTAMP '2018-09-20 10:03:00'
     AND sample_time < TIMESTAMP '2018-09-20 10:07:00'
GROUP BY sample_time
ORDER BY sample_time;

All of the examples to this point have kept each snapshot time distinct from the others. Next time we’ll look at grouping snapshots across time. As always, questions and comments are welcome.