Converting ISO8601 week-numbering text to date

Today I was asked how to convert IYYY and IW values into dates, and in particular how to convert the day-of-week value, since there is no “ID” format mask. As it turns out, it wouldn’t matter if there were, because the “I” format masks are not supported in the TO_* datetime conversion functions. They can only be used in TO_CHAR conversions from datetime values.

The week-numbering calendar described in the ISO standard starts on the first Monday prior to the first Thursday that occurs within the Gregorian calendar year. Furthermore, the standard defines two formats for representing dates in the ISO week-numbering system: an extended form, yyyy-Www-d, and a compact form, yyyyWwwd.

Here W is the literal W character, ww is the ISO week number 1-53, d is the day of the ISO week 1-7 for Monday through Sunday, and, of course, yyyy is the ISO year, which mostly corresponds with the Gregorian calendar year except for the first few and last few days of each year, which can fall into a different ISO year than the Gregorian year.

The first tricky part of this problem was finding the first Thursday. I couldn’t use day-of-week numbers because those vary by NLS_TERRITORY and I couldn’t use names because those vary by NLS_LANGUAGE. The easiest solution to this I could think of was to use a date that I already knew to be a Thursday and then extract the day name for that date. Doing this means I will always be able to refer to the “Thursday” day by the correct name, regardless of the current session’s NLS_LANGUAGE setting. It’s a tiny bit of overhead but neatly solves the problem.
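
As a quick illustration of that lookup (a sketch; the exact abbreviations depend on the NLS data installed), the same known Thursday yields different names under different date languages, and NEXT_DAY expects whichever one matches the current session:

SELECT TO_CHAR(DATE '1971-01-07', 'DY', 'NLS_DATE_LANGUAGE=AMERICAN') us_name,  -- THU
       TO_CHAR(DATE '1971-01-07', 'DY', 'NLS_DATE_LANGUAGE=FRENCH') fr_name     -- a French abbreviation of jeudi
  FROM DUAL;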

Next was validating the inputs to make sure they follow one of the two standard formats. A regular expression nicely covers both. Then a few substring extractions pull out the year, week, and day numbers. Again, I validate the values so users can’t pass in week 0, week 86, or some other out-of-range value.

The math to construct a date from the ISO values is then fairly straightforward. Start with Thursday, then find Monday, then increment for weeks and days to get the final result.
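
As a worked example using the same arithmetic as the function below, 2019-W03-4 resolves to 2019-01-17 (matching the sample run further down):

SELECT NEXT_DAY(DATE '2019-01-01' - 1,
                TO_CHAR(DATE '1971-01-07', 'DY'))   -- first Thursday of 2019: Jan 3
         - 4                                        -- back up so that day 1 lands on Monday, Dec 31 2018
         + ((3 - 1) * 7)                            -- two full weeks for week 3
         + 4                                        -- day 4 of the ISO week (Thursday)
           AS iso_2019_w03_4
  FROM DUAL;                                        -- 2019-01-17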

One last check ensures that a week-53 input applied to a 52-week year is also rejected.

The final function using the above logic ended up looking like this:

CREATE OR REPLACE FUNCTION isoweekdate(p_date IN VARCHAR2)
    RETURN DATE
IS
    --                    .///.
    --                   (0 o)
    ---------------0000--(_)--0000---------------
    --
    --  Sean D. Stuber
    --  sean.stuber@gmail.com
    --
    --             oooO      Oooo
    --------------(   )-----(   )---------------
    --             \ (       ) /
    --              \_)     (_/

    v_result         DATE;

    v_week           PLS_INTEGER;
    v_day            PLS_INTEGER;
    -- Get the weekday name of a known Thursday.  Doing it this way
    -- ensures the function will work regardless of the NLS_LANGUAGE setting
    --- 1971 is first year recognizing this format, so use its first Thursday
    v_nls_thursday   VARCHAR2(3) := TO_CHAR(DATE '1971-1-7', 'DY');
BEGIN
    --- Assert the input string is an ISO8601 week date string in Extended or Compact form
    ---  Extended:  yyyy-Www-d
    ---  Compact:   yyyyWwwd
    --- where "W" is the literal W character, ww is the ISO week number,
    --- and d is the ISO day number (1=Monday through 7=Sunday)
    IF NOT REGEXP_LIKE(p_date, '^[0-9]{4}-W[0-9]{2}-[1-7]$|^[0-9]{4}W[0-9]{2}[1-7]$')
    THEN
        raise_application_error(-20001, 'Invalid format, must be yyyy-Www-d or yyyyWwwd');
    END IF;

    v_week := TO_NUMBER(SUBSTR(p_date, INSTR(p_date, 'W') + 1, 2), '99');

    IF v_week NOT BETWEEN 1 AND 53
    THEN
        raise_application_error(-20002, 'Illegal week number, must be between 1 and 53');
    END IF;

    v_day := TO_NUMBER(SUBSTR(p_date, -1), '9');

    v_result :=
          NEXT_DAY(TRUNC(TO_DATE(SUBSTR(p_date, 1, 4), 'yyyy'), 'yyyy') - 1, v_nls_thursday) -- find first Thursday in the Gregorian calendar year
        - 4                                                          -- Go back four days so adding the day number lands day 1 on the Monday starting the ISO year
        + ((v_week - 1) * 7)                                         -- Increment for the number of weeks
        + v_day;                                                     -- Increment for the day number, 1=Monday, 7=Sunday


    -- If someone tries to use Week 53 in a 52-week year
    --   the result will improperly shift into the following year.
    -- Check for that condition and raise exception if it happens
    IF v_week = 53 AND TO_CHAR(v_result, 'iyyy') > SUBSTR(p_date, 1, 4)
    THEN
        raise_application_error(-20003, 'Illegal week 53 in a 52 week year');
    END IF;

    RETURN v_result;
END;

Some sample usage:

SQL> select isoweekdate('2004-W53-6') from dual;
ISOWEEKDATE('2004-W53-6')
2005-01-01

SQL> select isoweekdate('2006-W01-1') from dual;
ISOWEEKDATE('2006-W01-1')
2006-01-02

SQL> select isoweekdate('2019-W03-4') from dual;
ISOWEEKDATE('2019-W03-4')
2019-01-17

SQL> select isoweekdate('2019W041') from dual;
ISOWEEKDATE('2019W041')
2019-01-21

SQL> select isoweekdate('2019W042') from dual;
ISOWEEKDATE('2019W042')
2019-01-22

This should work well for my friend; hopefully it will help others as well.

Introduction to ASH Data, part 3

In the first two chapters of this series I queried the ASH data with respect to distinct sample times. While this is helpful for maximizing the granularity of the analysis, it can sometimes be too much detail: too many data points for the eye to follow, or, if run through a graphical engine, so many elements that the resulting chart becomes slow to render. It’s also possible the resolution of the resulting chart is too low to display all of the points distinctly, making the abundance of data redundant and useless.

Furthermore, sometimes what you’re looking for isn’t a detailed play-by-play of all activity but a general picture. You know there will be peaks and valleys in the system activity; but how busy is it on average?

Also, any one sample might be unusually high or unusually low, but viewed more broadly, the overall activity within a given period may be exactly what you’d expect.

All of these lead to one of the most common aggregations across time with ASH data: Average Active Sessions (AAS.) The idea behind the calculation is simple enough: count up the sessions for each second of sampling and then divide by the total number of seconds within the period, yielding the average number of active sessions at any point in time. The math is simple for V$ACTIVE_SESSION_HISTORY because that view’s contents are sampled each second. DBA_HIST_ACTIVE_SESS_HISTORY, however, is only populated with samples once every 10 seconds, so creating an average requires extrapolating for the missing time slices by multiplying the counts you do have by 10.
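
As a minimal sketch of the basic calculation, here is a single AAS value for the same half-hour window used in the examples below; the factor of 10 compensates for the 10-second sample interval:

SELECT ROUND(10 * COUNT(*) / (30 * 60), 1) avg_active_sessions
  FROM dba_hist_active_sess_history
 WHERE sample_time >= TIMESTAMP '2018-09-15 13:00:00'
   AND sample_time < TIMESTAMP '2018-09-15 13:30:00';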

In the following example the active sessions counts are averaged over each minute (thus averaging over 60 seconds) within a half-hour span.

SQL>   SELECT TRUNC(sample_time, 'mi') time,
  2           ROUND(10 * COUNT(*) / 60) avg_sess_per_minute
  3      FROM dba_hist_active_sess_history
  4     WHERE sample_time >= TIMESTAMP '2018-09-15 13:00:00'
  5       AND sample_time < TIMESTAMP '2018-09-15 13:30:00'
  6  GROUP BY TRUNC(sample_time, 'mi')
  7  ORDER BY 1;
TIME                    AVG_SESS_PER_MINUTE
2018-09-15 13:00:00                       5
2018-09-15 13:01:00                       5
2018-09-15 13:02:00                       3
2018-09-15 13:03:00                       3
2018-09-15 13:04:00                       5
2018-09-15 13:05:00                       7
2018-09-15 13:06:00                       5
2018-09-15 13:07:00                       4
2018-09-15 13:08:00                       3
2018-09-15 13:09:00                       5
2018-09-15 13:10:00                       4
2018-09-15 13:11:00                       4
2018-09-15 13:12:00                       7
2018-09-15 13:13:00                       3
2018-09-15 13:14:00                       3
2018-09-15 13:15:00                       5
2018-09-15 13:16:00                       8
2018-09-15 13:17:00                       6
2018-09-15 13:18:00                      10
2018-09-15 13:19:00                       6
2018-09-15 13:20:00                       9
2018-09-15 13:21:00                       6
2018-09-15 13:22:00                       6
2018-09-15 13:23:00                       4
2018-09-15 13:24:00                       3
2018-09-15 13:25:00                       3
2018-09-15 13:26:00                       4
2018-09-15 13:27:00                       4
2018-09-15 13:28:00                       7
2018-09-15 13:29:00                       3

30 rows selected.

Another approach is to count the sessions for each sample and then apply the averaging outside. This can be less efficient since it involves multiple aggregations; but it can be a simpler construction.

SQL> WITH
  2      sample_counts
  3      AS
  4          (  SELECT sample_time, COUNT(*) cnt
  5               FROM dba_hist_active_sess_history
  6              WHERE sample_time >= TIMESTAMP '2018-09-15 13:00:00'
  7                AND sample_time < TIMESTAMP '2018-09-15 13:30:00'
  8           GROUP BY sample_time)
  9    SELECT TRUNC(sample_time, 'mi'),
 10           ROUND(10 * SUM(cnt) / 60) avg_sess_per_minute
 11      FROM sample_counts
 12  GROUP BY TRUNC(sample_time, 'mi')
 13  ORDER BY 1;
TRUNC(SAMPLE_TIME,'MI')     AVG_SESS_PER_MINUTE
2018-09-15 13:00:00                           5
2018-09-15 13:01:00                           5
2018-09-15 13:02:00                           3
2018-09-15 13:03:00                           3
2018-09-15 13:04:00                           5
2018-09-15 13:05:00                           7
2018-09-15 13:06:00                           5
2018-09-15 13:07:00                           4
2018-09-15 13:08:00                           3
2018-09-15 13:09:00                           5
2018-09-15 13:10:00                           4
2018-09-15 13:11:00                           4
2018-09-15 13:12:00                           7
2018-09-15 13:13:00                           3
2018-09-15 13:14:00                           3
2018-09-15 13:15:00                           5
2018-09-15 13:16:00                           8
2018-09-15 13:17:00                           6
2018-09-15 13:18:00                          10
2018-09-15 13:19:00                           6
2018-09-15 13:20:00                           9
2018-09-15 13:21:00                           6
2018-09-15 13:22:00                           6
2018-09-15 13:23:00                           4
2018-09-15 13:24:00                           3
2018-09-15 13:25:00                           3
2018-09-15 13:26:00                           4
2018-09-15 13:27:00                           4
2018-09-15 13:28:00                           7
2018-09-15 13:29:00                           3

30 rows selected.

Next, we’ll extend the previous query by adding a maximum value. It’s important to remember here that DBA_HIST_ACTIVE_SESS_HISTORY is a collection of samples of samples, so the MAX value seen here could still be lower than the actual active-session peak your system experienced. Again, these results should not be taken as absolutes, but they do offer an expanded view.

SQL> WITH
  2      sample_counts
  3      AS
  4          (  SELECT sample_time, COUNT(*) cnt
  5               FROM dba_hist_active_sess_history
  6              WHERE sample_time >= TIMESTAMP '2018-09-15 13:00:00'
  7                AND sample_time < TIMESTAMP '2018-09-15 13:30:00'
  8           GROUP BY sample_time)
  9    SELECT TRUNC(sample_time, 'mi'),
 10           ROUND(10 * SUM(cnt) / 60) avg_sess_per_minute,
 11           MAX(cnt) max_sess_per_minute
 12      FROM sample_counts
 13  GROUP BY TRUNC(sample_time, 'mi')
 14  ORDER BY 1;
TRUNC(SAMPLE_TIME,'MI')     AVG_SESS_PER_MINUTE   MAX_SESS_PER_MINUTE
2018-09-15 13:00:00                           5                     9
2018-09-15 13:01:00                           5                     9
2018-09-15 13:02:00                           3                     3
2018-09-15 13:03:00                           3                     6
2018-09-15 13:04:00                           5                     6
2018-09-15 13:05:00                           7                    24
2018-09-15 13:06:00                           5                     6
2018-09-15 13:07:00                           4                     6
2018-09-15 13:08:00                           3                     3
2018-09-15 13:09:00                           5                     6
2018-09-15 13:10:00                           4                     6
2018-09-15 13:11:00                           4                     6
2018-09-15 13:12:00                           7                    24
2018-09-15 13:13:00                           3                     3
2018-09-15 13:14:00                           3                     3
2018-09-15 13:15:00                           5                     9
2018-09-15 13:16:00                           8                    24
2018-09-15 13:17:00                           6                     6
2018-09-15 13:18:00                          10                    27
2018-09-15 13:19:00                           6                     9
2018-09-15 13:20:00                           9                    27
2018-09-15 13:21:00                           6                     9
2018-09-15 13:22:00                           6                     6
2018-09-15 13:23:00                           4                     9
2018-09-15 13:24:00                           3                     3
2018-09-15 13:25:00                           3                     3
2018-09-15 13:26:00                           4                     6
2018-09-15 13:27:00                           4                     6
2018-09-15 13:28:00                           7                    24
2018-09-15 13:29:00                           3                     3

30 rows selected.

As with previous queries, these numbers can then be turned into a text-based bar chart to visualize the system activity. Here the average active sessions are charted as asterisks, with the high-watermark of maximum sessions within each minute indicated by a right bracket (]).

SQL> WITH
  2      sample_counts
  3      AS
  4          (  SELECT sample_time, COUNT(*) cnt
  5               FROM dba_hist_active_sess_history
  6              WHERE sample_time >= TIMESTAMP '2018-09-15 13:00:00'
  7                AND sample_time < TIMESTAMP '2018-09-15 13:30:00'
  8           GROUP BY sample_time)
  9    SELECT time,
 10           RPAD('*', avg_sess_per_minute, '*') ||
 11           LPAD(']',max_sess_per_minute-avg_sess_per_minute) chart2
 12      FROM (  SELECT TRUNC(sample_time, 'mi') time,
 13                     ROUND(10 * SUM(cnt) / 60) avg_sess_per_minute,
 14                     MAX(cnt) max_sess_per_minute
 15                FROM sample_counts
 16            GROUP BY TRUNC(sample_time, 'mi'))
 17  ORDER BY 1;
TIME                  CHART2
2018-09-15 13:00:00   *****   ]
2018-09-15 13:01:00   *****   ]
2018-09-15 13:02:00   ***
2018-09-15 13:03:00   ***  ]
2018-09-15 13:04:00   *****]
2018-09-15 13:05:00   *******                ]
2018-09-15 13:06:00   *****]
2018-09-15 13:07:00   **** ]
2018-09-15 13:08:00   ***
2018-09-15 13:09:00   *****]
2018-09-15 13:10:00   **** ]
2018-09-15 13:11:00   **** ]
2018-09-15 13:12:00   *******                ]
2018-09-15 13:13:00   ***
2018-09-15 13:14:00   ***
2018-09-15 13:15:00   *****   ]
2018-09-15 13:16:00   ********               ]
2018-09-15 13:17:00   ******
2018-09-15 13:18:00   **********                ]
2018-09-15 13:19:00   ******  ]
2018-09-15 13:20:00   *********                 ]
2018-09-15 13:21:00   ******  ]
2018-09-15 13:22:00   ******
2018-09-15 13:23:00   ****    ]
2018-09-15 13:24:00   ***
2018-09-15 13:25:00   ***
2018-09-15 13:26:00   **** ]
2018-09-15 13:27:00   **** ]
2018-09-15 13:28:00   *******                ]
2018-09-15 13:29:00   ***

As shown above, aggregating to the minute, or to any of the other native TRUNC levels (hour, day, week, etc.) isn’t too difficult. If you want to aggregate to another range, for example sampling in windows of 15 minutes, then the math is a little trickier but the idea is still the same. Modify the truncation logic to produce times on the hour and at 15, 30, and 45 minutes past the hour, and divide the count totals by 900 seconds (15 minutes * 60 seconds per minute.)

SQL> WITH
  2      sample_counts
  3      AS
  4          (  SELECT sample_time, COUNT(*) cnt
  5               FROM dba_hist_active_sess_history
  6              WHERE sample_time >= TIMESTAMP '2018-09-15 07:00:00'
  7                AND sample_time < TIMESTAMP '2018-09-15 17:00:00'
  8           GROUP BY sample_time)
  9    SELECT time,
 10           RPAD('*', avg_sess_per_window, '*') ||
 11           LPAD(']',max_sess_per_window-avg_sess_per_window) chart
 12      FROM (  SELECT TRUNC(sample_time, 'hh24')
 13                      + FLOOR((TO_NUMBER(TO_CHAR(sample_time, 'mi'), '99')) / 15) * 15 / 1440 time,
 14                     ROUND(10 * SUM(cnt) / (15*60)) avg_sess_per_window,
 15                     MAX(cnt) max_sess_per_window
 16                FROM sample_counts
 17            GROUP BY TRUNC(sample_time, 'hh24')
 18                      + FLOOR((TO_NUMBER(TO_CHAR(sample_time, 'mi'), '99')) / 15) * 15 / 1440)
 19  ORDER BY 1;
TIME                  CHART
2018-09-15 07:00:00   ******************                                 ]
2018-09-15 07:15:00   ***********                        ]
2018-09-15 07:30:00   **********                             ]
2018-09-15 07:45:00   *********                          ]
2018-09-15 08:00:00   **********                         ]
2018-09-15 08:15:00   **********                         ]
2018-09-15 08:30:00   **********                             ]
2018-09-15 08:45:00   *********                          ]
2018-09-15 09:00:00   ********                           ]
2018-09-15 09:15:00   ******                             ]
2018-09-15 09:30:00   *****                          ]
2018-09-15 09:45:00   *****                          ]
2018-09-15 10:00:00   ********                       ]
2018-09-15 10:15:00   *******                            ]
2018-09-15 10:30:00   ******                             ]
2018-09-15 10:45:00   ********                           ]
2018-09-15 11:00:00   ******                         ]
2018-09-15 11:15:00   ****                               ]
2018-09-15 11:30:00   ******                         ]
2018-09-15 11:45:00   *****                          ]
2018-09-15 12:00:00   *****                  ]
2018-09-15 12:15:00   ****                   ]
2018-09-15 12:30:00   ****                   ]
2018-09-15 12:45:00   ****                   ]
2018-09-15 13:00:00   ****                   ]
2018-09-15 13:15:00   *****                     ]
2018-09-15 13:30:00   ****                   ]
2018-09-15 13:45:00   ****                   ]
2018-09-15 14:00:00   ****                   ]
2018-09-15 14:15:00   *****                     ]
2018-09-15 14:30:00   *****                  ]
2018-09-15 14:45:00   ****                   ]
2018-09-15 15:00:00   ****                      ]
2018-09-15 15:15:00   ****                   ]
2018-09-15 15:30:00   ****                   ]
2018-09-15 15:45:00   ****                   ]
2018-09-15 16:00:00   ****                   ]
2018-09-15 16:15:00   ****                   ]
2018-09-15 16:30:00   ****                   ]
2018-09-15 16:45:00   ****                   ]

This method can easily be modified to use 5-minute, 10-minute, or other windows simply by replacing each “15” above with the corresponding window length in minutes (a 13-minute version is sketched below). The method doesn’t require even division within an hour. For example, if you used 13 minutes for your sampling windows you’d get windows beginning at 00, 13, 26, and 39 minutes after the hour, each lasting 13 minutes, and a final remainder window of 8 minutes starting at 52 minutes after the hour.
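
Here is a sketch of that 13-minute variant, using the same sample_counts subquery and time range as the 15-minute example above (note the final 8-minute remainder window of each hour is still divided by 13 * 60 here, so its average is slightly understated):

WITH
    sample_counts
    AS
        (  SELECT sample_time, COUNT(*) cnt
             FROM dba_hist_active_sess_history
            WHERE sample_time >= TIMESTAMP '2018-09-15 07:00:00'
              AND sample_time < TIMESTAMP '2018-09-15 17:00:00'
         GROUP BY sample_time)
  SELECT TRUNC(sample_time, 'hh24')
          + FLOOR((TO_NUMBER(TO_CHAR(sample_time, 'mi'), '99')) / 13) * 13 / 1440 time,
         ROUND(10 * SUM(cnt) / (13 * 60)) avg_sess_per_window,
         MAX(cnt) max_sess_per_window
    FROM sample_counts
GROUP BY TRUNC(sample_time, 'hh24')
          + FLOOR((TO_NUMBER(TO_CHAR(sample_time, 'mi'), '99')) / 13) * 13 / 1440
ORDER BY 1;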

And finally, the same idea can be used with arbitrary time windows. Instead of having a fixed duration for each window, you simply divide each window’s count by the duration (in seconds) of that window. Below I’ve selected 5 different windows ranging in duration from 1 minute up to an hour. The same approach as above still applies.

SQL> WITH
  2      sample_times
  3      AS
  4          (SELECT TO_DATE('2018-09-15 07:00:00', 'yyyy-mm-dd hh24:mi:ss') start_time,
  5                  TO_DATE('2018-09-15 08:00:00', 'yyyy-mm-dd hh24:mi:ss') end_time
  6             FROM DUAL
  7           UNION ALL
  8           SELECT TO_DATE('2018-09-15 08:00:00', 'yyyy-mm-dd hh24:mi:ss') start_time,
  9                  TO_DATE('2018-09-15 08:05:00', 'yyyy-mm-dd hh24:mi:ss') end_time
 10             FROM DUAL
 11           UNION ALL
 12           SELECT TO_DATE('2018-09-15 08:30:00', 'yyyy-mm-dd hh24:mi:ss') start_time,
 13                  TO_DATE('2018-09-15 09:00:00', 'yyyy-mm-dd hh24:mi:ss') end_time
 14             FROM DUAL
 15           UNION ALL
 16           SELECT TO_DATE('2018-09-15 13:00:00', 'yyyy-mm-dd hh24:mi:ss') start_time,
 17                  TO_DATE('2018-09-15 13:01:00', 'yyyy-mm-dd hh24:mi:ss') end_time
 18             FROM DUAL
 19           UNION ALL
 20           SELECT TO_DATE('2018-09-15 13:30:00', 'yyyy-mm-dd hh24:mi:ss') start_time,
 21                  TO_DATE('2018-09-15 14:00:00', 'yyyy-mm-dd hh24:mi:ss') end_time
 22             FROM DUAL),
 23      sample_counts
 24      AS
 25          (  SELECT ash.sample_time, COUNT(ash.sample_id) cnt
 26               FROM dba_hist_active_sess_history ash, sample_times t
 27              WHERE ash.sample_time >= t.start_time
 28                AND ash.sample_time < t.end_time
 29           GROUP BY ash.sample_time)
 30    SELECT start_time,
 31           end_time,
 32           RPAD('*', avg_sess_per_window, '*')
 33            || LPAD(']', max_sess_per_window - avg_sess_per_window) chart
 34      FROM (  SELECT t.start_time,
 35                     t.end_time,
 36                     ROUND(10 * SUM(c.cnt) /
 37                         ((t.end_time - t.start_time) * 86400)) avg_sess_per_window,
 38                     MAX(c.cnt) max_sess_per_window
 39                FROM sample_counts c, sample_times t
 40               WHERE c.sample_time >= t.start_time
 41                 AND c.sample_time < t.end_time
 42            GROUP BY t.start_time, t.end_time)
 43  ORDER BY 1;
START_TIME            END_TIME              CHART
2018-09-15 07:00:00   2018-09-15 08:00:00   ************                                       ]
2018-09-15 08:00:00   2018-09-15 08:05:00   **********                         ]
2018-09-15 08:30:00   2018-09-15 09:00:00   **********                             ]
2018-09-15 13:00:00   2018-09-15 13:01:00   *****   ]
2018-09-15 13:30:00   2018-09-15 14:00:00   ****                   ]

Hopefully these queries and the previous articles give a good start in pulling data with standard SQL tools, both numerically and with text-based charting. In the next chapter we’ll look at using other tools with graphical capabilities to make more sophisticated charts with finer visual granularities.

Using collections to create variable parameter functions.

If you’re familiar with Java, C, or C# (or, I’m sure, other languages as well) then you’ve probably encountered functions that allow a variable number of input parameters without requiring a fixed set of overloads.

Java

int my_variable_function(String str, int ...intlist)

C

int my_variable_function(char *str, ...)

C#

int my_variable_function(string str, params int[] intlist)

PL/SQL does not expose a similar syntax for user-defined functions. However, it is possible to pass in a collection containing a variable number of values to achieve the same functionality, albeit with a slightly bulkier syntax.

For example…

FUNCTION my_variable_function(str IN VARCHAR2,
                              intlist IN inttab:= NULL)
       RETURN INTEGER

Such a function could then be used as follows:

SELECT my_variable_function('test one'),
       my_variable_function('test two', inttab(1, 2, 3)),
       my_variable_function('test three', inttab()),
       my_variable_function('test four', NULL),
       my_variable_function('test five', inttab(9,8,7,6,5,4,3,2,1))
  FROM DUAL;
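
For those examples to actually run, the inttab collection type would need to exist. Here is a minimal sketch of the type and a trivial implementation (the body is purely illustrative; it just reports how many values were passed in):

CREATE OR REPLACE TYPE inttab AS TABLE OF INTEGER;
/

CREATE OR REPLACE FUNCTION my_variable_function(str IN VARCHAR2,
                                                intlist IN inttab := NULL)
    RETURN INTEGER
IS
BEGIN
    -- Illustrative only: return the number of values passed in
    RETURN CASE WHEN intlist IS NULL THEN 0 ELSE intlist.COUNT END;
END;
/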

One of the more common requests I’ve received that I solved with this technique is a multi-format date validator and/or date constructor. First, we need a collection type. In the example above I had a collection of integers; in this next one it will be a collection of VARCHAR2 values, defined as follows:

CREATE TYPE vctab AS TABLE OF VARCHAR2(4000);

Then we can define a function with the collection parameter like this:

CREATE FUNCTION dynamic_to_date(p_str IN VARCHAR2,
                                p_formats IN vctab)
    RETURN DATE
    DETERMINISTIC
IS
    v_temp         DATE;
    v_successful   BOOLEAN := FALSE;
    v_index        INTEGER := p_formats.FIRST;
BEGIN
    WHILE v_index <= p_formats.LAST AND NOT v_successful
    LOOP
        BEGIN
            v_temp := TO_DATE(p_str, p_formats(v_index));
            v_successful := TRUE;
        EXCEPTION
            WHEN OTHERS
            THEN
                -- Conversion failed, try next format in the list
                v_index := v_index + 1;
        END;
    END LOOP;

    IF v_successful
    THEN
        RETURN v_temp;
    ELSE
        RAISE VALUE_ERROR;
    END IF;
END;

And then invoke it as follows:

SQL> SELECT dynamic_to_date('21 AUG 2009',
  2                         vctab('DD/MM/YYYY', 'DD/MM/YYYY HH24:MI', 'DD MON YYYY'))
  3    FROM DUAL;

DYNAMIC_TO_DATE('21
-------------------
2009-08-21 00:00:00

SQL> SELECT dynamic_to_date('21 AUG 2009 AD',
  2                         vctab('DD/MM/YYYY', 'DD/MM/YYYY HH24:MI', 'DD MON YYYY'))
  3    FROM dual;
SELECT dynamic_to_date('21 AUG 2009 AD',
       *
ERROR at line 1:
ORA-06502: PL/SQL: numeric or value error
ORA-06512: at "SDS.DYNAMIC_TO_DATE", line 26

Often when I build these types of functions I will include an optional parameter to provide alternate error handling. For example, returning an invoker-defined message on exception, or returning NULL.

CREATE FUNCTION dynamic_to_date(p_str IN VARCHAR2,
                                p_formats IN vctab,
                                p_on_error IN VARCHAR2:= NULL)
    RETURN DATE
    DETERMINISTIC
IS   
    v_temp    DATE;
    v_ok      BOOLEAN := FALSE;
    v_index   INTEGER := p_formats.FIRST;
BEGIN
    WHILE v_index <= p_formats.LAST AND NOT v_ok
    LOOP
        BEGIN
            v_temp := TO_DATE(p_str, p_formats(v_index));
            v_ok := TRUE;
        EXCEPTION
            WHEN OTHERS
            THEN
                -- Conversion failed, try next format in the list
                v_index := v_index + 1;
        END;
    END LOOP;

    IF v_ok
    THEN
        RETURN v_temp;
    ELSIF p_on_error IS NULL
    THEN
        RETURN NULL;
    ELSE
        raise_application_error(-20001, p_on_error, FALSE);
    END IF;
END dynamic_to_date;

Then we can use the function the same way: identical results on success, but different results on failure.

SQL> SELECT dynamic_to_date('21 AUG 2009',
  2                         vctab('DD/MM/YYYY', 'DD/MM/YYYY HH24:MI', 'DD MON YYYY'))
  3    FROM DUAL;

DYNAMIC_TO_DATE('21
-------------------
2009-08-21 00:00:00

SQL> SELECT dynamic_to_date('21 AUG 2009 AD',
  2                         vctab('DD/MM/YYYY', 'DD/MM/YYYY HH24:MI', 'DD MON YYYY'))
  3    FROM dual;

DYNAMIC_TO_DATE('21
-------------------


SQL> SELECT dynamic_to_date('21 AUG 2009 AD',
  2                         vctab('DD/MM/YYYY', 'DD/MM/YYYY HH24:MI', 'DD MON YYYY'),
  3                         'Input string does not match any of the supplied formats')
  4    FROM dual;
SELECT dynamic_to_date('21 AUG 2009 AD',
       *
ERROR at line 1:
ORA-20001: Input string does not match any of the supplied formats
ORA-06512: at "SDS.DYNAMIC_TO_DATE", line 29
ORA-06512: at line 1

Using the same idea you can extend your own functions with input collections of dates, numbers, user-defined types, or even ANYDATA values. I hope you find it useful.

Introduction to ASH Data, part 2

Previously I showed a few queries using (G)V$ACTIVE_SESSION_HISTORY, but only mentioned the DBA_HIST_ACTIVE_SESS_HISTORY view. Here I’ll extend the query patterns into this historical repository.

Where V$ACTIVE_SESSION_HISTORY is a once-per-second snapshot of active sessions kept in memory, the DBA_HIST_ACTIVE_SESS_HISTORY data is persisted to disk. One out of every ten snapshots of V$ACTIVE_SESSION_HISTORY data is sampled and preserved in the Automatic Workload Repository (AWR.) Thus AWR session data consists of samples of samples. Remembering the partial nature of the data is important: it means you cannot use AWR as a reliable means of capturing all activity in a database, and it can also affect the math of aggregations.

Before we get into the complications, let’s examine some of the simple uses. First, we can use it much as shown in the previous article, but with less granularity.

SQL>   SELECT sample_time, COUNT(*) cnt, RPAD('*', COUNT(*), '*') chart
  2      FROM dba_hist_active_sess_history
  3     WHERE sample_time >= TIMESTAMP '2018-09-20 10:03:00'
  4       AND sample_time < TIMESTAMP '2018-09-20 10:07:00'
  5  GROUP BY sample_time
  6  ORDER BY sample_time;

SAMPLE_TIME                       CNT CHART
----------------------------- ------- ---------------
2018-09-20 10:03:07.754000000       9 *********
2018-09-20 10:03:17.764000000      10 **********
2018-09-20 10:03:27.784000000       9 *********
2018-09-20 10:03:37.794000000       8 ********
2018-09-20 10:03:47.863000000       8 ********
2018-09-20 10:03:57.873000000       7 *******
2018-09-20 10:04:07.893000000      10 **********
2018-09-20 10:04:17.913000000       7 *******
2018-09-20 10:04:27.923000000       6 ******
2018-09-20 10:04:37.933000000       9 *********
2018-09-20 10:04:48.003000000      12 ************
2018-09-20 10:04:58.023000000       8 ********
2018-09-20 10:05:08.033000000       8 ********
2018-09-20 10:05:18.043000000       7 *******
2018-09-20 10:05:28.063000000      12 ************
2018-09-20 10:05:38.073000000      11 ***********
2018-09-20 10:05:48.142000000      10 **********
2018-09-20 10:05:58.152000000      10 **********
2018-09-20 10:06:08.162000000      12 ************
2018-09-20 10:06:18.182000000       9 *********
2018-09-20 10:06:28.192000000      10 **********
2018-09-20 10:06:38.202000000       9 *********
2018-09-20 10:06:48.272000000       9 *********
2018-09-20 10:06:58.282000000       8 ********

As in my first article, these examples are from a single-instance system, so I don’t need to distinguish between different instance numbers.

The activity chart above is adequate to get a general idea of how much work is on your system; but it’s often more useful to determine what kind of work is happening. Are the sessions on CPU, actively trying to process something, or are they waiting?

SQL>   SELECT sample_time, session_state activity
  2      FROM dba_hist_active_sess_history
  3     WHERE sample_time >= TIMESTAMP '2018-09-20 10:03:00' AND sample_time < TIMESTAMP '2018-09-20 10:07:00'
  4  ORDER BY sample_time, session_state;

SAMPLE_TIME                ACTIVITY
-------------------------- --------
20-SEP-18 10.03.07.754 AM  ON CPU
20-SEP-18 10.03.07.754 AM  ON CPU
20-SEP-18 10.03.07.754 AM  WAITING
20-SEP-18 10.03.07.754 AM  WAITING
20-SEP-18 10.03.07.754 AM  WAITING
20-SEP-18 10.03.07.754 AM  WAITING
20-SEP-18 10.03.07.754 AM  WAITING
20-SEP-18 10.03.07.754 AM  WAITING
20-SEP-18 10.03.07.754 AM  WAITING
20-SEP-18 10.03.17.764 AM  ON CPU
20-SEP-18 10.03.17.764 AM  WAITING
20-SEP-18 10.03.17.764 AM  WAITING
20-SEP-18 10.03.17.764 AM  WAITING
20-SEP-18 10.03.17.764 AM  WAITING
20-SEP-18 10.03.17.764 AM  WAITING
20-SEP-18 10.03.17.764 AM  WAITING
20-SEP-18 10.03.17.764 AM  WAITING
20-SEP-18 10.03.17.764 AM  WAITING
20-SEP-18 10.03.17.764 AM  WAITING
...

Given these two pieces of information, you can produce charts similar to those shown before, but indicating which type of activity is happening at each snapshot. Rather than simply using RPAD, we’ll use LISTAGG with a CASE expression to determine which character to concatenate into each line. Here I use an asterisk (*) for sessions on CPU, and a dash (-) for sessions waiting.

SQL>   SELECT sample_time,
  2           LISTAGG(CASE WHEN session_state = 'ON CPU' THEN '*' ELSE '-' END)
  3                    WITHIN GROUP (ORDER BY session_state) activity
  4      FROM dba_hist_active_sess_history
  5     WHERE sample_time >= TIMESTAMP '2018-09-20 10:03:00'
  6       AND sample_time < TIMESTAMP '2018-09-20 10:07:00'
  7  GROUP BY sample_time
  8  ORDER BY sample_time;
SAMPLE_TIME                     ACTIVITY
2018-09-20 10:03:07.754000000   **-------
2018-09-20 10:03:17.764000000   *---------
2018-09-20 10:03:27.784000000   ****-----
2018-09-20 10:03:37.794000000   **------
2018-09-20 10:03:47.863000000   ****----
2018-09-20 10:03:57.873000000   **-----
2018-09-20 10:04:07.893000000   **--------
2018-09-20 10:04:17.913000000   *------
2018-09-20 10:04:27.923000000   ****--
2018-09-20 10:04:37.933000000   ****-----
2018-09-20 10:04:48.003000000   *********---
2018-09-20 10:04:58.023000000   **------
2018-09-20 10:05:08.033000000   *****---
2018-09-20 10:05:18.043000000   *****--
2018-09-20 10:05:28.063000000   *******-----
2018-09-20 10:05:38.073000000   *********--
2018-09-20 10:05:48.142000000   ****------
2018-09-20 10:05:58.152000000   *********-
2018-09-20 10:06:08.162000000   **********--
2018-09-20 10:06:18.182000000   ********-
2018-09-20 10:06:28.192000000   *********-
2018-09-20 10:06:38.202000000   ****-----
2018-09-20 10:06:48.272000000   *****----
2018-09-20 10:06:58.282000000   ****----

ASH includes even more descriptive information about waiting sessions, including the wait classes and specific events for each session. However, it becomes harder to create a text-based chart with quickly identifiable symbols to capture all of the activity types.

Here, I generally limit the chart to at most three character types: one for CPU, one for some specific wait class or event, and the last for everything else. For example, one class I frequently want to track is User I/O, so I will add a plus sign (+) for each session in that wait class. In this particular sampling window, we can see some I/O waits, but not a lot, indicating there probably wasn’t an I/O problem during that time.

SQL>   SELECT sample_time,
  2           LISTAGG(CASE WHEN session_state = 'ON CPU' THEN '*'
  3                        WHEN wait_class = 'User I/O' THEN '+'
  4                        ELSE '-' END)
  5               WITHIN GROUP (ORDER BY session_state,
  6                             CASE WHEN wait_class = 'User I/O' THEN 1 ELSE 2 END) activity
  7      FROM dba_hist_active_sess_history
  8     WHERE sample_time >= TIMESTAMP '2018-09-20 10:03:00'
  9       AND sample_time < TIMESTAMP '2018-09-20 10:07:00'
 10  GROUP BY sample_time
 11  ORDER BY sample_time;
SAMPLE_TIME                     ACTIVITY
2018-09-20 10:03:07.754000000   **+------
2018-09-20 10:03:17.764000000   *---------
2018-09-20 10:03:27.784000000   ****+----
2018-09-20 10:03:37.794000000   **------
2018-09-20 10:03:47.863000000   ****----
2018-09-20 10:03:57.873000000   **+----
2018-09-20 10:04:07.893000000   **--------
2018-09-20 10:04:17.913000000   *------
2018-09-20 10:04:27.923000000   ****--
2018-09-20 10:04:37.933000000   ****+----
2018-09-20 10:04:48.003000000   *********++-
2018-09-20 10:04:58.023000000   **+-----
2018-09-20 10:05:08.033000000   *****++-
2018-09-20 10:05:18.043000000   *****--
2018-09-20 10:05:28.063000000   *******-----
2018-09-20 10:05:38.073000000   *********--
2018-09-20 10:05:48.142000000   ****------
2018-09-20 10:05:58.152000000   *********-
2018-09-20 10:06:08.162000000   **********+-
2018-09-20 10:06:18.182000000   ********+
2018-09-20 10:06:28.192000000   *********+
2018-09-20 10:06:38.202000000   ****++---
2018-09-20 10:06:48.272000000   *****----
2018-09-20 10:06:58.282000000   ****+---

And finally, as in the first article, if you’re using Oracle’s SQLcl tool with ansiconsole formatting, you can add color to your chart to help highlight the different activity types. Here I’ll color the ON CPU sessions green and the User I/O sessions red.

  SELECT sample_time,
         regexp_replace(regexp_replace(
         LISTAGG(CASE WHEN session_state = 'ON CPU' THEN '*' 
                      WHEN wait_class = 'User I/O' THEN '+' 
                      ELSE '-' END)
             WITHIN GROUP (ORDER BY session_state, 
                           CASE WHEN wait_class = 'User I/O' THEN 1 ELSE 2 END)
         ,'(\*+)','@|bg_green \1|@')
         ,'(\++)','@|bg_red \1|@')      
          activity
    FROM dba_hist_active_sess_history
   WHERE sample_time >= TIMESTAMP '2018-09-20 10:03:00' 
     AND sample_time < TIMESTAMP '2018-09-20 10:07:00'
GROUP BY sample_time
ORDER BY sample_time;

All of the examples to this point have kept each snapshot time distinct from the others. Next time we’ll look at grouping snapshots across time. As always, questions and comments are welcome.

Looking Through Holes in Time

While testing some date/time functions today I came across an interesting quirk in Oracle’s Gregorian calendar and different date representations.

In the Gregorian calendar, the day after October 4, 1582 is October 15, 1582 (not all countries adopted the calendar immediately, so they have different gaps, but Oracle observes the calendar as it was originally defined.) However, depending on the version of the Oracle database, that calendar adjustment is handled differently.

In database versions 12.1.0.2 and lower, TO_DATE can’t be used to create a DATE value within the calendar gap. If you try, the day portion will be changed to October 15 and the time will remain the same. So, as shown in the example below, attempting to create a date with TO_DATE for October 10 causes the day to shift to October 15 while keeping the same time. The same will happen with any other day falling into the hole. Unless… you use a date literal. Then you can create dates within the gap.

SQL> SELECT banner FROM v$version
  2   WHERE banner LIKE 'Oracle Database%';
 
BANNER
--------------------------------------------------------------------------------
Oracle Database 12c Enterprise Edition Release 12.1.0.2.0 - 64bit Production

SQL> SELECT DATE '1582-10-10' date_literal,
  2         TO_DATE('1582-10-10 12:34:56', 'yyyy-mm-dd hh24:mi:ss') date_conversion,
  3         TIMESTAMP '1582-10-10 12:34:56' timestamp_literal,
  4         TO_TIMESTAMP('1582-10-10 12:34:56', 'yyyy-mm-dd hh24:mi:ss') timestamp_conversion
  5    FROM DUAL;
 
DATE_LITERAL        DATE_CONVERSION     TIMESTAMP_LITERAL                                                           TIMESTAMP_CONVERSION
------------------- ------------------- --------------------------------------------------------------------------- ---------------------------------------------------------------------------
1582-10-10 00:00:00 1582-10-15 12:34:56 1582-10-10 12:34:56.000000000                                               1582-10-15 12:34:56.000000000

In 12.2 and above, the date handling is consistent. All dates in the calendar gap are shifted to October 15.

SQL> SELECT banner FROM v$version
  2   WHERE banner LIKE 'Oracle Database%';
 
BANNER
--------------------------------------------------------------------------------
Oracle Database 12c Enterprise Edition Release 12.2.0.1.0 - 64bit Production

SQL> SELECT DATE '1582-10-10' date_literal,
  2         TO_DATE('1582-10-10 12:34:56', 'yyyy-mm-dd hh24:mi:ss') date_conversion,
  3         TIMESTAMP '1582-10-10 12:34:56' timestamp_literal,
  4         TO_TIMESTAMP('1582-10-10 12:34:56', 'yyyy-mm-dd hh24:mi:ss') timestamp_conversion
  5    FROM DUAL;
 
DATE_LITERAL        DATE_CONVERSION     TIMESTAMP_LITERAL                                                           TIMESTAMP_CONVERSION
------------------- ------------------- --------------------------------------------------------------------------- ---------------------------------------------------------------------------
1582-10-15 00:00:00 1582-10-15 12:34:56 1582-10-15 12:34:56.000000000                                               1582-10-15 12:34:56.000000000

Also interesting in the older versions is how the two types of date handling interact. While a date literal within the gap can be created, if you then try to manipulate it, the gap-date will again be shifted to October 15.

SQL> SELECT date_literal, date_literal + 2
  2    FROM (SELECT DATE '1582-10-05' date_literal FROM DUAL);

DATE_LITERAL        DATE_LITERAL+2
------------------- -------------------
1582-10-05 00:00:00 1582-10-17 00:00:00

This can get even more confusing because the shifting takes into account the delta within the gap. In the example above, October 5 is shifted to October 15; add 2 and you get October 17.
But try a later date and the math gets a little weird, because the database both shifts the date to October 15 and preserves the difference from within the gap. For example, October 10 is five days later than October 5, so its shifted date is also 5 days later.

SQL> SELECT date_literal, date_literal + 2
  2    FROM (SELECT DATE '1582-10-10' date_literal FROM DUAL);

DATE_LITERAL        DATE_LITERAL+2
------------------- -------------------
1582-10-10 00:00:00 1582-10-22 00:00:00

In 12.2 and above, this difference preservation does not occur. All dates from the gap are converted to October 15, so adding 2 to either October 5 or October 10 will still result in October 17, because both of those gap-dates are converted to the same value before applying the date math.

There is another hole in the calendar at year-zero, in that there isn’t one. Oracle even has a specific error message to say so.

ORA-01841: (full) year must be between -4713 and +9999, and not be 0

But, as with the Gregorian calendar hole, date-literal values for year 0 can be created if you’re using a 12.1.0.2 database or lower.

SQL> SELECT banner FROM v$version
  2  WHERE banner LIKE 'Oracle Database%';

BANNER
--------------------------------------------------------------------------------
Oracle Database 12c Enterprise Edition Release 12.1.0.2.0 - 64bit Production

SQL> select date '0000-01-01' from dual;

DATE'0000-01-01'
-------------------
0000-01-01 00:00:00

But, try the same thing in 12.2 or above and you’ll get the ORA-1841 error.

SQL> SELECT banner FROM v$version
  2  WHERE banner LIKE 'Oracle Database%';

BANNER
--------------------------------------------------------------------------------
Oracle Database 12c Enterprise Edition Release 12.2.0.1.0 - 64bit Production

SQL> select date '0000-01-01' from dual;
select date '0000-01-01' from dual
            *
ERROR at line 1:
ORA-01841: (full) year must be between -4713 and +9999, and not be 0

For most applications I expect neither of these holes in time to be an issue, as most of the data will likely be relatively modern. If you do have centuries- or millennia-old data, or if you’re simply trying to be thorough in your testing of date functions, it’s important to note how the calendar has changed over the years and, more importantly, how the database itself handles those changes.

Introduction to ASH Data

The Oracle database captures active session activity approximately once each second into the Active Session History (ASH). This data can then be used to identify usage patterns and performance problems. Many performance experts have written very good suites of scripts and tools to query, analyze, and chart the ASH data. However, some of these tool sets can be somewhat intimidating for new users.

In this article I’d like to introduce a few simple queries of ASH data as well as a simple text-based charting method built from those queries, thus forming a foundation for other, more complex queries as users gain comfort using the ASH data effectively.

Before we begin digging into the data, it is important to note that ASH and AWR information is not automatically available to all users. In addition to the object privileges needed to read the views, there are licensing requirements. Simply SELECTing data from V$ACTIVE_SESSION_HISTORY or DBA_HIST_ACTIVE_SESS_HISTORY requires the Oracle Diagnostics Pack license.

Probably the most common usage of ASH data is simply measuring how much activity is on the system at various points in time. When activity is low the database approaches idle and should be responsive to user requests. As activity increases, users start to compete for the limited CPU capacity. A simple method of measuring activity is to count the number of active sessions at various points in time. At the most basic level this means counting the number of sessions captured at each sample time. For example, in the following query we can see that, within our chosen window of about 20 seconds, the system varied from a low of 4 active sessions up to a high of 18 active sessions. Without any greater context, these numbers simply indicate the system had varying levels of activity. There is no indicator that the 4 was “good” or the 18 was “bad”, but the measurements are a start.

SQL>   SELECT sample_time, COUNT(*) cnt
  2      FROM gv$active_session_history
  3     WHERE sample_time >= TIMESTAMP '2018-09-15 09:00:00'
  4       AND sample_time < TIMESTAMP '2018-09-15 09:00:21'
  5  GROUP BY sample_time
  6  ORDER BY sample_time;
SAMPLE_TIME                       CNT
2018-09-15 09:00:00.969000000       6
2018-09-15 09:00:01.969000000       6
2018-09-15 09:00:02.969000000       4
2018-09-15 09:00:03.979000000       4
2018-09-15 09:00:04.979000000       4
2018-09-15 09:00:05.979000000       6
2018-09-15 09:00:06.979000000       6
2018-09-15 09:00:07.979000000       6
2018-09-15 09:00:08.979000000       6
2018-09-15 09:00:09.979000000       6
2018-09-15 09:00:10.979000000       4
2018-09-15 09:00:11.989000000       6
2018-09-15 09:00:12.989000000      18
2018-09-15 09:00:14.049000000      18
2018-09-15 09:00:15.049000000      18
2018-09-15 09:00:16.049000000      18
2018-09-15 09:00:17.049000000       4
2018-09-15 09:00:18.049000000       6
2018-09-15 09:00:19.049000000       6
2018-09-15 09:00:20.049000000       6

20 rows selected.

You may have noticed the view I queried is gv$active_session_history rather than just v$active_session_history. On a RAC system the multiple instances would need to be handled, but in this case there is no functional difference since my test system has only a single instance. The rest of the examples in this article are from the same system.

This strategy of pulling the numerical counts is useful, but sometimes it’s helpful to have a visualization to convey the information more easily. A simple bar chart is easy to generate by padding a string of characters to a length proportional to each count.

SQL>   SELECT sample_time, COUNT(*) cnt, RPAD('*', COUNT(*), '*') chart
  2      FROM gv$active_session_history
  3     WHERE sample_time >= TIMESTAMP '2018-09-15 09:00:00'
  4       AND sample_time < TIMESTAMP '2018-09-15 09:00:21'
  5  GROUP BY sample_time
  6  ORDER BY sample_time;
SAMPLE_TIME                       CNT CHART
2018-09-15 09:00:00.969000000       6 ******
2018-09-15 09:00:01.969000000       6 ******
2018-09-15 09:00:02.969000000       4 ****
2018-09-15 09:00:03.979000000       4 ****
2018-09-15 09:00:04.979000000       4 ****
2018-09-15 09:00:05.979000000       6 ******
2018-09-15 09:00:06.979000000       6 ******
2018-09-15 09:00:07.979000000       6 ******
2018-09-15 09:00:08.979000000       6 ******
2018-09-15 09:00:09.979000000       6 ******
2018-09-15 09:00:10.979000000       4 ****
2018-09-15 09:00:11.989000000       6 ******
2018-09-15 09:00:12.989000000      18 ******************
2018-09-15 09:00:14.049000000      18 ******************
2018-09-15 09:00:15.049000000      18 ******************
2018-09-15 09:00:16.049000000      18 ******************
2018-09-15 09:00:17.049000000       4 ****
2018-09-15 09:00:18.049000000       6 ******
2018-09-15 09:00:19.049000000       6 ******
2018-09-15 09:00:20.049000000       6 ******

We can make our chart a little smarter by checking if the number of active sessions exceeds the CPU count, thus providing some of the missing context mentioned earlier. First, we’ll see what the cpu_count init parameter is for this database.

SQL> select value from v$parameter where name = 'cpu_count';
VALUE
8

Then we’ll adjust the chart logic to use a different character when there are more active sessions than CPUs to service them. Now the spike of 18 sessions is not only obviously larger than the rest of the activity, but the exclamation characters also highlight it as a potential problem.

SQL>   SELECT sample_time, cnt,
  2           CASE
  3             WHEN cnt <= 8 THEN RPAD('*', cnt * 2, '*')
  4             ELSE RPAD('!', cnt * 2, '!')
  5           END chart
  6      FROM (  SELECT sample_time, COUNT(*) cnt
  7                FROM gv$active_session_history
  8               WHERE sample_time >= TIMESTAMP '2018-09-15 09:00:00'
  9                 AND sample_time < TIMESTAMP '2018-09-15 09:00:21'
 10            GROUP BY sample_time)
 11  ORDER BY sample_time;
SAMPLE_TIME                       CNT CHART
2018-09-15 09:00:00.969000000       6 ************
2018-09-15 09:00:01.969000000       6 ************
2018-09-15 09:00:02.969000000       4 ********
2018-09-15 09:00:03.979000000       4 ********
2018-09-15 09:00:04.979000000       4 ********
2018-09-15 09:00:05.979000000       6 ************
2018-09-15 09:00:06.979000000       6 ************
2018-09-15 09:00:07.979000000       6 ************
2018-09-15 09:00:08.979000000       6 ************
2018-09-15 09:00:09.979000000       6 ************
2018-09-15 09:00:10.979000000       4 ********
2018-09-15 09:00:11.989000000       6 ************
2018-09-15 09:00:12.989000000      18 !!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!
2018-09-15 09:00:14.049000000      18 !!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!
2018-09-15 09:00:15.049000000      18 !!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!
2018-09-15 09:00:16.049000000      18 !!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!
2018-09-15 09:00:17.049000000       4 ********
2018-09-15 09:00:18.049000000       6 ************
2018-09-15 09:00:19.049000000       6 ************
2018-09-15 09:00:20.049000000       6 ************

20 rows selected.

If you use Oracle’s SQLcl tool then the formatting of the chart can be made even more vibrant with the use of color codes and ansiconsole formatting.

SQL>   SELECT sample_time, cnt,
  2           CASE
  3             WHEN cnt < 8 THEN '@|bg_green '
  4             WHEN cnt = 8 THEN '@|bg_yellow '
  5             ELSE '@|bg_red '
  6           END
  7           || RPAD('*', cnt, '*')
  8           || '|@'
  9               chart
 10      FROM (  SELECT sample_time, COUNT(*) cnt
 11                FROM gv$active_session_history
 12               WHERE sample_time >= TIMESTAMP '2018-09-15 09:00:00'
 13                 AND sample_time < TIMESTAMP '2018-09-15 09:00:21'
 14            GROUP BY sample_time)
 15  ORDER BY sample_time;

These queries provide the most basic form of investigation. They are useful, but at a one-second granularity the data volumes can become unwieldy for rapid diagnosis and visualization. In the next articles in this series we’ll expand on these to show methods of querying ASH data with greater detail or across larger time ranges while still maintaining a manageable set of results.

Simple Performance Diagnostics in APEX

Recently I received a call that an APEX application was experiencing poor performance.  They asked me to investigate what was happening on the database or in APEX itself.

First, I asked if they had any application logs that would give some indication of where the problems might be. The answer came back no. There was no instrumentation or logging of their own in place to capture the activity; but we did have APEX debug enabled for some of the user actions.

So, I queried the APEX dictionary for their application, user, and an approximate execution window when the users saw poor performance while debug was enabled.

  SELECT view_timestamp,
         page_id,
         elapsed_time,
         page_view_type,
         apex_session_id,
         debug_page_view_id
    FROM apex_workspace_activity_log
   WHERE application_id = 100
     AND apex_user = 'EXAMPLEUSER'
     AND view_timestamp > TIMESTAMP '2018-10-02 15:00:00 -04:00'
     AND view_timestamp < TIMESTAMP '2018-10-02 15:20:00 -04:00'
ORDER BY view_timestamp;

This yielded output as follows, with page 27 taking nearly 9 seconds. Users reported other times when the page was even slower; however, they were not running with DEBUG mode enabled then, so this was the page view I used for Step 1.

VIEW_TIMESTAMP                     PAGE_ID  ELAPSED_TIME  PAGE_VIEW_TYPE  APEX_SESSION_ID  DEBUG_PAGE_VIEW_ID
2018-10-02 15:11:48.691578 -04:00  20       3.121443      Rendering       307779930648678  801               
2018-10-02 15:11:54.204761 -04:00  27       0.045717      Rendering       307779930648678  804               
2018-10-02 15:11:54.651241 -04:00  27       0.03838       Ajax            307779930648678  807               
2018-10-02 15:12:21.752888 -04:00  27       0.570659      Rendering       307779930648678  810               
2018-10-02 15:12:22.267021 -04:00  27       0.031825      Ajax            307779930648678  813               
2018-10-02 15:12:31.484343 -04:00  27       8.814905      Ajax            307779930648678  816   

With this output, I then took the session id and debug view id into Step 2.

  SELECT id,
         message_timestamp,
         elapsed_time,
         execution_time,
         message
    FROM apex_debug_messages
   WHERE application_id = 100
     AND apex_user = 'EXAMPLEUSER'
     AND session_id = 307779930648678
     AND page_view_id = 816
ORDER BY message_timestamp;

APEX debugging is quite verbose, even at the default level 4; within those 8 seconds, over 11000 debug messages were recorded.

ID      MESSAGE_TIMESTAMP                   ELAPSED_TIME    EXECUTION_TIME  MESSAGE                                                
817021  2018-10-02 15:12:22.678806 -04:00   0.009403        0.00036         Session State: fetch from database (exact)             
817022  2018-10-02 15:12:22.679166 -04:00   0.009761        0.000042        ... sentry+verification success                        
817023  2018-10-02 15:12:22.679208 -04:00   0.009802        0.000393        ...Session ID 307779930648678 can be used                                    
...
 { 11000+ more rows }
... 
828467  2018-10-02 15:12:31.483859 -04:00   8.814462        0.000292        Stop APEX Engine detected                              
828468  2018-10-02 15:12:31.484151 -04:00   8.814746                        Final commit

Scanning through all of those messages wasn't going to be a very efficient diagnostic method, so I changed the ORDER BY to sort by execution time, with the slowest first.

  SELECT id,
         message_timestamp,
         elapsed_time,
         execution_time,
         message
    FROM apex_debug_messages
   WHERE application_id = 100
     AND apex_user = 'EXAMPLEUSER'
     AND session_id = 307779930648678
     AND page_view_id = 816
ORDER BY execution_time desc;

This immediately identified the main culprits.

ID      MESSAGE_TIMESTAMP                   ELAPSED_TIME    EXECUTION_TIME  MESSAGE
828468  2018-10-02 15:12:31.484151 -04:00   8.814746                        Final commit
825696  2018-10-02 15:12:26.932971 -04:00   4.263569        2.733338        SQL Statement prepared: ... found binds=1
817036  2018-10-02 15:12:22.683182 -04:00   0.013779        1.840666        SQL Statement prepared: ... found binds=2
821366  2018-10-02 15:12:25.143178 -04:00   2.473775        1.426806        SQL Statement prepared: ... found binds=2
...

Thus 6 out of the total 8.8 seconds were spent in the application SQL statements. The remainder of the time was primarily spent iterating over the result sets to populate 3 series of chart data with several thousand data points.

In this case it appears APEX was executing their application quite quickly; the problem was that their queries needed some work, possibly pruning the result sets down a little if the charting didn't really benefit from results of that granularity.

While a 9-second page view was still considered slow, even that run was benefiting from caching due to the repeated test executions, further highlighting the need to focus on the queries.

What struck me most in this exercise was how quick and easy it was to identify the problem areas. With just three queries, each of which only took a few seconds to run, the problem queries were clearly identified. Had it been a custom JavaScript function, or something in APEX itself, those too would have been similarly identified.

One of the hallmarks of a good architecture is how easy it is to diagnose. APEX gets an A+ here even before adding in any external instrumentation.

Using MERGE to INSERT overlapping source data

Recently I’ve been working with a data source that generates a new file each day for me to import. That seemed simple enough: I created an external table to read the file and then ran an insert statement to load the data into a target table. However, the source files sometimes overlap each other. That is, on day 1 I might get ids 1,2,3,4,5, then on day 2 I might get ids 3,4,5,6,7,8,9. I could simply use a MERGE statement to update matching records and then insert the new unmatched ones. That’s what MERGE is for; but in this case I know the source data has static history. So, if I update ID 3 on the second day because I find a match from the first day, I know the update won’t actually change any data, making for an inefficient process doing dummy work.
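
For contrast, the conventional two-sided merge I decided against would look something like this sketch, using the same source and target tables defined below; the WHEN MATCHED branch is where the dummy updates would happen:

MERGE INTO target t
     USING source s
        ON (s.id = t.id)
WHEN MATCHED
THEN
    UPDATE SET t.textval = s.textval,
               t.dateval = s.dateval,
               t.numval  = s.numval
WHEN NOT MATCHED
THEN
    INSERT     (id,
                textval,
                dateval,
                numval)
        VALUES (s.id,
                s.textval,
                s.dateval,
                s.numval);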

As an alternate solution, I could use an insert with a unique constraint to prevent duplicates and use DBMS_ERRLOG to create a log table to catch the duplicate rows, allowing the insert to continue rather than aborting when a duplicate is found. This too, though, seems like a waste of resources.
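
For completeness, that error-logging approach would look roughly like the sketch below. The constraint name is purely illustrative, and DBMS_ERRLOG.create_error_log builds the default ERR$_TARGET log table for the rejected duplicates to land in.

-- One-time setup: a unique constraint and an error log table
ALTER TABLE target ADD CONSTRAINT target_id_uk UNIQUE (id);

BEGIN
    DBMS_ERRLOG.create_error_log(dml_table_name => 'TARGET');   -- creates ERR$_TARGET
END;
/

-- Daily load: duplicate ids are diverted to ERR$_TARGET instead of aborting the insert
INSERT INTO target (id, textval, dateval, numval)
    SELECT id, textval, dateval, numval
      FROM source
       LOG ERRORS INTO err$_target ('daily load') REJECT LIMIT UNLIMITED;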

Of course, another option would be to do a select from the source table that excludes the matching ids with an anti-join, not-exists subquery, or not-in subquery.
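
As a rough sketch, the NOT EXISTS variation might look like this:

-- Only insert source rows whose id is not already present in target
INSERT INTO target (id, textval, dateval, numval)
    SELECT s.id, s.textval, s.dateval, s.numval
      FROM source s
     WHERE NOT EXISTS
               (SELECT NULL
                  FROM target t
                 WHERE t.id = s.id);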

All of these variations would work but are less desirable in that they require extra steps or are needlessly complex.

The easiest method I found was to simply use a one-sided merge. That is, only define a WHEN NOT MATCHED condition, so the matching records simply drop out of the merge statement when they are detected, without extra processing or raising exceptions.

As a test case to illustrate, I’ll use a normal table instead of an external table as the source and simulate 3 days of results to load into the target table.

CREATE TABLE target
(
    id         NUMBER,
    textval    VARCHAR2(10),
    dateval    DATE,
    numval     NUMBER
);

CREATE TABLE source
(
    id         NUMBER,
    textval    VARCHAR2(10),
    dateval    DATE,
    numval     NUMBER
);

On day 1 we receive 5 rows of data.

-- Source data Day-1   
truncate table source;            
Insert into source (id, textval, dateval, numval) values (1,'one',  date '2018-09-02',1);
Insert into source (id, textval, dateval, numval) values (2,'two',  date '2018-09-02',4);
Insert into source (id, textval, dateval, numval) values (3,'three',date '2018-09-02',9);
Insert into source (id, textval, dateval, numval) values (4,'four', date '2018-09-02',16);
Insert into source (id, textval, dateval, numval) values (5,'five', date '2018-09-02',25);
commit;

Then use merge to insert the first day’s data.

SQL> MERGE INTO target t
  2       USING source s
  3          ON (s.id = t.id)
  4  WHEN NOT MATCHED
  5  THEN
  6      INSERT     (id,
  7                  textval,
  8                  dateval,
  9                  numval)
 10          VALUES (s.id,
 11                  s.textval,
 12                  s.dateval,
 13                  s.numval);

5 rows merged.

SQL> COMMIT;

Commit complete.

SQL> SELECT * FROM target ORDER BY id;

        ID TEXTVAL    DATEVAL       NUMVAL
---------- ---------- --------- ----------
         1 one        02-SEP-18          1
         2 two        02-SEP-18          4
         3 three      02-SEP-18          9
         4 four       02-SEP-18         16
         5 five       02-SEP-18         25

Then on day 2 we receive 4 rows of data: 2 repeats and 2 new ones.

-- Source data Day-2 
truncate table source;  
Insert into source (id, textval, dateval, numval) values (4,'four', date '2018-09-02',16);
Insert into source (id, textval, dateval, numval) values (5,'five', date '2018-09-02',25);
Insert into source (id, textval, dateval, numval) values (6,'six',  date '2018-09-03',36);
Insert into source (id, textval, dateval, numval) values (7,'seven',date '2018-09-03',49);
commit;

Then merge again. Note only the two new rows are merged.

SQL> MERGE INTO target t
  2       USING source s
  3          ON (s.id = t.id)
  4  WHEN NOT MATCHED
  5  THEN
  6      INSERT     (id,
  7                  textval,
  8                  dateval,
  9                  numval)
 10          VALUES (s.id,
 11                  s.textval,
 12                  s.dateval,
 13                  s.numval);

2 rows merged.

SQL> COMMIT;

Commit complete.

SQL> SELECT * FROM target ORDER BY id;

        ID TEXTVAL    DATEVAL       NUMVAL
---------- ---------- --------- ----------
         1 one        02-SEP-18          1
         2 two        02-SEP-18          4
         3 three      02-SEP-18          9
         4 four       02-SEP-18         16
         5 five       02-SEP-18         25
         6 six        03-SEP-18         36
         7 seven      03-SEP-18         49

On day 3 we receive 4 more rows, this time with 1 duplicate and 3 new ones.

-- Source data Day-3
truncate table source;
Insert into source (id, textval, dateval, numval) values (7, 'seven',date '2018-09-03',49);
Insert into source (id, textval, dateval, numval) values (8, 'eight',date '2018-09-04',64);
Insert into source (id, textval, dateval, numval) values (9, 'nine', date '2018-09-04',81);
Insert into source (id, textval, dateval, numval) values (10,'ten',  date '2018-09-04',100);
commit;

Merge one more time and we have 3 new rows merged, leaving only the 10 distinct rows from all 3 days in the target table.

SQL> MERGE INTO target t
  2       USING source s
  3          ON (s.id = t.id)
  4  WHEN NOT MATCHED
  5  THEN
  6      INSERT     (id,
  7                  textval,
  8                  dateval,
  9                  numval)
 10          VALUES (s.id,
 11                  s.textval,
 12                  s.dateval,
 13                  s.numval);

3 rows merged.

SQL> COMMIT;

Commit complete.

SQL> SELECT * FROM target ORDER BY id;

        ID TEXTVAL    DATEVAL       NUMVAL
---------- ---------- --------- ----------
         1 one        02-SEP-18          1
         2 two        02-SEP-18          4
         3 three      02-SEP-18          9
         4 four       02-SEP-18         16
         5 five       02-SEP-18         25
         6 six        03-SEP-18         36
         7 seven      03-SEP-18         49
         8 eight      04-SEP-18         64
         9 nine       04-SEP-18         81
        10 ten        04-SEP-18        100

I’m not Steven Feuerstein, should I even bother blogging?

If you’re not into PL/SQL, substitute Jonathan Lewis, Tanel Poder, Maria Colgan, or any other technology great you might respect. Doesn’t even need to be Oracle related.

The answer, quite simply, is yes.

First – it’s not a competition. If it were, then yes, I concede to the guy who literally wrote the book on “Oracle PL/SQL Programming.” I don’t blog to show I know more than, or even as much as, he does. I blog to share examples, interesting quirks, and use cases I’ve encountered. For all of the syntax descriptions in the Oracle manuals and all of the code downloads and explanatory videos out there, maybe my contribution will be the one that clicks with some reader googling for a particular problem. The point of my code isn’t that nobody else could have written what I did, but simply that I did write it, and maybe somebody else can benefit from it.

So, if you’ve written something you thought was interesting, by all means share it. Maybe it’s something with nested table collections that has been covered a thousand times before. Post it anyway, maybe your specific example, with your choice of descriptive phrasing will be the one that catches the next reader’s eye.

Second – more diverse voices matter. I’ve been told my style of writing is too casual and sometimes unprofessionally conversational. I’ve also been told my writing helps make new syntax and/or complicated problems more approachable. So even if my content is exactly what someone needs, if my delivery doesn’t work for them the message may get lost. I’d feel a little bad about that; but it happens. Some of it is simply implicit: I only read one language well, English, so I only read blogs and watch videos in English. I’m sure I miss out on some good content because of that limitation. Others might too.

So, maybe you have something to say about collections that you know for sure Feuerstein has covered before. Go ahead, say it anyway. Use your code, your words, your language. Get your message, explained your way out there. If you happen to be a polyglot, even better. Translate your own work.

Third – what you’re posting is probably new to someone. I’ve been coding in PL/SQL since version 7 and have written tens of thousands of lines of code. I don’t believe there is anything I’ve written that couldn’t have been written by lots of other people, but nobody is writing everything. You might be surprised by what others will look for. Month after month, the hottest article on my blog (by far) from search engines is one I wrote several years ago on using DBMS_PARALLEL_EXECUTE to run parallel PL/SQL. I posted it because I thought it was a kind of quirky thing to do. I had no idea so many other people would search for exactly that. I’m going to guess I’m not the first or only person to do it; but I bothered to blog about it, so now that article is a resource out there for others. If I picked my blog articles based on what I thought would garner the most “hits” then I never would have written that article. My goal is one. If one person likes a post then I’ve done my job. Two and up is just bonus.

So, if you’ve got some strange use case and think it’s too obscure to be helpful to anyone else – post it anyway. Even if you’re right that it can’t be used in your exact form in other code; your approach, or general idea behind your solution might be the inspiration somebody else needs to get their problem solved.

Fourth – blogging is also helpful to the author. The act of writing an instructional blog about code is different than the act of writing the code itself. It’s also different than writing comments for the code. When you write a blog article about some block of code, you are removing that code from the context in which it was originally written. So, comments you might leave in the code for yourself or future maintainers might be meaningless when read with a stand-alone example. Similarly, intent may be obvious within the context of an application, but in isolation a code block could be opaque. Also, you don’t know your audience, so you need to write in such a way that it aids the intended reader. This may mean exploring syntax minutiae and documenting features that are or are not used in order to give scope to your code. These extra steps improve your skill as a writer; but you will often find you learn more about your own code as you start fleshing out the areas around it.

So, if you’ve got some code or practice in which you have some expertise, challenge yourself to write about it so a novice can understand it. Or, you can flip it around: if there is something you want to learn about for yourself, start writing about it. You don’t have to publish immediately at each twist, turn, trial and error; but as you reach milestones – write about them. Your own blog can then be a reference for yourself as well.

Fifth – start the conversation. Whether it’s your first line of code or your millionth, you have something to say about what you’re doing. Write it down. If you’ve got something good – show it off. If you’ve got a question – ask it. Maybe it’s something in between, a little code block, nothing fancy, but explained well. Put the code out there, get some feedback. You might get kudos and thanks, you might get questions, you might get corrections. Any kind of feedback is good. You might get nothing. That’s fine too. Most people don’t comment, “like”, or “vote” on posts. It’s ok, post anyway. When I started I was getting 1 or 2 visits per month, then a few more, then a dozen, then a few dozen, then a hundred, and another hundred, and another hundred, and so on. The audience is growing in silence, but growing. But, let’s say it didn’t. A lack of traffic would itself be feedback. Maybe you need a different topic, a different writing style, or a different audience. Maybe there is nothing wrong with your content, only with your blog’s interaction with search engines.

So, everyone has something to share; go ahead and share it. You might feel self-conscious about posting some code you’re not sure about. You can either take a little more time to refine it, or go ahead and post it and ask for improvements. Maybe you’re afraid you’ll post your awesome toolkit and somebody better will come along and tear it to shreds, detailing hole after hole. For one, that’s probably not going to happen. If it does, and it’s bad enough, you can delete the article and repost later after applying corrections, or simply append to the article showing before and after. The work of changing could itself become an article series. I once wrote a 3-part series exploring syntax variations of a single query to solve a question posed by a friend. All but the very last example in the third article are objectively bad ways to do the work; but the point of the series was to explore and explain different approaches to the problem. I intentionally posted answers that I would not recommend for the sake of the exploration. Had I jumped immediately to the simple solution I gave him, it wouldn’t have been as interesting or educational. When my friend asked me how to do it, he already assumed I could. When he read all the different ways I came up with, he responded “I didn’t know you could do that!” I both solved the problem and showed him something new, making the whole exercise worthwhile.

In conclusion, if you’re thinking about starting a blog, just start. Some blog every day, some once a week, some once a month. Some are regular, some in bursts. It doesn’t matter, just start. Push out some content, talk to some people. Who knows? The next person to read your blog might be a potential employer that didn’t know you existed until he or she saw your work.

Go, put fingers to keyboard and welcome aboard!

How to use DBMS_LDAP (part 6: Modifying)

Table of Contents

  1. Introduction
  2. Establishing a Connection
  3. Searching Hierarchical Data
  4. Browsing Attributes
  5. Timeouts
  6. Modifying Data (this page)

Modifying data in the LDAP hierarchy

The DBMS_LDAP package allows you to write as well as read data from the directory. However, it does not provide an API for manipulating the schemas within the tree. That is, you can add, delete, or modify records of predefined structures but you may not create new structure types of your own.

In my previous articles in this series I used publicly available servers for my examples. As you might expect, those are read-only. So, in this article my examples will be based on the LDAP structures used for Oracle database name resolution. These could be from Oracle’s own OID product, or some other LDAP server hosting the same schema.

Inserting a new record

To insert a new record, you might think of it as similar to a SQL insert into an index-organized table. That is, unlike a normal heap table, where rows can go anywhere in the table, in an IOT, or an LDAP structure, each row can only exist in one specific location. Thus an LDAP insert consists of defining the unique key of the record, i.e. where the new record will go, as well as the content fields of the new record.

The “where” portion consists of the DN, as we’ve used in searching in previous chapters. The content is defined by an array of attributes. Each attribute is itself an array. In the example below, each attribute has only a single value, so as the modification array is constructed, each attribute contributes an array of just one element. Since every node within a directory must have an object class, there will always be at least one attribute populated in the modification array to set that value.

In the case of an Oracle Net Service Name, the object class is “orclNetService”. In order to be a useful record it must also include the name to be resolved, as well as the TNS description to which the name resolves. So, in this example I am inserting a new name “test_name” that resolves to “(DESCRIPTION=(ADDRESS=(PROTOCOL=TCP)(HOST=myserver)(PORT=1521))(CONNECT_DATA=(SERVICE_NAME=myservice)))”

DECLARE
    v_session      DBMS_LDAP.session;
    v_result       PLS_INTEGER;
    v_attributes   DBMS_LDAP.mod_array;
    v_strings      DBMS_LDAP.string_collection;
BEGIN
    DBMS_LDAP.use_exception := FALSE;
    v_session := DBMS_LDAP.init(hostname => 'ldap.example.net', portnum => DBMS_LDAP.port);

    v_result :=
        DBMS_LDAP.simple_bind_s(ld       => v_session,
                                dn       => 'uid=myuserid,ou=users,dc=example,dc=net',
                                passwd   => 'secret_password');

    IF v_result != DBMS_LDAP.success
    THEN
        DBMS_OUTPUT.put_line('Bind Result: ' || v_result || ' ' || DBMS_LDAP.err2string(v_result));
    ELSE
        v_attributes := DBMS_LDAP.create_mod_array(num => 3);

        v_strings(1) := 'orclNetService';
        DBMS_LDAP.populate_mod_array(modptr     => v_attributes,
                                     mod_op     => DBMS_LDAP.mod_add,
                                     mod_type   => 'objectclass',
                                     modval     => v_strings);

        v_strings(1) := 'test_name';
        DBMS_LDAP.populate_mod_array(modptr     => v_attributes,
                                     mod_op     => DBMS_LDAP.mod_add,
                                     mod_type   => 'cn',
                                     modval     => v_strings);

        v_strings(1) :=
            '(DESCRIPTION=(ADDRESS=(PROTOCOL=TCP)(HOST=myserver)(PORT=1521))(CONNECT_DATA=(SERVICE_NAME=myservice)))';
        DBMS_LDAP.populate_mod_array(modptr     => v_attributes,
                                     mod_op     => DBMS_LDAP.mod_add,
                                     mod_type   => 'orclNetDescString',
                                     modval     => v_strings);

        v_result :=
            DBMS_LDAP.add_s(ld        => v_session,
                            entrydn   => 'cn=test_name,cn=OracleContext,dc=example,dc=net',
                            modptr    => v_attributes);

        IF v_result = DBMS_LDAP.success
        THEN
            DBMS_OUTPUT.put_line('Add successful');
        ELSE
            DBMS_OUTPUT.put_line('Add Result: ' || v_result || ' ' || DBMS_LDAP.err2string(v_result));
        END IF;

        DBMS_LDAP.free_mod_array(modptr => v_attributes);
        v_result := DBMS_LDAP.unbind_s(v_session);
    END IF;
END;
/
Add successful

If you run the block twice, you should get an error on the second run for trying to create two entries at the same DN.

Add Result: 68 Already exists

Deleting an old record

Deleting a record is fairly simple. To make a correlation with SQL, you simply specify the unique key of the record to delete. DBMS_LDAP does not provide an API for mass deletion. If you wanted to delete multiple records, you could call the delete_s function for each DN, or, if you didn’t know them, you could perform a search as shown in prior chapters and then retrieve the DN from each search result to do the delete (a sketch of that loop appears at the end of this section).

DECLARE
    v_session          DBMS_LDAP.session;
    v_result           PLS_INTEGER;
BEGIN
    DBMS_LDAP.use_exception := FALSE;
    v_session := DBMS_LDAP.init(hostname => 'ldap.example.net', portnum => DBMS_LDAP.port);

    v_result :=
        DBMS_LDAP.simple_bind_s(ld       => v_session,
                                dn       => 'uid=myuserid,ou=users,dc=example,dc=net',
                                passwd   => 'secret_password');

    IF v_result != DBMS_LDAP.success
    THEN
        DBMS_OUTPUT.put_line('Bind Result: ' || v_result || ' ' || DBMS_LDAP.err2string(v_result));
    ELSE
        v_result := DBMS_LDAP.delete_s(ld => v_session, entrydn => 'cn=test_name,cn=OracleContext,dc=example,dc=net');

        IF v_result = DBMS_LDAP.success
        THEN
            DBMS_OUTPUT.put_line('Delete successful');
        ELSE
            DBMS_OUTPUT.put_line('Delete Result: ' || v_result || ' ' || DBMS_LDAP.err2string(v_result));
        END IF;

        v_result := DBMS_LDAP.unbind_s(v_session);
    END IF;
END;
/
Delete successful

If you run the block twice, you should get an error on the second run for trying to delete a DN that doesn’t exist.

Delete Result: 32 No such object
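
The examples above delete a single, known DN. As a sketch of the search-then-delete loop mentioned earlier, the block below finds every entry matching a filter and deletes each one. The base DN and the (cn=test_*) filter are illustrative assumptions, and most error checking is trimmed for brevity.

DECLARE
    v_session   DBMS_LDAP.session;
    v_result    PLS_INTEGER;
    v_search    DBMS_LDAP.message;
    v_entry     DBMS_LDAP.message;
    v_attrs     DBMS_LDAP.string_collection;
    v_dn        VARCHAR2(256);
BEGIN
    DBMS_LDAP.use_exception := FALSE;
    v_session := DBMS_LDAP.init(hostname => 'ldap.example.net', portnum => DBMS_LDAP.port);

    v_result :=
        DBMS_LDAP.simple_bind_s(ld       => v_session,
                                dn       => 'uid=myuserid,ou=users,dc=example,dc=net',
                                passwd   => 'secret_password');

    -- Find the entries to remove; the DN comes back with each entry regardless of the attributes requested
    v_attrs(1) := '*';
    v_result :=
        DBMS_LDAP.search_s(ld       => v_session,
                           base     => 'cn=OracleContext,dc=example,dc=net',
                           scope    => DBMS_LDAP.scope_subtree,
                           filter   => '(cn=test_*)',
                           attrs    => v_attrs,
                           attronly => 0,
                           res      => v_search);

    -- Walk the result set and delete each DN found
    v_entry := DBMS_LDAP.first_entry(v_session, v_search);

    WHILE v_entry IS NOT NULL
    LOOP
        v_dn := DBMS_LDAP.get_dn(v_session, v_entry);
        v_result := DBMS_LDAP.delete_s(ld => v_session, entrydn => v_dn);
        v_entry := DBMS_LDAP.next_entry(v_session, v_entry);
    END LOOP;

    v_result := DBMS_LDAP.unbind_s(v_session);
END;
/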

Modifying a record

Modifying a record in LDAP can, but doesn’t always, correlate with a SQL update. At the simplest level, you specify the unique key, the DN, as you would with a delete; but instead of removing the record, you replace the value of an attribute with a new value. In SQL terms this would be analogous to something like this:

update my_table set my_attribute = new_value where id = my_dn;

You can update more than one attribute by populating the modification array with more than one replace command. Here I update our test record with a new TNS description pointing to another server with a different service name for that listener.

DECLARE
    v_session      DBMS_LDAP.session;
    v_result       PLS_INTEGER;
    v_attributes   DBMS_LDAP.mod_array;
    v_strings      DBMS_LDAP.string_collection;
BEGIN
    DBMS_LDAP.use_exception := FALSE;
    v_session := DBMS_LDAP.init(hostname => 'ldap.example.net', portnum => DBMS_LDAP.port);

    v_result := DBMS_LDAP.simple_bind_s(ld => v_session, dn => 'uid=myuserid,ou=users,dc=example,dc=net', passwd => 'secret_password');

    IF v_result != DBMS_LDAP.success
    THEN
        DBMS_OUTPUT.put_line('Bind Result: ' || v_result || ' ' || DBMS_LDAP.err2string(v_result));
    ELSE
        v_attributes := DBMS_LDAP.create_mod_array(num => 1);

        v_strings(1) :=
            '(DESCRIPTION=(ADDRESS=(PROTOCOL=TCP)(HOST=myotherserver)(PORT=1521))(CONNECT_DATA=(SERVICE_NAME=myotherservice)))';
        DBMS_LDAP.populate_mod_array(modptr     => v_attributes,
                                     mod_op     => DBMS_LDAP.mod_replace,
                                     mod_type   => 'orclNetDescString',
                                     modval     => v_strings);

        v_result :=
            DBMS_LDAP.modify_s(ld        => v_session,
                               entrydn   => 'cn=test_name,cn=OracleContext,dc=example,dc=net',
                               modptr    => v_attributes);

        IF v_result = DBMS_LDAP.success
        THEN
            DBMS_OUTPUT.put_line('Modify successful');
        ELSE
            DBMS_OUTPUT.put_line('Modify Result: ' || v_result || ' ' || DBMS_LDAP.err2string(v_result));
        END IF;

        DBMS_LDAP.free_mod_array(modptr => v_attributes);
        v_result := DBMS_LDAP.unbind_s(v_session);
    END IF;
END;
/
Modify successful

While that replace looks like a fairly standard update, correlating well with SQL, you can also modify records to remove attributes. In SQL terms, that would be like removing a column from just one row of a table. Not setting it to NULL, but actually removing the column entirely.

Or, maybe a closer analogy, if your table had a collection type for a column and you deleted an element from that collection.

Even that, though, still doesn’t quite compare, because if you delete all attributes of a certain type, there is no placeholder attribute left behind in an LDAP directory; whereas if you had a collection type in a table, the column would still exist but hold a null or empty collection.

So, the lesson here is that an LDAP modify should be seen as a more complex operation than a simple SQL update.

Next I’ll perform an update equivalent to the one before, except instead of doing a replace, I’ll add a new description attribute and then delete the old one. This might seem like a violation because there will temporarily be two descriptions for a single name, but the modify_s call is essentially atomic, no matter how many attribute modifications are applied. So, the LDAP constraint ensuring a single description for a name isn’t applied until after all of the modifications are completed.

This example is merely for illustration, not a recommendation. If you have a single value to update, using replace is more efficient than an add and a delete. Also note this example assumes the directory contents are as they were after the initial insert above, not after the previous modify example.

DECLARE
    v_session      DBMS_LDAP.session;
    v_result       PLS_INTEGER;
    v_attributes   DBMS_LDAP.mod_array;
    v_strings      DBMS_LDAP.string_collection;
BEGIN
    DBMS_LDAP.use_exception := FALSE;
    v_session := DBMS_LDAP.init(hostname => 'ldap.example.net', portnum => DBMS_LDAP.port);

    v_result :=
        DBMS_LDAP.simple_bind_s(ld       => v_session,
                                dn       => 'uid=myuserid,ou=users,dc=example,dc=net',
                                passwd   => 'secret_password');

    IF v_result != DBMS_LDAP.success
    THEN
        DBMS_OUTPUT.put_line('Bind Result: ' || v_result || ' ' || DBMS_LDAP.err2string(v_result));
    ELSE
        v_attributes := DBMS_LDAP.create_mod_array(num => 2);

        v_strings(1) :=
            '(DESCRIPTION=(ADDRESS=(PROTOCOL=TCP)(HOST=myotherserver)(PORT=1521))(CONNECT_DATA=(SERVICE_NAME=myotherservice)))';
        DBMS_LDAP.populate_mod_array(modptr     => v_attributes,
                                     mod_op     => DBMS_LDAP.mod_add,
                                     mod_type   => 'orclNetDescString',
                                     modval     => v_strings);

        v_strings(1) :=
            '(DESCRIPTION=(ADDRESS=(PROTOCOL=TCP)(HOST=myserver)(PORT=1521))(CONNECT_DATA=(SERVICE_NAME=myservice)))';
        DBMS_LDAP.populate_mod_array(modptr     => v_attributes,
                                     mod_op     => DBMS_LDAP.mod_delete,
                                     mod_type   => 'orclNetDescString',
                                     modval     => v_strings);


        v_result :=
            DBMS_LDAP.modify_s(ld        => v_session,
                               entrydn   => 'cn=test_name,cn=OracleContext,dc=example,dc=net',
                               modptr    => v_attributes);

        IF v_result = DBMS_LDAP.success
        THEN
            DBMS_OUTPUT.put_line('Modify successful');
        ELSE
            DBMS_OUTPUT.put_line('Modify Result: ' || v_result || ' ' || DBMS_LDAP.err2string(v_result));
        END IF;

        DBMS_LDAP.free_mod_array(modptr => v_attributes);
        v_result := DBMS_LDAP.unbind_s(v_session);
    END IF;
END;
/

If the block is run twice you’ll get an error trying to add the same description value twice.

Modify Result: 20 Type or value exists

Rearranging the delete to occur first won’t help, because the delete is for the old description value. So, if you put the delete first and run it twice, you’ll get a different error, but the block will still fail.

Modify Result: 16 No such attribute

In addition to the inefficiency of using two modifications instead of one, these errors also help to illustrate why the replace modification is a better choice when it applies.

Of course, if you have a record with a multi-valued attribute, it may be appropriate to delete one value and add another instead of doing a replace. For example, when updating a person’s record you might delete one or more old email addresses and add new ones. Or you might not do a replace at all, perhaps adding children to a parent or making changes to the employees of a department. The modification action may consist only of add operations or only of deletes. During a business’s reorganization, a department might change its name using a replace, change managers, also using a replace, and then add several employees and remove others. All of that could be executed with a single modify_s call against the department’s record.
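
As a rough sketch of that last scenario, the block below builds a single modification array carrying two replaces, an add, and a delete, and applies them with one modify_s call. The entry DN and the attribute names (description, manager, uniqueMember) are illustrative assumptions about the directory’s schema, not something taken from the examples above, and error checking is trimmed for brevity.

DECLARE
    v_session      DBMS_LDAP.session;
    v_result       PLS_INTEGER;
    v_attributes   DBMS_LDAP.mod_array;
    v_strings      DBMS_LDAP.string_collection;
BEGIN
    DBMS_LDAP.use_exception := FALSE;
    v_session := DBMS_LDAP.init(hostname => 'ldap.example.net', portnum => DBMS_LDAP.port);

    v_result :=
        DBMS_LDAP.simple_bind_s(ld       => v_session,
                                dn       => 'uid=myuserid,ou=users,dc=example,dc=net',
                                passwd   => 'secret_password');

    v_attributes := DBMS_LDAP.create_mod_array(num => 4);

    -- Rename the department (replace)
    v_strings(1) := 'Widget Engineering';
    DBMS_LDAP.populate_mod_array(modptr     => v_attributes,
                                 mod_op     => DBMS_LDAP.mod_replace,
                                 mod_type   => 'description',
                                 modval     => v_strings);

    -- Change the manager (replace)
    v_strings(1) := 'uid=newmanager,ou=users,dc=example,dc=net';
    DBMS_LDAP.populate_mod_array(modptr     => v_attributes,
                                 mod_op     => DBMS_LDAP.mod_replace,
                                 mod_type   => 'manager',
                                 modval     => v_strings);

    -- Add one employee to the multi-valued membership attribute
    v_strings(1) := 'uid=newhire,ou=users,dc=example,dc=net';
    DBMS_LDAP.populate_mod_array(modptr     => v_attributes,
                                 mod_op     => DBMS_LDAP.mod_add,
                                 mod_type   => 'uniqueMember',
                                 modval     => v_strings);

    -- Remove another employee from the same attribute
    v_strings(1) := 'uid=departed,ou=users,dc=example,dc=net';
    DBMS_LDAP.populate_mod_array(modptr     => v_attributes,
                                 mod_op     => DBMS_LDAP.mod_delete,
                                 mod_type   => 'uniqueMember',
                                 modval     => v_strings);

    -- All four changes are applied as one atomic modification
    v_result :=
        DBMS_LDAP.modify_s(ld        => v_session,
                           entrydn   => 'cn=widgets,ou=departments,dc=example,dc=net',
                           modptr    => v_attributes);

    DBMS_LDAP.free_mod_array(modptr => v_attributes);
    v_result := DBMS_LDAP.unbind_s(v_session);
END;
/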

This concludes my series on DBMS_LDAP. I hope you found it enlightening and the examples useful starting points for your own applications. I know I learned more in the process of writing it.

Questions on this chapter or any of the previous chapters are always welcome.