Close

Strange (erroneous) query from Oracle Audit Vault

Today I was asked about a query from an Audit Vault job that was failing. Unfortunately, the job didn’t report any useful error messages to explain the failure. After some digging the problem was narrowed down to a query pulling user account information. In particular, a calculated column was proving both erroneous as well as confusing, thus prompting the question: “What is this doing?”

The calculated column (with a little formatting) was:

TO_CHAR(lock_date
        - (TO_NUMBER(SUBSTR(SYSTIMESTAMP,
                            (INSTR(SYSTIMESTAMP,':',-1,1)) - 3,
                            3)
                    )
          ) / 24,
        'MM/dd/yyyy HH24:mi:ss')

So, the first question was – what is this trying to do and why does it fail? The first thing I noticed was the function was attempting to parse SYSTIMESTAMP as if it were text. That is possible, because Oracle will implicitly convert it to a VARCHAR2 type, but we don’t know the format the job was using when it failed. I ran the query in a test database with my session’s NLS_TIMESTAMP_TZ_FORMAT set to ‘yyyy-mm-dd hh24:mi:ssxff tzr’ producing results like this:

SQL> alter session set nls_timestamp_tz_format='yyyy-mm-dd hh24:mi:ssxff tzr';

Session altered.

SQL> alter session set nls_date_format='yyyy-mm-dd hh24:mi:ss';

Session altered.

SQL> select lock_date,TO_CHAR(lock_date
  2          - (TO_NUMBER(SUBSTR(SYSTIMESTAMP,
  3                              (INSTR(SYSTIMESTAMP,':',-1,1)) - 3,
  4                              3)
  5                      )
  6            ) / 24,
  7          'MM/dd/yyyy HH24:mi:ss') ae$date from sys.cdb_users;

LOCK_DATE           AE$DATE
------------------- -------------------
2018-02-07 20:00:11 02/08/2018 00:00:11
2018-08-21 23:42:02 08/22/2018 03:42:02
2018-02-07 19:21:02 02/07/2018 23:21:02
2018-02-07 20:15:41 02/08/2018 00:15:41
2018-02-07 20:15:45 02/08/2018 00:15:45
2018-02-07 20:00:25 02/08/2018 00:00:25
2018-02-07 20:16:04 02/08/2018 00:16:04
2018-02-07 21:05:41 02/08/2018 01:05:41
2018-08-25 14:25:05 08/25/2018 18:25:05

Thus it seems the calculation is shifting the LOCK_DATE by 4 hours, but why and why does it work for me in the test db but not for the Audit Vault in the real one? The answer to that is in the text parsing.

In the problem db there was a logon trigger setting the NLS_TIMESTAMP_TZ_FORMAT to a custom form. The calculation makes the assumption that either the ‘TZR’ format model (with numeric offset) or ‘TZH:TZM’ is the last part of the converted text. In my test scenario that was the case, in the problem db it was not, that database used a format to support the assumptions of other processes. This is the reason implicit conversions should be avoided. You can’t always count on implicit formats being what you expect. Furthermore, there are security implications when implicit transformations are used. A better solution would have been to replace the implicit conversions with explicit ones. For example, rewriting the query as shown below would work regardless of the session timestamp format mask.

SQL> select lock_date,TO_CHAR(lock_date
  2          - (TO_NUMBER(SUBSTR(TO_CHAR(SYSTIMESTAMP,'TZR'),
  3                              (INSTR(TO_CHAR(SYSTIMESTAMP,'TZR'),':',-1,1)) - 3,
  4                              3)
  5                      )
  6            ) / 24,
  7          'MM/dd/yyyy HH24:mi:ss') ae$date from sys.cdb_users;

LOCK_DATE           AE$DATE
------------------- -------------------
2018-02-07 20:00:11 02/08/2018 00:00:11
2018-08-21 23:42:02 08/22/2018 03:42:02
2018-02-07 19:21:02 02/07/2018 23:21:02
2018-02-07 20:15:41 02/08/2018 00:15:41
2018-02-07 20:15:45 02/08/2018 00:15:45
2018-02-07 20:00:25 02/08/2018 00:00:25
2018-02-07 20:16:04 02/08/2018 00:16:04
2018-02-07 21:05:41 02/08/2018 01:05:41
2018-08-25 14:25:05 08/25/2018 18:25:05

That solves the immediate problem of why the query fails and what needs to be done to fix it; but still doesn’t answer the how and intent. The function is extracting the time zone hour offset from SYSTIMESTAMP, but how? In my test case the offset was -04:00 because I was running in US/Eastern time zone during Daylight Saving Time.

INSTR finds the colon separating the hours from minutes, then adjusts by 3 characters to the left to accommodate hour offsets of -23 to +23 (the ranges of real time zones are only -12 to +14 though.) Then using SUBSTR from that string index, parses the SYSTIMESTAMP converted text again, returning the hour portion of the offset; and finally TO_NUMBER changes that substring into a number – here too it would be better to use an explicit format such as TO_NUMBER(…,’S00′) to ensure a valid conversion.

Thus a “final” form with all conversions made explicit and reliable regardless of session settings might look like this:

TO_CHAR(lock_date
- (TO_NUMBER(SUBSTR(TO_CHAR(SYSTIMESTAMP,'TZH'),
                    (INSTR(TO_CHAR(SYSTIMESTAMP,'TZH:TZM'),':',-1,1)) - 3,
                    3),
             'S00'
            )
  ) / 24,
'MM/dd/yyyy HH24:mi:ss')

While this syntax would function reliably, it would still fail a code inspection for violating my rule of “Does this code jump through too many hoops?”
This was evident before ever running the code myself. The initial question was “What is this doing?” With 4 data type conversions, multiple substring parses, index adjustment, and date math the result is certainly not immediately obvious, thus prompting the next question: “Is there a better way?”

Yes, in multiple ways. First, it is possible to simplify the syntax above by only using the TZH format, thus removing the need to find that field within the string. Even better though, when timestamps were introduced in 9i, they also came with the EXTRACT function to pull pieces from them. So a much simpler, more efficient, and self-documenting syntax would be: extract(timezone_hour from systimestamp)

SQL> select extract(timezone_hour from systimestamp) from dual;

EXTRACT(TIMEZONE_HOURFROMSYSTIMESTAMP)
--------------------------------------
                                    -4

If used in the context of the original query…

SQL> SELECT lock_date, TO_CHAR(lock_date - EXTRACT(TIMEZONE_HOUR FROM SYSTIMESTAMP)/24,
  2                            'MM/dd/yyyy HH24:mi:ss') ae$date
  3    FROM sys.cdb_users;

LOCK_DATE           AE$DATE
------------------- -------------------
2018-02-07 20:00:11 02/08/2018 00:00:11
2018-08-21 23:42:02 08/22/2018 03:42:02
2018-02-07 19:21:02 02/07/2018 23:21:02
2018-02-07 20:15:41 02/08/2018 00:15:41
2018-02-07 20:15:45 02/08/2018 00:15:45
2018-02-07 20:00:25 02/08/2018 00:00:25
2018-02-07 20:16:04 02/08/2018 00:16:04
2018-02-07 21:05:41 02/08/2018 01:05:41
2018-08-25 14:25:05 08/25/2018 18:25:05

That’s much better. Using the offset, the LOCK_DATE values are now adjusted to UTC/GMT time. This is helpful within the Audit Vault so all of the data is stored in a consistent form for easier comparison, joins, filtering, etc.

But…

The query is still not entirely correct!
Note all of the lock_date values from February. Their adjusted AE$DATE value is NOT equivalent to UTC/GMT time, but the August values are. The reason for this the conversion logic does not take Daylight Saving Time into account. During February, my system would be in US/Eastern Standard time thus having an offset of -5 hours. If I wait a few more months until November when I’m back in Standard time the February values will convert correctly but then the August values will be wrong.

Unfortunately, there isn’t a good solution for this problem. It is possible with a java stored procedure or an external procedure call to retrieve the system timezone setting. Ideally that will return a time zone name or unique abbreviation. If so, that could be used from the lock date value instead of a fixed offset from current system time.

That still isn’t completely reliable though!
The transition from Daylight Saving Time to Standard Time has two instances of the 1am-2am hour (in the US, other countries will have their own variations of the same issue) and nothing in the DATE value of LOCK_DATE will indicate whether a value is from the first or second iteration through the transition period.

But, even if you somehow solve that problem, it’s STILL not completely correct!
All of the math to adjust dates to UTC/GMT assume an integer hour offset; but some time zones around the world are not on even hour offsets. Asia/Kathmandu is +05:45, Canada/Newfoundland is -02:30. So the entire premise of the adjustment math is incorrect. This problem is possible to fix though, if you know the full offset. You simply include the minutes in the calculation as well as the hours.

lock_date - EXTRACT(TIMEZONE_HOUR FROM SYSTIMESTAMP)/24  - EXTRACT(TIMEZONE_MINUTE FROM SYSTIMESTAMP)/1440

In summary, the original query has a multitude of failings:

  1. Unreliable syntax that simply fails if specific system assumptions are not met
  2. Overly complex computation when native functionality exists
  3. Unreliable date math not supporting Daylight Saving Time
  4. Incorrect time adjustment not taking into account non-hourly offsets

The point of this article wasn’t to pick on Audit Vault, it’s a fine product, and these issues were relatively easy to work around within the real system. No, the real message here is I see these same types of problems all the time, especially when date/time values are involved. This one query highlighted several from a family of logic and syntax errors that developers fall prey to with alarming frequency. Hopefully the breakdown and various rewrites and comments will help others avoid some of these pitfalls and correct the ones they do encounter.

Questions and comments, as always, are welcome!