Close

Is it a bug or a quirky feature?

I was asked about some strange behavior querying a view where the results of a single row seemed to inexplicably change depending on the where clause. After a few minutes of looking around and creating a simplified test case I was able to produce similar odd behavior. Before digging in, let me start by showing some of the magic trick involved here.

SQL> select * from v3;
no rows selected

SQL> select * from v3 where id = 'CLII';

ID                  CNT
------------ ----------
CLII                 26

SQL> select * from v3 where id like 'CLI%';
no rows selected

Simply querying the view returns no rows. Providing an equality condition in the where clause causes the view to return data! But, expanding the where clause with a wildcard causes the row we just saw to disappear and we get no results again.

I saw this in 18c and 19c databases on AIX and Linux servers as well as 12.2 on an Exadata. I also tested in 11.2 and 12.1 but did not reproduce the symptoms in any of those systems. I was also able to replicate it on Oracle’s own LiveSQL at https://livesql.oracle.com/apex/livesql/s/jb0yz14dil4rbplu1d78vvy61.

So, why does this happen? The answer is due to optimizations around predicate pushing. Let’s look at the explain plan and predicates of the query without a where clause.

--------------------------------------------------
| Id  | Operation             | Name    | E-Rows |
--------------------------------------------------
|   0 | SELECT STATEMENT      |         |        |
|*  1 |  FILTER               |         |        |
|*  2 |   HASH JOIN OUTER     |         |  19999 |
|   3 |    NESTED LOOPS       |         |    999 |
|   4 |     TABLE ACCESS FULL | TAB1    |    999 |
|*  5 |     INDEX UNIQUE SCAN | PK_TAB2 |      1 |
|   6 |    VIEW               | V1      |  19999 |
|*  7 |     HASH GROUP BY     |         |  19999 |
|   8 |      TABLE ACCESS FULL| TAB3    |  19999 |
--------------------------------------------------
Predicate Information (identified by operation id):
---------------------------------------------------
 
   1 - filter(NVL("V1"."CNT",0)>0)
   2 - access("TAB2"."T2_ID"="V1"."ID")
   5 - access("TAB1"."T1_ID"="TAB2"."T2_ID")
   7 - filter(COUNT("T3_ID")>0)

Take note of the access predicates in lines 2 and 5. Now, compare that to the plan and predicates of the query with the equality condition.

--------------------------------------------------------
| Id  | Operation            | Name           | E-Rows |
--------------------------------------------------------
|   0 | SELECT STATEMENT     |                |     26 |
|*  1 |  FILTER              |                |        |
|   2 |   NESTED LOOPS OUTER |                |     26 |
|   3 |    NESTED LOOPS      |                |      1 |
|*  4 |     INDEX UNIQUE SCAN| PK_TAB2        |      1 |
|*  5 |     INDEX UNIQUE SCAN| PK_TAB1        |      1 |
|   6 |    VIEW              | V1             |     26 |
|*  7 |     SORT GROUP BY    |                |     26 |
|*  8 |      INDEX RANGE SCAN| IDX_TAB3_T3_ID |     26 |
--------------------------------------------------------
Predicate Information (identified by operation id):
---------------------------------------------------
   1 - filter(NVL("V1"."CNT",0)>0)
   4 - access("TAB2"."T2_ID"='CLII')
   5 - access("TAB1"."T1_ID"='CLII')
   7 - filter(COUNT("T3_ID")>0)
   8 - access("T3_ID"='CLII')

Here, the literal in the outer query is pushed into predicates across 3 indexes. Thus all comparisons are directly against the literal value and not, as in the previous example, between two columns.

This distinction is important because padding semantics apply when using the literal value; but not when comparing the columns. From the SQL Reference:

Oracle uses blank-padded comparison semantics only when both values in the comparison are either expressions of data type CHAR, NCHAR, text literals, or values returned by the USER function.

https://docs.oracle.com/en/database/oracle/oracle-database/19/sqlrf/Data-Type-Comparison-Rules.html#GUID-A114F1F4-A08D-4107-B679-323DC7FEA31C

and a little later in the same document…

Oracle uses nonpadded comparison semantics whenever one or both values in the comparison have the data type VARCHAR2 or NVARCHAR2.

Now you can probably see where this is going. There must be a mix of data types in the tables. Some have an ID column of VARCHAR2 type and some are CHAR. This datatype discrepancy creates the opportunity for the conditions to go awry.

But what about that third query, with the LIKE clause and the literal? Why does it also return no rows? It turns out it’s the same problem even with predicate pushing of the literal. Here is the plan and predicates for the LIKE query:

---------------------------------------------------------
| Id  | Operation             | Name           | E-Rows |
---------------------------------------------------------
|   0 | SELECT STATEMENT      |                |        |
|*  1 |  FILTER               |                |        |
|*  2 |   HASH JOIN OUTER     |                |      5 |
|   3 |    JOIN FILTER CREATE | :BF0000        |      5 |
|   4 |     NESTED LOOPS      |                |      5 |
|*  5 |      INDEX RANGE SCAN | PK_TAB1        |      5 |
|*  6 |      INDEX UNIQUE SCAN| PK_TAB2        |      1 |
|   7 |    VIEW               | V1             |    130 |
|*  8 |     HASH GROUP BY     |                |    130 |
|   9 |      JOIN FILTER USE  | :BF0000        |    130 |
|* 10 |       INDEX RANGE SCAN| IDX_TAB3_T3_ID |    130 |
---------------------------------------------------------
Predicate Information (identified by operation id):
---------------------------------------------------
   1 - filter(NVL("V1"."CNT",0)>0)
   2 - access("TAB2"."T2_ID"="V1"."ID")
   5 - access("TAB1"."T1_ID" LIKE 'CLI%')
       filter("TAB1"."T1_ID" LIKE 'CLI%')
   6 - access("TAB1"."T1_ID"="TAB2"."T2_ID")
       filter("TAB2"."T2_ID" LIKE 'CLI%')
   8 - filter(COUNT("T3_ID")>0)
  10 - access("T3_ID" LIKE 'CLI%')
       filter("T3_ID" LIKE 'CLI%')

As you can see, the literal is pushed all the way into all of the indexes as expected. However, unlike the equality predicate which was guaranteed to return at most 1 row the LIKE condition can match multiple rows. In order to ensure the corresponding rows of each side of the join are matched with each other, the access predicates in steps 2 and 6 must be evaluated. Thus causing unpadded semantic comparisons, resulting in false conditions and hence, no rows returned.

The tables in the original business case had many columns but I only need one column from each to reproduce the symptoms. The query is essentially a lookup of how many rows are present in table 3 that match the intersection of two other mostly-overlapping tables. A complete example is in the LiveSQL link above; but you can run it yourself with the following setup.

create table tab1 (t1_id varchar2(12) constraint pk_tab1 primary key);
create table tab2 (t2_id varchar2(12) constraint pk_tab2 primary key);
create table tab3 (t3_id char(12));
create index idx_tab3_t3_id on tab3(t3_id);

insert into tab1(t1_id) select to_char(level,'fmRM') from dual connect by level < 1000;
insert into tab2(t2_id) select to_char(level+100,'fmRM') from dual connect by level < 1000;
insert into tab3(t3_id) select to_char(nullif(mod(level,777),0),'fmRM') from dual connect by level <20000;
commit;

create or replace view v1 as 
select t3_id id,count(t3_id) cnt from tab3 group by t3_id having count(t3_id) > 0;

create or replace view v2 as
select tab2.t2_id id, nvl(v1.cnt,0) cnt from tab2, v1
where tab2.t2_id = v1.id (+);   -- tested with ANSI JOIN syntax too, no difference in results

create or replace view v3 as
select tab1.t1_id id, v2.cnt from tab1, v2 where tab1.t1_id = v2.id and v2.cnt > 0;

select * from v3;
select * from v3 where id = 'CLII';
select * from v3 where id LIKE 'CLI%';

Putting it all together and running it…

SQL> create table tab1 (t1_id varchar2(12) constraint pk_tab1 primary key);

Table TAB1 created.

SQL> create table tab2 (t2_id varchar2(12) constraint pk_tab2 primary key);

Table TAB2 created.

SQL> create table tab3 (t3_id char(12));

Table TAB3 created.

SQL> create index idx_tab3_t3_id on tab3(t3_id);

Index IDX_TAB3_T3_ID created.

SQL> insert into tab1(t1_id) select to_char(level,'fmRM') from dual connect by level < 1000;

999 rows inserted.

SQL> insert into tab2(t2_id) select to_char(level+100,'fmRM') from dual connect by level < 1000;

999 rows inserted.

SQL> insert into tab3(t3_id) select to_char(nullif(mod(level,777),0),'fmRM') from dual connect by level <20000;

19,999 rows inserted.

SQL> commit;

Commit complete.

SQL> create or replace view v1 as
  2  select t3_id id,count(t3_id) cnt from tab3 group by t3_id having count(t3_id) > 0;

View V1 created.

SQL> create or replace view v2 as
  2  select tab2.t2_id id, nvl(v1.cnt,0) cnt from tab2, v1
  3  where tab2.t2_id = v1.id (+);

View V2 created.

SQL> create or replace view v3 as
  2  select tab1.t1_id id, v2.cnt from tab1, v2 where tab1.t1_id = v2.id and v2.cnt > 0;

View V3 created.

SQL> select * from v3;
no rows selected
SQL> select * from v3 where id = 'CLII';

ID                  CNT
------------ ----------
CLII                 26

SQL> select * from v3 where id LIKE 'CLI%';
no rows selected

Even though all of the comparison functionality is documented and I can see and explain why it happens – I think this should probably still be counted as a wrong-results bug.

3 thoughts on “Is it a bug or a quirky feature?

  1. Hi Sean,

    I have just looked at your examples and found that fix control 21800590 can be disabled to get consistent results (no rows) in all cases.
    We might get an idea about that fix control using the query below:

    select * from v$system_fix_control where bugno = 21800590;

    The following queries return no results in all environments that I tested (Oracle ATP 18c, my own 12.2 databases):

    select * from v3;
    select /*+ opt_param(‘_fix_control’ ‘21800590:0’)*/* from v3 where id = ‘CLI’;
    select /*+ opt_param(‘_fix_control’ ‘21800590:0’)*/* from v3 where id like ‘CLI%’;

    If you manage to raise an SR and/or get a bug number for that issue, it could help other Oracle users to watch out for its progress. I highly discourage mixing CHAR/VARCHAR2 datatypes in my shop due to a long history of those bugs Oracle has. A note from Tom Kyte comes to my mind about that CHAR/VARCHAR2 topic: https://asktom.oracle.com/pls/asktom/f?p=100:11:0::::P11_QUESTION_ID:2668391900346844476

    Regards,
    Mikhail.

  2. Hello Sean,

    Very interesting post 🙂

    Just out of curiosity, I checked the plan for the following query:

    select * from v3 where id like ‘CLII’
    /

    no rows selected

    It is almost identical to the plan with the equality condition, still with a NESTED LOOPS OUTER operation, but with two additional filter operation at steps 5 and 6, using join conditions.

    The filter on the view at operation 6 is the one that eliminates the (single) row, due to the CHAR/VARCHAR2 comparison.

    ————————————————————————————————-
    | Id | Operation | Name | Starts | E-Rows | A-Rows | A-Time | Buffers |
    ————————————————————————————————-
    | 0 | SELECT STATEMENT | | 1 | | 0 |00:00:00.01 | 6 |
    |* 1 | FILTER | | 1 | | 0 |00:00:00.01 | 6 |
    | 2 | NESTED LOOPS OUTER | | 1 | 1 | 1 |00:00:00.01 | 6 |
    | 3 | NESTED LOOPS | | 1 | 1 | 1 |00:00:00.01 | 4 |
    |* 4 | INDEX UNIQUE SCAN| PK_TAB2 | 1 | 1 | 1 |00:00:00.01 | 2 |
    |* 5 | INDEX UNIQUE SCAN| PK_TAB1 | 1 | 1 | 1 |00:00:00.01 | 2 |
    |* 6 | VIEW | V1 | 1 | 1 | 0 |00:00:00.01 | 2 |
    |* 7 | SORT GROUP BY | | 1 | 26 | 0 |00:00:00.01 | 2 |
    |* 8 | INDEX RANGE SCAN| IDX_TAB3_T3_ID | 1 | 26 | 0 |00:00:00.01 | 2 |
    ————————————————————————————————-

    Predicate Information (identified by operation id):
    —————————————————

    1 – filter(NVL(“V1″.”CNT”,0)>0)
    4 – access(“TAB2”.”T2_ID”=’CLII’)
    5 – access(“TAB1″.”T1_ID”=”TAB2″.”T2_ID”)
    filter(“TAB1”.”T1_ID”=’CLII’)
    6 – filter(“TAB2″.”T2_ID”=”V1″.”ID”)
    7 – filter(COUNT(“T3_ID”)>0)
    8 – access(“T3_ID”=’CLII’)

    The conclusion for the Oracle optimizer team is that, probably, when optimizing queries containing comparison conditions between different data types, the original condition (here a join condition) should be “carried on” as an additional (security) filter condition, as it is done here for the “LIKE” case.

    And, the lesson for us is that it is probably best to avoid CHAR columns at all 🙂

    I have seen many life examples where CHAR columns only produces headaches and need weird workarounds to overcome similar data type comparison problems.

    Cheers & Best Regards,
    Iudith Mentzel

Leave a Reply