My favorite database feature: the “Collapsing Index”

One of the interesting features of Oracle is if an entry in an index would only have nulls in it, then that entry isn’t created.
A simple test can verify this assertion.

SQL> CREATE TABLE null_test
  2  (
  3      id              INTEGER NOT NULL,
  4      dummy_string    VARCHAR2(10)
  5  );

Table created.

SQL> INSERT INTO null_test(id, dummy_string)
  2          SELECT LEVEL, 'populated'
  3            FROM DUAL
  4      CONNECT BY LEVEL <= 10;

10 rows created.

SQL> INSERT INTO null_test(id, dummy_string)
  2      SELECT 11, NULL
  3        FROM DUAL;

1 row created.

SQL> INSERT INTO null_test(id, dummy_string)
  2      SELECT 12, NULL
  3        FROM DUAL;

1 row created.

SQL> CREATE INDEX null_test_dummy_string_idx
  2      ON null_test(dummy_string);

Index created.

SQL> BEGIN
  2      DBMS_STATS.gather_table_stats(USER, 'NULL_TEST');
  3  END;
  4  /

PL/SQL procedure successfully completed.

So, at this point I have a table with 12 rows in it. Of the index column, 10 rows are populated and 2 are left null. Now that I’ve gathered stats, lets see what the contents of the table vs the index look like.

SQL> SELECT num_rows
  2    FROM user_tables
  3   WHERE table_name = 'NULL_TEST';

  NUM_ROWS
----------
        12

SQL> SELECT num_rows
  2    FROM user_ind_statistics
  3   WHERE index_name = 'NULL_TEST_DUMMY_STRING_IDX';

  NUM_ROWS
----------
        10

As expected, the index is missing the 2 null rows.

So, what can we do with this feature? How can missing data be part of my favorite feature? Don’t we want data to be in an index for easy lookup?

Yes, we do… BUT, what we really want is the data we want to lookup to be in our index. That is, an index that has extra, unimportant data in it is just wasted space and more data to sift through. So, what if we could have an index that only contained exactly the rows we were interested in? Then, as the data became less interesting it would disappear from the index and the index would “collapse” to once again only contain the important data. Sounds ideal, right, but what sort of use-case would have data like that and could use such an index?

Processing queues are a classic example of this kind of data. Over time a table will accumulate many records that have been completed and don’t need revisited, or if they do, they don’t have the same urgency as new, unprocessed records. In this case, the ideal index would point to only the new records and exclude all the old ones.

To illustrate I’ll create a simple task queue containing just 2 columns, one a TASK_ID, and a second column which is a flag indicating if the record has been PROCESSED or not. Then I’ll fill the table with 100000 rows of old data, where the processed flag = ‘Y’, then I’ll insert 10 new rows with processed flag = ‘N’. Last, create an index on the processed flag, gather stats and check the numbers.

SQL> CREATE TABLE task_queue
  2  (
  3      task_id      INTEGER NOT NULL,
  4      processed    CHAR(1) NOT NULL
  5  );

Table created.

SQL> INSERT INTO task_queue(task_id, processed)
  2          SELECT LEVEL, 'Y'
  3            FROM DUAL
  4      CONNECT BY LEVEL <= 100000;

100000 rows created.

SQL> INSERT INTO task_queue(task_id, processed)
  2          SELECT 100000 + LEVEL, 'N'
  3            FROM DUAL
  4      CONNECT BY LEVEL <= 10;

10 rows created.

SQL> CREATE INDEX task_queue_processed_idx
  2      ON task_queue(processed);

Index created.

SQL> BEGIN
  2      DBMS_STATS.gather_table_stats(USER, 'TASK_QUEUE');
  3  END;
  4  /

PL/SQL procedure successfully completed.

SQL> SELECT column_name, num_distinct, num_nulls
  2    FROM user_tab_col_statistics
  3   WHERE table_name = 'TASK_QUEUE';

COLUMN_NAME               NUM_DISTINCT  NUM_NULLS
------------------------- ------------ ----------
TASK_ID                         100010          0
PROCESSED                            2          0

SQL> SELECT index_name,
  2         leaf_blocks,
  3         distinct_keys,
  4         num_rows
  5    FROM user_ind_statistics
  6   WHERE table_name = 'TASK_QUEUE';

INDEX_NAME                     LEAF_BLOCKS  DISTINCT_KEYS   NUM_ROWS
------------------------------ -----------  ------------- ----------
TASK_QUEUE_PROCESSED_IDX               182             2     100010

All looks pretty good at this point but when we try to query the table to find the new rows we get poor results.
The index isn’t even used and we use nearly 192 get operations to find just the 10 rows we want.

SQL> SELECT *
  2    FROM task_queue
  3   WHERE processed = 'N';

   TASK_ID P
---------- -
    100001 N
    100002 N
    100003 N
    100004 N
    100005 N
    100006 N
    100007 N
    100008 N
    100009 N
    100010 N

10 rows selected.


Execution Plan
----------------------------------------------------------
Plan hash value: 2224684298

--------------------------------------------------------------------------------
| Id  | Operation         | Name       | Rows  | Bytes | Cost (%CPU)| Time     |
--------------------------------------------------------------------------------
|   0 | SELECT STATEMENT  |            | 50005 |   341K|    69   (2)| 00:00:01 |
|*  1 |  TABLE ACCESS FULL| TASK_QUEUE | 50005 |   341K|    69   (2)| 00:00:01 |
--------------------------------------------------------------------------------

Predicate Information (identified by operation id):
---------------------------------------------------

   1 - filter("PROCESSED"='N')


Statistics
----------------------------------------------------------
          0  recursive calls
          0  db block gets
        192  consistent gets
          0  physical reads
          0  redo size
        538  bytes sent via SQL*Net to client
        552  bytes received via SQL*Net from client
          2  SQL*Net roundtrips to/from client
          0  sorts (memory)
          0  sorts (disk)
         10  rows processed

Let’s see if we can do better. Going back to original assertion that null-only entries won’t be indexed – we throw out all the processed records if we treat them as nulls. Using the NULLIF function we’ll achieve exactly the functionality we’re looking for. ‘Y’ values will become NULL, ‘N’ values will be the only ones left to index.

SQL> CREATE INDEX task_queue_unprocessed_idx
  2      ON task_queue(NULLIF(processed,'Y'));

Index created.

SQL> SELECT index_name,
  2         leaf_blocks,
  3         distinct_keys,
  4         num_rows
  5    FROM user_ind_statistics
  6   WHERE table_name = 'TASK_QUEUE';

INDEX_NAME                      LEAF_BLOCKS DISTINCT_KEYS   NUM_ROWS
------------------------------- ----------- ------------- ----------
TASK_QUEUE_PROCESSED_IDX                182             2     100010
TASK_QUEUE_UNPROCESSED_IDX                1             1         10

That looks promising. The new index only has 10 rows in it and it’s much smaller, all 10 leaf nodes fit into a single block.

SQL> SELECT *
  2    FROM task_queue
  3   WHERE NULLIF(processed,'Y') = 'N';

   TASK_ID P
---------- -
    100001 N
    100002 N
    100003 N
    100004 N
    100005 N
    100006 N
    100007 N
    100008 N
    100009 N
    100010 N

10 rows selected.


Execution Plan
----------------------------------------------------------
Plan hash value: 1345274010

------------------------------------------------------------------------------------------------------------------
| Id  | Operation                           | Name                       | Rows  | Bytes | Cost (%CPU)| Time     |
------------------------------------------------------------------------------------------------------------------
|   0 | SELECT STATEMENT                    |                            |  1000 |  7000 |     2   (0)| 00:00:01 |
|   1 |  TABLE ACCESS BY INDEX ROWID BATCHED| TASK_QUEUE                 |  1000 |  7000 |     2   (0)| 00:00:01 |
|*  2 |   INDEX RANGE SCAN                  | TASK_QUEUE_UNPROCESSED_IDX |    10 |       |     1   (0)| 00:00:01 |
------------------------------------------------------------------------------------------------------------------

Predicate Information (identified by operation id):
---------------------------------------------------

   2 - access(CASE  WHEN "PROCESSED"='Y' THEN NULL ELSE "PROCESSED" END ='N')


Statistics
----------------------------------------------------------
          0  recursive calls
          0  db block gets
          4  consistent gets
          0  physical reads
          0  redo size
        538  bytes sent via SQL*Net to client
        552  bytes received via SQL*Net from client
          2  SQL*Net roundtrips to/from client
          0  sorts (memory)
          0  sorts (disk)
         10  rows processed

That’s much better, only 4 gets needed instead of 192. That’s much more efficient. Also note the NULLIF is actually rewritten in the function-based index definition and the execution plan filter/access operation.

Writing the WHERE-clause with the NULLIF or CASE is somewhat inconvenient syntax, so I often create a view to encapsulate the function filter as well as provide a meaningful name. Here we’re looking for unprocessed tasks, so that’s what I name the view.

SQL> CREATE OR REPLACE VIEW unprocessed_tasks
  2  AS
  3      SELECT *
  4        FROM task_queue
  5       WHERE NULLIF(processed,'Y') = 'N';

View created.

SQL> SELECT * FROM unprocessed_tasks;

   TASK_ID P
---------- -
    100001 N
    100002 N
    100003 N
    100004 N
    100005 N
    100006 N
    100007 N
    100008 N
    100009 N
    100010 N

10 rows selected.


Execution Plan
----------------------------------------------------------
Plan hash value: 1345274010

------------------------------------------------------------------------------------------------------------------
| Id  | Operation                           | Name                       | Rows  | Bytes | Cost (%CPU)| Time     |
------------------------------------------------------------------------------------------------------------------
|   0 | SELECT STATEMENT                    |                            |  1000 |  7000 |     2   (0)| 00:00:01 |
|   1 |  TABLE ACCESS BY INDEX ROWID BATCHED| TASK_QUEUE                 |  1000 |  7000 |     2   (0)| 00:00:01 |
|*  2 |   INDEX RANGE SCAN                  | TASK_QUEUE_UNPROCESSED_IDX |    10 |       |     1   (0)| 00:00:01 |
------------------------------------------------------------------------------------------------------------------

Predicate Information (identified by operation id):
---------------------------------------------------

   2 - access(CASE  WHEN "PROCESSED"='Y' THEN NULL ELSE "PROCESSED" END ='N')


Statistics
----------------------------------------------------------
          0  recursive calls
          0  db block gets
          4  consistent gets
          0  physical reads
          0  redo size
        538  bytes sent via SQL*Net to client
        552  bytes received via SQL*Net from client
          2  SQL*Net roundtrips to/from client
          0  sorts (memory)
          0  sorts (disk)
         10  rows processed

Some might wonder: “What about that normal index we created at the beginning?” Well, the optimizer doesn’t want to use it by default; but we can use a HINT to tell the optimizer to use it instead of the full table scan.

SQL> SELECT  /*+ INDEX(task_queue task_queue_processed_idx) */
  2         *
  3    FROM task_queue
  4   WHERE processed = 'N';

   TASK_ID P
---------- -
    100001 N
    100002 N
    100003 N
    100004 N
    100005 N
    100006 N
    100007 N
    100008 N
    100009 N
    100010 N

10 rows selected.


Execution Plan
----------------------------------------------------------
Plan hash value: 2535177461

----------------------------------------------------------------------------------------------------------------
| Id  | Operation                           | Name                     | Rows  | Bytes | Cost (%CPU)| Time     |
----------------------------------------------------------------------------------------------------------------
|   0 | SELECT STATEMENT                    |                          | 50005 |   341K|   175   (1)| 00:00:01 |
|   1 |  TABLE ACCESS BY INDEX ROWID BATCHED| TASK_QUEUE               | 50005 |   341K|   175   (1)| 00:00:01 |
|*  2 |   INDEX RANGE SCAN                  | TASK_QUEUE_PROCESSED_IDX | 50005 |       |    91   (0)| 00:00:01 |
----------------------------------------------------------------------------------------------------------------

Predicate Information (identified by operation id):
---------------------------------------------------

   2 - access("PROCESSED"='N')


Statistics
----------------------------------------------------------
          0  recursive calls
          0  db block gets
          5  consistent gets
          0  physical reads
          0  redo size
        562  bytes sent via SQL*Net to client
        552  bytes received via SQL*Net from client
          2  SQL*Net roundtrips to/from client
          0  sorts (memory)
          0  sorts (disk)
         10  rows processed

5 gets isn’t bad. It’s definitely a lot better than the 192 of the full table scan, but it’s not as good as the function-based “collapsing” index approach. But, this is a very small table with not all that many rows in it. Let’s look at the relative sizes of the objects created.

SQL> SELECT segment_name,
  2         segment_type,
  3         bytes,
  4         blocks
  5    FROM user_segments
  6   WHERE segment_name LIKE 'TASK_QUEUE%';

SEGMENT_NAME                    SEGMENT_TYPE            BYTES     BLOCKS
------------------------------- ------------------ ---------- ----------
TASK_QUEUE                      TABLE                 2097152        256
TASK_QUEUE_PROCESSED_IDX        INDEX                 2097152        256
TASK_QUEUE_UNPROCESSED_IDX      INDEX                   65536          8

As expected, the function-based index has collapsed to just a small fraction of the size of the table, whereas the normal index is just as big as the table itself. Of course, with a more complex table the sizes will likely differ; but the intent is to show relative scale of the different indexes. Therefore, it is possible to get nearly the same execution performance with a hinted index it’ll still never be quite the same and you’ll consume more space. Furthermore, the normal index will continue to grow forever. The collapsing index should stay small as long as you regularly process the new records. It won’t matter if your table grows into the trillions of rows, if you never have more than a few dozen unprocessed records at a time, the index will never require more than a few blocks to hold them.

Function-based indexes were introduced in 8i and I’ve used this technique in every db version since.
I hope you find it as useful as I have.

Note 1: All of the autotrace results shown above were taken after repeated executions and the results stabilized after the effects of caching were removed. You should get comparable results on your systems; but not necessarily on the first execution of any of the queries.

Note 2: This isn’t just an Oracle feature. Not indexing NULLs is a fairly standard operation across modern relational databases. DB2 has similar functionality in version 10.5 and above via expression-based indexes. SQL Server doesn’t have quite the same functionality but indexes on non-persisted, computed columns can achieve somewhat similar benefits.