Introduction to ASH Data

The Oracle database captures session activity approximately once each second. This data can then be used to identify usage patterns and performance problems. Many performance experts have written very good suites of scripts and tools to query, analyze, and chart the ASH data. However, some of these tool sets can be somewhat intimidating for new users.

In this article I’d like to introduce a few simple queries of ASH data, as well as a simple text-based charting method built from those queries, forming a foundation for other, more complex queries as users gain comfort working with the ASH data.

Before we begin digging into the data, it is important to note that ASH and AWR information is not automatically available to all users. In addition to the object privileges needed to read the views, there are licensing requirements. Simply SELECTing data from V$ACTIVE_SESSION_HISTORY or DBA_HIST_ACTIVE_SESS_HISTORY requires the Oracle Diagnostic Pack license.

Probably the most common use of ASH data is simply measuring how much activity is on the system at various points in time. When activity is low, the database approaches idle and should be responsive to user requests. As activity increases, users start to compete for the limited CPU capacity. A simple method of measuring activity is to count the number of active sessions at various points in time. At the most basic level this means counting the number of sessions captured in each sample time. For example, the following query shows that in our chosen window of 20 seconds, the system varied from a low of 4 active sessions up to a high of 18. Without any greater context, these numbers simply indicate the system had varying levels of activity. There is no indicator that the 4 was “good” or the 18 was “bad”, but the measurements are a start.

SQL>   SELECT sample_time, COUNT(*) cnt
  2      FROM gv$active_session_history
  3     WHERE sample_time >= TIMESTAMP '2018-09-15 09:00:00'
  4       AND sample_time < TIMESTAMP '2018-09-15 09:00:21'
  5  GROUP BY sample_time
  6  ORDER BY sample_time;
SAMPLE_TIME                       CNT
2018-09-15 09:00:00.969000000       6
2018-09-15 09:00:01.969000000       6
2018-09-15 09:00:02.969000000       4
2018-09-15 09:00:03.979000000       4
2018-09-15 09:00:04.979000000       4
2018-09-15 09:00:05.979000000       6
2018-09-15 09:00:06.979000000       6
2018-09-15 09:00:07.979000000       6
2018-09-15 09:00:08.979000000       6
2018-09-15 09:00:09.979000000       6
2018-09-15 09:00:10.979000000       4
2018-09-15 09:00:11.989000000       6
2018-09-15 09:00:12.989000000      18
2018-09-15 09:00:14.049000000      18
2018-09-15 09:00:15.049000000      18
2018-09-15 09:00:16.049000000      18
2018-09-15 09:00:17.049000000       4
2018-09-15 09:00:18.049000000       6
2018-09-15 09:00:19.049000000       6
2018-09-15 09:00:20.049000000       6

20 rows selected.

You may have noticed the view I queried is gv$active_session_history rather than just v$active_session_history. On a RAC system the gv$ view returns samples from every instance, which would need to be handled; in this case there is no functional difference, as my test system has only a single instance. The rest of the examples in this article are from the same system.
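On a RAC system, one way to handle the multiple instances would be to include inst_id in the grouping so each instance's samples stay distinct. A sketch of that variation (untested here, since my system has only one instance):

```sql
-- Group by inst_id as well, so counts from different RAC instances aren't combined
  SELECT inst_id, sample_time, COUNT(*) cnt
    FROM gv$active_session_history
   WHERE sample_time >= TIMESTAMP '2018-09-15 09:00:00'
     AND sample_time < TIMESTAMP '2018-09-15 09:00:21'
GROUP BY inst_id, sample_time
ORDER BY inst_id, sample_time;
```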

This strategy of pulling the numerical counts is useful, but sometimes a visualization conveys the information more easily. A simple bar chart is easy to generate by padding a string of characters to a length proportional to each count.

SQL>   SELECT sample_time, COUNT(*) cnt, RPAD('*', COUNT(*), '*') chart
  2      FROM gv$active_session_history
  3     WHERE sample_time >= TIMESTAMP '2018-09-15 09:00:00'
  4       AND sample_time < TIMESTAMP '2018-09-15 09:00:21'
  5  GROUP BY sample_time
  6  ORDER BY sample_time;
SAMPLE_TIME                       CNT CHART
2018-09-15 09:00:00.969000000       6 ******
2018-09-15 09:00:01.969000000       6 ******
2018-09-15 09:00:02.969000000       4 ****
2018-09-15 09:00:03.979000000       4 ****
2018-09-15 09:00:04.979000000       4 ****
2018-09-15 09:00:05.979000000       6 ******
2018-09-15 09:00:06.979000000       6 ******
2018-09-15 09:00:07.979000000       6 ******
2018-09-15 09:00:08.979000000       6 ******
2018-09-15 09:00:09.979000000       6 ******
2018-09-15 09:00:10.979000000       4 ****
2018-09-15 09:00:11.989000000       6 ******
2018-09-15 09:00:12.989000000      18 ******************
2018-09-15 09:00:14.049000000      18 ******************
2018-09-15 09:00:15.049000000      18 ******************
2018-09-15 09:00:16.049000000      18 ******************
2018-09-15 09:00:17.049000000       4 ****
2018-09-15 09:00:18.049000000       6 ******
2018-09-15 09:00:19.049000000       6 ******
2018-09-15 09:00:20.049000000       6 ******

We can make our chart a little smarter by checking if the number of active sessions exceeds the CPU count, thus providing some of the missing context mentioned earlier. First, we’ll see what the cpu_count init parameter is for this database.

SQL> select value from v$parameter where name = 'cpu_count';
VALUE
8

Then we’ll adjust the chart logic to use a different character when there are more active sessions than CPUs to service them (I’ve also doubled the bar widths to make the chart easier to read). Now the spike of 18 sessions not only stands out as larger than the rest of the activity, but the exclamation characters also flag it as a potential problem.

SQL>   SELECT sample_time, cnt,
  2           CASE
  3             WHEN cnt <= 8 THEN RPAD('*', cnt * 2, '*')
  4             ELSE RPAD('!', cnt * 2, '!')
  5           END chart
  6      FROM (  SELECT sample_time, COUNT(*) cnt
  7                FROM gv$active_session_history
  8               WHERE sample_time >= TIMESTAMP '2018-09-15 09:00:00'
  9                 AND sample_time < TIMESTAMP '2018-09-15 09:00:21'
 10            GROUP BY sample_time)
 11  ORDER BY sample_time;
SAMPLE_TIME                       CNT CHART
2018-09-15 09:00:00.969000000       6 ************
2018-09-15 09:00:01.969000000       6 ************
2018-09-15 09:00:02.969000000       4 ********
2018-09-15 09:00:03.979000000       4 ********
2018-09-15 09:00:04.979000000       4 ********
2018-09-15 09:00:05.979000000       6 ************
2018-09-15 09:00:06.979000000       6 ************
2018-09-15 09:00:07.979000000       6 ************
2018-09-15 09:00:08.979000000       6 ************
2018-09-15 09:00:09.979000000       6 ************
2018-09-15 09:00:10.979000000       4 ********
2018-09-15 09:00:11.989000000       6 ************
2018-09-15 09:00:12.989000000      18 !!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!
2018-09-15 09:00:14.049000000      18 !!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!
2018-09-15 09:00:15.049000000      18 !!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!
2018-09-15 09:00:16.049000000      18 !!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!
2018-09-15 09:00:17.049000000       4 ********
2018-09-15 09:00:18.049000000       6 ************
2018-09-15 09:00:19.049000000       6 ************
2018-09-15 09:00:20.049000000       6 ************

20 rows selected.
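Rather than hardcoding the 8, the threshold could also be pulled from v$parameter at query time. A sketch of that variation:

```sql
-- Compare each sample's count against cpu_count instead of a hardcoded literal
  SELECT sample_time, cnt,
         CASE
           WHEN cnt <= p.cpus THEN RPAD('*', cnt * 2, '*')
           ELSE RPAD('!', cnt * 2, '!')
         END chart
    FROM (  SELECT sample_time, COUNT(*) cnt
              FROM gv$active_session_history
             WHERE sample_time >= TIMESTAMP '2018-09-15 09:00:00'
               AND sample_time < TIMESTAMP '2018-09-15 09:00:21'
          GROUP BY sample_time)
         CROSS JOIN
         (SELECT TO_NUMBER(value) cpus
            FROM v$parameter
           WHERE name = 'cpu_count') p
ORDER BY sample_time;
```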

If you use Oracle’s SQLcl tool, the formatting of the chart can be made even more vibrant with color codes using ansiconsole formatting.

SQL>   SELECT sample_time, cnt,
  2           CASE
  3             WHEN cnt < 8 THEN '@|bg_green '
  4             WHEN cnt = 8 THEN '@|bg_yellow '
  5             ELSE '@|bg_red '
  6           END
  7           || RPAD('*', cnt, '*')
  8           || '|@'
  9               chart
 10      FROM (  SELECT sample_time, COUNT(*) cnt
 11                FROM gv$active_session_history
 12               WHERE sample_time >= TIMESTAMP '2018-09-15 09:00:00'
 13                 AND sample_time < TIMESTAMP '2018-09-15 09:00:21'
 14            GROUP BY sample_time)
 15  ORDER BY sample_time;
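Note the @|…|@ color markup is only interpreted when SQLcl’s ansiconsole formatter is active; otherwise the markup prints literally. Enable it first with:

```sql
SET SQLFORMAT ansiconsole
```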

These queries provide the most basic form of investigation. They are useful, but at one-second granularity the data volumes quickly become unwieldy to scan and visualize.
In my next article we’ll expand on these to show different methods of querying ASH data across larger time ranges while still maintaining a manageable set of results.

Simple Performance Diagnostics in APEX

Recently I received a call that an APEX application was experiencing poor performance.  They asked me to investigate what was happening on the database or in APEX itself.

First, I asked if they had any application logs giving some indication of where the problems might be. The answer came back no. There was no instrumentation or logging of their own in place to capture the activity, but we did have APEX debug enabled for some of the user actions.

So, I queried the APEX dictionary for their application, user, and the approximate time window when users saw poor performance while running in debug mode.

  SELECT view_timestamp,
         page_id,
         elapsed_time,
         page_view_type,
         apex_session_id,
         debug_page_view_id
    FROM apex_workspace_activity_log
   WHERE application_id = 100
     AND apex_user = 'EXAMPLEUSER'
     AND view_timestamp > TIMESTAMP '2018-10-02 15:00:00 -04:00'
     AND view_timestamp < TIMESTAMP '2018-10-02 15:20:00 -04:00'
ORDER BY view_timestamp;

This yielded output as follows, with page 27 having nearly 9 seconds of time. Users reported other times when the page was even slower; however, they were not running with DEBUG mode enabled then, so this was the view I used for Step 1.

VIEW_TIMESTAMP                     PAGE_ID  ELAPSED_TIME  PAGE_VIEW_TYPE  APEX_SESSION_ID  DEBUG_PAGE_VIEW_ID
2018-10-02 15:11:48.691578 -04:00  20       3.121443      Rendering       307779930648678  801               
2018-10-02 15:11:54.204761 -04:00  27       0.045717      Rendering       307779930648678  804               
2018-10-02 15:11:54.651241 -04:00  27       0.03838       Ajax            307779930648678  807               
2018-10-02 15:12:21.752888 -04:00  27       0.570659      Rendering       307779930648678  810               
2018-10-02 15:12:22.267021 -04:00  27       0.031825      Ajax            307779930648678  813               
2018-10-02 15:12:31.484343 -04:00  27       8.814905      Ajax            307779930648678  816   

With this output, I then took the session id and debug view id into Step 2.

  SELECT id,
         message_timestamp,
         elapsed_time,
         execution_time,
         message
    FROM apex_debug_messages
   WHERE application_id = 100
     AND apex_user = 'EXAMPLEUSER'
     AND session_id = 307779930648678
     AND page_view_id = 816
ORDER BY message_timestamp;

APEX debugging is quite verbose; even at the default level 4, over 11,000 debug messages were recorded within those 8 seconds.

ID      MESSAGE_TIMESTAMP                   ELAPSED_TIME    EXECUTION_TIME  MESSAGE                                                
817021  2018-10-02 15:12:22.678806 -04:00   0.009403        0.00036         Session State: fetch from database (exact)             
817022  2018-10-02 15:12:22.679166 -04:00   0.009761        0.000042        ... sentry+verification success                        
817023  2018-10-02 15:12:22.679208 -04:00   0.009802        0.000393        ...Session ID 307779930648678 can be used                                    
...
 { 11000+ more rows }
... 
828467  2018-10-02 15:12:31.483859 -04:00   8.814462        0.000292        Stop APEX Engine detected                              
828468  2018-10-02 15:12:31.484151 -04:00   8.814746                        Final commit

Scanning through all of those messages wasn't going to be a very efficient diagnostic method, so I changed the ORDER BY to sort by execution time, with the slowest first.

  SELECT id,
         message_timestamp,
         elapsed_time,
         execution_time,
         message
    FROM apex_debug_messages
   WHERE application_id = 100
     AND apex_user = 'EXAMPLEUSER'
     AND session_id = 307779930648678
     AND page_view_id = 816
ORDER BY execution_time desc;

This immediately identified the main culprits.

ID      MESSAGE_TIMESTAMP                   ELAPSED_TIME    EXECUTION_TIME  MESSAGE
828468  2018-10-02 15:12:31.484151 -04:00   8.814746                        Final commit
825696  2018-10-02 15:12:26.932971 -04:00   4.263569        2.733338        SQL Statement prepared: ... found binds=1
817036  2018-10-02 15:12:22.683182 -04:00   0.013779        1.840666        SQL Statement prepared: ... found binds=2
821366  2018-10-02 15:12:25.143178 -04:00   2.473775        1.426806        SQL Statement prepared: ... found binds=2
...

Thus 6 of the total 8.8 seconds were spent in the application SQL statements. The remainder of the time was primarily spent iterating over the result sets, populating 3 series of chart data with several thousand data points.

In this case it appears APEX was executing their application quite quickly; the problem was that their queries needed some work, possibly pruning the result sets down a little if the charting didn’t really benefit from results of that granularity.

While a 9-second page was still considered slow, even that performance was benefiting from caching due to the repeated test runs, further highlighting the need to focus on the queries.

What struck me most in this exercise was how quickly and easily the problem areas could be identified. With just three queries, each of which took only a few seconds to run, the problem queries were clearly identified. Had the culprit been a custom JavaScript function, or something in APEX itself, those too would have been similarly identified.

One of the hallmarks of a good architecture is how easy it is to diagnose. APEX gets an A+ here even before adding in any external instrumentation.

Using MERGE to INSERT overlapping source data

Recently I’ve been working with a data source that generates a new file each day for me to import. That seemed simple enough: I created an external table to read the file and then ran an insert statement to load the data into a target table. However, the source files sometimes overlap each other. That is, on day 1 I might get ids 1,2,3,4,5; then on day 2 I might get ids 3,4,5,6,7,8,9. I could simply use a MERGE statement to update matching records and then insert the new unmatched ones. That’s what MERGE is for; but in this case I know the source data has static history. So, if I update id 3 on the second day because I find a match from the first day, I know the update won’t actually change any data, making for an inefficient process doing dummy work.

As an alternate solution, I could use an insert with a unique constraint to prevent duplicates and use DBMS_ERRLOG to create a log table to catch the duplicate rows, allowing the insert to continue rather than aborting when a duplicate is found. This too, though, seems like a waste of resources.
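For reference, that approach would look roughly like the sketch below. It assumes a unique constraint on target.id (the test tables later in this article don't declare one), and it relies on the default ERR$_TARGET error log table name:

```sql
-- One-time setup: a unique key on the target plus an error log table (ERR$_TARGET by default)
ALTER TABLE target ADD CONSTRAINT target_uk UNIQUE (id);

BEGIN
    DBMS_ERRLOG.create_error_log(dml_table_name => 'TARGET');
END;
/

-- Each load: duplicate ids violate the constraint and are routed to the
-- error log table instead of failing the whole insert
INSERT INTO target (id, textval, dateval, numval)
    SELECT id, textval, dateval, numval
      FROM source
    LOG ERRORS REJECT LIMIT UNLIMITED;
```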

Of course, another option would be to do a select from the source table that excludes the matching ids with an anti-join, not-exists subquery, or not-in subquery.
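The NOT EXISTS variation of that insert might be sketched like this:

```sql
-- Insert only the source rows whose id is not already present in the target
INSERT INTO target (id, textval, dateval, numval)
    SELECT s.id, s.textval, s.dateval, s.numval
      FROM source s
     WHERE NOT EXISTS (SELECT NULL
                         FROM target t
                        WHERE t.id = s.id);
```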

All of these variations would work but are less desirable in that they require extra steps or are needlessly complex.

The easiest method I found was to simply use a one-sided merge. That is, only define a WHEN NOT MATCHED condition, so the matching records simply drop out of the merge statement when they are detected without extra processing or raising exceptions.

As a test case to illustrate, I’ll use a normal table instead of an external table as the source and simulate 3 days of results to load into the target table.

CREATE TABLE target
(
    id         NUMBER,
    textval    VARCHAR2(10),
    dateval    DATE,
    numval     NUMBER
);

CREATE TABLE source
(
    id         NUMBER,
    textval    VARCHAR2(10),
    dateval    DATE,
    numval     NUMBER
);

On day 1 we receive 5 rows of data.

-- Source data Day-1   
truncate table source;            
Insert into source (id, textval, dateval, numval) values (1,'one',  date '2018-09-02',1);
Insert into source (id, textval, dateval, numval) values (2,'two',  date '2018-09-02',4);
Insert into source (id, textval, dateval, numval) values (3,'three',date '2018-09-02',9);
Insert into source (id, textval, dateval, numval) values (4,'four', date '2018-09-02',16);
Insert into source (id, textval, dateval, numval) values (5,'five', date '2018-09-02',25);
commit;

Then use merge to insert the first day’s data.

SQL> MERGE INTO target t
  2       USING source s
  3          ON (s.id = t.id)
  4  WHEN NOT MATCHED
  5  THEN
  6      INSERT     (id,
  7                  textval,
  8                  dateval,
  9                  numval)
 10          VALUES (s.id,
 11                  s.textval,
 12                  s.dateval,
 13                  s.numval);

5 rows merged.

SQL> COMMIT;

Commit complete.

SQL> SELECT * FROM target ORDER BY id;

        ID TEXTVAL    DATEVAL       NUMVAL
---------- ---------- --------- ----------
         1 one        02-SEP-18          1
         2 two        02-SEP-18          4
         3 three      02-SEP-18          9
         4 four       02-SEP-18         16
         5 five       02-SEP-18         25

Then on day 2 we receive 4 rows of data, 2 repeats and 2 new ones.

-- Source data Day-2 
truncate table source;  
Insert into source (id, textval, dateval, numval) values (4,'four', date '2018-09-02',16);
Insert into source (id, textval, dateval, numval) values (5,'five', date '2018-09-02',25);
Insert into source (id, textval, dateval, numval) values (6,'six',  date '2018-09-03',36);
Insert into source (id, textval, dateval, numval) values (7,'seven',date '2018-09-03',49);
commit;

Then merge again. Note only the two new rows are merged.

SQL> MERGE INTO target t
  2       USING source s
  3          ON (s.id = t.id)
  4  WHEN NOT MATCHED
  5  THEN
  6      INSERT     (id,
  7                  textval,
  8                  dateval,
  9                  numval)
 10          VALUES (s.id,
 11                  s.textval,
 12                  s.dateval,
 13                  s.numval);

2 rows merged.

SQL> COMMIT;

Commit complete.

SQL> SELECT * FROM target ORDER BY id;

        ID TEXTVAL    DATEVAL       NUMVAL
---------- ---------- --------- ----------
         1 one        02-SEP-18          1
         2 two        02-SEP-18          4
         3 three      02-SEP-18          9
         4 four       02-SEP-18         16
         5 five       02-SEP-18         25
         6 six        03-SEP-18         36
         7 seven      03-SEP-18         49

On day 3 we receive 4 more rows, this time with 1 duplicate and 3 new ones.

-- Source data Day-3
truncate table source;
Insert into source (id, textval, dateval, numval) values (7, 'seven',date '2018-09-03',49);
Insert into source (id, textval, dateval, numval) values (8, 'eight',date '2018-09-04',64);
Insert into source (id, textval, dateval, numval) values (9, 'nine', date '2018-09-04',81);
Insert into source (id, textval, dateval, numval) values (10,'ten',  date '2018-09-04',100);
commit;

Merge one more time and we have 3 new rows merged, leaving only the 10 distinct rows from all 3 days in the target table.

SQL> MERGE INTO target t
  2       USING source s
  3          ON (s.id = t.id)
  4  WHEN NOT MATCHED
  5  THEN
  6      INSERT     (id,
  7                  textval,
  8                  dateval,
  9                  numval)
 10          VALUES (s.id,
 11                  s.textval,
 12                  s.dateval,
 13                  s.numval);

3 rows merged.

SQL> COMMIT;

Commit complete.

SQL> SELECT * FROM target ORDER BY id;

        ID TEXTVAL    DATEVAL       NUMVAL
---------- ---------- --------- ----------
         1 one        02-SEP-18          1
         2 two        02-SEP-18          4
         3 three      02-SEP-18          9
         4 four       02-SEP-18         16
         5 five       02-SEP-18         25
         6 six        03-SEP-18         36
         7 seven      03-SEP-18         49
         8 eight      04-SEP-18         64
         9 nine       04-SEP-18         81
        10 ten        04-SEP-18        100

I’m not Steven Feuerstein, should I even bother blogging?

If you’re not into PL/SQL, substitute Jonathan Lewis, Tanel Poder, Maria Colgan, or any other technology great you might respect. Doesn’t even need to be Oracle related.

The answer, quite simply, is yes.

First – it’s not a competition. If it were, then yes, I concede to the guy that literally wrote the book on “Oracle PL/SQL Programming.” I don’t blog to show I know more or even as much as he does. I blog to share examples, interesting quirks, and use cases I’ve encountered. For all of the syntax descriptions in the Oracle manuals and all of the code downloads and explanatory videos out there, maybe my contribution will be the one that clicks with some reader googling for a particular problem. The point of my code isn’t that nobody else could have written what I did, but simply that I did write it, and maybe somebody else can benefit from it.

So, if you’ve written something you thought was interesting, by all means share it. Maybe it’s something with nested table collections that has been covered a thousand times before. Post it anyway, maybe your specific example, with your choice of descriptive phrasing will be the one that catches the next reader’s eye.

Second – more diverse voices matter. I’ve been told my style of writing is too casual and sometimes unprofessionally conversational. I’ve also been told my writing helps make new syntax and/or complicated problems more approachable. So even if my content might be exactly what someone needs, if my delivery doesn’t work for them, maybe the message gets lost. I’d feel a little bad about that; but it happens. Some of this filtering is implicit. I only read one language well, English, so I only read blogs and watch videos in English. I’m sure I miss out on some good content because of that limitation. Others might too.

So, maybe you have something to say about collections that you know for sure Feuerstein has covered before. Go ahead, say it anyway. Use your code, your words, your language. Get your message, explained your way out there. If you happen to be a polyglot, even better. Translate your own work.

Third – what you’re posting is probably new to someone. I’ve been coding in PL/SQL since version 7. I’ve written tens of thousands of lines of code. I don’t believe there is anything I’ve written that couldn’t have been written by lots of other people, but nobody is writing everything. You might be surprised by what others will look for. Month after month, the hottest article on my blog (by far) from search engines is one I wrote several years ago on using DBMS_PARALLEL_EXECUTE to run parallel PL/SQL. I posted it because I thought it was a kind of quirky thing to do. I had no idea so many other people would search for exactly that. I’m going to guess I’m not the first or only person to do it; but I bothered to blog about it, so now that article is a resource out there for others. If I picked my blog articles based on what I thought would garner the most “hits” then I never would have written that article. My goal is one. If one person likes a post then I’ve done my job. Two and up is just bonus.

So, if you’ve got some strange use case and think it’s too obscure to be helpful to anyone else – post it anyway. Even if you’re right that it can’t be used in its exact form in other code, your approach or the general idea behind your solution might be the inspiration somebody else needs to get their problem solved.

Fourth – blogging is also helpful to the author. The act of writing an instructional blog about code is different than the act of writing the code itself. It’s also different than writing comments for the code. When you write a blog article about some block of code, you are removing that code from the context in which it was originally written. So, comments you might leave in the code for yourself or future maintainers might be meaningless when read with a stand-alone example. Similarly, intent may be obvious within the context of an application, but in isolation a code block could be opaque. Also, you don’t know your audience, so you need to write in a way that aids the intended reader. This may mean exploring syntax minutiae and documenting features that are or are not used in order to give scope to your code. These extra steps improve your skill as a writer; but you will often find you learn more about your own code as you start fleshing out the areas around it.

So, if you’ve got some code or practice in which you have some expertise, challenge yourself to write about it so a novice can understand it. Or, you can flip it around. If there is something you want to learn about for yourself, start writing about it. You don’t have to publish immediately at each twist, turn, trial, and error; but as you reach milestones – write about them. Your own blog can then be a reference for yourself as well.

Fifth – start the conversation. Whether it’s your first line of code or your millionth, you have something to say about what you’re doing. Write it down. If you’ve got something good – show it off. If you’ve got a question – ask it. Maybe it’s something in between, a little code block, nothing fancy, but explained well. Put the code out there, get some feedback. You might get kudos and thanks, you might get questions, you might get corrections. Any kind of feedback is good. You might get nothing. That’s fine too. Most people don’t comment, “like”, or “vote” on posts. It’s ok, post anyway. When I started I was getting 1 or 2 visits per month, then a few more, then a dozen, then a few dozen, then a hundred, and another hundred, and another hundred, and so on. The audience is growing in silence, but growing. But, let’s say it didn’t. A lack of traffic would itself be feedback. Maybe you need a different topic, a different writing style, or a different audience. Maybe there are no problems with your content, but simply with your blog’s interaction with search engines.

So, everyone has something to share, go ahead and share it. You might feel self-conscious about posting some code you’re not sure about. You can either take a little more time to refine it or go ahead and post it, and ask for improvements. Maybe you’re afraid you’ll post your awesome toolkit and somebody better will come along and tear it to shreds, detailing hole after hole. For one, that’s probably not going to happen. If it does, and it’s bad enough, you can delete the article and repost later after applying corrections, or simply append to the article showing before and after. The work of changing could itself become an article series. I once wrote a 3-part series exploring syntax variations of a single query to solve a question posed by a friend. All but the very last example in the third article are objectively bad ways to do the work; but the point of the series was to explore and explain different approaches to the problem. I intentionally posted answers that I would not recommend for the sake of the exploration. Had I jumped immediately to the simple solution I gave him, it wouldn’t have been as interesting or educational. When my friend asked me how to do it, he already assumed I could. When he read all the different ways I came up with he responded “I didn’t know you could do that!” I both solved the problem and showed him something new, making the whole exercise worthwhile.

In conclusion, if you’re thinking about starting a blog, just start. Some blog every day, some once a week, some once a month. Some are regular, some in bursts. It doesn’t matter, just start. Push out some content, talk to some people. Who knows? The next person to read your blog might be a potential employer that didn’t know you existed until he or she saw your work.

Go, put fingers to keyboard and welcome aboard!

How to use DBMS_LDAP (part 6: Modifying)

Table of Contents

  1. Introduction
  2. Establishing a Connection
  3. Searching Hierarchical Data
  4. Browsing Attributes
  5. Timeouts
  6. Modifying Data (this page)

Modifying data in the LDAP hierarchy

The DBMS_LDAP package allows you to write as well as read data from the directory. However, it does not provide an API for manipulating the schemas within the tree. That is, you can add, delete, or modify records of predefined structures but you may not create new structure types of your own.

In my previous articles in this series I used publicly available servers for my examples. As you might expect, those are read-only. So, in this article my examples will be based on the LDAP structures used for Oracle database name resolution. These could be from Oracle’s own OID product, or some other LDAP server hosting the same schema.

Inserting a new record

To insert a new record, you might think of it as similar to a SQL insert into an index-organized table. That is, unlike a normal heap table, where rows can go anywhere in the table, in an IOT, or an LDAP structure, each row can only exist within a specific location. Thus an LDAP insert consists of defining the unique key of the record, i.e. where the new record will go, as well as the content fields of the new record.

The “where” portion consists of the DN, as we’ve used in searching in previous articles. The content is defined by an array of attributes. Each attribute is itself an array. In the example below, each attribute has only a single value, so as the modification array is constructed, each attribute adds an array of just one element. Since every node within a directory must have an object class, there will always be at least one attribute populated in the modification array to set that value.

In the case of an Oracle Net Service Name, the object class is “orclNetService”. In order to be a useful record it must also include the name to be resolved, as well as the TNS description to which the name resolves. So, in this example I am inserting a new name “test_name” that resolves to “(DESCRIPTION=(ADDRESS=(PROTOCOL=TCP)(HOST=myserver)(PORT=1521))(CONNECT_DATA=(SERVICE_NAME=myservice)))”

DECLARE
    v_session      DBMS_LDAP.session;
    v_result       PLS_INTEGER;
    v_attributes   DBMS_LDAP.mod_array;
    v_strings      DBMS_LDAP.string_collection;
BEGIN
    DBMS_LDAP.use_exception := FALSE;
    v_session := DBMS_LDAP.init(hostname => 'ldap.example.net', portnum => DBMS_LDAP.port);

    v_result :=
        DBMS_LDAP.simple_bind_s(ld       => v_session,
                                dn       => 'uid=myuserid,ou=users,dc=example,dc=net',
                                passwd   => 'secret_password');

    IF v_result != DBMS_LDAP.success
    THEN
        DBMS_OUTPUT.put_line('Bind Result: ' || v_result || ' ' || DBMS_LDAP.err2string(v_result));
    ELSE
        v_attributes := DBMS_LDAP.create_mod_array(num => 3);

        v_strings(1) := 'orclNetService';
        DBMS_LDAP.populate_mod_array(modptr     => v_attributes,
                                     mod_op     => DBMS_LDAP.mod_add,
                                     mod_type   => 'objectclass',
                                     modval     => v_strings);

        v_strings(1) := 'test_name';
        DBMS_LDAP.populate_mod_array(modptr     => v_attributes,
                                     mod_op     => DBMS_LDAP.mod_add,
                                     mod_type   => 'cn',
                                     modval     => v_strings);

        v_strings(1) :=
            '(DESCRIPTION=(ADDRESS=(PROTOCOL=TCP)(HOST=myserver)(PORT=1521))(CONNECT_DATA=(SERVICE_NAME=myservice)))';
        DBMS_LDAP.populate_mod_array(modptr     => v_attributes,
                                     mod_op     => DBMS_LDAP.mod_add,
                                     mod_type   => 'orclNetDescString',
                                     modval     => v_strings);

        v_result :=
            DBMS_LDAP.add_s(ld        => v_session,
                            entrydn   => 'cn=test_name,cn=OracleContext,dc=example,dc=net',
                            modptr    => v_attributes);

        IF v_result = DBMS_LDAP.success
        THEN
            DBMS_OUTPUT.put_line('Add successful');
        ELSE
            DBMS_OUTPUT.put_line('Add Result: ' || v_result || ' ' || DBMS_LDAP.err2string(v_result));
        END IF;

        DBMS_LDAP.free_mod_array(modptr => v_attributes);
        v_result := DBMS_LDAP.unbind_s(v_session);
    END IF;
END;
/
Add successful

If you run the block twice, you should get an error on the second run for trying to create two entries at the same DN.

Add Result: 68 Already exists

Deleting an old record

Deleting a record is fairly simple. To draw a correlation with SQL, you simply specify the unique key to delete that record. DBMS_LDAP does not provide an API for mass deletion. If you wanted to delete multiple records, you could call the delete_s function for each DN; or, if you didn’t know the DNs, you could perform a search as shown in prior chapters and then retrieve the DN from each search result to do the delete.

DECLARE
    v_session          DBMS_LDAP.session;
    v_result           PLS_INTEGER;
BEGIN
    DBMS_LDAP.use_exception := FALSE;
    v_session := DBMS_LDAP.init(hostname => 'ldap.example.net', portnum => DBMS_LDAP.port);

    v_result :=
        DBMS_LDAP.simple_bind_s(ld       => v_session,
                                dn       => 'uid=myuserid,ou=users,dc=example,dc=net',
                                passwd   => 'secret_password');

    IF v_result != DBMS_LDAP.success
    THEN
        DBMS_OUTPUT.put_line('Bind Result: ' || v_result || ' ' || DBMS_LDAP.err2string(v_result));
    ELSE
        v_result := DBMS_LDAP.delete_s(ld => v_session, entrydn => 'cn=test_name,cn=OracleContext,dc=example,dc=net');

        IF v_result = DBMS_LDAP.success
        THEN
            DBMS_OUTPUT.put_line('Delete successful');
        ELSE
            DBMS_OUTPUT.put_line('Delete Result: ' || v_result || ' ' || DBMS_LDAP.err2string(v_result));
        END IF;

        v_result := DBMS_LDAP.unbind_s(v_session);
    END IF;
END;
/
Delete successful

If you run the block twice, you should get an error on the second run for trying to delete a DN that doesn’t exist.

Delete Result: 32 No such object
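Since there is no mass-delete API, the search-then-delete approach mentioned above can be sketched roughly as follows. This is an illustrative, untested sketch: the server, credentials, base DN, and the “cn=test_*” filter are all hypothetical, and the “1.1” attribute code (covered in the searching chapter) is used because only each entry’s DN is needed.

```sql
DECLARE
    v_session          DBMS_LDAP.session;
    v_result           PLS_INTEGER;
    v_search_attrs     DBMS_LDAP.string_collection;
    v_search_results   DBMS_LDAP.MESSAGE;
    v_entry            DBMS_LDAP.MESSAGE;
    v_dn               VARCHAR2(256);
BEGIN
    DBMS_LDAP.use_exception := FALSE;
    v_session := DBMS_LDAP.init(hostname => 'ldap.example.net', portnum => DBMS_LDAP.port);

    v_result :=
        DBMS_LDAP.simple_bind_s(ld       => v_session,
                                dn       => 'uid=myuserid,ou=users,dc=example,dc=net',
                                passwd   => 'secret_password');

    -- Return no attributes; we only need each matching entry's DN.
    v_search_attrs(1) := '1.1';

    v_result :=
        DBMS_LDAP.search_s(ld         => v_session,
                           base       => 'cn=OracleContext,dc=example,dc=net',
                           scope      => DBMS_LDAP.scope_onelevel,
                           filter     => 'cn=test_*',   -- hypothetical filter
                           attrs      => v_search_attrs,
                           attronly   => 1,
                           res        => v_search_results);

    -- Walk the result set, deleting each DN found.
    v_entry := DBMS_LDAP.first_entry(v_session, v_search_results);

    WHILE v_entry IS NOT NULL
    LOOP
        v_dn := DBMS_LDAP.get_dn(v_session, v_entry);
        v_result := DBMS_LDAP.delete_s(ld => v_session, entrydn => v_dn);
        DBMS_OUTPUT.put_line(v_dn || ' delete result: ' || v_result);

        v_entry := DBMS_LDAP.next_entry(v_session, v_entry);
    END LOOP;

    v_result := DBMS_LDAP.unbind_s(v_session);
END;
/
```

Each delete is still its own operation against the server; the loop merely automates the per-DN calls.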

Modifying a record

Modifying a record in LDAP sometimes, but not always, correlates with a SQL update. At the simplest level, you specify the unique key, the DN, just as you would with a delete; but instead of removing the record, you replace the value of an attribute with a new value. In SQL terms this would be analogous to something like this:

update my_table set my_attribute = new_value where id = my_dn;

You can update more than one attribute by populating the modification array with more than one replace command. Here I update our test record with a new TNS description pointing to another server with a different service name for that listener.

DECLARE
    v_session      DBMS_LDAP.session;
    v_result       PLS_INTEGER;
    v_attributes   DBMS_LDAP.mod_array;
    v_strings      DBMS_LDAP.string_collection;
BEGIN
    DBMS_LDAP.use_exception := FALSE;
    v_session := DBMS_LDAP.init(hostname => 'ldap.example.net', portnum => DBMS_LDAP.port);

    v_result := DBMS_LDAP.simple_bind_s(ld => v_session, dn => 'uid=myuserid,ou=users,dc=example,dc=net', passwd => 'secret_password');

    IF v_result != DBMS_LDAP.success
    THEN
        DBMS_OUTPUT.put_line('Bind Result: ' || v_result || ' ' || DBMS_LDAP.err2string(v_result));
    ELSE
        v_attributes := DBMS_LDAP.create_mod_array(num => 1);

        v_strings(1) :=
            '(DESCRIPTION=(ADDRESS=(PROTOCOL=TCP)(HOST=myotherserver)(PORT=1521))(CONNECT_DATA=(SERVICE_NAME=myotherservice)))';
        DBMS_LDAP.populate_mod_array(modptr     => v_attributes,
                                     mod_op     => DBMS_LDAP.mod_replace,
                                     mod_type   => 'orclNetDescString',
                                     modval     => v_strings);

        v_result :=
            DBMS_LDAP.modify_s(ld        => v_session,
                               entrydn   => 'cn=test_name,cn=OracleContext,dc=example,dc=net',
                               modptr    => v_attributes);

        IF v_result = DBMS_LDAP.success
        THEN
            DBMS_OUTPUT.put_line('Modify successful');
        ELSE
            DBMS_OUTPUT.put_line('Modify Result: ' || v_result || ' ' || DBMS_LDAP.err2string(v_result));
        END IF;

        DBMS_LDAP.free_mod_array(modptr => v_attributes);
        v_result := DBMS_LDAP.unbind_s(v_session);
    END IF;
END;
/
Modify successful

While that replace looks like a fairly standard update, correlating well with SQL, you can also modify records to remove attributes. In SQL terms, that would be like removing a column from just one row of a table; not setting it to NULL, but actually removing the column entirely.

A closer analogy might be a table with a collection type for a column, where you delete an element from that collection.

Even that still doesn’t quite compare, though, because if you delete all attributes of a certain type, no placeholder attribute is left behind in an LDAP directory; with a collection type in a table, the column would still exist, holding a NULL or empty collection.

So, the lesson here is that an LDAP modify is a more complex operation than a SQL update.

Next I’ll perform an update equivalent to the one before, except instead of doing a replace, I’ll add a new description attribute and then delete the old one. This might seem like a constraint violation because there will temporarily be two descriptions for a single name, but the modify_s function is essentially atomic, no matter how many attribute modifications are applied. So, the LDAP constraint ensuring a single description for a name isn’t enforced until after all of the modifications are completed.

This example is merely for illustration, not a recommendation. If you have a single value to update, a replace is more efficient than an add plus a delete. Also note this example assumes the directory is populated as it was after the initial insert above, not after the previous modify example.

DECLARE
    v_session      DBMS_LDAP.session;
    v_result       PLS_INTEGER;
    v_attributes   DBMS_LDAP.mod_array;
    v_strings      DBMS_LDAP.string_collection;
BEGIN
    DBMS_LDAP.use_exception := FALSE;
    v_session := DBMS_LDAP.init(hostname => 'ldap.example.net', portnum => DBMS_LDAP.port);

    v_result :=
        DBMS_LDAP.simple_bind_s(ld       => v_session,
                                dn       => 'uid=myuserid,ou=users,dc=example,dc=net',
                                passwd   => 'secret_password');

    IF v_result != DBMS_LDAP.success
    THEN
        DBMS_OUTPUT.put_line('Bind Result: ' || v_result || ' ' || DBMS_LDAP.err2string(v_result));
    ELSE
        v_attributes := DBMS_LDAP.create_mod_array(num => 2);

        v_strings(1) :=
            '(DESCRIPTION=(ADDRESS=(PROTOCOL=TCP)(HOST=myotherserver)(PORT=1521))(CONNECT_DATA=(SERVICE_NAME=myotherservice)))';
        DBMS_LDAP.populate_mod_array(modptr     => v_attributes,
                                     mod_op     => DBMS_LDAP.mod_add,
                                     mod_type   => 'orclNetDescString',
                                     modval     => v_strings);

        v_strings(1) :=
            '(DESCRIPTION=(ADDRESS=(PROTOCOL=TCP)(HOST=myserver)(PORT=1521))(CONNECT_DATA=(SERVICE_NAME=myservice)))';
        DBMS_LDAP.populate_mod_array(modptr     => v_attributes,
                                     mod_op     => DBMS_LDAP.mod_delete,
                                     mod_type   => 'orclNetDescString',
                                     modval     => v_strings);


        v_result :=
            DBMS_LDAP.modify_s(ld        => v_session,
                               entrydn   => 'cn=test_name,cn=OracleContext,dc=example,dc=net',
                               modptr    => v_attributes);

        IF v_result = DBMS_LDAP.success
        THEN
            DBMS_OUTPUT.put_line('Modify successful');
        ELSE
            DBMS_OUTPUT.put_line('Modify Result: ' || v_result || ' ' || DBMS_LDAP.err2string(v_result));
        END IF;

        DBMS_LDAP.free_mod_array(modptr => v_attributes);
        v_result := DBMS_LDAP.unbind_s(v_session);
    END IF;
END;
/

If the block is run twice you’ll get an error trying to add the same description value twice.

Modify Result: 20 Type or value exists

Rearranging the delete to occur first won’t help, because the delete is for the old description value. So, if you put the delete first and run it twice, you’ll get a different error, but the block will still fail.

Modify Result: 16 No such attribute

In addition to the inefficiency of using two modifications instead of one, these errors help illustrate why the replace modification is the better choice when it applies.

Of course, if a record has a multi-valued attribute, it may be appropriate to delete one value and add another instead of doing a replace; for example, updating a person’s record by deleting one or more old email addresses and adding new ones. A modification might also involve no replace at all, such as adding children to a parent or changing the employees of a department; it may consist only of adds or only of deletes. During a business reorganization, a department might change its name using a replace, change managers, also using a replace, and then add several employees and remove others. All of that could be executed with a single modify_s call against the department’s record.
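The reorganization scenario just described might be sketched as follows. This is an untested illustration: the DNs and the “description”, “manager”, and “member” attribute names are hypothetical placeholders for whatever schema your directory actually uses.

```sql
DECLARE
    v_session      DBMS_LDAP.session;
    v_result       PLS_INTEGER;
    v_attributes   DBMS_LDAP.mod_array;
    v_strings      DBMS_LDAP.string_collection;
BEGIN
    DBMS_LDAP.use_exception := FALSE;
    v_session := DBMS_LDAP.init(hostname => 'ldap.example.net', portnum => DBMS_LDAP.port);

    v_result :=
        DBMS_LDAP.simple_bind_s(ld       => v_session,
                                dn       => 'uid=myuserid,ou=users,dc=example,dc=net',
                                passwd   => 'secret_password');

    v_attributes := DBMS_LDAP.create_mod_array(num => 4);

    -- Rename the department (replace).
    v_strings.DELETE;
    v_strings(1) := 'Research and Development';
    DBMS_LDAP.populate_mod_array(modptr     => v_attributes,
                                 mod_op     => DBMS_LDAP.mod_replace,
                                 mod_type   => 'description',
                                 modval     => v_strings);

    -- Change the manager (replace).
    v_strings.DELETE;
    v_strings(1) := 'uid=newmanager,ou=users,dc=example,dc=net';
    DBMS_LDAP.populate_mod_array(modptr     => v_attributes,
                                 mod_op     => DBMS_LDAP.mod_replace,
                                 mod_type   => 'manager',
                                 modval     => v_strings);

    -- Add two employees: one add operation carrying two values.
    v_strings.DELETE;
    v_strings(1) := 'uid=alice,ou=users,dc=example,dc=net';
    v_strings(2) := 'uid=bob,ou=users,dc=example,dc=net';
    DBMS_LDAP.populate_mod_array(modptr     => v_attributes,
                                 mod_op     => DBMS_LDAP.mod_add,
                                 mod_type   => 'member',
                                 modval     => v_strings);

    -- Remove one employee: delete a single value of the multi-valued attribute.
    v_strings.DELETE;
    v_strings(1) := 'uid=carol,ou=users,dc=example,dc=net';
    DBMS_LDAP.populate_mod_array(modptr     => v_attributes,
                                 mod_op     => DBMS_LDAP.mod_delete,
                                 mod_type   => 'member',
                                 modval     => v_strings);

    -- All four operations are applied in one atomic modify.
    v_result :=
        DBMS_LDAP.modify_s(ld        => v_session,
                           entrydn   => 'cn=rnd,ou=departments,dc=example,dc=net',
                           modptr    => v_attributes);

    DBMS_OUTPUT.put_line('Modify result: ' || v_result || ' ' || DBMS_LDAP.err2string(v_result));

    DBMS_LDAP.free_mod_array(modptr => v_attributes);
    v_result := DBMS_LDAP.unbind_s(v_session);
END;
/
```

Note the collection is cleared with DELETE between operations so a leftover second value from the add doesn’t leak into the delete.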

This concludes my series on DBMS_LDAP. I hope you found it enlightening and the examples useful starting points for your own applications. I know I learned more in the process of writing it.

Questions on this chapter or any of the previous chapters are always welcome.

Strange (erroneous) query from Oracle Audit Vault

Today I was asked about a query from an Audit Vault job that was failing. Unfortunately, the job didn’t report any useful error messages to explain the failure. After some digging the problem was narrowed down to a query pulling user account information. In particular, a calculated column was proving both erroneous and confusing, thus prompting the question: “What is this doing?”

The calculated column (with a little formatting) was:

TO_CHAR(lock_date
        - (TO_NUMBER(SUBSTR(SYSTIMESTAMP,
                            (INSTR(SYSTIMESTAMP,':',-1,1)) - 3,
                            3)
                    )
          ) / 24,
        'MM/dd/yyyy HH24:mi:ss')

So, the first question was – what is this trying to do and why does it fail? The first thing I noticed was that the expression was attempting to parse SYSTIMESTAMP as if it were text. That is possible, because Oracle will implicitly convert it to a VARCHAR2 type, but we don’t know the format the job was using when it failed. I ran the query in a test database with my session’s NLS_TIMESTAMP_TZ_FORMAT set to ‘yyyy-mm-dd hh24:mi:ssxff tzr’, producing results like this:

SQL> alter session set nls_timestamp_tz_format='yyyy-mm-dd hh24:mi:ssxff tzr';

Session altered.

SQL> alter session set nls_date_format='yyyy-mm-dd hh24:mi:ss';

Session altered.

SQL> select lock_date,TO_CHAR(lock_date
  2          - (TO_NUMBER(SUBSTR(SYSTIMESTAMP,
  3                              (INSTR(SYSTIMESTAMP,':',-1,1)) - 3,
  4                              3)
  5                      )
  6            ) / 24,
  7          'MM/dd/yyyy HH24:mi:ss') ae$date from sys.cdb_users;

LOCK_DATE           AE$DATE
------------------- -------------------
2018-02-07 20:00:11 02/08/2018 00:00:11
2018-08-21 23:42:02 08/22/2018 03:42:02
2018-02-07 19:21:02 02/07/2018 23:21:02
2018-02-07 20:15:41 02/08/2018 00:15:41
2018-02-07 20:15:45 02/08/2018 00:15:45
2018-02-07 20:00:25 02/08/2018 00:00:25
2018-02-07 20:16:04 02/08/2018 00:16:04
2018-02-07 21:05:41 02/08/2018 01:05:41
2018-08-25 14:25:05 08/25/2018 18:25:05

Thus it seems the calculation is shifting the LOCK_DATE by 4 hours; but why, and why does it work in my test db but not for the Audit Vault in the real one? The answer lies in the text parsing.

In the problem db a logon trigger set NLS_TIMESTAMP_TZ_FORMAT to a custom format. The calculation assumes that either the ‘TZR’ format model (with a numeric offset) or ‘TZH:TZM’ is the last part of the converted text. In my test scenario that was the case; in the problem db it was not, because that database used a format supporting the assumptions of other processes. This is why implicit conversions should be avoided: you can’t always count on implicit formats being what you expect, and there are security implications when implicit transformations are used. A better solution would have been to replace the implicit conversions with explicit ones. For example, rewriting the query as shown below works regardless of the session timestamp format mask.

SQL> select lock_date,TO_CHAR(lock_date
  2          - (TO_NUMBER(SUBSTR(TO_CHAR(SYSTIMESTAMP,'TZR'),
  3                              (INSTR(TO_CHAR(SYSTIMESTAMP,'TZR'),':',-1,1)) - 3,
  4                              3)
  5                      )
  6            ) / 24,
  7          'MM/dd/yyyy HH24:mi:ss') ae$date from sys.cdb_users;

LOCK_DATE           AE$DATE
------------------- -------------------
2018-02-07 20:00:11 02/08/2018 00:00:11
2018-08-21 23:42:02 08/22/2018 03:42:02
2018-02-07 19:21:02 02/07/2018 23:21:02
2018-02-07 20:15:41 02/08/2018 00:15:41
2018-02-07 20:15:45 02/08/2018 00:15:45
2018-02-07 20:00:25 02/08/2018 00:00:25
2018-02-07 20:16:04 02/08/2018 00:16:04
2018-02-07 21:05:41 02/08/2018 01:05:41
2018-08-25 14:25:05 08/25/2018 18:25:05

That solves the immediate problem of why the query fails and what needs to be done to fix it; but it still doesn’t explain the how and the intent. The expression is extracting the time zone hour offset from SYSTIMESTAMP, but how? In my test case the offset was -04:00 because I was running in the US/Eastern time zone during Daylight Saving Time.

INSTR finds the colon separating the hours from the minutes, then adjusts 3 characters to the left to accommodate hour offsets of -23 to +23 (real time zones only range from -12 to +14, though.) SUBSTR, starting from that index, parses the converted SYSTIMESTAMP text again and returns the hour portion of the offset. Finally, TO_NUMBER changes that substring into a number – here too it would be better to use an explicit format, such as TO_NUMBER(…,'S00'), to ensure a valid conversion.
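For instance, with a -04:00 offset the format-controlled conversion would behave like this; the ‘S00’ mask demands a leading sign and exactly two digits, so unexpected text raises an error instead of converting to the wrong number.

```sql
SQL> select to_number('-04', 'S00') tz_hours from dual;

  TZ_HOURS
----------
        -4
```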

Thus a “final” form with all conversions made explicit and reliable regardless of session settings might look like this:

TO_CHAR(lock_date
        - (TO_NUMBER(SUBSTR(TO_CHAR(SYSTIMESTAMP,'TZH:TZM'),
                            (INSTR(TO_CHAR(SYSTIMESTAMP,'TZH:TZM'),':',-1,1)) - 3,
                            3),
                     'S00'
                    )
          ) / 24,
        'MM/dd/yyyy HH24:mi:ss')

While this syntax would function reliably, it would still fail a code inspection for violating my rule of “Does this code jump through too many hoops?”
That was evident before I ever ran the code myself. The initial question was “What is this doing?” With four data type conversions, multiple substring parses, index adjustment, and date math, the result is certainly not immediately obvious; which prompts the next question: “Is there a better way?”

Yes, in multiple ways. First, the syntax above can be simplified by using only the TZH format, removing the need to find that field within the string. Even better, when timestamps were introduced in Oracle 9i, they came with the EXTRACT function to pull pieces from them. So a much simpler, more efficient, and self-documenting syntax would be: extract(timezone_hour from systimestamp)

SQL> select extract(timezone_hour from systimestamp) from dual;

EXTRACT(TIMEZONE_HOURFROMSYSTIMESTAMP)
--------------------------------------
                                    -4

If used in the context of the original query…

SQL> SELECT lock_date, TO_CHAR(lock_date - EXTRACT(TIMEZONE_HOUR FROM SYSTIMESTAMP)/24,
  2                            'MM/dd/yyyy HH24:mi:ss') ae$date
  3    FROM sys.cdb_users;

LOCK_DATE           AE$DATE
------------------- -------------------
2018-02-07 20:00:11 02/08/2018 00:00:11
2018-08-21 23:42:02 08/22/2018 03:42:02
2018-02-07 19:21:02 02/07/2018 23:21:02
2018-02-07 20:15:41 02/08/2018 00:15:41
2018-02-07 20:15:45 02/08/2018 00:15:45
2018-02-07 20:00:25 02/08/2018 00:00:25
2018-02-07 20:16:04 02/08/2018 00:16:04
2018-02-07 21:05:41 02/08/2018 01:05:41
2018-08-25 14:25:05 08/25/2018 18:25:05

That’s much better. Using the offset, the LOCK_DATE values are now adjusted to UTC/GMT time. This is helpful within the Audit Vault so all of the data is stored in a consistent form for easier comparison, joins, filtering, etc.

But…

The query is still not entirely correct!
Note all of the lock_date values from February. Their adjusted AE$DATE value is NOT equivalent to UTC/GMT time, but the August values are. The reason is that the conversion logic does not take Daylight Saving Time into account. During February my system would be on US/Eastern Standard Time, with an offset of -5 hours. If I wait a few more months until November, when I’m back on Standard Time, the February values will convert correctly but then the August values will be wrong.

Unfortunately, there isn’t a good solution for this problem. It is possible, with a Java stored procedure or an external procedure call, to retrieve the system time zone setting. Ideally that returns a time zone name or unique abbreviation; if so, the offset could be derived from the lock date value itself instead of using a fixed offset from the current system time.

That still isn’t completely reliable though!
The transition from Daylight Saving Time to Standard Time has two instances of the 1am-2am hour (in the US, other countries will have their own variations of the same issue) and nothing in the DATE value of LOCK_DATE will indicate whether a value is from the first or second iteration through the transition period.

But, even if you somehow solve that problem, it’s STILL not completely correct!
All of the math adjusting dates to UTC/GMT assumes an integer hour offset; but some time zones around the world are not on whole-hour offsets. Asia/Kathmandu is +05:45, Canada/Newfoundland is -02:30. So the premise of the adjustment math is incorrect. This problem can be fixed, though, if you know the full offset: simply include the minutes in the calculation as well as the hours.

lock_date - EXTRACT(TIMEZONE_HOUR FROM SYSTIMESTAMP)/24  - EXTRACT(TIMEZONE_MINUTE FROM SYSTIMESTAMP)/1440
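Putting the hour and minute terms together in the context of the earlier query would look like this (output not shown; the February rows would still carry the Daylight Saving Time problem described above):

```sql
SELECT lock_date,
       TO_CHAR(lock_date
               - EXTRACT(TIMEZONE_HOUR FROM SYSTIMESTAMP) / 24
               - EXTRACT(TIMEZONE_MINUTE FROM SYSTIMESTAMP) / 1440,
               'MM/dd/yyyy HH24:mi:ss') ae$date
  FROM sys.cdb_users;
```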

In summary, the original query has a multitude of failings:

  1. Unreliable syntax that simply fails if specific system assumptions are not met
  2. Overly complex computation when native functionality exists
  3. Unreliable date math not supporting Daylight Saving Time
  4. Incorrect time adjustment not taking into account non-hourly offsets

The point of this article wasn’t to pick on Audit Vault, it’s a fine product, and these issues were relatively easy to work around within the real system. No, the real message here is I see these same types of problems all the time, especially when date/time values are involved. This one query highlighted several from a family of logic and syntax errors that developers fall prey to with alarming frequency. Hopefully the breakdown and various rewrites and comments will help others avoid some of these pitfalls and correct the ones they do encounter.

Questions and comments, as always, are welcome!

How to use DBMS_LDAP (part 5: Timeouts)

Table of Contents

  1. Introduction
  2. Establishing a Connection
  3. Searching Hierarchical Data
  4. Browsing Attributes
  5. Timeouts (this page)
  6. Modifying Data

Timeouts

LDAP servers are designed to be fast; but it is still possible for an action to take a long time. A deep sub-tree search with open search criteria could take seconds or even minutes to return data if the directory is large and no other criteria limit the results.

Thus clients can put timeout limits on their requests, aborting them if they take too long. The public servers I’ve been using are relatively small in scope and/or enforce result limits, so they tend to be fast; but we can create an intentionally expensive search and time it. I’ll use the approximate-equality operator to do an expensive search of a deep and wide subtree.

DECLARE
    v_result           PLS_INTEGER;
    v_session          DBMS_LDAP.session;
    v_search_attrs     DBMS_LDAP.string_collection;
    v_search_results   DBMS_LDAP.MESSAGE;
    v_start            TIMESTAMP WITH TIME ZONE;
    v_end              TIMESTAMP WITH TIME ZONE;
BEGIN
    DBMS_LDAP.use_exception := TRUE;

    v_session := DBMS_LDAP.init(hostname => 'ldap.andrew.cmu.edu', portnum => DBMS_LDAP.port);

    v_search_attrs(1) := '*';

    v_start := SYSTIMESTAMP;
    v_result :=
        DBMS_LDAP.search_s(ld         => v_session,
                           base       => 'dc=cmu,dc=edu',
                           scope      => DBMS_LDAP.scope_subtree,
                           filter     => 'maillocaladdress~=ayushb',
                           attrs      => v_search_attrs,
                           attronly   => 0,
                           res        => v_search_results);
    v_end := SYSTIMESTAMP;

    DBMS_OUTPUT.put_line(v_end - v_start);
    v_result := DBMS_LDAP.unbind_s(v_session);
END;
/

+000000000 00:00:20.977627000

So, just over 20 seconds for that search. If that is too long, a timeout can be set using the DBMS_LDAP.TIMEVAL type. Then, instead of SEARCH_S, we use SEARCH_ST – “T” for timed. If the timeout is exceeded, an exception is raised. To capture the time more cleanly, I’ll disable exceptions and examine the result code of the SEARCH_ST function instead. We’ll put a 2 second limit on the search and check the timing when we’re done.

DECLARE
    v_result           PLS_INTEGER;
    v_session          DBMS_LDAP.session;
    v_search_attrs     DBMS_LDAP.string_collection;
    v_search_results   DBMS_LDAP.MESSAGE;
    v_start            TIMESTAMP WITH TIME ZONE;
    v_end              TIMESTAMP WITH TIME ZONE;
    v_timeout          DBMS_LDAP.timeval;
BEGIN
    DBMS_LDAP.use_exception := FALSE;

    v_session := DBMS_LDAP.init(hostname => 'ldap.andrew.cmu.edu', portnum => DBMS_LDAP.port);

    v_search_attrs(1) := '*';

    v_timeout.seconds := 2;

    v_start := SYSTIMESTAMP;
    v_result :=
        DBMS_LDAP.search_st(ld         => v_session,
                            base       => 'dc=cmu,dc=edu',
                            scope      => DBMS_LDAP.scope_subtree,
                            filter     => 'maillocaladdress~=ayushb',
                            attrs      => v_search_attrs,
                            attronly   => 0,
                            tv         => v_timeout,
                            res        => v_search_results);
    v_end := SYSTIMESTAMP;

    DBMS_OUTPUT.put_line('Error code: ' || v_result || ' ' || DBMS_LDAP.err2string(v_result));
    DBMS_OUTPUT.put_line(v_end - v_start);
    v_result := DBMS_LDAP.unbind_s(v_session);
END;
/

Error code: 85 Timed out
+000000000 00:00:02.000382000

As expected, the search quit after 2 seconds and exited.

The TIMEVAL type is actually a record of two fields: seconds and microseconds.

    TYPE TIMEVAL IS RECORD
      ( seconds  PLS_INTEGER,
        useconds PLS_INTEGER
      );

Either or both fields may be populated. If both are populated, the timeout is the sum of the two values; as in the following example, where seconds is populated with 1 and useconds with 1.5 million, for a total timeout of 2.5 seconds.

DECLARE
    v_result           PLS_INTEGER;
    v_session          DBMS_LDAP.session;
    v_search_attrs     DBMS_LDAP.string_collection;
    v_search_results   DBMS_LDAP.MESSAGE;
    v_start            TIMESTAMP WITH TIME ZONE;
    v_end              TIMESTAMP WITH TIME ZONE;
    v_timeout          DBMS_LDAP.timeval;
BEGIN
    DBMS_LDAP.use_exception := FALSE;

    v_session := DBMS_LDAP.init(hostname => 'ldap.andrew.cmu.edu', portnum => DBMS_LDAP.port);

    v_search_attrs(1) := '*';

    v_timeout.seconds := 1;
    v_timeout.useconds := 1500000;

    v_start := SYSTIMESTAMP;
    v_result :=
        DBMS_LDAP.search_st(ld         => v_session,
                            base       => 'dc=cmu,dc=edu',
                            scope      => DBMS_LDAP.scope_subtree,
                            filter     => 'maillocaladdress~=ayushb',
                            attrs      => v_search_attrs,
                            attronly   => 0,
                            tv         => v_timeout,
                            res        => v_search_results);
    v_end := SYSTIMESTAMP;

    DBMS_OUTPUT.put_line('Error code: ' || v_result || ' ' || DBMS_LDAP.err2string(v_result));
    DBMS_OUTPUT.put_line(v_end - v_start);
    v_result := DBMS_LDAP.unbind_s(v_session);
END;
/

Error code: 85 Timed out
+000000000 00:00:02.500799000

While the datatype of each field is PLS_INTEGER, which accepts negative values, if either value is negative you will receive error code 1027 (Unknown error).
Similarly, if both values are left unpopulated (both NULL), a 1027 error is returned.

If exceptions are enabled then the following exception will be raised.

ORA-31207: DBMS_LDAP: PL/SQL - Invalid LDAP search time value.

If both values are set to 0, that is treated as no time limit; the search runs to completion or until the server’s own limit.

How to use DBMS_LDAP (part 4: Attributes)

Table of Contents

  1. Introduction
  2. Establishing a Connection
  3. Searching Hierarchical Data
  4. Browsing Attributes (this page)
  5. Timeouts
  6. Modifying Data

Searching Hierarchical Data with Attributes

Returning attributes with search results

The examples in the prior chapter used the “1.1” code to return no attributes.
In this example we’ll return two of the attributes associated with an entry.
Instead of building a collection of one element containing “1.1”, the collection will contain the name of each attribute we want to return.
Also note the attronly => 0 parameter. If attronly is 1, only the names of attributes are returned; with 0, the values associated with each attribute are returned as well.

DECLARE
    v_result               PLS_INTEGER;
    v_session              DBMS_LDAP.session;
    v_search_attrs         DBMS_LDAP.string_collection;
    v_search_results       DBMS_LDAP.MESSAGE;
    v_entry                DBMS_LDAP.MESSAGE;
    v_distinguished_name   VARCHAR2(256);
    v_values               DBMS_LDAP.string_collection;
BEGIN
    DBMS_LDAP.use_exception := TRUE;

    v_session := DBMS_LDAP.init(hostname => 'db.debian.org', portnum => DBMS_LDAP.port);

    v_search_attrs(1) := 'description';
    v_search_attrs(2) := 'admin';

    v_result :=
        DBMS_LDAP.search_s(ld         => v_session,
                           base       => 'host=beach,ou=hosts,dc=debian,dc=org',
                           scope      => DBMS_LDAP.scope_base,
                           filter     => 'objectclass=*',
                           attrs      => v_search_attrs,
                           attronly   => 0,
                           res        => v_search_results);

    v_entry := DBMS_LDAP.first_entry(v_session, v_search_results);

    v_distinguished_name := DBMS_LDAP.get_dn(v_session, v_entry);
    DBMS_OUTPUT.put_line('DN: ' || v_distinguished_name);

    FOR i IN 1 .. v_search_attrs.COUNT
    LOOP
        v_values := DBMS_LDAP.get_values(v_session, v_entry, v_search_attrs(i));

        IF v_values.COUNT > 0
        THEN
            FOR j IN v_values.FIRST .. v_values.LAST
            LOOP
                DBMS_OUTPUT.put_line(v_search_attrs(i) || ' : ' || v_values(j));
            END LOOP;
        ELSE
            DBMS_OUTPUT.put_line(v_search_attrs(i) || ' not found');
        END IF;
    END LOOP;

    v_result := DBMS_LDAP.msgfree(v_entry);
    v_result := DBMS_LDAP.unbind_s(v_session);
END;
/
DN: host=beach,ou=hosts,dc=debian,dc=org
description : bugs.debian.org web frontend
admin : debian-admin@lists.debian.org

Sometimes you might not know all of the attributes for an entry. Perhaps you’re doing a sub-tree search across different entry types and the attributes might change. In these scenarios you may need to employ a wildcard search and return all of the attributes for a given entry.

To return all of them, simply define an attribute array of one element consisting of the value ‘*’. It is legal syntax to include other attribute names in the array, but there is no functional value in doing so.

In the snippet below we’ll pull all of the attributes for the “host=beach” entry. Because we don’t know their names, in order to retrieve them we use a new data type: “DBMS_LDAP.BER_ELEMENT”. “BER” stands for Basic Encoding Rule. The datatype is a RAW, holding a pointer to the encoded attributes. With attronly set to 1, only the names of the attributes will be returned.

DECLARE
    v_result               PLS_INTEGER;
    v_session              DBMS_LDAP.session;
    v_search_attrs         DBMS_LDAP.string_collection;
    v_search_results       DBMS_LDAP.MESSAGE;
    v_entry                DBMS_LDAP.MESSAGE;
    v_distinguished_name   VARCHAR2(256);
    v_attribute_name       VARCHAR2(256); 
    v_ber                  DBMS_LDAP.ber_element;
BEGIN
    DBMS_LDAP.use_exception := TRUE;

    v_session := DBMS_LDAP.init(hostname => 'db.debian.org', portnum => DBMS_LDAP.port);

    v_search_attrs(1) := '*';    

    v_result :=
        DBMS_LDAP.search_s(ld         => v_session,
                           base       => 'host=beach,ou=hosts,dc=debian,dc=org',
                           scope      => DBMS_LDAP.scope_base,
                           filter     => 'objectclass=*',
                           attrs      => v_search_attrs,
                           attronly   => 1,
                           res        => v_search_results);

    v_entry := DBMS_LDAP.first_entry(v_session, v_search_results);

    v_distinguished_name := DBMS_LDAP.get_dn(v_session, v_entry);
    DBMS_OUTPUT.put_line('DN: ' || v_distinguished_name);

    v_attribute_name := DBMS_LDAP.first_attribute(v_session, v_entry, v_ber);

    WHILE v_attribute_name IS NOT NULL
    LOOP
        DBMS_OUTPUT.put_line(v_attribute_name);

        v_attribute_name := DBMS_LDAP.next_attribute(v_session, v_entry, v_ber);
    END LOOP;

    DBMS_LDAP.ber_free(v_ber, 1);

    v_result := DBMS_LDAP.unbind_s(v_session);
END;
/
DN: host=beach,ou=hosts,dc=debian,dc=org
objectClass
distribution
access
admin
architecture
host
purpose
allowedGroups
description
hostname
sshRSAHostKey
mXRecord
rebootPolicy
physicalHost
sponsor
ipHostNumber

Once you have the names you could then go back and pull values for specific attributes, as shown in the first example above. Or, we can return the attributes from a wildcard search along with their values.

DECLARE
    v_result               PLS_INTEGER;
    v_session              DBMS_LDAP.session;
    v_search_attrs         DBMS_LDAP.string_collection;
    v_search_results       DBMS_LDAP.MESSAGE;
    v_entry                DBMS_LDAP.MESSAGE;
    v_distinguished_name   VARCHAR2(256);
    v_attribute_name       VARCHAR2(256);
    v_attribute_values     DBMS_LDAP.string_collection;
    v_ber                  DBMS_LDAP.ber_element; -- Basic Encoding Rule
BEGIN
    DBMS_LDAP.use_exception := TRUE;

    v_session := DBMS_LDAP.init(hostname => 'db.debian.org', portnum => DBMS_LDAP.port);

    v_search_attrs(1) := '*';

    v_result :=
        DBMS_LDAP.search_s(ld         => v_session,
                           base       => 'host=beach,ou=hosts,dc=debian,dc=org',
                           scope      => DBMS_LDAP.scope_base,
                           filter     => 'objectclass=*',
                           attrs      => v_search_attrs,
                           attronly   => 0,
                           res        => v_search_results);

    v_entry := DBMS_LDAP.first_entry(v_session, v_search_results);

    v_distinguished_name := DBMS_LDAP.get_dn(v_session, v_entry);
    DBMS_OUTPUT.put_line('DN: ' || v_distinguished_name);

    v_attribute_name := DBMS_LDAP.first_attribute(v_session, v_entry, v_ber);

    WHILE v_attribute_name IS NOT NULL
    LOOP
        v_attribute_values := DBMS_LDAP.get_values(v_session, v_entry, v_attribute_name);

        CASE v_attribute_values.COUNT
            WHEN 0
            THEN
                NULL;
            WHEN 1
            THEN
                DBMS_OUTPUT.put_line(v_attribute_name || ' : ' || v_attribute_values(0));
            ELSE
                DBMS_OUTPUT.put_line(v_attribute_name || ' :');

                FOR i IN v_attribute_values.FIRST .. v_attribute_values.LAST
                LOOP
                    DBMS_OUTPUT.put_line('...' || v_attribute_values(i));
                END LOOP;
        END CASE;

        v_attribute_name := DBMS_LDAP.next_attribute(v_session, v_entry, v_ber);
    END LOOP;

    DBMS_LDAP.ber_free(v_ber, 1);

    v_result := DBMS_LDAP.unbind_s(v_session);
END;
/
DN: host=beach,ou=hosts,dc=debian,dc=org
objectClass :
...top
...debianServer
distribution : Debian GNU/Linux
access : restricted
admin : debian-admin@lists.debian.org
architecture : amd64
host : beach
purpose : [[*bugs.debian.org|bugs web frontend]]
allowedGroups :
...debbugs
...Debian
description : bugs.debian.org web frontend
hostname : beach.debian.org

Note, some of the attributes can have more than one value; “objectClass” is a common one to have multiple values. When retrieving values for an entry, it is important to remember there isn’t necessarily a 1:1 relationship between attributes and values.

Also note, while you can use a wildcard for the entirety of an attribute set, the “attrs” parameter collection does not support wildcards within an attribute name.
For example, there is no syntax to request all attributes starting with the letter “a”. Trying something like “v_search_attrs(1) := ‘a*’;” won’t produce an error; it will simply fail to find an attribute with the exact name “a*”.

If you wanted to simulate such functionality you would need to return the list of all attribute names, filter that list down to just the subset of attributes you were interested in, and build a collection of those names. Then search again using just those names, as in the first example, but with attronly => 0 to return their values.
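The filtering step of that workaround is plain client-side string matching, independent of DBMS_LDAP itself. A minimal sketch in Python (the helper name and sample list are illustrative only, not part of any LDAP API):

```python
# Client-side "attribute name wildcard": pass 1 returns all attribute
# names (attronly => 1); we keep only the names matching our pattern,
# then pass 2 would search again with attronly => 0 using this subset.
from fnmatch import fnmatch

def matching_attrs(all_names, pattern):
    """Case-insensitive wildcard match over attribute names."""
    return [n for n in all_names if fnmatch(n.lower(), pattern.lower())]

# Names as returned by the attronly search against host=beach
names = ["objectClass", "access", "admin", "architecture",
         "allowedGroups", "description", "hostname"]

print(matching_attrs(names, "a*"))
```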

In the next chapter we’ll explore setting timeouts on searches.

How to use DBMS_LDAP (part 3: Searching)

Table of Contents

  1. Introduction
  2. Establishing a Connection
  3. Searching Hierarchical Data (this page)
  4. Browsing Attributes
  5. Timeouts
  6. Modifying Data

Searching Hierarchical Data

Once you’ve established a connection, the next thing you’ll probably want to do is start searching for records: to pull data back to a client, to modify one or more records, or simply to check for the existence of some records, such as authorization checks based on a user’s membership in one or more groups, or searches for users with particular attributes.

One of the important things to remember about LDAP searches is that, unlike the flat nature of a SQL WHERE clause against a table, the data is hierarchical. That is, you must specify a depth scope when searching the data.

So, using the sample hierarchy from the introduction:

DC=NET
    DC=Example
        OU=People
            OU=Managers
                uid=1234
                    CN=Jane Doe
                    hiredate=2005-04-03
                    email=jane.doe@example.net
                    department=Acquisitions
                    objectClass=person
                uid=3456
                    CN=John Doe
                    hiredate=2006-05-04
                    email=john.doe@example.net
                    department=Sales
                    objectClass=person
            OU=Consultants
                uid=1987
                    CN=Mickey Mouse
                    hiredate=2005-12-11
                    email=mickey.mouse@example.net
                    department=Acquisitions
                    objectClass=person
                uid=6543
                    CN=Donald Duck
                    hiredate=2008-07-06
                    email=donald.duck@example.net
                    department=Sales
                    objectClass=person
  • We could search for the existence of a particular record directly.

    uid=1987,ou=consultants,ou=people,dc=example,dc=net
  • We could search one level, perhaps getting a list of all consultants.

    ou=consultants,ou=people,dc=example,dc=net
  • We can search an entire subtree, for example, all people regardless of group.

    ou=people,dc=example,dc=net

Within DBMS_LDAP the three search scopes are defined by constants:

SCOPE_BASE     CONSTANT NUMBER := 0;
SCOPE_ONELEVEL CONSTANT NUMBER := 1;
SCOPE_SUBTREE  CONSTANT NUMBER := 2;

While some LDAP servers and clients support a fourth type, subordinate subtree searching, DBMS_LDAP does not provide such a mechanism. You could mimic the functionality (but not the efficiency) with a subtree search combined with an appropriate filter to eliminate the top-level group from the results.
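The three scopes amount to a depth test on an entry’s DN relative to the base DN. A hypothetical Python model (not DBMS_LDAP code, and ignoring escaped commas in DNs) that classifies which entries each scope would match:

```python
# Model of BASE / ONELEVEL / SUBTREE scope matching: an entry matches
# when its DN ends with the base DN and sits at the required depth.
SCOPE_BASE, SCOPE_ONELEVEL, SCOPE_SUBTREE = 0, 1, 2

def in_scope(dn, base, scope):
    dn, base = dn.lower(), base.lower()
    if dn == base:
        depth = 0
    elif dn.endswith("," + base):
        # depth = number of RDNs between the entry and the base
        depth = dn[: -len(base) - 1].count(",") + 1
    else:
        return False
    if scope == SCOPE_BASE:
        return depth == 0
    if scope == SCOPE_ONELEVEL:
        return depth == 1
    return True  # SCOPE_SUBTREE: any depth, including the base itself

dns = ["ou=people,dc=example,dc=net",
       "ou=managers,ou=people,dc=example,dc=net",
       "uid=1234,ou=managers,ou=people,dc=example,dc=net"]
base = "ou=people,dc=example,dc=net"
print([in_scope(d, base, SCOPE_ONELEVEL) for d in dns])  # [False, True, False]
```

Note that a subtree search includes the base entry itself, while a one-level search does not.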

In addition to the search scope, you must also specify what will be returned from a search that matches the criteria. There are three levels of detail in the results:

  • Entry Distinguished Names only
  • Entry Distinguished Names, along with names of Attributes for each entry
  • Entry Distinguished Names, along with the names and values of Attributes for each entry

Attributes are specified by a string collection regardless of the level of detail. You can provide a string collection populated with the names of the attributes to be returned for each entry. Alternately, define the collection to have only one entry with a wildcard “*” to get all attributes, or provide a collection of one entry consisting of the magic string “1.1”, meaning return no attributes at all.
Within this article, all searches will use “1.1”, hence excluding attributes from the results. Including the attributes and browsing their names and values will be explored in the next chapter of this series.

Searching with BASE scope, no attributes

The simplest search is just a BASE scope, meaning you provide the entire DN as a location. If you include no attributes, then this type of search is mostly just an existence check: if you can find the record it exists; if you can’t, it doesn’t.
In the example below we’ll search the public Debian LDAP server for a specific host name. To use a BASE scope, we use the constant DBMS_LDAP.SCOPE_BASE. We’ll return no attributes, so our search list is “1.1”. You must specify a filter on the search; in this case we’ll leave it open to any record that matches the specified DN. Every record has an objectclass attribute, so our filter is a wildcard on that attribute: “objectclass=*”. The attronly parameter doesn’t apply since we aren’t returning any attributes, so we just leave it NULL.

In these examples, if we find a match we’ll display the DN as a single string and also break the DN into its constituent levels within the hierarchy.

If we don’t find a matching entry then the search_s function will raise an exception.

DECLARE
    v_result               PLS_INTEGER;
    v_session              DBMS_LDAP.session;
    v_search_attrs         DBMS_LDAP.string_collection;
    v_search_results       DBMS_LDAP.MESSAGE;
    v_entry                DBMS_LDAP.MESSAGE;
    v_distinguished_name   VARCHAR2(256);
    v_dn_pieces            DBMS_LDAP.string_collection;
BEGIN
    DBMS_LDAP.use_exception := TRUE;

    v_session := DBMS_LDAP.init(hostname => 'db.debian.org', portnum => DBMS_LDAP.port);

    v_search_attrs(1) := '1.1';

    v_result :=
        DBMS_LDAP.search_s(ld         => v_session,
                           base       => 'host=beach,ou=hosts,dc=debian,dc=org',
                           scope      => DBMS_LDAP.scope_base,
                           filter     => 'objectclass=*',
                           attrs      => v_search_attrs,
                           attronly   => NULL,
                           res        => v_search_results);

    v_entry := DBMS_LDAP.first_entry(v_session, v_search_results);

    v_distinguished_name := DBMS_LDAP.get_dn(v_session, v_entry);
    DBMS_OUTPUT.put_line('DN: ' || v_distinguished_name);
    v_dn_pieces := DBMS_LDAP.explode_dn(dn => v_distinguished_name, notypes => 0);

    FOR i IN v_dn_pieces.FIRST .. v_dn_pieces.LAST
    LOOP
        DBMS_OUTPUT.put_line(i || ': ' || v_dn_pieces(i));
    END LOOP;

    v_result := DBMS_LDAP.msgfree(v_entry);
    v_result := DBMS_LDAP.unbind_s(v_session);
END;
/

DN: host=beach,ou=hosts,dc=debian,dc=org
0: host=beach
1: ou=hosts
2: dc=debian
3: dc=org
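The explode_dn call is essentially a split on the RDN separators; with notypes => 1 it would also strip the “name=” prefixes, returning only the values. A rough Python imitation (not the real Oracle function, and ignoring escaped commas, which real DNs can contain):

```python
def explode_dn(dn, notypes=0):
    """Naive imitation of DBMS_LDAP.explode_dn: split a DN into its
    RDN pieces; with notypes set, keep only the values."""
    pieces = [rdn.strip() for rdn in dn.split(",")]
    if notypes:
        pieces = [rdn.split("=", 1)[1] for rdn in pieces]
    return pieces

print(explode_dn("host=beach,ou=hosts,dc=debian,dc=org"))
# ['host=beach', 'ou=hosts', 'dc=debian', 'dc=org']
print(explode_dn("host=beach,ou=hosts,dc=debian,dc=org", notypes=1))
# ['beach', 'hosts', 'debian', 'org']
```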

Instead of raising an exception, we could disable exceptions for the search and then check the result value of the search_s function. You can test this by changing “beach” in the sample to any host name not in their server, such as “xxxxxxx”.

DECLARE
    v_result               PLS_INTEGER;
    v_session              DBMS_LDAP.session;
    v_search_attrs         DBMS_LDAP.string_collection;
    v_search_results       DBMS_LDAP.MESSAGE;
    v_entry                DBMS_LDAP.MESSAGE;
    v_distinguished_name   VARCHAR2(256);
    v_dn_pieces            DBMS_LDAP.string_collection;
BEGIN
    DBMS_LDAP.use_exception := TRUE;

    v_session := DBMS_LDAP.init(hostname => 'db.debian.org', portnum => DBMS_LDAP.port);

    v_search_attrs(1) := '1.1';

    DBMS_LDAP.use_exception := FALSE;
    v_result :=
        DBMS_LDAP.search_s(ld         => v_session,
                           base       => 'host=xxxxxxx,ou=hosts,dc=debian,dc=org',
                           scope      => DBMS_LDAP.scope_base,
                           filter     => 'objectclass=*',
                           attrs      => v_search_attrs,
                           attronly   => NULL,
                           res        => v_search_results);

    IF v_result = DBMS_LDAP.no_such_object
    THEN
        DBMS_OUTPUT.put_line('Base DN not found');
    ELSE
        DBMS_LDAP.use_exception := TRUE;
        v_entry := DBMS_LDAP.first_entry(v_session, v_search_results);

        v_distinguished_name := DBMS_LDAP.get_dn(v_session, v_entry);
        DBMS_OUTPUT.put_line('DN: ' || v_distinguished_name);
        v_dn_pieces := DBMS_LDAP.explode_dn(dn => v_distinguished_name, notypes => 0);

        FOR i IN v_dn_pieces.FIRST .. v_dn_pieces.LAST
        LOOP
            DBMS_OUTPUT.put_line(i || ': ' || v_dn_pieces(i));
        END LOOP;

        v_result := DBMS_LDAP.msgfree(v_entry);
    END IF;

    v_result := DBMS_LDAP.unbind_s(v_session);
END;
/

Base DN not found

For the remainder of the examples I will simply raise exceptions on a failed search.

Searching with scope of One Sub-Level, no attributes

Instead of searching for a record at a specific DN, you could instead search for any entries within one level of the directory matching your search criteria. Here we’ll again search within the hosts level, but we’ll allow all records under hosts. As the name implies, the constant to do one-level searching is DBMS_LDAP.SCOPE_ONELEVEL.

DECLARE
    v_result               PLS_INTEGER;
    v_session              DBMS_LDAP.session;
    v_search_attrs         DBMS_LDAP.string_collection;
    v_search_results       DBMS_LDAP.MESSAGE;
    v_entry                DBMS_LDAP.MESSAGE;
    v_distinguished_name   VARCHAR2(256);
BEGIN
    DBMS_LDAP.use_exception := TRUE;

    v_session := DBMS_LDAP.init(hostname => 'db.debian.org', portnum => DBMS_LDAP.port);

    v_search_attrs(1) := '1.1';

    v_result :=
        DBMS_LDAP.search_s(ld         => v_session,
                           base       => 'ou=hosts,dc=debian,dc=org',
                           scope      => DBMS_LDAP.scope_onelevel,
                           filter     => 'objectclass=*',
                           attrs      => v_search_attrs,
                           attronly   => 1,
                           res        => v_search_results);

    v_result := DBMS_LDAP.count_entries(v_session, v_search_results);

    DBMS_OUTPUT.put_line('Entry count: ' || v_result);

    v_entry := DBMS_LDAP.first_entry(v_session, v_search_results);

    WHILE v_entry IS NOT NULL
    LOOP
        v_distinguished_name := DBMS_LDAP.get_dn(v_session, v_entry);
        DBMS_OUTPUT.put_line('DN: ' || v_distinguished_name);
       
        v_entry := DBMS_LDAP.next_entry(v_session, v_entry);
    END LOOP;

    v_result := DBMS_LDAP.unbind_s(v_session);
END;
/

Entry count: 192
DN: host=klecker,ou=hosts,dc=debian,dc=org
DN: host=lobos,ou=hosts,dc=debian,dc=org
DN: host=wieck,ou=hosts,dc=debian,dc=org
DN: host=schumann,ou=hosts,dc=debian,dc=org
DN: host=handel,ou=hosts,dc=debian,dc=org
DN: host=geo1,ou=hosts,dc=debian,dc=org
DN: host=geo2,ou=hosts,dc=debian,dc=org
DN: host=draghi,ou=hosts,dc=debian,dc=org
DN: host=kaufmann,ou=hosts,dc=debian,dc=org
DN: host=wolkenstein,ou=hosts,dc=debian,dc=org
DN: host=sibelius,ou=hosts,dc=debian,dc=org
DN: host=tchaikovsky,ou=hosts,dc=debian,dc=org
DN: host=pettersson,ou=hosts,dc=debian,dc=org
DN: host=lully,ou=hosts,dc=debian,dc=org
DN: host=abel,ou=hosts,dc=debian,dc=org
DN: host=arnold,ou=hosts,dc=debian,dc=org
DN: host=antheil,ou=hosts,dc=debian,dc=org
...


Searching with scope of entire Sub-Tree, no attributes

And finally the last scope to explore is the Sub-Tree scope which will search for records at any and all depths within a directory tree below the level of the specified base DN. Here again the package includes a constant: DBMS_LDAP.SCOPE_SUBTREE. Since we’re searching the entire tree the results could get large, especially if our filter allows every object class to be returned.

So, in this example, we’ll restrict the filter to object classes of type “debiangroup”. We’ll start the search at the top of the tree looking for everything under “dc=debian,dc=org”.

DECLARE
    v_result               PLS_INTEGER;
    v_session              DBMS_LDAP.session;
    v_search_attrs         DBMS_LDAP.string_collection;
    v_search_results       DBMS_LDAP.MESSAGE;
    v_entry                DBMS_LDAP.MESSAGE;
    v_distinguished_name   VARCHAR2(256);
BEGIN
    DBMS_LDAP.use_exception := TRUE;

    v_session := DBMS_LDAP.init(hostname => 'db.debian.org', portnum => DBMS_LDAP.port);

    v_search_attrs(1) := '1.1';

    v_result :=
        DBMS_LDAP.search_s(ld         => v_session,
                           base       => 'dc=debian,dc=org',
                           scope      => DBMS_LDAP.scope_subtree,
                           filter     => 'objectclass=debiangroup',
                           attrs      => v_search_attrs,
                           attronly   => 1,
                           res        => v_search_results);

    v_result := DBMS_LDAP.count_entries(v_session, v_search_results);

    DBMS_OUTPUT.put_line('Entry count: ' || v_result);

    v_entry := DBMS_LDAP.first_entry(v_session, v_search_results);

    WHILE v_entry IS NOT NULL
    LOOP
        v_distinguished_name := DBMS_LDAP.get_dn(v_session, v_entry);
        DBMS_OUTPUT.put_line('DN: ' || v_distinguished_name);
       
        v_entry := DBMS_LDAP.next_entry(v_session, v_entry);
    END LOOP;

    v_result := DBMS_LDAP.unbind_s(v_session);
END;
/

Entry count: 2172
DN: gid=nm,ou=users,dc=debian,dc=org
DN: gid=qa,ou=users,dc=debian,dc=org
DN: gid=adm,ou=users,dc=debian,dc=org
DN: gid=wnpp,ou=users,dc=debian,dc=org
DN: gid=list,ou=users,dc=debian,dc=org
DN: gid=dput_OBS,ou=users,dc=debian,dc=org
DN: gid=visi,ou=users,dc=debian,dc=org
DN: gid=above,ou=users,dc=debian,dc=org
DN: gid=udcvs,ou=users,dc=debian,dc=org
DN: gid=guest,ou=users,dc=debian,dc=org
DN: gid=debwww,ou=users,dc=debian,dc=org
DN: gid=doccvs,ou=users,dc=debian,dc=org
DN: gid=globus,ou=users,dc=debian,dc=org
DN: gid=debcvs,ou=users,dc=debian,dc=org
DN: gid=popcon,ou=users,dc=debian,dc=org
DN: gid=webwml,ou=users,dc=debian,dc=org
DN: gid=wb-ppc,ou=users,dc=debian,dc=org
DN: gid=buildd,ou=users,dc=debian,dc=org
DN: gid=aptcvs,ou=users,dc=debian,dc=org
DN: gid=Debian,ou=users,dc=debian,dc=org
DN: gid=wb-arm,ou=users,dc=debian,dc=org
...

Searching with Filters

The search API requires a filter, so each of the examples above includes one, but all were fairly simple: either the generic match-all filter of ‘objectclass=*’ or a specific object class, ‘objectclass=debiangroup’. The LDAP filter syntax is much richer, though. The parameter is simply a VARCHAR2, and the filter string is passed “as is” to the server, so the supported functionality is determined by the server rather than by the DBMS_LDAP package.

A basic LDAP filter takes the form of:

(attribute_name operator value)
  • The attribute name is the name of any attribute that might be found within an entry. It need not be an attribute that exists in every entry, as many (most) will not.
  • The operator could be =, >=, or <=. Inequality comparisons can be a little tricky. From the client side, all of the data appears as text; but within the LDAP server the data is managed by a schema with types and defined comparison rules. Those rules may use case-sensitive or case-insensitive text comparison. Numerals may be compared as numeric or text data depending on the schema's definition for that attribute. Some data may not have ordering rules, making less-than or greater-than comparisons meaningless. An attribute classifying an entry as "animal", "vegetable", or "mineral", for example, would not necessarily have an ordering criterion, only an equality or inequality comparison.
  • The filter value could be a string like the “debiangroup” example earlier, or a wildcard like “*”, or a combination of string and wildcard such as “abc*” to find all entries with an attribute value beginning with “abc”.
  • Technically, a filter should be wrapped in parentheses; but if you have a single filter condition with no modifiers, then DBMS_LDAP will accept it as is.
  • The basic filters can then be modified by AND, OR, or NEGATION operators.

    • The AND operator “&” applies the intersection of conditions that follow.
      (&(a=1)(b=2)) – Find records where a=1 and b=2, i.e. (a,b) = (1,2)
      (&(x=10)(y=100)(z=1000)) – Find records where x=10 and y=100 and z=1000, i.e. (x,y,z) = (10,100,1000)
      As a matter of efficiency, it is usually better to have an AND condition with many sub-filters than a more extensive filter using many AND conditions with fewer filters in each.

      It is legal to have a filter consisting of only the AND operator: (&). This filter condition simply evaluates to TRUE for any entry.

    • The OR operator “|” applies the union of the conditions that follow.
      (|(a=1)(b=2)) – Find any records where a=1 or b=2
      (|(x=10)(y=100)(z=1000)) – Find records where x=10 or y=100 or z=1000
      Similar to AND operators, it is usually better to have an OR condition with many sub-filters than to use many OR conditions across fewer sub-filters.

      It is legal to have a filter consisting of only the OR operator: (|). This filter condition simply evaluates to FALSE for any entry.

    • The NEGATION operator “!” simply negates whatever condition follows it.
      (!(objectClass=server)) – Find all records that do not have a server object class
      Unlike the AND and OR operators, the NEGATION operator cannot operate on a list of conditions; only a single filter condition can be negated with a single NEGATION operator.

    And finally, you can nest filter conditions within one another, thus allowing more complicated conditions such as: Find all Managers named Jane Doe or Consultants named Mickey Mouse.

    (|(&(OU=Managers)(CN=Jane Doe))(&(OU=Consultants)(CN=Mickey Mouse)))

    With some formatting, the logic is a little easier to follow.

    (|
     (&(OU=Managers)(CN=Jane Doe))
     (&(OU=Consultants)(CN=Mickey Mouse))
    )
    

    Or, we could find John Doe and Donald Duck by negating the previous search.

    (!
       (|
          (&(OU=Managers)(CN=Jane Doe))
          (&(OU=Consultants)(CN=Mickey Mouse))
       )
    )
    

Data size limits and result set pagination

There is one additional constraint you may run into with large directories: a size limit on returned results. LDAP servers may be willing to pump an unlimited amount of data back to a client, or may have the limits set high enough that your searches never exceed their thresholds; but eventually you will likely come across a result set too large to be returned. When that happens you will get an exception:

ORA-31202: DBMS_LDAP: LDAP client/server error: Sizelimit exceeded

or, if you have exceptions turned off, search_s will return a result code of DBMS_LDAP.SIZELIMIT_EXCEEDED.

While many LDAP servers support paginated results to handle this situation, DBMS_LDAP, unfortunately, does not have a means of invoking such functionality.

If you encounter this, the only way to get around it from the client side is to construct a series of filters that will span the entire data set.

In the example below I use the following filter:

filter     => '&(status=Active)(uid=ab*)'

Simply searching for all accounts failed. Searching for all Active accounts still failed. Searching for all Active accounts with a uid starting with an “a” still failed. When I got to the point of limiting the uid to those starting with “ab” for just Active accounts, I finally got results small enough to pass the server’s size limit. I could then iterate across aa-zz combinations for both Active and not Active accounts to pull all of the data. Of course I might miss ids with leading numeric characters; but hopefully, if you had to do manual pagination, you would work with the server’s admins to determine the best spanning ranges.

DECLARE
    v_result               PLS_INTEGER;
    v_session              DBMS_LDAP.session;
    v_search_attrs         DBMS_LDAP.string_collection;
    v_search_results       DBMS_LDAP.MESSAGE;
    v_entry                DBMS_LDAP.MESSAGE;
    v_distinguished_name   VARCHAR2(256);
BEGIN
    DBMS_LDAP.use_exception := TRUE;

    v_session := DBMS_LDAP.init(hostname => 'ldap.andrew.cmu.edu', portnum => DBMS_LDAP.port);

    v_search_attrs(1) := '1.1';
    DBMS_LDAP.use_exception := FALSE;
    v_result :=
        DBMS_LDAP.search_s(ld         => v_session,
                           base       => 'ou=account,dc=cmu,dc=edu',
                           scope      => DBMS_LDAP.scope_onelevel,
                           filter     => '&(status=Active)(uid=ab*)',
                           attrs      => v_search_attrs,
                           attronly   => 1,
                           res        => v_search_results);

    IF v_result != DBMS_LDAP.success
    THEN
        DBMS_OUTPUT.put_line('Search failed with code: ' || v_result || ' ' || DBMS_LDAP.err2string(v_result));
    ELSE
        DBMS_LDAP.use_exception := TRUE;
        v_result := DBMS_LDAP.count_entries(v_session, v_search_results);

        DBMS_OUTPUT.put_line('Entry count: ' || v_result);

        v_entry := DBMS_LDAP.first_entry(v_session, v_search_results);

        WHILE v_entry IS NOT NULL
        LOOP
            v_distinguished_name := DBMS_LDAP.get_dn(v_session, v_entry);
            DBMS_OUTPUT.put_line('DN: ' || v_distinguished_name);

            v_entry := DBMS_LDAP.next_entry(v_session, v_entry);
        END LOOP;
    END IF;

    v_result := DBMS_LDAP.unbind_s(v_session);
END;
/
Entry count: 71
DN: uid=ABHAT,ou=account,dc=cmu,dc=edu
DN: uid=ABRUNK,ou=account,dc=cmu,dc=edu
DN: uid=ABIHAUS,ou=account,dc=cmu,dc=edu
DN: uid=ABUSE,ou=account,dc=cmu,dc=edu
DN: uid=ABOY,ou=account,dc=cmu,dc=edu
DN: uid=ABERENDEYEV,ou=account,dc=cmu,dc=edu
DN: uid=abenavides,ou=account,dc=cmu,dc=edu
DN: uid=abelsaj,ou=account,dc=cmu,dc=edu
DN: uid=abhinav,ou=account,dc=cmu,dc=edu
DN: uid=abeer,ou=account,dc=cmu,dc=edu
DN: uid=abhishekveldurthy,ou=account,dc=cmu,dc=edu
DN: uid=ab,ou=account,dc=cmu,dc=edu
DN: uid=abhay,ou=account,dc=cmu,dc=edu
DN: uid=abinajay,ou=account,dc=cmu,dc=edu
DN: uid=abhayjoseph,ou=account,dc=cmu,dc=edu
DN: uid=abhishekjain,ou=account,dc=cmu,dc=edu
DN: uid=abreis,ou=account,dc=cmu,dc=edu
DN: uid=abhinavas,ou=account,dc=cmu,dc=edu
...

That covers all of the basics of searching for entries within an LDAP hierarchy.
In the next section we’ll extend the search results to include attribute lists, with and without their values.