Introduction to ASH Data, part 4

What is ASH?
AWR and Activity Types
Average Active Sessions
Graphical Visualizations (this page)

Up to this point in my ASH series all of the output was simply rows and columns of data, sometimes augmented with an ascii, text based chart. This time I’ll explore a few different tools to construct better visualizations.

One of the first things I ever used “R” for was to create ASH charts as an alternate to those found in Oracle’s Enterprise Manager “Top Activity” and “ASH Analytics” pages. I had multiple reasons to pursue the exercise. The first being the Top Activity page didn’t have a mechanism to specify an arbitrary time range. The ASH Analytics page does allow inputs but on refreshes it often loses them. Furthermore, both pages generated their graphs from averages over periods, which can distort the data, causing peaks and valleys to disappear. I wrote about the missing peaks a couple years ago.

So, the first visualization I created was a simple stacked bar chart with the ggplot2 package, accessing the database through the ROracle package. This script plots a half-hour range seen in previous articles but every sample time is represented. While this is still subject to the sampling issues of all ASH queries, this chart eliminates missing peaks caused by rounded averages over periods.

library("DBI")
library("ROracle")
library("ggplot2")

drv <- dbDriver("Oracle")

### Connect via credentials stored in wallet
con <- dbConnect(drv, dbname ="testdb")

######################################################################################################
#### Function: get_top_activity
######################################################################################################
get_top_activity <- function(p_con, p_start, p_end) {
    v_sql <- "
        SELECT sample_time, activity, COUNT(*) cnt
          FROM (SELECT sample_time,
                       CASE
                       WHEN session_state = 'ON CPU' THEN
                          'ON CPU'
                       WHEN wait_class IN ('Concurrency',
                                           'System I/O',
                                           'User I/O',
                                           'Configuration',
                                           'Application',
                                           'Commit',
                                           'Queueing',
                                           'Network',
                                           'Administrative',
                                           'Cluster',
                                           'Scheduler') THEN
                          wait_class
                       ELSE
                          'Other'
                       END activity
                  FROM dba_hist_active_sess_history
                 WHERE sample_time >= TO_TIMESTAMP(:start_time,'yyyy-mm-dd hh24:mi:ss')
                   AND sample_time < TO_TIMESTAMP(:end_time,'yyyy-mm-dd hh24:mi:ss'))
          GROUP BY sample_time, activity"

    ### Need to tell R the timezone of the incoming data
    ### Alternately, could use FROM_TZ function to give the timestamp values a timezone in the query.
    Sys.setenv(ORA_SDTZ = "US/Eastern")
    
    return(dbGetQuery(
              p_con,
              v_sql,
              data.frame(
                  start_time = p_start,
                  end_time = p_end
              )
           )
    )
}
######################################################################################################


##############################################################################################
###  Get ASH Data
##############################################################################################
df_activity <- get_top_activity(con, '2018-09-15 13:00:00', '2018-09-15 13:30:00')

### We don't need the db connection anymore
dbDisconnect(con)
dbUnloadDriver(drv)

### Stack the wait classes in this order (CPU on bottom, Other on top)
wait_class_order <- c(
  "Other",
  "Cluster",
  "Queueing",
  "Network",
  "Administrative",
  "Configuration",
  "Commit",
  "Application",
  "Concurrency",
  "System I/O",
  "User I/O",
  "Scheduler",
  "ON CPU"
)
df_activity$ACTIVITY <- factor(df_activity$ACTIVITY, levels = wait_class_order)


## Color hex codes for each activity type shown in the chart
wait_class_colors <- c(
  "ON CPU"         = "#04CE04",
  "Scheduler"      = "#CCFECC",
  "User I/O"       = "#084AD4",
  "System I/O"     = "#0499F4",
  "Concurrency"    = "#8E1B04",
  "Application"    = "#C72D04",
  "Commit"         = "#E46A04",
  "Configuration"  = "#5B460D",
  "Administrative" = "#707251",
  "Network"        = "#9C9376",
  "Queueing"       = "#C5B79A",
  "Cluster"        = "#CCC3AE",
  "Other"          = "#F56CA9"
)

### Build chart, fill in missing values, add titles and labels
plot_activity <- ggplot(df_activity, aes(x = SAMPLE_TIME, y = CNT, fill = ACTIVITY)) +
                        geom_bar(stat = "identity") +
                        scale_fill_manual(values = wait_class_colors) +
                        ggtitle("Active Sessions")  +
                        labs(x = "Sample Time", y =  "Active Sessions")

### Show the chart
print(plot_activity)

Running this script produced the chart below from my test system.

Shortly after I started exploring R, I was asked to help with a project using Googlecharts. I hadn’t done a lot with the Googlecharts api before; but having just completed some simple R charting, I started by replicating the ASH charts as a learning exercise. With R, you build a data frame to pass into the ggplot api. With Googlecharts, you build an array of arrays, each sub-array has a place holder value for every possible activity type, whereas the R plot function could fill in missing values as needed. Rows in the array will look something like these:

 [new Date(2018,09,15,13,06,00),6,0,0,0,0,0,0,0,0,0,0,0,0]
,[new Date(2018,09,15,13,29,33),0,0,3,0,0,0,0,0,0,0,0,0,0]
,[new Date(2018,09,15,13,19,42),0,0,3,0,0,0,0,0,0,0,0,0,0]
,[new Date(2018,09,15,13,15,51),6,0,0,0,0,0,0,0,0,0,0,0,0]

Using the same query and date range as in the R example above, adding a pivot clause, and a little pl/sql we can spool out an html file containing all of the necessary data and invoke a browser to load it and generate the chart.

set echo off
set pages 1000
set lines 1000
set feedback off
set define off
set serveroutput on
spool ash.html
DECLARE
      p_start_time TIMESTAMP := TO_TIMESTAMP('2018-09-15 13:00:00','yyyy-mm-dd hh24:mi:ss');
      p_end_time   TIMESTAMP := TO_TIMESTAMP('2018-09-15 13:30:00','yyyy-mm-dd hh24:mi:ss');
BEGIN
    DBMS_OUTPUT.put_line(
        q'[<html>
  <head>
    <script type="text/javascript" src="https://www.gstatic.com/charts/loader.js"></script>
    <script type="text/javascript">
       google.charts.load('current', {'packages':['corechart']});
       google.charts.setOnLoadCallback(drawChart);
       function drawChart() {
           var data = google.visualization.arrayToDataTable([
            ['Activity',
            'ON CPU','Scheduler','User IO', 'System IO', 'Concurrency','Application','Commit',
            'Configuration',    'Administrative',    'Network',  'Queueing', 'Cluster',   'Other'],]'
       );

    FOR x
        IN (
            SELECT    CASE WHEN ROWNUM = 1 THEN ' ' ELSE ',' END
                   || '['
                   || TO_CHAR(sample_time, '"new Date("yyyy,mm,dd,hh24,mi,ss")"')
                   || ','
                   || NVL("ON CPU", 0)
                   || ','
                   || NVL("Scheduler", 0)
                   || ','
                   || NVL("User I/O", 0)
                   || ','
                   || NVL("System I/O", 0)
                   || ','
                   || NVL("Concurrency", 0)
                   || ','
                   || NVL("Application", 0)
                   || ','
                   || NVL("Commit", 0)
                   || ','
                   || NVL("Configuration", 0)
                   || ','
                   || NVL("Administrative", 0)
                   || ','
                   || NVL("Network", 0)
                   || ','
                   || NVL("Queueing", 0)
                   || ','
                   || NVL("Cluster", 0)
                   || ','
                   || NVL("Other", 0)
                   || ']' arrayrow
              FROM (SELECT sample_time,
                           CASE
                               WHEN session_state = 'ON CPU'
                               THEN
                                   'ON CPU'
                               WHEN wait_class IN ('Concurrency',
                                                   'System I/O',
                                                   'User I/O',
                                                   'Configuration',
                                                   'Application',
                                                   'Commit',
                                                   'Queueing',
                                                   'Network',
                                                   'Administrative',
                                                   'Cluster',
                                                   'Scheduler')
                               THEN
                                   wait_class
                               ELSE
                                   'Other'
                           END activity
                      FROM dba_hist_active_sess_history
                     WHERE sample_time >= p_start_time 
                       AND sample_time < p_end_time)
                   PIVOT (COUNT(*)
                         FOR activity
                         IN ('Other' AS "Other",
                            'Cluster' AS "Cluster",
                            'Queueing' AS "Queueing",
                            'Network' AS "Network",
                            'Administrative' AS "Administrative",
                            'Configuration' AS "Configuration",
                            'Commit' AS "Commit",
                            'Application' AS "Application",
                            'Concurrency' AS "Concurrency",
                            'System I/O' AS "System I/O",
                            'User I/O' AS "User I/O",
                            'Scheduler' AS "Scheduler",
                            'ON CPU' AS "ON CPU"))
    )
    LOOP
        DBMS_OUTPUT.put_line(x.arrayrow);
    END LOOP;

    DBMS_OUTPUT.put_line(
           q'[        ]);           

      var options = {
        width: 1200,
        height: 700,
        legend: { position: 'right', maxLines: 3 },
        bar: { groupWidth: '75%' },
        isStacked: true,
        colors: [
          '#04CE04',
          '#CCFECC',
          '#084AD4',
          '#0499F4',
          '#8E1B04',
          '#C72D04',
          '#E46A04',
          '#5B460D',
          '#707251',
          '#9C9376',
          '#C5B79A',
          '#CCC3AE',
          '#F56CA9']
      };
        // Instantiate and draw our chart, passing in some options.
        var chart = new google.visualization.ColumnChart(document.getElementById('chart_div'));
        chart.draw(data, options);
      }
    </script>
  </head>

  <body><div id="chart_div"></div></body></html>]'
    );
END;
/
spool off
host "C:\Program Files\Mozilla Firefox\firefox.exe" .\ash.html
exit

Using the script above the googlechart version of the ASH data looks like below.

The last visualization tool I’ll explore is perhaps the most obvious one – Application Express (APEX.) To create a stacked bar chart similar to the ones above…

Create a new Region

Under Identification
- Set Title – Active Sessions
- Set Type – Chart
Under Attributes
- Under Chart, Set Type – Bar
- Under Appearance, Set Stacked – Yes
- Under Settings, Set Time Axis Type – Enabled
Under Series
- Under Source, Set Type – SQL Query (query is below)
- Under Column Mapping
  - Set Series Name – ACTIVITY
  - Set Label – SAMPLE_TIME
  - Set Value – CNT
- Under Appearance, Set Color – &COLOR.

In addition to the settings above, we’ll also need to add the SQL query in the Series Source. Below is the same query used in the R example, except I’ve also added a color attribute.

  SELECT sample_time,
         activity,
         color,
         COUNT(*) cnt
    FROM (SELECT sample_time,
                 CASE
                     WHEN session_state = 'ON CPU'
                     THEN
                         'ON CPU'
                     WHEN wait_class IN ('Concurrency',
                                         'System I/O',
                                         'User I/O',
                                         'Configuration',
                                         'Application',
                                         'Commit',
                                         'Queueing',
                                         'Network',
                                         'Administrative',
                                         'Cluster',
                                         'Scheduler')
                     THEN
                         wait_class
                     ELSE
                         'Other'
                 END activity,
                 CASE
                     WHEN session_state = 'ON CPU' THEN '#04CE04'
                     WHEN wait_class = 'Concurrency' THEN '#8E1B04'
                     WHEN wait_class = 'System I/O' THEN '#0499F4'
                     WHEN wait_class = 'User I/O' THEN '#084AD4'
                     WHEN wait_class = 'Configuration' THEN '#5B460D'
                     WHEN wait_class = 'Application' THEN '#C72D04'
                     WHEN wait_class = 'Commit' THEN '#E46A04'
                     WHEN wait_class = 'Queueing' THEN '#C5B79A'
                     WHEN wait_class = 'Network' THEN '#9C9376'
                     WHEN wait_class = 'Administrative' THEN '#707251'
                     WHEN wait_class = 'Cluster' THEN '#CCC3AE'
                     WHEN wait_class = 'Scheduler' THEN '#CCFECC'
                     ELSE '#F56CA9'
                 END color
            FROM dba_hist_active_sess_history
           WHERE sample_time >= TIMESTAMP '2018-09-15 13:00:00'
             AND sample_time < TIMESTAMP '2018-09-15 13:30:00')
GROUP BY sample_time, activity, color

Running the page with the chart region defined as above produced the following chart. Unfortunately, the options for defining the stacking order are limited to sorting by value or label. Thus, it’s not possible (using just the APEX native options) to guarantee the APEX chart will stack the same way as the R or Googlechart variations. It might be possible with some custom java script to manipulate the underlying JET attributes; but I have not explored that route. The overall effect though, even with the current limitations is still quite similar to the other tools.

Hopefully these examples will help you in digging into your own ASH data and maybe give a little head start into exploring some new tools. I welcome any suggestions, corrections, or additions to any of the methods above.

While all of these examples have hard-coded date ranges, I hope the code arrangement provides a solid framework to parameterize within your own environment and tools. As always, questions and comments are welcome.