Using VBA to conduct Oracle Statspack time series analysis

Back to the main data prep VBA page

I work within development organizations helping to tune high scale workloads for our large scale commercial customer base. These customers typically deploy in a variety of environments that include many different database technologies, often ones not directly owned by the company I work for. Since our application is heavily dependent on the performance of the underlying database, I’ve worked up many reference scripts used to parse performance data provided by the different databases in use so I can create a time series view of key performance metrics from the RDBMS.

In this case I have presented a simple script that parses a directory full of Oracle Statspack text files that contain performance data on regular snap intervals. These files contain a wealth of performance data that is often used as a single report that spans multiple hours. Being a performance guy I want a time series view of this data that uses smaller snap intervals and tracks key individual metrics over time so I can correlate any application tier issues to specific database metrics captured by Oracle Statspack.

The following script will create a single pipe (“|”) delimited output file that contains a row per report with the key sections of the Oracle Statspack output I am interested in. This script can easily be adapted to target other sections of Oracle Statspack output, but there are a few interesting parsing examples that can be altered as needed. Here’s the script in it’s entirety:


Sub importAWRtext()
'June 2014
'routine to import text based AWR snaps
'logic allows for multi-node imports on RAC environments
'point to a directory and import all reports

    'set variables
        Dim rawLOG As String, rawLOGpath As String
        Dim splitArray() As String, StrOut As String
        Dim awrTextArr() As Variant
        Dim lineCt As Double
        pathFull = Sheets("input").Range("C13")
        Set objrawLOG = CreateObject("Scripting.FileSystemObject")
        rawLOGpath = objrawLOG.GetFile(pathFull).ParentFolder.path & "\"
    'setup clean files
        srcCSV = Dir(objrawLOG.GetFile(pathFull).ParentFolder.path & "\")
        Do While srcCSV <> ""
            tempFILE = rawLOGpath & "\" & srcCSV & ""
            If srcCSV Like "csvAWR*" Then
                Set delFILE = CreateObject("Scripting.FileSystemObject")
                delFILE.Deletefile tempFILE, True
            End If
            srcCSV = Dir
        Loop
        fnameLOGclean = "csvAWRclean.out"
        Set objFSO = CreateObject("Scripting.FileSystemObject")
        Set objCLEAN = objFSO.CreateTextFile(rawLOGpath & "\" & fnameLOGclean, 1)
    'build array of log names up front, use those as source
        Dim logNameArr(10000, 1) As Variant
        rawLOG = Dir(rawLOGpath)
        logCT = 0: logSO = 0
        Do While rawLOG <> ""
            If rawLOG Like "*.txt" Then
                logCT = logCT + 1
                logNameArr(logCT, 1) = rawLOG
            End If
            rawLOG = Dir
        Loop
    'setup array for caching results
        ReDim awrTextArr(logCT + 1, 1) As Variant
    'loop through statspack files
        For logLoop = 1 To logCT
            lineCt = 0: curStr = ""
            rawLOG = logNameArr(logLoop, 1)
            srcFullPath = rawLOGpath & rawLOG
            Set objFSO = CreateObject("Scripting.FileSystemObject")
            Set objrawLOG = objFSO.OpenTextFile(rawLOGpath & rawLOG, 1)
            Do Until objrawLOG.AtEndOfStream
                txnstring = objrawLOG.ReadLine
                lineCt = lineCt + 1
                'DB info
                    If txnstring Like "DB Name*" Then curStr = "DBname": tgtLine = lineCt + 2
                    If tgtLine = lineCt And curStr = "DBname" Then
                        tempStr = Application.WorksheetFunction.Trim(txnstring)
                        splitArray() = Split(tempStr, " ")
                        StrOut = StrOut & splitArray(0) & "|" & splitArray(2) & "|" _
                            & splitArray(6) & "|" & splitArray(7) & "|"
                        curStr = ""
                        headerLine = headerLine & "dbname|instance|release|RAC|"
                    End If
                'Host info
                    If txnstring Like "Host Name*" Then curStr = "HOSTname": tgtLine = lineCt + 2
                    If tgtLine = lineCt And curStr = "HOSTname" Then
                        StrOut = StrOut & Trim(Left(txnstring, 16)) & "|"
                        StrOut = StrOut & Trim(Right(Left(txnstring, 49), 32)) & "|"
                        StrOut = StrOut & Trim(Right(Left(txnstring, 54), 4)) & "|"
                        StrOut = StrOut & Trim(Right(Left(txnstring, 60), 4)) & "|"
                        StrOut = StrOut & Trim(Right(Left(txnstring, 68), 7)) & "|"
                        StrOut = StrOut & Trim(Right(Left(txnstring, 80), 10)) & "|"
                        curStr = ""
                        headerLine = headerLine & "host name|platform|CPUs|Cores|Sockets|Memory(GB)|"
                    End If
                'unique strings
                    If txnstring Like "Begin Snap:*" Then
                        tempStr = Application.WorksheetFunction.Trim(txnstring)
                        splitArray() = Split(tempStr, " ")
                        StrOut = StrOut & splitArray(2) & "|" & splitArray(3) & " " & splitArray(4) & "|"
                        headerLine = headerLine & "begin snap ID|begin time|"
                        End If
                    If txnstring Like "  End Snap:*" Then
                        tempStr = Application.WorksheetFunction.Trim(txnstring)
                        splitArray() = Split(tempStr, " ")
                        StrOut = StrOut & splitArray(2) & "|" & splitArray(3) & " " & splitArray(4) & "|"
                        headerLine = headerLine & "end snap ID|end time|"
                        End If
                    If txnstring Like "               Buffer Cache:*" Then
                        tempStr = Application.WorksheetFunction.Trim(txnstring)
                        splitArray() = Split(tempStr, " ")
                        StrOut = StrOut & splitArray(2) & "|" & splitArray(7) & "|"
                        headerLine = headerLine & "buffer cache|block size|"
                        End If
                    If txnstring Like "   Logical reads:*" Then
                        tempStr = Application.WorksheetFunction.Trim(txnstring)
                        splitArray() = Split(tempStr, " ")
                        StrOut = StrOut & splitArray(2) & "|"
                        headerLine = headerLine & "lrps|"
                        End If
                    If txnstring Like "  Physical reads:*" Then
                        tempStr = Application.WorksheetFunction.Trim(txnstring)
                        splitArray() = Split(tempStr, " ")
                        StrOut = StrOut & splitArray(2) & "|"
                        headerLine = headerLine & "prps|"
                        End If
                    If txnstring Like "*Buffer  Hit   %:*" Then
                        tempStr = Application.WorksheetFunction.Trim(txnstring)
                        splitArray() = Split(tempStr, " ")
                        StrOut = StrOut & splitArray(3) & "|"
                        headerLine = headerLine & "BCHR|"
                        End If
                'CPU utilization
                    If txnstring Like "*Host CPU (CPUs*" Then curStr = "CPUutil": tgtLine = lineCt + 4
                    If tgtLine = lineCt And curStr = "CPUutil" Then
                        tempStr = Application.WorksheetFunction.Trim(txnstring)
                        splitArray() = Split(tempStr, " ")
                        StrOut = StrOut & splitArray(2) & "|" & splitArray(3) & _
                            "|" & splitArray(4) & "|" & splitArray(5) & "|"
                        curStr = "": tgtLine = 0
                        headerLine = headerLine & "%user|%system|%wio|%idle|"
                    End If
                'Top 5 timed events
                    If txnstring Like "Top 5 Timed Foreground Events*" Then curStr = "top5": tgtLine = lineCt + 6
                    If tgtLine = lineCt And curStr = "top5" Then
                        StrOut = StrOut & Trim(Left(txnstring, 30)) & "|"
                        StrOut = StrOut & Trim(Right(Left(txnstring, 43), 12)) & "|"
                        StrOut = StrOut & Trim(Right(Left(txnstring, 55), 11)) & "|"
                        StrOut = StrOut & Trim(Right(Left(txnstring, 62), 6)) & "|"
                        StrOut = StrOut & Trim(Right(Left(txnstring, 69), 6)) & "|"
                        tgtLine = tgtLine + 1: loopInc = loopInc + 1
                        headerLine = headerLine & "event" & loopInc & "|waits|time(s)|avg wait(ms)|%DB time|"
                        If loopInc = 5 Then curStr = "": loopInc = 0: tgtLine = 0
                        End If
                'top 3 SQL by elapsed time
                    If txnstring Like "*SQL ordered by Elapsed Time*" Then curStr = "SQLbyTime"
                    If curStr = "SQLbyTime" And loopInc = 0 And txnstring Like "----------*" Then tgtLine = lineCt + 1
                    If curStr = "SQLbyTime" And loopInc > 0 And Trim(txnstring) = "" Then tgtLine = lineCt + 1
                    If tgtLine = lineCt And curStr = "SQLbyTime" And loopInc < 4 Then
                        tempStr = Application.WorksheetFunction.Trim(txnstring)
                        splitArray() = Split(tempStr, " ")
                        StrOut = StrOut & splitArray(6) & "|" & splitArray(0) & "|" & splitArray(1) & "|" & _
                            splitArray(2) & "|" & splitArray(3) & "|"
                        loopInc = loopInc + 1
                        headerLine = headerLine & "elapsed" & loopInc & " sqlID|elapsed(sec)|execs|ms/exec|%total|"
                        End If
                    If txnstring Like "*SQL ordered by CPU Time*" Then loopInc = 0: tgtLine = 0: curStr = ""
                'top 3 SQL by Gets
                    If txnstring Like "*SQL ordered by Gets*" Then curStr = "SQLbyGets"
                    If curStr = "SQLbyGets" And loopInc = 0 And txnstring Like "----------*" Then tgtLine = lineCt + 1
                    If curStr = "SQLbyGets" And loopInc > 0 And Trim(txnstring) = "" Then tgtLine = lineCt + 1
                    If tgtLine = lineCt And curStr = "SQLbyGets" And loopInc < 4 Then
                        tempStr = Application.WorksheetFunction.Trim(txnstring)
                        splitArray() = Split(tempStr, " ")
                        StrOut = StrOut & splitArray(7) & "|" & splitArray(0) & "|" & splitArray(1) & "|" & _
                            splitArray(2) & "|" & splitArray(3) & "|"
                        loopInc = loopInc + 1
                        headerLine = headerLine & "gets" & loopInc & " sqlID|gets|execs|gets/exec|%total|"
                        End If
                    If txnstring Like "*SQL ordered by Reads*" Then loopInc = 0: tgtLine = 0: curStr = ""
                'Segments by parser
                'there are multiple similarly structured tables in the segments by section
                'they are variable in length, and some have an extra line in the table header
                'this section sets up the target line start and the custom header variables
                    If txnstring Like "*Segments by Logical Reads*" Then
                        curStr = "SegByLog": tgtLine = lineCt + 7: testEmpty = lineCt + 2
                        header1 = "segs by log"
                        header2 = "|name|object|type|logical reads|%total|"
                        End If
                    If txnstring Like "*Segments by Physical Reads*" Then
                        curStr = "SegByPhys": tgtLine = lineCt + 7: testEmpty = lineCt + 2
                        header1 = "segs by phys"
                        header2 = "|name|object|type|physical reads|%total|"
                        End If
                    If txnstring Like "*Segments by Physical Writes*" Then
                        curStr = "SegByPhysW": tgtLine = lineCt + 7: testEmpty = lineCt + 2
                        header1 = "segs by writes"
                        header2 = "|name|object|type|physical writes|%total|"
                        End If
                    If txnstring Like "*Segments by Table Scans*" Then
                        curStr = "SegByScans": tgtLine = lineCt + 7: testEmpty = lineCt + 2
                        header1 = "segs by scans"
                        header2 = "|name|object|type|table scans|%total|"
                        End If
                    If txnstring Like "*Segments by Row Lock Waits *" Then
                        curStr = "SegByLocks": tgtLine = lineCt + 8: testEmpty = lineCt + 2
                        header1 = "segs by rlwait"
                        header2 = "|name|object|type|row lock waits|%total|"
                        End If
                    If curStr Like "SegBy*" And lineCt = testEmpty And txnstring Like "                  No data exists for this section of the report.*" Then
                        For t = 1 To 5
                                StrOut = StrOut & "||||||"
                                headerLine = headerLine & header1 & loopInc & header2
                            Next t
                            curStr = "": loopInc = 0: tgtLine = 0
                            End If
                'execute "segments by" parsing logic
                'since the table is variable length, look for the end line and skip further processing
                    If tgtLine = lineCt And curStr Like "SegBy*" Then
                        If txnstring Like "          ------------------------------*" Then
                            For t = loopInc + 1 To 5
                                StrOut = StrOut & "||||||"
                                headerLine = headerLine & header1 & loopInc & header2
                            Next t
                            curStr = "": loopInc = 0: tgtLine = 0
                        Else
                            StrOut = StrOut & Trim(Left(txnstring, 11)) & "|"
                            StrOut = StrOut & Trim(Right(Left(txnstring, 21), 10)) & "|"
                            StrOut = StrOut & Trim(Right(Left(txnstring, 42), 20)) & "|"
                            StrOut = StrOut & Trim(Right(Left(txnstring, 59), 5)) & "|"
                            StrOut = StrOut & Trim(Right(Left(txnstring, 72), 12)) & "|"
                            StrOut = StrOut & Trim(Right(Left(txnstring, 80), 7)) & "|"
                            tgtLine = tgtLine + 1: loopInc = loopInc + 1
                            headerLine = headerLine & header1 & loopInc & header2
                            If loopInc = 5 Then curStr = "": loopInc = 0: tgtLine = 0
                        End If
                    End If
            Loop
            'record output string
            'caching to an array so that I can build out a custom header
                If logLoop = 1 Then awrTextArr(1, 1) = headerLine
                awrTextArr(logLoop + 1, 1) = StrOut
                StrOut = "": headerLine = ""
        Next logLoop
        'loop through array, write output to flie
            For i = 1 To logCT + 1
                objCLEAN.WriteLine awrTextArr(i, 1)
            Next i
        'clean up
            objrawLOG.Close

End Sub

I fully admit I cheated and using the worksheet TRIM function in a few cases to collapse consecutive spaces for easier splitting on variable length fields. There are likely a few ways to solve this by walking the string and counting sections, but TRIM was easier. I may revisit some of my brute force approaches to relying on fixed field length parsing.

So far I have used this on 11.2.0.3.0 output. We’ll see if it holds up as other samples from different Oracle versions come in. I’d been putting off writing this script for a few years, focusing my initial efforts on the Oracle AWR output, but I finally broke down and wrote this script after yet another RAC cluster set of reports came in (24 x 1 hour samples across a 4 node cluster for a total of 96 reports…).

 

Parsing data from a single line in Oracle Statspack

The simplest use case is to pull data from multiple columns in a single line. In this case, I used the worksheet TRIM function to eliminate extra spaces in the variable length lines, followed by the split function using a space as a delimiter. Then I pulled the specific “column” of interest:

If txnstring Like "*Buffer  Hit   %:*" Then
    tempStr = Application.WorksheetFunction.Trim(txnstring)
    splitArray() = Split(tempStr, " ")
    StrOut = StrOut & splitArray(3) & "|"
    headerLine = headerLine & "BCHR|"
    End If

 

Parsing data from a fixed row count Oracle Statspack table

There are tables within Oracle Statspack that have a fixed number of entries. An example of this is the “top 5 timed events” that will always have 5 entries in a production system. I have not run across a case where there are not five entries, so we can simplify the parsing of this table by simply keeping a count of the lines that have been processed.

'Top 5 timed events
    If txnstring Like "Top 5 Timed Foreground Events*" Then curStr = "top5": tgtLine = lineCt + 6
    If tgtLine = lineCt And curStr = "top5" Then
        StrOut = StrOut & Trim(Left(txnstring, 30)) & "|"
        StrOut = StrOut & Trim(Right(Left(txnstring, 43), 12)) & "|"
        StrOut = StrOut & Trim(Right(Left(txnstring, 55), 11)) & "|"
        StrOut = StrOut & Trim(Right(Left(txnstring, 62), 6)) & "|"
        StrOut = StrOut & Trim(Right(Left(txnstring, 69), 6)) & "|"
        tgtLine = tgtLine + 1: loopInc = loopInc + 1
        headerLine = headerLine & "event" & loopInc & "|waits|time(s)|avg wait(ms)|%DB time|"
        If loopInc = 5 Then curStr = "": loopInc = 0: tgtLine = 0
        End If

You can see that I used a fixed format parsing method for counting character positions based on the expected table layout. We’ll see if they modify the column assignments by release 🙂

 

Parsing data from a variable length Oracle Statspack table

This is a section of the script where I pulled the top three SQL queries ordered by buffer gets. There are two if statements that skip lines that should be ignored. The parsing will run until the next table is encountered (SQL by Reads):

'top 3 SQL by Gets
    If txnstring Like "*SQL ordered by Gets*" Then curStr = "SQLbyGets"
    If curStr = "SQLbyGets" And loopInc = 0 And txnstring Like "----------*" Then tgtLine = lineCt + 1
    If curStr = "SQLbyGets" And loopInc > 0 And Trim(txnstring) = "" Then tgtLine = lineCt + 1
    If tgtLine = lineCt And curStr = "SQLbyGets" And loopInc < 4 Then
        tempStr = Application.WorksheetFunction.Trim(txnstring)
        splitArray() = Split(tempStr, " ")
        StrOut = StrOut & splitArray(7) & "|" & splitArray(0) & "|" & splitArray(1) & "|" & _
            splitArray(2) & "|" & splitArray(3) & "|"
        loopInc = loopInc + 1
        headerLine = headerLine & "gets" & loopInc & " sqlID|gets|execs|gets/exec|%total|"
        End If
    If txnstring Like "*SQL ordered by Reads*" Then loopInc = 0: tgtLine = 0: curStr = ""

 

Parsing a variable length Oracle Statspack tables with the same base format

I have a section in the script that uses the same parsing method for multiple tables that have the same fixed format dimensions. These “Segment by” tables have the same column counts but different headers. The complicating factor is in some cases, these tables contain no data. I have to trap that condition and maintain the output string dimensions:

'Segments by parser
'there are multiple similarly structured tables in the segments by section
'they are variable in length, and some have an extra line in the table header
'this section sets up the target line start and the custom header variables
    If txnstring Like "*Segments by Logical Reads*" Then
        curStr = "SegByLog": tgtLine = lineCt + 7: testEmpty = lineCt + 2
        header1 = "segs by log"
        header2 = "|name|object|type|logical reads|%total|"
        End If
    If txnstring Like "*Segments by Physical Reads*" Then
        curStr = "SegByPhys": tgtLine = lineCt + 7: testEmpty = lineCt + 2
        header1 = "segs by phys"
        header2 = "|name|object|type|physical reads|%total|"
        End If
    If txnstring Like "*Segments by Physical Writes*" Then
        curStr = "SegByPhysW": tgtLine = lineCt + 7: testEmpty = lineCt + 2
        header1 = "segs by writes"
        header2 = "|name|object|type|physical writes|%total|"
        End If
    If txnstring Like "*Segments by Table Scans*" Then
        curStr = "SegByScans": tgtLine = lineCt + 7: testEmpty = lineCt + 2
        header1 = "segs by scans"
        header2 = "|name|object|type|table scans|%total|"
        End If
    If txnstring Like "*Segments by Row Lock Waits *" Then
        curStr = "SegByLocks": tgtLine = lineCt + 8: testEmpty = lineCt + 2
        header1 = "segs by rlwait"
        header2 = "|name|object|type|row lock waits|%total|"
        End If
    If curStr Like "SegBy*" And lineCt = testEmpty And txnstring Like "                  No data exists for this section of the report.*" Then
        For t = 1 To 5
            StrOut = StrOut & "||||||"
            headerLine = headerLine & header1 & loopInc & header2
        Next t
        curStr = "": loopInc = 0: tgtLine = 0
        End If

The prior section sets the type of table that is being parsed with the proper identifiers and header strings. It also contains the logic to look for the tables that contain no data. Once the type of table is established, the tables are parsed and the content is appended to the output string:

'execute "segments by" parsing logic
'since the table is variable length, look for the end line and skip further processing
    If tgtLine = lineCt And curStr Like "SegBy*" Then
        If txnstring Like "          ------------------------------*" Then
            For t = loopInc + 1 To 5
                StrOut = StrOut & "||||||"
                headerLine = headerLine & header1 & loopInc & header2
            Next t
            curStr = "": loopInc = 0: tgtLine = 0
        Else
            StrOut = StrOut & Trim(Left(txnstring, 11)) & "|"
            StrOut = StrOut & Trim(Right(Left(txnstring, 21), 10)) & "|"
            StrOut = StrOut & Trim(Right(Left(txnstring, 42), 20)) & "|"
            StrOut = StrOut & Trim(Right(Left(txnstring, 59), 5)) & "|"
            StrOut = StrOut & Trim(Right(Left(txnstring, 72), 12)) & "|"
            StrOut = StrOut & Trim(Right(Left(txnstring, 80), 7)) & "|"
            tgtLine = tgtLine + 1: loopInc = loopInc + 1
            headerLine = headerLine & header1 & loopInc & header2
            If loopInc = 5 Then curStr = "": loopInc = 0: tgtLine = 0
        End If
    End If

 

Back to the main data prep VBA page