Oracle's ASH, or Active Session History, statistics are simply a gold mine. ASH's power stems from its capability to correlate performance issues to a specific database user or a SQL statement very quickly. Active Session History, as name implies, stores session states for each and every active database session. A session is considered active if it is on CPU or waiting for a non-idle event (or OS call).
Since Oracle samples active sessions every second, each sample is an statistical approximation of one second. If an ASH record indicates that a session is on CPU, then you could say that the session has consumed 1 second of CPU. By the same token, if an ASH record indicates that a session is waiting for a non-idle event, then you could say that the session has waited for a second for that event.
For example, if there are ASH 100 records showing various sessions on CPU for a 30-minute time interval, then you could say DB CPU Time is 100 seconds. During the same interval, if there are 200 ASH records showing various wait events(OS calls), then you could say that the database has waited for 200 seconds. In this case, DB Time would be 100 + 200 = 300 seconds.
As we know, DB Time and DB CPU time can also be derived from an AWR report which naturally raises a question - do DB TIme and DB CPU Time statistics, reported by AWR report, match with those derived from the ASH? I would contend that it is not important to have a 100% match for ASH and AWR statistics. What is more important is that ASH statistics would help you identify a problem very quickly which AWR may not be able to do do. Anyway, let me first attempt to answer this question and then explain why it is futile to attempt to compare two sets of statistics.
Back to the question, ASH and AWR statistics may or may not match. AWR statistics represent the difference between two snapshots while ASH statistics are samples of all active session collected every second. There may be sampling errors in ASH Data.
ASH experts or statistician may be able to elaborate more on ASH sampling algorithm and answer this question intelligently. In my small world, I have observed ASH statistics to be off by 10 to 20%. Let me show you how to calculate differences in DB Time and DB CPU Time as reported by AWR and ASH.
Since Oracle samples active sessions every second, each sample is an statistical approximation of one second. If an ASH record indicates that a session is on CPU, then you could say that the session has consumed 1 second of CPU. By the same token, if an ASH record indicates that a session is waiting for a non-idle event, then you could say that the session has waited for a second for that event.
For example, if there are ASH 100 records showing various sessions on CPU for a 30-minute time interval, then you could say DB CPU Time is 100 seconds. During the same interval, if there are 200 ASH records showing various wait events(OS calls), then you could say that the database has waited for 200 seconds. In this case, DB Time would be 100 + 200 = 300 seconds.
As we know, DB Time and DB CPU time can also be derived from an AWR report which naturally raises a question - do DB TIme and DB CPU Time statistics, reported by AWR report, match with those derived from the ASH? I would contend that it is not important to have a 100% match for ASH and AWR statistics. What is more important is that ASH statistics would help you identify a problem very quickly which AWR may not be able to do do. Anyway, let me first attempt to answer this question and then explain why it is futile to attempt to compare two sets of statistics.
Back to the question, ASH and AWR statistics may or may not match. AWR statistics represent the difference between two snapshots while ASH statistics are samples of all active session collected every second. There may be sampling errors in ASH Data.
ASH experts or statistician may be able to elaborate more on ASH sampling algorithm and answer this question intelligently. In my small world, I have observed ASH statistics to be off by 10 to 20%. Let me show you how to calculate differences in DB Time and DB CPU Time as reported by AWR and ASH.
SELECT TO_CHAR (ASHTime, 'MM/DD/YY:hh24')
ASHTime,
ASHDBTime,
ASHCPUTime,
AWRDBTime,
AWRCPUTime,
ROUND ( (ASHDBTime - AWRDBTime) /
ASHDBTime * 100) DBTimeDiff,
ROUND ( (ASHCPUTime - AWRCPUTime) /
ASHCPUTime * 100) CPUTimeDiff
FROM (
SELECT TRUNC (SAMPLE_TIME, 'HH24') ASHTime,
ROUND (COUNT (*) / 60)
ASHDBTime,
ROUND (SUM (DECODE
(session_state, 'ON CPU', 1, 0)) / 60)
ASHCPUTime
FROM v$active_session_history ash
WHERE sample_time >= TO_DATE
('&&s_begin_time', 'MM/DD/YYYY:HH24:MI:SS')
AND sample_time <=
TO_DATE ('&&s_end_time', 'MM/DD/YYYY:HH24:MI:SS')
GROUP BY TRUNC (SAMPLE_TIME, 'HH24'))
ASH,
(
SELECT TRUNC (BEGIN_INTERVAL_TIME, 'HH24') AWRTime,
ROUND (SUM (EE1.VALUE -
EE2.VALUE) / 60000000) AWRCPUTime
FROM DBA_HIST_SNAPSHOT SS,
DBA_HIST_SYS_TIME_MODEL EE1,
DBA_HIST_SYS_TIME_MODEL EE2
WHERE EE1.SNAP_ID = EE2.SNAP_ID + 1
AND EE1.INSTANCE_NUMBER =
EE2.INSTANCE_NUMBER
AND EE1.STAT_NAME =
EE2.STAT_NAME
AND EE1.SNAP_ID = SS.SNAP_ID
AND EE1.INSTANCE_NUMBER =
SS.INSTANCE_NUMBER
AND SS.BEGIN_INTERVAL_TIME
>=
TO_DATE
('&&s_begin_time', 'MM/DD/YYYY:HH24:MI:SS')
AND SS.BEGIN_INTERVAL_TIME
<=
TO_DATE
('&&s_end_time', 'MM/DD/YYYY:HH24:MI:SS')
AND EE1.STAT_NAME IN ('DB
CPU')
AND EE2.STAT_NAME IN ('DB
CPU')
GROUP BY TRUNC (BEGIN_INTERVAL_TIME,
'HH24'), EE1.STAT_NAME) AWRCPUTime,
(
SELECT TRUNC (BEGIN_INTERVAL_TIME, 'HH24') AWRTime,
ROUND (SUM (EE1.VALUE -
EE2.VALUE) / 60000000) AWRDBTime
FROM DBA_HIST_SNAPSHOT SS,
DBA_HIST_SYS_TIME_MODEL EE1,
DBA_HIST_SYS_TIME_MODEL EE2
WHERE EE1.SNAP_ID = EE2.SNAP_ID + 1
AND EE1.INSTANCE_NUMBER =
EE2.INSTANCE_NUMBER
AND EE1.STAT_NAME =
EE2.STAT_NAME
AND EE1.SNAP_ID = SS.SNAP_ID
AND EE1.INSTANCE_NUMBER =
SS.INSTANCE_NUMBER
AND SS.BEGIN_INTERVAL_TIME
>=
TO_DATE
('&&s_begin_time', 'MM/DD/YYYY:HH24:MI:SS')
AND SS.BEGIN_INTERVAL_TIME
<=
TO_DATE
('&&s_end_time', 'MM/DD/YYYY:HH24:MI:SS')
AND EE1.STAT_NAME IN ('DB
time')
AND EE2.STAT_NAME IN ('DB
time')
GROUP BY TRUNC (BEGIN_INTERVAL_TIME,
'HH24'), EE1.STAT_NAME) AWRDBTime
WHERE ASH.ASHTIME = AWRCPUTime.AWRTime
AND
ASH.ASHTIME = AWRDBTime.AWRTime
ORDER BY 1;
ASHTIME ASHDBTIME ASHCPUTIME AWRDBTIME AWRCPUTIME DBTIMEDIFF% CPUTIMEDIFF%
-----------
---------- ---------- ---------- ---------- ----------- ------------
03/12/15:22 51 48 44 42 14 13
03/12/15:23 51 49 45 43 12 12
03/13/15:00 25 22 20 18 20 18
03/13/15:01 50 48 45 42 10 13
03/13/15:02 47 45 41 40 13 11
03/13/15:03 57 55 51 50 11 9
03/13/15:04 46 45 41 40 11 11
03/13/15:05 44 42 39 37 11 12
03/13/15:06 43 41 37 36 14 12
03/13/15:07 43 42 37 36 14 14
03/13/15:08 42 41 38 36 10 12
03/13/15:09 42 41 38 36 10 12
03/13/15:10 43 41 38 37 12 10
03/13/15:11 44 42 38 36 14 14
03/13/15:12 43 41 38 37 12 10
03/13/15:13 43 42 38 36 12 14
03/13/15:14 46 42 39 37 15 12
03/13/15:15 43 42 37 36 14 14
As you
could see from the above report, DB Time and DB CPU Time statistics diverge by 10 to 20%, but that's OK. I think you are more
interested in knowing that there was a problem than knowing 100% accurate values in ASH stats. Let me explain you why.
Since samples are taken
every second, it may miss an event or activity which started and completed
between the two consecutive samples. I don't think as a performance analyst, you are
interested in an event or activity which took less than a second. Any wait
events lasting more than a second will be captured. Any SQLs running more than
a second will be captured. Any long running queries lasting more than a second
will be captured. So ASH does capture any events or activities of significance.
It may not capture events or activities that don't span two consecutive ASH samples, but I
don't think it will impede your abilities to diagnose database issues.
As mentioned above, SQLs, which complete within a
second between the two samples, are not captured, but if they are run frequently
enough, then ASH sampling algorithm would capture it and you would know about
it.
If we can predict presidential elections through polls
which are based on statistical sampling algorithms, then ASH surely can tell
you about performance issues that your database may be facing. ASH is a gold mine of diagnostic information. Just dig it, but beware. ASH comes with a cost - it requires diagnostic pack license.