Wait events - how to read it
Hi frnds,
As, I'm beginner to performance tuning I dont know
What action do i need to take?
I mean how to read the output which I given below.
this is the output suffering buffer busy waits.
Could anyone please tell me
CLASS TOTAL_WAITS TOTAL_TIME
data block 93303 58711
unused 0 0
system undo header 12 232
undo header 7847 6636
3rd level bmb 0 0
save undo header 0 0
bitmap index block 0 0
file header block 0 0
free list 0 0
undo block 68 207
segment header 422 399
extent map 0 0
2nd level bmb 0 0
system undo block 0 0
sort block 0 0
save undo block 0 0
1st level bmb 1 17
bitmap block 0 0
Thanks, Muhammed Thameem. S
Hello,
"Buffer busy waits" is contention for a buffer (representing a specific
version of a database block) within the Buffer Cache. So, in essence
it is block contention and thus it is most likely something to do with
the design of the tables and indexes supporting the application. A
built-in bottleneck. On indexes, it could be the age-old problem of
insertions into an index on a column with a monotonically-ascending
data value (i.e. timestamps or sequence numbers) which tends to cause
contention on the highest leaf node of the index. On tables, it might
have to do with many concurrent insertions into a table in a
freelist-managed tablespace where the table has only one freelist. It
could also be due to a home-grown implementation of sequence-number
generators (i.e. small table with one row, one column in which contains
the "last value" of a sequence, etc) which lots of people use to avoid
not being "portable across databases" which they think means not using
Oracle sequences (yadda yadda yadda).
I'd look for any SQL statement in the "SQL sorted by Elapsed Time"
section of the AWR report which exhibits high elapsed time but
relatively low CPU time, indicating a lot of wait time. Of course,
there are something like 800 possible wait events in current releases
of Oracle, of which "buffer busy waits" is only one, so this is just
inference and not a direct causal connection to your problem. But,
once I find such statements I'd check to see if they are
accessing/manipulating tables within the CUBS_DATA tablespace, and then
use "select * from table(dbms_xplan.display_awr('sql-id'))" to
get the execution plan(s), and then look for something ineffective
within the execution plan. You might find the script "sqlhistory.sql" helpful
here as well, to get a "historical perspective" on the execution of the
SQL statements over time, in case the buffer busy waits peaked at some
point in the past
Please refer to:
http://www.pubbs.net/201003/oracle/51925-understanding-awr-buffer-waits.html
Also
http://www.remote-dba.net/oracle_10g_tuning/t_buffer_busy_waits.htm
kind regards
Mohamed
Similar Messages
-
Wait Event Direct path read TEMP
HI,
I have two sessions waiting on wait event "DIECT PATH READ TEMP" (both the sessions with same SQL).Those sessions are almost in hung state.If i run that query from any other session,even that gets hanged.I have generated 10046 trace for these and found that all of them are waiting for a index object in TEMP datafile.
Has anyone of you had the same issue.
Please explain me what kind of action i have taken.
I have verified the stats..they are all up to date.
Thanks,
Pramod GarreHi ,
It is IMPOSSIBLE for two SELECT statements to obstruct each other.-
Its not impossible my dear.Do you know something called "Hot Block".Dont you know that in that case select queries obstruct each other??Dont suggest if you dont know.
If you don't post meaningful information, you'll be left with your own mystery-
I agree i have to provide more information.
Thanks
Pramod -
Hash join ending up in huge wait events
Hi,
We are experiencing huge wait events ( direct path read temp , direct path write temp) on our Materialized View refresh in 10.2.0.4 version of oracle 10g in linux rhel5 environment while monitoring the refresh session from db console. While checking the explain plan of the mv query there is a huge hash_join (due to self join nature of the query) is shown. As advised in some dba forums, i have increased my pga_aggregate_target to a value of 4 gb from 1800 mb. The PGA_HIT % is raised to 60% from 58% ( just 2% improvement). But still my direct path read temp and direct path write temp wait event have not reduced and a huge temp space is taken for hash join.
Since we have some usage limit set by some hidden parameters for a each session on pga_aggregate_target, increase the size did not helped me much. The mv refresh is taking more than 5 hours ( sometimes it exceeds 5 hrs) to completes it refresh where as the same query in window (production) is completed less than two hours. Before a month, the refresh time in both environment was nearly close. But now it has changed and not able to figure it out.
STATISTICS have been collected regularly using dbms_gather_stats in both environment. Both mv refresh are scheduled to run using dbms_scheduler (Manual refresh). SGA_TARGET and other memory parameters are almost same.
Environment : Dataware house
O/s : RHEL 5
Oracle version : 10.2.0.4
Work_policy=auto
Is there any possibility to reduce this wait event and there by reducing the elapsed time? I am also interested to know changing the plan to use other sort will help? I don't know whether the details are sufficient to analyze this issue. If you need more details on this just let me know.
I really appreciate your help and thanks in advance to all.Thans for your comments. Here is the code, explan plan and autotrace trace stat output.
SELECT lasg.employee_number "EMPLOYEE_NUM",
lasg.full_name "FULL_NAME",
lasg.person_id "PERSON_ID",
SUBSTR (lasg.organization, 1, 4) "DEPT",
casg.assign_start_date "EFFECTIVE_START_DATE",
casg.assign_end_date "EFFECTIVE_END_DATE",
hasg.organization "PRIOR_ORG",
casg.organization organization,
hasg.supervisor "PRIOR_SUPERVISOR",
casg.supervisor "SUPERVISOR_NAME",
hasg.location "PRIOR_LOCATION",
casg.location location,
hasg.job_title "PRIOR_TITLE",
casg.job_title job_name,
CASE
WHEN hasg.organization = casg.organization THEN 'No Change'
ELSE 'Change'
END
org_change,
CASE
WHEN hasg.location = casg.location THEN 'No Change'
ELSE 'Change'
END
loc_change,
CASE
WHEN hasg.supervisor = casg.supervisor THEN 'No Change'
ELSE 'Change'
END
sup_change,
CASE
WHEN hasg.job_title = casg.job_title THEN 'No Change'
ELSE 'Change'
END
job_change
FROM panad.data_employ_details lasg,
panad.data_employ_details casg,
panad.data_employ_details hasg
WHERE lasg.person_id = casg.person_id(+)
AND lasg.assign_end_date = (SELECT MAX (lasg2.assign_end_date)
FROM panad.data_employ_details lasg2
WHERE lasg.person_id = lasg2.person_id)
AND casg.person_id = hasg.person_id(+)
AND hasg.assign_start_date =
(SELECT MAX (hasg2.assign_start_date)
FROM panad.data_employ_details hasg2
WHERE hasg2.person_id = lasg.person_id
AND hasg2.assign_end_date < casg.assign_start_date)
| Id | Operation | Name | Rows | Bytes |TempSpc| Cost (%CPU)| Time |
| 0 | SELECT STATEMENT | | 1 | 303 | | 10261 (91)| 00:02:04 |
|* 1 | FILTER | | | | | | |
|* 2 | HASH JOIN | | 1 | 303 | | 10179 (91)| 00:02:03 |
|* 3 | HASH JOIN | | 5 | 1060 | | 10095 (92)| 00:02:02 |
|* 4 | HASH JOIN | | 6786 | 960K| | 10011 (93)| 00:02:01 |
| 5 | VIEW | VW_SQ_1 | 6786 | 225K| | 9927 (94)| 00:02:00 |
| 6 | HASH GROUP BY | | 6786 | 384K| | 9927 (94)| 00:02:00 |
| 7 | MERGE JOIN | | 50M| 2820M| | 1427 (53)| 00:00:18 |
| 8 | SORT JOIN | | 31937 | 998K| 2776K| 367 (2)| 00:00:05 |
| 9 | TABLE ACCESS FULL| DATA_EMPLOY_DETAILS | 31937 | 998K| | 82 (2)| 00:00:01 |
|* 10 | SORT JOIN | | 31937 | 810K| 2520K| 324 (2)| 00:00:04 |
| 11 | TABLE ACCESS FULL| DATA_EMPLOY_DETAILS | 31937 | 810K| | 82 (2)| 00:00:01 |
| 12 | TABLE ACCESS FULL | DATA_EMPLOY_DETAILS | 31937 | 3461K| | 83 (3)| 00:00:01 |
| 13 | TABLE ACCESS FULL | DATA_EMPLOY_DETAILS | 31937 | 2089K| | 83 (3)| 00:00:01 |
| 14 | TABLE ACCESS FULL | DATA_EMPLOY_DETAILS | 31937 | 2838K| | 83 (3)| 00:00:01 |
| 15 | SORT AGGREGATE | | 1 | 13 | | | |
|* 16 | TABLE ACCESS FULL | DATA_EMPLOY_DETAILS | 5 | 65 | | 82 (2)| 00:00:01 |
Predicate Information (identified by operation id):
1 - filter("LASG"."ASSIGN_END_DATE"= (SELECT MAX("LASG2"."ASSIGN_END_DATE") FROM
"PANAD"."DATA_EMPLOY_DETAILS" "LASG2" WHERE "LASG2"."PERSON_ID"=:B1))
2 - access("CASG"."PERSON_ID"="HASG"."PERSON_ID" AND "HASG"."ASSIGN_START_DATE"="VW_COL_1")
3 - access("LASG"."PERSON_ID"="CASG"."PERSON_ID" AND "PERSON_ID"="LASG"."PERSON_ID")
4 - access("ROWID"=ROWID)
10 - access(INTERNAL_FUNCTION("HASG2"."ASSIGN_END_DATE")<INTERNAL_FUNCTION("CASG"."ASSIGN_START_DATE")
filter(INTERNAL_FUNCTION("HASG2"."ASSIGN_END_DATE")<INTERNAL_FUNCTION("CASG"."ASSIGN_START_DATE")
16 - filter("LASG2"."PERSON_ID"=:B1)
37 rows selected.
- autot trace stat output -
5070 rows selected.
Statistics
35203 recursive calls
0 db block gets
3675913 consistent gets
4269882 physical reads
0 redo size
1046781 bytes sent via SQL*Net to client
4107 bytes received via SQL*Net from client
339 SQL*Net roundtrips to/from client
69 sorts (memory)
0 sorts (disk)
5070 rows processed I have tried running this query with paralell but not helped.
I have read the links provided by both of you. Dictionary and fixed table stats are collected as a routine.
From the link given byTaral, Greg Rahn has suggested that it is a bug as below.
Its bug 9041800 and there is a 10.2.0.4 backport available as of 01/29/10.How can i get this bug fixed since there is no explanation of what need to be done? Do i need to contact oracle support for the 10.2.0.4 backport for RHEL5?
Thanks in advance
Edited by: Karthikambalav on Mar 9, 2010 2:43 AM -
Wait events 'direct path write' and 'direct path read'
Hi,
We have a query which is taking more that 2 min. It's a 9.2.0.7 database. We took the trace/tkprof of the query,and identified that there are so manay 'direct path write' and 'direct path read' wait events in the trace file.
WAIT #3: nam='direct path write' ela= 5 p1=201 p2=70710 p3=15
WAIT #3: nam='direct path read' ela= 170 p1=201 p2=71719 p3=15
In the above, "p1=201" is a file_id, but we could not find any data file, temp file, control file with that id# 201.
Can you please let us know what's "p1=201" here, how to identify the file which is causing the issue.
Thanks
SravanWhat does:
show parameter db_filesreturn? My guess, is that it returns 200.
The direct file read and direct file write events are reads and writes to TEMP tablespace. In those wait events, the file# is reported as db_files+temp file id. So, 201 means temp file #1.
Now, as to your actual performance problem.
Without seeing the SQL and the corresponding execution plan, it's impossible to be sure. However, the most common causes of temp writes are sort operations and group by operations.
If you decide to post your SQL and execution plan, please be sure to make it readable by formatting it. Information on how to do so can be found here.
Hope that helps,
-Mark
Edited by: mbobak on May 1, 2011 1:50 AM -
How to see the wait events info. after excute a select query
Hi
How to see the wait events info. after execute a select query. Are there any parameter to set for this option?
And also wanna see the follwing info. in trace file. For this what are the parameters I have to set right?
SELECT * FROM emp, dept
WHERE emp.deptno = dept.deptno;
call count cpu elapsed disk query current rows
Parse 1 0.16 0.29 3 13 0 0
Execute 1 0.00 0.00 0 0 0 0
Fetch 1 0.03 0.26 2 2 4 14
Misses in library cache during parse: 1
Parsing user id: (8) SCOTT Regards
Arpitha$ sqlplus scott/tiger
SQL*Plus: Release 10.2.0.2.0 - Production on Wed Apr 20 15:29:33 2011
Copyright (c) 1982, 2005, Oracle. All Rights Reserved.
Connected to:
Oracle Database 10g Enterprise Edition Release 10.2.0.2.0 - Production
With the Partitioning, OLAP and Data Mining options
SQL> SHOW PARAMETER dump
NAME TYPE VALUE
background_core_dump string partial
background_dump_dest string /user/oracle/app/oracle/admin/
orclsb/bdump
core_dump_dest string /user/oracle/app/oracle/admin/
orclsb/cdump
max_dump_file_size string UNLIMITED
shadow_core_dump string partial
user_dump_dest string /user/oracle/app/oracle/admin/
orclsb/udump
SQL> ALTER SESSION SET EVENTS='10046 trace name context forever, level 12';
Session altered.
SQL> SELECT * FROM emp WHERE deptno=20;
EMPNO ENAME JOB MGR HIREDATE SAL COMM
DEPTNO
7369 SMITH CLERK 7902 17-DEC-80 800
20
7566 JONES MANAGER 7839 02-APR-81 2975
20
7788 SCOTT ANALYST 7566 19-APR-87 3000
20
EMPNO ENAME JOB MGR HIREDATE SAL COMM
DEPTNO
7876 ADAMS CLERK 7788 23-MAY-87 1100
20
7902 FORD ANALYST 7566 03-DEC-81 3000
20Now
$ pwd
/user/oracle/app/oracle/admin/orclsb/udump
$ ls -ltr|tail -5
-rw-r----- 1 oracle oinstall 622 Apr 20 11:35 orclsb_ora_949.trc
-rw-r----- 1 oracle oinstall 651 Apr 20 11:35 orclsb_ora_976.trc
-rw-r----- 1 oracle oinstall 1982 Apr 20 11:35 orclsb_ora_977.trc
-rw-r----- 1 oracle oinstall 1443 Apr 20 15:29 orclsb_ora_1251.trc
-rw-r----- 1 oracle oinstall 279719 Apr 20 15:30 orclsb_ora_1255.trc
$ tkprof orclsb_ora_1255.trc orclsb_ora_1255.txt
TKPROF: Release 10.2.0.2.0 - Production on Wed Apr 20 15:32:17 2011
Copyright (c) 1982, 2005, Oracle. All rights reserved.
$ ls -ltr|tail -5
-rw-r----- 1 oracle oinstall 651 Apr 20 11:35 orclsb_ora_976.trc
-rw-r----- 1 oracle oinstall 1982 Apr 20 11:35 orclsb_ora_977.trc
-rw-r----- 1 oracle oinstall 1443 Apr 20 15:29 orclsb_ora_1251.trc
-rw-r----- 1 oracle oinstall 279719 Apr 20 15:30 orclsb_ora_1255.trc
-rw-r--r-- 1 oracle oinstall 26872 Apr 20 15:32 orclsb_ora_1255.txtThis orclsb_ora_1255.txt contains the required information. -
Read by other session wait event
I have some reports in my database that whenever executed by two or more session these reports suffers with read by other session wait event. Tables which are involving in that's query i moves them to other tablespace and then move back to original tablespace. Some how this could solves read by other session wait event. But after some days these problem occur again.
How can i permanently solve this problem??
My db is Oracle Database 10g Enterprise Edition Release 10.2.0.3.0 - Prod
Oracle Block size 4096
ThanksHi there,
Searching for this wait eventin the docs gave this,
http://download.oracle.com/docs/cd/B19306_01/server.102/b14237/waitevents003.htm#sthref3159
read by other session
This event occurs when a session requests a buffer that is currently being read into the buffer cache by another session. Prior to release 10.1, waits for this event were grouped with the other reasons for waiting for buffers under the 'buffer busy wait' event
Wait Time: Time waited for the buffer to be read by the other session (in microseconds)And a quick google search for the same gave these results in the top.
http://www.confio.com/English/Tips/Read_By_Other_Session.php
http://www.dbafan.com/blog/?p=132
HTH
Aman.... -
Can you please tell me how to find top 10 wait events
Hi
Can you please tell me how to find top 10 wait events and what actions need to be taken when there is a wait?
Thanks
Regards,
RJ.hi,
suggest you to use statspack !!!!!!! for the all tuning..else use the views
* v$session_event
* v$session_wait
* v$system_event
go through this for tuning tips
http://www.dba-oracle.com/art_dbazine_waits.htm
Thanks
--Raman -
Wait events "db file scatter read"
I have a continuous wait event in my production database...*"db file scatter read"* can any one tell me the reason behind that..
i want perfect answer not google search..google produce lot of options ..i know that.Where is perfect answer? In documentation, books, expert's blog, forum right? And how to find them, i think from google and/or search in net. Simple to admit, but hard to follow....
I really appreciate that you are seeking "perfect answer", but i am sure for that you have to use "search tech" as well.
As Sidhu has already given you the perfect link, Aman has asked you statspack info, now i hope you are addressing your question in a well manner. Let us work on the issue with great experts.
Regards
Girish Sharma -
according to statspack report top five waits events are:-
Event Waits Time (s) Ela Time
db file sequential read 15,496 651 55.53
CPU time 470 40.09
control file parallel write 643 15 1.26
log file sync 518 13 1.14
db file parallel write 252 10 .85
But how can i resolve these events.Please help me. I am not well aware about tuning aspect.
Thanks in advanceYou input is very restricted, no one can't really advice on this. You would have given more info.
As you can see, 'db file sequential read' is the top most wait event. Which mean, you are using wrong index or index doesn't required. Look at the top sqls in the stats pack nd tune them.
This also due to the fact the your index statsitics are not providing enough good info to optimizer.
Jaffar -
How to Read FPM Events in WDDOBEFOREACTION
Dear All,
I am working on a GAF solution. If the user clicks the PREV or EXIT buttons I want to skip the Check the mandatory field validations. But do not know how to read the FPM Events inside the WDDOBEFOREACTION method inside the View.
Any help is greatly appreciated.
Thank you.
PKHi,
Below code can be used to get the current executed FPM event.
DATA io_event type cl_fpm_event.
io_event = wdevent->get_string( 'ID' ).
io_event will have the Event which is being triggered.
FPM event can be captured in WDDOBEFOREACTION, but since this hook method is invoked before any events in the view, so it will throw a dump , if any action is triggered other FPM event.
So it is better to write the validation in Component controller method,(ON_PROCESS ), if we want to Validate based on prev and next buttons.
DATA io_event type cl_fpm_event.
CASE io_event->mv_event_id .
WHEN cl_fpm_event=>'NEXT'. " Global constants can be declared like gc_event_edit,etc
Endcase.
Regards,
Harsha J -
How to pass data at 'wait' event ?
Hello,
I created a workflow starting from a form: the user can chose a production order number.
Inside the workflow there's a step (= wait ) and the system wait until an event (BUS2005-DELETE) is raising.
My requirement is:
When the event BUS2005-DELETE is raising I have to finish the workflow but only if the event is raising for the
production order number entered from the user.
So, my question is:
How to pass the production order number at the WAIT event ??
Tks.Hello Roberto Baldassarre !
To instantiate your BOR, release it and do the event binding at workflow and also the binding at wait step perfectly.
Regards,
S.Suresh -
How to read iCal events from SQLite Cache file?
I need to figure out how to read event entries in the Cache file stored in /Library/Calenders/ as my actual events have been corrupted due to a file system problem. The example below shows the information I have for each entry, where the event in the entry was Bon Jovi Concert, however I am confused how I can read the date and time of the event, information I believe is stored as 218215639,219613081,219524400,219540600. Any help is greatly appreciated.
INSERT INTO "ZCALENDARITEM" VALUES(0,NULL,0,NULL,0,5,NULL,6,4039,NULL,0,0,0,0,0,0,2,23,0,0,0,0,6,0,21821563 9,219613081,219524400,219540600,NULL,NULL,NULL,NULL,NULL,'local_AFB8D342-2DAE-4F A1-A9A6-3FA9B28B5C7C','Bon Jovi Concert',NULL,NULL,NULL,'Europe/London',NULL,'0E17BBB6-0E76-4024-8DD7-60E43D38D 35B');OK, here we go ... I hope. I have tried it on my iCal, with about 2000 entries, and it worked. I cannot guarantee that it will work for you.
Make a new text file from sql as before, but using this modified command to get additional data for each event:
sqlite3 Desktop/Hope 'select ZSTARTDATE, ZCALENDARITEM.ZTITLE, ZENDDATE, ZNODE.ZTITLE, ZISALLDAY, ZRECURRENCERULE, ZISDETACHED from ZCALENDARITEM inner join ZNODE on ZCALENDARITEM.ZCALENDAR = ZNODE.Z_PK where ZSTARTDATE not null' >hope.txt
I suggest you make a new user account, then copy the text file hope.txt and the script to the Public/Drop Box folder for that account. Log on to the new account and copy the two files from the drop box to the Desktop. Start iCal and make new calendars to match those in your ordinary account. Double click the script to open it in Script Editor and CHANGE THE DATE RANGE in the "set ThePeriod" line. Stand well back and click the run button. Every 50 records processed it will pop up a progress box, which will pop down again after a second. Go for lunch.
When it is finished see if things look OK in iCal. If they do export each of the calendars, copy them to the drop box of your normal account and return to your normal account to import them.
If there is an error make a note of the error message and line number, find that line in the text file, and post it here with a couple of lines either side.
Note that for recurring items iCal normally only makes a single event, the first occurrence, then uses a recurrence rule to calculate if the event should be displayed in the current window. If there are any recurring events in the period you have selected where the first occurrence is before the period they will not be recreated. Where however you have changed a single occurrence of a recurring event iCal makes a new, detached, event. Any of these in the period will be detected. If their original event was also in the period they will now appear in the iCal display as duplicates. For easy spotting of these, I have prefixed "XX-" to the title of any detached events.
AK
<pre style="font-family: 'Monaco', 'Courier New', Courier, monospace; overflow:auto; color: #222; background: #DDD; padding: 0.2em; font-size: 10px; width:400px">on run
set ThePeriod to {date ("jan 1 2008"), date ("dec 31 2008")} --CHANGE THIS BEFORE RUNNIN
set TheFile to open for access (path to desktop as text) & "temp.txt"
set TheContents to read TheFile --until return
close TheFile
set TheLines to paragraphs of TheContents
set OldDelim to AppleScript's text item delimiters
set AppleScript's text item delimiters to {"|"}
set LCount to 0
set ACount to 0
set RCount to 0
set DCount to 0
set SCount to 0
set HowMany to (count of TheLines)
try
repeat with ThisLine in TheLines
if (count of ThisLine) is 0 then exit repeat
-- Start Date, Title, End Date, Calendar, All Day, Recurs, Detached
set LCount to LCount + 1
set Details to text items of ThisLine
set MyStartDate to FixMyDate(item 1 of Details)
set MyTitle to item 2 of Details
set MyEndDate to FixMyDate(item 3 of Details)
set MyCalendar to item 4 of Details
set MyAllDay to item 5 of Details
set MyRecurs to item 6 of Details
set MyDetached to item 7 of Details
if MyAllDay is "1" then set ACount to ACount + 1
if (count of MyRecurs) > 0 then set RCount to RCount + 1
if MyDetached is "1" then set MyTitle to "XX-" & MyTitle
if MyDetached is "1" then set DCount to DCount + 1
if (MyCalendar is not "Birthdays") and (MyStartDate ≥ item 1 of ThePeriod) and (MyStartDate ≤ item 2 of ThePeriod) then
tell application "iCal"
tell calendar MyCalendar
set ThisItem to make new event at end of events with properties {summary:MyTitle, start date:MyStartDate}
end tell
tell ThisItem
if MyAllDay is "1" then set allday event to true
if (count of MyRecurs) > 0 then set recurrence to MyRecurs
set end date to MyEndDate
end tell
end tell
else
set SCount to SCount + 1
end if
if (LCount mod 50) = 0 then
set the_message to "Processed " & (LCount as string) & " of " & (HowMany as string)
display dialog the_message buttons {"Cancel"} giving up after 1
end if
end repeat
on error TheError
display dialog "Error: " & TheError & " about line " & LCount
end try
set AppleScript's text item delimiters to OldDelim
display dialog (LCount as string) & " lines processed" & return & "All Day: " & (ACount as string) & return & "Recurs: " & (RCount as string) & return & "Detach: " & (DCount as string) & return & "Skipped: " & (SCount as string)
end run
on FixMyDate(MyDate)
date (do shell script "date -r " & MyDate & " -v+31y +%e'/'%m'/'%y' '%T")
end FixMyDate
</pre> -
How do I the interpret "Disk file operations I/O" wait event?
I have a large and very busy batch database. All of a sudden the "Disk file operations I/O" wait event is in the top 5 in AWR.
The manual page isn't very helpful:
http://download.oracle.com/docs/cd/E11882_01/server.112/e17110/waitevents003.htm#insertedID40
Disk file operations I/O
This event is used to wait for disk file operations (for example, open, close, seek, and resize). It is also used for miscellaneous I/O operations such as block dumps and password file accesses.
So here is my question: What exactly is going on when I see this wait event? Why doesn't it show up as one of the other I/O events? Can I make it go away? Should I make it go away?
DRsb92075 wrote:
All of a sudden the "Disk file operations I/O" wait event is in the top 5 in AWR.Top wait event
In EVERY Top Wait Event list, one wait event will ALWAYS be on top as #1; by definition of the list.
Simply because any item, even #1, appears on this list does not mean this is a problem & needs to be fixed.
If the Top Wait Event accounts for only 5 seconds out of a 1 hour sample,
then reducing it to ZERO won't measurably improve overall application performance.
The actual Time Waited is required to determine if it is a problem or not.It's taking 20% of time in a 15 minute sample. Anything that takes 20% of deserves to be understood....So: What actually causes it?
DR -
Read the bottom for updated information/amendments
Hi everyone,
S.M.A.R.T. is Self-Monitoring, Analysis and Reporting Technology.
[size=15]1. S.M.A.R.T. info websites[/size]
OK, now we all know what S.M.A.R.T. stands for. For more details about S.M.A.R.T., here is a comprehensive non-technical explanation about S.M.A.R.T. by pcguide.com. Maxtor has published a white paper on S.M.A.R.T. too. And this is from Seagate. Anyhow, I am not going to discuss whether S.M.A.R.T will protect your harddisk from failure, etc etc here. Let us focus more on S.M.A.R.T. itself.
[size=15]2. S.M.A.R.T. monitoring tools[/size]
S.M.A.R.T. data is stored as a tabulated data somewhere in the harddisk as registers. The tools below support reading these registers value from S.M.A.R.T. enabled harddisk.
1. SpeedFan *sign-up is required for download, but it's FREE
2. Active SMART
3. Sisoftware SANDRA 2002.6.8.97 SP1 , older version doesn't support
[size=15]3. S.M.A.R.T. Tabulated Information[/size]
Here is a screenshot captured with SpeedFan.
Let us go through the screenshot. As can be seen, there are 5 columns - Attribute, Value, Worst, Warn and Raw.
Attribute
describes the meaning of the values. As mentioned above, the SMART data is stored as a tabulated data as registers. Each attribute represents one register ID. eg Raw Read Error Rate is register ID 01, register ID 03 is Spin-up Time, etc.
Note that the register IDs are not displayed by SpeedFan, but are displayed by other SANDRA and Active SMART. I have no idea how many S.M.A.R.T. attributes are there, Active SMART stated it can detect more than 35 attributes!
However, some attributes are manufacturer specific. So possibly some attribute names as shown in these SMART tools might not represent the true meaning of the values too!!
Value
is the current relative value of the attribute read from the registers.
Worst
is the worst value ever achieved.
Warn
is the critical threshold value of the attribute. If the Value has reached and OVER this threshold value, with very high probability the harddisk is in trouble. If SMART is enabled in the BIOS, SMART will alert user in the POST screen, with some manufacturer specific error code. You may need to refer to your manufacturer then.
Note : Value, Worst and Warn are all relative values, such as percentage, not actual count. I have no idea how to calculate these values. This is the very S.M.A.R.T. algorithm, isn't it? ?(
Raw
is in fact, the most understandable represented value here! It represents the actual count of the attribute. SpeedFan displays this raw value as hexadecimal numbers. Such as Power On Hours Count is "AA8", which is 2728 in decimal numbers, meaning that the harddisk has been power on for 2728 hours!! There are some Raws represent average rate, such as "CRC Error Rate" etc.
[size=15]4. S.M.A.R.T. Attributes In Detail[/size]
Hopefully by now, you have the basic idea how to read the tabulated data. I will try my best to go through the attributes one-by-one, which after that shall make you understand more about S.M.A.R.T. and of course, starts to appreciate it.
Raw Read Error Rate represents the condition of the physical disk, based on raw (physical) read errors, and physical surface defects.
Spin Up Time is the time taken for the hdd to spin-up. more info
Start/Stop Count is the number of counts (cycles) the hdd start/stop. more info
Reallocated Sector Count - the number of sectors have been reallocated. Surface scan and found bad sector will increase this count. Now this drive has 7 bad sectors already!
Seek Error Rate - how often the drive failed to locate the data (seeking)
Power On Hours Count - the number of hours you have powered on the HDD.
Spin Retry Count - how many time your drive need to attempt to get the drive platter spinning. If this value is more than 1, your drive is seriously in very bad condition!
Calibration Retry Count - represents the number of time your drive perform calibration retry. I am not sure if low-level format would increase this value.
Power Cycle CountThe number of time the hdd has been powered on. On and Off = 1 cycle.
Read Soft Error Rate - should be Soft Read Error Rate. Similar to Raw Read Error Rate but this one depends on logical level, such as error occured in the hdd buffer, etc.
Temperature - The drive temperature, Forget the relative values, read from the Raw value -- "2D" in this example, which is 45C. But my SpeedFan displayed 47C at that time!! SpeedFan seems to produce some +-2C error from actual reading once in a while. :P
Hardware ECC Recovered is the number of counts ECC correction is performed on the data.
Reallocated Event Count - similar to Reallocated Sector Count but this one is on the data.
Current Pending Sector - is the number of counts how many sectors are currently pending. But what is a pending sector?? ?( ?(
Offline Correctable - which is Off-line Scan Uncorrectable Sector Count. Again, what is off-line scan?
UltraATA CRC Error Rate - which is in fact, CRC Error Count instead of rate. As shown, there have been "3C3", which is 883 errors occured already!!!
There are more S.M.A.R.T. attributes, such as
Thoroughput Performance - again, this relative value is surely got from some wierd algorithm again.
Seek Time Performance - some algorithm has been used to calculate this performance value.
Power Off Retract and Load Cycle Count are IBM HDD specific features -- unload the head off the platters when power off. More info on Head Load/unload cycle.
[size=15]5. SpeedFan S.M.A.R.T. Fitness and Performance bars[/size]
I should not comment on this too much because this is Almico's work. I am not sure how he calculate to define the "fitness" and "performance" though. It could probably based on mathematic relation between the current Values and the threshold Warns.
For most hdds, like Maxtor and Seagate's, out-of-the-box the fitness bar has already reaching half-way 50%. But for IBM hdd, the fitness bar is always around 100%. This is because IBM hdd has its threshold/Warn values set to unrealistically high until it's quite unreachable even after a long term use. While for other brand HDDs, the threshold values are more realistic. Thus, the fitness bar in particular, does not tell the true fitness of the HDD. Take it as a reference only. Please always refer back to the Raw values to determine the fitness.
[size=15]6. Conclusion : Judging HDD fitness by our own selves![/size]
Now we all know what the Attributes are, so roughly everyone will have the basic idea how to judge HDD health by our own selves depend on the attributes you're reading. S.M.A.R.T. itself however, define "fitness" and "performance" based on its own algorithm.
We can categorize the attributes into :
Error Related Attributes
"UltraATA CRC Error Rate", "Raw Read Error Rate", "Raw Soft Error Rate", "Hardware ECC Recovered Count", "Reallocated Sector Count", "Reallocated Event/Data Count", Offline Correctable"
-- tell how often are those errors occured. For the above example, this Maxtor harddisk should be RMA-ed for its high CRC Error Rate count.
Drive Fitness Attributes
Spin Up Time, Start/Stop Count, Seek Error Rate, Power On Hours Count, Spin Retry Count, Calibration Retry Count, Power Cycle Count, Power Off Retract Count, Load Cycle Count
-- check "spin retry count" and "Seek Error Rate", any value other than zero is really bad.
Other Attributes
I have no idea what they are for except Temperature.
As the conclusion, understanding S.M.A.R.T. attributes helps in knowing your HDD fitness by your own self, rather than waiting for the S.M.A.R.T. to alert you for severe error. That might be too late already. 8o
Thanks for reading.
Edited:
1. for better reading pleasure
2. added Conclusion
3. added explanations about SpeedFan SMART Fitness and Performance bar.Quote
Originally posted by WarLord
I like this HD tool. i use it everyday now. the tempture readings are great Hd temp cpu temp and even the system temp nice added feature to the monitoring. And this is an alternative to enabling the SMART in the bios? Thats the way im understanding it. is that right Maesus ?Because i have disabled in bios. She went threw alot of trouble here. thank you
Well from my observation, whether SMART is disabled or enabled in the BIOS, SMART is always working within the HDD itself.
Basically SMART is acting like a blackbox, monitoring and tabulating HDD condition from time to time and its attributes only fully revealable by the manufacturers. SpeedFan's SMART status only displays partial information that is displayable. Some attributes are hidden, ~OR~ the attributes' locations are different from one HDD to another brand, such that some values don't correspond to the attribute meaning at all.
It is very doubtful to claim that enabling SMART in the BIOS will hog down the performance. Just like a transport bus (yeah real bus that fetch passenger :P ), with or without the black box installed can't help it if the driver wants to speeding. :P -
Hi: I'm analyzing this STATSPACK report: it is "volume test" on our UAT server, so most input is from 'bind variables'. Our shared pool is well utilized in oracle. Oracle redo logs is not appropriately configured on this server, as in 'Top 5 wait events' there are 2 for redos.
I need to know what else information can be dig-out from 'foreground wait events' & 'background wait events', and what can assist us to better understanding, in combination of 'Top 5 wait event's, that how the server/test went? it could be overwelming No. of wait events, so appreciate any helpful diagnostic or analysis. Database is oracle 11.2.0.4 upgraded from 11.2.0.3, on IBM AIX power system 64bit, level 6.x
STATSPACK report for
Database DB Id Instance Inst Num Startup Time Release RAC
~~~~~~~~ ----------- ------------ -------- --------------- ----------- ---
700000XXX XXX 1 22-Apr-15 12:12 11.2.0.4.0 NO
Host Name Platform CPUs Cores Sockets Memory (G)
~~~~ ---------------- ---------------------- ----- ----- ------- ------------
dXXXX_XXX AIX-Based Systems (64- 2 1 0 16.0
Snapshot Snap Id Snap Time Sessions Curs/Sess Comment
~~~~~~~~ ---------- ------------------ -------- --------- ------------------
Begin Snap: 5635 22-Apr-15 13:00:02 114 4.6
End Snap: 5636 22-Apr-15 14:00:01 128 8.8
Elapsed: 59.98 (mins) Av Act Sess: 0.6
DB time: 35.98 (mins) DB CPU: 19.43 (mins)
Cache Sizes Begin End
~~~~~~~~~~~ ---------- ----------
Buffer Cache: 2,064M Std Block Size: 8K
Shared Pool: 3,072M Log Buffer: 13,632K
Load Profile Per Second Per Transaction Per Exec Per Call
~~~~~~~~~~~~ ------------------ ----------------- ----------- -----------
DB time(s): 0.6 0.0 0.00 0.00
DB CPU(s): 0.3 0.0 0.00 0.00
Redo size: 458,720.6 8,755.7
Logical reads: 12,874.2 245.7
Block changes: 1,356.4 25.9
Physical reads: 6.6 0.1
Physical writes: 61.8 1.2
User calls: 2,033.7 38.8
Parses: 286.5 5.5
Hard parses: 0.5 0.0
W/A MB processed: 1.7 0.0
Logons: 1.2 0.0
Executes: 801.1 15.3
Rollbacks: 6.1 0.1
Transactions: 52.4
Instance Efficiency Indicators
~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~
Buffer Nowait %: 100.00 Redo NoWait %: 100.00
Buffer Hit %: 99.98 Optimal W/A Exec %: 100.00
Library Hit %: 99.77 Soft Parse %: 99.82
Execute to Parse %: 64.24 Latch Hit %: 99.98
Parse CPU to Parse Elapsd %: 53.15 % Non-Parse CPU: 98.03
Shared Pool Statistics Begin End
Memory Usage %: 10.50 12.79
% SQL with executions>1: 69.98 78.37
% Memory for SQL w/exec>1: 70.22 81.96
Top 5 Timed Events Avg %Total
~~~~~~~~~~~~~~~~~~ wait Call
Event Waits Time (s) (ms) Time
CPU time 847 50.2
enq: TX - row lock contention 4,480 434 97 25.8
log file sync 284,169 185 1 11.0
log file parallel write 299,537 164 1 9.7
log file sequential read 698 16 24 1.0
Host CPU (CPUs: 2 Cores: 1 Sockets: 0)
~~~~~~~~ Load Average
Begin End User System Idle WIO WCPU
1.16 1.84 19.28 14.51 66.21 1.20 82.01
Instance CPU
~~~~~~~~~~~~ % Time (seconds)
Host: Total time (s): 7,193.8
Host: Busy CPU time (s): 2,430.7
% of time Host is Busy: 33.8
Instance: Total CPU time (s): 1,203.1
% of Busy CPU used for Instance: 49.5
Instance: Total Database time (s): 2,426.4
%DB time waiting for CPU (Resource Mgr): 0.0
Memory Statistics Begin End
~~~~~~~~~~~~~~~~~ ------------ ------------
Host Mem (MB): 16,384.0 16,384.0
SGA use (MB): 7,136.0 7,136.0
PGA use (MB): 282.5 361.4
% Host Mem used for SGA+PGA: 45.3 45.8
Foreground Wait Events DB/Inst: XXXXXs Snaps: 5635-5636
-> Only events with Total Wait Time (s) >= .001 are shown
-> ordered by Total Wait Time desc, Waits desc (idle events last)
Avg %Total
%Tim Total Wait wait Waits Call
Event Waits out Time (s) (ms) /txn Time
enq: TX - row lock contentio 4,480 0 434 97 0.0 25.8
log file sync 284,167 0 185 1 1.5 11.0
Disk file operations I/O 8,741 0 4 0 0.0 .2
direct path write 13,247 0 3 0 0.1 .2
db file sequential read 6,058 0 1 0 0.0 .1
buffer busy waits 1,800 0 1 1 0.0 .1
SQL*Net more data to client 29,161 0 1 0 0.2 .1
direct path read 7,696 0 1 0 0.0 .0
db file scattered read 316 0 1 2 0.0 .0
latch: shared pool 144 0 0 2 0.0 .0
CSS initialization 30 0 0 3 0.0 .0
cursor: pin S 10 0 0 9 0.0 .0
row cache lock 41 0 0 2 0.0 .0
latch: row cache objects 19 0 0 3 0.0 .0
log file switch (private str 8 0 0 7 0.0 .0
library cache: mutex X 28 0 0 2 0.0 .0
latch: cache buffers chains 54 0 0 1 0.0 .0
latch free 290 0 0 0 0.0 .0
control file sequential read 1,568 0 0 0 0.0 .0
log file switch (checkpoint 4 0 0 6 0.0 .0
direct path sync 8 0 0 3 0.0 .0
latch: redo allocation 60 0 0 0 0.0 .0
SQL*Net break/reset to clien 34 0 0 1 0.0 .0
latch: enqueue hash chains 45 0 0 0 0.0 .0
latch: cache buffers lru cha 7 0 0 2 0.0 .0
latch: session allocation 5 0 0 1 0.0 .0
latch: object queue header o 6 0 0 1 0.0 .0
ASM file metadata operation 30 0 0 0 0.0 .0
latch: In memory undo latch 15 0 0 0 0.0 .0
latch: undo global data 8 0 0 0 0.0 .0
SQL*Net message from client 6,362,536 0 278,225 44 33.7
jobq slave wait 7,270 100 3,635 500 0.0
SQL*Net more data from clien 7,976 0 15 2 0.0
SQL*Net message to client 6,362,544 0 8 0 33.7
Background Wait Events DB/Inst: XXXXXs Snaps: 5635-5636
-> Only events with Total Wait Time (s) >= .001 are shown
-> ordered by Total Wait Time desc, Waits desc (idle events last)
Avg %Total
%Tim Total Wait wait Waits Call
Event Waits out Time (s) (ms) /txn Time
log file parallel write 299,537 0 164 1 1.6 9.7
log file sequential read 698 0 16 24 0.0 1.0
db file parallel write 9,556 0 13 1 0.1 .8
os thread startup 146 0 10 70 0.0 .6
control file parallel write 2,037 0 2 1 0.0 .1
Log archive I/O 35 0 1 30 0.0 .1
LGWR wait for redo copy 2,447 0 0 0 0.0 .0
db file async I/O submit 9,556 0 0 0 0.1 .0
db file sequential read 145 0 0 2 0.0 .0
Disk file operations I/O 349 0 0 0 0.0 .0
db file scattered read 30 0 0 4 0.0 .0
control file sequential read 5,837 0 0 0 0.0 .0
ADR block file read 19 0 0 4 0.0 .0
ADR block file write 5 0 0 15 0.0 .0
direct path write 14 0 0 2 0.0 .0
direct path read 3 0 0 7 0.0 .0
latch: shared pool 3 0 0 6 0.0 .0
log file single write 56 0 0 0 0.0 .0
latch: redo allocation 53 0 0 0 0.0 .0
latch: active service list 1 0 0 3 0.0 .0
latch free 11 0 0 0 0.0 .0
rdbms ipc message 314,523 5 57,189 182 1.7
Space Manager: slave idle wa 4,086 88 18,996 4649 0.0
DIAG idle wait 7,185 100 7,186 1000 0.0
Streams AQ: waiting for time 2 50 4,909 ###### 0.0
Streams AQ: qmn slave idle w 129 0 3,612 28002 0.0
Streams AQ: qmn coordinator 258 50 3,612 14001 0.0
smon timer 43 2 3,605 83839 0.0
pmon timer 1,199 99 3,596 2999 0.0
SQL*Net message from client 17,019 0 31 2 0.1
SQL*Net message to client 12,762 0 0 0 0.1
class slave wait 28 0 0 0 0.0
thank you very much!Hi: just know it now: it is a large amount of 'concurrent transaction' designed in this "Volume Test" - to simulate large incoming transaction volme, so I guess wait in eq:TX - row is expected.
The fact: (1) redo logs at uat server is known to not well-tune for configurations (2) volume test slow 5%, however data amount in its test is kept the same by each time import production data, by the team. So why it slowed 5% this year?
The wait histogram is pasted below, any one interest to take a look? any ideas?
Wait Event Histogram DB/Inst: XXXX/XXXX Snaps: 5635-5636
-> Total Waits - units: K is 1000, M is 1000000, G is 1000000000
-> % of Waits - column heading: <=1s is truly <1024ms, >1s is truly >=1024ms
-> % of Waits - value: .0 indicates value was <.05%, null is truly 0
-> Ordered by Event (idle events last)
Total ----------------- % of Waits ------------------
Event Waits <1ms <2ms <4ms <8ms <16ms <32ms <=1s >1s
ADR block file read 19 26.3 5.3 10.5 57.9
ADR block file write 5 40.0 60.0
ADR file lock 6 100.0
ARCH wait for archivelog l 14 100.0
ASM file metadata operatio 30 100.0
CSS initialization 30 100.0
Disk file operations I/O 9090 97.2 1.4 .6 .4 .2 .1 .1
LGWR wait for redo copy 2447 98.5 .5 .4 .2 .2 .2 .1
Log archive I/O 35 40.0 8.6 25.7 2.9 22.9
SQL*Net break/reset to cli 34 85.3 8.8 5.9
SQL*Net more data to clien 29K 99.9 .0 .0 .0 .0 .0
buffer busy waits 1800 96.8 .7 .7 .6 .3 .4 .5
control file parallel writ 2037 90.7 5.0 2.1 .8 1.0 .3 .1
control file sequential re 7405 100.0 .0
cursor: pin S 10 10.0 90.0
db file async I/O submit 9556 99.9 .0 .0 .0
db file parallel read 1 100.0
db file parallel write 9556 62.0 32.4 1.7 .8 1.5 1.3 .1
db file scattered read 345 72.8 3.8 2.3 11.6 9.0 .6
db file sequential read 6199 97.2 .2 .3 1.6 .7 .0 .0
direct path read 7699 99.1 .4 .2 .1 .1 .0
direct path sync 8 25.0 37.5 12.5 25.0
direct path write 13K 97.8 .9 .5 .4 .3 .1 .0
enq: TX - row lock content 4480 .4 .7 1.3 3.0 6.8 12.3 75.4 .1
latch free 301 98.3 .3 .7 .7
latch: In memory undo latc 15 93.3 6.7
latch: active service list 1 100.0
latch: cache buffers chain 55 94.5 3.6 1.8
latch: cache buffers lru c 9 88.9 11.1
latch: call allocation 6 100.0
latch: checkpoint queue la 3 100.0
latch: enqueue hash chains 45 97.8 2.2
latch: messages 4 100.0
latch: object queue header 7 85.7 14.3
latch: redo allocation 113 97.3 1.8 .9
latch: row cache objects 19 89.5 5.3 5.3
latch: session allocation 5 80.0 20.0
latch: shared pool 147 90.5 1.4 2.7 1.4 .7 1.4 2.0
latch: undo global data 8 100.0
library cache: mutex X 28 89.3 3.6 3.6 3.6
log file parallel write 299K 95.6 2.6 1.0 .4 .3 .2 .0
log file sequential read 698 29.5 .1 4.6 46.8 18.9
log file single write 56 100.0
log file switch (checkpoin 4 25.0 50.0 25.0
log file switch (private s 8 12.5 37.5 50.0
log file sync 284K 93.3 3.7 1.4 .7 .5 .3 .1
os thread startup 146 100.0
row cache lock 41 85.4 9.8 2.4 2.4
DIAG idle wait 7184 100.0
SQL*Net message from clien 6379K 86.6 5.1 2.9 1.3 .7 .3 2.8 .3
SQL*Net message to client 6375K 100.0 .0 .0 .0 .0 .0 .0
Wait Event Histogram DB/Inst: XXXX/xxxx Snaps: 5635-5636
-> Total Waits - units: K is 1000, M is 1000000, G is 1000000000
-> % of Waits - column heading: <=1s is truly <1024ms, >1s is truly >=1024ms
-> % of Waits - value: .0 indicates value was <.05%, null is truly 0
-> Ordered by Event (idle events last)
Total ----------------- % of Waits ------------------
Event Waits <1ms <2ms <4ms <8ms <16ms <32ms <=1s >1s
SQL*Net more data from cli 7976 99.7 .1 .1 .0 .1
Space Manager: slave idle 4086 .1 .2 .0 .0 .3 3.2 96.1
Streams AQ: qmn coordinato 258 49.2 .8 50.0
Streams AQ: qmn slave idle 129 100.0
Streams AQ: waiting for ti 2 50.0 50.0
class slave wait 28 92.9 3.6 3.6
jobq slave wait 7270 .0 100.0
pmon timer 1199 100.0
rdbms ipc message 314K 10.3 7.3 39.7 15.4 10.6 5.3 8.2 3.3
smon timer 43 100.0
Maybe you are looking for
-
Intermittent connectivity, possibly due to third-party 802.11b bridge?
I use a Linksys 802.11b wireless bridge/switch/router with a 12" G4 iBook and up until recently, everything worked just fine. Then about two weeks ago I brought home a 12" G4 PowerBook and tried to use it with this wireless network. The PowerBook is
-
Upgrade from 10.1.3 to 10.1.3.1
J dev install depens on copying files. We made many configurtion change to JDEv 10.1.3 Also We have installed a lot of extention to 10.1.3.1. We want to protect our own configuration. How can we do it when upgrading. In my opinion Upgrade process nee
-
Please help with editing elements 7
Hello everybody, I am new in this s/w but I have spent many hours trying to figure out how to edit/cut a portion of a clip and leave the blank area out. It seems that when I cut a piece out it also enters the scene/timeline leaving me me with many bl
-
Remove all Scriplets ?? and change to tags.
I have a shopping cart aproximately 2 years old that is jsp/servlet. The jsp pages look more like servlets as they have TONS of java code in them. The shopping cart currently has no taglibs in it, and I am in the process of re-writing all jsp pages.
-
Apple' robot chat test out of service ?
I wanted to test the configuration of my router with Apple' robot chat (appleu3test01, appleu3test02, appleu3test03), but it seems disconencted at this time ? Am I doing anything wrong ?