Statspack: High log file sync timeouts and waits
Hi all,
Please see an extract from our Statspack report:
Top 5 Timed Events
~~~~~~~~~~~~~~~~~~                                            % Total
Event                                     Waits    Time (s)  Ela Time
log file sync                           349,713     215,674     74.13
db file sequential read              16,955,622      31,342     10.77
CPU time                                             21,787      7.49
direct path read (lob)                   92,762       8,910      3.06
db file scattered read                4,335,034       4,439      1.53

                                                          Total Wait   Avg wait   Waits
Event                                  Waits   Timeouts     Time (s)       (ms)    /txn
log file sync                        349,713    150,785      215,674        617     1.8
db file sequential read           16,955,622          0       31,342          2    85.9
I hope the above is readable. I'm concerned about the very high number of waits and timeouts, particularly for the log file sync event. From reading around, I suspect that the disk our redo logs sit on isn't fast enough.
1) Is this conclusion correct? Are these timeouts excessively high (70% seems high...)?
2) I see high waits on almost every other event (but not timeouts). Does this point towards an incorrect database setup, given our very high load of 160 executes per second?
Any help would be much appreciated.
Jonathan
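As a quick sanity check on the extract above, the averages can be recomputed from the raw columns. This is only a sketch in plain Python using the numbers quoted in the post; the function names are illustrative and not part of Statspack.

```python
# Figures copied from the "log file sync" line of the Statspack extract.
WAITS = 349_713
TIMEOUTS = 150_785
TOTAL_WAIT_S = 215_674


def avg_wait_ms(total_wait_s, waits):
    """Average time per wait, in milliseconds."""
    return total_wait_s / waits * 1000


def timeout_pct(timeouts, waits):
    """Percentage of waits that ended in a timeout."""
    return timeouts / waits * 100


print(f"average log file sync wait: {avg_wait_ms(TOTAL_WAIT_S, WAITS):.0f} ms")
print(f"waits that timed out:       {timeout_pct(TIMEOUTS, WAITS):.1f} %")
```

Note that the 74.13 in the Top 5 listing is the share of elapsed time, not the timeout ratio. On these figures about 43% of the waits timed out, and the average wait matches the 617 ms printed in the report, which is far above the few milliseconds usually expected of a redo write.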
What's the time frame of this report?
It looks like your disk storage can't keep up with the volume of I/O requests from your database.
The first things to look at are the I/O-intensive SQL statements in your database. Are they doing unnecessary full table scans?
Find the hot blocks and the objects they belong to.
Check the v$session_wait view.
Is there any other suspicious activity on your server, such as programs other than Oracle doing heavy I/O? Are any core dumps being written?
Similar Messages
-
Very high log file sequential read and control file sequential read waits?
I have a 10.2.0.4 database with 5 Streams capture processes running to replicate data to another database. However, I am seeing very high log file sequential read and control file sequential read waits from the capture processes. This is causing slowness, as the database is spending so much time on these wait events. From the AWR report:
Elapsed: 20.12 (mins)
DB Time: 67.04 (mins)
and From top 5 wait events
Event                           Waits  Time(s)  Avg Wait(ms)  % Total Call Time  Wait Class
CPU time                                 1,712                             42.6
log file sequential read       99,909      683             7               17.0  System I/O
log file sync                  49,702      426             9               10.6  Commit
control file sequential read  262,625      384             1                9.6  System I/O
db file sequential read        41,528      378             9                9.4  User I/O
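To put the table in proportion, the percentages can be related back to DB Time. The sketch below is plain Python on the figures quoted above and assumes, as the numbers suggest, that "% Total Call Time" is the event's wait time divided by DB Time; the helper names are made up for illustration.

```python
# Figures copied from the AWR extract above.
ELAPSED_MIN = 20.12
DB_TIME_MIN = 67.04


def pct_of_db_time(wait_seconds, db_time_min=DB_TIME_MIN):
    """Share of DB Time (in percent) taken by one event."""
    return wait_seconds / (db_time_min * 60) * 100


def avg_active_sessions(db_time_min=DB_TIME_MIN, elapsed_min=ELAPSED_MIN):
    """DB Time divided by elapsed time: average number of busy sessions."""
    return db_time_min / elapsed_min


print(f"log file sequential read: {pct_of_db_time(683):.1f} % of DB Time")
print(f"CPU time:                 {pct_of_db_time(1712):.1f} % of DB Time")
print(f"average active sessions:  {avg_active_sessions():.2f}")
```

On these numbers CPU accounts for more DB Time than all four listed wait events combined, and the instance averaged a little over three active sessions during the interval.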
Oracle Support hasn't been much help; they have used up 10 days of my time telling me to try this and try that.
Do you have Streams running in your environment, and are you experiencing these waits? Have you done anything to resolve them?
Thanks

Welcome to the forums.
There is insufficient information in what you have posted to know that your analysis of the situation is correct or anything about your Streams environment.
We don't know what you are replicating. Not size, not volume, not type of capture, not rules, etc.
We don't know the distance over which it is being replicated ... 10 ft. or 10 light years.
We don't have any AWR or ASH data to look at.
etc. etc. etc. If this is what you provided Oracle Support it is no wonder they were unable to help you.
To diagnose this problem, if one exists, requires someone on-site or with a very substantial body of data which you have not provided. The first step is to fill in the answers to all of the obvious first level questions. Then we will likely come back with a second level of questioning.
But when you do ... do not post here. Your questions are not "Database General" they are specific to Streams and there is a Streams forum specifically for them.
Thank you. -
Wait Events "log file parallel write" / "log file sync" during CREATE INDEX
Hello guys,
In my current project I am running some performance tests for Oracle Data Guard. The question is: "How does a LGWR SYNC transfer influence system performance?"
To get some performance values that I can compare, I built a normal Oracle database as the first step.
Now I am running different tests, such as creating "large" indexes and massive parallel inserts/commits, to get the benchmark.
My database is Oracle 10.2.0.4 with multiplexed redo log files on AIX.
I am creating an index on a "normal" table. I execute "dbms_workload_repository.create_snapshot()" before and after the CREATE INDEX to get a matching time frame for the AWR report.
After the index is built (about 9 GB) I run awrrpt.sql to get the AWR report.
Now take a look at these values from the AWR report:
                                                            Avg
                                   %Time  Total Wait       wait   Waits
Event                       Waits  -outs    Time (s)       (ms)    /txn
log file parallel write    10,019     .0         132         13    33.5
log file sync                 293     .7           4         15     1.0
......
How can this be possible?
According to the documentation:
-> log file sync: http://download.oracle.com/docs/cd/B19306_01/server.102/b14237/waitevents003.htm#sthref3120
Wait Time: The wait time includes the writing of the log buffer and the post.
-> log file parallel write: http://download.oracle.com/docs/cd/B19306_01/server.102/b14237/waitevents003.htm#sthref3104
Wait Time: Time it takes for the I/Os to complete. Even though redo records are written in parallel, the parallel write is not complete until the last I/O is on disk.
This was also my understanding: the "log file sync" wait time should be higher than the "log file parallel write" wait time, because it includes the I/O and the response time to the user session.
I could accept it if the values were close to each other (maybe about 1 second apart in total), but the difference between 132 seconds and 4 seconds is too large.
Is the behavior of the log file sync/write different when performing a DDL like CREATE INDEX (maybe async .. like you can influence it with the initialization parameter COMMIT_WRITE??)?
Do you have any idea how these values come about?
Any thoughts/ideas are welcome.
Thanks and Regards

Surachart Opun (HunterX) wrote:
Thank you for Nice Idea.
In this case, How can we reduce "log file parallel write" and "log file sync" waited time?
CREATE INDEX with NOLOGGING
A NOLOGGING can help, can't it?

Yes - if you create index nologging then you wouldn't be generating that 10GB of redo log, so the waits would disappear.
Two points on nologging, though:
- It's "only" an index, so you could always rebuild it in the event of media corruption, but if you had lots of indexes created nologging this might cause an unreasonable delay before the system was usable again - so you should decide on a fallback option, such as taking a new backup of the tablespace as soon as all the nologging operations had completed.
- If the database, or that tablespace, is in "force logging" mode, the nologging will not work.
Don't get too alarmed by the waits, though. My guess is that the "log file sync" waits are mostly from other sessions, and since there aren't many of them the other sessions are probably not seeing a performance issue. The "log file parallel write" waits are caused by your create index, but they are happening to lgwr in the background, which is running concurrently with your session - so your session is not (directly) affected by them, and may not be seeing a performance issue.
The other sessions are seeing relatively high sync times because their log file syncs have to wait for one of the large writes that you have triggered to complete, and then the logwriter includes their (little) writes with your next (large) write.
There may be a performance impact, though, from the pure volume of I/O. Apart from the I/O to write the index, you have LGWR writing (N copies of) the redo for the index, and ARCH is reading and writing the completed log files caused by the index build. So the 9GB of index could easily be responsible for vastly more I/O than the initial 9GB.
Regards
Jonathan Lewis
http://jonathanlewis.wordpress.com
http://www.jlcomp.demon.co.uk
To post code, statspack/AWR report, execution plans or trace files, start and end the section with the tag {noformat}{noformat} (lowercase, curly brackets, no spaces) so that the text appears in fixed format.
"Science is more than a body of knowledge; it is a way of thinking"
Carl Sagan -
Hi all, I am using Oracle 10gR2 on Solaris 10.
I did a SQL Trace and came up with the following result:
Misses in library cache during parse: 591
Elapsed times include waiting on following events:
Event waited on Times Max. Wait Total Waited
---------------------------------------- Waited ---------- ------------
library cache lock 241 0.06 0.61
KJC: Wait for msg sends to complete 2 0.00 0.00
SQL*Net message to client 1768 0.00 0.00
SQL*Net message from client 1768 0.14 7.94
row cache lock 7 0.00 0.00
gc cr grant 2-way 1 0.00 0.00
db file sequential read 67 0.87 6.73
gc current grant 2-way 19 0.00 0.01
gc current grant busy 58 0.01 0.08
log file sync 3055 0.98 2592.00
gc current block 2-way 14 0.00 0.02
gc cr block 2-way 77 0.00 0.06
log file switch completion 12 0.98 8.80
gc current request 5 1.23 6.15
gc current block lost 1 0.45 0.45
lock deadlock retry 1 0.00 0.00
latch free 1 0.00 0.00
enq: TM - contention 1 0.00 0.00
gc cr request 5 1.23 6.14
gc cr block lost 1 0.31 0.31
cr request retry 1 0.00 0.00
latch: session allocation 1 0.00 0.00
gc buffer busy 2 0.98 1.96
OVERALL TOTALS FOR ALL RECURSIVE STATEMENTS
call count cpu elapsed disk query current rows
Parse 237 0.08 0.05 0 0 38 0
Execute 2184 0.51 6.17 1 200 585 364
Fetch 1884 0.18 6.96 27 3234 195 2127
total 4305 0.77 13.19 28 3434 818 2491
Misses in library cache during parse: 21
Misses in library cache during execute: 19
Elapsed times include waiting on following events:
Event waited on Times Max. Wait Total Waited
---------------------------------------- Waited ---------- ------------
library cache lock 21 0.00 0.01
row cache lock 248 0.01 0.08
gc cr grant 2-way 3 0.00 0.00
db file sequential read 28 1.01 3.22
gc current grant busy 8 0.00 0.00
gc current block 2-way 5 0.00 0.00
gc cr block 2-way 1 0.00 0.00
log file switch completion 4 0.98 3.55
gc current request 1 1.22 1.22
latch: KCL gc element parent latch 1 0.00 0.00
latch: redo allocation 1 0.00 0.00
gc current block busy 1 0.64 0.64
1181 user SQL statements in session.
314 internal SQL statements in session.
1495 SQL statements in session.
There are a lot of log file sync waits. There were a lot of INSERTs in the SQL but I did not find any commits. For example:
INSERT INTO CM_WORKFLOW_AUDIT (AUDIT_TRAIL_ID, CASE_HISTORY_ID,USER_ID,
GROUP_ID,ACTION_ID,DESCRIPTION,DATE_TIME)
VALUES
('080504001809',2154515,19,2,23,'Ticket[2157817] added to Super Ticket',
TO_DATE('04-05-2008 13:14:38','dd-mm-yyyy HH24:MI:SS'))
But there is no commit at the end; there are a lot of INSERTs like this one with no commit after them. So log file sync can't be waiting to flush the buffer into a redo log (well, that is what I think at least). Can someone please tell me what is causing the log file sync waits? By the way, my log_buffer is 12MB.
Regards.

Hi,
The number of commits can be retrieved from v$sysstat and v$sesstat.
There are no commit statements in any trace file;
you need to look at lines starting with XCTEND in the raw trace data.
Also, the size of the log_buffer has nothing to do with the log file sync event, and setting a big log_buffer is the typical 'more is better' tuning which doesn't help.
You need to investigate the speed of the devices holding the online redo logs; do NOT locate online redo logs on RAID-5 devices.
Also, posting 'Any one ??' when you don't get a response immediately shows you don't seem to understand that this is a volunteer forum.
Sybrand Bakker
Senior Oracle DBA -
45 min long session of log file sync waits between 5000 and 20000 ms
Encountering a rather unusual performance issue. Once every 4 hours I am seeing a 45 minute long log file sync wait event being reported using Spotlight on Oracle. For the first 30 minutes the event wait is approximately 5000 ms, followed by an increase to around 20000 ms for the next 15 minutes, before it rapidly drops off and normal operation continues for the next 3 hours and 15 minutes, when the cycle repeats itself. The issue appears to maintain its schedule independently of restarting the database. Statspack reports do not show an increase in commits or executions, or any new SQL running, while the issue is occurring. We have two production environments, both running identical applications with similar usage, and we do not see the issue on the other system. I am leaning towards this being a hardware issue, but the 4 hour interval regardless of load on the database has me baffled. If it were a disk or controller cache issue one would expect to see the interval change with database load.
I cycle my redo logs and archive them just fine, with log file switches every 15-20 minutes. Even during this unusually long and high period of log file sync waits I can see that the redo log files are still switching and are being archived.
The redo logs are on a RAID 10, we have 4 redo logs at 1 GB each.
I've run statspack reports on hourly intervals around this event:
Top 5 Wait Events
~~~~~~~~~~~~~~~~~ Wait % Total
Event Waits Time (cs) Wt Time
log file sync 756,729 2,538,034 88.47
db file sequential read 208,851 153,276 5.34
log file parallel write 636,648 129,981 4.53
enqueue 810 21,423 .75
log file sequential read 65,540 14,480 .50
And here is a sample while not encountering the issue:
Top 5 Wait Events
~~~~~~~~~~~~~~~~~ Wait % Total
Event Waits Time (cs) Wt Time
log file sync 953,037 195,513 53.43
log file parallel write 875,783 83,119 22.72
db file sequential read 221,815 63,944 17.48
log file sequential read 98,310 18,848 5.15
db file scattered read 67,584 2,427 .66
Yes I know I am already tight on I/O for my redo even during normal operations yet, my redo and archiving works just fine for 3 hours and 15 minutes (11 to 15 log file switches). These normal switches result in a log file sync wait of about 5000 ms for about 45 seconds while the 1GB redo log is being written and then archived.
I welcome any and all feedback.
Lee,
log_buffer = 1048576. We use a standard of 1 MB for our log buffer and have not altered the setting. It is my understanding that Oracle typically recommends that you not exceed 1MB for the log_buffer, stating that a larger buffer normally does not increase performance.
I would agree that tuning the log_buffer parameter may be something to consider; however, this issue lasts for ~45 minutes once every 4 hours regardless of database load. So for 3 hours and 15 minutes, during both peak usage and low usage, the buffer cache, redo log and archival processes run just fine.
A bit more information from statspack reports:
Here is a sample while the issue is occurring.
Snap Id Snap Time Sessions
Begin Snap: 661 24-Mar-06 12:45:08 87
End Snap: 671 24-Mar-06 13:41:29 87
Elapsed: 56.35 (mins)
Cache Sizes
~~~~~~~~~~~
db_block_buffers: 196608 log_buffer: 1048576
db_block_size: 8192 shared_pool_size: 67108864
Load Profile
~~~~~~~~~~~~ Per Second Per Transaction
Redo size: 615,141.44 2,780.83
Logical reads: 13,241.59 59.86
Block changes: 2,255.51 10.20
Physical reads: 144.56 0.65
Physical writes: 61.56 0.28
User calls: 1,318.50 5.96
Parses: 210.25 0.95
Hard parses: 8.31 0.04
Sorts: 16.97 0.08
Logons: 0.14 0.00
Executes: 574.32 2.60
Transactions: 221.21
% Blocks changed per Read: 17.03 Recursive Call %: 26.09
Rollback per transaction %: 0.03 Rows per Sort: 46.87
Instance Efficiency Percentages (Target 100%)
~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~
Buffer Nowait %: 99.99 Redo NoWait %: 100.00
Buffer Hit %: 98.91 In-memory Sort %: 100.00
Library Hit %: 98.89 Soft Parse %: 96.05
Execute to Parse %: 63.39 Latch Hit %: 99.87
Parse CPU to Parse Elapsd %: 90.05 % Non-Parse CPU: 85.05
Shared Pool Statistics Begin End
Memory Usage %: 89.96 92.20
% SQL with executions>1: 76.39 67.76
% Memory for SQL w/exec>1: 72.53 63.71
Top 5 Wait Events
~~~~~~~~~~~~~~~~~ Wait % Total
Event Waits Time (cs) Wt Time
log file sync 756,729 2,538,034 88.47
db file sequential read 208,851 153,276 5.34
log file parallel write 636,648 129,981 4.53
enqueue 810 21,423 .75
log file sequential read 65,540 14,480 .50
And this is a sample during "normal" operation.
Snap Id Snap Time Sessions
Begin Snap: 671 24-Mar-06 13:41:29 88
End Snap: 681 24-Mar-06 14:42:57 88
Elapsed: 61.47 (mins)
Cache Sizes
~~~~~~~~~~~
db_block_buffers: 196608 log_buffer: 1048576
db_block_size: 8192 shared_pool_size: 67108864
Load Profile
~~~~~~~~~~~~ Per Second Per Transaction
Redo size: 716,776.44 2,787.81
Logical reads: 13,154.06 51.16
Block changes: 2,627.16 10.22
Physical reads: 129.47 0.50
Physical writes: 67.97 0.26
User calls: 1,493.74 5.81
Parses: 243.45 0.95
Hard parses: 9.23 0.04
Sorts: 18.27 0.07
Logons: 0.16 0.00
Executes: 664.05 2.58
Transactions: 257.11
% Blocks changed per Read: 19.97 Recursive Call %: 25.87
Rollback per transaction %: 0.02 Rows per Sort: 46.85
Instance Efficiency Percentages (Target 100%)
~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~
Buffer Nowait %: 99.99 Redo NoWait %: 100.00
Buffer Hit %: 99.02 In-memory Sort %: 100.00
Library Hit %: 98.95 Soft Parse %: 96.21
Execute to Parse %: 63.34 Latch Hit %: 99.90
Parse CPU to Parse Elapsd %: 96.60 % Non-Parse CPU: 84.06
Shared Pool Statistics Begin End
Memory Usage %: 92.20 88.73
% SQL with executions>1: 67.76 75.40
% Memory for SQL w/exec>1: 63.71 68.28
Top 5 Wait Events
~~~~~~~~~~~~~~~~~ Wait % Total
Event Waits Time (cs) Wt Time
log file sync 953,037 195,513 53.43
log file parallel write 875,783 83,119 22.72
db file sequential read 221,815 63,944 17.48
log file sequential read 98,310 18,848 5.15
db file scattered read 67,584 2,427 .66 -
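For the two Statspack samples above (snaps 661 to 671 and 671 to 681), the Time (cs) column is in centiseconds, so the averages are easy to derive. The sketch below is plain Python on the quoted figures; the function names are illustrative. The second calculation rests on an assumption that is not confirmed anywhere in the thread: that the monitoring tool is plotting total milliseconds waited per second of elapsed time rather than the average wait per event.

```python
def avg_wait_ms(waits, time_cs):
    """Average wait per event in ms (Statspack reports centiseconds here)."""
    return time_cs * 10 / waits


def wait_seconds_per_second(time_cs, elapsed_min):
    """Seconds of wait accumulated per second of elapsed time."""
    return (time_cs / 100) / (elapsed_min * 60)


# log file sync figures from the two Top 5 listings
problem_avg = avg_wait_ms(756_729, 2_538_034)
normal_avg = avg_wait_ms(953_037, 195_513)

# Elapsed times taken from the snapshot headers (56.35 and 61.47 minutes)
problem_load = wait_seconds_per_second(2_538_034, 56.35)
normal_load = wait_seconds_per_second(195_513, 61.47)

print(f"avg log file sync, problem hour: {problem_avg:.2f} ms")
print(f"avg log file sync, normal hour:  {normal_avg:.2f} ms")
print(f"wait s per elapsed s, problem:   {problem_load:.2f}")
print(f"wait s per elapsed s, normal:    {normal_load:.2f}")
```

The per-event average rises from about 2 ms to about 34 ms, roughly a sixteenfold increase, while the redo rate and transaction rate in the two load profiles are similar. If the per-second reading is the right interpretation, about 7.5 seconds of wait per elapsed second is in the same range as the 5000 to 20000 ms figures described earlier.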
Performance Issue: Wait event "log file sync" and "Execute to Parse %"
In one of our test environments users are complaining about slow response.
In the Statspack report, the following are the top 5 wait events:
Event Waits Time (cs) Wt Time
log file parallel write 1,046 988 37.71
log file sync 775 774 29.54
db file scattered read 4,946 248 9.47
db file parallel write 66 248 9.47
control file parallel write 188 152 5.80
And after running the same application 4 times, we are getting Execute to Parse % = 0.10. Cursor sharing is forced and query rewrite is enabled.
When I view v$sql, the following statement is parsed frequently:
EXECUTIONS PARSE_CALLS
SQL_TEXT
93380 93380
select SEQ_ORDO_PRC.nextval from DUAL
Please suggest how to troubleshoot this and whether I need to check any more information.
Regards,
Sudhanshu Bhandari

Well, of course, you probably can't eliminate this sort of thing entirely: a setup such as yours is inevitably a compromise. What you can do is make sure your log buffer is a good size (say 10MB or so); that your redo logs are large (at least 100MB each, and preferably large enough to hold one hour or so of redo produced at the busiest time for your database without filling up); and finally set ARCHIVE_LAG_TARGET to something like 1800 seconds or more to ensure a regular, routine, predictable log switch.
It won't cure every ill, but that sort of setup often means the redo subsystem ceases to be a regular driver of foreground waits. -
Dear Experts,
There is a huge wait on log file sync. I have 5 redo log groups 3 members each. I suspect writing to one group which is hosted on /proddb is taking time. Is there a way we can identify and control this.
DB is on 10.2.0.4
OS is RHEL 5
GROUP# STATUS  TYPE    MEMBER
     1         ONLINE  /proddb/proddata/log01a.dbf
     1         ONLINE  /proddb/proddata/log01b.dbf
     1         ONLINE  /prodlog/proddata/log01c.dbf
     2         ONLINE  /proddb/proddata/log02a.dbf
     2         ONLINE  /proddb/proddata/log02b.dbf
     2         ONLINE  /prodlog/proddata/log02c.dbf
     3         ONLINE  /proddb/proddata/log03a.dbf
     3         ONLINE  /proddb/proddata/log03b.dbf
     3         ONLINE  /prodlog/proddata/log03c.dbf
     4         ONLINE  /proddb/proddata/log04a.dbf
     4         ONLINE  /proddb/proddata/log04b.dbf
     4         ONLINE  /prodlog/proddata/log04c.dbf
     5         ONLINE  /proddb/proddata/log05a.dbf
     5         ONLINE  /proddb/proddata/log05b.dbf
     5         ONLINE  /prodlog/proddata/log05c.dbf
Each log file member is 300MB in size and log buffer is 14MB
Thanks
Neel

> There is a huge wait on log file sync. I have 5 redo log groups 3 members each. I suspect writing to one group which is hosted on /proddb is taking time. Is there a way we can identify and control this.
Log file sync is basically a client wait. The only way to control it is to avoid committing too frequently. If you have a big loop with a commit inside it, this may add to log file sync waits, as LGWR may not be able to keep up with the rate of posting. You may want to check the alert log and see if you have lots of "checkpoint not complete" messages. In that case, you may look at increasing the redo log size or adding more groups. But remember, log file sync is a client wait and says nothing about the size of the redo logs. You will have to check the application or the process and make sure it doesn't commit too frequently. That is the only thing you can do about this, and in fact you have to do it.
For ex, if you have a loop like this
Begin
  For i in 1..100000000
  Loop
    Update tab set col1 = 'blah' where key = i;
    commit;
  End Loop;
End;
/
You will have to rewrite the above code something like this
Declare
  v_count Number := 0;
Begin
  For i in 1..100000000
  Loop
    v_count := v_count + 1;
    Update tab set col1 = 'blah' where key = i;
    If (v_count >= 50000) Then -- some good number
      commit;
      v_count := 0;
    End If;
  End Loop;
  commit;
End;
/
-
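To see how much the rewrite changes the number of commits, the two loops above can be mimicked with a counter. This is a hedged sketch in Python rather than PL/SQL; it only counts commit calls and touches no database, and the function names are invented for the illustration.

```python
def commits_row_by_row(rows):
    """First loop: one commit per updated row."""
    commits = 0
    for _ in range(rows):
        commits += 1
    return commits


def commits_batched(rows, batch_size):
    """Second loop: commit every batch_size rows, plus the final commit."""
    commits = 0
    pending = 0
    for _ in range(rows):
        pending += 1
        if pending >= batch_size:
            commits += 1
            pending = 0
    commits += 1  # the commit after End Loop
    return commits


# A smaller row count than the 100,000,000 in the post, to keep the run short.
print(commits_row_by_row(1_000_000))       # one commit per row
print(commits_batched(1_000_000, 50_000))  # 20 in-loop commits plus the final one
```

Scaled up to the 100,000,000 rows in the post, the batched version issues 100,000,000 // 50,000 + 1 = 2,001 commits instead of 100,000,000, so far fewer log file syncs are requested. Whether a batch of 50,000 is appropriate depends on undo space and restart requirements, which the post does not discuss.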
10.2.0.2 aix 5.3 64bit archivelog mode.
I'm going to attempt to describe the system first and then outline the issue: The database is about 1Gb in size of which only about 400Mb is application data. There is only one table in the schema that is very active with all transactions inserting and or updating a row to log the user activity. The rest of the tables are used primarily for reads by the users and periodically updated by the application administrator with application code. There's about 1.2G of archive logs generated per day, from 3 50Mb redo logs all on the same filesystem.
The problem: We randomly have issues with users being kicked out of the application or hung up for a period of time. This application is used at a remote site, and many times we can attribute the users' issues to network delays or problems with a terminal server they are logging into. Today, however, they called and I noticed an abnormally high number of 'log file sync' waits.
I asked the application admin if there could have been more activity during that time frame and more frequent commits than normal, but he says there was not. My next thought was that there might be an issue with the IO sub-system that the logs are on. So I went to our aix admin to find out the activity of that file system during that time frame. She had an nmon report generated that shows the RAID-1 disk group peak activity during that time was only 10%.
Now I took two awr reports and compared some of the metrics to see if indeed there was the same amount of activity, and it does look like the load was the same. With the same amount of activity & commits during both time periods wouldn't that lead to it being time spent waiting on writes to the disk that the redo logs are on? If so, why wouldn't the nmon report show a higher percentage of disk activity?
I can provide more values from the awr reports if needed.
per sec per trx
Redo size: 31,226.81 2,334.25
Logical reads: 646.11 48.30
Block changes: 190.80 14.26
Physical reads: 0.65 0.05
Physical writes: 3.19 0.24
User calls: 69.61 5.20
Parses: 34.34 2.57
Hard parses: 19.45 1.45
Sorts: 14.36 1.07
Logons: 0.01 0.00
Executes: 36.49 2.73
Transactions: 13.38
Redo size: 33,639.71 2,347.93
Logical reads: 697.58 48.69
Block changes: 215.83 15.06
Physical reads: 0.86 0.06
Physical writes: 3.26 0.23
User calls: 71.06 4.96
Parses: 36.78 2.57
Hard parses: 21.03 1.47
Sorts: 15.85 1.11
Logons: 0.01 0.00
Executes: 39.53 2.76
Transactions: 14.33
Total Per sec Per Trx
redo blocks written 252,046 70.52 5.27
redo buffer allocation retries 7 0.00 0.00
redo entries 167,349 46.82 3.50
redo log space requests 7 0.00 0.00
redo log space wait time 49 0.01 0.00
redo ordering marks 2,765 0.77 0.06
redo size 111,612,156 31,226.81 2,334.25
redo subscn max counts 5,443 1.52 0.11
redo synch time 47,910 13.40 1.00
redo synch writes 64,433 18.03 1.35
redo wastage 13,535,756 3,787.03 283.09
redo write time 27,642 7.73 0.58
redo writer latching time 2 0.00 0.00
redo writes 48,507 13.57 1.01
user commits 47,815 13.38 1.00
user rollbacks 0 0.00 0.00
redo blocks written 273,363 76.17 5.32
redo buffer allocation retries 6 0.00 0.00
redo entries 179,992 50.15 3.50
redo log space requests 6 0.00 0.00
redo log space wait time 18 0.01 0.00
redo ordering marks 2,997 0.84 0.06
redo size 120,725,932 33,639.71 2,347.93
redo subscn max counts 5,816 1.62 0.11
redo synch time 12,977 3.62 0.25
redo synch writes 66,985 18.67 1.30
redo wastage 14,665,132 4,086.37 285.21
redo write time 11,358 3.16 0.22
redo writer latching time 6 0.00 0.00
redo writes 52,521 14.63 1.02
user commits 51,418 14.33 1.00
user rollbacks 0 0.00 0.00
Edited by: PktAces on Oct 1, 2008 1:45 PM

Mr Lewis,
Here are the results from the histogram query; the two sets of values were gathered about 15 minutes apart, during a period of lower than normal activity.
105 log file parallel write 1 714394
105 log file parallel write 2 289538
105 log file parallel write 4 279550
105 log file parallel write 8 58805
105 log file parallel write 16 28132
105 log file parallel write 32 10851
105 log file parallel write 64 3833
105 log file parallel write 128 1126
105 log file parallel write 256 316
105 log file parallel write 512 192
105 log file parallel write 1024 78
105 log file parallel write 2048 49
105 log file parallel write 4096 31
105 log file parallel write 8192 35
105 log file parallel write 16384 41
105 log file parallel write 32768 9
105 log file parallel write 65536 1
105 log file parallel write 1 722787
105 log file parallel write 2 295607
105 log file parallel write 4 284524
105 log file parallel write 8 59671
105 log file parallel write 16 28412
105 log file parallel write 32 10976
105 log file parallel write 64 3850
105 log file parallel write 128 1131
105 log file parallel write 256 316
105 log file parallel write 512 192
105 log file parallel write 1024 78
105 log file parallel write 2048 49
105 log file parallel write 4096 31
105 log file parallel write 8192 35
105 log file parallel write 16384 41
105 log file parallel write 32768 9
105 log file parallel write 65536 1 -
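Because the histogram counts are cumulative since instance startup, the useful number is the difference between the two samples. The sketch below is plain Python on the figures posted above; it assumes the third column is the bucket's upper bound in milliseconds and the fourth is the wait count, which matches the usual layout of v$event_histogram but is not stated in the post.

```python
# bucket upper bound (ms) -> cumulative wait count, first and second sample
first = {1: 714394, 2: 289538, 4: 279550, 8: 58805, 16: 28132, 32: 10851,
         64: 3833, 128: 1126, 256: 316, 512: 192, 1024: 78, 2048: 49,
         4096: 31, 8192: 35, 16384: 41, 32768: 9, 65536: 1}
second = {1: 722787, 2: 295607, 4: 284524, 8: 59671, 16: 28412, 32: 10976,
          64: 3850, 128: 1131, 256: 316, 512: 192, 1024: 78, 2048: 49,
          4096: 31, 8192: 35, 16384: 41, 32768: 9, 65536: 1}

# Writes that happened between the two samples, per bucket
delta = {bucket: second[bucket] - first[bucket] for bucket in first}
total = sum(delta.values())

fast = sum(n for bucket, n in delta.items() if bucket <= 4)
slow_in_interval = sum(n for bucket, n in delta.items() if bucket >= 256)
slow_since_startup = sum(n for bucket, n in second.items() if bucket >= 256)

print(f"writes in interval:          {total}")
print(f"completed within 4 ms:       {fast / total:.1%}")
print(f"over 128 ms in interval:     {slow_in_interval}")
print(f"over 128 ms since startup:   {slow_since_startup}")
```

On these figures nearly all writes in the quiet interval finished within 4 ms and none took longer than 128 ms, yet several hundred such slow writes have accumulated since startup. That pattern would be consistent with occasional stalls rather than a uniformly slow device, though two samples are not enough to prove it.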
Log file sync waits - COMMIT class - Pattern observed
I am running a performance test with 1 vUser for 24 hrs. Each time the user logs in, it logs in to my application as a different customer. All customers go through the same screens and perform the same operations.
When I analyze the % Total CPU time, % User CPU and % Wait CPU, I see a strange pattern or behavior. % User CPU is pretty much flat for the entire duration of the test. However, % Total CPU and % Wait CPU have the same pattern as described below:
- First 3 hrs 15 mins - high % Total CPU and % Wait CPU
- Next 1 hr 15 mins - low % Total CPU and % Wait CPU
- Next 3 hrs 15 mins - same as first 3 hrs 15 mins.
- Next 1 hr 15 mins - same as the earlier 1 hr 15 mins
... and so on.
I generated the awrddrpt - between two awrrpt - each of 1 hr duration. One of the 1 hr duration falls within the 3 hrs 15 mins time slice, and another 1 hr falls within the 1 hr 15 min time slice.
I see that there is
--> 40% difference in the wait time for log file sync events - 40% less in the 1 hr duration from 1 hr 15 mins slice compared to the larger slice, even though the # of log file sync events in both the 1 hr durations is close to same.
Please note that the two durations were taken from the same test: same environment, same load (the load profile in awrddrpt shows no difference). Why, then, do I see a difference in Wait CPU, higher in one time slice (3 hr 15 mins) than in the other (1 hr 15 mins)?
log file sync shows in Top 5 timed events in both time slices. In the 1 hr duration from longer time slice, the # of log file sync events is 13,222 and in the 1 hr duration from the shorter time slice it is 13,278.
The log file sync event Wait Time(s) in the 1 hr duration in the longer time slice is 154.2s and that in the 1 hr duration in the shorter time slice is 10.3s.
Why do we see this difference, when the load profile is the same in both time slices?
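The figures quoted for the two one-hour windows can be turned into per-event averages, which makes the gap easier to judge. This is a minimal sketch in Python using only the numbers given above; the function name is illustrative.

```python
def avg_wait_ms(events, wait_seconds):
    """Average log file sync wait in milliseconds."""
    return wait_seconds / events * 1000


long_slice = avg_wait_ms(13_222, 154.2)   # hour inside the 3 hr 15 min slice
short_slice = avg_wait_ms(13_278, 10.3)   # hour inside the 1 hr 15 min slice

print(f"long slice:  {long_slice:.2f} ms per log file sync")
print(f"short slice: {short_slice:.3f} ms per log file sync")
print(f"ratio:       {long_slice / short_slice:.1f}x")
```

With almost identical event counts, the average wait differs by a factor of about 15 between the two windows. That is a much larger gap than the 40% mentioned earlier in the post, so it may be worth rechecking which figures the 40% was taken from.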
user586033

You are either bored or suffer from Compulsive Tuning Disorder.
It can be a challenge to solve a problem that only exists between your ears.
Post the results of the SQL below:
SELECT sql_id,
SUM(time_waited) / 1000000
FROM v$active_session_history
WHERE sample_time > SYSDATE - 1 / 24
AND time_waited > 0
GROUP BY sql_id
ORDER BY 2 DESC -
Log file sync wait event advise?
Due to business needs, the application has been designed to commit every single transaction. On the infrastructure side, the database datafiles and redo logs are on faster disk (FC) and the archive logs are on slower disk (SATA). We are seeing the log file sync wait event in the top events, and the usual causes of this wait event are either slow disk or frequent commits. In my scenario I guess it is 99% due to frequent commits. Can I assume the slower archive log disk will not be the root cause (my understanding is that this wait event occurs when writing the redo logs, not when writing the archive logs)? Please confirm.

user530956 wrote:
> We are seeing the log file sync wait event in the top events and symptoms for this waitevent is either disk speed is slow or doing frequent commit.

As Hemant has pointed out, this could also be due to CPU overload.
I note you say the event is IN the top events - this tells us virtually nothing; an event might be IN the top 5 while being responsible for less than 1% of the total recorded wait time; it could be IN the top 5 but explained as a side effect of something that appeared above it in the Top 5. Why not just show us a typical Top 5 (along with a typical Load Profile if you want to be really helpful).
Regards
Jonathan Lewis
http://jonathanlewis.wordpress.com
http://www.jlcomp.demon.co.uk
-
When I look at OEM dbconsole, I see that waits on log file sync have a 98% impact on my database. dbconsole says:
finding : Waits on event "log file sync" while performing COMMIT and ROLLBACK operations were consuming significant database time.
Action : Investigate the possibility of improving the performance of I/O to the online redo log files
What can be done for this error?

> what can be done for this error?

This is not an error; it is a wait event.
When a user performs a commit or rollback, the information in the log buffer is flushed to the redo log file by the LGWR process, and the user session waits until this has completed.
Try the following actions:
log file sync
When a user session commits (or rolls back), the session's redo information must be flushed to the redo logfile by LGWR. The server process performing the COMMIT or ROLLBACK waits under this event for the write to the redo log to complete.
Actions
If this event's waits constitute a significant wait on the system or a significant amount of time waited by a user experiencing response time issues or on a system, then examine the average time waited.
If the average time waited is low, but the number of waits are high, then the application might be committing after every INSERT, rather than batching COMMITs. Applications can reduce the wait by committing after 50 rows, rather than every row.
If the average time waited is high, then examine the session waits for the log writer and see what it is spending most of its time doing and waiting for. If the waits are because of slow I/O, then try the following:
* Reduce other I/O activity on the disks containing the redo logs, or use dedicated disks.
* Alternate redo logs on different disks to minimize the effect of the archiver on the log writer.
* Move the redo logs to faster disks or a faster I/O subsystem (for example, switch from RAID 5 to RAID 1).
* Consider using raw devices (or simulated raw devices provided by disk vendors) to speed up the writes.
* Depending on the type of application, it might be possible to batch COMMITs by committing every N rows, rather than every row, so that fewer log file syncs are needed.
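The "commit every N rows" advice above can be sketched roughly as follows. This is only an illustration: Python's built-in sqlite3 module stands in for Oracle so the example runs anywhere, and the function name insert_in_batches, the table t and the batch size of 50 are all assumptions made for the example. With an Oracle driver the bind-variable syntax would differ, but the shape (array insert, one commit per batch) is the same.

```python
import sqlite3

def insert_in_batches(conn, sql, rows, batch_size=50):
    """Insert rows with array DML and commit once per batch.

    Returns the number of commits issued, i.e. roughly the number of
    times a session would have to wait for the log writer.
    """
    cur = conn.cursor()
    commits = 0
    for start in range(0, len(rows), batch_size):
        # One array insert per batch instead of one statement per row.
        cur.executemany(sql, rows[start:start + batch_size])
        conn.commit()
        commits += 1
    return commits

# sqlite3 is a stand-in here (assumption); table and data are hypothetical.
conn = sqlite3.connect(":memory:")
conn.execute("CREATE TABLE t (id INTEGER, payload TEXT)")
rows = [(i, f"row {i}") for i in range(1000)]

commits = insert_in_batches(conn, "INSERT INTO t VALUES (?, ?)", rows, batch_size=50)
row_count = conn.execute("SELECT COUNT(*) FROM t").fetchone()[0]
print(f"{row_count} rows inserted with {commits} commits")
```

Committing per row would have issued 1000 commits for the same data; batching by 50 brings that down to 20.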
kuljeet -
High log file sync waits in DG environment.
Hi All,
Experiencing a high number of "log file sync" waits on primary after the AIX OS upgrade to 6.1 and
a storage change from EMC DMX to EMC VMAX on primary. The SAs say EMC checked out the
storage and it is showing faster response times than the old DMX storage.
No app changes have been made, and we had been running fine for years on 10.2.0.3 just before the upgrade.
Research:
Every time the primary database has to write redo from the log buffer to the online redo log files, the user session
waits on the "log file sync" wait event while waiting for LGWR to post it back to confirm all redo changes are safely
on disk. However, when the primary database also has a standby DB and the log shipping uses "LGWR SYNC AFFIRM",
user sessions have to wait not only for the local write to the online redo logs, but also for the
write to the SRL on the standby DB, so every delay in getting a complete write response from the standby will
be seen on the primary as a "log file sync" wait event, even if the local write to the ORL has already completed.
From the primary's AWR reports (DB1ABRN_regy_AWR_20110802_1400_3.html) there is not much to pinpoint except:
i) DG is configured to use 'LGWR SYNC AFFIRM'(customer can't switch to standby "ARCH SYNC NOAFFIRM" for critical application support).
ii) Upgraded the OS to 6.1 and changed the storage on primary. *** The standby DB is still using the slow storage.
In DG there are several components like transport network, IO on the standby etc that can feed back to primary "log file sync" wait events that are seen.
Question we have is:
What is the best way to trace/query information from the standby side to identify its contribution to the high log file sync wait on the primary?
There is an Oracle support note on this:
WAITEVENT: "log file sync" Reference Note [ID 34592.1]
While it does not address trace it does have a Data Guard section.
Also this Oracle support note may help :
Troubleshooting I/O-related waits [ID 223117.1]
Best Regards
mseberg -
Log file sync waits with null sql_ids
10.2.0.3
I am querying V$ACTIVE_SESSION_HISTORY to drill into log file sync waits.
select sql_id,sum(time_waited)
from v$active_session_history
where sample_time > sysdate - 1/24
group by sql_id
order by 2 desc
All of my top sessions for this have null sql_ids. I did some Google searches and these are the explanations I found for null sql_ids. There are some other sessions where the sql_id is not null, but they are not anywhere near the top.
1. Could be running PL/SQL. Yeah, OK, but I would need to run DML and issue a commit for this event to fire.
2. No SQL is running. Does this mean the insert finished and I am then waiting on the 'commit' part?
I want to track these SQLs down so I can track them back to the application. I want to get the developers to limit their commit frequency and use batch (array based) DML. How do I track this down?
Also, is there any way to figure out how often different users are committing? I want to track back to the worst offenders. It could be that some parts of the application commit periodically and others do not, but log file syncs could slow down everyone.
You are either bored or suffer from Compulsive Tuning Disorder.
It can be a challenge to solve a problem that only exists between your ears
post results from SQL below
SELECT sql_id,
SUM(time_waited) / 1000000
FROM v$active_session_history
WHERE sample_time > SYSDATE - 1 / 24
AND time_waited > 0
GROUP BY sql_id
ORDER BY 2 DESC -
Hi everyone,
DB 11.2.0.1- 32 Cores - 64 GB RAM
I have a database into which 1.5 million records are inserted every hour. This database suffers from too many log file sync waits. Googling the matter, I found that the reason is the way the application inserts the data: a single insert followed by a commit each time. Currently we cannot change the insertion method to use batch inserts instead. The database has 22 redo log files of 200MB each, and the log_buffer is about 150 MB.
Is there any solution to reduce the number of log file sync waits?
I have tried increasing the redo log file size to 700MB each, but afterwards there were some log buffer contention waits. The SGA is 38GB.
Thanks for any guidelines,
regards
f9smsk wrote:
DB 11.2.0.1- 32 Cores - 64 GB RAM
I have a database that 1.5 million records are inserting every hour in it. This database suffers from too much log file sync waits.
What makes you think the database is suffering from TOO MUCH log file sync waits ? Do you have a task that need to run more quickly - if so how long does it take, and how much of that time is spent on log file syncs ? How many copies of this task are running concurrently ? How do the log file syncs compare with the log file parallel writes ? What's the typical size of a log file write ?
Regards
Jonathan Lewis -
Hi all,
We are using Oracle 9.2.0.4 on SUSE Linux 10. In our Statspack report, one of the top timed events is log file sync. We are not using any storage. Is this a bug in 9.2.0.4, or what is the solution?
STATSPACK report for
DB Name DB Id Instance Inst Num Release Cluster Host
ai 1495142514 ai 1 9.2.0.4.0 NO ai-oracle
Snap Id Snap Time Sessions Curs/Sess Comment
Begin Snap: 241 03-Sep-09 12:17:17 255 63.2
End Snap: 242 03-Sep-09 12:48:50 257 63.4
Elapsed: 31.55 (mins)
Cache Sizes (end)
~~~~~~~~~~~~~~~~~
Buffer Cache: 1,280M Std Block Size: 8K
Shared Pool Size: 160M Log Buffer: 1,024K
Load Profile
~~~~~~~~~~~~ Per Second Per Transaction
Redo size: 7,881.17 8,673.87
Logical reads: 14,016.10 15,425.86
Block changes: 44.55 49.04
Physical reads: 3,421.71 3,765.87
Physical writes: 8.97 9.88
User calls: 254.50 280.10
Parses: 27.08 29.81
Hard parses: 0.46 0.50
Sorts: 8.54 9.40
Logons: 0.12 0.13
Executes: 139.47 153.50
Transactions: 0.91
% Blocks changed per Read: 0.32 Recursive Call %: 42.75
Rollback per transaction %: 13.66 Rows per Sort: 120.84
Instance Efficiency Percentages (Target 100%)
~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~
Buffer Nowait %: 100.00 Redo NoWait %: 100.00
Buffer Hit %: 75.59 In-memory Sort %: 99.99
Library Hit %: 99.55 Soft Parse %: 98.31
Execute to Parse %: 80.58 Latch Hit %: 100.00
Parse CPU to Parse Elapsd %: 67.17 % Non-Parse CPU: 99.10
Shared Pool Statistics Begin End
Memory Usage %: 95.32 96.78
% SQL with executions>1: 74.91 74.37
% Memory for SQL w/exec>1: 68.59 69.14
Top 5 Timed Events
~~~~~~~~~~~~~~~~~~ % Total
Event Waits Time (s) Ela Time
log file sync 11,558 10,488 67.52
db file sequential read 611,828 3,214 20.69
control file parallel write 436 541 3.48
buffer busy waits 626 522 3.36
CPU time 395 2.54
Wait Events for DB: ai Instance: ai Snaps: 241 -242
-> s - second
-> cs - centisecond - 100th of a second
-> ms - millisecond - 1000th of a second
-> us - microsecond - 1000000th of a second
-> ordered by wait time desc, waits desc (idle events last)
                                                                   Avg
                                                     Total Wait   wait    Waits
Event                               Waits   Timeouts   Time (s)   (ms)     /txn
---------------------------- ------------ ---------- ---------- ------ --------
log file sync                      11,558      9,981     10,488    907      6.7
db file sequential read           611,828          0      3,214      5    355.7
control file parallel write           436          0        541   1241      0.3
buffer busy waits                     626        518        522    834      0.4
control file sequential read          661          0        159    241      0.4
BFILE read                            734          0        110    151      0.4
db file scattered read            595,462          0         81      0    346.2
enqueue                                15          5         19   1266      0.0
latch free                            109         22          1      8      0.1
db file parallel read                 102          0          1      6      0.1
log file parallel write             1,498      1,497          1      0      0.9
BFILE get length                      166          0          0      3      0.1
SQL*Net break/reset to clien          199          0          0      1      0.1
SQL*Net more data to client         5,139          0          0      0      3.0
BFILE open                             76          0          0      0      0.0
row cache lock                          5          0          0      0      0.0
BFILE internal seek                   734          0          0      0      0.4
BFILE closure                          76          0          0      0      0.0
db file parallel write                173          0          0      0      0.1
direct path read                       18          0          0      0      0.0
direct path write                       4          0          0      0      0.0
SQL*Net message from client       480,888          0    284,247    591    279.6
virtual circuit status                 64         64      1,861  29072      0.0
wakeup time manager                    59         59      1,757  29781      0.0
Your elapsed time is roughly 2000 seconds (31:55 rounded up) - and your log file sync time is roughly 10,000 - which is 5 seconds per second for the duration. Alternatively your session count is roughly 250 at start and end of snapshot - so if we assume that the number of sessions was steady for the duration, every session has suffered 40 seconds of log file sync in the interval. You've recorded roughly 1,500 transactions in the interval (0.91 per second, of which about 13% were rollbacks) - so your log file sync time has averaged more than 6.5 seconds per commit.
Whichever way you look at it, this suggests that either the log file sync figures are wrong, or you have had a temporary hardware failure. Given that you've had a few buffer busy waits and control file write waits of about 900 ms each, the hardware failure seems likely.
Check log file parallel write times to see if this helps to confirm the hypothesis. (Unfortunately some platforms don't report log file parallel write times correctly for earlier versions of 9.2 - so this may not help.)
You also have 15 enqueue waits averaging 1.2 seconds - check the enqueue stats section of the report to see which enqueue this was: if it was, e.g., CF (control file), then this also helps to confirm the hardware hypothesis.
It's possible that you had a couple of hardware resets or something of that sort in the interval that stopped your system quite dramatically for a minute or two.
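For anyone who wants to redo the back-of-envelope arithmetic above, here is a small sketch using the unrounded figures from the posted Statspack extract. The variable names are made up for the example, and treating the begin-snap session count (255) as constant over the interval is an assumption, the same one made in the reply. Because nothing is rounded to 2,000 seconds or 10,000 seconds, the results come out slightly higher than the rounded figures quoted.

```python
# Figures copied from the Statspack extract (snaps 241-242).
elapsed_s = 31.55 * 60    # "Elapsed: 31.55 (mins)", decimal minutes
sync_time_s = 10_488      # log file sync: Total Wait Time (s)
sync_waits = 11_558       # log file sync: Waits
sessions = 255            # Sessions at begin snap (assumed steady)
tx_per_s = 0.91           # Load Profile: Transactions per second
rollback_pct = 13.66      # Rollback per transaction %

# Seconds of log file sync accumulated per second of wall-clock time.
sync_per_elapsed_s = sync_time_s / elapsed_s

# Seconds of log file sync per session over the interval.
sync_per_session_s = sync_time_s / sessions

# Transactions in the interval, and the share that were commits.
transactions = tx_per_s * elapsed_s
commits = transactions * (1 - rollback_pct / 100)
sync_per_commit_s = sync_time_s / commits

# Cross-check against the "Avg wait (ms)" column of the report.
avg_wait_ms = 1000 * sync_time_s / sync_waits

print(f"log file sync per elapsed second: {sync_per_elapsed_s:.1f} s")
print(f"log file sync per session:        {sync_per_session_s:.0f} s")
print(f"log file sync per commit:         {sync_per_commit_s:.1f} s")
print(f"average wait:                     {avg_wait_ms:.0f} ms")
```

The last figure should agree with the 907 ms shown in the wait events table, which is a quick check that the inputs were copied correctly.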
Regards
Jonathan Lewis
http://jonathanlewis.wordpress.com
http://www.jlcomp.demon.co.uk
"Science is more than a body of knowledge; it is a way of thinking"
Carl Sagan