STATSPACK Performance Question / Discrepancy
I'm trying to troubleshoot a performance issue and I'm having trouble interpreting the STATSPACK report. It seems like the STATSPACK report is missing information that I expect to be there. I'll explain below.
Header
STATSPACK report for
Database DB Id Instance Inst Num Startup Time Release RAC
~~~~~~~~ ----------- ------------ -------- --------------- ----------- ---
2636235846 testdb 1 30-Jan-11 16:10 11.2.0.2.0 NO
Host Name Platform CPUs Cores Sockets Memory (G)
~~~~ ---------------- ---------------------- ----- ----- ------- ------------
TEST Microsoft Windows IA ( 4 2 0 3.4
Snapshot Snap Id Snap Time Sessions Curs/Sess Comment
~~~~~~~~ ---------- ------------------ -------- --------- ------------------
Begin Snap: 3427 01-Feb-11 06:40:00 65 4.4
End Snap: 3428 01-Feb-11 07:00:00 66 4.1
Elapsed: 20.00 (mins) Av Act Sess: 7.3
DB time: 146.39 (mins) DB CPU: 8.27 (mins)
Cache Sizes Begin End
~~~~~~~~~~~ ---------- ----------
Buffer Cache: 192M 176M Std Block Size: 8K
Shared Pool: 396M 412M Log Buffer: 10,848K
Load Profile Per Second Per Transaction Per Exec Per Call
~~~~~~~~~~~~ ------------------ ----------------- ----------- -----------
DB time(s): 7.3 2.0 0.06 0.04
DB CPU(s): 0.4 0.1 0.00 0.00
Redo size: 6,366.0 1,722.1
Logical reads: 1,114.6 301.5
Block changes: 35.8 9.7
Physical reads: 44.9 12.1
Physical writes: 1.5 0.4
User calls: 192.2 52.0
Parses: 101.5 27.5
Hard parses: 3.6 1.0
W/A MB processed: 0.1 0.0
Logons: 0.1 0.0
Executes: 115.1 31.1
Rollbacks: 0.0 0.0
Transactions: 3.7
As you can see, a significant amount of time was spent in database calls (DB time) with relatively little time on CPU (DB CPU). Initially that made me think there were some significant wait events.
Top 5 Timed Events Avg %Total
~~~~~~~~~~~~~~~~~~ wait Call
Event Waits Time (s) (ms) Time
log file sequential read 48,166 681 14 7.9
CPU time 484 5.6
db file sequential read 35,357 205 6 2.4
control file sequential read 50,747 23 0 .3
Disk file operations I/O 16,518 18 1 .2
-------------------------------------------------------------
However, looking at the Top 5 Timed Events I don't see anything out of the ordinary given my normal operations. The log file sequential read may be a little slow, but it doesn't make up a significant portion of the execution time.
Based on an Excel/VB spreadsheet I wrote, which converts STATSPACK data to graphical form, I suspected that there was a wait event not listed here. So I decided to query the data directly. Here is the query and result.
SQL> SELECT wait_class
2 , event
3 , delta/POWER(10,6) AS delta_sec
4 FROM
5 (
6 SELECT syev.snap_id
7 , evna.wait_class
8 , syev.event
9 , syev.time_waited_micro
10 , syev.time_waited_micro - LAG(syev.time_waited_micro) OVER (PARTITION BY syev.event ORDER BY syev.snap_id) AS delta
11 FROM perfstat.stats$system_event syev
12 JOIN v$event_name evna ON evna.name = syev.event
13 WHERE syev.snap_id IN (3427,3428)
14 )
15 WHERE delta > 0
16 ORDER BY delta DESC
17 ;
WAIT_CLASS EVENT DELTA_SEC
Idle SQL*Net message from client 21169.742
Idle rdbms ipc message 19708.390
Application enq: TM - contention 7199.819
Idle Space Manager: slave idle wait 3001.719
Idle DIAG idle wait 2382.943
Idle jobq slave wait 1258.829
Idle smon timer 1220.902
Idle Streams AQ: qmn coordinator idle wait 1204.648
Idle Streams AQ: qmn slave idle wait 1204.637
Idle pmon timer 1197.898
Idle Streams AQ: waiting for messages in the queue 1197.484
Idle Streams AQ: waiting for time management or cleanup tasks 791.803
System I/O log file sequential read 681.444
User I/O db file sequential read 204.721
System I/O control file sequential read 23.168
User I/O Disk file operations I/O 17.737
User I/O db file parallel read 14.536
System I/O log file parallel write 7.618
Commit log file sync 7.150
User I/O db file scattered read 3.488
Idle SGA: MMAN sleep for component shrink 2.461
User I/O direct path read 1.621
Other process diagnostic dump 1.418
... snip ...
So based on the above it looks like there was a significant amount of time spent in enq: TM - contention.
Question 1
Why does this wait event not show up in the Top 5 Timed Events section? Note that this wait event is also not listed in any of the other wait events sections either.
Moving on, I decided to look at the Time Model Statistics
Time Model System Stats DB/Inst: testdb /testdb Snaps: 3427-3428
-> Ordered by % of DB time desc, Statistic name
Statistic Time (s) % DB time
sql execute elapsed time 8,731.0 99.4
PL/SQL execution elapsed time 1,201.1 13.7
DB CPU 496.3 5.7
parse time elapsed 26.4 .3
hard parse elapsed time 21.1 .2
PL/SQL compilation elapsed time 2.8 .0
connection management call elapsed 0.6 .0
hard parse (bind mismatch) elapsed 0.5 .0
hard parse (sharing criteria) elaps 0.5 .0
failed parse elapsed time 0.0 .0
repeated bind elapsed time 0.0 .0
sequence load elapsed time 0.0 .0
DB time 8,783.2
background elapsed time 87.1
background cpu time 2.4
Great, so it looks like I spent >99% of DB time in SQL calls. I decided to scroll to the SQL ordered by Elapsed time section. The header information surprised me.
SQL ordered by Elapsed time for DB: testdb Instance: testdb Snaps: 3427 -3
-> Total DB Time (s): 8,783
-> Captured SQL accounts for 4.1% of Total DB Time
-> SQL reported below exceeded 1.0% of Total DB Time
If I'm spending >99% of my time in SQL, I would have expected the captured % to be higher.
Question 2
Am I correct in assuming that a long running SQL that started before the first snap and is still running at the end of the second snap would not display in this section?
Question 3
Would that answer my wait event question above? That is, are wait events not reported until the action that is waiting (the execution of a SQL statement, for example) is complete?
So I looked at a few snapshots past what I have posted here. I still haven't determined why the enq: TM - contention wait is not displayed anywhere in the STATSPACK reports. I did end up finding an interesting PL/SQL block that may have been causing the issues. Here is the SQL ordered by Elapsed time section for a snapshot that was taken an hour after the one I posted.
SQL ordered by Elapsed time for DB: testdb Instance: testdb Snaps: 3431 -3
-> Total DB Time (s): 1,088
-> Captured SQL accounts for ######% of Total DB Time
-> SQL reported below exceeded 1.0% of Total DB Time
Elapsed Elap per CPU Old
Time (s) Executions Exec (s) %Total Time (s) Physical Reads Hash Value
26492.65 29 913.54 ###### 1539.34 480 1013630726
Module: OEM.CacheModeWaitPool
BEGIN EMDW_LOG.set_context(MGMT_JOB_ENGINE.MODULE_NAME, :1); BEG
IN MGMT_JOB_ENGINE.process_wait_step(:2);END; EMDW_LOG.set_conte
xt; END;
I'm still not sure if this is the problem child or not.
I just wanted to post this to get your thoughts on how I correctly/incorrectly attacked this problem and to see if you can fill in any gaps in my understanding.
Thanks!
Centinul wrote:
I'm still not sure if this is the problem child or not.
I just wanted to post this to get your thoughts on how I correctly/incorrectly attacked this problem and to see if you can fill in any gaps in my understanding.
I think you've attacked the problem well.
It has prompted me to take a little look at what's going on, running 11.1.0.6 in my case, and something IS broken.
The key predicate in statspack for reporting top 5 is:
and e.total_waits > nvl(b.total_waits,0)
In other words, an event gets reported if total_waits increased across the period.
So I've been taking snapshots of v$system_event and looking at 10046 trace files at level 8. The basic test was as simple as:
Session 1: lock table t1 in exclusive mode
Session 2: lock table t1 in exclusive mode
About three seconds after session 2 started to wait, v$system_event incremented total_waits (for the "enq: TM - contention" event). When I committed in session 1 the total_waits figure did not change.
Now do this after waiting across a snapshot:
We start to wait, after three seconds we record a wait, a few minutes later perfstat takes a snapshot.
30 minutes later "session 1" commits and our wait ends; we do not increment total_waits, but we do record 30+ minutes of wait time.
30 minutes later perfstat takes another snapshot
The total_waits has not changed between the start and end snapshot even though we have added 30 minutes to the "enq: TM - contention" in the interim.
The statspack report loses our 30 minutes from the Top N.
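The failure mode described above can be sketched in a few lines. This is a toy Python simulation for illustration (an assumption of mine, not STATSPACK code); it only models the quoted predicate and the counter behaviour observed in the test:

```python
# Toy simulation of the reporting predicate quoted above:
#   e.total_waits > nvl(b.total_waits, 0)
# total_waits increments ~3 seconds after a wait STARTS, not when it ends,
# so a wait spanning both snapshots adds time but no new waits.

def top5_reports_event(begin_snap, end_snap):
    """Mimic statspack's Top 5 filter: keep the event only if
    total_waits increased across the snapshot interval."""
    return end_snap["total_waits"] > (begin_snap["total_waits"] or 0)

snap_begin = {"total_waits": 10, "time_waited_s": 100.0}   # wait already counted
snap_end   = {"total_waits": 10, "time_waited_s": 1900.0}  # +30 min of wait time

delta = snap_end["time_waited_s"] - snap_begin["time_waited_s"]
print(delta)                                      # 1800.0 seconds accrued...
print(top5_reports_event(snap_begin, snap_end))   # False: the event is dropped
```

The 1,800 seconds of "enq: TM - contention" time is real, but the unchanged waits counter filters the event out of the report.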
It's a bug - raise an SR.
Edit: The AWR will have the same problem, of course.
Regards
Jonathan Lewis
Edited by: Jonathan Lewis on Feb 1, 2011 7:07 PM
Similar Messages
-
Simple performance question
In the simplest way possible, assume I have an int[][][][][] matrix and a boolean add. The array is several dimensions long.
When add is true, I must add a constant value to each element in the array.
When add is false, I must subtract a constant value from each element in the array.
Assume this is very hot code, i.e. it is called very often. How expensive is the condition checking? I present the two scenarios.
private void process() {
    for (int i = 0; i < dimension1; i++)
        for (int ii = 0; ii < dimension1; ii++)
            for (int iii = 0; iii < dimension1; iii++)
                for (int iiii = 0; iiii < dimension1; iiii++)
                    if (add)
                        matrix[i][ii][iii][...] += constant;
                    else
                        matrix[i][ii][iii][...] -= constant;
}
private void process() {
    if (add)
        for (int i = 0; i < dimension1; i++)
            for (int ii = 0; ii < dimension1; ii++)
                for (int iii = 0; iii < dimension1; iii++)
                    for (int iiii = 0; iiii < dimension1; iiii++)
                        matrix[i][ii][iii][...] += constant;
    else
        for (int i = 0; i < dimension1; i++)
            for (int ii = 0; ii < dimension1; ii++)
                for (int iii = 0; iii < dimension1; iii++)
                    for (int iiii = 0; iiii < dimension1; iiii++)
                        matrix[i][ii][iii][...] -= constant;
}
Is the second scenario worth a significant performance boost? Without understanding how the compiler generates executable code, it seems that in the first case, n^d conditions are checked, whereas in the second, only 1. It is, however, less elegant, but I am willing to do it for a significant improvement.
erjoalgo wrote:
I guess my real question is, will the compiler optimize the condition check out when it realizes the boolean value will not change through these iterations, and if it does not, is it worth doing that micro optimization?
Almost certainly not; the main reason being that
matrix[i][ii][iii][...] +/-= constant
is liable to take many times longer than the condition check, and you can't avoid it. That said, Mel's suggestion is probably the best.
but I will follow amickr advice and not worry about it.
Good idea. Saves you getting flamed with all the quotes about premature optimization.
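For what it's worth, the two shapes can be sketched like this (a Python sketch rather than the original Java; the delta-hoisting variant shown is my own illustration, not necessarily Mel's suggestion). Both forms produce identical results; the hoisted form tests the flag once instead of once per element:

```python
# Branch inside the innermost loop vs. branch hoisted out of the loops.
def process_branch_inside(matrix, add, constant):
    for row in matrix:
        for i in range(len(row)):
            if add:                      # checked once per element
                row[i] += constant
            else:
                row[i] -= constant

def process_branch_hoisted(matrix, add, constant):
    delta = constant if add else -constant   # checked once, up front
    for row in matrix:
        for i in range(len(row)):
            row[i] += delta

m1 = [[1, 2], [3, 4]]
m2 = [[1, 2], [3, 4]]
process_branch_inside(m1, True, 5)
process_branch_hoisted(m2, True, 5)
print(m1 == m2)  # True: same result either way
```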
Winston -
Guys,
I do understand that ccBPM is very resource hungry, but what I was wondering is this:
Once you use BPM, does an extra step decrease the performance significantly? Or does it just need slightly more resources?
More specifically, we have quite complex mapping in 2 BPM steps. Combining them would make the mapping less clear, but would it be worth doing so from the performance point of view?
Your opinion is appreciated.
Thanks a lot,
Viktor Varga
Hi,
In SXMB_ADM you can set the time out higher for the sync processing.
Go to Integration Processing in SXMB_ADM and add the parameter SA_COMM CHECK_FOR_ASYNC_RESPONSE_TIMEOUT, set to 120 (seconds). You can also increase the number of parallel processes if you have more waiting now: SA_COMM CHECK_FOR_MAX_SYNC_CALLS from 20 to XX. It all depends on your hardware, but this helped me go from the standard 60 seconds to maybe 70 in some cases.
Make sure that your calling system does not have a timeout below the one you set in XI; otherwise yours will go on and finish, and your partner may end up sending the message twice.
When you go for BPM, the whole workflow has to come into action. For example, when your mapping lasts < 1 sec without BPM, if you do it in a BPM the transformation step can last 2 seconds plus one second of mapping... (that's just an example). So the workflow gives you many design possibilities (bridge, error handling), but it can slow down the process, and if you have thousands of messages the performance can be much worse than having the same without BPM.
See the links below:
http://help.sap.com/bp_bpmv130/Documentation/Operation/TuningGuide.pdf
http://help.sap.com/saphelp_nw04/helpdata/en/43/d92e428819da2ce10000000a1550b0/content.htm
https://www.sdn.sap.com/irj/servlet/prt/portal/prtroot/com.sap.km.cm.docs/library/xi/3.0/sap%20exchange%20infrastructure%20tuning%20guide%20xi%203.0.pdf
BPM Performance tuning
BPM Performance issue
BPM performance question
BPM performance- data aggregation persistance
Regards
Chilla.. -
Swing performance question: CPU-bound
Hi,
I've posted a Swing performance question to the java.net performance forum. Since it is a Swing performance question, I thought readers of this forum might also be interested.
Swing CPU-bound in sun.awt.windows.WToolkit.eventLoop
http://forums.java.net/jive/thread.jspa?threadID=1636&tstart=0
Thanks,
Curt
You obviously don't understand the results, and the first reply to your posting on java.net clearly explains what you missed.
The event queue is using Thread.wait to sleep until it gets some more events to dispatch. You have incorrectly diagnosed the sleep waiting as your performance bottleneck. -
Xcontrol: performance question (again)
Hello,
I've got a little performance question regarding XControls. I observed rather high CPU load when using XControls. To investigate further, I built a minimal XControl (boolean type) which only writes the received boolean value to a display element in its facade (see attached example). When I use this XControl in a test VI and write to it at a rate of 1000 booleans/second, I get a CPU load of about 10%. When I write directly to a boolean display element instead of the XControl, I have a load of 0 to 1%. The funny thing is, when I emulate the XControl functionality with a subVI, a subpanel and a queue (see example), I only have 0 to 1% CPU load, too.
Is there a way to reduce the cpu-load when using xcontrols?
If there isn't and if this is not a problem with my installation but a known issue, I think this would be a potential point for NI to fix in a future update of LV.
Regards,
soranito
Message Edited by soranito on 04-04-2010 08:16 PM
Message Edited by soranito on 04-04-2010 08:18 PM
Attachments:
XControl_performance_test.zip 60 KB
soranito wrote:
Hello,
I've got a little performance question regarding XControls. I observed rather high CPU load when using XControls. To investigate further, I built a minimal XControl (boolean type) which only writes the received boolean value to a display element in its facade (see attached example). When I use this XControl in a test VI and write to it at a rate of 1000 booleans/second, I get a CPU load of about 10%. When I write directly to a boolean display element instead of the XControl, I have a load of 0 to 1%. The funny thing is, when I emulate the XControl functionality with a subVI, a subpanel and a queue (see example), I only have 0 to 1% CPU load, too.
Okay, I think I understand the question now. You want to know why an equivalent XControl boolean consumes 10x more CPU resource than the LV base package boolean?
Okay, try opening the project I replied with yesterday. I don't have access to LV at my desk, so let's try this. Open up your XControl facade.vi. Notice how I separated your data event into two events? Go to the data change VI event; when looping back the action, set isDataChanged (part of the data change cluster) to FALSE. For the data input (the one displayed on your facade.vi front panel), set that isDataChanged to TRUE. This will limit the number of times the facade will be looping. It will not drop your CPU from 10% to 0%, but it should drop a little, just enough to give you a short-term solution. If that doesn't work, just play around with the loopback statement. I can't remember the exact method.
Yeah, I agree the XControl shouldn't be overconsuming system resources. I think XControl is still in its primitive form, and I'm not sure if NI is planning on investing more time to fix bugs or even enhance it. IMO, I don't think XControl is quite ready for primetime yet. Just too many issues that need improvement.
Message Edited by lavalava on 04-06-2010 03:34 PM -
MBP with 27" Display performance question
I'm looking for advice regarding improving the performance, if possible, of my MacBook Pro and new 27" Apple display combination. I'm using a 13" MacBook Pro 2.53GHz with 4GB RAM, an NVIDIA GeForce 9400M graphics card, and I have 114GB of the 250GB of HD space available. What I'm really wondering is whether this is enough spec to run the 27" display easily. Apple says it is… and it does work, but I suspect that I'm working at the limit of what my MBP is capable of. My main applications are Photoshop CS5 with Camera RAW and Bridge. Everything works, but I sometimes get lock-ups and things are basically a bit jerky. Is the bottleneck my 2.53GHz processor or the graphics card? I have experimented with the OpenGL settings in Photoshop and tried closing all unused applications. Does anyone have any suggestions for tuning things, and is there a feasible upgrade for the graphics card if such a thing would make a difference? I have recently started working with 21MB RAW files, which I realise isn't helping. Any thoughts would be appreciated.
Matt.
I just added a gorgeous 24" LCD to my MBP setup (the G5 is not happy). The answer to your question is yes. Just go into Display Preferences and drag the menu bar over to the 24"; this will make the 24" the primary display and the MBP the secondary when connected.
-
Performance question about 11.1.2 forms at runtime
hi all,
Currently we are investigating a forms/reports migration from 10 to 11.
Initially we were using v. 11.1.1.4 as the baseline for the migration. Now we are looking at 11.1.2.
We have the impression that the performance has decreased significantly between these two releases.
To give an example:
A wizard screen contains an image alongside a number of items to enter details. In 11.1.1.4 this screen shows up immediately. In 11.1.2 you see the image rolling out on the canvas whilst the properties of the items seem to be set during this event.
I saw that a number of features were added to be able to tune performance which ... need processing too.
I get the impression that a big number of events are communicated over the network during the 'build' of the client-side view of the screen. If I recall correctly, during the migration from 6 to 9, events were bundled for transmission over the network so that delays couldn't come from network roundtrips. I have the impression that this has been reversed, and things are communicated between the client and server as they arrive rather than being bundled.
My questions are:
- is anyone out there experiencing the same kind of behaviour?
- if so, is there some kind of property(ies) that exist to control the behaviour and improve performance?
- are there properties for performance monitoring that are set but which cause the slowness as a kind of side effect, and can maybe be unset?
Your feedback will be dearly appreciated,
Greetings,
Jan.
The profile can't be changed although I suspect if there was an issue then banding the line would be something they could utilise if you were happy to do so.
It's all theoretical right now until you get the service installed. Don't forget there's over 600000 customers now on FTTC and only a very small percentage of them have faults. It might seem like lots looking on this forum but that's only because forums are where people tend to come to complain.
Controlfile on ASM performance question
Seeing Control File Enqueue performance spikes. Considerations are to move the control file to a separate diskgroup (needs an outage), or to add some disks (from different LUNs; I prefer this approach) to the same disk group. It seems like a slow disk is causing this issue...
2nd question: can the snapshot controlfile be placed on ASM storage?
Following points may help:
- Separating the control file to another diskgroup may make things even worse if the total number of disks is insufficient in the new disk group.
- Those control file contention issues usually have nothing to do with the storage throughput you have, but with the number of operations requiring different levels of exclusion on the control files.
- Since multiple copies of the controlfile are updated concurrently, a possible problem is sometimes that the secondary copy of the controlfile is slower than the other. Please check that this is not the issue (different tiers of storage may cause such problems).
Regards,
Husnu Sensoy -
Editing stills with motion effects, performance questions.
I am editing a video in FCE that consists solely of still photos.
I am creating motion effects (pans and pullbacks, etc) and dissolve
transitions, and overlaying titles. It will be played back on DVD
on a 16:9 monitor (standard DVD, not Blu-ray hi-def). Some questions:
What is the FCE best setup to use for best image quality: DV-NTSC?
DV-NTSC Anamorphic? or is it HDV-1080i or 720p30 even though it
won't be played back as hi-def?
How do I best avoid the squiggly line problem with pan moves, etc.?
On my G-5, 2gb RAM, single processor machine I seem to be having
performance problems with playback: slow to render, dropping frames, etc
Thanks for any help!
Excellent summary MacDLS, thanks for the contribution.
A lot of the photos I've taken on my camera are 3072 X 2304 (resolution 314) .jpegs.
I've heard it said that jpegs aren't the best format for Motion, since they're a compressed format.
If you're happy with the jpegs, Motion will be, too.
My typical project could either be 1280 X 720 or SD. I like the photo to be a lot bigger than the
canvas size, so I have room to do crops and grows, and the like. Is there a maximum dimension
that I should be working with?
Yes and no. Your originals are 7,000,000 pixels. Your video working space only displays about 950,000 pixels at any single instant.
At that project size, your stills are almost 700% larger than the frame. This will tax any system as you add more stills. 150% is more realistic in terms of processing overhead, and I try to only import HUGE images that I know are going to be tightly cropped by zooming in. You need to understand that a 1300x800 section of your original is as far as you can zoom in; at that point the pixels will be at 100% size. If you zoom in further, all you get are bigger pixels. The trade-off you make is that if you zoom way out on your source image, you've thrown away 75% of its content to scale it to fit the video format; you lose much, much more if you go to SD.
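The pixel arithmetic above is easy to check (rounded figures from the post; the exact numbers below are mine):

```python
# Checking the still-vs-frame pixel counts quoted above.
still_w, still_h = 3072, 2304      # camera jpeg dimensions
proj_w, proj_h = 1280, 720         # 720p project frame

still_px = still_w * still_h
proj_px = proj_w * proj_h
print(still_px)                    # 7077888  (the "7,000,000 pixels")
print(proj_px)                     # 921600   (roughly the "950,000 pixels")
print(round(still_px / proj_px * 100))  # 768: the still covers ~7.7x the
                                        # frame area ("almost 700% larger")
```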
Finally, the manual says that d.p.i doesn't matter in Motion, so does this mean that it's worth
actually exporting my 300 dpi photos to 72 dpi before working with them in Motion?
Don't confuse DPI with resolution. Your video screen will only show about 900,000 pixels in HD and about 350,000 pixels in SD, regardless of how many pixels there are in your original.
bogiesan -
9 shared objects performance question
I have 9 shared objects, 8 of which contain dynamic data which I cannot really consolidate into a single shared object because the shared object often has to be cleared. My question is: what performance issues will I experience with this number of shared objects? I may be wrong in thinking that 9 shared objects is a lot. Anybody with any experience using multiple shared objects, please respond.
I've used many more than 9 SO's in an application without issue. I suppose what it really comes down to is how many clients are connected to those SO's and how often each one is being updated.
-
Import: performance question
Hi, what is the difference between these statements for an application in terms of performance?
import java.io.*;
and
import java.io.FileInputStream;
import java.io.FileOutputStream;
import java.io.IOException;
Which one is faster for execution?
Neither. Search the forums or web for the countless answers to the same question.
-
Functions slowing down performance question
Hey there.
I've got a query that really slogs. This query calls quite a few functions, and there's no question that some of the work that needs to be done simply takes time.
However, someone has adamantly told me that using functions slows down the query compared to the same code in the base SQL.
I find this hard to believe - that the exact same code, whether well written or not, would be much faster in the base view than having a view call the functions.
Is it correct that functions kill performance?
Thanks for any advice.
Russ
There is the performance impact of context switching between SQL and PL/SQL engines. Pure SQL is always faster.
SQL> create or replace function f (n number) return number as
2 begin
3 return n + 1;
4 end;
5 /
Function created.
SQL> set timing on
SQL> select sum(f(level)) from dual
2 connect by level <= 1000000;
SUM(F(LEVEL))
5.0000E+11
Elapsed: 00:00:07.06
SQL> select sum(level + 1) from dual
2 connect by level <= 1000000;
SUM(LEVEL+1)
5.0000E+11
Elapsed: 00:00:01.09 -
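The same effect is easy to reproduce outside Oracle. A rough Python analogy (my illustration, not Oracle internals): adding a function call per row introduces fixed per-call overhead, much as each PL/SQL call from SQL costs a context switch.

```python
import timeit

def f(n):
    # stands in for the PL/SQL function above: return n + 1
    return n + 1

n_rows = 1_000_000
via_function = timeit.timeit(lambda: sum(f(i) for i in range(n_rows)), number=1)
inline       = timeit.timeit(lambda: sum(i + 1 for i in range(n_rows)), number=1)
print(f"per-row function call: {via_function:.3f}s, inline: {inline:.3f}s")
```

The inline form comes out faster; the gap is pure call overhead, mirroring the 7s-vs-1s SQL timings above.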
SQL performance question (between clause)
Hello,
I'm new to SQL tuning and bumped into the following performance problem:
Situation:
--Table 1
CREATE TABLE GGS (
  CHROM_ID NUMBER(2),
  START_POS NUMBER(10),
  TAG VARCHAR2(3 CHAR)
);
CREATE INDEX GGS_IDX ON GGS
  (CHROM_ID, START_POS);
--Table 2
CREATE TABLE LEL (
  CHROM_ID NUMBER(2),
  START_POS NUMBER(10),
  TAG VARCHAR2(3 CHAR)
);
CREATE INDEX LEL_IDX ON LEL
  (CHROM_ID, START_POS);
--Table 3
CREATE TABLE PGD (
  CHROM_ID NUMBER(2),
  START_POS NUMBER(10),
  TAG VARCHAR2(3 CHAR)
);
CREATE INDEX PGD_IDX ON PGD
  (CHROM_ID, START_POS);
For these 3 tables & 3 indexes the statistics are gathered.
I'm issuing the following SQL statements:
select t1.tag, t1.chrom_id, t1.start_pos
from LEL t1
where exists
  (select 'x' from GGS t2
   where t2.chrom_id = t1.chrom_id
   and t2.start_pos = t1.start_pos + 9
   and exists
     (select 'x' from PGD t3
      where t3.chrom_id = t1.chrom_id
      and t3.start_pos = t1.start_pos + 18));
Execution Plan
| Id | Operation | Name | Rows | Bytes | Cost (%CPU)| Time |
| 0 | SELECT STATEMENT | | 1 | 27 | 3677 (5)| 00:00:45 |
| 1 | NESTED LOOPS SEMI | | 1 | 27 | 3677 (5)| 00:00:45 |
|* 2 | HASH JOIN | | 118 | 2242 | 3323 (6)| 00:00:40 |
| 3 | SORT UNIQUE | | 428K| 3348K| 257 (5)| 00:00:04 |
| 4 | TABLE ACCESS FULL| PGD | 428K| 3348K| 257 (5)| 00:00:04 |
| 5 | TABLE ACCESS FULL | LEL | 2399K| 25M| 1435 (5)| 00:00:18 |
|* 6 | INDEX RANGE SCAN | GGS_IDX | 1 | 8 | 3 (0)| 00:00:01 |
Predicate Information (identified by operation id):
2 - access("T3"."CHROM_ID"="T1"."CHROM_ID" AND
"T3"."START_POS"="T1"."START_POS"+18)
6 - access("T2"."CHROM_ID"="T1"."CHROM_ID" AND
"T2"."START_POS"="T1"."START_POS"+9)
select t1.tag, t1.chrom_id, t1.start_pos
from LEL t1
where exists
  (select 'x' from GGS t2
   where t2.chrom_id = t1.chrom_id
   and t2.start_pos between t1.start_pos - 25 and t1.start_pos + 25
   and exists
     (select 'x' from PGD t3
      where t3.chrom_id = t1.chrom_id
      and t3.start_pos between t1.start_pos - 25 and t1.start_pos + 25));
Execution Plan
| Id | Operation | Name | Rows | Bytes |TempSpc| Cost (%CPU)| Time |
| 0 | SELECT STATEMENT | | 15 | 405 | | 5723 (4)| 00:01:09 |
|* 1 | HASH JOIN SEMI | | 15 | 405 | | 5723 (4)| 00:01:09 |
|* 2 | HASH JOIN RIGHT SEMI| | 5998 | 111K| 8376K| 4788 (4)| 00:00:58 |
| 3 | TABLE ACCESS FULL | PGD | 428K| 3348K| | 257 (5)| 00:00:04 |
| 4 | TABLE ACCESS FULL | LEL | 2399K| 25M| | 1435 (5)| 00:00:18 |
| 5 | TABLE ACCESS FULL | GGS | 1531K| 11M| | 913 (5)| 00:00:11 |
Predicate Information (identified by operation id):
1 - access("T2"."CHROM_ID"="T1"."CHROM_ID")
filter("T2"."START_POS">="T1"."START_POS"-25 AND
"T2"."START_POS"<="T1"."START_POS"+25)
2 - access("T3"."CHROM_ID"="T1"."CHROM_ID")
filter("T3"."START_POS">="T1"."START_POS"-25 AND
"T3"."START_POS"<="T1"."START_POS"+25)
The first query runs fast, a few seconds. The latter runs for ages. Any idea how I could get better performance on the second query? How come the predicted run times for the two queries are not that different, 00:00:45 for query 1 versus 00:01:09 for query 2,
while in reality the difference is enormous? Or am I misinterpreting the execution plan output?
Kind Regards,
Gerben
The table data looks like:
CHROM_ID;START_POS;TAG
1;3001429;LEL
1;3001837;LEL
1;3003352;LEL
1;3007849;LEL
1;3008347;LEL
1;3009100;LEL
1;3010504;LEL
1;3016300;LEL
1;3018445;LEL
Hi Rob,
Since the PGA_AGGREGATE_TARGET is set, the AREASIZE parameters are automatically set. The PGA_AGGREGATE_TARGET is set to approximately 1.26Gb.
I reran the query to monitor the PGA memory usage. Most of the work areas were able to run in optimal mode, only a few in one-pass and none in multi-pass mode.
I know that our problematic query is not responsible for the one-pass work areas.
Do you notice something abnormal in the PGA statistics? Maybe something else is causing the bad query performance? Other init parameters I should look into?
There's also this discrepancy between the explain plan and the reality which is puzzling me...
Groet,
Gerben
SQL> SELECT * from v$pgastat;
NAME VALUE UNIT
aggregate PGA target parameter 1321439232 bytes
aggregate PGA auto target 1152248832 bytes
global memory bound 132136960 bytes
total PGA inuse 41159680 bytes
total PGA allocated 95112192 bytes
maximum PGA allocated 253027328 bytes
total freeable PGA memory 12713984 bytes
process count 20
max processes count 22
PGA memory freed back to OS 1026097152 bytes
total PGA used for auto workareas 0 bytes
maximum PGA used for auto workareas 148202496 bytes
total PGA used for manual workareas 0 bytes
maximum PGA used for manual workareas 536576 bytes
over allocation count 0
bytes processed 1795721216 bytes
extra bytes read/written 657868800 bytes
cache hit percentage 73.18 percent
recompute count (total) 7982
SQL> SELECT optimal_count, round(optimal_count*100/total, 2) optimal_perc,
onepass_count, round(onepass_count*100/total, 2) onepass_perc,
multipass_count, round(multipass_count*100/total, 2) multipass_perc
FROM
(SELECT decode(sum(total_executions), 0, 1, sum(total_executions)) total,
sum(OPTIMAL_EXECUTIONS) optimal_count,
sum(ONEPASS_EXECUTIONS) onepass_count,
sum(MULTIPASSES_EXECUTIONS) multipass_count
FROM v$sql_workarea_histogram
WHERE low_optimal_size > 64*1024);
OPTIMAL_COUNT OPTIMAL_PERC ONEPASS_COUNT ONEPASS_PERC MULTIPASS_COUNT MULTIPASS_PERC
238 96.75 8 3.25 0 0
SQL> SELECT LOW_OPTIMAL_SIZE/1024 low_kb,
(HIGH_OPTIMAL_SIZE+1)/1024 high_kb,
OPTIMAL_EXECUTIONS, ONEPASS_EXECUTIONS, MULTIPASSES_EXECUTIONS
FROM V$SQL_WORKAREA_HISTOGRAM
WHERE TOTAL_EXECUTIONS != 0;
LOW_KB HIGH_KB OPTIMAL_EXECUTIONS ONEPASS_EXECUTIONS MULTIPASSES_EXECUTIONS
2 4 28661 0 0
64 128 27 0 0
128 256 2 0 0
256 512 5 0 0
512 1024 208 0 0
1024 2048 1 0 0
2048 4096 0 2 0
4096 8192 10 0 0
8192 16384 5 0 0
65536 131072 6 6 0
131072 262144 1 0 0
SQL> SELECT name profile, cnt, decode(total, 0, 0, round(cnt*100/total)) percentage
FROM (SELECT name, value cnt, (sum(value) over ()) total
FROM V$SYSSTAT
WHERE name like 'workarea exec%');
PROFILE CNT PERCENTAGE
workarea executions - optimal 28930 100
workarea executions - onepass 8 0
workarea executions - multipass 0 0 -
PL/SQL performance questions
Hi,
I am responsible for a large, computation-intensive PL/SQL program that performs some batch processing on a large number of records.
I am trying to improve the performance of this program and have a couple of questions that I am hoping this forum can answer.
I am running Oracle 11.1.0.7 on Windows.
1. How does compiling with DEBUG information affect performance?
I found that my program units (packages, procedures, object types, etc) run significantly slower if they are compiled with debug information
I am trying to understand why this is so. Does debug information instrument the code and result in more code that needs to be executed?
Does adding debug information prevent compiler optimizations? both?
The reason I ask this question is to understand if it is valid to compare the performance of two different implementations if they are both compiled with debug information. For example, if one approach is 20% faster when compiled with debug information, is it safe to assume that it will also be 20% faster in production (without debug information)? Or, as I expect, does the presence of debug information change the performance profile of the code?
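One way to see what the compiler actually did is to query the per-unit compile settings. As I understand it, compiling with DEBUG both instruments the code and lowers the effective optimization level (and forces interpreted rather than native code), so two debug builds may well not predict the production ratio. A sketch against the standard USER_PLSQL_OBJECT_SETTINGS dictionary view (MY_PKG is a placeholder name):

```sql
-- Check how each unit was actually compiled.
SELECT name, type, plsql_debug, plsql_optimize_level, plsql_code_type
FROM   user_plsql_object_settings
ORDER  BY name, type;

-- Recompile one unit both ways to compare settings side by side:
ALTER PACKAGE my_pkg COMPILE DEBUG;
ALTER SESSION SET plsql_optimize_level = 2;
ALTER PACKAGE my_pkg COMPILE;
```

Comparing PLSQL_OPTIMIZE_LEVEL before and after the DEBUG compile shows directly whether optimization was sacrificed.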
2. What is the best way to measure how long a PL/SQL program takes?
I want to compare two approaches, such as using a VARRAY vs. a TABLE variable. I have been doing this by creating two test procedures that perform the same task using the two approaches I want to evaluate.
How should I measure the time an approach takes so that it is not affected by other activity on my system? I have tried using CPU time (dbms_utility.get_cpu_time) and elapsed time. CPU time seems to be much
more consistent between runs, however, I am concerned that CPU time might not reflect all the time the process takes.
(I am aware of the profiler and have used that as well, however, I am at the point where profiling is providing diminishing returns).
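For what it's worth, one simple harness is to run each approach in a loop and record both elapsed and CPU time around it; dbms_utility.get_time and dbms_utility.get_cpu_time both return hundredths of a second. A sketch, where my_test_proc is a placeholder for the code under test:

```sql
DECLARE
  l_t0   PLS_INTEGER;
  l_cpu0 PLS_INTEGER;
BEGIN
  l_t0   := dbms_utility.get_time;      -- elapsed time, in centiseconds
  l_cpu0 := dbms_utility.get_cpu_time;  -- session CPU time, in centiseconds
  FOR i IN 1 .. 100 LOOP
    my_test_proc;  -- placeholder for the approach being measured
  END LOOP;
  dbms_output.put_line('elapsed cs: ' || (dbms_utility.get_time - l_t0));
  dbms_output.put_line('cpu cs    : ' || (dbms_utility.get_cpu_time - l_cpu0));
END;
/
```

Reporting both numbers per run lets you see exactly how much of the elapsed time is not CPU (waits, contention from other activity), rather than having to choose one metric.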
3. I tried recompiling my entire system with native compilation but, to my great surprise, I did not notice any measurable difference in performance!
I compiled all specification and bodies in all schemas to be native compiled. Can anyone explain why native compilation would not result in a significant performance improvement on a process that seems to be CPU-bound when it is running? Are there any other settings or additional steps that need to be performed for native compilation to be effective?
Thank you,
Eric

Yes, debug must add instrumentation; I think that is the point of it. Whether it lowers the compiler optimisation level I don't know (I haven't read anywhere that it does), but surely if you are stepping through code manually to debug it then you don't care.
I don't know of a way to measure pure CPU time independently of other system activity. One common approach is to write a test program that repeats your sample code a large enough number of times for a pattern to emerge. To find how much time individual components contribute, dbms_profiler can be quite helpful (most conveniently via a button press in IDEs such as PL/SQL Developer, but it can also be invoked from the command line.)
It is strange that native compilation appears to make no difference. Are you sure everything is actually using it? e.g. is it shown as natively compiled in ALL_PLSQL_OBJECT_SETTINGS?
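A quick check against the data dictionary will show whether each unit really ended up NATIVE rather than INTERPRETED (note that, as far as I know, units compiled with DEBUG are always interpreted, so a debug build silently defeats native compilation). A sketch using the standard ALL_PLSQL_OBJECT_SETTINGS view:

```sql
-- Count units by actual code type per schema.
SELECT owner, plsql_code_type, COUNT(*) AS units
FROM   all_plsql_object_settings
WHERE  owner NOT IN ('SYS', 'SYSTEM')
GROUP  BY owner, plsql_code_type
ORDER  BY owner, plsql_code_type;

-- To natively compile units going forward, before recompiling:
ALTER SESSION SET plsql_code_type = 'NATIVE';
```

If everything shows NATIVE and there is still no gain, the time may be going into SQL execution rather than PL/SQL bytecode, where native compilation cannot help.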
I would not expect a PL/SQL VARRAY variable to perform any differently to a nested table one - I expect they have an identical internal implementation. The difference is that VARRAYs have much reduced functionality and a normally unhelpful limit setting.
Edited by: William Robertson on Nov 6, 2008 11:49 PM
Exporting to OMF for audio and After Effects + delay in performance question
Hello
I have two questions.
First, I have a problem with the performance of FCP 5. I'm using a 500 GB FireWire 800 external drive. Every time I hit the play button there is a delay of 2 seconds before my sequence plays. I'm not sure why this is happening; I have a very basic sequence with one layer.
Second question: I'm trying to make FCP popular in my country, but it is very hard since we work with OMF in both Pro Tools and After Effects. In Avid we exported OMF with embedded media, including audio. I know that there is an option to export OMF with audio from FCP, but Pro Tools has a hard time opening it. As for video there is no choice at all; what can we do?
Thank you

Anyone?