Re: Confidence of Cardinality Estimates (CBO)
Iordan Iotzov wrote:
Teradata has the concept of confidence of cardinality estimates -
http://developer.teradata.com/database/articles/can-we-speak-confidentially-exposing-explain-confidence-levels
In short, the optimizer tries to quantify the amount of “guesswork” that is included in a cardinality estimate.
Is there anything similar in Oracle? I am looking for anything - supported or not!
Nice idea, but no.
(Although, internally, there are points where it does know that it is guessing (which is where optimizer_dynamic_sampling at level 3 comes into play), or where it is aware that predicate independence is not a realistic assumption (which is where level 4 comes into play).)
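The predicate-independence problem is easy to demonstrate outside the database: the standard model multiplies individual selectivities as if predicates were unrelated. A toy Python sketch (purely synthetic data, nothing Oracle-specific) of why that underestimates when columns are correlated:

```python
# Synthetic rows where state and city are perfectly correlated.
rows = [("CA", "San Francisco")] * 500 + [("NY", "New York")] * 500

def actual_card(rows, state, city):
    # True cardinality of (state = X AND city = Y).
    return sum(1 for s, c in rows if s == state and c == city)

def independent_estimate(rows, state, city):
    # The independence assumption: multiply the per-column selectivities.
    n = len(rows)
    sel_state = sum(1 for s, _ in rows if s == state) / n
    sel_city = sum(1 for _, c in rows if c == city) / n
    return n * sel_state * sel_city

# Real answer is 500 rows; the independence model predicts 1000 * 0.5 * 0.5.
print(actual_card(rows, "CA", "San Francisco"))           # 500
print(independent_estimate(rows, "CA", "San Francisco"))  # 250.0
```

This is the gap that sampling the combined predicates (rather than trusting column-level statistics) is meant to close.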
Regards
Jonathan Lewis
Hi Jonathan,
In my limited 12c experience, the presence or absence of adaptive execution plans (STATISTICS COLLECTOR) is a proxy for cardinality confidence in only a few situations.
That is, Oracle 12c generates an adaptive execution plan for most joins, even for joins in simple queries that pose no cardinality-confidence challenges.
From what I have seen so far, the only situation where the CBO is confident enough about its estimates to skip the adaptive execution plan step (STATISTICS COLLECTOR) is a single-value primary key/unique index scan.
One possible explanation is that the cost of adding an adaptive execution plan step is so low that Oracle adds it almost indiscriminately.
I ran some tests ( http://wp.me/p1DHW2-7o ) and came across some interesting results:
-> The cost of an adaptive execution plan is actually negative for nested loops. That is, turning adaptive execution plans off would make a nested loop run slower. The difference is small, but it seems to be statistically significant.
-> The cost of an adaptive execution plan is negligible for hash joins. That is, no statistically significant difference was found between HJ queries using adaptive execution plans and those that do not.
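For what it's worth, a "statistically significant" timing difference of this kind can be checked with something as simple as Welch's t statistic on the two run-time samples. The sketch below uses made-up timings, not the actual measurements from those tests:

```python
# Welch's t statistic for two independent samples (hypothetical timings).
from statistics import mean, variance

def welch_t(a, b):
    va, vb = variance(a), variance(b)
    return (mean(a) - mean(b)) / ((va / len(a) + vb / len(b)) ** 0.5)

adaptive_on  = [1.02, 1.01, 1.03, 1.00, 1.02]  # seconds, made up
adaptive_off = [1.05, 1.06, 1.04, 1.07, 1.05]  # seconds, made up

t = welch_t(adaptive_on, adaptive_off)
# A |t| well above ~2 suggests the difference is unlikely to be noise
# (a rough rule of thumb, not a substitute for a proper test).
print(abs(t) > 2)
```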
Regards,
Iordan Iotzov
Similar Messages
-
How can I improve optimizers poor cardinality estimates?
Hi all,
I have a query that is taking too long, and it looks like the cardinality estimates are way off. It seems particularly bad with the hash joins,
and I don't know how to get the optimizer to produce a better estimate. The tables in the query were last analyzed a couple of weeks ago
using dbms_stats with DBMS_STATS.AUTO_SAMPLE_SIZE and FOR ALL COLUMNS SIZE AUTO, but looking at dba_tab_col_statistics there is only a frequency histogram
on a column not used in the query. The data hasn't really changed much since the last collection.
This is 11.2.0.2.1 on linux x86_64.
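For context, the textbook single-column equi-join cardinality model the CBO starts from is roughly card(R) x card(S) / max(NDV(R.c), NDV(S.c)), with filter selectivities applied to the inputs first. A sketch with hypothetical numbers (not taken from the plans below) shows how a wrong filter selectivity or NDV cascades into the join estimate:

```python
# Textbook equi-join cardinality model (all figures hypothetical).
def join_cardinality(card_r, card_s, ndv_r, ndv_s):
    # |R| * |S| / max(NDV of the join column on each side)
    return card_r * card_s / max(ndv_r, ndv_s)

# e.g. a 1,206K-row table assumed filtered to 10% of its rows,
# joined to a 118K-row table on a column with ~1M distinct values:
filtered_a = 1206000 * 0.10   # assumed 10% filter selectivity
hts = 118000
est = join_cardinality(filtered_a, hts, ndv_r=1000000, ndv_s=118000)
print(round(est))
```

When the filter selectivity or the NDV is misjudged, the error multiplies through every join above it, which is why a six-table join can end up orders of magnitude off.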
create table test2 as
select /*+ GATHER_PLAN_STATISTICS */ DISTINCT
hts.resource_id resource_id,
a.start_time start_date,
hta.attribute12 alias1
FROM
hxc_time_attribute_usages htu,
hxc_time_attributes hta,
hxc_time_building_blocks a,
hxc_time_building_blocks b,
hxc_time_building_blocks c,
hxc_timecard_summary hts
WHERE
htu.time_attribute_id = hta.time_attribute_id
AND hta.attribute_category LIKE 'ELEMENT%'
AND hta.attribute12 IS NOT NULL
AND htu.time_building_block_id = c.time_building_block_id
AND a.time_building_block_id = b.parent_building_block_id
AND b.time_building_block_id = c.parent_building_block_id
AND c.time_building_block_id = htu.time_building_block_id
AND a.scope = 'TIMECARD'
AND b.scope = 'DAY'
AND c.scope ='DETAIL'
AND hts.timecard_id = a.time_building_block_id
Plan hash value: 1730726592
| Id | Operation | Name | Starts | E-Rows | A-Rows | A-Time | Buffers | Reads | Writes | OMem | 1Mem | Used-Mem | Used-Tmp|
| 0 | CREATE TABLE STATEMENT | | 1 | | 0 |00:40:13.79 | 621K| 3182K| 2671K| | | | |
| 1 | LOAD AS SELECT | | 1 | | 0 |00:40:13.79 | 621K| 3182K| 2671K| 529K| 529K| 529K (0)| |
| 2 | SORT UNIQUE | | 1 | 1205 | 170K|00:40:13.37 | 618K| 3182K| 2671K| 11M| 11M| 10M (0)| |
|* 3 | HASH JOIN | | 1 | 1205 | 135M|00:36:59.88 | 618K| 3182K| 2671K| 3325M| 63M| 371M (1)| 18M|
|* 4 | HASH JOIN | | 1 | 10829 | 143M|00:11:47.18 | 616K| 894K| 384K| 2047M| 32M| 539M (1)| 2748K|
|* 5 | HASH JOIN | | 1 | 9541 | 28M|00:06:43.60 | 500K| 448K| 54765 | 751M| 16M| 607M (1)| 456K|
|* 6 | HASH JOIN | | 1 | 8885 | 7561K|00:05:28.13 | 383K| 276K| 0 | 211M| 8846K| 278M (0)| |
|* 7 | HASH JOIN | | 1 | 21193 | 2689K|00:05:00.55 | 266K| 160K| 0 | 169M| 9302K| 201M (0)| |
|* 8 | TABLE ACCESS BY INDEX ROWID| HXC_TIME_ATTRIBUTES | 1 | 20971 | 2637K|00:04:23.04 | 209K| 103K| 0 | | | | |
|* 9 | INDEX RANGE SCAN | HXC_TIME_ATTRIBUTES_FK2 | 1 | 71213 | 2640K|00:01:25.09 | 15774 | 15764 | 0 | | | | |
| 10 | TABLE ACCESS FULL | HXC_TIME_ATTRIBUTE_USAGES | 1 | 8451K| 8849K|00:00:08.71 | 56898 | 56825 | 0 | | | | |
|* 11 | TABLE ACCESS FULL | HXC_TIME_BUILDING_BLOCKS | 1 | 1094K| 2703K|00:00:08.01 | 116K| 116K| 0 | | | | |
|* 12 | TABLE ACCESS FULL | HXC_TIME_BUILDING_BLOCKS | 1 | 1094K| 3025K|00:00:13.12 | 116K| 116K| 0 | | | | |
|* 13 | TABLE ACCESS FULL | HXC_TIME_BUILDING_BLOCKS | 1 | 1206K| 284K|00:00:19.24 | 116K| 116K| 0 | | | | |
| 14 | TABLE ACCESS FULL | HXC_TIMECARD_SUMMARY | 1 | 118K| 124K|00:00:05.15 | 2212 | 1183 | 0 | | | | |
Predicate Information (identified by operation id):
3 - access("HTS"."TIMECARD_ID"="A"."TIME_BUILDING_BLOCK_ID")
4 - access("A"."TIME_BUILDING_BLOCK_ID"="B"."PARENT_BUILDING_BLOCK_ID")
5 - access("B"."TIME_BUILDING_BLOCK_ID"="C"."PARENT_BUILDING_BLOCK_ID")
6 - access("HTU"."TIME_BUILDING_BLOCK_ID"="C"."TIME_BUILDING_BLOCK_ID")
7 - access("HTU"."TIME_ATTRIBUTE_ID"="HTA"."TIME_ATTRIBUTE_ID")
8 - filter("HTA"."ATTRIBUTE12" IS NOT NULL)
9 - access("HTA"."ATTRIBUTE_CATEGORY" LIKE 'ELEMENT%')
filter("HTA"."ATTRIBUTE_CATEGORY" LIKE 'ELEMENT%')
11 - filter(("C"."SCOPE"='DETAIL' AND "C"."PARENT_BUILDING_BLOCK_ID" IS NOT NULL))
12 - filter(("B"."SCOPE"='DAY' AND "B"."PARENT_BUILDING_BLOCK_ID" IS NOT NULL))
13 - filter("A"."SCOPE"='TIMECARD')
I can get a slight improvement if I set optimizer_dynamic_sampling=4:
Plan hash value: 2768898101
| Id | Operation | Name | Starts | E-Rows | A-Rows | A-Time | Buffers | Reads | Writes | OMem | 1Mem | Used-Mem | Used-Tmp|
| 0 | CREATE TABLE STATEMENT | | 1 | | 0 |00:20:21.47 | 621K| 760K| 355K| | | | |
| 1 | LOAD AS SELECT | | 1 | | 0 |00:20:21.47 | 621K| 760K| 355K| 529K| 529K| 529K (0)| |
| 2 | SORT UNIQUE | | 1 | 433K| 170K|00:20:21.07 | 618K| 760K| 354K| 11M| 11M| 10M (0)| |
|* 3 | HASH JOIN | | 1 | 433K| 135M|00:17:36.89 | 618K| 760K| 354K| 171M| 9261K| 233M (0)| |
|* 4 | TABLE ACCESS BY INDEX ROWID| HXC_TIME_ATTRIBUTES | 1 | 2509K| 2637K|00:00:06.33 | 209K| 0 | 0 | | | | |
|* 5 | INDEX RANGE SCAN | HXC_TIME_ATTRIBUTES_FK2 | 1 | 71213 | 2640K|00:00:02.17 | 15767 | 0 | 0 | | | | |
|* 6 | HASH JOIN | | 1 | 1444K| 272M|00:07:38.99 | 409K| 760K| 354K| 2047M| 32M| 546M (1)| 2717K|
|* 7 | HASH JOIN | | 1 | 446K| 30M|00:01:43.68 | 352K| 399K| 50235 | 639M| 17M| 694M (1)| 418K|
|* 8 | HASH JOIN | | 1 | 377K| 8329K|00:00:41.18 | 235K| 232K| 0 | 17M| 2383K| 27M (0)| |
|* 9 | HASH JOIN | | 1 | 134K| 277K|00:00:23.28 | 118K| 116K| 0 | 4912K| 1573K| 7759K (0)| |
| 10 | TABLE ACCESS FULL | HXC_TIMECARD_SUMMARY | 1 | 118K| 124K|00:00:00.08 | 2212 | 0 | 0 | | | | |
|* 11 | TABLE ACCESS FULL | HXC_TIME_BUILDING_BLOCKS | 1 | 1206K| 284K|00:00:22.13 | 116K| 116K| 0 | | | | |
|* 12 | TABLE ACCESS FULL | HXC_TIME_BUILDING_BLOCKS | 1 | 2988K| 3025K|00:00:05.05 | 116K| 116K| 0 | | | | |
|* 13 | TABLE ACCESS FULL | HXC_TIME_BUILDING_BLOCKS | 1 | 2514K| 2703K|00:00:03.65 | 116K| 116K| 0 | | | | |
| 14 | TABLE ACCESS FULL | HXC_TIME_ATTRIBUTE_USAGES | 1 | 8451K| 8849K|00:00:08.23 | 56898 | 56818 | 0 | | | | |
Predicate Information (identified by operation id):
3 - access("HTU"."TIME_ATTRIBUTE_ID"="HTA"."TIME_ATTRIBUTE_ID")
4 - filter("HTA"."ATTRIBUTE12" IS NOT NULL)
5 - access("HTA"."ATTRIBUTE_CATEGORY" LIKE 'ELEMENT%')
filter("HTA"."ATTRIBUTE_CATEGORY" LIKE 'ELEMENT%')
6 - access("HTU"."TIME_BUILDING_BLOCK_ID"="C"."TIME_BUILDING_BLOCK_ID")
7 - access("B"."TIME_BUILDING_BLOCK_ID"="C"."PARENT_BUILDING_BLOCK_ID")
8 - access("A"."TIME_BUILDING_BLOCK_ID"="B"."PARENT_BUILDING_BLOCK_ID")
9 - access("HTS"."TIMECARD_ID"="A"."TIME_BUILDING_BLOCK_ID")
11 - filter("A"."SCOPE"='TIMECARD')
12 - filter(("B"."SCOPE"='DAY' AND "B"."PARENT_BUILDING_BLOCK_ID" IS NOT NULL))
13 - filter(("C"."SCOPE"='DETAIL' AND "C"."PARENT_BUILDING_BLOCK_ID" IS NOT NULL))
Note
- dynamic sampling used for this statement (level=4)
But I still have a large difference between the Estimated and Actual rows. What can I do to help the optimizer get a better estimate?
Hi Dom,
Thank you for your input it is always appreciated!
I tried running with a manual workarea and a sort_area_size of 2000000000, but the result was worse than before.
Plan hash value: 1730726592
| Id | Operation | Name | Starts | E-Rows | A-Rows | A-Time | Buffers | Reads | Writes | OMem | 1Mem | Used-Mem | Used-Tmp|
| 0 | CREATE TABLE STATEMENT | | 1 | | 0 |00:36:15.63 | 1085K| 2926K| 2475K| | | | |
| 1 | LOAD AS SELECT | | 1 | | 0 |00:36:15.63 | 1085K| 2926K| 2475K| 529K| 529K| 529K (0)| |
| 2 | SORT UNIQUE | | 1 | 1205 | 170K|00:36:14.83 | 1083K| 2926K| 2475K| 11M| 11M| 10M (0)| |
|* 3 | HASH JOIN | | 1 | 1205 | 135M|00:32:54.89 | 1083K| 2926K| 2475K| 3325M| 63M| 2048M (1)| 19M|
|* 4 | HASH JOIN | | 1 | 10829 | 143M|01:06:20.59 | 651K| 623K| 188K| 2047M| 32M| 2048M (1)| 1681K|
|* 5 | HASH JOIN | | 1 | 9541 | 28M|00:05:51.56 | 500K| 317K| 0 | 751M| 16M| 1088M (0)| |
|* 6 | HASH JOIN | | 1 | 8885 | 7561K|00:03:03.10 | 383K| 201K| 0 | 211M| 8846K| 379M (0)| |
|* 7 | HASH JOIN | | 1 | 21193 | 2689K|00:02:21.61 | 266K| 84965 | 0 | 169M| 9302K| 299M (0)| |
|* 8 | TABLE ACCESS BY INDEX ROWID| HXC_TIME_ATTRIBUTES | 1 | 20971 | 2637K|00:01:45.51 | 209K| 28147 | 0 | | | | |
|* 9 | INDEX RANGE SCAN | HXC_TIME_ATTRIBUTES_FK2 | 1 | 71213 | 2640K|00:00:43.17 | 15769 | 6921 | 0 | | | | |
| 10 | TABLE ACCESS FULL | HXC_TIME_ATTRIBUTE_USAGES | 1 | 8451K| 8849K|00:00:08.75 | 56898 | 56818 | 0 | | | | |
|* 11 | TABLE ACCESS FULL | HXC_TIME_BUILDING_BLOCKS | 1 | 1094K| 2703K|00:00:22.74 | 116K| 116K| 0 | | | | |
|* 12 | TABLE ACCESS FULL | HXC_TIME_BUILDING_BLOCKS | 1 | 1094K| 3025K|00:00:07.85 | 116K| 116K| 0 | | | | |
|* 13 | TABLE ACCESS FULL | HXC_TIME_BUILDING_BLOCKS | 1 | 1206K| 284K|00:00:07.42 | 117K| 116K| 0 | | | | |
| 14 | TABLE ACCESS FULL | HXC_TIMECARD_SUMMARY | 1 | 118K| 124K|00:00:01.57 | 2250 | 282 | 0 | | | | |
Query Block Name / Object Alias (identified by operation id):
1 - SEL$1
8 - SEL$1 / HTA@SEL$1
9 - SEL$1 / HTA@SEL$1
10 - SEL$1 / HTU@SEL$1
11 - SEL$1 / C@SEL$1
12 - SEL$1 / B@SEL$1
13 - SEL$1 / A@SEL$1
14 - SEL$1 / HTS@SEL$1
Predicate Information (identified by operation id):
3 - access("HTS"."TIMECARD_ID"="A"."TIME_BUILDING_BLOCK_ID")
4 - access("A"."TIME_BUILDING_BLOCK_ID"="B"."PARENT_BUILDING_BLOCK_ID")
5 - access("B"."TIME_BUILDING_BLOCK_ID"="C"."PARENT_BUILDING_BLOCK_ID")
6 - access("HTU"."TIME_BUILDING_BLOCK_ID"="C"."TIME_BUILDING_BLOCK_ID")
7 - access("HTU"."TIME_ATTRIBUTE_ID"="HTA"."TIME_ATTRIBUTE_ID")
8 - filter("HTA"."ATTRIBUTE12" IS NOT NULL)
9 - access("HTA"."ATTRIBUTE_CATEGORY" LIKE 'ELEMENT%')
filter("HTA"."ATTRIBUTE_CATEGORY" LIKE 'ELEMENT%')
11 - filter(("C"."SCOPE"='DETAIL' AND "C"."PARENT_BUILDING_BLOCK_ID" IS NOT NULL))
12 - filter(("B"."SCOPE"='DAY' AND "B"."PARENT_BUILDING_BLOCK_ID" IS NOT NULL))
13 - filter("A"."SCOPE"='TIMECARD')
66 rows selected.
So I tried setting some cardinality hints but, again, it doesn't seem to have helped.
select /*+ GATHER_PLAN_STATISTICS
CARDINALITY(A@SEL$1 284000)
CARDINALITY(HTA@SEL$1 2637000)
CARDINALITY(C@SEL$1 270000)
CARDINALITY(B@SEL$1 300000) */
DISTINCT
hts.resource_id resource_id,
a.start_time start_date,
hta.attribute12 alias1
FROM hxc_time_attribute_usages htu,
hxc_time_attributes hta,
hxc_time_building_blocks a,
hxc_time_building_blocks b,
hxc_time_building_blocks c,
hxc_timecard_summary hts
WHERE
htu.time_attribute_id =hta.time_attribute_id
AND hta.attribute_category LIKE 'ELEMENT%'
AND hta.attribute12 IS NOT NULL
AND htu.time_building_block_id = c.time_building_block_id
AND a.time_building_block_id = b.parent_building_block_id
AND b.time_building_block_id = c.parent_building_block_id
AND c.time_building_block_id = htu.time_building_block_id
AND a.scope = 'TIMECARD'
AND b.scope = 'DAY'
AND c.scope = 'DETAIL'
AND hts.timecard_id = a.time_building_block_id
Plan hash value: 1839838244
| Id | Operation | Name | Starts | E-Rows | A-Rows | A-Time | Buffers | Reads | Writes | OMem | 1Mem | Used-Mem | Used-Tmp|
| 0 | CREATE TABLE STATEMENT | | 1 | | 0 |00:18:00.46 | 1104K| 4415K| 3927K| | | | |
| 1 | LOAD AS SELECT | | 1 | | 0 |00:18:00.46 | 1104K| 4415K| 3927K| 529K| 529K| 529K (0)| |
| 2 | SORT UNIQUE | | 1 | 9678 | 170K|00:17:59.95 | 1101K| 4415K| 3926K| 11M| 11M| 10M (0)| |
|* 3 | HASH JOIN | | 1 | 9678 | 135M|01:02:43.31 | 1101K| 4415K| 3926K| 4318M| 98M| 305M (1)| 27M|
|* 4 | HASH JOIN | | 1 | 30691 | 272M|00:12:31.27 | 409K| 1009K| 602K| 2047M| 32M| 498M (1)| 2523K|
|* 5 | HASH JOIN | | 1 | 9485 | 30M|00:05:41.24 | 352K| 649K| 300K| 2047M| 33M| 461M (1)| 2355K|
|* 6 | HASH JOIN | | 1 | 22123 | 31M|00:01:56.55 | 349K| 367K| 17535 | 493M| 18M| 661M (1)| 146K|
|* 7 | HASH JOIN | | 1 | 79434 | 8102K|00:00:53.71 | 233K| 233K| 0 | 116M| 10M| 195M (0)| |
|* 8 | TABLE ACCESS FULL | HXC_TIME_BUILDING_BLOCKS | 1 | 270K| 2703K|00:00:24.01 | 116K| 116K| 0 | | |
|* 9 | TABLE ACCESS FULL | HXC_TIME_BUILDING_BLOCKS | 1 | 300K| 3025K|00:00:10.42 | 116K| 116K| 0 | | | | |
|* 10 | TABLE ACCESS FULL | HXC_TIME_BUILDING_BLOCKS | 1 | 284K| 284K|00:00:11.44 | 116K| 116K| 0 | | | | |
| 11 | TABLE ACCESS FULL | HXC_TIMECARD_SUMMARY | 1 | 118K| 124K|00:00:01.88 | 2212 | 256 | 0 | | | | |
| 12 | TABLE ACCESS FULL | HXC_TIME_ATTRIBUTE_USAGES | 1 | 8451K| 8849K|00:00:11.09 | 56898 | 56818 | 0 | | | | |
|* 13 | TABLE ACCESS BY INDEX ROWID| HXC_TIME_ATTRIBUTES | 1 | 2637K| 2637K|00:03:20.70 | 210K| 56613 | 0 | | | | |
|* 14 | INDEX RANGE SCAN | HXC_TIME_ATTRIBUTES_FK2 | 1 | 71213 | 2640K|00:01:34.62 | 16142 | 15527 | 0 | | | | |
Query Block Name / Object Alias (identified by operation id):
1 - SEL$1
8 - SEL$1 / C@SEL$1
9 - SEL$1 / B@SEL$1
10 - SEL$1 / A@SEL$1
11 - SEL$1 / HTS@SEL$1
12 - SEL$1 / HTU@SEL$1
13 - SEL$1 / HTA@SEL$1
14 - SEL$1 / HTA@SEL$1
Predicate Information (identified by operation id):
3 - access("HTU"."TIME_ATTRIBUTE_ID"="HTA"."TIME_ATTRIBUTE_ID")
4 - access("HTU"."TIME_BUILDING_BLOCK_ID"="C"."TIME_BUILDING_BLOCK_ID")
5 - access("HTS"."TIMECARD_ID"="A"."TIME_BUILDING_BLOCK_ID")
6 - access("A"."TIME_BUILDING_BLOCK_ID"="B"."PARENT_BUILDING_BLOCK_ID")
7 - access("B"."TIME_BUILDING_BLOCK_ID"="C"."PARENT_BUILDING_BLOCK_ID")
8 - filter(("C"."SCOPE"='DETAIL' AND "C"."PARENT_BUILDING_BLOCK_ID" IS NOT NULL))
9 - filter(("B"."SCOPE"='DAY' AND "B"."PARENT_BUILDING_BLOCK_ID" IS NOT NULL))
10 - filter("A"."SCOPE"='TIMECARD')
13 - filter("HTA"."ATTRIBUTE12" IS NOT NULL)
14 - access("HTA"."ATTRIBUTE_CATEGORY" LIKE 'ELEMENT%')
filter("HTA"."ATTRIBUTE_CATEGORY" LIKE 'ELEMENT%')
68 rows selected.
SQL>
What else do you think I should try? Or did I do the cardinality bit wrong, because I don't seem to be able to hint the HASH join, only the table scans?
Thanks -
Query performance issues - Poor cardinality estimate?
Hi,
I have a query which is taking far longer than estimated by the explain plan (estimate 1min, query still running after several hours).
Plan hash value: 3287246760
| Id | Operation | Name | Rows | Bytes | Cost (%CPU)| Time |
| 0 | SELECT STATEMENT | | 1 | 195 | 3795 (1)| 00:00:46 |
| 1 | VIEW | | 1 | 195 | 3795 (1)| 00:00:46 |
| 2 | WINDOW SORT | | 1 | 151 | 3795 (1)| 00:00:46 |
| 3 | VIEW | | 1 | 151 | 3794 (1)| 00:00:46 |
| 4 | SORT UNIQUE | | 1 | 147 | 3794 (1)| 00:00:46 |
| 5 | WINDOW BUFFER | | 1 | 147 | 3794 (1)| 00:00:46 |
| 6 | SORT GROUP BY PIVOT | | 1 | 147 | 3794 (1)| 00:00:46 |
| 7 | NESTED LOOPS | | | | | |
| 8 | NESTED LOOPS | | 1 | 147 | 3793 (1)| 00:00:46 |
| 9 | NESTED LOOPS | | 3 | 297 | 1503 (1)| 00:00:19 |
|* 10 | HASH JOIN | | 238 | 15470 | 75 (7)| 00:00:01 |
| 11 | MAT_VIEW ACCESS FULL | VENTILATION | 17994 | 404K| 35 (0)| 00:00:01 |
| 12 | VIEW | | 17994 | 738K| 39 (11)| 00:00:01 |
| 13 | SORT UNIQUE | | 17994 | 702K| 39 (11)| 00:00:01 |
| 14 | WINDOW SORT | | 17994 | 702K| 39 (11)| 00:00:01 |
|* 15 | VIEW | | 17994 | 702K| 37 (6)| 00:00:01 |
| 16 | WINDOW SORT | | 17994 | 632K| 37 (6)| 00:00:01 |
| 17 | MAT_VIEW ACCESS FULL | VENTILATION | 17994 | 632K| 35 (0)| 00:00:01 |
| 18 | INLIST ITERATOR | | | | | |
|* 19 | TABLE ACCESS BY INDEX ROWID| LABEVENTS | 1 | 34 | 6 (0)| 00:00:01 |
|* 20 | INDEX RANGE SCAN | LABEVENTS_O5 | 5 | | 3 (0)| 00:00:01 |
|* 21 | INDEX RANGE SCAN | CHARTEVENTS_O5 | 4937 | | 12 (0)| 00:00:01 |
|* 22 | TABLE ACCESS BY INDEX ROWID | CHARTEVENTS | 1 | 48 | 763 (0)| 00:00:10 |
Predicate Information (identified by operation id):
10 - access("ICUS"."SUBJECT_ID"="FVGT48H"."SUBJECT_ID" AND
SYS_EXTRACT_UTC("FVGT48H"."BEGIN_TIME")=SYS_EXTRACT_UTC("ICUS"."BEGIN_TIME"))
15 - filter((INTERNAL_FUNCTION("END_TIME")-INTERNAL_FUNCTION("BEGIN_TIME"))DAY(9) TO
SECOND(9)>INTERVAL'+02 00:00:00' DAY(2) TO SECOND(0))
19 - filter(SYS_EXTRACT_UTC("LE"."CHARTTIME")>=SYS_EXTRACT_UTC("FVGT48H"."BEGIN_TIME") AND
SYS_EXTRACT_UTC("LE"."CHARTTIME")<=SYS_EXTRACT_UTC("FVGT48H"."END_TIME"))
20 - access("ICUS"."ICUSTAY_ID"="LE"."ICUSTAY_ID" AND ("LE"."ITEMID"=50013 OR
"LE"."ITEMID"=50019))
filter("LE"."ICUSTAY_ID" IS NOT NULL)
21 - access("LE"."ICUSTAY_ID"="CE"."ICUSTAY_ID")
I tried removing the nested loops using the NO_USE_NL hints, which give the following plan:
| Id | Operation | Name | Rows | Bytes |TempSpc| Cost (%CPU)| Time |
| 0 | SELECT STATEMENT | | 1 | 195 | | 22789 (1)| 00:04:34 |
| 1 | VIEW | | 1 | 195 | | 22789 (1)| 00:04:34 |
| 2 | WINDOW SORT | | 1 | 151 | | 22789 (1)| 00:04:34 |
| 3 | VIEW | | 1 | 151 | | 22788 (1)| 00:04:34 |
| 4 | SORT UNIQUE | | 1 | 147 | | 22788 (1)| 00:04:34 |
| 5 | WINDOW BUFFER | | 1 | 147 | | 22788 (1)| 00:04:34 |
| 6 | SORT GROUP BY PIVOT | | 1 | 147 | | 22788 (1)| 00:04:34 |
|* 7 | HASH JOIN | | 1 | 147 | | 22787 (1)| 00:04:34 |
| 8 | VIEW | | 17994 | 738K| | 39 (11)| 00:00:01 |
| 9 | SORT UNIQUE | | 17994 | 702K| | 39 (11)| 00:00:01 |
| 10 | WINDOW SORT | | 17994 | 702K| | 39 (11)| 00:00:01 |
|* 11 | VIEW | | 17994 | 702K| | 37 (6)| 00:00:01 |
| 12 | WINDOW SORT | | 17994 | 632K| | 37 (6)| 00:00:01 |
| 13 | MAT_VIEW ACCESS FULL | VENTILATION | 17994 | 632K| | 35 (0)| 00:00:01 |
|* 14 | HASH JOIN | | 11873 | 1217K| 5800K| 22747 (1)| 00:04:33 |
|* 15 | HASH JOIN | | 86060 | 4790K| | 16141 (2)| 00:03:14 |
| 16 | MAT_VIEW ACCESS FULL | VENTILATION | 17994 | 404K| | 35 (0)| 00:00:01 |
|* 17 | TABLE ACCESS FULL | LABEVENTS | 176K| 5869K| | 16105 (2)| 00:03:14 |
| 18 | INLIST ITERATOR | | | | | | |
| 19 | TABLE ACCESS BY INDEX ROWID| CHARTEVENTS | 104K| 4911K| | 6024 (1)| 00:01:13 |
|* 20 | INDEX RANGE SCAN | CHARTEVENTS_O4 | 104K| | | 220 (1)| 00:00:03 |
Predicate Information (identified by operation id):
7 - access("ICUS"."SUBJECT_ID"="FVGT48H"."SUBJECT_ID" AND
SYS_EXTRACT_UTC("FVGT48H"."BEGIN_TIME")=SYS_EXTRACT_UTC("ICUS"."BEGIN_TIME"))
filter(SYS_EXTRACT_UTC("LE"."CHARTTIME")>=SYS_EXTRACT_UTC("FVGT48H"."BEGIN_TIME") AND
SYS_EXTRACT_UTC("LE"."CHARTTIME")<=SYS_EXTRACT_UTC("FVGT48H"."END_TIME"))
11 - filter((INTERNAL_FUNCTION("END_TIME")-INTERNAL_FUNCTION("BEGIN_TIME"))DAY(9) TO
SECOND(9)>INTERVAL'+02 00:00:00' DAY(2) TO SECOND(0))
14 - access("LE"."ICUSTAY_ID"="CE"."ICUSTAY_ID")
filter(SYS_EXTRACT_UTC("CHARTTIME")<SYS_EXTRACT_UTC("LE"."CHARTTIME"))
15 - access("ICUS"."ICUSTAY_ID"="LE"."ICUSTAY_ID")
17 - filter("LE"."ICUSTAY_ID" IS NOT NULL AND ("LE"."ITEMID"=50013 OR "LE"."ITEMID"=50019))
20 - access("CE"."ITEMID"=185 OR "CE"."ITEMID"=186 OR "CE"."ITEMID"=190 OR "CE"."ITEMID"=3420)
The cardinality estimate looks way off to me - I'm expecting several thousand rows. I have up-to-date statistics.
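One common way such 1-row estimates arise: every predicate the optimizer cannot check against column statistics (function-wrapped columns such as SYS_EXTRACT_UTC(...), or interval arithmetic) gets a fixed default guess, and the guesses multiply across predicates. A sketch using the classic 5% range-guess figure, purely for illustration and not traced from this particular plan:

```python
# How stacked default selectivity guesses drive an estimate toward 1 row.
# 5% is the classic default for a range predicate the optimizer cannot
# evaluate from statistics; the row count below is illustrative.
def guessed_cardinality(base_rows, n_range_guesses, guess=0.05):
    est = base_rows
    for _ in range(n_range_guesses):
        est *= guess          # each unevaluable predicate multiplies in
    return max(1, round(est))  # cardinality is floored at 1 row

# 17,994 rows with three stacked range guesses: 17994 * 0.05**3 ~ 2 rows
print(guessed_cardinality(17994, 3))
```

Once an intermediate estimate collapses to 1, every join above it looks cheap to probe with a nested loop, which matches the shape of the first plan.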
Can anyone help?
Thanks,
Dan
WITH chf_patients AS (
-- Exclude patients with CHF by ICD9 code
select subject_id,
hadm_id
from mimic2v26.icd9
where code in ('398.91','402.01','402.91','428.0','428.0', '428.1', '404.13', '404.93', '428.9', '404.91')
)
, icustays AS (
/* Our ICU Stay population */
SELECT *
FROM MIMIC2V26.ICUSTAY_DETAIL
WHERE ICUSTAY_AGE_GROUP = 'adult'
AND SUBJECT_ID NOT IN (select subject_id from chf_patients)
-- AND SUBJECT_ID < 50
)
--select * from icustays;
-- Combine ventilation periods separated by < 48 hours.
, combine_ventilation as (
select subject_id,
icustay_id,
begin_time,
-- end_time as end_first_vent,
-- lead(begin_time,1) over (partition by icustay_id order by begin_time) as next_begin_time,
-- lead(begin_time,1) over (partition by icustay_id order by begin_time) - begin_time as time_to_next,
case when (lead(begin_time,1) over (partition by icustay_id order by begin_time) - begin_time) < interval '2' day
then lead(end_time,1) over (partition by icustay_id order by begin_time)
else end_time end as end_time
from mimic2devel.ventilation
)
--select * from combine_ventilation;
--select * from combine_ventilation where end_of_ventilation != end_time;
-- Get the first ventilation period which is > 48 hours.
, first_vent_gt_48hrs as (
select distinct subject_id,
first_value(begin_time) over (partition by subject_id order by begin_time) as begin_time,
first_value(end_time) over (partition by subject_id order by begin_time) as end_time
from combine_ventilation where end_time - begin_time > interval '48' hour
)
--select * from first_vent_gt_48hrs;
-- Find the ICU stay when it occurred
, icustay_first_vent_gt_48hrs as (
select fvgt48h.subject_id,
icus.icustay_id,
fvgt48h.begin_time,
fvgt48h.end_time
from first_vent_gt_48hrs fvgt48h
join mimic2devel.ventilation icus on icus.subject_id = fvgt48h.subject_id and fvgt48h.begin_time = icus.begin_time
)
--select /*+gather_plan_statistics*/ * from icustay_first_vent_gt_48hrs;
, pao2_fio2_during_ventilation as (
select /*+ NO_USE_NL(le ifvgt48h) */
le.subject_id,
le.hadm_id,
le.icustay_id,
charttime,
case when itemid = 50019 then 'PAO2'
when itemid = 50013 then 'FIO2'
end as item_type,
-- Some FIO2s are fractional instead of percentage
case when itemid = 50013 and valuenum > 1 then round(valuenum / 100,2)
else round(valuenum,2)
end as valuenum
from mimic2v26.labevents le
join icustay_first_vent_gt_48hrs ifvgt48h on ifvgt48h.icustay_id = le.icustay_id and le.charttime between ifvgt48h.begin_time and ifvgt48h.end_time
where le.itemid = 50019 or le.itemid = 50013
)
--select * from pao2_fio2_during_ventilation;
-- Check that FIO2s have valid range
, vent_data_pivot as (
select * from (
select subject_id, hadm_id, icustay_id, charttime, item_type, valuenum from pao2_fio2_during_ventilation)
pivot ( max(valuenum) as valuenum for item_type in ('FIO2' as fio2, 'PAO2' as pao2) )
)
--select * from vent_data_pivot;
-- Fill in prior FIO2 from chartevents
, get_prior_fio2s as (
select /*+ NO_USE_NL(vdp ce) */
distinct
vdp.subject_id,
vdp.hadm_id,
vdp.icustay_id,
vdp.charttime as pao2_charttime,
vdp.fio2_valuenum,
vdp.pao2_valuenum,
-- ce.itemid,
-- ce.charttime as chart_charttime,
-- ce.value1num,
first_value(ce.value1num) over (partition by ce.icustay_id, vdp.charttime order by ce.charttime desc) as most_recent_fio2_raw,
case when first_value(ce.value1num) over (partition by ce.icustay_id, vdp.charttime order by ce.charttime desc) > 1
then round(first_value(ce.value1num) over (partition by ce.icustay_id, vdp.charttime order by ce.charttime desc) / 100,2)
else round(first_value(ce.value1num) over (partition by ce.icustay_id, vdp.charttime order by ce.charttime desc),2)
end as most_recent_fio2,
first_value(ce.charttime) over (partition by ce.icustay_id, vdp.charttime order by ce.charttime desc) as most_recent_fio2_charttime,
vdp.charttime - first_value(ce.charttime) over (partition by ce.icustay_id, vdp.charttime order by ce.charttime desc) as time_since_fio2,
-- first_value(ce.charttime) over (partition by ce.icustay_id, vdp.charttime order by ce.charttime desc) as most_recent_charttime
case when first_value(ce.value1num) over (partition by ce.icustay_id, vdp.charttime order by ce.charttime desc) > 1
then round(vdp.pao2_valuenum/(first_value(ce.value1num) over (partition by ce.icustay_id, vdp.charttime order by ce.charttime desc) / 100),2)
else round(vdp.pao2_valuenum/(first_value(ce.value1num) over (partition by ce.icustay_id, vdp.charttime order by ce.charttime desc)),2)
end as pf_ratio,
case when first_value(ce.value1num) over (partition by ce.icustay_id, vdp.charttime order by ce.charttime desc) > 1
then
case when vdp.pao2_valuenum/(first_value(ce.value1num) over (partition by ce.icustay_id, vdp.charttime order by ce.charttime desc) / 100) < 200 then 1 else 0 end
else
case when vdp.pao2_valuenum/(first_value(ce.value1num) over (partition by ce.icustay_id, vdp.charttime order by ce.charttime desc)) < 200 then 1 else 0 end
end as pf_ratio_below_thresh
from vent_data_pivot vdp
join mimic2v26.chartevents ce on vdp.icustay_id = ce.icustay_id and ce.charttime < vdp.charttime
where itemid in (190,3420,186,185)
)
--select * from get_prior_fio2s order by icustay_id, charttime;
, pf_data as (
select subject_id,
hadm_id,
icustay_id,
pao2_charttime,
lead(pao2_charttime) over (partition by icustay_id order by pao2_charttime) as next_pao2_charttime,
fio2_valuenum,
pao2_valuenum,
lead(pao2_valuenum) over (partition by icustay_id order by pao2_charttime) as next_pao2_valuenum,
most_recent_fio2_raw,
most_recent_fio2,
most_recent_fio2_charttime,
time_since_fio2,
pf_ratio,
lead(pf_ratio) over (partition by icustay_id order by pao2_charttime) as next_pf_ratio,
pf_ratio_below_thresh,
lead(pf_ratio_below_thresh) over (partition by icustay_id order by pao2_charttime) as next_pf_ratio_below_thresh
from get_prior_fio2s
select * from pf_data;Table structure is available here:
http://mimic.physionet.org/schema/latest/
Can I still get a TKPROF if the query doesn't complete? I'll have a go and post the results shortly.
Thanks,
Dan -
User Generated Data and Cardinality Estimates
Platform Information:
Windows Server 2003 R2
Oracle Enterprise Edition 10.2.0.4
Optimizer Parameters:
NAME TYPE VALUE
object_cache_optimal_size integer 102400
optimizer_dynamic_sampling integer 2
optimizer_features_enable string 10.2.0.4
optimizer_index_caching integer 90
optimizer_index_cost_adj integer 30
optimizer_mode string CHOOSE
optimizer_secure_view_merging boolean TRUE
Test Case:
var csv VARCHAR2(250);
exec :csv := '1,2,3,4,5,6,7,8,9,10';
EXPLAIN PLAN FOR
WITH csv_to_rows AS (
SELECT UPPER(
         TRIM(
           SUBSTR(
             txt
           , INSTR (txt, ',', 1, level ) + 1
           , INSTR (txt, ',', 1, level+1) - INSTR (txt, ',', 1, level) - 1
           )
         )
       ) AS token
FROM (
SELECT ','||:csv||',' txt
FROM dual
) t
CONNECT BY LEVEL <= LENGTH(:csv)-LENGTH(REPLACE(:csv,',',''))+1
)
SELECT * FROM csv_to_rows;
SELECT * FROM TABLE(DBMS_XPLAN.DISPLAY);
Results:
Execution Plan
Plan hash value: 2403765415
| Id | Operation | Name | Rows | Bytes | Cost (%CPU)| Time |
| 0 | SELECT STATEMENT | | 1 | 19 | 2 (0)| 00:00:01 |
| 1 | VIEW | | 1 | 19 | 2 (0)| 00:00:01 |
|* 2 | CONNECT BY WITHOUT FILTERING| | | | | |
| 3 | FAST DUAL | | 1 | | 2 (0)| 00:00:01 |
Predicate Information (identified by operation id):
2 - filter(LEVEL<=LENGTH(:CSV)-LENGTH(REPLACE(:CSV,',',''))+1)
Statistics
1 recursive calls
0 db block gets
0 consistent gets
0 physical reads
0 redo size
502 bytes sent via SQL*Net to client
396 bytes received via SQL*Net from client
2 SQL*Net roundtrips to/from client
1 sorts (memory)
0 sorts (disk)
10 rows processed
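The string arithmetic in the test case can be mirrored in Python to see why it yields exactly one row per comma-separated token: the CONNECT BY bound counts tokens as LENGTH(csv) - LENGTH(REPLACE(csv, ',', '')) + 1, and each row slices the comma-padded string between consecutive commas. The helpers below (csv_to_rows, nth_comma) are hypothetical names, not Oracle code:

```python
def nth_comma(txt, n):
    # 0-based position just past the n-th comma, i.e. INSTR(txt,',',1,n) + 1
    # translated from 1-based SQL positions to Python indexing.
    pos = -1
    for _ in range(n):
        pos = txt.index(",", pos + 1)
    return pos + 1

def csv_to_rows(csv):
    # Token count: commas + 1, exactly the CONNECT BY LEVEL bound.
    n_tokens = len(csv) - len(csv.replace(",", "")) + 1
    txt = "," + csv + ","              # the ','||:csv||',' padding
    rows = []
    for level in range(1, n_tokens + 1):
        start = nth_comma(txt, level)
        end = nth_comma(txt, level + 1)
        # SUBSTR(txt, INSTR+1, next_INSTR - INSTR - 1), then TRIM/UPPER
        rows.append(txt[start:end - 1].strip().upper())
    return rows

print(csv_to_rows("1,2,3,4,5,6,7,8,9,10"))
```

The row count is known exactly at the moment the bind is supplied, which is precisely the information the optimizer never sees, hence the estimate of 1.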
Question:
Every once in a while I need to use Tom Kyte's Varying in Lists method (http://tkyte.blogspot.com/2006/06/varying-in-lists.html, the 9i version as indicated in his blog entry) to convert a comma-separated list to a "bindable" in-list.
As one can see above, the cardinality estimates are not correct. The execution plans I have seen from this method usually result in a nested loop join using an index. While this makes sense for small result sets, it may not be the most efficient method with a larger number of entries in the comma-separated list.
Has anyone found a way to expose the correct or near-correct cardinality to the optimizer at runtime? I can use the cardinality hint, but the problem with that is that it must be defined as a scalar value, and that may not work for all cases. Dynamic sampling won't work in this scenario because cardinality statistics already exist against DUAL.
I haven't noticed any detrimental effects in my environment, so this may be purely an academic discussion, but I thought I'd throw it out there :)
I have definitely considered using this as a possibility, but I have tried to shy away from writing data (even temporarily) when all I'm looking to do is query data.
I agree with you about hard-coding the cardinality values.
A slight variation on David's suggestion is to use the dynamic sampling hint to get the statistics at run time. There will be a slight performance cost to do this.
Remember that all of the explain plan statistics are estimates, which may or may not be accurate. Usually they are good, but every once in a while they are incorrect. -
CBO - Wrong Cardinality Estimate
Hello,
Version 10.2.0.3
I am trying to understand the figures in the Explain Plan. I am not able to explain the cardinality of 70 on step 4.
The query takes very long to execute (about 400 secs). I would expect HASH JOIN SEMI instead of NESTED LOOPS SEMI.
I have tried to provide as much information as possible. I have just requested the 10053 trace; I don't have it yet.
There is a primary key on ORDERS.ORDER_ID (NUMBER). However, we are forced to use to_char(order_id) to accommodate COT_EXTERNAL_ID being a VARCHAR2 column.
1 select cdw.* from cdw_orders cdw where cdw.cot_external_id in
2 (
3 select to_char(order_id) from orders o where o.status_id in (12,16,22)
4* )
SQL> /
Execution Plan
Plan hash value: 733167152
| Id | Operation | Name | Rows | Bytes | Cost (%CPU)| Time |
| 0 | SELECT STATEMENT | | 2 | 280 | 326 (1)| 00:00:04 |
| 1 | NESTED LOOPS SEMI | | 2 | 280 | 326 (1)| 00:00:04 |
| 2 | TABLE ACCESS FULL | CDW_ORDERS | 3362 | 433K| 293 (1)| 00:00:04 |
| 3 | INLIST ITERATOR | | | | | |
|* 4 | TABLE ACCESS BY INDEX ROWID| ORDERS | 70 | 560 | 1 (0)| 00:00:01 |
|* 5 | INDEX RANGE SCAN | ORDERS_STATUS_ID_IDX | 2 | | 1 (0)| 00:00:01 |
Predicate Information (identified by operation id):
4 - filter("CDW"."COT_EXTERNAL_ID"=TO_CHAR("ORDER_ID"))
5 - access("O"."STATUS_ID"=12 OR "O"."STATUS_ID"=16 OR "O"."STATUS_ID"=22)
Here are some details on the table columns and data.
SQL> select column_name,num_distinct,density,num_nulls,num_buckets from all_tab_columns where table_name = 'ORDERS'
2 and column_name in ('STATUS_ID','ORDER_ID');
COLUMN_NAME NUM_DISTINCT DENSITY NUM_NULLS NUM_BUCKETS
ORDER_ID 177951 .00000561952447584 0 254
STATUS_ID 23 .00000275335899280 0 23
SQL> select num_rows from all_tables where table_name = 'ORDERS';
NUM_ROWS
177951
SQL> select index_name,distinct_keys,clustering_factor,num_rows,sample_size from all_indexes where index_name = 'ORDERS_STATUS_ID_IDX'
2 /
INDEX_NAME DISTINCT_KEYS CLUSTERING_FACTOR NUM_ROWS SAMPLE_SIZE
ORDERS_STATUS_ID_IDX 25 35893 177951 177951
Histograms on column STATUS_ID:
SQL> select * from (
2 select column_name,endpoint_value,endpoint_number- nvl(lag(endpoint_number) over (order by endpoint_value),0) count
3 from all_tab_histograms where column_name = 'STATUS_ID' and table_name = 'ORDERS'
4 ) where endpoint_value in (12,16,22);
COLUMN_NAME ENDPOINT_VALUE COUNT
STATUS_ID 12 494
STATUS_ID 16 24
STATUS_ID 22 3064
SQL> select max(endpoint_number) from all_tab_histograms where column_name = 'STATUS_ID' and table_name = 'ORDERS' ;
MAX(ENDPOINT_NUMBER)
5641
I tried to run the query for individual values instead of the inlist to check the numbers.
1 select cdw.* from cdw_orders cdw where cdw.cot_external_id in
2 (
3 select to_char(order_id) from orders o where o.status_id = 12
4* )
SQL> /
Execution Plan
Plan hash value: 3178043291
| Id | Operation | Name | Rows | Bytes | Cost (%CPU)| Time |
| 0 | SELECT STATEMENT | | 2 | 280 | 33 (19)| 00:00:01 |
| 1 | MERGE JOIN SEMI | | 2 | 280 | 33 (19)| 00:00:01 |
| 2 | TABLE ACCESS BY INDEX ROWID| CDW_ORDERS | 3362 | 433K| 21 (0)| 00:00:01 |
| 3 | INDEX FULL SCAN | CDW_ORD_COT_EXT_ID | 3362 | | 2 (0)| 00:00:01 |
|* 4 | SORT UNIQUE | | 15584 | 121K| 11 (46)| 00:00:01 |
|* 5 | VIEW | index$_join$_002 | 15584 | 121K| 9 (34)| 00:00:01 |
|* 6 | HASH JOIN | | | | | |
|* 7 | INDEX RANGE SCAN | ORDERS_STATUS_ID_IDX | 15584 | 121K| 1 (0)| 00:00:01 |
| 8 | INDEX FAST FULL SCAN | PK_ORDERS | 15584 | 121K| 5 (0)| 00:00:01 |
Predicate Information (identified by operation id):
4 - access("CDW"."COT_EXTERNAL_ID"=TO_CHAR("ORDER_ID"))
filter("CDW"."COT_EXTERNAL_ID"=TO_CHAR("ORDER_ID"))
5 - filter("O"."STATUS_ID"=12)
6 - access(ROWID=ROWID)
7 - access("O"."STATUS_ID"=12)
For status_id = 12, the cardinality on step 7 for ORDERS_STATUS_ID_IDX is 15584, which is in line with the expectation, i.e. (494/5641)*177951 = 15583.7 ~ 15584.
Now, I continue the same with status_id = 16
1 select cdw.* from cdw_orders cdw where cdw.cot_external_id in
2 (
3 select to_char(order_id) from orders o where o.status_id = 16
4* )
SQL> /
Execution Plan
Plan hash value: 43581000
| Id | Operation | Name | Rows | Bytes | Cost (%CPU)| Time |
| 0 | SELECT STATEMENT | | 1363 | 186K| 10 (10)| 00:00:01 |
| 1 | TABLE ACCESS BY INDEX ROWID | CDW_ORDERS | 2 | 264 | 1 (0)| 00:00:01 |
| 2 | NESTED LOOPS | | 1363 | 186K| 10 (10)| 00:00:01 |
| 3 | SORT UNIQUE | | 757 | 6056 | 2 (0)| 00:00:01 |
| 4 | TABLE ACCESS BY INDEX ROWID| ORDERS | 757 | 6056 | 2 (0)| 00:00:01 |
|* 5 | INDEX RANGE SCAN | ORDERS_STATUS_ID_IDX | 757 | | 1 (0)| 00:00:01 |
|* 6 | INDEX RANGE SCAN | CDW_ORD_COT_EXT_ID | 2 | | 1 (0)| 00:00:01 |
Predicate Information (identified by operation id):
5 - access("O"."STATUS_ID"=16)
6 - access("CDW"."COT_EXTERNAL_ID"=TO_CHAR("ORDER_ID"))
Here also the cardinality on step 5 for ORDERS_STATUS_ID_IDX is as expected, i.e. (24/5641)*177951 = 757.1 ~ 757.
Finally, running the same for status_id = 22 surprises me
1 select cdw.* from cdw_orders cdw where cdw.cot_external_id in
2 (
3 select to_char(order_id) from orders o where o.status_id = 22
4* )
SQL> /
Execution Plan
Plan hash value: 3496542905
| Id | Operation | Name | Rows | Bytes | Cost (%CPU)| Time |
| 0 | SELECT STATEMENT | | 2 | 280 | 326 (1)| 00:00:04 |
| 1 | NESTED LOOPS SEMI | | 2 | 280 | 326 (1)| 00:00:04 |
| 2 | TABLE ACCESS FULL | CDW_ORDERS | 3362 | 433K| 293 (1)| 00:00:04 |
|* 3 | TABLE ACCESS BY INDEX ROWID| ORDERS | 60 | 480 | 1 (0)| 00:00:01 |
|* 4 | INDEX RANGE SCAN | ORDERS_STATUS_ID_IDX | 2 | | 1 (0)| 00:00:01 |
Predicate Information (identified by operation id):
3 - filter("CDW"."COT_EXTERNAL_ID"=TO_CHAR("ORDER_ID"))
4 - access("O"."STATUS_ID"=22)
Like the ones for 12 and 16, I would have expected the cardinality on step 4 to be (3064/5641)*177951 = 96657, but I see only 2.
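The expected cardinalities can be sketched in plain Python. This just reproduces the frequency-histogram formula the poster is applying — card = (bucket frequency / total histogram rows) * num_rows — using the ALL_TAB_HISTOGRAMS figures above; it is not Oracle code.

```python
# Frequency-histogram cardinality estimate, using the figures shown above:
#   card = (bucket_frequency / total_buckets) * num_rows

NUM_ROWS = 177951       # all_tables.num_rows for ORDERS
TOTAL_BUCKETS = 5641    # max(endpoint_number) from ALL_TAB_HISTOGRAMS

# endpoint_value -> (decompressed) bucket frequency
frequencies = {12: 494, 16: 24, 22: 3064}

for status_id, freq in frequencies.items():
    card = round(freq / TOTAL_BUCKETS * NUM_ROWS)
    print(status_id, card)
# 12 -> 15584 and 16 -> 757 match the per-value plans above;
# 22 -> 96657, yet the plan for status_id = 22 shows a cardinality of 2.
```

The formula matches the plans for 12 and 16 exactly, which is what makes the estimate of 2 for the popular value 22 the interesting part of this thread.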
This is where my doubt is. Does this have to do with 22 being a popular value? Can someone explain this behaviour?
As a solution I am thinking of creating a function-based index on to_char(order_id), hoping that the step 3 predicate CDW.COT_EXTERNAL_ID = TO_CHAR(ORDER_ID) changes from a filter to an access predicate. Let me know your thoughts on the index creation as well.
Thanks,
Rgds,
Gokul
Edited by: Gokul Gopal on 24-May-2012 02:40
Hello Jonathan,
Apologies, I was wrong about the optimizer_index_cost_adj value being set to 100. I gather from the DBA that the value is currently set to 1.
I have pasted the 10053 trace file for value 22. I was not able to figure out the "jsel=min(1, 6.1094e-04)" bit.
/dborafiles/COTP/bycota2/udump/bycota2_ora_2147_values_22.trc
Oracle Database 10g Enterprise Edition Release 10.2.0.3.0 - 64bit Production
With the Partitioning, Real Application Clusters, OLAP and Data Mining options
ORACLE_HOME = /dboracle/orabase/product/10.2.0
System name: Linux
Node name: byl945d002
Release: 2.6.9-55.ELsmp
Version: #1 SMP Fri Apr 20 16:36:54 EDT 2007
Machine: x86_64
Instance name: bycota2
Redo thread mounted by this instance: 2
Oracle process number: 37
Unix process pid: 2147, image: oracle@byl945d002 (TNS V1-V3)
*** 2012-05-28 14:00:59.737
*** ACTION NAME:() 2012-05-28 14:00:59.737
*** MODULE NAME:(SQL*Plus) 2012-05-28 14:00:59.737
*** SERVICE NAME:(SYS$USERS) 2012-05-28 14:00:59.737
*** SESSION ID:(713.51629) 2012-05-28 14:00:59.737
Registered qb: SEL$1 0x973e5458 (PARSER)
signature (): qb_name=SEL$1 nbfros=1 flg=0
fro(0): flg=4 objn=51893 hint_alias="CDW"@"SEL$1"
Registered qb: SEL$2 0x973e6058 (PARSER)
signature (): qb_name=SEL$2 nbfros=1 flg=0
fro(0): flg=4 objn=51782 hint_alias="O"@"SEL$2"
Predicate Move-Around (PM)
PM: Considering predicate move-around in SEL$1 (#0).
PM: Checking validity of predicate move-around in SEL$1 (#0).
CBQT: Validity checks passed for 5r4bhr2yrt5gz.
apadrv-start: call(in-use=704, alloc=16344), compile(in-use=60840, alloc=63984)
Current SQL statement for this session:
select cdw.* from cdw_orders cdw where cdw.cot_external_id in (select to_char(o.order_id) from orders o where status_id = 22)
Legend
The following abbreviations are used by optimizer trace.
CBQT - cost-based query transformation
JPPD - join predicate push-down
FPD - filter push-down
PM - predicate move-around
CVM - complex view merging
SPJ - select-project-join
SJC - set join conversion
SU - subquery unnesting
OBYE - order by elimination
ST - star transformation
qb - query block
LB - leaf blocks
DK - distinct keys
LB/K - average number of leaf blocks per key
DB/K - average number of data blocks per key
CLUF - clustering factor
NDV - number of distinct values
Resp - response cost
Card - cardinality
Resc - resource cost
NL - nested loops (join)
SM - sort merge (join)
HA - hash (join)
CPUCSPEED - CPU Speed
IOTFRSPEED - I/O transfer speed
IOSEEKTIM - I/O seek time
SREADTIM - average single block read time
MREADTIM - average multiblock read time
MBRC - average multiblock read count
MAXTHR - maximum I/O system throughput
SLAVETHR - average slave I/O throughput
dmeth - distribution method
1: no partitioning required
2: value partitioned
4: right is random (round-robin)
512: left is random (round-robin)
8: broadcast right and partition left
16: broadcast left and partition right
32: partition left using partitioning of right
64: partition right using partitioning of left
128: use hash partitioning dimension
256: use range partitioning dimension
2048: use list partitioning dimension
1024: run the join in serial
0: invalid distribution method
sel - selectivity
ptn - partition
Peeked values of the binds in SQL statement
PARAMETERS USED BY THE OPTIMIZER
PARAMETERS WITH ALTERED VALUES
sort_area_retained_size = 65535
optimizer_mode = first_rows_100
optimizer_index_cost_adj = 25
optimizer_index_caching = 100
Bug Fix Control Environment
fix 4611850 = enabled
fix 4663804 = enabled
fix 4663698 = enabled
fix 4545833 = enabled
fix 3499674 = disabled
fix 4584065 = enabled
fix 4602374 = enabled
fix 4569940 = enabled
fix 4631959 = enabled
fix 4519340 = enabled
fix 4550003 = enabled
fix 4488689 = enabled
fix 3118776 = enabled
fix 4519016 = enabled
fix 4487253 = enabled
fix 4556762 = 15
fix 4728348 = enabled
fix 4723244 = enabled
fix 4554846 = enabled
fix 4175830 = enabled
fix 4722900 = enabled
fix 5094217 = enabled
fix 4904890 = enabled
fix 4483286 = disabled
fix 4969880 = disabled
fix 4711525 = enabled
fix 4717546 = enabled
fix 4904838 = enabled
fix 5005866 = enabled
fix 4600710 = enabled
fix 5129233 = enabled
fix 5195882 = enabled
fix 5084239 = enabled
fix 4595987 = enabled
fix 4134994 = enabled
fix 5104624 = enabled
fix 4908162 = enabled
fix 5015557 = enabled
PARAMETERS WITH DEFAULT VALUES
optimizer_mode_hinted = false
optimizer_features_hinted = 0.0.0
parallel_execution_enabled = true
parallel_query_forced_dop = 0
parallel_dml_forced_dop = 0
parallel_ddl_forced_degree = 0
parallel_ddl_forced_instances = 0
_query_rewrite_fudge = 90
optimizer_features_enable = 10.2.0.3
_optimizer_search_limit = 5
cpu_count = 8
active_instance_count = 2
parallel_threads_per_cpu = 2
hash_area_size = 131072
bitmap_merge_area_size = 1048576
sort_area_size = 65536
_sort_elimination_cost_ratio = 0
_optimizer_block_size = 8192
_sort_multiblock_read_count = 2
_hash_multiblock_io_count = 0
_db_file_optimizer_read_count = 32
_optimizer_max_permutations = 2000
pga_aggregate_target = 602112 KB
_pga_max_size = 204800 KB
_query_rewrite_maxdisjunct = 257
_smm_auto_min_io_size = 56 KB
_smm_auto_max_io_size = 248 KB
_smm_min_size = 602 KB
_smm_max_size = 102400 KB
_smm_px_max_size = 301056 KB
_cpu_to_io = 0
_optimizer_undo_cost_change = 10.2.0.3
parallel_query_mode = enabled
parallel_dml_mode = disabled
parallel_ddl_mode = enabled
sqlstat_enabled = false
_optimizer_percent_parallel = 101
_always_anti_join = choose
_always_semi_join = choose
_optimizer_mode_force = true
_partition_view_enabled = true
_always_star_transformation = false
_query_rewrite_or_error = false
_hash_join_enabled = true
cursor_sharing = exact
_b_tree_bitmap_plans = true
star_transformation_enabled = false
_optimizer_cost_model = choose
_new_sort_cost_estimate = true
_complex_view_merging = true
_unnest_subquery = true
_eliminate_common_subexpr = true
_pred_move_around = true
_convert_set_to_join = false
_push_join_predicate = true
_push_join_union_view = true
_fast_full_scan_enabled = true
_optim_enhance_nnull_detection = true
_parallel_broadcast_enabled = true
_px_broadcast_fudge_factor = 100
_ordered_nested_loop = true
_no_or_expansion = false
_system_index_caching = 0
_disable_datalayer_sampling = false
query_rewrite_enabled = true
query_rewrite_integrity = enforced
_query_cost_rewrite = true
_query_rewrite_2 = true
_query_rewrite_1 = true
_query_rewrite_expression = true
_query_rewrite_jgmigrate = true
_query_rewrite_fpc = true
_query_rewrite_drj = true
_full_pwise_join_enabled = true
_partial_pwise_join_enabled = true
_left_nested_loops_random = true
_improved_row_length_enabled = true
_index_join_enabled = true
_enable_type_dep_selectivity = true
_improved_outerjoin_card = true
_optimizer_adjust_for_nulls = true
_optimizer_degree = 0
_use_column_stats_for_function = true
_subquery_pruning_enabled = true
_subquery_pruning_mv_enabled = false
_or_expand_nvl_predicate = true
_like_with_bind_as_equality = false
_table_scan_cost_plus_one = true
_cost_equality_semi_join = true
_default_non_equality_sel_check = true
_new_initial_join_orders = true
_oneside_colstat_for_equijoins = true
_optim_peek_user_binds = true
_minimal_stats_aggregation = true
_force_temptables_for_gsets = false
workarea_size_policy = auto
_smm_auto_cost_enabled = true
_gs_anti_semi_join_allowed = true
_optim_new_default_join_sel = true
optimizer_dynamic_sampling = 2
_pre_rewrite_push_pred = true
_optimizer_new_join_card_computation = true
_union_rewrite_for_gs = yes_gset_mvs
_generalized_pruning_enabled = true
_optim_adjust_for_part_skews = true
_force_datefold_trunc = false
statistics_level = typical
_optimizer_system_stats_usage = true
skip_unusable_indexes = true
_remove_aggr_subquery = true
_optimizer_push_down_distinct = 0
_dml_monitoring_enabled = true
_optimizer_undo_changes = false
_predicate_elimination_enabled = true
_nested_loop_fudge = 100
_project_view_columns = true
_local_communication_costing_enabled = true
_local_communication_ratio = 50
_query_rewrite_vop_cleanup = true
_slave_mapping_enabled = true
_optimizer_cost_based_transformation = linear
_optimizer_mjc_enabled = true
_right_outer_hash_enable = true
_spr_push_pred_refspr = true
_optimizer_cache_stats = false
_optimizer_cbqt_factor = 50
_optimizer_squ_bottomup = true
_fic_area_size = 131072
_optimizer_skip_scan_enabled = true
_optimizer_cost_filter_pred = false
_optimizer_sortmerge_join_enabled = true
_optimizer_join_sel_sanity_check = true
_mmv_query_rewrite_enabled = true
_bt_mmv_query_rewrite_enabled = true
_add_stale_mv_to_dependency_list = true
_distinct_view_unnesting = false
_optimizer_dim_subq_join_sel = true
_optimizer_disable_strans_sanity_checks = 0
_optimizer_compute_index_stats = true
_push_join_union_view2 = true
_optimizer_ignore_hints = false
_optimizer_random_plan = 0
_query_rewrite_setopgrw_enable = true
_optimizer_correct_sq_selectivity = true
_disable_function_based_index = false
_optimizer_join_order_control = 3
_optimizer_cartesian_enabled = true
_optimizer_starplan_enabled = true
_extended_pruning_enabled = true
_optimizer_push_pred_cost_based = true
_sql_model_unfold_forloops = run_time
_enable_dml_lock_escalation = false
_bloom_filter_enabled = true
_update_bji_ipdml_enabled = 0
_optimizer_extended_cursor_sharing = udo
_dm_max_shared_pool_pct = 1
_optimizer_cost_hjsmj_multimatch = true
_optimizer_transitivity_retain = true
_px_pwg_enabled = true
optimizer_secure_view_merging = true
_optimizer_join_elimination_enabled = true
flashback_table_rpi = non_fbt
_optimizer_cbqt_no_size_restriction = true
_optimizer_enhanced_filter_push = true
_optimizer_filter_pred_pullup = true
_rowsrc_trace_level = 0
_simple_view_merging = true
_optimizer_rownum_pred_based_fkr = true
_optimizer_better_inlist_costing = all
_optimizer_self_induced_cache_cost = false
_optimizer_min_cache_blocks = 10
_optimizer_or_expansion = depth
_optimizer_order_by_elimination_enabled = true
_optimizer_outer_to_anti_enabled = true
_selfjoin_mv_duplicates = true
_dimension_skip_null = true
_force_rewrite_enable = false
_optimizer_star_tran_in_with_clause = true
_optimizer_complex_pred_selectivity = true
_optimizer_connect_by_cost_based = true
_gby_hash_aggregation_enabled = true
_globalindex_pnum_filter_enabled = true
_fix_control_key = 0
_optimizer_skip_scan_guess = false
_enable_row_shipping = false
_row_shipping_threshold = 80
_row_shipping_explain = false
_optimizer_rownum_bind_default = 10
_first_k_rows_dynamic_proration = true
_optimizer_native_full_outer_join = off
Bug Fix Control Environment
fix 4611850 = enabled
fix 4663804 = enabled
fix 4663698 = enabled
fix 4545833 = enabled
fix 3499674 = disabled
fix 4584065 = enabled
fix 4602374 = enabled
fix 4569940 = enabled
fix 4631959 = enabled
fix 4519340 = enabled
fix 4550003 = enabled
fix 4488689 = enabled
fix 3118776 = enabled
fix 4519016 = enabled
fix 4487253 = enabled
fix 4556762 = 15
fix 4728348 = enabled
fix 4723244 = enabled
fix 4554846 = enabled
fix 4175830 = enabled
fix 4722900 = enabled
fix 5094217 = enabled
fix 4904890 = enabled
fix 4483286 = disabled
fix 4969880 = disabled
fix 4711525 = enabled
fix 4717546 = enabled
fix 4904838 = enabled
fix 5005866 = enabled
fix 4600710 = enabled
fix 5129233 = enabled
fix 5195882 = enabled
fix 5084239 = enabled
fix 4595987 = enabled
fix 4134994 = enabled
fix 5104624 = enabled
fix 4908162 = enabled
fix 5015557 = enabled
PARAMETERS IN OPT_PARAM HINT
Column Usage Monitoring is ON: tracking level = 1
COST-BASED QUERY TRANSFORMATIONS
FPD: Considering simple filter push (pre rewrite) in SEL$1 (#0)
FPD: Current where clause predicates in SEL$1 (#0) :
"CDW"."COT_EXTERNAL_ID"=ANY (SELECT TO_CHAR("O"."ORDER_ID") FROM "ORDERS" "O")
Registered qb: SEL$1 0x974658b0 (COPY SEL$1)
signature(): NULL
Registered qb: SEL$2 0x9745e408 (COPY SEL$2)
signature(): NULL
Cost-Based Subquery Unnesting
SU: No subqueries to consider in query block SEL$2 (#2).
SU: Considering subquery unnesting in query block SEL$1 (#1)
SU: Performing unnesting that does not require costing.
SU: Considering subquery unnest on SEL$1 (#1).
SU: Checking validity of unnesting subquery SEL$2 (#2)
SU: Passed validity checks.
SU: Transforming ANY subquery to a join.
Registered qb: SEL$5DA710D3 0x974658b0 (SUBQUERY UNNEST SEL$1; SEL$2)
signature (): qb_name=SEL$5DA710D3 nbfros=2 flg=0
fro(0): flg=0 objn=51893 hint_alias="CDW"@"SEL$1"
fro(1): flg=0 objn=51782 hint_alias="O"@"SEL$2"
Cost-Based Complex View Merging
CVM: Finding query blocks in SEL$5DA710D3 (#1) that are valid to merge.
SU: Transforming ANY subquery to a join.
Set-Join Conversion (SJC)
SJC: Considering set-join conversion in SEL$5DA710D3 (#1).
Query block (0x2a973e5458) before join elimination:
SQL:******* UNPARSED QUERY IS *******
SELECT "CDW".* FROM "COT_PLUS"."ORDERS" "O","COT_PLUS"."CDW_ORDERS" "CDW" WHERE "CDW"."COT_EXTERNAL_ID"=TO_CHAR("O"."ORDER_ID") AND "O"."STATUS_ID"=22
Query block (0x2a973e5458) unchanged
Predicate Move-Around (PM)
PM: Considering predicate move-around in SEL$5DA710D3 (#1).
PM: Checking validity of predicate move-around in SEL$5DA710D3 (#1).
PM: PM bypassed: Outer query contains no views.
JPPD: Applying transformation directives
JPPD: Checking validity of push-down in query block SEL$5DA710D3 (#1)
JPPD: No view found to push predicate into.
FPD: Considering simple filter push in SEL$5DA710D3 (#1)
FPD: Current where clause predicates in SEL$5DA710D3 (#1) :
"CDW"."COT_EXTERNAL_ID"=TO_CHAR("O"."ORDER_ID") AND "O"."STATUS_ID"=22
kkogcp: try to generate transitive predicate from check constraints for SEL$5DA710D3 (#1)
predicates with check contraints: "CDW"."COT_EXTERNAL_ID"=TO_CHAR("O"."ORDER_ID") AND "O"."STATUS_ID"=22
after transitive predicate generation: "CDW"."COT_EXTERNAL_ID"=TO_CHAR("O"."ORDER_ID") AND "O"."STATUS_ID"=22
finally: "CDW"."COT_EXTERNAL_ID"=TO_CHAR("O"."ORDER_ID") AND "O"."STATUS_ID"=22
First K Rows: Setup begin
kkoqbc-start
: call(in-use=1592, alloc=16344), compile(in-use=101000, alloc=134224)
QUERY BLOCK TEXT
select cdw.* from cdw_orders cdw where cdw.cot_external_id in (select to_char(o.order_id) from orders o where status_id = 22)
QUERY BLOCK SIGNATURE
qb name was generated
signature (optimizer): qb_name=SEL$5DA710D3 nbfros=2 flg=0
fro(0): flg=0 objn=51893 hint_alias="CDW"@"SEL$1"
fro(1): flg=0 objn=51782 hint_alias="O"@"SEL$2"
SYSTEM STATISTICS INFORMATION
Using NOWORKLOAD Stats
CPUSPEED: 714 millions instruction/sec
IOTFRSPEED: 4096 bytes per millisecond (default is 4096)
IOSEEKTIM: 10 milliseconds (default is 10)
BASE STATISTICAL INFORMATION
Table Stats::
Table: CDW_ORDERS Alias: CDW
#Rows: 3375 #Blks: 1504 AvgRowLen: 132.00
Index Stats::
Index: CDW_ORD_COT_EXT_ID Col#: 10
LVLS: 1 #LB: 232 #DK: 1878 LB/K: 1.00 DB/K: 1.00 CLUF: 1899.00
Index: CDW_ORD_REFERENCE_IDX Col#: 13
LVLS: 0 #LB: 0 #DK: 0 LB/K: 0.00 DB/K: 0.00 CLUF: 0.00
Index: COMMITTED_IDX Col#: 12
LVLS: 1 #LB: 171 #DK: 1673 LB/K: 1.00 DB/K: 1.00 CLUF: 1657.00
Index: OBJID_IDX Col#: 16 17
LVLS: 2 #LB: 318 #DK: 3372 LB/K: 1.00 DB/K: 1.00 CLUF: 1901.00
Index: ORDID_IDX Col#: 14
LVLS: 0 #LB: 0 #DK: 0 LB/K: 0.00 DB/K: 0.00 CLUF: 0.00
Table Stats::
Table: ORDERS Alias: O
#Rows: 178253 #Blks: 7300 AvgRowLen: 282.00
Index Stats::
Index: IDX_ORDERS_CONFIG Col#: 80
LVLS: 1 #LB: 215 #DK: 452 LB/K: 1.00 DB/K: 130.00 CLUF: 59161.00
Index: IDX_ORDERS_REFRENCE_NUMBER Col#: 6
LVLS: 1 #LB: 428 #DK: 68698 LB/K: 1.00 DB/K: 1.00 CLUF: 115830.00
Index: ORDERS_BILLING_SI_IDX Col#: 13
LVLS: 1 #LB: 84 #DK: 3049 LB/K: 1.00 DB/K: 8.00 CLUF: 27006.00
Index: ORDERS_LATEST_ORD_IDX Col#: 3
LVLS: 0 #LB: 0 #DK: 0 LB/K: 0.00 DB/K: 0.00 CLUF: 0.00
Index: ORDERS_ORDER_TYPE_IDX Col#: 4
LVLS: 2 #LB: 984 #DK: 64 LB/K: 15.00 DB/K: 932.00 CLUF: 59702.00
Index: ORDERS_ORD_MINOR__IDX Col#: 43 5
LVLS: 2 #LB: 784 #DK: 112 LB/K: 7.00 DB/K: 375.00 CLUF: 42012.00
Index: ORDERS_OWNING_ORG_IDX Col#: 37
LVLS: 0 #LB: 0 #DK: 0 LB/K: 0.00 DB/K: 0.00 CLUF: 0.00
Index: ORDERS_PARENT_ORD_IDX Col#: 2
LVLS: 1 #LB: 206 #DK: 37492 LB/K: 1.00 DB/K: 1.00 CLUF: 58051.00
Index: ORDERS_SD_CONFIG__IDX Col#: 42
LVLS: 2 #LB: 604 #DK: 10 LB/K: 60.00 DB/K: 3638.00 CLUF: 36389.00
Index: ORDERS_SPECIAL_OR_IDX Col#: 36
LVLS: 1 #LB: 63 #DK: 2 LB/K: 31.00 DB/K: 556.00 CLUF: 1113.00
Index: ORDERS_STATUS_ID_IDX Col#: 5
LVLS: 2 #LB: 635 #DK: 25 LB/K: 25.00 DB/K: 1440.00 CLUF: 36015.00
Index: PK_ORDERS Col#: 1
LVLS: 1 #LB: 408 #DK: 178253 LB/K: 1.00 DB/K: 1.00 CLUF: 131025.00
SINGLE TABLE ACCESS PATH
Column (#5): STATUS_ID(NUMBER)
AvgLen: 3.00 NDV: 20 Nulls: 0 Density: 2.7653e-06 Min: 2 Max: 33
Histogram: Freq #Bkts: 20 UncompBkts: 5567 EndPtVals: 20
Table: ORDERS Alias: O
Card: Original: 178253 Rounded: 95450 Computed: 95450.37 Non Adjusted: 95450.37
Access Path: TableScan
Cost: 1419.89 Resp: 1419.89 Degree: 0
Cost_io: 1408.00 Cost_cpu: 101897352
Resp_io: 1408.00 Resp_cpu: 101897352
kkofmx: index filter:"O"."STATUS_ID"=22
Access Path: index (skip-scan)
SS sel: 0.53548 ANDV (#skips): 60
SS io: 419.81 vs. table scan io: 1408.00
Skip Scan chosen
Access Path: index (SkipScan)
Index: ORDERS_ORD_MINOR__IDX
resc_io: 22918.81 resc_cpu: 204258888
ix_sel: 0.53548 ix_sel_with_filters: 0.53548
Cost: 5735.66 Resp: 5735.66 Degree: 1
Access Path: index (AllEqRange)
Index: ORDERS_STATUS_ID_IDX
resc_io: 19629.00 resc_cpu: 180830676
ix_sel: 0.53548 ix_sel_with_filters: 0.53548
Cost: 4912.53 Resp: 4912.53 Degree: 1
****** trying bitmap/domain indexes ******
Best:: AccessPath: TableScan
Cost: 1419.89 Degree: 1 Resp: 1419.89 Card: 95450.37 Bytes: 0
SINGLE TABLE ACCESS PATH
Table: CDW_ORDERS Alias: CDW
Card: Original: 3375 Rounded: 3375 Computed: 3375.00 Non Adjusted: 3375.00
Access Path: TableScan
Cost: 292.51 Resp: 292.51 Degree: 0
Cost_io: 291.00 Cost_cpu: 12971896
Resp_io: 291.00 Resp_cpu: 12971896
Best:: AccessPath: TableScan
Cost: 292.51 Degree: 1 Resp: 292.51 Card: 3375.00 Bytes: 0
OPTIMIZER STATISTICS AND COMPUTATIONS
GENERAL PLANS
Considering cardinality-based initial join order.
Permutations for Starting Table :0
Join order[1]: CDW_ORDERS[CDW]#0 ORDERS[O]#1
Now joining: ORDERS[O]#1
NL Join
Outer table: Card: 3375.00 Cost: 292.51 Resp: 292.51 Degree: 1 Bytes: 132
Inner table: ORDERS Alias: O
Access Path: TableScan
NL Join: Cost: 4788284.86 Resp: 4788284.86 Degree: 0
Cost_io: 4748144.00 Cost_cpu: 343916534896
Resp_io: 4748144.00 Resp_cpu: 343916534896
kkofmx: index filter:"O"."STATUS_ID"=22
OPTIMIZER PERCENT INDEX CACHING = 100
Access Path: index (FullScan)
Index: ORDERS_ORD_MINOR__IDX
resc_io: 22497.00 resc_cpu: 217815366
ix_sel: 1 ix_sel_with_filters: 0.53548
NL Join: Cost: 19004464.41 Resp: 19004464.41 Degree: 1
Cost_io: 18982134.75 Cost_cpu: 191314735126
Resp_io: 18982134.75 Resp_cpu: 191314735126
OPTIMIZER PERCENT INDEX CACHING = 100
Access Path: index (AllEqJoin)
Index: ORDERS_STATUS_ID_IDX
resc_io: 1.00 resc_cpu: 7981
ix_sel: 1.0477e-05 ix_sel_with_filters: 1.0477e-05
NL Join: Cost: 1137.05 Resp: 1137.05 Degree: 1
Cost_io: 1134.75 Cost_cpu: 19706236
Resp_io: 1134.75 Resp_cpu: 19706236
****** trying bitmap/domain indexes ******
Best NL cost: 1137.05
resc: 1137.05 resc_io: 1134.75 resc_cpu: 19706236
resp: 1137.05 resp_io: 1134.75 resp_cpu: 19706236
adjusting AJ/SJ sel based on min/max ranges: jsel=min(1, 6.1094e-04)
Semi Join Card: 2.06 = outer (3375.00) * sel (6.1094e-04)
Join Card - Rounded: 2 Computed: 2.06
SM Join
Outer table:
resc: 292.51 card 3375.00 bytes: 132 deg: 1 resp: 292.51
Inner table: ORDERS Alias: O
resc: 1419.89 card: 95450.37 bytes: 8 deg: 1 resp: 1419.89
using dmeth: 2 #groups: 1
SORT resource Sort statistics
Sort width: 598 Area size: 616448 Max Area size: 104857600
Degree: 1
Blocks to Sort: 65 Row size: 156 Total Rows: 3375
Initial runs: 1 Merge passes: 0 IO Cost / pass: 0
Total IO sort cost: 0 Total CPU sort cost: 10349977
Total Temp space used: 0
SORT resource Sort statistics
Sort width: 598 Area size: 616448 Max Area size: 104857600
Degree: 1
Blocks to Sort: 223 Row size: 19 Total Rows: 95450
Initial runs: 2 Merge passes: 1 IO Cost / pass: 122
Total IO sort cost: 345 Total CPU sort cost: 85199490
Total Temp space used: 3089000
SM join: Resc: 2068.56 Resp: 2068.56 [multiMatchCost=0.00]
SM cost: 2068.56
resc: 2068.56 resc_io: 2044.00 resc_cpu: 210418716
resp: 2068.56 resp_io: 2044.00 resp_cpu: 210418716
SM Join (with index on outer)
Access Path: index (FullScan)
Index: CDW_ORD_COT_EXT_ID
resc_io: 2132.00 resc_cpu: 18119160
ix_sel: 1 ix_sel_with_filters: 1
Cost: 533.53 Resp: 533.53 Degree: 1
Outer table:
resc: 533.53 card 3375.00 bytes: 132 deg: 1 resp: 533.53
Inner table: ORDERS Alias: O
resc: 1419.89 card: 95450.37 bytes: 8 deg: 1 resp: 1419.89
using dmeth: 2 #groups: 1
SORT resource Sort statistics
Sort width: 598 Area size: 616448 Max Area size: 104857600
Degree: 1
Blocks to Sort: 223 Row size: 19 Total Rows: 95450
Initial runs: 2 Merge passes: 1 IO Cost / pass: 122
Total IO sort cost: 345 Total CPU sort cost: 85199490
Total Temp space used: 3089000
SM join: Resc: 2308.37 Resp: 2308.37 [multiMatchCost=0.00]
HA Join
Outer table:
resc: 292.51 card 3375.00 bytes: 132 deg: 1 resp: 292.51
Inner table: ORDERS Alias: O
resc: 1419.89 card: 95450.37 bytes: 8 deg: 1 resp: 1419.89
using dmeth: 2 #groups: 1
Cost per ptn: 1.67 #ptns: 1
hash_area: 151 (max=25600) Hash join: Resc: 1714.08 Resp: 1714.08 [multiMatchCost=0.00]
HA cost: 1714.08
resc: 1714.08 resc_io: 1699.00 resc_cpu: 129204369
resp: 1714.08 resp_io: 1699.00 resp_cpu: 129204369
Best:: JoinMethod: NestedLoopSemi
Cost: 1137.05 Degree: 1 Resp: 1137.05 Card: 2.06 Bytes: 140
Best so far: Table#: 0 cost: 292.5140 card: 3375.0000 bytes: 445500
Table#: 1 cost: 1137.0501 card: 2.0619 bytes: 280
Number of join permutations tried: 1
(newjo-save) [0 1 ]
Final - All Rows Plan: Best join order: 1
Cost: 1137.0501 Degree: 1 Card: 2.0000 Bytes: 280
Resc: 1137.0501 Resc_io: 1134.7500 Resc_cpu: 19706236
Resp: 1137.0501 Resp_io: 1134.7500 Resc_cpu: 19706236
kkoipt: Query block SEL$5DA710D3 (#1)
kkoqbc-end
: call(in-use=156048, alloc=164408), compile(in-use=103696, alloc=134224)
First K Rows: Setup end
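The semi-join arithmetic the poster asks about ("jsel=min(1, 6.1094e-04)") can at least be checked numerically. This is a minimal Python sketch of the multiplication the trace reports; where the 6.1094e-04 selectivity itself comes from is not derivable from this trace excerpt, so the value is simply taken as given.

```python
# Sanity check of the semi-join line in the 10053 trace:
#   "Semi Join Card: 2.06 = outer (3375.00) * sel (6.1094e-04)"
# The join selectivity has been capped by the min/max-range adjustment
# ("jsel=min(1, 6.1094e-04)"); this tiny selectivity, not the histogram,
# is what produces the estimate of 2 rows instead of ~96,657.

outer_card = 3375.00
jsel = min(1, 6.1094e-04)  # capped join selectivity, taken from the trace

semi_join_card = outer_card * jsel
print(round(semi_join_card, 2))  # 2.06  (Computed: 2.06 in the trace)
print(round(semi_join_card))     # 2     (Rounded join card in the trace)
```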
*********************** -
Wrong cardinality estimate for range scan
select * from v$version;
BANNER
Oracle Database 11g Enterprise Edition Release 11.2.0.2.0 - 64bit Production
PL/SQL Release 11.2.0.2.0 - Production
CORE 11.2.0.2.0 Production
TNS for Linux: Version 11.2.0.2.0 - Production
NLSRTL Version 11.2.0.2.0 - Production
SQL: select * from GC_FULFILLMENT_ITEMS where MARKETPLACE_ID=:b1 and GC_FULFILLMENT_STATUS_ID=:b2;
Plan
| Id | Operation | Name | Rows | Bytes | Cost (%CPU)| Time |
| 0 | SELECT STATEMENT | | 474K| 99M| 102 (85)| 00:00:01 |
| 1 | TABLE ACCESS BY INDEX ROWID| GC_FULFILLMENT_ITEMS | 474K| 99M| 102 (85)| 00:00:01 |
|* 2 | INDEX RANGE SCAN | I_GCFI_GCFS_ID_SDOC_MKTPLID | 474K| | 91 (95)| 00:00:01 |
Predicate Information (identified by operation id):
2 - access("GC_FULFILLMENT_STATUS_ID"=TO_NUMBER(:B2) AND "MARKETPLACE_ID"=TO_NUMBER(:B1))
filter("MARKETPLACE_ID"=TO_NUMBER(:B1))
If I use literals then the CBO uses cardinality = 1 (I believe this is due to fix control 5483301, which I set to off in my environment).
select * from GC_FULFILLMENT_ITEMS where MARKETPLACE_ID=5 and GC_FULFILLMENT_STATUS_ID=2;
| Id | Operation | Name | Rows | Bytes | Cost (%CPU)| Time |
| 0 | SELECT STATEMENT | | 1 | 220 | 3 (0)| 00:00:01 |
| 1 | TABLE ACCESS BY INDEX ROWID| GC_FULFILLMENT_ITEMS | 1 | 220 | 3 (0)| 00:00:01 |
|* 2 | INDEX RANGE SCAN | I_GCFI_GCFS_ID_SDOC_MKTPLID | 1 | | 2 (0)| 00:00:01 |
Predicate Information (identified by operation id):
2 - access("GC_FULFILLMENT_STATUS_ID"=2 AND "MARKETPLACE_ID"=5)
filter("MARKETPLACE_ID"=5)
Here is the column distribution and histogram information:
Enter value for column_name: MARKETPLACE_ID
COLUMN_NAME ENDPOINT_VALUE CUMMULATIVE_FREQUENCY FREQUENCY ENDPOINT_ACTUAL_VALU
MARKETPLACE_ID 1 1 1
MARKETPLACE_ID 3 8548 8547
MARKETPLACE_ID 4 15608 7060
MARKETPLACE_ID 5 16385 777 --->
MARKETPLACE_ID 35691 16398 13
MARKETPLACE_ID 44551 16407 9
6 rows selected.
Enter value for column_name: GC_FULFILLMENT_STATUS_ID
COLUMN_NAME ENDPOINT_VALUE CUMMULATIVE_FREQUENCY FREQUENCY ENDPOINT_ACTUAL_VALU
GC_FULFILLMENT_STATUS_ID 5 19602 19602
GC_FULFILLMENT_STATUS_ID 6 19612 10
GC_FULFILLMENT_STATUS_ID 8 19802 190
3 rows selected.
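The literal-value estimates implied by these frequency histograms can be sketched in Python. This reproduces the assumed formula card = (bucket frequency / total histogram rows) * num_rows using the listings above; it is an illustration of the arithmetic, not Oracle internals.

```python
# Literal-predicate estimate from a frequency histogram, using the
# figures listed above:
#   card = bucket_frequency / total_histogram_rows * num_rows

NUM_ROWS = 36703588   # Card: Original in the 10053 trace below

# MARKETPLACE_ID = 5: frequency 777 out of 16407 histogram rows
card_mp5 = 777 / 16407 * NUM_ROWS
print(round(card_mp5))  # ~1.74M; the actual count shown below is 1,471,839,
                        # so the (sample-based) histogram is in the right ballpark

# GC_FULFILLMENT_STATUS_ID = 2 is absent from the histogram entirely:
# only values 5, 6 and 8 were captured. How Oracle treats a value missing
# from a frequency histogram is version- and fix-control-dependent; with
# fix 5483301 disabled (as the outline below shows), the estimate collapses
# to 1 row, which is exactly what the literal plan above reports.
status_hist = {5: 19602, 6: 10, 8: 190}
print(2 in status_hist)  # False
```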
Actual distribution
select MARKETPLACE_ID,count(*) from GC_FULFILLMENT_ITEMS group by MARKETPLACE_ID order by 1;
MARKETPLACE_ID COUNT(*)
1 2099
3 16339936
4 13358682
5 1471839 --->
35691 33623
44551 19881
78931 40273
101611 1
6309408
9 rows selected.
BHAVIK_DBA: GC1EU> select GC_FULFILLMENT_STATUS_ID,count(*) from GC_FULFILLMENT_ITEMS group by GC_FULFILLMENT_STATUS_ID order by 1;
GC_FULFILLMENT_STATUS_ID COUNT(*)
1 880
2 63 --->
3 24
5 37226908
6 22099
7 18
8 325409
9 343
8 rows selected.
10053 trace:
SINGLE TABLE ACCESS PATH
Table: GC_FULFILLMENT_ITEMS Alias: GC_FULFILLMENT_ITEMS
Card: Original: 36703588.000000 Rounded: 474909 Computed: 474909.06 Non Adjusted: 474909.06
Best:: AccessPath: IndexRange
Index: I_GCFI_GCFS_ID_SDOC_MKTPLID
Cost: 102.05 Degree: 1 Resp: 102.05 Card: 474909.06 Bytes: 0
Outline Data:
/*+
BEGIN_OUTLINE_DATA
IGNORE_OPTIM_EMBEDDED_HINTS
OPTIMIZER_FEATURES_ENABLE('11.2.0.2')
DB_VERSION('11.2.0.2')
OPT_PARAM('_b_tree_bitmap_plans' 'false')
OPT_PARAM('_optim_peek_user_binds' 'false')
OPT_PARAM('_fix_control' '5483301:0')
ALL_ROWS
OUTLINE_LEAF(@"SEL$F5BB74E1")
MERGE(@"SEL$2")
OUTLINE(@"SEL$1")
OUTLINE(@"SEL$2")
INDEX_RS_ASC(@"SEL$F5BB74E1" "GC_FULFILLMENT_ITEMS"@"SEL$2" ("GC_FULFILLMENT_ITEMS"."GC_FULFILLMENT_STATUS_ID" "GC_FULFILLMENT_ITEMS"."SHIP_DELIVERY_OPTION_CODE" "GC_FULFILLMENT_ITEMS"."MARKETPLACE_ID"))
END_OUTLINE_DATA
*/
Is there any reason why the CBO is using card=474909.06? Having the fix control in place, it should have set card=1 if it considers GC_FULFILLMENT_STATUS_ID = 2 a "rare" value, shouldn't it?
OraDBA02 wrote:
You are right Charles.
I was reading one of your blog and saw that.
As you said, it is an issue with SQLPLUS.
However, the plan for the SQL coming from the application still shows the same (wrong cardinality) plan. It does not have the TO_NUMBER function because it does not experience the data-type conversion that SQL*Plus introduces.
But yes, the plan is exactly the same with/without TO_NUMBER.
OraDBA02,
I believe that some of the other people responding to this thread might have already described why the execution plan in the library cache is the same plan that you are seeing. One of the goals of using bind variables in SQL statements is to reduce the number of time consuming (and resource intensive) hard parses. That also means that a second goal is to share the same execution plan for future executions of the same SQL statement, even though bind variable values have changed. The catch here is that bind variable peeking, introduced with Oracle Database 9.0.1 (it may be disabled by modifying a hidden parameter), helps the optimizer select the "best" (lowest calculated cost) execution plan for those specific bind variable values - the same plan may not be the "best" execution plan for other sets of bind variable values on future executions.
Histograms on one or more of the columns in the WHERE clause could either help or hinder the situation further. It might further help the first execution, but might further hinder future executions with different bind variable values. Oracle Database 11.1 introduced something called adaptive cursor sharing (and 11.2 introduced cardinality feedback) that in theory addresses issues where the execution plan should change for later executions with different bind variable values (but the SQL statement must execute poorly at least once).
There might be multiple child cursors in the library cache for the same SQL statement, each potentially with a different execution plan. I suggest finding the SQL_ID of the SQL statement that the application is submitting (you can do this by checking V$SQL or V$SQLAREA). Once you have the SQL_ID, go back to the SQL statement that I suggested for displaying the execution plan:
SELECT * FROM TABLE (DBMS_XPLAN.DISPLAY_CURSOR(NULL,NULL,'TYPICAL'));
The first NULL in the above SQL statement is where you would specify the SQL_ID. If you leave the second NULL in place, the above SQL statement will retrieve the execution plan for all child cursors with that SQL_ID.
For instance, if the SQL_ID was 75chksrfa5fbt, you would execute the following:
SELECT * FROM TABLE (DBMS_XPLAN.DISPLAY_CURSOR('75chksrfa5fbt',NULL,'TYPICAL'));
Usually, you can take it a step further to see the bind variables that were used during the optimization phase. To do that, you would add the +PEEKED_BINDS format parameter:
SELECT * FROM TABLE (DBMS_XPLAN.DISPLAY_CURSOR('75chksrfa5fbt',NULL,'TYPICAL +PEEKED_BINDS'));
Note that there are various optimizer parameters that affect the optimizer's decisions; for instance, maybe the optimizer mode is set to FIRST_ROWS. Also possibly helpful is the +OUTLINE format parameter, which might provide a clue regarding the values of some of the parameters affecting the optimizer. The SQL statement that you would then enter is similar to the following:
SELECT * FROM TABLE (DBMS_XPLAN.DISPLAY_CURSOR('75chksrfa5fbt',NULL,'TYPICAL +PEEKED_BINDS +OUTLINE'));
Additional information might be helpful. Please see the following two forum threads to see what kind of information you should gather:
When your query takes too long… : When your query takes too long ...
How to post a SQL statement tuning request: HOW TO: Post a SQL statement tuning request - template posting
Charles Hooper
http://hoopercharles.wordpress.com/
IT Manager/Oracle DBA
K&M Machine-Fabricating, Inc. -
I'm trying to figure out how the CBO works and what parameters I should change to get it to work without "surprises". My Oracle version is 11.1.0.7, and this test was run in both single-instance and RAC, both on SUSE 10.
Here's the query:
SELECT * FROM
RAPIPAGO.RP_TRANSACCION_ITEM I
JOIN
RAPIPAGO.RP_TRANSACCION_ITEM_COMISION IT
ON I.ID_TRANSACCION_ITEM = IT.ID_TRANSACCION_ITEM
WHERE MES_PRESENTACION = '2009-05';
Or
SELECT * FROM
RAPIPAGO.RP_TRANSACCION_ITEM I,
RAPIPAGO.RP_TRANSACCION_ITEM_COMISION IT
WHERE I.ID_TRANSACCION_ITEM = IT.ID_TRANSACCION_ITEM AND MES_PRESENTACION = '2009-05';
Its cost is 156,175 and it takes 46 seconds to complete.
If I use a hint to force the engine to use a combined index (on ID_TRANSACCION_ITEM and MES_PRESENTACION, in that order), this is the result:
SELECT /*+ INDEX (I IX4_RP_TRANSACCION_ITEM) */ * FROM
RAPIPAGO.RP_TRANSACCION_ITEM I,
RAPIPAGO.RP_TRANSACCION_ITEM_COMISION IT
WHERE I.ID_TRANSACCION_ITEM = IT.ID_TRANSACCION_ITEM AND MES_PRESENTACION = '2009-05';
Its cost is 2,697,283 but it takes only 1 second to complete...
This behavior is causing me trouble in the production environment, as it is unpredictable and inefficient.
Is there any config that I can use to avoid or control this?
Thanks in advance.
If MES_PRESENTACION is a VARCHAR2, Oracle's ability to get accurate cardinality estimates will be greatly affected. The optimizer expects by default, for example, that if you have a DATE column with values of date '2008-01-01' through date '2009-08-01', that range represents 20 months, so any month will have roughly 5% of the data. If you store that data as a string, however, the optimizer's ability to predict how selective a filter will be is dramatically decreased.
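The distortion Justin describes can be made concrete with the textbook range-selectivity approximation the CBO is commonly described as using when there is no histogram: selectivity ≈ (hi - lo) / (max - min). The sketch below (plain Python, over an illustrative Jan-2001 through Jan-2025 monthly dataset; the formula is the standard approximation, not a dump of Oracle internals) compares the ratio when months are stored as real dates versus as YYYYMM numbers:

```python
from datetime import date

# Textbook range selectivity with no histogram: (hi - lo) / (max - min).
# The column holds one row per month, Jan-2001 .. Jan-2025 (289 months).
# This is the standard approximation, not Oracle's exact arithmetic.

def range_sel(lo, hi, col_min, col_max):
    return (hi - lo) / (col_max - col_min)

epoch = date(2001, 1, 1)

# Stored as DATE: distances are real elapsed days (151 of 8766).
d = range_sel((date(2008, 10, 1) - epoch).days,
              (date(2009, 3, 1) - epoch).days,
              0,
              (date(2025, 1, 1) - epoch).days)

# Stored as a YYYYMM NUMBER: the jump 200812 -> 200901 counts as 89
# units, and each year spans 100 units though only 12 are populated,
# so the same six months cover 93 of 2400 units.
n = range_sel(200810, 200903, 200101, 202501)

print(d)  # about 0.017 -- proportional to elapsed time
print(n)  # about 0.039 -- inflated by the phantom year-boundary gap
```

The two ratios differ by more than a factor of two for the same six calendar months, which is one way a non-DATE representation skews the optimizer's arithmetic even with perfectly fresh statistics.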
Can you generate the query plans using the DBMS_XPLAN package and include the filter and access predicates? When you do, can you enclose them in the \ tag to preserve white space? DBMS_XPLAN provides a lot of information that can be useful.
Does the query really return 2.5 million rows?
How many rows are in RP_TRANSACCION_ITEM?
How many rows are in RP_TRANSACCION_ITEM_COMISION?
Which table is the MES_PRESENTACION column in? How many rows have a MES_PRESENTACION value of '2009-05'?
Is there a histogram on the MES_PRESENTACION column?
Justin -
Datatype best practice and plan cardinality
Hi,
I have a scenario where I need to store the data in the format YYYYMM (e.g. 201001 which means January, 2010).
I am trying to evaluate what is the most appropriate datatype to store this kind of data. I am comparing 2 options, NUMBER and DATE.
As the data is essentially a component of the Oracle DATE datatype, and experts like Tom Kyte have shown (with examples) that using the right
datatype is better for the optimizer, I was expecting that using the DATE datatype would yield (at least) similar (if not better) cardinality estimates
than using NUMBER. However, my tests show that with DATE the cardinality estimates are way off from the actuals, whereas
with NUMBER the cardinality estimates are much closer to the actuals.
My questions are:
1) What should be the most appropriate datatype used to store YYYYMM data?
2) Why does using DATE datatype yield estimates that are way off from actuals than using NUMBER datatype?
SQL> select * from V$VERSION ;
BANNER
Oracle Database 10g Enterprise Edition Release 10.2.0.1.0 - Prod
PL/SQL Release 10.2.0.1.0 - Production
CORE 10.2.0.1.0 Production
TNS for Linux: Version 10.2.0.1.0 - Production
NLSRTL Version 10.2.0.1.0 - Production
SQL> create table a nologging as select to_number(to_char(add_months(to_date('200101','YYYYMM'),level - 1), 'YYYYMM')) id from dual connect by level <= 289 ;
Table created.
SQL> create table b (id number) ;
Table created.
SQL> begin
2 for i in 1..8192
3 loop
4 insert into b select * from a ;
5 end loop;
6 commit;
7 end;
8 /
PL/SQL procedure successfully completed.
SQL> alter table a add dt date ;
Table altered.
SQL> alter table b add dt date ;
Table altered.
SQL> select to_date(200101, 'YYYYMM') from dual ;
TO_DATE(2
01-JAN-01
SQL> update a set dt = to_date(id, 'YYYYMM') ;
289 rows updated.
SQL> update b set dt = to_date(id, 'YYYYMM') ;
2367488 rows updated.
SQL> commit ;
Commit complete.
SQL> exec dbms_stats.gather_table_stats(user, 'A', estimate_percent=>NULL) ;
PL/SQL procedure successfully completed.
SQL> exec dbms_stats.gather_table_stats(user, 'B', estimate_percent=>NULL) ;
SQL> explain plan for select count(*) from b where id between 200810 and 200903 ;
Explained.
SQL> select * from table(dbms_xplan.display) ;
PLAN_TABLE_OUTPUT
Plan hash value: 749587668
| Id | Operation | Name | Rows | Bytes | Cost (%CPU)| Time |
| 0 | SELECT STATEMENT | | 1 | 5 | 824 (4)| 00:00:10 |
| 1 | SORT AGGREGATE | | 1 | 5 | | |
|* 2 | TABLE ACCESS FULL| B | 46604 | 227K| 824 (4)| 00:00:10 |
Predicate Information (identified by operation id):
PLAN_TABLE_OUTPUT
2 - filter("ID"<=200903 AND "ID">=200810)
14 rows selected.
SQL> explain plan for select count(*) from b where dt between to_date(200810, 'YYYYMM') and to_date(200903, 'YYYYMM') ;
Explained.
SQL> select * from table(dbms_xplan.display) ;
PLAN_TABLE_OUTPUT
Plan hash value: 749587668
| Id | Operation | Name | Rows | Bytes | Cost (%CPU)| Time |
| 0 | SELECT STATEMENT | | 1 | 5 | 825 (4)| 00:00:10 |
| 1 | SORT AGGREGATE | | 1 | 5 | | |
|* 2 | TABLE ACCESS FULL| B | 5919 | 29595 | 825 (4)| 00:00:10 |
Predicate Information (identified by operation id):
PLAN_TABLE_OUTPUT
2 - filter("DT">=TO_DATE('2008-10-01 00:00:00', 'yyyy-mm-dd
hh24:mi:ss') AND "DT"<=TO_DATE('2009-03-01 00:00:00', 'yyyy-mm-dd
hh24:mi:ss'))
16 rows selected.
Charles,
Thanks for your response.
I did not think of the possibility of histograms. When I ran the tests on 10.2.0.4, I got the results you have shown.
So I thought it might be due to some bug in 10.2.0.1. But interestingly, when I ran the test after collecting statistics using the 'FOR ALL COLUMNS SIZE 1'
option, I got cardinalities that match my 10.2.0.1 results (where METHOD_OPT was the default, i.e. 'FOR ALL COLUMNS SIZE AUTO').
So I carried out the tests again on 10.2.0.1, but the results did not look consistent to me. When there were no histograms on the DATE column, the cardinality
was quite close to the actuals, but when I collected stats using 'FOR ALL COLUMNS SIZE SKEWONLY', it generated histograms on the DATE column and yet
the cardinality was not close to the actuals.
So I am a bit confused about whether this is due to a bug or due to the combined effect of the optimizer's "intelligence" while collecting statistics with default
option values and the way the table is queried (COL_USAGE$ data).
Here is my test:
SQL> select * from v$version ;
BANNER
Oracle Database 10g Enterprise Edition Release 10.2.0.1.0 - Prod
PL/SQL Release 10.2.0.1.0 - Production
CORE 10.2.0.1.0 Production
TNS for Linux: Version 10.2.0.1.0 - Production
NLSRTL Version 10.2.0.1.0 - Production
SQL> exec dbms_stats.delete_table_stats(user, 'B') ;
PL/SQL procedure successfully completed.
SQL> select column_name, num_distinct, num_buckets, histogram from user_tab_col_statistics where table_name = 'B' ;
no rows selected
SQL> exec dbms_stats.gather_table_stats(user, 'B') ;
PL/SQL procedure successfully completed.
SQL> select column_name, num_distinct, num_buckets, histogram from user_tab_col_statistics where table_name = 'B' ;
COLUMN_NAME NUM_DISTINCT NUM_BUCKETS HISTOGRAM
ID 289 254 HEIGHT BALANCED
DT 289 254 HEIGHT BALANCED
SQL> explain plan for select count(*) from b where b.id between 200810 and 200903 ;
Explained.
SQL> select * from table(dbms_xplan.display) ;
PLAN_TABLE_OUTPUT
Plan hash value: 749587668
| Id | Operation | Name | Rows | Bytes | Cost (%CPU)| Time |
| 0 | SELECT STATEMENT | | 1 | 5 | 3691 (1)| 00:00:45 |
| 1 | SORT AGGREGATE | | 1 | 5 | | |
|* 2 | TABLE ACCESS FULL| B | 38218 | 186K| 3691 (1)| 00:00:45 |
Predicate Information (identified by operation id):
2 - filter("B"."ID"<=200903 AND "B"."ID">=200810)
14 rows selected.
SQL> explain plan for select count(*) from b where b.dt between to_date(200810, 'YYYYMM') and to_date(200903, 'YYYYMM') ;
Explained.
SQL> select * from table(dbms_xplan.display) ;
PLAN_TABLE_OUTPUT
Plan hash value: 749587668
| Id | Operation | Name | Rows | Bytes | Cost (%CPU)| Time |
| 0 | SELECT STATEMENT | | 1 | 8 | 3693 (1)| 00:00:45 |
| 1 | SORT AGGREGATE | | 1 | 8 | | |
|* 2 | TABLE ACCESS FULL| B | 38218 | 298K| 3693 (1)| 00:00:45 |
Predicate Information (identified by operation id):
2 - filter("B"."DT"<=TO_DATE('2009-03-01 00:00:00', 'yyyy-mm-dd
hh24:mi:ss') AND "B"."DT">=TO_DATE('2008-10-01 00:00:00', 'yyyy-mm-dd
hh24:mi:ss'))
16 rows selected.
SQL> connect sys as sysdba ;
Connected.
SQL> delete from sys.col_usage$ where obj# in (select object_id from all_objects where owner = 'HR' and object_name in ('A','B')) ;
4 rows deleted.
SQL> commit ;
Commit complete.
SQL> connect hr/hr ;
Connected.
SQL> set serveroutput on size 10000
SQL> exec dbms_stats.delete_table_stats(user, 'B') ;
PL/SQL procedure successfully completed.
SQL> exec dbms_stats.gather_table_stats(user, 'B') ;
PL/SQL procedure successfully completed.
SQL> select column_name, num_distinct, num_buckets, histogram from user_tab_col_statistics where table_name = 'B' ;
COLUMN_NAME NUM_DISTINCT NUM_BUCKETS HISTOGRAM
ID 289 1 NONE
DT 289 1 NONE
SQL> explain plan for select count(*) from b where b.id between 200810 and 200903 ;
Explained.
SQL> select * from table(dbms_xplan.display) ;
PLAN_TABLE_OUTPUT
Plan hash value: 749587668
| Id | Operation | Name | Rows | Bytes | Cost (%CPU)| Time |
| 0 | SELECT STATEMENT | | 1 | 5 | 3691 (1)| 00:00:45 |
| 1 | SORT AGGREGATE | | 1 | 5 | | |
|* 2 | TABLE ACCESS FULL| B | 110K| 541K| 3691 (1)| 00:00:45 |
Predicate Information (identified by operation id):
2 - filter("B"."ID"<=200903 AND "B"."ID">=200810)
14 rows selected.
SQL> explain plan for select count(*) from b where b.dt between to_date(200810, 'YYYYMM') and to_date(200903, 'YYYYMM') ;
Explained.
SQL> select * from table(dbms_xplan.display) ;
PLAN_TABLE_OUTPUT
Plan hash value: 749587668
| Id | Operation | Name | Rows | Bytes | Cost (%CPU)| Time |
| 0 | SELECT STATEMENT | | 1 | 8 | 3693 (1)| 00:00:45 |
| 1 | SORT AGGREGATE | | 1 | 8 | | |
|* 2 | TABLE ACCESS FULL| B | 58680 | 458K| 3693 (1)| 00:00:45 |
Predicate Information (identified by operation id):
2 - filter("B"."DT"<=TO_DATE('2009-03-01 00:00:00', 'yyyy-mm-dd
hh24:mi:ss') AND "B"."DT">=TO_DATE('2008-10-01 00:00:00', 'yyyy-mm-dd
hh24:mi:ss'))
16 rows selected.
SQL> exec dbms_stats.gather_table_stats(user, 'B') ;
PL/SQL procedure successfully completed.
SQL> select column_name, num_distinct, num_buckets, histogram from user_tab_col_statistics where table_name = 'B' ;
COLUMN_NAME NUM_DISTINCT NUM_BUCKETS HISTOGRAM
ID 289 254 HEIGHT BALANCED
DT 289 1 NONE
SQL> explain plan for select count(*) from b where b.id between 200810 and 200903 ;
Explained.
SQL> select * from table(dbms_xplan.display) ;
PLAN_TABLE_OUTPUT
Plan hash value: 749587668
| Id | Operation | Name | Rows | Bytes | Cost (%CPU)| Time |
| 0 | SELECT STATEMENT | | 1 | 5 | 3690 (1)| 00:00:45 |
| 1 | SORT AGGREGATE | | 1 | 5 | | |
|* 2 | TABLE ACCESS FULL| B | 46303 | 226K| 3690 (1)| 00:00:45 |
Predicate Information (identified by operation id):
2 - filter("B"."ID"<=200903 AND "B"."ID">=200810)
14 rows selected.
SQL> explain plan for select count(*) from b where b.dt between to_date(200810, 'YYYYMM') and to_date(200903, 'YYYYMM') ;
Explained.
SQL> select * from table(dbms_xplan.display) ;
PLAN_TABLE_OUTPUT
Plan hash value: 749587668
| Id | Operation | Name | Rows | Bytes | Cost (%CPU)| Time |
| 0 | SELECT STATEMENT | | 1 | 8 | 3692 (1)| 00:00:45 |
| 1 | SORT AGGREGATE | | 1 | 8 | | |
|* 2 | TABLE ACCESS FULL| B | 56797 | 443K| 3692 (1)| 00:00:45 |
Predicate Information (identified by operation id):
2 - filter("B"."DT"<=TO_DATE('2009-03-01 00:00:00', 'yyyy-mm-dd
hh24:mi:ss') AND "B"."DT">=TO_DATE('2008-10-01 00:00:00', 'yyyy-mm-dd
hh24:mi:ss'))
16 rows selected.
SQL> exec dbms_stats.gather_table_stats(user, 'B') ;
PL/SQL procedure successfully completed.
SQL> select column_name, num_distinct, num_buckets, histogram from user_tab_col_statistics where table_name = 'B' ;
COLUMN_NAME NUM_DISTINCT NUM_BUCKETS HISTOGRAM
ID 289 254 HEIGHT BALANCED
DT 289 1 NONE
SQL> exec dbms_stats.gather_table_stats(user, 'B', method_opt=>'FOR ALL COLUMNS SIZE SKEWONLY') ;
PL/SQL procedure successfully completed.
SQL> select column_name, num_distinct, num_buckets, histogram from user_tab_col_statistics where table_name = 'B' ;
COLUMN_NAME NUM_DISTINCT NUM_BUCKETS HISTOGRAM
ID 289 254 HEIGHT BALANCED
DT 289 254 HEIGHT BALANCED
SQL> explain plan for select count(*) from b where b.dt between to_date(200810, 'YYYYMM') and to_date(200903, 'YYYYMM') ;
Explained.
SQL> select * from table(dbms_xplan.display) ;
PLAN_TABLE_OUTPUT
Plan hash value: 749587668
| Id | Operation | Name | Rows | Bytes | Cost (%CPU)| Time |
| 0 | SELECT STATEMENT | | 1 | 8 | 3692 (1)| 00:00:45 |
| 1 | SORT AGGREGATE | | 1 | 8 | | |
|* 2 | TABLE ACCESS FULL| B | 27862 | 217K| 3692 (1)| 00:00:45 |
Predicate Information (identified by operation id):
2 - filter("B"."DT"<=TO_DATE('2009-03-01 00:00:00', 'yyyy-mm-dd
hh24:mi:ss') AND "B"."DT">=TO_DATE('2008-10-01 00:00:00', 'yyyy-mm-dd
hh24:mi:ss'))
16 rows selected.
SQL> explain plan for select count(*) from b where id between 200810 and 200903 ;
Explained.
SQL> select * from table(dbms_xplan.display) ;
PLAN_TABLE_OUTPUT
Plan hash value: 749587668
| Id | Operation | Name | Rows | Bytes | Cost (%CPU)| Time |
| 0 | SELECT STATEMENT | | 1 | 5 | 3690 (1)| 00:00:45 |
| 1 | SORT AGGREGATE | | 1 | 5 | | |
|* 2 | TABLE ACCESS FULL| B | 32505 | 158K| 3690 (1)| 00:00:45 |
Predicate Information (identified by operation id):
2 - filter("ID"<=200903 AND "ID">=200810)
14 rows selected. -
hi all,
I think the cost affects the performance of a query.
Do the cardinality and bytes also have an effect?
Thanks to all in advance.
You want the cardinality estimate to be accurate, not low.
How many rows does the query actually return? If the query actually returns the number of rows (roughly) that the optimizer expects, that implies that the optimizer probably picked a pretty good plan. If the optimizer radically over- or under-estimated how many rows are going to be returned, the optimizer probably picked a bad plan.
Think of the cardinality estimate like an estimate you'd get in the real world. If you're looking for someone to remodel your kitchen, for example, and someone gives you an estimate of 1 hour while another gives you an estimate of 1 year, you can be pretty confident that neither of those estimates is going to work out well for you. The guy that estimated that it would only take an hour is obviously underestimating the cost of the job. The guy that estimated a year, on the other hand, is obviously overestimating the cost of the job. In the real world, if you got that sort of estimate, you'd probably assume that there had been some sort of miscommunication about exactly what work you wanted done. In the Oracle realm, you'd generally suspect that there were incorrect, invalid, or missing statistics on some object in the database that was causing the optimizer to make incorrect estimates and you'd work to fix those statistics so that the optimizer's estimate becomes reasonable.
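Justin's remodeling analogy suggests a simple mechanical sanity check: compare the estimated rows (E-Rows) against the actual rows (A-Rows) from a runtime execution plan (e.g. DBMS_XPLAN with runtime statistics) and flag any order-of-magnitude misestimate. The helper below is a generic sketch, not an Oracle API; the sample plan lines and their row counts are hypothetical:

```python
# Flag plan steps whose cardinality estimate is off by more than a
# factor of `threshold` in either direction (E-Rows vs A-Rows).
# The plan steps below are hypothetical examples, not real output.

def misestimates(steps, threshold=10):
    flagged = []
    for op, e_rows, a_rows in steps:
        # Guard against zero so the ratio is always defined.
        ratio = max(e_rows, 1) / max(a_rows, 1)
        if ratio > threshold or ratio < 1 / threshold:
            flagged.append((op, e_rows, a_rows))
    return flagged

plan = [
    ("TABLE ACCESS FULL SOT_REL_STAGE", 258_000, 8_183_000),  # ~32x under
    ("INDEX RANGE SCAN IDXL_SOTCRCT",   326_000,   327_000),  # accurate
    ("HASH JOIN",                             1, 9_977_000),  # wildly under
]
for op, e, a in misestimates(plan):
    print(f"{op}: estimated {e}, actual {a}")
```

Steps flagged by such a check are where to look for stale, missing, or misleading statistics, exactly as described above.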
Justin -
Query Degradation--Hash Join Degraded
Hi All,
I found a query degradation issue. I am on 10.2.0.3.0 (Sun OS) with optimizer_mode=ALL_ROWS.
This is a data warehouse db.
All 3 tables involved are partitioned tables (with daily partitions). Partitions are created in advance, and ELT jobs load bulk data into the daily partitions.
I have checked that the CBO is not using the local indexes created on them, which I believe is appropriate, because when I used an INDEX hint, the elapsed time increased.
I tried giving an index hint for each table one by one but did not get any performance improvement.
Partitions are loaded daily, and after loading, partition-level stats are gathered with dbms_stats.
We are collecting stats at the partition level (granularity=>'PARTITION'). Even after collecting global stats, there is no change in the access pattern. The stats gather command is given below.
PROCEDURE gather_table_part_stats(i_owner_name,i_table_name,i_part_name,i_estimate:= DBMS_STATS.AUTO_SAMPLE_SIZE, i_invalidate IN VARCHAR2 := 'Y',i_debug:= 'N')
Only SOT_KEYMAP.IPK_SOT_KEYMAP is GLOBAL. All other indexes are LOCAL.
Earlier, we were having a BIND PEEKING issue, which I fixed by introducing NO_INVALIDATE=>FALSE in the stats gather job.
Here, the partition name (20090219) is being passed through bind variables.
SELECT a.sotrelstg_sot_ud sotcrct_sot_ud,
b.sotkey_ud sotcrct_orig_sot_ud, a.ROWID stage_rowid
FROM (SELECT sotrelstg_sot_ud, sotrelstg_sys_ud,
sotrelstg_orig_sys_ord_id, sotrelstg_orig_sys_ord_vseq
FROM sot_rel_stage
WHERE sotrelstg_trd_date_ymd_part = '20090219'
AND sotrelstg_crct_proc_stat_cd = 'N'
AND sotrelstg_sot_ud NOT IN(
SELECT sotcrct_sot_ud
FROM sot_correct
WHERE sotcrct_trd_date_ymd_part ='20090219')) a,
(SELECT MAX(sotkey_ud) sotkey_ud, sotkey_sys_ud,
sotkey_sys_ord_id, sotkey_sys_ord_vseq,
sotkey_trd_date_ymd_part
FROM sot_keymap
WHERE sotkey_trd_date_ymd_part = '20090219'
AND sotkey_iud_cd = 'I'
--not to select logical deleted rows
GROUP BY sotkey_trd_date_ymd_part,
sotkey_sys_ud,
sotkey_sys_ord_id,
sotkey_sys_ord_vseq) b
WHERE a.sotrelstg_sys_ud = b.sotkey_sys_ud
AND a.sotrelstg_orig_sys_ord_id = b.sotkey_sys_ord_id
AND NVL(a.sotrelstg_orig_sys_ord_vseq, 1) = NVL(b.sotkey_sys_ord_vseq, 1);
During normal business hours, I found that the query takes 5-7 minutes (which is also not acceptable), but during high-load business hours it takes 30-50 minutes.
I found that most of the time is spent on the HASH JOIN (direct path write temp). We have sufficient RAM (64 GB total / 41 GB available).
Below is the execution plan i got during normal business hr.
| Id | Operation | Name | Starts | E-Rows | A-Rows | A-Time | Buffers | Reads | Writes | OMem | 1Mem | Used-Mem | Used-Tmp|
| 1 | HASH GROUP BY | | 1 | 1 | 7844K|00:05:28.78 | 16M| 217K| 35969 | | | | |
|* 2 | HASH JOIN | | 1 | 1 | 9977K|00:04:34.02 | 16M| 202K| 20779 | 580M| 10M| 563M (1)| 650K|
| 3 | NESTED LOOPS ANTI | | 1 | 6 | 7855K|00:01:26.41 | 16M| 1149 | 0 | | | | |
| 4 | PARTITION RANGE SINGLE| | 1 | 258K| 8183K|00:00:16.37 | 25576 | 1149 | 0 | | | | |
|* 5 | TABLE ACCESS FULL | SOT_REL_STAGE | 1 | 258K| 8183K|00:00:16.37 | 25576 | 1149 | 0 | | | | |
| 6 | PARTITION RANGE SINGLE| | 8183K| 326K| 327K|00:01:10.53 | 16M| 0 | 0 | | | | |
|* 7 | INDEX RANGE SCAN | IDXL_SOTCRCT_SOT_UD | 8183K| 326K| 327K|00:00:53.37 | 16M| 0 | 0 | | | | |
| 8 | PARTITION RANGE SINGLE | | 1 | 846K| 14M|00:02:06.36 | 289K| 180K| 0 | | | | |
|* 9 | TABLE ACCESS FULL | SOT_KEYMAP | 1 | 846K| 14M|00:01:52.32 | 289K| 180K| 0 | | | | |
I will attach the same for high-load business hours once the query returns results. It has been executing for the last 50 minutes.
INDEX STATS (INDEXES ARE LOCAL INDEXES)
TABLE_NAME INDEX_NAME COLUMN_NAME COLUMN_POSITION NUM_ROWS DISTINCT_KEYS CLUSTERING_FACTOR
SOT_REL_STAGE IDXL_SOTRELSTG_SOT_UD SOTRELSTG_SOT_UD 1 25461560 25461560 184180
SOT_REL_STAGE SOTRELSTG_TRD_DATE 2 25461560 25461560 184180
_YMD_PART
TABLE_NAME INDEX_NAME COLUMN_NAME COLUMN_POSITION NUM_ROWS DISTINCT_KEYS CLUSTERING_FACTOR
SOT_KEYMAP IDXL_SOTKEY_ENTORDSYS_UD SOTKEY_ENTRY_ORD_S 1 1012306940 3 38308680
YS_UD
SOT_KEYMAP IDXL_SOTKEY_HASH SOTKEY_HASH 1 1049582320 1049582320 1049579520
SOT_KEYMAP SOTKEY_TRD_DATE_YM 2 1049582320 1049582320 1049579520
D_PART
SOT_KEYMAP IDXL_SOTKEY_SOM_ORD SOTKEY_SOM_UD 1 1023998560 268949136 559414840
SOT_KEYMAP SOTKEY_SYS_ORD_ID 2 1023998560 268949136 559414840
SOT_KEYMAP IPK_SOT_KEYMAP SOTKEY_UD 1 1030369480 1015378900 24226580
TABLE_NAME INDEX_NAME COLUMN_NAME COLUMN_POSITION NUM_ROWS DISTINCT_KEYS CLUSTERING_FACTOR
SOT_CORRECT IDXL_SOTCRCT_SOT_UD SOTCRCT_SOT_UD 1 412484756 412484756 411710982
SOT_CORRECT SOTCRCT_TRD_DATE_Y 2 412484756 412484756 411710982
MD_PART
INDEX partiton stas (from dba_ind_partitions)
INDEX_NAME PARTITION_NAME STATUS BLEVEL LEAF_BLOCKS DISTINCT_KEYS CLUSTERING_FACTOR NUM_ROWS SAMPLE_SIZE LAST_ANALYZ GLO
IDXL_SOTCRCT_SOT_UD P20090219 USABLE 1 372 327879 216663 327879 327879 20-Feb-2009 YES
IDXL_SOTKEY_ENTORDSYS_UD P20090219 USABLE 2 2910 3 36618 856229 856229 19-Feb-2009 YES
IDXL_SOTKEY_HASH P20090219 USABLE 2 7783 853956 853914 853956 119705 19-Feb-2009 YES
IDXL_SOTKEY_SOM_ORD P20090219 USABLE 2 6411 531492 157147 799758 132610 19-Feb-2009 YES
IDXL_SOTRELSTG_SOT_UD P20090219 USABLE 2 13897 9682052 45867 9682052 794958 20-Feb-2009 YES
Thanks in advance.
Bhavik Desai
Hi Randolf,
Thanks for the time you spent on this issue.I appreciate it.
Please see my comments below:
1. You've mentioned several times that you're passing the partition name as bind variable, but you're obviously testing the statement with literals rather than bind
variables. So your tests obviously don't reflect what is going to happen in case of the actual execution. The cardinality estimates are potentially quite different when
using bind variables for the partition key.
Yes, I intentionally used literals in my tests. I found a couple of times that the plan used by the application and the plan generated by the AUTOTRACE/EXPLAIN PLAN commands were the same, and caused hour-long elapsed times.
As I pointed out earlier, last month we solved a couple of bind peeking issues by introducing NO_INVALIDATE=>FALSE in the stats gather procedure, which we execute just after the data
load into such daily partitions and before the start of the jobs that execute this query.
Execution plans from AWR (with parallelism on at table level, DEGREE>1). This is the plan the CBO used when the degradation occurred; it is the plan used most of the time.
ELAPSED_TIME_DELTA BUFFER_GETS_DELTA DISK_READS_DELTA CURSOR(SELECT*FROMTA
1918506000 46154275 918 CURSOR STATEMENT : 4
CURSOR STATEMENT : 4
PLAN_TABLE_OUTPUT
SQL_ID 39708a3azmks7
SELECT A.SOTRELSTG_SOT_UD SOTCRCT_SOT_UD, B.SOTKEY_UD SOTCRCT_ORIG_SOT_UD, A.ROWID STAGE_ROWID FROM (SELECT SOTRELSTG_SOT_UD,
SOTRELSTG_SYS_UD, SOTRELSTG_ORIG_SYS_ORD_ID, SOTRELSTG_ORIG_SYS_ORD_VSEQ FROM SOT_REL_STAGE WHERE SOTRELSTG_TRD_DATE_YMD_PART = :B1 AND
SOTRELSTG_CRCT_PROC_STAT_CD = 'N' AND SOTRELSTG_SOT_UD NOT IN( SELECT SOTCRCT_SOT_UD FROM SOT_CORRECT WHERE SOTCRCT_TRD_DATE_YMD_PART =
:B1 )) A, (SELECT MAX(SOTKEY_UD) SOTKEY_UD, SOTKEY_SYS_UD, SOTKEY_SYS_ORD_ID, SOTKEY_SYS_ORD_VSEQ, SOTKEY_TRD_DATE_YMD_PART FROM
SOT_KEYMAP WHERE SOTKEY_TRD_DATE_YMD_PART = :B1 AND SOTKEY_IUD_CD = 'I' GROUP BY SOTKEY_TRD_DATE_YMD_PART, SOTKEY_SYS_UD,
SOTKEY_SYS_ORD_ID, SOTKEY_SYS_ORD_VSEQ) B WHERE A.SOTRELSTG_SYS_UD = B.SOTKEY_SYS_UD AND A.SOTRELSTG_ORIG_SYS_ORD_ID =
B.SOTKEY_SYS_ORD_ID AND NVL(A.SOTRELSTG_ORIG_SYS_ORD_VSEQ, 1) = NVL(B.SOTKEY_SYS_ORD_VSEQ, 1)
Plan hash value: 1213870831
| Id | Operation | Name | Rows | Bytes | Cost (%CPU)| Time | Pstart| Pstop | TQ |IN-OUT| PQ Distrib |
| 0 | SELECT STATEMENT | | | | 19655 (100)| | | | | | |
| 1 | PX COORDINATOR | | | | | | | | | | |
| 2 | PX SEND QC (RANDOM) | :TQ10003 | 1 | 116 | 19655 (1)| 00:05:54 | | | Q1,03 | P->S | QC (RAND) |
| 3 | HASH GROUP BY | | 1 | 116 | 19655 (1)| 00:05:54 | | | Q1,03 | PCWP | |
| 4 | PX RECEIVE | | 1 | 116 | 19655 (1)| 00:05:54 | | | Q1,03 | PCWP | |
| 5 | PX SEND HASH | :TQ10002 | 1 | 116 | 19655 (1)| 00:05:54 | | | Q1,02 | P->P | HASH |
| 6 | HASH GROUP BY | | 1 | 116 | 19655 (1)| 00:05:54 | | | Q1,02 | PCWP | |
| 7 | NESTED LOOPS ANTI | | 1 | 116 | 19654 (1)| 00:05:54 | | | Q1,02 | PCWP | |
| 8 | HASH JOIN | | 1 | 102 | 19654 (1)| 00:05:54 | | | Q1,02 | PCWP | |
| 9 | PX JOIN FILTER CREATE| :BF0000 | 13M| 664M| 2427 (3)| 00:00:44 | | | Q1,02 | PCWP | |
| 10 | PX RECEIVE | | 13M| 664M| 2427 (3)| 00:00:44 | | | Q1,02 | PCWP | |
| 11 | PX SEND HASH | :TQ10000 | 13M| 664M| 2427 (3)| 00:00:44 | | | Q1,00 | P->P | HASH |
| 12 | PX BLOCK ITERATOR | | 13M| 664M| 2427 (3)| 00:00:44 | KEY | KEY | Q1,00 | PCWC | |
| 13 | TABLE ACCESS FULL| SOT_REL_STAGE | 13M| 664M| 2427 (3)| 00:00:44 | KEY | KEY | Q1,00 | PCWP | |
| 14 | PX RECEIVE | | 27M| 1270M| 17209 (1)| 00:05:10 | | | Q1,02 | PCWP | |
| 15 | PX SEND HASH | :TQ10001 | 27M| 1270M| 17209 (1)| 00:05:10 | | | Q1,01 | P->P | HASH |
| 16 | PX JOIN FILTER USE | :BF0000 | 27M| 1270M| 17209 (1)| 00:05:10 | | | Q1,01 | PCWP | |
| 17 | PX BLOCK ITERATOR | | 27M| 1270M| 17209 (1)| 00:05:10 | KEY | KEY | Q1,01 | PCWC | |
| 18 | TABLE ACCESS FULL| SOT_KEYMAP | 27M| 1270M| 17209 (1)| 00:05:10 | KEY | KEY | Q1,01 | PCWP | |
| 19 | PARTITION RANGE SINGLE| | 16185 | 221K| 0 (0)| | KEY | KEY | Q1,02 | PCWP | |
| 20 | INDEX RANGE SCAN | IDXL_SOTCRCT_SOT_UD | 16185 | 221K| 0 (0)| | KEY | KEY | Q1,02 | PCWP | |
Other Execution plan from AWR
ELAPSED_TIME_DELTA BUFFER_GETS_DELTA DISK_READS_DELTA CURSOR(SELECT*FROMTA
1053251381 0 2925 CURSOR STATEMENT : 4
CURSOR STATEMENT : 4
PLAN_TABLE_OUTPUT
SQL_ID 39708a3azmks7
SELECT A.SOTRELSTG_SOT_UD SOTCRCT_SOT_UD, B.SOTKEY_UD SOTCRCT_ORIG_SOT_UD, A.ROWID STAGE_ROWID FROM (SELECT SOTRELSTG_SOT_UD,
SOTRELSTG_SYS_UD, SOTRELSTG_ORIG_SYS_ORD_ID, SOTRELSTG_ORIG_SYS_ORD_VSEQ FROM SOT_REL_STAGE WHERE SOTRELSTG_TRD_DATE_YMD_PART = :B1 AND
SOTRELSTG_CRCT_PROC_STAT_CD = 'N' AND SOTRELSTG_SOT_UD NOT IN( SELECT SOTCRCT_SOT_UD FROM SOT_CORRECT WHERE SOTCRCT_TRD_DATE_YMD_PART =
:B1 )) A, (SELECT MAX(SOTKEY_UD) SOTKEY_UD, SOTKEY_SYS_UD, SOTKEY_SYS_ORD_ID, SOTKEY_SYS_ORD_VSEQ, SOTKEY_TRD_DATE_YMD_PART FROM
SOT_KEYMAP WHERE SOTKEY_TRD_DATE_YMD_PART = :B1 AND SOTKEY_IUD_CD = 'I' GROUP BY SOTKEY_TRD_DATE_YMD_PART, SOTKEY_SYS_UD,
SOTKEY_SYS_ORD_ID, SOTKEY_SYS_ORD_VSEQ) B WHERE A.SOTRELSTG_SYS_UD = B.SOTKEY_SYS_UD AND A.SOTRELSTG_ORIG_SYS_ORD_ID =
B.SOTKEY_SYS_ORD_ID AND NVL(A.SOTRELSTG_ORIG_SYS_ORD_VSEQ, 1) = NVL(B.SOTKEY_SYS_ORD_VSEQ, 1)
Plan hash value: 3434900850
| Id | Operation | Name | Rows | Bytes | Cost (%CPU)| Time | Pstart| Pstop | TQ |IN-OUT| PQ Distrib |
| 0 | SELECT STATEMENT | | | | 1830 (100)| | | | | | |
| 1 | PX COORDINATOR | | | | | | | | | | |
| 2 | PX SEND QC (RANDOM) | :TQ10003 | 1 | 131 | 1830 (2)| 00:00:33 | | | Q1,03 | P->S | QC (RAND) |
| 3 | HASH GROUP BY | | 1 | 131 | 1830 (2)| 00:00:33 | | | Q1,03 | PCWP | |
| 4 | PX RECEIVE | | 1 | 131 | 1830 (2)| 00:00:33 | | | Q1,03 | PCWP | |
| 5 | PX SEND HASH | :TQ10002 | 1 | 131 | 1830 (2)| 00:00:33 | | | Q1,02 | P->P | HASH |
| 6 | HASH GROUP BY | | 1 | 131 | 1830 (2)| 00:00:33 | | | Q1,02 | PCWP | |
| 7 | NESTED LOOPS ANTI | | 1 | 131 | 1829 (2)| 00:00:33 | | | Q1,02 | PCWP | |
| 8 | HASH JOIN | | 1 | 117 | 1829 (2)| 00:00:33 | | | Q1,02 | PCWP | |
| 9 | PX JOIN FILTER CREATE| :BF0000 | 1010K| 50M| 694 (1)| 00:00:13 | | | Q1,02 | PCWP | |
| 10 | PX RECEIVE | | 1010K| 50M| 694 (1)| 00:00:13 | | | Q1,02 | PCWP | |
| 11 | PX SEND HASH | :TQ10000 | 1010K| 50M| 694 (1)| 00:00:13 | | | Q1,00 | P->P | HASH |
| 12 | PX BLOCK ITERATOR | | 1010K| 50M| 694 (1)| 00:00:13 | KEY | KEY | Q1,00 | PCWC | |
| 13 | TABLE ACCESS FULL| SOT_KEYMAP | 1010K| 50M| 694 (1)| 00:00:13 | KEY | KEY | Q1,00 | PCWP | |
| 14 | PX RECEIVE | | 11M| 688M| 1129 (3)| 00:00:21 | | | Q1,02 | PCWP | |
| 15 | PX SEND HASH | :TQ10001 | 11M| 688M| 1129 (3)| 00:00:21 | | | Q1,01 | P->P | HASH |
| 16 | PX JOIN FILTER USE | :BF0000 | 11M| 688M| 1129 (3)| 00:00:21 | | | Q1,01 | PCWP | |
| 17 | PX BLOCK ITERATOR | | 11M| 688M| 1129 (3)| 00:00:21 | KEY | KEY | Q1,01 | PCWC | |
| 18 | TABLE ACCESS FULL| SOT_REL_STAGE | 11M| 688M| 1129 (3)| 00:00:21 | KEY | KEY | Q1,01 | PCWP | |
| 19 | PARTITION RANGE SINGLE| | 5209 | 72926 | 0 (0)| | KEY | KEY | Q1,02 | PCWP | |
| 20 | INDEX RANGE SCAN | IDXL_SOTCRCT_SOT_UD | 5209 | 72926 | 0 (0)| | KEY | KEY | Q1,02 | PCWP | |
EXECUTION PLAN AFTER SETTING DEGREE=1 (It was also degraded)
| Id | Operation | Name | Rows | Bytes |TempSpc| Cost (%CPU)| Time | Pstart| Pstop |
| 0 | SELECT STATEMENT | | 1 | 129 | | 42336 (2)| 00:12:43 | | |
| 1 | HASH GROUP BY | | 1 | 129 | | 42336 (2)| 00:12:43 | | |
| 2 | NESTED LOOPS ANTI | | 1 | 129 | | 42335 (2)| 00:12:43 | | |
|* 3 | HASH JOIN | | 1 | 115 | 51M| 42334 (2)| 00:12:43 | | |
| 4 | PARTITION RANGE SINGLE| | 846K| 41M| | 8241 (1)| 00:02:29 | 81 | 81 |
|* 5 | TABLE ACCESS FULL | SOT_KEYMAP | 846K| 41M| | 8241 (1)| 00:02:29 | 81 | 81 |
| 6 | PARTITION RANGE SINGLE| | 8161K| 490M| | 12664 (3)| 00:03:48 | 81 | 81 |
|* 7 | TABLE ACCESS FULL | SOT_REL_STAGE | 8161K| 490M| | 12664 (3)| 00:03:48 | 81 | 81 |
| 8 | PARTITION RANGE SINGLE | | 6525K| 87M| | 1 (0)| 00:00:01 | 81 | 81 |
|* 9 | INDEX RANGE SCAN | IDXL_SOTCRCT_SOT_UD | 6525K| 87M| | 1 (0)| 00:00:01 | 81 | 81 |
Predicate Information (identified by operation id):
3 - access("SOTRELSTG_SYS_UD"="SOTKEY_SYS_UD" AND "SOTRELSTG_ORIG_SYS_ORD_ID"="SOTKEY_SYS_ORD_ID" AND
NVL("SOTRELSTG_ORIG_SYS_ORD_VSEQ",1)=NVL("SOTKEY_SYS_ORD_VSEQ",1))
5 - filter("SOTKEY_TRD_DATE_YMD_PART"=20090219 AND "SOTKEY_IUD_CD"='I')
7 - filter("SOTRELSTG_CRCT_PROC_STAT_CD"='N' AND "SOTRELSTG_TRD_DATE_YMD_PART"=20090219)
9 - access("SOTRELSTG_SOT_UD"="SOTCRCT_SOT_UD" AND "SOTCRCT_TRD_DATE_YMD_PART"=20090219)
2. Why are you passing the partition name as a bind variable? A statement executing 5 mins. at best and more than 2 hours at worst obviously doesn't suffer from hard parsing issues, and therefore doesn't need to (shouldn't) share execution plans. So I strongly suggest using literals instead of bind variables. This also solves any potential issues caused by bind variable peeking.
This is a custom application which uses bind variables to extract data from daily partitions, so the data extract from the daily partitions is automated after the load and ELT process.
Here, the value of the bind variable is being passed through a procedure parameter. It would be a bit difficult to use literals in such an application.
3. All your posted plans suffer from bad cardinality estimates. The NO_MERGE hint suggested by Timur only caused a (significant) damage limitation by obviously reducing
the row source size by the group by operation before joining, but still the optimizer is way off, apart from the obviously wrong join order (larger row set first) in
particular the NESTED LOOP operation is causing the main troubles due to excessive logical I/O, as already pointed out by Timur.
Can I ask for alternatives to NESTED LOOP?
4. Your PLAN_TABLE seems to be old (you should see a corresponding note at the bottom of the DBMS_XPLAN.DISPLAY output), because none of the operations have any
filter/access predicate information attached. Since your main issue is the bad cardinality estimates, I strongly suggest dropping any existing PLAN_TABLEs in any non-Oracle-owned
schemas, because 10g already provides one in the SYS schema (the GTT PLAN_TABLE$) exposed via a public synonym, so that the EXPLAIN PLAN output provides the
"Predicate Information" section below the plan covering the filter/access predicates.
Please post a revised explain plan output including this crucial information so that we get a clue why the cardinality estimates are way off.
I have dropped the old plan. I got the execution plan above (listed in the first point) with the PREDICATE information.
"As already mentioned the usage of bind variables for the partition name makes this issue potentially worse."
Is there any workaround without replacing the bind variable? I am on 10g, so 11g's features will not help!
How are you gathering the statistics daily, can you post the exact command(s) used?
gather_table_part_stats(i_owner_name,i_table_name,i_part_name,i_estimate:= DBMS_STATS.AUTO_SAMPLE_SIZE, i_invalidate IN VARCHAR2 := 'Y',i_debug:= 'N')
Thanks & Regards,
Bhavik Desai -
Hi,
I have a question regarding explain plans. When we gather statistics, the optimizer chooses the best possible plan out of the available plans. If we do not gather statistics on a table for a long time, which plan does it choose?
Will it continue to use the same plan it used when the statistics were first gathered, or will the plan change as soon as DML activity is performed and the statistics grow stale?
Thanks
GK
Hi,
Aman.... wrote:
Gulshan wrote:
Hi,
I have a question regarding explain plans. When we gather statistics, the optimizer chooses the best possible plan out of the available plans. If we do not gather statistics on a table for a long time, which plan does it choose?
The same plan which it chose at the start with the previous statistics. The plan won't change automatically as long as you don't refresh the statistics.
This is wrong even for Oracle 9i. Here are a couple of examples of when a plan might change with the same optimizer statistics:
* When you have a histogram on a column and (not being a bright person) use bind variables, you might get a completely different execution plan because of a different incoming peeked value. All that is needed to fall into this trap is a fresh hard parse, which might happen for different reasons, for instance due to a session parameter modification (which can also change a plan even without a histogram)
* Starting with 10g, the CBO makes adjustments to cardinality estimates for out-of-range values appearing in predicates -
Behaviour of default value of METHOD_OPT
Hello,
I was trying to test the impact of the extended statistics feature of 11g when I was puzzled by another observation.
I created a table (from ALL_OBJECTS view). The data in this table was such that it had lots of rows where OWNER = 'PUBLIC'
and lots of rows where OBJECT_TYPE = 'JAVA CLASS' but no rows where OWNER = 'PUBLIC' AND OBJECT_TYPE = 'JAVA CLASS'.
I also created an index on the combination of (OWNER, OBJECT_TYPE).
Now, after collecting statistics on table and index, I queried the table for above condition (OWNER = 'PUBLIC' AND OBJECT_TYPE = 'JAVA CLASS').
To my surprise (or not), the query used the index.
Then I recollected the statistics on the table and index and now the same query started to do a full table scan.
Only the creation of extended statistics ensured that the plan changed back to indexed access subsequently. While this proved the usefulness of extended stats,
I am not sure how Oracle was able to use the indexed access path initially but not afterwards.
Is this due to column usage monitoring info? Can anybody help?
Here is my test case:
SQL> select * from v$version ;
BANNER
Oracle Database 11g Enterprise Edition Release 11.2.0.1.0 - Production
PL/SQL Release 11.2.0.1.0 - Production
CORE 11.2.0.1.0 Production
TNS for Linux: Version 11.2.0.1.0 - Production
NLSRTL Version 11.2.0.1.0 - Production
5 rows selected.
SQL> show parameter optimizer
NAME TYPE VALUE
optimizer_capture_sql_plan_baselines boolean FALSE
optimizer_dynamic_sampling integer 2
optimizer_features_enable string 11.2.0.1
optimizer_index_caching integer 0
optimizer_index_cost_adj integer 100
optimizer_mode string ALL_ROWS
optimizer_secure_view_merging boolean TRUE
optimizer_use_invisible_indexes boolean FALSE
optimizer_use_pending_statistics boolean FALSE
optimizer_use_sql_plan_baselines boolean TRUE
SQL> create table t1 nologging as select * from all_objects ;
Table created.
SQL> exec dbms_stats.gather_table_stats(user, 'T1', no_invalidate=>false) ;
PL/SQL procedure successfully completed.
SQL> select * from t1 where owner = 'PUBLIC' and object_type = 'JAVA CLASS' ;
no rows selected
SQL> select * from table(dbms_xplan.display_cursor) ;
PLAN_TABLE_OUTPUT
SQL_ID bnrj3cac3upfd, child number 0
select * from t1 where owner = 'PUBLIC' and object_type = 'JAVA CLASS'
Plan hash value: 3617692013
| Id | Operation | Name | Rows | Bytes | Cost (%CPU)| Time |
| 0 | SELECT STATEMENT | | | | 226 (100)| |
|* 1 | TABLE ACCESS FULL| T1 | 155 | 15190 | 226 (1)| 00:00:03 |
Predicate Information (identified by operation id):
1 - filter(("OBJECT_TYPE"='JAVA CLASS' AND "OWNER"='PUBLIC'))
18 rows selected.
SQL> create index t1_idx on t1(owner, object_type) nologging ;
Index created.
SQL> select * from t1 where owner = 'PUBLIC' and object_type = 'JAVA CLASS' ;
no rows selected
SQL> select * from table(dbms_xplan.display_cursor) ;
PLAN_TABLE_OUTPUT
SQL_ID bnrj3cac3upfd, child number 0
select * from t1 where owner = 'PUBLIC' and object_type = 'JAVA CLASS'
Plan hash value: 546753835
| Id | Operation | Name | Rows | Bytes | Cost (%CPU)| Time |
| 0 | SELECT STATEMENT | | | | 23 (100)| |
| 1 | TABLE ACCESS BY INDEX ROWID| T1 | 633 | 62034 | 23 (0)| 00:00:01 |
|* 2 | INDEX RANGE SCAN | T1_IDX | 633 | | 3 (0)| 00:00:01 |
Predicate Information (identified by operation id):
2 - access("OWNER"='PUBLIC' AND "OBJECT_TYPE"='JAVA CLASS')
19 rows selected.
SQL> REM This shows that CBO decided to use the index even when there are no extended statistics
SQL> REM Now, we will gather statistics on the table again and see what happens
SQL> exec dbms_stats.gather_table_stats(user, 'T1', no_invalidate=>false) ;
PL/SQL procedure successfully completed.
SQL> select * from t1 where owner = 'PUBLIC' and object_type = 'JAVA CLASS' ;
no rows selected
SQL> select * from table(dbms_xplan.display_cursor) ;
PLAN_TABLE_OUTPUT
SQL_ID bnrj3cac3upfd, child number 0
select * from t1 where owner = 'PUBLIC' and object_type = 'JAVA CLASS'
Plan hash value: 3617692013
| Id | Operation | Name | Rows | Bytes | Cost (%CPU)| Time |
| 0 | SELECT STATEMENT | | | | 226 (100)| |
|* 1 | TABLE ACCESS FULL| T1 | 11170 | 1069K| 226 (1)| 00:00:03 |
Predicate Information (identified by operation id):
1 - filter(("OBJECT_TYPE"='JAVA CLASS' AND "OWNER"='PUBLIC'))
18 rows selected.
SQL> REM And the plan changes to Full Table scan. Why?
user503699 wrote:
Hemant K Chitale wrote:
A change in statistics drives a change in expected cardinality, which drives a change in plan.
In that case, how does one explain the same execution plan but a huge difference in cardinalities between the first and third executions?
1) Oracle sometimes can use index statistics. This most likely explains the difference in cardinality estimates between the 1st and 2nd statements
2) You didn't specify estimate_percent, and that is not good practice. You can get different sets of statistics from different DBMS_STATS runs even when the data hasn't changed
3) As already pointed out previously, histograms make the CBO behave quite differently. Most likely you have histograms present for the 3rd statement, quite possibly as a result of not specifying estimate_percent -
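Point 1) above can be sketched arithmetically. The numbers below are assumptions for illustration, since the real NDVs and DISTINCT_KEYS aren't shown in the thread: with only column statistics the CBO multiplies per-column selectivities, while with statistics on the composite index it can use the index's DISTINCT_KEYS instead.

```python
# Hedged sketch of why the first two estimates in the test case differ
# (155 vs. 633 rows). All numbers below are assumed for illustration.

num_rows = 70_000       # assumed row count of T1
ndv_owner = 30          # assumed number of distinct OWNER values
ndv_type = 40           # assumed number of distinct OBJECT_TYPE values
distinct_keys = 110     # assumed DISTINCT_KEYS of the composite index T1_IDX

# Column statistics only: independence assumption multiplies selectivities.
est_columns = num_rows / (ndv_owner * ndv_type)

# With index statistics, the CBO can use 1/DISTINCT_KEYS as the combined
# selectivity; real data usually has far fewer value combinations than the
# cross product of the NDVs, so this estimate is larger.
est_index = num_rows / distinct_keys

print(round(est_columns), round(est_index))    # 58 636
```

Neither estimate captures that the specific combination ('PUBLIC', 'JAVA CLASS') has zero rows; only a column group (extended statistics) records per-combination frequencies.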
Hello Experts,
I am reading a white paper about CBO statistics the link is here. http://www.oracle.com/technetwork/database/focus-areas/bi-datawarehousing/twp-optimizer-stats-concepts-110711-1354477.
In the section on Frequency Histograms, in step 4, it gives a formula for how the optimizer calculates cardinality when using a frequency histogram.
the Optimizer would first need to determine how many buckets in the histogram have 10 as their end point. It does this by finding the bucket whose endpoint is 10, bucket 503, and then subtracts the previous bucket number, bucket 483, 503 - 483 = 20.
After that, the paragraph continues like below:
Then the cardinality estimate would be calculated using the following formula: (number of bucket endpoints / total number of buckets) X NUM_ROWS, i.e. 20/503 X 503, so the number of rows in the PROMOTIONS table where PROMO_CATEGORY_ID = 10 is 20.
My question is: when the optimizer subtracts the previous bucket number from the intended bucket number (in that example, 503 - 483 = 20), haven't we already found the cardinality? I don't understand why the optimizer needs the following formula. Can somebody explain why?
(number of bucket endpoints / total number of buckets) X NUM_ROWS
At the same time, if you look at the Oracle documentation here: Histograms
The end points are shown differently. For example, in the white paper the end point and the value are the same, whereas in the documentation the end point is the bucket number. Basically the concept of a histogram is very simple, but the documents make it confusing. Please share your thoughts.
Thanks in advance. -
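A small sketch of the frequency-histogram arithmetic may help. The (value, endpoint) pairs below are assumed for illustration; the key point is that ENDPOINT_NUMBER is cumulative, so the subtraction yields the bucket count for one value, and the (buckets / total buckets) X NUM_ROWS scaling matters in the general case where NUM_ROWS differs from the histogram total (for example, when statistics are gathered on a sample). In the paper's example both happen to be 503, so the result is simply 20.

```python
# Sketch of the frequency-histogram arithmetic. The (value, endpoint) pairs
# are assumed for illustration, shaped like DBA_TAB_HISTOGRAMS rows:
# ENDPOINT_NUMBER is cumulative, so consecutive differences are bucket counts.

histogram = [(3, 100), (5, 250), (9, 483), (10, 503)]
num_rows = 503    # in the white paper's example NUM_ROWS equals the total

def freq_cardinality(histogram, num_rows, value):
    """Estimate rows for 'column = value' from a frequency histogram."""
    prev = 0
    total = histogram[-1][1]    # total buckets (last cumulative endpoint)
    for endpoint_value, endpoint_number in histogram:
        if endpoint_value == value:
            buckets = endpoint_number - prev       # e.g. 503 - 483 = 20
            # Scale by NUM_ROWS / total: needed when statistics were
            # gathered on a sample and NUM_ROWS != total buckets.
            return buckets * num_rows / total
        prev = endpoint_number
    return None    # value not present in the histogram

print(freq_cardinality(histogram, num_rows, 10))   # 20.0
```

So the subtraction alone is only the final answer in the special case where the histogram covers every row; the formula generalizes it.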
Question about histograms and indexes
I read that if a histogram is generated for a column, and that column has an index, then if the WHERE clause contains a value with high cardinality, the CBO will skip using the index. The article was about the benefits of histograms.
My question is: why would the CBO skip using the index? Why not use it anyway? Is it because there is a cost associated with loading and using the index itself?
Would appreciate some clarification on this, thanks!
First, the article in question doesn't say to create histograms on columns with high-cardinality values. A primary key column is the ultimate high-cardinality column (each value is unique, after all), but it's rarely appropriate to create a histogram on it. It does say that histograms are generally useful when the data in a particular column is highly skewed; that is, when different values occur at wildly different rates.
If you have a table of orders with an ORDER_STATUS column, for example, 95% of your orders may be CLOSED, 3% may be SHIPPING, and 1.9% may be IN PROCESS and 0.1% may be CANCELLED. Without a histogram, Oracle would take a look at that column and see that there were 4 distinct values, so it would assume an equal distribution across the statuses. Which would cause it to favor a full scan on the table even if you were looking just for the CANCELLED orders. With a histogram, the optimizer would favor an index on ORDER_STATUS for the cancelled query while still favoring a table scan if you're looking for closed orders.
Gathering unnecessary histograms will make statistics collection take longer, which can cause issues with SLAs. It can also force you to gather statistics more frequently, or cause statistics to go stale more quickly, if you have monotonically increasing values in a column. If you have a CREATE_DATE column, for example, and gather a histogram, values greater than the max value at the time the histogram was gathered might be incorrectly estimated to be too infrequent, which can cause problems. If Oracle thinks that 1/6th of the rows are from Jan, 1/6th from Feb, and so on through June, and you start looking for values from July because you haven't gathered statistics in a month, the CBO's estimates are going to be off. Unnecessary histograms also cause Oracle to spend more time parsing queries, potentially with no better results. And they can make troubleshooting a bit more difficult because, depending on the version and various optimizer settings, there may be multiple query plans for the same statement, or different query plans depending on the particular bind variable that is passed in first.
Justin
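The ORDER_STATUS example above can be sketched numerically; the table size below is an assumption for illustration. Without a histogram the CBO assumes a uniform distribution (selectivity = 1/NDV), so every status gets the same estimate; a frequency histogram lets the per-value estimates track the skew.

```python
# Illustrative sketch of the skewed ORDER_STATUS distribution described
# above; the table size is an assumed number.

num_rows = 1_000_000
frequencies = {                # the skew described in the example
    'CLOSED':     0.95,
    'SHIPPING':   0.03,
    'IN PROCESS': 0.019,
    'CANCELLED':  0.001,
}

# Without a histogram: uniform assumption, selectivity = 1 / NDV,
# so every status gets the same cardinality estimate.
uniform_est = num_rows / len(frequencies)

# With a frequency histogram: per-value estimates follow the real skew.
histogram_est = {s: int(num_rows * f) for s, f in frequencies.items()}

print(int(uniform_est))               # 250000 for any status
print(histogram_est['CANCELLED'])     # 1000 -> an index looks attractive
print(histogram_est['CLOSED'])        # 950000 -> a full scan looks attractive
```

With the uniform estimate of 250,000 rows, the optimizer would favor a full scan for every status, including CANCELLED; the histogram is what makes the index attractive for the rare value while keeping the full scan for the common one.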