Re: Confidence of Cardinality Estimates (CBO)

Iordan Iotzov wrote:
Teradata has the concept of confidence of cardinality estimates -
http://developer.teradata.com/database/articles/can-we-speak-confidentially-exposing-explain-confidence-levels
In short, the optimizer tries to figure out the amount of “guesswork” that is included in a cardinality estimate.
Is there anything similar in Oracle? I am looking for anything - supported or not!
Nice idea, but no.
Internally, though, there are points where it does know that it is guessing (which is where optimizer_dynamic_sampling at level 3 comes into play) or where it is aware that predicate independence is not a realistic assumption (which is where level 4 comes into play).
Regards
Jonathan Lewis
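
The two dynamic sampling levels mentioned above can be tried at session level or per statement. A minimal sketch only: the table orders and its columns status and region are hypothetical and not from this thread.

```sql
-- Level 3: also sample tables where the optimizer had to guess
-- the selectivity of a predicate
ALTER SESSION SET optimizer_dynamic_sampling = 3;

-- Level 4: additionally sample tables that have single-table predicates
-- referencing two or more columns (where independence may not hold)
ALTER SESSION SET optimizer_dynamic_sampling = 4;

-- Statement-level alternative using the DYNAMIC_SAMPLING hint.
-- Table and column names are hypothetical.
SELECT /*+ dynamic_sampling(o 4) */ COUNT(*)
FROM   orders o
WHERE  o.status = 'OPEN'
AND    o.region = 'EMEA';
```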

Hi Jonathan,
In my limited 12c experience, the presence or absence of an adaptive execution plan (STATISTICS COLLECTOR) is a proxy for cardinality confidence in only a few situations.
That is, Oracle 12c generates an adaptive execution plan for most joins, even for joins in simple queries that pose no cardinality confidence challenges.
From what I have seen so far, the only situation where the CBO is confident enough about its estimates to skip the adaptive execution plan step (STATISTICS COLLECTOR) is a single-value primary key/unique index scan.
One possible explanation is that the cost of adding an adaptive execution plan step is so low that Oracle adds it almost indiscriminately.
I ran some tests ( http://wp.me/p1DHW2-7o ) and came across some interesting results:
     -> The cost of an adaptive execution plan is actually negative for nested loops. That is, turning adaptive execution plans off would make a nested loop run slower. The difference is small, but it seems to be statistically significant.
     -> The cost of an adaptive execution plan is negligible for hash joins. That is, no statistically significant difference was found between hash join queries that use adaptive execution plans and those that do not.
Regards,
Iordan Iotzov
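
For anyone who wants to repeat the observation, one way to check whether a plan contains a STATISTICS COLLECTOR step is to ask DBMS_XPLAN for the adaptive format. This is a sketch, not the method used in the tests above; the parameter that switches adaptive plans off differs between 12.1 and later releases, so verify it against your version.

```sql
-- Show the plan of the last statement run in this session,
-- including the inactive steps of an adaptive plan
SELECT * FROM TABLE(DBMS_XPLAN.DISPLAY_CURSOR(format => '+ADAPTIVE'));

-- List cursors that have an adaptive plan
-- (the column is NULL when the plan is not adaptive)
SELECT sql_id, child_number, is_resolved_adaptive_plan
FROM   v$sql
WHERE  is_resolved_adaptive_plan IS NOT NULL;

-- Switch adaptive features off for comparison runs (12.1 parameter)
ALTER SESSION SET optimizer_adaptive_features = FALSE;

-- From 12.2 onwards the equivalent setting for plans is:
-- ALTER SESSION SET optimizer_adaptive_plans = FALSE;
```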

Similar Messages

  • Confidence of Cardinality Estimates (CBO)

    Teradata has the concept of confidence of cardinality estimates -
    http://developer.teradata.com/database/articles/can-we-speak-confidentially-exposing-explain-confidence-levels
    In short, the optimizer tries to figure out the amount of “guesswork” that is included in a cardinality estimate.
    Is there anything similar in Oracle? I am looking for anything - supported or not!
    Thanks,
    Iordan Iotzov
    http://iiotzov.wordpress.com/


  • How can I improve the optimizer's poor cardinality estimates?

    Hi all,
    I have a query that is taking too long, and it looks like the cardinality estimates are way off. It seems particularly bad with the hash joins,
    and I don't know how to get the optimizer to produce a better estimate. The tables in the query were last analyzed a couple of weeks ago
    using dbms_stats with DBMS_STATS.AUTO_SAMPLE_SIZE and FOR ALL COLUMNS SIZE AUTO, but looking at dba_tab_col_statistics there is only a frequency histogram
    on a column not used in the query. The data hasn't really changed much since the last collection.
    This is 11.2.0.2.1 on Linux x86_64.
    create table test2 as
    select /*+ GATHER_PLAN_STATISTICS */ DISTINCT
    hts.resource_id resource_id,       
    a.start_time start_date,
    hta.attribute12 alias1
    FROM
    hxc_time_attribute_usages htu,
    hxc_time_attributes hta,       
    hxc_time_building_blocks a,
    hxc_time_building_blocks b,       
    hxc_time_building_blocks c,
    hxc_timecard_summary hts 
    WHERE    
    htu.time_attribute_id = hta.time_attribute_id       
    AND hta.attribute_category LIKE 'ELEMENT%'
    AND hta.attribute12 IS NOT NULL       
    AND htu.time_building_block_id = c.time_building_block_id       
    AND a.time_building_block_id = b.parent_building_block_id       
    AND b.time_building_block_id = c.parent_building_block_id       
    AND c.time_building_block_id = htu.time_building_block_id       
    AND a.scope = 'TIMECARD'       
    AND b.scope = 'DAY'       
    AND c.scope ='DETAIL'      
    AND hts.timecard_id = a.time_building_block_id
    Plan hash value: 1730726592
    | Id  | Operation                          | Name                      | Starts | E-Rows | A-Rows |   A-Time   | Buffers | Reads  | Writes |  OMem |  1Mem | Used-Mem | Used-Tmp|
    |   0 | CREATE TABLE STATEMENT             |                           |      1 |        |      0 |00:40:13.79 |     621K|   3182K|   2671K|       |       |          |         |
    |   1 |  LOAD AS SELECT                    |                           |      1 |        |      0 |00:40:13.79 |     621K|   3182K|   2671K|   529K|   529K|  529K (0)|         |
    |   2 |   SORT UNIQUE                      |                           |      1 |   1205 |    170K|00:40:13.37 |     618K|   3182K|   2671K|    11M|    11M|   10M (0)|         |
    |*  3 |    HASH JOIN                       |                           |      1 |   1205 |    135M|00:36:59.88 |     618K|   3182K|   2671K|  3325M|    63M|  371M (1)|      18M|
    |*  4 |     HASH JOIN                      |                           |      1 |  10829 |    143M|00:11:47.18 |     616K|    894K|    384K|  2047M|    32M|  539M (1)|    2748K|
    |*  5 |      HASH JOIN                     |                           |      1 |   9541 |     28M|00:06:43.60 |     500K|    448K|  54765 |   751M|    16M|  607M (1)|     456K|
    |*  6 |       HASH JOIN                    |                           |      1 |   8885 |   7561K|00:05:28.13 |     383K|    276K|      0 |   211M|  8846K|  278M (0)|         |
    |*  7 |        HASH JOIN                   |                           |      1 |  21193 |   2689K|00:05:00.55 |     266K|    160K|      0 |   169M|  9302K|  201M (0)|         |
    |*  8 |         TABLE ACCESS BY INDEX ROWID| HXC_TIME_ATTRIBUTES       |      1 |  20971 |   2637K|00:04:23.04 |     209K|    103K|      0 |       |       |          |         |
    |*  9 |          INDEX RANGE SCAN          | HXC_TIME_ATTRIBUTES_FK2   |      1 |  71213 |   2640K|00:01:25.09 |   15774 |  15764 |      0 |       |       |          |         |
    |  10 |         TABLE ACCESS FULL          | HXC_TIME_ATTRIBUTE_USAGES |      1 |   8451K|   8849K|00:00:08.71 |   56898 |  56825 |      0 |       |       |          |         |
    |* 11 |        TABLE ACCESS FULL           | HXC_TIME_BUILDING_BLOCKS  |      1 |   1094K|   2703K|00:00:08.01 |     116K|    116K|      0 |       |       |          |         |
    |* 12 |       TABLE ACCESS FULL            | HXC_TIME_BUILDING_BLOCKS  |      1 |   1094K|   3025K|00:00:13.12 |     116K|    116K|      0 |       |       |          |         |
    |* 13 |      TABLE ACCESS FULL             | HXC_TIME_BUILDING_BLOCKS  |      1 |   1206K|    284K|00:00:19.24 |     116K|    116K|      0 |       |       |          |         |
    |  14 |     TABLE ACCESS FULL              | HXC_TIMECARD_SUMMARY      |      1 |    118K|    124K|00:00:05.15 |    2212 |   1183 |      0 |       |       |          |         |
    Predicate Information (identified by operation id):
       3 - access("HTS"."TIMECARD_ID"="A"."TIME_BUILDING_BLOCK_ID")
       4 - access("A"."TIME_BUILDING_BLOCK_ID"="B"."PARENT_BUILDING_BLOCK_ID")
       5 - access("B"."TIME_BUILDING_BLOCK_ID"="C"."PARENT_BUILDING_BLOCK_ID")
       6 - access("HTU"."TIME_BUILDING_BLOCK_ID"="C"."TIME_BUILDING_BLOCK_ID")
       7 - access("HTU"."TIME_ATTRIBUTE_ID"="HTA"."TIME_ATTRIBUTE_ID")
       8 - filter("HTA"."ATTRIBUTE12" IS NOT NULL)
       9 - access("HTA"."ATTRIBUTE_CATEGORY" LIKE 'ELEMENT%')
           filter("HTA"."ATTRIBUTE_CATEGORY" LIKE 'ELEMENT%')
      11 - filter(("C"."SCOPE"='DETAIL' AND "C"."PARENT_BUILDING_BLOCK_ID" IS NOT NULL))
      12 - filter(("B"."SCOPE"='DAY' AND "B"."PARENT_BUILDING_BLOCK_ID" IS NOT NULL))
      13 - filter("A"."SCOPE"='TIMECARD')
    I can get a slight improvement if I set optimizer_dynamic_sampling=4:
    Plan hash value: 2768898101
    | Id  | Operation                      | Name                      | Starts | E-Rows | A-Rows |   A-Time   | Buffers | Reads  | Writes |  OMem |  1Mem | Used-Mem | Used-Tmp|
    |   0 | CREATE TABLE STATEMENT         |                           |      1 |        |      0 |00:20:21.47 |     621K|    760K|    355K|       |       |          |         |
    |   1 |  LOAD AS SELECT                |                           |      1 |        |      0 |00:20:21.47 |     621K|    760K|    355K|   529K|   529K|  529K (0)|         |
    |   2 |   SORT UNIQUE                  |                           |      1 |    433K|    170K|00:20:21.07 |     618K|    760K|    354K|    11M|    11M|   10M (0)|         |
    |*  3 |    HASH JOIN                   |                           |      1 |    433K|    135M|00:17:36.89 |     618K|    760K|    354K|   171M|  9261K|  233M (0)|         |
    |*  4 |     TABLE ACCESS BY INDEX ROWID| HXC_TIME_ATTRIBUTES       |      1 |   2509K|   2637K|00:00:06.33 |     209K|      0 |      0 |       |       |          |         |
    |*  5 |      INDEX RANGE SCAN          | HXC_TIME_ATTRIBUTES_FK2   |      1 |  71213 |   2640K|00:00:02.17 |   15767 |      0 |      0 |       |       |          |         |
    |*  6 |     HASH JOIN                  |                           |      1 |   1444K|    272M|00:07:38.99 |     409K|    760K|    354K|  2047M|    32M|  546M (1)|    2717K|
    |*  7 |      HASH JOIN                 |                           |      1 |    446K|     30M|00:01:43.68 |     352K|    399K|  50235 |   639M|    17M|  694M (1)|     418K|
    |*  8 |       HASH JOIN                |                           |      1 |    377K|   8329K|00:00:41.18 |     235K|    232K|      0 |    17M|  2383K|   27M (0)|         |
    |*  9 |        HASH JOIN               |                           |      1 |    134K|    277K|00:00:23.28 |     118K|    116K|      0 |  4912K|  1573K| 7759K (0)|         |
    |  10 |         TABLE ACCESS FULL      | HXC_TIMECARD_SUMMARY      |      1 |    118K|    124K|00:00:00.08 |    2212 |      0 |      0 |       |       |          |         |
    |* 11 |         TABLE ACCESS FULL      | HXC_TIME_BUILDING_BLOCKS  |      1 |   1206K|    284K|00:00:22.13 |     116K|    116K|      0 |       |       |          |         |
    |* 12 |        TABLE ACCESS FULL       | HXC_TIME_BUILDING_BLOCKS  |      1 |   2988K|   3025K|00:00:05.05 |     116K|    116K|      0 |       |       |          |         |
    |* 13 |       TABLE ACCESS FULL        | HXC_TIME_BUILDING_BLOCKS  |      1 |   2514K|   2703K|00:00:03.65 |     116K|    116K|      0 |       |       |          |         |
    |  14 |      TABLE ACCESS FULL         | HXC_TIME_ATTRIBUTE_USAGES |      1 |   8451K|   8849K|00:00:08.23 |   56898 |  56818 |      0 |       |       |          |         |
    Predicate Information (identified by operation id):
       3 - access("HTU"."TIME_ATTRIBUTE_ID"="HTA"."TIME_ATTRIBUTE_ID")
       4 - filter("HTA"."ATTRIBUTE12" IS NOT NULL)
       5 - access("HTA"."ATTRIBUTE_CATEGORY" LIKE 'ELEMENT%')
           filter("HTA"."ATTRIBUTE_CATEGORY" LIKE 'ELEMENT%')
       6 - access("HTU"."TIME_BUILDING_BLOCK_ID"="C"."TIME_BUILDING_BLOCK_ID")
       7 - access("B"."TIME_BUILDING_BLOCK_ID"="C"."PARENT_BUILDING_BLOCK_ID")
       8 - access("A"."TIME_BUILDING_BLOCK_ID"="B"."PARENT_BUILDING_BLOCK_ID")
       9 - access("HTS"."TIMECARD_ID"="A"."TIME_BUILDING_BLOCK_ID")
      11 - filter("A"."SCOPE"='TIMECARD')
      12 - filter(("B"."SCOPE"='DAY' AND "B"."PARENT_BUILDING_BLOCK_ID" IS NOT NULL))
      13 - filter(("C"."SCOPE"='DETAIL' AND "C"."PARENT_BUILDING_BLOCK_ID" IS NOT NULL))
    Note
       - dynamic sampling used for this statement (level=4)
    But I still have a large difference between the estimated and actual row counts. What can I do to help the optimizer get a better estimate?
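
    For reference, the E-Rows and A-Rows columns in the plans above come from rowsource execution statistics. A minimal sketch of how such a report is usually produced; the query shown is only a placeholder, not the statement from this post.

    ```sql
    -- In SQL*Plus, turn serveroutput off first; otherwise the "last"
    -- statement in the session is the dbms_output call, not your query
    SET SERVEROUTPUT OFF

    -- Run the statement with rowsource statistics enabled (placeholder query)
    SELECT /*+ GATHER_PLAN_STATISTICS */ COUNT(*) FROM dual;

    -- In the same session, report estimated vs actual rows
    -- for the last execution of the last statement
    SELECT * FROM TABLE(DBMS_XPLAN.DISPLAY_CURSOR(NULL, NULL, 'ALLSTATS LAST'));
    ```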

    Hi Dom
    Thank you for your input, it is always appreciated!
    I tried running with a manual workarea and a sort_area_size of 2000000000, but the result was worse than before.
    Plan hash value: 1730726592
    | Id  | Operation                          | Name                      | Starts | E-Rows | A-Rows |   A-Time   | Buffers | Reads  | Writes |  OMem |  1Mem | Used-Mem | Used-Tmp|
    |   0 | CREATE TABLE STATEMENT             |                           |      1 |        |      0 |00:36:15.63 |    1085K|   2926K|   2475K|       |       |          |         |
    |   1 |  LOAD AS SELECT                    |                           |      1 |        |      0 |00:36:15.63 |    1085K|   2926K|   2475K|   529K|   529K|  529K (0)|         |
    |   2 |   SORT UNIQUE                      |                           |      1 |   1205 |    170K|00:36:14.83 |    1083K|   2926K|   2475K|    11M|    11M|   10M (0)|         |
    |*  3 |    HASH JOIN                       |                           |      1 |   1205 |    135M|00:32:54.89 |    1083K|   2926K|   2475K|  3325M|    63M| 2048M (1)|      19M|
    |*  4 |     HASH JOIN                      |                           |      1 |  10829 |    143M|01:06:20.59 |     651K|    623K|    188K|  2047M|    32M| 2048M (1)|    1681K|
    |*  5 |      HASH JOIN                     |                           |      1 |   9541 |     28M|00:05:51.56 |     500K|    317K|      0 |   751M|    16M| 1088M (0)|         |
    |*  6 |       HASH JOIN                    |                           |      1 |   8885 |   7561K|00:03:03.10 |     383K|    201K|      0 |   211M|  8846K|  379M (0)|         |
    |*  7 |        HASH JOIN                   |                           |      1 |  21193 |   2689K|00:02:21.61 |     266K|  84965 |      0 |   169M|  9302K|  299M (0)|         |
    |*  8 |         TABLE ACCESS BY INDEX ROWID| HXC_TIME_ATTRIBUTES       |      1 |  20971 |   2637K|00:01:45.51 |     209K|  28147 |      0 |       |       |          |         |
    |*  9 |          INDEX RANGE SCAN          | HXC_TIME_ATTRIBUTES_FK2   |      1 |  71213 |   2640K|00:00:43.17 |   15769 |   6921 |      0 |       |       |          |         |
    |  10 |         TABLE ACCESS FULL          | HXC_TIME_ATTRIBUTE_USAGES |      1 |   8451K|   8849K|00:00:08.75 |   56898 |  56818 |      0 |       |       |          |         |
    |* 11 |        TABLE ACCESS FULL           | HXC_TIME_BUILDING_BLOCKS  |      1 |   1094K|   2703K|00:00:22.74 |     116K|    116K|      0 |       |       |          |         |
    |* 12 |       TABLE ACCESS FULL            | HXC_TIME_BUILDING_BLOCKS  |      1 |   1094K|   3025K|00:00:07.85 |     116K|    116K|      0 |       |       |          |         |
    |* 13 |      TABLE ACCESS FULL             | HXC_TIME_BUILDING_BLOCKS  |      1 |   1206K|    284K|00:00:07.42 |     117K|    116K|      0 |       |       |          |         |
    |  14 |     TABLE ACCESS FULL              | HXC_TIMECARD_SUMMARY      |      1 |    118K|    124K|00:00:01.57 |    2250 |    282 |      0 |       |       |          |         |
    Query Block Name / Object Alias (identified by operation id):
       1 - SEL$1
       8 - SEL$1 / HTA@SEL$1
       9 - SEL$1 / HTA@SEL$1
      10 - SEL$1 / HTU@SEL$1
      11 - SEL$1 / C@SEL$1
      12 - SEL$1 / B@SEL$1
      13 - SEL$1 / A@SEL$1
      14 - SEL$1 / HTS@SEL$1
    Predicate Information (identified by operation id):
       3 - access("HTS"."TIMECARD_ID"="A"."TIME_BUILDING_BLOCK_ID")
       4 - access("A"."TIME_BUILDING_BLOCK_ID"="B"."PARENT_BUILDING_BLOCK_ID")
       5 - access("B"."TIME_BUILDING_BLOCK_ID"="C"."PARENT_BUILDING_BLOCK_ID")
       6 - access("HTU"."TIME_BUILDING_BLOCK_ID"="C"."TIME_BUILDING_BLOCK_ID")
       7 - access("HTU"."TIME_ATTRIBUTE_ID"="HTA"."TIME_ATTRIBUTE_ID")
       8 - filter("HTA"."ATTRIBUTE12" IS NOT NULL)
       9 - access("HTA"."ATTRIBUTE_CATEGORY" LIKE 'ELEMENT%')
           filter("HTA"."ATTRIBUTE_CATEGORY" LIKE 'ELEMENT%')
      11 - filter(("C"."SCOPE"='DETAIL' AND "C"."PARENT_BUILDING_BLOCK_ID" IS NOT NULL))
      12 - filter(("B"."SCOPE"='DAY' AND "B"."PARENT_BUILDING_BLOCK_ID" IS NOT NULL))
      13 - filter("A"."SCOPE"='TIMECARD')
    66 rows selected.
    So I tried setting some cardinality hints, but again, it doesn't seem to have helped.
    select /*+ GATHER_PLAN_STATISTICS
    CARDINALITY(A@SEL$1 284000)
    CARDINALITY(HTA@SEL$1 2637000)
    CARDINALITY(C@SEL$1 270000)
    CARDINALITY(B@SEL$1 300000) */
    DISTINCT
    hts.resource_id resource_id,       
    a.start_time start_date,
    hta.attribute12 alias1  
    FROM hxc_time_attribute_usages htu,
    hxc_time_attributes hta,       
    hxc_time_building_blocks a,
    hxc_time_building_blocks b,       
    hxc_time_building_blocks c,
    hxc_timecard_summary hts 
    WHERE    
    htu.time_attribute_id =hta.time_attribute_id       
    AND hta.attribute_category LIKE 'ELEMENT%'
    AND hta.attribute12 IS NOT NULL       
    AND htu.time_building_block_id = c.time_building_block_id       
    AND a.time_building_block_id = b.parent_building_block_id       
    AND b.time_building_block_id = c.parent_building_block_id       
    AND c.time_building_block_id = htu.time_building_block_id       
    AND a.scope = 'TIMECARD'       
    AND b.scope = 'DAY'       
    AND c.scope ='DETAIL'        AND hts
    Plan hash value: 1839838244
    | Id  | Operation                      | Name                      | Starts | E-Rows | A-Rows |   A-Time   | Buffers | Reads  | Writes |  OMem |  1Mem | Used-Mem | Used-Tmp|
    |   0 | CREATE TABLE STATEMENT         |                           |      1 |        |      0 |00:18:00.46 |    1104K|   4415K|   3927K|       |       |          |         |
    |   1 |  LOAD AS SELECT                |                           |      1 |        |      0 |00:18:00.46 |    1104K|   4415K|   3927K|   529K|   529K|  529K (0)|         |
    |   2 |   SORT UNIQUE                  |                           |      1 |   9678 |    170K|00:17:59.95 |    1101K|   4415K|   3926K|    11M|    11M|   10M (0)|         |
    |*  3 |    HASH JOIN                   |                           |      1 |   9678 |    135M|01:02:43.31 |    1101K|   4415K|   3926K|  4318M|    98M|  305M (1)|      27M|
    |*  4 |     HASH JOIN                  |                           |      1 |  30691 |    272M|00:12:31.27 |     409K|   1009K|    602K|  2047M|    32M|  498M (1)|    2523K|
    |*  5 |      HASH JOIN                 |                           |      1 |   9485 |     30M|00:05:41.24 |     352K|    649K|    300K|  2047M|    33M|  461M (1)|    2355K|
    |*  6 |       HASH JOIN                |                           |      1 |  22123 |     31M|00:01:56.55 |     349K|    367K|  17535 |   493M|    18M|  661M (1)|     146K|
    |*  7 |        HASH JOIN               |                           |      1 |  79434 |   8102K|00:00:53.71 |     233K|    233K|      0 |   116M|    10M|  195M (0)|         |
    |*  8 |         TABLE ACCESS FULL      | HXC_TIME_BUILDING_BLOCKS  |      1 |    270K|   2703K|00:00:24.01 |     116K|    116K|      0 |       |       |          |         |
    |*  9 |         TABLE ACCESS FULL      | HXC_TIME_BUILDING_BLOCKS  |      1 |    300K|   3025K|00:00:10.42 |     116K|    116K|      0 |       |       |          |         |
    |* 10 |        TABLE ACCESS FULL       | HXC_TIME_BUILDING_BLOCKS  |      1 |    284K|    284K|00:00:11.44 |     116K|    116K|      0 |       |       |          |         |
    |  11 |       TABLE ACCESS FULL        | HXC_TIMECARD_SUMMARY      |      1 |    118K|    124K|00:00:01.88 |    2212 |    256 |      0 |       |       |          |         |
    |  12 |      TABLE ACCESS FULL         | HXC_TIME_ATTRIBUTE_USAGES |      1 |   8451K|   8849K|00:00:11.09 |   56898 |  56818 |      0 |       |       |          |         |
    |* 13 |     TABLE ACCESS BY INDEX ROWID| HXC_TIME_ATTRIBUTES       |      1 |   2637K|   2637K|00:03:20.70 |     210K|  56613 |      0 |       |       |          |         |
    |* 14 |      INDEX RANGE SCAN          | HXC_TIME_ATTRIBUTES_FK2   |      1 |  71213 |   2640K|00:01:34.62 |   16142 |  15527 |      0 |       |       |          |         |
    Query Block Name / Object Alias (identified by operation id):
       1 - SEL$1
       8 - SEL$1 / C@SEL$1
       9 - SEL$1 / B@SEL$1
      10 - SEL$1 / A@SEL$1
      11 - SEL$1 / HTS@SEL$1
      12 - SEL$1 / HTU@SEL$1
      13 - SEL$1 / HTA@SEL$1
      14 - SEL$1 / HTA@SEL$1
    Predicate Information (identified by operation id):
       3 - access("HTU"."TIME_ATTRIBUTE_ID"="HTA"."TIME_ATTRIBUTE_ID")
       4 - access("HTU"."TIME_BUILDING_BLOCK_ID"="C"."TIME_BUILDING_BLOCK_ID")
       5 - access("HTS"."TIMECARD_ID"="A"."TIME_BUILDING_BLOCK_ID")
       6 - access("A"."TIME_BUILDING_BLOCK_ID"="B"."PARENT_BUILDING_BLOCK_ID")
       7 - access("B"."TIME_BUILDING_BLOCK_ID"="C"."PARENT_BUILDING_BLOCK_ID")
       8 - filter(("C"."SCOPE"='DETAIL' AND "C"."PARENT_BUILDING_BLOCK_ID" IS NOT NULL))
       9 - filter(("B"."SCOPE"='DAY' AND "B"."PARENT_BUILDING_BLOCK_ID" IS NOT NULL))
      10 - filter("A"."SCOPE"='TIMECARD')
      13 - filter("HTA"."ATTRIBUTE12" IS NOT NULL)
      14 - access("HTA"."ATTRIBUTE_CATEGORY" LIKE 'ELEMENT%')
           filter("HTA"."ATTRIBUTE_CATEGORY" LIKE 'ELEMENT%')
    68 rows selected.
    SQL>
    What else do you think I should try? Or did I do the cardinality bit wrong? I don't seem to be able to hint the hash join, only the table scan.
    Thanks
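
    For readers who want to reproduce the manual workarea test described in this post, the session settings would look roughly like the sketch below. The post only mentions sort_area_size; setting hash_area_size explicitly is an assumption added here because the plan is dominated by hash joins.

    ```sql
    -- Switch this session from automatic PGA management to manual work areas
    ALTER SESSION SET workarea_size_policy = MANUAL;

    -- Value quoted in the post
    ALTER SESSION SET sort_area_size = 2000000000;

    -- Assumption: hash joins are sized from hash_area_size, so it is set
    -- explicitly here rather than left to be derived from sort_area_size
    ALTER SESSION SET hash_area_size = 2000000000;
    ```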

  • Query performance issues - Poor cardinality estimate?

    Hi,
    I have a query which is taking far longer than estimated by the explain plan (estimate 1min, query still running after several hours).
    Plan hash value: 3287246760
    | Id  | Operation                             | Name           | Rows  | Bytes | Cost (%CPU)| Time     |
    |   0 | SELECT STATEMENT                      |                |     1 |   195 |  3795   (1)| 00:00:46 |
    |   1 |  VIEW                                 |                |     1 |   195 |  3795   (1)| 00:00:46 |
    |   2 |   WINDOW SORT                         |                |     1 |   151 |  3795   (1)| 00:00:46 |
    |   3 |    VIEW                               |                |     1 |   151 |  3794   (1)| 00:00:46 |
    |   4 |     SORT UNIQUE                       |                |     1 |   147 |  3794   (1)| 00:00:46 |
    |   5 |      WINDOW BUFFER                    |                |     1 |   147 |  3794   (1)| 00:00:46 |
    |   6 |       SORT GROUP BY PIVOT             |                |     1 |   147 |  3794   (1)| 00:00:46 |
    |   7 |        NESTED LOOPS                   |                |       |       |            |          |
    |   8 |         NESTED LOOPS                  |                |     1 |   147 |  3793   (1)| 00:00:46 |
    |   9 |          NESTED LOOPS                 |                |     3 |   297 |  1503   (1)| 00:00:19 |
    |* 10 |           HASH JOIN                   |                |   238 | 15470 |    75   (7)| 00:00:01 |
    |  11 |            MAT_VIEW ACCESS FULL       | VENTILATION    | 17994 |   404K|    35   (0)| 00:00:01 |
    |  12 |            VIEW                       |                | 17994 |   738K|    39  (11)| 00:00:01 |
    |  13 |             SORT UNIQUE               |                | 17994 |   702K|    39  (11)| 00:00:01 |
    |  14 |              WINDOW SORT              |                | 17994 |   702K|    39  (11)| 00:00:01 |
    |* 15 |               VIEW                    |                | 17994 |   702K|    37   (6)| 00:00:01 |
    |  16 |                WINDOW SORT            |                | 17994 |   632K|    37   (6)| 00:00:01 |
    |  17 |                 MAT_VIEW ACCESS FULL  | VENTILATION    | 17994 |   632K|    35   (0)| 00:00:01 |
    |  18 |           INLIST ITERATOR             |                |       |       |            |          |
    |* 19 |            TABLE ACCESS BY INDEX ROWID| LABEVENTS      |     1 |    34 |     6   (0)| 00:00:01 |
    |* 20 |             INDEX RANGE SCAN          | LABEVENTS_O5   |     5 |       |     3   (0)| 00:00:01 |
    |* 21 |          INDEX RANGE SCAN             | CHARTEVENTS_O5 |  4937 |       |    12   (0)| 00:00:01 |
    |* 22 |         TABLE ACCESS BY INDEX ROWID   | CHARTEVENTS    |     1 |    48 |   763   (0)| 00:00:10 |
    Predicate Information (identified by operation id):
      10 - access("ICUS"."SUBJECT_ID"="FVGT48H"."SUBJECT_ID" AND
                  SYS_EXTRACT_UTC("FVGT48H"."BEGIN_TIME")=SYS_EXTRACT_UTC("ICUS"."BEGIN_TIME"))
      15 - filter((INTERNAL_FUNCTION("END_TIME")-INTERNAL_FUNCTION("BEGIN_TIME"))DAY(9) TO
                  SECOND(9)>INTERVAL'+02 00:00:00' DAY(2) TO SECOND(0))
      19 - filter(SYS_EXTRACT_UTC("LE"."CHARTTIME")>=SYS_EXTRACT_UTC("FVGT48H"."BEGIN_TIME") AND
                  SYS_EXTRACT_UTC("LE"."CHARTTIME")<=SYS_EXTRACT_UTC("FVGT48H"."END_TIME"))
      20 - access("ICUS"."ICUSTAY_ID"="LE"."ICUSTAY_ID" AND ("LE"."ITEMID"=50013 OR
                  "LE"."ITEMID"=50019))
           filter("LE"."ICUSTAY_ID" IS NOT NULL)
      21 - access("LE"."ICUSTAY_ID"="CE"."ICUSTAY_ID")
    I tried removing the nested loops using the NO_USE_NL hints, which gives the following plan:
    | Id  | Operation                            | Name           | Rows  | Bytes |TempSpc| Cost (%CPU)| Time     |
    |   0 | SELECT STATEMENT                     |                |     1 |   195 |       | 22789   (1)| 00:04:34 |
    |   1 |  VIEW                                |                |     1 |   195 |       | 22789   (1)| 00:04:34 |
    |   2 |   WINDOW SORT                        |                |     1 |   151 |       | 22789   (1)| 00:04:34 |
    |   3 |    VIEW                              |                |     1 |   151 |       | 22788   (1)| 00:04:34 |
    |   4 |     SORT UNIQUE                      |                |     1 |   147 |       | 22788   (1)| 00:04:34 |
    |   5 |      WINDOW BUFFER                   |                |     1 |   147 |       | 22788   (1)| 00:04:34 |
    |   6 |       SORT GROUP BY PIVOT            |                |     1 |   147 |       | 22788   (1)| 00:04:34 |
    |*  7 |        HASH JOIN                     |                |     1 |   147 |       | 22787   (1)| 00:04:34 |
    |   8 |         VIEW                         |                | 17994 |   738K|       |    39  (11)| 00:00:01 |
    |   9 |          SORT UNIQUE                 |                | 17994 |   702K|       |    39  (11)| 00:00:01 |
    |  10 |           WINDOW SORT                |                | 17994 |   702K|       |    39  (11)| 00:00:01 |
    |* 11 |            VIEW                      |                | 17994 |   702K|       |    37   (6)| 00:00:01 |
    |  12 |             WINDOW SORT              |                | 17994 |   632K|       |    37   (6)| 00:00:01 |
    |  13 |              MAT_VIEW ACCESS FULL    | VENTILATION    | 17994 |   632K|       |    35   (0)| 00:00:01 |
    |* 14 |         HASH JOIN                    |                | 11873 |  1217K|  5800K| 22747   (1)| 00:04:33 |
    |* 15 |          HASH JOIN                   |                | 86060 |  4790K|       | 16141   (2)| 00:03:14 |
    |  16 |           MAT_VIEW ACCESS FULL       | VENTILATION    | 17994 |   404K|       |    35   (0)| 00:00:01 |
    |* 17 |           TABLE ACCESS FULL          | LABEVENTS      |   176K|  5869K|       | 16105   (2)| 00:03:14 |
    |  18 |          INLIST ITERATOR             |                |       |       |       |            |          |
    |  19 |           TABLE ACCESS BY INDEX ROWID| CHARTEVENTS    |   104K|  4911K|       |  6024   (1)| 00:01:13 |
    |* 20 |            INDEX RANGE SCAN          | CHARTEVENTS_O4 |   104K|       |       |   220   (1)| 00:00:03 |
    Predicate Information (identified by operation id):
       7 - access("ICUS"."SUBJECT_ID"="FVGT48H"."SUBJECT_ID" AND
                  SYS_EXTRACT_UTC("FVGT48H"."BEGIN_TIME")=SYS_EXTRACT_UTC("ICUS"."BEGIN_TIME"))
           filter(SYS_EXTRACT_UTC("LE"."CHARTTIME")>=SYS_EXTRACT_UTC("FVGT48H"."BEGIN_TIME") AND
                  SYS_EXTRACT_UTC("LE"."CHARTTIME")<=SYS_EXTRACT_UTC("FVGT48H"."END_TIME"))
      11 - filter((INTERNAL_FUNCTION("END_TIME")-INTERNAL_FUNCTION("BEGIN_TIME"))DAY(9) TO
                  SECOND(9)>INTERVAL'+02 00:00:00' DAY(2) TO SECOND(0))
      14 - access("LE"."ICUSTAY_ID"="CE"."ICUSTAY_ID")
           filter(SYS_EXTRACT_UTC("CHARTTIME")<SYS_EXTRACT_UTC("LE"."CHARTTIME"))
      15 - access("ICUS"."ICUSTAY_ID"="LE"."ICUSTAY_ID")
      17 - filter("LE"."ICUSTAY_ID" IS NOT NULL AND ("LE"."ITEMID"=50013 OR "LE"."ITEMID"=50019))
      20 - access("CE"."ITEMID"=185 OR "CE"."ITEMID"=186 OR "CE"."ITEMID"=190 OR "CE"."ITEMID"=3420)
    The cardinality estimate looks way off to me: I'm expecting several thousand rows. I have up-to-date statistics.
    Can anyone help?
    Thanks,
    Dan

    WITH chf_patients AS (
    -- Exclude patients with CHF by ICD9 code
    select subject_id,
           hadm_id
      from mimic2v26.icd9
    where code in ('398.91','402.01','402.91','428.0','428.0', '428.1', '404.13', '404.93', '428.9', '404.91')
    )
    , icustays AS (
        /* Our ICU Stay population */
    SELECT *
      FROM MIMIC2V26.ICUSTAY_DETAIL
    WHERE ICUSTAY_AGE_GROUP = 'adult'
       AND SUBJECT_ID NOT IN (select subject_id from chf_patients)
    --   AND SUBJECT_ID < 50
    )
    --select * from icustays;
    -- Combine ventilation periods separated by < 48 hours.
    , combine_ventilation as (
    select subject_id,
           icustay_id,
           begin_time,
    --       end_time as end_first_vent,
    --       lead(begin_time,1) over (partition by icustay_id order by begin_time) as next_begin_time,
    --       lead(begin_time,1) over (partition by icustay_id order by begin_time) - begin_time as time_to_next,
           case when (lead(begin_time,1) over (partition by icustay_id order by begin_time) - begin_time) < interval '2' day
            then lead(end_time,1) over (partition by icustay_id order by begin_time)
            else end_time end as end_time
      from mimic2devel.ventilation
    )
    --select * from combine_ventilation;
    --select * from combine_ventilation where end_of_ventilation != end_time;
    -- Get the first ventilation period which is >  48 hours.
    , first_vent_gt_48hrs as (
    select distinct subject_id,
           first_value(begin_time) over (partition by subject_id order by begin_time) as begin_time,
           first_value(end_time) over (partition by subject_id order by begin_time) as end_time
      from combine_ventilation where end_time - begin_time > interval '48' hour
    )
    --select * from first_vent_gt_48hrs;
    -- Find the ICU stay when it occurred
    , icustay_first_vent_gt_48hrs as (
    select fvgt48h.subject_id,
           icus.icustay_id,
           fvgt48h.begin_time,
           fvgt48h.end_time
      from first_vent_gt_48hrs fvgt48h
      join mimic2devel.ventilation icus on icus.subject_id = fvgt48h.subject_id and fvgt48h.begin_time = icus.begin_time
    )
    --select /*+gather_plan_statistics*/ * from icustay_first_vent_gt_48hrs;
    , pao2_fio2_during_ventilation as (
    select /*+ NO_USE_NL(le ifvgt48h) */
           le.subject_id,
           le.hadm_id,
           le.icustay_id,
           charttime,
           case when itemid = 50019 then 'PAO2'
                when itemid = 50013 then 'FIO2'
           end as item_type,
           -- Some FIO2s are fractional instead of percentage
           case when itemid = 50013 and valuenum > 1 then round(valuenum / 100,2)
                else round(valuenum,2)
           end as valuenum
      from mimic2v26.labevents le
      join icustay_first_vent_gt_48hrs ifvgt48h on ifvgt48h.icustay_id = le.icustay_id and le.charttime between ifvgt48h.begin_time and ifvgt48h.end_time
    where le.itemid = 50019 or le.itemid = 50013
    )
    --select * from pao2_fio2_during_ventilation;
    -- Check that FIO2s have valid range
    , vent_data_pivot as (
    select * from (
        select subject_id, hadm_id, icustay_id, charttime, item_type, valuenum from pao2_fio2_during_ventilation)
        pivot ( max(valuenum) as valuenum for item_type in ('FIO2' as fio2, 'PAO2' as pao2) )
    )
    --select * from vent_data_pivot;
    -- Fill in prior FIO2 from chartevents
    , get_prior_fio2s as (
    select /*+ NO_USE_NL(vdp ce) */
           distinct
           vdp.subject_id,
           vdp.hadm_id,
           vdp.icustay_id,
           vdp.charttime as pao2_charttime,
           vdp.fio2_valuenum,
           vdp.pao2_valuenum,
    --       ce.itemid,
    --       ce.charttime as chart_charttime,
    --       ce.value1num,
           first_value(ce.value1num) over (partition by ce.icustay_id, vdp.charttime order by ce.charttime desc) as most_recent_fio2_raw,
           case when first_value(ce.value1num) over (partition by ce.icustay_id, vdp.charttime order by ce.charttime desc) > 1
              then round(first_value(ce.value1num) over (partition by ce.icustay_id, vdp.charttime order by ce.charttime desc) / 100,2)
              else round(first_value(ce.value1num) over (partition by ce.icustay_id, vdp.charttime order by ce.charttime desc),2)
           end as most_recent_fio2,
           first_value(ce.charttime) over (partition by ce.icustay_id, vdp.charttime order by ce.charttime desc) as most_recent_fio2_charttime,
           vdp.charttime - first_value(ce.charttime) over (partition by ce.icustay_id, vdp.charttime order by ce.charttime desc) as time_since_fio2,
    --       first_value(ce.charttime) over (partition by ce.icustay_id, vdp.charttime order by ce.charttime desc) as most_recent_charttime
           case when first_value(ce.value1num) over (partition by ce.icustay_id, vdp.charttime order by ce.charttime desc) > 1
              then round(vdp.pao2_valuenum/(first_value(ce.value1num) over (partition by ce.icustay_id, vdp.charttime order by ce.charttime desc) / 100),2)
              else round(vdp.pao2_valuenum/(first_value(ce.value1num) over (partition by ce.icustay_id, vdp.charttime order by ce.charttime desc)),2)
           end as pf_ratio,
           case when first_value(ce.value1num) over (partition by ce.icustay_id, vdp.charttime order by ce.charttime desc) > 1
              then
                case when vdp.pao2_valuenum/(first_value(ce.value1num) over (partition by ce.icustay_id, vdp.charttime order by ce.charttime desc) / 100) < 200 then 1 else 0 end
              else
                case when vdp.pao2_valuenum/(first_value(ce.value1num) over (partition by ce.icustay_id, vdp.charttime order by ce.charttime desc)) < 200 then 1 else 0 end
           end as pf_ratio_below_thresh
      from vent_data_pivot vdp
      join mimic2v26.chartevents ce on vdp.icustay_id = ce.icustay_id and ce.charttime < vdp.charttime
    where itemid in (190,3420,186,185)
    )
    --select * from get_prior_fio2s order by icustay_id, charttime;
    , pf_data as (
    select subject_id,
           hadm_id,
           icustay_id,
           pao2_charttime,
           lead(pao2_charttime) over (partition by icustay_id order by pao2_charttime) as next_pao2_charttime,
           fio2_valuenum,
           pao2_valuenum,
           lead(pao2_valuenum) over (partition by icustay_id order by pao2_charttime) as next_pao2_valuenum,
           most_recent_fio2_raw,
           most_recent_fio2,
           most_recent_fio2_charttime,
           time_since_fio2,
           pf_ratio,
           lead(pf_ratio) over (partition by icustay_id order by pao2_charttime) as next_pf_ratio,
           pf_ratio_below_thresh,
           lead(pf_ratio_below_thresh) over (partition by icustay_id order by pao2_charttime) as next_pf_ratio_below_thresh
      from get_prior_fio2s
    )
    select * from pf_data;
    Table structure is available here:
    http://mimic.physionet.org/schema/latest/
    Can I still get a TKPROF if the query doesn't complete? I'll have a go and post the results shortly.
    Thanks,
    Dan

  • User Generated Data and Cardinality Estimates

    Platform Information:
    Windows Server 2003 R2
    Oracle Enterprise Edition 10.2.0.4
    Optimizer Parameters:
    NAME                                 TYPE        VALUE
    object_cache_optimal_size            integer     102400
    optimizer_dynamic_sampling           integer     2
    optimizer_features_enable            string      10.2.0.4
    optimizer_index_caching              integer     90
    optimizer_index_cost_adj             integer     30
    optimizer_mode                       string      CHOOSE
    optimizer_secure_view_merging        boolean     TRUE
    Test Case:
    var csv VARCHAR2(250);
    exec :csv := '1,2,3,4,5,6,7,8,9,10';
    EXPLAIN PLAN FOR WITH csv_to_rows AS
    (
            SELECT UPPER(
                            TRIM(
                                    SUBSTR
                                    (
                                            txt
                                    ,       INSTR (txt, ',', 1, level  ) + 1
                                    ,       INSTR (txt, ',', 1, level+1) - INSTR (txt, ',', 1, level) -1
                                    )
                            )
                    )       AS token
            FROM    (
                            SELECT ','||:csv||',' txt
                            FROM dual
                    )       t
            CONNECT BY LEVEL <= LENGTH(:csv)-LENGTH(REPLACE(:csv,',',''))+1
    )
    SELECT * FROM csv_to_rows;
    SELECT  * FROM TABLE(DBMS_XPLAN.DISPLAY);
    Results:
    Execution Plan
    Plan hash value: 2403765415
    | Id  | Operation                     | Name | Rows  | Bytes | Cost (%CPU)| Time     |
    |   0 | SELECT STATEMENT              |      |     1 |    19 |     2   (0)| 00:00:01 |
    |   1 |  VIEW                         |      |     1 |    19 |     2   (0)| 00:00:01 |
    |*  2 |   CONNECT BY WITHOUT FILTERING|      |       |       |            |          |
    |   3 |    FAST DUAL                  |      |     1 |       |     2   (0)| 00:00:01 |
    Predicate Information (identified by operation id):
       2 - filter(LEVEL<=LENGTH(:CSV)-LENGTH(REPLACE(:CSV,',',''))+1)
    Statistics
              1  recursive calls
              0  db block gets
              0  consistent gets
              0  physical reads
              0  redo size
            502  bytes sent via SQL*Net to client
            396  bytes received via SQL*Net from client
              2  SQL*Net roundtrips to/from client
              1  sorts (memory)
              0  sorts (disk)
             10  rows processed
    Question:
    Every once in a while I need to use Tom Kyte's Varying in Lists method ( http://tkyte.blogspot.com/2006/06/varying-in-lists.html , the 9i version as indicated in his blog entry) to convert a comma-separated list to a "bindable" in-list.
    As the plan above shows, the cardinality estimate is not correct: the optimizer expects 1 row, but 10 rows were processed. The execution plans I have seen from this method usually result in a nested loop join using an index. While this makes sense for small result sets, it may not be the most efficient method with a larger number of entries in the comma-separated list.
    Has anyone found a way to expose the correct or near-correct cardinality to the optimizer at runtime? I can use the cardinality hint, but the problem is that it must be defined as a scalar value, and that may not work for all cases. Dynamic sampling won't work in this scenario because statistics already exist for DUAL.
    I haven't noticed any detrimental effects in my environment so this may be purely an academic discussion but I thought I'd throw it out there :)
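    For reference, a minimal sketch of the hard-coded approach mentioned above. It assumes the undocumented CARDINALITY hint is honoured in this release; the alias t and the value 10 are illustrative only and would have to be chosen to suit the typical list length.

    ```sql
    WITH csv_to_rows AS
    (
            SELECT  UPPER(TRIM(SUBSTR(txt
                    ,       INSTR(txt, ',', 1, LEVEL) + 1
                    ,       INSTR(txt, ',', 1, LEVEL + 1) - INSTR(txt, ',', 1, LEVEL) - 1
                    ))) AS token
            FROM    (SELECT ',' || :csv || ',' txt FROM dual)
            CONNECT BY LEVEL <= LENGTH(:csv) - LENGTH(REPLACE(:csv, ',', '')) + 1
    )
    -- CARDINALITY is an undocumented hint: it asks the optimizer to assume
    -- that about 10 rows come out of the factored subquery aliased as t.
    SELECT  /*+ CARDINALITY(t 10) */ *
    FROM    csv_to_rows t;
    ```

    The number in the hint is a literal fixed when the statement is written, which is exactly the limitation described above: it cannot follow the actual number of entries in :csv.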

    I have definitely considered using this as a possibility, but I have tried to shy away from writing data (even temporarily) when all I'm looking to do is query data.

    I agree with you about hard-coding the cardinality values.
    A slight variation on David's suggestion is to use the dynamic sampling hint to get the statistics at run time. There will be a slight performance cost to do this.
    Remember that all of the explain plan statistics are estimates, which may or may not be accurate. Usually they are good, but every once in a while they are incorrect.
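    A rough sketch of that variation, assuming David's suggestion was to load the parsed values into a global temporary table first (the table name csv_tokens and the alias t are hypothetical, not taken from the thread):

    ```sql
    -- Hypothetical global temporary table holding the parsed tokens
    CREATE GLOBAL TEMPORARY TABLE csv_tokens
    (
            token   VARCHAR2(250)
    )
    ON COMMIT DELETE ROWS;

    -- After populating csv_tokens from the comma-separated list, query it.
    -- The hint requests dynamic sampling at level 2 for the table aliased t,
    -- so the row estimate comes from a sample taken when the statement is
    -- optimized rather than from stored statistics.
    SELECT  /*+ DYNAMIC_SAMPLING(t 2) */ *
    FROM    csv_tokens t;
    ```

    The sample is taken only when the statement is hard parsed, so a cursor that is reused for a much longer or shorter list keeps the estimate from the first execution.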

  • CBO - Wrong Cardinality Estimate

    Hello,
    Version 10.2.0.3
    I am trying to understand the figures in the Explain Plan. I am not able to explain the cardinality of 70 on step 4.
    The query takes very long to execute (about 400 secs). I would expect HASH JOIN SEMI instead of NESTED LOOPS SEMI.
    I have tried to provide as much information as possible. I have just requested the 10053 trace but don't have it yet.
    There is a primary key on the ORDERS.ORDER_ID (NUMBER) column. However, we are forced to use to_char(order_id) to accommodate COT_EXTERNAL_ID being a VARCHAR2 field.
      1  select cdw.* from cdw_orders cdw where cdw.cot_external_id in
      2  (
      3  select to_char(order_id) from orders o where o.status_id in (12,16,22)
      4* )
    SQL> /
    Execution Plan
    Plan hash value: 733167152
    | Id  | Operation                     | Name                 | Rows  | Bytes | Cost (%CPU)| Time     |
    |   0 | SELECT STATEMENT              |                      |     2 |   280 |   326   (1)| 00:00:04 |
    |   1 |  NESTED LOOPS SEMI            |                      |     2 |   280 |   326   (1)| 00:00:04 |
    |   2 |   TABLE ACCESS FULL           | CDW_ORDERS           |  3362 |   433K|   293   (1)| 00:00:04 |
    |   3 |   INLIST ITERATOR             |                      |       |       |            |          |
    |*  4 |    TABLE ACCESS BY INDEX ROWID| ORDERS               |    70 |   560 |     1   (0)| 00:00:01 |
    |*  5 |     INDEX RANGE SCAN          | ORDERS_STATUS_ID_IDX |     2 |       |     1   (0)| 00:00:01 |
    Predicate Information (identified by operation id):
       4 - filter("CDW"."COT_EXTERNAL_ID"=TO_CHAR("ORDER_ID"))
       5 - access("O"."STATUS_ID"=12 OR "O"."STATUS_ID"=16 OR "O"."STATUS_ID"=22)
    Here are some details on the table columns and data.
    SQL> select column_name,num_distinct,density,num_nulls,num_buckets from all_tab_columns where table_name = 'ORDERS'
      2  and column_name in ('STATUS_ID','ORDER_ID');
    COLUMN_NAME                    NUM_DISTINCT                        DENSITY  NUM_NULLS NUM_BUCKETS
    ORDER_ID                             177951             .00000561952447584          0         254
    STATUS_ID                                23             .00000275335899280          0          23
    SQL> select num_rows from all_tables where table_name = 'ORDERS';
      NUM_ROWS
        177951
    SQL> select index_name,distinct_keys,clustering_factor,num_rows,sample_size from all_indexes where index_name = 'ORDERS_STATUS_ID_IDX'
      2  /
    INDEX_NAME                     DISTINCT_KEYS CLUSTERING_FACTOR   NUM_ROWS SAMPLE_SIZE
    ORDERS_STATUS_ID_IDX                      25             35893     177951      177951
    Histograms on column STATUS_ID
    SQL> select * from (
      2  select column_name,endpoint_value,endpoint_number- nvl(lag(endpoint_number) over (order by endpoint_value),0) count
      3  from all_tab_histograms where column_name = 'STATUS_ID' and table_name = 'ORDERS'
      4  ) where endpoint_value in (12,16,22);
    COLUMN_NAME                    ENDPOINT_VALUE      COUNT
    STATUS_ID                                  12        494
    STATUS_ID                                  16         24
    STATUS_ID                                  22       3064
    SQL> select max(endpoint_number) from all_tab_histograms where column_name = 'STATUS_ID' and table_name = 'ORDERS' ;
    MAX(ENDPOINT_NUMBER)
                    5641
    I tried to run the query for individual values instead of the in-list to check the numbers.
      1  select cdw.* from cdw_orders cdw where cdw.cot_external_id in
      2  (
      3  select to_char(order_id) from orders o where o.status_id = 12
      4* )
    SQL> /
    Execution Plan
    Plan hash value: 3178043291
    | Id  | Operation                    | Name                 | Rows  | Bytes | Cost (%CPU)| Time     |
    |   0 | SELECT STATEMENT             |                      |     2 |   280 |    33  (19)| 00:00:01 |
    |   1 |  MERGE JOIN SEMI             |                      |     2 |   280 |    33  (19)| 00:00:01 |
    |   2 |   TABLE ACCESS BY INDEX ROWID| CDW_ORDERS           |  3362 |   433K|    21   (0)| 00:00:01 |
    |   3 |    INDEX FULL SCAN           | CDW_ORD_COT_EXT_ID   |  3362 |       |     2   (0)| 00:00:01 |
    |*  4 |   SORT UNIQUE                |                      | 15584 |   121K|    11  (46)| 00:00:01 |
    |*  5 |    VIEW                      | index$_join$_002     | 15584 |   121K|     9  (34)| 00:00:01 |
    |*  6 |     HASH JOIN                |                      |       |       |            |          |
    |*  7 |      INDEX RANGE SCAN        | ORDERS_STATUS_ID_IDX | 15584 |   121K|     1   (0)| 00:00:01 |
    |   8 |      INDEX FAST FULL SCAN    | PK_ORDERS            | 15584 |   121K|     5   (0)| 00:00:01 |
    Predicate Information (identified by operation id):
       4 - access("CDW"."COT_EXTERNAL_ID"=TO_CHAR("ORDER_ID"))
           filter("CDW"."COT_EXTERNAL_ID"=TO_CHAR("ORDER_ID"))
       5 - filter("O"."STATUS_ID"=12)
       6 - access(ROWID=ROWID)
       7 - access("O"."STATUS_ID"=12)
    For status_id = 12, the cardinality on step 7 for orders_status_id_idx is 15584, which is in line with the expectation, i.e. (494/5641)*177951 = 15583.7 ~ 15584.
    Now I continue the same with status_id = 16.
      1  select cdw.* from cdw_orders cdw where cdw.cot_external_id in
      2  (
      3  select to_char(order_id) from orders o where o.status_id = 16
      4* )
    SQL> /
    Execution Plan
    Plan hash value: 43581000
    | Id  | Operation                      | Name                 | Rows  | Bytes | Cost (%CPU)| Time     |
    |   0 | SELECT STATEMENT               |                      |  1363 |   186K|    10  (10)| 00:00:01 |
    |   1 |  TABLE ACCESS BY INDEX ROWID   | CDW_ORDERS           |     2 |   264 |     1   (0)| 00:00:01 |
    |   2 |   NESTED LOOPS                 |                      |  1363 |   186K|    10  (10)| 00:00:01 |
    |   3 |    SORT UNIQUE                 |                      |   757 |  6056 |     2   (0)| 00:00:01 |
    |   4 |     TABLE ACCESS BY INDEX ROWID| ORDERS               |   757 |  6056 |     2   (0)| 00:00:01 |
    |*  5 |      INDEX RANGE SCAN          | ORDERS_STATUS_ID_IDX |   757 |       |     1   (0)| 00:00:01 |
    |*  6 |    INDEX RANGE SCAN            | CDW_ORD_COT_EXT_ID   |     2 |       |     1   (0)| 00:00:01 |
    Predicate Information (identified by operation id):
       5 - access("O"."STATUS_ID"=16)
       6 - access("CDW"."COT_EXTERNAL_ID"=TO_CHAR("ORDER_ID"))
    Here also the cardinality on step 5 for orders_status_id_idx is as expected, i.e. (24/5641)*177951 = 757.1 ~ 757.
    Finally, running the same for status_id = 22 surprises me.
      1  select cdw.* from cdw_orders cdw where cdw.cot_external_id in
      2  (
      3  select to_char(order_id) from orders o where o.status_id = 22
      4* )
    SQL> /
    Execution Plan
    Plan hash value: 3496542905
    | Id  | Operation                    | Name                 | Rows  | Bytes | Cost (%CPU)| Time     |
    |   0 | SELECT STATEMENT             |                      |     2 |   280 |   326   (1)| 00:00:04 |
    |   1 |  NESTED LOOPS SEMI           |                      |     2 |   280 |   326   (1)| 00:00:04 |
    |   2 |   TABLE ACCESS FULL          | CDW_ORDERS           |  3362 |   433K|   293   (1)| 00:00:04 |
    |*  3 |   TABLE ACCESS BY INDEX ROWID| ORDERS               |    60 |   480 |     1   (0)| 00:00:01 |
    |*  4 |    INDEX RANGE SCAN          | ORDERS_STATUS_ID_IDX |     2 |       |     1   (0)| 00:00:01 |
    Predicate Information (identified by operation id):
       3 - filter("CDW"."COT_EXTERNAL_ID"=TO_CHAR("ORDER_ID"))
       4 - access("O"."STATUS_ID"=22)
    As with 12 and 16, I would have expected the cardinality on step 4 to be (3064/5641)*177951 = 96657, but I see only 2.
    This is what puzzles me. Does this have to do with 22 being a popular value? Can someone explain this behaviour?
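    The expectation used in each case, (bucket count / total buckets) * num_rows, can be computed in one query from the same dictionary views. This is only a sketch of that arithmetic, assuming the frequency histogram shown earlier and that ORDERS is the only table of that name visible to the session; it is not a statement of how the optimizer arrives at its own figure.

    ```sql
    SELECT  h.endpoint_value                                        AS status_id
    ,       h.bucket_count
    ,       ROUND(h.bucket_count / h.total_buckets * t.num_rows)    AS expected_rows
    FROM    (
            -- per-value bucket count plus the overall bucket total
            SELECT  endpoint_value
            ,       endpoint_number
                    - NVL(LAG(endpoint_number) OVER (ORDER BY endpoint_value), 0) AS bucket_count
            ,       MAX(endpoint_number) OVER ()                    AS total_buckets
            FROM    all_tab_histograms
            WHERE   table_name  = 'ORDERS'
            AND     column_name = 'STATUS_ID'
            )       h
    ,       (
            SELECT  num_rows
            FROM    all_tables
            WHERE   table_name = 'ORDERS'
            )       t
    WHERE   h.endpoint_value IN (12, 16, 22);
    ```

    With the statistics posted above this should give roughly 15584, 757 and 96657 for status_id 12, 16 and 22.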
    As a solution I am thinking of creating a function-based index on to_char(order_id), hoping that the step 3 predicate CDW.COT_EXTERNAL_ID = TO_CHAR(ORDER_ID) changes
    from filter to access. Let me know your thoughts on the index creation as well.
    Thanks,
    Rgds,
    Gokul
    Edited by: Gokul Gopal on 24-May-2012 02:40
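    A minimal sketch of the function-based index proposed in the post above. The index name is hypothetical, and whether the optimizer then switches the predicate from filter to access (or changes the join method) is not guaranteed; the plan would need to be re-checked after creating it.

    ```sql
    -- Hypothetical index name; indexes the same expression used in the subquery
    CREATE INDEX orders_order_id_char_idx
    ON      orders (TO_CHAR(order_id));

    -- A function-based index is backed by a hidden virtual column, so gather
    -- column statistics for it as well as the index statistics.
    BEGIN
            DBMS_STATS.GATHER_TABLE_STATS
            (
                    ownname         => USER
            ,       tabname         => 'ORDERS'
            ,       method_opt      => 'FOR ALL HIDDEN COLUMNS SIZE 1'
            ,       cascade         => TRUE
            );
    END;
    /
    ```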

    Hello Jonathan,
    Apologies, I was wrong about optimizer_index_cost_adj being set to 100. I gather from the DBA that the value is currently set to 1.
    I have pasted the 10053 trace file for value 22. I was not able to figure out the "jsel=min(1, 6.1094e-04)" bit.
    /dborafiles/COTP/bycota2/udump/bycota2_ora_2147_values_22.trc
    Oracle Database 10g Enterprise Edition Release 10.2.0.3.0 - 64bit Production
    With the Partitioning, Real Application Clusters, OLAP and Data Mining options
    ORACLE_HOME = /dboracle/orabase/product/10.2.0
    System name:     Linux
    Node name:     byl945d002
    Release:     2.6.9-55.ELsmp
    Version:     #1 SMP Fri Apr 20 16:36:54 EDT 2007
    Machine:     x86_64
    Instance name: bycota2
    Redo thread mounted by this instance: 2
    Oracle process number: 37
    Unix process pid: 2147, image: oracle@byl945d002 (TNS V1-V3)
    *** 2012-05-28 14:00:59.737
    *** ACTION NAME:() 2012-05-28 14:00:59.737
    *** MODULE NAME:(SQL*Plus) 2012-05-28 14:00:59.737
    *** SERVICE NAME:(SYS$USERS) 2012-05-28 14:00:59.737
    *** SESSION ID:(713.51629) 2012-05-28 14:00:59.737
    Registered qb: SEL$1 0x973e5458 (PARSER)
      signature (): qb_name=SEL$1 nbfros=1 flg=0
        fro(0): flg=4 objn=51893 hint_alias="CDW"@"SEL$1"
    Registered qb: SEL$2 0x973e6058 (PARSER)
      signature (): qb_name=SEL$2 nbfros=1 flg=0
        fro(0): flg=4 objn=51782 hint_alias="O"@"SEL$2"
    Predicate Move-Around (PM)
    PM: Considering predicate move-around in SEL$1 (#0).
    PM:   Checking validity of predicate move-around in SEL$1 (#0).
    CBQT: Validity checks passed for 5r4bhr2yrt5gz.
    apadrv-start: call(in-use=704, alloc=16344), compile(in-use=60840, alloc=63984)
    Current SQL statement for this session:
    select cdw.* from cdw_orders cdw where cdw.cot_external_id in (select to_char(o.order_id) from orders o where status_id = 22)
    Legend
    The following abbreviations are used by optimizer trace.
    CBQT - cost-based query transformation
    JPPD - join predicate push-down
    FPD - filter push-down
    PM - predicate move-around
    CVM - complex view merging
    SPJ - select-project-join
    SJC - set join conversion
    SU - subquery unnesting
    OBYE - order by elimination
    ST - star transformation
    qb - query block
    LB - leaf blocks
    DK - distinct keys
    LB/K - average number of leaf blocks per key
    DB/K - average number of data blocks per key
    CLUF - clustering factor
    NDV - number of distinct values
    Resp - response cost
    Card - cardinality
    Resc - resource cost
    NL - nested loops (join)
    SM - sort merge (join)
    HA - hash (join)
    CPUCSPEED - CPU Speed
    IOTFRSPEED - I/O transfer speed
    IOSEEKTIM - I/O seek time
    SREADTIM - average single block read time
    MREADTIM - average multiblock read time
    MBRC - average multiblock read count
    MAXTHR - maximum I/O system throughput
    SLAVETHR - average slave I/O throughput
    dmeth - distribution method
      1: no partitioning required
      2: value partitioned
      4: right is random (round-robin)
      512: left is random (round-robin)
      8: broadcast right and partition left
      16: broadcast left and partition right
      32: partition left using partitioning of right
      64: partition right using partitioning of left
      128: use hash partitioning dimension
      256: use range partitioning dimension
      2048: use list partitioning dimension
      1024: run the join in serial
      0: invalid distribution method
    sel - selectivity
    ptn - partition
    Peeked values of the binds in SQL statement
    PARAMETERS USED BY THE OPTIMIZER
      PARAMETERS WITH ALTERED VALUES
      sort_area_retained_size             = 65535
      optimizer_mode                      = first_rows_100
      optimizer_index_cost_adj            = 25
      optimizer_index_caching             = 100
      Bug Fix Control Environment
      fix  4611850 = enabled
      fix  4663804 = enabled
      fix  4663698 = enabled
      fix  4545833 = enabled
      fix  3499674 = disabled
      fix  4584065 = enabled
      fix  4602374 = enabled
      fix  4569940 = enabled
      fix  4631959 = enabled
      fix  4519340 = enabled
      fix  4550003 = enabled
      fix  4488689 = enabled
      fix  3118776 = enabled
      fix  4519016 = enabled
      fix  4487253 = enabled
      fix  4556762 = 15     
      fix  4728348 = enabled
      fix  4723244 = enabled
      fix  4554846 = enabled
      fix  4175830 = enabled
      fix  4722900 = enabled
      fix  5094217 = enabled
      fix  4904890 = enabled
      fix  4483286 = disabled
      fix  4969880 = disabled
      fix  4711525 = enabled
      fix  4717546 = enabled
      fix  4904838 = enabled
      fix  5005866 = enabled
      fix  4600710 = enabled
      fix  5129233 = enabled
      fix  5195882 = enabled
      fix  5084239 = enabled
      fix  4595987 = enabled
      fix  4134994 = enabled
      fix  5104624 = enabled
      fix  4908162 = enabled
      fix  5015557 = enabled
      PARAMETERS WITH DEFAULT VALUES
      optimizer_mode_hinted               = false
      optimizer_features_hinted           = 0.0.0
      parallel_execution_enabled          = true
      parallel_query_forced_dop           = 0
      parallel_dml_forced_dop             = 0
      parallel_ddl_forced_degree          = 0
      parallel_ddl_forced_instances       = 0
      _query_rewrite_fudge                = 90
      optimizer_features_enable           = 10.2.0.3
      _optimizer_search_limit             = 5
      cpu_count                           = 8
      active_instance_count               = 2
      parallel_threads_per_cpu            = 2
      hash_area_size                      = 131072
      bitmap_merge_area_size              = 1048576
      sort_area_size                      = 65536
      _sort_elimination_cost_ratio        = 0
      _optimizer_block_size               = 8192
      _sort_multiblock_read_count         = 2
      _hash_multiblock_io_count           = 0
      _db_file_optimizer_read_count       = 32
      _optimizer_max_permutations         = 2000
      pga_aggregate_target                = 602112 KB
      _pga_max_size                       = 204800 KB
      _query_rewrite_maxdisjunct          = 257
      _smm_auto_min_io_size               = 56 KB
      _smm_auto_max_io_size               = 248 KB
      _smm_min_size                       = 602 KB
      _smm_max_size                       = 102400 KB
      _smm_px_max_size                    = 301056 KB
      _cpu_to_io                          = 0
      _optimizer_undo_cost_change         = 10.2.0.3
      parallel_query_mode                 = enabled
      parallel_dml_mode                   = disabled
      parallel_ddl_mode                   = enabled
      sqlstat_enabled                     = false
      _optimizer_percent_parallel         = 101
      _always_anti_join                   = choose
      _always_semi_join                   = choose
      _optimizer_mode_force               = true
      _partition_view_enabled             = true
      _always_star_transformation         = false
      _query_rewrite_or_error             = false
      _hash_join_enabled                  = true
      cursor_sharing                      = exact
      _b_tree_bitmap_plans                = true
      star_transformation_enabled         = false
      _optimizer_cost_model               = choose
      _new_sort_cost_estimate             = true
      _complex_view_merging               = true
      _unnest_subquery                    = true
      _eliminate_common_subexpr           = true
      _pred_move_around                   = true
      _convert_set_to_join                = false
      _push_join_predicate                = true
      _push_join_union_view               = true
      _fast_full_scan_enabled             = true
      _optim_enhance_nnull_detection      = true
      _parallel_broadcast_enabled         = true
      _px_broadcast_fudge_factor          = 100
      _ordered_nested_loop                = true
      _no_or_expansion                    = false
      _system_index_caching               = 0
      _disable_datalayer_sampling         = false
      query_rewrite_enabled               = true
      query_rewrite_integrity             = enforced
      _query_cost_rewrite                 = true
      _query_rewrite_2                    = true
      _query_rewrite_1                    = true
      _query_rewrite_expression           = true
      _query_rewrite_jgmigrate            = true
      _query_rewrite_fpc                  = true
      _query_rewrite_drj                  = true
      _full_pwise_join_enabled            = true
      _partial_pwise_join_enabled         = true
      _left_nested_loops_random           = true
      _improved_row_length_enabled        = true
      _index_join_enabled                 = true
      _enable_type_dep_selectivity        = true
      _improved_outerjoin_card            = true
      _optimizer_adjust_for_nulls         = true
      _optimizer_degree                   = 0
      _use_column_stats_for_function      = true
      _subquery_pruning_enabled           = true
      _subquery_pruning_mv_enabled        = false
      _or_expand_nvl_predicate            = true
      _like_with_bind_as_equality         = false
      _table_scan_cost_plus_one           = true
      _cost_equality_semi_join            = true
      _default_non_equality_sel_check     = true
      _new_initial_join_orders            = true
      _oneside_colstat_for_equijoins      = true
      _optim_peek_user_binds              = true
      _minimal_stats_aggregation          = true
      _force_temptables_for_gsets         = false
      workarea_size_policy                = auto
      _smm_auto_cost_enabled              = true
      _gs_anti_semi_join_allowed          = true
      _optim_new_default_join_sel         = true
      optimizer_dynamic_sampling          = 2
      _pre_rewrite_push_pred              = true
      _optimizer_new_join_card_computation = true
      _union_rewrite_for_gs               = yes_gset_mvs
      _generalized_pruning_enabled        = true
      _optim_adjust_for_part_skews        = true
      _force_datefold_trunc               = false
      statistics_level                    = typical
      _optimizer_system_stats_usage       = true
      skip_unusable_indexes               = true
      _remove_aggr_subquery               = true
      _optimizer_push_down_distinct       = 0
      _dml_monitoring_enabled             = true
      _optimizer_undo_changes             = false
      _predicate_elimination_enabled      = true
      _nested_loop_fudge                  = 100
      _project_view_columns               = true
      _local_communication_costing_enabled = true
      _local_communication_ratio          = 50
      _query_rewrite_vop_cleanup          = true
      _slave_mapping_enabled              = true
      _optimizer_cost_based_transformation = linear
      _optimizer_mjc_enabled              = true
      _right_outer_hash_enable            = true
      _spr_push_pred_refspr               = true
      _optimizer_cache_stats              = false
      _optimizer_cbqt_factor              = 50
      _optimizer_squ_bottomup             = true
      _fic_area_size                      = 131072
      _optimizer_skip_scan_enabled        = true
      _optimizer_cost_filter_pred         = false
      _optimizer_sortmerge_join_enabled   = true
      _optimizer_join_sel_sanity_check    = true
      _mmv_query_rewrite_enabled          = true
      _bt_mmv_query_rewrite_enabled       = true
      _add_stale_mv_to_dependency_list    = true
      _distinct_view_unnesting            = false
      _optimizer_dim_subq_join_sel        = true
      _optimizer_disable_strans_sanity_checks = 0
      _optimizer_compute_index_stats      = true
      _push_join_union_view2              = true
      _optimizer_ignore_hints             = false
      _optimizer_random_plan              = 0
      _query_rewrite_setopgrw_enable      = true
      _optimizer_correct_sq_selectivity   = true
      _disable_function_based_index       = false
      _optimizer_join_order_control       = 3
      _optimizer_cartesian_enabled        = true
      _optimizer_starplan_enabled         = true
      _extended_pruning_enabled           = true
      _optimizer_push_pred_cost_based     = true
      _sql_model_unfold_forloops          = run_time
      _enable_dml_lock_escalation         = false
      _bloom_filter_enabled               = true
      _update_bji_ipdml_enabled           = 0
      _optimizer_extended_cursor_sharing  = udo
      _dm_max_shared_pool_pct             = 1
      _optimizer_cost_hjsmj_multimatch    = true
      _optimizer_transitivity_retain      = true
      _px_pwg_enabled                     = true
      optimizer_secure_view_merging       = true
      _optimizer_join_elimination_enabled = true
      flashback_table_rpi                 = non_fbt
      _optimizer_cbqt_no_size_restriction = true
      _optimizer_enhanced_filter_push     = true
      _optimizer_filter_pred_pullup       = true
      _rowsrc_trace_level                 = 0
      _simple_view_merging                = true
      _optimizer_rownum_pred_based_fkr    = true
      _optimizer_better_inlist_costing    = all
      _optimizer_self_induced_cache_cost  = false
      _optimizer_min_cache_blocks         = 10
      _optimizer_or_expansion             = depth
      _optimizer_order_by_elimination_enabled = true
      _optimizer_outer_to_anti_enabled    = true
      _selfjoin_mv_duplicates             = true
      _dimension_skip_null                = true
      _force_rewrite_enable               = false
      _optimizer_star_tran_in_with_clause = true
      _optimizer_complex_pred_selectivity = true
      _optimizer_connect_by_cost_based    = true
      _gby_hash_aggregation_enabled       = true
      _globalindex_pnum_filter_enabled    = true
      _fix_control_key                    = 0
      _optimizer_skip_scan_guess          = false
      _enable_row_shipping                = false
      _row_shipping_threshold             = 80
      _row_shipping_explain               = false
      _optimizer_rownum_bind_default      = 10
      _first_k_rows_dynamic_proration     = true
      _optimizer_native_full_outer_join   = off
      Bug Fix Control Environment
      fix  4611850 = enabled
      fix  4663804 = enabled
      fix  4663698 = enabled
      fix  4545833 = enabled
      fix  3499674 = disabled
      fix  4584065 = enabled
      fix  4602374 = enabled
      fix  4569940 = enabled
      fix  4631959 = enabled
      fix  4519340 = enabled
      fix  4550003 = enabled
      fix  4488689 = enabled
      fix  3118776 = enabled
      fix  4519016 = enabled
      fix  4487253 = enabled
      fix  4556762 = 15     
      fix  4728348 = enabled
      fix  4723244 = enabled
      fix  4554846 = enabled
      fix  4175830 = enabled
      fix  4722900 = enabled
      fix  5094217 = enabled
      fix  4904890 = enabled
      fix  4483286 = disabled
      fix  4969880 = disabled
      fix  4711525 = enabled
      fix  4717546 = enabled
      fix  4904838 = enabled
      fix  5005866 = enabled
      fix  4600710 = enabled
      fix  5129233 = enabled
      fix  5195882 = enabled
      fix  5084239 = enabled
      fix  4595987 = enabled
      fix  4134994 = enabled
      fix  5104624 = enabled
      fix  4908162 = enabled
      fix  5015557 = enabled
      PARAMETERS IN OPT_PARAM HINT
    Column Usage Monitoring is ON: tracking level = 1
    COST-BASED QUERY TRANSFORMATIONS
    FPD: Considering simple filter push (pre rewrite) in SEL$1 (#0)
    FPD:   Current where clause predicates in SEL$1 (#0) :
             "CDW"."COT_EXTERNAL_ID"=ANY (SELECT TO_CHAR("O"."ORDER_ID") FROM "ORDERS" "O")
    Registered qb: SEL$1 0x974658b0 (COPY SEL$1)
      signature(): NULL
    Registered qb: SEL$2 0x9745e408 (COPY SEL$2)
      signature(): NULL
    Cost-Based Subquery Unnesting
    SU: No subqueries to consider in query block SEL$2 (#2).
    SU: Considering subquery unnesting in query block SEL$1 (#1)
    SU: Performing unnesting that does not require costing.
    SU: Considering subquery unnest on SEL$1 (#1).
    SU:   Checking validity of unnesting subquery SEL$2 (#2)
    SU:   Passed validity checks.
    SU:   Transforming ANY subquery to a join.
    Registered qb: SEL$5DA710D3 0x974658b0 (SUBQUERY UNNEST SEL$1; SEL$2)
      signature (): qb_name=SEL$5DA710D3 nbfros=2 flg=0
        fro(0): flg=0 objn=51893 hint_alias="CDW"@"SEL$1"
        fro(1): flg=0 objn=51782 hint_alias="O"@"SEL$2"
    Cost-Based Complex View Merging
    CVM: Finding query blocks in SEL$5DA710D3 (#1) that are valid to merge.
    SU:   Transforming ANY subquery to a join.
    Set-Join Conversion (SJC)
    SJC: Considering set-join conversion in SEL$5DA710D3 (#1).
    Query block (0x2a973e5458) before join elimination:
    SQL:******* UNPARSED QUERY IS *******
    SELECT "CDW".* FROM "COT_PLUS"."ORDERS" "O","COT_PLUS"."CDW_ORDERS" "CDW" WHERE "CDW"."COT_EXTERNAL_ID"=TO_CHAR("O"."ORDER_ID") AND "O"."STATUS_ID"=22
    Query block (0x2a973e5458) unchanged
    Predicate Move-Around (PM)
    PM: Considering predicate move-around in SEL$5DA710D3 (#1).
    PM:   Checking validity of predicate move-around in SEL$5DA710D3 (#1).
    PM:     PM bypassed: Outer query contains no views.
    JPPD: Applying transformation directives
    JPPD: Checking validity of push-down in query block SEL$5DA710D3 (#1)
    JPPD:   No view found to push predicate into.
    FPD: Considering simple filter push in SEL$5DA710D3 (#1)
    FPD:   Current where clause predicates in SEL$5DA710D3 (#1) :
             "CDW"."COT_EXTERNAL_ID"=TO_CHAR("O"."ORDER_ID") AND "O"."STATUS_ID"=22
    kkogcp: try to generate transitive predicate from check constraints for SEL$5DA710D3 (#1)
    predicates with check contraints: "CDW"."COT_EXTERNAL_ID"=TO_CHAR("O"."ORDER_ID") AND "O"."STATUS_ID"=22
    after transitive predicate generation: "CDW"."COT_EXTERNAL_ID"=TO_CHAR("O"."ORDER_ID") AND "O"."STATUS_ID"=22
    finally: "CDW"."COT_EXTERNAL_ID"=TO_CHAR("O"."ORDER_ID") AND "O"."STATUS_ID"=22
    First K Rows: Setup begin
    kkoqbc-start
                : call(in-use=1592, alloc=16344), compile(in-use=101000, alloc=134224)
    QUERY BLOCK TEXT
    select cdw.* from cdw_orders cdw where cdw.cot_external_id in (select to_char(o.order_id) from orders o where status_id = 22)
    QUERY BLOCK SIGNATURE
    qb name was generated
    signature (optimizer): qb_name=SEL$5DA710D3 nbfros=2 flg=0
      fro(0): flg=0 objn=51893 hint_alias="CDW"@"SEL$1"
      fro(1): flg=0 objn=51782 hint_alias="O"@"SEL$2"
    SYSTEM STATISTICS INFORMATION
      Using NOWORKLOAD Stats
      CPUSPEED: 714 millions instruction/sec
      IOTFRSPEED: 4096 bytes per millisecond (default is 4096)
      IOSEEKTIM: 10 milliseconds (default is 10)
    BASE STATISTICAL INFORMATION
    Table Stats::
      Table: CDW_ORDERS  Alias: CDW
        #Rows: 3375  #Blks:  1504  AvgRowLen:  132.00
    Index Stats::
      Index: CDW_ORD_COT_EXT_ID  Col#: 10
        LVLS: 1  #LB: 232  #DK: 1878  LB/K: 1.00  DB/K: 1.00  CLUF: 1899.00
      Index: CDW_ORD_REFERENCE_IDX  Col#: 13
        LVLS: 0  #LB: 0  #DK: 0  LB/K: 0.00  DB/K: 0.00  CLUF: 0.00
      Index: COMMITTED_IDX  Col#: 12
        LVLS: 1  #LB: 171  #DK: 1673  LB/K: 1.00  DB/K: 1.00  CLUF: 1657.00
      Index: OBJID_IDX  Col#: 16 17
        LVLS: 2  #LB: 318  #DK: 3372  LB/K: 1.00  DB/K: 1.00  CLUF: 1901.00
      Index: ORDID_IDX  Col#: 14
        LVLS: 0  #LB: 0  #DK: 0  LB/K: 0.00  DB/K: 0.00  CLUF: 0.00
    Table Stats::
      Table: ORDERS  Alias:  O
        #Rows: 178253  #Blks:  7300  AvgRowLen:  282.00
    Index Stats::
      Index: IDX_ORDERS_CONFIG  Col#: 80
        LVLS: 1  #LB: 215  #DK: 452  LB/K: 1.00  DB/K: 130.00  CLUF: 59161.00
      Index: IDX_ORDERS_REFRENCE_NUMBER  Col#: 6
        LVLS: 1  #LB: 428  #DK: 68698  LB/K: 1.00  DB/K: 1.00  CLUF: 115830.00
      Index: ORDERS_BILLING_SI_IDX  Col#: 13
        LVLS: 1  #LB: 84  #DK: 3049  LB/K: 1.00  DB/K: 8.00  CLUF: 27006.00
      Index: ORDERS_LATEST_ORD_IDX  Col#: 3
        LVLS: 0  #LB: 0  #DK: 0  LB/K: 0.00  DB/K: 0.00  CLUF: 0.00
      Index: ORDERS_ORDER_TYPE_IDX  Col#: 4
        LVLS: 2  #LB: 984  #DK: 64  LB/K: 15.00  DB/K: 932.00  CLUF: 59702.00
      Index: ORDERS_ORD_MINOR__IDX  Col#: 43 5
        LVLS: 2  #LB: 784  #DK: 112  LB/K: 7.00  DB/K: 375.00  CLUF: 42012.00
      Index: ORDERS_OWNING_ORG_IDX  Col#: 37
        LVLS: 0  #LB: 0  #DK: 0  LB/K: 0.00  DB/K: 0.00  CLUF: 0.00
      Index: ORDERS_PARENT_ORD_IDX  Col#: 2
        LVLS: 1  #LB: 206  #DK: 37492  LB/K: 1.00  DB/K: 1.00  CLUF: 58051.00
      Index: ORDERS_SD_CONFIG__IDX  Col#: 42
        LVLS: 2  #LB: 604  #DK: 10  LB/K: 60.00  DB/K: 3638.00  CLUF: 36389.00
      Index: ORDERS_SPECIAL_OR_IDX  Col#: 36
        LVLS: 1  #LB: 63  #DK: 2  LB/K: 31.00  DB/K: 556.00  CLUF: 1113.00
      Index: ORDERS_STATUS_ID_IDX  Col#: 5
        LVLS: 2  #LB: 635  #DK: 25  LB/K: 25.00  DB/K: 1440.00  CLUF: 36015.00
      Index: PK_ORDERS  Col#: 1
        LVLS: 1  #LB: 408  #DK: 178253  LB/K: 1.00  DB/K: 1.00  CLUF: 131025.00
    SINGLE TABLE ACCESS PATH
      Column (#5): STATUS_ID(NUMBER)
        AvgLen: 3.00 NDV: 20 Nulls: 0 Density: 2.7653e-06 Min: 2 Max: 33
        Histogram: Freq  #Bkts: 20  UncompBkts: 5567  EndPtVals: 20
      Table: ORDERS  Alias: O    
        Card: Original: 178253  Rounded: 95450  Computed: 95450.37  Non Adjusted: 95450.37
      Access Path: TableScan
        Cost:  1419.89  Resp: 1419.89  Degree: 0
          Cost_io: 1408.00  Cost_cpu: 101897352
          Resp_io: 1408.00  Resp_cpu: 101897352
    kkofmx: index filter:"O"."STATUS_ID"=22
      Access Path: index (skip-scan)
        SS sel: 0.53548  ANDV (#skips): 60
        SS io: 419.81 vs. table scan io: 1408.00
        Skip Scan chosen
      Access Path: index (SkipScan)
        Index: ORDERS_ORD_MINOR__IDX
        resc_io: 22918.81  resc_cpu: 204258888
        ix_sel: 0.53548  ix_sel_with_filters: 0.53548
        Cost: 5735.66  Resp: 5735.66  Degree: 1
      Access Path: index (AllEqRange)
        Index: ORDERS_STATUS_ID_IDX
        resc_io: 19629.00  resc_cpu: 180830676
        ix_sel: 0.53548  ix_sel_with_filters: 0.53548
        Cost: 4912.53  Resp: 4912.53  Degree: 1
      ****** trying bitmap/domain indexes ******
      Best:: AccessPath: TableScan
             Cost: 1419.89  Degree: 1  Resp: 1419.89  Card: 95450.37  Bytes: 0
    SINGLE TABLE ACCESS PATH
      Table: CDW_ORDERS  Alias: CDW    
        Card: Original: 3375  Rounded: 3375  Computed: 3375.00  Non Adjusted: 3375.00
      Access Path: TableScan
        Cost:  292.51  Resp: 292.51  Degree: 0
          Cost_io: 291.00  Cost_cpu: 12971896
          Resp_io: 291.00  Resp_cpu: 12971896
      Best:: AccessPath: TableScan
             Cost: 292.51  Degree: 1  Resp: 292.51  Card: 3375.00  Bytes: 0
    OPTIMIZER STATISTICS AND COMPUTATIONS
    GENERAL PLANS
    Considering cardinality-based initial join order.
    Permutations for Starting Table :0
    Join order[1]:  CDW_ORDERS[CDW]#0  ORDERS[O]#1
    Now joining: ORDERS[O]#1
    NL Join
      Outer table: Card: 3375.00  Cost: 292.51  Resp: 292.51  Degree: 1  Bytes: 132
      Inner table: ORDERS  Alias: O
      Access Path: TableScan
        NL Join:  Cost: 4788284.86  Resp: 4788284.86  Degree: 0
          Cost_io: 4748144.00  Cost_cpu: 343916534896
          Resp_io: 4748144.00  Resp_cpu: 343916534896
    kkofmx: index filter:"O"."STATUS_ID"=22
    OPTIMIZER PERCENT INDEX CACHING = 100
      Access Path: index (FullScan)
        Index: ORDERS_ORD_MINOR__IDX
        resc_io: 22497.00  resc_cpu: 217815366
        ix_sel: 1  ix_sel_with_filters: 0.53548
        NL Join: Cost: 19004464.41  Resp: 19004464.41  Degree: 1
          Cost_io: 18982134.75  Cost_cpu: 191314735126
          Resp_io: 18982134.75  Resp_cpu: 191314735126
    OPTIMIZER PERCENT INDEX CACHING = 100
      Access Path: index (AllEqJoin)
        Index: ORDERS_STATUS_ID_IDX
        resc_io: 1.00  resc_cpu: 7981
        ix_sel: 1.0477e-05  ix_sel_with_filters: 1.0477e-05
        NL Join: Cost: 1137.05  Resp: 1137.05  Degree: 1
          Cost_io: 1134.75  Cost_cpu: 19706236
          Resp_io: 1134.75  Resp_cpu: 19706236
      ****** trying bitmap/domain indexes ******
      Best NL cost: 1137.05
              resc: 1137.05 resc_io: 1134.75 resc_cpu: 19706236
              resp: 1137.05 resp_io: 1134.75 resp_cpu: 19706236
    adjusting AJ/SJ sel based on min/max ranges: jsel=min(1, 6.1094e-04)Semi Join Card:  2.06 = outer (3375.00) * sel (6.1094e-04)
    Join Card - Rounded: 2 Computed: 2.06
    SM Join
      Outer table:
        resc: 292.51  card 3375.00  bytes: 132  deg: 1  resp: 292.51
      Inner table: ORDERS  Alias: O
        resc: 1419.89  card: 95450.37  bytes: 8  deg: 1  resp: 1419.89
        using dmeth: 2  #groups: 1
        SORT resource      Sort statistics
          Sort width:         598 Area size:      616448 Max Area size:   104857600
          Degree:               1
          Blocks to Sort:      65 Row size:          156 Total Rows:           3375
          Initial runs:         1 Merge passes:        0 IO Cost / pass:          0
          Total IO sort cost: 0      Total CPU sort cost: 10349977
          Total Temp space used: 0
        SORT resource      Sort statistics
          Sort width:         598 Area size:      616448 Max Area size:   104857600
          Degree:               1
          Blocks to Sort:     223 Row size:           19 Total Rows:          95450
          Initial runs:         2 Merge passes:        1 IO Cost / pass:        122
          Total IO sort cost: 345      Total CPU sort cost: 85199490
          Total Temp space used: 3089000
      SM join: Resc: 2068.56  Resp: 2068.56  [multiMatchCost=0.00]
      SM cost: 2068.56
         resc: 2068.56 resc_io: 2044.00 resc_cpu: 210418716
         resp: 2068.56 resp_io: 2044.00 resp_cpu: 210418716
    SM Join (with index on outer)
      Access Path: index (FullScan)
        Index: CDW_ORD_COT_EXT_ID
        resc_io: 2132.00  resc_cpu: 18119160
        ix_sel: 1  ix_sel_with_filters: 1
        Cost: 533.53  Resp: 533.53  Degree: 1
      Outer table:
        resc: 533.53  card 3375.00  bytes: 132  deg: 1  resp: 533.53
      Inner table: ORDERS  Alias: O
        resc: 1419.89  card: 95450.37  bytes: 8  deg: 1  resp: 1419.89
        using dmeth: 2  #groups: 1
        SORT resource      Sort statistics
          Sort width:         598 Area size:      616448 Max Area size:   104857600
          Degree:               1
          Blocks to Sort:     223 Row size:           19 Total Rows:          95450
          Initial runs:         2 Merge passes:        1 IO Cost / pass:        122
          Total IO sort cost: 345      Total CPU sort cost: 85199490
          Total Temp space used: 3089000
      SM join: Resc: 2308.37  Resp: 2308.37  [multiMatchCost=0.00]
    HA Join
      Outer table:
        resc: 292.51  card 3375.00  bytes: 132  deg: 1  resp: 292.51
      Inner table: ORDERS  Alias: O
        resc: 1419.89  card: 95450.37  bytes: 8  deg: 1  resp: 1419.89
        using dmeth: 2  #groups: 1
        Cost per ptn: 1.67  #ptns: 1
        hash_area: 151 (max=25600)   Hash join: Resc: 1714.08  Resp: 1714.08  [multiMatchCost=0.00]
      HA cost: 1714.08
         resc: 1714.08 resc_io: 1699.00 resc_cpu: 129204369
         resp: 1714.08 resp_io: 1699.00 resp_cpu: 129204369
    Best:: JoinMethod: NestedLoopSemi
           Cost: 1137.05  Degree: 1  Resp: 1137.05  Card: 2.06  Bytes: 140
    Best so far: Table#: 0  cost: 292.5140  card: 3375.0000  bytes: 445500
                 Table#: 1  cost: 1137.0501  card: 2.0619  bytes: 280
    Number of join permutations tried: 1
    (newjo-save)    [0 1 ]
    Final - All Rows Plan:  Best join order: 1
      Cost: 1137.0501  Degree: 1  Card: 2.0000  Bytes: 280
      Resc: 1137.0501  Resc_io: 1134.7500  Resc_cpu: 19706236
      Resp: 1137.0501  Resp_io: 1134.7500  Resc_cpu: 19706236
    kkoipt: Query block SEL$5DA710D3 (#1)
    kkoqbc-end
              : call(in-use=156048, alloc=164408), compile(in-use=103696, alloc=134224)
    First K Rows: Setup end
    ***********************
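
The cost figures in the trace above can be cross-checked with a little arithmetic. This is a minimal sketch, not a statement of how the optimizer is implemented: it uses the commonly described noworkload conversion of CPU cost into single-block-read units, and it assumes an 8 KB block size, which the trace does not show. All other numbers are copied from the trace.

```python
# Figures copied from the 10053 trace; BLOCK_SIZE is an assumption.
CPUSPEED_MHZ = 714      # CPUSPEED: millions of instructions per second
IOSEEKTIM_MS = 10.0     # IOSEEKTIM
IOTFRSPEED = 4096.0     # IOTFRSPEED: bytes per millisecond
BLOCK_SIZE = 8192       # assumed db_block_size (not printed in the trace)


def single_block_read_ms():
    """Estimated single block read time under noworkload statistics."""
    return IOSEEKTIM_MS + BLOCK_SIZE / IOTFRSPEED


def total_cost(cost_io, cost_cpu):
    """Convert the CPU component to single-block-read units and add the I/O cost."""
    cpu_as_io = cost_cpu / (CPUSPEED_MHZ * 1000 * single_block_read_ms())
    return cost_io + cpu_as_io


orders_scan = total_cost(1408.00, 101897352)   # ORDERS TableScan
cdw_scan = total_cost(291.00, 12971896)        # CDW_ORDERS TableScan
best_nl = total_cost(1134.75, 19706236)        # Best NL cost
semi_join_card = 3375.00 * 6.1094e-04          # outer card * adjusted sel

print(round(orders_scan, 2), round(cdw_scan, 2),
      round(best_nl, 2), round(semi_join_card, 2))
```

Under that block-size assumption the results line up with the Cost values the trace reports for the two table scans and the nested loop, and with the "Semi Join Card" line.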

  • Wrong cardinality estimate for range scan

    select * from v$version;
    BANNER
    Oracle Database 11g Enterprise Edition Release 11.2.0.2.0 - 64bit Production
    PL/SQL Release 11.2.0.2.0 - Production
    CORE    11.2.0.2.0      Production
    TNS for Linux: Version 11.2.0.2.0 - Production
    NLSRTL Version 11.2.0.2.0 - Production
    SQL: select * from GC_FULFILLMENT_ITEMS where MARKETPLACE_ID=:b1 and GC_FULFILLMENT_STATUS_ID=:b2;
    Plan
    | Id  | Operation                   | Name                        | Rows  | Bytes | Cost (%CPU)| Time     |
    |   0 | SELECT STATEMENT            |                             |   474K|    99M|   102  (85)| 00:00:01 |
    |   1 |  TABLE ACCESS BY INDEX ROWID| GC_FULFILLMENT_ITEMS        |   474K|    99M|   102  (85)| 00:00:01 |
    |*  2 |   INDEX RANGE SCAN          | I_GCFI_GCFS_ID_SDOC_MKTPLID |   474K|       |    91  (95)| 00:00:01 |
    Predicate Information (identified by operation id):
       2 - access("GC_FULFILLMENT_STATUS_ID"=TO_NUMBER(:B2) AND "MARKETPLACE_ID"=TO_NUMBER(:B1))
           filter("MARKETPLACE_ID"=TO_NUMBER(:B1))
    If I use literals, then the CBO uses cardinality = 1 (I believe this is due to fix control 5483301, which I set to off in my environment):
    select * from GC_FULFILLMENT_ITEMS where MARKETPLACE_ID=5 and GC_FULFILLMENT_STATUS_ID=2;
    | Id  | Operation                   | Name                        | Rows  | Bytes | Cost (%CPU)| Time     |
    |   0 | SELECT STATEMENT            |                             |     1 |   220 |     3   (0)| 00:00:01 |
    |   1 |  TABLE ACCESS BY INDEX ROWID| GC_FULFILLMENT_ITEMS        |     1 |   220 |     3   (0)| 00:00:01 |
    |*  2 |   INDEX RANGE SCAN          | I_GCFI_GCFS_ID_SDOC_MKTPLID |     1 |       |     2   (0)| 00:00:01 |
    Predicate Information (identified by operation id):
       2 - access("GC_FULFILLMENT_STATUS_ID"=2 AND "MARKETPLACE_ID"=5)
           filter("MARKETPLACE_ID"=5)
    Here is the column distribution and histogram information:
    Enter value for column_name: MARKETPLACE_ID
    COLUMN_NAME          ENDPOINT_VALUE CUMMULATIVE_FREQUENCY  FREQUENCY ENDPOINT_ACTUAL_VALU
    MARKETPLACE_ID                    1                     1          1
    MARKETPLACE_ID                    3                  8548       8547
    MARKETPLACE_ID                    4                 15608       7060
    MARKETPLACE_ID                    5                 16385        777   --->
    MARKETPLACE_ID                35691                 16398         13
    MARKETPLACE_ID                44551                 16407          9
    6 rows selected.
    Enter value for column_name: GC_FULFILLMENT_STATUS_ID
    COLUMN_NAME                    ENDPOINT_VALUE CUMMULATIVE_FREQUENCY  FREQUENCY ENDPOINT_ACTUAL_VALU
    GC_FULFILLMENT_STATUS_ID                    5                 19602      19602
    GC_FULFILLMENT_STATUS_ID                    6                 19612         10
    GC_FULFILLMENT_STATUS_ID                    8                 19802        190
    3 rows selected.
    Actual distribution
    select MARKETPLACE_ID,count(*) from GC_FULFILLMENT_ITEMS group by MARKETPLACE_ID order by 1;
    MARKETPLACE_ID   COUNT(*)
                 1       2099
                 3   16339936
                 4   13358682
                 5    1471839   --->
             35691      33623
             44551      19881
             78931      40273
            101611          1
                      6309408
    9 rows selected.
    BHAVIK_DBA: GC1EU> select GC_FULFILLMENT_STATUS_ID,count(*) from GC_FULFILLMENT_ITEMS group by GC_FULFILLMENT_STATUS_ID order by 1;
    GC_FULFILLMENT_STATUS_ID   COUNT(*)
                           1        880
                           2         63   --->
                           3         24
                           5   37226908
                           6      22099
                           7         18
                           8     325409
                           9        343
    8 rows selected.
    10053 trace:
      SINGLE TABLE ACCESS PATH
      Table: GC_FULFILLMENT_ITEMS  Alias: GC_FULFILLMENT_ITEMS
        Card: Original: 36703588.000000  Rounded: 474909  Computed: 474909.06  Non Adjusted: 474909.06
      Best:: AccessPath: IndexRange
      Index: I_GCFI_GCFS_ID_SDOC_MKTPLID
             Cost: 102.05  Degree: 1  Resp: 102.05  Card: 474909.06  Bytes: 0
      Outline Data:
      /*+
        BEGIN_OUTLINE_DATA
          IGNORE_OPTIM_EMBEDDED_HINTS
          OPTIMIZER_FEATURES_ENABLE('11.2.0.2')
          DB_VERSION('11.2.0.2')
          OPT_PARAM('_b_tree_bitmap_plans' 'false')
          OPT_PARAM('_optim_peek_user_binds' 'false')
          OPT_PARAM('_fix_control' '5483301:0')
          ALL_ROWS
          OUTLINE_LEAF(@"SEL$F5BB74E1")
          MERGE(@"SEL$2")
          OUTLINE(@"SEL$1")
          OUTLINE(@"SEL$2")
          INDEX_RS_ASC(@"SEL$F5BB74E1" "GC_FULFILLMENT_ITEMS"@"SEL$2" ("GC_FULFILLMENT_ITEMS"."GC_FULFILLMENT_STATUS_ID" "GC_FULFILLMENT_ITEMS"."SHIP_DELIVERY_OPTION_CODE" "GC_FULFILLMENT_ITEMS"."MARKETPLACE_ID"))
        END_OUTLINE_DATA
      */
    Is there any reason why the CBO is using card=474909.06? With the fix control in place, shouldn't it have set card=1 if it considers GC_FULFILLMENT_STATUS_ID=2 a "rare" value?
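
One piece of arithmetic happens to reproduce the 474909.06 figure, and is offered here only as a hedged observation. It assumes that, with bind peeking disabled, each equality predicate is given a selectivity of 1/NDV applied to the non-null rows. The NDV of 8 for each column and the null count are taken from the "actual distribution" listings, not from the stored statistics, which the post does not show.

```python
# Row count from "Card: Original" in the 10053 trace.
NUM_ROWS = 36703588
# Assumptions read off the actual-distribution listings, not the stored stats:
NULL_MARKETPLACE = 6309408   # rows in the NULL MARKETPLACE_ID group
NDV_MARKETPLACE = 8          # distinct non-null MARKETPLACE_ID values
NDV_STATUS = 8               # distinct GC_FULFILLMENT_STATUS_ID values

# Two independent equality predicates, each estimated as 1/NDV,
# applied to the rows where MARKETPLACE_ID is not null.
non_null_rows = NUM_ROWS - NULL_MARKETPLACE
estimate = non_null_rows / NDV_MARKETPLACE / NDV_STATUS

print(round(estimate, 2))
```

If that reading is right, the estimate for the bind-variable version never looks at the histogram buckets for the specific values 5 and 2, which would explain why the "rare value" handling seen with literals does not show up.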

    OraDBA02 wrote:
    You are right Charles.
    I was reading one of your blog posts and saw that.
    As you said, it is an issue with SQLPLUS.
    However, the plan for the SQL that is coming from the application still shows the same (wrong cardinality) plan. It does not have the TO_NUMBER function because it does not go through the data-type conversion that SQLPLUS does.
    But YES... the plan is exactly the same with/without TO_NUMBER.

    OraDBA02,
    I believe that some of the other people responding to this thread might have already described why the execution plan in the library cache is the same plan that you are seeing. One of the goals of using bind variables in SQL statements is to reduce the number of time-consuming (and resource-intensive) hard parses. That also means that a second goal is to share the same execution plan for future executions of the same SQL statement, even though bind variable values have changed. The catch here is that bind variable peeking, introduced with Oracle Database 9.0.1 (may be disabled by modifying a hidden parameter), helps the optimizer select the "best" (lowest calculated cost) execution plan for those specific bind variable values - the same plan may not be the "best" execution plan for other sets of bind variable values on future executions.
    Histograms on one or more of the columns in the WHERE clause could either help or hinder the situation further. It might further help the first execution, but might further hinder future executions with different bind variable values. Oracle Database 11.1 introduced something called adaptive cursor sharing (and 11.2 introduced cardinality feedback) that in theory addresses issues where the execution plan should change for later executions with different bind variable values (but the SQL statement must execute poorly at least once).
    There might be multiple child cursors in the library cache for the same SQL statement, each potentially with a different execution plan. I suggest finding the SQL_ID of the SQL statement that the application is submitting (you can do this by checking V$SQL or V$SQLAREA). Once you have the SQL_ID, go back to the SQL statement that I suggested for displaying the execution plan:
    SELECT * FROM TABLE (DBMS_XPLAN.DISPLAY_CURSOR(NULL,NULL,'TYPICAL'));
    The first NULL in the above SQL statement is where you would specify the SQL_ID. If you leave the second NULL in place, the above SQL statement will retrieve the execution plan for all child cursors with that SQL_ID.
    For instance, if the SQL_ID was 75chksrfa5fbt, you would execute the following:
    SELECT * FROM TABLE (DBMS_XPLAN.DISPLAY_CURSOR('75chksrfa5fbt',NULL,'TYPICAL'));
    Usually, you can take it a step further to see the bind variables that were used during the optimization phase. To do that, you would add the +PEEKED_BINDS format parameter:
    SELECT * FROM TABLE (DBMS_XPLAN.DISPLAY_CURSOR('75chksrfa5fbt',NULL,'TYPICAL +PEEKED_BINDS'));
    Note that there are various optimizer parameters that affect the optimizer's decisions, for instance, maybe the optimizer mode is set to FIRST_ROWS. Also possibly helpful is the +OUTLINE format parameter that might provide a clue regarding the value of some of the parameters affecting the optimizer.  The SQL statement that you would then enter is similar to the following:
    SELECT * FROM TABLE (DBMS_XPLAN.DISPLAY_CURSOR('75chksrfa5fbt',NULL,'TYPICAL +PEEKED_BINDS +OUTLINE'));
    Additional information might be helpful. Please see the following two forum threads to see what kind of information you should gather:
    When your query takes too long… : When your query takes too long ...
    How to post a SQL statement tuning request: HOW TO: Post a SQL statement tuning request - template posting
    Charles Hooper
    http://hoopercharles.wordpress.com/
    IT Manager/Oracle DBA
    K&M Machine-Fabricating, Inc.

  • CBO performance

    I'm trying to figure out how the CBO works and what are the parameters that I should change to get it work without "surprises". My Oracle version is 11.1.0.7, and this test was run in both single-instance and RAC. These run on Suse 10.
    Here's the query:
    SELECT * FROM
    RAPIPAGO.RP_TRANSACCION_ITEM I
    JOIN
    RAPIPAGO.RP_TRANSACCION_ITEM_COMISION IT
    ON I.ID_TRANSACCION_ITEM = IT.ID_TRANSACCION_ITEM
    WHERE MES_PRESENTACION = '2009-05';
    Or
    SELECT * FROM
    RAPIPAGO.RP_TRANSACCION_ITEM I,
    RAPIPAGO.RP_TRANSACCION_ITEM_COMISION IT
    WHERE I.ID_TRANSACCION_ITEM = IT.ID_TRANSACCION_ITEM AND MES_PRESENTACION = '2009-05';
    Its cost is 156.175 and takes 46 seconds to complete.
    If I use a hint to force the engine to use a combined index (ID_TRANSACCION_ITEM and MES_PRESENTACION, in that order), this is the result:
    SELECT /*+ INDEX (I IX4_RP_TRANSACCION_ITEM) */ * FROM
    RAPIPAGO.RP_TRANSACCION_ITEM I,
    RAPIPAGO.RP_TRANSACCION_ITEM_COMISION IT
    WHERE I.ID_TRANSACCION_ITEM = IT.ID_TRANSACCION_ITEM AND MES_PRESENTACION = '2009-05';
    Its cost is 2.697.283 but takes only 1 second to complete...
    This behavior is causing me trouble in the production environment, as it is unpredictable and inefficient.
    Is there any configuration that I can use to avoid or control this?
    Thanks in advance.

    If MES_PRESENTACION is a VARCHAR2, Oracle's ability to get accurate cardinality estimates will be greatly affected. The optimizer expects by default, for example, that if you have a DATE column with values of date '2008-01-01' through date '2009-08-01', that represents 20 months, so any month will have roughly 5% of the data. If you store that data as a string, however, the optimizer's ability to predict how selective a filter will be is going to be dramatically decreased.
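
The "roughly 5%" figure can be checked with a short calculation. This is only an illustration of a uniform-distribution estimate over the example date range given above. It does not reproduce any optimizer code, and the choice of May 2009 as the month is taken from the query in the original question.

```python
from datetime import date

# Example column range from the paragraph above.
low, high = date(2008, 1, 1), date(2009, 8, 1)

# One month of interest (May 2009), as a half-open interval.
month_start, month_end = date(2009, 5, 1), date(2009, 6, 1)

# Uniform-distribution estimate: width of the requested range / width of the column range.
range_fraction = (month_end - month_start).days / (high - low).days

# The rule of thumb from the text: 20 months, so each month is about 1/20 of the data.
months_fraction = 1 / 20

print(f"{range_fraction:.4f} vs {months_fraction:.4f}")
```

With a real DATE column the width of the range carries meaning, so the two figures land close together. A string such as '2009-05' gives the optimizer no such calendar arithmetic to work with.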
    Can you generate the query plans using the DBMS_XPLAN package and include the filter and access predicates? When you do, can you enclose them in the {code} tag to preserve white space?  DBMS_XPLAN provides a lot of information that can be useful.
    Does the query really return 2.5 million rows?
    How many rows are in RP_TRANSACCION_ITEM?
    How many rows are in RP_TRANSACCION_ITEM_COMISION?
    Which table is the MES_PRESENTACION column in?  How many rows have a MES_PRESENTACION value of '2009-05'?
    Is there a histogram on the MES_PRESENTACION column?
    Justin

  • Datatype best practice and plan cardinality

    Hi,
    I have a scenario where I need to store the data in the format YYYYMM (e.g. 201001 which means January, 2010).
    I am trying to evaluate what is the most appropriate datatype to store this kind of data. I am comparing 2 options, NUMBER and DATE.
    The data is essentially a component of the Oracle DATE datatype, and experts like Tom Kyte have shown (with examples) that using the right
    datatype is better for the optimizer. So I was expecting that the DATE datatype would yield cardinality estimates at least as good as (if not better than)
    those from the NUMBER datatype. However, my tests show that with DATE the cardinality estimates are way off from the actuals, whereas
    with NUMBER the cardinality estimates are much closer to the actuals.
    My questions are:
    1) What should be the most appropriate datatype used to store YYYYMM data?
    2) Why does the DATE datatype yield estimates that are much further from the actuals than the NUMBER datatype does?
    SQL> select * from V$VERSION ;
    BANNER
    Oracle Database 10g Enterprise Edition Release 10.2.0.1.0 - Prod
    PL/SQL Release 10.2.0.1.0 - Production
    CORE     10.2.0.1.0     Production
    TNS for Linux: Version 10.2.0.1.0 - Production
    NLSRTL Version 10.2.0.1.0 - Production
    SQL>  create table a nologging as select to_number(to_char(add_months(to_date('200101','YYYYMM'),level - 1), 'YYYYMM')) id from dual connect by level <= 289 ;
    Table created.
    SQL> create table b (id number) ;
    Table created.
    SQL> begin
      2  for i in 1..8192
      3  loop
      4     insert into b select * from a ;
      5  end loop;
      6  commit;
      7  end;
      8  /
    PL/SQL procedure successfully completed.
    SQL> alter table a add dt date ;
    Table altered.
    SQL> alter table b add dt date ;
    Table altered.
    SQL> select to_date(200101, 'YYYYMM') from dual ;
    TO_DATE(2
    01-JAN-01
    SQL> update a set dt = to_date(id, 'YYYYMM') ;
    289 rows updated.
    SQL> update b set dt = to_date(id, 'YYYYMM') ;
    2367488 rows updated.
    SQL> commit ;
    Commit complete.
    SQL> exec dbms_stats.gather_table_stats(user, 'A', estimate_percent=>NULL) ;
    PL/SQL procedure successfully completed.
    SQL> exec dbms_stats.gather_table_stats(user, 'B', estimate_percent=>NULL) ;    
    SQL> explain plan for select count(*) from b where id between 200810 and 200903 ;
    Explained.
    SQL> select * from table(dbms_xplan.display) ;
    PLAN_TABLE_OUTPUT
    Plan hash value: 749587668
    | Id  | Operation          | Name | Rows  | Bytes | Cost (%CPU)| Time     |
    |   0 | SELECT STATEMENT   |      |     1 |     5 |   824   (4)| 00:00:10 |
    |   1 |  SORT AGGREGATE    |      |     1 |     5 |            |          |
    |*  2 |   TABLE ACCESS FULL| B    | 46604 |   227K|   824   (4)| 00:00:10 |
    Predicate Information (identified by operation id):
    PLAN_TABLE_OUTPUT
       2 - filter("ID"<=200903 AND "ID">=200810)
    14 rows selected.
    SQL> explain plan for select count(*) from b where dt between to_date(200810, 'YYYYMM') and to_date(200903, 'YYYYMM') ;
    Explained.
    SQL> select * from table(dbms_xplan.display) ;
    PLAN_TABLE_OUTPUT
    Plan hash value: 749587668
    | Id  | Operation          | Name | Rows  | Bytes | Cost (%CPU)| Time     |
    |   0 | SELECT STATEMENT   |      |     1 |     5 |   825   (4)| 00:00:10 |
    |   1 |  SORT AGGREGATE    |      |     1 |     5 |            |          |
    |*  2 |   TABLE ACCESS FULL| B    |  5919 | 29595 |   825   (4)| 00:00:10 |
    Predicate Information (identified by operation id):
    PLAN_TABLE_OUTPUT
       2 - filter("DT">=TO_DATE('2008-10-01 00:00:00', 'yyyy-mm-dd
               hh24:mi:ss') AND "DT"<=TO_DATE('2009-03-01 00:00:00', 'yyyy-mm-dd
               hh24:mi:ss'))
    16 rows selected.
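
For reference, the two estimates can be set against the true row count with simple arithmetic. This is a hedged sketch: the actual count follows directly from how table B was built, but the two "explanations" are only coincidences worth noting. The NUMBER estimate matches 5 buckets out of 254 (which would fit a height-balanced histogram, an assumption, since this post does not show the column statistics), and the DATE estimate matches the 5% x 5% guess that Oracle is known to use for a bounded range it cannot evaluate.

```python
# Table B holds 8192 copies of the 289 monthly values in table A.
ROWS_IN_B = 289 * 8192

# Oct 2008 .. Mar 2009 inclusive is 6 months, each present 8192 times.
actual_rows = 6 * 8192

# Assumption: the NUMBER estimate corresponds to 5 of 254 histogram buckets.
number_estimate = ROWS_IN_B * 5 / 254

# Assumption: the DATE estimate corresponds to the 5% * 5% bounded-range guess.
date_estimate = ROWS_IN_B * 0.05 * 0.05

print(ROWS_IN_B, actual_rows, round(number_estimate), round(date_estimate))
```

Read this way, the NUMBER plan's 46604 is close to the real row count, while the DATE plan's 5919 is roughly an eighth of it. That pattern suggests the DATE predicate was estimated without using the column's value distribution at all.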

    Charles,
    Thanks for your response.
    I did not think of the possibility of histograms. When I ran the tests on 10.2.0.4, I got the same results as you have shown.
    So I thought it might be due to some bug in 10.2.0.1. But interestingly, when I ran the test after collecting statistics using 'FOR ALL COLUMNS SIZE 1'
    option, I got the cardinalities that match my 10.2.0.1 results (where METHOD_OPT was default i.e. 'FOR ALL COLUMNS SIZE AUTO').
    So I carried out the tests again on 10.2.0.1 but the results did not look consistent to me. When there were no histograms on DATE column, the cardinality
    was quite close to actuals but when I collected stats using 'FOR ALL COLUMNS SIZE SKEWONLY', it generated histograms on DATE column but
    the cardinality was not quite close to actuals.
    So I am bit confused about whether this is due to a bug or due to combined effect of optimizer's "intelligence" while collecting statistics using default option
    values and the way table is queried (COL_USAGE$ data).
    Here is my test:
    SQL> select * from v$version ;
    BANNER
    Oracle Database 10g Enterprise Edition Release 10.2.0.1.0 - Prod
    PL/SQL Release 10.2.0.1.0 - Production
    CORE     10.2.0.1.0     Production
    TNS for Linux: Version 10.2.0.1.0 - Production
    NLSRTL Version 10.2.0.1.0 - Production
    SQL> exec dbms_stats.delete_table_stats(user, 'B') ;
    PL/SQL procedure successfully completed.
    SQL> select column_name, num_distinct, num_buckets, histogram from user_tab_col_statistics where table_name = 'B' ;
    no rows selected
    SQL> exec dbms_stats.gather_table_stats(user, 'B') ;
    PL/SQL procedure successfully completed.
    SQL> select column_name, num_distinct, num_buckets, histogram from user_tab_col_statistics where table_name = 'B' ;
    COLUMN_NAME                    NUM_DISTINCT NUM_BUCKETS HISTOGRAM
    ID                                      289         254 HEIGHT BALANCED
    DT                                      289         254 HEIGHT BALANCED
    SQL> explain plan for select count(*) from b where b.id between 200810 and 200903 ;
    Explained.
    SQL> select * from table(dbms_xplan.display) ;
    PLAN_TABLE_OUTPUT
    Plan hash value: 749587668
    | Id  | Operation          | Name | Rows  | Bytes | Cost (%CPU)| Time     |
    |   0 | SELECT STATEMENT   |      |     1 |     5 |  3691   (1)| 00:00:45 |
    |   1 |  SORT AGGREGATE    |      |     1 |     5 |            |          |
    |*  2 |   TABLE ACCESS FULL| B    | 38218 |   186K|  3691   (1)| 00:00:45 |
    Predicate Information (identified by operation id):
       2 - filter("B"."ID"<=200903 AND "B"."ID">=200810)
    14 rows selected.
    SQL> explain plan for select count(*) from b where b.dt between to_date(200810, 'YYYYMM') and to_date(200903, 'YYYYMM') ;
    Explained.
    SQL> select * from table(dbms_xplan.display) ;
    PLAN_TABLE_OUTPUT
    Plan hash value: 749587668
    | Id  | Operation          | Name | Rows  | Bytes | Cost (%CPU)| Time     |
    |   0 | SELECT STATEMENT   |      |     1 |     8 |  3693   (1)| 00:00:45 |
    |   1 |  SORT AGGREGATE    |      |     1 |     8 |            |          |
    |*  2 |   TABLE ACCESS FULL| B    | 38218 |   298K|  3693   (1)| 00:00:45 |
    Predicate Information (identified by operation id):
       2 - filter("B"."DT"<=TO_DATE('2009-03-01 00:00:00', 'yyyy-mm-dd
                  hh24:mi:ss') AND "B"."DT">=TO_DATE('2008-10-01 00:00:00', 'yyyy-mm-dd
                  hh24:mi:ss'))
    16 rows selected.
    SQL> connect sys as sysdba ;
    Connected.
    SQL> delete from sys.col_usage$ where obj# in (select object_id from all_objects where owner = 'HR' and object_name in ('A','B')) ;
    4 rows deleted.
    SQL> commit ;
    Commit complete.
    SQL> connect hr/hr ;
    Connected.
    SQL> set serveroutput on size 10000
    SQL> exec dbms_stats.delete_table_stats(user, 'B') ;
    PL/SQL procedure successfully completed.
    SQL> exec dbms_stats.gather_table_stats(user, 'B') ;
    PL/SQL procedure successfully completed.
    SQL> select column_name, num_distinct, num_buckets, histogram from user_tab_col_statistics where table_name = 'B' ;
    COLUMN_NAME                    NUM_DISTINCT NUM_BUCKETS HISTOGRAM
    ID                                      289           1 NONE
    DT                                      289           1 NONE
    SQL> explain plan for select count(*) from b where b.id between 200810 and 200903 ;
    Explained.
    SQL> select * from table(dbms_xplan.display) ;
    PLAN_TABLE_OUTPUT
    Plan hash value: 749587668
    | Id  | Operation          | Name | Rows  | Bytes | Cost (%CPU)| Time     |
    |   0 | SELECT STATEMENT   |      |     1 |     5 |  3691   (1)| 00:00:45 |
    |   1 |  SORT AGGREGATE    |      |     1 |     5 |            |          |
    |*  2 |   TABLE ACCESS FULL| B    |   110K|   541K|  3691   (1)| 00:00:45 |
    Predicate Information (identified by operation id):
       2 - filter("B"."ID"<=200903 AND "B"."ID">=200810)
    14 rows selected.
    SQL> explain plan for select count(*) from b where b.dt between to_date(200810, 'YYYYMM') and to_date(200903, 'YYYYMM') ;
    Explained.
    SQL> select * from table(dbms_xplan.display) ;
    PLAN_TABLE_OUTPUT
    Plan hash value: 749587668
    | Id  | Operation          | Name | Rows  | Bytes | Cost (%CPU)| Time     |
    |   0 | SELECT STATEMENT   |      |     1 |     8 |  3693   (1)| 00:00:45 |
    |   1 |  SORT AGGREGATE    |      |     1 |     8 |            |          |
    |*  2 |   TABLE ACCESS FULL| B    | 58680 |   458K|  3693   (1)| 00:00:45 |
    Predicate Information (identified by operation id):
       2 - filter("B"."DT"<=TO_DATE('2009-03-01 00:00:00', 'yyyy-mm-dd
                  hh24:mi:ss') AND "B"."DT">=TO_DATE('2008-10-01 00:00:00', 'yyyy-mm-dd
                  hh24:mi:ss'))
    16 rows selected.
    SQL> exec dbms_stats.gather_table_stats(user, 'B') ;
    PL/SQL procedure successfully completed.
    SQL> select column_name, num_distinct, num_buckets, histogram from user_tab_col_statistics where table_name = 'B' ;
    COLUMN_NAME                    NUM_DISTINCT NUM_BUCKETS HISTOGRAM
    ID                                      289         254 HEIGHT BALANCED
    DT                                      289           1 NONE
    SQL> explain plan for select count(*) from b where b.id between 200810 and 200903 ;
    Explained.
    SQL> select * from table(dbms_xplan.display) ;
    PLAN_TABLE_OUTPUT
    Plan hash value: 749587668
    | Id  | Operation          | Name | Rows  | Bytes | Cost (%CPU)| Time     |
    |   0 | SELECT STATEMENT   |      |     1 |     5 |  3690   (1)| 00:00:45 |
    |   1 |  SORT AGGREGATE    |      |     1 |     5 |            |          |
    |*  2 |   TABLE ACCESS FULL| B    | 46303 |   226K|  3690   (1)| 00:00:45 |
    Predicate Information (identified by operation id):
       2 - filter("B"."ID"<=200903 AND "B"."ID">=200810)
    14 rows selected.
    SQL> explain plan for select count(*) from b where b.dt between to_date(200810, 'YYYYMM') and to_date(200903, 'YYYYMM') ;
    Explained.
    SQL> select * from table(dbms_xplan.display) ;
    PLAN_TABLE_OUTPUT
    Plan hash value: 749587668
    | Id  | Operation          | Name | Rows  | Bytes | Cost (%CPU)| Time     |
    |   0 | SELECT STATEMENT   |      |     1 |     8 |  3692   (1)| 00:00:45 |
    |   1 |  SORT AGGREGATE    |      |     1 |     8 |            |          |
    |*  2 |   TABLE ACCESS FULL| B    | 56797 |   443K|  3692   (1)| 00:00:45 |
    Predicate Information (identified by operation id):
       2 - filter("B"."DT"<=TO_DATE('2009-03-01 00:00:00', 'yyyy-mm-dd
                  hh24:mi:ss') AND "B"."DT">=TO_DATE('2008-10-01 00:00:00', 'yyyy-mm-dd
                  hh24:mi:ss'))
    16 rows selected.
    SQL> exec dbms_stats.gather_table_stats(user, 'B') ;
    PL/SQL procedure successfully completed.
    SQL> select column_name, num_distinct, num_buckets, histogram from user_tab_col_statistics where table_name = 'B' ;
    COLUMN_NAME                    NUM_DISTINCT NUM_BUCKETS HISTOGRAM
    ID                                      289         254 HEIGHT BALANCED
    DT                                      289           1 NONE
    SQL> exec dbms_stats.gather_table_stats(user, 'B', method_opt=>'FOR ALL COLUMNS SIZE SKEWONLY') ;
    PL/SQL procedure successfully completed.
    SQL> select column_name, num_distinct, num_buckets, histogram from user_tab_col_statistics where table_name = 'B' ;
    COLUMN_NAME                 NUM_DISTINCT NUM_BUCKETS HISTOGRAM
    ID                         289         254 HEIGHT BALANCED
    DT                         289         254 HEIGHT BALANCED
    SQL> explain plan for select count(*) from b where b.dt between to_date(200810, 'YYYYMM') and to_date(200903, 'YYYYMM') ;
    Explained.
    SQL> select * from table(dbms_xplan.display) ;
    PLAN_TABLE_OUTPUT
    Plan hash value: 749587668
    | Id  | Operation          | Name | Rows  | Bytes | Cost (%CPU)| Time     |
    |   0 | SELECT STATEMENT   |      |     1 |     8 |  3692   (1)| 00:00:45 |
    |   1 |  SORT AGGREGATE    |      |     1 |     8 |            |          |
    |*  2 |   TABLE ACCESS FULL| B    | 27862 |   217K|  3692   (1)| 00:00:45 |
    Predicate Information (identified by operation id):
       2 - filter("B"."DT"<=TO_DATE('2009-03-01 00:00:00', 'yyyy-mm-dd
               hh24:mi:ss') AND "B"."DT">=TO_DATE('2008-10-01 00:00:00', 'yyyy-mm-dd
               hh24:mi:ss'))
    16 rows selected.
    SQL> explain plan for select count(*) from b where id between 200810 and 200903 ;
    Explained.
    SQL> select * from table(dbms_xplan.display) ;
    PLAN_TABLE_OUTPUT
    Plan hash value: 749587668
    | Id  | Operation          | Name | Rows  | Bytes | Cost (%CPU)| Time     |
    |   0 | SELECT STATEMENT   |      |     1 |     5 |  3690   (1)| 00:00:45 |
    |   1 |  SORT AGGREGATE    |      |     1 |     5 |            |          |
    |*  2 |   TABLE ACCESS FULL| B    | 32505 |   158K|  3690   (1)| 00:00:45 |
    Predicate Information (identified by operation id):
       2 - filter("ID"<=200903 AND "ID">=200810)
    14 rows selected.
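
    Because the default METHOD_OPT ('FOR ALL COLUMNS SIZE AUTO') takes recorded column usage into account, it can help to look at what has been recorded for the table before each gather. A minimal sketch, assuming a session that can read the SYS-owned tables and reusing the HR.B table from the test above; it queries internal tables, so treat it as an illustration rather than a supported interface:

    ```sql
    -- Column usage recorded for HR.B (run as a user that can read SYS.COL_USAGE$).
    -- Usage data is kept in memory and written to this table only periodically,
    -- so a very recent query may not be reflected yet.
    select c.name as column_name,
           u.equality_preds,
           u.equijoin_preds,
           u.range_preds,
           u.like_preds,
           u.null_preds,
           u.timestamp
    from   sys.col_usage$ u
           join sys.col$    c on c.obj# = u.obj# and c.intcol# = u.intcol#
           join dba_objects o on o.object_id = u.obj#
    where  o.owner = 'HR'
    and    o.object_name = 'B'
    order  by c.name;
    ```

    A column with no recorded predicates at gather time would be consistent with SIZE AUTO creating no histogram for it, as seen in the gather that ran right after the COL_USAGE$ rows were deleted.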

  • Cardinality of a quer

    Hi all,
    I think the cost affects the performance of a query.
    Do the cardinality and bytes also have an effect?
    Thanks to all in advance.

    You want the cardinality estimate to be accurate, not low.
    How many rows does the query actually return? If the query actually returns the number of rows (roughly) that the optimizer expects, that implies that the optimizer probably picked a pretty good plan. If the optimizer radically over- or under-estimated how many rows are going to be returned, the optimizer probably picked a bad plan.
    Think of the cardinality estimate like an estimate you'd get in the real world. If you're looking for someone to remodel your kitchen, for example, and someone gives you an estimate of 1 hour while another gives you an estimate of 1 year, you can be pretty confident that neither of those estimates is going to work out well for you. The guy that estimated that it would only take an hour is obviously underestimating the cost of the job. The guy that estimated a year, on the other hand, is obviously overestimating the cost of the job. In the real world, if you got that sort of estimate, you'd probably assume that there had been some sort of miscommunication about exactly what work you wanted done. In the Oracle realm, you'd generally suspect that there were incorrect, invalid, or missing statistics on some object in the database that was causing the optimizer to make incorrect estimates and you'd work to fix those statistics so that the optimizer's estimate becomes reasonable.
    Justin
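
    One way to compare the optimizer's estimate with what actually happens, on 10g and later, is to run the statement with rowsource statistics enabled and then display the estimated (E-Rows) and actual (A-Rows) counts side by side. A minimal sketch; the table and predicate are placeholders and are not taken from the question:

    ```sql
    -- In SQL*Plus, SET SERVEROUTPUT OFF first; otherwise the "last" cursor
    -- seen by DISPLAY_CURSOR is the internal DBMS_OUTPUT call.
    set serveroutput off

    -- Run the statement once with rowsource statistics collected.
    select /*+ gather_plan_statistics */ count(*)
    from   some_table                      -- hypothetical table
    where  some_col between 1 and 100;     -- hypothetical predicate

    -- Show the plan of the last statement executed in this session,
    -- with E-Rows and A-Rows for each operation.
    select * from table(dbms_xplan.display_cursor(null, null, 'ALLSTATS LAST'));
    ```

    Operations where E-Rows and A-Rows differ by orders of magnitude are the places to start checking statistics.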

  • Query Degradation--Hash Join Degraded

    Hi All,
    I found a query degradation issue. I am on 10.2.0.3.0 (Sun OS) with optimizer_mode=ALL_ROWS.
    This is a data warehouse DB.
    All 3 tables involved are partitioned tables (with daily partitions). Partitions are created in advance and ELT jobs load bulk data into the daily partitions.
    I have checked that the CBO is not using the local indexes created on them, which I believe is appropriate, because when I used an INDEX hint the elapsed time increased.
    I tried an index hint for each table one by one but didn't get any performance improvement.
    Partitions are loaded daily, and after loading, partition-level stats are gathered with dbms_stats.
    We are collecting stats at partition level (granularity=>'PARTITION'). Even after collecting global stats, there is no change in access pattern. The stats gather command is given below.
    PROCEDURE gather_table_part_stats(i_owner_name,i_table_name,i_part_name,i_estimate:= DBMS_STATS.AUTO_SAMPLE_SIZE, i_invalidate IN VARCHAR2 := 'Y',i_debug:= 'N')
    Only SOT_KEYMAP.IPK_SOT_KEYMAP is GLOBAL. All the other indexes are LOCAL.
    Earlier, we had a BIND PEEKING issue, which I fixed by introducing NO_INVALIDATE=>FALSE in the stats gather job.
    Here, the partition name (20090219) is passed through bind variables.
    SELECT a.sotrelstg_sot_ud sotcrct_sot_ud,
           b.sotkey_ud sotcrct_orig_sot_ud, a.ROWID stage_rowid
    FROM   (SELECT sotrelstg_sot_ud, sotrelstg_sys_ud,
                   sotrelstg_orig_sys_ord_id, sotrelstg_orig_sys_ord_vseq
            FROM   sot_rel_stage
            WHERE  sotrelstg_trd_date_ymd_part = '20090219'
            AND    sotrelstg_crct_proc_stat_cd = 'N'
            AND    sotrelstg_sot_ud NOT IN (
                       SELECT sotcrct_sot_ud
                       FROM   sot_correct
                       WHERE  sotcrct_trd_date_ymd_part = '20090219')) a,
           (SELECT MAX(sotkey_ud) sotkey_ud, sotkey_sys_ud,
                   sotkey_sys_ord_id, sotkey_sys_ord_vseq,
                   sotkey_trd_date_ymd_part
            FROM   sot_keymap
            WHERE  sotkey_trd_date_ymd_part = '20090219'
            AND    sotkey_iud_cd = 'I'
            --not to select logical deleted rows
            GROUP BY sotkey_trd_date_ymd_part,
                     sotkey_sys_ud,
                     sotkey_sys_ord_id,
                     sotkey_sys_ord_vseq) b
    WHERE  a.sotrelstg_sys_ud = b.sotkey_sys_ud
    AND    a.sotrelstg_orig_sys_ord_id = b.sotkey_sys_ord_id
    AND    NVL(a.sotrelstg_orig_sys_ord_vseq, 1) = NVL(b.sotkey_sys_ord_vseq, 1);
    During normal business hours, I found that the query takes 5-7 min (which is also not acceptable), but during high-load business hours it takes 30-50 min.
    I found that most of the time is spent on the HASH JOIN (direct path write temp). We have sufficient RAM (64 GB total/41 GB available).
    Below is the execution plan I got during normal business hours.
    | Id  | Operation                 | Name                | Starts | E-Rows | A-Rows |   A-Time   | Buffers | Reads  | Writes |  OMem |  1Mem | Used-Mem | Used-Tmp|
    |   1 |  HASH GROUP BY            |                     |      1 |      1 |   7844K|00:05:28.78 |      16M|    217K|  35969 |       |       |          |         |
    |*  2 |   HASH JOIN               |                     |      1 |      1 |   9977K|00:04:34.02 |      16M|    202K|  20779 |   580M|    10M|  563M (1)|     650K|
    |   3 |    NESTED LOOPS ANTI      |                     |      1 |      6 |   7855K|00:01:26.41 |      16M|   1149 |      0 |       |       |          |         |
    |   4 |     PARTITION RANGE SINGLE|                     |      1 |    258K|   8183K|00:00:16.37 |   25576 |   1149 |      0 |       |       |          |         |
    |*  5 |      TABLE ACCESS FULL    | SOT_REL_STAGE       |      1 |    258K|   8183K|00:00:16.37 |   25576 |   1149 |      0 |       |       |          |         |
    |   6 |     PARTITION RANGE SINGLE|                     |   8183K|    326K|    327K|00:01:10.53 |      16M|      0 |      0 |       |       |          |         |
    |*  7 |      INDEX RANGE SCAN     | IDXL_SOTCRCT_SOT_UD |   8183K|    326K|    327K|00:00:53.37 |      16M|      0 |      0 |       |       |          |         |
    |   8 |    PARTITION RANGE SINGLE |                     |      1 |    846K|     14M|00:02:06.36 |     289K|    180K|      0 |       |       |          |         |
    |*  9 |     TABLE ACCESS FULL     | SOT_KEYMAP          |      1 |    846K|     14M|00:01:52.32 |     289K|    180K|      0 |       |       |          |         |
    I will attach the same for high-load business hours once the query returns results. It has been executing for the last 50 minutes.
    INDEX STATS (INDEXES ARE LOCAL INDEXES)
    TABLE_NAME     INDEX_NAME                COLUMN_NAME                  COLUMN_POSITION   NUM_ROWS DISTINCT_KEYS CLUSTERING_FACTOR
    SOT_REL_STAGE  IDXL_SOTRELSTG_SOT_UD     SOTRELSTG_SOT_UD                           1   25461560      25461560            184180
    SOT_REL_STAGE                            SOTRELSTG_TRD_DATE_YMD_PART                2   25461560      25461560            184180
    SOT_KEYMAP     IDXL_SOTKEY_ENTORDSYS_UD  SOTKEY_ENTRY_ORD_SYS_UD                    1 1012306940             3          38308680
    SOT_KEYMAP     IDXL_SOTKEY_HASH          SOTKEY_HASH                                1 1049582320    1049582320        1049579520
    SOT_KEYMAP                               SOTKEY_TRD_DATE_YMD_PART                   2 1049582320    1049582320        1049579520
    SOT_KEYMAP     IDXL_SOTKEY_SOM_ORD       SOTKEY_SOM_UD                              1 1023998560     268949136         559414840
    SOT_KEYMAP                               SOTKEY_SYS_ORD_ID                          2 1023998560     268949136         559414840
    SOT_KEYMAP     IPK_SOT_KEYMAP            SOTKEY_UD                                  1 1030369480    1015378900          24226580
    SOT_CORRECT    IDXL_SOTCRCT_SOT_UD       SOTCRCT_SOT_UD                             1  412484756     412484756         411710982
    SOT_CORRECT                              SOTCRCT_TRD_DATE_YMD_PART                  2  412484756     412484756         411710982
    INDEX partition stats (from dba_ind_partitions)
    INDEX_NAME                     PARTITION_NAME       STATUS       BLEVEL LEAF_BLOCKS DISTINCT_KEYS CLUSTERING_FACTOR   NUM_ROWS SAMPLE_SIZE LAST_ANALYZ GLO
    IDXL_SOTCRCT_SOT_UD            P20090219            USABLE            1         372        327879            216663     327879      327879 20-Feb-2009 YES
    IDXL_SOTKEY_ENTORDSYS_UD       P20090219            USABLE            2        2910             3             36618     856229      856229 19-Feb-2009 YES
    IDXL_SOTKEY_HASH               P20090219            USABLE            2        7783        853956            853914     853956      119705 19-Feb-2009 YES
    IDXL_SOTKEY_SOM_ORD            P20090219            USABLE            2        6411        531492            157147     799758      132610 19-Feb-2009 YES
    IDXL_SOTRELSTG_SOT_UD          P20090219            USABLE            2       13897       9682052             45867    9682052      794958 20-Feb-2009 YES
    Thanks in advance.
    Bhavik Desai
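
    The body of the custom gather_table_part_stats wrapper is not shown, so the following is only a sketch of what an equivalent direct DBMS_STATS call for one daily partition might look like. The owner, the cascade setting, and the assumption that the table partition is named like the index partition (P20090219) are illustrative guesses, not taken from the post:

    ```sql
    begin
      dbms_stats.gather_table_stats(
        ownname          => user,                         -- assumption: tables are in the current schema
        tabname          => 'SOT_KEYMAP',
        partname         => 'P20090219',                  -- assumption: same name as the index partition
        granularity      => 'PARTITION',
        estimate_percent => dbms_stats.auto_sample_size,
        cascade          => true,                         -- assumption: gather local index partition stats too
        no_invalidate    => false                         -- invalidate dependent cursors immediately
      );
    end;
    /
    ```

    With no_invalidate => false, cursors that depend on the table are invalidated as soon as the gather completes, so the next execution is hard parsed and peeks at the current bind values.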

    Hi Randolf,
    Thanks for the time you spent on this issue. I appreciate it.
    Please see my comments below:
    1. You've mentioned several times that you're passing the partition name as bind variable, but you're obviously testing the statement with literals rather than bind
    variables. So your tests obviously don't reflect what is going to happen in case of the actual execution. The cardinality estimates are potentially quite different when
    using bind variables for the partition key.
    Yes. I intentionally used literals in my tests. A couple of times I found that the plan used by the application and the plan generated by AUTOTRACE + EXPLAIN PLAN were the same and caused hrly elapsed time.
    As I pointed out earlier, last month we solved a couple of bind peeking issues by introducing NO_INVALIDATE=>FALSE in the stats gather procedure, which we execute just after the data load into the daily partitions and before the start of the jobs that execute this query.
    Execution plans from AWR (with parallelism on at table level, DEGREE>1). This plan is the one the CBO used when the degradation occurred. This plan is used most of the time.
    ELAPSED_TIME_DELTA BUFFER_GETS_DELTA DISK_READS_DELTA CURSOR(SELECT*FROMTA
            1918506000          46154275              918 CURSOR STATEMENT : 4
    CURSOR STATEMENT : 4
    PLAN_TABLE_OUTPUT
    SQL_ID 39708a3azmks7
    SELECT A.SOTRELSTG_SOT_UD SOTCRCT_SOT_UD, B.SOTKEY_UD SOTCRCT_ORIG_SOT_UD, A.ROWID STAGE_ROWID FROM (SELECT SOTRELSTG_SOT_UD,
    SOTRELSTG_SYS_UD, SOTRELSTG_ORIG_SYS_ORD_ID, SOTRELSTG_ORIG_SYS_ORD_VSEQ FROM SOT_REL_STAGE WHERE SOTRELSTG_TRD_DATE_YMD_PART = :B1 AND
    SOTRELSTG_CRCT_PROC_STAT_CD = 'N' AND SOTRELSTG_SOT_UD NOT IN( SELECT SOTCRCT_SOT_UD FROM SOT_CORRECT WHERE SOTCRCT_TRD_DATE_YMD_PART =
    :B1 )) A, (SELECT MAX(SOTKEY_UD) SOTKEY_UD, SOTKEY_SYS_UD, SOTKEY_SYS_ORD_ID, SOTKEY_SYS_ORD_VSEQ, SOTKEY_TRD_DATE_YMD_PART FROM
    SOT_KEYMAP WHERE SOTKEY_TRD_DATE_YMD_PART = :B1 AND SOTKEY_IUD_CD = 'I' GROUP BY SOTKEY_TRD_DATE_YMD_PART, SOTKEY_SYS_UD,
    SOTKEY_SYS_ORD_ID, SOTKEY_SYS_ORD_VSEQ) B WHERE A.SOTRELSTG_SYS_UD = B.SOTKEY_SYS_UD AND A.SOTRELSTG_ORIG_SYS_ORD_ID =
    B.SOTKEY_SYS_ORD_ID AND NVL(A.SOTRELSTG_ORIG_SYS_ORD_VSEQ, 1) = NVL(B.SOTKEY_SYS_ORD_VSEQ, 1)
    Plan hash value: 1213870831
    | Id  | Operation                     | Name                | Rows  | Bytes | Cost (%CPU)| Time     | Pstart| Pstop |    TQ  |IN-OUT| PQ Distrib |
    |   0 | SELECT STATEMENT              |                     |       |       | 19655 (100)|          |       |       |        |      |            |
    |   1 |  PX COORDINATOR               |                     |       |       |            |          |       |       |        |      |            |
    |   2 |   PX SEND QC (RANDOM)         | :TQ10003            |     1 |   116 | 19655   (1)| 00:05:54 |       |       |  Q1,03 | P->S | QC (RAND)  |
    |   3 |    HASH GROUP BY              |                     |     1 |   116 | 19655   (1)| 00:05:54 |       |       |  Q1,03 | PCWP |            |
    |   4 |     PX RECEIVE                |                     |     1 |   116 | 19655   (1)| 00:05:54 |       |       |  Q1,03 | PCWP |            |
    |   5 |      PX SEND HASH             | :TQ10002            |     1 |   116 | 19655   (1)| 00:05:54 |       |       |  Q1,02 | P->P | HASH       |
    |   6 |       HASH GROUP BY           |                     |     1 |   116 | 19655   (1)| 00:05:54 |       |       |  Q1,02 | PCWP |            |
    |   7 |        NESTED LOOPS ANTI      |                     |     1 |   116 | 19654   (1)| 00:05:54 |       |       |  Q1,02 | PCWP |            |
    |   8 |         HASH JOIN             |                     |     1 |   102 | 19654   (1)| 00:05:54 |       |       |  Q1,02 | PCWP |            |
    |   9 |          PX JOIN FILTER CREATE| :BF0000             |    13M|   664M|  2427   (3)| 00:00:44 |       |       |  Q1,02 | PCWP |            |
    |  10 |           PX RECEIVE          |                     |    13M|   664M|  2427   (3)| 00:00:44 |       |       |  Q1,02 | PCWP |            |
    |  11 |            PX SEND HASH       | :TQ10000            |    13M|   664M|  2427   (3)| 00:00:44 |       |       |  Q1,00 | P->P | HASH       |
    |  12 |             PX BLOCK ITERATOR |                     |    13M|   664M|  2427   (3)| 00:00:44 |   KEY |   KEY |  Q1,00 | PCWC |            |
    |  13 |              TABLE ACCESS FULL| SOT_REL_STAGE       |    13M|   664M|  2427   (3)| 00:00:44 |   KEY |   KEY |  Q1,00 | PCWP |            |
    |  14 |          PX RECEIVE           |                     |    27M|  1270M| 17209   (1)| 00:05:10 |       |       |  Q1,02 | PCWP |            |
    |  15 |           PX SEND HASH        | :TQ10001            |    27M|  1270M| 17209   (1)| 00:05:10 |       |       |  Q1,01 | P->P | HASH       |
    |  16 |            PX JOIN FILTER USE | :BF0000             |    27M|  1270M| 17209   (1)| 00:05:10 |       |       |  Q1,01 | PCWP |            |
    |  17 |             PX BLOCK ITERATOR |                     |    27M|  1270M| 17209   (1)| 00:05:10 |   KEY |   KEY |  Q1,01 | PCWC |            |
    |  18 |              TABLE ACCESS FULL| SOT_KEYMAP          |    27M|  1270M| 17209   (1)| 00:05:10 |   KEY |   KEY |  Q1,01 | PCWP |            |
    |  19 |         PARTITION RANGE SINGLE|                     | 16185 |   221K|     0   (0)|          |   KEY |   KEY |  Q1,02 | PCWP |            |
    |  20 |          INDEX RANGE SCAN     | IDXL_SOTCRCT_SOT_UD | 16185 |   221K|     0   (0)|          |   KEY |   KEY |  Q1,02 | PCWP |            |
    Other Execution plan from AWR
    ELAPSED_TIME_DELTA BUFFER_GETS_DELTA DISK_READS_DELTA CURSOR(SELECT*FROMTA
            1053251381                 0             2925 CURSOR STATEMENT : 4
    CURSOR STATEMENT : 4
    PLAN_TABLE_OUTPUT
    SQL_ID 39708a3azmks7
    SELECT A.SOTRELSTG_SOT_UD SOTCRCT_SOT_UD, B.SOTKEY_UD SOTCRCT_ORIG_SOT_UD, A.ROWID STAGE_ROWID FROM (SELECT SOTRELSTG_SOT_UD,
    SOTRELSTG_SYS_UD, SOTRELSTG_ORIG_SYS_ORD_ID, SOTRELSTG_ORIG_SYS_ORD_VSEQ FROM SOT_REL_STAGE WHERE SOTRELSTG_TRD_DATE_YMD_PART = :B1 AND
    SOTRELSTG_CRCT_PROC_STAT_CD = 'N' AND SOTRELSTG_SOT_UD NOT IN( SELECT SOTCRCT_SOT_UD FROM SOT_CORRECT WHERE SOTCRCT_TRD_DATE_YMD_PART =
    :B1 )) A, (SELECT MAX(SOTKEY_UD) SOTKEY_UD, SOTKEY_SYS_UD, SOTKEY_SYS_ORD_ID, SOTKEY_SYS_ORD_VSEQ, SOTKEY_TRD_DATE_YMD_PART FROM
    SOT_KEYMAP WHERE SOTKEY_TRD_DATE_YMD_PART = :B1 AND SOTKEY_IUD_CD = 'I' GROUP BY SOTKEY_TRD_DATE_YMD_PART, SOTKEY_SYS_UD,
    SOTKEY_SYS_ORD_ID, SOTKEY_SYS_ORD_VSEQ) B WHERE A.SOTRELSTG_SYS_UD = B.SOTKEY_SYS_UD AND A.SOTRELSTG_ORIG_SYS_ORD_ID =
    B.SOTKEY_SYS_ORD_ID AND NVL(A.SOTRELSTG_ORIG_SYS_ORD_VSEQ, 1) = NVL(B.SOTKEY_SYS_ORD_VSEQ, 1)
    Plan hash value: 3434900850
    | Id  | Operation                     | Name                | Rows  | Bytes | Cost (%CPU)| Time     | Pstart| Pstop |    TQ  |IN-OUT| PQ Distrib |
    |   0 | SELECT STATEMENT              |                     |       |       |  1830 (100)|          |       |       |        |      |            |
    |   1 |  PX COORDINATOR               |                     |       |       |            |          |       |       |        |      |            |
    |   2 |   PX SEND QC (RANDOM)         | :TQ10003            |     1 |   131 |  1830   (2)| 00:00:33 |       |       |  Q1,03 | P->S | QC (RAND)  |
    |   3 |    HASH GROUP BY              |                     |     1 |   131 |  1830   (2)| 00:00:33 |       |       |  Q1,03 | PCWP |            |
    |   4 |     PX RECEIVE                |                     |     1 |   131 |  1830   (2)| 00:00:33 |       |       |  Q1,03 | PCWP |            |
    |   5 |      PX SEND HASH             | :TQ10002            |     1 |   131 |  1830   (2)| 00:00:33 |       |       |  Q1,02 | P->P | HASH       |
    |   6 |       HASH GROUP BY           |                     |     1 |   131 |  1830   (2)| 00:00:33 |       |       |  Q1,02 | PCWP |            |
    |   7 |        NESTED LOOPS ANTI      |                     |     1 |   131 |  1829   (2)| 00:00:33 |       |       |  Q1,02 | PCWP |            |
    |   8 |         HASH JOIN             |                     |     1 |   117 |  1829   (2)| 00:00:33 |       |       |  Q1,02 | PCWP |            |
    |   9 |          PX JOIN FILTER CREATE| :BF0000             |  1010K|    50M|   694   (1)| 00:00:13 |       |       |  Q1,02 | PCWP |            |
    |  10 |           PX RECEIVE          |                     |  1010K|    50M|   694   (1)| 00:00:13 |       |       |  Q1,02 | PCWP |            |
    |  11 |            PX SEND HASH       | :TQ10000            |  1010K|    50M|   694   (1)| 00:00:13 |       |       |  Q1,00 | P->P | HASH       |
    |  12 |             PX BLOCK ITERATOR |                     |  1010K|    50M|   694   (1)| 00:00:13 |   KEY |   KEY |  Q1,00 | PCWC |            |
    |  13 |              TABLE ACCESS FULL| SOT_KEYMAP          |  1010K|    50M|   694   (1)| 00:00:13 |   KEY |   KEY |  Q1,00 | PCWP |            |
    |  14 |          PX RECEIVE           |                     |    11M|   688M|  1129   (3)| 00:00:21 |       |       |  Q1,02 | PCWP |            |
    |  15 |           PX SEND HASH        | :TQ10001            |    11M|   688M|  1129   (3)| 00:00:21 |       |       |  Q1,01 | P->P | HASH       |
    |  16 |            PX JOIN FILTER USE | :BF0000             |    11M|   688M|  1129   (3)| 00:00:21 |       |       |  Q1,01 | PCWP |            |
    |  17 |             PX BLOCK ITERATOR |                     |    11M|   688M|  1129   (3)| 00:00:21 |   KEY |   KEY |  Q1,01 | PCWC |            |
    |  18 |              TABLE ACCESS FULL| SOT_REL_STAGE       |    11M|   688M|  1129   (3)| 00:00:21 |   KEY |   KEY |  Q1,01 | PCWP |            |
    |  19 |         PARTITION RANGE SINGLE|                     |  5209 | 72926 |     0   (0)|          |   KEY |   KEY |  Q1,02 | PCWP |            |
    |  20 |          INDEX RANGE SCAN     | IDXL_SOTCRCT_SOT_UD |  5209 | 72926 |     0   (0)|          |   KEY |   KEY |  Q1,02 | PCWP |            |
    EXECUTION PLAN AFTER SETTING DEGREE=1 (It was also degraded)
    | Id  | Operation                 | Name                | Rows  | Bytes |TempSpc| Cost (%CPU)| Time     | Pstart| Pstop |
    |   0 | SELECT STATEMENT          |                     |     1 |   129 |       | 42336   (2)| 00:12:43 |       |       |
    |   1 |  HASH GROUP BY            |                     |     1 |   129 |       | 42336   (2)| 00:12:43 |       |       |
    |   2 |   NESTED LOOPS ANTI       |                     |     1 |   129 |       | 42335   (2)| 00:12:43 |       |       |
    |*  3 |    HASH JOIN              |                     |     1 |   115 |    51M| 42334   (2)| 00:12:43 |       |       |
    |   4 |     PARTITION RANGE SINGLE|                     |   846K|    41M|       |  8241   (1)| 00:02:29 |    81 |    81 |
    |*  5 |      TABLE ACCESS FULL    | SOT_KEYMAP          |   846K|    41M|       |  8241   (1)| 00:02:29 |    81 |    81 |
    |   6 |     PARTITION RANGE SINGLE|                     |  8161K|   490M|       | 12664   (3)| 00:03:48 |    81 |    81 |
    |*  7 |      TABLE ACCESS FULL    | SOT_REL_STAGE       |  8161K|   490M|       | 12664   (3)| 00:03:48 |    81 |    81 |
    |   8 |    PARTITION RANGE SINGLE |                     |  6525K|    87M|       |     1   (0)| 00:00:01 |    81 |    81 |
    |*  9 |     INDEX RANGE SCAN      | IDXL_SOTCRCT_SOT_UD |  6525K|    87M|       |     1   (0)| 00:00:01 |    81 |    81 |
    Predicate Information (identified by operation id):
       3 - access("SOTRELSTG_SYS_UD"="SOTKEY_SYS_UD" AND "SOTRELSTG_ORIG_SYS_ORD_ID"="SOTKEY_SYS_ORD_ID" AND
                  NVL("SOTRELSTG_ORIG_SYS_ORD_VSEQ",1)=NVL("SOTKEY_SYS_ORD_VSEQ",1))
       5 - filter("SOTKEY_TRD_DATE_YMD_PART"=20090219 AND "SOTKEY_IUD_CD"='I')
       7 - filter("SOTRELSTG_CRCT_PROC_STAT_CD"='N' AND "SOTRELSTG_TRD_DATE_YMD_PART"=20090219)
       9 - access("SOTRELSTG_SOT_UD"="SOTCRCT_SOT_UD" AND "SOTCRCT_TRD_DATE_YMD_PART"=20090219)
    2. Why are you passing the partition name as bind variable? A statement executing 5 mins. best, > 2 hours worst obviously doesn't suffer from hard parsing issues and
    doesn't need to (shouldn't) share execution plans therefore. So I strongly suggest to use literals instead of bind variables. This also solves any potential issues caused
    by bind variable peeking.
    This is a custom application which uses bind variables to extract data from daily partitions. So there is a daily automated data extract from the daily partitions after the load and ELT process.
    Here, the value of the bind variable is passed through a procedure parameter. It would be a bit difficult to use literals in such an application.
    3. All your posted plans suffer from bad cardinality estimates. The NO_MERGE hint suggested by Timur only caused a (significant) damage limitation by obviously reducing
    the row source size by the group by operation before joining, but still the optimizer is way off, apart from the obviously wrong join order (larger row set first) in
    particular the NESTED LOOP operation is causing the main troubles due to excessive logical I/O, as already pointed out by Timur.
    Can I ask for alternatives to NESTED LOOP?
    4. Your PLAN_TABLE seems to be old (you should see a corresponding note at the bottom of the DBMS_XPLAN.DISPLAY output), because none of the operations have a
    filter/access predicate information attached. Since your main issue are the bad cardinality estimates, I strongly suggest to drop any existing PLAN_TABLEs in any non-Oracle
    owned schemas because 10g already provides one in the SYS schema (GTT PLAN_TABLE$) exposed via a public synonym, so that the EXPLAIN PLAN information provides the
    "Predicate Information" section below the plan covering the "Filter/Access" predicates.
    Please post a revised explain plan output including this crucial information so that we get a clue why the cardinality estimates are way off.
    I have dropped the old plan table. The execution plan listed above (under the first point) now includes the Predicate Information.
    "As already mentioned the usage of bind variables for the partition name makes this issue potentially worse."
    Is there any workaround that does not require replacing the bind variable? I am on 10g, so 11g features will not help.
    How are you gathering the statistics daily, can you post the exact command(s) used?
    gather_table_part_stats(i_owner_name,i_table_name,i_part_name,i_estimate:= DBMS_STATS.AUTO_SAMPLE_SIZE, i_invalidate IN VARCHAR2 := 'Y',i_debug:= 'N')
    Thanks & Regards,
    Bhavik Desai

  • Query regarding Explain Plan

    Hi,
    I have a question regarding execution plans. When we gather statistics, the optimizer chooses the best possible plan out of the plans available. If we do not gather statistics on a table for a long time, which plan does it choose?
    Will it continue to use the same plan it used at the start, when the statistics were gathered, or will it change the plan as soon as DML activity is performed and the statistics become stale?
    Thanks
    GK

    Hi,
    Aman.... wrote:
    Gulshan wrote:
    Hi,
    I have a question regarding execution plans. When we gather statistics, the optimizer chooses the best possible plan out of the plans available. If we do not gather statistics on a table for a long time, which plan does it choose?
    The same plan which it has chosen in the starting with the previous statistics. The plan won't change automatically as long as you won't refresh the statistics.
    This is wrong even for Oracle 9i. Here are a couple of examples of when a plan might change with the same optimizer statistics:
    * When you have a histogram on a column and (unwisely) use bind variables against it, you might get a completely different execution plan because of a different incoming value. All that is needed to run into this is a soft parse, which might happen for different reasons, for instance a session parameter modification (which might also change a plan even without a histogram).
    * Starting with 10g, the CBO adjusts cardinality estimates for out-of-range values that appear in predicates.
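    To illustrate the second point, here is a rough sketch of how an out-of-range adjustment can change an estimate while the statistics stay the same. It assumes a simple linear decay based on how far the predicate value lies beyond the recorded low/high values; the function name and the exact arithmetic are assumptions for illustration, not Oracle's documented algorithm, and the real behaviour depends on version and predicate type.

```python
def out_of_range_cardinality(value, low, high, in_range_cardinality):
    """Hypothetical linear-decay model for an equality predicate.

    low/high stand for the column's recorded low and high values at the
    time statistics were gathered. Everything here is an assumption made
    for illustration only.
    """
    if low <= value <= high:
        # Inside the known range: no adjustment.
        return float(in_range_cardinality)
    # Distance beyond the nearest boundary of the known range.
    distance = (low - value) if value < low else (value - high)
    # Scale the estimate down linearly; it reaches zero one full
    # range-width outside the boundary.
    factor = max(0.0, 1.0 - distance / (high - low))
    # Never report fewer than one row.
    return max(1.0, in_range_cardinality * factor)


# Same statistics (low=0, high=100), different incoming values:
inside = out_of_range_cardinality(50, 0, 100, 1000)     # no adjustment
halfway = out_of_range_cardinality(150, 0, 100, 1000)   # half a range-width outside
far_out = out_of_range_cardinality(250, 0, 100, 1000)   # well beyond the range
```

    The point is only that the estimate (and therefore possibly the plan) depends on the incoming value as well as on the stored statistics.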

  • Behaviour of default value of METHOD_OPT

    Hello,
    I was trying to test the impact of extended statistics feature of 11g when I was puzzled by another observation.
    I created a table (from ALL_OBJECTS view). The data in this table was such that it had lots of rows where OWNER = 'PUBLIC'
    and lots of rows where OBJECT_TYPE = 'JAVA CLASS' but no rows where OWNER = 'PUBLIC' AND OBJECT_TYPE = 'JAVA CLASS'.
    I also created an index on the combination of (OWNER, OBJECT_TYPE).
    Now, after collecting statistics on the table and index, I queried the table for the above condition (OWNER = 'PUBLIC' AND OBJECT_TYPE = 'JAVA CLASS').
    To my surprise (or not), the query used the index.
    Then I re-gathered the statistics on the table and index, and now the same query started to do a full table scan.
    Only the creation of extended statistics ensured that the plan subsequently changed back to indexed access. While this proved the usefulness of extended stats,
    I am not sure how Oracle was able to use the indexed access path initially but not afterwards.
    Is this due to column usage monitoring info? Can anybody help?
    Here is my test case:
    SQL> select * from v$version ;
    BANNER
    Oracle Database 11g Enterprise Edition Release 11.2.0.1.0 - Production
    PL/SQL Release 11.2.0.1.0 - Production
    CORE     11.2.0.1.0     Production
    TNS for Linux: Version 11.2.0.1.0 - Production
    NLSRTL Version 11.2.0.1.0 - Production
    5 rows selected.
    SQL> show parameter optimizer
    NAME                                 TYPE        VALUE
    optimizer_capture_sql_plan_baselines boolean     FALSE
    optimizer_dynamic_sampling           integer     2
    optimizer_features_enable            string      11.2.0.1
    optimizer_index_caching              integer     0
    optimizer_index_cost_adj             integer     100
    optimizer_mode                       string      ALL_ROWS
    optimizer_secure_view_merging        boolean     TRUE
    optimizer_use_invisible_indexes      boolean     FALSE
    optimizer_use_pending_statistics     boolean     FALSE
    optimizer_use_sql_plan_baselines     boolean     TRUE
    SQL> create table t1 nologging as select * from all_objects ;
    Table created.
    SQL> exec dbms_stats.gather_table_stats(user, 'T1', no_invalidate=>false) ;
    PL/SQL procedure successfully completed.
    SQL> select * from t1 where owner = 'PUBLIC' and object_type = 'JAVA CLASS' ;
    no rows selected
    SQL> select * from table(dbms_xplan.display_cursor) ;
    PLAN_TABLE_OUTPUT
    SQL_ID  bnrj3cac3upfd, child number 0
    select * from t1 where owner = 'PUBLIC' and object_type = 'JAVA CLASS'
    Plan hash value: 3617692013
    | Id  | Operation         | Name | Rows  | Bytes | Cost (%CPU)| Time     |
    |   0 | SELECT STATEMENT  |      |       |       |   226 (100)|          |
    |*  1 |  TABLE ACCESS FULL| T1   |   155 | 15190 |   226   (1)| 00:00:03 |
    Predicate Information (identified by operation id):
       1 - filter(("OBJECT_TYPE"='JAVA CLASS' AND "OWNER"='PUBLIC'))
    18 rows selected.
    SQL> create index t1_idx on t1(owner, object_type) nologging ;
    Index created.
    SQL> select * from t1 where owner = 'PUBLIC' and object_type = 'JAVA CLASS' ;
    no rows selected
    SQL> select * from table(dbms_xplan.display_cursor) ;
    PLAN_TABLE_OUTPUT
    SQL_ID  bnrj3cac3upfd, child number 0
    select * from t1 where owner = 'PUBLIC' and object_type = 'JAVA CLASS'
    Plan hash value: 546753835
    | Id  | Operation                   | Name   | Rows  | Bytes | Cost (%CPU)| Time     |
    |   0 | SELECT STATEMENT            |        |       |       |    23 (100)|          |
    |   1 |  TABLE ACCESS BY INDEX ROWID| T1     |   633 | 62034 |    23   (0)| 00:00:01 |
    |*  2 |   INDEX RANGE SCAN          | T1_IDX |   633 |       |     3   (0)| 00:00:01 |
    Predicate Information (identified by operation id):
       2 - access("OWNER"='PUBLIC' AND "OBJECT_TYPE"='JAVA CLASS')
    19 rows selected.
    SQL> REM This shows that CBO decided to use the index even when there are no extended statistics
    SQL> REM Now, we will gather statistics on the table again and see what happens
    SQL> exec dbms_stats.gather_table_stats(user, 'T1', no_invalidate=>false) ;
    PL/SQL procedure successfully completed.
    SQL> select * from t1 where owner = 'PUBLIC' and object_type = 'JAVA CLASS' ;
    no rows selected
    SQL> select * from table(dbms_xplan.display_cursor) ;
    PLAN_TABLE_OUTPUT
    SQL_ID  bnrj3cac3upfd, child number 0
    select * from t1 where owner = 'PUBLIC' and object_type = 'JAVA CLASS'
    Plan hash value: 3617692013
    | Id  | Operation         | Name | Rows  | Bytes | Cost (%CPU)| Time     |
    |   0 | SELECT STATEMENT  |      |       |       |   226 (100)|          |
    |*  1 |  TABLE ACCESS FULL| T1   | 11170 |  1069K|   226   (1)| 00:00:03 |
    Predicate Information (identified by operation id):
       1 - filter(("OBJECT_TYPE"='JAVA CLASS' AND "OWNER"='PUBLIC'))
    18 rows selected.
    SQL> REM And the plan changes to Full Table scan. Why?

    user503699 wrote:
    Hemant K Chitale wrote:
    A change in statistics drives a change in expected cardinality which drives a change in plan.
    In that case, how does one explain the same execution plan but huge difference in cardinalities between first and third execution ?
    1) Oracle can sometimes use index statistics. This most likely explains the difference in cardinality estimates between the 1st and 2nd statements.
    2) You didn't specify estimate_percent, and that is not a good practice. You can get different sets of statistics from different DBMS_STATS runs even when the data hasn't changed.
    3) As already pointed out previously, histograms make the CBO behave quite differently. Most likely you have histograms present for the 3rd statement, which is quite possibly the result of not specifying estimate_percent.
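    As a rough illustration of point 1, the sketch below contrasts two ways of estimating rows for a two-column equality predicate: multiplying the individual column selectivities (the independence assumption) versus dividing the row count by the number of distinct keys in a matching composite index. The function names and all numbers are hypothetical and are not taken from the test case above; treat it as a sketch of the arithmetic, not a statement of what the CBO did here.

```python
def independent_cardinality(num_rows, selectivities):
    """Estimate assuming the predicates are independent:
    num_rows * sel(col1) * sel(col2) * ..."""
    cardinality = float(num_rows)
    for selectivity in selectivities:
        cardinality *= selectivity
    return cardinality


def distinct_keys_cardinality(num_rows, distinct_keys):
    """Estimate based on the number of distinct key combinations
    recorded for a composite index covering the predicate columns."""
    return num_rows / distinct_keys


# Hypothetical figures: 70,000 rows, 30 distinct OWNER values,
# 40 distinct OBJECT_TYPE values, but only 100 (OWNER, OBJECT_TYPE)
# combinations actually present in the index.
by_independence = independent_cardinality(70000, [1 / 30, 1 / 40])
by_index_stats = distinct_keys_cardinality(70000, 100)
```

    With these made-up figures the two approaches differ by roughly an order of magnitude, which is the kind of jump that can appear once an index (and its statistics) exists.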

  • Histogram calculation

    Hello Experts,
    I am reading a white paper about CBO statistics; the link is here: http://www.oracle.com/technetwork/database/focus-areas/bi-datawarehousing/twp-optimizer-stats-concepts-110711-1354477.
    In the section on frequency histograms, item 4 gives a formula for how the optimizer calculates the cardinality when it uses a frequency histogram:
    the Optimizer would first need to determine how many buckets in the histogram have 10 as their end point. It does this by finding the bucket whose endpoint is 10, bucket 503, and then subtracts the previous bucket number, bucket 483, 503 - 483 = 20.
    After that, the paragraph continues as follows:
    Then the cardinality estimate would be calculated using the following formula (number of bucket endpoints / total number of bucket) X NUM_ROWS, 20/503 X 503, so the number of rows in the PROMOTIONS table where PROMO_CATEGORY_ID =10 is 20.
    My question is: when the optimizer subtracts the previous bucket number from the target bucket number, the result in that example is 503 - 483 = 20. So haven't we already found the cardinality? I don't understand why the optimizer needs the following formula. Can somebody explain why?
    (number of bucket endpoints / total number of bucket) X NUM_ROWS
    At the same time, if you look at the Oracle documentation here (Histograms),
    the endpoints are shown differently. For example, in the white paper the endpoint and the value are the same, whereas in the documentation the endpoint is the bucket number. Basically, the concept of a histogram is very simple, but the documents make it confusing. Please share your thoughts.
    Thanks in advance.
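    For what it's worth, the quoted formula can be worked through as below. In the white paper's example the histogram covers all 503 rows, so the scaling step changes nothing; it only matters when the histogram was built from fewer rows than NUM_ROWS (for example from a sample). The function name and the second set of numbers are assumptions for illustration.

```python
def frequency_histogram_cardinality(endpoint, previous_endpoint,
                                    total_buckets, num_rows):
    """(number of bucket endpoints / total number of buckets) * NUM_ROWS.

    endpoint and previous_endpoint are the cumulative bucket numbers for
    the value of interest and for the value just before it.
    """
    rows_in_histogram = endpoint - previous_endpoint
    # Multiply before dividing so whole-number cases stay exact;
    # algebraically this is the same as the quoted formula.
    return rows_in_histogram * num_rows / total_buckets


# White paper example: histogram built on all 503 rows, NUM_ROWS = 503.
full = frequency_histogram_cardinality(503, 483, 503, 503)

# Hypothetical case: the same histogram shape, but the table is assumed
# to hold 5,030 rows, so each histogram row stands for ten table rows.
scaled = frequency_histogram_cardinality(503, 483, 503, 5030)
```

    In the first case the result equals the plain subtraction (20); in the second the subtraction alone would underestimate by a factor of ten.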

  • Question about histograms and indexes

    I read that if a histogram is generated for a column and that column has an index then if the where clause contains a value that has a high cardinality the CBO will skip using the index. The article was with reference to the benefits of histograms.
    My question is: Why would the CBO skip using the index? Why not use it anyways? Is it because there is a cost associated with loading and using the index itself?
    Would appreciate some clarification on this, thanks!!!

    First, the article in question doesn't say to create histograms on columns with high cardinality values. A primary key column is going to be the ultimate in high cardinality columns (each value is unique after all) but it's rarely appropriate to create a histogram on that column. It does say that histograms are generally useful when the data in a particular column is highly skewed, that is, when different values occur at wildly different rates.
    If you have a table of orders with an ORDER_STATUS column, for example, 95% of your orders may be CLOSED, 3% may be SHIPPING, 1.9% may be IN PROCESS and 0.1% may be CANCELLED. Without a histogram, Oracle would take a look at that column and see that there were 4 distinct values, so it would assume an equal distribution across the statuses. That would cause it to favor a full scan on the table even if you were looking just for the CANCELLED orders. With a histogram, the optimizer would favor an index on ORDER_STATUS for the CANCELLED query while still favoring a table scan if you're looking for CLOSED orders.
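    To put rough numbers on the ORDER_STATUS example, here is a small sketch comparing the equal-distribution estimate with a per-value (histogram-style) estimate. The table size of 1,000,000 rows and the helper names are assumptions made for illustration.

```python
# Hypothetical 1,000,000-row orders table with the skew described above.
ORDER_STATUS_COUNTS = {
    "CLOSED": 950_000,      # 95%
    "SHIPPING": 30_000,     # 3%
    "IN PROCESS": 19_000,   # 1.9%
    "CANCELLED": 1_000,     # 0.1%
}


def uniform_estimate(status_counts):
    """No histogram: every distinct value is assumed equally common,
    so the estimate is num_rows / number_of_distinct_values."""
    num_rows = sum(status_counts.values())
    return num_rows / len(status_counts)


def histogram_estimate(status_counts, value):
    """With a per-value histogram the estimate follows the real count."""
    return status_counts.get(value, 0)


# Without a histogram, every status is estimated at 250,000 rows.
estimate_without = uniform_estimate(ORDER_STATUS_COUNTS)
# With one, CANCELLED is estimated at 1,000 rows and CLOSED at 950,000.
estimate_cancelled = histogram_estimate(ORDER_STATUS_COUNTS, "CANCELLED")
estimate_closed = histogram_estimate(ORDER_STATUS_COUNTS, "CLOSED")
```

    An estimate of 250,000 rows makes an index look unattractive for CANCELLED, while 1,000 rows makes it look cheap; that difference is what drives the plan choice.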
    Gathering unnecessary histograms will make statistics collection take longer, which can cause issues with SLAs. It can also force you to gather statistics more frequently, or cause statistics to get out of date more quickly, if you have monotonically increasing values in a column. If you have a CREATE_DATE column, for example, and gather a histogram, values that are greater than the max value at the time the histogram was gathered might be incorrectly estimated to be too infrequent, which can cause problems. If Oracle thinks that 1/6 of the rows are from Jan, 1/6 from Feb, etc. through June and you start looking for values from July because you haven't gathered statistics in a month, the CBO's estimates are going to be off. Unnecessary histograms also cause Oracle to spend more time parsing queries, potentially with no better results. And they can make troubleshooting a bit more difficult because, depending on the version and various optimizer settings, there may be multiple query plans for the same statement or different query plans depending on the particular bind variable that is first passed in.
    Justin
