HASH JOIN RIGHT SEMI
Hi,
(Sorry) DB version is 11.2.0.2
I was trying to tune a sql using advisory. Present cost of the query is around 2100K. Advisory says to accept profile and the cost of the query will be around 7000.
But when sql profile is accepted the execution path of the query started doing 'HASH JOIN RIGHT SEMI' and the temp usage is so so high.
| Id | Operation | Name | Rows | Bytes |TempSpc| Cost (%CPU)| Time |
| 0 | INSERT STATEMENT | | | | 1219G| 7142 (100)| |
| 1 | LOAD TABLE CONVENTIONAL | | | | 1219G| | |Because the temp usage is so high the temp tablespace keep growing and growing. How do i optimize this query?
thanks,
baskar.l
Edited by: baskar.l on Feb 26, 2012 10:05 PM
Hi Baskar,
cost of a SQL statement is not a very informative quantity. It's an estimate of how much resources would be consumed during the query's execution, not the actual figure.
You should look at actual rowsource statistics, not at optimizer estimates. Use gather_plan_statistics hint and then display using
select * from table(dbms_xplan.display_cursor(<sql_id>, null, 'allstats last'));and post results here.
Best regards,
Nikolay
Similar Messages
-
Performance- Hash Joins ?
Hi everyone,
Version :- 10.1.0.5.0
Below is the query and it takes long time to execute due to hash joins. Is there ant other way to change change the join approach or rewrite the SQL.
Tables used
rhs_premium :- 830 GB with partitons.
bu_res_line_map :- small table 4 MB
mgt_unit_res_line_map :- small table with 4 MB
mgt_unit_reserving_lines :- View
bu_reserving_lines :- View
SELECT prem.business_unit,
prem.direct_assumed_ceded_code,
prem.ledger_currency_code,
prem.mcc_code,
prem.management_unit,
prem.product,
prem.financial_product,
nvl(prem.premium_in_ledger_currency,0) premium_in_ledger_currency, -- NVL added Aug 21, 2008
nvl(prem.premium_in_original_currency,0) premium_in_original_currency, -- Added Aug 21, 2008 M4640
nvl(prem.premium_in_us_dollars,0) premium_in_us_dollars, -- NVL added Aug 21, 2008
prem.rcc,
prem.pre_explosion_rcc, -- Issue M1997 Laks 09/15
prem.acc_code,
prem.transaction_code,
prem.accounting_period,
prem.exposure_period,
prem.policy_effective_month,
prem.policy_key,
prem.policy_line_of_business,
prem.producer_code,
prem.underwriting_year,
prem.ledger_key,
prem.account_key,
prem.policy_coverage_key,
nvl(prem.original_currency_code,' ') original_currency_code, -- NVL added Issue M4640 Aug 21, 2008
prem.authorized_unauthorized,
prem.transaction_effective_date,
prem.transaction_expiration_date,
prem.exposure_year,
prem.reinsurance_mask,
prem.rcccategory,
prem.rccclassification,
prem.detail_level_indicator,
prem.bermbase_level_indicator,
prem.padb_level_indicator,
prem.sas_level_indicator, -- Issue 443
prem.cnv_adjust_indicator, -- Issue 413
prem.originating_business_unit,
prem.actuarial_blend_indicator,
ml1.management_line_of_business_id management_lob_id_act,
mu1.reserving_line_id mgt_unit_reserving_lob_id_act,
bu1.reserving_line_id business_unit_lob_id_act,
ml1.reserving_class_id mgt_unit_res_class_lob_id_act, -- added A12
bl1.reserving_class_id business_unit_class_lob_id_act,-- added A12
ml2.management_line_of_business_id management_lob_id_fin,
mu2.reserving_line_id mgt_unit_reserving_lob_id_fin,
bu2.reserving_line_id business_unit_lob_id_fin,
ml2.reserving_class_id mgt_unit_res_class_lob_id_fin, -- added A12
bl2.reserving_class_id business_unit_class_lob_id_fin -- added A12
FROM rhs_premium prem,
bu_res_line_map bu1,
mgt_unit_res_line_map mu1,
mgt_unit_reserving_lines ml1,
bu_reserving_lines bl1, -- added A12
bu_res_line_map bu2,
mgt_unit_res_line_map mu2,
mgt_unit_reserving_lines ml2,
bu_reserving_lines bl2 -- added A12
WHERE ( prem.business_unit = bu1.business_unit (+)
AND prem.product = bu1.product (+)
AND ( prem.management_unit = mu1.management_unit (+)
AND prem.product = mu1.product(+)
AND mu1.reserving_line_id = ml1.reserving_line_id (+)
AND bu1.reserving_line_id = bl1.reserving_line_id (+) -- added A12
AND ( prem.business_unit = bu2.business_unit (+)
AND prem.financial_product = bu2.product (+)
AND ( prem.management_unit = mu2.management_unit (+)
AND prem.financial_product = mu2.product(+)
AND mu2.reserving_line_id = ml2.reserving_line_id (+)
AND bu2.reserving_line_id = bl2.reserving_line_id (+);
PLAN_TABLE_OUTPUT
Plan hash value: 3475794837
| Id | Operation | Name | Rows | Bytes | Cost (%CPU)| Time | Pstart| Pstop | TQ |IN-OUT| PQ Distrib |
| 0 | SELECT STATEMENT | | 346M| 124G| 416K (2)| 01:37:13 | | | | | |
| 1 | PX COORDINATOR | | | | | | | | | | |
| 2 | PX SEND QC (RANDOM) | :TQ10008 | 346M| 124G| 416K (2)| 01:37:13 | | | Q1,08 | P->S | QC (RAND) |
|* 3 | HASH JOIN RIGHT OUTER | | 346M| 124G| 416K (2)| 01:37:13 | | | Q1,08 | PCWP | |
| 4 | BUFFER SORT | | | | | | | | Q1,08 | PCWC | |
| 5 | PX RECEIVE | | 46 | 1196 | 7 (0)| 00:00:01 | | | Q1,08 | PCWP | |
| 6 | PX SEND BROADCAST | :TQ10000 | 46 | 1196 | 7 (0)| 00:00:01 | | | | S->P | BROADCAST |
| 7 | VIEW | BU_RESERVING_LINES | 46 | 1196 | 7 (0)| 00:00:01 | | | | | |
| 8 | NESTED LOOPS OUTER | | 46 | 1748 | 7 (0)| 00:00:01 | | | | | |
|* 9 | TABLE ACCESS FULL | RESERVING_LINES | 46 | 1564 | 7 (0)| 00:00:01 | | | | | |
|* 10 | INDEX UNIQUE SCAN | RESERVING_CLASSES_PK | 1 | 4 | 0 (0)| 00:00:01 | | | | | |
|* 11 | HASH JOIN RIGHT OUTER | | 346M| 116G| 415K (2)| 01:37:04 | | | Q1,08 | PCWP | |
| 12 | BUFFER SORT | | | | | | | | Q1,08 | PCWC | |
| 13 | PX RECEIVE | | 124K| 2182K| 87 (2)| 00:00:02 | | | Q1,08 | PCWP | |
| 14 | PX SEND BROADCAST | :TQ10001 | 124K| 2182K| 87 (2)| 00:00:02 | | | | S->P | BROADCAST |
| 15 | TABLE ACCESS FULL | BU_RES_LINE_MAP | 124K| 2182K| 87 (2)| 00:00:02 | | | | | |
|* 16 | HASH JOIN RIGHT OUTER | | 346M| 110G| 415K (2)| 01:36:54 | | | Q1,08 | PCWP | |
| 17 | BUFFER SORT | | | | | | | | Q1,08 | PCWC | |
| 18 | PX RECEIVE | | 46 | 1196 | 7 (0)| 00:00:01 | | | Q1,08 | PCWP | |
| 19 | PX SEND BROADCAST | :TQ10002 | 46 | 1196 | 7 (0)| 00:00:01 | | | | S->P | BROADCAST |
| 20 | VIEW | BU_RESERVING_LINES | 46 | 1196 | 7 (0)| 00:00:01 | | | | | |
| 21 | NESTED LOOPS OUTER | | 46 | 1748 | 7 (0)| 00:00:01 | | | | | |
|* 22 | TABLE ACCESS FULL | RESERVING_LINES | 46 | 1564 | 7 (0)| 00:00:01 | | | | | |
|* 23 | INDEX UNIQUE SCAN | RESERVING_CLASSES_PK | 1 | 4 | 0 (0)| 00:00:01 | | | | | |
|* 24 | HASH JOIN RIGHT OUTER | | 346M| 101G| 414K (1)| 01:36:45 | | | Q1,08 | PCWP | |
| 25 | BUFFER SORT | | | | | | | | Q1,08 | PCWC | |
| 26 | PX RECEIVE | | 124K| 2182K| 87 (2)| 00:00:02 | | | Q1,08 | PCWP | |
| 27 | PX SEND BROADCAST | :TQ10003 | 124K| 2182K| 87 (2)| 00:00:02 | | | | S->P | BROADCAST |
| 28 | TABLE ACCESS FULL | BU_RES_LINE_MAP | 124K| 2182K| 87 (2)| 00:00:02 | | | | | |
|* 29 | HASH JOIN RIGHT OUTER | | 346M| 96G| 413K (1)| 01:36:35 | | | Q1,08 | PCWP | |
| 30 | BUFFER SORT | | | | | | | | Q1,08 | PCWC | |
| 31 | PX RECEIVE | | 46 | 1794 | 11 (10)| 00:00:01 | | | Q1,08 | PCWP | |
| 32 | PX SEND BROADCAST | :TQ10004 | 46 | 1794 | 11 (10)| 00:00:01 | | | | S->P | BROADCAST |
| 33 | VIEW | MGT_UNIT_RESERVING_LINES | 46 | 1794 | 11 (10)| 00:00:01 | | | | | |
|* 34 | HASH JOIN OUTER | | 46 | 1886 | 11 (10)| 00:00:01 | | | | | |
|* 35 | TABLE ACCESS FULL | RESERVING_LINES | 46 | 1564 | 7 (0)| 00:00:01 | | | | | |
| 36 | TABLE ACCESS FULL | RESERVING_CLASSES | 107 | 749 | 3 (0)| 00:00:01 | | | | | |
|* 37 | HASH JOIN RIGHT OUTER | | 346M| 83G| 413K (1)| 01:36:26 | | | Q1,08 | PCWP | |
| 38 | BUFFER SORT | | | | | | | | Q1,08 | PCWC | |
| 39 | PX RECEIVE | | 93763 | 1648K| 59 (0)| 00:00:01 | | | Q1,08 | PCWP | |
| 40 | PX SEND BROADCAST | :TQ10005 | 93763 | 1648K| 59 (0)| 00:00:01 | | | | S->P | BROADCAST |
| 41 | INDEX FAST FULL SCAN | MGT_UNIT_RES_LINE_MAP_UK | 93763 | 1648K| 59 (0)| 00:00:01 | | | | | |
|* 42 | HASH JOIN RIGHT OUTER | | 346M| 77G| 412K (1)| 01:36:17 | | | Q1,08 | PCWP | |
| 43 | BUFFER SORT | | | | | | | | Q1,08 | PCWC | |
| 44 | PX RECEIVE | | 46 | 1794 | 11 (10)| 00:00:01 | | | Q1,08 | PCWP | |
| 45 | PX SEND BROADCAST | :TQ10006 | 46 | 1794 | 11 (10)| 00:00:01 | | | | S->P | BROADCAST |
| 46 | VIEW | MGT_UNIT_RESERVING_LINES | 46 | 1794 | 11 (10)| 00:00:01 | | | | | |
|* 47 | HASH JOIN OUTER | | 46 | 1886 | 11 (10)| 00:00:01 | | | | | |
|* 48 | TABLE ACCESS FULL | RESERVING_LINES | 46 | 1564 | 7 (0)| 00:00:01 | | | | | |
| 49 | TABLE ACCESS FULL | RESERVING_CLASSES | 107 | 749 | 3 (0)| 00:00:01 | | | | | |
|* 50 | HASH JOIN RIGHT OUTER | | 346M| 65G| 411K (1)| 01:36:08 | | | Q1,08 | PCWP | |
| 51 | BUFFER SORT | | | | | | | | Q1,08 | PCWC | |
| 52 | PX RECEIVE | | 93763 | 1648K| 59 (0)| 00:00:01 | | | Q1,08 | PCWP | |
| 53 | PX SEND BROADCAST | :TQ10007 | 93763 | 1648K| 59 (0)| 00:00:01 | | | | S->P | BROADCAST |
| 54 | INDEX FAST FULL SCAN| MGT_UNIT_RES_LINE_MAP_UK | 93763 | 1648K| 59 (0)| 00:00:01 | | | | | |
| 55 | PX BLOCK ITERATOR | | 346M| 59G| 411K (1)| 01:35:58 | 1 | 385 | Q1,08 | PCWC | |
| 56 | TABLE ACCESS FULL | RHS_PREMIUM_1 | 346M| 59G| 411K (1)| 01:35:58 | 1 | 385 | Q1,08 | PCWP | |
Predicate Information (identified by operation id):
3 - access("BU2"."RESERVING_LINE_ID"="BL2"."RESERVING_LINE_ID"(+))
9 - filter("RL"."RESERVING_LINE_TYPE"='BU')
10 - access("RL"."RESERVING_CLASS_ID"="RC"."RESERVING_CLASS_ID"(+))
11 - access("PREM"."BUSINESS_UNIT"="BU2"."BUSINESS_UNIT"(+) AND "PREM"."FINANCIAL_PRODUCT"="BU2"."PRODUCT"(+))
16 - access("BU1"."RESERVING_LINE_ID"="BL1"."RESERVING_LINE_ID"(+))
22 - filter("RL"."RESERVING_LINE_TYPE"='BU')
23 - access("RL"."RESERVING_CLASS_ID"="RC"."RESERVING_CLASS_ID"(+))
24 - access("PREM"."BUSINESS_UNIT"="BU1"."BUSINESS_UNIT"(+) AND "PREM"."PRODUCT"="BU1"."PRODUCT"(+))
29 - access("MU2"."RESERVING_LINE_ID"="ML2"."RESERVING_LINE_ID"(+))
34 - access("RL"."RESERVING_CLASS_ID"="RC"."RESERVING_CLASS_ID"(+))
35 - filter("RL"."RESERVING_LINE_TYPE"='MCC')
37 - access("PREM"."MANAGEMENT_UNIT"="MU2"."MANAGEMENT_UNIT"(+) AND "PREM"."FINANCIAL_PRODUCT"="MU2"."PRODUCT"(+))
42 - access("MU1"."RESERVING_LINE_ID"="ML1"."RESERVING_LINE_ID"(+))
47 - access("RL"."RESERVING_CLASS_ID"="RC"."RESERVING_CLASS_ID"(+))
48 - filter("RL"."RESERVING_LINE_TYPE"='MCC')
50 - access("PREM"."MANAGEMENT_UNIT"="MU1"."MANAGEMENT_UNIT"(+) AND "PREM"."PRODUCT"="MU1"."PRODUCT"(+))Please let me know if there is any possibilty.
Regards,
SandyBig query - 9 tables, outer joins, parallel query option. Reading/joining views - steps 7, 20, etc?
if you are reading most of the rows in both tables then hash joins are probably the most efficient way to read the data. The parallel query option may or may not be helping.
Outer joins usually make queries slower. Views can also have th same effect, and will probably affect the execution plan.
If you are reading views you can eliminate them by using the base tables to see how that affects performance. You can change the session parameter for DB_FILE_MULTIBLOCK_READ COUNT for more efficient full table scans if its set (the documentation and blog articles claim leaving it unset in the database is the most efficient use).
Edited by: riedelme on Jun 6, 2011 8:26 AM -
Setting PGA_AGGREGATE_TARGET to resolve Hash Join issues?
I'm running some fairly big queries against our data warehouse, 10s millions of rows in some of the tables we are joining. The response times we are getting vary but its not unusual for these to run for many hours (8 and 12 for a couple of the more recent ones). Looking at the long ops I see that it can take 30000 seconds to do a hash join. Full Table Scans are reasonably fast, the cost of the query is pretty good but the hash/sort/merge operations kill the queries every time.
As a developer with access to Google I am obviously well qualified to be telling our DBAs what they need to do to improve the performance of my query - and I stumbled upon the PGA_AGGREGATE_TARGET parameter. Its pretty low in my current environment, 50Mb until I go them to up it to 200Mb (they wouldn't go bigger). But from what I've been reading I think this needs to be alot bigger, the sizes of some of my tables are huge (and often I'll be joining entire datasets and grouping/aggregate them).
But am I looking at the wrong thing, some comments I've read suggest its limited to 200Mb anyway (a hidden parameter?) and that we'd be better off setting manual AREASIZE parameters, is this the case?
Any advice (knowing that I've not supplied explain plans, real sizings etc), but a point in the right direction would be greatly appreciated.
Cheers
RichardHi Richard,
Could you give some more details related to your problem:
Which DB version is exactly used ?
Which OS is installed on the DB Server?
Is it a 32 bit or 64 bit OS?
How large is your SGA?
How much free memory (RAM) do you have on your server?
Is the query executed in parallel?
But even without these information I can clearly say that 50MB is far too small.
By default I would recommend, for a DWH, to set the PGA twice as large as the SGA.
You may also have a look at the following documentation:
http://download-west.oracle.com/docs/cd/B10501_01/server.920/a96533/memory.htm#49321
http://www.pythian.com/documents/Working_with_Automatic_PGA.ppt
http://www.miracleltd.com/msdbf2007/js.ppt
http://www.vldb.org/conf/2002/S29P03.pdf
http://www.sloug.org/presentations/If_Memory_Richmond_Shee.ppt
Regards
Maurice -
Hi
I would like to know if there is difference between bind join end hash join.
For example If I write sql code in My query tool (in Data Federator Query server administrator XI 3.0) it is traduced in a hashjoin(....);If I set system parameters in right way to produce bind join, hashjoin(...) is present again!
how do i understand the difference?
byeQuery to track the holder and waiter info
People who reach this place for a similar problem can use the above link to find their answer
Regards,
Vishal -
Performance Tuning - remove hash join
Hi Every one,
Can some one help in tuning below query, i have hash join taking around 84%,
SELECT
PlanId
,ReplacementPlanId
FROM
( SELECT
pl.PlanId
,xpl.PlanId ReplacementPlanId
,ROW_NUMBER() OVER(PARTITION BY pl.PlanId ORDER BY xpl.PlanId) RN
FROM [dbo].[Plan] pl
JOIN [dbo].[Plan] xpl
ON pl.PPlanId = xpl.PPlanId
AND pl.Name = xpl.Name
WHERE
pl.SDate > (CONVERT(CHAR(10),DATEADD(M,-12,GETDATE()),120)) AND
xpl.Live = 1
AND pl.TypeId = 7
AND xpl.TypeId = 7
) p
WHERE RN = 1
Thanks in advanceCan you show an execution plan of the query? Is that possible to rewrite the query as
Sorry cannot test it right now
SELECT
PlanId
,ReplacementPlanId
FROM
(SELECT PlanId,Name,SDate,TypeId FROM [dbo].[Plan]) pl
CROSS APPLY
SELECT TOP (1) TypeId ,Live FROM [Plan] xpl
WHERE pl.PPlanId = xpl.PPlanId
AND pl.Name = xpl.Name
) AS Der
WHERE pl.SDate > (CONVERT(CHAR(10),DATEADD(M,-12,GETDATE()),120)) AND
Der.Live = 1
AND pl.TypeId = 7
AND Der.TypeId = 7
Best Regards,Uri Dimant SQL Server MVP,
http://sqlblog.com/blogs/uri_dimant/
MS SQL optimization: MS SQL Development and Optimization
MS SQL Consulting:
Large scale of database and data cleansing
Remote DBA Services:
Improves MS SQL Database Performance
SQL Server Integration Services:
Business Intelligence -
Hi
I would like to know if there is difference between bind join end hash join.
For example If I write sql code in My query tool (in Data Federator Query server administrator XI 3.0) it is traduced in a hashjoin(....);If I set system parameters in right way to produce bind join, hashjoin(...) is present again!
how do i understand the difference?
byeHi,
this is the WebIntelligence forum, but the tool you are describing is Data Federator (Enterprise Information Management)
You will have more luck over here: Data Services and Data Quality
Regards,
H -
Query Degradation--Hash Join Degraded
Hi All,
I found one query degradation issue.I am on 10.2.0.3.0 (Sun OS) with optimizer_mode=ALL_ROWS.
This is a dataware house db.
All 3 tables involved are parition tables (with daily partitions).Partitions are created in advance and ELT jobs loads bulk data into daily partitions.
I have checked that CBO is not using local indexes-created on them which i believe,is appropriate because when i used INDEX HINT, elapsed time increses.
I checked giving index hint for all tables one by one but dint get any performance improvement.
Partitions are daily loaded and after loading,partition-level stats are gathered with dbms_stats.
We are collecting stats at partition level(granularity=>'PARTITION').Even after collecting global stats,there is no change in access pattern.Stats gather command is given below.
PROCEDURE gather_table_part_stats(i_owner_name,i_table_name,i_part_name,i_estimate:= DBMS_STATS.AUTO_SAMPLE_SIZE, i_invalidate IN VARCHAR2 := 'Y',i_debug:= 'N')
Only SOT_KEYMAP.IPK_SOT_KEYMAP is GLOBAL.Rest all indexes are LOCAL.
Earlier,we were having BIND PEEKING issue,which i fixed but introducing NO_INVALIDATE=>FALSE in stats gather job.
Here,Partition_name (20090219) is being passed through bind variables.
SELECT a.sotrelstg_sot_ud sotcrct_sot_ud,
b.sotkey_ud sotcrct_orig_sot_ud, a.ROWID stage_rowid
FROM (SELECT sotrelstg_sot_ud, sotrelstg_sys_ud,
sotrelstg_orig_sys_ord_id, sotrelstg_orig_sys_ord_vseq
FROM sot_rel_stage
WHERE sotrelstg_trd_date_ymd_part = '20090219'
AND sotrelstg_crct_proc_stat_cd = 'N'
AND sotrelstg_sot_ud NOT IN(
SELECT sotcrct_sot_ud
FROM sot_correct
WHERE sotcrct_trd_date_ymd_part ='20090219')) a,
(SELECT MAX(sotkey_ud) sotkey_ud, sotkey_sys_ud,
sotkey_sys_ord_id, sotkey_sys_ord_vseq,
sotkey_trd_date_ymd_part
FROM sot_keymap
WHERE sotkey_trd_date_ymd_part = '20090219'
AND sotkey_iud_cd = 'I'
--not to select logical deleted rows
GROUP BY sotkey_trd_date_ymd_part,
sotkey_sys_ud,
sotkey_sys_ord_id,
sotkey_sys_ord_vseq) b
WHERE a.sotrelstg_sys_ud = b.sotkey_sys_ud
AND a.sotrelstg_orig_sys_ord_id = b.sotkey_sys_ord_id
AND NVL(a.sotrelstg_orig_sys_ord_vseq, 1) = NVL(b.sotkey_sys_ord_vseq, 1);
During normal business hr, i found that query takes 5-7 min(which is also not acceptable), but during high load business hr,it is taking 30-50 min.
I found that most of the time it is spending on HASH JOIN (direct path write temp).We have sufficient RAM (64 GB total/41 GB available).
Below is the execution plan i got during normal business hr.
| Id | Operation | Name | Starts | E-Rows | A-Rows | A-Time | Buffers | Reads | Writes | OMem | 1Mem | Used-Mem | Used-Tmp|
| 1 | HASH GROUP BY | | 1 | 1 | 7844K|00:05:28.78 | 16M| 217K| 35969 | | | | |
|* 2 | HASH JOIN | | 1 | 1 | 9977K|00:04:34.02 | 16M| 202K| 20779 | 580M| 10M| 563M (1)| 650K|
| 3 | NESTED LOOPS ANTI | | 1 | 6 | 7855K|00:01:26.41 | 16M| 1149 | 0 | | | | |
| 4 | PARTITION RANGE SINGLE| | 1 | 258K| 8183K|00:00:16.37 | 25576 | 1149 | 0 | | | | |
|* 5 | TABLE ACCESS FULL | SOT_REL_STAGE | 1 | 258K| 8183K|00:00:16.37 | 25576 | 1149 | 0 | | | | |
| 6 | PARTITION RANGE SINGLE| | 8183K| 326K| 327K|00:01:10.53 | 16M| 0 | 0 | | | | |
|* 7 | INDEX RANGE SCAN | IDXL_SOTCRCT_SOT_UD | 8183K| 326K| 327K|00:00:53.37 | 16M| 0 | 0 | | | | |
| 8 | PARTITION RANGE SINGLE | | 1 | 846K| 14M|00:02:06.36 | 289K| 180K| 0 | | | | |
|* 9 | TABLE ACCESS FULL | SOT_KEYMAP | 1 | 846K| 14M|00:01:52.32 | 289K| 180K| 0 | | | | |
I will attached the same for high load business hr once query gives results.It is still executing for last 50 mins.
INDEX STATS (INDEXES ARE LOCAL INDEXES)
TABLE_NAME INDEX_NAME COLUMN_NAME COLUMN_POSITION NUM_ROWS DISTINCT_KEYS CLUSTERING_FACTOR
SOT_REL_STAGE IDXL_SOTRELSTG_SOT_UD SOTRELSTG_SOT_UD 1 25461560 25461560 184180
SOT_REL_STAGE SOTRELSTG_TRD_DATE 2 25461560 25461560 184180
_YMD_PART
TABLE_NAME INDEX_NAME COLUMN_NAME COLUMN_POSITION NUM_ROWS DISTINCT_KEYS CLUSTERING_FACTOR
SOT_KEYMAP IDXL_SOTKEY_ENTORDSYS_UD SOTKEY_ENTRY_ORD_S 1 1012306940 3 38308680
YS_UD
SOT_KEYMAP IDXL_SOTKEY_HASH SOTKEY_HASH 1 1049582320 1049582320 1049579520
SOT_KEYMAP SOTKEY_TRD_DATE_YM 2 1049582320 1049582320 1049579520
D_PART
SOT_KEYMAP IDXL_SOTKEY_SOM_ORD SOTKEY_SOM_UD 1 1023998560 268949136 559414840
SOT_KEYMAP SOTKEY_SYS_ORD_ID 2 1023998560 268949136 559414840
SOT_KEYMAP IPK_SOT_KEYMAP SOTKEY_UD 1 1030369480 1015378900 24226580
TABLE_NAME INDEX_NAME COLUMN_NAME COLUMN_POSITION NUM_ROWS DISTINCT_KEYS CLUSTERING_FACTOR
SOT_CORRECT IDXL_SOTCRCT_SOT_UD SOTCRCT_SOT_UD 1 412484756 412484756 411710982
SOT_CORRECT SOTCRCT_TRD_DATE_Y 2 412484756 412484756 411710982
MD_PART
INDEX partiton stas (from dba_ind_partitions)
INDEX_NAME PARTITION_NAME STATUS BLEVEL LEAF_BLOCKS DISTINCT_KEYS CLUSTERING_FACTOR NUM_ROWS SAMPLE_SIZE LAST_ANALYZ GLO
IDXL_SOTCRCT_SOT_UD P20090219 USABLE 1 372 327879 216663 327879 327879 20-Feb-2009 YES
IDXL_SOTKEY_ENTORDSYS_UD P20090219 USABLE 2 2910 3 36618 856229 856229 19-Feb-2009 YES
IDXL_SOTKEY_HASH P20090219 USABLE 2 7783 853956 853914 853956 119705 19-Feb-2009 YES
IDXL_SOTKEY_SOM_ORD P20090219 USABLE 2 6411 531492 157147 799758 132610 19-Feb-2009 YES
IDXL_SOTRELSTG_SOT_UD P20090219 USABLE 2 13897 9682052 45867 9682052 794958 20-Feb-2009 YESThanks in advance.
Bhavik DesaiHi Randolf,
Thanks for the time you spent on this issue.I appreciate it.
Please see my comments below:
1. You've mentioned several times that you're passing the partition name as bind variable, but you're obviously testing the statement with literals rather than bind
variables. So your tests obviously don't reflect what is going to happen in case of the actual execution. The cardinality estimates are potentially quite different when
using bind variables for the partition key.
Yes.I intentionaly used literals in my tests.I found couple of times that plan used by the application and plan generated by AUTOTRACE+EXPLAIN PLAN command...is same and
caused hrly elapsed time.
As i pointed out earlier,last month we solved couple of bind peeking issue by intproducing NO_VALIDATE=>FALSE in stats gather procedure,which we execute just after data
load into such daily partitions and before start of jobs which executes this query.
Execution plans From AWR (with parallelism on at table level DEGREE>1)-->This plan is one which CBO has used when degradation occured.This plan is used most of the times.
ELAPSED_TIME_DELTA BUFFER_GETS_DELTA DISK_READS_DELTA CURSOR(SELECT*FROMTA
1918506000 46154275 918 CURSOR STATEMENT : 4
CURSOR STATEMENT : 4
PLAN_TABLE_OUTPUT
SQL_ID 39708a3azmks7
SELECT A.SOTRELSTG_SOT_UD SOTCRCT_SOT_UD, B.SOTKEY_UD SOTCRCT_ORIG_SOT_UD, A.ROWID STAGE_ROWID FROM (SELECT SOTRELSTG_SOT_UD,
SOTRELSTG_SYS_UD, SOTRELSTG_ORIG_SYS_ORD_ID, SOTRELSTG_ORIG_SYS_ORD_VSEQ FROM SOT_REL_STAGE WHERE SOTRELSTG_TRD_DATE_YMD_PART = :B1 AND
SOTRELSTG_CRCT_PROC_STAT_CD = 'N' AND SOTRELSTG_SOT_UD NOT IN( SELECT SOTCRCT_SOT_UD FROM SOT_CORRECT WHERE SOTCRCT_TRD_DATE_YMD_PART =
:B1 )) A, (SELECT MAX(SOTKEY_UD) SOTKEY_UD, SOTKEY_SYS_UD, SOTKEY_SYS_ORD_ID, SOTKEY_SYS_ORD_VSEQ, SOTKEY_TRD_DATE_YMD_PART FROM
SOT_KEYMAP WHERE SOTKEY_TRD_DATE_YMD_PART = :B1 AND SOTKEY_IUD_CD = 'I' GROUP BY SOTKEY_TRD_DATE_YMD_PART, SOTKEY_SYS_UD,
SOTKEY_SYS_ORD_ID, SOTKEY_SYS_ORD_VSEQ) B WHERE A.SOTRELSTG_SYS_UD = B.SOTKEY_SYS_UD AND A.SOTRELSTG_ORIG_SYS_ORD_ID =
B.SOTKEY_SYS_ORD_ID AND NVL(A.SOTRELSTG_ORIG_SYS_ORD_VSEQ, 1) = NVL(B.SOTKEY_SYS_ORD_VSEQ, 1)
Plan hash value: 1213870831
| Id | Operation | Name | Rows | Bytes | Cost (%CPU)| Time | Pstart| Pstop | TQ |IN-OUT| PQ Distrib |
| 0 | SELECT STATEMENT | | | | 19655 (100)| | | | | | |
| 1 | PX COORDINATOR | | | | | | | | | | |
| 2 | PX SEND QC (RANDOM) | :TQ10003 | 1 | 116 | 19655 (1)| 00:05:54 | | | Q1,03 | P->S | QC (RAND) |
| 3 | HASH GROUP BY | | 1 | 116 | 19655 (1)| 00:05:54 | | | Q1,03 | PCWP | |
| 4 | PX RECEIVE | | 1 | 116 | 19655 (1)| 00:05:54 | | | Q1,03 | PCWP | |
| 5 | PX SEND HASH | :TQ10002 | 1 | 116 | 19655 (1)| 00:05:54 | | | Q1,02 | P->P | HASH |
| 6 | HASH GROUP BY | | 1 | 116 | 19655 (1)| 00:05:54 | | | Q1,02 | PCWP | |
| 7 | NESTED LOOPS ANTI | | 1 | 116 | 19654 (1)| 00:05:54 | | | Q1,02 | PCWP | |
| 8 | HASH JOIN | | 1 | 102 | 19654 (1)| 00:05:54 | | | Q1,02 | PCWP | |
| 9 | PX JOIN FILTER CREATE| :BF0000 | 13M| 664M| 2427 (3)| 00:00:44 | | | Q1,02 | PCWP | |
| 10 | PX RECEIVE | | 13M| 664M| 2427 (3)| 00:00:44 | | | Q1,02 | PCWP | |
| 11 | PX SEND HASH | :TQ10000 | 13M| 664M| 2427 (3)| 00:00:44 | | | Q1,00 | P->P | HASH |
| 12 | PX BLOCK ITERATOR | | 13M| 664M| 2427 (3)| 00:00:44 | KEY | KEY | Q1,00 | PCWC | |
| 13 | TABLE ACCESS FULL| SOT_REL_STAGE | 13M| 664M| 2427 (3)| 00:00:44 | KEY | KEY | Q1,00 | PCWP | |
| 14 | PX RECEIVE | | 27M| 1270M| 17209 (1)| 00:05:10 | | | Q1,02 | PCWP | |
| 15 | PX SEND HASH | :TQ10001 | 27M| 1270M| 17209 (1)| 00:05:10 | | | Q1,01 | P->P | HASH |
| 16 | PX JOIN FILTER USE | :BF0000 | 27M| 1270M| 17209 (1)| 00:05:10 | | | Q1,01 | PCWP | |
| 17 | PX BLOCK ITERATOR | | 27M| 1270M| 17209 (1)| 00:05:10 | KEY | KEY | Q1,01 | PCWC | |
| 18 | TABLE ACCESS FULL| SOT_KEYMAP | 27M| 1270M| 17209 (1)| 00:05:10 | KEY | KEY | Q1,01 | PCWP | |
| 19 | PARTITION RANGE SINGLE| | 16185 | 221K| 0 (0)| | KEY | KEY | Q1,02 | PCWP | |
| 20 | INDEX RANGE SCAN | IDXL_SOTCRCT_SOT_UD | 16185 | 221K| 0 (0)| | KEY | KEY | Q1,02 | PCWP | |
Other Execution plan from AWR
ELAPSED_TIME_DELTA BUFFER_GETS_DELTA DISK_READS_DELTA CURSOR(SELECT*FROMTA
1053251381 0 2925 CURSOR STATEMENT : 4
CURSOR STATEMENT : 4
PLAN_TABLE_OUTPUT
SQL_ID 39708a3azmks7
SELECT A.SOTRELSTG_SOT_UD SOTCRCT_SOT_UD, B.SOTKEY_UD SOTCRCT_ORIG_SOT_UD, A.ROWID STAGE_ROWID FROM (SELECT SOTRELSTG_SOT_UD,
SOTRELSTG_SYS_UD, SOTRELSTG_ORIG_SYS_ORD_ID, SOTRELSTG_ORIG_SYS_ORD_VSEQ FROM SOT_REL_STAGE WHERE SOTRELSTG_TRD_DATE_YMD_PART = :B1 AND
SOTRELSTG_CRCT_PROC_STAT_CD = 'N' AND SOTRELSTG_SOT_UD NOT IN( SELECT SOTCRCT_SOT_UD FROM SOT_CORRECT WHERE SOTCRCT_TRD_DATE_YMD_PART =
:B1 )) A, (SELECT MAX(SOTKEY_UD) SOTKEY_UD, SOTKEY_SYS_UD, SOTKEY_SYS_ORD_ID, SOTKEY_SYS_ORD_VSEQ, SOTKEY_TRD_DATE_YMD_PART FROM
SOT_KEYMAP WHERE SOTKEY_TRD_DATE_YMD_PART = :B1 AND SOTKEY_IUD_CD = 'I' GROUP BY SOTKEY_TRD_DATE_YMD_PART, SOTKEY_SYS_UD,
SOTKEY_SYS_ORD_ID, SOTKEY_SYS_ORD_VSEQ) B WHERE A.SOTRELSTG_SYS_UD = B.SOTKEY_SYS_UD AND A.SOTRELSTG_ORIG_SYS_ORD_ID =
B.SOTKEY_SYS_ORD_ID AND NVL(A.SOTRELSTG_ORIG_SYS_ORD_VSEQ, 1) = NVL(B.SOTKEY_SYS_ORD_VSEQ, 1)
Plan hash value: 3434900850
| Id | Operation | Name | Rows | Bytes | Cost (%CPU)| Time | Pstart| Pstop | TQ |IN-OUT| PQ Distrib |
| 0 | SELECT STATEMENT | | | | 1830 (100)| | | | | | |
| 1 | PX COORDINATOR | | | | | | | | | | |
| 2 | PX SEND QC (RANDOM) | :TQ10003 | 1 | 131 | 1830 (2)| 00:00:33 | | | Q1,03 | P->S | QC (RAND) |
| 3 | HASH GROUP BY | | 1 | 131 | 1830 (2)| 00:00:33 | | | Q1,03 | PCWP | |
| 4 | PX RECEIVE | | 1 | 131 | 1830 (2)| 00:00:33 | | | Q1,03 | PCWP | |
| 5 | PX SEND HASH | :TQ10002 | 1 | 131 | 1830 (2)| 00:00:33 | | | Q1,02 | P->P | HASH |
| 6 | HASH GROUP BY | | 1 | 131 | 1830 (2)| 00:00:33 | | | Q1,02 | PCWP | |
| 7 | NESTED LOOPS ANTI | | 1 | 131 | 1829 (2)| 00:00:33 | | | Q1,02 | PCWP | |
| 8 | HASH JOIN | | 1 | 117 | 1829 (2)| 00:00:33 | | | Q1,02 | PCWP | |
| 9 | PX JOIN FILTER CREATE| :BF0000 | 1010K| 50M| 694 (1)| 00:00:13 | | | Q1,02 | PCWP | |
| 10 | PX RECEIVE | | 1010K| 50M| 694 (1)| 00:00:13 | | | Q1,02 | PCWP | |
| 11 | PX SEND HASH | :TQ10000 | 1010K| 50M| 694 (1)| 00:00:13 | | | Q1,00 | P->P | HASH |
| 12 | PX BLOCK ITERATOR | | 1010K| 50M| 694 (1)| 00:00:13 | KEY | KEY | Q1,00 | PCWC | |
| 13 | TABLE ACCESS FULL| SOT_KEYMAP | 1010K| 50M| 694 (1)| 00:00:13 | KEY | KEY | Q1,00 | PCWP | |
| 14 | PX RECEIVE | | 11M| 688M| 1129 (3)| 00:00:21 | | | Q1,02 | PCWP | |
| 15 | PX SEND HASH | :TQ10001 | 11M| 688M| 1129 (3)| 00:00:21 | | | Q1,01 | P->P | HASH |
| 16 | PX JOIN FILTER USE | :BF0000 | 11M| 688M| 1129 (3)| 00:00:21 | | | Q1,01 | PCWP | |
| 17 | PX BLOCK ITERATOR | | 11M| 688M| 1129 (3)| 00:00:21 | KEY | KEY | Q1,01 | PCWC | |
| 18 | TABLE ACCESS FULL| SOT_REL_STAGE | 11M| 688M| 1129 (3)| 00:00:21 | KEY | KEY | Q1,01 | PCWP | |
| 19 | PARTITION RANGE SINGLE| | 5209 | 72926 | 0 (0)| | KEY | KEY | Q1,02 | PCWP | |
| 20 | INDEX RANGE SCAN | IDXL_SOTCRCT_SOT_UD | 5209 | 72926 | 0 (0)| | KEY | KEY | Q1,02 | PCWP | |
EXECUTION PLAN AFTER SETTING DEGREE=1 (It was also degraded)
| Id | Operation | Name | Rows | Bytes |TempSpc| Cost (%CPU)| Time | Pstart| Pstop |
| 0 | SELECT STATEMENT | | 1 | 129 | | 42336 (2)| 00:12:43 | | |
| 1 | HASH GROUP BY | | 1 | 129 | | 42336 (2)| 00:12:43 | | |
| 2 | NESTED LOOPS ANTI | | 1 | 129 | | 42335 (2)| 00:12:43 | | |
|* 3 | HASH JOIN | | 1 | 115 | 51M| 42334 (2)| 00:12:43 | | |
| 4 | PARTITION RANGE SINGLE| | 846K| 41M| | 8241 (1)| 00:02:29 | 81 | 81 |
|* 5 | TABLE ACCESS FULL | SOT_KEYMAP | 846K| 41M| | 8241 (1)| 00:02:29 | 81 | 81 |
| 6 | PARTITION RANGE SINGLE| | 8161K| 490M| | 12664 (3)| 00:03:48 | 81 | 81 |
|* 7 | TABLE ACCESS FULL | SOT_REL_STAGE | 8161K| 490M| | 12664 (3)| 00:03:48 | 81 | 81 |
| 8 | PARTITION RANGE SINGLE | | 6525K| 87M| | 1 (0)| 00:00:01 | 81 | 81 |
|* 9 | INDEX RANGE SCAN | IDXL_SOTCRCT_SOT_UD | 6525K| 87M| | 1 (0)| 00:00:01 | 81 | 81 |
Predicate Information (identified by operation id):
3 - access("SOTRELSTG_SYS_UD"="SOTKEY_SYS_UD" AND "SOTRELSTG_ORIG_SYS_ORD_ID"="SOTKEY_SYS_ORD_ID" AND
NVL("SOTRELSTG_ORIG_SYS_ORD_VSEQ",1)=NVL("SOTKEY_SYS_ORD_VSEQ",1))
5 - filter("SOTKEY_TRD_DATE_YMD_PART"=20090219 AND "SOTKEY_IUD_CD"='I')
7 - filter("SOTRELSTG_CRCT_PROC_STAT_CD"='N' AND "SOTRELSTG_TRD_DATE_YMD_PART"=20090219)
9 - access("SOTRELSTG_SOT_UD"="SOTCRCT_SOT_UD" AND "SOTCRCT_TRD_DATE_YMD_PART"=20090219)2. Why are you passing the partition name as bind variable? A statement executing 5 mins. best, > 2 hours worst obviously doesn't suffer from hard parsing issues and
doesn't need to (shouldn't) share execution plans therefore. So I strongly suggest to use literals instead of bind variables. This also solves any potential issues caused
by bind variable peeking.
This is a custom application which uses bind variables to extract data from daily partitions.So,daily automated data extract from daily paritions after load and ELT process.
Here,Value of bind variable is being passed through a procedure parameter.It would be bit difficult to use literals in such application.
3. All your posted plans suffer from bad cardinality estimates. The NO_MERGE hint suggested by Timur only caused a (significant) damage limitation by obviously reducing
the row source size by the group by operation before joining, but still the optimizer is way off, apart from the obviously wrong join order (larger row set first) in
particular the NESTED LOOP operation is causing the main troubles due to excessive logical I/O, as already pointed out by Timur.
Can i ask for alternatives to NESTED LOOP?
4. Your PLAN_TABLE seems to be old (you should see a corresponding note at the bottom of the DBMS_XPLAN.DISPLAY output), because none of the operations have a
filter/access predicate information attached. Since your main issue are the bad cardinality estimates, I strongly suggest to drop any existing PLAN_TABLEs in any non-Oracle
owned schemas because 10g already provides one in the SYS schema (GTT PLAN_TABLE$) exposed via a public synonym, so that the EXPLAIN PLAN information provides the
"Predicate Information" section below the plan covering the "Filter/Access" predicates.
Please post a revised explain plan output including this crucial information so that we get a clue why the cardinality estimates are way off.
I have dropped the old plan.Got above execution plan(listed above in first point) with PREDICATE information.
"As already mentioned the usage of bind variables for the partition name makes this issue potentially worse."
Is there any workaround without replacing bind variable.I am on 10g so 11g's feature will not help !!!
How are you gathering the statistics daily, can you post the exact command(s) used?
gather_table_part_stats(i_owner_name,i_table_name,i_part_name,i_estimate:= DBMS_STATS.AUTO_SAMPLE_SIZE, i_invalidate IN VARCHAR2 := 'Y',i_debug:= 'N')
Thanks & Regards,
Bhavik Desai -
Simultaneous hash joins of the same large table with many small ones?
Hello
I've got a typical data warehousing scenario where a HUGE_FACT table is to be joined with numerous very small lookup/dimension tables for data enrichment. Joins with these small lookup tables are mutually independent, which means that the result of any of these joins is not needed to perform another join.
So this is a typical scenario for a hash join: the lookup table is converted into a hashed map in RAM memory, fits there without drama cause it's small and a single pass over the HUGE_FACT suffices to get the results.
Problem is, so far as I can see it in the query plan, these hash joins are not executed simultaneously but one after another, which renders Oracle to do the full scan of the HUGE_FACT (or any intermediary enriched form of it) as many times as there are joins.
Questions:
- is my interpretation correct that the mentioned joins are sequential, not simultaneous?
- if this is the case, is there any possibility to force Oracle to perform these joins simultaneously (building more than one hashed map in memory and doing the single pass over the HUGE_FACT while looking up in all of these hashed maps for matches)? If so, how to do it?
Please note that the parallel execution of a single join at a time is not the matter of the question.
Database version is 10.2.
Thank you very much in advance for any response.user13176880 wrote:
Questions:
- is my interpretation correct that the mentioned joins are sequential, not simultaneous?Correct. But why do you think this is an issue? Because of this:
which renders Oracle to do the full scan of the HUGE_FACT (or any intermediary enriched form of it) as many times as there are joins.That is (should not be) true. Oracle does one pass of the big table, and then sequentually joins to each of the hashmaps (of each of the smaller tables).
If you show us the execution plan, we can be sure of this.
- if this is the case, is there any possibility to force Oracle to perform these joins simultaneously (building more than one hashed map in memory and doing the single pass over the HUGE_FACT while looking up in all of these hashed maps for matches)? If so, how to do it?Yes there is. But again you should not need to resort to such a solution. What you can do is use subquery factoring (WITH clause) in conjunction with the MATERIALIZE hint to first construct the cartesian join of all of the smaller (dimension) tables. And then join the big table to that. -
End of file on communication channel when doing a (hash) join
I'm having a problem with a simple join between two tables (hash join, from the explain), one containing about 650k rows, the other 730k.
The query runs fine until I add to the select list a geometry field (oracle spatial). In that case I receive an end-of-file error, which seems to be caused by a memory saturation. I'm not sure about this, as I can't understand the trace very well...
I've increased the pga_aggregate_target and the pgamax_size but this have little effects on the problem...
Any hint? Could it be caused by the geometry field size?
thanks,
giovanniThanks for the quick reply. Here it is what you asked:
select from v$version;*
Oracle Database 10g Enterprise Edition Release 10.2.0.4.0 - Prod
PL/SQL Release 10.2.0.4.0 - Production
"CORE 10.2.0.4.0 Production"
TNS for 32-bit Windows: Version 10.2.0.4.0 - Production
NLSRTL Version 10.2.0.4.0 - Production
an excerpt from the trace file:
*** ACTION NAME:() 2010-04-16 12:48:17.796
*** MODULE NAME:(SQL*Plus) 2010-04-16 12:48:17.796
*** SERVICE NAME:(ABRUZZO) 2010-04-16 12:48:17.796
*** SESSION ID:(149.10) 2010-04-16 12:48:17.796
*** 2010-04-16 12:48:17.796
ksedmp: internal or fatal error
ORA-07445: trovata eccezione: dump della memoria [ACCESS_VIOLATION] [_kokekd2m+24] [PC:0x11CF9E0] [ADDR:0xBA8C193C] [UNABLE_TO_READ] []
Current SQL statement for this session:
select g.cr375_idobj,g.cr375_geometria,t.cr374_codicectr,t.cr374_scarpt_cont
from DBTI.Cr374_Scarpata t, DBTI.Cr375g_a_Scarp g
where t.cr374_idobj=g.cr375_idobj
check trace file c:\oracle\product\10.2.0\db_1\rdbms\trace\abruzzo_ora_0.trc for preloading .sym file messages
----- Call Stack Trace -----
calling call entry argument values in hex
location type point (? means dubious value)
_kokekd2m+24 00000000
kokeq2iro+178 CALLrel kokekd2m+0 99384C8 9937624 BA8C18D8
9968588 3660D954 3A52AC00
__VInfreq__rworupo+ CALLrel _kokeq2iro+0 9968218 A05C810 EA7
321
_kxhrUnpack+71 CALLreg 00000000 99693C8 A05C810 0 9 0
qerhjWalkHashBucke CALLrel kxhrUnpack+0 9967DD0 99693C8 9 A05C810 0
t+210
__PGOSF352__qerhjIn CALLrel _qerhjWalkHashBucke A05CED4 99693C0 8
nerProbeHashTable+4 t+0
78
_kdstf0000101km+230 CALLreg 00000000
kdsttgr+1263 CALLrel kdstf0000101km+0 -
Why optimizer prefers nested loop over hash join?
What do I look for if I want to find out why the server prefers a nested loop over hash join?
The server is 10.2.0.4.0.
The query is:
SELECT p.*
FROM t1 p, t2 d
WHERE d.emplid = p.id_psoft
AND p.flag_processed = 'N'
AND p.desc_pool = :b1
AND NOT d.name LIKE '%DUPLICATE%'
AND ROWNUM < 2tkprof output is:
Production
call count cpu elapsed disk query current rows
Parse 1 0.01 0.00 0 0 4 0
Execute 1 0.00 0.01 0 4 0 0
Fetch 1 228.83 223.48 0 4264533 0 1
total 3 228.84 223.50 0 4264537 4 1
Misses in library cache during parse: 1
Optimizer mode: ALL_ROWS
Parsing user id: 108 (SANJEEV)
Rows Row Source Operation
1 COUNT STOPKEY (cr=4264533 pr=0 pw=0 time=223484076 us)
1 NESTED LOOPS (cr=4264533 pr=0 pw=0 time=223484031 us)
10401 TABLE ACCESS FULL T1 (cr=192 pr=0 pw=0 time=228969 us)
1 TABLE ACCESS FULL T2 (cr=4264341 pr=0 pw=0 time=223182508 us)Development
call count cpu elapsed disk query current rows
Parse 1 0.01 0.00 0 0 0 0
Execute 1 0.00 0.01 0 4 0 0
Fetch 1 0.05 0.03 0 512 0 1
total 3 0.06 0.06 0 516 0 1
Misses in library cache during parse: 1
Optimizer mode: ALL_ROWS
Parsing user id: 113 (SANJEEV)
Rows Row Source Operation
1 COUNT STOPKEY (cr=512 pr=0 pw=0 time=38876 us)
1 HASH JOIN (cr=512 pr=0 pw=0 time=38846 us)
51 TABLE ACCESS FULL T2 (cr=492 pr=0 pw=0 time=30230 us)
861 TABLE ACCESS FULL T1 (cr=20 pr=0 pw=0 time=2746 us)sanjeevchauhan wrote:
What do I look for if I want to find out why the server prefers a nested loop over hash join?
The server is 10.2.0.4.0.
The query is:
SELECT p.*
FROM t1 p, t2 d
WHERE d.emplid = p.id_psoft
AND p.flag_processed = 'N'
AND p.desc_pool = :b1
AND NOT d.name LIKE '%DUPLICATE%'
AND ROWNUM < 2
You've got already some suggestions, but the most straightforward way is to run the unhinted statement in both environments and then force the join and access methods you would like to see using hints, in your case probably "USE_HASH(P D)" in your production environment and "FULL(P) FULL(D) USE_NL(P D)" in your development environment should be sufficient to see the costs and estimates returned by the optimizer when using the alternate access and join patterns.
This give you a first indication why the optimizer thinks that the chosen access path seems to be cheaper than the obviously less efficient plan selected in production.
As already mentioned by Hemant using bind variables complicates things a bit since EXPLAIN PLAN is not reliable due to bind variable peeking performed when executing the statement, but not when explaining.
Since you're already on 10g you can get the actual execution plan used for all four variants using DBMS_XPLAN.DISPLAY_CURSOR which tells you more than the TKPROF output in the "Row Source Operation" section regarding the estimates and costs assigned.
Of course the result of your whole exercise might be highly dependent on the actual bind variable value used.
By the way, your statement is questionable in principle since you're querying for the first row of an indeterministic result set. It's not deterministic since you've defined no particular order so depending on the way Oracle executes the statement and the physical storage of your data this query might return different results on different runs.
This is either an indication of a bad design (If the query is supposed to return exactly one row then you don't need the ROWNUM restriction) or an incorrect attempt of a Top 1 query which requires you to specify somehow an order, either by adding a ORDER BY to the statement and wrapping it into an inline view, or e.g. using some analytic functions that allow you specify a RANK by a defined ORDER.
This is an example of how a deterministic Top N query could look like:
SELECT
FROM
SELECT p.*
FROM t1 p, t2 d
WHERE d.emplid = p.id_psoft
AND p.flag_processed = 'N'
AND p.desc_pool = :b1
AND NOT d.name LIKE '%DUPLICATE%'
ORDER BY <order_criteria>
WHERE ROWNUM <= 1;Regards,
Randolf
Oracle related stuff blog:
http://oracle-randolf.blogspot.com/
SQLTools++ for Oracle (Open source Oracle GUI for Windows):
http://www.sqltools-plusplus.org:7676/
http://sourceforge.net/projects/sqlt-pp/ -
Seeking advice on a heavy hash join
We have to self-join a 190 million row table 160 times. On our production 9i database we give "RULE" hint and it finishes in about an hour using nested loop join, On our test 10g database, since "RULE" is not available any more. The optimiser chooses to use hash join (160 levels). And query never finishes in a reasonable time. We have gradually increase the hash_area_size from 8M to 512M, thinking that will help. But apparently it does not. Can anyone provide suggestions? Thanks.
There is an approach using Analytics Functions that may be useful here. The idea is to get all the data values required with 1 reference to Tab2 rather than 160, and let Analytics do the work of building the result set.
The goal here is to 'never do multiple references to the same table where 1 reference can suffice (usually with the aid of Analytics)'.
There are 2 variations depending on whether the ID values here (0,10,20,30,...) always follow an ascending pattern or if they don't. The second example is more generic and will also cover the ascending case.
name ID val ID val ID val ID val ID val
name1 0 1 10 1 20 1 30 1 40 1
-- Performance Summary - This demo used a max value of 1600 vs 15000
for a net number of rows of 2.5 million versus 225 million.
Any of the test views ( replacing 5,10, or 160 tab2 references) using the Analytics function used at most 4099 consistent gets. The original approach used 20553 consistent gets for 5 tab2 references, 41016 consistent gets for 10 tab2 references. This was in 10g, doing the hash join. I did try alter session set optimizer_mode = rule (just for test purposes) and that resulted in 46462 consistent gets for the 10 tab2 reference view while it did a merge join operation.
Autotrace for the Analytics version to replace the original 160 table sql.
JT(147)@JTDB10G>select * from analytics_2_joins;
1600 rows selected.
Statistics
0 recursive calls
0 db block gets
4099 consistent gets
3710 physical reads
0 redo size
1112904 bytes sent via SQL*Net to client
1250 bytes received via SQL*Net from client
81 SQL*Net roundtrips to/from client
2 sorts (memory)
0 sorts (disk)
1600 rows processed
-- Minimal examples:
--As always, test thoroughly before using in production.
select name,
id0, value value0,
id1, value value1,
id2, value value2,
id3, value value3,
id4, value value4
from ( select name, id, value,
lead(id,0 ) over(partition by name, value order by rn ) id0 ,
lead(id,1 ) over(partition by name, value order by rn ) id1 ,
lead(id,2 ) over(partition by name, value order by rn ) id2 ,
lead(id,3 ) over(partition by name, value order by rn ) id3 ,
lead(id,4 ) over(partition by name, value order by rn ) id4 ,
rn
from( select tab1.name, i0.id, i0.tab1value value,
(row_number() over (partition by tab1value order by i0.id )) -1 rn
from tab1, tab2 i0 where tab1.name='name1' and i0.tab1value=tab1.value
and i0.id in (0,10,20,30,40)
)) where rn = 0;
-- execute a verions of the smaller analytics approch with bind variables
-- referencing the binds within the numbered in-line view is needed only if
-- the id0, id1, id2 values do not follow the ascending pattern shown in the
-- example. This will handle case where id0 = 30, id2 =20, id4 = 40 , etc.
variable l_id0 number;
variable l_id1 number;
variable l_id2 number;
variable l_id3 number;
variable l_id4 number;
exec :l_id1 := 0;
exec :l_id3 := 10;
exec :l_id2 := 20;
exec :l_id0 := 30;
exec :l_id4 := 40;
select name, bind_rn,
id0, value value0,
id1, value value1,
id2, value value2,
id3, value value3,
id4, value value4
from ( select name, id, value,
lead(id,0 ) over(partition by name, value order by bind_rn ) id0 ,
lead(id,1 ) over(partition by name, value order by bind_rn ) id1 ,
lead(id,2 ) over(partition by name, value order by bind_rn ) id2 ,
lead(id,3 ) over(partition by name, value order by bind_rn ) id3 ,
lead(id,4 ) over(partition by name, value order by bind_rn ) id4 ,
bind_rn
from( select tab1.name, i0.id, i0.tab1value value, bind_rn
from tab1, tab2 i0,
(select 0 bind_rn, :l_id0 arg_value from dual union
select 1 , :l_id1 from dual union
select 2 , :l_id2 from dual union
select 3 , :l_id3 from dual union
select 4 , :l_id4 from dual ) table_of_args
where tab1.name='name1' and i0.tab1value=tab1.value
-- and i0.id in (0,10,20,30,40)
and i0.id = table_of_args.arg_value
)) where bind_rn = 0;
-- Full Test Case
-- table setup
drop table tab1;
drop table tab2;
create table tab1(name varchar2(100), value number) pctfree 0;
create table tab2(id number, tab1value number) pctfree 0;
begin
for x in 0 .. 1600 loop
for y in 1 .. 1600 loop
insert into tab1 values ('name' || x, y);
end loop;
end loop;
end;
-- 15000 results in 225,000,000
-- 1600 results in 2,560,000
begin
for x in 0 .. 1600 loop
for y in 1 .. 1600 loop
insert into tab2 values (x, y);
end loop;
end loop;
end;
commit;
CREATE BITMAP INDEX NAME_BITMAP ON TAB1(NAME);
EXEC DBMS_STATS.GATHER_TABLE_STATS(OWNNAME => 'JTOMMANEY',TABNAME => 'TAB1', -
estimate_percent => 20, CASCADE => TRUE);
EXEC DBMS_STATS.GATHER_TABLE_STATS(OWNNAME => 'JTOMMANEY',TABNAME => 'TAB2', -
estimate_percent => 20, CASCADE => TRUE);
alter session set optimizer_mode = 'RULE';
-- set up some views both the original approach, and the analytis approach
create view original_5_tab2_tables_join as
select tab1.name name,
i0.id id0, i0.tab1value value0,
i1.id id1, i1.tab1value value1,
i2.id id2, i2.tab1value value2,
i3.id id3, i3.tab1value value3,
i4.id id4, i4.tab1value value4
from tab1,
tab2 i0, tab2 i1, tab2 i2, tab2 i3, tab2 i4
where tab1.name='name1'
and (i0.id=0 and i0.tab1value=tab1.value)
and (i1.id=10 and i1.tab1value=tab1.value)
and (i2.id=20 and i2.tab1value=tab1.value)
and (i3.id=30 and i3.tab1value=tab1.value)
and (i4.id=40 and i4.tab1value=tab1.value);
create view replace_5_tab2_joins as
select name,
id0, value value0,
id1, value value1,
id2, value value2,
id3, value value3,
id4, value value4
from ( select name, id, value,
lead(id,0 ) over(partition by name, value order by rn ) id0 ,
lead(id,1 ) over(partition by name, value order by rn ) id1 ,
lead(id,2 ) over(partition by name, value order by rn ) id2 ,
lead(id,3 ) over(partition by name, value order by rn ) id3 ,
lead(id,4 ) over(partition by name, value order by rn ) id4 ,
rn from( select tab1.name, i0.id, i0.tab1value value,
(row_number() over (partition by tab1value order by i0.id )) -1 rn
from tab1, tab2 i0 where tab1.name='name1' and i0.tab1value=tab1.value
and i0.id in (0,10,20,30,40)
)) where rn = 0;
create view original_10_tab2_tables_join as
select tab1.name name,
i0.id id0, i0.tab1value value0,
i1.id id1, i1.tab1value value1,
i2.id id2, i2.tab1value value2,
i3.id id3, i3.tab1value value3,
i4.id id4, i4.tab1value value4,
i5.id id5, i5.tab1value value5,
i6.id id6, i6.tab1value value6,
i7.id id7, i7.tab1value value7,
i8.id id8, i8.tab1value value8,
i9.id id9, i9.tab1value value9
from tab1,
tab2 i0, tab2 i1, tab2 i2, tab2 i3, tab2 i4,
tab2 i5, tab2 i6, tab2 i7, tab2 i8, tab2 i9
where tab1.name='name1'
and (i0.id=0 and i0.tab1value=tab1.value)
and (i1.id=10 and i1.tab1value=tab1.value)
and (i2.id=20 and i2.tab1value=tab1.value)
and (i3.id=30 and i3.tab1value=tab1.value)
and (i4.id=40 and i4.tab1value=tab1.value)
and (i5.id=50 and i5.tab1value=tab1.value)
and (i6.id=60 and i6.tab1value=tab1.value)
and (i7.id=70 and i7.tab1value=tab1.value)
and (i8.id=80 and i8.tab1value=tab1.value)
and (i9.id=90 and i9.tab1value=tab1.value);
create view replace_10_tab2_joins as
select name,
id0, value value0,
id1, value value1,
id2, value value2,
id3, value value3,
id4, value value4,
id5, value value5,
id6, value value6,
id7, value value7,
id8, value value8,
id9, value value9
from ( select name, id, value,
lead(id,0 ) over(partition by name, value order by rn ) id0 ,
lead(id,1 ) over(partition by name, value order by rn ) id1 ,
lead(id,2 ) over(partition by name, value order by rn ) id2 ,
lead(id,3 ) over(partition by name, value order by rn ) id3 ,
lead(id,4 ) over(partition by name, value order by rn ) id4 ,
lead(id,5 ) over(partition by name, value order by rn ) id5 ,
lead(id,6 ) over(partition by name, value order by rn ) id6 ,
lead(id,7 ) over(partition by name, value order by rn ) id7 ,
lead(id,8 ) over(partition by name, value order by rn ) id8 ,
lead(id,9 ) over(partition by name, value order by rn ) id9 ,
rn from( select tab1.name, i0.id, i0.tab1value value,
(row_number() over (partition by tab1value order by i0.id )) -1 rn
from tab1, tab2 i0 where tab1.name='name1' and i0.tab1value=tab1.value
and i0.id in (0,10,20,30,40,50,60,70,80,90)
)) where rn = 0;
-- set up some views both the original approach, and the analytics approach
spool cr_v1.sql may need to clean up heading, linefeed from created file
begin
dbms_output.put_line('create or replace view original_160_joins as select /*+ rule */ tab1.name ');
for x in 0 .. 160 loop
dbms_output.put_line( ',i' || x || '.id id' || x || ' ,i' || x || '.tab1value value' || x ) ;
end loop;
dbms_output.put_line('from tab1' );
for x in 0 .. 160 loop
dbms_output.put_line( ',tab2 i' || x ) ;
end loop;
dbms_output.put_line(' where tab1.name = ''name1''' );
for x in 0 .. 160 loop
dbms_output.put_line( ' and i' || x || '.id=' || (x * 10) || ' and i' || x || '.tab1value=tab1.value ' ) ;
end loop;
dbms_output.put_line( ' ;');
end;
--spool off
--@cr_v1.sql
spool cr_v2.sql may need to clean up heading, linefeed from created file
begin
dbms_output.put_line('create or replace view analytics_2_joins as select name ');
for x in 0 .. 160 loop
dbms_output.put_line( ',id' || x || ', value value' || x ) ;
end loop;
dbms_output.put_line('from ( select name, id, value ' );
for x in 0 .. 160 loop
dbms_output.put_line( ',lead(id,' || x || ') over(partition by name, value order by rn ) id' || x ) ;
end loop;
dbms_output.put_line(' , rn from( select tab1.name, i0.id, i0.tab1value value, ');
dbms_output.put_line(' (row_number() over (partition by tab1value order by i0.id )) -1 rn ');
dbms_output.put_line(' from tab1, tab2 i0 where tab1.name=''name1'' and i0.tab1value=tab1.value and i0.id in ( ');
for x in 0 .. 159 loop
dbms_output.put_line( (x * 10) || ',' ) ;
end loop;
dbms_output.put_line( ' 1600))) where rn = 0;');
end;
--spool off
--@cr_v2.sql
-- We now have 6 views established
-- Original Approach Analytics Approach w/ 1 tab2 reference
-- 5 tab2s original_5_tab2_tables_join replace_5_tab2_joins
-- 10 tab2s original_10_tab2_tables_join replace_10_tab2_joins
--160 tab2s original_160_joins analytics_2_joins
-- plus we will use call the version with bind variables, but not from a view.
-- Data validation:
select 'orig_minus_new: ' || count(*) from
( select * from original_5_tab2_tables_join minus select * from replace_5_tab2_joins ) union
select 'new_minus_orig: ' || count(*) from
( select * from replace_5_tab2_joins minus select * from original_5_tab2_tables_join );
select 'orig_minus_new: ' || count(*) from
( select * from original_10_tab2_tables_join minus select * from replace_10_tab2_joins ) union
select 'new_minus_orig: ' || count(*) from
( select * from replace_10_tab2_joins minus select * from original_10_tab2_tables_join );
select 'orig_minus_new: ' || count(*) from
( select * from original_160_joins minus select * from analytics_2_joins );
select 'new_minus_orig: ' || count(*) from
( select * from analytics_2_joins minus select * from original_160_joins );
-- Performance test
alter session set workarea_size_policy=manual ;
alter session set sort_area_size = 64000000;
alter session set hash_area_size = 64000000;
set autotrace traceonly stat
select * from original_5_tab2_tables_join;
select * from replace_5_tab2_joins;
select * from original_10_tab2_tables_join;
select * from replace_10_tab2_joins;
select * from analytics_2_joins;
--select * from original_160_joins;
-- execute a verions of the smaller analytics approch with bind variables
-- referencing the binds within the numbered in-line view is needed only if
-- the id0, id1, id2 values do not follow the ascending pattern shown in the
-- example. This will handle case where id0 = 30, id2 =20, id4 = 40 , etc.
variable l_id0 number;
variable l_id1 number;
variable l_id2 number;
variable l_id3 number;
variable l_id4 number;
exec :l_id1 := 0;
exec :l_id3 := 10;
exec :l_id2 := 20;
exec :l_id0 := 30;
exec :l_id4 := 40;
select name, bind_rn,
id0, value value0,
id1, value value1,
id2, value value2,
id3, value value3,
id4, value value4
from ( select name, id, value,
lead(id,0 ) over(partition by name, value order by bind_rn ) id0 ,
lead(id,1 ) over(partition by name, value order by bind_rn ) id1 ,
lead(id,2 ) over(partition by name, value order by bind_rn ) id2 ,
lead(id,3 ) over(partition by name, value order by bind_rn ) id3 ,
lead(id,4 ) over(partition by name, value order by bind_rn ) id4 ,
bind_rn
from( select tab1.name, i0.id, i0.tab1value value, bind_rn
from tab1, tab2 i0,
(select 0 bind_rn, :l_id0 arg_value from dual union
select 1 , :l_id1 from dual union
select 2 , :l_id2 from dual union
select 3 , :l_id3 from dual union
select 4 , :l_id4 from dual ) table_of_args
where tab1.name='name1' and i0.tab1value=tab1.value
-- and i0.id in (0,10,20,30,40)
and i0.id = table_of_args.arg_value
)) where bind_rn = 0;
JT(147)@JTDB10G>select * from original_5_tab2_tables_join;
1600 rows selected.
Statistics
8 recursive calls
2 db block gets
20553 consistent gets
18555 physical reads
0 redo size
52052 bytes sent via SQL*Net to client
1250 bytes received via SQL*Net from client
81 SQL*Net roundtrips to/from client
0 sorts (memory)
0 sorts (disk)
1600 rows processed
JT(147)@JTDB10G>select * from replace_5_tab2_joins;
1600 rows selected.
Statistics
8 recursive calls
2 db block gets
4101 consistent gets
3710 physical reads
0 redo size
52052 bytes sent via SQL*Net to client
1250 bytes received via SQL*Net from client
81 SQL*Net roundtrips to/from client
2 sorts (memory)
0 sorts (disk)
1600 rows processed
JT(147)@JTDB10G>select * from original_10_tab2_tables_join;
1600 rows selected.
Statistics
0 recursive calls
0 db block gets
41016 consistent gets
37115 physical reads
0 redo size
85636 bytes sent via SQL*Net to client
1250 bytes received via SQL*Net from client
81 SQL*Net roundtrips to/from client
0 sorts (memory)
0 sorts (disk)
1600 rows processed
JT(147)@JTDB10G>select * from replace_10_tab2_joins;
1600 rows selected.
Statistics
0 recursive calls
0 db block gets
4099 consistent gets
3710 physical reads
0 redo size
85636 bytes sent via SQL*Net to client
1250 bytes received via SQL*Net from client
81 SQL*Net roundtrips to/from client
2 sorts (memory)
0 sorts (disk)
1600 rows processed
JT(147)@JTDB10G>select * from analytics_2_joins;
1600 rows selected.
Statistics
0 recursive calls
0 db block gets
4099 consistent gets
3710 physical reads
0 redo size
1112904 bytes sent via SQL*Net to client
1250 bytes received via SQL*Net from client
81 SQL*Net roundtrips to/from client
2 sorts (memory)
0 sorts (disk)
1600 rows processed
JT(147)@JTDB10G>--select * from original_160_joins;
JT(147)@JTDB10G>
JT(147)@JTDB10G>----------------------------------------------------------------------------------------
JT(147)@JTDB10G>-- execute a verions of the smaller analytics approch with bind variables
JT(147)@JTDB10G>-- referencing the binds within the numbered in-line view is needed only if
JT(147)@JTDB10G>-- the id0, id1, id2 values do not follow the ascending pattern shown in the
JT(147)@JTDB10G>-- example. This will handle case where id0 = 30, id2 =20, id4 = 40 , etc.
JT(147)@JTDB10G>----------------------------------------------------------------------------------------
JT(147)@JTDB10G>
JT(147)@JTDB10G>variable l_id0 number;
JT(147)@JTDB10G>variable l_id1 number;
JT(147)@JTDB10G>variable l_id2 number;
JT(147)@JTDB10G>variable l_id3 number;
JT(147)@JTDB10G>variable l_id4 number;
JT(147)@JTDB10G>
JT(147)@JTDB10G>exec :l_id1 := 0;
PL/SQL procedure successfully completed.
JT(147)@JTDB10G>exec :l_id3 := 10;
PL/SQL procedure successfully completed.
JT(147)@JTDB10G>exec :l_id2 := 20;
PL/SQL procedure successfully completed.
JT(147)@JTDB10G>exec :l_id0 := 30;
PL/SQL procedure successfully completed.
JT(147)@JTDB10G>exec :l_id4 := 40;
PL/SQL procedure successfully completed.
JT(147)@JTDB10G>
JT(147)@JTDB10G>select name, bind_rn,
2 id0, value value0,
3 id1, value value1,
4 id2, value value2,
5 id3, value value3,
6 id4, value value4
7 from ( select name, id, value,
8 lead(id,0 ) over(partition by name, value order by bind_rn ) id0 ,
9 lead(id,1 ) over(partition by name, value order by bind_rn ) id1 ,
10 lead(id,2 ) over(partition by name, value order by bind_rn ) id2 ,
11 lead(id,3 ) over(partition by name, value order by bind_rn ) id3 ,
12 lead(id,4 ) over(partition by name, value order by bind_rn ) id4 ,
13 bind_rn
14 from( select tab1.name, i0.id, i0.tab1value value, bind_rn
15 from tab1, tab2 i0,
16 (select 0 bind_rn, :l_id0 arg_value from dual union
17 select 1 , :l_id1 from dual union
18 select 2 , :l_id2 from dual union
19 select 3 , :l_id3 from dual union
20 select 4 , :l_id4 from dual ) table_of_args
21 where tab1.name='name1' and i0.tab1value=tab1.value
22 -- and i0.id in (0,10,20,30,40)
23 and i0.id = table_of_args.arg_value
24 )) where bind_rn = 0;
1600 rows selected.
Statistics
1 recursive calls
0 db block gets
4099 consistent gets
3707 physical reads
0 redo size
52111 bytes sent via SQL*Net to client
1250 bytes received via SQL*Net from client
81 SQL*Net roundtrips to/from client
2 sorts (memory)
0 sorts (disk)
1600 rows processed -
How to hint hash join order on indexes ?
Hi,
in my 9.2.0.8 DB I've got query like this:
SELECT COUNT (agreementno) cnt
FROM followup
WHERE agreementno = :v001 AND actioncode = :v002 AND resultcode = :v003;
Plan
SELECT STATEMENT CHOOSECost: 11 Bytes: 18 Cardinality: 1
5 SORT AGGREGATE Bytes: 18 Cardinality: 1
4 VIEW index$_join$_001 Cost: 11 Bytes: 18 Cardinality: 1
3 HASH JOIN Bytes: 18 Cardinality: 1
1 INDEX RANGE SCAN NON-UNIQUE IDX_FOLLOWUP06 Cost: 13 Bytes: 18 Cardinality: 1
2 INDEX RANGE SCAN UNIQUE PK_FOLLOWUP Cost: 13 Bytes: 18 Cardinality: 1 I need to change join order of indexes, so the proble one would be PK_FOLLOWUP .
Of course the best plan is index range scan on pk but during to hight CF Oracle is combining 2 indexes .
Regards.
GregHmm sorry for red-herring.
I guess you could also consider hand-coding the index join, something like:
SELECT /*+ ORDERED USE_HASH (b) */
COUNT (a.empno)
FROM (SELECT e.empno
FROM emp e
WHERE e.empno > 7000) a,
(SELECT e.empno
FROM emp e
WHERE e.ename LIKE 'S%') b
WHERE a.ROWID = b.ROWID;or...
WITH e AS
(SELECT e.empno, e.ename
FROM emp e)
SELECT /*+ ORDERED USE_HASH (b) */
COUNT (a.empno)
FROM e a, e b
WHERE b.ename LIKE 'S%'
AND a.empno > 7000
AND a.ROWID = b.ROWID; -
Hi,
Both the querys are returning same results, but in my first query hash join and second query nested loop . How ? PLs explain
select *
from emp a,dept b
where a.deptno=b.deptno and b.deptno>20;
6 rows
Plan hash value: 4102772462
| Id | Operation | Name | Rows | Bytes | Cost (%CPU)| Time |
| 0 | SELECT STATEMENT | | 6 | 348 | 6 (17)| 00:00:01 |
|* 1 | HASH JOIN | | 6 | 348 | 6 (17)| 00:00:01 |
| 2 | TABLE ACCESS BY INDEX ROWID| DEPT | 3 | 60 | 2 (0)| 00:00:01 |
|* 3 | INDEX RANGE SCAN | PK_DEPT | 3 | | 1 (0)| 00:00:01 |
|* 4 | TABLE ACCESS FULL | EMP | 7 | 266 | 3 (0)| 00:00:01 |
Predicate Information (identified by operation id):
1 - access("A"."DEPTNO"="B"."DEPTNO")
3 - access("B"."DEPTNO">20)
4 - filter("A"."DEPTNO">20)
select *
from emp a,dept b
where a.deptno=b.deptno and b.deptno=30;
6 rows
Plan hash value: 568005898
| Id | Operation | Name | Rows | Bytes | Cost (%CPU)| Time |
| 0 | SELECT STATEMENT | | 5 | 290 | 4 (0)| 00:00:01 |
| 1 | NESTED LOOPS | | 5 | 290 | 4 (0)| 00:00:01 |
| 2 | TABLE ACCESS BY INDEX ROWID| DEPT | 1 | 20 | 1 (0)| 00:00:01 |
|* 3 | INDEX UNIQUE SCAN | PK_DEPT | 1 | | 0 (0)| 00:00:01 |
|* 4 | TABLE ACCESS FULL | EMP | 5 | 190 | 3 (0)| 00:00:01 |
Predicate Information (identified by operation id):
3 - access("B"."DEPTNO"=30)
4 - filter("A"."DEPTNO"=30)Hi,
Unless specifically requested, Oracle picks the best execution plan based on estimates of table sizes, column selectivity and many other variables. Even though Oracle does its best to have the estimates as accurate as possible, they are frequently different, and in some cases quite different, from the actual values.
In the first query, Oracle estimated that the predicate “ b.deptno>20” would limit the number of records to 6, and based on that it decided the use Hash Join.
In the second query, Oracle estimated that the predicate “b.deptno=30” would limit the number of records to 5, and based on that it decided the use Nested Loops Join.
The fact that the actual number of records is the same is irrelevant because Oracle used the estimate, rather the actual number of records to pick the best plan.
HTH,
Iordan
Iotzov -
Hi ,
From my explain plan i see HASH JOIN ANTI .
what arises to this join in my select query ?
select * from ( select * from tbl1 where not exists (select * from tbl2 where tbl1.col1 = tbl2.col1) jj
where not exist (select * from tbl3 where jj.col1 = tbl3.col1))i think it's the dbl not exists that cause this hash join anti
but what impact will this type of join has on my query performance ?
tks & rdgsHi ,
but i tot when using SELECT where xx in ( ) or XX NOT in ( ) then it'll for every single record then it would perform the select query within the IN ( ) or NOT IN ( )
wherelse EXISTS or NOT EXISTS will actually do prelisting and then compare from there onwards ?
tks & rdgs -
Parallel Hash Join always swapping to TEMP
Hi,
I've experienced some strange behaviour on Oracle 9.2.0.5 recently: simple query hash joining two tables - smaller with 16k records/1 mb of size and bigger with 2.5m records/1.5 gb of size is swapping to TEMP when launched in parallel mode (4 set of PQ slaves). What is strange serial execution is running as expected - in-memory Hash Join occurs. It's worth to add that both parallel and serial execution properly selects smaller table as inner one but parallel query always decides to buffer the source data (no matter how big is it).
To be more precise - all table stats are gathered, I have enough PGA memory assigned to queries (WORKAREA_POLICY_SIZE=AUTO, PGA_AGGREGATE_TARGET=6GB) and I properly analyze the results. Even hidden parameter SMMPX_MAX_SIZE is properly set to about 2GB, the issue is that parallel execution still decides to swap (even if the inner data size for each slave is about 220kb!).
I dig into the traces (10104 event) and found some substantial difference between serial and parallel execution. It looks like some internal flag orders PQ slaves to always buffer the data, here is what I found in PQ slave trace:
HASH JOIN STATISTICS (INITIALIZATION)
Original memory: 4428800
Memory after all overhead: 4283220
Memory for slots: 3809280
Calculated overhead for partitions and row/slot managers: 473940
Hash-join fanout: 8
Number of partitions: 9
Number of slots: 15
Multiblock IO: 31
Block size(KB): 8
Cluster (slot) size(KB): 248
Hash-join fanout (manual): 8
Cluster/slot size(KB) (manual): 280
Minimum number of bytes per block: 8160
Bit vector memory allocation(KB): 128
Per partition bit vector length(KB): 16
Maximum possible row length: 1455
Estimated build size (KB): 645
Estimated Row Length (includes overhead): 167
Immutable Flags:
BUFFER the output of the join for Parallel Query
kxhfSetPhase: phase=BUILD
kxhfAddChunk: add chunk 0 (sz=32) to slot table
kxhfAddChunk: chunk 0 (lbs=800003ff640ebb50, slotTab=800003ff640ebce8) successfuly added
kxhfSetPhase: phase=PROBE_1
Bolded is the part that is not present in serial mode. Unfortunatelly I cannot find anything that could help identifying the reason or setting that drives this behaviour :(
Best regards
Bazyli
Edited by: user10419027 on Oct 13, 2008 3:53 AMJonathan,
Distribution seems to be as expected (HASH/HASH), please have a look on the query plan:
PLAN_TABLE_OUTPUT
| Id | Operation | Name | Rows | Bytes | Cost | TQ |IN-OUT| PQ Distrib |
| 0 | SELECT STATEMENT | | 456K| 95M| 876 | | | |
|* 1 | HASH JOIN | | 456K| 95M| 876 | 43,02 | P->S | QC (RAND) |
| 2 | TABLE ACCESS FULL | SH30_8700195_9032_0_TMP_TEST | 16555 | 468K| 16 | 43,00 | P->P | HASH |
| 3 | TABLE ACCESS FULL | SH30_8700195_9031_0_TMP_TEST | 2778K| 503M| 860 | 43,01 | P->P | HASH |
Predicate Information (identified by operation id):
1 - access(NVL("A"."PROD_ID",'NULL!')=NVL("B"."PROD_ID",'NULL!') AND
NVL("A"."PROD_UNIT_OF_MEASR_ID",'NULL!')=NVL("B"."PROD_UNIT_OF_MEASR_ID",'NULL!'))Let me also share with you trace files from parallel and serial execution.
First, serial execution (only 10104 event details):
Dump file /opt/oracle/admin/cdwep4/udump/cdwep401_ora_18729.trc
Oracle9i Enterprise Edition Release 9.2.0.5.0 - 64bit Production
With the Partitioning, OLAP and Oracle Data Mining options
JServer Release 9.2.0.5.0 - Production
ORACLE_HOME = /opt/oracle/product/9.2.0.5
System name: HP-UX
Node name: ethp1018
Release: B.11.11
Version: U
Machine: 9000/800
Instance name: cdwep401
Redo thread mounted by this instance: 1
Oracle process number: 100
Unix process pid: 18729, image: oracle@ethp1018 (TNS V1-V3)
kxhfInit(): enter
kxhfInit(): exit
*** HASH JOIN STATISTICS (INITIALIZATION) ***
Original memory: 4341760
Memory after all overhead: 4163446
Memory for slots: 3301376
Calculated overhead for partitions and row/slot managers: 862070
Hash-join fanout: 8
Number of partitions: 8
Number of slots: 13
Multiblock IO: 31
Block size(KB): 8
Cluster (slot) size(KB): 248
Hash-join fanout (manual): 8
Cluster/slot size(KB) (manual): 240
Minimum number of bytes per block: 8160
Bit vector memory allocation(KB): 128
Per partition bit vector length(KB): 16
Maximum possible row length: 1455
Estimated build size (KB): 1083
Estimated Row Length (includes overhead): 67
# Immutable Flags:
kxhfSetPhase: phase=BUILD
kxhfAddChunk: add chunk 0 (sz=32) to slot table
kxhfAddChunk: chunk 0 (lbs=800003ff6c063b20, slotTab=800003ff6c063cb8) successfuly added
kxhfSetPhase: phase=PROBE_1
qerhjFetch: max build row length (mbl=110)
*** END OF HASH JOIN BUILD (PHASE 1) ***
Revised row length: 68
Revised row count: 16555
Revised build size: 1089KB
kxhfResize(enter): resize to 12 slots (numAlloc=8, max=13)
kxhfResize(exit): resized to 12 slots (numAlloc=8, max=12)
Slot table resized: old=13 wanted=12 got=12 unload=0
*** HASH JOIN BUILD HASH TABLE (PHASE 1) ***
Total number of partitions: 8
Number of partitions which could fit in memory: 8
Number of partitions left in memory: 8
Total number of slots in in-memory partitions: 8
Total number of rows in in-memory partitions: 16555
(used as preliminary number of buckets in hash table)
Estimated max # of build rows that can fit in avail memory: 55800
### Partition Distribution ###
Partition:0 rows:2131 clusters:1 slots:1 kept=1
Partition:1 rows:1975 clusters:1 slots:1 kept=1
Partition:2 rows:1969 clusters:1 slots:1 kept=1
Partition:3 rows:2174 clusters:1 slots:1 kept=1
Partition:4 rows:2041 clusters:1 slots:1 kept=1
Partition:5 rows:2092 clusters:1 slots:1 kept=1
Partition:6 rows:2048 clusters:1 slots:1 kept=1
Partition:7 rows:2125 clusters:1 slots:1 kept=1
*** (continued) HASH JOIN BUILD HASH TABLE (PHASE 1) ***
Revised number of hash buckets (after flushing): 16555
Allocating new hash table.
*** (continued) HASH JOIN BUILD HASH TABLE (PHASE 1) ***
Requested size of hash table: 4096
Actual size of hash table: 4096
Number of buckets: 32768
kxhfResize(enter): resize to 14 slots (numAlloc=8, max=12)
kxhfResize(exit): resized to 14 slots (numAlloc=8, max=14)
freeze work area size to: 4357K (14 slots)
*** (continued) HASH JOIN BUILD HASH TABLE (PHASE 1) ***
Total number of rows (may have changed): 16555
Number of in-memory partitions (may have changed): 8
Final number of hash buckets: 32768
Size (in bytes) of hash table: 262144
kxhfIterate(end_iterate): numAlloc=8, maxSlots=14
*** (continued) HASH JOIN BUILD HASH TABLE (PHASE 1) ***
### Hash table ###
# NOTE: The calculated number of rows in non-empty buckets may be smaller
# than the true number.
Number of buckets with 0 rows: 21129
Number of buckets with 1 rows: 8755
Number of buckets with 2 rows: 2024
Number of buckets with 3 rows: 433
Number of buckets with 4 rows: 160
Number of buckets with 5 rows: 85
Number of buckets with 6 rows: 69
Number of buckets with 7 rows: 41
Number of buckets with 8 rows: 32
Number of buckets with 9 rows: 18
Number of buckets with between 10 and 19 rows: 21
Number of buckets with between 20 and 29 rows: 1
Number of buckets with between 30 and 39 rows: 0
Number of buckets with between 40 and 49 rows: 0
Number of buckets with between 50 and 59 rows: 0
Number of buckets with between 60 and 69 rows: 0
Number of buckets with between 70 and 79 rows: 0
Number of buckets with between 80 and 89 rows: 0
Number of buckets with between 90 and 99 rows: 0
Number of buckets with 100 or more rows: 0
### Hash table overall statistics ###
Total buckets: 32768 Empty buckets: 21129 Non-empty buckets: 11639
Total number of rows: 16555
Maximum number of rows in a bucket: 24
Average number of rows in non-empty buckets: 1.422373
=====================
.... (lots of fetching) ....
qerhjFetch: max probe row length (mpl=0)
qerhjFreeSpace(): free hash-join memory
kxhfRemoveChunk: remove chunk 0 from slot tableAnd finally, PQ slave output (only one trace, please note Immutable Flag that I believe orders Oracle to buffer to TEMP):
Dump file /opt/oracle/admin/cdwep4/bdump/cdwep401_p002_4640.trc
Oracle9i Enterprise Edition Release 9.2.0.5.0 - 64bit Production
With the Partitioning, OLAP and Oracle Data Mining options
JServer Release 9.2.0.5.0 - Production
ORACLE_HOME = /opt/oracle/product/9.2.0.5
System name: HP-UX
Node name: ethp1018
Release: B.11.11
Version: U
Machine: 9000/800
Instance name: cdwep401
Redo thread mounted by this instance: 1
Oracle process number: 86
Unix process pid: 4640, image: oracle@ethp1018 (P002)
kxhfInit(): enter
kxhfInit(): exit
*** HASH JOIN STATISTICS (INITIALIZATION) ***
Original memory: 4428800
Memory after all overhead: 4283220
Memory for slots: 3809280
Calculated overhead for partitions and row/slot managers: 473940
Hash-join fanout: 8
Number of partitions: 9
Number of slots: 15
Multiblock IO: 31
Block size(KB): 8
Cluster (slot) size(KB): 248
Hash-join fanout (manual): 8
Cluster/slot size(KB) (manual): 280
Minimum number of bytes per block: 8160
Bit vector memory allocation(KB): 128
Per partition bit vector length(KB): 16
Maximum possible row length: 1455
Estimated build size (KB): 645
Estimated Row Length (includes overhead): 167
# Immutable Flags:
BUFFER the output of the join for Parallel Query
kxhfSetPhase: phase=BUILD
kxhfAddChunk: add chunk 0 (sz=32) to slot table
kxhfAddChunk: chunk 0 (lbs=800003ff640ebb50, slotTab=800003ff640ebce8) successfuly added
kxhfSetPhase: phase=PROBE_1
qerhjFetch: max build row length (mbl=96)
*** END OF HASH JOIN BUILD (PHASE 1) ***
Revised row length: 54
Revised row count: 4203
Revised build size: 221KB
kxhfResize(enter): resize to 16 slots (numAlloc=8, max=15)
kxhfResize(exit): resized to 16 slots (numAlloc=8, max=16)
Slot table resized: old=15 wanted=16 got=16 unload=0
*** HASH JOIN BUILD HASH TABLE (PHASE 1) ***
Total number of partitions: 8
Number of partitions which could fit in memory: 8
Number of partitions left in memory: 8
Total number of slots in in-memory partitions: 8
Total number of rows in in-memory partitions: 4203
(used as preliminary number of buckets in hash table)
Estimated max # of build rows that can fit in avail memory: 85312
### Partition Distribution ###
Partition:0 rows:537 clusters:1 slots:1 kept=1
Partition:1 rows:554 clusters:1 slots:1 kept=1
Partition:2 rows:497 clusters:1 slots:1 kept=1
Partition:3 rows:513 clusters:1 slots:1 kept=1
Partition:4 rows:498 clusters:1 slots:1 kept=1
Partition:5 rows:543 clusters:1 slots:1 kept=1
Partition:6 rows:547 clusters:1 slots:1 kept=1
Partition:7 rows:514 clusters:1 slots:1 kept=1
*** (continued) HASH JOIN BUILD HASH TABLE (PHASE 1) ***
Revised number of hash buckets (after flushing): 4203
Allocating new hash table.
*** (continued) HASH JOIN BUILD HASH TABLE (PHASE 1) ***
Requested size of hash table: 1024
Actual size of hash table: 1024
Number of buckets: 8192
kxhfResize(enter): resize to 18 slots (numAlloc=8, max=16)
kxhfResize(exit): resized to 18 slots (numAlloc=8, max=18)
freeze work area size to: 5812K (18 slots)
*** (continued) HASH JOIN BUILD HASH TABLE (PHASE 1) ***
Total number of rows (may have changed): 4203
Number of in-memory partitions (may have changed): 8
Final number of hash buckets: 8192
Size (in bytes) of hash table: 65536
kxhfIterate(end_iterate): numAlloc=8, maxSlots=18
*** (continued) HASH JOIN BUILD HASH TABLE (PHASE 1) ***
### Hash table ###
# NOTE: The calculated number of rows in non-empty buckets may be smaller
# than the true number.
Number of buckets with 0 rows: 5284
Number of buckets with 1 rows: 2177
Number of buckets with 2 rows: 510
Number of buckets with 3 rows: 104
Number of buckets with 4 rows: 51
Number of buckets with 5 rows: 14
Number of buckets with 6 rows: 14
Number of buckets with 7 rows: 13
Number of buckets with 8 rows: 12
Number of buckets with 9 rows: 4
Number of buckets with between 10 and 19 rows: 9
Number of buckets with between 20 and 29 rows: 0
Number of buckets with between 30 and 39 rows: 0
Number of buckets with between 40 and 49 rows: 0
Number of buckets with between 50 and 59 rows: 0
Number of buckets with between 60 and 69 rows: 0
Number of buckets with between 70 and 79 rows: 0
Number of buckets with between 80 and 89 rows: 0
Number of buckets with between 90 and 99 rows: 0
Number of buckets with 100 or more rows: 0
### Hash table overall statistics ###
Total buckets: 8192 Empty buckets: 5284 Non-empty buckets: 2908
Total number of rows: 4203
Maximum number of rows in a bucket: 16
Average number of rows in non-empty buckets: 1.445323
kxhfWrite: hash-join is spilling to disk
kxhfWrite: Writing dba=950281 slot=8 part=8
kxhfWrite: Writing dba=950312 slot=9 part=8
kxhfWrite: Writing dba=950343 slot=10 part=8
kxhfWrite: Writing dba=950374 slot=11 part=8
.... (lots of writing) ....
kxhfRead(): Reading dba=950281 into slot=15
kxhfIsDone: waiting slot=15 lbs=800003ff640ebb50
kxhfRead(): Reading dba=950312 into slot=16
kxhfIsDone: waiting slot=16 lbs=800003ff640ebb50
kxhfRead(): Reading dba=950343 into slot=17
kxhfFreeSlots(800003ff7c068918): all=0 alloc=18 max=18
EmptySlots:15 8 9 10 11 12 13
PendingSlots:
kxhfIsDone: waiting slot=17 lbs=800003ff640ebb50
kxhfRead(): Reading dba=950374 into slot=15
kxhfFreeSlots(800003ff7c068918): all=0 alloc=18 max=18
EmptySlots:16 8 9 10 11 12 13
PendingSlots:
.... (lots of reading) ....
qerhjFetchPhase2(): building a hash table
kxhfFreeSlots(800003ff7c068980): all=1 alloc=18 max=18
EmptySlots:2 4 6 1 0 7 5 3 14 17 16 15 8 9 10 11 12 13
PendingSlots:
qerhjFreeSpace(): free hash-join memory
kxhfRemoveChunk: remove chunk 0 from slot tableWhy do you think it's surprising that Oracle utilizes TEMP? Basing on traces Oracle seems to be very sure it should spill to disk. I believe the key to answer is this immutable flag printing "BUFFER the output of the join for Parallel Query" - as I mentioned in one of previous posts it's opposite to "Not BUFFER(execution) output of the join for PQ" which appears in some traces found on internet.
Best regards
Bazyli
Maybe you are looking for
-
How can I unlink my iCloud and apple IDs?
When i signed up for an iCloud account, i linked it to my apply ID. Now, knowing that I can't make the iCloud my primary email, I want to separate the two. Unfortunately, I havent had any luck. Every time that I remove the email, and try to log into
-
IPod playlists & crossfade playback
I have created some great playlists in ITunes using the crossfade playback function. Does anyone know whether I can: 1) Transfer the playlists to my iPod with the crossfade function intact? 2) If not, any suggestions for how I can burn the playlist t
-
Hi Guru, my client is asking mismatching of vat amount that is - When generate annual VAT report containing assessable value along with tax values for a particular year which should mat
-
Synching email from MBA to MBP
I'm thinking of getting a MacBook Air for some extensive travelling. My email address is POP3 not IMAP, and when I return from traveling I'd like to get all the emails I've sent and received off the MBA and onto my main computer (MBP). What's the eas
-
Hi Friends.. I have mac book pro 2013. Today when I started its got suddenly turn off and also not turn on. The mac does not charged also not turn on. charger define only green light. I reset SMC but its not work. Please If any one have Idea.