Nested loop vs Hash Join

Hi,
Both queries return the same results, but the first query uses a hash join and the second uses nested loops. How? Please explain.
select *
from emp a,dept b
where a.deptno=b.deptno and b.deptno>20;
6 rows
Plan hash value: 4102772462
| Id | Operation                    | Name    | Rows | Bytes | Cost (%CPU)| Time     |
|   0 | SELECT STATEMENT             |         |     6 |   348 |     6 (17)| 00:00:01 |
|* 1 | HASH JOIN                   |         |     6 |   348 |     6 (17)| 00:00:01 |
|   2 |   TABLE ACCESS BY INDEX ROWID| DEPT    |     3 |    60 |     2   (0)| 00:00:01 |
|* 3 |    INDEX RANGE SCAN          | PK_DEPT |     3 |       |     1   (0)| 00:00:01 |
|* 4 |   TABLE ACCESS FULL          | EMP     |     7 |   266 |     3   (0)| 00:00:01 |
Predicate Information (identified by operation id):
   1 - access("A"."DEPTNO"="B"."DEPTNO")
   3 - access("B"."DEPTNO">20)
   4 - filter("A"."DEPTNO">20)
select *
from emp a,dept b
where a.deptno=b.deptno and b.deptno=30;
6 rows
Plan hash value: 568005898
| Id | Operation                    | Name    | Rows | Bytes | Cost (%CPU)| Time     |
|   0 | SELECT STATEMENT             |         |     5 |   290 |     4   (0)| 00:00:01 |
|   1 | NESTED LOOPS                |         |     5 |   290 |     4   (0)| 00:00:01 |
|   2 |   TABLE ACCESS BY INDEX ROWID| DEPT    |     1 |    20 |     1   (0)| 00:00:01 |
|* 3 |    INDEX UNIQUE SCAN         | PK_DEPT |     1 |       |     0   (0)| 00:00:01 |
|* 4 |   TABLE ACCESS FULL          | EMP     |     5 |   190 |     3   (0)| 00:00:01 |
Predicate Information (identified by operation id):
   3 - access("B"."DEPTNO"=30)
   4 - filter("A"."DEPTNO"=30)

Hi,
Unless specifically instructed otherwise, Oracle picks the best execution plan based on estimates of table sizes, column selectivity and many other variables. Even though Oracle does its best to make the estimates as accurate as possible, they are frequently different, and in some cases quite different, from the actual values.
In the first query, Oracle estimated that the predicate "b.deptno>20" would limit the number of records to 6, and based on that it decided to use a hash join.
In the second query, Oracle estimated that the predicate "b.deptno=30" would limit the number of records to 5, and based on that it decided to use a nested loops join.
The fact that the actual number of records is the same is irrelevant, because Oracle used the estimate, rather than the actual number of records, to pick the best plan.
HTH,
Iordan
Iotzov

Similar Messages

Why optimizer prefers nested loop over hash join?

What do I look for if I want to find out why the server prefers a nested loop over hash join?
The server is 10.2.0.4.0.
The query is:
SELECT p.*
    FROM t1 p, t2 d
    WHERE d.emplid = p.id_psoft
      AND p.flag_processed = 'N'
      AND p.desc_pool = :b1
      AND NOT d.name LIKE '%DUPLICATE%'
      AND ROWNUM < 2
tkprof output is:
Production
call     count       cpu    elapsed       disk      query    current        rows
Parse        1      0.01       0.00          0          0          4           0
Execute      1      0.00       0.01          0          4          0           0
Fetch        1    228.83     223.48          0    4264533          0           1
total        3    228.84     223.50          0    4264537          4           1
Misses in library cache during parse: 1
Optimizer mode: ALL_ROWS
Parsing user id: 108 (SANJEEV)
Rows     Row Source Operation
      1 COUNT STOPKEY (cr=4264533 pr=0 pw=0 time=223484076 us)
      1   NESTED LOOPS (cr=4264533 pr=0 pw=0 time=223484031 us)
10401    TABLE ACCESS FULL T1 (cr=192 pr=0 pw=0 time=228969 us)
      1    TABLE ACCESS FULL T2 (cr=4264341 pr=0 pw=0 time=223182508 us)
Development
call     count       cpu    elapsed       disk      query    current        rows
Parse        1      0.01       0.00          0          0          0           0
Execute      1      0.00       0.01          0          4          0           0
Fetch        1      0.05       0.03          0        512          0           1
total        3      0.06       0.06          0        516          0           1
Misses in library cache during parse: 1
Optimizer mode: ALL_ROWS
Parsing user id: 113 (SANJEEV)
Rows     Row Source Operation
      1 COUNT STOPKEY (cr=512 pr=0 pw=0 time=38876 us)
      1   HASH JOIN (cr=512 pr=0 pw=0 time=38846 us)
     51    TABLE ACCESS FULL T2 (cr=492 pr=0 pw=0 time=30230 us)
    861    TABLE ACCESS FULL T1 (cr=20 pr=0 pw=0 time=2746 us)

sanjeevchauhan wrote:
What do I look for if I want to find out why the server prefers a nested loop over hash join?
The server is 10.2.0.4.0.
The query is:
SELECT p.*
FROM t1 p, t2 d
WHERE d.emplid = p.id_psoft
AND p.flag_processed = 'N'
AND p.desc_pool = :b1
AND NOT d.name LIKE '%DUPLICATE%'
AND ROWNUM < 2
You've already had some suggestions, but the most straightforward way is to run the unhinted statement in both environments and then force the join and access methods you would like to see using hints. In your case, "USE_HASH(P D)" in your production environment and "FULL(P) FULL(D) USE_NL(P D)" in your development environment should probably be sufficient to see the costs and estimates the optimizer returns for the alternate access and join patterns.
This gives you a first indication of why the optimizer thinks that the access path chosen in production is cheaper, even though that plan is obviously less efficient.
As Hemant already mentioned, using bind variables complicates things a bit: EXPLAIN PLAN is not reliable here because bind variable peeking is performed when the statement is executed, but not when it is explained.
Since you're already on 10g, you can get the actual execution plan used for all four variants with DBMS_XPLAN.DISPLAY_CURSOR, which tells you more about the estimates and costs assigned than the "Row Source Operation" section of the TKPROF output.
Of course the result of your whole exercise might be highly dependent on the actual bind variable value used.
By the way, your statement is questionable in principle since you're querying for the first row of a non-deterministic result set. It's not deterministic because you've defined no particular order, so depending on how Oracle executes the statement and on the physical storage of your data, this query might return different results on different runs.
This is either an indication of a bad design (if the query is supposed to return exactly one row then you don't need the ROWNUM restriction) or an incorrect attempt at a Top 1 query, which requires you to specify an order somehow, either by adding an ORDER BY to the statement and wrapping it in an inline view, or e.g. by using analytic functions that allow you to specify a RANK by a defined ORDER.
This is an example of what a deterministic Top N query could look like:
SELECT *
FROM (
SELECT p.*
    FROM t1 p, t2 d
    WHERE d.emplid = p.id_psoft
      AND p.flag_processed = 'N'
      AND p.desc_pool = :b1
      AND NOT d.name LIKE '%DUPLICATE%'
ORDER BY <order_criteria>
)
WHERE ROWNUM <= 1;
Regards,
Randolf
Oracle related stuff blog:
http://oracle-randolf.blogspot.com/
SQLTools++ for Oracle (Open source Oracle GUI for Windows):
http://www.sqltools-plusplus.org:7676/
http://sourceforge.net/projects/sqlt-pp/

Generally when does optimizer use nested loop and Hash joins ?

Version: 11.2.0.3, 10.2
Let's say I have two tables called ORDER and ORDER_DETAIL.
ORDER_DETAIL is the child table of ORDER.
This is what I understand about Nested Loop:
When we join ORDER AND ORDER_DETAIL tables oracle will form a 'nested loop' in which for each order_ID in ORDER table (outer loop), oracle will look for corresponding multiple ORDER_IDs in the ORDER_DETAIL table.
Is nested loop used when the driving table (ORDER in this case) is smaller than the child table (ORDER_DETAIL) ?
Is nested loop more likely to use Indexes in general ?
How will these two tables be joined using Hash joins ?
When is it ideal to use hash joins ?

Your description of a nested loop is correct.
The overall rule is that Oracle will use the plan that it calculates to be, in general, fastest. That mostly means fewest I/Os, but there are various factors that adjust its calculation, e.g. it expects index blocks to be cached, multiple entries in an index may reference the same block, and full scans get multiple blocks per I/O. It also has CPU cost calculations, but they generally only become significant with function / package calls or domain indexes (spatial, text, ...).
A nested loop with an index will require one indexed read of the child table per outer table row, so its I/O cost is roughly twice the number of rows expected to match the where clause conditions on the outer table.
A hash join reads the rows of the smaller table into a hash table, then matches the rows from the larger table against the hash table, so its I/O cost is the cost of a full scan of each table (unless the smaller table is too big to fit in a single in-memory hash table). Hash joins generally don't use indexes, in the sense that a hash join doesn't use the index to look up each result. It can use an index as a data source, as a narrow version of the table or as a way to identify the rows satisfying the other where clause conditions.
If you are processing the whole of both tables, Oracle is likely to use a hash join, and be very fast and efficient about it.
If your where clause restricts it to just a few rows from the parent table and a few corresponding rows from the child table, and you have an index, Oracle is likely to use a nested loops solution.
If the tables are very small, either plan is efficient - you may be surprised by its choice.
Don't worry about plans with full scans and hash joins - they are usually very fast and efficient. Often bad performance comes from having to do nested loop lookups for lots of rows.
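To make the two join methods described above concrete, here is a minimal Python sketch. It is only an illustration of the general algorithms, not Oracle's implementation; the sample rows and the function names (build_index, nested_loop_join, hash_join) are made up for this example, and the dictionary returned by build_index merely stands in for an index on the child table.

```python
from collections import defaultdict

# Hypothetical sample data: (order_id, customer) and (order_id, item).
orders = [(1, "alice"), (2, "bob"), (3, "carol")]
order_details = [(1, "pen"), (1, "ink"), (2, "paper"), (4, "orphan")]


def build_index(rows):
    """Map order_id -> detail rows; stands in for an index on the child table."""
    index = defaultdict(list)
    for row in rows:
        index[row[0]].append(row)
    return index


def nested_loop_join(outer, inner_index):
    """For each outer row, do one indexed lookup into the child table."""
    result = []
    for order_id, customer in outer:
        for _, item in inner_index.get(order_id, []):
            result.append((order_id, customer, item))
    return result


def hash_join(small, large):
    """Build a hash table on the smaller input, then scan the larger input once."""
    hash_table = defaultdict(list)
    for order_id, customer in small:
        hash_table[order_id].append(customer)
    result = []
    for order_id, item in large:
        for customer in hash_table.get(order_id, []):
            result.append((order_id, customer, item))
    return result


nl_rows = nested_loop_join(orders, build_index(order_details))
hj_rows = hash_join(orders, order_details)
print(sorted(nl_rows) == sorted(hj_rows))  # both methods return the same rows
```

Both functions produce the same joined rows; they differ only in how many lookups or scans they need, which is what the optimizer's cost estimate is trying to capture.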

Oracle 11g - Nested loops on outer joins

Hello,
I have a select query that was working with no problems. The results are used to insert data into a temp table.
Recently, it would not complete executing. The explain plan shows a cartesian. But, there could be problems with using nested loops on the outer join. Interestingly, when I copy production code and rename the temp table and rename the view, it works.
Can someone take a look at the code and help. Maybe offer a suggestion on tuning too? Thanks.
CREATE TABLE "CT"
( "TN" VARCHAR2(30) NOT NULL ENABLE,
"COL_NAME" VARCHAR2(30) NOT NULL ENABLE,
"CDE" VARCHAR2(5) NOT NULL ENABLE,
"CDE_DESC" VARCHAR2(80) NOT NULL ENABLE,
"CDE_STAT" CHAR(1));
insert into CT (TN, COL_NAME, CDE, CDE_DESC, CDE_STAT)
values ('INDSD', 'STCD', 'U', 'RF', 'A');
insert into CT (TN, COL_NAME, CDE, CDE_DESC, CDE_STAT)
values ('AT', 'TCD', '001', 'RL', 'A');
insert into CT (TN, COL_NAME, CDE, CDE_DESC, CDE_STAT)
values ('AT', 'TCD', '033', 'PFR', 'A');
CREATE TABLE "IPP"
( "IND_ID" NUMBER(9,0) NOT NULL ENABLE,
"PLCD" VARCHAR2(5) NOT NULL ENABLE,
"CBCD" VARCHAR2(5));
insert into IPP (IND_ID, PLCD, CBCD)
values (2007, 'AS', '04');
insert into IPP (IND_ID, PLCD, CBCD)
values (797098, 'AS', '34');
insert into IPP (IND_ID, PLCD, CBCD)
values (797191, 'AS','04');
CREATE TABLE "INDS"
( "OPCD" VARCHAR2(5) NOT NULL ENABLE,
"IND_ID" NUMBER(9,0) NOT NULL ENABLE,
"IND_CID" NUMBER(*,0),
"GFLG" VARCHAR2(1),
"HHID" NUMBER(9,0),
"DOB" DATE,
"DOB_FLAG" VARCHAR2(1),
"VCD" VARCHAR2(5),
"VTDTE" DATE,
"VPPCD" VARCHAR2(4),
"VRCDTE" DATE NOT NULL ENABLE,
"VDSID" NUMBER(9,0),
"VTRANSID" NUMBER(12,0),
"VOWNCD" VARCHAR2(5),
"RCDTE" DATE,
"LRDTE" DATE);
insert into INDS (OPCD, IND_ID, IND_CID, GFLG, HHID, DOB, DOB_FLAG, VCD, VTDTE, VPPCD, VRCDTE, VDSID, VTRANSID, VOWNCD, RCDTE, LRDTE)
values ('USST', 2007, 114522319, '', 304087673, to_date('01-01-1980', 'dd-mm-yyyy'), 'F', '2', to_date('06-04-2011 09:21:37', 'dd-mm-yyyy hh24:mi:ss'), '', to_date('06-04-2011 09:21:37', 'dd-mm-yyyy hh24:mi:ss'), 1500016, null, 'USST', to_date('06-04-2011 09:21:37', 'dd-mm-yyyy hh24:mi:ss'), to_date('18-07-2012 21:52:53', 'dd-mm-yyyy hh24:mi:ss'));
insert into INDS (OPCD, IND_ID, IND_CID, GFLG, HHID, DOB, DOB_FLAG, VCD, VTDTE, VPPCD, VRCDTE, VDSID, VTRANSID, VOWNCD, RCDTE, LRDTE)
values ('USST', 304087678, 115242519, '', 304087678, to_date('01-01-1984', 'dd-mm-yyyy'), 'F', '2', to_date('06-04-2011 09:21:39', 'dd-mm-yyyy hh24:mi:ss'), '', to_date('06-04-2011 09:21:39', 'dd-mm-yyyy hh24:mi:ss'), 1500016, null, 'USST', to_date('06-04-2011 09:21:39', 'dd-mm-yyyy hh24:mi:ss'), to_date('18-07-2012 21:52:53', 'dd-mm-yyyy hh24:mi:ss'));
CREATE TABLE "INDS_TYPE"
( "IND_ID" NUMBER(9,0) NOT NULL ENABLE,
"STCD" VARCHAR2(5) NOT NULL ENABLE);
insert into INDS_type (IND_ID, STCD)
values (2007, 'U');
insert into INDS_type (IND_ID, STCD)
values (313250322, 'U');
insert into INDS_type (IND_ID, STCD)
values (480058122, 'U');
CREATE TABLE "PLOP"
( "OPCD" VARCHAR2(5) NOT NULL ENABLE,
"PLCD" VARCHAR2(5) NOT NULL ENABLE,
"PPLF" VARCHAR2(1));
insert into PLOP (OPCD, PLCD, PPLF)
values ('USST', 'SP', 'Y');
insert into PLOP (OPCD, PLCD, PPLF)
values ('PMUSA', 'ST', '');
insert into PLOP (OPCD, PLCD, PPLF)
values ('USST', 'RC', '');
CREATE TABLE "IND_T"
( "OPCD" VARCHAR2(5) NOT NULL ENABLE,
"CID" NUMBER(9,0) NOT NULL ENABLE,
"CBCD" VARCHAR2(5),
"PF" VARCHAR2(1) NOT NULL ENABLE,
"DOB" DATE,
"VCD" VARCHAR2(5),
"VOCD" VARCHAR2(5),
"IND_CID" NUMBER,
"RCDTE" DATE NOT NULL ENABLE);
insert into IND_T (OPCD, CID, CBCD,PF, DOB, VCD, VOCD, IND_CID, RCDTE)
values ('JMC', 2007, '04', 'F',to_date('11-10-1933', 'dd-mm-yyyy'), '2', 'PMUSA', 363004880, to_date('30-09-2009 04:31:34', 'dd-mm-yyyy hh24:mi:ss'));
insert into IND_T (OPCD, CID, CBCD,PF, DOB, VCD, VOCD, IND_CID, RCDTE)
values ('JMC', 2008, '04', 'N',to_date('01-01-1980', 'dd-mm-yyyy'), '2', 'PMUSA', 712606335, to_date('05-04-2013 19:36:05', 'dd-mm-yyyy hh24:mi:ss'));
CREATE TABLE "IC"
( "CID" NUMBER(9,0) NOT NULL ENABLE,
"CF" CHAR(1));
insert into IC (CID, CF)
values (2007, 'N');
insert into IC (CID, CF)
values (100, 'N');
insert into IC (CID, CF)
values (200, 'N');
CREATE OR REPLACE FORCE VIEW "INDSS_V" ("OPCD", "IND_ID", "IND_CID", "GFLG", "HHID", "DOB", "DOB_FLAG", "VCD", "VTDTE", "VPPCD", "VRCDTE", "VDSID", "VTRANSID", "VOWNCD", "RCDTE", "LRDTE") AS
SELECT DISTINCT a.OPCD, a.IND_ID, a.IND_CID, a.GFLG, a.HHID,
a.DOB, a.DOB_flag, a.VCD, a.VTDTE,
a.VPPCD, a.VRCDTE, a.VDSID, a.VTRANSID,
a.VOWNCD, a.RCDTE, a.LRDTE
FROM INDS a, INDS_type b
WHERE a.IND_ID = b.IND_ID
AND b.STCD in (select CDE
from CT --database link
where TN = 'INDSD'
and COL_NAME = 'STCD'
and CDE_STAT = 'A') ;
--insert /*+ parallel(IND_T,2) */ into IND_T
select /*+ parallel(a,4) */
a.OPCD as OPCD
, a.IND_ID as CID
, b.CBCD as CBCD
, NULL as BFCD
, 'N' as PF
, a.DOB as DOB
, a.VCD as VCD
, a.VOWNCD as VOCD
, a.IND_CID as IND_CID
, a.RCDTE as RCDTE
from INDSS_V a
, (select /*+ parallel(IPP,4) */ * from IPP IPP , PLOP PLO
where plo.PLCD = ipp.PLCD
and PPLF='Y') b
, IC c
where a.IND_ID = b.IND_ID (+)
and a.OPCD = b.OPCD (+)
and a.IND_ID = c.CID
and c.CF = 'N';

Please consult
HOW TO: Post a SQL statement tuning request - template posting
Also format your code and post it using the [ code ] and [ /code ] tags. (Leave out the extra space after [ and before ])
Sybrand Bakker
Senior Oracle DBA

Hash join vs nested loop

DECLARE @tableA table (Productid varchar(20),Product varchar(20),RateID int)
insert into @tableA values('1','Mobile',2);
insert into @tableA values('2','Chargers',4);
insert into @tableA values('3','Stand',6);
insert into @tableA values('4','Adapter',8);
insert into @tableA values('5','Cover',10);
insert into @tableA values('6','Protector',12);
--SELECT * FROM @tableA
DECLARE @tableB table (id varchar(20),RateID int,Rate int)
insert into @tableB values('1',2,200);
insert into @tableB values('2',4,40);
insert into @tableB values('3',6,60);
insert into @tableB values('4',8,80);
insert into @tableB values('5',10,10);
insert into @tableB values('6',12,15);
--SELECT * FROM @tableB
SELECT Product,Rate
FROM @tableA a
JOIN @tableB b ON a.RateID = b.RateID
Above is the sample query; its execution plan shows a Hash Match (Inner Join). How do I change it to a Nested Loop without changing the query? Please help.

Is Hash Match(inner join) or Nested loop is better to have in the query?
That depends on the size of the tables, available indexes etc. The optimizer will (hopefully) make the best choice.
Above is the sample query, where in execution plan it shows the Hash Match (inner Join). Now how do I change it to Nested Loop with out changing the query?
The answer is that you should leave that to the optimizer in most cases.
I see that the logical read for nested loop is higher than Hash Match.
But Hash Match tends to need more CPU. The best way to compare two queries or plans is wall-clock time.
On big tables, how do we reduce the logical reads?
Make sure that there are usable indexes.
Erland Sommarskog, SQL Server MVP

HASH JOIN or NESTED LOOP

I've been asked to check if HASH JOIN is more suitable than NESTED LOOP(which CBO chose by default) for the following query.
SELECT CM_DETAILS.TASK_ID FROM GEN_TYPE, CM_DETAILS WHERE ( ( ( ( ( CM_DETAILS.STAT_CODE < 8 ) AND ( GEN_TYPE.TASK_ID = CM_DETAILS.TASK_ID ) ) AND ( GEN_TYPE.DEST_LOCN_ID = 5 ) ) AND ( GEN_TYPE.COM_ID = 7 ) ) AND ( ( ( CM_DETAILS.CASE_NO = 1 ) OR ( CM_DETAILS.CASE_NO = 3 ) ) OR ( CM_DETAILS.CASE_NO = 9 ) ) )
Both GEN_TYPE and CM_DETAILS tables have over 330,000 rows.
Version: 10g R2
Any thoughts?

gintsp gave you a very nice tip, but there is also an initialization parameter, OPTIMIZER_INDEX_COST_ADJ, that influences which access path the CBO chooses. Experts usually say, though, that changing an init parameter should be a last resort.
It has a default value of 100, which tells the CBO to evaluate index access paths at 100% of their regular cost; lower values make index access look cheaper relative to a full table scan.
SQL> column plan_plus_exp format a100
SQL> set linesize 1000
SQL> SET AUTOTRACE TRACEONLY
SQL> SELECT e.ename,d.dname
2    FROM emp e,dept d
3   WHERE e.deptno=d.deptno
4 /
14 rows selected.
Execution Plan
   0      SELECT STATEMENT Optimizer=ALL_ROWS (Cost=7 Card=10 Bytes=320)
   1    0   HASH JOIN (Cost=7 Card=10 Bytes=320)
   2    1     TABLE ACCESS (FULL) OF 'DEPT' (TABLE) (Cost=3 Card=5 Bytes=90)
   3    1     TABLE ACCESS (FULL) OF 'EMP' (TABLE) (Cost=3 Card=14 Bytes=196)
Statistics
        672 recursive calls
          0 db block gets
        151 consistent gets
         27 physical reads
          0 redo size
        793 bytes sent via SQL*Net to client
        508 bytes received via SQL*Net from client
          2 SQL*Net roundtrips to/from client
         15 sorts (memory)
          0 sorts (disk)
         14 rows processed
SQL> /
14 rows selected.
Execution Plan
   0      SELECT STATEMENT Optimizer=ALL_ROWS (Cost=7 Card=10 Bytes=320)
   1    0   HASH JOIN (Cost=7 Card=10 Bytes=320)
   2    1     TABLE ACCESS (FULL) OF 'DEPT' (TABLE) (Cost=3 Card=5 Bytes=90)
   3    1     TABLE ACCESS (FULL) OF 'EMP' (TABLE) (Cost=3 Card=14 Bytes=196)
Statistics
          0 recursive calls
          0 db block gets
         15 consistent gets
          0 physical reads
          0 redo size
        793 bytes sent via SQL*Net to client
        508 bytes received via SQL*Net from client
          2 SQL*Net roundtrips to/from client
          0 sorts (memory)
          0 sorts (disk)
         14 rows processed
SQL> SHOW PARAMETER optimizer
NAME                                 TYPE                             VALUE
optimizer_dynamic_sampling           integer                          2
optimizer_features_enable            string                           10.1.0
optimizer_index_caching              integer                          0
optimizer_index_cost_adj             integer                          100<--------
optimizer_mode                       string                           ALL_ROWS
SQL> ALTER SESSION SET optimizer_index_cost_adj=35
2 /
Session altered.
SQL> SELECT e.ename,d.dname
2    FROM emp e,dept d
3   WHERE e.deptno=d.deptno
4 /
14 rows selected.
Execution Plan
   0      SELECT STATEMENT Optimizer=ALL_ROWS (Cost=5 Card=10 Bytes=320)
   1    0   MERGE JOIN (Cost=5 Card=10 Bytes=320)
   2    1     TABLE ACCESS (BY INDEX ROWID) OF 'DEPT' (TABLE) (Cost=1 Card=5 Bytes=90)
   3    2       INDEX (FULL SCAN) OF 'DEPT_PRIMARY_KEY' (INDEX (UNIQUE)) (Cost=1 Card=5)
   4    1     SORT (JOIN) (Cost=4 Card=14 Bytes=196)
   5    4       TABLE ACCESS (FULL) OF 'EMP' (TABLE) (Cost=3 Card=14 Bytes=196)
Statistics
          1 recursive calls
          0 db block gets
         11 consistent gets
          1 physical reads
          0 redo size
        733 bytes sent via SQL*Net to client
        508 bytes received via SQL*Net from client
          2 SQL*Net roundtrips to/from client
          1 sorts (memory)
          0 sorts (disk)
         14 rows processed
SQL> /
14 rows selected.
Execution Plan
   0      SELECT STATEMENT Optimizer=ALL_ROWS (Cost=5 Card=10 Bytes=320)
   1    0   MERGE JOIN (Cost=5 Card=10 Bytes=320)
   2    1     TABLE ACCESS (BY INDEX ROWID) OF 'DEPT' (TABLE) (Cost=1 Card=5 Bytes=90)
   3    2       INDEX (FULL SCAN) OF 'DEPT_PRIMARY_KEY' (INDEX (UNIQUE)) (Cost=1 Card=5)
   4    1     SORT (JOIN) (Cost=4 Card=14 Bytes=196)
   5    4       TABLE ACCESS (FULL) OF 'EMP' (TABLE) (Cost=3 Card=14 Bytes=196)
Statistics
          0 recursive calls
          0 db block gets
         11 consistent gets
          0 physical reads
          0 redo size
        733 bytes sent via SQL*Net to client
        508 bytes received via SQL*Net from client
          2 SQL*Net roundtrips to/from client
          1 sorts (memory)
          0 sorts (disk)
         14 rows processed
Khurram
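As a rough illustration of why lowering optimizer_index_cost_adj can change the chosen plan, the following Python sketch treats the parameter as a simple percentage applied to the cost of an index access path. This is a toy model under that assumption only; the real optimizer calculation involves many more factors. The function names and the unadjusted index cost of 4 are hypothetical; the full scan cost of 3 is taken from the plans in this post.

```python
def adjusted_index_cost(regular_cost, optimizer_index_cost_adj=100):
    """Scale an index access path cost by the parameter, read as a percentage."""
    return regular_cost * optimizer_index_cost_adj / 100


def cheaper_path(index_cost, full_scan_cost, optimizer_index_cost_adj=100):
    """Return which access path looks cheaper after the adjustment (toy model)."""
    adjusted = adjusted_index_cost(index_cost, optimizer_index_cost_adj)
    return "index" if adjusted < full_scan_cost else "full scan"


FULL_SCAN_COST = 3   # Cost=3 reported for the full table scans in the plans above
INDEX_PATH_COST = 4  # hypothetical unadjusted cost, chosen only for illustration

default_choice = cheaper_path(INDEX_PATH_COST, FULL_SCAN_COST)      # parameter at 100
lowered_choice = cheaper_path(INDEX_PATH_COST, FULL_SCAN_COST, 35)  # parameter at 35
print(default_choice, lowered_choice)
```

With these made-up numbers the full scan wins at the default setting and the index path wins at 35, which mirrors the direction of the plan change in the session, though not its exact arithmetic.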

Nested loop join v/s Sort merge

I have seen that nested loops work better if the inner table is indexed, because for each outer table row we look for a match in the inner table. But is there any case where the optimizer still chooses a nested loop even if there is no index on the inner table? That is my first question.
My second question: when doing a sort merge join, Oracle has to sort both row sets and then merge them. Oracle says that performance is definitely better if both row sets are already sorted, which is obvious. But back to my first question: when there is no index on the inner table, is that the situation in which Oracle chooses a sort merge join?

My response should really have examples, but since I do not have any handy I will just address your first question. If there is no index available from table A to table B then yes, a nested loop join may still be used, with table B read via a full table scan inside the nested loop. If table B is very small and consists of only a block or two, this may be a relatively efficient plan. It is more likely that you would see table B full scanned and the result fed into a hash join, but I have seen the plan you mention.
Back before hash joins were introduced with 7.3 (if my memory is correct) you would see sort/merge joins used more often than you do now. Generally speaking, no index on the join conditions would exist for this option to be chosen.
If you really want to know why, and sometimes what, the optimizer is going to do, buy Jonathan Lewis's book Cost-Based Oracle Fundamentals. It explains the optimizer in more depth than any other source I know of.
HTH -- Mark D Powell --

Help tuning NESTED LOOPS OUTER joins

Hello,
I have inherited this nasty query (below) that is taking an awful time to complete (more than 2 hrs a day)
The worst bit is that I need to outer join my fact table so many times, as I need bits and pieces from other tables/mviews.
When I look at the explain plan I see that this situation means that the cbo is doing several NESTED LOOPS OUTER join operations. I understand that these nested loops mean going through every row in my primary table to see if there is a match in the secondary table (much smaller) which makes it extremely inefficient, is this right?
The stats on the tables are all refreshed daily.
Any ideas on how I can improve the performance here?
Thanks in advance!
The query:
explain plan for
SELECT x.user_id AS user_id,
x.login_name AS login_name,
c.date_of_birth AS date_of_birth,
x.registration_site AS registration_site,
x.organisation AS organisation,
c.user_title AS user_title,
c.first_name AS first_name,
c.last_name AS last_name,
x.email_address AS email_address,
x.user_status AS user_status,
x.user_privilege AS user_access_privilege,
x.date_registration AS date_registration,
x.affiliate_id AS affiliate_id,
x.mobile_number AS mobile_number,
x.optional_parameter AS vt_number,
gud.display_name AS chat_name,
REPLACE (s4.address_line_1, ',', '') AS address_line_1,
REPLACE (s4.address_line_2, ',', '') AS address_line_2,
REPLACE (s4.town, ',', '') AS town,
REPLACE (s4.county, ',', '') AS county,
REPLACE (s4.postcode, ',', '') AS postcode,
s4.country AS country,
s3.last_login AS last_login_date,
x.email_send_newsletter AS email_send_newsletter,
x.email_give_details_thirdparty AS email_give_details_thirdparty,
NVL (ia.cash_balance, 0) AS current_cash_balance,
NVL (ia.bonus_balance, 0) AS current_bonus_balance,
x.external_affiliate_id AS external_affiliate_id,
r.currency_code AS currency,
NVL (ia.points_balance, 0) AS current_loyalty_points_balance,
p.status AS buyer_status,
NVL (ia.bi_bonus_balance, 0) AS current_bi_bonus_balance,
NVL (ia.pending_balance, 0) AS current_pending_balance,
l.level_name AS current_loyalty_level,
l.date_level_achieved AS date_level_achieved,
NVL (l.current_period_loyalty_points, 0) AS current_period_loyalty_points,
r.region AS user_region,
x.registration_platform AS registration_platform,
x.external_user_name AS external_user_name,
c.home_number AS home_number,
pr.code AS reg_promo_code,
g.date_first_buy AS date_first_buy
FROM gl_user_registrations x,
gl_region r,
MVW_USER_BALANCES ia,
gl_customers c,
gl_user_display_names gud,
gl_user_last_login s3,
(SELECT z.user_id AS user_id,
z.address_line_1 AS address_line_1,
z.address_line_2 AS address_line_2,
z.town AS town,
z.county AS county,
z.postcode AS postcode,
z.country AS country
FROM gl_user_addresses z
WHERE z.is_current = 1) s4,
gl_user_buyer_mapping upm,
gl_buyer p,
mvw_user_loyalty_points l,
MVW_USER_PROMO_CODE_REG pr,
MVW_USER_FIRST_BUY_DATE g
WHERE x.base_region = r.region
AND x.user_id = ia.user_id (+)
AND x.customer_id = c.customer_id(+)
AND x.user_id = gud.user_id (+)
AND x.user_id = s4.user_id (+)
AND x.user_id = s3.user_id (+)
AND x.user_id = upm.user_id (+)
AND upm.buyer_id = p.buyer_id
AND x.user_id = l.user_id (+)
AND x.user_id = pr.user_id (+)
AND x.user_id = g.user_id (+);
select * from table(dbms_xplan.display);
Plan hash value: 2158171613
| Id | Operation | Name | Rows | Bytes | Cost (%CPU)| Time |
| 0 | SELECT STATEMENT | | 100 | 63100 | 135 (1)| 00:00:01 |
| 1 | NESTED LOOPS OUTER | | 100 | 63100 | 135 (1)| 00:00:01 |
| 2 | NESTED LOOPS OUTER | | 100 | 60600 | 120 (1)| 00:00:01 |
| 3 | NESTED LOOPS OUTER | | 100 | 57100 | 105 (1)| 00:00:01 |
| 4 | NESTED LOOPS OUTER | | 100 | 55400 | 90 (2)| 00:00:01 |
| 5 | NESTED LOOPS OUTER | | 100 | 53600 | 70 (2)| 00:00:01 |
|* 6 | HASH JOIN | | 100 | 47000 | 55 (2)| 00:00:01 |
| 7 | TABLE ACCESS FULL | GL_REGION | 18 | 252 | 2 (0)| 00:00:01 |
| 8 | NESTED LOOPS OUTER | | 100 | 22800 | 52 (0)| 00:00:01 |
| 9 | NESTED LOOPS OUTER | | 100 | 19700 | 47 (0)| 00:00:01 |
| 10 | NESTED LOOPS OUTER | | 100 | 17600 | 37 (0)| 00:00:01 |
| 11 | NESTED LOOPS | | 100 | 15800 | 27 (0)| 00:00:01 |
| 12 | NESTED LOOPS | | 102 | 2754 | 17 (0)| 00:00:01 |
| 13 | TABLE ACCESS FULL | GL_BUYER | 6143K| 64M| 2 (0)| 00:00:01 |
| 14 | TABLE ACCESS BY INDEX ROWID| GL_USER_BUYER_MAPPING | 1 | 16 | 1 (0)| 00:00:01 |
|* 15 | INDEX RANGE SCAN | GL_USER_BUYER_MAPPPING_IX | 1 | | 1 (0)| 00:00:01 |
| 16 | TABLE ACCESS BY INDEX ROWID | GL_USER_REGISTRATIONS | 1 | 131 | 1 (0)| 00:00:01 |
|* 17 | INDEX UNIQUE SCAN | PK_GL_USER_REGISTRATIONS | 1 | | 1 (0)| 00:00:01 |
| 18 | TABLE ACCESS BY INDEX ROWID | GL_USER_LAST_LOGIN | 1 | 18 | 1 (0)| 00:00:01 |
|* 19 | INDEX UNIQUE SCAN | GL_USER_LAST_LOGIN_PK | 1 | | 1 (0)| 00:00:01 |
| 20 | TABLE ACCESS BY INDEX ROWID | GL_USER_DISPLAY_NAMES | 1 | 21 | 1 (0)| 00:00:01 |
|* 21 | INDEX UNIQUE SCAN | PK_GL_USER_DISPLAY_NAMES | 1 | | 1 (0)| 00:00:01 |
| 22 | TABLE ACCESS BY INDEX ROWID | GL_CUSTOMERS | 1 | 31 | 1 (0)| 00:00:01 |
|* 23 | INDEX UNIQUE SCAN | PK_GL_CUSTOMERS | 1 | | 1 (0)| 00:00:01 |
|* 24 | TABLE ACCESS BY INDEX ROWID | GL_USER_ADDRESSES | 1 | 66 | 1 (0)| 00:00:01 |
|* 25 | INDEX RANGE SCAN | IX_GL_USER_ADDRESSES1 | 1 | | 1 (0)| 00:00:01 |
| 26 | MAT_VIEW ACCESS BY INDEX ROWID | MVW_USER_FIRST_BUY_DATE | 1 | 18 | 1 (0)| 00:00:01 |
|* 27 | INDEX RANGE SCAN | MVW_USER_FS_DATE_IDX | 1 | | 1 (0)| 00:00:01 |
| 28 | MAT_VIEW ACCESS BY INDEX ROWID | MVW_USER_PROMO_CODE_REG | 1 | 17 | 1 (0)| 00:00:01 |
|* 29 | INDEX RANGE SCAN | MVW_USER_PROMO_CODE_IDX | 1 | | 1 (0)| 00:00:01 |
| 30 | MAT_VIEW ACCESS BY INDEX ROWID | MVW_USER_LOYALTY_POINTS | 1 | 35 | 1 (0)| 00:00:01 |
|* 31 | INDEX RANGE SCAN | MVW_USER_LYP_IDX | 1 | | 1 (0)| 00:00:01 |
| 32 | MAT_VIEW ACCESS BY INDEX ROWID | MVW_USER_BALANCES | 1 | 25 | 1 (0)| 00:00:01 |
|* 33 | INDEX RANGE SCAN | MVW_USER_BALANCES_IDX | 1 | | 1 (0)| 00:00:01 |
Predicate Information (identified by operation id):
6 - access("X"."BASE_REGION"="R"."REGION")
15 - access("UPM"."BUYER_ID"="P"."BUYER_ID")
17 - access("X"."USER_ID"="UPM"."USER_ID")
19 - access("X"."USER_ID"="S3"."USER_ID"(+))
21 - access("X"."USER_ID"="GUD"."USER_ID"(+))
23 - access("X"."CUSTOMER_ID"="C"."CUSTOMER_ID"(+))
24 - filter("Z"."IS_CURRENT"(+)=1)
25 - access("X"."USER_ID"="Z"."USER_ID"(+))
27 - access("X"."USER_ID"="G"."USER_ID"(+))
29 - access("X"."USER_ID"="PR"."USER_ID"(+))
31 - access("X"."USER_ID"="L"."USER_ID"(+))
33 - access("X"."USER_ID"="IA"."USER_ID"(+))

Hi,
1) What you are saying about nested loops is true about any join (except, of course, cartesian joins): you are taking rows from rowsource A and find matching rows from rowsource B. This doesn't make a join method efficient or inefficient.
2) The plan you posted does not indicate any performance problem whatsoever. I know you have one, but it's not possible to address it without having any information about it. Trace it, get a dbms_xplan.display_cursor dump with rowsource stats, or a real-time SQL monitoring report (if your version and license allow it) and post the results here; then we'd be able to help.
3) One efficient way to perform queries of your type (big fact table joined to a bunch of small dimension tables) is star transformation, but there are certain pre-requisites for that (like bitmap indexes on FK constraints) -- please read the documentation on star queries/transformations and see if that is an option for you
Best regards,
Nikolay

When does oracle use a complete nested loop join?

Hi!
Does Oracle Database use a complete nested loop join? I mean, imagine 2 tables without any indexes. Is there any case where, for each row in the outer table, Oracle does a complete scan of the inner table? I know that this is the original algorithm for the nested loop join, but some databases prefer to make a temp table to auto-index the inner table and never make the complete scan of the inner table.
thanks!!

user12040235 wrote:
If the tables do not have indexes, some databases prefer to scan the inner table once to index all values, and then, for every row in the outer loop table, do an index search.
I would just like to know whether Oracle does the same thing, or whether it does the complete scan.
If you have two tables without indexes, Oracle may consider scanning one table, extracting the smallest data set it can get away with, and then building a hash table of that data set (rather than creating an in-memory copy with an index). At this point Oracle can then do a nested loop join into the in-memory hash table.
However, this is called a hash join, and the order of tables will appear to be reversed, viz:
nested loop
    table scan full ABC
    table scan full XYZ
becomes
hash join
    table scan full XYZ
    table scan full ABC
See: http://jonathanlewis.wordpress.com/2010/08/02/joins/ as a starting point if you want to read more on this topic.
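To make the difference concrete, here is a minimal sketch in Python (not Oracle internals) of the two strategies described above. The row layouts, the key name "k" and the sample data are invented for illustration; ABC and XYZ just mirror the table names in the plans.

```python
from collections import defaultdict

def nested_loop_join(outer_rows, inner_rows, key):
    """Naive nested loop: rescan the whole inner rowsource for every outer row."""
    result = []
    for o in outer_rows:
        for i in inner_rows:              # complete scan of the inner rows each time
            if o[key] == i[key]:
                result.append((o, i))
    return result

def hash_join(build_rows, probe_rows, key):
    """Hash join: scan the build rowsource once into an in-memory hash table,
    then probe that table once per row of the other rowsource."""
    table = defaultdict(list)
    for b in build_rows:                  # single scan of the build rowsource
        table[b[key]].append(b)
    result = []
    for p in probe_rows:                  # single scan of the probe rowsource
        for b in table.get(p[key], []):
            result.append((b, p))
    return result

# Hypothetical sample data
xyz = [{"k": 1, "x": "a"}, {"k": 2, "x": "b"}]
abc = [{"k": 1, "a": 10}, {"k": 1, "a": 11}, {"k": 3, "a": 12}]

nl = nested_loop_join(abc, xyz, "k")      # ABC drives, XYZ is rescanned per row
hj = hash_join(xyz, abc, "k")             # XYZ is built first, ABC probes it
```

Both calls return the same matches; the hash join lists XYZ first because it is the build input, which is why the table order in the plan appears reversed.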
Regards
Jonathan Lewis

Nested loop, merge join and hash join

Can anyone tell me the difference/relationship between nested loop, hash join and merge join? Thanks.

Check Oracle Performance Tuning Guide
13.6 Understanding Joins
http://download-west.oracle.com/docs/cd/B19306_01/server.102/b14211/optimops.htm#i51523

CBO (optimizer) nest-loop join question

OS: Red Hat Linux
DB: 11gR1
I have gotten two conflicting answers while reading books by Don Burleson and Dan Hotka. It has to do with the CBO and nested-joins:
One says the CBO will choose the 'smaller' table as the driving table, the other states that the 'larger' table will be the driving table. And both stick by this philosophy as the preferred goal of any SQL Tuning -- that is, one states that the 'smaller' table should be the driving table. The other says the 'larger' table should be the driving table.
I had always thought that the 'smaller' table should be the driving table, and that in a nested loop the driving table will likely not even use an index. Who is correct? (I am not going to say who said what, btw). :-)
But I got to let one of them know they got a 'typo' ... :-)
Thx.

user601798 wrote:
It is an over-simplistic scenario but, as I mentioned, if all other things are 'equal' -- which would include 'access time/work', then I think the small table as the driving table has the advantage.
It is not possible for "*all* other things to be equal". (my emphasis).
If by 'access time/work' you mean the total is the same then it doesn't matter which table is first, the time/work is the same either way round.
If you want to say that the 'access time/work' for acquiring the first rowsource is the same for both paths, and the 'access time/work' for acquiring related rows from the second table is the same FOR EACH DRIVING ROW, then the total 'access time/work' will be different, and it would be better to start with the smaller table. (The example by Salman Qureshi earlier in this thread would apply.)
On the other hand, and ignoring any idea of "all other things being equal", smaller tables tend to have smaller indexes, so if your smaller rowsource comes from a smaller table then acquiring those rows may be cheaper than acquiring rows from a larger table - which leads to the observation that (even with perfectly precise indexing):
    smaller number of rows * larger unit cost to find related rows
may produce a larger value than
    larger number of rows * smaller unit cost to find related rows
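As a quick arithmetic illustration of that last observation, here is a small Python sketch. The row counts and unit costs are made-up numbers chosen only to show that the product can go either way; they are not measurements from any real system.

```python
# Hypothetical figures: starting from the smaller rowsource means probing the
# larger table, whose bigger index makes each probe more expensive.
small_rows, cost_per_probe_into_large = 100, 4
large_rows, cost_per_probe_into_small = 300, 1

small_first = small_rows * cost_per_probe_into_large   # 100 * 4
large_first = large_rows * cost_per_probe_into_small   # 300 * 1

print("smaller table drives:", small_first)
print("larger table drives: ", large_first)
```

With these assumed numbers the smaller driving table does more total work (400 vs 300), so "smaller table first" cannot be treated as a universal rule.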
Regards
Jonathan Lewis
http://jonathanlewis.wordpress.com
http://www.jlcomp.demon.co.uk

Is merge join cartesian more CPU intensive than nested loop?

Hi,
just wondering which access method is more CPU intensive. Let's suppose we have the same 2 row sources and join them via merge join cartesian in one case and via nested loop in the other.
I know NL can be CPU intensive because of tight loop access, but what about MJC?
I can see a buffer sort, but I'm not sure whether that is CPU friendly.
Regards
GregG

Hi,
I think in your case it's more accurate to compare a NESTED LOOP (NL) to a MERGE JOIN (MJ), because CARTESIAN MERGE JOIN is a rather special case of MJ.
Merge join sorts its inputs before combining them, and it can be efficient when one or both of the inputs are already sorted.
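For readers who have not seen the technique, here is a minimal Python sketch of a sort-merge join. It only illustrates the general algorithm, not Oracle's implementation; the function name and the emp/dept sample rows are invented for the example.

```python
def sort_merge_join(left, right, key):
    """Sort both inputs on the join key, then walk them in step."""
    left = sorted(left, key=lambda r: r[key])     # cheap if already sorted
    right = sorted(right, key=lambda r: r[key])
    result = []
    i = j = 0
    while i < len(left) and j < len(right):
        lk, rk = left[i][key], right[j][key]
        if lk < rk:
            i += 1
        elif lk > rk:
            j += 1
        else:
            # Find the run of right rows sharing this key ...
            j_end = j
            while j_end < len(right) and right[j_end][key] == lk:
                j_end += 1
            # ... and pair it with every left row that has the same key.
            while i < len(left) and left[i][key] == lk:
                for r in right[j:j_end]:
                    result.append((left[i], r))
                i += 1
            j = j_end
    return result

# Hypothetical sample data
emp = [{"deptno": 30, "ename": "A"}, {"deptno": 10, "ename": "B"}, {"deptno": 30, "ename": "C"}]
dept = [{"deptno": 10, "dname": "X"}, {"deptno": 30, "dname": "Y"}, {"deptno": 40, "dname": "Z"}]

pairs = [(e["ename"], d["dname"]) for e, d in sort_merge_join(emp, dept, "deptno")]
```

The two sorts are where most of the extra work sits; once the inputs are ordered, the merge itself is a single pass over each side.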
Regarding your question (which is more CPU intensive):
1) if MERGE JOIN involves disk spills, then CPU is probably irrelevant, because disk operations are much more expensive
2) the amount of work to combine rowsources via a MJ depends on how well they are aligned with respect to each other, so I don't think it can be expressed via a simple formula.
For nested loops, the situation is much simpler: you don't need to do any special work to combine the rowsources, so the cost is just the cost of acquiring the outer rowsource plus the number of iterations times the cost of one iteration. If the data is read from disk, then CPU probably won't matter much; if most of the reads are logical ones, then CPU becomes a factor (it's hard to tell how much work the CPU will have to do per logical read because there are too many factors here: how many columns are accessed, how they are located within the block, whether any expensive math functions are applied to any of them, etc.)
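The nested loop formula above can be written out as a one-line function. This is only a sketch of the arithmetic; the function name and the input figures are hypothetical and the units are arbitrary.

```python
def nested_loop_cost(outer_cost, iterations, cost_per_iteration):
    """Cost of acquiring the outer rowsource plus one inner access per outer row."""
    return outer_cost + iterations * cost_per_iteration

# Assumed figures: 50 units to read the outer rowsource, 1,000 outer rows,
# 3 units of work for each probe into the inner rowsource.
total = nested_loop_cost(outer_cost=50, iterations=1000, cost_per_iteration=3)
print(total)
```

With these assumed inputs the per-iteration term dominates, which is why the number of outer rows matters so much for nested loops.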
Best regards,
Nikolay

Inner / outer table in nested loops join

I can't understand what 'inner' / 'outer'
table means in nested loops join operation.
please explain the exact meaning.
maybe i do not understand the nested loops
join itself. I tried to find the meanings
in Oracle manual, but I couldn't.

If I understand your question correctly: the outer table is the one the join loops over first (for example a master table with a primary key), and the inner table is the one that is searched for matching rows for each outer row (for example a detail table with a foreign key). For example, you have a customers table where each customer has many invoices.
Hope that answers your query.
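A small Python sketch of the customers/invoices example may help. Note that, as far as I know, in Oracle's terminology the outer table is simply the driving rowsource that is read first, whichever table that happens to be; the master/detail roles here are only one possible arrangement. All names and data below are invented, and the dictionary is just a stand-in for an index on the inner table's join column.

```python
customers = [{"cust_id": 1, "name": "Acme"}, {"cust_id": 2, "name": "Globex"}]
invoices = [{"inv_id": 100, "cust_id": 1},
            {"inv_id": 101, "cust_id": 1},
            {"inv_id": 102, "cust_id": 2}]

# Stand-in for an index on invoices.cust_id
invoice_index = {}
for inv in invoices:
    invoice_index.setdefault(inv["cust_id"], []).append(inv)

joined = []
for cust in customers:                                   # outer (driving) table: read once
    for inv in invoice_index.get(cust["cust_id"], []):   # inner table: probed once per outer row
        joined.append((cust["name"], inv["inv_id"]))
```

The two nested `for` loops are where the name comes from: the outer loop walks the outer table, and the inner loop fetches the matching inner rows for the current outer row.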

Query Degradation--Hash Join Degraded

Hi All,
I found one query degradation issue. I am on 10.2.0.3.0 (Sun OS) with optimizer_mode=ALL_ROWS.
This is a data warehouse db.
All 3 tables involved are partitioned tables (with daily partitions). Partitions are created in advance and ELT jobs load bulk data into the daily partitions.
I have checked that the CBO is not using the local indexes created on them, which I believe is appropriate, because when I used an INDEX hint the elapsed time increased.
I tried an index hint for each table one by one but didn't get any performance improvement.
Partitions are loaded daily and, after loading, partition-level stats are gathered with dbms_stats.
We are collecting stats at partition level (granularity=>'PARTITION'). Even after collecting global stats, there is no change in the access pattern. The stats gather command is given below.
PROCEDURE gather_table_part_stats(i_owner_name,i_table_name,i_part_name,i_estimate:= DBMS_STATS.AUTO_SAMPLE_SIZE, i_invalidate IN VARCHAR2 := 'Y',i_debug:= 'N')
Only SOT_KEYMAP.IPK_SOT_KEYMAP is GLOBAL. All the other indexes are LOCAL.
Earlier, we were having a BIND PEEKING issue, which I fixed by introducing NO_INVALIDATE=>FALSE in the stats gather job.
Here, the partition name (20090219) is being passed through bind variables.
SELECT a.sotrelstg_sot_ud sotcrct_sot_ud,
       b.sotkey_ud sotcrct_orig_sot_ud,
       a.ROWID stage_rowid
FROM   (SELECT sotrelstg_sot_ud, sotrelstg_sys_ud,
               sotrelstg_orig_sys_ord_id, sotrelstg_orig_sys_ord_vseq
        FROM   sot_rel_stage
        WHERE  sotrelstg_trd_date_ymd_part = '20090219'
        AND    sotrelstg_crct_proc_stat_cd = 'N'
        AND    sotrelstg_sot_ud NOT IN (
                   SELECT sotcrct_sot_ud
                   FROM   sot_correct
                   WHERE  sotcrct_trd_date_ymd_part = '20090219')) a,
       (SELECT   MAX(sotkey_ud) sotkey_ud, sotkey_sys_ud,
                 sotkey_sys_ord_id, sotkey_sys_ord_vseq,
                 sotkey_trd_date_ymd_part
        FROM     sot_keymap
        WHERE    sotkey_trd_date_ymd_part = '20090219'
        AND      sotkey_iud_cd = 'I'
                 -- not to select logical deleted rows
        GROUP BY sotkey_trd_date_ymd_part,
                 sotkey_sys_ud,
                 sotkey_sys_ord_id,
                 sotkey_sys_ord_vseq) b
WHERE  a.sotrelstg_sys_ud = b.sotkey_sys_ud
AND    a.sotrelstg_orig_sys_ord_id = b.sotkey_sys_ord_id
AND    NVL(a.sotrelstg_orig_sys_ord_vseq, 1) = NVL(b.sotkey_sys_ord_vseq, 1);
During normal business hours, I found that the query takes 5-7 min (which is also not acceptable), but during high-load business hours it takes 30-50 min.
I found that most of the time is spent on the HASH JOIN (direct path write temp). We have sufficient RAM (64 GB total/41 GB available).
Below is the execution plan I got during normal business hours.
| Id | Operation                 | Name                | Starts | E-Rows | A-Rows |   A-Time   | Buffers | Reads | Writes | OMem | 1Mem | Used-Mem | Used-Tmp|
|   1 | HASH GROUP BY            |                     |      1 |      1 |   7844K|00:05:28.78 |      16M|    217K| 35969 |       |       |          |         |
|* 2 |   HASH JOIN               |                     |      1 |      1 |   9977K|00:04:34.02 |      16M|    202K| 20779 |   580M|    10M| 563M (1)|     650K|
|   3 |    NESTED LOOPS ANTI      |                     |      1 |      6 |   7855K|00:01:26.41 |      16M|   1149 |      0 |       |       |          |         |
|   4 |     PARTITION RANGE SINGLE|                     |      1 |    258K|   8183K|00:00:16.37 |   25576 |   1149 |      0 |       |       |          |         |
|* 5 |      TABLE ACCESS FULL    | SOT_REL_STAGE       |      1 |    258K|   8183K|00:00:16.37 |   25576 |   1149 |      0 |       |       |          |         |
|   6 |     PARTITION RANGE SINGLE|                     |   8183K|    326K|    327K|00:01:10.53 |      16M|      0 |      0 |       |       |          |         |
|* 7 |      INDEX RANGE SCAN     | IDXL_SOTCRCT_SOT_UD |   8183K|    326K|    327K|00:00:53.37 |      16M|      0 |      0 |       |       |          |         |
|   8 |    PARTITION RANGE SINGLE |                     |      1 |    846K|     14M|00:02:06.36 |     289K|    180K|      0 |       |       |          |         |
|* 9 |     TABLE ACCESS FULL     | SOT_KEYMAP          |      1 |    846K|     14M|00:01:52.32 |     289K|    180K|      0 |       |       |          |         |
I will attach the same for high-load business hours once the query gives results. It has been executing for the last 50 mins.
INDEX STATS (INDEXES ARE LOCAL INDEXES)
TABLE_NAME     INDEX_NAME                COLUMN_NAME                  COLUMN_POSITION   NUM_ROWS DISTINCT_KEYS CLUSTERING_FACTOR
SOT_REL_STAGE  IDXL_SOTRELSTG_SOT_UD     SOTRELSTG_SOT_UD                           1   25461560      25461560            184180
SOT_REL_STAGE                            SOTRELSTG_TRD_DATE_YMD_PART                2   25461560      25461560            184180
SOT_KEYMAP     IDXL_SOTKEY_ENTORDSYS_UD  SOTKEY_ENTRY_ORD_SYS_UD                    1 1012306940             3          38308680
SOT_KEYMAP     IDXL_SOTKEY_HASH          SOTKEY_HASH                                1 1049582320    1049582320        1049579520
SOT_KEYMAP                               SOTKEY_TRD_DATE_YMD_PART                   2 1049582320    1049582320        1049579520
SOT_KEYMAP     IDXL_SOTKEY_SOM_ORD       SOTKEY_SOM_UD                              1 1023998560     268949136         559414840
SOT_KEYMAP                               SOTKEY_SYS_ORD_ID                          2 1023998560     268949136         559414840
SOT_KEYMAP     IPK_SOT_KEYMAP            SOTKEY_UD                                  1 1030369480    1015378900          24226580
SOT_CORRECT    IDXL_SOTCRCT_SOT_UD       SOTCRCT_SOT_UD                             1  412484756     412484756         411710982
SOT_CORRECT                              SOTCRCT_TRD_DATE_YMD_PART                  2  412484756     412484756         411710982
INDEX partition stats (from dba_ind_partitions)
INDEX_NAME                     PARTITION_NAME       STATUS       BLEVEL LEAF_BLOCKS DISTINCT_KEYS CLUSTERING_FACTOR   NUM_ROWS SAMPLE_SIZE LAST_ANALYZ GLO
IDXL_SOTCRCT_SOT_UD            P20090219            USABLE            1         372        327879            216663     327879      327879 20-Feb-2009 YES
IDXL_SOTKEY_ENTORDSYS_UD       P20090219            USABLE            2        2910             3             36618     856229      856229 19-Feb-2009 YES
IDXL_SOTKEY_HASH               P20090219            USABLE            2        7783        853956            853914     853956      119705 19-Feb-2009 YES
IDXL_SOTKEY_SOM_ORD            P20090219            USABLE            2        6411        531492            157147     799758      132610 19-Feb-2009 YES
IDXL_SOTRELSTG_SOT_UD          P20090219            USABLE            2       13897       9682052             45867    9682052      794958 20-Feb-2009 YES
Thanks in advance.
Bhavik Desai

Hi Randolf,
Thanks for the time you spent on this issue.I appreciate it.
Please see my comments below:
1. You've mentioned several times that you're passing the partition name as bind variable, but you're obviously testing the statement with literals rather than bind
variables. So your tests obviously don't reflect what is going to happen in case of the actual execution. The cardinality estimates are potentially quite different when
using bind variables for the partition key.
Yes. I intentionally used literals in my tests. I found a couple of times that the plan used by the application and the plan generated by the AUTOTRACE + EXPLAIN PLAN commands are the same and
caused the hourly elapsed time.
As I pointed out earlier, last month we solved a couple of bind peeking issues by introducing NO_INVALIDATE=>FALSE in the stats gather procedure, which we execute just after the data
load into these daily partitions and before the start of the jobs which execute this query.
Execution plans from AWR (with parallelism on at table level, DEGREE>1) --> This plan is the one the CBO used when the degradation occurred. This plan is used most of the time.
ELAPSED_TIME_DELTA BUFFER_GETS_DELTA DISK_READS_DELTA CURSOR(SELECT*FROMTA
        1918506000          46154275              918 CURSOR STATEMENT : 4
CURSOR STATEMENT : 4
PLAN_TABLE_OUTPUT
SQL_ID 39708a3azmks7
SELECT A.SOTRELSTG_SOT_UD SOTCRCT_SOT_UD, B.SOTKEY_UD SOTCRCT_ORIG_SOT_UD, A.ROWID STAGE_ROWID FROM (SELECT SOTRELSTG_SOT_UD,
SOTRELSTG_SYS_UD, SOTRELSTG_ORIG_SYS_ORD_ID, SOTRELSTG_ORIG_SYS_ORD_VSEQ FROM SOT_REL_STAGE WHERE SOTRELSTG_TRD_DATE_YMD_PART = :B1 AND
SOTRELSTG_CRCT_PROC_STAT_CD = 'N' AND SOTRELSTG_SOT_UD NOT IN( SELECT SOTCRCT_SOT_UD FROM SOT_CORRECT WHERE SOTCRCT_TRD_DATE_YMD_PART =
:B1 )) A, (SELECT MAX(SOTKEY_UD) SOTKEY_UD, SOTKEY_SYS_UD, SOTKEY_SYS_ORD_ID, SOTKEY_SYS_ORD_VSEQ, SOTKEY_TRD_DATE_YMD_PART FROM
SOT_KEYMAP WHERE SOTKEY_TRD_DATE_YMD_PART = :B1 AND SOTKEY_IUD_CD = 'I' GROUP BY SOTKEY_TRD_DATE_YMD_PART, SOTKEY_SYS_UD,
SOTKEY_SYS_ORD_ID, SOTKEY_SYS_ORD_VSEQ) B WHERE A.SOTRELSTG_SYS_UD = B.SOTKEY_SYS_UD AND A.SOTRELSTG_ORIG_SYS_ORD_ID =
B.SOTKEY_SYS_ORD_ID AND NVL(A.SOTRELSTG_ORIG_SYS_ORD_VSEQ, 1) = NVL(B.SOTKEY_SYS_ORD_VSEQ, 1)
Plan hash value: 1213870831
| Id | Operation                     | Name                | Rows | Bytes | Cost (%CPU)| Time     | Pstart| Pstop |    TQ |IN-OUT| PQ Distrib |
|   0 | SELECT STATEMENT              |                     |       |       | 19655 (100)|          |       |       |        |      |            |
|   1 | PX COORDINATOR               |                     |       |       |            |          |       |       |        |      |            |
|   2 |   PX SEND QC (RANDOM)         | :TQ10003            |     1 |   116 | 19655   (1)| 00:05:54 |       |       | Q1,03 | P->S | QC (RAND) |
|   3 |    HASH GROUP BY              |                     |     1 |   116 | 19655   (1)| 00:05:54 |       |       | Q1,03 | PCWP |            |
|   4 |     PX RECEIVE                |                     |     1 |   116 | 19655   (1)| 00:05:54 |       |       | Q1,03 | PCWP |            |
|   5 |      PX SEND HASH             | :TQ10002            |     1 |   116 | 19655   (1)| 00:05:54 |       |       | Q1,02 | P->P | HASH       |
|   6 |       HASH GROUP BY           |                     |     1 |   116 | 19655   (1)| 00:05:54 |       |       | Q1,02 | PCWP |            |
|   7 |        NESTED LOOPS ANTI      |                     |     1 |   116 | 19654   (1)| 00:05:54 |       |       | Q1,02 | PCWP |            |
|   8 |         HASH JOIN             |                     |     1 |   102 | 19654   (1)| 00:05:54 |       |       | Q1,02 | PCWP |            |
|   9 |          PX JOIN FILTER CREATE| :BF0000             |    13M|   664M| 2427   (3)| 00:00:44 |       |       | Q1,02 | PCWP |            |
| 10 |           PX RECEIVE          |                     |    13M|   664M| 2427   (3)| 00:00:44 |       |       | Q1,02 | PCWP |            |
| 11 |            PX SEND HASH       | :TQ10000            |    13M|   664M| 2427   (3)| 00:00:44 |       |       | Q1,00 | P->P | HASH       |
| 12 |             PX BLOCK ITERATOR |                     |    13M|   664M| 2427   (3)| 00:00:44 |   KEY |   KEY | Q1,00 | PCWC |            |
| 13 |              TABLE ACCESS FULL| SOT_REL_STAGE       |    13M|   664M| 2427   (3)| 00:00:44 |   KEY |   KEY | Q1,00 | PCWP |            |
| 14 |          PX RECEIVE           |                     |    27M| 1270M| 17209   (1)| 00:05:10 |       |       | Q1,02 | PCWP |            |
| 15 |           PX SEND HASH        | :TQ10001            |    27M| 1270M| 17209   (1)| 00:05:10 |       |       | Q1,01 | P->P | HASH       |
| 16 |            PX JOIN FILTER USE | :BF0000             |    27M| 1270M| 17209   (1)| 00:05:10 |       |       | Q1,01 | PCWP |            |
| 17 |             PX BLOCK ITERATOR |                     |    27M| 1270M| 17209   (1)| 00:05:10 |   KEY |   KEY | Q1,01 | PCWC |            |
| 18 |              TABLE ACCESS FULL| SOT_KEYMAP          |    27M| 1270M| 17209   (1)| 00:05:10 |   KEY |   KEY | Q1,01 | PCWP |            |
| 19 |         PARTITION RANGE SINGLE|                     | 16185 |   221K|     0   (0)|          |   KEY |   KEY | Q1,02 | PCWP |            |
| 20 |          INDEX RANGE SCAN     | IDXL_SOTCRCT_SOT_UD | 16185 |   221K|     0   (0)|          |   KEY |   KEY | Q1,02 | PCWP |            |
Other Execution plan from AWR
ELAPSED_TIME_DELTA BUFFER_GETS_DELTA DISK_READS_DELTA CURSOR(SELECT*FROMTA
        1053251381                 0             2925 CURSOR STATEMENT : 4
CURSOR STATEMENT : 4
PLAN_TABLE_OUTPUT
SQL_ID 39708a3azmks7
SELECT A.SOTRELSTG_SOT_UD SOTCRCT_SOT_UD, B.SOTKEY_UD SOTCRCT_ORIG_SOT_UD, A.ROWID STAGE_ROWID FROM (SELECT SOTRELSTG_SOT_UD,
SOTRELSTG_SYS_UD, SOTRELSTG_ORIG_SYS_ORD_ID, SOTRELSTG_ORIG_SYS_ORD_VSEQ FROM SOT_REL_STAGE WHERE SOTRELSTG_TRD_DATE_YMD_PART = :B1 AND
SOTRELSTG_CRCT_PROC_STAT_CD = 'N' AND SOTRELSTG_SOT_UD NOT IN( SELECT SOTCRCT_SOT_UD FROM SOT_CORRECT WHERE SOTCRCT_TRD_DATE_YMD_PART =
:B1 )) A, (SELECT MAX(SOTKEY_UD) SOTKEY_UD, SOTKEY_SYS_UD, SOTKEY_SYS_ORD_ID, SOTKEY_SYS_ORD_VSEQ, SOTKEY_TRD_DATE_YMD_PART FROM
SOT_KEYMAP WHERE SOTKEY_TRD_DATE_YMD_PART = :B1 AND SOTKEY_IUD_CD = 'I' GROUP BY SOTKEY_TRD_DATE_YMD_PART, SOTKEY_SYS_UD,
SOTKEY_SYS_ORD_ID, SOTKEY_SYS_ORD_VSEQ) B WHERE A.SOTRELSTG_SYS_UD = B.SOTKEY_SYS_UD AND A.SOTRELSTG_ORIG_SYS_ORD_ID =
B.SOTKEY_SYS_ORD_ID AND NVL(A.SOTRELSTG_ORIG_SYS_ORD_VSEQ, 1) = NVL(B.SOTKEY_SYS_ORD_VSEQ, 1)
Plan hash value: 3434900850
| Id | Operation                     | Name                | Rows | Bytes | Cost (%CPU)| Time     | Pstart| Pstop |    TQ |IN-OUT| PQ Distrib |
|   0 | SELECT STATEMENT              |                     |       |       | 1830 (100)|          |       |       |        |      |            |
|   1 | PX COORDINATOR               |                     |       |       |            |          |       |       |        |      |            |
|   2 |   PX SEND QC (RANDOM)         | :TQ10003            |     1 |   131 | 1830   (2)| 00:00:33 |       |       | Q1,03 | P->S | QC (RAND) |
|   3 |    HASH GROUP BY              |                     |     1 |   131 | 1830   (2)| 00:00:33 |       |       | Q1,03 | PCWP |            |
|   4 |     PX RECEIVE                |                     |     1 |   131 | 1830   (2)| 00:00:33 |       |       | Q1,03 | PCWP |            |
|   5 |      PX SEND HASH             | :TQ10002            |     1 |   131 | 1830   (2)| 00:00:33 |       |       | Q1,02 | P->P | HASH       |
|   6 |       HASH GROUP BY           |                     |     1 |   131 | 1830   (2)| 00:00:33 |       |       | Q1,02 | PCWP |            |
|   7 |        NESTED LOOPS ANTI      |                     |     1 |   131 | 1829   (2)| 00:00:33 |       |       | Q1,02 | PCWP |            |
|   8 |         HASH JOIN             |                     |     1 |   117 | 1829   (2)| 00:00:33 |       |       | Q1,02 | PCWP |            |
|   9 |          PX JOIN FILTER CREATE| :BF0000             | 1010K|    50M|   694   (1)| 00:00:13 |       |       | Q1,02 | PCWP |            |
| 10 |           PX RECEIVE          |                     | 1010K|    50M|   694   (1)| 00:00:13 |       |       | Q1,02 | PCWP |            |
| 11 |            PX SEND HASH       | :TQ10000            | 1010K|    50M|   694   (1)| 00:00:13 |       |       | Q1,00 | P->P | HASH       |
| 12 |             PX BLOCK ITERATOR |                     | 1010K|    50M|   694   (1)| 00:00:13 |   KEY |   KEY | Q1,00 | PCWC |            |
| 13 |              TABLE ACCESS FULL| SOT_KEYMAP          | 1010K|    50M|   694   (1)| 00:00:13 |   KEY |   KEY | Q1,00 | PCWP |            |
| 14 |          PX RECEIVE           |                     |    11M|   688M| 1129   (3)| 00:00:21 |       |       | Q1,02 | PCWP |            |
| 15 |           PX SEND HASH        | :TQ10001            |    11M|   688M| 1129   (3)| 00:00:21 |       |       | Q1,01 | P->P | HASH       |
| 16 |            PX JOIN FILTER USE | :BF0000             |    11M|   688M| 1129   (3)| 00:00:21 |       |       | Q1,01 | PCWP |            |
| 17 |             PX BLOCK ITERATOR |                     |    11M|   688M| 1129   (3)| 00:00:21 |   KEY |   KEY | Q1,01 | PCWC |            |
| 18 |              TABLE ACCESS FULL| SOT_REL_STAGE       |    11M|   688M| 1129   (3)| 00:00:21 |   KEY |   KEY | Q1,01 | PCWP |            |
| 19 |         PARTITION RANGE SINGLE|                     | 5209 | 72926 |     0   (0)|          |   KEY |   KEY | Q1,02 | PCWP |            |
| 20 |          INDEX RANGE SCAN     | IDXL_SOTCRCT_SOT_UD | 5209 | 72926 |     0   (0)|          |   KEY |   KEY | Q1,02 | PCWP |            |
EXECUTION PLAN AFTER SETTING DEGREE=1 (It was also degraded)
| Id | Operation                 | Name                | Rows | Bytes |TempSpc| Cost (%CPU)| Time     | Pstart| Pstop |
|   0 | SELECT STATEMENT          |                     |     1 |   129 |       | 42336   (2)| 00:12:43 |       |       |
|   1 | HASH GROUP BY            |                     |     1 |   129 |       | 42336   (2)| 00:12:43 |       |       |
|   2 |   NESTED LOOPS ANTI       |                     |     1 |   129 |       | 42335   (2)| 00:12:43 |       |       |
|* 3 |    HASH JOIN              |                     |     1 |   115 |    51M| 42334   (2)| 00:12:43 |       |       |
|   4 |     PARTITION RANGE SINGLE|                     |   846K|    41M|       | 8241   (1)| 00:02:29 |    81 |    81 |
|* 5 |      TABLE ACCESS FULL    | SOT_KEYMAP          |   846K|    41M|       | 8241   (1)| 00:02:29 |    81 |    81 |
|   6 |     PARTITION RANGE SINGLE|                     | 8161K|   490M|       | 12664   (3)| 00:03:48 |    81 |    81 |
|* 7 |      TABLE ACCESS FULL    | SOT_REL_STAGE       | 8161K|   490M|       | 12664   (3)| 00:03:48 |    81 |    81 |
|   8 |    PARTITION RANGE SINGLE |                     | 6525K|    87M|       |     1   (0)| 00:00:01 |    81 |    81 |
|* 9 |     INDEX RANGE SCAN      | IDXL_SOTCRCT_SOT_UD | 6525K|    87M|       |     1   (0)| 00:00:01 |    81 |    81 |
Predicate Information (identified by operation id):
   3 - access("SOTRELSTG_SYS_UD"="SOTKEY_SYS_UD" AND "SOTRELSTG_ORIG_SYS_ORD_ID"="SOTKEY_SYS_ORD_ID" AND
              NVL("SOTRELSTG_ORIG_SYS_ORD_VSEQ",1)=NVL("SOTKEY_SYS_ORD_VSEQ",1))
   5 - filter("SOTKEY_TRD_DATE_YMD_PART"=20090219 AND "SOTKEY_IUD_CD"='I')
   7 - filter("SOTRELSTG_CRCT_PROC_STAT_CD"='N' AND "SOTRELSTG_TRD_DATE_YMD_PART"=20090219)
   9 - access("SOTRELSTG_SOT_UD"="SOTCRCT_SOT_UD" AND "SOTCRCT_TRD_DATE_YMD_PART"=20090219)
2. Why are you passing the partition name as bind variable? A statement executing 5 mins. best, > 2 hours worst obviously doesn't suffer from hard parsing issues and
doesn't need to (shouldn't) share execution plans therefore. So I strongly suggest to use literals instead of bind variables. This also solves any potential issues caused
by bind variable peeking.
This is a custom application which uses bind variables to extract data from daily partitions. So there is a daily automated data extract from the daily partitions after the load and ELT process.
Here, the value of the bind variable is passed through a procedure parameter. It would be a bit difficult to use literals in such an application.
3. All your posted plans suffer from bad cardinality estimates. The NO_MERGE hint suggested by Timur only caused a (significant) damage limitation by obviously reducing
the row source size by the group by operation before joining, but still the optimizer is way off, apart from the obviously wrong join order (larger row set first) in
particular the NESTED LOOP operation is causing the main troubles due to excessive logical I/O, as already pointed out by Timur.
Can I ask for alternatives to the NESTED LOOP?
4. Your PLAN_TABLE seems to be old (you should see a corresponding note at the bottom of the DBMS_XPLAN.DISPLAY output), because none of the operations have a
filter/access predicate information attached. Since your main issue are the bad cardinality estimates, I strongly suggest to drop any existing PLAN_TABLEs in any non-Oracle
owned schemas because 10g already provides one in the SYS schema (GTT PLAN_TABLE$) exposed via a public synonym, so that the EXPLAIN PLAN information provides the
"Predicate Information" section below the plan covering the "Filter/Access" predicates.
Please post a revised explain plan output including this crucial information so that we get a clue why the cardinality estimates are way off.
I have dropped the old PLAN_TABLE and got the execution plan listed above (under the first point) with PREDICATE information.
"As already mentioned the usage of bind variables for the partition name makes this issue potentially worse."
Is there any workaround without replacing the bind variable? I am on 10g, so 11g features will not help.
How are you gathering the statistics daily, can you post the exact command(s) used?
gather_table_part_stats(i_owner_name,i_table_name,i_part_name,i_estimate:= DBMS_STATS.AUTO_SAMPLE_SIZE, i_invalidate IN VARCHAR2 := 'Y',i_debug:= 'N')
Thanks & Regards,
Bhavik Desai

Seeking advice on a heavy hash join

We have to self-join a 190 million row table 160 times. On our production 9i database we give the "RULE" hint and it finishes in about an hour using nested loop joins. On our test 10g database, "RULE" is not available any more, so the optimiser chooses to use hash joins (160 levels), and the query never finishes in a reasonable time. We have gradually increased hash_area_size from 8M to 512M, thinking that would help, but apparently it does not. Can anyone provide suggestions? Thanks.

There is an approach using Analytics Functions that may be useful here. The idea is to get all the data values required with 1 reference to Tab2 rather than 160, and let Analytics do the work of building the result set.
The goal here is to 'never do multiple references to the same table where 1 reference can suffice (usually with the aid of Analytics)'.
There are 2 variations depending on whether the ID values here (0,10,20,30,...) always follow an ascending pattern or if they don't. The second example is more generic and will also cover the ascending case.
name ID val ID val ID val ID val ID val
name1 0 1 10 1 20 1 30 1 40 1
-- Performance Summary - This demo used a max value of 1600 vs 15000
for a net number of rows of 2.5 million versus 225 million.
Any of the test views ( replacing 5,10, or 160 tab2 references) using the Analytics function used at most 4099 consistent gets. The original approach used 20553 consistent gets for 5 tab2 references, 41016 consistent gets for 10 tab2 references. This was in 10g, doing the hash join. I did try alter session set optimizer_mode = rule (just for test purposes) and that resulted in 46462 consistent gets for the 10 tab2 reference view while it did a merge join operation.
Autotrace for the Analytics version to replace the original 160 table sql.
JT(147)@JTDB10G>select * from analytics_2_joins;
1600 rows selected.
Statistics
0 recursive calls
0 db block gets
4099 consistent gets
3710 physical reads
0 redo size
1112904 bytes sent via SQL*Net to client
1250 bytes received via SQL*Net from client
81 SQL*Net roundtrips to/from client
2 sorts (memory)
0 sorts (disk)
1600 rows processed
-- Minimal examples:
--As always, test thoroughly before using in production.
select name,
     id0, value value0,
     id1, value value1,
     id2, value value2,
     id3, value value3,
     id4, value value4
from ( select name, id, value,
     lead(id,0 ) over(partition by name, value order by rn ) id0 ,
     lead(id,1 ) over(partition by name, value order by rn ) id1 ,
     lead(id,2 ) over(partition by name, value order by rn ) id2 ,
     lead(id,3 ) over(partition by name, value order by rn ) id3 ,
     lead(id,4 ) over(partition by name, value order by rn ) id4 ,
rn
from( select tab1.name, i0.id, i0.tab1value value,
(row_number() over (partition by tab1value order by i0.id )) -1 rn
from tab1, tab2 i0 where tab1.name='name1' and i0.tab1value=tab1.value
and i0.id in (0,10,20,30,40)
)) where rn = 0;
-- execute a version of the smaller analytics approach with bind variables
-- referencing the binds within the numbered in-line view is needed only if
-- the id0, id1, id2 values do not follow the ascending pattern shown in the
-- example. This will handle case where id0 = 30, id2 =20, id4 = 40 , etc.
variable l_id0 number;
variable l_id1 number;
variable l_id2 number;
variable l_id3 number;
variable l_id4 number;
exec :l_id1 := 0;
exec :l_id3 := 10;
exec :l_id2 := 20;
exec :l_id0 := 30;
exec :l_id4 := 40;
select name, bind_rn,
     id0, value value0,
     id1, value value1,
     id2, value value2,
     id3, value value3,
     id4, value value4
from ( select name, id, value,
     lead(id,0 ) over(partition by name, value order by bind_rn ) id0 ,
     lead(id,1 ) over(partition by name, value order by bind_rn ) id1 ,
     lead(id,2 ) over(partition by name, value order by bind_rn ) id2 ,
     lead(id,3 ) over(partition by name, value order by bind_rn ) id3 ,
     lead(id,4 ) over(partition by name, value order by bind_rn ) id4 ,
     bind_rn
from( select tab1.name, i0.id, i0.tab1value value, bind_rn
     from tab1, tab2 i0,
          (select 0 bind_rn, :l_id0 arg_value from dual union
          select 1 , :l_id1 from dual union
          select 2 , :l_id2 from dual union
          select 3 , :l_id3 from dual union
          select 4 , :l_id4 from dual ) table_of_args
     where tab1.name='name1' and i0.tab1value=tab1.value
-- and i0.id in (0,10,20,30,40)
and i0.id = table_of_args.arg_value
)) where bind_rn = 0;
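-- Sketch: how ordering by bind_rn preserves the caller's argument order.
-- Assumptions for illustration only: the literals 30, 0, 20 stand in for
-- :l_id0, :l_id1, :l_id2, and the inline view "t" is hypothetical sample data
-- standing in for the tab1/tab2 join.
with t as (
  select 'name1' name, 1 value, 0 id from dual union all
  select 'name1', 1, 20 from dual union all
  select 'name1', 1, 30 from dual
),
table_of_args as (
  select 0 bind_rn, 30 arg_value from dual union all
  select 1, 0 from dual union all
  select 2, 20 from dual
)
select name, value, id0, id1, id2
from ( select t.name, t.value, a.bind_rn,
         lead(t.id,0) over (partition by t.name, t.value order by a.bind_rn) id0,
         lead(t.id,1) over (partition by t.name, t.value order by a.bind_rn) id1,
         lead(t.id,2) over (partition by t.name, t.value order by a.bind_rn) id2
       from t, table_of_args a
       where t.id = a.arg_value )
where bind_rn = 0;
-- returns one row: name1, 1, 30, 0, 20
-- (the ids come back in argument order, not in ascending id order)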
-- Full Test Case
-- table setup
drop table tab1;
drop table tab2;
create table tab1(name varchar2(100), value number) pctfree 0;
create table tab2(id number, tab1value number) pctfree 0;
begin
for x in 0 .. 1600 loop
for y in 1 .. 1600 loop
     insert into tab1 values ('name' || x, y);
end loop;
end loop;
end;
/
-- x runs 0 .. 1600 (1601 values), so an upper bound of 1600 loads
-- 1601 x 1600 = 2,561,600 rows; an upper bound of 15000 loads about 225,000,000
begin
for x in 0 .. 1600 loop
for y in 1 .. 1600 loop
     insert into tab2 values (x, y);
end loop;
end loop;
end;
/
commit;
CREATE BITMAP INDEX NAME_BITMAP ON TAB1(NAME);
EXEC DBMS_STATS.GATHER_TABLE_STATS(OWNNAME => 'JTOMMANEY',TABNAME => 'TAB1', -
     estimate_percent => 20,     CASCADE => TRUE);
EXEC DBMS_STATS.GATHER_TABLE_STATS(OWNNAME => 'JTOMMANEY',TABNAME => 'TAB2', -
     estimate_percent => 20,     CASCADE => TRUE);
alter session set optimizer_mode = 'RULE';
-- set up some views for both the original approach and the analytics approach
create view original_5_tab2_tables_join as
select tab1.name name,
     i0.id id0, i0.tab1value value0,
     i1.id id1, i1.tab1value value1,
     i2.id id2, i2.tab1value value2,
     i3.id id3, i3.tab1value value3,
     i4.id id4, i4.tab1value value4
from tab1,
tab2 i0, tab2 i1, tab2 i2, tab2 i3, tab2 i4
where tab1.name='name1'
and (i0.id=0 and i0.tab1value=tab1.value)
and (i1.id=10 and i1.tab1value=tab1.value)
and (i2.id=20 and i2.tab1value=tab1.value)
and (i3.id=30 and i3.tab1value=tab1.value)
and (i4.id=40 and i4.tab1value=tab1.value);
create view replace_5_tab2_joins as
select name,
     id0, value value0,
     id1, value value1,
     id2, value value2,
     id3, value value3,
     id4, value value4
from ( select name, id, value,
     lead(id,0 ) over(partition by name, value order by rn ) id0 ,
     lead(id,1 ) over(partition by name, value order by rn ) id1 ,
     lead(id,2 ) over(partition by name, value order by rn ) id2 ,
     lead(id,3 ) over(partition by name, value order by rn ) id3 ,
     lead(id,4 ) over(partition by name, value order by rn ) id4 ,
rn from( select tab1.name, i0.id, i0.tab1value value,
(row_number() over (partition by tab1value order by i0.id )) -1 rn
from tab1, tab2 i0 where tab1.name='name1' and i0.tab1value=tab1.value
and i0.id in (0,10,20,30,40)
)) where rn = 0;
create view original_10_tab2_tables_join as
select tab1.name name,
     i0.id id0, i0.tab1value value0,
     i1.id id1, i1.tab1value value1,
     i2.id id2, i2.tab1value value2,
     i3.id id3, i3.tab1value value3,
     i4.id id4, i4.tab1value value4,
     i5.id id5, i5.tab1value value5,
     i6.id id6, i6.tab1value value6,
     i7.id id7, i7.tab1value value7,
     i8.id id8, i8.tab1value value8,
     i9.id id9, i9.tab1value value9
from tab1,
tab2 i0, tab2 i1, tab2 i2, tab2 i3, tab2 i4,
tab2 i5, tab2 i6, tab2 i7, tab2 i8, tab2 i9
where tab1.name='name1'
and (i0.id=0 and i0.tab1value=tab1.value)
and (i1.id=10 and i1.tab1value=tab1.value)
and (i2.id=20 and i2.tab1value=tab1.value)
and (i3.id=30 and i3.tab1value=tab1.value)
and (i4.id=40 and i4.tab1value=tab1.value)
and (i5.id=50 and i5.tab1value=tab1.value)
and (i6.id=60 and i6.tab1value=tab1.value)
and (i7.id=70 and i7.tab1value=tab1.value)
and (i8.id=80 and i8.tab1value=tab1.value)
and (i9.id=90 and i9.tab1value=tab1.value);
create view replace_10_tab2_joins as
select name,
     id0, value value0,
     id1, value value1,
     id2, value value2,
     id3, value value3,
     id4, value value4,
     id5, value value5,
     id6, value value6,
     id7, value value7,
     id8, value value8,
     id9, value value9
from ( select name, id, value,
     lead(id,0 ) over(partition by name, value order by rn ) id0 ,
     lead(id,1 ) over(partition by name, value order by rn ) id1 ,
     lead(id,2 ) over(partition by name, value order by rn ) id2 ,
     lead(id,3 ) over(partition by name, value order by rn ) id3 ,
     lead(id,4 ) over(partition by name, value order by rn ) id4 ,
     lead(id,5 ) over(partition by name, value order by rn ) id5 ,
     lead(id,6 ) over(partition by name, value order by rn ) id6 ,
     lead(id,7 ) over(partition by name, value order by rn ) id7 ,
     lead(id,8 ) over(partition by name, value order by rn ) id8 ,
     lead(id,9 ) over(partition by name, value order by rn ) id9 ,
rn from( select tab1.name, i0.id, i0.tab1value value,
(row_number() over (partition by tab1value order by i0.id )) -1 rn
from tab1, tab2 i0 where tab1.name='name1' and i0.tab1value=tab1.value
and i0.id in (0,10,20,30,40,50,60,70,80,90)
)) where rn = 0;
-- generate the 160-join views for both the original approach and the analytics approach
-- may need to clean up heading, linefeed from created file
set serveroutput on size 1000000
spool cr_v1.sql
begin
dbms_output.put_line('create or replace view original_160_joins as select /*+ rule */ tab1.name ');
for x in 0 .. 160 loop
dbms_output.put_line( ',i' || x || '.id id' || x || ' ,i' || x || '.tab1value value' || x ) ;
end loop;
dbms_output.put_line('from tab1' );
for x in 0 .. 160 loop
dbms_output.put_line( ',tab2 i' || x ) ;
end loop;
dbms_output.put_line(' where tab1.name = ''name1''' );
for x in 0 .. 160 loop
dbms_output.put_line( ' and i' || x || '.id=' || (x * 10) || ' and i' || x || '.tab1value=tab1.value ' ) ;
end loop;
dbms_output.put_line( ' ;');
end;
/
spool off
-- after cleaning up cr_v1.sql, run it to create the view:
--@cr_v1.sql
-- may need to clean up heading, linefeed from created file
set serveroutput on size 1000000
spool cr_v2.sql
begin
dbms_output.put_line('create or replace view analytics_2_joins as select name ');
for x in 0 .. 160 loop
dbms_output.put_line( ',id' || x || ', value value' || x ) ;
end loop;
dbms_output.put_line('from ( select name, id, value ' );
for x in 0 .. 160 loop
dbms_output.put_line( ',lead(id,' || x || ') over(partition by name, value order by rn ) id' || x ) ;
end loop;
dbms_output.put_line(' , rn from( select tab1.name, i0.id, i0.tab1value value, ');
dbms_output.put_line(' (row_number() over (partition by tab1value order by i0.id )) -1 rn ');
dbms_output.put_line(' from tab1, tab2 i0 where tab1.name=''name1'' and i0.tab1value=tab1.value and i0.id in ( ');
for x in 0 .. 159 loop
dbms_output.put_line( (x * 10) || ',' ) ;
end loop;
dbms_output.put_line( ' 1600))) where rn = 0;');
end;
/
spool off
-- after cleaning up cr_v2.sql, run it to create the view:
--@cr_v2.sql
-- We now have 6 views established:
--              Original approach               Analytics approach w/ 1 tab2 reference
--   5 tab2s    original_5_tab2_tables_join     replace_5_tab2_joins
--  10 tab2s    original_10_tab2_tables_join    replace_10_tab2_joins
-- 160 tab2s    original_160_joins              analytics_2_joins
-- plus we will call the version with bind variables directly, not from a view.
-- Data validation:
select 'orig_minus_new: ' || count(*) from
( select * from original_5_tab2_tables_join minus select * from replace_5_tab2_joins ) union
select 'new_minus_orig: ' || count(*) from
( select * from replace_5_tab2_joins minus select * from original_5_tab2_tables_join );
select 'orig_minus_new: ' || count(*) from
( select * from original_10_tab2_tables_join minus select * from replace_10_tab2_joins ) union
select 'new_minus_orig: ' || count(*) from
( select * from replace_10_tab2_joins minus select * from original_10_tab2_tables_join );
select 'orig_minus_new: ' || count(*) from
( select * from original_160_joins minus select * from analytics_2_joins );
select 'new_minus_orig: ' || count(*) from
( select * from analytics_2_joins minus select * from original_160_joins );
-- Performance test
alter session set workarea_size_policy=manual ;
alter session set sort_area_size = 64000000;
alter session set hash_area_size = 64000000;
set autotrace traceonly stat
select * from original_5_tab2_tables_join;
select * from replace_5_tab2_joins;
select * from original_10_tab2_tables_join;
select * from replace_10_tab2_joins;
select * from analytics_2_joins;
--select * from original_160_joins;
-- Execute a version of the smaller analytics approach with bind variables.
-- Referencing the binds within the numbered in-line view is needed only if
-- the id0, id1, id2 values do not follow the ascending pattern shown in the
-- example. This will handle cases where id0 = 30, id2 = 20, id4 = 40, etc.
variable l_id0 number;
variable l_id1 number;
variable l_id2 number;
variable l_id3 number;
variable l_id4 number;
exec :l_id1 := 0;
exec :l_id3 := 10;
exec :l_id2 := 20;
exec :l_id0 := 30;
exec :l_id4 := 40;
select name, bind_rn,
     id0, value value0,
     id1, value value1,
     id2, value value2,
     id3, value value3,
     id4, value value4
from ( select name, id, value,
     lead(id,0 ) over(partition by name, value order by bind_rn ) id0 ,
     lead(id,1 ) over(partition by name, value order by bind_rn ) id1 ,
     lead(id,2 ) over(partition by name, value order by bind_rn ) id2 ,
     lead(id,3 ) over(partition by name, value order by bind_rn ) id3 ,
     lead(id,4 ) over(partition by name, value order by bind_rn ) id4 ,
     bind_rn
from( select tab1.name, i0.id, i0.tab1value value, bind_rn
     from tab1, tab2 i0,
          (select 0 bind_rn, :l_id0 arg_value from dual union
          select 1 , :l_id1 from dual union
          select 2 , :l_id2 from dual union
          select 3 , :l_id3 from dual union
          select 4 , :l_id4 from dual ) table_of_args
     where tab1.name='name1' and i0.tab1value=tab1.value
-- and i0.id in (0,10,20,30,40)
and i0.id = table_of_args.arg_value
)) where bind_rn = 0;
JT(147)@JTDB10G>select * from original_5_tab2_tables_join;
1600 rows selected.
Statistics
8 recursive calls
2 db block gets
20553 consistent gets
18555 physical reads
0 redo size
52052 bytes sent via SQL*Net to client
1250 bytes received via SQL*Net from client
81 SQL*Net roundtrips to/from client
0 sorts (memory)
0 sorts (disk)
1600 rows processed
JT(147)@JTDB10G>select * from replace_5_tab2_joins;
1600 rows selected.
Statistics
8 recursive calls
2 db block gets
4101 consistent gets
3710 physical reads
0 redo size
52052 bytes sent via SQL*Net to client
1250 bytes received via SQL*Net from client
81 SQL*Net roundtrips to/from client
2 sorts (memory)
0 sorts (disk)
1600 rows processed
JT(147)@JTDB10G>select * from original_10_tab2_tables_join;
1600 rows selected.
Statistics
0 recursive calls
0 db block gets
41016 consistent gets
37115 physical reads
0 redo size
85636 bytes sent via SQL*Net to client
1250 bytes received via SQL*Net from client
81 SQL*Net roundtrips to/from client
0 sorts (memory)
0 sorts (disk)
1600 rows processed
JT(147)@JTDB10G>select * from replace_10_tab2_joins;
1600 rows selected.
Statistics
0 recursive calls
0 db block gets
4099 consistent gets
3710 physical reads
0 redo size
85636 bytes sent via SQL*Net to client
1250 bytes received via SQL*Net from client
81 SQL*Net roundtrips to/from client
2 sorts (memory)
0 sorts (disk)
1600 rows processed
JT(147)@JTDB10G>select * from analytics_2_joins;
1600 rows selected.
Statistics
0 recursive calls
0 db block gets
4099 consistent gets
3710 physical reads
0 redo size
1112904 bytes sent via SQL*Net to client
1250 bytes received via SQL*Net from client
81 SQL*Net roundtrips to/from client
2 sorts (memory)
0 sorts (disk)
1600 rows processed
JT(147)@JTDB10G>--select * from original_160_joins;
JT(147)@JTDB10G>
JT(147)@JTDB10G>----------------------------------------------------------------------------------------
JT(147)@JTDB10G>-- execute a version of the smaller analytics approach with bind variables
JT(147)@JTDB10G>-- referencing the binds within the numbered in-line view is needed only if
JT(147)@JTDB10G>-- the id0, id1, id2 values do not follow the ascending pattern shown in the
JT(147)@JTDB10G>-- example. This will handle case where id0 = 30, id2 =20, id4 = 40 , etc.
JT(147)@JTDB10G>----------------------------------------------------------------------------------------
JT(147)@JTDB10G>
JT(147)@JTDB10G>variable l_id0 number;
JT(147)@JTDB10G>variable l_id1 number;
JT(147)@JTDB10G>variable l_id2 number;
JT(147)@JTDB10G>variable l_id3 number;
JT(147)@JTDB10G>variable l_id4 number;
JT(147)@JTDB10G>
JT(147)@JTDB10G>exec :l_id1 := 0;
PL/SQL procedure successfully completed.
JT(147)@JTDB10G>exec :l_id3 := 10;
PL/SQL procedure successfully completed.
JT(147)@JTDB10G>exec :l_id2 := 20;
PL/SQL procedure successfully completed.
JT(147)@JTDB10G>exec :l_id0 := 30;
PL/SQL procedure successfully completed.
JT(147)@JTDB10G>exec :l_id4 := 40;
PL/SQL procedure successfully completed.
JT(147)@JTDB10G>
JT(147)@JTDB10G>select name, bind_rn,
     id0, value value0,
     id1, value value1,
     id2, value value2,
     id3, value value3,
     id4, value value4
from ( select name, id, value,
     lead(id,0 ) over(partition by name, value order by bind_rn ) id0 ,
     lead(id,1 ) over(partition by name, value order by bind_rn ) id1 ,
     lead(id,2 ) over(partition by name, value order by bind_rn ) id2 ,
     lead(id,3 ) over(partition by name, value order by bind_rn ) id3 ,
     lead(id,4 ) over(partition by name, value order by bind_rn ) id4 ,
     bind_rn
from( select tab1.name, i0.id, i0.tab1value value, bind_rn
     from tab1, tab2 i0,
          (select 0 bind_rn, :l_id0 arg_value from dual union
          select 1 , :l_id1 from dual union
          select 2 , :l_id2 from dual union
          select 3 , :l_id3 from dual union
          select 4 , :l_id4 from dual ) table_of_args
     where tab1.name='name1' and i0.tab1value=tab1.value
-- and i0.id in (0,10,20,30,40)
and i0.id = table_of_args.arg_value
)) where bind_rn = 0;
1600 rows selected.
Statistics
1 recursive calls
0 db block gets
4099 consistent gets
3707 physical reads
0 redo size
52111 bytes sent via SQL*Net to client
1250 bytes received via SQL*Net from client
81 SQL*Net roundtrips to/from client
2 sorts (memory)
0 sorts (disk)
1600 rows processed
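-- Reading the statistics above (a worked comparison, not part of the original run):
-- the original join views grow roughly in step with the number of tab2
-- references, while the analytics views stay near 4,100 consistent gets for
-- the 5-id, 10-id and 160-join cases alike. The ratios can be checked with
-- plain arithmetic on the figures reported above:
select round(20553 / 4101, 1) gets_ratio_5_ids,
       round(41016 / 4099, 1) gets_ratio_10_ids,
       round(18555 / 3710, 1) reads_ratio_5_ids,
       round(37115 / 3710, 1) reads_ratio_10_ids
from dual;
-- returns 5, 10, 5, 10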
