CBO (optimizer) nest-loop join question

OS: Red Hat Linux
DB: 11gR1
I have gotten two conflicting answers while reading books by Don Burleson and Dan Hotka. It has to do with the CBO and nested-joins:
One says the CBO will choose the 'smaller' table as the driving table, the other states that the 'larger' table will be the driving table. And both stick by this philosophy as the preferred goal of any SQL Tuning -- that is, one states that the 'smaller' table should be the driving table. The other says the 'larger' table should be the driving table.
I had always thought that the 'smaller' table should be the driving table. That in a nested loop the driving will not likely use an index even. Who is correct? (I am not going to say who said what, btw). :-)
But I got to let one of them know they got a 'typo' ... :-)
Thx.

user601798 wrote:
It is an over-simplistic scenario but, as I mentioned, if all other things are 'equal' -- which would include 'access time/work', then I think the small table as the driving table has the advantage.
It is not possible for +"*all* other things to be equal"+. (my emphasis).
If by +'access time/work'+ you mean the total is the same then it doesn't matter which table is first, the time/work is the same either way round.
If you want to say that the +'access time/work'+ for acquiring the first rowsource is the same for both paths, and the +'access time/work'+ for acquiring related rows from the second table is the same FOR EACH DRIVING ROW, then the total +'access time/work'+ will be difference, and it would be better to start with the smaller table. (The example by Salman Qureshi above: Re: CBO (optimizer) nest-loop join question would apply.)
On the other hand, and ignoring any idea of "all other things being equal", smaller tables tend to have smaller indexes, so if your smaller rowsource comes from a smaller table then acquiring those rows may be cheaper than acquiring rows from a larger table - which leads to the observation that (even with perfectly precise indexing):
<ul>
smaller number of rows * larger unit cost to find related rows
</ul>
may produce a larger value than
<ul>
larger number of rows * smaller unit cost to find related rows
</ul>
Regards
Jonathan Lewis
http://jonathanlewis.wordpress.com
http://www.jlcomp.demon.co.uk
A general reminder about "Forum Etiquette / Reward Points": http://forums.oracle.com/forums/ann.jspa?annID=718
If you never mark your questions as answered people will eventually decide that it's not worth trying to answer you because they will never know whether or not their answer has been of any use, or whether you even bothered to read it.
It is also important to mark answers that you thought helpful - again it lets other people know that you appreciate their help, but it also acts as a pointer for other people when they are researching the same question, moreover it means that when you mark a bad or wrong answer as helpful someone may be prompted to tell you (and the rest of the forum) what's so bad or wrong about the answer you found helpful.

Similar Messages

Nested loop join v/s Sort merge

I have seen that nested loops are better if the inner table is being indexed, because for each outer table row, we are looking for a match in the inner table. But is there any case when optimizer still goes for a nested loop even if there is no index on the inner table. That is my first question ?
My second question := When doing a sort merge join oracle has to sort both result sets and then merge them. Oracle says that if both the row sets, if already sorted is definately better for performance. Ya thats obvious. But back to my upper question, when there is no index on the inner table, is it the situation when oracle goes for a sort merge join ?

My response should really have examples but since I do not have any handy I will just say about your first question. If there is no index available from table A to table B yes it is possible a nested loop join may still be used and table B read via full table scan within a nested loop. If table B is very small and consists of only a block or two this may be relatively efficient plan. It is more likely you sould see table B full scanned and the result feed into a hash join, but I have seen the plan you mention.
Back before hash joins were introduced with 7.3 (if my memory is correct) you would see sort/merge joins used more often than you do now. Generally speaking no index on the join conditions would exist for this option to be chosen.
If you really want to know why and sometimes what the optimizer is going to do buy Jonathan Lewis's book Cost-Based Oracle Fundamentals. If explains the optimizer in more depth than any other source I know of.
HTH -- Mark D Powell --

Inner / outer table in nested loops join

I can't understand what 'inner' / 'outer'
table means in nested loops join operation.
please explain the exact meaning.
maybe i do not understand the nested loops
join itself. I tried to find the meanings
in Oracle manual, but I couldn't.

If I understand correctly your question. An outer table loop is where you have a table with a primary key (master table) and you want to iterate into that table which have details forign key (inner loop table) for example you have customers table each have many invoices.
hope that ansowers your query.
<BLOCKQUOTE><font size="1" face="Verdana, Arial">quote:</font><HR>Originally posted by 4baf:
I can't understand what 'inner' / 'outer'
table means in nested loops join operation.
please explain the exact meaning.
maybe i do not understand the nested loops
join itself. I tried to find the meanings
in Oracle manual, but I couldn't.<HR></BLOCKQUOTE>
null

When does oracle use a complete nested loop join?

Hi!
Does Oracle Database use a complete nested loop join? I mean, imagine 2 tables without any indexes.. is there any case where for each row in the outer table Oracle does a complete scan in the inner table? I know that this is the original algorithm for the nested loop join, but some data bases prefer to make a temp table to autoindex the inner table and never makes the complete scan in the inner table..
thanks!!

user12040235 wrote:
If the table do not have indexes.. some data bases prefer to scan one time the inner table, to index all values, and than, for every row in the outter loop table, it will do a index search.
I just like to know oracle does the same thing, or it does the complete scan..If you have two tables without indexes, Oracle may consider scanning one table, extracting the smallest data set it can get away with, and then building a hash table of that data set (rather than creating an in-memory copy with index). At this point Oracle can then do a nested loop join into the in-memory hash table.
However, this is called a hash join, and the order of tables will appear to be reversed, viz:
nested loop
    table scan full ABC
    table scan full XYZ
{code]
becomeshash join
table scan full XYZ
table scan full ABC
See: http://jonathanlewis.wordpress.com/2010/08/02/joins/ as a starting point if you want to read more on this topic.
Regards
Jonathan Lewis

Nested Loop and Driving Table.

In the documentation I read these (/B19306_01/server.102/b14211/optimops.htm#i82080).
The optimizer uses nested loop joins when joining small number of rows, with a good driving condition between the two tables. You drive from the outer loop to the inner loop, so the order of tables in the execution plan is important.
The outer loop is the driving row source. It produces a set of rows for driving the join condition. The row source can be a table accessed using an index scan or a full table scan. Also, the rows can be produced from any other operation. For example, the output from a nested loop join can be used as a row source for another nested loop join.>
I need some help to understand the bold line, i.e. so the order of tables in the execution plan is important.
There are various conflicting opinion about the driving table (some says smaller and some says larger table is good option) and unfortunately I did not understand any of those logic.
I read these threads/blogs also.
CBO (optimizer) nest-loop join question
http://hoopercharles.wordpress.com/2011/03/21/nested-loops-join-the-smaller-table-is-the-driving-table-the-larger-table-is-the-driving-table/
In practice, I have seen explain plan for select e.ename,d.dname
2 from emp e, dept d
3 where e.deptno=d.deptno; usages emp as driving table (I only understood part of Aman's logic that dept table access would be faster when there would be an index available over it and hence it is in the inner loop)
Whereas, SQL> explain plan for
2 select e.ename,d.dname
3 from emp e, dept d
4 where e.deptno=d.deptno
5 and e.deptno=20; usages dept as driving table.
I have use USE_NL_WITH_INDEX with LEADING hint to change it, but it is giving some adverse effect (sometimes with optimizer_mode ALL_ROWS it is ignoring the hint completely).
so the order of tables in the execution plan is important. How can I determine it ? I have read Tom's effective oracle by design but there is also no solution.

Not sure I quite understand the question.
Both threads contain lots of useful information about the broad considerations in play and debunking any myths.
I need some help to understand the bold line, i.e.
"so the order of tables in the execution plan is important"I read this as meaning just what it says, i.e. read the execution plan carefully and that
NESTED LOOP
A
Bis not the same as
NESTED LOOP
B
AA 10053 trace would normally be quite verbose about it's considerations as to join order.

Nested loop vs Hash Join

Hi,
Both the querys are returning same results, but in my first query hash join and second query nested loop . How ? PLs explain
select *
from emp a,dept b
where a.deptno=b.deptno and b.deptno>20;
6 rows
Plan hash value: 4102772462
| Id | Operation                    | Name    | Rows | Bytes | Cost (%CPU)| Time     |
|   0 | SELECT STATEMENT             |         |     6 |   348 |     6 (17)| 00:00:01 |
|* 1 | HASH JOIN                   |         |     6 |   348 |     6 (17)| 00:00:01 |
|   2 |   TABLE ACCESS BY INDEX ROWID| DEPT    |     3 |    60 |     2   (0)| 00:00:01 |
|* 3 |    INDEX RANGE SCAN          | PK_DEPT |     3 |       |     1   (0)| 00:00:01 |
|* 4 |   TABLE ACCESS FULL          | EMP     |     7 |   266 |     3   (0)| 00:00:01 |
Predicate Information (identified by operation id):
   1 - access("A"."DEPTNO"="B"."DEPTNO")
   3 - access("B"."DEPTNO">20)
   4 - filter("A"."DEPTNO">20)
select *
from emp a,dept b
where a.deptno=b.deptno and b.deptno=30;
6 rows
Plan hash value: 568005898
| Id | Operation                    | Name    | Rows | Bytes | Cost (%CPU)| Time     |
|   0 | SELECT STATEMENT             |         |     5 |   290 |     4   (0)| 00:00:01 |
|   1 | NESTED LOOPS                |         |     5 |   290 |     4   (0)| 00:00:01 |
|   2 |   TABLE ACCESS BY INDEX ROWID| DEPT    |     1 |    20 |     1   (0)| 00:00:01 |
|* 3 |    INDEX UNIQUE SCAN         | PK_DEPT |     1 |       |     0   (0)| 00:00:01 |
|* 4 |   TABLE ACCESS FULL          | EMP     |     5 |   190 |     3   (0)| 00:00:01 |
Predicate Information (identified by operation id):
   3 - access("B"."DEPTNO"=30)
   4 - filter("A"."DEPTNO"=30)

Hi,
Unless specifically requested, Oracle picks the best execution plan based on estimates of table sizes, column selectivity and many other variables. Even though Oracle does its best to have the estimates as accurate as possible, they are frequently different, and in some cases quite different, from the actual values.
In the first query, Oracle estimated that the predicate “ b.deptno>20” would limit the number of records to 6, and based on that it decided the use Hash Join.
In the second query, Oracle estimated that the predicate “b.deptno=30” would limit the number of records to 5, and based on that it decided the use Nested Loops Join.
The fact that the actual number of records is the same is irrelevant because Oracle used the estimate, rather the actual number of records to pick the best plan.
HTH,
Iordan
Iotzov

NESTED LOOP or SORT MERGE

Hi All,
How can we judge which one is best to use for the query- NESTED LOOP
- SORT MERGE
- HASH JOINThanx.. Ratan

Hi...
For Nested Loop Joins:
Nested loop joins are useful when small subsets of data are being joined and if the join condition is an efficient way of accessing the second table.
It is very important to ensure that the inner table is driven from (dependent on) the outer table. If the inner table's access path is independent of the outer table, then the same rows are retrieved for every iteration of the outer loop, degrading performance considerably. In such cases, hash joins joining the two independent row sources perform better.
For Hash Join:
Hash joins are used for joining large data sets. The optimizer uses the smaller of two tables or data sources to build a hash table on the join key in memory. It then scans the larger table, probing the hash table to find the joined rows.
For Sort Merge Joins:
Sort merge joins can be used to join rows from two independent sources. Hash joins generally perform better than sort merge joins. On the other hand, sort merge joins can perform better than hash joins if both of the following conditions exist:
The row sources are sorted already.
A sort operation does not have to be done.
However, if a sort merge join involves choosing a slower access method (an index scan as opposed to a full table scan), then the benefit of using a sort merge might be lost.
Sort merge joins are useful when the join condition between two tables is an inequality condition (but not a nonequality) like <, <=, >, or >=. Sort merge joins perform better than nested loop joins for large data sets. You cannot use hash joins unless there is an equality condition.
In a merge join, there is no concept of a driving table. The join consists of two steps:
Sort join operation: Both the inputs are sorted on the join key.
Merge join operation: The sorted lists are merged together.
If the input is already sorted by the join column, then a sort join operation is not performed for that row source.

Why optimizer prefers nested loop over hash join?

What do I look for if I want to find out why the server prefers a nested loop over hash join?
The server is 10.2.0.4.0.
The query is:
SELECT p.*
    FROM t1 p, t2 d
    WHERE d.emplid = p.id_psoft
      AND p.flag_processed = 'N'
      AND p.desc_pool = :b1
      AND NOT d.name LIKE '%DUPLICATE%'
      AND ROWNUM < 2tkprof output is:
Production
call     count       cpu    elapsed       disk      query    current        rows
Parse        1      0.01       0.00          0          0          4           0
Execute      1      0.00       0.01          0          4          0           0
Fetch        1    228.83     223.48          0    4264533          0           1
total        3    228.84     223.50          0    4264537          4           1
Misses in library cache during parse: 1
Optimizer mode: ALL_ROWS
Parsing user id: 108 (SANJEEV)
Rows     Row Source Operation
      1 COUNT STOPKEY (cr=4264533 pr=0 pw=0 time=223484076 us)
      1   NESTED LOOPS (cr=4264533 pr=0 pw=0 time=223484031 us)
10401    TABLE ACCESS FULL T1 (cr=192 pr=0 pw=0 time=228969 us)
      1    TABLE ACCESS FULL T2 (cr=4264341 pr=0 pw=0 time=223182508 us)Development
call     count       cpu    elapsed       disk      query    current        rows
Parse        1      0.01       0.00          0          0          0           0
Execute      1      0.00       0.01          0          4          0           0
Fetch        1      0.05       0.03          0        512          0           1
total        3      0.06       0.06          0        516          0           1
Misses in library cache during parse: 1
Optimizer mode: ALL_ROWS
Parsing user id: 113 (SANJEEV)
Rows     Row Source Operation
      1 COUNT STOPKEY (cr=512 pr=0 pw=0 time=38876 us)
      1   HASH JOIN (cr=512 pr=0 pw=0 time=38846 us)
     51    TABLE ACCESS FULL T2 (cr=492 pr=0 pw=0 time=30230 us)
    861    TABLE ACCESS FULL T1 (cr=20 pr=0 pw=0 time=2746 us)

sanjeevchauhan wrote:
What do I look for if I want to find out why the server prefers a nested loop over hash join?
The server is 10.2.0.4.0.
The query is:
SELECT p.*
FROM t1 p, t2 d
WHERE d.emplid = p.id_psoft
AND p.flag_processed = 'N'
AND p.desc_pool = :b1
AND NOT d.name LIKE '%DUPLICATE%'
AND ROWNUM < 2
You've got already some suggestions, but the most straightforward way is to run the unhinted statement in both environments and then force the join and access methods you would like to see using hints, in your case probably "USE_HASH(P D)" in your production environment and "FULL(P) FULL(D) USE_NL(P D)" in your development environment should be sufficient to see the costs and estimates returned by the optimizer when using the alternate access and join patterns.
This give you a first indication why the optimizer thinks that the chosen access path seems to be cheaper than the obviously less efficient plan selected in production.
As already mentioned by Hemant using bind variables complicates things a bit since EXPLAIN PLAN is not reliable due to bind variable peeking performed when executing the statement, but not when explaining.
Since you're already on 10g you can get the actual execution plan used for all four variants using DBMS_XPLAN.DISPLAY_CURSOR which tells you more than the TKPROF output in the "Row Source Operation" section regarding the estimates and costs assigned.
Of course the result of your whole exercise might be highly dependent on the actual bind variable value used.
By the way, your statement is questionable in principle since you're querying for the first row of an indeterministic result set. It's not deterministic since you've defined no particular order so depending on the way Oracle executes the statement and the physical storage of your data this query might return different results on different runs.
This is either an indication of a bad design (If the query is supposed to return exactly one row then you don't need the ROWNUM restriction) or an incorrect attempt of a Top 1 query which requires you to specify somehow an order, either by adding a ORDER BY to the statement and wrapping it into an inline view, or e.g. using some analytic functions that allow you specify a RANK by a defined ORDER.
This is an example of how a deterministic Top N query could look like:
SELECT
FROM
SELECT p.*
    FROM t1 p, t2 d
    WHERE d.emplid = p.id_psoft
      AND p.flag_processed = 'N'
      AND p.desc_pool = :b1
      AND NOT d.name LIKE '%DUPLICATE%'
ORDER BY <order_criteria>
WHERE ROWNUM <= 1;Regards,
Randolf
Oracle related stuff blog:
http://oracle-randolf.blogspot.com/
SQLTools++ for Oracle (Open source Oracle GUI for Windows):
http://www.sqltools-plusplus.org:7676/
http://sourceforge.net/projects/sqlt-pp/

Generally when does optimizer use nested loop and Hash joins ?

Version: 11.2.0.3, 10.2
Lets say I have a table called ORDER and ORDER_DETAIL.
ORDER_DETAIL is the child table of ORDERS .
This is what I understand about Nested Loop:
When we join ORDER AND ORDER_DETAIL tables oracle will form a 'nested loop' in which for each order_ID in ORDER table (outer loop), oracle will look for corresponding multiple ORDER_IDs in the ORDER_DETAIL table.
Is nested loop used when the driving table (ORDER in this case) is smaller than the child table (ORDER_DETAIL) ?
Is nested loop more likely to use Indexes in general ?
How will these two tables be joined using Hash joins ?
When is it ideal to use hash joins ?

Your description of a nested loop is correct.
The overall rule is that Oracle will use the plan that it calculates to be, in general, fastest. That mostly means fewest I/O's, but there are various factors that adjust its calculation - e.g. it expects index blocks to be cached, multiple reads entries in an index may reference the same block, full scans get multiple blocks per I/O. It also has CPU cost calculations, but they generally only become significant with function / package calls or domain indexes (spatial, text, ...).
Nested loop with an index will require one indexed read of the child table per outer table row, so its I/O cost is roughly twice the number of rows expected to match the where clause conditions on the outer table.
A hash join reads the of the smaller table into a hash table then matches the rows from the larger table against the hash table, so its I/O cost is the cost of a full scan of each table (unless the smaller table is too big to fit in a single in-memory hash table). Hash joins generally don't use indexes - it doesn't use the index to look up each result. It can use an index as a data source, as a narrow version of the table or a way to identify the rows satisfying the other where clause conditions.
If you are processing the whole of both tables, Oracle is likely to use a hash join, and be very fast and efficient about it.
If your where clause restricts it to a just few rows from the parent table and a few corresponding rows from the child table, and you have an index Oracle is likely to use a nested loops solution.
If the tables are very small, either plan is efficient - you may be surprised by its choice.
Don't be worry about plans with full scans and hash joins - they are usually very fast and efficient. Often bad performance comes from having to do nested loop lookups for lots of rows.

HASH JOIN or NESTED LOOP

I've been asked to check if HASH JOIN is more suitable than NESTED LOOP(which CBO chose by default) for the following query.
SELECT CM_DETAILS.TASK_ID FROM GEN_TYPE, CM_DETAILS WHERE ( ( ( ( ( CM_DETAILS.STAT_CODE < 8 ) AND ( GEN_TYPE.TASK_ID = CM_DETAILS.TASK_ID ) ) AND ( GEN_TYPE.DEST_LOCN_ID = 5 ) ) AND ( GEN_TYPE.COM_ID = 7 ) ) AND ( ( ( CM_DETAILS.CASE_NO = 1 ) OR ( CM_DETAILS.CASE_NO = 3 ) ) OR ( CM_DETAILS.CASE_NO = 9 ) ) )
Both GEN_TYPE and CM_DETAILS tables have over 330,000 rows.
Version: 10g R2
Any thoughts?

As gintsp gave you very nice tip but there is initialization parameter "     OPTIMIZER_INDEX_COST_ADJ" which cause what path to be chose for CBO,but usually expert says for changing init paramter setting should be at last resort.
It has default value of 100 which indicates to the CBO that indexed access is 100% as costly (i.e., equally costly) as FULL table scan access.
SQL> column plan_plus_exp format a100
SQL> set linesize 1000
SQL> SET AUTOTRACE TRACEONLY
SQL> SELECT e.ename,d.dname
2    FROM emp e,dept d
3   WHERE e.deptno=d.deptno
4 /
14 rows selected.
Execution Plan
   0      SELECT STATEMENT Optimizer=ALL_ROWS (Cost=7 Card=10 Bytes=320)
   1    0   HASH JOIN (Cost=7 Card=10 Bytes=320)
   2    1     TABLE ACCESS (FULL) OF 'DEPT' (TABLE) (Cost=3 Card=5 Bytes=90)
   3    1     TABLE ACCESS (FULL) OF 'EMP' (TABLE) (Cost=3 Card=14 Bytes=196)
Statistics
        672 recursive calls
          0 db block gets
        151 consistent gets
         27 physical reads
          0 redo size
        793 bytes sent via SQL*Net to client
        508 bytes received via SQL*Net from client
          2 SQL*Net roundtrips to/from client
         15 sorts (memory)
          0 sorts (disk)
         14 rows processed
SQL> /
14 rows selected.
Execution Plan
   0      SELECT STATEMENT Optimizer=ALL_ROWS (Cost=7 Card=10 Bytes=320)
   1    0   HASH JOIN (Cost=7 Card=10 Bytes=320)
   2    1     TABLE ACCESS (FULL) OF 'DEPT' (TABLE) (Cost=3 Card=5 Bytes=90)
   3    1     TABLE ACCESS (FULL) OF 'EMP' (TABLE) (Cost=3 Card=14 Bytes=196)
Statistics
          0 recursive calls
          0 db block gets
         15 consistent gets
          0 physical reads
          0 redo size
        793 bytes sent via SQL*Net to client
        508 bytes received via SQL*Net from client
          2 SQL*Net roundtrips to/from client
          0 sorts (memory)
          0 sorts (disk)
         14 rows processed
SQL> SHOW PARAMETER optimizer
NAME                                 TYPE                             VALUE
optimizer_dynamic_sampling           integer                          2
optimizer_features_enable            string                           10.1.0
optimizer_index_caching              integer                          0
optimizer_index_cost_adj             integer                          100<--------
optimizer_mode                       string                           ALL_ROWS
SQL> ALTER SESSION SET optimizer_index_cost_adj=35
2 /
Session altered.
SQL> SELECT e.ename,d.dname
2    FROM emp e,dept d
3   WHERE e.deptno=d.deptno
4 /
14 rows selected.
Execution Plan
   0      SELECT STATEMENT Optimizer=ALL_ROWS (Cost=5 Card=10 Bytes=320)
   1    0   MERGE JOIN (Cost=5 Card=10 Bytes=320)
   2    1     TABLE ACCESS (BY INDEX ROWID) OF 'DEPT' (TABLE) (Cost=1 Card=5 Bytes=90)
   3    2       INDEX (FULL SCAN) OF 'DEPT_PRIMARY_KEY' (INDEX (UNIQUE)) (Cost=1 Card=5)
   4    1     SORT (JOIN) (Cost=4 Card=14 Bytes=196)
   5    4       TABLE ACCESS (FULL) OF 'EMP' (TABLE) (Cost=3 Card=14 Bytes=196)
Statistics
          1 recursive calls
          0 db block gets
         11 consistent gets
          1 physical reads
          0 redo size
        733 bytes sent via SQL*Net to client
        508 bytes received via SQL*Net from client
          2 SQL*Net roundtrips to/from client
          1 sorts (memory)
          0 sorts (disk)
         14 rows processed
SQL> /
14 rows selected.
Execution Plan
   0      SELECT STATEMENT Optimizer=ALL_ROWS (Cost=5 Card=10 Bytes=320)
   1    0   MERGE JOIN (Cost=5 Card=10 Bytes=320)
   2    1     TABLE ACCESS (BY INDEX ROWID) OF 'DEPT' (TABLE) (Cost=1 Card=5 Bytes=90)
   3    2       INDEX (FULL SCAN) OF 'DEPT_PRIMARY_KEY' (INDEX (UNIQUE)) (Cost=1 Card=5)
   4    1     SORT (JOIN) (Cost=4 Card=14 Bytes=196)
   5    4       TABLE ACCESS (FULL) OF 'EMP' (TABLE) (Cost=3 Card=14 Bytes=196)
Statistics
          0 recursive calls
          0 db block gets
         11 consistent gets
          0 physical reads
          0 redo size
        733 bytes sent via SQL*Net to client
        508 bytes received via SQL*Net from client
          2 SQL*Net roundtrips to/from client
          1 sorts (memory)
          0 sorts (disk)
         14 rows processedKhurram

Hash join vs nested loop

DECLARE @tableA table (Productid varchar(20),Product varchar(20),RateID int)
insert into @tableA values('1','Mobile',2);
insert into @tableA values('2','Chargers',4);
insert into @tableA values('3','Stand',6);
insert into @tableA values('4','Adapter',8);
insert into @tableA values('5','Cover',10);
insert into @tableA values('6','Protector',12);
--SELECT * FROM @tableA
DECLARE @tableB table (id varchar(20),RateID int,Rate int)
insert into @tableB values('1',2,200);
insert into @tableB values('2',4,40);
insert into @tableB values('3',6,60);
insert into @tableB values('4',8,80);
insert into @tableB values('5',10,10);
insert into @tableB values('6',12,15);
--SELECT * FROM @tableB
SELECT Product,Rate
FROM @tableA a
JOIN @tableB b ON a.RateID = b.RateID
Above is the sample query, where in execution plan it shows the Hash Match (inner Join). Now how do I change it to Nested Loop with out changing the query? help plz

Is Hash Match(inner join) or Nested loop is better to have in the query?
That depends on the size of the tables, available indexes etc. The optimizer will (hopefully) make the best choice.
Above is the sample query, where in execution plan it shows the Hash Match (inner Join). Now how do I change it to Nested Loop with out changing the query?
The answer that you should leave that to the optimizer in most cases.
I see that the logical read for nested loop is higher than Hash Match.
But Hash Match tends to need more CPU. The best way to two compare two queries or plans is wallclock time.
On a big tables, how do we reduce the logical read?
Make sure that there are usable indexes.
Erland Sommarskog, SQL Server MVP, [email protected]

Is merge join cartesian more cpu intensibe than nested loop ?

Hi,
just wonderning which access method is more cpu intensive , lets supposed we got 2 the same row sources and doing joing via merge join cartesian and next case is nested loop .
I know NL can be cpu intensive because of tight loop access , but what abour MJC ?
I can see bufferd sort but not sure is that cpu friendly ?
Regards
GregG

Hi,
I think in your case it's more accurate to compare a NESTED LOOP (NL) to a MERGE JOIN (MJ), because CARTESIAN MERGE JOIN is a rather special case of MJ.
Merge join sorts its inputs before combining them, and it could be efficient when one or both of inputs are already sorted.
Regarding your question (which is more CPU intensive):
1) if MERGE JOIN involves disk spills, then CPU is probably irrelevant, because disk operations are much more expensive
2) the amount of work to combine rowsources via a MJ depends on how well they are aligned with respect to each other, so I don't think it can be expressed via a simple formula.
For nested loops, the situation is much more simple: you don't need to do any special work do combine the rowsource, so the cost is just the sum of the cost acquiring the outer rowsource plus the number of iterations times the cost of one iteration. If the data is read from disk, then CPU probably won't matter much, if most of reads are logical ones than CPU becomes of a factor (it's hard to tell how much work CPU will have to do per one logical read because there are two many factors here -- how many columns are accessed, how they are located within the block, are there any expensive math functions applied to any of them etc.)
Best regards,
Nikolay

Help tuning NESTED LOOPS OUTER joins

Hello,
I have inherited this nasty query (below) that is taking an awful time to complete (more than 2 hrs a day)
The worst bit is that I need to outer join my fact table so many times as I need bit’s and pieces from other tables/mviews.
When I look at the explain plan I see that this situation means that the cbo is doing several NESTED LOOPS OUTER join operations. I understand that these nested loops mean going through every row in my primary table to see if there is a match in the secondary table (much smaller) which makes it extremely inefficient, is this right?
The stats on the tables are all refreshed daily.
Any ideas on how I can improve the performance here?
Thanks in advance!
The query:
explain plan for
SELECT x.user_id AS user_id,
x.login_name AS login_name,
c.date_of_birth AS date_of_birth,
x.registration_site AS registration_site,
x.organisation AS organisation,
c.user_title AS user_title,
c.first_name AS first_name,
c.last_name AS last_name,
x.email_address AS email_address,
x.user_status AS user_status,
x.user_privilege AS user_access_privilege,
x.date_registration AS date_registration,
x.affiliate_id AS affiliate_id,
x.mobile_number AS mobile_number,
x.optional_parameter AS vt_number,
gud.display_name AS chat_name,
REPLACE (s4.address_line_1, ',', '') AS address_line_1,
REPLACE (s4.address_line_2, ',', '') AS address_line_2,
REPLACE (s4.town, ',', '') AS town,
REPLACE (s4.county, ',', '') AS county,
REPLACE (s4.postcode, ',', '') AS postcode,
s4.country AS country,
s3.last_login AS last_login_date,
x.email_send_newsletter AS email_send_newsletter,
x.email_give_details_thirdparty AS email_give_details_thirdparty,
NVL (ia.cash_balance, 0) AS current_cash_balance,
NVL (ia.bonus_balance, 0) AS current_bonus_balance,
x.external_affiliate_id AS external_affiliate_id,
r.currency_code AS currency,
NVL (ia.points_balance, 0) AS current_loyalty_points_balance,
p.status AS buyer_status,
NVL (ia.bi_bonus_balance, 0) AS current_bi_bonus_balance,
NVL (ia.pending_balance, 0) AS current_pending_balance,
l.level_name AS current_loyalty_level,
l.date_level_achieved AS date_level_achieved,
NVL (l.current_period_loyalty_points, 0) AS current_period_loyalty_points,
r.region AS user_region,
x.registration_platform AS registration_platform,
x.external_user_name AS external_user_name,
c.home_number AS home_number,
pr.code AS reg_promo_code,
g.date_first_buy AS date_first_buy
FROM gl_user_registrations x,
gl_region r,
MVW_USER_BALANCES ia,
gl_customers c,
gl_user_display_names gud,
gl_user_last_login s3,
(SELECT z.user_id AS user_id,
z.address_line_1 AS address_line_1,
z.address_line_2 AS address_line_2,
z.town AS town,
z.county AS county,
z.postcode AS postcode,
z.country AS country
FROM gl_user_addresses z
WHERE z.is_current = 1) s4,
gl_user_buyer_mapping upm,
gl_buyer p,
mvw_user_loyalty_points l,
MVW_USER_PROMO_CODE_REG pr,
MVW_USER_FIRST_BUY_DATE g
WHERE x.base_region = r.region
AND x.user_id = ia.user_id (+)
AND x.customer_id = c.customer_id(+)
AND x.user_id = gud.user_id (+)
AND x.user_id = s4.user_id (+)
AND x.user_id = s3.user_id (+)
AND x.user_id = upm.user_id (+)
AND upm.buyer_id = p.buyer_id
AND x.user_id = l.user_id (+)
AND x.user_id = pr.user_id (+)
AND x.user_id = g.user_id (+);
select * from table(dbms_xplan.display);
Plan hash value: 2158171613
| Id | Operation | Name | Rows | Bytes | Cost (%CPU)| Time |
| 0 | SELECT STATEMENT | | 100 | 63100 | 135 (1)| 00:00:01 |
| 1 | NESTED LOOPS OUTER | | 100 | 63100 | 135 (1)| 00:00:01 |
| 2 | NESTED LOOPS OUTER | | 100 | 60600 | 120 (1)| 00:00:01 |
| 3 | NESTED LOOPS OUTER | | 100 | 57100 | 105 (1)| 00:00:01 |
| 4 | NESTED LOOPS OUTER | | 100 | 55400 | 90 (2)| 00:00:01 |
| 5 | NESTED LOOPS OUTER | | 100 | 53600 | 70 (2)| 00:00:01 |
|* 6 | HASH JOIN | | 100 | 47000 | 55 (2)| 00:00:01 |
| 7 | TABLE ACCESS FULL | GL_REGION | 18 | 252 | 2 (0)| 00:00:01 |
| 8 | NESTED LOOPS OUTER | | 100 | 22800 | 52 (0)| 00:00:01 |
| 9 | NESTED LOOPS OUTER | | 100 | 19700 | 47 (0)| 00:00:01 |
| 10 | NESTED LOOPS OUTER | | 100 | 17600 | 37 (0)| 00:00:01 |
| 11 | NESTED LOOPS | | 100 | 15800 | 27 (0)| 00:00:01 |
| 12 | NESTED LOOPS | | 102 | 2754 | 17 (0)| 00:00:01 |
| 13 | TABLE ACCESS FULL | GL_BUYER | 6143K| 64M| 2 (0)| 00:00:01 |
| 14 | TABLE ACCESS BY INDEX ROWID| GL_USER_BUYER_MAPPING | 1 | 16 | 1 (0)| 00:00:01 |
|* 15 | INDEX RANGE SCAN | GL_USER_BUYER_MAPPPING_IX | 1 | | 1 (0)| 00:00:01 |
| 16 | TABLE ACCESS BY INDEX ROWID | GL_USER_REGISTRATIONS | 1 | 131 | 1 (0)| 00:00:01 |
|* 17 | INDEX UNIQUE SCAN | PK_GL_USER_REGISTRATIONS | 1 | | 1 (0)| 00:00:01 |
| 18 | TABLE ACCESS BY INDEX ROWID | GL_USER_LAST_LOGIN | 1 | 18 | 1 (0)| 00:00:01 |
|* 19 | INDEX UNIQUE SCAN | GL_USER_LAST_LOGIN_PK | 1 | | 1 (0)| 00:00:01 |
| 20 | TABLE ACCESS BY INDEX ROWID | GL_USER_DISPLAY_NAMES | 1 | 21 | 1 (0)| 00:00:01 |
|* 21 | INDEX UNIQUE SCAN | PK_GL_USER_DISPLAY_NAMES | 1 | | 1 (0)| 00:00:01 |
| 22 | TABLE ACCESS BY INDEX ROWID | GL_CUSTOMERS | 1 | 31 | 1 (0)| 00:00:01 |
|* 23 | INDEX UNIQUE SCAN | PK_GL_CUSTOMERS | 1 | | 1 (0)| 00:00:01 |
|* 24 | TABLE ACCESS BY INDEX ROWID | GL_USER_ADDRESSES | 1 | 66 | 1 (0)| 00:00:01 |
|* 25 | INDEX RANGE SCAN | IX_GL_USER_ADDRESSES1 | 1 | | 1 (0)| 00:00:01 |
| 26 | MAT_VIEW ACCESS BY INDEX ROWID | MVW_USER_FIRST_BUY_DATE | 1 | 18 | 1 (0)| 00:00:01 |
|* 27 | INDEX RANGE SCAN | MVW_USER_FS_DATE_IDX | 1 | | 1 (0)| 00:00:01 |
| 28 | MAT_VIEW ACCESS BY INDEX ROWID | MVW_USER_PROMO_CODE_REG | 1 | 17 | 1 (0)| 00:00:01 |
|* 29 | INDEX RANGE SCAN | MVW_USER_PROMO_CODE_IDX | 1 | | 1 (0)| 00:00:01 |
| 30 | MAT_VIEW ACCESS BY INDEX ROWID | MVW_USER_LOYALTY_POINTS | 1 | 35 | 1 (0)| 00:00:01 |
|* 31 | INDEX RANGE SCAN | MVW_USER_LYP_IDX | 1 | | 1 (0)| 00:00:01 |
| 32 | MAT_VIEW ACCESS BY INDEX ROWID | MVW_USER_BALANCES | 1 | 25 | 1 (0)| 00:00:01 |
|* 33 | INDEX RANGE SCAN | MVW_USER_BALANCES_IDX | 1 | | 1 (0)| 00:00:01 |
Predicate Information (identified by operation id):
6 - access("X"."BASE_REGION"="R"."REGION")
15 - access("UPM"."BUYER_ID"="P"."BUYER_ID")
17 - access("X"."USER_ID"="UPM"."USER_ID")
19 - access("X"."USER_ID"="S3"."USER_ID"(+))
21 - access("X"."USER_ID"="GUD"."USER_ID"(+))
23 - access("X"."CUSTOMER_ID"="C"."CUSTOMER_ID"(+))
24 - filter("Z"."IS_CURRENT"(+)=1)
25 - access("X"."USER_ID"="Z"."USER_ID"(+))
27 - access("X"."USER_ID"="G"."USER_ID"(+))
29 - access("X"."USER_ID"="PR"."USER_ID"(+))
31 - access("X"."USER_ID"="L"."USER_ID"(+))
33 - access("X"."USER_ID"="IA"."USER_ID"(+))

Hi,
1) What you are saying about nested loops is true about any join (except, of course, cartesian joins): you are taking rows from rowsource A and find matching rows from rowsource B. This doesn't make a join method efficient or inefficient.
2) The plan you posted does not indicate any performance problem whatsoever. I know you have one, but it's not possible to address it without having any information about it. Trace it, get dbms_xplan.display_cursor dump with rowsource stats, or real-time SQL monitoring report (if your version and license allow it) and post the results here, then we'd be able to help
3) One efficient way to perform queries of your type (big fact table joined to a bunch of small dimension tables) is star transformation, but there are certain pre-requisites for that (like bitmap indexes on FK constraints) -- please read the documentation on star queries/transformations and see if that is an option for you
Best regards,
Nikolay

Oracle 11g - Nested loops on outer joins

Hello,
I have a select query that was working with no problems. The results are used to insert data into a temp table.
Recently, it would not complete executing. The explain plan shows a cartesian. But, there could be problems with using nested loops on the outer join. Interestingly, when I copy production code and rename the temp table and rename the view, it works.
Can someone take a look at the code and help. Maybe offer a suggestion on tuning too? Thanks.
CREATE TABLE "CT"
( "TN" VARCHAR2(30) NOT NULL ENABLE,
"COL_NAME" VARCHAR2(30) NOT NULL ENABLE,
"CDE" VARCHAR2(5) NOT NULL ENABLE,
"CDE_DESC" VARCHAR2(80) NOT NULL ENABLE,
"CDE_STAT" CHAR(1));
insert into CT (TN, COL_NAME, CDE, CDE_DESC, CDE_STAT)
values ('INDSD', 'STCD', 'U', 'RF', 'A');
insert into CT (TN, COL_NAME, CDE, CDE_DESC, CDE_STAT)
values ('AT', 'TCD', '001', 'RL', 'A');
insert into CT (TN, COL_NAME, CDE, CDE_DESC, CDE_STAT)
values ('AT', 'TCD', '033', 'PFR', 'A');
CREATE TABLE "IPP"
( "IND_ID" NUMBER(9,0) NOT NULL ENABLE,
"PLCD" VARCHAR2(5) NOT NULL ENABLE,
"CBCD" VARCHAR2(5));
insert into IPP (IND_ID, PLCD, CBCD)
values (2007, 'AS', '04');
insert into IPP (IND_ID, PLCD, CBCD)
values (797098, 'AS', '34');
insert into IPP (IND_ID, PLCD, CBCD)
values (797191, 'AS','04');
CREATE TABLE "INDS"
( "OPCD" VARCHAR2(5) NOT NULL ENABLE,
"IND_ID" NUMBER(9,0) NOT NULL ENABLE,
"IND_CID" NUMBER(*,0),
"GFLG" VARCHAR2(1),
"HHID" NUMBER(9,0),
"DOB" DATE,
"DOB_FLAG" VARCHAR2(1),
"VCD" VARCHAR2(5),
"VTDTE" DATE,
"VPPCD" VARCHAR2(4),
"VRCDTE" DATE NOT NULL ENABLE,
"VDSID" NUMBER(9,0),
"VTRANSID" NUMBER(12,0),
"VOWNCD" VARCHAR2(5),
"RCDTE" DATE,
"LRDTE" DATE
insert into INDS (OPCD, IND_ID, IND_CID, GFLG, HHID, DOB, DOB_FLAG, VCD, VTDTE, VPPCD, VRCDTE, VDSID, VTRANSID, VOWNCD, RCDTE, LRDTE)
values ('USST', 2007, 114522319, '', 304087673, to_date('01-01-1980', 'dd-mm-yyyy'), 'F', '2', to_date('06-04-2011 09:21:37', 'dd-mm-yyyy hh24:mi:ss'), '', to_date('06-04-2011 09:21:37', 'dd-mm-yyyy hh24:mi:ss'), 1500016, null, 'USST', to_date('06-04-2011 09:21:37', 'dd-mm-yyyy hh24:mi:ss'), to_date('18-07-2012 21:52:53', 'dd-mm-yyyy hh24:mi:ss'));
insert into INDS (OPCD, IND_ID, IND_CID, GFLG, HHID, DOB, DOB_FLAG, VCD, VTDTE, VPPCD, VRCDTE, VDSID, VTRANSID, VOWNCD, RCDTE, LRDTE)
values ('USST', 304087678, 115242519, '', 304087678, to_date('01-01-1984', 'dd-mm-yyyy'), 'F', '2', to_date('06-04-2011 09:21:39', 'dd-mm-yyyy hh24:mi:ss'), '', to_date('06-04-2011 09:21:39', 'dd-mm-yyyy hh24:mi:ss'), 1500016, null, 'USST', to_date('06-04-2011 09:21:39', 'dd-mm-yyyy hh24:mi:ss'), to_date('18-07-2012 21:52:53', 'dd-mm-yyyy hh24:mi:ss'));
CREATE TABLE "INDS_TYPE"
( "IND_ID" NUMBER(9,0) NOT NULL ENABLE,
"STCD" VARCHAR2(5) NOT NULL ENABLE);
insert into INDS_type (IND_ID, STCD)
values (2007, 'U');
insert into INDS_type (IND_ID, STCD)
values (313250322, 'U');
insert into INDS_type (IND_ID, STCD)
values (480058122, 'U');
CREATE TABLE "PLOP"
( "OPCD" VARCHAR2(5) NOT NULL ENABLE,
"PLCD" VARCHAR2(5) NOT NULL ENABLE,
"PPLF" VARCHAR2(1));
insert into PLOP (OPCD, PLCD, PPLF)
values ('USST', 'SP', 'Y');
insert into PLOP (OPCD, PLCD, PPLF)
values ('PMUSA', 'ST', '');
insert into PLOP (OPCD, PLCD, PPLF)
values ('USST', 'RC', '');
CREATE TABLE "IND_T"
( "OPCD" VARCHAR2(5) NOT NULL ENABLE,
"CID" NUMBER(9,0) NOT NULL ENABLE,
"CBCD" VARCHAR2(5),
"PF" VARCHAR2(1) NOT NULL ENABLE,
"DOB" DATE,
"VCD" VARCHAR2(5),
"VOCD" VARCHAR2(5),
"IND_CID" NUMBER,
"RCDTE" DATE NOT NULL ENABLE
insert into IND_T (OPCD, CID, CBCD,PF, DOB, VCD, VOCD, IND_CID, RCDTE)
values ('JMC', 2007, '04', 'F',to_date('11-10-1933', 'dd-mm-yyyy'), '2', 'PMUSA', 363004880, to_date('30-09-2009 04:31:34', 'dd-mm-yyyy hh24:mi:ss'));
insert into IND_T (OPCD, CID, CBCD,PF, DOB, VCD, VOCD, IND_CID, RCDTE)
values ('JMC', 2008, '04', 'N',to_date('01-01-1980', 'dd-mm-yyyy'), '2', 'PMUSA', 712606335, to_date('05-04-2013 19:36:05', 'dd-mm-yyyy hh24:mi:ss'));
CREATE TABLE "IC"
( "CID" NUMBER(9,0) NOT NULL ENABLE,
"CF" CHAR(1));
insert into IC (CID, CF)
values (2007, 'N');
insert into IC (CID, CF)
values (100, 'N');
insert into IC (CID, CF)
values (200, 'N');
CREATE OR REPLACE FORCE VIEW "INDSS_V" ("OPCD", "IND_ID", "IND_CID", "GFLG", "HHID", "DOB", "DOB_FLAG", "VCD", "VTDTE", "VPPCD", "VRCDTE", "VDSID", "VTRANSID", "VOWNCD", "RCDTE", "LRDTE") AS
SELECT DISTINCT a.OPCD, a.IND_ID, a.IND_CID, a.GFLG, a.HHID,
a.DOB, a.DOB_flag, a.VCD, a.VTDTE,
a.VPPCD, a.VRCDTE, a.VDSID, a.VTRANSID,
a.VOWNCD, a.RCDTE, a.LRDTE
FROM INDS a, INDS_type b
WHERE a.IND_ID = b.IND_ID
AND b.STCD in (select CDE
from CT --database link
where TN = 'INDSD'
and COL_NAME = 'STCD'
and CDE_STAT = 'A') ;
--insert /*+ parallel(IND_T,2) */ into IND_T
select /*+ parallel(a,4) */
a.OPCD as OPCD
, a.IND_ID as CID
, b.CBCD as CBCD
, NULL as BFCD
, 'N' as PF
, a.DOB as DOB
, a.VCD as VCD
, a.VOWNCD as VOCD
, a.IND_CID as IND_CID
, a.RCDTE as RCDTE
from INDSS_V a
, (select /*+ parallel(IPP,4) */ * from IPP IPP , PLOP PLO
where plo.PLCD = ipp.PLCD
and PPLF='Y') b
, IC c
where a.IND_ID = b.IND_ID (+)
and a.OPCD = b.OPCD (+)
and a.IND_ID = c.CID
and c.CF = 'N';

Please consult
HOW TO: Post a SQL statement tuning request - template posting
Also format your code and post it using the [ code ] and [ /code ] tags. (Leave out the extra space after [ and before ])
Sybrand Bakker
Senior Oracle DBA
Edited by: sybrand_b on 10-apr-2013 17:57

Nested loop, merge join and harsh join

Can any one tell me the difference/relationship between nested loop, harsh join and merge join...Thanx

Check Oracle Performance Tuning Guide
13.6 Understanding Joins
http://download-west.oracle.com/docs/cd/B19306_01/server.102/b14211/optimops.htm#i51523

CBO (optimizer) nest-loop join question

Similar Messages

Maybe you are looking for