Efficiency - SubQuery vs Distinct

(Ora816/Win2K)
Which would be more efficient:
a non-correlated subquery, or a 'distinct' operator?
What if I were to have two levels of subquery (again non-correlated)? Does that then become less or more efficient than the distinct?
Thanks

Consider two tables, tblDocument has a primary unique key of DocumentID. tblDocumentLookup has indexing on DocumentNo but it will not be unique in the table.
The version of the first derivation query using distinct would be:
select distinct d.DocumentID, d.VariousOtherFields
from tblDocument d, tblDocumentLookup dl
where d.DocumentID = dl.DocumentNo
and dl.Value = 'mystring';
versus
The version of the first derivation query using a non-correlated subquery would be:
select d.DocumentID, d.VariousOtherFields
from tblDocument d
where d.DocumentID in
(select dl.DocumentNo
from tblDocumentLookup dl
where dl.Value = 'mystring');
For the second derivation, consider a third table, tblLookup which has a primary unique key of LookupID. In the table tblDocumentLookup, the value LookupNo whilst indexed will not be unique in the table.
The version of the second derivation query using distinct would be:
select distinct d.DocumentID, d.VariousOtherFields
from tblDocument d, tblDocumentLookup dl, tblLookup l
where d.DocumentID = dl.DocumentNo and l.LookupID = dl.LookupNo
and dl.OtherValue = X
and l.Value = 'mystring';
versus
The version of the second derivation query using a non-correlated subquery would be:
select d.DocumentID, d.VariousOtherFields
from tblDocument d
where d.DocumentID in
(select dl.DocumentNo
from tblDocumentLookup dl
where dl.OtherValue = X
and dl.LookupNo in ( select l.LookupID
from tblLookup l
where l.Value = 'mystring' ));
I hope this helps explain. From my perspective I see distinct as being inefficient, and the subquery being inefficient, but my question really would be in this type of situation which will be the better of the two evils.
Thanks
Jason.
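A minimal sketch for checking that the two forms of the first query return the same rows before comparing their cost. It uses Python's built-in sqlite3 module rather than Oracle 8.1.6, the column Title stands in for VariousOtherFields, and the sample rows are invented for illustration. It says nothing about which form is faster on Oracle; for that, compare the execution plans of both statements on the real tables.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.executescript("""
    CREATE TABLE tblDocument (DocumentID INTEGER PRIMARY KEY, Title TEXT);
    CREATE TABLE tblDocumentLookup (DocumentNo INTEGER, Value TEXT);
    CREATE INDEX ixLookupDocNo ON tblDocumentLookup (DocumentNo);
    INSERT INTO tblDocument VALUES (1, 'a'), (2, 'b'), (3, 'c');
    -- Document 1 matches twice, so a plain join would return it twice.
    INSERT INTO tblDocumentLookup VALUES
        (1, 'mystring'), (1, 'mystring'), (2, 'other'), (3, 'mystring');
""")

# Form 1: join, then remove the duplicates with DISTINCT.
distinct_rows = conn.execute("""
    SELECT DISTINCT d.DocumentID, d.Title
    FROM tblDocument d, tblDocumentLookup dl
    WHERE d.DocumentID = dl.DocumentNo
      AND dl.Value = 'mystring'
    ORDER BY d.DocumentID
""").fetchall()

# Form 2: non-correlated IN subquery; each document is tested once,
# so no duplicates are produced in the first place.
subquery_rows = conn.execute("""
    SELECT d.DocumentID, d.Title
    FROM tblDocument d
    WHERE d.DocumentID IN (SELECT dl.DocumentNo
                           FROM tblDocumentLookup dl
                           WHERE dl.Value = 'mystring')
    ORDER BY d.DocumentID
""").fetchall()

print(distinct_rows)
print(subquery_rows)
```

Both statements select documents 1 and 3 on this sample data. The difference is where the duplicate handling happens: the DISTINCT form de-duplicates the joined rows, while the IN form only asks whether a match exists.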

Similar Messages

  • Efficiency of "Count(Distinct Case" in SQL

    Hi,
    Could you please let me know if "Count(Distinct Case" statement is efficient for a million rows or is there a better way to do it
    For example -this table below contains a set of customers with status flag as 'new' or 'existing'.
    CREATE TABLE tableA
    ( cust_id NUMBER
    , status VARCHAR(10)
    , txn_id NUMBER
    );
    INSERT INTO tableA (cust_id, status,txn_id) VALUES ( 6433, 'New', 11);
    INSERT INTO tableA (cust_id, status,txn_id) VALUES ( 6433, 'New', 21);
    INSERT INTO tableA (cust_id, status,txn_id) VALUES ( 1234, 'existing', 31);
    INSERT INTO tableA (cust_id, status,txn_id) VALUES ( 1234, 'existing', 41);
    INSERT INTO tableA (cust_id, status,txn_id) VALUES ( 7654, 'New', 51);
    INSERT INTO tableA (cust_id, status,txn_id) VALUES ( 7654, 'New', 61);
    INSERT INTO tableA (cust_id, status,txn_id) VALUES ( 9999, 'existing', 71);
    INSERT INTO tableA (cust_id, status,txn_id) VALUES ( 8888, 'New', 81);
    INSERT INTO tableA (cust_id, status,txn_id) VALUES ( 8888, 'existing', 91);
    INSERT INTO tableA (cust_id, status,txn_id) VALUES ( 2121, 'New', 100);
    I am using the SQL below to calculate the number of distinct customers with status 'New'.
    Select
    Count(Distinct Case When status = 'New' Then cust_id end) New_Cust_Cnt
    from tableA
    Regards
    -Learnsequel

    san wrote:
    Hello,
    Select
    Count(Distinct Case When status = 'New' Then cust_id end) New_Cust_Cnt
    from tableA
    _Use like this:_
    Select
    Count(cust_id) New_Cust_Cnt
    from tableA
    where status='new';
    And also you can create an index on status; it will be faster.
    Thanks,
    Sanjeeva

    Anyhow, you have to use the DISTINCT keyword. Otherwise you will not get the correct results for the OP's data.
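To illustrate the last point, here is a small sketch that loads the OP's ten rows into an in-memory SQLite database (Python's sqlite3 module, not Oracle) and runs three variants. It only compares results, not speed at a million rows; that would need to be measured on the real table.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.execute("CREATE TABLE tableA (cust_id INTEGER, status TEXT, txn_id INTEGER)")
conn.executemany(
    "INSERT INTO tableA (cust_id, status, txn_id) VALUES (?, ?, ?)",
    [
        (6433, "New", 11), (6433, "New", 21),
        (1234, "existing", 31), (1234, "existing", 41),
        (7654, "New", 51), (7654, "New", 61),
        (9999, "existing", 71),
        (8888, "New", 81), (8888, "existing", 91),
        (2121, "New", 100),
    ],
)

# The OP's form: the CASE yields NULL for non-'New' rows, and COUNT ignores NULLs.
case_count = conn.execute(
    "SELECT COUNT(DISTINCT CASE WHEN status = 'New' THEN cust_id END) FROM tableA"
).fetchone()[0]

# Filtering first and keeping DISTINCT gives the same answer.
where_count = conn.execute(
    "SELECT COUNT(DISTINCT cust_id) FROM tableA WHERE status = 'New'"
).fetchone()[0]

# Dropping DISTINCT counts transactions, not customers.
plain_count = conn.execute(
    "SELECT COUNT(cust_id) FROM tableA WHERE status = 'New'"
).fetchone()[0]

print(case_count, where_count, plain_count)
```

Customers 6433 and 7654 each have two 'New' transactions, which is why the version without DISTINCT overcounts. Note also that the comparison is against 'New' with a capital N, matching the stored data.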

  • SELECT DISTINCT alternative

    This is sort of related to another issue that's been posted on this message board.
    As a temp fix, I'm trying to have my Results page pull in a bunch of products from a Products table.
    I have another table of Warnings that get joined by the product ID
    The Warnings table will have multiple items with the same part number (different warning IDs)
    So, when I pull in the products, I get multiples of the same products
    I've tried  a number of different methods to get this not to pull in multiples:
    I've tried DISTINCT just on products
    I've tried a subquery using DISTINCT in the WHERE clause
    DISTINCT just doesn't want to work and I've read it's not exactly the best way to do it.
    How else can I do it?
    Below is basically how I've tried to do it.
    SELECT P.ID, HAZ.PID, HAZ.hazardID
    FROM tblProducts P 
    LEFT OUTER JOIN tblHazard HAZ ON P.ID = HAZ.PID
    WHERE (HAZ.PID IN (SELECT DISTINCT PID FROM tblHazard sHAZ))

    I want multiple rows but not duplicates.
    Lets say I have 7 different partnumbers but in this database I have some that have multiples of the same part number. However, Instead of seeing part numbers:
    1,2,2,3,4,5,5,6,7
    I want to see part numbers:
    1,2,3,4,5,6,7
    The ONLY way I know to fix this is to use DISTINCT but that only works if all the matching rows that have duplicates are exactly the same. In my case, the hazDate and the ID and hazID will be different so they will be DISTINCT. Therefore, I would, no doubt, get "duplicates."
    In other words, I want every part number once but no duplicates.
    Using this example again:
    SELECT DISTINCT PID, hazdate
    FROM          tblHazard
    Above gives me "1,2,2,3,4,5,5,6,7" Since the hazdate/times are  different in every row, it sees them as distinct even though product number is the same. The only one I really want to test if it's DISTINCT is the PID. So, when I test DISTINCT on just the PID, it works! Below works.
    SELECT DISTINCT PID
    FROM          tblHazard
    Gives me 1,2,3,4,5,6,7
    However, as I said above, I need to have the hazDate and hazID available. I run an "If statement." If it's hazID =  1, I show one type of warning and if it's hazID = 4, I show another type.
    So, what I need,  is a different way to eliminate duplicate part number rows.
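One possible way to get each part number once while still keeping hazDate and hazardID is to decide which row per PID should win, and join back to it. The sketch below assumes "the most recent hazDate per PID" is the row wanted and that hazDate is unique within a PID; both are assumptions, as are the sample rows. It runs on SQLite through Python's sqlite3 module, so the syntax may need adjusting for the actual database.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.executescript("""
    CREATE TABLE tblHazard (PID INTEGER, hazardID INTEGER, hazDate TEXT);
    INSERT INTO tblHazard VALUES
        (1, 1, '2010-01-05'),
        (2, 1, '2010-01-06'),
        (2, 4, '2010-02-01'),
        (3, 4, '2010-01-07'),
        (5, 1, '2010-01-08'),
        (5, 4, '2010-03-01');
""")

# The derived table picks one hazDate per PID; joining back on (PID, hazDate)
# recovers the other columns of exactly that row.
one_per_pid = conn.execute("""
    SELECT h.PID, h.hazardID, h.hazDate
    FROM tblHazard h
    JOIN (SELECT PID, MAX(hazDate) AS lastDate
          FROM tblHazard
          GROUP BY PID) latest
      ON latest.PID = h.PID AND latest.lastDate = h.hazDate
    ORDER BY h.PID
""").fetchall()

for row in one_per_pid:
    print(row)
```

If every warning for a product has to be shown, this is the wrong shape: in that case keep all rows and group them by PID in the page code instead.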

  • "connect by" problem with "select distinct"

    When I run the following SQL (using "Scott" DB):
    select *
    from emp
    where deptno = 30 or mgr is null
    start with mgr is null
    connect by prior empno = mgr
    order siblings by ename
    I get the results one would expect. The President is first and all those reporting to him/her are listed in correct sequence.
    EMPNO,ENAME,JOB,MGR,HIREDATE,SAL,COMM,DEPTNO
    7839,KING,PRESIDENT,,11/17/1981,5000,,10
    7698,BLAKE,MANAGER,7839,5/1/1981,2850,,30
    7499,ALLEN,SALESMAN,7698,2/20/1981,1600,300,30
    7900,JAMES,CLERK,7698,12/3/1981,950,,30
    7654,MARTIN,SALESMAN,7698,9/28/1981,1250,1400,30
    7844,TURNER,SALESMAN,7698,9/8/1981,1500,0,30
    7521,WARD,SALESMAN,7698,2/22/1981,1250,500,30
    However, when I run the same query but make it "select distinct" I get the following:
    EMPNO,ENAME,JOB,MGR,HIREDATE,SAL,COMM,DEPTNO
    7499,ALLEN,SALESMAN,7698,2/20/1981,1600,300,30
    7698,BLAKE,MANAGER,7839,5/1/1981,2850,,30
    7900,JAMES,CLERK,7698,12/3/1981,950,,30
    7839,KING,PRESIDENT,,11/17/1981,5000,,10
    7654,MARTIN,SALESMAN,7698,9/28/1981,1250,1400,30
    7844,TURNER,SALESMAN,7698,9/8/1981,1500,0,30
    7521,WARD,SALESMAN,7698,2/22/1981,1250,500,30
    Why would adding "distinct" to the select cause the result to be sorted STRICTLY by ename (per "order siblings by...")?
    Finally, if I "select distinct" but don't specify any "order" I get this, in NO APPARENT order:
    EMPNO,ENAME,JOB,MGR,HIREDATE,SAL,COMM,DEPTNO
    7499,ALLEN,SALESMAN,7698,2/20/1981,1600,300,30
    7521,WARD,SALESMAN,7698,2/22/1981,1250,500,30
    7654,MARTIN,SALESMAN,7698,9/28/1981,1250,1400,30
    7698,BLAKE,MANAGER,7839,5/1/1981,2850,,30
    7839,KING,PRESIDENT,,11/17/1981,5000,,10
    7844,TURNER,SALESMAN,7698,9/8/1981,1500,0,30
    7900,JAMES,CLERK,7698,12/3/1981,950,,30
    Thanks in advance for any insight offered!
    -Gene

    "you have to specify what is going to be the distinct field."
    No you don't. The DISTINCT keyword applies to the whole SELECT list. See your own link.
    In any case this does not appear to have anything to do with what you SELECT, rather that the SORT UNIQUE caused by the DISTINCT keyword appears to prevent the ORDER SIBLINGS BY clause from working correctly.
    Not really sure why you need DISTINCT in this example, no doubt this is being applied elsewhere. Given that you have duplicates in the rowset and that hierarchical query now supports views, perhaps it would be more efficient to apply DISTINCT keyword first, something like...
    SELECT e.*
    FROM (SELECT DISTINCT e.*
    FROM emp e
    WHERE e.deptno = 30
    OR e.mgr IS NULL) e
    START WITH e.mgr IS NULL
    CONNECT BY PRIOR e.empno = e.mgr
    ORDER SIBLINGS BY e.ename;
    Alternatively you could skip ORDER SIBLINGS BY clause and use SYS_CONNECT_BY_PATH function to get your order, something like...
    SELECT e.*
    FROM (SELECT DISTINCT e.*,
    SYS_CONNECT_BY_PATH (e.ename, '/') path -- arguments are an assumption; the original post left them empty
    FROM emp e
    WHERE e.deptno = 30
    OR e.mgr IS NULL
    START WITH e.mgr IS NULL
    CONNECT BY PRIOR e.empno = e.mgr) e
    ORDER BY e.path
    Padders

  • MS Access woes

    Hi all,
    I am quite frustrated with running sql queries on an MS Access 2002 database.
    First a simple query such as
    'Select DISTINCT VariableName from DISTRIBUTIONDATA' returns all the rows in my database.
    I also tried 'Select DISTINCTROW VariableName from DISTRIBUTIONDATA' but to no avail.
    I tried executing the same query in MS Access 2002; of course it works perfectly fine. The other query that I so desperately wish to run is
    Select Count(*) AS Expr1 From (SELECT DISTINCT VariableName, DataGroupName from DistributionData WHERE SettingID='Regional' GROUP BY VariableName, DataGroupName)
    However it just does not seem to work. The java error that I get is
    java.sql.SQLException: [Microsoft][ODBC Microsoft Access Driver] Syntax error in FROM clause.
         at sun.jdbc.odbc.JdbcOdbc.createSQLException(JdbcOdbc.java:6958)
         at sun.jdbc.odbc.JdbcOdbc.standardError(JdbcOdbc.java:7115)
         at sun.jdbc.odbc.JdbcOdbc.SQLExecDirect(JdbcOdbc.java:3111)
    whereas I could not find much of an error in my from clause. The above query is what is being built and is printed out with a system.out statement.
    In fact, even the subquery
    SELECT DISTINCT VariableName, DataGroupName from DistributionData WHERE SettingID='Regional' GROUP BY VariableName, DataGroupName
    from the above FROM clause runs on its own; it simply does not return the correct number of records.
    Thanks in advance for your help...

    Firstly, I don't have much option, I have to use MS access.. i know it sucks.. but still...
    Secondly, Here's some of the code... ( and yeah I have tried n combinations of capitalization - doesn't help)
    Statement inputStmt = inputConnection.createStatement(
    ResultSet.TYPE_SCROLL_INSENSITIVE,
    ResultSet.CONCUR_READ_ONLY);
    String inputQuery = "SELECT DISTINCT VariableName, DataGroupName FROM DistributionData WHERE SettingID='"
    + categories[i] + "' GROUP BY VariableName, DataGroupName";
    System.out.println( inputQuery );
    //get all datagroupnames and variablenames for this category
    ResultSet inputSet = inputStmt.executeQuery( inputQuery );
    // get the count of this result set for each category
    inputSet.last();
    int count1 = inputSet.getRow();
    System.out.println("For " +categories[i]+ " we have "+count1+" rows in inputdatabase");
    inputSet.beforeFirst();
    This is the output that it produces
    You chose to open this file: NationDB.mdb
    SELECT DISTINCT VariableName, DataGroupName FROM DistributionData WHERE SettingID='Regional' GROUP BY VariableName, DataGroupName
    For Regional we have 12 rows in inputdatabase.
    If I change the above query to simply SELECT GROUP BY VariableName from DistributionData ( don't pay too much attention to the capitalization) then it gives me a Syntax error. ..
    Lastly, how do I define a query in Access and select from that?
    Thanks

  • Duplicating in Ledger report

    Hi Guys,
    This is the query i am using to run ledger report.
    I am getting the desired output, but rows are duplicated if there is a Bill no (U_BillNo) for that transaction.
    I have created the UDF (U_BillNo) at row level.
    Please give me a solution.
    SELECT distinct T1.[RefDate], T1.[BaseRef], T1.[Memo], T0.[Debit], T0.[Credit],T1.[LocTotal],T3.[U_BillNo]
    FROM [dbo].[JDT1]  T0
    INNER JOIN [dbo].[OJDT]  T1 ON T0.TransId = T1.TransId
    INNER JOIN OCRD T2 on T2.CardCode = T0.ShortName
    INNER JOIN PCH1 T3 ON T2.CardCode = T3.BaseCard
    WHERE T2.[CardName] =[%0]
    Regards,
    Vamsi.

    Hi Gordon,
    I have created this UDF for AP Invoice in row wise and there is  multiple bill no for same invoice based on diff line items.
    so when i run the report  the same line item is appearing two times one line item displaying with bill no and another line item without bill no.
    i tried to create a subquery for bill no but its showing error like below,
    Normal Query
    SELECT distinct T1.[RefDate], T1.[BaseRef], T1.[Memo], T0.[Debit], T0.[Credit],T1.[LocTotal],T3.[U_BillNo]
    FROM [dbo].[JDT1]  T0
    INNER JOIN [dbo].[OJDT]  T1 ON T0.TransId = T1.TransId
    INNER JOIN OCRD T2 on T2.CardCode = T0.ShortName
    INNER JOIN PCH1 T3 ON T2.CardCode = T3.BaseCard
    WHERE T2.[CardName] =[%0]
    With SubQuery
    SELECT distinct T1.[RefDate], T1.[BaseRef], T1.[Memo], T0.[Debit], T0.[Credit],T1.[LocTotal],
    (SELECT (U_BillNo) FROM PCH1 WHERE T2.CardCode = T3.BaseCard) as 'Bill No'
    FROM [dbo].[JDT1]  T0
    INNER JOIN [dbo].[OJDT]  T1 ON T0.TransId = T1.TransId
    INNER JOIN OCRD T2 on T2.CardCode = T0.ShortName
    INNER JOIN PCH1 T3 ON T2.CardCode = T3.BaseCard
    WHERE T2.[CardName] =[%0]
    Pls give me a solution .
    Regards,
    Vamsi.

  • Disallow select for update

    Hi,
    I am making a read-only user, with select rights on all of the tables in another schema. How do I disallow this user from locking the production tables with select for update?
    Regards
    Nico

    Good idea in principle, but there may be more efficient solutions than DISTINCT. Any operation which prevents view merging appears to work (or rather not work) e.g.
    CREATE OR REPLACE VIEW view_name
    AS
       SELECT /*+ NO_MERGE */ column_name
       FROM   table_name;
    ...is sufficient to raise ORA-02014: cannot select FOR UPDATE from view with DISTINCT, GROUP BY, etc. when attempting SELECT ... FOR UPDATE.
    It is important to note though that /*+ NO_MERGE */ could theoretically be ignored by the optimizer - another cheap alternative you might consider would be to reference ROWNUM in the view definition, e.g.
    CREATE OR REPLACE VIEW view_name
    AS
       SELECT column_name
       FROM   table_name
       WHERE  ROWNUM >= 1;

  • Query running forever..please help ?

    I am running the following query on my Oracle 10g db.
    select count(*) from port p where p.ne not in (Select distinct  ne from network);
    The table port has 9 million records and the subquery Select distinct ne from network returns 85K records.
    I need to get the result, but for some reason the query runs for a long time without returning it.
    Can I correct or optimize this query for faster results?
    Note: the ne column is the foreign key in the port table.
    Help highly appreciated.
    thx

    user8651741 wrote:
    If you are just trying to find records in port without a corresponding record in network you could try something like
    select ne from port
    minus
    select distinct ne from network;
    I tried the query, but it's doing the same thing: running endlessly.
    The Explain for this query gave
    PLAN_TABLE_OUTPUT
    | Id  | Operation              | Name         | Rows  | Bytes | Cost  |
    |   0 | SELECT STATEMENT       |              |  9854K|   142M|   167K|
    |   1 |  MINUS                 |              |       |       |       |
    |   2 |   SORT UNIQUE          |              |  9854K|   140M|   167K|
    |   3 |    TABLE ACCESS FULL   | PORT         |  9854K|   140M|   115K|
    |   4 |   SORT UNIQUE          |              | 85491 |  1586K|   583 |
    |   5 |    INDEX FAST FULL SCAN| SYS_C0037316 | 85491 |  1586K|    70 |
    Note
       - 'PLAN_TABLE' is old version
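A rewrite that is often suggested for this kind of query is NOT EXISTS instead of NOT IN. The sketch below shows that the two agree when network.ne has no NULLs and differ when it does. It uses SQLite through Python's sqlite3 module with a handful of invented rows, so it demonstrates semantics only; whether the rewrite is faster on the 10g database would have to be checked against its own execution plan.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.executescript("""
    CREATE TABLE network (ne TEXT);
    CREATE TABLE port (id INTEGER PRIMARY KEY, ne TEXT);
    INSERT INTO network VALUES ('A'), ('B');
    INSERT INTO port (ne) VALUES ('A'), ('C'), ('C'), ('B'), ('D');
""")

NOT_IN = "SELECT COUNT(*) FROM port p WHERE p.ne NOT IN (SELECT ne FROM network)"
NOT_EXISTS = """
    SELECT COUNT(*) FROM port p
    WHERE NOT EXISTS (SELECT 1 FROM network n WHERE n.ne = p.ne)
"""

# With no NULLs in network.ne, both forms count the three unmatched ports.
not_in_count = conn.execute(NOT_IN).fetchone()[0]
not_exists_count = conn.execute(NOT_EXISTS).fetchone()[0]

# A single NULL in the subquery makes "NOT IN" unknown for every unmatched row,
# so it returns nothing, while NOT EXISTS is unaffected.
conn.execute("INSERT INTO network VALUES (NULL)")
not_in_with_null = conn.execute(NOT_IN).fetchone()[0]
not_exists_with_null = conn.execute(NOT_EXISTS).fetchone()[0]

print(not_in_count, not_exists_count, not_in_with_null, not_exists_with_null)
```

The DISTINCT inside the original NOT IN subquery does not change the result, since membership tests ignore duplicates.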

  • How can I remove distinct from subquery?

    Hello, I'll describe my problem; I don't know whether it may be a bug.
    Suppose We have
    Table A:
    field A1 primary key
    field A2
    field A3
    Table B
    field B1 primary key
    field B2 foreign key to A(A1)
    field B3
    I use reverse tool to build objects from DB schema.
    Everything runs OK, but when I want to retrieve A objects
    with sum(B3) from B, it does not work.
    To achieve this I've made a class C with the fields of class A and a double
    for the sum. The code I use is:
    KodoQuery query = (KodoQuery)getPersistenceManager()
    .newQuery(A.class, "this == b.b2");
    query.declareVariables("B b");
    query.declareImports("x.x.B");
    query.setResultClass(C.class);
    query.setResult("this as a, sum(b.b3) as s");
    query.setGrouping("a");
    Collection c = query.execute();
    I use PostGreSQL 7.4, and the sql that Kodo generates is like this:
    SELECT s.A1, SUM(s.C) AS c
    FROM ( SELECT DISTINCT a.A1 AS A1, b.B3 AS C
    FROM PUBLIC.A a INNER JOIN PUBLIC.B b ON (a.A1 = b.B2) ) s
    GROUP BY s.A1
    But the problem is 'DISTINCT' keyword. If I have records on B
    with same B3 value then:
    B1=1,B2=1,B3=25
    B1=2,B2=1,B3=25
    The query returns SUM(B3)=25 which is not valid!!!, it must be 50.
    It fails with repeated B3 values because distinct keyword.
    I bypass this adding a third field to class C that I will not use, and
    introduce it into query:
    KodoQuery query = (KodoQuery)getPersistenceManager()
    .newQuery(A.class, "this == b.b2");
    query.declareVariables("B b");
    query.declareImports("x.x.B");
    query.setResultClass(C.class);
    query.setResult("this as a, sum(b.b3) as s, sum(b.b1) as t");
    query.setGrouping("a");
    Collection c = query.execute();
    With this, I force to put primary key into subquery, and disables
    DISTINCT, but I never will use this field :(. SQL generated by KODO:
    SELECT s.A1, SUM(s.C) AS c, SUM(s.T) as t
    FROM ( SELECT DISTINCT a.A1 AS A1, b.B3 AS C, b.B1 as T
    FROM PUBLIC.A a INNER JOIN PUBLIC.B b ON (a.A1 = b.B2) ) s
    GROUP BY s.A1
    Questions:
    1.- What am I doing wrong?
    2.- How can I disable DISTINCT in subqueries?
    And, no, I don't want to select by class B and then search A objects.
    I want a C object with A object plus SUM of values from B.
    I am achieving this with this patch.
    I know that when there aren't B objects I don't receive results.
    Is there any way to build an OUTER JOIN with JDOQL ?
    In order to bypass this, I execute the query another time but reseting to
    0, but I will prefer to be able to build an OUTER JOIN ;)
    PD: Little class descriptions:
    class A{ int a1, int a2, int a3} (autogenerated by kodo)
    class B{ int b1, int b2, double b3} (autogenerated by kodo)
    class C{ A a, double c, int t} (generated by me)
    Sorry for my english :(
    thanks.

    I'll address your outer join question first, since it's the easiest. Kodo will
    create outer joins automatically as needed to satisfy your JDOQL. In this case,
    your JDOQL does not warrant an outer join. Your filter criteria on the query is
    "this == b.b2". That means the query cannot and should not return any values
    when there are no A's that match that join. Why not try this equivalent but
    simpler query?
    Query q = pm.newQuery (B.class);
    q.setResult ("b2 as a, sum(b3) as s");
    q.setGrouping ("b2");
    q.setResultClass (C.class);
    That will return results grouped by the A objects from the b2 field, and the sum
    of the b3 field for all B's in each group. It should use an outer join.
    About DISTINCT:
    The pattern of a distinct subselect is caused by two conditions:
    1. Your result clause is non-distinct. That is, you did:
    q.setResult("this as a, sum(b3) as s")
    rather than:
    q.setResult("distinct this as a, sum(b3) as s")
    This means that you want to allow duplicates in the projected values you get back.
    2. You use an unbound variable ("b") in a way that could lead to duplicates
    caused by relational joins. JDO always eliminates duplicates caused by database
    joins.
    So in order to eliminate the possible duplicates caused by #2, Kodo issues the
    query as a DISTINCT subselect. Then in order to allow duplicates of the
    projected results (#1), Kodo uses a non-distinct outer select.
    The simplest way to eliminate the DISTINCT subselect is to use the alternate
    query on B.class I mentioned above. That will result in no DISTINCTs at all.
    Another way is to use the "distinct" keyword in your setResult() call, as I
    showed in #1. That will result in a DISTINCT select without any subselect.
    Without reverting to a SQL query (which you can still execute through JDO APIs,
    as described in the documentation), I believe those are the options available.
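The effect described in the question can be reproduced outside Kodo. The sketch below runs the same two shapes of SQL on SQLite through Python's sqlite3 module, using the two B rows from the question; the table definitions are a guess at the schema, and the second query is plain SQL for the "query on B" idea, not the SQL Kodo would actually generate.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.executescript("""
    CREATE TABLE A (A1 INTEGER PRIMARY KEY, A2 INTEGER, A3 INTEGER);
    CREATE TABLE B (B1 INTEGER PRIMARY KEY, B2 INTEGER REFERENCES A(A1), B3 REAL);
    INSERT INTO A VALUES (1, 0, 0);
    INSERT INTO B VALUES (1, 1, 25), (2, 1, 25);
""")

# DISTINCT subselect: the two (A1 = 1, B3 = 25) rows collapse into one before SUM runs.
distinct_sum = conn.execute("""
    SELECT s.A1, SUM(s.C)
    FROM (SELECT DISTINCT a.A1 AS A1, b.B3 AS C
          FROM A a INNER JOIN B b ON a.A1 = b.B2) s
    GROUP BY s.A1
""").fetchall()

# Grouping B directly by its foreign key keeps both rows.
plain_sum = conn.execute("""
    SELECT b.B2, SUM(b.B3)
    FROM B b
    GROUP BY b.B2
""").fetchall()

print(distinct_sum)
print(plain_sum)
```

Adding B1 to the projection, as in the workaround from the question, makes every row of the subselect unique again, which is why the sum then comes out right.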

  • SQL query with multiple tables - what is the most efficient way?

    Hello I am learning PL/SQL. I have a simple procedure where I need to find number of employees and departments per location as per user input of location_id.
    I have 3 Tables:
    LOCATIONS
    location_id (pk)
    location_name
    DEPARTMENTS
    department_id (pk)
    location_id (fk)
    department_name
    EMPLOYEES
    employee_id (pk)
    department_id (fk)
    employee_name
    1 Location can have 0-MANY Departments
    1 Employee has 1 Department
    Here is the query I came up with for PL/SQL procedure:
    /*Ecount, Dcount are NUMBER variables */
    SELECT SUM (EmployeeCount), COUNT(DepartmentNumber)
         INTO Ecount, Dcount
         FROM     
         (SELECT COUNT(employee_id) EmployeeCount, department_id DepartmentNumber
              FROM employees
              GROUP BY department_id
              HAVING department_id IN
                        (SELECT department_id
                        FROM departments
                        WHERE location_id = userInput));
    I do get the correct result, but I am just wondering if my query is on the right track and if there is a more "efficient" way of doing this.
    Thanks in advance for helping a newbie out.

    Hi,
    Welcome to the forum!
    Something like this will be more efficient:
    SELECT    COUNT (employee_id)               AS ECount
    ,       COUNT (DISTINCT department_id)     AS DCount
    FROM       employees
    WHERE       department_id IN (     SELECT     department_id
                        FROM      departments
                        WHERE      location_id = :userInput
                        )
    ;
    You should also try a join instead of the IN subquery.
    For efficiency, do only the things you need to do.
    For example, you don't need a count of employees in each department, so don't compute one. That means you won't need the in-line view, so don't have one.
    You don't need PL/SQL for this job, so don't use PL/SQL if you don't have to. (I realize this question was out of context, so you may have good reasons for doing this in PL/SQL.)
    Do all filtering as early as possible. Don't waste effort computing things that won't be used .
    A particular example of this is: Never use a HAVING clause when you can use a WHERE clause. What's the difference between a WHERE clause and a HAVING clause? The WHERE clause is applied before aggregate functions are computed, and the HAVING clause is applied after; there's no other difference. Therefore, if the HAVING clause isn't referencing an aggregate function, it could be done in a WHERE clause instead.
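As a quick check that the simpler query gives the same answer as the original one, here is a sketch using SQLite through Python's sqlite3 module. The sample rows are invented, and the INTO clause is left out because the queries run outside PL/SQL here.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.executescript("""
    CREATE TABLE departments (department_id INTEGER PRIMARY KEY, location_id INTEGER);
    CREATE TABLE employees (employee_id INTEGER PRIMARY KEY, department_id INTEGER);
    INSERT INTO departments VALUES (10, 1), (20, 1), (30, 2);
    INSERT INTO employees VALUES (1, 10), (2, 10), (3, 20), (4, 30);
""")
params = {"userInput": 1}

# Original shape: count per department, filter the groups with HAVING, then sum up.
original = conn.execute("""
    SELECT SUM(EmployeeCount), COUNT(DepartmentNumber)
    FROM (SELECT COUNT(employee_id) AS EmployeeCount,
                 department_id AS DepartmentNumber
          FROM employees
          GROUP BY department_id
          HAVING department_id IN (SELECT department_id
                                   FROM departments
                                   WHERE location_id = :userInput))
""", params).fetchone()

# Simplified shape: filter rows with WHERE first, then aggregate once.
simplified = conn.execute("""
    SELECT COUNT(employee_id), COUNT(DISTINCT department_id)
    FROM employees
    WHERE department_id IN (SELECT department_id
                            FROM departments
                            WHERE location_id = :userInput)
""", params).fetchone()

print(original, simplified)
```

Both forms only count departments that have at least one employee; a department with no employees at the location is not counted by either.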

  • How to select the data efficiently from the table

    hi every one,
    I need some help selecting data from the FAGLFLEXA table. I have to select many amounts from different groups of G/L accounts
    (the groups are predefined here, and each contains a set of G/L account numbers).
    If I run a separate select for each group, it will be a performance issue. What should I do to avoid that? Can anyone suggest a method or a sample query so that I can perform the task efficiently?

    Hi ,
    1.select and keep the data in internal table
    2.avoid select inside loop ..endloop.
    3.try to use for all entries
    check the below details
    Hi Praveen,
    Performance Notes
    1.Keep the Result Set Small
    You should aim to keep the result set small. This reduces both the amount of memory used in the database system and the network load when transferring data to the application server. To reduce the size of your result sets, use the WHERE and HAVING clauses.
    Using the WHERE Clause
    Whenever you access a database table, you should use a WHERE clause in the corresponding Open SQL statement. Even if a program containing a SELECT statement with no WHERE clause performs well in tests, it may slow down rapidly in your production system, where the data volume increases daily. You should only dispense with the WHERE clause in exceptional cases where you really need the entire contents of the database table every time the statement is executed.
    When you use the WHERE clause, the database system optimizes the access and only transfers the required data. You should never transfer unwanted data to the application server and then filter it using ABAP statements.
    Using the HAVING Clause
    After selecting the required lines in the WHERE clause, the system then processes the GROUP BY clause, if one exists, and summarizes the database lines selected. The HAVING clause allows you to restrict the grouped lines, and in particular, the aggregate expressions, by applying further conditions.
    Effect
    If you use the WHERE and HAVING clauses correctly:
    • There are no more physical I/Os in the database than necessary
    • No unwanted data is stored in the database cache (it could otherwise displace data that is actually required)
    • The CPU usage of the database host is minimized
    • The network load is reduced, since only the data that is required by the application is transferred to the application server.
    Minimize the Amount of Data Transferred
    Data is transferred between the database system and the application server in blocks. Each block is up to 32 KB in size (the precise size depends on your network communication hardware). Administration information is transported in the blocks as well as the data.
    To minimize the network load, you should transfer as few blocks as possible. Open SQL allows you to do this as follows:
    Restrict the Number of Lines
    If you only want to read a certain number of lines in a SELECT statement, use the UP TO <n> ROWS addition in the FROM clause. This tells the database system only to transfer <n> lines back to the application server. This is more efficient than transferring more lines than necessary back to the application server and then discarding them in your ABAP program.
    If you expect your WHERE clause to return a large number of duplicate entries, you can use the DISTINCT addition in the SELECT clause.
    Restrict the Number of Columns
    You should only read the columns from a database table that you actually need in the program. To do this, list the columns in the SELECT clause. Note here that the INTO CORRESPONDING FIELDS addition in the INTO clause is only efficient with large volumes of data, otherwise the runtime required to compare the names is too great. For small amounts of data, use a list of variables in the INTO clause.
    Do not use * to select all columns unless you really need them. However, if you list individual columns, you may have to adjust the program if the structure of the database table is changed in the ABAP Dictionary. If you specify the database table dynamically, you must always read all of its columns.
    Use Aggregate Functions
    If you only want to use data for calculations, it is often more efficient to use the aggregate functions of the SELECT clause than to read the individual entries from the database and perform the calculations in the ABAP program.
    Aggregate functions allow you to find out the number of values and find the sum, average, minimum, and maximum values.
    Following an aggregate expression, only its result is transferred from the database.
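Applied to the original question (one total per group of G/L accounts), the idea is to let the database compute all group totals in a single statement. The sketch below is not ABAP; it uses SQLite through Python's sqlite3 module purely to show the shape of such a query. The table and column names (items, account_groups, grp, account, amount) and the sample rows are hypothetical and are not the FAGLFLEXA structure.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.executescript("""
    -- Hypothetical line items: one amount per posting.
    CREATE TABLE items (account TEXT, amount INTEGER);
    -- Hypothetical mapping of accounts to predefined groups.
    CREATE TABLE account_groups (grp TEXT, account TEXT);
    INSERT INTO items VALUES ('1000', 100), ('1000', 50), ('2000', 30), ('3000', 20);
    INSERT INTO account_groups VALUES ('cash', '1000'), ('cash', '2000'), ('other', '3000');
""")

# One database access returns one row per group; only the totals travel back.
group_totals = conn.execute("""
    SELECT g.grp, SUM(i.amount) AS total
    FROM account_groups g
    JOIN items i ON i.account = g.account
    GROUP BY g.grp
    ORDER BY g.grp
""").fetchall()

print(group_totals)
```

The alternative of one SELECT per group returns the same totals but needs one database access per group.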
    Data Transfer when Changing Table Lines
    When you use the UPDATE statement to change lines in the table, you should use the WHERE clause to specify the relevant lines, and then SET statements to change only the required columns.
    When you use a work area to overwrite table lines, too much data is often transferred. Furthermore, this method requires an extra SELECT statement to fill the work area.
    Minimize the Number of Data Transfers
    In every Open SQL statement, data is transferred between the application server and the database system. Furthermore, the database system has to construct or reopen the appropriate administration data for each database access. You can therefore minimize the load on the network and the database system by minimizing the number of times you access the database.
    Multiple Operations Instead of Single Operations
    When you change data using INSERT, UPDATE, and DELETE, use internal tables instead of single entries. If you read data using SELECT, it is worth using multiple operations if you want to process the data more than once; otherwise, a simple select loop is more efficient.
    Avoid Repeated Access
    As a rule you should read a given set of data once only in your program, and using a single access. Avoid accessing the same data more than once (for example, SELECT before an UPDATE).
    Avoid Nested SELECT Loops
    A simple SELECT loop is a single database access whose result is passed to the ABAP program line by line. Nested SELECT loops mean that the number of accesses in the inner loop is multiplied by the number of accesses in the outer loop. You should therefore only use nested SELECT loops if the selection in the outer loop contains very few lines.
    However, using combinations of data from different database tables is more the rule than the exception in the relational data model. You can use the following techniques to avoid nested SELECT statements:
    ABAP Dictionary Views
    You can define joins between database tables statically and systemwide as views in the ABAP Dictionary. ABAP Dictionary views can be used by all ABAP programs. One of their advantages is that fields that are common to both tables (join fields) are only transferred once from the database to the application server.
    Views in the ABAP Dictionary are implemented as inner joins. If the inner table contains no lines that correspond to lines in the outer table, no data is transferred. This is not always the desired result. For example, when you read data from a text table, you want to include lines in the selection even if the corresponding text does not exist in the required language. If you want to include all of the data from the outer table, you can program a left outer join in ABAP.
    The links between the tables in the view are created and optimized by the database system. Like database tables, you can buffer views on the application server. The same buffering rules apply to views as to tables. In other words, it is most appropriate for views that you use mostly to read data. This reduces the network load and the amount of physical I/O in the database.
    Joins in the FROM Clause
    You can read data from more than one database table in a single SELECT statement by using inner or left outer joins in the FROM clause.
    The disadvantage of using joins is that redundant data is read from the hierarchically-superior table if there is a 1:N relationship between the outer and inner tables. This can considerably increase the amount of data transferred from the database to the application server. Therefore, when you program a join, you should ensure that the SELECT clause contains a list of only the columns that you really need. Furthermore, joins bypass the table buffer and read directly from the database. For this reason, you should use an ABAP Dictionary view instead of a join if you only want to read the data.
    The runtime of a join statement is heavily dependent on the database optimizer, especially when it contains more than two database tables. However, joins are nearly always quicker than using nested SELECT statements.
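    The effect on the number of database accesses can be sketched as follows. This is Python with an in-memory SQLite database as a stand-in, not ABAP; the tables carrier and flight are hypothetical, and the statement counter relies on the set_trace_callback hook of Python's sqlite3 module. The nested variant sends one statement for the outer selection plus one per outer line, while the join sends a single statement.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.executescript("""
    CREATE TABLE carrier (carrid TEXT PRIMARY KEY, name TEXT);
    CREATE TABLE flight  (carrid TEXT, connid INTEGER);
    INSERT INTO carrier VALUES ('AA', 'Carrier A'), ('BB', 'Carrier B'), ('CC', 'Carrier C');
    INSERT INTO flight  VALUES ('AA', 1), ('AA', 2), ('BB', 3);
""")

statements = []
conn.set_trace_callback(statements.append)  # record every statement SQLite executes

# Nested SELECT loops: the inner selection runs once per line of the outer one.
nested = []
outer = conn.execute("SELECT carrid, name FROM carrier").fetchall()
for carrid, name in outer:
    inner = conn.execute(
        "SELECT connid FROM flight WHERE carrid = ?", (carrid,)
    ).fetchall()
    for (connid,) in inner:
        nested.append((name, connid))
nested_statements = len(statements)

# Join: the same combinations are read with a single statement.
statements.clear()
joined = conn.execute(
    "SELECT c.name, f.connid FROM carrier c JOIN flight f ON f.carrid = c.carrid"
).fetchall()
join_statements = len(statements)

print("nested loops:", nested_statements, "statements; join:", join_statements)
```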
    Subqueries in the WHERE and HAVING Clauses
    Another way of accessing more than one database table in the same Open SQL statement is to use subqueries in the WHERE or HAVING clause. The data from a subquery is not transferred to the application server. Instead, it is used to evaluate conditions in the database system. This is a simple and effective way of programming complex database operations.
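    A small sketch of the idea, again in Python with SQLite as a stand-in rather than Open SQL. The tables document and document_lookup are hypothetical and only loosely modelled on the tables in the original question. The subquery is evaluated inside the database and only the matching documents are returned; the equivalent join returns one row per matching lookup entry unless DISTINCT is added.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.executescript("""
    CREATE TABLE document (document_id INTEGER PRIMARY KEY, title TEXT);
    CREATE TABLE document_lookup (document_no INTEGER, value TEXT);
    INSERT INTO document VALUES (1, 'spec'), (2, 'memo'), (3, 'plan');
    INSERT INTO document_lookup VALUES
        (1, 'mystring'), (1, 'mystring'), (3, 'mystring'), (2, 'other');
""")

# Subquery in the WHERE clause: the lookup rows never leave the database.
subquery_rows = conn.execute("""
    SELECT d.document_id, d.title
    FROM document d
    WHERE d.document_id IN (SELECT dl.document_no
                            FROM document_lookup dl
                            WHERE dl.value = 'mystring')
    ORDER BY d.document_id
""").fetchall()

# Plain join: document 1 appears twice because two lookup rows match it.
join_rows = conn.execute("""
    SELECT d.document_id, d.title
    FROM document d JOIN document_lookup dl ON dl.document_no = d.document_id
    WHERE dl.value = 'mystring'
    ORDER BY d.document_id
""").fetchall()

# Join with DISTINCT: same result as the subquery variant.
distinct_rows = conn.execute("""
    SELECT DISTINCT d.document_id, d.title
    FROM document d JOIN document_lookup dl ON dl.document_no = d.document_id
    WHERE dl.value = 'mystring'
    ORDER BY d.document_id
""").fetchall()

print(subquery_rows, join_rows, distinct_rows, sep="\n")
```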
    Using Internal Tables
    It is also possible to avoid nested SELECT loops by placing the selection from the outer loop in an internal table and then running the inner selection once only using the FOR ALL ENTRIES addition. This technique stems from the time before joins were allowed in the FROM clause. On the other hand, it does prevent redundant data from being transferred from the database.
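    The pattern can be approximated outside ABAP as "collect the keys first, then read all dependent lines with one statement". The sketch below uses Python and SQLite; the helper select_for_all_keys and the table flight are hypothetical, and the IN list is only an analogy for FOR ALL ENTRIES, not a description of how the database interface translates it.

```python
import sqlite3


def select_for_all_keys(conn, keys):
    """Read the flights of all given carriers with a single statement."""
    unique_keys = sorted(set(keys))  # duplicate keys would only repeat work
    if not unique_keys:
        return []  # guard: an empty key list must not become "select everything"
    placeholders = ", ".join("?" for _ in unique_keys)
    sql = (
        "SELECT carrid, connid FROM flight "
        f"WHERE carrid IN ({placeholders}) ORDER BY carrid, connid"
    )
    return conn.execute(sql, unique_keys).fetchall()


conn = sqlite3.connect(":memory:")
conn.executescript("""
    CREATE TABLE flight (carrid TEXT, connid INTEGER);
    INSERT INTO flight VALUES ('AA', 1), ('AA', 2), ('BB', 3), ('CC', 4);
""")

outer_keys = ["AA", "BB", "AA"]  # result of the former outer selection
flights = select_for_all_keys(conn, outer_keys)
nothing = select_for_all_keys(conn, [])
print(flights, nothing)
```

    The guard for the empty key list mirrors the usual ABAP precaution of checking that the internal table is not empty before a SELECT with FOR ALL ENTRIES, since an empty table causes the condition to be ignored and all lines to be read.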
    Using a Cursor to Read Data
    A further method is to decouple the INTO clause from the SELECT statement by opening a cursor using OPEN CURSOR and reading data line by line using FETCH NEXT CURSOR. You must open a new cursor for each nested loop. In this case, you yourself must ensure that the correct lines are read from the database tables in the correct order. This usually requires a foreign key relationship between the database tables, and that they are sorted by the foreign key.
    Minimize the Search Overhead
    You minimize the size of the result set by using the WHERE and HAVING clauses. To increase the efficiency of these clauses, you should formulate them to fit with the database table indexes.
    Database Indexes
    Indexes speed up data selection from the database. They consist of selected fields of a table, of which a copy is then made in sorted order. If you specify the index fields correctly in a condition in the WHERE or HAVING clause, the system only searches part of the index (index range scan).
    The primary index is always created automatically in the R/3 System. It consists of the primary key fields of the database table. This means that for each combination of fields in the index, there is a maximum of one line in the table. This kind of index is also known as UNIQUE.
    If you cannot use the primary index to determine the result set because, for example, none of the primary index fields occur in the WHERE or HAVING clause, the system searches through the entire table (full table scan). For this case, you can create secondary indexes, which can restrict the number of table entries searched to form the result set.
    You specify the fields of secondary indexes using the ABAP Dictionary. You can also determine whether the index is unique or not. However, you should not create secondary indexes to cover all possible combinations of fields.
    Only create one if you select data by fields that are not contained in another index, and the performance is very poor. Furthermore, you should only create secondary indexes for database tables from which you mainly read, since indexes have to be updated each time the database table is changed. As a rule, secondary indexes should not contain more than four fields, and you should not have more than five indexes for a single database table.
    If a table has more than five indexes, you run the risk of the optimizer choosing the wrong one for a particular operation. For this reason, you should avoid indexes with overlapping contents.
    Secondary indexes should contain columns that you use frequently in a selection, and that are as highly selective as possible. The fewer table entries that can be selected by a certain column, the higher that column’s selectivity. Place the most selective fields at the beginning of the index. Your secondary index should be so selective that each index entry corresponds to at most five percent of the table entries. If this is not the case, it is not worth creating the index. You should also avoid creating indexes for fields that are not always filled, where their value is initial for most entries in the table.
    If all of the columns in the SELECT clause are contained in the index, the system does not have to search the actual table data after reading from the index. If you have a SELECT clause with very few columns, you can improve performance dramatically by including these columns in a secondary index.
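    SQLite can be used to observe this effect on a small scale. The following sketch is Python, not ABAP. The table booking and the index booking_carr_conn are hypothetical, EXPLAIN QUERY PLAN is a SQLite-specific statement, and the exact wording of its output differs between SQLite versions. It shows that a selection whose columns are all contained in the index is answered from the index alone, while a selection that needs another column also has to visit the table.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.executescript("""
    CREATE TABLE booking (
        id INTEGER PRIMARY KEY, carrid TEXT, connid INTEGER, price REAL, remarks TEXT
    );
    CREATE INDEX booking_carr_conn ON booking (carrid, connid);
""")


def plan(sql):
    """Return the access-path descriptions reported by EXPLAIN QUERY PLAN."""
    return [row[3] for row in conn.execute("EXPLAIN QUERY PLAN " + sql)]


# All selected columns are part of the index: no access to the table data.
covered = plan("SELECT carrid, connid FROM booking WHERE carrid = 'AA'")

# price is not in the index: the index finds the lines, the table supplies price.
not_covered = plan("SELECT carrid, price FROM booking WHERE carrid = 'AA'")

print(covered)
print(not_covered)
```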
    Formulating Conditions for Indexes
    You should bear in mind the following when formulating conditions for the WHERE and HAVING clauses so that the system can use a database index and does not have to use a full table scan.
    Check for Equality and Link Using AND
    The database index search is particularly efficient if you check all index fields for equality (= or EQ) and link the expressions using AND.
    Use Positive Conditions
    Database indexes only support conditions that describe the result in positive terms, for example, EQ or LIKE. They do not support negative expressions like NE or NOT LIKE.
    If possible, avoid using the NOT operator in the WHERE clause, because it is not supported by database indexes; invert the logical expression instead.
    Using OR
    The optimizer usually stops evaluating the index when an OR expression occurs in the condition. This means that the columns checked using OR are not included in the index search. The exception is OR expressions at the outermost level of a condition. You should try to reformulate conditions that apply OR expressions to columns relevant to the index, for example, into an IN condition.
    Using Part of the Index
    If you construct an index from several columns, the system can still use it even if you only specify a few of the columns in a condition. However, in this case, the sequence of the columns in the index is important. A column can only be used in the index search if all of the columns before it in the index definition have also been specified in the condition.
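    The role of the column sequence can be made visible with SQLite's query plan output. This is a sketch in Python under the same assumptions as before: hypothetical table and index names, SQLite standing in for the database system, and plan text whose wording varies by version. Conditions on the leading index column allow an index search, whereas a condition on the second column alone falls back to scanning the table.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.executescript("""
    CREATE TABLE booking (
        id INTEGER PRIMARY KEY, carrid TEXT, connid INTEGER, price REAL, remarks TEXT
    );
    CREATE INDEX booking_carr_conn ON booking (carrid, connid);
""")


def plan(sql):
    """Return the access-path descriptions reported by EXPLAIN QUERY PLAN."""
    return [row[3] for row in conn.execute("EXPLAIN QUERY PLAN " + sql)]


# Both index columns specified: index search on carrid and connid.
both_columns = plan("SELECT * FROM booking WHERE carrid = 'AA' AND connid = 17")

# Only the first index column specified: the index can still be used.
leading_only = plan("SELECT * FROM booking WHERE carrid = 'AA'")

# Only the second index column specified: the leading column is missing,
# so the index search is not possible and the table is scanned.
trailing_only = plan("SELECT * FROM booking WHERE connid = 17")

for description in (both_columns, leading_only, trailing_only):
    print(description)
```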
    Checking for Null Values
    The IS NULL condition can cause problems with indexes. Some database systems do not store null values in the index structure. Consequently, this field cannot be used in the index.
    Avoid Complex Conditions
    Avoid complex conditions, since the statements have to be broken down into their individual components by the database system.
    Reduce the Database Load
    Unlike application servers and presentation servers, there is only one database server in your system. You should therefore aim to reduce the database load as much as possible. You can use the following methods:
    Buffer Tables on the Application Server
    You can considerably reduce the time required to access data by buffering it in the application server table buffer. Reading a single entry from table T001 can take between 8 and 600 milliseconds, while reading it from the table buffer takes 0.2 - 1 milliseconds.
    Whether a table can be buffered or not depends on its technical attributes in the ABAP Dictionary. There are three buffering types:
    • Resident buffering (100%) The first time the table is accessed, its entire contents are loaded in the table buffer.
    • Generic buffering In this case, you need to specify a generic key (some of the key fields) in the technical settings of the table in the ABAP Dictionary. The table contents are then divided into generic areas. When you access data with one of the generic keys, the whole generic area is loaded into the table buffer. Client-specific tables are often buffered generically by client.
    • Partial buffering (single entry) Only single entries are read from the database and stored in the table buffer.
    When you read from buffered tables, the following happens:
    1. An ABAP program requests data from a buffered table.
    2. The ABAP processor interprets the Open SQL statement. If the table is defined as a buffered table in the ABAP Dictionary, the ABAP processor checks in the local buffer on the application server to see if the table (or part of it) has already been buffered.
    3. If the table has not yet been buffered, the request is passed on to the database. If the data exists in the buffer, it is sent to the program.
    4. The database server passes the data to the application server, which places it in the table buffer.
    5. The data is passed to the program.
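    The read sequence above corresponds to a read-through cache. The following Python sketch models single-entry buffering only; the class TableBuffer, its attributes, and the sample data are hypothetical and do not reflect the actual implementation of the table buffer.

```python
class TableBuffer:
    """Read-through cache for single table entries, keyed by primary key."""

    def __init__(self, read_from_database):
        self._read_from_database = read_from_database  # callable: key -> row
        self._entries = {}
        self.database_reads = 0

    def read(self, key):
        if key in self._entries:
            # The data exists in the buffer and is sent to the program.
            return self._entries[key]
        # Not yet buffered: the request is passed on to the database.
        self.database_reads += 1
        row = self._read_from_database(key)
        # The result is placed in the buffer before the program receives it.
        self._entries[key] = row
        return row


database_table = {"0001": "Company A", "0002": "Company B"}  # hypothetical contents
table_buffer = TableBuffer(database_table.__getitem__)

first = table_buffer.read("0001")   # goes to the database
second = table_buffer.read("0001")  # served from the buffer
print(first, second, table_buffer.database_reads)
```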
    When you change a buffered table, the following happens:
    1. The database table is changed and the buffer on the application server is updated. The database interface logs the update statement in the table DDLOG. If the system has more than one application server, the buffer on the other servers is not updated at once.
    2. All application servers periodically read the contents of table DDLOG, and delete the corresponding contents from their buffers where necessary. The granularity depends on the buffering type. The table buffers in a distributed system are generally synchronized every 60 seconds (parameter: rsdisp/bufreftime).
    3. Within this period, users on non-synchronized application servers will read old data. The data is not recognized as obsolete until the next buffer synchronization. The next time it is accessed, it is re-read from the database.
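    The stale read described in these steps can be reproduced with a small simulation. This is a Python sketch with hypothetical classes Database and AppServer; the list change_log merely stands in for the log table, and the periodic synchronization is triggered by an explicit call instead of a timer.

```python
class Database:
    def __init__(self):
        self.rows = {}
        self.change_log = []  # stands in for the log table that the servers poll


class AppServer:
    def __init__(self, database):
        self.database = database
        self.buffer = {}
        self.log_position = 0  # how much of the change log has been processed

    def read(self, key):
        if key not in self.buffer:
            self.buffer[key] = self.database.rows[key]
        return self.buffer[key]

    def write(self, key, value):
        self.database.rows[key] = value
        self.buffer[key] = value  # the local buffer is updated at once
        self.database.change_log.append((self, key))  # others learn of it via the log

    def synchronize(self):
        """Drop buffered entries that other servers have changed since the last call."""
        for origin, key in self.database.change_log[self.log_position:]:
            if origin is not self:
                self.buffer.pop(key, None)
        self.log_position = len(self.database.change_log)


db = Database()
db.rows["0001"] = "old"
server_a, server_b = AppServer(db), AppServer(db)

server_b.read("0001")          # server B buffers the entry
server_a.write("0001", "new")  # change made on server A
stale = server_b.read("0001")  # before synchronization: old data
server_b.synchronize()
fresh = server_b.read("0001")  # entry was invalidated, so it is re-read
print(stale, fresh)
```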
    You should buffer the following types of tables:
    • Tables that are read very frequently
    • Tables that are changed very infrequently
    • Relatively small tables (few lines, few columns, or short columns)
    • Tables where delayed update is acceptable.
    Once you have buffered a table, take care not to use any Open SQL statements that bypass the buffer.
    The SELECT statement bypasses the buffer when you use any of the following:
    • The BYPASSING BUFFER addition in the FROM clause
    • The DISTINCT addition in the SELECT clause
    • Aggregate expressions in the SELECT clause
    • Joins in the FROM clause
    • The IS NULL condition in the WHERE clause
    • Subqueries in the WHERE clause
    • The ORDER BY clause
    • The GROUP BY clause
    • The FOR UPDATE addition
    Furthermore, all Native SQL statements bypass the buffer.
    Avoid Reading Data Repeatedly
    If you avoid reading the same data repeatedly, you both reduce the number of database accesses and reduce the load on the database. Furthermore, a "dirty read" may occur with database systems other than Oracle. This means that the second time you read data from a database table, it may be different from the data read the first time. To ensure that the data in your program is consistent, you should read it once only and then store it in an internal table.
    Sort Data in Your ABAP Programs
    The ORDER BY clause in the SELECT statement is not necessarily optimized by the database system or executed with the correct index. This can result in increased runtime costs. You should only use ORDER BY if the database sort uses the same index with which the table is read. To find out which index the system uses, use SQL Trace in the ABAP Workbench Performance Trace. If the indexes are not the same, it is more efficient to read the data into an internal table or extract and sort it in the ABAP program using the SORT statement.
    Use Logical Databases
    SAP supplies logical databases for all applications. A logical database is an ABAP program that decouples Open SQL statements from application programs. They are optimized for the best possible database performance. However, it is important that you use the right logical database. The hierarchy of the data you want to read must reflect the structure of the logical database. Otherwise, the logical database can have a negative effect on performance. For example, if you want to read data from a table right at the bottom of the hierarchy of the logical database, it has to read at least the key fields of all tables above it in the hierarchy. In this case, it is more efficient to use a SELECT statement.
    Work Processes
    Work processes execute the individual dialog steps in R/3 applications. The next two sections describe firstly the structure of a work process, and secondly the different types of work process in the R/3 System.
    Structure of a Work Process
    Work processes execute the dialog steps of application programs. They are components of an application server. The following diagram shows the components of a work process:
    Each work process contains two software processors and a database interface.
    Screen Processor
    In R/3 application programming, there is a difference between user interaction and processing logic. From a programming point of view, user interaction is controlled by screens. As well as the actual input mask, a screen also consists of flow logic. The screen flow logic controls a large part of the user interaction. The R/3 Basis system contains a special language for programming screen flow logic. The screen processor executes the screen flow logic. Via the dispatcher, it takes over the responsibility for communication between the work process and the SAPgui, calls modules in the flow logic, and ensures that the field contents are transferred from the screen to the flow logic.
    ABAP Processor
    The actual processing logic of an application program is written in ABAP - SAP’s own programming language. The ABAP processor executes the processing logic of the application program, and communicates with the database interface. The screen processor tells the ABAP processor which module of the screen flow logic should be processed next. The following screen illustrates the interaction between the screen and the ABAP processors when an application program is running.
    Database Interface
    The database interface provides the following services:
    • Establishing and terminating connections between the work process and the database.
    • Access to database tables
    • Access to R/3 Repository objects (ABAP programs, screens and so on)
    • Access to catalog information (ABAP Dictionary)
    • Controlling transactions (commit and rollback handling)
    • Table buffer administration on the application server.
    The following diagram shows the individual components of the database interface:
    The diagram shows that there are two different ways of accessing databases: Open SQL and Native SQL.
    Open SQL statements are a subset of Standard SQL that is fully integrated in ABAP. They allow you to access data irrespective of the database system that the R/3 installation is using. Open SQL consists of the Data Manipulation Language (DML) part of Standard SQL; in other words, it allows you to read (SELECT) and change (INSERT, UPDATE, DELETE) data. The tasks of the Data Definition Language (DDL) and Data Control Language (DCL) parts of Standard SQL are performed in the R/3 System by the ABAP Dictionary and the authorization system. These provide a unified range of functions, irrespective of database, and also contain functions beyond those offered by the various database systems.
    Open SQL also goes beyond Standard SQL to provide statements that, in conjunction with other ABAP constructions, can simplify or speed up database access. It also allows you to buffer certain tables on the application server, saving excessive database access. In this case, the database interface is responsible for comparing the buffer with the database. Buffers are partly stored in the working memory of the current work process, and partly in the shared memory for all work processes on an application server. Where an R/3 System is distributed across more than one application server, the data in the various buffers is synchronized at set intervals by the buffer management. When buffering the database, you must remember that data in the buffer is not always up to date. For this reason, you should only use the buffer for data which does not often change.
    Native SQL is only loosely integrated into ABAP, and allows access to all of the functions contained in the programming interface of the respective database system. Unlike Open SQL statements, Native SQL statements are not checked and converted, but instead are sent directly to the database system. Programs that use Native SQL are specific to the database system for which they were written. R/3 applications contain as little Native SQL as possible. In fact, it is only used in a few Basis components (for example, to create or change table definitions in the ABAP Dictionary).
    The database-dependent layer in the diagram serves to hide the differences between database systems from the rest of the database interface. You choose the appropriate layer when you install the Basis system. Thanks to the standardization of SQL, the differences in the syntax of statements are very slight. However, the semantics and behavior of the statements have not been fully standardized, and the differences in these areas can be greater. When you use Native SQL, the function of the database-dependent layer is minimal.
    Types of Work Process
    Although all work processes contain the components described above, they can still be divided into different types. The type of a work process determines the kind of task for which it is responsible in the application server. It does not specify a particular set of technical attributes. The individual tasks are distributed to the work processes by the dispatcher.
    Before you start your R/3 System, you determine how many work processes it will have, and what their types will be. The dispatcher starts the work processes and only assigns them tasks that correspond to their type. This means that you can distribute work process types to optimize the use of the resources on your application servers.
    The following diagram shows again the structure of an application server, but this time, includes the various possible work process types:
    The various work processes are described briefly below. Other parts of this documentation describe the individual components of the application server and the R/3 System in more detail.
    Dialog Work Process
    Dialog work processes deal with requests from an active user to execute dialog steps.
    Update Work Process
    Update work processes execute database update requests. Update requests are part of an SAP LUW that bundle the database operations resulting from the dialog in a database LUW for processing in the background.
    Background Work Process
    Background work processes process programs that can be executed without user interaction (background jobs).
    Enqueue Work Process
    The enqueue work process administers a lock table in the shared memory area. The lock table contains the logical database locks for the R/3 System and is an important part of the SAP LUW concept. In an R/3 System, you may only have one lock table. You may therefore also only have one application server with enqueue work processes.
    Spool Work Process
    The spool work process passes sequential datasets to a printer or to optical archiving. Each application server may contain several spool work processes.
    The services offered by an application server are determined by the types of its work processes. One application server may, of course, have more than one function. For example, it may be both a dialog server and the enqueue server, if it has several dialog work processes and an enqueue work process.
    You can use the system administration functions to switch a work process between dialog and background modes while the system is still running. This allows you, for example, to switch an R/3 System between day and night operation, where you have more dialog than background work processes during the day, and the other way around during the night.
    ABAP Application Server
    R/3 programs run on application servers. They are an important component of the R/3 System. The following sections describe application servers in more detail.
    Structure of an ABAP Application Server
    The application layer of an R/3 System is made up of the application servers and the message server. Application programs in an R/3 System are run on application servers. The application servers communicate with the presentation components, the database, and also with each other, using the message server.
    The following diagram shows the structure of an application server:
    The individual components are:
    Work Processes
    An application server contains work processes, which are components that are able to execute an application (that is, one dialog step each). Each work process is linked to a memory area containing the context of the application being run. The context contains the current data for the application program. This needs to be available in each dialog step. Further information about the different types of work process is contained later on in this documentation.
    Dispatcher
    Each application server contains a dispatcher. The dispatcher is the link between the work processes and the users logged onto the application server. Its task is to receive requests for dialog steps from the SAP GUI and direct them to a free work process. In the same way, it directs screen output resulting from the dialog step back to the appropriate user.
    Gateway
    Each application server contains a gateway. This is the interface for the R/3 communication protocols (RFC, CPI/C). It can communicate with other application servers in the same R/3 System, with other R/3 Systems, with R/2 Systems, or with non-SAP systems.
    The application server structure as described here aids the performance and scalability of the entire R/3 System. The fixed number of work processes and dispatching of dialog steps leads to optimal memory use, since it means that certain components and the memory areas of a work process are application-independent and reusable. The fact that the individual work processes work independently makes them suitable for a multi-processor architecture. The methods used in the dispatcher to distribute tasks to work processes are discussed more closely in the section Dispatching Dialog Steps.
    Shared Memory
    All of the work processes on an application server use a common main memory area called shared memory to save contexts or to buffer constant data locally.
    The resources that all work processes use (such as programs and table contents) are contained in shared memory. Memory management in the R/3 System ensures that the work processes always address the correct context, that is the data relevant to the current state of the program that is running. A mapping process projects the required context for a dialog step from shared memory into the address of the relevant work process. This reduces the actual copying to a minimum.
    Local buffering of data in the shared memory of the application server reduces the number of database reads required. This reduces access times for application programs considerably. For optimal use of the buffer, you can concentrate individual applications (financial accounting, logistics, human resources) into separate application server groups.
    Database Connection
    When you start up an R/3 System, each application server registers its work processes with the database layer, and receives a single dedicated channel for each. While the system is running, each work process is a user (client) of the database system (server). You cannot change the work process registration while the system is running. Neither can you reassign a database channel from one work process to another. For this reason, a work process can only make database changes within a single database logical unit of work (LUW). A database LUW is an inseparable sequence of database operations. This has important consequences for the programming model explained below.
    Dispatching Dialog Steps
    The number of users logged onto an application server is often many times greater than the number of available work processes, and it is not restricted by the R/3 system architecture. Furthermore, each user can run several applications at once. The dispatcher has the important task of distributing all dialog steps among the work processes on the application server.
    The following diagram is an example of how this might happen:
    1. The dispatcher receives the request to execute a dialog step from user 1 and directs it to work process 1, which happens to be free. The work process addresses the context of the application program (in shared memory) and executes the dialog step. It then becomes free again.
    2. The dispatcher receives the request to execute a dialog step from user 2 and directs it to work process 1, which is now free again. The work process executes the dialog step as in step 1.
    3. While work process 1 is still working, the dispatcher receives a further request from user 1 and directs it to work process 2, which is free.
    4. After work processes 1 and 2 have finished processing their dialog steps, the dispatcher receives another request from user 1 and directs it to work process 1, which is free again.
    5. While work process 1 is still working, the dispatcher receives a further request from user 2 and directs it to work process 2, which is free.
    From this example, we can see that:
    • A dialog step from a program is assigned to a single work process for execution.
    • The individual dialog steps of a program can be executed on different work processes, and the program context must be addressed for each new work process.
    • A work process can execute dialog steps of different programs from different users.
    The example does not show that the dispatcher tries to distribute the requests to the work processes such that the same work process is used as often as possible for the successive dialog steps in an application. This is useful, since it saves the program context having to be addressed each time a dialog step is executed.
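    The five steps of the example can be replayed with a deliberately simple model. The Python sketch below assigns each request to the lowest-numbered free work process; the function dispatch, the request tuples, and the time values are hypothetical, and the sketch does not model the dispatcher's preference for reusing the same work process.

```python
def dispatch(requests, process_count):
    """Assign each request to the lowest-numbered free work process.

    requests: list of (user, start_time, duration), ordered by start_time.
    Returns a list of (user, work_process_number).
    """
    busy_until = [0] * process_count
    assignment = []
    for user, start, duration in requests:
        for index, free_at in enumerate(busy_until):
            if free_at <= start:  # this work process is free when the request arrives
                busy_until[index] = start + duration
                assignment.append((user, index + 1))
                break
        else:
            # A real dispatcher would queue the request; the sketch just stops.
            raise RuntimeError("no free work process")
    return assignment


requests = [
    (1, 0, 1),  # step 1: user 1, work process 1 is free
    (2, 1, 2),  # step 2: user 2, work process 1 is free again
    (1, 2, 2),  # step 3: work process 1 is still working
    (1, 4, 2),  # step 4: both work processes have finished
    (2, 5, 2),  # step 5: work process 1 is still working
]
assignment = dispatch(requests, process_count=2)
print(assignment)
```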
    Dispatching and the Programming Model
    The separation of application and presentation layer made it necessary to split up application programs into dialog steps. This, and the fact that dialog steps are dispatched to individual work processes, has had important consequences for the programming model.
    As mentioned above, a work process can only make database changes within a single database logical unit of work (LUW). A database LUW is an inseparable sequence of database operations. The contents of the database must be consistent at its beginning and end. The beginning and end of a database LUW are defined by a commit command to the database system (database commit). During a database LUW, that is, between two database commits, the database system itself ensures consistency within the database. In other words, it takes over tasks such as locking database entries while they are being edited, or restoring the old data (rollback) if a step terminates in an error.
    A typical SAP application program extends over several screens and the corresponding dialog steps. The user requests database changes on the individual screens that should lead to the database being consistent once the screens have all been processed. However, the individual dialog steps run on different work processes, and a single work process can process dialog steps from other applications. It is clear that two or more independent applications whose dialog steps happen to be processed on the same work process cannot be allowed to work with the same database LUW.
    Consequently, a work process must open a separate database LUW for each dialog step. The work process sends a commit command (database commit) to the database at the end of each dialog step in which it makes database changes. These commit commands are called implicit database commits, since they are not explicitly written into the application program.
    These implicit database commits mean that a database LUW can be kept open for a maximum of one dialog step. This leads to a considerable reduction in database load, serialization, and deadlocks, and enables a large number of users to use the same system.
    However, the question now arises of how this method (1 dialog step = 1 database LUW) can be reconciled with the demand to make commits and rollbacks dependent on the logical flow of the application program instead of the technical distribution of dialog steps. Database update requests that depend on one another form logical units in the program that extend over more than one dialog step. The database changes associated with these logical units must be executed together and must also be able to be undone together.
    The SAP programming model contains a series of bundling techniques that allow you to group database updates together in logical units. The section of an R/3 application program that bundles a set of logically-associated database operations is called an SAP LUW. Unlike a database LUW, an SAP LUW includes all of the dialog steps in a logical unit, including the database update.
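    The general idea of bundling can be sketched as follows: update requests are only collected during the dialog steps and are then executed together in one database transaction, so that they succeed or fail as a unit. This is a Python and SQLite sketch of the principle, not of the actual SAP bundling techniques; the class LogicalUnit, its methods, and the table account are hypothetical.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.execute(
    "CREATE TABLE account (id TEXT PRIMARY KEY, "
    "balance INTEGER NOT NULL CHECK (balance >= 0))"
)
conn.executemany("INSERT INTO account VALUES (?, ?)", [("A", 100), ("B", 0)])
conn.commit()


class LogicalUnit:
    """Collects update requests and applies them in one database transaction."""

    def __init__(self, conn):
        self.conn = conn
        self.pending = []

    def register(self, sql, params):
        self.pending.append((sql, params))  # nothing is sent to the database yet

    def commit_work(self):
        try:
            with self.conn:  # commits on success, rolls back on an exception
                for sql, params in self.pending:
                    self.conn.execute(sql, params)
            return True
        except sqlite3.Error:
            return False
        finally:
            self.pending = []


def balances():
    return dict(conn.execute("SELECT id, balance FROM account"))


transfer = LogicalUnit(conn)
transfer.register("UPDATE account SET balance = balance - ? WHERE id = ?", (30, "A"))  # dialog step 1
transfer.register("UPDATE account SET balance = balance + ? WHERE id = ?", (30, "B"))  # dialog step 2
transfer_ok = transfer.commit_work()
after_success = balances()

overdraw = LogicalUnit(conn)
overdraw.register("UPDATE account SET balance = balance + ? WHERE id = ?", (500, "B"))
overdraw.register("UPDATE account SET balance = balance - ? WHERE id = ?", (500, "A"))  # violates CHECK
overdraw_ok = overdraw.commit_work()
after_failure = balances()  # the first update of the failed unit was undone as well
print(transfer_ok, after_success, overdraw_ok, after_failure)
```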
    Happy Reading...
    shibu

  • Execution of subquery of IN and EXISTS clause.

    Hi Friends,
    Suppose we have the following two tables:
    emp
    empno number
    ename varchar2(100)
    deptno number
    salary number
    dept
    deptno number
    location varchar2(100)
    deptname varchar2(100)
    status varchar2(100)
    Here dept is the master table for emp.
    The following query is clear to me:
    SELECT empno, ename
    FROM emp,dept
    WHERE emp.deptno = dept.deptno
    AND emp.salary >= 5000
    AND dept.status = 'ACTIVE';
    But I want to understand the behaviour of a subquery (used with the IN and EXISTS clauses), for which I am using these tables as an example (just as a demo).
    1)
    Suppose we rewrite the above query as follows:
    SELECT empno, ename
    FROM emp
    WHERE emp.salary >= 5000
    AND deptno in (SELECT deptno FROM dept where status = 'ACTIVE')
    Question: as shown in the above query, suppose our where clause has a condition with an IN construct whose subquery is independent (it does not use any column of the main query's result set). Will that subquery be executed only once, or will it be executed N times (N = number of records in the emp table)?
    In other words, how many times will the subquery of the IN clause in the above query be executed to prepare the subquery's result set?
    2)
    Suppose we use the EXISTS clause (or NOT EXISTS clause) with a subquery that uses a field of the main query in its where clause.
    SELECT E.empno, E.ename
    FROM emp E
    WHERE E.salary >= 5000
    AND EXISTS (SELECT 'X' FROM dept D where status = 'ACTIVE' AND D.deptno = E.deptno)
    Here I have the same confusion. How many times will the subquery for EXISTS be executed by Oracle: once, or N times? (I think it will be N times.)
    3)
    I know we can't define any fixed rule of thumb and it depends heavily on the requirement and other factors, but in general: suppose our main query is on a heavily loaded, large transaction table and needs to check for the existence of a record in some less loaded and somewhat smaller transaction table. Which of the three ways above will be better from a performance point of view? (1. Use of JOIN, 2. Use of IN, 3. Use of EXISTS)
    Please help me resolve these confusions.
    Thanks and Regards,
    Dipali..

    Dipali,
    First, I posted the links under my own name only; I don't know how you picked another handle to address me. Never mind that.
    >
    Now another confusion I got.. I read that even if we use EXISTS and the CBO feels (from statistics and all its analysis) that using IN would be more efficient, then it will rewrite the query. My confusion is that, if the CBO is smart enough to rewrite the query in its most efficient form, is there any scope/need for a developer/DBA to do SQL/query tuning? Does this mean that the developer no longer needs to work hard to write the query in the best manner, and instead just needs to write a query which returns the data required? Does this mean that no experts are required for SQL tuning any more?
    >
    Where did you read that? It would be good to see the reference that says this. I haven't come across any such thing where the CBO will rewrite the query like this. Have a look at the following query. What we want is the list of all the departments which have at least one employee working in them. So how would we write this query? There may be many ways. One of them is to use DISTINCT. Let's see how it works:
    SQL> select * from V$version;
    BANNER
    Oracle Database 11g Enterprise Edition Release 11.1.0.6.0 - Production
    PL/SQL Release 11.1.0.6.0 - Production
    CORE    11.1.0.6.0      Production
    TNS for 32-bit Windows: Version 11.1.0.6.0 - Production
    NLSRTL Version 11.1.0.6.0 - Production
    SQL> set timing on
    SQL> set autot trace exp
    SQL> SELECT distinct  D.deptno, D.dname
      2        FROM     scott.dept D,scott.emp E
      3  where e.deptno=d.deptno
      4  order by d.deptno;
    Elapsed: 00:00:00.12
    Execution Plan
    Plan hash value: 925733878
    | Id  | Operation                     | Name    | Rows  | Bytes | Cost (%CPU)| Time     |
    |   0 | SELECT STATEMENT              |         |     9 |   144 |     7  (29)| 00:00:01 |
    |   1 |  SORT UNIQUE                  |         |     9 |   144 |     7  (29)| 00:00:01 |
    |   2 |   MERGE JOIN                  |         |    14 |   224 |     6  (17)| 00:00:01 |
    |   3 |    TABLE ACCESS BY INDEX ROWID| DEPT    |     4 |    52 |     2   (0)| 00:00:01 |
    |   4 |     INDEX FULL SCAN           | PK_DEPT |     4 |       |     1   (0)| 00:00:01 |
    |*  5 |    SORT JOIN                  |         |    14 |    42 |     4  (25)| 00:00:01 |
    |   6 |     TABLE ACCESS FULL         | EMP     |    14 |    42 |     3   (0)| 00:00:01 |
    Predicate Information (identified by operation id):
       5 - access("E"."DEPTNO"="D"."DEPTNO")
           filter("E"."DEPTNO"="D"."DEPTNO")
    SQL>
    SQL> SELECT distinct  D.deptno, D.dname
      2        FROM     scott.dept D,scott.emp E
      3  where e.deptno=d.deptno
      4  order by d.deptno;
        DEPTNO DNAME
            10 ACCOUNTING
            20 RESEARCH
            30 SALES
    Elapsed: 00:00:00.04
    SQL>
    So the CBO did what we asked it to do: it performed a full sort-merge join. There is nothing wrong with that, but the CBO added no intelligence of its own. So the query looks okay, doesn't it? If the answer is yes, then let's finish the discussion here. If not, let's proceed further.
    We deliberately used the term "at least" here. It means we do not need to match every row of both sources, emp and dept; a single matching employee per department is enough to answer the query. With "our knowledge", we know that EXISTS can do that. Let's write the query that way and see:
    SQL> SELECT   D.deptno, D.dname
      2        FROM     scott.dept D
      3          WHERE    EXISTS
      4                 (SELECT 1
      5                  FROM   scott.emp E
      6                  WHERE  E.deptno = D.deptno)
      7        ORDER BY D.deptno;
        DEPTNO DNAME
            10 ACCOUNTING
            20 RESEARCH
            30 SALES
    Elapsed: 00:00:00.00
    SQL>
    Wow, the result is the same, but there is a small difference in the timing. Note that I ran the query several times so that physical reads and recursive calls would not affect the demo. So it's the same result; let's see the plan.
    SQL> SELECT   D.deptno, D.dname
      2        FROM     scott.dept D
      3          WHERE    EXISTS
      4                 (SELECT 1
      5                  FROM   scott.emp E
      6                  WHERE  E.deptno = D.deptno)
      7        ORDER BY D.deptno;
    Elapsed: 00:00:00.00
    Execution Plan
    Plan hash value: 1090737117
    | Id  | Operation                    | Name    | Rows  | Bytes | Cost (%CPU)| Time     |
    |   0 | SELECT STATEMENT             |         |     3 |    48 |     6  (17)| 00:00:01 |
    |   1 |  MERGE JOIN SEMI             |         |     3 |    48 |     6  (17)| 00:00:01 |
    |   2 |   TABLE ACCESS BY INDEX ROWID| DEPT    |     4 |    52 |     2   (0)| 00:00:01 |
    |   3 |    INDEX FULL SCAN           | PK_DEPT |     4 |       |     1   (0)| 00:00:01 |
    |*  4 |   SORT UNIQUE                |         |    14 |    42 |     4  (25)| 00:00:01 |
    |   5 |    TABLE ACCESS FULL         | EMP     |    14 |    42 |     3   (0)| 00:00:01 |
    Predicate Information (identified by operation id):
       4 - access("E"."DEPTNO"="D"."DEPTNO")
           filter("E"."DEPTNO"="D"."DEPTNO")
    Can you see the keyword SEMI here? It means that Oracle did perform an equijoin, but not a complete one: each dept row is returned at most once, as soon as a first matching emp row is found. Compare the rows and bytes returned, as well as the cost, with the first query. Can you notice the difference?
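The difference between the two query shapes can be reproduced outside Oracle. The following is only a minimal sketch, assuming Python's built-in sqlite3 module as a stand-in database and a made-up copy of the dept/emp data; it shows the semantics (a plain join repeats each department once per employee, so DISTINCT has to remove the repeats afterwards, while EXISTS never produces them), not Oracle's execution plans or timings.

```python
import sqlite3

# In-memory SQLite database as a stand-in for Oracle (assumption for the demo).
conn = sqlite3.connect(":memory:")
conn.executescript("""
    CREATE TABLE dept (deptno INTEGER PRIMARY KEY, dname TEXT);
    CREATE TABLE emp  (empno  INTEGER PRIMARY KEY, deptno INTEGER);
    INSERT INTO dept VALUES (10, 'ACCOUNTING'), (20, 'RESEARCH'),
                            (30, 'SALES'), (40, 'OPERATIONS');
    INSERT INTO emp VALUES (1, 10), (2, 10), (3, 20), (4, 30), (5, 30), (6, 30);
""")

# Plain join: one output row per matching employee, so departments repeat.
plain_join = conn.execute("""
    SELECT d.deptno, d.dname
    FROM dept d, emp e
    WHERE e.deptno = d.deptno
    ORDER BY d.deptno
""").fetchall()

# DISTINCT removes the repeats after the join has produced them.
distinct_join = conn.execute("""
    SELECT DISTINCT d.deptno, d.dname
    FROM dept d, emp e
    WHERE e.deptno = d.deptno
    ORDER BY d.deptno
""").fetchall()

# EXISTS asks only "is there at least one employee?", so no repeats arise.
exists_rows = conn.execute("""
    SELECT d.deptno, d.dname
    FROM dept d
    WHERE EXISTS (SELECT 1 FROM emp e WHERE e.deptno = d.deptno)
    ORDER BY d.deptno
""").fetchall()

print(len(plain_join), distinct_join, exists_rows)
```

With this sample data the plain join returns six rows, while the DISTINCT and EXISTS versions both return the same three departments.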
    So what do we get from all this? You asked whether, if the CBO becomes so smart, we will still need developers and DBAs. The answer is: what does one want to be, a monkey or an astronaut? Confused? Read this:
    http://www.method-r.com/downloads/doc_download/6-the-oracle-advisors-from-a-different-perspective-karen-morton
    So no matter how intelligent the CBO becomes, there will still be limits to where it can go and what it can do. There will always be a need for a human to oversee the automation. Remember, even the most sophisticated system needs some button to be pressed to switch it on, and that is done by a human finger ;-).
    Happy new year!
    HTH
    Aman....

  • Cost of using subquery vs using same table twice in query

    Hi all,
    In a current project, I was asked by my supervisor what is the cost difference between the following two methods. First method is using a subquery to get the name field from table2. A subquery is needed because it requires the field sa_id from table1. The second method is using table2 again under a different alias to obtain table2.name. The two table2 are not self-joined. The outcome of these two queries is the same.
    Using subquery:
    select a.sa_id R1, b.other_field R2,
    (select c.name from table2 c
    where c.b_id = a.sa_id) R3
    from table1 a, table2 b
    where ...
    Using same table twice (table2 under 2 different aliases):
    select a.sa_id R1, b.other_field R2, c.name R3
    from table1 a, table2 b, table2 c
    where
    c.b_id = a.sa_id
    and ....
    Can anyone tell me which version is better and why? (or under what circumstances, which version is better). And what are the costs involved? Many thanks.

    pl/sql novice wrote:
    Hi all,
    In a current project, I was asked by my supervisor what is the cost difference between the following two methods. First method is using a subquery to get the name field from table2. A subquery is needed because it requires the field sa_id from table1. The second method is using table2 again under a different alias to obtain table2.name. The two table2 are not self-joined. The outcome of these two queries is the same.
    Using subquery:
    Using same table twice (table2 under 2 different aliases)
    Can anyone tell me which version is better and why? (or under what circumstances, which version is better). And what are the costs involved? Many thanks.
    In theory, if you use the scalar subquery approach, the correlated subquery needs to be executed for each row of your result set. Depending on how efficiently the subquery is performed, this could require significant resources, since that recursive SQL needs to be executed for each row.
    The "join" approach needs to read the table only twice, and maybe it can even use an indexed access path. So in theory the join approach should perform better in most cases.
    Now the Oracle runtime engine (since Version 8) introduces a feature called "filter optimization" that also applies to correlated scalar subqueries. Basically it works like an in-memory hash table that caches the (hashed) input values to the (deterministic) correlated subquery and the corresponding output values. The number of entries of the hash table is fixed until 9i (256 entries), whereas in 10g it is controlled by an internal parameter that determines the size of the table (which can therefore hold a different number of entries depending on the size of each element).
    If the input value of the next row is the same as the input value of the previous row, this optimization immediately returns the corresponding output value without any further action. Otherwise, if the input value can be found in the hash table, the corresponding output value is returned. If it is not found, the query is executed and Oracle attempts to store the new input/output combination in the hash table; if a hash collision occurs, the combination is discarded.
    So the effectiveness of this clever optimization largely depends on three different factors: the order of the input values (because as long as the input value doesn't change, the corresponding output value will be returned immediately without any further action required), the number of distinct input values, and finally the rate of hash collisions that might occur when attempting to store a combination in the in-memory hash table.
    In summary, unfortunately you can't really tell in advance how well this optimization is going to work at runtime, and therefore it can't be properly reflected in the execution plan.
    You need to test both approaches individually because in the optimal case the optimization of the scalar subquery will be superior to the join approach, but it could also well be the other way around, depending on the factors mentioned.
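To make the three factors more tangible, here is a toy Python model of the caching behaviour described above. It is only a sketch built from that description, not Oracle's actual implementation: the class name, the one-entry-per-bucket layout and the tiny bucket counts are assumptions chosen so the effects are easy to see.

```python
class ScalarSubqueryCache:
    """Toy model of the described caching; NOT Oracle's real implementation."""

    def __init__(self, subquery, buckets=256):
        self.subquery = subquery          # stand-in for the correlated subquery
        self.buckets = [None] * buckets   # one (input, output) slot per bucket
        self.last = None                  # most recent (input, output) pair
        self.executions = 0               # how often the subquery really ran

    def lookup(self, value):
        # 1. Same input as the previous row: answer immediately.
        if self.last is not None and self.last[0] == value:
            return self.last[1]
        # 2. Input already cached in its bucket: reuse the stored output.
        slot = hash(value) % len(self.buckets)
        entry = self.buckets[slot]
        if entry is not None and entry[0] == value:
            self.last = entry
            return entry[1]
        # 3. Otherwise run the subquery and remember the result.
        self.executions += 1
        result = self.subquery(value)
        self.last = (value, result)
        # Store only if the bucket is free; on a collision the pair is discarded.
        if entry is None:
            self.buckets[slot] = self.last
        return result


def count_executions(values, buckets):
    """Feed the input values through the cache and count real executions."""
    cache = ScalarSubqueryCache(lambda v: v * 10, buckets)
    for v in values:
        cache.lookup(v)
    return cache.executions


ids = [1, 2, 3, 4] * 3                      # 4 distinct inputs over 12 rows
sorted_count = count_executions(sorted(ids), buckets=2)
unsorted_count = count_executions(ids, buckets=2)
roomy_count = count_executions(ids, buckets=256)
print(sorted_count, unsorted_count, roomy_count)
```

With these toy inputs and only two buckets, the sorted order needs 4 executions and the interleaved order needs 8, because two of the four values collide and are never cached; with 256 buckets the interleaved order also needs only 4.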
    Regards,
    Randolf
    Oracle related stuff blog:
    http://oracle-randolf.blogspot.com/
    SQLTools++ for Oracle (Open source Oracle GUI for Windows):
    http://www.sqltools-plusplus.org:7676/
    http://sourceforge.net/projects/sqlt-pp/

  • Eliminating duplicates from subquery...

    What is the best way to eliminate duplicates from a subquery:
    SELECT dept_no, dept_name
    FROM dept D
    WHERE EXISTS (
    SELECT 'X'
    FROM emp E
    WHERE E.dept_no = D.dept_no);
    OR
    SELECT dept_no, dept_name
    FROM dept D
    WHERE EXISTS ( SELECT 'X'
    FROM emp E
    WHERE E.dept_no = D.dept_no AND ROWNUM < 2);
    Thanks!

    >
    UPDATE TABLE1
    SET COL1 = (
    SELECT DISTINCT COL1
    FROM TABLE2, TABLE3
    WHERE TABLE2.ID = TABLE3.ID
    You need to refine your example. At present you appear to be updating every row in table1 to the same value - but only if col1 (which could be from table2 or table3 - or might be accidental capture from table1) holds just one distinct value across the query; but it looks as if you're likely to get 'single row subquery returns more than one row' as an error.
    I guess you're trying to do something LIKE:
    update t1
    set t1.col1 = (
      select t2.col2
      from t2
      where  t2.colx = t1.coly
      and  exists (
        select null
        from t3
        where t3.id = t2.id
      )
    )
    The most efficient access path depends on how many rows will have to be examined in each table, and how many times you will have to jump to another table to find related data - and if your query is roughly the shape of this one, the optimizer may be able to transform it in a variety of ways to find an efficient access path.
    As it stands, my example will be setting col1 to null whenever there is no match for coly in table t2 - and the optimizer would have to drive off t1 looking at every row to do this. Your requirement, and available predicates, indexes and constraints, may allow, or force, a completely different strategy.
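The "set to null whenever there is no match" effect is easy to reproduce. The following is a minimal sketch, assuming Python's built-in sqlite3 module in place of Oracle and made-up sample rows for t1, t2 and t3; only the shape of the update statement comes from the example above.

```python
import sqlite3

# In-memory SQLite database as a stand-in for Oracle (assumption for the demo).
conn = sqlite3.connect(":memory:")
conn.executescript("""
    CREATE TABLE t1 (coly INTEGER, col1 TEXT);
    CREATE TABLE t2 (id INTEGER, colx INTEGER, col2 TEXT);
    CREATE TABLE t3 (id INTEGER);
    INSERT INTO t1 VALUES (1, 'old-1'), (2, 'old-2'), (3, 'old-3');
    INSERT INTO t2 VALUES (100, 1, 'new-1'), (200, 2, 'new-2');
    INSERT INTO t3 VALUES (100);
""")

# No WHERE clause on the UPDATE itself: every row of t1 is visited, and rows
# whose subquery finds nothing are overwritten with NULL.
conn.execute("""
    UPDATE t1
    SET col1 = (
        SELECT t2.col2
        FROM t2
        WHERE t2.colx = t1.coly
        AND EXISTS (SELECT NULL FROM t3 WHERE t3.id = t2.id)
    )
""")

after_update = conn.execute(
    "SELECT coly, col1 FROM t1 ORDER BY coly"
).fetchall()
print(after_update)
```

Only the row with coly = 1 has a t2 row that also appears in t3, so it is the only one that keeps a non-null value; the other two rows lose their old values.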
    Regards
    Jonathan Lewis
    http://jonathanlewis.wordpress.com
    http://www.jlcomp.demon.co.uk

  • Can a different SubQuery replace a Set function Minus?

    I'm a student in an Oracle SQL class using 10g. We are nearing the end of our class.
    We are working on a chapter on Subqueries.
    In one of the homework problems I have a solution that uses a Subquery.
    But it also uses a Minus. We studied the Set functions earlier. So my solution
    does use things we have already studied. I'm just wondering if the whole
    problem could be solved with a different use of Subqueries.
    And eliminating the use of the Minus.
    Here is the question - 'List the title of all books in the same category as books previously
    purchased by customer 1007. Do not include books already purchased by this customer.'
    And here is my solution:
    SELECT InitCap(Title) AS "Book Title",
    Category
    FROM Books
    WHERE Category IN
    ( SELECT DISTINCT(Category)
    FROM Books JOIN OrderItems USING (ISBN)
    JOIN Orders USING (Order#)
    JOIN Customers USING (Customer#)
    WHERE Customer# = 1007 )
    Minus
    SELECT InitCap(Title),
    Category
    FROM Books JOIN OrderItems USING (ISBN)
    JOIN Orders USING (Order#)
    JOIN Customers USING (Customer#)
    WHERE Customer# = 1007
    ORDER BY Category;
    There is nothing tricky about the tables.
    Customers has Customer# which Joins to Orders via the Customer#.
    Orders Joins to a table called OrderItems via the Order#.
    OrderItems is also joined to Books via the ISBN.
    So to get the details of an Order (like which specific books and quantities)
    we have to get to Orderitems which has the ISBN and the Quantity.
    But every time I go back and look at this question I keep seeing the answer as
    'Find the large group, take out the part we don't need, leaving the answer'.
    Well, I hope I gave enough explanation here.
    Thanks for any thoughts or advice.

    Hi,
    To understand the problem better, let's do the join that finds the given customer's books only once, by putting it in a WITH clause. That has the additional advantage of keeping it away from the rest of the query.
    Since you understand the issues with INITCAP (Title), let's further simplify by not doing INITCAP.
    Now we can concentrate on alternatives to MINUS.
    MINUS, which you are already doing, can thus be written like this:
    WITH  this_customers_orders  AS
    (
         SELECT     Title
         ,     Category
         FROM     Books
         JOIN     OrderItems     USING (ISBN)
         JOIN     Orders          USING (Order#)
         WHERE     Customer#     = 1007
    )
    SELECT     Title
    ,     Category
    FROM     Books
    WHERE     Category     IN (
                      SELECT  category
                      FROM        this_customers_orders
                   )
    MINUS
    SELECT     *
    FROM     this_customers_orders;
    You should always format code, so you can easily see where sub-queries begin and end, and what the principal parts of each query are. This is especially important when you are learning.
    To get the same results using NOT EXISTS:
    WITH  this_customers_orders  AS
    (
         SELECT     Title     ...     -- as shown above
    )
    SELECT     Title
    ,     Category
    FROM     Books     b
    WHERE     Category     IN (
                      SELECT  category
                      FROM        this_customers_orders
                   )
    AND     NOT EXISTS (
                     SELECT  NULL
                     FROM        this_customers_orders
                     WHERE   Title   = b.Title
                     );
    Note that the NOT EXISTS sub-query is correlated to the main query. Perhaps 99% of the uncorrelated EXISTS (and NOT EXISTS) sub-queries that I've seen have been errors.
    You can also get the same results using NOT IN:
    WITH  this_customers_orders  AS
    (
         SELECT     Title     ...     -- as shown above
    )
    SELECT     Title
    ,     Category
    FROM     Books     b
    WHERE     Category     IN (
                      SELECT  category
                      FROM        this_customers_orders
                   )
    AND     Title      NOT IN (
                   SELECT  Title
                   FROM    this_customers_orders
                   WHERE   Title     IS NOT NULL
                   );
    Notice the condition "WHERE Title IS NOT NULL" in the NOT IN sub-query. "x NOT IN (sub_query_y)" will never be TRUE if sub_query_y has even one NULL value.
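The NULL behaviour of NOT IN can be checked with a few rows. This is a minimal sketch, assuming Python's built-in sqlite3 module in place of Oracle (it follows the same three-valued logic for NOT IN) and two made-up tables, books and purchased, that are not part of the homework schema.

```python
import sqlite3

# In-memory SQLite database as a stand-in for Oracle (assumption for the demo).
conn = sqlite3.connect(":memory:")
conn.executescript("""
    CREATE TABLE books (title TEXT);
    CREATE TABLE purchased (title TEXT);
    INSERT INTO books VALUES ('A'), ('B'), ('C');
    INSERT INTO purchased VALUES ('A'), (NULL);
""")

# One NULL in the sub-query makes "title NOT IN (...)" unknown for every
# title that is not in the list, so no row qualifies.
unguarded = conn.execute("""
    SELECT title FROM books
    WHERE title NOT IN (SELECT title FROM purchased)
    ORDER BY title
""").fetchall()

# Filtering the NULLs out inside the sub-query restores the intended result.
guarded = conn.execute("""
    SELECT title FROM books
    WHERE title NOT IN (SELECT title FROM purchased
                        WHERE title IS NOT NULL)
    ORDER BY title
""").fetchall()

print(unguarded, guarded)
```

The unguarded query returns no rows at all, while the guarded one returns B and C.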
    You can also get the same results by doing an outer join:
    WITH  this_customers_orders  AS
    (
         SELECT     Title     ...     -- as shown above
    )
    SELECT     b.Title
    ,     b.Category
    FROM          Books               b
    LEFT OUTER JOIN     this_customers_orders     t     ON b.Title = t.Title
    WHERE     b.Category     IN (
                      SELECT  category
                      FROM        this_customers_orders
                   )
    AND     t.Title        IS NULL;
    An outer join is almost certainly the least efficient way of doing this. The reason I mentioned it here is so that, if you ever see yourself writing an outer join like this, you'll have a model for writing it more efficiently as a MINUS, NOT EXISTS or NOT IN query instead.
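As a quick cross-check that the four formulations return the same rows, here is a minimal sketch, assuming Python's built-in sqlite3 module in place of Oracle. The schema is simplified and hypothetical (a books table and an orderitems table keyed by title rather than ISBN), and SQLite spells MINUS as EXCEPT. It compares results only; it says nothing about which form Oracle executes fastest.

```python
import sqlite3

# In-memory SQLite database as a stand-in for Oracle (assumption for the demo).
conn = sqlite3.connect(":memory:")
conn.executescript("""
    CREATE TABLE books (title TEXT, category TEXT);
    CREATE TABLE orderitems (customer INTEGER, title TEXT);
    INSERT INTO books VALUES
        ('Sql Basics', 'COMPUTER'), ('Db Design', 'COMPUTER'),
        ('Cooking 101', 'COOKING'), ('Bread', 'COOKING'), ('Yoga', 'FITNESS');
    INSERT INTO orderitems VALUES
        (1007, 'Sql Basics'), (1007, 'Cooking 101'), (1008, 'Yoga');
""")

# Shared WITH clause: the books customer 1007 has already bought.
CTE = """
    WITH this_customers_orders AS (
        SELECT b.title, b.category
        FROM books b JOIN orderitems o ON o.title = b.title
        WHERE o.customer = 1007
    )
"""
SAME_CATEGORY = "category IN (SELECT category FROM this_customers_orders)"

variants = {
    "except": f"""{CTE}
        SELECT title, category FROM books WHERE {SAME_CATEGORY}
        EXCEPT
        SELECT title, category FROM this_customers_orders""",
    "not_exists": f"""{CTE}
        SELECT title, category FROM books b WHERE {SAME_CATEGORY}
        AND NOT EXISTS (SELECT NULL FROM this_customers_orders t
                        WHERE t.title = b.title)""",
    "not_in": f"""{CTE}
        SELECT title, category FROM books WHERE {SAME_CATEGORY}
        AND title NOT IN (SELECT title FROM this_customers_orders
                          WHERE title IS NOT NULL)""",
    "outer_join": f"""{CTE}
        SELECT b.title, b.category
        FROM books b
        LEFT OUTER JOIN this_customers_orders t ON b.title = t.title
        WHERE b.{SAME_CATEGORY} AND t.title IS NULL""",
}

# Sort each result so the comparison does not depend on row order.
results = {name: sorted(conn.execute(sql).fetchall())
           for name, sql in variants.items()}
for name, rows in results.items():
    print(name, rows)
```

One caveat: MINUS (EXCEPT here) also removes duplicate rows, so the four forms only agree like this when titles are unique, as they are in the sample data.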
