Optimizing SQL

I'm trying to improve the performance of a list of SQL statements. My SQL looks something like this:
select /*+ PARALLEL(t,5) */ * from tabla1 where yyyymm in (200301, 200302, 200303, 200304, 200305, 200306) and field1 = '010102' and field2 = ..... fieldn =
The fields yyyymm and field1 appear in all my statements, but the rest of the fields (field2-fieldn) depend on the statement (they vary from one to another).
My table is partitioned by yyyymm and I have an index (normal, not bitmap) on field1. The rest of the fields don't have many possible values (10-20 values more or less). One of them appears in a lot of the statements, but it has only three possible values, so I don't have an index on it because I think it isn't worth it. Is that right?
I use the parallel hint because the server can execute in parallel. It's a very large table, a typical fact table of a data warehouse, about 10 million records.
Any ideas to improve performance and make the statements faster?
Any advice or idea will be greatly appreciated.

You could start by adding an alias to your table so the hint gets used:
select /*+ PARALLEL(t,5) */ * from tabla1 t where yyyymm in (200301, 200302, 200303, 200304, 200305, 200306) and field1 = '010102' and field2 = ..... fieldn =
It could also help to change the index to include the yyyymm field, and to make it LOCAL if your access to the data is mostly through the yyyymm field.
How many parallel query processes can you use? I am asking because 5 is the default value, but you have to understand that the value is instance-wide, so if 2 users issue queries that use a parallelism of 5, they need 10 parallel query processes; if the users become 3 they will need 15 processes, and so on...
Check the parallel_max_servers parameter.
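As a quick check of the instance-wide limits mentioned above, something like this works (a sketch using the standard dynamic performance views; it assumes you have SELECT privileges on them):

```sql
-- How many parallel query slaves the instance allows in total
SELECT value FROM v$parameter WHERE name = 'parallel_max_servers';

-- How many slaves are currently in use vs. available
SELECT status, COUNT(*) FROM v$px_process GROUP BY status;
```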
Could you post the explain plan for the query?
You can get it just by typing
set autotrace traceonly explain
and then your query in SQL*Plus.
Anyway, are you sure it's a fact table of a data warehouse? I can't see any dimension table in your query.
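A sketch of the index suggestion above, using the table and column names from the original post (verify them against your own schema); a LOCAL index is equipartitioned with the table, so partition pruning on yyyymm and the index probe on field1 work together:

```sql
-- Hypothetical index name; LOCAL makes it follow the yyyymm partitioning
CREATE INDEX ix_tabla1_field1
    ON tabla1 (field1)
    LOCAL;

-- With the alias t in place, the PARALLEL hint can resolve:
SELECT /*+ PARALLEL(t, 5) */ *
  FROM tabla1 t
 WHERE yyyymm IN (200301, 200302, 200303, 200304, 200305, 200306)
   AND field1 = '010102';
```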

Similar Messages

  • Optimized sql not properly generated when using SAP tables

Hi Experts,
We are using BODS 4.1 SP2.
We have a simple dataflow where we pull data from SAP R3 using direct download and push it into our database.
Basically it's like this:
SAP Table --> Query Transform --> Oracle Database
Inside the query transform, we have applied some conditions in the where clause, which are combinations of 'or' and 'and' operators.
However, when we generate the optimized SQL for it, we observe that the conditions in the where clause are not being included in the optimized query.
But if we replace all conditions with only 'or' or only 'and' conditions, the optimized SQL query with the conditions is generated.
Somehow the optimized SQL with conditions is not generated if the where clause has a mix of 'and' and 'or' conditions.
The issue is only observed for SAP tables.
However, the same dataflow in BODS 4.0 generates an optimized SQL containing the mixed 'and' and 'or' conditions in the where clause.
Please help me with this.

Dear Shiva,
Let me explain this in detail to you:
We have a simple dataflow where we push data from the SAP table directly into our Oracle database.
The dataflow has a query transform where we have some conditions to filter the data.
The method used to pull data from R3 is direct download.
System used:
BODS 4.2 SP1
SAP R3 620 release (4.7)
The where clause has mixed 'and' and 'or' operators. The problem is that this where clause is not getting pushed into the optimized SQL.
E.g.: Table1.field1 = some value and Table1.field2 = some value or Table1.field3 = some value.
If this is the condition in the where clause, the where clause does not appear in the optimized SQL.
Also, the conditions in the where clause are all from the same table.
However, if all the conditions in the where clause are either all 'and' or all 'or', the condition gets pushed and appears in the optimized SQL.
But if there is a mix of 'and' and 'or' it fails.
This was not the case when we used BODS 4.0 with R3 620 (4.7).
We investigated further and used R3 730 with BODS 4.2. With R3 730, the where condition was getting pushed into the optimized SQL for mixed 'and'/'or' predicates.
SAP support confirmed the same. They were able to reproduce the issue.
We will use an ABAP dataflow and see if it resolves the issue.
Regards,
Ankit

  • Optimal SQL

Hi,
When I read about performance tuning on orafaq.com, they mention the following with respect to SQL:
Application Tuning:
Experience shows that approximately 80% of all Oracle system performance problems are resolved by coding optimal SQL.
I would like to know: is there any document with guidelines (dos and don'ts) on writing optimal SQL?
regards

    Hi,
    Here are a few things I try to keep in mind when writing SQL, and some common mistakes I've noticed.
    Optimal SQL starts before you even write a query; it starts with a good table design.
    Normalize your tables.
    Use the right datatype. A common mistake is to use a VARCHAR2 or NUMBER column when a DATE is appropriate.
Use SQL instead of PL/SQL, especially PL/SQL that does DML one row at a time. MERGE is a very powerful tool in pure SQL.
    Help the optimizer.
    Write comparisons so that an indexed column is alone on one side of the comparison operator. For example, if you're looking for orders that are more than 60 days old, don't say
WHERE   SYSDATE - order_date  > 60    -- *** INEFFICIENT ***
Write it this way instead:
WHERE   order_date  < SYSDATE - 60
Some tools are inherently slow. These include:
• SELECT DISTINCT
• UNION
• Regular expressions
• CONNECT BY
All of these are wonderful, useful tools, but they have a price, and you can often get the exact results you need faster with some weaker tool, even if it requires a little more code.
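To illustrate the point about SELECT DISTINCT, here is one common pattern where a weaker tool is often faster (the customers/orders tables are hypothetical; the idea, not the names, is the point):

```sql
-- Deduplicates only after producing the whole join result:
SELECT DISTINCT c.customer_id, c.name
  FROM customers c, orders o
 WHERE o.customer_id = c.customer_id;

-- Often faster: EXISTS can stop probing after the first matching order
SELECT c.customer_id, c.name
  FROM customers c
 WHERE EXISTS (SELECT 1
                 FROM orders o
                WHERE o.customer_id = c.customer_id);
```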

  • Sub data flow (Optimized SQL) execution order ?

    I am looking for a solution in Designer of Data Services XI 3.2.
Is there a way to specify in a Data Flow
    (without using 'Transaction control' options and Embedded Data Flow )
    the order in which Sub data flows ( Optimized SQL ) are executed ?
    Thank you in advance.
    Georg

    First, if you are using MDX to calculate the value of C - don’t.  MDX script logic can be extremely inefficient from a processing and memory utilization standpoint vs. SQL logic even if the syntax is shorter.
    Logic executes in the order you place the code in the script.  You have three commit blocks and they would execute in that order.  I notice you don't have a commit after the calculation for C.  You should always put a commit statement after each calculation section or you can get uncommitted data even though there is an implied commit after the last line of code executes.  Don't get in the habit of relying on this.
    You can see the logic logs from the temp folders on the file server as suggested above but they will mainly give you the SQL queries generated which can be helpful in debugging scoping issues but they can be hard to sift through.
    I recommend trying putting a commit statement after your calculation of C and that will probably resolve the issue.  I also strongly suggest you switch the calculation to SQL logic to avoid performance issues when you start having these calculations run under high concurrency or on larger volumes of data than what you're probably testing with.

  • Optimized SQL generation?

    Hi,
    I am working with Crystal Reports 2008 SP3 together with an Oracle database. I am using several tables and views; my reports then have several formula fields where I make some calculations and restrictions.
    When I look at the SQL statement from the database menu I see that CR constructs the SQL statement the way I made the joins in the Database Expert.
Does CR then send an optimized SQL statement to the database, including my restrictions and conditions from my formula fields, or does it simply fetch the data from the database and do the filtering afterwards?
    Thanks!

    IanWaterman wrote:
    > If you add
    >
    > = {@yourformula}
    > Provided your formula is not print-time (i.e. using variables or summaries), it will resolve and pass to the database. If not, all data will be brought back to Crystal for filtering.
    I have e.g. formulas like this:
    @restriction1
    {mytable1.col1} = 8 and {mytable1.col3} = 7
Are such formulas pushed to the database server, or does CR do the filtering locally, retrieving more data from the database than necessary?

  • Guidelines to create an optimized SQL queries

    Dear all,
what is the basic strategy to create an optimized query? The SQL and PL/SQL FAQ shows how to post a question about a query that needs to be optimized, but it doesn't tell how to actually do it. The Performance Tuning Guide for 11g Release 2 shows how access paths work and how to read an explain plan, but I cannot really find basic guidelines for optimizing queries in general.
Say I have a complex query that I need to optimize. I gather all the info that the thread advises: optimizer info, explain plan, tkprof dump, and so on. I determine the nature of the data being queried and their indexes. At this point, what else do I need to do? I cannot just post my query to OTN every time I bump into a slow query.
    Best regards,
    Val

    I think that some of the most important guidelines are:
    1. ALWAYS create documentation that explains
    a. what the query is supposed to do - 'selects all records for employees that have stock options that are due to expire within 60 days'
    b. any special constructs/tricks used in the query. For example if you added one or more hints to the query explain why. If the query has
    a complex/compound CASE or DECODE statement add a one line comment explaining what it does.
    2. Gather as much information about the context the query runs in as possible.
    a. how much source data are we dealing with? Large tables or small?
    b. how large is the result set expected to be? A few records, thousands, millions?
    c. are they regular tables or partitioned tables?
    d. are the tables local or on remote servers?
    e. how often is the query executed? once in a blue moon or concurrently by large numbers of users?
    3. For existing queries always confirm that it is the query that actually needs to be tuned before trying to tune it. Too many times I have seen people trying to tune a query when it is actually another part of the process that needs to be tuned.
    4. Always test queries using realistic data - data and environment as close as possible to that which will be used in production.
    5. Always run an explain plan before, during and after you test the query. Save the final plan in a repository so that it is available for comparison if a problem later occurs that you suspect might be related to the query. It is easier to diagnose a possible degradation of performance if you have a previous execution plan to compare the current one to.
6. Use common sense when writing/evaluating your query - if it looks too complicated it probably is. If you have trouble understanding or testing it, the next person that comes along will probably have even more trouble.
    Hope that adds some to what the docs provide.
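A minimal sketch of point 5 in practice, using standard EXPLAIN PLAN and DBMS_XPLAN (the statement id, table names, and sample query below are made up for illustration):

```sql
-- Capture the plan under a statement id...
EXPLAIN PLAN SET STATEMENT_ID = 'ORD_RPT_V1' FOR
SELECT order_id FROM orders WHERE order_date < SYSDATE - 60;

-- ...review it...
SELECT * FROM TABLE(dbms_xplan.display('PLAN_TABLE', 'ORD_RPT_V1'));

-- ...and archive a copy so a later regression can be compared against it.
CREATE TABLE plan_repository AS
SELECT SYSDATE AS captured_at, p.*
  FROM plan_table p
 WHERE p.statement_id = 'ORD_RPT_V1';
```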

  • Help needed in optimizing SQL

    I need help optimizing the following SQL.
    Following are the schema elements -
    cto_xref_job_comp - Contains the Job Data
    cto_mast_component - Contains component
    cto_mast_product - Contains product info
    cto_mast_model - Contains model info
cto_xref_mod_prod - Contains model and product association
cto_mast_model_scan - Contains the scan order for each model family (the mod_id, which should be renamed, refers to mod_fam_id on the cto_mast_model table).
Here is what I am trying to achieve:
I want all the ATP components whose travel card order (on cto_mast_model_scan) is > 0, and in addition I need the phantoms that do not have any ATP components under them, if they do not appear on the ATP component list.
SELECT f.travelcard_order,
       b.job_number,
       b.product_code,
       b.line_number,
       b.component_type,
       b.component_item_number,
       b.parent_phantom,
       b.item_type,
       a.comp_id,
       a.comp_type_id,
       a.comp_desc_short,
       b.batch_id,
       b.quantity_per_unit,
       a.comp_notes,
       b.COMPONENT_ITEM_DESCRPTION
FROM   cto_xref_job_comp b,
       cto_mast_component a,
       cto_mast_product c,
       cto_xref_mod_prod d,
       cto_mast_model e,
       cto_mast_model_scan f
WHERE  b.job_number = 'CTO2499814001'
AND    b.batch_id = 21
AND    b.item_type = 'ATP'
AND    b.parent_phantom = a.comp_desc_short
AND    b.product_code = c.prod_desc_short
AND    c.prod_id = d.prod_id
AND    d.mod_id = e.mod_id
AND    e.mod_fam_id = f.mod_id
AND    a.comp_type_id = f.comp_type_id
AND    f.travelcard_order > 0
UNION
SELECT DISTINCT
       f.travelcard_order,
       b.job_number,
       b.product_code,
       b.line_number,
       b.component_type,
       b.parent_phantom,
       b.item_type,
       a.comp_id,
       a.comp_type_id,
       a.comp_desc_short,
       b.batch_id,
       1,
       a.comp_notes
FROM   cto_xref_job_comp b,
       cto_mast_component a,
       cto_mast_product c,
       cto_xref_mod_prod d,
       cto_mast_model e,
       cto_mast_model_scan f
WHERE  b.job_number = 'CTO2499814001'
AND    b.batch_id = 21
AND    b.item_type IS NULL
AND    SUBSTR(b.parent_phantom, 1, 1) = 'C'
AND    b.parent_phantom = a.comp_desc_short
AND    b.parent_phantom NOT IN (SELECT parent_phantom
                                  FROM cto_xref_job_comp
                                 WHERE job_number = b.job_number
                                   AND batch_id = b.batch_id
                                   AND item_type = 'ATP')
AND    b.product_code = c.prod_desc_short
AND    c.prod_id = d.prod_id
AND    d.mod_id = e.mod_id
AND    e.mod_fam_id = f.mod_id
AND    a.comp_type_id = f.comp_type_id
AND    f.travelcard_order > 0

    Here is a small example.
    DECLARE
         my_table VARCHAR2(10) := 'DUAL';
         my_value INTEGER;
    BEGIN
         EXECUTE IMMEDIATE 'SELECT 1 FROM ' || my_table INTO my_value;
         DBMS_OUTPUT.PUT_LINE(my_value);
    END;
    you can use your cursor value in the place of my_table.
    Thanks,
    Karthick.
    http://www.karthickarp.blogspot.com/

  • Optimizing SQL statement

    Hi,
Performance of the SQL below is poor; it takes a lot of time to process.
In what ways can we optimize this SQL statement?
    select dt_fld, min(fld2), max(fld2), avg(fld2),
    min(fld3), max(fld3), avg(fld3)
    from
    (select dt_fld1, fld2, fld3 from Table1)
    group by dt_fld1
    I thought of rewriting it as
    select dt_fld, min(fld2), max(fld2), avg(fld2),
    min(fld3), max(fld3), avg(fld3)
    from Table1
    group by dt_fld1
What other ways are available to optimize this SQL statement, like using query hints, creating a materialized view, analyzing the execution plan, updating statistics using the dbms_stats package, etc.?
Is there any method available in SQL itself to optimize this, like using the WITH clause?
    WITH tbl as (select dt_fld1, fld2, fld3 from Table1)
    select dt_fld, min(fld2), max(fld2), avg(fld2),
    min(fld3), max(fld3), avg(fld3)
from tbl
group by dt_fld1
    What is the advantage we will get using WITH clause?
    Thank you.

    845956 wrote:
    Hi,
Performance of the SQL below is poor; it takes a lot of time to process.
In what ways can we optimize this SQL statement?
    select dt_fld, min(fld2), max(fld2), avg(fld2),
    min(fld3), max(fld3), avg(fld3)
    from
    (select dt_fld1, fld2, fld3 from Table1)
    group by dt_fld1
    I thought of rewriting it as
    select dt_fld, min(fld2), max(fld2), avg(fld2),
    min(fld3), max(fld3), avg(fld3)
    from Table1
    group by dt_fld1
    You're approaching this in the wrong way. Looking at the query tells you very little about how it is being executed. Rewriting the SQL in lots of different ways isn't really an effective way to tune a statement.
You need to start by looking at the execution plan - that will tell you exactly how the query is being executed - which objects are being accessed, in which order, and by what method. If you'd used EXPLAIN PLAN on these two queries you would have found that they both end up with the same execution plan - the inline view in the first statement would be removed anyway, giving you the second statement.
    DTYLER_APP@pssdev2> create table table1(dt_fld date, fld2 number,fld3 number)
      2  /
    Table created.
    DTYLER_APP@pssdev2> insert into table1 select TRUNC(sysdate), MOD(rownum,1000),rownum from dual
      2  connect by level <=10000
      3  /
    10000 rows created.
    DTYLER_APP@pssdev2> exec dbms_stats.gather_table_stats(user,'TABLE1');
    PL/SQL procedure successfully completed.
    DTYLER_APP@pssdev2> explain plan for
      2  select dt_fld, min(fld2), max(fld2), avg(fld2),
      3          min(fld3), max(fld3), avg(fld3)
      4  from
      5       (select dt_fld, fld2, fld3 from Table1)
      6  group by dt_fld
      7  /
    Explained.
    DTYLER_APP@pssdev2> select * from table(dbms_xplan.display)
      2  /
    PLAN_TABLE_OUTPUT
    Plan hash value: 1879194101
    | Id  | Operation          | Name   | Rows  | Bytes | Cost (%CPU)| Time     |
    |   0 | SELECT STATEMENT   |        |     1 |    15 |     9  (12)| 00:00:01 |
    |   1 |  HASH GROUP BY     |        |     1 |    15 |     9  (12)| 00:00:01 |
    |   2 |   TABLE ACCESS FULL| TABLE1 | 10000 |   146K|     8   (0)| 00:00:01 |
    9 rows selected.
    DTYLER_APP@pssdev2> explain plan for
      2  select dt_fld, min(fld2), max(fld2), avg(fld2),
      3          min(fld3), max(fld3), avg(fld3)
      4  from
      5       Table1
      6  group by dt_fld
      7  /
    Explained.
    DTYLER_APP@pssdev2> select * from table(dbms_xplan.display)
      2  /
    PLAN_TABLE_OUTPUT
    Plan hash value: 1879194101
    | Id  | Operation          | Name   | Rows  | Bytes | Cost (%CPU)| Time     |
    |   0 | SELECT STATEMENT   |        |     1 |    15 |     9  (12)| 00:00:01 |
    |   1 |  HASH GROUP BY     |        |     1 |    15 |     9  (12)| 00:00:01 |
    |   2 |   TABLE ACCESS FULL| TABLE1 | 10000 |   146K|     8   (0)| 00:00:01 |
    9 rows selected.
What other ways are available to optimize this SQL statement, like using query hints, creating a materialized view, analyzing the execution plan, updating statistics using the dbms_stats package, etc.?
    Based on the information you have provided, no-one can really answer that. You need to firstly define what the expected or acceptable run time for this statement is - that way you have something to work to. From there you need to look at the execution plan and ideally post some information about the tables, indexes, volumes of data etc.
Is there any method available in SQL itself to optimize this, like using the WITH clause?
    WITH tbl as (select dt_fld1, fld2, fld3 from Table1)
    select dt_fld, min(fld2), max(fld2), avg(fld2),
    min(fld3), max(fld3), avg(fld3)
from tbl
group by dt_fld1
    In the example you have given, absolutely nothing - it will most likely be optimised out of the final query anyway.
    DTYLER_APP@pssdev2> explain plan for
      2  WITH tbl as (select dt_fld, fld2, fld3 from Table1)
      3  select dt_fld, min(fld2), max(fld2), avg(fld2),
      4          min(fld3), max(fld3), avg(fld3)
      5  from  tbl
      6  group by dt_fld
      7  /
    Explained.
    DTYLER_APP@pssdev2> select * from table(dbms_xplan.display)
      2  /
    PLAN_TABLE_OUTPUT
    Plan hash value: 1879194101
    | Id  | Operation          | Name   | Rows  | Bytes | Cost (%CPU)| Time     |
    |   0 | SELECT STATEMENT   |        |     1 |    15 |     9  (12)| 00:00:01 |
    |   1 |  HASH GROUP BY     |        |     1 |    15 |     9  (12)| 00:00:01 |
    |   2 |   TABLE ACCESS FULL| TABLE1 | 10000 |   146K|     8   (0)| 00:00:01 |
9 rows selected.
So you need to give some more detail on the actual query you're working with.
    HTH
    David

  • Optimizing SQL for views on views

I'm trying to optimize the performance of a view which joins 2 other views:
select *
from v1, v2
where v2.xxx = v1.xxx
When I select data from the view, I set a where clause that results in only one matching row in the view v1, which can be accessed by rowid (unique index). There is also a (non-unique) index on the column xxx of the v2 view which should be used by the optimizer (rule-based, 7.3.4.3.0).
But it isn't. Instead, the database performs a full table scan of the driving table of the v2 view, finds some rows, and merges the data with those from the v1 view. But as the v2 view is very large, it takes very long...
When I type
select * from v2 where xxx='abc'
the query executes quickly because the index on xxx is used.
What can prevent the optimizer from using the index on xxx in my view?
I even tried to force use of the index with the INDEX hint, but it didn't work.
Any help appreciated.
Thanks

    Thanks kgronau,
My Oracle gateway for SQL Server is on a Linux box and, obviously, the SQL Server is on a Windows box.
In that case, how would I execute dg4msql_cvw.sql, which is under the $ORACLE_HOME/dg4msql/admin path on the Linux server?
    Regards
    Satish

  • CIPT Optimization SQL job crashing ccm.exe

Has anyone seen this before and have any idea how to fix it, or whether this is a known bug? I have done searches on the Bug Toolkit and have not found anything yet. The CIPT job completes fine, but ccm crashes while it is running.
I have moved the start time of the CIPT SQL job to later in the morning, when other jobs are done running, and it still crashes.
    SQL SP3a
    OS 2000.2.6sr4
    3.3(4)sr2
    BARS 4.04

    I got this issue fixed by purging old CDR records, and shrinking the CDR database.
The way I found out it was something to do with the CDR database was that I executed the steps in the CIPT SQL job one by one, and step 1, involving CDR, took 8-9 minutes to run. After purging records it is down to 1 minute or so, and nothing has unregistered/crashed like the previous times I ran the CIPT SQL job.

  • Optimized SQL query

    Dear Forum,
Please help me select the best query:
    1) select /*+ full(a) parallel(a,10) */ a.receipt_id,b.receipt_peek,a.receipt_key
    from receipt a, receipt_link b
    where a.receipt_id = b.receipt_id
    OR
    2) select /*+ full(a) parallel(a,10) */ a.receipt_id,b.receipt_peek,a.receipt_key
    from receipt a
    where exists ( select /*+index(b receipt_pk) */  receipt_peek
    from receipt_link b
    where b.receipt_id = a.receipt_id )
    Thanks

    You have given incomplete information
    Please provide the information as stated here,
    When your query takes too long ...
    http://forums.oracle.com/forums/thread.jspa?messageID=1945338&#1945338
    Adith

  • SQL Developer 2.1 - Autotrace not only explain statement but executes it

I need the old behaviour where the statement was only explained and not executed. I am optimizing SQL updates, merges and inserts on tables with 50M rows, and it makes me angry when SQL Developer executes the statement. I found a workaround by using explain plan for ..., but I think that a tool such as SQL Developer should give me an option to configure the Autotrace functionality (I even tried to set up what should be returned as a result of autotrace in the menu Tools/Preferences/Database/Autotrace/Explain plan, but it always executes the statement).

    Don't know if I get what you're asking.
    Explain plan: F10, Autotrace: F6.
    Autotrace logically executes the statement (RTM), Explain doesn't.
    If you get other behaviour, please explain yourself better.
    Hope that helps,
    K.
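For reference, the workaround the original poster mentions looks like this; EXPLAIN PLAN only parses and optimizes the statement, it never runs the DML (the UPDATE below is a hypothetical example):

```sql
EXPLAIN PLAN FOR
UPDATE big_table
   SET status = 'PROCESSED'
 WHERE created_date < SYSDATE - 30;

SELECT * FROM TABLE(dbms_xplan.display);
```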

  • Please suggest a link helpful for writing efficient SQL queries

Please suggest any good resource, i.e. a weblink, which can help me in optimizing SQL queries, especially improving the execution time of select statements.
Thanks in advance
prasanth

In general I found books from O'Reilly very helpful (not only for Oracle, but for Unix too).
Moreover, there is pretty good Oracle documentation available.
After all, it's not only about writing good queries, but also about setting up data models, indexes, primary keys, etc.
Look for a really slow computer, take a lot of data, then try writing your speedy queries. This is the school of hard knocks; in the long run it's the best training.

  • Accessing BKPF table takes too long

    Hi,
Is there another way to have a faster and more optimized SQL query that accesses the table BKPF? Or other, smaller tables that contain the same data?
    I'm using this:
       select bukrs gjahr belnr budat blart
       into corresponding fields of table i_bkpf
       from bkpf
       where bukrs eq pa_bukrs
       and gjahr eq pa_gjahr
       and blart in so_DocTypes
       and monat in so_monat.
    The report is taking too long and is eating up a lot of resources.
    Any helpful advice is highly appreciated. Thanks!

Hi Max,
I also tried using BUDAT in the where clause of my SQL statement, but even that takes too long.
        select bukrs gjahr belnr budat blart monat
         appending corresponding fields of table i_bkpf
         from bkpf
         where bukrs eq pa_bukrs
         and gjahr eq pa_gjahr
         and blart in so_DocTypes
         and budat in so_budat.
I also tried accessing the table per day, but that didn't work either...
       while so_budat-low le so_budat-high.
         select bukrs gjahr belnr budat blart monat
         appending corresponding fields of table i_bkpf
         from bkpf
         where bukrs eq pa_bukrs
         and gjahr eq pa_gjahr
         and blart in so_DocTypes
         and budat eq so_budat-low.
         so_budat-low = so_budat-low + 1.
       endwhile.
I think our BKPF table contains a very large set of data. Is there any other table besides BKPF where we could get all accounting document numbers in a given period?

  • Deleting duplicates from a table whose size is 386 GB

I need to delete duplicate records from the table. The table contains 33 columns, and of them only PK_NUM is the primary key column. As PK_NUM contains unique values, we need to keep either the min or the max value.
Sample data:
PK_NUM   Name   AGE
1        ABC    20
2        PQR    25
3        ABC    20
Expected data should contain only 2 records:
PK_NUM   Name   AGE
1        ABC    20
2        PQR    25
*1 can be replaced by 3, and vice versa.
    Size of table : 386 GB
    Total records in the table : 1766799022
    Distinct records in the table : 69237983(Row distinct with out Primary key)
    Duplicate records in the table : 1697561039(Row duplicates without primary key)
    Column details :
    4 :  Date data type
    4 :  Number data type
    1 :  Char data type
    24:  Varchar2 data type
    DB details : Oracle Database 11g EE::11.2.0.2.0 ::64bit Production
My plan here is to:
1. Pull the distinct records and store them in a backup table (i.e. by using insert into select).
2. Truncate the existing table and move the records from the backup table back into it.
As the data size is huge, I want to know:
What is the optimized SQL for retrieving the distinct records?
Any estimate on how long it will take to complete (the insert into select) and to truncate the existing table?
Please let me know if there is any other, better way to achieve this. My ultimate goal is to remove the duplicates.

As the data size is huge,
I want to know what is the optimized SQL for retrieving the distinct records,
and any estimate on how long it will take to complete (the insert into select) and to truncate the existing table.
@ 1. - Your best chance seems to be the following (it should require only a single FTS):
create table backup_table as
select pk, name, age, a_date, a_string, a_number, ...
  from (select pk, name, age, a_date, a_string, a_number, ...,
               row_number() over (partition by name, age order by a_date) rn
          from big_table)
 where rn = 1
@ 2. - With statistics in place and (at least nearly) up to date, an explain plan should return an appropriate estimate.
    Regards
    Etbin
select pk,name,age,a_date,a_string,a_number
  from (select pk,name,age,a_date,a_string,a_number,
               row_number() over (partition by name,age order by a_date) rn
          from big_table)
 where rn = 1

| Operation        | Options          | Object    | Rows   | Time | Cost | Bytes      | Predicates                                                                   |
| SELECT STATEMENT |                  |           | 13,044 |    1 |   30 | 53,023,860 |                                                                              |
| VIEW             |                  |           | 13,044 |    1 |   30 | 53,023,860 | filter: "RN" = 1                                                             |
| WINDOW           | SORT PUSHED RANK |           | 13,044 |    1 |   30 |    495,672 | filter: ROW_NUMBER() OVER (PARTITION BY "NAME","AGE" ORDER BY "A_DATE") <= 1 |
| TABLE ACCESS     | STORAGE FULL     | BIG_TABLE | 13,044 |    1 |   26 |    495,672 |                                                                              |
select pk,name,age,a_date,a_string,a_number
  from big_table
 where pk in (select min(pk) keep (dense_rank first order by a_date)
                from big_table
               group by name,age)

| Operation        | Options      | Object    | Rows   | Time | Cost | Bytes   | Predicates                   |
| SELECT STATEMENT |              |           |  6,000 |    1 |   52 | 306,000 |                              |
| HASH JOIN        |              |           |  6,000 |    1 |   52 | 306,000 | access: "PK" = "$kkqu_col_1" |
| VIEW             |              | VW_NSO_1  |  6,000 |    1 |   27 |  78,000 |                              |
| HASH             | UNIQUE       |           |  6,000 |    1 |   27 | 126,000 |                              |
| SORT             | GROUP BY     |           |  6,000 |    1 |   27 | 126,000 |                              |
| TABLE ACCESS     | STORAGE FULL | BIG_TABLE | 13,044 |    1 |   23 | 273,924 |                              |
| TABLE ACCESS     | STORAGE FULL | BIG_TABLE | 13,044 |    1 |   24 | 495,672 |                              |
    Message was edited by: Etbin
