MINUS on a 4 million row table - how to enhance speed?

I have a query like this:
select col1, col2, col3
from table1
MINUS
select col1, col2, col3
from EXTERNAL_TABLE
table1 has approximately 4 million records and the external file has roughly 4 million rows.
The MINUS takes around 25 minutes here. How can I speed it up?
Thanks in Advance!!!

To make something go faster, you first need to know what makes it slow.
Simple actually - to solve a problem we need to know what the problem is. Can't answer a question without knowing what the question is.
Reading a total of 8 million rows means a lot more I/O than usual... so that is likely the culprit for the slow performance. If that is the case, you will need to find a way to reduce the time it takes to perform all that I/O. Or do less I/O.
But you need to pop the hood and take a look at just what is causing what you think is slow performance. (It may not even be slow - it may be as fast as it can go given the hardware and other limitations that are imposed on Oracle and this MINUS process.)
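For example, a quick way to pop that hood is to check what the optimizer plans to do (a minimal sketch, assuming SQL*Plus and access to DBMS_XPLAN):
EXPLAIN PLAN FOR
SELECT col1, col2, col3 FROM table1
MINUS
SELECT col1, col2, col3 FROM EXTERNAL_TABLE;

SELECT * FROM TABLE(DBMS_XPLAN.DISPLAY);
Tracing the actual 25-minute run (for example with SQL trace and tkprof) would then show whether the time goes to scanning the external table, sorting for the MINUS, or simply waiting on I/O.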

Similar Messages

  • Delete from a 95 million row table ...

    Hi folks, I need to delete from a 95 million row regular table. What should be my best options? I have tried CTAS using parallel, but it failed after 1+ hrs ... it was due to a bad query, but I am checking whether there is any other way to achieve this.
    Thanks in advance.

    user8604530 wrote:
    Hi folks, I need to delete from a 95 million row regular table. What should be my best options? I have tried CTAS using parallel, but it failed after 1+ hrs ... it was due to a bad query, but I am checking whether there is any other way to achieve this.
    Thanks in advance.
    How many rows in the table BEFORE the DELETE?
    How many rows in the table AFTER the DELETE?
    How do I ask a question on the forums?
    SQL and PL/SQL FAQ
    Handle:     user8604530
    Status Level:     Newbie
    Registered:     Mar 10, 2010
    Total Posts:     64
    Total Questions:     26 (22 unresolved)
    I extend to you my condolences since you rarely get your questions answered.

  • Fetch only tables with 10 million or more rows

    Hi all,
    How can I fetch the tables that have more than 10 million rows with this PL/SQL? I got it from some other site I can't remember.
    Can somebody help me with this please? Your help is greatly appreciated.
    declare
      counter number;
    begin
      for x in (select segment_name, owner
                  from dba_segments
                 where segment_type = 'TABLE'
                   and owner = 'KOMAKO')
      loop
        execute immediate 'select count(*) from ' || x.owner || '.' || x.segment_name into counter;
        dbms_output.put_line(rpad(x.owner, 30, ' ') || '.' || rpad(x.segment_name, 30, ' ') || ' : ' || counter || ' row(s)');
      end loop;
    end;
    Thank you,
    gg

    1) This code appears to work. Of course, there seems to be no need to select from DBA_SEGMENTS when DBA_TABLES would be more straightforward. And, of course, you'd have to do something when the count exceeded 10 million.
    2) If you are using the cost-based optimizer (CBO) and your statistics are reasonably accurate and you can tolerate a degree of staleness/ approximation in the row counts, you could just select the NUM_ROWS column from DBA_TABLES.
    Justin
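    For illustration, a minimal sketch of option 2 (NUM_ROWS comes from the optimizer statistics, so it is an estimate that is only as fresh as the last stats gather):
    SELECT owner, table_name, num_rows
      FROM dba_tables
     WHERE owner = 'KOMAKO'
       AND num_rows >= 10000000;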

  • Best way to refresh 5 million row table

    Hello,
    I have a table with 5 million rows that needs to be refreshed every 2 weeks.
    Currently I am dropping and re-creating the table, which takes a very long time and gives a tablespace-related warning at the end of execution. It does create the table with the actual number of rows, but I am not sure why I get the tablespace warning at the end.
    Any help is greatly appreciated.
    Thanks.

    Can you please post your query?
    1. What is the size of the temporary tablespace?
    2. Is your query performing any sorts?
    Monitor the TEMP tablespace usage with the query below after executing your SQL query:
    SELECT TABLESPACE_NAME, BYTES_USED, BYTES_FREE
      FROM V$TEMP_SPACE_HEADER;

  • Google-style autosuggest on a 30 million row table

    Hi All,
    I'm exploring ways of implementing a "google style autosuggest" on a table with no less than 30 million rows. It has a field with an address (varchar) and I'd like to create an Ajax call, while the user is typing, that would suggest a few addresses to the user.
    I was thinking about using CONTAINS + fuzzy... but I'm not sure if it will be fast enough and whether it will return the right results.
    Any suggestions?
    thanks

    The thread 2 million rows with XML type data may be of interest to you.
    HTH
    Girish Sharma
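    For reference, a rough sketch of the CONTAINS + fuzzy idea (hypothetical names; assumes an Oracle Text CONTEXT index on the address column of an addresses table, and a bind :typed holding what the user has typed so far):
    SELECT address
      FROM addresses
     WHERE CONTAINS(address, 'fuzzy(' || :typed || ', 60, 10, weight)') > 0
       AND ROWNUM <= 10;
    Whether this is fast enough at 30 million rows depends heavily on the Text index and on how selective the typed input is, so it would need benchmarking before committing to the approach.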

  • Importing huge (10 million rows) tables into HANA

    Dear All,
    I need to import huge tables (40,000,000 records) into a HANA system, in order to develop a demo system for a customer.
    I'm planning to use CSV files.
    Does someone have experience with a task like this? Is there a limit to the CSV file size?
    Many thanks
    Leopoldo Capasso

    Check out this blog for best practices on loading high-volume data from flat files:
    http://scn.sap.com/community/hanainmemory/blog/2013/04/08/bestpracticesforsaphanadataloads
    Hope this helps.
    Rama
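    As a starting point, a hypothetical sketch of a server-side CSV import in HANA (paths, schema, and table names are placeholders; the file must be readable by the HANA server process):
    IMPORT FROM CSV FILE '/hana/work/mytable.csv'
    INTO "MYSCHEMA"."MYTABLE"
    WITH RECORD DELIMITED BY '\n'
         FIELD DELIMITED BY ','
         THREADS 8
         BATCH 10000;
    Splitting a 40-million-row extract into several files and loading with multiple THREADS is a common way to keep the load time reasonable.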

  • General Scenario- Adding columns into a table with more than 100 million rows

    I was asked/given a scenario: what issues do you encounter when you try to add new columns to a table with more than 200 million rows? How do you overcome them?
    Thanks in advance.
    svk

    For such a large table, it is better to add the new column to the end of the table to avoid any performance impact, as RSingh suggested.
    Also avoid using any default on the newly added column, or SQL Server will have to fill in 200 million fields with this default value. If you need one, add an empty column and update the column in small batches (otherwise you lock up the whole table). Add the default after all the rows have a value for the new column.
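    A rough sketch of that batched approach in T-SQL (hypothetical table and column names):
    ALTER TABLE dbo.BigTable ADD NewCol int NULL;  -- nullable, no default: a metadata-only change

    WHILE 1 = 1
    BEGIN
        UPDATE TOP (10000) dbo.BigTable
           SET NewCol = 0
         WHERE NewCol IS NULL;
        IF @@ROWCOUNT = 0 BREAK;   -- stop once every row has a value
    END

    ALTER TABLE dbo.BigTable ADD CONSTRAINT DF_BigTable_NewCol DEFAULT (0) FOR NewCol;
    Small batches keep each transaction, and therefore its locks, short instead of holding the whole table for one giant update.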

  • Schema Design for 10^6+ rows table (Indicator Column / Bitmap Join Index?)

    Hi all,
    I read the following suggestion for a SELECT with LEFT OUTER JOIN in a DB2 consulting company paper, for a 10-million-row table:
    SELECT columns
      FROM ACCTS A
      LEFT JOIN OPT1 O1
        ON A.ACCT_NO = O1.ACCT_NO
       AND A.FLAG1 = 'Y'
      LEFT JOIN OPT2 O2
        ON A.ACCT_NO = O2.ACCT_NO
       AND A.FLAG2 = 'Y'
     WHERE A.ACCT_NO = 1
    For DB2, according to the paper, the following is true: iff A.FLAG1 <> 'Y', then no table or index access on OPT1 is done. Same for A.FLAG2/OPT2.
    I recreated the situation for ORACLE with the following script and came to some really interesting questions:
    DROP TABLE maintbl CASCADE CONSTRAINTS;
    DROP TABLE opt1 CASCADE CONSTRAINTS;
    DROP TABLE opt2 CASCADE CONSTRAINTS;
    CREATE TABLE maintbl (
      id   INTEGER NOT NULL,
      dat  VARCHAR2 (2000 CHAR),
      opt1 CHAR (1),
      opt2 CHAR (1),
      CONSTRAINT CK_maintbl_opt1 CHECK (opt1 IN ('Y', 'N')) INITIALLY IMMEDIATE ENABLE VALIDATE,
      CONSTRAINT CK_maintbl_opt2 CHECK (opt2 IN ('Y', 'N')) INITIALLY IMMEDIATE ENABLE VALIDATE,
      CONSTRAINT PK_maintbl PRIMARY KEY (id)
    );
    CREATE TABLE opt1 (
      maintbl_id INTEGER NOT NULL,
      adddat1    VARCHAR2 (100 CHAR),
      adddat2    VARCHAR2 (100 CHAR),
      CONSTRAINT PK_opt1 PRIMARY KEY (maintbl_id),
      CONSTRAINT FK_opt1_maintbltable FOREIGN KEY (maintbl_id) REFERENCES maintbl (id)
    );
    CREATE TABLE opt2 (
      maintbl_id INTEGER NOT NULL,
      adddat1    VARCHAR2 (100 CHAR),
      adddat2    VARCHAR2 (100 CHAR),
      CONSTRAINT PK_opt2 PRIMARY KEY (maintbl_id),
      CONSTRAINT FK_opt2_maintbltable FOREIGN KEY (maintbl_id) REFERENCES maintbl (id)
    );
    INSERT ALL
      WHEN 1 = 1 THEN
        INTO maintbl (id, opt1, opt2, dat) VALUES (nr, is_even, is_odd, maintbldat)
      WHEN is_even = 'N' THEN
        INTO opt1 (maintbl_id, adddat1, adddat2) VALUES (nr, adddat1, adddat2)
      WHEN is_even = 'Y' THEN
        INTO opt2 (maintbl_id, adddat1, adddat2) VALUES (nr, adddat1, adddat2)
    SELECT LEVEL AS nr,
           CASE WHEN MOD(LEVEL, 2) = 0 THEN 'Y' ELSE 'N' END AS is_even,
           CASE WHEN MOD(LEVEL, 2) = 1 THEN 'Y' ELSE 'N' END AS is_odd,
           TO_CHAR(DBMS_RANDOM.RANDOM) AS maintbldat,
           TO_CHAR(DBMS_RANDOM.RANDOM) AS adddat1,
           TO_CHAR(DBMS_RANDOM.RANDOM) AS adddat2
      FROM DUAL
    CONNECT BY LEVEL <= 100;
    COMMIT;
    SELECT *
      FROM maintbl
      LEFT OUTER JOIN opt1 ON maintbl.id = opt1.maintbl_id AND maintbl.opt1 = 'Y'
      LEFT OUTER JOIN opt2 ON maintbl.id = opt2.maintbl_id AND maintbl.opt2 = 'Y'
     WHERE id = 1;
    Explain plan for "Oracle Database 10g Enterprise Edition Release 10.2.0.4.0 - 64bi":
    http://i.imgur.com/f0AiA.png
    As one can see, the DB uses a view to index-access the opt tables iff indicator column maintbl.opt1='Y' in the main table.
    Explain plan for "Oracle Database 11g Express Edition Release 11.2.0.2.0 - Production":
    http://i.imgur.com/iKfj8.png
    As one can see, the DB does NOT use the view, and instead uses a pretty useless CASE statement.
    Now my questions:
    1) What does the optimizer do in 11.2 XE?!?
    2) In general: do you suggest this table setup? Does your yes/no suggestion depend on the row count in the tables? Of course I see the problem with incorrectly updated columns and would NEVER do it if there were another truly relational solution with possibly the same performance.
    3) Is there a way to avoid performance issues if I don't use an indicator column in ORACLE? Is this what a Bitmap Join Index (http://docs.oracle.com/cd/E11882_01/server.112/e25789/indexiot.htm#autoId14) is for?
    Thanks in advance and happy discussing,
    Blama

    Fair enough. I've included a cut-down set of SQL below.
    CREATE TABLE DIMENSION_DATE (
      DATE_ID       NUMBER,
      CALENDAR_DATE DATE,
      CONSTRAINT DATE_ID PRIMARY KEY (DATE_ID)
    );
    CREATE UNIQUE INDEX DATE_I1 ON DIMENSION_DATE (CALENDAR_DATE, DATE_ID);
    CREATE TABLE ORDER_F (
      ORDER_ID         VARCHAR2(40 BYTE),
      SUBMITTEDDATE_FK NUMBER,
      COMPLETEDDATE_FK NUMBER
    );
    -- Then I add the first bitmap join index, which works:
    CREATE BITMAP INDEX SUBMITTEDDATE_FK ON ORDER_F (DIMENSION_DATE.DATE_ID)
    FROM ORDER_F, DIMENSION_DATE
    WHERE ORDER_F.SUBMITTEDDATE_FK = DIMENSION_DATE.DATE_ID;
    -- Then attempt the next one:
    CREATE BITMAP INDEX completeddate_fk ON ORDER_F (b.date_id)
    FROM ORDER_F, DIMENSION_DATE b
    WHERE ORDER_F.completeddate_fk = b.date_id;
    -- which results in:
    -- ORA-01408: such column list already indexed
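    -- (Both join indexes cover the same indexed column list on ORDER_F, namely DIMENSION_DATE.DATE_ID,
    -- which is presumably why Oracle rejects the second one even though the join conditions differ.)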

  • How to Load 100 Million Rows into a Partitioned Table

    Dear all,
    I am working on a VLDB application.
    I have a table with 5 columns, for example A, B, C, D, DATE_TIME.
    I created a range (daily) partitioned table on column DATE_TIME,
    as well as a number of indexes, for example:
    INDEX ON A
    COMPOSITE INDEX ON DATE_TIME, B, C
    REQUIREMENT
    I need to load approx 100 million records into this table every day (it will be loaded via SQL*Loader or from a temp table: INSERT INTO orig SELECT * FROM temp).
    QUESTION
    The table is indexed, so I am not able to use the SQL*Loader feature DIRECT=TRUE.
    So let me know the best available way to load the data into this table.
    Note --> Please remember I can't drop and recreate the indexes daily due to the huge data quantity.

    Actually a simpler issue than what you seem to think it is.
    Q. What is the most expensive and slow operation on a database server?
    A. I/O. The more I/O, the more latency there is, the longer the wait times are, the bigger the response times are, etc.
    So how do you deal with VLT's? By minimizing I/O. For example, using direct loads/inserts (see SQL APPEND hint) means less I/O as we are only using empty data blocks. Doing one pass through the data (e.g. apply transformations as part of the INSERT and not afterwards via UPDATEs) means less I/O. Applying proper filter criteria. Etc.
    Okay, what do you do when you cannot minimize I/O anymore? In that case, you need to look at processing that I/O volume in parallel. Instead of serially reading and writing 100 million rows, you (for example) use 10 processes that each read and write 10 million rows. I/O bandwidth is there to be used. It is extremely unlikely that a single process can fully utilise the available I/O bandwidth. So use more processes, each processing a chunk of data, to use more of that available I/O bandwidth.
    Lastly, think DDL before DML when dealing with VLT's. For example, a CTAS to create a new data set and then doing a partition exchange to make that new data set part of the destination table is a lot faster than deleting that partition's data directly and then running an INSERT to refresh that partition's data.
    That in a nutshell is about it - think I/O and think of ways to use it as effectively as possible. With VLT's and VLDB's one cannot afford to waste I/O.
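    To make the last point concrete, a rough sketch of the CTAS + partition exchange pattern (hypothetical names; assumes a daily partition p_20240101 on the destination table):
    CREATE TABLE new_day_data NOLOGGING PARALLEL AS
      SELECT * FROM temp_stage;   -- direct-path, one pass, apply transformations here
    -- build the same local indexes on new_day_data, then:
    ALTER TABLE orig_table
      EXCHANGE PARTITION p_20240101 WITH TABLE new_day_data
      INCLUDING INDEXES WITHOUT VALIDATION;
    The exchange is a data dictionary operation, so the 100 million rows become part of orig_table without being rewritten.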

  • How to export a table with half a million rows?

    I need to export a table that has 535,000 rows. I tried to export to Excel and it exported only 65,535 rows. I tried to export to a text file and it said it was using the clipboard (?) and 65,000 rows was the maximum. Surely there has to be a way to export the entire table. I've been able to import much bigger CSV files than this, millions of rows.

    What version of Access are you using? Are you attempting to copy and paste records, or are you using Access' export functionality from the menu/ribbon? I'm using Access 2010 and just exported a million-record table to both a text file and to Excel (.xlsx format). Excel 2003 (using the .xls 97-2003 format) does have a limit of 65,536 rows, but the later .xlsx format does not.
    -Bruce

  • OC4J - How to get data from a large table (more than 9 million rows) by EJB?

    SOS SOS SOS SOS SOS SOS SOS SOS SOS SOS SOS SOS SOS SOS SOS SOS
    I use JDeveloper to create an EJB that has finder methods to get data from a big table (more than 9 million rows). Deploy is OK, but when I run the client program I always get a timeout error or a not-enough-memory error.
    Can anyone help me? Urgent!
    SOS SOS SOS SOS SOS SOS SOS SOS SOS SOS SOS SOS SOS SOS SOS SOS

    Your problem may be that you are simply trying to load so many objects (found by your finder) that you are exceeding available memory. For example, if each object is 100 bytes and you try to load 1,000,000 objects, that's 100 MB of memory gone.
    You could try increasing the amount of memory available to OC4J with the appropriate argument on the command line (or in the 10gAS console). For example to make 1Gb available to OC4J you would add the argument:
    -Xmx1000m
    Of course you need to have this available as physical memory on your server, or you will incur serious swapping.
    Chris

  • Insert/select one million rows at a time from source to target table

    Hi,
    Oracle 10.2.0.4.0
    I am trying to insert around 10 million rows into table target from source as follows:
    INSERT /*+ APPEND NOLOGGING */ INTO target
    SELECT *
    FROM source f
    WHERE
            NOT EXISTS(SELECT 1 from target m WHERE f.col1 = m.col2 and f.col2 = m.col2);
    There is a unique index on the target table on (col1, col2).
    I was having issues with undo, and now I am getting the following error with temp space:
    ORA-01652: unable to extend temp segment by 64 in tablespace TEMP
    I believe it would be easier if I did a bulk insert of one million rows at a time and committed.
    I appreciate any advice on this please.
    Thanks,
    Ashok

    902986 wrote:
    NOT EXISTS(SELECT 1 from target m WHERE f.col1 = m.col2 and f.col2 = m.col2);
    I don't know if it has any bearing on the case, but is that WHERE clause on purpose or a typo? Should it be:
            NOT EXISTS(SELECT 1 from target m WHERE f.col1 = m.COL1 and f.col2 = m.col2);
    Anyway - how much of your data already exists in target compared to source?
    Do you have 10 million in source and very few in target, so most of source will be inserted into target?
    Or do you have 9 million already in target, so most of source will be filtered away and only few records inserted?
    And what is the explain plan for your statement?
    INSERT /*+ APPEND NOLOGGING */ INTO target
    SELECT *
    FROM source f
    WHERE
            NOT EXISTS(SELECT 1 from target m WHERE f.col1 = m.col2 and f.col2 = m.col2);
    As your error has to do with TEMP, your statement might possibly try to do a lot of work in temp to materialize the resultset or parts of it, maybe to use in a hash join before inserting.
    So perhaps you can work towards an explain plan that allows the database to do the inserts "along the way" rather than calculate the whole thing in temp first.
    That probably will go much slower (for example using nested loops for each row to check the exists), but that's a tradeoff - if you can't have sufficient TEMP then you may have to optimize for less usage of that resource at the expense of another resource ;-)
    Alternatively ask your DBA to allocate more room in TEMP tablespace. Or have the DBA check if there are other sessions using a lot of TEMP in which case maybe you just have to make sure your session is the only one using lots of TEMP at the time you execute.
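    For what it's worth, a rough sketch of the batched approach the poster mentions (a hypothetical sketch only; note it uses the corrected m.col1 predicate, and each committed batch becomes visible to the NOT EXISTS of the next batch in the same session):
    DECLARE
      l_inserted PLS_INTEGER;
    BEGIN
      LOOP
        INSERT INTO target
          SELECT *
            FROM source f
           WHERE NOT EXISTS (SELECT 1 FROM target m
                              WHERE f.col1 = m.col1 AND f.col2 = m.col2)
             AND ROWNUM <= 1000000;
        l_inserted := SQL%ROWCOUNT;  -- capture before COMMIT resets it
        COMMIT;
        EXIT WHEN l_inserted = 0;
      END LOOP;
    END;
    /
    Committing inside a loop has its own costs and risks (ORA-01555, restartability), so this trades TEMP/undo pressure for a slower, repeated anti-join.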

  • Analyse a partitioned table with more than 50 million rows

    Hi,
    I have a partitioned table with more than 50 million rows. The last analyse was on 1/25/2007. Do I need to analyse it? (Queries run on this table are very slow.)
    If I need to analyse it, what is the best way? Use DBMS_STATS and schedule a job?
    Thanks

    A partitioned table has global statistics as well as partition (and subpartition if the table is subpartitioned) statistics. My guess is that you mean to say that the last time that global statistics were gathered was in 2007. Is that guess accurate? Are the partition-level statistics more recent?
    Do any of your queries actually use global statistics? Or would you expect that every query involving this table would specify one or more values for the partitioning key and thus force partition pruning to take place? If all your queries are doing partition pruning, global statistics are irrelevant, so it doesn't matter how old and out of date they are.
    Are you seeing any performance problems that are potentially attributable to stale statistics on this table? If you're not seeing any performance problems, leaving the statistics well enough alone may be the most prudent course of action. Gathering statistics would only have the potential to change query plans. And since the cost of a query plan regressing is orders of magnitude greater than the benefit of a different query performing faster (at least for most queries in most systems), the balance of risks would argue for leaving the stats alone if there is no problem you're trying to solve.
    If your system does actually use global statistics and there are performance problems that you believe are potentially attributable to stale global statistics and your partition level statistics are accurate, you can gather just global statistics on the table probably with a reasonably small sample size. Make sure, though, that you back up your existing statistics just in case a query plan goes south. Ideally, you'd also have a test environment with identical (or nearly identical) data volumes that you could use to verify that gathering statistics doesn't cause any problems.
    Justin
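    For illustration, a minimal sketch of that last suggestion (hypothetical owner/table names; back the stats up first, then gather only global statistics with a small sample):
    BEGIN
      DBMS_STATS.CREATE_STAT_TABLE(ownname => 'MYUSER', stattab => 'STATS_BACKUP');
      DBMS_STATS.EXPORT_TABLE_STATS(ownname => 'MYUSER', tabname => 'BIG_PART_TAB', stattab => 'STATS_BACKUP');
      DBMS_STATS.GATHER_TABLE_STATS(ownname          => 'MYUSER',
                                    tabname          => 'BIG_PART_TAB',
                                    granularity      => 'GLOBAL',
                                    estimate_percent => 1,
                                    cascade          => FALSE);
    END;
    /
    If a query plan regresses afterwards, DBMS_STATS.IMPORT_TABLE_STATS can restore the backed-up statistics.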

  • Update all rows in a table which has 8-10 million rows takes forever

    Hi All,
    Greetings!
    I have to update 8 million rows in a table. Basically I have to reset the batch_id with the current batch number. It contains 8-10 million rows, and I have tried a bulk update, but even then it takes a long time. Below is the table structure:
    sales_stg (it has composite key of product,upc and market)
    =======
    product_id
    upc
    market_id
    batch_id
    process_status
    I have to update batch_id and process_status: batch_id to the current batch id (a number) and process_status to zero. I have to update all the rows with these values where batch_id = 0.
    I tried a bulk update and it takes more than 2 hrs. (I limited each update to 1000 rows.)
    Any help in this regard.
    Naveen.

    The fastest way will probably be to not use a select loop but a direct update like in William's example. The main downside is that if you do too many rows you risk filling up your rollback/undo; to keep things as simple as possible I wouldn't do batching except for this. Also, we did some insert timings a few years ago on 9iR1 and found that the performance curve on frequent commits started to level off after 4K rows (fewer commits were still better), so you could see how performance improves by performing fewer commits if that's an issue.
    The other thing you could consider if you have the license is using the parallel query option.
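    For reference, a minimal sketch of the direct UPDATE being suggested (assuming :current_batch_id holds the new batch number):
    UPDATE sales_stg
       SET batch_id       = :current_batch_id,
           process_status = 0
     WHERE batch_id = 0;
    COMMIT;
    One statement and one commit; undo sizing permitting, this avoids the per-batch loop overhead entirely.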
