Million rows insert

Hi
I want to insert 10 million rows from Oracle 8.1 into 10g.
Which way should I follow (bulk insert or SQL*Loader)?
Thanks
SS
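For context, if the 8.1 schema is reachable from the 10g database over a database link, a direct-path INSERT ... SELECT over the link is one option to benchmark against SQL*Loader (a sketch only; the table and link names are placeholders):
-- Hedged sketch: pull the rows from the 8.1 source over a database link
-- into the 10g table with a direct-path insert.
INSERT /*+ APPEND */ INTO customers_10g
SELECT * FROM customers@src81;
COMMIT;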

Johan Stf wrote:
Ok, slap me if this is wrong.
create sequence CustID
increment by 1
start with 1;
insert into customers (id)
select CustID.nextval
from dual
connect by level <= 1000000;
The question is why do you need to have the sequence then?
You could simply write the following using the LEVEL pseudo column:
insert /*+ append */ into customers (id)
select level as id
from dual
connect by level <= 1000000;
I've added the append hint to use a direct-path insert where possible. Note that this locks the object against any other DML activity, and you can't read from the object afterwards until you commit, so use it with care.
Note that using CONNECT BY LEVEL <= n uses a significant amount of memory if n is large.
For more information, see here:
http://blog.tanelpoder.com/2008/06/08/generating-lots-of-rows-using-connect-by-safely/
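One common workaround (a sketch along the lines of the linked article, not quoted from it) is to cross-join two smaller row generators so that no single CONNECT BY tree has to be a million levels deep:
-- Hedged sketch: 1,000 x 1,000 rows from a cartesian join of two small generators.
insert /*+ append */ into customers (id)
select rownum as id
from  (select level as l from dual connect by level <= 1000) a,
      (select level as l from dual connect by level <= 1000) b;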
Regards,
Randolf
Oracle related stuff blog:
http://oracle-randolf.blogspot.com/
SQLTools++ for Oracle (Open source Oracle GUI for Windows):
http://www.sqltools-plusplus.org:7676/
http://sourceforge.net/projects/sqlt-pp/
Edited by: Randolf Geist on Oct 28, 2008 2:17 PM
Added the memory consumption caveat

Similar Messages

  • Inserting 10 million rows into a table hangs

    Hi, through TOAD I am using a simple FOR loop to insert 10 million rows into a table, along the lines of:
    for i in 1 .. 10000000 loop
      insert .................
    end loop;
    It hangs for a long time.
    Is there a better way to insert the rows into the table? I have to test for performance, and I also have to insert 50 million rows into its child table.
    Practically, when the code moves to production it will have this many rows (maybe more); that's why I have to test at this scale.
    Please suggest a better way to do this.
    Regards
    raj

    Must be a 'hardware thing'.
    My ancient desktop (Pentium IV, 1.8 GHz, 512 MB), running XE, needs:
    MHO%xe> desc t
    Name                                      Null?    Type
    N                                                  NUMBER
    A                                                  VARCHAR2(10)
    B                                                  VARCHAR2(10)
    MHO%xe> insert /*+ APPEND */ into t
      2  with my_data as (
      3  select level n, 'abc' a, 'def' b from dual
      4  connect by level <= 10000000
      5  )
      6  select * from my_data;
    10000000 rows created.
    Elapsed: 00:04:09.71
    MHO%xe> drop table t;
    Table dropped.
    Elapsed: 00:00:31.50
    MHO%xe> create table t (n number, a varchar2(10), b varchar2(10));
    Table created.
    Elapsed: 00:00:01.04
    MHO%xe> insert into t
      2  with my_data as (
      3  select level n, 'abc' a, 'def' b from dual
      4  connect by level <= 10000000
      5  )
      6  select * from my_data;
    10000000 rows created.
    Elapsed: 00:02:44.12
    MHO%xe>  drop table t;
    Table dropped.
    Elapsed: 00:00:09.46
    MHO%xe> create table t (n number, a varchar2(10), b varchar2(10));
    Table created.
    Elapsed: 00:00:00.15
    MHO%xe> insert /*+ APPEND */ into t
      2   with my_data as (
      3   select level n, 'abc' a, 'def' b from dual
      4   connect by level <= 10000000
      5   )
      6   select * from my_data;
    10000000 rows created.
    Elapsed: 00:01:03.89
    MHO%xe>  drop table t;
    Table dropped.
    Elapsed: 00:00:27.17
    MHO%xe> create table t (n number, a varchar2(10), b varchar2(10));
    Table created.
    Elapsed: 00:00:01.15
    MHO%xe> insert into t
      2  with my_data as (
      3  select level n, 'abc' a, 'def' b from dual
      4  connect by level <= 10000000
      5  )
      6  select * from my_data;
    10000000 rows created.
    Elapsed: 00:01:56.10
    Yeah, it 'cached' a bit (of course ;) ).
    But the APPEND hint still seems to shave about 50 seconds off (using no indexes at all) on my 'configuration'.

  • Insert/select one million rows at a time from source to target table

    Hi,
    Oracle 10.2.0.4.0
    I am trying to insert around 10 million rows into the target table from source as follows:
    INSERT /*+ APPEND NOLOGGING */ INTO target
    SELECT *
    FROM source f
    WHERE
            NOT EXISTS(SELECT 1 from target m WHERE f.col1 = m.col2 and f.col2 = m.col2);
    There is a unique index on the target table on (col1, col2).
    I was having issues with undo, and now I am getting the following error with temp space:
    ORA-01652: unable to extend temp segment by 64 in tablespace TEMP
    I believe it would be easier if I did a bulk insert of one million rows at a time and committed.
    I appreciate any advice on this, please.
    Thanks,
    Ashok

    902986 wrote:
    NOT EXISTS(SELECT 1 from target m WHERE f.col1 = m.col2 and f.col2 = m.col2);
    I don't know if it has any bearing on the case, but is that WHERE clause on purpose or a typo? Should it be:
            NOT EXISTS(SELECT 1 from target m WHERE f.col1 = m.COL1 and f.col2 = m.col2);
    Anyway - how much of your data already exists in target compared to source?
    Do you have 10 million in source and very few in target, so most of source will be inserted into target?
    Or do you have 9 million already in target, so most of source will be filtered away and only few records inserted?
    And what is the explain plan for your statement?
    INSERT /*+ APPEND NOLOGGING */ INTO target
    SELECT *
    FROM source f
    WHERE
            NOT EXISTS(SELECT 1 from target m WHERE f.col1 = m.col2 and f.col2 = m.col2);
    As your error has to do with TEMP, your statement might be trying to do a lot of work in temp to materialize the result set, or parts of it, perhaps for use in a hash join, before inserting.
    So perhaps you can work towards an explain plan that allows the database to do the inserts "along the way" rather than calculate the whole thing in temp first.
    That probably will go much slower (for example using nested loops for each row to check the exists), but that's a tradeoff - if you can't have sufficient TEMP then you may have to optimize for less usage of that resource at the expense of another resource ;-)
    Alternatively ask your DBA to allocate more room in TEMP tablespace. Or have the DBA check if there are other sessions using a lot of TEMP in which case maybe you just have to make sure your session is the only one using lots of TEMP at the time you execute.
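    If it really does come down to loading in one-million-row chunks with intermediate commits, a minimal PL/SQL sketch could look like this (it assumes the corrected join on m.col1 and that source has no duplicate (col1, col2) pairs within itself; note that every pass re-probes the growing target, so later passes get slower):
    -- Hedged sketch: commit in roughly one-million-row chunks so UNDO/TEMP stays bounded.
    BEGIN
      LOOP
        INSERT INTO target
        SELECT *
          FROM source f
         WHERE NOT EXISTS (SELECT 1
                             FROM target m
                            WHERE f.col1 = m.col1
                              AND f.col2 = m.col2)
           AND ROWNUM <= 1000000;        -- one chunk per pass
        EXIT WHEN SQL%ROWCOUNT = 0;      -- nothing left to copy
        COMMIT;                          -- release undo before the next chunk
      END LOOP;
      COMMIT;
    END;
    /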

  • Tuning an insert sql that inserts a million rows doing a full table scan

    Hi Experts,
    I am on Oracle 11.2.0.3 on Linux. I have a SQL statement that inserts data into a history/archive table from a main application table based on date. The application table has 3 million rows in it, and all rows that are older than 6 months should go into the history/archive table. This was recently decided, and we have 1 million rows that satisfy this criterion. This insert into the archive table is taking about 3 minutes. The explain plan shows a full table scan on the main table, which is the right thing, as we are pulling 1 million rows out of the main table into the history table.
    My question is that, is there a way I can make this sql go faster?
    Here is the query plan (I changed the table names etc.)
       INSERT INTO EMP_ARCH
       SELECT *
    FROM EMP M
    where HIRE_date < (sysdate - :v_num_days);
    call     count       cpu    elapsed       disk      query    current        rows
    Parse        2      0.00       0.00          0          0          0           0
    Execute      2     96.22     165.59      92266     147180    8529323     1441230
    Fetch        0      0.00       0.00          0          0          0           0
    total        4     96.22     165.59      92266     147180    8529323     1441230
    Misses in library cache during parse: 1
    Misses in library cache during execute: 1
    Optimizer mode: FIRST_ROWS
    Parsing user id: 166
    Rows     Row Source Operation
    1441401   TABLE ACCESS FULL EMP (cr=52900 pr=52885 pw=0 time=21189581 us)
    I heard that there is a way to use the opt_param hint to increase the multiblock read count, but it didn't seem to work for me. I will be thankful for suggestions on this. Also, can collections or changing this to PL/SQL make it faster?
    Thanks,
    OrauserN

    Also I wish experts share their insight on how we make full table scan go faster (apart from parallel suggestion I mean).
    Please make up your mind about what question you actually want help with.
    First you said you want help making the INSERT query go faster but the rest of your replies, including the above statement, imply that you are obsessed with making full table scans go faster.
    You also said:
    our management would like us to come forth with the best way to do it
    But when people make suggestions you make replies about you and your abilities:
    I do not have the liberty to do the "alter session enable parallel dml". I have to work within these constraints
    Does 'management' want the best way to do whichever question you are asking?
    Or is it just YOU that wants the best way (whatever you mean by best) based on some unknown set of constraints?
    As SB already said, you clearly do NOT have an actual problem, since you have already completed the task of inserting the data, several times in fact. So the time it takes to do it is irrelevant.
    There is no universal agreement on what the word 'best' means for any given use case and you haven't given us your definition either. So how would we know what might be 'best'?
    So until you provide the COMPLETE list of constraints you are just wasting our time asking for suggestions that you refute with a comment about some 'constraint' you have.
    You also haven't provided ANY information that indicates that it is the full table scan that is the root of the problem. It is far more likely to be the INSERT into the table and a simple use of NOLOGGING with an APPEND hint might be all that is needed.
    IMHO the 'best' way would be to partition both the source and target tables and just use EXCHANGE PARTITION to move the data. That operation would only take a millisecond or two.
    But, let me guess, that would not conform to one of your 'unknown' constraints would it?
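    For reference, a minimal sketch of that exchange-partition idea (EMP_ARCH is assumed to be range-partitioned by HIRE_DATE; the partition name P_OLD and the staging table EMP_OLD_STAGE are hypothetical):
    -- Hedged sketch: swap segments instead of copying the old rows.
    CREATE TABLE emp_old_stage AS
    SELECT * FROM emp WHERE hire_date < ADD_MONTHS(SYSDATE, -6);
    ALTER TABLE emp_arch
      EXCHANGE PARTITION p_old WITH TABLE emp_old_stage
      WITHOUT VALIDATION;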

  • Inserting 320 million rows... the speed changes whilst the query is running

    Hi guys, I got some strange behavior in my data warehouse. When I insert data into a table I take note of the speed. At the beginning the speed was 3 million rows per minute (312 million rows are expected in total), but now, after five hours, the
    speed is 83,000 rows per minute and the table already has 234 million rows. I'm wondering whether this behavior is normal and how I can improve the performance (if I can, whilst the insert is running).
    Many Thanks

    Change the database recovery model to bulk-logged (preferably) or simple.
    No - that will not solve the problem, because INSERT INTO isn't bulk logged (automatically). To force a bulk-logged operation the target table needs to be locked exclusively!
    Greg Robidoux has a great matrix for that!
    http://www.mssqltips.com/sqlservertip/1185/minimally-logging-bulk-load-inserts-into-sql-server/
    @DIEGOCTIN,
    I assume nobody can really give you an answer because there could be so many reasons for it! As Josh has written, GROWTH could have been a problem if Instant File Initialization isn't set up. But 320 million records in one transaction could cause heavy growth of the log file, too, and that cannot benefit from Instant File Initialization.
    Another good tip came from Olaf, too!
    I would suggest inserting the data with a TABLOCK hint. In this case the transaction is minimally logged and the operation copies pages rather than rows into the target table. I've written a wiki article about that topic here:
    http://social.technet.microsoft.com/wiki/contents/articles/20651.sql-server-delete-a-huge-amount-of-data-from-a-table.aspx
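    A minimal T-SQL sketch of the TABLOCK approach (table names are placeholders; minimal logging also assumes a heap or empty target and the bulk-logged or simple recovery model):
    -- Hedged sketch: minimally logged insert into a placeholder target table.
    INSERT INTO dbo.target_table WITH (TABLOCK)
    SELECT col1, col2, col3
    FROM dbo.source_table;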
    Wish you all a merry christmas and a happy new year!
    MCM - SQL Server 2008
    MCSE - SQL Server 2012
    db Berater GmbH
    SQL Server Blog (German only)

  • Update all rows in a table which has 8-10 million rows takes forever

    Hi All,
    Greetings!
    I have to update 8 million rows in a table. Basically I have to reset the batch_id to the current batch number. The table contains 8-10 million rows, and I have tried a bulk update, but it also takes a long time. Below is the table structure:
    sales_stg (it has composite key of product,upc and market)
    =======
    product_id
    upc
    market_id
    batch_id
    process_status
    I have to update batch_id and process_status: batch_id to the current batch id (a number) and process_status to zero. I have to update all the rows with these values where batch_id = 0.
    I tried a bulk update and it takes more than 2 hrs to do (I limit the update to 1000 rows).
    Any help in this regard.
    Naveen.

    The fastest way will probably be to not use a select loop but a direct update like in William's example. The main downside is if you do too many rows you risk filling up your rollback/undo; to keep things as simple as possible I wouldn't do batching except for this. Also, we did some insert timings a few years ago on 9iR1 and found that the performance curve on frequent commits started to level off after 4K rows (fewer commits were still better) so you could see how performance improves by performing fewer commits if that's an issue.
    The other thing you could consider if you have the license is using the parallel query option.
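    For illustration, such a direct, set-based update would look something like the following sketch (not necessarily William's exact statement; :new_batch_id is a placeholder bind variable):
    -- Hedged sketch: one set-based update instead of a row-by-row loop.
    UPDATE sales_stg
       SET batch_id       = :new_batch_id,
           process_status = 0
     WHERE batch_id = 0;
    COMMIT;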

  • Migration of million rows from remote table using merge

    I need to migrate (using merge) almost 15 million rows from a remote database table having unique index (DAY, Key) with the following data setup.
    DAY1 -- Key1 -- NKey11
    DAY1 -- Key2 -- NKey12
    DAY2 -- Key1 -- NKey21
    DAY2 -- Key2 -- NKey22
    DAY3 -- Key1 -- NKey31
    DAY3 -- Key2 -- NKey32
    In my database, I have to merge all these 15 million rows into a table having unique index (Key); no DAY in destination table.
    First, it would be executed for DAY1 and the merge command will insert following two rows. For DAY2, it would update two rows with the NKey2 (for each Key) values and so on.
    Key1 -- NKey11
    Key2 -- NKey12
    I am looking for the best possible approach. Please note that I cannot make any change at remote database.
    Right now, I am using the following one which is taking huge time for DAY2 and so on (mainly update).
    MERGE INTO destination D
      USING (SELECT /*+ DRIVING_SITE(A) */ DAY, Key, NKey
                   FROM source@dblink A WHERE DAY = v_day) S
      ON (D.Key = S.Key)
    WHEN MATCHED THEN
       UPDATE SET D.NKey = S.NKey
    WHEN NOT MATCHED THEN
       INSERT (D.Key, D.NKey) VALUES (S.Key, S.NKey)
    LOG ERRORS INTO err$_destination REJECT LIMIT UNLIMITED;
    Edited by: 986517 on Feb 14, 2013 3:29 PM
    Edited by: 986517 on Feb 14, 2013 3:33 PM

    MERGE INTO destination D
      USING (SELECT /*+ DRIVING_SITE(A) */ DAY, Key, NKey
                   FROM source@dblink A WHERE DAY = v_day) S
      ON (D.Key = S.Key)
    WHEN MATCHED THEN
       UPDATE SET D.NKey = S.NKey
    WHEN NOT MATCHED THEN
       INSERT (D.Key, D.NKey) VALUES (S.Key, S.NKey)
    LOG ERRORS INTO err$_destination REJECT LIMIT UNLIMITED;
    The first remark I have to emphasize here is that the hint /*+ DRIVING_SITE(A) */ is silently ignored, because in the case of insert/update/delete/merge the driving site is always the site where the insert/update/delete is done.
    http://jonathanlewis.wordpress.com/2008/12/05/distributed-dml/#more-809
    Right now, I am using the following one which is taking huge time for DAY2 and so on (mainly update).
    The second remark is that you've realised that your MERGE is taking time, but you didn't trace it to see where the time is being spent. For that you can either use the 10046 trace event or, as a first step, get the execution plan of your MERGE statement.
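    For example, a 10046 trace for just this session can be switched on before the MERGE and off afterwards (a sketch; the resulting trace file in the user dump destination can then be formatted with tkprof):
    -- Hedged sketch: trace the MERGE with waits and binds.
    ALTER SESSION SET tracefile_identifier = 'merge_day2';
    ALTER SESSION SET EVENTS '10046 trace name context forever, level 12';
    -- run the MERGE here
    ALTER SESSION SET EVENTS '10046 trace name context off';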
    LOG ERRORS INTO err$_destination REJECT LIMIT UNLIMITED;
    The third remark is related to DML error logging: be aware that unique keys will prevent DML error logging from working correctly.
    http://hourim.wordpress.com/?s=DML+error
    And finally I advise you to look at the following blog article I wrote about enhancing an insert/select over db-link
    http://hourim.wordpress.com/?s=insert+select
    Mohamed Houri
    www.hourim.wordpress.com

  • Best method to update database table for 3 to 4 million rows

    Hi All,
    I have 3 to 4 million rows in my Excel file and we have to load them into a Z-table.
    The intent is to load and keep 18 months of history in this table.
    So what would be the best way to load this huge volume of data from the Excel file into the Z-table?
    If it is done from a program, is the best way to use the FM 'GUI_DOWNLOAD' to download those entries into an internal table and then directly do the following?
    INSERT Z_TABLE from IT_DOWNLOAD.
    I think for this huge amount of data it will go to a dump.
    Please suggest the best possible way, or any pseudocode, to insert these huge numbers of entries into that Z_TABLE.
    Thanks in advance..

    Hi,
    You get the dump because you are uploading that many records into the internal table from the Excel file...
    In this case, do the following.
    data : w_int type i,
             w_int1 type i value 1.
    data itab type standard table of ALSMEX_TABLINE with header line.
    do.
       refresh itab.
       w_int = w_int1.                       " first Excel row of this chunk
       w_int1 = w_int + 24999.               " last Excel row of this chunk (25,000 rows)
       CALL FUNCTION 'ALSM_EXCEL_TO_INTERNAL_TABLE'
         EXPORTING
           FILENAME                      = <filename>
           I_BEGIN_COL                   = 1
           I_BEGIN_ROW                   = w_int
           I_END_COL                     = 10
           I_END_ROW                     = w_int1
         TABLES
           INTERN                        = itab
         EXCEPTIONS
           INCONSISTENT_PARAMETERS       = 1
           UPLOAD_OLE                    = 2
           OTHERS                        = 3.
       IF SY-SUBRC <> 0.
    *    MESSAGE ID SY-MSGID TYPE SY-MSGTY NUMBER SY-MSGNO
    *            WITH SY-MSGV1 SY-MSGV2 SY-MSGV3 SY-MSGV4.
       ENDIF.
       if itab is not initial.
    *    segregate the data from itab into the main internal table and then
    *    insert the records from the main internal table into the database table.
       else.
         exit.
       endif.
       w_int1 = w_int1 + 1.                  " next chunk starts after this row
    enddo.
    Regards,
    Siddarth

  • Problem with BULK COLLECT with million rows - Oracle 9.0.1.4

    We have a requirement where we are supposed to load 58 million rows into a FACT table in our data warehouse. We initially planned to use Oracle Warehouse Builder, but due to performance reasons decided to write custom code. We wrote a custom procedure which opens a simple cursor, reads all the 58 million rows from the SOURCE table and, in a loop, processes the rows and inserts the records into a TARGET table. The logic works fine, but it took 20 hrs to complete the load.
    We then tried to leverage the BULK COLLECT and FORALL and PARALLEL options and modified our PL/SQL code completely to reflect these. Our code looks very simple.
    1. We declared PL/SQL BINARY_INTEGER-indexed tables to store the data in memory.
    2. We used BULK COLLECT to FETCH the data.
    3. We used FORALL statement while inserting the data.
    We did not introduce any of our transformation logic yet.
    We tried with 600,000 records first and it completed in 1 min and 29 sec with no problems. We then doubled the number of rows to 1.2 million and the program crashed with the following error:
    ERROR at line 1:
    ORA-04030: out of process memory when trying to allocate 16408 bytes (koh-kghu
    call ,pmucalm coll)
    ORA-06512: at "VVA.BULKLOAD", line 66
    ORA-06512: at line 1
    We got the same error even with 1 million rows.
    We do have the following configuration:
    SGA - 8.2 GB
    PGA
    - Aggregate Target - 3GB
    - Current Allocated - 439444KB (439 MB)
    - Maximum allocated - 2695753 KB (2.6 GB)
    Temp Table Space - 60.9 GB (Total)
    - 20 GB (Available approximately)
    I think we do have more than enough memory to process the 1 million rows!!
    Also, some times the same program results in the following error:
    SQL> exec bulkload
    BEGIN bulkload; END;
    ERROR at line 1:
    ORA-03113: end-of-file on communication channel
    We did not even attempt the full load. Also, we are not using the PARALLEL option yet.
    Are we hitting a bug here? Or is PL/SQL not capable of mass loads? I would appreciate any thoughts on this.
    Thanks,
    Haranadh
    Following is the code:
    set echo off
    set timing on
    create or replace procedure bulkload as
    -- SOURCE --
    TYPE src_cpd_dt IS TABLE OF ima_ama_acct.cpd_dt%TYPE;
    TYPE src_acqr_ctry_cd IS TABLE OF ima_ama_acct.acqr_ctry_cd%TYPE;
    TYPE src_acqr_pcr_ctry_cd IS TABLE OF ima_ama_acct.acqr_pcr_ctry_cd%TYPE;
    TYPE src_issr_bin IS TABLE OF ima_ama_acct.issr_bin%TYPE;
    TYPE src_mrch_locn_ref_id IS TABLE OF ima_ama_acct.mrch_locn_ref_id%TYPE;
    TYPE src_ntwrk_id IS TABLE OF ima_ama_acct.ntwrk_id%TYPE;
    TYPE src_stip_advc_cd IS TABLE OF ima_ama_acct.stip_advc_cd%TYPE;
    TYPE src_authn_resp_cd IS TABLE OF ima_ama_acct.authn_resp_cd%TYPE;
    TYPE src_authn_actvy_cd IS TABLE OF ima_ama_acct.authn_actvy_cd%TYPE;
    TYPE src_resp_tm_id IS TABLE OF ima_ama_acct.resp_tm_id%TYPE;
    TYPE src_mrch_ref_id IS TABLE OF ima_ama_acct.mrch_ref_id%TYPE;
    TYPE src_issr_pcr IS TABLE OF ima_ama_acct.issr_pcr%TYPE;
    TYPE src_issr_ctry_cd IS TABLE OF ima_ama_acct.issr_ctry_cd%TYPE;
    TYPE src_acct_num IS TABLE OF ima_ama_acct.acct_num%TYPE;
    TYPE src_tran_cnt IS TABLE OF ima_ama_acct.tran_cnt%TYPE;
    TYPE src_usd_tran_amt IS TABLE OF ima_ama_acct.usd_tran_amt%TYPE;
    src_cpd_dt_array src_cpd_dt;
    src_acqr_ctry_cd_array      src_acqr_ctry_cd;
    src_acqr_pcr_ctry_cd_array     src_acqr_pcr_ctry_cd;
    src_issr_bin_array      src_issr_bin;
    src_mrch_locn_ref_id_array     src_mrch_locn_ref_id;
    src_ntwrk_id_array      src_ntwrk_id;
    src_stip_advc_cd_array      src_stip_advc_cd;
    src_authn_resp_cd_array      src_authn_resp_cd;
    src_authn_actvy_cd_array      src_authn_actvy_cd;
    src_resp_tm_id_array      src_resp_tm_id;
    src_mrch_ref_id_array      src_mrch_ref_id;
    src_issr_pcr_array      src_issr_pcr;
    src_issr_ctry_cd_array      src_issr_ctry_cd;
    src_acct_num_array      src_acct_num;
    src_tran_cnt_array      src_tran_cnt;
    src_usd_tran_amt_array      src_usd_tran_amt;
    j number := 1;
    CURSOR c1 IS
    SELECT
    cpd_dt,
    acqr_ctry_cd ,
    acqr_pcr_ctry_cd,
    issr_bin,
    mrch_locn_ref_id,
    ntwrk_id,
    stip_advc_cd,
    authn_resp_cd,
    authn_actvy_cd,
    resp_tm_id,
    mrch_ref_id,
    issr_pcr,
    issr_ctry_cd,
    acct_num,
    tran_cnt,
    usd_tran_amt
    FROM ima_ama_acct ima_ama_acct
    ORDER BY issr_bin;
    BEGIN
    OPEN c1;
    FETCH c1 bulk collect into
    src_cpd_dt_array ,
    src_acqr_ctry_cd_array ,
    src_acqr_pcr_ctry_cd_array,
    src_issr_bin_array ,
    src_mrch_locn_ref_id_array,
    src_ntwrk_id_array ,
    src_stip_advc_cd_array ,
    src_authn_resp_cd_array ,
    src_authn_actvy_cd_array ,
    src_resp_tm_id_array ,
    src_mrch_ref_id_array ,
    src_issr_pcr_array ,
    src_issr_ctry_cd_array ,
    src_acct_num_array ,
    src_tran_cnt_array ,
    src_usd_tran_amt_array ;
    CLOSE C1;
    FORALL j in 1 .. src_cpd_dt_array.count
    INSERT INTO ima_dly_acct (
         CPD_DT,
         ACQR_CTRY_CD,
         ACQR_TIER_CD,
         ACQR_PCR_CTRY_CD,
         ACQR_PCR_TIER_CD,
         ISSR_BIN,
         OWNR_BUS_ID,
         USER_BUS_ID,
         MRCH_LOCN_REF_ID,
         NTWRK_ID,
         STIP_ADVC_CD,
         AUTHN_RESP_CD,
         AUTHN_ACTVY_CD,
         RESP_TM_ID,
         PROD_REF_ID,
         MRCH_REF_ID,
         ISSR_PCR,
         ISSR_CTRY_CD,
         ACCT_NUM,
         TRAN_CNT,
         USD_TRAN_AMT)
         VALUES (
         src_cpd_dt_array(j),
         src_acqr_ctry_cd_array(j),
         null,
         src_acqr_pcr_ctry_cd_array(j),
              null,
              src_issr_bin_array(j),
              null,
              null,
              src_mrch_locn_ref_id_array(j),
              src_ntwrk_id_array(j),
              src_stip_advc_cd_array(j),
              src_authn_resp_cd_array(j),
              src_authn_actvy_cd_array(j),
              src_resp_tm_id_array(j),
              null,
              src_mrch_ref_id_array(j),
              src_issr_pcr_array(j),
              src_issr_ctry_cd_array(j),
              src_acct_num_array(j),
              src_tran_cnt_array(j),
              src_usd_tran_amt_array(j));
    COMMIT;
    END bulkload;
    /
    SHOW ERRORS
    -----------------------------------------------------------------------------

    do you have a unique key available in the rows you are fetching?
    It seems a cursor with 20 million rows that is as wide as all the columns you want to work with is a lot of memory for the server to use at once. You may be able to do this with parallel processing (DOP over 8) and a lot of memory for the warehouse box (and the box you are extracting data from)... but is this the most efficient (and thereby fastest) way to do it?
    What if you used a cursor to select a unique key only, and then during the cursor loop fetch each record, transform it, and insert it into the target?
    It's a different way to do a lot at once, but it cuts down on the overall memory overhead for the process.
    I know this isn't as elegant as a single insert to do it all at once, but sometimes trimming a process down so it takes fewer resources at any given moment is much faster than trying to do the whole thing at once.
    My solution is probably biased by transaction systems, so I would be interested in what the data warehouse community thinks of this.
    For example:
    source table my_transactions (tx_seq_id number, tx_fact1 varchar2(10), tx_fact2 varchar2(20), tx_fact3 number, ...)
    select a cursor of tx_seq_id only (even at 20 million rows this is not much)
    you could then either use a for loop or even bulk collect into a plsql collection or table
    then process individually like this:
    procedure process_a_tx(p_tx_seq_id in number)
    is
    rTX my_transactions%rowtype;
    begin
    select * into rTX from my_transactions where tx_seq_id = p_tx_seq_id;
    --modify values as needed
    insert into my_target(a, b, c) values (rtx.fact_1, rtx.fact2, rtx.fact3);
    commit;
    exception
    when others then
    rollback;
    --write to a log or raise an exception
    end process_a_tx;
    procedure collect_tx
    is
    cursor cTx is
    select tx_seq_id from my_transactions;
    begin
    for rTx in cTx loop
    process_a_tx(rtx.tx_seq_id);
    end loop;
    end collect_tx;
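    Another common way to keep the PGA footprint bounded (a sketch of an alternative, not the key-only approach above) is to keep the BULK COLLECT/FORALL structure but fetch in batches with the LIMIT clause; only two of the columns are shown to keep the sketch short:
    -- Hedged sketch: column-wise collections, fetched and inserted 5,000 rows at a time.
    CREATE OR REPLACE PROCEDURE bulkload_limited AS
      TYPE t_cpd_dt   IS TABLE OF ima_ama_acct.cpd_dt%TYPE;
      TYPE t_issr_bin IS TABLE OF ima_ama_acct.issr_bin%TYPE;
      l_cpd_dt   t_cpd_dt;
      l_issr_bin t_issr_bin;
      CURSOR c1 IS
        SELECT cpd_dt, issr_bin FROM ima_ama_acct ORDER BY issr_bin;
    BEGIN
      OPEN c1;
      LOOP
        FETCH c1 BULK COLLECT INTO l_cpd_dt, l_issr_bin LIMIT 5000;
        EXIT WHEN l_cpd_dt.COUNT = 0;
        FORALL j IN 1 .. l_cpd_dt.COUNT
          INSERT INTO ima_dly_acct (cpd_dt, issr_bin)   -- remaining columns omitted here
          VALUES (l_cpd_dt(j), l_issr_bin(j));
        COMMIT;                                         -- or commit once after the loop
      END LOOP;
      CLOSE c1;
    END bulkload_limited;
    /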

  • Oracle 10 million records insert using Pro*C

    Hi,
    As I am new to Oracle 10g Pro*C and SQL*Loader, I would like to know a few things about them.
    My requirement is that I need to read a 20 GB file (20 million lines) line by line, convert each line into a database record with a little data manipulation, and insert
    it into an Oracle 10g database table.
    I read some articles saying that Pro*C is faster than SQL*Loader in performance (fast insertion), and also that Pro*C talks to
    Oracle directly and puts the data pages directly into Oracle rather than going through the SQL engine or parser.
    Even in the Pro*C samples, I have seen a FOR loop used to insert multiple records. Will each insertion cost
    more time?
    Is there a bulk insert facility in Pro*C, like 10 million rows at a shot?
    Or can Pro*C upload a file's data into an Oracle database table?
    If anyone has already answered this query, please point me to the thread number or id.
    Thank you,
    Ganesh
    Edited by: user12165645 on Nov 10, 2009 2:06 AM

    Alex Nuijten wrote:
    Personally I would go for either an External Table or SQL*Loader, mainly because I've never used Pro*C.. ;)
    And so far I never needed to resort to another option because of poor performance.
    I fully agree. Also, we are talking about "only" 10 million rows. This will take some time, but probably not more than 30 minutes, max. 2 hours, depending on many factors including storage, network and current database activity. I've seen systems where such a load took only a few minutes.
    I guess if there is a difference between Pro*C and external tables I would still opt for external tables, because they are so much easier to use.
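    For what it's worth, a minimal external-table sketch for a delimited 20 GB file (the directory, file name, column list and target table are all placeholders):
    -- Hedged sketch: expose the flat file as an external table, then direct-path insert.
    CREATE DIRECTORY data_dir AS '/u01/app/loads';
    CREATE TABLE big_file_ext (
      col1 VARCHAR2(50),
      col2 VARCHAR2(50),
      col3 NUMBER
    )
    ORGANIZATION EXTERNAL (
      TYPE ORACLE_LOADER
      DEFAULT DIRECTORY data_dir
      ACCESS PARAMETERS (
        RECORDS DELIMITED BY NEWLINE
        FIELDS TERMINATED BY ','
        MISSING FIELD VALUES ARE NULL
      )
      LOCATION ('big_file.txt')
    )
    REJECT LIMIT UNLIMITED;
    INSERT /*+ APPEND */ INTO target_table (col1, col2, col3)
    SELECT col1, col2, col3     -- row-level transformations can go here
    FROM big_file_ext;
    COMMIT;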

  • Delete query for a table with a million rows

    Hello,
    I have a requirement where I have to compare 2 tables - both having around a million rows - and delete data based on a single column.
    DELETE FROM TABLE_A WHERE COLUMN_A NOT IN
    (SELECT COLUMN_A FROM TABLE_B)
    COLUMN_A has an index defined on it in both tables. Still it is taking a long time. What is the best way to achieve this? Any workaround?
    thanks

    How many rows are you deleting from this table? If the percentage is large then the better option is:
    1) Create a new table containing the rows you want to keep, i.e. WHERE COLUMN_A IN
    (SELECT COLUMN_A FROM TABLE_B)
    2) TRUNCATE table_A
    3) Insert into table_A (select * from the new table)
    4) If you have any constraints then maybe they can be disabled.
    thanks
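    A sketch of that rebuild approach (TABLE_A_KEEP is a placeholder name; it assumes the percentage being deleted is large enough that rebuilding is cheaper than deleting):
    -- Hedged sketch: rebuild instead of delete when most rows are going away.
    CREATE TABLE table_a_keep AS
    SELECT a.*
    FROM   table_a a
    WHERE  a.column_a IN (SELECT b.column_a FROM table_b b);   -- rows to keep
    TRUNCATE TABLE table_a;
    INSERT /*+ APPEND */ INTO table_a
    SELECT * FROM table_a_keep;
    COMMIT;
    DROP TABLE table_a_keep;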

  • How to Load 100 Million Rows in a Partitioned Table

    Dear all,
    I am working on a VLDB application.
    I have a table with 5 columns,
    for example A, B, C, D, DATE_TIME.
    I created a range (daily) partitioned table on column DATE_TIME,
    as well as a number of indexes, for example:
    INDEX ON A
    COMPOSITE INDEX ON DATE_TIME, B, C
    Requirement
    I need to load approximately 100 million records into this table every day (it will be loaded via SQL*Loader or from a temp table (INSERT INTO ORIG SELECT * FROM TEMP))...
    Question
    The table is indexed, so I am not able to use the SQL*Loader feature DIRECT=TRUE.
    So let me know the best available way to load the data into this table.
    Note --> Please remember I can't drop and recreate the indexes daily due to the huge data quantity.

    Actually a simpler issue then what you seem to think it is.
    Q. What is the most expensive and slow operation on a database server?
    A. I/O. The more I/O, the more latency there is, the longer the wait times are, the bigger the response times are, etc.
    So how do you deal with VLT's? By minimizing I/O. For example, using direct loads/inserts (see SQL APPEND hint) means less I/O as we are only using empty data blocks. Doing one pass through the data (e.g. apply transformations as part of the INSERT and not afterwards via UPDATEs) means less I/O. Applying proper filter criteria. Etc.
    Okay, what do you do when you cannot minimize I/O anymore? In that case, you need to look at processing that I/O volume in parallel. Instead of serially reading and writing 100 million rows, you (for example) use 10 processes that each read and write 10 million rows. I/O bandwidth is there to be used. It is extremely unlikely that a single process can fully utilise the available I/O bandwidth. So use more processes, each processing a chunk of data, to use more of that available I/O bandwidth.
    Lastly, think DDL before DML when dealing with VLT's. For example, a CTAS to create a new data set and then doing a partition exchange to make that new data set part of the destination table, is a lot faster than deleting that partition's data directly, and then running a INSERT to refresh that partition's data.
    That in a nutshell is about it - think I/O and think of ways to use it as effectively as possible. With VLT's and VLDB's one cannot afford to waste I/O.
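    As an illustration of the parallel direct-path idea (a sketch; the degree of 10 is arbitrary and the ORIG/TEMP names follow the original post):
    -- Hedged sketch: parallel, direct-path insert from the staging table.
    ALTER SESSION ENABLE PARALLEL DML;
    INSERT /*+ APPEND PARALLEL(orig, 10) */ INTO orig
    SELECT /*+ PARALLEL(temp, 10) */ *
    FROM temp;
    COMMIT;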

  • Processing 410 million rows

    Hi, I have a table that has 410 million rows.
    And there is a function that I need to apply on two of the columns.
    If the function returns a certain value, I would like to insert some of the info into another table.
    This is the prototype code. Would using a pipelined function be appropriate?
    It looks like there will be a full scan no matter what, as there is no function-based index defined on those two columns for this particular function.
    And the expected output is 2-3% of the rows, so roughly 10 million.
    create table pipetest1 as
    select rownum id, ceil(dbms_random.value(0,99)) value1 ,ceil(dbms_random.value(0,99)) value2 from dual connect by level < 1001;
    create table pipetest_result1 (id number,code varchar2(1));
    create or replace function pipetest1_YN(value1 in number, value2 in number) return varchar2 AS
    lv_result_Y  varchar2(1) := 'Y';
    lv_result_N varchar2(1) := 'N';
    lv_result_Q varchar2(1) := 'Q';
    begin
        if mod(value1,2) = 0 then
            if mod( (value1 + value2),5) = 0 then
                return lv_result_Y;
            else
                return lv_result_N;
            end if;    
        elsif (mod(value1,3) = 0 ) then
            if ( mod((value1+value2),7) = 0 ) then
                return lv_result_Y;
            end if;   
        else
                return lv_result_N;
        end if;
        return lv_result_Q;
    EXCEPTION WHEN OTHERS THEN
        RETURN  lv_result_Q;
    end pipetest1_YN;
    --select * FROM PIPETEST1
    select id, value1, value2, pipetest1_YN(NVL(value1,1), NVL(value2,1)) from pipetest1;
    with j1 as (
    select id, value1, value2, pipetest1_YN(NVL(value1,1), NVL(value2,1)) result from pipetest1
    )
    select count(*) cnt, result from j1 group by result;
    insert into pipetest_result1
    with j1 as (
    select id, value1, value2, pipetest1_YN(NVL(value1,1), NVL(value2,1)) result from pipetest1
    )
    select id, result from j1 where j1.result = 'Y';
    select * from pipetest_result1;
    Edited by: vxwo0owxv on Feb 15, 2012 4:36 PM

    consider also using a virtual column
    create table virtual_col as select rownum id, ceil(dbms_random.value(0,99)) value1 ,ceil(dbms_random.value(0,99)) value2 from dual connect by level < 1001;
    alter table virtual_col add (check_thingo as (decode(mod(value1,2)
                              ,0,decode(mod(value1+value2,5),0,'Y','N')
                              ,decode(mod(value1,3)
                                                  ,0,decode(mod(value1+value2,7),0,'Y','N')
                                                  ,'N'))));
    select * from virtual_col;

  • Left joins on multi-million rows

    I have a simple query doing left joins on several tables, upwards of 7 tables. Each table has several hundred million rows.
    tblA is 1:M to tblB and tblB is 1:M to tblC and so on.
    How do I tune a query like that?
    sample query is
    select distinct
    a.col,b.col,c.col
    from tblA a left join tblB b
    on a.id=b.id
    and a.col is not null
    left join tblC
    on b.id=c.id
    and c.col > criteria
    thanks.

    hi
    the actual query is something like this:
    SELECT my_DEP.description,
    my_DEP.addr_id,
    hundredRowsTbl.address,
    5MillTbl.checkin_TIME,
    5MillTbl.checkout_TIME,
    hundredRowsTbl.ID2,
    5MillTbl.ID,
    5MillTbl.col2,
    my_DEP.col3,
    5MillTbl.col13,
    hundreds.desc,
    50mmTbl.col6,
    50mmTbl.col5,
    5MillTbl.col33
    FROM
    my.5MillTbl 5MillTbl
    LEFT OUTER JOIN
    my.50mmTbl 50mmTbl
    ON 5MillTbl.ID = 50mmTbl.ID
    LEFT OUTER JOIN my.hundreds hundreds
    ON 5MillTbl.banding =
    hundreds.banding
    INNER JOIN my.my_DEP my_DEP
    ON 5MillTbl.organization_ID = my_DEP.organization_ID
    INNER JOIN my.my_40millTbl
    ON 5MillTbl.seqID = my_40millTbl.seqID
    LEFT OUTER JOIN my.hundredRowsTbl hundredRowsTbl
    ON my_DEP.addr_id = hundredRowsTbl.ID2
    LEFT OUTER JOIN my.30millTbl 30millTbl
    ON my_DEP.organization_ID = 30millTbl.dept_id
    WHERE 1=1
    AND 5MillTbl.ID IS NOT NULL
    AND ( 5MillTbl.checkout_TIME >= TO_DATE ('01-01-2009 00:00:00', 'MM-DD-YYYY HH24:MI:SS')
    AND 5MillTbl.checkout_TIME < TO_DATE ('12-31-2010 00:00:00', 'MM-DD-YYYY HH24:MI:SS') )
    AND ( 5MillTbl.col2 IS NULL
    OR NOT (5MillTbl.col2 = 5
    OR 5MillTbl.col2 = 6) )
    AND 5MillTbl.ID IS NOT NULL
    AND 30millTbl.TYPE= '30'
    AND my_DEP.addr_id = 61

  • How can I use multiple row insert or update into DB in JSP?

    Hi all,
    Please help with my question:
    "How can I insert or update multiple rows into the DB in JSP?"
    I mean I will insert or update multiple records, as in a grid component. All the data I enter will go into the DB.
    With thanks,

    That isn't true. Different SQL databases have
    different capabilities and use different syntax.
    That's true - every database has its own quirks and extensions. No disagreement there. But they all follow ANSI SQL for CRUD operations. Since the OP said they wanted to do INSERTs and UPDATEs in batches, I assumed that ANSI SQL was sufficient.
    I'd argue that it's best to use ANSI SQL as much as possible, especially if you want your JDBC code to be portable between databases.
    and there are also a lot of different ways of talking to
    SQL databases that are possible in JSP, from using
    plain old java.sql.* in scriptlets to using the
    jstlsql taglib. I've done maintenance on both, and
    they are as different as night and day.Right, because you don't maintain JSP and Java classes the same way. No news there. Both java.sql and JSTL sql taglib are both based on SQL and JDBC. Same difference, except that one uses tags and the other doesn't. Both are Java JDBC code in the end.
    Well, sure. As long as you only want to update rows
    with the same value in column 2. I had the impression
    he wanted to update a whole table. If he only meant
    update all rows with the same value in a given column
    with the same value, that's trivial. All updates do
    that. But as far as I know there's no way to update
    more than one row where the values are different.
    I used this as an example to demonstrate that it's possible to UPDATE more than one row at a time. If I have 1,000 rows, and each one is a separate UPDATE statement that's unique from all the others, I guess I'd have to write 1,000 UPDATE statements. It's possible to have them all either succeed or fail as a single unit of work. I'm pointing out transactions because they weren't coming up in the discussion.
    Unless you're using MySQL, for instance. I only have
    experience with MySQL and M$ SQL Server, so I don't
    know what PostgreSQL, Oracle, Sybase, DB2 and all the
    rest are capable of, but I know for sure that MySQL
    can insert multiple rows while SQL Server can't (or at
    least I've never seen the syntax for doing it if it
    does).
    Right, but this syntax seems to be specific to MySQL. The moment you use it, you're locked into MySQL. There are other ways to accomplish the same thing with ANSI SQL.
    Don't assume that all SQL databases are the same.
    They're not, and it can really screw you up badly if
    you assume you can deploy a project you've developed
    with one database in an environment where you have to
    use a different one. Even different versions of the
    same database can have huge differences. I recommend
    you get a copy of the O'Reilly book, SQL in a
    Nutshell. It covers the most common DBMSes and does a
    good job of pointing out the differences.
    Yes, I understand that.
    It's funny that you're telling me not to assume that all SQL databases are the same. You're the one who's proposing that the OP use a MySQL-specific extension.
    I haven't looked at the MySQL docs to find out how the syntax you're suggesting works. What if one value set INSERT succeeds and the next one fails? Does MySQL roll back the successful INSERT? Is the unit of work under the JDBC driver's control with autoCommit?
    The OP is free to follow your suggestion. I'm pointing out that there are transactions for units of work and ANSI SQL ways to accomplish the same thing.
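    For what it's worth, one portable way to give many rows different values in a single statement and a single unit of work (a sketch; the table and values are made up) is an UPDATE with a CASE expression:
    -- Hedged sketch: one statement, one transaction, plain ANSI SQL.
    UPDATE customers
       SET status = CASE id
                      WHEN 101 THEN 'ACTIVE'
                      WHEN 102 THEN 'SUSPENDED'
                      WHEN 103 THEN 'CLOSED'
                    END
     WHERE id IN (101, 102, 103);
    COMMIT;  -- or ROLLBACK, so all three changes succeed or fail together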
