What is the best approach to insert millions of records?
Hi,
What is the best approach to insert millions of records into a table?
If an error occurs while inserting, how can I find out which record failed?
Thanks & Regards,
Sunita
Hello 942793
There isn't a single best approach if you do not give us the requirements and the environment.
It depends on what "best" means for you.
Questions:
1.) Can you disable the Constraints / unique Indexes on the table?
2.) Is there a possibility to run parallel queries?
3.) Do you need to know which rows could not be inserted while the constraints are enabled, or is that not necessary?
4.) Do you need it fast, or do you have time to do it?
What does "best approach" mean for you?
Regards,
David
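To illustrate one common pattern for the "which record failed" part of the question (a hedged sketch using Python's sqlite3, purely for demonstration and not tied to any particular database): insert in batches for speed, and when a batch fails, retry that batch row by row to pinpoint the offending records.

```python
import sqlite3

# Illustrative sketch: load a batch with executemany, and on failure fall
# back to row-by-row inserts to identify the bad records.
conn = sqlite3.connect(":memory:")
conn.execute("CREATE TABLE t (id INTEGER PRIMARY KEY, val TEXT)")

rows = [(1, "a"), (2, "b"), (2, "dup"), (3, "c")]  # (2, "dup") violates the PK

failed = []
try:
    with conn:  # the whole batch commits or rolls back as one transaction
        conn.executemany("INSERT INTO t VALUES (?, ?)", rows)
except sqlite3.IntegrityError:
    # Batch rolled back; retry one row at a time, recording each failure.
    for row in rows:
        try:
            with conn:
                conn.execute("INSERT INTO t VALUES (?, ?)", row)
        except sqlite3.IntegrityError as exc:
            failed.append((row, str(exc)))

print(failed)  # the duplicate row together with its error message
```

The trade-off is that the fallback pass is slow, so it only pays off when failures are rare; when they are common, per-row error logging (in Oracle, for instance, DML error logging) is the usual alternative.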
Similar Messages
-
Team,
In our project, we have a data migration requirement. We have the following scenario, and I would really appreciate any suggestions from you all on implementing it.
Scenario:
We have millions of records to be migrated to a destination SQL database after some transformation.
The source SQL Server is on premises in the partner's domain, and the destination server is in Azure.
Can you please suggest the best approach to do so?
thanks,
Bishnu
Bishnupriya Pradhan
You can use SSIS itself for this.
Have batch logic that identifies data batches within the source, then include data flow tasks to do the data transfer to Azure. The batch size should be chosen according to available buffer memory, the number of parallel tasks executing, etc.
You can use an ODBC or ADO.NET connection to connect to Azure.
http://visakhm.blogspot.in/2013/09/connecting-to-azure-instance-using-ssis.html
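The batching idea suggested above can be sketched in miniature. This Python/sqlite3 snippet is only an illustration of keyset-paginated batches with one transaction per batch (table names and sizes are invented), not SSIS itself:

```python
import sqlite3

# Hypothetical sketch of batch logic: page through the source by key so each
# transfer stays within memory, committing one transaction per batch.
src = sqlite3.connect(":memory:")
dst = sqlite3.connect(":memory:")
src.execute("CREATE TABLE orders (id INTEGER PRIMARY KEY, amount REAL)")
dst.execute("CREATE TABLE orders (id INTEGER PRIMARY KEY, amount REAL)")
src.executemany("INSERT INTO orders VALUES (?, ?)",
                [(i, i * 1.5) for i in range(1, 1001)])

BATCH = 100
last_id = 0
while True:
    batch = src.execute(
        "SELECT id, amount FROM orders WHERE id > ? ORDER BY id LIMIT ?",
        (last_id, BATCH)).fetchall()
    if not batch:
        break
    with dst:  # one transaction per batch
        dst.executemany("INSERT INTO orders VALUES (?, ?)", batch)
    last_id = batch[-1][0]  # keyset pagination: resume after the last key
```

Keyset pagination (WHERE id > last seen) is preferred over OFFSET here because it stays fast as the source grows, and a failed run can resume from the last committed key.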
Visakh
-
What's the best way to insert/update thousands of records in multiple tables?
Can anyone give an example of how to insert/update thousands of records in multiple tables, performance-wise? What should I do to improve performance?
Thanks
jim
You can see a set of sample applications for various scenarios at
http://otn.oracle.com/sample_code/tech/java/sqlj_jdbc/content.html -
Best way to insert millions of records into the table
Hi,
From a performance point of view, I am looking for suggestions on the best way to insert millions of records into a table.
Please also guide me on how to implement it easily while getting better performance.
Thanks,
Orahar.
Orahar wrote:
It's distributed data. Any number of clients, and any number of transactions fetching data from the database based on different conditions and inserting it into another transaction table, like a batch process.
Sounds contradictory.
If the source data is already in the database, it is centralised.
In that case you ideally do not want the overhead of shipping that data to a client, the client processing it, and the client shipping the results back to the database to be stored (inserted).
It is much faster and more scalable for the client to instruct the database (via a stored proc or package) what to do, and for that code (running on the database) to process the data.
For a stored proc, the same principle applies. It is faster for it to instruct the SQL engine what to do (via an INSERT..SELECT statement) than to pull the data from the SQL engine using a cursor fetch loop and then push that data back to the SQL engine using an insert statement.
An INSERT..SELECT can also be done as a direct path insert. This introduces some limitations, but is faster than a normal insert.
If the data processing is too complex for an INSERT..SELECT, then pulling the data into PL/SQL, processing it there, and pushing it back into the database is the next best option. This should be done using bulk processing though in order to optimise the data transfer process between the PL/SQL and SQL engines.
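The contrast described above can be sketched in miniature. This is only an illustration using Python's sqlite3 (an assumption for demonstration, not the thread's Oracle environment): one set-based INSERT..SELECT versus a client-side fetch-and-insert loop.

```python
import sqlite3

# Rough sketch: set-based INSERT..SELECT versus a row-by-row loop.
conn = sqlite3.connect(":memory:")
conn.executescript("""
    CREATE TABLE src (id INTEGER, status TEXT);
    CREATE TABLE dst_set (id INTEGER);
    CREATE TABLE dst_loop (id INTEGER);
""")
conn.executemany("INSERT INTO src VALUES (?, ?)",
                 [(i, "open" if i % 2 else "closed") for i in range(10)])

# Set-based: the SQL engine does all the work in one statement.
conn.execute("INSERT INTO dst_set SELECT id FROM src WHERE status = 'open'")

# Row-by-row: every row crosses the client/engine boundary twice.
for (row_id,) in conn.execute("SELECT id FROM src WHERE status = 'open'"):
    conn.execute("INSERT INTO dst_loop (id) VALUES (?)", (row_id,))

a = conn.execute("SELECT COUNT(*) FROM dst_set").fetchone()[0]
b = conn.execute("SELECT COUNT(*) FROM dst_loop").fetchone()[0]
# Both forms load the same rows; the set-based one avoids per-row round trips.
```

At millions of rows, the per-row round trips and statement overhead in the loop version dominate, which is exactly why the answer recommends the set-based form first and bulk-bound PL/SQL only as the fallback.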
Other performance considerations are the constraints on the insert table, the triggers, the indexes, and so on. Make sure that data integrity is guaranteed (e.g. via PKs and FKs) and optimal (e.g. FK columns should be indexed). As for triggers: they may not be the best approach (for example, using a trigger to assign a sequence value when it can be done faster in the insert SQL itself). Personally, I avoid triggers; I would rather have that code residing in a PL/SQL API for manipulating data in that table.
The type of table also plays a role. Make sure that the decision about the table structure, hashed, indexed, partitioned, etc, is the optimal one for the data structure that is to reside in that table. -
What is the best approach to process data on row by row basis ?
Hi Gurus,
I need to code a stored proc to process sales_orders into invoices. I think I must do row-by-row operations, but if possible I don't want to use a cursor. The algorithm is below:
for all sales_orders with status = "open"
    check the credit limit
    if over the credit limit -> insert a row into log_table; process the next order
    check for overdue invoices
    if there is an overdue invoice -> insert a row into log_table; process the next order
    check all order_items for stock availability
    if any item has insufficient stock -> insert a row into log_table; process the next order
    if all checks above pass:
        create the invoice (header + details)
end_for
What is the best approach to process data on a row-by-row basis like the above?
Thank you for your help,
xtanto
Processing data row by row is not the fastest method out there. You'll be sending many more SQL statements to the database than needed. The advice is to use SQL, and if that is not possible or too complex, use PL/SQL with bulk processing.
In this case a SQL only solution is possible.
The example below is oversimplified, but it shows the idea:
SQL> create table sales_orders
2 as
3 select 1 no, 'O' status, 'Y' ind_over_credit_limit, 'N' ind_overdue, 'N' ind_stock_not_available from dual union all
4 select 2, 'O', 'N', 'N', 'N' from dual union all
5 select 3, 'O', 'N', 'Y', 'Y' from dual union all
6 select 4, 'O', 'N', 'Y', 'N' from dual union all
7 select 5, 'O', 'N', 'N', 'Y' from dual
8 /
Table created.
SQL> create table log_table
2 ( sales_order_no number
3 , message varchar2(100)
4 )
5 /
Table created.
SQL> create table invoices
2 ( sales_order_no number
3 )
4 /
Table created.
SQL> select * from sales_orders
2 /
NO STATUS IND_OVER_CREDIT_LIMIT IND_OVERDUE IND_STOCK_NOT_AVAILABLE
1 O Y N N
2 O N N N
3 O N Y Y
4 O N Y N
5 O N N Y
5 rows selected.
SQL> insert
2 when ind_over_credit_limit = 'Y' then
3 into log_table (sales_order_no,message) values (no,'Over credit limit')
4 when ind_overdue = 'Y' and ind_over_credit_limit = 'N' then
5 into log_table (sales_order_no,message) values (no,'Overdue')
6 when ind_stock_not_available = 'Y' and ind_overdue = 'N' and ind_over_credit_limit = 'N' then
7 into log_table (sales_order_no,message) values (no,'Stock not available')
8 else
9 into invoices (sales_order_no) values (no)
10 select * from sales_orders where status = 'O'
11 /
5 rows created.
SQL> select * from invoices
2 /
SALES_ORDER_NO
2
1 row selected.
SQL> select * from log_table
2 /
SALES_ORDER_NO MESSAGE
1 Over credit limit
3 Overdue
4 Overdue
5 Stock not available
4 rows selected.
Hope this helps.
Regards,
Rob. -
What is the best approach to trying to find high freq hits in a file?
Lets say I have a text document that has millions of rows of information like "Name, address, last time checked in:"
What is the best approach if I were to look for the top 5 people who appears the most on this huge list?
Thanks!
If it is not in a database and it's just one file
You can still put it into a DB.
With all that data, what approach would be good in the realm of Java?
I thought I already said that.
Would Map still be the best choice?
Simplest? Probably. Best? Only you can determine that.
Would the complexity be n^2, since you would need to put everything in and then compare all the sizes?
No, it should be O(2N) (which is really O(N)). Inserting into the map is O(N), and then iterating once over the entries and adjusting your running top 5 is O(N). -
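The single-pass map approach described in that exchange can be sketched as follows; the line format is an assumption based on the question's "Name, address, last time checked in" description:

```python
from collections import Counter

# Sketch of the O(N) approach: one pass to count names, then take the top 5.
lines = [
    "Alice, 1 Elm St, 2023-01-01",
    "Bob, 2 Oak St, 2023-01-02",
    "Alice, 1 Elm St, 2023-01-03",
    "Carol, 3 Ash St, 2023-01-04",
    "Alice, 1 Elm St, 2023-01-05",
    "Bob, 2 Oak St, 2023-01-06",
]
counts = Counter(line.split(",")[0] for line in lines)  # name -> occurrences
top5 = counts.most_common(5)
print(top5)  # -> [('Alice', 3), ('Bob', 2), ('Carol', 1)]
```

For a file with millions of rows, the same code works by replacing the list with a line-by-line read of the file, so memory is bounded by the number of distinct names rather than the number of rows.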
What's the best approach for handling about 1300 connections in Oracle?
What's the best approach for handling about 1300 connections in Oracle 9i/10g through a Java application?
1. Using separate schemas for various types of users. (We can store only relevant data in a particular schema, so the number of records per table can be reduced by replicating tables; but then we have to maintain all data in another schema as well, and we need to update two schemas in a given session. Because we maintain a separate schema for one user and another schema for all data, there may be update problems.)
OR
2. Using a single schema for all users.
Note: All users may access the same tables, and there may be many more records than in the previous case.
Which is the best case?
Please give your valuable ideas.
It is true, but I want a solution from you all.
I want you to tell me how to fix my friend's car.
-
In Acrobat Professional 8, what is the best way to insert/combine multiple PDFs together in a large volume?
We have 300 PDF reports and need to insert a 2-page cover page in front of each report. Not sure if batch processing is best.
Thanks for any tips.
Probably each cover page is different too. I would probably just bite the bullet and do each individually. I would create the 2 cover pages in Word or another word processor and print to cover.pdf. Then open a PDF, use Pages > Insert Pages to add cover.pdf to the front of the open PDF, and Save As to the current PDF. Then repeat 299 times. Each time, you would make the appropriate change to the DOC file and print a new cover.pdf file (you might want to turn off "open in Acrobat" in the printer properties to save time during this processing). It is probably a good idea to keep a list of the files so you can check off what has been done (you can generate a list in DOS via Start > cmd, change the directory to your location [cd path], and run "dir >> list.txt"; that will give you a list to use). There may be an easier way, but by the time you get it figured out you might be done this way.
-
My final data table has a unique key constraint over two key columns. I insert data into this table from a daily capture table (which also contains the two columns that make up the key in the final data table, but they are not constrained (not unique) in the daily capture table). I don't want to insert rows from daily capture that already exist in the final data table (based on the two key columns). Currently, what I do is SELECT * INTO a #temp table from the join of the daily capture and final data tables on these two key columns. Then I delete the rows in the daily capture table that match the #temp table. Then I insert the remaining rows from daily capture into the final data table.
Would it be possible to simplify this process by using an INSTEAD OF trigger on the final table and just inserting directly from the daily capture table? How would this look?
What is the best practice for inserting unique (new) rows and ignoring duplicate rows (rows that already exist in both the daily capture and final data tables) in my particular operation?
Rich P
Please follow basic netiquette and post the DDL we need to answer this. Follow industry and ANSI/ISO standards in your data. You should follow ISO 11179 rules for naming data elements. You should follow ISO 8601 rules for displaying temporal data. We need to know the data types, keys, and constraints on the table. Avoid dialect in favor of ANSI/ISO Standard SQL. And you need to read and download the PDF at:
https://www.simple-talk.com/books/sql-books/119-sql-code-smells/
>> My final data table contains a two key columns unique key constraint. [unh? one two-column key or two one-column keys? Sure wish you posted DDL] I insert data into this table from a daily capture table (which also contains the two columns that make up the key in the final data table but are not constrained (not unique) in the daily capture table). <<
Then the "capture table" is not a table at all! Remember the first day of your RDBMS class? A table has to have a key. You need to fix this error. What ETL tool do you use?
>> I don't want to insert rows from daily capture which already exists in final data table (based on the two key columns). <<
MERGE statement; Google it. And do not use temp tables.
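As a rough illustration of the MERGE-style "insert only new keys" idea the reply points at (sketched here with Python's sqlite3 and INSERT OR IGNORE rather than T-SQL MERGE; table and column names are invented):

```python
import sqlite3

# Sketch: insert new rows from a daily capture table into a final table,
# silently skipping rows whose (k1, k2) key already exists.
conn = sqlite3.connect(":memory:")
conn.execute("CREATE TABLE final (k1 INTEGER, k2 INTEGER, v TEXT, "
             "PRIMARY KEY (k1, k2))")
conn.execute("CREATE TABLE daily (k1 INTEGER, k2 INTEGER, v TEXT)")

conn.executemany("INSERT INTO final VALUES (?, ?, ?)",
                 [(1, 1, "old"), (2, 2, "old")])
conn.executemany("INSERT INTO daily VALUES (?, ?, ?)",
                 [(1, 1, "dup"), (3, 3, "new")])

# One statement replaces the whole #temp-table / delete / insert dance:
# existing keys are ignored, new keys are inserted.
conn.execute("INSERT OR IGNORE INTO final SELECT k1, k2, v FROM daily")

rows = conn.execute("SELECT k1, k2, v FROM final ORDER BY k1").fetchall()
print(rows)  # -> [(1, 1, 'old'), (2, 2, 'old'), (3, 3, 'new')]
```

In SQL Server the equivalent single statement would be MERGE with WHEN NOT MATCHED THEN INSERT (or an INSERT ... SELECT with a NOT EXISTS filter); the point is the same: let one set-based statement handle the duplicate check.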
--CELKO-- Books in Celko Series for Morgan-Kaufmann Publishing: Analytics and OLAP in SQL / Data and Databases: Concepts in Practice Data / Measurements and Standards in SQL SQL for Smarties / SQL Programming Style / SQL Puzzles and Answers / Thinking
in Sets / Trees and Hierarchies in SQL -
What is the best way of inserting data using structured or binary storage in Oracle?
What is the best way of inserting data using structured or binary storage in the Oracle XML DB 11g database?
SQL*Loader.
-
What are the best approaches for mapping re-start in OWB?
What are the best approaches for mapping re-start in OWB?
We are using OWB repository 10.2.0.1.0 and OWB client 10.2.0.1.31. The Oracle version is 10 G (10.2.0.3.0). OWB is installed on Linux.
We have number of mappings. We built process flows for mappings as well.
I would like to know the best approaches to incorporate re-start options in our process, i.e. for a failure of a mapping in a process flow.
How do we recycle failed rows?
Are there any builtin features/best approaches in OWB to implement the above?
Does runtime audit tables help us to build re-start process?
If not, do we need to maintain our own tables (custom) to maintain such data?
How have other forum members handled the above situations?
Any idea ?
Thanks in advance.
RI
Hi RI,
How many mappings (range) do you have in a process flow?
Several hundred (100-300 mappings).
If we have three mappings (e.g. m1, m2, m3) in a process flow, what will happen if m2 fails?
Suppose the mappings are connected sequentially (m1 -> m2 -> m3). When m2 fails, the process flow is suspended (the transition to m3 will not be performed). You should eliminate the cause of the error (modify the mapping and redeploy, correct the data, etc.) and then repeat the m2 mapping execution from the Workflow monitor: open the diagram with the process flow, select mapping m2, click the Expedite button, and choose the option Repeat.
On re-start, will it run m1 again and then m2 and so on, or will it re-start at row 1 of m2?
You can specify a restart point. "At row 1 of m2" - I don't understand what you mean: all mappings run in set-based mode, so in case of error all table updates will roll back. (There are several exceptions, for example multiple target tables in a mapping without correlated commit, or an error in post-mapping; you must carefully analyze the results of the error.)
What will happen if m3 fails?
The process is suspended and you can restart execution from m3.
By running without failover and with max. number of errors = 0, do we achieve recycling of failed rows down to zero (0)?
These settings guarantee only two possible return results of a mapping: SUCCESS or ERROR.
What is the impact if we have a large volume of data?
In my opinion, for large volumes, set-based mode is the preferred mode of data processing.
With this mode you have the full range of enterprise features of the Oracle database: parallel query, parallel DML, nologging, etc.
Oleg -
What is the best approach to upgrading from LV7.1/DSC tags to LV2012/DSC shared variables, in multiple VIs running on multiple platforms? Our system is composed of about 5 PCs running Windows 2000/LV7.1 Runtime, plus a PLC, and a main controller running XP/SP3/LV2012. About 3 of the PCs publish sensor information via tags across the LAN to the main controller. Only the main controller is currently being upgraded. Rudimentary questions:
1. Will the other PCs running the 7.1 RTE (with tags) be able to communicate with the main controller running 2012 (shared variables)?
2. Is it necessary to convert from tags to shared variables, or will the deprecated legacy tag VIs from LV7.1 work in LV2012?
3. Will all the main controller VIs need to be incorporated into a project in order to use shared variables?
4. Is the only way to do this is to find all tag items and replace them with shared variable items?
Thanks in advance with any information and advice!
lb
Solved!
Go to Solution.
Hi lb,
We're glad to hear you're upgrading, but because there was a fundamental change in architecture since version 7.1, there will likely be some portions that require a rewrite.
The RTE needs to match the version of DSC you're using. Also, the tag architecture used in 7.1 is not compatible with the shared variable approach used in 2012. Please see the KnowledgeBase article Do I Need to Upgrade My DSC Runtime Version After Upgrading the LabVIEW DSC Module?
You will also need to convert from tags to shared variables. The change from tags to shared variables took place in the transition to LabVIEW 8. The KnowledgeBase Migrating from LabVIEW DSC 7.1 to 8.0 gives the process for changing from tags to shared variables.
Hope this gets you headed in the right direction. Let us know if you have more questions.
Thanks,
Dave C.
Applications Engineer
National Instruments -
Newbie: What is the best approach to integrate BO Enterprise into web app
Hi
1. I am very new to Business Objects and .NET. I need to know the best approach when integrating BO into my web app, i.e. which SDK do I use?
For now i want to provide very basic viewing functionality for the following reports :
-> Crystal Reports
-> Web Intelligence Reports
-> PDF Reports
2. Where do I find a standalone install for the Business Objects Enterprise XI .NET providers?
I only managed to find the wssdk, but I can't find the others. Business Objects Enterprise XI does not want to install on my machine (development); it installed fine on the server, so I was hoping I could find a standalone install.
To answer question one: you can use the Enterprise .NET SDK for each, though for viewing Webi documents it is much easier to use the opendocument method of URL reporting to view them.
The Crystal Reports and PDF instances can be viewed easily using the SDK.
Here is a link to the Developer Library:
[http://devlibrary.businessobjects.com/]
VB.NET XI Samples:
[http://support.businessobjects.com/communityCS/FilesAndUpdates/bexi_vbnet_samples.zip.asp]
C# XI Samples:
[http://support.businessobjects.com/communityCS/FilesAndUpdates/bexi_csharp_samples.zip.asp]
Other samples:
[https://boc.sdn.sap.com/codesamples]
I answered the provider question on your other thread.
Good luck!
Jason -
What's the best approach to work with Excel and CSV files?
Hi gurus, I have a question for you. In your experience, what's the best approach to work with Excel or CSV files that have to be uploaded through Data Services to your data warehouse?
Let's say your end user, who is not a programmer, creates a group of 4 Excel files with different calculations on a monthly basis, so they can generate a set of reports from their data warehouse once the files have been uploaded to tables in your DWH. The calculations vary from month to month. The user doesn't have a front end to upload the Excel files directly to Data Services. The end user needs to keep track of which person uploaded the files for a given month.
1. The end user should place their 4 Excel files in a shared directory that will be seen by Data Services.
2. Data Services will execute a scheduled job that will read the four files and upload them to the data warehouse at a set time, say 9:00 pm.
It makes me wonder: what happens if the user needs to present their reports immediately and can't wait until 9:00 pm? Is it possible for the end user to execute some kind of action (outside the Data Services environment) so Data Services "could know" that it has to process those files right now, instead of waiting for the night schedule?
Is there a way that DS will track who was the person who uploaded those files?
Would it be better to build a front end for the end user so they can upload their four files directly to the data warehouse?
Waiting for your comments to resolve this dilemma
Best Regards
Erika
Hi,
There are functions in DS that detect input files automatically. You could use the file_exists() or wait_for_file() option to do that. Schedule the job to run every few minutes, and if the file exists, then run. This could be done by using a certain file name with a date and timestamp etc., or, after running, move the old files to an archive and have DS wait for new files to show up.
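As a rough sketch of that polling idea (this is not DS code; `wait_for_files` is a hypothetical stand-in for DS's wait_for_file(), written in Python):

```python
import os
import tempfile
import time

# Hypothetical stand-in for a wait_for_file()-style check: poll a shared
# directory and return the matching files as soon as they appear, or give
# up after a timeout so the scheduled run can exit cleanly.
def wait_for_files(directory, expected_count, timeout=5.0, interval=0.1):
    deadline = time.monotonic() + timeout
    while time.monotonic() < deadline:
        files = sorted(f for f in os.listdir(directory) if f.endswith(".csv"))
        if len(files) >= expected_count:
            return files
        time.sleep(interval)
    return []  # timed out: nothing to process this cycle

with tempfile.TemporaryDirectory() as shared:
    # Simulate the end user dropping their files into the shared directory.
    for name in ("sales1.csv", "sales2.csv"):
        open(os.path.join(shared, name), "w").close()
    found = wait_for_files(shared, expected_count=2)
    print(found)  # -> ['sales1.csv', 'sales2.csv']
```

The same pattern answers the "process right now" question: a frequently scheduled job that exits immediately when no files are present behaves like an on-demand trigger from the end user's point of view.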
Check this - Selective Reading and Postprocessing - Enterprise Information Management - SCN Wiki
Hope this helps.
Arun -
What is the best approach to install BI statistics in SAP BI ?
Hello All,
what is the best approach to install BI statistics in SAP BI ?
by collecting objects in the standard BI Content (0TCT* objects), or
by executing some standard tcodes?
Regards,
Siva
The best approach depends on the version of your BW system; follow the installation steps in these notes:
BW 3.x:
309955 - BW statistics - Questions, answers and errors
BW 7.x
934848 - Collective note: (FAQ) BI Administration Cockpit
Cheers,
m./