MINUS on a 4 million row table - how to enhance speed?

I have a query like this:
select col1, col2, col3
from table1
MINUS
select col1, col2, col3
from EXTERNAL_TABLE
table1 has approximately 4 million records and the external file has roughly 4 million rows.
The MINUS takes around 25 minutes here. How can I speed it up?
Thanks in Advance!!!

To make something go faster, you first need to know what makes it slow.
Simple actually - to solve a problem we need to know what the problem is. Can't answer a question without knowing what the question is.
Reading a total of 8 million rows means a lot more I/O than usual... so that is likely the culprit for the slow performance. If that is the case, you will need to find a way to reduce the time it takes to perform all that I/O. Or do less I/O.
But you need to pop the hood and take a look at just what is causing what you think is slow performance. (It may not even be slow - it may be as fast as it can go given the hardware and other limitations that are imposed on Oracle and this MINUS process.)
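For example, a quick way to pop that hood is to check what the optimizer plans to do (a minimal sketch, assuming SQL*Plus and access to DBMS_XPLAN):
EXPLAIN PLAN FOR
SELECT col1, col2, col3 FROM table1
MINUS
SELECT col1, col2, col3 FROM EXTERNAL_TABLE;

SELECT * FROM TABLE(DBMS_XPLAN.DISPLAY);
Tracing the actual 25-minute run (for example with SQL trace and tkprof) would then show whether the time goes to scanning the external table, sorting for the MINUS, or simply waiting on I/O.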

Similar Messages

  • Delete from a 95 million row table ...

    Hi folks, I need to delete from a 95 million row regular table. What should be my best options? I have tried CTAS using parallel, but it failed after 1+ hrs ... it was due to a bad query, but I am checking whether there is any other way to achieve this.
    Thanks in advance.

    user8604530 wrote:
    Hi folks, I need to delete from a 95 million row regular table. What should be my best options? I have tried CTAS using parallel, but it failed after 1+ hrs ... it was due to a bad query, but I am checking whether there is any other way to achieve this.
    Thanks in advance.
    How many rows in the table BEFORE the DELETE?
    How many rows in the table AFTER the DELETE?
    How do I ask a question on the forums?
    SQL and PL/SQL FAQ
    Handle:     user8604530
    Status Level:     Newbie
    Registered:     Mar 10, 2010
    Total Posts:     64
    Total Questions:     26 (22 unresolved)
    I extend to you my condolences since you rarely get your questions answered.

  • Fetch only tables with 10 million or more rows

    Hi all,
    How can I fetch the tables that have more than 10 million rows with this PL/SQL? I got it from some other site I can't remember.
    Can somebody help me with this please? Your help is greatly appreciated.
    declare
      counter number;
    begin
      for x in (select segment_name, owner
                  from dba_segments
                 where segment_type = 'TABLE'
                   and owner = 'KOMAKO')
      loop
        execute immediate 'select count(*) from ' || x.owner || '.' || x.segment_name into counter;
        dbms_output.put_line(rpad(x.owner, 30, ' ') || '.' || rpad(x.segment_name, 30, ' ') || ' : ' || counter || ' row(s)');
      end loop;
    end;
    Thank you,
    gg

    1) This code appears to work. Of course, there seems to be no need to select from DBA_SEGMENTS when DBA_TABLES would be more straightforward. And, of course, you'd have to do something when the count exceeded 10 million.
    2) If you are using the cost-based optimizer (CBO) and your statistics are reasonably accurate and you can tolerate a degree of staleness/ approximation in the row counts, you could just select the NUM_ROWS column from DBA_TABLES.
    Justin
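    For illustration, a minimal sketch of option 2 (NUM_ROWS comes from the optimizer statistics, so it is an estimate that is only as fresh as the last stats gather):
    SELECT owner, table_name, num_rows
      FROM dba_tables
     WHERE owner = 'KOMAKO'
       AND num_rows >= 10000000;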

  • Best way to refresh 5 million row table

    Hello,
    I have a table with 5 million rows that needs to be refreshed every 2 weeks.
    Currently I am dropping and re-creating the table, which takes a very long time and gives a tablespace-related warning at the end of execution. It does create the table with the actual number of rows, but I am not sure why I get the tablespace warning at the end.
    Any help is greatly appreciated.
    Thanks.

    Can you please post your query?
    1. What is the size of the temporary tablespace?
    2. Is your query performing any sorts?
    Monitor the TEMP tablespace usage with the query below after executing your SQL query:
    SELECT TABLESPACE_NAME, BYTES_USED, BYTES_FREE
      FROM V$TEMP_SPACE_HEADER;

  • Google-style autosuggest on a 30 million row table

    Hi All,
    I'm exploring ways of implementing a "google style autosuggest" on a table with no less than 30 million rows. It has a field with an address (varchar) and I'd like to create an Ajax call, while the user is typing, that would suggest a few addresses to the user.
    I was thinking about using CONTAINS + fuzzy... but I'm not sure if it will be fast enough and whether it will return the right results.
    Any suggestions?
    thanks

    The thread 2 million rows with XML type data may be of interest to you.
    HTH
    Girish Sharma
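    For reference, a rough sketch of the CONTAINS + fuzzy idea (hypothetical names; assumes an Oracle Text CONTEXT index on the address column of an addresses table, and a bind :typed holding what the user has typed so far):
    SELECT address
      FROM addresses
     WHERE CONTAINS(address, 'fuzzy(' || :typed || ', 60, 10, weight)') > 0
       AND ROWNUM <= 10;
    Whether this is fast enough at 30 million rows depends heavily on the Text index and on how selective the typed input is, so it would need benchmarking before committing to the approach.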

  • Importing huge (10 million rows) tables into HANA

    Dear All,
    I need to import huge tables (40,000,000 records) into a HANA system, in order to develop a demo system for a customer.
    I'm planning to use CSV files.
    Does someone have experience with a task like this? Is there a limit to the CSV file size?
    Many thanks
    Leopoldo Capasso

    Check out this blog for best practices on loading high-volume data from flat files:
    http://scn.sap.com/community/hanainmemory/blog/2013/04/08/bestpracticesforsaphanadataloads
    Hope this helps.
    Rama
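    As a starting point, a hypothetical sketch of a server-side CSV import in HANA (paths, schema, and table names are placeholders; the file must be readable by the HANA server process):
    IMPORT FROM CSV FILE '/hana/work/mytable.csv'
    INTO "MYSCHEMA"."MYTABLE"
    WITH RECORD DELIMITED BY '\n'
         FIELD DELIMITED BY ','
         THREADS 8
         BATCH 10000;
    Splitting a 40-million-row extract into several files and loading with multiple THREADS is a common way to keep the load time reasonable.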

  • General Scenario- Adding columns into a table with more than 100 million rows

    I was asked/given a scenario: what issues do you encounter when you try to add new columns to a table with more than 200 million rows? How do you overcome them?
    Thanks in advance.
    svk

    For such a large table, it is better to add the new column to the end of the table to avoid any performance impact, as RSingh suggested.
    Also avoid using any default on the newly added column, or SQL Server will have to fill in 200 million fields with this default value. If you need one, add an empty column and update the column in small batches (otherwise you lock up the whole table). Add the default after all the rows have a value for the new column.
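    A rough sketch of that batched approach in T-SQL (hypothetical table and column names):
    ALTER TABLE dbo.BigTable ADD NewCol int NULL;  -- nullable, no default: a metadata-only change

    WHILE 1 = 1
    BEGIN
        UPDATE TOP (10000) dbo.BigTable
           SET NewCol = 0
         WHERE NewCol IS NULL;
        IF @@ROWCOUNT = 0 BREAK;   -- stop once every row has a value
    END

    ALTER TABLE dbo.BigTable ADD CONSTRAINT DF_BigTable_NewCol DEFAULT (0) FOR NewCol;
    Small batches keep each transaction, and therefore its locks, short instead of holding the whole table for one giant update.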

  • Schema Design for 10^6+ rows table (Indicator Column / Bitmap Join Index?)

    Hi all,
    I read the following suggestion for a SELECT with LEFT OUTER JOIN in a DB2 consulting company paper, for a 10-million-row table:
    SELECT columns
      FROM ACCTS A
      LEFT JOIN OPT1 O1
        ON A.ACCT_NO = O1.ACCT_NO
       AND A.FLAG1 = 'Y'
      LEFT JOIN OPT2 O2
        ON A.ACCT_NO = O2.ACCT_NO
       AND A.FLAG2 = 'Y'
     WHERE A.ACCT_NO = 1
    For DB2, according to the paper, the following is true: iff A.FLAG1 <> 'Y', then no table or index access on OPT1 is done. Same for A.FLAG2/OPT2.
    I recreated the situation for ORACLE with the following script and came to some really interesting questions:
    DROP TABLE maintbl CASCADE CONSTRAINTS;
    DROP TABLE opt1 CASCADE CONSTRAINTS;
    DROP TABLE opt2 CASCADE CONSTRAINTS;
    CREATE TABLE maintbl (
      id   INTEGER NOT NULL,
      dat  VARCHAR2 (2000 CHAR),
      opt1 CHAR (1),
      opt2 CHAR (1),
      CONSTRAINT CK_maintbl_opt1 CHECK (opt1 IN ('Y', 'N')) INITIALLY IMMEDIATE ENABLE VALIDATE,
      CONSTRAINT CK_maintbl_opt2 CHECK (opt2 IN ('Y', 'N')) INITIALLY IMMEDIATE ENABLE VALIDATE,
      CONSTRAINT PK_maintbl PRIMARY KEY (id)
    );
    CREATE TABLE opt1 (
      maintbl_id INTEGER NOT NULL,
      adddat1    VARCHAR2 (100 CHAR),
      adddat2    VARCHAR2 (100 CHAR),
      CONSTRAINT PK_opt1 PRIMARY KEY (maintbl_id),
      CONSTRAINT FK_opt1_maintbltable FOREIGN KEY (maintbl_id) REFERENCES maintbl (id)
    );
    CREATE TABLE opt2 (
      maintbl_id INTEGER NOT NULL,
      adddat1    VARCHAR2 (100 CHAR),
      adddat2    VARCHAR2 (100 CHAR),
      CONSTRAINT PK_opt2 PRIMARY KEY (maintbl_id),
      CONSTRAINT FK_opt2_maintbltable FOREIGN KEY (maintbl_id) REFERENCES maintbl (id)
    );
    INSERT ALL
      WHEN 1 = 1 THEN
        INTO maintbl (id, opt1, opt2, dat) VALUES (nr, is_even, is_odd, maintbldat)
      WHEN is_even = 'N' THEN
        INTO opt1 (maintbl_id, adddat1, adddat2) VALUES (nr, adddat1, adddat2)
      WHEN is_even = 'Y' THEN
        INTO opt2 (maintbl_id, adddat1, adddat2) VALUES (nr, adddat1, adddat2)
    SELECT LEVEL AS nr,
           CASE WHEN MOD(LEVEL, 2) = 0 THEN 'Y' ELSE 'N' END AS is_even,
           CASE WHEN MOD(LEVEL, 2) = 1 THEN 'Y' ELSE 'N' END AS is_odd,
           TO_CHAR(DBMS_RANDOM.RANDOM) AS maintbldat,
           TO_CHAR(DBMS_RANDOM.RANDOM) AS adddat1,
           TO_CHAR(DBMS_RANDOM.RANDOM) AS adddat2
      FROM DUAL
    CONNECT BY LEVEL <= 100;
    COMMIT;
    SELECT *
      FROM maintbl
      LEFT OUTER JOIN opt1 ON maintbl.id = opt1.maintbl_id AND maintbl.opt1 = 'Y'
      LEFT OUTER JOIN opt2 ON maintbl.id = opt2.maintbl_id AND maintbl.opt2 = 'Y'
     WHERE id = 1;
    Explain plan for "Oracle Database 10g Enterprise Edition Release 10.2.0.4.0 - 64bi":
    http://i.imgur.com/f0AiA.png
    As one can see, the DB uses a view to index-access the opt tables iff indicator column maintbl.opt1='Y' in the main table.
    Explain plan for "Oracle Database 11g Express Edition Release 11.2.0.2.0 - Production":
    http://i.imgur.com/iKfj8.png
    As one can see, the DB does NOT use the view, and instead uses a pretty useless CASE statement.
    Now my questions:
    1) What does the optimizer do in 11.2 XE?!?
    2) In general: do you suggest this table setup? Does your yes/no suggestion depend on the row count in the tables? Of course I see the problem with incorrectly updated columns and would NEVER do it if there were another truly relational solution with possibly the same performance.
    3) Is there a way to avoid performance issues if I don't use an indicator column in ORACLE? Is this what a Bitmap Join Index (http://docs.oracle.com/cd/E11882_01/server.112/e25789/indexiot.htm#autoId14) is for?
    Thanks in advance and happy discussing,
    Blama

    Fair enough. I've included a cut-down set of SQL below.
    CREATE TABLE DIMENSION_DATE (
      DATE_ID       NUMBER,
      CALENDAR_DATE DATE,
      CONSTRAINT DATE_ID PRIMARY KEY (DATE_ID)
    );
    CREATE UNIQUE INDEX DATE_I1 ON DIMENSION_DATE (CALENDAR_DATE, DATE_ID);
    CREATE TABLE ORDER_F (
      ORDER_ID         VARCHAR2(40 BYTE),
      SUBMITTEDDATE_FK NUMBER,
      COMPLETEDDATE_FK NUMBER
    );
    -- Then I add the first bitmap join index, which works:
    CREATE BITMAP INDEX SUBMITTEDDATE_FK ON ORDER_F (DIMENSION_DATE.DATE_ID)
    FROM ORDER_F, DIMENSION_DATE
    WHERE ORDER_F.SUBMITTEDDATE_FK = DIMENSION_DATE.DATE_ID;
    -- Then attempt the next one:
    CREATE BITMAP INDEX completeddate_fk ON ORDER_F (b.date_id)
    FROM ORDER_F, DIMENSION_DATE b
    WHERE ORDER_F.completeddate_fk = b.date_id;
    -- which results in:
    -- ORA-01408: such column list already indexed
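    -- (Both join indexes cover the same indexed column list on ORDER_F, namely DIMENSION_DATE.DATE_ID,
    -- which is presumably why Oracle rejects the second one even though the join conditions differ.)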

  • How to Load 100 Million Rows into a Partitioned Table

    Dear all,
    I am working on a VLDB application.
    I have a table with 5 columns, for example A, B, C, D, DATE_TIME.
    I created a range (daily) partitioned table on column DATE_TIME,
    as well as a number of indexes, for example:
    INDEX ON A
    COMPOSITE INDEX ON DATE_TIME, B, C
    REQUIREMENT
    I need to load approx 100 million records into this table every day (it will be loaded via SQL*Loader or from a temp table: INSERT INTO orig SELECT * FROM temp).
    QUESTION
    The table is indexed, so I am not able to use the SQL*Loader feature DIRECT=TRUE.
    So let me know the best available way to load the data into this table.
    Note --> Please remember I can't drop and recreate the indexes daily due to the huge data quantity.

    Actually a simpler issue than what you seem to think it is.
    Q. What is the most expensive and slow operation on a database server?
    A. I/O. The more I/O, the more latency there is, the longer the wait times are, the bigger the response times are, etc.
    So how do you deal with VLT's? By minimizing I/O. For example, using direct loads/inserts (see SQL APPEND hint) means less I/O as we are only using empty data blocks. Doing one pass through the data (e.g. apply transformations as part of the INSERT and not afterwards via UPDATEs) means less I/O. Applying proper filter criteria. Etc.
    Okay, what do you do when you cannot minimize I/O anymore? In that case, you need to look at processing that I/O volume in parallel. Instead of serially reading and writing 100 million rows, you (for example) use 10 processes that each read and write 10 million rows. I/O bandwidth is there to be used. It is extremely unlikely that a single process can fully utilise the available I/O bandwidth. So use more processes, each processing a chunk of data, to use more of that available I/O bandwidth.
    Lastly, think DDL before DML when dealing with VLT's. For example, a CTAS to create a new data set and then doing a partition exchange to make that new data set part of the destination table is a lot faster than deleting that partition's data directly and then running an INSERT to refresh that partition's data.
    That in a nutshell is about it - think I/O and think of ways to use it as effectively as possible. With VLT's and VLDB's one cannot afford to waste I/O.
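    To make the last point concrete, a rough sketch of the CTAS + partition exchange pattern (hypothetical names; assumes a daily partition p_20240101 on the destination table):
    CREATE TABLE new_day_data NOLOGGING PARALLEL AS
      SELECT * FROM temp_stage;   -- direct-path, one pass, apply transformations here
    -- build the same local indexes on new_day_data, then:
    ALTER TABLE orig_table
      EXCHANGE PARTITION p_20240101 WITH TABLE new_day_data
      INCLUDING INDEXES WITHOUT VALIDATION;
    The exchange is a data dictionary operation, so the 100 million rows become part of orig_table without being rewritten.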

  • How to export a table with half a million rows?

    I need to export a table that has 535,000 rows. I tried to export to Excel and it exported only 65,535 rows. I tried to export to a text file and it said it was using the clipboard (?) and 65,000 rows was the maximum. Surely there has to be a way to export the entire table. I've been able to import much bigger CSV files than this, millions of rows.

    What version of Access are you using? Are you attempting to copy and paste records, or are you using Access' export functionality from the menu/ribbon? I'm using Access 2010 and just exported a million-record table to both a text file and to Excel (.xlsx format). Excel 2003 (using the .xls 97-2003 format) does have a limit of 65,536 rows, but the later .xlsx format does not.
    -Bruce

  • OC4J - How to get data from a large table (more than 9 million rows) by EJB?

    SOS SOS SOS SOS SOS SOS SOS SOS SOS SOS SOS SOS SOS SOS SOS SOS
    I use JDeveloper to create an EJB that has finder methods to get data from a big table (more than 9 million rows). Deploy is OK, but when I run the client program I always get a timeout error or a not-enough-memory error.
    Can anyone help me? Urgent!
    SOS SOS SOS SOS SOS SOS SOS SOS SOS SOS SOS SOS SOS SOS SOS SOS

    Your problem may be that you are simply trying to load so many objects (found by your finder) that you are exceeding available memory. For example, if each object is 100 bytes and you try to load 1,000,000 objects, that's 100 MB of memory gone.
    You could try increasing the amount of memory available to OC4J with the appropriate argument on the command line (or in the 10gAS console). For example to make 1Gb available to OC4J you would add the argument:
    -Xmx1000m
    Of course you need to have this available as physical memory on your server, or you will incur serious swapping.
    Chris

  • Insert/select one million rows at a time from source to target table

    Hi,
    Oracle 10.2.0.4.0
    I am trying to insert around 10 million rows into table target from source as follows:
    INSERT /*+ APPEND NOLOGGING */ INTO target
    SELECT *
    FROM source f
    WHERE
            NOT EXISTS(SELECT 1 from target m WHERE f.col1 = m.col2 and f.col2 = m.col2);
    There is a unique index on the target table on (col1, col2).
    I was having issues with undo, and now I am getting the following error with temp space:
    ORA-01652: unable to extend temp segment by 64 in tablespace TEMP
    I believe it would be easier if I did a bulk insert of one million rows at a time and committed.
    I appreciate any advice on this please.
    Thanks,
    Ashok

    902986 wrote:
    NOT EXISTS(SELECT 1 from target m WHERE f.col1 = m.col2 and f.col2 = m.col2);
    I don't know if it has any bearing on the case, but is that WHERE clause on purpose or a typo? Should it be:
            NOT EXISTS(SELECT 1 from target m WHERE f.col1 = m.COL1 and f.col2 = m.col2);
    Anyway - how much of your data already exists in target compared to source?
    Do you have 10 million in source and very few in target, so most of source will be inserted into target?
    Or do you have 9 million already in target, so most of source will be filtered away and only few records inserted?
    And what is the explain plan for your statement?
    INSERT /*+ APPEND NOLOGGING */ INTO target
    SELECT *
    FROM source f
    WHERE
            NOT EXISTS(SELECT 1 from target m WHERE f.col1 = m.col2 and f.col2 = m.col2);
    As your error has to do with TEMP, your statement might possibly try to do a lot of work in temp to materialize the resultset or parts of it, maybe to use in a hash join before inserting.
    So perhaps you can work towards an explain plan that allows the database to do the inserts "along the way" rather than calculate the whole thing in temp first.
    That probably will go much slower (for example using nested loops for each row to check the exists), but that's a tradeoff - if you can't have sufficient TEMP then you may have to optimize for less usage of that resource at the expense of another resource ;-)
    Alternatively ask your DBA to allocate more room in TEMP tablespace. Or have the DBA check if there are other sessions using a lot of TEMP in which case maybe you just have to make sure your session is the only one using lots of TEMP at the time you execute.
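    For what it's worth, a rough sketch of the batched approach the poster mentions (a hypothetical sketch only; note it uses the corrected m.col1 predicate, and each committed batch becomes visible to the NOT EXISTS of the next batch in the same session):
    DECLARE
      l_inserted PLS_INTEGER;
    BEGIN
      LOOP
        INSERT INTO target
          SELECT *
            FROM source f
           WHERE NOT EXISTS (SELECT 1 FROM target m
                              WHERE f.col1 = m.col1 AND f.col2 = m.col2)
             AND ROWNUM <= 1000000;
        l_inserted := SQL%ROWCOUNT;  -- capture before COMMIT resets it
        COMMIT;
        EXIT WHEN l_inserted = 0;
      END LOOP;
    END;
    /
    Committing inside a loop has its own costs and risks (ORA-01555, restartability), so this trades TEMP/undo pressure for a slower, repeated anti-join.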

  • Analyse a partitioned table with more than 50 million rows

    Hi,
    I have a partitioned table with more than 50 million rows. The last analyse was on 1/25/2007. Do I need to analyse it? (Queries run on this table are very slow.)
    If I need to analyse it, what is the best way? Use DBMS_STATS and schedule a job?
    Thanks

    A partitioned table has global statistics as well as partition (and subpartition if the table is subpartitioned) statistics. My guess is that you mean to say that the last time that global statistics were gathered was in 2007. Is that guess accurate? Are the partition-level statistics more recent?
    Do any of your queries actually use global statistics? Or would you expect that every query involving this table would specify one or more values for the partitioning key and thus force partition pruning to take place? If all your queries are doing partition pruning, global statistics are irrelevant, so it doesn't matter how old and out of date they are.
    Are you seeing any performance problems that are potentially attributable to stale statistics on this table? If you're not seeing any performance problems, leaving the statistics well enough alone may be the most prudent course of action. Gathering statistics would only have the potential to change query plans. And since the cost of a query plan regressing is orders of magnitude greater than the benefit of a different query performing faster (at least for most queries in most systems), the balance of risks would argue for leaving the stats alone if there is no problem you're trying to solve.
    If your system does actually use global statistics and there are performance problems that you believe are potentially attributable to stale global statistics and your partition level statistics are accurate, you can gather just global statistics on the table probably with a reasonably small sample size. Make sure, though, that you back up your existing statistics just in case a query plan goes south. Ideally, you'd also have a test environment with identical (or nearly identical) data volumes that you could use to verify that gathering statistics doesn't cause any problems.
    Justin
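    For illustration, a minimal sketch of that last suggestion (hypothetical owner/table names; back the stats up first, then gather only global statistics with a small sample):
    BEGIN
      DBMS_STATS.CREATE_STAT_TABLE(ownname => 'MYUSER', stattab => 'STATS_BACKUP');
      DBMS_STATS.EXPORT_TABLE_STATS(ownname => 'MYUSER', tabname => 'BIG_PART_TAB', stattab => 'STATS_BACKUP');
      DBMS_STATS.GATHER_TABLE_STATS(ownname          => 'MYUSER',
                                    tabname          => 'BIG_PART_TAB',
                                    granularity      => 'GLOBAL',
                                    estimate_percent => 1,
                                    cascade          => FALSE);
    END;
    /
    If a query plan regresses afterwards, DBMS_STATS.IMPORT_TABLE_STATS can restore the backed-up statistics.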

  • Update all rows in a table which has 8-10 million rows takes forever

    Hi All,
    Greetings!
    I have to update 8 million rows in a table. Basically I have to reset the batch_id with the current batch number. It contains 8-10 million rows, and I have tried a bulk update, but even then it takes a long time. Below is the table structure:
    sales_stg (it has composite key of product,upc and market)
    =======
    product_id
    upc
    market_id
    batch_id
    process_status
    I have to update batch_id and process_status: batch_id to the current batch id (a number) and process_status to zero. I have to update all the rows with these values where batch_id = 0.
    I tried a bulk update and it takes more than 2 hrs. (I limited each update to 1000 rows.)
    Any help in this regard.
    Naveen.

    The fastest way will probably be to not use a select loop but a direct update like in William's example. The main downside is that if you do too many rows you risk filling up your rollback/undo; to keep things as simple as possible I wouldn't do batching except for this. Also, we did some insert timings a few years ago on 9iR1 and found that the performance curve on frequent commits started to level off after 4K rows (fewer commits were still better), so you could see how performance improves by performing fewer commits if that's an issue.
    The other thing you could consider if you have the license is using the parallel query option.
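    For reference, a minimal sketch of the direct UPDATE being suggested (assuming :current_batch_id holds the new batch number):
    UPDATE sales_stg
       SET batch_id       = :current_batch_id,
           process_status = 0
     WHERE batch_id = 0;
    COMMIT;
    One statement and one commit; undo sizing permitting, this avoids the per-batch loop overhead entirely.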
