Managing a big table

Hi All,
I have a big table in my database. By big I mean both the volume of data stored in it (around 70 million records) and the number of columns (425).
I do not have any problems with it now, but I expect it to become a bottleneck, or at least very difficult to manage, as it grows.
The application uses a star schema, and this is its master table.
Apart from partitioning the table, is there any better way of handling such a table?
Regards

Hi,
Usually fact tables have few columns and many records, while dimension tables are the opposite: many columns, which is where the power of a dimension lies, and few records (although in some exceptional cases they reach millions of records). So the high number of columns makes me think that the fact table may be improperly designed; only may be, since I don't have enough information. If that is the case, you may want to revisit the design, and most likely you will find some 'facts' in your fact table that can become attributes of one of the dimension tables it links to.
Can you say why you are adding new columns to the fact table? A fact table is created for a specific business process, and if it is designed properly there shouldn't be any need to add new columns. A fact table usually carries only a limited number of metrics. Indeed, the opposite extreme is more common: a factless fact table.
In any case, as far as handling this large table with so many columns goes, I would say you have to focus on stopping the number of columns from growing. Nothing in the database itself, partitioning included, will do this for you. One option is to 'vertically partition' the table: decide which columns to separate out and split the table into at least two new tables. One set of columns would be those that are used most frequently or are most critical to you. You will then have to link the two tables to each other and to the rest of the dimensions. But again, if you keep adding new columns, it is just a matter of time before you run into the same situation.
I am sorry but cannot offer better advice than to revisit the design of your fact table. For doing that you may want to have a look at http://www.kimballgroup.com/html/designtips.html
LW
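
The 'vertical partition' idea in the reply above can be sketched roughly as follows. This is only an illustration, using Python's built-in sqlite3 module as a stand-in for the real database; the table and column names (fact_core, fact_ext, rarely_used_note and so on) are hypothetical and do not come from the original post.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
cur = conn.cursor()

# Hypothetical split of a wide fact table: frequently used columns stay in
# fact_core, rarely used ones move to fact_ext. Both share the same key.
cur.execute(
    "CREATE TABLE fact_core ("
    " fact_id INTEGER PRIMARY KEY, customer_key INTEGER, amount REAL)"
)
cur.execute(
    "CREATE TABLE fact_ext ("
    " fact_id INTEGER PRIMARY KEY REFERENCES fact_core(fact_id),"
    " rarely_used_note TEXT)"
)

cur.execute("INSERT INTO fact_core VALUES (1, 10, 99.5)")
cur.execute("INSERT INTO fact_core VALUES (2, 11, 12.0)")
cur.execute("INSERT INTO fact_ext VALUES (1, 'audit comment')")

# A view re-assembles the original wide row for queries that need everything.
cur.execute(
    "CREATE VIEW fact_wide AS"
    " SELECT c.fact_id, c.customer_key, c.amount, e.rarely_used_note"
    " FROM fact_core c LEFT JOIN fact_ext e ON e.fact_id = c.fact_id"
)

rows = cur.execute("SELECT * FROM fact_wide ORDER BY fact_id").fetchall()
print(rows)
```

Queries that only need the hot columns read fact_core alone; only queries that need the rarely used columns pay for the join.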

Similar Messages

Regarding the SAP big tables in ECC 6.0

Hi,
We are having SAP ECC 6.0 running on Oracle 10.2g database. Please can anyone of you give fine details on the big tables as below. What are they? Where are they being used? Do they need to be so big? Can we clean them up?
Table          Size
TST03          220 GB
COEP           125 GB
SOFFCONT1       92 GB
GLPCA           31 GB
EDI40           18 GB
Thanks,
Narendra

Hello Narendra,
TST03 merits special attention, certainly if it is the largest table in your database. TST03 contains the contents of spool requests and it often happens that at some time in the past there was a huge number of spool data in the system causing TST03 to inflate enormously. Even if this spool data was cleaned up later Oracle will not shrink the table automatically. It is perfectly possible that you have a 220 GB table containing virtually nothing.
There are a lot of fancy scripts and procedures around to find out how much data is actually in the table, but personally I often use a quick-and-dirty check based on the current statistics.
sqlplus /
select (num_rows * avg_row_len)/(1024*1024) "MB IN USE" from dba_tables where table_name = 'TST03';
This will produce a (rough) estimate of the amount of space actually taken up by rows in the table. If this is very far below 220 GB then the table is overinflated and you do best to reorganize it online with BRSPACE.
As to the other tables: there are procedures for prevention, archiving and/or deletion for all of them. The best advice was given in an earlier response to your post, namely to use the SAP Database Management Guide.
Regards,
Mark
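
The arithmetic behind the quick-and-dirty check above can be written out as a small sketch. The statistics values and the 50% threshold below are made-up assumptions for illustration only; real values would come from dba_tables, and what counts as "very far below" is a judgment call.

```python
def estimated_mb_in_use(num_rows: int, avg_row_len: int) -> float:
    """Same formula as the query above: rows times average row length, in MB."""
    return (num_rows * avg_row_len) / (1024 * 1024)


def looks_overinflated(num_rows: int, avg_row_len: int,
                       allocated_gb: float, threshold: float = 0.5) -> bool:
    """Flag the table when the row data fills less than `threshold` of the
    allocated space. The threshold is an assumption, not an SAP or Oracle rule."""
    allocated_mb = allocated_gb * 1024
    return estimated_mb_in_use(num_rows, avg_row_len) < threshold * allocated_mb


# Hypothetical statistics: 2 million rows averaging 1024 bytes each.
used_mb = estimated_mb_in_use(2_000_000, 1024)
print(used_mb)                                        # 1953.125
print(looks_overinflated(2_000_000, 1024, 220))       # True
```

The estimate ignores block overhead and free space inside blocks, so it is only useful for spotting gross differences.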

Big tables from MySql to MSSql error Connection timeout

Hi,
I have a big database, around 46 GB, in MySQL format, and I managed to convert all of it to MS SQL except for two tables, the biggest ones. When I try to migrate those two tables, one by one, after a while I get the error message "Connection timeout and was disabled".
I increased the timeout in the SSMA options from 15 to 1440 and decreased the batch size from 1000 to 500, with the same result. The tables have 52 million and 110 million rows, at 1.5 GB and 6.5 GB.
What can I do to migrate them?
Thank You

Hi,
According to your description, we first need to verify whether you have installed the latest version of the MySQL ODBC driver. If you have, then to make each transaction shorter and avoid the timeout, you can try reducing the batch size to a lower value such as 200 or 100 in the SSMA options. Also, as in Raju's post, you can try using incremental data migration in SSMA to migrate the larger tables, then check whether it succeeds.
In addition, you can use other methods to migrate big tables from MySQL to SQL Server. For example, you can copy the data directly from SQL Server using OPENQUERY, and you could include a WHERE clause to limit the rows. For more details, please review this blog: Migrate MySQL to Microsoft SQL Server. Or you can write queries for MySQL to export your data as CSV, and then use the BULK INSERT feature of SQL Server to import the CSV data.
Thanks
Lydia Zhang
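
The idea of keeping each transaction short by limiting the rows per step can be sketched as a key-ordered batch copy. This is not how SSMA works internally; it is a generic illustration using two in-memory sqlite3 databases as stand-ins for the MySQL source and the SQL Server target, with a hypothetical table named big that has a positive integer key.

```python
import sqlite3


def copy_in_batches(src, dst, batch_size=100):
    """Copy rows from src.big to dst.big in short, key-ordered batches."""
    last_id = 0          # assumes ids are positive integers
    copied = 0
    while True:
        rows = src.execute(
            "SELECT id, payload FROM big WHERE id > ? ORDER BY id LIMIT ?",
            (last_id, batch_size),
        ).fetchall()
        if not rows:
            break
        with dst:        # one short transaction per batch
            dst.executemany(
                "INSERT INTO big (id, payload) VALUES (?, ?)", rows
            )
        last_id = rows[-1][0]   # resume after the highest key copied so far
        copied += len(rows)
    return copied


src = sqlite3.connect(":memory:")
dst = sqlite3.connect(":memory:")
for db in (src, dst):
    db.execute("CREATE TABLE big (id INTEGER PRIMARY KEY, payload TEXT)")
with src:
    src.executemany(
        "INSERT INTO big VALUES (?, ?)",
        [(i, f"row {i}") for i in range(1, 251)],
    )

total = copy_in_batches(src, dst, batch_size=100)
print(total)
```

Because each batch resumes from the last key copied, a failed run can be restarted from the highest id already present in the target instead of starting over.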

How to UPDATE a big table in Oracle via Bulk Load

Hi all,
In a datastore target on Oracle 11g, I have a big table with 300 million records; the structure is one integer key + 10 attribute columns.
In the IQ source I have the same table with the same size; the structure is one integer key + 1 attribute column.
What I need to do is UPDATE that single field in Oracle from the values stored in IQ.
Any idea how to organize the dataflow and the target writing mode efficiently? Bulk load? API?
thank you
Maurizio

Hi,
You cannot use bulk load when you need to UPDATE a field, because all a bulk load does is add records to your table.
Since you have to UPDATE a field, I would suggest going for SCD with
source > TC > MO > KG >target
Arun
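
Since a bulk load only adds records, one commonly used pattern (a different technique from the SCD dataflow suggested above, and not something the reply proposes) is to load the new values into an empty staging table and then run a single set-based UPDATE from it. A minimal sketch, with sqlite3 standing in for Oracle and hypothetical table names target and staging:

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.execute(
    "CREATE TABLE target (id INTEGER PRIMARY KEY, attr TEXT, other TEXT)"
)
conn.execute("CREATE TABLE staging (id INTEGER PRIMARY KEY, attr TEXT)")

with conn:
    conn.executemany(
        "INSERT INTO target VALUES (?, ?, ?)",
        [(1, "old", "x"), (2, "old", "y"), (3, "old", "z")],
    )
    # Insert-only step: this is the part a bulk loader could do.
    conn.executemany(
        "INSERT INTO staging VALUES (?, ?)",
        [(1, "new-1"), (3, "new-3")],
    )

with conn:
    # One set-based UPDATE driven by the staging table; rows without a
    # staging match are left untouched.
    conn.execute(
        "UPDATE target"
        " SET attr = (SELECT s.attr FROM staging s WHERE s.id = target.id)"
        " WHERE EXISTS (SELECT 1 FROM staging s WHERE s.id = target.id)"
    )

result = conn.execute("SELECT id, attr FROM target ORDER BY id").fetchall()
print(result)
```

Whether this beats a row-by-row dataflow on 300 million rows depends on the environment and would need to be measured.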

SELECT query performance : One big table Vs many small tables

Hello,
We are using BDB 11g with SQLite support. I have a question about SELECT query performance when we have one huge table vs. multiple small tables.
Basically, our application needs to run a SELECT query many times, and today we have one huge table. Do you think breaking it into multiple small tables will help?
For test purposes we tried creating multiple tables, but the performance of the SELECT query was more or less the same. Would that be because all tables map to a single database in the backend, stored as key/value pairs, so that running a lookup (SELECT query) on a small table or a big table makes no difference?
Thanks.

Hello,
There is some information on this topic in the FAQ at:
http://www.oracle.com/technology/products/berkeley-db/faq/db_faq.html#9-63
If this does not address your question, please just let me know.
Thanks,
Sandra

HS ODBC GONE AWAY ON BIG TABLE QRY

Hello,
I have an HS ODBC connection set up pointing to a MySQL 5.0 database on Windows using MySQL ODBC 3.51.12. Oracle XE is on the same box, and tnsnames.ora, sqlnet.ora, and HS are all set up.
The problem is that I have a huge table in MySQL, 100 million rows, and when I run a query in Oracle SQL Developer it runs for about two minutes and then I get ORA-00942 errors, lost connection, or gone away.
I can run a query against a smaller table in the schema and it returns rows quickly, so I know the HS ODBC connection is working.
I noticed that on the big table query the HS service running on Windows starts up, uses 1.5 GB of memory, and CPU time maxes out at 95%; then the connection drops.
Any advice on what to do here? There don't seem to be any config settings for the HS service to limit or increase the rows, or to increase the cache.
MySQL does have some advanced ODBC driver options that I will try.
Does anyone have any suggestions on how to handle this overloading problem?
Thanks for the help,

FYI, HS is Oracle Heterogeneous Services, used to connect to non-Oracle databases.
I actually found a workaround. The table is so large that the query crashes, so I split the table up with 5 MySQL views, and I am now able to query the views from an Oracle stored procedure that does an INSERT ... SELECT into an Oracle table.
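
The workaround above (several views over one big table, copied one at a time) might look roughly like this. The original post does not say how the five views were defined; splitting by contiguous id ranges is an assumption, as are the names big_src, big_copy and big_src_v1 to big_src_v5. sqlite3 stands in for both databases.

```python
import sqlite3


def id_ranges(min_id, max_id, parts):
    """Split [min_id, max_id] into contiguous, non-overlapping ranges."""
    size = -(-(max_id - min_id + 1) // parts)   # ceiling division
    ranges = []
    lo = min_id
    while lo <= max_id:
        hi = min(lo + size - 1, max_id)
        ranges.append((lo, hi))
        lo = hi + 1
    return ranges


conn = sqlite3.connect(":memory:")
conn.execute("CREATE TABLE big_src (id INTEGER PRIMARY KEY, val TEXT)")
conn.execute("CREATE TABLE big_copy (id INTEGER PRIMARY KEY, val TEXT)")
with conn:
    conn.executemany(
        "INSERT INTO big_src VALUES (?, ?)",
        [(i, f"v{i}") for i in range(1, 101)],
    )

ranges = id_ranges(1, 100, 5)
for n, (lo, hi) in enumerate(ranges, start=1):
    # Views cannot take bound parameters, so the integer bounds are inlined.
    conn.execute(
        f"CREATE VIEW big_src_v{n} AS"
        f" SELECT id, val FROM big_src WHERE id BETWEEN {lo} AND {hi}"
    )
    with conn:   # copy one slice per transaction
        conn.execute(f"INSERT INTO big_copy SELECT id, val FROM big_src_v{n}")

copied = conn.execute("SELECT COUNT(*) FROM big_copy").fetchone()[0]
print(ranges, copied)
```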

Table.Join/Merge in Power Query takes an extremely long time to process for big tables

Hi,
I tried to simply merge (inner join) two big tables in PQ; one has 300,000+ rows after filtering and the other has 30,000+ rows after filtering. However, for this simple join operation, PQ took at least 10 minutes to load the preview (I killed the Query Editor after 10 minutes of processing).
Here's how I did the join: I first loaded the tables into the workbook, then filtered each table, and finally used the merge function to do the join on a common field.
Did I do anything wrong here? Or is there any way to improve the load efficiency?
P.S. No custom SQL was used during the process. I was hoping the so-called "Query Folding" would help speed up the process, but it doesn't seem to have worked here.
Thanks.
Regards,
Qilong

Hi!
You should import the source tables into Access. This will speed up PQ's work several times over.

Execution of a PL/SQL procedure with CURSOR for big tables

I have prepared a procedure that uses a CURSOR to run a complex query against tables with a large number of records, something like 900,000. The execution failed with ORA-01652: unable to extend temp segment by 64 in tablespace TEMP.
Any suggestions?

This brings us to the following question: how could I calculate the bytes required by a cursor? It is a selection of certain fields from very big tables. Let's say that the fields are NUMBER(4), NUMBER(8) and CHAR(2). The fields are in 2 related tables of 900,000 rows each. What size is required for a procedure like this?
Your help is really appreciated.

How to use partitioning for a big table

Hi,
Oracle 10gR2/Redhat4
RAC database
ASM
I have a big table, TRACES, that will also grow very fast; currently I have 15,000,000 rows.
TRACES (ID NUMBER,
COUNTRY_NUM NUMBER,
Timestampe NUMBER,
MESSAGE VARCHAR2(300),
type_of_action VARCHAR(20),
CREATED_TIME DATE,
UPDATE_DATE DATE)
The queries against this table are the following, and they generate a lot of disk I/O:
select count(*) as y0_
from TRACES this_
where this_.COUNTRY_NUM = :1
and this_.TIMESTAMP between :2 and :3
and lower(this_.MESSAGE) like :4;
SELECT *
FROM (SELECT this_.id ,
this_.TIMESTAMP
FROM traces this_
WHERE this_.COUNTRY_NUM = :1
AND this_.TIMESTAMP BETWEEN :2 AND :3
AND this_.type_of_action = :4
AND LOWER (this_.MESSAGE) LIKE :5
ORDER BY this_.TIMESTAMP DESC)
WHERE ROWNUM <= :6;
I have 16 distinct COUNTRY_NUM values in the table, and TIMESTAMPE is a number that the application inserts into the table.
My question: is partitioning the table into small parts the best way to tune it?
I am thinking of partitioning by list on COUNTRY_NUM and by date (year/month); is that the best way to do it?
NB: an example of the TRACES data in my test database:
select COUNTR_NUM, count(*) from traces
group by COUNTR_NUM
order by COUNTR_NUM;
COUNTR_NUM COUNT(*)
-1 194716
3 1796581
4 1429393
5 1536092
6 151820
7 148431
8 76452
9 91456
10 91044
11 186370
13 76
15 29317
16 33470

Hello,
You can automate adding the monthly partitions with dbms_scheduler. Here is an example of your table partitioned by month:
CREATE TABLE traces (
   id NUMBER,
   country_num NUMBER,
   timestampe NUMBER,
   message VARCHAR2 (300),
   type_of_action VARCHAR (20),
   created_time DATE,
   update_date DATE
)
TABLESPACE test_data -- your tablespace name
PARTITION BY RANGE (created_time)
   (PARTITION traces_200901
       VALUES LESS THAN
          (TO_DATE (' 2009-02-01 00:00:00',
                    'SYYYY-MM-DD HH24:MI:SS',
                    'NLS_CALENDAR=GREGORIAN'))
       TABLESPACE test_data, -- partitions can be placed in different tablespaces, i.e. different data files on different disks, to reduce I/O contention
    PARTITION traces_200902
       VALUES LESS THAN
          (TO_DATE (' 2009-03-01 00:00:00',
                    'SYYYY-MM-DD HH24:MI:SS',
                    'NLS_CALENDAR=GREGORIAN'))
       TABLESPACE test_data);
Regards
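
A scheduled job that adds monthly partitions needs to work out each partition's name and upper bound. The sketch below shows only that date arithmetic, following the traces_YYYYMM naming used in the example above; the helper names are hypothetical, and it prints clause text rather than talking to a database.

```python
from datetime import date


def next_month(d: date) -> date:
    """First day of the month after d."""
    if d.month == 12:
        return date(d.year + 1, 1, 1)
    return date(d.year, d.month + 1, 1)


def monthly_partitions(first_month: date, count: int, prefix: str = "traces_"):
    """Return (partition_name, exclusive_upper_bound) pairs, one per month."""
    parts = []
    current = date(first_month.year, first_month.month, 1)
    for _ in range(count):
        upper = next_month(current)
        parts.append((f"{prefix}{current:%Y%m}", upper.isoformat()))
        current = upper
    return parts


parts = monthly_partitions(date(2009, 1, 1), 3)
for name, upper in parts:
    # Text of a range-partition clause; the bound is exclusive (LESS THAN).
    print(f"PARTITION {name} VALUES LESS THAN (TO_DATE('{upper}', 'YYYY-MM-DD'))")
```

Each partition holds one calendar month because its upper bound is the first day of the following month.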

Performance question - Caching data of a big table

Hi All,
I have a general question about caching, I am using an Oracle 11g R2 database.
I have a big table of about 50 million rows that is accessed very often by my application. Some queries run slowly and some are OK. But (obviously) when the data from this table is already in the cache (basically when a user requests the same thing twice or more), it runs very quickly.
Does anybody have any recommendations about caching the data of a table this size?
Many thanks.

Chiwatel wrote:
With better formatting (I hope), sorry I am not used to the new forum !
Plan hash value: 2501344126
| Id | Operation                            | Name          | Starts | E-Rows |E-Bytes| Cost (%CPU)| Pstart| Pstop | A-Rows | A-Time | Buffers | Reads | OMem | 1Mem | Used-Mem |
| 0 | SELECT STATEMENT        |                    |      1 |        |      | 7232 (100)|      |      | 68539 |00:14:20.06 |    212K| 87545 |      |      |          |
| 1 | SORT ORDER BY                      |                |      1 | 7107 | 624K| 7232 (1)|      |      | 68539 |00:14:20.06 |    212K| 87545 | 3242K| 792K| 2881K (0)|
| 2 | NESTED LOOPS                      |                |      1 |        |      |            |      |      | 68539 |00:14:19.26 |    212K| 87545 |      |      |          |
| 3 |    NESTED LOOPS                      |                |      1 | 7107 | 624K| 7230 (1)|      |      | 70492 |00:07:09.08 |    141K| 43779 |      |      |          |
|* 4 |    INDEX RANGE SCAN                | CM_MAINT_PK_ID |      1 | 7107 | 284K|    59 (0)|      |      | 70492 |00:00:04.90 |    496 |    453 |      |      |          |
| 5 |    PARTITION RANGE ITERATOR        |                | 70492 |      1 |      |    1 (0)| KEY | KEY | 70492 |00:07:03.32 |    141K| 43326 |      |      |          |
|* 6 |      INDEX UNIQUE SCAN              | D1T400P0      | 70492 |      1 |      |    1 (0)| KEY | KEY | 70492 |00:07:01.71 |    141K| 43326 |      |      |          |
|* 7 |    TABLE ACCESS BY GLOBAL INDEX ROWID| D1_DVC_EVT    | 70492 |      1 |    49 |    2 (0)| ROWID | ROWID | 68539 |00:07:09.17 | 70656 | 43766 |      |      |          |
Predicate Information (identified by operation id):
4 - access("ERO"."MAINT_OBJ_CD"='D1-DEVICE' AND "ERO"."PK_VALUE1"='461089508922')
6 - access("ERO"."DVC_EVT_ID"="E"."DVC_EVT_ID")
7 - filter(("E"."DVC_EVT_TYPE_CD"='END-GSMLOWLEVEL-EXCP-SEV-1' OR "E"."DVC_EVT_TYPE_CD"='STR-GSMLOWLEVEL-EXCP-SEV-1'))
Your user has executed a query to return 68,000 rows. What type of user is it? A human being cannot possibly cope with that much data, and it's not entirely surprising that it might take quite some time to return it.
One thing I'd check is whether you're always getting the same execution plan. Oracle's estimates here are out by a factor of about 9.5 (7,100 rows predicted vs. 68,500 returned); perhaps some of your variation in timing relates to plan changes.
If you check the figures you'll see about half your time came from probing the unique index, and half came from visiting the table. In general it's hard to beat Oracle's caching algorithms, but indexes are often much smaller than the tables they cover, so it's possible that your best strategy is to protect this index at the cost of the table. Rather than trying to create a KEEP cache for the index, though, you MIGHT find that you get some benefit from creating a RECYCLE cache for the table, using a small percentage of the available memory. The target is to fix things so that table blocks you won't revisit don't push index blocks you will revisit from memory.
Another detail to consider is that if you are visiting the index and table completely randomly (for 68,500 locations) it's possible that you end up re-reading blocks several times in the course of the visit. If you order the intermediate result set from the driving table first you may find that you're walking the index and table in order and don't have to re-read any blocks. This is something only you can know, though. The code would have to change to include an inline view with a no_merge and no_eliminate_oby hint.
Regards
Jonathan Lewis
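
The point about re-reading blocks when visiting them in random order can be illustrated with a toy simulation. This is not a model of Oracle's buffer cache (which is not a plain LRU); it is a simple LRU cache with made-up sizes, showing only why an ordered visit touches each block once.

```python
import random
from collections import OrderedDict


def physical_reads(block_visits, cache_blocks):
    """Count cache misses for a sequence of block visits under a simple LRU cache."""
    cache = OrderedDict()
    misses = 0
    for block in block_visits:
        if block in cache:
            cache.move_to_end(block)        # mark as most recently used
        else:
            misses += 1                     # would be a physical read
            cache[block] = True
            if len(cache) > cache_blocks:
                cache.popitem(last=False)   # evict the least recently used
    return misses


rng = random.Random(42)
# Hypothetical workload: 10,000 row lookups scattered over 1,000 blocks,
# with room for only 50 blocks in the cache.
visits = [rng.randrange(1000) for _ in range(10_000)]

random_order = physical_reads(visits, cache_blocks=50)
sorted_order = physical_reads(sorted(visits), cache_blocks=50)
print(random_order, sorted_order)
```

In sorted order all visits to a block are adjacent, so the number of reads equals the number of distinct blocks, however small the cache is.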

Max(serial_no) on a big table

Hello Gurus,
I am using max(serial_no) on a big table.
How can I improve the performance of this query?
SELECT MAX(SERIAL_NO) FROM DOCUMENTS agr WHERE STATUS NOT IN ('A', 'NA');
| Id | Operation | Name | Rows | Bytes | Cost |
| 0 | SELECT STATEMENT | | 1 | 8 | 5250 |
| 1 | SORT AGGREGATE | | 1 | 8 | |
| 2 | TABLE ACCESS FULL | DOCUMENTS | 846K| 6613K| 5250 |
There is no index on the STATUS column.
thanks in advance

"NOT IN does not consider index."
I wouldn't say so in general:
SQL> explain plan
   for
      select max (empno)
        from emp
       where empno not in
                (7369,
                 7499,
                 7521,
                 7566,
                 7654,
                 7698,
                 7782,
                 7788,
                 7839,
                 7844,
                 7876,
                 7900)
Explain complete.
SQL> select * from table (dbms_xplan.display ())
PLAN_TABLE_OUTPUT
Plan hash value: 1767367665
| Id | Operation                   | Name   | Rows | Bytes | Cost (%CPU)| Time     |
|   0 | SELECT STATEMENT            |        |     1 |     4 |     2   (0)| 00:00:01 |
|   1 | SORT AGGREGATE             |        |     1 |     4 |            |          |
|   2 |   FIRST ROW                 |        |     1 |     4 |     2   (0)| 00:00:01 |
|* 3 |    INDEX FULL SCAN (MIN/MAX)| PK_EMP |     1 |     4 |     2   (0)| 00:00:01 |
Predicate Information (identified by operation id):
   3 - filter("EMPNO"<>7369 AND "EMPNO"<>7499 AND "EMPNO"<>7521 AND
              "EMPNO"<>7566 AND "EMPNO"<>7654 AND "EMPNO"<>7698 AND "EMPNO"<>7782 AND
              "EMPNO"<>7788 AND "EMPNO"<>7839 AND "EMPNO"<>7844 AND "EMPNO"<>7876 AND
              "EMPNO"<>7900)
18 rows selected.
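
The plan above finds the maximum by reading the index from its high end and applying the NOT IN filter to each entry until one passes. That logic can be sketched with a sorted list standing in for the index. The two employee numbers above 7900 are taken from the usual EMP demo data and are an assumption here, since the post does not list the table contents.

```python
def max_excluding(sorted_keys, excluded):
    """Walk an ascending-sorted key list from the high end and return the
    first key that is not excluded, plus the number of entries examined."""
    steps = 0
    for key in reversed(sorted_keys):
        steps += 1
        if key not in excluded:
            return key, steps
    return None, steps


# The twelve values from the NOT IN list above.
excluded = {7369, 7499, 7521, 7566, 7654, 7698,
            7782, 7788, 7839, 7844, 7876, 7900}
# Assumed index contents: the excluded values plus two higher employee numbers.
empnos = sorted(excluded | {7902, 7934})

print(max_excluding(empnos, excluded))        # (7934, 1)
print(max_excluding(empnos, {7934, 7902}))    # (7900, 3)
```

The scan is cheap when few of the highest keys are filtered out, and gets more expensive the more entries it has to skip before finding a match.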

Very Big Table (36 Indexes, 1000000 Records)

Hi
I have a very big table (76 columns, 1000000 records). These 76 columns include 36 foreign key columns, each FK has an index on the table, and only one of these FK columns has a value at any one time, while all the other FKs are NULL. All these FK columns are of type NUMBER(20,0).
I am facing a performance problem that I want to resolve, taking into consideration that this table is used for DML (INSERT, UPDATE, DELETE) as well as queries (SELECT); all of these run daily. I want to improve this table's performance, and I am considering these scenarios:
1- Replace all 36 FK columns with 2 columns (ID, TABLE_NAME) (ID for the master table ID value, and TABLE_NAME for the master table name) and create only one index on these 2 columns.
2- Partition the table using its YEAR column, keep all FK columns but drop all indexes on these columns.
3- Partition the table using its YEAR column, drop all FK columns, create (ID, TABLE_NAME) columns, and create an index on the (TABLE_NAME, YEAR) columns.
Which approach is the most efficient?
Do I have to take "master-detail" relations in mind when building Forms on this table?
Are there any other suggestions?
I am using Oracle 8.1.7 database.
Please Help.

Hi everybody
I would like to thank you for your cooperation and I will try to answer your questions, but please note that I am a developer in the first place and I am new to Oracle database administration, so please forgive me if I make any mistakes.
Q: Have you gathered statistics on the tables in your database?
A: No I did not. And if I must do it, must I do it for all database tables or only for this big table?
Q:Actually tracing the session with 10046 level 8 will give some clear idea on where your query is waiting.
A: Actually I do not know what you mean by "10046 level 8".
Q: what OS and what kind of server (hardware) are you using
A: I am using Windows2000 Server operating system, my server has 2 Intel XEON 500MHz + 2.5GB RAM + 4 * 36GB Hard Disks(on RAID 5 controller).
Q: How many concurrent users do you have and how many transactions per hour?
A: I have 40 concurrent users and an average of 100 transactions per hour, but the peak can go up to 1000 transactions per hour.
Q: How fast should your queries be executed?
A: I want the queries to execute in about 10 to 15 seconds, or else everybody here will complain. Please note that because this table is heavily used, there is a very good chance of 2 or more transactions running at the same time, one performing a query and the other performing DML. Some of these queries are used in reports and can be long (e.g. retrieving a summary of 50,000 records).
Q: Please show us the explain plan of these queries.
A: If I understand your question, you are asking me to show you the explain plan of those queries. Well, first, I do not know how, and second, I think it is a big task because I cannot collect every kind of query that has been written against this table (some of them live in server packages, and the others are performed by Forms or Reports).
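
Scenario 1 from the question (replacing the 36 FK columns with an ID plus TABLE_NAME pair and a single index) could be sketched as below. sqlite3 stands in for Oracle, and the table, column and master table names are hypothetical.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.execute(
    "CREATE TABLE detail ("
    " detail_id INTEGER PRIMARY KEY,"
    " master_table TEXT NOT NULL,"     # which master table the row points to
    " master_id INTEGER NOT NULL,"     # the key value in that master table
    " year INTEGER NOT NULL)"
)
# One composite index replaces the 36 single-column FK indexes.
conn.execute(
    "CREATE INDEX detail_master_idx ON detail (master_table, master_id)"
)

with conn:
    conn.executemany(
        "INSERT INTO detail VALUES (?, ?, ?, ?)",
        [
            (1, "CUSTOMERS", 100, 2004),
            (2, "SUPPLIERS", 100, 2004),
            (3, "CUSTOMERS", 101, 2005),
        ],
    )

rows = conn.execute(
    "SELECT detail_id FROM detail"
    " WHERE master_table = ? AND master_id = ? ORDER BY detail_id",
    ("CUSTOMERS", 100),
).fetchall()
print(rows)
```

The trade-off is that the database can no longer enforce these references with declarative foreign key constraints, because one column pair points at many different master tables; integrity would have to be checked by the application or by triggers.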

Getting an error when inserting a big table inside a ""-

Hi, I am getting the error "String not properly closed" when I put a big table inside a "". Please give me an idea of how to insert a table inside a String. I am using JSP.
thanks in advance

1) Please use proper English grammar, spelling, punctuation, and capitalization.
2) I don't know what you mean "convert this forum page to a Word doc." Are you talking about taking HTML that's served up from some arbitrary URL and saving it in a Word doc? If so, you would not have the page's content hardcoded in your JSP. If you need to hardcode the page's content in your JSP, then I'm still not understanding what you're doing.
3) If you want to write it out in .doc format, then you need to use a library that can write that format, such as POI from jakarta: http://jakarta.apache.org/poi/
4) Don't tell me not to say I don't understand. If I don't understand what you're saying, I'm going to say so. It would be stupid for me to pretend I understand when I don't.
I don't think I can help you. Good luck.

Can't open an IDML file created in CC (10) with CS6

Hello! Can't open an IDML file. ID file was created in CC (10). It is a 100 page (50 spreads) doc that is one big table. It was created in CC (10) and saved as an IDML file. I have CS6 and when I try to open it, it shuts down ID almost instantly. The file was created on a MAC and I am trying to open it on a MAC. Any/all advice is greatly appreciated as I am up against a deadline with this unopened file! Many thanks in advance, Diane

There's a good chance the file is corrupt. Ask whoever sent it to you to verify that it opens on their end.

Gather table stats takes time for big table

The table has millions of records and it is partitioned. When I analyze the table using the following syntax it takes more than 5 hours, and the table has one index.
I tried auto sample size and also changed the estimate percentage to values like 20, 50, 70, etc., but the time is almost the same.
exec dbms_stats.gather_table_stats(ownname=>'SYSADM',tabname=>'TEST',granularity =>'ALL',ESTIMATE_PERCENT=>100,cascade=>TRUE);
What should I do to reduce the analyze time for big tables? Can anyone help me?

Hello,
The behaviour of ESTIMATE_PERCENT may change from one release to another.
In some releases, when you specify a "too high" ESTIMATE_PERCENT (>25%, ...), you in fact collect the statistics over 100% of the rows, as in COMPUTE mode:
Using DBMS_STATS.GATHER_TABLE_STATS With ESTIMATE_PERCENT Parameter Samples All Rows [ID 223065.1]
For later releases, 10g or 11g, you can use the following value:
estimate_percent => DBMS_STATS.AUTO_SAMPLE_SIZE
In fact, you may use it even in 9.2, but in that release it is recommended to use a specific estimate value.
Moreover, starting with 10.1 it's possible to schedule statistics collection with DBMS_SCHEDULER and specify a window so that the job doesn't run during production hours.
So the answer may depend on the Oracle release and also on the application (SAP, PeopleSoft, ...).
Best regards,
Jean-Valentin
