Partition Pruning vs Partition-Wise Join
Hi,
I am not sure if this is the right place for this question, but here it goes.
I am in a situation where, at the beginning, I have to join two big tables without any WHERE clauses. It is pretty much a Cartesian product wherever one criterion (Internal Code) matches, but I have to join all rows. (Seems like a good place for a partition-wise join.)
Later I only need to update certain rows based on a key value (Customer ID, Region ID). (A good candidate for partition pruning.)
What would be the best option? Is there a way to use both?
Assume the following:
Table 1 has the structure of
Customer ID
Internal Code
Other Data
There are about 1000 Customer IDs. Each Customer ID has 1000 Internal Codes.
Table 2 has the structure of
Region ID
Internal Code
Other Data
There are about 5000 Region IDs. Each Region ID has 1000 Internal Codes (the same codes as Table 1).
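A back-of-the-envelope cardinality check, using only the counts stated above, shows why this join behaves almost like a Cartesian product:

```sql
-- Counts from the post: 1000 customers x 1000 codes, 5000 regions x 1000 codes.
SELECT 1000 * 1000        AS t1_rows,   -- 1,000,000 rows in Table 1
       5000 * 1000        AS t2_rows,   -- 5,000,000 rows in Table 2
       1000 * 1000 * 5000 AS join_rows  -- 5,000,000,000 if joined on Internal Code alone
FROM dual;
```

Each Internal Code matches 1000 rows in Table 1 and 5000 rows in Table 2, so the unfiltered join is on the order of 5 billion rows.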
I am currently thinking of doing a HASH PARTITION (8 partitions) on Customer ID for Table 1 and a HASH PARTITION (8 partitions) on Region ID for Table 2.
The initial insert will take a long time, but when I go to update the joined data based on a specific Customer ID or Region ID, at least for one of the tables only one partition will be used.
I would sincerely appreciate some advice from the gurus.
Thanks...
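For reference, the scheme described above might look like the following sketch (table and column names are assumptions). Note one trade-off it exposes: hash partitioning T1 on Customer ID and T2 on Region ID gives pruning for the updates, but a full partition-wise join would require both tables to be equipartitioned on the join key, Internal Code:

```sql
-- Sketch only: the pruning-friendly layout described in the post above.
CREATE TABLE t1 (
  customer_id   NUMBER       NOT NULL,
  internal_code VARCHAR2(20) NOT NULL
)
PARTITION BY HASH (customer_id) PARTITIONS 8;

CREATE TABLE t2 (
  region_id     NUMBER       NOT NULL,
  internal_code VARCHAR2(20) NOT NULL
)
PARTITION BY HASH (region_id) PARTITIONS 8;
```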
Hi,
I still don't understand what it is that you are trying to do.
Would it be possible for you to create a silly example with just a few rows
to show us what it is that you are trying to accomplish?
Then we can help you solve whatever problem it is that you are having.
create table t1(
customer_id number not null
,internal_code varchar2(20) not null
,<other_columns>
,constraint t1_pk primary key(customer_id, internal_code)
);
create table t2(
region_id number not null
,internal_code varchar2(20) not null
,<other_columns>
,constraint t2_pk primary key(region_id, internal_code)
);
insert into t1(customer_id, internal_code, ...) values(...);
insert into t1(customer_id, internal_code, ...) values(...);
insert into t2(region_id, internal_code, ...) values(...);
insert into t2(region_id, internal_code, ...) values(...);
select <the rating calculation>
from t1 join t2 using(internal_code);
Similar Messages
-
Hello,
I'm playing with partitioning. I have read about (full) partition-wise joins on hash-hash partitioned tables. The description and the example are in the Oracle Database Data Warehousing Guide (b14223.pdf), chapter 5.
<cite>
A full partition-wise join divides a large join into smaller joins between a pair of
partitions from the two joined tables. To use this feature, you must equipartition both
tables on their join keys. For example, consider a large join between a sales table and a
customer table on the column customerid. The query "find the records of all
customers who bought more than 100 articles in Quarter 3 of 1999" is a typical example
of a SQL statement performing such a join. The following is an example of this:
SELECT c.cust_last_name, COUNT(*)
FROM sales s, customers c
WHERE s.cust_id = c.cust_id AND
s.time_id BETWEEN TO_DATE('01-JUL-1999', 'DD-MON-YYYY') AND
(TO_DATE('01-OCT-1999', 'DD-MON-YYYY'))
GROUP BY c.cust_last_name HAVING COUNT(*) > 100;
</cite>
I have created a sales table with 1M rows and a customers table with 1k rows (meaning each customer has roughly one thousand invoices). Both tables are hash partitioned with 64 partitions. They are analyzed. But there is no improvement compared to the same tables without partitioning and with an index on the customerid column.
I would just like to see that partition-wise joins are working, i.e. that they are either faster or consume fewer resources.
Thanks
sasa
I don't think that 64 partitions is too many ... the problem is that there isn't any gain from partitioning.
Let's see the explain plan:
| Id | Operation | Name | Rows | Bytes | Cost (%CPU)| Time | Pstart| Pstop |
| 0 | SELECT STATEMENT | | 50 | 2500 | 11436 (5)| 00:02:18 | | |
|* 1 | FILTER | | | | | | | |
| 2 | HASH GROUP BY | | 50 | 2500 | 11436 (5)| 00:02:18 | | |
|* 3 | HASH JOIN | | 903K| 43M| 11320 (4)| 00:02:16 | | |
| 4 | PARTITION HASH ALL| | 1000 | 34000 | 88 (0)| 00:00:02 | 1 | 64 |
| 5 | TABLE ACCESS FULL| PART_CUSTOMERS | 1000 | 34000 | 88 (0)| 00:00:02 | 1 | 64 |
| 6 | PARTITION HASH ALL| | 903K| 13M| 11219 (4)| 00:02:15 | 1 | 64 |
|* 7 | TABLE ACCESS FULL| PART_INVOICES | 903K| 13M| 11219 (4)| 00:02:15 | 1 | 64 |
And compare this explain plan with the following, which was created on the non-partitioned tables (the cost of the explain plan for the non-partitioned tables is 180, compared to a cost of 11436 for the partitioned tables):
| Id | Operation | Name | Rows | Bytes | Cost (%CPU)| Time |
| 0 | SELECT STATEMENT | | 1 | 46 | 180 (2)| 00:00:03 |
|* 1 | FILTER | | | | | |
| 2 | HASH GROUP BY | | 1 | 46 | 180 (2)| 00:00:03 |
|* 3 | TABLE ACCESS BY INDEX ROWID | PART_INVOICES2 | 903 | 10836 | 177 (1)| 00:00:03 |
| 4 | NESTED LOOPS | | 903 | 41538 | 179 (1)| 00:00:03 |
| 5 | TABLE ACCESS BY INDEX ROWID| PART_CUSTOMERS2 | 1 | 34 | 2 (0)| 00:00:01 |
|* 6 | INDEX RANGE SCAN | PART_CUSTOMERS2_IDX01 | 1 | | 1 (0)| 00:00:01 |
|* 7 | INDEX RANGE SCAN | PART_INVOCIES2_IDX02 | 9991 | | 22 (0)| 00:00:01 |
-
Make parallel query (e.g. partition-wise join) evenly distributed cross RAC
How To Make Parallel Query's Slaves (e.g. full partition-wise join) Evenly Distributed Across RAC Nodes?
Environment
* 4-node Oracle 10gR2 (10.2.0.3)
* all instances are included in the same distribution group
* tables are hash-partitioned by the same join key
* 8-CPU per node, 48GB RAM per node
Query
Join 3 big tables (each has DOP=4) based on the hash partition key column.
Problem
The QC is always on one node, and all the slaves are on another node. The slave processes are supposed to be distributed across multiple nodes/instances, but even when the query spawns 16 or more slaves, they all run on only one node, and the QC process is never running on the same node.
The other 2 nodes are not busy during this time. Is there any configuration wrong or missing here? Why can't the RAC distribute the slaves better, or at least run some slaves together with the QC?
Please advise.
Thank you very much!
Eric
Hi,
If your PARALLEL_INSTANCE_GROUP and load balancing are set properly, it means Oracle is assuming that intra-node parallelism is more beneficial than inter-node parallelism, i.e. parallelism across multiple nodes. This is very true in scenarios where partition-wise joins are involved: intra-node parallelism avoids unnecessary interconnect traffic. -
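As a sketch of what that looks like in practice (PARALLEL_INSTANCE_GROUP and INSTANCE_GROUPS are real 10g parameters; the group and table names here are assumptions for illustration):

```sql
-- Run this query's PX slaves only on instances whose instance_groups
-- parameter includes 'grp12' (assumed group name).
ALTER SESSION SET parallel_instance_group = 'grp12';

SELECT /*+ parallel(t 16) */ COUNT(*) FROM big_tab t;
```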
Hash Partitioning and Partition-Wise Joins
Hi,
For the ETL process of a Data Warehouse project I have to join 2 tables (~10M rows) over their primary key.
I was thinking of hash partitioning these 2 tables over their PK. That way the database (9.2) should be able to do a hash-hash full partition-wise join. For more detail you can have a look at:
http://download-west.oracle.com/docs/cd/B10501_01/server.920/a96520/parpart.htm#98291
What I'm looking for are documents or recommendations concerning the number of hash partitions to create, depending on the number of rows in the tables, the CPUs of the server, or any other parameters.
I would be grateful if someone could give some input.
Mike
Here you have all the papers:
Oracle9i Database List of Books
(Release 2 (9.2))
http://otn.oracle.com/pls/db92/db92.docindex?remark=homepage
Joel Pérez -
Expected for a partition-wise join?
I have two tables that are partitioned by a hash of the same VARCHAR2(16) strings. When I do a query similar to
select * from table1 a join table2 b
on b.partition_Column = a.partition_Column
I get the following as the "Operation" portion of an explain from Oracle Developer running Oracle 11gR1:
PARTITION HASH(ALL)
HASH JOIN
TABLE ACCESS(FULL) schema.Table1
TABLE ACCESS(FULL) schema.Table2
Is this indicative of a partition-wise hashed join?
Pstart/Pstop does give partition information, but the key to whether this is a partition-wise join is that the partition operation is above the join operation.
Using David's tables above, example of serial partition-wise join:
| Id | Operation | Name | Rows | Bytes |TempSpc| Cost (%CPU)| Time | Pstart| Pstop |
| 0 | SELECT STATEMENT | | 2500M| 4670G| | 111K (61)| 00:09:17 | | |
| 1 | PARTITION HASH ALL | | 2500M| 4670G| | 111K (61)| 00:09:17 | 1 | 8 |
|* 2 | HASH JOIN | | 2500M| 4670G| 12M| 111K (61)| 00:09:17 | | |
| 3 | TABLE ACCESS FULL| TABLE1 | 100K| 95M| | 5891 (1)| 00:00:30 | 1 | 8 |
| 4 | TABLE ACCESS FULL| TABLE2 | 200K| 191M| | 11808 (1)| 00:01:00 | 1 | 8 |
------------------------------------------------------------------------------------------------------
Example of serial non-PWJ:
| Id | Operation | Name | Rows | Bytes |TempSpc| Cost (%CPU)| Time | Pstart| Pstop |
| 0 | SELECT STATEMENT | | 2500M| 4670G| | 111K (61)| 00:09:17 | | |
|* 1 | HASH JOIN | | 2500M| 4670G| 96M| 111K (61)| 00:09:17 | | |
| 2 | PARTITION HASH ALL| | 100K| 95M| | 5891 (1)| 00:00:30 | 1 | 8 |
| 3 | TABLE ACCESS FULL| TABLE1 | 100K| 95M| | 5891 (1)| 00:00:30 | 1 | 8 |
| 4 | PARTITION HASH ALL| | 200K| 191M| | 11808 (1)| 00:01:00 | 1 | 8 |
| 5 | TABLE ACCESS FULL| TABLE2 | 200K| 191M| | 11808 (1)| 00:01:00 | 1 | 8 |
------------------------------------------------------------------------------------------------------
Example of parallel PWJ:
| Id | Operation | Name | Rows | Bytes | Cost (%CPU)| Time | Pstart| Pstop | TQ |IN-OUT| PQ Distrib |
| 0 | SELECT STATEMENT | | 2500M| 4670G| 23536 (80)| 00:01:58 | | | | | |
| 1 | PX COORDINATOR | | | | | | | | | | |
| 2 | PX SEND QC (RANDOM) | :TQ10000 | 2500M| 4670G| 23536 (80)| 00:01:58 | | | Q1,00 | P->S | QC (RAND) |
| 3 | PX PARTITION HASH ALL| | 2500M| 4670G| 23536 (80)| 00:01:58 | 1 | 8 | Q1,00 | PCWC | |
|* 4 | HASH JOIN | | 2500M| 4670G| 23536 (80)| 00:01:58 | | | Q1,00 | PCWP | |
| 5 | TABLE ACCESS FULL | TABLE1 | 100K| 95M| 1628 (1)| 00:00:09 | 1 | 8 | Q1,00 | PCWP | |
| 6 | TABLE ACCESS FULL | TABLE2 | 200K| 191M| 3263 (1)| 00:00:17 | 1 | 8 | Q1,00 | PCWP | |
---------------------------------------------------------------------------------------------------------------------------------
Example of parallel non-PWJ:
| Id | Operation | Name | Rows | Bytes | Cost (%CPU)| Time | Pstart| Pstop | TQ |IN-OUT| PQ Distrib |
| 0 | SELECT STATEMENT | | 2500M| 4670G| 23536 (80)| 00:01:58 | | | | | |
| 1 | PX COORDINATOR | | | | | | | | | | |
| 2 | PX SEND QC (RANDOM) | :TQ10001 | 2500M| 4670G| 23536 (80)| 00:01:58 | | | Q1,01 | P->S | QC (RAND) |
|* 3 | HASH JOIN | | 2500M| 4670G| 23536 (80)| 00:01:58 | | | Q1,01 | PCWP | |
| 4 | PART JOIN FILTER CREATE| :BF0000 | 100K| 95M| 1628 (1)| 00:00:09 | | | Q1,01 | PCWP | |
| 5 | PX RECEIVE | | 100K| 95M| 1628 (1)| 00:00:09 | | | Q1,01 | PCWP | |
| 6 | PX SEND BROADCAST | :TQ10000 | 100K| 95M| 1628 (1)| 00:00:09 | | | Q1,00 | P->P | BROADCAST |
| 7 | PX BLOCK ITERATOR | | 100K| 95M| 1628 (1)| 00:00:09 | 1 | 8 | Q1,00 | PCWC | |
| 8 | TABLE ACCESS FULL | TABLE1 | 100K| 95M| 1628 (1)| 00:00:09 | 1 | 8 | Q1,00 | PCWP | |
| 9 | PX BLOCK ITERATOR | | 200K| 191M| 3263 (1)| 00:00:17 |:BF0000|:BF0000| Q1,01 | PCWC | |
| 10 | TABLE ACCESS FULL | TABLE2 | 200K| 191M| 3263 (1)| 00:00:17 |:BF0000|:BF0000| Q1,01 | PCWP | |
------------------------------------------------------------------------------------------------------------------------------------
Edited by: Dom Brooks on Jul 8, 2011 11:32 AM
Added serial vs parallel -
Hint to disable partition wise join
Is there a way to disable a (serial) partition-wise join in 10gR2, i.e. via a hint? The reason I want to do this is to use intra-partition parallelism for a very big partition. Re-partitioning or subpartitioning is not an option for now. The SQL is scanning only one partition, so a P-W join is not useful, and it limits the intra-partition parallelism.
TIA for your answers.
user4529833 wrote:
Above is the plan. Currently there is no parallelism being used, but a P-W join is used, as you can see. Table EC is huge (the cardinality is screwed up here because of the IN clause, which has just one valid partition key; 3rd-party crappy app, so I can't change it). I'd like to enable parallelism here using a parallel(EC, 6) hint, but it just applied to the hash join and not to table EC, because of the P-W join, I believe. What I want is to scan the EC table via PQ slaves, i.e. a PX BLOCK ITERATOR step before the TABLE ACCESS step. How do I get one? Will PQ_DISTRIBUTE help me there? Or is there any other way to speed up the scan of EC?
The pq_distribute() should do the job. Here's an example
select
/*+
parallel(pt_range_1 2)
parallel(pt_range_2 2)
ordered
-- pq_distribute(pt_range_2 hash hash)
-- pq_distribute(pt_range_2 broadcast none)
*/
pt_range_2.grp,
count(pt_range_1.small_vc)
from
pt_range_1,
pt_range_2
where
pt_range_1.id in (10,20,40)
and pt_range_2.id = pt_range_1.id
group by
pt_range_2.grp
;
| Id | Operation | Name | Rows | Bytes | Cost (%CPU)| Time | Pstart| Pstop | TQ |IN-OUT| PQ Distrib |
| 0 | SELECT STATEMENT | | 3 | 42 | 6 (34)| 00:00:01 | | | | | |
| 1 | PX COORDINATOR | | | | | | | | | | |
| 2 | PX SEND QC (RANDOM) | :TQ10001 | 3 | 42 | 6 (34)| 00:00:01 | | | Q1,01 | P->S | QC (RAND) |
| 3 | HASH GROUP BY | | 3 | 42 | 6 (34)| 00:00:01 | | | Q1,01 | PCWP | |
| 4 | PX RECEIVE | | 3 | 42 | 5 (20)| 00:00:01 | | | Q1,01 | PCWP | |
| 5 | PX SEND HASH | :TQ10000 | 3 | 42 | 5 (20)| 00:00:01 | | | Q1,00 | P->P | HASH |
| 6 | PX PARTITION RANGE INLIST| | 3 | 42 | 5 (20)| 00:00:01 |KEY(I) |KEY(I) | Q1,00 | PCWC | |
|* 7 | HASH JOIN | | 3 | 42 | 5 (20)| 00:00:01 | | | Q1,00 | PCWP | |
|* 8 | TABLE ACCESS FULL | PT_RANGE_1 | 3 | 21 | 2 (0)| 00:00:01 |KEY(I) |KEY(I) | Q1,00 | PCWP | |
|* 9 | TABLE ACCESS FULL | PT_RANGE_2 | 3 | 21 | 2 (0)| 00:00:01 |KEY(I) |KEY(I) | Q1,00 | PCWP | |
------------------------------------------------------------------------------------------------------------------------------------------
Unhinted, I have a partition-wise parallel join.
The next plan uses hash distribution, which may be better for you if the EC table is large:
| Id | Operation | Name | Rows | Bytes | Cost (%CPU)| Time | Pstart| Pstop | TQ |IN-OUT| PQ Distrib |
| 0 | SELECT STATEMENT | | 3 | 42 | 6 (34)| 00:00:01 | | | | | |
| 1 | PX COORDINATOR | | | | | | | | | | |
| 2 | PX SEND QC (RANDOM) | :TQ10003 | 3 | 42 | 6 (34)| 00:00:01 | | | Q1,03 | P->S | QC (RAND) |
| 3 | HASH GROUP BY | | 3 | 42 | 6 (34)| 00:00:01 | | | Q1,03 | PCWP | |
| 4 | PX RECEIVE | | 3 | 42 | 5 (20)| 00:00:01 | | | Q1,03 | PCWP | |
| 5 | PX SEND HASH | :TQ10002 | 3 | 42 | 5 (20)| 00:00:01 | | | Q1,02 | P->P | HASH |
|* 6 | HASH JOIN BUFFERED | | 3 | 42 | 5 (20)| 00:00:01 | | | Q1,02 | PCWP | |
| 7 | PX RECEIVE | | 3 | 21 | 2 (0)| 00:00:01 | | | Q1,02 | PCWP | |
| 8 | PX SEND HASH | :TQ10000 | 3 | 21 | 2 (0)| 00:00:01 | | | Q1,00 | P->P | HASH |
| 9 | PX BLOCK ITERATOR | | 3 | 21 | 2 (0)| 00:00:01 |KEY(I) |KEY(I) | Q1,00 | PCWC | |
|* 10 | TABLE ACCESS FULL| PT_RANGE_1 | 3 | 21 | 2 (0)| 00:00:01 |KEY(I) |KEY(I) | Q1,00 | PCWP | |
| 11 | PX RECEIVE | | 3 | 21 | 2 (0)| 00:00:01 | | | Q1,02 | PCWP | |
| 12 | PX SEND HASH | :TQ10001 | 3 | 21 | 2 (0)| 00:00:01 | | | Q1,01 | P->P | HASH |
| 13 | PX BLOCK ITERATOR | | 3 | 21 | 2 (0)| 00:00:01 |KEY(I) |KEY(I) | Q1,01 | PCWC | |
|* 14 | TABLE ACCESS FULL| PT_RANGE_2 | 3 | 21 | 2 (0)| 00:00:01 |KEY(I) |KEY(I) | Q1,01 | PCWP | |
--------------------------------------------------------------------------------------------------------------------------------------
Then the broadcast version, if the EC data is relatively small (so that the whole set can fit in the memory of each slave):
| Id | Operation | Name | Rows | Bytes | Cost (%CPU)| Time | Pstart| Pstop | TQ |IN-OUT| PQ Distrib |
| 0 | SELECT STATEMENT | | 3 | 42 | 6 (34)| 00:00:01 | | | | | |
| 1 | PX COORDINATOR | | | | | | | | | | |
| 2 | PX SEND QC (RANDOM) | :TQ10002 | 3 | 42 | 6 (34)| 00:00:01 | | | Q1,02 | P->S | QC (RAND) |
| 3 | HASH GROUP BY | | 3 | 42 | 6 (34)| 00:00:01 | | | Q1,02 | PCWP | |
| 4 | PX RECEIVE | | 3 | 42 | 5 (20)| 00:00:01 | | | Q1,02 | PCWP | |
| 5 | PX SEND HASH | :TQ10001 | 3 | 42 | 5 (20)| 00:00:01 | | | Q1,01 | P->P | HASH |
|* 6 | HASH JOIN | | 3 | 42 | 5 (20)| 00:00:01 | | | Q1,01 | PCWP | |
| 7 | PX RECEIVE | | 3 | 21 | 2 (0)| 00:00:01 | | | Q1,01 | PCWP | |
| 8 | PX SEND BROADCAST | :TQ10000 | 3 | 21 | 2 (0)| 00:00:01 | | | Q1,00 | P->P | BROADCAST |
| 9 | PX BLOCK ITERATOR | | 3 | 21 | 2 (0)| 00:00:01 |KEY(I) |KEY(I) | Q1,00 | PCWC | |
|* 10 | TABLE ACCESS FULL| PT_RANGE_1 | 3 | 21 | 2 (0)| 00:00:01 |KEY(I) |KEY(I) | Q1,00 | PCWP | |
| 11 | PX BLOCK ITERATOR | | 3 | 21 | 2 (0)| 00:00:01 |KEY(I) |KEY(I) | Q1,01 | PCWC | |
|* 12 | TABLE ACCESS FULL | PT_RANGE_2 | 3 | 21 | 2 (0)| 00:00:01 |KEY(I) |KEY(I) | Q1,01 | PCWP | |
--------------------------------------------------------------------------------------------------------------------------------------
The "hash join buffered" in the hash/hash distribution might hammer your temporary tablespace though, thanks to [an oddity I discovered|http://jonathanlewis.wordpress.com/2008/11/05/px-buffer/] in parallel hash joins a little while ago.
Regards
Jonathan Lewis
http://jonathanlewis.wordpress.com
http://www.jlcomp.demon.co.uk
"Science is more than a body of knowledge; it is a way of thinking" Carl Sagan -
Partition pruning not working for partitioned table joins
Hi,
We are joining 4 partitioned tables on the partition column and other key columns, and we are filtering the driving table on the partition key. But the explain plan shows that all tables except the driving table are not partition pruning and are scanning all partitions. Is there any limitation that the filter condition cannot be dynamic?
Thanks a lot in advance.
Here are the details...
SELECT a.pay_prd_id,
a.a_id,
a.a_evnt_no
FROM b,
c,
a,
d
WHERE a.pay_prd_id = b.pay_prd_id ---partition range all
AND a.a_evnt_no = b.b_evnt_no
AND a.a_id = b.b_id
AND a.pay_prd_id = c.pay_prd_id ---partition range all
AND a.a_evnt_no = c.c_evnt_no
AND a.a_id = c.c_id
AND a.pay_prd_id = d.pay_prd_id ---partition range all
AND a.a_evnt_no = d.d_evnt_no
AND a.a_id = d.d_id
AND a.pay_prd_id = ---partition range single
CASE '201202'
WHEN 'YYYYMM'
THEN (SELECT min(pay_prd_id)
FROM pay_prd
WHERE pay_prd_stat_cd = 2)
ELSE TO_NUMBER ('201202', '999999')
END;
DDLs:
create table pay_prd(
pay_prd_id number(6),
pay_prd_stat_cd integer,
pay_prd_stat_desc varchar2(20),
a_last_upd_dt DATE
);
insert into pay_prd
select 201202,2,'OPEN',sysdate from dual
union all
select 201201,1,'CLOSE',sysdate from dual
union all
select 201112,1,'CLOSE',sysdate from dual
union all
select 201111,1,'CLOSE',sysdate from dual
union all
select 201110,1,'CLOSE',sysdate from dual
union all
select 201109,1,'CLOSE',sysdate from dual;
CREATE TABLE A
(PAY_PRD_ID NUMBER(6) NOT NULL,
A_ID NUMBER(9) NOT NULL,
A_EVNT_NO NUMBER(3) NOT NULL,
A_DAYS NUMBER(3),
A_LAST_UPD_DT DATE
)
PARTITION BY RANGE (PAY_PRD_ID)
INTERVAL( 1 )
( PARTITION A_0001 VALUES LESS THAN (201504) )
ENABLE ROW MOVEMENT;
ALTER TABLE A ADD CONSTRAINT A_PK PRIMARY KEY (PAY_PRD_ID,A_ID,A_EVNT_NO) USING INDEX LOCAL;
insert into a
select 201202,1111,1,65,sysdate from dual
union all
select 201202,1111,2,75,sysdate from dual
union all
select 201202,1111,3,85,sysdate from dual
union all
select 201202,1111,4,95,sysdate from dual;
CREATE TABLE B
(PAY_PRD_ID NUMBER(6) NOT NULL,
B_ID NUMBER(9) NOT NULL,
B_EVNT_NO NUMBER(3) NOT NULL,
B_DAYS NUMBER(3),
B_LAST_UPD_DT DATE
)
PARTITION BY RANGE (PAY_PRD_ID)
INTERVAL( 1 )
( PARTITION B_0001 VALUES LESS THAN (201504) )
ENABLE ROW MOVEMENT;
ALTER TABLE B ADD CONSTRAINT B_PK PRIMARY KEY (PAY_PRD_ID,B_ID,B_EVNT_NO) USING INDEX LOCAL;
insert into b
select 201202,1111,1,15,sysdate from dual
union all
select 201202,1111,2,25,sysdate from dual
union all
select 201202,1111,3,35,sysdate from dual
union all
select 201202,1111,4,45,sysdate from dual;
CREATE TABLE C
(PAY_PRD_ID NUMBER(6) NOT NULL,
C_ID NUMBER(9) NOT NULL,
C_EVNT_NO NUMBER(3) NOT NULL,
C_DAYS NUMBER(3),
C_LAST_UPD_DT DATE
)
PARTITION BY RANGE (PAY_PRD_ID)
INTERVAL( 1 )
( PARTITION C_0001 VALUES LESS THAN (201504) )
ENABLE ROW MOVEMENT;
ALTER TABLE C ADD CONSTRAINT C_PK PRIMARY KEY (PAY_PRD_ID,C_ID,C_EVNT_NO) USING INDEX LOCAL;
insert into c
select 201202,1111,1,33,sysdate from dual
union all
select 201202,1111,2,44,sysdate from dual
union all
select 201202,1111,3,55,sysdate from dual
union all
select 201202,1111,4,66,sysdate from dual;
CREATE TABLE D
(PAY_PRD_ID NUMBER(6) NOT NULL,
D_ID NUMBER(9) NOT NULL,
D_EVNT_NO NUMBER(3) NOT NULL,
D_DAYS NUMBER(3),
D_LAST_UPD_DT DATE
)
PARTITION BY RANGE (PAY_PRD_ID)
INTERVAL( 1 )
( PARTITION D_0001 VALUES LESS THAN (201504) )
ENABLE ROW MOVEMENT;
ALTER TABLE D ADD CONSTRAINT D_PK PRIMARY KEY (PAY_PRD_ID,D_ID,D_EVNT_NO) USING INDEX LOCAL;
insert into d
select 201202,1111,1,33,sysdate from dual
union all
select 201202,1111,2,44,sysdate from dual
union all
select 201202,1111,3,55,sysdate from dual
union all
select 201202,1111,4,66,sysdate from dual;
The query below was generated by Business Objects and submitted to the database (the CASE statement is generated by BO). Can't we use CASE/subquery/DECODE etc. for the partitioned column? We are assuming that the CASE is what prevents dynamic partition elimination on the other joined partitioned tables (TAB_B_RPT, TAB_C_RPT).
SELECT TAB_D_RPT.acvy_amt,
TAB_A_RPT.itnt_typ_desc,
TAB_A_RPT.ls_typ_desc,
TAB_A_RPT.evnt_no,
TAB_C_RPT.pay_prd_id,
TAB_B_RPT.id,
TAB_A_RPT.to_mdfy,
TAB_A_RPT.stat_desc
FROM TAB_D_RPT,
TAB_C_RPT fee_rpt,
TAB_C_RPT,
TAB_A_RPT,
TAB_B_RPT
WHERE (TAB_B_RPT.id = TAB_A_RPT.id)
AND TAB_A_RPT.pay_prd_id = TAB_D_RPT.pay_prd_id -- expecting Partition Range Single, but doing Partition Range All
AND TAB_A_RPT.evnt_no = TAB_D_RPT.evnt_no
AND TAB_A_RPT.id = TAB_D_RPT.id
AND TAB_A_RPT.pay_prd_id = TAB_C_RPT.pay_prd_id -- expecting Partition Range Single, but doing Partition Range All
AND TAB_A_RPT.evnt_no = TAB_C_RPT.evnt_no
AND TAB_A_RPT.id = TAB_C_RPT.id
AND TAB_A_RPT.pay_prd_id = fee_rpt.pay_prd_id -- expecting Partition Range Single
AND TAB_A_RPT.evnt_no = fee_rpt.evnt_no
AND TAB_A_RPT.id = fee_rpt.id
AND TAB_A_RPT.rwnd_ind = 'N'
AND TAB_A_RPT.pay_prd_id =
CASE '201202'
WHEN 'YYYYMM'
THEN (SELECT DISTINCT pay_prd.pay_prd_id
FROM pay_prd
WHERE pay_prd.stat_cd = 2)
ELSE TO_NUMBER ('201202', '999999')
END;
And its explain plan is...
Plan
SELECT STATEMENT ALL_ROWS Cost: 79 K Bytes: 641 M Cardinality: 3 M
18 HASH JOIN Cost: 79 K Bytes: 641 M Cardinality: 3 M
3 PART JOIN FILTER CREATE SYS.:BF0000 Cost: 7 K Bytes: 72 M Cardinality: 3 M
2 PARTITION RANGE ALL Cost: 7 K Bytes: 72 M Cardinality: 3 M Partition #: 3 Partitions accessed #1 - #1048575
1 TABLE ACCESS FULL TABLE TAB_D_RPT Cost: 7 K Bytes: 72 M Cardinality: 3 M Partition #: 3 Partitions accessed #1 - #1048575
17 HASH JOIN Cost: 57 K Bytes: 182 M Cardinality: 874 K
14 PART JOIN FILTER CREATE SYS.:BF0001 Cost: 38 K Bytes: 87 M Cardinality: 914 K
13 HASH JOIN Cost: 38 K Bytes: 87 M Cardinality: 914 K
6 PART JOIN FILTER CREATE SYS.:BF0002 Cost: 8 K Bytes: 17 M Cardinality: 939 K
5 PARTITION RANGE ALL Cost: 8 K Bytes: 17 M Cardinality: 939 K Partition #: 9 Partitions accessed #1 - #1048575
4 TABLE ACCESS FULL TABLE TAB_C_RPT Cost: 8 K Bytes: 17 M Cardinality: 939 K Partition #: 9 Partitions accessed #1 - #1048575
12 HASH JOIN Cost: 24 K Bytes: 74 M Cardinality: 957 K
7 INDEX FAST FULL SCAN INDEX (UNIQUE) TAB_B_RPT_PK Cost: 675 Bytes: 10 M Cardinality: 941 K
11 PARTITION RANGE SINGLE Cost: 18 K Bytes: 65 M Cardinality: 970 K Partition #: 13 Partitions accessed #KEY(AP)
10 TABLE ACCESS FULL TABLE TAB_A_RPT Cost: 18 K Bytes: 65 M Cardinality: 970 K Partition #: 13 Partitions accessed #KEY(AP)
9 HASH UNIQUE Cost: 4 Bytes: 14 Cardinality: 2
8 TABLE ACCESS FULL TABLE PAY_PRD Cost: 3 Bytes: 14 Cardinality: 2
16 PARTITION RANGE JOIN-FILTER Cost: 8 K Bytes: 106 M Cardinality: 939 K Partition #: 17 Partitions accessed #:BF0001
15 TABLE ACCESS FULL TABLE TAB_C_RPT Cost: 8 K Bytes: 106 M Cardinality: 939 K Partition #: 17 Partitions accessed #:BF0001
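One workaround worth testing (an assumption on my part, since the SQL is generated by BO and may not be editable): repeat the literal on every table's partition key, so each access path can prune statically instead of relying on transitive predicate propagation through the CASE expression:

```sql
-- Sketch: explicit literal predicates on each partitioned table's key.
AND tab_a_rpt.pay_prd_id = 201202
AND tab_c_rpt.pay_prd_id = 201202
AND tab_d_rpt.pay_prd_id = 201202
```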
Thanks Again. -
Partition pruning in the partition_wise join
Given table1 hash partitioned on the client_id column, and table2 equipartitioned (hash on client_id), I want to join both tables but read the data partition by partition (serially or from multiple client sessions). How can I do this?
select * from table1 T1, table2 T2
where t1.client_id = t2.client_id
The query above works fine; it DOES do a partition-wise join. But I can't use it because of the huge number of rows it returns. I'd like to use:
select *
from table1 partition (P1) T1 ,
table2 T2
where t1.client_id = t2.client_id
But in this case Oracle does not prune Table2's partitions: it joins the 1st partition of Table1 with all partitions of Table2. The first question is: why?
I can fix it by naming the Table2 partition too; then the second question is: how can I be sure that all corresponding partitions with the same ordinal number (given uniform partitioning for several tables) contain the same keys?
Using composite partitioning changes nothing. I've realized the problem: partition pruning is based on the content of the WHERE clause, so using "select from partition" does not affect a joined table. Probably creating a "partition_id" column filled with a custom hash would solve the problem.
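Before resorting to a custom hash column, note that if both tables use the same hash key and the same number of partitions, Oracle's hash function maps a given client_id to the same partition position in both tables; this is exactly the property full partition-wise joins rely on. A sketch of the pair-wise approach (partition names are assumptions):

```sql
-- Process one matching pair of hash partitions at a time
-- (repeat for P2..Pn, serially or from separate sessions).
SELECT *
FROM table1 PARTITION (P1) t1
JOIN table2 PARTITION (P1) t2
  ON t1.client_id = t2.client_id;
```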
Partition pruning doesn't work
Hi all,
in this query I've joined 3 tables, each of them partitioned along the field PARTITION_KEY_PRE:
SELECT *
FROM t_cp_pp_basdat basdat LEFT OUTER JOIN t_cp_pp_custom custom
ON basdat.pre_processing_id = custom.pre_processing_id
AND basdat.partition_key_pre = custom.partition_key_pre
AND basdat.customer_unique_id_f = custom.customer_unique_id_f
LEFT OUTER JOIN t_cp_pp_groups groups
ON basdat.partition_key_pre = groups.partition_key_pre
AND basdat.pre_processing_id = groups.pre_processing_id
AND custom.customer_group_id = groups.customer_group_id
WHERE basdat.partition_key_pre = '111100361A0700';
The problem is that the T_CP_PP_GROUPS table doesn't prune; if I remove the condition
custom.customer_group_id = groups.customer_group_id
the pruning works, but obviously the query is then wrong. Here is the explain plan:
Operation Object Name Cost Object Node In/Out PStart PStop
SELECT STATEMENT Optimizer Mode=ALL_ROWS 6156
PX COORDINATOR
PX SEND QC (RANDOM) SYS.:TQ10003 6156 :Q1003 P->S QC (RANDOM)
HASH JOIN OUTER 6156 :Q1003 PCWP
PX RECEIVE 50 :Q1003 PCWP
PX SEND PARTITION (KEY) SYS.:TQ10002 50 :Q1002 P->P PART (KEY)
VIEW 50 :Q1002 PCWP
HASH JOIN RIGHT OUTER BUFFERED 50 :Q1002 PCWP
PX RECEIVE 5 :Q1002 PCWP
PX SEND HASH SYS.:TQ10000 5 :Q1000 P->P HASH
PX BLOCK ITERATOR 5 :Q1000 PCWC KEY KEY
TABLE ACCESS FULL TUKB103.T_CP_PP_CUSTOM 5 :Q1000 PCWP 466 466
PX RECEIVE 44 :Q1002 PCWP
PX SEND HASH SYS.:TQ10001 44 :Q1001 P->P HASH
PX BLOCK ITERATOR 44 :Q1001 PCWC KEY KEY
TABLE ACCESS FULL TUKB103.T_CP_PP_BASDAT 44 :Q1001 PCWP 1437 1437
PX PARTITION LIST ALL 6013 :Q1003 PCWC 1 635
TABLE ACCESS FULL TUKB103.T_CP_PP_GROUPS 6013 :Q1003 PCWP 1 635
Can anyone help me to tune this query?
Thanks,
Davide
Please learn to use { code } tags, since that makes your code much easier to read in the forum. See the FAQ for further instructions on that.
About the pruning question.
You could experiment with different types of joins.
a) Join the GROUPS table with the columns from CUSTOM
SELECT *
FROM t_cp_pp_basdat basdat
LEFT OUTER JOIN t_cp_pp_custom custom
ON basdat.pre_processing_id = custom.pre_processing_id
AND basdat.partition_key_pre = custom.partition_key_pre
AND basdat.customer_unique_id_f = custom.customer_unique_id_f
LEFT OUTER JOIN t_cp_pp_groups groups
ON custom.partition_key_pre = groups.partition_key_pre
AND custom.pre_processing_id = groups.pre_processing_id
AND custom.customer_group_id = groups.customer_group_id
WHERE basdat.partition_key_pre = '111100361A0700';
b) Put the partition key into each ON clause.
SELECT *
FROM t_cp_pp_basdat basdat
LEFT OUTER JOIN t_cp_pp_custom custom
ON basdat.pre_processing_id = custom.pre_processing_id
AND custom.partition_key_pre = '111100361A0700'
AND basdat.customer_unique_id_f = custom.customer_unique_id_f
LEFT OUTER JOIN t_cp_pp_groups groups
ON custom.partition_key_pre = '111100361A0700'
AND basdat.pre_processing_id = groups.pre_processing_id
AND custom.customer_group_id = groups.customer_group_id
WHERE basdat.partition_key_pre = '111100361A0700';
Did you consider adding subpartitions to your tables? You could profit from some partition-wise joins. Customer_Unique_ID would be a candidate for a hash subpartition.
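A sketch of that suggestion, with column names taken from the query above (column types, subpartition count, and the partition value are assumptions):

```sql
-- Sketch: list partition on the existing key, hash subpartition on the
-- customer id so equi-joins on it can go partition-wise.
CREATE TABLE t_cp_pp_custom (
  partition_key_pre    VARCHAR2(14) NOT NULL,
  pre_processing_id    NUMBER       NOT NULL,
  customer_unique_id_f NUMBER       NOT NULL,
  customer_group_id    NUMBER
)
PARTITION BY LIST (partition_key_pre)
SUBPARTITION BY HASH (customer_unique_id_f) SUBPARTITIONS 8
( PARTITION p_111100361a0700 VALUES ('111100361A0700') );
```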
How to choose the partition in oracle tables?
Dear all,
I need to create partitions on production DB tables, but I am not familiar with creating partitions. I just went through some theory and understood range and list partitions: range is normally used for "values less than" boundaries, like Jan/Feb/March or values less than 50,000, values less than 100,000, and so on, with each partition in a separate tablespace to increase performance; list is used for discrete values like west, east, north, south.
Now what i want to know is ?
1.) When can I go ahead with partitions?
2.) Before creating partitions, is it advisable to create an index, or is that not needed?
3.) If I start to create partitions, which leading column should I partition on, and which partition type should I choose?
Please let me know, and pardon me if I made any mistakes.
Thanks in advance..
I had to research the same topic. One of my teammates suggested a few points that might help you also.
Advantages of partitioning:
1) Partitioning enables data management operations such as data loads, index creation and rebuilding, and backup/recovery at the partition level, rather than on the entire table. This results in significantly reduced times for these operations.
2) Partitioning improves query performance. In some cases, the results of a query can be achieved by accessing a subset of partitions, rather than the entire table. Parallel queries/DML and partition-wise joins also benefit greatly.
3) Partitioning increases the availability of mission-critical databases if critical tables and indexes are divided into partitions to reduce the maintenance windows, recovery times, and impact of failures. (Each partition can have separate physical attributes such as pctfree, pctused, and tablespaces.)
Partitioning can be implemented without requiring any modifications to your applications. For example, you could convert a nonpartitioned table to a partitioned table without needing to modify any of the SELECT statements or DML statements which access that table. You do not need to rewrite your application code to take advantage of partitioning.
Disadvantages of partitioning:-
1) The advantages of partitioning are nullified when you use bind variables.
There are additional administration tasks to manage partitions; e.g., if a situation arises that requires rebuilding an index, the rebuild has to be done for each individual partition.
2) More space is needed to implement partitioned objects.
3) Some tasks take more time, such as creating non-partitioned indexes and collecting "global" statistics (dbms_stats' granularity parameter has to be set to GLOBAL; if subpartitions are used, it has to be set to ALL).
4) Partitioning implies a modification (of the explain plan) for ALL the queries against the partitioned tables. So, while some queries use the chosen partition key and may improve greatly, other queries do not use the partition key and are dramatically hurt by the partitioning.
5) To get the full advantage of partitioning (partition pruning, partition-wise joins, and so on), you must use the Cost Based Optimizer (CBO). If you use the RBO and a table in the query is partitioned, Oracle kicks in the CBO while optimizing it. But because the statistics are not present, the CBO makes up the statistics, and this can lead to severely expensive optimization plans and extremely poor performance.
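For the statistics collection in point 3 above, the call might look like this sketch (the table name is a placeholder):

```sql
-- Sketch: gather global statistics on a partitioned table.
BEGIN
  dbms_stats.gather_table_stats(
    ownname     => USER,
    tabname     => 'MY_PART_TAB',  -- placeholder table name
    granularity => 'GLOBAL'        -- use 'ALL' if subpartitions exist
  );
END;
/
```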
Message was edited by:
Abou -
Reference partitioning - thoughts
Hi,
At the moment we use range-hash partitioning of a large dimension table (dimensional model warehouse) with 2 levels, range partitioned on columns only available at the bottom level of the hierarchy: date and issue_id.
The result is a partition with a null value; I assume we would get a null partition in the large fact table if it were partitioned by reference to the large dimension.
The large fact table is similarly partitioned: date range-hash, with local bitmap indexes.
It was suggested that we would get automatic partition-wise joins if we used reference partitioning.
I would have thought we would get that with range-hash on both tables.
Are there any disadvantages with reference partitioning?
I know we can't use range interval partitioning with it.
Thanks>
At the moment, the large dimension table and large fact table have the same partitioning strategy but are partitioned independently (range-hash);
the range column is a date datatype and the hash column is the surrogate key.
>
As long as the 'hash column' is the SAME key value in both tables there is no problem. Obviously you can't hash on one column/value in one table and a different one in the other table.
>
With regards to null values, the dimension table has 3 levels in it (part of a dimensional model data warehouse), i.e. the date on which the table is partitioned is only present at the lowest level of the dimension.
>
High or low doesn't matter and, as you ask in your other thread (Order of columns in a table - how important from a performance perspective), the column order generally doesn't matter.
>
By default in a dimensional model data warehouse, this attribute is not populated at the higher levels and is therefore a default null value in the dimension table for such records.
>
Still not clear what you mean by this. The columns must be populated at some point or they wouldn't need to be in the table. Can you provide a small sample of data that shows what you mean?
>
The problem the performance team are attempting to solve is as follows:
the two tables are joined on the sub-partition key. They have tried joining the two tables on the entire partition key, but then they complain they don't get a star transformation.
>
Which means that team isn't trying to 'solve' a problem at all. They are just trying to mechanically achieve a 'star transformation'.
A full partition-wise join REQUIRES that the partitioning be on the join columns or you need to use reference partitioning. See the doc I provided the link for earlier:
>
Full Partition-Wise Joins
A full partition-wise join divides a large join into smaller joins between a pair of partitions from the two joined tables. To use this feature, you must equipartition both tables on their join keys, or use reference partitioning.
>
They believe that by partitioning by reference, as opposed to independently, they will get a partition-wise join automatically.
>
They may. But you don't need to partition by reference to get partition-wise joins. And you don't need to get 'star transformation' to get the best performance.
Static partition pruning will occur, if possible, whether a star transformation is done or not. It is dynamic pruning that is done AFTER a star transform. Again, you need to review all of the relevant sections of that doc. They cover most of this, with example code and example execution plans.
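The equipartitioning requirement quoted above can be sketched like this (hypothetical names and partition bounds; the point is that both tables share the same range/hash definitions on the join columns):

```sql
-- Dimension and fact range-hash partitioned identically on the join
-- keys (load_date, issue_id), so a full partition-wise join is possible
-- when the query joins on both columns.
CREATE TABLE dim_t (
  issue_id  NUMBER NOT NULL,
  load_date DATE   NOT NULL,
  attr      VARCHAR2(30)
)
PARTITION BY RANGE (load_date)
SUBPARTITION BY HASH (issue_id) SUBPARTITIONS 8
( PARTITION p2012 VALUES LESS THAN (DATE '2013-01-01'),
  PARTITION pmax  VALUES LESS THAN (MAXVALUE) );

CREATE TABLE fact_t (
  issue_id  NUMBER NOT NULL,
  load_date DATE   NOT NULL,
  measure   NUMBER
)
PARTITION BY RANGE (load_date)
SUBPARTITION BY HASH (issue_id) SUBPARTITIONS 8
( PARTITION p2012 VALUES LESS THAN (DATE '2013-01-01'),
  PARTITION pmax  VALUES LESS THAN (MAXVALUE) );
```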
>
Dynamic Pruning with Star Transformation
Statements that get transformed by the database using the star transformation result in dynamic pruning.
>
Also, there are some requirements before star transformation can even be considered. The main one is that it must be ENABLED; it is NOT enabled by default. Has your team enabled the use of the star transform?
The database data warehousing guide discusses star queries and how to tune them:
http://docs.oracle.com/cd/E11882_01/server.112/e25554/schemas.htm#CIHFGCEJ
>
Tuning Star Queries
To get the best possible performance for star queries, it is important to follow some basic guidelines:
A bitmap index should be built on each of the foreign key columns of the fact table or tables.
The initialization parameter STAR_TRANSFORMATION_ENABLED should be set to TRUE. This enables an important optimizer feature for star-queries. It is set to FALSE by default for backward-compatibility.
When a data warehouse satisfies these conditions, the majority of the star queries running in the data warehouse uses a query execution strategy known as the star transformation. The star transformation provides very efficient query performance for star queries.
>
And that doc section ALSO has example code and an example execution plan that shows the star transform being used.
It also has some important info about how Oracle chooses to use a star transform and a long list of restrictions where the transform is NOT supported.
>
How Oracle Chooses to Use Star Transformation
The optimizer generates and saves the best plan it can produce without the transformation. If the transformation is enabled, the optimizer then tries to apply it to the query and, if applicable, generates the best plan using the transformed query. Based on a comparison of the cost estimates between the best plans for the two versions of the query, the optimizer then decides whether to use the best plan for the transformed or untransformed version.
If the query requires accessing a large percentage of the rows in the fact table, it might be better to use a full table scan and not use the transformations. However, if the constraining predicates on the dimension tables are sufficiently selective that only a small portion of the fact table must be retrieved, the plan based on the transformation will probably be superior.
Note that the optimizer generates a subquery for a dimension table only if it decides that it is reasonable to do so based on a number of criteria. There is no guarantee that subqueries will be generated for all dimension tables. The optimizer may also decide, based on the properties of the tables and the query, that the transformation does not merit being applied to a particular query. In this case the best regular plan will be used.
Star Transformation Restrictions
Star transformation is not supported for tables with any of the following characteristics:
>
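Checking and enabling the parameter mentioned above is a one-liner per session (a sketch; your DBA may prefer to set it at the system level):

```sql
-- Star transformation is OFF by default; verify and enable per session.
SHOW PARAMETER star_transformation_enabled

ALTER SESSION SET star_transformation_enabled = TRUE;
-- TEMP_DISABLE also enables it but stops the optimizer from using
-- temporary tables as part of the transformation.
ALTER SESSION SET star_transformation_enabled = TEMP_DISABLE;
```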
Re reference partitioning
>
Also, this is a data warehouse star model, and it was mentioned to us that reference partitioning is not great with local indexes - the large fact table has several local bitmap indexes.
Any thoughts on reference partitioning negatively impacting performance in this way compared to a standalone partitioned table?
>
Reference partitioning is for those situations where your child table does NOT have a column that the parent table is being partitioned on. That is NOT your use case. Don't use reference partitioning unless your use case is appropriate.
I suggest that you and your team thoroughly review all of the relevant sections of both the database data warehousing guide and the VLDB and partitioning guide.
Then create a SIMPLE data model that only includes your partitioning keys and not all of the other columns. Experiment with that simple model with a small amount of data and run the traces and execution plans until you get the behaviour you think you are wanting.
Then scale it up and test it. You cannot design it all ahead of time and expect it to work the way you want.
You need to use an iterative approach. That starts by collecting all the relevant information about your data: how much data, how is it organized, how is it updated (batch or online), how is it queried. You already mention using hash subpartitioning but haven't posted ANYTHING that indicates you even need to use hash. So why has that decision already been made when you haven't even gotten past the basics yet? -
How data is distributed in HASH partitions
Guys,
I want to partition my one big table into 5 different partitions based on the HASH value of the LOCATION field of the table.
My question is: will the data be distributed equally among the partitions, will it all end up in one partition, or do I need 5 different HASH values for the location key to end up with five populated partitions?

Hash partitioning enables easy partitioning of data that does not lend itself to range or list partitioning. It does this with a simple syntax and is easy to implement. It is a better choice than range partitioning when:
1) You do not know beforehand how much data maps into a given range
2) The sizes of range partitions would differ quite substantially or would be difficult to balance manually
3) Range partitioning would cause the data to be undesirably clustered
4) Performance features such as parallel DML, partition pruning, and partition-wise joins are important
The concepts of splitting, dropping or merging partitions do not apply to hash partitions. Instead, hash partitions can be added and coalesced.
In your case, I think list partitioning may be the better choice.
http://download-east.oracle.com/docs/cd/B19306_01/server.102/b14220/partconc.htm#i462869 -
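Both options can be sketched as DDL (hypothetical table and value names). Note that hash partitioning distributes rows most evenly when the partition count is a power of two and the column has many distinct values, which is one more reason 5 hash partitions on LOCATION may come out uneven:

```sql
-- Hash: Oracle decides placement by hashing LOCATION.
CREATE TABLE big_table_hash (
  location   VARCHAR2(30) NOT NULL,
  other_data VARCHAR2(100)
)
PARTITION BY HASH (location) PARTITIONS 8;

-- List: you control exactly which locations land in which partition.
CREATE TABLE big_table_list (
  location   VARCHAR2(30) NOT NULL,
  other_data VARCHAR2(100)
)
PARTITION BY LIST (location) (
  PARTITION p_north VALUES ('NY', 'BOSTON'),
  PARTITION p_south VALUES ('MIAMI', 'ATLANTA'),
  PARTITION p_other VALUES (DEFAULT)
);
```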
Help needed in EXPLAIN PLAN for a partitioned table
Oracle Version 9i
(Paste execution plan in a spreadsheet to make sense out of it)
I am trying to tune this query -
select * from comm c ,dates d
where
d.active_period = 'Y'
and c.period = d.period;
Operation Object Name Rows Bytes Cost Object Node In/Out PStart PStop
SELECT STATEMENT Optimizer Mode=CHOOSE 5 M 278887
HASH JOIN 5 M 5G 278887
TABLE ACCESS FULL SCHEMA.DATES 24 1 K 8
PARTITION LIST ALL 1 8
TABLE ACCESS FULL SCHEMA.COMM 6 M 5G 277624 1 8
However, I know that the dates table will return only one record. So, if I add another condition to the above query, I get this execution plan. The comm table is huge but it is partitioned on period.
select * from comm c ,dates d
where
d.active_period = 'Y'
and c.period = d.period
and c.period = 'OCT-07'
Operation Object Name Rows Bytes Cost Object Node In/Out PStart PStop
SELECT STATEMENT Optimizer Mode=CHOOSE 1 8
MERGE JOIN CARTESIAN 1 9 K 8
TABLE ACCESS FULL SCHEMA.DATES 1 69 8
BUFFER SORT 1 9 K
TABLE ACCESS BY LOCAL INDEX ROWID SCHEMA.COMM 1 9 K 7 7
INDEX RANGE SCAN SCHEMA.COMM_NP9 1 7 7
How can I make the query such that the comm table does not have to traverse all of its partitions? The partitioning is based on quarters so it will get its data in one partition only (in the above example - partition no. 7)
Thanks in advance for your replies :)

You need to specify period = 'OCT-07'; otherwise there is no way the optimizer can know it needs to access only one partition.
Alternatively, partition the DATES table in exactly the same way on "period", and partition-wise joins should kick in, effectively accessing only the active partition. -
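That alternative can be sketched like this (hypothetical names and period values; the partition definitions must match COMM's exactly for the partition-wise join to apply):

```sql
-- List-partition DATES on "period" the same way COMM is partitioned,
-- so the join c.period = d.period can be done partition by partition.
CREATE TABLE dates_p (
  period        VARCHAR2(6) NOT NULL,
  active_period CHAR(1)     NOT NULL
)
PARTITION BY LIST (period) (
  PARTITION q3_07 VALUES ('JUL-07', 'AUG-07', 'SEP-07'),
  PARTITION q4_07 VALUES ('OCT-07', 'NOV-07', 'DEC-07')
);
```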
Partitioning SCD Type 2 dimensions and possible update sto partition key
Hi,
We have a fact table which only gets inserts and are considering range partitioning this on sell_by_date (as can archive off products which long past sell by date) and hash sub-partition on product_id.
We have heard about partition-wise joins and are considering partitioning the SCD Type 2 product dimension in a similar way, range partitioned on product_sell_by_date and sub-partitioned by hash on product_id, in order to get the benefit of partition-wise joins.
However, there is a possibility that product_sell_by_date could be updated via the delta loads which maintain the SCD Type 2 - in theory this attribute shouldn't be updated much.
Any thoughts, generally on partitioning dimension tables and partition keys which can be updated.
Many Thanks

1. Create a function as mentioned by you ("max value +1 on grouping of data") on the target schema.
2. Map to the target column
3. Select TARGET as the execution area in the interface (don't select source or staging) for this column in the interface mapping.
4. Select only the INSERT option (de-select the UPDATE option) for this column in the interface mapping.
5. Execute the interface.
Please let me know if this works
-
I'm investigating partitioning one or more tables in our schema to see if that will speed performance for some of our queries. I'm wondering if the following structure is viable, however.
Table structure - this is a snippet of relevant info:
CREATE TABLE ASSET (
asset NUMBER, -- primary key
assetType NUMBER,
company NUMBER,
created TIMESTAMP DEFAULT SYSTIMESTAMP,
modified TIMESTAMP DEFAULT SYSTIMESTAMP,
lobData CLOB
...)

The current table has ~ 60 million rows. All queries are filtered at least on the company column, and possibly by other criteria (never/rarely by date). The number of rows a company can have in this table can vary greatly - the largest company has about 2.4 million, and the smallest about 1000. This table is joined by several other tables via the primary key, but rarely queried itself by the primary key (no range pkey queries exist).
I'm thinking of partitioning by company (range) - however, I'm not sure if the uneven distribution of company data makes that an effective partition. The number of companies is relatively small (~6000 ) and does not grow significantly (perhaps 1-2 new companies a day). The data in this table is pretty active - ~200k deletes/inserts a day.
Does it make sense to range partition by company? I was thinking of partitioning per company (1 partition per company) - but the partitions would be quite different in size. Is there a limit to the number of partitions a table can have (is it 64k?). Does partitioning even make sense for this table structure?
Any thoughts or insights would be most helpful - thank you.
The version of Oracle is very important.
Partitioning on company looks like a sensible option since you ALWAYS filter on company - but list partitioning makes more sense than range partitioning because it is more "truthful".
Unfortunately it looks, at first sight, as if you have a logical error in the design - I'm wondering if the company should be part of the primary key of the asset. If you partition by company you won't be able to do partition-wise joins to the other tables when joining on primary key (I've interpreted your statement to mean that the asset is the foreign key in other tables) unless you happen to be running 11g and use "ref partitioning".
It's hard to predict the impact of 6,000 partitions, especially with such extreme variations in size. With list partitioning it's worth thinking about putting each large company into its own partition, but using a small number of partitions (or even just the default partition) for all the rest.
Regards
Jonathan Lewis
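The suggestion above of dedicated partitions for the large companies plus a catch-all can be sketched as follows (company ids and partition names are hypothetical):

```sql
-- Each very large company gets its own list partition; the ~6000
-- small companies all share the DEFAULT partition, keeping the
-- partition count manageable.
CREATE TABLE asset_p (
  asset   NUMBER NOT NULL,
  company NUMBER NOT NULL,
  lobData CLOB
)
PARTITION BY LIST (company) (
  PARTITION p_big1 VALUES (101),      -- the ~2.4M-row company
  PARTITION p_big2 VALUES (102),      -- next largest
  PARTITION p_rest VALUES (DEFAULT)   -- everyone else
);
```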