How data is distributed in HASH partitions

Guys,
I want to partition one big table into 5 partitions based on the HASH value of the table's LOCATION field.
My question is: will the data be distributed evenly across the partitions, will it all end up in one partition, or do I need 5 different HASH values for the location key in order to populate all five partitions?

Hash partitioning enables easy partitioning of data that does not lend itself to range or list partitioning. It does this with a simple syntax and is easy to implement. It is a better choice than range partitioning when:
1) You do not know beforehand how much data maps into a given range
2) The sizes of range partitions would differ quite substantially or would be difficult to balance manually
3) Range partitioning would cause the data to be undesirably clustered
4) Performance features such as parallel DML, partition pruning, and partition-wise joins are important
The concepts of splitting, dropping or merging partitions do not apply to hash partitions. Instead, hash partitions can be added and coalesced.
In your case, though, I think list partitioning may be the better choice.
http://download-east.oracle.com/docs/cd/B19306_01/server.102/b14220/partconc.htm#i462869
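As a minimal sketch of the five-way layout being asked about (table and column names are hypothetical), note that Oracle's hash algorithm balances best when the partition count is a power of two, so with 5 partitions some skew is expected even for a high-cardinality key:

```sql
-- Hypothetical example: hash partitioning on LOCATION into 5 partitions.
-- With 5 partitions (not a power of two) some partitions receive roughly
-- twice as many hash buckets as others, so the spread will be uneven.
CREATE TABLE sales_by_location (
  sale_id  NUMBER,
  location VARCHAR2(30),
  amount   NUMBER
)
PARTITION BY HASH (location)
PARTITIONS 5;

-- Rows with the same LOCATION value always hash to the same partition,
-- so at least 5 distinct LOCATION values are needed to use all 5 partitions.
```
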

Similar Messages

  • What is the significance of Hash Partition ?

    Hi All,
This is the first time I am going to implement hash partitioning along with subpartitioning, and before implementing I have some questions:
1. What is the maximum number of partitions or subpartitions we can specify, and what is the default?
2. How do we know which data goes into which hash partition? With range partitioning, the specified ranges tell us which data lands in which partition, and the same is true of list partitioning, but how does it work for hash?
Does anyone have any idea?
Thanks in advance.
Anwar

    1. Take a look here : http://download-uk.oracle.com/docs/cd/B19306_01/server.102/b14237/limits003.htm
    2. Take a look here : Re: Access to HASH PARTITION
    Nicolas.
    Correction of link
    Message was edited by:
    N. Gasparotto
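For question 2, one way to see which rows a hash partition actually holds (a sketch; table and partition names are hypothetical) is to query the partition directly with the PARTITION clause:

```sql
-- List the partitions of the table and their (statistics-based) row counts.
SELECT partition_name, num_rows
FROM   user_tab_partitions
WHERE  table_name = 'MY_HASH_TABLE'
ORDER  BY partition_position;

-- Then inspect exactly which key values landed in a given partition:
SELECT DISTINCT my_key_col
FROM   my_hash_table PARTITION (sys_p81);  -- partition name from the query above
```

Unlike range or list partitioning, there is no human-readable rule to predict the assignment from the DDL; the mapping comes from Oracle's internal hash function, so querying the partition is the reliable way to see what it contains.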

  • How to move rows from one partition to another?

We have a data retention requirement of 6 months, after which the data can be removed except for a few records with certain statuses (which cannot be removed even if they are over 6 months old). We have Oracle 11g.
I wanted to see if the following strategy works:
I will have monthly partitions and 8 hash sub-partitions in each of the main partitions. The hash sub-partitions are there to spread the application data load and balance the inserts to avoid contention.
At the end of 6 months, is it possible for me to move the rows that need to be kept to a different partition and drop the last partition? I want to avoid deleting data, because we have very little database downtime and deletion would require us to coalesce the segments to avoid fragmentation.
If I can move the data to another partition, how would I do it?
    Thanks

I think you didn't get my intentions correctly.
My intention is to move the required data to a partition and then drop the original partition, because the amount of data required to keep for future use is less than 1%.
I understood what you meant to convey. Thanks a lot.
I laid out the plan something like the following:
1. Add a new date column to facilitate the partitioning (say it is called PARTITION_DT; PARTITION_DT will have the same data as ROW_CREAT_DTTM without the time portion of the date).
2. Start with monthly partitions on PARTITION_DT, and create a 'Default' catch-all partition as well.
3. Add hash sub-partitions on MSG_ID to spread the application data load and avoid contention (so the load is balanced just as it is through the hash partitions today).
4. At the end of the data retention period, identify the records that need to be retained (because of certain trade statuses, etc.) and update PARTITION_DT with a distant past date so the rows automatically move to the default partition. The identification of the records to be retained and the update of PARTITION_DT can be done through a batch job; since this step involves very few records, it can be done in minutes or seconds.
5. Drop the oldest partition and rebuild the indexes if required.
6. The process should add a sufficient number of monthly partitions and their sub-partitions ahead of time to make sure partitions are always available for new data.
    The entire process can be automated and executed through a scheduled job.
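A minimal sketch of steps 4 and 5 (table, column, and status names are hypothetical). Note that an UPDATE of the partition key only relocates rows if row movement is enabled on the table:

```sql
-- Row movement must be enabled for an UPDATE of the partition key
-- to physically migrate rows to another partition.
ALTER TABLE msg_history ENABLE ROW MOVEMENT;

-- Step 4: give the rows to be retained a distant past date so they
-- map to the catch-all partition instead of the one about to be dropped.
UPDATE msg_history
SET    partition_dt = DATE '1900-01-01'
WHERE  partition_dt < ADD_MONTHS(TRUNC(SYSDATE, 'MM'), -6)
AND    trade_status IN ('OPEN', 'DISPUTED');   -- hypothetical statuses
COMMIT;

-- Step 5: drop the expired partition; UPDATE INDEXES keeps global
-- indexes usable without a separate offline rebuild.
ALTER TABLE msg_history DROP PARTITION p_201201 UPDATE INDEXES;
```
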

  • Does hash partition distribute data evenly across partitions?

As per the Oracle documentation, hash partitioning uses Oracle's hashing algorithm to assign a hash value to each row's partitioning key and place the row in the appropriate partition, and the data will be evenly distributed across the partitions, provided the following conditions hold:
1. The partition count should follow the 2^n rule.
2. The data in the partition key column should have high cardinality.
I have used hash partitioning in some of our application tables, but the data isn't distributed evenly across partitions. To verify this, I performed a small test:
Table script:
CREATE TABLE ch_acct_mast_hash (
  cod_acct_no NUMBER
)
PARTITION BY HASH (cod_acct_no)
PARTITIONS 128;
Data population script:
DECLARE
  i NUMBER;
BEGIN
  i := 1000000000000000;
  FOR l IN 1 .. 100000 LOOP
    INSERT INTO ch_acct_mast_hash VALUES (i);
    i := i + 1;
  END LOOP;
  COMMIT;
END;
Row-count check:
SELECT COUNT(1) FROM ch_acct_mast_hash;  -- rowcount is 100000
Gather stats script:
BEGIN
  dbms_stats.gather_table_stats('C43HDEV', 'CH_ACCT_MAST_HASH');
END;
Data distribution check:
SELECT MIN(num_rows), MAX(num_rows)
FROM   dba_tab_partitions
WHERE  table_name = 'CH_ACCT_MAST_HASH';
Result:
MIN(num_rows) = 700
MAX(num_rows) = 853
Based on this result, it seems there is a lot of skew in the data distribution across the partitions. Maybe I am missing something, or something is not right.
Can anybody help me understand this behavior?
    Edited by: Kshitij Kasliwal on Nov 2, 2012 4:49 AM

    >
    I have used hash partitioning in some of our application tables, but data isn't distributed evenly across partitions.
    >
    All keys with the same data value will also have the same hash value and so will be in the same partition.
    So the actual hash distribution in any particular case will depend on the actual data distribution. And, as Iordan showed, the data distribution depends not only on cardinality but on the standard deviation of the key values.
To use a shorter version of that example, consider these data samples, each of which has 10 values. There is a standard deviation calculator here:
    http://easycalculation.com/statistics/standard-deviation.php
0,1,0,2,0,3,0,4,0,5 - total 10, distinct 6, %distinct 60, mean 1.5, stan. dev. 1.90, variance 3.6 - similar to Iordan's example
0,5,0,5,0,5,0,5,0,5 - total 10, distinct 2, %distinct 20, mean 2.5, stan. dev. 2.64, variance 6.9
5,5,5,5,5,5,5,5,5,5 - total 10, distinct 1, %distinct 10, mean 5.0, stan. dev. 0.00, variance 0.0
0,1,2,3,4,5,6,7,8,9 - total 10, distinct 10, %distinct 100, mean 4.5, stan. dev. 3.03, variance 9.2
The first and last examples have the highest cardinality, but only the last has unique values (i.e. 100% distinct).
Note that the first example is lower on all the other attributes, but that doesn't mean it would hash more evenly.
Also note that the last example, the one with unique values, has the highest variance.
So there is no single attribute that controls the outcome. As Iordan showed, the first example has a high %distinct, but all of those '0' values will hash to the same partition, so even with a perfect hash the data would occupy at most 6 partitions.

  • How many hash partitions create is prefer

    I want to create hash partitions on a large table,but i don't know how many partitions to create?
CREATE TABLE T_PARTITION_HASH (
  ID   NUMBER,
  NAME VARCHAR2(50)
)
PARTITION BY HASH (ID) (
  PARTITION T_HASH_P1 TABLESPACE USERS,
  PARTITION T_HASH_P2 TABLESPACE USERS,
  PARTITION T_HASH_P3 TABLESPACE USERS
);

Agustin UN wrote:
What is the table size in rows and MB?
What is the estimated growth of the table?
What access is on the table?
:-) Any help with my English will be welcome :-)

The table contains about 100 million rows.
The table grows by 1,000,000 rows per day.
The table is used for statistical analysis in a data warehouse.
    Edited by: user7244870 on Nov 5, 2010 2:22 AM

  • How to create up-to-date Recovery Drive and Recovery Partition on Windows 8.1 U1.

    I am running 8.1 and exploring the recovery options.
    I'm periodically creating recovery images, etc. but would like to do better than that.
    I'd like to know how to:
Create an updated Recovery Partition that will restore to a Recovery Image (WIM) that I choose.
Modify / create a Recovery Drive that includes a system image of my choice, as opposed to an image created by the manufacturer / supplier (and the extra work that that implies). If I can create an up-to-date partition and mark it as such, then I'm clearly done. Should I not be able to sensibly make my own partition, how do I create this on a USB recovery drive? (The built-in interface has a Boolean option: use the hard drive recovery partition OR use nothing. This would be solved if it also had an option to use the current recovery image. I'm happy to write simple code to make this happen if that's what it takes.)
Put another way: I can make a recovery image at will; how do I create a recovery partition and recovery drive to match?
I'm no expert on the ins and outs of Windows 8.1 recovery, so if this is already covered elsewhere, I'd appreciate a link.

I have documented an unsupported process for creating an automated recovery for use in a task sequence here. However, if I read your post correctly, you are suggesting being able to do this from USB media. The method for creating the recovery partition is simple: it is just a partition containing a folder called RecoveryImage with your install.wim inside. Reagentc is the tool that actually tells Windows where that recovery is. While I don't personally have a use for what you're suggesting, I may give it a shot just to see if it can be done. I suspect it will work.
    Here are the links that got me started:
    ReagentC:
    http://technet.microsoft.com/en-us/library/dd799242(v=WS.10).aspx
    Creating Push Button Reset:
    http://technet.microsoft.com/en-us/library/hh824917.aspx
    Good Luck!
    bill
    Regards, Bill Moore @BMooreAtDell

  • Best Way to Load Data in Hash Partition

    Hi,
I have hash partitioning on a large table of 5 TB. We have to load more than 500 GB of data daily into that table from ETL.
What is the best way to load data into that big table with hash partitions?
    Regards
    Sahil Soni

Do you have any specific requirements to match records against lookup tables, or is it just a straight load, that is, an insert?
Do you have any specific performance requirements?
The easiest and fastest way to load data into Oracle is via an external file and parallel query / parallel insert. Remember that parallel DML is not enabled by default; you have to enable it with an ALTER SESSION command. You can leverage multiple CPU cores and direct-path operations to perform the load.
Assuming your database is on a Linux/Unix server, you could NFS-mount the file if it is on a remote system, but then you will most likely be limited by network transfer speed.
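A minimal sketch of that approach (the external table ext_daily_feed, the target table name, and the degree of parallelism are all hypothetical):

```sql
-- Parallel DML is off by default and must be enabled per session.
ALTER SESSION ENABLE PARALLEL DML;

-- Direct-path (APPEND), parallel insert from an external table mapped
-- to the daily flat file (external table definition not shown here).
INSERT /*+ APPEND PARALLEL(big_hash_tab, 8) */ INTO big_hash_tab
SELECT /*+ PARALLEL(ext_daily_feed, 8) */ *
FROM   ext_daily_feed;

-- A direct-path insert must be committed before the session can
-- query the table again.
COMMIT;
```
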

  • How many Hash Partitions do I need?

Is there a general rule, guideline, or best practice I can follow to derive the optimum number of hash partitions to create for a table that is to be hash partitioned?
I see that Oracle recommends considering partitioning once a table exceeds 2 GB, so should I attempt to limit the size of each partition to 2 GB?
Why is this recommendation based on the number of bytes rather than the number of rows? It seems to me that a 100-million-row table should be considered for partitioning regardless of whether it holds 1 GB of data or 5 GB. Please advise as to why the number of bytes is considered the key factor rather than the number of rows.
    -Pat

Thanks for the response. Actually, I currently have a table with 1.1 billion rows, growing daily. This table is currently hash partitioned with 8 partitions. I'm thinking we should go to 64 partitions, but I don't really have anything to base this on other than a gut feeling. Based on the access patterns for this table, I don't believe there is any real downside to creating 64 partitions, but before having the customer take the downtime to go to 64 partitions, it would be nice to have some basis for justifying them.
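One way to gather that basis (a sketch; replace the table name, which is hypothetical here, with your own) is to look at the current per-partition row counts and segment sizes before committing to the reorganization:

```sql
-- Current row distribution across the 8 hash partitions
-- (assumes table statistics are reasonably fresh).
SELECT partition_name, num_rows
FROM   dba_tab_partitions
WHERE  table_name = 'MY_BIG_TABLE'
ORDER  BY partition_position;

-- Segment size per partition, to compare against the ~2 GB guideline.
SELECT partition_name, ROUND(bytes / 1024 / 1024 / 1024, 1) AS gb
FROM   dba_segments
WHERE  segment_name = 'MY_BIG_TABLE'
ORDER  BY partition_name;
```

If the 8 partitions are already balanced and simply too large, going to 64 (still a power of two) preserves the even spread while shrinking each partition by a factor of 8.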

  • Uneven distribution in Hash Partitioning

    Version :11.1.0.7.0 - 64bit Production
    OS :RHEL 5.3
I have range partitioning on the ACCOUNTING_DATE column with 24 monthly partitions.
To get rid of buffer busy waits on an index, I created a global hash-partitioned index using the DDL below.
DDL:
CREATE INDEX IDX_GL_BATCH_ID ON SL_JOURNAL_ENTRY_LINES (GL_BATCH_ID)
GLOBAL PARTITION BY HASH (GL_BATCH_ID) PARTITIONS 16
TABLESPACE OTC_IDX PARALLEL 8 INITRANS 8 MAXTRANS 8 PCTFREE 0 ONLINE;
After index creation, I realized that a single index hash partition got all the rows.
    select partition_name,num_rows from dba_ind_partitions where index_name='IDX_GL_BATCH_ID';
    PARTITION_NAME                   NUM_ROWS
    SYS_P77                                 0
    SYS_P79                                 0
    SYS_P80                                 0
    SYS_P81                                 0
    SYS_P83                                 0
    SYS_P84                                 0
    SYS_P85                                 0
    SYS_P87                                 0
    SYS_P88                                 0
    SYS_P89                                 0
    SYS_P91                                 0
    SYS_P92                                 0
    SYS_P78                                 0
    SYS_P82                                 0
    SYS_P86                                 0
SYS_P90                         256905355
As far as I understand, hash partitioning should distribute rows evenly. Looking at the above distribution, I don't think I got the benefit of multiple insert points from hash partitioning either.
    Here is index column statistics :
    select TABLE_NAME,COLUMN_NAME,NUM_DISTINCT,NUM_NULLS,LAST_ANALYZED,SAMPLE_SIZE,HISTOGRAM,AVG_COL_LEN from dba_tab_col_statistics where table_name='SL_JOURNAL_ENTRY_LINES'  and COLUMN_NAME='GL_BATCH_ID';
    TABLE_NAME                     COLUMN_NAME          NUM_DISTINCT  NUM_NULLS LAST_ANALYZED        SAMPLE_SIZE HISTOGRAM       AVG_COL_LEN
    SL_JOURNAL_ENTRY_LINES         GL_BATCH_ID                     1          0 2010/12/28 22:00:51    259218636 NONE                      4

It looks like the inserted data always has the same value for the partitioning key (NUM_DISTINCT = 1), so it is expected that a single partition is used, because:
    >
    For optimal data distribution, the following requirements should be satisfied:
    Choose a column or combination of columns that is unique or almost unique.
    Create multiple partitions and subpartitions for each partition that is a power of two. For example, 2, 4, 8, 16, 32, 64, 128, and so on.
    >
    See http://download.oracle.com/docs/cd/E11882_01/server.112/e16541/part_avail.htm#VLDBG1270.
Edited by: P. Forstmann on 29 Dec. 2010 09:06

  • Modify HUGE HASH partition table to RANGE partition and HASH subpartition

    I have a table with 130,000,000 rows hash partitioned as below
    ----RANGE PARTITION--
    CREATE TABLE TEST_PART(
    C_NBR CHAR(12),
    YRMO_NBR NUMBER(6),
    LINE_ID CHAR(2))
    PARTITION BY RANGE (YRMO_NBR)(
    PARTITION TEST_PART_200009 VALUES LESS THAN(200009),
    PARTITION TEST_PART_200010 VALUES LESS THAN(200010),
    PARTITION TEST_PART_200011 VALUES LESS THAN(200011),
PARTITION TEST_PART_MAX VALUES LESS THAN(MAXVALUE)
);
CREATE INDEX TEST_PART_IX_001 ON TEST_PART(C_NBR, LINE_ID);
    Data: -
    INSERT INTO TEST_PART
    VALUES ('2000',200001,'CM');
    INSERT INTO TEST_PART
    VALUES ('2000',200009,'CM');
INSERT INTO TEST_PART
VALUES ('2000',200010,'CM');
INSERT INTO TEST_PART
VALUES ('2006',NULL,'CM');
    COMMIT;
Now, I need to keep this table from growing by deleting records that fall within a specific range of YRMO_NBR values. I think it will be easy if I range partition on the YRMO_NBR field and then make the current hash partitioning a sub-partition.
How do I change the table from HASH partitioning to RANGE partitioning with HASH sub-partitions without losing the data and existing indexes?
    The table after restructuring should look like the one below
----COMPOSITE PARTITION-- RANGE PARTITION & HASH SUBPARTITION--
    CREATE TABLE TEST_PART(
    C_NBR CHAR(12),
    YRMO_NBR NUMBER(6),
    LINE_ID CHAR(2))
    PARTITION BY RANGE (YRMO_NBR)
    SUBPARTITION BY HASH (C_NBR) (
    PARTITION TEST_PART_200009 VALUES LESS THAN(200009) SUBPARTITIONS 2,
    PARTITION TEST_PART_200010 VALUES LESS THAN(200010) SUBPARTITIONS 2,
    PARTITION TEST_PART_200011 VALUES LESS THAN(200011) SUBPARTITIONS 2,
PARTITION TEST_PART_MAX VALUES LESS THAN(MAXVALUE) SUBPARTITIONS 2
);
CREATE INDEX TEST_PART_IX_001 ON TEST_PART(C_NBR,LINE_ID);
Please advise.
    Thanks in advance

Sorry for the confusion in the first part, where I had given a RANGE partition instead of a HASH partition. Please read it as follows:
    I have a table with 130,000,000 rows hash partitioned as below
    ----HASH PARTITION--
    CREATE TABLE TEST_PART(
    C_NBR CHAR(12),
    YRMO_NBR NUMBER(6),
    LINE_ID CHAR(2))
    PARTITION BY HASH (C_NBR)
    PARTITIONS 2
    STORE IN (PCRD_MBR_MR_02, PCRD_MBR_MR_01);
    CREATE INDEX TEST_PART_IX_001 ON TEST_PART(C_NBR,LINE_ID);
    Data: -
    INSERT INTO TEST_PART
    VALUES ('2000',200001,'CM');
    INSERT INTO TEST_PART
    VALUES ('2000',200009,'CM');
INSERT INTO TEST_PART
VALUES ('2000',200010,'CM');
INSERT INTO TEST_PART
VALUES ('2006',NULL,'CM');
    COMMIT;
Now, I need to keep this table from growing by deleting records that fall within a specific range of YRMO_NBR values. I think it will be easy if I range partition on the YRMO_NBR field and then make the current hash partitioning a sub-partition.
How do I change the table from hash partitioning to range partitioning with hash sub-partitions without losing the data and existing indexes?
    The table after restructuring should look like the one below
----COMPOSITE PARTITION-- RANGE PARTITION & HASH SUBPARTITION--
    CREATE TABLE TEST_PART(
    C_NBR CHAR(12),
    YRMO_NBR NUMBER(6),
    LINE_ID CHAR(2))
    PARTITION BY RANGE (YRMO_NBR)
    SUBPARTITION BY HASH (C_NBR) (
    PARTITION TEST_PART_200009 VALUES LESS THAN(200009) SUBPARTITIONS 2,
    PARTITION TEST_PART_200010 VALUES LESS THAN(200010) SUBPARTITIONS 2,
    PARTITION TEST_PART_200011 VALUES LESS THAN(200011) SUBPARTITIONS 2,
PARTITION TEST_PART_MAX VALUES LESS THAN(MAXVALUE) SUBPARTITIONS 2
);
CREATE INDEX TEST_PART_IX_001 ON TEST_PART(C_NBR,LINE_ID);
Please advise.
    Thanks in advance
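There is no single ALTER TABLE that converts hash partitioning to range-hash composite partitioning in place; the usual approach is to create the new composite-partitioned table and move the data into it, for example with online redefinition so the table stays available. A sketch under those assumptions (the schema name and interim table name are hypothetical; TEST_PART_NEW would be created with the target RANGE/HASH definition shown above):

```sql
-- Verify the table is a candidate for online redefinition.
BEGIN
  DBMS_REDEFINITION.CAN_REDEF_TABLE('MYSCHEMA', 'TEST_PART',
                                    DBMS_REDEFINITION.CONS_USE_ROWID);
END;
/

-- Copy the data into the new composite-partitioned layout while
-- TEST_PART remains available for DML.
BEGIN
  DBMS_REDEFINITION.START_REDEF_TABLE('MYSCHEMA', 'TEST_PART',
                                      'TEST_PART_NEW',
                                      options_flag => DBMS_REDEFINITION.CONS_USE_ROWID);
  -- Recreate indexes and constraints on TEST_PART_NEW at this point
  -- (or use COPY_TABLE_DEPENDENTS), then swap the definitions:
  DBMS_REDEFINITION.FINISH_REDEF_TABLE('MYSCHEMA', 'TEST_PART',
                                       'TEST_PART_NEW');
END;
/
```
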

  • Cost to change hash partition key column in a history table

    Hi All,
    I have the following scenario.
We have a history table in production which has 16 hash partitions based on key_column.
The nature of the data in this history table is that it has 878 distinct values of key_column and about 1000 million rows, and all partitions are in the same tablespace.
Now we have a Pro*C module which purges data from this history table in the following way:
> DELETE FROM history_tab
> WHERE p_date < (TO_DATE(sysdate+1, 'YYYYMMDD') - 210)
> AND t_date < (TO_DATE(sysdate+1, 'YYYYMMDD') - 210)
> AND ROWNUM <= 210;
(p_date and t_date are two of the date columns in the history table.) Data is deleted using these two date column conditions, but the partition key_column is a different column.
So as per the above statement, this history table holds about 7 months' worth of data.
The DBA is asking us to change this query and partition by date instead. Would it be proper to change the partition key_column (the existing hash partition key_column has 810 distinct values), and what do we need to consider to calculate the cost of this hash partition key_column change (if changing the partition key_column is appropriate at all)? I hope I explained my problem clearly, and I am waiting for your suggestions.
    Thanks in advance.

Hi Sir,
Many thanks for the reply.
For the first point: we plan to move the database to 10g after a lot of hassle with the client.
For the second point: if we partition by date or by week we will have 30 or 7 partitions. As you suggested, since we have 16 partitions in the table, the best approach would be to partition by week; then we will have 7 partitions and each query will hit 7 partitions.
For the third point: our main aim is to reduce the runtime of a job (a Pro*C program) which contains the following delete query against a history table. According to the query, it deletes data every day over a 7-month window, and while deleting it queries this huge table by date. So in this case, which would be more suitable: hash partitioning, range partitioning, or range/hash composite partitioning?
DELETE FROM history_tab
WHERE p_date < (TO_DATE(sysdate+1, 'YYYYMMDD') - 210)
AND t_date < (TO_DATE(sysdate+1, 'YYYYMMDD') - 210)
AND ROWNUM <= 210;
I have read that hash partitioning is used so that data is evenly distributed across all partitions (though that depends on the nature of the data). For my case, I would like your suggestions on the best approach.

  • Design capture of hash partitioned tables

    Hi,
    Designer version 9.0.2.94.11
I am trying to capture from a server model where the tables are hash partitioned, but this fails because Designer only knows about range partitions. Does anyone know how I can get Designer to capture these tables and their constraints?
    Thanks
    Pete

    Pete,
I have tried all three "current" Designer clients (6i, 9i, and 10g) at the "current" revision of the repository (I can post details if interested). I have also trawled the net for instances of this; there are many.
As stated by Sue, the Designer product model does not support this functionality (details can be found on Oracle Metalink under [Bug No. 1484454] if you have access; if not, see the excerpt below). It appears that at the moment Oracle has no urgent plans to change this (the excerpt was raised in 2001 and last updated in May 2004).
    Composite partitioning and List partitioning are equally affected.
    >>>>> ORACLE excerpt details STARTS >>>>>
    CDS-18014 Error: Table Partition 'P1' has a null String parameter
    'valueLessThan' in file ..\cddo\cddotp.cpp function
    cddotp_table_partition::cddotp_table_partition and line 122
    *** 03/02/01 01:16 am ***
    *** 06/19/01 03:49 am *** (CHG: Pri->2)
    *** 06/19/01 03:49 am ***
    Publishing bug, and upping priority - user is stuck hitting this issue.
    *** 09/27/01 04:23 pm *** (CHG: FixBy->9.0.2?)
    *** 10/03/01 08:30 am *** (CHG: FixBy->9.1)
    *** 10/03/01 08:30 am ***
    This should be considered seriously when looking at ERs we should be able to
    do this
    *** 05/01/02 04:37 pm ***
    *** 05/02/02 11:44 am ***
    I have reproduced this problem in 6.5.82.2.
    *** 05/02/02 11:45 am *** ESCALATION -> WAITING
    *** 05/20/02 07:38 am ***
    *** 05/20/02 07:38 am *** ESCALATED
    *** 05/28/02 11:24 pm *** (CHG: FixBy->9.0.3)
    *** 05/30/02 06:23 am ***
    Hash partitioning is not modelled in repository and to do so would require a
    major model change. This is not feasible at the moment but I am leaving this
    open as an enhancement request because it is a much requested facility.
    Although we can't implement this I think we should try to detect 'partition by
    hash', output a warning message that it is not supported and then ignore it.
    At least then capture can continue. If this is possible, it should be tested
    and the status re-set to '15'
    *** 05/30/02 06:23 am *** (CHG: FixBy->9.1)
    *** 06/06/02 02:16 am *** (CHG: Sta->15)
    *** 06/06/02 02:16 am RESPONSE ***
    It was not possible to ignore the HASH and continue processing without a
    considerable amount of work so we have not made any changes. The existing
    ERROR message highlights that the problem is with the partition. To enable
    the capture to continue the HASH clause must be removed from the file.
    *** 06/10/02 08:32 am *** ESCALATION -> CLOSED
    *** 06/10/02 09:34 am RESPONSE ***
    *** 06/12/02 06:17 pm RESPONSE ***
    *** 08/14/02 06:07 am *** (CHG: FixBy->10)
    *** 01/16/03 10:05 am *** (CHG: Asg->NEW OWNER)
    *** 02/13/03 06:02 am RESPONSE ***
    *** 05/04/04 05:58 am RESPONSE ***
    *** 05/04/04 07:15 am *** (CHG: Sta->97)
    *** 05/04/04 07:15 am RESPONSE ***
    <<<<< ORACLE excerpt details ENDS <<<<<
I (like, I'm sure, many of us) have an urgent, immediate need for this sort of functionality, and have therefore resolved to look at some form of post-processing to produce the required output.
I imagine it will be necessary to flag the Designer meta-data content and then manipulate the generator output once it has done its "raw" generation as RANGE partition DDL (probably by using the VALUE_LESS_THAN field, as it is mandatory, and meaningless for HASH partitions!).
An alternative would be to write an API-level generator for this using the same flag, probably in PL/SQL.
If you have (or anyone else has) any ideas on this, then I'd be happy to share them and see what we can cobble together in the absence of an Oracle interface to their own product.
    Peter

  • Hash Partition on non unique column

I have dimension and fact tables with an expected early growth of 250 million rows.
I have hash partitioned the dimension table, where I have a unique key, but the fact table does not have a single unique key column (the fact is a child of the dimension).
Since the fact table doesn't have a single unique column, how do I hash partition the fact table?
    Need advice.

Hi there, I have a similar kind of situation and need your suggestions.
We are using Oracle Database 11g Enterprise Edition Release 11.2.0.3.0 - 64bit Production.
We have HR data of approximately 270 million rows for 100,000 persons in an SAP module, and we need to load this data into an Oracle table.
This is accumulated data for the past 2 years starting from Jan 2011, and the assumption is that about 600,000 rows will be loaded incrementally every day.
The data granularity is available at the fraction-of-a-day level, known as TIME_TYPE, for a given day.
For example, a person can have multiple records on a given day, depending on the TIME_TYPE.
Sample data:
Pers_ID  Payroll_date  Time_Type                Day_Hrs
1960     1-Jan-11      Maximum Vacation         4
1960     1-Jan-11      Vacation Quota Maximum   2
1960     1-Jan-11      Maximum Sick             2
1960     2-Jan-11      Paid Eligible OT Hrs     3
1960     2-Jan-11      Paid Hours               3
1960     2-Jan-11      WW Eligible OT Hrs       2
1960     5-Jan-11      Daily Overtime Hours     2
1960     5-Jan-11      Weekly Additional Hours  2
1960     5-Jan-11      Personal Quota Balance   2
1960     5-Jan-11      Total Overtime Hours     2
The above data is for an individual person: his time spent on particular days over a 3-day period.
My question is how best to design the (partitioned) table so that the data loading process is fast, for the initial load as well as the daily incremental load.
We also have a lot of reporting on this table, with reports such as the total number of hours used by a particular employee, the most-used time_type across a group of employees, and so on.
Please let me know your suggestions and thoughts.
    Edited by: user3084749 on Feb 5, 2013 1:46 PM

  • A question on Hash Partition

    Hi,
I'm facing a problem. My table has 16 partitions, and all of them are hash partitions.
But I found that only one partition is being populated heavily, nearly 3-4 times as much as the other partitions.
My database version is 9i.
Can anyone advise me on this?
    Thanks in advance
Say my table structure is like this:
CREATE TABLE TAB1 (COL1 NUMBER, COL2 VARCHAR2(10), COL3 VARCHAR2(10))
PARTITION BY HASH (COL3) (
PARTITION P1 TABLESPACE TS1,
/* ... P2 through P15 ... */
PARTITION P16 TABLESPACE TS1
);
And I have only one index, i.e.
create index indx on tab1(col1,col2,col3);
    Edited by: bp on Feb 17, 2009 4:40 AM

bp wrote:
My table has nearly 1000 million rows, as it is a history table.
The partition key (col3) has 926 distinct values.
One thing is sure: as the cardinality of col3 is very low compared with the amount of data in the table, the data is not evenly distributed.
Now another problem: one value of col3 (say col3 = 1) out of the 926 distinct values goes only to partition p16, and surprisingly no other value goes to that partition. We have no other objects on this table controlling the flow of data between partitions. I really could not find any reason for such behaviour.

I'm not sure I understand what you are attempting to describe. You mean to say that in partition p16 only one value of COL3 is found, and this partition holds more rows than the other partitions, whereas the remaining partitions cover more COL3 values but hold fewer rows?
A single COL3 value always maps to the same hash value, so as long as you don't change the number of hash partitions and cause a "rebalancing", the same value will always map to the same partition (it still does after rebalancing, but it might then be a different partition). You might simply be unlucky in that no value other than "1" currently maps to that hash value. You could think about adding or removing hash partitions to change the distribution of the rows, but this could be quite an expensive operation given the amount of data in your table.
Are these 926 distinct values evenly distributed, or is the data skewed in this column? Your description suggests the data is skewed, since a single value in one partition accounts for more rows than other partitions that cover multiple values.
You could do a simple
SELECT COUNT(*), COL3
FROM TAB1
GROUP BY COL3
ORDER BY COUNT(*) DESC
to find this out, or check the column statistics to see whether a histogram on that column describes the skew. If the query takes too long, use a SAMPLE clause (... FROM TAB1 SAMPLE (1) ... samples 1 percent); you then need to scale the counts by the sampling factor.
    Regards,
    Randolf
    Oracle related stuff blog:
    http://oracle-randolf.blogspot.com/
    SQLTools++ for Oracle (Open source Oracle GUI for Windows):
    http://www.sqltools-plusplus.org:7676/
    http://sourceforge.net/projects/sqlt-pp/

  • Hash partition algorithm

If I hash partition a table on CUSTOMER_ID into, say, p partitions, and I receive a daily batch feed of flat-file transaction records containing CUSTOMER_ID, I need to split the batch of incoming source records into p parts, each corresponding to one of the p partitions. I can do this if I am able to execute the same hash algorithm with CUSTOMER_ID as a parameter, giving me a number between 1 and p. Then I know which partition Oracle has assigned each CUSTOMER_ID to, and I can distribute the batch records among parallel threads with affinity between threads and table partitions.
Can anybody let me know if the hash algorithm is available to call? Is it available in any package?

I hope I understood your requirement correctly: you want to divide the input file into 3 files corresponding to the partitions, right?
Since your partitioned table is based on a hash algorithm, nothing is obvious.
But since you are doing updates only, you could have a pre-check in the database to find out which partition each row is in: based on the partition key read from the input file, write one file per partition accordingly, and then run your 3 batches against the different partitions, each on its own file. It will require one full scan of the input file before processing, so I don't know how much gain you could hope for from such a thing, though.
    Nicolas.
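One function worth investigating here is ORA_HASH; with a power-of-two partition count, ORA_HASH(key, p - 1) is generally reported to match Oracle's internal hash-partition assignment, but verify this against your own version and data before building the batch split on it. A sketch (table and column names are hypothetical, p = 8 assumed):

```sql
-- ORA_HASH(expr, max_bucket) returns a bucket in 0..max_bucket.
-- For 8 hash partitions, buckets 0..7 map to partitions 1..8.
SELECT customer_id,
       ORA_HASH(customer_id, 7) + 1 AS predicted_partition
FROM   transactions
WHERE  ROWNUM <= 10;

-- Cross-check a single value by asking Oracle directly which
-- partition it lives in (PARTITION FOR is 11g+ syntax):
SELECT COUNT(*)
FROM   transactions PARTITION FOR (12345)
WHERE  customer_id = 12345;
```

If the prediction holds on your system, the batch splitter can compute the same expression on CUSTOMER_ID while reading the flat file and route each record to the thread owning that partition.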
