Hash Partition on non unique column

I have Dim and Fact tables with an expected volume growth of 250 million rows per year.
I have hash partitioned the Dim table, where I have a unique key, but the Fact table does not have a single unique key column (Fact is a child of Dim).
Since the Fact table doesn't have a single unique column, how do I hash partition the Fact table?
Need advice.
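Editor's sketch: hash partitioning does not require the partition key to be unique; it only needs enough distinct values to spread the rows. One common option is to hash partition the Fact table on the foreign key that points to the Dim table, with the same number of partitions on both sides. All table and column names below are hypothetical, since the original post gives no DDL, and 16 partitions is only an illustrative count.

```sql
-- Hypothetical names and sizes; adjust to the real schema.
CREATE TABLE dim_customer (
  dim_key   NUMBER         NOT NULL,
  dim_name  VARCHAR2(100),
  CONSTRAINT pk_dim_customer PRIMARY KEY (dim_key)
)
PARTITION BY HASH (dim_key) PARTITIONS 16;

-- The Fact table is hash partitioned on the non-unique foreign key column.
CREATE TABLE sales_fact (
  dim_key    NUMBER        NOT NULL,
  sale_date  DATE          NOT NULL,
  amount     NUMBER(12,2),
  CONSTRAINT fk_sales_fact_dim FOREIGN KEY (dim_key)
    REFERENCES dim_customer (dim_key)
)
PARTITION BY HASH (dim_key) PARTITIONS 16;
```

Partitioning both tables the same way on the join column is what makes partition-wise joins possible. Whether it spreads the Fact rows well still depends on how many distinct dim_key values there are and how skewed they are.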

Hi there, I have a similar situation and need your suggestions.
We are using Oracle Database 11g Enterprise Edition Release 11.2.0.3.0 - 64bit Production.
We have HR data of approximately 270 million records for 100,000 persons in an SAP module, and we need to load this data into an Oracle table.
This is data accumulated over the past 2 years, starting from Jan 2011, and the assumption is that about 600,000 records will be loaded incrementally every day.
The data granularity is a fraction of the day, known as TIME_TYPE, for a given day.
For example, a person can have multiple records on a given day, depending on the TIME_TYPE.
sample data:
Pers_ID  Payroll_date  Time_Type                Day_Hrs
1960     1-Jan-11      Maximum Vacation         4
1960     1-Jan-11      Vacation Quota Maximum   2
1960     1-Jan-11      Maximum Sick             2
1960     2-Jan-11      Paid Eligible OT Hrs     3
1960     2-Jan-11      Paid Hours               3
1960     2-Jan-11      WW Eligible OT Hrs       2
1960     5-Jan-11      Daily Overtime Hours     2
1960     5-Jan-11      Weekly Additional Hours  2
1960     5-Jan-11      Personal Quota Balance   2
1960     5-Jan-11      Total Overtime Hours     2
The above data is for one person and shows the time recorded on three separate days.
My question is how best to design the table (a partitioned table) so that data loading is fast, both for the initial load and for the daily incremental load.
We also have a lot of reporting on this table, for example the total number of hours used by a particular employee, or the most-used TIME_TYPE for a group of employees.
Please let me know your suggestions and thoughts.
Edited by: user3084749 on Feb 5, 2013 1:46 PM
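Editor's sketch: one layout that is often considered for this kind of daily-loaded, date-driven data on 11g is interval partitioning by month on the date column, with hash subpartitions on the person id. The table name, column types, index name and the subpartition count of 8 are assumptions for illustration, not something the poster specified.

```sql
-- Hypothetical DDL based on the sample columns in the post.
CREATE TABLE hr_time_fact (
  pers_id       NUMBER        NOT NULL,
  payroll_date  DATE          NOT NULL,
  time_type     VARCHAR2(40)  NOT NULL,
  day_hrs       NUMBER(5,2)
)
PARTITION BY RANGE (payroll_date)
  INTERVAL (NUMTOYMINTERVAL(1, 'MONTH'))   -- new monthly partitions are created automatically
SUBPARTITION BY HASH (pers_id) SUBPARTITIONS 8
(
  PARTITION p_2011_01 VALUES LESS THAN (DATE '2011-02-01')
);

-- A local index keeps index maintenance confined to the partitions being loaded.
CREATE INDEX ix_hr_time_fact_pers
  ON hr_time_fact (pers_id, payroll_date) LOCAL;
```

With this shape the daily load touches only the current month's partition, and per-employee reports can use the local index. Whether it suits the reporting mix here would need testing on real data.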

Similar Messages

Can I have a primary key as a non-unique column

Hi all,
I have a table with 35 columns, and only one column is a NOT NULL column. But the data in this column is not unique. I want to create a primary key with a non-unique index. Can I do it? If it is not possible, is there any other way to do it? Please help me with this.
Thanks for your help.
vinaykotha

1) Do the 'unique column combination' check first, using this example. (Please change the column names and the number of columns to match your candidate key.) The SQL is as follows:
;WITH CTE (ProductKey, CustomerKey, SalesTerritoryKey, DupRec)
AS
(SELECT ProductKey, CustomerKey, SalesTerritoryKey, ROW_NUMBER() OVER
(PARTITION BY ProductKey, CustomerKey, SalesTerritoryKey
ORDER BY ProductKey, CustomerKey, SalesTerritoryKey) AS DupRec
FROM dbo.FactSales)
SELECT *
FROM CTE
WHERE DupRec > 1;
2) If the CTE returns no records, you are good to create a composite CLUSTERED PRIMARY KEY, like:
ALTER TABLE dbo.FactSales
ADD CONSTRAINT PK_ProductKey_CustomerKey_SalesTerritoryKey
PRIMARY KEY CLUSTERED (ProductKey, CustomerKey, SalesTerritoryKey);
If there is no error, bingo! If there is, don't worry; create a composite index like this:
3)
CREATE NONCLUSTERED INDEX [IX_ProductKey_CustomerKey_SalesTerritoryKey]
ON dbo.FactSales(ProductKey, CustomerKey, SalesTerritoryKey);
This will still work efficiently. Transaction tables, such as a Daily_Order table or a Ship_Details table, are usually like that.
-NC

Script to make a non-unique column unique?

Hi all,
Currently I have a table that contains nothing but ID's (the rest of the table is to be populated at a later date).
In this table there are around 500,000,000 ID's, of which only about 35,000,000 are unique.
I need to run a script that will find all duplicate entries of an ID and delete them (only the duplicates). Only one copy of each ID should be in the table so that I can make the column the PK.
I can't think of how to do this. Any ideas?
Thanks,
fakelvis

Hello
There's a couple of ways you could do this, but I think creating a new table with the distinct list of ids, truncating the original and re-inserting the distinct may be the most optimal...if I understand correctly.
SQL> --Set up some test data to use with method 1 CTAS + Truncate
SQL> CREATE TABLE dt_test_big_tab AS SELECT object_id FROM dba_objects WHERE object_id IS NOT NULL
2 /
Table created.
SQL> INSERT INTO dt_test_big_tab SELECT * from dt_test_big_tab
2 /
36210 rows created.
SQL> /
72420 rows created.
SQL> /
144840 rows created.
SQL> /
289680 rows created.
SQL> /
579360 rows created.
SQL> /
1158720 rows created.
SQL> commit;
Commit complete.
SQL> SELECT COUNT(*) from dt_test_big_tab
2 /
COUNT(*)
   2317440
SQL> SELECT COUNT(DISTINCT object_id) from dt_test_big_tab
2 /
COUNT(DISTINCTOBJECT_ID)
                   36210
SQL> --Set up a copy of the test data to use with method 2 - Delete duplicates
SQL> CREATE TABLE dt_test_big_tab_2 as select * from dt_test_big_tab
2 /
Table created.
SQL> set timi on
SQL> --Method one, ctas + truncate
SQL> CREATE TABLE dt_test_unique AS SELECT DISTINCT object_id FROM dt_test_big_tab
2 /
Table created.
Elapsed: 00:00:02.07
SQL> SELECT COUNT(*) from dt_test_unique
2 /
COUNT(*)
     36210
Elapsed: 00:00:00.00
SQL> SELECT COUNT(DISTINCT object_id) from dt_test_unique
2 /
COUNT(DISTINCTOBJECT_ID)
                   36210
Elapsed: 00:00:00.00
SQL> TRUNCATE TABLE dt_test_big_tab
2 /
Table truncated.
Elapsed: 00:00:01.02
SQL> INSERT INTO dt_test_big_tab(object_id) SELECT object_id FROM dt_test_unique
2 /
36210 rows created.
Elapsed: 00:00:00.00
SQL> SELECT COUNT(*) from dt_test_big_tab
2 /
COUNT(*)
     36210
Elapsed: 00:00:00.01
SQL> SELECT COUNT(DISTINCT object_id) from dt_test_big_tab
2 /
COUNT(DISTINCTOBJECT_ID)
                   36210
Elapsed: 00:00:00.00
SQL> --Method 2, delete duplicates
SQL> DELETE
FROM
    dt_test_big_tab_2
WHERE
    rowid IN (SELECT
                  del_rowid
              FROM
                  (SELECT
                       rowid del_rowid,
                       ROW_NUMBER() OVER (PARTITION BY object_id ORDER BY object_id) rn
                   FROM
                       dt_test_big_tab_2
                  )
              WHERE
                  rn > 1
             );
2281230 rows deleted.
Elapsed: 00:05:39.06
SQL> select count(*) from dt_test_big_tab_2;
COUNT(*)
     36210
Elapsed: 00:00:00.03
SQL> select count(distinct object_id) from dt_test_big_tab_2;
COUNT(DISTINCTOBJECT_ID)
                   36210
Elapsed: 00:00:00.00
HTH
David
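Editor's sketch: another commonly used form of the "delete duplicates" approach keeps the row with the lowest ROWID for each id, and the primary key can be added once the column is unique. This reuses the dt_test_big_tab_2 table from the transcript above; the constraint name is made up, and no timing is claimed for it.

```sql
-- Keep one row per object_id (the one with the smallest ROWID), delete the rest.
DELETE FROM dt_test_big_tab_2 t
WHERE  t.rowid > (SELECT MIN(d.rowid)
                  FROM   dt_test_big_tab_2 d
                  WHERE  d.object_id = t.object_id);

COMMIT;

-- Once only distinct, non-null ids remain, the column can become the primary key.
ALTER TABLE dt_test_big_tab_2
  ADD CONSTRAINT pk_dt_test_big_tab_2 PRIMARY KEY (object_id);
```

With roughly 93% of the rows being duplicates, as in the original question, rebuilding the table from the distinct ids (method 1 above) is likely to remain the cheaper route.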

Oracle 11.2 - Perform parallel DML on a non partitioned table with LOB column

Hi,
Since I wanted to demonstrate new Oracle 12c enhancements on SecureFiles, I tried to use PDML statements on a non partitioned table with LOB column, in both Oracle 11g and Oracle 12c releases. The Oracle 11.2 SecureFiles and Large Objects Developer's Guide of January 2013 clearly says:
Parallel execution of the following DML operations on tables with LOB columns is supported. These operations run in parallel execution mode only when performed on a partitioned table. DML statements on non-partitioned tables with LOB columns continue to execute in serial execution mode.
INSERT AS SELECT
CREATE TABLE AS SELECT
DELETE
UPDATE
MERGE (conditional UPDATE and INSERT)
Multi-table INSERT
So I created and populated a simple table with a BLOB column:
SQL> CREATE TABLE T1 (A BLOB);
Table created.
Then, I tried to see the execution plan of a parallel DELETE:
SQL> EXPLAIN PLAN FOR
2 delete /*+parallel (t1,8) */ from t1;
Explained.
SQL> select * from table(dbms_xplan.display);
PLAN_TABLE_OUTPUT
Plan hash value: 3718066193
| Id | Operation             | Name     | Rows | Cost (%CPU)| Time     |    TQ |IN-OUT| PQ Distrib |
|   0 | DELETE STATEMENT      |          | 2048 |     2   (0)| 00:00:01 |        |      |            |
|   1 | DELETE               | T1       |       |            |          |        |      |            |
|   2 |   PX COORDINATOR      |          |       |            |          |        |      |            |
|   3 |    PX SEND QC (RANDOM)| :TQ10000 | 2048 |     2   (0)| 00:00:01 | Q1,00 | P->S | QC (RAND) |
|   4 |     PX BLOCK ITERATOR |          | 2048 |     2   (0)| 00:00:01 | Q1,00 | PCWC |            |
|   5 |      TABLE ACCESS FULL| T1       | 2048 |     2   (0)| 00:00:01 | Q1,00 | PCWP |            |
PLAN_TABLE_OUTPUT
Note
   - dynamic sampling used for this statement (level=2)
And I finished by executing the statement.
SQL> commit;
Commit complete.
SQL> alter session enable parallel dml;
Session altered.
SQL> delete /*+parallel (t1,8) */ from t1;
2048 rows deleted.
As we can see, the statement has been run as parallel:
SQL> select * from v$pq_sesstat;
STATISTIC                      LAST_QUERY SESSION_TOTAL
Queries Parallelized                    1             1
DML Parallelized                        0             0
DDL Parallelized                        0             0
DFO Trees                               1             1
Server Threads                          5             0
Allocation Height                       5             0
Allocation Width                        1             0
Local Msgs Sent                        55            55
Distr Msgs Sent                         0             0
Local Msgs Recv'd                      55            55
Distr Msgs Recv'd                       0             0
11 rows selected.
Is it normal ? It is not supposed to be supported on Oracle 11g with non-partitioned table containing LOB column....
Thank you for your help.
Michael

Yes, I did. I tried with forced parallel DML, and these are the results on my 12c database, with the non-partitioned table and a SecureFiles LOB column.
SQL> explain plan for delete from t1;
Explained.
| Id | Operation             | Name     | Rows | Cost (%CPU)| Time     |    TQ |IN-OUT| PQ Distrib |
|   0 | DELETE STATEMENT      |          |     4 |     2   (0)| 00:00:01 |        |      |            |
|   1 | DELETE               | T1       |       |            |          |        |      |            |
|   2 |   PX COORDINATOR      |          |       |            |          |        |      |            |
|   3 |    PX SEND QC (RANDOM)| :TQ10000 |     4 |     2   (0)| 00:00:01 | Q1,00 | P->S | QC (RAND) |
|   4 |     PX BLOCK ITERATOR |          |     4 |     2   (0)| 00:00:01 | Q1,00 | PCWC |            |
|   5 |      TABLE ACCESS FULL| T1       |     4 |     2   (0)| 00:00:01 | Q1,00 | PCWP |            |
The DELETE is not performed in Parallel.
I tried with another statement :
SQL> explain plan for
2        insert into t1 select * from t1;
Here are the results:
11g
| Id | Operation                | Name     | Rows | Bytes | Cost (%CPU)| Time     |    TQ |IN-OUT| PQ Distrib |
|   0 | INSERT STATEMENT         |          |     4 | 8008 |     2   (0)| 00:00:01 |        |      |            |
|   1 | LOAD TABLE CONVENTIONAL | T1       |       |       |            |          |        |      |            |
|   2 |   PX COORDINATOR         |          |       |       |            |          |        |      |            |
|   3 |    PX SEND QC (RANDOM)   | :TQ10000 |     4 | 8008 |     2   (0)| 00:00:01 | Q1,00 | P->S | QC (RAND) |
|   4 |     PX BLOCK ITERATOR    |          |     4 | 8008 |     2   (0)| 00:00:01 | Q1,00 | PCWC |            |
|   5 |      TABLE ACCESS FULL   | T1       |     4 | 8008 |     2   (0)| 00:00:01 | Q1,00 | PCWP |            |
12c
| Id | Operation                          | Name     | Rows | Bytes | Cost (%CPU)| Time     |    TQ |IN-OUT| PQ Distrib |
|   0 | INSERT STATEMENT                   |          |     4 | 8008 |     2   (0)| 00:00:01 |        |      |            |
|   1 | PX COORDINATOR                    |          |       |       |            |          |        |      |            |
|   2 |   PX SEND QC (RANDOM)              | :TQ10000 |     4 | 8008 |     2   (0)| 00:00:01 | Q1,00 | P->S | QC (RAND) |
|   3 |    LOAD AS SELECT                  | T1       |       |       |            |          | Q1,00 | PCWP |            |
|   4 |     OPTIMIZER STATISTICS GATHERING |          |     4 | 8008 |     2   (0)| 00:00:01 | Q1,00 | PCWP |            |
|   5 |      PX BLOCK ITERATOR             |          |     4 | 8008 |     2   (0)| 00:00:01 | Q1,00 | PCWC |            |
It seems that the DELETE statement has problems but not the INSERT AS SELECT !
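Editor's sketch: the v$pq_sesstat output in the first post already separates the two cases, since it reports "Queries Parallelized" and "DML Parallelized" on different rows. A narrower check, assuming the same T1 table and hint as above, could look like this:

```sql
ALTER SESSION ENABLE PARALLEL DML;

DELETE /*+ parallel(t1, 8) */ FROM t1;

-- If only 'Queries Parallelized' is non-zero, the scan ran in parallel
-- but the DELETE itself was applied serially.
SELECT statistic, last_query, session_total
FROM   v$pq_sesstat
WHERE  statistic IN ('Queries Parallelized', 'DML Parallelized');
```

This matches the plans shown: the DELETE row sits above PX COORDINATOR, so only the full table scan below it is distributed to the parallel servers.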

Display columns as rows from non-unique key table

Hi OTN/Users, I hope you can assist me
Given a table:
create table t (a varchar2(30), b int, c date );
with this data within:
insert into t values ( 'a1', 40, to_date( '01-Dec-2012', 'DD-Mon-YYYY'));
insert into t values ( 'a1', 50, to_date( '01-Dec-2012', 'DD-Mon-YYYY'));
insert into t values ( 'a1', 60, to_date( '01-Dec-2012', 'DD-Mon-YYYY'));
insert into t values ( 'b1', 10, to_date( '01-Dec-2012', 'DD-Mon-YYYY'));
insert into t values ( 'b1', 20, to_date( '01-Dec-2012', 'DD-Mon-YYYY'));
insert into t values ( 'b1', 30, to_date( '01-Dec-2012', 'DD-Mon-YYYY'));
insert into t values ( 'c1', 60, to_date( '01-Dec-2012', 'DD-Mon-YYYY'));
insert into t values ( 'c1', 70, to_date( '01-Dec-2012', 'DD-Mon-YYYY'));
insert into t values ( 'c1', 80, to_date( '01-Dec-2012', 'DD-Mon-YYYY'));
- I want to output the columns for each of 'a' as a single row e.g:
a1 40 50 60 01-Dec-2012
b1 10 20 30 01-Dec-2012
I've almost got it right, but the 'a' col repeats 4 times for each row of output:
a1 40
a1 50
a1 60
a1 01-Dec-2012
- I want to suppress the repeated output of the first column 'a' but display the rest on a single line.
I've tried various things (Pivot, Rollup etc.), but the fact that I'm keying on a table with non-unique rows has perhaps complicated things.
Any help would be much appreciated

Hi,
Pre-11g this is how you would do it:
[11.2] Pri @ Bepripd1 > !cat t.sql
with t(a,b,c) as (
     select 'a1', 40, to_date( '01-Dec-2012') from dual union all
     select 'a1', 50, to_date( '01-Dec-2012') from dual union all
     select 'a1', 60, to_date( '01-Dec-2012') from dual union all
     select 'b1', 10, to_date( '01-Dec-2012') from dual union all
     select 'b1', 20, to_date( '01-Dec-2012') from dual union all
     select 'b1', 30, to_date( '01-Dec-2012') from dual union all
     select 'c1', 60, to_date( '01-Dec-2012') from dual union all
     select 'c1', 70, to_date( '01-Dec-2012') from dual union all
     select 'c1', 80, to_date( '01-Dec-2012') from dual
)
------ end of sample data ------
select
     a
     ,max(decode(n,1,b,null)) q1
     ,max(decode(n,2,b,null)) q2
     ,max(decode(n,3,b,null)) q3
     ,c
from (
     select a, b, c, row_number() over (partition by a order by b) n
     from t
)
group by a,c
order by a,c;
[11.2] Pri @ Bepripd1 > @t
A          Q1         Q2         Q3 C
a1         40         50         60 01/12/2012 00:00:00
b1         10         20         30 01/12/2012 00:00:00
c1         60         70         80 01/12/2012 00:00:00
------
From 11g onward, you would:
[11.2] Pri @ Bepripd1 > !cat t.sql
with t(a,b,c) as (
     select 'a1', 40, to_date( '01-Dec-2012') from dual union all
     select 'a1', 50, to_date( '01-Dec-2012') from dual union all
     select 'a1', 60, to_date( '01-Dec-2012') from dual union all
     select 'b1', 10, to_date( '01-Dec-2012') from dual union all
     select 'b1', 20, to_date( '01-Dec-2012') from dual union all
     select 'b1', 30, to_date( '01-Dec-2012') from dual union all
     select 'c1', 60, to_date( '01-Dec-2012') from dual union all
     select 'c1', 70, to_date( '01-Dec-2012') from dual union all
     select 'c1', 80, to_date( '01-Dec-2012') from dual
)
------ end of sample data ------
select a,q1,q2,q3,c
from (
     select a, b, c, row_number() over (partition by a order by b) n
     from t
)
pivot (
     max(b)
     for n in (
          1 as q1
          ,2 as q2
          ,3 as q3
     )
)
order by a,c;
[11.2] Pri @ Bepripd1 > @t
A          Q1         Q2         Q3 C
a1         40         50         60 01/12/2012 00:00:00
b1         10         20         30 01/12/2012 00:00:00
c1         60         70         80 01/12/2012 00:00:00
Edited by: Nicosa on Nov 9, 2012 2:42 PM

WHat is the best index type for non uniqueness / Varchar columns in SQL 2008 R2

Hello All Greetings,
Please help me with my question.
My table has about a million rows and about 20 columns. Two of the columns, named Period and Gender, are used in the WHERE clause most of the time.
Gender contains either M or F, and Period contains YYYY-Month (2013-December, 2013-August, etc.). I would like to add an index on these two columns to improve performance, so please let me know what type of index I need to add to these columns in the table.
Please note that we add data to the table only once, which takes only 2 minutes, but we query the table every day.
So my question is: what is the best index type to create on columns with non-unique values?
Thank you in advance,
Milan

There is nothing whatever wrong with creating an index on a VARCHAR column, or set of columns.
Regarding the performance of VARCHAR/INT, as with everything in an RDBMS, it depends on what you are doing. What you may be thinking of is the fact that clustering a table on a VARCHAR key is (in SQL Server) marginally less efficient than clustering on a monotonically increasing numerical key, and can introduce fragmentation.
Or you may be thinking of what you have heard about writing JOINs on VARCHAR columns. It is true that this is a little less efficient than a JOIN on a numeric type, but only a little, and nothing that should lead you to never join on varchar columns.
None of this means that you should not create indexes on VARCHAR columns. A needed index on a VARCHAR column will boost query performance, often by orders of magnitude. If you need an index on a VARCHAR, create it. It makes no sense to try to find an integer column to create the index on instead; the engine will never use it.
Check this reference: http://stackoverflow.com/questions/14041481/is-it-good-to-create-a-nonclustered-index-on-a-column-of-type-varchar
Mohammad Nizamuddin
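Editor's sketch: for the Period and Gender filter in the question, an ordinary nonclustered index on both columns is the usual starting point. The table and index names are hypothetical, and putting Period first is an assumption based on it having more distinct values than Gender (which only holds M or F).

```sql
-- Hypothetical table name; Period leads because it is the more selective column.
CREATE NONCLUSTERED INDEX IX_MyTable_Period_Gender
    ON dbo.MyTable (Period, Gender);
```

Whether the optimizer uses it depends on how many rows a given Period and Gender pair returns and on which other columns the query selects, so it is worth checking the actual execution plan.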

Cost to change hash partition key column in a history table

Hi All,
I have the following scenario.
We have a history table in production which has 16 hash partitions based on key_column.
The history table has 878 distinct values of key_column and about 1000 million rows, and all partitions are in the same tablespace.
Now we have a Pro*C module which purges data from this history table in the following way:
DELETE FROM hsitory_tab
WHERE p_date < (TO_DATE(sysdate+1, 'YYYYMMDD') - 210)
AND t_date < (TO_DATE(sysdate+1, 'YYYYMMDD') - 210)
AND ROWNUM <= 210;
Data is deleted using conditions on these two date columns (p_date and t_date are two of the columns in the history table), but the partition key column is a different column.
So, as per the above statement, this history table contains 6 months of data.
The DBA is asking us to change this query and partition by date. Would it be appropriate to change the partition key column (the existing hash partition key_column has 810 distinct values), and what do we need to consider to calculate the cost of changing the hash partition key column (if changing it is appropriate)? I hope I have explained my problem clearly, and I am waiting for your suggestions.
Thanks in advance.

Hi Sir,
Many thanks for the reply.
For the first point:
We plan to move the database to 10g, after a lot of hassle with the client.
For the second point:
If we partition by date or by week we will have 30 or 7 partitions. As you suggested, since we have 16 partitions in the table, the best approach would be to partition by week; then we will have 7 partitions and each query will hit 7 partitions.
For the third point:
Our main aim is to reduce the run time of a job (a Pro*C program) which contains the following delete query to delete data from a history table. According to the query, it deletes data older than about 7 months every day, and while deleting it queries this huge table by date. So in this case, which would be more suitable: hash partitioning, range partitioning, or hash/range partitioning?
DELETE FROM hsitory_tab
WHERE p_date < (TO_DATE(sysdate+1, 'YYYYMMDD') - 210)
AND t_date < (TO_DATE(sysdate+1, 'YYYYMMDD') - 210)
AND ROWNUM <= 210;
I have read that hash partitioning is used so that data is evenly distributed across all partitions (though it depends on the nature of the data). In my case I would like some suggestions from you on the best approach.
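Editor's sketch: when a purge is driven purely by date, range partitioning on the date column lets old data be removed by dropping a partition instead of deleting rows. The table name, column types, payload column and partition boundaries below are hypothetical; only key_column, p_date and t_date come from the post.

```sql
-- Hypothetical range-partitioned version of the history table.
CREATE TABLE history_tab_by_date (
  key_column  NUMBER         NOT NULL,
  p_date      DATE           NOT NULL,
  t_date      DATE,
  payload     VARCHAR2(100)
)
PARTITION BY RANGE (p_date) (
  PARTITION p_2012_06 VALUES LESS THAN (TO_DATE('2012-07-01', 'YYYY-MM-DD')),
  PARTITION p_2012_07 VALUES LESS THAN (TO_DATE('2012-08-01', 'YYYY-MM-DD')),
  PARTITION p_max     VALUES LESS THAN (MAXVALUE)
);

-- Purge a whole month at once instead of deleting row by row.
ALTER TABLE history_tab_by_date
  DROP PARTITION p_2012_06 UPDATE GLOBAL INDEXES;
```

Note that the original DELETE filters on both p_date and t_date. Dropping a partition keyed on p_date is only equivalent if every row in that partition also satisfies the t_date condition, which would need to be confirmed first.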

OWB Dataprofiling - Unique Key Analysis on non-number columns

Does anyone know an easy way to enable unique key analysis on non-Number columns (OWB 10.2.0.3)?
It seems that the profiler by default disables
'Use in relationship discovery' when the documented datatype is non-Number.
Is this a setting which can be configured for OWB, or is there a smart way to set this property for all columns?
Thanks in advance

Hi, I don't think there is a way to automatically switch this on/off, but a small script can be created to set the option on/off for all columns in the table.
Cheers
David

Uneven distribution in Hash Partitioning

Version :11.1.0.7.0 - 64bit Production
OS :RHEL 5.3
I have a range partitioning on ACCOUNTING_DATE column and have 24 monthly partitions.
To get rid of buffer busy waits on the index, I created a global partitioned index using the DDL below.
DDL :
CREATE INDEX IDX_GL_BATCH_ID ON SL_JOURNAL_ENTRY_LINES(GL_BATCH_ID)
GLOBAL PARTITION BY HASH (GL_BATCH_ID) PARTITIONS 16 TABLESPACE OTC_IDX PARALLEL 8 INITRANS 8 MAXTRANS 8 PCTFREE 0 ONLINE;
After index creation, I realized that only one index hash partition got all the rows.
select partition_name,num_rows from dba_ind_partitions where index_name='IDX_GL_BATCH_ID';
PARTITION_NAME                   NUM_ROWS
SYS_P77                                 0
SYS_P79                                 0
SYS_P80                                 0
SYS_P81                                 0
SYS_P83                                 0
SYS_P84                                 0
SYS_P85                                 0
SYS_P87                                 0
SYS_P88                                 0
SYS_P89                                 0
SYS_P91                                 0
SYS_P92                                 0
SYS_P78                                 0
SYS_P82                                 0
SYS_P86                                 0
SYS_P90                         256905355
As far as I understand, HASH partitioning should distribute rows evenly. Looking at the above distribution, I don't think I got the benefit of multiple insert points from HASH partitioning either.
Here is index column statistics :
select TABLE_NAME,COLUMN_NAME,NUM_DISTINCT,NUM_NULLS,LAST_ANALYZED,SAMPLE_SIZE,HISTOGRAM,AVG_COL_LEN from dba_tab_col_statistics where table_name='SL_JOURNAL_ENTRY_LINES' and COLUMN_NAME='GL_BATCH_ID';
TABLE_NAME                     COLUMN_NAME          NUM_DISTINCT NUM_NULLS LAST_ANALYZED        SAMPLE_SIZE HISTOGRAM       AVG_COL_LEN
SL_JOURNAL_ENTRY_LINES         GL_BATCH_ID                     1          0 2010/12/28 22:00:51    259218636 NONE                      4

It looks like the inserted data always has the same value for the partitioning key (NUM_DISTINCT is 1), so it is expected that the same partition is always used, because:
>
For optimal data distribution, the following requirements should be satisfied:
Choose a column or combination of columns that is unique or almost unique.
Create multiple partitions and subpartitions for each partition that is a power of two. For example, 2, 4, 8, 16, 32, 64, 128, and so on.
>
See http://download.oracle.com/docs/cd/E11882_01/server.112/e16541/part_avail.htm#VLDBG1270.
Edited by: P. Forstmann on 29 déc. 2010 09:06

IOT or Hash partition

Hi all,
I want to insert a large amount of data into a table, to be retrieved later using a key column (like emp no).
From a performance point of view, which is more efficient: an IOT (Index Organized Table) or a hash partitioned table?

I highly appreciate your time Justin. Your explanation clarified many things to me.
However, I have small notes on your comments:
First:
<<IOT's tend to be useful when you have thin, tall tables (many rows, few columns) where you always want to retrieve all the rows.>>
Regarding this claim, I referred to the following sources:
1. Sybex-Oracle9i Performance Tuning book
"If you access the table using its primary key, an IOT will return the rows more quickly than a traditional table."
2. http://www.tlingua.com/articles/iot.html
For single row fetch,"IOTs could provide a substantial performance gain as well as reducing the demand for disk drives"
For Index Range Scans,"IOTs significantly outperform the standard B-tree/table model during index range scans."
3. Oracle9i Database Administrator's Guide Release 2 (9.2)
"Index-organized tables are particularly useful when you are using applications that must retrieve data based on a primary key."
As you can see, Justin, none of them mentions the thin-tall-table point. Did you get it from practical experience or from some source?
Also, they all say that an IOT is most useful when retrieving based on the PK.
Second:
"In general, partitioning works best when you are doing set-based processing where you can use partition elimination to concentrate on a particular subset of the data."
In Sybex-Oracle9i Performance Tuning book it is stated that "Hash partitions work best when applications retrieve the data from the partitioned table via the unique key. Range lookups on hash partitioned tables derive no benefit from the partitioning.".
I can see there is some conflict here, isn't there?
Thanks in advance.

Does hash partition distribute data evenly across partitions?

The Oracle documentation says that hash partitioning uses Oracle hashing algorithms to assign a hash value to each row's partitioning key and place the row in the appropriate partition, and that the data will be evenly distributed across the partitions, provided of course that the following conditions hold:
1. Partition count should follow 2^n logic
2. Data in partition key column should have high cardinality.
I have used hash partitioning in some of our application tables, but data isn't distributed evenly across partitions. To verify it, i performed a small test :
Table script :
Create table ch_acct_mast_hash(
Cod_acct_no number)
Partition by hash(cod_acct_no)
PARTITIONS 128;
Data population script :
declare
i number;
l number;
begin
i := 1000000000000000;
for l in 1 .. 100000 loop
insert into ch_acct_mast_hash values (i);
i := i + 1;
end loop;
commit;
end;
Row-count check :
select count(1) from Ch_Acct_Mast_hash ; --rowcount is 100000
Gather stats script :
begin
dbms_stats.gather_table_stats('C43HDEV', 'CH_ACCT_MAST_HASH');
end;
Data distribution check :
Select min(num_rows), max(num_rows) from dba_tab_partitions
where table_name = 'CH_ACCT_MAST_HASH';
Result is :
min(num_rows) = 700
max(num_rows) = 853
As per the result, it seems there is lot of skewness in data distribution across partitions. Maybe I am missing something, or something is not right.
Can anybody help me to understand this behavior?
Edited by: Kshitij Kasliwal on Nov 2, 2012 4:49 AM

>
I have used hash partitioning in some of our application tables, but data isn't distributed evenly across partitions.
>
All keys with the same data value will also have the same hash value and so will be in the same partition.
So the actual hash distribution in any particular case will depend on the actual data distribution. And, as Iordan showed, the data distribution depends not only on cardinality but on the standard deviation of the key values.
To use a shorter version of that example, consider these data samples, which each have 10 values. There is a calculator here
http://easycalculation.com/statistics/standard-deviation.php
0,1,0,2,0,3,0,4,0,5 - total 10, distinct 6, %distinct 60, mean 1.5, stan deviation 1.9, variance 3.6 - similar to Iordan's example
0,5,0,5,0,5,0,5,0,5 - total 10, distinct 2, %distinct 20, mean 2.5, stan dev. 2.64, variance 6.9
5,5,5,5,5,5,5,5,5,5 - total 10, distinct 1, %distinct 10, mean 5, stan dev. 0, variance 0
0,1,2,3,4,5,6,7,8,9 - total 10, distinct 10, %distinct 100, mean 4.5, stan dev. 3.03, variance 9.2
The first and last examples have the highest cardinality but only the last has unique values (i.e. 100% distinct).
Note that the first example is lower for all other attributes but that doesn't mean it would hash more evenly.
Also note that the last example, the unique values, has the highest variance.
So there is no single attribute that is controlling. As Iordan showed, the first example has a high %distinct but all of those '0' values will hash to the same partition, so even using a perfect hash the data would use 6 partitions.
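Editor's sketch: for the 100,000 sequential keys in the test above, some spread is expected even from a well-behaved hash. If rows landed in 128 partitions at random, each partition would hold 100000/128, about 781 rows on average, with a standard deviation of roughly 28, so a range of 700 to 853 is within about three standard deviations of the mean. The query below buckets the same keys with ORA_HASH to look at the spread without relying on gathered statistics. Treating ORA_HASH buckets as a stand-in for the actual partition assignment is an assumption, not something the thread confirms.

```sql
-- ORA_HASH(expr, 127) returns a bucket number from 0 to 127.
-- Buckets that receive no rows do not appear in the inner query.
SELECT MIN(cnt)              AS min_rows,
       MAX(cnt)              AS max_rows,
       ROUND(AVG(cnt), 2)    AS avg_rows,
       ROUND(STDDEV(cnt), 2) AS stddev_rows
FROM  (SELECT ORA_HASH(cod_acct_no, 127) AS bucket,
              COUNT(*)                   AS cnt
       FROM   ch_acct_mast_hash
       GROUP  BY ORA_HASH(cod_acct_no, 127));
```

"Evenly distributed" here means statistically even, not an identical row count in every partition.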

Need help on List-Hash partition - oracle 11 feature !

Can a list-hash partitioned table be exchanged for a partition?
Say, the table is partitioned by list on CODE (varchar2) column and subpartitioned by a NUMBER column
i.e. create table TAB1 (ID, Code, Number)
partition by LIST (Code)
subpartition by HASH (Number)
subpartition template
( subpartition1 , subpartition2 , subpartition3)
partition part1 values ('A'),
partition part1 values ('B'),
partition part1 values ('C')
Lets say the subpartitions1,2 and 3 have values 1,2,3,4,5,6....10, how can I move only say value 1 and 2 into another table using exchange partition method? Is this possible?

>
Thanks for the reply. The db version details is as below. And I am more interested in knowing if and how can data be extracted from hash sub-partitions for a given sub-partition key value, using partition exchange. Can anyone demonstrate this or point to any article that demonstrates this? I am not even sure if something like is possible.
>
What part of my reply didn't you understand?
Except now you are saying 'extract' where before you wanted to exchange the hash subpartition. If you exchange then the subpartition will now have NO data since it will have been exchanged with an empty table.
In a partition exchange ALL of the partition (or subpartition) is exchanged, not just part of it. So for a hash subpartition you either exchange ALL data or none of it. If you only want some of the data in the subpartition you have to query it out.
No one can provide any samples until you provide a valid sample yourself. You said your partitions have character data
partition part1 values ('A'),
partition part1 values ('B'),
partition part1 values ('C')
);
But then you ask about manipulating numeric data
>
Lets say the subpartitions1,2 and 3 have values 1,2,3,4,5,6....10, how
>
Which is it?
Post the DDL for the table and show which subpartition you want to query or exchange.
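Editor's sketch: to illustrate the point that an exchange always swaps a whole subpartition, here is a hypothetical, valid version of the table from the question (the original DDL has no column types and uses Number as a column name, so num_col and the partition names part1 to part3 are assumptions), followed by an exchange of one subpartition and a plain query for specific key values.

```sql
-- Hypothetical list-hash table; template subpartitions are named <partition>_<template name>.
CREATE TABLE tab1 (
  id       NUMBER,
  code     VARCHAR2(10),
  num_col  NUMBER
)
PARTITION BY LIST (code)
SUBPARTITION BY HASH (num_col)
SUBPARTITION TEMPLATE (
  SUBPARTITION subpartition1,
  SUBPARTITION subpartition2,
  SUBPARTITION subpartition3
)
(
  PARTITION part1 VALUES ('A'),
  PARTITION part2 VALUES ('B'),
  PARTITION part3 VALUES ('C')
);

-- Empty non-partitioned table with the same columns, used as the exchange target.
CREATE TABLE tab1_stage AS SELECT * FROM tab1 WHERE 1 = 0;

-- Swaps ALL rows of the subpartition with the (empty) staging table.
ALTER TABLE tab1 EXCHANGE SUBPARTITION part1_subpartition1 WITH TABLE tab1_stage;

-- To move only particular key values, they have to be queried out instead.
CREATE TABLE tab1_vals_1_2 AS
  SELECT * FROM tab1 WHERE num_col IN (1, 2);
```

Which num_col values end up in which hash subpartition is decided by the hash function, so a subpartition cannot be assumed to contain exactly the values 1 and 2.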

Unique index vs non-unique index

Hi Gurus,
I'm getting lots of "TABLE ACCESS FULL" operations in some queries on columns which have non-unique indexes. So my question is: does the optimizer not pick up non-unique indexes, and only pick up unique indexes, for those columns?
Thanks
Amitava.

amitavachatterjee1975 wrote:
Hi Gurus,
I'm getting lots of "TABLE ACCESS FULL" operations in some queries on columns which have non-unique indexes. So my question is: does the optimizer not pick up non-unique indexes, and only pick up unique indexes, for those columns?
Thanks
Amitava.
WHY MY INDEX IS NOT BEING USED
http://communities.bmc.com/communities/docs/DOC-10031
http://searchoracle.techtarget.com/tip/Why-isn-t-my-index-getting-used
http://www.orafaq.com/tuningguide/not%20using%20index.html

Creation of Hash Partitioned Global Index

Hash Partition Index creation
Hi friends,
Could you tell me whether we can create a hash partitioned index in 9i using the syntax below?
CREATE INDEX hgidx ON tab (c1,c2,c3) GLOBAL
PARTITION BY HASH (c1,c2)
(PARTITION p1 TABLESPACE tbs_1,
PARTITION p2 TABLESPACE tbs_2,
PARTITION p3 TABLESPACE tbs_3,
PARTITION p4 TABLESPACE tbs_4);
I am getting error ORA-14005 (missing RANGE keyword).
Thanks in advance for your help.
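Editor's sketch: the ORA-14005 error is consistent with Oracle9i accepting only RANGE for global partitioned indexes; hash-partitioned global indexes were introduced in Oracle 10g. On 10g or later, a statement along these lines should be accepted. It reuses the index, table and column names from the question, and the partition count of 4 mirrors the four partitions listed there.

```sql
-- Requires Oracle 10g or later; the partitioning columns (c1, c2)
-- form a leading prefix of the index columns (c1, c2, c3).
CREATE INDEX hgidx ON tab (c1, c2, c3)
  GLOBAL PARTITION BY HASH (c1, c2)
  PARTITIONS 4;
```

On 9i itself, the closest available option is a global index partitioned by RANGE, which needs explicit VALUES LESS THAN bounds ending in MAXVALUE.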

Yaseer,
Is it possible to create Non-Partitioned and Global Index on Range-Partitioned Table?
Yes
We have 4 indexes on CS_BILLING range-partitioned table, in which one is CBS_CLIENT_CODE(*local partitioned index*) and others are unknown types of index to me??
Means other 3 indexes are what type indexes ...either non-partitioned global index OR non-partitioned normal index??
You have a local index and 3 non-partitioned "NORMAL" b-tree type indexes.
Also if we create index as :(create index i_name on t_name(c_name)) By default it will create Global index. Please correct me......
The above statement will create a non-partitioned index.
Here is an example of creating a global partitioned index:
CREATE INDEX month_ix ON sales(sales_month)
   GLOBAL PARTITION BY RANGE(sales_month)
      (PARTITION pm1_ix VALUES LESS THAN (2),
       PARTITION pm2_ix VALUES LESS THAN (3),
       PARTITION pm3_ix VALUES LESS THAN (4),
       PARTITION pm12_ix VALUES LESS THAN (MAXVALUE));
Regards

What index is suitable for a table with no unique columns and no primary key

alpha  beta  gamma  col1  col2  col3
100    1     -1     a     b     c
100    1     -2     d     e     f
101    1     -2     t     t     y
102    2     1      j     k     l
The sample data is above; below is the data type of each column:
alpha datatype - string
beta datatype - integer
gamma datatype - integer
col1, col2, col3 are all string datatypes.
Note: the columns are not individually unique; we would use alpha, beta and gamma together to uniquely identify a record. As you can see from my sample data, this is in a table which doesn't have an index. I would like to create an index covering these columns (alpha, beta, gamma). I believe that creating a clustered index on these columns would be better.
What index type would you recommend in this case? Say the data volume is 1 million records and we always use the alpha, beta and gamma columns when we filter or query records.
What index is suitable for a table with no unique columns and no primary key?
Mudassar

Many thanks for your explanation .
When I ran the query below on my heap table, SQL Server suggested creating a NONCLUSTERED INDEX with included columns [beta], [gamma], [col1], [col2], [col3].
SELECT [alpha]
,[beta]
,[gamma]
,[col1]
,[col2]
,[col3]
FROM [TEST].[dbo].[Test]
where [alpha]='10100'
My question is: why didn't it suggest a CLUSTERED index, and why did it choose a NONCLUSTERED index?
Mudassar
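Editor's sketch: if alpha, beta and gamma together really do identify a row, as the poster states, the heap can be given a unique clustered index on those three columns. The index name is made up, and this assumes there are no existing duplicates and that the alpha string column is short enough to be used in an index key.

```sql
-- Fails if duplicate (alpha, beta, gamma) combinations already exist.
CREATE UNIQUE CLUSTERED INDEX CIX_Test_alpha_beta_gamma
    ON [TEST].[dbo].[Test] (alpha, beta, gamma);
```

As far as I know, SQL Server's missing-index suggestions only ever propose nonclustered indexes, so the absence of a clustered suggestion says nothing about whether a clustered index would help; that remains a design decision.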
