Hash Partition on non-unique column

I have Dim and fact tables with an expected early-growth volume of 250 million rows.
I have hash partitioned the Dim table, where I have a unique key, but on the fact table I do not have a single unique key column (the fact is a child of the Dim).
Since the fact table doesn't have a single unique column, how do I hash partition the fact table?
Need advice.
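For reference, Oracle hash partitioning does not require the partitioning key to be unique. A common approach for a fact table is to hash on its foreign key to the dimension, which also enables partition-wise joins. A minimal sketch, with hypothetical table and column names:
-- Hypothetical fact table, hash partitioned on the non-unique dimension key.
-- A power-of-two partition count gives the most even spread.
CREATE TABLE sales_fact (
  dim_key   NUMBER       NOT NULL,   -- FK to the dimension, not unique here
  sale_dt   DATE         NOT NULL,
  amount    NUMBER(12,2)
)
PARTITION BY HASH (dim_key)
PARTITIONS 16;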

Hi there, I have a similar kind of situation and need your suggestions.
We are using Oracle Database 11g Enterprise Edition Release 11.2.0.3.0 - 64bit Production.
We have HR data of approximately 270 million rows for 100,000 persons in an SAP module, and we need to load this data into an Oracle table.
This is data accumulated over the past 2 years, starting from Jan 2011, and the assumption is that about 600,000 rows will be loaded incrementally every day.
The data granularity is a fraction of the day, known as TIME_TYPE, for a given day.
For example, a person can have multiple records on a given day, depending upon the TIME_TYPE.
sample data:
Pers_ID   Payroll_date   Time_Type                  Day_Hrs
1960      1-Jan-11       Maximum Vacation           4
1960      1-Jan-11       Vacation Quota Maximum     2
1960      1-Jan-11       Maximum Sick               2
1960      2-Jan-11       Paid Eligible OT Hrs       3
1960      2-Jan-11       Paid Hours                 3
1960      2-Jan-11       WW Eligible OT Hrs         2
1960      5-Jan-11       Daily Overtime Hours       2
1960      5-Jan-11       Weekly Additional Hours    2
1960      5-Jan-11       Personal Quota Balance     2
1960      5-Jan-11       Total Overtime Hours       2
The above data is for an individual person: his time spent on particular days over a 3-day period.
My question is how best to design the (partitioned) table so that data loading is fast, for the initial load as well as the daily incremental load.
We also have a lot of reporting on this table, for example the total number of hours utilized by a particular employee, the most-used time_type for a group of employees, and so on.
Please let me know your suggestions and thoughts.
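One possible design, sketched here only as a starting point (all names are hypothetical): range partition on the payroll date using 11g interval partitioning, so the daily incremental load lands in automatically created partitions and date-bounded reports can prune.
-- Hypothetical design: monthly interval partitions on payroll_date.
CREATE TABLE emp_time_fact (
  pers_id       NUMBER        NOT NULL,
  payroll_date  DATE          NOT NULL,
  time_type     VARCHAR2(50)  NOT NULL,
  day_hrs       NUMBER(5,2)
)
PARTITION BY RANGE (payroll_date)
INTERVAL (NUMTOYMINTERVAL(1, 'MONTH'))
(PARTITION p_before_2011 VALUES LESS THAN (DATE '2011-01-01'));

-- A local index on (pers_id, payroll_date) would support the per-employee
-- reports, and the daily load could use a direct-path INSERT /*+ APPEND */.
CREATE INDEX emp_time_fact_ix1 ON emp_time_fact (pers_id, payroll_date) LOCAL;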

Similar Messages

  • Can I have a primary key as a non-unique column

    Hi all,
    I have a table with 35 columns, and only one column is a NOT NULL column. But this column's data is not unique. I want to create a primary key with a non-unique index. Can I do it, and if not, is there any other way to do it? Please help me with this.
    Thanks for your help.
    vinaykotha

    1) Do the 'unique column combination' check first, using this example. (Please change the column names and the number of columns you consider to be the candidate key.) The SQL is as follows:
    ;WITH CTE (ProductKey, CustomerKey, SalesTerritoryKey, DupRec)
     AS
      (SELECT ProductKey, CustomerKey, SalesTerritoryKey, ROW_NUMBER() OVER
         (PARTITION BY ProductKey, CustomerKey, SalesTerritoryKey
          ORDER BY ProductKey, CustomerKey, SalesTerritoryKey) AS DupRec
       FROM dbo.FactSales)
    SELECT *
    FROM CTE
    WHERE DupRec > 1;
    2) If the CTE query returns no rows, you are good to create a composite CLUSTERED PRIMARY KEY, like:
    ALTER TABLE dbo.FactSales
    ADD CONSTRAINT PK_ProductKey_CustomerKey_SalesTerritoryKey 
    PRIMARY KEY CLUSTERED (ProductKey, CustomerKey, SalesTerritoryKey);
    If there is no error - bingo! If not, don't worry, create a composite index like this:
    3)
    CREATE NONCLUSTERED INDEX [IX_ProductKey_CustomerKey_SalesTerritoryKey] 
    ON dbo.FactSales(ProductKey, CustomerKey, SalesTerritoryKey);
    This will still work efficiently. Most transaction tables are like that, e.g. Daily_Order, Ship_Details, etc.
    -NC

  • Script to make a non-unique column unique?

    Hi all,
    Currently I have a table that contains nothing but IDs (the rest of the table is to be populated at a later date).
    In this table there are around 500,000,000 IDs, of which only about 35,000,000 are unique.
    I need to run a script that will find all duplicate entries of an ID and delete them (only the duplicates). Only one copy of each ID should remain in the table, so that I can make the column the PK.
    I can't think of how to do this. Any ideas?
    Thanks,
    fakelvis

    Hello
    There are a couple of ways you could do this, but I think creating a new table with the distinct list of IDs, truncating the original and re-inserting the distinct rows may be the most optimal... if I understand correctly.
    SQL> --Set up some test data to use with method 1 CTAS + Truncate
    SQL> CREATE TABLE dt_test_big_tab AS SELECT object_id FROM dba_objects WHERE object_id IS NOT NULL
      2  /
    Table created.
    SQL> INSERT INTO dt_test_big_tab SELECT * from dt_test_big_tab
      2  /
    36210 rows created.
    SQL> /
    72420 rows created.
    SQL> /
    144840 rows created.
    SQL> /
    289680 rows created.
    SQL> /
    579360 rows created.
    SQL> /
    1158720 rows created.
    SQL> commit;
    Commit complete.
    SQL> SELECT COUNT(*) from dt_test_big_tab
      2  /
      COUNT(*)
       2317440
    SQL> SELECT COUNT(DISTINCT object_id) from dt_test_big_tab
      2  /
    COUNT(DISTINCTOBJECT_ID)
                       36210
    SQL> --Set up a copy of the test data to use with method 2 - Delete duplicates
    SQL> CREATE TABLE dt_test_big_tab_2 as select * from dt_test_big_tab
      2  /
    Table created.
    SQL> set timi on
    SQL> --Method one, ctas + truncate
    SQL> CREATE TABLE dt_test_unique AS SELECT DISTINCT object_id FROM dt_test_big_tab
      2  /
    Table created.
    Elapsed: 00:00:02.07
    SQL> SELECT COUNT(*) from dt_test_unique
      2  /
      COUNT(*)
         36210
    Elapsed: 00:00:00.00
    SQL> SELECT COUNT(DISTINCT object_id) from dt_test_unique
      2  /
    COUNT(DISTINCTOBJECT_ID)
                       36210
    Elapsed: 00:00:00.00
    SQL> TRUNCATE TABLE dt_test_big_tab
      2  /
    Table truncated.
    Elapsed: 00:00:01.02
    SQL> INSERT INTO dt_test_big_tab(object_id) SELECT object_id FROM dt_test_unique
      2  /
    36210 rows created.
    Elapsed: 00:00:00.00
    SQL> SELECT COUNT(*) from dt_test_big_tab
      2  /
      COUNT(*)
         36210
    Elapsed: 00:00:00.01
    SQL> SELECT COUNT(DISTINCT object_id) from dt_test_big_tab
      2  /
    COUNT(DISTINCTOBJECT_ID)
                       36210
    Elapsed: 00:00:00.00
    SQL> --Method 2, delete duplicates
    SQL> DELETE
      2  FROM
      3     dt_test_big_tab_2
      4  WHERE
      5     rowid IN(SELECT
      6                             del_rowid
      7                     FROM
      8                             (SELECT
      9                                     rowid del_rowid,
    10                                     ROW_NUMBER() OVER(partition by object_id ORDER BY object_id) rn
    11                             FROM
    12                                     dt_test_big_tab_2
    13                             )
    14                     WHERE
    15                                     rn > 1
    16                             );
    2281230 rows deleted.
    Elapsed: 00:05:39.06
    SQL> select count(*) from dt_test_big_tab_2;
      COUNT(*)
         36210
    Elapsed: 00:00:00.03
    SQL> select count(distinct object_id) from dt_test_big_tab_2;
    COUNT(DISTINCTOBJECT_ID)
                       36210
    Elapsed: 00:00:00.00
    SQL>
    HTH
    David

  • Oracle 11.2 - Perform parallel DML on a non partitioned table with LOB column

    Hi,
    Since I wanted to demonstrate new Oracle 12c enhancements on SecureFiles, I tried to use PDML statements on a non partitioned table with LOB column, in both Oracle 11g and Oracle 12c releases. The Oracle 11.2 SecureFiles and Large Objects Developer's Guide of January 2013 clearly says:
    Parallel execution of the following DML operations on tables with LOB columns is supported. These operations run in parallel execution mode only when performed on a partitioned table. DML statements on non-partitioned tables with LOB columns continue to execute in serial execution mode.
    INSERT AS SELECT
    CREATE TABLE AS SELECT
    DELETE
    UPDATE
    MERGE (conditional UPDATE and INSERT)
    Multi-table INSERT
    So I created and populated a simple table with a BLOB column:
    SQL> CREATE TABLE T1 (A BLOB);
    Table created.
    Then, I tried to see the execution plan of a parallel DELETE:
    SQL> EXPLAIN PLAN FOR
      2  delete /*+parallel (t1,8) */ from t1;
    Explained.
    SQL> select * from table(dbms_xplan.display);
    PLAN_TABLE_OUTPUT
    Plan hash value: 3718066193
    | Id  | Operation             | Name     | Rows  | Cost (%CPU)| Time     |    TQ  |IN-OUT| PQ Distrib |
    |   0 | DELETE STATEMENT      |          |  2048 |     2   (0)| 00:00:01 |        |      |            |
    |   1 |  DELETE               | T1       |       |            |          |        |      |            |
    |   2 |   PX COORDINATOR      |          |       |            |          |        |      |            |
    |   3 |    PX SEND QC (RANDOM)| :TQ10000 |  2048 |     2   (0)| 00:00:01 |  Q1,00 | P->S | QC (RAND)  |
    |   4 |     PX BLOCK ITERATOR |          |  2048 |     2   (0)| 00:00:01 |  Q1,00 | PCWC |            |
    |   5 |      TABLE ACCESS FULL| T1       |  2048 |     2   (0)| 00:00:01 |  Q1,00 | PCWP |            |
    PLAN_TABLE_OUTPUT
    Note
       - dynamic sampling used for this statement (level=2)
    And I finished by executing the statement.
    SQL> commit;
    Commit complete.
    SQL> alter session enable parallel dml;
    Session altered.
    SQL> delete /*+parallel (t1,8) */ from t1;
    2048 rows deleted.
    As we can see, the statement has been run as parallel:
    SQL> select * from v$pq_sesstat;
    STATISTIC                      LAST_QUERY SESSION_TOTAL
    Queries Parallelized                    1             1
    DML Parallelized                        0             0
    DDL Parallelized                        0             0
    DFO Trees                               1             1
    Server Threads                          5             0
    Allocation Height                       5             0
    Allocation Width                        1             0
    Local Msgs Sent                        55            55
    Distr Msgs Sent                         0             0
    Local Msgs Recv'd                      55            55
    Distr Msgs Recv'd                       0             0
    11 rows selected.
    Is this normal? It is not supposed to be supported on Oracle 11g with a non-partitioned table containing a LOB column....
    Thank you for your help.
    Michael

    Yes, I did. I tried with force parallel DML, and these are the results on my 12c DB, with the non-partitioned table and the SecureFiles LOB column.
    SQL> explain plan for delete from t1;
    Explained.
    | Id  | Operation             | Name     | Rows  | Cost (%CPU)| Time     |    TQ  |IN-OUT| PQ Distrib |
    |   0 | DELETE STATEMENT      |          |     4 |     2   (0)| 00:00:01 |        |      |            |
    |   1 |  DELETE               | T1       |       |            |          |        |      |            |
    |   2 |   PX COORDINATOR      |          |       |            |          |        |      |            |
    |   3 |    PX SEND QC (RANDOM)| :TQ10000 |     4 |     2   (0)| 00:00:01 |  Q1,00 | P->S | QC (RAND)  |
    |   4 |     PX BLOCK ITERATOR |          |     4 |     2   (0)| 00:00:01 |  Q1,00 | PCWC |            |
    |   5 |      TABLE ACCESS FULL| T1       |     4 |     2   (0)| 00:00:01 |  Q1,00 | PCWP |            |
    The DELETE is not performed in Parallel.
    I tried with another statement :
    SQL> explain plan for
    2        insert into t1 select * from t1;
    Here are the results:
    11g
    | Id  | Operation                | Name     | Rows  | Bytes | Cost (%CPU)| Time     |    TQ  |IN-OUT| PQ Distrib |
    |   0 | INSERT STATEMENT         |          |     4 |  8008 |     2   (0)| 00:00:01 |        |      |            |
    |   1 |  LOAD TABLE CONVENTIONAL | T1       |       |       |            |          |        |      |            |
    |   2 |   PX COORDINATOR         |          |       |       |            |          |        |      |            |
    |   3 |    PX SEND QC (RANDOM)   | :TQ10000 |     4 |  8008 |     2   (0)| 00:00:01 |  Q1,00 | P->S | QC (RAND)  |
    |   4 |     PX BLOCK ITERATOR    |          |     4 |  8008 |     2   (0)| 00:00:01 |  Q1,00 | PCWC |            |
    |   5 |      TABLE ACCESS FULL   | T1       |     4 |  8008 |     2   (0)| 00:00:01 |  Q1,00 | PCWP |            |
    12c
    | Id  | Operation                          | Name     | Rows  | Bytes | Cost (%CPU)| Time     |    TQ  |IN-OUT| PQ Distrib |
    |   0 | INSERT STATEMENT                   |          |     4 |  8008 |     2   (0)| 00:00:01 |        |      |            |
    |   1 |  PX COORDINATOR                    |          |       |       |            |          |        |      |            |
    |   2 |   PX SEND QC (RANDOM)              | :TQ10000 |     4 |  8008 |     2   (0)| 00:00:01 |  Q1,00 | P->S | QC (RAND)  |
    |   3 |    LOAD AS SELECT                  | T1       |       |       |            |          |  Q1,00 | PCWP |            |
    |   4 |     OPTIMIZER STATISTICS GATHERING |          |     4 |  8008 |     2   (0)| 00:00:01 |  Q1,00 | PCWP |            |
    |   5 |      PX BLOCK ITERATOR             |          |     4 |  8008 |     2   (0)| 00:00:01 |  Q1,00 | PCWC |            |
    It seems that the DELETE statement has problems, but not the INSERT AS SELECT!
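    For comparison, the documented way on 11.2 to get these DML operations to run in parallel is to partition the table that holds the LOB. A minimal sketch (not run here), with hypothetical names:
    -- Hypothetical: partition the table so parallel DML on the LOB column is allowed.
    CREATE TABLE t1_part (id NUMBER, a BLOB)
    LOB (a) STORE AS SECUREFILE
    PARTITION BY HASH (id) PARTITIONS 8;
    ALTER SESSION ENABLE PARALLEL DML;
    DELETE /*+ parallel(t1_part, 8) */ FROM t1_part;
    -- v$pq_sesstat should then show "DML Parallelized" > 0 for the session.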

  • Display columns as rows from non-unique key table

    Hi OTN/Users, I hope you can assist me
    Given a table:
    create table t (a varchar2(30), b int, c date );
    with this data within:
    insert into t values ( 'a1', 40, to_date( '01-Dec-2012'));
    insert into t values ( 'a1', 50, to_date( '01-Dec-2012'));
    insert into t values ( 'a1', 60, to_date( '01-Dec-2012'));
    insert into t values ( 'b1', 10, to_date( '01-Dec-2012'));
    insert into t values ( 'b1', 20, to_date( '01-Dec-2012'));
    insert into t values ( 'b1', 30, to_date( '01-Dec-2012'));
    insert into t values ( 'c1', 60, to_date( '01-Dec-2012'));
    insert into t values ( 'c1', 70, to_date( '01-Dec-2012'));
    insert into t values ( 'c1', 80, to_date( '01-Dec-2012'));
    - I want to output the values for each 'a' as a single row, e.g.:
    a1 40 50 60 01-Dec-2012
    b1 10 20 30 01-Dec-2012
    I've almost got it right, but the 'a' col repeats 4 times for each row of output:
    a1 40
    a1 50
    a1 60
    a1 01-Dec-2012
    - I want to suppress repeated output of the first column 'a' but display the rest on a single line.
    I've tried various things (PIVOT, ROLLUP etc.), but the fact that I'm keying on a table with non-unique rows has perhaps complicated things.
    Any help would be much appreciated

    Hi,
    Pre-11g, this is how you would do it:
    [11.2] Pri @ Bepripd1 > !cat t.sql
    with t(a,b,c) as (
         select  'a1', 40, to_date( '01-Dec-2012') from dual union all
         select  'a1', 50, to_date( '01-Dec-2012') from dual union all
         select  'a1', 60, to_date( '01-Dec-2012') from dual union all
         select  'b1', 10, to_date( '01-Dec-2012') from dual union all
         select  'b1', 20, to_date( '01-Dec-2012') from dual union all
         select  'b1', 30, to_date( '01-Dec-2012') from dual union all
         select  'c1', 60, to_date( '01-Dec-2012') from dual union all
         select  'c1', 70, to_date( '01-Dec-2012') from dual union all
         select  'c1', 80, to_date( '01-Dec-2012') from dual
    )
    ------ end of sample data ------
    select
         a
         ,max(decode(n,1,b,null)) q1
         ,max(decode(n,2,b,null)) q2
         ,max(decode(n,3,b,null)) q3
         ,c
    from (
         select a, b, c, row_number() over (partition by a order by b) n
         from t
    )
    group by a,c
    order by a,c
    /
    [11.2] Pri @ Bepripd1 > @t
    A          Q1         Q2         Q3 C
    a1         40         50         60 01/12/2012 00:00:00
    b1         10         20         30 01/12/2012 00:00:00
    c1         60         70         80 01/12/2012 00:00:00
    ------
    From 11g onward, you would:
    [11.2] Pri @ Bepripd1 > !cat t.sql
    with t(a,b,c) as (
         select  'a1', 40, to_date( '01-Dec-2012') from dual union all
         select  'a1', 50, to_date( '01-Dec-2012') from dual union all
         select  'a1', 60, to_date( '01-Dec-2012') from dual union all
         select  'b1', 10, to_date( '01-Dec-2012') from dual union all
         select  'b1', 20, to_date( '01-Dec-2012') from dual union all
         select  'b1', 30, to_date( '01-Dec-2012') from dual union all
         select  'c1', 60, to_date( '01-Dec-2012') from dual union all
         select  'c1', 70, to_date( '01-Dec-2012') from dual union all
         select  'c1', 80, to_date( '01-Dec-2012') from dual
    )
    ------ end of sample data ------
    select a,q1,q2,q3,c
    from (
         select a, b, c, row_number() over (partition by a order by b) n
         from t
    )
    pivot (
         max(b)
         for n in (
              1 as q1
              ,2 as q2
              ,3 as q3
         )
    )
    order by a,c
    /
    [11.2] Pri @ Bepripd1 > @t
    A          Q1         Q2         Q3 C
    a1         40         50         60 01/12/2012 00:00:00
    b1         10         20         30 01/12/2012 00:00:00
    c1         60         70         80 01/12/2012 00:00:00

  • What is the best index type for non-unique / VARCHAR columns in SQL 2008 R2

    Hello All Greetings,
    Please help me here with my doubt,
    In my table I have about a million rows and about 20 columns. Two of the columns, Period and Gender, are used in the WHERE clause most of the time.
    Gender will contain either M or F, and Period contains YYYY-Month (2013-December, 2013-August, etc.), so I would like to add an index on these two columns to improve performance. Please let me know what type of index I should add to these columns in the table.
    Please note that we load data into the table only once, which takes only 2 minutes, but we query the table every day.
    So my question is: what is the best index type to create on columns whose values are not unique?
    Thank you In Advance,
    Milan

    There is nothing whatever wrong with creating an index on a VARCHAR column, or set of columns.
    Regarding the performance of VARCHAR vs. INT: as with everything in an RDBMS, it depends on what you are doing. What you may be thinking of is the fact that clustering a table on a VARCHAR key is (in SQL Server) marginally less efficient than clustering on a monotonically increasing numeric key, and can introduce fragmentation.
    Or you may be thinking of what you have heard about writing JOINs on VARCHAR columns - it is true, a JOIN on VARCHAR is a little less efficient than a JOIN on a numeric type, but only a little; nothing that would lead you to never join on varchar cols.
    None of this means that you should not create indexes on VARCHAR columns. A needed index on a VARCHAR column will boost query performance, often by orders of magnitude. If you need an index on a VARCHAR, create it. It makes no sense to try to find an integer column to create the index on instead - the engine will never use it.
    Check this reference: http://stackoverflow.com/questions/14041481/is-it-good-to-create-a-nonclustered-index-on-a-column-of-type-varchar
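    For this workload, a minimal sketch (the table name is hypothetical) would be a composite nonclustered index on the two filter columns, with the more selective Period column leading:
    -- Hypothetical: composite nonclustered index covering the usual WHERE clause.
    CREATE NONCLUSTERED INDEX IX_MyTable_Period_Gender
        ON dbo.MyTable (Period, Gender);
    Since the table is loaded once and queried daily, the small extra cost of maintaining this index during the load is paid only once.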
    Mohammad Nizamuddin

  • Cost to change hash partition key column in a history table

    Hi All,
    I have the following scenario.
    We have a history table in production which has 16 hash partitions on the basis of key_column.
    But the data we have in the history table has only 878 distinct values of the key_column, about 1000 million rows, and all partitions are in the same tablespace.
    Now we have a Pro*C module which purges data from this history table in the following way:
    DELETE FROM hsitory_tab
    WHERE p_date < (TO_DATE(sysdate+1, 'YYYYMMDD') - 210)
    AND t_date < (TO_DATE(sysdate+1, 'YYYYMMDD') - 210)
    AND ROWNUM <= 210;
    (p_date and t_date are two of the columns in the history table.) Data is deleted using these two date-column conditions, but the partition key_column is different.
    So as per the above statement, this history table contains 6 months of data.
    The DBA is asking us to change this query and purge partition-wise by date. Now, would it be proper to change the partition key_column (the existing hash partition key_column has 810 distinct values), and what do we need to consider to estimate the cost of this hash partition key_column change (if it is appropriate to change the partition key_column at all)? Hope I explained my problem clearly; waiting for your suggestions.
    Thanks in advance.

    Hi Sir
    Many thanks for the reply.
    For the first point -
    We plan to move the database to 10g, after a lot of hassle with the client.
    For the second point -
    If we partition by date or by week we will have 30 or 7 partitions. As suggested by you, since we have 16 partitions in the table, the best approach would be to partition by week; then we will have 7 partitions and each query will hit 7 partitions.
    For the third point -
    Our main aim is to reduce the run time of a job (a Pro*C program) which contains the following delete query against the history table. According to the query it deletes data every day, keeping 7 months, and while deleting it queries this huge table by date. So in this case, which would be more suitable: hash partitioning, range partitioning, or range/hash composite partitioning?
    DELETE FROM hsitory_tab
    WHERE p_date < (TO_DATE(sysdate+1, 'YYYYMMDD') - 210)
    AND t_date < (TO_DATE(sysdate+1, 'YYYYMMDD') - 210)
    AND ROWNUM <= 210;
    I have read that hash partitioning is used so that data is evenly distributed across all partitions (though that depends on the nature of the data). In my case I would like your suggestion on the best approach.
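    As an illustration only (hypothetical DDL, column list reduced), range partitioning the history table on p_date would let the 210-day purge become partition maintenance instead of a row-by-row DELETE:
    -- Hypothetical: range partitions on the purge-driving date column.
    CREATE TABLE history_tab_part (
      key_column  NUMBER,
      p_date      DATE NOT NULL,
      t_date      DATE
    )
    PARTITION BY RANGE (p_date)
    (PARTITION p_2012_h1 VALUES LESS THAN (DATE '2012-07-01'),
     PARTITION p_2012_h2 VALUES LESS THAN (DATE '2013-01-01'),
     PARTITION p_max     VALUES LESS THAN (MAXVALUE));

    -- The purge is then, for each expired partition:
    ALTER TABLE history_tab_part DROP PARTITION p_2012_h1 UPDATE GLOBAL INDEXES;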

  • OWB Dataprofiling - Unique Key Analysis on non-number columns

    Does anyone know an easy way to enable unique key analysis on non-NUMBER columns (OWB 10.2.0.3)?
    It seems that the profiler by default disables the
    'Use in relationship discovery' option when the documented datatype is non-NUMBER.
    Is this a setting which can be configured for OWB, or is there a smart way to set this property for all columns?
    Thanks in advance

    Hi, I don't think there is a way to automatically switch this on/off; a small script can be created to set the option on/off for all columns in the table.
    Cheers
    David

  • Uneven distribution in Hash Partitioning

    Version :11.1.0.7.0 - 64bit Production
    OS :RHEL 5.3
    I have a range partitioning on ACCOUNTING_DATE column and have 24 monthly partitions.
    To get rid of buffer busy waits on the index, I created a global hash-partitioned index using the DDL below.
    DDL:
    CREATE INDEX IDX_GL_BATCH_ID ON SL_JOURNAL_ENTRY_LINES(GL_BATCH_ID)
    GLOBAL PARTITION BY HASH (GL_BATCH_ID) PARTITIONS 16 TABLESPACE OTC_IDX PARALLEL 8 INITRANS 8 MAXTRANS 8 PCTFREE 0 ONLINE;
    After index creation, I realized that only one index hash partition got all the rows.
    select partition_name,num_rows from dba_ind_partitions where index_name='IDX_GL_BATCH_ID';
    PARTITION_NAME                   NUM_ROWS
    SYS_P77                                 0
    SYS_P79                                 0
    SYS_P80                                 0
    SYS_P81                                 0
    SYS_P83                                 0
    SYS_P84                                 0
    SYS_P85                                 0
    SYS_P87                                 0
    SYS_P88                                 0
    SYS_P89                                 0
    SYS_P91                                 0
    SYS_P92                                 0
    SYS_P78                                 0
    SYS_P82                                 0
    SYS_P86                                 0
    SYS_P90                         256905355
    As far as I understand, HASH partitioning should distribute rows evenly. Looking at the above distribution, I don't think I got the benefit of having multiple insert points from HASH partitioning either.
    Here are the index column statistics:
    select TABLE_NAME,COLUMN_NAME,NUM_DISTINCT,NUM_NULLS,LAST_ANALYZED,SAMPLE_SIZE,HISTOGRAM,AVG_COL_LEN from dba_tab_col_statistics where table_name='SL_JOURNAL_ENTRY_LINES'  and COLUMN_NAME='GL_BATCH_ID';
    TABLE_NAME                     COLUMN_NAME          NUM_DISTINCT  NUM_NULLS LAST_ANALYZED        SAMPLE_SIZE HISTOGRAM       AVG_COL_LEN
    SL_JOURNAL_ENTRY_LINES         GL_BATCH_ID                     1          0 2010/12/28 22:00:51    259218636 NONE                      4

    It looks like the inserted data always has the same value for the partitioning key (NUM_DISTINCT = 1): in that case it is expected that the same partition is always used, because
    >
    For optimal data distribution, the following requirements should be satisfied:
    Choose a column or combination of columns that is unique or almost unique.
    Create multiple partitions and subpartitions for each partition that is a power of two. For example, 2, 4, 8, 16, 32, 64, 128, and so on.
    >
    See http://download.oracle.com/docs/cd/E11882_01/server.112/e16541/part_avail.htm#VLDBG1270.
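    As a quick sanity check before (re)creating such an index, ORA_HASH can approximate how a candidate key would spread over N hash buckets (an illustration only; the internal partition hash is not guaranteed to match ORA_HASH exactly):
    -- Approximate the spread of GL_BATCH_ID over 16 buckets (0..15).
    SELECT ORA_HASH(gl_batch_id, 15) AS bucket, COUNT(*) AS cnt
    FROM   sl_journal_entry_lines
    GROUP  BY ORA_HASH(gl_batch_id, 15)
    ORDER  BY bucket;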

  • IOT or Hash partition

    Hi all,
    I want to insert a large amount of data into a table, to be retrieved later using a key column (like emp no).
    From the performance point of view, which is more efficient: an IOT (Index Organized Table) or a Hash Partition?

    I highly appreciate your time Justin. Your explanation clarified many things to me.
    However, I have small notes on your comments:
    First:
    <<IOT's tend to be useful when you have thin, tall tables (many rows, few columns) where you always want to retrieve all the rows.>>
    Regarding this claim, I referred to the following sources:
    1. Sybex-Oracle9i Performance Tuning book
    "If you access the table using its primary key, an IOT will return the rows more quickly than a traditional table."
    2. http://www.tlingua.com/articles/iot.html
    For single row fetch,"IOTs could provide a substantial performance gain as well as reducing the demand for disk drives"
    For Index Range Scans,"IOTs significantly outperform the standard B-tree/table model during index range scans."
    3. Oracle9i Database Administrator’s Guide Release 2 (9.2)
    "Index-organized tables are particularly useful when you are using applications that must retrieve data based on a primary key."
    As you can see Justin, none of them mentioned the thin-tall-table fact. Did you obtain it from practical experience or from some source?
    Also, they all showed that an IOT is most useful when retrieving based on the PK.
    Second:
    "In general, partitioning works best when you are doing set-based processing where you can use partition elimination to concentrate on a particular subset of the data."
    In Sybex-Oracle9i Performance Tuning book it is stated that "Hash partitions work best when applications retrieve the data from the partitioned table via the unique key. Range lookups on hash partitioned tables derive no benefit from the partitioning.".
    I can see there is some conflict here, isn't there?
    Thanks in advance.
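    For reference, the two options being compared look roughly like this (hypothetical DDL, with emp_no as the lookup key):
    -- Option 1: index-organized table; rows are stored in primary key order.
    CREATE TABLE emp_hist_iot (
      emp_no   NUMBER,
      hist_dt  DATE,
      payload  VARCHAR2(100),
      CONSTRAINT emp_hist_iot_pk PRIMARY KEY (emp_no, hist_dt)
    ) ORGANIZATION INDEX;

    -- Option 2: heap table hash partitioned on the lookup key.
    CREATE TABLE emp_hist_hash (
      emp_no   NUMBER NOT NULL,
      hist_dt  DATE,
      payload  VARCHAR2(100)
    )
    PARTITION BY HASH (emp_no) PARTITIONS 16;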

  • Does hash partition distribute data evenly across partitions?

    As per the Oracle documentation, hash partitioning uses Oracle's hashing algorithm to assign a hash value to each row's partitioning key and place the row in the appropriate partition, and the data will be evenly distributed across the partitions, provided the following conditions are met:
    1. Partition count should follow 2^n logic
    2. Data in partition key column should have high cardinality.
    I have used hash partitioning on some of our application tables, but the data isn't distributed evenly across partitions. To verify this, I performed a small test:
    Table script :
    Create table ch_acct_mast_hash(
    Cod_acct_no number)
    Partition by hash(cod_acct_no)
    PARTITIONS 128;
    Data population script :
    declare
    i number;
    l number;
    begin
    i := 1000000000000000;
    for l in 1 .. 100000 loop
    insert into ch_acct_mast_hash values (i);
    i := i + 1;
    end loop;
    commit;
    end;
    Row-count check :
    select count(1) from Ch_Acct_Mast_hash ; --rowcount is 100000
    Gather stats script :
    begin
    dbms_stats.gather_table_stats('C43HDEV', 'CH_ACCT_MAST_HASH');
    end;
    Data distribution check :
    Select min(num_rows), max(num_rows) from dba_tab_partitions
    where table_name = 'CH_ACCT_MAST_HASH';
    Result is :
    min(num_rows) = 700
    max(num_rows) = 853
    As per the result, it seems there is a lot of skewness in the data distribution across partitions. Maybe I am missing something, or something is not right.
    Can anybody help me to understand this behavior?

    >
    I have used hash partitioning in some of our application tables, but data isn't distributed evenly across partitions.
    >
    All keys with the same data value will also have the same hash value and so will be in the same partition.
    So the actual hash distribution in any particular case will depend on the actual data distribution. And, as Iordan showed, the data distribution depends not only on cardinality but on the standard deviation of the key values.
    To use a shorter version of that example, consider these data samples, which each have 10 values. There is a calculator here:
    http://easycalculation.com/statistics/standard-deviation.php
    0,1,0,2,0,3,0,4,0,5 - total 10, distinct 6, %distinct 60, mean 1.5, stan deviation 1.9, variance 3.6 - similar to Iordan's example
    0,5,0,5,0,5,0,5,0,5 - total 10, distinct 2, %distinct 20, mean 2.5, stan dev. 2.64, variance 6.9
    5,5,5,5,5,5,5,5,5,5 - total 10, distinct 1, %distinct 10, mean 5, stan dev. 0, variance 0
    0,1,2,3,4,5,6,7,8,9 - total 10, distinct 10, %distinct 100, mean 4.5, stan dev. 3.03, variance 9.2
    The first and last examples have the highest cardinality but only the last has unique values (i.e. 100% distinct).
    Note that the first example is lower for all other attributes but that doesn't mean it would hash more evenly.
    Also note that the last example, the unique values, has the highest variance.
    So there is no single attribute that is controlling. As Iordan showed, the first example has a high %distinct, but all of those '0' values will hash to the same partition, so even with a perfect hash the data would use at most 6 partitions.

  • Need help on List-Hash partition - oracle 11 feature !

    Can a list-hash partitioned table be exchanged for a partition?
    Say, the table is partitioned by list on CODE (varchar2) column and subpartitioned by a NUMBER column
    i.e. create table TAB1 (ID, Code, Number)
    partition by LIST (Code)
    subpartition by HASH (Number)
    subpartition template
    ( subpartition1 , subpartition2 , subpartition3)
    partition part1 values ('A'),
    partition part1 values ('B'),
    partition part1 values ('C')
    Let's say subpartitions 1, 2 and 3 have values 1,2,3,4,5,6....10. How can I move only, say, values 1 and 2 into another table using the exchange partition method? Is this possible?

    >
    Thanks for the reply. The db version details is as below. And I am more interested in knowing if and how can data be extracted from hash sub-partitions for a given sub-partition key value, using partition exchange. Can anyone demonstrate this or point to any article that demonstrates this? I am not even sure if something like is possible.
    >
    What part of my reply didn't you understand?
    Except now you are saying 'extract' where before you wanted to exchange the hash subpartition. If you exchange then the subpartition will now have NO data since it will have been exchanged with an empty table.
    In a partition exchange ALL of the partition (or subpartition) is exchanged, not just part of it. So for a hash subpartition you either exchange ALL data or none of it. If you only want some of the data in the subpartition you have to query it out.
    No one can provide any samples until you provide a valid sample yourself. You said your partitions have character data
    partition part1 values ('A'),
    partition part1 values ('B'),
    partition part1 values ('C')
    );
    But then you ask about manipulating numeric data
    >
    Lets say the subpartitions1,2 and 3 have values 1,2,3,4,5,6....10, how
    >
    Which is it?
    Post the DDL for the table and show which subpartition you want to query or exchange.
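    For reference, a hash subpartition exchange is all-or-nothing and looks roughly like this (names are hypothetical; the staging table must match TAB1's structure):
    -- Hypothetical: swap one hash subpartition with an empty staging table.
    CREATE TABLE tab1_stage AS SELECT * FROM tab1 WHERE 1 = 0;

    ALTER TABLE tab1
      EXCHANGE SUBPARTITION part1_subpartition1
      WITH TABLE tab1_stage
      WITHOUT VALIDATION;

    -- To pull out only some key values (e.g. 1 and 2) you have to query them instead:
    -- INSERT INTO other_tab SELECT * FROM tab1 WHERE num_col IN (1, 2);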

  • Unique index vs non-unique index

    Hi Gurus,
    I'm getting lots of "TABLE ACCESS FULL" operations in some queries on columns which have non-unique indexes. So my question is: does the optimizer not pick non-unique indexes, but only unique indexes, for those columns?
    Thanks
    Amitava.

    amitavachatterjee1975 wrote:
    Hi Gurus,
    I'm getting lots of "TABLE ACCESS FULL" operations in some queries on columns which have non-unique indexes. So my question is: does the optimizer not pick non-unique indexes, but only unique indexes, for those columns?
    Thanks
    Amitava.
    WHY MY INDEX IS NOT BEING USED:
    http://communities.bmc.com/communities/docs/DOC-10031
    http://searchoracle.techtarget.com/tip/Why-isn-t-my-index-getting-used
    http://www.orafaq.com/tuningguide/not%20using%20index.html

  • Creation of Hash Partitioned Global Index

    Hash-partitioned index creation
    Hi friends,
    Could you tell me whether we can create a hash-partitioned index using syntax like the below in 9i?
    CREATE INDEX hgidx ON tab (c1,c2,c3) GLOBAL
    PARTITION BY HASH (c1,c2)
    (PARTITION p1 TABLESPACE tbs_1,
    PARTITION p2 TABLESPACE tbs_2,
    PARTITION p3 TABLESPACE tbs_3,
    PARTITION p4 TABLESPACE tbs_4);
    I am getting error ORA-14005: missing RANGE keyword.
    Thanks in advance for your help.

    Yaseer,
    Is it possible to create Non-Partitioned and Global Index on Range-Partitioned Table?
    Yes
    We have 4 indexes on CS_BILLING range-partitioned table, in which one is CBS_CLIENT_CODE(*local partitioned index*) and others are unknown types of index to me??
    Means other 3 indexes are what type indexes ...either non-partitioned global index OR non-partitioned normal index??
    You got one local index and 3 non-partitioned "NORMAL" b-tree type indexes
    Also if we create index as :(create index i_name on t_name(c_name)) By default it will create Global index. Please correct me......
    The above statement will create a non-partitioned index
    Here is an example of creating global partitioned indexes
    CREATE INDEX month_ix ON sales(sales_month)
       GLOBAL PARTITION BY RANGE(sales_month)
          (PARTITION pm1_ix VALUES LESS THAN (2),
           PARTITION pm2_ix VALUES LESS THAN (3),
           PARTITION pm3_ix VALUES LESS THAN (4),
           PARTITION pm12_ix VALUES LESS THAN (MAXVALUE));
    Regards

  • What index is suitable for a table with no unique columns and no primary key

    alpha   beta   gamma   col1   col2   col3
    100     1      -1      a      b      c
    100     1      -2      d      e      f
    101     1      -2      t      t      y
    102     2      1       j      k      l
    Sample data is above, and below are the datatypes for each column:
    alpha datatype - string
    beta datatype - integer
    gamma datatype - integer
    col1, col2, col3 are all string datatypes.
    Note: no single column is unique, and we would be using alpha, beta, gamma together to uniquely identify a record. As you can see from my sample data, this table doesn't have an index. I would like to have an index created covering these columns (alpha, beta, gamma). I believe that creating a clustered index with these as covering columns will be better.
    What would you recommend the index type should be here in this case? Say the data volume is 1 million records and we always use the alpha, beta, gamma columns when we filter or query records.
    What index is suitable for a table with no unique columns and no primary key?
    Mudassar

    Many thanks for your explanation.
    When I tried querying my heap table using the query below, SQL Server suggested creating a NONCLUSTERED INDEX including the columns [beta], [gamma], [col1], [col2], [col3]:
    SELECT [alpha]
          ,[beta]
          ,[gamma]
          ,[col1]
          ,[col2]
          ,[col3]
      FROM [TEST].[dbo].[Test]
    where   [alpha]='10100'
    My question is: why didn't it suggest a CLUSTERED index, and instead chose a NONCLUSTERED index?
    Mudassar
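    For reference, the suggested index and the clustered alternative discussed above would look roughly like this (hypothetical, assuming the dbo.Test table from the query):
    -- Hypothetical: the nonclustered index with included columns, keyed on the filter column.
    CREATE NONCLUSTERED INDEX IX_Test_alpha
        ON dbo.Test (alpha)
        INCLUDE (beta, gamma, col1, col2, col3);

    -- Hypothetical alternative: a composite clustered index on the columns that
    -- together identify a row (non-unique unless declared UNIQUE).
    CREATE CLUSTERED INDEX CIX_Test_alpha_beta_gamma
        ON dbo.Test (alpha, beta, gamma);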
