Partitioning in a data warehouse

Hello,
I'm trying to create an interval-partitioned table with Invoice_Date as the partitioning key column.
The database NLS date format is DD-MON-RR and we enter dates as 'DD-MON-YYYY'. What I want to know is: if I create the table like below
CREATE TABLE INTERVAL_SALES_NJ
    ( Invoice_Date       DATE
    , Invoice_details   CHAR(10)    )
  PARTITION BY RANGE ( Invoice_Date )
  INTERVAL(NUMTOYMINTERVAL(1, 'YEAR'))
    ( PARTITION p0 VALUES LESS THAN (TO_DATE('1-1-2008', 'DD-MM-YYYY')),
      PARTITION p1 VALUES LESS THAN (TO_DATE('1-1-2009', 'DD-MM-YYYY')),
      PARTITION P2 VALUES LESS THAN (TO_DATE('1-7-2009', 'DD-MM-YYYY')),
      PARTITION p3 VALUES LESS THAN (TO_DATE('1-1-2010', 'DD-MM-YYYY')) );
In the dashboard we have a YEAR prompt (e.g. 2008, 2009, etc.) along with some other prompts for analysis. The YEAR is cast from Invoice_Date.
So my confusion is: will Oracle use the date partitions above for these analyses?
Or is it better to create a numeric YEAR column in the table (TO_CHAR(Invoice_Date,'YYYY')) and partition on that?
Regards!
OBIEE 11.1.1.6
Oracle Database 11g Enterprise Edition Release 11.2.0.1.0 - 64bit Production

NitinJoshi wrote:
Thanks for the reply.
The CREATE TABLE statement above was just an example. Yes, I will go for a YEAR interval.
But the thing is, when I use Invoice_Date in a prompt I have to put Year(Invoice_Date). Will that use the partitions or not?
Dear rp0428,
My queries would be normal SELECTs from the invoice sales table, but the WHERE clause would have year, organization_descr and one more column. The year and organization_descr filters would remain for most of the analyses.
Regards!
We aren't asking for some general statement about what your query 'might be'; Oracle works with actual queries. If you want an 'actual' answer, provide an 'actual' query.
Oracle can't use a 'year' value to filter a column of type DATE unless you turn it into a date using TO_DATE.
And the Oracle server knows NOTHING about 'Prompt' or 'Year(Invoice_date)'.
Partition pruning can only occur if the partition columns are included in the WHERE clause.
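To illustrate that point against the example table above (a sketch only; these are not the poster's actual report queries), a predicate written directly against the partitioning key can be pruned, while wrapping the key in a function such as TO_CHAR generally forces Oracle to read every partition:
-- Prunable: the range predicate is on the partitioning key itself.
SELECT COUNT(*)
  FROM interval_sales_nj
 WHERE invoice_date >= TO_DATE('1-1-2009', 'DD-MM-YYYY')
   AND invoice_date <  TO_DATE('1-1-2010', 'DD-MM-YYYY');

-- Usually not prunable: the partitioning key is wrapped in a function,
-- so every partition has to be scanned.
SELECT COUNT(*)
  FROM interval_sales_nj
 WHERE TO_CHAR(invoice_date, 'YYYY') = '2009';
The Pstart/Pstop columns of the execution plan (DBMS_XPLAN.DISPLAY) show whether pruning actually took place.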

Similar Messages

  • What are the best solutions for data warehouse configuration in 10gR2

    I need help with solutions to be provided to my client for upgrading the data warehouse.
    Current configuration: Oracle Database 9.2.0.8. This database contains the data warehouse and one more data mart on the same host. Their sizes are respectively 6 terabytes (retention policy of 3 years plus the current year) and 1 terabyte. The ETL tool and BO reporting tools are also hosted on the same host. This current configuration is performing really poorly.
    The client cannot go for major architectural or configuration changes to its existing environment now due to some constraints.
    However, they have agreed to move the databases onto hosts separate from the ETL tools and BO objects. We are also planning to upgrade the database to 10gR2 to attain stability and better performance and to overcome the current headaches.
    We cannot upgrade the database to 11g as BO is at version 6.5, which isn't compatible with Oracle 11g, and the client cannot afford to upgrade anything other than the database.
    So my role is vital in providing a solution for better performance and in carrying out a successful migration of the Oracle database from one host to another (similar platform and OS) in addition to the upgrade.
    I have till now thought of the following:
    Move the Oracle database and data mart to a separate host.
    The host will be the same platform, that is, HP Superdome with HP-UX 32-bit OS (we cannot change to 64-bit as the ETL tool doesn't support it).
    Install a new Oracle Database 10g on the new host and move the data to it.
    Explore all the new 10gR2 features that help a data warehouse, that is, the SQL MODEL clause, parallel processing, partitioning, Data Pump, and SPA to study pre- and post-migration performance.
    I am also thinking of RAC, as our main motive is to show a significant performance enhancement.
    I need all your help to prepare a good road map for my assignment. Please suggest.
    Thanks,
    Tapan

    SGA = 27.5 GB and PGA = 50 MB
    Also, I am pasting part of the STATSPACK report, excluding the snapshots around the DB bounce. Please suggest the scope for improvement in this case.
    STATSPACK report for
    Snap Id Snap Time Sessions Curs/Sess Comment
    Begin Snap: 582946 11-Mar-13 20:02:16 46 12.8
    End Snap: 583036 12-Mar-13 18:24:24 60 118.9
    Elapsed: 1,342.13 (mins)
    Cache Sizes (end)
    ~~~~~~~~~~~~~~~~~
    Buffer Cache: 21,296M Std Block Size: 16K
    Shared Pool Size: 6,144M Log Buffer: 16,384K
    Load Profile
    ~~~~~~~~~~~~ Per Second Per Transaction
    Redo size: 1,343,739.01 139,883.39
    Logical reads: 100,102.54 10,420.69
    Block changes: 3,757.42 391.15
    Physical reads: 6,670.84 694.44
    Physical writes: 874.34 91.02
    User calls: 1,986.04 206.75
    Parses: 247.87 25.80
    Hard parses: 5.82 0.61
    Sorts: 1,566.76 163.10
    Logons: 10.99 1.14
    Executes: 1,309.79 136.35
    Transactions: 9.61
    % Blocks changed per Read: 3.75 Recursive Call %: 43.34
    Rollback per transaction %: 3.49 Rows per Sort: 190.61
    Instance Efficiency Percentages (Target 100%)
    ~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~
    Buffer Nowait %: 99.90 Redo NoWait %: 100.00
    Buffer Hit %: 96.97 In-memory Sort %: 100.00
    Library Hit %: 99.27 Soft Parse %: 97.65
    Execute to Parse %: 81.08 Latch Hit %: 99.58
    Parse CPU to Parse Elapsd %: 3.85 % Non-Parse CPU: 99.34
    Shared Pool Statistics Begin End
    Memory Usage %: 7.11 50.37
    % SQL with executions>1: 62.31 46.46
    % Memory for SQL w/exec>1: 26.75 13.47
    Top 5 Timed Events
    ~~~~~~~~~~~~~~~~~~ % Total
    Event Waits Time (s) Ela Time
    CPU time 492,062 43.66
    db file sequential read 157,418,414 343,549 30.49
    library cache pin 92,339 66,759 5.92
    PX qref latch 63,635 43,845 3.89
    db file scattered read 2,506,806 41,677 3.70
    Background Wait Events for DB: P7IN1 Instance: P7IN1 Snaps: 582946 -583036
    -> ordered by wait time desc, waits desc (idle events last)
    Avg
    Total Wait wait Waits
    Event Waits Timeouts Time (s) (ms) /txn
    log file sequential read 176,386 0 3,793 22 0.2
    log file parallel write 2,685,833 0 1,813 1 3.5
    db file parallel write 239,166 0 1,350 6 0.3
    control file parallel write 33,432 0 79 2 0.0
    LGWR wait for redo copy 478,120 536 75 0 0.6
    rdbms ipc reply 10,027 0 47 5 0.0
    control file sequential read 32,414 0 40 1 0.0
    db file scattered read 4,101 0 30 7 0.0
    db file sequential read 13,946 0 29 2 0.0
    direct path read 203,694 0 14 0 0.3
    log buffer space 363 0 13 37 0.0
    latch free 3,766 0 9 2 0.0
    direct path write 80,491 0 6 0 0.1
    async disk IO 351,955 0 4 0 0.5
    enqueue 28 0 1 21 0.0
    buffer busy waits 1,281 0 1 0 0.0
    log file single write 172 0 0 1 0.0
    rdbms ipc message 10,563,204 251,286 992,837 94 13.7
    pmon timer 34,751 34,736 78,600 2262 0.0
    smon timer 7,462 113 76,463 10247 0.0
    Instance Activity Stats for DB: P7IN1 Instance: P7IN1 Snaps: 582946 -583036
    Statistic Total per Second per Trans
    CPU used by this session 49,206,154 611.0 63.6
    CPU used when call started 49,435,735 613.9 63.9
    CR blocks created 6,740,777 83.7 8.7
    Cached Commit SCN referenced 423,253,503 5,256.0 547.2
    Commit SCN cached 19,165 0.2 0.0
    DBWR buffers scanned 48,276,489 599.5 62.4
    DBWR checkpoint buffers written 6,959,752 86.4 9.0
    DBWR checkpoints 454 0.0 0.0
    DBWR free buffers found 44,817,183 556.5 57.9
    DBWR lru scans 137,149 1.7 0.2
    DBWR make free requests 162,528 2.0 0.2
    DBWR revisited being-written buff 4,220 0.1 0.0
    DBWR summed scan depth 48,276,489 599.5 62.4
    DBWR transaction table writes 5,036 0.1 0.0
    DBWR undo block writes 2,989,436 37.1 3.9
    DDL statements parallelized 3,723 0.1 0.0
    DFO trees parallelized 4,157 0.1 0.0
    DML statements parallelized 3 0.0 0.0
    OS Block input operations 29,850 0.4 0.0
    OS Block output operations 1,591 0.0 0.0
    OS Characters read/written 182,109,814,791 2,261,447.1 235,416.9
    OS Integral unshared data size ################## 242,463,432.4 ############
    OS Involuntary context switches 188,257,786 2,337.8 243.4
    OS Maximum resident set size 43,518,730,619 540,417.4 56,257.5
    OS Page reclaims 159,430,953 1,979.8 206.1
    OS Signals received 5,260,938 65.3 6.8
    OS Socket messages received 79,438,383 986.5 102.7
    OS Socket messages sent 93,064,176 1,155.7 120.3
    OS System time used 10,936,430 135.8 14.1
    OS User time used 132,043,884 1,639.7 170.7
    OS Voluntary context switches 746,207,739 9,266.4 964.6
    PX local messages recv'd 55,120,663 684.5 71.3
    PX local messages sent 55,120,817 684.5 71.3
    Parallel operations downgraded 1 3 0.0 0.0
    Parallel operations not downgrade 4,154 0.1 0.0
    SQL*Net roundtrips to/from client 155,422,335 1,930.0 200.9
    SQL*Net roundtrips to/from dblink 18 0.0 0.0
    active txn count during cleanout 16,529,551 205.3 21.4
    background checkpoints completed 43 0.0 0.0
    background checkpoints started 43 0.0 0.0
    background timeouts 280,202 3.5 0.4
    branch node splits 4,428 0.1 0.0
    buffer is not pinned count 6,382,440,322 79,257.4 8,250.7
    buffer is pinned count 9,675,661,370 120,152.8 12,507.9
    bytes received via SQL*Net from c 67,384,496,376 836,783.4 87,109.3
    bytes received via SQL*Net from d 6,142 0.1 0.0
    bytes sent via SQL*Net to client 50,240,643,657 623,890.4 64,947.1
    bytes sent via SQL*Net to dblink 3,701 0.1 0.0
    calls to get snapshot scn: kcmgss 145,385,064 1,805.4 187.9
    calls to kcmgas 36,816,132 457.2 47.6
    calls to kcmgcs 3,514,770 43.7 4.5
    change write time 369,373 4.6 0.5
    cleanout - number of ktugct calls 20,954,488 260.2 27.1
    cleanouts and rollbacks - consist 6,357,174 78.9 8.2
    cleanouts only - consistent read 10,078,802 125.2 13.0
    cluster key scan block gets 69,403,565 861.9 89.7
    Instance Activity Stats for DB: P7IN1 Instance: P7IN1 Snaps: 582946 -583036
    Statistic Total per Second per Trans
    cluster key scans 41,311,211 513.0 53.4
    commit cleanout failures: block l 413,776 5.1 0.5
    commit cleanout failures: buffer 414 0.0 0.0
    commit cleanout failures: callbac 41,194 0.5 0.1
    commit cleanout failures: cannot 174,382 2.2 0.2
    commit cleanouts 11,469,056 142.4 14.8
    commit cleanouts successfully com 10,839,290 134.6 14.0
    commit txn count during cleanout 17,155,424 213.0 22.2
    consistent changes 145,418,277 1,805.8 188.0
    consistent gets 8,043,252,188 99,881.4 10,397.7
    consistent gets - examination 3,180,028,047 39,489.7 4,110.9
    current blocks converted for CR 9 0.0 0.0
    cursor authentications 14,926 0.2 0.0
    data blocks consistent reads - un 143,706,500 1,784.6 185.8
    db block changes 302,577,666 3,757.4 391.2
    db block gets 336,562,217 4,179.4 435.1
    deferred (CURRENT) block cleanout 2,912,793 36.2 3.8
    dirty buffers inspected 627,174 7.8 0.8
    enqueue conversions 1,296,337 16.1 1.7
    enqueue releases 13,053,200 162.1 16.9
    enqueue requests 13,239,092 164.4 17.1
    enqueue timeouts 185,878 2.3 0.2
    enqueue waits 114,120 1.4 0.2
    exchange deadlocks 7,390 0.1 0.0
    execute count 105,475,101 1,309.8 136.4
    free buffer inspected 1,604,407 19.9 2.1
    free buffer requested 258,126,047 3,205.4 333.7
    hot buffers moved to head of LRU 22,793,576 283.1 29.5
    immediate (CR) block cleanout app 16,436,010 204.1 21.3
    immediate (CURRENT) block cleanou 2,860,013 35.5 3.7
    index fast full scans (direct rea 12,375 0.2 0.0
    index fast full scans (full) 3,733 0.1 0.0
    index fast full scans (rowid rang 192,148 2.4 0.3
    index fetch by key 1,321,024,486 16,404.5 1,707.7
    index scans kdiixs1 406,165,684 5,043.8 525.1
    leaf node 90-10 splits 50,373 0.6 0.1
    leaf node splits 697,235 8.7 0.9
    logons cumulative 884,756 11.0 1.1
    messages received 3,276,719 40.7 4.2
    messages sent 3,257,171 40.5 4.2
    no buffer to keep pinned count 569 0.0 0.0
    no work - consistent read gets 4,406,092,172 54,715.0 5,695.8
    opened cursors cumulative 20,527,704 254.9 26.5
    parse count (failures) 267,088 3.3 0.4
    parse count (hard) 468,996 5.8 0.6
    parse count (total) 19,960,548 247.9 25.8
    parse time cpu 323,024 4.0 0.4
    parse time elapsed 8,393,422 104.2 10.9
    physical reads 537,189,332 6,670.8 694.4
    physical reads direct 292,545,140 3,632.8 378.2
    physical writes 70,409,002 874.3 91.0
    physical writes direct 59,248,394 735.8 76.6
    physical writes non checkpoint 69,103,391 858.1 89.3
    pinned buffers inspected 11,893 0.2 0.0
    prefetched blocks 95,892,161 1,190.8 124.0
    prefetched blocks aged out before 1,495,883 18.6 1.9
    Instance Activity Stats for DB: P7IN1 Instance: P7IN1 Snaps: 582946 -583036
    Statistic Total per Second per Trans
    process last non-idle time ################## ############## ############
    queries parallelized 417 0.0 0.0
    recursive calls 122,323,299 1,519.0 158.1
    recursive cpu usage 3,144,533 39.1 4.1
    redo blocks written 180,881,558 2,246.2 233.8
    redo buffer allocation retries 5,400 0.1 0.0
    redo entries 164,728,513 2,045.6 213.0
    redo log space requests 1,006 0.0 0.0
    redo log space wait time 2,230 0.0 0.0
    redo ordering marks 2,563 0.0 0.0
    redo size 108,208,614,904 1,343,739.0 139,883.4
    redo synch time 558,520 6.9 0.7
    redo synch writes 2,343,824 29.1 3.0
    redo wastage 1,126,585,600 13,990.0 1,456.4
    redo write time 718,655 8.9 0.9
    redo writer latching time 7,763 0.1 0.0
    redo writes 2,685,833 33.4 3.5
    rollback changes - undo records a 522,742 6.5 0.7
    rollbacks only - consistent read 335,177 4.2 0.4
    rows fetched via callback 1,100,990,382 13,672.1 1,423.3
    session connect time ################## ############## ############
    session cursor cache count 1,061 0.0 0.0
    session cursor cache hits 1,687,796 21.0 2.2
    session logical reads 8,061,057,193 100,102.5 10,420.7
    session pga memory 1,573,228,913,832 19,536,421.0 2,033,743.8
    session pga memory max 1,841,357,626,496 22,866,054.4 2,380,359.0
    session uga memory 1,074,114,630,336 13,338,399.4 1,388,529.0
    session uga memory max 386,645,043,296 4,801,374.0 499,823.6
    shared hash latch upgrades - no w 410,360,146 5,095.9 530.5
    sorts (disk) 2,657 0.0 0.0
    sorts (memory) 126,165,625 1,566.7 163.1
    sorts (rows) 24,048,783,304 298,638.8 31,088.3
    summed dirty queue length 5,438,201 67.5 7.0
    switch current to new buffer 1,302,798 16.2 1.7
    table fetch by rowid 6,201,503,534 77,010.5 8,016.8
    table fetch continued row 26,649,697 330.9 34.5
    table scan blocks gotten 1,864,435,032 23,152.6 2,410.2
    table scan rows gotten 43,639,997,280 541,923.3 56,414.3
    table scans (cache partitions) 26,112 0.3 0.0
    table scans (direct read) 246,243 3.1 0.3
    table scans (long tables) 340,200 4.2 0.4
    table scans (rowid ranges) 359,617 4.5 0.5
    table scans (short tables) 9,111,559 113.2 11.8
    transaction rollbacks 4,819 0.1 0.0
    transaction tables consistent rea 824 0.0 0.0
    transaction tables consistent rea 1,386,848 17.2 1.8
    user calls 159,931,913 1,986.0 206.8
    user commits 746,543 9.3 1.0
    user rollbacks 27,020 0.3 0.0
    write clones created in backgroun 7 0.0 0.0
    write clones created in foregroun 4,350 0.1 0.0
    Buffer Pool Statistics for DB: P7IN1 Instance: P7IN1 Snaps: 582946 -583036
    -> Standard block size Pools D: default, K: keep, R: recycle
    -> Default Pools for other block sizes: 2k, 4k, 8k, 16k, 32k
    Free Write Buffer
    Number of Cache Buffer Physical Physical Buffer Complete Busy
    P Buffers Hit % Gets Reads Writes Waits Waits Waits
    D 774,144 95.6############ 233,869,082 10,089,734 0 0########
    K 504,000 99.9############ 3,260,227 1,070,338 0 0 65,898
    R 63,504 96.2 196,079,539 7,511,863 535 0 0 0
    Buffer wait Statistics for DB: P7IN1 Instance: P7IN1 Snaps: 582946 -583036
    -> ordered by wait time desc, waits desc
    Tot Wait Avg
    Class Waits Time (s) Time (ms)
    data block 7,791,121 14,676 2
    file header block 587 101 172
    undo header 151,617 71 0
    segment header 299,312 58 0
    1st level bmb 45,235 7 0
    bitmap index block 392 1 3
    undo block 4,250 1 0
    2nd level bmb 14 0 0
    system undo header 2 0 0
    3rd level bmb 1 0 0
    Latch Activity for DB: P7IN1 Instance: P7IN1 Snaps: 582946 -583036
    ->"Get Requests", "Pct Get Miss" and "Avg Slps/Miss" are statistics for
    willing-to-wait latch get requests
    ->"NoWait Requests", "Pct NoWait Miss" are for no-wait latch get requests
    ->"Pct Misses" for both should be very close to 0.0
    Pct Avg Wait Pct
    Get Get Slps Time NoWait NoWait
    Latch Requests Miss /Miss (s) Requests Miss
    Consistent RBA 2,686,230 0.0 0.2 0 0
    FAL request queue 86 0.0 0 0
    FAL subheap alocation 0 0 2 0.0
    FIB s.o chain latch 1,089 0.0 0 0
    FOB s.o list latch 4,589,986 0.5 0.0 2 0
    NLS data objects 1 0.0 0 0
    SQL memory manager worka 5,963 0.0 0 0
    Token Manager 0 0 2 0.0
    active checkpoint queue 719,439 0.3 0.1 0 1 0.0
    alert log latch 184 0.0 0 2 0.0
    archive control 4,365 0.0 0 0
    archive process latch 1,808 0.6 0.6 0 0
    begin backup scn array 3,387,572 0.0 0.0 0 0
    cache buffer handles 1,577,222 0.2 0.0 0 0
    cache buffers chains ############## 0.5 0.0 430 354,357,972 0.3
    cache buffers lru chain 17,153,023 0.1 0.0 1 385,505,654 0.5
    cas latch 538,804,153 0.3 0.0 7 0
    channel handle pool latc 1,776,950 0.5 0.0 0 0
    channel operations paren 2,901,371 0.3 0.0 0 0
    checkpoint queue latch 99,329,722 0.0 0.0 0 11,153,369 0.1
    child cursor hash table 3,927,427 0.0 0.0 0 0
    commit callback allocati 8,739 0.0 0 0
    dictionary lookup 7,980 0.0 0 0
    dml lock allocation 6,767,990 0.1 0.0 0 0
    dummy allocation 1,898,183 0.2 0.1 0 0
    enqueue hash chains 27,741,348 0.1 0.1 4 0
    enqueues 17,450,161 0.3 0.1 6 0
    error message lists 132,828 2.6 0.2 1 0
    event group latch 884,066 0.0 0.7 0 0
    event range base latch 1 0.0 0 0
    file number translation 34 38.2 0.9 0 0
    global tx hash mapping 577,859 0.0 0 0
    hash table column usage 4,062 0.0 0 8,757,234 0.0
    hash table modification 16 0.0 0 2 0.0
    i/o slave adaptor 0 0 2 0.0
    job workq parent latch 4 100.0 0.3 0 494 8.7
    job_queue_processes para 1,950 0.0 0 2 0.0
    ksfv messages 0 0 4 0.0
    ktm global data 8,219 0.0 0 0
    lgwr LWN SCN 2,687,862 0.0 0.0 0 0
    library cache 310,882,781 0.9 0.0 34 104,759 4.0
    library cache load lock 30,369 0.0 0.3 0 0
    library cache pin 153,821,358 0.1 0.0 2 0
    library cache pin alloca 126,316,296 0.1 0.0 4 0
    list of block allocation 2,730,808 0.3 0.0 0 0
    loader state object free 566,036 0.1 0.0 0 0
    longop free list parent 197,368 0.0 0 8,390 0.0
    message pool operations 14,424 0.0 0.0 0 0
    messages 25,931,764 0.1 0.0 1 0
    mostly latch-free SCN 40,124,948 0.3 0.0 5 0
    Latch Sleep breakdown for DB: P7IN1 Instance: P7IN1 Snaps: 582946 -583036
    -> ordered by misses desc
    Get Spin &
    Latch Name Requests Misses Sleeps Sleeps 1->4
    cache buffers chains ############## 74,770,083 1,062,119 73803903/884
    159/71439/10
    582/0
    redo allocation 170,107,983 3,441,055 149,631 3292872/1467
    48/1426/9/0
    library cache 310,882,781 2,831,747 89,240 2754499/6780
    6/7405/2037/
    0
    shared pool 158,471,190 1,755,922 55,268 1704342/4836
    9/2826/385/0
    cas latch 538,804,153 1,553,992 6,927 1547125/6808
    /58/1/0
    row cache objects 161,142,207 1,176,998 27,658 1154070/1952
    0/2560/848/0
    process queue reference 1,893,917,184 1,119,215 106,454 78758/4351/1
    36/0/0
    Library Cache Activity for DB: P7IN1 Instance: P7IN1 Snaps: 582946 -583036
    ->"Pct Misses" should be very low
    Get Pct Pin Pct Invali-
    Namespace Requests Miss Requests Miss Reloads dations
    BODY 3,137,721 0.0 3,137,722 0.0 0 0
    CLUSTER 6,741 0.1 4,420 0.2 0 0
    INDEX 353,708 0.8 361,065 1.2 0 0
    SQL AREA 17,052,073 0.3 54,615,678 0.9 410,682 19,628
    TABLE/PROCEDURE 3,521,884 0.2 12,922,737 0.1 619 0
    TRIGGER 1,975,977 0.0 1,975,977 0.0 1 0
    SGA Memory Summary for DB: P7IN1 Instance: P7IN1 Snaps: 582946 -583036
    SGA regions Size in Bytes
    Database Buffers 22,330,474,496
    Fixed Size 779,288
    Redo Buffers 17,051,648
    Variable Size 7,180,648,448
    sum 29,528,953,880

  • Tablespaces and block size in Data Warehouse

    We are preparing to implement a data warehouse on Oracle 11g R2 and currently I am trying to set up a storage strategy; unfortunately I have very little experience with that. The question is: what is the general advice regarding tablespaces and block size? I have done some research and it is hard to find a clear answer; some resources advise that block size is not important and can be left small (8 KB), while others state that it is crucial and should be as big as possible (64 KB). The other thing is what data should be placed where. Many resources state that keeping indexes apart from their data is a myth and a bad practice as it may decrease performance; others say that although there is no performance benefit, index tablespaces do not need to be backed up and that is why they should be split out. The next idea is to have separate tablespaces for big tables, small tables, and tables accessed frequently and infrequently. How should I organize partitions in terms of tablespaces? Is it a good idea to have "old" (read-only) data partitions in separate tablespaces?
    Any help highly appreciated and thank you in advance.

    Wojtus-J wrote:
    We are preparing to implement a data warehouse on Oracle 11g R2 and currently I am trying to set up a storage strategy - unfortunately I have very little experience with that. With little experience, the key feature is to avoid big mistakes - don't try to get too clever.
    The question is: what is the general advice regarding tablespaces and block size? If you need to ask about block sizes, use the default (i.e. 8KB).
    I did some research and it is hard to find a clear answer. But if you get contradictory advice from this forum, how would you decide which bits to follow?
    A couple of sensible guidelines when researching on the internet - look for material that is datestamped with recent dates (the last couple of years), or that references recent - or at least relevant - versions of Oracle. Give preference to material that explains WHY an idea might be relevant; give greater preference to material that DEMONSTRATES why an idea might be relevant. Check that any explanations and demonstrations are relevant to your planned setup.
    The other thing is what data should be placed where. Many resources state that keeping indexes apart from their data is a myth and a bad practice as it may decrease performance; others say that although there is no performance benefit, index tablespaces do not need to be backed up and that is why they should be split out. The next idea is to have separate tablespaces for big tables, small tables, and tables accessed frequently and infrequently. How should I organize partitions in terms of tablespaces? Is it a good idea to have "old" (read-only) data partitions in separate tablespaces?
    It is often convenient, and sometimes very important, to separate data into different tablespaces based on some aspect of functionality. The performance thing was mooted (badly) in an era when discs were small and (disk) partitions were hard; but all your other examples of why to split are potentially valid for administrative reasons: big/small, table/index, old/new, read-only/read-write, fact/dimension, etc.
    For data warehouses a fairly common practice is to identify some sort of aging pattern for the data, and try to pick a boundary that allows you to partition data so that a large fraction of the data can eventually be made read-only: using tablespaces to mark time boundaries can be a great convenience - note that the tablespace boundary need not match the partition boundary - e.g. daily partitions in a monthly tablespace. If you take this type of approach, you might have a "working" tablespace for recent data, and then copy the older data to a "time-specific" tablespace, packing it and making it read-only as you do so.
    Tablespaces are (broadly speaking) about strategy, not performance. (Temporary tablespaces / tablespace groups are probably the exception to this thought.)
    Regards
    Jonathan Lewis
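    A minimal sketch of the time-boundary idea described above (table, index and tablespace names are hypothetical): a closed-off yearly partition can be packed into its own tablespace and then frozen.
    -- Move the old partition into its "time-specific" tablespace and
    -- rebuild the local index partition that the move leaves unusable.
    ALTER TABLE sales MOVE PARTITION p_2008 TABLESPACE ts_sales_2008;
    ALTER INDEX sales_ix REBUILD PARTITION p_2008 TABLESPACE ts_sales_2008;
    -- Make the tablespace read-only so it can drop out of the routine backup cycle.
    ALTER TABLESPACE ts_sales_2008 READ ONLY;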

  • Table and Index compression in data warehouse - thoughts?

    Hi,
    We have a data warehouse with large fact tables and materialized views of this data.
    Approximately 3 million inserts per day; on weekends about 12 million.
    The fact tables are expected to hold around 200 million rows, with a couple at 1-3 billion.
    The tables are partitioned and have bitmap indexes.
    Just wondered what your thoughts are about compressing large fact tables and mviews, both from the point of view of ETL into them and of reporting from them afterwards.
    I take it we can compress/uncompress accordingly without any problem?
    Many Thanks

    After compression, most SELECT statements would not get slower. Actually, many can get faster due to reduced IO and buffer needs.
    The situation with DML is more complex. It depends on the exact compression options (basic or advanced) and the DML (INSERT, UPDATE, direct load, ...), but generally DML is negatively affected by compression.
    In data warehouses (DWs), it is usually quite beneficial to compress partitions or tables that contain data that is not supposed to be modified (read-only or read-mostly). Please note that in many cases you do not have to compress while you are loading the data; you can do that later.
    You can also consider compressing some of your B-tree indexes (if you use them in your DW system).
    Iordan Iotzov
    http://iiotzov.wordpress.com/
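    A rough sketch of the "compress it later" approach mentioned above (object names are hypothetical): an already-loaded partition that will no longer be modified can be compressed with a partition move, after which its local bitmap index partitions must be rebuilt.
    ALTER TABLE fact_sales MOVE PARTITION p_2011_q4 COMPRESS;
    -- The move marks local index partitions UNUSABLE, so rebuild them.
    ALTER INDEX fact_sales_cust_bix REBUILD PARTITION p_2011_q4;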

  • Compression and query performance in data warehouses

    Hi,
    Using Oracle 11.2.0.3, we have a large fact table with bitmap indexes to the associated dimensions.
    I understand bitmap indexes are compressed by default, so I assume we cannot compress them further.
    Is this correct?
    We wish to try compressing the large fact table to see if this will reduce the I/O on reads and therefore give performance benefits.
    ETL speed is fine; we just want to increase report performance.
    Thoughts? Has anyone seen significant gains in data warehouse report performance with compression?
    Also, the current PCTFREE on the table is 10%.
    As we only insert into the table, we are considering making this 1% to improve report performance.
    Thoughts?
    Thanks

    First of all:
    Table Compression and Bitmap Indexes
    To use table compression on partitioned tables with bitmap indexes, you must do the following before you introduce the compression attribute for the first time:
    Mark bitmap indexes unusable.
    Set the compression attribute.
    Rebuild the indexes.
    The first time you make a compressed partition part of an existing, fully uncompressed partitioned table, you must either drop all existing bitmap indexes or mark them UNUSABLE before adding a compressed partition. This must be done irrespective of whether any partition contains any data. It is also independent of the operation that causes one or more compressed partitions to become part of the table. This does not apply to a partitioned table having B-tree indexes only.
    This rebuilding of the bitmap index structures is necessary to accommodate the potentially higher number of rows stored for each data block with table compression enabled. Enabling table compression must be done only for the first time. All subsequent operations, whether they affect compressed or uncompressed partitions, or change the compression attribute, behave identically for uncompressed, partially compressed, or fully compressed partitioned tables.
    To avoid the recreation of any bitmap index structure, Oracle recommends creating every partitioned table with at least one compressed partition whenever you plan to partially or fully compress the partitioned table in the future. This compressed partition can stay empty or even can be dropped after the partition table creation.
    Having a partitioned table with compressed partitions can lead to slightly larger bitmap index structures for the uncompressed partitions. The bitmap index structures for the compressed partitions, however, are usually smaller than the appropriate bitmap index structure before table compression. This highly depends on the achieved compression rates.
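    A sketch of that three-step sequence for a hypothetical partitioned fact table SALES with a local bitmap index SALES_CUST_BIX (adapt the names and the compression clause to your own objects):
    -- 1. Mark the bitmap index unusable.
    ALTER INDEX sales_cust_bix UNUSABLE;
    -- 2. Introduce the compression attribute for the first time.
    ALTER TABLE sales MODIFY DEFAULT ATTRIBUTES COMPRESS;
    -- 3. Rebuild the index (partition by partition for a local index).
    ALTER INDEX sales_cust_bix REBUILD PARTITION p_2011_q4;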

  • Oracle Development Survey: Data Warehouses Customers

    At the start of most data warehouse projects, or even during a project, I am sure you as customers try to find answers to the following questions to help you plan and manage your environments:
    * Where can I find trend and comparison information to help me plan for future growth of my data warehouse?
    * How many cpu's do other customers use per terabyte?
    * How many partitions are typically used in large tables? How many indexes?
    * How much should I allocate for memory for buffer cache?
    * How does my warehouse compare to others of similar and larger scale?
    The data warehouse development team, here at Oracle would like to help provide answers to these questions. However, to do this we need your help. If you have an existing data warehouse environment, we would like to obtain more technical information about your environment(s) by running a simple measurement script and returning the output files to us, here at Oracle. This will allow our developers to provide comprehensive documents that explain best practices and get a better understanding of which features our customers use the most. This will also allow you as Customers, to benchmark your environments compared to other customers’ environments.
    From a Company perspective we are also interested to get feedback on features we have added to the database, are these features used, how are they used etc. For example we are keen to understand:
    * Which initialization parameters are most frequently used at what values?
    * How many Oracle data warehouses run on RAC? on single nodes?
    * Is there a trend one-way or the other, especially as data volumes increase?
    * Does this change with newer releases of the database?
    All results from these scripts will be held confidential. No customers will be mentioned by name; only summaries and trends will be reported (e.g., “X percent of tables are partitioned and Y percent are indexed in data warehouses that are Z terabytes and larger in size.” or “X percent of Oracle9i and Y percent of Oracle10g data warehouses surveyed run RAC”). Results will be written up as a summarized report. Every participating customer will receive a copy of the report.
    Terabyte and larger DW are the primary interest, but information on any data warehouse environment is useful. We would like to have as many customers as possible submit results, ideally by the end of this week. However, this will be an on going process so regular feedback after this week is extremely useful.
    To help our developers and product management team please download and run the DW measurement script kit from OTN which is available from the following link:
    http://www.oracle.com/technology/products/bi/db/10g/dw_survey_0206.html
    Please return the script outputs using the link shown on the above web page, see the FAQ section, or alternatively mail them directly to me: [email protected].
    Thank you and we look forward to your responses.

  • Oracle Development Survey on Data Warehouses: How Does Yours Compare?

    At the start of most data warehouse projects, or even during a project, I am sure you as customers try to find answers to the following questions to help you plan and manage your environments:
    * Where can I find trend and comparison information to help me plan for future growth of my data warehouse?
    * How many cpu's do other customers use per terabyte?
    * How many partitions are typically used in large tables? How many indexes?
    * How much should I allocate for memory for buffer cache?
    * How does my warehouse compare to others of similar and larger scale?
    The data warehouse development team, here at Oracle, would like to help provide answers to these questions. However, to do this we need your help. If you have an existing data warehouse environment, we would like to obtain more technical information about your environment(s) by running a simple measurement script and returning the output files to us, here at Oracle. This will allow our developers to provide comprehensive documents that explain best practices and get a better understanding of which features our customers use the most. This will also allow you as Customers, to benchmark your environments compared to other customers’ environments.
    From a Company perspective we are also interested to get feedback on features we have added to the database, are these features used, how are they used etc. For example we are keen to understand:
    * Which initialization parameters are most frequently used at what values?
    * How many Oracle data warehouses run on RAC? on single nodes?
    * Is there a trend one-way or the other, especially as data volumes increase?
    * Does this change with newer releases of the database?
    All results from these scripts will be held confidential. No customers will be mentioned by name; only summaries and trends will be reported (e.g., “X percent of tables are partitioned and Y percent are indexed in data warehouses that are Z terabytes and larger in size.” or “X percent of Oracle9i and Y percent of Oracle10g data warehouses surveyed run RAC”). Results will be written up as a summarized report. Every participating customer will receive a copy of the report.
    Terabyte and larger DW are the primary interest, but information on any data warehouse environment is useful. We would like to have as many customers as possible submit results, ideally by the end of this week. However, this will be an on going process so regular feedback after this week is extremely useful.
    To help our developers and product management team please download and run the DW measurement script kit from OTN which is available from the following link:
    http://www.oracle.com/technology/products/bi/db/10g/dw_survey_0206.html
    Please return the script outputs using the link shown on the above web page, see the FAQ section, or alternatively mail them directly to me: [email protected].

    969224 wrote:
    Hi guys, just a quick question. When we have a primary key on 4 columns and we have, say, 20 million rows and we want to add one extra row, how does Oracle check whether the primary key of the record being added is unique compared to the 20 million existing rows? Does it actually compare the record being added to all the rows present in the table?
    Not the whole row: it compares the 4 columns in the INDEX against the 4 columns in the new row.

  • Data warehouse configuration doubt

    Hi guys,
    I'm studying data warehousing and I would like to know if there's any documentation showing the correct way to configure a DW database, e.g. parameters, block size and partitioned tables.
    Or, if you have experience, please post your suggestions here.
    Thank you,
    Felipe

    The documentation is a great place to start, beginning with Oracle's Data Warehousing Guide:
    http://download-uk.oracle.com/docs/cd/B10501_01/server.920/a96520/toc.htm
    You can also search for data warehousing books on Google or any other search engine.
    Jaffar

  • Data warehouse backups and read only tablespaces

    Hi all,
    I am working on a data warehouse database with the following specs:
    Version: Oracle 10.2.0.3 Enterprise
    OS: Solaris
    App: Data warehouse
    We use RMAN to take 'level 0' and 'level 1' backups. We have block change tracking enabled, and RMAN backs up data files and archive logs straight to tape.
    I am exploring ways of reducing the 'level 0' backups and was specifically focussing on using read-only tablespaces for this purpose.
    I have often seen it mentioned that a best practice in DWs is to store the old, static partitions of fact tables in read-only tablespaces so as to reduce the backup size.
    In case you have already implemented such a scheme, I would like to know how you have implemented it.
    I am thinking of the following mechanism:
    -- Start using backups at tablespace level rather than 'level 0' at database level.
    -- Record the latest SCNs of all datafiles prior to backup.
    -- If the latest SCN has not changed since last backup and the tablespace is in read only mode then
    -- Check if a backup copy of the tablespace has been done within the recovery window and is accessible.
    -- If the copy exists then don't backup that tablespace, else backup the tablespace.
    -- If the tablespace is read/write then back it up.
    I haven't delved into the low-level details, but this seems to be a lot of work. So I just wanted to know from you if there's any ready-made feature which makes all this easier.
    Many thanks in advance.

    Thank you so much for your help.
    Backup optimization was indeed the thing I was looking for. To be honest I had done a bit of RTFM, but I didn't check the advanced user guide.
    Although my specific question has been answered, it would be interesting to know what other things other people are implementing to reduce backups etc.
    I am also thinking of following options:
    -- Turn on index monitoring to get rid of unused indexes.
    -- Stop the backups of 'index' tablespaces.
    -- Archive off old data.
    Any other ideas for reducing DW backup size?
    Many thanks.
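    For reference, a minimal sketch of the read-only tablespace side of this (the tablespace name is hypothetical). Combined with RMAN's CONFIGURE BACKUP OPTIMIZATION ON, datafiles of read-only tablespaces that already have a backup inside the retention window are skipped automatically.
    -- Freeze the tablespace that holds old, static fact partitions.
    ALTER TABLESPACE ts_facts_2006 READ ONLY;
    -- If the data ever needs restating, switch it back temporarily:
    -- ALTER TABLESPACE ts_facts_2006 READ WRITE;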

  • ACI Setup - How to Configure Data Warehouse Database - Partitoning

    After reading the ACI Install Guide & Data Warehouse documentation, I have some questions regarding how to setup the database:
    - Should database partitioning be set up? If so, what tables should be partitioned and what should they be partitioned by?
    - Are there any other best practices or tips for setting up & tuning the database?
    We are trying to avoid the (painful) situation of having to add partitioning later on; it is much easier to add it up front (if done correctly up front).
    Thanks in advance for any advice!

    On the tables recommended for partitioning, the partition key is nullable. If ATG inserts a null value into the timestamp column of one of the partitioned tables, we'll receive an ORA-14300 or ORA-14440 error. Oracle isn't able to figure out what partition to map that record to.
    Can the columns be changed to NOT NULL? Or can the application guarantee that a null value won't be inserted?
    Here are some example columns:
    ARF_SITE_VISIT.START_VISIT_TIMESTAMP --> TIMESTAMP(6) null
    ARF_REGISTRATION.REGISTRATION_TIMESTAMP --> TIMESTAMP(6) null
    ARF_LINE_ITEM.SUBMIT_TIMESTAMP --> TIMESTAMP(6) null
    ARF_PROMOTION_USAGE.USAGE_TIMESTAMP --> TIMESTAMP(6) null
    ARF_RETURN_ITEM.SUBMIT_TIMESTAMP --> TIMESTAMP(6) null
    Thanks
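    If the application cannot be made to guarantee non-null values, one possible workaround (a sketch only; confirm with ATG whether a default value is acceptable, and note it assumes the existing rows contain no NULLs) is to back the partition key with a default and a NOT NULL constraint:
    ALTER TABLE arf_site_visit
      MODIFY (start_visit_timestamp DEFAULT SYSTIMESTAMP NOT NULL);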

  • Where to find best practices for tuning data warehouse ETL queries?

    Hi Everybody,
    Where can I find some good educational material on tuning ETL procedures for a data warehouse environment? Everything I've found on the web regarding query tuning seems to be geared only toward OLTP systems. (For example, most of our ETL queries don't use a WHERE clause, so the vast majority of accesses are table scans and index scans, whereas most index tuning sites are striving for index seeks.)
    I have read Microsoft's "Best Practices for Data Warehousing with SQL Server 2008R2," but I was only able to glean a few helpful hints that don't also apply to OLTP systems:
    often better to recompile stored procedure query plans in order to eliminate variances introduced by parameter sniffing (i.e., better to use the right plan than to save a few seconds and use a cached plan SOMETIMES);
    partition tables that are larger than 50 GB;
    use minimal logging to load data precisely where you want it as fast as possible;
    often better to disable non-clustered indexes before inserting a large number of rows and then rebuild them immediately afterward (sometimes even for clustered indexes, but test first);
    rebuild statistics after every load of a table.
    But I still feel like I'm missing some very crucial concepts for performant ETL development.
    BTW, our office uses SSIS, but only as a glorified stored procedure execution manager, so I'm not looking for SSIS ETL best practices. Except for a few packages that pull from source systems, the majority of our SSIS packages consist of numerous "Execute SQL" tasks.
    Thanks, and any best practices you could include here would be greatly appreciated.
    -Eric

    Online ETL solutions are really one of the biggest challenges, and to do them efficiently you can read my blogs on online DWH solutions. They show how you can configure an online DWH solution for ETL using the MERGE command of SQL Server 2008, and also cover some important concepts relevant to any DWH solution, such as indexing and de-normalization.
    http://www.sqlserver-performance-tuning.com/apps/blog/show/12927061-data-warehousing-workshop-1-4-
    http://www.sqlserver-performance-tuning.com/apps/blog/show/12927103-data-warehousing-workshop-2-4-
    http://www.sqlserver-performance-tuning.com/apps/blog/show/12927173-data-warehousing-workshop-3-4-
    http://www.sqlserver-performance-tuning.com/apps/blog/show/12927061-data-warehousing-workshop-1-4-
    Kindly let me know if any further help is needed
    Shehap (DB Consultant/DB Architect) Think More deeply of DB Stress Stabilities
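    As a small illustration of the disable/rebuild and statistics tips in the list above (T-SQL; the table, index and staging object names are hypothetical):
    -- Disable non-clustered indexes before a large load, then rebuild them.
    ALTER INDEX IX_FactSales_DateKey ON dbo.FactSales DISABLE;

    INSERT INTO dbo.FactSales WITH (TABLOCK)
    SELECT * FROM staging.FactSales_Load;

    ALTER INDEX IX_FactSales_DateKey ON dbo.FactSales REBUILD;

    -- Refresh statistics after the load.
    UPDATE STATISTICS dbo.FactSales WITH FULLSCAN;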

  • Analyze command in a Data warehouse env

    We do daily data loads on our data warehouse. On certain target tables we have change data capture enabled. As part of loading a table (4 million rows total), we remove data for a certain time period (say a month, 50,000+ rows) and load it again from the source. We also do a full table analyze as part of this load, and it is taking a long time.
    The question is: do we need to run the analyze every day? Would we see a big difference if we ran it once a week?
    Thanks.

    Hi srwijese,
    My DW actually has 12 TB, and after each data load we collect stats on our tables, BUT we have partitioned tables in most cases, so we just collect them at partition level using the DBMS_STATS package. I don't know whether your environment is partitioned or not; if it is, collect stats just for the partition loaded.
    P.S: If you wish add [email protected] (MSN) to share experiences.
    Jonathan Ferreira
    http://oracle4dbas.blogspot.com
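    A sketch of partition-level stats collection with DBMS_STATS (the owner, table and partition names are placeholders):
    BEGIN
      DBMS_STATS.GATHER_TABLE_STATS(
        ownname     => 'DWH',
        tabname     => 'FACT_SALES',
        partname    => 'P_2013_03',
        granularity => 'PARTITION',
        cascade     => TRUE);
    END;
    /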

  • Efficiency of data warehouse sql and star/snowflake schema

    Hi,
    We are using 11.2.0.3 and need to improve the query performance of reports against a data warehouse star/snowflake schema.
    In addition to indexing, partitioning, having star_transformation enabled, etc., I am considering the impact of the following on query performance.
    The central fact table (over a billion rows) joins to a customer dimension (a few hundred thousand rows), which in turn is joined to the latest version of that dimension (which has circa 30,000 rows).
    The table with a few hundred thousand rows (the customer dimension) must always be queried, as data is stored against the version of the customer applicable at the time. We then join to latest_customer because users want to see the latest version of the customer attributes, to stop data being fragmented across several rows in the report.
    I am considering whether it would be more efficient to create a dimension which is the equivalent of customer but also stores the latest version of the customer attributes on the same row. This would mean the customer dimension would have far more columns, but queries could avoid the additional lookup of this 30k-row table.
    Would this be a material benefit?
    At the moment users would query latest_customer to, say, get all customers belonging to a certain multiple chain.
    With the change above, they would instead query the customer dimension with a few hundred thousand rows directly.
    Thoughts?
    Thanks

    We are using 11.2.0.3 and need to improve the query performance of reports against a data warehouse star/snowflake schema.
    That is NOT a realistic or even meaningful goal.
    And until you identify and document an actual PROBLEM or specific goal you should not even be considering possible solutions.
    Anything you do to improve one report might degrade the performance of several other reports.
    You need to start over and gather information about WHAT Oracle is doing for the reports now, HOW that work is being done and capture metrics that validate how the reports are currently performing.
    Your first step should be to document the performance you are getting now for each report.
    The second step would be to identify which of those reports is a possible target for tuning.
    The third step is to prioritize the reports: which is most important to tune, which is next, etc.
    Then you need to generate the execution plans for those reports to identify EXACTLY how Oracle is executing the queries now.
    At this point you should have enough information to know what your possible options are.
    So then you create a prioritized list of options. The top of the list should be additions to what you already have.
    1. New indexes - regular or bitmapped (if appropriate)
    2. Dropping indexes that aren't being used.
    3. Report-ready summary tables or materialized views.
    IMHO modifying your basic architecture should be your LAST resort and undertaken only if you can't solve your (unstated) problem using solutions that have less impact and risk.
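    For the "generate the execution plans" step, a minimal sketch (the query shown is a placeholder, not one of the actual reports):
    EXPLAIN PLAN FOR
      SELECT c.customer_name, SUM(f.sales_amount)
        FROM fact_sales f
        JOIN dim_customer c ON c.customer_key = f.customer_key
       GROUP BY c.customer_name;

    SELECT * FROM TABLE(DBMS_XPLAN.DISPLAY);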

  • TUNING DATA WAREHOUSE DATABASE INSTANCE

    Hi,
    I have to tune one of our data warehouse database instances.
    Any advice for tuning this instance?
    How different is tuning a data warehouse instance from tuning a normal instance?
    Regards

    First of all, touch nothing until you understand what your users are doing with the data warehouse, when they are doing it and what their expectations are.
    Secondly, remember that a data warehouse is, generally, much bigger than an OLTP database. This changes the laws of physics. Operations you might expect to take a few minutes might take days. This means you need to be completely certain about what you do in production before you do it.
    Thirdly, bear in mind that a lot of data warehouse tuning techniques involve physical objects - different types of indexes, partitioning - rather than query tweaking. These things are easier to get right at the start than to retrofit to large volumes of data.
    Good luck, APC

  • Best practice of metadata table in data warehouse environment ?

    Hi guru's,
    In the data warehouse we have 1. a stage schema and 2. a DWH (data warehouse reporting) schema. In staging we have about 300 source tables. In the DWH schema we create only the tables that are required from a reporting perspective. Some of the tables in the staging schema have been created in the DWH schema as well, with different table and column names. The naming convention for these tables and columns in the DWH schema is based more on business names.
    In order to keep track of these tables we are creating a metadata table in the DWH schema, for example:
    Stage                DWH_schema
    Table_1              Table_A
    Table_2              Table_B
    Table_3              Table_C
    Table_4              Table_D
    My question is how do we handle the column names in each of these tables. The stage_1, stage_2 and stage_3 column names have been renamed in the DWH schema and are part of Table_A, Table_B and Table_C.
    As said earlier, we have about 300 tables in stage and maybe around 200 tables in the DWH schema. A lot of the column names have been renamed in the DWH schema from the stage tables. Some of the tables have 200 columns, so my concern is how do we handle the column names in the metadata table? Do we need to keep only table names in the metadata table, not column names?
    Any ideas will be greatly appreciated.
    Thanks!

    Hi,
    this seems to be quite a buzzing question.
    In our project we designed a hub-and-spoke-like architecture.
    Thus we have 3 layers. L0 is the one closest to the source, and L0 table names are linked to the corresponding source names by means of a naming standard (like tabA, EXT_tabA, tabA_OK1 and so on, based on the implementation of the load procedures).
    At L1 we have the ODS (normalized model); we use business names for tables there and standard names for temporary structures and artifacts.
    Both L0 and L1 keep the source column names as a general rule; new columns, such as calculated ones, are business driven and the metadata is standard driven.
    Data Modeler fits the L1 modelling purpose perfectly.
    L2 is the dimensional schema; business names are used for tables and columns, possibly rewritten at the presentation layer (front-end tool).
    Hope this helps, D.
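    One way to keep column-level lineage without overloading the table-level list described above is a separate column-mapping table; a sketch (names are illustrative only):
    CREATE TABLE dwh_column_map (
      stage_table_name   VARCHAR2(30) NOT NULL,
      stage_column_name  VARCHAR2(30) NOT NULL,
      dwh_table_name     VARCHAR2(30) NOT NULL,
      dwh_column_name    VARCHAR2(30) NOT NULL,
      CONSTRAINT dwh_column_map_pk
        PRIMARY KEY (stage_table_name, stage_column_name)
    );
    A simple join of this table to the table-level metadata then answers both "which DWH table does stage Table_1 become?" and "what was stage column X renamed to?".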
