Physical partitioning: query performance worse than without

Hello experts,
I have copied InfoCube 0COPC_C04 to ZCOPC_C06 with a test query and took over the data from 0COPC_C04 (~ 4 million records) to test the repartitioning functionality.
I did the repartitioning following the instructions in note 1008833 for 0FISCPER (I used report RSDU_SET_FV_TO_FIX_VALUE via SE38 to set 0FISCVARNT to a constant) - starting point is 001.2000, end point is 016.2010.
Indexes and statistics were re-created.
We have Oracle 10.2.0.2.0 and SAPKW70022 on BW 700.
So from my point of view everything is fine - BUT: query performance is worse than before, as shown in RSRV.
The variable in the query is on 0FISCYEAR and 0FISCPER3, NOT on 0FISCPER (a test query with 0FISCPER brought the same bad performance result).
I also read that with physical partitioning the query reads in PARALLEL, but I cannot see any parallel activity in SM50 / SM66. I expected to see parallel reads for my query - I asked for results from 001.2006 - 012.2010, and 13 partitions were created (checked via SE14).
What might be the problem?
Thanks for your answer,
Thomas
Edited by: Thomas Liebschner on Jan 4, 2011 3:28 PM

#1171650 is already implemented (since 14.12.2010) --- I asked my colleague from the Basis team to send me the report.
> I read about parallel query execution in the SAP Press / Galileo Press book "SAP BW Performanceoptimierung" by Thomas Schröder (ISBN: 3-89842-718-8), page 376. If you like, I can send this page as a PDF to the mail address stored in your business card.
Would be interesting to see that excerpt, yes.
> What I missed was to compress the F-Table.
>
> I'm wondering about the following:
> 1. Compressing 0COPC_C04 with 8 million records gave a dramatic improvement in query performance. No partitioning.
Sure, usually a lot less data needs to be read now.
> 2. After a new full load from 0COPC_C04 to ZCOPC_C06 with rebuilt indexes and statistics, I got similar query performance compared to 1.
Also easy to understand.
The F-facttable contains data per load request, so data concerning the same business items will be in that table multiple times and needs to be aggregated on the fly during query execution.
The E-facttable, in contrast, contains just one entry per business item - which is the same situation you'll find when you have just one single data load request.
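To illustrate (a simplified sketch only - table and column names are made up, this is not the SQL or table layout BW actually generates):
-- in the F-facttable the key includes the request (package) dimension,
-- so the same business item appears once per load and must be
-- aggregated at query time, conceptually like this:
SELECT key_product, key_fiscper, SUM(amount) AS amount
FROM f_fact_table          -- stands in for /BIC/F<cube>
GROUP BY key_product, key_fiscper;
-- compression materializes this aggregation into the E-facttable,
-- dropping the request dimension, so each item is stored only once.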
> 3. If I compress ZCOPC_C06, query performance drops dramatically (it needs double the time with 0FISCPER, and 8 times more in the query with 0FISCYEAR and 0FISCPER3)
>
> Additional information: I have merged the partitions of ZCOPC_C06 into one partition --> checked via SE14, /BIC/FZCOPC_C06 -- storage parameters
Hmm... ok, here we would really need to see what Oracle does with your query and the partitioning scheme.
What seems obvious: as soon as you have just one partition, there is no way for Oracle to split up the work and leave uninteresting data aside (partition pruning).
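A quick way to see what Oracle actually has in place is the data dictionary (a sketch - substitute your own facttable name):
SELECT partition_name, high_value, num_rows
FROM user_tab_partitions
WHERE table_name = '/BIC/EZCOPC_C06'   -- example name, adjust to your cube
ORDER BY partition_position;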
> Conclusion: it seems to me that compression is enough - partitioning has no advantage in this dedicated scenario.
> Do I think right?
I think that compression is a must-do in 99% of all cases for SAP BI and that the partitioning of the E-Facttable should be reviewed on your system.
Up to here there is too little information available to tell whether or not it could benefit your query execution time.
Another aspect you might consider with partitioning the E-facttable is your data retention policy.
If you're about to throw away your data on, say, a yearly basis, then having the table partitioned that way can provide huge benefits.
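For illustration, a minimal sketch with hypothetical table and partition names - dropping a whole partition is a quick metadata operation compared to a mass DELETE:
ALTER TABLE fact_table DROP PARTITION p_2005 UPDATE GLOBAL INDEXES;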
My personal view on this: you already paid for the most expensive and feature-rich database engine available for SAP BI - so why not go and exploit these features where possible?
best regards
Lars

Similar Messages

  • How can I increase query performance other than creating aggregates

    Hi,
    I created one query and it takes a long time to return the report. I created aggregates, but it was no use - the query still takes the same time. So please give me another option to increase query performance.
    bye .
    sandhya.

    Hi Sandhya,
    You can also increase query performance by using indexes. With an index, you can search a table for data records that satisfy certain search criteria faster.
    An index can be considered a copy of a database table that has been reduced to certain fields. This copy is always sorted. Sorting provides faster access to the data records of the table, for example using a binary search. The index also contains a pointer to the corresponding record of the actual table so that the fields not contained in the index can also be read.
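    For example, a minimal sketch with hypothetical table and column names:
    -- the index is a sorted copy of customer_id plus a ROWID pointer
    -- back to the full row:
    CREATE INDEX orders_customer_idx ON orders (customer_id);
    -- a query filtering on the indexed column can now use the index
    -- instead of scanning the whole table:
    SELECT * FROM orders WHERE customer_id = 4711;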
    Please refer to the following links for more about indexes:
    http://help.sap.com/saphelp_nw04/helpdata/en/9b/c743f5b40711d194f900a0c929b3c3/frameset.htm
    Re: Bitmap vs BTree
    How to create B-Tree and Bitmap index in SAP
    Re: Cardinality
    Line Item Dimesnion
    I have some good documents regarding indexes; I would send them if I knew your mail address.
    regards,
    RV.

  • Query Performance with and without cache

    Hi Experts
    I have a query that takes 50 seconds to execute without any caching or precalculation.
    Once I have run the query in the Portal, any subsequent execution takes about 8 seconds.
    I assumed that this was to do with the cache, so I went into RSRT and deleted the Main Memory Cache and the Blob cache, where my queries seemed to be.
    I ran the query again and it took 8 seconds.
    Does the query cache somewhere else? Maybe on the portal? or on the users local cache? Does anyone have any idea of why the reports are still fast, even though the cache is deleted?
    Forum points always awarded for helpful answers!!
    Many thanks!
    Dave

    Hi,
    Cached data automatically becomes invalid whenever data in the InfoCube is loaded or purged and when a query is changed or regenerated. Once cached data becomes invalid, the system reverts to the fact table or associated aggregate to pull data for the query. You can see the cache settings for all queries in your system by using transaction SE16 to view table RSRREPDIR. The CACHEMODE field shows the settings of the individual queries; the numbers in this field correspond to the available cache mode settings.
    To set the cache mode on the InfoCube, follow the path Business Information Warehouse Implementation Guide (IMG) > Reporting-Relevant Settings > General Reporting Settings > Global Cache Settings, or use transaction SPRO. Setting the cache mode at the InfoCube level establishes a default for each query created from that specific InfoCube.
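    For example, the database-level equivalent of that SE16 lookup would be roughly the following (a sketch - COMPID as the query's technical name is an assumption, check the field list in your release):
    SELECT compid, cachemode
    FROM rsrrepdir
    WHERE compid = 'YOUR_QUERY_NAME';   -- hypothetical query name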

  • Poor query performance when joining CONTAINS to another table

    We just recently began evaluating Oracle Text for a search solution. We need to be able to search a table that can have over 20 million rows. Each user may only have visibility to a tiny fraction of those rows. The goal is to have a single Oracle Text index that represents all of the searchable columns in the table (multi-column datastore) and provides a score for each search result so that we can sort the search results in descending order by score. What we're seeing is that query performance from TOAD is extremely fast when we write a simple CONTAINS query against the Oracle Text indexed table. However, when we attempt to first reduce the rows the CONTAINS query needs to search by using a WITH clause, we find that the query performance degrades significantly.
    For example, we can find all the records a user has access to from our base table by the following query:
    SELECT d.duns_loc
    FROM duns d
    JOIN primary_contact pc
    ON d.duns_loc = pc.duns_loc
    AND pc.emp_id = :employeeID;
    This query can execute in <100 ms. In the working example, this query returns around 1200 rows of the primary key duns_loc.
    Our search query looks like this:
    SELECT score(1), d.*
    FROM duns d
    WHERE CONTAINS(TEXT_KEY, :search,1) > 0
    ORDER BY score(1) DESC;
    The :search value in this example will be 'highway'. The query can return 246k rows in around 2 seconds.
    2 seconds is good, but we should be able to have a much faster response if the search query did not have to search the entire table, right? Since each user can only "view" records they are assigned to we reckon that if the search operation only had to scan a tiny tiny percent of the TEXT index we should see faster (and more relevant) results. If we now write the following query:
    WITH subset
    AS
    (SELECT d.duns_loc
    FROM duns d
    JOIN primary_contact pc
    ON d.duns_loc = pc.duns_loc
    AND pc.emp_id = :employeeID)
    SELECT score(1), d.*
    FROM duns d
    JOIN subset s
    ON d.duns_loc = s.duns_loc
    WHERE CONTAINS(TEXT_KEY, :search,1) > 0
    ORDER BY score(1) DESC;
    For reasons we have not been able to identify, this query actually takes longer to execute than the sum of the durations of its contributing parts - it takes over 6 seconds to run. Neither we nor our DBA can figure out why this query performs worse than a wide-open search. The wide-open search is not ideal, as the query would end up returning records the user doesn't have access to view.
    Has anyone ever run into something like this? Any suggestions on what to look at or where to go? If anyone would like more information to help in diagnosis, let me know and I'll be happy to provide it here.
    Thanks!!

    Sometimes it can be good to separate the tables into separate subquery factoring (WITH) clauses, inline views in the FROM clause, or an IN clause as a WHERE condition. Although there are some differences, using a subquery factoring (WITH) clause is similar to using an inline view in the FROM clause. However, you should avoid duplication: you should not have the same table in two different places, as in your original query.
    You should have indexes on any columns that the tables are joined on, your statistics should be current, and your domain index should have regular synchronization and optimization, and be periodically rebuilt or dropped and recreated to keep it performing with maximum efficiency.
    The following demonstration uses a composite domain index (CDI) with FILTER BY, as suggested by Roger, then shows the explained plans for your original query and various others. Your original query has nested loops; all of the others have the same plan without the nested loops. You could also add index hints.
    SCOTT@orcl_11gR2> -- tables:
    SCOTT@orcl_11gR2> CREATE TABLE duns
      2    (duns_loc  NUMBER,
      3       text_key  VARCHAR2 (30))
      4  /
    Table created.
    SCOTT@orcl_11gR2> CREATE TABLE primary_contact
      2    (duns_loc  NUMBER,
      3       emp_id       NUMBER)
      4  /
    Table created.
    SCOTT@orcl_11gR2> -- data:
    SCOTT@orcl_11gR2> INSERT INTO duns VALUES (1, 'highway')
      2  /
    1 row created.
    SCOTT@orcl_11gR2> INSERT INTO primary_contact VALUES (1, 1)
      2  /
    1 row created.
    SCOTT@orcl_11gR2> INSERT INTO duns
      2  SELECT object_id, object_name
      3  FROM   all_objects
      4  WHERE  object_id > 1
      5  /
    76027 rows created.
    SCOTT@orcl_11gR2> INSERT INTO primary_contact
      2  SELECT object_id, namespace
      3  FROM   all_objects
      4  WHERE  object_id > 1
      5  /
    76027 rows created.
    SCOTT@orcl_11gR2> -- indexes:
    SCOTT@orcl_11gR2> CREATE INDEX duns_duns_loc_idx
      2  ON duns (duns_loc)
      3  /
    Index created.
    SCOTT@orcl_11gR2> CREATE INDEX primary_contact_duns_loc_idx
      2  ON primary_contact (duns_loc)
      3  /
    Index created.
    SCOTT@orcl_11gR2> -- composite domain index (cdi) with filter by clause
    SCOTT@orcl_11gR2> -- as suggested by Roger:
    SCOTT@orcl_11gR2> CREATE INDEX duns_text_key_idx
      2  ON duns (text_key)
      3  INDEXTYPE IS CTXSYS.CONTEXT
      4  FILTER BY duns_loc
      5  /
    Index created.
    SCOTT@orcl_11gR2> -- gather statistics:
    SCOTT@orcl_11gR2> EXEC DBMS_STATS.GATHER_TABLE_STATS (USER, 'DUNS')
    PL/SQL procedure successfully completed.
    SCOTT@orcl_11gR2> EXEC DBMS_STATS.GATHER_TABLE_STATS (USER, 'PRIMARY_CONTACT')
    PL/SQL procedure successfully completed.
    SCOTT@orcl_11gR2> -- variables:
    SCOTT@orcl_11gR2> VARIABLE employeeid NUMBER
    SCOTT@orcl_11gR2> EXEC :employeeid := 1
    PL/SQL procedure successfully completed.
    SCOTT@orcl_11gR2> VARIABLE search VARCHAR2(100)
    SCOTT@orcl_11gR2> EXEC :search := 'highway'
    PL/SQL procedure successfully completed.
    SCOTT@orcl_11gR2> -- original query:
    SCOTT@orcl_11gR2> SET AUTOTRACE ON EXPLAIN
    SCOTT@orcl_11gR2> WITH
      2    subset AS
      3        (SELECT d.duns_loc
      4         FROM      duns d
      5         JOIN      primary_contact pc
      6         ON      d.duns_loc = pc.duns_loc
      7         AND      pc.emp_id = :employeeID)
      8  SELECT score(1), d.*
      9  FROM   duns d
    10  JOIN   subset s
    11  ON     d.duns_loc = s.duns_loc
    12  WHERE  CONTAINS (TEXT_KEY, :search,1) > 0
    13  ORDER  BY score(1) DESC
    14  /
      SCORE(1)   DUNS_LOC TEXT_KEY
            18          1 highway
    1 row selected.
    Execution Plan
    Plan hash value: 4228563783
    | Id  | Operation                      | Name              | Rows  | Bytes | Cost (%CPU)| Time     |
    |   0 | SELECT STATEMENT               |                   |     2 |    84 |   121   (4)| 00:00:02 |
    |   1 |  SORT ORDER BY                 |                   |     2 |    84 |   121   (4)| 00:00:02 |
    |*  2 |   HASH JOIN                    |                   |     2 |    84 |   120   (3)| 00:00:02 |
    |   3 |    NESTED LOOPS                |                   |    38 |  1292 |    50   (2)| 00:00:01 |
    |   4 |     TABLE ACCESS BY INDEX ROWID| DUNS              |    38 |  1102 |    11   (0)| 00:00:01 |
    |*  5 |      DOMAIN INDEX              | DUNS_TEXT_KEY_IDX |       |       |     4   (0)| 00:00:01 |
    |*  6 |     INDEX RANGE SCAN           | DUNS_DUNS_LOC_IDX |     1 |     5 |     1   (0)| 00:00:01 |
    |*  7 |    TABLE ACCESS FULL           | PRIMARY_CONTACT   |  4224 | 33792 |    70   (3)| 00:00:01 |
    Predicate Information (identified by operation id):
       2 - access("D"."DUNS_LOC"="PC"."DUNS_LOC")
       5 - access("CTXSYS"."CONTAINS"("D"."TEXT_KEY",:SEARCH,1)>0)
       6 - access("D"."DUNS_LOC"="D"."DUNS_LOC")
       7 - filter("PC"."EMP_ID"=TO_NUMBER(:EMPLOYEEID))
    SCOTT@orcl_11gR2> -- queries with better plans (no nested loops):
    SCOTT@orcl_11gR2> -- subquery factoring (with) clauses:
    SCOTT@orcl_11gR2> WITH
      2    subset1 AS
      3        (SELECT pc.duns_loc
      4         FROM      primary_contact pc
      5         WHERE  pc.emp_id = :employeeID),
      6    subset2 AS
      7        (SELECT score(1), d.*
      8         FROM      duns d
      9         WHERE  CONTAINS (TEXT_KEY, :search,1) > 0)
    10  SELECT subset2.*
    11  FROM   subset1, subset2
    12  WHERE  subset1.duns_loc = subset2.duns_loc
    13  ORDER  BY score(1) DESC
    14  /
      SCORE(1)   DUNS_LOC TEXT_KEY
            18          1 highway
    1 row selected.
    Execution Plan
    Plan hash value: 153618227
    | Id  | Operation                     | Name              | Rows  | Bytes | Cost (%CPU)| Time     |
    |   0 | SELECT STATEMENT              |                   |    38 |  1406 |    83   (5)| 00:00:01 |
    |   1 |  SORT ORDER BY                |                   |    38 |  1406 |    83   (5)| 00:00:01 |
    |*  2 |   HASH JOIN                   |                   |    38 |  1406 |    82   (4)| 00:00:01 |
    |   3 |    TABLE ACCESS BY INDEX ROWID| DUNS              |    38 |  1102 |    11   (0)| 00:00:01 |
    |*  4 |     DOMAIN INDEX              | DUNS_TEXT_KEY_IDX |       |       |     4   (0)| 00:00:01 |
    |*  5 |    TABLE ACCESS FULL          | PRIMARY_CONTACT   |  4224 | 33792 |    70   (3)| 00:00:01 |
    Predicate Information (identified by operation id):
       2 - access("PC"."DUNS_LOC"="D"."DUNS_LOC")
       4 - access("CTXSYS"."CONTAINS"("TEXT_KEY",:SEARCH,1)>0)
       5 - filter("PC"."EMP_ID"=TO_NUMBER(:EMPLOYEEID))
    SCOTT@orcl_11gR2> -- inline views (sub-queries in the from clause):
    SCOTT@orcl_11gR2> SELECT subset2.*
      2  FROM   (SELECT pc.duns_loc
      3            FROM   primary_contact pc
      4            WHERE  pc.emp_id = :employeeID) subset1,
      5           (SELECT score(1), d.*
      6            FROM   duns d
      7            WHERE  CONTAINS (TEXT_KEY, :search,1) > 0) subset2
      8  WHERE  subset1.duns_loc = subset2.duns_loc
      9  ORDER  BY score(1) DESC
    10  /
      SCORE(1)   DUNS_LOC TEXT_KEY
            18          1 highway
    1 row selected.
    Execution Plan
    Plan hash value: 153618227
    | Id  | Operation                     | Name              | Rows  | Bytes | Cost (%CPU)| Time     |
    |   0 | SELECT STATEMENT              |                   |    38 |  1406 |    83   (5)| 00:00:01 |
    |   1 |  SORT ORDER BY                |                   |    38 |  1406 |    83   (5)| 00:00:01 |
    |*  2 |   HASH JOIN                   |                   |    38 |  1406 |    82   (4)| 00:00:01 |
    |   3 |    TABLE ACCESS BY INDEX ROWID| DUNS              |    38 |  1102 |    11   (0)| 00:00:01 |
    |*  4 |     DOMAIN INDEX              | DUNS_TEXT_KEY_IDX |       |       |     4   (0)| 00:00:01 |
    |*  5 |    TABLE ACCESS FULL          | PRIMARY_CONTACT   |  4224 | 33792 |    70   (3)| 00:00:01 |
    Predicate Information (identified by operation id):
       2 - access("PC"."DUNS_LOC"="D"."DUNS_LOC")
       4 - access("CTXSYS"."CONTAINS"("TEXT_KEY",:SEARCH,1)>0)
       5 - filter("PC"."EMP_ID"=TO_NUMBER(:EMPLOYEEID))
    SCOTT@orcl_11gR2> -- ansi join:
    SCOTT@orcl_11gR2> SELECT SCORE(1), duns.*
      2  FROM   duns
      3  JOIN   primary_contact
      4  ON     duns.duns_loc = primary_contact.duns_loc
      5  WHERE  CONTAINS (duns.text_key, :search, 1) > 0
      6  AND    primary_contact.emp_id = :employeeid
      7  ORDER  BY SCORE(1) DESC
      8  /
      SCORE(1)   DUNS_LOC TEXT_KEY
            18          1 highway
    1 row selected.
    Execution Plan
    Plan hash value: 153618227
    | Id  | Operation                     | Name              | Rows  | Bytes | Cost (%CPU)| Time     |
    |   0 | SELECT STATEMENT              |                   |    38 |  1406 |    83   (5)| 00:00:01 |
    |   1 |  SORT ORDER BY                |                   |    38 |  1406 |    83   (5)| 00:00:01 |
    |*  2 |   HASH JOIN                   |                   |    38 |  1406 |    82   (4)| 00:00:01 |
    |   3 |    TABLE ACCESS BY INDEX ROWID| DUNS              |    38 |  1102 |    11   (0)| 00:00:01 |
    |*  4 |     DOMAIN INDEX              | DUNS_TEXT_KEY_IDX |       |       |     4   (0)| 00:00:01 |
    |*  5 |    TABLE ACCESS FULL          | PRIMARY_CONTACT   |  4224 | 33792 |    70   (3)| 00:00:01 |
    Predicate Information (identified by operation id):
       2 - access("DUNS"."DUNS_LOC"="PRIMARY_CONTACT"."DUNS_LOC")
       4 - access("CTXSYS"."CONTAINS"("DUNS"."TEXT_KEY",:SEARCH,1)>0)
       5 - filter("PRIMARY_CONTACT"."EMP_ID"=TO_NUMBER(:EMPLOYEEID))
    SCOTT@orcl_11gR2> -- old join:
    SCOTT@orcl_11gR2> SELECT SCORE(1), duns.*
      2  FROM   duns, primary_contact
      3  WHERE  CONTAINS (duns.text_key, :search, 1) > 0
      4  AND    duns.duns_loc = primary_contact.duns_loc
      5  AND    primary_contact.emp_id = :employeeid
      6  ORDER  BY SCORE(1) DESC
      7  /
      SCORE(1)   DUNS_LOC TEXT_KEY
            18          1 highway
    1 row selected.
    Execution Plan
    Plan hash value: 153618227
    | Id  | Operation                     | Name              | Rows  | Bytes | Cost (%CPU)| Time     |
    |   0 | SELECT STATEMENT              |                   |    38 |  1406 |    83   (5)| 00:00:01 |
    |   1 |  SORT ORDER BY                |                   |    38 |  1406 |    83   (5)| 00:00:01 |
    |*  2 |   HASH JOIN                   |                   |    38 |  1406 |    82   (4)| 00:00:01 |
    |   3 |    TABLE ACCESS BY INDEX ROWID| DUNS              |    38 |  1102 |    11   (0)| 00:00:01 |
    |*  4 |     DOMAIN INDEX              | DUNS_TEXT_KEY_IDX |       |       |     4   (0)| 00:00:01 |
    |*  5 |    TABLE ACCESS FULL          | PRIMARY_CONTACT   |  4224 | 33792 |    70   (3)| 00:00:01 |
    Predicate Information (identified by operation id):
       2 - access("DUNS"."DUNS_LOC"="PRIMARY_CONTACT"."DUNS_LOC")
       4 - access("CTXSYS"."CONTAINS"("DUNS"."TEXT_KEY",:SEARCH,1)>0)
       5 - filter("PRIMARY_CONTACT"."EMP_ID"=TO_NUMBER(:EMPLOYEEID))
    SCOTT@orcl_11gR2> -- in clause:
    SCOTT@orcl_11gR2> SELECT SCORE(1), duns.*
      2  FROM   duns
      3  WHERE  CONTAINS (duns.text_key, :search, 1) > 0
      4  AND    duns.duns_loc IN
      5           (SELECT primary_contact.duns_loc
      6            FROM   primary_contact
      7            WHERE  primary_contact.emp_id = :employeeid)
      8  ORDER  BY SCORE(1) DESC
      9  /
      SCORE(1)   DUNS_LOC TEXT_KEY
            18          1 highway
    1 row selected.
    Execution Plan
    Plan hash value: 3825821668
    | Id  | Operation                     | Name              | Rows  | Bytes | Cost (%CPU)| Time     |
    |   0 | SELECT STATEMENT              |                   |    38 |  1406 |    83   (5)| 00:00:01 |
    |   1 |  SORT ORDER BY                |                   |    38 |  1406 |    83   (5)| 00:00:01 |
    |*  2 |   HASH JOIN SEMI              |                   |    38 |  1406 |    82   (4)| 00:00:01 |
    |   3 |    TABLE ACCESS BY INDEX ROWID| DUNS              |    38 |  1102 |    11   (0)| 00:00:01 |
    |*  4 |     DOMAIN INDEX              | DUNS_TEXT_KEY_IDX |       |       |     4   (0)| 00:00:01 |
    |*  5 |    TABLE ACCESS FULL          | PRIMARY_CONTACT   |  4224 | 33792 |    70   (3)| 00:00:01 |
    Predicate Information (identified by operation id):
       2 - access("DUNS"."DUNS_LOC"="PRIMARY_CONTACT"."DUNS_LOC")
       4 - access("CTXSYS"."CONTAINS"("DUNS"."TEXT_KEY",:SEARCH,1)>0)
       5 - filter("PRIMARY_CONTACT"."EMP_ID"=TO_NUMBER(:EMPLOYEEID))
    SCOTT@orcl_11gR2>

  • 3.x query performance on upgraded BI 7.0 - worse than before upgrade?

    Dear Sirs,
    We upgraded a BI system this weekend from 3.x to 7.0. It was a purely technical upgrade, so no migration to 7.0 queries yet.
    What we have seen is that queries with a "loose"/large selection (e.g. all plants, all months) have worse performance than before the upgrade.
    One example: a query went from 26 seconds to 60 seconds.
    A small selection (e.g. one specific plant and date) has the same run time.
    Has anyone had a similar experience?
    Is BI 7.0 optimized for 7.0 queries, or are there any performance parameters I could look at?
    Best regards,
    Jørgen

    Check this thread:
    Performance problems on NW04S/BI 7.0 after the upgrade
    Additional:
    [Improving Query Performance by Effective and Efficient Maintenance of Aggregates|https://www.sdn.sap.com/irj/scn/go/portal/prtroot/docs/library/uuid/906de68a-0a28-2c10-398b-ef0e11ef2022]
    BI Performance Improvements in 7.0
    Hope this helps.
    Regards
    Andreas

  • Query performance on RAC is a lot slower than single instance

    I simply followed the steps provided by Oracle to install a RAC database of 2 nodes.
    The performance of inserts (Java, thin ojdbc) is pretty much the same compared to a single instance on NFS.
    However, the performance of select queries is very slow compared to a single instance.
    I have tried using different methods for the storage configuration (ASM with raw, OCFS2), but the performance is still slow.
    When I shut down one instance, leaving only one instance up, query performance is very fast (as fast as a single instance).
    I am using RHEL5 64-bit (16G of physical memory) and Oracle 11.1.0.6 with patchset 11.1.0.7.
    Could someone help me how to debug this problem?
    Thanks,
    Chau
    Edited by: user638637 on Aug 6, 2009 8:31 AM

    Top 5 timed foreground events:
    DB CPU: time 943 s, %DB time 47.5%
    cursor: pin S wait on X: waits 13,940, time 321 s, avg wait 23 ms, %DB time 16.15%
    direct path read: waits 95,436, time 288 s, avg wait 3 ms, %DB time 14.51%
    IPC send completion sync: waits 546,712, time 149 s, avg wait 0 ms, %DB time 7.49%
    gc cr multi block request: waits 7,574, time 78 s, avg wait 10 ms, %DB time 4.0%
    Another thing I see is that the "avg global cache cr block flush time (ms)" is 37.6 ms.
    The DB CPU Oracle metric is the amount of CPU time (in microseconds) spent on database user-level calls.
    You should check the SQL statements from the report and tune them.
    - Check the execution plan.
    - If no index is used, consider creating one.
    SQL> set autot trace explain
    SQL> sql statement;
    cursor: pin S wait on X:
    A session waits on this event when requesting a mutex for sharable operations related to pins (such as executing a cursor), but the mutex cannot be granted because it is being held exclusively by another session (which is most likely parsing the cursor).
    Use bind variables and avoid dynamic SQL.
    http://blog.tanelpoder.com/2008/08/03/library-cache-latches-gone-in-oracle-11g/
    Check the MEMORY_TARGET initialization parameter.
    By the way, you have a high "DB CPU" (47.5%), so you should tune your SQL statements (check the SQL in the report and tune it).
    Good Luck

  • Query performance vs MySQL

    I have a table with about 500 million records in it stored within both Oracle and MySQL (MyISAM). I have a column (say phone_number) with a standard b-tree index on it in both places. Without data or indexes cached (not enough memory to fully cache the index), I run about 10,000 queries using randomly generated phone numbers as criteria in 10 concurrent threads. It seems that the average time to retrieve a record in MySQL is about 200 milliseconds whereas in Oracle it is about 400 milliseconds. I'm just wondering if MyISAM/MySQL is inherently faster for a basic index search than Oracle is, or should I be able to tune Oracle to get comparable performance.
    Of course the hardware configurations and storage configurations are the same. It's not the absolute time I'm concerned about here but the relative time. Twice as long to perform basically the same query seems concerning. I enabled tracing and it seems like some of the problem may be the recursive calls Oracle is making. Is there some way to optimize this a bit further?
    Realize, I just want to look at query performance right now...ignoring all of the issues (locking, transactional integrity, etc.)
    Thanks,
    Greg

    In Oracle, a standard table is heap-organized. A b-tree index then contains index keys and ROWIDs, so if you need to read a particular row in the table, you first do a few I/O's on the index to get the ROWIDs and then look up the ROWIDs in the table. For any given key, ROWIDs are likely to be scattered throughout the table, so this latter step generally involves multiple scattered I/O's.
    You can create an index-organized table or a hash cluster in Oracle in order to minimize the cost of this particular sort of lookup by clustering data with the same key physically near each other and, in the case of IOTs, potentially eliminating the need to store the index and table separately. Of course, there are costs to doing this in that inserts are now more expensive and secondary indexes are likely to be less useful. That probably gets you closer to what MySQL is doing if, as ajallen indicates, a MySQL table is generally stored more like an IOT than a heap-organized table.
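    For example, a minimal sketch of an index-organized table (names are hypothetical) - the rows live in the B-tree itself, ordered by primary key, so a key lookup needs no separate table access by ROWID:
    CREATE TABLE phone_lookup (
      phone_number  VARCHAR2(20),
      customer_id   NUMBER,
      CONSTRAINT phone_lookup_pk PRIMARY KEY (phone_number, customer_id)
    ) ORGANIZATION INDEX;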
    If you get really creative, you could even partition this single table to potentially improve performance further.
    Of course, if you're only storing one table, I'm not sure that you could really justify the cost of an Oracle license. This may well be a case where MySQL is more than sufficient for what this particular customer needs (knowing nothing, of course, about the other requirements for the system).
    Justin

  • Query performance and data loading performance issues

    What are the query performance issues we need to take care of? Please explain and let me know the transaction codes.
    What are the data loading performance issues we need to take care of? Please explain and let me know the transaction codes.
    Will reward full points.
    Regards
    Guru

    BW Back end
    Some Tips -
    1)Identify long-running extraction processes on the source system. Extraction processes are performed by several extraction jobs running on the source system. The run-time of these jobs affects the performance. Use transaction code SM37 — Background Processing Job Management — to analyze the run-times of these jobs. If the run-time of data collection jobs lasts for several hours, schedule these jobs to run more frequently. This way, less data is written into update tables for each run and extraction performance increases.
    2)Identify high run-times for ABAP code, especially for user exits. The quality of any custom ABAP programs used in data extraction affects the extraction performance. Use transaction code SE30 — ABAP/4 Run-time Analysis — and then run the analysis for the transaction code RSA3 — Extractor Checker. The system then records the activities of the extraction program so you can review them to identify time-consuming activities. Eliminate those long-running activities or substitute them with alternative program logic.
    3)Identify expensive SQL statements. If database run-time is high for extraction jobs, use transaction code ST05 — Performance Trace. On this screen, select ALEREMOTE user and then select SQL trace to record the SQL statements. Identify the time-consuming sections from the results. If the data-selection times are high on a particular SQL statement, index the DataSource tables to increase the performance of selection (see no. 6 below). While using ST05, make sure that no other extraction job is running with ALEREMOTE user.
    4)Balance loads by distributing processes onto different servers if possible. If your site uses more than one BW application server, distribute the extraction processes to different servers using transaction code SM59 — Maintain RFC Destination. Load balancing is possible only if the extraction program allows the option
    5)Set optimum parameters for data-packet size. Packet size affects the number of data requests to the database. Set the data-packet size to optimum values for an efficient data-extraction mechanism. To find the optimum value, start with a packet size in the range of 50,000 to 100,000 and gradually increase it. At some point, you will reach the threshold at which increasing packet size further does not provide any performance increase. To set the packet size, use transaction code SBIW — BW IMG Menu — on the source system. To set the data load parameters for flat-file uploads, use transaction code RSCUSTV6 in BW.
    6)Build indexes on DataSource tables based on selection criteria. Indexing DataSource tables improves the extraction performance, because it reduces the read times of those tables.
    7)Execute collection jobs in parallel. Like the Business Content extractors, generic extractors have a number of collection jobs to retrieve relevant data from DataSource tables. Scheduling these collection jobs to run in parallel reduces the total extraction time, and they can be scheduled via transaction code SM37 in the source system.
    8) Break up your data selections for InfoPackages and schedule the portions to run in parallel. This parallel upload mechanism sends different portions of the data to BW at the same time, and as a result the total upload time is reduced. You can schedule InfoPackages in the Administrator Workbench.
    You can upload data from a data target (InfoCube and ODS) to another data target within the BW system. While uploading, you can schedule more than one InfoPackage with different selection options in each one. For example, fiscal year or fiscal year period can be used as selection options. Avoid using parallel uploads for high volumes of data if hardware resources are constrained. Each InfoPackage uses one background process (if scheduled to run in the background) or dialog process (if scheduled to run online) of the application server, and too many processes could overwhelm a slow server.
    9) Building secondary indexes on the tables for the selection fields optimizes these tables for reading, reducing extraction time. If your selection fields are not key fields on the table, primary indexes are not much of a help when accessing data. In this case it is better to create secondary indexes with selection fields on the associated table using the ABAP Dictionary to improve selection performance.
    10)Analyze upload times to the PSA and identify long-running uploads. When you extract the data using PSA method, data is written into PSA tables in the BW system. If your data is on the order of tens of millions, consider partitioning these PSA tables for better performance, but pay attention to the partition sizes. Partitioning PSA tables improves data-load performance because it's faster to insert data into smaller database tables. Partitioning also provides increased performance for maintenance of PSA tables — for example, you can delete a portion of data faster. You can set the size of each partition in the PSA parameters screen, in transaction code SPRO or RSCUSTV6, so that BW creates a new partition automatically when a threshold value is reached.
    11)Debug any routines in the transfer and update rules and eliminate single selects from the routines. Using single selects in custom ABAP routines for selecting data from database tables reduces performance considerably. It is better to use buffers and array operations. When you use buffers or array operations, the system reads data from the database tables and stores it in the memory for manipulation, improving performance. If you do not use buffers or array operations, the whole reading process is performed on the database with many table accesses, and performance deteriorates. Also, extensive use of library transformations in the ABAP code reduces performance; since these transformations are not compiled in advance, they are carried out during run-time.
    12)Before uploading a high volume of transaction data into InfoCubes, activate the number-range buffer for dimension IDs. The number-range buffer is a parameter that identifies the number of sequential dimension IDs stored in the memory. If you increase the number range before high-volume data upload, you reduce the number of reads from the dimension tables and hence increase the upload performance. Do not forget to set the number-range values back to their original values after the upload. Use transaction code SNRO to maintain the number range buffer values for InfoCubes.
    13)Drop the indexes before uploading high-volume data into InfoCubes. Regenerate them after the upload. Indexes on InfoCubes are optimized for reading data from the InfoCubes. If the indexes exist during the upload, BW reads the indexes and tries to insert the records according to the indexes, resulting in poor upload performance. You can automate the dropping and regeneration of the indexes through InfoPackage scheduling. You can drop indexes in the Manage InfoCube screen in the Administrator Workbench.
    14)IDoc (intermediate document) archiving improves the extraction and loading performance and can be applied on both BW and R/3 systems. In addition to IDoc archiving, data archiving is available for InfoCubes and ODS objects.
    Hope it Helps
    Chetan
    @CP..

  • Logical vs. Physical Partitioning

    Hi,
    In a discussion of logical partition the author pointed out that
    “… if a query that needs data from all 5 years would then automatically (you can control this) be split into 5 separate queries, one against each cube, running at the same time. The system automatically merges the results from the 5 queries into a single result set.”
    1. Can you direct me on how to “control this” as the author indicated?
    2. Also, the author noted that
    “… Physical Partitioning - I believe only Oracle and Informix currently support Range partitioning. …”
    a) Does it mean that BW does not use physical partitioning?
    b) Where do we indicate physical or logical partitioning as an option in BW?
    c) Or, when we talk about dimension tables, etc., is there always an underlying database such as Oracle, Informix, etc. in BW? If so, what does BW use?
    3. For physical partitions, I read that the cube needs to be empty before it can be partitioned. What about logical partition?
    4. Finally, what are the underlying criteria to decide on logical or physical partitioning, or both?
    Thanks

    1. Can you direct me on how to “control this” as the author indicated?
    You make this setting in RSRT.
    2. Also, the author noted that
    “… Physical Partitioning - I believe only Oracle and Informix currently support Range partitioning. …”
    DB2 also supports partitioning, and the current release of SQL Server supports partitioning as well.
    b) Where do we indicate physical or logical partitioning as an option in BW?
    Physical partitions are set up in the cube change mode. When you are in cube change mode, on the menu, choose Extras - Performance / DB parameters - Partitions.
    A screen will then pop up showing the time characteristics in the cube; choose the characteristic(s) you wish and confirm the entries - then you will get another small pop-up where you set the number of partitions.
    Also, please note that you can partition only on fiscal year and fiscal period, not on other time characteristics.
    Logical partitions: logical partitioning is nothing but splitting the cube into numerous cubes of smaller sizes. You combine all these cubes by means of a MultiProvider. For example, if you have 1000 cost centers, you may want to split into cubes based on the cost center numbers and combine them into a MultiProvider.
    No further settings are required.
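    At the database level (an illustration only, not the DDL BW actually generates), a physically partitioned facttable is simply a range-partitioned table on the fiscal-period column, conceptually like:
    CREATE TABLE e_fact_sketch (
      fiscper_sid  NUMBER,
      amount       NUMBER
    )
    PARTITION BY RANGE (fiscper_sid) (
      PARTITION p2009 VALUES LESS THAN (2010001),
      PARTITION p2010 VALUES LESS THAN (2011001),
      PARTITION pmax  VALUES LESS THAN (MAXVALUE)
    );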
    c) Or, when we talk about dimension tables, etc., is there always an underlying database such as Oracle, Informix, etc. in BW? If so, what does BW use?
    Dimension tables / fact tables / ODS tables / master data tables are all database tables. Whichever database you use, when you activate these objects the tables are created in the underlying database.
    3. For physical partitions, I read that the cube needs to be empty before it can be partitioned. What about logical partitions?
    Logical partitioning can be done at any time.
    4. Finally, what are the underlying criteria to decide on logical or physical partitioning, or both?
    The underlying criteria are factors such as:
    (a) the number of years of history you wish to view in reports;
    (b) the number of years you will hold the data in BW before archiving;
    (c) other performance matters related to sizing.
    Ravi Thothadri

  • Logical or Physical Partitioning

    Hi Dudes,
    Can anyone explain in which cases it is recommended to use logical partitioning rather than physical partitioning, and vice versa?
    I am trying to optimize query navigation on a cube of 20M records. That cube contains ~15 different companies and grows by 6M records per year.
    Currently the query runtimes are "good", as the queries always find an aggregate to display the default view, but performance drops when we drill down on a free characteristic, i.e. each time we leave the aggregate.
    The queries often compare different periods and fiscal years, span multiple companies, and have lots of calculated/restricted key figures and sometimes lots of cells.
    I haven't found the right solution yet between logical partitioning by time period or organization (several cubes) and physical partitioning (1 cube). Should I include hints too?
    Thanks in advance for your expertise
    Samuel

    Hi Sam
    Below is a thread which explains the differences between physical and logical partitioning (FYI):
    Physical Vs Logical Partitioning
    Now, about your design: you said you have 15 different company codes in one cube and it will grow.
    I am thinking: what if you do a logical partition based on company codes? If you have different regions (Asia, Europe, Americas), create 3 different cubes and partition per region/source system ID; or if you don't have any regions (let's say everything is in Europe), then again, based on volume growth/analysis, create new cubes and partition them based on ranges of company codes, one range per cube, and so forth. This way you can handle the volume over a period of time.
    Now create a MultiProvider on top of those cubes and create your queries. If you still want to optimize, then direct the queries to the relevant cube by using some custom logic.
    Hope this helps
    Thanks
    Kalyan

  • Analyse a partitioned table with more than 50 million rows

    Hi,
    I have a partitioned table with more than 50 million rows. The last analyze was on 1/25/2007. Do I need to analyze it? (Queries on this table run very slowly.)
    If I need to analyze it, what is the best way? Use DBMS_STATS and schedule a job?
    Thanks

    A partitioned table has global statistics as well as partition (and subpartition if the table is subpartitioned) statistics. My guess is that you mean to say that the last time that global statistics were gathered was in 2007. Is that guess accurate? Are the partition-level statistics more recent?
    Do any of your queries actually use global statistics? Or would you expect that every query involving this table would specify one or more values for the partitioning key and thus force partition pruning to take place? If all your queries are doing partition pruning, global statistics are irrelevant, so it doesn't matter how old and out of date they are.
    Are you seeing any performance problems that are potentially attributable to stale statistics on this table? If you're not seeing any performance problems, leaving the statistics well enough alone may be the most prudent course of action. Gathering statistics would only have the potential to change query plans. And since the cost of a query plan regressing is orders of magnitude greater than the benefit of a different query performing faster (at least for most queries in most systems), the balance of risks would argue for leaving the stats alone if there is no problem you're trying to solve.
    If your system does actually use global statistics and there are performance problems that you believe are potentially attributable to stale global statistics and your partition level statistics are accurate, you can gather just global statistics on the table probably with a reasonably small sample size. Make sure, though, that you back up your existing statistics just in case a query plan goes south. Ideally, you'd also have a test environment with identical (or nearly identical) data volumes that you could use to verify that gathering statistics doesn't cause any problems.
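    A sketch of that approach (table name, statistics-table name and sample size are just examples):
    BEGIN
      -- back up the current statistics first
      DBMS_STATS.CREATE_STAT_TABLE(ownname => USER, stattab => 'STATS_BACKUP');
      DBMS_STATS.EXPORT_TABLE_STATS(ownname => USER, tabname => 'BIG_PART_TAB',
                                    stattab => 'STATS_BACKUP');
      -- then gather only the global statistics with a small sample
      DBMS_STATS.GATHER_TABLE_STATS(ownname          => USER,
                                    tabname          => 'BIG_PART_TAB',
                                    granularity      => 'GLOBAL',
                                    estimate_percent => 5);
    END;
    /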
    Justin

  • Do partition scans take longer than a full table scan on an unpartitioned table?

    Hello there,
    I have a range-partitioned table PART_TABLE which has 10 million records and 10 partitions holding 1 million records each. Partitioning is done on a column named ID, which is a sequence from 1 to 10 million.
    I created another table P2_BKP (doing a SELECT * FROM part_table) which has the same dataset as PART_TABLE, except that this table is not partitioned.
    Now I run the same query on both tables to retrieve a range of data. Precisely, I am trying to read only the data present in 5 partitions of the partitioned table, which theoretically requires fewer reads than on the unpartitioned table.
    Yet the query seems to take more time on the partitioned table than on the unpartitioned table. Any specific reason why this is the case?
    Below is the query I am trying to run on both the tables and their corresponding Explain Plans.
    QUERY A
    =========
    select * from P2_BKP where id<5000000;
    | Id  | Operation         | Name   | Rows  | Bytes | Cost (%CPU)| Time     |                                                                                                                                                                                                                                
    |   0 | SELECT STATEMENT  |        |  6573K|   720M| 12152   (2)| 00:02:26 |                                                                                                                                                                                                                                
    |*  1 |  TABLE ACCESS FULL| P2_BKP |  6573K|   720M| 12152   (2)| 00:02:26 |                                                                                                                                                                                                                                
    QUERY B
    ========
    select * from part_table where id<5000000;
    | Id  | Operation                | Name       | Rows  | Bytes | Cost (%CPU)| Time     | Pstart| Pstop |                                                                                                                                                                                                     
    |   0 | SELECT STATEMENT         |            |  3983K|   436M| 22181  (73)| 00:04:27 |       |       |                                                                                                                                                                                                     
    |   1 |  PARTITION RANGE ITERATOR|            |  3983K|   436M| 22181  (73)| 00:04:27 |     1 |     5 |                                                                                                                                                                                                     
    |*  2 |   TABLE ACCESS FULL      | PART_TABLE |  3983K|   436M| 22181  (73)| 00:04:27 |     1 |     5 |                                                                                                                                                                                                     

    At the risk of bringing unnecessary confusion into the discussion: I think there is a situation in 11g in which a full table scan on a non-partitioned table can be faster than the FTS on a corresponding partitioned table. If the size of the non-partitioned table reaches a certain threshold (I think it's: blocks > _small_table_threshold * 5), the runtime engine may decide to use a serial direct path read to access the data. If the single partitions don't pass the threshold, the engine will use the conventional path.
    Here is a small example for my assertion:
    -- I create a simple partitioned table
    -- and a corresponding non-partitioned table
    -- with 1M rows
    drop table tab_part;
    create table tab_part (
        col_part number
      , padding varchar2(100)
    )
    partition by list (col_part) (
        partition P00 values (0)
      , partition P01 values (1)
      , partition P02 values (2)
      , partition P03 values (3)
      , partition P04 values (4)
      , partition P05 values (5)
      , partition P06 values (6)
      , partition P07 values (7)
      , partition P08 values (8)
      , partition P09 values (9)
    );
    insert into tab_part
    select mod(rownum, 10)
         , lpad('*', 100, '*')
      from dual
    connect by level <= 1000000;
    exec dbms_stats.gather_table_stats(user, 'tab_part')
    drop table tab_nopart;
    create table tab_nopart
    as
    select *
      from tab_part;
    exec dbms_stats.gather_table_stats(user, 'tab_nopart')
    -- my _small_table_threshold is 1777 and the partitions
    -- have a size of ca. 1600 blocks while the non-partitioned table
    -- contains 15360 blocks
    -- I have to flush the buffer cache since
    -- the direct path access is only used
    -- if there are few blocks already in the cache
    alter system flush buffer_cache;
    -- the execution plans are not really exciting
    | Id  | Operation           | Name     | Rows  | Cost (%CPU)| Time     | Pstart| Pstop |
    |   0 | SELECT STATEMENT    |          |     1 |  8089   (0)| 00:00:41 |       |       |
    |   1 |  SORT AGGREGATE     |          |     1 |            |          |       |       |
    |   2 |   PARTITION LIST ALL|          |  1000K|  8089   (0)| 00:00:41 |     1 |    10 |
    |   3 |    TABLE ACCESS FULL| TAB_PART |  1000K|  8089   (0)| 00:00:41 |     1 |    10 |
    | Id  | Operation          | Name       | Rows  | Cost (%CPU)| Time     |
    |   0 | SELECT STATEMENT   |            |     1 |  7659   (0)| 00:00:39 |
    |   1 |  SORT AGGREGATE    |            |     1 |            |          |
    |   2 |   TABLE ACCESS FULL| TAB_NOPART |  1000K|  7659   (0)| 00:00:39 |
    But on my PC the FTS on the non-partitioned table is faster than the FTS on the partitions (1 sec vs. 3 sec), and v$sesstat shows the reason for this difference:
    -- non partitioned table
    NAME                                               DIFF
    table scan rows gotten                          1000000
    file io wait time                                 15313
    session logical reads                             15156
    physical reads                                    15153
    consistent gets direct                            15152
    physical reads direct                             15152
    DB time                                              95
    -- partitioned table
    NAME                                               DIFF
    file io wait time                               2746493
    table scan rows gotten                          1000000
    session logical reads                             15558
    physical reads                                    15518
    physical reads cache prefetch                     15202
    DB time                                             295
    (maybe my choice of counters is questionable)
    So it's possible to get slower access for an FTS on a partitioned table under special conditions.
    Regards
    Martin

  • Ldap search query takes more than 10 seconds

    The LDAP query takes more than 10 seconds to execute.
    To validate the configured policy, the Access Manager (Sun Java System Access Manager) contacts the LDAP server (Sun Java System Directory Server 6.2) to get the users in a dynamic group. The timeout value configured in Access Manager for LDAP searches is 10 seconds.
    Issue: the LDAP query sometimes takes more than 10 seconds to execute.
    The query executes in less than 10 seconds in most cases, but takes more than 10 seconds in some. The total number of users in the LDAP is less than 1500.
    7 etime=1
    6 etime=1
    102 etime=4
    51 etime=5
    26 etime=6
    5 etime=7
    4 etime=8
    From the LDAP access logs we can see the following entries; sometimes the query takes more than 10 seconds:
    [28/May/2012:14:21:26 +0200] conn=281 op=41433 msgId=853995 - SRCH base="dc=****,dc=****,dc=com" scope=2 filter="(&(&(***=true)(**=true))(objectClass=vfperson))" attrs=ALL
    [28/May/2012:14:21:36 +0200] conn=281 op=41434 msgId=854001 - ABANDON targetop=41433 msgid=853995 nentries=884 etime=10
    The query was abandoned by the Access Manager after 10 seconds.
    Please post your suggestions to resolve this issue:
    1. How can we find out why the query is taking more than 10 seconds?
    2. What are the next steps to resolve this issue?

    Hi Marco,
    Thanks for your suggestions.
    Sorry for replying late; I was out of the office for a few weeks.
    1) Have you already tuned the caches? (entry cache, db cache, filesystem cache?)
    We are using the db cache and we have not done any tuning of the caches. The application was working fine and there were not many changes in the number of users.
    2) Unfortunately we don't have direct access to the environment, and we have contacted the responsible team to verify the server health during the issue.
    Regarding the IO operations, we can see that the load balancer pings the LDAP server every 15 seconds to check the status of the LDAP servers, which yields a new connection on every hit (on average 8 connections per minute).
    3) We are using cn=dsameuser to bind to the directory server. Other configuration details for LDAP:
    LDAP Connection Pool Minimum Size: 1
    LDAP Connection Pool Maximum Size:10
    Maximum Results Returned from Search: 1700
    Search Timeout: 10
    Is the configured Search Timeout value proper? (We have fewer than 1500 users in the LDAP server.)
    Also, is there any impact if Maximum Results Returned from Search is set to 1700? (The Sun documentation for AM says that the ideal value for this is 1000, and that anything higher will impact performance.)
    The application ran without timeout issues for the last 2 years, and there was not much increase in the number of users in the system (at most 200 users were added in the last 2 years).
    Thanks,
    Jay

  • Query Performance and Cache

    Hi,
    I am trying to analyze Query Performance in ST03N.
    I run a Query with Property = Cache Inactive.
    I get Runtime as say 180 sec.
    When I run the same query again, the runtime decreases; now it is, say, 150 sec.
    Please explain the reason for the lower runtime when no cache is used.

    There could be two things occurring. When you say cache is inactive, that probably refers to the OLAP global cache. The user session still has a local cache, so if the user executes the same navigation in the same session, results are retrieved from the local cache.
    Also, as the others have mentioned, if the same DB query really is executed on the database a second time, not only does the SQL statement parsing/optimization not need to occur again, but almost certainly some or all of the data still remains in the database's buffer cache, so it only needs to be retrieved from memory rather than physically read from disk again.

  • Partitioning for Performance

    Hi All,
    Currently we have a STAR Schema with ORDER_FACT and ORDER_HEADER_DIM , ORDER_LINE_DIM, STORE_DIM, TIME_DIM and PRODUCT_DIM.
    We are planning to partition ORDER_FACT for performance improvements in both reporting and loading. We have around 100 million rows in ORDER_FACT. Daily we insert around 1 million rows and update around 2 million rows.
    We are trying to come up with some good strategies and we have a few questions:
    1) Our ORDER_FACT does not have any date columns except INSERT_DATE and LAST_UPDATE_DATE, which are more like timestamp columns. ORDER_DATE would be the appropriate one, but we do not store it in the fact. We have ORDER_DATE_KEY, which is the surrogate key of TIME_DIM.
    Can a range partition (monthly) still be performed? (I guess we need an ORDER_DATE column in our fact - see the sketch after these questions.)
    If somebody has handled this situation in some other way, any guidance will be helpful.
    2) The question below assumes we have ORDER_FACT partitioned on ORDER_DATE.
    Currently we are doing a merge (update/insert) on ORDER_FACT. We have an incremental load (only newly inserted or updated rows from the source) to process.
    The update/insert is slow.
    Can we use PEL (Partition Exchange Loading) and avoid the merge (update/insert)?
    PEL is fine for new rows, since it replaces an empty partition in the target with a loaded partition from the source. How do we handle updates and inserts of rows in a partition which has existing rows?
    Any help on these would be helpful.
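    For question 1, a minimal sketch (all names are illustrative) of what monthly range partitioning could look like once an ORDER_DATE column is added to the fact:
    CREATE TABLE order_fact_sketch (
      order_date_key  NUMBER,
      order_date      DATE,
      amount          NUMBER
    )
    PARTITION BY RANGE (order_date) (
      PARTITION p_2009_01 VALUES LESS THAN (DATE '2009-02-01'),
      PARTITION p_2009_02 VALUES LESS THAN (DATE '2009-03-01'),
      PARTITION p_max     VALUES LESS THAN (MAXVALUE)
    );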
    Thanks,
    Samurai.

    Speaking from our experience, at some point you need to build your fact rows, so you need an insert/update prior to PEL anyway, and you would need your partitions closely matched to your refresh frequency for it to really be effective.
    So what we have done is focus on the "E" part of ETL.
    Our remote source database is mirrored on our side via Streams. This mirrors into a local copy that we can run various reports/processes/queries against without impacting production.
    We also perform a custom apply that populates a second local copy of the tables, but these ones are partitioned daily and are used for our ETL. So, at the end of the day we have a partitioned set of data that contains only the current status of rows that have changed over the day. Now, of course, this is problematic for ETL because you need to have all of the information associated with those changes in order to do your ETL.
    (Simple example: data in a customer's address record changes. Your ETL query undoubtedly joins the customer record and the address record to build your customer dimension row. But Streams only propagates the changed address record, so you wouldn't have the customer record in that daily partition for your join.)
    So, we have a process that runs after the Streams apply is finished that walks the dependency tree and populates all dependent data into the current daily partition. At the end of our prep process we therefore have a partitioned set of data that holds a complete set of source tables where anything has changed across any dependencies.
    This gives us a small, efficient daily data set to run our ETL queries against.
    The final piece of the puzzle is that we access this segment via synonyms, and the synonyms are pointed at this day's partition. We have a control structure that manages the list of partitions and repoints the synonyms prior to running the ETL. The partition loading and the ETL synonym pointing are completely decoupled so, for example, if we ever needed to suspend our ETL to get a code fix in place we can let the partition loading move ahead for a day or two and then play catchup loading the partitions in sequence and be confident that we have each end-of-day picture there to use.
    By running our ETL against only the changed data, we achieve huge efficiencies in query performance. And by managing the ETL partitions, we don't incur the space costs of a second full copy of the source, as we prune out the partitions once we are satisfied with the load at the end of a month (with full backups, of course, in case there is ever a huge problem to go back and correct).
    Now for facts, of course, we expect these to be insert only. Facts shouldn't change. For dimensions we use set based fail over to row based (target only), with a couple specified to be Row Based Target Only as they are simply too large to ever complete in Set Based Mode.
    Yes, this is a bit of a convoluted process - exacerbated by our need to have a full local copy for some reporting needs and the partitioned change copy for the datamart ETL, but at the end of the day it all works and works well with properly designed control mechanisms.
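    To make the synonym repointing concrete, a sketch with hypothetical object names:
    -- the ETL always reads through a fixed synonym; the control process
    -- repoints it at whichever daily segment is current:
    CREATE OR REPLACE SYNONYM etl_orders_src FOR stage.orders_day_01;
    -- next day, once the apply/prep process has finished:
    CREATE OR REPLACE SYNONYM etl_orders_src FOR stage.orders_day_02;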
    Cheers,
    Mike
