Compression and query performance in data warehouses

Hi,
Using Oracle 11.2.0.3, we have a large fact table with bitmap indexes to the associated dimensions.
We understand bitmap indexes are compressed by default, so we assume they cannot be compressed further.
Is this correct?
We wish to try compressing the large fact table to see if this will reduce the I/O on reads and therefore give performance benefits.
ETL speed is fine; we just want to improve report performance.
Thoughts - has anyone seen significant gains in data warehouse report performance with compression?
Also, the current PCTFREE on the table is 10%.
As we only ever insert into the table, we are considering making this 1% to improve report performance.
Thoughts?
Thanks

First of all:
Table Compression and Bitmap Indexes
To use table compression on partitioned tables with bitmap indexes, you must do the following before you introduce the compression attribute for the first time:
Mark bitmap indexes unusable.
Set the compression attribute.
Rebuild the indexes.
The first time you make a compressed partition part of an existing, fully uncompressed partitioned table, you must either drop all existing bitmap indexes or mark them UNUSABLE before adding a compressed partition. This must be done irrespective of whether any partition contains any data. It is also independent of the operation that causes one or more compressed partitions to become part of the table. This does not apply to a partitioned table having B-tree indexes only.
This rebuilding of the bitmap index structures is necessary to accommodate the potentially higher number of rows stored for each data block with table compression enabled. Enabling table compression must be done only for the first time. All subsequent operations, whether they affect compressed or uncompressed partitions, or change the compression attribute, behave identically for uncompressed, partially compressed, or fully compressed partitioned tables.
To avoid the recreation of any bitmap index structure, Oracle recommends creating every partitioned table with at least one compressed partition whenever you plan to partially or fully compress the partitioned table in the future. This compressed partition can stay empty or even can be dropped after the partition table creation.
Having a partitioned table with compressed partitions can lead to slightly larger bitmap index structures for the uncompressed partitions. The bitmap index structures for the compressed partitions, however, are usually smaller than the appropriate bitmap index structure before table compression. This highly depends on the achieved compression rates.
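To make the steps above concrete, here is a minimal sketch under assumed object names (a SALES_FACT table with two bitmap indexes); the exact compression clause depends on your edition and workload:

-- Hypothetical names throughout; adjust to your schema.
-- 1. Mark the bitmap indexes unusable.
ALTER INDEX sales_fact_cust_bix UNUSABLE;
ALTER INDEX sales_fact_prod_bix UNUSABLE;
-- 2. Set the compression attribute (basic compression shown), rebuilding
--    an existing partition compressed at the same time.
ALTER TABLE sales_fact COMPRESS;
ALTER TABLE sales_fact MOVE PARTITION sales_2011 COMPRESS;
-- 3. Rebuild the indexes.
ALTER TABLE sales_fact MODIFY PARTITION sales_2011 REBUILD UNUSABLE LOCAL INDEXES;

On the PCTFREE question: PCTFREE applies only to blocks formatted after the change, so lowering it to 1% affects newly inserted data, not existing blocks:

ALTER TABLE sales_fact PCTFREE 1;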

Similar Messages

  • Foreign keys in SCD2 dimensions and fact tables in data warehouse

    Hello.
    I have a data warehouse in a snowflake schema. All dimensions are SCD2; the columns are like this:
    ID (PK)  SID  NAME ...  START_DATE  END_DATE    IS_ACTUAL
    1        1    XXX       01.01.2000  01.01.2002  0
    2        1    YYX       02.01.2002  01.01.2004  1
    3        2    SYX       02.01.2002              1
    4        3    AYX       02.01.2002  01.01.2004  0
    5        3    YYZ       02.01.2004              1
    On this table there are relations from other dimension and fact tables.
    Do I need to create foreign keys for the relations?
    And if I do, on which columns? SID (serial ID) is not unique. If I create them on ID, I have to fetch the SID and the actual row in every query.

    >
    I have a data warehouse in a snowflake schema. All dimensions are SCD2; the columns are like this:
    ID (PK)  SID  NAME ...  START_DATE  END_DATE    IS_ACTUAL
    1        1    XXX       01.01.2000  01.01.2002  0
    2        1    YYX       02.01.2002  01.01.2004  1
    3        2    SYX       02.01.2002              1
    4        3    AYX       02.01.2002  01.01.2004  0
    5        3    YYZ       02.01.2004              1
    On this table there are relations from other dimension and fact tables.
    Do I need to create foreign keys for the relations?
    >
    Are you still designing your system? Why did you choose NOT to use a star schema? Star schemas are simpler and have some performance benefits over snowflakes. Although there may be some data redundancy, that is usually not an issue for data warehouse systems, since any DML is usually well-managed and normalization is often sacrificed for better performance.
    Only YOU can determine what foreign keys you need. Generally you will create foreign keys between any child table and its parent table, and those need to reference a primary key or unique key value.
    >
    And if I do, on which columns? SID (serial ID) is not unique. If I create them on ID, I have to fetch the SID and the actual row in every query.
    >
    I have no idea what that means. There isn't any way to tell from just the DDL for one dimension table that you provided.
    It is not clear if you are saying that your fact table will have a direct relationship to the star-flake dimension tables or only link to them through the top-level dimensions.
    Some types of snowflakes do nothing more than normalize a dimension table to eliminate redundancy. For those types the dimension table is, in a sense, a 'mini' fact table and the other normalized tables become its children. The fact table only has a relation to the main dimension table; any data needed from the dimension's 'child' tables is obtained by joining them to their 'parent'.
    Other snowflake types have the main fact table holding relations to one or more of the dimension's 'child' tables. That complicates the maintenance of the fact table, since any change to a dimension 'child' table impacts the fact table as well. It is not recommended to use that type of snowflake.
    See the 'Snowflake Schemas' section of the Data Warehousing Guide
    http://docs.oracle.com/cd/B28359_01/server.111/b28313/schemas.htm
    >
    Snowflake Schemas
    The snowflake schema is a more complex data warehouse model than a star schema, and is a type of star schema. It is called a snowflake schema because the diagram of the schema resembles a snowflake.
    Snowflake schemas normalize dimensions to eliminate redundancy. That is, the dimension data has been grouped into multiple tables instead of one large table. For example, a product dimension table in a star schema might be normalized into a products table, a product_category table, and a product_manufacturer table in a snowflake schema. While this saves space, it increases the number of dimension tables and requires more foreign key joins. The result is more complex queries and reduced query performance. Figure 19-3 presents a graphical representation of a snowflake schema.
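    As an illustration of the first (recommended) layout, a minimal sketch with hypothetical names: the fact table's foreign key references the dimension's surrogate primary key (ID), since SID repeats across SCD2 versions and cannot be referenced.

    CREATE TABLE product_category (
      id   NUMBER PRIMARY KEY,
      name VARCHAR2(100)
    );
    CREATE TABLE product_dim (
      id          NUMBER PRIMARY KEY,   -- surrogate key, one row per SCD2 version
      sid         NUMBER NOT NULL,      -- natural/serial id, repeats across versions
      category_id NUMBER NOT NULL REFERENCES product_category (id),
      start_date  DATE,
      end_date    DATE,
      is_actual   NUMBER(1)
    );
    CREATE TABLE sales_fact (
      product_id NUMBER NOT NULL REFERENCES product_dim (id),
      amount     NUMBER
    );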

  • Tablespaces and block size in Data Warehouse

    We are preparing to implement a data warehouse on Oracle 11g R2 and currently I am trying to set up a storage strategy - unfortunately I have very little experience with that. The question is: what is the general advice regarding tablespaces and block size? I did some research and it is hard to find a clear answer; there are resources advising that block size is not important and can be left small (8 KB), while others state that it is crucial and should be the biggest possible (64 KB). The other thing is what part of the data should be placed where? Many resources state that keeping indexes apart from their data is a myth and a bad practice, as it may lead to a decrease in performance; others say that although there is no performance benefit, index tablespaces do not need to be backed up and that is why they should be split. The next idea is to have separate tablespaces for big tables, small tables, and tables accessed frequently and infrequently. How should I organize partitions in terms of tablespaces? Is it a good idea to have "old" (read-only) data partitions in separate tablespaces?
    Any help highly appreciated, and thank you in advance.

    Wojtus-J wrote:
    > We are preparing to implement a data warehouse on Oracle 11g R2 and currently I am trying to set up a storage strategy - unfortunately I have very little experience with that.
    With little experience, the key feature is to avoid big mistakes - don't try to get too clever.
    > The question is: what is the general advice regarding tablespaces and block size?
    If you need to ask about block sizes, use the default (i.e. 8KB).
    > I did some research and it is hard to find a clear answer.
    But if you get contradictory advice from this forum, how would you decide which bits to follow? A couple of sensible guidelines when researching on the internet: look for material that is datestamped with recent dates (last couple of years), or that references recent - or at least relevant - versions of Oracle. Give preference to material that explains WHY an idea might be relevant; give greater preference to material that DEMONSTRATES why an idea might be relevant. Check that any explanations and demonstrations are relevant to your planned setup.
    > The other thing is what part of the data should be placed where? Many resources state that keeping indexes apart from their data is a myth and a bad practice, as it may lead to a decrease in performance; others say that although there is no performance benefit, index tablespaces do not need to be backed up and that is why they should be split. The next idea is to have separate tablespaces for big tables, small tables, and tables accessed frequently and infrequently. How should I organize partitions in terms of tablespaces? Is it a good idea to have "old" (read-only) data partitions in separate tablespaces?
    It is often convenient, and sometimes very important, to separate data into different tablespaces based on some aspect of functionality. The performance argument was mooted (badly) in an era when discs were small and (disk) partitions were hard; but all your other examples of why to split are potentially valid for administrative reasons: big/small, table/index, old/new, read-only/read-write, fact/dimension, etc.
    For data warehouses a fairly common practice is to identify some sort of aging pattern for the data, and try to pick a boundary that allows you to partition the data so that a large fraction of it can eventually be made read-only: using tablespaces to mark time boundaries can be a great convenience - note that the tablespace boundary need not match the partition boundary - e.g. daily partitions in a monthly tablespace. If you take this type of approach, you might have a "working" tablespace for recent data, and then copy the older data to a "time-specific" tablespace, packing it and making it read-only as you do so.
    Tablespaces are (broadly speaking) about strategy, not performance. (Temporary tablespaces / tablespace groups are probably the exception to this thought.)
    Regards
    Jonathan Lewis
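    As a concrete sketch of that aging approach (hypothetical names; adjust sizing and compression to your environment):

    -- Create a time-specific tablespace for older, soon-to-be read-only data.
    CREATE TABLESPACE sales_2010_ts DATAFILE 'sales_2010_ts.dbf' SIZE 10G;
    -- Pack an old partition into it (compressing on the way), rebuild its
    -- local indexes, then freeze the tablespace.
    ALTER TABLE sales_fact MOVE PARTITION sales_2010_12
      TABLESPACE sales_2010_ts COMPRESS;
    ALTER TABLE sales_fact MODIFY PARTITION sales_2010_12
      REBUILD UNUSABLE LOCAL INDEXES;
    ALTER TABLESPACE sales_2010_ts READ ONLY;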

  • Why do we need SSIS and star schema of Data Warehouse?

    If SSAS in MOLAP mode stores the data, what is the role of SSIS, and why do we need a data warehouse and the ETL process of SSIS?
    I have a SQL Server OLTP database. I am using SSIS to transfer my SQL Server data from the OLTP database to a data warehouse database that contains fact and dimension tables.
    After that I want to create cubes from the data warehouse data using SSAS.
    I know that MOLAP stores data. Do I still need a data warehouse with fact and dimension tables?
    Isn't it better to avoid creating a data warehouse and create the cubes directly from the OLTP database?

    Another thing to note is that data stored in a transactional system may not always be in an end-user-consumable format. For example, we may use bit fields/flags to represent some details in OLTP, since the storage required is minimal, but presenting them as-is would not make any sense to users, as they would not know what each bit value represents. In such cases we apply transformations and convert the data into information that users can understand. This is also done in the warehouse, so that the information in the warehouse can be used directly for reporting. Also, in many cases a report will merge data from multiple source systems; merging it on the fly in the report would be tedious and would put load on the report server. In comparison, bringing the data onto a common layer (the warehouse) and prebuilding aggregates is beneficial for report performance.
    I think (not sure) we join tables in SSAS queries and calculate aggregations in it.
    I think SSAS stores these values and joined tables, and we do not need to evaluate those values again - this behavior is like a data warehouse. Isn't it?
    So if I do not need historical data, can I avoid creating a data warehouse?
    On the backend, SSAS uses queries only to extract the data.
    By the way, I was not explaining SSAS; I was explaining what happens inside the data warehouse, which is a relational database by itself. SSAS is used to build cubes (OLAP structures) on top of the data warehouse. A star schema is easier for defining relationships and building aggregations inside SSAS, as it is simple and requires minimal lookups. Also, data is held at the lowest granularity level, which can easily be aggregated to the required levels inside OLAP cubes. Cube processing is very resource intensive, and using an OLTP system would have a huge impact on processing performance, as it is not denormalized, and doing transformations etc. on the fly adds complexity. Pre-creating a layer (the data warehouse) holding data in the required format makes cube processing easier and simpler, as it just has to cross-join tables and aggregate data based on the relationships defined and the level needed inside the cube.

  • RMAN crosscheck and expire guidelines for data warehouse environment

    O/S: Windows Server 2008
    DB: Oracle 11gR2
    Are there any guidelines on how often one should do an RMAN crosscheck and what expiration to set on archivelogs in data warehouse environments?
    It would seem once a day would be enough for the crosscheck in a data warehouse environment that gets refreshed nightly. For expiration I would expect no less than 1 week.
    Cheers!

    I agree with damorgan
    refer the below links for best practices.
    http://www.oracle.com/technetwork/database/features/availability/311394-132335.pdf
    https://blogs.oracle.com/datawarehousing/entry/data_warehouse_in_archivelog_m
    Hope this helps,
    Regards
    http://www.oracleracexpert.com
    Understand the Power of Oracle RMAN
    http://www.oracleracexpert.com/2011/10/understand-power-of-oracle-rman.html
    Duplicating RAC database using RMAN
    http://www.oracleracexpert.com/2009/12/duplicate-rac-database-using-rman.html
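    For reference, a minimal nightly maintenance sketch (a one-week retention window is assumed here; align the DELETE policy with your backup strategy):

    RMAN> CROSSCHECK ARCHIVELOG ALL;
    RMAN> DELETE NOPROMPT EXPIRED ARCHIVELOG ALL;
    RMAN> DELETE NOPROMPT ARCHIVELOG ALL COMPLETED BEFORE 'SYSDATE - 7';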

  • Many Portal users mapping to one R/3 user and querying their own data?

    Hi everyone:
      I want to discuss an issue with you all:
      Precondition: SSO has been set up between the Portal and R/3.
      Issue: many Portal users (vendors) map to one R/3 user (a public vendor user). When they log on to the Portal they can query the report, but the data should be only that of the vendor who is logged on now!
      Any discussion is welcome!
      Best Regards,
      Jianguo Chen


  • Distinct clause and query performance

    Friends,
    I have a query which returns results in 40 seconds without the DISTINCT clause; when I add DISTINCT it takes over 2 hours.
    I have verified the following:
    1. Indexes/table statistics are up to date.
    2. Columns that are used in the WHERE clause but are not indexed have up-to-date column statistics.
    Any idea what could be the reason? The explain plan shows that the DISTINCT step has a very high cost.
    Thanks
    Query and explain plan is below
    SELECT
    DISTINCT -- with distinct 2hrs + and without 40 seconds
    quote_by_dst.qte_hdr_stat_cd, quote_by_dst.qte_ln_cond_cd,
                    product.prod_nm, product.prod_id,
                    cs_ship_by_dst.bto_ds_cac_ownr_ud,
                    quote_by_dst.qte_csup_csup_am, cs_ship_by_dst.bto_ds_cac_nm,
                    product.spl_sht_nm,
                       product.prod_blg_un_fac_um
                    || ' '
                    || product.prod_blg_um
                    || ' '
                    || product.prod_stck_um,
                    product.prod_blg_um, quote_by_dst.qte_ln_brk_1_blg_uom_am,
                    quote_by_dst.qte_csup_avg_cst_am,
                    quote_by_dst.qte_csup_rev_gm_pct_am,
                    quote_by_dst.qte_csup_avg_cst_am, cs_ship_by_dst.bto_id,
                    cs_ship_by_dst.bto_ds_cac_cd,
                       cs_ship_by_dst.bto_ds_cac_cd
                    || product.prod_id
                    || cs_ship_by_dst.bto_id
               FROM infowhse.quote_by_dst4 quote_by_dst,
                    infowhse.product,
                    infowhse.cs_ship_by_dst4 cs_ship_by_dst,
                    infowhse.department
              WHERE (quote_by_dst.dpt_cd = department.dpt_cd)
                AND (quote_by_dst.cus_dpt_id = cs_ship_by_dst.cus_dpt_id)
                AND (product.prod_id = quote_by_dst.prod_id)
                AND (    (   quote_by_dst.qte_ln_cond_cd = 'E'
                          OR quote_by_dst.qte_ln_cond_cd = 'C'
                     AND quote_by_dst.qte_hdr_stat_cd = 'A'
                     AND ((cs_ship_by_dst.bto_cust_type_cd) = '01')
                     AND cs_ship_by_dst.bto_ds_cac_ownr_ud = 'EHOC'
                     AND department.dpt_cd > '0.00'
                         )
                    )
    Explain plan
    Plan
    SELECT STATEMENT  CHOOSECost: 911,832,256  Bytes: 433,941,639,459  Cardinality: 2,729,192,701                                               
         15 SORT UNIQUE  Cost: 911,832,256  Bytes: 433,941,639,459  Cardinality: 2,729,192,701                                          
              14 NESTED LOOPS  Cost: 68,705  Bytes: 433,941,639,459  Cardinality: 2,729,192,701                                     
                   12 HASH JOIN  Cost: 68,705  Bytes: 425,754,061,356  Cardinality: 2,729,192,701                                
                        1 INDEX FAST FULL SCAN NON-UNIQUE INFOWHSE.DST_SEC_DST_SEC_DST_CD_IX Cost: 25  Bytes: 922,700  Cardinality: 184,540                           
                        11 HASH JOIN  Cost: 16,179  Bytes: 1,199,209,082  Cardinality: 7,941,782                           
                             2 INDEX FAST FULL SCAN NON-UNIQUE INFOWHSE.DST_SEC_DST_SEC_DST_CD_IX Cost: 25  Bytes: 922,700  Cardinality: 184,540                      
                             10 HASH JOIN  Cost: 15,879  Bytes: 3,374,060  Cardinality: 23,110                      
                                  8 HASH JOIN  Cost: 15,200  Bytes: 2,981,190  Cardinality: 23,110                 
                                       6 HASH JOIN  Cost: 13,113  Bytes: 1,779,470  Cardinality: 23,110            
                                            3 TABLE ACCESS FULL INFOWHSE.CUSTOMER_SHIP Cost: 5,640  Bytes: 42,372  Cardinality: 1,177       
                                            5 PARTITION RANGE ALL  Partition #: 11  Partitions accessed #1 - #12     
                                                 4 TABLE ACCESS FULL INFOWHSE.QUOTE Cost: 7,328  Bytes: 38,826,590  Cardinality: 946,990  Partition #: 11  Partitions accessed #1 - #12
                                       7 TABLE ACCESS FULL INFOWHSE.PRODUCT Cost: 1,542  Bytes: 9,246,640  Cardinality: 177,820            
                                  9 INDEX FAST FULL SCAN NON-UNIQUE INFOWHSE.CUST_SHIP_SLSDST_DTP_SICALL_IX Cost: 185  Bytes: 9,878,411  Cardinality: 581,083                 
                   13 INDEX UNIQUE SCAN UNIQUE INFOWHSE.DEPARTMENT_PK Bytes: 3  Cardinality: 1                                

    This might be more useful.
    Query is still running.
    There is heavy wait time on 'db file scattered read'.
    Results from
    SELECT * FROM V$SESSION_WAIT WHERE SID = 48;
    SID: 48   SEQ#: 6865   EVENT: db file scattered read   STATE: WAITED KNOWN TIME
    P1TEXT: file#    P1: 108      P1RAW: 000000000000006C
    P2TEXT: block#   P2: 159337   P2RAW: 0000000000026E69
    P3TEXT: blocks   P3: 32       P3RAW: 0000000000000020
    WAIT_TIME: 2   SECONDS_IN_WAIT: 30
    SELECT * FROM V$SESSION_EVENT WHERE SID = 48;
    SID  EVENT                        TOTAL_WAITS  TOTAL_TIMEOUTS  TIME_WAITED  AVERAGE_WAIT  MAX_WAIT  TIME_WAITED_MICRO
    48   log file sync                1            0               0            0             0         563
    48   db file sequential read      11           0               0            0             0         243
    48   db file scattered read       6820         0               330          0             7         3296557
    48   SQL*Net message to client    19           0               0            0             0         23
    48   SQL*Net message from client  18           0               128          7             127       1281912
    Sorry for the long post.

  • Large XML and Query performance

    This problem came to me from a developer; she says that an XML query on an XMLType field is very slow when using large XML documents, and asks whether there are any alternatives. Details are below:
    =============
    Query:
    select attributepool_id, attributepool_name, vintage , p.attributepool.extract('//attributepool/segmentationsystem/id/text()').getStringVal() ,
    p.attributepool.extract('//attributepool/datasource/id/text()').getStringVal()
    from saved_attributepools p
    where user_id = 'CLPROFILE2' and vintage = 'SPRING_2003' order by attributepool_name
    Table name:                 saved_attributepools
    Space:                      ecommerce
    Column name:                attributepool
    attributepool column type:  XMLType
    One of the XML documents is about 4 MB: CORE LIFESTLY
    When we try to get the data for this row, the query takes much longer:
    conn ecommerce@ecom3  --> 82 seconds (table has 65 rows)
    conn ecommerce@oradev --> 34 seconds (table has only 4 rows)
    We think that:
    Oracle parses the entire XML document and loads it into an 'in-memory' DOM structure before executing the specified XPaths.
    Adding an INDEX on the XMLType won't help, as we don't use a WHERE clause against the XMLType in this case.
    We don't know whether 10g has a solution for this or not.
    Any suggestion will be greatly appreciated.
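    One direction sometimes suggested for this pattern (a sketch, not a verified fix): use EXTRACTVALUE with absolute paths instead of extract().getStringVal() with the descendant '//' axis, which can reduce the XPath work for simple single-value lookups:

    SELECT attributepool_id, attributepool_name, vintage,
           extractvalue(p.attributepool, '/attributepool/segmentationsystem/id'),
           extractvalue(p.attributepool, '/attributepool/datasource/id')
    FROM   saved_attributepools p
    WHERE  user_id = 'CLPROFILE2'
    AND    vintage = 'SPRING_2003'
    ORDER  BY attributepool_name;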


  • How to improve query performance using infoset

    I created one InfoSet including 4 characteristics and 3 DSOs, all of which are time-dependent. When the query runs, the system shows very poor performance; sometimes no data shows up in the BEx Analyzer. In that case I have to close the BEx Analyzer first and then open it again; after that it shows real results. It seems very strange. Does anybody have experience with InfoSet performance improvement? Please advise, thanks!

    Hi
    As an InfoSet itself doesn't hold any data, it improves performance.
    Also go through the tips below.
    Find the query runtime:
    Where to find the query runtime?
    Note 557870 - 'FAQ BW Query Performance'
    Note 130696 - Performance trace in BW
    This info may be helpful.
    General tips
    Using aggregates and compression.
    Using fewer and less complex cell definitions if possible.
    1. Avoid using too many navigational attributes.
    2. Avoid RKFs and CKFs.
    3. Avoid many characteristics in the rows.
    By using T-codes ST03 or ST03N
    Go to transaction ST03 > switch to expert mode > from left side menu > and there in system load history and distribution for a particular day > check query execution time.
    Statistical Records Part 4: How to read ST03N datasets from DB in NW2004
    How to read ST03N datasets from DB
    Try table rsddstats to get the statistics
    Using cache memory will decrease the loading time of the report.
    Run the reporting agent at night and send the results by email. This will ensure use of the OLAP cache, so later report executions will retrieve the results faster from the OLAP cache.
    Also try
    1. Use different parameters in ST03 to see the two important metrics: the aggregation ratio and the records transferred to the front end vs. the records selected from the DB.
    2. Use the program SAP_INFOCUBE_DESIGNS (Performance of BW infocubes) to see the aggregation ratio for the cube. If the cube does not appear in the list of this report, try to run RSRV checks on the cube and aggregates.
    Go to SE38 > Run the program SAP_INFOCUBE_DESIGNS
    It will show the dimension vs. fact table sizes in percent. If you mean the speed of queries on a cube as the performance metric of the cube, measure query runtime.
    3. To check the performance of the aggregates, see the VALUATION and USAGE columns of the aggregates.
    Open the aggregates and observe the VALUATION and USAGE columns.
    The minus/plus signs are the valuation of the aggregate design and usage; e.g. '++' means that its compression is good and it is accessed often (in effect, performance is good) - if you check its compression ratio, it should be good - while '--' means the compression ratio and access are not so good (performance is not so good). The more plus signs, the more useful the aggregate and the more queries it satisfies; the more minus signs, the worse the evaluation of the aggregate.
    If the valuation is '-----', the aggregate is just overhead and can potentially be deleted; '+++++' means the aggregate is potentially very useful.
    In the USAGE column, you can see how often the aggregate has been used by queries.
    Thus you can check the performance of the aggregate.
    Refer.
    http://help.sap.com/saphelp_nw70/helpdata/en/b8/23813b310c4a0ee10000000a114084/content.htm
    http://help.sap.com/saphelp_nw70/helpdata/en/60/f0fb411e255f24e10000000a1550b0/frameset.htm
    performance ISSUE related to AGGREGATE
    Note 356732 - Performance Tuning for Queries with Aggregates
    Note 166433 - Options for finding aggregates (find optimal aggregates for an InfoCube)
    4. Run your query in RSRT and run the query in the debug mode. Select "Display Aggregates Found" and "Do not use cache" in the debug mode. This will tell you if it hit any aggregates while running. If it does not show any aggregates, you might want to redesign your aggregates for the query.
    Also, your query performance can depend on the selection criteria; since you have a selection on only one InfoProvider, just check whether you are selecting a huge amount of data in the report.
    Check the query read mode in RSRT (whether it is A, X or H); the advisable read mode is X.
    5. In BI 7, statistics need to be activated for ST03 and the BI admin cockpit to work.
    This is done by implementing the BW Statistics Business Content - you need to install it, feed it data, and analyze through the ready-made reports.
    http://help.sap.com/saphelp_nw70/helpdata/en/26/4bc0417951d117e10000000a155106/frameset.htm
    /people/vikash.agrawal/blog/2006/04/17/query-performance-150-is-aggregates-the-way-out-for-me
    https://www.sdn.sap.com/irj/servlet/prt/portal/prtroot/docs/library/uuid/1955ba90-0201-0010-d3aa-8b2a4ef6bbb2
    https://www.sdn.sap.com/irj/servlet/prt/portal/prtroot/docs/library/uuid/ce7fb368-0601-0010-64ba-fadc985a1f94
    http://help.sap.com/saphelp_nw04/helpdata/en/c1/0dbf65e04311d286d6006008b32e84/frameset.htm
    You can go to T-Code DB20 which gives you all the performance related information like
    Partitions
    Databases
    Schemas
    Buffer Pools
    Tablespaces etc
    Use the tool RSDDK_CHECK_AGGREGATE in SE38 to check for corrupt aggregates.
    If aggregates contain incorrect data, you must regenerate them.
    Note 202469 - Using the aggregate check tool
    Note 646402 - Programs for checking aggregates (as of BW 3.0B SP15)
    You can find out whether an aggregate is useful or useless through a process of checking the RSDDSTATAGGRDEF* tables.
    Run the query in RSRT with statistics, execute it, and come back; you will get a STATUID - copy this and check it in the table.
    This shows you exactly which InfoObjects the query hits; if any one of the objects is missing, it's a useless aggregate.
    6. Check SE11 > table RSDDAGGRDIR. You can find the last call-up of the aggregate in the table.
    Generate Report in RSRT
    https://www.sdn.sap.com/irj/sdn/go/portal/prtroot/docs/library/uuid/cccad390-0201-0010-5093-fd9ec8157802
    https://www.sdn.sap.com/irj/sdn/go/portal/prtroot/docs/library/uuid/4c0ab590-0201-0010-bd9a-8332d8b4f09c
    Business Intelligence Journal Improving Query Performance in Data Warehouses
    http://www.tdwi.org/Publications/BIJournal/display.aspx?ID=7891
    Achieving BI Query Performance Building Business Intelligence
    http://www.dmreview.com/issues/20051001/1038109-1.html
    Assign points if useful
    Cheers
    SM

  • Is OBIEE used to create data warehouses dynamically?

    Management where I work wants to use the OBIEE Administrator tool to source a 3NF normalized database and create a "virtual data warehouse" in the Business Model and Mapping layer of OBI Administrator, since a star schema model is required by the OBI business model layer. They claim they were told by an Oracle sales rep that the Administrator tool could do this.
    Is this possible? As OBI issues only SQL, not PL/SQL, how can one "create" dimensions, lookup tables and fact tables dynamically? And even if it could, the performance hit of recreating the virtual data warehouse each time a query is issued would be huge.
    Having used Prism Warehouse Builder and DataStage in the past to create data warehouses, I am aware that one needs a procedural programming language to create and maintain the star schema tables (surrogate key maintenance, controlling workflows, maintaining slowly changing dimensions, intermediate lookup tables, etc.). SQL was not meant to do this heavy-lifting programming. After all, isn't this why Oracle Warehouse Builder (and previously Informatica) is shipped with the OBIEE suite - because OBI is not an ETL tool for creating dimensional models? One uses an ETL tool to create the dimensional data model for OBI to access, and passes the metadata along to OBI Answers.
    So is it normal practice to use the Administrator's Business Model and Mapping layer to create virtual star-schema logical tables from physical tables that are in 3NF? Or is the tool used to access already denormalized tables in the physical layer that were created using Informatica, OWB or another ETL tool?

    I asked an "Expert" in OBIEE. Here are snippets of his response:
    "Be aware though that the transformation ability is fairly limited, and
    will only really work with data that is very close to a star schema, i.e.
    the data can be easily transformed through a couple of denormalizations and
    table joins. If your source data is very normalized and cannot easily be
    transformed into a star schema, you would need to use a tools such as
    Informatica, OWB or similar to extract data from your source systems, load
    and then transform it into a data warehouse or data mart and report of of
    that. The more that your data needs to be transformed (i.e. the closer it
    is to a 3NF model) the more likely it is that you'll need to use an ETL
    tool, and a data warehouse or data mart, to host your data."
    And in response to my noting the lack of documentation on how to model a 3NF to Star Schema his response was:
    "No, you're right, the documentation doesn't really go into "how to" turn a 3NF model into a dimensional model. If you look back to when OBIEE was a Siebel product, the documentation was really aimed at either Siebel consultants or customers who had been on the training, they didn't want customers "off the street" to try and implement OBIEE as it would hit their services revenue. That's where the blog posts we do, things like the Oracle-by-example training courses on OTN and so on come in, otherwise as you say there's little out there on the best way to transform your model - it's mostly passed on "word of mouth" or is built up from experience on working on projects."

  • Good practices for data warehouses

    Hi all,
    I am looking for good practices on designing and maintaining a healthy data warehouse. Any tips? I'd appreciate it if you could give me some ideas, as well as URLs on related topics. Besides, someone told me that enabling the Oracle parameter "star_transformation_enabled" can help gain performance. Is that true? How?
    Many thanks,
    Chris
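    For reference, star transformation is controlled by an initialization parameter and relies on bitmap indexes on the fact table's foreign key columns; a minimal sketch (index names hypothetical):

    ALTER SESSION SET star_transformation_enabled = TRUE;
    -- The optimizer can then rewrite star queries to probe bitmap indexes
    -- such as these instead of joining all dimensions first:
    CREATE BITMAP INDEX sales_fact_time_bix ON sales_fact (time_id);
    CREATE BITMAP INDEX sales_fact_prod_bix ON sales_fact (prod_id);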

    I have a similar question. I have an Oracle 8i database in which a table's record count is increasing at a rate of 4000 per day, and only the most recent records need to be accessed very often. Is there any method available so that I can use a materialized view for the latest records - say, records up to 7 days old - which would be accessed only in a query having several joins, unions and sums?
    The records have a structure like:
    Date
    Time
    A counter to join to another table
    Time_Division
    Equipment Code
    Equipment Status
    Duration
    etc.
    Any suggestion would be welcome.
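    As a sketch of that idea (hypothetical table and column names; a complete-refresh MV re-run nightly, since a SYSDATE-based predicate generally rules out query rewrite, so reports would query the MV directly):

    CREATE MATERIALIZED VIEW recent_events_mv
      BUILD IMMEDIATE
      REFRESH COMPLETE ON DEMAND
    AS
    SELECT event_date, equipment_code, equipment_status,
           SUM(duration) AS total_duration
    FROM   event_log
    WHERE  event_date >= TRUNC(SYSDATE) - 7
    GROUP  BY event_date, equipment_code, equipment_status;
    -- Refresh from a nightly job:
    --   EXEC DBMS_MVIEW.REFRESH('RECENT_EVENTS_MV', 'C');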

  • OLAP Query performance

    Hi,
    Does using compression & partitioning (by time) affect reporting performance adversely? I have an 8 GB cube with 13 dimensions built in 10.1.0.4. The cube was defined with 1 dense dimension and the other 12 as sparse in a compressed composite. It was also partitioned by year. It takes close to 1 hour to build the cube, and since it is compressed, I assume it is fully aggregated. However, the performance of Discoverer queries on this cube has been pathetic! Any drill-down or slice/dice takes a long time to return if there are multiple dimensions on either edge of the crosstab. Also, when scrolling down, it freezes for a while and then brings the data. Sometimes it takes a couple of minutes!
    What are the things I need to check to speed this up? I think I have checked things like sparsity, SGA/PGA sizes, the OLAP page pool, etc.
    Regards
    Suresh

    Hi Suresh,
    Before you can implement changes to improve performance, you need to understand the causes of the performance problems. Discoverer for OLAP uses the OLAP API for queries, and the OLAP API generates SQL to query an analytic workspace. There are a few broad possible causes of poor query performance:
    retrieving data from the AW is slow
    SQL execution is slow, perhaps because the SQL is inefficient
    SQL execution is fast, but the OLAP API is slow to fetch data
    Each of these causes demands a different approach. I'd suggest that you enable configuration parameters SQL_TRACE and TIMED_STATISTICS, generate some trace files, and use the tkprof utility to try to narrow down the cause of the trouble.
    Geof
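    A minimal sketch of that suggestion (the trace file name is hypothetical; the file appears in user_dump_dest):

    ALTER SESSION SET timed_statistics = TRUE;
    ALTER SESSION SET sql_trace = TRUE;
    -- ... run the slow Discoverer / OLAP API query here ...
    ALTER SESSION SET sql_trace = FALSE;
    -- Then profile the trace file on the server, e.g.:
    --   tkprof orcl_ora_12345.trc report.txt sort=exeela,fchela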

  • Availability data not visible in data warehouse

    I'm having a problem with our data warehouse. I can't run, or even find, availability reports for some of the objects that are visible and clearly monitored in our SCOM. For example, I created a web transaction monitor with the wizard, but when I try to run an availability report for it, there is no object for it, so I cannot even run the report. I know about the 500-object limit, and I have set the registry key to see more objects. We use SCOM 2012 R2 UR2.
    Is there anything else that I should check? Can I somehow run a SQL query against my data warehouse to see if there is any availability data?

    Hello SamiKoskivaara, 
    Could you please check if event ID 31553 is being logged on one of your SCOM management servers?
    Event ID 31553:
    "Data was written to the Data Warehouse staging area but processing failed on one of the subsequent operations. Exception 'SqlException': Sql execution failed. Error 2627, Level 14, State 1, Procedure ManagedEntityChange, Line 368, Message: Violation of UNIQUE KEY constraint 'UN_ManagedEntityProperty_ManagedEntityRowIdFromDAteTime'. Cannot insert duplicate key in object 'dbo.ManagedEntityProperty'. The duplicate key value is (184, Mar 1 2013 9:42AM). One or more workflows were affected by this..."
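    On the original question of querying the warehouse directly: assuming the default OperationsManagerDW database, one hedged starting point is to list the state views actually present, then sample one (view names such as State.vStateDaily vary by version and are an assumption here):

    -- List candidate availability/state views in the DW.
    SELECT TABLE_SCHEMA, TABLE_NAME
    FROM   OperationsManagerDW.INFORMATION_SCHEMA.VIEWS
    WHERE  TABLE_NAME LIKE '%State%';
    -- If State.vStateDaily exists, sample a few rows (assumed view name).
    SELECT TOP 10 *
    FROM   OperationsManagerDW.State.vStateDaily;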

  • Difference between Compression and Aggregation

    Hi,
      Can anybody explain the difference between compression and aggregation? Performance-wise, which is better? Please explain in detail.
      Thanks,
      Chinna

    Hi,
    Suppose you have three characteristics in a cube, say X, Y, Z.
    Records that have the same combination of these characteristics but were loaded with different requests will not be aggregated.
    So when you compress the cube, it deletes the request number and aggregates the records that have the same combination of these characteristics.
    Coming to aggregates: if you build an aggregate on the characteristic 'X', it aggregates the records that have the same value for that particular characteristic.
    For example, say you have the records:
    x1, y1, z1, ... (some key figures)
    x1, y2, z1, ...
    x1, y1, z1, ...
    x3, y3, z3, ...
    If you compress them, you will get three records.
    If you build an aggregate on the characteristic 'X', you will get two records.
    So aggregates give a more highly aggregated level of data than compression does.
    Regards,
    haritha.

  • Permanent Job Opportunity - Oracle BI Data Warehouse Developer Chicago, IL

    Submit Resumes to [email protected]
    The Business Intelligence Specialist will play a critical role in designing, developing, deploying, and supporting data warehouse/data mart applications. In this role, the person will be responsible for all BI aspects of a data warehouse/data mart application. Primary duties will be to create reporting standards, as well as coach and support power users with the selected Oracle tool. The ideal candidate will have 3+ years of demonstrated experience in data warehousing and business intelligence tools. Must also possess excellent communication skills and an outstanding track record with users.
    Principal Duties:
    Participates with internal clients to define software requirements for development, maintenance and/or improvements
    Maintains accuracy, integrity, and availability of the data warehouse
    Tests, monitors, manages, and validates data warehouse activity, including data extraction, transformation, movement, loading, cleansing, and updating processes
    Designs and optimizes data mart models for Oracle Business Intelligence Suite.
    Translates the reporting requirements into data analysis and reporting solutions.
    Reviews and signs off on project plan(s).
    Reviews and signs off on technical design(s).
    Defines and develops BI reports for accessing/analyzing data in the warehouse.
    Customizes BI tools and data sets for different types of users.
    Designs and develops UAT (User Acceptance Testing).
    Drives improvement of BI system architecture and development process.
    Develops and maintains internal relationships. Actively champions teamwork. Uses internal resources to enhance knowledge and expertise of industry, research, products and services. Provides information and support to others in the company.
    Required Skills:
    Education and Experience:
    BS/MS in Computer Science or equivalent.
    3+ years of experience with Oracle, PL/SQL Development and Data Warehousing.
    Experience with Oracle Business Intelligence Suite and Crystal Reports is a plus.
    2-3 years of dimensional modeling experience.
    Demonstrated hands-on experience with Unix/Linux and SQL required.
    Demonstrated hands-on experience with Oracle reporting tools.
    Demonstrated experience with translating business requirements into data analysis and reporting solutions.
    Experience in training programs/teach users to use tools.
    Expertise with software development process.
    Effective mediator - able to facilitate constructive and productive discussions with internal customers, external clients, and development personnel pertaining to feature definition, project scope, and status
    Problem solving - identifies and resolves problems in a timely manner, gathers and analyzes information skillfully and maintains confidentiality.
    Planning/organizing - prioritizes and plans work activities and uses time efficiently. Work requires continual attention to detail in composing and proofing materials, establishing priorities and meeting deadlines. Must be able to work in a fast-paced environment with demonstrated ability to juggle multiple competing tasks and demands.
    Quality control - demonstrates accuracy and thoroughness and monitors own work to ensure quality.
    Adaptability - adapts to changes in the work environment, manages competing demands and is able to deal with frequent change, delays or unexpected events.
    Benefits/Compensation:
    Employees enjoy competitive compensation. We have a full benefits package including medical and dental insurance, long-term disability and life insurance and a 401(k) plan.
    The client operates within the healthcare industry.
    This is a permanent full-time position. After ensuring your availability and qualifications we will put you in direct contact with the client to move forward in the process.

    Please forward your updated resume as soon as possible.
