Cube/Fact Design

I am trying to build facts/dimensions relating to university enrollment. I have designed dimensions for student, course, and section info, and was trying to load actual enrollment data into the fact table. I added a measure for grades, but when I load that measure, the fact table doesn't seem to have enough information to associate the grade with the student/class. I believe my fact table is designed wrong. Does anyone have information on what data should be loaded there in this context?
Additionally, when I try to build a mapping to the fact table, I can only move data into the measures (not the foreign keys), which makes sense; however, it is only through the FKs that the data can match up with the proper student, etc.
Any help or advice is strongly appreciated.
Thanks,
Brandon

Brandon,
The first requirement for building a fact table with dimensions is indeed being able to relate the source of the fact records to the tables that make up the dimensions. Is there not information somewhere (perhaps in other tables, another system, or even flat files) that relates student IDs to courses? I would imagine that is the very least you can expect...
Mark.
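A minimal sketch of what Mark describes, using hypothetical table and column names (sqlite3 standing in for the warehouse): the source enrollment records must carry the natural keys (student ID, section ID) so the load can resolve each dimension's surrogate key before writing the grade measure into the fact row.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
cur = conn.cursor()

# Dimensions carry surrogate keys plus the natural (source-system) keys.
cur.executescript("""
CREATE TABLE dim_student (student_key INTEGER PRIMARY KEY, student_id TEXT);
CREATE TABLE dim_section (section_key INTEGER PRIMARY KEY, section_id TEXT);
-- One fact row per student per section: the grain of an enrollment fact.
CREATE TABLE fact_enrollment (
    student_key INTEGER REFERENCES dim_student(student_key),
    section_key INTEGER REFERENCES dim_section(section_key),
    grade_points REAL
);
""")
cur.execute("INSERT INTO dim_student VALUES (1, 'S1001')")
cur.execute("INSERT INTO dim_section VALUES (1, 'MATH101-A')")

# A source enrollment record: natural keys plus the measure.
src = {"student_id": "S1001", "section_id": "MATH101-A", "grade_points": 3.7}

# Surrogate-key lookup during the load -- this is the step that fails
# if the source rows don't identify the student and section.
cur.execute("""
INSERT INTO fact_enrollment
SELECT s.student_key, c.section_key, :grade_points
FROM dim_student s, dim_section c
WHERE s.student_id = :student_id AND c.section_id = :section_id
""", src)

row = cur.execute("SELECT * FROM fact_enrollment").fetchone()
print(row)  # (1, 1, 3.7)
```

The FKs are populated by the lookup during the load, not mapped directly from the source; the source only needs to carry the natural keys.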

Similar Messages

  • Sequential read on Cube fact table

    I have a couple of queries on a particular cube that run forever. I rebuilt the
    indexes for that cube's fact table and regenerated the queries; some of them now
    execute pretty quickly, whereas others are still very slow.
    The result is similar when I use LISTCUBE.
    When I check SM66, it shows that the query reads the cube's fact table for a very long time.
    Is there anything that can be done?
    All suggestions welcome!
    The cube has no aggregates.

    Hi oops,
    Why don't you build aggregates on the cube? That will considerably reduce the query runtime. Refreshing cube statistics and creating indexes definitely helps with loading and reading the cube, but if it holds a huge number of records the query runtime will still be high, and you have to build aggregates for the cube. If you click on Business Intelligence in SDN, you will find many documents on how to create an aggregate under the performance link. This should help you.
    Regards
    Sriram

  • Snapshot / Trend Fact Design

    I have a fact table that holds the file sizes of all the files in the organisation; it has 10 billion rows. We now have a requirement to see the trend of these files' growth over a period of time.
    If I use a snapshot date and link it to a snapshot dimension (monthly or yearly), that will let me see the trend, but it means the number of rows in the fact table is multiplied by the number of snapshots, which leads to performance issues and is not feasible.
    Is there any alternative solution in terms of design and reporting? Please help.
    File Fact: 10 billion rows
        ID  |  FileName  |  Size(GB)  |  Snapshot Date

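    One common alternative for the 10-billion-row trend requirement above is to persist a snapshot row only when a file's size changes and carry the last known value forward at query time; a sketch with hypothetical data:

```python
# Sketch of one way to avoid multiplying 10B rows by every snapshot date:
# persist a row only when a file's size changes, and reconstruct any month's
# snapshot by taking the latest row at or before it. Data is hypothetical.
history = [  # (file_id, snapshot_month, size_gb) -- change rows only
    ("f1", "2024-01", 10.0),
    ("f1", "2024-03", 12.0),   # grew in March; no row stored for February
    ("f2", "2024-01", 5.0),
]

def size_as_of(file_id, month):
    """Latest recorded size at or before `month` (None if file not yet seen)."""
    rows = [(m, s) for f, m, s in history if f == file_id and m <= month]
    return max(rows)[1] if rows else None

print(size_as_of("f1", "2024-02"))  # 10.0 -- carried forward from January
print(size_as_of("f1", "2024-03"))  # 12.0
```

    In the warehouse this "carry forward" would typically be a window function over the change rows (latest snapshot at or before the requested date) rather than application code, so the fact only grows with actual changes.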

  • Fact Design, Any Aggregation?

    Hi ,
    this is my Fact Table
    | ProjectID | ProjectSeq | PlanPercentYearly | Month | PlanPercentMonthly |
    | 1 | 1 | 100 | 1 | 10 |
    | 1 | 1 | 100 | 1 | 10 |
    | 1 | 1 | 100 | 1 | 5 |
    | 1 | 2 | 100 | 3 | 20 |
    | 1 | 2 | 100 | 5 | 10 |
    | 1 | 2 | 100 | 6 | 5 |
    | 1 | 2 | 100 | 7 | 5 |
    | 1 | 3 | 100 | 6 | 20 |
    | 1 | 3 | 100 | 7 | 10 |
    | 1 | 3 | 100 | 8 | 15 |
    OK, now I group by ProjectSeq via the Sequence dimension,
    and this is my target result, where PlanPercentMonthly is SUM(PlanPercentMonthly):
    | ProjectID | ProjectSeq | PlanPercentYearly | PlanPercentMonthly |
    | 1 | 1 | 100 | 15 |
    | 1 | 2 | 100 | 40 |
    | 1 | 3 | 100 | 45 |
    But PlanPercentYearly is my problem:
    I don't know how to produce that output with any formula or aggregation.
    Or might my fact table design be wrong?
    Thanks for helping a newbie in DW,
    and sorry about my weak English.
    Regards.
    Edited by: BlarBlarBlar on Mar 21, 2011 9:57 PM
    Edited by: BlarBlarBlar on Mar 22, 2011 12:01 AM

    Hi, if the data loaded into the data mart contains only percentages and not their constituents, then those values cannot be aggregated to any level other than the one at which they are loaded. The only way around this is to ensure that the data loaded consists of the constituents, as in the following example.
    Suppose a user needs to see Gross Profitability, which is (Revenue - Cost) / Revenue.
    I should not create a measure called Gross Profitability if I need gross profitability reports in different contexts. Rather, I should create Revenue and Cost as my measures, so that I can calculate Gross Profitability at any level required.
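    A sketch of one way to get the poster's target result: since PlanPercentYearly is constant within a sequence, aggregate it with MAX (or AVG) rather than SUM, while summing the monthly percent. (Note that the post's sample rows for sequence 1 sum to 25, not the 15 shown in its target table.)

```python
import sqlite3

# PlanPercentYearly is constant within a ProjectSeq, so aggregate it with
# MAX (or AVG) while summing PlanPercentMonthly. Data taken from the post.
conn = sqlite3.connect(":memory:")
conn.execute("""CREATE TABLE fact
    (ProjectID INT, ProjectSeq INT, PlanPercentYearly INT,
     Month INT, PlanPercentMonthly INT)""")
conn.executemany("INSERT INTO fact VALUES (?,?,?,?,?)", [
    (1, 1, 100, 1, 10), (1, 1, 100, 1, 10), (1, 1, 100, 1, 5),
    (1, 2, 100, 3, 20), (1, 2, 100, 5, 10), (1, 2, 100, 6, 5),
    (1, 2, 100, 7, 5),
    (1, 3, 100, 6, 20), (1, 3, 100, 7, 10), (1, 3, 100, 8, 15),
])

result = conn.execute("""
SELECT ProjectID, ProjectSeq,
       MAX(PlanPercentYearly),        -- constant per sequence: don't SUM it
       SUM(PlanPercentMonthly)
FROM fact
GROUP BY ProjectID, ProjectSeq
ORDER BY ProjectSeq
""").fetchall()

for row in result:
    print(row)  # e.g. (1, 2, 100, 40)
```

    This mirrors the advice above: only the additive constituent (the monthly percent) is summed, while the yearly figure is carried through unchanged.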

  • Copy Cube, Fact table partitioned automatically.

    Hello Guys.
    I have a problem.
    I made a copy of a cube to back up its data before redesigning the original cube.
    The problem is that the fact table of my new cube is now partitioned into multiple partitions.
    I'm on an Oracle database, and with program SAP_DROP_EMPTY_FPARTITIONS in SE38 I can see that in the Dev environment there are no partitions in either cube, but in the Prod system the original cube is not partitioned while the new cube (the copy of the original) has a lot of partitions.
    These partitions make the loads from one cube to the other very slow: around 4 hours for 300,000 records, because of the cube's partitioning.
    So I want to know what caused the fact table of the copied cube to be partitioned, and how I can solve this problem to make the data loads quicker.
    THANKS in advance.

    Did you try partitioning the cube by fiscal period/calendar month?
    That should help the system keep the number of partitions very low.
    Cheers.
    (BTW: if you are creating a backup copy/archive of a cube that doesn't need the posting date in reporting, try changing the posting date to the end of the month; the aggregation level will change accordingly, resulting in lower volume.)

  • Fact design: accumulating snapshot

    This is the loan domain, and I am trying to create an accumulating snapshot table at the loan level with the following columns.
    On 01-Jan-2011 this snapshot would look like:
    LOAN_KEY, LOAN_DATE, DEPOSIT_DATE, RETURN_DATE, CLEAR_DATE, WRITEOFF_DATE, LOAN_PAIDOFF_DATE, LOAN_STATUS, CHECK_STATUS, DUE_AMT, RTN_FEE, LATE_FEE
    Sample data:
    10001, 01-jan-2011, null, null, null, null, null, Open, Held, 2000, 0, 0
    On 03-Feb-2011 this snapshot would look like (same columns), sample data:
    10001, 01-jan-2011, 30-jan-2011, null, 03-feb-2011, null, 30-jan-2011, Close, Clear, 0, 0, 0
    Do I need to store only the current-day snapshot of this information, or do I need to store it for each business day (i.e., daily)?
    What is the best practice for this?
    Thanks,
    Hesh.

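    For reference, an accumulating snapshot in the usual (Kimball) sense keeps one row per loan that is revisited and updated in place as each milestone occurs; a row per business day would be a periodic snapshot instead. A sketch using a column subset from the post:

```python
# Sketch of an accumulating snapshot fact: one row per loan whose milestone
# columns are filled in (updated in place) as events occur -- not one new row
# per business day. Column subset and values taken from the post.
loan_fact = {  # keyed by LOAN_KEY
    10001: {"loan_date": "2011-01-01", "deposit_date": None, "clear_date": None,
            "loan_status": "Open", "check_status": "Held", "due_amt": 2000.0},
}

def apply_event(loan_key, **updates):
    """Revisit the existing row and fill in the newly known columns."""
    loan_fact[loan_key].update(updates)

# On 03-Feb-2011 the check clears and the loan closes:
apply_event(10001, deposit_date="2011-01-30", clear_date="2011-02-03",
            loan_status="Close", check_status="Clear", due_amt=0.0)

print(loan_fact[10001]["loan_status"])  # Close
print(len(loan_fact))                   # 1 -- still one row for the loan
```

    Daily-grain storage is only needed if there is a requirement to report the state as of arbitrary past days, in which case a separate periodic snapshot fact is the usual design.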

  • Design of a cube

    Dear experts,
    Many experts here will have worked on designing cubes. Will any expert share their experience of how to convert normalized tables into a denormalized form while designing a cube, and what precautions you take while designing the cube?
    Regards
    Radha

    Hi ,
    Basic considerations....
    Keep the small dimensions together
    Keep line item dimensions wherever needed
    Group related characteristics into one dimension only
    Remove high cardinality dimensions
    Use aggregates and compression.
    Refer these threads.....
    Tips for Cube Modeling
    Designing cube
    InfoCube Design
    CUBE DESIGN
    tarak

  • Non-Cumulative vs. Cumulative KeyFigures for Inventory Cube Implementation?

    A non-cumulative is a non-aggregating key figure at the level of one or more objects, always displayed in relation to time. Generally speaking, in SAP BI data modeling there are two options for non-cumulative management: the first uses non-cumulative key figures; the second uses normal (cumulative) key figures. For SAP inventory management, 0IC_C03 is a standard business content cube based on non-cumulative management with non-cumulative key figures. Due to specific business requirements (this cube is designed primarily for detailed inventory balance reconciliation), we had to enhance 0IC_C03 with additional characteristics such as document number, movement type, and so on. The original estimated size of the cube is about 100 million records, since we are extracting all history records from ECC (inception to date). We spent a lot of time debating whether we should use the non-cumulative key figures from the standard business content of the 0IC_C03 cube. We understand that, by using non-cumulative key figures, the fact table will (potentially) be smaller. But there are some disadvantages, such as the following:
    (1) We cannot use the InfoCube together with another InfoCube with non-cumulative key figures in a MultiProvider.
    (2) Query runtime can be affected by the calculation of the non-cumulative values.
    (3) The InfoCube cannot be logically partitioned by time characteristics (e.g. fiscal year), which makes future archiving difficult.
    (4) It is more difficult to maintain a non-cumulative InfoCube once we have added more granularity (more characteristics) to the cube.
    Thus, we have decided not to use the non-cumulative key figures. Instead, we are using cumulative key figures such as Receipt Stock Quantity (0RECTOTSTCK), Issue Stock Quantity (0ISSTOTSTCK), Receipt Valuated Stock Value (0RECVS_VAL), and Issue Valuated Stock Value (0ISSVS_VAL). All four key figures are available in the InfoCube and are calculated during the update process. Based on a study of the reporting requirements, these four key figures seem sufficient to meet them all.
    In addition, since we are not using non-cumulative key figures, we have removed them from the 0IC_C03 InfoCube and logically partitioned the cube by fiscal year. Furthermore, the InfoCubes are physically partitioned by fiscal year/period as well.
    To a large extent we are moving away from the standard business content cube, and we have a pretty customized cube here. We'd like to use this opportunity to seek some guidance from SAP BI experts. Specifically, we want to understand what we lose by not using the non-cumulative key figures provided by the original 0IC_C03 business content cube. Your honest suggestions and comments are greatly appreciated!

    Hello Marc,
    Thanks for the reply.
    I work for Dongxin, and would like to add a couple of points to the original question...
    Based on the requirements, we decided to add document number and movement type, along with a few other characteristics, to the InfoCube (a custom InfoCube for article movements), because once we added these characteristics the non-cumulative key figures, even with the marker properly set, did not handle the stock balances and movements correctly, causing data inconsistency issues.
    So we are just using the cumulative key figures, and we have decided to do the logical partitioning by fiscal year (as the posting period is used to derive the time characteristics, and compared to MC.1 this makes more sense for comparisons between ECC and BI).
    Also, I have gone through the How-To manual for inventory, and in either case the reporting requirement is inception to date (not just a weekly or monthly snapshot).
    We would like to confirm whether there would be any long-term issues in doing so.
    To optimize performance, we are planning to create aggregates at plant level.
    A couple of other points we took into consideration for using cumulative key figures:
    1. Parallel processes are possible if non-cumulative key figures are not used.
    2. Aggregates on a fixed plant are possible if non-cumulative key figures are not used (relevant because not all plants are active, and some are not reported on).
    So, since we are not using the (non-cumulative) stock key figures, is it OK not to use 2LIS_03_BX, as this is only needed to bring in the stock opening balance?
    We would like to know whether there would be any issue in using only BF and UM, with the InfoCube capturing article movements along with cumulative key figures.
    Once again, thanks for your input on this issue.
    Thanks
    Dharma.
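    The cumulative-key-figure approach described above derives the stock balance at query time from receipts and issues rather than storing a non-cumulative balance; a sketch with hypothetical quantities:

```python
# Sketch of the cumulative-key-figure approach: the stock balance at any
# period is derived by summing receipts minus issues up to that period, with
# no stored non-cumulative balance. Quantities are hypothetical.
movements = [  # (period, receipt_qty, issue_qty)
    ("2024-01", 100, 30),
    ("2024-02", 50, 40),
    ("2024-03", 0, 20),
]

def balance_as_of(period):
    """Inception-to-date balance: cumulative receipts minus cumulative issues."""
    return sum(r - i for p, r, i in movements if p <= period)

print(balance_as_of("2024-02"))  # 80
print(balance_as_of("2024-03"))  # 60
```

    This is also why the opening-balance load (2LIS_03_BX in the thread) only matters when history before the extraction start must be folded into the running total.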

  • How to combine multiple fact tables and dimensions in one worksheet?

    Hello Forum,
    I am encountering a reporting problem when trying to create a worksheet that uses more than one cube/fact table with common dimensions. I used Oracle Warehouse Builder 10gR2 to design and deploy a pretty simple ROLAP data mart, and we are using Discoverer Plus for OLAP as our reporting tool. We have 5 dimension tables in a star schema and 3 fact tables. When I create the worksheet, I bring in our sales measure from the sales item table, then Store_Name from my Stores dimension, and then Day from my Time dimension. Everything looks good at this stage; we're just trying to get a sum of all sales for each store on each day. Then I bring in a measure from our advertising cost table, and a join window pops up asking which join to use. Whether I choose the Store or the Time dimension, I get correct data for the first fact table (sales) and grossly incorrect data for the ad cost measure from the second fact table (advertising costs). Any help would be appreciated.

    You have encountered one of the key limitations of Discoverer, which I complained about to the Discoverer product manager at OpenWorld in 2001...
    Anyhow, to get around this you are going to have to deal with it either in the database (views, materialized views, tables) or within the admin tool by creating a custom folder.
    Discoverer also calls this the "fan trap", but never really had a solution to the problem. (The solution only worked if you joined to one and only one dimension!)
    What you want (using Sales_Fact and Inventory_Fact as an example) is to join Sales to Time, Store, and Product and save that result; then join Inventory to Time, Store, and Product and save that result; then do a double outer join between the two intermediate temporary tables in order to calculate something useful like inventory turns by store and product line.
    This is also known as "multipass SQL", and it is supported by some (but not many) other tools.
    So, to accomplish this with Discoverer, you'll either need to create a view, table, or materialized view that has already combined Sales and Inventory into a single (virtual?) fact table, or write the SQL for this linkage yourself (don't forget to handle missing data) and use the Discoverer admin tool to create a custom folder that uses your SQL.
    Hope this helps!

  • Usage of line item dimension - design or run time?

    Hi,
    Can anyone please tell me at which stage a line-item dimension is considered: at design time, or after data load, once queries are run and performance degrades?
    I have read many posts and blogs about line-item dimensions and high cardinality, but I would like more detail on when a line-item dimension comes into play.
    If we can decide at design time, then how is it done without data being loaded?
    At which instances will the dimension table size exceed the fact table's size?
    Please explain the above 2 points with a clear example, including the DIM ID and SID table access, and the ratio calculation for line item dimension consideration.
    Thanks in advance.

    Hello Aparajitha,
    I agree with Suhas on the point about when to consider an LID. It is good to designate a dimension as a line-item dimension (LID) in the cube during design; it pays off from a performance point of view, since it saves an extra join when reading the relevant data. There is no point in saving the LID option for future use if you have fewer than 13 dimensions in the cube.
    If the total number of dimensions reaches 13 or more during design, then you have no option but to group related characteristic InfoObjects together in one dimension, and in that case you cannot make it an LID.
    During the run phase, if the dimension table is more than 20% of the fact table, then for the sake of performance you have to go for an LID. In that case you will have the overhead of managing data (backup, delete & restore).
    On your specific questions:
    "If we can decide at design time, then how is it done without data being loaded?"
    Technically, the same way as at run time: go to the dimension, right-click, choose Properties, and check Line Item Dimension. Logically, based on the business meaning, you can choose the characteristic with the maximum number of unique values as the LID.
    "At which instances will the dimension table size exceed the fact table's size?"
    Frankly, I haven't come across that... The fact table is the central table and will always be huge in comparison to a dimension table; a dimension table cannot exceed the fact table! But if the size of the dimension table is more than 20% of the fact table (the ratio of dimension table to fact table), then we have to choose between an LID and high cardinality.
    Gurus, please correct me if anything is wrong!
    Regards
    YN
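    The 20% rule of thumb quoted above amounts to a simple ratio check; a sketch (the threshold is the figure from this thread, not an official SAP constant):

```python
# Sketch of the rule of thumb above: compare dimension-table rows to
# fact-table rows and flag candidates for line-item dimension / high
# cardinality treatment. The 20% threshold is the one quoted in the thread.
def is_lid_candidate(dim_rows, fact_rows, threshold=0.20):
    """True when the dimension exceeds `threshold` of the fact table's size."""
    return dim_rows / fact_rows > threshold

print(is_lid_candidate(900_000, 10_000_000))    # False -- 9%, normal dimension
print(is_lid_candidate(9_000_000, 10_000_000))  # True  -- 90%, consider LID
```

    In practice the row counts would come from database statistics on the F/E fact tables and the dimension tables.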

  • Design of infocube

    Hi Experts
    Good Day!
    When I try to execute a query it takes a lot of time, and it is not a complex query either. I followed these steps to improve performance, but it is still not good. My cube contains almost 60 million records.
    1. I created aggregates.
    2. Every day we delete the indexes and recreate them.
    3. When I try partitioning, I get the error message that the property 0FISCVARNT should be constant; when I try to change the property, it is disabled. Can you please tell me how to enable this property?
    Actually, the data is coming from the ODS. I plan to split the data by year (2006, 2007, 2008), create an InfoCube for each year, and finally execute the queries on a MultiCube over the per-year InfoCubes. Is this a good design, or do I need to change design parameters? Can you please guide me on these issues?
    regards
    RK

    Hi Rama,
    Check whether you are using the proper steps for partitioning based on fiscal period:
    http://help.sap.com/saphelp_erp2005/helpdata/en/0a/cd6e3a30aac013e10000000a114084/frameset.htm
    Partitioning an InfoCube with multiple fiscal variants:
    To partition the cube, double-click it; this takes you to the cube definition. Click the Extras button at the top of the screen to get a dropdown where you will see the Partition option. Click it and you will get another window listing all time characteristics; select the one you want to partition on and continue.
    In the next popup window you have to give a range.
    When you activate the cube, the fact table will be created in the database with a number of partitions corresponding to your range.
    While you do this, there should be no data in the cube.
    Thanks
    Assign points if this helps

  • Star schema design

    Hi,
    I know that in the classic star schema the dimension tables sit within the InfoCube, so we cannot use a dimension table in any other cube; we need a separate dimension table for that cube even though it might contain the same data. I also know that, to overcome this redundancy, the extended star schema came into the picture, where we have SID tables, keep the dimension tables outside the cube, and reuse them across many cubes.
    Now what I don't understand is this: instead of having separate SID tables for linking the dimension and fact tables, why can't we make the dimension table generic and keep it outside the InfoCube, so that the same dimension table can serve many InfoCubes? In that case we wouldn't need SID tables.
    Suppose I have one InfoCube with the dimensions vendor, material, and customer, whose key figures are quantity and price, and a separate InfoCube with the dimensions material, customer, and location, whose key figure is something else. Why can't I keep the dimensions outside the InfoCube and use the material and customer dimensions for both InfoCubes?

    Your dimension tables are filled based on your transaction data, which is why dimension table design is so important: you decide how the incoming transaction data is grouped into your dimension tables.
    The dimension tables contain SIDs which in turn point to master data. In the classic star schema the dimension tables are outside the cube but hold the master data within them, which is what the extended star schema overcomes.
    The reason dimension tables cannot simply be reused is that the DIM IDs and SIDs in the dimension table correspond to the transaction data in the cube; unless the DIM IDs in both cubes match, you cannot reuse the dimension tables. And if they did match, you would have exactly the same data in both cubes, in which case you would not need two cubes at all.
    Example :
    Cube 1 fact table:
    Dim1ID | Dim2ID | KF1
    1 | 01 | 100
    2 | 02 | 200
    Dimension table Dim1 (assuming there are 2 characteristics in this dimension; Dim1ID is the key):
    Dim1ID | SID1 | SID2
    1 | 20 | 25
    2 | 30 | 35
    Dimension table Dim2 (Dim2ID is the key):
    Dim2ID | SID1 | SID2
    01 | 30 | 45
    02 | 45 | 40
    The DIM IDs for the cube's fact table are generated at load time from the NRIV table (read up on number ranges). This means you cannot control DIM ID generation across cubes, which in turn means you cannot reuse dimension tables.

  • Dimension table and fact table exists data physically

    Hi experts,
    Can anyone please tell me whether the dimension tables and the fact table physically contain data or not?

    Hi Sudheer,
    SAP's BW is based on the "enhanced star schema" or "InfoCube" database design. This design has a central database table, known as the 'fact table', surrounded by associated dimension tables.
    The fact table is usually very large; it can contain millions to billions of records.
    The dimension tables do not contain master data themselves; they contain references to pointer tables that point to the master data tables, which in turn contain master data objects such as customer, material, and destination country, stored in BW as InfoObjects. An InfoObject can contain single field definitions, such as transaction data, or complex customer master data holding attributes, hierarchies, and texts that are stored in their own tables.
    A SID is a surrogate ID generated by the system. The SID tables are created when we create a master data InfoObject. In the SAP BW star schema, a distinction is made between two self-contained areas: the InfoCube, and the master data/SID tables.
    The master data doesn't reside in the star schema; it resides in separate tables which are shared across all the star schemas in SAP BW. A numeric ID is generated that connects the dimension tables of the InfoCube to the master data tables.
    The dimension tables contain the DIM ID and the SIDs of the relevant InfoObjects. Using the SID, the attributes and texts of a master data InfoObject are accessed. The SID table is connected to the associated master data tables via the characteristic key.
    Fact table (transaction data, DIM IDs) <-> dimension table (SIDs, DIM ID) <-> master data tables (SID, InfoObject)
    Thanks,
    Abha

  • OWB 10.2 Design questions - OLAP objects - worth using? too buggy?

    I am designing a 2 tiered data warehouse. I have a staging schema where numerous staging tables and key mapping tables are kept and a data warehouse schema where I have relationally implemented dimensions and cubes. I want the dw layer to be made up of conformed dimensions and facts as per Kimball.
    For the DW layer, my plan was to create cubes and dimensions which would be implemented as ROLAP. I would then selectively create MOLAP cubes for subsets of the data warehouse.
    My first question - does this much make sense or should I avoid the logical constructs of dimensions and cubes like the plague and simply build star schemas as Oracle tables directly?
    Assuming I can use cubes and dimensions - how do I control the column names for the cube foreign keys? I have one cube with several date dimension foreign keys. Using the cube editor I get nonsensical names like time_dim_key, time_dim_key1, time_dim_key2. I want these to be creation_date_fk, expiration_date_fk, etc so that they are readable. I don't see any way with the cube editor to control this.
    Also, please see my other posts about errors with the role concept on dimensions.
    And lastly, have any of you had success in deploying a large scale DW with OWB using dimensions and cubes?

    So far the verdict is TOO BUGGY. There is a documented bug in metalink:
    DATA COLLECTED
    ===============
    Deployment output
    ATLAS_TIME_DIM
    Create Error
    ORA-06550: line 960, column 95:
    PLS-00123: program too large (Diana nodes)
    ISSUE CLARIFICATION
    ====================
    Dimension deployment (with 5 or more roles) results in a PLS-00123 error.
    ISSUE VERIFICATION
    ===================
    Verified the issue by the deployment output which show the PLS-00123 error.
    CAUSE DETERMINATION
    ====================
    Defect in OWB 10g R2
    CAUSE JUSTIFICATION
    ====================
    Bug 5066108 (CWM2 CODE GEN USES A SINGLE ANONYMOUS PLSQL BLOCK)
    POTENTIAL SOLUTION(S)
    ======================
    Possible workarounds:
    Reduce the number of roles associated with the dimension.
    Use relational vs. multidimensional
    POTENTIAL SOLUTION JUSTIFICATION(S)
    ====================================
    Deployment is successful with fewer roles associated.
    Looking at the recommended solutions it appears that I can:
    1) limit the cube to very few dimensions making it useless for anything major
    2) Use relational - My first interpretation of this was that I could generate the cube but use the "data objects only" deployment. No such luck. When the roles are added to the cube, the designer window locks up when the roles exceed 8 for the time dimension.
    So, the "use relational" workaround seems to imply that cubes and dimensions should be avoided. Anyone have any contrary experiences?
    Specifically has anyone done any of the following successfully with OWB 10.2:
    1. Have you implemented a time dimension with more than 10 roles associated with it?
    2. Have you implemented a cube (relationally or otherwise) with more than 5 roles of a single dimension against it?
    3. Have you implemented a cube with more than 10 dimensions of any kind associated with it?

  • Dimension fact aggregation question

    Hi,
    I am new to Oracle OLAP and I noticed something in this tool. It doesn't aggregate the right way.
    For example, suppose there is a customer A in the Customer dimension, and customer A has 5 records in the fact table with amounts 10, 20, 30, 40, 50. When I build the cube, it doesn't aggregate the amounts for customer A; it picks one of the amounts. I have to aggregate in the view myself and pass the result to the cube.
    Is this how it is in Oracle OLAP, or am I missing something? Please help me with this basic fact/dimension design.
    Thanks in advance.

    Oracle OLAP assumes that you're loading source data at the same dimensionality as the cube is designed at. In your case below, that isn't true: your cube is dimensioned only by customer, which means each customer should have one and only one record.
    There are two ways to resolve this: either add another dimension to the cube that lets you break out the 5 records (maybe a time dimension?), or create a view on the fact table that summarizes the data to the lowest level of your cube.
    Hope this helps,
    Scott
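    Scott's second option, a view that summarizes the fact to the cube's grain before the load, can be sketched with the thread's own numbers:

```python
# Sketch of pre-aggregating fact rows to the cube's dimensionality (one row
# per customer) before the load, using the amounts from the post.
fact_rows = [("A", 10), ("A", 20), ("A", 30), ("A", 40), ("A", 50)]

summarized = {}
for customer, amount in fact_rows:
    # Sum amounts per customer, mirroring a GROUP BY view on the fact table.
    summarized[customer] = summarized.get(customer, 0) + amount

print(summarized)  # {'A': 150}
```

    In the database itself this would simply be a `GROUP BY customer` view used as the cube's source, so each customer maps to exactly one input row.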
