Dimension Table Larger Than Fact Table

Hi,
I need a solution for a situation in my project. The master data for Business Partner is close to 20 million records and growing, while the fact table holds only 1.5 to 2 million records in total.
I am looking for suggestions that can help me design an optimal data model which does not hinder reporting performance. Reporting with 20 million records in a dimension and ten times less data in the fact table is not a common occurrence; I guess this is a scenario the retail industry would have experienced due to its huge customer base.
P.S.: Segregating the dimension is something that will not help, and it is currently a line-item dimension. So any thoughts apart from these will be highly appreciated.
Thanks!
Sajan R

Hi Sajan Rajagopal,
I think you just need to go through the dimensions you have created, so that you can delete unnecessary InfoObjects assigned to a dimension and build dimensions that contain only the required master data.
For example, when we checked the material master we had a huge amount of data, more than yours, so we created different dimensions for material and batch; after that, the loads for those dimensions worked fine.
Please check your dimensions once, so that the required ones are defined correctly.

Similar Messages

  • How can we know that size of dimension is more than fact table?

    How can we know that the size of a dimension table is more than the fact table?
    This was the question asked of me in an interview.

    Hi Reddy,
    This is the common way of estimating the size of a cube, its dimensions, and key figures:
    Each key figure occupies 10 bytes of memory.
    Each characteristic occupies 6 bytes of memory.
    An InfoCube can have at most 256 fields, of which 233 are key figures, 16 are dimensions, and 6 are special characteristics.
    So the maximum record size of a cube
    = 233 (key figures) * 10 + 16 (dimensions) * 6 + 6 (special char.) * 6
    = 2330 + 96 + 36 = 2462 bytes.
    In general, InfoCube size should not exceed 100 GB of data.
    Hope this answers your question.
    Regards,
    Varun

  • How to import PDF image into Pages manuscript that is larger than the text bed margins

    How do I import a PDF image into a Pages book manuscript whose dimensions are larger than the global text bed margins? Any imported PDF image is automatically shrunk to fit within the text bed. I know I can set up a single page with wider margins and a text bed within it, but I want the image to import to a page already within the manuscript sequence, where the text bed margins are set and constant.
    Larry Kettelkamp
    [email protected]

    By default it should be dragging in as a floating image and not part of the text.
    What version of Pages are you using?
    In Pages 5 this is:
    Format > Arrange > Object Placement > Stay on Page > Size > Original size
    Peter

  • Dimension table is larger than the fact table

    Hi Community,
    How can we explain the phenomenon where a dimension table has MORE records in it than the fact table? What are the conditions that would cause this to occur?
    Thank you!
    Keith

    Thanks, Bhanu,
    I am wondering specifically how to explain the output from program SAP_INFOCUBE_DESIGNS when the dimension table is shown to have a fact table ratio that is greater than 100%.
    I believe that SAP_INFOCUBE_DESIGNS already takes both the E and the F fact tables into consideration when calculating the ratio. So in this case, we cannot explain it by your first suggestion (looking only at the F table after compression).
    In the case where selective deletions have been performed, how can we correct the situation? For example, how could we clean out the records in the dimension tables which no longer have any facts in the fact table? (I think the BW system should do this automatically as part of the selective deletion, don't you agree?)
    Also, is there any other explanation for how the dimension table could grow to more than 100% of the size of the fact table(s)?
    For example, let's say that (theoretically) we placed many very dynamic characteristics together in the same dimension, which we know we should not do. Could the combination of these many dynamic characteristics produce so many DIM IDs that the dimension table overtakes the record count of the fact table? And is this situation then made worse by compression, if the number of fact table records is reduced thanks to the removal of the request ID?
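    For what it's worth, here is how I would try to count such orphaned rows directly on the database. A sketch only, for a hypothetical cube ZSALES and its third dimension; the names simply follow the usual BW conventions, with KEY_ZSALES3 as the fact table's key column for dimension 3:
    SELECT COUNT(*) AS orphaned_dim_rows
    FROM   "/BIC/DZSALES3" d
    WHERE  NOT EXISTS (SELECT 1 FROM "/BIC/FZSALES" f WHERE f."KEY_ZSALES3" = d."DIMID")
    AND    NOT EXISTS (SELECT 1 FROM "/BIC/EZSALES" e WHERE e."KEY_ZSALES3" = d."DIMID");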

  • Dimension table greater than Fact table

    Hello,
    By analysing InfoCubes with the SAP report SAP_INFOCUBE_DESIGNS, I encountered a situation where I have more records in a dimension table than in the cube:
    ABI_C0101          /BIC/DABI_C01011    rows:         22    ratio:          0  %
    ABI_C0101          /BIC/DABI_C01012    rows:         27    ratio:          0  %
    ABI_C0101          /BIC/DABI_C01013    rows:  3.558.433    ratio:        229  %
    ABI_C0101          /BIC/DABI_C01014    rows:          1    ratio:          0  %
    ABI_C0101          /BIC/DABI_C01015    rows:     66.440    ratio:          4  %
    ABI_C0101          /BIC/DABI_C01016    rows:     15.383    ratio:          1  %
    ABI_C0101          /BIC/DABI_C01017    rows:      2.533    ratio:          0  %
    ABI_C0101          /BIC/DABI_C01018    rows:          1    ratio:          0  %
    ABI_C0101          /BIC/DABI_C0101P    rows:          2    ratio:          0  %
    ABI_C0101          /BIC/DABI_C0101T    rows:        122    ratio:          0  %
    ABI_C0101          /BIC/EABI_C0101     rows:          0    ratio:          0  %
    ABI_C0101          /BIC/FABI_C0101     rows:  1.551.333    ratio:        100  %
    As the contents of the cube AND the dimensions are deleted before each reload, I'm wondering how this can happen.
    Can somebody help me here?
    Thanks.
    Fred.

    Hi,
    As I said yesterday, I've got the same feeling, and I tested it by manually deleting the cube and dimensions before the load; the dimension is now only 17% of the cube, which is what I expected.
    I will also give some feedback on the report SAP_INFOCUBE_DESIGNS: when I ran it after the manual deletion of the cube, the number of rows in the report didn't change! So I suppose this program reads some statistics and certainly doesn't read the tables directly.
    So the conclusion is indeed that the variant doesn't work correctly. I will check OSS and open a note if required.
    Thanks all for your ideas.
    Fred.

  • Fact and dimension table partition

    My team is implementing a new data warehouse. I would like to know when we should plan the partitioning of the fact and dimension tables: before the data comes in, or after?

    Hi,
    It is recommended to partition the fact table (where we will have huge data volumes). Automate the partitioning so that each day a new partition is created to hold the latest data (split the previous partition in two). Best practice is to partition on transaction timestamps, so load the incremental data into an empty table (Table_IN) and then switch that data into the main table (Table). Make sure both tables (Table and Table_IN) are on the same filegroup.
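    A rough sketch of that daily pattern (names here are hypothetical: Table, Table_IN, pf_daily, ps_daily, fg_current):
    ALTER PARTITION SCHEME ps_daily NEXT USED fg_current;            -- filegroup for the new partition
    ALTER PARTITION FUNCTION pf_daily() SPLIT RANGE ('2012-06-02');  -- carve out an empty partition for today
    -- ...bulk load today's rows into the empty staging table Table_IN
    --    (same filegroup, identical structure and indexes)...
    ALTER TABLE Table_IN SWITCH TO [Table] PARTITION 42;             -- 42 = today's partition number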
    Refer below content for detailed info
    Designing and Administrating Partitions in SQL Server 2012
    A popular method of better managing large and active tables and indexes is the use of partitioning. Partitioning is a feature for segregating I/O workload within a SQL Server database so that I/O can be better balanced against the available I/O subsystems, while providing better user response time, lower I/O latency, and faster backups and recovery. By partitioning tables and indexes across multiple filegroups, data retrieval and management are much quicker because only subsets of the data are used, while the integrity of the database as a whole remains intact.
    Tip
    Partitioning is typically used for administrative or certain I/O performance scenarios. However, partitioning can also speed up some queries by enabling lock escalation to a single partition, rather than to an entire table. You must allow lock escalation to move up to the partition level by setting it either with the Lock Escalation option on the Database Options page in SSMS or with the LOCK_ESCALATION option of the ALTER TABLE statement.
    After a table or index is partitioned, data is stored horizontally across multiple filegroups, so groups of data are mapped to individual partitions. Typical
    scenarios for partitioning include large tables that become very difficult to manage, tables that are suffering performance degradation because of excessive I/O or blocking locks, table-centric maintenance processes that exceed the available time for maintenance,
    and moving historical data from the active portion of a table to a partition with less activity.
    Partitioning tables and indexes warrants a bit of planning before putting them into production. The usual approach to partitioning a table or index follows these steps:
    1. Create the filegroup(s) and file(s) used to hold the partitions defined by the partitioning scheme.
    2. Create a partition function to map the rows of the table or index to specific partitions based on the values in a specified column. A very common partitioning function is based on the creation date of the record.
    3. Create a partitioning scheme to map the partitions of the partitioned table to the specified filegroup(s) and, thereby, to specific locations on the Windows file system.
    4. Create the table or index (or ALTER an existing table or index) by specifying the partition scheme as the storage location for the partitioned object.
    Although Transact-SQL commands are available to perform every step described earlier, the Create Partition Wizard makes the entire process quick and easy through
    an intuitive point-and-click interface. The next section provides an overview of using the Create Partition Wizard in SQL Server 2012, and an example later in this section shows the Transact-SQL commands.
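    For orientation, steps 2 through 4 look roughly like this in Transact-SQL (a sketch only; the object and filegroup names are hypothetical, and the filegroups from step 1 are assumed to exist):
    CREATE PARTITION FUNCTION pf_CreateDate (datetime)                        -- step 2
        AS RANGE RIGHT FOR VALUES ('2012-01-01', '2012-04-01', '2012-07-01');
    GO
    CREATE PARTITION SCHEME ps_CreateDate                                     -- step 3
        AS PARTITION pf_CreateDate TO (fg1, fg2, fg3, fg4);
    GO
    CREATE TABLE SalesHistory (Sale_ID int, CreateDate datetime)              -- step 4
        ON ps_CreateDate (CreateDate);
    GO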
    Leveraging the Create Partition Wizard to Create Table and Index Partitions
    The Create Partition Wizard can be used to divide data in large tables across multiple filegroups to increase performance and can be invoked by right-clicking
    any table or index, selecting Storage, and then selecting Create Partition. The first step is to identify which columns to partition by reviewing all the columns available in the Available Partitioning Columns section located on the Select a Partitioning Column
    dialog box, as displayed in Figure 3.13. This screen also includes additional options such as the following:
    Figure 3.13. Selecting a partitioning column.
    The next screen is called Select a Partition Function. This page is used for specifying the partition function with which the data will be partitioned. The options include using an existing partition function or creating a new one. The subsequent page is called New Partition Scheme. Here a DBA maps the rows of the tables being partitioned to a desired filegroup; either an existing partition scheme can be used or a new one created. The final screen is used for doing the actual mapping. On the Map Partitions page, specify the filegroups to be used for each partition and then enter ranges for the values of the partitions. The ranges and settings on the grid include the following:
    Note
    By opening the Set Boundary Values dialog box, a DBA can set boundary values based on dates (for example, partition everything in a column after a specific
    date). The data types are based on dates.
    Designing table and index partitions is a DBA task that typically requires a joint effort with the database development team. The DBA must have a strong understanding
    of the database, tables, and columns to make the correct choices for partitioning. For more information on partitioning, review Books Online.
    Enhancements to Partitioning in SQL Server 2012
    SQL Server 2012 now supports as many as 15,000 partitions. When using more than 1,000 partitions, Microsoft recommends that the instance of SQL Server have at least 16 GB of available memory. This recommendation particularly applies to partitioned indexes, especially those that are not aligned with the base table or with the clustered index of the table. Other Data Manipulation Language (DML) and Data Definition Language (DDL) statements may also run short of memory when processing a large number of partitions.
    Certain DBCC commands may take longer to execute when processing a large number of partitions. On the other hand, a few DBCC commands can be scoped to the partition
    level and, if so, can be used to perform their function on a subset of data in the partitioned table.
    Queries may also benefit from a query engine enhancement called partition elimination. SQL Server uses partition elimination automatically when it is available. Here's how it works. Assume a table has four partitions, with all the data for customers whose names begin with R, S, or T in the third partition. If a query's WHERE clause filters on customer name looking for 'System%', the query engine knows that it needs to read only partition three to answer the request. Thus, it might greatly reduce I/O for that query. On the other hand, some queries might take longer if there are more than 1,000 partitions and the query cannot perform partition elimination.
    Finally, SQL Server 2012 introduces some changes and improvements to the algorithms used to calculate partitioned index statistics. Primarily, SQL Server 2012 samples rows in a partitioned index when it is created or rebuilt, rather than scanning all available rows. This may sometimes result in somewhat different query behavior compared to the same queries running on earlier versions of SQL Server.
    Administrating Data Using Partition Switching
    Partitioning is useful to access and manage a subset of data while losing none of the integrity of the entire data set. There is one limitation, though. When
    a partition is created on an existing table, new data is added to a specific partition or to the default partition if none is specified. That means the default partition might grow unwieldy if it is left unmanaged. (This concept is similar to how a clustered
    index needs to be rebuilt from time to time to reestablish its fill factor setting.)
    Switching partitions is a fast operation because no physical movement of data takes place. Instead, only the metadata pointers to the physical data are altered.
    You can alter partitions using SQL Server Management Studio or with the ALTER TABLE...SWITCH
    Transact-SQL statement. Both options enable you to ensure partitions are
    well maintained. For example, you can transfer subsets of data between partitions, move tables between partitions, or combine partitions together. Because the ALTER TABLE...SWITCH statement
    does not actually move the data, a few prerequisites must be in place:
    • Partitions must use the same column when switching between two partitions.
    • The source and target table must exist prior to the switch and must be on the same filegroup, along with their corresponding indexes,
    index partitions, and indexed view partitions.
    • The target partition must exist prior to the switch, and it must be empty, whether adding a table to an existing partitioned table
    or moving a partition from one table to another. The same holds true when moving a partitioned table to a nonpartitioned table structure.
    • The source and target tables must have the same columns in identical order with the same names, data types, and data type attributes
    (length, precision, scale, and nullability). Computed columns must have identical syntax, as well as primary key constraints. The tables must also have the same settings for ANSI_NULLS and QUOTED_IDENTIFIER properties.
    Clustered and nonclustered indexes must be identical. ROWGUID properties
    and XML schemas must match. Finally, settings for in-row data storage must also be the same.
    • The source and target tables must have matching nullability on the partitioning column. Although both NULL and NOT
    NULL are supported, NOT
    NULL is strongly recommended.
    Likewise, the ALTER TABLE...SWITCH statement
    will not work under certain circumstances:
    • Full-text indexes, XML indexes, and old-fashioned SQL Server rules are not allowed (though CHECK constraints
    are allowed).
    • Tables in a merge replication scheme are not allowed. Tables in a transactional replication scheme are allowed with special caveats.
    Triggers are allowed on tables but must not fire during the switch.
    • Indexes on the source and target table must reside on the same partition as the tables themselves.
    • Indexed views make partition switching difficult and have a lot of extra rules about how and when they can be switched. Refer to
    the SQL Server Books Online if you want to perform partition switching on tables containing indexed views.
    • Referential integrity can impact the use of partition switching. First, foreign keys on other tables cannot reference the source
    table. If the source table holds the primary key, it cannot have a primary or foreign key relationship with the target table. If the target table holds the foreign key, it cannot have a primary or foreign key relationship with the source table.
    In summary, simple tables can easily accommodate partition switching. The more complexity a source or target table exhibits, the more likely that careful planning
    and extra work will be required to even make partition switching possible, let alone efficient.
    Here’s an example where we create a partitioned table using a previously created partition scheme, called Date_Range_PartScheme1.
    We then create a new, nonpartitioned table identical to the partitioned table residing on the same filegroup. We finish up switching the data from the partitioned table into the nonpartitioned table:
    CREATE TABLE TransactionHistory_Partn1 (Xn_Hst_ID int, Xn_Type char(10))
        ON Date_Range_PartScheme1 (Xn_Hst_ID);
    GO
    CREATE TABLE TransactionHistory_No_Partn (Xn_Hst_ID int, Xn_Type char(10))
        ON main_filegroup;
    GO
    ALTER TABLE TransactionHistory_Partn1 SWITCH PARTITION 1 TO TransactionHistory_No_Partn;
    GO
    The next section shows how to use a more sophisticated, but very popular, approach to partition switching called a sliding
    window partition.
    Example and Best Practices for Managing Sliding Window Partitions
    Assume that our AdventureWorks business is booming. The sales staff, and by extension the AdventureWorks2012 database, is very busy. We noticed over time that
    the TransactionHistory table is very active as sales transactions are first entered and are still very active over their first month in the database. But the older the transactions are, the less activity they see. Consequently, we’d like to automatically group
    transactions into four partitions per year, basically containing one quarter of the year’s data each, in a rolling partitioning. Any transaction older than one year will be purged or archived.
    The answer to a scenario like the preceding one is called a sliding window partition because
    we are constantly loading new data in and sliding old data over, eventually to be purged or archived. Before you begin, you must choose either a LEFT partition function window or a RIGHT partition function window:
    1. How data is handled varies according to the choice of a LEFT or RIGHT partition function window:
    • With a LEFT strategy, partition1 holds the oldest data (Q4 data), partition2 holds data that is 6 to 9 months old (Q3), partition3 holds data that is 3 to 6 months old (Q2), and partition4 holds recent data less than 3 months old.
    • With a RIGHT strategy, partition4 holds the oldest data (Q4), partition3 holds Q3 data, partition2 holds Q2 data, and partition1 holds recent data.
    • Following best practice, make sure there are empty partitions on both the leading edge (partition0) and trailing edge (partition5) of the partition range.
    • RIGHT range functions usually make more sense to most people because it is natural to start ranges at their lowest value and work upward from there.
    2. Assuming that a RIGHT partition function window is used, we first use the SPLIT subclause of the ALTER PARTITION FUNCTION statement to split empty partition5 into two empty partitions, 5 and 6.
    3. We use the SWITCH subclause of ALTER TABLE to switch out partition4 to a staging table for archiving, or simply to drop and purge the data. Partition4 is now empty.
    4. We can then use MERGE to combine the empty partitions 4 and 5, so that we're back to the same number of partitions as when we started. This way, partition3 becomes the new partition4, partition2 becomes the new partition3, and partition1 becomes the new partition2.
    5. We can use SWITCH to push the new quarter's data into the spot of partition1. (A Transact-SQL sketch of steps 2 to 4 follows below.)
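    As a rough sketch, steps 2 through 4 map to three statements (the names and boundary dates are hypothetical: pf_Quarterly, ps_Quarterly, fg_staging, TransactionHistory_Archive):
    ALTER PARTITION SCHEME ps_Quarterly NEXT USED fg_staging;                         -- filegroup for the partition created by SPLIT
    ALTER PARTITION FUNCTION pf_Quarterly() SPLIT RANGE ('2013-01-01');               -- step 2: split the empty edge partition
    ALTER TABLE TransactionHistory SWITCH PARTITION 4 TO TransactionHistory_Archive;  -- step 3: switch out the oldest quarter
    ALTER PARTITION FUNCTION pf_Quarterly() MERGE RANGE ('2012-01-01');               -- step 4: merge the two now-empty partitions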
    Tip
    Use the $PARTITION system
    function to determine where a partition function places values within a range of partitions.
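    For example (using the hypothetical pf_Quarterly function from the sketch above):
    SELECT $PARTITION.pf_Quarterly('2012-05-15') AS partition_number;  -- which partition holds this date?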
    Some best practices to consider for using a sliding window partition include the following:
    • Load newest data into a heap, and then add indexes after the load is finished. Delete oldest data or, when working with very large
    data sets, drop the partition with the oldest data.
    • Keep an empty staging partition at the leftmost and rightmost ends of the partition range to ensure that partition splits (when loading new data) and partition merges (after unloading old data) do not cause data movement.
    • Do not split or merge a partition already populated with data because this can cause severe locking and explosive log growth.
    • Create the load staging table in the same filegroup as the partition you are loading.
    • Create the unload staging table in the same filegroup as the partition you are deleting.
    • Don’t load a partition until its range boundary is met. For example, don’t create and load a partition meant to hold data that is
    one to two months older before the current data has aged one month. Instead, continue to allow the latest partition to accumulate data until the data is ready for a new, full partition.
    • Unload one partition at a time.
    • The ALTER TABLE...SWITCH statement
    issues a schema lock on the entire table. Keep this in mind if regular transactional activity is still going on while a table is being partitioned.
    Thanks, Shiven :) If the answer is helpful, please vote.

  • Multiple Fact Tables and Dimension Tables

    I have been having some problems trying to model the data from Oracle E-Business Suite maintenance. I will try to give the best description of how the data is held in the tables. The structure is such that a work order can have multiple operations, and an operation can have multiple resources as well. I believe the problem comes from the fact that an operation doesn't necessarily need to have a resource. I could not attach an image, so I have written out an example below. I am not saying this is right or that it works, but just to give you an idea of what I am thinking. The full dimension would be Organization -> WorkOrder -> Operation -> Resource. Now, the fact tables all hold factual data for the three different levels, with the facts being at each corresponding level. This causes an obvious problem in combining the tables into one large fact table through the ETL process.
    Can anyone tell me if they think this can be done? Am I way off? I am sure that there is a solution as there always is but I have been killing myself trying to figure this one out. I currently have the entire solution in different Business Models. I would like however to be able to compare facts from multiple areas such as the Work Order level and the Resource level.
    Any help is greatly appreciated. I realize that the solution may also require additional work on the ETL side so I am open to any and all suggestions.
    Thank you in advance for anyone's time. :)
    Dimension Tables
    WorkOrder_D
    Operation_D
    Resource_D
    Organization_D
    Fact Tables
    WorkOrder_F
    Operation_F
    Resource_F
    Joins
    WorkOrder_D -> Operation_D
    Operation_D -> Resource_D
    WorkOrder_D -> WorkOrder_F
    Operation_D -> Operation_F
    Resource_D -> Resource_F
    Organization_D -> WorkOrder_D
    Organization_D -> Operation_D
    Organization_D -> Resource_D

    Hi,
    Currently the dimension table is modeled as a simple logical table in the RPD, as it does not have any levels or hierarchy.
    It is a flat dimension. Can you guide me on how to implement a flat dimension in OBIEE? Because this dimension is taken as a simple logical table,
    I am not able to set the appropriate level for the fact tables, and this dimension does not appear in the list of dimensions.

  • [39008] Logical dimension table has a source that does not join to any fact

    Dear reader,
    After deleting a fact table from my physical layer and deleting it from my business model, I'm getting an error: [39008] Logical dimension table TABLE X has a source TABLE X that does not join to any fact source. I do have another fact table in the same physical model and in the same business model which is joined to TABLE X, both in the physical and the business model.
    I cannot figure out why I'm getting this error; even after deleting all the joins and rebuilding them I'm still getting it. When I look into the Joins Manager, these joins exist in both the physical and the logical model, but the consistency check still warns me about [39008]. When I ignore the warning, go to Answers, and try to show TABLE X (not the fact, but the dim), it gives me the following error.
    Odbc driver returned an error (SQLExecDirectW).
    Error Details
    Error Codes: OPR4ONWY:U9IM8TAC:OI2DL65P
    State: HY000. Code: 10058. [NQODBC] [SQL_STATE: HY000] [nQSError: 10058] A general error has occurred. [nQSError: 14026] Unable to navigate requested expression: TABLE X.column X Please fix the metadata consistency warnings. (HY000)
    SQL Issued: SELECT TABLE X.column X saw_0 FROM subject area ORDER BY saw_0
    There is one "special" thing about this join. It is a complex join in the physical layer, because I need to do a BETWEEN on dates and a less-than-or-equal comparison with the current date, like this example: dim.date between fact.date_begin and fact.date_end and dim.date <= current_date. In the business model I've got another complex join.
    Any help is kindly appreciated!

    Hi,
    Have you specified the content level of the fact table and mapped it with the dimension in question? Ideally this should be done by default, since one of the main features of Oracle BI is its ability to determine which source to use, and specifying the content level is one of the main ways to achieve this.
    Another quick thing you might try is creating a dimension (hierarchy) in case one is not already present. I had a similar issue a few days back and the warning was miraculously gone after doing this.
    Regards

  • Problem with populating a fact table from dimension tables

    My aim: there are 5 dimension tables that have been created:
    Student->s_id primary key,upn(unique pupil no),name
    Grade->g_id primary key,grade,exam_level,values
    Subject->sb_id primary key,subjectid,subname
    School->sc_id primary key,schoolno,school_name
    year->y_id primary key,year(like 2008)
    s_id,g_id,sb_id,sc_id,y_id are sequences
    select * from student;
    S_ID UPN FNAME COMMONNAME GENDER DOB
    ==============================
    9062 1027 MELISSA ANNE       f  13-OCT-81
    9000 rows selected
    select * from grade;
          G_ID GRADE      E_LEVEL         VALUE
            73 A          a                 120
            74 B          a                 100
            75 C          a                  80
            76 D          a                  60
            77 E          a                  40
            78 F          a                  20
            79 U          a                   0
            80 X          a                   0
    18 rows selected
    These are basically the dimensional views.
    Now, according to the specification given, I need to create a fact table (facts_table) which contains all the dimension tables' primary keys as foreign keys.
    The problem is this (I'll consider a smaller example than the actual five dimension tables): say there are two dimension tables, student and grade, with s_id and g_id as primary keys.
    create materialized view facts_table(s_id,g_id)
    as
    select s.s_id, g.g_id
    from   (select distinct s_id from student) s
    ,      (select distinct g_id from grade) g;
    This results in massive duplication, as there is no join between the two tables. But there are basically no common columns between the two tables to join on, so how do I solve this?
    Consider the amount of duplication involved when I do it for 5 tables; that's why there is not enough tablespace.
    I was hoping, if there is no other way, to create a fact table with just one column initially:
    create materialized view facts_table(s_id)
    as
    select s_id
    from   student;
    then
    alter materialized view facts_table add column g_id number;
    and then populate this g_id column by fetching all the g_id values from the grade table using some sort of loop, even though we should not use PL/SQL. I don't know if this works?
    Any suggestions.

    Basically, you're quite right to say that without any logical common columns between the dimension tables, it will produce a result saying every student studied every subject and got every grade, which is rubbish.
    I am confused as to whether the dimension tables can contain duplicated columns, i.e. a column like upn (unique pupil no) that I also copy into another table so that a join can be made when writing queries. I don't know whether that's right.
    These are the required queries from the star schema
    Design a conformed star schema which will support the following queries:
    a. For each year, give the actual number of students entered for A-levels in the whole country / in each LEA / in each school.
    b. For each A-level subject, and for each year, give the percentage of students who gained each grade.
    c. For the most recent 3 years, show the 5 most popular A-level subjects in that year over the whole country (measure popularity as the number of entries for that subject as a percentage of the total number of exam entries).
    I wrote the queries earlier based on dimension tables which were highly duplicated; they were like:
    student
    =======
    upn
    school
    school
    ======
    school(this column substr gives lea,school and the whole is country)
    id(id of school)
    student_group
    =============
    upn(unique pupil no)
    gid(group id)
    grade
    year_col
    ========
    year
    sid(subject id)
    gid(group id)
    exam_level
    id(school id)
    grades_list
    ===========
    exam_level
    grade
    value
    subject
    ========
    sid
    subject
    compulsory
    These were the dimension tables I created earlier, and as you can see many columns are duplicated in other tables (like upn). This structure effectively gets the data out of the schema, as there are common columns on which we can join.
    But a colleague suggested that these dimension tables are wrong, that they should not be this way and should not contain duplicated columns.
    select distinct count(s.upn) as st_count
    ,      y.year
    ,      c.sn
    from   student_info s
    ,      student_group sg
    ,      year_col y
    ,      subject sb
    ,      grades_list g
    ,      country c
    where  s.upn = sg.upn
    and    sb.sid = y.sid
    and    sg.gid = y.gid
    and    c.id = y.id
    and    c.id = s.school
    and    y.exam_level = g.exam_level
    and    g.exam_level = 'a'
    group by y.year, c.sn
    order by y.year;
    This is the code for the first query.
    I am confused now as to which structure is right: the earlier dimension tables which I am describing here, or the new dimension tables which I explained above.
    If what I am describing now is right, I mean if the dimension tables and the columns are all right, then I just need to create a fact table with the foreign keys of all the dimension tables.
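    For what it's worth, if a source or staging table still carries the business keys, the fact table can be built by joining through it instead of cross-joining the dimensions. A sketch only; exam_results and its column names are hypothetical:
    create materialized view facts_table as
    select s.s_id, g.g_id, sb.sb_id, sc.sc_id, y.y_id
    from   exam_results r
    join   student s  on s.upn        = r.upn
    join   grade   g  on g.grade      = r.grade and g.e_level = r.exam_level
    join   subject sb on sb.subjectid = r.subjectid
    join   school  sc on sc.schoolno  = r.schoolno
    join   year    y  on y.year       = r.year;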

  • DWH: how do you analyze fact and dimension tables

    Hi,
    I was wondering how you analyze your fact and dimension tables.
    Our fact table is partitioned per month. Each partition contains 4M rows and is 270 MB large. We are using 9 dimensions, 6 have about 50'000 rows (2MB), 1 about 1M rows (50MB) and 2 about 3M rows (200MB). All tables are compressed. The version of Oracle we are using is 9.2.0.5.
    What I was wondering is how you would analyze the fact and dimension tables using dbms_stats. What sample percentage would you use? On which columns would you build histograms?

    Nope, but I could copy-paste the URL, or I could copy-paste the entire thread from the other forum. I did the one that made more sense to me.
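    For reference, a gather call of the kind being asked about might look like this (a sketch only, using 9.2-era DBMS_STATS; the owner and table names are hypothetical and the sample percentage is illustrative):
    begin
      dbms_stats.gather_table_stats(
        ownname          => 'DWH',
        tabname          => 'SALES_FACT',
        estimate_percent => 5,                                    -- sample the large fact table
        granularity      => 'PARTITION',                          -- per-partition statistics
        method_opt       => 'FOR ALL INDEXED COLUMNS SIZE AUTO',  -- histograms where useful
        cascade          => true);                                -- analyze the indexes too
    end;
    /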

  • Fact table is GT Dimension table

    Hi All
    Where can I find the relation between fact table and dimension table sizes? If I want to check whether my dimension table is greater than the fact table, how do I do that?
    I tried to use the program SAP_INFOCUBE_DESIGNS, but when I try to execute it, it does not work.
    regards
    Naga

    I don't think SAP_INFOCUBE_DESIGNS gives you the size of the fact table; rather, it gives the density and the row count of the cube. See this thread:
    "density" in SAP_INFOCUBE_DESIGNS
    and check the answer by pizzaman in this thread detailing the size of the fact table:
    Fact table size of the cube

  • Dimension table and fact table exists data physically

    Hi experts,
    Can anyone please tell me whether dimension tables and fact tables hold data physically or not?

    Hi Sudheer,
    SAP BW is based on the "enhanced star schema" or "InfoCube" database design. This design has a central database table, known as the fact table, which is surrounded by associated dimension tables.
    The fact table is usually very large; it can contain millions to billions of records.
    The dimension tables don't contain master data themselves; they contain references to pointer tables that point to the master data tables, which in turn contain master data objects such as customer, material, and destination country, stored in BW as InfoObjects. An InfoObject can contain single field definitions such as transaction data, or complex customer master data that holds attributes, hierarchies, and texts stored in their own tables.
    A SID is a surrogate ID generated by the system. The SID tables are created when we create a master data InfoObject. In the SAP BW star schema, a distinction is made between two self-contained areas: the InfoCube, and the master data/SID tables.
    The master data doesn't reside in the star schema; it resides in separate tables which are shared across all the star schemas in SAP BW. A numeric ID is generated which connects the dimension tables of the InfoCube to the master data tables.
    The dimension tables contain the DIM ID and the SIDs of particular InfoObjects. Using the SID, the attributes and texts of a master data InfoObject are accessed.
    The SID table is connected to the associated master data tables via the characteristic key.
    Fact table (transaction data, DIM IDs) <-> dimension table (DIM ID and SIDs) <-> master data tables (SID, InfoObject)
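    For illustration, that chain can be written as a SQL join. A sketch only, with hypothetical names that merely follow the BW conventions (/BIC/F* fact, /BIC/D* dimension, /BIC/S* SID, /BIC/P* master data attributes):
    SELECT p."/BIC/ZMAT", f."/BIC/ZQTY"
    FROM   "/BIC/FZSALES"  f
    JOIN   "/BIC/DZSALES1" d ON d."DIMID" = f."KEY_ZSALES1"     -- fact -> dimension via DIM ID
    JOIN   "/BIC/SZMAT"    s ON s."SID"   = d."SID_ZMAT"        -- dimension -> SID table
    JOIN   "/BIC/PZMAT"    p ON p."/BIC/ZMAT" = s."/BIC/ZMAT";  -- SID -> master data attributes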
    Thanks,
    Abha

  • Fact and dimension table relationship?

    hi
    In SE38 I executed the program SAP_INFOCUBE_DESIGNS
    and got all the cubes and their percentages. This is directly the fact-to-dimension table relationship; based on this I need to decide whether to use a line-item dimension or high cardinality (dimension > 20% of the fact table: line item; dimension between 10% and 20%: high cardinality).
    regards
    suneel.

    hi,
    Line item has to be chosen in such a way as to control the dimension table size for characteristics that have a large number of almost unique records.
    Line-item dimension tables will not be shown by this program.
    Ramesh

  • Large dimension tables

    I am trying to get a grip on how a dimension table can have more records than the fact table, since the key for the dimension table is the DIMID. Can someone give me a practical example? Thanks

    Thanks for this; let's work with the Doc No example.
    1) I have a cube with 4 dimensions; the fourth dimension contains material no, doc no, and plant. The key for this dimension table is the DIMID (system generated?), and each record in the dimension contains the corresponding SID values for mat no, doc no, and plant.
    2) My first load of data into the cube: 1000 records.
    3) For each of these 1000 records a DIMID is generated in dimension 4, and the corresponding values of the 3 SIDs are retained as part of these records.
    4) My next load is a delta of 200 lines: 150 new records and 50 updated records. In this case 150 new DIMIDs are generated (with the 3 SIDs retained), whereas the 50 updated records do not have new DIMIDs generated.
    Is the scenario as I described it correct?

  • #datasync error - could that come from merging on fields from fact tables instead of dimensions tables?

    I'm reporting out of two universes published by Epic
    1. Warehouse - Patient
    2. Warehouse - Transactions
    I wanted to join two queries based on the primary key for a hospital encounter. Following the tutorial seemed pretty straightforward until I got to displaying data from a merged query.
    The table displayed results from Query1, but adding fields from Query2 wiped out all the data in the table, leaving only #datasync in each field.
    My workaround to get fields from both queries displayed (see screenshot)
    1. Merge queries on *two* fields - primary key of hospital encounter and primary key of patient
    2. Create new variable
    3. Make variable type Detail
    4. Associate variable with hospital encounter key from Query1
    5. Set formula equal to a field from Query2
    I'm not sure why this workaround works or if what I'm experiencing is a symptom of something larger. Could this workaround be needed because I am merging on fields from a fact table instead of fields from a dimension table?
    Thanks in advance

    You have defined a merged dimension on 'Company Code - Key', which is common to both BEx queries.
    You are able to bring the merged dimension and other objects from Query One into the report block without any issue, but when adding objects from Query Two you get the error #DATASYNC.
    In this case the objects from Query One sync with the merged dimension without issue because they were added to it first.
    Similarly, if you added the merged dimension and objects from Query Two first, you would find no issue, because then the objects from Query Two would sync first.
    Once objects from one query (Query One or Query Two) have been added to a merged dimension, adding objects from the other query gives the #DATASYNC error. This is because data in the other query cannot sync with the initial result set; this is a known behavior.
    There are two workarounds:
    1) Merge all common dimension/characteristic objects. Only the merged dimensions' data will sync with the initial query; un-merged dimension/characteristic objects will still give the #DATASYNC error.
    2) Create detail/attribute objects at report level for all uncommon characteristic/dimension objects from Query Two, referring to the merged dimension. Then add these newly created detail/attribute objects to the table block that holds the initial query's objects with the merged dimension; with this you will see the result set of Query Two in the table block, not the #DATASYNC error.
    ~ Manoj
