Identify high volume characteristics in dimension tables

Hi,
We have identified some cubes where the dimension table to fact table ratio is more than 20%.
The next step is to find the characteristics in those dimension tables that we can either turn into line item dimensions or move to a new dimension. Is there any function module or table which can identify such characteristics for a particular cube?
Regards,
Shital

Hi,
LISTSCHEMA won't give you the size of the tables, nor the option to analyze the dimensions in detail.
Similarly, SAP_INFOCUBE_DESIGNS won't give you the detail of what's inside a dimension; it just gives you the size of the dimension.
The only way (as far as I know) to do it is as I suggested in earlier posts: give your dimension table to your DBA. It will take them two minutes to find out. A sketch of what they would run is below.
Please correct me if I am wrong.
-RMP
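For illustration, a minimal sketch of the check the DBA would run, assuming an Oracle database and hypothetical table/column names (/BIC/DCUBENAME2 and the SID columns stand in for your actual dimension table): a characteristic whose distinct count is close to the table's row count is what blows the dimension up, and is a candidate for a line item dimension or a dimension of its own.

-- Compare total rows against per-characteristic cardinality.
SELECT COUNT(*)                        AS total_rows,
       COUNT(DISTINCT "SID_0MATERIAL") AS distinct_material,
       COUNT(DISTINCT "SID_0CUSTOMER") AS distinct_customer
FROM   "/BIC/DCUBENAME2";
-- If distinct_material is close to total_rows, that characteristic
-- drives the dimension's cardinality.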

Similar Messages

  • How to identify fact and dimension tables

    Hi ,
    We have the list of parent-child relation info for each database table. Based on the parent-child relations, we need to identify which table is a fact and which is a dimension. Could you please help me identify the fact tables and the dimension tables?
    Thanks in advance

    Hi,
    Refer to this link:
    http://www.oraclebidwh.com/2007/12/fact-dimension-tables-in-obiee/
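    As a rough heuristic (an assumption on my part, not from the link): a fact table sits on the child side of all its relationships, i.e. it references many parents and is rarely referenced itself. If the parent-child relations are declared as foreign keys in Oracle, a sketch against the data dictionary:

    -- Tables with many outgoing FKs and no incoming FKs are fact candidates;
    -- tables that are referenced but reference little are dimension candidates.
    SELECT c.table_name,
           COUNT(*) AS outgoing_fks
    FROM   user_constraints c
    WHERE  c.constraint_type = 'R'
    AND    NOT EXISTS (SELECT 1
                       FROM   user_constraints p
                       JOIN   user_constraints r
                              ON r.r_constraint_name = p.constraint_name
                       WHERE  p.table_name      = c.table_name
                       AND    p.constraint_type IN ('P', 'U')
                       AND    r.constraint_type = 'R')
    GROUP BY c.table_name
    ORDER BY outgoing_fks DESC;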

  • Identify the cubes where dimension table size is exactly 100% of the fact table

    Hello,
    We want to identify the cubes where the dimension table size is exactly 100% of the fact table.
    Is there any table or standard query which can give me this data?
    Regards,
    Shital

    Use report SAP_INFOCUBE_DESIGNS (run it via SE38).
    M.

  • Drawbacks of big volume in Dimension Tables

    Experts,
    Below are the counts for our DIM and FACT tables.
    Can somebody please help me understand the implications of a big volume of data in dimension tables versus fact tables?
    What are the drawbacks, and how can DIM tables with a big volume of data hurt in data warehousing?
    Also, in the near future we plan to design cubes on top of these DIM and FACT tables.
    Your thoughts/responses are much appreciated.
    Thanks in advance. - KG, MCTS

    Hi gk1393, 
    Incremental loading is fundamental when facing the volumes you are showing. But besides the ETL considerations, you may want to rethink the design of this data warehouse.
    Are these dimensions actually dimensions? In some businesses dimensions are very big, but it is not that common to find many dimensions bigger than the facts (at least not in a mature data warehouse).
    Building dimensions on top of these big tables presents some performance problems, both when processing and when querying the cube. Disabling attributes for processing, removing extra text attributes that are not used in aggregations, and reducing data type sizes (if you can store the data in a smallint, don't use a bigint, for example) are useful techniques to mitigate these drawbacks; see the sketch below.
    But again, I'd review the data warehouse design itself to check whether these cardinalities make sense for your business needs.
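    A minimal sketch of the data type check mentioned above, in SQL Server syntax (assuming a Microsoft BI stack; the table and column names are hypothetical):

    -- Check the actual value range before narrowing a key column's type.
    SELECT MIN(product_key)            AS min_key,
           MAX(product_key)            AS max_key,
           COUNT(DISTINCT product_key) AS distinct_keys
    FROM   dbo.DimProduct;
    -- SMALLINT holds -32,768..32,767; if max_key fits, narrow the column:
    -- ALTER TABLE dbo.DimProduct ALTER COLUMN product_key SMALLINT NOT NULL;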
    Regards.
    Pau

  • How to Maintain Surrogate Key Mapping (cross-reference) for Dimension Tables

    Hi,
    What would be the best approach in ODI to implement the surrogate key mapping table in the STG layer according to Kimball's technique:
    "Surrogate key mapping tables are designed to map natural keys from the disparate source systems to their master data warehouse surrogate key. Mapping tables are an efficient way to maintain surrogate keys in your data warehouse. These compact tables are designed for high-speed processing. Mapping tables contain only the most current value of a surrogate key— used to populate a dimension—and the natural key from the source system. Since the same dimension can have many sources, a mapping table contains a natural key column for each of its sources.
    Mapping tables can be equally effective if they are stored in a database or on the file system. The advantage of using a database for mapping tables is that you can utilize the database sequence generator to create new surrogate keys. And also, when indexed properly, mapping tables in a database are very efficient during key value lookups."
    We have a requirement to implement cross-reference mapping tables with natural and surrogate keys for each dimension table. These mapping tables will be populated automatically (inserts only) during the E-LT execution, right after inserting into the dimension table.
    Does anyone have an idea how to implement this in ODI?
    Thanks,
    Danilo

    Hi,
    First of all, please avoid bolding whole passages. That said, according to Kimball (if I remember well) this is a 1:1 mapping, so no surrogate key is involved.
    Personally, I would use a lookup table:
    http://www.odigurus.com/2012/02/lookup-transformation-using-odi.html
    or make a simple outer join filtering on your "Active_Flag" column (remember that this filter needs to be inside your outer join).
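    For what it's worth, a minimal sketch of the Kimball-style mapping table itself, assuming an Oracle staging schema (all object names here are hypothetical); ODI would populate it in an insert-only step right after the dimension load:

    -- One row per (source system, natural key), holding the current surrogate key.
    CREATE TABLE stg_customer_key_map (
      customer_sk  NUMBER        NOT NULL,  -- surrogate key used in dim_customer
      src_system   VARCHAR2(10)  NOT NULL,  -- source system identifier
      natural_key  VARCHAR2(50)  NOT NULL,  -- natural key in that source
      CONSTRAINT pk_cust_key_map PRIMARY KEY (src_system, natural_key)
    );

    -- Insert-only refresh: add mappings for dimension rows not yet in the map.
    INSERT INTO stg_customer_key_map (customer_sk, src_system, natural_key)
    SELECT d.customer_sk, d.src_system, d.natural_key
    FROM   dim_customer d
    WHERE  NOT EXISTS (SELECT 1
                       FROM   stg_customer_key_map m
                       WHERE  m.src_system  = d.src_system
                       AND    m.natural_key = d.natural_key);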
    Let us know
    Francesco

  • Tool to export and import high volume data from/to Oracle and MS Excel

    We are using certain reports (developed in XLS and CSV) that extract 500K to more than 1M records in a single report. Around 1,000 reports are generated daily. The business users review those reports and apply certain rules to identify exceptions, then they apply the corrections back to the system through an XL upload.
    The XL reports are developed in TIBCO BW and deployed on the AMX platform. The user interface runs on TIBCO GI.
    Database version: Oracle 11.2.0.3.0 (RAC, 2 nodes)
    Input on the following points would be a great help:
    1) Recommendations for handling such high-volume reports, and a mechanism to apply bulk corrections back to the system?
    2) Suggestions for any Oracle tool or third-party tool

    If you install the Oracle client software on the PC where Excel is installed, you can use ODBC so that Excel connects directly to the DB and issues SQL.
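    For the bulk-correction side, one common pattern (a sketch only; the directory, file, table and column names are hypothetical) is to expose the users' correction file as an Oracle external table and apply it in one set-based statement instead of row by row:

    CREATE DIRECTORY corrections_dir AS '/data/corrections';

    -- External table over the CSV produced from the users' Excel corrections.
    CREATE TABLE corrections_ext (
      record_id NUMBER,
      new_value VARCHAR2(100)
    )
    ORGANIZATION EXTERNAL (
      TYPE ORACLE_LOADER
      DEFAULT DIRECTORY corrections_dir
      ACCESS PARAMETERS (
        RECORDS DELIMITED BY NEWLINE
        FIELDS TERMINATED BY ','
      )
      LOCATION ('corrections.csv')
    );

    -- Apply all corrections at once.
    MERGE INTO target_table t
    USING corrections_ext c
    ON (t.record_id = c.record_id)
    WHEN MATCHED THEN
      UPDATE SET t.some_value = c.new_value;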

  • Should my dimension table be this big

    I'm in the process of building my first product dimension for a star schema and am not sure if I'm doing this correctly. A brief explanation of our setup:
    We have products (dresses) made by designers for specific collections, each of which has a range of colors, and each color can come in many sizes. For our UK market this equates to some 1.9 million product variants. Flattening the structure out to product + designer + collection gives about 33,000 records, but adding all the colors and then all the color sizes pumps that figure up to 1.9 million. My "rolled my own" incremental ETL load runs OK just now, but we are expanding into the US market and our projections indicate that our product variants could multiply tenfold. I'm a bit worried about the performance of the ETL now and in the future.
    Isn't 1.9m records an awful lot to incrementally load (and analyze) for a dimension table nightly as it is, never mind what that figure may grow to when we go to the US?
    I thought of somehow reducing this by using a snowflake, but wouldn't that just reduce the number of columns in the dimensions and not the row count?
    I then thought of separating the colors and sizes into their own dimension, but this doesn't seem right as they are attributes of products, and I would also lose the relationship between product, size and color, i.e. I would have to go through the fact table (which I've read is not a good choice) for any analysis.
    Am I correct in thinking these are big numbers for a dimension table? Is it even possible to reduce the number somehow?
    Still learning, so I welcome any help.

    Hi Plip71,
    In my opinion, it is always good to reduce the dimension volume as much as possible for better performance.
    Is there any hierarchy in your product dimension? Going for a snowflake for this problem is a bad idea.
    Solution 1:
    From the details given, it is good to split colour and size out as separate dimensions. This will reduce the volume of the dimension and increase the column count in the fact (a separate WID has to be maintained in the fact table), but it will improve the performance of the cube. Before doing this, please check the layout requirements with the business.
    Solution 2:
    Check the distinct count of item variants actually used in the fact table. If it is very low, try a leaner product dimension: create a view over the product dimension that does an inner join with the fact table, so that only the used dimension members are loaded into the cube's product dimension (see the sketch below). Volume is reduced, with an improvement in the performance and stability of the cube.
    Sorry for the delayed reply ;)
    Anand
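    A minimal sketch of the Solution 2 view, with hypothetical table and column names:

    -- Restrict the product dimension to members that actually appear in the fact.
    CREATE VIEW dim_product_used AS
    SELECT d.*
    FROM   dim_product d
    WHERE  EXISTS (SELECT 1
                   FROM   fact_sales f
                   WHERE  f.product_wid = d.product_wid);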

  • Consistency Warning - [39008] Dimension Table not joined to Fact Source

    I have a schema in which I have the following tables:
    A) Patient Transaction Fact Table (i.e. supplies used, procedures performed, etc.)
    B) Demographic Dimension table (houses info like patient location code)
    C) Location Dimension table (tells me what Hospital each unique Location maps to)
    So table A is the fact, and table B is a dimension table joined to table A based on Patient ID, so I can get general info on the patient. This would allow me to apply logic to just see patient transactions where the patient was FEMALE, or was in the Emergency Room, by applying conditions to these fields in table B.
    Table C is a simple lookup table joined to table B by Location Code, so I can identify which hospital's emergency room the patient was located in for instance.
    So the schema is: A<---B<---C, where B and C are both dimension tables.
    The query works as desired, but my consistency check gives me the following WARNING:
    *[39008] Logical dimension table LOCATION MASTER D has a source LOCATION MASTER D that does not join to any fact source.*
    How do I resolve this WARNING, or at least suppress it?

    Hi,
    What you need to do is add the (physical) location dimension table to the logical table source of the demographic dimension, for example by dragging it from the physical layer onto the logical table source of the demographic logical dimension table in the BMM layer.
    Regards,
    Stijn

  • What is '#Distinct values' in Index on dimension table

    Gurus!
    I have loaded my BW Quality system (master data and transaction data) with almost the same volume as Production.
    I am comparing the sizes of dimension and fact tables of one of the cubes in Quality and PROD.
    I am taking one of the dimension tables into consideration here.
    Quality:
    /BIC/DCUBENAME2 Volume of records: 4,286,259
    Index /BIC/ECUBENAME~050 on the E fact table /BIC/ECUBENAME for this dimension key KEY_CUBENAME2 shows #Distinct values as  4,286,259
    Prod:
    /BIC/DCUBENAME2 Volume of records: 5,817,463
    Index /BIC/ECUBENAME~050 on the E fact table /BIC/ECUBENAME for this dimension key KEY_CUBENAME2 shows #Distinct values as 937,844
    I would like to know why the distinct value differs from the dimension table count in PROD.
    I am getting this information from the SQL execution plan, by clicking on the /BIC/ECUBENAME table in the code. This screen gives me all the details about the fact table volumes, indexes etc.
    The index and statistics on the cube are up to date.
    Quality:
    E fact table:
    Table   /BIC/ECUBENAME                    
    Last statistics date                  03.11.2008
    Analyze Method               9,767,732 Rows
    Number of rows                         9,767,732
    Number of blocks allocated         136,596
    Number of empty blocks              0
    Average space                            0
    Chain count                                0
    Average row length                      95
    Partitioned                                  YES
    NONUNIQUE  Index   /BIC/ECUBENAME~P:
    Column Name                     #Distinct                                       
    KEY_CUBENAMEP                                  1
    KEY_CUBENAMET                                  7
    KEY_CUBENAMEU                                  1
    KEY_CUBENAME1                            148,647
    KEY_CUBENAME2                          4,286,259
    KEY_CUBENAME3                                  6
    KEY_CUBENAME4                                322
    KEY_CUBENAME5                          1,891,706
    KEY_CUBENAME6                            254,668
    KEY_CUBENAME7                                  5
    KEY_CUBENAME8                              9,430
    KEY_CUBENAME9                                122
    KEY_CUBENAMEA                                 10
    KEY_CUBENAMEB                                  6
    KEY_CUBENAMEC                              1,224
    KEY_CUBENAMED                                328
    Prod:
    Table   /BIC/ECUBENAME
    Last statistics date                  13.11.2008
    Analyze Method                      1,379,086 Rows
    Number of rows                       13,790,860
    Number of blocks allocated       187,880
    Number of empty blocks            0
    Average space                          0
    Chain count                              0
    Average row length                    92
    Partitioned                               YES
    NONUNIQUE Index /BIC/ECUBENAME~P:
    Column Name                     #Distinct                                                      
    KEY_CUBENAMEP                                  1
    KEY_CUBENAMET                                 10
    KEY_CUBENAMEU                                  1
    KEY_CUBENAME1                            123,319
    KEY_CUBENAME2                            937,844
    KEY_CUBENAME3                                  6
    KEY_CUBENAME4                                363
    KEY_CUBENAME5                            691,303
    KEY_CUBENAME6                            226,470
    KEY_CUBENAME7                                  5
    KEY_CUBENAME8                              8,835
    KEY_CUBENAME9                                124
    KEY_CUBENAMEA                                 14
    KEY_CUBENAMEB                                  6
    KEY_CUBENAMEC                                295
    KEY_CUBENAMED                                381

    Arun,
    The cubes in QA and PROD are compressed. Index building and statistics are also up to date.
    But I am not sure what other jobs BASIS runs against this cube in production.
    Is there any other tcode/function module etc. which can give information about the #distinct values of this index or dimension table?
    One basic question: as the DIM key is the primary key of the dimension table, there can't be duplicates.
    So how can the index on the fact table for this dimension key show fewer #distinct values than there are entries in the dimension table?
    Shouldn't the entries in the dimension table match exactly the #Distinct entries shown in index /BIC/ECUBENAME~P for this DIM key?
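    Note the "Analyze Method" rows above: Quality analyzed all 9,767,732 rows, while PROD analyzed only 1,379,086 of 13,790,860 (roughly a 10% sample), so the PROD #Distinct figure may well be an optimizer estimate rather than an exact count. To get exact figures independent of the statistics, you could run the counts directly in SQL (a sketch; the generic /BIC/ names stand in for your actual cube tables):

    -- Rows in the dimension table.
    SELECT COUNT(*) FROM "/BIC/DCUBENAME2";
    -- Distinct dimension keys actually referenced by the E fact table.
    SELECT COUNT(DISTINCT "KEY_CUBENAME2") FROM "/BIC/ECUBENAME";
    -- If the second count is genuinely lower, some dimension rows are simply
    -- not (or not yet) referenced by the compressed E fact table.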

  • Regarding Dimension Table and Fact table

    Hello,
    I have some basic doubts regarding the star schema.
    Let me first explain my understanding of the star schema:
    the fact table contains key figures and DIM IDs, and these DIM IDs connect to my dimension tables. The dimension table contains characteristics and these DIM IDs.
    Now my basic doubts:
    1. How is the DIM ID linked to the SID tables?
    2. If I have not maintained any master data, texts or hierarchies, will the SID tables be generated or not?
    3. If they are generated, what is the use of the SID, as we have not maintained master data?
    4. I have 18 characteristics which are in no way related to each other. In that scenario, how should the dimensions be identified? Should we include all the characteristics in one dimension, or create a separate dimension for each of them? (The maximum is 13 dimensions.)
    5. If the dimension table contains DIM IDs and characteristics, where are the values for the characteristics stored?
    (For example, sales rep is a characteristic; we will give it values, i.e. names. Where are these values stored?)

    Hi Vasu,
    E.g. we have an InfoCube with
    - dimension 'location' -> characteristics 'sales rep', 'country'
    - dimension 'partner'.
    Fact table:
    dim-id('location')  dim-id('partner')  revenue
    1001                9001               500
    1002                9002               300
    1003                9004               200
    Dimension table 'location':
    dim-id  sid-id(sales rep)  sid-id(country)
    1001    3001               5001
    1002    3004               5004
    1003    3005               5001
    'sales rep' SID table:
    sid   sales rep
    3001  abc
    3004  pqr
    3005  xyz
    'country' SID table:
    sid   country
    5001  country1
    5004  country2
    So from the link between dim-id and sid-id we get the
    "sales rep report":
    sales rep  revenue
    abc        500
    pqr        300
    xyz        200
    "country report":
    country    revenue
    country1   700
    country2   300
    Hope it's clear.
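    In SQL terms, the "sales rep report" above comes from a join across all three levels. A sketch with hypothetical table names (in a real BW system these would be the /BIC/F*, /BIC/D* and /BIC/S* tables):

    SELECT s.sales_rep,
           SUM(f.revenue) AS revenue
    FROM   fact_table f
    JOIN   dim_location  d ON d.dim_id = f.dim_id_location
    JOIN   sid_sales_rep s ON s.sid    = d.sid_sales_rep
    GROUP  BY s.sales_rep;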

  • Dimension Table populating data

    Hi
    I am in the process of creating a data mart with a star schema.
    The star schema has been defined with the fact and dimension tables and the primary and foreign keys.
    I have written the script for one of the dimensions and would like to know: when the job runs on a daily basis, should it truncate and rebuild the dimension table every day, or should it only add new records to the table? If it should only add new records, how is this done?
    I assume that the fact table job runs once a day and only new data is added to it?
    Thanks

    It will depend on the volume of your dimensions. In most of our projects we do not truncate: we update only changed rows, detected via a fingerprint (to make the comparison faster than column by column), and insert the new rows (SCD1). For SCD2 we apply a similar approach for updates and inserts, and do the expirations in batch (one UPDATE for all applicable rows at the end of the package/ETL). See the sketch below.
    If your dimension is very large, you can consider truncating all data, or deleting only the affected modified rows (based on the business key) and reloading those, but you have to be careful to keep the same surrogate keys referenced by your existing facts.
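    A minimal sketch of the SCD1 pattern described above, in T-SQL (the poster's stack is Microsoft BI; the table and column names are hypothetical, and row_hash stands for the fingerprint):

    -- Update changed rows (fingerprint mismatch), insert brand-new ones.
    MERGE INTO dbo.DimCustomer AS d
    USING stg.Customer AS s
    ON d.business_key = s.business_key
    WHEN MATCHED AND d.row_hash <> s.row_hash THEN
      UPDATE SET d.name     = s.name,
                 d.city     = s.city,
                 d.row_hash = s.row_hash
    WHEN NOT MATCHED THEN
      INSERT (business_key, name, city, row_hash)
      VALUES (s.business_key, s.name, s.city, s.row_hash);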
    HTH,
    Alan Koo | "Microsoft Business Intelligence and more..."
    http://www.alankoo.com

  • How and when does a dimension table gets generated

    Hi Gurus,
    I am new to BI and will be put onto a project within 2 months. I have learned that the dimension table contains the SIDs of all the characteristics in that dimension. My conclusions are:
    1. The dimension table contains the DIM ID as the primary key.
    2. The dimension table contains the SIDs of the characteristics.
    3. Though the SIDs in the dimension table are primary keys in their 'S' tables, they are not keys in the dimension table.
    My questions are:
    1. Is there any chance of generating new DIM IDs for the same combination of SIDs, since the SIDs are not part of the key?
    2. I am confused about when and how the dimension table gets generated.
    I have searched the forum and Google but my doubts still aren't clarified. If anyone could throw some light on this topic I would really appreciate it.

    Hi,
    All your conclusions are correct.
    Now for your questions, the answers are inline:
    1. Is there any chance of generating new DIM IDs for the same combination of SIDs, since the SIDs are not part of the key?
        No new DIM IDs will be generated; the DIM ID is unique for a given combination of SIDs.
    2. When and how does the dimension table get generated?
        They are generated when you activate the InfoProvider.
    Hope this helps.
    Thanks,
    Rahul

  • None of the dimension tables are compatible with the query request

    Hi,
    I am experiencing the below error when querying columns alone from the employee dimension (w_employee_d) in the Workforce Profile SA. There is only one column in my report, employee number, coming from the employee dimension. When I query other information like job, region, location etc. I do not get any error; the error below appears only when querying columns from the employee dimension. The content tab level for the LTS of the employee dimension is set to employee detail.
         View Display Error
    Odbc driver returned an error (SQLExecDirectW).
    Error Details
    Error Codes: OPR4ONWY:U9IM8TAC:OI2DL65P
    State: HY000. Code: 10058. [NQODBC] [SQL_STATE: HY000] [nQSError: 10058] A general error has occurred. [nQSError: 43113] Message returned from OBIS. [nQSError: 43119] Query Failed: [nQSError: 14077] None of the dimension tables are compatible with the query request Dim - Employee.Employee Number. (HY000)
    SQL Issued: SELECT 0 s_0, "Human Resources - Workforce Profile"."Employee Attributes"."Employee Number" s_1 FROM "Human Resources - Workforce Profile" FETCH FIRST 65001 ROWS ONLY.
    I can't figure out the exact reason. Any suggestions would be highly appreciated.
    Regards.

    Hi user582149,
    It is difficult to answer your question with so little detail. Could you specify:
    - how many facts/dimensions are you using in the query?
    - what is the structure of your Business Model?
    - which version of OBI are you using?
    - what does your log say?
    I hope to be able to tell you more once I have the information above.
    Cheers

  • Problem with populating a fact table from dimension tables

    My aim: there are 5 dimension tables that have been created.
    Student -> s_id primary key, upn (unique pupil no), name
    Grade -> g_id primary key, grade, exam_level, values
    Subject -> sb_id primary key, subjectid, subname
    School -> sc_id primary key, schoolno, school_name
    Year -> y_id primary key, year (like 2008)
    s_id, g_id, sb_id, sc_id and y_id are populated from sequences.
    select * from student;
    S_ID UPN  FNAME   COMMONNAME GENDER DOB
    ==== ==== ======= ========== ====== =========
    9062 1027 MELISSA ANNE       f      13-OCT-81
    9000 rows selected.
    select * from grade;
          G_ID GRADE      E_LEVEL         VALUE
            73 A          a                 120
            74 B          a                 100
            75 C          a                  80
            76 D          a                  60
            77 E          a                  40
            78 F          a                  20
            79 U          a                   0
            80 X          a                   0
    18 rows selected.
    These are basically the dimension views.
    Now, according to the specification given, I need to create a fact table, facts_table, which contains all the dimension tables' primary keys as foreign keys.
    The problem: taking a smaller example than the actual 5 dimension tables, let's say there are 2 dimension tables, student and grade, with s_id and g_id as primary keys.
    create materialized view facts_table(s_id, g_id)
    as
    select s.s_id, g.g_id
    from   (select distinct s_id from student) s,
           (select distinct g_id from grade) g;
    This results in massive duplication, as there is no join between the two tables. But there are no common columns between the two tables to join on; how do I solve this?
    Consider the amount of duplication involved when I do it for 5 tables; that is why there is not enough tablespace.
    If there is no other way, I was hoping to create a fact table with just one column initially:
    create materialized view facts_table(s_id)
    as
    select s_id
    from student;
    then
    alter materialized view facts_table add column g_id number;
    and then populate this g_id column by fetching all the g_id values from the grade table using some sort of loop, even though we should not use PL/SQL. I don't know if this works?
    Any suggestions.

    Basically you're quite right to say that without any logical common columns between the dimension tables, it will produce results saying that every student studied every subject and got every grade, which is rubbish.
    I am confused as to whether the dimension tables can contain duplicated columns, i.e. a column like upn (unique pupil no) that I also copy into another table so that a join can be placed when writing queries. I don't know whether that's right.
    These are the required queries from the star schema
    Design a conformed star schema which will support the following queries:
    a. For each year, give the actual number of students entered for A-level in the whole country / in each LEA / in each school.
    b. For each A-level subject, and for each year, give the percentage of students who gained each grade.
    c. For the most recent 3 years, show the 5 most popular A-level subjects in that year over the whole country (measure popularity as the number of entries for that subject as a percentage of the total number of exam entries).
    I wrote the queries earlier based on dimension tables which had a lot of duplicated columns. They were like:
    student
    =======
    upn
    school
    school
    ======
    school (a substr of this column gives the LEA and the school; the whole value is the country)
    id(id of school)
    student_group
    =============
    upn(unique pupil no)
    gid(group id)
    grade
    year_col
    ========
    year
    sid(subject id)
    gid(group id)
    exam_level
    id(school id)
    grades_list
    ===========
    exam_level
    grade
    value
    subject
    ========
    sid
    subject
    compulsory
    These were the dimension tables I created earlier, and as you can see many columns are duplicated across tables (like upn). This structure effectively gets the data out of the schema, as there are common columns on which we can link.
    But a colleague suggested that these dimension tables are wrong, that they should not be this way and should not contain duplicated columns.
    select distinct count(s.upn) as st_count
    ,      y.year
    ,      c.sn
    from   student_info s
    ,      student_group sg
    ,      year_col y
    ,      subject sb
    ,      grades_list g
    ,      country c
    where  s.upn = sg.upn
    and    sb.sid = y.sid
    and    sg.gid = y.gid
    and    c.id = y.id
    and    c.id = s.school
    and    y.exam_lev = g.exam_level
    and    g.exam_level = 'a'
    group by y.year, c.sn
    order by y.year;
    This is the code for the first query.
    I am now confused about which structure is right: the earlier dimension tables I am describing here, or the new dimension tables I explained above.
    If what I am describing now is right, i.e. the dimension tables and their columns are all right, then I just need to create a fact table with the foreign keys of all the dimension tables.
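    For what it's worth, a sketch of how the fact rows would normally be produced: not by cross-joining the dimensions, but by starting from the source data that carries the natural keys (the student_group rows described above) and looking up each dimension's surrogate key. The join conditions below are assumptions pieced together from the columns listed in this thread:

    create materialized view facts_table as
    select s.s_id,
           g.g_id,
           sb.sb_id,
           y.y_id
    from   student_group sg                    -- source rows: upn, gid, grade
    join   student  s  on s.upn     = sg.upn
    join   year_col y  on y.gid     = sg.gid
    join   subject  sb on sb.sid    = y.sid
    join   grade    g  on g.grade   = sg.grade
                      and g.e_level = y.exam_level;
    -- The school key (sc_id) would be looked up the same way, via the
    -- school id carried in year_col.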

  • What kind of throughput should I expect? Anyone using AQ in high volume?

    Hi,
    I am working with AQ in a 10.2 environment and have been doing some testing. What I have is a very simple queue with one queue table. The payload structure is:
    id            number
    message       varchar(256)
    message_date  date
    I have not done anything special with storage parameters etc., so it's all default at this point. I then created a stored procedure that generates messages given the message text and a number of times to loop (roughly as sketched below). When I run this procedure with 10,000 iterations it runs in 15 seconds if I commit all messages at the end, and 24 seconds if I commit after each message (probably more realistic).
    Now, on the same database I have a plain table that contains one column (message varchar(256)). I have also created a similar stored procedure to insert into it. For this, 10,000 inserts take about 1 second.
    As you can see, there is an order of magnitude of difference, so I am looking to see if others have been able to achieve higher throughput than 500-700 messages per second, and if so, what was done to achieve it.
    Thanks in advance,
    Bill
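    For reference, roughly the kind of test harness being described, sketched with hypothetical queue/type names (the DBMS_AQ calls are the standard ones; the rest is an assumption):

    CREATE OR REPLACE PROCEDURE enqueue_test (p_text       IN VARCHAR2,
                                              p_iterations IN PLS_INTEGER) AS
      l_enq_opts  DBMS_AQ.ENQUEUE_OPTIONS_T;
      l_msg_props DBMS_AQ.MESSAGE_PROPERTIES_T;
      l_msgid     RAW(16);
      -- test_msg_type: assumed object type (id NUMBER, message VARCHAR2(256),
      -- message_date DATE) matching the queue table's payload.
      l_payload   test_msg_type;
    BEGIN
      FOR i IN 1 .. p_iterations LOOP
        l_payload := test_msg_type(i, p_text, SYSDATE);
        DBMS_AQ.ENQUEUE(queue_name         => 'test_queue',
                        enqueue_options    => l_enq_opts,
                        message_properties => l_msg_props,
                        payload            => l_payload,
                        msgid              => l_msgid);
      END LOOP;
      COMMIT;  -- single commit at the end (the 15-second variant)
    END enqueue_test;
    /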

    Yes, I have seen it. My testing so far hasn't even gotten to the point of concurrent enqueue/dequeue; so far I have focused on enqueue time, and it is dramatically slower than a plain old database table. That link also discussed the multiple index-organized tables created behind the scenes. I'm guessing that the 15x factor I am seeing is because of the 4 underlying tables, plus the fact that they are index-organized, which adds additional overhead.
    So my question remains: is anyone using AQ for high-volume processing? I suppose I could create a bunch of queues; however, that would create additional management on my side, which is what I was trying to avoid by using AQ in the first place.
    Can one queue be served by multiple queue tables? Can queue tables be partitioned? I would like to minimize the number of queues so that the dequeue processes don't have to contain multiplexed logic.
    Thanks
