Data Warehouse Tablespace Design

Hi Folks,
I'm working on a new DW project at my company and I'm thinking about which tablespace model to use.
Here is what I'm thinking:
Each schema should have *5 distinct tablespaces* (Data, Indexes, Data Stage, Index Stage, Materialized Views) for easier management.
1) Is it a good tablespace model for a DW?
2) The current model in another DW here is one tablespace for each table (don't ask me why), so I have a lot of mixed small, medium, and large tablespaces. From a database management point of view it is chaos, but it makes I/O load balancing easy. Does my proposed model make that more difficult?
My question is: is my model better than the current model?
...advice, previous experience, and hints will be helpful.
Jonathan Ferreira
http://oracle4dbas.blogspot.com
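
For illustration, a minimal sketch of the proposed five-tablespace layout (tablespace names, sizes, and the use of Oracle-managed files are all hypothetical):

-- assumes DB_CREATE_FILE_DEST is set (Oracle-managed files)
CREATE TABLESPACE sales_data     DATAFILE SIZE 10G AUTOEXTEND ON;
CREATE TABLESPACE sales_idx      DATAFILE SIZE 5G  AUTOEXTEND ON;
CREATE TABLESPACE sales_stg_data DATAFILE SIZE 5G  AUTOEXTEND ON;
CREATE TABLESPACE sales_stg_idx  DATAFILE SIZE 2G  AUTOEXTEND ON;
CREATE TABLESPACE sales_mv       DATAFILE SIZE 5G  AUTOEXTEND ON;

-- objects are then placed explicitly, e.g.:
CREATE TABLE sales.orders (
  order_id NUMBER,
  CONSTRAINT orders_pk PRIMARY KEY (order_id) USING INDEX TABLESPACE sales_idx
) TABLESPACE sales_data;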

This looks to be a [duplicate of your other thread|http://forums.oracle.com/forums/thread.jspa?threadID=713439&tstart=0] that I replied to 20 minutes ago.
Justin

Similar Messages

  • Tablespaces and block size in Data Warehouse

    We are preparing to implement a data warehouse on Oracle 11g R2, and I am currently trying to work out a storage strategy - unfortunately I have very little experience with that. The question is: what is the general advice on tablespaces and block size? I did some research, and it is hard to find a clear answer; some resources advise that block size is not important and can be left small (8 KB), while others state that it is crucial and should be the largest possible (64 KB). The other question is which data should be placed where. Many resources state that keeping indexes apart from their data is a myth and a bad practice that may even decrease performance; others say that although there is no performance benefit, index tablespaces do not need to be backed up, and that is why they should be split out. The next idea is to have separate tablespaces for big tables, small tables, and tables accessed frequently and infrequently. How should I organize partitions in terms of tablespaces? Is it a good idea to have "old" (read-only) data partitions in separate tablespaces?
    Any help highly appreciated, and thank you in advance.
    Any help highly appreciated and thank you in advance.

    Wojtus-J wrote:
    We are preparing to implement a data warehouse on Oracle 11g R2, and I am currently trying to work out a storage strategy - unfortunately I have very little experience with that. With little experience, the key feature is to avoid big mistakes - don't try to get too clever.
    The question is: what is the general advice on tablespaces and block size? If you need to ask about block sizes, use the default (i.e. 8KB).
    I did some research, and it is hard to find a clear answer. But if you get contradictory advice from this forum, how would you decide which bits to follow?
    A couple of sensible guidelines when researching on the internet - look for material that is datestamped with recent dates (last couple of years), or references recent - or at least relevant - versions of Oracle. Give preference to material that explains WHY an idea might be relevant, give greater preference to material that DEMONSTRATES why an idea might be relevant. Check that any explanations and demonstrations are relevant to your planned setup.
    The other question is which data should be placed where. Many resources state that keeping indexes apart from their data is a myth and a bad practice that may even decrease performance; others say that although there is no performance benefit, index tablespaces do not need to be backed up, and that is why they should be split out. The next idea is to have separate tablespaces for big tables, small tables, and tables accessed frequently and infrequently. How should I organize partitions in terms of tablespaces? Is it a good idea to have "old" (read-only) data partitions in separate tablespaces?
    It is often convenient, and sometimes very important, to separate data into different tablespaces based on some aspect of functionality. The performance thing was mooted (badly) in an era when discs were small and (disk) partitions were hard; but all your other examples of why to split are potentially valid for administrative reasons: big/small, table/index, old/new, read-only/read-write, fact/dimension, etc.
    For data warehouses a fairly common practice is to identify some sort of aging pattern for the data, and try to pick a boundary that allows you to partition the data so that a large fraction of it can eventually be made read-only: using tablespaces to mark time boundaries can be a great convenience - note that the tablespace boundary need not match the partition boundary, e.g. daily partitions in a monthly tablespace. If you take this type of approach, you might have a "working" tablespace for recent data, and then copy the older data to a "time-specific" tablespace, packing it and making it read-only as you do so.
    Tablespaces are (broadly speaking) about strategy, not performance. (Temporary tablespaces / tablespace groups are probably the exception to this thought.)
    Regards
    Jonathan Lewis
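
    To make the time-boundary idea concrete, here is a minimal sketch (all object names hypothetical; assumes a range-partitioned fact table with a local index): an aged partition is moved into a month-specific tablespace, its local index partition is rebuilt (the move marks it UNUSABLE), and the tablespace is then made read-only.

    ALTER TABLE sales MOVE PARTITION sales_2009_01 TABLESPACE ts_2009_01;
    ALTER INDEX sales_loc_idx REBUILD PARTITION sales_2009_01 TABLESPACE ts_2009_01;
    ALTER TABLESPACE ts_2009_01 READ ONLY;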

  • Architectural Design - New Data Warehouse

    Hello All,
    This is my first post to the Oracle discussion forums, and I'm looking forward to interacting with other OWB users.
    I am just beginning to implement a design for a new data warehouse. Our team has already defined user requirements for a subset of the business (Sales/Marketing) and has committed a logical model to paper. We have installed our dev environment and are now ready to begin the work of creating our prototype.
    I've read all the Oracle doc I can get my hands on regarding implementing your DW objects and have been pondering the approach. ROLAP or MOLAP.....
    It seems to make sense that we should deploy into a ROLAP environment, bringing in all our data from our staging area to create a stable relational data store, and then select the most-used or most-queried dimensions and facts to deploy in a MOLAP environment. Has anyone used this approach? Any lessons learned? Do you have to choose one method or the other, or can you take a blended approach? Would you deploy both in the same database instance or separate the two?
    thx

    I'm somewhat new to OWB coming from an Informatica background but in our environment, we are doing the same thing. Our Enterprise Data Warehouse will be based on ROLAP and I intend to use MOLAP for subsets of the EDW.
    Dimensions in Oracle are somewhat interesting in that they are "leveled" and you can tie cubes or "fact tables" to any level of the dimension. This is a bit un-Kimball-like and has taken some getting used to. I think it is a powerful feature but I will have to experiment some until I understand it better.
    One critical bug I've run into with 10.2 is with dimension roles - the time dimension, for instance. Typically this is one table that is aliased many times. If you exceed roughly 5 roles for the time dimension, the generation of the object fails, since OWB generates a single anonymous PL/SQL block that exceeds 64k. It's a documented bug in development with no workaround, according to Metalink.
    Other gotchas are that table changes always try to generate "create table" scripts even if you only add an index or change parallelism. We have had to do table maintenance outside OWB and then keep the metadata in sync up until now.
    I haven't done any of the MOLAP yet, but from what I read there are some restrictions - such as you can't have roles on dimensions for MOLAP, and I believe you can't have SCDs in MOLAP. I don't know how time dimensions are handled in MOLAP without roles! Do people really generate tables for every single time dimension in OWB?
    Hope you share your experiences here!
    - Mike Taylor

  • Data warehouse backups and read only tablespaces

    Hi all,
    I am working on a data warehouse database with following specs:
    Version: Oracle 10.2.0.3 Enterprise
    OS: Solaris
    App: Data warehouse
    We use RMAN to take 'level 0' & 'level 1' backups. We have block change tracking enabled, and RMAN backs up data files and archive logs straight to tape.
    I am exploring ways of reducing the 'level 0' backups and was specifically focussing on using read-only tablespaces for this purpose.
    I have often seen it mentioned that a best practice in D/Ws is to store the old static partitions of fact tables in read only tablespaces so as to reduce the backup size.
    In case you have already implemented such a scheme, I would like to know how you have implemented it.
    I am thinking of the following mechanism:
    -- Start using backups at tablespace level rather than 'level 0' at database level.
    -- Record the latest SCNs of all datafiles prior to back up.
    -- If the latest SCN has not changed since last backup and the tablespace is in read only mode then
    -- Check if a backup copy of the tablespace has been done within the recovery window and is accessible.
    -- If the copy exists then don't backup that tablespace, else backup the tablespace.
    -- If the tablespace is read/write then back it up.
    I haven't delved into the low-level details, but this seems to be a lot of work. So I just wanted to know from you if there's any ready-made feature which makes all this easier.
    Many thanks in advance.

    Thank you so much for your help.
    Backup optimization was indeed the thing I was looking for. To be honest I had done a bit of RTFM, but I didn't check the advanced user guide.
    Although my specific question has been answered, it would be interesting to know what other things other people are implementing to reduce backups etc.
    I am also thinking of following options:
    -- Turn on index monitoring to get rid of unused indexes.
    -- Stop the backups of 'index' tablespaces.
    -- Archive off old data.
    Any other ideas for reducing DW backup size?
    Many thanks.
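
    For reference, the ready-made feature mentioned above is RMAN backup optimization; a minimal sketch follows (the tablespace name is hypothetical). Once a read-only tablespace has enough backups to satisfy the retention policy, subsequent backups skip its datafiles automatically; SKIP READONLY skips them unconditionally.

    SQL> ALTER TABLESPACE hist_2006 READ ONLY;

    RMAN> CONFIGURE RETENTION POLICY TO RECOVERY WINDOW OF 30 DAYS;
    RMAN> CONFIGURE BACKUP OPTIMIZATION ON;
    RMAN> BACKUP INCREMENTAL LEVEL 0 DATABASE;   -- read-only files with sufficient backups are skipped
    RMAN> BACKUP DATABASE SKIP READONLY;         -- or skip read-only files unconditionally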

  • Designer Vs. Oracle Data warehouse builder

    Dear all,
    Currently I'm responsible for building a data warehousing project using the Oracle database. I'm trying to decide on a tool for modelling my data warehouse. I have two options:
    1) Designer: we have some experience with this tool and we are using it for our main OLTP application.
    2) Oracle Data Warehouse builder: we are using this to design our ETL processes.
    I want to get some advice on whether OWB is capable of modelling my data warehouse and of doing a retrofit action. Also, I am trying to standardize the tools used in the Data Warehouse department (currently we are using only OWB).
    I would appreciate any other advice to help in my selection process.
    Best Regards,
    Bilal

    Hi,
    In my experience this choice depends on the implementation of the datawarehouse. If you are building a "pure" Kimball style dimensional data warehouse you should be able to do this using OWB. I have architected such a DW in the past using only OWB, so I am speaking from experience.
    If on the other hand you are planning to implement an Inmon-style CIF, if your requirements include an operational data store (ODS), or if you for any other reason anticipate that you are going to be doing a lot of ER modeling, then I would not recommend using the current release of OWB for modelling. (Note however that there are significant improvements to the modelling capabilities in the Paris release of OWB, so this may change in the future.)
    The advantage of improved maintainability when using a single tool needs to be weighed against the improved functionality if you choose a combination of the two. In the "two tool" scenario, strict development and deployment routines need to be enforced to prevent the model in Designer getting out of sync with the metadata in OWB. (Consider the effect of a developer making a change to a table definition in OWB and deploying it directly to the database without updating the model in Designer.)
    Hope this helps.
    Regards,
    Roald

  • Design the data warehouse around the reporting system?

    Hi All,
    A Jr. data warehouse developer resisted my suggestion to flatten out activity tables of differing grains into a single fact table. (Think sales order header, sales order detail, and even a 3rd level of details to each sales order detail.) Although he agreed that flattening out the fact tables into a single fact would be proper for a data warehouse, he's concerned that report developers will have an easier time querying the data warehouse with the 3 separate fact tables. I'm not sure if it's because the report developers don't like learning new schemas or if their reporting tool is just severely limited, mainly because I've never used Cognos. I assured him that a properly-designed data warehouse will save on query execution time, but he's concerned about the reporting tool and how it may not work so well with the data warehouse.
    Did I give him the proper advice?  It seems like a data warehouse should be built properly regardless of reporting tool shortcomings.  Assuming this tool is lousy, maybe they need a new reporting system for their new data warehouse.
    Thanks,
    Eric

    Hi Eric,
    one of the hard and fast rules of building a data warehouse is that from a logical point of view the fact table presents data at a certain level of granularity and that you do not mix facts in fact tables. This is data warehousing 101.
    From your comment you seem to be suggesting mixing data of different granularity in the one table.
    Now, we have ways and means of co-habiting data that will appear as different fact tables in the one physical table. We control the physical placement of data in fact tables. But on SQL Server we would never mix facts at different granularities, or representing different data, in the one fact table. SQL Server supports that quite poorly.
    It is sad that in 2015 people are still messing up data warehouse projects from pure ignorance of what is available. We have data warehouse data models that are extremely extensive, but people just have to start from scratch, reinvent the wheel, and fail over and over again. Sad but true.
    Best Regards 
    Peter Nolan
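
    To illustrate the grain rule above, here is a minimal Oracle-flavored sketch (all names hypothetical): one fact table per grain, tied together by the order number as a degenerate dimension, rather than one flattened table mixing header and line facts.

    CREATE TABLE fact_order_header (
      order_number   NUMBER NOT NULL,   -- degenerate dimension shared across grains
      customer_key   NUMBER NOT NULL,   -- FK to customer dimension
      order_date_key NUMBER NOT NULL,   -- FK to date dimension
      header_amount  NUMBER(12,2)
    );

    CREATE TABLE fact_order_line (
      order_number   NUMBER NOT NULL,   -- same degenerate dimension
      line_number    NUMBER NOT NULL,
      product_key    NUMBER NOT NULL,   -- FK to product dimension
      quantity       NUMBER,
      line_amount    NUMBER(12,2)
    );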

  • Could you please recommend a book or two about data warehouse designing?

    I want to read some books about data warehouses and how to build them, and how to deal with the problems that arise during the data warehousing process.
    Could anyone recommend any books on this?
    I would like the books to focus mainly on the common scenarios in the data warehouse area and the general solutions to those scenarios.
    I am quite new in this area, so any recommendation would be highly appreciated.
    Thanks.

    Perhaps also these resources, if you've not already seen them
    DW Best Practices Whitepaper
    http://www.oracle.com/technetwork/database/features/bi-datawarehousing/twp-dw-best-practies-11g11-2008-09-132076.pdf
    Greg Rahn on the core performance fundamentals of Oracle data warehousing
    http://structureddata.org/2009/12/14/the-core-performance-fundamentals-of-oracle-data-warehousing-introduction/

  • How do I design this in data warehouse?

    I am working on building a data warehouse for insurance quote data.
    Each quote will have an applicant and can have an optional co-applicant. Each applicant and co-applicant will have prior auto insurance history, prior home insurance history, current auto insurance information and current home insurance information.
    So do I create Applicant and Insurance dimensions here?
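
    One conventional shape for this, as a hedged sketch only (all names hypothetical): a single Applicant dimension role-played for applicant and co-applicant, referenced twice from a quote fact, with the co-applicant key nullable since a co-applicant is optional.

    CREATE TABLE dim_applicant (
      applicant_key NUMBER PRIMARY KEY,
      full_name     VARCHAR2(100)
      -- prior/current auto and home insurance attributes could live here,
      -- or in a separate insurance-history dimension, depending on the grain needed
    );

    CREATE TABLE fact_quote (
      quote_key        NUMBER PRIMARY KEY,
      applicant_key    NUMBER NOT NULL REFERENCES dim_applicant,
      co_applicant_key NUMBER REFERENCES dim_applicant,  -- NULL when there is no co-applicant
      quote_date_key   NUMBER NOT NULL,
      quoted_premium   NUMBER(12,2)
    );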

    Hi Ashan,
    Just so you know.
    I completely reworked our methodology of building data warehouses back in 2012. The new way of building data warehouses is quite different from the old way - the way you listed.
    The methodology presentation is on this link.
    https://www.youtube.com/watch?v=Df4CgOtrFq8
    Video channels are here. http://www.instantbi.com/videos/
    Downloads are here: http://www.instantbi.com/company/downloads/
    I have been doing BI since '91, and what we have done now is industry leading.
    I am on MSDN, so we do our development on Microsoft first and then deploy wherever our clients want us to deploy.
    Best Regards 
    Peter Nolan

  • Best practice of metadata table in data warehouse environment ?

    Hi guru's,
    In our data warehouse we have 1. a stage schema and 2. a DWH (data warehouse reporting) schema. In staging we have about 300 source tables. In the DWH schema we create only the tables that are required from a reporting perspective. Some of the tables in the staging schema have also been created in the DWH schema, with different table names and column names. The naming convention for these tables and columns in the DWH schema is based more on business names.
    In order to keep track of these tables we are creating a metadata table in the DWH schema, for example:
    Stage        DWH_schema
    Table_1      Table_A
    Table_2      Table_B
    Table_3      Table_C
    Table_4      Table_D
    My question is how we handle the column names in each of these tables. The stage_1, stage_2, and stage_3 column names have been renamed in DWH_schema, where they are part of Table_A, Table_B, and Table_C.
    As said earlier, we have about 300 tables in stage and maybe around 200 tables in the DWH schema. A lot of the column names have been renamed in the DWH schema from the stage tables, and some of the tables have 200 columns.
    So my concern is: how do we handle the column names in the metadata table? Do we need to keep only table names in the metadata table, not column names?
    Any ideas will be greatly appreciated.
    Thanks!
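
    One common approach, sketched below with hypothetical names: keep the mapping at column level in a single metadata table; the table-level mapping can then be derived from it, so you do not need a separate list of tables.

    CREATE TABLE meta_column_map (
      stage_table  VARCHAR2(30) NOT NULL,
      stage_column VARCHAR2(30) NOT NULL,
      dwh_table    VARCHAR2(30) NOT NULL,
      dwh_column   VARCHAR2(30) NOT NULL,
      CONSTRAINT meta_column_map_pk PRIMARY KEY (stage_table, stage_column)
    );

    -- the table-to-table mapping falls out of the column-level rows:
    SELECT DISTINCT stage_table, dwh_table FROM meta_column_map;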

    Hi,
    This seems to be quite a common question.
    In our project we designed a hub-and-spoke-like architecture.
    So we have 3 layers. L0 is the one closest to the source, and L0 table names are linked to the corresponding source names by a naming standard (like tabA, EXT_tabA, tabA_OK1 and so on, based on the implementation of the load procedures).
    At L1 we have the ODS, a normalized model; we use business names for tables there and standard names for temporary structures and artifacts.
    Both L0 and L1 keep the source's column names as a general rule; new columns, such as calculated ones, are business driven, and metadata is standard driven.
    Data Modeler fits the L1 modelling purpose perfectly.
    L2 is the dimensional schema; business names take over for tables and columns, possibly rewritten again at the presentation layer (front-end tool).
    Hope this helps, D.

  • Implementing hierarchical structure in data warehouse

    I want to create a data warehouse for credit card application. Each user can have a credit card and multiple supplementary credit cards. Each credit card has a main limit, which can be sub-divided into sub-limits to supplementary credit cards as requested by the user. Let us consider the following example:
    User “A” has a credit card “CC” with Limit “L” and its limit is $100,000.
    User “A” requested a supplementary credit card “CC1”, which is assigned limit “L1” = $50,000. He requests another supplementary credit card “CC2”, which is assigned limit “L2” = $100,000.
    Source tables contain data like this:
    1. src_client_card_trans: contains transaction data of client/user credit card usage (client_id, credit_card_number, balance_acquired)
    Client_id     Credit_card_number     Balance_acquired
    A     CC1     $20,000
    A     CC2     $50,000
    A     CC     $70,000
    2. src_card_limits: contains client’s credit cards linked to credit limits.
    Credit_card_number     Limit_id
    CC1     L1
    CC2     L2
    CC     L
    3. src_limit_structure: contains the relationship of limits and sub-limits.
    Limit_id     Sub_Limit_id
    L     L1
    L     L2
    I have designed two dimensions and one fact table. Dimensions are:
    1. LIMITS: contains the limit_id.
    2. CLIENTS: contains credit card user’s information.
    And the fact table is LIMIT_BALANCES_FACT, which has some fact columns along with the above dimensions.
    How can I implement the above scenario of limit hierarchy in data warehouse? Need your suggestions.
    Thanks in advance

    Much depends on how you want to analyze the data and there are a few options:
    1) Use credit limit as an attribute of the customer dimension. This would allow you to create query filters that can just show those customers with a $100,000 credit limit. This would return a list of credit cards (since the attribute would be assigned to each credit card) and then you can simply add or just keep the parents of that result set.
    However, this assumes you do not want to measure data specifically relating to credit card limit. For example it would not be possible to view a total amount spent by all customers who had a credit-limit of $100,000.
    In this case the attribute, credit limit, is simply used to filter a result set.
    2) Create a separate dimension called Credit Limit and create three levels:
    All
    Range
    Credit Limit
    The level Range would contain groupings of credit limits such as 100-500, 501-1,200, 1,201 and above, and so on.
    This would allow you to analyse your data by customer and by credit limit over time, allowing you to slice and dice quickly and easily.
    3) A second customer hierarchy could be added to the customer dimension. This would allow you to drill-down through different credit limits to customers to individual credit cards. It would be advisable to follow the same approach as option 2 and create some groupings for the credit limits to make the drill down easier for your business users to navigate:
    All
    Range
    Credit Limit
    Customer
    Credit Card
    Hope this helps
    Keith Laker
    Oracle EMEA Consulting
    BI Blog: http://oraclebi.blogspot.com/
    DM Blog: http://oracledmt.blogspot.com/
    BI on Oracle: http://www.oracle.com/bi/
    BI on OTN: http://www.oracle.com/technology/products/bi/
    BI Samples: http://www.oracle.com/technology/products/bi/samples/
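
    As a small supplement to these options: the parent-child src_limit_structure from the question can be flattened with a hierarchical query when loading a leveled LIMITS dimension. A sketch (only the table and column names come from the post):

    -- walk from top-level limits (those that are not a sub-limit of anything) downwards
    SELECT limit_id, sub_limit_id, LEVEL AS depth
    FROM   src_limit_structure
    START WITH limit_id NOT IN (SELECT sub_limit_id FROM src_limit_structure)
    CONNECT BY PRIOR sub_limit_id = limit_id;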

  • Permanent Job Opportunity - Oracle BI Data Warehouse Developer Chicago, IL

    Submit Resumes to [email protected]
    The Business Intelligence Specialist will play a critical role in designing, developing, deploying, and supporting data warehouse/data mart applications. In this role, the person will be responsible for all BI aspects of a data warehouse/data mart application. Primary duties will be to create reporting standards, as well as coach and support power users with the selected Oracle tools. The ideal candidate will have 3+ years of demonstrated experience in data warehousing and Business Intelligence tools. Must also possess excellent communication skills and an outstanding track record with users.
    Principal Duties:
    Participates with internal clients to define software requirements for development, maintenance and/or improvements
    Maintains accuracy, integrity, and availability of the data warehouse
    Tests, monitors, manages, and validates data warehouse activity, including data extraction, transformation, movement, loading, cleansing, and updating processes
    Designs and optimizes data mart models for Oracle Business Intelligence Suite.
    Translates the reporting requirements into data analysis and reporting solutions.
    Reviews and signs off on project plan(s).
    Reviews and signs off on technical design(s).
    Defines and develops BI reports for accessing/analyzing data in the warehouse.
    Customizes BI tools and data sets for different types of users.
    Designs and develops UAT (User Acceptance Testing).
    Drives improvement of BI system architecture and development process.
    Develops and maintains internal relationships. Actively champions teamwork. Uses internal resources to enhance knowledge and expertise of industry, research, products and services. Provides information and support to others in the company.
    Required Skills:
    Education and Experience:
    BS/MS in Computer Science or equivalent.
    3+ years of experience with Oracle, PL/SQL Development and Data Warehousing.
    Experience with Oracle Business Intelligence Suite and Crystal Reports is a plus.
    2-3 years of dimensional modeling experience.
    Demonstrated hands on experience with Unix/Linux, SQL required.
    Demonstrated hands on experience with Oracle reporting tools.
    Demonstrated experience with translating business requirements into data analysis and reporting solutions.
    Experience with training programs/teaching users to use tools.
    Expertise with software development process.
    Effective mediator - able to facilitate constructive and productive discussions with internal customers, external clients, and development personnel pertaining to feature definition, project scope, and status.
    Problem solving: identifies and resolves problems in a timely manner, gathers and analyzes information skillfully, and maintains confidentiality.
    Planning/organizing: prioritizes and plans work activities and uses time efficiently. Work requires continual attention to detail in composing and proofing materials, establishing priorities, and meeting deadlines. Must be able to work in a fast-paced environment with a demonstrated ability to juggle multiple competing tasks and demands.
    Quality control: demonstrates accuracy and thoroughness and monitors own work to ensure quality.
    Adaptability: adapts to changes in the work environment, manages competing demands, and is able to deal with frequent change, delays, or unexpected events.
    Benefits/Compensation:
    Employees enjoy competitive compensation. We have a full benefits package including medical and dental insurance, long-term disability and life insurance and a 401(k) plan.
    The client operates within the healthcare industry.
    This is a permanent full-time position. After ensuring your availability and qualifications we will put you in direct contact with the client to move forward in the process.

    Please forward your updated resume as soon as possible.

  • Database and Data Warehouse, SAP BW Vs Oracle

    Hello Gurus,
    I would like to know the differences between a database and a data warehouse.
    Oracle acts as the database for SAP BW. I understand it this way: all the data is stored in Oracle, and BW tells the database how to store it, with all the links etc.
    Please tell me whether I am correct.
    It’s my pleasure to award points,
    Thanks and best wishes,
    i-bi

    hi,
    A data warehouse is, primarily, a record of an enterprise's past transactional and operational information, stored in a database designed to favour efficient data analysis and reporting (especially OLAP). Data warehousing is not meant for current, "live" data.
    A database is a collection of information stored in a computer in a systematic way, such that a computer program can consult it to answer questions. The software used to manage and query a database is known as a database management system (DBMS). The properties of database systems are studied in information science.
    http://www.webopedia.com/TERM/D/data_warehouse.html
    Hope this helps.
    Regards,
    yunus

  • What are the Disadvantages of Management Data Warehouse (data collection) ?

    Hi All,
    We are planning to implement Management Data Warehouse on our production servers.
    Could you please explain the disadvantages of Management Data Warehouse (data collection)?
    Thanks in advance,
    Tirumala

    >We are planning to implement Management Data Warehouse on production servers
    It appears you are referring to production server performance.
    BOL: "You can install the management data warehouse on the same instance of SQL Server that runs the data collector. However, if server resources or performance is an issue on the server being monitored, you can install the management data warehouse
    on a different computer."
    Management Data Warehouse
    Kalman Toth Database & OLAP Architect
    SQL Server 2014 Database Design
    New Book / Kindle: Beginner Database Design & SQL Programming Using Microsoft SQL Server 2014

  • Foreign keys in SCD2 dimensions and fact tables in data warehouse

    Hello.
    I have a data warehouse in a snowflake schema. All dimensions are SCD2; the columns look like this:
    ID (PK)  SID  NAME  ...  START_DATE  END_DATE    IS_ACTUAL
    1        1    XXX        01.01.2000  01.01.2002  0
    2        1    YYX        02.01.2002  01.01.2004  1
    3        2    SYX        02.01.2002              1
    4        3    AYX        02.01.2002  01.01.2004  0
    5        3    YYZ        02.01.2004              1
    Other dimension tables and the fact table have relationships to this table.
    Do I need to create foreign keys for these relationships?
    And if I do, on which columns? SID (the serial ID) is not unique. If I create them on ID, I have to look up the SID and the current row in every query.

    >
    I have a data warehouse in a snowflake schema. All dimensions are SCD2; the columns look like this:
    ID (PK)  SID  NAME  ...  START_DATE  END_DATE    IS_ACTUAL
    1        1    XXX        01.01.2000  01.01.2002  0
    2        1    YYX        02.01.2002  01.01.2004  1
    3        2    SYX        02.01.2002              1
    4        3    AYX        02.01.2002  01.01.2004  0
    5        3    YYZ        02.01.2004              1
    Other dimension tables and the fact table have relationships to this table.
    Do I need to create foreign keys for these relationships?
    >
    Are you still designing your system? Why did you choose NOT to use a star schema? Star schemas are simpler and have some performance benefits over snowflakes. Although there may be some data redundancy, that is usually not an issue for data warehouse systems, since any DML is usually well-managed and normalization is often sacrificed for better performance.
    Only YOU can determine what foreign keys you need. Generally you will create foreign keys between any child table and its parent table and those need to be created on a primary key or unique key value.
    >
    And if I do, on which columns? SID (the serial ID) is not unique. If I create them on ID, I have to look up the SID and the current row in every query.
    >
    I have no idea what that means. There isn't any way to tell from just the DDL for the one dimension table that you provided.
    It is not clear whether you are saying that your fact table will have a direct relationship to the star-flake dimension tables or only link to them through the top-level dimensions.
    Some types of snowflakes do nothing more than normalize a dimension table to eliminate redundancy. For those types the dimension table is, in a sense, a 'mini' fact table and the other normalized tables become its children. The fact table only has a relation to the main dimension table; any data needed from the dimension's 'child' tables is obtained by joining them to their 'parent'.
    Other snowflake types have the main fact table holding relations to one or more of the dimensions' 'child' tables. That complicates the maintenance of the fact table, since any change to a dimension 'child' table impacts the fact table as well. Using that type of snowflake is not recommended.
    See the 'Snowflake Schemas' section of the Data Warehousing Guide
    http://docs.oracle.com/cd/B28359_01/server.111/b28313/schemas.htm
    >
    Snowflake Schemas
    The snowflake schema is a more complex data warehouse model than a star schema, and is a type of star schema. It is called a snowflake schema because the diagram of the schema resembles a snowflake.
    Snowflake schemas normalize dimensions to eliminate redundancy. That is, the dimension data has been grouped into multiple tables instead of one large table. For example, a product dimension table in a star schema might be normalized into a products table, a product_category table, and a product_manufacturer table in a snowflake schema. While this saves space, it increases the number of dimension tables and requires more foreign key joins. The result is more complex queries and reduced query performance. Figure 19-3 presents a graphical representation of a snowflake schema.
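
    For what it's worth, a minimal sketch of the usual SCD2 arrangement (names hypothetical): the fact table's foreign key points at the surrogate key ID, which is unique per version, while SID merely groups versions of the same business entity; current rows are read via the IS_ACTUAL flag.

    CREATE TABLE dim_entity (
      id         NUMBER PRIMARY KEY,  -- surrogate key, one row per SCD2 version
      sid        NUMBER NOT NULL,     -- durable business key, repeats across versions
      name       VARCHAR2(100),
      start_date DATE,
      end_date   DATE,
      is_actual  NUMBER(1) NOT NULL
    );

    CREATE TABLE fact_sales (
      entity_id NUMBER NOT NULL REFERENCES dim_entity,  -- the version current at load time
      amount    NUMBER(12,2)
    );

    -- current picture of each entity:
    SELECT * FROM dim_entity WHERE is_actual = 1;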
