Maintaining large dimensions

I'm currently testing whether using an OLAP cube is a viable solution in our case. Here's an overview of the data for the first test setup:
3 dimensions:
- one time dimension
- one small dimension (4 levels)
- one large dimension (5 levels)
Items in each level of the large dimension:
1st level: 27
2nd level: 246
3rd level: 1,889
4th level: 383,434
5th level: 1,348,869
While the small dimension is easily handled by the OLAP cube, loading the large dimension's data takes far too long (it has already been running for over 4 hours and is still going). Since the dimension data changes regularly and will be even larger in the real setup, this is not acceptable.
Is there a way to significantly increase the dimension data load speed, or is an OLAP cube not the right solution for this problem? Or is there anything else I'm doing wrong?
What is it about the loading process that requires so much processing power?
Any suggestions or further reading material are welcome.
Thanks,
Karl

Hi,
I'm not sure if this is your problem or not, but I want to share my experience.
I've had slow-building dimensions as well, but not as big as yours. I think the lowest level had 500,000 rows. It would often keep on building beyond any reasonable time. What I discovered was that the source data wasn't completely hierarchical. As far as I know, your dimension data needs to have a strict parent-child relationship, not a relationship that would make any soap opera on TV jealous.
I'll try to explain better.
Say that you have a product dimension with 4 levels with the following hierarchy:
Total - the top level, with just one member
Product_category - What kind of product it is
Color - The color of the product
Product - Lowest level.
Looks like an easy and basic dimension. Now let's say that we want to load data into this dimension.
Every dimension also has the total level, but I'll leave it out for this example since it would always be the same.
Say you load the following products:
Product_category - Color - Product
Clothes - Blue - Sweater
Clothes - Black - Socks
Furniture - Black - Chair
Furniture - White - Lawnchair
Toys - Black - R/C Helicopter
Now, apart from being a weird mix of products, this seems like a nice dataset to load. In my experience, this is where things start to go wrong.
If you load this data into your dimension and start to drill down, you will start at the total level on top. From there you go to Product_category, and if you then choose Clothes, you should see the colors Blue and Black. If you drill into Black, you will suddenly see the R/C Helicopter and the chair as well as the socks you expected.
The reason is that products under the same Color member have different grandparents, when in a strict hierarchy they should all have the same one: Black has ended up with three different parents (Toys, Furniture and Clothes), and this is not good.
If this is the case for your dimension, I would expect things to act weird.
Now back to the example: if this is the hierarchy you want, you need to do some SQL on your source data so that every member gets just one parent.
In my very simplified source, the data could look something like this:
Clothes - Blue clothes - Sweater
Clothes - Black clothes - Socks
Furniture - Black furniture - Chair
Furniture - White furniture - Lawnchair
Toys - Black toys - R/C Helicopter
Now every member has just one parent and everything is fine. Strictly speaking you only have to change the member (key) fields, not the description fields, but you should update the descriptions as well so that the end users won't get totally confused.
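As a rough illustration of the kind of SQL I mean (only a sketch, assuming a hypothetical source table PRODUCT_SRC with columns PRODUCT_CATEGORY, COLOR and PRODUCT; the names are made up for this example):
-- 1. Find members that have more than one parent, i.e. the "Black" problem:
--    one color key sitting under several product categories.
SELECT color
  FROM product_src
 GROUP BY color
HAVING COUNT(DISTINCT product_category) > 1;
-- 2. One possible fix: make the member key unique per parent by combining it
--    with the parent, while keeping a readable description for the end users.
SELECT product_category,
       color || ' ' || LOWER(product_category) AS color_member,  -- unique key per parent
       color                                   AS color_desc,    -- description field
       product
  FROM product_src;
Run against the five example rows, the second query produces exactly the "Blue clothes" / "Black furniture" style of members shown above.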
Like I said, I'm not sure if this is what messes things up for you; it might be something completely different. I just wanted to mention it as something you might want to check out or keep in mind.
Good luck!
Ragnar

Similar Messages

  • Large Dimension Update

    Mapping using merge to refresh dimension works fine, but updates all records whether there has been a change or not. So, an update on a large dim will be inefficient/impractical.
    I guess if the staging table/cursor only has the changed records to start with, then only the changed records are updated, so it's feasible. So my problem is that the source records do not have a "last updated" date - a CDC issue.
    So now that I've typed that, I'm thinking of a filter/join between staging and dimension that returns the non-matching rows, with the results used to update the dimension, as sketched below (I suppose I should be doing that anyway to ensure new descriptions aren't null, etc.).
    Any other suggestions?
    Thanks!

    Then again, it's a completely impractical approach if the dimension has a lot of changeable attributes (as large dimensions probably would!). So back to root of problem - if changes are not recorded at source then what are the options?
    1. Get changes recorded at source (but maybe tracking them across 20 tables for 1 dim makes for more inefficiency than a full refresh?)
    2. Full refresh of dims is possibly workable for low data volumes, but we still need to reduce updates whilst keeping keys (so use the method above, or an alternative may be to create a new table using the existing keys but attributes from source where available, then use a post-mapping process to drop the existing dim and rename the new one - a full refresh/update in a fraction of the time?)
    Then again... "fitness for purpose" - just create materialized views/tables with the current view of the data & ignore a lot of DW fundamentals - make reporting easier/safer than using the transactional system. What does all the extra work give us? Better user interface, historical comparisons, better maintainability? Does the customer actually require it?
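    A minimal sketch of the filter/join idea mentioned above, assuming hypothetical STG_PRODUCT (staging) and DIM_PRODUCT (dimension) tables that share the natural key PROD_ID and the changeable attributes NAME and CATEGORY (the table and column names are illustrative, not from the original mapping):
    -- Staging rows that are new or whose attributes differ from the dimension.
    -- Unchanged rows drop out, so the subsequent update/merge only touches
    -- records that actually changed.
    SELECT s.prod_id, s.name, s.category
      FROM stg_product s
    MINUS
    SELECT d.prod_id, d.name, d.category
      FROM dim_product d;
    A nice side effect of MINUS is that attributes which are NULL on both sides compare as equal, so they do not show up as false changes the way a plain column-by-column join comparison would.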

  • Are large dimensions a problem?

    Hello, I am looking to possibly purchase Premiere for some video editing.  I have been using Camtasia as a hack method, but I've learned that Camtasia does not deal well with large dimensions.  I'm looking to request authorization to purchase Premiere, but I want to ensure that Premiere is the tool I need for what I'm doing.
    In short, I am taking rather large screen motion captures for instructional purposes.  I need to blur out confidential information and silence portions of the audio.  The dimensions of these captures are 1920 x 1200.  The duration of these videos range from about 40 seconds to 14 minutes (which is 1.45 GB in size).
    Has anyone worked with movies of these dimensions in Premiere?  I'm hoping to find some anecdotes that I can bring to my manager before making this request.  I'd appreciate any input folks have on this.
    Kevin

    Kevin,
    If you are looking to purchase, I would assume that you are looking at PrPro CS4. Is that correct? Unfortunately, you have posted to the Premiere (precursor to PrPro) forum. Maybe one of our tireless mods will move the post over to the PrPro forum, where you will get a lot more traffic.
    As to the dimensions, yes, PrPro can handle those easily. Now, your satisfaction will be tied to two things: your computer, and the full specs of your source footage. With a good, stout editing machine and appropriate source footage, you will have no problems.
    In the Hardware sub-forum, Harm Millaard has done several worthwhile articles on building/buying an editing rig. In the PrPro forum, there is much discussion on cameras and their footage, to work the best in PrPro.
    When this post gets moved, you will receive a lot of worthwhile comments that will steer you in the right direction.
    Good luck, and do not be surprised, when Curt or Jeff moves the post.
    Hunt

  • Java support for files of large size

    Hi friends,
    Do you know if there is support in Java for managing files of large size (about 2 GB)? Could BerkeleyDB be a good option?
    thanks

    As Kaj mentioned before, using an NTFS partition, you should be able to write files of extreme sizes (for instance, an NTFS partition that is 200GB in size may store a single 200GB file).
    Of course, using FileChannel objects, you shouldn't actually try to write the full 2+ GB of data at once, because that would effectively require over 2 GB of RAM for the data of your application alone. Instead, append to your data file one segment at a time (the size of the segment, in bytes, is roughly the amount of RAM you need for your data). Perhaps you can write the bytes directly into the FileChannel (i.e. as your measurements take place), which does not require a byte buffer.
    Neither FileChannel nor File impose size restrictions beyond what the underlying operating system and type of partition impose. For example: a FAT32 partition can't deal with files over 4GB, so no language (Java, C++, whatever) will make it possible to store 4+ GB files on a FAT32 partition.
    If you rewrite a portion of my sample from the thread "storing in Java" (http://forum.java.sun.com/thread.jsp?forum=31&thread=562837&start=15&range=15&hilite=false#2770450), you should be able to write a 4GB file on your NTFS partition. Change the size of the byte buffer to 64MB and the amount of cycles to 64 and you're there.
    I'm not sure how long it will take to write 64 chunks of 64MB, but don't expect it to finish in a few seconds. My regular IDE drive takes 2.25 seconds for a 60MB file (yet, my Serial-ATA drive does it in 1078ms, so don't forget that hardware has a significant impact on I/O performance as well).

  • I am running snow leopard 10.6.8. on my IMAC. How can I post panoramic type pictures to my desktop so that they maintain panoramic dimensions? Thanks

    How can I post panoramic type (and sized) pictures to my desktop so that they maintain panoramic dimensions? I try to do it through systems preferences but they do not maintain panoramic dimensions. Thanks

    Your post is pretty lengthy and I have to admit I didn't read it all. Please try restarting in Safe Mode, if that doesn't work please do both a SMC and PRAM reset. These may take 2-3 attempts.
    SMC RESET
    Shut down the computer.
    Unplug the computer's power cord and all peripherals.
    Press and hold the power button for 5 seconds.
    Release the power button.
    Attach the computer's power cable.
    Press the power button to turn on the computer.
    PRAM RESET
    Shut down the computer.
    Locate the following keys on the keyboard: Command, Option, P, and R. You will need to hold these keys down simultaneously in step 4.
    Turn on the computer.
    Press and hold the Command-Option-P-R keys. You must press this key combination before the gray screen appears.
    Hold the keys down until the computer restarts and you hear the startup sound for the second time.
    Release the keys.

  • How to export to very large dimensions?

    I need to export a file to the specs below; however, every time I try I get an 'Error compiling movie - Unknown error'.
    Resolution 2732 pixels (w) x 768 pixels (h).
    32:9 Landscape format.
    File format is MPEG-4/H.264 AVC at 20Mbps.
    Frame rate is 25 as per PAL standard.
    This error has been replicated across a number of fairly high spec computers so I don't think it is an issue with disk space etc.  We're working off the assumption that it's the large dimensions causing the problem.
    The only solution we can come up with is to export to smaller dimensions (which we have done successfully) and then upscale but even that is proving challenging!
    Any suggestions or ideas are very welcome!

    Hi,
    I was also unable to reproduce the error using your export specs.
    Could you provide answers to these:
    Are you able to reproduce this issue with multiple projects?
    Did you try increasing the "level" to 5.1 under video settings?
    What are the specs of the original media used in the project?
    Regards
    Vipul

  • Rapidly changing very large dimensions  in OWB

    Hello,
    How are rapidly changing, very large dimensions supported in OWB?
    Is it supported directly by the tool or do we have to depend on PL/SQL?
    If supported, is it supported by all versions of OWB from 9.0.3 through 10g?
    TIA

    Hi
    Use merge (insert/update or update/insert) if you have to update and insert too.
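    For reference, a minimal sketch of such a merge (assuming hypothetical STG_CUSTOMER and DIM_CUSTOMER tables joined on a natural key CUST_ID; the table and column names are illustrative only):
    MERGE INTO dim_customer d
    USING stg_customer s
       ON (d.cust_id = s.cust_id)
    WHEN MATCHED THEN
      UPDATE SET d.name = s.name, d.region = s.region   -- update existing members
    WHEN NOT MATCHED THEN
      INSERT (cust_id, name, region)                    -- insert new members
      VALUES (s.cust_id, s.name, s.region);
    Note that, as mentioned in the Large Dimension Update message above, a plain merge like this updates every matched row whether it changed or not, so for very large dimensions you may still want to filter the staging set down to changed rows first.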
    Ott Karesz
    http://www.trendo-kft.hu

  • What are the steps to rendering large dimensions in AI?

    Hi,
    What is the ideal way to render a file as big as, say, 3500x7000 px in AI? This image, when exported, will be used for print.
    I was thinking of opening a document that size and working on it but AI gets real choppy and during export it takes forever.
    What's the best way to export it out as 3500x7000 ?
    Thanks
    (NOTE: the print shop doesn't want to scale it; they want the final file to be the size you want it printed at)

    Yes, of course. Both in preferences and in scaling. I can get an inner-glow effect if scaled to 3500x7000, only it is just barely visible, while on a smaller dimension the effects are more prominent.
    I've tried increasing the dpi to 300 for a 350x800 px document and then scaling it to 3500x7000 px and still nothing; worse, AI errors out. If I increase it to more than 300 it still errors out.
    I've been reading a few things about scaling strokes and effects online, and people say that since it's a raster effect it doesn't scale well and has a limitation on scaling. Is this true? If so, what can I do to use these effects so I can scale? Someone mentioned using InDesign to scale it so it retains the effects and strokes even at large dimensions. I'm completely confused now. Do I need to use InDesign to get my effects scaled, or can Illustrator do this on its own?

  • AWM 10.1 release maintaining a dimension

    I have a weird problem maintaining a dimension.
    Consider the following scenario in a dimension:
    Atomic level
    Hierarchy Level 1
    Hierarchy Level 2
    If there are common key values in the atomic level and hierarchy level 1, then the loading of the dimension fails.
    However, there is no problem if we have common key values in Hierarchy Level 1 and Hierarchy Level 2.
    I am running 10.1.4.0 Oracle Database and AWM 10.1.0.4
    The hierarchy is based on levels.
    It looks very weird; can someone throw some light on this?
    Thanks in advance
    - Shiva

    I think I have also experienced a similar problem. If HL1 & HL2 have common key values, it may load the dimension, but your results when viewing the DIM will be very unpredictable and may not be what you were looking for. You could turn on the surrogate key option to eliminate any such scenario.
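    To confirm whether overlapping keys really are the trigger, a quick check could look like this (only a sketch, assuming a hypothetical source table DIM_SRC with one key column per level):
    -- Key values that occur both at the atomic level and at hierarchy level 1;
    -- any rows returned are the overlap suspected of breaking the load.
    SELECT atomic_key FROM dim_src
    INTERSECT
    SELECT level1_key FROM dim_src;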
    Cheers
    Suresh.

  • How to maintain a dimension with a large number of members (over 10K)

    Hi, all,
    I am now doing a Project Planning using BPC and had some questions as follows:
    1. The total number of project members is huge (exceeding 10K in total). It will be crazy for the administrator to maintain it all by himself. Is it possible to find a workaround that lets end users do a restricted part of the administrator's work? In other words, can we find a way to prevent end users from entering the Administration interface while still letting them add members themselves through the front end, with the processing done on a schedule or manually by the administrator? Has anybody had experience with this? Or do you have any alternative or workaround way to solve the problem?
    2. The adding of project members is not finished in one go; that means in the first pass maybe only the highest level is added, and the members under it are added later. How can we manage this? A dynamic hierarchy, or something else?
    Thanks

    You could also have the end user update a regular excel spreadsheet with the same column format as a membersheet.  Easiest would be to save off the membersheet in another location accessible to the end user.  Then it can be modified using excel.  Or have the end user maintain a delimited flat file containing all the information included in a membersheet.
    Using the BPC makedim package as a starting point, you can create a custom version that takes the user-updated document as input, updates the BPC SQL mbr table for that dimension and then processes the dimension. As long as the member list does not exceed the Excel limitations, I would also suggest updating the membersheet in BPC. In previous implementations this has meant taking a copy of the membersheet, deleting it, copying a template with the correct columns, then adding the members into it. Updating Excel spreadsheets from SSIS can be challenging if you have to deal with deleted members.
    Also, if you have the potential of deleted members that might have associated facts, I have another post concerning that issue, but have not had time to try any of the suggestions.
    Now, the end user is responsible for updating the member list without having access to BPC Admin tool, but someone with access to run datamanager packages would have to execute the datamanager package to process the dimension.

  • How to tune a large dimension query in Analyzer 6.2?

    Dear expert: I need to query a report in Analyzer 6.2. How do I tune the performance of a large sparse dimension query in Analyzer 6.2 and Essbase 6.5.1?
    The report looks like this - page: order_date, sub_code, item, pur_date, currency, vendor, measure.
    The dimension order, with member counts and storage types:
    Order_Date (304) - dense
    measure (6) - dense
    currency (6) - sparse
    pur_date (138) - sparse
    vendor (135) - sparse
    item (253) - sparse
    sub_code (151) - sparse
    Please help to solve the problem, otherwise I will be killed by the customer. Thank you very much.
    Phoenix

    Hi All,
    We got another idea: create a new template and use it as the "Current Default Workbook".
    Then it shows the latest date, since we changed one of the text elements from "Display Status of Data" to "Display status of Data To".
    But this change is only showing for my user id, not for the other users.
    We are selecting the tick mark for "Global Default Workbook", but this tick mark goes away after each refresh. I think if this tick mark held permanently, my problem would be solved.
    Please let me know if you have any ideas to resolve this issue.

  • Maintain attribute dimensions in Hyperion Planning, via ODI

    How can I move/maintain attribute members in Hyperion Planning, preferably via ODI?
    I have been loading an attribute dimension into Hyperion Planning, via ODI. The problem is that some children have been previously loaded to incorrect parents. However, a reload does not seem to move the children to their rightful place in the hierarchy.
    There are no errors being generated.
    Cheers

    Are you definitely sure the issue does not lie on the Planning side? I would create a file and then use the outline loader to compare load times, to see where the issue is first.
    Cheers
    John
    http://john-goodwin.blogspot.com/

  • Large Dimensions

    Hi
    I am currently working on an Oracle OLAP database and I am experiencing some strange behaviour that I have never seen before.
    Firstly, are there any rules governing when the number of dimension elements is too large for the OLAP world?
    If I do the Cartesian calculation on this cube, the number of permutations works out to be in the region of 800,000,000,000,000,000,000,000,000. Is this a valid use of Oracle OLAP, and can it handle these situations with high numbers of unique dimension members?
    fact: 3,000,000 records
    dim1: 1,000,000
    dim2: 400,000
    dim3: 100,000
    dim4: 50,000
    dim5: 200
    dim6: 100
    dim7: 20
    Thanks in advance

    In real, practical OLAP applications you will never have such a scenario. For instance, let's assume the following for 2 of your dimensions:
    dim1: article dimension
    dim2: market dimension
    In real-world business models, you will never have all products selling in all the markets (the full permutation of the 2 dimensions). Having said that, Oracle OLAP will only hold combinations of both dimensions that actually have data, and therefore only stores cube cells for those combinations. The number of permutations will be reduced tremendously.
    Furthermore, a good Analytic Workspace design using compressed composites etc. will take care of the rest. An OLAP application is meant to hold data at very highly aggregated levels.
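    To put numbers on it: the full cross product quoted above works out to 1,000,000 × 400,000 × 100,000 × 50,000 × 200 × 100 × 20 ≈ 8 × 10^26 candidate cells, yet with only 3,000,000 fact records the cube ends up storing roughly a 4 × 10^-21 fraction of that space.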

  • EPMA very slow to import large dimension from interface tables

    I am attempting to import a dimension into the master library from the EPMA interface tables. The dimension has roughly 255,000 members. The import from the interface tables into an empty dimension is taking close to 3 hours. CPU utilization on the EPMA server is a steady 6% (epma_server.exe) for the entire time. We are on 11.1.2.1.001 of EPMA. The rest of the suite is at 11.1.2.1. The dimension is of the Generic type. I have had the same result when importing into a local dimension. The performance degrades after about 6,000 members.
    Your thoughts would be much appreciated.

    I have found my answer in the EPMA guide:
    In addition to the dimension interface tables created by the template script, you can add interface tables for additional dimensions. For example, the template script contains one set of tables for the Entity dimension. You can add more Entity dimensions as needed. For each dimension added to the interface tables, you must also include the dimension in the IM_Dimension system table so that the dimension is available during profile creation.

  • Large dimension tables

    I am trying to get a grip on how a dimension table can have more records than the fact table, since the key for the dimension table is the DIMID. Can someone give me a practical example? Thanks

    Thanks for this; let's work with the Doc No example.
    1) I have a cube with 4 dimensions. The fourth dimension contains material no, doc no and plant; the key for this dimension table is the DIMID (system generated?), and each record in the dimension contains the corresponding SID values for material no, doc no and plant.
    2) My first load of data into the cube - 1000 records.
    3) For each of these 1000 records a DIMID is generated in dimension 4, and the corresponding values of the 3 SIDs are retained as part of these records.
    4) My next load is a delta of 200 lines - 150 new records and 50 updated records. In this case 150 new DIMIDs are generated and the 3 SIDs are retained, whereas the original 50 updated records do not have new DIMIDs generated.
    Is the scenario as I have described it correct?
