Regarding High Cardinality

Hi,
Can anyone explain the difference between High Cardinality and a Line Item Dimension?
Regards
YJ

Hi YJ,
Refer to these links:
Line Item Dimension
Re: Line Item Dimension
Cardinality
Re: High Cardinality Flag
Bye
Dinesh

Similar Messages

  • Any relation between indexes and high cardinality.............

    Hi,
    What is the difference between an index and high cardinality?
    And also, what is the difference between an index and a line item dimension?
    Thanks

    Hi,
    High Cardinality:
    Please refer to this link, especially the post from PB:
    line item dimension and high cardinality?
    Line Item Dimension:
    Please go through this link from SAP help for line item dimension
    http://help.sap.com/saphelp_nw04/helpdata/en/a7/d50f395fc8cb7fe10000000a11402f/content.htm
    Also in this thread the topic has been discussed
    Re: Line Item Dimension
    BI Index:
    There are two types of indexes in BW on Oracle: bitmap and b-tree.
    Bitmap indexes are created by default on each dimension column of a fact table.
    Setting the high cardinality flag for a dimension usually affects query performance if the dimension is used in a query.
    You can change the bitmap index on the fact table dimension column to a b-tree index by setting the high cardinality flag; it is not necessary to delete the data from the InfoCube to do this. (A rough SQL sketch of the two index types follows at the end of this reply.)
    Refer:
    Re: Bitmap vs BTree
    How to create B-Tree and Bitmap index in SAP
    Re: Cardinality
    Line Item Dimension
    Hope it helps...
    Cheers,
    Habeeb
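    To make the bitmap vs. b-tree point above concrete, here is a rough Oracle sketch. The table, column and index names follow the usual BW naming pattern but are made up for illustration; in a real system BW creates and converts these indexes itself when the high cardinality flag is changed and the indexes are rebuilt.
    -- Default behaviour (flag not set): bitmap index on the fact table's dimension key column
    CREATE BITMAP INDEX "/BIC/FSALESCUBE~040"
      ON "/BIC/FSALESCUBE" ("KEY_SALESCUBE4");
    -- With the high cardinality flag set, a regular b-tree index is used instead
    -- (only one of the two exists at a time; no InfoCube data has to be deleted for the switch)
    CREATE INDEX "/BIC/FSALESCUBE~040"
      ON "/BIC/FSALESCUBE" ("KEY_SALESCUBE4");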

  • SSAS Tabular. MDX slow when reporting high cardinality columns.

    Even with small fact tables (~20 million rows), MDX is extremely slow when there are high-cardinality columns in the body of the report.
    For example, the following DAX query is sub-second:
    EVALUATE
    SUMMARIZE (
        CALCULATETABLE (
            'Posted Entry',
            'Cost Centre'[COST_CENTRE_ID] = "981224",
            'Vendor'[VENDOR_NU] = "100001",
            'Posted Entry'[DR_CR] = "S"
        ),
        'Posted Entry'[DOCUMENT_ID],
        'Posted Entry'[DOCUMENT_LINE_DS],
        'Posted Entry'[TAX_CODE_ID],
        "Posted Amount", [GL Amount],
        "Document Count", [Document Count],
        "Record Count", [Row Count],
        "Document Line Count", [Document Line Count],
        "Vendor Count", [Vendor Count]
    )
    ORDER BY
        'Posted Entry'[GL Amount] DESC
    The MDX equivalent takes 1 minute 13 seconds.
    Select
    { [Measures].[Document Count],[Measures].[Document Line Count],[Measures].[GL Amount], [Measures].[Row Count],[Measures].[Vendor Count]} On Columns ,
    NON EMPTY [Posted Entry].[DOCUMENT_ID_LINE].[DOCUMENT_ID_LINE].AllMembers * [Posted Entry].[DOCUMENT_LINE_DS].[DOCUMENT_LINE_DS].AllMembers * [Posted Entry].[TAX_CODE_ID].[TAX_CODE_ID].AllMembers On Rows
    From [Scrambled Posted Entry]
    WHERE ( [Cost Centre].[COST_CENTRE_ID].&[981224] ,[Vendor].[VENDOR_NU].&[100001] ,{[Posted Entry].[DR_CR].&[S]})
    I've tried this under 2012 SP1 and it is still a problem. The slow MDX happens when there is a high-cardinality column on the rows and the selection is done on joined tables. DAX performs well; MDX doesn't. Using client-generated MDX or bigger fact tables makes the situation worse.
    Is there a "go fast" switch for MDX in Tabular models?

    Hi,
    There are only 50 rows returned, and the MDX is still slow even if you only return a couple of rows.
    It comes down to the fact that DAX produces much more efficient queries against the engine.
    For the DAX query, for example:
    After a number of reference queries in the trace, the main VertiPaq SE query is
    SELECT
    [Posted Entry_053caf72-f8ab-4675-bc0b-237ff9ba35e1].[DOCUMENT_ID], [Posted Entry_053caf72-f8ab-4675-bc0b-237ff9ba35e1].[DOCUMENT_LINE_DS], [Posted Entry_053caf72-f8ab-4675-bc0b-237ff9ba35e1].[TAX_CODE_ID],
    SUM([Posted Entry_053caf72-f8ab-4675-bc0b-237ff9ba35e1].[POSTING_ENTRY_AMT])
    FROM [Posted Entry_053caf72-f8ab-4675-bc0b-237ff9ba35e1]
    WHERE
     ([Posted Entry_053caf72-f8ab-4675-bc0b-237ff9ba35e1].[DOCUMENT_ID], [Posted Entry_053caf72-f8ab-4675-bc0b-237ff9ba35e1].[DOCUMENT_LINE_DS], [Posted Entry_053caf72-f8ab-4675-bc0b-237ff9ba35e1].[TAX_CODE_ID]) IN {('0273185857', 'COUOXKCZKKU:CKZTCO CCU YCOT
    XY UUKUO ZTC', 'P0'), ('0272325356', 'ZXOBWUB ZOOOUBL CCBW ZTOKKUB:YKB 9T KOD', 'P0'), ('0271408149', 'ZOUYOWU ZOCO CLU:Y/WTC-KC YOBT 7.3ZT BUY', 'P0'), ('0273174968', 'ZOUYOWU ZOCO CLU:Y/WTC-KC YOBT 3ZT KBW', 'P0'), ('0273785256', 'ZOUYOWU ZOCO CLU:Y/WTC-KC
    YOBT 3ZT JXO', 'P0'), ('0273967993', 'ZOUYOWU ZOCO CLU:Y/WTC-KC YOBT 3ZT KCB', 'P0'), ('0272435413', 'ZOUYOWU ZOCO CLU:Y/WTC-KC YOBT 3ZT BUY', 'P0'), ('0273785417', 'ZOUYOWU ZOCO CLU:Y/WTC-KC YOBT 3ZT BUY', 'P0'), ('0272791529', 'ZOUYOWU ZOCO CLU:Y/WTC-KC
    YOBT 7.3ZT JXO', 'P0'), ('0270592030', 'ZOUYOWU ZOCO CLU:Y/WTC-KC YOBT 89.3Z JXO', 'P0')...[49 total tuples, not all displayed]};
    showing a CPU time of 312 and a duration of 156. It looks like it has constructed an IN clause for every row it is retrieving.
    The total for the DAX query from the Profiler is 889 CPU time and a duration of 1669.
    For the MDX query:
    After a number of reference queries in the trace, the expensive VertiPaq SE query is
    SELECT
    [Posted Entry_053caf72-f8ab-4675-bc0b-237ff9ba35e1].[DOCUMENT_ID_LINE], [Posted Entry_053caf72-f8ab-4675-bc0b-237ff9ba35e1].[DOCUMENT_LINE_DS], [Posted Entry_053caf72-f8ab-4675-bc0b-237ff9ba35e1].[TAX_CODE_ID]
    FROM [Posted Entry_053caf72-f8ab-4675-bc0b-237ff9ba35e1]
    WHERE
    [Posted Entry_053caf72-f8ab-4675-bc0b-237ff9ba35e1].[DR_CR] = 'S';
    showing a CPU time of 49213 and a duration of 25818.
    It looks like it is only filtering by the debit/credit indicator, which will match about half the fact table.
    After that it fires some tuple-based queries (similar to the IN-list query seen for the DAX, but with joins):
    SELECT
    [Posted Entry_053caf72-f8ab-4675-bc0b-237ff9ba35e1].[DOCUMENT_ID_LINE], [Posted Entry_053caf72-f8ab-4675-bc0b-237ff9ba35e1].[DOCUMENT_LINE_DS], [Posted Entry_053caf72-f8ab-4675-bc0b-237ff9ba35e1].[TAX_CODE_ID]
    FROM [Posted Entry_053caf72-f8ab-4675-bc0b-237ff9ba35e1]
    LEFT OUTER JOIN [Vendor_6b7b13d5-69b8-48dd-b7dc-14bcacb6b641] ON [Posted Entry_053caf72-f8ab-4675-bc0b-237ff9ba35e1].[VENDOR_NU]=[Vendor_6b7b13d5-69b8-48dd-b7dc-14bcacb6b641].[VENDOR_NU]
    LEFT OUTER JOIN [Cost Centre_f181022d-ef5c-474a-9871-51a30095a864] ON [Posted Entry_053caf72-f8ab-4675-bc0b-237ff9ba35e1].[COST_CENTRE_ID]=[Cost Centre_f181022d-ef5c-474a-9871-51a30095a864].[COST_CENTRE_ID]
    WHERE
    [Posted Entry_053caf72-f8ab-4675-bc0b-237ff9ba35e1].[DR_CR] = 'S' VAND
    ([Posted Entry_053caf72-f8ab-4675-bc0b-237ff9ba35e1].[DOCUMENT_ID_LINE], [Posted Entry_053caf72-f8ab-4675-bc0b-237ff9ba35e1].[DOCUMENT_LINE_DS], [Posted Entry_053caf72-f8ab-4675-bc0b-237ff9ba35e1].[TAX_CODE_ID]) IN {('0271068437/1', 'ZOUYOWU ZOCO CLU:Y/WTC-KC YOBT 7.3ZT ZTC', 'P0'), ('0272510444/1', 'ZOUYOWU ZOCO CLU:Y/WTC-KC YOBT 3ZT KBW', 'P0'), ('0272606954/1', null, 'P0'), ('0273967993/1', 'ZOUYOWU ZOCO CLU:Y/WTC-KC YOBT 3ZT KCB', 'P0'), ('0272325356/1', 'ZXOBWUB ZOOOUBL CCBW ZTOKKUB:YKB 9T KOD', 'P0'), ('0272325518/1', 'ZOUYOWU ZOCO CLU:Y/WTC-KC YOBT 3ZT KUW', 'P0'), ('0273231318/1', 'ZOUYOWU ZOCO CLU:Y/WTC-KC YOBT 7.3ZT ZWB', 'P0'), ('0273967504/1', 'ZOUYOWU ZOCO CLU:Y/WTC-KC YOBT 3ZT KBW', 'P0'), ('0274055644/1', 'YBUCC OBUC YTT OYX:OD 5.3F81.3ZT TOZUT', 'P5'), ('0272435413/1', 'ZOUYOWU ZOCO CLU:Y/WTC-KC YOBT 3ZT BUY', 'P0')...[49 total tuples, not all displayed]};
    This query takes 671 CPU and a duration of 234: more expensive than the most expensive part of the DAX query, but still insignificant compared to the expensive part of the MDX.
    The total for the MDX query from the Profiler is 47206 CPU time and a duration of 73024.
    To me the problem looks like the MDX fires a very expensive query against the fact table that filters by only one column, and then goes about refining the set later on.
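    Translated into a plain-SQL analogy (illustrative only; these are not the actual engine queries, and the table/column names are simplified), the difference between the two access patterns described above looks roughly like this:
    -- DAX-style plan: all predicates are pushed into one scan of the fact table
    SELECT document_id, document_line_ds, tax_code_id, SUM(posting_entry_amt)
    FROM posted_entry
    WHERE dr_cr = 'S'
      AND cost_centre_id = '981224'
      AND vendor_nu = '100001'
    GROUP BY document_id, document_line_ds, tax_code_id;
    -- MDX-style plan: first materialise every distinct high-cardinality combination
    -- for roughly half the fact table, then refine that large intermediate set
    SELECT DISTINCT document_id_line, document_line_ds, tax_code_id
    FROM posted_entry
    WHERE dr_cr = 'S';
    -- ...followed by further queries that join to Vendor and Cost Centre and
    -- restrict the result to the tuples actually requested.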

  • If v check only High Cardinality?

    Hello all
    If we check only High Cardinality in the dimension assignment, then the dimension table will still be available (I hope so).
    Up to how many characteristics can we assign this cardinality?
    Is it mandatory that when we select High Cardinality we should also select Line Item?
    many thanks
    balaji

    Hi Pizzaman,
    Thanks for the information you have given.
    In your statements you said "when just High Cardinality is selected for a dimension, a b-tree index is created instead of a bitmap index". But if we check only Line Item, which index will be created? (Is it also only a b-tree index in that case?)
    If both Line Item and High Cardinality are checked, which index will be created?
    Many Thanks
    balaji

  • High cardinality

    My cube's E fact table has 214,510 entries and the Z sales order line item dimension's SID table has 1,438,296 entries. Should the 'high cardinality' setting be marked or unmarked in this situation?

    When compared to a fact table, dimensions ideally have a small cardinality. However, there is an exception to this rule. For example, there are InfoCubes in which a characteristic such as Document is used, in which case almost every entry in the fact table is assigned to a different document. This means that the dimension (or the associated dimension table) has almost as many entries as the fact table itself. We refer here to a degenerated dimension. In BW 2.0, this was also known as a line item dimension, in which case the characteristic responsible for the high cardinality was seen as a line item. Generally, relational and multi-dimensional database systems have problems processing such dimensions efficiently. You can use the indicators Line Item and High Cardinality to execute the following optimizations.
    High Cardinality means that the dimension is expected to have a large number of instances (that is, a high cardinality). This information is used to carry out optimizations on a physical level, depending on the database platform. Different index types are used than is normally the case. A general rule is that a dimension has a high cardinality when the number of dimension entries is at least 20% of the fact table entries. If you are unsure, do not select high cardinality for the dimension.
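    Applying the 20% rule of thumb from the reply above to the figures in the question: 1,438,296 dimension/SID entries against 214,510 fact rows is roughly 670%, far above the 20% threshold, so the high cardinality setting looks justified here. For a normal (non-line-item) dimension the check could be done with a simple count; the table names below follow the /BIC/ pattern but are placeholders:
    -- Ratio of dimension table entries to (compressed) fact table entries
    SELECT d.cnt / f.cnt AS dim_to_fact_ratio        -- >= 0.2 suggests high cardinality
    FROM (SELECT COUNT(*) AS cnt FROM "/BIC/DSALESCUBE2") d,
         (SELECT COUNT(*) AS cnt FROM "/BIC/ESALESCUBE") f;
    -- In the question: 1438296 / 214510 ~= 6.7, i.e. about 670%.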

  • High Cardinality Flag

    If a dimension only has the High Cardinality flag checked, I understand that the index will be a b-tree. However, to determine whether this setting is correct, I want to check the cardinality, i.e.
    Number of Distinct Values / Number of Records
    Should this be done using the number of records in the dimension table, and not the fact table? Thanks.

    You're right: for a fact table of 8.5 million rows, you would NOT want a dimension with only 6,000 values to be marked high cardinality.
    The approach of calculating the dimension size relative to the fact table is fine. The challenge in the initial design of dimensions is that, without expert domain knowledge, it is difficult to figure out how big a dimension will be until you have built the cube and loaded data. Unless you can analyze the data from R/3 in some way, you have to go through a load-and-review process.
    Yes, every Dim ID is a distinct value by design. What you are trying to avoid is combining characteristics that by themselves each have low cardinality but have no relationship to one another, so that when put together in the same dimension they produce a large dimension. For example:
    Let's take your existing dimension with 6,000 values (6,000 different combinations of the characteristics currently in the dimension), and add another characteristic that has 1,000 distinct values by itself.
    Adding this characteristic to the dimension could result in no new dimension rows if the new characteristic is directly related to an existing characteristic(s).
    E.g., let's say you were adding a characteristic called Region, which is nothing more than a concatenation of Division and Business Area. The dimension still has only 6,000 values. (When one characteristic is a piece of another characteristic like this, you would want them in the same dimension.)
    Or let's say you were adding a characteristic that has no relationship to any of the existing characteristics, such as a Posting Date. Each occurrence of the 6,000 dimension combinations has all 1,000 posting dates associated with it. Now your dimension table is 6,000 * 1,000 = 6,000,000 rows! Now your dimension IDs would be considered to have high cardinality. The answer in this design, however, is NOT to set this dimension to high cardinality, but rather to put Posting Date in its own dimension. (A quick SQL check for this is sketched after this reply.)
    Hope this helps a little.
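    A quick way to make the combination check above concrete is to count distinct combinations directly on the source or staging data before adding a characteristic to a dimension. The table and column names here are purely illustrative:
    -- Distinct combinations of the characteristics already in the dimension
    SELECT COUNT(*) AS current_combinations
    FROM (SELECT DISTINCT division, business_area FROM source_data) t;
    -- The same check after adding a candidate characteristic (e.g. posting date):
    -- if the count explodes (6,000 -> 6,000,000 as in the example above),
    -- the new characteristic belongs in its own dimension instead.
    SELECT COUNT(*) AS combinations_with_posting_date
    FROM (SELECT DISTINCT division, business_area, posting_date FROM source_data) t;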

  • High cardinality flag viewer

    Hello all,
    I'm looking for a way to see all the InfoCubes that have the high cardinality flag set in the system.
    Is there a transaction I can call or a program I can run to see this?

    Hi Jason,
    Use RSDDIMEV - View of dimensions with texts.
    In the selection screen, enter "X" for the high cardinality field and execute.
    Hope this helps. (A direct query on the view is sketched below.)
    PBI
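    If you prefer a direct query over the selection screen, something along these lines should also work. The field names for the flag and the version (HIGHCARD and OBJVERS below) are assumptions, so check the view definition in your release first:
    -- List all active InfoCube dimensions with the high cardinality flag set
    SELECT infocube, dimension
    FROM rsddimev
    WHERE highcard = 'X'
      AND objvers = 'A';   -- restrict to the active version, if the view carries it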

  • High Cardinality - only relevant for reporting performance?

    Hello,
    I read a lot about high cardinality. One thing could not be clarified: is it "only" for reporting purposes? I am aware that the tables will be set up differently.
    If I have no reports (we use the BW system integrated in DP), is the high cardinality flag still relevant?

    It could potentially be used for determining the access path for any SQL accessing the cube. Whether it is open hub, a BAPI, ABAP or some retractor, whenever SQL is executed against the cube, this flag may influence how data is read (e.g., which index is used).

  • Benchmark for High Cardinality

    In the link below, SAP uses 20% as a benchmark figure for high cardinality:
    http://help.sap.com/saphelp_nw04/helpdata/en/a7/d50f395fc8cb7fe10000000a11402f/content.htm
    Whereas in the link below, Oracle (we are using 9.0.2.x) uses 1% as a benchmark for high cardinality:
    http://www.lc.leidenuniv.nl/awcourse/oracle/server.920/a96524/toc.htm
    Why is there such a stark difference in benchmark values, and which is the correct benchmark to consider?
    Thank you for any help offered.

    I'm not sure that you are comparing apples to apples.
    SAP is referring to a high cardinality dimension in its statement, not the single column that Oracle's doc is talking about. Oracle's doc also mentions that the advantage is greatest when the ratio of the number of distinct values to the number of rows is around 1%. Remember, both SAP's and Oracle's statements are general rules of thumb.
    Let's use Oracle's example:
    - Table with 1 million rows
    - Col A has 10,000 distinct values, or 1%.
    But now, let's talk about a dimension that consists of two characteristics, Col A and Col B.
    - Col A has 10,000 distinct values as before (low cardinality as per Oracle)
    - Col B has 100 distinct values (very low cardinality per Oracle)
    Now what happens if, for every value of Col A, at least one row exists with every Col B value? The number of rows in your dimension table is 10,000 x 100, or 1,000,000, i.e. 100% of the fact table.
    Dimension tables can grow very quickly when you have characteristics that individually have low cardinality, but when combined result in a dimension that does not. (A sketch of the two ratios follows at the end of this reply.)
    Hope this makes sense and clears things up a bit.
    Harry -
    You have had several posts related to Cardinality.  I would suggest that rather than closing out each question, and posting another question, just go ahead and post a follow-up question in the original thread (assuming it's related).  I think this creates a more educational thread for other forum members to follow.
    Cheers,
    Pizzaman
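    The two benchmarks measure different ratios, which a rough sketch may make clearer. Oracle's 1% applies to one column's distinct values relative to the rows of its table, while SAP's 20% applies to the dimension table's rows relative to the fact table's rows; the table and column names below are illustrative only:
    -- Oracle's guideline: distinct values of ONE column vs. rows of the table (~1%)
    SELECT COUNT(DISTINCT col_a) / COUNT(*) AS single_column_ratio
    FROM fact_table;
    -- SAP's guideline: rows of the DIMENSION table vs. rows of the fact table (>= 20%),
    -- which depends on the COMBINATIONS of all characteristics in the dimension
    SELECT COUNT(*) / (SELECT COUNT(*) FROM fact_table) AS dimension_ratio
    FROM (SELECT DISTINCT col_a, col_b FROM fact_table) dim_combinations;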

  • Hai guys regarding higher seconder education cess

    Hi guys. Regarding higher education cess in MM pricing: which condition record do I have to maintain in FV11 and in register 23A? I am not getting the option for higher education cess, so please clarify. Thanks.

    Hi,
    Please check in OBQ3 whether you are able to see the condition type JSEP for higher education cess.
    Maintain that condition in FV11.
    Regards,
    Piyush

  • High cardinality and BTREE

    Hello,
    Can anyone explain high cardinality, with an example,
    and also what B-tree indexes are in that context?
    How is this helpful from a performance point of view compared to a line item dimension?
    Looking forward to a reply.
    thanks

    Hi Guru,
    High Cardinality means that the dimension is expected to have a large number of instances/data records (that is, a high cardinality). A general rule is that a dimension has a high cardinality when the number of dimension entries is at least 20% of the fact table entries. If you are unsure, do not select high cardinality for the dimension.
    This information is used to carry out optimizations on a physical level, depending on the database platform. Different index types are used than is normally the case.
    Additionally, you have the option of giving dimensions the High Cardinality indicator. This function should be switched on if the dimension is larger than ten percent of the fact table. B-tree indexes are then created instead of bitmap indexes.
    So whenever you have a high-instance dimension, mark it as High Cardinality while defining the dimensions.
    You use Line Item when you have only one characteristic in your dimension. This means that the system does not create a dimension table. Instead, the SID table of the characteristic takes on the role of the dimension table. This helps because, when we load transaction data, no IDs are generated for the entries in the dimension table, and secondly a table having a very large cardinality is removed from the star schema. As a result, the SQL-based queries are simpler.
    Hope it helps.
    Thanks
    CK

  • High cardinality and line dimenson

    Hi
    Are high cardinality and line item dimension dependent on each other?
    My understanding is that if the dimension is more than 10% of the size of the fact table, then we go for a line item dimension, and high cardinality should be set only if the dimension is more than 10% of the fact table. By choosing a line item dimension, the fact table will be directly linked to the SID table, as there will be no dimension table. Does that mean that if I choose a line item dimension, I can't go for high cardinality, since there is no dimension table? Please let me know the relationship between the two.
    Thank you
    Sriya

    When compared to a fact table, dimensions ideally have a small cardinality. However, there is an exception to this rule. For example, there are InfoCubes in which a characteristic such as Document is used, in which case almost every entry in the fact table is assigned to a different document. This means that the dimension (or the associated dimension table) has almost as many entries as the fact table itself. We refer here to a degenerated dimension.
    Generally, relational and multi-dimensional database systems have problems processing such dimensions efficiently. You can use the indicators Line Item and High Cardinality to execute the following optimizations:
    1. Line Item: The dimension contains precisely one characteristic. This means that the system does not create a dimension table. Instead, the SID table of the characteristic takes on the role of the dimension table. Removing the dimension table has the following advantages:
    - When loading transaction data, no IDs are generated for the entries in the dimension table. This number range operation can compromise performance precisely in the case where a degenerated dimension is involved.
    - A table having a very large cardinality is removed from the star schema. As a result, the SQL-based queries are simpler. In many cases, the database optimizer can choose better execution plans.
    Nevertheless, it also has a disadvantage: a dimension marked as a line item cannot subsequently include additional characteristics. This is only possible with normal dimensions.
    2. High Cardinality: The dimension is expected to have a large number of instances (that is, a high cardinality). This information is used to carry out optimizations on a physical level, depending on the database platform. Different index types are used than is normally the case. A general rule is that a dimension has a high cardinality when the number of dimension entries is at least 20% of the fact table entries. If you are unsure, do not select high cardinality for the dimension.
    For example, a dimension containing Sales Document Number and Sales Organization can be set as high cardinality, since the sales document number will have very many distinct values. (A simplified sketch of the join difference follows at the end of this reply.)
    Hope this helps
    Raja
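    To illustrate the structural difference described above, here is a simplified sketch of the join a query would issue. The /BIC/ table and column names are placeholders, and the real generated SQL is more involved:
    -- Normal dimension: fact table -> dimension table -> SID table
    SELECT s.customer, SUM(f.amount)
    FROM "/BIC/FSALESCUBE" f
    JOIN "/BIC/DSALESCUBE1" d ON f.key_salescube1 = d.dimid
    JOIN "/BIC/SCUSTOMER"   s ON d.sid_customer   = s.sid
    GROUP BY s.customer;
    -- Line item dimension: no dimension table; the fact table key holds the SID
    -- directly, so one join (and the DIM ID number range during load) disappears
    SELECT s.customer, SUM(f.amount)
    FROM "/BIC/FSALESCUBE" f
    JOIN "/BIC/SCUSTOMER"  s ON f.key_salescube1 = s.sid
    GROUP BY s.customer;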

  • Differences between High cardinality and line item dimension

    Please Search the forum
    Friends,
    Can anyone tell me the differences between High Cardinality and Line Item Dimension, and their use?
    Thanks in advance.
    Jose

    Please search in SDN.

  • Hai regarding higher education cess

    Hi, this is Swamy. I want to know which condition record to maintain for higher education cess. Please give the answer. Thanks, bye.

    Hi,
    JECP - Education cess
    JIAX - Higher education cess
    Regards,
    Sadhu Kishore

  • Hai guys this is swamy regarding higher secondery education cess

    Hi. In the /nJ1IIN screen we have another field for higher secondary education cess. How can I maintain the higher education cess of 1%, and what is responsible for the output display? Please confirm as soon as possible to my mail id [email protected]. Thanks, guys.

    Hi,
    Can you be more descriptive?
    In J1IIN, higher secondary education cess flows from the billing document, for example the excise proforma.
    In the excise proforma it can come from FTXP, where it is maintained in the tax code, if you are using the TAXINJ pricing procedure.
    Or it can come from TAXINN if you are using condition types.
    Regards
    Jitesh
