Distinct count makes BI server choose the wrong aggregate table
Hi experts,
I have 4 dimension tables (time, store, product and client) and one fact table, sales.
The sales fact table has 3 aggregate tables as sources:
agg_sales_1: aggregate sales for one product of one client in one store per day
agg_sales_2: aggregate sales for one product in one store per day (all clients)
agg_sales_3: aggregate sales for one store per day (all products, all clients)
As you can see, agg_sales_1 has a lot of rows, agg_sales_2 has fewer rows and agg_sales_3 has very few rows (one row per store and day).
What I need is the average sales per month for all stores (I don't care about products or clients, I want all of them),
so I created:
one fact logical column with sum(sales), with the time level set to month: total_sales_per_month
one fact logical column with count(distinct(date)), with the time level set to month, which gives me the number of days with sales in a month: '#_of_days_in_with_sales_in_month'
and I want to have average_sales_per_month = total_sales_per_month / '#_of_days_in_with_sales_in_month'.
So far so good:
if in my report in Presentation I put day and total_sales_per_month, then the server chooses agg_sales_3 (which is the best solution)
if in my report in Presentation I put day, total_sales_per_month and '#_of_days_in_with_sales_in_month', or just average_sales_per_month, then the server chooses agg_sales_1 (which is the worst solution).
The question is: why?
Another clue:
if I change the aggregate function from count(distinct()) to count() (which is no good for me), then the server again chooses the good table, agg_sales_3.
So I think the count(distinct()) function is causing this bad behavior...
Any suggestions, please...
And Happy Holidays
Thanks
Nicolae Ancuta
One of the dimension tables has joins to other fact tables, and the query is routed through unwanted dimension and fact tables. This is happening because of aggregate navigation in the fact sources, with the Content tab set to the detailed level. I'm trying to use aggregate functions...
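As a rough illustration of the calculation described in the question (it does not explain the aggregate navigation behavior itself), the following Python sketch computes the monthly total, the number of distinct days with sales and their ratio from a table with one row per store and day. Only the table name agg_sales_3 and the meaning of its columns come from the post; the column names, the sample rows and the use of SQLite as a stand-in database are assumptions.

```python
import sqlite3

# In-memory SQLite database standing in for the real warehouse (assumption).
conn = sqlite3.connect(":memory:")
conn.execute("CREATE TABLE agg_sales_3 (store TEXT, day TEXT, sales REAL)")

# Invented sample data: one row per store and day, as described in the post.
conn.executemany(
    "INSERT INTO agg_sales_3 VALUES (?, ?, ?)",
    [
        ("S1", "2023-12-01", 100.0),
        ("S1", "2023-12-02", 300.0),
        ("S2", "2023-12-01", 50.0),
        ("S2", "2023-12-05", 70.0),
        ("S2", "2023-12-09", 30.0),
    ],
)

# Monthly total, distinct days with sales, and their ratio per store.
rows = conn.execute(
    """
    SELECT store,
           substr(day, 1, 7)                AS month,
           SUM(sales)                       AS total_sales_per_month,
           COUNT(DISTINCT day)              AS days_with_sales,
           SUM(sales) / COUNT(DISTINCT day) AS average_sales
    FROM agg_sales_3
    GROUP BY store, substr(day, 1, 7)
    ORDER BY store
    """
).fetchall()

for row in rows:
    print(row)
```

The point of the sketch is only that a store-per-day grain already holds everything the three measures need.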
Similar Messages
-
Aggregate table showing wrong data
Hello Gurus:
I am working on an issue where a report shows a wrong value from an aggregated fact table.
If I exclude the aggregate table from the query, the results are fine. But by default the BI server points to the aggregate table (as it should).
So my questions are:
1) Are aggregate tables refreshed automatically by DAC?
2) I know the data in the aggregate table is wrong. How do I verify it?
3) How do I make a particular report hit the regular table instead of the aggregate table?
Please let me know.
Thanks.
~Vinay
Alright. So I made some progress on this issue; however, it is still not solved.
1) Aggregate tables are refreshed daily by DAC. There is a script for that on the server, which was created using the Aggregate Persistence wizard.
2) Only one column shows wrong data. I have verified this using Toad. However, I don't know why it shows wrong data. Theoretically it should be fine.
3) This question is still the same:
How do I make a particular report hit the regular table instead of the aggregate table?
Please help me out.
Thanks.
~Vinay -
Distinct Count Function-how to use properly
Hello,
I am new to using forums & have only been using Crystal since May of 2009, so i hope i do this correctly & provide the appropriate information. i've looked for this answer in what's been posted but cannot find it. Some things i've read I don't really understand. I only know how to use the functions that are in the software, i don't know how to write them myself (i think that's when people have referred to SQL code or basic syntax)
I have CR Professional, version 11.0.0.1282 (Crystal Reports XI).
I work at a county health dept and we have a annual medicaid cost report, I am linking Crystal to our EMR billing module. i have my report sorted by insurance, ie medicaid, bcbs, abw, hpm etc. and within each ins group i have the clients ID, DOS (date of service), procedure code, charge amt, ins pmt & patient pmt. i have totaled the charges & pmts for each group-works fine. i even have been able to create the formula to adj out the duplicate entries in the billing module (a service was entered wrong then adjusted out then re-entered correctly-without my formula crystal was pulling both these records and adding them to total charges.)
Where my problem lies and what my question is: I need to count encounters. An encounter is the visit, but each visit could have 2 or more procedure codes. This results in multiple lines on my report for one visit, which I want so that the charges add up correctly, but it makes my visit count too high. So I read about the distinct count function, of which there are three listed, and I'm having a hard time understanding the differences. What I tried is a distinct count of the acct ID, so the same acct IDs are only counted one time. But some clients see us more than once per year, meaning the acct ID is the same but the DOS is different. For such a client that would be 2 visits, but Crystal is counting it as 1.
What I want to do is this: count as 1 when the acct ID and DOS are the same. I've tried using the different distinct counts, but when I check my formula it always has errors. So I'm sure my lack of knowledge is what's holding me up; I fully believe Crystal can do this.
Any help would be greatly appreciated.
I created a dummy table and set up acc_id, DOS and Charge.
Created a running total
Summarized acc_id
Type of summary Count
Evaluated using a formula
<> previous ()
and reset on ACC_ID
My groups were sorted by acc_id and date
where there were multiple visits on the same DOS my count was 0
where the dos changed it would count accordingly.
You may need to use two Running totals to get the complete picture. -
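For the encounter count asked about in the Crystal Reports thread above, the underlying idea is a distinct count over the combination of account ID and date of service rather than over the account ID alone. The Python sketch below shows that logic on invented billing lines; the tuple layout and the function name count_encounters are hypothetical, and it is not Crystal formula syntax.

```python
# Invented billing lines: (acct_id, date_of_service, procedure_code, charge).
billing_lines = [
    ("A100", "2009-03-02", "99213", 80.0),   # visit 1, first procedure
    ("A100", "2009-03-02", "90471", 20.0),   # visit 1, second procedure
    ("A100", "2009-09-15", "99213", 80.0),   # same client, later visit
    ("B200", "2009-04-10", "99214", 110.0),  # different client
]


def count_encounters(lines):
    """Count distinct (acct_id, date_of_service) pairs."""
    return len({(acct_id, dos) for acct_id, dos, _code, _charge in lines})


encounters = count_encounters(billing_lines)
distinct_accounts = len({line[0] for line in billing_lines})
total_charges = sum(line[3] for line in billing_lines)

print(encounters, distinct_accounts, total_charges)
```

Counting accounts alone gives 2 here, while counting account and date pairs gives the 3 visits; the charges still sum over every line.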
Report using Tabular Model and Measures based on Distinct Counts
Hello,
I am creating a report that should present something like this:
YEAR-1 | MONTH-1 | MONTH-2 | MONTH-3... | YEAR | MONTH-1 | MONTH-2 | MONTH-3...
My problem is that when designing the dataset to support this layout I drag the Year, Month and Distinct count Measure, but on the report when I want the value for the YEAR level I don't have it and I cannot sum the months value...
What is the best approach to solve this? Do I really have to go to advanced mode and customize my MDX or DAX? Can't basic users do something like this, which seems so trivial and necessary?
Thank you
Luis Simões
Hi Luis,
According to your description, you created a Reporting Services report using an Analysis Services Tabular Model as the data source, and now you want to sum the month values at the year level, right?
In your scenario, you can add the Month field to a column group, add a parent group using the Year field, and then add a Total on the Month group. In this case, Reporting Services will sum the month values at the Year level. I have tested it in my local environment.
Reference: Lesson 6: Adding Grouping and Totals (Reporting Services)
If this is not what you want, please describe your dataset structure, so that we can make further analysis.
Regards,
Charlie Liao
TechNet Community Support -
How can I make a server differ between two or more clients?
How can I make a server differ between two or more clients?
The clients can connect and talk to the server fine, but how can I make the server talk to one, two or all clients? i.e. what would be a good way to implement this?
Currently, the server listens for connections like this:
while (listening) {
try {
new ServerThread(this, serverSocket.accept()).start();
I guess one way would be to add the ServerThreads to a Hashtable with the client ID as key, and then get the ServerThread with the proper client ID, but this seems unnecessarily complicated. Any ideas?
Complicated was perhaps the wrong word; I should have written something like it doesn't "feel" right. Or is this a common and good way to solve communication between a server and multiple clients?
That's pretty much how I do it. I normally use an array or ArrayList of Sockets instead of a Hashtable, with [0] being the first player, etc. Then you can communicate with exactly who you want. If you want to send bytes to all of them, just send the same thing to each socket individually (or is there a better way to do this?). -
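The approach discussed in the multi-client server thread above (keep each connection in a map keyed by client ID, then address one client or loop over all of them) can be sketched as follows. The thread itself uses Java; this sketch is in Python, and the class ClientRegistry, its method names and the client IDs are hypothetical. socket.socketpair() is used only so the demo runs without a real network listener.

```python
import socket
import threading


class ClientRegistry:
    """Thread-safe map from client ID to that client's connection."""

    def __init__(self):
        self._clients = {}
        self._lock = threading.Lock()

    def add(self, client_id, conn):
        with self._lock:
            self._clients[client_id] = conn

    def remove(self, client_id):
        with self._lock:
            self._clients.pop(client_id, None)

    def send_to(self, client_id, data):
        """Send data to exactly one client."""
        with self._lock:
            conn = self._clients[client_id]
        conn.sendall(data)

    def broadcast(self, data):
        """Send the same data to every registered client."""
        with self._lock:
            conns = list(self._clients.values())
        for conn in conns:
            conn.sendall(data)


# Demo: two connected socket pairs stand in for accepted client connections.
registry = ClientRegistry()
server_a, client_a = socket.socketpair()
server_b, client_b = socket.socketpair()
registry.add("alice", server_a)
registry.add("bob", server_b)

registry.send_to("bob", b"hi bob\n")   # only bob receives this
registry.broadcast(b"hello all\n")     # both clients receive this
```

In a real server, each accept loop iteration would call add() with the new connection, and the handler would call remove() when the client disconnects.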
Distinct count of dimension business key in fact table
In my cube I have a fact table which joins to a patient dimension. The patient dimension is a type 2. What I would like to do is get a distinct count of patients who have records in the fact table. The business key in the patient dimension is the PrimaryMrn. So a SQL query would look like this:
SELECT count(distinct PrimaryMrn)
FROM EncounterFact e
INNER JOIN PatientDim p
on e.PatientKey = p.PatientKey
Is it possible to do this via MDX?
Thanks for the help.
If you have to distinct count an attribute in a type 2 SCD, you can choose between:
Denormalizing that attribute into the fact table, and then creating a classical DISTINCT COUNT measure
Using a many-to-many approach - see the "Distinct Count" scenario in the Many-to-Many white paper here:
http://www.sqlbi.com/articles/many2many (for both Multidimensional and Tabular)
If you use Tabular, you might want to read also this pattern:
http://www.daxpatterns.com/distinct-count/
Marco Russo http://www.sqlbi.com http://www.powerpivotworkshop.com http://sqlblog.com/blogs/marco_russo -
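To make the type 2 issue in the patient-dimension thread above concrete, here is a small Python sketch that runs the poster's query shape against SQLite. The table and column names PatientDim, EncounterFact, PatientKey and PrimaryMrn come from the post; the EncounterKey column and all rows are invented. Counting the surrogate key overcounts a patient who has two dimension versions, while counting the business key does not.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.executescript(
    """
    CREATE TABLE PatientDim (PatientKey INTEGER PRIMARY KEY, PrimaryMrn TEXT);
    CREATE TABLE EncounterFact (EncounterKey INTEGER PRIMARY KEY, PatientKey INTEGER);

    -- MRN-001 has two type 2 versions (surrogate keys 1 and 2).
    INSERT INTO PatientDim VALUES (1, 'MRN-001'), (2, 'MRN-001'), (3, 'MRN-002');
    INSERT INTO EncounterFact VALUES (10, 1), (11, 2), (12, 3), (13, 3);
    """
)

by_surrogate_key, by_business_key = conn.execute(
    """
    SELECT COUNT(DISTINCT e.PatientKey),
           COUNT(DISTINCT p.PrimaryMrn)
    FROM EncounterFact e
    INNER JOIN PatientDim p
            ON e.PatientKey = p.PatientKey
    """
).fetchone()

print(by_surrogate_key, by_business_key)
```

This is the relational equivalent only; it says nothing about how the cube measure should be modeled.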
Distinct count for multiple fact tables in the same cube
I'm fairly new to working with SSAS, but have been working with DW environments for many years.
I have a cube which has 4 fact tables. The central fact table is Encounter, and I also have Visit, Procedure and Medication. Visit, Procedure and Medication all join to Encounter on Encounter Key. The relationships between Encounter and Procedure and between Encounter and Medication are both optional 1 to 1. The relationship between Encounter and Visit is an optional 1 to many.
Each of the fact tables joins to the Patient dimension on the Patient Key. The users are looking for a distinct count of patients in all 4 fact tables.
What is the best way to accomplish this so that my cube does not take all day to process? Please let me know if you need any more information about my cube in order to answer this.
Thanks for the help,
Andy
Hi Andy,
Each distinct count measure causes an ORDER BY clause to be added to the SELECT statement sent to the relational data source during processing. In SSAS 2005 or later, a new measure group is created for each distinct count measure (a strategy for improving performance).
Besides, please take a look at the following distinct count optimization techniques:
Create Customized Aggregations
Define a Processing Plan
Create Partitions of Equal Size
Use Partitions Comprised of a Distinct Range of Integers
Distribute the Hash of Your UserIDs
Modulo Function
Hash Function
Choose a Partitioning Strategy
For more detailed information, please refer to the article below:
Analysis Services Distinct Count Optimization:
http://www.microsoft.com/en-us/download/details.aspx?id=891
In addition, here is a good article about SSAS Best Practices for your reference:
http://technet.microsoft.com/en-us/library/cc966525.aspx
Hope this helps.
Elvis Long
TechNet Community Support -
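One item in the optimization list above, partitions made up of distinct ranges of integers, rests on a simple property: when no key value appears in more than one partition, the per-partition distinct counts can simply be added. The Python sketch below illustrates that property on invented patient keys. It is a conceptual illustration only, not a description of SSAS internals, and the function names are hypothetical.

```python
def distinct_count(keys):
    """Number of distinct values in a list of keys."""
    return len(set(keys))


def partition_by_range(keys, boundaries):
    """Split keys into partitions covering non-overlapping [lo, hi) ranges."""
    return [[k for k in keys if lo <= k < hi] for lo, hi in boundaries]


def partition_round_robin(keys, n):
    """Split keys into n partitions without regard to their values."""
    return [keys[i::n] for i in range(n)]


# Invented patient keys, with repeats (one entry per fact row).
patient_keys = [3, 7, 7, 12, 15, 15, 15, 21, 28, 28]

range_parts = partition_by_range(patient_keys, [(0, 10), (10, 20), (20, 30)])
round_robin_parts = partition_round_robin(patient_keys, 3)

total = distinct_count(patient_keys)
sum_over_ranges = sum(distinct_count(p) for p in range_parts)
sum_over_round_robin = sum(distinct_count(p) for p in round_robin_parts)

print(total, sum_over_ranges, sum_over_round_robin)
```

With range partitions the partial counts add up to the true total; with value-blind partitions the same key lands in several partitions, so the partial results cannot simply be added.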
Distinct Count doesn't return the expected results
Hi All,
I have been struggling a little to implement a Distinct Count measure over an account dimension in my cube. I read a couple of posts related to that and followed the steps posted by the experts.
I could process the cube, but the results I'm getting are not correct. The cube returns a higher value than the correct one calculated directly from the fact table.
Here are the details:
Query of my fact table:
select distinct cxd_account_id,
contactable_email_flag,
case when recency_date>current_date-365 then '0-12' else '13-24' end RECENCY_DATE_ROLLUP,
1 QTY_ACCNT
from cx_bi_reporting.cxd_contacts
where cxd_account_id<>-1 and recency_date >current_date-730;
I have the following dimensions:
Account (with 3 different hierarchies)
Contactable Email Flag (Just 3 values, Y, N, Unknown)
Recency_date (Just dimension members)
All dimensions are sparse and the cube is a compressed one. I defined "MAXIMUM" as aggregate for Contactable Email flag and Recency date and at the end, SUM over Account.
I saw that there is a patch to fix an issue when different aggregation rules are implemented in a compressed cube and I asked the DBA folks to apply it. They told me that the patch cannot be applied because we have an advanced version already installed (Patch 11.2.0.1 ).
These are the details of what we have installed:
OLAP Analytic Workspace 11.2.0.3.0 VALID
Oracle OLAP API 11.2.0.3.0 VALID
OLAP Catalog 11.2.0.3.0 VALID
Is there any other patch that needs to be applied to fix this issue? Or it's already included in the version we have installed (11.2.0.3.0)?
Is there something wrong in the definition of my fact table and that's why I'm not getting the right results?
Any help will be really appreciated!
Thanks in advance,
Martín
Not sure I would have designed the dimensions/cubes as you did, but there is another method by which you can obtain distinct counts.
It basically relies on the OLAP DML expression language and can be put in a Calculated Measure, or you can create two Calculated Measures to contain each specific result. I use this method to calculate distinct counts when I want to calculate averages, etc.
IF account_id ne -1 and (recency_date GT today-365) THEN -
CONVERT(NUMLINES(UNIQUELINES(CHARLIST(Recency_date))) INTEGER)-
ELSE IF account_id ne -1 and (recency_date GT today-730 and recency_date LE today-365) THEN -
CONVERT(NUMLINES(UNIQUELINES(CHARLIST(Recency_date))) INTEGER)-
ELSE NA
This exact code may not work in your case, but think you can get the gist of the process involved.
This assumes the aggregation operators are set to the default (Sum), but may work with how you have them set.
Regards,
Michael Cooper -
Distinct count inside a measure group with other measures
Hello,
I have 1 distinct count inside a measure group with other measures, sum, count etc. I know this is not recommended due to poor processing performance and query response time.
Processing performance I can live with if it means not having another measure group, which increases processing time anyway.
I have used the recommended approach before and it generated many questions about what this second measure group is for (visible via excel), even though I made the distinct count appear in the main measure group via a calculated measure.
(it would be nice if you could hide measure groups)
However, my question is: is query response time only affected when the distinct count is used in the query? Or is query response time affected regardless of whether the distinct count is used or not?
Below is an extract from the 2005 distinct count optimization white paper. It's not completely clear, but I assume it affects queries regardless of whether the distinct count is used or not?
"By adding other measures to the measure group holding a distinct count measure, all of the other measures will be at the same granularity as the distinct count measure, resulting in inefficient data structures and suboptimal queries."
You might also be interested in reading this blog post, which deals with a similar scenario, to get a feeling for some of the things that might be going on behind the scenes:
http://cwebbbi.wordpress.com/2012/11/27/storage-engine-caching-measures-and-measure-groups/
Chris
Check out my MS BI blog I also do
SSAS, PowerPivot, MDX and DAX consultancy
and run public SQL Server and BI training courses in the UK -
Hi,
I'm facing a little issue calculating a distinct count of the number of clients in an SSAS OLAP cube. The difficulty appears for the credited clients' accounts, in other words, for the clients who have a credit (quantity = -1) or for the clients who bought the product and received a credit afterwards (quantity = 0). My current distinct count in the cube considers these two cases as real buying transactions, but in fact they are not. I've looked in SSAS for a way to make a distinct count with the expression (SUM Quantity > 1), but I didn't find anything. Now I'm thinking of modeling these cases directly in my data warehouse, but I don't see how to do it. Can anyone give me a little help? Thanks.
Hi Merouane,
According to your description, you want to count the members with the condition quantity = -1, right? In this case, we can use the Filter function inside the Count function. The query will look like the one below.
Count(Filter([Product].[Product].[Product], [Measures].[Internet Order Quantity] =-1))
However, the Filter function might cause a performance issue, so we can change the query to
Sum(
[Product].[Product].[Product],
Iif([Measures].[Internet Order Quantity] =-1,1,0))
For detailed information about this, please refer to the link below.
http://sqlblog.com/blogs/mosha/archive/2007/11/22/optimizing-count-filter-expressions-in-mdx.aspx
Regards,
Charlie Liao
TechNet Community Support -
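Read literally, the question in the thread above asks for a distinct count of clients that leaves out those whose transactions amount to a credit or net to zero. One way to express that rule outside the cube is to sum the quantity per client first and count only clients with a positive net quantity. The Python sketch below shows that logic; the data, the function name and the "net quantity greater than zero" rule are assumptions about what the poster intends, and the sketch is not MDX.

```python
from collections import defaultdict

# Invented transactions: (client_id, quantity).
transactions = [
    ("C1", 1),              # plain purchase
    ("C2", 1), ("C2", -1),  # purchase later credited, net quantity 0
    ("C3", -1),             # credit only
    ("C4", 1), ("C4", 1),   # two purchases
]


def count_buying_clients(rows):
    """Count clients whose net quantity is positive."""
    net_quantity = defaultdict(int)
    for client_id, quantity in rows:
        net_quantity[client_id] += quantity
    return sum(1 for total in net_quantity.values() if total > 0)


naive_distinct_clients = len({client_id for client_id, _ in transactions})
buying_clients = count_buying_clients(transactions)

print(naive_distinct_clients, buying_clients)
```

The same rule could be applied in the data warehouse load, for example by flagging such clients per period before the cube is processed.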
Distinct count using lookup table
How can I get a distinct count of column values using a different table?
Let's say I want to get a distinct count of all "company_name" records in table "emp" that corespond (match) with "lookup" table, "state" category.
What I want is to find counts for all companies that have a value of "california" in the "state" column of the "lookup" Table. I want the output to look like:
Sears 17
Pennys 22
Marshalls 6
Macys 9
I want the result to show me the company names dynamically as I don't know what they are, just that they are part of the "state" group in the lookup Table. Does this make sense?
MMark,
In the future you might consider creating test cases for us to work with. Something similar to the following, where sample data is created for each table as the union all of multiple select statements:
select 'INIT_ASSESS' lookup_type
, 1 lookup_value
, 'Initial Assessment' lookup_value_desc
from dual union all
select 'JOB_REF', 2, 'Job Reference' from dual union all
select 'SPEC_STA', 3, 'SPEC STA' from dual;
select 'INIT_ASSESS' rfs_category
, 1 val
from dual union all
select 'JOB_REF', 1 from dual union all
select 'JOB_REF', 1 from dual union all
select 'SPEC_STA', null from dual;Then we can either take your select statements and make them the source of a CTAS (create table as) statementcreate table lookup as
select 'INIT_ASSESS' lookup_type
, 1 lookup_value
, 'Initial Assessment' lookup_value_desc
from dual union all
select 'JOB_REF', 2, 'Job Reference' from dual union all
select 'SPEC_STA', 3, 'SPEC STA' from dual;, or include them as subfactored queries by using the with statement:with lookup as (
select 'INIT_ASSESS' lookup_type
, 1 lookup_value
, 'Initial Assessment' lookup_value_desc
from dual union all
select 'JOB_REF', 2, 'Job Reference' from dual union all
select 'SPEC_STA', 3, 'SPEC STA' from dual
), RFS as (
select 'INIT_ASSESS' rfs_category
, 1 val
from dual union all
select 'JOB_REF', 1 from dual union all
select 'JOB_REF', 1 from dual union all
select 'SPEC_STA', null from dual
)
select lookup_value_desc, count_all, count_val, dist_val
from lookup
join (select rfs_category
, count(*) count_all
, count(val) count_val
, count(distinct val) dist_val
from RFS group by rfs_category)
on rfs_category = lookup_type;
Edited by: Sentinel on Nov 17, 2008 3:38 PM -
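Going back to the original question in the lookup-table thread above (counts per company, restricted to companies that the lookup table places in California), one possible reading can be sketched in Python with SQLite. The table names emp and lookup and the columns company_name and state come from the question; the emp_id column, the exact layout of lookup and all rows are assumptions, since the post does not show the table definitions.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.executescript(
    """
    CREATE TABLE emp (emp_id INTEGER PRIMARY KEY, company_name TEXT);
    CREATE TABLE lookup (company_name TEXT, state TEXT);

    INSERT INTO emp (company_name) VALUES
        ('Sears'), ('Sears'), ('Sears'), ('Macys'), ('Macys'), ('Target');
    INSERT INTO lookup VALUES
        ('Sears', 'california'), ('Macys', 'california'), ('Target', 'texas');
    """
)

# Count emp rows per company, keeping only companies listed for california.
# The IN subquery avoids inflating counts if lookup repeats a company.
company_counts = conn.execute(
    """
    SELECT company_name, COUNT(*) AS cnt
    FROM emp
    WHERE company_name IN (SELECT company_name
                           FROM lookup
                           WHERE state = 'california')
    GROUP BY company_name
    ORDER BY company_name
    """
).fetchall()

for name, cnt in company_counts:
    print(name, cnt)
```

The company names are never hard-coded in the query; they come from whatever the lookup table contains for the chosen state.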
MSSQL2005 Analysis Service Distinct Count
hi,
I am currently trying to build a distinct count on my cube (MSSQL 2005 Analysis Services).
But after I added the distinct count on the field I want and started processing, the following errors appeared.
- Errors in the OLAP storage engine: The sort order specified for distinct count records is incorrect.
- Errors in the OLAP storage engine: An error occurred while processing the 'FACT VIEW STATISTIC' partition of the 'FACT VIEW STATISTIC 1' measure group for the 'Accident Statistic' cube from the OLAP_PROJECT database.
the count measure works fine.
will appreciate any help on this distinct count problem.
thanks in advance.
HY
I also received this error:
"Errors in the OLAP storage engine: The sort order specified for distinct count records is incorrect. "
Running SQL Server 2005 SP2 Enterprise Edition
The collation between SQL Server and Analysis Services was the same.
The distinct count was on a character data type.
There were no NULLs in the data.
The cube was processing fine until new data was added.
After some investigation into the data it seems that the culprit was one row that the data length was 13 characters on the column of the distinct count. Everything else was less than 13 characters. (See results below). Updating this one row solved the problem. The exact value of the data is: '1-4296-175-9'
Here is a result set:
select len(columnname) as data_length, count(*) as count
from [tablename]
group by len(columnname)
order by data_length
data_length count
2 3
5 1
6 3
7 2
9 1
10 856
13 1
My question though is if SQL2005 can do distinct counts on strings then why choke on one row with an extra length? -
Sql query to get distinct count
Hi
I use SQL Server Management Studio
Can I have a SQL query that returns a count, as shown below, for each Month and Name combination, so that I get a distinct count?
For example, if there are two rows with the same date period and the same name, then the count should be one in the first row and zero in the next row with the same data.
Table Name: Table1
Column: Month, Name
Month Name Count
12/1/2012 0:00 AK 1
12/1/2012 0:00 AK 0
12/1/2012 0:00 AB 1
1/1/2013 0:00 AK 1
1/1/2013 0:00 AK 0
1/1/2013 0:00 AB 1
3/1/2013 0:00 AA 1
3/1/2013 0:00 AK 1
3/1/2013 0:00 AK 0
6/1/2013 0:00 AA 1
6/1/2013 0:00 AK 1
6/1/2013 0:00 AK 0
9/1/2013 0:00 AA 1
9/1/2013 0:00 AK 1
9/13/2013 0:00 AK 1
10/1/2013 0:00 AA 1
10/1/2013 0:00 AK 1
10/1/2013 0:00 AK 0
Hi,
Thanks for the query, but it gives the total count, as shown below.
If you look at the second row in the table below, AK for 2012-12-1 gives a total count of 2, but I need the query to show 1 for the first row and 0 thereafter.
query result
Month name cnt
2012-12-01 00:00:00.000 AB 1
2012-12-01 00:00:00.000 AK 2
2012-12-01 00:00:00.000 AK 2
2013-01-01 00:00:00.000 AB 1
2013-01-01 00:00:00.000 AK 2
2013-01-01 00:00:00.000 AK 2
2013-03-01 00:00:00.000 AA 1
2013-03-01 00:00:00.000 AK 2
2013-03-01 00:00:00.000 AK 2
2013-06-01 00:00:00.000 AA 1
2013-06-01 00:00:00.000 AK 2
2013-06-01 00:00:00.000 AK 2
2013-09-01 00:00:00.000 AA 1
2013-09-01 00:00:00.000 AK 1
2013-09-13 00:00:00.000 AK 1
2013-10-01 00:00:00.000 AA 1
2013-10-01 00:00:00.000 AK 2
2013-10-01 00:00:00.000 AK 2 -
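The output requested in the thread above (1 on the first row of each Month and Name combination, 0 on every repeat) is a "first occurrence" flag rather than a grouped count. The Python sketch below shows the logic on a few of the sample rows; the function name is hypothetical. In SQL Server the same idea is commonly expressed with ROW_NUMBER() OVER (PARTITION BY Month, Name ...) and a CASE that returns 1 only for row number 1, though the exact query depends on which ordering column is available.

```python
# A few (month, name) rows taken from the sample data in the thread.
sample_rows = [
    ("2012-12-01", "AK"),
    ("2012-12-01", "AK"),
    ("2012-12-01", "AB"),
    ("2013-09-01", "AK"),
    ("2013-09-13", "AK"),
]


def flag_first_occurrence(rows):
    """Return (month, name, flag) with flag 1 on first sight, else 0."""
    seen = set()
    flagged = []
    for month, name in rows:
        key = (month, name)
        flagged.append((month, name, 0 if key in seen else 1))
        seen.add(key)
    return flagged


flagged_rows = flag_first_occurrence(sample_rows)
for row in flagged_rows:
    print(row)
```

Summing the flag column then gives the distinct count of Month and Name combinations.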
Hi..
Can anyone tell me how to do the SQL
SELECT EMPNO,DISTINCT(COUNT(SALARY))
FROM EMPLOYEE
Thanks in advance......
Rajeshwari
You need to count the distinct account_keys in the account table as an in-line view and join that to the product table. Something like:
SELECT p.product_key, p.name, a.cnt
FROM product p,
(SELECT product_key, COUNT(DISTINCT account_key) cnt
FROM account
GROUP BY product_key) a
WHERE p.product_key = a.product_key
Your sample data gives odd results, though, because there are two entries for p1 A and product p4 has two names, D and B; since p4 does not appear in accounts, it makes no difference in this particular case.
If you actually want to show all of the products from the product table and a zero count if they do not appear in accounts, then you need an outer join, something like:
SELECT p.product_key, p.name, NVL(a.cnt, 0) cnt
FROM product p
LEFT JOIN (SELECT product_key, COUNT(DISTINCT account_key) cnt
FROM account
GROUP BY product_key) a
ON p.product_key = a.product_key
HTH
John -
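The outer-join variant from the reply above can be tried out with a small Python and SQLite sketch. The table and column names product, account, product_key, account_key and name come from the reply; the rows are invented, and COALESCE is used in place of Oracle's NVL because SQLite is only a stand-in here.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.executescript(
    """
    CREATE TABLE product (product_key TEXT, name TEXT);
    CREATE TABLE account (account_key INTEGER, product_key TEXT);

    INSERT INTO product VALUES ('p1', 'A'), ('p2', 'B'), ('p3', 'C');
    -- Account 1 appears twice for p1; p3 has no accounts at all.
    INSERT INTO account VALUES (1, 'p1'), (1, 'p1'), (2, 'p1'), (3, 'p2');
    """
)

product_counts = conn.execute(
    """
    SELECT p.product_key, p.name, COALESCE(a.cnt, 0) AS cnt
    FROM product p
    LEFT JOIN (SELECT product_key, COUNT(DISTINCT account_key) AS cnt
               FROM account
               GROUP BY product_key) a
           ON p.product_key = a.product_key
    ORDER BY p.product_key
    """
).fetchall()

for row in product_counts:
    print(row)
```

Products without any account still appear, with a count of zero, because the distinct counts are computed in the in-line view before the outer join.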
I am doing a COUNT of accounts in my report. How can I achieve a DISTINCT count? Can this be done without using a Virtual Key Figure?
Thanks
You mean you want to have a counter for each time an account gets added, like the number of records in the ODS? I may have got your question wrong, but if all you need is a counter for your accounts, then you can create a characteristic for the counter and add it to your report.