"group by" slow for using "count(distinct some_column)" - a better way?

Hi all,
I have a query of this form:
select
count(distinct some_column)
from [...]
group by [...];
It is slowed down by the "count(distinct some_column)".
The "group by" aggregates base records.
But the base records have a 1:n relationship: each event (#1) has #n records.
Some of the #n records fall into group by result record (A), some others into group by result record (B).
But each event shall only count +1 per result record, regardless of how many of its #n records have fallen into that category.
Is there another (faster) way to count this?
Thanks!
best regards,
Frank
Edited by: user8704911 on Jun 29, 2011 1:30 AM
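One rewrite that is often tried for this kind of query is to collapse duplicate (group, event) pairs first and then use a plain COUNT(*). The sketch below is only an illustration: the table base_records and its columns category and event_id are hypothetical names (the thread does not show the real ones), and it runs on Python's built-in sqlite3 rather than Oracle. Whether the rewrite is actually faster depends on the optimizer and the data, and the two forms agree only if event_id is never NULL.

```python
import sqlite3

# Hypothetical table and column names; the thread does not show the real ones.
conn = sqlite3.connect(":memory:")
conn.execute("CREATE TABLE base_records (category TEXT, event_id INTEGER)")
conn.executemany(
    "INSERT INTO base_records VALUES (?, ?)",
    [("A", 1), ("A", 1), ("B", 1), ("A", 2), ("B", 3), ("B", 3)],
)

# Original shape: COUNT(DISTINCT ...) inside the GROUP BY.
with_distinct = conn.execute(
    """
    SELECT category, COUNT(DISTINCT event_id)
    FROM base_records
    GROUP BY category
    ORDER BY category
    """
).fetchall()

# Rewrite: remove duplicate (category, event_id) pairs, then count rows.
dedup_first = conn.execute(
    """
    SELECT category, COUNT(*)
    FROM (SELECT DISTINCT category, event_id FROM base_records)
    GROUP BY category
    ORDER BY category
    """
).fetchall()

print(with_distinct)
print(dedup_first)
```

Each event is counted once per category in both forms, no matter how many of its records fall into that category.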

Hi Dom,
incidentally, I went in the direction you proposed:
I replaced the PL/SQL collection with a global temporary table.
But the reason for doing this was a different one:
I noticed that the group by is much faster if applied to a table or global temporary table.
At first, however, I only moved the data from the PL/SQL collection to a global temporary table in order to apply the group by there.
The group by was then much faster, but moving the data from the PL/SQL collection to the global temporary table took up the time saved.
So the problem was not the group by, but read access to the PL/SQL collection in general (by the way, around 65,000 records).
Now that I have completely replaced the PL/SQL collection with a global temporary table, everything is fine.
cheers,
Frank

Similar Messages

Configuration of logon group !DIAG for use by web dispatcher

We are load balancing our HTTP traffic across a range of ABAP+Java Add-in instances and have found that the load is balanced according to the logon group !DIAG. Does anyone know if it is possible to configure this group (i.e. to remove the CI), or how to use a different logon group?
I think you can do something with URL mapping in SICF, but do not know how to configure it.
Regards
Jack Inniss.

Hi Jack
The internal server group !DIAG is made up of all application servers defined in SMLG, I believe. The web dispatcher uses the ICF service /sap/public/icf_info to determine the content of this group in conjunction with URL mapping.
If you execute this service you'll see the list of servers and logon groups that will be used for the !DIAG group. Is the CI defined in SMLG? Also, you may find the following page on SAP Help useful.
Server Groups in the Internet Communication Framework: http://help.sap.com/saphelp_nw04/helpdata/en/a8/a8463c53a0ff02e10000000a114084/frameset.htm
Regards,
Gary

One Apple ID that the whole family uses: is there a better way to use Family Sharing?

My wife and I share the same Apple ID on all of our Apple devices (Mac running Yosemite, iPhone 4s, iPad). My sons now have their own iPods and I want them to use the ID as well, but I am having trouble because I am getting texts and other messages when my sons send messages to their friends.
My sons are 9 and 6, so getting them an Apple ID is not appropriate at this time (or allowed). Is there a way that I can set up a child account under our Apple ID?

Hi Mdbarnharts,
Thanks for visiting Apple Support Communities.
It sounds like your family members are using the same Apple ID to share purchases, and now you're receiving messages sent and received by your kids.
You can set up and manage Apple IDs for your children through the Family Sharing feature.
You can find more information about Family Sharing and setting up an Apple ID for your kids at these resources:
Start or join a family group using Family Sharing - Apple Support
Family Sharing and Apple IDs for kids - Apple Support
All the best,
Jeremy

Query rewrite for COUNT(DISTINCT)

Hi,
I have a fact table with different dimension keys.
CREATE TABLE FACT (
TIME_SKEY NUMBER,
REGION_SKEY NUMBER,
AC_SKEY NUMBER
);
I need to take COUNT(DISTINCT AC_SKEY) by TIME_SKEY and REGION_SKEY. There are Oracle dimensions defined for time and region which use TIME_SKEY and REGION_SKEY. I have created an MV with query rewrite using COUNT(DISTINCT), but it does not use the dimension if I query at any other level, and the MV can't be fast refreshed because it was built using COUNT(DISTINCT).
CREATE MATERIALIZED VIEW AC_MV
NOCACHE
NOLOGGING
NOCOMPRESS
NOPARALLEL
BUILD IMMEDIATE
REFRESH COMPLETE ON DEMAND
WITH PRIMARY KEY
ENABLE QUERY REWRITE
AS
SELECT
TIME_SKEY ,
REGION_SKEY,
COUNT (DISTINCT AC_SKEY)
FROM FACT
GROUP BY TIME_SKEY, REGION_SKEY;
Query used to retrieve data is as below
SELECT TIME_SKEY, COUNT(DISTINCT AC_SKEY) OVER (PARTITION BY TIME_SKEY) UNIQ_AC, COUNT(DISTINCT AC_SKEY) OVER () UNIQ_AC1
FROM FACT;
There can be other queries based on time / region dimension.
Can you please provide help in solving above issue?
Thanks,
Pritesh

What version of the Oracle database?

Set Aggregation type of Count Distinct to use correct table aggregation in

Hi there,
Currently I use OBIEE 10.1.3.4.1, and there is a case where a fact table consists of 2 logical table sources, a detail table and an aggregate table, and has some measures using count distinct as the aggregation type. The problem is that every time I browse the measure with no dimension at all, it always uses the detail table, not the aggregate one.
I'd really appreciate any suggestions.
thanks a lot

Hi,
I don't think it's the same case as mine. Let's say I have 2 tables: detail and aggregate.
The detail table consists of 4 fields:
*) Period
*) Market
*) Region
*) Measure : Customer ID, Sales
The aggregate table consists of 3 fields:
*) Period
*) Region
*) Measure : Customer ID, Sales
In the measure I set the aggregation type for each field:
*) Sales >> set as Sum
*) Customer ID >> copy as "Number of Customer" and set as Count Distinct
In each LTS's content I set the level of aggregation using the "Get Levels" feature.
Then I browse via Presentation and run the queries below:
a) Choosing only the single measure Sales: the session shows that the value is taken from the aggregate table, just as I expected.
b) Choosing period and sales: the session shows that the values are taken from the aggregate table, still just as I expected.
c) Choosing period, sales and market: the session shows that the values are taken from the detail table, just as I expected.
d) Choosing only the single measure "Number of Customer": the session shows that the value is taken from the detail table, which is NOT what I expected. It is supposed to take the value from the aggregate table.
e) Choosing period and "Number of Customer": the session shows that the value is taken from the detail table, which is also NOT what I expected. It is supposed to take the value from the aggregate table.
I've tried to override the aggregation, but I am still confused about how to apply it to the measure "Number of Customer", and it did not work at all.
Any idea?
thanks a lot

Count distinct on a varchar field

Hi,
I want to count the unique values for a field, but when I try this in an aggregator I get the following error:
aggrete function COUNT expected a numeric type.
This is the query I want to reproduce in a mapping to update a table field.
select material, material_grid, count (distinct season)
from material_season
group by material, material_grid
having count (distinct season)>1
Why isn't this possible in an aggregator?
regards,
Osman

Hi Osman
When do you get the error? Which version of OWB are you using? What SQL do you get generated - look at the intermediate code gen within the mapping.
Cheers
David

Count Distinct With CASE Statement - Does not follow aggregation path

All,
I have a fact table, a day aggregate and a month aggregate. I have a time hierarchy and the month aggregate is set to the month level, the day aggregate is set to the day level within the time hierarchy.
When using any measures and a field from my time dimension .. the appropriate aggregate is chosen, ie month & activity count .. month aggregate is used. Day & activity count .. day aggregate is used.
However - when I use the count distinct aggregate rule .. the request always uses the lowest common denominator. The way I have found to get this to work is to use a logical table source override in the aggregation tab. Once I do this .. it does use the aggregates correctly.
A few questions
1. Is this the correct way to use aggregate navigation for the count distinct aggregation rule (using the source override option)? If yes, why is this necessary for count distinct .. what is special about it?
2. The main problem I have now is that I need to create a simple count measure that has a CASE statement in it. The only way I see to do this is to select the Based on Dimensions checkbox which then allows me to add a CASE statement into my count distinct clause. But now the aggregation issue comes back into play and I can't do the logical table source override when the based on dimensions checkbox is checked .. so I am now stuck .. any help is appreciated.
K

Ok - I found a workaround (and maybe the preferred solution for my particular issue), which is: using a CASE statement with a COUNT DISTINCT aggregation and still having AGGREGATE AWARENESS.
To get all three of the requirements above to work I had to do the following:
- Create the COUNT DISTINCT as normal (counting on a USERID physically mapped column in my case)
- Now I need to map my fact and aggregates to this column. This is where I got the case statement to work. Instead of trying to put the case statement inside the aggregate definition by using the checkbox 'Based on Dimensions' (which didn't allow for aggregate awareness for some reason), I specified the case statement in the Column Mapping section of the fact and aggregate tables.
- Once all the LTS's (facts and aggregates) are mapped .. you still have to define the Logical Table Source overrides in the aggregate tab of the count distinct definition. Add in all the fact and aggregates.
Now the measure will use my month aggregate when I specify month, the day aggregate when I specify day, etc.
If you are just trying to use a Count Distinct (no CASE statement needed) with Aggregate Awareness, you just need to use the Logical Table Source override on the aggregate tab.
There is still a funky issue when using the COUNT aggregate type. As long as you don't map multiple logical table sources to the COUNT column it works fine and as expected. But if you try to add in multiple sources and aggregate awareness, it randomly starts SUMMING everything .. very weird. The blog in this thread says to check the 'Based on Dimensions' checkbox to fix the problem, but that did not work for me. Still not sure what to do on this one .. but it's not currently causing me a problem so I will ignore it for now ;)
Thanks for all the help
K

Count distinct can not be aggregate

Hi All,
I created a Discoverer report and then created a calculation using the Count Distinct function. When I create a total on this calculation, the data is not shown on the report.
How can I correct this problem? I don't want to create a custom folder, because
with that method I would have to create a lot of custom folders to support all reports.
Thank you
Mcka

Yes, that's right. Normally when you ask Discoverer to create a total it will use the SUM command. The SUM command is a function that adds up all the columns with natural numbers, but for columns with calculations it tries to do a SUM of the calculation. Many times this will work, but sometimes the formula becomes impossible and you end up with nothing in the total.
CELL SUM tells the system to literally add up the values in the column above and ignore the underlying formulas, hence it has a better chance of evaluating the total when there are functions being used in the calculations.
Does this help?
Regards
Michael

COUNT DISTINCT in SAP Hana

I am not getting the results I am looking for when I use the COUNT DISTINCT function. I was able to write the Calc View using COUNT (DISTINCT <column_name1> || <column_name2>), but when I look at it in the explorer it doesn't give me the right results. On the other hand, I have the same system in MSAS 2000 and MSAS 2008, and they give the right results; when porting the same query to a CalcView for SAP HANA, the results don't tally. Any suggestions or experience to share? See the link below:
http://www.sdn.sap.com/irj/scn/go/portal/prtroot/docs/library/uuid/401eb46e-d407-2f10-c999-e467d93eae50?QuickLink=index&overridelayout=true&53244709744350
Other Examples
- Exception Aggregation (e.g. Distinct Count only available as BWA Calculation Engine feature)
http://www.saptechno.com/sap-notes.html?view=sapnote&id=1631919

I believe you are more likely to get a response in the In-Memory Business Data Management (SAP HANA and In-Memory Computing) forum.
I'd move your post there, but I don't have the rights for this.
Please remember to close this thread.
Thank you for your understanding,
- Ludek

Best practice for use of spatial operators

Hi All,
I'm trying to build a .NET toolkit to interact with Oracle's spatial operators. The most common use of this toolkit will be to find results which are within a given geometry - for example, select parish boundaries within a county.
Our boundary data is high detail, commonly containing upwards of 50,000 vertices for a county-sized polygon.
I've currently been experimenting with queries such as:
select
from
uk_ward a,
uk_county b
where
UPPER(b.name) = 'DORSET COUNTY' and
sdo_relate(a.geoloc, b.geoloc, 'mask=coveredby+inside') = 'TRUE';
However the speed is unacceptable, especially as most of the implementations of the toolkit will be web based. The query above takes around a minute to return.
Any comments or thoughts on the best practice for use of Oracle spatial in this way will be warmly welcomed. I'm looking for a solution which is as quick and efficient as possible.

Thanks again for the reply... the query currently takes just under 90 seconds to return. Here are the results from the execution plan run in SQL*Plus:
Elapsed: 00:01:24.81
Execution Plan
Plan hash value: 598052089
| Id  | Operation                   | Name       | Rows | Bytes | Cost (%CPU)| Time     |
|-----|-----------------------------|------------|------|-------|------------|----------|
|   0 | SELECT STATEMENT            |            |  156 | 46956 |    76   (0)| 00:00:01 |
|   1 | NESTED LOOPS                |            |  156 | 46956 |    76   (0)| 00:00:01 |
|*  2 | TABLE ACCESS FULL           | UK_COUNTY  |    2 |   262 |     5   (0)| 00:00:01 |
|   3 | TABLE ACCESS BY INDEX ROWID | UK_WARD    |   75 | 12750 |    76   (0)| 00:00:01 |
|*  4 | DOMAIN INDEX                | UK_WARD_SX |      |       |            |          |
Predicate Information (identified by operation id):
2 - filter(UPPER("B"."NAME")='DORSET COUNTY')
4 - access("MDSYS"."SDO_INT2_RELATE"("A"."GEOLOC","B"."GEOLOC",'mask=coveredby+inside')='TRUE')
Statistics
20431 recursive calls
60 db block gets
22432 consistent gets
1156 physical reads
0 redo size
2998369 bytes sent via SQL*Net to client
1158 bytes received via SQL*Net from client
17 SQL*Net roundtrips to/from client
452 sorts (memory)
0 sorts (disk)
125 rows processed
The wards table has 7545 rows, the county table has 207.
We are currently on release 10.2.0.3.
All I want to do with this is generate results which fall in a particular geometry. Most of my testing has been successful; I just seem to run into issues when querying against a county-sized polygon - I guess due to the number of vertices.
Also looking through the forums now for tuning topics...

Logical Aggregate Column (count(distinct)) Does Not Group for SQL Server DB

When utilizing the count(distinct column_name) aggregate function within a Logical Fact source in the Business Model and Mapping layer in the RPD file the output in BI Answers is not grouping correctly for SQL Server 2008 database sources only. All Oracle database sources represent the same aggregate column correctly within BI Answers.
I am using OBIEE version 10.1.3.3.3
Does anyone know how to resolve this issue?
Thanks in advance,
Kyle

I thought that I would update my current findings with this issue. If you display the report in BI Answers as a Pivot Table view the aggregate column displays properly, it does not in a Table or Compound Layout view for some reason. I am still working with Oracle Support on this.

Count distinct values current group

Hi--
Is there a way to count the distinct values within the current group? I.e., I've got a PO and want to display all the addresses at the shipment level if there is more than one distinct one, but if they are all the same as the header-level address, then I don't want any of them to show up.
I'm using the following at the header level to tell the header not to show up if there are multiple shipment level addresses, but can't seem to get a similar statement to work when it's sitting at the same level as the group that I want to count.
This is what I use at the header--it seems to work:
<?if:count(xdoxslt:distinct_values(PLL_SHIP_ADDRESS_LINE1))>1?>See Details Below<?end if?><?if:count(xdoxslt:distinct_values(PLL_SHIP_ADDRESS_LINE1))=1?>
POH_SHIP_ADDRESS_LINE1
POH_SHIP_ADDRESS_LINE2
POH_SHIP_ADDRESS_LINE3
POH_SHIP_ADR_INFO POH_SHIP_COUNTRY<?end if?>
A really simplified version of the structure of the report is below:
<?xml version="1.0" ?>
<SMTPOXPRPOP2>
  <LIST_G_INIT_INFO>
    <G_INIT_INFO>
      <MANUAL_PO_NUM_TYPE>NUMERIC</MANUAL_PO_NUM_TYPE>
      <C_COMPANY>CompanyName</C_COMPANY>
      <LIST_G_HEADERS>
        <G_HEADERS>
          <POH_PO_NUM>310001100</POH_PO_NUM>
          <LIST_G_LINES>
            <G_LINES>
              <POL_VENDOR_PROD_NUM>12q</POL_VENDOR_PROD_NUM>
              <POL_ITEM_DESCRIPTION>sample</POL_ITEM_DESCRIPTION>
              <POL_QUANTITY_TO_PRINT>10</POL_QUANTITY_TO_PRINT>
              <LIST_G_SHIPMENTS>
                <G_SHIPMENTS>
                  <PLL_SHIP_COUNTRY>Canada</PLL_SHIP_COUNTRY>
                  <PLL_SHIP_ADR_INFO>Calg,AB Zip</PLL_SHIP_ADR_INFO>
                  <PLL_SHIP_ADDRESS_LINE3 />
                  <PLL_SHIP_ADDRESS_LINE2 />
                  <PLL_SHIP_ADDRESS_LINE1>Ad1</PLL_SHIP_ADDRESS_LINE1>
                </G_SHIPMENTS>
                <G_SHIPMENTS>
                  <PLL_SHIP_COUNTRY>Canada</PLL_SHIP_COUNTRY>
                  <PLL_SHIP_ADR_INFO>Calg,AB Zip</PLL_SHIP_ADR_INFO>
                  <PLL_SHIP_ADDRESS_LINE3 />
                  <PLL_SHIP_ADDRESS_LINE2 />
                  <PLL_SHIP_ADDRESS_LINE1>Ad1</PLL_SHIP_ADDRESS_LINE1>
                </G_SHIPMENTS>
              </LIST_G_SHIPMENTS>
            </G_LINES>
          </LIST_G_LINES>
          <POH_SHIP_ADDRESS_LINE2 />
          <POH_SHIP_COUNTRY>Canada</POH_SHIP_COUNTRY>
          <POH_SHIP_ADR_INFO>Kanata,ON K2V 0A2</POH_SHIP_ADR_INFO>
          <POH_SHIP_ADDRESS_LINE3 />
          <POH_SHIP_ADDRESS_LINE1>XXX Palladium Drive</POH_SHIP_ADDRESS_LINE1>
        </G_HEADERS>
      </LIST_G_HEADERS>
    </G_INIT_INFO>
  </LIST_G_INIT_INFO>
</SMTPOXPRPOP2>
Could anyone help out with this?
Thanks--I'd really appreciate it!
Kate

Hi Vetsrini--
Thanks for getting back to me so quickly! I'd love to email you a copy and the XML if you wouldn't mind taking a look. It'll probably be clearer than me trying to explain.
I can't quite figure out how to do that, though---your profile doesn't list an email. Do I need to click elsewhere?
Thanks!
Kate

Totals for Count Distinct

I need to display totals for Count Distinct measures. I want to display these above a table view.
We have done this before by creating hidden columns with level-based measures for totals and then displaying the first row of these hidden columns in a narrative view above the table. We have also used MAX(RSUM()) within requests, sometimes.
These solutions won't work, because I need Count Distinct() measures (so simple sums and counts will give inaccurate results) and I may navigate to the request with filters at different levels (so LBMs won't work, either).
The only solution I can think of is to have LBMs for each level and have duplicate dashboards that differ only in which variation of this request with which level's LBMs are displayed for the totals. That seems like too much of a kluge. There should be a simpler, better way to do this.
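The reason simple sums give inaccurate totals here can be shown with a small sketch. It uses Python's built-in sqlite3 and a made-up table sales(region, customer); these names and rows are assumptions for illustration only and do not come from the poster's model. A customer who appears in two regions is counted once in each regional row, so adding up the rows overstates the true distinct total.

```python
import sqlite3

# Hypothetical data: customer c1 buys in both regions.
conn = sqlite3.connect(":memory:")
conn.execute("CREATE TABLE sales (region TEXT, customer TEXT)")
conn.executemany(
    "INSERT INTO sales VALUES (?, ?)",
    [("East", "c1"), ("East", "c2"), ("West", "c1"), ("West", "c3"), ("West", "c3")],
)

# Count distinct per region, as a table view would show it.
per_region = conn.execute(
    """
    SELECT region, COUNT(DISTINCT customer)
    FROM sales
    GROUP BY region
    ORDER BY region
    """
).fetchall()

# Adding up the displayed rows double-counts c1.
sum_of_rows = sum(n for _, n in per_region)

# The correct total has to be computed over the ungrouped data.
grand_total = conn.execute("SELECT COUNT(DISTINCT customer) FROM sales").fetchone()[0]

print(per_region, sum_of_rows, grand_total)
```

The total therefore has to be evaluated against the underlying rows at whatever filter level is active, not derived from the displayed per-group values.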

I was trying to reproduce your issue with "Sample Sales" - but can't figure out which columns you'd like to see. Can you please post a couple of columns, and which count distinct you need? That would make it easier to reproduce the issue.
I was thinking that it might be difficult to pull it in 1 report (since you can't completely exclude columns in table view). I have two suggestions:
a) did you try to create a separate report and combine it with existing one (same Dashboard page)?
b) did you try Pivot Table and its calculated column feature? I've had some success with it when I needed to combine measures at different levels on the same report (I needed to see daily totals for 3 specific days, monthly values for specific months, and a couple of annual totals). This way you could have it on the same report.
I just tried A. And it worked (again, not sure if this is applicable to your situation). I used "Server Complex Aggregate" in column options. The formula is showing: SELECT "D5 Employee"."E01 Employee Name" saw_0, COUNT(DISTINCT "D1 Customer"."C1 Cust Name") saw_1 FROM "Sample Sales" ORDER BY saw_0
Edited by: wildmight on Oct 30, 2009 9:35 AM

Group by count distinct

mytable
id | yy
1 | 78
2 | 78
3 | 78
3 | 79
3 | 79
4 | 79
5 | 79
5 | 80
Desired output:
yy | id_count
78 | 3
79 | 2
80 | 0
Following query doesn't work, as it doesn't take into account that id was already counted
select yy, count(distinct id) as id_count
from mytable
group by yy
--output
yy | id_count
78 | 3
79 | 4
80 | 1
Hope this makes sense.
Ideas?

Hi,
You only want to count each id once, with the first (that is, lowest) yy: is that right?
Here's one way:
WITH     got_r_num    AS
(
     SELECT  id
     ,       yy
     ,       ROW_NUMBER () OVER ( PARTITION BY  id
                                  ORDER BY      yy
                                ) AS r_num
     FROM    mytable
)
SELECT    yy
,         COUNT ( CASE
                      WHEN  r_num = 1
                      THEN  id
                  END
                ) AS id_cnt
FROM      got_r_num
GROUP BY  yy
ORDER BY  yy
;
Doing anything for the first of each id is probably a job for "ROW_NUMBER () OVER (PARTITION BY id ...)".
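As a cross-check on the sample data, here is a different formulation that avoids analytic functions: find each id's first yy with MIN, then outer-join that to the list of years so that years with no "new" ids still show 0. This is a sketch run on Python's built-in sqlite3 rather than Oracle; the aliases y, f and first_yy are names made up for the example.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.execute("CREATE TABLE mytable (id INTEGER, yy INTEGER)")
conn.executemany(
    "INSERT INTO mytable VALUES (?, ?)",
    [(1, 78), (2, 78), (3, 78), (3, 79), (3, 79), (4, 79), (5, 79), (5, 80)],
)

rows = conn.execute(
    """
    SELECT y.yy, COUNT(f.id) AS id_cnt
    FROM (SELECT DISTINCT yy FROM mytable) y
    LEFT JOIN (SELECT id, MIN(yy) AS first_yy   -- first year each id appears
               FROM mytable
               GROUP BY id) f
           ON f.first_yy = y.yy
    GROUP BY y.yy
    ORDER BY y.yy
    """
).fetchall()

for yy, id_cnt in rows:
    print(yy, id_cnt)
```

On the posted data this yields 3 for 78, 2 for 79 and 0 for 80, which matches the desired output in the question.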

How do I use count for this query?

How do I display all the addresses in a table that have more than one (or >1) account number? I wasn't sure how or if I should use count along with group by and having to get the expected results.

select address from tablename
group by address having count(1) > 1;
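Note that count(1) counts rows per address. If the same account number can appear on several rows, counting distinct account numbers may be closer to what was asked. A minimal sketch on Python's built-in sqlite3, assuming a column called account_number (the real column name is not given in the question) and made-up rows:

```python
import sqlite3

conn = sqlite3.connect(":memory:")
# account_number is an assumed column name.
conn.execute("CREATE TABLE tablename (address TEXT, account_number INTEGER)")
conn.executemany(
    "INSERT INTO tablename VALUES (?, ?)",
    [
        ("1 Main St", 100), ("1 Main St", 100), ("1 Main St", 101),
        ("2 Oak Ave", 200), ("2 Oak Ave", 200),
        ("3 Elm Rd", 300),
    ],
)

# Addresses with more than one row (what count(1) > 1 tests).
more_than_one_row = [r[0] for r in conn.execute(
    """
    SELECT address FROM tablename
    GROUP BY address
    HAVING COUNT(*) > 1
    ORDER BY address
    """
)]

# Addresses with more than one different account number.
more_than_one_account = [r[0] for r in conn.execute(
    """
    SELECT address FROM tablename
    GROUP BY address
    HAVING COUNT(DISTINCT account_number) > 1
    ORDER BY address
    """
)]

print(more_than_one_row)
print(more_than_one_account)
```

Here "2 Oak Ave" has two rows but only one account number, so it passes the row-count test and fails the distinct-count test. If each account number occurs only once per address, both queries return the same addresses.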
