Counting distinct values???

Hi all,
Why does this not work?
"Select count(distinct employee_id) from employee_table"
I am looking to return the count of the distinct employee_id.

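Actually, that is valid SQL: COUNT(DISTINCT column) is supported by both Oracle and SQL Server. If it fails, check how the statement is being submitted; for example, the surrounding double quotes must not be part of the statement itself.
If you want a form that avoids COUNT(DISTINCT ...), try:
select count(*) from (select distinct employee_id from employee_table where employee_id is not null) t
Not very neat, but it should work (on Oracle or SQL Server).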
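For reference, COUNT(DISTINCT column) is part of standard SQL. The sketch below runs the question's statement against SQLite through Python's sqlite3 module, chosen only so the example is self-contained; the table layout and sample rows are invented for illustration, and Oracle or SQL Server are assumed to behave the same way for this statement.

```python
import sqlite3

# In-memory database with a hypothetical employee_table (sample rows are invented).
conn = sqlite3.connect(":memory:")
conn.execute("CREATE TABLE employee_table (employee_id INTEGER, dept TEXT)")
conn.executemany(
    "INSERT INTO employee_table VALUES (?, ?)",
    [(1, "A"), (1, "B"), (2, "A"), (3, "C"), (3, "C"), (None, "D")],
)

# The statement from the question, without the surrounding quotes.
direct = conn.execute(
    "SELECT COUNT(DISTINCT employee_id) FROM employee_table"
).fetchone()[0]

# Equivalent subquery form; the NULL filter mirrors how COUNT ignores NULLs.
via_subquery = conn.execute(
    "SELECT COUNT(*) FROM (SELECT DISTINCT employee_id FROM employee_table "
    "WHERE employee_id IS NOT NULL)"
).fetchone()[0]

print(direct, via_subquery)
```

Both forms return the same number here; the subquery variant only needs the explicit NULL filter because COUNT(*) would otherwise count the NULL group as one more value.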

Similar Messages

Count distinct values in report builder

I have a situation where I have to count the distinct number of customers.
I have a query that returns the list of bill_to_customer_id values from the ra_customer_trx_all table, and I need to display only the number of distinct customers. I can't do this in the query itself because it has to be grouped, and I am doing it in an aging report: I have to list the number of distinct customers in each aging period. Can anybody please help me achieve this in Reports 6i?
Thanks

How can I count distinct values in Reports?
The situation is like this:
I have a query which lists customer_id, invoice number and amount due.
What I want is to count the distinct customer_id values and display the number of distinct customers. One customer_id can be repeated any number of times, but I should count it only once.

Count distinct values current group

Hi--
Is there a way to count the distinct values within the current group? ie--i've got a PO and want to display all the addresses at the shipment level if there is more than one distinct one, but if they are all the same as the header-level address, then I don't want any of them to show up.
I'm using the following at the header level to tell the header not to show up if there are multiple shipment level addresses, but can't seem to get a similar statement to work when it's sitting at the same level as the group that I want to count.
This is what I use at the header--it seems to work:
<?if:count(xdoxslt:distinct_values(PLL_SHIP_ADDRESS_LINE1))>1?>See Details Below<?end if?><?if:count(xdoxslt:distinct_values(PLL_SHIP_ADDRESS_LINE1))=1?>
POH_SHIP_ADDRESS_LINE1
POH_SHIP_ADDRESS_LINE2
POH_SHIP_ADDRESS_LINE3
POH_SHIP_ADR_INFO POH_SHIP_COUNTRY<?end if?>
A really simplified version of the structure of the report is below:
<?xml version="1.0" ?>
<SMTPOXPRPOP2>
  <LIST_G_INIT_INFO>
    <G_INIT_INFO>
      <MANUAL_PO_NUM_TYPE>NUMERIC</MANUAL_PO_NUM_TYPE>
      <C_COMPANY>CompanyName</C_COMPANY>
      <LIST_G_HEADERS>
        <G_HEADERS>
          <POH_PO_NUM>310001100</POH_PO_NUM>
          <LIST_G_LINES>
            <G_LINES>
              <POL_VENDOR_PROD_NUM>12q</POL_VENDOR_PROD_NUM>
              <POL_ITEM_DESCRIPTION>sample</POL_ITEM_DESCRIPTION>
              <POL_QUANTITY_TO_PRINT>10</POL_QUANTITY_TO_PRINT>
              <LIST_G_SHIPMENTS>
                <G_SHIPMENTS>
                  <PLL_SHIP_COUNTRY>Canada</PLL_SHIP_COUNTRY>
                  <PLL_SHIP_ADR_INFO>Calg,AB Zip</PLL_SHIP_ADR_INFO>
                  <PLL_SHIP_ADDRESS_LINE3 />
                  <PLL_SHIP_ADDRESS_LINE2 />
                  <PLL_SHIP_ADDRESS_LINE1>Ad1</PLL_SHIP_ADDRESS_LINE1>
                </G_SHIPMENTS>
                <G_SHIPMENTS>
                  <PLL_SHIP_COUNTRY>Canada</PLL_SHIP_COUNTRY>
                  <PLL_SHIP_ADR_INFO>Calg,AB Zip</PLL_SHIP_ADR_INFO>
                  <PLL_SHIP_ADDRESS_LINE3 />
                  <PLL_SHIP_ADDRESS_LINE2 />
                  <PLL_SHIP_ADDRESS_LINE1>Ad1</PLL_SHIP_ADDRESS_LINE1>
                </G_SHIPMENTS>
              </LIST_G_SHIPMENTS>
            </G_LINES>
          </LIST_G_LINES>
          <POH_SHIP_ADDRESS_LINE2 />
          <POH_SHIP_COUNTRY>Canada</POH_SHIP_COUNTRY>
          <POH_SHIP_ADR_INFO>Kanata,ON K2V 0A2</POH_SHIP_ADR_INFO>
          <POH_SHIP_ADDRESS_LINE3 />
          <POH_SHIP_ADDRESS_LINE1>XXX Palladium Drive</POH_SHIP_ADDRESS_LINE1>
        </G_HEADERS>
      </LIST_G_HEADERS>
    </G_INIT_INFO>
  </LIST_G_INIT_INFO>
</SMTPOXPRPOP2>
Could anyone help out with this?
Thanks--I'd really appreciate it!
Kate

Hi Vetsrini--
Thanks for getting back to me so quickly! I'd love to email you a copy and the XML if you wouldn't mind taking a look. it'll probably be more clear than me trying to explain.
I can't quite figure out how to do that, though---your profile doesn't list an email. Do I need to click elsewhere?
Thanks!
Kate

Count number of distinct values for a column for all tables that contains that column

Imagine I have one Column called cdperson. With the query below I know which Tables have column cdperson
select t.[name]
from sys.schemas s
inner join sys.tables t on s.schema_id = t.schema_id
inner join sys.columns c on t.object_id = c.object_id
inner join sys.types d on c.user_type_id = d.user_type_id
where c.name = 'cdperson'
Now I want to know, for each table, how many distinct values of cdperson it has, with the result ordered by the number of distinct values (descending).
Table1
   cdperson                        select count(distinct cdperson) = 10
   cdaddress
   quant
Table2
   cdaddress                      (no column cdperson in this table)
   quant
Table3
   cdperson                        select count(distinct cdperson) = 100
   value
Table4
   cdperson                        select count(distinct cdperson) = 18
   sum
I want this result ordered by the number of distinct cdperson values:
table3   100
table4   18
table1   10
Thanks for your answers

I had to add schema name to the above script to make it work in AdventureWorks:
CREATE TABLE #temp(TableName sysname , CNT BIGINT)
DECLARE @QRY NVARCHAR(MAX);
SET @qry=(SELECT
N'INSERT INTO #TEMP SELECT '''+schema_name(t.schema_id)+'.'+T.[name] +''' AS TableName, COUNT (DISTINCT ProductID) DistCount FROM '+
schema_name(t.schema_id)+'.'+t.[name] +';'
FROM sys.schemas s INNER JOIN sys.tables t ON s.schema_id=t.SCHEMA_ID
INNER JOIN sys.columns c ON t.object_id=c.object_id INNER JOIN sys.types d ON c.user_type_id=d.user_type_id
WHERE c.name ='ProductID'
FOR XML PATH(''))
EXEC(@QRY)
SELECT * FROM #temp ORDER BY TableName
DROP TABLE #temp
Production.Product 504
Production.ProductCostHistory 293
Production.ProductDocument 31
Production.ProductInventory 432
Production.ProductListPriceHistory 293
Production.ProductProductPhoto 504
Production.ProductReview 3
Production.TransactionHistory 441
Production.TransactionHistoryArchive 497
Production.WorkOrder 238
Production.WorkOrderRouting 149
Purchasing.ProductVendor 211
Purchasing.PurchaseOrderDetail 211
Sales.SalesOrderDetail 266
Sales.ShoppingCartItem 3
Sales.SpecialOfferProduct 295
Kalman Toth Database & OLAP Architect
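The same idea (look up which tables contain the column, then run one COUNT(DISTINCT ...) per table) can be sketched outside T-SQL. The example below is an analogue on SQLite through Python's sqlite3 module, not the script above; the tables, rows and the helper name distinct_counts are hypothetical. It sorts by the distinct count in descending order, which is what the question asked for.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
# Hypothetical tables mirroring the question: table2 has no cdperson column.
conn.executescript("""
    CREATE TABLE table1 (cdperson INTEGER, cdaddress TEXT, quant INTEGER);
    CREATE TABLE table2 (cdaddress TEXT, quant INTEGER);
    CREATE TABLE table3 (cdperson INTEGER, value INTEGER);
    INSERT INTO table1 VALUES (1, 'a', 1), (1, 'b', 2), (2, 'c', 3);
    INSERT INTO table3 VALUES (1, 10), (2, 20), (3, 30), (3, 40);
""")

def distinct_counts(conn, column):
    """Return [(table, distinct_count)] for tables having `column`, largest first."""
    results = []
    tables = [row[0] for row in conn.execute(
        "SELECT name FROM sqlite_master WHERE type = 'table' ORDER BY name")]
    for table in tables:
        # PRAGMA table_info yields one row per column; index 1 is the column name.
        columns = [row[1] for row in conn.execute(f'PRAGMA table_info("{table}")')]
        if column in columns:
            # Identifiers come from the catalogue, so they are interpolated, not bound.
            count = conn.execute(
                f'SELECT COUNT(DISTINCT "{column}") FROM "{table}"').fetchone()[0]
            results.append((table, count))
    return sorted(results, key=lambda pair: pair[1], reverse=True)

counts = distinct_counts(conn, "cdperson")
print(counts)
```

In the T-SQL script, the equivalent change would be to order the final SELECT by the CNT column in descending order instead of by TableName.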

How to count distinct excluding a value in business layer?

Hi all,
I have a column with many values. I need to turn it into a measure with a count distinct aggregation, but 0 should not be counted. How can I do this? If I try to use any condition, the aggregation option is disabled. Please help.
Thanks

Look at this example:
In the BMM layer, in the SALES fact table, I made this measure:
Count_Distinct_Prod_Id_Exclude_Prod_Id_144
I'll count distinct PRODUCTS.PROD_ID, but exclude PROD_ID=144 in counting.
Make this measure like this:
1. New object/Logical column
2. Go to the data type tab and click EDIT on the logical table source
3. Now, in the general tab add join to a table (in my case PRODUCTS)
4. Go to the column mapping tab -> show unmapped columns
5. In the new column (in my case Count_Distinct_Prod_Id_Exclude_Prod_Id_144) write an expression similar to:
CASE WHEN "orcl".""."SH"."PRODUCTS"."PROD_ID" = 144 THEN NULL ELSE "orcl".""."SH"."PRODUCTS"."PROD_ID" END
6. Click OK and close the logical table source window
7. Now, in the logical column window go to aggregation tab and choose COUNT DISTINCT.
8. Move the measure Count_Distinct_Prod_Id_Exclude_Prod_Id_144 in the presentation area
9. Test in Answers (the report contains the following columns):
PROD_CATEGORY_ID
Count_Distinct_Prod_Id_Exclude_Prod_Id_144
And the result in the NQQuery.log is:
select T21473.PROD_CATEGORY_ID as c1,
count(distinct case when T21473.PROD_ID = 144 then NULL else T21473.PROD_ID end ) as c2
from
PRODUCTS T21473
group by T21473.PROD_CATEGORY_ID
order by c1
Regards
Goran
http://108obiee.blogspot.com
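The trick in step 5 relies on COUNT(DISTINCT ...) ignoring NULLs, so mapping the unwanted value to NULL removes it from the count. A minimal sketch of that behaviour, using SQLite through Python's sqlite3 module with an invented products table (not the SH schema itself):

```python
import sqlite3

conn = sqlite3.connect(":memory:")
# Hypothetical stand-in for PRODUCTS; rows are invented for illustration.
conn.execute("CREATE TABLE products (prod_id INTEGER, prod_category_id INTEGER)")
conn.executemany(
    "INSERT INTO products VALUES (?, ?)",
    [(143, 1), (144, 1), (145, 1), (145, 1), (144, 2), (200, 2)],
)

rows = conn.execute("""
    SELECT prod_category_id,
           COUNT(DISTINCT prod_id) AS all_ids,
           COUNT(DISTINCT CASE WHEN prod_id = 144 THEN NULL ELSE prod_id END)
               AS ids_without_144
    FROM products
    GROUP BY prod_category_id
    ORDER BY prod_category_id
""").fetchall()
print(rows)
```

Each category that contains PROD_ID 144 reports one fewer distinct value in the second column than in the first.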

"How to get distinct values of sharepoint column using SSRS"

Hi,
    I have integrated sharepoint list data to SQL Server reporting services. I am using the below to query sharepoint list data using sql reporting services.
<Query>
   <SoapAction>http://schemas.microsoft.com/sharepoint/soap/GetListItems</SoapAction>
   <Method Namespace="http://schemas.microsoft.com/sharepoint/soap/" Name="GetListItems">
      <Parameters>
         <Parameter Name="listName">
            <DefaultValue>{GUID of list}</DefaultValue>
         </Parameter>
         <Parameter Name="viewName">
            <DefaultValue>{GUID of listview}</DefaultValue>
         </Parameter>
         <Parameter Name="rowLimit">
            <DefaultValue>9999</DefaultValue>
         </Parameter>
      </Parameters>
   </Method>
<ElementPath IgnoreNamespaces="True">*</ElementPath>
</Query>
With this query I get a dataset that includes all the columns of the SharePoint list. Of these columns, I want to display only 2 (Region and Sales type) in a chart. I have created a Region parameter, but when I click Preview, the drop-down box shows all the repeated values of Region, like RG1,RG1,RG1,RG2,RG2,RG2,RG2,RG3... I want the Region parameter to show only distinct values, so that whenever the end user selects a region from the drop-down, the report displays the corresponding values of the Sales type column.
Also, when I select only RG1, I get a chart that includes the sales types of all the regions (it should show only the sales type of RG1). How can I link these 2 columns so that they display the matching values?
I would really appreciate it if anyone can help me out with this.
Thanks,
Sam.

Hi Sam,
CAML doesn't have any reserved word (or tag) for a filter that removes duplicate results.
In this case, we can use custom code to get distinct records.
Here are the detailed steps:
1. Create a hidden parameter that gets all the records in one field.
Note: Please create another dataset that is the same as the main dataset. This dataset is used for the parameter.
2. Create a function that removes the duplicate records.
Here is the code:
Public Shared Function RemoveDups(ByVal items As String) As String
    ' Collect each trimmed value once, in first-seen order.
    Dim noDups As New System.Collections.ArrayList()
    Dim SpStr As String() = Split(items, ",")
    For i As Integer = 0 To UBound(SpStr)
        If Not noDups.Contains(SpStr(i).Trim()) Then
            noDups.Add(SpStr(i).Trim())
        End If
    Next
    ' Copy the unique values into an array and join them back together.
    Dim uniqueItems As String() = New String(noDups.Count - 1) {}
    noDups.CopyTo(uniqueItems)
    Return String.Join(",", uniqueItems)
End Function
3. Create another parameter that will be used for filtering the main data.
Please set the available values to =Split(Code.RemoveDups(JOIN(Parameters!ISSUE_STATUS_TEMP.Value, ",")), ",")
and the default value to the value you want, such as the first value:
=Split(Code.RemoveDups(JOIN(Parameters!ISSUE_STATUS_TEMP.Value, ",")), ",")(0)
4. Go to the main dataset and open its properties window.
5. In the “Filters” tab, set the filter to:
Expression: <The field to be filtered>
Operator: =
Value: =Parameters!Region.Value
The parameter “Region” should be the parameter we created in step 3.
Now we should get distinct values of the SharePoint column.
If anything is unclear, please feel free to ask.
Thanks,
Jin
Jin Chen - MSFT
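For readers who want to check the de-duplication logic outside Reporting Services, here is a rough Python equivalent of the RemoveDups function. It is only a sketch of the same algorithm (split on commas, trim, keep the first occurrence of each value); the name remove_dups and the sample string are illustrative.

```python
def remove_dups(items: str, sep: str = ",") -> str:
    """Return `items` with duplicate entries removed, keeping first-seen order."""
    seen = set()
    unique = []
    for raw in items.split(sep):
        value = raw.strip()
        if value not in seen:
            seen.add(value)
            unique.append(value)
    return sep.join(unique)

regions = remove_dups("RG1,RG1, RG1,RG2,RG2,RG3")
print(regions)
```

Using a set for the membership test keeps the pass linear in the number of values, whereas the ArrayList.Contains call in the report code rescans the list for every item; for a parameter drop-down that difference is unlikely to matter.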

How to display the count distinct in a report

hi,
I have a report with multiple columns, one of which is, say, A. In a calculated column B, I need to display how many distinct values there are in A across the entire report. How can I do that?

Hi.
For example:
CALENDAR_YEAR
CALENDAR_MONTH_DESC
count(distinct TIMES.CALENDAR_MONTH_DESC by TIMES.CALENDAR_YEAR)
The count gives you how many distinct months there are in each year.
Regards
Goran
http://108obiee.blogspot.com

What is '#Distinct values' in Index on dimension table

Gurus!
I have loaded my BW Quality system (master data and transaction data) with almost equivalent volume as in Production.
I am comparing the sizes of dimension and fact tables of one of the cubes in Quality and PROD.
I am taking one of the dimension tables into consideration here.
Quality:
/BIC/DCUBENAME2 Volume of records: 4,286,259
Index /BIC/ECUBENAME~050 on the E fact table /BIC/ECUBENAME for this dimension key KEY_CUBENAME2 shows #Distinct values as 4,286,259
Prod:
/BIC/DCUBENAME2 Volume of records: 5,817,463
Index /BIC/ECUBENAME~050 on the E fact table /BIC/ECUBENAME for this dimension key KEY_CUBENAME2 shows #Distinct values as 937,844
I would like to know why the number of distinct values differs from the dimension table record count in PROD.
I am getting this information from the SQL execution plan, if I click on the /BIC/ECUBENAME table in the code. This screen gives me all details about the fact table volumes, indexes etc..
The index and statistics on the cube is up to date.
Quality:
E fact table:
Table   /BIC/ECUBENAME
Last statistics date                  03.11.2008
Analyze Method               9,767,732 Rows
Number of rows                         9,767,732
Number of blocks allocated         136,596
Number of empty blocks              0
Average space                            0
Chain count                                0
Average row length                      95
Partitioned                                  YES
NONUNIQUE Index   /BIC/ECUBENAME~P:
Column Name                     #Distinct
KEY_CUBENAMEP                                  1
KEY_CUBENAMET                                  7
KEY_CUBENAMEU                                  1
KEY_CUBENAME1                            148,647
KEY_CUBENAME2                          4,286,259
KEY_CUBENAME3                                  6
KEY_CUBENAME4                                322
KEY_CUBENAME5                          1,891,706
KEY_CUBENAME6                            254,668
KEY_CUBENAME7                                  5
KEY_CUBENAME8                              9,430
KEY_CUBENAME9                                122
KEY_CUBENAMEA                                 10
KEY_CUBENAMEB                                  6
KEY_CUBENAMEC                              1,224
KEY_CUBENAMED                                328
Prod:
Table   /BIC/ECUBENAME
Last statistics date                  13.11.2008
Analyze Method                      1,379,086 Rows
Number of rows                       13,790,860
Number of blocks allocated       187,880
Number of empty blocks            0
Average space                          0
Chain count                              0
Average row length                    92
Partitioned                               YES
NONUNIQUE Index /BIC/ECUBENAME~P:
Column Name                     #Distinct
KEY_CUBENAMEP                                  1
KEY_CUBENAMET                                 10
KEY_CUBENAMEU                                  1
KEY_CUBENAME1                            123,319
KEY_CUBENAME2                            937,844
KEY_CUBENAME3                                  6
KEY_CUBENAME4                                363
KEY_CUBENAME5                            691,303
KEY_CUBENAME6                            226,470
KEY_CUBENAME7                                  5
KEY_CUBENAME8                              8,835
KEY_CUBENAME9                                124
KEY_CUBENAMEA                                 14
KEY_CUBENAMEB                                  6
KEY_CUBENAMEC                                295
KEY_CUBENAMED                                381

Arun,
The cube in QA and PROD are compressed. Index building and statistics are also up to date.
But I am not sure what other jobs are run by BASIS as far as this cube in production is concerned.
Is there any other transaction code, function module, etc. that can give information about the #Distinct values of this index or dimension table?
One basic question: as the DIM key is the primary key in the dimension table, there can't be duplicates.
So how can the index on the fact table for this dimension show fewer #Distinct values than there are entries in the dimension table?
Shouldn't the number of entries in the dimension table exactly match the #Distinct value shown in
index /BIC/ECUBENAME~P for this DIM key?

Select records based on first n distinct values of column

I need to write a query in PL/SQL that selects all the rows for the first 3 distinct values of a single column (ID in the example below), then all the rows for the next 3 distinct values, and so on until the distinct values of the column are exhausted.
eg:
ID name age
1 abc 10
1 def 20
2 ghi 10
2 jkl 20
2 mno 60
3 pqr 10
4 rst 10
4 tuv 10
5 vwx 10
6 xyz 10
6 hij 10
7 lmn 10
so on... (till some count)
Result should be
Query 1 should result --->
ID name age
1 abc 10
1 def 20
2 ghi 10
2 jkl 20
2 mno 60
3 pqr 10
query 2 should result -->
4 rst 10
4 tuv 10
5 vwx 10
6 xyz 10
6 hij 10
query 3 should result -->
7 lmn 10
9 .. ..
so on..
How do I write a query for this inside a loop?

Hi,
So, one group will consist of the lowest id value, the 2nd lowest and the 3rd lowest, regardless of how many rows are involved. The next group will consist of the 4th lowest id, the 5th lowest and the 6th lowest. To do that, you need to assign numbers 1, 2, 3, 4, 5, 6, ... to the rows in order by id, with all rows having the same id getting the same number, and without skipping any numbers.
That sounds like a job for the analytic DENSE_RANK function:
WITH     got_grp_id     AS
(
     SELECT     id, name, age
     ,     CEIL ( DENSE_RANK () OVER (ORDER BY id)
               / 3
               )          AS grp_id
     FROM     table_x
)
SELECT     id, name, age
FROM     got_grp_id
WHERE     grp_id     = 1     -- or whatever number you want
;
If you'd care to post CREATE TABLE and INSERT statements for your sample data, then I could test it.
See the forum FAQ {message:id=9360002}
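Since no test data was posted, here is a small runnable sketch of the DENSE_RANK grouping using the sample rows from the question. It runs on SQLite through Python's sqlite3 module (assuming SQLite 3.25 or later for window functions) rather than Oracle, and replaces CEIL(rank / 3) with the integer-division form (rank + 2) / 3, which gives the same group number; the helper name fetch_group is made up.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.execute("CREATE TABLE table_x (id INTEGER, name TEXT, age INTEGER)")
conn.executemany("INSERT INTO table_x VALUES (?, ?, ?)", [
    (1, "abc", 10), (1, "def", 20), (2, "ghi", 10), (2, "jkl", 20),
    (2, "mno", 60), (3, "pqr", 10), (4, "rst", 10), (4, "tuv", 10),
    (5, "vwx", 10), (6, "xyz", 10), (6, "hij", 10), (7, "lmn", 10),
])

QUERY = """
    WITH got_grp_id AS (
        SELECT id, name, age,
               -- (rank + 2) / 3 with integer division equals CEIL(rank / 3)
               (DENSE_RANK() OVER (ORDER BY id) + 2) / 3 AS grp_id
        FROM table_x
    )
    SELECT id, name, age FROM got_grp_id WHERE grp_id = ? ORDER BY id, name
"""

def fetch_group(grp_id):
    """Return all rows whose id falls in the grp_id-th batch of 3 distinct ids."""
    return conn.execute(QUERY, (grp_id,)).fetchall()

group_ids = [[row[0] for row in fetch_group(g)] for g in (1, 2, 3)]
print(group_ids)
```

In a PL/SQL loop, the group number would play the role of the bind parameter, incremented until the query returns no rows.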

COUNT(DISTINCT) WITH ORDER BY in an analytic function

-- I create a table with three fields: Name, Amount, and a Trans_Date.
CREATE TABLE TEST
(
NAME VARCHAR2(19) NULL,
AMOUNT VARCHAR2(8) NULL,
TRANS_DATE DATE NULL
);
-- I insert a few rows into my table:
INSERT INTO TEST ( TEST.NAME, TEST.AMOUNT, TEST.TRANS_DATE ) VALUES ( 'Anna', '110', TO_DATE('06/01/2005 08:00:00 PM', 'MM/DD/YYYY HH12:MI:SS PM') );
INSERT INTO TEST ( TEST.NAME, TEST.AMOUNT, TEST.TRANS_DATE ) VALUES ( 'Anna', '20', TO_DATE('06/01/2005 08:00:00 PM', 'MM/DD/YYYY HH12:MI:SS PM') );
INSERT INTO TEST ( TEST.NAME, TEST.AMOUNT, TEST.TRANS_DATE ) VALUES ( 'Anna', '110', TO_DATE('06/02/2005 08:00:00 PM', 'MM/DD/YYYY HH12:MI:SS PM') );
INSERT INTO TEST ( TEST.NAME, TEST.AMOUNT, TEST.TRANS_DATE ) VALUES ( 'Anna', '21', TO_DATE('06/03/2005 08:00:00 PM', 'MM/DD/YYYY HH12:MI:SS PM') );
INSERT INTO TEST ( TEST.NAME, TEST.AMOUNT, TEST.TRANS_DATE ) VALUES ( 'Anna', '68', TO_DATE('06/04/2005 08:00:00 PM', 'MM/DD/YYYY HH12:MI:SS PM') );
INSERT INTO TEST ( TEST.NAME, TEST.AMOUNT, TEST.TRANS_DATE ) VALUES ( 'Anna', '110', TO_DATE('06/05/2005 08:00:00 PM', 'MM/DD/YYYY HH12:MI:SS PM') );
INSERT INTO TEST ( TEST.NAME, TEST.AMOUNT, TEST.TRANS_DATE ) VALUES ( 'Anna', '20', TO_DATE('06/06/2005 08:00:00 PM', 'MM/DD/YYYY HH12:MI:SS PM') );
INSERT INTO TEST ( TEST.NAME, TEST.AMOUNT, TEST.TRANS_DATE ) VALUES ( 'Bill', '43', TO_DATE('06/01/2005 08:00:00 PM', 'MM/DD/YYYY HH12:MI:SS PM') );
INSERT INTO TEST ( TEST.NAME, TEST.AMOUNT, TEST.TRANS_DATE ) VALUES ( 'Bill', '77', TO_DATE('06/02/2005 08:00:00 PM', 'MM/DD/YYYY HH12:MI:SS PM') );
INSERT INTO TEST ( TEST.NAME, TEST.AMOUNT, TEST.TRANS_DATE ) VALUES ( 'Bill', '221', TO_DATE('06/03/2005 08:00:00 PM', 'MM/DD/YYYY HH12:MI:SS PM') );
INSERT INTO TEST ( TEST.NAME, TEST.AMOUNT, TEST.TRANS_DATE ) VALUES ( 'Bill', '43', TO_DATE('06/04/2005 08:00:00 PM', 'MM/DD/YYYY HH12:MI:SS PM') );
INSERT INTO TEST ( TEST.NAME, TEST.AMOUNT, TEST.TRANS_DATE ) VALUES ( 'Bill', '73', TO_DATE('06/05/2005 08:00:00 PM', 'MM/DD/YYYY HH12:MI:SS PM') );
commit;
/* For every row, I want the distinct count of AMOUNT computed by an analytic function, partitioned by name and ordered by trans_date, looking only at the current date and the three days before it (i.e., for the row "Anna 110 6/5/2005 8:00:00.000 PM", I only want to look at the dates from 6/2/2005 to 6/5/2005 and count how many different amounts Anna has). Note that I cannot use the DISTINCT keyword in this query because it doesn't work with the ORDER BY. */
select NAME, AMOUNT, TRANS_DATE, COUNT(/*DISTINCT*/ AMOUNT) over ( partition by NAME
order by TRANS_DATE range between numtodsinterval(3,'day') preceding and current row ) as COUNT_AMOUNT
from TEST t;
These are the results I get if I just count all the AMOUNT values without using distinct:
NAME  AMOUNT  TRANS_DATE               COUNT_AMOUNT
Anna  110     6/1/2005 8:00:00.000 PM  2
Anna  20      6/1/2005 8:00:00.000 PM  2
Anna  110     6/2/2005 8:00:00.000 PM  3
Anna  21      6/3/2005 8:00:00.000 PM  4
Anna  68      6/4/2005 8:00:00.000 PM  5
Anna  110     6/5/2005 8:00:00.000 PM  4
Anna  20      6/6/2005 8:00:00.000 PM  4
Bill  43      6/1/2005 8:00:00.000 PM  1
Bill  77      6/2/2005 8:00:00.000 PM  2
Bill  221     6/3/2005 8:00:00.000 PM  3
Bill  43      6/4/2005 8:00:00.000 PM  4
Bill  73      6/5/2005 8:00:00.000 PM  4
The COUNT_DISTINCT_AMOUNT column is the desired output:
NAME  AMOUNT  TRANS_DATE               COUNT_DISTINCT_AMOUNT
Anna  110     6/1/2005 8:00:00.000 PM  1
Anna  20      6/1/2005 8:00:00.000 PM  2
Anna  110     6/2/2005 8:00:00.000 PM  2
Anna  21      6/3/2005 8:00:00.000 PM  3
Anna  68      6/4/2005 8:00:00.000 PM  4
Anna  110     6/5/2005 8:00:00.000 PM  3
Anna  20      6/6/2005 8:00:00.000 PM  4
Bill  43      6/1/2005 8:00:00.000 PM  1
Bill  77      6/2/2005 8:00:00.000 PM  2
Bill  221     6/3/2005 8:00:00.000 PM  3
Bill  43      6/4/2005 8:00:00.000 PM  3
Bill  73      6/5/2005 8:00:00.000 PM  4
Thanks in advance.

You can try to write your own user-defined aggregate (UDAG).
Here is a fake example, just to show how it "could" work. I am using only 1, 2, 4, 8, 16 and 32 as potential values here.
create or replace type CountDistinctType as object
(
   bitor_number number,
   static function ODCIAggregateInitialize(sctx IN OUT CountDistinctType)
     return number,
   member function ODCIAggregateIterate(self IN OUT CountDistinctType,
     value IN number) return number,
   member function ODCIAggregateTerminate(self IN CountDistinctType,
     returnValue OUT number, flags IN number) return number,
    member function ODCIAggregateMerge(self IN OUT CountDistinctType,
      ctx2 IN CountDistinctType) return number
);
/
create or replace type body CountDistinctType is
static function ODCIAggregateInitialize(sctx IN OUT CountDistinctType)
return number is
begin
   sctx := CountDistinctType('');
   return ODCIConst.Success;
end;
member function ODCIAggregateIterate(self IN OUT CountDistinctType, value IN number)
return number is
begin
    if (self.bitor_number is null) then
      self.bitor_number := value;
    else
      self.bitor_number := self.bitor_number+value-bitand(self.bitor_number,value);
    end if;
    return ODCIConst.Success;
end;
member function ODCIAggregateTerminate(self IN CountDistinctType, returnValue OUT
number, flags IN number) return number is
begin
    returnValue := 0;
    for i in 0..log(2,self.bitor_number) loop
      if (bitand(power(2,i),self.bitor_number)!=0) then
        returnValue := returnValue+1;
      end if;
    end loop;
    return ODCIConst.Success;
end;
member function ODCIAggregateMerge(self IN OUT CountDistinctType, ctx2 IN
CountDistinctType) return number is
begin
    return ODCIConst.Success;
end;
end;
/
CREATE or REPLACE FUNCTION CountDistinct (n number) RETURN number
PARALLEL_ENABLE AGGREGATE USING CountDistinctType;
/
drop table t;
create table t as select rownum r, power(2,trunc(dbms_random.value(0,6))) p from all_objects;
SQL> select r,p,countdistinct(p) over (order by r) d from t where rownum<10 order by r;
         R          P          D
         1          4          1
         2          1          2
         3          8          3
         4         32          4
         5          1          4
         6         16          5
         7         16          5
         8          4          5
         9          4          5
Buy a good book if you want to start writing your own "distinct" algorithm.
Message was edited by:
Laurent Schneider
A simpler but memory-hungry algorithm would keep a PL/SQL table in the UDAG and do the count(distinct) over that table to return the value.
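A different route, not the UDAG above, is to replace the analytic function with a correlated scalar subquery that counts the distinct amounts inside the 3-day range. The sketch below shows the idea on SQLite through Python's sqlite3 module with the question's rows; dates are stored as ISO strings without the time part (every sample row is at 8 PM, so nothing is lost), and julianday() stands in for Oracle date arithmetic. In Oracle the range condition would presumably read b.trans_date BETWEEN a.trans_date - 3 AND a.trans_date.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.execute("CREATE TABLE test (name TEXT, amount TEXT, trans_date TEXT)")
rows = [
    ("Anna", "110", "2005-06-01"), ("Anna", "20", "2005-06-01"),
    ("Anna", "110", "2005-06-02"), ("Anna", "21", "2005-06-03"),
    ("Anna", "68", "2005-06-04"), ("Anna", "110", "2005-06-05"),
    ("Anna", "20", "2005-06-06"), ("Bill", "43", "2005-06-01"),
    ("Bill", "77", "2005-06-02"), ("Bill", "221", "2005-06-03"),
    ("Bill", "43", "2005-06-04"), ("Bill", "73", "2005-06-05"),
]
conn.executemany("INSERT INTO test VALUES (?, ?, ?)", rows)

# For each row, count distinct amounts of the same name in [trans_date - 3, trans_date].
result = conn.execute("""
    SELECT a.name, a.amount, a.trans_date,
           (SELECT COUNT(DISTINCT b.amount)
              FROM test b
             WHERE b.name = a.name
               AND julianday(b.trans_date)
                   BETWEEN julianday(a.trans_date) - 3 AND julianday(a.trans_date)
           ) AS count_distinct_amount
    FROM test a
    ORDER BY a.name, a.trans_date, a.rowid
""").fetchall()

counts = [row[3] for row in result]
print(counts)
```

With a range-based window, the two rows that share 6/1/2005 see each other, so both get the same count; that is the one place where this differs from the desired output listed in the question, which counts the first of the two tied rows on its own.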

Count Distinct over a Window

Hi everyone,
An analyst on my team heard of a new metric called a "Stickiness" metric. It basically measures how often users are coming to your website over time.
The definition is as follows:
# Unique Users Today/#Unique users Over Last 7 days
and also
# Unique Users Today/#Unique users Over Last 30 days
We have visit information stored in a table W_WEB_VISIT_F. For the sake of simplicity say it has columns VISIT_ID, VISIT_DATE and USER_ID (there are several more dimensional columns it has but I want to keep this exercise simple).
I want to create an aggregate table called W_WEB_VISIT_A that pre-aggregates the three values I need per day: # Unique Users Today, #Unique users Over Last 7 days and #Unique users Over Last 30 days. The only way I can think of building the aggregate table is as follows
WITH AGG AS (
SELECT
VISIT_DATE,
USER_ID
FROM W_WEB_VISIT_F
GROUP BY
VISIT_DATE,
USER_ID
)
select
src.VISIT_DATE,
COUNT(DISTINCT src.USER_ID) UNIQUE_TODAY,
(select count(distinct hist.USER_ID) from agg hist where hist.VISIT_DATE between src.VISIT_DATE - 6 and src.VISIT_DATE) SEVEN_DAYS,
(select count(distinct hist.USER_ID) from agg hist where hist.VISIT_DATE between src.VISIT_DATE - 29 and src.VISIT_DATE) THIRTY_DAYS
from agg src
group by src.VISIT_DATE
The problem I am having is that W_WEB_VISIT_F has several million records in it and I can't get the above query to complete. It ran overnight and didn't finish.
Is there a fancy 11g function I can use to do this for me? Is there a more efficient method?
Thanks everyone for the help!
-Joe
Edited by: user9208525 on Jan 13, 2011 6:24 AM
You guys are right. I missed the group by I had in the WITH Clause.

Hi,
Haven't used the windowing clause a lot, so I wanted to give it a try.
I made up some data with this query:
create table t as select sysdate-dbms_random.value(0,10) visit_date, mod(level,5)+1 user_id
from dual
connect by level <= 20;
Which gave me the following rows:
Scott@my10g SQL>select * from t order by visit_date;
VISIT_DATE             USER_ID
03/01/2011 13:17:10          1
04/01/2011 05:30:30          4
04/01/2011 08:08:13          5
04/01/2011 14:42:24          3
04/01/2011 20:20:58          3
05/01/2011 17:29:24          2
05/01/2011 17:40:20          4
05/01/2011 18:32:56          2
06/01/2011 04:12:53          5
06/01/2011 08:59:18          2
06/01/2011 09:04:26          3
06/01/2011 10:14:20          1
06/01/2011 14:22:54          1
06/01/2011 19:39:04          1
08/01/2011 14:44:18          5
08/01/2011 21:38:04          5
11/01/2011 04:56:05          4
11/01/2011 18:52:29          2
11/01/2011 23:57:30          4
13/01/2011 07:24:22          3
20 rows selected.
I came up with this query:
select
        v.*,
        case
                when unq_l3d is null then -1
                else trunc(unq_today/unq_l3d,2)
        end ratio
from (
        select distinct trcdt, unq_today, unq_l3d
        from (
                select
                trcdt,
                count(user_id)
                over (
                        order by trcdt
                        range between numtodsinterval(1,'DAY') preceding and current row
                ) unq_today,
                count(user_id)
                over (
                        order by trcdt
                        range between numtodsinterval(3,'DAY') preceding and current row
                ) unq_l3d
                from (
                        select distinct trunc(visit_date) trcdt, user_id from t
                )
        )
) v
order by trcdt;
With my sample data, it gives me:
TRCDT                UNQ_TODAY    UNQ_L3D RATIO
03/01/2011 00:00:00          1          1 1.00
04/01/2011 00:00:00          4          4 1.00
05/01/2011 00:00:00          5          6 0.83
06/01/2011 00:00:00          6         10 0.60
08/01/2011 00:00:00          1          7 0.14
11/01/2011 00:00:00          2          3 0.66
13/01/2011 00:00:00          1          3 0.33
7 rows selected.
where:
- UNQ_TODAY is the number of distinct user_id in the day
- UNQ_L3D is the number of distinct user_id in the last 3 days
- RATIO is UNQ_TODAY divided by UNQ_L3D (when UNQ_L3D is not zero)
It seems quite correct, but you would have to modify the query to fit your needs and double-check the results!
Just noticed that my query is all wrong... must have been missing caffeine, or sleep... but I'm still trying!
Edited by: Nicosa on Jan 13, 2011 5:29 PM
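One way to avoid rescanning the history for every day is a single sliding-window pass: keep a per-user counter of how many days inside the window had a visit, add each new day's users, and evict days as they fall out of the window. The sketch below illustrates that idea in plain Python on invented data; it is not an Oracle feature, the function name stickiness and its parameters are hypothetical, and only days that actually have visits appear in the result.

```python
from collections import Counter, deque
from datetime import date, timedelta

def stickiness(visits, window_days=7):
    """visits: iterable of (visit_date, user_id). Returns {day: (today, in_window)}."""
    users_by_day = {}
    for day, user in visits:
        users_by_day.setdefault(day, set()).add(user)

    active = Counter()      # user_id -> number of days in the window with a visit
    in_window = deque()     # days currently inside the window, oldest first
    result = {}
    for day in sorted(users_by_day):
        # Evict days that fell out of the window [day - window_days + 1, day].
        while in_window and in_window[0] <= day - timedelta(days=window_days):
            old = in_window.popleft()
            for user in users_by_day[old]:
                active[user] -= 1
                if active[user] == 0:
                    del active[user]
        in_window.append(day)
        for user in users_by_day[day]:
            active[user] += 1
        result[day] = (len(users_by_day[day]), len(active))
    return result

visits = [
    (date(2011, 1, 3), 1), (date(2011, 1, 4), 4), (date(2011, 1, 4), 5),
    (date(2011, 1, 4), 3), (date(2011, 1, 5), 2), (date(2011, 1, 5), 4),
    (date(2011, 1, 11), 4), (date(2011, 1, 11), 2), (date(2011, 1, 13), 3),
]
stats = stickiness(visits, window_days=7)
for day, (today, last7) in stats.items():
    print(day, today, last7, round(today / last7, 2))
```

Each (day, user) pair is added once and removed once, so the work grows with the number of distinct pairs rather than with pairs times window length; calling it again with window_days=30 gives the second ratio.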

Select distinct value

HI all
DESC CLAIMS
CLAIM_NUMBER NUMBER(10)
VERSION          NUMBER(2)
STATUS          NUMBER(1)
Every claim_number can have up to 3 statuses: 0, 9 or 1.
There must be at least 1 status.
Select CLAIM_NUMBER   ,STATUS from claims
order by CLAIM_NUMBER,STATUS
CLAIM_NUMBER   STATUS
6971700            1
6971700            1
6971700            0
8624300            9
10071200          1
10453800          0
10453800          0
I want to fetch only the CLAIM_NUMBERs that have a single status value,
like CLAIM_NUMBER = 10453800 (which has only 0)
10071200 (which has only 1)
8624300 (which has only 9)
I started a query like this:
select CLAIM_NUMBER
from CLAIMS
group by CLAIM_NUMBER
having count(distinct(status)) < 2
The problem is that I want to add the status value, like 9 or 0, to the condition.
How shall I do that?
Thanks in advance
Naama

Is this what you are looking for?
select p.*
from (
select CLAIM_NUMBER, STATUS, count(distinct STATUS) over (partition by CLAIM_NUMBER) cnt
from CLAIMS
) p
where p.cnt = 1
and p.status = 0
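Another way to add the status condition is to stay with the GROUP BY from the question: once COUNT(DISTINCT status) = 1 holds, MIN(status) is that single status, so it can be tested in the same HAVING clause. A minimal sketch on SQLite through Python's sqlite3 module, using the rows from the question (the helper name single_status_claims is made up):

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.execute("CREATE TABLE claims (claim_number INTEGER, status INTEGER)")
conn.executemany("INSERT INTO claims VALUES (?, ?)", [
    (6971700, 1), (6971700, 1), (6971700, 0), (8624300, 9),
    (10071200, 1), (10453800, 0), (10453800, 0),
])

def single_status_claims(wanted):
    """Claims whose rows all share one status, and that status is in `wanted`."""
    placeholders = ", ".join("?" for _ in wanted)
    sql = f"""
        SELECT claim_number, MIN(status) AS status
        FROM claims
        GROUP BY claim_number
        HAVING COUNT(DISTINCT status) = 1
           AND MIN(status) IN ({placeholders})
        ORDER BY claim_number
    """
    return conn.execute(sql, tuple(wanted)).fetchall()

print(single_status_claims([0, 9]))
```

This returns one row per claim rather than one per underlying row, which is the main difference from the analytic version.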

Set Aggregation type of Count Distinct to use correct table aggregation in

Hi there,
Currently I use OBIEE 10.1.3.4.1, and I have a case where a fact table consists of 2 logical table sources, a detail table and an aggregate table, with some measures that use count distinct as the aggregation type. The problem is that every time I browse the measure with no dimension at all, it always uses the detail table, not the aggregate one.
I'd really appreciate any suggestions.
thanks a lot

Hi,
I don't think it's the same case as mine. Let's say I have 2 tables: detail and aggregate.
The detail table consists of 4 fields:
*) Period
*) Market
*) Region
*) Measure : Customer ID, Sales
The aggregate table consists of 3 fields:
*) Period
*) Region
*) Measure : Customer ID, Sales
in the measure I set aggregation type for each field:
*) Sales >> set as Sum
*) Customer ID >> copy as "Number of Customer" and set as Count Distinct
In each LTS's content I set the aggregation level using the "Get Levels" feature.
Then I tried browsing via Presentation with the queries below:
a) Only a single measure, Sales: the session shows that the value is taken from the aggregate table, just as I expected.
b) Period and Sales: the session shows that the values are taken from the aggregate table, still as I expected.
c) Period, Sales and Market: the session shows that the values are taken from the detail table, just as I expected.
d) Only a single measure, "Number of Customer": the session shows that the value is taken from the detail table. This is NOT what I expected; it is supposed to take the value from the aggregate table.
e) Period and "Number of Customer": the session shows that the value is taken from the detail table. This is also NOT what I expected; it is supposed to take the value from the aggregate table.
I've tried to override the aggregation, but I'm still confused about how to apply it to the "Number of Customer" measure, and it did not work at all.
Any ideas?
thanks a lot

OBIEE 10G Total by in answers not correct for count distinct fields. Is this a bug?

For example:
Sales fact has receipt no and line no as key. It has data like:
receipt no, line no, value
1, 1, 30
1, 2, 40
2, 1, 10
2, 2, 10
There is also a transaction field defined as count distinct of receipt no (in BMM)
In answers, I set to show Total.
without any filters:
receipt no, value, transactions
1, 70, 1
2, 20, 1
total: 90, 2
Transactions is 2, which is correct.
If I apply a filter of transaction value greater than 50,
the transactions total still shows 2:
1, 70, 1
total: 70, 2
Is this a bug? It looks like only SUM works correctly in the total.

I did look at the physical query to see how it calculated the total transactions, and it didn't take the filter of transaction value greater than 50 into account. I don't know why, though. I don't know why you want to count line no; the result would still be 2.

OLAP Analysis Count Distinct?

If this query is better suited to the OLAP forum, please let me know.
I am creating an Enrollment cube that has a dimension of Student with a Student_ID attribute. The fact table contains a measure column called Students, with each record having a value of 1. This results in getting a total SUM of students for a specific semester in an analysis in BI. However, this SUM aggregation does not distinctly identify students, resulting in a student that attends 4 semesters being counted as 4 students for the entire academic year. Adding COUNT(DISTINCT Student.Student_ID) to the analysis worked with an earlier test cube that I had created, but when I try to perform it on my updated cube it will only give me a COUNT(DISTINCT) for All Time, even when looking at the Semester or Academic Year levels. The only appreciable difference in my updated cube is that it has more dimensions.

Yes, you can post your query on the OLAP forum, because this forum is about Oracle BI Applications (prepackaged applications using OBIEE + DAC + Informatica).
Regards,
Benoit
