Group by count distinct

mytable
id | yy
1 | 78
2 | 78
3 | 78
3 | 79
3 | 79
4 | 79
5 | 79
5 | 80
Desired output:
yy | id_count
78 | 3
79 | 2
80 | 0
The following query doesn't work, because it doesn't take into account that an id was already counted in an earlier yy:
select yy, count(distinct id) as id_count
from mytable
group by yy
--output
yy | id_count
78 | 3
79 | 4
80 | 1
Hope this makes sense.
Ideas?

Hi,
You only want to count each id once, with the first (that is, lowest) yy: is that right?
Here's one way:
WITH     got_r_num    AS
(
     SELECT  id
     ,       yy
     ,       ROW_NUMBER () OVER ( PARTITION BY  id
                                  ORDER BY      yy
                                ) AS r_num
     FROM    mytable
)
SELECT    yy
,         COUNT ( CASE
                      WHEN  r_num = 1
                      THEN  id
                  END
                )     AS id_cnt
FROM      got_r_num
GROUP BY  yy
ORDER BY  yy
;
Doing anything for the first of each id is probably a job for "ROW_NUMBER () OVER (PARTITION BY id ...)".
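
The ROW_NUMBER approach above can be checked against the sample data with a small script. This is only a sketch: it uses Python's built-in sqlite3 module instead of Oracle (an assumption made so the example is self-contained, and it needs an SQLite build with window functions, 3.25 or later). The table and column names follow the question.

```python
import sqlite3

# In-memory SQLite database standing in for the Oracle table (assumption).
conn = sqlite3.connect(":memory:")
conn.execute("CREATE TABLE mytable (id INTEGER, yy INTEGER)")
conn.executemany(
    "INSERT INTO mytable VALUES (?, ?)",
    [(1, 78), (2, 78), (3, 78), (3, 79), (3, 79), (4, 79), (5, 79), (5, 80)],
)

# Number each id's rows by yy, then count only the first row of each id.
query = """
WITH got_r_num AS (
    SELECT id,
           yy,
           ROW_NUMBER() OVER (PARTITION BY id ORDER BY yy) AS r_num
    FROM mytable
)
SELECT yy,
       COUNT(CASE WHEN r_num = 1 THEN id END) AS id_cnt
FROM got_r_num
GROUP BY yy
ORDER BY yy
"""
first_seen_counts = conn.execute(query).fetchall()

for yy, id_cnt in first_seen_counts:
    print(yy, id_cnt)
```

Because COUNT ignores the NULLs produced by the CASE expression, a yy whose ids were all seen earlier still appears in the result with a count of 0.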

Similar Messages

  • "group by" slow for using "count(distinct some_column)" - a better way?

    Hi all,
    I have a query of this form:
    select
    count(distinct some_column)
    from [...]
    group by [...];
    It is slowed down by the "count(distinct some_column)".
    The "group by" aggregates base records, but the base records are 1:n: each event has n records.
    Some of an event's n records fall into group-by result record (A), others into group-by result record (B).
    Each event should count only once per result record, regardless of how many of its n records fall into that category.
    Is there another (faster) way to count this?
    - thanks!
    best regards,
    Frank
    Edited by: user8704911 on Jun 29, 2011 1:30 AM

    Hi Dom,
    incidentally i went in the direction you proposed:
    I replaced the pl/sql collection with the global temporary table.
    But the reason for doing this was a different one:
    I recognized, that the group by is much faster, if applied on table or global temporary table.
    However i first just moved the data from pl/sql collection to global temporary table in order to apply the group by there.
    Then the group by is much faster - but the moving of data from pl/sql collection to global temporary table then took away the time.
    So it was not the group by, but in general the read-access to the pl/sql collection (btw, around #65,000 records).
    Now having completely replaced the pl/sql collection with global temporary table everything is fine.
    cheers,
    Frank

  • Logical Aggregate Column (count(distinct)) Does Not Group for SQL Server DB

    When utilizing the count(distinct column_name) aggregate function within a Logical Fact source in the Business Model and Mapping layer in the RPD file the output in BI Answers is not grouping correctly for SQL Server 2008 database sources only. All Oracle database sources represent the same aggregate column correctly within BI Answers.
    I am using OBIEE version 10.1.3.3.3
    Does anyone know how to resolve this issue?
    Thanks in advance,
    Kyle

    I thought that I would update my current findings with this issue. If you display the report in BI Answers as a Pivot Table view the aggregate column displays properly, it does not in a Table or Compound Layout view for some reason. I am still working with Oracle Support on this.

  • Performance problem with more than one COUNT(DISTINCT ...) in a query

    Hi,
    (I hope this is the right forum.)
    In the following query, I have two COUNT(DISTINCT ...) on two different fields of the same table. Execution time is okay (2 s) with one or the other COUNT(DISTINCT ...) in the SELECT clause, but is not tolerable (12 s) with both together in the query! I have a similar case with 3 counts: 4 s each, 36 s when together!
    I've looked at the execution plan, and it seems that with two count distinct, SQL Server sorts the table twice before joining the results.
    I do not have much experience with SQL Server optimization, and I don't know what to improve and how. The SQL is generated by Business Objects, so I have few possibilities to tune it. The most direct way would be to execute 2 different queries, but I'd like to avoid it.
    Any advice?
    SELECT
      DIM_MOIS.DATE_DEBUT_MOIS,
      DIM_MOIS.NUM_ANNEE_MOIS,
      DIM_DEMANDE_SCD.CAT_DEMANDE,
      DIM_APPLICATION.LIB_APPLICATION,
      DIM_DEMANDE_SCD.CAT_DEMANDE ,
      count(distinct FAITS_DEMANDE.NB_DEMANDE_FLUX),
      count(distinct FAITS_DEMANDE.NB_DEMANDE_RESOL_NIV1)
    FROM
      ALIM_SID.DIM_MOIS INNER JOIN ALIM_SID.DIM_JOUR ON (DIM_JOUR.SEQ_MOIS=DIM_MOIS.SEQ_MOIS)
       INNER JOIN ALIM_SID.FAITS_DEMANDE ON (FAITS_DEMANDE.SEQ_JOUR=DIM_JOUR.SEQ_JOUR)
       INNER JOIN ALIM_SID.DIM_APPLICATION ON (FAITS_DEMANDE.SEQ_APPLICATION=DIM_APPLICATION.SEQ_APPLICATION)
       INNER JOIN ALIM_SID.DIM_DEMANDE_SCD ON (FAITS_DEMANDE.SEQ_DEMANDE_SCD=DIM_DEMANDE_SCD.SEQ_DEMANDE_SCD)
    WHERE
      ( DIM_MOIS.NUM_ANNEE_MOIS ) > 201301
    GROUP BY
      DIM_MOIS.DATE_DEBUT_MOIS,
      DIM_MOIS.NUM_ANNEE_MOIS,
      DIM_DEMANDE_SCD.CAT_DEMANDE,
      DIM_APPLICATION.LIB_APPLICATION

    Here is the script, nothing original. Hope this helps.
    -- Fact table :
    -- foreign keys begin with FK_,
    -- measures to be counted (COUNT DISTINCT) begin with NB_
    CREATE TABLE [ALIM_SID].[FAITS_DEMANDE](
        [SEQ_JOUR] [int] NOT NULL,
        [SEQ_DEMANDE] [int] NOT NULL,
        [SEQ_DEMANDE_SCD] [int] NOT NULL,
        [SEQ_APPLICATION] [int] NOT NULL,
        [SEQ_INTERVENANT] [int] NOT NULL,
        [SEQ_SERVICE_RESPONSABLE] [int] NOT NULL,
        [NB_DEMANDE_FLUX] [int] NULL,
        [NB_DEMANDE_STOCK] [int] NULL,
        [NB_DEMANDE_RESOLUE] [int] NULL,
        [NB_DEMANDE_LIVREE] [int] NULL,
        [NB_DEMANDE_MEP] [int] NULL,
        [NB_DEMANDE_RESOL_NIV1] [int] NULL,
     CONSTRAINT [PK_FAITS_DEMANDE] PRIMARY KEY CLUSTERED
    (
        [SEQ_JOUR] ASC,
        [SEQ_DEMANDE] ASC,
        [SEQ_DEMANDE_SCD] ASC,
        [SEQ_APPLICATION] ASC,
        [SEQ_INTERVENANT] ASC,
        [SEQ_SERVICE_RESPONSABLE] ASC
    )WITH (PAD_INDEX = OFF, STATISTICS_NORECOMPUTE = OFF, IGNORE_DUP_KEY = OFF, ALLOW_ROW_LOCKS = ON, ALLOW_PAGE_LOCKS = ON) ON [PRIMARY],
     CONSTRAINT [AK_AK_FAITS_DEMANDE_FAITS_DE] UNIQUE NONCLUSTERED
    (
        [SEQ_JOUR] ASC,
        [SEQ_DEMANDE] ASC,
        [SEQ_DEMANDE_SCD] ASC,
        [SEQ_APPLICATION] ASC,
        [SEQ_INTERVENANT] ASC
    )WITH (PAD_INDEX = OFF, STATISTICS_NORECOMPUTE = OFF, IGNORE_DUP_KEY = OFF, ALLOW_ROW_LOCKS = ON, ALLOW_PAGE_LOCKS = ON) ON [PRIMARY]
    ) ON [PRIMARY]
    GO
    ALTER TABLE [ALIM_SID].[FAITS_DEMANDE]  WITH CHECK ADD  CONSTRAINT [FK_FAITS_DEMANDE_DIM_APPLICATION] FOREIGN KEY([SEQ_APPLICATION])
    REFERENCES [ALIM_SID].[DIM_APPLICATION] ([SEQ_APPLICATION])
    GO
    ALTER TABLE [ALIM_SID].[FAITS_DEMANDE] CHECK CONSTRAINT [FK_FAITS_DEMANDE_DIM_APPLICATION]
    GO
    ALTER TABLE [ALIM_SID].[FAITS_DEMANDE]  WITH CHECK ADD  CONSTRAINT [FK_FAITS_DEMANDE_DIM_DEMANDE] FOREIGN KEY([SEQ_DEMANDE])
    REFERENCES [ALIM_SID].[DIM_DEMANDE] ([SEQ_DEMANDE])
    GO
    ALTER TABLE [ALIM_SID].[FAITS_DEMANDE] CHECK CONSTRAINT [FK_FAITS_DEMANDE_DIM_DEMANDE]
    GO
    ALTER TABLE [ALIM_SID].[FAITS_DEMANDE]  WITH CHECK ADD  CONSTRAINT [FK_FAITS_DEMANDE_DIM_DEMANDE_SCD] FOREIGN KEY([SEQ_DEMANDE_SCD])
    REFERENCES [ALIM_SID].[DIM_DEMANDE_SCD] ([SEQ_DEMANDE_SCD])
    GO
    ALTER TABLE [ALIM_SID].[FAITS_DEMANDE] CHECK CONSTRAINT [FK_FAITS_DEMANDE_DIM_DEMANDE_SCD]
    GO
    ALTER TABLE [ALIM_SID].[FAITS_DEMANDE]  WITH CHECK ADD  CONSTRAINT [FK_FAITS_DEMANDE_DIM_INTERVENANT] FOREIGN KEY([SEQ_INTERVENANT])
    REFERENCES [ALIM_SID].[DIM_INTERVENANT] ([SEQ_INTERVENANT])
    GO
    ALTER TABLE [ALIM_SID].[FAITS_DEMANDE] CHECK CONSTRAINT [FK_FAITS_DEMANDE_DIM_INTERVENANT]
    GO
    ALTER TABLE [ALIM_SID].[FAITS_DEMANDE]  WITH CHECK ADD  CONSTRAINT [FK_FAITS_DEMANDE_DIM_JOUR] FOREIGN KEY([SEQ_JOUR])
    REFERENCES [ALIM_SID].[DIM_JOUR] ([SEQ_JOUR])
    GO
    ALTER TABLE [ALIM_SID].[FAITS_DEMANDE] CHECK CONSTRAINT [FK_FAITS_DEMANDE_DIM_JOUR]
    GO
    ALTER TABLE [ALIM_SID].[FAITS_DEMANDE]  WITH CHECK ADD  CONSTRAINT [FK_FAITS_DEMANDE_DIM_SERVICE_RESPONSABLE] FOREIGN KEY([SEQ_SERVICE_RESPONSABLE])
    REFERENCES [ALIM_SID].[DIM_SERVICE] ([SEQ_SERVICE])
    GO
    ALTER TABLE [ALIM_SID].[FAITS_DEMANDE] CHECK CONSTRAINT [FK_FAITS_DEMANDE_DIM_SERVICE_RESPONSABLE]
    GO
    -- not shown : extended properties
    -- One of the dimension  tables (they all have a primary key named SEQ_)
    CREATE TABLE [ALIM_SID].[DIM_JOUR](
        [SEQ_JOUR] [int] IDENTITY(1,1) NOT NULL,
        [SEQ_ANNEE] [int] NOT NULL,
        [SEQ_MOIS] [int] NOT NULL,
        [DATE_JOUR] [date] NULL,
        [CODE_ANNEE] [varchar](25) NULL,
        [CODE_MOIS] [varchar](25) NULL,
        [CODE_SEMAINE_ISO] [varchar](25) NULL,
        [CODE_JOUR_ANNEE] [varchar](25) NULL,
        [CODE_ANNEE_JOUR] [varchar](25) NULL,
        [LIB_JOUR] [varchar](25) NULL,
        [LIB_JOUR_COURT] [varchar](25) NULL,
        [JOUR_OUVRE] [tinyint] NULL,
        [JOUR_CHOME] [tinyint] NULL,
     CONSTRAINT [PK_DIM_JOUR] PRIMARY KEY CLUSTERED
    (
        [SEQ_JOUR] ASC
    )WITH (PAD_INDEX = OFF, STATISTICS_NORECOMPUTE = OFF, IGNORE_DUP_KEY = OFF, ALLOW_ROW_LOCKS = ON, ALLOW_PAGE_LOCKS = ON) ON [PRIMARY]
    ) ON [PRIMARY]
    GO
    ALTER TABLE [ALIM_SID].[DIM_JOUR]  WITH CHECK ADD  CONSTRAINT [FK_DIM_JOUR_DIM_ANNEE] FOREIGN KEY([SEQ_ANNEE])
    REFERENCES [ALIM_SID].[DIM_ANNEE] ([SEQ_ANNEE])
    GO
    ALTER TABLE [ALIM_SID].[DIM_JOUR] CHECK CONSTRAINT [FK_DIM_JOUR_DIM_ANNEE]
    GO
    ALTER TABLE [ALIM_SID].[DIM_JOUR]  WITH CHECK ADD  CONSTRAINT [FK_DIM_JOUR_DIM_MOIS] FOREIGN KEY([SEQ_MOIS])
    REFERENCES [ALIM_SID].[DIM_MOIS] ([SEQ_MOIS])
    GO
    ALTER TABLE [ALIM_SID].[DIM_JOUR] CHECK CONSTRAINT [FK_DIM_JOUR_DIM_MOIS]
    GO

  • Count distinct in case statement

    SELECT A.P_ID,
    B.P_NAME,
    C.P_DESC,
    SUM(CASE
    WHEN A.DATE BETWEEN TRUNC(ADD_MONTHS(LAST_DAY(SYSDATE),-4) + 1) AND ADD_MONTHS(LAST_DAY(TO_DATE(SYSDATE)),-1)
    AND A.M_ID IS NOT NULL
    THEN 1
    ELSE 0
    END) AS COUNT,
    SUM(CASE
    WHEN A.DATE BETWEEN TRUNC(ADD_MONTHS(LAST_DAY(SYSDATE),-4) + 1) AND ADD_MONTHS(LAST_DAY(TO_DATE(SYSDATE)),-1)
    AND A.M_ID IS NOT NULL
    THEN COUNT(DISTINCT A.M_ID)
    ELSE 0
    END) AS UNIQUE_COUNT, /* Not possible */
    SUM(CASE
    WHEN A.DATE BETWEEN TRUNC(SYSDATE,'YEAR') AND ADD_MONTHS(LAST_DAY(TO_DATE(SYSDATE)),-1)
    THEN A.AMT_1
    ELSE 0
    END) AS TOTAL_AMT_1,
    SUM(CASE
    WHEN A.DATE BETWEEN TRUNC(SYSDATE,'YEAR') AND ADD_MONTHS(LAST_DAY(TO_DATE(SYSDATE)),-1)
    THEN A.AMT_2
    ELSE 0
    END) AS TOTAL_AMT_2
    FROM TABLE_A A,
    TABLE_B B,
    TABLE_C C
    WHERE A.P_ID = B.P_ID
    AND B.PT_ID = C.PT_ID
    GROUP BY A.P_ID,
    B.P_NAME,
    C.P_DESC
    Hi,
    This is a simplified version of my query.
    I am trying to do 4 things here,
    1. count A.M_ID
    2. count distinct A.M_ID, this is where I have a problem.
    3. and 4. Its just the sum from 2 diff columns.
    Note that the dates for count and amt are different and I can't hard code them.
    Can any one help me in the distinct count step?
    This query is also running kinda slow.
    So any suggestions, comments are very welcome.
    Note: TABLE_A has 700 million recs, TABLE_B 4 million and TABLE_c is just 500 recs
    Thanks!

    Taking advantage of the fact that most aggregate functions ignore nulls, you could do something like:
    SELECT a.p_id, b.p_name, c.p_desc,
           COUNT(CASE WHEN a.date BETWEEN TRUNC(ADD_MONTHS(LAST_DAY(sysdate),-4) + 1) AND
                                          ADD_MONTHS(LAST_DAY(TO_DATE(sysdate)),-1) AND
                           a.m_id IS NOT NULL THEN m_id END) AS countall,
           COUNT(DISTINCT CASE WHEN a.date BETWEEN TRUNC(ADD_MONTHS(LAST_DAY(sysdate),-4) + 1) AND
                                        ADD_MONTHS(LAST_DAY(TO_DATE(sysdate)),-1) AND
                         a.m_id IS NOT NULL THEN a.m_id END) AS unique_count, /* entirely possible */
           SUM(CASE WHEN a.date BETWEEN TRUNC(sysdate,'YEAR') AND
                                        ADD_MONTHS(LAST_DAY(TO_DATE(sysdate)),-1) THEN a.amt_1
                    ELSE 0 END) AS total_amt_1,
           SUM(CASE WHEN A.DATE BETWEEN TRUNC(sysdate,'YEAR') AND
                                        ADD_MONTHS(LAST_DAY(TO_DATE(sysdate)),-1) THEN A.AMT_2
                    ELSE 0 END) AS TOTAL_AMT_2
    FROM table_a a, table_b b, table_c c
    WHERE a.p_id = b.p_id and
          b.pt_id = c.pt_id
    GROUP BY a.p_id, b.p_name, c.p_desc
    The two case statements inside the COUNT return either a.m_id or NULL. A simplified test case is:
    WITH t as (
       SELECT 1 m_id, 9 dt FROM dual UNION ALL
       SELECT 1 m_id, 6 dt FROM dual UNION ALL
       SELECT 2 m_id, 9 dt FROM dual UNION ALL
       SELECT 2 m_id, 6 dt FROM dual UNION ALL
       SELECT 1 m_id, 5 dt FROM dual UNION ALL
       SELECT 2 m_id, 5 dt FROM dual UNION ALL
       SELECT null m_id, 9 dt FROM dual)
    SELECT count(CASE WHEN dt BETWEEN 6 and 9 THEN m_id end) cid,
           count(distinct CASE WHEN dt BETWEEN 6 and 9 THEN m_id end) cdid
    FROM t;
           CID       CDID
             4          2
    I'm not entirely sure that you actually need the a.m_id IS NOT NULL predicate in the CASE statements, but I left it to be safe.
    John
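
    The simplified test case above can also be reproduced outside SQL*Plus. The sketch below uses Python's built-in sqlite3 module in place of Oracle (an assumption; there is no DUAL table in SQLite, so the rows are inserted into a small table t instead).

```python
import sqlite3

# In-memory SQLite table standing in for the WITH t AS (...) test data.
conn = sqlite3.connect(":memory:")
conn.execute("CREATE TABLE t (m_id INTEGER, dt INTEGER)")
conn.executemany(
    "INSERT INTO t VALUES (?, ?)",
    [(1, 9), (1, 6), (2, 9), (2, 6), (1, 5), (2, 5), (None, 9)],
)

# The CASE yields m_id inside the range and NULL outside it;
# COUNT and COUNT(DISTINCT ...) both skip the NULLs.
cid, cdid = conn.execute(
    """
    SELECT COUNT(CASE WHEN dt BETWEEN 6 AND 9 THEN m_id END),
           COUNT(DISTINCT CASE WHEN dt BETWEEN 6 AND 9 THEN m_id END)
    FROM t
    """
).fetchone()

print(cid, cdid)  # 4 2
```

    The row with a NULL m_id falls inside the range but is not counted, which is consistent with the remark that the explicit IS NOT NULL predicate may be redundant.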

  • Count Distinct over a Window

    Hi everyone,
    An analyst on my team heard of a new metric called a "Stickiness" metric. It basically measures how often users are coming to your website over time.
    The definition is as follows:
    # Unique Users Today/#Unique users Over Last 7 days
    and also
    # Unique Users Today/#Unique users Over Last 30 days
    We have visit information stored in a table W_WEB_VISIT_F. For the sake of simplicity say it has columns VISIT_ID, VISIT_DATE and USER_ID (there are several more dimensional columns it has but I want to keep this exercise simple).
    I want to create an aggregate table called W_WEB_VISIT_A that pre-aggregates the three values I need per day: # Unique Users Today, #Unique users Over Last 7 days and #Unique users Over Last 30 days. The only way I can think of building the aggregate table is as follows
    WITH AGG AS (
    SELECT
    VISIT_DATE,
    USER_ID
    FROM W_WEB_VISIT_F
    GROUP BY
    VISIT_DATE,
    USER_ID
    )
    select
    src.VISIT_DATE,
    COUNT(DISTINCT src.USER_ID) UNIQUE_TODAY,
    (select count(distinct hist.USER_ID) from agg hist where hist.VISIT_DATE between src.VISIT_DATE - 6 and src.VISIT_DATE) SEVEN_DAYS,
    (select count(distinct hist.USER_ID) from agg hist where hist.VISIT_DATE between src.VISIT_DATE - 29 and src.VISIT_DATE) THIRTY_DAYS
    from agg src
    group by src.visit_date
    The problem I am having is that W_WEB_VISIT_F has several million records in it and I can't get the above query to complete. It ran overnight and didn't complete.
    Is there a fancy 11g function I can use to do this for me? Is there a more efficient method?
    Thanks everyone for the help!
    -Joe
    Edited by: user9208525 on Jan 13, 2011 6:24 AM
    You guys are right. I missed the group by I had in the WITH Clause.

    Hi,
    Haven't used the windowing clause a lot, so I wanted to give a try.
    I made up some data with this query:
    create table t as select sysdate-dbms_random.value(0,10) visit_date, mod(level,5)+1 user_id
    from dual
    connect by level <= 20;
    Which gave me the following rows:
    Scott@my10g SQL>select * from t order by visit_date;
    VISIT_DATE             USER_ID
    03/01/2011 13:17:10          1
    04/01/2011 05:30:30          4
    04/01/2011 08:08:13          5
    04/01/2011 14:42:24          3
    04/01/2011 20:20:58          3
    05/01/2011 17:29:24          2
    05/01/2011 17:40:20          4
    05/01/2011 18:32:56          2
    06/01/2011 04:12:53          5
    06/01/2011 08:59:18          2
    06/01/2011 09:04:26          3
    06/01/2011 10:14:20          1
    06/01/2011 14:22:54          1
    06/01/2011 19:39:04          1
    08/01/2011 14:44:18          5
    08/01/2011 21:38:04          5
    11/01/2011 04:56:05          4
    11/01/2011 18:52:29          2
    11/01/2011 23:57:30          4
    13/01/2011 07:24:22          3
    20 rows selected.
    I came up with this query:
    select
            v.*,
            case
                    when unq_l3d is null then -1
                    else trunc(unq_today/unq_l3d,2)
            end ratio
    from (
            select distinct trcdt, unq_today, unq_l3d
            from (
                    select
                    trcdt,
                    count(user_id)
                    over (
                            order by trcdt
                            range between numtodsinterval(1,'DAY') preceding and current row
                    ) unq_today,
                    count(user_id)
                    over (
                            order by trcdt
                            range between numtodsinterval(3,'DAY') preceding and current row
                    ) unq_l3d
                    from (
                            select distinct trunc(visit_date) trcdt, user_id from t
                    )
            )
    ) v
    order by trcdt
    With my sample data, it gives me:
    TRCDT                UNQ_TODAY    UNQ_L3D RATIO
    03/01/2011 00:00:00          1          1  1.00
    04/01/2011 00:00:00          4          4  1.00
    05/01/2011 00:00:00          5          6  0.83
    06/01/2011 00:00:00          6         10  0.60
    08/01/2011 00:00:00          1          7  0.14
    11/01/2011 00:00:00          2          3  0.66
    13/01/2011 00:00:00          1          3  0.33
    7 rows selected.
    where:
    - UNQ_TODAY is the number of distinct user_id in the day
    - UNQ_L3D is the number of distinct user_id in the last 3 days
    - RATIO is UNQ_TODAY divided by UNQ_L3D (when UNQ_L3D is not zero)
    It seems quite correct, but you would have to modify the query to fit your needs and double-check the results!
    Just noticed that my query is all wrong... must have been missing caffeine, or sleep... but I'm still trying!
    Edited by: Nicosa on Jan 13, 2011 5:29 PM
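
    One way to get distinct users over a trailing window without relying on an analytic COUNT is a range self-join followed by COUNT(DISTINCT ...). The sketch below is not from the thread: it uses Python's built-in sqlite3 module, a hypothetical visits table with an integer visit_day column instead of real dates, and a 3-day window (the current day plus the two before it). On several million rows the join would need to be checked for performance.

```python
import sqlite3

# Hypothetical sample data: (visit_day, user_id); day numbers replace dates.
conn = sqlite3.connect(":memory:")
conn.execute("CREATE TABLE visits (visit_day INTEGER, user_id TEXT)")
conn.executemany(
    "INSERT INTO visits VALUES (?, ?)",
    [(1, "a"), (1, "b"), (2, "a"), (3, "c"), (3, "c"), (5, "a"), (6, "d")],
)

# For each day d, join every (day, user) pair h from the trailing 3-day
# window, then count distinct users for the day itself and for the window.
query = """
SELECT d.visit_day,
       COUNT(DISTINCT CASE WHEN h.visit_day = d.visit_day
                           THEN h.user_id END) AS unq_today,
       COUNT(DISTINCT h.user_id)              AS unq_l3d
FROM (SELECT DISTINCT visit_day FROM visits) d
JOIN (SELECT DISTINCT visit_day, user_id FROM visits) h
  ON h.visit_day BETWEEN d.visit_day - 2 AND d.visit_day
GROUP BY d.visit_day
ORDER BY d.visit_day
"""
stickiness_rows = conn.execute(query).fetchall()

for visit_day, unq_today, unq_l3d in stickiness_rows:
    print(visit_day, unq_today, unq_l3d, round(unq_today / unq_l3d, 2))
```

    Pre-deduplicating to one row per day and user, as the original WITH clause does, keeps the join input as small as possible.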

  • OBIEE 10G count distinct problem

    Hi,
    I am really new to OBI and have run into this problem.
    I have a fact and three dimension tables as follows:
    fact:
    1. sales:
    sold_value (sum)
    transactions (count distinct receipt_id)
    branch_id (foreign key)
    daykey (foreign key)
    receipt_id (foreign key)
    product_key (foreign key)
    dimensions
    1. branch
    branch_id (key)
    2. time
    daykey (key)
    3. product
    product_key (key)
    These tables are joined as star schema by keys mentioned above. sales.sold_value is aggregated by 'sum', transactions is by (count distinct receipt_id). I don't have a dimension for receipt_id since it's only for the calculation of transaction.
    So how can I set up to make the transactions correct (count distinct receipt_id)?
    I tried to set transactions as count distinct in Default aggregation rule. But the result is wrong (all 1)

    All right. I figured it out.
    The fact table should be modelled as:
    1. sales:
    physical layer:
    sold_value
    branch_id (foreign key)
    daykey (foreign key)
    receipt_id (foreign key)
    product_key (foreign key)
    The underlying query is:
    select
    branch_id, daykey, receipt_id, product_key
    , sum(sold_value)
    from table
    group by
    branch_id, daykey, receipt_id, product_key
    BMM layer:
    sold_value (sum)
    transactions (count distinct receipt_id)
    branch_id (foreign key)
    daykey (foreign key)
    receipt_id (foreign key) (removed)
    product_key (foreign key)

  • Count distinct values in report builder

    i have a situation where i have to count distinct number of customers.
    i have a query which returns the list of values of bill_to_customer_id from ra_customer_trx_all table and i have to display only the number of distinct customers. i cant do this in the query because it has to be grouped and i am doing it in an aging report. i have to list the number of distinct customers in each aging period. can anybody please help me how to achieve this in reports 6i.
    thanks

    how can i count distinct values in reports?
    the situation is like this
    i have a query which lists customer_id, invoice number, amount due
    so what i want is to count the distinct customer_id and display the number of distinct customers. one customer_id can be repeated any number of times but i should count it only once.

  • Count Distinct

    Hi @all,
    the question might be answered already but I can't think of what to search for.
    I've got a dimension with Active Directory attributes. And another dimension with groupnames.
    One AD-account can be in many groups.
    The facttable (snowflake Schema) contains the ID of the AD-Dimension and the groupname.
    It could look like this:
    ID    GroupName     GroupAlias
    1      Test1              Test
    1      Test2              Test
    1      Test3              Test
    1      hello1             Hello
    1      hello2             Hello
    I am actually talking about the GroupAlias which should be counted distinct.
    The ID 1 is in 3 different "Test-Groups", but the alias is always "Test". So the Count should be 1.
    What should the MDX look like?
    Thanks!

    something i grabbed from technet. this gives the distinct count of dim members with internet sales. if you are not able to get your mdx, post it. 
    WITH SET MySet AS
    {[Customer].[Customer Geography].[Country].&[Australia],[Customer].[Customer Geography].[Country].&[Australia],
    [Customer].[Customer Geography].[Country].&[Canada],[Customer].[Customer Geography].[Country].&[France],
    [Customer].[Customer Geography].[Country].&[United Kingdom],[Customer].[Customer Geography].[Country].&[United Kingdom]}
    * {[Measures].[Internet Sales Amount] }
    MEMBER MEASURES.SETDISTINCTCOUNT AS
    DISTINCTCOUNT(MySet)
    SELECT {MEASURES.SETDISTINCTCOUNT} ON 0
    FROM [Adventure Works]

  • Maximum, Count Distinct functions in Template.

    Hi,
    I have a dataset which I was trying to group for a report.
    I need the report to list the product code in one column and Max(sub-product code) in another column. Product code and sub-product code are strings, not numbers.
    Suppose my data set has
    product-code|sub-product-code
    abc|1a
    abc|1b
    abc|1c
    def|2a
    def|2b
    def|2c
    I need in report
    abc|1a
    def|2b
    I grouped it and tried to show Max(sub-product code) of the current group, but it gave me a number instead of text. I tried to select "text" for the form field, but it automatically changes to Number.
    Also, can you let me know how to write the count(distinct sub-product code)? I need to add this column too.
    I cannot add them in the SQL because there are tons of templates on this dataset. I have to do it in the template.
    Any help is greatly appreciated.

    Try using this: <?count(xdoxslt:distinct_values(field_name))?> -- substitute your sub product code field name
    Thanks,
    Bipuser

  • Query rewrite for COUNT(DISTINCT)

    Hi,
    I have a fact table with different dimension keys.
    CREATE TABLE FACT
    (
    TIME_SKEY NUMBER,
    REGION_SKEY NUMBER,
    AC_SKEY NUMBER
    );
    I need to take COUNT(DISTINCT AC_SKEY) for TIME_SKEY and REGION_SKEY. There are Oracle dimensions defined for time and region which use TIME_SKEY and REGION_SKEY. I have created an MV with query rewrite using COUNT(DISTINCT), but it does not use the dimension if I query at any other level, and the MV can't be fast refreshed because it was built using COUNT(DISTINCT).
    CREATE MATERIALIZED VIEW AC_MV
    NOCACHE
    NOLOGGING
    NOCOMPRESS
    NOPARALLEL
    BUILD IMMEDIATE
    REFRESH COMPLETE ON DEMAND
    WITH PRIMARY KEY
    ENABLE QUERY REWRITE
    AS
    SELECT
    TIME_SKEY ,
    REGION_SKEY,
    COUNT (DISTINCT AC_SKEY)
    FROM FACT
    GROUP BY TIME_SKEY, REGION_SKEY;
    Query used to retrieve data is as below
    SELECT TIME_SKEY, COUNT(DISTINCT AC_SKEY) OVER (PARTITION BY TIME_SKEY) UNIQ_AC, COUNT(DISTINCT AC_SKEY) OVER () UNIQ_AC1
    FROM FACT;
    There can be other queries based on time / region dimension.
    Can you please provide help in solving above issue?
    Thanks,
    Pritesh

    What version of the Oracle database?

  • Count distinct aggregation in CWM2

    Hi,
    Is there any way to make a count distinct aggregation with a CWM2 model?
    I mean, for the documentation example (geography, product, channel and time dimensions), I would like to show in a graph the sum of the sales measure and the number of different products, for the different filter selections made.
    E.g.:
    Page items: geography dim, time dim.
    Groups: Measures (sales measure, ¿number of different products?)
    Series: channel dim.
    I know that the only possible aggregation in a CWM2 model is SUM, so I think it can't be done with an aggregation. Can anyone suggest a workaround to do this? Is it possible within an analytical workspace?
    Thanks in advance...

    Hi Nagarajan,
    1. To count distinct records from one internal table,
      first of all u have to decide
      which FIELD COMBINATION makes a record unique.
    2. After that u can use the abap syntax
       delete adjacent duplicates (see documentation/help)
        (before this make sure to SORT the internal table
        in the same sequence of FIELD COMBINATION)
    3. then u can use
       describe table itab.
    4. Before doing step 2,
       u can copy the whole internal table to another internal table
      and do your logic on the second internal table
      so that the original is not lost.
      u can copy like this.
       ITAB2[]   =  ITAB1[].
    Hope it helps.
    Regards,
    Amit M.
    Message was edited by: Amit Mittal

  • COUNT(DISTINCT) on multiple columns?

    Is there an easier way of doing a COUNT(DISTINCT...) on multiple items than converting them to strings and concatenating them?
    i.e. if I have a table with column string1 as VARCHAR2(1000), number2 as NUMBER, and date3 as DATE, and I want a count on how many distinct combinations of the three exist, is there a better way than:
    SELECT COUNT(DISTINCT string1 || TO_CHAR(number2) || TO_CHAR(date3, 'YYYYMMDD'))
    -- Don

    Hi,
    Why not a group by?
    with t as
    (
    select 'string1' string1, 1 number1, to_date('10-NOV-2009','DD-MON-YYYY') date1 from dual
    union all select 'string2',1,to_date('10-NOV-2009','DD-MON-YYYY') from dual
    union all select 'string1',1,to_date('11-NOV-2009','DD-MON-YYYY') from dual
    union all select 'string1',2,to_date('11-NOV-2009','DD-MON-YYYY') from dual
    union all select 'string2',1,to_date('10-NOV-2009','DD-MON-YYYY') from dual
    )
    select string1, number1, date1 from t
    group by string1, number1, date1;
    STRING1    NUMBER1 DATE1
    string1          1 11-NOV-09
    string2          1 10-NOV-09
    string1          1 10-NOV-09
    string1          2 11-NOV-09
    with t as
    (
    select 'string1' string1, 1 number1, to_date('10-NOV-2009','DD-MON-YYYY') date1 from dual
    union all select 'string2',1,to_date('10-NOV-2009','DD-MON-YYYY') from dual
    union all select 'string1',1,to_date('11-NOV-2009','DD-MON-YYYY') from dual
    union all select 'string1',2,to_date('11-NOV-2009','DD-MON-YYYY') from dual
    union all select 'string2',1,to_date('10-NOV-2009','DD-MON-YYYY') from dual
    )
    select string1, number1, date1 from t
    group by string1, number1, date1
    having count(*) > 1;
    STRING1    NUMBER1 DATE1
    string2          1 10-NOV-09
    -Arun
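
    Building on the GROUP BY idea, the number of distinct combinations is the row count of the de-duplicated set, so no string concatenation is needed. The sketch below runs this on the same five rows with Python's built-in sqlite3 module (an assumption in place of Oracle; dates are stored as ISO text because SQLite has no DATE type).

```python
import sqlite3

# Same five sample rows as in the reply; dates kept as ISO strings.
conn = sqlite3.connect(":memory:")
conn.execute("CREATE TABLE t (string1 TEXT, number1 INTEGER, date1 TEXT)")
conn.executemany(
    "INSERT INTO t VALUES (?, ?, ?)",
    [
        ("string1", 1, "2009-11-10"),
        ("string2", 1, "2009-11-10"),
        ("string1", 1, "2009-11-11"),
        ("string1", 2, "2009-11-11"),
        ("string2", 1, "2009-11-10"),
    ],
)

# Count the rows of the de-duplicated (string1, number1, date1) set.
(distinct_combinations,) = conn.execute(
    """
    SELECT COUNT(*)
    FROM (SELECT DISTINCT string1, number1, date1 FROM t)
    """
).fetchone()

print(distinct_combinations)  # 4
```

    Unlike concatenation without a separator, this cannot merge two different combinations into one value (for example 'ab' || 'c' and 'a' || 'bc' both give 'abc').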

  • Query with COUNT DISTINCT

    Hello,
    We are in 10g ...
    I have to compute COUNT DISTINCT of customers, per month, and YearToDate.
    Per month, I think I found it out ...
    On the year to date ... I have no clue at all ... and I hope that you could provide me with a solution or advice...
    Here is my example :
    month cust
    200711 A
    200711 B
    200712 A
    200712 C
    200801 A
    200801 B
    200802 A
    200802 C
    200803 A
    200803 C
    200803 A
    200804 D
    I would like to get this :
    month cust_count cust_count_YTD
    200711......2................2 (because cust A and B)
    200712......2................3 (because cust A and C)
    200801......2................2 (Back to 0 at the beginning of each year)
    200802......2................3 (because cust A and C)
    200803......2................3 (because cust A and C, and A but count distinct)
    200804......1................4 (because D)
    Thank you in advance,
    Olivier

    Oh This is an interesting question.
    create table custTable(yyyymm,cust) as
    SELECT '200711','A' FROM dual UNION all
    SELECT '200711','B' FROM dual UNION all
    SELECT '200712','A' FROM dual UNION all
    SELECT '200712','C' FROM dual UNION all
    SELECT '200801','A' FROM dual UNION all
    SELECT '200801','B' FROM dual UNION all
    SELECT '200802','A' FROM dual UNION all
    SELECT '200802','C' FROM dual UNION all
    SELECT '200803','A' FROM dual UNION all
    SELECT '200803','C' FROM dual UNION all
    SELECT '200803','A' FROM dual UNION all
    SELECT '200804','D' FROM dual;
    select distinct yyyymm,cust_count,
    sum(WillSum) over(partition by substr(yyyymm,1,4) order by yyyymm) as cust_count_YTD
    from (select yyyymm,count(distinct cust) over(partition by yyyymm) as cust_count,
          case Row_Number() over(partition by substr(yyyymm,1,4),cust order by yyyymm)
          when 1 then 1 else 0 end as WillSum
            from custTable)
    order by yyyymm;
    or
    select yyyymm,count(distinct cust) as cust_count,
    sum(sum(WillSum)) over(partition by substr(yyyymm,1,4) order by yyyymm) as cust_count_YTD
    from (select yyyymm,cust,
          case Row_Number() over(partition by substr(yyyymm,1,4),cust order by yyyymm)
          when 1 then 1 else 0 end as WillSum
            from custTable)
    group by yyyymm
    order by yyyymm;
    YYYYMM  CUST_COUNT  CUST_COUNT_YTD
    200711           2               2
    200712           2               3
    200801           2               2
    200802           2               3
    200803           2               3
    200804           1               4
    Similar threads:
    Rolling unique person count by month over a time period
    [SQL] how can i get this result....??(accumulation distinct count)
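
    The year-to-date logic in the queries above (flag each customer's first month within a year, then take a running sum of the flags) can be tried on the sample data as follows. This is a sketch using Python's built-in sqlite3 module instead of Oracle 10g (an assumption; it needs SQLite 3.25 or later for window functions), and it splits the work into two named steps, flagged and monthly, which are illustrative names rather than anything from the thread.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.execute("CREATE TABLE custTable (yyyymm TEXT, cust TEXT)")
conn.executemany(
    "INSERT INTO custTable VALUES (?, ?)",
    [
        ("200711", "A"), ("200711", "B"), ("200712", "A"), ("200712", "C"),
        ("200801", "A"), ("200801", "B"), ("200802", "A"), ("200802", "C"),
        ("200803", "A"), ("200803", "C"), ("200803", "A"), ("200804", "D"),
    ],
)

query = """
WITH flagged AS (
    -- 1 on a customer's first row within the calendar year, else 0
    SELECT yyyymm,
           cust,
           CASE WHEN ROW_NUMBER() OVER (
                         PARTITION BY substr(yyyymm, 1, 4), cust
                         ORDER BY yyyymm) = 1
                THEN 1 ELSE 0 END AS first_in_year
    FROM custTable
),
monthly AS (
    -- per month: distinct customers, and customers new for the year
    SELECT yyyymm,
           COUNT(DISTINCT cust) AS cust_count,
           SUM(first_in_year)   AS new_in_year
    FROM flagged
    GROUP BY yyyymm
)
SELECT yyyymm,
       cust_count,
       SUM(new_in_year) OVER (
           PARTITION BY substr(yyyymm, 1, 4)
           ORDER BY yyyymm) AS cust_count_ytd
FROM monthly
ORDER BY yyyymm
"""
ytd_rows = conn.execute(query).fetchall()

for yyyymm, cust_count, cust_count_ytd in ytd_rows:
    print(yyyymm, cust_count, cust_count_ytd)
```

    Partitioning by the year prefix is what resets the running total at the start of each year.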

  • Count distinct  from a master table and sum from a detail

    Hello to all,
    I have  a query like as:
             select a,b,c,     SUM(fa.ip1) S1, SUM(fa.ip2)   S2
             FROM tab1 FI, tab2 FA
             where fi.x1 = fa.x1
             and fi.x2 = fa.x2
             group by    a,b,c;
    tab1 is the master table for tab2 (for one tab1 record there are many tab2 records: a one-to-many relation).
    My question is: how can I get the sums of columns ip1 and ip2 from tab2 FA, together with a count of the rows of tab1?
    Something similar to:
    Select a,b,c, count(distinct fi.x1, fi.x2 ) nrec_of_FI,   SUM(fa.ip1)S1, SUM(fa.ip2)   S2
    Thanks in advance

    Hi,
    Sorry, I can't tell what you want just by looking at code that does not do it.
    Whenever you have a problem, please post a little sample data (CREATE TABLE and INSERT statements, relevant columns only) from all tables involved, so that the people who want to help you can re-create the problem and test their ideas.
    Also post the results you want from that data, and an explanation of how you get those results from that data, with specific examples.
    Always say which version of Oracle you're using (for example, 11.2.0.2.0).
    See the forum FAQ: https://forums.oracle.com/message/9362002
    Perhaps you want to get the totals in 2 stages, like this:
    WITH  five_column_totals  AS
    (
        select    a, b, c
        ,         fi.x1, fi.x2       -- For debugging only
        ,         SUM (fa.ip1)   AS prelim_S1
        ,         SUM (fa.ip2)   AS prelim_S2
        FROM      tab1 FI
        ,         tab2 FA
        where     fi.x1 = fa.x1
        and       fi.x2 = fa.x2
        group by  a, b, c
        ,         fi.x1, fi.x2
    )
    SELECT    a, b, c
    ,         COUNT (*)             AS nrec_of_fi
    ,         SUM (prelim_s1)       AS s1
    ,         SUM (prelim_s2)       AS s2
    FROM      five_column_totals
    GROUP BY  a, b, c
    ;
    Notice that the sub-query called five_column_totals is essentially what you posted, except that fi.x1 and fi.x2 are included in the GROUP BY clause.  That means the sub-query will have a separate row for each distinct combination of x1 and x2, which you can COUNT in the main query, GROUPing only BY a, b and c.
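
    The two-stage aggregation described above can be tried on made-up data. The sketch below uses Python's built-in sqlite3 module instead of Oracle, and it assumes that a, b and c are columns of tab1; both the sample rows and that column placement are hypothetical, since the original post does not show the table definitions.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
# Hypothetical layout: a, b, c live in the master table tab1.
conn.execute("CREATE TABLE tab1 (x1 INTEGER, x2 INTEGER, a TEXT, b TEXT, c TEXT)")
conn.execute("CREATE TABLE tab2 (x1 INTEGER, x2 INTEGER, ip1 INTEGER, ip2 INTEGER)")
conn.executemany(
    "INSERT INTO tab1 VALUES (?, ?, ?, ?, ?)",
    [(1, 1, "p", "q", "r"), (2, 1, "p", "q", "r"), (3, 1, "s", "q", "r")],
)
conn.executemany(
    "INSERT INTO tab2 VALUES (?, ?, ?, ?)",
    [(1, 1, 10, 1), (1, 1, 20, 2), (2, 1, 5, 3),
     (3, 1, 7, 4), (3, 1, 8, 5), (3, 1, 9, 6)],
)

# Stage 1: one row per master record with its detail totals.
# Stage 2: count those rows and add up the preliminary sums per (a, b, c).
query = """
WITH five_column_totals AS (
    SELECT fi.a, fi.b, fi.c, fi.x1, fi.x2,
           SUM(fa.ip1) AS prelim_s1,
           SUM(fa.ip2) AS prelim_s2
    FROM tab1 fi
    JOIN tab2 fa ON fi.x1 = fa.x1 AND fi.x2 = fa.x2
    GROUP BY fi.a, fi.b, fi.c, fi.x1, fi.x2
)
SELECT a, b, c,
       COUNT(*)       AS nrec_of_fi,
       SUM(prelim_s1) AS s1,
       SUM(prelim_s2) AS s2
FROM five_column_totals
GROUP BY a, b, c
ORDER BY a, b, c
"""
totals = conn.execute(query).fetchall()

for row in totals:
    print(row)
```

    Note that master rows with no matching detail rows drop out of the inner join, so they are not counted; an outer join would be needed if they should be.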
