Partition by clause

Does the PARTITION BY clause give better performance than a normal GROUP BY clause?
thanks,
Raj.

Analytic queries != Aggregate queries
Therefore, whether you want an analytic or an aggregate query depends on what you're doing.
Performance also depends on the amount of data in your tables.
Having said that, analytics can mean that a self-join is no longer needed, and this could help if a large table is involved ... on the other hand, it might hinder.
Think of Analytic queries as another tool in your toolbox; sometimes a hammer is the right tool to use, but sometimes you need a spanner instead.

Similar Messages

Missing partition by clause causing wrong aggregation

Hello all!
I have a Location Hierarchy. Country Region > Country > State > City
When I create a report using Country, Headcount ( Month = July 11) I get correct results with right aggregation:
United States      2000
Mexico      1500
Ireland      1000
SQL Generated:
WITH
SAWITH0 AS (select T95996.COUNTRY_NAME as c2,
T95996.COUNTRY_CODE as c3,
sum(T158903.HEADCOUNT) as c4,
T100027.PER_NAME_MONTH as c5
from
W_BUSN_LOCATION_D T95996,
W_EMPLOYMENT_D T95816,
W_MONTH_D T100027,
W_WRKFC_EVT_MONTH_F T158903
where ( T95816.ROW_WID = T158903.EMPLOYMENT_WID
and T95996.ROW_WID = T158903.LOCATION_WID
and T100027.ROW_WID = T158903.EVENT_MONTH_WID
and T100027.PER_NAME_MONTH = '2011 / 08' )
group by T95996.COUNTRY_CODE, T95996.COUNTRY_NAME, T100027.PER_NAME_MONTH),
SAWITH1 AS (select distinct SAWITH0.c2 as c1,
LAST_VALUE(SAWITH0.c4 IGNORE NULLS) OVER (PARTITION BY SAWITH0.c3 ORDER BY SAWITH0.c3 NULLS FIRST, SAWITH0.c5 NULLS FIRST ROWS BETWEEN UNBOUNDED PRECEDING AND UNBOUNDED FOLLOWING) as c2,
SAWITH0.c3 as c3
from
SAWITH0)
select SAWITH1.c1 as c1,
SAWITH1.c2 as c2
from
SAWITH1
order by c1
When I create a report using Country Region, Headcount ( Month = July 11) I get the wrong aggregation and all rows show the same number:
Region 1- 135000
Region 2- 135000
Region 3- 135000
SQL Generated:
WITH
SAWITH0 AS (select T95996.COUNTRY_REGION as c2,
sum(T158903.HEADCOUNT) as c3,
T100027.PER_NAME_MONTH as c4
from
W_EMPLOYMENT_D T95816,
W_MONTH_D T100027,
W_WRKFC_EVT_MONTH_F T158903,
W_BUSN_LOCATION_D T95996
where ( T95816.ROW_WID = T158903.EMPLOYMENT_WID
and T100027.ROW_WID = T158903.EVENT_MONTH_WID
and T100027.PER_NAME_MONTH = '2011 / 08' )
group by T95996.COUNTRY_REGION, T100027.PER_NAME_MONTH)
select distinct SAWITH0.c2 as c1,
LAST_VALUE(SAWITH0.c3 IGNORE NULLS) OVER ( ORDER BY SAWITH0.c4 NULLS FIRST ROWS BETWEEN UNBOUNDED PRECEDING AND UNBOUNDED FOLLOWING) as c2
from
SAWITH0
order by c1
I see that the second SQL is missing the PARTITION BY clause, and I am wondering whether this is the reason for the wrong calculation. How can I make the BI Server include this clause?
Any leads will be helpful.

Hi Deepak,
Thanks for your reply. I see your point here.
Some more info: This fact table is actually a monthly snapshot. Do you think I should check on anything else?
I tried to simplify the SQL generated by the BI Server by removing some extra conditions. Here is the actual SQL for Country, Headcount:
WITH
SAWITH0 AS (select T95996.COUNTRY_NAME as c2,
T95996.COUNTRY_CODE as c3,
sum(case when T95816.W_EMPLOYMENT_STAT_CODE = 'A' and T95816.W_EMPLOYEE_CAT_CODE = 'EMPLOYEE' then T158903.HEADCOUNT else 0 end ) as c4,
T100027.PER_NAME_MONTH as c5
from
W_BUSN_LOCATION_D T95996 /* Dim_W_BUSN_LOCATION_D_Employee */ ,
W_EMPLOYMENT_D T95816 /* Dim_W_EMPLOYMENT_D */ ,
W_MONTH_D T100027 /* Dim_W_MONTH_D */ ,
W_WRKFC_EVT_MONTH_F T158903 /* Fact_W_WRKFC_EVT_MONTH_F_Snapshot */
where ( T95816.ROW_WID = T158903.EMPLOYMENT_WID and T95996.ROW_WID = T158903.LOCATION_WID and T100027.ROW_WID = T158903.EVENT_MONTH_WID and T100027.PER_NAME_MONTH = '2011 / 07' and T158903.SNAPSHOT_IND = 1 and T158903.DELETE_FLG <> 'Y' and T100027.CAL_MONTH_START_DT >= TO_DATE('2004-01-01 00:00:00' , 'YYYY-MM-DD HH24:MI:SS') and (T158903.SNAPSHOT_MONTH_END_IND in (1) or T158903.EFFECTIVE_END_DATE >= TO_DATE('2011-08-11 00:00:00' , 'YYYY-MM-DD HH24:MI:SS')) and (T95996.ROW_WID in (0) or T95996.BUSN_LOC_TYPE in ('EMP_LOC')) and T158903.EFFECTIVE_START_DATE <= TO_DATE('2011-08-11 00:00:00' , 'YYYY-MM-DD HH24:MI:SS') )
group by T95996.COUNTRY_CODE, T95996.COUNTRY_NAME, T100027.PER_NAME_MONTH),
SAWITH1 AS (select distinct SAWITH0.c2 as c1,
LAST_VALUE(SAWITH0.c4 IGNORE NULLS) OVER (PARTITION BY SAWITH0.c3 ORDER BY SAWITH0.c3 NULLS FIRST, SAWITH0.c5 NULLS FIRST ROWS BETWEEN UNBOUNDED PRECEDING AND UNBOUNDED FOLLOWING) as c2,
SAWITH0.c3 as c3
from
SAWITH0)
select SAWITH1.c1 as c1,
SAWITH1.c2 as c2
from
SAWITH1
order by c1
for Country Region, Headcount:
WITH
SAWITH0 AS (select T95996.COUNTRY_REGION as c2,
sum(case when T95816.W_EMPLOYMENT_STAT_CODE = 'A' and T95816.W_EMPLOYEE_CAT_CODE = 'EMPLOYEE' then T158903.HEADCOUNT else 0 end ) as c3,
T100027.PER_NAME_MONTH as c4
from
W_EMPLOYMENT_D T95816 /* Dim_W_EMPLOYMENT_D */ ,
W_MONTH_D T100027 /* Dim_W_MONTH_D */ ,
W_WRKFC_EVT_MONTH_F T158903 /* Fact_W_WRKFC_EVT_MONTH_F_Snapshot */ ,
W_BUSN_LOCATION_D T95996 /* Dim_W_BUSN_LOCATION_D_Employee */
where ( T95816.ROW_WID = T158903.EMPLOYMENT_WID and T100027.ROW_WID = T158903.EVENT_MONTH_WID and T100027.PER_NAME_MONTH = '2011 / 07' and T158903.SNAPSHOT_IND = 1 and T158903.DELETE_FLG <> 'Y' and T100027.CAL_MONTH_START_DT >= TO_DATE('2004-01-01 00:00:00' , 'YYYY-MM-DD HH24:MI:SS') and (T158903.SNAPSHOT_MONTH_END_IND in (1) or T158903.EFFECTIVE_END_DATE >= TO_DATE('2011-08-11 00:00:00' , 'YYYY-MM-DD HH24:MI:SS')) and (T95996.ROW_WID in (0) or T95996.BUSN_LOC_TYPE in ('EMP_LOC')) and T158903.EFFECTIVE_START_DATE <= TO_DATE('2011-08-11 00:00:00' , 'YYYY-MM-DD HH24:MI:SS') )
group by T95996.COUNTRY_REGION, T100027.PER_NAME_MONTH)
select distinct SAWITH0.c2 as c1,
LAST_VALUE(SAWITH0.c3 IGNORE NULLS) OVER ( ORDER BY SAWITH0.c4 NULLS FIRST ROWS BETWEEN UNBOUNDED PRECEDING AND UNBOUNDED FOLLOWING) as c2
from
SAWITH0
order by c1
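
One thing that can be read directly off the SQL as posted: the Country Region query lists W_BUSN_LOCATION_D in its FROM clause but has no T95996.ROW_WID = T158903.LOCATION_WID predicate, and its LAST_VALUE has no PARTITION BY. The second point alone is enough to make every row show one number, because an analytic function without PARTITION BY treats the whole result set as a single window. Below is a minimal sketch of that effect, run through Python's sqlite3 so that it is self-contained. The sawith0 table and its figures are made up for illustration, and SQLite stands in for Oracle here, so IGNORE NULLS and NULLS FIRST are left out.

```python
import sqlite3

# Hypothetical stand-in for the SAWITH0 result: one aggregated row per region and month.
# Window functions need SQLite 3.25 or later.
conn = sqlite3.connect(":memory:")
conn.execute("CREATE TABLE sawith0 (region TEXT, headcount INTEGER, month TEXT)")
conn.executemany(
    "INSERT INTO sawith0 VALUES (?, ?, ?)",
    [("Region 1", 2000, "2011 / 07"),
     ("Region 2", 1500, "2011 / 07"),
     ("Region 3", 1000, "2011 / 07")],
)

frame = "ROWS BETWEEN UNBOUNDED PRECEDING AND UNBOUNDED FOLLOWING"

# No PARTITION BY: the window is the whole result set, so every row gets one shared value.
unpartitioned = conn.execute(
    f"SELECT region, LAST_VALUE(headcount) OVER (ORDER BY month {frame}) "
    "FROM sawith0 ORDER BY region"
).fetchall()

# PARTITION BY region: each region only sees its own rows.
partitioned = conn.execute(
    f"SELECT region, LAST_VALUE(headcount) OVER (PARTITION BY region ORDER BY month {frame}) "
    "FROM sawith0 ORDER BY region"
).fetchall()
```

In this sketch, unpartitioned carries a single headcount value on all three rows, while partitioned keeps 2000, 1500 and 1000 apart. Whether the BI Server can be made to emit the PARTITION BY is a separate modelling question that this does not answer.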

How to group the values with this partition over clause ?

Hi,
I have a nice query:
select c.libelle "Activité", sum(b.duree) "Durée"
from    fiche a, activite_faite b,
        activites c, agent d
where   a.date_activite
BETWEEN TO_DATE('20/09/2009', 'DD/MM/YYYY') AND TO_DATE('26/10/2009', 'DD/MM/YYYY')
AND     a.agent_id = 104
AND     a.fiche_id = b.fiche_id
AND     b.activites_id = c.activites_id
AND     a.agent_id = d.agent_id
group   by c.libelle
order   by sum(b.duree)
It gives me this nice result:
ACTIVITE DUREE
     Tonte            27
I want to get a percentage, so I use ratio_to_report:
select a.fiche_id, c.libelle "Activité", ratio_to_report(duree) over (partition by c.activites_id) * 100 "Durée"
from    fiche a, activite_faite b,
        activites c, agent d
where   a.date_activite
BETWEEN TO_DATE('20/09/2009', 'DD/MM/YYYY') AND TO_DATE('26/10/2009', 'DD/MM/YYYY')
AND     a.agent_id = 104
AND     a.fiche_id = b.fiche_id
AND     b.activites_id = c.activites_id
AND     a.agent_id = d.agent_id
It gives me this less nice result:
Tonte 7,40740740740740740740740740740740740741
Tonte 33,33333333333333333333333333333333333333
Tonte 33,33333333333333333333333333333333333333
Tonte 25,92592592592592592592592592592592592593
I would like to get this result:
Tonte 100
I tried "grouping" values in the PARTITION BY clause, but without success.
Any help from the SQL masters is appreciated.
Regards,
Christian

Christian from France wrote:
I would like to get this result :
Tonte 100
Hi,
Why not this
select c.libelle "Activité", 100 "Durée"
from    fiche a, activite_faite b,
        activites c, agent d
where   a.date_activite
BETWEEN TO_DATE('20/09/2009', 'DD/MM/YYYY') AND TO_DATE('26/10/2009', 'DD/MM/YYYY')
AND     a.agent_id = 104
AND     a.fiche_id = b.fiche_id
AND     b.activites_id = c.activites_id
AND     a.agent_id = d.agent_id
group   by c.libelle
order   by sum(b.duree)
Because it would always be 100 (if you are taking it as a percentage), whatever the count of duree is.
Or did I miss something in understanding the requirement?
Regards
Anurag
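
If the goal is each activity's share of the total duration, one option is to aggregate per activity first and only then take the ratio, so that there is one row per libelle instead of one per fiche. In Oracle that would probably be RATIO_TO_REPORT(SUM(b.duree)) OVER () inside the GROUP BY query. The sketch below shows the same idea with a plain SUM ... OVER (), run through Python's sqlite3 (which has no RATIO_TO_REPORT). The four Tonte durations are back-calculated from the ratios in the question (2, 9, 9, 7, total 27). The "Taille" row is a hypothetical second activity, added only so that the percentages are not trivially 100.

```python
import sqlite3

# Simplified, hypothetical one-table version of the fiche / activite_faite / activites join.
# Window functions need SQLite 3.25 or later.
conn = sqlite3.connect(":memory:")
conn.execute("CREATE TABLE activite_faite (libelle TEXT, duree INTEGER)")
conn.executemany(
    "INSERT INTO activite_faite VALUES (?, ?)",
    [("Tonte", 2), ("Tonte", 9), ("Tonte", 9), ("Tonte", 7),
     ("Taille", 9)],  # hypothetical second activity
)

# Step 1 (inner query): collapse to one row per activity.
# Step 2 (outer query): divide each total by the grand total over the whole result set.
percentages = conn.execute("""
    SELECT libelle, 100.0 * total / SUM(total) OVER () AS pct
    FROM (SELECT libelle, SUM(duree) AS total
          FROM activite_faite
          GROUP BY libelle)
    ORDER BY libelle
""").fetchall()
```

With only the Tonte rows present the result is a single row with 100, which matches Anurag's remark that the figure is always 100 when a single activity is selected.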

Cross-listing Query (Partition By Clause? Self-Join?)

Hello,
I need a query that will cross-list courses a professor is teaching this semester. Essentially, two fields need to be the same (i.e.: Section & CourseTitle), while the third field is different (i.e.: Subject).
For example, Max Power is a professor teaching 3 courses, one is cross-listed (ENG 123 and JRL 123):
LastName     FirstName     Subject     Section     CourseTitle
Power          Max          ENG     123     English Composition
Power          Max          ENG     452     Robert Frost Poetry
Power          Max          JRL     123     English Composition
Power           Max          ENG      300     Faulkner & Twain
The desired query output is this:
LastName     FirstName     Subject     Section     CourseTitle
Power          Max          ENG     123     English Composition
Power          Max          JRL     123     English Composition
Basically, I need only the cross-listed courses in the output. Is this an instance where I use a PARTITION BY clause, or should I create a self-join?
Much thanks for any help and comments.

Unfortunately, I can't create new tables. I don't have permission. I can't alter, add or delete any of the data.
So I tried Frank's code with my data:
WITH got_cnt AS
(
SELECT sivasgn_term_code, spriden_id, spriden_last_name, spriden_first_name,
                ssbsect_ptrm_code, ssbsect_camp_code,
                sivasgn_crn, ssbsect_subj_code, ssbsect_crse_numb, scbcrse_title,
       count(*) over (partition by ssbsect_crse_numb, scbcrse_title) cnt
FROM spriden INNER JOIN sivasgn ON spriden_pidm = sivasgn_pidm JOIN
     ssvsect ON ssbsect_crn = sivasgn_crn JOIN
     sfrstcr ON sfrstcr_crn = sivasgn_crn
WHERE ssbsect_term_code= sivasgn_term_code
AND sfrstcr_term_code = sivasgn_term_code
AND ssbsect_enrl > '0' and sivasgn_credit_hr_sess > '0'
AND sivasgn_term_code IN ('200901', '200909')
AND spriden_change_ind IS NULL
AND ssbsect_camp_code IN ('1', '2', 'A', 'B')
)
SELECT DISTINCT sivasgn_term_code, spriden_id, spriden_last_name, spriden_first_name,
                substr(ssbsect_ptrm_code,1,1) as ptrm_code, ssbsect_camp_code,
                sivasgn_crn, ssbsect_subj_code, ssbsect_crse_numb, scbcrse_title
FROM got_cnt
WHERE cnt >1
ORDER BY spriden_last_name, sivasgn_term_code, ssbsect_crse_numb;
The output pretty much displays all courses with the same subject code, course number and course title.
Output:
LastName     FirstName     Subject     Section     CourseTitle
Power          Max          ENG     123     English Composition
Power          Max          ENG     123     English Composition
Power          Max          ENG     452     Robert Frost Poetry
Power          Max          ENG     452     Robert Frost Poetry
Power           Max          ENG      300     Faulkner & Twain
Power           Max          ENG      300     Faulkner & Twain
Power          Max          JRL     123     English Composition
Power          Max          JRL     123     English Composition
What I would like is the same course number and course title, BUT a different subject code, pretty much as in my first post of this thread.
Desired Output:
LastName     FirstName     Subject     Section     CourseTitle
Power          Max          ENG     123     English Composition
Power          Max          JRL     123     English Composition
Maybe I'm explaining this wrong. Any help would be greatly appreciated. Thanks.
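
A possible way to express "same section and title, but more than one subject" with analytic functions: COUNT(*) OVER (...) counts every row in the partition, including repeats of the same subject, whereas comparing MIN(subject) with MAX(subject) over the same partition is true only when at least two different subjects are present. The sketch below uses the four sample rows from the first post and runs through Python's sqlite3. The courses table and its column names are hypothetical simplifications of the real schema.

```python
import sqlite3

# Window functions need SQLite 3.25 or later.
conn = sqlite3.connect(":memory:")
conn.execute(
    "CREATE TABLE courses (last_name TEXT, first_name TEXT, subject TEXT, section TEXT, title TEXT)"
)
conn.executemany(
    "INSERT INTO courses VALUES (?, ?, ?, ?, ?)",
    [("Power", "Max", "ENG", "123", "English Composition"),
     ("Power", "Max", "ENG", "452", "Robert Frost Poetry"),
     ("Power", "Max", "JRL", "123", "English Composition"),
     ("Power", "Max", "ENG", "300", "Faulkner & Twain")],
)

# A course is cross-listed when its (professor, section, title) group holds two different subjects.
cross_listed = conn.execute("""
    SELECT last_name, first_name, subject, section, title
    FROM (
        SELECT c.*,
               MIN(subject) OVER (PARTITION BY last_name, first_name, section, title) AS min_subj,
               MAX(subject) OVER (PARTITION BY last_name, first_name, section, title) AS max_subj
        FROM courses c
    )
    WHERE min_subj <> max_subj
    ORDER BY subject
""").fetchall()
```

Because the filter compares subjects rather than counting rows, duplicate rows produced by a join (such as the doubled rows in the output above) would not make a course look cross-listed, although they would still need a DISTINCT to be removed from the output.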

Help !!! - partition by clause

Hi. I have the following table:
table : item_tracker
date | item_code | begin_price
At the end of each day, one record for each item is written to this table. After the records are written, I want to calculate the change in begin prices.
The formula is change_in_price = (todays_begin / yesterdays_begin) * 100
I wrote the following query, but it does not seem to work. Please advise me on what I am doing wrong. Is the approach itself wrong?
select date, item_code, begin_price, first_value(begin_price) over (partition by date, item_code order by trunc(date) desc rows between 2 preceding and 1 preceding) as yesterdays_begin
from item_tracker
Here, the column computed with PARTITION BY is not showing any value.
Is there any way I can include the date? (GROUP BY has a HAVING clause; what is the equivalent for PARTITION BY?)
I tried RANGE BETWEEN as well. It returned an error saying that I need to give a number value instead of a date to check a range.
Please help me :(
Thanks in advance.

Yes, joins are effective :
SQL> create table item_tracker(price_date date, item_id number, day_price number);
Table created.
SQL> insert into item_tracker values ('11-JAN-11',1,10);
1 row created.
SQL> insert into item_tracker values ('11-JAN-11',2,12);
1 row created.
SQL> insert into item_tracker values ('11-JAN-11',3,24);
1 row created.
SQL> insert into item_tracker values ('12-JAN-11',1,10.5);
1 row created.
SQL> insert into item_tracker values ('12-JAN-11',2,16);
1 row created.
SQL> insert into item_tracker values ('12-JAN-11',3,21);
1 row created.
SQL> commit;
Commit complete.
SQL>
SQL> l
1 select a.item_id, a.price_date, a.day_price, b.day_price PrevPrice, (a.day_price - b.day_price) PriceDiff
2 from item_tracker a, item_tracker b
3 where a.item_id=b.item_id
4 and a.price_date=to_date('12-JAN-11','DD-MON-RR')
5 and b.price_date=a.price_date-1
6* order by 1
SQL> /
   ITEM_ID PRICE_DAT DAY_PRICE PREVPRICE PRICEDIFF
         1 12-JAN-11       10.5         10         .5
         2 12-JAN-11         16         12          4
         3 12-JAN-11         21         24         -3
SQL>
Hemant K Chitale
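
For comparison, the analytic route would likely use LAG rather than FIRST_VALUE. Partitioning by date, as in the question, leaves each (date, item) pair alone in its own partition, so there is no earlier row to look at; partitioning by item only and ordering by date does give each row a predecessor. Below is a sketch on the same six rows as the join example, run through Python's sqlite3 as a stand-in for Oracle (dates are stored as ISO text here, which is an assumption of the sketch, not of the original table).

```python
import sqlite3

# Window functions need SQLite 3.25 or later.
conn = sqlite3.connect(":memory:")
conn.execute("CREATE TABLE item_tracker (price_date TEXT, item_id INTEGER, day_price REAL)")
conn.executemany(
    "INSERT INTO item_tracker VALUES (?, ?, ?)",
    [("2011-01-11", 1, 10.0), ("2011-01-11", 2, 12.0), ("2011-01-11", 3, 24.0),
     ("2011-01-12", 1, 10.5), ("2011-01-12", 2, 16.0), ("2011-01-12", 3, 21.0)],
)

# LAG looks one row back within the same item, in date order.
# The date filter goes in the outer query so the previous day's rows are still visible to LAG.
price_changes = conn.execute("""
    SELECT item_id, day_price, prev_price, day_price - prev_price AS price_diff
    FROM (
        SELECT item_id, price_date, day_price,
               LAG(day_price) OVER (PARTITION BY item_id ORDER BY price_date) AS prev_price
        FROM item_tracker
    )
    WHERE price_date = '2011-01-12'
    ORDER BY item_id
""").fetchall()
```

Note that LAG returns the previous row for the item, which is only "yesterday" if no day is missing; the self-join with price_date - 1 is stricter about that.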

Group by Vs Partition By Clause

Hello,
Can you please help me resolve the issue below?
I have explained the scenario with a dummy table:
CREATE TABLE emp (empno NUMBER(12), ename VARCHAR2(10), deptno NUMBER(12));
INSERT INTO emp
(empno, ename, deptno)
VALUES (1, 'A', 10);
INSERT INTO emp
(empno, ename, deptno)
VALUES (2, 'B', 10);
INSERT INTO emp
(empno, ename, deptno)
VALUES (3, 'C', 20);
INSERT INTO emp
(empno, ename, deptno)
VALUES (4, 'D', 20);
INSERT INTO emp
(empno, ename, deptno)
VALUES (5, 'E', 30);
COMMIT ;
SELECT DISTINCT deptno, SUM (empno) / SUM (empno) OVER (PARTITION BY deptno)
FROM emp
GROUP BY deptno;
ORA-00979: not a GROUP BY expression
Earlier I had a query like this:
SELECT DISTINCT deptno, SUM (empno) OVER (PARTITION BY deptno,empno) / SUM (empno) OVER (PARTITION BY deptno)
FROM emp;
which executed successfully but gave the wrong result.
Please guide me on how to resolve this issue.
Thanks,
Santhosh

Hi,
santhosh.shivaram wrote:
Hello all, sorry for providing limited data. I am now showing the actual data set, the current select query which is giving the error, and the desired output. Please let me know if you need further information on this.
/* Formatted on 2012/09/14 08:00 (Formatter Plus v4.8.8) */ ...
If you're going to the trouble of formatting the data, post it inside {code} tags, so that this site won't remove the formatting. See the forum FAQ {message:id=9360002}
**Current query:**
SELECT rep_date, cnty, loc, component_code,
SUM (volume) / SUM (volume) OVER (PARTITION BY rep_date, cnty, loc)
This is the same problem you had before, and was explained in the first answer {message:id=10573091} Don't you read the replies you get?
SUM (volume) OVER (PARTITION BY rep_date, cnty, loc)
can't be used in this GROUP BY query, because it depends on volume, and volume isn't one of the GROUP BY expressions.
FROM table1
GROUP BY rep_date, cnty, loc, component_code;
when execute this query i am getting "ORA-00979: not a GROUP BY expression" error
My desired output
Formatting is especially important for the output. Which do you think is easier to read and understand: what you posted:
Rep_Date     Cnty     Loc     Component_Code     QTY_VOL
9/12/2012     2     1     CONTRACT      -0.019000516
9/12/2012     2     1     CONTRACT      -0.019000516
9/12/2012     2     1     NON-CONTRACT      -0.893525112
9/12/2012     2     1     NON-CONTRACT      -0.89322
9/12/2012     2     1     CONTRACT-INDEX     1.912525629
9/12/2012     2     1     CONTRACT-INDEX     1.912526
9/12/2012     2     1     CONTRACT-INDEX     1.912526
9/12/2012     2     4     CONTRACT     0.015197825
9/12/2012     2     4     CONTRACT     0.015198
9/12/2012     2     4     NON-CONTRACT     0.984802175
9/12/2012     2     4     NON-CONTRACT     0.984802
or this?
Rep_Date     Cnty     Loc     Component_Code     QTY_VOL
9/12/2012     2     1     CONTRACT -0.019000516
9/12/2012     2     1     CONTRACT -0.019000516
9/12/2012     2     1     NON-CONTRACT      -0.893525112
9/12/2012     2     1     NON-CONTRACT      -0.89322
9/12/2012     2     1     CONTRACT-INDEX     1.912525629
9/12/2012     2     1     CONTRACT-INDEX     1.912526
9/12/2012     2     1     CONTRACT-INDEX     1.912526
9/12/2012     2     4     CONTRACT     0.015197825
9/12/2012     2     4     CONTRACT     0.015198
9/12/2012     2     4     NON-CONTRACT     0.984802175
9/12/2012     2     4     NON-CONTRACT     0.984802
Which do you think will lead to more answers? Quicker answers? Better answers?
Please let me know if you need any more information.
Explain the results.
How do you compute the qty_vol column? Give a couple of very specific examples, showing step by step how you calculate the values given from the sample data.
What does each row of the output represent? Your query says
GROUP BY rep_date, cnty, loc, component_code;
which means the result set will have 1 row for each distinct combination of rep_date, cnty, loc and component_code, but your desired output has at least 2 rows for every distinct combination of them, and in one case you want 3 rows with the same rep_date, cnty, loc and component_code. How do you decide when you want 2 rows, and when you need 3? Will there be occasions when you need 4 rows, or 5, or 1?
All the rows with the same rep_date, cnty, loc and component_code have *nearly* the same qty_vol, but usually not quite the same. Sometimes qty_vol is rounded; sometimes it's changed slightly, but not just rounded (-0.893525112 gets converted to -0.89322). How do you decide when it's rounded, when it remains the same, and when it's changed to a completely different number? When it's rounded, how do you decide how many digits to round it to?
Edited by: Frank Kulash on Sep 14, 2012 12:44 AM
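
As an illustration of the ORA-00979 point only: if the intent were each component's share of the volume within its (rep_date, cnty, loc) group (an assumption, since the thread never settles what qty_vol means), the analytic function has to work on the already aggregated value. In Oracle that can be written as SUM (volume) / SUM (SUM (volume)) OVER (PARTITION BY rep_date, cnty, loc) in the GROUP BY query. The sketch below spells out the same two steps with an inline view, run through Python's sqlite3, on made-up volumes.

```python
import sqlite3

# Hypothetical sample rows; only the column names come from the thread.
# Window functions need SQLite 3.25 or later.
conn = sqlite3.connect(":memory:")
conn.execute(
    "CREATE TABLE table1 (rep_date TEXT, cnty INTEGER, loc INTEGER, component_code TEXT, volume REAL)"
)
conn.executemany(
    "INSERT INTO table1 VALUES (?, ?, ?, ?, ?)",
    [("2012-09-12", 2, 1, "CONTRACT", 10.0),
     ("2012-09-12", 2, 1, "CONTRACT", 10.0),
     ("2012-09-12", 2, 1, "NON-CONTRACT", 80.0),
     ("2012-09-12", 2, 4, "CONTRACT", 25.0),
     ("2012-09-12", 2, 4, "NON-CONTRACT", 75.0)],
)

# Inner query: aggregate to one row per component (the GROUP BY step).
# Outer query: the analytic SUM sees only aggregated rows, so no raw "volume" is referenced.
shares = conn.execute("""
    SELECT rep_date, cnty, loc, component_code,
           comp_volume / SUM(comp_volume) OVER (PARTITION BY rep_date, cnty, loc) AS qty_vol
    FROM (
        SELECT rep_date, cnty, loc, component_code, SUM(volume) AS comp_volume
        FROM table1
        GROUP BY rep_date, cnty, loc, component_code
    )
    ORDER BY loc, component_code
""").fetchall()
```

This yields one row per distinct combination of the four GROUP BY columns, which is exactly why it cannot reproduce a desired output with 2 or 3 rows per combination.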

Query performance with "partition by" clause.

Below is my query
select event_type, time, count(event_id) as no_of_events from (
select e.event_type, t.time , e.id as event_id from time t
left outer join events e partition by (event_type)
on t.time < e.end_time and (t.time + 1) > e.start_time
where t.time >= '2008-01-01' and t.time < '2008-02-01'
) events_by_event_type
group by event_type, time
order by event_type, time
The idea is to get a count of active "events" of each "event_type" for each day between 2 dates. The "time" table has one row for each day. An event is said to be active for a day when the interval from its start_time to its end_time overlaps that day.
The query works but always does a full table scan of the events table.
I tried creating following indexes on the events table , but none of them is ever used.
(event_type,start_time)
(event_type,end_time)
(event_type,start_time,end_time)
(start_time)
(end_time)
(start_time,end_time)
How can I avoid the full table scan of the "events" table in the above query ?
fyi the events table looks like
id number not null primary key,
event_type number not null,
start_date date not null,
end_date date not null

What I want is to avoid the full table scan on the
"events" table. I don't think adding an index on the
'time' table will help there.
The conditions you have on events are
t.time < e.end_time and (t.time + 1) > e.start_time
So you should have an index on the columns end_time and start_time to avoid a full table scan.
But anyway, is that query slow?
Bye Alessandro
Message was edited by:
Alessandro Rossi
Plus, I would add two more predicates to the query to enforce a range scan. If I got them right, they should always be true for the rows you want. They are there just to tell the CBO that the scan on end_time has to begin from '2008-02-01' and the scan on start_time has to finish on ('2008-01-01' - 1). Sometimes this kind of additional condition has helped me.
select event_type, time, count(event_id) as no_of_events
from (
          select e.event_type, t.time , e.id as event_id from time t
          left outer join events e partition by (event_type) on (
               t.time between e.start_time - 1 and e.end_time)
          where t.time >= '2008-01-01' and t.time < '2008-02-01'
               and e.end_time >= '2008-02-01' and e.start_time <= '2008-01-01' - 1
) events_by_event_type
group by event_type, time
order by event_type, time
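
For readers unfamiliar with the PARTITION BY (event_type) in the joins above: a partitioned outer join repeats the outer join once per event_type, so every (event_type, day) pair appears in the result even when no event is active on that day. The sketch below is a rough emulation of that behaviour with a cross join, run through Python's sqlite3 (SQLite has no partitioned outer join). The time_dim table name, the ISO text dates and the sample rows are all hypothetical, and the sketch says nothing about which access path Oracle would choose.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.execute("CREATE TABLE time_dim (day TEXT)")
conn.execute(
    "CREATE TABLE events (id INTEGER PRIMARY KEY, event_type INTEGER, start_time TEXT, end_time TEXT)"
)
conn.executemany("INSERT INTO time_dim VALUES (?)",
                 [("2008-01-01",), ("2008-01-02",), ("2008-01-03",)])
conn.executemany("INSERT INTO events VALUES (?, ?, ?, ?)",
                 [(1, 1, "2008-01-01", "2008-01-02"),
                  (2, 2, "2008-01-02", "2008-01-04")])

# Cross join days with the distinct event types to get every (type, day) pair,
# then outer join the events that overlap each day; COUNT(e.id) is 0 where nothing matches.
active_counts = conn.execute("""
    SELECT et.event_type, t.day, COUNT(e.id) AS no_of_events
    FROM time_dim t
    CROSS JOIN (SELECT DISTINCT event_type FROM events) et
    LEFT JOIN events e
           ON e.event_type = et.event_type
          AND t.day < e.end_time
          AND date(t.day, '+1 day') > e.start_time
    GROUP BY et.event_type, t.day
    ORDER BY et.event_type, t.day
""").fetchall()
```

Rows with a count of 0 are the ones a plain inner join would drop.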

Storage clause of create partition based on last storage value

Hello,
Could you give me some advice on how to resolve this problem?
I have a procedure that a job executes once a day, and it automatically adds new partitions to one table.
I am now struggling with how to construct a statement that will place every new partition in a new tablespace.
I created 8 tablespaces named data1..data8 for this purpose, so now I am trying to work out how to check which tablespace the last partition was placed in and, based on that, add the new partition in the next tablespace. For example, if the last one was created in the data8 tablespace, the next must be created in data1.
How can I check that? Does anyone have suggestions?
I am currently going with this logic:
select tablespace_name from
(select * from user_tab_partitions where table_name='partitioned_table' order by partition_name desc)
where rownum < 2
With this I get the name of the tablespace used for the last created partition, but I currently don't know how to use this information further.
Does someone have another idea about this?
Edited by: user11141123 on May 7, 2009 4:01 AM
Edited by: user11141123 on May 7, 2009 4:16 AM

Yes Sean,
I would like to do this because of the tablespaces that were created for that purpose, for that partitioned table.
For some reason they must be used, and with the current logic, where the procedure adds new partitions with the default storage option, they can't be used
and all partitions are created in one other tablespace named data.
This is the situation in production, and I created a similar test environment.
The data tablespace in production will probably be full in a few days, and the intention is to start creating new partitions this way, in those 8 new tablespaces, and in time to completely leave the data tablespace, which is currently the only one used.
So I would like to somehow add a storage clause to the already fully functional
alter table add partition
statement inside that procedure.
Is there some way to do that? Can I use the query I wrote above and, using that value, add a storage clause with the next tablespace in line to the existing add partition statement?
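
The "next in line" step is plain modular arithmetic on the number at the end of the tablespace name. Below is a sketch of that logic in Python, purely as an illustration; inside the procedure the same thing could be done in PL/SQL with MOD. The helper names, the default of 8 tablespaces and the fallback to DATA1 are assumptions made for this sketch, not something from the thread.

```python
import re


def next_tablespace(last_used: str, count: int = 8, prefix: str = "DATA") -> str:
    """Return the tablespace that follows last_used in the DATA1..DATA<count> cycle."""
    match = re.fullmatch(re.escape(prefix) + r"(\d+)", last_used.upper())
    if match is None:
        # The last partition lives outside the cycle (e.g. the old DATA tablespace): start at 1.
        return f"{prefix}1"
    # DATA8 wraps around to DATA1 because 8 % 8 + 1 == 1.
    return f"{prefix}{int(match.group(1)) % count + 1}"


def with_tablespace(add_partition_sql: str, last_used: str) -> str:
    """Append a TABLESPACE clause to an existing ALTER TABLE ... ADD PARTITION statement."""
    statement = add_partition_sql.rstrip().rstrip(";")
    return f"{statement} TABLESPACE {next_tablespace(last_used)}"
```

For example, with_tablespace(existing_statement, "DATA8") would append TABLESPACE DATA1 to whatever add partition statement the procedure already builds. One caveat on the lookup query: ordering by partition_name only finds the newest partition if the names happen to sort in creation order; USER_TAB_PARTITIONS also has a PARTITION_POSITION column that may be a safer sort key for range-partitioned tables.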

How to use partition by instead of group by?

Hi,
I am having trouble using the PARTITION BY clause in the following case.
'some_table' contains 10 records where column other_number is null:
5 records with date 11-01-2009, item_code = 1
5 records with date 10-01-2009, item_code = 2
This query returns all 10 records (it is supposed to return 2):
SELECT count (a.anumber) over (partition by TO_char(a.some_date,'MM'), a.item_code) AS i_count, a.item_code,
TO_char(a.some_date,'MM')
     FROM some_table a
     WHERE to_char(a.some_date,'yyyy') = 2009
     AND a.other_number IS NULL
It works fine if I write it like this:
SELECT count (a.anumber) AS i_count, a.item_code,
TO_char(a.some_date,'MM')
     FROM some_table a
     WHERE to_char(a.some_date,'yyyy') = 2009
     AND a.other_number IS NULL
group by TO_char(a.some_date,'MM'), a.item_code
How to use partition by in this case?

Hi,
Almost all of the aggregate functions (the ones you use in a GROUP BY query) have analytic counterparts.
You seem to have already discovered that whatever values are returned by
an aggregate function using "GROUP BY x, y, z" can also be found with
an analytic function using "PARTITION BY x, y, z".
Aggregate queries collapse the result set.
The aggregate COUNT function:
SELECT    deptno
,         COUNT (*)   AS cnt
FROM       scott.emp
GROUP BY deptno
;
tells how many of the 14 employees are in each of the 3 departments.
So does the analytic COUNT function:
SELECT    deptno
,         COUNT (*) OVER (PARTITION BY deptno)   AS cnt
FROM       scott.emp
;
but the first query produces 3 rows of output, while the second query produces 14.
You could get 3 rows of output using the analytic function and SELECT DISTINCT , but it's inefficient.
Which should you use? Like so many other things, the answer depends on what data you have, and what results you want from that data.
If you want collapsed results (one row per group), that's a strong indication that you'll want aggregate, not analytic functions.
If you want one row of output for every row in the table, that's a strong indication that you'll want analytic functions.
If you have a particular question, ask it. Post some sample data and the results you want from that data, as Rob said.
There is another important difference between aggregate and analytic functions: analytic functions can easily be restricted to a window, or subset, of the data set. This is something like a WHERE clause, but a WHERE clause applies to the whole query, while a windowing condition applies only to an individual row.
If you need to compute a SUM of rows with an earlier order_date than this row, or an average of the last 5 rows, then you probably want to use an analytic function.
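
The row-count difference and the windowing point can both be seen on a tiny table. The sketch below runs through Python's sqlite3 rather than Oracle, and uses a hypothetical five-row emp table instead of the 14-row scott.emp.

```python
import sqlite3

# Window functions need SQLite 3.25 or later.
conn = sqlite3.connect(":memory:")
conn.execute("CREATE TABLE emp (empno INTEGER, deptno INTEGER, sal INTEGER)")
conn.executemany(
    "INSERT INTO emp VALUES (?, ?, ?)",
    [(1, 10, 100), (2, 10, 200), (3, 20, 300), (4, 20, 400), (5, 30, 500)],
)

# Aggregate: one output row per department.
aggregate = conn.execute(
    "SELECT deptno, COUNT(*) FROM emp GROUP BY deptno ORDER BY deptno"
).fetchall()

# Analytic: the same counts, but one output row per employee.
analytic = conn.execute(
    "SELECT deptno, COUNT(*) OVER (PARTITION BY deptno) FROM emp ORDER BY empno"
).fetchall()

# Windowing: each row sums only itself and the rows before it (a running total).
running_total = conn.execute("""
    SELECT empno,
           SUM(sal) OVER (ORDER BY empno
                          ROWS BETWEEN UNBOUNDED PRECEDING AND CURRENT ROW)
    FROM emp
    ORDER BY empno
""").fetchall()
```

Here aggregate has 3 rows and analytic has 5. The running total has no direct GROUP BY equivalent, because every row has a different window.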

Problem in using PARTITION BY with SCORE

Hi
I have a table in which I have list of products along with the name of the company that offers that product. Table structure is as shown below:
PRD_ID NUMBER
PRD_NAME VARCHAR2(100)
COMPANY_NAME VARCHAR2(100)
PRD_DESC VARCHAR2(500)
DUMMY CHAR(1)
I have created an Intermedia Index on PRD_NAME and PRD_DESC with
PRD_NAME and PRD_DESC as two separate field sections.
Now I want to retrieve up-to 3 products per company that match the searched keywords. Now if a user searches for Candle Holders then I write following query
SELECT A.*, ROWNUM FROM
(SELECT SCORE(1) AS SC,
PRD_ID,
     PRD_NAME,
     COMPANY_NAME,
     ROW_NUMBER() OVER (PARTITION BY COMPANY_NAME ORDER BY SCORE(1) DESC) AS RK
FROM PRODUCTS
WHERE CONTAINS(DUMMY,'(((Candle Holders)*7, ($Candle $Holders)*5, (Candle% Holders%)*3) within PRD_NAME)'
,1) > 0 ) A
WHERE A.RK <= 3
ORDER BY A.COMPANY_NAME, SC DESC;
I have many records in my database that should get a score of 100 in the above query - e.g.
Glass Candle Holder Comp1
Iron Candle Holder Comp1
Metal Candle Holder Comp2
Votive Candle Holder Comp3
Silver Plated Candle Holder Comp4
Gold Plated Candle Holder Comp4
Copper Plated Candle Holder Comp4
Platinum Coated Candle Holder Comp4
and so on.
My query is returning up to 3 records per company, but it is not giving 100 as the score.
If I remove the row_number() partition by clause, then my query returns 100 score.
I want to restrict the query from returning product at a certain cut-off score.
Please advise what is wrong in the above query and what can I do to fix the problem
Regards
Madhup

I am unable to reproduce the problem given only what you have provided. I get the same score no matter what. What version of Oracle are you using? The only thing I can think of is to put the query without the row_number in an inline view and add a condition where rownum > 0 to try to materialize the view, then apply your row_number in an outer query, as in the last query in the example below.
SCOTT@10gXE> CREATE TABLE products
2    (PRD_ID           NUMBER,
3      PRD_NAME      VARCHAR2(100),
4      COMPANY_NAME VARCHAR2(100),
5      PRD_DESC      VARCHAR2(500),
6      DUMMY           CHAR(1)
7 )
8 /
Table created.
SCOTT@10gXE> INSERT ALL
2 INTO PRODUCTS VALUES (1, 'Glass Candle Holder', 'Comp1', NULL, NULL)
3 INTO PRODUCTS VALUES (2, 'Iron Candle Holder', 'Comp1', NULL, NULL)
4 INTO PRODUCTS VALUES (3, 'Metal Candle Holder', 'Comp2', NULL, NULL)
5 INTO PRODUCTS VALUES (4, 'Votive Candle Holder', 'Comp3', NULL, NULL)
6 INTO PRODUCTS VALUES (5, 'Silver Plated Candle Holder', 'Comp4', NULL, NULL)
7 INTO PRODUCTS VALUES (6, 'Gold Plated Candle Holder', 'Comp4', NULL, NULL)
8 INTO PRODUCTS VALUES (7, 'Copper Plated Candle Holder', 'Comp4', NULL, NULL)
9 INTO PRODUCTS VALUES (8, 'Platinum Coated Candle Holder', 'Comp4', NULL, NULL)
10 SELECT * FROM DUAL
11 /
8 rows created.
SCOTT@10gXE> BEGIN
2    FOR i IN 1 .. 10 LOOP
3       INSERT INTO products (prd_id, prd_name, company_name)
4       SELECT object_id, object_name, 'Comp1' FROM all_objects;
5    END LOOP;
6 END;
7 /
PL/SQL procedure successfully completed.
SCOTT@10gXE> COMMIT
2 /
Commit complete.
SCOTT@10gXE> EXEC CTX_DDL.CREATE_PREFERENCE ('your_multi', 'MULTI_COLUMN_DATASTORE')
PL/SQL procedure successfully completed.
SCOTT@10gXE> BEGIN
2    CTX_DDL.SET_ATTRIBUTE ('your_multi', 'COLUMNS', 'PRD_DESC, PRD_NAME');
3 END;
4 /
PL/SQL procedure successfully completed.
SCOTT@10gXE> EXEC CTX_DDL.CREATE_SECTION_GROUP ('your_sec_group', 'BASIC_SECTION_GROUP')
PL/SQL procedure successfully completed.
SCOTT@10gXE> BEGIN
2    CTX_DDL.ADD_FIELD_SECTION ('your_sec_group', 'PRD_DESC', 'PRD_DESC', TRUE);
3    CTX_DDL.ADD_FIELD_SECTION ('your_sec_group', 'PRD_NAME', 'PRD_NAME', TRUE);
4 END;
5 /
PL/SQL procedure successfully completed.
SCOTT@10gXE> CREATE INDEX your_index ON products (dummy)
2 INDEXTYPE IS CTXSYS.CONTEXT
3 PARAMETERS
4    ('DATASTORE     your_multi
5       SECTION GROUP your_sec_group')
6 /
Index created.
SCOTT@10gXE> EXEC DBMS_STATS.GATHER_TABLE_STATS ('SCOTT', 'PRODUCTS')
PL/SQL procedure successfully completed.
SCOTT@10gXE>
SCOTT@10gXE> COLUMN prd_name      FORMAT A30
SCOTT@10gXE> COLUMN company_name FORMAT A10
SCOTT@10gXE> COLUMN prd_desc      FORMAT A8
SCOTT@10gXE> SELECT A.*, ROWNUM FROM
2 (SELECT SCORE(1) AS SC,
3 PRD_ID,
4 PRD_NAME,
5 COMPANY_NAME,
6 ROW_NUMBER() OVER (PARTITION BY COMPANY_NAME ORDER BY SCORE(1) DESC) AS RK
7 FROM PRODUCTS
8 WHERE CONTAINS(DUMMY,'(((Candle Holders)*7, ($Candle $Holders)*5, (Candle% Holders%)*3) within PRD_NAME)'
9 ,1) > 0 ) A
10 WHERE A.RK <= 3
11 ORDER BY A.COMPANY_NAME, SC DESC
12 /
        SC     PRD_ID PRD_NAME                       COMPANY_NA         RK     ROWNUM
        28          1 Glass Candle Holder            Comp1               1          1
        28          2 Iron Candle Holder             Comp1               2          2
        28          3 Metal Candle Holder            Comp2               1          3
        28          4 Votive Candle Holder           Comp3               1          4
        28          8 Platinum Coated Candle Holder Comp4               1          5
        28          7 Copper Plated Candle Holder    Comp4               2          6
        28          6 Gold Plated Candle Holder      Comp4               3          7
7 rows selected.
SCOTT@10gXE> SELECT A.*, ROWNUM FROM
2 (SELECT SCORE(1) AS SC,
3 PRD_ID,
4 PRD_NAME,
5 COMPANY_NAME
6 FROM PRODUCTS
7 WHERE CONTAINS(DUMMY,'(((Candle Holders)*7, ($Candle $Holders)*5, (Candle% Holders%)*3) within PRD_NAME)'
8 ,1) > 0 ) A
9 ORDER BY A.COMPANY_NAME, SC DESC
10 /
        SC     PRD_ID PRD_NAME                       COMPANY_NA     ROWNUM
        28          1 Glass Candle Holder            Comp1               1
        28          2 Iron Candle Holder             Comp1               2
        28          3 Metal Candle Holder            Comp2               3
        28          4 Votive Candle Holder           Comp3               4
        28          5 Silver Plated Candle Holder    Comp4               5
        28          6 Gold Plated Candle Holder      Comp4               6
        28          7 Copper Plated Candle Holder    Comp4               7
        28          8 Platinum Coated Candle Holder Comp4               8
8 rows selected.
SCOTT@10gXE> SELECT SCORE(1) AS SC,
         PRD_ID,
         PRD_NAME,
         COMPANY_NAME
FROM   PRODUCTS
WHERE  CONTAINS
         (DUMMY,
          '(((Candle Holders)*7,
             ($Candle $Holders)*5,
             (Candle% Holders%)*3) within PRD_NAME)',
          1) > 0
/
        SC     PRD_ID PRD_NAME                       COMPANY_NA
        28          1 Glass Candle Holder            Comp1
        28          2 Iron Candle Holder             Comp1
        28          3 Metal Candle Holder            Comp2
        28          4 Votive Candle Holder           Comp3
        28          5 Silver Plated Candle Holder    Comp4
        28          6 Gold Plated Candle Holder      Comp4
        28          7 Copper Plated Candle Holder    Comp4
        28          8 Platinum Coated Candle Holder Comp4
8 rows selected.
SCOTT@10gXE> SELECT *
FROM   (SELECT sc, prd_id, prd_name, company_name,
               ROW_NUMBER () OVER
                 (PARTITION BY company_name ORDER BY sc DESC) AS rk
        FROM   (SELECT SCORE(1) AS SC,
                       PRD_ID,
                       PRD_NAME,
                       COMPANY_NAME
                FROM   PRODUCTS
                WHERE  CONTAINS
                         (DUMMY,
                          '(((Candle Holders)*7,
                             ($Candle $Holders)*5,
                             (Candle% Holders%)*3) within PRD_NAME)',
                          1) > 0
                AND    ROWNUM > 0))
WHERE  rk <= 3
/
        SC     PRD_ID PRD_NAME                       COMPANY_NA         RK
        28          1 Glass Candle Holder            Comp1               1
        28          2 Iron Candle Holder             Comp1               2
        28          3 Metal Candle Holder            Comp2               1
        28          4 Votive Candle Holder           Comp3               1
        28          5 Silver Plated Candle Holder    Comp4               1
        28          6 Gold Plated Candle Holder      Comp4               2
        28          7 Copper Plated Candle Holder    Comp4               3
7 rows selected.
SCOTT@10gXE>

Using - Partition By in SQL report

Our users are having a problem with the formula below in Reports. I am not sure what it means. Our users told us that they need
(Tons Used/spread miles) * 2000 lb/ton = lbs/spread mile for their report.
I see the formula below being used by our developer, but I am not sure how to read it. Can anyone tell me whether the formula in the code below actually represents the formula above? Users are complaining as follows:
From user -> the column doubles or triples the amount if there are multiple records for the same truck/driver/beat with the same amount for Dry Mat used and Spread Miles. If spread miles or the Dry Mat used is different, it calculates correctly.
Edited by: Lucy Discover on Mar 28, 2011 1:50 PM

Hi,
Lucy Discover wrote:
Our users are having a problem with the formula below in Reports. I am not sure what it means. Our users told us that they need
(Tons Used/spread miles) * 2000 lb/ton = lbs/spread mile for their report.
I see the formula below being used by our developer, but I am not sure how to read it. Can anyone tell me whether the formula in the code below actually represents the formula above? Users are complaining as follows:
From user -> the column doubles or triples the amount if there are multiple records for the same truck/driver/beat with the same amount for Dry Mat used and Spread Miles. If spread miles or the Dry Mat used is different, it calculates correctly.
SUM (ROUND ((NVL (sim_trip.dry_mat_used, 0)) /
            NULLIF (sim_trip.spread_miles_total, 0) * 2000, 2))
OVER (PARTITION BY sim_trip.trip_no, sim_trip.dry_mat_code, sim_trip.liquid_mat_code,
                   sim_trip.dry_mat_used, sim_trip.liq_mat_used, sim_trip.spread_miles_total)
Yes, the code is basically implementing that formula. There's some code that handles NULLs and rounds the amounts.
This is the analytic SUM function; the keyword OVER right after the argument indicates that. The difference (here) between the two forms is that the analytic SUM will work without a GROUP BY clause; you can show details about individual rows and the group total on the same output row, without doing a sub-query and a join. The analytic PARTITION BY clause corresponds to the aggregate GROUP BY clause. Since you are saying:
...   OVER ( PARTITION BY  sim_trip.trip_no
           ,               sim_trip.dry_mat_code
           ,               sim_trip.liquid_mat_code
           ,               sim_trip.dry_mat_used
           ,               sim_trip.liq_mat_used
           ,               sim_trip.spread_miles_total
           )
the result on any given row will only include rows with the same values of all 6 variables. It's very suspicious that two of those variables (dry_mat_used and spread_miles_total) are also in the argument. That's legal, but it's very unusual.
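The doubling the users describe fits this partitioning: rows that agree on every PARTITION BY column land in the same partition, so each of them shows the sum over all of them. Here is a minimal sketch of that effect, run through Python's sqlite3 module rather than Oracle. It assumes SQLite 3.25 or later for window functions, uses COALESCE as a stand-in for NVL, keeps only three of the six partition columns, and the three sample rows are made up for illustration.

```python
import sqlite3

# Made-up sample data: two identical records for trip 1, one record for trip 2.
# Columns: trip_no, dry_mat_used (tons), spread_miles_total
rows = [
    (1, 10.0, 4.0),
    (1, 10.0, 4.0),   # same values as the row above
    (2, 10.0, 5.0),
]

conn = sqlite3.connect(":memory:")
conn.execute(
    "CREATE TABLE sim_trip (trip_no INTEGER, dry_mat_used REAL, spread_miles_total REAL)"
)
conn.executemany("INSERT INTO sim_trip VALUES (?, ?, ?)", rows)

# per_row is the formula for a single row; analytic_sum adds per_row over
# every row that shares the same values of the partition columns.
query = """
SELECT trip_no,
       ROUND(COALESCE(dry_mat_used, 0) / NULLIF(spread_miles_total, 0) * 2000, 2) AS per_row,
       SUM(ROUND(COALESCE(dry_mat_used, 0) / NULLIF(spread_miles_total, 0) * 2000, 2))
           OVER (PARTITION BY trip_no, dry_mat_used, spread_miles_total) AS analytic_sum
FROM sim_trip
ORDER BY trip_no
"""
result = conn.execute(query).fetchall()
for row in result:
    print(row)
```

In this sketch the two identical trip 1 rows each report twice their own per-row value, while the trip 2 row, alone in its partition, reports its own value unchanged.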
Also, this is a sum of ratios, not a ratio of sums. That is, suppose, in some group, there are 2 rows. One row has 100 tons and 4 miles, while the other row has 100 tons and 1 mile. The formula you posted would report that as
(100 / 4) + ( 100 / 1) =
(25) + (100) =
125
and not
(100 + 100) / (4 + 1) =
(200) / (5) =
40
As you can see, there's a big difference. I don't know which one you want.
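The same arithmetic as a short Python sketch, using only the two rows from the example above (the variable names are just for illustration):

```python
# Two rows in one group, taken from the example above: (tons, miles).
rows = [(100, 4), (100, 1)]

# Sum of ratios: divide on each row first, then add the results.
sum_of_ratios = sum(tons / miles for tons, miles in rows)

# Ratio of sums: add tons and miles separately, then divide once.
ratio_of_sums = sum(t for t, _ in rows) / sum(m for _, m in rows)

print(sum_of_ratios)   # 125.0
print(ratio_of_sums)   # 40.0
```

In SQL terms the first corresponds to SUM (tons / miles) and the second to SUM (tons) / SUM (miles).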
Whenever you have a problem, post a little sample data (CREATE TABLE and INSERT statements, relevant columns only) from all tables.
Also post the results you want from that data, and an explanation of how you get those results from that data, with specific examples.
Always say which version of Oracle you're using.

Query based on date partition

Hi,
I am trying to output only a successful job during the past 24 hrs of each day. If there is a job that has an outcome of both a success and a failure within the last
24 hrs for each day, I want to output only the successful one. If there is no success for the same job, I will output the last attempted failed job.
Here are my columns:
current output:
JOB_ID   JOBDATE                GROUP  PATH     OUTCOME  FAILED  LEVEL  ASSET
3400908  7/27/2012 10:01:18 AM  polA   target1  Success  0       incr   clone1
3400907  7/27/2012 10:01:09 AM  polA   target1  Failed   0       incr   clone1
3389180  7/23/2012 10:01:14 AM  polA   target1  Failed   1       incr   clone1
3374713  7/23/2012 10:01:03 AM  polA   target1  Success  0       incr   clone1
3374712  7/22/2012 11:24:32 AM  polA   target1  Success  0       Full   clone1
3367074  7/22/2012 11:24:00 AM  polA   target1  Failed   1       Full   clone1
3167074  7/21/2012 10:01:13 AM  polA   target1  Success  0       incr   clone1
336074   7/21/2012 10:01:08 AM  polA   target1  Success  0       incr   clone1
desired output:
JOB_ID   JOBDATE                GROUP  PATH     OUTCOME  FAILED  LEVEL  ASSET
3400908  7/27/2012 10:01:18 AM  polA   target1  Success  0       incr   clone1
3374713  7/23/2012 10:01:03 AM  polA   target1  Success  0       incr   clone1
3374712  7/22/2012 11:24:32 AM  polA   target1  Success  0       Full   clone1
3167074  7/21/2012 10:01:13 AM  polA   target1  Success  0       incr   clone1
Here is a code I am trying to use without success:
select *
from
   (select job_id, jobdate, group, path, outcome, Failed, level, asset,
          ROW_NUMBER() OVER(PARTITION BY group, path, asset ORDER BY jobdate desc) as rn
               from job_table where jobdate between trunc(jobdate) and trunc(jobdate) -1 )
   where rn = 1
   order by jobdate desc;
Thanks,
-Abe

Hi, Abe,
You're on the right track, using ROW_NUMBER to assign numbers, and picking only #1 in the main query. The main thing you're missing is the calendar day in the PARTITION BY clause.
You want to assign a #1 for each distinct combination of group_id, path, asset and calendar day, right?
Then you need to PARTITION BY group_id, path, asset and calendar day. I think you realized that when you named this thread "Query based on date partition".
The next thing is the analytic ORDER BY clause. To see which row in each partition gets assigned #1, you need to order the rows by outcome ('Success' first, then 'Failed'), and after that, by jobdate (latest jobdate first, which is DESCending order).
If so, this is what you want:
WITH     got_r_num     AS
(
     SELECT  j.*     -- or list columns wanted
     ,       ROW_NUMBER () OVER ( PARTITION BY  group_id     -- GROUP is not a good column name
                                  ,             path
                                  ,             asset
                                  ,             TRUNC (jobdate)
                                  ORDER BY      CASE  outcome
                                                    WHEN  'Success'
                                                    THEN  1
                                                    ELSE  2
                                                END
                                  ,             jobdate     DESC
                                )      AS r_num
     FROM    job_table j
     WHERE   outcome     IN ('Success', 'Failed')
--     AND     ...     -- Any other filtering, if needed
)
SELECT     *       -- or list all columns except r_num
FROM     got_r_num
WHERE     r_num     = 1
;
If you'd care to post CREATE TABLE and INSERT statements for the sample data, then I could test it.
It looks like you posted multiple copies of this thread. I'll bet that's not your fault; this site can cause that. Even though it's not your fault, please mark all the duplicate versions of this thread as "Answered" right away, and continue in this thread if necessary.
Edited by: Frank Kulash on Jul 28, 2012 11:47 PM
This site is flakier than I thought! I did see at least 3 copies of this same thread earlier, but I don't see them now.
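
For anyone who wants to try the ROW_NUMBER approach from this reply against the sample rows in the question, here is a rough sketch using Python's sqlite3 module instead of Oracle. Several things in it are assumptions rather than part of the thread: SQLite 3.25 or later for window functions, date(jobdate) standing in for TRUNC (jobdate), the dates rewritten as ISO text, and the columns GROUP and LEVEL renamed to group_id and job_level.

```python
import sqlite3

# Sample rows from the question, with JOBDATE rewritten as ISO text so that
# SQLite's date() can extract the calendar day.
jobs = [
    (3400908, "2012-07-27 10:01:18", "polA", "target1", "Success", 0, "incr", "clone1"),
    (3400907, "2012-07-27 10:01:09", "polA", "target1", "Failed", 0, "incr", "clone1"),
    (3389180, "2012-07-23 10:01:14", "polA", "target1", "Failed", 1, "incr", "clone1"),
    (3374713, "2012-07-23 10:01:03", "polA", "target1", "Success", 0, "incr", "clone1"),
    (3374712, "2012-07-22 11:24:32", "polA", "target1", "Success", 0, "Full", "clone1"),
    (3367074, "2012-07-22 11:24:00", "polA", "target1", "Failed", 1, "Full", "clone1"),
    (3167074, "2012-07-21 10:01:13", "polA", "target1", "Success", 0, "incr", "clone1"),
    (336074, "2012-07-21 10:01:08", "polA", "target1", "Success", 0, "incr", "clone1"),
]

conn = sqlite3.connect(":memory:")
conn.execute(
    "CREATE TABLE job_table (job_id INTEGER, jobdate TEXT, group_id TEXT, path TEXT,"
    " outcome TEXT, failed INTEGER, job_level TEXT, asset TEXT)"
)
conn.executemany("INSERT INTO job_table VALUES (?, ?, ?, ?, ?, ?, ?, ?)", jobs)

# One partition per group_id/path/asset/calendar day; within a partition,
# 'Success' rows sort first, then the latest jobdate.
query = """
WITH got_r_num AS (
    SELECT j.*,
           ROW_NUMBER() OVER (
               PARTITION BY group_id, path, asset, date(jobdate)
               ORDER BY CASE outcome WHEN 'Success' THEN 1 ELSE 2 END,
                        jobdate DESC
           ) AS r_num
    FROM job_table j
    WHERE outcome IN ('Success', 'Failed')
)
SELECT job_id, jobdate, outcome
FROM got_r_num
WHERE r_num = 1
ORDER BY jobdate DESC
"""
picked = conn.execute(query).fetchall()
for row in picked:
    print(row)
```

With these eight rows the sketch picks one job per calendar day, matching the four job IDs in the desired output.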

Partition by

Hi all,
Could anyone tell me how I can use the PARTITION BY clause in a SQL (SELECT) query? Please explain it with an example.
Thank you.

It's similar to a GROUP BY in that it groups the data in order to perform a function. In this case it 'groups' the data by ID and sums the value for each ID:
SQL> with t as (
   select 1 id, 100 value from dual union all
   select 1 id, 200 value from dual union all
   select 2 id, 300 value from dual union all
   select 2 id, 400 value from dual)
select id, value,
   sum(value) over (partition by id) sum_value
from t
        ID      VALUE SUM_VALUE
         1        100        300
         1        200        300
         2        300        700
         2        400        700
4 rows selected.

Virtual column based partitioning

Hi,
We have a non-partitioned table in a production database and wish to partition it based on an expression. Since we are on 11.2, the first thing that comes to mind is virtual column based partitioning. The "problem" is that in order to partition by a virtual column, you have to create one, and adding a new column to a table could break any application that doesn't reference the existing columns by name, e.g. "SELECT *" or "INSERT INTO table VALUES(....)".
My question is: is it possible to somehow specify the expression on which to partition directly in the "partition by" clause rather than specifying it as a virtual column definition?
Example:
Instead of this..
SQL> create table test (
2    id             number not null,
3    content        varchar2(10),
4    record_type    varchar2(1) generated always as (case when (substr(content, 1, 1)='B' and not substr(content, 1, 3)='Bxy') then 'B' else 'A' end) virtual
5 )
6 partition by list(record_type)
7 (
8    partition partA values ('A'),
9    partition partB values ('B')
10 );
Table created.
...I'd like to use something like this:
SQL> create table test (
2    id             number not null,
3    content        varchar2(10)
4 )
5 partition by list((case when (substr(content, 1, 1)='B' and not substr(content, 1, 3)='Bxy') then 'B' else 'A' end))
6 (
7    partition partA values ('A'),
8    partition partB values ('B')
9 );
partition by list((case when (substr(content, 1, 1)='B' and not substr(content, 1, 3)='Bxy') then 'B' else 'A' end))
ERROR at line 5:
ORA-00904: : invalid identifier
Thank you in advance for any answers.
Regards,
Jure
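
The worry about positional SQL in the question can be illustrated with a small sketch. It uses Python's sqlite3 module and an ordinary added column, so it is only an analogy: it shows the general effect of a new column on "SELECT *" and on an INSERT without a column list, and says nothing about how Oracle 11.2 treats a virtual column specifically. The sample values are made up.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.execute("CREATE TABLE test (id INTEGER NOT NULL, content TEXT)")
conn.execute("INSERT INTO test VALUES (1, 'Bxy1')")   # positional insert, 2 columns

# Schema change: one more ordinary column (a stand-in for the new column).
conn.execute("ALTER TABLE test ADD COLUMN record_type TEXT")

# SELECT * now returns three values per row instead of two.
star_width = len(conn.execute("SELECT * FROM test").fetchone())

# The positional INSERT no longer matches the column count and is rejected.
try:
    conn.execute("INSERT INTO test VALUES (2, 'Abc')")
    positional_insert_ok = True
except sqlite3.OperationalError:
    positional_insert_ok = False

# Naming the columns keeps the statement valid after the change.
conn.execute("INSERT INTO test (id, content) VALUES (2, 'Abc')")
row_count = conn.execute("SELECT COUNT(*) FROM test").fetchone()[0]

print(star_width, positional_insert_ok, row_count)
```

Statements that name their columns are unaffected by the extra column, which is why the risk is limited to code that relies on column position.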

"adding a new column to a table could break any application that doesn't reference the existing columns by name, e.g. "SELECT *" or "INSERT INTO table VALUES(....)"."
OK, I got it. You mean that the application uses select * from yourtable, and that the data is consumed by a datagrid or some other control, so the question is where the application will show or handle the new column's data, right?
Yes, that's why DBAs and developers follow an SDLC (Software Development Life Cycle): it settles what the table will look like, which columns it will have, their data types, the naming convention, privileges, indexes, storage parameters, constraints, dependent objects, etc. If you find a need to add a column after creating the table, it means either that something was missed at design time or that the business requirements have changed.
So, as far as select * is concerned, you have to change the application to use:
select col1, col2, new_col from your table... (in the order of your datagrid control's columns). There is no other solution; you have to change the application code everywhere you have used select * and the result is bound to a control.
Adding a new column will cause trouble only with select * from..., not with any INSERT/UPDATE/DELETE, because if those are running fine, it means they have well-written column references. For DML you need not worry.
By the way, what technology is your application built on? I have worked on a couple of ASP.NET applications using a datagrid that automatically adds or removes columns to match the cursor result (I don't remember the exact property, though).
Regards
Girish Sharma

Group by vs partition by

Hi,
I am new to my project and need some help.
My requirement is as follows:
In the EMP table I need to group all the employees who belong to one department. I used GROUP BY to fetch the results, but my lead asked me to use the PARTITION BY clause. I am not sure whether I will get a different result set by using PARTITION BY.
If yes, how can I use it?
Thanks,

Grouping reduces the number of result rows, and analytics do not. Here is a small emp example:
select ENAME
     , job
     , count(*) over (partition by job) job_count
from emp
order by ename;
ENAME      JOB        JOB_COUNT
ADAMS      CLERK              4
ALLEN      SALESMAN           4
BLAKE      MANAGER            3
CLARK      MANAGER            3
FORD       ANALYST            2
JAMES      CLERK              4
JONES      MANAGER            3
KING       PRESIDENT          1
MARTIN     SALESMAN           4
MILLER     CLERK              4
SCOTT      ANALYST            2
SMITH      CLERK              4
TURNER     SALESMAN           4
WARD       SALESMAN           4
With group by you would only get one line per job group. You could get the same result by joining emp to a grouped inline view:
select e1.ename
     , e1.job
     , e2.job_count
from emp e1
     , (select JOB
             , count(*) job_count
          from emp
         group by job) e2
where e1.job = e2.job
order by e1.ename;
ENAME      JOB        JOB_COUNT
ADAMS      CLERK              4
ALLEN      SALESMAN           4
BLAKE      MANAGER            3
CLARK      MANAGER            3
FORD       ANALYST            2
JAMES      CLERK              4
JONES      MANAGER            3
KING       PRESIDENT          1
MARTIN     SALESMAN           4
MILLER     CLERK              4
SCOTT      ANALYST            2
SMITH      CLERK              4
TURNER     SALESMAN           4
WARD       SALESMAN           4
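
The difference in row counts can also be checked with a short script. This is a sketch using Python's sqlite3 module rather than Oracle, and it assumes SQLite 3.25 or later for window functions; the ename/job pairs are copied from the output above.

```python
import sqlite3

# ename/job pairs copied from the emp output above.
emp = [
    ("ADAMS", "CLERK"), ("ALLEN", "SALESMAN"), ("BLAKE", "MANAGER"),
    ("CLARK", "MANAGER"), ("FORD", "ANALYST"), ("JAMES", "CLERK"),
    ("JONES", "MANAGER"), ("KING", "PRESIDENT"), ("MARTIN", "SALESMAN"),
    ("MILLER", "CLERK"), ("SCOTT", "ANALYST"), ("SMITH", "CLERK"),
    ("TURNER", "SALESMAN"), ("WARD", "SALESMAN"),
]

conn = sqlite3.connect(":memory:")
conn.execute("CREATE TABLE emp (ename TEXT, job TEXT)")
conn.executemany("INSERT INTO emp VALUES (?, ?)", emp)

# GROUP BY collapses the rows: one result row per job.
grouped = conn.execute(
    "SELECT job, COUNT(*) FROM emp GROUP BY job ORDER BY job"
).fetchall()

# PARTITION BY keeps every employee row and attaches the per-job count to it.
analytic = conn.execute(
    "SELECT ename, job, COUNT(*) OVER (PARTITION BY job) AS job_count"
    " FROM emp ORDER BY ename"
).fetchall()

print(len(grouped), "rows with GROUP BY")
print(len(analytic), "rows with PARTITION BY")
```

GROUP BY returns one row per job, while the analytic version returns one row per employee with the job count repeated on each.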
