Difference in number of records in GROUP BY and PARTITION BY
Hi Experts
If I run the following query I got 997 records by using GROUP BY.
SELECT c.ins_no, b.pd_date,a.project_id,
a.tech_no
FROM mis.tranche_balance a,
FMSRPT.fund_reporting_period b,
ods.proj_info_lookup c,
ods.institution d
WHERE a.su_date = b.pd_date
AND a.project_id = c.project_id
AND c.ins_no = d.ins_no
AND d.sif_code LIKE 'P%'
AND d.sif_code <> 'P-DA'
AND a.date_stamp >='01-JAN-2011'
AND pd_date='31-MAR-2011'
GROUP BY c.ins_no,
b.pd_date,
a.project_id,
a.tech_no;
I want to show the extra columns a.date_stamp and a.su_date
in the output, so I used PARTITION BY in the second query, but I got 1079 records.
SELECT c.ins_no, b.pd_date,a.date_stamp,a.su_date, a.project_id,
a.tech_no,
COUNT(*) OVER(PARTITION BY c.ins_no,
b.pd_date,
a.project_id,
a.tech_no)c
FROM mis.tranche_balance a,
FMSRPT.fund_reporting_period b,
ods.proj_info_lookup c,
ods.institution d
WHERE a.su_date = b.pd_date
AND a.project_id = c.project_id
AND c.ins_no = d.ins_no
AND d.sif_code LIKE 'P%'
AND d.sif_code <> 'P-DA'
AND a.date_stamp >='01-JAN-2011'
AND pd_date='31-MAR-2011';
Please help me understand why I got 1079 records.
And also please help me show the two extra columns in the output which are not used in the
GROUP BY clause.
Thanks in advance.
Hi,
user9077483 wrote:
Hi Experts
If I run the following query I got 997 records by using GROUP BY. ...
Let's call this "Query 1", and the number of rows it returns "N1".
The results tell you that there are 997 distinct combinations of the GROUP BY columns (c.ins_no, b.pd_date, a.project_id, a.tech_no).
I want to show the extra columns a.date_stamp and a.su_date
in the out put so that I have used PARTITION BY in the second query but I got 1079 records. ...
Let's call the query without the GROUP BY "Query 2", and the number of rows it returns "N2".
Please help me why I got 1079 records.
Because there are 1079 rows that meet all the conditions in the WHERE clause. Query 2 has nothing to do with distinct values in any columns. You would expect N2 to be at least as high as N1, and it's not surprising that N2 is higher than N1.
And also please help me how to show the two extra columns in the output which are not used in the GROUP BY clause.
Doesn't Query 2 show those two columns already? If Query 2 is not producing the results you want, then what results do you want?
Post a little sample data (CREATE TABLE and INSERT statements, relevant columns only) for all the tables involved, and the results you want from that data.
Explain, using specific examples, how you get those results from that data. Point out a couple of places where Query 2 is not doing what you want.
Always say which version of Oracle you're using.
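To make the difference concrete, here is a minimal sketch — Python with SQLite standing in for the Oracle schema, and a single hypothetical `tranche` table replacing the four-table join. It shows why the analytic COUNT keeps every row while GROUP BY collapses them, and one common way to surface extra columns by picking a representative value with MAX:

```python
import sqlite3

# Hypothetical single-table stand-in for the original four-table join:
# each row carries the four GROUP BY columns plus date_stamp and su_date.
conn = sqlite3.connect(":memory:")
conn.executescript("""
CREATE TABLE tranche (
    ins_no INTEGER, pd_date TEXT, project_id INTEGER, tech_no INTEGER,
    date_stamp TEXT, su_date TEXT
);
-- two rows share the same (ins_no, pd_date, project_id, tech_no)
-- but differ in date_stamp, so GROUP BY collapses them into one row
INSERT INTO tranche VALUES
 (1, '2011-03-31', 10, 100, '2011-01-05', '2011-03-31'),
 (1, '2011-03-31', 10, 100, '2011-02-07', '2011-03-31'),
 (2, '2011-03-31', 20, 200, '2011-01-09', '2011-03-31');
""")

# "Query 1": GROUP BY returns one row per distinct combination
n1 = len(conn.execute("""
    SELECT ins_no, pd_date, project_id, tech_no
    FROM tranche
    GROUP BY ins_no, pd_date, project_id, tech_no
""").fetchall())

# "Query 2": the analytic COUNT annotates rows but collapses nothing
n2 = len(conn.execute("""
    SELECT ins_no, pd_date, date_stamp, su_date, project_id, tech_no,
           COUNT(*) OVER (PARTITION BY ins_no, pd_date,
                          project_id, tech_no) AS c
    FROM tranche
""").fetchall())

# One row per group AND the extra columns: pick a representative
# value (here MAX) for each column that is not in the GROUP BY.
rows = conn.execute("""
    SELECT ins_no, pd_date, project_id, tech_no,
           MAX(date_stamp) AS date_stamp, MAX(su_date) AS su_date
    FROM tranche
    GROUP BY ins_no, pd_date, project_id, tech_no
""").fetchall()

print(n1, n2, len(rows))  # 2 3 2
```

If you need the actual date_stamp/su_date of each underlying row rather than one representative per group, the analytic query is the right shape, and N2 being larger than N1 is exactly what you should expect.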
Similar Messages
-
Any table or program to get the number of records in P, A and F tables
HI all
Is there any table or program to get the number of records in the P, A and F tables? I want to create SQ01 queries to get the status of the number of records. We are going for a production cutover next week and want to capture all the data before and after the upgrade cutover.
Also suggest how to create SQ01 queries.
Thanks in advance
regards
Janardhan Kumar K.
Use transaction LISTSCHEMA to see all the tables associated with your cube.
The total number would be what you find in both the E and F fact tables. If there is no compression in the cube then the E table will be empty.
Alternatively, you can use transaction SE16 and enter the E table and F table manually:
E table - /BIC/E(cube name) and F table - /BIC/F(cube name)
Or else you can go to the Manage screen of the cube and, without selecting any field for output, tick the option "output number of hits" and execute. The total of the Row Count column will give you the total number of records in the cube. -
How to find number of records in a cube and ODS....
Hi,
How do we find total number of records in a cube and ODS?
Is there any Tcode for this ?
From the content view it is difficult to get the number of records if there are many.
Thanks,
Jeetu
Hello,
Please check the following thread,
Number of records in a infocube
hope it helps,
assign points if helpful. -
Total number of records loaded into ODS and in case of Infocube
hai
I loaded some data records from an Oracle source system into an ODS and InfoCube.
My source system guy gave me some data records by his selection on the Oracle source system side.
How can I see how many data records were loaded into the ODS and InfoCube?
I can check in the monitor, but that is not correct (because I loaded a second and third time using "ignore duplicate records"). So I think the monitor won't give me the correct number of data records loaded for the ODS and InfoCube.
So is there any transaction code or something to find the number of records loaded for an ODS and InfoCube?
Please tell me.
I'll assign the points.
bye
rizwan
HI
I went into ODS manage and saw the 'transferred' and 'added' data records. Both are the same.
But when I total the added data records it comes to 147,737.
But when I check the active table (/BIC/A(odsname)00) the total number of entries comes to 137,738.
Why is there that difference?
And in the case of an InfoCube, how can I find the total number of records loaded into the InfoCube (not in the InfoCube)?
Like any table for fact table and dimension tables.
pls tell me
txs
rizwan -
Find duplicate records without using group by and having
I know i can delete duplicate records using an analytic function. I don't want to delete the records, I want to look at the data to try to understand why I have duplicates. I am looking at tables that don't have unique constraints (I can't do anything about it). I have some very large tables. so I am trying to find a faster way to do this.
for example
create table myTable (
col1 number,
col2 number,
col3 number,
col4 number,
col5 number);
My key columns are col1 and col2 (this is not enforced in the database). So I want to get all the records that have duplicates on these fields and put them in a table to review. This is a standard way to do it, but it requires 2 full table scans of very large tables (many, many gigabytes; one is 150 GB and not partitioned), a sort, and a hash join. Even if I increase sort_area_size and hash_area_size, it takes a long time to run.
create table mydup
as
select b.*
from (select col1,col2,count(*)
from myTable
group by col1, col2
having count(*) > 1) a,
myTable b
where a.col1 = b.col1
and a.col2 = b.col2
I think there is a way to do this without a join by using rank, dense_rank, row_number, or some other way. When I google this, all I get is how to "delete them". I want to analyze them and not delete them.
create table mytable (col1 number,col2 number,col3 number,col4 number,col5 number);
insert into mytable values (1,2,3,4,5);
insert into mytable values (2,2,3,4,5);
insert into mytable values (3,2,3,4,5);
insert into mytable values (2,2,3,4,5);
insert into mytable values (1,2,3,4,5);
SQL> ed
Wrote file afiedt.buf
1 select * from mytable
2 where rowid in
3 (select rid
4 from
5 (select rowid rid,
6 row_number() over
7 (partition by
8 col1,col2
9 order by rowid) rn
10 from mytable
11 )
12 where rn <> 1
13* )
SQL> /
COL1 COL2 COL3 COL4 COL5
1 2 3 4 5
2 2 3 4 5
SQL>
Regards
Girish Sharma -
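Girish's ROWID/ROW_NUMBER approach avoids the self-join entirely: it scans the table once, numbers the rows within each (col1, col2) key, and keeps every copy after the first. A sketch of the same idea, using SQLite (whose implicit rowid behaves similarly) rather than the poster's Oracle environment:

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.executescript("""
CREATE TABLE mytable (col1 INTEGER, col2 INTEGER, col3 INTEGER,
                      col4 INTEGER, col5 INTEGER);
INSERT INTO mytable VALUES (1,2,3,4,5), (2,2,3,4,5), (3,2,3,4,5),
                           (2,2,3,4,5), (1,2,3,4,5);
""")

# Number the rows within each (col1, col2) key by rowid; every row
# whose row_number is not 1 is a later duplicate of that key.
dupes = conn.execute("""
    SELECT * FROM mytable
    WHERE rowid IN (
        SELECT rid FROM (
            SELECT rowid AS rid,
                   ROW_NUMBER() OVER (PARTITION BY col1, col2
                                      ORDER BY rowid) AS rn
            FROM mytable
        ) WHERE rn <> 1
    )
""").fetchall()
print(dupes)  # the later copies of keys (1,2) and (2,2)
```

Because the table is read only once inside the analytic subquery, this avoids the second full scan and hash join of the GROUP BY/HAVING formulation, which matters on a 150 GB unpartitioned table.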
SELECT query & Database Table Difference in number of records
Hi All,
In the program it selects 3 records based on the selection parameters, whereas when I execute SE16 with the same selection criteria it gives 4 records.
Please suggest when we would face this kind of issue.
Thanks,
Spandana
SELECT objnr wlges
FROM bpge
INTO TABLE t_bpge
FOR ALL ENTRIES IN t_prps
WHERE lednr EQ c_0002
AND objnr EQ t_prps-objnr
AND wrttp EQ c_41.
Here t_prps-objnr has the following values
PR00473170
PR00397060
PR00397061
PR00397062
PR00397063
PR00397064
PR00397065
For this selection criteria there are 4 records in the BPGE table, whereas the program is fetching only 3. -
Wrong number of records in extraction in Generic Extractor
Hi all,
I have created generic datasource on table DFKKCOLLH and made this datasource delta enabled on field ACPTM.
1) RSA3 is showing 94,615 records
2) When I load the full load it fetches 92,015
3) When I load the initialization for the delta load it fetches only 35,000 records
Can anybody suggest why there is difference ?
Regards
Nilesh
You can't use ACPTM as a delta field; it's a time field, not a date field. The difference might be that in your init you only take the records where this field is initial, or just not initial...
Just check in SE16 the number of records you should have and tell us if it corresponds to what you have in RSA3, in full or in init...
M. -
Number of records in Cube and in report
Hi,
Is it possible to have fewer records visible in the InfoCube than in reporting? i.e., if the cube contains 100,000 records, then using InfoCube -> Manage and checking the data it should show only 90,000 records, while the report should show all the records. If it is possible, how?
If you want all the records displayed in the report as in the cube, create a report having all characteristics and key figures with no filters.
Sorry, I don't get the 90,000 part... "checking the data it has to show only 90000 records" -- checking where and how?
**Added** will be the number of records in the cube. -
Total number of records in a file
hello experts,
I am getting a file to upload the data... This file contains one header record, one trailer record, and the remaining data records...
How do I read all the data into one internal table along with the header and trailer record?
I have to do control record validation: the amount given in the trailer record is the total number of records including the header and trailer.
Can anyone guide me how to do that?
SRI
If that's an Excel file then you can use the function module
CALL FUNCTION 'ALSM_EXCEL_TO_INTERNAL_TABLE'
EXPORTING
filename = p_fnam
i_begin_col = 1
i_begin_row = 1
i_end_col = 100
i_end_row = 30000
TABLES
intern = iexcel
EXCEPTIONS
inconsistent_parameters = 1
upload_ole = 2
OTHERS = 3.
IF sy-subrc <> 0.
WRITE: / 'EXCEL UPLOAD FAILED ', p_fnam, sy-subrc.
STOP.
ENDIF.
Instead of passing 1 in i_begin_row: if you know how many lines the header is going to be, you can ignore those lines and pass the number of the row which has the first data record. You can hard-code this because the header length is going to be constant.
Just a suggestion. Hope this works for you.
Thanks.
Message was edited by:
mg s -
Inconsistent number of records
When i do an Init load from a datasource to a DSO.
The number of records in the source system and the number loaded into the target do not match for the same selection.
I checked transaction KEB2 in the source system. It gives the following message:
"Delta update delivers inconsistent data to SAP BW"
If a full load is done, the number of records in the source system and target match.
What can be done so that the number of records loaded in the target matches the source?
Hi
Try to see the number of records in RSA3.
It should generally match with the full/init.
Setting a sequential number to records placed into groups?
Hello, I have one table with records which can be grouped, and I want to know if I can set an ordinal number on each record: when the group breaks, set 1 again for the first record of the next group, then 2 for the next, 3 for the next... until the group breaks again.
The first thing I need is to order the records according to the group, then number each record inside the group sequentially from 1 to n (the number of records in the group).
Can I do this???
Thanks in advance
Hello,
you can use the row_number analytic function
you can use the row_number analytic function
WITH gp AS
( SELECT 1 id, 1 family_id,to_date('29/06/1966','dd/mm/yyyy') dob from dual union all
SELECT 2 id, 1 family_id,to_date('22/06/1986','dd/mm/yyyy') dob from dual union all
SELECT 3 id, 2 family_id,to_date('04/03/1975','dd/mm/yyyy') dob from dual union all
SELECT 4 id, 3 family_id,to_date('01/04/1990','dd/mm/yyyy') dob from dual union all
SELECT 5 id, 3 family_id,to_date('10/01/1996','dd/mm/yyyy') dob from dual union all
SELECT 6 id, 3 family_id,to_date('21/09/2000','dd/mm/yyyy') dob from dual
)
SELECT
id,
family_id,
dob,
ROW_NUMBER() OVER(PARTITION BY family_id ORDER BY DOB) rn,
ROW_NUMBER() OVER(PARTITION BY family_id ORDER BY DOB DESC) rn_desc
FROM
gp
ID FAMILY_ID DOB RN RN_DESC
1 1 29-JUN-1966 00:00:00 1 2
2 1 22-JUN-1986 00:00:00 2 1
3 2 04-MAR-1975 00:00:00 1 1
4 3 01-APR-1990 00:00:00 1 3
5 3 10-JAN-1996 00:00:00 2 2
6 3 21-SEP-2000 00:00:00 3 1
6 rows selected.
which gives you the ability to generate the numbers as you requested. The important parts are the PARTITION BY, which defines the group of rows over which the function is applied, and the ORDER BY, which determines the order in which the function is applied within those rows. You can see that I called the function twice, once with ORDER BY DOB and once with ORDER BY DOB DESC, and the difference in the output.
HTH
David -
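David's ROW_NUMBER() example works in any database with window functions; here is a sketch of the same family/date-of-birth data in SQLite (ISO date strings instead of TO_DATE, since SQLite stores dates as text):

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.executescript("""
CREATE TABLE gp (id INTEGER, family_id INTEGER, dob TEXT);
INSERT INTO gp VALUES
 (1, 1, '1966-06-29'), (2, 1, '1986-06-22'),
 (3, 2, '1975-03-04'),
 (4, 3, '1990-04-01'), (5, 3, '1996-01-10'), (6, 3, '2000-09-21');
""")

# The counter restarts at 1 whenever family_id changes (PARTITION BY),
# and rows are numbered within each family by date of birth (ORDER BY).
rows = conn.execute("""
    SELECT id, family_id,
           ROW_NUMBER() OVER (PARTITION BY family_id ORDER BY dob) AS rn
    FROM gp
    ORDER BY id
""").fetchall()
for r in rows:
    print(r)  # (id, family_id, rn) with rn restarting per family
```

The third column restarts at 1 for each family, exactly the "number the records inside the group" behaviour the poster asked for.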
RE: (forte-users) Optimal number of records to fetch fromForte C ursor
The reason why a single fetch of 20,000 records performs worse than
two fetches of 10,000 might be related to memory behaviour. Do you
keep the first 10,000 records in memory when you fetch the next
10,000? If not, then a single fetch of 20,000 records requires more
memory than two fetches of 10,000. You might have some extra
overhead of Forte requesting additional memory from the OS, garbage
collections just before every request for memory, and maybe even
the OS swapping some memory pages to disk.
This behaviour can be controlled by modifying the Minimum memory
and Maximum memory of the partition, as well as the memory chunk
size Forte uses to increment its memory.
Upon partition startup, Forte requests the Minimum memory from the
OS. Within this area, the actual memory being used grows until
it hits the ceiling of this space. This is when the garbage collector
kicks in and removes all unreferenced objects. If this does not suffice
to store the additional data, Forte requests 1 additional chunk of a
predefined size. Now the same behaviour is repeated in this slightly
larger piece of memory: actual memory keeps growing until it hits
the ceiling, upon which the garbage collector removes all unreferenced
objects. If the garbage collector reduces the amount of
memory being used to below the original Minimum memory, Forte
will NOT return the additional chunk of memory to the OS. If the
garbage collector fails to free enough memory to store the new data,
Forte will request an additional chunk of memory. This process is
repeated until the Maximum memory is reached. If the garbage
collector fails to free enough memory at this point, the process
terminates gracelessly (which is what happens sooner or later when
you have a memory leak; something most Forte developers have
seen once or twice).
Pascal Rottier
STP - MSS Support & Coordination Group
Philip Morris Europe
e-mail: [email protected]
Phone: +49 (0)89-72472530
+++++++++++++++++++++++++++++++++++
Origin IT-services
Desktop Business Solutions Rotterdam
e-mail: [email protected]
Phone: +31 (0)10-2428100
+++++++++++++++++++++++++++++++++++
/* All generalizations are false! */
-----Original Message-----
From: [email protected] [SMTP:[email protected]]
Sent: Monday, November 15, 1999 6:53 PM
To: [email protected]
Subject: (forte-users) Optimal number of records to fetch from Forte
Cursor
Hello everybody:
I'd like to ask a very important question.
I opened a Forte cursor with approx 1.2 million records, and now I am trying
to figure out the number of records per fetch needed to obtain
acceptable performance.
To my surprise, fetching 100 records at once gave me only approx 15 percent
performance gain in comparison with fetching records one by one.
I haven't found a significant difference in performance fetching 100, 500
or 10,000 records at once. At the same time, fetching 20,000
records at once makes performance approx 20% worse (a fact I cannot
explain).
Does anybody have any experience with how to improve performance when fetching
from a Forte cursor with a big number of rows?
Thank you in advance
Genady Yoffe
Software Engineer
Descartes Systems Group Inc
Waterloo On
Canada
For the archives, go to: http://lists.sageit.com/forte-users and use
the login: forte and the password: archive. To unsubscribe, send in a new
email the word: 'Unsubscribe' to: [email protected]
Hi Kieran,
According to your description, you are trying to figure out the optimal number of records per partition, right? As per my understanding, this number varies with your hardware: the better your hardware, the more records per partition.
The earlier version of the performance guide for SQL Server 2005 Analysis Services Performance Guide stated this:
"In general, the number of records per partition should not exceed 20 million. In addition, the size of a partition should not exceed 250 MB."
Besides, the number of records is not the primary concern here. Rather, the main criteria are manageability and processing performance. Partitions can be processed in parallel, so the more there are, the more can be processed at once. However, the more partitions you have, the more things you have to manage. Here are some links which describe partition optimization:
http://blogs.msdn.com/b/sqlcat/archive/2009/03/13/analysis-services-partition-size.aspx
http://www.informit.com/articles/article.aspx?p=1554201&seqNum=2
Regards,
Charlie Liao
TechNet Community Support -
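The batch-size tradeoff Pascal and Charlie describe above — fewer fetch calls versus more memory held per call — is generic to any batched cursor API, not just Forte. A hypothetical illustration using Python's sqlite3 `fetchmany` (the table and row count are made up for the demo):

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.execute("CREATE TABLE t (n INTEGER)")
conn.executemany("INSERT INTO t VALUES (?)", [(i,) for i in range(1000)])

def consume(batch_size):
    """Fetch every row in batches of batch_size: a larger batch means
    fewer fetch calls (less per-call overhead) but a bigger buffer
    held in memory during each call."""
    cur = conn.execute("SELECT n FROM t")
    total, calls = 0, 0
    while True:
        batch = cur.fetchmany(batch_size)
        if not batch:
            break
        total += len(batch)
        calls += 1
    return total, calls

print(consume(100))    # (1000, 10)
print(consume(20000))  # (1000, 1)
```

Past the point where per-call overhead is amortized, growing the batch only grows the buffer, which is consistent with the poster seeing no gain between 500 and 10,000 and a regression at 20,000.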
RE: (forte-users) Optimal number of records to fetch fromForte Cursor
Guys,
The behavior (1 fetch of 20000 vs 2 fetches of 10000 each) may also be DBMS
related. There is potentially high overhead in opening a cursor and initially
fetching the result table. I know this covers a great deal DBMS technology
territory here but one explanation is that the same physical pages may have to
be read twice when performing the query in 2 fetches as compared to doing it in
one shot. Physical IO is perhaps the most expensive (vis a vis- resources)
part of a query. Just a thought.
"Rottier, Pascal" <[email protected]> on 11/15/99 01:34:22 PM
To: "'Forte Users'" <[email protected]>
cc: (bcc: Charlie Shell/Bsg/MetLife/US)
Subject: RE: (forte-users) Optimal number of records to fetch from Forte C
ursor
-
Optimal number of records to fetch from Forte Cursor
Hello everybody:
I'd like to ask a very important question.
I opened a Forte cursor with approx 1.2 million records, and now I am trying
to figure out the number of records per fetch needed to obtain
acceptable performance.
To my surprise, fetching 100 records at once gave me only approx 15 percent
performance gain in comparison with fetching records one by one.
I haven't found a significant difference in performance fetching 100, 500
or 10,000 records at once. At the same time, fetching 20,000
records at once makes performance approx 20% worse (a fact I cannot
explain).
Does anybody have any experience with how to improve performance when fetching
from a Forte cursor with a big number of rows?
Thank you in advance
Genady Yoffe
Software Engineer
Descartes Systems Group Inc
Waterloo On
Canada
You can do it by writing code in the start routine of your transformations.
1. If you have any specific filter criteria, go with that and delete unwanted records.
2. If you want to load a specific number of records based on a count, then in the start routine of the transformation, loop through the source package records keeping a counter until you reach your desired count, and copy those records into an internal table. Delete the records in the source package, then assign the records stored in the internal table back to the source package. -
SQL help: return number of records for each day of last month.
Hi: I have records in the database with a field in the table which contains the Unix epoch time for each record. Let's say the table name is ED and the field utime contains the Unix epoch time.
Is there a way to get a count of the number of records for each day of the last month? Essentially I want a query which returns a list of counts (the number of records for each day) based on the utime field containing the Unix epoch time. If a particular day does not have any records, I want the query to return 0 for that day. I have no clue where to start. Would I need another table which has the list of days?
Thanks
Ray
Peter: thanks. That helps but not completely.
When I run the query to include only records for July using a statement such as following
============
SELECT /*+ FIRST_ROWS */ COUNT(ED.UTIMESTAMP), TO_CHAR((TO_DATE('01/01/1970','MM/DD/YYYY') + (ED.UTIMESTAMP/86400)), 'MM/DD') AS DATA
FROM EVENT_DATA ED
WHERE AGENT_ID = 160
AND (TO_CHAR((TO_DATE('01/01/1970','MM/DD/YYYY')+(ED.UTIMESTAMP/86400)), 'MM/YYYY') = TO_CHAR(SYSDATE-15, 'MM/YYYY'))
GROUP BY TO_CHAR((TO_DATE('01/01/1970','MM/DD/YYYY') + (ED.UTIMESTAMP/86400)), 'MM/DD')
ORDER BY TO_CHAR((TO_DATE('01/01/1970','MM/DD/YYYY') + (ED.UTIMESTAMP/86400)), 'MM/DD');
=============
I get the following
COUNT(ED.UTIMESTAMP) DATA
1 07/20
1 07/21
1 07/24
2 07/25
2 07/27
2 07/28
2 07/29
1 07/30
2 07/31
Some dates do not have any records and so produce no output. Is there a way to show the missing dates with a COUNT value of 0?
Thanks
Ray
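One way to get a zero count for days with no records is to generate the calendar days first and outer-join the events onto them; Oracle would typically use CONNECT BY LEVEL or a calendar table, but the idea is sketched here with a recursive CTE in SQLite (the table name and epoch values are made up for the demo):

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.executescript("""
CREATE TABLE event_data (utimestamp INTEGER);  -- Unix epoch seconds
-- two events on 2023-07-01, none on 2023-07-02, one on 2023-07-03 (UTC)
INSERT INTO event_data VALUES (1688169600), (1688173200), (1688342400);
""")

rows = conn.execute("""
    WITH RECURSIVE days(d) AS (          -- generate the calendar days
        SELECT '2023-07-01'
        UNION ALL
        SELECT date(d, '+1 day') FROM days WHERE d < '2023-07-03'
    )
    SELECT days.d, COUNT(e.utimestamp)   -- COUNT(col) ignores NULLs,
    FROM days                            -- so unmatched days count 0
    LEFT JOIN event_data e
      ON date(e.utimestamp, 'unixepoch') = days.d
    GROUP BY days.d
    ORDER BY days.d
""").fetchall()
print(rows)  # [('2023-07-01', 2), ('2023-07-02', 0), ('2023-07-03', 1)]
```

The key points carry straight back to the Oracle query: drive the result from the generated day list, LEFT JOIN the epoch-converted events onto it, and count a joined column (not `COUNT(*)`) so empty days show 0.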