Performance of top-n analysis queries
Hi,
I am trying to understand how top-n queries work with respect to performance.
After my query filters are applied, I have 10 lakh records in my table.
I just want to retrieve the top 1000 records.
i.e.
select * from t where x=y; -- returns 10,00,00,000 records
To retrieve top 1000 records I use
select * from
(select * from t where x=y
order by x
) where rownum<1001;
How can I actually measure that the cost is reduced?
Or, to put my question another way: how can I prove that the 2nd query is faster or better than the 1st query (does it actually improve performance)?
Thanks.
You cannot compare your two queries directly. They are entirely different. The first one gets the rows from table T that satisfy the condition X=Y. The second query does more than that: it filters table T on the condition X=Y, then orders the result set by column X, and finally keeps only the first 1000 rows.
So how would you compare two queries that are entirely different?
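Even though the two queries return different results, their execution plans can still be inspected side by side. The following is a minimal sketch, assuming an Oracle release that ships DBMS_XPLAN and a table t with columns x and y as in the question; it is one way to look at the optimizer's estimates, not a definitive benchmark.

```sql
-- Plan for the unrestricted query
EXPLAIN PLAN FOR
SELECT * FROM t WHERE x = y;

SELECT * FROM TABLE(DBMS_XPLAN.DISPLAY);

-- Plan for the top-n query
EXPLAIN PLAN FOR
SELECT *
FROM  (SELECT * FROM t WHERE x = y ORDER BY x)
WHERE  rownum < 1001;

SELECT * FROM TABLE(DBMS_XPLAN.DISPLAY);
```

In the second plan, operations such as COUNT STOPKEY and SORT ORDER BY STOPKEY indicate that Oracle only keeps track of the first 1000 rows instead of sorting the whole result set. The Cost column is only an estimate; to compare the actual work done, run both statements in SQL*Plus with SET AUTOTRACE TRACEONLY STATISTICS and compare figures such as consistent gets and sorts.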
Similar Messages
-
Top n analysis using hierarchical queries
hi all,
Can we do top-n analysis in hierarchical queries using the LEVEL pseudocolumn? If so, please give an example.
thanks and regards,
sri ram.
Hi,
Analytic functions (such as RANK) often interfere with CONNECT BY queries. Do one of them in a sub-query, and the other in a super-query, as shown below.
If you do the CONNECT BY first, use ROWNUM (which is assigned after ORDER SIBLINGS BY is applied) to preserve the order of the CONNECT BY query.
WITH connect_by_results AS
(
    SELECT  LPAD ( ' '
                 , 3 * (LEVEL - 1)
                 ) || ename    AS iname
    ,       sal
    ,       ROWNUM             AS r_num
    FROM    scott.emp
    START WITH  mgr IS NULL
    CONNECT BY  mgr = PRIOR empno
    ORDER SIBLINGS BY  ename
)
SELECT    iname
,         sal
,         RANK () OVER (ORDER BY sal DESC)  AS sal_rank
FROM      connect_by_results
ORDER BY  r_num
;
Output:
INAME                  SAL   SAL_RANK
KING                  5000          1
   BLAKE              2850          5
      ALLEN           1600          7
      JAMES            950         13
      MARTIN          1250         10
      TURNER          1500          8
      WARD            1250         10
   CLARK              2450          6
      MILLER          1300          9
   JONES              2975          4
      FORD            3000          2
         SMITH         800         14
      SCOTT           3000          2
         ADAMS        1100         12
I hope this answers your question.
If not, post a little sample data (CREATE TABLE and INSERT statements, relevant columns only), and the results you want from that data. If you use only commonly available tables (such as those in the scott or hr schemas), then you don't have to post any sample data; just post the results.
Explain how you get those results from that data.
Always say what version of Oracle you're using. -
Performing Top-n Analysis (but per group)
The Oracle University Guide SQL Volume 2 says:
To perform Top-n Analysis the general syntax is
SELECT [column_list], ROWNUM
FROM (SELECT [column_list]
FROM table
ORDER BY Top-N_column)
WHERE ROWNUM <= N;
for example
To display the top three earner names and salaries
from the EMPLOYEES table.
SELECT ROWNUM as RANK, last_name, salary
FROM (SELECT last_name,salary FROM employees
ORDER BY salary DESC)
WHERE ROWNUM <= 3;
or to display the four most senior employees in the company.
SELECT ROWNUM as SENIOR,E.last_name, E.hire_date
FROM (SELECT last_name,hire_date FROM employees
ORDER BY hire_date)E
WHERE rownum <= 4;
But what if I have groups? For example, what if I want to display the 3 top earners per department?
In my case,
I want to fetch the 4 items with the biggest quantity (POSOTHTA) in each category.
SELECT ROWNUM as RANK,
H.KATHG_EIDOYS,
H.KATHG_EIDOYS_DESCR,
H.EIDOS,
H.EIDOS_DESCR,
H.CODE_SUP_BASIKOS_NAME,
H.RAFI_CODE,
H.LINES,
H.POSOTHTA
from (
SELECT B.KATHG_EIDOYS,
D.DESCRIPTION KATHG_EIDOYS_DESCR,
B.CODE EIDOS,
B.DESCRIPTION EIDOS_DESCR,
S.NAME CODE_SUP_BASIKOS_NAME,
C.RAFI_CODE,
COUNT(A.FLD_SEQ_NUM) LINES,
nvl(SUM(decode(k.INV_APOGRAFH_FLAG,'0', decode(k.INV_EXAGOGH_POSOTHTA,1, a.POSOTHTA_TIMOLOGHSHS,2,-a.POSOTHTA_TIMOLOGHSHS))),0) POSOTHTA
FROM ERP_EIDOI_ANA_RAFI C,
ERP_KODIKOI_KINHSHS K,
ERP_POLHSEIS_DETAILS A,
ERP_SUP_CUST S,
ERP_KATHG_EIDON D,
ERP_EIDH b
WHERE B.COMPANY = DECODE(1,1,'9',B.COMPANY)
AND a.COMPANY_KK=K.COMPANY
AND a.KK_CODE=K.CODE
and A.company_WAREHOUSE = c.COMPANY_WARE(+)
and A.MASTER_WAREHOUSE = c.MASTER_WARE_CODE(+)
and A.CODE_WAREHOUSE = c.DETAIL_WARE_CODE(+)
and A.COMPANY_EIDOS = c.COMPANY_EIDOS(+)
and A.EIDOS = c.CODE_EIDOS (+)
AND C.DEFAULT_FLAG (+)= 1
AND b.code = a.EIDOS
and b.company = a.COMPANY_EIDOS
AND D.CODE= B.KATHG_EIDOYS
AND D.COMPANY= B.COMPANY_KATHG_EIDOYS
AND B.COMPANY_SUP_BASIKOS = S.COMPANY
AND B.CODE_SUP_BASIKOS = S.CODE
AND B.PROMHTHEYTHS_FLAG_BASIKOS = S.PELATHS_PROMHTHEYTHS_FLAG
AND /*&p_where*/
a.COMPANY='9' and (a.group_source) = '10' and (A.MASTER_WAREHOUSE) = '01' and (A.CODE_WAREHOUSE) = '0101' and (a.hmerom_parast) >= to_date('01/01/2006','dd/mm/rrrr') and (a.hmerom_parast) <= to_date('25/05/2006','dd/mm/rrrr')
GROUP BY B.KATHG_EIDOYS, D.DESCRIPTION, B.CODE, B.DESCRIPTION, S.NAME,C.RAFI_CODE
ORDER BY 8 DESC
) H
where 1=1 and ROWNUM <= 4
This select does not give me the desired results, because if, for example,
category 01 has 10 items
and category 02 has 2 items,
this select will return only the first four rows overall
and not the items from category 02.
If the problem is clear, I look forward to your replies.
Thanks in advance
Hi,
Here is an example. It gives you the customer IDs with the highest salary per department.
SELECT CUSTOMER_ID, SALARY, RANK, DEPARTMENT
FROM (SELECT CUSTOMER_ID, SALARY, DEPARTMENT,
RANK() OVER (PARTITION BY DEPARTMENT ORDER BY SALARY DESC) AS RANK
FROM TABLEA)
WHERE RANK < 2;
Peter D. -
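The same pattern extends from "the top one" to "the top N per group" by changing the filter on the ranking column. Below is a minimal sketch for the "3 top earners per department" case from the question, assuming the EMPLOYEES table of the hr sample schema (columns department_id, last_name, salary); the alias rn is only illustrative.

```sql
SELECT department_id, last_name, salary, rn
FROM  (SELECT department_id,
              last_name,
              salary,
              -- numbering restarts at 1 within each department
              ROW_NUMBER() OVER (PARTITION BY department_id
                                 ORDER BY salary DESC) AS rn
       FROM   employees)
WHERE  rn <= 3
ORDER  BY department_id, rn;
```

ROW_NUMBER returns at most 3 rows per department even when salaries tie; RANK or DENSE_RANK can be used instead if tied rows should all be kept. For the items-per-category query above, the grouped query would become the inline view and the PARTITION BY would be on the category column.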
An Oracle University SQL course material says
The high-level structure of a Top-N analysis query is:
SELECT [column_list], ROWNUM
FROM (SELECT [column_list]
FROM table
ORDER BY Top-N_column)
WHERE ROWNUM <= N;
For example to display the top three earner names and salaries from the EMPLOYEES table:
SELECT ROWNUM as RANK, last_name, salary
FROM (SELECT last_name,salary FROM employees
ORDER BY salary DESC)
WHERE ROWNUM <= 3;
My question is
If, instead of this query, I write
1)
SELECT ROWNUM as RANK, last_name, salary
FROM employees
WHERE ROWNUM <= 3
ORDER BY salary DESC
or
2)
SELECT ROWNUM as RANK, last_name, salary
FROM ( SELECT last_name,salary
FROM employees
WHERE ROWNUM <= 3
ORDER BY salary DESC)
Is there any difference?
The results in schema hr are the same.
Thank you
Is there any difference? Yes, there is!
SQL> with t as (select 1 num from dual union all
2 select 4 from dual union all
3 select 3 from dual union all
4 select 2 from dual)
5 --
6 SELECT ROWNUM as RANK, num
7 FROM (SELECT num FROM t ORDER BY num desc)
8 WHERE ROWNUM <= 3;
RANK NUM
1 4
2 3
3 2
SQL>
SQL> with t as (select 1 num from dual union all
2 select 4 from dual union all
3 select 3 from dual union all
4 select 2 from dual)
5 --
6 SELECT ROWNUM as RANK, num
7 FROM t
8 WHERE ROWNUM <= 3
9 ORDER BY num DESC
10 /
RANK NUM
2 4
3 3
1 1
SQL>
SQL> with t as (select 1 num from dual union all
2 select 4 from dual union all
3 select 3 from dual union all
4 select 2 from dual)
5 --
6 SELECT ROWNUM as RANK, num
7 FROM (SELECT num FROM t
8 WHERE ROWNUM <= 3
9 ORDER BY num DESC)
10 /
RANK NUM
1 4
2 3
3 1
SQL>
ROWNUM is assigned before the ordering is performed.
That is why you should put the ORDER BY in an inline view and apply the ROWNUM filter outside it. -
What is Top-N analysis? Can someone explain? I encountered this while going through
the SQL book:
The ORDER BY clause in the subquery is not
needed unless you are performing Top-N analysis.
That means you are doing something like this in order to get the five highest-paid people:
SELECT * FROM
( SELECT empno, ename, sal+nvl(comm,0) as remuneration
FROM emp
ORDER BY remuneration DESC )
WHERE rownum <= 5
/
We have to do the ORDER BY in a sub-query because ROWNUM gets assigned before the rows are sorted.
Cheers, APC -
TOP N analysis with same values
Dear Members,
Suppose we have the following data in the table Student.
Sname GPA
Jack 4.0
Smith 3.7
Rose 3.5
Rachel 3.5
Ram 2.8
I have seen many questions in this forum with good queries for top-N analysis, but they do not work in my case.
There are 5 students in total. I need a query that takes a number as input and returns the students with the top GPAs in descending order.
If I give 4 as input, I must get the GPAs 4, 3.7, 3.5, 3.5 and 2.8, since two of the GPAs are the same. If I give 3 as input, I must get the GPAs 4, 3.7, 3.5 and 3.5.
The query must treat equal GPAs as one value, not as different ones. How can we achieve this? That is, the top three students (if the input is 3) must be
Jack 4.0
Smith 3.7
Rose 3.5
Rachel 3.5
It must also include Rachel.
Any help is greatly appreciated.
Thanks
Sandeep
SQL> select * from test;
NAME GPA
Jack 4
Smith 3.7
Rose 3.5
Rachel 3.5
Ram 2.8
SQL> select name,gpa
2 from
3 (select name,gpa,dense_rank() over(order by gpa desc) rn
4 from test)
5 where rn <= 3
6 order by rn;
NAME GPA
Jack 4
Smith 3.7
Rose 3.5
Rachel 3.5 -
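The choice of DENSE_RANK matters here. A small sketch against the same TEST table (columns NAME and GPA as in the session above) puts the three ranking functions side by side; the aliases are only illustrative.

```sql
SELECT name,
       gpa,
       ROW_NUMBER() OVER (ORDER BY gpa DESC) AS row_num,
       RANK()       OVER (ORDER BY gpa DESC) AS rnk,
       DENSE_RANK() OVER (ORDER BY gpa DESC) AS dense_rnk
FROM   test
ORDER  BY gpa DESC;
```

With the data above, Rose and Rachel tie on 3.5. ROW_NUMBER gives them two different numbers in an arbitrary order, RANK gives both 3 and then skips to 5 for Ram, and DENSE_RANK gives both 3 and continues with 4 for Ram. That is why filtering on DENSE_RANK with an input of 4 returns all five students.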
Top n Analysis using correlated subquery
Please explain this query. It does top-n analysis using a correlated subquery. I need an explanation of how this query is executed.
Select distinct a.sal
From emp a
where 1=(select count ( distinct b.sal) from emp b
where a.sal <=b.sal)
Thanks in advance
Try breaking the query down and rewriting it in order to follow the logic:
SQL> --
SQL> -- Start by getting each salary from emp along with a count of all salaries in emp
SQL> --
SQL> select a.sal,
(select count (distinct b.sal) from scott.emp b ) count_sal
from scott.emp a
order by 1 desc
SAL COUNT_SAL
5000 12
3000 12
3000 12
2975 12
2850 12
2450 12
1600 12
1500 12
1300 12
1250 12
1250 12
1100 12
950 12
800 12
14 rows selected.
SQL> --
SQL> -- Add a condition so that only salaries greater than or equal to the current salary are counted
SQL> --
SQL> select a.sal,
(select count (distinct b.sal) from scott.emp b where a.sal <=b.sal) rank_sal
from scott.emp a
order by 1 desc
SAL RANK_SAL
5000 1
3000 2
3000 2
2975 3
2850 4
2450 5
1600 6
1500 7
1300 8
1250 9
1250 9
1100 10
950 11
800 12
14 rows selected.
SQL> --
SQL> -- Add a condition to only pick the nth highest salary
SQL> --
SQL> select a.sal,
(select count (distinct b.sal) from scott.emp b where a.sal <=b.sal) rank_sal
from scott.emp a
where (select count (distinct b.sal) from scott.emp b where a.sal <=b.sal) = 4
SAL RANK_SAL
2850 4
1 row selected.
Hope this helps. -
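The count of distinct salaries greater than or equal to the current one is the same number that DENSE_RANK produces when ordering by salary descending. As a hedged alternative, assuming a release that supports analytic functions, the nth-highest-salary query can be sketched as:

```sql
SELECT DISTINCT sal
FROM  (SELECT sal,
              -- equal salaries share a rank, and no ranks are skipped
              DENSE_RANK() OVER (ORDER BY sal DESC) AS rank_sal
       FROM   scott.emp)
WHERE  rank_sal = 4;
```

This reads the table once instead of re-running the subquery for every row. The value 4 matches the last step above; use 1 to reproduce the original query, which returns the highest salary.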
How to use top-n analysis in oracle 8i?
For example:
I maintain a database of 1000 employees. I want to display the names of the employees who earn the top 10 salaries (and later the top 100 salaries) using a SQL query, in Oracle 8i only. Please help with my problem.
Sorry, my suggestion will return the 10 employees with the highest salaries, not all employees with the 10 highest salaries. To get all employees with the 10 highest salaries in 8i:
SQL> SELECT ename,
2 sal
3 FROM emp
4 WHERE sal IN (
5 SELECT sal
6 FROM (
7 SELECT sal
8 FROM emp
9 GROUP BY sal
10 ORDER BY sal DESC
11 )
12 WHERE rownum <= 10
13 )
14 /
ENAME SAL
KING 5000
FORD 3000
SCOTT 3000
JONES 2975
BLAKE 2850
CLARK 2450
ALLEN 1600
TURNER 1500
MILLER 1300
MARTIN 1250
WARD 1250
ENAME SAL
ADAMS 1100
12 rows selected.
SQL> SY. -
Regarding performance,process chains,initial analysis
Hi All,
This BW system is already developed.
They are doing an HW migration. This time I need to check all of these steps:
Initial analysis of environment
Performance review
Process chain analysis
Troubleshooting items
Please give detailed steps for how to do this.
I am waiting for your mail.
Thanks
Vasu.
Message was edited by:
vasudeva reddy mittapalli
Hi,
For performance check the following links,
FAQ - The Future of SAP NetWeaver Business Intelligence in the Light of the NetWeaver BI&Business Objects Roadmap
Business Intelligence Performance Tuning
http://help.sap.com/saphelp_nw04/helpdata/en/06/b5f8926ba22b45bc9eaa589f1c835b/content.htm
Nice web logs by Vikas Please do check this. On number range buffering,
/people/vikash.agrawal/blog/2006/04/05/load-lots-of-data-147faster148-with-buffering-number-range
docs bw loading performance material
https://www.sdn.sap.com/irj/servlet/prt/portal/prtroot/docs/library/uuid/1955ba90-0201-0010-d3aa-8b2a4ef6bbb2
https://www.sdn.sap.com/irj/servlet/prt/portal/prtroot/docs/library/uuid/3a699d90-0201-0010-bc99-d5c0e3a2c87b
https://www.sdn.sap.com/irj/servlet/prt/portal/prtroot/docs/library/uuid/4c0ab590-0201-0010-bd9a-8332d8b4f09c
and don't miss the BW performance knowledge centre; there is e-learning material there:
Business Intelligence Performance Tuning
Please reward for the same. -
Hi,
How can I find the top 10 SQL queries that consume the most I/O and CPU in an Oracle database?
Currently I use the top command to get the PIDs, and then I get the SQL query by looking up the hash value in v$sqlarea.
Is there any way to get the highest I/O and CPU consumers directly, without looking up PIDs in the top command?
Thanks
hi,
try something along the lines of
select c.* from
(select disk_reads,
buffer_gets,
rows_processed,
executions,
first_load_time,
sql_text
from v$sqlarea
where parsing_user_id !=0
order by
buffer_gets/decode(executions,null,1,0,1,executions) desc ) c
where rownum < 11;
select c.* from
(select disk_reads,
buffer_gets,
rows_processed,
executions,
first_load_time,
sql_text
from v$sqlarea
order by
disk_reads/decode(rows_processed,null,1,0,1,rows_processed) desc ) c
where rownum <11;
or even
--Top 10 by Buffer Gets:
set linesize 100
set pagesize 100
SELECT * FROM
(SELECT substr(sql_text,1,40) sql,
buffer_gets, executions, buffer_gets/executions "Gets/Exec",
hash_value,address
FROM V$SQLAREA
WHERE buffer_gets > 10000
ORDER BY buffer_gets DESC)
WHERE rownum <= 10;
--Top 10 by Physical Reads:
set linesize 100
set pagesize 100
SELECT * FROM
(SELECT substr(sql_text,1,40) sql,
disk_reads, executions, disk_reads/executions "Reads/Exec",
hash_value,address
FROM V$SQLAREA
WHERE disk_reads > 1000
ORDER BY disk_reads DESC)
WHERE rownum <= 10;
--Top 10 by Executions:
set linesize 100
set pagesize 100
SELECT * FROM
(SELECT substr(sql_text,1,40) sql,
executions, rows_processed, rows_processed/executions "Rows/Exec",
hash_value,address
FROM V$SQLAREA
WHERE executions > 100
ORDER BY executions DESC)
WHERE rownum <= 10;
--Top 10 by Parse Calls:
set linesize 100
set pagesize 100
SELECT * FROM
(SELECT substr(sql_text,1,40) sql,
parse_calls, executions, hash_value,address
FROM V$SQLAREA
WHERE parse_calls > 1000
ORDER BY parse_calls DESC)
WHERE rownum <= 10;
--Top 10 by Sharable Memory:
set linesize 100
set pagesize 100
SELECT * FROM
(SELECT substr(sql_text,1,40) sql,
sharable_mem, executions, hash_value,address
FROM V$SQLAREA
WHERE sharable_mem > 1048576
ORDER BY sharable_mem DESC)
WHERE rownum <= 10;
--Top 10 by Version Count:
set linesize 100
set pagesize 100
SELECT * FROM
(SELECT substr(sql_text,1,40) sql,
version_count, executions, hash_value,address
FROM V$SQLAREA
WHERE version_count > 20
ORDER BY version_count DESC)
WHERE rownum <= 10
;
You may have to adjust the column formatting a little to get readable output.
regards
Alan
Edited by: alanm on Dec 22, 2008 4:01 PM -
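The queries above rank by I/O-related figures; the question also asked about CPU. A minimal sketch in the same style, assuming a release in which V$SQLAREA exposes the CPU_TIME and ELAPSED_TIME columns (both reported in microseconds); the alias sql_snippet is only illustrative.

```sql
--Top 10 by CPU Time:
SELECT * FROM
  (SELECT substr(sql_text,1,40) sql_snippet,
          cpu_time, elapsed_time, executions,
          hash_value, address
   FROM   v$sqlarea
   ORDER BY cpu_time DESC)
WHERE rownum <= 10;
```

The figures are cumulative since each statement was loaded into the shared pool, so a statement that is executed very often can rank high even if a single execution is cheap. Dividing by executions gives a per-execution figure, but guard against executions being 0, as the DECODE in the first queries does.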
Poor query performance in WebI on top of BEx queries
Hello,
We are using Web Intelligence in BO 3.1 (SP0) to report on BEx queries (BW 7.0).
We experience however that reporting is much quicker for the same set of data when using BEx Analyzer than when using WebI. In BEx it is acceptable, but in WebI it's not.
How can we optimize the performance in this scenario? What tricks are there?
The amount of data is not huge (max 500 000 records in the cube)
BR,
Fredrik
Hi,
Dennis is right. But if you need to upgrade (and you do), then maybe going to the latest SP is better. It is SP4. With SP3 the WebI option "Query Stripping" was introduced, which improves performance as well.
Also check
http://www.sdn.sap.com/irj/scn/index?rid=/library/uuid/d0bf4691-cdce-2d10-45ba-d1ff39408eb4
http://www.sdn.sap.com/irj/scn/index?rid=/library/uuid/109b7d63-7cab-2d10-2fbc-be5c61dcf110
http://www.sdn.sap.com/irj/scn/index?rid=/library/uuid/006b1374-8f91-2d10-fe9c-f9fa12e2f595
Regards
-Seb. -
Enterprise Manager: under Performance, a click on Hang Analysis fails.
Hi there,
I am on 10.2.0.3.0.
In Enterprise Manager, when I am logged in as sys and go to the Performance menu and then to Hang Analysis
or Top Activity, I get a rejection citing wrong credentials, even though I log in with the right credentials. I am wondering how this can happen when I am already logged in as sys and can access
all other items with no problems.
I am running on three clusters with 3 instances.
Can someone guide me?
Did you check the entries you made in your preferred credentials?
Eric -
Restricting access for top Hierarchy in queries
Hello all,
Since we have a top hierarchy that comes from R/3, to which every company in our organization is attached, is there any way to restrict user access in the queries and authorizations, so that when a user runs a query and tries to access nodes (cost or profit centers or other companies) that are restricted for him/her, the "Authorization Not allowed" message is displayed? We know that the companies cannot be treated as 0co_code but only as nodes, and we also know that we can put all this detail in the role modification, but this would add a manual maintenance process, because every time there is a new cost or profit center, manual maintenance must be done.
We want to have an automatic process since the hierarchy comes from R/3.
Thanks for your help!!
Mrs. Eyda Muñoz
Hi,
You can try look at transaction RSSM and at the very bottom there is a button "fr. hierarchy". This is where you can specify the levels and nodes to restrict to. Then you have to set up a profile in PFCG to provide the restriction.
http://help.sap.com/saphelp_nw04/helpdata/en/80/1a689ae07211d2acb80000e829fbfe/content.htm - this should be able to provide some form of basic understanding.
Hope this helps.
Cheers,
Gim -
Top N used queries and Bottom N used queries
Hi all,
I am a new user on SDN. I have heard that there are a lot of professionals here.
Can anyone give me a complete solution?
I have 30 queries. I want to know the top N most used queries and the bottom N least used queries in the BW system.
Thanks in advance,
Purushotham
Welcome to SDN
Have you tried using conditions. Please check this
http://help.sap.com/saphelp_nw2004s/helpdata/en/1e/7875a998bc44409f6002e28552685a/frameset.htm
http://help.sap.com/saphelp_nw2004s/helpdata/en/a3/3ea1c929a49741b8e93597f067140c/frameset.htm
Hope this helps
Thanks
sat -
Performance of Top Link with clustering
Hi,
We are planning to use TopLink with a J2EE app that will reside on the Oracle 9i app server. We are also planning to cluster the app server, so we will need to configure the TopLink ServerSessions accordingly (is that right?).
Do you think the performance of TopLink will suffer badly if used in this fashion, since the server sessions have to talk to each other continuously?
Thanks,
Krishna
Krishna,
The Cache-Sync feature of TopLink is an excellent way to minimize stale data. The messaging and change-set propagation does not come without a cost though. Typically we try to recommend this for customers with a high percentage of reads. In these cases it is more beneficial to keep the cache and avoid trips to the database.
There is no magic formula for when to use cache-sync. The messaging cost/efficiency is dependent on the mechanism (point to point over RMI/CORBA/IIOP or JMS), the size of the change sets or complexity of transaction, number of servers in the cluster, etc.
What is important to understand though is that cache-sync is just one way to minimize stale cache. It does not ensure you won't have stale data. For this you must configure a locking strategy and adjust your cache relative to the expected life-span you need on cached objects.
For a system like yours I would recommend starting out without cache-sync. Use optimistic locking on all classes and use a weak cache for all classes that have a high occurrence of writes. For the rest I would recommend a Full cache on static/reference data and a soft-weak cache on those that fall in between. The size of the soft-weak cache will determine the number of objects potentially held (LRU) beyond their use.
With these settings you should be able to monitor and account for changes made to stale objects through OptimisticLockExceptions. If you find specific use-cases that are prone to failure you could force the refresh on the query at the start of the UnitOfWork (TX).
This approach will give you a system that ensures data integrity. Then you can experiment with scaling to multiple servers with or without cache-sync and monitor the load vs failures for your test environment.
I hope this helps,
Doug