Combining 4 queries into 1 query, performance degrades

--qry_9Z_isolate dups: finds the duplicate records:
SELECT data.ACCT_NBR
, data.HERITAGE
, data.SYST
, Sum(data.LOAN_COUNT) AS SumOfLOAN_COUNT
FROM data
GROUP BY data.ACCT_NBR
, data.HERITAGE
, data.SYST
HAVING (((Sum(data.LOAN_COUNT))>1));
--qry_9ZZ1_detail on dups:
SELECT [qry_9Z_isolate dups].ACCT_NBR
, [qry_9Z_isolate dups].SumOfLOAN_COUNT
, data.ACCT_NBR, data.HERITAGE
, data.SYST
, data.IDNum
FROM [qry_9Z_isolate dups]
INNER JOIN
data
ON
([qry_9Z_isolate dups].ACCT_NBR = data.ACCT_NBR)
AND ([qry_9Z_isolate dups].HERITAGE = data.HERITAGE)
AND ([qry_9Z_isolate dups].SYST = data.SYST);
--qry_9ZZZ_max to eliminate:
SELECT [qry_9ZZ1_detail on dups].data.ACCT_NBR
, [qry_9ZZ1_detail on dups].HERITAGE
, [qry_9ZZ1_detail on dups].SYST
, Max([qry_9ZZ1_detail on dups].IDNum) AS MaxOfIDNum
FROM [qry_9ZZ1_detail on dups]
GROUP BY [qry_9ZZ1_detail on dups].data.ACCT_NBR,
[qry_9ZZ1_detail on dups].HERITAGE,
[qry_9ZZ1_detail on dups].SYST;
--qry_1A_pull stats for new waterfall:
SELECT data.RUN_YR_MO
, data.SERVICER_NAME
, data.GSE_NONGSE
, data.KNOWN_NOTKNOWN_VACANT
, data.KNOWN_NOTKNOWN_MODDATA
, data.WATERFALL_EXCLUSION
, Sum(data.LOAN_COUNT) AS SumOfLOAN_COUNT
, Sum(data.UPB) AS SumOfUPB
FROM [qry_9ZZZ_max to eliminate] RIGHT JOIN data
ON [qry_9ZZZ_max to eliminate].MaxOfIDNum = data.IDNum
WHERE ((([qry_9ZZZ_max to eliminate].MaxOfIDNum) Is Null))
GROUP BY data.RUN_YR_MO
, data.SERVICER_NAME
, data.GSE_NONGSE
, data.KNOWN_NOTKNOWN_VACANT
, data.KNOWN_NOTKNOWN_MODDATA
, data.WATERFALL_EXCLUSION;
I have created one single query instead of the 4, as below:
SELECT data.RUN_YR_MO
, data.SERVICER_NAME
, data.GSE_NONGSE
, data.KNOWN_NOTKNOWN_VACANT
, data.KNOWN_NOTKNOWN_MODDATA
, data.WATERFALL_EXCLUSION
, Sum(data.LOAN_COUNT) AS SumOfLOAN_COUNT
, Sum(data.UPB) AS SumOfUPB
FROM
(SELECT dup.ACCT_NBR
, dup.HERITAGE
, dup.SYST
, Max(dup.IDNum) AS MaxOfIDNum
FROM
(SELECT data.ACCT_NBR
, data.HERITAGE
, data.SYST
, data.IDNum
FROM
(SELECT ACCT_NBR
, HERITAGE
, SYST
, Sum(LOAN_COUNT) AS SumOfLOAN_COUNT
FROM data
GROUP BY ACCT_NBR, HERITAGE, SYST
HAVING Sum(LOAN_COUNT) > 1) a
, data
WHERE a.ACCT_NBR = data.ACCT_NBR
AND a.HERITAGE = data.HERITAGE
AND a.SYST = data.SYST) dup
GROUP BY dup.ACCT_NBR, dup.HERITAGE, dup.SYST) b
RIGHT JOIN data ON b.MaxOfIDNum = data.IDNum
WHERE b.MaxOfIDNum Is Null
GROUP BY data.RUN_YR_MO
, data.SERVICER_NAME
, data.GSE_NONGSE
, data.KNOWN_NOTKNOWN_VACANT
, data.KNOWN_NOTKNOWN_MODDATA
, data.WATERFALL_EXCLUSION;
This looks like it will degrade performance ... is there an alternative?
Thanks

Anurag,
I don't think "keep (DENSE_RANK first order by IDNUM desc)" is needed in your query; "max(IDNum) over (partition by ACCT_NBR,HERITAGE,SYST) max_id_num" by itself would do.
FIRST/LAST are meaningful only when the column you are applying MAX to is different from the ORDER BY column in the KEEP clause.
Here is an example:
LPALANI@l11gr2>select e.*,
  2  max(hiredate) keep (dense_rank first order by hiredate desc) over (partition by deptno) max_hire_date,
  3  max(hiredate) over (partition by deptno) max_hire_date1
  4  from scott.emp e;
EMPNO ENAME      JOB         MGR HIREDATE               SAL             COMM           DEPTNO MAX_HIRE_ MAX_HIRE_
7782 CLARK      MANAGER    7839 09-JUN-81            2,450                                10 23-JAN-82 23-JAN-82
7839 KING       PRESIDENT       17-NOV-81            5,000                                10 23-JAN-82 23-JAN-82
7934 MILLER     CLERK      7782 23-JAN-82            1,300                                10 23-JAN-82 23-JAN-82
7566 JONES      MANAGER    7839 02-APR-81            2,975                                20 23-MAY-87 23-MAY-87
7902 FORD       ANALYST    7566 03-DEC-81            3,000                                20 23-MAY-87 23-MAY-87
7876 ADAMS      CLERK      7788 23-MAY-87            1,100                                20 23-MAY-87 23-MAY-87
7369 SMITH      CLERK      7902 17-DEC-80              800                                20 23-MAY-87 23-MAY-87
7788 SCOTT      ANALYST    7566 19-APR-87            3,000                                20 23-MAY-87 23-MAY-87
7521 WARD       SALESMAN   7698 22-FEB-81            1,250              500               30 03-DEC-81 03-DEC-81
7844 TURNER     SALESMAN   7698 08-SEP-81            1,500                0               30 03-DEC-81 03-DEC-81
7499 ALLEN      SALESMAN   7698 20-FEB-81            1,600              300               30 03-DEC-81 03-DEC-81
7900 JAMES      CLERK      7698 03-DEC-81              950                                30 03-DEC-81 03-DEC-81
7698 BLAKE      MANAGER    7839 01-MAY-81            2,850                                30 03-DEC-81 03-DEC-81
7654 MARTIN     SALESMAN   7698 28-SEP-81            1,250            1,400               30 03-DEC-81 03-DEC-81
14 rows selected.
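
Applying the same idea to your original problem, the whole four-query chain could collapse into a single pass over the table. This is a sketch only, assuming Oracle, that IDNum uniquely identifies a row in data, and that LOAN_COUNT is NOT NULL:
SELECT RUN_YR_MO
     , SERVICER_NAME
     , GSE_NONGSE
     , KNOWN_NOTKNOWN_VACANT
     , KNOWN_NOTKNOWN_MODDATA
     , WATERFALL_EXCLUSION
     , Sum(LOAN_COUNT) AS SumOfLOAN_COUNT
     , Sum(UPB) AS SumOfUPB
FROM (SELECT d.*
           , Sum(LOAN_COUNT) OVER (PARTITION BY ACCT_NBR, HERITAGE, SYST) AS grp_loan_count
           , Max(IDNum) OVER (PARTITION BY ACCT_NBR, HERITAGE, SYST) AS max_id_num
      FROM data d)
-- drop the max-IDNum row of each duplicate group, keep everything else
WHERE NOT (grp_loan_count > 1 AND IDNum = max_id_num)
GROUP BY RUN_YR_MO, SERVICER_NAME, GSE_NONGSE
       , KNOWN_NOTKNOWN_VACANT, KNOWN_NOTKNOWN_MODDATA, WATERFALL_EXCLUSION;
This reads data once instead of four times, which is usually where the performance of the stacked-query version goes.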

Similar Messages

  • Excel Pivot Table with Date Hierarchies - query performance degradation

    For the sake of this explanation, I’m going to try and keep it simple. Slicing the data by additional dimensions only makes the issue worse. I’ll keep this description to one fact table and three dimensions. Also, I’m fairly new to SSAS Tabular; I’ve worked
    with SSAS Multidimensional in the past.
    We’ve got a fact table that keeps track of bill pay payments made over time. Currently, we only have about six months of data, with the fact row count at just under 900,000 rows. The grain is daily.
    There is an Account dimension (approx. 460,000 rows), with details about the individual making a payment.
    There is a Payment Category dimension (approx.. 35,000 rows), which essentially groups various Payees into groups which we like to report on: Automobile Loan, Mortgage, Insurance, etc.
    There is the requisite Date dimension (exactly 62,093 rows, more days than we need?), which allows visibility as to what is being paid when.
    Using this DW model, I’ve created a SSAS BISM Tabular model, from which Excel 2010 is ultimately used to perform some analysis, using Pivot Tables. In the tabular model, for easier navigation (doing what I’ve always done in SSAS MultiDimensional), I’ve created
    several Date Hierarchies, Year-Month, Year-Quarter-Month, etc.
    There are currently only two measures defined in the Tabular model: one for the “Sum of PaymentAmount”; one for the “PaymentsProcessed”.
    OK, in Excel 2010, using a Pivot Table, drag the “Sum of PaymentAmount” measure to the Values section, next to/under the PivotTable Field List. Not too exciting, just the grand total of all Payments, for all time.
    Drag the “YearMonth” hierarchy (from the Date dimension) to the “Column Labels” section. After expanding the year hierarchy to see the months, now the totals are for each of the months, for which we have data, for June through November, 2013.
    Drag the “PaymentCategory” (from the Payment Categories dimension) to the “Report Filter” section. Filter accordingly: We just want to see the monthly totals for “Automobile Loans”.
    Now, some details. Drag the “AccountSK” (hiding the actual account numbers) to the “Row Labels” section. This shows all accounts that have made Automobile Loan payments over the last six months, showing the actual payment amounts.
    So far, so good. Remember, I’m using a Date Hierarchy here, in this case “YearMonth”
    Now, if any of the other attributes on the Account dimension table, say "CreditScore" or "LongName", are subsequently dragged over to the "Row Labels" section, under the "AccountSK", the results never come back; the query times out, or we give up and press Escape!
    If this exact scenario is repeated with the Date Hierarchy "YearMonth" removed from the "Column Labels" and replaced with the "Year" and "MonthName" attributes from the Date dimension (these fields not being in any sort of hierarchy), adding an additional Account attribute does not cause any substantial delay.
    What I’m trying to find out is why is this happening? Is there anything I can do as a work around, other than what I’ve done by not using a Date Hierarchy? Is this a known issue with DAX and the query conversion to MDX? Something else?
    I've done a SQL Profiler trace, but I'm not sure at this point what it all means. In the MDX query there is a CrossJoin involved. There are also numerous VertiPaq scans which seem to be going through each and every AccountSK in the Account dimension, not just the filtered ones, to get an additional attribute (about 3,600 accounts have "Automobile Loan" payments).
    Any thoughts?
    Thanks! Happy Holidays!
    AAO

    Thanks for your reply Marco. I've been reading your book, too, getting into Tabular.
    I've set up the Excel Pivot Table using either the Year/MonthName levels, or the YearMonth hierarchy and then adding the additional attribute for the CreditScore.
    Incidentally, when using the YearMonth hierarchy and adding the CreditScore, all is well, if the Year has not been "opened". When this is done, I suspect the same thing is going on.
    From SQL Profiler, here are the individual MDX queries (formatted a bit for readability).
    Thanks!
    // MDX query using separate Year and MonthName levels, NO hierarchy.
    SELECT 
    NON EMPTY 
    Hierarchize(
    DrilldownMember(
    CrossJoin(
    {[Date].[Year].[All],[Date].[Year].[Year].AllMembers}, 
    {([Date].[MonthName].[All])}),
    [Date].[Year].[Year].AllMembers, [Date].[MonthName]))
    DIMENSION PROPERTIES PARENT_UNIQUE_NAME,HIERARCHY_UNIQUE_NAME 
    ON COLUMNS, 
    NON EMPTY 
    Hierarchize(
    DrilldownMember(
    CrossJoin(
    {[Accounts].[AccountSK].[All],[Accounts].[AccountSK].[AccountSK].AllMembers}, 
    {([Accounts].[CreditScore].[All])}),
    [Accounts].[AccountSK].[AccountSK].AllMembers, [Accounts].[CreditScore]))
    DIMENSION PROPERTIES PARENT_UNIQUE_NAME,HIERARCHY_UNIQUE_NAME 
    ON ROWS  
    FROM [PscuPrototype] 
    WHERE ([PaymentCategories].[PaymentCategory].&[Automobile Loan],[Measures].[Sum of PaymentAmount]) 
    CELL PROPERTIES VALUE, FORMAT_STRING, LANGUAGE, BACK_COLOR, FORE_COLOR, FONT_FLAGS
    // MDX query using separate YearMonth hierarchy (Year, MonthName).
    SELECT 
    NON EMPTY 
    Hierarchize(
    DrilldownMember(
    {{DrilldownLevel({[Date].[YearMonth].[All]},,,INCLUDE_CALC_MEMBERS)}}, 
    {[Date].[YearMonth].[Year].&[2013]},,,INCLUDE_CALC_MEMBERS))
    DIMENSION PROPERTIES PARENT_UNIQUE_NAME,HIERARCHY_UNIQUE_NAME 
    ON COLUMNS,
    NON EMPTY 
    Hierarchize(
    DrilldownMember(
    CrossJoin(
    {[Accounts].[AccountSK].[All],[Accounts].[AccountSK].[AccountSK].AllMembers}, 
    {([Accounts].[CreditScore].[All])}),
    [Accounts].[AccountSK].[AccountSK].AllMembers, [Accounts].[CreditScore]))
    DIMENSION PROPERTIES PARENT_UNIQUE_NAME,HIERARCHY_UNIQUE_NAME 
    ON ROWS  
    FROM [PscuPrototype] 
    WHERE ([PaymentCategories].[PaymentCategory].&[Automobile Loan],[Measures].[Sum of PaymentAmount]) 
    CELL PROPERTIES VALUE, FORMAT_STRING, LANGUAGE, BACK_COLOR, FORE_COLOR, FONT_FLAGS
    AAO

  • SQL query performance degradation

    Hi,
    I have two select statements:
    1)
    SELECT 1
    FROM wheels.equipment a
    JOIN wheels.equipment_fel a1
    ON(a1.identnr_trunc LIKE a.nr_cztrunc ||'%')
    JOIN wheels.cz b ON(b.cz_id = a1.cz_id)
    WHERE (a.company_id = 21);
    * no performance problem
    * execute in about 2 seconds
    * EXECUTION PLAN: https://dl.dropboxusercontent.com/u/368042/zap1.PNG
    2)
    SELECT 1
    FROM wheels.equipment a
    JOIN wheels.equipment_fel a1
    ON(a1.identnr_trunc LIKE a.nr_cztrunc ||'%')
    JOIN wheels.cz b ON(b.cz_id = a1.cz_id)
    JOIN wheels.cz_felgi b1 ON(b1.cz_id = b.cz_id)
    WHERE (a.company_id = 21);
    * add only 1 line: JOIN wheels.cz_felgi b1 ON(b1.cz_id = b.cz_id)
    * big performance problem because execute takes more that 1 minute
    * EXECUTION PLAN: https://dl.dropboxusercontent.com/u/368042/zap2.PNG
    Can you help me with that?

    It's interesting that when using the 2nd query with some modification, as below:
    WITH test AS (
    SELECT b.cz_id as bczid
    FROM wheels.equipment a
    JOIN wheels.equipment_fel a1
    ON(a1.identnr_trunc LIKE a.nr_cztrunc ||'%')
    JOIN wheels.cz b ON(b.cz_id = a1.cz_id)
    WHERE (a.company_id = 21)
    ) SELECT * FROM test t, cz_felgi b1
    WHERE b1.cz_id = t.bczid;
    the problem with the Cartesian join disappears. I am trying to find another solution (without using the WITH clause) and to understand what the real problem is. I'll be glad if you have some suggestions.
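
    One thing worth trying (a sketch only; whether it helps depends on the execution plans in the screenshots) is to keep the first query as an inline view and prevent the optimizer from merging it into the outer query, so the cz_felgi join is applied to its result, similar to what the WITH clause achieved:
    SELECT 1
    FROM (SELECT /*+ NO_MERGE */ b.cz_id AS bczid
          FROM wheels.equipment a
          JOIN wheels.equipment_fel a1
          ON (a1.identnr_trunc LIKE a.nr_cztrunc || '%')
          JOIN wheels.cz b ON (b.cz_id = a1.cz_id)
          WHERE a.company_id = 21) t
    JOIN wheels.cz_felgi b1 ON (b1.cz_id = t.bczid);
    The NO_MERGE hint inside the view asks the optimizer not to merge that query block into the outer query, which often reproduces the WITH-clause behavior without using WITH.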

  • Poor query performance when joining CONTAINS to another table

    We just recently began evaluating Oracle Text for a search solution. We need to be able to search a table that can have over 20+ million rows. Each user may only have visibility to a tiny fraction of those rows. The goal is to have a single Oracle Text index that represents all of the searchable columns in the table (multi column datastore) and provide a score for each search result so that we can sort the search results in descending order by score. What we're seeing is that query performance from TOAD is extremely fast when we write a simple CONTAINS query against the Oracle Text indexed table. However, when we attempt to first reduce the rows the CONTAINS query needs to search by using a WITH we find that the query performance degrades significantly.
    For example, we can find all the records a user has access to from our base table by the following query:
    SELECT d.duns_loc
    FROM duns d
    JOIN primary_contact pc
    ON d.duns_loc = pc.duns_loc
    AND pc.emp_id = :employeeID;
    This query can execute in <100 ms. In the working example, this query returns around 1200 rows of the primary key duns_loc.
    Our search query looks like this:
    SELECT score(1), d.*
    FROM duns d
    WHERE CONTAINS(TEXT_KEY, :search,1) > 0
    ORDER BY score(1) DESC;
    The :search value in this example will be 'highway'. The query can return 246k rows in around 2 seconds.
    2 seconds is good, but we should be able to have a much faster response if the search query did not have to search the entire table, right? Since each user can only "view" records they are assigned to we reckon that if the search operation only had to scan a tiny tiny percent of the TEXT index we should see faster (and more relevant) results. If we now write the following query:
    WITH subset
    AS
    (SELECT d.duns_loc
    FROM duns d
    JOIN primary_contact pc
    ON d.duns_loc = pc.duns_loc
    AND pc.emp_id = :employeeID)
    SELECT score(1), d.*
    FROM duns d
    JOIN subset s
    ON d.duns_loc = s.duns_loc
    WHERE CONTAINS(TEXT_KEY, :search,1) > 0
    ORDER BY score(1) DESC;
    For reasons we have not been able to identify, this query actually takes longer to execute than the sum of the durations of the contributing parts: it takes over 6 seconds to run. Neither we nor our DBA can figure out why this query performs worse than a wide-open search. The wide-open search is not ideal, as the query would end up returning records the user doesn't have access to view.
    Has anyone ever run into something like this? Any suggestions on what to look at or where to go? If anyone would like more information to help in diagnosis, let me know and I'll be happy to provide it here.
    Thanks!!

    Sometimes it can be good to separate the tables into separate sub-query factoring (WITH) clauses, or inline views in the FROM clause, or an IN clause as a WHERE condition. Although there are some differences, using a sub-query factoring (WITH) clause is similar to using an inline view in the FROM clause. However, you should avoid duplication: you should not have the same table in two different places, as in your original query.
    You should have indexes on any columns that the tables are joined on, your statistics should be current, and your domain index should have regular synchronization and optimization, and be periodically rebuilt or dropped and recreated, to keep it performing with maximum efficiency.
    The following demonstration uses a composite domain index (CDI) with FILTER BY, as suggested by Roger, then shows the explained plans for your original query and various others. Your original query has nested loops. All of the others have the same plan without the nested loops. You could also add index hints.
    SCOTT@orcl_11gR2> -- tables:
    SCOTT@orcl_11gR2> CREATE TABLE duns
      2    (duns_loc  NUMBER,
      3       text_key  VARCHAR2 (30))
      4  /
    Table created.
    SCOTT@orcl_11gR2> CREATE TABLE primary_contact
      2    (duns_loc  NUMBER,
      3       emp_id       NUMBER)
      4  /
    Table created.
    SCOTT@orcl_11gR2> -- data:
    SCOTT@orcl_11gR2> INSERT INTO duns VALUES (1, 'highway')
      2  /
    1 row created.
    SCOTT@orcl_11gR2> INSERT INTO primary_contact VALUES (1, 1)
      2  /
    1 row created.
    SCOTT@orcl_11gR2> INSERT INTO duns
      2  SELECT object_id, object_name
      3  FROM   all_objects
      4  WHERE  object_id > 1
      5  /
    76027 rows created.
    SCOTT@orcl_11gR2> INSERT INTO primary_contact
      2  SELECT object_id, namespace
      3  FROM   all_objects
      4  WHERE  object_id > 1
      5  /
    76027 rows created.
    SCOTT@orcl_11gR2> -- indexes:
    SCOTT@orcl_11gR2> CREATE INDEX duns_duns_loc_idx
      2  ON duns (duns_loc)
      3  /
    Index created.
    SCOTT@orcl_11gR2> CREATE INDEX primary_contact_duns_loc_idx
      2  ON primary_contact (duns_loc)
      3  /
    Index created.
    SCOTT@orcl_11gR2> -- composite domain index (cdi) with filter by clause
    SCOTT@orcl_11gR2> -- as suggested by Roger:
    SCOTT@orcl_11gR2> CREATE INDEX duns_text_key_idx
      2  ON duns (text_key)
      3  INDEXTYPE IS CTXSYS.CONTEXT
      4  FILTER BY duns_loc
      5  /
    Index created.
    SCOTT@orcl_11gR2> -- gather statistics:
    SCOTT@orcl_11gR2> EXEC DBMS_STATS.GATHER_TABLE_STATS (USER, 'DUNS')
    PL/SQL procedure successfully completed.
    SCOTT@orcl_11gR2> EXEC DBMS_STATS.GATHER_TABLE_STATS (USER, 'PRIMARY_CONTACT')
    PL/SQL procedure successfully completed.
    SCOTT@orcl_11gR2> -- variables:
    SCOTT@orcl_11gR2> VARIABLE employeeid NUMBER
    SCOTT@orcl_11gR2> EXEC :employeeid := 1
    PL/SQL procedure successfully completed.
    SCOTT@orcl_11gR2> VARIABLE search VARCHAR2(100)
    SCOTT@orcl_11gR2> EXEC :search := 'highway'
    PL/SQL procedure successfully completed.
    SCOTT@orcl_11gR2> -- original query:
    SCOTT@orcl_11gR2> SET AUTOTRACE ON EXPLAIN
    SCOTT@orcl_11gR2> WITH
      2    subset AS
      3        (SELECT d.duns_loc
      4         FROM      duns d
      5         JOIN      primary_contact pc
      6         ON      d.duns_loc = pc.duns_loc
      7         AND      pc.emp_id = :employeeID)
      8  SELECT score(1), d.*
      9  FROM   duns d
    10  JOIN   subset s
    11  ON     d.duns_loc = s.duns_loc
    12  WHERE  CONTAINS (TEXT_KEY, :search,1) > 0
    13  ORDER  BY score(1) DESC
    14  /
      SCORE(1)   DUNS_LOC TEXT_KEY
            18          1 highway
    1 row selected.
    Execution Plan
    Plan hash value: 4228563783
    | Id  | Operation                      | Name              | Rows  | Bytes | Cost (%CPU)| Time     |
    |   0 | SELECT STATEMENT               |                   |     2 |    84 |   121   (4)| 00:00:02 |
    |   1 |  SORT ORDER BY                 |                   |     2 |    84 |   121   (4)| 00:00:02 |
    |*  2 |   HASH JOIN                    |                   |     2 |    84 |   120   (3)| 00:00:02 |
    |   3 |    NESTED LOOPS                |                   |    38 |  1292 |    50   (2)| 00:00:01 |
    |   4 |     TABLE ACCESS BY INDEX ROWID| DUNS              |    38 |  1102 |    11   (0)| 00:00:01 |
    |*  5 |      DOMAIN INDEX              | DUNS_TEXT_KEY_IDX |       |       |     4   (0)| 00:00:01 |
    |*  6 |     INDEX RANGE SCAN           | DUNS_DUNS_LOC_IDX |     1 |     5 |     1   (0)| 00:00:01 |
    |*  7 |    TABLE ACCESS FULL           | PRIMARY_CONTACT   |  4224 | 33792 |    70   (3)| 00:00:01 |
    Predicate Information (identified by operation id):
       2 - access("D"."DUNS_LOC"="PC"."DUNS_LOC")
       5 - access("CTXSYS"."CONTAINS"("D"."TEXT_KEY",:SEARCH,1)>0)
       6 - access("D"."DUNS_LOC"="D"."DUNS_LOC")
       7 - filter("PC"."EMP_ID"=TO_NUMBER(:EMPLOYEEID))
    SCOTT@orcl_11gR2> -- queries with better plans (no nested loops):
    SCOTT@orcl_11gR2> -- subquery factoring (with) clauses:
    SCOTT@orcl_11gR2> WITH
      2    subset1 AS
      3        (SELECT pc.duns_loc
      4         FROM      primary_contact pc
      5         WHERE  pc.emp_id = :employeeID),
      6    subset2 AS
      7        (SELECT score(1), d.*
      8         FROM      duns d
      9         WHERE  CONTAINS (TEXT_KEY, :search,1) > 0)
    10  SELECT subset2.*
    11  FROM   subset1, subset2
    12  WHERE  subset1.duns_loc = subset2.duns_loc
    13  ORDER  BY score(1) DESC
    14  /
      SCORE(1)   DUNS_LOC TEXT_KEY
            18          1 highway
    1 row selected.
    Execution Plan
    Plan hash value: 153618227
    | Id  | Operation                     | Name              | Rows  | Bytes | Cost (%CPU)| Time     |
    |   0 | SELECT STATEMENT              |                   |    38 |  1406 |    83   (5)| 00:00:01 |
    |   1 |  SORT ORDER BY                |                   |    38 |  1406 |    83   (5)| 00:00:01 |
    |*  2 |   HASH JOIN                   |                   |    38 |  1406 |    82   (4)| 00:00:01 |
    |   3 |    TABLE ACCESS BY INDEX ROWID| DUNS              |    38 |  1102 |    11   (0)| 00:00:01 |
    |*  4 |     DOMAIN INDEX              | DUNS_TEXT_KEY_IDX |       |       |     4   (0)| 00:00:01 |
    |*  5 |    TABLE ACCESS FULL          | PRIMARY_CONTACT   |  4224 | 33792 |    70   (3)| 00:00:01 |
    Predicate Information (identified by operation id):
       2 - access("PC"."DUNS_LOC"="D"."DUNS_LOC")
       4 - access("CTXSYS"."CONTAINS"("TEXT_KEY",:SEARCH,1)>0)
       5 - filter("PC"."EMP_ID"=TO_NUMBER(:EMPLOYEEID))
    SCOTT@orcl_11gR2> -- inline views (sub-queries in the from clause):
    SCOTT@orcl_11gR2> SELECT subset2.*
      2  FROM   (SELECT pc.duns_loc
      3            FROM   primary_contact pc
      4            WHERE  pc.emp_id = :employeeID) subset1,
      5           (SELECT score(1), d.*
      6            FROM   duns d
      7            WHERE  CONTAINS (TEXT_KEY, :search,1) > 0) subset2
      8  WHERE  subset1.duns_loc = subset2.duns_loc
      9  ORDER  BY score(1) DESC
    10  /
      SCORE(1)   DUNS_LOC TEXT_KEY
            18          1 highway
    1 row selected.
    Execution Plan
    Plan hash value: 153618227
    | Id  | Operation                     | Name              | Rows  | Bytes | Cost (%CPU)| Time     |
    |   0 | SELECT STATEMENT              |                   |    38 |  1406 |    83   (5)| 00:00:01 |
    |   1 |  SORT ORDER BY                |                   |    38 |  1406 |    83   (5)| 00:00:01 |
    |*  2 |   HASH JOIN                   |                   |    38 |  1406 |    82   (4)| 00:00:01 |
    |   3 |    TABLE ACCESS BY INDEX ROWID| DUNS              |    38 |  1102 |    11   (0)| 00:00:01 |
    |*  4 |     DOMAIN INDEX              | DUNS_TEXT_KEY_IDX |       |       |     4   (0)| 00:00:01 |
    |*  5 |    TABLE ACCESS FULL          | PRIMARY_CONTACT   |  4224 | 33792 |    70   (3)| 00:00:01 |
    Predicate Information (identified by operation id):
       2 - access("PC"."DUNS_LOC"="D"."DUNS_LOC")
       4 - access("CTXSYS"."CONTAINS"("TEXT_KEY",:SEARCH,1)>0)
       5 - filter("PC"."EMP_ID"=TO_NUMBER(:EMPLOYEEID))
    SCOTT@orcl_11gR2> -- ansi join:
    SCOTT@orcl_11gR2> SELECT SCORE(1), duns.*
      2  FROM   duns
      3  JOIN   primary_contact
      4  ON     duns.duns_loc = primary_contact.duns_loc
      5  WHERE  CONTAINS (duns.text_key, :search, 1) > 0
      6  AND    primary_contact.emp_id = :employeeid
      7  ORDER  BY SCORE(1) DESC
      8  /
      SCORE(1)   DUNS_LOC TEXT_KEY
            18          1 highway
    1 row selected.
    Execution Plan
    Plan hash value: 153618227
    | Id  | Operation                     | Name              | Rows  | Bytes | Cost (%CPU)| Time     |
    |   0 | SELECT STATEMENT              |                   |    38 |  1406 |    83   (5)| 00:00:01 |
    |   1 |  SORT ORDER BY                |                   |    38 |  1406 |    83   (5)| 00:00:01 |
    |*  2 |   HASH JOIN                   |                   |    38 |  1406 |    82   (4)| 00:00:01 |
    |   3 |    TABLE ACCESS BY INDEX ROWID| DUNS              |    38 |  1102 |    11   (0)| 00:00:01 |
    |*  4 |     DOMAIN INDEX              | DUNS_TEXT_KEY_IDX |       |       |     4   (0)| 00:00:01 |
    |*  5 |    TABLE ACCESS FULL          | PRIMARY_CONTACT   |  4224 | 33792 |    70   (3)| 00:00:01 |
    Predicate Information (identified by operation id):
       2 - access("DUNS"."DUNS_LOC"="PRIMARY_CONTACT"."DUNS_LOC")
       4 - access("CTXSYS"."CONTAINS"("DUNS"."TEXT_KEY",:SEARCH,1)>0)
       5 - filter("PRIMARY_CONTACT"."EMP_ID"=TO_NUMBER(:EMPLOYEEID))
    SCOTT@orcl_11gR2> -- old join:
    SCOTT@orcl_11gR2> SELECT SCORE(1), duns.*
      2  FROM   duns, primary_contact
      3  WHERE  CONTAINS (duns.text_key, :search, 1) > 0
      4  AND    duns.duns_loc = primary_contact.duns_loc
      5  AND    primary_contact.emp_id = :employeeid
      6  ORDER  BY SCORE(1) DESC
      7  /
      SCORE(1)   DUNS_LOC TEXT_KEY
            18          1 highway
    1 row selected.
    Execution Plan
    Plan hash value: 153618227
    | Id  | Operation                     | Name              | Rows  | Bytes | Cost (%CPU)| Time     |
    |   0 | SELECT STATEMENT              |                   |    38 |  1406 |    83   (5)| 00:00:01 |
    |   1 |  SORT ORDER BY                |                   |    38 |  1406 |    83   (5)| 00:00:01 |
    |*  2 |   HASH JOIN                   |                   |    38 |  1406 |    82   (4)| 00:00:01 |
    |   3 |    TABLE ACCESS BY INDEX ROWID| DUNS              |    38 |  1102 |    11   (0)| 00:00:01 |
    |*  4 |     DOMAIN INDEX              | DUNS_TEXT_KEY_IDX |       |       |     4   (0)| 00:00:01 |
    |*  5 |    TABLE ACCESS FULL          | PRIMARY_CONTACT   |  4224 | 33792 |    70   (3)| 00:00:01 |
    Predicate Information (identified by operation id):
       2 - access("DUNS"."DUNS_LOC"="PRIMARY_CONTACT"."DUNS_LOC")
       4 - access("CTXSYS"."CONTAINS"("DUNS"."TEXT_KEY",:SEARCH,1)>0)
       5 - filter("PRIMARY_CONTACT"."EMP_ID"=TO_NUMBER(:EMPLOYEEID))
    SCOTT@orcl_11gR2> -- in clause:
    SCOTT@orcl_11gR2> SELECT SCORE(1), duns.*
      2  FROM   duns
      3  WHERE  CONTAINS (duns.text_key, :search, 1) > 0
      4  AND    duns.duns_loc IN
      5           (SELECT primary_contact.duns_loc
      6            FROM   primary_contact
      7            WHERE  primary_contact.emp_id = :employeeid)
      8  ORDER  BY SCORE(1) DESC
      9  /
      SCORE(1)   DUNS_LOC TEXT_KEY
            18          1 highway
    1 row selected.
    Execution Plan
    Plan hash value: 3825821668
    | Id  | Operation                     | Name              | Rows  | Bytes | Cost (%CPU)| Time     |
    |   0 | SELECT STATEMENT              |                   |    38 |  1406 |    83   (5)| 00:00:01 |
    |   1 |  SORT ORDER BY                |                   |    38 |  1406 |    83   (5)| 00:00:01 |
    |*  2 |   HASH JOIN SEMI              |                   |    38 |  1406 |    82   (4)| 00:00:01 |
    |   3 |    TABLE ACCESS BY INDEX ROWID| DUNS              |    38 |  1102 |    11   (0)| 00:00:01 |
    |*  4 |     DOMAIN INDEX              | DUNS_TEXT_KEY_IDX |       |       |     4   (0)| 00:00:01 |
    |*  5 |    TABLE ACCESS FULL          | PRIMARY_CONTACT   |  4224 | 33792 |    70   (3)| 00:00:01 |
    Predicate Information (identified by operation id):
       2 - access("DUNS"."DUNS_LOC"="PRIMARY_CONTACT"."DUNS_LOC")
       4 - access("CTXSYS"."CONTAINS"("DUNS"."TEXT_KEY",:SEARCH,1)>0)
       5 - filter("PRIMARY_CONTACT"."EMP_ID"=TO_NUMBER(:EMPLOYEEID))
    SCOTT@orcl_11gR2>
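    As for the regular synchronization and optimization mentioned above, a CONTEXT index is maintained through the CTX_DDL package. A sketch, using the duns_text_key_idx index from the demonstration (requires the appropriate CTXAPP privileges):
    BEGIN
      CTX_DDL.SYNC_INDEX('duns_text_key_idx');  -- apply pending DML to the index
      CTX_DDL.OPTIMIZE_INDEX('duns_text_key_idx', CTX_DDL.OPTLEVEL_FULL);  -- defragment the token table
    END;
    /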

  • Weblogic 8.1.6 and Oracle 9.2.0.8 - query performance

    Folks,
    We are upgrading WebLogic from 8.1.5 to 8.1.6 and Oracle from 9.2.0.6 to 9.2.0.8. We use the Oracle thin client driver for 9.2.0.8 to connect from the application to Oracle.
    When we use the following combination of the stack we see SQL query performance degradation: -
    Oracle 9.2.0.8 database, Oracle 9.2.0.8 driver, WL 8.1.6
    Oracle 9.2.0.8 database, Oracle 9.2.0.1 driver, WL 8.1.6
    We do not see the degradation in case of the following: -
    Oracle 9.2.0.8 database, Oracle 9.2.0.1 driver, WL 8.1.5
    Oracle 9.2.0.6 database, Oracle 9.2.0.1 driver, WL 8.1.5
    This suggests that the problem could be with the WL 8.1.6 version, and I was wondering if any of you have faced this before? The query retrieves a set of data from Oracle, none of which contains the AsciiStream data type, which is noted as a problem in WL 8.1.6, but even that applies only to the WL JDBC drivers.
    Any ideas appreciated.


  • Poor query performance only with migrated 7.0 queries

    Dear Team,
    We are facing a serious query performance issue after migration of queries from 3.5 to 7.0.
    I executed a query in 3.5 with some variable values; it takes a fraction of a second to display the output. But the same migrated query with the same variable entries takes a very long time and gives a time-out error.
    We are not using any aggregates at the InfoProvider level.
    Both queries are based on the same cube, but the 3.5 query takes less time while the 7.0 query takes a very long time when more selections are made.
    I checked for notes but didn't find a specific note for this particular scenario; I found notes only for general query performance improvement.
    I want to know why the same query takes a long time and gives a time-out error only in 7.0. Please also suggest any notes or pointers related to this scenario.
    Regards,
    Chan

    Hi,
    Queries in BI 7.0 are almost the same as queries in 3.x format.
    In order to check whether the problem is in the query runtime (database time) or the Java runtime (probably rendering), you should try running it from RSRT, once in JAVA web and once in ABAP web.
    If the problem is only with JAVA web, then you should take the URL and add &profiling=X at the end.
    After the query execution you can use the statistics, which will be shown at the top of the page.
    In my experience, the problem is in the rendering phase of the query. One thing that can be done is to limit the number of rows shown on each page; that can be done by changing the 0ANALYSIS web template - it's one of the web template parameters.
    Tomer.

  • Use of hints in query performance

    Hi
    Please let me know the actual usage of hints in query tuning - how do we write hints to increase performance?
    Let me know whether the query below will give better performance. If hints are not used, will the query's performance degrade?
    SELECT /*+ ORDERED INDEX (b, jl_br_balances_n1) USE_NL (j b)
    USE_NL (glcc glf) USE_MERGE (gp gsb) */
    b.application_id ,
    b.set_of_books_id ,
    b.personnel_id,
    p.vendor_id Personnel,
    p.segment1 PersonnelNumber,
    p.vendor_name Name
    FROM jl_br_journals j,
    jl_br_balances b,
    gl_code_combinations glcc,
    fnd_flex_values_vl glf,
    gl_periods gp,
    gl_sets_of_books gsb,
    po_vendors p

    942919 wrote:
    Please let me know the actual usage of hints in query tuning - how do we write hints to increase performance?
    The majority of hints would be used to diagnose a performance problem by identifying a better query plan and fixing the underlying reason that the optimizer did not select that plan automatically. Hints used in this way would be removed from the query after the cause of the performance problem was fixed.
    http://docs.oracle.com/cd/E11882_01/server.112/e16638/hintsref.htm#i8327
    Hints change the access paths and methods the optimizer chooses, so before using a hint, you need to understand what the optimizer does, what access methods are, when they are chosen, and what they are best used for.
    To do that you need to read the Performance Tuning Guide
    http://docs.oracle.com/cd/E11882_01/server.112/e16638/toc.htm
    At a minimum reading and understanding these sections -
    http://docs.oracle.com/cd/E11882_01/server.112/e16638/perf_overview.htm#i1006218
    http://docs.oracle.com/cd/E11882_01/server.112/e16638/optimops.htm#i21299
    http://docs.oracle.com/cd/E11882_01/server.112/e16638/ex_plan.htm#i19260
    http://docs.oracle.com/cd/E11882_01/server.112/e16638/stats.htm#i13546
    Then you should be able to use a hint safely.
    Let me know whether the query below will give better performance. If hints are not used, will the query's performance degrade?
    Not true: hints change the performance of queries; they can make them slower as well as faster. Here is an example of an index hint slowing down a query:
    {message:id=1989089}
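    To see for yourself what a hint changes, compare the plans with and without it. A sketch only; emp_deptno_idx is a hypothetical index name used for illustration:
    EXPLAIN PLAN FOR
    SELECT /*+ FULL(e) */ * FROM scott.emp e WHERE e.deptno = 10;
    SELECT * FROM TABLE(DBMS_XPLAN.DISPLAY);
    -- now force the (hypothetical) index and compare
    EXPLAIN PLAN FOR
    SELECT /*+ INDEX(e emp_deptno_idx) */ * FROM scott.emp e WHERE e.deptno = 10;
    SELECT * FROM TABLE(DBMS_XPLAN.DISPLAY);
    If the hinted plan is not actually better for your data volumes, the hint will make the query slower, which is why hints are best treated as a diagnostic tool rather than a fix.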

  • LDAP/SSL performance degradation with 1.6.29/1.6.30

    Hi,
    we are running an application within a Tomcat 6.0.35 server on RHEL 5.7/i386 that queries our company's Active Directory using LDAP over SSL. One of the queries involves expanding a large distribution list. Since the upgrade from JDK 1.6.27 to 1.6.29 (or 1.6.30) the performance of this LDAP query has degraded dramatically, from about 8 seconds to more than 300 seconds. This only happens when encrypting the LDAP connection.
    We are not sure how to debug this further. Which information would we need to provide to get to the root of this? I was thinking that perhaps the Tomcat output with the javax.net.debug=ssl,handshake property set for 1.6.27 and 1.6.29/30 would be sufficient?
    With Java 1.6.29/30, the basic response/reply between the Tomcat and the AD server looks like:
    TP-Processor11, WRITE: TLSv1 Application Data, length = 32
    TP-Processor11, WRITE: TLSv1 Application Data, length = 160
    Thread-270, READ: TLSv1 Application Data, length = 16368
    Thread-270, READ: TLSv1 Application Data, length = 16368
    Thread-270, READ: TLSv1 Application Data, length = 11920
    TP-Processor11, WRITE: TLSv1 Application Data, length = 32
    TP-Processor11, WRITE: TLSv1 Application Data, length = 160
    Thread-270, READ: TLSv1 Application Data, length = 16368
    Thread-270, READ: TLSv1 Application Data, length = 16368
    Thread-270, READ: TLSv1 Application Data, length = 11920
    When using Java 1.6.27, we see:
    TP-Processor12, WRITE: TLSv1 Application Data, length = 208
    Thread-42, READ: TLSv1 Application Data, length = 16368
    Thread-42, READ: TLSv1 Application Data, length = 16368
    Thread-42, READ: TLSv1 Application Data, length = 5696
    TP-Processor12, WRITE: TLSv1 Application Data, length = 208
    Thread-42, READ: TLSv1 Application Data, length = 16368
    Thread-42, READ: TLSv1 Application Data, length = 16368
    Thread-42, READ: TLSv1 Application Data, length = 5696
    Looking at the 32 bytes long requests (with javax.net.debug=all set), we see:
    Padded plaintext before ENCRYPTION: len = 32
    0000: 30 0C C2 32 83 6E 9F D8 8F 5E E8 47 7A 0B 9A F1 0..2.n...^.Gz...
    0010: 7D 44 78 0B 9E 0A 0A 0A 0A 0A 0A 0A 0A 0A 0A 0A .Dx.............
    TP-Processor1, WRITE: TLSv1 Application Data, length = 32
    Which doesn't make a whole lot of sense to us...
    Any help debugging this further would be most welcome.
    Cheers
    Stefan
    Edited by: user9158206 on Jan 12, 2012 6:06 AM

    Since you've determined that your problem is related to the use of TLS, your posting is likely to get a quicker response on the Java Secure Socket Extension (JSSE) forum. When you do get a resolution, please post a link to it on this thread to close the loop. Thanks.
    Arshad Noor
    StrongAuth, Inc.

  • How to improve query performance built on a ODS

    Hi,
    I've built a report on the FI_GL ODS (BW 3.5). The report execution takes almost 1 hour.
    Is there any method to improve or optimize the performance of a query built on an ODS?
    The ODS has a huge volume of data, ~300 million records for 2 years.
    Thanks in advance,
    Guru.

    Hi Raj,
    Here are a few tips which will help you in improving your query performance.
    Checklist for Query Performance
    1. If exclusions exist, make sure they exist in the global filter area. Try to remove exclusions by subtracting out inclusions.
    2. Use Constant Selection to ignore filters in order to move more filters to the global filter area. (Use ABAPer to test and validate that this ensures better code)
    3. Within structures, make sure the filter order exists with the highest level filter first.
    4. Check code for all exit variables used in a report.
    5. Move Time restrictions to a global filter whenever possible.
    6. Within structures, use user exit variables to calculate things like QTD, YTD. This should generate better code than using overlapping restrictions to achieve the same thing. (Use ABAPer to test and validate that this ensures better code).
    7. When queries are written on multiproviders, restrict to InfoProvider in global filter whenever possible. MultiProvider (MultiCube) queries require additional database table joins to read data compared to those queries against standard InfoCubes (InfoProviders), and you should therefore hardcode the infoprovider in the global filter whenever possible to eliminate this problem.
    8. Move all global calculated and restricted key figures to local as to analyze any filters that can be removed and moved to the global definition in a query. Then you can change the calculated key figure and go back to utilizing the global calculated key figure if desired
    9. If Alternative UOM solution is used, turn off query cache.
    10. Set read mode of query based on static or dynamic. Reading data during navigation minimizes the impact on the R/3 database and application server resources because only data that the user requires will be retrieved. For queries involving large hierarchies with many nodes, it would be wise to select Read data during navigation and when expanding the hierarchy option to avoid reading data for the hierarchy nodes that are not expanded. Reserve the Read all data mode for special queries, for instance when a majority of the users need a given query to slice and dice against all dimensions, or when the data is needed for data mining. This mode places heavy demand on database and memory resources and might impact other SAP BW processes and tasks.
    11. Turn off formatting and results rows to minimize Frontend time whenever possible.
    12. Check for nested hierarchies. Always a bad idea.
    13. If "Display as hierarchy" is being used, look for other options to remove it to increase performance.
    14. Use Constant Selection instead of SUMCT and SUMGT within formulas.
    15. Do a review of the order of restrictions in formulas. Do as many restrictions as you can before calculations. Try to avoid calculations before restrictions.
    16. Check Sequential vs Parallel read on Multiproviders.
    17. Turn off warning messages on queries.
    18. Check to see if performance improves by removing text display (Use ABAPer to test and validate that this ensures better code).
    19. Check to see where currency conversions are happening if they are used.
    20. Check aggregation and exception aggregation on calculated key figures. Before aggregation is generally slower and should not be used unless explicitly needed.
    21. Avoid Cell Editor use if at all possible.
    22. Make sure queries are regenerated in production using RSRT after changes to statistics, consistency changes, or aggregates.
    23. Within the free characteristics, filter on the least granular objects first and make sure those come first in the order.

  • Query performance in two environments

    Hi all,
    I have developed simple select queries on a MultiProvider and I am facing issues with query performance in the quality box. A query runs pretty fast in dev and returns results, while the same one dumps in the quality environment, giving a time-out error. This is all the more strange because our dev box currently has comparatively more records than the quality environment.
    On analyzing the query path in both environments, we noticed that the query does an index scan in dev but not in the quality environment, especially when the selection is such that the query is supposed to return a lot of records. Since the query does a sequential scan in quality, it dumps. Is there any setting that I need to make separately in the quality environment?
    Any tips on query optimization would be great help. Thanks
    Regards
    Niranjana

    Execute some of the RSRT tests in QA for the query using the "Execute + Debug" option, and use the tests for MultiProvider and database checks in it; try to compare with Dev as well.
    Hope it Helps
    Chetan
    @CP..

  • When table with clustered columnstore indexe is partitioned the performance degrades if data is located in multiple partitions

    Hello,
    Below I provide complete code to reproduce the behavior I am observing. You could run it in tempdb or any other database; it is not important. The test query provided at the top of the script is pretty silly, but I have observed the same performance degradation with about a dozen queries of varying complexity, so this is just the simplest one I am using as an example here. Note that I also included approximate run times in the script comments (obviously based on what I observed on my machine). Here are the steps, with numbers corresponding to the numbers in the script:
    1. Run script from #1 to #7.  This will create the two test tables, populate them with records (40 mln. and 10 mln.) and build regular clustered indexes.
    2. Run test query (at the top of the script).  Here are the execution statistics:
    Table 'Main'. Scan count 5, logical reads 151435, physical reads 0, read-ahead reads 4, lob logical reads 0, lob physical reads 0, lob read-ahead reads 0.
    Table 'Txns'. Scan count 5, logical reads 74155, physical reads 0, read-ahead reads 7, lob logical reads 0, lob physical reads 0, lob read-ahead reads 0.
    Table 'Workfile'. Scan count 0, logical reads 0, physical reads 0, read-ahead reads 0, lob logical reads 0, lob physical reads 0, lob read-ahead reads 0.
    Table 'Worktable'. Scan count 0, logical reads 0, physical reads 0, read-ahead reads 0, lob logical reads 0, lob physical reads 0, lob read-ahead reads 0.
     SQL Server Execution Times:
       CPU time = 5514 ms, 
    elapsed time = 1389 ms.
    3. Run script from #8 to #9. This will replace regular clustered indexes with columnstore clustered indexes.
    4. Run test query (at the top of the script).  Here are the execution statistics:
    Table 'Txns'. Scan count 4, logical reads 44563, physical reads 0, read-ahead reads 37186, lob logical reads 0, lob physical reads 0, lob read-ahead reads 0.
    Table 'Main'. Scan count 4, logical reads 54850, physical reads 2, read-ahead reads 96862, lob logical reads 0, lob physical reads 0, lob read-ahead reads 0.
    Table 'Worktable'. Scan count 0, logical reads 0, physical reads 0, read-ahead reads 0, lob logical reads 0, lob physical reads 0, lob read-ahead reads 0.
    Table 'Workfile'. Scan count 0, logical reads 0, physical reads 0, read-ahead reads 0, lob logical reads 0, lob physical reads 0, lob read-ahead reads 0.
     SQL Server Execution Times:
       CPU time = 828 ms, 
    elapsed time = 392 ms.
    As you can see, the query is clearly faster. Yay for columnstore indexes! But let's continue.
    5. Run script from #10 to #12 (note that this might take some time to execute). This will move about 80% of the data in both tables to a different partition. You should be able to see that the data has been moved when running step #11.
    6. Run test query (at the top of the script).  Here are the execution statistics:
    Table 'Txns'. Scan count 4, logical reads 44563, physical reads 0, read-ahead reads 37186, lob logical reads 0, lob physical reads 0, lob read-ahead reads 0.
    Table 'Main'. Scan count 4, logical reads 54817, physical reads 2, read-ahead reads 96862, lob logical reads 0, lob physical reads 0, lob read-ahead reads 0.
    Table 'Worktable'. Scan count 0, logical reads 0, physical reads 0, read-ahead reads 0, lob logical reads 0, lob physical reads 0, lob read-ahead reads 0.
    Table 'Workfile'. Scan count 0, logical reads 0, physical reads 0, read-ahead reads 0, lob logical reads 0, lob physical reads 0, lob read-ahead reads 0.
     SQL Server Execution Times:
       CPU time = 8172 ms, 
    elapsed time = 3119 ms.
    And now look, the I/O stats look the same as before, but the performance is the slowest of all our tries!
    I am not going to paste the execution plans or the detailed properties for each of the operators here. They show up as expected: columnstore index scan, parallel/partitioned = true, and both the estimated and actual number of rows are less than during the second run (when all of the data resided in the same partition).
    So the question is: why is it slower?
    Thank you for any help!
    Here is the code to re-produce this:
    --==> Test Query - begin --<===
    DBCC DROPCLEANBUFFERS
    DBCC FREEPROCCACHE
    SET STATISTICS IO ON
    SET STATISTICS TIME ON
    SELECT COUNT(1)
    FROM Txns AS z WITH(NOLOCK)
    LEFT JOIN Main AS mmm WITH(NOLOCK) ON mmm.ColBatchID = 70 AND z.TxnID = mmm.TxnID AND mmm.RecordStatus = 1
    WHERE z.RecordStatus = 1
    --==> Test Query - end --<===
    --===========================================================
    --1. Clean-up
    IF OBJECT_ID('Txns') IS NOT NULL DROP TABLE Txns
    IF OBJECT_ID('Main') IS NOT NULL DROP TABLE Main
    IF EXISTS (SELECT 1 FROM sys.partition_schemes WHERE name = 'PS_Scheme') DROP PARTITION SCHEME PS_Scheme
    IF EXISTS (SELECT 1 FROM sys.partition_functions WHERE name = 'PF_Func') DROP PARTITION FUNCTION PF_Func
    --2. Create partition funciton
    CREATE PARTITION FUNCTION PF_Func(tinyint) AS RANGE LEFT FOR VALUES (1, 2, 3)
    --3. Partition scheme
    CREATE PARTITION SCHEME PS_Scheme AS PARTITION PF_Func ALL TO ([PRIMARY])
    --4. Create Main table
    CREATE TABLE dbo.Main(
    SetID int NOT NULL,
    SubSetID int NOT NULL,
    TxnID int NOT NULL,
    ColBatchID int NOT NULL,
    ColMadeId int NOT NULL,
    RecordStatus tinyint NOT NULL DEFAULT ((1))
    ) ON PS_Scheme(RecordStatus)
    --5. Create Txns table
    CREATE TABLE dbo.Txns(
    TxnID int IDENTITY(1,1) NOT NULL,
    GroupID int NULL,
    SiteID int NULL,
    Period datetime NULL,
    Amount money NULL,
    CreateDate datetime NULL,
    Descr varchar(50) NULL,
    RecordStatus tinyint NOT NULL DEFAULT ((1))
    ) ON PS_Scheme(RecordStatus)
    --6. Populate data (credit to Jeff Moden: http://www.sqlservercentral.com/articles/Data+Generation/87901/)
    -- 40 mln. rows - approx. 4 min
    --6.1 Populate Main table
    DECLARE @NumberOfRows INT = 40000000
    INSERT INTO Main (
    SetID,
    SubSetID,
    TxnID,
    ColBatchID,
    ColMadeID,
    RecordStatus)
    SELECT TOP (@NumberOfRows)
    SetID = ABS(CHECKSUM(NEWID())) % 500 + 1, -- ABS(CHECKSUM(NEWID())) % @Range + @StartValue,
    SubSetID = ABS(CHECKSUM(NEWID())) % 3 + 1,
    TxnID = ABS(CHECKSUM(NEWID())) % 1000000 + 1,
    ColBatchId = ABS(CHECKSUM(NEWID())) % 100 + 1,
    ColMadeID = ABS(CHECKSUM(NEWID())) % 500000 + 1,
    RecordStatus = 1
    FROM sys.all_columns ac1
    CROSS JOIN sys.all_columns ac2
    --6.2 Populate Txns table
    -- 10 mln. rows - approx. 1 min
    SET @NumberOfRows = 10000000
    INSERT INTO Txns (
    GroupID,
    SiteID,
    Period,
    Amount,
    CreateDate,
    Descr,
    RecordStatus)
    SELECT TOP (@NumberOfRows)
    GroupID = ABS(CHECKSUM(NEWID())) % 5 + 1, -- ABS(CHECKSUM(NEWID())) % @Range + @StartValue,
    SiteID = ABS(CHECKSUM(NEWID())) % 56 + 1,
    Period = DATEADD(dd,ABS(CHECKSUM(NEWID())) % 365, '05-04-2012'), -- DATEADD(dd,ABS(CHECKSUM(NEWID())) % @Days, @StartDate)
    Amount = CAST(RAND(CHECKSUM(NEWID())) * 250000 + 1 AS MONEY),
    CreateDate = DATEADD(dd,ABS(CHECKSUM(NEWID())) % 365, '05-04-2012'),
    Descr = REPLICATE(CHAR(65 + ABS(CHECKSUM(NEWID())) % 26), ABS(CHECKSUM(NEWID())) % 20),
    RecordStatus = 1
    FROM sys.all_columns ac1
    CROSS JOIN sys.all_columns ac2
    --7. Add PK's
    -- 1 min
    ALTER TABLE Txns ADD CONSTRAINT PK_Txns PRIMARY KEY CLUSTERED (RecordStatus ASC, TxnID ASC) ON PS_Scheme(RecordStatus)
    CREATE CLUSTERED INDEX CDX_Main ON Main(RecordStatus ASC, SetID ASC, SubSetId ASC, TxnID ASC) ON PS_Scheme(RecordStatus)
    --==> Run test Query --<===
    --===========================================================
    -- Replace regular indexes with clustered columnstore indexes
    --===========================================================
    --8. Drop existing indexes
    ALTER TABLE Txns DROP CONSTRAINT PK_Txns
    DROP INDEX Main.CDX_Main
    --9. Create clustered columnstore indexes (on partition scheme!)
    -- 1 min
    CREATE CLUSTERED COLUMNSTORE INDEX PK_Txns ON Txns ON PS_Scheme(RecordStatus)
    CREATE CLUSTERED COLUMNSTORE INDEX CDX_Main ON Main ON PS_Scheme(RecordStatus)
    --==> Run test Query --<===
    --===========================================================
    -- Move about 80% the data into a different partition
    --===========================================================
    --10. Update "RecordStatus", so that data is moved to a different partition
    -- 14 min (32002557 row(s) affected)
    UPDATE Main
    SET RecordStatus = 2
    WHERE TxnID < 800000 -- range of values is from 1 to 1 mln.
    -- 4.5 min (7999999 row(s) affected)
    UPDATE Txns
    SET RecordStatus = 2
    WHERE TxnID < 8000000 -- range of values is from 1 to 10 mln.
    --11. Check data distribution
    SELECT
    OBJECT_NAME(SI.object_id) AS PartitionedTable
    , DS.name AS PartitionScheme
    , SI.name AS IdxName
    , SI.index_id
    , SP.partition_number
    , SP.rows
    FROM sys.indexes AS SI WITH (NOLOCK)
    JOIN sys.data_spaces AS DS WITH (NOLOCK)
    ON DS.data_space_id = SI.data_space_id
    JOIN sys.partitions AS SP WITH (NOLOCK)
    ON SP.object_id = SI.object_id
    AND SP.index_id = SI.index_id
    WHERE DS.type = 'PS'
    AND OBJECT_NAME(SI.object_id) IN ('Main', 'Txns')
    ORDER BY 1, 2, 3, 4, 5;
    PartitionedTable PartitionScheme IdxName index_id partition_number rows
    Main PS_Scheme CDX_Main 1 1 7997443
    Main PS_Scheme CDX_Main 1 2 32002557
    Main PS_Scheme CDX_Main 1 3 0
    Main PS_Scheme CDX_Main 1 4 0
    Txns PS_Scheme PK_Txns 1 1 2000001
    Txns PS_Scheme PK_Txns 1 2 7999999
    Txns PS_Scheme PK_Txns 1 3 0
    Txns PS_Scheme PK_Txns 1 4 0
    --12. Update statistics
    EXEC sys.sp_updatestats
    --==> Run test Query --<===

    Hello Michael,
    I just simulated the situation and got the same results as in your description. However, I did one more test - I rebuilt the two columnstore indexes after the update (and test run). I got the following details:
    Table 'Txns'. Scan count 8, logical reads 12922, physical reads 1, read-ahead reads 0, lob logical reads 0, lob physical reads 0, lob read-ahead reads 0.
    Table 'Main'. Scan count 8, logical reads 57042, physical reads 1, read-ahead reads 0, lob logical reads 0, lob physical reads 0, lob read-ahead reads 0.
    Table 'Worktable'. Scan count 0, logical reads 0, physical reads 0, read-ahead reads 0, lob logical reads 0, lob physical reads 0, lob read-ahead reads 0.
    Table 'Workfile'. Scan count 0, logical reads 0, physical reads 0, read-ahead reads 0, lob logical reads 0, lob physical reads 0, lob read-ahead reads 0.
    Table 'Worktable'. Scan count 0, logical reads 0, physical reads 0, read-ahead reads 0, lob logical reads 0, lob physical reads 0, lob read-ahead reads 0.
    SQL Server Execution Times:
    CPU time = 251 ms, elapsed time = 128 ms.
    As an explanation of the behavior: because an UPDATE statement against a clustered columnstore index is executed as a DELETE plus an INSERT, you ended up with all of the original row groups having almost all of their data flagged as deleted, plus almost the same number of new row groups holding the updated data. I suppose scanning the deleted bitmap caused the additional slowness at your end, or something related to that "fragmentation".
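    For reference, a sketch of that rebuild step, using the index and table names from the script above (rebuilding drops the deleted rows and recompresses the row groups):
    -- Rebuild both clustered columnstore indexes after the large UPDATE
    ALTER INDEX PK_Txns ON Txns REBUILD;
    ALTER INDEX CDX_Main ON Main REBUILD;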
    Ivan Donev MCITP SQL Server 2008 DBA, DB Developer, BI Developer

  • Query Performance - Query very slow to run

    I have built a query to show payroll costings per month per employee by cost centre for the current fiscal year. The cost centres are selected with a hierarchy variable - it's quite a large hierarchy. The problem is the query takes ages to run - nearly ten minutes. It's built on a DSO so I can't aggregate it. Is there anything I can do to improve performance?

    Hi Joel,
    Walkthrough Checklist for Query Performance:
    1. If exclusions exist, make sure they exist in the global filter area. Try to remove exclusions by subtracting out inclusions.
    2. Use Constant Selection to ignore filters in order to move more filters to the global filter area. (Use ABAPer to test and validate that this ensures better code)
    3. Within structures, make sure the filter order exists with the highest level filter first.
    4. Check code for all exit variables used in a report.
    5. Move Time restrictions to a global filter whenever possible.
    6. Within structures, use user exit variables to calculate things like QTD, YTD. This should generate better code than using overlapping restrictions to achieve the same thing. (Use ABAPer to test and validate that this ensures better code).
    7. When queries are written on multiproviders, restrict to InfoProvider in global filter whenever possible. MultiProvider (MultiCube) queries require additional database table joins to read data compared to those queries against standard InfoCubes (InfoProviders), and you should therefore hardcode the infoprovider in the global filter whenever possible to eliminate this problem.
    8. Move all global calculated and restricted key figures to local as to analyze any filters that can be removed and moved to the global definition in a query. Then you can change the calculated key figure and go back to utilizing the global calculated key figure if desired
    9. If Alternative UOM solution is used, turn off query cache.
    10. Set read mode of query based on static or dynamic. Reading data during navigation minimizes the impact on the R/3 database and application server resources because only data that the user requires will be retrieved. For queries involving large hierarchies with many nodes, it would be wise to select Read data during navigation and when expanding the hierarchy option to avoid reading data for the hierarchy nodes that are not expanded. Reserve the Read all data mode for special queries, for instance when a majority of the users need a given query to slice and dice against all dimensions, or when the data is needed for data mining. This mode places heavy demand on database and memory resources and might impact other SAP BW processes and tasks.
    11. Turn off formatting and results rows to minimize Frontend time whenever possible.
    12. Check for nested hierarchies. Always a bad idea.
    13. If "Display as hierarchy" is being used, look for other options to remove it to increase performance.
    14. Use Constant Selection instead of SUMCT and SUMGT within formulas.
    15. Do review of order of restrictions in formulas. Do as many restrictions as you can before calculations. Try to avoid calculations before restrictions.
    16. Check Sequential vs Parallel read on Multiproviders.
    17. Turn off warning messages on queries.
    18. Check to see if performance improves by removing text display (Use ABAPer to test and validate that this ensures better code).
    19. Check to see where currency conversions are happening if they are used.
    20. Check aggregation and exception aggregation on calculated key figures. Before aggregation is generally slower and should not be used unless explicitly needed.
    21. Avoid Cell Editor use if at all possible.
    22. Make sure queries are regenerated in production using RSRT after changes to statistics, consistency changes, or aggregates.
    23. Within the free characteristics, filter on the least granular objects first and make sure those come first in the order.
    24. Leverage characteristics or navigational attributes rather than hierarchies. Using a hierarchy requires reading temporary hierarchy tables and creates additional overhead compared to characteristics and navigational attributes. Characteristics or navigational attributes therefore give significantly better query performance, especially as the size of the hierarchy (e.g., the number of nodes and levels) and the complexity of the selection criteria increase (see the rough SQL sketch after this list).
    25. If hierarchies are used, minimize the number of nodes to include in the query results. Including all nodes in the query results (even the ones that are not needed or blank) slows down the query processing. The "not assigned" nodes in the hierarchy should be filtered out, and you should use a variable to reduce the number of hierarchy nodes selected.
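    To make point 24 concrete, here is a rough SQL sketch of the difference. This is not BW-generated SQL; every table and column name is invented for illustration.
    -- Flat (navigational-attribute-style) filter: one join, one predicate.
    SELECT SUM(f.amount)
    FROM   fact_table f
    JOIN   dim_costcenter d ON d.cc_id = f.cc_id
    WHERE  d.region = 'EMEA';
    -- Hierarchy-style filter: the leaf set must first be resolved by
    -- walking a node table, which adds an extra (temporary) table and join.
    SELECT SUM(f.amount)
    FROM   fact_table f
    WHERE  f.cc_id IN (
             SELECT h.leaf_cc_id
             FROM   cc_hierarchy h
             START WITH h.node_id = :selected_node
             CONNECT BY PRIOR h.node_id = h.parent_id);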
    Regards
    Vivek Tripathi

  • How to test SQL query performance - reliably?

    I have certain queries and I want to test which one is faster, and how big is the difference.
    How can I do this reliably?
    The problem is, when I execute the queries, Oracle does its caching and execution planning and whatnot, and the results of the queries depend on the order in which I execute them.
    Example: query A and query B, supposed to return the same data.
    query A, run 1: 587 seconds
    query A, run 2: 509 seconds
    query B, run 1: 474 seconds
    query B, run 2: 451 seconds
    It would seem that A is somewhat faster than B, but if I change the order and execute B before A, the results are different.
    Also, I'm running the queries in SQL Developer, and it only fetches the first 100 rows; how can I remove this effect and simulate the real scenario where all rows are fetched?
    I can also use EXPLAIN PLANs and look at the costs, but I'm not sure how much I can trust those either. I understand they are only estimations, and even if cost(a) = 1.5 * cost(b), b could still end up executing faster in practice due to inaccuracies in the cost calculation... right? EDIT: actually, even if cost(a) = 5000 * cost(b), b can still execute faster... it seems like query A's cost is 15836 and B's cost is 3, while A seems to be faster in practice.
    Edited by: user620914 on 19-Jan-2010 01:42
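    One way to take the 100-row fetch limit out of the picture is to fetch every row yourself, so the client timer covers the whole fetch. A minimal PL/SQL sketch, where your_query_here is a placeholder for query A or B (wrapped in a view, or pasted in as the cursor's SELECT):
    SET TIMING ON
    DECLARE
      CURSOR c IS
        SELECT * FROM your_query_here;  -- placeholder for query A or B
      r c%ROWTYPE;
    BEGIN
      OPEN c;
      LOOP
        FETCH c INTO r;                 -- fetch to the last row, not just the first 100
        EXIT WHEN c%NOTFOUND;
      END LOOP;
      CLOSE c;
    END;
    /
    Running each candidate several times and discarding the first (cold-cache) run also reduces the order dependence described above.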

    user620914 wrote:
    I have to say I don't understand your point either :)
    What are you saying, that people should not test their SQL performance? That tools such as autotrace are useless?
    No. What I'm saying is that you need a baseline to make an informed decision about SQL performance.
    What does a 4-second SQL performance mean for query foo? Nothing really. Wearing my dba cap, I would point out that this is actually utterly useless for me to determine the impact of your query on production, or to use it to determine how to scale it.
    If instead you tell me that it hits that table using an index range scan, I know what it is doing and have a far better idea what it will do to the production instance.
    Thus my questioning of this "elapsed time" measurement approach. I as a dba cannot use it... and I'm not sure what benefit (wearing my developer hat) you will find in it either.
    You can form your SQL queries better or worse, or select your table structure / indexes better or worse. Some choices may end up executing orders of magnitude slower than others. Obviously you can't get exact measurements ("this query executes in 43123 ns") and there are a lot of unpredictable variables that affect the end performance. Still, it's often better to test your queries' / tables' performance before implementing them in the application than not.
    Exactly. I'm not questioning the fact that optimising your code (and ALL your code, not just SQL) is a Good Thing (tm) - but how you go about that optimisation process.
    For example, your PL/SQL code fires off a query. It returns on average 10,000 rows, hits a single partition (the SQL enables partition pruning) and then uses a local bitmap index to identify the rows.
    An optimal query by the sounds of it, and one that will perform and scale well.. even when the database instance needs to service a 100 clients using your code and running this query.
    Only, the code does a single bulk collect of all the rows and stuffs them into dedicated process memory (PGA). Servicing 100 clients means that dedicated server memory is now needed for 100 x 10,000 rows - there's insufficient free memory, causing the kernel to start swapping pages in and out of memory heavily as all 100 client sessions are active and want to process the rows returned by the optimal query.
    What happens to scalability and performance now?
    Testing for performance is not simply measuring a query and then trying to use that or extrapolate that to determine application performance and the impact on production.
    It starts with the design of the tables, the design of the application, the writing of the code (application and SQL). It is not something that should be done after the fact, as in "okay, application all done, let's see how she performs!" - and especially not using time as the baseline for performance measurement.
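    To illustrate the PGA point above: the difference between one huge bulk collect and a bounded fetch is essentially a LIMIT clause. A hedged sketch with invented names ("orders" is not a table from this thread):
    DECLARE
      TYPE t_rows IS TABLE OF orders%ROWTYPE;  -- "orders" is an invented example table
      l_rows t_rows;
      CURSOR c IS
        SELECT * FROM orders WHERE order_date >= TRUNC(SYSDATE);
    BEGIN
      OPEN c;
      LOOP
        FETCH c BULK COLLECT INTO l_rows LIMIT 100;  -- bounds per-session PGA use
        EXIT WHEN l_rows.COUNT = 0;
        FOR i IN 1 .. l_rows.COUNT LOOP
          NULL;  -- process each row here
        END LOOP;
      END LOOP;
      CLOSE c;
    END;
    /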

  • Problem with query performance

    Hi All,
    While loading data, if we run the report, will it make any difference to the query performance? I have a cube with 230 million records, and I am running the delta, which has another 230 million records. Now I am trying to run the queries on the cube, and they all timed out.
    But I have a DSO with 480 million records, and the same queries are running fine on the DSO. But I want to run the reports on the cube, not on the DSO - what can I do? Is the data load causing any problem in the query runtime?
    Please advise me on what to do.
    Regards
    Kiran

    Hi,
    My load has finished. Now I have created indexes and tried to run the query, but every query on this cube times out. On top of the DSO we have the same queries, and those are running. What could be the reason, and how can I go ahead and improve the query performance? Please advise me.
    Kiran
    Edited by: kiran kumar on Mar 14, 2009 1:28 AM

  • Input ready query performance

    Hi Experts,
    We are working on input-ready queries. But the input-ready reports are taking a lot of time, around 10 to 15 mins, to display the results, and hence the planning functions like save are also taking a lot of time.
    We can't use the OLAP cache, as these reports are developed on an aggregation level.
    Could somebody guide me on how to improve the performance of input-ready queries?
    Thanks in Advance,
    Raj

    Hi
    You can do repartitioning and reclustering.
    Repartitioning helps you partition the cube even after data has been loaded into it - similar to partitioning, but done after the cube load.
    Reclustering enables related data to be stored in the same extent in the database and increases query performance - similar to clustering, but done after the cube load.
    There is a heap of material available in the BI section of SDN which you can make use of; a rough SQL illustration of what partitioning means at the database level follows.
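    As a hedged illustration only (BW generates and manages its own tables; all names here are invented): at the database level, repartitioning amounts to range-partitioning the fact table by a time characteristic, so a query restricted to one period prunes to one partition.
    CREATE TABLE fact_plan (
      fiscper  NUMBER(7) NOT NULL,  -- e.g. 2009001 for period 001.2009
      cc_id    NUMBER    NOT NULL,
      amount   NUMBER
    )
    PARTITION BY RANGE (fiscper) (
      PARTITION p2008 VALUES LESS THAN (2009001),
      PARTITION p2009 VALUES LESS THAN (2010001),
      PARTITION pmax  VALUES LESS THAN (MAXVALUE)
    );
    -- A query filtered on fiscper now touches a single partition.
    SELECT SUM(amount) FROM fact_plan WHERE fiscper = 2009001;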
    Regards
    N Ganesh
