Aggregation Techniques

Hi all,
I have an OLAP cube with 4 measures and 8 dimensions on a 9.1.0.6 database. The initial load and aggregation now takes 12 hours and the AW size comes to about 60GB (with 3.5 million rows of base data). My temp space is 20GB. When I run the aggregation I do it one measure at a time, i.e. I aggregate one measure, update the cube and commit, then aggregate the next measure and update, and so on. All 4 of my measures use the same aggmap. Will it save me space if I aggregate all 4 measures at one time and then do an update and commit at the end? I can't test this scenario because of the temp space limitation. Thanks in advance for any advice.
--Bharat

Thank you Chris for your reply. I cannot use compressed composites now, as our upgrade to 10.1/2 is not planned until next year. I have some information about my cube setup below, and I hope someone can suggest a few ideas to help me do this better.
Here are my raw data statistics:
Total no. of rows from the fact table being loaded: 2,587,760
Dimension Name - No. of Leaves - No. of tuples without this dim
1. Payment Type - 4 - 2,090,303
2. Region - 80 - 2,533,022
3. Service Type - 57 - 1,050,517
4. Benefit - 151 - 2,495,237
5. Product - 33 - 2,545,214
6. Service Area - 381 - 1,587,233
7. Partner - 932 - 2,007,902
8. Time - 147 - 435,365
I have used the following dimensions in the order listed below in my composite (Composite_1)
1. Payment Type
2. Region
3. Service Type
4. Service Area
5. Product
6. Benefit
7. Partner
My 4 measures are dimensioned by Time and Composite_1. I have used skip-level aggregation. When aggregating, this creates 5,362,606 singles. Please advise if I can make any changes that might help reduce the cube size from its current 60GB. Also, if any of you need additional info on the design, I would be glad to provide it.
Thanks for your help,
-Bharat

Similar Messages

  • Calculation Before Aggregation obsolete in NW2004s

    We are upgrading our sandbox from BW 3.5 to NW2004s and several of our queries use the obsolete Time of Calculation 'Before Aggregation' technique. I searched and found a thread Time of Calculation "Before Aggregation" obsolete in NW2004s in which Klaus Werner recommended the use of exception aggregation in place of the calculation before aggregation method.
We do not yet have our BI Java up and running, and I do not yet have access to Query Designer 7.0, just Query Designer 3.5 under NW2004s. I have not been able to create the solution proposed by Klaus Werner, and I think his solution is only possible using Query Designer 7.0. Is this right, or am I missing something and should I be able to modify my Query Designer 3.5 queries to use the exception aggregation solution proposed by Werner?
    An example of our CKF which currently uses the calculation before aggregation technique is 
    CKF1 is defined as ((KFa >0) AND (KFa < 1000) * KFb + 0)
and we want that resolved at the document level (before multiple documents are summed) to determine if KFa is in this range for a single document. For every document in which KFa is in the above range, we want to sum together KFb. When I input my formula into CKF1, I am not shown any exception aggregation options in Query Designer 3.5 under NW2004s. Am I just missing something, or do I need to wait until I have Query Designer 7.0 to change my method of calculating this?
    Thanks for your help,
    Jeri

    Hi Jeri,
    Could you tell me where you found the document of Klaus Werner?
    Answer:
    I found it in the other thread.
    Thanks,
    Eelco
    Message was edited by:
            Eelco de Vries

  • How to list the employees working under one manager in the same row.

    Hi,
    my emp table has the following data.
    EMPNO ENAME JOB MGR HIREDATE SAL COMM DEPTNO
    7369 SMITH CLERK 7902 17-DEC-80 800 20
    7499 ALLEN SALESMAN 7698 20-FEB-81 1600 300 30
    7521 WARD SALESMAN 7698 22-FEB-81 1250 500 30
    7566 JONES MANAGER 7839 02-APR-81 2975 20
    7654 MARTIN SALESMAN 7698 28-SEP-81 1250 1400 30
    7698 BLAKE MANAGER 7839 01-MAY-81 2850 30
    7782 CLARK MANAGER 7839 09-JUN-81 2450 10
    7788 SCOTT ANALYST 7566 19-APR-87 3000 20
    7839 KING PRESIDENT 17-NOV-81 5000 10
    7844 TURNER SALESMAN 7698 08-SEP-81 1500 0 30
    7876 ADAMS CLERK 7788 23-MAY-87 1100 20
    7900 JAMES CLERK 7698 03-DEC-81 950 30
    7902 FORD ANALYST 7566 03-DEC-81 3000 20
    7934 MILLER CLERK 7782 23-JAN-82 1300 10
I want to group all the employees under one manager and list their names in the same row. Is that possible? When I tried the query below,
    select mgr, count(*) employees from emp where mgr is not null group by mgr, ename;
    I got the result as
    MGR EMPLOYEES
    7566 2
    7698 6
    7782 1
    7788 1
    7839 3
    7902 1
Additionally, would I be able to display the names of the employees under one manager in a row, or at least in some other way? Please share your ideas.

    A summary of different string aggregation techniques can be found here
    http://www.oracle-base.com/articles/10g/StringAggregationTechniques.php
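If you are on 11.2 or later, one technique from that article is LISTAGG; a minimal sketch against the posted emp table (earlier releases need one of the other approaches the article describes):
select mgr,
       count(*) employees,
       listagg(ename, ',') within group (order by ename) emp_names
  from emp
 where mgr is not null
 group by mgr;
This keeps one row per manager, with the employee names collapsed into a single comma-separated column.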

  • Concatenate elements with same name in sequence

    version 9.2
    I have a clob column containing xml - unregistered
    Some of the old xml has multiple <notes> elements and the new xml has one <notes> element. I have made a view that successfully extracts the xml from the column and it works great, but when I have multiple <notes> elements the view fails since I am using extractvalue(). I need to select the <notes> tags concatenated into one column in sequence when I have multiples. I am guessing the sequence should be as they appear in the xml document from top to bottom since there is no sequence attribute. I know how to use xmlsequence and xmltable to get the individual <notes> tags but they are not concatenated. Is there a magic xmlsequence/concatenation function that will do what I want here?
    -- old xml
    <Accident>
       <Case>
          <TRACS_Case_Number Value="7777777"/>
          <Notes>V-1 AND V-2 N/B TRANSIT RD (ST 78\) SLOWING TO MERGE INTO TRAFFIC.  V-3 N/B TRANSIT RD. </Notes>
          <Notes>STRIKES V-2 IN REAR AND PUSHES V-2 INTO V-1 STRIKING V-1 IN REAR WITH FRONT OF V-2 .  </Notes>
          <Notes>NO INJURIES.</Notes>
       </Case>
</Accident>
-- new xml
    <Accident>
       <Case>
          <TRACS_Case_Number Value="7777777"/>
          <Notes>V-1 AND V-2 N/B TRANSIT RD (ST 78) SLOWING TO MERGE INTO TRAFFIC.  V-3 N/B TRANSIT RD. STRIKES V-2 IN REAR AND PUSHES V-2 INTO V-1 STRIKING V-1 IN REAR WITH FRONT OF V-2 .  NO INJURIES.</Notes>
       </Case>
</Accident>
I am also trying to register this XML to improve performance. However, when I see things like this in the XML, I wonder what will happen when I try to register it. The DTD (I have no XSD) currently only supports one <Notes> tag. Do I have to clean up all the XML in the column to match the current DTD before registering? I could also use a good, EASY example of how to register a schema.
    Thanks...

You can use the string aggregation technique to concatenate the Notes elements:
WITH xmltable AS
  (SELECT xmltype('<Accident>
   <Case>
      <TRACS_Case_Number Value="7777777"/>
      <Notes>V-1 AND V-2 N/B TRANSIT RD (ST 78\) SLOWING TO MERGE INTO TRAFFIC.  V-3 N/B TRANSIT RD. </Notes>
      <Notes>STRIKES V-2 IN REAR AND PUSHES V-2 INTO V-1 STRIKING V-1 IN REAR WITH FRONT OF V-2 .  </Notes>
      <Notes>NO INJURIES.</Notes>
   </Case>
</Accident>') xmlcol
   FROM dual)
SELECT SUBSTR(REPLACE(MAX(sys_connect_by_path(notes, ':')), ':', ' '), 2) notes
FROM
  (SELECT extractvalue(t.column_value, '/Notes/text()') notes,
          rownum rn
   FROM xmltable xt,
        TABLE(xmlsequence(EXTRACT(xmlcol, 'Accident/Case/Notes'))) t)
CONNECT BY PRIOR rn = rn - 1 START WITH rn = 1;
NOTES
V-1 AND V-2 N/B TRANSIT RD (ST 78\) SLOWING TO MERGE INTO TRAFFIC.  V-3 N/B TRANSIT RD.  STRIKES V-2 IN REAR AND PUSHES V-2 INTO V-1 STRIKING V-1 IN REAR WITH FRONT OF V-2 .  NO INJURIES.

  • Oracle 10g - To find the corresponding record for a certain row

    Hi all,
    The scenario is like this - Suppose I've got a table with 100+ columns. For a certain row inside, I need to find its corresponding record which is in the same table. The way how I define "corresponding" here is - these two rows should be identical in all attributes but only different in one column, say "id" (primary key).
So how could I achieve this? What I can think of is to fetch all columns of the first row into some pre-defined variables, then use a cursor to loop over the table and match the values of the columns of each row against those variables. But given that we've got 100+ columns in the table, this solution doesn't look practical.
Any advice is greatly appreciated. Thanks.

something to play with, as Solomon suggested (use some other string aggregation technique if you're not on 11g yet)
you'll have to adjust the column_list accordingly
select 'select ' || column_list ||
       ' from ' || :table_name ||
       ' group by ' || column_list ||
       ' having count(*) > 1' the_sql
  from (select listagg(column_name,',') within group (order by column_id) column_list
          from user_tab_cols
         where table_name = :table_name
       )
Regards
Etbin
    Edited by: Etbin on 25.12.2011 16:53
    Sorry, I'd better leave the forum: the title says you're 10g :(
    Providing a link for replacing listagg: http://www.sqlsnippets.com/en/topic-11787.html

  • Derived Column for "PATH"

I have a table with the first 3 columns below. Is there a way to generate the fourth column? I need to string together the location by date (asc). If the location does not change, then I do not string the data. There should be one path per ID. I have tried using window functions but can't seem to figure it out. I also cannot use PL/SQL for this; it needs to be regular SQL. The following is sample data.
    ID DATE Location PATH
    A 01/01/2006 CA CA
    A 01/02/2006 CA CA
    A 01/10/2006 CA CA
    B 01/01/2006 CA CA-TX-CA
    B 01/02/2006 TX CA-TX-CA
    B 01/10/2006 CA CA-TX-CA
    C 01/01/2006 CA CA-NY-CA-TX
    C 01/02/2006 CA CA-NY-CA-TX
    C 01/10/2006 NY CA-NY-CA-TX
    C 01/01/2006 CA CA-NY-CA-TX
    C 01/02/2006 CA CA-NY-CA-TX
    C 01/10/2006 TX CA-NY-CA-TX
    I would greatly appreciate any help on this,
    Sam

Try this (not tested)
SELECT id,
       LTRIM(MAX(SYS_CONNECT_BY_PATH(location,'-'))
       KEEP (DENSE_RANK LAST ORDER BY curr),'-') AS path
FROM   (SELECT id,
               location,
               ROW_NUMBER() OVER (PARTITION BY id ORDER BY date) AS curr,
               ROW_NUMBER() OVER (PARTITION BY id ORDER BY date) -1 AS prev
        FROM   <your_table>)
GROUP BY id
CONNECT BY prev = PRIOR curr AND id = PRIOR id
START WITH curr = 1;
For other string aggregation techniques see
http://www.oracle-base.com/articles/10g/StringAggregationTechniques.php

  • Facing issue in after report trigger in XML publisher

    HI All,
I have an XML template in which I am calling an afterReport trigger like below:
    <dataTrigger name="afterReport" source="apps.testpkg.testupdate_fun(:HEADER_ID)"/>
    I am passing the header_id from my header block to the function.
My function looks like this:
FUNCTION testupdate_fun(P_HEADER_ID IN NUMBER) RETURN BOOLEAN AS
BEGIN
   UPDATE TEST_TABLE SET PROCESS_FLAG = 'P' WHERE HEADER_ID = P_HEADER_ID;
   COMMIT;
   RETURN true;
EXCEPTION
   WHEN OTHERS THEN
      RETURN false;
END;
The problem I am facing is that it updates only one record in the table. In my header data block I get details for several order numbers, with header ids like 1, 2, 3..., but it updates only one header id in the table. Please help me out with how to update the records for all the header_ids.
Thanks
    Thanks

Examples of combining all IDs into a comma-separated list can be found here: http://www.oracle-base.com/articles/misc/string-aggregation-techniques.php . You need to send that list as a parameter to the trigger. In the trigger you can split it again and update the records one by one, or you can build an EXECUTE IMMEDIATE statement to update the table; in that statement the list can be used as a whole, with no need to split it. Please advise on which steps of this process you need more detailed assistance with.
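For the splitting step, a minimal sketch assuming the parameter arrives as a plain comma-separated string such as '1,2,3' (the REGEXP_SUBSTR/CONNECT BY LEVEL split works on 10g and later):
UPDATE test_table
   SET process_flag = 'P'
 WHERE header_id IN
       (SELECT TO_NUMBER(REGEXP_SUBSTR(:id_list, '[^,]+', 1, LEVEL))
          FROM dual
       CONNECT BY REGEXP_SUBSTR(:id_list, '[^,]+', 1, LEVEL) IS NOT NULL);
This updates all the listed header_ids in one statement, so no row-by-row loop is needed.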

  • Pivot in oracle 10g

    Hi Master ,
    Q1>
I have two columns in a table on Oracle 10g.
    data like :
    NAME     DATE
    a     10-JAN-13
    a     11-JAN-13
    a     12-JAN-13
    I want the output like :
    NAME     DATE     
    a     10-JAN-13,11-JAN-13,12-JAN-13.
    Q2>
How can I use PIVOT in Oracle 10g?
Please help.

    Ekalabya wrote:
    I want the output like :
    NAME     DATE     
a     10-JAN-13,11-JAN-13,12-JAN-13.
This looks like string concatenation: http://www.oracle-base.com/articles/misc/string-aggregation-techniques.php
Pivoting in 10g: Pivot function in Oracle 10g???
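For Q1 on 10g (no LISTAGG, no PIVOT), one sketch from that article uses SYS_CONNECT_BY_PATH; the table and column names here are assumptions, and the date column is called dt because DATE is a reserved word:
SELECT name,
       LTRIM(MAX(SYS_CONNECT_BY_PATH(TO_CHAR(dt, 'DD-MON-RR'), ',')), ',') date_list
  FROM (SELECT name, dt,
               ROW_NUMBER() OVER (PARTITION BY name ORDER BY dt) rn
          FROM t)
 START WITH rn = 1
CONNECT BY PRIOR rn = rn - 1 AND PRIOR name = name
 GROUP BY name;
The ROW_NUMBER per name gives each row a position, and the hierarchical walk strings the dates together in that order.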

  • Query to convert Row to column

    Hi all,
I need a query to convert row values into a single column delimited by ','
    create table tb_row_2_col (id number,val varchar2(100));
    insert into tb_row_2_col values (1,'col1');
    insert into tb_row_2_col values (1,'col2');
    insert into tb_row_2_col values (1,'col3');
    insert into tb_row_2_col values (2,'col4');
    insert into tb_row_2_col values (2,'col5');
    commit;
    SQL> select * from tb_row_2_col;
    ID VAL
    1 col1
    1 col2
    1 col3
    2 col4
    2 col5
    SQL>
If I execute a query, the output should be like this:
    ID VAL
    1 col1,col2,col3
    2 col4,col5
    Thanks in advance
    S. Sathish Kumar

Or search the forum for aggregation techniques using the search feature (top right of the current page).
    Nicolas.
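Using the posted table, a version-independent sketch with XMLAGG (this works on 10g as well; RTRIM drops the trailing separator):
SELECT id,
       RTRIM(XMLAGG(XMLELEMENT(e, val || ',') ORDER BY val).EXTRACT('//text()').getStringVal(), ',') val
  FROM tb_row_2_col
 GROUP BY id;
With the sample rows this should give one line per id, e.g. 1 with col1,col2,col3 and 2 with col4,col5.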

  • SQL HELP , URGENT PLEASE

    Hi,
I want some help in writing a SQL query. It's basically a hierarchical query. Let me lay down the table structure first to explain my requirements better.
    PORP_TABLE(NODE_LEVEL int, WBS_ID int, WBS_NUMBER varchar(60), LFT int,RGT int)
    SELECT NODE_LEVEL, WBS_ID, LFT,RGT FROM PROPOSAL_WBS PW WHERE PROPOSAL_REV_ID = 7000
    (SAMPLE DATA)
    NODE WBS
    LEVEL WBS_ID NUMBER LFT RGT
    0 7055 ROOT 1 24
    1 7056 1 2 5
    1 7088 2 6 9
    2 7057 1.1 3 4
    2 7089 2.1 7 8
    2 7091 3.1 11 14
    2 7103 3.2 15 16
    2 7105 4.1 19 20
    1 7090 3 10 17
    3 7092 3.1.1 12 13
    1 7104 4 18 23
    2 7106 4.2 21 22
    ALLOCATION_DETAIL( WBS_ID int, COST_ID int, PERIOD Date, AMOUNT Float)
    sample data
    WBS_ID , COST_ID , PERIOD , AMOUNT
    7057 100 01-jan-2005 5000
    7057 100 01-feb-2005 2000
    7057 100 01-mar-2005 1000
    7057 100 01-apr-2005 6000
    7057 100 01-may-2005 3000
    7057 100 01-jun-2005 45000
    7106 100 01-mar-2005 8000
    7106 100 01-apr-2005 7000
    7106 100 01-may-2005 9000
Now the PORP_TABLE has got the parents and children. Only the leaf nodes in the hierarchy have values stored in the ALLOCATION_DETAIL table. Now here is the scenario:
In the example, 7055 is the root WBS. The leaf WBS are the ones with the max extension in the WBS number (in this case 1.1, 2.1, 3.1.1, 3.2, 4.1 and 4.2).
Now the starting period for each leaf node in the ALLOCATION_DETAIL table could be different. What that means is WBS 1.1 could start in Jan-2003 and WBS 3.1 could be Jul-2005. So the ending periods are also different for different WBS. Some can span 2 years, some 5 years.
So how do I write a query that retrieves the value for all the WBS starting from the MIN(PERIOD) up to the MAX(PERIOD), and it should roll up also? Now there is no CONNECT BY PRIOR or any analytic function available for this. THIS NEEDS TO BE DONE ONLY THROUGH A TRADITIONAL SQL STATEMENT, AND NO DB FUNCTIONS CAN BE USED.
Now if the WBS is a parent node then it should have the sum of all its child nodes for the COST category.
    SO THE RESULT SET SHOULD BRING LIKE THIS
    WBS_NUMBER, PERIOD_NUMER, COST_CATEGORY , AMOUNT
    ROOT
    1
    1.1
    2
    2.1
    3
    3.1
    3.1.1
    3.2
    4
    4.1
    4.2
    ......

    Hi,
Read String Aggregation Techniques
HTH,
Nicolas.

  • Sql select query

    Hi friends ,
I have a table like:
table ABC(
essay_id number(2),
line_id number(3),
line_t varchar2(100)
)
and here my table data goes:
    essay_id line_id line_t
    1 1 abc
    1 2 def
    1 3 ghi
    2 1 klm
    2 2 nop
here a single essay is stored as multiple lines in different rows with the same essay id and different line ids.
I want to concatenate all the lines associated with the same essay id:
for essay_id 1, the output should be abcdefghi
for essay_id 2, the output should be klmnop
I did this with the help of a cursor, but my question is:
can we achieve this with a single select query? Is it possible?
Please help me on this.
Thanks in advance
    Regards,
    Jeyanthi

    What version of Oracle are you using?
    Tim Hall has a page on the various string aggregation techniques that are available. The approaches that are available will depend heavily on the Oracle version.
    Justin

  • Sql query with multiple joins to same table

    I have to write a query for a client to display business officers' names and title along with the business name
    The table looks like this
    AcctNumber
    OfficerTitle
    OfficerName
    RecKey
    90% of the businesses have exactly 4 officer records, although some have less and some have more.
    There is a separate table that has the AcctNumber, BusinessName about 30 other fields that I don’t need
    An individual account can have 30 or 40 records on the other table.
    The client wants to display 1 record per account.
    Initially I wrote a query to join the table to itself:
    Select A.OfficerTtitle, A.OfficerName, B.OfficerTitle, B.OfficerName, C.OfficerTtitle, C.OfficerName, D.OfficerTitle, D.OfficerName where A.AcctNumber = B.AcctNumber and A.AcctNumber = C.AcctNumber and A.AcctNumber = D.AcctNumber
    This returned tons of duplicate rows for each account ( number of records * number of records, I think)
    So added
    And A.RecKey > B.RecKey and B.RecKey > C. RecKey and C.RecKey . D.RecKey
    This works when there are exactly 4 records per account. If there are less than 4 records on the account it skips the account and if there are more than 4 records, it returns multiple rows.
But when I try to join this to the other table to get the business name, I get a row for every record on the other table.
I tried select distinct on the other table, and the query runs forever and never returns anything.
I tried outer joins and subqueries, but no luck so far. I was thinking maybe a subquery - if exists - because I don't know how many records there are on an account, but I don't know how to structure that.
    Any suggestions would be appreciated

    Welcome to the forum!
    user13319842 wrote:
    I have to write a query for a client to display business officers' names and title along with the business name
    The table looks like this
    AcctNumber
    OfficerTitle
    OfficerName
    RecKey
    90% of the businesses have exactly 4 officer records, although some have less and some have more.
    There is a separate table that has the AcctNumber, BusinessName about 30 other fields that I don’t need
    An individual account can have 30 or 40 records on the other table.
The client wants to display 1 record per account.
As someone has already mentioned, you should post CREATE TABLE and INSERT statements for both tables (relevant columns only). You don't have to post a lot of sample data. For example, you need to pick 1 out of 30 or 40 rows (max) for the same account, but it's almost certainly enough if you post only 3 or 4 rows (max) for an account.
Also, post the results you want from the sample data that you post, and explain how you get those results from that data.
    Always say which version of Oracle you're using. This sounds like a PIVOT problem, and a new SELECT .... PIVOT feature was introduced in Oracle 11.1. If you're using Oracle 11, you don't want to have to learn the old way to do pivots. On the other hand, if you have Oracle 10, a solution that uses a new feature that you don't have won't help you.
    Whenever you have a question, please post CREATE TABLE and INSERT statements for some sample data, the results you want from that data, an explanation, and your Oracle version.
    Initially I wrote a query to join the table to itself:
Select A.OfficerTtitle, A.OfficerName, B.OfficerTitle, B.OfficerName, C.OfficerTtitle, C.OfficerName, D.OfficerTitle, D.OfficerName where A.AcctNumber = B.AcctNumber and A.AcctNumber = C.AcctNumber and A.AcctNumber = D.AcctNumber
Be careful, and post the exact code that you're running. The statement above can't be what you ran, because it doesn't have a FROM clause.
    This returned tons of duplicate rows for each account ( number of records * number of records, I think)
    So added
    And A.RecKey > B.RecKey and B.RecKey > C. RecKey and C.RecKey . D.RecKey
    This works when there are exactly 4 records per account. If there are less than 4 records on the account it skips the account and if there are more than 4 records, it returns multiple rows.
    But when I try to l join this to the other table to get the business name, I get a row for every record on the other table
    I tried select distinct on the other table and the query runs for ever and never returns anything
I tried outer joins and subqueries, but no luck so far. I was thinking maybe a subquery - if exists - because I don't know how many records there are on an account, but don't know how to structure that
Any suggestions would be appreciated
Displaying 1 column from n rows as n columns on 1 row is called Pivoting . See the following link for several ways to do pivots:
    SQL and PL/SQL FAQ
    Pivoting requires that you know exactly how many columns will be in the result set. If that number depends on the data in the table, then you might prefer to use String Aggregation , where the output consists of a huge string column, that contains the concatenation of the data from n rows. This big string can be formatted so that it looks like multiple columns. For different string aggregation techniques, see:
    http://www.oracle-base.com/articles/10g/StringAggregationTechniques.php
    The following thread discusses some options for pivoting a variable number of columns:
    Re: Report count and sum from many rows into many columns

  • Instr and substr function in oracle issues

    hi all
    I have an issue to split my filename
    Filename is as below:
    ABCD01_123456789_samplename_13062012_10062012-12-12-12-1.PDF
my output should be as below
Col1          Col2                 col3                 col4            col5             col6
ABCD01    123456789               samplename            13062012    10062012    10062012-12-12-12-1.PDF
can you please help me to split this using any simple method?
thanks in advance
    Edited by: A on Jun 12, 2012 8:25 PM
    Edited by: A on Jun 12, 2012 8:25 PM

    >
    can you please help me to split this using any simple method
    >
    You can split the string into rows using the same technique used in this example
    http://nuijten.blogspot.com/2009/07/splitting-comma-delimited-string-regexp.html
    Just replace the ',' with an underscore.
    Then you can convert the rows to columns using LISTAGG like this example
    http://www.oracle-base.com/articles/misc/string-aggregation-techniques.php
    Search the forum for 'rows to columns' and you will find plenty of other alternatives.
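As a sketch of the whole split in one query (REGEXP_SUBSTR is available from 10g; the literal below is the sample filename, and the fifth piece keeps everything after the fourth underscore):
SELECT REGEXP_SUBSTR(fname, '[^_]+', 1, 1) col1,
       REGEXP_SUBSTR(fname, '[^_]+', 1, 2) col2,
       REGEXP_SUBSTR(fname, '[^_]+', 1, 3) col3,
       REGEXP_SUBSTR(fname, '[^_]+', 1, 4) col4,
       REGEXP_SUBSTR(fname, '[^_]+', 1, 5) col5
  FROM (SELECT 'ABCD01_123456789_samplename_13062012_10062012-12-12-12-1.PDF' fname
          FROM dual);
The posted output splits the fifth piece further into a leading date (col5) and the full tail (col6); assuming the date part is always 8 digits, it can be taken from the fifth piece with SUBSTR(..., 1, 8) rather than split again.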

  • Sql query row format

    Hi .
    Please look at the below table
    EMP_ID EMP_NAME
    1 Micel
    2 Jonh
    3 Steev
I need to display all emp_ids and corresponding emp_names row-wise,
i.e. in the below format:
1 2 3
Micel Jonh Steev
But for this I need to use SQL only.
Summary: I need to display the columns of a table in a row using a SQL query only.
Is there any command in SQL which can do this for me? If not, how can it be done? Can you help me in this regard?

    A summary of different string aggregation techniques can be found here
    http://www.oracle-base.com/articles/10g/StringAggregationTechniques.php

  • Group by groups

    I have a bunch of data I need to transform and "group by groups." Let me explain by an example:
    SQL> create table orig_data as
      2  select distinct job, deptno
      3  from scott.emp e
      4  /
    Table created.
    SQL> select job
      2       , deptno
      3    from orig_data
      4   order by
      5         job
      6       , deptno
      7  /
    JOB           DEPTNO
    ANALYST           20
    CLERK             10
    CLERK             20
    CLERK             30
    MANAGER           10
    MANAGER           20
    MANAGER           30
    PRESIDENT         10
    SALESMAN          30
9 rows selected.
The real-world data is about 5 million rows.
    First I group by job (I use xmlagg here because I am on version 11.1 and therefore no listagg ;-) ):
    SQL> select od.job
      2       , rtrim(xmlagg(xmlelement(d,od.deptno,',').extract('//text()') order by od.deptno),',') deptnos
      3    from orig_data od
      4   group by od.job
      5  /
    JOB       DEPTNOS
    ANALYST   20
    CLERK     10,20,30
    MANAGER   10,20,30
    PRESIDENT 10
SALESMAN  30
I notice here that both the CLERK and MANAGER jobs have the same set of deptnos.
    So if I group by deptnos I can get this result:
    SQL> select s2.deptnos
      2       , rtrim(xmlagg(xmlelement(j,s2.job,',').extract('//text()') order by s2.job),',') jobs
      3    from (
      4     select od.job
      5          , rtrim(xmlagg(xmlelement(d,od.deptno,',').extract('//text()') order by od.deptno),',') deptnos
      6       from orig_data od
      7      group by od.job
      8         ) s2
      9   group by s2.deptnos
    10  /
    DEPTNOS                        JOBS
    10                             PRESIDENT
    10,20,30                       CLERK,MANAGER
    20                             ANALYST
30                             SALESMAN
My requirement is to identify all such unique groups of deptnos in my orig_data table, give each such group a surrogate key in a parent table, and then populate two child tables with the deptnos of each group and the jobs that have that group of deptnos:
    SQL> create table groups (
      2     groupkey number primary key
      3  )
      4  /
    Table created.
    SQL> create table groups_depts (
      2     groupkey number references groups (groupkey)
      3   , deptno number(2)
      4  )
      5  /
    Table created.
    SQL> create table groups_jobs (
      2     groupkey number references groups (groupkey)
      3   , job varchar2(9)
      4  )
      5  /
Table created.
For the surrogate groupkey I can just use a row_number on my group by deptnos query:
    SQL> select row_number() over (order by s2.deptnos) groupkey
      2       , s2.deptnos
      3       , rtrim(xmlagg(xmlelement(j,s2.job,',').extract('//text()') order by s2.job),',') jobs
      4    from (
      5     select od.job
      6          , rtrim(xmlagg(xmlelement(d,od.deptno,',').extract('//text()') order by od.deptno),',') deptnos
      7       from orig_data od
      8      group by od.job
      9         ) s2
    10   group by s2.deptnos
    11  /
      GROUPKEY DEPTNOS                        JOBS
             1 10                             PRESIDENT
             2 10,20,30                       CLERK,MANAGER
             3 20                             ANALYST
         4 30                             SALESMAN
That query I can use for a (slow) insert into my three tables in this simple manner:
    SQL> begin
      2     for g in (
      3        select row_number() over (order by s2.deptnos) groupkey
      4             , s2.deptnos
      5             , rtrim(xmlagg(xmlelement(j,s2.job,',').extract('//text()') order by s2.job),',') jobs
      6          from (
      7           select od.job
      8                , rtrim(xmlagg(xmlelement(d,od.deptno,',').extract('//text()') order by od.deptno),',') deptnos
      9             from orig_data od
    10            group by od.job
    11               ) s2
    12         group by s2.deptnos
    13     ) loop
    14        insert into groups values (g.groupkey);
    15
    16        insert into groups_depts
    17           select g.groupkey
    18                , to_number(regexp_substr(str, '[^,]+', 1, level)) deptno
    19             from (
    20                    select rownum id
    21                         , g.deptnos str
    22                      from dual
    23                  )
    24           connect by instr(str, ',', 1, level-1) > 0
    25                  and id = prior id
    26                  and prior dbms_random.value is not null;
    27
    28        insert into groups_jobs
    29           select g.groupkey
    30                , regexp_substr(str, '[^,]+', 1, level) job
    31             from (
    32                    select rownum id
    33                         , g.jobs str
    34                      from dual
    35                  )
    36           connect by instr(str, ',', 1, level-1) > 0
    37                  and id = prior id
    38                  and prior dbms_random.value is not null;
    39
    40     end loop;
    41  end;
    42  /
PL/SQL procedure successfully completed.
The tables now contain this data:
    SQL> select *
      2    from groups
      3   order by groupkey
      4  /
      GROUPKEY
             1
             2
             3
             4
    SQL> select *
      2    from groups_depts
      3   order by groupkey, deptno
      4  /
      GROUPKEY     DEPTNO
             1         10
             2         10
             2         20
             2         30
             3         20
             4         30
    6 rows selected.
    SQL> select *
      2    from groups_jobs
      3   order by groupkey, job
      4  /
      GROUPKEY JOB
             1 PRESIDENT
             2 CLERK
             2 MANAGER
             3 ANALYST
         4 SALESMAN
I can now from these data get the same result as before (just to test that I have created the desired data):
    SQL> select g.groupkey
      2       , d.deptnos
      3       , j.jobs
      4    from groups g
      5    join (
      6           select groupkey
      7                , rtrim(xmlagg(xmlelement(d,deptno,',').extract('//text()') order by deptno),',') deptnos
      8             from groups_depts
      9            group by groupkey
    10         ) d
    11         on d.groupkey = g.groupkey
    12    join (
    13           select groupkey
    14                , rtrim(xmlagg(xmlelement(j,job,',').extract('//text()') order by job),',') jobs
    15             from groups_jobs
    16            group by groupkey
    17         ) j
    18         on j.groupkey = g.groupkey
    19  /
      GROUPKEY DEPTNOS                        JOBS
             1 10                             PRESIDENT
             2 10,20,30                       CLERK,MANAGER
             3 20                             ANALYST
         4 30                             SALESMAN
So far so good. This all works pretty much as desired - except for a couple of things:
    The very simple loop-based insert code will be slow. Granted, it is a one-time conversion job (in theory - at most it will run a few times), so that is probably acceptable (except for my professional pride ;-) ).
    Worse, though, I have groups where the string aggregation won't work - the aggregated string would need to be about varchar2(10000), which exceeds what a VARCHAR2 can hold in the SQL GROUP BY :-( .
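Stripped of the SQL, what I am after is just two levels of grouping: first collect the deptnos per job, then group together the jobs that share an identical deptno set. A quick Python sketch of that desired result (sample rows copied from the output above; illustrative only, not part of the conversion job):

```python
from collections import defaultdict

# Sample (job, deptno) rows matching the orig_data output shown above.
orig_data = [
    ("PRESIDENT", 10),
    ("CLERK", 10), ("CLERK", 20), ("CLERK", 30),
    ("MANAGER", 10), ("MANAGER", 20), ("MANAGER", 30),
    ("ANALYST", 20),
    ("SALESMAN", 30),
]

# Step 1: collect the distinct deptnos per job (the inner GROUP BY).
deptnos_by_job = defaultdict(set)
for job, deptno in orig_data:
    deptnos_by_job[job].add(deptno)

# Step 2: group jobs sharing an identical deptno set (the outer GROUP BY).
# A frozenset is hashable, so it can serve directly as a grouping key.
jobs_by_deptnos = defaultdict(set)
for job, deptnos in deptnos_by_job.items():
    jobs_by_deptnos[frozenset(deptnos)].add(job)

for deptnos, jobs in sorted(jobs_by_deptnos.items(), key=lambda kv: sorted(kv[0])):
    print(sorted(deptnos), sorted(jobs))
```

This produces the same four groups as the string-aggregation version, without any limit on group size.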
    So I have tried an attempt using collections. First a collection of deptnos:
    SQL> create type deptno_tab_type as table of number(2)
      2  /
    Type created.
    SQL> select od.job
      2       , cast(collect(od.deptno order by od.deptno) as deptno_tab_type) deptnos
      3    from orig_data od
      4   group by od.job
      5  /
    JOB       DEPTNOS
    ANALYST   DEPTNO_TAB_TYPE(20)
    CLERK     DEPTNO_TAB_TYPE(10, 20, 30)
    MANAGER   DEPTNO_TAB_TYPE(10, 20, 30)
    PRESIDENT DEPTNO_TAB_TYPE(10)
     SALESMAN  DEPTNO_TAB_TYPE(30)
All very good - no problems here. But then a collection of jobs:
    SQL> create type job_tab_type as table of varchar2(9)
      2  /
    Type created.
    SQL> select s2.deptnos
      2       , cast(collect(s2.job order by s2.job) as job_tab_type) jobs
      3    from (
      4     select od.job
      5          , cast(collect(od.deptno order by od.deptno) as deptno_tab_type) deptnos
      6       from orig_data od
      7      group by od.job
      8         ) s2
      9   group by s2.deptnos
    10  /
    group by s2.deptnos
    ERROR at line 9:
     ORA-00932: inconsistent datatypes: expected -, got XAL_SUPERVISOR.DEPTNO_TAB_TYPE
Now it fails - I cannot GROUP BY a collection datatype...
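For intuition, this is the same restriction other languages put on grouping keys: a raw collection carries no key semantics of its own. A rough Python analogue (illustrative only, not Oracle behaviour):

```python
# A raw Python list, like a raw Oracle nested table, defines no key
# semantics, so it cannot be used directly as a grouping key.
groups = {}
err = None
try:
    groups[[10, 20, 30]] = "CLERK,MANAGER"   # raises TypeError: unhashable type
except TypeError as e:
    err = e
print("cannot group by a raw collection:", err)

# Wrapping the collection in a type that does define equality (and a
# hash) restores key semantics.
groups[frozenset([10, 20, 30])] = "CLERK,MANAGER"
```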
    I am not asking anyone to write my code, but I know there are sharper brains out there on the forums ;-) .
    Would anyone have an idea of something I might try that will allow me to create these "groups of groups" even for larger groups than string aggregation techniques can handle?
    Thanks for any help, hints or tips ;-)

    The "group-by-collection" issue can be solved by creating a container object on which we define an ORDER method:
    SQL> create type deptno_container as object (
      2    nt deptno_tab_type
      3  , order member function match (o deptno_container) return integer
      4  );
      5  /
    Type created
    SQL> create or replace type body deptno_container as
      2    order member function match (o deptno_container) return integer is
      3    begin
      4      return case when nt = o.nt then 0 else 1 end;
      5    end;
      6  end;
      7  /
    Type body created
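In effect, the ORDER member supplies the equality semantics that GROUP BY needs to compare two containers. A rough Python analogue of the wrapper (illustrative only; the class name mirrors the SQL type, and the frozenset makes the comparison order-insensitive, duplicates in the nested table aside):

```python
class DeptnoContainer:
    """Rough analogue of the deptno_container object type: wraps the
    collection and supplies the equality used for grouping."""

    def __init__(self, deptnos):
        # Order-insensitive, like nested-table equality in SQL.
        self.nt = frozenset(deptnos)

    def __eq__(self, other):
        # Counterpart of the ORDER member returning 0 for a match.
        return isinstance(other, DeptnoContainer) and self.nt == other.nt

    def __hash__(self):
        return hash(self.nt)

a = DeptnoContainer([10, 20, 30])
b = DeptnoContainer([30, 20, 10])
print(a == b)        # equal regardless of element order
print(len({a, b}))   # both collapse onto one grouping key
```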
    Then a multitable INSERT can do the job, after unnesting the collections:
    SQL> insert all
      2    when rn0 = 1 then into groups (groupkey) values (gid)
      3    when rn1 = 1 then into groups_jobs (groupkey, job) values(gid, job)
      4    when rn2 = 1 then into groups_depts (groupkey, deptno) values(gid, deptno)
      5  with all_groups as (
      6    select s2.deptnos
      7         , cast(collect(s2.job order by s2.job) as job_tab_type) jobs
      8         , row_number() over(order by null) gid
      9    from (
    10      select od.job
    11           , deptno_container(
    12               cast(collect(od.deptno order by od.deptno) as deptno_tab_type)
    13             ) deptnos
    14      from orig_data od
    15      group by od.job
    16    ) s2
    17    group by s2.deptnos
    18  )
    19  select gid
    20       , value(j) job
    21       , value(d) deptno
    22       , row_number() over(partition by gid order by null) rn0
    23       , row_number() over(partition by gid, value(j) order by null) rn1
    24       , row_number() over(partition by gid, value(d) order by null) rn2
    25  from all_groups t
    26     , table(t.jobs) j
    27     , table(t.deptnos.nt) d
    28  ;
    15 rows inserted
    SQL> select * from groups;
      GROUPKEY
             1
             2
             3
             4
    SQL> select * from groups_jobs;
      GROUPKEY JOB
             1 SALESMAN
             2 PRESIDENT
             3 CLERK
             3 MANAGER
             4 ANALYST
    SQL> select * from groups_depts;
      GROUPKEY DEPTNO
             1     30
             2     10
             3     10
             3     30
             3     20
             4     20
    6 rows selected
    Works great on the sample data, but how this approach scales on a much (much) larger dataset is another story :)
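A note on the rn0/rn1/rn2 columns in the INSERT ALL above: unnesting both collections with TABLE() cross-joins every job with every deptno of the group, so the row_number conditions just keep one row per target key. A small Python sketch of that deduplication (hypothetical stand-in values for one group):

```python
from itertools import product

# Hypothetical unnested group 2: every job paired with every deptno,
# as produced by the TABLE() cross join in the INSERT ALL source query.
gid = 2
jobs = ["CLERK", "MANAGER"]
deptnos = [10, 20, 30]
rows = [(gid, j, d) for j, d in product(jobs, deptnos)]   # 2 x 3 = 6 rows

# Keep only the first occurrence per target table, as the
# rn0 = 1 / rn1 = 1 / rn2 = 1 conditions do.
groups = {g for g, _, _ in rows}
groups_jobs = {(g, j) for g, j, _ in rows}
groups_depts = {(g, d) for g, _, d in rows}
print(sorted(groups), sorted(groups_jobs), sorted(groups_depts))
```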
