Flattening a table

I am running Oracle 9i on RedHat Linux. This problem involves two separate issues.
The schema has 3 tables, A, B, and C. Their primary keys are A_ID, B_ID, and C_ID, respectively. Table B has a foreign key named A_ID which links each row in B to a row in A. Table C has a foreign key named B_ID which links each row in C to a row in B.
Each row in A has one or more rows in B. Each row in B has exactly 3 rows in C.
B has a column named timestamp_B which records when each row was last updated.
I want to construct a query that will return one row of data for each row in A. But appended to A should be the associated data in B as well as the associated data from the 3 rows in C. So I want the returned row to have this layout:
| A data | B data | C1 data | C2 data | C3 data |
where C1, C2, and C3 are the three rows in C that are linked to the row in B.
The 2nd issue is: if there is more than one row in B for a given A_ID, I want only the most recent row in B, as identified by timestamp_B. That is, I want the result to contain only one row for each A_ID.
Can all this be done in one SQL query? If not, it would be helpful to know what SQL to use for either of the two problems.
Thank you.
Ted

select a.*
, b.*
, c1.*
, c2.*
, c3.*
from a
, b
, c c1
, c c2
, c c3
where a.a_id = b.a_id
and b.timestamp_b = (select max(b1.timestamp_b) from b b1 where a.a_id = b1.a_id)
and b.b_id = c1.b_id
and c1.c_id = (select max(c.c_id) from c where b.b_id = c.b_id)
and b.b_id = c2.b_id
and c2.c_id < (select max(c.c_id) from c where b.b_id = c.b_id)
and c2.c_id > (select min(c.c_id) from c where b.b_id = c.b_id)
and b.b_id = c3.b_id
and c3.c_id = (select min(c.c_id) from c where b.b_id = c.b_id)
Not sure if this is the most efficient way to write this, but it'll return what you're looking for if A always has at least 1 B and B always has exactly 3 C.
HTH,
Pete
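The same idea can be tried end to end in a small sketch. This is not Oracle: it uses Python's sqlite3 module as a stand-in, and the sample rows and the a_data, b_data and c_data columns are made up for illustration. Instead of the max/min subqueries on C it orders the three C rows with c1.c_id < c2.c_id < c3.c_id, which yields exactly one combination per B row as long as each B really has exactly 3 rows in C.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.executescript("""
CREATE TABLE a (a_id INTEGER PRIMARY KEY, a_data TEXT);
CREATE TABLE b (b_id INTEGER PRIMARY KEY, a_id INTEGER, b_data TEXT, timestamp_b TEXT);
CREATE TABLE c (c_id INTEGER PRIMARY KEY, b_id INTEGER, c_data TEXT);
INSERT INTO a VALUES (1, 'a1'), (2, 'a2');
-- A row 1 has two B rows; only the newer one (b_id 11) should survive.
INSERT INTO b VALUES (10, 1, 'b-old',  '2004-01-01'),
                     (11, 1, 'b-new',  '2004-02-01'),
                     (12, 2, 'b-only', '2004-01-15');
INSERT INTO c VALUES (100, 10, 'x1'), (101, 10, 'x2'), (102, 10, 'x3'),
                     (103, 11, 'y1'), (104, 11, 'y2'), (105, 11, 'y3'),
                     (106, 12, 'z1'), (107, 12, 'z2'), (108, 12, 'z3');
""")

query = """
SELECT a.a_id, a.a_data, b.b_data, c1.c_data, c2.c_data, c3.c_data
FROM a
JOIN b    ON b.a_id = a.a_id
JOIN c c1 ON c1.b_id = b.b_id
JOIN c c2 ON c2.b_id = b.b_id AND c2.c_id > c1.c_id
JOIN c c3 ON c3.b_id = b.b_id AND c3.c_id > c2.c_id
-- keep only the most recent B row for each A row
WHERE b.timestamp_b = (SELECT MAX(b1.timestamp_b) FROM b b1 WHERE b1.a_id = a.a_id)
ORDER BY a.a_id
"""
rows = conn.execute(query).fetchall()
for row in rows:
    print(row)
```

If two B rows for the same A_ID share the same latest timestamp_B, both would come back, so a tie-breaker (for example the highest B_ID) may be needed on real data.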

Similar Messages

Flattening CDC tables to use in a data warehouse

I have enabled CDC on 3 tables: [Employee, EmployeeAddress, and Address]. I want to flatten the data out and then store it in a warehouse so I can report what changed in a specific time frame.
If an Address is updated, the linking table (EmployeeAddress) is not updated, since the link to the employee already exists. So when I flatten out my data, my CDC tables can't figure out who the address belongs to.
It feels wrong to join my CDC tables to my production tables in some cases. What is a best practice or recommendation for a scenario like this?
My tables look like:
Employee:
EmployeeId | Name
-----------+-----------
e1         | Bob Jones
e2         | Jane Doe
EmployeeAddress:
EmployeeAddressId | EmployeeId | AddressId
------------------+------------+----------
ea1               | e1         | a1
ea2               | e2         | a2
ea3               | e2         | a3
Address:
AddressId | Address
----------+----------------
a1        | 111 Some Street
a2        | 222 Some Street
a3        | 333 Some Street
SELECT *
FROM [cdc].[fn_cdc_get_all_changes_dbo_Employee](sys.fn_cdc_get_min_lsn('dbo_Employee'), sys.fn_cdc_get_max_lsn(), N'all update old') E
JOIN [cdc].[fn_cdc_get_all_changes_dbo_EmployeeAddress](sys.fn_cdc_get_min_lsn('dbo_EmployeeAddress'), sys.fn_cdc_get_max_lsn(), N'all update old') EA ON EA.EmployeeId = E.EmployeeId
JOIN [cdc].[fn_cdc_get_all_changes_dbo_Address](sys.fn_cdc_get_min_lsn('dbo_Address'), sys.fn_cdc_get_max_lsn(), N'all update old') A ON A.AddressId = EA.AddressId
Thanks!

Create a dimensional model instead of flattening. Under normal circumstances this leads to a model with a small fact table and some slowly changing dimension tables. The key problem is to find some meaningful measures (HR related, e.g. work time) and to determine the appropriate grain and fact table type (transaction or snapshot (periodic, accumulating or temporal)).
See also Kimball Dimensional Modeling Techniques.

How to flatten sparse table records?

We have an application that (unfortunately) stores its data in a "long thin" name-value style format, which we then have to pivot into "wide" records for outputs. We're aiming to do this with materialised views, but want to take advantage of partition change tracking to refresh the views, which imposes some restrictions on how we build our views. In particular, we cannot use sub-queries or analytical functions to fully flatten the data into output records.
Right now, our pivoted materialized views produce something like the following:
Record Key | Date Created | Column1 | Column2 | Column3 | etc
REC01 | 01-JAN-2010 | A | B | | ...
REC01 | 04-JAN-2010 | | C | D | ...
What we need is to flatten these records so that there is only one record with a given Record Key, and having only the most recent non-NULL value for each ColumnN e.g.:
Record Key | Column1 | Column2 | Column3 | etc
REC01 | A | C | D | ...
The problem is that there can be hundreds of ColumnN columns, may be many separate update-records for a given Record Key (and potentially millions of different Record Keys), so I'm having trouble working out how to do this (a) efficiently or indeed (b) at all.
We can't just use the latest Date Created, because we may need to look at a different Date Created for each column. Because the MV needs to be PCT-capable, it seems we cannot use a sub-query to fetch only the latest value for each ColumnN before pivoting the data, and I can't see how to apply a group or analytical function to do this reliably on the wide records in the pivoted view.
What we need is a kind of FLATTEN() OVER (PARTITION BY RecordKey ORDER BY DateCreated DESC) function, but Oracle seems sadly lacking in this respect. Or am I missing something?
Looking at the data, it looks like it should be pretty simple, but my SQL-brain is running out of juice, so if anybody else can suggest how to approach this, I'd be very grateful.
thanks,
Chris

There's probably a simpler way than this, but it's early in the morning and I haven't woken up properly yet...
SQL> ed
Wrote file afiedt.buf
with t as (select 'REC01' as record_key, to_date('01-JAN-2010','DD-MON-YYYY') as date_created, 'A' as column1, 'B' as column2, null as column3 from dual union all
           select 'REC01', to_date('04-JAN-2010','DD-MON-YYYY'), null, 'C', 'D' from dual union all
           select 'REC02', to_date('06-JAN-2010','DD-MON-YYYY'), 'X', null, 'Z' from dual)
-- end of test data
select record_key, date_created, column1, column2, column3
from (
       select record_key
             ,last_value(date_created) over (partition by record_key order by date_created) as date_created
             ,last_value(column1 ignore nulls) over (partition by record_key order by date_created) as column1
             ,last_value(column2 ignore nulls) over (partition by record_key order by date_created) as column2
             ,last_value(column3 ignore nulls) over (partition by record_key order by date_created) as column3
             ,row_number() over (partition by record_key order by date_created desc) as rn
       from t
      )
where rn = 1
SQL> /
RECOR DATE_CREATED        C C C
REC01 04/01/2010 00:00:00 A C D
REC02 06/01/2010 00:00:00 X   Z
SQL>
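For anyone who wants to experiment with the "latest non-NULL value per column" rule outside Oracle, here is a minimal sketch using Python's sqlite3 module. It is an assumption-laden stand-in: it does not use analytic functions at all, but one correlated subquery per column, so it would not satisfy the PCT restrictions described in the question. It only illustrates the expected result on the same test data.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.executescript("""
CREATE TABLE t (record_key TEXT, date_created TEXT,
                column1 TEXT, column2 TEXT, column3 TEXT);
INSERT INTO t VALUES ('REC01', '2010-01-01', 'A',  'B',  NULL),
                     ('REC01', '2010-01-04', NULL, 'C',  'D'),
                     ('REC02', '2010-01-06', 'X',  NULL, 'Z');
""")

def latest_non_null(column):
    # Column names come from the fixed list below, never from user input.
    return (
        f"(SELECT s.{column} FROM t s "
        f"WHERE s.record_key = k.record_key AND s.{column} IS NOT NULL "
        f"ORDER BY s.date_created DESC LIMIT 1) AS {column}"
    )

columns = ["column1", "column2", "column3"]
select_list = ", ".join(latest_non_null(c) for c in columns)
query = (
    f"SELECT k.record_key, {select_list} "
    "FROM (SELECT DISTINCT record_key FROM t) k "
    "ORDER BY k.record_key"
)
rows = conn.execute(query).fetchall()
for row in rows:
    print(row)
```

Building the select list from a list of column names keeps the statement manageable when there are hundreds of ColumnN columns, although one subquery per column is unlikely to scale to millions of record keys.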

Multiple tables in a recordset.

Hi all,
I have got a bit stuck with creating an advanced recordset that calls from multiple sub tables. I have created the SQL statement below (originally used inner joins but the code got too long and still didn't work).
SELECT *
FROM Main_Table, Table_1, Table_2, Table_3, Table_4, Table_5, Table_6, Table_7, Table_8, Table_9, Table_10, Table_11, Table_12, Table_13, Table_14, Table_15
WHERE
(((((((((((((((((Main_Table.ID_1 = Table_1.ID_1),
Main_Table.ID_2 = Table_2.ID),
Main_Table.ID_3 = Table_3.ID),
Main_Table.ID_4 = Table_4.ID),
Main_Table.ID_5 = Table_5.ID),
Main_Table.ID_6 = Table_6.ID),
Main_Table.ID_7 = Table_7.ID),
Main_Table.ID_8 = Table_8.ID),
Main_Table.ID_9 = Table_9.ID),
Main_Table.ID_10 = Table_10.ID),
Main_Table.ID_10 = Table_11.ID),
Main_Table.ID_11 = Table_12.ID),
Main_Table.ID_11 = Table_13.ID),
Main_Table.ID_11 = Table_14.ID),
Main_Table.ID_12 = Table_15.ID),
AND cust_id = colname
The idea is to get some text values from the sub tables by passing through the relevant ID of the main table to the sub tables ID. To add to the complexity the "cust_id = colname" is a passthrough from a prior page that only brings up a specific record from the main table. Further complexity may come from the lines where a single ID from the main table (ID_10 and ID_11) is used to call from multiple sub tables.
Ideally at the end of this I just want to pull the value from the recordset and drop into the page design. The page is PHP and the database/web server is using WAMP, if that helps?
I feel I'm so close (but I'm probably nowhere near!). Can anyone help me before I lose my hair and develop a nervous tic? Unfortunately, I am still a bit of a novice with these things, so if it's my syntax or a coding issue please could you explain it in noob terms for me?
Thank you in advance!

The database was an access DB of 1 main table and 15 related tables that was then converted to MYSQL. The idea is that it will serve two purposes from web pages setup to point to it;
1) To add records to the main database with dropdowns. Certain data will be shown depending on the dropdown selected.
2) To show a data-only details screen.
The data is complicated survey data and it was decided that the main table would store the variables and the sub tables would store the constants (Between 2 to 6 records) using the ID to link them.
e.g.
Name = Main table.name
Address = Main table.address
Position = Table 2.position
Department = Table 2.Department
etc...
Have tried the AND statement above but the system just times out.
Would it be easier to create multiple recordsets (15 recordsets of 1 main table and 1 sub table) and link it somehow to the cust_id? If so, how would I best set each up so only the relevant fields are pulled from the sub table for each recordset? Or am I being too adventurous and should I just give up and flatten the tables to 1 single table?
Am starting to thin up top with this one! lol!
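For what it's worth, join conditions are normally combined with AND (or written as JOIN ... ON), not separated by commas, and every lookup table needs its own condition or the result multiplies out. Below is a minimal sketch of that pattern with only two lookup tables. It uses Python's sqlite3 module rather than PHP/MySQL, and the column names (label, id_1, id_2) and sample rows are hypothetical. LEFT JOIN is used so the main record still comes back when a lookup ID is empty.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.executescript("""
-- Hypothetical lookup tables holding the "constants"
CREATE TABLE table_1 (id INTEGER PRIMARY KEY, label TEXT);
CREATE TABLE table_2 (id INTEGER PRIMARY KEY, label TEXT);
-- Main table holding the "variables" plus one ID per lookup
CREATE TABLE main_table (cust_id INTEGER PRIMARY KEY, name TEXT,
                         id_1 INTEGER, id_2 INTEGER);
INSERT INTO table_1 VALUES (1, 'Manager'), (2, 'Clerk');
INSERT INTO table_2 VALUES (1, 'Sales');
INSERT INTO main_table VALUES (1, 'Ann', 1, 1), (2, 'Bob', 2, NULL);
""")

query = """
SELECT m.name, t1.label AS position, t2.label AS department
FROM main_table m
LEFT JOIN table_1 t1 ON t1.id = m.id_1
LEFT JOIN table_2 t2 ON t2.id = m.id_2
WHERE m.cust_id = ?
"""
# The cust_id passed in from the prior page is bound as a parameter.
row = conn.execute(query, (2,)).fetchone()
print(row)
```

Selecting only the text columns that the page needs, instead of SELECT *, also avoids name clashes between the many ID columns.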

Ideas on how to best flatten a structure

Hello,
I need to take the data in a table and flatten it. e.g.
The data is stored like this:
Id Description
1 Cat
2 Dog
3 Bird
4 Tree
I have to make a table with the "Description" elements as columns. i.e. the final table would have the following columns:
Cat Dog Bird Tree ...
The number of columns is not fixed, it will be based on how many rows are in the original table.
I can think of a few pretty rough ways of doing it. i.e. Using Unix Shell/SQL to generate a create table script i.e.
CREATE TABLE descriptions
( cat VARCHAR2(100),
bird VARCHAR2(100)
etc);
Is there any way I can do this in straight PL/SQL? Can I write dynamic DDL (I assume I can)? Or is there an even more powerful/elegant way of flattening a table in this manner in Oracle 9i?
Thanks,
Dan.

Tom Kyte has some examples of pivot queries to do this sort of thing, but I don't believe he's extended them to handle an arbitrary number of columns. You should be able to turn Tom's static SQL queries into dynamic SQL queries, though, which would get you to your arbitrary columns requirement.
Justin
Distributed Database Consulting, Inc.
http://www.ddbcinc.com/askDDBC
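To make the dynamic SQL idea concrete, here is a rough sketch of generating the CREATE TABLE statement from the rows of the source table. It is written in Python against sqlite3 rather than PL/SQL, and the table name source and the helper quote_identifier are made up for the example. In PL/SQL the generated string would be run with EXECUTE IMMEDIATE instead.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.executescript("""
CREATE TABLE source (id INTEGER PRIMARY KEY, description TEXT);
INSERT INTO source VALUES (1, 'Cat'), (2, 'Dog'), (3, 'Bird'), (4, 'Tree');
""")

def quote_identifier(name):
    # Quote the value so it is safe to use as a column name.
    return '"' + name.replace('"', '""') + '"'

# One column per row of the source table, in id order.
names = [r[0] for r in conn.execute("SELECT description FROM source ORDER BY id")]
column_defs = ", ".join(f"{quote_identifier(n)} TEXT" for n in names)
ddl = f"CREATE TABLE descriptions ({column_defs})"
conn.execute(ddl)

# PRAGMA table_info returns one row per column; index 1 is the column name.
created = [r[1] for r in conn.execute("PRAGMA table_info(descriptions)")]
print(ddl)
print(created)
```

Because the column list is derived from data, the descriptions need to be valid and unique as column names, which is worth checking before running the generated DDL.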

Pivot Table, "Insert Page Break After Each Item" Setting Only Works for the First Item Change

I have a flattened pivot table generated from Powerpivot and I would like to insert a page break for each change in the row item.
When I use the pivot table Field Settings>Insert Page Break After Each Item, Excel inserts the first page break then returns to normal pagination for the rest of the output.
Is there another setting required to maintain the page breaks after the item change?
Thanks.

We are experiencing the same problem. Did you ever find a solution?

Changing into parent-child relation

Hi,
I have a table which has the following fields and the table gives the identifier for parent-child relation:
ID
PARENT ID
structure
parentkey_field
foreignkey_field
Fields[]
data[]
1.ID is the numbering like 1,2,3...
2.PARENT ID will give the relation like which structure is the child for which structure.
Example: I have two structures HEADER and ITEM.Then the IDs will be 1 and 2.The parent id for ITEM will be 1(ID OF HEADER).
3.Structure is the header/item structure names.
4.parentkey_field and foreignkey_field are defining the header-child relation identifier fields.
5.fields[] is the table with all the fields of header and item.
6.DATA[] is the table having the data of header/item fields.
Now my requirement is to flatten this table into one more table(with field structure_name,fieldname,value) with parent-child relation.
Can anyone suggest something.

It would be better if you posted the code you tried first and then having others help you with what's wrong. Flattening a hierarchy is not that difficult; why don't you give it a shot first?

Different elements of different levels of hierarchy

Hi experts,
I have a question about hierarchies obiee 11g.
I need to do a table that contains (plain text, without '+' to drill-down) different elements of different levels of hierarchy
I have a hierarchy Account like this:
Level1
-----Level2
-------Element1Level2
-------Element2Level2
------------Level3
--------------Element1Level3
-------Element3Level2
My assignment is to do a table like this
Level1---------------
Element2Level2--
Element1Level3--
Element3Level2--
So I don't want to drill-down on Element2Level2 to see Elements of Level3...
How can I achieve this??
Thanks!

If I am not wrong, you basically want to flatten out a hierarchy. You should be able to do it using the Parent-child hierarchy concept newly introduced in OBIEE 11g where the RPD generates the DDL and DML for you, but you might have to build out an intermediate table to source the final flattened hierarchy table. Refer to this link which has a great example. ( http://prasadmadhasi.com/2011/11/15/hierarchies-parent-child-hierarchy-in-obiee-11g/ )
Thanks,
-Amith.

Export pdf with completely flattened linked images/tables

Hello
I am trying to export my thesis as pdf so that the illustrator images which are linked in, are flattened including their text labels.
This document needs to be submitted to examiners who want to scan the word-count of the body text, not the images/tables.
I have looked at other discussions but suggestions have not worked, I can still select the text of embedded images/tables in Preview and Acrobat.
Please can I have some suggestions
Thanks

Here is one way. You will need to open each Illustrator file, duplicate the original layer and call it "Outlined type". Go to Select> Objects> Text objects, then Type> Create outlines. (Outlined type won't be recognized as words, unless you ask it to try.)
Open your InDesign file, select each linked Illustrator file and go to Object> Object Layer Options, turn on the outlined type layer and turn off the original layer. Keep the original Illustrator layer, in case you need to make future edits.

Best way to flatten out this table?

Let's say you're dealing with these two tables:
CREATE TABLE VEHICLES
(
VEHICLE_ID NUMBER,
VEHICLE_NAME VARCHAR2(100 BYTE),
MILES NUMBER
);
CREATE TABLE VEHICLE_PARTS
(
PART_ID NUMBER,
VEHICLE_ID NUMBER NOT NULL,
PART_TYPE NUMBER NOT NULL,
PART_DESCRIPTION VARCHAR2(1000 BYTE) NOT NULL,
START_SERVICE_DATE DATE NOT NULL,
END_SERVICE_DATE DATE,
PART_TYPE_NAME VARCHAR2(100 BYTE)
);
And some example data as follows:
Insert into VEHICLES (VEHICLE_ID, VEHICLE_NAME, MILES) Values (1, 'Honda Civic', 75500);
Insert into VEHICLES (VEHICLE_ID, VEHICLE_NAME, MILES) Values (2, 'Ford Taurus', 156000);
Insert into VEHICLE_PARTS
(PART_ID, VEHICLE_ID, PART_TYPE, PART_DESCRIPTION, START_SERVICE_DATE, END_SERVICE_DATE, PART_TYPE_NAME)
Values
(1, 1, 1, '1.4 VTEC',
TO_DATE('07/07/2009 00:00:00', 'MM/DD/YYYY HH24:MI:SS'), TO_DATE('05/03/2010 00:00:00', 'MM/DD/YYYY HH24:MI:SS'), 'ENGINE');
Insert into VEHICLE_PARTS
(PART_ID, VEHICLE_ID, PART_TYPE, PART_DESCRIPTION, START_SERVICE_DATE, PART_TYPE_NAME)
Values
(2, 1, 1, '1.6 VTEC',
TO_DATE('05/03/2010 00:00:00', 'MM/DD/YYYY HH24:MI:SS'), 'ENGINE');
Insert into VEHICLE_PARTS
(PART_ID, VEHICLE_ID, PART_TYPE, PART_DESCRIPTION, START_SERVICE_DATE, END_SERVICE_DATE, PART_TYPE_NAME)
Values
(3, 1, 2, 'Good Year All-Season',
TO_DATE('07/07/2009 00:00:00', 'MM/DD/YYYY HH24:MI:SS'), TO_DATE('08/10/2010 00:00:00', 'MM/DD/YYYY HH24:MI:SS'), 'TIRES');
Insert into VEHICLE_PARTS
(PART_ID, VEHICLE_ID, PART_TYPE, PART_DESCRIPTION, START_SERVICE_DATE, PART_TYPE_NAME)
Values
(4, 1, 2, 'Bridgestone Blizzaks',
TO_DATE('08/10/2010 00:00:00', 'MM/DD/YYYY HH24:MI:SS'), 'TIRES');
Insert into VEHICLE_PARTS
(PART_ID, VEHICLE_ID, PART_TYPE, PART_DESCRIPTION, START_SERVICE_DATE, PART_TYPE_NAME)
Values
(5, 2, 1, '3.5 L Duratec',
TO_DATE('06/01/2008 00:00:00', 'MM/DD/YYYY HH24:MI:SS'), 'ENGINE');
Insert into VEHICLE_PARTS
(PART_ID, VEHICLE_ID, PART_TYPE, PART_DESCRIPTION, START_SERVICE_DATE, END_SERVICE_DATE, PART_TYPE_NAME)
Values
(6, 2, 2, 'Good Year All-Season',
TO_DATE('06/01/2008 00:00:00', 'MM/DD/YYYY HH24:MI:SS'), TO_DATE('03/15/2009 00:00:00', 'MM/DD/YYYY HH24:MI:SS'), 'TIRES');
Insert into VEHICLE_PARTS
(PART_ID, VEHICLE_ID, PART_TYPE, PART_DESCRIPTION, START_SERVICE_DATE, END_SERVICE_DATE, PART_TYPE_NAME)
Values
(7, 2, 2, 'Michelin All-Seaon',
TO_DATE('03/15/2009 00:00:00', 'MM/DD/YYYY HH24:MI:SS'), TO_DATE('01/12/2011 00:00:00', 'MM/DD/YYYY HH24:MI:SS'), 'TIRES');
Insert into VEHICLE_PARTS
(PART_ID, VEHICLE_ID, PART_TYPE, PART_DESCRIPTION, START_SERVICE_DATE, PART_TYPE_NAME)
Values
(8, 2, 2, 'Nokian',
TO_DATE('01/12/2011 00:00:00', 'MM/DD/YYYY HH24:MI:SS'), 'TIRES');
And you need to produce a view which displays the joined data flattened out where each vehicle has one row with columns representing their most current part (by what has a service start date with null end date).
Like this:
Vehicle: Engine: Tires:
Honda Civic 1.6 VTEC Bridgestone Blizzaks
Ford Taurus 3.5 L Duratec Nokian
Is there a fast/efficient way to do this?
My current approach, which is the brute force method, is to have a separate outer join for each column I need to pull, with a condition of max(START_SERVICE_DATE) to get the current part for each type (Engine, Tires, etc...).
But it's slow, and the code is painful to maintain.
I thought about Pivot, but I don't think Pivot would help here since there is no aggregation going on, right?
Could anything with partition over help? I'm not familiar with that syntax.

Hi,
trant wrote:
Your query does work great I just ran it - but by your last note about wanting to use SELECT...PIVOT - would you please elaborate on that?
If you're used to doing pivots with CASE (or DECODE) and GROUP BY, then you may require some practice before the new way seems better. I believe it is, and I think you'll find whatever effort you have to spend in learning to use the new PIVOT feature is time well invested. I find this code:
WITH     got_r_num     AS
(
     SELECT     v.vehicle_name
     ,     vp.part_description
     ,     vp.part_type_name
     ,     ROW_NUMBER () OVER ( PARTITION BY vp.vehicle_id
                         ,             vp.part_type_name
                         ORDER BY        start_service_date
                       )     AS r_num
     FROM     vehicle_parts     vp
     JOIN     vehicles     v     ON     v.vehicle_id     = vp.vehicle_id
     WHERE     vp.end_service_date     IS NULL
)
SELECT       vehicle_name, engine, tires
FROM       got_r_num
PIVOT       (     MIN (part_description)
       FOR     part_type_name IN ( 'ENGINE'     AS engine
                         , 'TIRES'     AS tires
                         )
       )
WHERE       r_num     = 1
ORDER BY vehicle_name
;
easier to understand and maintain, and this is a more complicated than average example. Often, when using SELECT ... PIVOT, the main SELECT clause is just "SELECT *", and adding more columns is even easier. For example:
WITH     got_r_num     AS
(
     SELECT     v.vehicle_name
     ,     vp.part_description
     ,     vp.part_type_name
     ,     ROW_NUMBER () OVER ( PARTITION BY vp.vehicle_id
                         ,             vp.part_type_name
                         ORDER BY        start_service_date
                       )     AS r_num
     FROM     vehicle_parts     vp
     JOIN     vehicles     v     ON     v.vehicle_id     = vp.vehicle_id
     WHERE     vp.end_service_date     IS NULL
)
,     top_1_only     AS
(
     SELECT     vehicle_name
     ,     part_description
     ,     part_type_name
     FROM     got_r_num
     WHERE     r_num     = 1
)
SELECT       *
FROM       top_1_only
PIVOT       (     MIN (part_description)
       FOR     part_type_name IN ( 'ENGINE'
                         , 'TIRES'
                         )
       )
ORDER BY vehicle_name
;
If you want to include BRAKES and STEERING columns, all you have to do is change
FOR     part_type_name IN ( 'ENGINE', 'TIRES')
to
FOR     part_type_name IN ( 'ENGINE', 'TIRES', 'BRAKES', 'STEERING')
The only way it could be any easier is if it were automatic! (That's another subject entirely.)
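For comparison, the older CASE and GROUP BY style of pivot mentioned above can be sketched as follows. This is a minimal illustration using Python's sqlite3 module (which has no PIVOT clause), with a trimmed copy of the sample data and dates stored as text. It assumes each vehicle has at most one open part (NULL end date) per part type. If there could be several, MIN would simply pick the alphabetically first description.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.executescript("""
CREATE TABLE vehicles (vehicle_id INTEGER PRIMARY KEY, vehicle_name TEXT);
CREATE TABLE vehicle_parts (part_id INTEGER PRIMARY KEY, vehicle_id INTEGER,
                            part_description TEXT, start_service_date TEXT,
                            end_service_date TEXT, part_type_name TEXT);
INSERT INTO vehicles VALUES (1, 'Honda Civic'), (2, 'Ford Taurus');
INSERT INTO vehicle_parts VALUES
 (1, 1, '1.4 VTEC',             '2009-07-07', '2010-05-03', 'ENGINE'),
 (2, 1, '1.6 VTEC',             '2010-05-03', NULL,         'ENGINE'),
 (3, 1, 'Good Year All-Season', '2009-07-07', '2010-08-10', 'TIRES'),
 (4, 1, 'Bridgestone Blizzaks', '2010-08-10', NULL,         'TIRES'),
 (5, 2, '3.5 L Duratec',        '2008-06-01', NULL,         'ENGINE'),
 (8, 2, 'Nokian',               '2011-01-12', NULL,         'TIRES');
""")

query = """
SELECT v.vehicle_name,
       -- one CASE per output column; rows of other part types yield NULL
       MIN(CASE WHEN vp.part_type_name = 'ENGINE' THEN vp.part_description END) AS engine,
       MIN(CASE WHEN vp.part_type_name = 'TIRES'  THEN vp.part_description END) AS tires
FROM vehicles v
JOIN vehicle_parts vp ON vp.vehicle_id = v.vehicle_id
WHERE vp.end_service_date IS NULL          -- current parts only
GROUP BY v.vehicle_id, v.vehicle_name
ORDER BY v.vehicle_name
"""
rows = conn.execute(query).fetchall()
for row in rows:
    print(row)
```

Adding a BRAKES or STEERING column in this style means adding one more MIN(CASE ...) line, which is the repetition the PIVOT syntax removes.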

Rainbow color table flatten to pixmap

I want to use the LabVIEW flatten to pixmap to make a jpeg file out of a 16 bit greyscale. LabVIEW makes lovely images from my data with the "rainbow" color table. I can't find the rainbow color table anywhere on the forum and would spend many hours trying to make it myself. I have a "rainbow banded" table that I found here on the forum, but it just doesn't do the image justice. Can someone help me out with the plain old rainbow color table? (using LabVIEW 11.0)

Have you tried using your own color gradient with the VI attached to the other forum? There are plenty of rgb color gradients floating around online; Googling 'rgb gradient' returns plenty of examples, such as the one attached below.
Attachments:
rgb-gradient.zip ‏3 KB

SQL Table 'flattening'

I have a table which has a structure of:
VALUE_TABLE
REFERENCE VARCHAR2(10)
TYPE VARCHAR2(1)
LEVEL NUMBER(2)
VALUE NUMBER(3)
with example contents:
REF TYPE LEVEL VALUE
1 A 1 30
1 A 2 50
1 A 3 80
1 B 1 25
1 B 2 44
1 B 3 78
2 A 1 14
2 A 2 17
2 A 3 45
and I want to display the data like this:
REF TYPE L1 L2 L3 TOTAL
1 A 30 50 80 160
1 B 25 44 78 147
2 A 14 17 45 76
Can any of you guru's help?
p.s. sorry about the formatting - the forum stripped all the spaces!

You can use the DECODE function.
select
reference ref,
type,
sum(decode(level, 1, value, 0)) L1,
sum(decode(level, 2, value, 0)) L2,
sum(decode(level, 3, value, 0)) L3,
sum(value) total
from
value_table
group by
reference, type
Welcome to Oracle-SQL !
-Bipin.
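The DECODE approach can be checked against the example data with a short sketch. It runs on Python's sqlite3 module, so DECODE is written as the equivalent CASE expression, and the columns are renamed to ref, type, lvl and val purely for this example.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.executescript("""
CREATE TABLE value_table (ref TEXT, type TEXT, lvl INTEGER, val INTEGER);
INSERT INTO value_table VALUES
 ('1', 'A', 1, 30), ('1', 'A', 2, 50), ('1', 'A', 3, 80),
 ('1', 'B', 1, 25), ('1', 'B', 2, 44), ('1', 'B', 3, 78),
 ('2', 'A', 1, 14), ('2', 'A', 2, 17), ('2', 'A', 3, 45);
""")

query = """
SELECT ref, type,
       SUM(CASE WHEN lvl = 1 THEN val ELSE 0 END) AS l1,
       SUM(CASE WHEN lvl = 2 THEN val ELSE 0 END) AS l2,
       SUM(CASE WHEN lvl = 3 THEN val ELSE 0 END) AS l3,
       SUM(val) AS total   -- total of the values, not of the level numbers
FROM value_table
GROUP BY ref, type
ORDER BY ref, type
"""
rows = conn.execute(query).fetchall()
for row in rows:
    print(row)
```

The totals match the layout requested in the question (160, 147 and 76).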

To Partition a 4NF table or convert to 2NF?

Hi,
In one of our Applications we have a fairly simple table that holds details about users. To keep things simple this is created as a simple triple of userid,propertyname,propertyvalue.
So...
4th normal form
CREATE TABLE user_metadata
(
user_id VARCHAR2(20 CHAR) NOT NULL ,
parameter_name VARCHAR2(5 CHAR) NOT NULL ,
parameter_value VARCHAR2(20 CHAR) NOT NULL
);
This has worked well for the typically small number of users and properties.
We now have a larger proposition with 20-50 million users and 15-50 properties; not all properties are known yet; and for those properties that are known we do not know the distribution of values within a key.
So the question we have is do we change to use 2NF and make use of (lots of) bitmapped indexes. e.g.
CREATE TABLE user_metadata
(user_id VARCHAR2(20 CHAR) NOT NULL ,
zone_name VARCHAR2(50 CHAR) ,
age NUMBER(3,0) ,
gender CHAR(1) ,
profession_description ...
,town_name
,etc
)
Or do we stick with 4NF and make use of partitioning based on the value of the parameter_name column?
From a product viewpoint we would prefer to keep with 4NF as it means no application changes for us to make, but we need to advise the client as to the best way to structure the table for performance.
I am not a practicing DBA, but I am the person who will be put on a plane to the client if this solution does not perform ;-) so I have some, possibly rtfm questions:
* Can we index within a partition?
** For example if one partition_name is "City" then we would have a partition that contains all rows which represent City. Can I then index on the parameter_value column of the City partition? Does finding all users in NY involve scanning the City partition?
* Is the additional storage required for 4NF likely to have a significant impact on performance; there is considerable duplication of storage with the 4NF table.
* Which will perform better?
* Which will be more complex to manage as the users grow from 20M to 50M?
Thanks in advance,
Karl.

Thinking about this some more I think you really have two separate requirements. One is an operational requirement to store this User Metadata or Attributes, and let this be edited or added for new users easily. The other is for analytical queries that combine these Attributes in different combinations:
* send a message to all users in NY over 40 whose favorite sport is Football
* offer a discount to a user whose birthday is this week.
The exact list of requirements is not detailed by the client yet, but will be along the lines of "use any of the parameters to create a set of users".
You seem to have a working solution to the operational side using the 4 NF triples - any set of attributes of a user can be easily represented and stored. What you do not have is a solution to the analytical query side. As already mentioned, querying this metadata table directly involves some horrendous SQL and will probably perform poorly due to lack of indexes etc.
I think the best solution is to have two separate sets of tables in the database. I cannot see why the analytical queries have to run against the current, up to date data. Surely these analytical queries would be just as valid against yesterday's data, or even the day before that or the week before. The change in a very small percentage of the metadata should not materially affect the kind of queries you are describing.
If I am right, then you can also have your flattened out 2 NF table, with each attribute as a named data column, of the correct data type. And as you say, you can have Bit Mapped indexes on each of the columns, which will work well in the analytical type queries you have described - lots of 'and's of separate conditions.
What you need is a way of propagating the data from the 4 NF table to the 2 NF table once a day, or at whatever frequency you want. The crudest option is to always recreate the data, copying it all across. Of course performance will be slow, and get worse as the data set grows. So you really want a way of tracking the changes to the 4 NF metadata, and then only propagating the changes over to the 2 NF copy. There are several ways you could achieve this.
Materialized Views would be worth exploring, as they seem to do most of this already. You could also create your own staging tables for the changes, and use a trigger on the 4 NF table to populate the staging table on any change. Then use the staging table to update the 2NF data set at your desired frequency, and clear the staging table (delete records). Of course you would have to write your own SQL for this, but your table structures are relatively simple and straightforward.
I think there are several benefits to this approach:
* Separate data structures optimized to each type of requirement
* Good indexing on 2 NF data for good query performance - Bit Mapped Indexes
* Updates to 2 NF done separately from OLTP updates to 4 NF data - no contention on locking of Bit Mapped Indexes
* No performance impact on 4 NF data sets and the OLTP updates to these
* Minimal if any changes to your existing code on the 4 NF tables
I think you might have some data consistency and integrity issues though, which you need to be careful of. Do you have any logic that enforces the rule that all metadata attributes must be present for each user? Do you somehow enforce that each user has an 'age' or 'city' or 'birth date'. The reason is that when you take the separate rows of triples and merge them together to form one wide row of many columns you might end up missing some of the data attributes for some of the users. This would result in a NULL being stored in the 2 NF table. Is this what you want?
Unfortunately this lack of data integrity is really inherent in the 4 NF form of separate triples. You cannot have a data integrity constraint on individual rows that will enforce that each user must have a value for all parameter names. And you can only check that a user has values for all parameters after the data rows have been stored for the other parameters. Normal data integrity constraints can be checked and enforced before a data row is stored i.e. a single new data row can have all its columns checked for NULLs, and all foreign keys can be checked, during the INSERT before Oracle physically adds the row to the table. Your type of data integrity is across a set of data rows, which you would have to handle somehow else.
Just some more thoughts,
John
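The staging-table idea above can be sketched roughly as follows. This is only an illustration under several assumptions: it uses Python's sqlite3 module instead of Oracle, the names user_wide, changed_users and refresh are invented, only two parameters (CITY and AGE) are flattened, and deletes from the triples table are not tracked.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.executescript("""
-- Operational triples table (the "4 NF" side)
CREATE TABLE user_metadata (user_id TEXT NOT NULL, parameter_name TEXT NOT NULL,
                            parameter_value TEXT NOT NULL,
                            PRIMARY KEY (user_id, parameter_name));
-- Flattened copy for analytical queries (the "2 NF" side)
CREATE TABLE user_wide (user_id TEXT PRIMARY KEY, city TEXT, age INTEGER);
-- Staging table: which users changed since the last refresh
CREATE TABLE changed_users (user_id TEXT PRIMARY KEY);
CREATE TRIGGER trg_meta_ins AFTER INSERT ON user_metadata
BEGIN
    INSERT OR IGNORE INTO changed_users VALUES (NEW.user_id);
END;
CREATE TRIGGER trg_meta_upd AFTER UPDATE ON user_metadata
BEGIN
    INSERT OR IGNORE INTO changed_users VALUES (NEW.user_id);
END;
""")

def refresh(conn):
    """Rebuild the wide rows for changed users only, then clear the staging table."""
    with conn:  # one transaction for the whole refresh
        conn.execute("DELETE FROM user_wide WHERE user_id IN (SELECT user_id FROM changed_users)")
        conn.execute("""
            INSERT INTO user_wide (user_id, city, age)
            SELECT m.user_id,
                   MAX(CASE WHEN m.parameter_name = 'CITY' THEN m.parameter_value END),
                   MAX(CASE WHEN m.parameter_name = 'AGE'
                            THEN CAST(m.parameter_value AS INTEGER) END)
            FROM user_metadata m
            WHERE m.user_id IN (SELECT user_id FROM changed_users)
            GROUP BY m.user_id
        """)
        conn.execute("DELETE FROM changed_users")

conn.execute("INSERT INTO user_metadata VALUES ('u1', 'CITY', 'NY')")
conn.execute("INSERT INTO user_metadata VALUES ('u1', 'AGE', '41')")
conn.execute("INSERT INTO user_metadata VALUES ('u2', 'CITY', 'LA')")
refresh(conn)
conn.execute("UPDATE user_metadata SET parameter_value = '42' "
             "WHERE user_id = 'u1' AND parameter_name = 'AGE'")
refresh(conn)
wide = conn.execute("SELECT user_id, city, age FROM user_wide ORDER BY user_id").fetchall()
print(wide)
```

User u2 has no AGE triple, so the wide row carries a NULL there, which is exactly the missing-attribute situation discussed above.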

Link on a pivot table to see the data behind summary figures

Hello,
I need your support because I am not sure if what I need to do is possible.
I have a flattened table and I have created several pivot tables and slicers based on it. Once I have created the scenario I want to analyze (using slicers), I get some summarized figures in the pivot table. What I need is a way to put a link on those figures to allow me to see the data associated to them.
For instance, if after applying some slicer I get one figure 3, another 2 and grand total 5, I need to see the 2, 3 or 5 records behind them filtered in the existing data (not just double-click and create a new worksheet) .
Thanks in advance for your comments,
Parseval

Hyperlinks in pivots are not supported.
There is some limited drill through functionality built into pivots, but you can't do a lot to configure how it behaves.
If it is enabled, which I believe it is by default, you should be able to double click on an aggregated total and a new sheet will pop up with the underlying records. If your measure is a relatively simple aggregation and the underlying data is a flat file then this should work pretty well. The results can be unexpected if the measures are complex or there are a lot of underlying relationships in the model.
The other possibility is to use the pivot's ability to collapse/expand fields. If you can put the row ID for the individual records into the rows of your pivot beneath whatever aggregated row you are currently displaying, then you should be able to expand into the rows and see the lower levels of the "hierarchy". By default, all rows in your pivot would be collapsed, but then a user could click the little plus sign next to the row label and expand it to see the underlying records.

I created four (4) similar two (2) page file tables as an inventory of the contents of four (4) file crates full of LP vinyl record albums. How can these four (4) files be merged into a single file, then arranged in alphabetical order (by artist)?

I am using "pages" version 5.5 (2109), updated to its newest version after installing Yosemite OS X 10.10 on my 21.5 inch Mac desktop computer. When I printed the second file as a two-sided document on a single sheet of paper, it only printed the first side. I looked at the copy of the file on my screen, and saw that the second page was also blank on my screen. I opened the original locked version of the 68 row, 4 column table, and found that both pages, 34 rows on each page, were intact, but when I saved it again, locked it, and reopened a duplicate copy, the second page again was not there. I ended up printing a copy of the original file, but I am not able to save more than one page of the two page original. I would like to do this so I can edit the list without altering the original. I would eventually like to merge all four tables into one document, then arrange the entire merged file in alphabetical order. I would like to do this and have not been able to. This was the original question I had before I lost the second page of the second file table.

In general theory, one now has the Edit button for their posts, until someone/anyone Replies to it. I've had Edit available for weeks, as opposed to the old forum's ~ 30 mins.
That, however, is in theory. I've posted, and immediately seen something that needed editing, only to find NO Replies, yet the Edit button is no longer available, only seconds later. Still, in that same thread, I'd have the Edit button from older posts, to which there had also been no Replies even after several days/weeks. Found one that had to be over a month old, and Edit was still there.
Do not know the why/how of this behavior. At first, I thought that maybe there WAS a Reply, that "ate" my Edit button, but had not Refreshed on my screen. Refresh still showed no Replies, just no Edit either. In those cases, I just Reply and mention the [Edit].
Also, it seems that the buttons get very scrambled at times, and Refresh does not always clear that up. I end up clicking where I "think" the right button should be and hope for the best. Seems that when the buttons do bunch up they can appear at random around the page, often three atop one another, and maybe one way the heck out in left-field.
While I'm on a roll, it would be nice to be able to switch between Flattened and Threaded Views on the fly. Each has a use, and having to go to Options and then come back down to the thread is a very slow process. Jive is probably incapable of this, but I can dream.
Hunt
