Administrator: design a star schema for "time series analysis"
Hi all,
I need to develop a set of dashboards with reports displaying a set of customer properties at the last ETL period (this month) and, for these customers, showing their properties in the past (this month - "n").
I have a fact table with cust_id and the classic dimensions: customer, period, product, and so on.
My question is how to design the model to support these analyses, or whether to use OBIEE Administration Tool functions to retrieve a snapshot of my customers in the past.
Here is a specific user request:
Find the revenue of all customers that have status = 1 this month
and, only for those customers that had status != 1, show their revenue "in the past".
Any suggestions?
Ugo
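A minimal sketch of that request in SQL, assuming a hypothetical FACT_REV(cust_id, period_id, status, revenue) fact table with numeric YYYYMM period keys:
with current_cust as (
  select cust_id
  from   fact_rev
  where  period_id = 201104   -- "this month"
  and    status    = 1
)
select f.cust_id, f.period_id, f.revenue
from   fact_rev f
join   current_cust c on c.cust_id = f.cust_id
where  f.period_id < 201104   -- "the past"
and    f.status   != 1;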
http://gerardnico.com/wiki/dat/obiee/function_time
Similar Messages
-
I'm studying at university and currently working on a time series analysis with Oracle 10g R2. The aim of my analysis is the comparison of two time series tables; each table contains two columns, the 1st comprising the date and the 2nd the value (price). The standard functionality within Oracle (including the statistical functions) doesn't support any time series analysis.
I'm searching for code or a script in PL/SQL which supports the analysis I'm doing, such as cross-correlation or others. Any help in this regard is highly appreciated.
Thanks in advance.
Well, maybe your real problem is more complex, but on the provided dataset, would the following not be sufficient?
with table1 as (
  select DATE '2007-03-30' dt, 72.28 price from dual union all
  select DATE '2007-03-29', 72.15 from dual union all
  select DATE '2007-03-28', 72.13 from dual union all
  select DATE '2007-03-27', 71.95 from dual union all
  select DATE '2007-03-26', 72.00 from dual union all
  select DATE '2007-03-23', 72.00 from dual union all
  select DATE '2007-03-22', 72.02 from dual union all
  select DATE '2007-03-21', 71.13 from dual union all
  select DATE '2007-03-20', 70.75 from dual union all
  select DATE '2007-03-19', 70.38 from dual
),
table2 as (
  select DATE '2007-03-30' dt, 33.28 price from dual union all
  select DATE '2007-03-29', 31.73 from dual union all
  select DATE '2007-03-28', 33.74 from dual union all
  select DATE '2007-03-27', 32.21 from dual union all
  select DATE '2007-03-26', 32.50 from dual union all
  select DATE '2007-03-23', 33.79 from dual union all
  select DATE '2007-03-22', 34.04 from dual union all
  select DATE '2007-03-21', 32.18 from dual union all
  select DATE '2007-03-20', 42.15 from dual union all
  select DATE '2007-03-19', 38.10 from dual
)
select t1.dt, t1.price p1, t2.price p2,
       corr(t1.price, t2.price) over () correlation
from table1 t1, table2 t2
where t1.dt = t2.dt
/
DT P1 P2 CORRELATION
30.03.2007 00:00:00 72.28 33.28 -.73719325
29.03.2007 00:00:00 72.15 31.73 -.73719325
28.03.2007 00:00:00 72.13 33.74 -.73719325
27.03.2007 00:00:00 71.95 32.21 -.73719325
26.03.2007 00:00:00 72 32.5 -.73719325
23.03.2007 00:00:00 72 33.79 -.73719325
22.03.2007 00:00:00 72.02 34.04 -.73719325
21.03.2007 00:00:00 71.13 32.18 -.73719325
20.03.2007 00:00:00 70.75 42.15 -.73719325
19.03.2007 00:00:00 70.38 38.1 -.73719325
which shows a rather negative correlation - as prices in table 1 rise, prices in table 2 decrease.
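If you need cross-correlation at a lag rather than the contemporaneous correlation above, a minimal sketch in the same style (one-row lag, reusing the table1/table2 shape) could look like this:
with joined as (
  select t1.dt,
         t1.price p1,
         lag(t2.price, 1) over (order by t1.dt) p2_lag1   -- y shifted one row back
  from table1 t1
  join table2 t2 on t1.dt = t2.dt
)
select corr(p1, p2_lag1) cross_corr_lag1   -- corr(x(t), y(t-1))
from joined
where p2_lag1 is not null
/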
Best regards
Maxim -
Star schema for an uploaded data sheet
Hi All gurus,
I am new to this technology. I have a requirement like this: I have to prepare the star schema for the data sheet below.
REPORT_DATE  PREPARED_BY  Units On-time  Units Late  Non-Critical On-time  Non-Critical Lates  Non-Critical DK On-time  Non-Critical DK Lates
2011-01      Team1        1
2011-02      Team1
2011-03      Team1
2011-01      Team2
2011-02      Team2        7              1
2011-03      Team2        4              5
2011-01      Team3
2011-02      Team3
2011-03      Team3        1              3
(Take blank fields as zeros.)
Note: there are three report dates (2011-01, 2011-02, 2011-03) and three teams (Team1, Team2, Team3) as text data; all other columns contain numeric data.
I am given Time as a dimension table containing the Report Date, and the whole sheet as the Data table. How do I define the relationships for this in the Physical layer and the BMM?
I am thinking of making Time a dimension table and the whole Data table a fact table in the Physical layer. Then, in the BMM, I want to carve out a logical dimension called Group from the Data physical table, making Group and Time dimension tables and Data the fact table.
Is this approach correct? Please suggest, and if you have a better idea, note which tables should be dimensions and facts in both the Physical layer and the BMM; you can also advise any change in the number of physical tables in the physical schema design. Your help will be appreciated; thanks in advance.
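A minimal sketch of one possible physical star for this sheet, with hypothetical table and column names:
create table time_dim (
  time_id     number primary key,
  report_date varchar2(7)           -- e.g. '2011-01'
);
create table group_dim (
  group_id  number primary key,
  team_name varchar2(20)            -- Team1, Team2, Team3
);
create table perf_fact (
  time_id             number references time_dim (time_id),
  group_id            number references group_dim (group_id),
  units_on_time       number default 0,
  units_late          number default 0,
  non_crit_on_time    number default 0,
  non_crit_late       number default 0,
  non_crit_dk_on_time number default 0,
  non_crit_dk_late    number default 0
);
In the BMM you would then model perf_fact as the fact table with SUM-aggregated measures, and time_dim and group_dim as the logical dimensions.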
-
Time series analysis in Numbers
Has anyone created a chart showing a time series analysis in Numbers? I could not find a way to change the axis to reflect the right data series.
Hi sanjay,
Yes, Numbers has a different style. Instead of a single large, multi-purpose table, Numbers uses several small tables, each with a purpose.
To plot a time series (or any Category graph) the X values must be in a Header Column. Here is a database of measurements over time as a tree grows:
That database can be left alone. No need to juggle with it. You can even lock it to prevent accidental edits.
Here is a table to pull the data and graph it:
Formula in B1
=Tree Data::B1
Formula in B2 (and Fill Down)
=Tree Data::B2
For the next graph, pull some other data.
(Scatter plots do not require X data to be in a Header Column. Command-click on each column to choose.)
Regards,
Ian. -
SQL for Time Series Functions AGO and YTD
When we use a time series function such as AGO or TODATE, OBIEE creates 2 physical queries. One query reads the calendar table. The other query reads the fact table without any date filter in the WHERE clause. Then the results of the 2 queries are stitched together. The query on the fact table returns a lot of rows because there is no filter on date.
Is there a way to force OBIEE to put a filter on the date when performing the physical query on the fact table when using AGO or TODATE?
Thanks,
Travis
v11.1.1.6
We do have a date filter on the analysis. We need the analysis to show sales for a certain month and sales for that month a year ago, so we use the AGO function. However, it is really slow, because OBIEE runs a physical query on the sales table without filtering on date and then filters its results by the dates returned from the physical query on the calendar table.
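For reference, an AGO-based logical column is typically defined like this in the Administration Tool (names are illustrative); the analysis's date filter is applied only after the two physical result sets are stitched together, which is what makes the unfiltered fact-table scan so expensive:
AGO("Sales"."Revenue", "Time Dim"."Month", 12)   -- revenue for the same month one year ago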
-
Best Partition for Time Series
Hi All,
I have the following tables in my DB:
CREATE TABLE READING_DWR (
  ID         VARCHAR(20) NOT NULL,
  MACHINE_ID VARCHAR(20),
  DATE_ID    NUMBER,
  TIME_ID    NUMBER,
  READING    NUMBER
);
CREATE TABLE DATE_DIMENSION (
  DATE_ID         NUMBER NOT NULL,
  DATE_VALUE      DATE NOT NULL,
  DAY             VARCHAR(10),
  DAY_OF_WEEK     INTEGER,
  DAY_OF_MONTH    INTEGER,
  DAY_OF_YEAR     INTEGER,
  PREVIOUS_DAY    DATE,
  NEXT_DAY        DATE,
  WEEK_OF_YEAR    INTEGER,
  MONTH           VARCHAR(10),
  MONTH_OF_YEAR   INTEGER,
  QUARTER_OF_YEAR INTEGER,
  YEAR            INTEGER
);
CREATE TABLE TIME_DIMENSION (
  TIME_ID  NUMBER NOT NULL,
  HOUR     VARCHAR(3),
  MINUTE   VARCHAR(3),
  SECOND   VARCHAR(3),
  INTERVAL NUMBER
);
Referential constraints:
STG_READING(DATE_ID) references DATE_DIMENSION(DATE_ID)
STG_READING(TIME_ID) references TIME_DIMENSION(TIME_ID)
READING_DWR contains the time series data for a particular machine.
What is the best way to partition READING_DWR to improve the performance of my select query?
Thanks for posting the additional information. I think I have a better understanding of what you are trying to do.
As I suspected partitioning has nothing to do with it.
>
Now where the first value is null , i have to get the record from the READING_DWR , where the time is less then 10:00 for a particular machIne
>
If I understand what you are trying to do correctly, it is something like this. Please correct anything that is wrong.
1. READING_DWR is a history table - for each machine_id there is a datetime value and an amount which represents a 'total_to_date' value
2. STG_READING is a stage table - this table has new data that will be (but hasn't been) added to the READING_DWR table. All data in this table has a later datetime value than any data in the READING_DWR table. You know what the date cutoff is for each batch; in your example the earliest date is 10:00
3. You need to report on all records from STG_READING (which has 'total_to_date') and determine the 'incremental-value'; that is, the increase of this value from the preceding value.
4. For the first record (earliest datetime value) in the record set for each machine_id the preceding value will be the value of the READING_DWR table for that machine_id for the record that has the latest datetime value.
5. Your problem is how to best meet the requirement of step 4 above: that is, getting and using the proper record from the READING_DWR table.
If the above is correct then basically you need to optimize the 'getting' since you already posted code that uses the LAG (1 record) function to give you the data you need; you are just missing a record.
So where you show output that was from only the STG table
>
Now the output will be
=======================
Time Reading lag
10:00 200 null
10:15 220 200
10:20 225 220
10:30 230 225
>
If you include the DWR record (and no other changes) the output might look like
>
Time Reading lag
08:23 185 null
10:00 200 185
10:15 220 200
10:20 225 220
10:30 230 225
>
The above output is exactly what you want but without the first record. I assume you already know how to eliminate one record from a result set.
So the process for what you need, in pseudo-code, basically boils down to:
WITH ALL_RECORDS_NEEDED AS (
  SELECT machine_id, last_record_data FROM READING_DWR
  UNION ALL
  SELECT * FROM STG_READING
)
SELECT lag_query_goes_here FROM ALL_RECORDS_NEEDED
Then either ignore or remove the earliest record for each machine_id, since it came from READING_DWR and will have a NULL lag value. If you add a flag column to each query to indicate where the data came from (e.g. 'R' for READING_DWR and 'S' for STG_READING), you can just use the records with a flag of 'S' in a report query or outer query.
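A more concrete sketch of that pattern, assuming the READING column and the numeric DATE_ID/TIME_ID encoding from this thread (all names illustrative):
with all_records_needed as (
  -- latest DWR row per machine seeds the LAG
  select machine_id, date_id, time_id, reading, 'R' src
  from (
    select r.*,
           row_number() over (partition by machine_id
                              order by date_id desc, time_id desc) rn
    from reading_dwr r
  )
  where rn = 1
  union all
  select machine_id, date_id, time_id, reading, 'S' from stg_reading
)
select *
from (
  select a.*,
         lag(reading) over (partition by machine_id
                            order by date_id, time_id) prev_reading
  from all_records_needed a
)
where src = 'S'   -- drop the seed rows; they only supply the lag
/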
So now the problem is reduced to two things:
1. Efficiently finding the records needed from the READING_DWR table
2. Combining the one DWR record with the staging records.
For #1 since you want the latest date for each machine_id then an index COULD help. You said you have an index
>
index on READING_DWR---MACHINE_ID,DATE_ID,TIME_ID
>
But for a query to find the latest date you want DATE_ID and TIME_ID to be in descending order.
The problem here is that you have seriously garbaged up your data by using numbers for dates and times - requiring
>
TO_DATE(DATE_ID || LPAD(time_id, 6, '0'), 'YYYYMMDDHH24MISS')
>
to make it useful.
This is a VERY BAD IDEA. If at all possible you should correct it. The best way to do that is to use a DATE column in both tables and convert the data to the proper date values when you insert it.
If that is not possible, then you should create a VIRTUAL column using your TO_DATE expression so that you can index and query the virtual column as if it were a date.
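A minimal sketch of that virtual column plus a supporting index, assuming the numeric encoding quoted above:
alter table reading_dwr add (
  reading_ts date generated always as (
    to_date(to_char(date_id) || lpad(to_char(time_id), 6, '0'),
            'YYYYMMDDHH24MISS')
  ) virtual
);
-- descending index supports "latest reading per machine" lookups
create index reading_dwr_ts_ix on reading_dwr (machine_id, reading_ts desc);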
For #2 (combining the one DWR record with the staging records) you can either just union the two queries together (as in my pseudo-code) or extract a copy of the DWR record and insert it into the staging table.
In short query ALL of the DWR records you need (one for each machine_id) separately as a batch and then combine them with the STG records. Don't look them up one at a time like your posted code is trying to do.
If your process is something like this, perhaps run every 15 minutes:
1. truncate the stage table
2. run my report
3. add stage records to the history table
Then I would modify the process to use the 15 minutes 'dead' time between batches to extract the DWR records needed for the next batch into a holding table. Once you do step 3 above (update the history table) you can run this query and have the records preprocessed for your next batch and report.
I would use a new holding table for this purpose rather than have the staging table serve a double purpose. You never know when you might need to redo the staging table load; that means truncating the table, which would wipe out the staged DWR records.
Anyway - with all of the above you should be able to get it working and performing. -
How to prepare a query for a time series algorithm?
Hi everyone,
I want a prediction for the next 6 months; how do I prepare the query?
I have a Date column, a Crime column, and an Incidents column. Since I am forecasting the next 6 months, how do I get the date column month-wise or date-wise? If month-wise, how do I split the year and the month out of the date column?
Please, I need some help. Waiting for a reply.
pandiyan
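A minimal sketch of the month-wise split and aggregation in Oracle-style SQL, assuming a hypothetical CRIME_DATA table with DATE_COL, CRIME and INCIDENTS columns:
select extract(year from date_col)  as yr,
       extract(month from date_col) as mth,
       crime,
       sum(incidents) as total_incidents
from crime_data
group by extract(year from date_col), extract(month from date_col), crime
order by yr, mth
/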
Hi Leo,
Thanks a lot for the reply,
but the blog you pointed to did not solve this problem either.
Can this problem be solved using "Seasonal Decomposition of Time Series by Loess"?
Regards,
Manish -
I want to know where I can get an example of the star schema for OSA. Please help me.
Your suggestions are eagerly anticipated.
-
Partition Scheme for Time Machine
I have an external hard drive that is the same size as the internal hard drive in my mac mini, and I want to use the external drive to backup the mac. Currently, my mac is partitioned to use Boot Camp, and I also wanted to use the external drive to share files between OSX and Windows. When I setup Time Machine, will it backup the Windows partition as well? If so, would it be best to setup the partitions on the external drive the same as the internal drive and use something else to share files between OSX and Windows?
Thanks.
No, Time Machine will only back up your OS X partition. You'd need a different backup solution running under Windows to back up that side.
-
Time-series / temporal database - design advice for DWH/OLAP???
I am faced with the task of designing a DWH as effectively as possible for time series data analysis. Are there any special design guidelines or best practices available, or can ordinary DWH/OLAP design concepts be used? I ask because I have seen the term 'time series database' in the academic literature (but without further references), and I have also heard the term 'temporal database' (which, as far as I have heard, is not just a matter of logging data changes, etc.).
So it would be very nice if someone could give me some hints about this type of design problem.
Hi Frank,
Thanks for that - after 8 years of working with Oracle Forms, and afterwards the same again with ADF, I still sometimes find it hard when using ADF to understand the best approach to a particular problem - there are so many different ways of doing things, of where to put the code, of how to call it, etc.! Things seemed so much simpler back in the Forms days!
Chandra - thanks for the information, but this doesn't suit my requirements. I originally went down that path thinking it would be the holy grail, but I ran into all sorts of problems, because it means the dates are always converted into the user's timezone regardless of whether they are creating the transaction or viewing an earlier one. I need the correct "date" to be stored in the database when a user creates or updates a record (for example, in California), and this needs to be preserved for users in other timezones. For example, when a management user in London views that record, the date has to remain the date the user entered, not what the date was in London at the time (e.g. the user entered 14th Feb (23:00); when the London user views it, it must still say 14th Feb even though it was already the 15th in London). Global settings like those in the adf-config file made this difficult. This is why I went back to stripping all timezone settings out of the ADF application and relied on database session timezones instead - and when displaying a default date to the user, I use the timestamp from the database to ensure the user's "date" is displayed.
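A minimal sketch of the underlying design point, with hypothetical names:
-- A plain DATE stores the entered value verbatim, while TIMESTAMP WITH
-- LOCAL TIME ZONE is normalized on insert and converted to each querying
-- session's timezone - exactly the shifting behaviour described above.
create table txn_demo (
  entered_date date,                            -- preserved as entered
  entered_ts   timestamp with local time zone   -- shifts per session TZ
);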
Cheers,
Brent -
Hi Guys,
I have designed a star schema for one of my data marts, and my client is suggesting that on top of it I should create an MV to provide a consolidated view. I am trying to convince my client not to do so, with the points below:
1. As we have created a star schema in the database, we should take advantage of it and avoid creating another reporting layer, which will increase the complexity of the queries when we later expand the functionality of the mart.
2. We would have to create a complete-refresh MV; during a refresh, data will not be available for reporting, and the refresh duration will grow over time as the data increases.
3. As MVs are tables on disk, using an MV in this case will consume tablespace, which will grow over time.
Please can you experts suggest any more points or additions? We are using SAP BO as the reporting tool in our organization, in which a Universe can easily be created for reporting.
Cheers,
Shaz
>
I have designed a star schema for one of my data marts, and my client is suggesting that on top of it I should create an MV to provide a consolidated view. I am trying to convince my client not to do so
>
You are convincing them NOT to do one of the things materialized views were originally introduced to provide?
I'm purposely going all the way back to 8i documentation here to emphasize the point.
http://docs.oracle.com/cd/A87860_01/doc/server.817/a76994/qr.htm#35520
" Overview of Query RewriteOne of the major benefits of creating and maintaining materialized views is the ability to take advantage of query rewrite, which transforms a SQL statement expressed in terms of tables or views into a statement accessing one or more materialized views that are defined on the detail tables. The transformation is transparent to the end user or application, requiring no intervention and no reference to the materialized view in the SQL statement. Because query rewrite is transparent, materialized views can be added or dropped just like indexes without invalidating the SQL in the application code. "
>
The theory behind query rewrite is this: have them build their queries against your star schema (or you build a traditional view that does that), then build a materialized view that mirrors the query/view. If the materialized view is refreshing or not up to date, their queries will run (more slowly) against the star schema. If it is up to date, it will be used instead, providing faster results.
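A minimal sketch of that pattern, with hypothetical SALES_FACT/CUST_DIM names standing in for the mart:
create materialized view sales_by_cust_mv
  build immediate
  refresh complete on demand   -- the complete-refresh approach discussed above
  enable query rewrite         -- lets the optimizer use the MV transparently
as
select c.cust_id,
       sum(f.revenue) total_revenue,
       count(*)       row_cnt
from   sales_fact f, cust_dim c
where  f.cust_id = c.cust_id
group  by c.cust_id;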
But before you go to that trouble: they are asking for a consolidated view (presumably something easier to query - common in data warehousing). You can create a view to provide this. If that view is not fast enough for their performance requirements, materialize it. Yes, the materialized view uses space, but that space is the price you pay for meeting the performance requirement. -
Star schema or Snowflake schema
Hi Gurus,
I have the following dimension and fact tables. Let me know whether I can go ahead with a star schema or a snowflake schema while building the cube.
1. country table
2. workgroup table --> each country has N workgroups
3. user table --> each workgroup has N users
4. time table
5. fact table
This is a similar thread that discusses the design approach of star vs. normalized tables:
https://social.technet.microsoft.com/Forums/sqlserver/en-US/7bf4ca30-a1bc-415d-97e6-ce0ac3137b53/normalized-3nf-vs-denormalizedstar-schema-data-warehouse-?forum=sqldatawarehousing
In my experience, the majority of cases I've come across also use a star schema for data marts, with tables denormalized rather than following the principles of normalization. And since you expose the OLAP model through SSAS cubes, it is much easier to implement the relationships using a denormalized approach.
What you can do is keep a normalized data warehouse if you want, and then build the data marts over it using denormalized tables (star schema) for the cube.
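A minimal sketch of the two shapes for this data, with hypothetical names:
-- Snowflake: country <- workgroup <- user, joined through the chain
create table country_dim (
  country_id   number primary key,
  country_name varchar2(50)
);
create table workgroup_dim (
  workgroup_id number primary key,
  country_id   number references country_dim (country_id),
  wg_name      varchar2(50)
);
create table user_dim (
  user_id      number primary key,
  workgroup_id number references workgroup_dim (workgroup_id),
  user_name    varchar2(50)
);
-- Star: one denormalized user dimension joined straight to the fact table
create table user_dim_star (
  user_id      number primary key,
  user_name    varchar2(50),
  wg_name      varchar2(50),
  country_name varchar2(50)
);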
Visakh
-
What is a star schema - please explain with an example
Hi,
What is a star schema? Please explain with an example.
Thanks in advance,
Giri
Hi Giri,
SAP's BW employs an extended star schema.
The extended star schema consists of a fact table (in two parts, E and F - F is the inbound table, E is long-term storage). Dimension tables are connected to the fact tables via the DIMID (dimension ID), a generated value stored in both the dimension and fact tables. In addition, the dimension tables are connected to tables which hold master data values (or bind the dimension table to tables that hold the values), such as the S, P, Q, X, and Y tables. These dimension tables hold SIDs, again generated keys, which relate values in the dimension table (the DIMIDs) to master data values. Thus, at query time, join operations ensure that the master data values can be merged with the key figure values stored in the fact tables.
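A highly simplified sketch of that join path in generic SQL (real BW tables are generated, e.g. /BIC/F<cube>; all names here are illustrative only):
select m.material_name,
       sum(f.sales_qty)
from   fact_f       f
join   dim_product  d on d.dimid    = f.dimid_product   -- DIMID join
join   sid_material s on s.sid      = d.sid_material    -- SID join
join   mat_master   m on m.material = s.material        -- master data values
group  by m.material_name;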
Truthfully, one does not need to understand this schema extensively in order to model in BI in SAP NetWeaver. It helps to understand master data, navigational attributes, etc. Otherwise, simply model the key figures in the fact table and the characteristics into dimensions and you're good - the application generates the star schema for you - you don't have to specify it.
See the transaction "LISTSCHEMA" which will show you the relationship between the F fact table and the other tables of the cube's star schema.
Also follow the link for more info:
http://help.sap.com/saphelp_nw04/helpdata/en/4c/89dc37c7f2d67ae10000009b38f889/content.htm
Regards -
Hi, when using HFM as a data source, the dimensions are not mapped as time, so we can't use the built-in OBIEE time series functions. How can we get this to work, or is there a workaround? The HFM dimensions cannot be changed in the physical layer to represent time.
Since you only asked how to get rid of the 2011 line, here are two ways:
1) Filter on Time.Year column <> 2011
or better
2) Filter on Time.Year column <> YEAR(CURRENT_DATE)+1 -
Time Series initialization dates with fiscal periods!
Dear Experts
Problem:
I cannot initialize planning area.
Period YYYY00X is invalid for periodicity P FYV
Configuration
Storage Buckets Profile: Week, Month, Quarter, Year, Post Period FYV all checked.
Horizon
Start: 24.02.2013
End: 31.12.2104
No Time stream defined.
Fiscal Year Variant
Year 2012: Month  1  2  3  4  5  6  7  8  9 10 11 12
           Edate 28 25 31 28 26 30 28 25 29 27 24 31
           FP     1  2  3  4  5  6  7  8  9 10 11 12
Year 2013: Month  1  2  3  4  5  6  7  8  9 10 11 12
           Edate 26 23 30 27 25 29 27 24 28 26 23 31
           FP     1  2  3  4  5  6  7  8  9 10 11 12
Year 2014: Month  1  2  3  4  5  6  7  8  9 10 11 12
           Edate 25 22 29 26 24 28 26 23 27 25 22 31
           FP     1  2  3  4  5  6  7  8  9 10 11 12
Year 2015: Month  1  2  4  5  5  7  8  8 10 10 11 12
           Edate 31 28  4  2 30  4  1 29  3 31 28 31
           FP     1  2  3  4  5  6  7  8  9 10 11 12
Question
What goddamn dates should I enter as the planning area initialization start and end to initialize for the maximum duration, given the settings above?
I tried a few dozen, but none were accepted. To start with, I tried the same dates as in the Horizon of the storage bucket profile. But given the kind of error text I get, I cannot decipher with my tiny little mind what dates I am expected to enter for time series creation.
Thanks
BS
Thanks Mitesh,
No, it's 2014.
Here is what worked.
Storage Bucket Horizon
Start: 24.02.2013
End: 22.11.2014
Time Series Initialization
Start: 01.02.2013
End Date: 31.12.2014
The fiscal year variant is what I pasted above.
I thought time series could only be initialized for a subset of the periods in the storage bucket profile! This is my first experience of this kind, where the initialization period is larger than the storage horizon. Is this really the case? I have always gone by the book, and this is just a chance discovery.
I was aware of the limitation in the SAP notes about needing one more year on either side in the fiscal year variant for it to be used for initialization, hence the range of dates I tried.
Appreciate your comments on this.
All dates in dd/mm/yyyy.
Thanks
BS