Volume of Data in BW

Hi all,
How can I find out how much data is stored in our SAP BW system?
How much historical data does SAP BW currently contain?
Best Regards,
Sus

Hi,
From an SAP BW point of view, SAP has not defined any restriction on the volume of data that can be stored. However, you can run into bottlenecks if individual objects hold too much data.
Regards,
Mohan

Similar Messages

  • Processing large volumes of data in PL/SQL

    I'm working on a project which requires us to process large volumes of data on a weekly/monthly/quarterly basis, and I'm not sure we are doing it right, so any tips would be greatly appreciated.
    Requirement
    Source data is in a flat file in "short-fat" format i.e. each data record (a "case") has a key and up to 2000 variable values.
    A typical weekly file would have maybe 10,000 such cases i.e. around 20 million variable values.
    But we don't know which variables are used each week until we get the file, or where they are in the file records (this is determined via a set of meta-data definitions that the user selects at runtime). This makes identifying and validating each variable value a little more interesting.
Target is a "long-thin" table i.e. one record for each variable value (with numeric IDs as FKs to identify the parent variable and case).
    We only want to load variable values for cases which are entirely valid. This may be a merge i.e. variable values may already exist in the target table.
    There are various rules for validating the data against pre-existing data etc. These rules are specific to each variable, and have to be applied before we put the data in the target table. The users want to see the validation results - and may choose to bail out - before the data is written to the target table.
    Restrictions
    We have very limited permission to perform DDL e.g. to create new tables/indexes etc.
    We have no permission to use e.g. Oracle external tables, Oracle directories etc.
    We are working with standard Oracle tools i.e. PL/SQL and no DWH tools.
    DBAs are extremely resistant to giving us more disk space.
    We are on Oracle 9iR2, with no immediate prospect of moving to 10g.
    Current approach
    Source data is uploaded via SQL*Loader into static "short fat" tables.
    Some initial key validation is performed on these records.
    Dynamic SQL (plus BULK COLLECT etc) is used to pivot the short-fat data into an intermediate long-thin table, performing the validation on the fly via a combination of including reference values in the dynamic SQL and calling PL/SQL functions inside the dynamic SQL. This means we can pivot+validate the data in one step, and don't have to update the data with its validation status after we've pivoted it.
    This upload+pivot+validate step takes about 1 hour 15 minutes for around 15 million variable values.
    The subsequent "load to target table" step also has to apply substitution rules for certain "special values" or NULLs.
    We do this by BULK collecting the variable values from the intermediate long-thin table, for each valid case in turn, applying the substitution rules within the SQL, and inserting into/updating the target table as appropriate.
    Initially we did this via a SQL MERGE, but this was actually slower than doing an explicit check for existence and switching between INSERT and UPDATE accordingly (yes, that sounds fishy to me too).
    This "load" process takes around 90 minutes for the same 15 million variable values.
    Questions
Why is it so slow? Our DBAs assure us we have plenty of tablespace etc., and that the server is plenty powerful enough.
    Any suggestions as to a better approach, given the restrictions we are working under?
    We've looked at Tom Kyte's stuff about creating temporary tables via CTAS, but we have had serious problems with dynamic SQL on this project, so we are very reluctant to introduce more of it unless it's absolutely necessary. In any case, we have serious problems getting permissions to create DB objects - tables, indexes etc - dynamically.
    So any advice would be gratefully received!
    Thanks,
    Chris
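For reference, a hedged sketch of the load step as one set-based MERGE rather than a per-case loop. Table names, the STATUS value and the "entirely valid case" filter are illustrative and follow the intermediate/target layouts described in the follow-up post below; this only helps if the per-case looping, rather than the MERGE itself, is the bottleneck.

-- Sketch only (not the poster's code): merge variable values for fully valid cases in one statement.
MERGE INTO target_values t
USING (
        SELECT i.case_num_id, i.variable_id, i.variable_value
        FROM   long_thin_stage i
        WHERE  i.status = 'VALID'
        AND    NOT EXISTS (SELECT 1
                           FROM   long_thin_stage x
                           WHERE  x.case_num_id = i.case_num_id
                           AND    x.status     <> 'VALID')   -- exclude cases with any invalid value
      ) s
ON (t.case_num_id = s.case_num_id AND t.variable_id = s.variable_id)
WHEN MATCHED THEN
  UPDATE SET t.variable_value = s.variable_value
WHEN NOT MATCHED THEN
  INSERT (case_num_id, variable_id, variable_value)
  VALUES (s.case_num_id, s.variable_id, s.variable_value);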

We have 8 "short-fat" tables to hold the source data uploaded from the source file via SQL*Loader (the SQL*Loader step is fast). The data consists of strings of characters, which we treat as VARCHAR2 for the most part.
These tables consist essentially of a case key (composite key initially) plus up to 250 data columns. 8*250 = 2000, so we can handle up to 2000 of these variable values. The source data may have any number of variable values in each record, but each record in a given file has the same structure. Each file-load event may have a different set of variables in different locations, so we have to map the short-fat columns COL001 etc. to the corresponding variable definition (for validation etc.) at runtime.
CASE_ID VARCHAR2(13)
COL001  VARCHAR2(10)
...
COL250  VARCHAR2(10)
    We do a bit of initial validation in the short-fat tables, setting a surrogate key for each case etc (this is fast), then we pivot+validate this short-fat data column-by-column into a "long-thin" intermediate table, as this is the target format and we need to store the validation results anyway.
    The intermediate table looks similar to this:
    CASE_NUM_ID NUMBER(10) -- surrogate key to identify the parent case more easily
    VARIABLE_ID NUMBER(10) -- PK of variable definition used for validation and in target table
    VARIABLE_VALUE VARCHAR2(10) -- from COL001 etc
    STATUS VARCHAR2(10) -- set during the pivot+validate process above
    The target table looks very similar, but holds cumulative data for many weeks etc:
    CASE_NUM_ID NUMBER(10) -- surrogate key to identify the parent case more easily
    VARIABLE_ID NUMBER(10) -- PK of variable definition used for validation and in target table
    VARIABLE_VALUE VARCHAR2(10)
    We only ever load valid data into the target table.
    Chris
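A much-simplified sketch of the pivot+validate step described above, assuming the short-fat and long-thin structures just shown. The real code builds the SELECT list dynamically from the runtime metadata; here two columns, a mapping table (run_variable_map) and a validation function (f_validate) are made up and hard-coded just to show the shape of the BULK COLLECT / FORALL loop.

DECLARE
  TYPE t_num IS TABLE OF NUMBER       INDEX BY BINARY_INTEGER;
  TYPE t_chr IS TABLE OF VARCHAR2(10) INDEX BY BINARY_INTEGER;
  l_case   t_num;
  l_var    t_num;
  l_val    t_chr;
  l_status t_chr;
  CURSOR c IS
    SELECT s.case_num_id,
           m.variable_id,
           DECODE(m.col_pos, 1, s.col001, 2, s.col002) AS variable_value
    FROM   short_fat_1 s, run_variable_map m;   -- m maps COLnnn positions to variable IDs (illustrative)
BEGIN
  OPEN c;
  LOOP
    FETCH c BULK COLLECT INTO l_case, l_var, l_val LIMIT 10000;
    EXIT WHEN l_case.COUNT = 0;
    FOR i IN 1 .. l_case.COUNT LOOP
      l_status(i) := f_validate(l_var(i), l_val(i));   -- per-variable validation rule
    END LOOP;
    FORALL i IN 1 .. l_case.COUNT
      INSERT INTO long_thin_stage (case_num_id, variable_id, variable_value, status)
      VALUES (l_case(i), l_var(i), l_val(i), l_status(i));
  END LOOP;
  CLOSE c;
  COMMIT;
END;
/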

  • How to calculate the volume of data used (PI licensing limit).

    Hi,
One of our customers has several interfaces running on SAP PI. Their license allows them to use PI for a certain number of GB of message data per month (I think that's the usual arrangement).
They would like to know how near or far they are from their data limit, so they would like to know how SAP evaluates it.
Does anybody know which tool(s) SAP uses to calculate the volume of data consumed in one month? We would like to run those tools and get this information (so we can evaluate whether we can add new interfaces or change the periodicity of existing ones within the limits of the current license).
    Thanks in advance for your help.
    Best Regards
    Rafa

    Hi Mark,
I have a quick question. The links that you provided give an idea of the way in which the volume is calculated. This is helpful.
    My question is regarding Auditing and Compliance:
    Client has bought SAP PI licence for certain GB of data per month.
Can this report be generated automatically, classified by SAP and non-SAP systems (PI licence volume constraints normally apply to data exchanged with non-SAP systems)?
Also, can this data be measured at adapter level? How does one monitor data at adapter level?
    Thanks And Regards,
    Maloy

  • Error while extracting huge volumes of data from BW

    Hi,
We see the error below when extracting a huge volume of data (approx. 3.4 million rows, with a large number of columns):
    R3C-151001: |Dataflow DF_SAPSI_SAPSI3131_SAPBW_To_Teradata
    Error calling R/3 to get table data: <RFC Error:
    Key: TSV_TNEW_PAGE_ALLOC_FAILED
    Status: EXCEPTION SYSTEM_FAILURE RAISED
    No more storage space available for extending an internal table.
    >.
We are not sure whether DoP (degree of parallelism) works with SAP BW as the source, but when we tried it with DoP we got the same error.
Will this issue be resolved by using an R/3 or ABAP dataflow? Can anyone suggest some possible solutions for this scenario?
    Sri

The problem is that you've reached the maximum memory configured for your system.
If this is a batch job, reconfigure the profile parameter abap/heap_area_nondia.
    Markus

  • Dealing with large volumes of data

    Background:
I recently "inherited" support for our company's "data mining" group, which amounts to a number of semi-technical people who have received introductory-level training in writing SQL queries and been turned loose with SQL Server Management Studio to develop and run queries to "mine" several databases that have been created for their use. The database design (if you can call it that) is absolutely horrible. All of the data, which we receive at defined intervals from our clients, is typically dumped into a single table consisting of 200+ varchar(x) fields. There are no indexes or primary keys on the tables in these databases, and the tables in each database contain several hundred million rows (for example, one table contains 650 million rows of data and takes up a little over 1 TB of disk space, and we receive weekly feeds from our client which add another 300,000 rows of data).
Needless to say, query performance is terrible, since every query ends up being a table scan of 650 million rows of data. I have been asked to "fix" the problems.
My experience is primarily in applications development. I know enough about SQL Server to perform some basic performance tuning and write reasonably efficient queries; however, I'm not accustomed to having to completely overhaul such a poor design holding such a large volume of data. We have already tried to add an identity column and set it up as a primary key, but the server ran out of disk space while trying to implement the change.
I'm looking for any recommendations on how best to implement changes to the table(s) housing such a large volume of data. In the short term I need to be able to perform a certain amount of data analysis so I can determine the proper data types for the fields (and whether any existing data would cause a problem when converting to the new data types), so I need to know what can be done to make such analysis possible without the process consuming entire days to analyze the data in one or two fields.
I'm looking for reference materials and information on how to deal with these issues, particularly when a large volume of data is involved. I'm also looking for information on how to load large volumes of data into the database (current processing of a typical data file takes 10-12 hours to load 300,000 records). Any guidance that can be provided is appreciated. If more specific information is needed, I'll be happy to answer any questions you might have about my situation.

I don't think you will find a single magic bullet to solve all the issues. The main point is that there is no shortcut for major schema and index changes. You will need at least 120% free space to create a clustered index and facilitate major schema changes.
I suggest an incremental approach to address your biggest pain points. You mention it takes 10-12 hours to load 300,000 rows, which suggests there may be queries involved in the process that require full scans of the 650 million row table. Perhaps some indexes targeted at improving that process would be a good first step.
What SQL Server version and edition are you using? You'll have more options with Enterprise (partitioning, row/page compression).
Regarding the data types, I would take a best guess at the proper types and run a query with TRY_CONVERT (assuming SQL 2012) to determine counts of rows that conform or not for each column. Then create a new table (using SELECT INTO) that has strongly typed columns for those columns that are not problematic, plus the others that cannot easily be converted, and then drop the old table and rename the new one. You can follow up later to address column data corrections and/or transformations.
    Dan Guzman, SQL Server MVP, http://www.dbdelta.com
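As a hedged illustration of the TRY_CONVERT / SELECT INTO approach described above. Table and column names are made up, and TRY_CONVERT requires SQL Server 2012 or later.

-- Count how many values in each column convert cleanly to the guessed type.
-- COUNT(expr) ignores NULLs, so failed conversions (and original NULLs) drop out.
SELECT COUNT(*)                         AS total_rows,
       COUNT(TRY_CONVERT(int,  Col001)) AS col001_as_int,
       COUNT(TRY_CONVERT(date, Col002)) AS col002_as_date
FROM   dbo.ClientFeed;

-- Once the target types are settled, build the strongly typed copy in one pass.
SELECT TRY_CONVERT(int,  Col001) AS Col001,
       TRY_CONVERT(date, Col002) AS Col002,
       Col003                           -- leave problem columns as varchar for now
INTO   dbo.ClientFeed_Typed
FROM   dbo.ClientFeed;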

  • In Bdc I have huge volume of data to upload for the given transaction

    Hi gurus,
In BDC I have a huge volume of data to upload for the given transaction. I am using the session method, and it takes a long time to complete the whole transaction. Is there any other method to process the huge volume in less time?
    reward awaiting
    with regards
    Thambe

Selection of the BDC method depends on the type of requirement you have, but you can decide which one suits your requirement based on the differences between the two methods. The following are the differences between Session and Call Transaction.
Session method:
1) Synchronous processing.
2) Can transfer large amounts of data.
3) Processing is slower.
4) An error log is created.
5) Data is not updated until the session is processed.
Call Transaction:
1) Asynchronous processing.
2) Can transfer only small amounts of data.
3) Processing is faster.
4) Errors need to be handled explicitly.
5) Data is updated automatically.
Batch Data Communication (BDC) is the oldest batch interfacing technique that SAP has provided since the early versions of R/3. BDC is not a typical integration tool in the sense that it can only be used for uploading data into R/3, so it is not bi-directional.
BDC works on the principle of simulating user input for transaction screens via an ABAP program.
    Typically the input comes in the form of a flat file. The ABAP program reads this file and formats the input data screen by screen into an internal table (BDCDATA). The transaction is then started using this internal table as the input and executed in the background.
    In ‘Call Transaction’, the transactions are triggered at the time of processing itself and so the ABAP program must do the error handling. It can also be used for real-time interfaces and custom error handling & logging features. Whereas in
    Batch Input Sessions, the ABAP program creates a session with all the transactional data, and this session can be viewed, scheduled and processed (using Transaction SM35) at a later time. The latter technique has a built-in error processing mechanism too.
    Batch Input (BI) programs still use the classical BDC approach but doesn’t require an ABAP program to be written to format the BDCDATA. The user has to format the data using predefined structures and store it in a flat file. The BI program then reads this and invokes the transaction mentioned in the header record of the file.
Direct Input (DI) programs work in a similar way to BI programs. The only difference is that, instead of processing screens, they validate fields and load the data directly into tables using standard function modules. For this reason, DI programs are much faster (RMDATIND, the Material Master DI program, works at least 5 times faster) than their BDC counterparts and so are ideally suited for loading large volumes of data. DI programs are not available for all application areas.
    synchronous & Asynchronous updating:
    http://www.icesoft.com/developer_guides/icefaces/htmlguide/devguide/keyConcepts4.html
    synchronous & Asynchronous processings
    Asynchronous refers to processes that do not depend on each other's outcome, and can therefore occur on different threads simultaneously. The opposite is synchronous. Synchronous processes wait for one to complete before the next begins. For those Group Policy settings for which both types of processes are available as options, you choose between the faster asynchronous or the safer, more predictable synchronous processing.
By default, the processing of Group Policy is synchronous. Computer policy is completed before the CTRL+ALT+DEL dialog box is presented, and user policy is completed before the shell is active and available for the user to interact with it.
    Note
    You can change this default behavior by using a policy setting for each so that processing is asynchronous. This is not recommended unless there are compelling performance reasons. To provide the most reliable operation, leave the processing as synchronous.

  • How to find volume of data

    Hi,
    We want to find out the volume of data being inserted into the database schema per day.
    Is there any way we can find out this information?
We have many tables in our database, so counting rows is not an effective calculation.
There are also many BLOB objects in the tables, and each row differs in size from the others.
Database version: Oracle Database 10g Enterprise Edition Release 10.2.0.4.0 - 64bit
    Thanks,
    Ravi

SELECT ROUND(SUM(bytes)/1024/1024/1024, 2) AS gb
FROM   dba_segments
WHERE  owner = 'OSS';   -- replace OSS with your schema owner
    Use this query to calculate the total schema size. Calculate the size daily and compare the growth.
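A hedged sketch of that daily comparison, assuming you are allowed to create a small log table (all names are illustrative):

-- Log table for one size sample per day.
CREATE TABLE schema_size_log (
  sample_date DATE,
  size_gb     NUMBER
);

-- Run once a day (e.g. from a DBMS_JOB / cron job).
INSERT INTO schema_size_log
SELECT TRUNC(SYSDATE), ROUND(SUM(bytes)/1024/1024/1024, 2)
FROM   dba_segments
WHERE  owner = 'OSS';
COMMIT;

-- Day-over-day growth.
SELECT sample_date,
       size_gb,
       size_gb - LAG(size_gb) OVER (ORDER BY sample_date) AS growth_gb
FROM   schema_size_log
ORDER  BY sample_date;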
    Regards
    Asif Kabir

  • Table to find out the volume of data

    HI Experts,
Is there any table to find out the volume of data in the data targets (InfoCubes and DSOs)?
    Thanks & Regards,
    Prasad.

Hey. I am not aware of anything that can give you a flat list of providers and total record counts. It would be a fairly simple program to write, though.
But you could just go the old-school route and look up the number of entries in the active table for the DSOs (see the sketch below). A cube would be a bit different and would probably depend on what you really want to measure, but I guess the E fact table is a place to start.
I also know there are some DB size summary tables that show the number of records per table. One of those tables is DBSTATTORA (other similar tables exist). But again, it is at the actual table level and not a provider summary. And I will not speak to how accurate that data is, as I have not tried to validate it. I will say the function module in the link posted above actually just selects the number of records from whatever table you enter.
    Thanks
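For what it's worth, a hedged example of the active-table lookup mentioned above. The DSO name ZSD_O01 is made up; the active table of a standard DSO follows the /BIC/A<DSO>00 naming convention.

-- Hypothetical DSO "ZSD_O01"; its active table is /BIC/AZSD_O0100.
-- Run via SE16 or directly on the database (quoted because of the slashes).
SELECT COUNT(*) AS active_rows
FROM   "/BIC/AZSD_O0100";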

  • Retrive SQL from Webi report and process it for large volume of data

We have a scenario where we need to extract large volumes of data into flat files and distribute them from the 'Teradata' warehouse; we usually call these 'Extracts'. But the requirement is such that business users want to build their own 'Ad-hoc Extracts'. The only way I can think of to achieve this is to build a universe, create the query, save the report without running it, and then write a RAS SDK program to retrieve the SQL code from the report, save it into a .txt file, and process it directly in Teradata.
Is there any predefined solution available with SAP BO, or any other tool, for this kind of scenario?

    Hi Shawn,
Is there a VB macro to retrieve the SQL queries of the data providers of all the WebI reports in the CMS?
    Any information or even direction where I can get information will be helpful.
    Thanks in advance.
    Ashesh

  • Maximum volume of data supported by the version of the database embedded

    Hi,
How can I find out the maximum volume of data supported by the version of the embedded database?
Any help would be much appreciated.
    Thanks and Regards

    * 1 EB = 1,000,000,000,000,000,000 B = 1 billion gigabytes
The term exbibyte, using a binary prefix, denotes the corresponding power of 1024 (1,024^6 bytes).
    Source: http://en.wikipedia.org/wiki/Exabyte
    Regards
    Asif Kabir

  • Data with huge volume of data with DTP

    Hi Experts,
I have a problem uploading a huge volume of data with DTPs.
The initialisation is already done, as I am doing reloads. The data covers fiscal year/period 000.2010 to 016.9999, and the volume is huge.
I have tried uploading the data in chunks by splitting it into one full-load DTP per three months.
But when I process the DTP, the data packages are determined at the source and I get about 2,000 data packages.
The request turns red after about 1,000 data packages have been processed, and the batch processes allocated to the load also stop.
I have tried splitting the DTP by single months and processing it, with the same problem. I have deleted the indexes before loading to the cube and changed the batch-processing setting from 3 to 5.
Can anyone please advise what the problem could be? I am running these reloads in the quality system.
How can I upload this data, which runs into millions of records?
    Thanks,
    Tati

    Hi Galban,
I have increased the parallel processing from 3 to 5, and I have also looked at the data package size.
Can you please advise how I can increase the data package size? For my upload, the package size corresponds to the package size in the source and is determined dynamically at runtime.
    Please advise.
    Thanks
    Tati

  • Delete volumes of Data in a table.

We have a table which holds a huge amount of data. When we execute the delete statement, it fails with a database error because it cannot delete such a huge volume of data.
Approach tried so far:
Use a counter with a size of, say, 10,000, and execute a delete statement for every 10,000 records. But because the data is so large, it still takes a long time.
Can anybody suggest some approaches?

    user12944938 wrote:
    Oracle Version : 10g
    SQL : DELETE FROM STUDENT WHERE YEAR = 2000.
The requirement is more on a yearly basis, but across different tables [STUDENT, ADMIN, MANAGERS, ...].
We have tried running it in off-hours.
STUDENT TABLE [ID, FIRST_NAME, MIDDLE_NAME, LAST_NAME, ...] (around 200 columns).

Well, you left out a lot of the information I asked for, so I don't have much to suggest.
    http://asktom.oracle.com/pls/asktom/f?p=100:11:0::::P11_QUESTION_ID:2345591157689
    Is a good read and may be of help to you.
    Partitioning could also be an option (at which point you could drop the old partitions).
I really can't say, given what we know (and, more importantly, do not know) about your situation.
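As a hedged sketch of the commonly recommended alternative of copying the rows you want to keep instead of deleting the rest, using the STUDENT table and YEAR column from the post above. All object names are illustrative, and indexes, constraints, grants and triggers would all need to be recreated afterwards.

-- Keep the rows you want, then swap the tables (sketch only).
CREATE TABLE student_keep NOLOGGING AS
  SELECT * FROM student
  WHERE  year <> 2000;

DROP TABLE student;
RENAME student_keep TO student;
-- Recreate indexes, constraints, grants and triggers here.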

  • How can I configure ReFS to NOT fail read operations when a checksum error is detected (on non-Storage-Spaces volumes where data integrity streams are enabled)?

    According to William Stanek, in his Windows Server 2012 R2 Inside Out: Configuration, Storage & Essentials book, this is apparently possible: (pg. 615 - here it is on Google Books: https://books.google.ca/books?id=0IyfBAAAQBAJ&pg=PT819&lpg=PT819&dq=read+operation )
        Integrity can be enabled when the system is not running on Storage Spaces. When
        integrity is enabled and ReFS detects a checksum mismatch, ReFS logs an event and
        fails the read operation by default. If you don’t want the read operation to fail, you
        can configure ReFS to continue with the read operation. A related event will be logged
        regardless.
    So then how do I configure it to do that???
    (And just to make it super-clear, I'm NOT using Storage Spaces, so there is no redundancy via mirroring/parity, and I'm not expecting any file repair - just detection of corruption. It's just a basic volume formatted with ReFS and
    with integrity streams enabled, via format E: /fs:ReFS /i:enabled
    For those who want more details, here's the situation: 
When I try to perform a read operation on a file with corrupted data (corrupted on purpose for testing, using a low-level disk editor), I get an error message [screenshot not included], and an event ID 133 from ReFSv1 gets logged in the System log [screenshot not included].
    Clicking "Try Again" just brings up the same message, and clicking "Skip" skips the operation entirely.
    This is indeed the correct default behaviour.
    What I want instead is for the read operation to be allowed to complete, with corrupt data and all, and ONLY for the event to be logged. And according to William Stanek, this is supposed to be configurable somewhere - and after hours of searching, I haven't
    been able to find anything.

    Hi Tommy,
>>How can I configure ReFS to NOT fail read operations when a checksum error is detected
We can use the PowerShell command Set-FileIntegrity to configure this. The specific parameter for controlling this behaviour is -Enforce <Boolean>, which indicates whether to block access to a file if its integrity streams do not match the data.
    Regarding this point, the following article can be referred to as reference.
    Set-FileIntegrity
    https://technet.microsoft.com/en-us/library/jj218351.aspx
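For example, a minimal sketch (the file path is made up); this leaves integrity checking and event logging on and only relaxes the enforcement that fails the read:

# Hypothetical path; -Enforce $False stops ReFS from blocking reads on a checksum mismatch.
Set-FileIntegrity -FileName 'E:\Data\corrupted.bin' -Enforce $False

# Check the current Enabled/Enforced flags for the file.
Get-FileIntegrity -FileName 'E:\Data\corrupted.bin'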
    Best regards,
    Frank Shen

  • What is the best way to extract large volume of data from a BW InfoCube?

    Hello experts,
Wondering if someone can suggest the best method available in SAP BI 7.0 to extract a large amount of data (approx. 70 million records) from an InfoCube. I've tried Open Hub and APD, but they are not working; I always need to split the extracts into smaller datasets. Any advice is greatly appreciated.
    Thanks,
    David

    Hi David,
We had the same issue, although in our case it was loading from an ODS to a cube with over 50 million records. I don't think there is an option for parallel loading using DTPs. As suggested earlier in the forum, the best option is to split the load by calendar year or fiscal year.
But remember that even with that criterion, some years may still have a lot of data, which again becomes a problem.
What I can suggest is that, apart from just the calendar/fiscal year, you also include another selection criterion such as company code or sales organisation.
Yes, you will end up loading more requests, but the loads will go through smoothly with smaller volumes.
    Regards
    BN

• Error when creating a volume on a disk group for ACFS

    Hi,
I want to create an ACFS file system on a Solaris SPARC system running Solaris 10 Update 9. I have created the disk group successfully, but I get an error when I try to create a volume.
    ASMCMD> volcreate -G OEMLIB -s 30G oemlibvol1
    ORA-15032: not all alterations performed
    ORA-15472: volume library cannot be loaded. Platform may not support volume creation. (DBD ERROR: OCIStmtExecute)
I read that Solaris 10 Update 8 or later supports ACFS, and my Oracle Grid software is 11.2 or higher.
What could be the issue? Do I need to manually load the drivers for ADVM? Where are they located?

    Thanks
    Thanks for your input.
    We are missing the ACFS binaries after we installed 11.2.0.1 GRID and Database on Solaris 10 Update 8.
The following are missing from GRID_HOME/bin:
acfsload, acfsroot, acfsdriverstate, acfsregistrymount, acfssinglefsmount.
I see these in an 11.2.0.1 Linux install in our datacenter, but I don't see them in our Solaris install.
