Data Extract Design

Hi All,
I have a requirement to extract data from several different tables, but only particular columns from each.
The requirements are the same across different databases, so I want a single generic approach that reuses the same code to perform the extract and create an ASCII file.
Below is the typical scenario I want to achieve, so I need your expert input to get started.
a) Define the required columns -- this should be configurable, i.e., columns can be added or removed in future.
b) Extract the column names from the database for those defined in step a) above.
c) Extract the data from the relevant tables/columns for various conditions, based on steps a) and b) above.
d) Create an ASCII file for all the data extracted.
I'm not sure whether there is anything wrong with this approach; please suggest the best one.
Regs,
R

user10177353 wrote:
I'm not sure whether there is anything wrong with this approach; please suggest the best one.
The first thing to bear in mind is that developing a generic, dynamic solution is considerably more complicated than writing a set of extract statements. So you need to be sure that the effort you're about to expend will save you more time than writing a script and copying/editing it for subsequent re-use.
You'll probably need three tables:
1. Extracts - one record per extract definition (perhaps including info such as the target file name)
2. Extract tables - the tables for each extract
3. Extract columns - the columns to extract for each table
I'm writing this as though you'll be extracting more than one table per run; a minimal sketch of these tables follows below.
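Here's a minimal sketch of what those three tables might look like; the names, columns and datatypes are only illustrative assumptions, not a prescription:

CREATE TABLE extracts (
  extract_id    NUMBER        PRIMARY KEY,
  extract_name  VARCHAR2(30)  NOT NULL,
  target_file   VARCHAR2(255)              -- target file name for the run
);

CREATE TABLE extract_tables (
  extract_id    NUMBER        NOT NULL REFERENCES extracts,
  table_name    VARCHAR2(30)  NOT NULL,
  where_clause  VARCHAR2(4000),            -- optional filter condition per table
  CONSTRAINT extract_tables_pk PRIMARY KEY (extract_id, table_name)
);

CREATE TABLE extract_columns (
  extract_id    NUMBER        NOT NULL,
  table_name    VARCHAR2(30)  NOT NULL,
  column_name   VARCHAR2(30)  NOT NULL,
  column_seq    NUMBER        NOT NULL,    -- position of the column in the output record
  format_mask   VARCHAR2(40),              -- optional date/number mask for this column
  CONSTRAINT extract_columns_pk PRIMARY KEY (extract_id, table_name, column_name),
  CONSTRAINT extract_columns_fk FOREIGN KEY (extract_id, table_name)
    REFERENCES extract_tables (extract_id, table_name)
);

Adding or removing a column from an extract (your requirement a) then becomes a matter of inserting or deleting rows in EXTRACT_COLUMNS rather than changing code.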
The writing to file is the trickiest bit. Choose a good target format. Remember that although we call them CSV files, commas actually make a remarkably poor choice of separator, as way too much data contains them. Go for something really unlikely, ideally a multi-character separator like ||¬.
Also remember that text files only take strings, so you need to convert your data to text. Use the data dictionary ALL_TAB_COLUMNS view to get the metadata for the extracted columns, and apply explicit format masks to date and numeric columns. You may want to allow date columns to have masks which include or exclude the time element.
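As a rough illustration of the kind of query you could build on (the schema and table names are placeholders, and the masks are just examples; in practice the table name would come from your extract configuration):

SELECT column_name,
       CASE data_type
         WHEN 'DATE'   THEN 'TO_CHAR(' || column_name || ', ''YYYY-MM-DD HH24:MI:SS'')'
         WHEN 'NUMBER' THEN 'TO_CHAR(' || column_name || ')'
         ELSE column_name
       END AS select_expr              -- expression to use when assembling the dynamic query
FROM   all_tab_columns
WHERE  owner      = 'MY_SCHEMA'        -- placeholder: the schema you extract from
AND    table_name = 'MY_TABLE'         -- placeholder: driven by the extract configuration
ORDER  BY column_id;

The generated expressions can then be concatenated, with your separator in between, into the dynamic SELECT statement for that table.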
Consider what you want to do with complex data types (LOBs, UDTs, etc).
Finally, you need to address the problem of the extract file's location. PL/SQL has a lot of utilities to wrangle files (notably UTL_FILE), but they only work on the server side. So if you want to write to a local drive you'll need to use SQL*Plus SPOOL instead.
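For the server-side case, a bare-bones UTL_FILE sketch might look like this. It assumes a directory object called EXTRACT_DIR already exists and uses a hypothetical EMPLOYEES table with a static query purely for illustration; the real thing would build the statement dynamically from the configuration tables:

DECLARE
  l_file UTL_FILE.FILE_TYPE;
BEGIN
  -- open the target file for writing on the server (directory object is an assumption)
  l_file := UTL_FILE.FOPEN('EXTRACT_DIR', 'extract.txt', 'w', 32767);
  FOR r IN (SELECT TO_CHAR(employee_id) || '||¬' || last_name AS line
            FROM   employees) LOOP
    UTL_FILE.PUT_LINE(l_file, r.line);  -- one delimited record per row
  END LOOP;
  UTL_FILE.FCLOSE(l_file);
EXCEPTION
  WHEN OTHERS THEN
    IF UTL_FILE.IS_OPEN(l_file) THEN
      UTL_FILE.FCLOSE(l_file);
    END IF;
    RAISE;
END;
/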
One last thought: how will you import the data? It would probably be a good idea to use this mechanism to generate DDL for a matching external table.
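For example, the extract could emit something along these lines for each file; again, just a sketch with made-up column names, using the multi-character separator suggested above:

CREATE TABLE extract_ext (
  employee_id  VARCHAR2(40),
  last_name    VARCHAR2(40)
)
ORGANIZATION EXTERNAL (
  TYPE ORACLE_LOADER
  DEFAULT DIRECTORY extract_dir
  ACCESS PARAMETERS (
    RECORDS DELIMITED BY NEWLINE
    FIELDS TERMINATED BY '||¬'
    MISSING FIELD VALUES ARE NULL
  )
  LOCATION ('extract.txt')
)
REJECT LIMIT UNLIMITED;

Since everything in the file is text, it's simplest to declare every external-table column as VARCHAR2 and convert back to dates and numbers in the query that loads from it.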
Cheers, APC
Edited by: APC on May 4, 2012 1:08 PM

Similar Messages

  • Data Conversion Design Patters

    I'm looking at building a conversion program that will import data from several different formats and convert it into one common format. The converter should simply be pointed at a database or a flat file, and it will extract the data and populate tables in the target database. All the logic is mind-numbingly simple, but as far as the overall design goes, what are your thoughts?
    For example, should I build a separate tool that does validation of source data? Or should the validator be part of the converter?
    Are there design patterns out there that anyone can recommend?

    For this problem you can use the Strategy pattern. A strategy is simply an algorithm for reaching the solution.
    The Strategy pattern deals with different algorithms (strategies) that achieve the same result. The client will have some indicator; using that indicator, a strategy manager decides which algorithm to invoke.
    In your case the algorithm may differ depending on the data format. For example, if you have five different formats of data, then you may have to write five different pieces of logic to achieve the solution.
    So first of all you need to identify the strategies you are going to use.
    You may require the following classes:
    1) SourceReader --> this class is responsible for getting the data from the file/database. It just reads and holds the data.
    It identifies which strategy to use by parsing the data; no data processing happens here.
    2) StrategyManager --> the strategy manager reads the data from the SourceReader and decides on and instantiates the strategy to use.
    3) Strategy --> this could be an interface or an abstract class. It can define any common operations that apply irrespective of data format.
    4) ConcreteStrategy --> the implementation class; the actual data extraction happens here.
    5) DataLoader --> this class loads the data into the destination database.
    6) YourEntity --> the data to be populated can be kept in object form. If your records contain many attributes, holding them as objects is convenient (this class is only required if all the data is related and falls into the same logical group).
    import java.util.Vector;

    class SourceReader {
        private final Vector<String> data = new Vector<String>();
        public void run() {
            readData();
            StrategyManager.getInstance().process(data, findStrategy());
        }
        private void readData() {
            // read from file or database and populate 'data'
        }
        private int findStrategy() {
            // logic for identifying the format; returns an indicator
            return 1;
        }
    }

    class StrategyManager {
        private static final StrategyManager sm = new StrategyManager();
        public static StrategyManager getInstance() {
            return sm;
        }
        public void process(Vector<String> data, int indicator) {
            // read the indicator and instantiate the appropriate strategy class
            Strategy s = new Strategy1();   // Strategy1 is a ConcreteStrategy subclass (not shown)
            s.process(data);
        }
    }

    abstract class Strategy {
        public abstract void parse(Vector<String> data);
        public void validate(Vector<String> data) {
            // if the validation logic is common, you can implement it here
        }
        public void updateDB(Vector<String> data) {
            // update the destination database
        }
        public void process(Vector<String> data) {
            validate(data);
            parse(data);
            updateDB(data);
        }
    }

  • Data Model Design

    Hi Experts,
    In my current project I need to design the data model and create the data flow strategy for the SD, MM, PP and FI modules from R/3. The client wants to use BOBJ on top of the BI InfoCubes/reports. Based on the KPIs given, I need to check data availability in R/3, do the data extraction and data model analysis, and submit the documentation for that. Please guide me on how to approach this step by step so that I can go ahead with a clear-cut strategy. If there is any documentation, please share it with me.
    Regards
    Prasad

    You can find the ASAP methodology and accelerators related to data modelling here:
    https://websmp203.sap-ag.de/roadmaps
    See the "ASAP Implementation Roadmap for SAP Exchange Infrastructure"; there is new roadmap content for SAP Business Intelligence.
    Regards,
    Sergio

  • Suggest best data flow design

    <<Do not ask for or offer points>>
    Hi
    Right now we need to implement HR-related data - the main concentration is PA and OM.
    Please suggest the most suitable data flow design.
    Note: none of my DataSources support delta.
    How can we handle these loads on the SAP BI side - I mean to say, how do we handle these loads using DTPs?
    Please suggest a good data flow. I will assign some good points for this.
    Regards
    Madan Mohan
    Edited by: Matt on Feb 19, 2010 7:03 AM

    Hi Madan,
       You can find the data flow in the metadata repository in RSA1. Go to the metadata repository -> select your DataSource ->
       network display.
       DTP: you can extract both full and delta loads using a DTP.
       Please also refer to the links below.
       Personnel Administration:
       [http://help.sap.com/saphelp_nw04/helpdata/EN/7d/eab4391de05604e10000000a114084/frameset.htm]
       Organizational Management:
       [http://help.sap.com/saphelp_nw04/helpdata/EN/2d/7739a2a6d5cc4c8d63a514599dc30f/frameset.htm]
    Regards
    Prasad

  • Data extraction from Oracle database

    Hello all,
    I have to extract data from legacy database tables. I need to apply a lot of conditions to the data extraction using SQL statements in order to get only valid master data, transaction data, SAP date format, etc. Unfortunately I don't have the luxury of accessing the legacy system's database tables to create table views and apply select statements.
    Is there any other way around this by which I can filter data on the source system side without getting into the legacy system, i.e. staying on the BW DataSource side?
    I am supposed to use both UD Connect and DB Connect to test which works out as the better way of doing the data extraction. But my question above should be the same for either interface.
    This is very urgent as we are in the design phase.
    Points will be rewarded immediately.
    Thanks
    Message was edited by:
            Shail
    Message was edited by:
            Shail

    Well, I and everyone else know that it can be done in BI.
    I apologize that I did not mention it in my question.
    I am looking for a very specific answer: is there any trick we can do on the source system side from BI, or anywhere we can insert SQL statements in the InfoPackage or DataSource?
    Thanks

  • Data extraction from Siebel to BW - use BC or not

    Hello
    I am doing data extraction from Siebel to BW. I can map just 15% of the Siebel fields to SAP BC DataSources. Do you still recommend using BC, or should I forget it and create a whole bunch of new InfoObjects and custom ODS objects/cubes?

    Hi,
    If the data is coming from outside SAP, then select the InfoObjects carefully, because the SAP-provided InfoObjects are more meaningful and integrate better with other InfoObjects (compounded objects, navigational attributes and so on). If you create any custom objects, you have to take care of all of them in the overall design and architecture.
    I prefer to first choose existing BC InfoObjects and, only if none are available, create new objects. But you need a lot of functional knowledge for this as well.

  • How to identify whether the data extracted is direct, queued, unserialized

    hi,
    how can I identify whether the data extraction from R/3 is direct, queued or unserialized?
    Can anyone let me know about it?
    regds
    hari

    Hi,
    Direct Delta: With this update mode, the extraction data is transferred with each document posting directly into the BW delta queue. In doing so, each document posting with delta extraction is posted for exactly one LUW in the respective BW delta queues.
    This update method is recommended for the following general criteria:
    a) A maximum of 10,000 document changes (creating, changing or deleting documents) are accrued between two delta extractions for the application in question. A (considerably) larger number of LUWs in the BW delta queue can result in terminations during extraction.
    b) With a future delta initialization, you can ensure that no documents are posted from the start of the recompilation run in R/3 until all delta-init requests have been successfully posted. This applies particularly if, for example, you want to include more organizational units such as another plant or sales organization in the extraction. Stopping the posting of documents always applies to the entire client.
    Queued Delta: With this update mode, the extraction data for the affected application is collected in an extraction queue and can be transferred as usual with the V3 update, by means of an update collective run, into the BW delta queue. In doing so, up to 10,000 delta extractions of documents per LUW are compressed into the BW delta queue for each DataSource, depending on the application.
    This update method is recommended for the following general criteria:
    a) More than 10,000 document changes (creating, changing or deleting documents) are performed each day for the application in question.
    b) In future delta initializations, you must reduce the posting-free phase to executing the recompilation run in R/3. The document postings should be included again when the delta Init requests are posted in BW. Of course, the conditions described above for the update collective run must be taken into account.
    Non-serialized V3 Update: With this update mode, the extraction data for the application in question is written, as before, into the update tables with the help of a V3 update module. It is kept there until the data is selected and processed by an update collective run. However, in contrast to the current default setting (serialized V3 update), the data is read from the update tables without regard to sequence during the update collective run and transferred to the BW delta queue.
    This update method is recommended for the following general criteria:
    a) Due to the design of the data targets in BW and for the particular application in question, it is irrelevant whether or not the extraction data is transferred to BW in exactly the same sequence in which the data was generated in R/3.
    take a look Roberto's weblog series
    /people/sap.user72/blog/2004/12/16/logistic-cockpit-delta-mechanism--episode-one-v3-update-the-145serializer146
    /people/sap.user72/blog/2004/12/23/logistic-cockpit-delta-mechanism--episode-two-v3-update-when-some-problems-can-occur
    /people/sap.user72/blog/2005/01/19/logistic-cockpit-delta-mechanism--episode-three-the-new-update-methods
    /people/sap.user72/blog/2005/04/19/logistic-cockpit-a-new-deal-overshadowed-by-the-old-fashioned-lis
    https://weblogs.sdn.sap.com/pub/wlg/126 [original link is broken]
    doc
    https://www.sdn.sap.com/irj/servlet/prt/portal/prtroot/docs/library/uuid/f83be790-0201-0010-4fb0-98bd7c01e328
    and OSS Note 505700
    Re: delta methods
    go through the previous thread
    Delta types
    hope it helps..

  • SAP Data Extraction requirements - mySAP SCM SPP/ ERP

    Hello All,
    Could you please explain me how to start for this requirement?.
    Business Summary:
    MM Planning needs to have a substantial range of master and transactional data extracted from the SAP system for reporting and analysis purposes. There is a list of the data requested for extraction for SCM.
    Functional Description:
    There is a general MM Planning requirement to have most master and transactional data extracted from SAP APO for reporting and analysis purposes in order to support the following business tasks:
    •     process evaluation and optimization
    •     error analyses
    •     ad-hoc reporting
    •     planning processes out of scope for the SCM launch (e.g. the ATR process)
    The data needs to be automatically extracted on a regular basis (i.e. daily/weekly/monthly/yearly; this will be defined during the next steps for every single set of data) and made available in a structured environment to the SCM Planning Team for the above-mentioned processes.
    The requirement for data availability outside of SAP APO comprises both current data and historical data.
    FoE: Today no "final" report is generated by the SCM Planning team directly within the current data warehouse environment. Most data is extracted from the current data warehouse to external tools. The current assumption is that this will remain unchanged.
    Some regular reports (MBB, Inventory Dashboard) may be developed directly in Business Warehouse in future if this offers any improvement (flexibility, design, handling). This will be investigated by the SCM Planning Team after go-live.
    FUS: For the analyses and the planning processes that are out of scope, the data will need to be extracted so that it is available for use in analytical tools that are powerful enough to process the data and perform the appropriate calculations.
    Regular reports may be developed directly in BW if it is determined that this is the most appropriate location and tool. Otherwise, the data will need to be integrated into reports generated using tools outside of SAP. That determination will need to be completed based on the type of data being reported.
    Thanks & Regards
    PRadeep
    Message was edited by:
            Pradeep Reddy

    Hi,
    You can download all the master guides from the SAP Service Marketplace (http://service.sap.com/instguides).
    Cheers,
    Mike.

  • Data Extraction in Deliveries / Infopackages F, D, I / PC

    Dear Experts,
    Could you please help me by explaining how the data extraction for Deliveries properly works?
    When do you create an InfoPackage with Full, Delta and Init?
    There are some scenarios with all three mentioned before.
    How is the data in that case loaded into the DSO (Acquisition Layer)?
    What about the DTP?
    How do we load data into the cubes?
    How are the process chains designed?
    Thank you for your input.
    Cheers

    Hello Durgesh,
    thank you for replying.
    Could you please help me by explaining how the data extraction for Deliveries properly works?
    When do you create an InfoPackage with Full, Delta and Init?
    Full load will give you the complete set of data from R/3 to BW; delta extraction fetches only the new or changed records, and to enable the delta mechanism you will have to do a delta initialization first. --> That is clear
    How is the data in that case loaded into the DSO (Acquisition Layer)?
    Once the data is loaded to the PSA through an InfoPackage, it can subsequently be loaded from the PSA to the DSO and InfoCube using transformations and DTPs.
    1) For a full load, does the data have to be loaded all the way to the cubes?
    2) After the full load we have to initialize. Could you please give the steps? Or is it just a matter of running the InfoPackage for initialization?
        Does the init load have to go all the way to the cubes?
    3) That is what I would like to know.
    Creating the PC is not a problem. I would only like to know whether in the PC we only use the delta InfoPackage.
    Is there a document which explains all those steps?
    Thank you

  • BODS 3.1 : SAP R/3 data extraction -What is the difference in 2 dataflows?

    Hi.
    Can anyone advise as to what the difference is between the following data extraction flows for extracting data from SAP R/3?
    1) DF1 >> SAP R/3 (R/3 table - query transformation - dat file) >> query transformation >> target
    This ABAP flow generates an ABAP program and a dat file.
    We can also upload this program and run jobs with the 'execute preloaded' option on the datastore.
    This works fine.
    2) We can also pull the SAP R/3 table directly.
    DF2 >> SAP R/3 table (this has a red arrow, like in OHD) >> query transformation >> target
    This also works fine, and we are able to see the data directly in Oracle.
    This can also be scheduled as a job.
    But I am unable to understand the purpose of the different types of data extraction flows:
    when to use which type of flow for data extraction,
    and the advantages/disadvantages of the two data flows.
    What we are not understanding is this:
    if we can directly pull data from the R/3 table through a query transformation into the target table,
    why use the approach of creating an R/3 data flow,
    and then do a query transformation again,
    and then populate the target database?
    There might be some practical reasons for using these two different types of flows for the data extraction, which I would like to understand. Can anyone advise, please?
    Many thanks
    indu
    Edited by: Indumathy Narayanan on Aug 22, 2011 3:25 PM

    Hi Jeff.
    Greetings. And many thanks for your response.
    Generally we pull the entire SAP R/3 table through a query transformation into Oracle.
    For this we use an R/3 data flow and the ABAP program, which we upload to the R/3 system
    so as to be able to use the 'execute preloaded' option and run the jobs.
    Since we do not have any control over our R/3 servers, nor do we have anyone for ABAP programming,
    we do not do anything at the SAP R/3 level.
    I was doing this trial-and-error testing on our workflows for our new requirement:
    WF 1, which has some 15 R/3 tables.
    For each table we have created a separate dataflow.
    And finally, in some dataflows where the SAP tables had a lot of rows, I decided to pull them directly,
    bypassing the ABAP flow.
    And still the entire workflow and data extraction happen OK.
    In fact I tried creating a new sample data flow and tested it,
    using both direct download and execute preloaded.
    I did not see any major difference in the time taken for the data extraction,
    because anyhow we pull the entire table, then choose whatever we want to bring into Oracle through a view for our BO reporting, or aggregate it and then bring the data in as a table for Universe consumption.
    Actually, I was looking at other options to avoid this ABAP generation and the R/3 data flow, because we are having problems in our dev and QA environments - delimiter errors - whereas in production it works fine. The production environment is an old set-up of BODS 3.1; QA and dev are relatively new BODS environments, and they are the ones having this delimiter error.
    I did not understand how to resolve it as per this post : https://cw.sdn.sap.com/cw/ideas/2596
    And trying to resolve this problem, I ended up trying to pull the R/3 table directly, without using the ABAP workflow, just by trial and error with each and every drag-and-drop option, because we urgently had to do a POC and deliver the data for the entire E-Recruiting module of SAP.
    I don't know whether I can do this direct pulling of data for the new job which I have created,
    which has 2 workflows with 15 dataflows in each workflow,
    and push this job into production.
    I also don't know whether I can bypass this ABAP flow and pull R/3 data directly in all the dataflows in future, for ANY of our SAP R/3 data extraction requirements. This technical understanding of the difference between the 2 flows is not clear to us, and being new to the whole of ETL, I just wanted to know the pros and cons of this particular data extraction approach.
    As advised, I shall check the schedules for a week, and then we shall probably move it into production.
    Thanks again.
    Kind Regards
    Indu
    Edited by: Indumathy Narayanan on Aug 22, 2011 7:02 PM

  • Open data extraction orders -  Applying Support Packs

    Dear All,
    I have done the IDES 4.6C SR2 installation.
    While updating the support packs, I get a message in the CHECK_REQUIREMENTS phase:
    Open data extraction orders
    There are still open data extraction orders in the system.
    Process these before the start of the object import, because changes to the ABAP Dictionary structures could lead to data extraction orders not being able to be read after the import and their processing terminating.
    For more details about this problem, see Note 328181.
    Go to the Customizing cockpit for data extraction and start the processing of all open extraction orders.
    I have checked the Note,
    but this is something I'm facing for the first time.
    Any suggestions?
    Rgds,
    NK

    The exact message is:
    Phase CHECK_REQUIREMENTS: Explanation of the Errors
    Open Data Extraction Requests
    The system has found a number of open data extraction requests. These
    should be processed before starting the object import process, as
    changes to DDIC structures could prevent data extraction requests from
    being read after the import, thus causing them to terminate. You can
    find more information about this problem in SAP Note 328181.
    Call the Customizing Cockpit data extraction transaction and process all
    open extraction requests.

  • Bulk API V2.0 Data extract support for additional objects (Campaign,Email,Form,FormData,LandingPage)?

    allison.moore
    Are there any plans for adding the following objects under Bulk API V2.0 for data extraction from Eloqua? Extracting the data for these objects using the REST API makes it complicated.

    Thanks for the quick response. Extracting these objects using the REST API with depth=Complete poses lots of complications from the code perspective, since these objects contain multiple nested or embedded objects within them. Is there any guideline on how to extract these objects using REST so that we can get all the data required for analysis/reporting?

  • Data Extraction and ODS/Cube loading: New date key field added

    Good morning.
    Your expert advice is required on the following:
    1. A data extract was previously done from a source with a full upload to the ODS and cube. An event is triggered from the source when data is available, and then the process chain first clears all the data in the ODS and cube and then reloads, activates, etc.
    2. In the ODS, the 'forecast period' field has now been moved from the data fields to the key fields, as the user would like to report per period in future. The source will in future only provide the data for a specific period and not all the data as before.
    3. Data must be appended in future.
    4. The current InfoPackage for the ODS is a full upload.
    5. The 'old' data in the ODS and cube must not be deleted, as the source cannot provide it again. They will report on the data per forecast period key in future.
    I am not sure what to do in BW as far as the InfoPackages are concerned, loading the data and updating the cube.
    My questions are:
    Q1) How will I ensure that BW will append the data for each forecast period to the ODS and cube in future? What do I check in the InfoPackages?
    Q2) I have now removed the process chain event that used to delete the data in the ODS and cube before reloading it again. Was that the right thing to do?
    Your assistance will be highly appreciated. Thanks
    Cornelius Faurie

    Hi Cornelius,
    Q1) How will I ensure that BW will append the data for each forecast period to the ODS and cube in future? What do I check in the InfoPackages?
    -->> Try to load the data into the ODS in overwrite mode with a full update as before (this adds new records and overwrites previous records with the latest values). Push a delta from this ODS to the cube.
    If the existing ODS is loaded in addition mode, introduce one more ODS with the same granularity as the source, load it in overwrite mode (delta if possible, otherwise full), and push only deltas onward from it.
    Q2) I have now removed the process chain event that used to delete the data in the ODS and cube before reloading it again. Was that the right thing to do?
    --> Yes, it is correct. Otherwise you will lose the historic data.
    Hope it Helps
    Srini

  • FI data extraction help

    Hi All,
    I have gone through the SDN link for FI extraction and found it to be very useful.
    Still I have some doubts....
    For line item data extraction: do we need to extract data from 0FI_GL_4, 0FI_AP_4 and 0FI_AR_4 into the ODS 0FIGL_O02 (General Ledger: Line Items)? If so, do we need to maintain a transformation between the ODS and all three DataSources?
    Also, please educate me on the 0FI_AP_3 and 0FI_AP_4 DataSources.

    >
    Jacob Jansen wrote:
    > Hi Raj.
    >
    > Yes, you should run GL_4 first. If not, AP_4 and AR_4 will be lagging behind. You can see in R/3 in table BWOM_TIMEST how the deltas are "behaving" with respect to the date selection.
    >
    > br
    > jacob
    Not necessarily for systems above Plug-In 2002.
    As of Plug-In 2002.2, it is no longer necessary to have DataSources linked. This means that you can now load 0FI_GL_4, 0FI_AR_4, 0FI_AP_4, and 0FI_TX_4 in any order. You also have the option of using DataSources 0FI_AR_4, 0FI_AP_4 and 0FI_TX_4 separately without 0FI_GL_4. The DataSources can then be used independently of one another (see note 551044).
    Source: http://help.sap.com/saphelp_nw04s/helpdata/EN/af/16533bbb15b762e10000000a114084/frameset.htm

  • Generic Data Extraction From SAP R/3 to BI server using Function Module

    Hi,
    I want a step-by-step procedure for generic extraction from an SAP R/3 application to a BI application
    using a function module.
    If anybody has a document or a PPT, please reply and post it in the forum; I will give points for it.
    Thanks & Regards
    Subhasis Pan

    please go through this link
    [SAP BI Generic Extraction Using a Function Module|https://www.sdn.sap.com/irj/servlet/prt/portal/prtroot/com.sap.km.cm.docs/library/business-intelligence/s-u/sap%20bi%20generic%20extraction%20using%20a%20function%20module.pdf]
    [Generic Data Extraction Using Function Module |Re: Generic Data Extraction Using Function Module;
