UNIX sed commands to clean up data before loading into BI

Hi all,
We are trying to load data into BI from text files. The data needs to be cleaned before it can be loaded into BI, and this has to be done using UNIX sed commands.
We basically need to do a data-cleansing pass, such as removing unwanted characters (tab and newline characters embedded in the characteristic values). How can this be done with sed?
Your help is very much appreciated.
Regards
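A possible approach (a sketch only, assuming GNU sed; the file names are illustrative):

```shell
# Strip embedded tabs and Windows carriage returns before the load.
# GNU sed is assumed; infile.txt / clean.txt are placeholder names.
sed -e 's/\t/ /g' -e 's/\r$//' infile.txt > clean.txt

# sed reads one line at a time, so embedded newlines are easier to
# remove with tr, e.g. turning every newline into a space:
tr '\n' ' ' < clean.txt > joined.txt
```

Whether the newlines should become spaces or be deleted outright (`tr -d '\n'`) depends on how the characteristic values are delimited in your extract.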


Similar Messages

  • Depersonalising Data Before loading into DB

    Hi Guys,
I need some help on de-personalizing customer data before loading it into the database using SSIS.
Once all the transformations are done and we finally want to load the data into the respective tables, we need to de-personalize it.
Also, how will it handle the datatype of each column that needs to be de-personalized?
Later on we have to decrypt it again once it has been tested by the testers.
    Anky

    Hi Raj
We have to encrypt the data before loading it into the table.
We are not encrypting the client ID, so it can still be used to join with other tables for testing, but the testers won't be able to see the other client personal data,
like account number, address and DOB.
We have to decrypt the data back once testing is done.
    Anky

  • Do We Need to Validate Data Before Loading Into Planning?

We are debating whether to load data from GL into Planning using ODI or FDM. If we need some form of validity check on the data, we will have to use FDM; otherwise I believe ODI is good enough.
My question is: for Financials planning, what determines whether we need validity checks? How do we decide?

FDM helps with data-load audit options, but validation can be as simple as comparing totals by GL account between the source and Planning. You should be able to use ODI, FDM or load rules to load data into Hyperion, and complete the validation outside using any of the reporting options.

  • ODI : how to raise cross reference error before loading into Essbase?

Hi John, if you read my post, I want to say that you impress me! Really, thanks for your blog.
Today, my problem is:
- I received a bad-quality data file from an ERP extract
- I have a cross-reference table (Source ==> Target)
- >> How can I raise the error before loading into Essbase?
My idea is the following (first of all, I'm not sure if it is a good one, and I am also having trouble doing it in ODI):
- Step 1: make a JOIN between data.txt and the cross-reference table ==> create a table DATA_STEP1 in the ODISTAGING schema (the columns of DATA_STEP1 are the columns of data.txt plus those of the cross-reference tables; there are more than 20 columns in my case)
- Step 2: check that there is no NULL value in the target columns (NULL means that the data.txt file contains values that are not defined in my cross-reference table) by using a filter (Filter = Target_Account IS NULL or Target_Entity IS NULL or ...)
The result of this interface is sent to a reject.txt file; if reject.txt is not empty, a mail is sent to the administrator
- Step 3: do the opposite: Filter NOT (Target_Account IS NULL or Target_Entity IS NULL ...) ==> the result is sent to the DATA_STEP3 table
- Step 4: run the mapping proper: source DATA_STEP3 (the clean and verified data!) joined with the cross-reference tables, and send the data into Essbase; normally, there are no rejected records!
My main problem is: what is the right IKM to send data into the DATA_STEP1 or DATA_STEP3 table, which are Oracle tables in my ODISTAGING schema? I tried IKM Oracle Incremental Update but I get an error, and actually I don't need an update (which is time-consuming), I just need an INSERT!
I'm just looking for an "IKM SQL to Oracle" ...
    regards
    xavier

Thanks John: very fast!
I now understand better which IKM is useful.
I found other information about error follow-up with ODI: http://blogs.oracle.com/dataintegration/2009/10/did_you_know_that_odi_generate.html
and I decided to activate integrity control in ODI.
I load:
- data.txt into ODITEMP.T_DATA
- transco_account.csv into ODITEMP.T_TRANSCO_ACCOUNT
- transco_entity.csv into ODITEMP.T_TRANSCO_ENTITY
- and so on ...
- Moreover, I created integrity constraints between T_DATA and T_TRANSCO_ACCOUNT, T_TRANSCO_ENTITY, etc., so I expected ODI to collect the bad records for me in E$_DATA (the error table).
However, I have one issue when loading data.txt into T_DATA, because I have no ID or primary key. I read in a training book that I could use a SEQUENCE; I tried, but without success. :-(
Is there another simple way to create a primary key automatically (T_DATA is in an Oracle schema, of course)? Thanks in advance

Can transaction data be loaded into an InfoObject

    Hi Gurus
Can transaction data be loaded into InfoObjects? I would appreciate it if someone could give a simple definition of transaction data.
    GSR

    Hi,
You can probably do that, but why would you want to? Transaction data is generally required for querying purposes, and queries run best on a multidimensional cube structure, not a flat structure.

  • Can we execute the Reporting while the data is loading into that ODS/Cube.

    Hi Friends,
Can we execute reports on a particular ODS/InfoCube in the following cases?
1) When data is loading into that ODS/InfoCube.
2) When we are archiving data from that ODS/InfoCube.
    Thanks & Regards,
    Shaliny. M

    Hi Shaliny,
First of all, you are in the wrong forum; you will find better support in Business Intelligence Old Forum (Read Only Archive).
If you are loading data into an InfoCube, you will be able to report only on requests whose "ready for reporting" icon is set. For an ODS object, I don't think you will get valid reporting, since the ODS data first needs to be activated.
Nevertheless, please post your question in the forum above.
    Kostas

  • Best practice to Consolidate forecast before loading into ASCP

    Hi,
Can anyone suggest a best practice to consolidate forecasts that are in spreadsheets? Forecasts come from Sales, Finance and Operations; the consolidated forecast should then be loaded into ASCP.
Is there any way to automate the load?
Is Oracle S&OP the best product out there?
Do we also need Oracle Demand Management?
    Regards

Forecasts come from Sales, Finance and Operations (spreadsheets)
-> Use integration interfaces to load the data into three different series: sales fcst, finance fcst and ops fcst.
Then the consolidated forecast should be loaded into ASCP.
-> Create a workflow/integration interface to load the consolidation of the three series.
So this can be done in DM.
A standard workflow also exists in S&OP to publish the consensus forecast to ASCP, which accomplishes your objective.

  • Cleaning data before turning into WDDX

A program I am supporting is displaying behavior that seems to imply that the WDDX format cannot hold Windows-specific characters such as "smart quotes" and certain types of dashes. Right now, the application takes user input which is stored in CFML structures, turns it into WDDX, then puts that data into an Oracle column/row. Later, the code retrieves the column/record and, before converting it back to a CFML structure, uses the ColdFusion MX 7 IsWDDX function. This returns false if the data contains the Windows special characters, but returns true if I manually go into the database and change the characters to something else.
What I need is some code that I could use before ever creating the WDDX record that would find the Windows-specific characters and turn them into some sort of "valid" character.
Does anyone have any pointers to such code, or at least an article discussing this type of thing?
Thank you.

Go to cflib.org and look for a function called SafeText.
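For the same idea outside ColdFusion, here is a command-line sketch (assumptions: GNU sed, UTF-8 input; the file names are illustrative) that maps common Windows "smart" punctuation to plain ASCII before the text is serialized:

```shell
# Replace smart quotes and long dashes with their ASCII equivalents.
# Each s/// command handles one Unicode character literally.
sed -e "s/’/'/g" -e "s/‘/'/g" \
    -e 's/“/"/g' -e 's/”/"/g' \
    -e 's/–/-/g' -e 's/—/-/g' \
    -e 's/…/.../g' input.txt > safe.txt
```

This only covers the handful of characters listed; a library function like SafeText is more thorough.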

  • Error when Read only permission set when filtering data before loading with Excel 2013 Addin

    Good afternoon :)
    I have an MDS issue that is making me lose my mind.
I have a read-only permission set on an entity; I also tried putting it on every individual field, and the same thing happens.
Every time an entity has any kind of read-only permission assigned to it or to its fields, the Excel Add-in shows an error when we try to load it. When the entity has more rows than the maximum set in the Settings pane, it offers an option to filter the data. When you try to use this filter, Excel shows an error message, but you can press OK and everything works fine.
Here is the message:
The thing is, my user does not want to see it :( and I don't know how to get rid of it.
Does anyone have an idea how to fix it?
In the debug log of the add-in, there is this message:
    2014-10-22T11:38:42.152        8440 EXCEL.EXE            EXCEL.EXE                               
    Generic          EventType: Error, Message: Unobserved exception in TaskScheduler. Exception:'System.AggregateException: One or more errors occurred. ---> System.NullReferenceException: Object reference not set
    to an instance of an object.
       at System.Windows.Forms.Control.MarshaledInvoke(Control caller, Delegate method, Object[] args, Boolean synchronous)
       at System.Windows.Forms.Control.Invoke(Delegate method, Object[] args)
       at System.Windows.Forms.WindowsFormsSynchronizationContext.Send(SendOrPostCallback d, Object state)
       at Microsoft.MasterDataServices.ExcelAddInCore.ExcelHelpers.ExecuteOnUIThread(SendOrPostCallback callback)
       at Microsoft.MasterDataServices.ExcelAddInCore.DataView.FinalizeUIOperation(Boolean mdsOperation)
       at Microsoft.MasterDataServices.ExcelAddInCore.DataView.<>c__DisplayClass53.<LoadData>b__51(IAsyncResult ar)
       at System.Threading.Tasks.TaskFactory`1.FromAsyncCoreLogic(IAsyncResult iar, Func`2 endFunction, Action`1 endAction, Task`1 promise, Boolean requiresSynchronization)
       --- End of inner exception stack trace ---
    ---> (Inner Exception #0) System.NullReferenceException: Object reference not set to an instance of an object.
       at System.Windows.Forms.Control.MarshaledInvoke(Control caller, Delegate method, Object[] args, Boolean synchronous)
       at System.Windows.Forms.Control.Invoke(Delegate method, Object[] args)
       at System.Windows.Forms.WindowsFormsSynchronizationContext.Send(SendOrPostCallback d, Object state)
       at Microsoft.MasterDataServices.ExcelAddInCore.ExcelHelpers.ExecuteOnUIThread(SendOrPostCallback callback)
       at Microsoft.MasterDataServices.ExcelAddInCore.DataView.FinalizeUIOperation(Boolean mdsOperation)
       at Microsoft.MasterDataServices.ExcelAddInCore.DataView.<>c__DisplayClass53.<LoadData>b__51(IAsyncResult ar)
       at System.Threading.Tasks.TaskFactory`1.FromAsyncCoreLogic(IAsyncResult iar, Func`2 endFunction, Action`1 endAction, Task`1 promise, Boolean requiresSynchronization)<---

    Rafael,
    Is this still an issue?
    Thanks!
Ed Price, Azure & Power BI Customer Program Manager (Blog, Small Basic, Wiki Ninjas, Wiki)
Answer an interesting question? Create a wiki article about it!

  • How to clean file cache before loading PDF file ?

    Hi,
I have an application in which we open a PDF file using the following API:
AxAcroPDFLib.AxAcroPDF axAcroPDF1 = new AxAcroPDFLib.AxAcroPDF();
axAcroPDF1.LoadFile(m_FileName);
What I observed is that when we open a PDF file a second time, it opens at the page where we closed it the first time.
This happens only when Acrobat.exe is already running, i.e. when somebody has already opened another PDF document and we then open and close the document in my application.
So I wanted to know: is there any way to clear the cached file information so that the file always opens at the first page?

    Moved to the Acrobat SDK forum: Acrobat SDK

  • BW data loading to cube (delete data before load)

I have a cube which loads data from a SQL-based source system.
I would like a scheduled job which first deletes the current year's data in the cube, then extracts and loads the current year's data from SQL Server into the cube.
How can this be done? Thanks.

    Hi,
If you are using process chains, then:
Use transaction DELETE_FACTS and generate the report, i.e. select the data target and choose "Generate Program", then execute; it will generate the program. Create a variant for that program and then use it in the process chain, i.e.:
Start
  |
Delete using that program (with the variant on year/month)
  |
Load data
    Thanks
    Reddy

  • Check data before loading through SQL *Loader

    Hi all,
I have a temp table which is loaded through SQL*Loader. This table is used by a procedure to insert data into another table.
I frequently get error ORA-01722 during the procedure's execution.
I have decided to check for the bad data through the control file itself.
I have a few doubts about SQL*Loader:
Will a record containing character data in a column declared as INTEGER EXTERNAL in the control file get discarded?
Does declaring a column as INTEGER EXTERNAL take care of NULL values?
Does the whole record get discarded if one of the column values is misplaced in the record in the input file?
The control file is of the following format:
LOAD DATA
APPEND INTO TABLE Temp
FIELDS TERMINATED BY "|" OPTIONALLY ENCLOSED BY "'"
TRAILING NULLCOLS
( FILEDATE DATE 'DD/MM/YYYY',
  ACC_NUM INTEGER EXTERNAL,
  REC_TYPE,
  LOGO,              -- data: numeric, column declared: VARCHAR
  CARD_NUM INTEGER EXTERNAL,
  ACTION_DATE DATE 'DD/MM/YYYY',
  EFFECTIVE_DATE DATE 'DD/MM/YYYY',
  ACTION_AMOUNT,     -- data: numeric, column declared: NUMBER
  ACTION_STORE,      -- data: numeric, column declared: VARCHAR
  ACTION_AUTH_NUM,
  ACTION_SKU_NUM,
  ACTION_CASE_NUM )
    What changes do I need to make in this file regarding above questions?

Is there any online document for this?
Here it is
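As a quick pre-check outside SQL*Loader (a sketch only; the field position and file name are illustrative, based on the pipe-delimited layout above), awk can flag rows whose ACC_NUM field is not purely numeric, i.e. the rows likely to raise ORA-01722 downstream:

```shell
# Print the line number and content of every row whose 2nd
# pipe-delimited field (ACC_NUM in the layout above) is not
# all digits; review these rows before loading.
awk -F'|' '$2 !~ /^[0-9]+$/ { print NR ": " $0 }' input.dat
```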

  • End Routine is NOT modifying the DSO with new data after load into that DSO

    Hi all,
I am creating an end routine for a DSO to populate a field ZFCMP_FLG (storing 'Y') via a lookup into another DSO, ZMDS_D01. The new field shows blank instead of 'Y' after activating the DSO. While debugging the end routine, the RESULT_PACKAGE records are populated with 'Y' for ZFCMP_FLG, so why is it not writing the modified records into the DSO? ZFCMP_FLG is a characteristic InfoObject with length 1. The following is part of the code:
DATA: wa_fcmp_flg   TYPE c VALUE 'Y'.
    LOOP AT RESULT_PACKAGE ASSIGNING <RESULT_FIELDS>.
        READ TABLE it_zmds_d01 INTO wa_zmds_d01 WITH KEY
                    /BIC/ZAUFNR    = <RESULT_FIELDS>-CS_ORDER
                    NOTIFICATN     = <RESULT_FIELDS>-NOTIFICATN  BINARY SEARCH.
         IF sy-subrc = 0.
           <RESULT_FIELDS>-/BIC/ZFCMP_FLG = wa_fcmp_flg.
        ENDIF.
    ENDLOOP.
    Thanks,
    Venkat.

Hi,
Since you are using a field symbol to loop over the internal table, there is no need for a MODIFY statement inside the loop, so your code is correct on that point.
But check the status of the READ TABLE statement in debug mode; it may be failing, which is why RESULT_PACKAGE is not getting modified.
Please check it.
Note: you need to SORT the internal table, since you are using BINARY SEARCH. See below:
DATA: wa_fcmp_flg   TYPE c VALUE 'Y'.
SORT it_zmds_d01 BY /BIC/ZAUFNR NOTIFICATN.
    LOOP AT RESULT_PACKAGE ASSIGNING <RESULT_FIELDS>.
        READ TABLE it_zmds_d01 INTO wa_zmds_d01 WITH KEY
                    /BIC/ZAUFNR    = <RESULT_FIELDS>-CS_ORDER
                    NOTIFICATN     = <RESULT_FIELDS>-NOTIFICATN  BINARY SEARCH.
         IF sy-subrc = 0.
           <RESULT_FIELDS>-/BIC/ZFCMP_FLG = wa_fcmp_flg.
        ENDIF.
    ENDLOOP.

  • Master Data & Hierarchy loading into BPC from BW

    HI all,
When I get master data and a hierarchy from BW, I need to replace a "-" with an "_".
These members are parents in BW.
I was able to do that using a conversion file for the master data. But when I run the hierarchy package, it says the parents don't exist, because the nodes pulled earlier carry an underscore while the hierarchy still carries the dash.
Example: XXXX-XX needs to be replaced with XXXX_XX.
This member is a parent as well, so I need to add the conversion to the hierarchy load. I did that, but it's not being applied.
    Conversion file for hierarchy load looks like this:
    External                         Internal
    XXXX-XX                          XXXX_XX
    I am on BPC NW7.5 SP6.
    Thank You all.

I think conversion file string manipulation for BW InfoObject hierarchy imports has some issues. I suggest reporting the problem to SAP support by creating a message on the Service Marketplace. Please be very clear in the message about your setup, how to recreate the problem, and why you think it is a bug, so that you don't receive back a reply about it being a consulting issue. Expect to spend some time working through this with support and eventually development, with the goal of getting a correction note released to fix the issue.
In the meantime, you might try to do the string manipulation in the transformation file *MAPPING section instead of the conversion file:
NODENAME=NODENAME(1:4)*STR(_)NODENAME(5:6)
or substitute PARENT for NODENAME above if that is the appropriate level for the issue.
    Best regards,
    [Jeffrey Holdeman|http://wiki.sdn.sap.com/wiki/display/profile/Jeffrey+Holdeman]
    SAP Labs, LLC
    BusinessObjects Division
    Americas Customer Solutions Adoption (CSA) team

  • Here's a tough one: can Infotype data be loaded into a new implementation?

We have an R/3 system that is going to be replaced with NetWeaver; not upgraded, but a new implementation, because there was too much customizing and we're only using the HR module.
Can we load all of the infotype data from the old R/3 system into the new ECC system? Normally it is impossible to re-create all the history from a previous system in SAP, but for infotypes, maybe? Assuming the infotype configurations are the same?
I see infotypes as fairly static tables that are only added to by changes or by certain actions. No changes could be made to these older infotype records, but at least BW could extract all of them and present a continuous view of history in an ODS.
Am I right? Is it possible to load up the infotype tables?
I know there will be problems with double-infinity records, but I think we can handle that.

