Using character string operators in ABAP

Hi,
I have a problem joining two fields of different lengths:
OBJKY has length 30.
TKNUM has length 10.
The READ TABLE i_nast below works as long as both values are no longer than 10 characters, but some OBJKY records in the database are longer than 10, and my read fails in that case. I think I need a character string operator. As I am new to ABAP, can anyone suggest how to do this?
SELECT objky datvr FROM nast
  INTO CORRESPONDING FIELDS OF TABLE i_nast
  WHERE kschl = 'ZBOL'.
SORT i_nast BY objky.
LOOP AT i_ship_data.
  READ TABLE i_nast WITH KEY
       objky = i_ship_data-tknum BINARY SEARCH.
  IF sy-subrc = 0.
    MOVE i_nast-datvr TO i_ship_data-datvr.
  ENDIF.
  MODIFY i_ship_data.
ENDLOOP.

Hi,
Since OBJKY and TKNUM have different lengths, the READ statement works only
when OBJKY holds a 10-character value identical to TKNUM.
But if we can assume that only the first 10 characters of OBJKY are to be compared with TKNUM, we can try the approach below.
Create a new 10-character field in the internal table I_NAST (I_NAST-OBJKY_TEMP).
Now assign the first 10 characters of OBJKY to this new field, and re-sort on it so the BINARY SEARCH key matches:
LOOP AT i_nast.
  MOVE i_nast-objky+0(10) TO i_nast-objky_temp.
  MODIFY i_nast.
ENDLOOP.
SORT i_nast BY objky_temp.
Now you can modify your READ statement:
LOOP AT i_ship_data.
  READ TABLE i_nast WITH KEY
       objky_temp = i_ship_data-tknum BINARY SEARCH.
  IF sy-subrc = 0.
    MOVE i_nast-datvr TO i_ship_data-datvr.
  ENDIF.
  MODIFY i_ship_data.
ENDLOOP.
Hope this will help.
Note: You can pick any 10 consecutive characters of the field I_NAST-OBJKY by adjusting the offset in the MOVE above.
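Language aside, the matching being done here is just a lookup on a fixed-length key prefix; a rough Python sketch of the same idea (the field and table names are only illustrative):

```python
# Sketch of the OBJKY-prefix lookup (illustrative, not ABAP).
def build_prefix_index(nast_rows, width=10):
    """Key each NAST row by the first `width` characters of OBJKY."""
    return {row["objky"][:width]: row["datvr"] for row in nast_rows}

def date_for_shipment(index, tknum, width=10):
    # TKNUM(10) is already `width` characters, so it is compared
    # directly against the OBJKY prefix.
    return index.get(tknum[:width])

rows = [{"objky": "0000012345EXTRASUFFIX", "datvr": "20240101"}]
idx = build_prefix_index(rows)
print(date_for_shipment(idx, "0000012345"))  # matches despite the longer OBJKY
```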
Reward points if found helpful.
Cheers,
Chandra Sekhar.

Similar Messages

  • Efficiency of Java String operations

    Hi,
    for an Information Retrieval project, I need to make extensive use of String operations on a vast number of documents, and in particular involving lots of substring() operations.
    I'd like to get a feeling how efficient the substring() method of java.lang.String is implemented just to understand whether trying to optimize it would be a reasonable option (I was thinking of an algorithm for efficient string pattern matching such as the Knuth-Morris-Pratt algorithm, but if java.lang.String already applies similarly efficient algorithms I would not bother).
    Can someone help?
    J

    Thanks for your comment. Yes of course you're right, I
    mean indexOf(). If so (thanks DrClap), let me enter the discussion.
    indexOf() implements a so-called "brute force" algorithm.
    The worst-case performance is O(n*m), where n is the length of the text and
    m is the length of the pattern, but it is close to O(n) on average.
    The KMP is O(n), so the performance gain should be hardly noticeable.
    To get a real performance gain you should look at the BM (Boyer-Moore,
    O(n/m)) algorithm or some of its descendants.
    As for the java.util.regex package, as far as I understand it should be
    several times slower than indexOf(), because it reads EACH character through an interface method (as opposed to the direct array access in indexOf()).
    Though that is still to be proved experimentally.
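    For reference, the "brute force" scan being discussed is only a few lines; a Python sketch of what indexOf() essentially does over the backing array:

```python
def brute_force_index_of(text, pattern):
    """Naive O(n*m) substring search, essentially what String.indexOf()
    does: try every alignment of pattern against text."""
    n, m = len(text), len(pattern)
    for i in range(n - m + 1):
        if text[i:i + m] == pattern:
            return i
    return -1
```

    On typical inputs the comparison at each alignment fails after a character or two, which is why the average cost stays close to O(n).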

  • Character string in sapscript

    Please explain the following code for a barcode in SAPscript:
    Barcode in SAPscript using a character string:
    /: <BC> &VBAK-WERKS& </>
    Is this another way of achieving a barcode, apart from character formats in scripts?

    Hi,
    You have to create a character format for the barcode. Using that character format, you need to wrap your string at the place where you want it printed.
    Example:
    <characterformat>&string&</characterformat>
    Venkat

  • [svn:bz-trunk] 18053: BLZ-571: Use of wrong operator in string comparison in flex.messaging.VersionInfo.java

    Revision: 18053
    Author:   [email protected]
    Date:     2010-10-07 03:27:37 -0700 (Thu, 07 Oct 2010)
    Log Message:
    BLZ-571: Use of wrong operator in string comparison in flex.messaging.VersionInfo.java
    Updated the code to use the right operator.
    Check-in Tests: PASS
    QA: Yes
    Ticket Links:
        http://bugs.adobe.com/jira/browse/BLZ-571
    Modified Paths:
        blazeds/trunk/modules/core/src/flex/messaging/VersionInfo.java
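    The ticket does not include the diff, but the classic "wrong operator" in Java string comparison is using == (reference identity) where .equals() (content equality) is intended; Python has the same trap with `is` versus `==`. A sketch of the pitfall (not the actual BlazeDS code, and the version string is made up):

```python
# Two equal strings built at runtime are not guaranteed to be the
# same object, so an identity check is the wrong operator here.
a = "".join(["3.2.0.", "3978"])   # hypothetical version string
b = "3.2.0.3978"

print(a == b)   # content comparison: the right operator
print(a is b)   # identity comparison: may well be False
```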

  • Where used of any string list in ABAP repository

    Hi,
    Can anyone help me with how to find a string in the ABAP repository,
    e.g. whether "MY_SEARCH_STRING" has been used in any report, function module, etc.?
    Regards,
    Ravi

    Hi Ravi Kumar Jaiswal,
    Have a look at report RPR_ABAP_SOURCE_SCAN
    and transaction EWK1.
    Also see CODE_SCANNER.
    I would also suggest searching before posting a question;
    have a look at the thread below:
    Re: Transaction
    Hope it will solve your problem.
    Thanks & Regards
    ilesh 24x7

  • Importing large character string using Text_io

    Hi,
    I have been using TEXT_IO and/or CLIENT_TEXT_IO on Forms 10g (DB = 9i) successfully for a while, both to read (parse) and to write CSV files. Today I ran across a problem. The CSV file has two columns: a short character string (v_name varchar2(25)) and a character field that is usually less than 3,000 characters but can be as long as 15,000 characters (actually DNA sequences). Apparently, TEXT_IO stops working when this field is longer than roughly 4,000 characters. The database column it loads into is a CLOB; the variable (v_seq) itself is declared varchar2(15000), while the string (str) loaded with TEXT_IO.GET_LINE can be up to 32,000 characters. The code below works fine if the second column in the CSV is less than ~4,000 characters. I tried to change the str and v_seq variables to CLOB, but that did not run at all. Can anybody tell me how I can load longer character strings from CSV files?
    Thanks
    Larry Solomon
    DECLARE
      file_name VARCHAR2(100);
      file1     CLIENT_TEXT_IO.FILE_TYPE;
      str       VARCHAR2(32000);
      v_comma   NUMBER;
      v_name    VARCHAR2(25);
      v_seq     VARCHAR2(15000);
    BEGIN
      file_name := :control.orf_file_name;
      file1 := CLIENT_TEXT_IO.FOPEN(file_name, 'r');
      LOOP
        CLIENT_TEXT_IO.GET_LINE(file1, str);  -- raises NO_DATA_FOUND at end of file
        v_comma := INSTR(str, ',');
        v_name  := SUBSTR(str, 1, v_comma - 1);
        v_seq   := SUBSTR(str, v_comma + 1);
        INSERT INTO ibase (name, orf) VALUES (v_name, v_seq);
      END LOOP;
    EXCEPTION
      WHEN NO_DATA_FOUND THEN
        CLIENT_TEXT_IO.FCLOSE(file1);
        COMMIT;
    END;

    I believe the largest allowable size of a VARCHAR2 is 8000 now on 10.2 of the database. We don't use 10.2 RSFs, so we have an upper limit of 4000 in our client-side PL/SQL.
    That will surely be a problem in your code. You probably need to rewrite it to break your data into chunks and insert the chunks into the database.

  • How to use execute immediate for a character string value > 300

    How do I use execute immediate for a character string value > 300 characters? The parameter for this would be passed in.
    Case 1: When length = 300
    declare
    str1 varchar2(20) := 'select * from dual';
    str2 varchar2(20) := ' union ';
    result varchar2(300);
    begin
    result := str1 || str2 ||
    str1 || str2 ||
    str1 || str2 ||
    str1 || str2 ||
    str1 || str2 ||
    str1 || str2 ||
    str1 || str2 ||
    str1 || str2 ||
    str1 || str2 ||
    str1 || str2 ||
    str1 || ' ';
    dbms_output.put_line('length : '|| length(result));
    execute immediate result;
    end;
    /SQL> 2 3 4 5 6 7 8 9 10 11 12 13 14 15 16 17 18 19 20 21
    length : 300
    PL/SQL procedure successfully completed.
    Case 2: When length > 300 ( 301)
    SQL> set serveroutput on size 2000;
    declare
    str1 varchar2(20) := 'select * from dual';
    str2 varchar2(20) := ' union ';
    result varchar2(300);
    begin
    result := str1 || str2 ||
    str1 || str2 ||
    str1 || str2 ||
    str1 || str2 ||
    str1 || str2 ||
    str1 || str2 ||
    str1 || str2 ||
    str1 || str2 ||
    str1 || str2 ||
    str1 || str2 ||
    str1 || ' ';
    dbms_output.put_line('length : '|| length(result));
    execute immediate result;
    end;
    /SQL> 2 3 4 5 6 7 8 9 10 11 12 13 14 15 16 17 18 19 20 21
    declare
    ERROR at line 1:
    ORA-06502: PL/SQL: numeric or value error: character string buffer too small
    ORA-06512: at line 6

    result varchar2(300);
    Shouldn't the answer be here? Declare the variable with a greater size.
    Nicolas.

  • Retrieve value from element using extract gives ORA-19011: Character string

    Hi, I'm new to this. I'm trying to do this on Oracle 11.2.0.1.0; I need to store all JobId values in a table.
    DROP TABLE sox_xmltable_mytest;
    CREATE TABLE sox_xmltable_mytest as
    SELECT '<?xml version="1.0"?><ROWSET>'|| extract(OBJECT_VALUE, '/ArrayOfJobClass/JobClass/Tasks/TaskClass/JobId').getClobval() ||'</ROWSET>' as JOBID FROM sox_xmltable;
    drop TABLE sox_xmltable_tab;
    CREATE TABLE sox_xmltable_tab (
    poDoc XMLType NOT NULL
    );
    insert into sox_xmltable_tab
    SELECT XMLType(JOBID)
    FROM sox_xmltable_mytest;
    commit;
    ***sample query is ok
    SQL> set pages 0 long 100000000
    SQL> SELECT e.poDoc.extract('/ROWSET/JobId').getClobval()
    2 FROM sox_xmltable_tab e;
    <JobId>3deed63a-05a9-4018-8e17-455282c6af83</JobId><JobId>534c7b37-c6d3-454c-962
    4-3901887a6163</JobId><JobId>534c7b37-c6d3-454c-9624-3901887a6163</JobId><JobId>
    534c7b37-c6d3-454c-9624-3901887a6163</JobId><JobId>534c7b37-c6d3-454c-9624-39018
    87a6163</JobId><JobId>821c6b33-6d4a-43e0-aa24-13475da72fd6</JobId><JobId>821c6b3
    3-6d4a-43e0-aa24-13475da72fd6</JobId><JobId>821c6b33-6d4a-43e0-aa24-13475da72fd6
    </JobId><JobId>821c6b33-6d4a-43e0-aa24-13475da72fd6</JobId><JobId>6c33838b-2966-
    4428-a4f6-422a186433f0</JobId><JobId>a70719c2-9d54-49f2-9555-1cf60404468d</JobId
    <JobId>4efb985b-0a4b-456c-9b4a-fe9876073208</JobId><JobId>19beaecc-22ac-450d-bccf-2d4ff30bcc80</JobId><JobId>1c33002d-dfd0-4533-99c4-4310a887d528</JobId><JobId
    1c33002d-dfd0-4533-99c4-4310a887d528</JobId><JobId>1c33002d-dfd0-4533-99c4-4310a887d528</JobId><JobId>1c33002d-dfd0-4533-99c4-4310a887d528</JobId>
    ***error when i tried to get jobid
    SQL> SELECT e.poDoc.extract('/ROWSET/JobId/text()').getStringval() as ID
    2 FROM sox_xmltable_tab e;
    SELECT e.poDoc.extract('/ROWSET/JobId/text()').getStringval() as ID
    ERROR at line 1:
    ORA-19011: Character string buffer too small
    ORA-06512: at "SYS.XMLTYPE", line 169

    user503699 wrote:
    This should work for you
    Probably not.
    If I'm not mistaken, the OP wants each JobId in a separate row.
    @OP :
    In XMLTable, the main XQuery expression returns a sequence of nodes that will each represent a separate relational row in the final resultset.
    If you need JobId in separate rows, then you have to tell the XQuery to return a sequence of JobId.
    BTW, you don't have to use multiple intermediate tables either, just query from your base table :
    SELECT x.JobId
    FROM sox_xmltable t
       , XMLTable('/ArrayOfJobClass/JobClass/Tasks/TaskClass/JobId'
           passing t.object_value
           columns JobId varchar2(100) path '.'
         ) x
    ;

  • Program for string comparison in ABAP

    Hi,
    I require a program in ABAP for string comparison.

    Comparing Strings
    Similarly to the special statements for processing strings, there are special comparisons that you can apply to strings with types C, D, N, and T. You can use the following operators:
    Operator  Meaning
    CO        Contains Only
    CN        Contains Not only
    CA        Contains Any
    NA        contains Not Any
    CS        Contains String
    NS        contains No String
    CP        Contains Pattern
    NP        contains No Pattern
    There are no conversions with these comparisons. Instead, the system compares the characters of the string. The operators have the following functions:
    CO (Contains Only)
    The logical expression
    <f1> CO <f2>
    is true if <f1> contains only characters from <f2>. The comparison is case-sensitive. Trailing blanks are included. If the comparison is true, the system field SY-FDPOS contains the length of <f1>. If it is false, SY-FDPOS contains the offset of the first character of <f1> that does not occur in <f2> .
    CN (Contains Not only)
    The logical expression
    <f1> CN <f2>
    is true if <f1> also contains characters other than those in <f2>. The comparison is case-sensitive. Trailing blanks are included. If the comparison is true, the system field SY-FDPOS contains the offset of the first character of <f1> that does not also occur in <f2>. If it is false, SY-FDPOS contains the length of <f1>.
    CA (Contains Any)
    The logical expression
    <f1> CA <f2>
    is true if <f1> contains at least one character from <f2>. The comparison is case-sensitive. If the comparison is true, the system field SY-FDPOS contains the offset of the first character of <f1> that also occurs in <f2> . If it is false, SY-FDPOS contains the length of <f1>.
    NA (contains Not Any)
    The logical expression
    <f1> NA <f2>
    is true if <f1> does not contain any character from <f2>. The comparison is case-sensitive. If the comparison is true, the system field SY-FDPOS contains the length of <f1>. If it is false, SY-FDPOS contains the offset of the first character of <f1> that occurs in <f2> .
    CS (Contains String)
    The logical expression
    <f1> CS <f2>
    is true if <f1> contains the string <f2>. Trailing spaces are ignored and the comparison is not case-sensitive. If the comparison is true, the system field SY-FDPOS contains the offset of <f2> in <f1> . If it is false, SY-FDPOS contains the length of <f1>.
    NS (contains No String)
    The logical expression
    <f1> NS <f2>
    is true if <f1> does not contain the string <f2>. Trailing spaces are ignored and the comparison is not case-sensitive. If the comparison is true, the system field SY-FDPOS contains the length of <f1>. If it is false, SY-FDPOS contains the offset of <f2> in <f1> .
    CP (Contains Pattern)
    The logical expression
    <f1> CP <f2>
    is true if <f1> contains the pattern <f2>. If <f2> is of type C, you can use the following wildcards in <f2>:
    *  for any character string
    +  for any single character
    Trailing spaces are ignored and the comparison is not case-sensitive. If the comparison is true, the system field SY-FDPOS contains the offset of <f2> in <f1>. If it is false, SY-FDPOS contains the length of <f1>.
    If you want to perform a comparison on a particular character in <f2>, place the escape character # in front of it. You can use the escape character # to specify:
    characters in upper and lower case
    the wildcard character "*" (enter #*)
    the wildcard character "+" (enter #+)
    the escape symbol itself (enter ##)
    blanks at the end of a string (enter #___)
    NP (contains No Pattern)
    The logical expression
    <f1> NP <f2>
    is true if <f1> does not contain the pattern <f2>. In <f2>, you can use the same wildcards and escape character as for the operator CP.
    Trailing spaces are ignored and the comparison is not case-sensitive. If the comparison is true, the system field SY-FDPOS contains the length of <f1>. If it is false, SY-FDPOS contains the offset of <f2> in <f1> .
    DATA: F1(5) TYPE C VALUE <f1>,
          F2(5) TYPE C VALUE <f2>.
    IF F1 <operator> F2.
       WRITE: / 'Comparison true, SY-FDPOS=', SY-FDPOS.
    ELSE.
       WRITE: / 'Comparison false, SY-FDPOS=', SY-FDPOS.
    ENDIF.
    The following table shows the results of executing this program, depending on the operator and the values of F1 and F2.
    <f1>     <operator>  <f2>     Result  SY-FDPOS
    'BD   '  CO          'ABCD '  true    5
    'BD   '  CO          'ABCDE'  false   2
    'ABC12'  CN          'ABCD '  true    3
    'ABABC'  CN          'ABCD '  false   5
    'ABcde'  CA          'Bd   '  true    1
    'ABcde'  CA          'bD   '  false   5
    'ABAB '  NA          'AB   '  false   0
    'ababa'  NA          'AB   '  true    5
    'ABcde'  CS          'bC   '  true    1
    'ABcde'  CS          'ce   '  false   5
    'ABcde'  NS          'bC   '  false   1
    'ABcde'  NS          'ce   '  true    5
    'ABcde'  CP          '*b*'    true    1
    'ABcde'  CP          '*#b*'   false   5
    'ABcde'  NP          '*b*'    false   1
    'ABcde'  NP          '*#b*'   true    5
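    As a cross-check of the semantics above, here are rough Python analogues of three of the operators (ignoring SY-FDPOS; these are illustrative sketches, not ABAP):

```python
def co(f1, f2):
    """CO: f1 contains only characters from f2
    (case-sensitive, trailing blanks included)."""
    return set(f1) <= set(f2)

def ca(f1, f2):
    """CA: f1 contains at least one character from f2 (case-sensitive)."""
    return bool(set(f1) & set(f2))

def cs(f1, f2):
    """CS: f1 contains the string f2
    (case-insensitive, trailing blanks ignored)."""
    return f2.rstrip().lower() in f1.lower()

# A few rows from the table above:
print(co("BD   ", "ABCD "))   # True
print(co("BD   ", "ABCDE"))   # False: the trailing blanks are not in f2
print(ca("ABcde", "bD   "))   # False: case-sensitive
print(cs("ABcde", "bC   "))   # True: case-insensitive substring
```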
    Go to the SAP Library, if installed on your system, and check the documentation above.

  • Purpose of using wild cards in SAP-ABAP

    What is the purpose of using the following in ABAP programs?
    In my case these naming conventions are used in reports (custom programs):
    1) tables: *nast.
    Why do we use * before the table name?
    2) Data: %var1 type mara-matnr,
             %%var1 type marc-werks.
    What is the % used for when declaring the variable? What effect does it have? This kind of declaration is also used; for what purpose?
    3) Perform %<form-name>.
    A subroutine is called with the above kind of syntax; for what purpose?
    Please help me find out what they are used for.
    Message was edited by:
            VK AJAY

    Hi,
    WILDCARD characters are used for comparisons with character strings and numeric strings.
    Refer to this sample code:
    Example to select all customers whose name begins with 'M':
    DATA SCUSTOM_WA TYPE SCUSTOM.
    SELECT ID NAME FROM SCUSTOM
           INTO CORRESPONDING FIELDS OF SCUSTOM_WA
           WHERE NAME LIKE 'M%'.
      WRITE: / SCUSTOM_WA-ID, SCUSTOM_WA-NAME.
    ENDSELECT.
    Example to select all customers whose name contains 'huber':
    DATA SCUSTOM_WA TYPE SCUSTOM.
    SELECT ID NAME FROM SCUSTOM
           INTO CORRESPONDING FIELDS OF SCUSTOM_WA
           WHERE NAME LIKE '%huber%'.
      WRITE: / SCUSTOM_WA-ID, SCUSTOM_WA-NAME.
    ENDSELECT.
    Example to select all customers whose name does not contain 'n' as the second character:
    DATA SCUSTOM_WA TYPE SCUSTOM.
    SELECT ID NAME FROM SCUSTOM
           INTO CORRESPONDING FIELDS OF SCUSTOM_WA
           WHERE NAME NOT LIKE '_n%'.
      WRITE: / SCUSTOM_WA-ID, SCUSTOM_WA-NAME.
    ENDSELECT.
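    The % and _ wildcards in LIKE map directly onto regular expressions; a small Python sketch of the same matching rules (illustrative, not SAP code):

```python
import re

def like(value, pattern):
    """Mimic SQL LIKE: '%' matches any run of characters, '_' exactly one."""
    regex = "".join(
        ".*" if ch == "%" else "." if ch == "_" else re.escape(ch)
        for ch in pattern
    )
    return re.fullmatch(regex, value, flags=re.DOTALL) is not None

print(like("Maier", "M%"))       # True: name begins with 'M'
print(like("Anderson", "_n%"))   # True: 'n' as the second character
print(like("Smith", "_n%"))      # False
```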
    Reward points if this helps.
    Manish

  • How to execute an SQL query present in a string inside an ABAP program?

    Hello,
    How can I execute an SQL query contained in a string inside an ABAP program?

    Raut,
    You can execute Native SQl statements.
    Ex: To use a Native SQL statement, you must precede it with the EXEC SQL statement, and follow it with the ENDEXEC statement as follows:
    EXEC SQL [PERFORMING <form>].
      <Native SQL statement>
    ENDEXEC.
    There is no period after Native SQL statements. Furthermore, using inverted commas (") or an asterisk (*) at the beginning of a line in a native SQL statement does not introduce a comment as it would in normal ABAP syntax. You need to know whether table and field names are case-sensitive in your chosen database.
    In Native SQL statements, the data is transported between the database table and the ABAP program using host variables. These are declared in the ABAP program, and preceded in the Native SQL statement by a colon (:). You can use elementary structures as host variables. Exceptionally, structures in an INTO clause are treated as though all of their fields were listed individually.
    If the selection in a Native SQL SELECT statement is a table, you can pass it to ABAP line by line using the PERFORMING addition. The program calls a subroutine <form> for each line read. You can process the data further within the subroutine.
    As in Open SQL, after the ENDEXEC statement, SY-DBCNT contains the number of lines processed. In nearly all cases, SY-SUBRC contains the value 0 after the ENDEXEC statement. Cursor operations form an exception: After FETCH, SY-SUBRC is 4 if no more records could be read. This also applies when you read a result set using EXEC SQL PERFORMING.
    EXEC SQL PERFORMING loop_output.
      SELECT connid, cityfrom, cityto
      INTO   :wa
      FROM   spfli
      WHERE  carrid = :c1
    ENDEXEC.
    Please mark if useful.

  • String operations on internal table text....

    The original table consists of 2 columns:
    E  RFC error(SM_DHTCLNT010_READ): Error when opening connection
    E  RFC error(SM_DHLCLNT010_READ): Error when opening connection
    E  RFC error(SM_DHKCLNT010_READ): Error when opening connection
    E  RFC error(SM_E10CLNT000_READ): 'tdhtci00.emea.gdc:sapgw02'
    E  No read RFC FOR SM_B72CLNT003_READ
    E  No read RFC FOR SM_B71CLNT003_READ
    S  Clients for system 'E21' found in RFC 'SM_E21CLNT001_READ'
    S  Clients for system 'E22' found in RFC 'SM_E22CLNT001_READ'
    S  Clients for system 'E23' found in RFC 'SM_E22CLNT001_READ'
    Now we need to apply string operations so that the result table has 4 columns with the new, refined message:
    Status  SID  Message      NEW_TEXT
    E       DHT  RFC error    Error when opening connection
    E       DHL  RFC error    Error when opening connection
    E       DHK  RFC error    Error when opening connection
    E       E10  RFC error    tdhtci00.emea.gdc:sapgw02
    E       B72  No RFC LINK
    E       B71  No RFC LINK
    S       E21  DATA READ
    S       E22  DATA READ
    S       E23  DATA READ
    The string conditions to arrive at the new table are:
    1) To get the SID column:
    - If the status is "RFC error", the 3 characters after the "_" must be extracted as the SID.
    - Otherwise the SID is between the first and the second single quote.
    Example: Clients for system 'E21' found in RFC 'SM_E21CLNT001_READ' extracts "E21" as the SID.
    2) For the Message column:
    - message "RFC error" if the message text starts with "RFC error"
    - message "No RFC Link" if the message text starts with "No read RFC*"
    - message "Data Read" if the substring "found in RFC" was found in the message
    3) If the status is "RFC error", the whole text string behind the ": " must be extracted.
    For example, if the message is RFC error(SM_DHLCLNT010_READ): Error when opening connection, then NEW_TEXT will be Error when opening connection.
    Need your inputs on these.
    Best regards,
    Subba

    Hi,
    You can achieve this simply using offsets:
    var_name+off(len).
    e.g.
    wa_message-fld+0(3) " first three characters
    wa_message-fld(3)   " same as above: first three characters
    wa_message-fld+2(2) " third and fourth characters of wa_message-fld (offsets are zero-based)
    You can use this to set conditions like:
    IF wa_message-fld(9) = 'RFC error'.
      " process here
    ENDIF.
    Hope this will help you.
    Jogdand M B
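    For what it's worth, the three rules stated in the question are easy to prototype outside ABAP; a rough Python sketch (rule details as given above, names illustrative):

```python
def refine(status, text):
    """Classify a message and extract SID/NEW_TEXT per the rules above."""
    if text.startswith("RFC error"):
        sid = text.split("_", 1)[1][:3]      # 3 characters after the first '_'
        new_text = text.split(": ", 1)[1]    # everything behind ': '
        return status, sid, "RFC error", new_text
    if text.startswith("No read RFC"):
        sid = text.split("_", 1)[1][:3]      # SID also follows the first '_'
        return status, sid, "No RFC Link", ""
    if "found in RFC" in text:
        sid = text.split("'")[1]             # between the first pair of quotes
        return status, sid, "Data Read", ""
    return status, "", "", text
```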

  • Adding zeros to a character string

    Hello friends,
    I want to add leading zeros to a field.
    The field is a character string.
    For example:
    data: A(5) type c.
    Now when A = 'ab' (a non-numeric value),
    I want this to be converted to '000ab'.
    So, is there any standard function module or other way of doing this conversion?
    I tried the FM 'CONVERSION_EXIT_ALPHA_INPUT', but this FM does not work for non-numeric inputs.
    Thanks.

    Hi,
    A packed field is transported right-justified to the character field, if required with a
    decimal point. The first position is reserved for the sign. Leading zeros appear as
    blanks. If the target field is too short, the sign is omitted for positive numbers. If this is still not sufficient, the field is truncated on the left. ABAP indicates the truncation with an asterisk (*). If you want the leading zeros to appear in the character field, use UNPACK instead of MOVE.
    UNPACK
    Converts variables from type P to type C.
    Syntax:
    UNPACK <f> TO <g>.
    Unpacks the packed field <f> and places it in the string <g> with leading zeros. The opposite of PACK.
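    Note that UNPACK only applies to a packed (type P) source, so it does not cover the non-numeric case asked about. What the asker wants is plain right-justified zero padding; in Python terms:

```python
def pad_leading_zeros(value, width=5):
    """Right-justify `value` in a field of `width`, filling with '0'."""
    return value.rjust(width, "0")

print(pad_leading_zeros("ab"))     # works for non-numeric input too
print(pad_leading_zeros("12345"))  # already full width, unchanged
```

    In ABAP a commonly cited idiom for the same effect (untested here) is to SHIFT the field RIGHT DELETING TRAILING space and then OVERLAY it WITH '00000'.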
    Regards,
    Bhaskar

  • Use of q operator in oracle 11g

    Hi,
    Hi,
    I am looking for the following data: Rewards–ALL (UK/IE,US,Germany & China), even though I have used the q operator. When the report is generated using the following query and UTL_FILE into an Excel sheet, it shows the following data:
    Rewards–ALL (UK/IE,US,Germany & China)
    Here is the query (the SQL Developer worksheet result tab shows Rewards–ALL (UK/IE,US,Germany & China), though):
    SELECT 'c' temp, q'#Rewards–ALL (UK/IE,US,Germany & China)#'
    FROM memberhistory mlh
    thanks
    sandy
    Edited by: Sandy310 on Apr 27, 2010 5:10 PM

    First, it appears you have a non-ASCII character after REWARDS: the "–" (en dash) character.
    Did you SET ESCAPE ON?
    Here is what I did after changing out that character in SQL Developer; I executed it as a script:
    set escape on
    SELECT  'c' temp
            ,q'[Rewards-ALL (UK/IE,US,Germany \& China)]'  search_string
    FROM
             dual   ;
    TEMP SEARCH_STRING
    c    Rewards-ALL (UK/IE,US,Germany & China)
    I don't know why you are using the q operator. It is designed to eliminate the need for doubling up single quotes, and the string you have does not contain any quotes.

  • ASCII character/string processing and performance - char[] versus String?

    Hello everyone
    I am a relative novice to Java; I have a procedural C programming background.
    I am reading many very large (many GB) comma/double-quote separated ASCII CSV text files and performing various kinds of pre-processing on them, prior to loading into the database.
    I am using Java7 (the latest) and using NIO.2.
    The IO performance is fine.
    My question is regarding performance of using char[i] arrays versus Strings and StringBuilder classes using charAt() methods.
    I read a file one line/record at a time and then process it. Regex is not an option (too slow, and it cannot handle all the cases I need to cover).
    I noticed that accessing a single character of a given String (or StringBuilder) using the String.charAt(i) method is several times (5+ times?) slower than indexing into a char array.
    My question: is this a correct observation regarding the charAt() versus char[i] performance difference, or am I doing something wrong in the String case?
    What is the best way (performance) to process character strings inside Java if I need to process them one character at a time ?
    Is there another approach that I should consider?
    Many thanks in advance

    >
    Once I took that String.length() method out of the 'for loop' and used integer length local variable, as you have in your code, the performance is very close between array of char and String charAt() approaches.
    >
    You are still worrying about something that is irrevelant in the greater scheme of things.
    It doesn't matter how fast the CPU processing of the data is if it is faster than you can write the data to the sink. The process is:
    1. read data into memory
    2. manipulate that data
    3. write data to a sink (database, file, network)
    The reading and writing of the data are going to be tens of thousands of times slower than any CPU you will be using. That read/write part of the process is the limiting factor of your throughput; not the CPU manipulation of step #2.
    Step #2 can only go as fast as steps #1 and #3 permit.
    Like I said above:
    >
    The best 'file to database' performance you could hope to achieve would be loading simple, 'known to be clean', record of a file into ONE table column defined, perhaps, as VARCHAR2(1000); that is, with NO processing of the record at all to determine column boundaries.
    That performance would be the standard you would measure all others against and would typically be in the hundreds of thousands or millions of records per minute.
    What you would find is that you can perform one heck of a lot of processing on each record without slowing that 'read and load' process down at all.
    >
    Regardless of the sink (DB, file, network), when you are designing data transport services you need to identify the 'slowest' parts. Those are the 'weak links' in the data chain. Once you have identified and tuned those parts, the performance of any other step merely needs to be 'slightly' better to avoid becoming a bottleneck.
    That CPU part in step #2 is only rarely, if ever, the problem. Don't even consider it for specialized tuning until you demonstrate that it is needed.
    Besides, if your code is properly designed and modularized you should be able to 'plug n play' different parse and transform components after the framework is complete and in the performance test stage.
    >
    The only thing that is fixed is that all input files are ASCII (not Unicode) characters in range of 'space' to '~' (decimal 32-126) or common control characters like CR,LF,etc.
    >
    Then you could use byte arrays and byte processing to determine the record boundaries even if you then use String processing for the rest of the manipulation.
    That is what my framework does. You define the character set of the file and a 'set' of allowable record delimiters as Strings in that character set. There can be multiple possible record delimiters and each one can be multi-character (e.g. you can use 'XyZ' if you want).
    The delimiter set is converted to byte arrays and the file is read using RandomAccessFile and double-buffering and a multiple mark/reset functionality. The buffers are then searched for one of the delimiter byte arrays and the location of the delimiter is saved. The resulting byte array is then saved as a 'physical record'.
    Those 'physical records' are then processed to create 'logical records'. The distinction is due to possible embedded record delimiters, as you mentioned. One logical record might appear as two physical records if a field has an embedded record delimiter. That is resolved easily, since each logical record in the file MUST have the same number of fields.
    So a record with an embedded delimiter will have fewer fields than required, meaning it needs to be combined with one or more of the following records.
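    The physical-to-logical repair step described here is easy to sketch. A simplified Python version, assuming a known field count per logical record (it counts raw separators, so it ignores separators that quoting would protect):

```python
def to_logical_records(physical_records, n_fields, sep=","):
    """Combine physical records until each has the expected field count.
    A record split by an embedded record delimiter has too few fields,
    so it is joined with the following record(s)."""
    logical, pending = [], ""
    for rec in physical_records:
        pending = pending + "\n" + rec if pending else rec
        if pending.count(sep) >= n_fields - 1:   # enough separators now
            logical.append(pending)
            pending = ""
    return logical

print(to_logical_records(["a,b,c", '"line1', 'line2",x,y'], 3))
```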
    >
    My files have no metadata, some are comma delimited and some comma and double quote delimited together, to protect the embedded commas inside columns.
    >
    I didn't mean the files themselves needed to contain metadata. I just meant that YOU need to know what metadata to use. For example, you need to know that there should ultimately be 10 fields for each record. The file itself may have fewer physical fields due to TRAILING NULLCOLS, whereby consecutive NULL fields at the end of a record do not need to be present.
    >
    The number of columns in a file is variable and each line in any one file can have a different number of columns. Ragged columns.
    There may be repeated null columns in any line, like ,,, or "","","" or any combination of the above.
    There may also be spaces between delimiters.
    The files may be UNIX/Linux terminated or Windows Server terminated (CR/LF or CR or LF).
    >
    All of those are basic requirements and none of them present any real issue or problem.
    >
    To make it even harder, there may be embedded LF characters inside the double quoted columns too, which need to be caught and weeded out.
    >
    That only makes it 'harder' in the sense that virtually NONE of the standard software available for processing delimited files take that into account. There have been some attempts (you can find them on the net) for using various 'escaping' techniques to escape those characters where they occur but none of them ever caught on and I have never found any in widespread use.
    The main reason for that is that the software used to create the files to begin with isn't written to ADD the escape characters but is written on the assumption that they won't be needed.
    That read/write for 'escaped' files has to be done in pairs. You need a writer that can write escapes and a matching reader to read them.
    Even the latest version of Informatica and DataStage cannot export a simple one-column table that contains an embedded record delimiter and read it back properly. Those tools simply have NO functionality to let you even TRY to detect that embedded delimiters exist, let alone do anything about it by escaping those characters. I gave up back in the '90s trying to convince the Informatica folk to add that functionality to their tool. It would be simple to do.
    >
    Some numeric columns will also need processing to handle currency signs and numeric formats that are not valid for database input.
    It does not feel like a job for regex (I want to be able to maintain the code, and complex regex is often 'write-only' code that a 9200bpm modem would be proud of!), and I don't think PL/SQL will be any faster or easier than Java for this sort of character-based work.
    >
    Actually for 'validating' that a string of characters conforms (or not) to a particular format is an excellent application of regular expressions. Though, as you suggest, the actual parsing of a valid string to extract the data is not well-suited for RegEx. That is more appropriate for a custom format class that implements the proper business rules.
    You are correct that PL/SQL is NOT the language to use for such string parsing. However, Oracle does support Java stored procedures, so that could be done in the database. I would only recommend pursuing that approach if you already needed to perform some substantial data validation or processing in the DB to begin with.
    >
    I have no control over format of the incoming files, they are coming from all sorts of legacy systems, many from IBM mainframes or AS/400 series, for example. Others from Solaris and Windows.
    >
    Not a problem. You just need to know what the format is so you can parse it properly.
    >
    Some files will be small, some many GB in size.
    >
    Not really relevant except as it relates to the need to SINK the data at some point. The larger the amount of SOURCE data the sooner you need to SINK it to make room for the rest.
    Unfortunately, the very nature of delimited data with varying record lengths and possible embedded delimiters means that you can't really chunk the file to support parallel read operations effectively.
    You need to focus on designing the proper architecture to create a modular framework of readers, writers, parsers, formatters, etc. Your concern with details about String versus Array are way premature at best.
    My framework has been doing what you are proposing and has been in use for over 20 years by three different major international clients. I have never had any issues with the level of detail you have asked about in this thread.
    Throughput is limited by the performance of the SOURCE and the SINK. The processing in between has NEVER been an issue.
    A modular framework allows you to fine-tune or even replace a component at any time with just 'plug n play'. That is what Interfaces are all about. Any code you write for a parser should be based on an interface contract. That allows you to write the initial code using the simplest possible method and then later if, and ONLY if, that particular module becomes a bottlenect, replace that module with one that is more performant.
    Your intital code should ONLY use standard well-established constructs until there is a demonstrated need for something else. For your use case that means String processing, not byte arrays (except for detecting record boundaries).
