Faster split than String.split() and StringTokenizer?

First I improved the performance of splitting by replacing the String.split() call with a custom method using StringTokenizer:
                // needs java.util.StringTokenizer; string, separator, emptyStrings and result come from the enclosing method
                final StringTokenizer st = new StringTokenizer(string, separator, true);
                String token = null;
                String lastToken = separator; //if first token is separator
                while (st.hasMoreTokens()) {
                    token = st.nextToken();
                    if (token.equals(separator)) {
                        if (lastToken.equals(separator)) { //no value between 2 separators?
                            result.add(emptyStrings ? "" : null);
                        }
                    } else {
                        result.add(token);
                    }
                    lastToken = token;
                }//next token
But this is still not very fast (it is one of the "hot spots" in my profiling sessions). I wonder if splitting strings with ";" as the delimiter can be made faster still?

Yup, for simple splitting without escaping of separators, indexOf is more than twice as fast:
    // needs java.util.ArrayList and java.util.List
    // note: emptyStrings is not evaluated in this version; empty tokens are always added as ""
    static private List<String> fastSplit(final String text, char separator, final boolean emptyStrings) {
        final List<String> result = new ArrayList<String>();
        if (text != null && text.length() > 0) {
            int index1 = 0;
            int index2 = text.indexOf(separator);
            while (index2 >= 0) {
                String token = text.substring(index1, index2);
                result.add(token);
                index1 = index2 + 1;
                index2 = text.indexOf(separator, index1);
            }
            // add the remainder after the last separator (it may be a single character)
            if (index1 < text.length()) {
                result.add(text.substring(index1));
            }
        }//else: input unavailable
        return result;
    }
Faster? ;-)
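
To check the "more than twice as fast" claim on your own machine, a rough timing sketch along these lines may help. The class name SplitTiming and the helper names indexOfSplit, tokenizerSplit and regexSplit are made up for this illustration, the input line is arbitrary, and a System.nanoTime() loop is only a crude measurement (a harness such as JMH would be more reliable). Results will vary by JDK; newer JDKs have a fast path in String.split() for single-character separators, so the gap may be smaller than reported above.

```java
import java.util.ArrayList;
import java.util.Arrays;
import java.util.List;
import java.util.StringTokenizer;

public class SplitTiming {

    // indexOf-based split: keeps empty tokens between separators, drops a trailing empty token
    static List<String> indexOfSplit(String text, char separator) {
        List<String> result = new ArrayList<>();
        int start = 0;
        int end = text.indexOf(separator);
        while (end >= 0) {
            result.add(text.substring(start, end));
            start = end + 1;
            end = text.indexOf(separator, start);
        }
        if (start < text.length()) {
            result.add(text.substring(start));
        }
        return result;
    }

    // StringTokenizer-based split: never returns empty tokens
    static List<String> tokenizerSplit(String text, String separator) {
        List<String> result = new ArrayList<>();
        StringTokenizer st = new StringTokenizer(text, separator);
        while (st.hasMoreTokens()) {
            result.add(st.nextToken());
        }
        return result;
    }

    // String.split-based split: the separator is interpreted as a regular expression
    static List<String> regexSplit(String text, String separator) {
        return Arrays.asList(text.split(separator));
    }

    public static void main(String[] args) {
        String line = "alpha;beta;gamma;delta;epsilon"; // arbitrary sample input
        int rounds = 1_000_000;
        long sink = 0; // accumulated so the JIT cannot discard the work

        // two passes: treat the first one as warm-up
        for (int pass = 0; pass < 2; pass++) {
            long t0 = System.nanoTime();
            for (int i = 0; i < rounds; i++) sink += indexOfSplit(line, ';').size();
            long t1 = System.nanoTime();
            for (int i = 0; i < rounds; i++) sink += tokenizerSplit(line, ";").size();
            long t2 = System.nanoTime();
            for (int i = 0; i < rounds; i++) sink += regexSplit(line, ";").size();
            long t3 = System.nanoTime();
            System.out.printf("pass %d: indexOf %d ms, StringTokenizer %d ms, String.split %d ms%n",
                    pass, (t1 - t0) / 1_000_000, (t2 - t1) / 1_000_000, (t3 - t2) / 1_000_000);
        }
        System.out.println("tokens counted: " + sink);
    }
}
```

The three helpers agree on inputs without empty fields; they differ once consecutive or trailing separators appear, so compare them only on data where that difference does not matter.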

Similar Messages

  • XML data treated as a character string in «append» & «insert-aft» operations

    Hello,
    I use BPEL 10.1.3.1. Here is my problem. I retrieve a CLOB value from a database:
    <c>
    <d>
    <e></e>…
    </d>
    <f>
    <g></g>
    <h></h>…
    </f>
    </c>
    I insert this CLOB value into a new variable "compteRenduXml" (type: simple type, XSD language). When I inspect my "compteRenduXml" variable, it is XML data (that's OK).
    I am trying to insert the "compteRenduXml" variable after the <b>…</b> element:
    <a>
    <b>…</b>
    </a>     
    The problem is that the variable I would like to insert is treated as a character string (and not as XML data!).
    Afterwards, when I try to insert a value into "<e>abcd</e>", for example, I get this error: « The XPath chain does not return any node. »
    Can you help me?
    Thank you.
    Tania

    If I use the following in my SQL*Loader script:
    INTO TABLE XML_DOCUMENT_TABLE_VWO TRUNCATE
    XMLType(xmlcol)
    FIELDS TERMINATED BY ',' OPTIONALLY ENCLOSED BY '"'
    xmlcol
    I always get this error with both XML documents (< 255 KB and > 255 KB):
    Table XML_DOCUMENT_TABLE_VWO, loaded from every logical record.
    Insert option in effect for this table: TRUNCATE
    Column Name Position Len Term Encl Datatype
    XMLCOL FIRST * , O(") CHARACTER
    Record 1: Rejected - Error on table XML_DOCUMENT_TABLE_VWO.
    ORA-00904: "SYS_NC_ROWINFO$": invalid identifier

  • StringTokenizer vs. split and empty strings -- some clarification please?

    Hi everybody,
    I posted a question that was sort of similar to this once, asking if it was best to convert any StringTokenizers to calls to split when parsing strings, but this one is a little different. I rarely use split, because if there are consecutive delimiters, it gives empty strings in the array it returns, which I don't want. On the other hand, I know StringTokenizer is slower, but it doesn't give empty strings with consecutive delimiters. I would use split much more often if there was a way to use it and not have to check every array element to make sure it isn't the empty string. I think I may have misunderstood the javadoc to some extent--could anyone explain to me why split causes empty strings and StringTokenizer doesn't?
    Thanks,
    Jezzica85

    Because they are different?
    Tokenizers are designed to return tokens, whereas split simply splits the String up into bits. They have different purposes and uses, to be honest. I believe the results of previous discussions of this have indicated that tokenizers are slightly (very slightly and not really meaningfully) faster, and tokenizers do have the option of returning delimiters as well, which can be useful and is a functionality not present in a straight split.
    However, split and regex in general are newer additions to the Java platform and they do have some advantages. The most obvious is that you cannot use a tokenizer to split up values where the delimiter is a multi-character sequence, and you can with split.
    So in general the advice given to you was good, because split gives you more flexibility down the road. If you don't want the empty strings, then yes, just read them and throw them away.
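
    The difference comes from how each one treats consecutive delimiters: StringTokenizer regards a run of delimiter characters as a single break between tokens, while split returns the text between every pair of matches, which is the empty string when two delimiters are adjacent (trailing empty strings are dropped by the one-argument split). A small sketch to illustrate; the class name EmptyTokenDemo and its helper names are made up for this example:

```java
import java.util.ArrayList;
import java.util.Arrays;
import java.util.List;
import java.util.StringTokenizer;

public class EmptyTokenDemo {

    // String.split keeps the empty string between adjacent delimiters
    static List<String> viaSplit(String s) {
        return Arrays.asList(s.split(";"));
    }

    // StringTokenizer skips over runs of delimiters, so no empty tokens appear
    static List<String> viaTokenizer(String s) {
        List<String> tokens = new ArrayList<>();
        StringTokenizer st = new StringTokenizer(s, ";");
        while (st.hasMoreTokens()) {
            tokens.add(st.nextToken());
        }
        return tokens;
    }

    // split, then throw the empty strings away, as suggested above
    static List<String> viaSplitNoEmpties(String s) {
        List<String> tokens = new ArrayList<>();
        for (String part : s.split(";")) {
            if (!part.isEmpty()) {
                tokens.add(part);
            }
        }
        return tokens;
    }

    public static void main(String[] args) {
        String input = "a;;b;";
        System.out.println(viaSplit(input));          // [a, , b]
        System.out.println(viaTokenizer(input));      // [a, b]
        System.out.println(viaSplitNoEmpties(input)); // [a, b]
    }
}
```

    Filtering after the split costs one isEmpty() check per element, which is usually negligible next to the split itself.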
    Edited by: cotton.m on Mar 6, 2008 7:34 AM
    goddamned stupid forum formatting

  • Unicode export:Table-splitting and package splitting

    Hi SAP experts,
    I know there are lot of forums related to this topic, but I have some new questions and hence posting a new thread.
    We are in the process of doing a Unicode conversion in our landscape (CRM 7.0 system based on NW 7.01), and we are running on AIX 6.1 and DB2 9.5. The database size is around 1.5 TB, so we want to optimize the export and import in order to reduce the downtime. As part of the process, we have tried table splitting and parallel export/import.
    However, we have some doubts about whether this table splitting has actually worked in our scenario, as the export ran for nearly 28 hours.
    The steps followed by us :
    1.) Doing the export preparation using SAPINST
    2.) Doing the table splitting preparation, by creating a table input file with entries in the format <tablename>%<No.of splits>. Also, we have used the latest R3ta file and dbdb6slib.o (belonging to version 7.20, even though our system is on 7.01) using SAPINST.
    3.) Starting with the export using SAPINST.
    Some observations and questions:
    1.) After completion of the table splitting preparation, .WHR files had been generated for each of the tables in the DATA directory of the export location. However, how many .WHR files should be created, and on what basis are they created?
    2.) Take as an example the table PRCD_CLUST (cluster table) in our environment, which we had split. 29 *.WHR files were created for this table. The number of splits given for this table was 36 and the table size is around 72 GB. We noticed that the first 28 .WHR files for this table had lots of records, but the 29th and last .WHR file had only 1 record. We also noticed that the packages for the first 28 splits were created quite fast, but the 29th took a long time (several hours) to complete. Lots of packages (around 56) of 1 GB each were generated for this 29th split, and only one R3load was running for it, generating the packages one by one.
    3.) Is there any rule of thumb for deciding on the number of splits for a table? Also, during the export, is there anything that needs to be specified on the input screens when we use table splitting?
    4.) What exactly is the difference between table splitting and package splitting? Are they both effective together?
    If you have any questions or need any clarification or further input, please let me know.
    It would be great if we could get some insight into this whole procedure. We know a lot of things are taken care of by SAPINST itself in the background, but we want to be certain that we have done the right thing and that this is the way it should work.
    Regards,
    Santosh Bhat

    Hi,
    First of all, please ignore my very first response... I accidentally posted a response to some other thread... sorry for that.
    Now, coming to your questions...
    > 1.) Can package splitting and table-splitting be used together? If yes or no, what exactly is the procedure to be followed. As I observed that, the packages also have entries of the tables that we decided to split. So, does package splitting or table-splitting override the other, and only one of the two can be effective at a time?
    Package splitting and table splitting work together, because they serve different purposes.
    My way of doing it is...
    When I do the package split I choose packageLimit 1000 and also split out the tables (which I selected for table split) into separate packages (one package per table). I do it because that helps me to track those tables.
    Once the above is done, I follow it up with R3ta and wheresplitter for those tables.
    This is followed by a manual Migration Monitor run to do the export/import. As mentioned in the previous reply above, you need to ensure you have sequenced your packages properly... large tables are exported first, use sections in the package list file, etc.
    > 2.) If you are well versed with table splitting procedure, could you describe maybe in brief the exact procedure?
    Well, I would say run R3ta (it will create multiple select queries) followed by wheresplitter (which will just split each of the selects into multiple WHR files)...
    Best would be to go through some documentation on table splitting and let me know if you have a specific query. Don't miss the role of the hints file.
    > 3.) Also, I have mentioned about the version of R3ta and library file in my original post. Is this likely to be an issue?Also, is there a thumb rule to decide on the no.of splits for a table.
    The rule is to use the executables of the kernel version supported by your system version. I am not well versed with 7.01 and 7.2 support... to give you an example, I should not use the 700 R3ta on a 640 system, although it works.
    >1.) After completion of tablesplitting preparation, there were .WHR files that were generated for each of the tables in DATA directory of export location. However, how many .WHR files should be created and on what basis are they created?
    If you ask for 10 splits... you will get 10 splits, or in some cases 11; the reason might be the field it is using to split the table (the where clause). But I am not 100% sure about it.
    > 2) I will take an example of a table PRCD_CLUST(cluster table) in our environment, which we had split. We had 29 *.WHR files that were created for this particular table. The number of splits given for this table was 36 and the table size is around 72 GB.Also, we noticed that the first 28 .WHR files for this table, had lots of records but the last 29th .WHR file, had only 1 record. But we also noticed that, the packages/splits for the 1st 28 splits were created quite fast but the last one,29th one took a long time(serveral hours) to get completed.Also,lots of packages were generated(around 56) of size 1 GB each for this 29th plit. Also, there was only one R3load which was running for this 29th split, and was generating packages one by one.
    Not sure why you got 29 splits when you asked for 36; one reason might be that the field (key) used for the split didn't have more than 28 unique values. I don't know how PRCD_CLUST is split; you need to check the hints file for the "key". As an example, suppose my table is split using company code and I have 10 company codes: even if I ask for 20 splits I will get only 10 splits (WHRs).
    Yes, the 29th file will always have fewer records; if you open the 29th WHR you will see that it has the "greater than" clause. The first and the last WHR files have the "less than" and "greater than" clauses, a kind of safety net which allows you to prepare the split even before your downtime has started. These 2 WHRs ensure that no record gets missed, even though you might have prepared your WHR files a week before the actual migration.
    > 3) Also,Our question here is that is there any thumb rule for deciding on the number of splits for a table.Also, during the export, are there any things that need to be specified, while giving the inputs when we use table splitting,in the screen?
    I am not aware of any rule of thumb. In the first iteration you might choose something like 10 for 50 GB, 20 for 100 GB. If any of the tables overshoots the window, then you can try increasing or decreasing the number of splits for that table. For me, a couple of times the total export/import time improved by reducing the splits of some tables (I suppose I was oversplitting those tables).
    Regards,
    Neel
    Edited by: Neelabha Banerjee on Nov 30, 2011 11:12 PM

  • Output from dbms_output.put_line splits and move to next line

    Hi All,
    I am printing out a list using dbms_output.put_line; it looks like this:
    One or more of following Required Parameters are missing:
    1. Primary Field
    2. Structure Field
    3. Structure
    Table
    4. List File Name
    5. Query Directory
    6. Query String
    but I don't know why the third option is split and continues on a second line. Any idea? It's not even that long.
    thanks

    set linesize 150
    or set it as per your requirement

  • Split and Joins?

    Hi,
    Could anyone explain the split and Join with simple scenarios?
    I understand that a split is geared towards movement from one activity to more than one activities, and vice-versa for a Join! But, is there a 'correlation' between the Or-split, And-Split and Or-join, And-join?
    Thanks experts,
    Kosh!

    Absolutely. An And-split means both work items should be evaluated at the And-join; the default is the Or-split.
    But check this against your needs: if you just need approvals where there are, let's say, 5 possible approvers on a particular work item, then you do not really need to use split/join; just use iterate and break out of it on one approval.

  • Import issue - split and compressed dump

    Hi,
    I received a 15 GB export dump file from the site as listed below; the pieces are split and compressed:
    1. xaa 4gb
    2. xab 4gb
    3. xac 4gb
    4. xad 3gb
    I have to import these dump files here on a Unix server. I found a document on importing split and compressed dumps
    and followed the steps below:
    1. copy all 4 files into a directory
    2. then I used the commands
    rm -f import_pipe
    mknod import_pipe p
    chmod 666 import_pipe
    (import_pipe file created in the current directory)
    nohup cat xaa xab xac xad | uncompress - > import_pipe &
    (A process number such as 23901 is printed. Do we need to wait until the background process completes before giving the IMPORT command?)
    Then I run
    imp userid=<connection string> file=import_pipe full=yes ignore=yes log=dumplog.txt
    and it shows the imp-0009 error...
    Please help me resolve this issue.
    Thanks in advance

    Please post details of the OS and database versions of the source and target. You will have to contact the source to determine how these files were created. It is quite possible that they were created using the FILESIZE parameter (http://docs.oracle.com/cd/E11882_01/server.112/e22490/original_export.htm#autoId25), in which case the import process can read from these multiple files without you having to manipulate them further.
    HTH
    Srini

  • How to show character length 60 ,If we split and distribute..??

    Hi Friends,
    As mentioned in other threads, we split the string into parts depending on its length and send the parts into different InfoObjects. That is OK, but after sending it into different objects, how can we see it in the query as a single string? Is it done with something like customer exit variables?

    MSTF ,
    What you can do is use the table modifier to concatenate the three columns into one at runtime and hide the other two columns of the table. But again, I assume you are using BEx.
    Arun Varadarajan
    P.S Otherwise store the comments in a BW Internal Table and then display the data in the concerned cell at runtime using the table modifier.

  • Using split and regex

    I have a string like this:
    String parseMe = "//element//elementAgain/elementSecond"
    I want to get the strings between // and /. How can I do that using split and regex? I'm confused about how to construct my regex. Can anyone help me or show me how? Thanks!

            String parseMe = "//element//elementAgain/elementSecond";
            Pattern p = Pattern.compile("//(.*?)(?=/)");
            for(Matcher m = p.matcher(parseMe); m.find(); System.out.println(m.group(1)));

  • I need my keyboard to be split and at the bottom

    My keyboard was split and at the bottom of my screen, and all of a sudden it quit and is back in the middle. How do I get it back?

    When you have the keyboard in the 'split' option (via the keyboard, holding down the button and selecting 'Split'), then when you go to merge it back, it gives the command "Merge and Dock". This says to me that when you choose 'Split' it will split and go centre screen. If it was meant to stay at the bottom, it would just say "Merge" with the Dock, so I'm thinking it's not a bug, it's intended, and Apple wrote the manual wrong.
    It's easier to write words in a manual than it is to write code to have it 'split' and 'docked'. So it seems an intentional thing to have it mid-screen, IMHO anyway.

  • I have a huge file which is in GB and I want to split the video into clip and export each clip individually. Can you please help me how to split and export the videos to computer? It will be of great help!!

    I have a huge file which is in GB and I want to split the video into clip and export each clip individually. Can you please help me how to split and export the videos to computer? It will be of great help!!

    video
    What version of Premiere Elements do you have and on what computer operating system is it running?
    Please review the following workflow.
    ATR Premiere Elements Troubleshooting: PE11: Project Assets Organization for Scene and Highlight Grabs from Collection o…
    But please also determine if your project goal is supported by
    a. format of your source
    and
    b. computer resources
    More later based on details that you will post.
    ATR

  • Can we split and fetch the records in Database Adapter

    Hi,
    I designed a Database Adapter to fetch records from an Oracle database. Sometimes the Database Adapter needs to fetch around 5,000 or 10,000 records in a single shot. In that case my BPEL process chokes and fails with the error
    java.lang.OutOfMemoryError: Java heap space at java.util.Arrays.copyOf(Arrays.java:2882) at java.lang.AbstractStringBuilder.expandCapacity(AbstractStringBuilder.java:100)
    Could someone help me to resolve this?
    Can the Database Adapter split the fetch when the number of records is more than 1000?
    E.g. the first 100 records as one set, the next 100 as the 2nd set, and so on.
    Thank you.

    You can send the records in batches using the debatching feature of the DB adapter. Refer to the documentation for implementation details.

  • 1:n Message split and Abap Proxies??

    Hello,
    Can I not use Message split and Abap Proxy together? My scenario is MDM->File ->XI->Proxy->BI.
    I am getting a single file syndicated from MDM and in XI If I use message mapping to do 1:n split in the message mapping, can I use it with Abap Proxies? As per the link below, XI adapter is not present in the list..We are on PI 7.0 SP14. Thank you..
    http://help.sap.com/saphelp_nw04/helpdata/en/42/ed364cf8593eebe10000000a1553f7/frameset.htm
    Thank you for any suggestion..

    Hi Thanujja,
    If you see the message from Raj, I don't think we can split the messages for the proxy. This is because the splitting of messages takes place at the adapter level, and only for the adapters on the Java stack.
    As suggested by Guru, you can try splitting the messages in the inbound proxy instead of using a BPM; that way you can achieve good performance.
    Thanks,
    Srini
    Edited by: srinivas kapu on Mar 27, 2008 9:09 AM
    Edited by: srinivas kapu on Mar 27, 2008 9:10 AM

  • BPM Split and Merge

    Hi...
    I want to build a scenario like file split and merge using BPM.
    For that I have used:
    1.Receive
    2.Transformation(1:N)
    3.Block(ForEach)
    4.Control
    5.EndBlock
    6.Transformation(N:1)
    7.Send.
    While executing the scenario, the message goes to the queue, where it shows the status "Running".
    Can you please tell me if I did something wrong in my scenario?

    > 1.Receive
    > 2.Transformation(1:N)
    > 3.Block(ForEach)
    > 4.Control
    > 5.EndBlock
    > 6.Transformation(N:1)
    > 7.Send.
    You are using an empty infinite block, and hence it is always in the running state. You don't need a block at all. After the 1:n transformation, use the n:1 transformation and send. I know you must be doing a sample scenario. In reality you will usually have a send step for sending to another system line by line. That is when you will need a block.
    VJ

  • Splitting and Collecting Messages

    Hi,
    I have a scenario in which I get a message from SAP. In this message there are multiple item structures and 1 header structure. Now I have to send these items to a WS separately and collect the messages from the WS responses in one message with a header.
    Something like:
    Receive Message -> split message (message mapping) -> send all to WS / receive response for everyone -> collect messages from response in a new message -> send the new message to another WS
    How can I do this in BPM?

    Hi Chris,
    Your design should be like this:
    Start --> Receive --> Transform1 (Do the split and multimapping) -->  Send Step1 (Synchronous) --> Send 2 (Send Response from send1 to the output) --> Stop.
    Regards,
    ---Satish
