How to parse a big file with Regex/Pattern

I would like to parse a big file using Matcher/Pattern, so I thought of using a BufferedReader.
The problem is that a BufferedReader forces me to read
the file line by line, and my patterns occur not only inside a line but also across the end of one line and the beginning of the next.
For example, this class:
import java.util.regex.*;
import java.io.*;
public class Reg2 {
  public static void main (String [] args) throws IOException {
    File in = new File(args[1]);
    BufferedReader get = new BufferedReader(new FileReader( in ));
    Pattern hunter = Pattern.compile(args[0]);
    String line;
    int lines = 0;
    int matches = 0;
    System.out.print("Looking for "+args[0]);
    System.out.println(" in "+args[1]);
    while ((line = get.readLine()) != null) {
      lines++;
      Matcher fit = hunter.matcher(line);
      //if (fit.matches()) {
      if (fit.find()) {
        System.out.println ("" + lines +": "+line);
        matches++;
      }
    }
    get.close();
    if (matches == 0) {
      System.out.println("No matches in "+lines+" lines");
    }
  }
}
used with the pattern "ERTA" and this file (genomic sequence):
AAAAAAAAAAAERTAAAAAAAAAERT [end of line]
ABBBBBBBBBBBBBBBBBBBBBBERT [end of line]
ACCCCCCCCCCCCCCCCCCCCCCERT [end of line]
reports that it found the pattern only in this line:
"1: AAAAAAAAAAAERTAAAAAAAAAERT"
while my pattern is present 4 times.
Is it really a good idea to use a BufferedReader?
Does anyone have an idea?
Thanks
Edited by: jfact on Dec 21, 2007 4:39 PM
Edited by: jfact on Dec 21, 2007 4:43 PM

Quick and dirty demo:
import java.io.*;
import java.util.regex.*;
public class LineDemo {
    public static void main (String[] args) throws IOException {
        File in = new File("test.txt");
        BufferedReader get = new BufferedReader(new FileReader(in));
        int found = 0;
        String previous = "", next = "", lookingFor = "ERTA";
        Pattern p = Pattern.compile(lookingFor);
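        while((next = get.readLine()) != null) {
            String toInspect = previous+next;
            Matcher m = p.matcher(toInspect);
            while(m.find()) found++;
            // Carry over only length-1 characters: enough to complete a match that
            // spans the line break, too few to count an already seen match twice.
            int keep = Math.min(toInspect.length(), lookingFor.length()-1);
            previous = toInspect.substring(toInspect.length()-keep);
        }
        get.close();
        System.out.println("Found '"+lookingFor+"' "+found+" times.");
    }
}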
/* test.txt contains these four lines:
AAAAAAAAAAAERTAAAAAAAAAERT
ABBBBBBBBBBBBBBBBBBBBBBERT
ACCCCCCCCCCCCCCCCCCCCCCERT
ACCCCCCCCCCCCCCCCCCCCCCBBB
*/
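
If the file has very long lines or no line breaks at all, the same carry-over idea can be applied to fixed-size chunks instead of lines. The following is only a sketch under stated assumptions and not part of the original reply: the class name CrossLineCounter, the count method and its chunkSize parameter are made up for illustration, and it handles a literal string such as "ERTA" only, not an arbitrary regex (a regex match has no fixed length, so a fixed overlap is not enough in general).

```java
import java.io.IOException;
import java.io.Reader;
import java.io.StringReader;

public class CrossLineCounter {

    // Counts non-overlapping occurrences of a literal in a character stream,
    // ignoring line terminators, while holding only a small window in memory.
    // chunkSize is a hypothetical tuning parameter (characters read per call).
    static int count(Reader in, String literal, int chunkSize) throws IOException {
        if (literal.isEmpty() || chunkSize < 1) {
            throw new IllegalArgumentException("need a non-empty literal and chunkSize >= 1");
        }
        int overlap = literal.length() - 1;   // characters carried between chunks
        StringBuilder window = new StringBuilder();
        char[] buf = new char[chunkSize];
        int found = 0;
        int n;
        while ((n = in.read(buf)) != -1) {
            for (int i = 0; i < n; i++) {
                char c = buf[i];
                if (c != '\n' && c != '\r') {
                    window.append(c);         // drop line breaks, keep everything else
                }
            }
            int from = 0;
            int idx;
            while ((idx = window.indexOf(literal, from)) != -1) {
                found++;
                from = idx + literal.length();
            }
            // Keep at most `overlap` trailing characters, and never characters
            // that were already consumed by a counted match.
            int keepFrom = Math.max(from, window.length() - overlap);
            window.delete(0, keepFrom);
        }
        return found;
    }

    public static void main(String[] args) throws IOException {
        String data = "AAAAAAAAAAAERTAAAAAAAAAERT\n"
                    + "ABBBBBBBBBBBBBBBBBBBBBBERT\n"
                    + "ACCCCCCCCCCCCCCCCCCCCCCERT\n"
                    + "ACCCCCCCCCCCCCCCCCCCCCCBBB\n";
        System.out.println(count(new StringReader(data), "ERTA", 8192)); // prints 4
    }
}
```

For a real file, the StringReader would be replaced by a BufferedReader over a FileReader. Because the window never grows much beyond one chunk plus the overlap, memory use stays roughly constant regardless of file size.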

Similar Messages

  • How to parse a flat file with C#

    I need to parse a flat file with data that looks like
    01,1235,555
    02,2135,558
    16,156,15614
    16,000,000
    You get the idea. Anyway, I'd like to just use a derived column and move on, except I need to put a line number on each row as it comes by so the end result looks like:
    1,01,1235,555
    2,02,2135,558
    3,16,156,15614
    4,16,000,000
    I'm trying to do this with a script transformation, but I can't seem to get the hang of the syntax. I've tried looking at various examples, but everybody seems to prefer VB and I'd like to keep all of my packages in C#. I've set up my input and my output columns; I just need to figure out how to write the code that says something like:
    row_number = 1
    line_number = row_number
    record_type = input.split.get the second data element
    data_point_1 = input.split.get the third data element
    row_number = row_number ++

    /* Microsoft SQL Server Integration Services Script Component
    * Write scripts using Microsoft Visual C# 2008.
    * ScriptMain is the entry point class of the script.*/
    using System;
    using System.Data;
    using Microsoft.SqlServer.Dts.Pipeline.Wrapper;
    using Microsoft.SqlServer.Dts.Runtime.Wrapper;
    [Microsoft.SqlServer.Dts.Pipeline.SSISScriptComponentEntryPointAttribute]
    public class ScriptMain : UserComponent
    {
        private int rowCounter = 0;
        // Method that runs once before the rows start to pass
        public override void PreExecute()
        {
            base.PreExecute();
            // Lock variable for read
            VariableDispenser variableDispenser = (VariableDispenser)this.VariableDispenser;
            variableDispenser.LockForRead("User::MaxID");
            IDTSVariables100 vars;
            variableDispenser.GetVariables(out vars);
            // Fill the internal variable with the value of the SSIS variable
            rowCounter = (int)vars["User::MaxID"].Value;
            // Unlock variable
            vars.Unlock();
        }
        // Method that runs for each record in your data flow
        public override void Input0_ProcessInputRow(Input0Buffer Row)
        {
            // Increment the counter
            rowCounter++;
            // Fill the new column
            Row.MaxID = rowCounter;
        }
    }
    Here is a script to get an incremental ID. In the ReadWriteVariables of the script, add the "User::MaxID" variable to get the last number. On the Inputs and Outputs tab, create an output column (MaxID in this code) with a numeric data type.

  • How to download the big file in SDN

    Hi buddies,
         Do you have any idea how to download big files from SDN, such as the NetWeaver CE trial? I cannot download it with the download tool, and if I download it with IE, after downloading for a while I always get a "connection reset" error.
         Hope you can help me.
    YiNing

    Yining,
    There seem to be a few people commenting about problems with downloads. I myself have never had a problem, but I think that downloading from within Europe isn't a problem.
    I would try a tool such as the GetRight download manager, which detects that something is downloading and manages that download, so it will be able to restart the download.
    Paul

  • Hi, how do I share big files using the Creative Cloud? It says to download the Creative Cloud Connection, but then sends me nowhere that says that

    Hi, how do I share big files using the Creative Cloud? It says to download the Creative Cloud Connection, but then sends me nowhere that says that.

    Hi Jonty, check the file size limit:  Creative Cloud Desktop application FAQ
    You can also check the details around sharing and collaborating files: Sync and share files and folders with collaborators | Adobe Creative Cloud tutorials
    Atul_Saini

  • How to load unicode data files with fixed records lengths?

    Hi!
    To load Unicode data files with fixed record lengths (in terms of characters and not of bytes!) using SQL*Loader manually, I found two ways:
    Alternative 1: one record per row
    SQL*Loader control file example (without POSITION, since POSITION always refers to bytes!):
    LOAD DATA
    CHARACTERSET UTF8
    LENGTH SEMANTICS CHAR
    INFILE unicode.dat
    INTO TABLE STG_UNICODE
    TRUNCATE
    (
    A CHAR(2) ,
    B CHAR(6) ,
    C CHAR(2) ,
    D CHAR(1) ,
    E CHAR(4)
    )
    Datafile:
    001111112234444
    01NormalDExZWEI
    02ÄÜÖßêÊûÛxöööö
    03ÄÜÖßêÊûÛxöööö
    04üüüüüüÖÄxµôÔµ
    Alternative 2: variable length records
    LOAD DATA
    CHARACTERSET UTF8
    LENGTH SEMANTICS CHAR
    INFILE unicode_var.dat "VAR 4"
    INTO TABLE STG_UNICODE
    TRUNCATE
    (
    A CHAR(2) ,
    B CHAR(6) ,
    C CHAR(2) ,
    D CHAR(1) ,
    E CHAR(4)
    )
    Datafile:
    001501NormalDExZWEI002702ÄÜÖßêÊûÛxöööö002604üuüüüüÖÄxµôÔµ
    Problems
    Implementing these two alternatives in OWB, I encounter the following problems:
    * How to specify LENGTH SEMANTICS CHAR?
    * How to suppress the POSITION definition?
    * How to define a flat file with variable length and how to specify the number of bytes containing the length definition?
    Or is there another way that can be implemented using OWB?
    Any help is appreciated!
    Thanks,
    Carsten.

    Hi Carsten
    If you need to support the LENGTH SEMANTICS CHAR clause in an external table then one option is to use the unbound external table and capture the access parameters manually. To create an unbound external table you can skip the selection of a base file in the external table wizard. Then when the external table is edited you will get an Access Parameters tab where you can define the parameters. In 11gR2 the File to Oracle external table can also add this clause via an option.
    Cheers
    David

  • How can I print a file with mixed page orientation in windows 8.1?

    I have a win 7 and a win 8.1 computer.  I have a file which contains both landscape and portrait pages.  The file prints correctly with the mixed orientation from the win 7 pc, but will only print with either landscape or portrait on the win 8.1 pc. 
    I am using Adobe reader XI on the win 7 pc and adobe touch on the win 8.1 pc

    Sent from Windows Mail
    From: Pat Willener
    Sent: Mon, 5 January 2015 6:15
    To: thang dinhvan

  • There is no 'Save as' under the file menu.  How do I save a file with another name?

    Numbers help refers to the 'Save As' function in the File menu.  My version (latest) does not have a 'Save As' function.  How do I save a file with another name?

    Ross Millard wrote:
    Badunit. You have an original file and an edited file. Now you duplicate the edited file and save it with a different name. Now do you have to go and delete the changed file from which the duplicate had been made? And.... is the original still unchanged... still original?
    Ross,
    For the situation you describe, Apple has provided Duplicate and Revert. Use File > Duplicate to reach this menu:
    Regards,
    Jerry

  • How to create a pdf file with CS5

    Hello, I'm new to PhotoShop CS5 and haven't figured out yet (despite two hours of trying) how to create a pdf file with pictures and texts.  Can someone please help me with this ?  The "help" button in CS5 doesn't seem to cover this question.  Nor do the FAQs.
    Thank you very much.

    Save As... Photoshop PDF.

  • How to fix the .pdf file with error "invalid annotation object"

    how to fix the .pdf file with error "invalid annotation object"

    As long as the PDF opens, then just try saving it to a new file name. There may be a preflight script that would help troubleshoot the issue.

  • How to convert a pdf file with hand-written signature?

    How to convert a pdf file with hand-written signature?

    Hi Lotus1215,
    Once the document is signed, we cannot edit it, hence conversion is not possible.
    Please see the article mentioned below:
    http://forums.adobe.com/docs/DOC-1515
    Let me know if you have any other question.
    Regards,
    ~Pranav

  • How to create a csv file with NCS attributes?

    Hi
    I installed Cisco Prime NCS and am trying to perform a bulk update of device credentials with a CSV file.
    How do I create a CSV file with all the required attributes?
    This is the part of the NCS online help that covers this topic:
    Bulk Update Devices—To update the device credentials in a bulk, select Bulk Update Devices from the Select a command drop-down list. The Bulk Update Devices page appears. You can choose a CSV file.
    Note: The CSV file contains a list of devices to be updated, one device per line. Each line is a comma separated list of device attributes. The first line describes the attributes included. The IP address attribute is mandatory.
    Below is the test CSV file I created, which does not work:
    10.64.160.31,v2c,2,10,snmpcomm,ssh,zeus,password,password,enablepwd,enablepwd,60
    10.64.160.31,v2c,2,10,snmpcomm,ssh,zeus,password,password,enablepwd,enablepwd,60
    The error I am getting while importing this file:
    Missing mandatory field [ip_address] on header line:10.64.160.31,v2c,2,10,snmpcomm,ssh,zeus,password,password,enablepwd,enablepwd,60
    Assistance appreciated.

    It looks like the IP address field is incorrectly set.
    It should be as follows:
    {Device IP},{Device Subnet Mask}, etc.
    So a practical example of the above could be (I don't know if this is completely correct after the IP address / subnet mask):
    10.64.160.31,255.255.255.0,v2c,2,10,snmpcomm,ssh,zeus,password,password,enablepwd,enablepwd,60
    Below is a link to the documentation:
    http://www.cisco.com/en/US/docs/wireless/ncs/1.0/configuration/guide/ctrlcfg.html#wp1840245
    HTH
    Darren

  • In Pages how do I save a file with another name

    In Pages how do I save a file with another name?

    I don't know if there is a less cumbersome way to do this (I'm still pretty new to Lion), but this is what I do, since "Save As" seems to be gone with Lion.
    I create a duplicate document, then click the red button to close the document. A dialog box pops up and asks if you want to save the changes; I change the name in there and then save the file under the new name.
    You can also save the copy to the desktop, use Get Info, and then change the name in the Finder window.
    I copied the following from the Pages help area. These help instructions were created pre-Lion, I'm sure.
    Saving a Copy of a Document
    If you want to make a copy of your document—to create a backup copy or multiple versions, for example—you can save the document using a different name or location. (You can also automate saving a backup version, as Automatically Saving a Backup Version of a Document describes.)
    To save a copy of a document: 
    Choose File > Save As and specify a name and location.
    The document with the new name remains open. To work with the previous version, choose File > Open Recent and choose the previous version from the submenu.

  • How to configure the .ini file with applet

    Hi,
    I am using native methods that use some IP addresses. When I run the applet that calls those native methods with the appletviewer tool, it works fine, but when I open the applet from an HTML page in a browser, the .ini file data is not picked up. How do I configure the .ini file for use with the browser?

    Hi Jay SenSharma,
    Thanks for your immediate response.
    I looked at your links, but they describe recursive deployment using WLST. My question is how to configure the Oracle WebLogic library files on the Admin Server and Managed Servers when using wls.jar through a WLST script to create a new domain.
    If we create the new domain in GUI mode, we manually enter the Admin Server port number and the Managed Servers' port numbers and names.
    By default, in GUI mode the library files are configured on the Admin Server but not on the Managed Servers. We then manually target all the library files to the corresponding Managed Servers. Only then are the applications deployed to the corresponding Managed Server.
    Regards,
    S.vinoth Babu

  • How to parse a XML file

    I am new to XML and Java, and I don't know how to parse an XML file using JAXP. Can anyone explain it or write an example?
    Thanks
    Best Regards.

    Using the SAXParser in JAXP, the parsing of the XML file is event driven.
    Instantiate the parser:
    SAXParserFactory factory = SAXParserFactory.newInstance();
    SAXParser parser = factory.newSAXParser();
    InputSource is = new InputSource(new FileReader(theXML));
    Call the parse method:
    parser.parse(is, this);
    The following events are fired as the parser works through the XML:
    public void startElement(String namespaceURI, String localName, String qName, Attributes atts) throws org.xml.sax.SAXException
    public void endElement(String namespaceURI, String localName, String qName) throws org.xml.sax.SAXException
    public void characters(char[] ch, int start, int length) throws org.xml.sax.SAXException
    etc.
    You write what you want within each of these methods to handle the structure of your data. Keep in mind SAX is useful only when you know the structure of your XML.
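
The event-driven approach described in this reply could look roughly like the following self-contained sketch. The names SaxDemo, NameHandler and parseNames, as well as the sample <name> elements, are assumptions made up for illustration. The sketch extends DefaultHandler, so only the callbacks of interest need to be overridden, and it parses from a string instead of a file to stay runnable on its own.

```java
import java.io.StringReader;
import java.util.ArrayList;
import java.util.List;
import javax.xml.parsers.SAXParser;
import javax.xml.parsers.SAXParserFactory;
import org.xml.sax.Attributes;
import org.xml.sax.InputSource;
import org.xml.sax.helpers.DefaultHandler;

public class SaxDemo {

    // Collects the text content of every <name> element (hypothetical tag).
    static class NameHandler extends DefaultHandler {
        final List<String> names = new ArrayList<>();
        private StringBuilder text;   // non-null only while inside a <name> element

        @Override
        public void startElement(String uri, String localName, String qName, Attributes atts) {
            if (qName.equals("name")) {
                text = new StringBuilder();
            }
        }

        @Override
        public void characters(char[] ch, int start, int length) {
            // May be called several times for one element, so append rather than assign.
            if (text != null) {
                text.append(ch, start, length);
            }
        }

        @Override
        public void endElement(String uri, String localName, String qName) {
            if (qName.equals("name")) {
                names.add(text.toString().trim());
                text = null;
            }
        }
    }

    static List<String> parseNames(String xml) throws Exception {
        SAXParserFactory factory = SAXParserFactory.newInstance();
        SAXParser parser = factory.newSAXParser();
        NameHandler handler = new NameHandler();
        parser.parse(new InputSource(new StringReader(xml)), handler);
        return handler.names;
    }

    public static void main(String[] args) throws Exception {
        String xml = "<people><person><name>Ann</name></person>"
                   + "<person><name>Bob</name></person></people>";
        System.out.println(parseNames(xml)); // prints [Ann, Bob]
    }
}
```

To read from a file instead, the StringReader would be replaced by a FileReader, as in the reply above.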

  • How can I associate ".swf" files with Flash Lite 2.1

    How can I associate ".swf" files with Flash Lite 2.1? When I
    installed Flash Lite 2.0, clicking a .swf file made the system
    automatically use Flash Lite to play it. But now that I have installed
    Flash Lite 2.1, this ability is lost and I get "unknown file format".
    Now, how can I reassociate ".swf" with Flash Lite 2.1?
    First: I must use Flash Lite 2.1.
    Second: I must be able to click the .swf file directly.
    I know another way to open it: open the player, then find the
    .swf and open it. But, unfortunately, I cannot use this way,
    so I am looking for another one.

    Unfortunately, the 2.1 developer edition you have installed
    has not overwritten the original version integrated into the
    Symbian OS by the manufacturer. This means that the only way to
    open the 2.1 files is to open the 2.1 player from your apps menu
    and click on the .swf file. You will only get direct access if the
    updated player is in a firmware update for your handset.
    Matt
    http://www.outside-media.co.uk/blog
