How to skip a line like <!-- PJG STAG 4700 --> ?

I am doing a project that requires me to parse documents.
The documents have different tags, and the main text area is between <TEXT> and </TEXT>. Here is an example:
<DOC>
<DOCNO> FT911-3 </DOCNO>
<HEADLINE>
FT 14 MAY 91 / International Company News: Contigas plans DM900m east
German project
</HEADLINE>
<!-- PJG 0012 frnewline -->
<TEXT>
CONTIGAS, the German gas group 81 per cent owned by the utility Bayernwerk, said yesterday that it intends to invest DM900m in the next four years.
</TEXT>
</DOC>
Since the main purpose is to read the text body, I will skip the text between all tags except <TEXT></TEXT>.
I can use a switch statement to skip the text between tags, but I can't find a way to skip a line like:
<!-- PJG 0012 frnewline -->
Any suggestions?

Let me correct my former statement for jesper1: this is an SGML (Standard Generalized Markup Language)
document, a predecessor of XML, so jesper1 is kind of right.
For freekee:
First step:
/** returns a hashmap of initialized tags
    *  This way we can have more than one tagstring map to the same
    *  tag type.  For example, we might have DOCID and DOCUMENTID both
    *  map to the BEGIN_DOCID tag.
    */
   private HashMap initTags() {
        int NUM_TAGS = 20;
        Integer tagVals[] = new Integer[NUM_TAGS];
        HashMap tags = new HashMap();
        /* create one shared Integer per tag type */
        for (int i = 0; i < NUM_TAGS; i++) {
            tagVals[i] = new Integer(i);
        }
        tags.put("<DOC>",             tagVals[0]);
        tags.put("</DOC>",            tagVals[1]);
        tags.put("<PARENT>",          tagVals[2]);
        tags.put("<TITLE>",           tagVals[2]);
        tags.put("<TL>",              tagVals[2]);
        tags.put("<HEADLINE>",        tagVals[2]);
        tags.put("</PARENT>",         tagVals[3]);
        tags.put("</TITLE>",          tagVals[3]);
        tags.put("</TL>",             tagVals[3]);
        tags.put("</HEADLINE>",       tagVals[3]);
        tags.put("<TEXT>",            tagVals[4]);
        tags.put("</TEXT>",           tagVals[5]);
        tags.put("<DOCID>",           tagVals[6]);
        tags.put("<DOCNO>",           tagVals[6]);
        tags.put("</DOCID>",          tagVals[7]);
        tags.put("</DOCNO>",          tagVals[7]);
        tags.put("<DATELINE>",        tagVals[8]);
        tags.put("</DATELINE>",       tagVals[9]);
        return tags;
   }
second step:
/* this will read the file of documents, parse them and return a list of document */
  /* objects.  Could be called the document factory since it generates useful       */
  /* document objects that can then be indexed.                                     */
  /* Documents are created when an end of document tag is encountered.              */
  public ArrayList readDocuments() {
      String      dateline     = null;
      String      docName      = null;
      ArrayList   documentList = new ArrayList();
      boolean     done         = false;
      int         documentID   = 0;
      int         length       = 0;
      int         offset       = 0;
      String      title        = null;
      boolean endOfFile = false;
      while (! done) {
          System.out.println("Token --> " + in.sval);
          /* check to see if we hit the end of the file */
          try {
              if (in.nextToken() == in.TT_EOF) {
                  done = true;
                  endOfFile = true;
                  continue;
              }
              /* now test for a tag */
              switch (in.ttype) {
                // where does "END_DOC" come from?
                // Since: tags.put("</DOC>", tagVals[1]), so </DOC> is mapped to value 1.
                // and since in the initialization: private final static int END_DOC = 1;
                // in.ttype returns a int, like TT_WORD
             case END_DOC:
                        /* when we hit the end of a document, lets create a new object */
                        /* to store info needed for indexing                           */
                        length = in.currentPosition - offset;
                        Document d = new Document (documentID, docName, title, dateline,
                                         inputFileName, offset, length, distinctTerms);
                        ++documentID;
                        documentList.add(d);
                        break;
                /* initialize document attributes when we start a new doc */
                case BEGIN_DOC:
                        System.out.println("Started new document");
                        offset = in.currentPosition;
                        docName = null;
                        dateline = null;
                        title = null;
                        distinctTerms = new HashMap();
                        break;
             case BEGIN_DOC_NAME:
                     docName = readNoIndex(in, END_DOC_NAME);
                        break;
             case BEGIN_TITLE:
                        title = readNoIndex(in, END_TITLE);
                        break;
                case BEGIN_DATELINE:
                        dateline = readNoIndex(in, END_DATELINE);
                        break;
                case BEGIN_TEXT:
                        readText(in, stopWords);
                        break;
                /* only time we get a word here is if its outside of any tags */
                case WORD:
                        break;
             default: {
                    System.out.println("Unrecognized tag: "+in.sval);
                    System.exit(-1);
                 }
              }
          } catch (IOException e) {
                  System.out.println("Exception while reading doc ");
                  e.printStackTrace();
          }
      }
      return documentList;
   }
Did I answer your question correctly?
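For the original question (skipping a line like <!-- PJG 0012 frnewline -->), here is a minimal sketch of one possible approach. It is not the StreamTokenizer-based parser posted above: the class SgmlTextExtractor and its methods are hypothetical names, and it assumes that every comment and every <TEXT> / </TEXT> tag sits on a line of its own, as in the sample document.

```java
import java.util.ArrayList;
import java.util.Arrays;
import java.util.List;

/** Hypothetical helper, not part of the parser posted above. */
class SgmlTextExtractor {

    /** True for a line that is one complete SGML comment, e.g. <!-- PJG 0012 frnewline -->. */
    static boolean isCommentLine(String line) {
        String t = line.trim();
        return t.startsWith("<!--") && t.endsWith("-->");
    }

    /** Returns the lines found between <TEXT> and </TEXT>, skipping comment lines. */
    static List<String> extractText(List<String> lines) {
        List<String> body = new ArrayList<>();
        boolean inText = false;
        for (String line : lines) {
            String t = line.trim();
            if (isCommentLine(t)) {
                continue;                 // skip <!-- ... --> wherever it appears
            } else if (t.equals("<TEXT>")) {
                inText = true;            // start collecting body lines
            } else if (t.equals("</TEXT>")) {
                inText = false;           // stop collecting body lines
            } else if (inText) {
                body.add(t);
            }
        }
        return body;
    }

    public static void main(String[] args) {
        // Shortened sample input in the same shape as the document above.
        List<String> doc = Arrays.asList(
            "<DOC>",
            "<DOCNO> FT911-3 </DOCNO>",
            "<!-- PJG 0012 frnewline -->",
            "<TEXT>",
            "CONTIGAS, the German gas group ...",
            "<!-- PJG 0012 frnewline -->",
            "</TEXT>",
            "</DOC>");
        System.out.println(extractText(doc));
    }
}
```

If comments can span several lines or share a line with other text, this line-based check would not be enough and the comment would have to be recognized at the token level instead.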

Similar Messages

  • How to SKIP a line in REPORTS 2.5 ?

    Hi
    I am trying to print something on a sheet which has perforations in the middle of the sheet. I need to skip the perforation line from my repeating frame which keeps printing the data. For eg. If I cross line no 11 then I need to skip one line and then start printing.
    Could someone help with this one.
    Thanks
    Prabs

    Hi,
    Add "SKIP 1" to the BREAK command:
    BREAK   ON  amount      SKIP    1

  • How to skip certain lines for a txt file and insert into array

    so here is my question:
    i had a file to read, and it requires to input into the array starting from a certain line
    example:
    4
    john 25 M
    mary 22 F
    lee 20 M
    faye 10 F
    faye john
    mary john
    mary faye
    I want to insert the friend list into a 2D array, starting at line 5 (counting from 0), which is the int from the first line + 1.
    Can someone help me with it?
    I believe there is a skip method or something similar,
    but I just don't know how to use it.
    Could someone tell me how to do that?

    The thing is, I think that takes too long and is not efficient.
    However, I just solved it with a better method:
    Scanner in = new Scanner(reader);
    int size = Integer.parseInt(in.next());
    BufferedReader insert = new BufferedReader(new FileReader(new File(input)));
    String line = null;
    int count = 0;
    int startAtLineNo = size + 1; // 0-based
    while ((line = insert.readLine()) != null) {
        if (count >= startAtLineNo) {
            /* do stuff */
            System.out.println(line);
        }
        // else ignore
        count++;
    }
    Thanks anyway.
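A minimal sketch of how the friend list could end up in a 2D array, assuming the file layout from the question (first line = number of people, then one line per person, then one "name name" pair per line, no trailing blank lines). The class FriendListReader and the method readFriendPairs are hypothetical names used only for illustration.

```java
import java.util.Arrays;
import java.util.List;

/** Hypothetical helper for the file layout described in the question. */
class FriendListReader {

    /**
     * lines.get(0) holds the number of people n; the next n lines describe
     * the people; every line after that is one "name name" friend pair.
     */
    static String[][] readFriendPairs(List<String> lines) {
        int people = Integer.parseInt(lines.get(0).trim());
        int start = people + 1;                        // 0-based index of the first pair
        int count = Math.max(0, lines.size() - start); // number of pair lines
        String[][] pairs = new String[count][];
        for (int i = 0; i < count; i++) {
            // split each pair line on whitespace into its names
            pairs[i] = lines.get(start + i).trim().split("\\s+");
        }
        return pairs;
    }

    public static void main(String[] args) {
        List<String> lines = Arrays.asList(
            "4", "john 25 M", "mary 22 F", "lee 20 M", "faye 10 F",
            "faye john", "mary john", "mary faye");
        // prints [[faye, john], [mary, john], [mary, faye]]
        System.out.println(Arrays.deepToString(readFriendPairs(lines)));
    }
}
```

For a real file, the list of lines could come from java.nio.file.Files.readAllLines, which reads the whole file into memory; for very large files the line-counting loop above avoids that.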

  • How to skip blank line in FCC

    Hi,
    How to deal with blank rows in FCC for fixed width file?
    I want to ignore these records as they will fail the message in mapping.
    thanks,
    Anirudh.

    Hi
    Look at this thread, it might help you:
    Blank line in receiver file adapter content conversion

  • How to skip blank line (EOF char) at the end of the file while creating ?

    Hi,
    In my program I have to create a file in text mode using the OPEN DATASET statement. This file is sent to a third-party system for processing. I found that when the file is created using OPEN DATASET, one LF character is inserted at the end of the file, resulting in a blank line at the end. Thus, if my internal table contains 5 records, the created text file shows 6 lines, the last of which is blank. My question is how to remove this blank line, which is causing issues in the third-party system.
    Here is the Code I have used.
       TRY.
            " Write the file in Text Mode
            OPEN DATASET lv_outpf FOR OUTPUT IN TEXT MODE ENCODING NON-UNICODE
                                 WITH SMART LINEFEED
                                  MESSAGE lv_msg.
            IF lv_msg IS NOT INITIAL.
              WRITE / lv_msg.
              EXIT.
            ENDIF.
            LOOP AT itab_new INTO st.
              TRANSFER st TO lv_outpf.
            ENDLOOP.
            CLOSE DATASET   lv_outpf.
          CATCH cx_root.                                    "#EC No Handler
        ENDTRY.

    an effective way to do it:
    open your dataset in binary mode, transfer the records but between each record transfer the LF (or CRLF according to your need)
    after the last record you don't transfer the LF

  • How to Skip/delete line under report name

    Hi,
    How to delete line under report name
    Thanks,,

    Hi,
    Edit the Title view and go to Format Title View (the button near the brush symbol, just above the word Title), then set:
    Border -> Position -> None
    Thanks,
    Balaa...

  • Neat infographic and how do you make lines like this?

    Here is the infographic:
    http://www.stevey.com/wp-content/uploads/2009/01/50-years-exploration-huge.jpg
    I can't figure out if these lines rendered by hand or done with some application -- anybody suggest an approach?

    I can't figure out if these lines rendered by hand or done with some application
    Probably both.
    You utilize whatever geometric automation is made available in the feature set of the drawing program you are using, and complete the rest with other features and/or manual path-specific manipulations.
    In general-purpose drawing programs like Illustrator, you could, for example, use features like Offset Path or the Polar Grid tool to draw the concentric ring portions of the paths, and use Blends or the Reshape Tool to draw the converging portions of the paths.
    If drawn in Corel Draw, for example, its Contour Tool might have been used for the concentric portions.
    Often, several different approaches can accomplish the final art (some more efficient than others).
    The general process is:
    Study the geometry needed for aspects that suggest they might be automated or at least semi-automated.
    Determine which aspects need to be numerically accurate (as opposed to merely aesthetically pleasing).
    Consider the automated features of the program you are using (requires, of course, proficiency in the program.)
    Based on that knowledge (of the requirements and the capabilities of the program), devise a procedure that meets the requirements, adding manual path drawing where you have to.
    On the other hand, it's entirely possible that the orbits are drawn entirely automatically by a program that interprets numerical data as curves. For example, a CAE program might have modeled the paths from numerical data and exported them as 2D paths that were then imported into an ordinary drawing program for artistic treatment and embellishment (in much the same way you might take a mathematically-correct 2D perspective drawing exported from a modeling program and then "make it pretty" it in a drawing program).
    Some ordinary drawing programs (Canvas, for example) include features that can plot curves from algebraic curve functions. Some ordinary drawing programs (Illustrator, Corel Draw, Canvas) can be scripted to draw paths from numerical data (although unlikely for the whole paths shown in your example, since the geometry would have to be tediously converted to control coordinates for Bezier curves).
    So there are several ways to skin the cat; but you first have to know which cat has to be skinned (i.e.; what the real requirements are). Are the orbits actually accurate flight paths, or are they just "artist conception" interpretations? That's not clear from just looking at your example.
    JET

  • How to skip to another line??

    Can anyone tell or show me how to skip a line when I write a message? I don't know how to start another paragraph. I tried using the space key, but it's not working. How can I do it?

    Press the return or enter key to skip a line while typing a message.
    tanzim                                                                                  

  • Skip 1 Line in ALV report.....

    Hi Experts,
    I have made ALV report in which user wants one blank line after each row.
    How to skip one line after each row in ALV ?
    Yusuf

    Hi Yusuf,
    It is really a typical requirement.
    There is a solution to that.
    In the internal table that you are passing to the function module, append a blank line to that itself.
    Like:
    LOOP AT itab INTO wa.
      APPEND wa TO itab3.
      CLEAR wa.
      APPEND wa TO itab3.
    ENDLOOP.
    REFRESH itab.
    itab[] = itab3[].
    But remember not to sort after that, and not to pass the sorting table parameters.
    If you have an extra line at the bottom, you can delete it by reading the number of lines with the DESCRIBE statement and deleting the last row of the table.
    Try this; I hope it will work.
    Reward points if found useful.
    Thanks & regards
    Abhijit

  • How to get trending line in the graph

    Hi,
    I am using 10g. I have the following requirement:
    Name Date Metric
    I can have a bar graph of Metric against Date. My requirement is to add a trend line, so I calculated the slope and the regression line y=mx+c, but I am not getting the right trend line. I can use the regr_slope function to create a view in the rpd, but it doesn't support date values. Can you let me know how to get a trend line like the one in Excel?
    Thanks

    Hi,
    It's better to do at UI level.
    Can you please check the below link:
    http://kpipartners.blogspot.in/2009/04/linear-regression.html
    Hope this helps.
    Thanks,
    Pramod.
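The arithmetic behind such a trend line can be sketched in plain Java. This is only an illustration of the least-squares formula, not OBIEE code: it works around the date problem by converting each date to a day number first (days since the first date), and the class TrendLine and method fit are hypothetical names.

```java
import java.time.LocalDate;

/** Hypothetical helper: least-squares trend line y = m*x + c over dated values. */
class TrendLine {

    /** Returns {m, c}, with x measured in days since the first date. */
    static double[] fit(LocalDate[] dates, double[] metric) {
        int n = dates.length;
        long origin = dates[0].toEpochDay();
        double sumX = 0, sumY = 0, sumXY = 0, sumXX = 0;
        for (int i = 0; i < n; i++) {
            double x = dates[i].toEpochDay() - origin; // date converted to a number
            double y = metric[i];
            sumX += x;
            sumY += y;
            sumXY += x * y;
            sumXX += x * x;
        }
        // standard least-squares slope and intercept
        double m = (n * sumXY - sumX * sumY) / (n * sumXX - sumX * sumX);
        double c = (sumY - m * sumX) / n;
        return new double[] { m, c };
    }

    public static void main(String[] args) {
        LocalDate[] dates = {
            LocalDate.of(2024, 1, 1), LocalDate.of(2024, 1, 2), LocalDate.of(2024, 1, 3) };
        double[] metric = { 10, 12, 14 };
        double[] mc = fit(dates, metric);
        System.out.println("m = " + mc[0] + ", c = " + mc[1]); // m = 2.0, c = 10.0
    }
}
```

The sketch assumes at least two distinct dates; with a single date the denominator is zero and no slope is defined.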

  • How to skip line for delimited file type?

    Hi, I want to ask how to skip lines (for example, the first two lines) for a delimited file type?
    Thanks...
    Here is my script
    Function NY_Skip06Center [strField, strRecord]
    ' FDM DataPump Import Script:
    'Created by: FDM_Admin
    'Date created: 2/28/2006
    Dim strEntity
    'Check first two characters of entity
    For strEntity = 1 to 6
    'Skip line
    Res.PblnSKip = True
    Next strEntity
    End if
    End Function
    But it returns this error when imported
    Error: An error occurred importing the file.
    Detail: Object variable or With block variable not set
    Anyone knows what's wrong

    strAcc = DW.Utilities.fParseString (strField, 1, 1, chr(9))
    I didn't look closely enough last time. The above is illogical. The parsestring function parses a string based on a delimiter.
    The function has these arguments:
    String to Parse
    How many fields are in the string
    The parsed field to return
    Field delimiter
    In the above, strField is returning the field specified in your import format. You are also saying that there is a total of 1 field. If that were the case, you wouldn't need to parse anything.
    I am guessing that your call needs to look something more like this:
    strAcc = DW.Utilities.fParseString (strRecord, 8, 1, chr(9))
    Make sense?
    If not, maybe you can include a few sample lines from your data file and that will make it easier to help you.
    Is your import format fixed or delimited?

  • How to skip first 5 lines from a txt file when using sql*loader

    Hi,
    I have a txt file that contains header info that I don't need. How can I skip those lines when importing the file into my database?
    Cheers

    Danny Fasen wrote:
    I think most of us would process this report using pl/sql:
    - read the file until you've read the column headers
    - read the account info and insert the data in the table until you have read the last account info line
    - read the file until you've read a new set of column headers (page 2)
    - read the account info and insert the data in the table until you have read the last account info line (page 2)
    - etc. until you reach the total block identified by Count On-line ...
    - read the totals and compare them with the data inserted in the table
    Or maybe like this...
    First create an external table to read the report as whole lines...
    SQL> ed
    Wrote file afiedt.buf
      1  CREATE TABLE ext_report (
      2    line VARCHAR2(200)
      3          )
      4  ORGANIZATION EXTERNAL (
      5    TYPE oracle_loader
      6    DEFAULT DIRECTORY TEST_DIR
      7    ACCESS PARAMETERS (
      8      RECORDS DELIMITED BY NEWLINE
      9      BADFILE 'bad_report.bad'
    10      DISCARDFILE 'dis_report.dis'
    11      LOGFILE 'log_report.log'
    12      FIELDS TERMINATED BY X'0D' RTRIM
    13      MISSING FIELD VALUES ARE NULL
    14      REJECT ROWS WITH ALL NULL FIELDS
    15        (
    16         line
    17        )
    18      )
    19      LOCATION ('report.txt')
    20    )
    21  PARALLEL
    22* REJECT LIMIT UNLIMITED
    SQL> /
    Table created.
    SQL> select * from ext_report;
    LINE
    x report page1
    CDC:00220 / Sat Aug-08-2009 xxxxp for 02/08/09 - 08/08/09 Effective Date 11/08/09 Wed Sep-30-2009 08:25:43
    Bill to
    Retailer Retailer Name                  Name on Bank Account           Bank ABA   Bank Acct            On-line Amount  Instant Amount  Total Amount
    ======== ============================== ============================== ========== ==================== =============== =============== ===============
    0100103  BANK Terminal                  raji                           123456789  123456789            -29,999.98    9 0.00         99 -29,999.98
    0100105  Independent 1                  Savings                        123456789  100000002            -1,905.00     9 0.00         99 -1,905.00
    0100106  Independent 2                  system                         123456789  100000003            -800.00       9 -15.00       99 -815.00
    LARGE SPACE
    weekly_eft_repo 1.0 Page: 2
    CDC:00220 / Sat Aug-08-2009 Weekly EFT Sweep for 02/08/09 - 08/08/09 Effective Date 11/08/09 Wed Sep-30-2009 08:25:43
    Bill to
    Retailer Retailer Name Name on Bank Account Bank ABA Bank Acct On-line Amount Instant Amount Total Amount
    ======== ============================== ============================== ========== ==================== =============== =============== ===============
    Count On-line Amount Instant Amount Total Amount
    ============== ====================== ====================== ======================
    Debits 0 0.00 0.00 0.00
    Credits 3 -32,704.98 -15.00 -32,719.98
    Totals 3 -32,704.98 -15.00 -32,719.98
    Total Tape Records / Blocks / Hash : 3 1 37037034
    End of Report
    23 rows selected.
    Then we can check that we can pull out just the lines of data we're interested in from that...
    SQL> ed
    Wrote file afiedt.buf
      1  create view vw_report as
      2* select line from ext_report where regexp_like(line, '^[0-9]')
    SQL> /
    View created.
    SQL> select * from vw_report;
    LINE
    0100103  BANK Terminal                  raji                           123456789  123456789            -29,999.98    9 0.00         99 -29,999.98
    0100105  Independent 1                  Savings                        123456789  100000002            -1,905.00     9 0.00         99 -1,905.00
    0100106  Independent 2                  system                         123456789  100000003            -800.00       9 -15.00       99 -815.00
    And then we adapt that view to extract the data from those lines as actual columns...
    SQL> col retailer format a10
    SQL> col retailer_name format a20
    SQL> col name_on_bank_account format a20
    SQL> col online_amount format 999,990.00
    SQL> col instant_amount format 999,990.00
    SQL> col total_amount format 999,990.00
    SQL> ed
    Wrote file afiedt.buf
      1  create or replace view vw_report as
      2  select regexp_substr(line, '[^ ]+', 1, 1) as retailer
      3        ,trim(regexp_replace(regexp_substr(line, '[[:alpha:]][[:alnum:] ]*[[:alpha:]]', 1, 1), '(.*) +[^ ]+$', '\1')) as retailer_name
      4        ,trim(regexp_replace(regexp_substr(line, '[[:alpha:]][[:alnum:] ]*[[:alpha:]]', 1, 1), '.* ([^ ]+)$', '\1')) as name_on_bank_account
      5        ,to_number(regexp_substr(regexp_replace(line,'.*[[:alpha:]]([^[:alpha:]]+)','\1'), '[^ ]+', 1, 1)) as bank_aba
      6        ,to_number(regexp_substr(regexp_replace(line,'.*[[:alpha:]]([^[:alpha:]]+)','\1'), '[^ ]+', 1, 2)) as bank_account
      7        ,to_number(regexp_substr(regexp_replace(line,'.*[[:alpha:]]([^[:alpha:]]+)','\1'), '[^ ]+', 1, 3),'999,999.00') as online_amount
      8        ,to_number(regexp_substr(regexp_replace(line,'.*[[:alpha:]]([^[:alpha:]]+)','\1'), '[^ ]+', 1, 5),'999,999.00') as instant_amount
      9        ,to_number(regexp_substr(regexp_replace(line,'.*[[:alpha:]]([^[:alpha:]]+)','\1'), '[^ ]+', 1, 7),'999,999.00') as total_amount
    10* from (select line from ext_report where regexp_like(line, '^[0-9]'))
    SQL> /
    View created.
    SQL> select * from vw_report;
    RETAILER   RETAILER_NAME        NAME_ON_BANK_ACCOUNT   BANK_ABA BANK_ACCOUNT ONLINE_AMOUNT INSTANT_AMOUNT TOTAL_AMOUNT
    0100103    BANK Terminal        raji                  123456789    123456789    -29,999.98           0.00   -29,999.98
    0100105    Independent 1        Savings               123456789    100000002     -1,905.00           0.00    -1,905.00
    0100106    Independent 2        system                123456789    100000003       -800.00         -15.00      -815.00
    SQL>
    I couldn't quite figure out the "9" and the "99" data that was on those lines, so I assume it should just be ignored. I also formatted the report data to fixed column widths in my external text file, as I'd assume that's how the data would be generated, not that that would make much difference when extracting the values with regular expressions as I've done.
    So... something like that anyway. ;)

  • How to skip first TWO Lines of   .txt  file    using XSLT Mapping

    Hi Friends,
    I have a .txt file in the following format:
    <TEST>
    4564564545
    56456444566
    56465
    How can I skip the first two lines when writing the XSLT mapping?
    The <TEST> line and the empty line shouldn't go to the RFC.
    How can I skip them and send the rest to the RFC using XSLT mapping?
    Best regards,
    V.Rangarajan

    you can avoid the empty lines in your File Content Conversion by defining offset.
    "Under Document Offset, specify the number of lines that are to be ignored at the beginning of the document.
    This enables you to skip comment lines or column names during processing. If you do not make an entry, the default value is zero lines."
    ref: http://help.sap.com/saphelp_nw04/helpdata/en/2c/181077dd7d6b4ea6a8029b20bf7e55/content.htm
    then the generated XML after FCC will not have the empty lines.

  • When typing an email, when I go to next line, it skips two lines instead of one line. How do I correct it?

    I was typing an email and when I would go to the next line, it will skip two lines instead of one line. How do I correct the problem?

    I am not sure I understand the problem. Could you include a screenshot of it right after it happens?
    If you need help to create a screenshot, please see "How do I create a screenshot of my problem?"
    Once you've done this, attach the saved screenshot file to your forum post by clicking the Browse... button below the Post your reply box. This will help us to visualize the problem.
    Thank you!

  • How can I get two lines mail list like Windows Live Mail and How can I get to, cc, bcc lines like Windows Live Mail?

    I wish to move to Thunderbird, but I have two main problems.
    * I want a two-line mail list like Live Mail.
    * I want To, Cc, and Bcc lines like Live Mail. I mean that when I write an e-mail address, I press "," (or another character) and then enter the next e-mail address. Auto address completion must work for each address that I enter in the address line.
    Can I do those in Thunderbird?
    I searched the internet and there are many people who want these features.
    Thanks.

    I see the dropdown menus for CC and BCC, but they don't work. I click on them and there is no response except the name I already had in the "TO" field disappears! The attached screenshots shows the name in the "to" field, then the result after choosing "add CC" - the "to" name disappears.
