Regular Expressions inside html file

I'm building a report-generating tool. The template is an HTML file, and I need to insert values from a database at specific places in it, which are marked by special tags that I defined.
Anyway, that's not the issue. I read the whole HTML file into a string, and then I want to search it for those tags.
Can anyone tell me the regular expression for finding this tag?
(Each type of tag occurs only once in the file.)
This is the tag: <!--data><!-->
Thanks in advance.

   import java.io.BufferedReader;
   import java.io.FileReader;
   import java.io.IOException;
   import java.util.regex.Matcher;
   import java.util.regex.Pattern;

   public class generateReportUser {
      private String html = "";
      private StringBuffer sb;

      public generateReportUser() {
         // Read the whole template into one string, keeping the line breaks.
         try (BufferedReader fileIn = new BufferedReader(new FileReader("E:\\BCSProj\\templates\\tmpUserReport.htm"))) {
            String s;
            while ((s = fileIn.readLine()) != null) {
               // html += s;
               html += s + "\n"; // this might be better
            }
         } catch (IOException e) {
            e.printStackTrace();
         }
         System.out.println(html);
         // Group 1 is the opening marker, group 2 the closing marker.
         Pattern p = Pattern.compile("(<!--data>)(<!-->)");
         Matcher m = p.matcher(html);
         sb = new StringBuffer();
         while (m.find()) {
            // "The Data" part can be dynamic
            m.appendReplacement(sb, m.group(1) + Matcher.quoteReplacement("The Data") + m.group(2));
            // m.appendReplacement(sb, Matcher.quoteReplacement("The Data")); // simple replacement
         }
         m.appendTail(sb);
         System.out.println(sb);
         // The old m.matches() check was dropped: matches() requires the WHOLE
         // input to match the pattern, so it was always false here. Use find().
      }
   }
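If the template ends up with several markers, the same find/appendReplacement loop can be generalised. The following is only a sketch: the marker form `<!--NAME><!-->` for names other than `data`, the class name `TemplateFiller`, and the idea of passing the database values in a map are all assumptions, not something stated in the question. Here the whole marker is replaced by the value; unknown markers are left in place.

```java
import java.util.Collections;
import java.util.Map;
import java.util.regex.Matcher;
import java.util.regex.Pattern;

public class TemplateFiller {
    // Assumed marker shape: <!--NAME><!--> where NAME is a word such as "data".
    private static final Pattern MARKER = Pattern.compile("<!--(\\w+)><!-->");

    /** Replaces every known marker with its value; unknown markers stay as they are. */
    static String fill(String template, Map<String, String> values) {
        Matcher m = MARKER.matcher(template);
        StringBuffer out = new StringBuffer();
        while (m.find()) {
            String value = values.get(m.group(1));
            String replacement = (value != null) ? value : m.group();
            // quoteReplacement keeps '$' and '\' in the data from being read as group references.
            m.appendReplacement(out, Matcher.quoteReplacement(replacement));
        }
        m.appendTail(out);
        return out.toString();
    }

    public static void main(String[] args) {
        String template = "<td><!--data><!--></td>";
        System.out.println(fill(template, Collections.singletonMap("data", "The Data")));
    }
}
```

Matcher.quoteReplacement matters once the values come from a database, because a `$` or backslash in the data would otherwise be interpreted by appendReplacement.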

Similar Messages

  • XML parser to parse XML inside HTML file

    Hi,
    I would like to know whether there are any parsers other than JAXP that can parse XML content embedded in an HTML file. For example,
    <!DOCTYPE html PUBLIC "-//W3C//DTD XHTML 1.0 Transitional//EN" "http://www.w3.org/TR/xhtml1/DTD/xhtml1-transitional.dtd">
    <html xmlns="http://www.w3.org/1999/xhtml">
    <head>
    <title></title>
    </head>
    <body>
    <form id="j_id_jsp_1394907664_1" name="j_id_jsp_1394907664_1" method="post" action="/msaiphoneportal1.1c/pages/xmlchech.faces;jsessionid=5666F0E1CF0E44B978940F021012AA41" enctype="application/x-www-form-urlencoded">
    <input type="hidden" name="j_id_jsp_1394907664_1" value="j_id_jsp_1394907664_1" />
    <?xml version="1.0" encoding="UTF-8"?>
    <hospital>
    <Users>
    <User id="1" password="x" type="staff" username="x"/>
    <User id="2" password="y" type="staff" username="y"/>
    <User id="3" password="z" type="staff" username="z"/>
    </Users>
    <Survey/>
    <Patients>staaatus</Patients>
    </hospital>
    <input type="hidden" name="j_id_jsp_1394907664_1:j_id_jsp_1394907664_2" /><input type="hidden" name="javax.faces.ViewState" id="javax.faces.ViewState" value="-4298162632826268059:-1507671971163298623" autocomplete="off" />
    </form>
    </body>
    </html>
    I need to read the XML content inside. If there is any way to do this, please let me know.
    Edited by: DHURAI on Jul 22, 2010 12:59 AM

    DHURAI wrote:
    while reading, we can fetch the starting of XML through <?xml> tag, but how we know the ending of the XML as it seems to be dynamic.

    1) Extract the document root element which follows the <?xml ... ?> declaration.
    2) From this root element, construct the associated root element terminal by inserting a / after the <.
    3) Search for the terminal.
    If the root name can also be the name of an enclosed element, then you will have to count the number of terminals.
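The three steps above can be sketched in Java roughly as follows. This is a minimal illustration, not a full solution: the class and method names are made up for the example, and it assumes the root element name does not reappear as a nested element (otherwise the terminals have to be counted, as noted above) and that the root is not self-closing.

```java
import java.util.regex.Matcher;
import java.util.regex.Pattern;

public class EmbeddedXmlExtractor {
    /** Returns the text from the root start tag through its end tag, or null if not found. */
    static String extract(String html) {
        int decl = html.indexOf("<?xml");
        if (decl < 0) return null;
        int declEnd = html.indexOf("?>", decl);
        if (declEnd < 0) return null;

        // Step 1: the root element is the first start tag after the declaration.
        Matcher root = Pattern.compile("<([A-Za-z_][\\w.-]*)").matcher(html);
        if (!root.find(declEnd + 2)) return null;

        // Step 2: build the terminal by inserting '/' after '<'.
        String terminal = "</" + root.group(1) + ">";

        // Step 3: search for the terminal.
        int end = html.indexOf(terminal, root.end());
        if (end < 0) return null;
        return html.substring(root.start(), end + terminal.length());
    }

    public static void main(String[] args) {
        String html = "<form><?xml version=\"1.0\"?>\n<hospital><Users/></hospital>\n<input/></form>";
        System.out.println(extract(html));
    }
}
```

The extracted fragment can then be handed to any XML parser, JAXP included.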

  • Dreamweaver CC, how to get php coloring inside html files

    Before, I could do this by editing the extensions.txt file. Now when I try to change extensions.txt I can't save it; I get an access denied error.
    I am running with administrative privileges.
    Any ideas?

    Sorry, I did not explain myself.
    Before, I was able to tweak Dreamweaver's configuration file (extensions.txt), and Dreamweaver would then color the PHP code contained inside HTML files.
    I need the code coloring to apply to files with the html extension.
    I hope that is clearer now. And yes, @Rob Hecker2, you are right; I see now that it was very poorly explained.

  • Regular expression for html links

    Hi, I'm trying to get text/link pairs from a string. Accepted links
    are like:
    <a href="http://url1/" garbagetags>text1</a>
    <a href=url2>text2</a>
    The expected result would be:
    url: url1
    Text: text1
    url: url2
    Text: text2
    I use the following regular expression to catch the texts and the urls:
    "<a href=\"*(.*)\"*.*>(.*)</a>"
    group(1) should be the url and group(2) the text.
    But it doesn't work correctly; I get something like:
    url: http://url1/" garbagetags
    text: text1
    url: url2
    text: text2
    I'm trying to handle links both with and without " and with dynamic html
    tags.
    I think the problem is the regular expression string. I'm new to
    them and I can't find the right one. If you know what's wrong with
    my R.E. string, please help me!
    Thanks

    Had to break it into two regular expressions:
    import java.util.regex.*;
    class B2  {
       public static void main(String[] args) {
            //String INPUT = "<a href=\"http://url1/\" garbagetags>text1</a>";
            //String INPUT = "<a href=url2>text2</a>";
            //String INPUT ="<a href=\"http://www.google.com\">Google search engine</a>";
              String INPUT="<a id=1a class=q href=\"/imghp?hl=en&tab=wi&ie=UTF-8&oe=UTF-8\" onClick=\"return c('www.google.com/imghp','wi',event);\"><font size=-1>Images</font></a>";
            //String REGEX = "<a .*href=\\\"?h?t?t?p?:?/?/?([\\w\\.\\?\\&=\\-\\d]*)/?\\\"?.*>(.*)</a>";
            String REGEX = "<a .*href=\\\"?h?t?t?p?:?/?/?([\\w\\.\\?\\&=\\-\\d]*)/?\\\"?.*>";
            String REGEX2 = ">\\b([\\w\\s\\d]+)\\b<";
            // First pattern: capture the href value.
            Pattern p = Pattern.compile(REGEX);
            Matcher m = p.matcher(INPUT);
            if ( m.find() ) {
                System.out.println(m.group(1) + "     " );
            } else {
                System.out.println("No match found");
            }
            // Second pattern: capture the text between '>' and '<'.
            Pattern p2 = Pattern.compile(REGEX2);
            Matcher m2 = p2.matcher(INPUT);
            if ( m2.find() ) {
                System.out.println(m2.group(1) + "     " );
            } else {
                System.out.println("No match found");
            }
       }
    }
    You do realize that you'll never get 100% accuracy with this. There are too many possible variations to account for them all.
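For comparison, here is a sketch of a single-pattern variant that accepts quoted and unquoted href values and strips nested tags from the link text. It is not the approach from the reply above; the class name and the pattern are assumptions made for illustration, and the accuracy caveat still applies, since hand-written regexes will not cover every HTML variation.

```java
import java.util.ArrayList;
import java.util.List;
import java.util.regex.Matcher;
import java.util.regex.Pattern;

public class LinkExtractor {
    // The href value may be double-quoted (group 1), single-quoted (group 2) or bare (group 3).
    private static final Pattern LINK = Pattern.compile(
        "<a\\b[^>]*?\\bhref\\s*=\\s*(?:\"([^\"]*)\"|'([^']*)'|([^\\s>]+))[^>]*>(.*?)</a>",
        Pattern.CASE_INSENSITIVE | Pattern.DOTALL);

    /** Returns {url, text} pairs for every anchor found in the input. */
    static List<String[]> extract(String html) {
        List<String[]> links = new ArrayList<>();
        Matcher m = LINK.matcher(html);
        while (m.find()) {
            String url = m.group(1) != null ? m.group(1)
                       : m.group(2) != null ? m.group(2) : m.group(3);
            // Drop nested tags such as <font> from the link text.
            String text = m.group(4).replaceAll("<[^>]+>", "").trim();
            links.add(new String[] { url, text });
        }
        return links;
    }

    public static void main(String[] args) {
        String input = "<a href=\"http://url1/\" garbagetags>text1</a> <a href=url2>text2</a>";
        for (String[] link : extract(input)) {
            System.out.println("url: " + link[0]);
            System.out.println("Text: " + link[1]);
        }
    }
}
```

The key differences from the original attempt are `[^>]*` instead of `.*` inside the tag, so the match cannot run past the closing `>`, and the reluctant `(.*?)` for the text, so two links on one line are found separately.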

  • How can I use regular expression to open files of certain types in java?

    Ok this is the problem I am facing:
    I have a command line input of something like "/usr/foo/bar/*.html"
    and there are multiple files in that folder that end with .html.
    How can I use the input to go through/open all the .html files in Java?
    Help would be greatly appreciated thanks!

    Or if you have to do it in Java, check out the interfaces java.io.FileFilter and java.io.FilenameFilter
    http://home.tiscali.nl/~bmc88/java/sbook/0128.html
    class HTMLFilter implements FilenameFilter {
        public boolean accept(File dir, String name) {
            return (name.endsWith(".html"));
        }
    }
    Cheers,
    evnafets
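A sketch of how such a filter might be wired to a command-line argument like "/usr/foo/bar/*.html" follows. The class and method names are invented for the example, and it only understands a mask of the form `*suffix` in the last path segment. Note also that on a Unix shell an unquoted `*.html` is normally expanded before Java starts, in which case the program simply receives the file names as separate arguments.

```java
import java.io.File;
import java.io.FilenameFilter;

public class HtmlFileLister {
    /** Splits "/usr/foo/bar/*.html" into {"/usr/foo/bar", ".html"}. */
    static String[] splitPattern(String arg) {
        int slash = arg.lastIndexOf('/');
        String dir = slash < 0 ? "." : (slash == 0 ? "/" : arg.substring(0, slash));
        String mask = arg.substring(slash + 1);                       // e.g. "*.html"
        String suffix = mask.startsWith("*") ? mask.substring(1) : mask;
        return new String[] { dir, suffix };
    }

    /** Lists the entries in the directory whose names end with the mask's suffix. */
    static File[] list(String arg) {
        String[] parts = splitPattern(arg);
        final String suffix = parts[1];
        FilenameFilter filter = (dir, name) -> name.endsWith(suffix);
        File[] files = new File(parts[0]).listFiles(filter);
        return files != null ? files : new File[0];   // listFiles returns null for a missing directory
    }

    public static void main(String[] args) {
        String pattern = args.length > 0 ? args[0] : "./*.html";
        for (File f : list(pattern)) {
            System.out.println(f.getPath());
        }
    }
}
```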

  • Regular expressions for matching file path

    Could someone give me an idea of how I can compare a fixed path with the paths a user gives, using regular expressions?
    My fixed path is : src\com\sample\demo\work\gui\.**
    and the user may give paths like src\com\sample\demo\work\gui\test.jsp, src\com\sample\demo\work\gui\init.jsp etc.
    Any ideas are appreciated, and thanks in advance.

    ...and if you insist on using regexes, you'll have to double-escape the backslashes:
    if ( userString.matches("src\\\\com\\\\sample\\\\demo\\\\work\\\\gui\\\\.*") ) {
    Whether you use regexes or not, you'll save yourself a lot of hassle by converting all backslashes to forward slashes before you do anything with the strings:
    userString = userString.replace('\\', '/');
    if ( userString.matches("src/com/sample/demo/work/gui/.*") ) {
    // or...
    if ( userString.startsWith("src/com/sample/demo/work/gui/") ) {
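Put together as a runnable sketch (the class and method names are only illustrative), the normalise-then-compare idea looks like this. Pattern.quote is used so that the fixed prefix is treated literally and needs no manual escaping.

```java
import java.util.regex.Pattern;

public class PathPrefixCheck {
    private static final String BASE = "src/com/sample/demo/work/gui/";

    // Pattern.quote turns the prefix into a literal, so '.' or '\' in it cannot act as regex syntax.
    private static final Pattern UNDER_BASE = Pattern.compile(Pattern.quote(BASE) + ".*");

    static boolean isUnderBase(String userPath) {
        // Convert Windows separators first so one pattern covers both styles.
        String normalized = userPath.replace('\\', '/');
        return UNDER_BASE.matcher(normalized).matches();
    }

    public static void main(String[] args) {
        System.out.println(isUnderBase("src\\com\\sample\\demo\\work\\gui\\test.jsp"));
        System.out.println(isUnderBase("src/com/sample/demo/work/other/init.jsp"));
    }
}
```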

  • Regular expressions in file mask in file protocol

    Hi,
    I wanted to check whether we can use regular expressions in the file mask with the file protocol. As per the documentation, regular expressions can be used in the file mask only with ftp/sftp.
    Regards,
    Anuj

    The documentation is not correct at this point. There is no regex support in file mask.
    OSB File Transport Configuration - File Mask

  • Regular expression in javascript

    Hi All,
    Does anybody know if there's a way to find out how many words
    appear in a string using regular expression in javascript? For
    example, I have the following code that pops a "Not OK" alert
    whenever str contains "it it it information technology", and I need
    to find out how many times the word "it" exists in the string using
    regular expression.
    <html>
    <body>
    <script type="text/javascript">
    var str = "it it it information technology";
    var reg = /^(\bin\b|\bit\b|\bof\b)(?!(\bin\b|\bit\b|\bof\b))/;
    reg = new RegExp(reg);
    var result = str.match(reg);
    document.write(result);
    if (result) {
    alert("OK");
    } else {
    alert("Not OK");
    }
    </script>
    </body>
    </html>
    Thanks very much in advance!

    Refer to this tutorial.
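Counting how often a word occurs comes down to matching it with word boundaries and counting the matches. The sketch below shows the idea in Java (the language used for the other examples on this page); the class name is made up. In JavaScript the same pattern with the global flag, `str.match(/\bit\b/g)`, returns an array of all matches whose length is the count.

```java
import java.util.regex.Matcher;
import java.util.regex.Pattern;

public class WordCounter {
    /** Counts whole-word occurrences of word in text. */
    static int count(String text, String word) {
        // \b on both sides keeps "it" from matching inside longer words.
        Matcher m = Pattern.compile("\\b" + Pattern.quote(word) + "\\b").matcher(text);
        int n = 0;
        while (m.find()) {
            n++;
        }
        return n;
    }

    public static void main(String[] args) {
        System.out.println(count("it it it information technology", "it")); // prints 3
    }
}
```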

  • Replacing space with &nbsp; in html using regular expressions

    Sorry, I missed some information in my first post; here is the corrected version.
    Hi
    I want to replace spaces with &nbsp; in HTML.
    I used the method below to replace the spaces in my HTML file.
    var spacePattern:RegExp = /(\s)/g;
    str = str.replace(spacePattern, "&nbsp;");
    Here the str variable contains the HTML file below. In this file I want to replace the spaces in "What number does this represents" with &nbsp;
    <html>
    <head>
    <meta http-equiv="Content-Type" content="text/html; charset=utf-8" />
    <title>Untitled Document</title>
    </head>
    <body>
    <b><TEXTFORMAT LEADING="2"><P ALIGN="LEFT"><FONT FACE="Verdana" style = 'font-size:10px' COLOR="#0B333C" LETTERSPACING="0" KERNING="0"><B></B></FONT></P></TEXTFORMAT><TEXTFORMAT LEADING="2"><P ALIGN="LEFT"><FONT FACE="Verdana" style = 'font-size:10px' COLOR="#0B333C" LETTERSPACING="0" KERNING="0"><B> What number does this Roman numeral represents MDCCCXVIII ?</B></FONT></P></TEXTFORMAT></b>
    </body>
    </html>
    But using the above regular expression I get this:
    <html>
    <head>
    <meta http-equiv="Content-Type" content="text/html; charset=utf-8" />
    <title>Untitled Document</title>
    </head><body>
    <b><TEXTFORMAT LEADING="2"><P ALIGN="LEFT"><FONT FACE="Verdana" style = 'font-size:10px' COLOR="#0B333C" LETTERSPACING="0" KERNING="0"><B></B></FONT></P></TEXTFORMAT><TEXTFORMAT LEADIN G="2"><P ALIGN="LEFT"><FONT FACE="Verdana" style = 'font-size:10px' COLOR="#0B33 3C" LETTERSPACING="0" KERNING="0"><B> What number does this represents</B></FONT></P></TEXTFORMAT></b>
    </body>
    </html>
    What happens is that spaces inside the HTML tags are replaced with &nbsp; as well. I want to replace only the spaces outside the HTML tags. Using regular expressions in Flex, I want a result like this:
    <html>
    <head>
    <meta http-equiv="Content-Type" content="text/html; charset=utf-8" />
    <title>Untitled Document</title>
    </head>
    <body>What&nbsp;number&nbsp;does&nbsp;this&nbsp;represents</body>
    </html>
    Please give me a solution to the above problem using regular expressions.
    Thanks in advance to all.
    Regards
    ssssssss
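One common way to skip the spaces inside tags is a negative lookahead: replace a space only if the next angle bracket after it is not a closing `>`. The sketch below shows the pattern in Java rather than ActionScript, so whether the Flex RegExp class accepts it unchanged is an assumption to verify. It also assumes that attribute values contain no `<` or `>` characters, and the class name is invented.

```java
public class NbspReplacer {
    /** Replaces spaces in text content with &nbsp; and leaves spaces inside tags alone. */
    static String replaceTextSpaces(String html) {
        // " (?![^<>]*>)" = a space that is NOT followed by "...>" without an intervening '<',
        // i.e. a space that is not sitting inside a tag.
        return html.replaceAll(" (?![^<>]*>)", "&nbsp;");
    }

    public static void main(String[] args) {
        String html = "<P ALIGN=\"LEFT\"><B> What number does this represents</B></P>";
        System.out.println(replaceTextSpaces(html));
    }
}
```

Only the plain space character is replaced here, so line breaks between tags stay as they are; text in other elements such as `<title>` is affected as well.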

  • HTML Escaping Regular Expression

    Assume I have the following string:
    <font face="arial">this</font> is a <b>very nice</b> " <a href>String</a>
    Now say I want to allow everything above, except I want to escape certain tags, i.e. I want to allow:
    <b>,</b>,<font ...>,</font> and nothing else. The escaped string above should then be converted to:
    <font face="arial">this</font> is a <b>very nice</b> " &lt;a href&gt; String &lt;/a&gt;
    The idea I'm trying to implement is to allow form input that may contain limited html tags, that I define, and escape anything else.
    This seems like it would be an existing regular expression, does anyone have any ideas?
    Thanks!

    Kudos and thanks for the regex. I did not know how to do negation in a regular expression, i.e. (?!FONT|B)
    To answer an earlier post, the application works much like a message board. I accept input from a textarea, then write that input out as an html file (much like how this forum works). I'd like to accept certain HTML markup in the input, but disallow tags like "<javascript>", "<object>" and "<embed>", etc., as writing those tags would allow users to post malicious input (redirects, popups, etc.). Thus, defining and parsing what tags I will accept is easier than defining the tags not to accept (and safer).
    That being said, escaping quotes is somewhat important, as " in html that is not in a tag should really be converted to &quot; for w3c browser standards. However, with 95% of my original question answered (that being the most important part), I'm satisfied with this. Thanks to all for the help thus far!
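The regex that was posted in the thread is not shown here, so the following is only a guess at how the (?!FONT|B) negation could be applied: any tag whose name is not on the allowed list has its angle brackets escaped. The class name and the exact pattern are assumptions. It is a sketch of the technique and not a security-grade sanitizer, since attributes on allowed tags (for example event handlers) pass through untouched.

```java
import java.util.regex.Pattern;

public class TagWhitelistEscaper {
    // Matches any <...> whose tag name (with an optional leading '/') is NOT font or b.
    private static final Pattern DISALLOWED =
        Pattern.compile("<(?!/?(?:font|b)\\b)([^>]*)>", Pattern.CASE_INSENSITIVE);

    static String escape(String input) {
        // $1 re-inserts the tag content between the escaped brackets.
        return DISALLOWED.matcher(input).replaceAll("&lt;$1&gt;");
    }

    public static void main(String[] args) {
        String in = "<font face=\"arial\">this</font> is a <b>very nice</b> <a href>String</a>";
        System.out.println(escape(in));
    }
}
```

The `\b` after the alternation is what stops `<br>` or `<body>` from slipping through as if they were `<b>`.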

  • Regular expression html parsing

    I have following sample html
    <html><body>
    First Name<input type="text" class="txtField" name="txtFirstName"\>
    Last name
    <input type="text" name="txtLastName"\>
    Address <textarea name="address" rows="10">Here goes address</textarea>
    <input type="button" name="btnSubmit" class="button"\>
    </body></html>
    I'm trying to build a regular expression that finds a list of tags based on a set of available names.
    For example, I have a string array: String names[] = {"txtLastName", "address"}
    Using the above array, the expression must find the matching tags in the html above.
    So in the above case the output should be
    <input type="text" name="txtLastName"\>
    <textarea name="address" rows="10">
    Can somebody suggest how this expression should be build?

    Hi,
    From your question, I understand that you want to parse the HTML file and, for the names in the array, get the markup of those controls.
    In that case I think you can find the string that
    1. starts with '<' and ends with '>', and
    2. contains one of your words ("txtLastName", "address", ...) in double quotes (opening and closing) exactly once.
    Best,
    Ronak
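The two conditions above can be turned into a pattern built from the names array. This is a sketch under the assumption that the names always appear as a double-quoted `name` attribute, as in the sample html; the class and method names are made up.

```java
import java.util.ArrayList;
import java.util.List;
import java.util.regex.Matcher;
import java.util.regex.Pattern;

public class TagsByName {
    /** Returns every tag whose name attribute equals one of the given names. */
    static List<String> findTags(String html, String[] names) {
        StringBuilder alternatives = new StringBuilder();
        for (String name : names) {
            if (alternatives.length() > 0) alternatives.append('|');
            alternatives.append(Pattern.quote(name));   // treat each name literally
        }
        // '<', anything but angle brackets, name="one-of-the-names", anything but angle brackets, '>'
        Pattern p = Pattern.compile(
            "<[^<>]*\\bname\\s*=\\s*\"(?:" + alternatives + ")\"[^<>]*>");
        List<String> tags = new ArrayList<>();
        Matcher m = p.matcher(html);
        while (m.find()) {
            tags.add(m.group());
        }
        return tags;
    }

    public static void main(String[] args) {
        String html = "First Name<input type=\"text\" class=\"txtField\" name=\"txtFirstName\"\\>\n"
                    + "<input type=\"text\" name=\"txtLastName\"\\>\n"
                    + "Address <textarea name=\"address\" rows=\"10\">Here goes address</textarea>";
        for (String tag : findTags(html, new String[] { "txtLastName", "address" })) {
            System.out.println(tag);
        }
    }
}
```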

  • Html parser of regular expression

    Dear all,
    I know some of you will think my problem can be solved by an open-source html parser, but I tested the following list of parsers (http://java-source.net/open-source/html-parsers) and failed to find one that meets my requirement, as I explain below.
    I would like to parse an html file and fetch the hyperlinks from it.
    I wrote the following regular expression and it works in most cases:
    .*(src|href|url|action)\s*=\s*["|']?(.*?)["|'|\s?|>].*
    However, I have until now two troubles:
    1. For "<a href="directory.html">Directory</a> | <a href="a-z.html">A - Z</a>", I expected to fetch "directory.html" and "a-z.html" but I only got the last one.
    2. I expected to exclude "http://www.javaeye.com/upload.jpg" in "<img alt="subwayline13" class="logo" src="http://www.javaeye.com/upload.jpg" title="subject" />". I still could not find a solution for this.
    Therefore, I would wish that you can give me some new advice.
    Merry Christmas and Happy New Year!
    Pengyou

    pengyou wrote:
    Dear all,
    I know some of you will think my problem can be solved by an open-source html parser but I tested the following list of parsers (http://java-source.net/open-source/html-parsers) and failed to find one that meets my requirement as I explained below.
    Then you did something wrong when you were using the parser.
    I would like to parse a html file and fetch the hyper links from it.
    I wrote the following regular expression and it works in most cases:
    .*(src|href|url|action)\s*=\s*["|']?(.*?)["|'|\s?|>].*
    However, I have until now two troubles:
    1. For "<a href="directory.html">Directory</a> | <a href="a-z.html">A - Z</a>", I expected to fetch "directory.html" and "a-z.html" but I only got the last one.
    2. I expected to exclude "http://www.javaeye.com/upload.jpg" in "<img alt="subwayline13" class="logo" src="http://www.javaeye.com/upload.jpg" title="subject" />". I still could not find a solution for this.
    Therefore, I would wish that you can give me some new advice.
    Same advice as before.
    1. Use an existing html parser correctly.
    2. Write your own html parser. An actual parser. A parser would be part of your solution, not the entire solution.
    And more advice...do not attempt to use regexes to parse html nor xml for that matter. The reason for that is because by the time you get it right, if ever, you will have built a parser. So instead start with one right away.
    I suspect that your actual problem is that you don't know what a parser is and what it should do. So you think that a "parser" should give you the result you want rather than giving you tokens. A parser parses a source based on a grammar and produces tokens. A token is not an image file until you further interpret a particular token that way.
    Finally note that in the above I said you could build your own parser if you wanted. But then you must in fact build a parser. If you do it correctly then you are going to end up with something that is functionally equivalent to one of the existing parsers. If you do it wrong then it won't.
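As a side note on trouble 1 in the question: the leading `.*` in the posted pattern is greedy, so it swallows as much of the line as it can and only the last attribute is left for the capture groups. The sketch below (class name and simplified pattern are assumptions) drops the surrounding `.*` and calls find() in a loop, which yields every occurrence. It does not address trouble 2, and the advice above about preferring a real parser still stands.

```java
import java.util.ArrayList;
import java.util.List;
import java.util.regex.Matcher;
import java.util.regex.Pattern;

public class AttributeValueFinder {
    // No leading or trailing ".*": find() moves through the input one match at a time.
    private static final Pattern ATTR =
        Pattern.compile("\\b(?:src|href|url|action)\\s*=\\s*[\"']?([^\"'\\s>]+)");

    static List<String> findValues(String html) {
        List<String> values = new ArrayList<>();
        Matcher m = ATTR.matcher(html);
        while (m.find()) {
            values.add(m.group(1));
        }
        return values;
    }

    public static void main(String[] args) {
        String line = "<a href=\"directory.html\">Directory</a> | <a href=\"a-z.html\">A - Z</a>";
        System.out.println(findValues(line));
    }
}
```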

  • Open an html file inside spry collapsible panel

    Greetings,
    Using CS5
    Does anyone know the CSS code/possibility of opening an html file inside the Spry panel?
    I've tried the following code:
    .CollapsiblePanelContent {
        background-image: url(lookoutgraph.html);
    }
    Is there a better CSS element for calling up an html file?
    For what it's worth...the html file I'm trying to open uses a <canvas> tag. When opened by itself, the html file has no problem opening and displaying the canvas data.
    Much Thanks!

    With the exception of images, to add external content to a document you will need to make use of either server-side or client-side code.
    Have a look at the SpryHTMLPanel here http://labs.adobe.com/technologies/spry/samples/htmlpanel/html_panel_sample.html
    Otherwise, please supply a link to your site so that we can come up with alternatives.
    Gramps

  • How do I have to define a regular expression to filter out data from file?

    Hi all,
    I need to extract parts of lines of an ASCII file and couldn't get it done with my limited knowledge of regular expressions.
    The file contains hundreds of lines and I am just interested in a few lines, within that lines I just need a part of the data.
    One original line looks like that:
    TP3| |TP_SMD|Nicht in Stueckliste|~TP TP_SMD TESTPUNKT|-|0|87.770|157.950|0|top|c| |other|TP_SMD|TP_SMD_60RF-TP
    Only the bold and underlined information is of interest, I don't need the rest.
    I can open that file, read in each line but then I am struggling to pick out only the lines of interest (starting with TP), taking that TP with its number and the coordinates following later on and then writing these shortened lines to a new text file. So the new line should look like that:
    TP3; 87.770;157.950;0 (It doesn't matter if the separator will be ; or |)
    I thought of using regular expressions - is that the right way or is there a better approach?
    Thanks & regards,
    gedi, using LabVIEW 8.5

    Hi max,
    for finding a specific part of a string you can use the "Match Pattern" VI, it is located in the Strings Palette.
    Maybe the Extract Numbers.vi example in the examples browser library can help you.
    What I did to filter out my data of interest is first to sort out only the columns which I want to have -
    then there are still a lot of lines remaining I don't need (this is the thing described above).
    The rest I am going to filter out with a (then easy) regular expression with the "Match Pattern" VI.
    Regards,
    gedi
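For readers working outside LabVIEW, the filtering described in the question can be sketched with one regular expression. The example is in Java, the language used for the other examples on this page, and it rests on an assumption: the formatting that marked the fields of interest is lost, so the fields (the TP name plus the eighth, ninth and tenth columns) are inferred from the desired output line. The class name is made up.

```java
import java.util.regex.Matcher;
import java.util.regex.Pattern;

public class TestPointFilter {
    // TP name, then six columns to skip, then the three columns to keep.
    private static final Pattern TP_LINE = Pattern.compile(
        "^(TP\\d+)\\|(?:[^|]*\\|){6}([^|]*)\\|([^|]*)\\|([^|]*)\\|");

    /** Returns "TPn;x;y;z" for a test-point line, or null for any other line. */
    static String shorten(String line) {
        Matcher m = TP_LINE.matcher(line);
        if (!m.find()) {
            return null;
        }
        return m.group(1) + ";" + m.group(2) + ";" + m.group(3) + ";" + m.group(4);
    }

    public static void main(String[] args) {
        String line = "TP3| |TP_SMD|Nicht in Stueckliste|~TP TP_SMD TESTPUNKT|-|0|87.770|157.950|0|top|c| |other|TP_SMD|TP_SMD_60RF-TP";
        System.out.println(shorten(line));
    }
}
```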

  • Regular Expression to remove space in HTML Tag

    Hello All,
    My HTML string is like below.
    select '<CityName>RICHMOND</CityName> 
    <StateCd>ABCD CDE 
    <StateCd/>
    <CtryCd>CAN</CtryCd>
    <CtrySubDivCd>BC</CtrySubDivCd>' Str from dual
    Desired Output is
    <CityName>RICHMOND</CityName><StateCd>ABCD CDE 
    <StateCd/><CtryCd>CAN</CtryCd><CtrySubDivCd>BC</CtrySubDivCd>
    i.e. I want to remove the whitespace between tags where the text consists only of spaces, and otherwise leave everything as it is. Please help me implement this using a regular expression.

    Hi,
    It's unclear what you want.  This site seems to be formatting your message in some odd way.
    Post a statement like
    SELECT '...' FROM dual;
    without any formatting, to show your input, and post the exact output you want from that, with as little formatting as possible.  It might help if you use some character like ~ instead of spaces (just for posting; we'll find a solution that works for spaces).
    To remove the text that consists of spaces and nothing else between the tags, you can say
    REGEXP_REPLACE ( str
                   , '> +<'
                   , '><'
                   )
    How is this string being generated?  Maybe there's some easier, more efficient way to keep the bad substrings out of the string in the first place.
