Regex: Escaping in HTML

Hi All,
I need to escape certain characters on our JSP pages to make them W3C compliant.
The characters I need to escape are
" to &quot; < to &lt; > to &gt; & to &amp; (but not when & is followed by #, e.g. &#35; would NOT be escaped).
E.g.
Raw:  John & Mary are > then Sam & Helen but < George & Michelle -- &#27222;
Converted:  John &amp; Mary are &gt; then Sam &amp; Helen but &lt; George &amp; Michelle -- &#27222;
I've tried to use PerlUtils but to no avail (I cannot work out how to tell it not to escape the &# sequence).
Can you guys help me accomplish this with Java's regex package (or with the 1.4 String class)?
Thanks!

Thanks Uncle Alice! You are the regexp king (or queen?)
Of the two, "king" would be more appropriate, but I really prefer "poobah". :-)
I've been writing Java apps for 5 years now and I have never touched regex. I did do some googling but really couldn't find real examples that were easy to follow.
I know what you mean. This site has a pretty good tutorial, but the examples are a bit too abstract. I was hoping the Habibi book from APress would fill that void (it was originally titled Real World Regular Expressions with Java), but poor examples turned out to be the least of its problems. Your best bet is still Friedl's book -- good examples are the least of its virtues.
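For anyone reading without the answer that the thanks above refer to: one way to get the behaviour asked for in the question is a negative lookahead, `&(?!#)`, which matches an ampersand only when it is not followed by `#`. What follows is a minimal sketch under that assumption, not the original answer. The class name `EntityEscaper` and its method are made up for illustration, and only `String.replaceAll` is used so that it should also work with the 1.4 String class.

```java
import java.util.regex.Pattern;

// Hypothetical helper; the names are illustrative only.
class EntityEscaper {
    // '&' not immediately followed by '#', so references like &#27222; are left alone.
    private static final Pattern BARE_AMP = Pattern.compile("&(?!#)");

    static String escape(String raw) {
        // Ampersands go first; otherwise the '&' in "&lt;" etc. would be escaped again.
        String s = BARE_AMP.matcher(raw).replaceAll("&amp;");
        s = s.replaceAll("<", "&lt;");
        s = s.replaceAll(">", "&gt;");
        s = s.replaceAll("\"", "&quot;");
        return s;
    }

    public static void main(String[] args) {
        System.out.println(escape(
            "John & Mary are > then Sam & Helen but < George & Michelle -- &#27222;"));
    }
}
```

Note that input which already contains named entities such as `&amp;` would be escaped a second time by this sketch; only the `&#` case from the question is protected.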

Similar Messages

  • Sorting values by escaping the html tags

    Hi Friends,
    My problem is described below...
    I am facing an issue where I need to sort (main title / alternate title) while ignoring the HTML tags.
    E.g. 'Amex' should always come before 'Singer'.
    Amex in the DB is stored as Amex.
    Singer in the DB is stored as &lt;i&gt;Singer&lt;/i&gt;
    Some values can look like <i>hello<i>
      sql.append("SELECT DISTINCT NVL(ItemDocMeta.xMainTitle, ItemDocMeta.xAlternateTitle) AS generatedPageTitle,");
    other conditions
    // order by
                sql.append(" ORDER BY LOWER(generatedPageTitle) ASC");
    Is there any SQL function that can make this easier?
    I tried using a regex in the ORDER BY clause, but it doesn't work unless I also keep it in the SELECT part, and I am unable to use the regex in the SELECT part together with DISTINCT.
    Thanks

    For example :
    SQL> set scan off
    SQL>
    SQL>
    SQL> with sample_data as (
      2      select 'again sort a column' str from dual union all
      3      select '<i>sort</i> a column' from dual union all
      4      select '&lt;i&gt;resort&lt;i&gt;' from dual
      5  )
      6  select str
      7  from sample_data
      8  order by regexp_replace(
      9             utl_i18n.unescape_reference(str)
    10           , '</?[^>]+>'
    11           )
    12  ;
    STR
    again sort a column
    &lt;i&gt;resort&lt;i&gt;
    <i>sort</i> a column

  • Regex - Remove specific HTML Tags

    I have already found a solution in the forum to remove all HTML tags, but I need to keep some specific tags (img, a, b, i, u) and also their closing tags (</a>, </b>).
    The regex also needs to distinguish between img without a class attribute and img with a class attribute: it should remove img elements that have a class attribute.
    So I have tried to modify the solution I found:
    result = Regex.Replace(result, "<[^(img|a|b|i|u)][^>]*>", " ");
    It does not work well, because it also removes the closing tags, doesn't distinguish the img cases and doesn't remove the br tags. It's not necessary to do all these actions in one statement.

    You can use a regular expression such as
    <\s*/?\s*(?:body|html)\s*>
    In C#:
    var result = Regex.Replace(html, @"<\s*/?\s*(?:body|html)\s*>", "");
    This matches tags such as:
    <html>
    <body>
    < / html>
    < / body>
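The whitelist part of the question (keep img, a, b, i, u and their closing tags, drop everything else, and drop img tags that have a class attribute) can be approached with a negative lookahead in two passes. Here is a rough sketch in Java rather than C#; the lookahead syntax used is the same in both regex flavours. The class and method names are hypothetical, and like any regex-based tag stripping it assumes reasonably well-behaved markup (no `>` inside attribute values, no comments or scripts to worry about).

```java
import java.util.regex.Pattern;

// Hypothetical helper; the names are illustrative only.
class TagFilter {
    // Any opening or closing tag whose name is NOT img, a, b, i or u.
    private static final Pattern OTHER_TAGS =
        Pattern.compile("(?i)</?(?!(?:img|a|b|i|u)\\b)[a-z][^>]*>");

    // <img ...> tags that carry a class attribute.
    private static final Pattern IMG_WITH_CLASS =
        Pattern.compile("(?i)<img\\b[^>]*\\sclass\\s*=[^>]*>");

    static String filter(String html) {
        // Pass 1: replace every non-whitelisted tag with a space.
        String s = OTHER_TAGS.matcher(html).replaceAll(" ");
        // Pass 2: among the remaining tags, drop img tags that have a class attribute.
        return IMG_WITH_CLASS.matcher(s).replaceAll(" ");
    }

    public static void main(String[] args) {
        System.out.println(filter("<p><b>bold</b><br><img class=\"x\" src=\"a.png\"></p>"));
    }
}
```

The `\b` after the alternation is what keeps `<br>` from being mistaken for `<b>`: the lookahead only rejects a tag when the whitelisted name ends at a word boundary.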

  • Regexes: escape quotes

    Hello Forum...
    I want to use a Pattern/Matcher to find single or double quotes in a string and replace them with escaped quotes e.g. " becomes \" (that I can store in a db). I'm especially confused about the Pattern.compile string - what format will return any quote in a given string ("^*\"\'*")?
    Any advice appreciated - thanks
    /j-p.

    My, that's certainly confusing. Many thanks for the
    patient explanation though I think I'll need some
    practice for it to sink in.
    regards
    /j-p.
    In plain language, the String literal is being parsed twice: once by the compiler, which replaces \\ with \ and \" with ", and then again by the regex engine, which replaces \\ with \.
    To be honest, I didn't think this out beforehand. I just played around with it until I got it to work, and then when you asked, I analyzed it. Often for things like this, it's a lot faster to do it that way. Start with what you think it should be and then tweak it until it works. Then, if you need to, figure out why your assumption was incorrect, and you have just taught yourself something.
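The double parsing described above is easier to see with a concrete call. The answer's actual code is not shown in this thread, so the following is only a sketch of one way to do the replacement; the class and method names are hypothetical.

```java
// Hypothetical helper; the names are illustrative only.
class QuoteEscaper {
    static String escapeQuotes(String s) {
        // Pattern: the source text ([\"']) reaches the regex engine as (["']),
        // a group capturing either a double or a single quote.
        // Replacement: the source text \\\\$1 reaches the regex engine as \\$1,
        // which it reads as one literal backslash followed by group 1.
        return s.replaceAll("([\"'])", "\\\\$1");
    }

    public static void main(String[] args) {
        // Prints: He said \"hi\" and \'bye\'
        System.out.println(escapeQuotes("He said \"hi\" and 'bye'"));
    }
}
```

If the goal is only to store the string in a database through JDBC, a `PreparedStatement` parameter avoids the need for manual quote escaping altogether.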

  • Escaping of html characters

    ahoj!
    in an sql report i have to show text messages that include sometimes special html characters like <. is there an oracle function to convert this characters in the format & #60; (without the blank)? i don't want to replace all the special characters by myself.
    thanks!
    ciao,
    christian
    Edited by: Christian Ropposch on Apr 8, 2009 1:39 PM
    Edited by: Christian Ropposch on Apr 8, 2009 1:40 PM

    Hi Christian R.!
    If I understood right then you are using APEX. HTP and HTF are included with Oracle and APEX. You don't need the Application Server.
    regards

  • HTML Tags escape while exporting to PDF

    Hi friends,
    I have two columns in my BI report like
    Business Group(BG) and OU
    Since in the BG column under the edit formula i used the html tags to show each and every BG column value in bold like below
    '<b>' || "D1.Company"."Business Group Full Name" || '</b>'
    Now the report is appearing fine in answers and dashboard but when i export the same report to PDF i can see the html tags in each and every name of the BG group.
    Is there anything to be done to escape these html tags in PDF while exporting.
    And also if we give the tooltip for the numeric columns like below under the column properties-->Data format-->Custom like
    [html]<p title=\""Employee Count" \">#
    And after that if i tried to escape the same report with the tooltip on the numeric column to pdf means, the number values are missing for the entire column in the PDF, but this case is not appearing while exporting the report to pdf if i set the tooltip to the text column, as the values is missing only for the tooltip of the numeric column. What could be the fix for this problem.
    Thanks
    Regards,
    Saro

    Hi svee,
    Thanks for the reply.
    As an example, I used the bold tag in the edit formula in order to show the HTML tags problem when exporting to PDF, instead of using the default bold option within the column properties like you said.
    Suppose I used HTML tags like
    <br>, </br>
    in the edit formula section, or any other HTML tags that are not available as a default option in BI the way bold is. In that case I face the issue when exporting the report to PDF.
    I also need your suggestion regarding the values disappearing from the numeric column when a tooltip is used on it and the report is exported to PDF.
    Regards,
    Saro

  • XSLT HTML output escaping problem

    I am having problems with JAXP XSLT output escaping in HTML. I have an XML document which correctly escapes &, <, > etc. (e.g. &amp;). However, when I transform the document the XSLT processor is escaping these values again. This means that the HTML output contains &amp;amp; where the XML document contains &amp; (for example). Surely correctly escaped characters in the XML should be left alone? Or should the XML document not contain escaped characters? In my understanding special characters in text nodes of XML documents should be escaped. Obviously a workaround is to disable output escaping on these nodes, but that is a real pain to have to do each time you pull a value through from the XML document. I'm confused as to what format the XSLT processor expects the XML to be in with regard to escaping. Any information gratefully accepted!
    Thanks.

    Sorry! I was getting confused - the special characters were being escaped twice before they reached the XSLT processor. It looks like the processor will leave alone escape sequences that are already in the XML when it transforms to HTML.

  • Escaping HTML in a Custom Tag

    Hello, all. I am sadly failing to find a library function to escape HTML in Java. I'm writing a tag that used to use JSTL as a custom tag and don't know exactly where to find the functionality the JSTL is using. Consider the old JSP:
    <c:forEach items="${versions}" var="version">
         <tr>
             <td><c:out value="${version.version}"/></td>
             <td><c:out value="${version.releaseDate}"/></td>
        </tr>
    </c:forEach>
    and the new custom tag class:
    public class VersionTableTag implements Tag {
        private PageContext context;
        private Tag parent;

        public int doEndTag() throws JspException {
            return EVAL_PAGE;
        }

        public int doStartTag() throws JspException {
            JspWriter out = context.getOut();
            List versions;
            try {
                versions = (List)(context.getVariableResolver().resolveVariable("versions"));
            } catch (ELException e) {
                versions = new ArrayList();
            }
            try {
                out.println("<table>");
                out.println("<tr>");
                out.println("<th>Version</th>");
                out.println("<th>Release Date</th>");
                out.println("</tr>");
                for (Object o : versions) {
                    VersionVO version = (VersionVO)o;
                    out.println("<tr>");
                    out.println("<td>");
                    out.println(version.getVersion()); // NOTE: not properly escaped
                    out.println("</td><td>");
                    out.println(version.getReleaseDate().toString()); // NOTE: not properly escaped
                    out.println("</td>");
                    out.println("</tr>");
                }
                out.println("</table>");
            } catch (IOException e) {
                throw new JspException(e);
            }
            return SKIP_BODY;
        }
    }
    My questions are as follows:
    1. Is it possible to define a tag using a JSP document rather than a Java class? All I'm really after is something similar to an include but I want to be able to provide attributes to it in the long run.
    2. Is there some library function I can use to escape the HTML in the VO above?

    1. Is it possible to define a tag using a JSP document rather than a Java class? All I'm really after is something similar to an include but I want to be able to provide attributes to it in the long run.
    Yes. It was introduced with JSP 2.0: tag files.
    Just like a JSP lets you write a servlet easily, a tag file lets you write a custom tag class easily.
    http://java.sun.com/javaee/5/docs/tutorial/doc/bnama.html
    2. Is there some library function I can use to escape the HTML in the VO above?
    There is one provided in the jakarta commons "lang" library.
    http://commons.apache.org/lang/
    They provide a class "StringEscapeUtils" which will do most of the common escaping that you require.
    How hard is it to write something that replaces & < and > though?
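To give a sense of how little code that last remark implies, here is a minimal hand-rolled sketch. It is an illustration only, not the JSTL or Commons Lang implementation; the class name `HtmlText` is made up, and it covers just the five characters that matter for HTML text and attribute values.

```java
// Hypothetical helper; the names are illustrative only.
class HtmlText {
    static String escape(String s) {
        if (s == null) {
            return "";
        }
        StringBuilder sb = new StringBuilder(s.length() + 16);
        for (int i = 0; i < s.length(); i++) {
            char c = s.charAt(i);
            switch (c) {
                case '&':  sb.append("&amp;");  break;
                case '<':  sb.append("&lt;");   break;
                case '>':  sb.append("&gt;");   break;
                case '"':  sb.append("&quot;"); break;
                case '\'': sb.append("&#39;");  break;
                default:   sb.append(c);
            }
        }
        return sb.toString();
    }

    public static void main(String[] args) {
        System.out.println(escape("<td>Tom & \"Jerry\"</td>"));
    }
}
```

In the tag class above, the two lines marked "not properly escaped" would then become something like `out.println(HtmlText.escape(version.getVersion()));`.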

  • Escape HTML content without escaping tags

    How can I escape HTML content without escaping the HTML tags?

    Thanks for the info.  The select statement worked for me when I ran it with the inputs below but when I tried to put the statement in my code it didn't work.  Can you have a look and see what I may have done wrong?
    Thanks
    Karen
    My code looks like this:
    FUNCTION escape_varchar(p_text_in IN VARCHAR2, p_encode  IN NUMBER)
        RETURN VARCHAR2
      IS
        p_text_out VARCHAR2(32767);
      BEGIN
        p_text_out := DBMS_XMLGEN.CONVERT(utl_i18n.escape_reference(unistr(REPLACE(p_text_in,'\','\\')),'US7ASCII'),p_encode);
        RETURN p_text_out;
    END escape_varchar;
    p_text_in =
    <html>
    <head>
      </head>
    <body>
    Make sure the following characters are getting Displayed in EIS
    ~`!@#$%^&*()_-+={}|[]\:;"'<>?,./ ¢ £€¥©®™‰µ >• • … §¶ß‹›«»==–—¯ ?¤¦¨¡¿ˆ˜°-±÷/×¹²³¼½¾ ƒ??8v˜?=?¬n?´¸ªº
    †‡ÀÁÂÃÄÅÆÇÈÉÊËÌÎÏÐÑÒÓÔÕÖØŒŠÙÚÝÜŸÞàáâãäåæçèéêëì í î ï ðñòóôõøœšùúûüýþÿ??G????T?????????S??F??Oaß?de?????
    ?µ???p??st?f??????????????
      </body>
    </html>
    p_encode = 0
    p_text_out =
    &amp;lt;html&amp;gt;
    &amp;lt;head&amp;gt;
      &amp;lt;/head&amp;gt;
      &amp;lt;body&amp;gt;
    Make sure the following characters are getting Displayed in EIS
    ~`!@#$%^&amp;amp;*()_-+={}|[]\:;&amp;quot;&amp;apos;&amp;lt;&amp;gt;?,./ &amp;#xa2; #&amp;#x20ac;Y&amp;#xa9;&amp;#xae;&amp;#x2122;&amp;#x2030;&amp;#xb5; &amp;gt;&amp;#x2022; &amp;#x2022; &amp;#x2026; &amp;#xa7;&amp;#xb6;&amp;#xdf;&amp;#x2039;&amp;#x203a;&lt;&gt;==&amp;#x2013;&amp;#x2014;&amp;#xaf; ?&amp;#xa4;|&amp;#xa8;!&amp;#xbf;&amp;#x2c6;&amp;#x2dc;&amp;#xb0;-&amp;#xb1;&amp;#xf7;/&amp;#xd7;&amp;#xb9;&amp;#xb2;&amp;#xb3;&amp;#xbc;&amp;#xbd;&amp;#xbe; &amp;#x192;??8v&amp;#x2dc;?=?&amp;#xac;n?&apos;&amp;#xb8;&amp;#xaa;&amp;#xba;
    &amp;#x2020;&amp;#x2021;AAA&amp;#xc3;A&amp;#xc5;&amp;#xc6;CEEEEIII&amp;#xd0;&amp;#xd1;OOO&amp;#xd5;O&amp;#xd8;&amp;#x152;SUUYU&amp;#x178;&amp;#xde;aaa&amp;#xe3;a&amp;#xe5;&amp;#xe6;ceeeei i i i &amp;#xf0;&amp;#xf1;ooo&amp;#xf5;&amp;#xf8;&amp;#x153;suuuuy&amp;#xfe;y??G????T?????????S??F??Oa&amp;#xdf;?de?????
    ?&amp;#xb5;???p??st?f??????????????
      &amp;lt;/body&amp;gt;
    &amp;lt;/html&amp;gt;

  • Adding a html link to a carousel

    i have followed a tutorial to build a flash carousel and have had no problems in building that.  i wanted to add a html link to a section of the carousel but everything i have tried up to now has failed.  All the information is passed through to the carousel from an xml file and everything works fine except for the link.
    i have slightly modified the tutorial so i have an extra box on the information page which holds the web address for the site they are looking at in the portfolio.  i have the address showing up but i cannot use the text as a link.
    the box that i am putting it in is a dynamic text box which has been set up with the following settings
    theLink.html = true;
    theLink.htmlText = t.Link;
    so as far as i know these are set to be able to handle html tags/code but they are not playing games.
    i have tried to pass the link through to the swf file from the xml file with the following code but this just seem to break the whole application.
    Link="<![CDATA[<a href='http://www.acmeart.co.uk/pip_new']]"
    to view how i have the file working up to now please view the link below.
    http://acmeart.co.uk/carousel
    i feel the problem is not with the flash application but more with how i am trying to pass the html link over to the application from the xml file.  i will attach the xml file to the post.

    What I had in mind was to place the link inline in your text. Something like this:
    <item name="valley" full="assets/valley.png"
            title='Book - CD ROM'
            body='This product was a Shockwave holiday card. It began as a jigsaw puzzle. As the puzzle was solved, it opened and displayed the holiday greeting. &lt;a href=&quot;http://www.amazon.com&quot;&gt;Purchase a copy.&lt;/a&gt;'>
        </item>
    The url is right in the text. The html markup is escaped as html entities.

  • How to include HTML tags in a success message?

    I have seen a few posts on this topic in the past, but have not seen a definitive response...apologies if this has been answered.
    I would like to have a dynamic process success message that includes some HTML tags in it - in this case an anchor tag that can jump the user back to the object he/she just edited.
    So, I build up a dynamic success message in a page variable - say P10_SUCCESS_MSG - and set the Process Success Message field in the relevant Process to &P10_SUCCESS_MSG.
    The problem is that when the page template substitutes in my message for #SUCCESS_MESSAGE# it looks like it also escapes any HTML tags, so the markup gets displayed literally on the page.
    Is there any recommended way to override or get around this behavior?
    Many thanks,
    Bill

    I just noticed that it looks like the success message is passed around on the URL - when I look at my URL after a successful form process, I see &success_msg=BIG_UGLY_ESCAPED_SUCCESS_MSG_HERE embedded in the URL. If this is the case, I can see why - technically - it ends up getting escaped. Looks like I may have to figure out my own hack for success message processing - maybe some weird post-processing of a placeholder success token that I replace in a page 0 process or something...

  • Parse XML file with regex

    Hi,
    Is it possible to parse an XML file using regular expressions? If so, what API is needed?
    thanx in advance

    Is that possible to parse a xml file using regular
    expressions....if so what is the API needed
    I'm sure it can be done. Here's the regex API:
    http://java.sun.com/j2se/1.5.0/docs/api/java/util/regex/package-summary.html
    http://www.javaregex.com/tutorial.html
    But that's not what regex is for. Better to have a look at this:
    http://java.sun.com/xml/
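As a small illustration of the "use a real XML API instead" advice, here is a sketch using the DOM parser that ships with the JDK (JAXP). The class name, method name and sample document are invented for the example.

```java
import java.io.StringReader;
import javax.xml.parsers.DocumentBuilderFactory;
import org.w3c.dom.Document;
import org.w3c.dom.NodeList;
import org.xml.sax.InputSource;

// Hypothetical helper; the names are illustrative only.
class XmlFirstText {
    // Returns the text of the first element with the given tag name, or null if there is none.
    static String firstText(String xml, String tagName) {
        try {
            Document doc = DocumentBuilderFactory.newInstance()
                    .newDocumentBuilder()
                    .parse(new InputSource(new StringReader(xml)));
            NodeList nodes = doc.getElementsByTagName(tagName);
            return nodes.getLength() == 0 ? null : nodes.item(0).getTextContent();
        } catch (Exception e) {
            throw new IllegalArgumentException("Could not parse XML", e);
        }
    }

    public static void main(String[] args) {
        String xml = "<books><book><title>Regex &amp; XML</title></book></books>";
        System.out.println(firstText(xml, "title"));
    }
}
```

The parser takes care of nesting, attributes and entity references such as `&amp;`, which is exactly the part that is painful to get right with regular expressions.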

  • Escape special characters in url for redirection

    In my web page, I want all the characters of the URL to be lower case. For that I created the following method:
    private bool UrlFormatoCorrecto(string url)
    {
        bool formatoCorrecto = true;
        bool upperCa = url.Any(c => char.IsUpper(c));
        if (url.Any(c => char.IsUpper(c)))
            formatoCorrecto = false;
        if (url.Contains(" ") || url.Contains("+"))
            formatoCorrecto = false;
        return formatoCorrecto;
    }
    This works like a charm until a special character appears. The url that I get then will be the following:
    http://localhost/web/coches/proven%C3%A7a-aribau,-08036-barcelona,-barcelona
    So I have upper case characters. When I redirect it using the following code:
    if (!UrlFormatoCorrecto(urlActual))
    Response.RedirectPermanent(urlActual.Replace(" ", "-").Replace("+", "-").ToLower());
    I get the code again with the same URL with upper case. How can I escape the special characters so they won't bother me anytime I want to make the redirection?

    Hello,
    you could escape special characters with the
    Regex.Escape method.
    Regards
    Cédric
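One possible reading of the redirect loop in the question is that the only upper-case characters left in the URL are the hex digits of percent-escapes such as %C3%A7, which come back in upper case on every request. Under that assumption, the check could ignore percent-escapes before testing for upper case. The sketch below shows the idea in Java rather than C# (the logic carries over directly); the class and method names are hypothetical.

```java
import java.util.Locale;
import java.util.regex.Pattern;

// Hypothetical helper; the names are illustrative only.
class UrlCaseCheck {
    // A percent-escape: '%' followed by two hex digits, in either case.
    private static final Pattern PERCENT_ESCAPE = Pattern.compile("%[0-9A-Fa-f]{2}");

    static boolean isWellFormed(String url) {
        // Drop percent-escapes so their hex digits do not count as upper-case letters.
        String withoutEscapes = PERCENT_ESCAPE.matcher(url).replaceAll("");
        boolean hasUpper = !withoutEscapes.equals(withoutEscapes.toLowerCase(Locale.ROOT));
        return !hasUpper && !url.contains(" ") && !url.contains("+");
    }

    public static void main(String[] args) {
        System.out.println(isWellFormed(
            "http://localhost/web/coches/proven%C3%A7a-aribau,-08036-barcelona,-barcelona"));
    }
}
```

With a check like this, the URL from the question counts as already well formed, so no further redirect would be issued for it.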

  • Escaping characters on help page

    I would like to escape some html-code I'm placing on a help region. Is that possible? How is the escaping for escaped html region implemented? Can I use the same API?

    No, that does not help. I want it to show up in the webpage. So I want to be able to write some htmlcode on the page and display the actual html code for the reader. The comment would just make it not interpreted by the browser, but it would not show up on the webpage.
    In this case it is a basic DIV I want to show including the tags. I want to grab the code from its actual place and show it here, so just rewriting the tags is not a good option. Calling a function to do some basic replacements would be OK though. Even better would be to be able to use an HTML region that escapes the special characters, but it seems like the help region only uses a standard HTML region.

  • Html entity expansion

    I was given an HTML document that has an XML document embedded in the HEAD. The XML document is escaped with HTML entities, so instead of angle brackets, you have the entity equivalent, which is "ampersand less than semicolon" (if I typed it out literally it would look like an angle bracket here), or "ampersand greater than semicolon".
    Is there a known method for expanding html entities? Or do I need to code up my own entity replacement routine?
    Thanks for any help.

    I really can't imagine why somebody would do that. It just sounds like a loony idea to me. (I mean the part about putting escaped XML documents in the head of an HTML document, not what you propose to do about it.)
    If that's really your requirement, then hopefully the escaped XML is the text inside an element. (In my example, it's the text inside the "head" element except that I put extra whitespace before and after it.) And hopefully the HTML you're getting is actually XHTML, so you could feed it into an XML parser.
    If both of those conditions are true then what you should do is this:
    1. Feed the entire document into an XML parser. Extract the text from the "head" element that's the child of the root "html" element. You could certainly use DOM for that if you wanted. This will be the XML document you're after, and the parser will have unescaped it for you.
    2. Feed that text into another XML parser and proceed with whatever you have to extract from it.
    But based on the looniness of this data format you probably won't have either of those conditions true.
    If the data isn't XHTML then run it through something that cleans it up, JTidy or TagSoup or something like that.
    If the XML document isn't the only text in an element then you'll have to do some string hacking to get rid of the other text.
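The two numbered steps above can be sketched with the JDK's DOM parser as follows. This assumes both "hopeful" conditions hold: the outer document is well-formed XHTML, and the escaped XML is the only text inside the head element. The class name, method names and the sample order document are invented for the illustration.

```java
import java.io.StringReader;
import javax.xml.parsers.DocumentBuilderFactory;
import org.w3c.dom.Document;
import org.xml.sax.InputSource;

// Hypothetical helper; the names are illustrative only.
class EmbeddedXml {
    static Document parse(String xml) {
        try {
            return DocumentBuilderFactory.newInstance()
                    .newDocumentBuilder()
                    .parse(new InputSource(new StringReader(xml)));
        } catch (Exception e) {
            throw new IllegalArgumentException("Could not parse XML", e);
        }
    }

    static Document extract(String xhtml) {
        // Step 1: parse the outer document; the parser unescapes &lt; and &gt;,
        // so the text of <head> is the embedded XML in plain form.
        Document outer = parse(xhtml);
        String inner = outer.getElementsByTagName("head").item(0).getTextContent().trim();
        // Step 2: feed that text into a second parse.
        return parse(inner);
    }

    public static void main(String[] args) {
        String page = "<html><head> &lt;order id=\"7\"&gt;&lt;item&gt;tea&lt;/item&gt;&lt;/order&gt; "
                + "</head><body/></html>";
        Document order = extract(page);
        System.out.println(order.getDocumentElement().getAttribute("id"));
    }
}
```

The `trim()` matters because the second parse would otherwise reject leading whitespace if the embedded document starts with an XML declaration.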
