Replace special characters in XML with their HTML equivalents

Hello all,
I wrote a small XML processor that parses a given XML document and retrieves the necessary values. I used XPath with DocumentBuilder to implement it.
The problem I faced initially was that I could not parse the document when a value contained the special character '&'. For example:
<description>a & b</description>
I did some analysis and found that the '&' should be replaced with its corresponding HTML equivalent, &amp;. That solved the problem, and I was able to process the XML document as expected. I applied the replaceAll() method to the entire document.
Then I thought there might be other special characters that could cause the same error in the future. I found that '<' also causes the same kind of error. For example:
<description>a < b</description>
Here I could not use replaceAll(), because even the '<' in the XML element tags was replaced, so I was not able to parse the XML document.
Has anyone faced this kind of issue? Can anyone help me get rid of it?
Thanks
kms

That's the thing about XML: it has to be correct, or it doesn't pass the gate. This is nothing unusual; I have seen it many times before. It becomes funny when you start to run into character set mismatches.
In this case the XML data should either have been escaped already, or the data should have been wrapped in CDATA sections, as described here:
http://www.w3schools.com/xml/xml_cdata.asp
Manually "fixing" files is not what you want to be doing. The file has to be correct according to the very simple yet strict XML specifications, or it isn't accepted.
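
Both options from the reply above can be sketched in Java. This is only an illustration, assuming you control the code that writes the values into the document; the class and method names (XmlTextEscaper, escapeText, rootText) are made up for this example. The key point is that escaping is applied to each value before it goes between the tags, never to the finished document.

```java
import java.io.StringReader;
import javax.xml.parsers.DocumentBuilderFactory;
import org.w3c.dom.Document;
import org.xml.sax.InputSource;

// Hypothetical helper, not part of any library.
final class XmlTextEscaper {

    /** Escapes the characters that must not appear literally in element text. */
    public static String escapeText(String raw) {
        StringBuilder sb = new StringBuilder(raw.length() + 16);
        for (int i = 0; i < raw.length(); i++) {
            char c = raw.charAt(i);
            switch (c) {
                case '&': sb.append("&amp;"); break;
                case '<': sb.append("&lt;"); break;
                case '>': sb.append("&gt;"); break;
                default:  sb.append(c);
            }
        }
        return sb.toString();
    }

    /** Parses a small document and returns the text of its root element. */
    public static String rootText(String xml) {
        try {
            Document doc = DocumentBuilderFactory.newInstance()
                    .newDocumentBuilder()
                    .parse(new InputSource(new StringReader(xml)));
            return doc.getDocumentElement().getTextContent();
        } catch (Exception e) {
            throw new IllegalArgumentException("document is not well-formed", e);
        }
    }

    public static void main(String[] args) {
        String value = "a & b < c";
        // Option 1: escape the value only, then wrap it in the tags.
        String escaped = "<description>" + escapeText(value) + "</description>";
        // Option 2: wrap the raw value in a CDATA section.
        String cdata = "<description><![CDATA[" + value + "]]></description>";
        System.out.println(rootText(escaped)); // a & b < c
        System.out.println(rootText(cdata));   // a & b < c
    }
}
```

Note that the CDATA variant only works as long as the value itself never contains the sequence ]]>.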

Similar Messages

  • Replacing special characters from xml document/text inside element

    Hi,
    Is there any way to replace the XML character codes with the special characters they stand for, either across an entire XML document or in the text of a single element?
    I want every XML code replaced with its respective special character (for any special character).
    Please see the sample XML element below:
    <Details>Advance is applicable only for &lt; 1000. This is mentioned in Seller&apos;s document</Details>
    Thanks in advance. Any help will be highly appreciated.

    So, just to be sure I understand correctly, you want this :
    <Details>Advance is applicable only for &lt; 1000. This is mentioned in Seller&apos;s document</Details>
    to be converted to :
    <Details>Advance is applicable only for < 1000. This is mentioned in Seller's document</Details>
    If so, I'll say again : the resulting XML document will be invalid.
    Extensible Markup Language (XML) 1.0 (Fifth Edition)
    The ampersand character (&) and the left angle bracket (<) MUST NOT appear in their literal form, except when used as markup delimiters, or within a comment, a processing instruction, or a CDATA section. If they are needed elsewhere, they MUST be escaped using either numeric character references or the strings "&amp;" and "&lt;" respectively. The right angle bracket (>) may be represented using the string "&gt;", and MUST, for compatibility, be escaped using either "&gt;" or a character reference when it appears in the string "]]>" in content, when that string is not marking the end of a CDATA section.
    Ask whoever "they" are why they would want to work with XML that is not well-formed.
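
If the goal is only to get at the literal text, the document does not need to change at all: a conforming parser resolves the references when the value is read. A minimal Java sketch, where DetailsReader and readDetails are hypothetical names used only for this illustration:

```java
import java.io.StringReader;
import javax.xml.parsers.DocumentBuilderFactory;
import org.w3c.dom.Document;
import org.xml.sax.InputSource;

// Hypothetical helper, not part of any library.
final class DetailsReader {

    /** Returns the text of the first <Details> element with references resolved. */
    public static String readDetails(String xml) {
        try {
            Document doc = DocumentBuilderFactory.newInstance()
                    .newDocumentBuilder()
                    .parse(new InputSource(new StringReader(xml)));
            return doc.getElementsByTagName("Details").item(0).getTextContent();
        } catch (Exception e) {
            throw new IllegalArgumentException("document is not well-formed", e);
        }
    }

    public static void main(String[] args) {
        String xml = "<Details>Advance is applicable only for &lt; 1000. "
                + "This is mentioned in Seller&apos;s document</Details>";
        // Prints: Advance is applicable only for < 1000. This is mentioned in Seller's document
        System.out.println(readDetails(xml));
    }
}
```

The stored document stays well-formed, and the consumer still sees < and ' in the value.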

  • Vbscript to rename files and replace special characters

    Dear Expert,
    Would you please help me add a further requirement to the script below: replacing special characters in file names while renaming?
    With the script below I can rename the files.
    strAnswer = InputBox("Please enter folder location to rename files:", _
        "File rename")
        strfilenm = InputBox("Enter name:", _
        "Rename Files")
    Set FSO = CreateObject("Scripting.FileSystemObject")
    Sub visitFolder(folderVar)
        For Each fileToRename In folderVar.Files
            fileToRename.Name = strfilenm & fileToRename.Name
        Next
        For Each folderToVisit In folderVar.SubFolders
            visitFolder(folderToVisit)
        Next
    End Sub
    If FSO.FolderExists(strAnswer) Then
        visitFolder(FSO.getFolder(strAnswer))
    End If

    Thanks. Would you please look at the script below and tell me what is wrong? When I run it, nothing happens and there is no error.
    strAnswer = InputBox("Please enter folder location to rename files:", _
        "Test")
        strfilenm = InputBox("Enter name:", _
        "Rename Files")
    Set FSO = CreateObject("Scripting.FileSystemObject")
    Set regEx = New RegExp
    'Your pattern here
    Select Case tmpChar
    Case "&"
    changeTo = " and "
    Case "/"
    changeTo = "_"
    Case Else
    changeTo = " "
    End Select
    regEx.Pattern = tmpChar 
    Sub visitFolder(folderVar)
        For Each fileToRename In folderVar.Files
            fileToRename.Name = strfilenm & fileToRename.Name 
            fileToRename.Name = regEx.Replace(fileToRename.Name, tmpChar)
        Next
        For Each folderToVisit In folderVar.SubFolders
            visitFolder(folderToVisit)
        Next
    End Sub
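
As posted, the second script defines visitFolder but never calls it (the FolderExists block from the first script is missing), and tmpChar is never assigned before the Select Case runs, which would explain why nothing happens. The character mapping itself can be sketched as a per-character function. The sketch below is in Java rather than VBScript; FileNameSanitizer and sanitize are hypothetical names, and the set of characters that are kept unchanged is an assumption, not something the thread specifies.

```java
// Hypothetical helper, not part of any library.
final class FileNameSanitizer {

    /** Applies the mapping from the post: "&" becomes " and ", "/" becomes "_". */
    public static String sanitize(String name) {
        StringBuilder sb = new StringBuilder(name.length() + 8);
        for (char c : name.toCharArray()) {
            if (c == '&') {
                sb.append(" and ");
            } else if (c == '/') {
                sb.append('_');
            } else if (Character.isLetterOrDigit(c)
                    || c == '.' || c == '-' || c == '_' || c == ' ') {
                sb.append(c);   // assumption: these characters are safe to keep
            } else {
                sb.append(' '); // the "Case Else" branch: any other character becomes a space
            }
        }
        return sb.toString();
    }

    public static void main(String[] args) {
        System.out.println(sanitize("R&D/2011#1.txt")); // R and D_2011 1.txt
    }
}
```

The decision has to be made once per character of each file name, inside the loop over the files, rather than once at the top of the script.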

  • How to remove special characters in xml

    Dear friends,
    How can I remove special characters from XML? I am placing the XML file and fetching it through the file adapter.
    The problem is that when there is any special character in the XML, I am not able to pass it to the target system smoothly.
    The customer is asking us to schedule the file adapter; in order to do that, the source XML should not have any special characters.
    How can I achieve this, friends?
    Thanks in advance.
    Take care

    Hi Karthik,
    Go through the following links on how to handle special characters:
    https://www.sdn.sap.com/irj/scn/weblogs?blog=/pub/wlg/9420 [original link is broken]
    https://www.sdn.sap.com/irj/sdn/go/portal/prtroot/docs/library/uuid/502991a2-45d9-2910-d99f-8aba5d79fb42
    Restricting special characters in XML within XI..
    Regards
    Goli Sridhar

  • How to replace special characters in string.

    Hello,
    I want to replace special characters such as , or ; with another character or " ". I found that there is no such function in Java for this. There is only replace(), but it accepts only chars. Does anybody know how to do this?
    Thanks,

    Can't you just do the following?
    public class Test {
         public static void main(String[] args) {
             String testString = "Hi, there?";
             // String.replace(char, char) swaps every occurrence of one character
             System.out.println(testString.replace(',', ' '));
         }
    }
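
To handle several characters in one pass, String.replaceAll accepts a regular expression, so a character class covers both , and ;. Also, since Java 5 there is an overload String.replace(CharSequence, CharSequence) that takes strings rather than chars. A small sketch, where PunctuationReplacer and replaceSeparators are names invented for this example:

```java
import java.util.regex.Matcher;

// Hypothetical helper, not part of any library.
final class PunctuationReplacer {

    /** Replaces every comma or semicolon in the input with the given replacement. */
    public static String replaceSeparators(String input, String replacement) {
        // [,;] is a regex character class matching a comma or a semicolon.
        // quoteReplacement keeps "$" and "\" in the replacement literal.
        return input.replaceAll("[,;]", Matcher.quoteReplacement(replacement));
    }

    public static void main(String[] args) {
        System.out.println(replaceSeparators("a,b;c", " ")); // a b c
        // The CharSequence overload of replace works on plain strings, no regex involved.
        System.out.println("a;b".replace(";", " or "));      // a or b
    }
}
```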

  • Special characters in XML barcode content

    Hello,
    I made a barcoded form with a custom script that creates a custom XML as barcode content.
    The decoding works well when the user writes plain text in the text fields, but whenever the user enters characters that are special in XML syntax, such as ", <, >, = and so on, the barcode content is decoded as:
    <barcode>
    <!CDATA[... true content ...]>
    </barcode>
    How can I handle this situation?
    Do I have to handle what the user writes, or do I have to change the decode activity?
    Thank you very much for your support!
    Fabio

    Steve,
    I have already set the decode operation's encoding to UTF-8. At the form level, because it is an Acrobat form, there is no option to choose the encoding as there is in LC Designer. In further tests, if I change the extractToXML output to XDP instead of XFDF, I receive data rather than &# sequences. It is strange; I don't understand why XDP and XFDF would give different encodings.
    Tim

  • Handle special characters in xml

    Hi,
      Our end users tend to copy description text from Word documents into the PDF form and submit it.
    If that text contains any special characters, they get carried into the extracted XML. In the next step, when I try to assign a task to a user with the template and this XML, managers are not able to open the form and an error is shown. When I assign the XML without special characters, it runs fine.
    Please assist on how to handle this.
    My expectation is that the user should be prompted in the form when he pastes any special characters, or they should be auto-corrected to null values. If that is not possible, we should at least be able to filter the XML and eliminate special characters before the form goes to the next stage.
    Appreciate your help.
    Thanks,
    Krishna

    At first, I would have gone this way:
    http://www.dvteclipse.com/documentation/svlinter/How_to_use_special_characters_in_XML.3F.html
    That is, I would have parsed the submitted text in a Validate event and changed any special characters to numeric character references.
    However, I found this:
    http://blog.mark-mclaren.info/2007/02/invalid-xml-characters-when-valid-utf8_5873.html
    which seems to state that not all UTF-8 characters are allowed in XML.
    In fact, those allowed are listed here:
    http://www.w3.org/TR/2000/REC-xml-20001006#NT-Char
    So I would still use a Validate event script, but based on the XML specification's character range, exactly as Mark McLaren did in Java.
    This lets you keep those special characters that are allowed. Your managers will thank you.
    Hope it helps.
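
The character range in the XML 1.0 specification (the Char production linked above) is short enough to check directly. The following is a sketch of such a filter in Java, not Mark McLaren's code; XmlCharFilter, isXmlChar and stripInvalid are names chosen for this illustration. It removes only the characters XML 1.0 forbids and leaves accented letters and other legitimate text alone.

```java
// Hypothetical helper, not part of any library.
final class XmlCharFilter {

    /** True if the code point matches the XML 1.0 "Char" production. */
    public static boolean isXmlChar(int cp) {
        return cp == 0x9 || cp == 0xA || cp == 0xD
                || (cp >= 0x20 && cp <= 0xD7FF)
                || (cp >= 0xE000 && cp <= 0xFFFD)
                || (cp >= 0x10000 && cp <= 0x10FFFF);
    }

    /** Returns the input with every code point outside the allowed range removed. */
    public static String stripInvalid(String s) {
        StringBuilder sb = new StringBuilder(s.length());
        s.codePoints()
                .filter(XmlCharFilter::isXmlChar)
                .forEach(sb::appendCodePoint);
        return sb.toString();
    }

    public static void main(String[] args) {
        // A NUL and a vertical tab, as might arrive in text pasted from a word processor.
        String pasted = "ok" + (char) 0x00 + (char) 0x0B + "done";
        System.out.println(stripInvalid(pasted)); // okdone
    }
}
```

Whether to drop such characters silently or to prompt the user is a design choice; the same isXmlChar test serves either way.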

  • Special characters in XML structure when prepared using String

    Hi,
       I am preparing an XML structure using a String and printing it in the server log. The issue is that I am seeing extra characters ([[ and ]]) that I am not printing.
    Please let me know how to get rid of them.
    Code Excerpt
            String xmlHeader = "<?xml version=\"1.0\" encoding=\"utf-8\" ?>";
            String lsb_xmlcon = xmlHeader;
            logger.info("ReqXMLString Process  1  --->" + lsb_xmlcon);
            lsb_xmlcon = lsb_xmlcon +("\n");
            logger.info("ReqXMLString Process  1.1  --->" + lsb_xmlcon);
            lsb_xmlcon = lsb_xmlcon +("<REQUEST>");
            lsb_xmlcon = lsb_xmlcon +("\n");
            logger.info("ReqXMLString Process  1.2  --->" + lsb_xmlcon);
    Log
    ReqXMLString Process  1  ---><?xml version="1.0" encoding="utf-8" ?>
    ReqXMLString Process  1.1  ---><?xml version="1.0" encoding="utf-8" ?>[[
    ReqXMLString Process  1.2  ---><?xml version="1.0" encoding="utf-8" ?>[[
    <REQUEST>
    Thanks,
    Message was edited by: 996913
    This issue is observed only when running the code on the server, not from JDev.
    When we append the additional tags without the newline character, "\n", no extra characters are added. We have also seen this issue in another case, where we used "Marshaller" to prepare the XML.
    After we set the property below to false, we got rid of the extra characters.
                            jaxbMarshaller.setProperty(Marshaller.JAXB_FORMATTED_OUTPUT, false);
    Apparently the insertion of a newline when the code runs on the server (WebLogic 10.3.6.0) is causing the issue.
    Please let me know if anyone has come across a similar scenario.
    Thanks,


  • Special Characters in XML Publisher PDF Output

    Hi,
    I'm printing "Long Text" in report output in every line based on tab.
    The report output contains special characters such as \n.
    I was using the code below to print on the next line. Any suggestions for removing \n?
    Below is what was happening:
    ===================
    RDF:
    =====
    lv_notes := replace(:CF_LONG_TEXT_DESC, chr(9), ' ') ;
    lv_notes1 := replace(lv_notes, chr(10), ' ') ;
    lv_notes2 := replace(lv_notes1, chr(13), ' ') ;
    XML
    ===
    <CF_LONG_TEXT_desc>
    Initial Billing Amount: $549,180.00 \n \n Computation: \n a) Estimated Number of Full-Time Students: 12,000 \n
    b) Estimated Number of Calendar Days: 113 \n
    c) Calendar Date (From - To): 1/18/2011 - 5/10/2011 \n
    d) Multiply by: $0.81 \n
    e) Estimated Total Costs: $1,098,360.00 \n
    f) Initial Billing Amount represents 50% of Estimated Total Costs. \n
    \n \n
    If you have questions about your invoice please contact Darud Akbar at (312) 681-2724.
    </CF_LONG_TEXT_desc>
    PDF Output
    ==========
    Initial Billing Amount: $549,180.00 \n \n Computation: \n a) Estimated Number of Full-Time Students: 12,000 \n
    b) Estimated Number of Calendar Days: 113 \n
    c) Calendar Date (From - To): 1/18/2011 - 5/10/2011 \n
    d) Multiply by: $0.81 \n
    e) Estimated Total Costs: $1,098,360.00 \n
    f) Initial Billing Amount represents 50% of Estimated Total Costs. \n
    \n \n
    If you have questions about your invoice please contact Darud Akbar at (312) 681-2724.
    Thanks.

    >
    Initial Value
    =======
    <?CF_LONG_TEXT_desc?>
    Changed to (Below gives me error)
    ========
    <?<xsl:value-of select="translate(CF_LONG_TEXT_desc,'\n','')"/>?>
    Changed to (Below doesn't fetch data)
    ========
    <xsl:value-of select="translate(CF_LONG_TEXT_desc,'\n','')"/>
    >
    must be in field as
    <xsl:value-of select="translate(CF_LONG_TEXT_desc,'\n','')"/>
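
One thing worth checking with the translate() approach: XPath string literals have no escape sequences, so '\n' is the two characters backslash and n, and translate() works per character. It would therefore delete every backslash and every letter n in the text, not the two-character sequence. The same applies to the chr(10) replacement in the RDF code, which matches a real line feed rather than the literal characters \n that appear in the XML. The sketch below demonstrates the difference in Java using the JDK's XPath engine; BackslashNDemo and its method names are invented for this illustration.

```java
import javax.xml.parsers.DocumentBuilderFactory;
import javax.xml.xpath.XPath;
import javax.xml.xpath.XPathFactory;
import org.w3c.dom.Document;

// Hypothetical demo class, not part of any library.
final class BackslashNDemo {

    /** Evaluates translate(text, '\n', '') with the JDK's XPath 1.0 engine. */
    public static String xpathTranslate(String text) {
        try {
            Document doc = DocumentBuilderFactory.newInstance()
                    .newDocumentBuilder().newDocument();
            doc.appendChild(doc.createElement("t")).setTextContent(text);
            XPath xpath = XPathFactory.newInstance().newXPath();
            // The Java literal "\\n" yields the two XPath characters \ and n.
            return xpath.evaluate("translate(/t, '\\n', '')", doc);
        } catch (Exception e) {
            throw new IllegalStateException(e);
        }
    }

    /** Removes only the literal two-character sequence backslash + n. */
    public static String removeLiteralBackslashN(String text) {
        return text.replace("\\n", "");
    }

    public static void main(String[] args) {
        String text = "Amount \\n Computation"; // contains a literal backslash and n
        System.out.println(xpathTranslate(text));          // Amout  Computatio
        System.out.println(removeLiteralBackslashN(text)); // Amount  Computation
    }
}
```

Removing the sequence where the text is produced, before it reaches the template, avoids the problem entirely.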

  • Special characters in XML document

    I have a Flash file saved as version 8 with the following script calling an xml file:
    //init TextArea component
    myText.html = true;
    myText.wordWrap = false;
    myText.multiline = false;
    myText.label.condenseWhite=true;
    //load in XML
    xml = new XML();
    xml.ignoreWhite = true;
    xml.onLoad = function(success){
        if(success){
            myText.text = xml;
        }
    };
    xml.load("titletext.xml");
    My xml file contains the following:
    <?xml version="1.0" encoding="iso-8859-1"?>
    <![CDATA[Smith's dog & cat are "crazy"]]>
    When posted online my flash file displays the encoding tag in the xml file.
    AND the apostrophe, ampersand and quote marks display as html code instead of the actual character.
    I can take the encoding tag out of the xml file but my characters still don't display correctly.
    My dynamic text field in flash (myText) does have special characters embedded, plus I have them entered manually in the field for 'include these characters'.
    Does anyone have suggestions for me?
    You can view this test file at http://wilddezign.com/preshomes_name2.html
    TIA

    Perhaps you need a slightly different approach to loading the XML. Instead of loading the entire XML file, what if you loaded only the child you were looking for? This is what I usually do:
    var xml:XML = new XML();
    xml.ignoreWhite = true;
    xml.load("some.xml");
    xml.onLoad = parse;
    function parse(success){
      if (success){
        root = xml.firstChild;
        _global.numberItems = root.attributes.items;
        itemNode = root.firstChild;
        var i:Number = 0;
        // walk the <item> nodes and show each description
        while(itemNode != null){
          myText.text = itemNode.attributes.description;
          itemNode = itemNode.nextSibling;
          i++;
        }
      } else {
        trace("XML Bad!!");
      }
    }
    And your XML would be structured like this:
    <?xml version="1.0" encoding="utf-8"?>
    <sample>
    <item description="This is the text that I want to appear on this MC!" />
    </sample>

  • Special characters in XML built using the DOM object

    I am using the DOM object to build XML, but I am having problems with special characters. I have an Element object that I create attributes in. Most special characters like &, ", or ' are translated for me (amp; quot;, acute; put a & in front of those; if I do it here the browser will translate them into the literal character), but I am having a problem with the é character. The DOM object does not translate it. If I do it myself to (eacute; same here, add a & in front) before adding the string to the attribute, then the DOM object translates it to (amp;eacute; add a & in front of the amp;, not in front of the eacute;). As you can see, the DOM object translates the & instead of recognizing that it is an HTML character entity. Can anyone give me a hand?

    I am building this XML in a servlet so, right, DOM does process XML (even though a valid HTML file can be loaded into a DOM object), but if you build XML using DOM and then write the XML out using PrintWriter and Transformer objects, this will cause the XML to print out in your browser. If you view source on this XML you will see that the DOM object has translated all special characters to their &xxxx; equivalents. For example: when the string (I know "cool" java) gets loaded into an attribute using the DOM object and is then written back out, it looks like (I know &xxx;cool&xxx; java) if you view the source in your browser. This is what it should do, but the DOM object does not change the é to "&xxxxx;". This servlet is acting as a gateway between a Java API and a Windows ASP. The ASP will call the servlet expecting to get XML back and load it directly into a DOM object. When the Windows DOM object gets the XML that I am returning from this servlet, it throws an "invalid character" exception because the é was not translated to &xxxx; like the other characters were. According to the book HTML 4 in 24 hours (and other references), eacute; or #233; are how you say "é" in HTML or XML. How do you say it?
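
For background: XML itself predefines only the five entities amp, lt, gt, quot and apos, so eacute is an HTML entity that an XML parser does not know, whereas the numeric reference #233 is valid in any XML document. One possible way to get numeric references without touching the strings by hand is to pass the raw character to the DOM and ask the Transformer for an ASCII-only encoding. The sketch below assumes the JDK's built-in serializer, which writes characters the target encoding cannot represent as numeric character references; AsciiXmlWriter, write and readName are names invented for this example.

```java
import java.io.ByteArrayInputStream;
import java.io.ByteArrayOutputStream;
import javax.xml.parsers.DocumentBuilder;
import javax.xml.parsers.DocumentBuilderFactory;
import javax.xml.transform.OutputKeys;
import javax.xml.transform.Transformer;
import javax.xml.transform.TransformerFactory;
import javax.xml.transform.dom.DOMSource;
import javax.xml.transform.stream.StreamResult;
import org.w3c.dom.Document;
import org.w3c.dom.Element;

// Hypothetical helper, not part of any library.
final class AsciiXmlWriter {

    /** Builds <item name="..."/> from raw text and serializes it as ASCII bytes. */
    public static byte[] write(String name) {
        try {
            DocumentBuilder builder =
                    DocumentBuilderFactory.newInstance().newDocumentBuilder();
            Document doc = builder.newDocument();
            Element item = doc.createElement("item");
            item.setAttribute("name", name); // raw text, no hand-written entities
            doc.appendChild(item);

            Transformer t = TransformerFactory.newInstance().newTransformer();
            // Characters outside US-ASCII cannot be written directly in this encoding.
            t.setOutputProperty(OutputKeys.ENCODING, "US-ASCII");
            ByteArrayOutputStream out = new ByteArrayOutputStream();
            t.transform(new DOMSource(doc), new StreamResult(out));
            return out.toByteArray();
        } catch (Exception e) {
            throw new IllegalStateException(e);
        }
    }

    /** Parses the bytes back and returns the value of the name attribute. */
    public static String readName(byte[] xml) {
        try {
            Document doc = DocumentBuilderFactory.newInstance()
                    .newDocumentBuilder()
                    .parse(new ByteArrayInputStream(xml));
            return doc.getDocumentElement().getAttribute("name");
        } catch (Exception e) {
            throw new IllegalStateException(e);
        }
    }

    public static void main(String[] args) {
        String original = "caf\u00e9 \"cool\" & co"; // \u00e9 is the e-acute character
        byte[] xml = write(original);
        System.out.println(readName(xml).equals(original)); // true
    }
}
```

The alternative is to keep the character as it is and make sure the encoding named in the XML declaration matches the bytes the servlet actually sends.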

  • Problems with special characters with XML/PDF printing

    Hi,
    Our setup:
    Apex Listener 2.0.5
    Oracle DB 11G
    Weblogic
    Apex 4.2.2
    Various recent major browsers
    We used this blog post so we could do some PDF printing with APEX Listener:
    http://marcsewtz.blogspot.be/2013/04/pdf-printing-with-oracle-application.html
    The problem is special characters. For example the "&" sign comes out as "%26amp;" when we export an XML.
    Could anyone provide some insights of what we can do to fix this?
    Regards,
    Joni

    This is a known bug.
    https://support.oracle.com/epmos/faces/BugDisplay?_afrLoop=957905848396285&id=18282188&_afrWindowMode=0&_adf.ctrl-state=168vq5zhn3_233
    The bug 18282188 has been fixed in Apex 4.2.5 version.
    Upgrade the Apex version to 4.2.5 when it is available.

  • Regd: Handling of special characters in XML

    Hi ALL,
    I am using a Java mapping to convert IDoc XML to a flat file using a SAX parser and then reading the whole content into a single field; the output of the Java mapping is the input for the graphical mapping.
    My problem is that there may be some special characters in the input IDoc. Whenever these special characters appear, my Java mapping is not able to parse the IDoc.
    Please let me know how I can handle these special characters.
    Thanks,
    hemanth.

    Hi ,
            XML reserves some characters, which are normally used for markup and entity references. To handle such situations, you can replace these characters with the corresponding entity references, which get substituted automatically while parsing the XML file.
    Refer:
    http://www.javacommerce.com/displaypage.jsp?name=saxparser3.sql&id=18232
    I hope it helps.
    Regards,
    Anurag Gargh
    Edited by: Anurag Gargh on Aug 11, 2009 3:41 PM

  • Replace special characters in FDF

    Hello,
    I have searched everywhere and cannot find the answer to this simple problem.  I can manipulate javascript in an HTML environment by using free scripts, but it's not so easy in a PDF form.  I am using Adobe Acrobat 9 Pro and I have a form that when filled out and submitted, generates and emails an FDF file.  I am using a server-side script called FormGenie to do this, and the script hates parentheses and some other special characters, they break the FDF file.  All I want to do is set up a document level Javascript that will check for and replace all instances of these special characters with something allowed like a hyphen- or just a space.
    I have tried:
    function clearText() {
         document.formname.fieldname.value=filterNum(document.formname.fieldname.value)
         function filterNum(str) {
              re = /\$|#|~|\%|\*|\^|\(|\)|\+|\=|\[|\]|\[|\}|\{|\;|\"|\<|\>|\?|\||\\|\!|\$/g;
              // remove special characters like "$" and "," etc...
              return str.replace(re, "-");
    and
    function clearText2(str) {
    stringName = stringName.replace(/\$|#|~|\%|\*|\^|\(|\)|\+|\=|\[|\]|\[|\}|\{|\;|\"|\<|\>|\?|\||\\|\!|\$ /g,'-');
    I know these aren't right, but I have to be at least barking up the right tree.
    Can someone help?
    Thank you

    OK, here is my first stab at putting together this  script from searching the API reference.
    To loop  through the text fields and get their values, I have:
    for  ( var i = 0; i < this.numFields; i++) {
    var fname = this.getNthFieldName(i);
    if ( fname.type = "text" ) f.value;   [Not sure if "f.value" is correct or legal here]
    If this is even getting close, where I get stuck  is how to get the original script to take those values and replace any unwanted characters found.
    if (!event.willCommit) {
        event.change = event.change.replace(/[\$\#\~\%\*\^\(\)\+\=\[\]\{\}\;\"\<\>\?\|\\\!]/g, "-");
    I just need to put this together correctly and I don't know how.
    I have the form submission part down, that was already done, I just need to add the custom "Run a javascript" item above it so that is executed first.
    Need a little guidance,
    Thanks

  • Special characters in xml

    Hello!
    I am trying to generate XML from the database using the fnd_file package. The trouble is that the application NLS_LANG is not set to UTF8, and it should stay that way because a lot of custom reports written in Oracle Reports might otherwise not display special characters correctly. So I have a problem with characters like 'ö', 'ļ'.... Is there a way to set the encoding for the XML output only? Also, in my latest attempts the output XML for some reason has a strange square-shaped character appended to the end of the file. Has anybody seen a similar problem and maybe solved it?
    Thanks
    Aleksei

    Refer to
    http://java.sun.com/j2ee/1.4/docs/tutorial/doc/JAXPSAX6.html
