Encoding Individual Strings with XML entities to replace "&", "<", etc.

We use templates to create XML files. The tags are in the templates and just the values get substituted in. Sometimes the variable value contains an ampersand or other XML-banned character.
1) Is there a simple method in the Java or Xerces class libraries that takes a string and returns one with the suspect characters replaced by XML entities?
2) Sometimes, though rarely (I have no control over this!) the variables arrive with the forbidden characters already replaced by entities. Is there a method which will tell me if the string already contains entities, or at least is already a valid XML string? Otherwise I'm afraid the ampersand in an already-converted entity name will itself be converted to another entity.

I'm not aware of any built-in mechanism for doing that kind of escaping short of a full XML generator. Here's a lightweight solution that replaces ampersands only if they're not part of one of the five predefined XML entities. Just grab a copy of Elliott Hughes' Rewriter and implement it like this:

public class Test
{
  /* Rewriter can be found here:
   * http://elliotth.blogspot.com/2004/07/java-implementation-of-rubys-gsub.html
   */
  public static void main(String... args) throws Exception
  {
    // Match < > ' " and any & that does not already start a predefined entity.
    Rewriter xmlEscaper = new Rewriter("[<>'\"]|&(?!(?:lt|gt|apos|quot|amp);)")
    {
      public String replacement()
      {
        char ch = group(0).charAt(0);
        switch (ch)
        {
          case '<'  : return "&lt;";
          case '>'  : return "&gt;";
          case '\'' : return "&apos;";
          case '"'  : return "&quot;";
          case '&'  : return "&amp;";
          default :
            throw new IllegalStateException("this can't happen");
        }
      }
    };
    String str = "this & that & the <other&gt;";
    // Assumes Rewriter exposes rewrite(CharSequence), as in the linked post.
    System.out.println(xmlEscaper.rewrite(str));
  }
}
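If pulling in Rewriter is not an option, the same idea can be sketched with only java.util.regex. This is a minimal sketch rather than a definitive implementation: the class name XmlEscaper and the method name escape are made up for illustration, and the pattern is the one from the answer above, so it only recognises the five predefined entities (a numeric reference such as &#38; would still have its ampersand escaped).

```java
import java.util.regex.Matcher;
import java.util.regex.Pattern;

// Hypothetical helper; not part of the JDK or Xerces.
public class XmlEscaper {
    // < > ' " or an ampersand that does not begin a predefined entity.
    private static final Pattern SUSPECT =
        Pattern.compile("[<>'\"]|&(?!(?:lt|gt|apos|quot|amp);)");

    public static String escape(String input) {
        Matcher m = SUSPECT.matcher(input);
        StringBuffer out = new StringBuffer();
        while (m.find()) {
            String replacement;
            switch (m.group().charAt(0)) {
                case '<':  replacement = "&lt;";   break;
                case '>':  replacement = "&gt;";   break;
                case '\'': replacement = "&apos;"; break;
                case '"':  replacement = "&quot;"; break;
                case '&':  replacement = "&amp;";  break;
                default:
                    throw new IllegalStateException("unexpected match: " + m.group());
            }
            // quoteReplacement guards against '$' and '\' being treated specially.
            m.appendReplacement(out, Matcher.quoteReplacement(replacement));
        }
        m.appendTail(out);
        return out.toString();
    }

    public static void main(String[] args) {
        System.out.println(escape("this & that & the <other&gt;"));
    }
}
```

Because already-escaped entities are skipped, running escape twice gives the same result as running it once. The trade-off is that a value whose literal text happens to be "&amp;" cannot be told apart from an escaped ampersand.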

Similar Messages

  • How to display string with XML content in 4.6?

    I'd like to know how to display a string with XML content in it in 4.6.
    4.6 has no parse_string method,
    so an example like this is not helpful:
      DATA: lo_mxml    TYPE REF TO cl_xml_document.
      CREATE OBJECT lo_mxml.
      CALL METHOD lo_mxml->parse_string
          stream = gv_xml_string.
      CALL METHOD lo_mxml->display.
    Thank you.

    Maybe you can use the function module SAP_CONVERT_TO_XML_FORMAT. But it has some issues with memory usage: the program consumed tons of memory during the conversion.

  • From String to Element with XML structure

    I have a String containing XML elements. Can I simply transform it into XML format?
    I need an Element which has the XML structure.
    My string looks like this:
    <payload xmlns="http://xmlns.oracle.com/bpel/workflow/task">
    <users xmlns="http://www.google.com" xmlns:ns1="http://www.google.com">
    Edited by: Tony_Fabrizzio on Feb 11, 2009 5:57 AM

    Tony_Fabrizzio wrote:
    I have String with XML objects.
    No you don't. Get your terminology straight.
    Can i simple this transform in to XML format?
    I need Element which has XML structure.
    my string looks like that is:
    <payload xmlns="http://xmlns.oracle.com/bpel/workflow/task">
    <users xmlns="http://www.google.com" xmlns:ns1="http://www.google.com">
    </payload>
    The short answer is "yes". The long answer is "what are you planning on doing with the result, and 'process it' or 'parse it' isn't an answer".
    But really, you need to do some research.
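
    For reference, the "yes" can be sketched with the JAXP DOM parser that ships with the JDK. This is only a sketch under assumptions: the class name StringToElement and the method name parse are hypothetical, and the input must be well-formed (every element closed), which the string quoted above is not as posted.

```java
import java.io.StringReader;
import javax.xml.parsers.DocumentBuilder;
import javax.xml.parsers.DocumentBuilderFactory;
import org.w3c.dom.Element;
import org.xml.sax.InputSource;

// Hypothetical helper: turns a well-formed XML string into a DOM Element.
public class StringToElement {
    public static Element parse(String xml) {
        try {
            DocumentBuilderFactory factory = DocumentBuilderFactory.newInstance();
            factory.setNamespaceAware(true); // keep the xmlns declarations meaningful
            DocumentBuilder builder = factory.newDocumentBuilder();
            return builder.parse(new InputSource(new StringReader(xml)))
                          .getDocumentElement();
        } catch (Exception e) {
            // Covers parser configuration, I/O and well-formedness errors.
            throw new IllegalArgumentException("could not parse XML string", e);
        }
    }

    public static void main(String[] args) {
        Element payload = parse(
            "<payload xmlns=\"http://xmlns.oracle.com/bpel/workflow/task\">"
            + "<users xmlns=\"http://www.google.com\"/>"
            + "</payload>");
        System.out.println(payload.getLocalName());
    }
}
```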

  • How to convert Java string into XML one?

    With SAX I can parse an XML file, but I have to create the XML file by hand.
    OK, that's simple, but how do I encode a Java string as an XML constant,
    e.g. "Hello & goodby" into "Hello &amp; goodby"?
    Is there a standard method for such special XML characters?

    If you are creating your XML "by hand" then just make sure your hands know that you have to do that. It isn't difficult to write a Java method to do it, if "by hand" means "in Java code". Otherwise your XML is not well-formed. And as far as I know there is no package that takes ill-formed XML and fixes it up.
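
    If writing the escaping by hand is not a requirement, one alternative is to let a standard writer do it. The sketch below uses the StAX XMLStreamWriter from the JDK, whose writeCharacters call escapes markup characters in text content. The class name StaxEscapeDemo and the method name element are made up for illustration.

```java
import java.io.StringWriter;
import javax.xml.stream.XMLOutputFactory;
import javax.xml.stream.XMLStreamException;
import javax.xml.stream.XMLStreamWriter;

// Hypothetical demo: builds one element and lets StAX escape the text.
public class StaxEscapeDemo {
    public static String element(String name, String text) {
        StringWriter out = new StringWriter();
        try {
            XMLStreamWriter writer =
                XMLOutputFactory.newInstance().createXMLStreamWriter(out);
            writer.writeStartElement(name);
            writer.writeCharacters(text); // '&' and '<' are escaped here
            writer.writeEndElement();
            writer.close();
        } catch (XMLStreamException e) {
            throw new IllegalStateException(e);
        }
        return out.toString();
    }

    public static void main(String[] args) {
        System.out.println(element("msg", "Hello & goodby"));
    }
}
```

    Note that this escapes unconditionally: text that already contains "&amp;" would be escaped a second time, so it suits values that are known to be raw.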

  • Comparing a String with array elements

    I need some help comparing an individual String with each item of a String array.
    So far I have:
    StringTokenizer drinksOrder = new StringTokenizer(orderLine, "\r");
    while (drinksOrder.hasMoreTokens())
    {
      for (int index = 0; index < NUMBER_OF_DRINKS; index++)
      {
        barcode = drinksOrder.nextToken();
        if (barcode.equals(barcodes[index]))
        {
          name = drinks[index];
          price = prices[index];
          status = drinkStatus[index];
          orderedDrinkBC[j] = barcode;
          orderedDrinkName[j] = name;
          orderedDrinkPrice[j] = price;
          orderedDrinkStatus[j] = status;
        }
      }
    }
    At the moment, no Strings are being put in the orderedDrinkBC, orderedDrinkName, orderedDrinkPrice
    and orderedDrinkStatus arrays. The code seems to be failing on the 'barcode.equals(barcodes[index])'
    part.
    Any help would be appreciated.

    Do this instead (next time post your code in lower case code tags):
    StringTokenizer drinksOrder = new StringTokenizer(orderLine, "\r");
    while (drinksOrder.hasMoreTokens())
    {
      // Read each token once, then compare it against every known barcode.
      barcode = drinksOrder.nextToken();
      for (int index = 0; index < NUMBER_OF_DRINKS; index++)
      {
        if (barcode.equals(barcodes[index]))
        {
          name = drinks[index];
          price = prices[index];
          status = drinkStatus[index];
          // j is not advanced in the posted code; increment it elsewhere
          // if each match should fill a new slot.
          orderedDrinkBC[j] = barcode;
          orderedDrinkName[j] = name;
          orderedDrinkPrice[j] = price;
          orderedDrinkStatus[j] = status;
        }
      }
    }
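
    The reason the first version fails is that nextToken() sits inside the for loop, so every index consumes a fresh token and each token is compared against only one barcode; the loop can also run out of tokens partway through. A self-contained sketch of the corrected shape follows. The class name BarcodeLookup, the method name matchNames and the sample data are made up for illustration, and a List stands in for the parallel arrays and the j counter.

```java
import java.util.ArrayList;
import java.util.List;
import java.util.StringTokenizer;

// Hypothetical, simplified version of the lookup discussed above.
public class BarcodeLookup {
    static List<String> matchNames(String orderLine, String[] barcodes, String[] drinks) {
        List<String> ordered = new ArrayList<>();
        StringTokenizer tokens = new StringTokenizer(orderLine, "\r");
        while (tokens.hasMoreTokens()) {
            String barcode = tokens.nextToken(); // one token per outer iteration
            for (int i = 0; i < barcodes.length; i++) {
                if (barcode.equals(barcodes[i])) {
                    ordered.add(drinks[i]);
                    break; // stop at the first matching barcode
                }
            }
        }
        return ordered;
    }

    public static void main(String[] args) {
        String[] barcodes = {"111", "222"};
        String[] drinks = {"Cola", "Tea"};
        System.out.println(matchNames("222\r111\r999", barcodes, drinks));
    }
}
```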

  • Defining entities with XML schema

    I came across a problem with XML schema's apparent lack of support for entities. e.g. we define an entity like this:
    <!ENTITY le "&#x2264;" >.
    First off I was trying in my XML instance to include an external entity file like this:
    <!DOCTYPE wpi [
    <!ENTITY % DerwentXmlEntities PUBLIC "-//Derwent//Derwent XML General ENTITIES 20000214//EN"
    <WPI xmlns:xsi="http://www.w3.org/2001/XMLSchema-instance"
    but I am not clear on where I should place the file derwent-xml-entities.ent. Everything I tried ended up with:
    ORA-31011: XML parsing failed
    ORA-19202: Error occurred in XML processing
    LPX-00202: could not open file "derwent-xml-entities.ent"
    Is it possible to include a file like this?
    So I tried just defining an entity like this:
    <?xml version="1.0" encoding="UTF-8" ?>
    <!DOCTYPE wpi [
    <!ENTITY times "&#x00D7;">
    <!ENTITY le "&#x2264;">
    The "times" seems OK, but for "le" I get:
    ORA-31011: XML parsing failed
    ORA-19202: Error occurred in XML processing
    LPX-00217: invalid character 8804 (\u2264)
    What sort of encoding should I use?

    <xsd:element name="PsL">
    **Pseudo List, required as in D-types, the editors often used the
    {H control sequence to force an indent to make lists of compounds more
                    legible.  We can't be sure that they are always lists but there is
                    useful mark-up there that we wouldn't want to lose**
                    <xsd:element ref="PsE" minOccurs="1" maxOccurs="unbounded"/>
        <xsd:element name="PsE">
                Pseudo List Entry
                <xsd:sequence minOccurs="1" maxOccurs="1">
                    <xsd:element ref="PsP" minOccurs="1" maxOccurs="1"/>
                    <xsd:element ref="PsS" minOccurs="1" maxOccurs="1"/>
        <xsd:element name="PsP">
                Pseudo List Prefix
                <xsd:choice minOccurs="0" maxOccurs="unbounded">
                    <xsd:element name="PsP" type="xsd:string"/>
                    <xsd:element ref="Sub" minOccurs="1" maxOccurs="1"/>
                    <xsd:element ref="Sup" minOccurs="1" maxOccurs="1"/>
                    <xsd:element ref="Em" minOccurs="1" maxOccurs="1"/>
        <xsd:element name="PsS">
                Pseudo List Suffix
                <xsd:choice minOccurs="0" maxOccurs="unbounded">
                    <xsd:element name="PsS" type="xsd:string"/>
                    <xsd:element ref="Sub" minOccurs="1" maxOccurs="1"/>
                    <xsd:element ref="Sup" minOccurs="1" maxOccurs="1"/>
                    <xsd:element ref="Em" minOccurs="1" maxOccurs="1"/>
        <xsd:element name="Prt">
                ************************ Part Container Element **************
                <xsd:sequence minOccurs="1" maxOccurs="1">
                    <xsd:element ref="PrtNo" minOccurs="1" maxOccurs="1"/>
                    <xsd:element ref="PrtNm" minOccurs="1" maxOccurs="1"/>
        <xsd:element name="PrtNo">
                <xsd:choice minOccurs="0" maxOccurs="unbounded">
                    <xsd:element name="PrtNo" type="xsd:string"/>
                    <xsd:element ref="Sub" minOccurs="1" maxOccurs="1"/>
                    <xsd:element ref="Sup" minOccurs="1" maxOccurs="1"/>
                    <xsd:element ref="Em" minOccurs="1" maxOccurs="1"/>
        <xsd:element name="PrtNm">
                <xsd:choice minOccurs="0" maxOccurs="unbounded">
                    <xsd:element name="PrtNm" type="xsd:string"/>
                    <xsd:element ref="Sub" minOccurs="1" maxOccurs="1"/>
                    <xsd:element ref="Sup" minOccurs="1" maxOccurs="1"/>
                    <xsd:element ref="Em" minOccurs="1" maxOccurs="1"/>
        <xsd:element name="L">
                ************************ List  Container Element *************
            Attribute typ=used by the composition software to decide how to prefix
                                    the list entries, e.g. (a), (A), (i), (1), etc.
                                    AL for AlphaLower
                                    AU for AlphaUpper
                                    RL for RomanLower
                                    RU for RomanUpper
                                    No for Number
                <xsd:sequence minOccurs="1" maxOccurs="unbounded">
                    <xsd:element ref="LP" minOccurs="1" maxOccurs="1"/>
                    <xsd:element ref="LS" minOccurs="0" maxOccurs="1"/>
                <xsd:attribute name="typ" use="optional">
                        <xsd:restriction base="xsd:string">
                            <xsd:enumeration value="AL"/>
                            <xsd:enumeration value="AU"/>
                            <xsd:enumeration value="RL"/>
                            <xsd:enumeration value="RU"/>
                            <xsd:enumeration value="No"/>
        <xsd:element name="LP">
                List Paragraph
                <xsd:choice minOccurs="0" maxOccurs="unbounded">
                    <xsd:element name="LP" type="xsd:string"/>
                    <xsd:element ref="Sub" minOccurs="1" maxOccurs="1"/>
                    <xsd:element ref="Sup" minOccurs="1" maxOccurs="1"/>
                    <xsd:element ref="Em" minOccurs="1" maxOccurs="1"/>
                    <xsd:element ref="RefD" minOccurs="1" maxOccurs="1"/>
                    <xsd:element ref="RefF" minOccurs="1" maxOccurs="1"/>
        <xsd:element name="LS">
                1st level sublist
            Attribute typ=used by the composition software to decide how to prefix
                                    the list entries, e.g. (a), (A), (i), (1), etc.
                                    AL for AlphaLower
                                    AU for AlphaUpper
                                    RL for RomanLower
                                    RU for RomanUpper
                                    No for Number
                <xsd:sequence minOccurs="1" maxOccurs="unbounded">
                    <xsd:element ref="LP" minOccurs="1" maxOccurs="1"/>
                    <xsd:element ref="LSS" minOccurs="0" maxOccurs="1"/>
                <xsd:attribute name="typ" use="optional">
                        <xsd:restriction base="xsd:string">
                            <xsd:enumeration value="AL"/>
                            <xsd:enumeration value="AU"/>
                           <xsd:enumeration value="RL"/>
                            <xsd:enumeration value="RU"/>
                            <xsd:enumeration value="No"/>
        <xsd:element name="LSS">
                2nd level list
            Attribute typ=used by the composition software to decide how to prefix
                                    the list entries, e.g. (a), (A), (i), (1), etc.
                                    AL for AlphaLower
                                    AU for AlphaUpper
                                    RL for RomanLower
                                    RU for RomanUpper
                                    No for Number
                    <xsd:element ref="LP" minOccurs="1" maxOccurs="unbounded"/>
                <xsd:attribute name="typ" use="optional">
                        <xsd:restriction base="xsd:string">
                            <xsd:enumeration value="AL"/>
                            <xsd:enumeration value="AU"/>
                            <xsd:enumeration value="RL"/>
                            <xsd:enumeration value="RU"/>
                            <xsd:enumeration value="No"/>
        <xsd:element name="IndexingCorePt">
                ******** ROOT ELEMENT for Core Patent Indexing   *********
                ** Indexing is composed of Derwent Classification (DC),IPCs,
                    Fragmentation Coding (Frags), Polymer Indexing, Unlinked Registry numbers
                    and Keyword Indexing (KI).
                    Any combination of these three elements is allowed.
                Attribute vs        =Version number starting at 0
                    Attribute co    =Patent country
                    Attribute se    =Patent serial
                    Attribute ki    =Patent Kind
                <xsd:sequence minOccurs="1" maxOccurs="1">
                    <xsd:element ref="DC" minOccurs="0" maxOccurs="1"/>
                    <xsd:element ref="IPCs" minOccurs="0" maxOccurs="1"/>
                    <xsd:element ref="Frag" minOccurs="0" maxOccurs="1"/>
                    <xsd:element ref="Polymer" minOccurs="0" maxOccurs="1"/>
                    <xsd:element ref="IdxU" minOccurs="0" maxOccurs="1"/>
                    <xsd:element ref="KI" minOccurs="0" maxOccurs="1"/>
                <xsd:attribute name="vs" type ="xsd:NMTOKEN" use="optional"/>
                <xsd:attribute name="co" type ="xsd:NMTOKEN" use="required"/>
                <xsd:attribute name="se" type ="xsd:NMTOKEN" use="required"/>
                <xsd:attribute name="ki" type ="xsd:NMTOKEN" use="required"/>
        <xsd:element name="DC">
                ************* DERWENT CLASSIFICATION *******************
                ** Derwent Classification is divided in 3 main areas:
                    Chemical(CPI),General &amp; Mechanical (EngPI),
                    Electronic &amp; Electrical (EPI)
                <xsd:choice minOccurs="1" maxOccurs="1">
                    <xsd:sequence minOccurs="1" maxOccurs="1">
                        <xsd:element ref="CPIs" minOccurs="1" maxOccurs="1"/>
                        <xsd:element ref="EngPIs" minOccurs="0" maxOccurs="1"/>
                        <xsd:element ref="EPIs" minOccurs="0" maxOccurs="1"/>
                    <xsd:sequence minOccurs="1" maxOccurs="1">
                        <xsd:element ref="EngPIs" minOccurs="1" maxOccurs="1"/>
                        <xsd:element ref="EPIs" minOccurs="0" maxOccurs="1"/>
                    <xsd:element ref="EPIs" minOccurs="1" maxOccurs="1"/>
        <xsd:element name="CPIs">
                UNIQUE ELEMENT DC
                ** Container element for CPI**
                    <xsd:element ref="CPI" minOccurs="1" maxOccurs="unbounded"/>
        <xsd:element name="CPI">
                UNIQUE ELEMENT CPIs
                ** CPI is composed of sections A to N (no section I)
                    Any combination is allowed, there has to be at least one section.
                    No section can appear twice. The DTD does not enforce this.
                    Section N can't be on his own. The DTD does not enforce this.
                    Each single Chemical section is composed of one main class (DCCM)
                    and zero to many secondary classes (DCCSs). One to many Manual
                    code (MCCs) has to be applied to a given chemical section.
                    Attribute section= an allowed Derwent CPI section    **
                Note that content MCCs is optional as it's optional in the backfile
                    before 19??
                <xsd:choice minOccurs="1" maxOccurs="1">
                    <xsd:sequence minOccurs="1" maxOccurs="1">
                        <xsd:element ref="DCCM" minOccurs="1" maxOccurs="1"/>
                        <xsd:element ref="DCCSs" minOccurs="0" maxOccurs="1"/>
                        <xsd:element ref="MCCs" minOccurs="0" maxOccurs="1"/>
                    <xsd:element ref="MCCs" minOccurs="1" maxOccurs="1"/>
                <xsd:attribute name="sct" use="required">
                        <xsd:restriction base="xsd:string">
                            <xsd:enumeration value="A"/>
                            <xsd:enumeration value="B"/>
                            <xsd:enumeration value="C"/>
                            <xsd:enumeration value="D"/>
                            <xsd:enumeration value="E"/>
                            <xsd:enumeration value="F"/>
                            <xsd:enumeration value="G"/>
                            <xsd:enumeration value="H"/>
                            <xsd:enumeration value="J"/>
                            <xsd:enumeration value="K"/>
                            <xsd:enumeration value="L"/>
                            <xsd:enumeration value="M"/>
                            <xsd:enumeration value="N"/>
        <xsd:element name="DCCM" type="xsd:string">
                ** Derwent Class Chemical Main element **
        <xsd:element name="DCCSs">
                ** Derwent Classes Chemical Secondary Container element**
                    <xsd:element ref="DCCS" minOccurs="1" maxOccurs="unbounded"/>
    <xsd:element name="DCCS" type="xsd:string">
                ** A Derwent Class Chemical Secondary (DCCS) **
        <xsd:element name="MCCs">
                ** Manual Codes CPI Container Element **
                    <xsd:element ref="MCC" minOccurs="1" maxOccurs="unbounded"/>
        <xsd:element name="MCC" type="xsd:string">
                ** A Manual Code CPI **
        <xsd:element name="EngPIs">
                **General &amp; Mechanical Sections Container element**
                    <xsd:element ref="EngPI" minOccurs="1" maxOccurs="unbounded"/>
        <xsd:element name="EngPI">
                UNIQUE ELEMENT EngPIs
                ** General &amp; Mechanical Sections EngPI is composed of sections
                    P and Q. Any combination is allowed, with at least one section.
                    There are no manual codes for these sections.
                    Each single EngPI section is composed of
                    one or more Derwent Class Engineering (DCEng) elements.
                    No section can appear twice. The DTD does not enforce this.
                    Attribute section= an allowed Derwent EngPI section         **
                    <xsd:element ref="DCEngs" minOccurs="1" maxOccurs="1"/>
                <xsd:attribute name="sct" use="required">
                        <xsd:restriction base="xsd:string">
                            <xsd:enumeration value="P"/>
                            <xsd:enumeration value="Q"/>
        <xsd:element name="DCEngs">
                ** Derwent Class Engineering Container element **
                    <xsd:element ref="DCEng" minOccurs="1" maxOccurs="unbounded"/>
    <xsd:element name="DCEng" type="xsd:string">
                ** Derwent Class Engineering  **
        <xsd:element name="EPIs">
                ** Electronic &amp; Electrical Sections EPI **
                Container element for EPI
                    <xsd:element ref="EPI" minOccurs="1" maxOccurs="unbounded"/>
        <xsd:element name="EPI">
                UNIQUE ELEMENT EPIs
                ** EPI is composed of sections R to X. Any combination is allowed,
                      with at least one section.
                     Section R can't be on his own. The DTD does not enforce this.
                     Each single is composed of one or more Derwent Class Electronic
                     and Electrical (DCE) elements.
                     No section can appear twice. The DTD does not enforce this. **
         Attribute sct = an allowed Derwent EPI section        
                    <xsd:element ref="EPIgp" minOccurs="1" maxOccurs="unbounded"/>
                <xsd:attribute name="sct" use="required">
                        <xsd:restriction base="xsd:string">
                            <xsd:enumeration value="R"/>
                            <xsd:enumeration value="S"/>
                            <xsd:enumeration value="T"/>
                            <xsd:enumeration value="U"/>
                            <xsd:enumeration value="V"/>
                            <xsd:enumeration value="W"/>
                            <xsd:enumeration value="X"/>
        <xsd:element name="EPIgp">
                ** A Container element to group the related DCE &amp; MCEs elements **
                Note that content MCEs is optional as it's optional in the backfile
                     before 19??
                <xsd:sequence minOccurs="1" maxOccurs="1">
                    <xsd:element ref="DCE" minOccurs="1" maxOccurs="1"/>
                    <xsd:element ref="MCEs" minOccurs="0" maxOccurs="1"/>
        <xsd:element name="DCE" type="xsd:string">
                ** Derwent Class Electronic &amp; Electrical (DCE) contains
                     Manual Codes Electronic &amp; Electrical (MCE)              **
                ** Derwent Class EPI **
        <xsd:element name="MCEs">
                ** Manual Codes EPI container element **
                    <xsd:element ref="MCE" minOccurs="1" maxOccurs="unbounded"/>
        <xsd:element name="MCE" type="xsd:string">
                ** A Manual Code EPI **
        <xsd:element name="IPCs">
                ********************IPCs Container element *************
                ** The following xlink construct is a link to the IPC codes on
                     the WIPO site, this could be added to the DTD to provide the link
                     without adding any thing to the individual instances
                     xlink:type   (locator)  #FIXED     "locator"
                     xlink:rl   NMTOKEN       #FIXED     "IPCs"
                     xlink:href CDATA #FIXED "http://classifications.wipo.int/fulltext/new_ipc/"
                     xlink:title  CDATA       #FIXED     "IPC Codes"            **
                    <xsd:element ref="IPC" minOccurs="1" maxOccurs="unbounded"/>
        <xsd:element name="IPC">
                UNIQUE ELEMENT IPCs
                ** An IPC Code **
                ** Attribute rnk= Derwent assigned character to indicate type
                     of IPC:
                          A= Main IPC
                          B= Other, unlinked IPCs
                          C to Y = Linked IPCs and Index Terms
                          Z= IPC Index Terms
                          -= Additional terms                       **
              <xsd:extension base="xsd:string">
                <xsd:attribute name="rnk" use="required">
                        <xsd:restriction base="xsd:string">
                            <xsd:enumeration value="A"/>
                            <xsd:enumeration value="B"/>
                            <xsd:enumeration value="C"/>
                            <xsd:enumeration value="D"/>
                            <xsd:enumeration value="E"/>
                            <xsd:enumeration value="F"/>
                            <xsd:enumeration value="G"/>
                            <xsd:enumeration value="H"/>
                            <xsd:enumeration value="I"/>
                            <xsd:enumeration value="J"/>
                            <xsd:enumeration value="K"/>
                            <xsd:enumeration value="L"/>
                            <xsd:enumeration value="M"/>
                            <xsd:enumeration value="N"/>
                            <xsd:enumeration value="O"/>
                            <xsd:enumeration value="P"/>
                            <xsd:enumeration value="Q"/>
                            <xsd:enumeration value="R"/>
                            <xsd:enumeration value="S"/>
                            <xsd:enumeration value="T"/>
                            <xsd:enumeration value="U"/>
                            <xsd:enumeration value="V"/>
                            <xsd:enumeration value="W"/>
                            <xsd:enumeration value="X"/>
                            <xsd:enumeration value="Y"/>
                            <xsd:enumeration value="Z"/>
                            <xsd:enumeration value="-"/>
        <xsd:element name="Frag">
                ***************BCE INDEXING OR FRAGMENTATION ************
                ** Fragmentation Container element**
                    <xsd:element ref="FragSub" minOccurs="1" maxOccurs="unbounded"/>
        <xsd:element name="FragSub">
                UNIQUE ELEMENT Frag
                ** Fragmentation Sub heading **
                ** Attribute sjct=The main BCE chemical subject categories
                     defined by Derwent as subsets. The subsets are designated by the
                     subheadings M0 through M6:
                          M0  Agricultural, pharmaceutical     1963-1969
                          M1  Agricultural, pharmaceutical natural products
                                         and polymers     1970 to present
                          M2  Agricultural, pharmaceutical         1970 to present
                          M3  General chemicals                    1970 to present
                          M4  Dyes                                 1970 to present
                          M5  Steroids                             1963 to present
                          M6  Galenicals                           1976 to present"**
                    <xsd:element ref="CardRec" minOccurs="1" maxOccurs="unbounded"/>
                <xsd:attribute name="sjct" use="required">
                        <xsd:restriction base="xsd:string">
                            <xsd:enumeration value="M0"/>
                            <xsd:enumeration value="M1"/>
                            <xsd:enumeration value="M2"/>
                            <xsd:enumeration value="M3"/>
                            <xsd:enumeration value="M4"/>
                            <xsd:enumeration value="M5"/>
                            <xsd:enumeration value="M6"/>
        <xsd:element name="CardRec">
                ** Card Record **
                     Attribute no     =Derwent assigned record number
                     Attribute mc     = Markush DARC Control Code
                     Attribute trc     = Fragmentation Control Time ranging codes:
                                    M900     Pre-1970
                                    M901     1970-1971
                                    M902     1972-1981 (8126)
                                    M903     1981 (8127) onward
                     Attribute rn      = 910 Codes generated from registry numbers.
                                    Only searchable from 1981
                     Attribute wd = 911 Wide Disclosure. Only searchable from 1981
                <xsd:sequence minOccurs="1" maxOccurs="1">
                    <xsd:element ref="FCodes" minOccurs="1" maxOccurs="1"/>
                    <xsd:element ref="RINs" minOccurs="0" maxOccurs="1"/>
                    <xsd:element ref="SCNs" minOccurs="0" maxOccurs="1"/>
                    <xsd:element ref="MCNs" minOccurs="0" maxOccurs="1"/>
                    <xsd:element ref="DRNs" minOccurs="0" maxOccurs="1"/>
                    <xsd:element ref="DCRs" minOccurs="0" maxOccurs="1"/>

  • Encoding problem with xml

    I'm trying to save to disk an XML document that is encoded as UTF-8 and compressed with the "gzip" algorithm. For various reasons I have to use a BufferedWriter, so I must convert the byte[] to a String.
    Afterwards, I read this document back, try to decompress it and store it in a String variable.
    If I do this whole process using utf-8 encoding, the decompression throws an exception: Not in GZIP format.
    But if I use iso-8859-1, everything works OK.
    I don't understand why iso works and utf-8 does not (when the webservice specification says that the XML documents are sent in utf-8).
    The code is the following (the static "myCharset" is the key: with iso it works, with utf-8 it does not):
    import java.io.*;

    public class testCompress
    {
      public static String xmlOutput     = "<?xml version=\"1.0\" encoding=\"utf-8\"?><soap:Envelope xmlns:soap=\"http://schemas.xmlsoap.org/soap/envelope/\" xmlns:xsi=\"http://www.w3.org/2001/XMLSchema-instance\" xmlns:xsd=\"http://www.w3.org/2001/XMLSchema\"><soap:Body><consultaCiudadesPorPaisResponse xmlns=\"http://tempuri.org/\"><consultaCiudadesPorPaisResult><xml funcion=\"ConsultaCiudadesPorPais\" xmlns=\"\"><ROK>TRUE</ROK></xml></consultaCiudadesPorPaisResult></consultaCiudadesPorPaisResponse></soap:Body></soap:Envelope>";
      public static String myCharset     = "iso-8859-1";

      // Writes the compressed document to disk.
      public static void writeDocument() throws Exception
      {
        BufferedWriter writer = null;
        try
        {
          // Compress the "xmlOutput" converting this string to bytes using "utf-8" as specification says (in "compress" method)
          // Afterwards, I convert this byte[] to String using "myCharset".
          String compressedFile = new String(CompressionService.compress(xmlOutput, "utf-8", "gzip", 8), myCharset);
          // And write to disk using "myCharset" as the encoding used by "OutputStreamWriter".
          writer = new BufferedWriter(new OutputStreamWriter(new FileOutputStream("c:/cache"), myCharset), 8192);
          writer.write(compressedFile);
        }
        catch (Exception e) { throw e; }
        finally
        {
          if (writer != null)
            try { writer.close(); writer = null; } catch (IOException ioe) {}
        }
      }

      // Reads the compressed document from disk.
      public static byte[] readDocument() throws Exception
      {
        BufferedReader reader = null;
        StringBuilder sb      = new StringBuilder();
        char[] buffer         = new char[8192];
        try
        {
          // Open the file using "myCharset" for reading chars.
          reader = new BufferedReader(new InputStreamReader(new FileInputStream("c:/cache"), myCharset));
          int numchars = 0;
          while ((numchars = reader.read(buffer, 0, 8192)) >= 0) sb.append(buffer, 0, numchars);
          // And return the result as a byte[] encoded with "myCharset".
          return (sb.toString().getBytes(myCharset));
        }
        catch (Exception e) { throw e; }
        finally
        {
          if (reader != null)
            try { reader.close(); } catch (IOException ioe) {}
        }
      }

      public static void main(String[] args) throws Exception
      {
        // Write the file first so there is something to read back.
        writeDocument();
        byte[] file = readDocument();
        // DECOMPRESS FAILS IF myCharset = "utf-8", and works if myCharset = "iso-8859-1"
        System.out.println(com.vpfw.proxy.services.compress.CompressionService.decompress(file, "utf-8", "gzip", 8));
      }
    }

    Okay, here's what's happening. You created a byte[] by encoding some text as UTF-8, then you ran that byte[] through a gzip deflater. The result is binary data that can only be understood by a gzip inflater; to any other software it just looks like garbage. Now you're taking a randomly-chosen encoding and pretending the binary data is really text that was encoded with that encoding.
    Most encodings have limits on what kinds of input they can accept. For example, US-ASCII only uses the low-order seven bits of each byte; any byte with a value larger than 127 is invalid. When the decoder encounters such a byte, it inserts the standard replacement character, U+FFFD, in that spot. When you encode the string again as US-ASCII, the replacement character cannot be represented, so a substitute (typically '?') is written in that position; the original byte value is lost. In UTF-8, the bytes have to conform to [certain patterns|http://en.wikipedia.org/wiki/UTF-8#Description]; for example, any byte with a value greater than 127 has to be part of a valid two-, three- or four-byte sequence.
    ISO-8859-1 is different. It's a single-byte encoding like ASCII, but it uses all eight bits of every byte. Furthermore, every possible byte value (0..255) maps to a character, so you can throw any random byte at it and tell it the byte represents a character, and it will believe you. Some of those values may map to control characters that would look like garbage if you displayed them, but they're valid. That means you can re-encode the string as ISO-8859-1 and get back the exact byte sequence you started with.
    So that's why your code "works" when you use ISO-8859-1, but I strongly recommend that you find another way; making binary data masquerade as text is dangerously fragile. Why do you have to use a Writer anyway? Is it for transmission over a medium that only accepts text data? If so, you should use a Base64 encoder or similar tool that's designed for that purpose.
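    The round-trip behaviour described above can be checked with a small sketch. The class and method names here are made up for illustration, and the six sample bytes (starting with the gzip magic number 0x1f 0x8b) merely stand in for real compressed output. It decodes the bytes as text and encodes them back in the same charset, then does the same through Base64.

```java
import java.nio.charset.Charset;
import java.nio.charset.StandardCharsets;
import java.util.Arrays;
import java.util.Base64;

class BinaryAsTextDemo {
    // Pretend the bytes are text in the given charset, then turn the text back into bytes.
    static byte[] roundTrip(byte[] data, Charset charset) {
        return new String(data, charset).getBytes(charset);
    }

    // Base64 is designed to carry arbitrary bytes through text-only channels.
    static String toBase64(byte[] data) {
        return Base64.getEncoder().encodeToString(data);
    }

    static byte[] fromBase64(String text) {
        return Base64.getDecoder().decode(text);
    }

    public static void main(String[] args) {
        // Hypothetical sample: a few bytes of the kind a gzip stream contains.
        byte[] binary = { (byte) 0x1f, (byte) 0x8b, 0x08, 0x00, (byte) 0xff, (byte) 0xc0 };

        // ISO-8859-1 maps every byte value to a character, so nothing is lost.
        System.out.println(Arrays.equals(binary, roundTrip(binary, StandardCharsets.ISO_8859_1))); // true
        // 0x8b, 0xff and 0xc0 are not valid UTF-8 here, so they become U+FFFD and the bytes change.
        System.out.println(Arrays.equals(binary, roundTrip(binary, StandardCharsets.UTF_8)));      // false
        // Base64 survives the trip regardless of the bytes involved.
        System.out.println(Arrays.equals(binary, fromBase64(toBase64(binary))));                   // true
    }
}
```

    If the file does not have to be text at all, writing the byte[] straight to a FileOutputStream avoids the question entirely.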

  • Issue with XML encoding

    I am querying an SAP instance for data, and I am getting the data back in an XML document, to which I need to apply an XSL transformation.
    The instance is a German one, so I am getting a few strange characters, because of which the parser is unable to apply the transformation.
    I wrote a small Java program to get more information on the particular characters causing the problem; one of them turns out to be an ISOControl character. The properties (obtained using the Character class API) are as below.
    1. Character 1:
    The getDirectionality is:0
    The getNumericValue is:21
    The getType is:2
    The isDefined is:true
    The isISOControl is:false
    The isMirrored is:false
    The isSpaceChar is:false
    The isUnicodeIdentifierPart is:true
    The isUnicodeIdentifierStart is:true
    The isWhitespace is:false
    2. character 2:
    The getDirectionality is:9
    The getNumericValue is:-1
    The getType is:15
    The isDefined is:true
    The isISOControl is:true
    The isMirrored is:false
    The isSpaceChar is:false
    The isUnicodeIdentifierPart is:true
    The isUnicodeIdentifierStart is:false
    The isWhitespace is:false
    These characters appear to be white space in normal text editors...
    This is how I am applying the transformation
    TransformerFactory tFactory = TransformerFactory.newInstance();
    Transformer transformer =
    new javax.xml.transform.stream.StreamSource
    StringReader sr = new StringReader(xmlToApply);
    StringWriter sw = new StringWriter();
    new javax.xml.transform.stream.StreamSource(sr),
    new javax.xml.transform.stream.StreamResult(sw)
    return sw.toString();
    I have tried to use "UTF-8", "ISO-8859-1" and other encodings, but in vain. Appreciate any pointers on this...

    Also the postings are from 2 different perspectives... this is to address the issue that I am facing with XML parsing and how to eliminate that using XML apis...
    My other posting about a possible work around (encoding) for the problem creating string...

  • How to put String with html tags as it is into xml

    I am using the Apache DOM API to create XML from Java.
    I have a string with HTML tags in it. When I add the string to the XML, it replaces every "<" with &lt; and every ">" with &gt;. I would like the HTML tags to appear as they are instead of the &gt; and &lt;. How can I achieve that?
    This is the code snippet of what I am doing.
    In the Java class:
    String titleString = "<font color=red>This Is an Example of a Red Subject</font>";
    Document doc = new DocumentImpl();
    Element root = doc.createElement("bulletin");
    Element item = doc.createElement("title");
    In Xml it looks like below
    <title><font color=red>This Is an Example of a Red Subject</font></title>
    but I would like to have the xml like below
    <title><font color="red">This Is an Example of a Red Subject</font></title>
    Can you please suggest the best way to achieve this?
    I appreciate all your help
    Thank you

    One problem is that you don't understand escaping. If you re-read what you posted you'll see that what you say you get, and what you say you want, are identical. That's because you didn't escape one of the two properly. So your first step should be to find the section about escaping in Chapter 1 of your XML book and read it carefully. Figure out what you should have done here (yes, the same rules apply).
    However, to attempt to answer what I think your question is: if you have a String which contains markup, and you want to convert that String to XML elements, then you have to feed the String into an XML parser.
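    A minimal sketch of that last suggestion, using only the standard JAXP classes rather than the Xerces DocumentImpl from the question. The class and method names are hypothetical. It assumes the string is well-formed XML, which means the attribute value has to be quoted (color="red", not color=red); otherwise the parser rejects it.

```java
import java.io.StringReader;
import javax.xml.parsers.DocumentBuilder;
import javax.xml.parsers.DocumentBuilderFactory;
import org.w3c.dom.Document;
import org.w3c.dom.Element;
import org.w3c.dom.Node;
import org.xml.sax.InputSource;

class MarkupToElements {
    // Builds bulletin/title and places the parsed fragment inside title as real elements.
    static Document buildBulletin(String fragment) {
        try {
            DocumentBuilder builder = DocumentBuilderFactory.newInstance().newDocumentBuilder();
            Document doc = builder.newDocument();
            Element root = doc.createElement("bulletin");
            Element title = doc.createElement("title");
            doc.appendChild(root);
            root.appendChild(title);

            // Parse the string as its own small document ...
            Document parsed = builder.parse(new InputSource(new StringReader(fragment)));
            // ... then copy its root element (with children) into the target document.
            Node imported = doc.importNode(parsed.getDocumentElement(), true);
            title.appendChild(imported);
            return doc;
        } catch (Exception e) {
            throw new IllegalArgumentException("fragment is not well-formed XML", e);
        }
    }

    public static void main(String[] args) {
        Document doc = buildBulletin("<font color=\"red\">This Is an Example of a Red Subject</font>");
        Element font = (Element) doc.getElementsByTagName("font").item(0);
        System.out.println(font.getAttribute("color") + ": " + font.getTextContent());
    }
}
```

    Adding the string with createTextNode instead would keep it as character data, which is why the serializer escapes the angle brackets in that case.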

  • How Can I replace newScale Text Strings with Custom Values?

    How Can I replace newScale Text Strings with Custom Values?
    All  newScale text is customizable. Follow the procedure below to change the  value of any text string that appears in RequestCenter online pages.
    1. Find out the String ID of the text string you would like to overwrite by turning on the String ID display:
    a) Navigate to the RequestCenter.ear/config directory.
    b) Open the newscale.properties file and add the following name-value pair at the end of the file: res.format=2
    c) Save the file.
    d) Repeat steps b and c for the RmiConfig.prop and RequestCenter.prop files.
    e) Stop and restart the RequestCenter service.
    f) Log  in to RequestCenter and browse to the page that has the text you want  to overwrite. In front of the text you will now see the String ID.
    g) Note down the String ID's you want to change.
    2. Navigate to the directory: /RequestCenter.ear/RequestCenter.war/WEB-INF/classes/com/newscale/bfw.
    3. Create the following sub-directory: res/resources
    4. Create the following empty text files in the directory you just created:
    5. Add the custom text strings to the appropriate  RequestCenter_<Number>.properties file in the following manner  (name-value pair) StringID=YourCustomTextString
    Example: The StringID for "Available Work" in ServiceManager is 699.
    If you wanted to change "Available Work" to "General Inbox", you  would add the following line to the RequestCenter_0.properties file
         699=General Inbox
    Strings are divided into the following files, based on their numeric ID:
    String ID  File Name
    0 to 999 -> RequestCenter_0.properties
    1000 to 1999 -> RequestCenter_1.properties
    2000 to 2999 -> RequestCenter_2.properties
    3000 to 3999 -> RequestCenter_3.properties
    4000 to 4999 -> RequestCenter_4.properties
    5000 to 5999 -> RequestCenter_5.properties
    6000 to 6999 -> RequestCenter_6.properties
    7000 to 7999 -> RequestCenter_7.properties
    6. Turn off the String ID display by removing (or commenting out) the line "res.format=2" from the newscale.properties, RequestCenter.prop and RmiConfig.prop files
    7. Restart RequestCenter.
    Your customized text should be displayed.
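    The ID-to-file mapping in the table above follows a simple pattern (the ID divided by 1000), which can be sketched as below. The class and method names are invented for illustration and are not part of RequestCenter; the sketch only covers the 0 to 7999 range that the table lists.

```java
class StringIdFiles {
    // Returns the properties file that holds the given String ID, per the table above.
    static String fileNameFor(int stringId) {
        if (stringId < 0 || stringId > 7999) {
            // The table only documents IDs 0 to 7999; anything else is left unhandled here.
            throw new IllegalArgumentException("String ID outside documented range: " + stringId);
        }
        return "RequestCenter_" + (stringId / 1000) + ".properties";
    }

    public static void main(String[] args) {
        // "Available Work" has String ID 699 in the example above.
        System.out.println(fileNameFor(699)); // RequestCenter_0.properties
    }
}
```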

    I've recently come across this information and it was very helpful in changing some of the inline text.
    However, one place that seemed out of reach with this method was the three main buttons on an "Order" page.  Specifically the "Add & Review Order" button was confusing some of our users.
    Through the use of JavaScript we were able to modify the label of this button.  We placed JS in the footer.html file that changes the value of the butt

  • Stepping through a query result set, replacing one string with another.

    I want to write a function that replaces each occurrence of one string with a different string.  I need it to be a CF function that is callable from another CF function.  I want to "hand" this function an SQL statement (a string) like this:   (Please note, don't bother commenting that "there are easier ways to write this SQL..."; I've made this simple example to get to the point where I need help.  I have to use a "sub_optimal" SQL syntax just to demonstrate the situation.)
    Here is the string I want to pass to the function:
    Here is the contents of the ABRV table:
    TBL_NM,  ABRV    <!--- Header row--->
    The function will return the original string, but with the abbreviations in place of the long table names, example:
    Notice that only the table names surrounded by brackets and that match a value in the ABRV table have been replaced.  The LONGTABLENAME immediately following the FROM is left as is.
    Now, here is my dumb amateur attempt at writing said function.  Please look at the comment lines for where I need help.
          <cffunction name="AbrvTblNms" output="false" access="remote" returntype="string" >
            <cfargument name="txt" type="string" required="true" />
            <cfset var qAbrvs="">  <!--- variable to hold the query results --->
            <cfset var output_str="#txt#">  <!--- I'm creating a local variable so I can manipulate the data handed in by the TXT parameter.  Is this necessary or can I just use the txt parameter? --->
            <cfquery name="qAbrvs" datasource="cfBAA_odbc" result="rsltAbrvs">
         <!--- I'm assuming that at this point the query has run and there are records in the result set --->
        <cfloop index="idx_str" list="#qAbrvs#">      <!--- Is this correct?  I think not. --->
        <cfset output_str = Replace(output_str, "#idx_str#", )  <!--- Is this correct?  I think not. --->
        </cfloop>               <!--- What am I looping on?  What is the index? How do I do the string replacement? --->
            <!--- The chunk below is a partial listing from my Delphi Object Pascal function that does the same thing
                   I need to know how to write this part in CF9
          while not Eof do
              s := StringReplace(s, '[' +FieldByName('TBL_NM').AsString + ']', FieldByName('ABRV').AsString, [rfReplaceAll]);
        <cfreturn output_txt>
    I'm mainly struggling with syntax here.  I know what I want to happen, I know how to make it happen in another programming language, just not CF9.  Thanks for any help you can provide.

    RedOctober57 wrote:...
    Thanks for any help you can provide.
    <cfset var output_str="#txt#">  <!--- I'm creating a local
    variable so I can manipulate the data handed in by the TXT parameter.
    Is this necessary or can I just use the txt parameter? --->
    No, you do not need to create a local copy of the argument: the arguments scope is already local to the function. However, you do not reference the arguments scope explicitly, so you leave yourself open to picking up a 'txt' variable from another scope.  The better practice is to reference "arguments.txt" wherever you need it.
    I know what I want to happen, I know how to make it happen in another programming language, just not CF9.
    Then a better start would be to describe what you want to happen and give a simple example in the other programming language.  Most of us are multilingual and can parse out clear and clean code in just about any syntax.
    <cfloop index="idx_str" list="#qAbrvs#">      <!--- Is this correct?  I think not. --->
    I think you want to be looping over your "qAbrvs" record set returned by your earlier query, maybe.
    <cfloop query="qAbrvs">
    <cfset output_str = Replace(output_str, "#idx_str#", )  <!--- Is this correct?  I think not. --->
    Continuing on that assumption, I would guess you want to replace each bracketed instance of the long string with the short string from that record set.
    <cfset output_str = Replace(output_str, "[" & qAbrvs.TBL_NM & "]", qAbrvs.ABRV, "ALL")>
    </cfloop>               <!--- What am I looping on?  What is the index? How do I do the string replacement? --->
    If this is true, then you are looping over the record set of table names and abbreviations that you want to replace in the string.
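    For readers more at home in Java, the same loop can be sketched as follows. This is only an illustration of the logic, not CF code: a Map stands in for the qAbrvs record set (key = TBL_NM, value = ABRV), and the SQL text and the abbreviation LTN are made-up sample data.

```java
import java.util.LinkedHashMap;
import java.util.Map;

class AbbreviateTableNames {
    // Replaces every bracketed long table name with its abbreviation.
    static String abbreviate(String sql, Map<String, String> abbreviations) {
        String out = sql;
        for (Map.Entry<String, String> row : abbreviations.entrySet()) {
            // String.replace is literal (no regex) and replaces all occurrences.
            out = out.replace("[" + row.getKey() + "]", row.getValue());
        }
        return out;
    }

    public static void main(String[] args) {
        Map<String, String> abbreviations = new LinkedHashMap<>();
        abbreviations.put("LONGTABLENAME", "LTN"); // hypothetical row of the ABRV table

        String sql = "SELECT [LONGTABLENAME].ID FROM LONGTABLENAME";
        // Only the bracketed name changes; the one after FROM is left as is.
        System.out.println(abbreviate(sql, abbreviations)); // SELECT LTN.ID FROM LONGTABLENAME
    }
}
```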

  • Parsing XML string with XPath

    I am trying to parse an XML string with XPath as follows, but I am getting null for getresult.
    I am getting javax.xml.xpath.XPathExpressionException at the line where
    getresult = xpathexpression.evaluate(isource); is executed.
    What should I do after
    xpathexpression = xPath.compile("a/b"); in the snippet below?
    String xmlstring = "..."; // a valid XML string
    XPath xPath = XPathFactory.newInstance().newXPath();
    xpathexpression = xPath.compile("a/b");
    // I guess the following line is not correct
    InputSource isource = new InputSource(new ByteArrayInputStream(xmlstring.getBytes())); // right?
    getresult = xpathexpression.evaluate(isource);
    My XML string is like:
         <result> valid some more tags here
      <c> 10
    </a>
    Edited by: geoman on Dec 8, 2008 2:30 PM

    I've never used the version of evaluate that takes an InputSource. The difficulty with using it is that it does not save the DOM object. Each expression you evaluate will have to create the DOM object, use it once and then throw it away. I've yet to write a program that only needs one answer from an XML document. Usually, I use XPath to locate somewhere in a document and then read "nearby" content, add new content nearby, delete content, or move content. I'd suggest you may want to parse the XML stream and save the DOM Document.
    Second, all of the XPath expressions search from a "context node". I have not had good luck searching from the Document object, so I always get the root element first. I think the expression should work if you use the root as the context node. You will need one of the versions of evaluate that uses an Object as the parameter.
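    A minimal sketch of that approach: parse the string once, keep the Document, and evaluate expressions with the root element as the context node. The class and method names and the sample XML are hypothetical, since the poster's real document is not shown in full. Note that with the root element as context, the path is written relative to it ("b" rather than "a/b").

```java
import java.io.StringReader;
import javax.xml.parsers.DocumentBuilderFactory;
import javax.xml.xpath.XPath;
import javax.xml.xpath.XPathFactory;
import org.w3c.dom.Document;
import org.w3c.dom.Element;
import org.xml.sax.InputSource;

class XPathOnSavedDom {
    // Parses the XML text once so the resulting DOM can be reused for many queries.
    static Document parse(String xml) {
        try {
            return DocumentBuilderFactory.newInstance().newDocumentBuilder()
                    .parse(new InputSource(new StringReader(xml)));
        } catch (Exception e) {
            throw new IllegalArgumentException("could not parse XML", e);
        }
    }

    // Evaluates an XPath expression relative to the given context element, as a string.
    static String valueAt(Element context, String expression) {
        try {
            XPath xPath = XPathFactory.newInstance().newXPath();
            return xPath.evaluate(expression, context);
        } catch (Exception e) {
            throw new IllegalArgumentException("bad XPath expression: " + expression, e);
        }
    }

    public static void main(String[] args) {
        Document doc = parse("<a><b>10</b><c>20</c></a>"); // made-up sample document
        Element root = doc.getDocumentElement();
        System.out.println(valueAt(root, "b")); // 10
        System.out.println(valueAt(root, "c")); // 20
    }
}
```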

  • How to replace a character in a string with blank space.

    How to replace a character in a string with blank space.
    I have to change string  CL_DS_1===========CM01 to CL_DS_1               CM01.
    i.e) I have to replace '=' with ' '.
    I have already tried with <b>REPLACE ALL OCCURRENCES OF '=' IN temp_fill_string WITH ' '</b>
    It's not working.

    Try with this..
    call method textedit->replace_all
        case_sensitive_mode = case_sensitive_mode
        replace_string = replace_string
        search_string = search_string
        whole_word_mode = whole_word_mode
        counter = counter
        error_cntl_call_method = 1
        invalid_parameter = 2.
    <b>Parameters</b>         <b>Description</b>                          <b>Possible values</b>
    case_sensitive_mode    Upper-/lowercase                     false  Do not observe (default value)
                                                                true   Observe
    replace_string         Text to replace the occurrences
                           of search_string
    search_string          Text to be replaced
    whole_word_mode        Only replace whole words             false  Find whole words and parts of words (default value)
                                                                true   Only find whole words
    counter                Return value specifying how many
                           times the search string was replaced

  • Replacing a part of a String with a new String

    Hi everybody,
    Is there an option or a method to replace a part of a String with another String?
    I only found the method "replace", but with it I can only replace a single char of the String. I don't need to replace just a char; I have to replace a part of the String.
    String str = "Hello you nice world!";
    str.replace("nice","wonderfull");   // this won't work, because I can't replace a String with the method "replace"
                                        // with this method I'm only able to replace chars
    Does anyone know a method that does what I need?
    Thanks for your time on answering my question!!
    Kind regards

    Do check the Java 1.4 API: String.replaceAll was added there (note that its first argument is a regular expression). For JDK 1.3 you can use:
    private static String replace(String str, String word, String word2) {
         if (str == null || word == null || word2 == null ||
               word.equals("") || word2.equals("") || str.equals("")) {
              return str;
         }
         StringBuffer buff = new StringBuffer(str);
         int lastPosition = 0;
         while (lastPosition > -1) {
              int startIndex = str.indexOf(word, lastPosition);
              if (startIndex == -1) {
                   break;
              }
              // Swap this occurrence of word for word2 in the buffer.
              buff.replace(startIndex, startIndex + word.length(), word2);
              str = buff.toString();
              // Resume searching just past the inserted text so it is never rescanned.
              int len2 = startIndex + word2.length();
              lastPosition = str.indexOf(word, len2);
         }
         return buff.toString();
    }
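    On Java 5 and later, the standard library covers this directly: String.replace(CharSequence, CharSequence) replaces every literal occurrence of a substring. A short sketch follows (the class and method names are hypothetical). The point that is easy to miss is that Strings are immutable, so the result has to be assigned; calling str.replace(...) on its own line changes nothing.

```java
class ReplaceDemo {
    // Returns a new String; the original is never modified.
    static String makeWonderful(String text) {
        return text.replace("nice", "wonderful");
    }

    public static void main(String[] args) {
        String str = "Hello you nice world!";
        str.replace("nice", "wonderful"); // result discarded: str is unchanged
        System.out.println(str);          // Hello you nice world!

        str = makeWonderful(str);         // keep the returned String
        System.out.println(str);          // Hello you wonderful world!
    }
}
```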

  • Replacing a special character in a string with another string

    I need to replace a special character in a string with another string.
    Say there is a string -  "abc's def's are alphabets"
    and i need to replace all the ' (apostrophe) with &apos& ..which should look like as below
    "abc&apos&s def&apos&s are alphabets" .
    Kindly let me know how this requirement can be met.

    Syntax Forms
    Pattern-based replacement
              IN [section_of] dobj WITH new
              [IN {BYTE|CHARACTER} MODE]
              [REPLACEMENT COUNT rcnt]
              { {[REPLACEMENT OFFSET roff]
                 [REPLACEMENT LENGTH rlen]}
              | [RESULTS result_tab|result_wa] }.
    Position-based replacement
    2. REPLACE SECTION [OFFSET off] [LENGTH len] OF dobj WITH new
                      [IN {BYTE|CHARACTER} MODE].
    This statement replaces characters or bytes of the variable dobj by characters or bytes of the data object new. Here, position-based and pattern-based replacement are possible.
    When the replacement is executed, an interim result without a length limit is implicitly generated and the interim result is transferred to the data object dobj. If the length of the interim result is longer than the length of dobj, the data is cut off on the right in the case of data objects of fixed length. If the length of the interim result is shorter than the length of dobj, data objects of fixed length are filled to the right with blanks or hexadecimal zeroes. Data objects of variable length are adjusted. If data is cut off to the right when the interim result is assigned, sy-subrc is set to 2.
    In the case of character string processing, the closing spaces are taken into account for data objects dobj of fixed length; they are not taken into account in the case of new.
    System fields
    sy-subrc Meaning
    0 The specified section or subsequence was replaced by the content of new and the result is available in full in dobj.
    2 The specified section or subsequence was replaced in dobj by the contents of new and the result of the replacement was cut off to the right.
    4 The subsequence in sub_string was not found in dobj in the pattern-based search.
    8 The data objects sub_string and new contain double-byte characters that cannot be interpreted.
    These forms of the statement REPLACE replace the following obsolete form:
    REPLACE sub_string WITH
    REPLACE sub_string WITH new INTO dobj
            [IN {BYTE|CHARACTER} MODE]
            [LENGTH len].
    2. ... LENGTH len
    This statement searches through a byte string or character string dobj for the subsequence specified in sub_string and replaces the first byte or character string in dobj that matches sub_string with the contents of the data object new.
    The memory areas of sub_string and new must not overlap, otherwise the result is undefined. If sub_string is an empty string, the point before the first character or byte of the search area is found and the content of new is inserted before the first character.
    During character string processing, the closing blank is considered for data objects dobj, sub_string and new of type c, d, n or t.
    System Fields
    sy-subrc Meaning
    0 The subsequence in sub_string was replaced in the target field dobj with the content of new.
    4 The subsequence in sub_string could not be replaced in the target field dobj with the contents of new.
    This variant of the statement REPLACE will be replaced, beginning with Release 6.10, with a new variant.
    Addition 1
    The optional addition IN {BYTE|CHARACTER} MODE determines whether byte or character string processing will be executed. If the addition is not specified, character string processing is executed. Depending on the processing type, the data objects sub_string, new, and dobj must be byte or character type.
    Addition 2
    ... LENGTH len
    If the addition LENGTH is not specified, all the data objects involved are evaluated in their entire length. If the addition LENGTH is specified, only the first len bytes or characters of sub_string are used for the search. For len, a data object of the type i is expected.
    If the length of the interim result is longer than the length of dobj, data objects of fixed length will be cut off to the right. If the length of the interim result is shorter than the length of dobj, data objects of fixed length are filled to the right with blanks or with hexadecimal 0. Data objects of variable length are adapted.
    After the replacements, text1 contains the complete content "I should know that you know", while text2 has the cut-off content "I should know that".
    DATA:   text1      TYPE string       VALUE 'I know you know',
            text2      TYPE c LENGTH 18  VALUE 'I know you know',
            sub_string TYPE string       VALUE 'know',
            new        TYPE string       VALUE 'should know that'.
    REPLACE sub_string WITH new INTO text1.
    REPLACE sub_string WITH new INTO text2.
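    The cut-off behaviour in this example can be modelled outside ABAP. The Java sketch below is only an emulation written for illustration (the class, method and parameter names are invented, and it is not how the ABAP runtime is implemented): it replaces the first occurrence, then truncates or blank-pads the interim result when a fixed length is given, and leaves it alone when the length is -1 (standing in for a variable-length string).

```java
class ObsoleteReplaceSketch {
    // Emulates "REPLACE sub WITH replacement INTO dobj" for the first occurrence only.
    // fixedLength < 0 models a variable-length string; otherwise a fixed-length field.
    static String replaceFirst(String dobj, String sub, String replacement, int fixedLength) {
        int at = dobj.indexOf(sub);
        if (at < 0) {
            return dobj; // nothing found (the case reported as sy-subrc 4)
        }
        String interim = dobj.substring(0, at) + replacement + dobj.substring(at + sub.length());
        if (fixedLength < 0) {
            return interim; // variable length: the result is kept in full
        }
        if (interim.length() > fixedLength) {
            return interim.substring(0, fixedLength); // cut off on the right
        }
        StringBuilder padded = new StringBuilder(interim);
        while (padded.length() < fixedLength) {
            padded.append(' '); // fill to the right with blanks
        }
        return padded.toString();
    }

    public static void main(String[] args) {
        // text1: variable-length string
        System.out.println(replaceFirst("I know you know", "know", "should know that", -1));
        // text2: 18-character field, so the initial value carries three trailing blanks
        System.out.println(replaceFirst("I know you know   ", "know", "should know that", 18));
    }
}
```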
