Replacing special characters from xml document/text inside element
Hi
Is there any way to replace the XML character codes with the corresponding special characters, either across an entire XML document or for the text inside one element?
I want every XML code to be replaced with its respective special character (for any special character).
Please see the sample XML element below:
<Details>Advance is applicable only for &lt; 1000. This is mentioned in Seller's document</Details>
Thanks in advance .. any help will be highly appreciated.
So, just to be sure I understand correctly, you want this :
<Details>Advance is applicable only for &lt; 1000. This is mentioned in Seller's document</Details>
to be converted to :
<Details>Advance is applicable only for < 1000. This is mentioned in Seller's document</Details>
If so, I'll say again : the resulting XML document will be invalid.
Extensible Markup Language (XML) 1.0 (Fifth Edition)
The ampersand character (&) and the left angle bracket (<) MUST NOT appear in their literal form, except when used as markup delimiters, or within a comment, a processing instruction, or a CDATA section. If they are needed elsewhere, they MUST be escaped using either numeric character references or the strings "&amp;" and "&lt;" respectively. The right angle bracket (>) may be represented using the string "&gt;", and MUST, for compatibility, be escaped using either "&gt;" or a character reference when it appears in the string "]]>" in content, when that string is not marking the end of a CDATA section.
Ask whoever "they" are why they would want to work with XML that is not well-formed.
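If the goal is only to see the literal characters in application code, an XML parser already does that conversion when it hands over the text, so the document itself can stay escaped. A minimal sketch, assuming the standard JAXP DOM parser; the class name `EntityDemo` and the helper `rootText` are made up for illustration:

```java
import java.io.ByteArrayInputStream;
import java.nio.charset.StandardCharsets;
import javax.xml.parsers.DocumentBuilderFactory;
import org.w3c.dom.Document;

class EntityDemo {
    // Parses a small XML string and returns the text content of its root element.
    // The parser resolves predefined entities such as &lt; and &amp; on the way in.
    static String rootText(String xml) throws Exception {
        Document doc = DocumentBuilderFactory.newInstance()
                .newDocumentBuilder()
                .parse(new ByteArrayInputStream(xml.getBytes(StandardCharsets.UTF_8)));
        return doc.getDocumentElement().getTextContent();
    }

    public static void main(String[] args) throws Exception {
        // The file on disk keeps the escaped form; only the in-memory text is unescaped.
        String xml = "<Details>Advance is applicable only for &lt; 1000.</Details>";
        System.out.println(rootText(xml));
    }
}
```

The escaped form stays in the file, which keeps it well-formed, and the program still works with the plain `<` character.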
Similar Messages
-
Question about special characters in XML documents
Hi,
I need to create some XML documents from source documents whose elements contain strings with accented and special characters such as ��� or Asian characters.
I have found the following code, which claims that XML documents created by the xformer will be in UTF-8 format:
// Write the DOM document to the file
Transformer xformer = TheTF.newTransformer();
xformer.setOutputProperty(OutputKeys.ENCODING, "UTF-8");
and I have found the InputSource object to be used to read UTF-8 XML documents:
InputSource in = new InputSource(new FileReader(filename));
in.setEncoding("UTF-8");
Is this enough to make sure that accented and Asian characters (or any other characters) will be converted properly when creating and re-reading the XML document? If one of my elements contains the following text '���33445tata', will the string remain the same when re-read from the XML? Do I need more code to make sure that all conversions will be performed properly and transparently? If yes, can someone provide an example?
Thanks!
Hi Dr.Clap,
Thanks for your answer, but I am not really sure I understand you. I have performed a test to generate a simple document containing one element. An attribute containing some special characters is added to the element. The element text also contains special characters.
When I generate the XML into a byte array input and print the content, I get:
<?xml version="1.0" encoding="utf-8" standalone="no"?><UTF8TEST BIBI="��ii��������">tre33������</UTF8TEST>
When I parse the document and print the result, I get:
-- Begin of Parsing
Element Name = UTF8TEST
Attribute Name = BIBI
Attribute Value = �ii����
Found text : tre33���
-- End of Parsing
which is correct. I am worried by the ��ii� ������-like characters generated in the XML. Should I be worried or shouldn't I? Should I convert these into something looking more like ASCII, or is this fine?
I am including the code of my example (excluding imports):
public class XMLUTF8_Test extends DefaultHandler {

    // Logger
    private final static Logger LOG = Logger.getLogger(XMLUTF8_Test.class.getName());

    // Static
    public static DocumentBuilderFactory DOCUMENT_BUILDER_FACTORY;
    public static DocumentBuilder DOCUMENT_BUILDER;
    public static TransformerFactory TRANSFORMER_FACTORY;
    private static SAXParserFactory FACTORY;
    private static SAXParser SAXPARSER;
    private static XMLReader XMLREADER;

    static {
        // Preparing the writing
        DOCUMENT_BUILDER_FACTORY = DocumentBuilderFactory.newInstance();
        try {
            DOCUMENT_BUILDER = DOCUMENT_BUILDER_FACTORY.newDocumentBuilder();
        } catch (ParserConfigurationException Ex) {
            Debugger.LogUnreachableCodeReachedSituation(LOG, Ex);
        }
        TRANSFORMER_FACTORY = TransformerFactory.newInstance();
        // Preparing the reading
        try {
            FACTORY = SAXParserFactory.newInstance();
            SAXPARSER = FACTORY.newSAXParser();
            XMLREADER = SAXPARSER.getXMLReader();
        } catch (ParserConfigurationException Ex) {
            Debugger.LogUnreachableCodeReachedSituation(LOG, Ex);
        } catch (SAXException Ex) {
            Debugger.LogUnreachableCodeReachedSituation(LOG, Ex);
        }
    }

    public static void main(String[] args) {
        // Perform test
        new XMLUTF8_Test();
    }

    public XMLUTF8_Test() {
        // Saving to File
        Document NewDoc = DOCUMENT_BUILDER.newDocument();
        Element Root = NewDoc.createElement("UTF8TEST");
        Root.setAttribute("BIBI", "�ii����");
        Root.setTextContent("tre33���");
        NewDoc.appendChild(Root);
        ByteArrayOutputStream TheBAOS = new ByteArrayOutputStream();
        try {
            Source source = new DOMSource(NewDoc);
            Result result = new StreamResult(TheBAOS);
            Transformer xformer = TRANSFORMER_FACTORY.newTransformer();
            xformer.setOutputProperty(OutputKeys.ENCODING, "utf-8");
            xformer.transform(source, result);
        } catch (TransformerConfigurationException Ex) {
            Debugger.LogUnreachableCodeReachedSituation(LOG, Ex);
        } catch (TransformerException Ex) {
            Debugger.LogUnreachableCodeReachedSituation(LOG, Ex);
        }
        // Printing result
        System.out.println(TheBAOS.toString());
        // Parsing the result
        ByteArrayInputStream TheBAIS = new ByteArrayInputStream(TheBAOS.toByteArray());
        XMLREADER.setContentHandler(this);
        try {
            XMLREADER.parse(new InputSource(TheBAIS));
        } catch (IOException Ex) {
            Debugger.LogUnreachableCodeReachedSituation(LOG, Ex);
        } catch (SAXException Ex) {
            Debugger.LogUnreachableCodeReachedSituation(LOG, Ex);
        }
    }

    @Override
    public void startDocument() throws SAXException {
        System.out.println("-- Begin of Parsing");
    }

    @Override
    public void endDocument() throws SAXException {
        System.out.println("-- End of Parsing");
    }

    @Override
    public void startElement(String namespaceURI, String LocalName, String QualifiedName,
            Attributes TheAttributes) throws SAXException {
        String ElementName = LocalName;
        if ("".equals(ElementName)) {
            ElementName = QualifiedName; // namespaceAware = false
        }
        System.out.println("Element Name = " + ElementName);
        if (TheAttributes != null) {
            for (int i = 0; i < TheAttributes.getLength(); i++) {
                String AttributeName = TheAttributes.getLocalName(i); // Attr name
                if ("".equals(AttributeName)) AttributeName = TheAttributes.getQName(i);
                System.out.println("Attribute Name = " + AttributeName);
                System.out.println("Attribute Value = " + TheAttributes.getValue(AttributeName));
            }
        }
    }

    @Override
    public void endElement(String namespaceURI, String SimpleName, String QualifiedName) throws SAXException {
    }

    @Override
    public void characters(char[] buf, int offset, int len) throws SAXException {
        String s = new String(buf, offset, len);
        System.out.println("Found text : " + s);
    }
}
Thanks! -
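On the encoding question in this thread, here is a minimal round-trip sketch, assuming only the standard JAXP classes already used above. It hands the parser a byte stream rather than a Reader, so the parser picks the encoding up from the XML declaration. Two details may matter here: `InputSource.setEncoding` has no effect once a character stream such as a `FileReader` is supplied, and `ByteArrayOutputStream.toString()` decodes with the platform default charset, which could explain strange-looking characters in the printed XML even though the parsed values are correct. The class name `Utf8RoundTrip`, the element name `sample` and the test string are invented for illustration.

```java
import java.io.ByteArrayInputStream;
import java.io.ByteArrayOutputStream;
import javax.xml.parsers.DocumentBuilderFactory;
import javax.xml.transform.OutputKeys;
import javax.xml.transform.Transformer;
import javax.xml.transform.TransformerFactory;
import javax.xml.transform.dom.DOMSource;
import javax.xml.transform.stream.StreamResult;
import org.w3c.dom.Document;
import org.w3c.dom.Element;

class Utf8RoundTrip {
    // Builds a one-element document and serializes it as UTF-8 bytes.
    static byte[] write(String text) throws Exception {
        Document doc = DocumentBuilderFactory.newInstance().newDocumentBuilder().newDocument();
        Element root = doc.createElement("sample");
        root.setTextContent(text);
        doc.appendChild(root);
        Transformer t = TransformerFactory.newInstance().newTransformer();
        t.setOutputProperty(OutputKeys.ENCODING, "UTF-8");
        ByteArrayOutputStream out = new ByteArrayOutputStream();
        t.transform(new DOMSource(doc), new StreamResult(out));
        return out.toByteArray();
    }

    // Parses the bytes; the parser decodes them according to the XML declaration.
    static String read(byte[] xml) throws Exception {
        Document doc = DocumentBuilderFactory.newInstance()
                .newDocumentBuilder()
                .parse(new ByteArrayInputStream(xml));
        return doc.getDocumentElement().getTextContent();
    }

    public static void main(String[] args) throws Exception {
        // Accented Latin letters plus two CJK characters, written as escapes so the
        // example does not depend on the encoding of the source file.
        String original = "\u00e9\u00e0\u00fc\u65e5\u672c33445tata";
        String back = read(write(original));
        System.out.println(original.equals(back));
    }
}
```

To inspect the serialized XML as text, decoding the bytes explicitly (for example `new String(bytes, StandardCharsets.UTF_8)`) avoids depending on the platform default.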
Special characters in XML document
I have a Flash file saved as version 8 with the following script calling an xml file:
//init TextArea component
myText.html = true;
myText.wordWrap = false;
myText.multiline = false;
myText.label.condenseWhite=true;
//load in XML
xml = new XML();
xml.ignoreWhite = true;
xml.onLoad = function(success){
    if(success){
        myText.text = xml;
    }
};
xml.load("titletext.xml");
My xml file contains the following:
<?xml version="1.0" encoding="iso-8859-1"?>
<![CDATA[Smith's dog & cat are "crazy"]]>
When posted online my flash file displays the encoding tag in the xml file.
AND the apostrophe, ampersand and quote marks display as html code instead of the actual character.
I can take the encoding tag out of the xml file but my characters still don't display correctly.
My dynamic text field in flash (myText) does have special characters embedded, plus I have them entered manually in the field for 'include these characters'.
Does anyone have suggestions for me?
You can view this test file at http://wilddezign.com/preshomes_name2.html
TIA
Perhaps you need a slightly different approach to loading the XML. Instead of loading the entire XML file, what if you loaded only the child you were looking for? This is what I usually do:
var xml:XML = new XML();
xml.ignoreWhite = true;
xml.load("some.xml");
xml.onLoad = parse;
function parse(success){
    if (success){
        root = xml.firstChild;
        _global.numberItems = root.attributes.items;
        itemNode = root.firstChild;
        var i:Number = 0;
        while(itemNode != null){
            myText.text = itemNode.attributes.description;
            itemNode = itemNode.nextSibling;
            i++;
        }
    }
    else {
        trace("XML Bad!!");
    }
}
And your XML would be structured like this:
<?xml version="1.0" encoding="utf-8"?>
<sample>
<item description="This is the text that I want to appear on this MC!" />
</sample> -
Replace special characters in xml to its HTML equivalent
Hello All,
I wrote a small XML processor that parses a given XML document and retrieves the necessary values. I used XPath with DocumentBuilder to implement it.
The problem I faced initially was that I could not parse the document when a value contained the '&' special character. For example,
<description>a & b</description>
I did some analysis and found that the '&' should be replaced with its corresponding HTML equivalent
&amp;
That solved the problem, and I was able to process the XML document as expected. I applied the replaceAll() method to the entire document.
Then I thought there might be other special characters that could cause the same error in future. I found that '<' also causes the same kind of error. For example,
<description>a < b</description>
Here I could not use replaceAll(), because even the '<' in the XML element tags was replaced, so I was not able to parse the XML document.
Did anyone face this kind of issue? Can anyone help me get rid of it?
Thanks
kms
That's the thing about XML. It has to be correct, or it doesn't pass the gate. This is nothing really, seen it many times before. It becomes funny when you start to run into character set mismatches.
In this case the XML data should have either been escaped already, or the data should have been wrapped in cdata blocks, as described here:
http://www.w3schools.com/xml/xml_cdata.asp
Manually "fixing" files is not what you want to be doing. The file has to be correct according to the very simple yet strict XML specifications, or it isn't accepted. -
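For whoever produces such files: if the XML is generated through an XML API instead of string concatenation, the serializer does the escaping, so '&' and '<' in values never reach the file in raw form. A minimal sketch of that idea, assuming the standard JAXP DOM and Transformer classes; the class name `EscapeOnWrite` is invented for illustration and this is not the original poster's code:

```java
import java.io.StringWriter;
import javax.xml.parsers.DocumentBuilderFactory;
import javax.xml.transform.OutputKeys;
import javax.xml.transform.Transformer;
import javax.xml.transform.TransformerFactory;
import javax.xml.transform.dom.DOMSource;
import javax.xml.transform.stream.StreamResult;
import org.w3c.dom.Document;
import org.w3c.dom.Element;

class EscapeOnWrite {
    // Wraps the given raw text in a <description> element and serializes it.
    // The serializer, not the caller, is responsible for escaping the text.
    static String serialize(String rawText) throws Exception {
        Document doc = DocumentBuilderFactory.newInstance().newDocumentBuilder().newDocument();
        Element description = doc.createElement("description");
        description.setTextContent(rawText);
        doc.appendChild(description);

        Transformer t = TransformerFactory.newInstance().newTransformer();
        t.setOutputProperty(OutputKeys.OMIT_XML_DECLARATION, "yes");
        StringWriter out = new StringWriter();
        t.transform(new DOMSource(doc), new StreamResult(out));
        return out.toString();
    }

    public static void main(String[] args) throws Exception {
        System.out.println(serialize("a & b"));
        System.out.println(serialize("a < b"));
    }
}
```

The tags stay intact and only the text content is escaped, which is exactly what a blanket replaceAll() over the whole document cannot distinguish.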
Question: I am trying to add an emoji from Special Characters to my document, which has tables. After adding it, I cannot see the emoji picture. Can anyone help? Thank you
Emoji is not supported by the iWork apps.
Send Feedback to Apple via the Pages menu.
Jerry -
How to remove special characters in xml
Dear friends,
How can I remove special characters from an XML file? I am placing the XML file and fetching it through the file adapter.
The problem is that when there is any special character in the XML, I am not able to pass it to the target system smoothly.
The customer is asking us to schedule the file adapter; in order to do that, the source XML should not have any special characters.
How can I achieve this, friends?
Thanx in advance.
Take care
Hi Karthik,
Go through the following links on how to handle special characters:
https://www.sdn.sap.com/irj/scn/weblogs?blog=/pub/wlg/9420 [original link is broken]
https://www.sdn.sap.com/irj/sdn/go/portal/prtroot/docs/library/uuid/502991a2-45d9-2910-d99f-8aba5d79fb42
Restricting special characters in XML within XI..
Regards
Goli Sridhar -
The blank pop up began as I tried to access "special characters" from the finder menu. I restarted, turned off and restarted and it did not work. It interferes with any application because I cannot work fast. Every new step takes a few seconds longer such as saving, finding text, check spelling and many more. I am desperate to solve this. Thanks in advance for any help given.
Consuelo Corretjer
Use the trackpad to scroll; that's what it was designed for. The scroll bars automatically disappear when not being used and will appear if you scroll up or down using the trackpad.
This is a user-to-user forum and most people will post on here if they have problems. You very rarely get people posting to say their update went smoothly. The fact is the vast majority of Mountain Lion users will not be experiencing any major problems with the OS, or maybe only with apps which are not compatible, but that's hardly Apple's fault if developers don't update their apps. -
I use classical Hebrew for my work, and Pages will only display English characters even with a Hebrew font selected. If I cut and paste Hebrew characters from another document, as long as the font is supported, it will appear in Pages. If I type it won't continue in Hebrew. I have tried downloading several fonts, including those from professional societies, but the only way to get Hebrew in my document is to cut and paste. Does anyone know how to fix this? I use an older MacBook running OS 10.9.1. I used to do my Hebrew work in Word, but it is no longer supported by Mac OS.
Just clarifying:
Pages '09 has bad support for Hebrew, Arabic etc but will accept pasted text.
Pages 5 has much better support but with bugs.
If you have columns they are in the wrong order ie Text starts in the left column and ends in the right column.
If you type English into Hebrew text it tends to fall in the wrong position eg instead of to the left of Hebrew punctuation it goes to the right.
As Tom recommends the only real solution on the Mac is Mellel.
Peter
btw Tell Apple, they are amazingly slow to fix this running sore which has been broken since RtoL was supposedly introduced in OSX 10.2.3 over a decade ago.
Peter -
? is shown for Norwegian characters when XML document is parsed using DOM
Hi,
I've a sample program that creates a XML document with a single element book having Norwegian characters. Encoding is UTF-8. When i parse the XML document and try to access the value of that element then ? are shown for Norwegian characters. XML document file name is "Sample.xml"
DocumentBuilderFactory dbfac = DocumentBuilderFactory.newInstance();
DocumentBuilder docBuilder = dbfac.newDocumentBuilder();
Document doc = docBuilder.newDocument();
Element root = doc.createElement("root");
root.setAttribute("value", "Á á Ą ą ä É é Ę");
doc.appendChild(root);
TransformerFactory transfac = TransformerFactory.newInstance();
Transformer trans = transfac.newTransformer();
trans.setOutputProperty(OutputKeys.OMIT_XML_DECLARATION, "yes");
trans.setOutputProperty(OutputKeys.INDENT, "yes");
trans.setOutputProperty(OutputKeys.ENCODING, "UTF-8");
//create string from xml tree
java.io.ByteArrayOutputStream baos = new java.io.ByteArrayOutputStream();
StreamResult result = new StreamResult(baos);
DOMSource source = new DOMSource(doc);
trans.transform(source, result);
writeToFile("Sample.xml", baos.toByteArray());
InputSource is = new InputSource(new java.io.ByteArrayInputStream(readFile("Sample.xml")));
is.setEncoding("UTF-8");
DocumentBuilder obj_db = DocumentBuilderFactory.newInstance().newDocumentBuilder();
Document obj_doc = obj_db.parse(is);
obj_doc.normalize();
System.out.println("Value is : " + new String(((Element) obj_doc.getElementsByTagName("root").item(0)).getAttribute("value").getBytes()));
writeToFile() - Writes the document bytes to the Sample.xml file
readFile() - Reads the Sample.xml file
When I run this program, an XML editor shows the characters correctly, but the Java code output is: Á á ? ? ä É é ?
What is the problematic area in my Java code? I didn't get any help from any source. Please suggest a solution to this problem.
Thanx in advance.
Hi,
I'm using JBuilder 2005, and I specified UTF-8 encoding for saving the Java source files and also for compilation. I've modified my source code as well, but the problem persists. After applying the changes, the dumped sample.xml file doesn't display these characters correctly in IE, whereas earlier it displayed them correctly.
Modified code is:
DocumentBuilderFactory dbfac = DocumentBuilderFactory.newInstance();
DocumentBuilder docBuilder = dbfac.newDocumentBuilder();
Document doc = docBuilder.newDocument();
Element root = doc.createElement("root");
root.setAttribute("value", "Á á Ą ą ä É é Ę");
doc.appendChild(root);
OutputFormat output = new OutputFormat(doc, "UTF-8", true);
java.io.ByteArrayOutputStream baos = new java.io.ByteArrayOutputStream();
OutputStreamWriter osw = new OutputStreamWriter(baos, "UTF-8");
XMLSerializer s = new XMLSerializer(osw, output);
s.asDOMSerializer();
s.serialize(doc);
writeToFile("Sample5.xml", baos.toByteArray());
InputSource o = new InputSource(new java.io.ByteArrayInputStream(readFile("Sample5.xml")));
o.setEncoding("UTF-8");
com.sun.org.apache.xerces.internal.parsers.DOMParser obj_parser = new com.sun.org.apache.xerces.internal.parsers.DOMParser();
obj_parser.parse(o);
Document obj_doc = obj_parser.getDocument();
System.out.println("Value : " + new String(((Element) obj_doc.getElementsByTagName("root").item(0)).getAttribute("value").getBytes()));
I'm stuck on this issue. Can you please provide a code snippet that works with these characters, or suggest a solution?
Thanx -
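One possible explanation for the '?' output above, offered as an assumption since the poster's platform settings are not shown: `getBytes()` with no argument encodes with the platform default charset, and characters that charset cannot represent (such as Ą, ą and Ę in a Western single-byte charset) are replaced by '?'. `System.out` encodes the same way. The sketch below reproduces the effect with ISO-8859-1 standing in for the unknown default and shows printing through a `PrintStream` with an explicit encoding; the class and method names are invented for illustration.

```java
import java.io.ByteArrayOutputStream;
import java.io.PrintStream;
import java.nio.charset.Charset;
import java.nio.charset.StandardCharsets;

class ConsoleEncodingDemo {
    // Encodes and decodes with one charset, which is what new String(s.getBytes())
    // does implicitly with the platform default charset.
    static String roundTrip(String s, Charset charset) {
        return new String(s.getBytes(charset), charset);
    }

    // Prints the text through a PrintStream with an explicit encoding and
    // returns the bytes that were written.
    static byte[] printAs(String s, String encoding) throws Exception {
        ByteArrayOutputStream buffer = new ByteArrayOutputStream();
        PrintStream stream = new PrintStream(buffer, true, encoding);
        stream.print(s);
        stream.flush();
        return buffer.toByteArray();
    }

    public static void main(String[] args) throws Exception {
        // Same characters as the attribute value in the post, written as escapes.
        String value = "\u00c1 \u00e1 \u0104 \u0105 \u00e4 \u00c9 \u00e9 \u0118";

        // Lossy: three of the characters do not exist in ISO-8859-1.
        System.out.println(roundTrip(value, StandardCharsets.ISO_8859_1).equals(value));
        // Lossless: UTF-8 can represent every character.
        System.out.println(roundTrip(value, StandardCharsets.UTF_8).equals(value));

        // Writing to the console with an explicit encoding; the terminal must
        // also be set to display UTF-8 for this to look right.
        PrintStream utf8Out = new PrintStream(System.out, true, "UTF-8");
        utf8Out.println(value);
    }
}
```

If this is the cause, the parsed DOM value is already correct and the extra `new String(...getBytes())` wrapper can simply be dropped; only the output stream needs an encoding that covers the characters.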
Special characters in XML barcode content
Hello,
I made a barcoded form with a custom script that creates a custom XML as barcode content.
The decoding works well when the user writes plain text in the text fields, but whenever they input special characters (for XML syntax), like ",<,>,=, etc., the content of the barcode is decoded as:
<barcode>
<!CDATA[... true content ...]>
</barcode>
how can I handle this situation?
I have to handle what the user writes or I have to change the decode activity?
Thank you very much for your support!
Fabio
Steve,
I have already encoded decode operation in UTF-8. In form level, because it is an acrobat form, no option to choose the encoding as in LC Designer. In further tests, if I change the extractToXML output to XDP instead of XFDF, then I will receive data rather than &# sequence. It is strange. Don't understand why XDP and XFDF would give out different encoding.
Tim -
Remove special characters from incoming data
Hi Varun, you could use either of the expressions below.
REG_REPLACE(YOUR_INPUT_STR, '[^A-Za-z0-9 ]', NULL) -- Replaces all non-alphanumeric characters with null
REG_EXTRACT(YOUR_INPUT_STR, '[A-Za-z0-9 ]') -- Extracts only alphanumeric data
-Rajani
I have special characters coming in the source data and I want to remove them before loading into the target. Currently I am getting one special character, but other types of non-alphanumeric characters may also come. So how can I remove those special characters from the data and load only the alphanumeric data into the target?
-
How to replace special characters in string.
Hello,
I want to replace special characters such as , or ; with any other character or " ". I found that there is no such function in Java for this. There is only replace(), but it accepts only chars. Does anybody know how to do this?
Thanks,
Can't you just do the following?
public class Test
{
    public static void main(String[] args)
    {
        String testString = "Hi, there?";
        System.out.println(testString.replace(',',' '));
    }
} -
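When several different characters need replacing at once, `String.replaceAll` with a regular-expression character class is one option; `String.replace(CharSequence, CharSequence)` is also available since Java 5 for plain (non-regex) string replacement. A small sketch, with the class and method names invented for illustration:

```java
import java.util.regex.Matcher;

class ReplaceDemo {
    // Replaces every ',' or ';' in the input with the given replacement text.
    // quoteReplacement keeps '$' and '\' in the replacement from being
    // interpreted as regex group references.
    static String replacePunctuation(String input, String replacement) {
        return input.replaceAll("[,;]", Matcher.quoteReplacement(replacement));
    }

    public static void main(String[] args) {
        System.out.println(replacePunctuation("Hi, there; friend", "_"));
        // Plain string replacement without regular expressions:
        System.out.println("a;b;c".replace(";", " "));
    }
}
```

The character class `[,;]` can be extended with whatever other characters need to be replaced.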
Preview's "Import from Scanner" documents -Text Tool cannot be activated.
Any documents created within 10.6 and printed/saved as a pdf file can be opened in Preview and the Text Tool activated allowing selection of text and copy/paste or text highlight actions made.
Using the Preview application in OS X 10.6.7 with File > Import from Scanner (Epson V500 scanner), I cannot activate the Text Tool on the scanned document. Only the Move and Select tools can be selected, activated and used. It's as if, with Format-PDF selected, Preview stores an image scan of the document in the PDF file rather than using OCR to produce editable text.
I can scan a printed document using Acrobat and my Epson scanner and the text tool within Acrobat or Adobe Reader works fine.
Is this a shortcoming within Preview using Format-PDF that the Text tool cannot be activated or used on Preview scanned documents containing text? If so why is Preview's text tool even offered?
If the Text Tool can be used on a Preview scanned document… How? Please!
After further research I have been able to answer my own question regarding
"Preview's "Import from Scanner" documents -Text Tool cannot be activated."
Apple's Mac 101 Support article entitled "Using a scanner (Mac 10.6)", last modified on March 15, 2011, states: "Item 6. Click Scan to scan. Note: The scanned items will become JPEG images with 300 dpi"
So there it is…Any item (or printed document) scanned under Preview using the selected Format-PDF will be scanned as a JPEG image.
That's why the Text Tool cannot be activated: the document is not being scanned as text but as an image.
If a document is created within the computer via Pages, Text Edit, Word etc. and saved using the "Print-PDF-Save as PDF" file option, the already digitized text is saved within the PDF file, and the PDF file can then be opened in Preview with the Text Tool available to copy/paste or highlight.
Charlie...76 years old and still Mac'ing -
Vbscript to rename files and replace special characters
Dear Expert,
Would you please help me add an additional requirement to rename files and replace special characters in the file names?
With the script below I can rename.
strAnswer = InputBox("Please enter folder location to rename files:", _
"File rename")
strfilenm = InputBox("Enter name:", _
"Rename Files")
Set FSO = CreateObject("Scripting.FileSystemObject")
Sub visitFolder(folderVar)
For Each fileToRename In folderVar.Files
fileToRename.Name = strfilenm & fileToRename.Name
Next
For Each folderToVisit In folderVar.SubFolders
visitFolder(folderToVisit)
Next
End Sub
If FSO.FolderExists(strAnswer) Then
visitFolder(FSO.getFolder(strAnswer))
End If
Thanks. Would you please look at the script below and tell me what is wrong? When it runs, nothing happens and there is no error.
strAnswer = InputBox("Please enter folder location to rename files:", _
"Test")
strfilenm = InputBox("Enter name:", _
"Rename Files")
Set FSO = CreateObject("Scripting.FileSystemObject")
Set regEx = New RegExp
'Your pattern here
Select Case tmpChar
Case "&"
changeTo = " and "
Case "/"
changeTo = "_"
Case Else
changeTo = " "
End Select
regEx.Pattern = tmpChar
Sub visitFolder(folderVar)
For Each fileToRename In folderVar.Files
fileToRename.Name = strfilenm & fileToRename.Name
fileToRename.Name = regEx.Replace(fileToRename.Name, tmpChar)
Next
For Each folderToVisit In folderVar.SubFolders
visitFolder(folderToVisit)
Next
End Sub
-
How do I add special characters from my favorites to text?
I have the new version of Numbers 3.5. In previous versions if I wanted to add foreign language symbols to the text, I only had to click my mouse from the "special characters" and it was added to the word I was typing. Today I am not able to use my Spanish vocabulary because I cannot add the necessary symbols. How do I use this feature in my document text?
Input from the Character Viewer normally requires double-clicking or drag/drop.
The special characters needed for Spanish are also available by just holding down the key for the base letter. If you hold down the n key, you should get a popup menu where you can choose ñ.
There are also option key shortcuts for such letters. Option n then n will give you ñ.