Html numeric entity to unicode

The HTML numeric entity "&#163;" (the pound sign) is equivalent to \u00A3.
Given "&#163;", how do I get its Unicode character?
Thanks
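A minimal sketch of one way to do this in Java, assuming the input contains only decimal (&#163;) or hexadecimal (&#xA3;) references. The class name NumericEntityDecoder and its decode method are made up for illustration; they are not part of any standard API.

```java
import java.util.regex.Matcher;
import java.util.regex.Pattern;

// Hypothetical helper: replaces numeric character references with the
// characters they denote. Named entities such as &pound; are left alone.
final class NumericEntityDecoder {
    // The trailing semicolon is optional so that "&#163" is accepted too.
    private static final Pattern REF =
            Pattern.compile("&#(?:[xX]([0-9a-fA-F]+)|([0-9]+));?");

    static String decode(String input) {
        Matcher m = REF.matcher(input);
        StringBuffer out = new StringBuffer();
        while (m.find()) {
            String replacement = m.group(); // fall back to the original text
            try {
                int codePoint = m.group(1) != null
                        ? Integer.parseInt(m.group(1), 16)
                        : Integer.parseInt(m.group(2), 10);
                if (Character.isValidCodePoint(codePoint)) {
                    // toChars yields a surrogate pair for code points above U+FFFF
                    replacement = new String(Character.toChars(codePoint));
                }
            } catch (NumberFormatException e) {
                // number too large for an int: keep the reference as written
            }
            m.appendReplacement(out, Matcher.quoteReplacement(replacement));
        }
        m.appendTail(out);
        return out.toString();
    }
}
```

For example, NumericEntityDecoder.decode("&#163;") returns a one-character string containing \u00A3.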

If I remove this invalid XML character before XML parsing, I can avoid the exception.
How do I make my program treat "& # 3 ;" (again, please remove the spaces) as a single (Unicode) character?

The problem is that XML does not permit most of the ASCII non-printing characters (Unicode code points 0 to 31; only tab, line feed, and carriage return are allowed), so the parser is correctly rejecting it. Unfortunately, a lot of XML output tools, including the one in the Sun JDK, simply escape these characters instead of rejecting them.
The solution is to pre-process the input: search for all occurrences of the invalid character and remove them before the XML parser touches the file.
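The pre-processing step could look roughly like the sketch below. It assumes the whole document fits in a String and that plain text substitution is acceptable (it does not special-case CDATA sections or comments). XmlCharFilter and its methods are hypothetical names; the allowed ranges follow the Char production of XML 1.0.

```java
import java.util.regex.Matcher;
import java.util.regex.Pattern;

// Hypothetical pre-processor: removes characters, and numeric references to
// characters, that XML 1.0 does not allow.
final class XmlCharFilter {
    private static final Pattern REF =
            Pattern.compile("&#(?:[xX]([0-9a-fA-F]+)|([0-9]+));");

    // True for code points permitted by the XML 1.0 Char production.
    static boolean isXml10Char(int cp) {
        return cp == 0x9 || cp == 0xA || cp == 0xD
                || (cp >= 0x20 && cp <= 0xD7FF)
                || (cp >= 0xE000 && cp <= 0xFFFD)
                || (cp >= 0x10000 && cp <= 0x10FFFF);
    }

    static String stripInvalid(String xml) {
        // Pass 1: drop references such as &#3; that point at forbidden code points.
        Matcher m = REF.matcher(xml);
        StringBuffer kept = new StringBuffer();
        while (m.find()) {
            boolean legal;
            try {
                int cp = m.group(1) != null
                        ? Integer.parseInt(m.group(1), 16)
                        : Integer.parseInt(m.group(2), 10);
                legal = isXml10Char(cp);
            } catch (NumberFormatException e) {
                legal = false; // value does not fit in an int, so it is not a valid character
            }
            m.appendReplacement(kept, legal ? Matcher.quoteReplacement(m.group()) : "");
        }
        m.appendTail(kept);

        // Pass 2: drop forbidden characters that appear literally.
        StringBuilder out = new StringBuilder(kept.length());
        for (int i = 0; i < kept.length(); ) {
            int cp = kept.codePointAt(i);
            if (isXml10Char(cp)) {
                out.appendCodePoint(cp);
            }
            i += Character.charCount(cp);
        }
        return out.toString();
    }
}
```

The cleaned string can then be handed to the parser, for instance through a StringReader wrapped in an InputSource.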

Similar Messages

HTML Character entity references on SQLQuery

I am trying to retrieve data through XMLElement and would like the output to use HTML character entity references. I assumed XMLElement would do this with the proper character set translations, but I have not been successful in getting it right. Could you please help me out? My DB character set is UTF-8.
For example, the "acute" e needs to be translated to its hexadecimal equivalent. I tried setting the mid-tier client's NLS settings, but with no success. I don't want to scan each character and convert it myself.
create table master.temp_xml_encode
(party_name_id NUMBER(15),
party_id NUMBER(15),
party_name VARCHAR2(200) );
PARTY_NAME_ID PARTY_ID PARTY_NAME
3831587 5496840 The West Company México, S.A. de C.V.
3844362 5496730 Schiønning & Arvé A/S
3847940 5496836 West Rubber de España, S.A.
4047634 5983166 Timberland España, S.L.
4266163 5983166 Timberland España, S.A.
4285954 6482794 The Young Women¿s Christian Association of Central New Jersey
SELECT XMLELEMENT("party_id", party_id,
xmlforest(party_name_id AS "partynameid",
party_name AS "partyname"))
FROM master.temp_xml_encode
Thanks for your help

And Yes forgot to add one thing, the index.jsp is a part of the application supplied by vendor and I do not have src of the struts actions (no control on server side code).
I need to find a solution from client's side.

Generating PDF-files from HTML-page saved as Unicode?

I have followed this Quick Start on how to generate a PDF-file from HTML using web services in .NET: http://livedocs.adobe.com/livecycle/8.2/programLC/programmer/help/000093.html
It works just fine when the HTML page is saved as ANSI, but when it's saved as Unicode I get a problem. The code runs without errors but the PDF looks really strange. Any suggestions on how to solve this? I really need to use Unicode as my application needs to handle different languages (including, for example, Chinese).

I found out that UTF-8 worked as well so the problem is solved. :-)

Convert xml referenced entity to unicode?

Hi,
Anyone know how one converts all XML Referenced Entities, of the form "&#xxx;", which appear in a String object, to their unicode representation?
For example:
If String A contains the phrase "two letters: &#193; &#225;", I would like to get a new String which contains the phrase "two letters: Á á"

nobody71 wrote:
> I can write some code to do the conversion by parsing the string, finding each and every XML reference and then convert each to unicode... I know how to do that...
Then go ahead and do it.
> but there must already be something that does this in Java's standard API/library...
Actually there isn't.
> For example, when this type of XML entity reference is found in the text of an XML node, and this node is read into a java DOM object using java's API/library, there is some code somewhere which does the conversion I am inquiring about, because when I view the String representing that text, it now contains the unicode characters. So, there must be a quick and already existing way to do this.
The code must exist somewhere, yes. It doesn't follow that the code must be encapsulated in a public method. It's a specialized requirement of XML parsers so there's really no need to make it available outside the parsers where it exists.
> I guess in the time it took me to write this post I could have written the converter... :-(
Why do you need to do that, anyway? Why not just let an XML parser do it for you?
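The "let an XML parser do it" suggestion might be sketched as follows, using the JAXP DOM API from the standard library. The wrapper element name d and the class ParserUnescape are arbitrary choices for this illustration, and the approach assumes the string is well-formed XML content: a bare & or <, or an HTML-only named entity such as &eacute;, will make the parse fail.

```java
import java.io.StringReader;
import javax.xml.parsers.DocumentBuilder;
import javax.xml.parsers.DocumentBuilderFactory;
import org.w3c.dom.Document;
import org.xml.sax.InputSource;

// Hypothetical helper: wraps the text in a dummy element and lets the
// XML parser resolve the character references.
final class ParserUnescape {
    static String unescape(String fragment) {
        try {
            DocumentBuilder builder =
                    DocumentBuilderFactory.newInstance().newDocumentBuilder();
            Document doc = builder.parse(
                    new InputSource(new StringReader("<d>" + fragment + "</d>")));
            // getTextContent returns the text with references already resolved
            return doc.getDocumentElement().getTextContent();
        } catch (Exception e) {
            throw new IllegalArgumentException(
                    "Not well-formed XML content: " + fragment, e);
        }
    }
}
```

Because a new parser is built on every call, this is convenient rather than fast; a hand-written converter is likely cheaper for large volumes of text.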

Easy accent (special character) translation to html numeric codes with xdk.

Hi, I have an XML document (obtained via XSQL) with the correct special characters (Spanish accents in this case). I process it in a servlet like this:
XSQLRequest req = new XSQLRequest(pageUrl);
XMLDocument xsqlDoc = (XMLDocument)req.processToXML(params);
XSLProcessor processor = new XSLProcessor();
processor.setBaseURL(xslUrl);
XSLStylesheet sheet = processor.newXSLStylesheet(xslUrl);
processor.processXSL(sheet, xsqlDoc, out);
This works fine; I get a pretty XML document like:
<BLOQUE_B>
<ROW num="1">
<TITULO>Trabajos Topográficos Catastrales.</TITULO>
<TITULO_RESTO>Polígono núm. 6</TITULO_RESTO>
<RESPONSABILIDAD>Servicio de Catastro Topográfico Parcelario;</RESPONSABILIDAD>
<ESCALA>Escala 1:5000</ESCALA>
<FRASE_INTRODUC>Versión original</FRASE_INTRODUC>
<DESCRIPCION_FISICA>1 plano</DESCRIPCION_FISICA>
</ROW>
</BLOQUE_B>
The problem comes with the special characters (accents), which I need to output as HTML numeric codes:
á --> &#225;
é --> &#233;
etc ...
I've tried XDK 10 (XSL v2), using the new character-map function, but it didn't work (with version="2.0" in the stylesheet and processor.setXSLTVersion(oracle.xml.parser.v2.XSLProcessor.XSLT20)
in the servlet; see the example on xml.com: http://www.xml.com/pub/a/2004/06/02/tr-xml.html).
With XDK 9 (XSL v1), I tried a string-replace template function (like the one at http://www.dpawson.co.uk/xsl/sect2/StringReplace.html), and it seems to work.
Is there an easier way to do this?
Why didn't the character map work with XDK 10 (XSL v2)?
Thanks in advance
Felipe Llano

I found a solution without using XSL, just with the JTidy library. As I need the translation in a servlet, I did it with just a few lines like:
PrintWriter out = response.getWriter();
XSQLRequest req = new XSQLRequest(pageUrl);
XMLDocument xsqlDoc = (XMLDocument)req.processToXML(params);
ByteArrayOutputStream out2 = new ByteArrayOutputStream();
ByteArrayOutputStream out3 = new ByteArrayOutputStream();
xsqlDoc.print(out2);
Tidy tidy = new Tidy();
tidy.setXmlTags(true);
tidy.setXmlOut(true);
tidy.parse(new ByteArrayInputStream(out2.toByteArray()), out3);
out.print(out3.toString() );
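If pulling in a library is not an option, the same translation can be approximated by hand once the document has been serialized to a String. This is only a sketch under that assumption; NumericEntityEncoder is a made-up name, and the method leaves all ASCII (including the markup itself) untouched.

```java
// Hypothetical helper: rewrites every non-ASCII character as a decimal
// numeric character reference, e.g. á becomes &#225;.
final class NumericEntityEncoder {
    static String encode(String text) {
        StringBuilder out = new StringBuilder(text.length());
        for (int i = 0; i < text.length(); ) {
            int cp = text.codePointAt(i);
            if (cp < 0x80) {
                out.append((char) cp);          // plain ASCII passes through
            } else {
                out.append("&#").append(cp).append(';');
            }
            i += Character.charCount(cp);       // step over surrogate pairs as one unit
        }
        return out.toString();
    }
}
```

Iterating by code point rather than by char means a character outside the Basic Multilingual Plane becomes one reference instead of two broken halves.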

Weblogic 6.1 to Weblogic 9.2 : Migration Issue

Hi,
We have migrated our application from WebLogic 6.1 to WebLogic 9.2.
However, there is one difference in how WebLogic 6.1 and WebLogic 9.2 handle Unicode data.
The difference is:
In WebLogic 6.1, when a user enters a Unicode string in a JSP, it is sent to the servlet as is, with no entity conversion.
In WebLogic 9.2, when a user enters a Unicode string in a JSP, it is sent to the servlet in HTML numeric entity format.
This creates issues because WebLogic 6.1 stores the Unicode data in the Informix database as is, without any conversion,
whereas WebLogic 9.2 converts the Unicode data to HTML numeric entity form, and that form is saved in the database.
So when a user tries to retrieve a multilingual string that was saved in the database using WebLogic 6.1, he is not able to view it correctly,
whereas a multilingual string that was saved using WebLogic 9.2 is displayed correctly.
Has anybody faced this issue before? What would be the right approach to solve this problem?
1. Can we instruct WebLogic 9.2 not to convert multilingual strings to HTML numeric entity form?
2. Since the JSP expects HTML numeric entity form for displaying data, it cannot display the unconverted multilingual data properly. Can we instruct the JSP not to expect HTML numeric entities, so that it displays multilingual data properly?
Kindly help.
Regards,
Mayank

Hi,
We have done some more analysis for the above issue:
Our application is deployed on two different WebLogic servers, as mentioned in the first post: WebLogic 6.1 and WebLogic 9.2.
One thing that we noticed is:
The application deployed on WebLogic 6.1 stores multilingual data exactly as entered by the user, i.e. it does not encode the data,
whereas the application deployed on WebLogic 9.2 stores the data in encoded format.
So data entered through WebLogic 9.2 is displayed correctly on WebLogic 9.2,
but data entered through WebLogic 6.1 is stored as is and is not displayed correctly on WebLogic 9.2.
For Example :
Suppose the multi lingual string entered is : Уважаемые Господа
So with application deployed on Weblogic 6.1 server it would be stored as : Уважаемые Господа
while with application deployed on Weblogic 9.2 it would be stored as : &# 1059;&# 1074;а&# 1078;&# 1072;&# 1077;&# 1084;&# 1099;&# 1077; &# 1043;&# 1086;&# 1089;&# 1087;&# 1086;&# 1076;&# 1072;
Actually there is no space between &# and the numbers, but since it was not displaying properly I had to put a space in between.
So when we view, on WebLogic 9.2, a string that was added by the application deployed on WebLogic 6.1, it is not displayed properly.
I hope this information helps in finding a solution.
Please help.
Regards,
Mayank
Edited by: user10423960 on Oct 15, 2008 5:43 AM

How to include "@someword" verbatim in a documentation?

I fear this is quite a beginner's question...
The text of a class documentation must contain words that start with an "@" symbol because this is the kind of data the class processes. How can I prevent the error message "@someword is an unknown tag"?
(This is unfortunately not mentioned in the FAQ ... )
Thank you for your patience!

Use an HTML numeric entity! See also:
http://forum.java.sun.com/thread.jspa?threadID=729598
Leonid Rudy
http://www.docflex.com
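As an illustration of the numeric-entity suggestion, the doc comment below writes the at sign as &#64;, so the Javadoc tool never sees a literal @ at the start of a line. The class MentionParser is hypothetical and exists only to carry the comment. Javadoc's inline {@literal} tag (available since Java 5) is an alternative that keeps the source more readable.

```java
/**
 * Recognizes tokens of the kind this class processes, for example
 * &#64;someword (rendered in the generated HTML as an at sign followed by "someword").
 * <p>
 * Equivalent spelling using an inline tag: {@literal @someword}.
 */
final class MentionParser {
    /** Returns true if the token starts with an at sign and has a name after it. */
    static boolean isMention(String token) {
        return token.length() > 1 && token.startsWith("@");
    }
}
```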

Displaying unicode or HTML escaped characters from HTTPService in Flex components.

Here is a solution on the Flex Cookbook I developed for displaying data in Flex components when the data comes back from HTTPService as Unicode or HTML-escaped data: Displaying unicode or HTML escaped characters from HTTPService in Flex components.

Hi again Greg,
I have just been adapting your idea to handle occasional escaped characters within a body of "normal" text, e.g. something like
hell&ocirc; sun&scaron;ine
Now, the handy String.fromCharCode(charCode) call works a dream if instead of the above I have
hell&#244; sun&#353;ine
Do you know if there is an equivalent call that takes the named entities rather than the numeric ones? Clearly I can just do some text substitution to get the mapping, but this means rather more by-hand work than I had hoped. However, this is definitely a step in a useful direction for me.
Thanks,
Richard
PS hoping that the web page won't simply outguess me and replace all the above! Basically, the first line uses named entities and the second the equivalent numbers...
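The text-substitution route mentioned above amounts to a lookup table from entity name to code point. The sketch below shows the idea in Java rather than ActionScript, with a deliberately tiny table; NamedEntityDecoder is a made-up name, and a real table would need every name from the HTML entity list.

```java
import java.util.HashMap;
import java.util.Map;
import java.util.regex.Matcher;
import java.util.regex.Pattern;

// Hypothetical helper: resolves a handful of named entities via a lookup table.
final class NamedEntityDecoder {
    // Illustrative subset only. Names are case-sensitive: &Scaron; (U+0160)
    // is a different character from &scaron; (U+0161).
    private static final Map<String, Integer> NAMED = new HashMap<>();
    static {
        NAMED.put("amp", 0x26);
        NAMED.put("lt", 0x3C);
        NAMED.put("gt", 0x3E);
        NAMED.put("copy", 0xA9);
        NAMED.put("ocirc", 0xF4);
        NAMED.put("scaron", 0x161);
    }

    private static final Pattern NAME = Pattern.compile("&([A-Za-z][A-Za-z0-9]*);");

    static String decode(String input) {
        Matcher m = NAME.matcher(input);
        StringBuffer out = new StringBuffer();
        while (m.find()) {
            Integer codePoint = NAMED.get(m.group(1));
            // Unknown names are left exactly as they were written.
            String replacement = codePoint == null
                    ? m.group()
                    : new String(Character.toChars(codePoint));
            m.appendReplacement(out, Matcher.quoteReplacement(replacement));
        }
        m.appendTail(out);
        return out.toString();
    }
}
```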

[iPhone] Any built in way to convert HTML entities to Unicode?

I have a string with contents something like:
"&copy; 2008"
Is there some method that I can't seem to find that will convert this to:
"© 2008"
Basically is there something built in to convert all of the '&xxx;' HTML entities to their Unicode counterpart? I can write my own code to do it but I want to check here first.
Thanks.

I had the same problem and found only a semi-built-in solution using NSXMLParser:
@interface MREntitiesConverter : NSObject {
    NSMutableString *resultString;
}
@property (nonatomic, retain) NSMutableString *resultString;
- (NSString *)convertEntiesInString:(NSString *)s;
@end
@implementation MREntitiesConverter
@synthesize resultString;
- (id)init {
    if ((self = [super init])) {
        resultString = [[NSMutableString alloc] init];
    }
    return self;
}
// NSXMLParser delegate callback: collects the text the parser reports.
- (void)parser:(NSXMLParser *)parser foundCharacters:(NSString *)s {
    [self.resultString appendString:s];
}
- (NSString *)convertEntiesInString:(NSString *)s {
    if (s == nil) {
        NSLog(@"ERROR : Parameter string is nil");
        return nil;
    }
    // Wrap the text in a dummy element so it parses as an XML document.
    NSString *xmlStr = [NSString stringWithFormat:@"<d>%@</d>", s];
    NSData *data = [xmlStr dataUsingEncoding:NSUTF8StringEncoding allowLossyConversion:YES];
    NSXMLParser *xmlParse = [[NSXMLParser alloc] initWithData:data];
    [xmlParse setDelegate:self];
    [xmlParse parse];
    [xmlParse release];
    NSString *returnStr = [[NSString alloc] initWithFormat:@"%@", resultString];
    return returnStr;
}
- (void)dealloc {
    [resultString release];
    [super dealloc];
}
@end
In Cocoa (Core Foundation) there is
NSString* sI = (NSString*)CFXMLCreateStringByUnescapingEntities(NULL, (CFStringRef)s, NULL);
but that does not (yet?) exist on the iPhone (2.01)

Numeric value of a Character

What is the difference between the numeric value and the Unicode value of a character?
Character.getNumericValue('a'); --> returns 10
Character.getNumericValue('A'); --> returns 10
System.out.println(((int)'a')); ---> outputs 97
System.out.println(((int)'A')); ---> outputs 65
Can any one help me understand?

The API is your friend :) http://java.sun.com/j2se/1.4.1/docs/api/java/lang/Character.html#getNumericValue(char)
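To make the distinction concrete: Character.getNumericValue interprets the character as a digit (letters a to z count as 10 to 35 regardless of case, and -1 means "no numeric value"), while the cast to int yields the character's UTF-16 code unit. A small sketch, with NumericValueDemo as an illustrative name:

```java
// Hypothetical demo class contrasting the two notions of "value".
final class NumericValueDemo {
    // Returns { digit value per Character.getNumericValue, UTF-16 code unit }.
    static int[] describe(char c) {
        return new int[] { Character.getNumericValue(c), (int) c };
    }

    static String format(char c) {
        int[] v = describe(c);
        return "'" + c + "': numeric value " + v[0] + ", code unit " + v[1];
    }
}
```

Calling NumericValueDemo.format('a') gives "'a': numeric value 10, code unit 97", and format('A') gives "'A': numeric value 10, code unit 65".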

Displaying Numeric Character Entities

I'm having a problem displaying numeric character entities such as "&#151;" (m-dash) in my Java application. I have noticed that some characters show up correctly and some do not; for example, "&#125;" (right curly brace) shows up just fine. I'm having the same problem with Japanese numeric character entities such as "&#24335;". The encoding that is being used here is UTF-8.
I am reading the text that contains these entities from a XML file using the following code:
    DocumentBuilderFactory dbf = DocumentBuilderFactory.newInstance();
    dbf.setNamespaceAware(true);
    dbf.setIgnoringComments(true);
    dbf.setIgnoringElementContentWhitespace(true);
    try {
      FileInputStream fff = new FileInputStream(theFile);
      InputSource inSource = new InputSource(new InputStreamReader(fff,"UTF-8"));
      DocumentBuilder db = dbf.newDocumentBuilder();
      Document doc = db.parse(inSource);
    } catch (Exception e) {
      System.out.println("Exception: " + e);
    }
Here is a simple example of an XML file:
<?xml version="1.0" encoding="UTF-8"?>
<toc name="Course One" file="course1/toc.xml">
<topic name="Topic &#151; One" file="course1/source/topic1.html"/>
<topic name="&#24335; (Japanese char)" file="course1/source/topic2.html"/>
</toc>
When I print out the contents of the name attribute I get boxes (or a ?) in place of the character entity. Any help that can be offered here would be greatly appreciated.
Thanks,
David

I just realized that the entities I put in my post were resolved by the browser, so they didn't show up as the actual numeric entity codes as I intended: (& + #151;) for the m-dash, (& + #125;) for the curly brace, and (& + #24335;) for the Japanese character. The XML file looks like this:
<?xml version="1.0" encoding="UTF-8"?>
<toc name="Course One" file="course1/toc.xml">
<topic name="Topic &#151; One" file="course1/source/topic1.html"/>
<topic name="&#24335; (Japanese char)" file="course1/source/topic2.html"/>
</toc>
</toc>
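One possible explanation for the m-dash, offered as an assumption rather than a confirmed diagnosis: numeric character references name Unicode code points, and code point 151 (U+0097) is a C1 control character. Byte 0x97 is an em dash only in the Windows-1252 encoding; in Unicode the em dash is U+2014, i.e. &#8212;. The sketch below (DashCheck is a made-up name) uses java.lang.Character to show how Java classifies the code points from the post. For the Japanese character, which is a valid code point, a box more likely points at a font or output-encoding issue, but that too is a guess.

```java
// Hypothetical helper: reports how Java classifies a code point.
final class DashCheck {
    static String classify(int codePoint) {
        if (Character.isISOControl(codePoint)) {
            return "control";   // U+0000..U+001F and U+007F..U+009F
        }
        if (Character.getType(codePoint) == Character.DASH_PUNCTUATION) {
            return "dash";
        }
        return "other";
    }
}
```

DashCheck.classify(151) returns "control", whereas DashCheck.classify(8212) returns "dash".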

Cannot display Unicode in Netscape

Hi gurus,
I used the following html to display some unicode in browser:
<p>
&#36013;&#21024;&#61103;
</p>
<form name="logonForm" method="POST" action="/something/action.do">
<input type="text" name="userId" maxlength="24" size="25" value="&#36013;&#21024;&#61103;">
</form>
IE displays this correctly, but Netscape does not.
Please help!
Note: the three unicodes are HKSCS.
Thanks!

Netscape 4.7x has difficulty displaying numeric character references (see http://www.alanwood.net/unicode/htmlunicode.html for more info). Either try inserting the meta tag <meta http-equiv="Content-Type" content="text/html; charset=UTF-8"> into your .html file, or upgrade to 6.x.

How to get numeric data from a string using t-sql

Hi All,
I have a table with 2 columns ID as Int and Message as nvarchar(max)
CREATE TABLE Sample
(
ID int NOT NULL,
Message nvarchar(max) NULL,
CONSTRAINT [PK_ID_Msg] PRIMARY KEY CLUSTERED
(ID ASC)
)
Insert statements:
INSERT INTO Sample (ID, Message)
VALUES (1, 'X_YRS: 00 ; X_MONS: 18 ; X_DAYS: 000 ; Y_YRS: 00 ; Y_MONS: 16 ; Y_DAYS: 011 ; Z: 1 ; Z_DATE: 09/04/2014')
INSERT INTO Sample (ID, Message)
VALUES (2, 'X_YRS: 01 ; X_MONS: 15 ; X_DAYS: 010 ; Y_YRS: 00 ; Y_MONS: 18 ; Y_DAYS: 017 ; Z: 1 ; Z_DATE: 06/02/2012')
Data in the table looks like:
ID             Message
1       X_YRS: 00 ; X_MONS: 18 ; X_DAYS: 000 ; Y_YRS: 00 ; Y_MONS: 16 ; Y_DAYS: 011 ; Z: 1 ; Z_DATE: 09/04/2014
2       X_YRS: 01 ; X_MONS: 15 ; X_DAYS: 010 ; Y_YRS: 00 ; Y_MONS: 18 ; Y_DAYS: 017 ; Z: 1 ; Z_DATE: 06/02/2012
I need the output as below, with just the numeric data:
ID      X-Column         Y-Column
1       00 18 000        00 16 011
2       01 15 010        00 18 017
So, please, I need T-SQL to get the above output.
Thanks in advance.
RH

;WITH CTE
AS
(
SELECT s.ID,RTRIM(LTRIM(STUFF(Val,1,CHARINDEX(':',Val),''))) AS Val,RTRIM(LTRIM(LEFT(Val,CHARINDEX('_',Val+'_')-1))) AS Pattern,
--ROW_NUMBER() OVER (PARTITION BY LEFT(Val,CHARINDEX('_',Val+'_')-1) ORDER BY RTRIM(LTRIM(STUFF(Val,1,CHARINDEX(':',Val),'')))*1)
f.ID AS Seq
FROM Sample s
CROSS APPLY dbo.ParseValues(s.[Message],';')f
WHERE ISNUMERIC(RTRIM(LTRIM(STUFF(Val,1,CHARINDEX(':',Val),'')))+'0.0E0')=1
)
SELECT ID,
STUFF((SELECT ' ' + Val FROM CTE WHERE ID = c.ID AND Pattern = 'X' ORDER BY Seq FOR XML PATH('')),1,1,'') AS XCol,
STUFF((SELECT ' ' + Val FROM CTE WHERE ID = c.ID AND Pattern = 'Y' ORDER BY Seq FOR XML PATH('')),1,1,'') AS YCol,
STUFF((SELECT ' ' + Val FROM CTE WHERE ID = c.ID AND Pattern = 'Z' ORDER BY Seq FOR XML PATH('')),1,1,'') AS ZCol
FROM (SELECT DISTINCT ID FROM CTE)c
ParseValues can be found here
http://visakhm.blogspot.in/2010/02/parsing-delimited-string.html
Numeric check logic is as per below
http://visakhm.blogspot.in/2014/03/checking-for-integer-or-decimal-values.html
Please Mark This As Answer if it helps to solve the issue Visakh ---------------------------- http://visakhm.blogspot.com/ https://www.facebook.com/VmBlogs

Quote transformed into ' (quote html code)

Hi all,
I try to translate an dbms_xmlapi.DOMDocument with xslt.
My problem is that every quote ' is translated to its entity: &apos;.
French accented characters cause problems too.
Could you help me? I'd like to know if it's possible to disable this translation.
Benoit Pironet

Not currently. You would need to specify the output as text, but this is not currently supported by the database XSLT engine.

Cannot type/paste normal international text when non-Unicode in CS6

Hello,
In all versions of DW (up to CS3 which I have used) I had no problem pasting / typing HTML or text with international characters in Design View when the page is using a non-Unicode, yet international encoding (like Windows - Greek or ISO - Greek).
Now, DW CS6 auto converts all international chars typed/pasted in Design View to html entities (unless Page Encoding is Unicode).
For example, when the document has an encoding of:
<meta http-equiv="Content-Type" content="text/html; charset=windows-1253">
[ This is equal to Modify / Page Properties / Title/Encoding / Document Type (DTD): None & Encoding: Greek (Windows) ]
...in the past I was able to type/paste greek characters/text in Design view and they were retained as such (simple text) in Code view (and this is what we need now as well).
Yet, in DW CS6 such international chars/text (typed / pasted in Design view) are auto-switched to "&somechar;" entities which is not what should happen; this messes up all text. Design view shows the text correctly, but html source (Code View) does not retain international characters themselves, although it should, as long as the html page is using a proper encoding/charset that allows compatible international text to be retained (e.g. greek encoding is compatible with greek characters). I repeat that this was working correctly at least until DW CS3.
Directly typing/pasting in DW CS6 design view correctly (i.e. retaining the original chars in code view) works ONLY when using Unicode.
However, if we type/paste greek text (with html tags or not) directly in Code view, then DW CS6 retains chars/text properly and Design view displays everything properly too. Consequently, as a work-around, we can use the Code View to type/paste international text/html when not using Unicode (UTF-8) as the Page Encoding. But this makes our life more difficult.
So, has CS6 dropped support for typing/pasting international text/html directly in Design view, for non-Unicode international encodings?
Or something has changed and we need to configure some setting(s) so that the feature works properly? (I haven't been able to find any setting that might affect this behavior. I also played with Document Type (DTD) settings but I found these did not affect the described behavior.)
Please advise. This is very important.
Thanks,
Nick
Message was edited by: JJ-GR

Thanks for the reply.
As I have already mentioned, typing/pasting in Code View works properly.
However, in previous versions of DW, pasting/typing in Design View was working fine, whatever the page encoding.
I agree that pasting in Code View is not really a big deal. But having to do all editing/typing in Code View definitely is! What is the point of using a WYSIWYG editor, if it can't produce correct source code (except in Unicode pages)? If we are going to do all editing in Code View, then we could simply use notepad (just an exaggeration) or other programming-oriented tool.
I hope other people can confirm the problem and suggest solutions or Adobe fixes it.
