Adding new character encoding to PBP
Is there any way to add a new character encoding in PBP 1.1?
I want it to support Japanese encodings.
1) Derive MyDateFormat from SimpleDateFormat, allowing only the default constructor.
2) Override the public void applyPattern(String pattern) method so that it detects the 'Q', replaces it with an easily identifiable pattern involving the month (say "MM"), and then calls the superclass applyPattern method.
3) Override the public StringBuffer format(Date date, StringBuffer toAppendTo, FieldPosition fieldPosition) method so that it first calls the superclass method to get the formatted output and then corrects this output by replacing (using regular expressions) the "01", "02", etc. with the appropriate quarter.
You might do better not to derive a new class from SimpleDateFormat at all, but instead create a class which uses SimpleDateFormat.
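That composition approach (a wrapper that post-processes SimpleDateFormat output) might look roughly like this; the class name, the '@' markers and the 'Q' pattern letter are my own illustrative choices, not part of java.text:

```java
import java.text.SimpleDateFormat;
import java.util.Date;
import java.util.regex.Matcher;
import java.util.regex.Pattern;

// Sketch of the wrapper approach described above. Note that a plain
// String.replace would also mangle any quoted 'Q' in the pattern; this
// sketch ignores that edge case for brevity.
public class QuarterFormat {
    public static String format(Date date, String pattern) {
        // Swap the custom 'Q' for a quoted marker around the month number
        SimpleDateFormat sdf = new SimpleDateFormat(pattern.replace("Q", "'@'MM'@'"));
        String out = sdf.format(date);
        // Replace each marked month (01..12) with its quarter (1..4)
        Matcher m = Pattern.compile("@(\\d{2})@").matcher(out);
        StringBuffer sb = new StringBuffer();
        while (m.find()) {
            int quarter = (Integer.parseInt(m.group(1)) - 1) / 3 + 1;
            m.appendReplacement(sb, String.valueOf(quarter));
        }
        m.appendTail(sb);
        return sb.toString();
    }
}
```

For a date in May 2020, format(date, "Q/yyyy") would yield "2/2020".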
Similar Messages
-
I've invented a Character encoding.
Advantages of this: ANSI ASCII compatible, Bitwise operations based, Self-synchronising, Abundant.
The yields of this encoding, against those of UTF-8, are:
Bits    This encoding                 UTF-8
        codes       cumulative        codes       cumulative
8       128         128               128         128
10      192         320               0           128
12      512         832               0           128
14      1280        2112              0           128
16      3072        5184              2048        2176
18      7168        12352             0           2176
20      16384       28736             0           2176
22      36864       65600             0           2176
24      81920       147520            65536       67712
26      180224      327744            0           67712
28      393216      720960            0           67712
30      851968      1572928           0           67712
32      1835008     3407936           2097152     2164864
This is a new Character encoding scheme (CES) that maps Unicode code points to bit sequences.
Could you please suggest improvements?
Please bear with me if the table is not formatted well for you, especially with a serif font. When reading a line of the table, the first value is the number of bits; the next pair (codes, cumulative total) is for this encoding, and the final pair is for UTF-8.
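Taking the UTF-8 column at face value: its per-length code counts are simply powers of two of the payload bits available in 1-, 2-, 3- and 4-byte sequences (7, 11, 16 and 21 bits respectively). A quick sketch to reproduce those figures (the class name is mine):

```java
public class Utf8Capacity {
    // Payload bits available in 1-, 2-, 3- and 4-byte UTF-8 sequences
    static final int[] PAYLOAD_BITS = {7, 11, 16, 21};

    // Number of distinct codes representable in an n-byte sequence
    public static long codesFor(int bytes) {
        return 1L << PAYLOAD_BITS[bytes - 1];
    }

    public static void main(String[] args) {
        long cumulative = 0;
        for (int n = 1; n <= 4; n++) {
            cumulative += codesFor(n);
            System.out.println(8 * n + " bits: " + codesFor(n)
                    + " codes, cumulative " + cumulative);
        }
    }
}
```

The cumulative totals (128, 2176, 67712, 2164864) match the UTF-8 column of the table.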
Sorry! I wish to provide more details on this but I'm restricted for some time. I hope that this does not stop you from assisting me.
Regards
Anbu
This encoding maintains almost all the properties of UTF-8 in a more compact format: ANSI ASCII compatibility, suitability for bitwise operations, self-synchronisation and abundance. Further, this encoding encodes all characters in far fewer bits than UTF-8, as shown in the table. As I mentioned earlier, I will provide more details and proofs soon. This post is a request for suggestions. Please suggest the most suitable place for this post.
Anbu

Which Adobe software or service is your inquiry in relation to?
-
Transaction code for adding new character
hi,
I am a new member of this group. I want to know the transaction code for adding a new character in BW.
please help me.
Thanks in advance,
Reena

Hi Reena,
When you say "adding a new character in BW", do you want to create a new characteristic (InfoObject) in BW, or add a new char to BW?
Modeling in BW is done in tcode RSA1: go to the relevant tab on the left screen (DataProviders, InfoObjects, etc.) and right-click to create objects.
hope it helps
regards
VC -
How to add a new character set encoding?
Hello,
Can anybody please explain to me how to add a new character set encoding to Mac OS Tiger?
I have two Mac laptops, a new one with Snow Leopard and an older one with Tiger, and on the old one I cannot use or enable anywhere the "Russian (DOS)" character set encoding, which I need to be able to use some old text files.
On Snow Leopard, this encoding is present in the list of available encodings of TextWrangler, but not in Tiger.
If I have understood correctly, this is not a problem of TextWrangler, and the same encodings are available systemwide.
So, the question is: how to add new encodings to Tiger (or to Mac OS in general)?
Thanks.

I think possibly that's in the Get Info window of Finder?
I don't think either that or the input menu have any effect on available encoding choices. Adding languages to system prefs/international/languages can do that, but once you have added Russian there, I don't know of any way to add an additional Russian encoding (there are quite a number of them). -
Is it possible to add new character set encodings?
Hello,
is it possible to add new character set encodings in Mac OS?
Practically, I need to add the "Russian (DOS)" encoding to Tiger, but is it at all possible even in newer versions of Mac OS?
Where can I find the missing encodings and how do I install them?
A Google search did not return much.
Thanks.

Well, it was a general question about the possibility of adding new encodings in Mac OS, but I posted a more specific question about Tiger here:
http://discussions.apple.com/thread.jspa?threadID=2692557&tstart=0
On Tiger I do not have Russian (DOS) in TextEdit, only Cyrillic (DOS).
The same applies to TextWrangler on Tiger.
Strangely, I do have Russian (DOS) in TeXShop on Tiger. -
JSF myfaces character encoding issues
The basic problem I have is that I cannot get the copyright symbol or the chevron symbols to display in my pages.
I am using:
myfaces 2.0.0
facelets 1.1.14
richfaces 3.3.3.final
tomcat 6
jdk1.6
I have tried a ton of things to resolve this including:
1.) creating a filter to set the character encoding to utf-8.
2.) overriding the view handler to force calculateCharacterEncoding to always return utf-8
3.) adding <meta http-equiv="content-type" content="text/html;charset=UTF-8" charset="UTF-8" /> to my page.
4.) setting different combinations of 'URIEncoding="UTF-8"' and 'useBodyEncodingForURI="true"' in tomcat's server.xml
5.) etc... like trying to set the encoding on an f:view, using f:verbatim, and specifying the escape attribute on some output components.
all with no success.
There is a lot of great information on BalusC's site regarding this problem (http://balusc.blogspot.com/2009/05/unicode-how-to-get-characters-right.html) but I have not been able to resolve it yet.
I have 2 test pages I am using.
If I put these symbols in a jsp (which does NOT go through the faces servlet) it renders fine and the page info shows that it is in utf-8.
<html>
<head>
<!-- <meta http-equiv="content-type" content="text/html;charset=UTF-8" /> -->
</head>
<body>
<br/>copy tag: ©
<br/>js/jsp unicode: ©
<br/>xml unicode: ©
<br/>u2460: \u2460
<br/>u0080: \u0080
<br/>arrow: »
<p />
</body>
</html>

If I put these symbols in an xhtml page (which does go through the faces servlet) I get the black diamond symbols with a ?, even though the page info says that it is in utf-8.
<!DOCTYPE html PUBLIC "-//W3C//DTD XHTML 1.0 Transitional//EN" "http://www.w3.org/TR/xhtml1/DTD/xhtml1-transitional.dtd">
<html xmlns="http://www.w3.org/1999/xhtml"
xmlns:ui="http://java.sun.com/jsf/facelets"
xmlns:f="http://java.sun.com/jsf/core"
xmlns:h="http://java.sun.com/jsf/html"
xmlns:rich="http://richfaces.org/rich"
xmlns:c="http://java.sun.com/jstl/core"
xmlns:a4j="http://richfaces.org/a4j">
<head>
<meta http-equiv="content-type" content="text/html;charset=UTF-8" charset="UTF-8" />
<meta http-equiv="X-UA-Compatible" content="IE=EmulateIE7" />
</head>
<body>
<f:view encoding="utf-8">
<br/>amp/copy tag: ©
<br/>copy tag: ©
<br/>copy tag w/ pound: #©
<br/>houtupt: <h:outputText value="©" escape="true"/>
<br/>houtupt: <h:outputText value="©" escape="false"/>
<br/>js/jsp unicode: ©
<br/>houtupt: <h:outputText value="©" escape="true"/>
<br/>houtupt: <h:outputText value="©" escape="false"/>
<br/>xml unicode: ©
<br/>houtupt: <h:outputText value="©" escape="true"/>
<br/>houtupt: <h:outputText value="©" escape="false"/>
<br/>u2460: \u2460
<br/>u0080: \u0080
<br/>arrow: »
<br/>cdata: <![CDATA[©]]>
<p />
</f:view>
</body>
</html>

On a side note, I have another application using myfaces 1.1, facelets 1.1.11, and richfaces 3.1.6, and the unicode symbols work fine there.
I had another developer try my test xhtml page in his Mojarra implementation, and it works fine there using facelets 1.1.14, but NOT with myfaces or richfaces.
I am convinced that somewhere between the view handler and the faces servlet the encoding is being set or reset, but I haven't been able to resolve it.
If anyone at all can point me in the right direction I would be eternally grateful.
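Those black diamonds are the classic symptom of UTF-8 bytes being decoded with a single-byte charset (or vice versa) somewhere in the response chain. A minimal illustration of the mismatch, independent of JSF:

```java
public class MojibakeDemo {
    public static void main(String[] args) throws Exception {
        String copyright = "\u00A9";                // the © symbol
        byte[] utf8 = copyright.getBytes("UTF-8");  // two bytes: 0xC2 0xA9
        // Decoding those UTF-8 bytes with the wrong charset yields two
        // characters ("Â©") instead of one; with strict decoders you get
        // replacement characters (the black diamonds) instead
        String wrong = new String(utf8, "ISO-8859-1");
        System.out.println(wrong.length());         // prints 2
    }
}
```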
thanks in advance.

UPDATE:
I was unable to get the page itself to consume the various options for unicode characters like the copyright symbol.
Ultimately the content I am trying to display is coming from a web service.
I resolved this issue by calling the web service from my backing bean instead of using ui:include on the webservice call directly in the page.
for example:
public String getFooter() throws Exception {
    HttpClient httpclient = new HttpClient();
    GetMethod get = new GetMethod(url);
    httpclient.executeMethod(get);
    String response = get.getResponseBodyAsString();
    return response;
}

I'd still love to have a solution to the page usage of the unicode characters, but for the time being this solves my problem. -
XML parser not detecting character encoding
Hi,
I am using Jdeveloper 9.0.5 preview and the same problem is happening in our production AS 9.0.2 release.
The character encoding of an xml document is not correctly being detected by the oracle v2 parser even though the xml declaration correctly contains
<?xml version="1.0" encoding="ISO-8859-1" ?>
instead it treats the document as UTF8 encoding which is fine until a document comes along with an extended character which then causes a
java.io.UTFDataFormatException: Invalid UTF8 encoding.
at oracle.xml.parser.v2.XMLUTF8Reader.checkUTF8Byte(XMLUTF8Reader.java:160)
at oracle.xml.parser.v2.XMLUTF8Reader.readUTF8Char(XMLUTF8Reader.java:187)
at oracle.xml.parser.v2.XMLUTF8Reader.fillBuffer(XMLUTF8Reader.java:120)
at oracle.xml.parser.v2.XMLByteReader.saveBuffer(XMLByteReader.java:448)
at oracle.xml.parser.v2.XMLReader.fillBuffer(XMLReader.java:2023)
at oracle.xml.parser.v2.XMLReader.tryRead(XMLReader.java:972)
at oracle.xml.parser.v2.XMLReader.scanXMLDecl(XMLReader.java:2589)
at oracle.xml.parser.v2.XMLReader.pushXMLReader(XMLReader.java:485)
at oracle.xml.parser.v2.XMLReader.pushXMLReader(XMLReader.java:192)
at oracle.xml.parser.v2.XMLParser.parse(XMLParser.java:144)
as you can see it is explicitly casting the XMLUTF8Reader to perform the read.
I can get around this by hard coding the xml input stream to be processed by a reader
XMLSource = new StreamSource(new InputStreamReader(XMLInStream,"ISO-8859-1"));
however the manual documents that the character encoding is automatically picked up from the xml file and casting into a reader is not necessary, so I should be able to write
XMLSource = new StreamSource(XMLInStream)
Does anyone else experience this same problem?
Having to hardcode the encoding causes my software to lose flexibility.
Jarrod Sharp

An XML document should be created with 'ISO-8859-1' encoding to be parsed as 'ISO-8859-1' encoding.
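If the parser really does ignore the declaration, one workaround is to sniff the declaration yourself and then wrap the stream in a correctly configured Reader, as in the StreamSource example above. A sketch (the class and method names are hypothetical helpers, not part of the Oracle parser):

```java
import java.util.regex.Matcher;
import java.util.regex.Pattern;

public class PrologSniff {
    // Reads the encoding pseudo-attribute from an XML declaration, if present.
    public static String declaredEncoding(byte[] head) throws Exception {
        // The declaration itself is ASCII-compatible, so a Latin-1 decode
        // of the first bytes is safe for inspecting it
        String prolog = new String(head, 0, Math.min(head.length, 100), "ISO-8859-1");
        Matcher m = Pattern.compile("encoding=[\"']([^\"']+)[\"']").matcher(prolog);
        return m.find() ? m.group(1) : "UTF-8";  // XML defaults to UTF-8 when unspecified
    }
}
```

The returned name can then be passed to new InputStreamReader(XMLInStream, encoding) instead of hardcoding "ISO-8859-1".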
-
As far as I can tell, the character encoding behaviour for Chinese in jdk1.4.2 is no longer the same as in jdk1.4.0.
In jdk1.4.0, the default character encoding followed the "file.encoding" system property; we often set the property to "gb2312".
But in jdk1.4.2, I find that the default character encoding no longer follows the "file.encoding" system property.
Who knows the reason?
Test Program:
public class B {
    public static void main(String args[]) throws Exception {
        byte[] bytes = new byte[]{(byte) 0xD6, (byte) 0xD0, (byte) 0xCE, (byte) 0xC4};
        String s1 = new String(bytes);
        String s2 = new String(bytes, System.getProperty("file.encoding"));
        System.out.println("s1=" + s1 + " , s2=" + s2);
        System.out.println("s1.length=" + s1.length() + " , s2.length=" + s2.length());
    }
}
run four times and the result list:
[root@app15 component]# /usr/local/j2sdk1.4.0/bin/java -Dfile.encoding=ISO-8859-1 -cp . B
s1=中文 , s2=中文
s1.length=4 , s2.length=4
[root@app15 component]# /usr/local/j2sdk1.4.0/bin/java -Dfile.encoding=gb2312 -cp . B
s1=中文 , s2=中文
s1.length=2 , s2.length=2
[root@app15 component]# /usr/local/j2sdk1.4.2/bin/java -Dfile.encoding=ISO-8859-1 -cp . B
s1=中文 , s2=中文
s1.length=4 , s2.length=4
[root@app15 component]# /usr/local/j2sdk1.4.2/bin/java -Dfile.encoding=gb2312 -cp . B
s1=中文 , s2=??
s1.length=4 , s2.length=2
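The varying s2 results above are exactly why relying on the platform default (or on file.encoding) is fragile. Decoding the same GB2312 bytes with an explicitly named charset is deterministic on any JVM; a minimal sketch, separate from the original test program:

```java
public class ExplicitCharsetDemo {
    public static void main(String[] args) throws Exception {
        // The same four bytes as in B: "中文" encoded in GB2312
        byte[] gbBytes = {(byte) 0xD6, (byte) 0xD0, (byte) 0xCE, (byte) 0xC4};
        // new String(bytes) depends on the JVM's default charset and so
        // varies between machines; naming the charset removes that dependence
        String s = new String(gbBytes, "GB2312");
        System.out.println(s.length());  // prints 2
    }
}
```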
[root@app15 component]#

I don't know for sure, but:
-- The API documentation for String says that "new String(byte[])" uses "the platform's default charset".
-- The API documentation for Charset says "The default charset is determined during virtual-machine startup and typically depends upon the locale and charset being used by the underlying operating system."
You'll notice that it doesn't say anything about using the file.encoding system value, so presumably (based on your experiments) it doesn't. I did a search for "java default charset" and didn't find anything specific, but this site says "As of Java 1.4.1, the default Charset varies from platform to platform" and suggests you explicitly hard-code your charset. I would agree with that. -
How can I tell what character encoding is sent from the browser?
Hi,
I am developing a servlet which supposed to be used to send and receive message
in multiple character set. However, I read from the previous postings that each
Weblogic Server can only support one input character encoding. Is that true?
And do you have any suggestions on how I can do what I want. For example, I
have a HTML form for people to post any comments (they may post in any characterset,
like ShiftJIS, Big5, Gb, etc). I need to know what character encoding they are
using before I can read that correctly in the servlet and save it in the database.

From what I understand (I haven't used it yet), 6.1 supports the 2.3 servlet spec. That should have a method to set the encoding.
Otherwise, I don't think you can support multiple encodings in one
instance of WebLogic.
From what I know browsers don't give any indication at all about what
encoding they're using. I've read some chatter about the HTTP spec
being changed so it's always UTF-8, but that's a Some Day(TM) kind of
thing, so you're stuck with all the stuff out there now which doesn't do
everything in UTF-8.
Sorry for the bad news, but if it makes you feel any better I've felt
your pain. Oh, and trying to process multipart/form-data (file upload)
forms is even worse and from what I've seen the API that people talk
about on these newsgroups assumes everything is ISO-8859-1.
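Until the Servlet 2.3 setCharacterEncoding method is available, a common (fragile) workaround is to re-decode what the container has already decoded as ISO-8859-1, assuming you know or can guess what the client really sent. A sketch assuming the browser submitted Shift_JIS:

```java
public class ReDecodeDemo {
    public static void main(String[] args) throws Exception {
        String original = "\u65E5\u672C\u8A9E";        // "日本語"
        byte[] wire = original.getBytes("Shift_JIS");  // bytes the browser sent
        // A container defaulting to ISO-8859-1 mis-decodes the parameter...
        String asSeenByContainer = new String(wire, "ISO-8859-1");
        // ...but ISO-8859-1 round-trips every byte value 0x00-0xFF, so the
        // raw bytes can be recovered and decoded with the real charset
        String recovered =
            new String(asSeenByContainer.getBytes("ISO-8859-1"), "Shift_JIS");
        System.out.println(recovered.equals(original));  // prints true
    }
}
```

In a servlet you would apply the same two-step re-decode to request.getParameter results; it only works because ISO-8859-1 is byte-lossless.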
How to create new character styles in Numbers 3.0
In Numbers 3.0 it is possible to update existing character styles, but I cannot find how to create a new character style; the + sign in the popup is always inactive when I edit text in a cell. The paragraph style popup has an active +, but it doesn't work for character styles.
Is there any way to create a new character style?

Guillame,
You're right about not being able to add styles, and I have not figured out how to modify a predefined character style, other than to Rename or Delete a predefined style. I have added this to my list of things lost.
Jerry -
Where is the Character Encoding option in 29.0.1?
After the new layout change, the developer menu doesn't have the Character Encoding options anymore. Where the hell is it? It's driving me mad...
You can also access the Character Encoding menu via the View menu (Alt+V C).
See also:
*Options > Content > Fonts & Colors > Advanced > Character Encoding for Legacy Content -
I'm helping another group out on this so I'm pretty new to this stuff so please go easy on me if I ask anything that is obvious.
We have a J2EE web application that is sitting on a Red Hat Linux box and is being served up by OAS 10.1.3. The application reads an xml file which contains the actual content of the page and then pulls in the navigation and metadata from other sources.
Everything works as it should, but there is one issue that has been ongoing for a while and we would like to close it off. In my content source xml file, I have encoded special characters such as é as & amp;#233; - when I view the web page all is well (I see the literal value of é), but when I do a view source, I see & #233;
If I put & #233; into the source xml file, the page still displays é, but when I do a view source on the web page, I see the literal é value in the source, which is not what we want. What is decoding the character reference? While inserting & amp;#233; into the xml source file works, we do not want to have to encode everything that way; we would prefer to have & #233;. Is it a setting of the OS, the Application Server or the Application itself?
When I previewed this post, I noticed that by typing & amp;#233; as one solid word, it gets decoded and is seen as é, so I had to put a space between the & and the amp to properly explain myself.
Any help would be appreciated!
Thanks,
/HH

There are a lot of notes on MetaLink about character encoding. I wrote Note 337945.1 a while ago, which explains this in more detail. I will quote the parts relevant to your situation:
For the core components, there are three places to set NLS_LANG:
- in the system environment (this is obvious)
- in the file opmn.xml
- in the file apachectl
A. Changing opmn.xml
- go to $ORACLE_HOME/opmn/conf and edit the file opmn.xml
- Search for the OC4J container your application runs in.
- Within the <process-type.... > </process-type> section, add an entry similar to:
(1) OracleAS 10g (10.1.2, 10.1.3):
<environment>
<variable id="NLS_LANG" value="ENGLISH_UNITED KINGDOM.AL32UTF8"/>
</environment>
B. Changing apachectl (Unix only)
- Go to $ORACLE_HOME/Apache/Apache/bin
- Open the file 'apachectl'
- search for NLS_LANG
e.g.
NLS_LANG=${NLS_LANG=""}; export NLS_LANG
Verify if the variable is getting the correct value; this may depend on your environment and on the version of OracleAS. If necessary, change this line. In this example, the value from the environment is taken automatically.
There is more on this topic in the mod_plsql area but since you do not mention pulling data from the database, this may be less relevant. Otherwise you need to ensure the same NLS_LANG and character set is used in the database to avoid conversions. -
Hello All,
I am not clear about solving the problem.
We have a Java application on NT that is supposed to communicate with the same application on MVS mainframe through XML.
We have a character encoding for these XML commands we send for communication.
The problem is, on MVS the parser is not understanding the US-ASCII character encoding, and so we are getting the infamous "illegal character" error.
The main frame file.encoding=CP1047 and
NT's file.encoding = us-ascii.
Is there any character encoding that is common to these two machines, the mainframe and NT?
If it is Unicode, what is the correct notation for it?
Or is there any way of specifying to the parsers which character encoding should be used?
thanks,
Sridhar

On the mainframe end, maybe something like:
FileInputStream fris = new FileInputStream("C:\\whatever.xml");
// or maybe "us-ascii" / "US-ASCII"
InputStreamReader isr = new InputStreamReader(fris, "ASCII");
BufferedReader brin = new BufferedReader(isr);

Or give the input stream / buffered reader to whatever application you are using to parse the xml. The InputStreamReader should allow you to set your encoding even if the system doesn't have the native encoding. It depends, though, on which/whose JVM you are using; jdk1.2 at least supports the encodings listed on this page: http://as400bks.rochester.ibm.com/pubs/html/as400/v4r4/ic2924/info/java/rzaha/javaapi/intl/encoding.doc.html
Detecting character encoding from BLOB stream... (PLSQL)
I'm looking for a procedure/function which can return the character encoding of a "text/xml/csv/slk" file stored in a BLOB.
For example...
I have 4 files in different encodings (UTF8, Utf8BOM, ISO8859_2, Windows1252)...
With Java I can simply detect the character encoding with juniversalchardet (http://code.google.com/p/juniversalchardet/)...
thank you

Solved...
On my local PC I have installed Java 1.5.0_00 (because the DB has 1.5.0_10)...
With Jdeveloper I have recompiled source code from:
http://juniversalchardet.googlecode.com/svn/trunk/src/org/mozilla/universalchardet
http://code.google.com/p/juniversalchardet/
After that I have made a JAR file and uploaded it with loadjava to my database...
C:\>loadjava -grant r_inis_prod -force -schema insurance2 -verbose -thin -user username/password@ip:port:sid chardet.jar

After that I made a Java procedure and a PLSQL wrapper, example below:
public static String verifyEncoding(BLOB p_blob) {
    if (p_blob == null) return "-1";
    try {
        InputStream is = new BufferedInputStream(p_blob.getBinaryStream());
        UniversalDetector detector = new UniversalDetector(null);
        byte[] buf = new byte[p_blob.getChunkSize()];
        int nread;
        while ((nread = is.read(buf)) > 0 && !detector.isDone()) {
            detector.handleData(buf, 0, nread);
        }
        detector.dataEnd();
        is.close();
        return detector.getDetectedCharset();
    } catch (Exception ex) {
        return "-2";
    }
}

As you can see, I used -2 for an exception and -1 if the input blob is null.
then i have made a PLSQL procedure:
function f_preveri_encoding(p_blob in blob) return varchar2 is
language Java name 'Zip.Zip.verifyEncoding(oracle.sql.BLOB) return java.lang.String';

After that I uploaded 2 different txt files into my blob field (the first encoded with UTF-8, the second with WINDOWS-1252).
example how to call:
declare
l_blob blob;
l_encoding varchar2(100);
begin
select vsebina into l_blob from dok_vsebina_dokumenta_blob where id = 401587359 ;
l_encoding := zip_util.f_preveri_encoding(l_blob);
if l_encoding = 'UTF-8' then
dbms_output.put_line('file is encoded with UTF-8');
elsif l_encoding = 'WINDOWS-1252' then
dbms_output.put_line('file is encoded with WINDOWS-1252');
else
dbms_output.put_line('other enc...');
end if;
end;

Now I can get the encoding from the blob, convert it to the database encoding, and store the data in a CLOB field.
Here you have a chardet.jar file if you need this functionality..
https://docs.google.com/open?id=0B6Z9wNTXyUEeVEk3VGh2cDRYTzg
Edited by: peterv6i.blogspot.com on Nov 29, 2012 1:34 PM
Edited by: peterv6i.blogspot.com on Nov 29, 2012 1:38 PM -
(nokia N8) Character encoding - reduced support (n...
I'm from Poland, and in my language I use special letters like ó, ż, ź, ą, ę. When I want to write a message and use one of the above letters, my message becomes 90 characters shorter. On older Nokia phones I changed the text message settings (character encoding: full support => reduced support). When I change the same setting on the Nokia N8, it doesn't work! I always see the shorter message limit. Strangest of all, when I choose "conversations" I don't have any problems and everything works fine; when I choose "new message" I can't use reduced support (the option is on, but not working).
My software version: 011.012. Software version date: 2010-09-18. Release: PR1.0. Custom version: 011.012.00.01. Custom version date: 2010-09-18. Language set: 011.012.03.01. Product code: 0599823
Talking about product code changes is prohibited in this forum. It is unofficial and is grounds for Nokia to refuse to repair or service your phone in any way.
If you find my post helpful please click the green star on the left under the avatar. Thanks.