Byte[] character encoding for strings
Hi All,
I tried to convert a string into byte[] using the following code:
byte[] out= [B@30c221;
String encodedString = out.toString();
It gives the output [B@30c221 when I print encodedString.
But when I convert that encodedString back into byte[] using the following code:
byte[] output = encodedString.getBytes();
it gives a different output.
Is there any character encoding needed to get back the exact original output for this?
Sorry, but the question makes no sense, and neither does your code:
byte[] out= [B@30c221;
String encodedString = out.toString();
The first line is syntactically incorrect, and the second would print something like "[B@30c221", which isn't particularly useful: that is just the array's type tag and hash code, not its contents.
The correct way to convert a String to a byte[] is with the getBytes() method. Be aware that the byte[] will be in the system default encoding, which means you could get different results on different platforms. To remove this platform dependency, you should specify the encoding, like so:
byte[] output = encodedString.getBytes("UTF-8");
Why are you doing this, anyway? There are very few good reasons to convert a String to a byte[] within your code; that's usually done by the I/O classes when your program communicates with the outside world, as when you write the string to a file.
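To illustrate the answer above, here is a minimal sketch of a lossless round trip between a String and a byte[]; naming the charset explicitly makes it platform-independent (the string literal is just an arbitrary example):

```java
import java.nio.charset.StandardCharsets;

public class RoundTrip {
    public static void main(String[] args) {
        String original = "héllo";

        // String -> byte[]: always name the charset explicitly
        byte[] bytes = original.getBytes(StandardCharsets.UTF_8);

        // byte[] -> String: decode with the SAME charset
        String restored = new String(bytes, StandardCharsets.UTF_8);

        System.out.println(original.equals(restored)); // true

        // Note: bytes.toString() would only print the array's
        // type-and-hash tag, e.g. "[B@30c221" -- never its contents.
    }
}
```

If the encode and decode sides ever disagree on the charset, non-ASCII characters are the first thing to break.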
Similar Messages
-
As a webservice client, how to set character encoding for JAX-WS?
I couldn't find the right API to set character encoding for a webservice client. What I did is
1. wsimport, which gives me MyService, MyPortType...
2. Create new MyService
3. Get MyPort from MyService
4. Call myPort.myOperation with objects
Where is the right place to set character encoding and how to set it? Thanks.
Regards
-Jiaqi Guo
The .js file and the html need to have the same encoding. If your html uses iso-8859-7, then the .js must also use that. But if the original text editor created the .js file using utf-8, then that is what the html needs to use. -
Character Encoding for IDOC to JMS scenario with foreign characters
Dear Experts,
The scenario is described as follows:
Issue Description:
There is an IDOC which is created after extracting data from different countries (but only one country at a time). So, for instance, the first time the data is picked up in Greek and Latin and the corresponding IDOC is created and sent to PI; the next time plain English is sent to PI; next Chinese; and so on. As of now, every time this IDOC reaches PI it comes with UTF-8 character encoding, as seen in the IDOC XML.
I am converting this IDOC XML into a single-string flat file (currently taking the default encoding UTF-8) and sending it to the receiver JMS queue (MQ Series). Now when this data is picked up by the end recipient from the corresponding queue in MQ Series, they see ? wherever there are Greek/Latin characters (maybe because those should have a different encoding like ISO-8859-7). This is causing issues at their end.
My Understanding
The SAP system should trigger the IDOC with the right code page, i.e. if the IDOC carries Greek/Latin data the code page should be ISO-8859-7; if the same IDOC carries Chinese characters, the corresponding code page; otherwise UTF-8 or the default code page.
Once this is sent correctly from SAP, the Java mapping should use the correct code page when writing the bytes to the output stream, and then we would also need to set the right code page as a JMS header before putting the message in the JMS queue so that the receiver can interpret it.
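The "write the bytes with the correct code page" step could be sketched as below; the method, stream, and sample text are illustrative assumptions, not the actual PI mapping API:

```java
import java.io.ByteArrayOutputStream;
import java.io.IOException;
import java.io.OutputStream;
import java.nio.charset.Charset;

public class CodePageWriter {
    // Write the flat-file payload with an explicitly chosen code page
    // instead of relying on the platform default.
    static void writePayload(String payload, String codePage, OutputStream out)
            throws IOException {
        out.write(payload.getBytes(Charset.forName(codePage)));
        out.flush();
    }

    public static void main(String[] args) throws IOException {
        ByteArrayOutputStream out = new ByteArrayOutputStream();
        // Greek text needs ISO-8859-7 or UTF-8 rather than a Latin-1 default
        writePayload("Αθήνα", "UTF-8", out);
        System.out.println(out.size()); // prints 10: each Greek letter is 2 bytes in UTF-8
    }
}
```

Whatever code page is chosen here must match what is announced in the JMS header, or the receiver will still see ? characters.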
Queries:
1. Is my approach for the scenario correct? If not, please guide me to the right approach.
2. Does SAP support a different code page being picked for the same IDOC based on a different data set? If so, how is it achieved?
3. What is the JMS header property to set the right code page? I think there should be some JMS header defined by MQ Series for character encoding which I should be setting correctly. I find that there is a property to set the CCSID in the JMS receiver adapter, but that only refers to non-ASCII names and doesn't refer to the payload content.
I would appreciate if anybody can give me pointers on how to resolve this issue.
Thanks,
Pratik
Hi Pratik,
http://www.sdn.sap.com/irj/scn/go/portal/prtroot/docs/library/uuid/502991a2-45d9-2910-d99f-8aba5d79fb42?quicklink=index&overridelayout=true
This link might help.
regards
Anupam -
Set character encoding for data template xml output
Hello everyone, in my data template, I have defined the header as
<?xml version="1.0" encoding="WINDOWS-1256"?>
but when output is generated, it is returned as:
<?xml version="1.0" encoding="UTF-8"?>
Is there a way for me to force the WINDOWS-1256 encoding in my data template?
Many Thanks
"This data is read as bytes then I am using the InputStreamReader to convert to UTF-8 encoding."
Don't you mean "from UTF-8 encoding"? Strings don't have an encoding, bytes can. And do you know that SQL Server produces those bytes encoded in UTF-8, or are you just assuming that?
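As the reply notes, InputStreamReader decodes from a named encoding into Java's internal chars; a minimal sketch (the byte source here is a stand-in for the database stream):

```java
import java.io.ByteArrayInputStream;
import java.io.IOException;
import java.io.InputStreamReader;
import java.io.Reader;
import java.nio.charset.StandardCharsets;

public class DecodeFromUtf8 {
    public static void main(String[] args) throws IOException {
        // Stand-in for the bytes read from the database:
        byte[] raw = "résultat".getBytes(StandardCharsets.UTF_8);

        // InputStreamReader decodes FROM the charset named here;
        // name the wrong one and accented characters get mangled.
        StringBuilder sb = new StringBuilder();
        try (Reader r = new InputStreamReader(
                new ByteArrayInputStream(raw), StandardCharsets.UTF_8)) {
            int c;
            while ((c = r.read()) != -1) {
                sb.append((char) c);
            }
        }
        System.out.println(sb); // prints "résultat"
    }
}
```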
"The stream is then written to a file with the extension '.xml'. When I go and open the file, I get errors stating that the characters were not recognized."
When you open the file with what? And what errors do you get?
"However, when I open the file with Notepad, I can see my xml data." -
Character encoding for ResponseWriter
hi;
how can I control the character encoding of the ResponseWriter?
what encoding does it use by default?
thanks.
Since I had junior developers becoming desperate over this problem, I'll post our solution for anybody that's not working on WebSphere and wants to solve this problem.
we seem to have solved it using a servletfilter that inserts this response wrapper:
class CharacterEncodingHttpResponseWrapper extends HttpServletResponseWrapper {
    private String contentTypeWithCharacterEncoding;
    private String encoding;

    CharacterEncodingHttpResponseWrapper(HttpServletResponse resp, String encoding) {
        super(resp);
        this.encoding = encoding;
    }

    public void setContentType(String contentType) {
        // one place to define the encoding instead of in all JSP pages
        contentTypeWithCharacterEncoding = addOrReplaceCharset(contentType, encoding);
        super.setContentType(contentTypeWithCharacterEncoding);
    }

    public void setLocale(Locale locale) {
        // setting the locale also resets the charset to ISO
        if (contentTypeWithCharacterEncoding == null) {
            CharacterEncodingFilter.LOGGER.warn("Encoding is set to ISO via the locale.");
        } else {
            super.setLocale(locale);
            // and set the encoding back to the desired encoding
            setContentType(contentTypeWithCharacterEncoding);
        }
    }

    /**
     * Utility method that sets the charset in the HTTP header
     * <code>Content-Type:application/x-www-form-urlencoded;charset=ISO-8859-1</code>
     * or in the content type on the servlet HTTP response
     * <code>text/html;charset=ISO-8859-1</code>.
     */
    private String addOrReplaceCharset(String headervalue, String charset) {
        if (null != headervalue) {
            // see if this header had a charset
            String charsetStr = "charset=";
            int len = charsetStr.length(), i = 0;
            // if we have a charset in this Content-Type header
            if (-1 != (i = headervalue.indexOf(charsetStr))) {
                // if it has a non-zero length, replace the existing charset
                if (i + len < headervalue.length()) {
                    headervalue = headervalue.substring(0, i + len) + charset;
                } else {
                    headervalue = headervalue + charset;
                }
            } else {
                headervalue = headervalue + ";charset=" + charset;
            }
            return headervalue;
        } else {
            LOGGER.warn("content-type header not set");
            return "application/x-www-form-urlencoded;charset=" + charset;
        }
    }
}
If all your JSF/JSP pages have consistently set the encoding in the content type, your addOrReplaceCharset method should only ever add, not replace. -
Multi-byte character encoding issue in HTTP adapter
Hi Guys,
I am facing a problem with multi-byte character conversion.
Problem:
I am posting data from SAP CRM to a third party system using XI as middleware. I am using the HTTP adapter to communicate from XI to the third party system.
I have given the XML code page as UTF-8 in the XI payload manipulation block.
I am trying to post Chinese characters from SAP CRM to the third party system, but junk characters are arriving at the third party system. My assumption is that it is double encoding.
Can you please guide me how to proceed further.
Please let me know if you need more info.
Regards,
Srini
Srinivas,
Can you go through the url:
UTF-8 encoding problem in HTTP adapter
---Satish -
Character Encoding for JSPs and HTML forms
After having read loads of postings on character encoding problems I'm still puzzled about the following problem:
I have an instance (A) of WL 8.1 SP3 on a WinXP machine and another instance (B) of WL 8.1 without any SP on a Win2K machine. The underlying Windows locale is english(US) in both cases.
The same application deployed as a war file to these instances does not behave in the same way when it comes to displaying non-Latin1-characters like the Euro symbol: Whereas (A) shows and accepts these characters as request-parameters, (B) does not.
Since the war file is the same (weblogic.xml, jsps and everything), the reason for this must either be the service-pack-level or some other configuration setting I overlooked.
Any hints are appreciated!
Try this:
Preferences -> Content -> Fonts & Colors -> Advanced
At the bottom, choose your Encoding. -
Setting character encoding for the whole app
I have a MySQL database which uses the latin1 character set. I don't know which one Java uses, but when I print some Polish characters it puts ? instead. How do I change the character set for the whole application? Or maybe there is some completely different way?
1. the database I have is capable of storing Polish chars - at least when I view it within my visual manager it shows them with no problem, so I think it's not that
2. I don't use the console - I use the JTable component to show the data
3. java stores using utf, but the app uses default system encoding - I checked it with:
Charset c = Charset.defaultCharset();
System.out.println(c.displayName());
So which solution do you suggest?
And how can I change the encoding of the application?
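One common fix, assuming MySQL Connector/J is the driver in use, is to name the encoding in the JDBC URL so the driver converts between the column charset and Java's Unicode strings; the host, database name, and credentials below are placeholders:

```java
import java.sql.Connection;
import java.sql.DriverManager;
import java.sql.SQLException;

public class MySqlEncodingDemo {
    public static void main(String[] args) throws SQLException {
        // characterEncoding tells Connector/J which charset to use on the wire;
        // the driver maps it to/from the latin1 (or other) column charset.
        String url = "jdbc:mysql://localhost:3306/mydb"
                   + "?useUnicode=true&characterEncoding=UTF-8";
        try (Connection con = DriverManager.getConnection(url, "user", "password")) {
            // ... run queries; Strings read here arrive as proper Unicode
        }
    }
}
```

This only helps if the data in the latin1 columns really is latin1; if the bytes were stored under a different encoding, they have to be converted first.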
Thank's for your interest! -
Why does Firefox 18 ignore the specified character encoding for websites?
We are developing a page on our website that will have the page crawled and a newsletter generated and sent out to a mailing list. Many email packages default to character encoding of iso-8859-1 so we have set our character encoding to this on the page via the standard meta tag.
We have a problem with the newsletters that we had until now been unable to replicate. Though now I know why... I have just discovered that in Firefox 18, the specified character encoding is being completely ignored. It is rendering the page in UTF-8 even though we specified ISO-8859-1. Firefox 3.6, however, renders the page with the proper encoding (thank god for keeping an old version for testing).
Can anyone explain why the new Firefox is completely ignoring the meta tag? Both browsers are using the factory default (I even opened FF18 in safe mode)...
Thanks for letting me know that Firefox 18 ignores everything but the server headers... but it doesn't help me much. Our website is in UTF-8... but this page is a newsletter, one that is crawled and saved into an email and sent out to a mailing list (by a third party newsletter program), and many email readers use ISO-8859-1, hence why we want to have the page rendered in that encoding so that we can actually test the newsletter properly. We can't test through the third party software as our testing environment is behind a firewall, and you can't change the server headers for a single page... hence the meta tag.
If you explicitly choose to render a page in a specific encoding, that shouldn't be ignored by the browser. It's not a big deal, but now every time we make a code change in our test environment and reload the page we have to force the encoding manually in the browser which is a pain.
The problem is, the newsletter is already live and we have some users complaining because some characters aren't displaying properly in their email packages (Entourage for Mac is one of them), while all our testing (which is encoded using UTF-8) looks fine. -
How do I set the character encoding for every page, always?
I use Thai Windows-874. When I open some websites that contain Thai and then click to open a new tab, it changes to Western Windows-1252 and cannot display Thai. I must set the character encoding to Thai Windows-874 every time.
Try this:
Preferences -> Content -> Fonts & Colors -> Advanced
At the bottom, choose your Encoding. -
Using bytes or chars for String phonetic algorithm?
Hi all. I'm working on a phonetic algorithm, much like Soundex.
Basically the program receives a String, reads it either char by char or in chunks of chars, and returns its phonetic version.
The question is which method is better to work with here: treating each String "letter" as a char or as a byte?
For example, let's assume one of the rules is to remove every repeated character (e.g., "jagged" becomes "jaged"). Currently this is done as follows:
public final String removeRepeated(String s){
    char[] schar = s.toCharArray();
    StringBuffer sb = new StringBuffer();
    int lastIndex = s.length() - 1;
    for (int i = 0; i < lastIndex; i++) {
        if (schar[i] != schar[i+1]) {
            sb.append(schar[i]);
        } else {
            sb.append(schar[++i]); //due to the increment it won't work for 3+ repetitions, e.g. jaggged -> jagged
        }
    }
    sb.append(schar[lastIndex]);
    return sb.toString();
}
Would there be any improvement in this computation:
public final String removeRepeated(String s){
    byte[] sbyte = s.getBytes();
    int lastIndex = s.length() - 1;
    for (int i = 0; i < lastIndex; i++) {
        if (sbyte[i] == sbyte[i+1]) {
            sbyte[++i] = 32; // 32 is the byte value of " "
        }
    }
    return new String(sbyte).replace(" ", "");
}
Well, in case there isn't much improvement from the char (16-bit) to the byte (8-bit) computation, I would very much appreciate it if anyone could explain to me how a 32-bit/64-bit processor handles such kinds of data, so that it makes no difference whether we work with char or byte in this case.
You may already know that getBytes() converts the string to a byte array according to the system default encoding, so the result can be different depending on which platform you're running the code on. If the encoding happens to be UTF-8, a single character can be converted to a sequence of up to four bytes. You can specify a single-byte encoding like ISO-8859-1 using the getBytes(String) method, but then you're limited to using characters that can be handled by that encoding. As long as the text contains only ASCII characters you can get away with treating bytes and characters as interchangeable, but it could turn around and bite you later.
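A quick illustration of that warning: a single non-ASCII character is one Java char but more than one UTF-8 byte (a sketch using an arbitrary accented letter):

```java
import java.nio.charset.StandardCharsets;

public class CharVsByte {
    public static void main(String[] args) {
        String s = "é"; // one character...
        System.out.println(s.toCharArray().length);                         // 1 char
        System.out.println(s.getBytes(StandardCharsets.UTF_8).length);      // 2 bytes
        System.out.println(s.getBytes(StandardCharsets.ISO_8859_1).length); // 1 byte

        // Comparing "letters" as bytes only lines up with comparing them
        // as chars when every character fits the chosen single-byte
        // encoding -- e.g. plain ASCII.
    }
}
```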
Your purpose in using bytes is to make the program more efficient, but I don't think it's worth the effort. First, you'll be constantly converting between bytes and chars, which will wipe out much of your efficiency gain. Second, when you do comparisons and arithmetic on bytes, they tend to get promoted to ints, so you'll be constantly casting them back to bytes, and you have to watch for values changing as the JVM tries to preserve their signs.
In short, converting the text to bytes is not going to do anywhere near enough good to justify the extra work it entails. I recommend you leave the text in the form of chars and concentrate on minimizing the number of passes you make over it. -
Character encoding for OutputStreams?
Hi,
I'm working on a web application that outputs HTML data in two ways: to a web browser directly through a ServletOutputStream, and to files of cached web pages through a FileOutputStream. My problem is that only the ServletOutputStream manages to handle non-English characters (such as Swedish å, ä, ö) correctly. The main idea of the code is this:
//Servlet method:
public void doGet(HttpServletRequest req, HttpServletResponse resp)
        throws ServletException, IOException {
    String data = "åäö";
    resp.setCharacterEncoding("UTF-8");
    ServletOutputStream out = resp.getOutputStream();
    out.write(data.getBytes());
}

//File writing method:
private void writeToFile(File file) throws IOException {
    String data = "åäö";
    FileOutputStream out = new FileOutputStream(file);
    out.write(data.getBytes());
}
In the doGet method, if I disable the row
resp.setCharacterEncoding("UTF-8");
this method also fails to represent the characters correctly, so I guess the encoding set in the response is propagated to the stream.
Is there a way I could set the encoding in the same way on the FileOutputStream? I know this sounds awkward and that I should be using writers, but since this seems to be possible with the ServletOutputStream I thought it might work...
Best regards,
David
Read this excellent introduction to Unicode. It will tell you that there is no such thing as "plain text". Any time you have (or give away) a byte stream that should represent text, you'll need some way to indicate which encoding it is saved in. When writing a response from a servlet you do this using setCharacterEncoding().
Then you'll have to make sure that the binary data you send actually is in the encoding you specified. Since you use getBytes() without an argument, you will only achieve this effect by chance (i.e. when your default charset is UTF-8, which is normally the case on modern Linux systems). You'd need to call getBytes("UTF-8") to do the correct thing.
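Following that advice, the file-writing method from the question could be fixed by naming the encoding explicitly, shown here with an OutputStreamWriter, which handles the string-to-byte conversion; this is a sketch of one option, not the only one:

```java
import java.io.File;
import java.io.FileOutputStream;
import java.io.IOException;
import java.io.OutputStreamWriter;
import java.io.Writer;
import java.nio.charset.StandardCharsets;

public class FileEncodingDemo {
    // Same idea as the question's writeToFile, but with an explicit
    // charset instead of the platform default.
    static void writeToFile(File file, String data) throws IOException {
        try (Writer out = new OutputStreamWriter(
                new FileOutputStream(file), StandardCharsets.UTF_8)) {
            out.write(data);
        }
    }

    public static void main(String[] args) throws IOException {
        File f = File.createTempFile("demo", ".html");
        writeToFile(f, "åäö");
        System.out.println(f.length()); // prints 6: each letter is 2 bytes in UTF-8
        f.delete();
    }
}
```

Any code that later reads the file must of course decode it with the same charset.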
Alternatively you could simply use resp.getWriter() and let that handle all the string-to-binary conversion.
If you save the file to disk there is no default way to specify the encoding. If your code is the only code handling those files, you can just define that each of those files is stored in UTF-8 encoding and always handle the text with this encoding. -
How to set XML character encoding for a SOAP response?
Hi,
We're using Oracle J2EE web services,
and are quite happy with them.
However, it's a problem that we need to have
characters outside the standard English alphabet
in our service responses. So far, we have not been
able to find a way to specify what encoding to use.
Our version (9.0.3 release) produces SOAP-responses
without any encoding specification in the XML header.
Any ideas?
Hello,
If you are using the "Paper Layout", check the report's "Before Report Value" property:
Before Report Value :
<meta http-equiv="Content-Type" content="text/html; charset=&Encoding">
If you are using the "Web Layout", take a look to the document :
http://download-uk.oracle.com/docs/cd/B14099_17/bi.1012/b14048/pbr_nls.htm#i1006142
18.3 Specifying a Character Set in a JSP or XML File
Regards -
Wrong special character encoding for window.open
Hi all,
I have a problem with special characters in urls.
See the next code:
<h:outputLink onclick="window.open('#{webApplication.root}/dynamic/reports/user_efforts.jsf?i=1&person=#{item.person}','userefforts','width=600,height=300,resizable=yes,scrollbars=yes');return false" value="#" rendered="#{!empty item.projectNumber}">
<h:outputText value="#{item.person}"/>
</h:outputLink>
In that, when item.person contains an é, then in the url it is converted to é. The result is obviously that the person param is wrong for my popup.
I suspect it has something to do with encoding, but both pages are UTF-8.
Anyone can get me out of this mess? ;)
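For what it's worth, the mangling described above is the classic UTF-8/Latin-1 mismatch; a small sketch using java.net.URLEncoder shows the two views of the same character (the sample value is hypothetical):

```java
import java.io.UnsupportedEncodingException;
import java.net.URLEncoder;
import java.nio.charset.StandardCharsets;

public class MojibakeDemo {
    public static void main(String[] args) throws UnsupportedEncodingException {
        String person = "é";

        // é encoded as UTF-8 percent-escapes: two bytes, 0xC3 0xA9
        System.out.println(URLEncoder.encode(person, "UTF-8")); // prints %C3%A9

        // Reading those same two UTF-8 bytes as if they were Latin-1
        // produces the "é" seen in the broken URL.
        byte[] utf8 = person.getBytes(StandardCharsets.UTF_8);
        System.out.println(new String(utf8, StandardCharsets.ISO_8859_1)); // prints é
    }
}
```

So somewhere between page and popup the parameter is being encoded as UTF-8 but decoded as ISO-8859-1.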
Thanks
Perhaps you should use "value" instead of "onclick".
I think it makes no sense that a renderer should encode onclick - more of a passthrough.
In that case you also lose the session id if cookies are disabled.
just try something like
<h:outputLink value='yourURL' onclick='this.href;return false' ... -
Where is the Character Encoding option in 29.0.1?
After the new layout change, the developer menu doesn't have the Character Encoding options anymore. Where the hell is it? It's driving me mad...
You can also access the Character Encoding via the View menu (Alt+V C)
See also:
*Options > Content > Fonts & Colors > Advanced > Character Encoding for Legacy Content