Character encoding and ByteOutputStream

Hi!
I'm currently working on a web application that needs to print non-english characters (e g swedish � � �). Currently this doesn't work, although i have set the character encoding for the HttpServletResponse.
I figure that it is this code that doesn't manage the non-english characters (it's not my own code but i need to fix it and, sorry to say, I'm not that experienced with streams):
ByteArrayOutputStream baos = new ByteArrayOutputStream(16384);
baos.write("���".getBytes());
resp.setDateHeader("Expires", 0);
resp.setContentLength(baos.size());
ServletOutputStream out = resp.getOutputStream();
out.write(baos.toByteArray());
out.flush();
out.close();Any hints on what to do or where to look? Should i wrap the ServletOutputSream in a Writer?
Cheers,
David

But there's no character encoding set for this operation:baos.write("���".getBytes());

Similar Messages

  • Why differing Character Encoding and how to fix it?

    I have PRS-950 and PRS-350 readers, both since 2011.  
    In the last year, I've been getting books with Character Encoding that is not easy to read.  In playing around with my browsers and View -> Encoding menus, I have figured out that it has something to do with the character encoding within the epub files.
    I buy books from several ebook stores and I borrow from the library.
    The problem may be the entire book, but it is usually restricted to a few chapters, with rare occasion where the encoding changes within a chapter.  Usually it is for a whole chapter, not part, and it can be seen in chapters not consecutive to each other.
    It occurs whether the book is downloaded directly to my 950 reader or if I load it to either reader from my computer(s), which are all Mac OS X of several versions fom 10.4 to Mountain Lion.  SInce it happens when the book is downloaded directly, I figure the operating system of my computer is not relevant.
    There are several publishers involved, though Baen (no DRM ebooks) has not so far been one of them.
    If I look at the books with viewers on the computer, the encoding is the same.  I've read them in Calibre, in the Sony Reader App, and in Adobe Digital Editions 2.0.  It's always the same.
    I believe the encoding is inherent to the files.  I would like to fix this if I can to make the books I've purchased, many of them in paper and electronically, more enjoyable to read on my readers.
    Example: I’ve is printed instead of I've.
    ’ for apostrophe
    “ the opening of a quotation,
    â€?  for closing the quotation,
    and I think — is for a hyphen.
    When a sentence had “’m  for " 'm at the beginning of a speech (when the character was slurring his words) it took me a while to figure out how it was supposed to read.
    “’Sides, â€™tis only for a moon.  That ain’t long.â€?
    was in one recent book.
    Translation: " 'Sides, 'tis only for a moon. That ain't long."
    See what I mean? 
    Any ideas?

    Hi
    I wonder if it’s possible to download a free ebook with such issue, in order to make some “tests”.
    Perhaps it’s possible, on free ebooks (without DRM), to add fonts by using softwares like Sigil.

  • Character Encoding and File Encoding issue

    Hi,
    I have a file which has a data encoded using default locale.
    I start jvm in same default locale and try to red the file.
    I took 2 approaches :
    1. Read the file using InputStreamReader() without specifying the encoding, so that default one based on locale will be picked up.
    -- This apprach worked fine.
    -- I also printed system property "file.encoding" which matched with current locales encoding (on unix cooand to get this is "locale charmap").
    2. In this approach, I read the file using InputStream as an array of raw bytes, and passed it to String contructor to convert bytes to String.
    -- The String contained garbled data, meaning encoding failed.
    I tried printing encoding used by JVM using internal class, and "file.encoding" property as well.
    These 2 values do not match, there is weird difference.
    For e.g. for locale ja_JP.eucjp on linux box :
    byte-character uses EUC_JP_LINUX encoding
    file.encoding system property is EUC-JP-LINUX
    To get byte to character encoding, I used following methods (sun.io.*):
    ByteToCharConverter btc = ByteToCharConverter.getDefault();
    System.out.println("BTC uses " + btc.getCharacterEncoding());
    Do you have any idea why is it failing ?
    My understanding was, file encoding and character encoding should always be same by default.
    But, because of this behaviour, I am little perplexed.

    But there's no character encoding set for this operation:baos.write("���".getBytes());

  • Character encoding and Property files

    Hi
    I currently have some source code that has some hardcoded Strings with accented characters (like prot�g� ). I'm trying to move these out to a properties file. I copied the Strings to a properties file and loaded it into the application with the following code:
          ResourceBundle emailProps = ResourceBundle.getBundle("package.whatever.email");
            emailBody = emailProps.getString("email.body");I tried a System.out.println for both the hardcoded and "loaded from properties" method and they came out different:
    hardcoded: prot�g�
    from properties: prot��g��
    Can somebody tell me why the String loaded from properties came out like this? What can I do to fix this?
    Thanks

    Thanks for your help
    No, I don't know the encoding. How can I find out?
    Just an update, though I'm still trying some stuff. I tried to do a native2ascii ant task on the file (ISO-8859-1) as suggested in the Properties javadoc. As expected, it converted the accented characters to \uXXXX format. However, the output is still the same

  • Character encoding and Sybase JDBC driver

    Hi,
    I have an issue with the Sybase JDBC driver we are using. We have characters in our database, e.g., canon with a ~ over the n. When I do a getString from the ResultSet, I see a ? instead of the character. I'm specifying utf8 as the encoding in the properties in the WLS console. I tried utf16 or ucs2, but I got exceptions from the Sybase JDBC driver at start up.
    Any help would be greatly appreciated.

    Pat Bumpus wrote:
    Hi,
    I have an issue with the Sybase JDBC driver we are using. We have characters in our database, e.g., canon with a ~ over the n. When I do a getString from the ResultSet, I see a ? instead of the character. I'm specifying utf8 as the encoding in the properties in the WLS console. I tried utf16 or ucs2, but I got exceptions from the Sybase JDBC driver at start up.
    Any help would be greatly appreciated.Which Sybase driver are you using?
    Joe

  • Default Character Encoding stuck on UTF-8 - Firefox 7

    I cannot change the Character Encoding - it is stuck on Unicode UTF-8 and I can not change it! When a web page opens I get these little boxes with "FF FD" instead of Quote marks. When I change the character encoding on that page using "View->Character Encoding" and click on the Western (ISO-8859-1), the page displays correctly. Every page opens using Unicode UTF-8 as the default.
    View->Character Encoding -- shows Unicode UTF-8 as the default.
    View->Character Encoding->Auto-detect -- shows OFF
    Tool->Options->Content->Advance->Fonts->Default Character Encoding -- shows Western (ISO-8859-1) as well as the "Allow Pages to choose their own fonts..." IS CHECKED in the check box
    THE PAGES ARE NOT UTF-8!!!! The "View Page Source" IS NOT Unicode UTF-8! -- It shows <meta http-equiv="Content-Type" content="text/html; charset=iso-8859-1">.
    The "View Page Info" shows MetaTag - Content-Type: text/html; charset=iso-8859-1
    Why can I not change the Default Character Encoding?
    I would also like to point out that the Unicode UTF-8 seems to be broken because it is indicating that the QUOTE CHARACTER is an UNPRINTABLE character "FF FD"
    ----- EDIT -----
    The UTF-8 is not broken. The problem as pointed out in http://en.wikipedia.org/wiki/Replacement_character#Replacement_character is that my Firefox being STUCK processing UTF-8 encoding cannot read the clearly marked iso-8859-1 data. So the UTF-8 is reinterpreting smart quotes -&ldquo; and &rdquo;- (“ and ”) as replacement (unprintable) characters.
    So the real problem is why my Firefox is stuck on Unicode UTF-8

    The real problem is that the font that is used doesn't have those characters.
    Do you see the special quotes -“ and ” on this forum page?
    Does it help if you disable the website fonts and set another font as the default font?
    *Tools > Options > Content : Fonts & Colors > Advanced
    *http://en.wikipedia.org/wiki/Punctuation
    *http://en.wikibooks.org/wiki/Unicode/Character_reference/2000-2FFF

  • Why, after all these years, can't Thunderbird auto-detect character encoding

    judging by all the existing messages and complaints about this, not to mention erroneous posts that say the problem is solved when it isn't, I have to conclude Mozilla either doesn't believe this is a problem or doesn't care to fix it. The bottom line is that there is no way to tell Thunderbird to automatically display emails in the character coding format they were written in. I could understand cases where the headers are not properly filled in, but I see tons of emails in which the encoding is plainly there in the headers within the message source. You can force it, but if you do so via the menu VIEW->Character Encoding->UTF8 (for example) it won't "stick" if you view another message. But who would want it to "stick" permanently anyway? What the average user really wants is to be able to toggle VIEW->Character Encoding->Auto Detect from its default "off" to simply "on", and not have to bother with it anymore.
    This is a problem that seems to have gone on forever, and it NEVER happens with other email clients. If there is some backdoor way to actually make autodetect work, I'd appreciate knowing about it. But more important, I think ALL users would appreciate it if it were not some secret "backdoor" setting, but a simple global menu choice for all accounts. Can Mozilla please fix this problem once and for all?

    You said...
    ''Thunderbird is supposed to be using the encoding in the mail.''
    I figured is "should", i'm just reporting that it doesn't
    You said...
    ''Setting auto detect to on disables that.''
    Please explain. I've looked at every setting I can find and there is no way to set auto detect to "ON". I DID try setting it to "universal" in an attempt top solve the problem, but I have since restored it to "off", because the universal setting doesn't help.
    you said...
    ''"Based on your earlier response I assume you need to press the F10 key to see the tools menu you were refered to." ''
    No... I never said that anywhere. I DID refer to Menu->View_>Character Encoding, and I did refer to right clicking on individual folders, to get to the properties dialog, and the general information tab. But F-10 doesn't do anything
    You said...
    ''I have examines dozens of mails in my inbox and each honours the character encoding set in the HTML''
    Well, mine NEVER did. A short example from an email I got today pretty much is exemplative of all mail I get from GMAIL...
    --089e013a0572a067a404fc73ceda
    Content-Type: text/plain; charset=UTF-8
    Ok, very good. Thank you. Phoenix sent you a friend request on Facebook by
    the way. Talk to you soon.
    --089e013a0572a067a404fc73ceda
    Content-Type: text/html; charset=UTF-8
    Content-Transfer-Encoding: quoted-printable
    <p dir=3D"ltr">Ok, very good. Thank you. Phoenix sent you a friend request=
    =C2=A0 on Facebook by the way.=C2=A0 Talk to you soon.</p>
    --089e013a0572a067a404fc73ceda--
    See those incidences pf "=C2=A0"? Each one displays as a strange character, a capitol A with a curved line over it. If I manually set my default encoding to UTF 8, the weird characters go away. If I leave it as Western, there is nothing I can do to tell Thunderbird to "auto detect".
    Anyway, I suppose at this point that no one responsible for the product coding is seriously looking at my issue, which is why its never been solved. If anyone does intend to help track it down and solve it, I'll be happy to provide all the examples and screen shots they ask for. Otherwise.

  • As a webservice client, how to set character encoding for JAX-WS?

    I couldn't find the right API to set character encoding for a webservice client. What I did is
    1, wsimport which gives me MyService, MyPortType...
    2. Create new MyService
    3. Get MyPort from MyService
    4. Call myPort.myOperation with objects
    Where is the right place to set character encoding and how to set it? Thanks.
    Regards
    -Jiaqi Guo

    The .js file and the html need to have the same encoding. If
    your html uses iso-8859-7, then the .js must also use that. But if
    the original text editor created the .js file using utf-8, then
    that is what the html needs to use.

  • Reading Advance Queuing with XMLType payload and JDBC Driver character encoding

    Hi
    I've got a problem retrieving the message from the queue with XMLType payload in Java.
    It was working fine in 10g database but after the switch to 11g it returns corrupted string instead of real XML message. Database NLS_LANG setting is AL32UTF8
    It is said that JDBC driver should deal with that automatically but it obviously don't in this case. When I dequeue the message using database functionality (DBMS_AQ package) it looks fine but not when using JDBC driver so Ithink it is character encoding issue or so. The message itself is enqueued by the database and supposed to be retrieved by dedicated EJB.
    Driver file used: ojdbc6.jar
    Additional libraries: aqapi.jar, xdb.jar
    All file taken from 11g database installation.
    What shoul dI do to get the xml message correctly?

    Do you mean NLS_LANG is AL32UTF8 or the database character set is AL32UTF8? What is the database character set (SELECT value FROM nls_database_parameters WHERE parameter='NLS_CHARACTERSET')?
    Thanks,
    Sergiusz

  • Character Encoding for JSPs and HTML forms

    After having read loads of postings on character encoding problems I'm still puzzled about the following problem:
    I have an instance (A) of WL 8.1 SP3 on a WinXP machine and another instance (B) of WL 8.1 without any SP on a Win2K machine. The underlying Windows locale is english(US) in both cases.
    The same application deployed as a war file to these instances does not behave in the same way when it comes to displaying non-Latin1-characters like the Euro symbol: Whereas (A) shows and accepts these characters as request-parameters, (B) does not.
    Since the war file is the same (weblogic.xml, jsps and everything), the reason for this must either be the service-pack-level or some other configuration setting I overlooked.
    Any hints are appreciated!

    Try this:
    Prefrences -> Content -> Fonts & Color -> Advanced
    At the bottom, choose your Encoding.

  • Problems with Forms and character encoding

    I'm having problems trying to read unicode data inputted into a Form on my JSP page.
    I've used the meta tag <meta http-equiv="Content-Type" content="text/html; charset=utf-8"/> to set the charset of the page to UTF-8. I've inputted some chinese characters inot my form and when I try to read the subsequent request parameter in my servlet using request.getParameter() the string returned is this
    "&#26469;&#28304;" which is the escape sequence required by HTML to display these characters.
    From what I've read on the subject this doesn't seem like the expected value. I've tried other ways of getting the correct string value such as setting the character encoding request.setCharacterEncoding("UTF-8") and then converting the bytes using this encoding value but it doesn't seem to work.
    I could write a method to split up the string using the ; as a token and working out the correct unicode character but this doesn't seem like the right thing to do.
    Any help on how to pass the correct information from the Form in the JSP page to the servlet would be greatly appreciated

    I don't believe that is correct, but if it's returning HTML escapes instead of URL Encoded characters, then it's the browser doing it. This is my test page for playing with Chinese...
    <%@ page language="java" contentType="text/html; charset=UTF-8" %>
    <!DOCTYPE HTML PUBLIC "-//W3C//DTD HTML 4.0 Transitional//EN">
    <html>
    <head>
         <title></title>
         <meta http-equiv="Content-Type" content="text/html; charset=UTF-8" />
    </head>
    <body bgcolor="#ffffff" background="" text="#000000" link="#ff0000" vlink="#800000" alink="#ff00ff">
    <%
    request.setCharacterEncoding("UTF-8");
    String str = "\u7528\u6237\u540d";
    String name = request.getParameter("name");
    %>
    req enc: <%= request.getCharacterEncoding() %><br />
    rsp enc: <%= response.getCharacterEncoding() %><br />
    str: <%= str %><br />
    name: <%= name %><br />
    <form method="GET" action="_lang.jsp" encoding="UTF-8">
    Name: <input type="text" name="name" value="" >
    <input type="submit" name="submit" value="GET Submit" />
    </form>
    <form method="POST" action="_lang.jsp" encoding="UTF-8">
    Name: <input type="text" name="name" value="" >
    <input type="submit" name="submit" value="POST Submit" />
    </form>
    </body>
    </html>

  • Seeing � etc despite having View--Character encoding as unicode and auto-detect universal

    On viewing some web pages see characters such as �, ,  (for example). But View-Character Encoding is set at Unicode (UTF-8) or Western (ISO8859-1) and Tools-Options-Content-Fonts-Advanced Encoding set with either of those

    example of page:
    http://scienceofdoom.com/2010/09/17/on-missing-the-point-by-chilingar-et-al-2008/
    - a little over half way down, the section headed "Anthropogenic Imact on the Earth’s Climate – Tiny" from paragraph "And continue: " there are these non-characters in the equation (12) and subsequently.
    Another page : http://www.zimbabwesituation.com/sep26_2010.html in the topic " Red warning lights" .
    Most web-pages I read are without problem.
    I contacted the writer of the first page and s/he had no idea why it happens.

  • Database Encoding and character set

    Hi,
    Is it possible to change [some easier straigtht forward way] the encodings and character set for a database [DB11 in this case]?
    I have a database which has UTF16 and MSWIN1252 as the encoding and characterset, i want to change it to UTF8 for some testing and reasons.
    Thanks,
    Anupam

    You should detail what you mean by encoding and character set.
    A database has 2 character sets:
    - the character set for CHAR, VARCHAR2 and CLOB
    - the national character set for NCHAR, NVARCHAR2 and NCLOB.
    You have 2 ways to change the database character set:
    - with ALTER DATABASE statement
    - with export/import
    See http://download-uk.oracle.com/docs/cd/B10501_01/server.920/a96529/ch10.htm#1656

  • Web pages display OK, but print with garbage characters. I think it's character encoding, but don't know WHICH I should use. Have tried all Western and UTF options. Firefox 3.6.12

    I used to only have troubles with headers & footers printing out as garbage characters. I tried changing Character Encoding, now entire pages have garbage characters, even though pages view ok when browsing.

    If the pages look OK when you are browsing then it is not a problem with the encoding.<br />
    It can be a problem with the font that is used and you can try to disable website fonts and posibly try a few different default fonts to see if that helps.
    Tools > Options > Content : Fonts & Colors: Advanced (Allow pages to choose their own fonts, instead of my selections above)

  • Locale and character encoding. What to do about these dreadful ÅÄÖ??

    It's time for me to get it into my head how this works. Please, help me understand before I go nuts.
    I'm from Sweden and we use a few of these weird characters like ÅÄÖ.
    If I create a file called "övrigt.txt" in windows, then the file will turn up as "?vrigt.txt" on my Linux pc (At least in the console, sometimes it looks ok in other apps in X). The same is true if I create the file in Linux and copy it to Windows, it will look just as weird on the other side.
    As I (probably) can't change the way windows works, my question is what I have to do to have these two systems play nicely with eachother?
    This is the output from locale:
    LANG=en_US.utf8
    LC_CTYPE="en_US.utf8"
    LC_NUMERIC="en_US.utf8"
    LC_TIME="en_US.utf8"
    LC_COLLATE=C
    LC_MONETARY="en_US.utf8"
    LC_MESSAGES="en_US.utf8"
    LC_PAPER="en_US.utf8"
    LC_NAME="en_US.utf8"
    LC_ADDRESS="en_US.utf8"
    LC_TELEPHONE="en_US.utf8"
    LC_MEASUREMENT="en_US.utf8"
    LC_IDENTIFICATION="en_US.utf8"
    LC_ALL=
    Is there anything here I should change? I have tried using ISO-8859-1 with no luck. Mind you that I want to have the system wide language set to english. The only thing I want to achieve is that "Ö" on widows should turn up as "Ö" i Linux as well, and vice versa.
    Please save my hair from being torn off, I'm going bald here...

    Hey, thanks for all the answers!
    I share my files in a number of ways, but mainly trough a web application called Ajaxplorer (very nice btw...). The thing is that as soon as a windows user uploads anything with special chatacters in the file name my programs, xbmc, console etc, refuses to read them correctly. Other ways of sharing is through file copying with usb sticks, ssh etc. It's really not the way of sharing that is the problem I think, but rather the special characters being used sometimes.
    I could probably convert the filenames with suggested applications but then I'll set the windows users in trouble when they want to download them again, won't I?
    I realize that it's cp1252 that is the bad guy in this drama. Is there no way to set/use cp1252 as a character encoding in Linux? It's probably a bad idea as utf8 seems like the future way to go, but the fact that these two OS's can't communicate too well in this area is pretty useless if you ask me.
    To wrap this up I'll answer some questions...
    @EVRAMP: I'm actually using pcmanfm, but that is only for me and I'm not dealing very often with vfat partitions to be honest.
    @pkervien: Well, I think I mentioned my forms of sharing above. (kul med lite arch-svenskar!)
    @quarkup: locale.gen is edited and both sv.SE and en_US have utf-8 and ISO-8859 enabled and generated.
    ...and to clearify things even further. It doesn't matter if I get or provide a file via a usb stick, samba, ftp or by paper. All I want is for "Ö" to always be "Ö", everywhere.
    I can't believe how hard this is to get around. Linus is finish for crying out loud. I thought he'd sorted this out the first thing he did. Maybe he doesn't deal with windows or their users at all

Maybe you are looking for

  • Disk Utility + A Question For CD ROM Drives

    Hello, I Own A MacBook, It Was Shipped With Tiger On It And I Upgraded It To Leopard. Now A Little While Ago I Did A Format And Install To Start Fresh Again, The Internal HDD Is 80Gb, I Needed More So i Got A 80Gb EXTERNAL HDD. i wanted to partition

  • How to Reinstall iMovie

    Every time a open iMovie, I get: Missing QuickTime Component: The QuickTime component necessary to view, edit, import, and export high-definition movies is not installed. It is included with the iMovie installer; please re-install. How do I reinstall

  • Create BP as a organizationa unit (BUP004)

    Hi, I need some urgent help. I need to create BP's for support teams.(SD, MM, FI  etc). I tried creating same via transaction code BP -->create --> organization. However under "Create in bp role" i cannot find the option BUP004 (Organizational unit).

  • Floating point formats: Java/C/C++, PPC and Intel platforms

    Hi everyone Where can I find out about the various bit formats used for 32 bit floating numbers in Java and C/C++ for both Mac hardware platforms? I'm developing a Java audio application which needs to convert vast quantities of variable width intege

  • PHP And Flex

    I have been trying to understand (via researching the Web and books from bookstores) different ways to integrate a php/mysql database into my Flex application. There appear to be multiple solutions which is confusing me so I am looking for some advic