Java 5, Linux, 64-bit: Non-ASCII chars over socket

Hi,
I am having issues reading non-ASCII chars from a socket. I send a mixed message, with the first part in ASCII and the last part in non-ASCII. There are no issues reading the non-ASCII characters on Windows; the problem appears only when I run the server on Linux. The following is a message sample:
Start message<CRLF>
фвафывафвыафыв<CRLF>
The second part (which is encoded in either Windows-1250 or KOI8-R) comes out as a run of 3F bytes (the ASCII code for "?") when you look at the bytes received on Linux.
Any suggestions?
Thanks,
Max

You must be using Readers and Writers, so make sure you specify the same charset when constructing them on both ends. Don't leave this to the default: the platform default charset varies across platforms and has varied across releases.
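
A minimal sketch of that advice, written for a current JDK rather than Java 5. The class and method names (ExplicitCharsetDemo, encodeLine, decodeLine) are made up for illustration, and in-memory streams stand in for socket.getOutputStream() and socket.getInputStream(). It assumes the KOI8-R charset is available in the JDK, which is normally the case. The second half shows one way the 3F bytes can appear: encoding Cyrillic text with a charset that cannot represent it.

```java
import java.io.BufferedReader;
import java.io.ByteArrayInputStream;
import java.io.ByteArrayOutputStream;
import java.io.IOException;
import java.io.InputStreamReader;
import java.io.OutputStreamWriter;
import java.io.UncheckedIOException;
import java.io.Writer;
import java.nio.charset.Charset;

class ExplicitCharsetDemo {
    // Three Cyrillic letters (U+0444, U+0432, U+0430), written with escapes
    // so that the source file itself stays pure ASCII.
    static final String CYRILLIC = "\u0444\u0432\u0430";

    /** Encodes one CRLF-terminated line with an explicitly named charset. */
    static byte[] encodeLine(String line, String charsetName) {
        try {
            ByteArrayOutputStream wire = new ByteArrayOutputStream(); // stand-in for the socket stream
            Writer writer = new OutputStreamWriter(wire, Charset.forName(charsetName));
            writer.write(line + "\r\n");
            writer.flush();
            return wire.toByteArray();
        } catch (IOException e) {
            throw new UncheckedIOException(e);
        }
    }

    /** Decodes the first line from the bytes with an explicitly named charset. */
    static String decodeLine(byte[] wire, String charsetName) {
        try {
            BufferedReader reader = new BufferedReader(
                    new InputStreamReader(new ByteArrayInputStream(wire), Charset.forName(charsetName)));
            return reader.readLine();
        } catch (IOException e) {
            throw new UncheckedIOException(e);
        }
    }

    public static void main(String[] args) {
        // Same charset on both ends: the text survives the round trip.
        byte[] good = encodeLine(CYRILLIC, "KOI8-R");
        System.out.println(decodeLine(good, "KOI8-R").equals(CYRILLIC)); // true

        // A charset that cannot represent Cyrillic: every letter becomes 0x3F.
        byte[] bad = encodeLine(CYRILLIC, "US-ASCII");
        System.out.printf("%02X %02X %02X%n", bad[0], bad[1], bad[2]); // 3F 3F 3F
    }
}
```

If the Linux box runs with a POSIX/C locale, the default charset may well be an ASCII one, which would match the symptom, but that is a guess about the poster's setup.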

Similar Messages

  • Can Java be started in a directory that contains non ascii char

    I installed a product developed in Java in a folder whose name contains non-ASCII chars, such as Japanese or German characters.
    This causes an error saying: unable initialise java virtual machine, error code -1
    Someone said Java doesn't like being started in a directory that contains non-ASCII characters. There appears to be no way of passing it Unicode parameters.
    Has anyone hit a similar issue, or does anyone know the root cause of this problem?
    Thanks

    Yes, you can use your Web Start application console. To enter data that your application requires, it is a better idea to use a Java application that runs in console mode, although you may try running the Windows console and then reading data from its input stream.

  • Does Java 5 accept non-ascii chars as identifiers?

    I am surprised to find out that Java 5.0 accepts non-ASCII chars as identifiers.
    Is it true that Java 5 really accepts non-ASCII chars?
    Thanks.

    Here is the code:
    // The Chinese identifiers in the original post were lost to encoding damage;
    // the identifiers below are stand-ins of the same length.
    public class non中文类名name {
      private static void 中文函数() {
        System.out.println("this is called from a function with a chinese name!");
        int 变量1 = 1, 变量2 = 2;
        System.out.println("变量1 = " + 变量1 + ", 变量2=" + 变量2 + ", 和 = " + (变量1+变量2));
      }
      public static void main(String[] args) {
        中文函数();
      }
    }

  • FILE_DATASTORE and non-ASCII chars

    I have created an interMedia Text index
    with the FILE_DATASTORE option, so that
    interMedia treats table entries as
    filenames and indexes the corresponding
    file on the server's filesystem.
    But whenever the filename contains characters
    which are not part of the US7ASCII charset (like äöü ß), the file is not found. But both Oracle and the operating system support these characters.
    The Oracle instance uses UTF8 as internal
    characterset. The client which stores
    the filenames in the table uses the
    WE8ISO8859P1 charset. The values in the
    database table are stored and shown correctly
    when viewed with Oracle or Java client
    programs.
    So where does the charset conversion fail ?
    The names are stored correctly, they can be
    read correctly by clients, but the indexer
    seems to use a wrong charset to translate
    the filenames stored in the database into
    filenames of the operating system.
    Do I have to apply some additional configuration to my indexer?
    Greetings,
    Michael Skusa

  • Folders that having non-ascii chars are not displaying on MAC using JFileChooser

    On Mac OS X 10.8.2, I have the latest Java 1.7.25 installed. When I run my simple Java program, which lets me browse the files and folders of my native file system using JFileChooser, it does not show folders that have non-ASCII chars in their names. According to this link, this bug was reported for Java 7 Update 6 and fixed in 7 Update 10. But I am getting this issue in Java 1.7.21 and Java 1.7.25.
    Sample Code-
    {code}
    import java.util.logging.Level;
    import java.util.logging.Logger;
    import javax.swing.JFileChooser;
    import javax.swing.UIManager;
    import javax.swing.UnsupportedLookAndFeelException;

    public class Encoding {
        public static void main(String[] arg) {
            try {
                // NOTE: On the desktop there is a folder named DKF}æßj with special characters in its name.
                // It does not show up in the file chooser, and when accessed through File it is recognized
                // neither as a directory nor as a file (File Not Found exception).
                UIManager.setLookAndFeel(UIManager.getSystemLookAndFeelClassName());
            } catch (IllegalAccessException ex) {
                Logger.getLogger(Encoding.class.getName()).log(Level.SEVERE, null, ex);
            } catch (UnsupportedLookAndFeelException ex) {
                Logger.getLogger(Encoding.class.getName()).log(Level.SEVERE, null, ex);
            } catch (ClassNotFoundException ex) {
                Logger.getLogger(Encoding.class.getName()).log(Level.SEVERE, null, ex);
            } catch (InstantiationException ex) {
                Logger.getLogger(Encoding.class.getName()).log(Level.SEVERE, null, ex);
            }
            // Open the chooser on the current directory.
            JFileChooser chooser = new JFileChooser(".");
            chooser.showOpenDialog(null);
        }
    }
    {code}

    Hi,
    Did you try this link - osx - File.list() retrieves file names with NON-ASCII characters incorrectly on Mac OS X when using Java 7 from Oracle -…
    set the LANG environment variable. It's a GUI application that I want to deploy as an Mac OS X application, and doing so, the LSEnvironment setting
    <key>LSEnvironment</key>
    <dict>
      <key>LANG</key>
      <string>en_US.UTF-8</string>
    </dict>

  • IPhone: sectionIndexTitlesForTableView and non-ASCII chars

    You specify an array of strings for your section titles in your implementation of UITableViewDataSource method sectionIndexTitlesForTableView. However, it seems like if you have a string with non-ASCII characters, it is left blank (for example the Korean string "누리마루"). Anybody else encounter this issue?

    I'm bumping Dr. Chuck's thread for a similar problem with non-ASCII chars.
    The chars show up but the sorting is a bit off.
    Example: A, Å, B, ... Z
    Should be: A, B, ... Z, Å, Ä, Ö
    In Swedish, Å is one of the last letters of the alphabet and should not be placed after A despite looking similar.
    Any ideas?

  • Non-ASCII chars in applets?

    hi,
    Spent 4 hours to find a way to use non-ASCII chars in applets (buttons, textareas), but didn't make it.
    Simply saying
    TextFieldObj.setText("\uxxxx");
    //or any equivalent obj. Ex. of \uxxxx: \u015F
    doesn't work. I even went into Graphics.paint() example, but it too can paint only ASCII chars.
    My hunch is that it has something to do with Character.Subset, but I still can't figure out how to do it.
    Please SOS,
    Reshat.

    Hi,
    I just managed to get Buttons to show Greek characters, so it appears that static buttons are fine.
    However, I still face the same problem with TextFields:
    TextFields work fine in IE, but in NN they sometimes convert the text into ASCII and sometimes show "?". The same happens in HotJava.
    So there are 2 questions in my head:
    1. Why can't NN use the fonts used by IE to display non-ASCII chars?
    2. What is the safest font to use for non-ASCII chars, to cover the widest possible audience?
    P.S. Java solves most cross-platform and cross-browser problems, but the font issue still seems to depend on the user and his/her browser. It appears Java is not font-independent in a non-ASCII context. If so, it would be nice to develop a plug-in to make sure that if the user doesn't have the font, a Java-standardized Unicode-based font is used. Otherwise, the non-ASCII world is still without a real Java.
    Thank you for your feedback,
    Reshat.

  • How Aperture encodes non-ascii chars

    It looks like Aperture encodes chars as UTF in a way that seems a bit unusual; many applications (especially web apps) expect another way of encoding Unicode chars. The end result is that non-ASCII chars come out really strange.
    When I use applescript to read the IPTC info I need to do something like this before using the data:
    set c_title to normalize unicode p_title without decomposition
    Is there some way of telling Aperture to do this automatically when exporting photos?
    (sorry about the bad use of Unicode terminology but I don't know the area very well)

    I've spent quite a bit of time on this topic - I'm the author of Phoshare http://code.google.com/p/phoshare/, an open-source tool to export images and metadata from iPhoto and Aperture. Storing non-ASCII characters in image metadata is a messy business. If you want to get a taste of it, have a look at the Exiftool FAQ at http://www.sno.phy.queensu.ca/~phil/exiftool/faq.html#Q10 . A quote: "Most textual information in EXIF is stored in ASCII format, ... However *it is not uncommon for applications to write UTF‑8 or other encodings where ASCII is expected*". This leads to all sorts of problems, because the reader will have to make assumptions about the encoding used by the writer.
    In the case of Aperture, I have found that it writes metadata encoded in a way consistent with what most other applications and online services expect. Most of the encoding problems I've debugged were caused by bad input data, e.g. the characters were encoded improperly to begin with, but in a way that they still display as expected in some places (like Aperture). Can you try erasing some of the metadata in Aperture and retyping it from scratch (i.e. not copy-paste, which would paste the same stuff right back)? Also make sure that you force-update the preview images, so that the new metadata gets written into the files.
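
    For readers hitting the same thing outside AppleScript, here is a rough Java sketch of the normalization step from the question. It assumes that "normalize unicode ... without decomposition" corresponds to the composed form (NFC), which is what java.text.Normalizer produces with Form.NFC. The class and method names are made up for illustration.

```java
import java.text.Normalizer;

class NormalizeDemo {
    /** Returns the composed (NFC) form of the given text. */
    static String toComposed(String text) {
        return Normalizer.normalize(text, Normalizer.Form.NFC);
    }

    public static void main(String[] args) {
        // Decomposed form: the letter e followed by U+0301 COMBINING ACUTE ACCENT.
        String decomposed = "e\u0301";
        String composed = toComposed(decomposed);

        System.out.println(decomposed.length() + " -> " + composed.length()); // 2 -> 1
        System.out.println(composed.equals("\u00e9"));                        // true
    }
}
```

    Both strings render the same on screen, but they compare as different until one of them is normalized, which is why a web app expecting composed text can show odd results.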

  • UploadedFile and filenames with non-ascii chars

    Hi
    I'm using an UploadedFile object in my web app, and all works fine. However, when I try to upload a file whose filename contains non-ASCII chars (e.g. Spanish), I see that the getBytes method returns an empty byte array, the filename is not stored correctly (the non-ASCII chars are lost, replaced by another representation), and the content-type is application/octet-stream instead of image/png as it should be.
    If I rename that same file to have only ascii chars - everything is back to normal.
    How can I upload files with non-ascii chars in their name?

  • URGENT : Pass non-serializable objects over sockets

    I am working with a non-serializable class. I can't change it to serializable (since it's an API class). I can't create a subclass either, as I need to pass the exact same object. I have tried wrapping it in a serializable class (as a transient field), but then the particular object that I need is not serialized. Is there a way to pass non-serializable objects over sockets? Is there some other way of doing this? Should I use RMI or Externalizable? Any help would be appreciated.

    {snip}
    > Like this:
    > public class SerializableLibraryClass extends LibraryClass implements Serializable {
    > }
    > Then you've got a class that is exactly the same as LibraryClass, but is now serializable.
    Not quite. The base class still isn't Serializable. Take, for example, the following code:
    public class A {
        public int a1;
    }
    public class B extends A implements Serializable {
        public int b1;
    }
    B beforeSend = new B();
    beforeSend.a1 = 1;
    beforeSend.b1 = 2;
    objectOutputStream.writeObject(beforeSend);
    B afterSend = (B) objectInputStream.readObject();
    Both beforeSend and afterSend have fields a1 and b1. For beforeSend, a1 = 1 and b1 = 2. For afterSend, b1 = 2 but a1 = 0. The superclass wasn't Serializable; therefore, its members weren't serialized and, when the object was created on the other side of the stream, a1 was left at its initial value.
    In short, you can't just Serialize the subclass. That Serialization won't apply to its superclasses.
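
    The behaviour described above can be reproduced with the sketch below, which also shows one possible workaround: a subclass that writes and restores the inherited field by hand in writeObject/readObject. It assumes that subclassing is acceptable after all, that the library class has an accessible no-arg constructor, and that its state is reachable through accessible fields or setters. A, B and C here are illustrative stand-ins, not real API classes, and a byte array replaces the socket streams.

```java
import java.io.ByteArrayInputStream;
import java.io.ByteArrayOutputStream;
import java.io.IOException;
import java.io.ObjectInputStream;
import java.io.ObjectOutputStream;
import java.io.Serializable;

class SuperclassFieldDemo {
    /** Stand-in for the library class: not Serializable, implicit no-arg constructor. */
    static class A {
        public int a1;
    }

    /** Plain subclass: only b1 ends up in the stream. */
    static class B extends A implements Serializable {
        private static final long serialVersionUID = 1L;
        public int b1;
    }

    /** Subclass that also saves and restores the inherited field by hand. */
    static class C extends A implements Serializable {
        private static final long serialVersionUID = 1L;
        public int c1;

        private void writeObject(ObjectOutputStream out) throws IOException {
            out.defaultWriteObject(); // writes c1
            out.writeInt(a1);         // appends the superclass state
        }

        private void readObject(ObjectInputStream in) throws IOException, ClassNotFoundException {
            in.defaultReadObject();   // restores c1
            a1 = in.readInt();        // restores the superclass state
        }
    }

    /** Serializes the object to a byte array and reads a copy back. */
    static Object roundTrip(Object obj) {
        try {
            ByteArrayOutputStream bytes = new ByteArrayOutputStream();
            ObjectOutputStream out = new ObjectOutputStream(bytes);
            out.writeObject(obj);
            out.close();
            ObjectInputStream in = new ObjectInputStream(new ByteArrayInputStream(bytes.toByteArray()));
            return in.readObject();
        } catch (IOException | ClassNotFoundException e) {
            throw new IllegalStateException(e);
        }
    }

    public static void main(String[] args) {
        B b = new B();
        b.a1 = 1;
        b.b1 = 2;
        B bCopy = (B) roundTrip(b);
        System.out.println(bCopy.a1 + " " + bCopy.b1); // 0 2

        C c = new C();
        c.a1 = 1;
        c.c1 = 2;
        C cCopy = (C) roundTrip(c);
        System.out.println(cCopy.a1 + " " + cCopy.c1); // 1 2
    }
}
```

    Either way the receiver gets a copy, not the same object, so this does not help if object identity really matters.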

  • Pass non-serializable objects over sockets

    I am working with a non-serializable class. I can't change it to serializable (since it's an API class). I can't create a subclass either, as I need to pass the exact same object. I have tried wrapping it in a serializable class (as a transient field), but then the particular object that I need is not serialized. Is there a way to pass non-serializable objects over sockets? Is there some other way of doing this?

    Have you considered leaving it in situ and using RMI to manipulate it remotely ?
    To be a little pedantic, if you need the "same exact object" then there's no technique that will work for you, since you can only copy an object between JVMs, not move it.
    D.

  • GSSName  is corrupted for non ascii chars

    Hi,
    I have a setup where a web application is deployed to use SPNEGO for user authentication ( using kerberos V ) and authorization.
    We have several users with non-English characters in the user ID. Kerberos authentication succeeds for such users (the KDC / Active Directory returns a valid Kerberos ticket, which the client embeds in the SPNEGO token). However, passing the SPNEGO token to the GSS API and extracting the user name returns an incorrect user name. All non-ASCII characters in the user name are replaced with junk byte sequences.
    We use the JGSS API (with JRE 1.4.08) to extract the SPNEGO token and create a GSS security context object. Later, the GSS name is extracted from the GSS context object.
    Currently I am testing SPNEGO authentication for a user with user ID 123<sp char>. The Unicode value of <sp char> is FE and its UTF-8 encoded byte sequence is C3 BE. However, if I invoke the 'export' method of the GSSName object and examine the returned byte sequence, the sequence EF BF BD EF BF BD is present instead of C3 BE. The byte sequences for the other English characters are correct.
    Is this a defect in GSS-API ? Or am I not using GSS properly?
    Do I need to have any special setup / configuration for using JGSS with kerberos V for users with non ascii characters in the user ID?
    Please advise.
    Regards,
    Jayaram.
    Message was edited by:
    s_jayaram_s

    I understand that this is an older thread.
    We have spent a lot of time on the internet looking for possible workarounds or permanent solutions to enable multi-byte character support for username / password / SPN with Kerberos and Java, but no luck so far :(
    Are there any new updates on this i18n issue ?
    Thanks,
    Venkatesh
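
    One observation that may help narrow this down: EF BF BD is the UTF-8 encoding of U+FFFD, the Unicode replacement character. The sketch below shows one way the reported sequence can arise, namely when the two UTF-8 bytes C3 BE are decoded with an ASCII charset and the result is encoded as UTF-8 again. Whether JGSS does this internally is not confirmed anywhere in this thread; the class and method names are made up for illustration.

```java
import java.nio.charset.StandardCharsets;

class ReplacementCharDemo {
    /** Decodes the bytes as US-ASCII, then encodes the resulting text as UTF-8. */
    static byte[] decodeAsAsciiThenEncodeUtf8(byte[] original) {
        // Bytes above 0x7F are not valid ASCII, so each one becomes U+FFFD.
        String decoded = new String(original, StandardCharsets.US_ASCII);
        return decoded.getBytes(StandardCharsets.UTF_8);
    }

    /** Formats bytes as space-separated upper-case hex pairs. */
    static String hex(byte[] bytes) {
        StringBuilder sb = new StringBuilder();
        for (byte b : bytes) {
            if (sb.length() > 0) {
                sb.append(' ');
            }
            sb.append(String.format("%02X", b & 0xFF));
        }
        return sb.toString();
    }

    public static void main(String[] args) {
        byte[] utf8Thorn = {(byte) 0xC3, (byte) 0xBE}; // UTF-8 for U+00FE
        System.out.println(hex(decodeAsAsciiThenEncodeUtf8(utf8Thorn))); // EF BF BD EF BF BD
    }
}
```

    Two input bytes turning into two replacement characters suggests the name went through a single-byte, ASCII-like decoding step somewhere before export.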

  • Can ims5.2p1 do sieve filter searches with non-ASCII chars?

    Particularly, is it possible with the
    Subject header line, which is
    =?xxx?Q?xxxxx?= or =?xxx?B?xxxxx?= encoded
    Resolved the fileinto foldername part, but
    now found that seems like ims's sieve filter
    cannot do non-ASCII matches... Thanks.

    Hi, back! Spent a few hours experimenting and found
    that everything is working great (including the creation
    of international non-ASCII foldernames) when I used
    utf-8 encoding in the sieve filter rules for
    the match strings and the folder names... at least
    so far so good... for your ref and sorry for bothering.
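
    For reference, the =?xxx?Q?xxxxx?= and =?xxx?B?xxxxx?= forms mentioned in the question are MIME encoded-words (RFC 2047): a charset name, then B for Base64 or Q for a quoted-printable-like encoding, then the payload. The sketch below decodes a single encoded-word with the standard library only. It is an illustration of the notation, not how ims processes headers; it ignores multiple adjacent words and other corner cases, and the class name is made up.

```java
import java.io.ByteArrayOutputStream;
import java.nio.charset.Charset;
import java.util.Base64;
import java.util.regex.Matcher;
import java.util.regex.Pattern;

class EncodedWordDemo {
    // Shape of one encoded-word: =?charset?B-or-Q?payload?=
    private static final Pattern ENCODED_WORD =
            Pattern.compile("=\\?([^?]+)\\?([BbQq])\\?([^?]*)\\?=");

    /** Decodes a single encoded-word; any other input is returned unchanged. */
    static String decode(String word) {
        Matcher m = ENCODED_WORD.matcher(word);
        if (!m.matches()) {
            return word;
        }
        Charset charset = Charset.forName(m.group(1));
        String payload = m.group(3);
        byte[] bytes = m.group(2).equalsIgnoreCase("B")
                ? Base64.getDecoder().decode(payload)
                : decodeQ(payload);
        return new String(bytes, charset);
    }

    /** Q encoding: underscore means space, =XX is a hex byte, the rest is literal. */
    private static byte[] decodeQ(String payload) {
        ByteArrayOutputStream out = new ByteArrayOutputStream();
        for (int i = 0; i < payload.length(); i++) {
            char c = payload.charAt(i);
            if (c == '_') {
                out.write(' ');
            } else if (c == '=' && i + 2 < payload.length()) {
                out.write(Integer.parseInt(payload.substring(i + 1, i + 3), 16));
                i += 2;
            } else if (c == '=' && i + 2 == payload.length()) {
                out.write(Integer.parseInt(payload.substring(i + 1), 16));
                i += 1;
            } else {
                out.write(c);
            }
        }
        return out.toByteArray();
    }

    public static void main(String[] args) {
        System.out.println(decode("=?ISO-8859-1?Q?Andr=E9?=").equals("Andr\u00e9")); // true
        System.out.println(decode("=?UTF-8?B?w6k=?=").equals("\u00e9"));             // true
    }
}
```

    A filter that compares against the raw header sees the encoded form, which may be why matching only started working once the rule strings were written in UTF-8.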

  • Page encoding / charset / special chars / NON-ASCII chars

    Whenever special characters are used in our environment, the data is stored incorrectly ...
    ® gets converted to ®
    and other special chars get converted to other special chars ... The strange thing is that some inputText fields in some pages work just fine (chars don't get converted) while it happens in many other pages ... I'm not sure what's going on. I tried changing the charset or the pageEncoding of the pages, and even manually calling setPageEncoding on the request and the response, with no luck ... When I print the request params, the values are already converted, so I'm not sure at what stage the string gets converted or how to prevent it ...
    Charset of the jsp-page is UTF-8. Need help here please.
    Thanks.

    You must verify that all resources are in UTF-8. There are a couple of places you have to check:
    Compiler encoding
    Embedded OC4J JSP Compiler options
    Project settings
    After that it worked for me.
    Timo

  • Exchanging non ASCII bytes by socket tcp

    Hello:
    I'm trying to exchange characters with codes 0-255 of the ASCII character table. I'm using the CharToByteConverter and ByteToCharConverter classes in my client/server programs, but I can't get it to work.
    The client/server programs use TCP sockets, but I'm not sure which data structures I must use in order to send bytes with these characteristics (values 0-255).
    For example, I need to send the following array to the server program:
    char[] car = new char[15];
    car[0] = (char)129;
    car[1] = (char)12;
    car[2] = (char)201;
    car[3] = (char)1;
    car[4] = (char)48;
    car[5] = (char)48;
    car[6] = (char)1;
    car[7] = (char)1;
    car[8] = (char)95;
    car[9] = (char)4;
    car[10] = (char)17;
    car[11] = (char)3;
    car[12] = (char)22;
    car[13] = (char)38;
    car[14] = (char)53;
    Note: With UDP (UDP sockets) the data interchange and the conversions work correctly.
    Any comment will be appreciated, regards,
    Ulises

    > You are right. The modem is a device that works (sends
    > and receives) with the TCP/IP or UDP/IP protocol (it is a
    > GPRS modem). So, I can simulate it with a Java
    > program sending data over a TCP/IP socket. In this case
    > it is the same.
    > The message format is a given. I can't choose it.
    Then that is the message that you are dealing with.
    Presumably this is because you are only writing the client or server and not both.
    > My problem is that, for example, I'm waiting (in my Java
    > program with a TCP/IP server socket, actually) for a
    > byte 129 as the first byte, but instead of it, I receive
    > a byte 63.
    Then one of the following is true.
    - You are using a class that you should not be, like DataInputStream or String.
    - Your understanding of the message protocol is incorrect (perhaps because the spec is wrong.)
    - Or the message that is being sent is wrong but presumably there is nothing that you can do about that.
    Sockets send and receive bytes. They don't send and receive characters.
    Now it could be that the data being sent is indeed characters but that has nothing to do with the socket. It doesn't transform them, map them, nor touch them in any way.
    > In addition, depending on the data structure used
    > (char[] or byte[]) as the buffer, I lose some bytes.
    Then either you are doing something wrong (like using the buffer size rather than the read size) or the message protocol is very ill defined. If the second, then you will have to develop a fuzzy algorithm to guess at the correct behavior.
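
    To make the two points above concrete, here is a small sketch that keeps the data as bytes from end to end and reads using the count returned by read() rather than the buffer size. In-memory streams stand in for socket.getOutputStream() and socket.getInputStream(); the class and method names are made up for illustration. The last method shows how a 129 can turn into 63 once a charset conversion gets involved.

```java
import java.io.ByteArrayInputStream;
import java.io.ByteArrayOutputStream;
import java.io.EOFException;
import java.io.IOException;
import java.io.InputStream;
import java.io.OutputStream;
import java.io.UncheckedIOException;
import java.nio.charset.StandardCharsets;
import java.util.Arrays;

class RawBytesDemo {
    /** Converts values in the range 0..255 to bytes with no charset step. */
    static byte[] toBytes(int... values) {
        byte[] out = new byte[values.length];
        for (int i = 0; i < values.length; i++) {
            out[i] = (byte) values[i];
        }
        return out;
    }

    /** Reads exactly n bytes, advancing by the count each read() returns. */
    static byte[] readExactly(InputStream in, int n) throws IOException {
        byte[] buf = new byte[n];
        int off = 0;
        while (off < n) {
            int got = in.read(buf, off, n - off);
            if (got < 0) {
                throw new EOFException("stream ended after " + off + " bytes");
            }
            off += got;
        }
        return buf;
    }

    /** Writes the values as raw bytes, reads them back, returns them as 0..255. */
    static int[] roundTrip(int... values) {
        try {
            ByteArrayOutputStream wire = new ByteArrayOutputStream();
            OutputStream out = wire; // with a socket: socket.getOutputStream()
            out.write(toBytes(values));
            out.flush();
            byte[] received = readExactly(new ByteArrayInputStream(wire.toByteArray()), values.length);
            int[] result = new int[received.length];
            for (int i = 0; i < received.length; i++) {
                result[i] = received[i] & 0xFF; // undo Java's signed byte
            }
            return result;
        } catch (IOException e) {
            throw new UncheckedIOException(e);
        }
    }

    /** What the receiver sees if the value goes through a char and an ASCII encoder. */
    static int viaCharset(int value) {
        return String.valueOf((char) value).getBytes(StandardCharsets.US_ASCII)[0] & 0xFF;
    }

    public static void main(String[] args) {
        System.out.println(Arrays.toString(roundTrip(129, 12, 201))); // [129, 12, 201]
        System.out.println(viaCharset(129));                          // 63
    }
}
```

    The 63 is the "?" that the encoder substitutes for a character it cannot represent, which fits the first bullet above.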
