Encodings

Hello,
I'm reading from a URL class as shown here:
http://java.sun.com/docs/books/tutorial/networking/urls/readingURL.html
However, if the site is in Cyrillic (or another non-English language), what I get is a lot of ???? question marks, or sometimes other strange symbols. What should I do to display it correctly?
Thanks.

Use the String constructor that takes a charset name (e.g. charsetName = "UTF-8"):

public String(byte[] bytes, String charsetName)
    throws UnsupportedEncodingException
Constructs a new String by decoding the specified array of bytes using the specified charset. The length of the new String is a function of the charset, and hence may not be equal to the length of the byte array.
The behavior of this constructor when the given bytes are not valid in the given charset is unspecified. The CharsetDecoder class should be used when more control over the decoding process is required.
Parameters:
bytes - the bytes to be decoded into characters
charsetName - the name of a supported charset
Throws:
UnsupportedEncodingException - If the named charset is not supported
Since:
JDK1.1
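
For example, a minimal sketch of reading a page with an explicit charset (the URL and the hard-coded "UTF-8" are assumptions; ideally the charset should be taken from the response's Content-Type header):

import java.io.BufferedReader;
import java.io.InputStreamReader;
import java.net.URL;

public class URLReader {
    public static void main(String[] args) throws Exception {
        URL url = new URL("http://www.example.com/"); // hypothetical page
        // Decode the byte stream with an explicit charset instead of the
        // platform default, which is what produces the ???? characters.
        BufferedReader in = new BufferedReader(
                new InputStreamReader(url.openStream(), "UTF-8"));
        String line;
        while ((line = in.readLine()) != null) {
            System.out.println(line);
        }
        in.close();
    }
}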

Similar Messages

  • Encodings folder empty, Disk Utility won't repair disk

    Hello, I've been having some issues with my old PowerBook G4. I've already done a clean reinstall on this machine, which seemed to correct a lot of problems, but there seems to be at least one significant problem remaining: the Encodings folder in System/Library/CoreServices/Encodings is completely empty, so iChat crashes when I try to type a message, and Firefox crashes when it hits a website with Chinese on it. Anyway, I ran disk repair from the 10.3 install disc, but after a while the repair stops, saying that it can't repair the volume. I've also tried just copying the Encodings folder from the install disc, but that encounters an error (I think it's error 52 or something). Thank you very much for any help you could provide.

    I should also add that I've run the Apple Hardware Test CD that came with the computer, and it says all the hardware is fine. But when I boot the computer in safe mode and have Disk Utility try to "Repair Disk," it stops after about ten to twenty minutes, saying that "the volume can't be repaired." So I'm wondering if there really is a problem with the hard drive that the hardware test isn't picking up, in which case even if I reinstall the OS, or do a clean install of a newer OS like 10.5, the problems won't go away. Is this a hardware issue, or am I just going about this the wrong way?
    I've tried reinstalling several of the language packages from the OS 10.3 disc, but I didn't try much because I don't know which of the language packages installs the files for the Encodings folder.
    And on the first OS 10.3 install CD, I found the path to the Encodings folder and tried copying its contents into the Encodings folder on my hard drive, working under the assumption that for about five minutes I get root access. It throws up an error message when I do that, even though it lets me tinker around in the other parts.
    I'm really in the dark as to what I'm doing here, which is why I haven't tried much else. So any advice or info you can throw my way would be very much appreciated.

  • The change to a different font was not done because the chosen font and the font encodings in the document differ

    Hi,
    I created a PDF from a Word document that used Cambria font, and when I select it in Acrobat to try to change the font I get "The change to a different font was not done because the chosen font and the font encodings in the document differ and could not be resolved"
    How can I change the font and why is it preventing me from doing so?
    Thanks,
    Juan

    Juan,
    No, Acrobat is not a word processor.
    More to the point, Acrobat lets one "work" PDF, and PDF is not a file format designed for word processing, editing, formatting, or page layout.
    PDF is, essentially, a "final destination" format.
    What it is is described in the ISO Standard for PDF (ISO 32000-1).
    Be well...

  • Trying to understand text encodings between windows clients and oracle DB

    I am focusing on a (possibly misconfigured) Oracle Windows XP client connecting to a well-configured Oracle DB server (10g, for example). Instead of just keeping the correct client settings to make it work, I would prefer to understand what the Oracle client is really doing under all the possible bad configurations regarding encodings.
    When the client executes something like "SELECT 'Col1', N'Col2' FROM dual", the statement itself must be encoded before being sent to the server.
    But the way the statement is encoded, or whether some magic encoding transformation occurs, depends on the technology used (Java Thin, OCI, OLE DB, etc.), sometimes on the Windows setting "Language for non-Unicode programs", and sometimes on the client's NLS_LANG setting in the registry. Or maybe things are simpler and I simply got confused... Add to that the possibility that some third-party tool does some other "helpful" hidden encoding transformation to make it work, and things become interesting.
    A second point where all this encoding business matters is when the results of the statements are retrieved on the client.
    At this point the client receives a byte stream from the server (which could be a VARCHAR column or an NVARCHAR column). Again, depending on the technology used, the client could be expecting a utext or text (OCI), a CHAR or NCHAR (in Java), a SQL_C_WCHAR or SQL_C_CHAR (in OLE DB), etc.
    Well, I'm not sure at all about the first point. Is the whole statement encoded the same way before being sent to the server (i.e., sent as plain text), or does the client first parse the statement and understand that it is sending a SELECT that returns two columns (one VARCHAR and the other NVARCHAR)? For example, are the N, the quotes, and Col2 all sent using the same encoding?
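    (For illustration: on the JDBC side, the question of how the N'...' literal travels can be sidestepped by binding parameters, so the driver performs the per-column conversion itself. A minimal sketch using the standard JDBC 4.0 API; the table and column names are hypothetical.)

    PreparedStatement ps = conn.prepareStatement(
            "INSERT INTO demo_tab (col1, col2) VALUES (?, ?)");
    ps.setString(1, "varchar data");    // converted via the database character set
    ps.setNString(2, "nvarchar data");  // converted via the national character set
    ps.executeUpdate();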

    Todd:
    Ref:
    http://docs.oracle.com/cd/E35855_01/tuxedo/docs12c/ads/adecid.html#wp1075436
    Section: Generating ECID by Native/WS/Jolt clients and Domain Gateway
    We are using Jolt clients, via JSL, for executing Tuxedo services. The problem we always face while debugging is cross-correlation. It would be extremely useful for us if we could get the ECID printed in the web server and Tuxedo server processes. Can we get the ECID using any programming API in the Jolt client and in the Tuxedo server process? It would help us correlate web, Tuxedo, and server logs.
    Thanks,
    Biju

  • IE,East Asian encodings

    Hi to all! Problem: I have a JSP page which sends text to a servlet. The servlet writes the text received from the JSP to a file and converts it to UTF-8. With Firefox it works fine (text is properly converted to UTF-8), but with IE 6.0 I have trouble: converting to UTF-8 from East Asian languages does not work (Chinese encodings gb2312 and gb18030, Korean, Japanese; in the file I see only "?" symbols instead of text), yet the Chinese Big5 encoding works fine :-/ I have also detected that IE does not send the "accept-charset" header (Firefox always sends it). I think my servlet code is correct, because the conversion works fine with text received from Firefox. Has anybody here had a similar problem? How can I solve it? Thanks :)

    Everything works fine in both IE and Firefox if the Unicode (UTF-8) encoding is selected.
    This is my code:
    FileOutputStream fos = new FileOutputStream("test.txt");
    Writer out = new OutputStreamWriter(fos, "UTF-8");
    out.write(request.getParameter("text"));
    out.flush();
    out.close();
    try {
        // Copy test.txt into work.dat, decoding and re-encoding as UTF-8
        Reader in = new InputStreamReader(new FileInputStream("test.txt"), "UTF-8");
        Writer out1 = new OutputStreamWriter(new FileOutputStream("work.dat"), "UTF-8");
        char[] buf = new char[1024];
        int n;
        while ((n = in.read(buf)) != -1) {
            out1.write(buf, 0, n);
        }
        out1.flush();
        out1.close();
        in.close();
    } catch (IOException e) {
        e.printStackTrace();
    }
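    Since IE does not send a charset with the request, the servlet container decodes the posted parameters with its default (typically ISO-8859-1), which destroys East Asian text before the code above ever runs. A minimal sketch of the usual fix (an assumption, since the poster's JSP is not shown): set the request encoding before the first getParameter() call, and serve the form page itself as UTF-8.

    // Must be called before the first request.getParameter(...) call
    request.setCharacterEncoding("UTF-8");
    String text = request.getParameter("text");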

  • Java IO for different file encodings

    Hi,
    I need a file reading mechanism wherein I should be able to read files of just about any encoding (e.g. Shift_JIS, EBCDIC, UTF-8, Unicode, etc.).
    I could do this using the following code:
    FileInputStream fis = new FileInputStream("D:\\FMSStub\\ZENMES2.txt");
    InputStreamReader isr = new InputStreamReader(fis, "Unicode");
    BufferedReader br = new BufferedReader(isr);
    char[] c1 = new char[1024]; // buffer (declaration was missing in the post)
    br.read(c1);
    But there is a requirement in our code, to also read some trailers from the file and go back and again read from the middle of the file. Basically a seek kind of functionality.
    I have been desperately trying to figure out how I can do a seek (which is possible with RandomAccessFile, but RandomAccessFile won't work with all the various encodings; I have tried this) with the above kind of code.
    Any information on this would be very useful.
    Regards,
    Mallika.

    Hi,
    Thanks for your reply.
    But as you say, when I do a new String(byte[], "encoding"), my String does not get formed.
    The case which i tried was a unicode file.
    I read first few bytes and formed a string as follows:
    byte blfilestr[] = new byte[1000];
    ra = new RandomAccessFile("D:\\FMSStub\\ZENMES2.txt", "r");
    ra.seek(0);
    int ilrd = ra.read(blfilestr);
    String gfilestr = new String(blfilestr, "Unicode");
    This gave me the String correctly.
    But then I did a seek to an intermediate position in the file (a location which I confirmed is not in between the two bytes of a Unicode char), and it fails to form the string correctly. I need to do this, though.
    Any ideas ?
    Mallika.
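
    One likely explanation (an assumption, since the file's byte order isn't stated): Java's "Unicode" charset is UTF-16 with byte-order-mark detection. A read starting at offset 0 sees the BOM, but a read starting mid-file does not, so the decoder falls back to its default byte order and garbles the text. A sketch that sidesteps this by naming the byte order explicitly; the "UTF-16LE" and the offset are assumptions:

    import java.io.RandomAccessFile;

    public class SeekAndDecode {
        public static void main(String[] args) throws Exception {
            RandomAccessFile ra = new RandomAccessFile("D:\\FMSStub\\ZENMES2.txt", "r");
            ra.seek(500);                 // must be an even offset for UTF-16
            byte[] buf = new byte[1000];
            int n = ra.read(buf);
            // An explicit byte order needs no BOM, so mid-file reads decode correctly
            String s = new String(buf, 0, n, "UTF-16LE");
            System.out.println(s);
            ra.close();
        }
    }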

  • Is it possible to read native encodings without using native2ascii

    Hi All,
    I have a text file containing Japanese characters, saved in UTF-8 format. I can read this into my Java application after converting it using the native2ascii tool.
    However, I'm wondering whether I can read it directly into my Java application without going through the native2ascii tool, since the file is already in UTF-8 format. I have tried, but it doesn't work. Please advise.
    Thanks.

    You missed reading the java.util.Properties documentation:
    "When saving properties to a stream or loading them from a stream, the ISO 8859-1 character encoding is used. For characters that cannot be directly represented in this encoding, Unicode escapes are used; however, only a single 'u' character is allowed in an escape sequence. The native2ascii tool can be used to convert property files to and from other character encodings."
    What's so bad about using native2ascii?
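
    That said, on Java 6 or later, Properties also has a load(Reader) overload, so a UTF-8 properties file can be read directly without native2ascii (a minimal sketch; the file name and key are hypothetical):

    import java.io.FileInputStream;
    import java.io.InputStreamReader;
    import java.io.Reader;
    import java.util.Properties;

    public class LoadUtf8Properties {
        public static void main(String[] args) throws Exception {
            Properties props = new Properties();
            // The Reader overload bypasses the ISO 8859-1 restriction
            Reader r = new InputStreamReader(
                    new FileInputStream("messages_ja.properties"), "UTF-8");
            props.load(r);
            r.close();
            System.out.println(props.getProperty("greeting"));
        }
    }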

  • Mapping files for pre-defined Encodings?

    Hi there,
    Does anybody know if, and where, it is possible to download the PDF pre-defined encodings (StandardEncoding, MacRomanEncoding, WinAnsiEncoding, PdfDocEncoding and MacExpertEncoding) as some kind of processable file?
    I found this: http://www.unicode.org/Public/MAPPINGS/VENDORS/ADOBE/ ...but I'm not sure if they are complete and up-to-date.
    Furthermore, I can't find a processable file for MacExpertEncoding at all.
    Anybody?
    Thanks!
    Jan

    When you have a chance, please contact me via email.
    I think that we can accommodate your request, and the discussion will be smoother if we conduct it through email exchanges.
    Thanks...
    -- Ken

  • Chosen font and font encodings differ & could not be resolved: editing problem

    Can anyone advise how I can edit text in a document I received? The document was generated by Ghostscript.
    When attempting to edit text, a message says, "All or part of the selection has no available system font. You cannot add or delete text using the currently selected font."
    When I try to alter the font through TouchUp Properties, the error message says, "The change to a different font was not done because the chosen font and the font encodings in the document differ and could not be resolved."
    I use Adobe Acrobat Standard, version 7.0. Is this problem solved in versions 8 or 9?

    This IS an Acrobat problem: they should have anticipated it and provided a way around it (falling back to a more common font, for example).
    The work-around is to copy the text, create a rectangle with a white fill color, then place it on top of the text in Acrobat to cover it up. Paste in the text using whatever font you wanted.

  • Character Encodings when using Streams for Text

    hi there,
    I have a table with a key column (ordinary VARCHAR) and a text column (a memo field in MS Access; the connection is made via the JDBC-ODBC bridge). If I read the data with ResultSet.getAsciiStream(), everything is fine, i.e., printing the resulting strings looks OK. Then I use this method to read it and write it into a second table with the same structure, but using PreparedStatement.setCharacterStream(). Now I read from the second table using ResultSet.getCharacterStream() and print the strings, but they seem garbled. Something in the text conversion must have gone wrong. I never set any character encoding, so the encodings should all be the default.
    any ideas?
    robert

    Can you confirm that the data has been stored to the second table correctly? I.e., is it getCharacterStream or setCharacterStream that's causing the problems?

    That's difficult to answer: if I use the Access interface to look into both tables, they look alike. In particular, the target table looks OK.
    But if I read the data with getCharacterStream(), I get something that looks as if this had happened: during storing with setCharacterStream(), two bytes seem to be stored in the table for every character. After reading, every second character in the output looks OK and the others look garbled. So it seems that on reading, a character is made from only one byte in the database instead of two, and every second byte becomes junk.
    Unfortunately, there is neither a method in Reader that allows setting the encoding, nor a variant of getCharacterStream() that accepts an encoding parameter. (The same holds for setCharacterStream(); plus, if I create a StringReader, there is no way to set the encoding.)
    Do you still have any ideas?
    robert
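
    One way to take the bridge's guesswork out of the picture is to fetch the raw bytes and decode them yourself (a sketch; the table and column names, and the "UTF-16LE" charset, are assumptions based on the two-bytes-per-character symptom):

    // Fetch the raw bytes and decode them with an explicit charset
    ResultSet rs = stmt.executeQuery("SELECT textcol FROM second_table");
    while (rs.next()) {
        byte[] raw = rs.getBytes("textcol");
        String text = new String(raw, "UTF-16LE");
        System.out.println(text);
    }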

  • Is it possible to add new character set encodings?

    Hello,
    Is it possible to add new character set encodings in Mac OS?
    Specifically, I need to add the "Russian (DOS)" encoding to Tiger, but is that at all possible, even in newer versions of Mac OS?
    Where can I find the missing encodings, and how do I install them?
    A Google search did not return much.
    Thanks.

    Well, it was a general question about the possibility of adding new encodings in Mac OS, but I posted a more specific question about Tiger here:
    http://discussions.apple.com/thread.jspa?threadID=2692557&tstart=0
    On Tiger I do not have Russian (DOS) in TextEdit, only Cyrillic (DOS).
    The same goes for TextWrangler on Tiger.
    Strangely, I do have Russian (DOS) in TeXShop on Tiger.

  • Encodings stuck in FMLECmd /s output list

    Hi,
    I am running FMLE Command Line using FMLECmd.exe tool from several machines, all Windows 7 x64, all FMLE 3.2 latest.
    It very often happens that some encodings get stuck in the FMLECmd.exe /s output list; see the screenshot.
    These processes are NOT running; there is no FMLE process running. They are just stuck there.
    The only way I have found to remove them is to run a profile with that name and stop it from FMLECmd.exe, which is very long and annoying.
    Rebooting the machine does not help.
    It looks like this list is saved per user on the machine; if I log in as another user on the machine, I don't get the same output.
    Any hint on how to clean the output?
    Is this saved on the Windows registry?
    Thanks,
    Nicola

    Hi,
    Take a look at the programs RSWUWFML2 and SWN_SELSEN. These programs send mails based on open work items and also have the option of sending a link to the executable work item in the SAP Business Workplace. The technique behind these executable links is what you need, I guess.
    Regards,
    Joost

  • Using include-xml and different character encodings

    I have static XML documents which declare different character encodings in the XML prolog:
    ISO-8859-1, UTF-8, and Shift_JIS.
    Each of these documents has entities declared in an internal DOCTYPE declaration.
    I have an XSQL page with several <xsql:include-xml href="???.xml"/> statements.
    I want to do one of two things:
    1) make UTF-8 the ultimate encoding of the resulting XML data, or
    2) force the encodings of the individual documents to be changed to UTF-8.
    I want to get a final XML data stream which can be successfully parsed and transformed. The XSQL page has a stylesheet reference.
    Any assistance would be greatly appreciated.

    karol wrote: "XML and XSLT are separate, but they're not suitable for printing / PDF generation - LaTeX is. XML is really fluid and has no notion of typesetting built in. It's perfect for online reports you view with a web browser."
    Can I ask why XML and XSLT aren't suitable for printing / PDF generation? Using XSL-FO I can define a page size, margin widths, etc., right? Just because it isn't normally done doesn't immediately make it a bad idea. I was under the impression that XML was intended to be abstract enough that it could be used for more than just web pages or data transfer.

  • Aquamacs Asian encodings

    Hi! I'm new to Emacs and am having trouble opening files with Asian fonts (Chinese and Japanese; I have not tried others). I'm using Aquamacs 2.4 on a 2009 MBP 15'' running 10.6.8. I can open these files (mostly .txt) in TextEdit just fine as long as I manually open the Open dialog, choose the file, and choose the encoding Japanese (Windows, DOS). For Chinese, I have to use Chinese (GB 18030). When I try to open these files in Aquamacs, I get a lot of invisible characters. Sometimes I can see kana, but I can never see kanji/Chinese characters. I have tried opening these files in all of the Japanese modes and all of the possible Japanese and Chinese encodings listed at: https://sites.google.com/site/babylonsbabylon999/irc999/charset. The only results I have had are: (for Japanese files) nothing happens, everything is shown as escape codes (I think that's what they're called? The \[number] things), or most things remain escaped but some convert to visible kana; and for Chinese files, I just get invisible characters with some white boxes and '§' strewn about. The characters are still there, as I can copy and paste some of the invisible characters into TextEdit and see them. Any ideas/help will be much appreciated!
    bartok94
    Oh, and I can export the .txt files to .html using org-mode and view the files and content just fine through my browser, which isn't too surprising, I guess. Though opening the HTML file in Aquamacs again only results in invisible characters plus HTML formatting.

  • Supported Encodings

    Hi,
    I found the following statements in the internationalization tutorial that I need help understanding:
    "The list of supported character encodings is not part of the Java programming language specification. Therefore the character encodings supported by the APIs may vary with platform. To see which encodings the Java Development Kit supports, see the Supported Encodings document."
    So do they mean that the character encodings supported on Unix are different from those supported on Windows, and so on? If so, isn't the next statement contradicting it? If the supported encodings are not part of the Java programming language, what is the meaning of "the encodings that the JDK supports"?
    Thanks
    Pratima

    http://java.sun.com/j2se/1.4.2/docs/guide/intl/encoding.doc.html
    This link should explain things for you.
    Mark
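
    To make the distinction concrete: the language specification only guarantees a small core of charsets (US-ASCII, ISO-8859-1, UTF-8, UTF-16 and its BE/LE variants), while each JDK and platform may register more. You can ask the running JVM what it actually supports (a minimal sketch using java.nio.charset, available since 1.4):

    import java.nio.charset.Charset;

    public class ListCharsets {
        public static void main(String[] args) {
            // Everything beyond the guaranteed core set is platform/JDK dependent
            for (String name : Charset.availableCharsets().keySet()) {
                System.out.println(name);
            }
        }
    }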

  • How to handle other encodings in JTextArea?

    Hi,
    I created a simple editor using JTextArea. When I type some Chinese characters into it, they are displayed correctly. But after I use getText() to get the content, write it to a file byte by byte, and then load it back into the same application, the Chinese characters are gone. Is there any special handling I should do when dealing with other character encodings?

    Hi,
    I tried to use Reader and Writer, but the file I reloaded is still different from what I saved. Here is a complete listing of my code. Is there anything I did wrong? Thanks a lot.
    import java.awt.*;
    import java.awt.event.*;
    import javax.swing.*;
    import java.io.*;

    public class Editor extends JPanel {
       JFrame frame;
       final JButton loadBtn = new JButton("Load");
       final JButton saveBtn = new JButton("Save");
       final JButton exitBtn = new JButton("Exit");
       final JTextArea editArea = new JTextArea(100, 10);
       final JFileChooser fd = new JFileChooser();

       public Editor() {
          setLayout(new BoxLayout(this, BoxLayout.Y_AXIS));
          frame = new JFrame("test");
          final JPanel ctrlPanel = new JPanel();
          ctrlPanel.add(loadBtn);
          ctrlPanel.add(saveBtn);
          ctrlPanel.add(exitBtn);
          add(ctrlPanel);
          final JScrollPane editPane = new JScrollPane(editArea);
          add(editPane); // add the scroll pane, not the bare text area
          editArea.setVisible(true);
          loadBtn.addActionListener(new ActionListener() {
             public void actionPerformed(ActionEvent e) {
                JFrame fm = new JFrame();
                fd.showOpenDialog(fm);
                File f = fd.getSelectedFile();
                if (f != null) load(f);
             }
          });
          saveBtn.addActionListener(new ActionListener() {
             public void actionPerformed(ActionEvent e) {
                JFrame fm = new JFrame();
                fd.showSaveDialog(fm);
                File f = fd.getSelectedFile();
                if (f != null) save(f);
             }
          });
          exitBtn.addActionListener(new ActionListener() {
             public void actionPerformed(ActionEvent e) {
                System.exit(0);
             }
          });
          frame.getContentPane().add(this);
          frame.setLocation(400, 400);
          frame.setSize(600, 300);
          frame.show();
       }

       public boolean save(OutputStream os) {
          try {
             // No charset given: uses the platform default encoding
             OutputStreamWriter osw = new OutputStreamWriter(os);
             osw.write(editArea.getText());
             osw.close();
          } catch (Exception e) {
             e.printStackTrace();
             return false;
          }
          return true;
       }

       public boolean save(File f) {
          boolean code = true;
          try {
             FileOutputStream os = new FileOutputStream(f);
             code = save(os);
             os.close();
          } catch (Exception e) {
             e.printStackTrace();
             return false;
          }
          return code;
       }

       public boolean load(InputStream is) {
          try {
             // Again no charset given on the reader
             BufferedReader in = new BufferedReader(new InputStreamReader(is));
             String inputLine;
             while ((inputLine = in.readLine()) != null) {
                if (inputLine.length() > 0) {
                   editArea.append(inputLine + "\n");
                }
             }
             in.close();
          } catch (Exception e) {
             e.printStackTrace();
             return false;
          }
          return true;
       }

       public boolean load(File f) {
          boolean code = true;
          try {
             FileInputStream fis = new FileInputStream(f);
             code = load(fis);
             fis.close();
          } catch (Exception e) {
             e.printStackTrace();
             return false;
          }
          return code;
       }

       public static void main(String[] args) {
          Editor editor = new Editor();
          editor.frame.setDefaultCloseOperation(JFrame.EXIT_ON_CLOSE);
          editor.show();
       }
    }
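
    The likely culprit, given the symptom, is that both save() and load() above fall back to the platform default encoding, which may not round-trip Chinese text. A minimal sketch of the fix, assuming UTF-8 is acceptable for the saved files: drop-in replacements for the two stream-based methods with an explicit charset.

    public boolean save(OutputStream os) {
       try {
          // Explicit charset instead of the platform default
          Writer osw = new OutputStreamWriter(os, "UTF-8");
          osw.write(editArea.getText());
          osw.close();
       } catch (Exception e) {
          e.printStackTrace();
          return false;
       }
       return true;
    }

    public boolean load(InputStream is) {
       try {
          // Read back with the same charset used for saving
          BufferedReader in = new BufferedReader(new InputStreamReader(is, "UTF-8"));
          String inputLine;
          while ((inputLine = in.readLine()) != null) {
             editArea.append(inputLine + "\n");
          }
          in.close();
       } catch (Exception e) {
          e.printStackTrace();
          return false;
       }
       return true;
    }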
