Java App on Linux : Unable to read iso-8859-1 encoded file correctly.

I have a file which is encoded as iso-8859-1, and contains characters such as ô .
I am reading this file with java code, something like:
File in = new File("myfile.csv");
InputStream fr = new FileInputStream(in);
byte[] buffer = new byte[4096];
while (true) {
    int byteCount = fr.read(buffer, 0, buffer.length);
    if (byteCount <= 0) {
        break;
    }
    String s = new String(buffer, 0, byteCount, "ISO-8859-1");
    System.out.println(s);
}
However the ô character is always garbled, usually printing as a ? .
I am running this on a Linux machine. It works fine on my XP machine.
I have verified that I can see the correct characters when I cat the file on the terminal.
(Interestingly, though perhaps only by coincidence, it works when I run with the -Dfile.encoding=UTF16 option, but not with UTF8. This looks like a hack rather than a fix, since Sun did not intend this option for developer use, but I mention it in case it provides some clues as to what is going on.)

I think your main problem is with the console. When you send text to the console, it's sent in the system default encoding. On an English-locale system that might be ASCII, ISO-8859-1, windows-1252, UTF-8, MacRoman, and probably several other possibilities. Then the console decodes the bytes using whatever encoding it feels like using--on my WinXP machine, it uses cp437 by default (just for laughs, as far as I can tell). If the text happens to be pure, seven-bit ASCII, there's no problem, since all those encodings are identical in that range.
But if you need to output anything other than ASCII characters, avoid the console. Send the output to a file and specify an encoding that you know will be able to handle your characters--UTF-8 can handle anything. Then open the file with an editor that can read that encoding; most of them can handle UTF-8 these days, and many will even detect it automatically. You also need to be using a font that can display your characters.
However, you're also going about the reading part wrong. Instead of reading the text in as bytes and passing them to a String constructor, you should use an InputStreamReader and read it as text from the beginning:
BufferedReader br = new BufferedReader(
  new InputStreamReader(
    new FileInputStream("myfile.csv"), "ISO-8859-1"));
I am curious about your statement that "it works" when you run with the -Dfile.encoding=UTF16 option. I wouldn't be surprised to see it output the correct characters (ASCII characters, anyway), but I would expect to see the characters interspersed with blank spaces or rectangles.
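
Putting the two suggestions together (decode with an InputStreamReader, then write to a file in a known encoding instead of the console), a minimal sketch might look like the following. The class name Latin1ToUtf8 and the method convert are made up for illustration, and the code assumes Java 7 or later for StandardCharsets and try-with-resources.

```java
import java.io.BufferedReader;
import java.io.BufferedWriter;
import java.io.FileInputStream;
import java.io.FileOutputStream;
import java.io.IOException;
import java.io.InputStreamReader;
import java.io.OutputStreamWriter;
import java.nio.charset.StandardCharsets;
import java.nio.file.Path;
import java.nio.file.Paths;

// Hypothetical helper class, not part of the original thread.
class Latin1ToUtf8 {

    // Decodes `source` as ISO-8859-1 text and writes the same characters to `target` as UTF-8.
    static void convert(Path source, Path target) throws IOException {
        try (BufferedReader in = new BufferedReader(
                 new InputStreamReader(new FileInputStream(source.toFile()),
                                       StandardCharsets.ISO_8859_1));
             BufferedWriter out = new BufferedWriter(
                 new OutputStreamWriter(new FileOutputStream(target.toFile()),
                                        StandardCharsets.UTF_8))) {
            char[] buffer = new char[4096];
            int charCount;
            // Copy characters, not bytes: decoding already happened in the reader.
            while ((charCount = in.read(buffer)) != -1) {
                out.write(buffer, 0, charCount);
            }
        }
    }

    public static void main(String[] args) throws IOException {
        if (args.length != 2) {
            System.err.println("usage: java Latin1ToUtf8 <latin1-input> <utf8-output>");
            return;
        }
        convert(Paths.get(args[0]), Paths.get(args[1]));
    }
}
```

Opening the output file in a UTF-8 aware editor should then show the ô correctly, which separates the question "was the file decoded properly?" from "can my console display it?".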

Similar Messages

  • Error updating AE: ERROR: DS015: Unable to read symlink target of source file "/Applications/Adobe After Effects CC 2014/Plug-ins/Effects/mochaAE/(Mocha Support)/mocha AE CC.app/Contents/MacOS/mediaioserver.app/Contents/CodeResources"(Seq 212)

    Error updating AE: ERROR: DS015: Unable to read symlink target of source file "/Applications/Adobe After Effects CC 2014/Plug-ins/Effects/mochaAE/(Mocha Support)/mocha AE CC.app/Contents/MacOS/mediaioserver.app/Contents/CodeResources"(Seq 212)
    What can I do?

    Run the cleaner tool and install from scratch.
    Use the CC Cleaner Tool to solve installation problems | CC, CS3-CS6
    Mylenium

  • ORA-27047: unable to read the header block of file

    My Windows 2003 machine, which was running Oracle XE, crashed.
    I installed Oracle XE on Windows XP on another machine.
    I copied my D:\oracle\XE10g\oradata folder from Win2003 to the same location on the WinXP machine.
    When I start the database on WinXP using SQLPLUS, I get the following message:
    SQL> startup
    ORACLE instance started.
    Total System Global Area 146800640 bytes
    Fixed Size 1286220 bytes
    Variable Size 62918580 bytes
    Database Buffers 79691776 bytes
    Redo Buffers 2904064 bytes
    ORA-00205: error in identifying control file, check alert log for more info
    In my D:\oracle\XE10g\app\oracle\admin\XE\bdump\alert_xe I found the following errors:
    starting up 1 dispatcher(s) for network address '(ADDRESS=(PARTIAL=YES)(PROTOCOL=TCP))'...
    starting up 4 shared server(s) ...
    Oracle Data Guard is not available in this edition of Oracle.
    Wed Apr 25 18:38:36 2007
    ALTER DATABASE MOUNT
    Wed Apr 25 18:38:36 2007
    ORA-00202: control file: 'D:\ORACLE\XE10G\ORADATA\XE\CONTROL.DBF'
    ORA-27047: unable to read the header block of file
    OSD-04001: invalid logical block size (OS 2800189884)
    Wed Apr 25 18:38:36 2007
    ORA-205 signalled during: ALTER DATABASE MOUNT...
    ORA-00202: control file: 'D:\ORACLE\XE10G\ORADATA\XE\CONTROL.DBF'
    ORA-27047: unable to read the header block of file
    OSD-04001: invalid logical block size (OS 2800189884)
    Please help.
    Regards,
    Zulqarnain

    Try installing Windows 2003 Server, do a fresh installation of the Oracle software, and then copy the datafiles and control files to the same locations as you did on WinXP.
    Get back to us if you are still not out of the woods. I doubt that a simple restore will do the trick, since you are doing it across different platforms. I may be wrong, but I personally feel that is why you are not able to start the database on WinXP.
    hare krishna
    Alok

  • Setting ISO-8859-1 Encoding in AXIS RPC call

    As I understand it, UTF-8 is the default for web service calls in the AXIS implementation, though I may be wrong.
    Is there any way to set ISO-8859-1 encoding in an AXIS RPC call?
    Below are the WSDL file details.
    Thanks for your help.
    Regards,
    Hemen
    <Code>
    <?xml version="1.0" encoding="ISO-8859-1"?>
    <definitions name="TestService_Service"
    targetNamespace="http://webservice.de.rt.wsdl.WebContent/TestService_Service/"
    xmlns="http://schemas.xmlsoap.org/wsdl/"
    xmlns:format="http://schemas.xmlsoap.org/wsdl/formatbinding/"
    xmlns:java="http://schemas.xmlsoap.org/wsdl/java/"
    xmlns:soap="http://schemas.xmlsoap.org/wsdl/soap/"
    xmlns:tns="http://webservice.de.rt.wsdl.WebContent/TestService_Service/" xmlns:xsd="http://www.w3.org/2001/XMLSchema">
    <message name="getResponseRequest">
    <part name="xml" type="xsd:string"/>
    <part name="product" type="xsd:string"/>
    <part name="source" type="xsd:string"/>
    <part name="tran" type="xsd:string"/>
    <part name="content" type="xsd:string"/>
    </message>
    <message name="getResponseResponse">
    <part name="return" type="xsd:string"/>
    </message>
    <portType name="TestService">
    <operation name="getResponse" parameterOrder="xml product source tran content">
    <input message="tns:getResponseRequest" name="getResponseRequest"/>
    <output message="tns:getResponseResponse" name="getResponseResponse"/>
    </operation>
    </portType>
    <binding name="TestServiceJavaBinding" type="tns:TestService">
    <java:binding/>
    <format:typeMapping encoding="Java" style="Java">
    <format:typeMap formatType="java.lang.String" typeName="xsd:string"/>
    </format:typeMapping>
    <operation name="getResponse">
    <java:operation methodName="getResponse"
    parameterOrder="xml product source tran content" returnPart="return"/>
    <input name="getResponseRequest"/>
    <output name="getResponseResponse"/>
    </operation>
    </binding>
    <binding name="TestServiceBinding" type="tns:TestService">
    <soap:binding style="rpc" transport="http://schemas.xmlsoap.org/soap/http"/>
    <operation name="getResponse">
    <soap:operation soapAction="" style="rpc"/>
    <input name="getResponseRequest">
    <soap:body
    encodingStyle="http://schemas.xmlsoap.org/soap/encoding/"
    namespace="distributionengine.com"
    parts="xml product source tran content" use="encoded"/>
    </input>
    <output name="getResponseResponse">
    <soap:body
    encodingStyle="http://schemas.xmlsoap.org/soap/encoding/"
    namespace="distributionengine.com" use="encoded"/>
    </output>
    </operation>
    </binding>
    <service name="TestServiceService">
    <port binding="tns:TestServiceJavaBinding" name="TestServiceJavaPort">
    <java:address className="rt.de.webservice.TestService"/>
    </port>
    </service>
    <service name="TestService_Service">
    <port binding="tns:TestServiceBinding" name="TestServicePort">
    <soap:address location="http://localhost:9080/servlet/rpcrouter"/>
    </port>
    </service>
    </definitions>
    </code>

    Try putting the code below at the top of the JSP page:
    <%@ page language="java" pageEncoding="UTF-8"%>

  • File adapter ISO-8859-1 encoding problems in XI 3.0

    We are using the XI 3.0 file adapter and are experiencing some XML encoding troubles.
    An SAP R/3 system is delivering an outbound IDoc. XI picks up the IDoc and converts it to an externally defined .xml file. The .xml file is sent to a connected FTP server. At the remote FTP server the file generates an error, as it is expected to arrive in ISO-8859-1 encoding. The Transfer Mode is set to Binary, File Type Text, and Encoding ISO-8859-1.
    The .xml file is encoded correctly in ISO-8859-1, but the problem is that the XML encoding declaration has the wrong value 'UTF-8'.
    Does anybody know of a workaround to change the encoding declaration to 'ISO-8859-1' in the message mapping program?

    An example of the XSL code might be as follows:
    <?xml version='1.0'?>
    <xsl:stylesheet xmlns:xsl="http://www.w3.org/1999/XSL/Transform">
    <xsl:output method='xml' encoding='ISO-8859-1' />
    <xsl:template match="/">
         <xsl:copy-of select="*" />
    </xsl:template>
    </xsl:stylesheet>

  • WSDL Generating in ISO-8859-1 encoding want in UTF - 8

    Hi XI Geeks,
    When I use the "Define Web Service" tool in the Integration Directory, the WSDL file is created with encoding ISO-8859-1, but we want it in "UTF-8".
    Is there any setting we need to change in WAS to change this default?
    Any ideas are welcome and thanks in advance.
    Regards
    Sujan

    Hi Vishnu,
    We already did that for the time being, but the client wants it to be consistent.
    In another client engagement the WSDL is created in "UTF-8" format; we double-checked.
    We are running on SP12.
    If any of the SP12 guys can generate a WSDL, please let us know whether it is created in "ISO-8859-1" encoding.
    Regards
    Sujan

  • How to store UTF-8 characters in an iso-8859-1 encoded oracle database?

    How can we store UTF-8 characters in an iso-8859-1 encoded oracle database? We can NOT change the database encoding but need to store e.g. Polish or Russian characters besides other European languages.
    Is there any stable solution with good performance?
    We use Oracle 8.1.6 with iso-8859-1 encoding, Bea WebLogic 7.0, JDK 1.3.1 and the following thin driver: "Oracle JDBC Driver version - 9.0.2.0.0".

    There are a couple of unsupported options, but I wouldn't consider using them on a production database running other critical applications. I would also strongly discourage their use unless you understand in detail how Oracle National Language Support (NLS) works, otherwise you could end up with corrupt data or worse.
    In a sense, you've been asked to do the impossible. The existing database character sets do not support encoding the data you've been asked to store.
    Can you create a new database with an appropriate database character set and deploy your application there? That's probably the easiest solution.
    If that isn't an option, and you really need to store data in this database, you could use one of the binary data types (RAW and BLOB), but that would mean that it would be exceptionally difficult for applications other than yours to extract the data. You would have to ensure that the data was always encoded in the same character set, otherwise you wouldn't be able to properly decode it later. This would also add a lot of complexity to your application, since you couldn't send or receive string data from the database.
    Unfortunately, I suspect you will have to choose from a list of bad options.
    Justin
    Distributed Database Consulting, Inc.
    http://www.ddbcinc.com/askDDBC

  • Help needed in voice enabled java app on LINUX OS

    I have a problem capturing voice from the mic on a Linux machine from within a Java application. I tried it with a native app on Linux and it works fine, but when I try to make it work with a Java application, it does not work.
    Any help is appreciated. Thanks in Advance...
    -Bhaskar

    I have the same problem, I'm able to receive RTP audio, but not send, I have Red Hat linux 9

  • Problems with reading XML files with ISO-8859-1 encoding

    Hi!
    I am trying to read an RSS file. The code below works with XML files in UTF-8 encoding but not ISO-8859-1. How can I fix it so it works with both?
    Here's the code:
    import java.io.File;
    import javax.xml.parsers.*;
    import org.w3c.dom.*;
    import java.net.*;

    /**
     * @author gustav
     */
    public class RSSDocument {
        /** Creates a new instance of RSSDocument */
        public RSSDocument(String inurl) {
            String url = new String(inurl);
            try {
                DocumentBuilder builder = DocumentBuilderFactory.newInstance().newDocumentBuilder();
                Document doc = builder.parse(url);
                NodeList nodes = doc.getElementsByTagName("item");
                // Print the title and description of every <item> in the feed.
                for (int i = 0; i < nodes.getLength(); i++) {
                    Element element = (Element) nodes.item(i);
                    NodeList title = element.getElementsByTagName("title");
                    Element line = (Element) title.item(0);
                    System.out.println("Title: " + getCharacterDataFromElement(line));
                    NodeList des = element.getElementsByTagName("description");
                    line = (Element) des.item(0);
                    System.out.println("Des: " + getCharacterDataFromElement(line));
                }
            } catch (Exception e) {
                e.printStackTrace();
            }
        }

        /** Returns the text of the element's first child, or "?" if it is not character data. */
        public String getCharacterDataFromElement(Element e) {
            Node child = e.getFirstChild();
            if (child instanceof CharacterData) {
                CharacterData cd = (CharacterData) child;
                return cd.getData();
            }
            return "?";
        }
    }
    And here's the error message:
    org.xml.sax.SAXParseException: Teckenkonverteringsfel: "Malformed UTF-8 char -- is an XML encoding declaration missing?" (radnumret kan vara för lågt).
        at org.apache.crimson.parser.InputEntity.fatal(InputEntity.java:1100)
        at org.apache.crimson.parser.InputEntity.fillbuf(InputEntity.java:1072)
        at org.apache.crimson.parser.InputEntity.isXmlDeclOrTextDeclPrefix(InputEntity.java:914)
        at org.apache.crimson.parser.Parser2.maybeXmlDecl(Parser2.java:1183)
        at org.apache.crimson.parser.Parser2.parseInternal(Parser2.java:653)
        at org.apache.crimson.parser.Parser2.parse(Parser2.java:337)
        at org.apache.crimson.parser.XMLReaderImpl.parse(XMLReaderImpl.java:448)
        at org.apache.crimson.jaxp.DocumentBuilderImpl.parse(DocumentBuilderImpl.java:185)
        at javax.xml.parsers.DocumentBuilder.parse(DocumentBuilder.java:124)
        at getrss.RSSDocument.<init>(RSSDocument.java:25)
        at getrss.Main.main(Main.java:25)

    I read files from the web, but there is an XML tag
    with the encoding attribute in the RSS file.

    If you are quite sure that you have an encoding attribute set to ISO-8859-1, then I expect that your RSS file has a non-ISO-8859-1 character, though I thought all bytes -128 to 127 were valid ISO-8859-1 characters!
    Many years ago I had a problem with an XML file with invalid characters. I wrote a simple filter (using FilterInputStream) that made sure that all the bytes it processed were ASCII. My problem turned out to be characters with value zero which the Microsoft XML parser failed to process. It put the parser in an infinite loop!
    In the filter, as each byte is read you could write out the Hex value. That way you should be able to find the offending character(s).
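
    A filter along those lines could be sketched as follows. The class name HexLoggingInputStream is made up for illustration, and as a simplification it only reports bytes that are NUL or outside the 7-bit ASCII range rather than every byte.

```java
import java.io.FilterInputStream;
import java.io.IOException;
import java.io.InputStream;
import java.io.PrintStream;

// Hypothetical helper: passes bytes through unchanged and logs suspicious ones in hex.
class HexLoggingInputStream extends FilterInputStream {
    private final PrintStream log;
    private long offset = 0; // position of the next byte, assuming skip() is never called

    HexLoggingInputStream(InputStream in, PrintStream log) {
        super(in);
        this.log = log;
    }

    @Override
    public int read() throws IOException {
        int b = super.read();
        if (b != -1) {
            report(b);
        }
        return b;
    }

    @Override
    public int read(byte[] buf, int off, int len) throws IOException {
        int n = super.read(buf, off, len);
        // n is -1 at end of stream, so the loop simply does not run.
        for (int i = 0; i < n; i++) {
            report(buf[off + i] & 0xFF);
        }
        return n;
    }

    // Logs the offset and hex value of NUL bytes and of bytes above 0x7F.
    private void report(int b) {
        if (b == 0 || b > 0x7F) {
            log.printf("offset %d: 0x%02X%n", offset, b);
        }
        offset++;
    }
}
```

    Wrapping the feed's stream before parsing, for example builder.parse(new HexLoggingInputStream(new URL(url).openStream(), System.err)), would then print the position of each byte that the parser may be rejecting as malformed UTF-8.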

  • Excel unable to read WEBI schedule download excel file

    Hi experts,
    We use the instance manager in BO Central Management to schedule a report download in Excel format to our FTP share folder.
    The download ran successfully, and the file is shown with an Excel icon in the FTP folder.
    When we open the file in Excel 2007, Excel pops up a message saying it is unable to read the file.
    It says that Excel found unreadable content in the report: "Do you want to recover the contents of this workbook? If you trust the source of this workbook, click Yes." After clicking Yes, Excel still pops up the message that it is unable to read the file.
    On Excel's General tab, "Ignore other applications that use Dynamic Data Exchange (DDE)" has been unchecked, but the file still cannot be read.
    The whole report can be run in WEBI and saved to my computer as Excel without any issues.
    WEBI is set up to refresh on open.
    Any clues?

    Hi,
    Did you create a PivotTable report in the file? According to your description, the HTML format can be used correctly in Excel 2007. Did you get any other error message?
    Such as "Unable to read file."
    When you click OK, you receive the following error message:
    Errors were detected in 'filename.xls', but Microsoft Excel was able to open the file by making the repairs listed below. Save the file to make these repairs permanent.
    PivotTable report 'report_name' on '[filename.xls]worksheet_name' was discarded due to integrity problems.
    Please try this workaround:
    Turn off AutoRecovery when you work with any workbooks. Click Excel button > Excel Options > Save > uncheck AutoRecovery.
    If the issue persists after this, I recommend following these KB articles:
    http://support.microsoft.com/kb/943088
    http://support.microsoft.com/kb/929766
    Regards,
    George Zhao
    TechNet Community Support

  • Help needed to enable ISO-8859-1 encoding in Weblogic 10.3

    We have upgraded from Weblogic 10 to Weblogic 10.3. In the old environment foreign characters (iso-8859-1 character set) were enabled using the Java properties "*-Dfile.encoding=iso-8859-1 -Dclient.encoding.override=iso-8859-1*".
    Also, we set the page encoding in the JSPs to be
    _'<%@ page pageEncoding="iso-8859-1" contentType="text/html; charset=iso-8859-1" %>'._
    This was working fine. We were able to input characters like euro (€) and display it back to user.
    After we upgraded to Weblogic 10.3, I applied the same Java properties, all the special characters of the 8859-1 set like Æ, à etc are just displayed as garbled text like '��ï' etc. So, the new server is unable to handle/encode these characters.
    Our weblogic environment is - Weblogic appserver 10.3, JRockit 1.6.0_31-R28.2.3-4.1.0 running on Linux OS.
    Is there any additional setting in 10.3 to encode character sets?
    Thanks in advance.

    Hi Kalyan,
    Thanks for the info. Please help me understand just one more thing, before I ask our server team to apply the patch.
    I checked the full version of our Weblogic server from WEBLOGIC_HOME/registry.xml. It is *<component name="WebLogic Server" version="10.3.6.0" >*.
    The patch from MOS says it is *"Release WLS 10.3.3"*. Does this mean the version 10.3.6 should already have this patch? Is the release of Patch same as WLS version?
    Should I still go ahead and install the patch for this version 10.3.6 also?
    Thanks again.
    - Shankar.

  • Mail Receiver - Send file in ISO-8859-1 encoding

    Hi,
    I'm sending mail with an attachment using the mail adapter, but instead of the specified ISO-8859-1 it is converted to UTF-8 without a BOM. Because of that, some characters (ñ, ç, etc.) are not transferred properly.
    Settings:
    Message protocol: XIPAYLOAD
    No mail package.
    Transform.ContentType: multipart/mixed; boundary=--AaZz; charset=ISO-8859-1
    Payload:
    multipart/mixed; boundary=AaZz; charset=ISO-8859-1</Content_Type><Content>--AaZz
    Content-Type: text/plain; charset=ISO-8859-1
    Content-Disposition: inline
    File attachment
    AaZz
    Content-Type: text/plain; charset= ISO-8859-1
    Content-Disposition: attachment; filename=TestFile
    iso-8859 characters ñ ç ñ ñ
    AaZz--
    </Content></ns:Mail>
    I need advice on how to force the file to be created with ISO-8859-1 encoding.
    Thanks in advance.
    Regards,
    Iván.

    Hi Jean-Philippe,
    Yes, please check my first post, if you use same settings, and create message as mine, it should work, the TestFile is created as an attachment.
    Include this line in the module configuration with transform key:
    Transform.ContentType: multipart/mixed; boundary=--AaZz;
    If you still have issues, please give me a description of the error.
    Regards,
    Ivan.

  • Convert from ISO 8859-1 encoding to UTF-8

    Hi
    My Os name is 'SunOS ut51109 5.10 Generic_144500-19 sun4v sparc SUNW,T5440'.
    I want to change the encoding of an existing .bcp file from ISO 8859-1 to UTF-8 without using any temp files, as these .bcp files will be pointed to by an external table.
    Here is the command I issued in the script (ksh file):
    iconv -f ISO8859-1 -t UTF-8 file1.bcp > file1.bcp
    After the script was executed, file1 is empty (showing 0 bytes).
    Please correct me or let me know the syntax to use to write the data into the same file in UTF-8 format.
    Thanks
    kartheek

    You cannot do conversion from ISO 8859-1 to UTF-8 in-place because the UTF-8 version will generally be longer (unless you convert a pure ASCII file, which does not need conversion in the first place). Therefore, you would have to overwrite what you have not read yet. Instead, convert to a new file with a temporary name, drop the original and rename the temporary back to original. This is not that complicated.
    If the problem is that you want to overwrite a file already open by the database, then rename the incoming file first and then convert copying to the target.
    -- Sergiusz
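
    The convert-to-a-temporary-file-then-rename approach can also be sketched in Java (the original question used iconv from ksh, so this is an illustration of the idea, not a drop-in replacement). The class and method names are hypothetical, and the code assumes Java 7 or later.

```java
import java.io.IOException;
import java.io.Reader;
import java.io.Writer;
import java.nio.charset.StandardCharsets;
import java.nio.file.Files;
import java.nio.file.Path;
import java.nio.file.StandardCopyOption;

// Hypothetical helper, not part of the original thread.
class InPlaceRecode {

    // Rewrites `file` from ISO-8859-1 to UTF-8 by going through a temporary file.
    static void latin1ToUtf8InPlace(Path file) throws IOException {
        // Keep the temporary file next to the original so the final move stays on one file system.
        Path dir = file.toAbsolutePath().getParent();
        Path tmp = Files.createTempFile(dir, "recode", ".tmp");
        try {
            try (Reader in = Files.newBufferedReader(file, StandardCharsets.ISO_8859_1);
                 Writer out = Files.newBufferedWriter(tmp, StandardCharsets.UTF_8)) {
                char[] buf = new char[8192];
                int n;
                while ((n = in.read(buf)) != -1) {
                    out.write(buf, 0, n);
                }
            }
            // Only replace the original once the converted copy is complete.
            Files.move(tmp, file, StandardCopyOption.REPLACE_EXISTING);
        } finally {
            // No-op after a successful move; cleans up if the conversion failed.
            Files.deleteIfExists(tmp);
        }
    }
}
```

    One thing to watch for: the replaced file takes on the temporary file's permissions, which may be more restrictive than the original's, so a database reading the file as an external table may need them reset.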

  • UTF-8 encoding vs ISO 8859-1 encoding

    The iTunes tech specs call for UTF-8 encoding of the XML feed file; a friend of mine uses feed generator software through his blog that uses ISO 8859 encoding. Is there a way to convert the latter to UTF-8 so that iTunes tags may be successfully added?
    When I tried editing his XML file, I got error messages when I submitted the file to RSS feed validator sites (such as http://feedvalidator.org/). Any help or knowledge is appreciated because I am not the least bit expert in this coding arena.

    You don't need to convert ISO 8859-1 to UTF-8 unless the file contains non-ASCII characters. ASCII is a subset of both encodings, so plain English text is byte-for-byte the same in either one and will serve you just fine. You can have iTunes tags in the XML file even if the file itself is encoded in ISO 8859-1.
    The error you see at feedvalidator.org is most likely a warning.
    Hope this helps!
    - Andy Kim
    Potion Factory
    http://www.potionfactory.com
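
    To check whether a particular piece of feed text would come out differently in the two encodings, a small comparison like the one below may help. The class and method names are made up for illustration.

```java
import java.nio.charset.StandardCharsets;
import java.util.Arrays;

// Hypothetical helper, not part of the original thread.
class EncodingOverlap {

    // True when the text encodes to exactly the same bytes in ISO-8859-1 and UTF-8.
    static boolean sameBytesInLatin1AndUtf8(String text) {
        return Arrays.equals(text.getBytes(StandardCharsets.ISO_8859_1),
                             text.getBytes(StandardCharsets.UTF_8));
    }

    public static void main(String[] args) {
        System.out.println(sameBytesInLatin1AndUtf8("Podcast title")); // pure ASCII
        System.out.println(sameBytesInLatin1AndUtf8("caf\u00e9"));     // contains é
    }
}
```

    Pure ASCII text gives identical bytes, while any accented character takes one byte in ISO 8859-1 and two in UTF-8, which is where a feed declared as one encoding but saved as the other starts to break.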

  • Oracle Service Bus to IBM MQ v7.0 Integration Issue ISO-8859-1 Encoding

    Hi,
    This is an issue we have been struggling badly with for a few days.
    I have an MQ proxy service listening on the MQ REQUEST queue. When a message arrives at the proxy, it does some service callouts to local protocol proxy services and replies to the MQ REPLY queue on the same listener proxy. This works without any issues.
    However, when I want to route the reply through an MQ business service, the message goes to the MQ reply queue with encoding="ISO-8859-1", but the requirement is to publish the message with encoding="UTF-8", which I'm not able to do. Hence MQ is rejecting the message.
    Working path: MQ <--> MQ proxy <-->DynamicRouting<-->Local Proxy
    *NOT Working Path: REQUEST MQ --> MQ proxy -->DynamicRouting -->Response Route Line -->For Loop -->Publish-->MQ Business Service-->REPLY MQ*
    Transport Headers I'm passing to Outbound Request: mq:putApplicationName, mq:Format
    MQ v7.0
    OSB v11.1.1.5
    Could you please help and guide me as to where I'm going wrong and which property I need to set explicitly to achieve this?
    Thank you
    Edited by: 1002815 on May 2, 2013 10:51 AM

    Here is a simple Answer:
    In MQ Transport Headers, set mq:characterset to '1208' (UTF-8) and check 'Pass all headers'. That's it.

    Thanks for sharing it here with the community. Would you mind closing this thread as well?
    Regards,
    Anuj
