Cannot create file with non-Latin characters - I/O

I'm trying to create a file with Greek (or any other non-Latin) characters for use in a regex demo.
I can't seem to create the characters, and I suspect I'm doing something wrong with I/O.
The code follows. Any insight would be appreciated. Thanks
import java.util.regex.*;
import java.io.*;

public class GreekChars {
     public static void main(String[] args) throws Exception {
          int c;
          createInputFile();
//          String input = new BufferedReader(new FileReader("GreekChars.txt")).readLine();
//          System.out.println(input);
          // Read the file back one character at a time
          FileReader fr = new FileReader("GreekChars.txt");
          while ((c = fr.read()) != -1)
               System.out.println((char) c);
          fr.close();
     }

     public static void createInputFile() throws Exception {
          // First attempt: write the file through a PrintStream
          PrintStream ps = new PrintStream(new FileOutputStream("GreekChars.txt"));
          ps.println("\u03A9\u0398\u03A0\u03A3"); // omega,theta,pi,sigma
          System.out.println("\u03A9\u0398\u03A0\u03A3"); // omega,theta,pi,sigma
          ps.flush();
          ps.close();
          // Second attempt: overwrite the same file through a FileWriter
          FileWriter fw = new FileWriter("GreekChars.txt");
          fw.write("\u03A9\u0398\u03A0\u03A3", 0, 4);
          fw.flush();
          fw.close();
     }
}
/*
// using a printstream to create file ... and BufferedReader to read
C:> java GreekChars
// using a Filewriter to create files  .. and FileReader to read
C:> java GreekChars
*/

Construct your file writer using a Unicode format. If
you don't, the file is written using the platform
"default" encoding, probably a legacy code page such as Cp1252, which has no Greek capital letters.
example:
FileWriter fw = new FileWriter("GreekChars.txt", "UTF-8");

I don't know what version of FileWriter you are using, but none that I know of takes two String parameters. You should try checking the API before trying to help someone, instead of just making things up.
To the OP:
The proper way to produce a file in UTF-8 format would be this:
OutputStreamWriter writer = new OutputStreamWriter(new FileOutputStream("filename"), "UTF-8");
Then to read the file, you would use:
InputStreamReader reader = new InputStreamReader(new FileInputStream("filename"), "UTF-8");
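
For reference, here is a minimal sketch of how the writer/reader pair suggested above could be put together as a round trip. The class name GreekCharsUtf8 and the helpers writeUtf8 and readUtf8 are made up for illustration; they are not from the original post. It prints code points rather than the characters themselves, because the console may use a different encoding and show "?" even when the file is correct.

```java
import java.io.FileInputStream;
import java.io.FileOutputStream;
import java.io.IOException;
import java.io.InputStreamReader;
import java.io.OutputStreamWriter;
import java.io.Reader;
import java.io.Writer;
import java.nio.charset.StandardCharsets;

class GreekCharsUtf8 {
    // Hypothetical helper: write text to a file, always encoding it as UTF-8.
    static void writeUtf8(String fileName, String text) throws IOException {
        try (Writer writer = new OutputStreamWriter(
                new FileOutputStream(fileName), StandardCharsets.UTF_8)) {
            writer.write(text);
        }
    }

    // Hypothetical helper: read the whole file back, always decoding it as UTF-8.
    static String readUtf8(String fileName) throws IOException {
        StringBuilder sb = new StringBuilder();
        try (Reader reader = new InputStreamReader(
                new FileInputStream(fileName), StandardCharsets.UTF_8)) {
            int c;
            while ((c = reader.read()) != -1) {
                sb.append((char) c);
            }
        }
        return sb.toString();
    }

    public static void main(String[] args) throws IOException {
        String greek = "\u03A9\u0398\u03A0\u03A3"; // omega, theta, pi, sigma
        writeUtf8("GreekChars.txt", greek);
        String back = readUtf8("GreekChars.txt");
        // Print code points so the check does not depend on the console encoding.
        for (char ch : back.toCharArray()) {
            System.out.printf("U+%04X%n", (int) ch);
        }
    }
}
```

The key point is that both ends name the same encoding explicitly instead of relying on the platform default.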

Similar Messages

  • Cannot rename file with non-ASCII characters when using the

    My application moves files from one directory to another by calling File[] srcFiles = srcDir.listFiles() to get a list of files in the source directory, and then calling srcFiles[i].renameTo(destFile) to rename each file.
    This does not work (renameTo returns false and the file is not moved) under the following circumstances:
    - the file's leaf name contains non-ASCII characters, for example "�"
    - the OS is Solaris 9
    - the LANG and LC_* environment variables are unset, i.e. the C locale is being used
    If I set the LANG environment variable to, for example, en_GB.UTF-8 then the rename succeeds.
    I have tried calling srcFiles[index].getName().getBytes("UTF-8") and the non-ASCII characters are being replaced with ? (0x3f) characters when LANG is unset.
    Is this a bug in the JRE? I would argue that since my code does not actually manipulate the filename (I just use the File object that File.listFiles() gives me) then the rename should succeed. Of course I would not expect the file name to be displayed correctly if I printed it out.
    I have reproduced this behaviour with JDK 1.4.2_05 and 1.5.0_04 on Solaris 9.
    Francis

    Thanks for the info Alan.
    I considered setting the locale in the environment (this sounds like the "correct" fix to me and we might implement it later), but this application shares a WebLogic server with many other applications so we would have to do a huge amount of testing to make sure that the locale change wouldn't break the other apps. In the end I worked around the problem by making the code that generates the filenames in the first place strip out any non-ASCII characters (the names of the files are not critically important).
    Looking forward to JSR-203, in the meantime perhaps a note about this behaviour in the java.io.File javadoc would be useful.
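
    The workaround described above (stripping non-ASCII characters when the file names are generated) might look roughly like the sketch below. The class AsciiFileNames and the method stripNonAscii are hypothetical names, not taken from the poster's application, and the sketch simply drops such characters rather than transliterating them.

```java
class AsciiFileNames {
    // Hypothetical helper: remove every character outside the 7-bit ASCII range.
    static String stripNonAscii(String name) {
        return name.replaceAll("[^\\x00-\\x7F]", "");
    }

    public static void main(String[] args) {
        // "r\u00E9sum\u00E9.txt" contains two non-ASCII letters (e with acute accent).
        System.out.println(stripNonAscii("r\u00E9sum\u00E9.txt"));
    }
}
```

    Note that two different names can collapse to the same ASCII name this way, which is acceptable only when, as in this thread, the names are not critically important.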

  • Naming files with non English characters.

    I'm using FileMaker to create PDFs through Acrobat 10.1.12. I need to use Polish, Hungarian, Czech and Slovakian characters in the file name, but the characters are not recognised and so the file is not created. This is on Windows; the problem does not occur on a Mac.

  • Upload text files with non-english characters

    I use an Apex page to upload text files. Then i retrieve the contents of files from wwv_flow_files.blob_content and convert them to varchar2 with utl_raw.cast_to_varchar2, but characters like ò, à, ù become garbage.
    What could be the problem? Are characters lost when files are stored in wwv_flow_files or when i do the conversion?
    Some other info:
    * I see wwv_flow_files.DAD_CHARSET is set to "ascii", wwv_flow_files.FILE_CHARSET is null.
    * Trying utl_raw.cast_to_varchar2( utl_raw.cast_to_raw('àòèù') ) returns 'àòèù' correctly;
    * NLS_CHARACTERSET parameter is AL32UTF8 (not just english ASCII)

    Hi
    Have a look at csv upload -- suggestion needed with non-English character in csv file it might help you.
    Thanks,
    Manish

  • Cp and tar files with non printable characters

    Hi all,
    Maybe it's a silly question but just got stuck with this.
    We have an Xsan with diverse material from various departments. Besides having a backup on tape, I was trying to do a plain copy of all the files to another disk from a terminal, just using cp or tar.
    But whenever cp or tar encounters a file with a nonprintable character in its name, they don't copy it.
    Let's say that in the client Finder they named the file "opción.txt".
    The file shows up in the terminal with a ? but cp or tar won't get the file.
    any clues?
    thanks!

  • [Solved]Amarok don't play files with non latin1 characters in filename

    I know that this problem is not in amarok, i patched phonon from aur http://aur.archlinux.org/packages.php?ID=27465
    /src/import/gstreamer/mediaobject.cpp
    at line 365 in bool MediaObject::createPipefromURL(const QUrl &url)
    QByteArray encoded_cstr_url = url.toEncoded();
    m_datasource = gst_element_make_from_uri(GST_URI_SRC, encoded_cstr_url.constData(), (const char*)NULL);
    //for utf8 locale replace toLocal8Bit to toAscii
    //toLocalFile fails for files with "?" in filenames. toLocalFile("a?c.s") => "a"
    + if ( url.toString().indexOf("file://") == 0 )
    + g_object_set(m_datasource, "location", url.toString().replace(0,7,"").toLocal8Bit().constData(), NULL);
    if (!m_datasource)
    return false;
    and now amarok plays ALL files
    Last edited by klama (2009-08-03 12:23:16)

    which phonon backend are you using? try with a different one

  • Filenames with non-latin characters aren't found by the filesystem [S]

    This might be a bug, but I'm hoping it's just a config file problem.
    I have a few files here and there on my NTFS drive that have Japanese characters in their filenames.  Sometime recently (I don't have an exact date when they disappeared), they stopped showing up at all.  If I browse to a folder that used to contain filenames with Japanese characters, it just appears empty in Gnome.  Using ls from a terminal also says the directory is empty.  They used to work just fine, but a recent upgrade must have broken them.
    Does anyone have any ideas what I can do to get my files to appear again?  Is there some way to enable unicode support for filenames or something?
    Many thanks!
    Edit: Rebooting the system fixed it, though I still think that was a pretty strange problem.  Any ideas what was up?
    Last edited by ColdPie (2007-11-11 02:07:11)

    The funny thing is that bold font [when message unread in message list] shows OK, ie in greek, but when i click on unread message, it is assumed to have been read, so it changes over to medium [non bold] and the encoding changes as well into the one that is not greek and thus unreadable.  In ~/.sylpheed/sylpheedrc the fonts are:
    widget_font=
    message_font=-microsoft-sylfaenarm-medium-r-normal-*-*-160-*-*-p-*-iso8859-7
    normal_font=-monotype-arial-medium-r-normal-*-12-*-*-*-*-*-iso8859-7
    bold_font=-monotype-arial-bold-r-normal-*-12-*-*-*-*-*-iso8859-7
    small_font=-monotype-arial-medium-r-normal-*-12-*-*-*-*-*-iso8859-7
    In /etc/gtk, for gtk1.2 apps the file refering to greek encoding [el] seems to be fine [exactly the same as in slackware 9.1].

  • Internet links with non-Latin characters in Mail

    If this link is shown in Mail on the iPhone, only the Latin text will be clickable:
    http://ja.wikipedia.org/wiki/春琴抄
    The Chinese characters will not be considered part of the URL.
    When I click on the link it will not take me to the intended article.
    The same is true for other alphabets, these Japanese characters will not be part of the URL:
    http://ja.wikipedia.org/wiki/ブロメライン
    Anyone else having this problem? Any ideas? I have submitted a bug report to Apple.
    Jason

    I think it is a bug, but it can make a difference how the link gets sent to you. If you go to a page in Safari and use File > Mail Link to this Page, the link may be usable by the email recipient. If you could try that I would be interested if it works.
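
    One sender-side workaround that may be worth trying (untested against iPhone Mail, so treat it as an assumption) is to send the link in its percent-encoded ASCII form, so that no non-Latin characters remain in the URL text. A minimal Java sketch using java.net.URI follows; the class LinkEncoder and the method toAsciiLink are hypothetical names.

```java
import java.net.URI;
import java.net.URISyntaxException;

class LinkEncoder {
    // Hypothetical helper: build an http URI and return it with every
    // non-ASCII character replaced by percent-encoded UTF-8 bytes.
    static String toAsciiLink(String host, String path) throws URISyntaxException {
        return new URI("http", host, path, null).toASCIIString();
    }

    public static void main(String[] args) throws URISyntaxException {
        // The path is the article title from the first link above, written as escapes.
        System.out.println(toAsciiLink("ja.wikipedia.org", "/wiki/\u6625\u7434\u6284"));
    }
}
```

    The encoded link is pure ASCII, so a mail client that only recognises Latin text as part of a URL should still pick up the whole link.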

  • ITunes bug with non-latin characters in the path

    I've just bought a Beyoncé video clip from the iTunes Store.
    iTunes is unable to play this video (or even transfer it to the iPod) because the folder's name contains the French "é" character. When I renamed the folder to "Beyonce", iTunes played the video successfully.
    BTW, my system locale is Russian (because I'm Russian), and the OS is Vista.

  • Non latin characters in .cfm filename

    Hi - I have users who want to name files with non latin characters.  i.e.
    Логотип_БелРусь_2500x1.cfm
    We get a file not found error, it is not an IIS issue and we have UTF-8 encoding and are running CF8.
    Yes we can rename the files but for now would like to know if non latin characters are allowed in .cfm file names.
    Thank you!
    Sapna

    PaulH wrote:
    en_US is the JRE locale. is that the same as the OS? and what file encoding?
    (check via cfadmin).
    i ask, because pretty sure you can't use non-ascii file names w/cf. there's an
    open bug on that:
    http://cfbugs.adobe.com/cfbugreport/flexbugui/cfbugtracker/main.html#bugId=77177
    only can guess that file encoding isn't latin-1, etc. and/or OS locale equals
    the same language as the file name.
    cfadmin gives pretty much the same information. Here's a direct copy
    Server Product: ColdFusion
    Version: 9,0,0,241018
    Edition: Developer
    Serial Number:
    Operating System: Windows 2000
    OS Version: 5.0
    Update Level: /C:/ColdFusion9/lib/updates/hf900-78588.jar
    Adobe Driver Version: 4.0 (Build 0005)
    JVM Details
    Java Version: 1.6.0_12
    Java Vendor: Sun Microsystems Inc.
    Java Vendor URL: http://java.sun.com/
    Java Home: C:\ColdFusion9\runtime\jre
    Java File Encoding: Cp1252
    Java Default Locale: en_US
    File Separator:
    Path Separator:
    Line Separator: Chr(13)

  • We cannot type Polish (non-latin) characters in WebDynpro applications

    We cannot type Polish (non-latin) characters in WebDynpro application (in runtime) because 'Browser Help Shortcuts' are fired.
    To type a Polish character on a Polish keyboard you need to press AltGr + letter (i.e. AltGr + a/c/e/s/o/l/z/x/n). To type an uppercase Polish character you need to press AltGr + Shift + letter. This combination is in fact the same as pressing Alt + Ctrl + Shift + letter (because AltGr produces Alt + Ctrl), and it fires some of the 'Browser Help Shortcuts'. For example, AltGr + Shift + O should produce a letter O with a dash on its top (Ó), but instead it fires 'Show nesting of HTML containers'.
    We tried to turn off sap-wd-lightspeed, but then other key combinations are reserved for 'Browser Help Shortcuts'.
    We need to be able to use AltGr + Shift + a/c/e/s/o/l/z/x/n in runtime.
    Product: SAP NW 7.11 SP04
    WebDynpro for Java
    I hope there is a hidden parameter somewhere that solves our problem. Maybe we're in some kind of debug mode?
    Thanks for your help!!

  • [AS] Problem with non English characters in file path

    I wrote a script that exports a pdf file from ID, rasterizes it in PS, applies an action, saves it as another pdf file, and finally creates a Mail message, and attaches the file to it (the last part is written in AppleScript).
    The problem is that it doesn't work when the path to this file contains non English characters.
    This works:
    make new attachment with properties {file name:"/Volumes/Macintosh HD/BackUp Tetard/Test.pdf"}
    but this doesn't:
    make new attachment with properties {file name:"/Volumes/Macintosh HD/BackUp Têtard /Test.pdf"}
    I remember vaguely that I read somewhere that AppleScript can work with Unicode — in other words with such characters — starting from some version, don't remember which exactly, but it seems to me — Leopard.
    I am on Mac OS X 10.4.11 right now. Will updating solve this problem? Does anybody know any solution to this problem: a scripting addition, some hidden setting, etc.
    I made a little test: used a Russian character — ё and it works, but when I use — ê (Dutch) it doesn't. May it have something to do with the Region setting in International panel?
    Thanks in advance,
    Kasyan

    Kasyan, as of Leopard AppleScript treats all text as Unicode pre this you can specify 'as Unicode text'. Try a test with these.
    -- Leopard
    set x to POSIX path of (path to desktop)
    -- Pre Leopard
    set x to POSIX path of (path to desktop as Unicode text)
    -- Leopard
    set x to POSIX path of (choose file without invisibles)
    -- Pre Leopard
    set x to POSIX path of ((choose file without invisibles) as Unicode text)

  • ACR Raw images wont copy to Jpeg or Tiff; as it fails with "Cannot Create File"

    Requesting immediate help if anyone available currently
    I've been using Photoshop CS6 for several years, never a problem at all. As soon as I bought Creative Cloud, I see many issues that cannot be resolved or are recurring.
    Again today, as I go to copy files (images) from Adobe Camera Raw to, say, a Tiff or a Jpeg, it appears to begin to copy as normal, then suddenly a box appears with all my copying file names/numbers, and it begins to eliminate them one by one with "Cannot Create File", failing the entire copy process. I have never seen this until I began using Creative Cloud about a month ago.
    Can anyone help this Saturday evening, as I am trying to complete a large project of images for others who expect them soon?
    Thank you and I look forward to any advice soon,
    Mark Seibold, Artist-Astronomer, Portland Oregon

    Hi Rick
    Thanks for clarifying that. Yes, that term was my mistake, as I meant to write 'Save As"
    Another forum respondent just replied also, and he states as so many others do, that they do not use
    the Adobe Camera Raw app formerly with CS6, as they instead use Lightroom, so he apologized that he
    was not familiar with ACR but that I might have an old CS6 plug-in running that may not be completely
    compatible with the new Creative Cloud. I am not sure if this is the problem.
    What I would find as not really likely, is that the Creative Cloud engineers would go through all the trouble
    to redesign from the old CS6, then even assist us over the phone to help install it, but in oversight,
    leave us without a proper update to properly run Adobe Camera Raw. I also have never really understood the advantage of Lightroom. I'm sure that many must enjoy its efficiency, as I hear wedding photographers like it for huge batch processing of adjusting many images in synchronicity all at once. I would say that I do the same with ACR accessing it through Bridge, and I like ACR's quick and simple access for simple image adjustments and then say a small batch all synchronized, then finally to "Save As" for all selected to a final output as Tiff or Jpeg.
    As an astronomer that has just started learning to use Registax for night sky images, we also like the ACR adjustments and then to "Save As" eg. all Jpeg or all Tiff, to then photo-stitch large landscape and night sky panoramas with Microsoft ICE photo-stitching.
    Sorry for the long description here, but I hope to relate the total idea of what I am doing for the past years. I'll even attach an example >
    Thanks for your possible solution to my problem in "Saving As" in ACR,
    Mark

  • How to display feeds with non-latin utf8 characters in Raggle?

    Has anyone tried to use raggle to read feeds with non-latin utf8 characters?
    If you are successful, how to do it?
    Thanks

    i have this problem too...
    Last edited by vdo (2008-09-02 12:19:31)

  • Non-Latin Characters lead to finder distress.

    One of the nicest features of Macintosh from the time I first played on an SE30 is the capacity to quickly type non-Latin characters. While to many this might not seem like a big deal, for me being able to write Tetris™ without a second thought is a great convenience.
    So I was very surprised when I began typing µ in the Finder under Snow Leopard and didn't end up on a file that started with µ but rather on m. As if that wasn't irritating enough µ is not treated as m so the file becomes completely unreachable by alphabetic selection. This is something I use almost constantly in Finder so having a file unreachable is even worse than merely having the character interpreted incorrectly.
    Just to be certain that this wasn't merely a flaw with that one character I examined other common characters.
    ƒ, é, π, ∑ all suffer from the same problem, misinterpreted when typed and interpreted correctly during the comparison.
    So the question is, “Is this an error in how I set up my machine, an error in the string comparison system, or an error in the Finder program?”

    Yes, I just double checked, and I was in error, accented Latins do work as expected. I am certain that the inclusion of such in the prior list was a user error.
    However the fact that the Greek key layout works begins to suggest the root of the problem.
    Interestingly enough this also applies to the Greek layouts internal option modified keys.
    I am strongly suspecting a bug here.
