UploadedFile and filenames with non-ascii chars

Hi
I'm using an UploadedFile object in my web app, and everything works fine. However, when I try to upload a file with a filename containing non-ASCII chars (e.g. Spanish), the getBytes method returns an empty byte array, the filename is not stored correctly (the non-ASCII chars are lost, replaced by another representation), and the content type is application/octet-stream instead of image/png as it should be.
If I rename that same file to use only ASCII chars, everything is back to normal.
How can I upload files with non-ASCII chars in their names?
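
Below is a minimal sketch of one way to handle this. It is an assumption, not the original poster's UploadedFile API: it parses the multipart request with Apache Commons FileUpload and tells it to decode the part headers (which carry the filename) as UTF-8 rather than the default (typically ISO-8859-1).

import java.util.List;
import javax.servlet.http.HttpServletRequest;
import org.apache.commons.fileupload.FileItem;
import org.apache.commons.fileupload.disk.DiskFileItemFactory;
import org.apache.commons.fileupload.servlet.ServletFileUpload;

public class Utf8UploadHelper {
    public static void listUploads(HttpServletRequest request) throws Exception {
        request.setCharacterEncoding("UTF-8");              // regular form fields
        ServletFileUpload upload = new ServletFileUpload(new DiskFileItemFactory());
        upload.setHeaderEncoding("UTF-8");                  // part headers, including the filename
        List<FileItem> items = upload.parseRequest(request);
        for (FileItem item : items) {
            if (!item.isFormField()) {
                System.out.println(item.getName() + " | " + item.getContentType()
                        + " | " + item.get().length + " bytes");
            }
        }
    }
}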

Similar Messages

  • Can ims5.2p1 do sieve filter searches with non-ASCII chars?

    In particular, is it possible with the Subject header line, which is
    =?xxx?Q?xxxxx?= or =?xxx?B?xxxxx?= encoded?
    I resolved the fileinto foldername part, but now it seems that ims's
    sieve filter cannot do non-ASCII matches... Thanks.

    Hi, I'm back! I spent a few hours experimenting and found that everything
    works great (including the creation of international, non-ASCII folder
    names) when I use UTF-8 encoding in the sieve filter rules for the match
    strings and the folder names... so far so good... for your reference,
    and sorry for bothering.

  • CFMail breaks up attachments filename with non ASCII

    Hi CF-Devs out there!
    My problem is that when attaching files via cfmailparam - no matter
    whether inline or not - with a filename that contains a non-7-bit-ASCII
    character or even a simple space, the filename is not displayed correctly
    in Outlook or Thunderbird. I am sure this topic has been discussed here
    before, but I really can't find any thread with the forum search
    function... so please help anyway :) I am still using CFMX 6.1 :(
    My code is the following:
    <cfmailparam type="#variables.attachments.getContentType()#"
                 file="#variables.attachments.getFilename()#"
                 disposition="#this.dispositionType(variables.attachments.isInline())#"
                 contentID="#variables.attachments.getCid()#">
    When I inspect the generated email source, I find the following:
    Content-Type: application/octet-stream
    Content-Transfer-Encoding: base64
    Content-Disposition: attachment; filename*2=nger.csv;
    filename*0*=UTF-8''emailempf; filename*1*=%c3%a4
    Content-ID: <DEE13419-D022-9396-C8A4DF2AAAA7C4EF>
    You can see that the filename (emailempfänger.csv) is broken up into
    several pieces, which according to RFC 2231 is a valid format but
    obviously is not interpreted correctly by e.g. Outlook. Or am I missing
    something else?
    I would appreciate your help. Thanks in advance.
    Buergermeister

    The filename you used does not contain any non-7-bit-ASCII character.
    The problem only occurs if I use characters like "ä", "ö", "ü", but even
    a [space] provokes the error.
    The variables hold the following values (you can see them in
    the source posted, too):
    1) variables.attachments.getContentType() :
    application/octet-stream
    2) variables.attachments.getFilename() :
    [AbsoluteFilePath]/emailempfänger.csv
    3) this.dispositionType(variables.attachments.isInline()) :
    attachment
    4) variables.attachments.getCid():
    DEE13419-D022-9396-C8A4DF2AAAA7C4EF
    Cheers!
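
    A hedged workaround sketch, in JavaMail rather than CFML (CFMX's mail layer is
    not shown here, so this only illustrates the idea): clients that ignore RFC 2231
    continuations usually do understand RFC 2047 encoded-words, so encoding the
    filename with MimeUtility.encodeText before setting it keeps it in one piece.

    import javax.activation.DataHandler;
    import javax.mail.Part;
    import javax.mail.internet.MimeBodyPart;
    import javax.mail.internet.MimeUtility;
    import javax.mail.util.ByteArrayDataSource;

    public class AttachmentNameDemo {
        public static MimeBodyPart namedAttachment(byte[] data, String fileName) throws Exception {
            MimeBodyPart part = new MimeBodyPart();
            part.setDataHandler(new DataHandler(new ByteArrayDataSource(data, "application/octet-stream")));
            part.setDisposition(Part.ATTACHMENT);
            // Yields filename="=?UTF-8?Q?emailempf=C3=A4nger.csv?=" instead of filename*0*=... pieces
            part.setFileName(MimeUtility.encodeText(fileName, "UTF-8", "Q"));
            return part;
        }
    }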

  • FILE_DATASTORE and non-ASCII chars

    I have created an interMedia Text index with the FILE_DATASTORE option,
    so that interMedia treats table entries as filenames and indexes the
    corresponding file on the server's filesystem.
    But whenever the filename contains characters which are not part of the
    US7ASCII charset (like ä, ö, ü or ß), the file is not found, even though
    both Oracle and the operating system support these characters.
    The Oracle instance uses UTF8 as internal
    characterset. The client which stores
    the filenames in the table uses the
    WE8ISO8859P1 charset. The values in the
    database table are stored and shown correctly
    when viewed with Oracle or Java client
    programs.
    So where does the charset conversion fail? The names are stored
    correctly and can be read correctly by clients, but the indexer seems
    to use the wrong charset to translate the filenames stored in the
    database into filenames of the operating system.
    Do I have to apply some additional configuration to my indexer?
    Greetings,
    Michael Skusa

  • iPhone: sectionIndexTitlesForTableView and non-ASCII chars

    You specify an array of strings for your section titles in your implementation of the UITableViewDataSource method sectionIndexTitlesForTableView. However, it seems that if a string contains non-ASCII characters, its entry is left blank (for example the Korean string "누리마루"). Has anybody else encountered this issue?

    I'm bumping Dr. Chuck's thread for a similar problem with non-ASCII chars.
    The chars show up, but the sorting is a bit off.
    Example: A, Å, B, ... Z
    Should be: A, B, ... Z, Å, Ä, Ö
    In Swedish, Å (the letter A-ring) is one of the last letters of the alphabet and should not be placed right after A despite looking similar.
    Any ideas?
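
    A small illustration of the point above (not from the original thread): a
    java.text.Collator for the Swedish locale sorts Å/Ä/Ö after Z, whereas plain
    String ordering does not.

    import java.text.Collator;
    import java.util.Arrays;
    import java.util.Locale;

    public class SwedishSortDemo {
        public static void main(String[] args) {
            String[] letters = {"Å", "B", "A", "Ö", "Z", "Ä"};

            Arrays.sort(letters);                        // code-point order: A, B, Z, Ä, Å, Ö
            System.out.println("Default: " + Arrays.toString(letters));

            Arrays.sort(letters, Collator.getInstance(new Locale("sv", "SE")));
            System.out.println("Swedish: " + Arrays.toString(letters));   // A, B, Z, Å, Ä, Ö
        }
    }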

  • Can't get the attachment filename out of a Part (with non ascii characters)

    Hello all, and happy new year :)
    My issue is with non-ASCII filenames in attachments... Yes, I've read the FAQ: http://www.oracle.com/technetwork/java/faq-135477.html#encodefilename
    I can't get the filename out of the BodyPart for those kinds of attachments.
    Here's my unit test:
     // contains various parts from various mailers, encoded in different ways...
     private enum EncodedFileNamePart {
          OUTLOOK("Content-Type: text/plain;\n name=\"=?iso-8859-1?Q?c'estd=E9j=E0no=EBl=E7ac'estcool.txt?=\" \nContent-Transfer-Encoding: 7bit\nContent-Disposition: attachment;\n filename=\"=?iso-8859-1?Q?c'estd=E9j=E0no=EBl=E7ac'estcool.txt?=\" \n\nnoel 2010\n","c'estdéjànoëlçac'estcool.txt"),
          GMAIL("Content-Type: text/plain; charset=US-ASCII; name=\"=?ISO-8859-1?B?ZOlq4G5v62znYWNlc3Rjb29sLnR4dA==?=\"\nContent-Disposition: attachment; filename=\"=?ISO-8859-1?B?ZOlq4G5v62znYWNlc3Rjb29sLnR4dA==?=\"\nContent-Transfer-Encoding: base64\nX-Attachment-Id: f_giityr5r0\n\namluZ2xlIGJlbGxzIQo=\n","déjànoëlçacestcool.txt"),
          THUNDERBIRD("Content-Type: text/plain;\n name=\"=?ISO-8859-1?Q?d=E9j=E0no=EBl=E7acestcool=2Etxt?=\"\nContent-Transfer-Encoding: 7bit\nContent-Disposition: attachment;\n filename*0*=ISO-8859-1''%64%E9%6A%E0%6E%6F%EB%6C%E7%61%63%65%73%74%63%6F;\n filename*1*=%6F%6C%2E%74%78%74\n\njingle bells!\n","déjànoëlçacestcool.txt"),
          EVOLUTION("Content-Disposition: attachment; filename*=ISO-8859-1''d%E9j%E0no%EBl.txt\nContent-Type: text/plain; name*=ISO-8859-1''d%E9j%E0no%EBl.txt; charset=\"UTF-8\" \nContent-Transfer-Encoding: 7bit\n\njingle bells\n","déjànoël.txt");

          private final String content;
          private final String target;

          private EncodedFileNamePart(String content, String target) {
               this.content = content;
               this.target = target;
          }

          public Part get() {
               try {
                    ByteArrayInputStream bis = new ByteArrayInputStream(this.content.getBytes());
                    Part part = new MimeBodyPart(bis);
                    bis.close();
                    return part;
               } catch (Throwable e) {
                    return null;
               }
          }

          public String getTarget() {
               return this.target;
          }
     }

     @Test
     public void testJavamailDecode() throws MessagingException, UnsupportedEncodingException {
          System.setProperty("mail.mime.encodefilename", "true");
          System.setProperty("mail.mime.decodefilename", "true");
          for (EncodedFileNamePart part : EncodedFileNamePart.values())
               assertEquals(part.name(), MimeUtility.decodeText(part.get().getFileName()), part.getTarget());
     }
    I get a NullPointerException in decodeText because getFileName() returns null for the EVOLUTION case; it works well with OUTLOOK, THUNDERBIRD and GMAIL.
    Evolution's header is "Content-Disposition: attachment; filename*=ISO-8859-1''d%E9j%E0no%EBl.txt", which doesn't look like the others (it is the RFC 2231 / RFC 5987 parameter style).
    How can I handle this situation other than by writing my own decoder?
    Thanks for your answers!
    Edited by: user13619058 on Jan 4, 2011 07:44

    Set the System property "mail.mime.decodeparameters" to "true" to enable the RFC 2231 support.
    See the javadocs for the javax.mail.internet package for the list of properties.
    Yes, the FAQ entry should contain those details as well.
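
    A minimal sketch of that fix applied to the test above (the property names are
    real JavaMail properties; the rest of the test stays unchanged):

         // Enable RFC 2231 parameter decoding in addition to RFC 2047 decoding, so that
         // Part.getFileName() also resolves Evolution-style "filename*=" parameters.
         System.setProperty("mail.mime.decodeparameters", "true");
         System.setProperty("mail.mime.decodefilename", "true");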

  • File upload with non-ascii name

    I'm designing a system that includes file uploads. My problem is that any non-ASCII chars in the filename are encoded strangely when saved: ä is stored as a decomposed sequence ("a" followed by a combining mark), etc.
    I use Tomcat with -Dfile.encoding="UTF-8" in the Catalina file. I get the same result regardless of method: my own implementation, Apache Commons, or Javazoom's UploadBean. All the JSP charset parameters are set.
    Any ideas?

    Hi amitads,
    I'm sure you've used Java enough. Also, you must have learned about the \ (backslash) escape character?
    So, if you want to include a character like a single quote in a string, you can write \', and for a double quote you can write \".
    Try this... I'm sure it will help.
    Keep me posted.
    Cheers!!
    Sherbir.
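
    A hedged sketch, not from this thread: the symptom in the question above - ä
    saved as "a" plus a combining mark - is decomposed Unicode (NFD). Normalizing
    the uploaded filename to NFC before saving usually restores the precomposed
    characters.

    import java.text.Normalizer;

    public class FileNameNormalizer {
        public static String toNfc(String uploadedName) {
            return Normalizer.normalize(uploadedName, Normalizer.Form.NFC);
        }

        public static void main(String[] args) {
            String decomposed = "a\u0308";                           // "a" + combining diaeresis
            System.out.println(toNfc(decomposed).equals("\u00E4"));  // true: precomposed ä
        }
    }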

  • Folders that have non-ASCII chars in their names are not displayed on Mac using JFileChooser

    On Mac OS X 10.8.2, I have the latest Java 1.7.0_25 installed. When I run my simple Java program, which lets me browse the files and folders of the native file system using JFileChooser, it does not show folders that have non-ASCII chars in their names. According to this link, this bug was reported for Java 7 Update 6 and fixed in 7 Update 10, but I am still getting the issue in Java 1.7.0_21 and 1.7.0_25.
    Sample code:
    {code}
    import java.util.logging.Level;
    import java.util.logging.Logger;
    import javax.swing.JFileChooser;
    import javax.swing.UIManager;
    import javax.swing.UnsupportedLookAndFeelException;

    public class Encoding {
        public static void main(String[] arg) {
            try {
                // NOTE: on the desktop there is a folder named DKF}æßj (special chars in its name).
                // It is not shown in the file chooser, and reading it as a File fails with
                // FileNotFoundException; it is not recognised as either a directory or a file.
                UIManager.setLookAndFeel(UIManager.getSystemLookAndFeelClassName());
            } catch (IllegalAccessException ex) {
                Logger.getLogger(Encoding.class.getName()).log(Level.SEVERE, null, ex);
            } catch (UnsupportedLookAndFeelException ex) {
                Logger.getLogger(Encoding.class.getName()).log(Level.SEVERE, null, ex);
            } catch (ClassNotFoundException ex) {
                Logger.getLogger(Encoding.class.getName()).log(Level.SEVERE, null, ex);
            } catch (InstantiationException ex) {
                Logger.getLogger(Encoding.class.getName()).log(Level.SEVERE, null, ex);
            }
            JFileChooser chooser = new JFileChooser(".");
            chooser.showOpenDialog(null);
        }
    }
    {code}

    Hi,
    Did you try this link - osx - File.list() retrieves file names with NON-ASCII characters incorrectly on Mac OS X when using Java 7 from Oracle -…
    Set the LANG environment variable. For a GUI application that you want to deploy as a Mac OS X application, the LSEnvironment setting in Info.plist does this:
    <key>LSEnvironment</key>
    <dict>
        <key>LANG</key>
        <string>en_US.UTF-8</string>
    </dict>

  • Java 5, Linux, 64-bit: Non-ASCII chars over socket

    Hi,
    I am having issues with reading non-ASCII chars from a socket. I send a mixed message, with the first part in ASCII and the last part in non-ASCII. There is no problem reading the non-ASCII characters on Windows; the problem appears when I run the server on Linux. The following is a message sample:
    Start message<CRLF>
    фвафывафвыафыв<CRLF>
    The second part (which is encoded in either Windows-1250 or KOI8-R) comes out as 3F bytes ('?') on Linux.
    Any suggestions?
    Thanks,
    Max

    You must be using Readers and Writers, and you need to make sure you specify the same charset when constructing them on both ends. Don't leave this to the default, as the default varies across platforms and has definitely varied across releases.
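
    A minimal sketch of the reading side under that advice (the host, port and the
    choice of KOI8-R are placeholder assumptions, not from the thread): the writer
    must be constructed with the same charset.

    import java.io.BufferedReader;
    import java.io.IOException;
    import java.io.InputStreamReader;
    import java.net.Socket;

    public class CharsetSocketReader {
        public static void main(String[] args) throws IOException {
            Socket socket = new Socket("localhost", 9090);   // placeholder host/port
            BufferedReader in = new BufferedReader(
                    new InputStreamReader(socket.getInputStream(), "KOI8-R")); // must match the writer
            try {
                String line;
                while ((line = in.readLine()) != null) {
                    System.out.println(line);
                }
            } finally {
                in.close();
                socket.close();
            }
        }
    }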

  • GSSName  is corrupted for non ascii chars

    Hi,
    I have a setup where a web application is deployed to use SPNEGO for user authentication ( using kerberos V ) and authorization.
    We have several users with non-English characters in their user IDs. Kerberos authentication succeeds for such users (the KDC / Active Directory returns a valid Kerberos ticket, which the client embeds in the SPNEGO token). However, when the SPNEGO token is passed to the GSS API and the user name is extracted from it, the GSS API returns an incorrect user name: all non-ASCII characters in the user name are replaced with junk byte sequences.
    We use the JGSS API (with JRE 1.4.08) to extract the SPNEGO token and create a GSS security context object. Later, the GSS name is extracted from the GSS context object.
    Currently I am testing the SPNEGO authentication for a user with user ID 123<sp char>. The <sp char>'s Unicode value is FE and its UTF-8 encoded byte sequence is C3 BE. However, if I invoke the 'export' method of the GSSName object and examine the returned byte sequence, instead of C3 BE the byte sequence EF BF BD EF BF BD is present (EF BF BD is the UTF-8 encoding of U+FFFD, the replacement character). The byte sequences for the other, English characters are correct.
    Is this a defect in GSS-API? Or am I not using GSS properly?
    Do I need any special setup / configuration to use JGSS with Kerberos V for users with non-ASCII characters in the user ID?
    Please advise.
    Regards,
    Jayaram.
    Message was edited by:
    s_jayaram_s

    I understand that this is an older thread.
    We have spent a lot of time searching the internet for any possible workaround or permanent solution to enable multi-byte character support for username / password / SPN with Kerberos and Java, but no luck so far :(
    Are there any new updates on this i18n issue?
    Thanks,
    Venkatesh

  • How Aperture encodes non-ascii chars

    It looks like Aperture encodes chars as UTF in a way that seems a bit unusual; many applications (especially web apps) expect another way of encoding Unicode chars. The end result is that non-ASCII chars come out really strange.
    When I use applescript to read the IPTC info I need to do something like this before using the data:
    set c_title to normalize unicode p_title without decomposition
    Is there some way of telling Aperture to do this automatically when exporting photos?
    (sorry about the bad use of Unicode terminology but I don't know the area very well)

    I've spent quite a bit of time on this topic - I'm the author of Phoshare http://code.google.com/p/phoshare/, an open-source tool to export images and metadata from iPhoto and Aperture. Storing non-ASCII characters in image metadata is a messy business. If you want to get a taste of it, have a look at the Exiftool FAQ at http://www.sno.phy.queensu.ca/~phil/exiftool/faq.html#Q10 . A quote: "Most textual information in EXIF is stored in ASCII format, ... However *it is not uncommon for applications to write UTF-8 or other encodings where ASCII is expected*". This leads to all sorts of problems, because the reader has to make assumptions about the encoding used by the writer.
    In the case of Aperture, I have found that it writes metadata encoded in a way consistent with what most other applications and online services expect. Most of the encoding problems I've debugged were caused by bad input data, e.g. characters that were encoded improperly to begin with, but in a way that still displays as expected in some places (like Aperture). Can you try erasing some of the metadata in Aperture and retyping it from scratch (i.e. not copy-paste, which would paste the same bad data right back)? Also make sure that you force-update the preview images, so that the new metadata gets written into the files.

  • Problems with non-ASCII characters on Linux Unit Test Import

    I found a problem with non-ASCII characters in the Unit Test Import on Linux. The problem does not appear in the Unit Test Import on Windows.
    I have attached a unit test export called PROC1.XML. It tests a procedure that is included in another attachment called PROC1.txt. The unit test includes 2 implementations. Both implementations pass non-ASCII characters to the procedure and return them unchanged.
    On Linux, the unit test import changes the non-ASCII characters in the XML file to xFFFD. If I copy/paste the non-ASCII characters into the unit test after the import, they are stored and executed correctly.
    Amazon Ubuntu 3.13.0-45-generic / lubuntu-core
    Oracle 11g Express Edition - AL32UTF8
    SQL*Developer 4.0.3.16 Build MAIN-16.84
    Java(TM) SE Runtime Environment (build 1.7.0_76-b13)
    Java HotSpot(TM) 64-Bit Server VM (build 24.76-b04, mixed mode)
    In Windows, the unit test will import the non-ASCII characters unchanged from the XML file.
    Windows 7 Home Premium, Service Pack 1
    Oracle 11g Express Edition - AL32UTF8
    SQL*Developer 4.0.3.16 Build MAIN-16.84
    Java(TM) SE Runtime Environment (build 1.8.0_31-b13)
    Java HotSpot(TM) 64-Bit Server VM (build 25.31-b07, mixed mode)
    If SQL*Developer is coded the same on Windows and Linux, the JVM (or its default encoding) must be causing the problem.
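
    A hedged guess at a workaround (not verified against this report): if the difference really is the JVM's default encoding, forcing it in SQL Developer's configuration file should make both platforms behave the same, e.g.:

         # sqldeveloper/bin/sqldeveloper.conf
         AddVMOption -Dfile.encoding=UTF-8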

  • Filling clob with non ascii characters

    Hello,
    I have had some problems with CLOBs and the use of German umlauts (ä, ö, ü). I wasn't able to insert or update strings containing umlauts in combination with string binding. After inserting or updating, the umlaut characters were replaced by strange upside-down question marks ('¿', as used in Spanish).
    However, it was working when I did not use string binding.
    I tried various things; after some time I tracked the problem down to oracle.toplink.queryframework.SQLCall.java. In prepareStatement(...) you find something like
    ByteArrayInputStream inputStream = new ByteArrayInputStream(((String) parameter).getBytes());
    // Binding starts with a 1 not 0.
    statement.setAsciiStream(index + 1, inputStream, ((String) parameter).getBytes().length);
    I replaced the usage of ByteArrayInputStream with CharArrayReader:
    // TH changed, 26.11.2003, umlauts will not work with setAsciiStream.
    CharArrayReader reader = new CharArrayReader(((String) parameter).toCharArray());
    statement.setCharacterStream(index + 1, reader, ((String) parameter).length());
    and this worked.
    Is there any other way of achieving this? Did anyone
    get CLOBs with non-ASCII characters to work?
    Regards -- Tobias
    (TopLink 9.0.3, CLOB was mapped to String, driver was Oracle OCI)

    I don't think the console font is the problem. I use Lat2-Terminus16 because I read the Beginner's Guide on the wiki while installing the system.
    My /etc/vconsole.conf:
    KEYMAP=de
    FONT=Lat2-Terminus16
    showconsolefont even shows me the characters missing in the file names; e.g.: Ö, Ä, Ü

  • Non-ASCII chars in applets?

    Hi,
    I spent 4 hours trying to find a way to use non-ASCII chars in applets (buttons, text areas), but didn't manage it.
    Simply saying
    TextFieldObj.setText("\uxxxx");
    // or any equivalent object; example of \uxxxx: \u015F
    doesn't work. I even went through the Graphics.paint() example, but it too can paint only ASCII chars.
    My hunch is that it is something about Character.Subset, but I still can't figure out how to do it.
    Please SOS,
    Reshat.

    Hi,
    I just managed to get Buttons to show Greek characters, so it appears that static buttons are fine.
    However, I still face the same problem with TextFields:
    TextFields work fine in IE, but in NN they sometimes convert to ASCII and sometimes give '?'. The same happens in HotJava.
    So there are 2 questions in my head:
    1. Why can't NN use the fonts used by IE to display non-ASCII chars?
    2. What is the safest font to use for non-ASCII chars, to cover the widest possible audience?
    P.S. Java solves most cross-platform-browser problems, but the font issue still seems to depend on the user and his/her browser. It appears Java is not font-independent in a non-ASCII context. If so, it would be nice to develop a plug-in to make sure that if the user doesn't have the font, a Java-standardized Unicode-based font is used. Otherwise, the non-ASCII world is still without a real Java.
    Thank you for your feedback,
    Reshat.
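
    A minimal sketch along the lines of the reply above (an assumption about the
    setup, since the original applet code isn't shown): explicitly set a logical
    font such as "Dialog" on the TextField before setting non-ASCII text, so the
    runtime can map it to a platform font that has the needed glyphs.

    import java.applet.Applet;
    import java.awt.Font;
    import java.awt.TextField;

    public class UnicodeFieldApplet extends Applet {
        public void init() {
            TextField field = new TextField(20);
            field.setFont(new Font("Dialog", Font.PLAIN, 14)); // logical font; glyph coverage depends on the platform
            field.setText("\u015F\u011F\u0131");               // ş, ğ, ı
            add(field);
        }
    }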

  • Problems with non-ascii keywords

    I have some problems with non-ASCII keywords that make the whole keyword feature useless for me. I don't know if I'm doing something wrong or if there is something I'm missing completely.
    The problem is that when I enter something like "grön", iPhoto sometimes refuses to let me type "grön" on another photo and "eats" the "ö". And if I select the matching keyword from the popup list, iPhoto has changed the original "grön" to "gr¨ön" (if that comes out right on the web). Here are a few screenshots to show what happens.
    If someone knows what is happening, I would really appreciate some hints on how to avoid this.
    Note that adding/editing keywords works just fine in Aperture.

    There is a long standing issue with iPhoto and non-ascii characters. I know of no solution.
    iPhoto menu -> Provide iPhoto Feedback and report it as a bug.
    Regards
    TD
