Read then rewrite non-ascii file

Hello,
Please could somebody answer this question.
I need to read in various filetypes to a Java app and then eventually write them out again. They will include jpg, gif, exe, pdf files and maybe some others.
My question is... if I read these files into byte arrays, and then write them back out to another file, will the file be intact or does Java do anything 'weird' to the data?
Thanks

A byte[] should work fine, and no, Java doesn't do anything special to the data.
But honestly, if you're doing all this file manipulation, I would probably want to do this in C/C++.
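A minimal sketch of the byte-array approach, assuming Java 7 or later (the class and method names are made up for illustration). Nothing here goes through a Reader or Writer, so no charset conversion can touch the data:

```java
import java.io.IOException;
import java.nio.file.Files;
import java.nio.file.Path;
import java.util.Arrays;

class ByteCopy {
    // Copies a file byte for byte. The whole file is held in memory,
    // so this suits small and medium files only.
    static void copy(Path source, Path target) throws IOException {
        byte[] data = Files.readAllBytes(source);
        Files.write(target, data);
    }

    public static void main(String[] args) throws IOException {
        // Every possible byte value, including ones that are not valid text.
        byte[] allValues = new byte[256];
        for (int i = 0; i < allValues.length; i++) {
            allValues[i] = (byte) i;
        }
        Path source = Files.createTempFile("bytecopy-src", ".bin");
        Path target = Files.createTempFile("bytecopy-dst", ".bin");
        Files.write(source, allValues);
        copy(source, target);
        // Compare the copy with the original bytes.
        System.out.println(Arrays.equals(allValues, Files.readAllBytes(target)));
    }
}
```

The usual way to damage a jpg or exe is to read it through a Reader or turn it into a String along the way; staying with streams and byte[] avoids that.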

Similar Messages

Problem with non-ASCII file name in content disposition header

Hi All,
I am facing some problems with non-ASCII file names in the Content-Disposition header. I read in RFC 2183 that if the file name contains non-ASCII characters, it should be encoded before being sent to the browser. I did this but found two problems:
1. The name of the file is truncated in case the file name is slightly long for e.g. ��j�b�g��.txt
2. Also when the same file is opened in notepad, the title is showing encoded name %E6%9C%80%E4%B8%8A%E4%BD%8D.....
Overall, I feel that the browser is not understanding or responding to the encoded header values.
Is there any solution to this problem? I am using Microsoft IE 6.0.
The code snippet is given below:
protected void doGet(HttpServletRequest req, HttpServletResponse resp) throws ServletException, IOException {
          String fileName = "��j�b�g��.txt";
          fileName = URLEncoder.encode(fileName, "UTF-8");
          resp.setCharacterEncoding("UTF-8");
          resp.setHeader("Content-Disposition", "attachment; filename=\"" + fileName + "\"");
          resp.setContentType("application/download-binary");
          String s = "This is inside txt file";
          resp.getOutputStream().write(s.getBytes("UTF-8"));
          return;
     }
Any help or pointers would be highly appreciated.
Thanks and Regards,
Ashish

The MIME standards for non-ASCII filenames are not widely implemented.
Many mailers use an ad hoc method for encoding filenames. JavaMail
supports both methods, but you need to set properties, such as the
mail.mime.encodefilename property. See the JavaMail javadocs for
the javax.mail.internet package.
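For the servlet in the question, one option not mentioned in the thread is the `filename*` parameter from RFC 6266 / RFC 5987. The sketch below only builds the header value; the class name is made up, and the sample name is taken from the first encoded bytes quoted in the question. Older browsers such as the IE 6 in the question may not understand this form, so treat it as an illustration rather than a fix for that browser:

```java
import java.io.UnsupportedEncodingException;
import java.net.URLEncoder;

class ContentDispositionSketch {
    // Builds "attachment; filename*=UTF-8''<percent-encoded name>".
    static String attachmentHeader(String fileName) {
        try {
            // URLEncoder targets HTML forms: it writes a space as '+' and
            // leaves '*' alone, so both are patched for use in a header.
            String encoded = URLEncoder.encode(fileName, "UTF-8")
                    .replace("+", "%20")
                    .replace("*", "%2A");
            return "attachment; filename*=UTF-8''" + encoded;
        } catch (UnsupportedEncodingException e) {
            throw new IllegalStateException("UTF-8 is always available", e);
        }
    }

    public static void main(String[] args) {
        // Three CJK characters written as escapes to keep the source ASCII.
        System.out.println(attachmentHeader("\u6700\u4E0A\u4F4D.txt"));
    }
}
```

In the servlet this value would be passed to resp.setHeader("Content-Disposition", ...) in place of the quoted filename="..." form.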

JavaMail with non-ascii file name attachment.

Hello,
I attach a file whose name is in Chinese to the mail. After I send it and receive it in another email client (MS Outlook), the attachment's file name is not displayed correctly.
How can I solve this?

The MIME standards for non-ASCII filenames are not widely implemented.
Many mailers use an ad hoc method for encoding filenames. JavaMail
supports both methods, but you need to set properties, such as the
mail.mime.encodefilename property. See the JavaMail javadocs for
the javax.mail.internet package.
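A rough sketch of what the property mentioned above does. As far as I know, with mail.mime.encodefilename set to "true", MimeBodyPart.setFileName encodes a non-ASCII name as an RFC 2047 encoded-word. JavaMail is not used below; the encoded-word is built by hand only to show the shape of the result, and the class name and sample file name are made up. The real library may choose a different encoding or split long names:

```java
import java.nio.charset.StandardCharsets;
import java.util.Base64;

class EncodedFileNameSketch {
    // Hand-rolled RFC 2047 "B" encoded-word, for illustration only.
    static String encodedWord(String text) {
        byte[] utf8 = text.getBytes(StandardCharsets.UTF_8);
        return "=?UTF-8?B?" + Base64.getEncoder().encodeToString(utf8) + "?=";
    }

    public static void main(String[] args) {
        // Set the property before any JavaMail class is used.
        System.setProperty("mail.mime.encodefilename", "true");
        // A Chinese file name written as escapes to keep the source ASCII.
        System.out.println(encodedWord("\u4E2D\u6587.txt"));
    }
}
```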

Reading and writing non-English files

Hi there..
I wonder if anyone can tell me how to read and write non-English files, such as French or Arabic ones...
I'm also interested in knowing how I can convert from Unicode to ASCII and vice versa.
Thanks, Mourad

hi there ..
Thanks for your cooperation, but I have tried the following code and it didn't work, so I'm looking for more help :)
Especially if there are any notes about the files themselves, for example.
By the way, I used Notepad files saved in Unicode format.
my code is :
import java.io.*;

public class MainClass {
    public static void main(String args[]) {
        String inputfile = "arabic.txt";
        String outfile = "outfile.txt";
        try {
            RandomAccessFile raf = new RandomAccessFile(inputfile, "r");
            InputStreamReader isr = new InputStreamReader(
                    new FileInputStream(inputfile), "Cp1256");
            OutputStreamWriter osw = new OutputStreamWriter(
                    new FileOutputStream(outfile), "Cp1256");
            for (int i = 0; i < raf.length(); i++)
                osw.write(isr.read());
            osw.close();
        } catch (UnsupportedEncodingException uee) {
            System.out.println("UEException: " + uee.getMessage());
        } catch (FileNotFoundException fnfe) {
            System.out.println("FNFException: " + fnfe.getMessage());
        } catch (IOException ioe) {
            System.out.println("IOException: " + ioe.getMessage());
        }
    }
}
thanx again..Mourad
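A likely cause, offered as an assumption rather than a diagnosis: Notepad's "Unicode" format is UTF-16 with a byte order mark, so decoding it as Cp1256 produces garbage, and the loop above counts bytes while read() returns characters. A minimal sketch that reads until read() returns -1 and names the input charset explicitly (class and method names are made up, Java 7 or later assumed):

```java
import java.io.IOException;
import java.io.Reader;
import java.io.Writer;
import java.nio.charset.Charset;
import java.nio.charset.StandardCharsets;
import java.nio.file.Files;
import java.nio.file.Path;
import java.nio.file.Paths;

class Utf16Rewrite {
    // Reads text in one charset and writes the same text in another.
    static void transcode(Path in, Charset inCharset, Path out, Charset outCharset)
            throws IOException {
        try (Reader reader = Files.newBufferedReader(in, inCharset);
             Writer writer = Files.newBufferedWriter(out, outCharset)) {
            char[] buf = new char[4096];
            int n;
            // read() returns the number of chars decoded, or -1 at end of file.
            while ((n = reader.read(buf)) != -1) {
                writer.write(buf, 0, n);
            }
        }
    }

    public static void main(String[] args) throws IOException {
        // File names taken from the post. The UTF_16 decoder uses the
        // byte order mark to pick the byte order.
        transcode(Paths.get("arabic.txt"), StandardCharsets.UTF_16,
                  Paths.get("outfile.txt"), StandardCharsets.UTF_8);
    }
}
```

Converting to plain ASCII is lossy for Arabic text, since those characters have no ASCII equivalent; UTF-8 is used as the output here for that reason.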

How to install tamil fonts to read and write non pdf files

Hi,
I bought a new MacBook Pro. I want to edit my documents, which are typed in Vanavil Avvaiyar, a Tamil font, but I don't know how to install it. I also doubt whether this Mac supports Tamil fonts in Office for Mac.

The Mac supports Tamil input Method and lists Anjal and Tamil99.
Microsoft Office probably does not.

Problem searching some PDF files in Acrobat Reader – Non-ASCII characters

Acrobat Reader cannot search some .pdf files. I have put an example document up on Scribd here.
Any attempt to search for any word that can be clearly seen to be in the document fails with “No matches were found.”
This example document is NOT a scanned document – words and characters can be selected.
A hex display tool shows that the characters in a PDF document that can be successfully searched are in the ASCII/1252 range (A=0x41, etc).
Copying and pasting characters in the example document to a hex display tool shows that the characters in the document are not in the ASCII range.
For example the letters A to Z in the example document are in the range ‘A’ = 0xDF (decimal 223), ‘B’ = 0xDE (decimal 222), through to ‘Z’ = 0xC6 (decimal 198).
However, characters in these non-ASCII ranges are displayed perfectly by Acrobat Reader, as can be seen if the example document is opened.
Therefore, as Acrobat Reader knows what these characters are, it doesn’t seem unreasonable to say that it should be able to search for and find them.
Tests were performed using Acrobat Reader X v10.1.4.
Can anyone say what this problem is?

Hi Pat, thanks for your reply.
Your reference to the title of that page being 'HARNESSES' indicates that, when you view that document in Adobe Reader, you are seeing 'HARNESSES', not
"ØßÎÒÛÍÍÛÍ". And that the remainder of the document is similarly being displayed in readable English language.
Yes as you say, you can search for 'ß' and get hits on 'A' (to use that as an example) in the example document.
But the need to form a word to be searched for into whatever code mapping this is using (for example having to enter "ØßÎÒÛÍÍ" for HARNESSES - I'm not even sure how that would be entered from a keyboard) doesn't seem to be very convenient.
It's clear the example document is using some code mapping other than ASCII / Windows-1252 (which has 'A' as 0x41). But it is also clear that Adobe Reader knows what that mapping is, and knows to use it, as it's displaying (for example) 'A' for the code 0xDF.
So I guess the question is - why isn't Adobe Reader's knowledge of this mapping being extended to its search input?
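As a stopgap, the search term can be translated into the document's codes before pasting it into the search box. The sketch below assumes the mapping is linear across A to Z, which is inferred only from the three values quoted above ('A' = 0xDF, 'B' = 0xDE, 'Z' = 0xC6); lowercase letters, digits and punctuation are passed through untouched because the posts give no values for them. The class name is made up:

```java
class GlyphCodeMapping {
    // Maps 'A'..'Z' to 0xDF..0xC6, counting downwards; other characters
    // are returned unchanged because their codes are not known.
    static String toDocumentCodes(String word) {
        StringBuilder sb = new StringBuilder();
        for (char c : word.toCharArray()) {
            if (c >= 'A' && c <= 'Z') {
                sb.append((char) (0xDF - (c - 'A')));
            } else {
                sb.append(c);
            }
        }
        return sb.toString();
    }

    public static void main(String[] args) {
        // The console must be able to display Latin-1 characters.
        System.out.println(toDocumentCodes("HARNESSES"));
    }
}
```

For HARNESSES this produces the same nine characters quoted earlier in the thread, which is some evidence that the linear assumption holds for capital letters.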

Cannot rename file with non-ASCII characters when using the

My application moves files from one directory to another by calling File[] srcFiles = srcDir.listFiles() to get a list of files in the source directory, and then calling renameTo(destFile) on each element of srcFiles to rename each file.
This does not work (renameTo returns false and the file is not moved) under the following circumstances:
- the file's leaf name contains non-ASCII characters, for example "�"
- the OS is Solaris 9
- the LANG and LC_* environment variables are unset, i.e. the C locale is being used
If I set the LANG environment variable to, for example, en_GB.UTF-8 then the rename succeeds.
I have tried calling srcFiles[index].getName().getBytes("UTF-8") and the non-ASCII characters are being replaced with ? (0x3f) characters when LANG is unset.
Is this a bug in the JRE? I would argue that since my code does not actually manipulate the filename (I just use the File object that File.listFiles() gives me) then the rename should succeed. Of course I would not expect the file name to be displayed correctly if I printed it out.
I have reproduced this behaviour with JDK 1.4.2_05 and 1.5.0_04 on Solaris 9.
Francis

Thanks for the info Alan.
I considered setting the locale in the environment (this sounds like the "correct" fix to me and we might implement it later), but this application shares a WebLogic server with many other applications so we would have to do a huge amount of testing to make sure that the locale change wouldn't break the other apps. In the end I worked around the problem by making the code that generates the filenames in the first place strip out any non-ASCII characters (the names of the files are not critically important).
Looking forward to JSR-203, in the meantime perhaps a note about this behaviour in the java.io.File javadoc would be useful.
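For reference, JSR-203 shipped as the java.nio.file package in Java 7. A minimal sketch of the same move loop with that API follows; the class and method names are made up. Whether it copes with undecodable names in the C locale on Solaris is not something this sketch claims, but Files.move at least throws an exception describing the failure instead of returning false:

```java
import java.io.IOException;
import java.nio.file.DirectoryStream;
import java.nio.file.Files;
import java.nio.file.Path;
import java.nio.file.Paths;
import java.util.ArrayList;
import java.util.List;

class MoveFilesSketch {
    // Moves every regular file in srcDir into destDir; returns the count.
    static int moveAll(Path srcDir, Path destDir) throws IOException {
        // Collect first so the directory is not modified while it is listed.
        List<Path> files = new ArrayList<>();
        try (DirectoryStream<Path> entries = Files.newDirectoryStream(srcDir)) {
            for (Path entry : entries) {
                if (Files.isRegularFile(entry)) {
                    files.add(entry);
                }
            }
        }
        for (Path file : files) {
            // Throws if the target already exists or the move fails.
            Files.move(file, destDir.resolve(file.getFileName()));
        }
        return files.size();
    }

    public static void main(String[] args) throws IOException {
        // Hypothetical usage: java MoveFilesSketch <srcDir> <destDir>
        int moved = moveAll(Paths.get(args[0]), Paths.get(args[1]));
        System.out.println(moved + " file(s) moved");
    }
}
```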

Acrobat 9.3.4 no longer finds bookmarked non-PDF files (and then launches assoc app)

I produce a documentation DVD consisting of a one-page PDF file of ~1400 bookmarks and a lot of content. The bookmarks point to the content on the DVD, which includes PDF files as well as text, DOC, XLS, and other misc. file types, and a large number of html, swf, flv, .jpg, etc. files which make up 4 complete web sites. The web sites are entirely contained on the DVD (no external internet access is referenced or required).
Content is accessed by clicking a bookmark which simply opens the file (regardless of type.) Prior to Acrobat 9.3.4, appropriate warnings popped up for non-PDF files but if allowed by the user, the appropriate app was launched for the requested file.
Since applying 9.3.4, a "Launch File" warning box pops up showing the file name (which lacks a device) and when the warning's "Open" button is pressed, a Windows error appears with the message "Windows cannot find 'file.txt'. Make sure you typed the name correctly, and then try again. To search for a file.... ", and finally an Adobe Acrobat information box pops up and says "Could not open the file 'path/file.txt' ".
I have read and re-read about launching external apps, restricted urls and attachments, trust manager preferences, etc. etc. I've added the files to the privileged locations list under the enhanced security settings and on and on. It no longer works, yet I tested this DVD on several computers just last week and it worked fine. So, as a shot in the dark, I decided to fire up my laptop and test it there again.
Once on, the laptop immediately wanted to update about 12,000 software products but I politely said "No, wait until I test this out!" The test was successful and the DVD's PDF file worked great again! At that point I let the machine apply the Acrobat 9.3.4 update (and some uncountable number of Windows patches and fixes). It's running a current version of Win7. After the reboot, I tried the DVD again (under Acrobat 9.3.4) and it failed as described above. Because of Win7 I was able to revert to Acrobat 9.3.2 using System Restore. After doing that, the DVD worked again.
Most of our users' machines are either Macs or WinXP (current versions) and I haven't had time to figure out if you can even recover from this update under those OSes. (It says you cannot remove it from XP, however.) Regardless, even if I can do it for my laptop and get it working again, I cannot do it for all of our users.
So, my question is: if I am not doing something wrong, how do I let someone at Adobe know this release is broken? If I am doing something wrong, on the other hand, what is it and how do I get this to work again?
Thanks all.

Thanks. I did submit a report at the site. I hope somebody reads it as this is a big problem for us.
Thanks again.

Please help me in reading the ascii file in encoded format

hi ,
I am trying to read an ASCII file and I need to encode the text file. Can you suggest which methods are available for encoding it, and which is the most efficient to use?

This question has been answered before, please search the forums in future.
You could do something like this; it is probably not the most efficient, though.
If you don't need to do it in code, then the native2ascii tool does this conversion; see: http://java.sun.com/j2se/1.4.2/docs/tooldocs/windows/native2ascii.html
    // needs: import java.io.*;
    public static void convert(String vsFile) throws IOException {
        File f = new File(vsFile);
        // Read the whole file using the platform default encoding.
        StringBuilder content = new StringBuilder();
        FileReader fr = new FileReader(f);
        try {
            char[] buf = new char[512];
            int charsRead;
            // read() returns -1 at end of file; append only what was read.
            while ((charsRead = fr.read(buf, 0, buf.length)) != -1) {
                content.append(buf, 0, charsRead);
            }
        } finally {
            fr.close();
        }
        // Overwrite the same file with the text encoded as UTF-8.
        OutputStreamWriter out = new OutputStreamWriter(new FileOutputStream(f), "UTF-8");
        try {
            out.write(content.toString());
        } finally {
            out.close();
        }
    }

How to send an attached file containing with non-ascii code ?

Hi,
I want to send an attached text file containing non-ASCII text (Traditional Chinese). Is there any way to solve the encoding problem?
Currently it turns into meaningless characters on the receiving side.
Thanks for the help in advance.

Here is the code:
Session _gSession = null;
MimeMessage message = null;
Properties props = new Properties();
props.put("mail.smtp.host", smtpHost);
_gSession = javax.mail.Session.getInstance(props, null);
message = new MimeMessage(_gSession);
message.setFrom(new InternetAddress(emailSender , emailSender));
InternetAddress ia[] = new InternetAddress[1];
ia[0] = new InternetAddress(emailReceiver, emailReceiver);
message.setRecipients(Message.RecipientType.TO, ia);
message.setSubject("Test Encoding Attached File");
message.saveChanges();
BodyPart messageBodyPart = new MimeBodyPart();
DataSource fds = new FileDataSource("Big5_Code.txt");
messageBodyPart.setDataHandler(new DataHandler(fds));
messageBodyPart.addHeader("Content-ID","meme");
MimeMultipart multipart = new MimeMultipart("related");
multipart.addBodyPart(messageBodyPart);
message.setContent(multipart);
// Transport.send is static: it opens the connection, sends, and closes it.
Transport.send(message);

File upload with non-ascii name

I'm designing a system that includes file-uploads. My problem is that any non-ascii chars in the filename are encoded strangely when saved. ä is encoded to å etc.
I use Tomcat with -Dfile.encoding="UTF-8" set in the Catalina file. I get the same result regardless of method: my own implementation, Apache Commons, or Javazoom's uploadBean. All the JSP charset parameters are set.
Any ideas?

Hi amitads,
I'm sure u've used Java enough. Also, u must have learned about the \ (backslash) escape character ?
So, if u want to include a string like a single quote, u can write \' and for a double quote u can write \".
Try this ... I'm sure it will help.
Keep me posted.
Cheers !!
Sherbir.
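For the upload question, one common cause of this symptom is a file name sent as UTF-8 but decoded by the server as ISO-8859-1. That is only an assumption about this case; the post does not show the raw bytes. If it applies, the name can be repaired by reversing the wrong decoding, as in this sketch (class and method names are made up):

```java
import java.nio.charset.StandardCharsets;

class MojibakeRepair {
    // Recovers a string whose UTF-8 bytes were decoded as ISO-8859-1.
    // Only valid when that exact mistake happened; otherwise the result
    // is just different garbage.
    static String repair(String garbled) {
        byte[] originalBytes = garbled.getBytes(StandardCharsets.ISO_8859_1);
        return new String(originalBytes, StandardCharsets.UTF_8);
    }

    public static void main(String[] args) {
        // "a with diaeresis" seen as two Latin-1 characters (0xC3, 0xA4).
        String garbled = "\u00C3\u00A4.txt";
        System.out.println(repair(garbled).equals("\u00E4.txt"));
    }
}
```

The cleaner fix is to make the container decode the request as UTF-8 in the first place, but how to do that depends on the upload library in use.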

Non-blocking file behaviour for Reader

Hi!
It would be really nice if Reader did not lock files against writing. Assume someone is using TeX to create a PDF and needs to
a) examine the results;
b) edit the source.
It becomes a nightmare with Reader. Compile TeX, open Reader, examine results, close Reader. And then from the beginning.
Auto reloading of a PDF file is also a nice feature!
So far I can see only one solution: one should use another viewer to get non-blocking behaviour.
Thanks!

OK, let us see. I use Windows XP Professional SP3, Reader version is 9.3.0. I made a simple test: opened a file in Reader, then opened this file in my favourite text editor and tried to change "%PDF-1.4" to "%PDF-1.3" and save. I received a "File sharing violation error". I can also try to do it in Ubuntu.
As for the open source viewers: that is what I am currently doing. I use an additional viewer to view the files I create and I use Adobe Reader to view all other files. One can live with such a solution, but it is not the best.

How to translate/convert read non-mac files

I can't open my bank statement which I download from the internet. It's in 'Money' format.
But since I can 'translate' .xls when working in Appleworks why can't I get a downloaded equivalent version?
What's the best way to convert non-Mac files that can't be converted by MacLinkPlus (which is installed on my iBook)?
Thanks
Gordon

That's something to take up with your bank. Most bank & some credit card company web sites give you the option of what format you can download your account information in. Unfortunately, most don't give an option of tab-delimited text, just CSV (comma-delimited), & AppleWorks can't open CSV natively. You can, though, open the file in a word processing document & use Find/Replace to Find the comma delimiters, usually "," including the quotes & replace with \t, the "code" for tab. You will need to manually remove the beginning & ending quotes. Then select all & paste into a spreadsheet.
Or you can do what I do for several of mine. I access my account info using Camino or Firefox, not Safari, & select > copy > paste into my spreadsheet. I use Camino or Firefox because the tabs between fields are usually copied as such. Safari doesn't "understand" these tabs & will paste as a lot of spaces & returns.
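The Find/Replace steps above can also be sketched in code. This assumes every field in the bank's CSV is wrapped in double quotes and that no field contains an embedded quote; the class name and the sample row are made up:

```java
class CsvToTabs {
    // Turns one quoted CSV line into a tab-delimited line.
    static String convertLine(String line) {
        // Replace the "," delimiter, quotes included, with a tab.
        String s = line.replace("\",\"", "\t");
        // Remove the quote left at the start and at the end of the line.
        if (s.startsWith("\"")) {
            s = s.substring(1);
        }
        if (s.endsWith("\"")) {
            s = s.substring(0, s.length() - 1);
        }
        return s;
    }

    public static void main(String[] args) {
        System.out.println(convertLine("\"2008-01-05\",\"Coffee\",\"3.50\""));
    }
}
```

A real CSV parser is the safer choice if the export ever contains unquoted fields or quotes inside a field.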

How to load file thru reader which contains non-english char in file name

Hi ,
I want to know how to load a file through Reader on an English machine when the file name contains non-English characters (e.g. 置顶.pdf),
as LoadFile gives an error when passed the Unicode-converted file name.
Regards,
Arvind

You don't mention what version of Reader? And you are using the AcroPDF.dll, yes?

Mail IMAP after 10.5.2 can't read ANY mails in boxes with non-ascii names

I have Mail with IMAP that connects to a regular Mac OS X Tiger server running IMAP.
After upgrading to 10.5.2 on the client, Mail can no longer read ANY mails at all in any mailbox whose path contains a non-ascii character!
Hence: If a box is called 'Övrigt', it only lists the mails in the mail box, but it will not show any of the contents of any of the mails! Hence, I cannot access any of these mails! Catastrophic!
I have to downgrade to 10.5.1, unless someone knows of a workaround.

Same problem here (or at least in part). Some .mac folders no longer showed any messages, although they were there and could be seen online and with Thunderbird. After your remark I changed the name of a folder which contained a "´" and now it works. It is really strange because there is another folder with a "¨" in it which does not work (I will test if the name change works with this folder as well in a minute) whilst there is another one with such a name which works fine. The update really messed up Mail, and in Dutch we just use such characters, so Mail without support for them will be rather useless for me...
