Encoding Issue: Change a UTF-8 file with a BOM to UTF-16LE without a BOM.

I am trying to read a UTF-8 file that may start with a BOM. If the BOM is present, I want to remove it
and write the same file back out as UTF-16LE without a BOM.
Please suggest a solution for this.
FileInputStream fis = new FileInputStream(file);
long size = file.length();
byte[] b = new byte[(int) size];
int bytesRead = fis.read(b, 0, (int) size);
fis.close();
if (bytesRead != size) {
    throw new IOException("cannot read file");
}
byte[] srcBytes = b;
int b0 = srcBytes[0] & 0xff;
int b1 = srcBytes[1] & 0xff;
int b2 = srcBytes[2] & 0xff;
if (b0 == 0xef && b1 == 0xbb && b2 == 0xbf) {
    System.out.println("Hint: the file starts with a UTF-8 BOM.");
    // Note: this decodes the whole array, so the BOM survives as U+FEFF
    // at the start of the string.
    String srcStr = new String(b, "UTF8");
    String encoding = "UnicodeLittle";
    writeFile(filePath, srcStr, encoding); // here the file is written as UTF-16LE
}
But the file still gets written with a BOM.
How do I remove it?
Please suggest a solution for this.
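
For reference, the BOM survives here for two reasons: new String(b, "UTF8") decodes the whole array, so the string still begins with U+FEFF, and Java's "UnicodeLittle" charset writes a BOM on output, while "UTF-16LE" (alias "UnicodeLittleUnmarked") does not. A minimal corrected sketch, assuming a writeFile helper shaped like the one called above (that helper is not shown in the post, so this version is hypothetical):

// Skip the 3-byte UTF-8 BOM when decoding, then write UTF-16LE without a BOM.
String srcStr = new String(srcBytes, 3, srcBytes.length - 3, "UTF-8");
writeFile(filePath, srcStr, "UTF-16LE"); // "UnicodeLittle" would prepend a BOM

// Hypothetical helper matching the call above.
static void writeFile(String path, String content, String encoding) throws IOException {
    OutputStreamWriter writer = new OutputStreamWriter(new FileOutputStream(path), encoding);
    writer.write(content);
    writer.close();
}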

'uncle_alice' - in the OP's other thread on this topic I posted a decorated InputStream class that will strip off any BOM prefix (well, any I could find definitions of). Using it, it is almost trivial for the OP to convert a file from one encoding to another without worrying about the BOM. I showed him the water, but I can't make him drink.
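
Not uncle_alice's original class, but a minimal sketch of the same idea: a stream wrapper that reads the first bytes and consumes them only if they form a BOM (only the UTF-8 BOM is handled here; the original handled every BOM it could find a definition for):

import java.io.*;

// Wraps a stream and silently swallows a leading UTF-8 BOM (EF BB BF).
public class BomStrippingInputStream extends PushbackInputStream {
    public BomStrippingInputStream(InputStream in) throws IOException {
        super(in, 3);
        byte[] head = new byte[3];
        int n = 0;
        while (n < 3) {
            int r = super.read(head, n, 3 - n);
            if (r < 0) break; // stream shorter than 3 bytes
            n += r;
        }
        boolean bom = n == 3 && (head[0] & 0xff) == 0xef
                             && (head[1] & 0xff) == 0xbb
                             && (head[2] & 0xff) == 0xbf;
        if (!bom && n > 0) {
            unread(head, 0, n); // not a BOM: push the bytes back
        }
    }
}

Reading the file through new InputStreamReader(new BomStrippingInputStream(fis), "UTF-8") then yields text with no leading U+FEFF, and writing that text out with any BOM-less charset finishes the conversion.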

Similar Messages

  • Now that Apple no longer supports AppleWorks, how can I change a lot of AppleWorks files to Pages without doing them one at a time?

    Maybe these?
    https://discussions.apple.com/thread/3162022?start=0&tstart=0
    http://macscripter.net/viewtopic.php?id=19987
    But why, if you're running 10.6, do you need to do this? AW works fine in 10.6 with Rosetta.
    (BTW,  you're in the older iMac PPC forum.)

  • Compressor encoded speed change clip with echo as motorboat sound

    Took me two days to troubleshoot, through trial and error and process of elimination, why Compressor encoded an audio clip that had been speed-changed and had echo applied to it as a rapid MOTORBOAT sound! The project is a 5.1 mix. Since STP3 won't accept speed-changed clips, after I completed the mix in STP I placed the speed-changed clips on the FCP timeline below the exported STP 5.1 mix, with the FCP audio tracks for that clip assigned to channels 1 & 2 for the fronts and channels 5 & 6 for the rears. The clip on 5 & 6 was duplicated and identical in every way to the clip on channels 1 & 2. Only the dupe clip on 5 & 6 did Compressor encode as motorboat. The identical clip on 1 & 2 encoded correctly. Both clips were set to play at the same time since I wanted the effect to be on fronts and rears simultaneously. I could not get the motorboat sound to play back from either the FCP timeline or the Compressor preview window. Only after encoding and burning to DVD did it show up. Why would Compressor introduce such artifacts on one of the clips and not both (if it's going to do it at all, which it shouldn't)? Any ideas? I have other such effects throughout the project where I duped a speed-changed clip with an effect applied to it and set it to play on fronts and rears simultaneously, and Compressor encoded those correctly.

    Well, it's a little late, but this is one of the reasons to have your startup drive cloned so you can restore it to a working state if an update, upgrade or install sends things south.
    https://discussions.apple.com/docs/DOC-2494
    Have you tried resetting your FCP preferences?
    https://discussions.apple.com/docs/DOC-2491
    If this doesn't fix the problem, you could try sending the shot to Motion. You could also try changing the speed in the motion tab of the viewer.

  • How to view the change immediately after a java file is modified without restarting server or redeploy?

              Hi All,
              How can I see a change immediately after a Java file that is used in a JSP is modified,
              without restarting the server or redeploying?
              Moreover, it would be better to keep the original session.
              Any suggestion is appreciated.
              Kammau
              

              Hi,
              In order to load a new version of a Java class, the current classloader must be
              discarded and a new one created. This is what redeployment does. I believe that
              this is more of an issue with Sun's implementation of classloaders. You could
              ask BEA support (719.232.7878) and see if they have any plans to periodically
              check JAR files for changed class file timestamps and destroy and re-create
              classloaders on the fly.
              1) You will still have to accept the performance hit of destroying classloaders
              and creating new ones. There isn't any way around that.
              2) I would think you would want to have more explicit control in production and
              integration anyway.
              You can redeploy applications from the command line (or a script), not just the
              console.
              Hope this helps,
              pat
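
              For illustration, a minimal sketch of the classloader point pat makes: a class is only reloaded when a fresh classloader reads it again, so throwing the old loader away and creating a new one picks up a recompiled class. The path and class name here are hypothetical.

              import java.net.URL;
              import java.net.URLClassLoader;

              public class ReloadDemo {
                  public static void main(String[] args) throws Exception {
                      URL[] cp = { new URL("file:/work/classes/") }; // hypothetical class directory
                      for (int i = 0; i < 2; i++) {
                          // Each fresh URLClassLoader re-reads the .class file from disk, so
                          // recompiling com.example.Greeter between iterations changes the output.
                          URLClassLoader loader = new URLClassLoader(cp, null);
                          Class<?> c = loader.loadClass("com.example.Greeter"); // hypothetical class
                          System.out.println(c.getMethod("greet").invoke(c.newInstance()));
                      }
                  }
              }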
              "Kammau" <[email protected]> wrote:
              >
              >Hi All,
              >How to view the change immediately after a java file that is used in
              >jsp is modified
              >without restarting server or redeploy?
              >Moreover, it is better to keep the original session.
              >Any suggestion is appreciated.
              >
              >Kammau
              

  • Change Open With for all files with a specific extension

    Hi, I've been using OS X for quite some time, but recently I cannot work out how to do the following, which is proving frustrating:
    * Open Get Info for a file,
    * Use the drop-down under Open With to choose Other,
    * Select an appropriate application that wasn't on the drop-down list (in this case the Smultron text editor for CSV files),
    * Press the Change All button, and the drop-down turns back to Excel, which was the previous default.
    So basically, how do I change all files with the CSV extension to open with Smultron when Smultron wasn't in the list of Open With items and I have to use Other from the list? This is repeatable for any file type with any application that isn't listed, and happens on both my iMac and my work's MacBook. Try it yourself and you'll see what I mean!
    Thanks

    Stuart Mchattie wrote:
    Thanks NeroWolf,
    Your solution does work for that single file, but doesn't change the system-wide association of that file extension with a particular application. Because this works, I believe the problem I am having cannot be worked around and is in fact a bug in the OS. Could anyone else confirm this?
    Cheers,
    Stuart
    It might be a bug. What I found is that the icon does not change unless I log out/in or reboot, even when the app normally puts an icon on the file.
    For example if I change an audio file to always open with VLC, it does even though the icon remains as before. However, even though clicking plays the audio with VLC, if I right click the file and select "Open With" it still says iTunes (app) default which is incorrect.
    Message was edited by: nerowolfe

  • Can't change "Open with" for multiple files

    For some reason, all SWFs on my computer won't open in the Flash player that came with Flash 9. If I select one SWF and set it to open with the player, it will work; however, if I select multiple ones, Get Info and try to set them all to open in the player (or "Change All"), I get a dialogue saying I don't "have privileges to change the application for these documents only. Do you want to change all similar documents to open with the application Flash Player.app?" This is ridiculous because I could do this in Tiger just fine. Also, I'm the admin and I'm in MY account, so I should be able to do anything I want, right?
    Any insights? I've repaired disk permissions by the way.
    Dan P.

    This is sometimes a problem with the launchservices database. You can try the following to fix it:
    Rebuild LaunchServices Database
    For Tiger users
    Open the Terminal application in your Utilities folder. At the prompt paste in the following command in its entirety:
    /System/Library/Frameworks/ApplicationServices.framework/Frameworks/LaunchServices.framework/Support/lsregister -kill -r -domain local -domain system -domain user
    Press RETURN.
    For Leopard users
    Open the Terminal application in your Utilities folder. At the prompt paste in the following command in its entirety:
    /System/Library/Frameworks/CoreServices.framework/Versions/A/Frameworks/LaunchServices.framework/Versions/A/Support/lsregister -kill -r -domain local -domain system -domain user
    Press RETURN.
    If this isn't successful then there may be other problems involved. You can try this:
    Repairing the Hard Drive and Permissions
    Boot from your OS X Installer disc. After the installer loads select your language and click on the Continue button. When the menu bar appears select Disk Utility from the Installer menu (Utilities menu for Tiger and Leopard.) After DU loads select your hard drive entry (mfgr.'s ID and drive size) from the left side list. In the DU status area you will see an entry for the S.M.A.R.T. status of the hard drive. If it does not say "Verified" then the hard drive is failing or failed. (SMART status is not reported on external Firewire or USB drives.) If the drive is "Verified" then select your OS X volume from the list on the left (sub-entry below the drive entry), click on the First Aid tab, then click on the Repair Disk button. If DU reports any errors that have been fixed, then re-run Repair Disk until no errors are reported. If no errors are reported click on the Repair Permissions button. Wait until the operation completes, then quit DU and return to the installer. Now restart normally.
    If DU reports errors it cannot fix, then you will need Disk Warrior (4.0 for Tiger, and 4.1 for Leopard) and/or TechTool Pro (4.6.1 for Leopard) to repair the drive. If you don't have either of them or if neither of them can fix the drive, then you will need to reformat the drive and reinstall OS X.
    You can install the freeware preference pane, RCDefaultApp - VersionTracker or MacUpdate - which is useful for extensive control over various file associations.

  • How to detect encoding file in ANSI, UTF8 and UTF8 without BOM

    Hi all,
    I am having a problem with detecting a .txt/.csv file's encoding. I need to detect ANSI, UTF-8, and UTF-8 without BOM, but the problem is that ANSI and UTF-8 without BOM look identical: neither starts with a BOM. I checked the function below and saw that ANSI and UTF-8
    without BOM are detected as the same encoding. So how can I detect a UTF-8-without-BOM file? I need to handle this case in my code.
    Thanks.
    public Encoding GetFileEncoding(string srcFile)
    {
        // *** Use default of Encoding.Default (ANSI code page)
        Encoding enc = Encoding.Default;
        // *** Detect byte order mark if any - otherwise assume default
        byte[] buffer = new byte[10];
        FileStream file = new FileStream(srcFile, FileMode.Open);
        file.Read(buffer, 0, 10);
        file.Close();
        if (buffer[0] == 0xef && buffer[1] == 0xbb && buffer[2] == 0xbf)
            enc = Encoding.UTF8;
        else if (buffer[0] == 0 && buffer[1] == 0 && buffer[2] == 0xfe && buffer[3] == 0xff)
            enc = Encoding.GetEncoding(12001);   // UTF-32 big-endian
        else if (buffer[0] == 0xff && buffer[1] == 0xfe && buffer[2] == 0 && buffer[3] == 0)
            enc = Encoding.UTF32;                // UTF-32 little-endian
        else if (buffer[0] == 0x2b && buffer[1] == 0x2f && buffer[2] == 0x76)
            enc = Encoding.UTF7;
        else if (buffer[0] == 0xfe && buffer[1] == 0xff)
            enc = Encoding.GetEncoding(1201);    // 1201 unicodeFFFE, UTF-16 big-endian
        else if (buffer[0] == 0xff && buffer[1] == 0xfe)
            enc = Encoding.GetEncoding(1200);    // 1200 utf-16, UTF-16 little-endian
        return enc;
    }

    What you want is to detect UTF-8 without a BOM, which can only be done heuristically, for example when the file contains special characters. So do the following:
    public Encoding GetFileEncoding(string srcFile)
    {
        // *** Same BOM detection as above, with one extra fallback branch.
        Encoding enc = Encoding.Default;
        byte[] buffer = new byte[10];
        FileStream file = new FileStream(srcFile, FileMode.Open);
        file.Read(buffer, 0, 10);
        file.Close();
        if (buffer[0] == 0xef && buffer[1] == 0xbb && buffer[2] == 0xbf)
            enc = Encoding.UTF8;
        else if (buffer[0] == 0 && buffer[1] == 0 && buffer[2] == 0xfe && buffer[3] == 0xff)
            enc = Encoding.GetEncoding(12001);   // UTF-32 big-endian
        else if (buffer[0] == 0xff && buffer[1] == 0xfe && buffer[2] == 0 && buffer[3] == 0)
            enc = Encoding.UTF32;                // UTF-32 little-endian
        else if (buffer[0] == 0x2b && buffer[1] == 0x2f && buffer[2] == 0x76)
            enc = Encoding.UTF7;
        else if (buffer[0] == 0xfe && buffer[1] == 0xff)
            enc = Encoding.GetEncoding(1201);    // 1201 unicodeFFFE, UTF-16 big-endian
        else if (buffer[0] == 0xff && buffer[1] == 0xfe)
            enc = Encoding.GetEncoding(1200);    // 1200 utf-16, UTF-16 little-endian
        else if (ValidateUtf8WithoutBom(srcFile))
            enc = new UTF8Encoding(false);       // UTF-8 without a BOM
        return enc;
    }

    private bool ValidateUtf8WithoutBom(string fileSource)
    {
        // Read the file as ANSI; if it is really UTF-8, multi-byte sequences
        // show up as mojibake signs such as "Ã".
        StreamReader ansiReader = new StreamReader(fileSource, Encoding.Default, false);
        string textAnsi = ansiReader.ReadToEnd();
        ansiReader.Close();
        return textAnsi.Contains("Ã") || textAnsi.Contains("±");
    }

  • Captivate 8 - How to change to producing "project.json" file instead of a "project.txt" file?  Having issues with viewing published projects due to this.

    Would be thankful for any/all advice.

    I'm having the same issue. I've been able to work around it by opening the txt file and copying the contents. Then I open a new file in Sublime Text 2, paste the contents and save the new file as "project.json". It seems to work, but it'd be nice if I didn't have to do this every time I publish a new project.

  • Spool output in UTF8 with BOM encoding

    I have a requirement to get data from Oracle tables into a flat file, and I'm using a sqlplus script for this. But those flat files must be in UTF-8 with BOM (Byte Order Mark) format. Can anyone tell me how to spool data to a UTF-8 flat file by running a sqlplus script in a Windows environment? Do I need to use the CONVERT function for every column in the sqlplus script? I'm using Oracle DB 11g.

    We could if you were asking how to do this in SQL Developer, e.g. use the Tools > Database Export wizard, but this is not the SQL*Plus or SQL forum.
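
    For what it's worth, the UTF-8 BOM is just the three bytes EF BB BF, so if SQL*Plus can't be persuaded to write it directly, one workaround is to spool normally (with NLS_LANG set so the spool file comes out as UTF-8) and prepend the BOM in a post-processing step. A sketch in Java with hypothetical file names:

    import java.io.*;

    public class AddUtf8Bom {
        public static void main(String[] args) throws IOException {
            byte[] bom = { (byte) 0xEF, (byte) 0xBB, (byte) 0xBF };
            FileInputStream in = new FileInputStream("spool_output.txt");        // hypothetical
            FileOutputStream out = new FileOutputStream("spool_output_bom.txt"); // hypothetical
            out.write(bom); // UTF-8 BOM: EF BB BF
            byte[] buf = new byte[8192];
            int n;
            while ((n = in.read(buf)) > 0) {
                out.write(buf, 0, n); // copy the spooled data unchanged
            }
            in.close();
            out.close();
        }
    }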

  • Loading "fixed length" text files in UTF8 with SQL*Loader

    Hi!
    We have a lot of files that we load with SQL*Loader into our database. All data files have fixed-length columns, so we use POSITION(pos1, pos2) in the ctl file. Until now the files were in WE8ISO8859P1 and everything was fine.
    Now the source system generating the files is changing to Unicode, and the files are in UTF8!
    The SQL*Loader docs say "The start and end arguments to the POSITION parameter are interpreted in bytes, even if character-length semantics are in use in a datafile....."
    As I see it now, there is no way to say "column A starts at CHARACTER position pos1 and ends at CHARACTER position pos2".
    I tested with
    load data
    CHARACTERSET AL32UTF8
    LENGTH SEMANTICS CHARACTER
    replace ...
    in the .ctl file, but as soon as a character with a multi-byte encoding (for example ü) appears in the file, all positions of that record are mixed up.
    Is there a way to load these files in UTF8 without changing the file definition to a column separator?
    Thanks for any hints - charly

    I have not tested this, but you should be able to achieve what you want by using LENGTH SEMANTICS CHARACTER and by specifying field lengths (e.g. CHAR(5)) instead of only their positions. You could still use the POSITION(*+n) syntax to skip any separator columns that contain only spaces or tabs.
    If the above does not work, an alternative would be to convert all UTF8 files to UTF16 before loading so that they become fixed-width.
    -- Sergiusz
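
    A minimal control-file sketch of what Sergiusz describes, with hypothetical file, table and column names (untested, like the advice itself):

    load data
    characterset AL32UTF8
    length semantics character
    infile 'data.dat'
    replace
    into table my_table
    ( col_a position(1)   char(5),   -- 5 characters, not 5 bytes
      col_b position(*+1) char(10)   -- *+1 skips a 1-character separator column
    )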

  • File sender adapter - Encoding issue

    Hi,
    On my customer's site, we have an interface taking a file and sending an IDoc to the non-Unicode ERP system. Unfortunately, when we have Cyrillic characters in the file, the processing fails with the error:
    com.sap.aii.utilxi.misc.api.BaseRuntimeException: Fatal Error: com.sap.engine.lib.xml.parser.ParserException: Invalid char #0x6(:main:, row:17776, col:893)
    This is of course the result of using an invalid encoding in the communication channel. Until now it was left blank, so UTF-8 was used. I want to improve this interface so we never have this error again, because fixing it involves some manual work and it's getting annoying to see it once a month in production.
    What I want to do next is find out the encoding from the people delivering the file and then set it in the communication channel. Pretty straightforward, right? On the SAP side, I think the Cyrillic, non-ASCII characters will be replaced by #, but this is acceptable to the business. Not acceptable is this constant error.
    Because I want to be sure of my assessment before I ask for approval to do this modification, with the associated testing, communication and everything, my question to you is: have you experienced this before in PI? Are all my conclusions accurate? How would you solve the problem?
    Thanks in advance and best regards,
    George

    Did you try giving the encoding as ISO-8859-5 in the file adapter?
    File Type
    Specify the document data type.
    ○ Binary
    ○ Text
    Under File Encoding, specify a code page.
    The default setting is to use the system code page that is specific to the configuration of the installed operating system. The file content is converted to the UTF-8 code page before it is sent.
    also ref: http://en.wikipedia.org/wiki/Cyrillic_alphabet#Computer_encoding
    http://help.sap.com/saphelp_nw04/helpdata/en/e3/94007075cae04f930cc4c034e411e1/content.htm
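
    For intuition, the conversion the adapter performs amounts to decoding the bytes with the declared source code page and re-encoding them as UTF-8. A sketch in Java (the file adapter does this internally; this is only to illustrate the effect of the code-page setting):

    public class CodePageConvert {
        // Decode Cyrillic bytes with the declared code page, re-encode as UTF-8.
        public static byte[] toUtf8(byte[] source) throws java.io.UnsupportedEncodingException {
            String text = new String(source, "ISO-8859-5");
            return text.getBytes("UTF-8");
        }
    }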

  • FCC with ASCII and UTF-8 encoding issue

    Hi,
    I have a File-to-IDoc scenario and I am doing FCC, which has Japanese chars (PO with HEADER,1,ITEMS,*).
    I have specified UTF-8 encoding in the file adapter to process the file.
    Earlier, my source file was in ASCII format and had junk chars; the file was picked up and the IDoc posted had junk chars.
    Then I used UTF-8 encoding for my source file to correct this issue. XI showed proper Japanese chars, but this time the Header part is missing.
    Do I have to specify encoding in "module" for File adapter?
    Regards,

    Thanks for your replies Chirag/Gabriel,
    ISO encoding didn't work.
    My source file will be in UTF-8 format.
    There is one correction. It is ANSI encoding, not ASCII as in the subject.
    I still have this issue when my document offset is 0.
    I tried to play around with FCC and found this odd thing.
    When the first line of my input file is blank and I omit reading that first line with offset 1, the file is read in its entirety.
    Again, when I remove this blank line so the file starts with the Header, and use offset 0 in the file adapter, then my Header part is missing again.
    What to do?
    Regards,
    AV

  • [SOLVED] File name encoding issue

    Hi all,
    I have a large series of files with accented characters; they were all displayed nicely, but at some point, when I copied them to another computer, the characters were replaced by codes, for instance: "ó" --> "Ã³".
    +Renaming, i.e. "PasÃ³" (bad encoding of "Pasó") --> Pasó: while typing it, it shows the correct character, but when pressing enter the name remains ("PasÃ³")
    +If I rename the file to something else and then to the correct name, it will accept it: PasÃ³ --> Pas --> Pasó will display correctly.
    I don't know if it's a system wide encoding issue because new files are displayed correctly, but I would like to know if I have to change file names manually to make them right.
    PS. When copying bad encoded files to another FS (like a USB drive), nautilus and bash refuse to copy them.
    Last edited by Wasser (2012-09-17 21:10:52)

    My fstab:
    # /etc/fstab: static file system information
    # <file system> <dir> <type> <options> <dump> <pass>
    tmpfs /tmp tmpfs nodev,nosuid 0 0
    # /dev/sda2 LABEL=ROOT
    UUID=d2243d9c-b8e7-442a-8446-5a43a4d9221b / ext4 rw,relatime,data=ordered 0 1
    # /dev/sda5 LABEL=HOME
    UUID=e67f5cfa-3ec3-4c06-9c2c-62c4cc188ffe /home ext4 rw,relatime,data=ordered 0 2
    # /dev/sda3 LABEL=VAR
    UUID=caac4924-2a13-4c97-9926-668ac0595ba3 /var reiserfs rw,relatime 0 2
    # /dev/sda1 LABEL=UEFI
    UUID=1E70-6485 /boot/efi vfat rw,relatime,fmask=0022,dmask=0022,iocharset=iso8859-1,shortname=mixed,errors=remount-ro 0 2
    # /dev/sda4
    UUID=14993c2e-4bc4-42e4-b2e5-9dbc286abb4c none swap defaults 0 0
    Files in question are in /dev/sda5 (HOME)
    Last edited by Wasser (2012-09-16 08:37:52)
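
    For reference, the usual repair for such double-encoded names is to take the mojibake string back to bytes as Latin-1 and reinterpret those bytes as UTF-8 ("Ã³" is exactly the Latin-1 reading of the UTF-8 bytes C3 B3 for "ó"). A sketch in Java; whether the JVM sees the raw bytes or an already-decoded name depends on the locale, so treat this as an illustration, not a drop-in fix:

    import java.io.File;

    public class FixNames {
        static String repair(String mangled) throws java.io.UnsupportedEncodingException {
            // "PasÃ³" -> bytes 50 61 73 C3 B3 -> "Pasó"
            return new String(mangled.getBytes("ISO-8859-1"), "UTF-8");
        }

        public static void main(String[] args) throws Exception {
            File[] files = new File(args[0]).listFiles();
            if (files == null) return;
            for (File f : files) {
                String fixed = repair(f.getName());
                if (!fixed.equals(f.getName())) {
                    f.renameTo(new File(f.getParentFile(), fixed)); // rename in place
                }
            }
        }
    }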

  • Issue with length of file paths - Windows & C++ plugin

    Hello,
    I've got an issue that just popped up on my OCR plugin I've been working on that I suspect is related to the length of the filepath.
    I'm getting the following error that is being caught and logged when trying to open a file (filename changed for security purposes):
    Error Opening File: D:\aVeryLongFilePath.pdf
    Exception info: This file cannot be found.
    The entire string, including the D:\ part, is 266 characters long. I cut down the length of part of the path one by one and it was able to open and OCR the document when the length was 259 characters.
    I know there's a MAX_PATH variable in Windows and/or there's some kind of limitation on path length. Note that I can open the file in Acrobat using File->Open and run OCR on it individually using the built-in Recognize Text function, but if I try to Recognize Text for Multiple files and choose "Add Folders", the file in question doesn't show up in the list of files to be batch OCR'd (even though it is there). Interestingly, choosing "Add Files" from Recognize Text->In Multiple Files does work. So Acrobat itself has at least some issues opening the file using some of its features.
    Here's how I'm opening the document:
         string pn;               // assume this is initialized, I'm just putting this here to demonstrate what type it is
         pathName = pn;
         ASAtom pathType = ASAtomFromString("Cstring");
         asPathName = ASFileSysCreatePathName(ASGetDefaultFileSys(), pathType, pn.c_str(), NULL);    
         pdDoc = PDDocOpen(asPathName, ASGetDefaultFileSys(), NULL, true);
    Is there a way around this problem?
    Thanks.

    Yes, you are hitting the MAX_PATH on certain versions of Windows.   The only workaround would be to see that the file is "a very long path" and then break up the pathname construction into multiple pieces (perhaps the containing directory and then the file itself).  The other option is don't use Cstring to construct, use the Unicode variant.

  • When I send an email with a photoshop file attached, the recipient receives it "flattened". How do I change this?

    I've sent PSD files to various people without issues; however, you might want to try something like www.filedropper.com to make sure it's not that they can't open your version of the PSD file.
    You may also want to zip the file using the built-in archiving software: simply right-click (or control-click) the file and choose Compress..., then you can email it to the person as a compressed archive. That should preserve all the aspects of the file properly without any modifications.
