Reading Unicode from textfile, displaying characters
There is a text file, "readit.txt", which contains the following Unicode escapes:
\u672c\u8a9e\u6587\u5b57\u5217
I'm reading this file and trying to display it, expecting to see some Chinese characters on my screen.
try {
    fis = new FileInputStream(file);
    in = new InputStreamReader(fis);
    br = new BufferedReader(in);
    while ((fileLine = br.readLine()) != null) {
        sb.append(fileLine + lineSep);
    }
} catch (FileNotFoundException e) {
    e.printStackTrace();
} catch (IOException e) {
    e.printStackTrace();
}
System.out.println(sb.toString());
The above code doesn't work. It just displays exactly what's in the file.
The weird thing is that it works well when doing this:
System.out.println("\u672c\u8a9e\u6587\u5b57\u5217");
It will then output 本語文字列
836019 wrote:
Above code doesn't work. It just displays exactly what's in the file.

It works perfectly. The file has the character '\', then the character 'u', then the characters '6', '7', '2', 'c', etc. Those are just characters. BufferedReader has no way of knowing you want to interpret them as Unicode escape sequences.
I imagine there's a method somewhere in the core API that will convert those sequences to their corresponding characters.

836019 wrote:
The weird thing is that it works well when doing this:
System.out.println("\u672c\u8a9e\u6587\u5b57\u5217");
It will then output 本語文字列

That's because the Java compiler, unlike BufferedReader, does parse Unicode escape sequences.
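
The conversion mentioned above can also be written by hand. I'm not aware of a general-purpose core API method for arbitrary text, so here is a minimal sketch, assuming every escape in the file has the form backslash, 'u', four hex digits. UnicodeEscapeDecoder and decode are hypothetical names made up for this example, and the sketch does not handle an escaped backslash in front of the 'u'.

```java
import java.util.regex.Matcher;
import java.util.regex.Pattern;

// Hypothetical helper for illustration; not part of the Java core API.
class UnicodeEscapeDecoder {
    // A backslash, one or more 'u', then exactly four hex digits.
    private static final Pattern ESCAPE = Pattern.compile("\\\\u+([0-9a-fA-F]{4})");

    // Replaces each escape sequence found in the text with the character it names.
    static String decode(String text) {
        Matcher m = ESCAPE.matcher(text);
        StringBuilder out = new StringBuilder();
        int last = 0;
        while (m.find()) {
            out.append(text, last, m.start());                    // text before the escape
            out.append((char) Integer.parseInt(m.group(1), 16));  // the decoded character
            last = m.end();
        }
        out.append(text, last, text.length());                    // text after the last escape
        return out.toString();
    }

    public static void main(String[] args) {
        // The doubled backslashes stop the compiler from decoding the escapes,
        // so this string holds the same characters as a line read from the file.
        String fromFile = "\\u672c\\u8a9e\\u6587\\u5b57\\u5217";
        System.out.println(decode(fromFile));
    }
}
```

In the original loop this would be applied to each line, for example sb.append(UnicodeEscapeDecoder.decode(fileLine)). java.util.Properties.load() also interprets such escapes, but only for files in the key=value property format.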
Similar Messages
-
Reading Unicode from file doesn't appear in form.append
Hi all,
I have a text file. I read data from it and insert the data into a String array:
my_array[1]="My text with Unicode \u00F6 ";
Then I call
form.append(my_array[1])
and the string in this array contains a Unicode escape.
It should generate the output
My text with Unicode __ (where __ is the Unicode character)
but the output is
My text with Unicode \u00F6
The output should show my Unicode character, but it doesn't.
When I write form.append("\u00F6");
it appears.
How can I do it?

You didn't understand me. I open a txt file:
Class c = this.getClass();
InputStream is = c.getResourceAsStream("test.txt");
StringBuffer str = new StringBuffer();
byte b[] = new byte[100];
int n;
while ((n = is.read(b)) != -1) {
    // Append only the n bytes actually read, not the whole buffer
    str.append(new String(b, 0, n));
}
my_string = str.toString();
Inside my_string there is a Unicode escape.
When I write
form.append(my_string) I see \u00F6
but when I open the file directly, copy the line with the Unicode escape from there and write
form.append(copied_line);
I can see my Unicode character.
-
Reading fields from ALV display
Hi All,
I have a check box in front of each data record in the ALV display. This check box is editable. The user will check some records after seeing the details and click on a push button provided. On the click of this button I need to read all the records whose check box has been checked.
However, the edited value of the check box is not reflected in the program. Is there any FM or any other method through which I can read the changed values in the ALV display?
Request your urgent help.
Thanks and Regards,
Archana.

Hi,
Please see this example program: BCALV_EDIT_05
I am sure this will help you.
If you have used the standard ALV function module (REUSE_ALV_GRID_DISPLAY) instead of the class cl_gui_alv_grid, then you can set the layout and fieldcatalog accordingly.
Don't forget to add the field
checkbox type c.
to your display table (itab).
Fieldcatalog setting
for field 'CHECKBOX':
ls_fcat-checkbox = 'X'.
ls_fcat-edit = 'X'.
In the user command subroutine:
check the displayed internal table (itab here);
you will find that each checked record has itab-checkbox = 'X'.
Regards,
Mihir.
-
Need to read Unicode in a file
Hi,
My need to read Unicode from a file (on a Windows box) comes from the fact that my software is used in different countries and, naturally, on different keyboards. Not all the users are computer literate but, like me, they are all lazy and want to put their username and password in a config file that my application reads. If their username or password contains Unicode characters, I have a problem reading it.
They are simple users, so I would like to advise them to open the config file in Windows Notepad, type in their username and password, and save the file as Unicode. Notepad has four ways to save a file: ANSI, Unicode, Unicode big endian, and UTF-8 (I've tried them all except ANSI, of course). Saving a file in a different format is as complicated as I would like it to get for them; some will have trouble even with this.
I read the file like so:
BufferedReader rdr =
    new BufferedReader(
        new InputStreamReader(new FileInputStream(file_name), "UTF-16"));
String line;
while ((line = rdr.readLine()) != null) {
    String[] pieces = line.split("[=:]");
    if (pieces.length == 2) {
        if (pieces[0].equals("PASSWORD")) {
            // "various encodings" stands for the charset names I tried
            byte[] possibleUnicode = pieces[1].getBytes("various encodings");
            pieces[1] = new String(possibleUnicode, "various encodings");
        }
        propertyTable.setProperty(pieces[0], pieces[1]);
    }
}
All reading is perfect except for a username or password that contains a real multi-byte character. I have tried many variations of converting the string I get into a byte[] using string_in.getBytes("various encodings tried") and then back to a string, but nothing has worked.
I tried a regular FileReader to a BufferedReader and that didn't work. I tried a FileInputStream to a DataInputStream and that didn't work. I accomplished the most with what I described above: FileInputStream to InputStreamReader to BufferedReader.
Does anyone know how to read Unicode in a file on a Windows file system?
hopi

I have used the byte conversion technique before
successfully when I loaded a set of properties
from a URL openStream(). The properties load()
method takes an InputStream and assumes ISO-8859-1
so I converted the bytes from ISO-8859-1 to UTF-8.
Garbage characters were cleared up perfectly.

I think you just got lucky that time. For characters up to U+007F, the UTF-8 encoding is the same as ISO-8859-1 (and most other encodings, for that matter). Characters in the range U+0080 to U+00FF will be encoded with one byte in ISO-8859-1, and with two bytes in UTF-8. In most cases, each of the two bytes in the UTF-8 representation will have values that are valid in ISO-8859-1. The decoded characters will be incorrect (and there will be too many of them), but they effectively preserve the original byte values, making it possible for you to re-encode the characters and then decode them correctly. But there's a big gap in the middle where the UTF-8 bytes produce garbage when decoded as ISO-8859-1. Run the included program to see what I mean.
I don't know what's going wrong with your application, but I do know that changing the encoding retroactively is not the solution. I also think you're right about asking users to save files in a certain encoding. Considering how much trouble programmers have with this stuff, it's definitely too much to ask of users.
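
For the underlying question of reading a config file saved from Notepad, one option is to decode the bytes once, with the right charset, instead of re-encoding afterwards. The sketch below assumes the file starts with a byte-order mark (BOM), which as far as I know is what Notepad's Unicode, Unicode big endian and (on older Windows versions) UTF-8 options write. Files without a BOM fall back to a charset the caller picks. BomSniffer and its method names are made up for this example, and it is separate from the demonstration program that follows it.

```java
import java.nio.charset.Charset;
import java.nio.charset.StandardCharsets;

// Hypothetical helper for illustration: picks a charset from the leading BOM.
class BomSniffer {
    // Decodes the bytes with the charset named by a leading BOM and drops the BOM.
    // If no known BOM is present, the fallback charset is used on all bytes.
    static String decode(byte[] data, Charset fallback) {
        if (startsWith(data, 0xEF, 0xBB, 0xBF)) {
            return new String(data, 3, data.length - 3, StandardCharsets.UTF_8);
        }
        if (startsWith(data, 0xFF, 0xFE)) {
            return new String(data, 2, data.length - 2, StandardCharsets.UTF_16LE);
        }
        if (startsWith(data, 0xFE, 0xFF)) {
            return new String(data, 2, data.length - 2, StandardCharsets.UTF_16BE);
        }
        return new String(data, fallback);
    }

    // True if data begins with the given unsigned byte values.
    private static boolean startsWith(byte[] data, int... prefix) {
        if (data.length < prefix.length) {
            return false;
        }
        for (int i = 0; i < prefix.length; i++) {
            if ((data[i] & 0xFF) != prefix[i]) {
                return false;
            }
        }
        return true;
    }

    public static void main(String[] args) {
        // Simulate a file saved as little-endian UTF-16 with a BOM.
        byte[] body = "PASSWORD=p\u00e4ssw\u00f6rd".getBytes(StandardCharsets.UTF_16LE);
        byte[] file = new byte[body.length + 2];
        file[0] = (byte) 0xFF;
        file[1] = (byte) 0xFE;
        System.arraycopy(body, 0, file, 2, body.length);
        System.out.println(decode(file, StandardCharsets.ISO_8859_1));
    }
}
```

In a real program the bytes would come from something like Files.readAllBytes(Paths.get(file_name)), and the decoded text could then be split into lines and key/value pairs as in the question. Whether the console shows the characters correctly is a separate matter that depends on the console's own encoding.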
import java.awt.Font;
import javax.swing.*;

public class Test
{
  public static void main(String... args) throws Exception
  {
    JTextArea ta = new JTextArea();
    ta.setFont(new Font("monospaced", Font.PLAIN, 14));
    JFrame frame = new JFrame();
    frame.add(new JScrollPane(ta));
    frame.setDefaultCloseOperation(JFrame.EXIT_ON_CLOSE);

    // Build a string of every character from U+00A0 to U+00FF
    StringBuilder sb = new StringBuilder();
    for (int i = 0xA0; i <= 0xFF; i++)
    {
      sb.append((char)i);
    }
    String str1 = sb.toString();

    // Encode as UTF-8, then (wrongly) decode as ISO-8859-1
    byte[] utfBytes = str1.getBytes("UTF-8");
    String str2 = new String(utfBytes, "ISO-8859-1");

    // Each character became two bytes, and so two decoded characters
    for (int i = 0, j = 0; i < str1.length(); i++, j += 2)
    {
      char ch = str1.charAt(i);
      byte b1 = utfBytes[j];
      byte b2 = utfBytes[j+1];
      String s1 = Integer.toBinaryString(b1 & 0xFF);
      String s2 = Integer.toBinaryString(b2 & 0xFF);
      char ch1 = str2.charAt(j);
      char ch2 = str2.charAt(j+1);
      ta.append(String.format("%2c%10s%10s%3x%3x%3c%3c\n",
                              ch, s1, s2, b1, b2, ch1, ch2));
    }

    frame.setSize(400, 700);
    frame.setLocationRelativeTo(null);
    frame.setVisible(true);
  }
}
-
Displaying unicode or HTML escaped characters from HTTPService in Flex components.
Here is a solution on the Flex Cookbook I developed for displaying data in Flex components when the data comes back from HTTPService as unicode or HTML escaped data:
Displaying unicode or HTML escaped characters from HTTPService in Flex components.

Hi again Greg,
I have just been adapting your idea for encountering occasional escaped characters within a body of "normal" text, eg something like
hell&ocirc; sun&scaron;ine
Now, the handy String.fromCharCode(charCode) call works a dream if instead of the above I have
hell&#244; sun&#353;ine
Do you know if there is an equivalent call that takes the named entities rather than the numeric ones? Clearly I can just do some text substitution to get the mapping, but this means rather more by-hand work than I had hoped. However, this is definitely a step in a useful direction for me.
Thanks,
Richard
PS hoping that the web page won't simply outguess me and replace all the above! Basically, the first line uses named entities and the second the equivalent numbers...
-
Reading Unicode data from a file...
I am writing an application that needs to read some configuration data from a file. An end user edits the configuration file to provide the configuration data. The Java code reads this file and uses the configuration data supplied by the user.
The user can also save non-ASCII characters as part of the configuration data. Hence, I do not want to use Java properties files. What other options are available that allow me to read Unicode data into my Java code and also allow the user to save the configuration file as Unicode?

Java characters are Unicode characters. Read file data that consists of Unicode characters as Java characters or strings.
You can read the data as primitive char values using the DataInputStream class (readChar() reads two bytes per char, high byte first). The InputStreamReader class can also read Unicode data, for example UTF-16 or UTF-8, when it is given the matching charset.
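
As a rough illustration of the reader and writer classes named in this answer, here is a minimal sketch of a key=value config file written and read as UTF-8. The choice of UTF-8, the key=value layout, and the names Utf8Config, save and load are assumptions made for the example, not something the thread specifies.

```java
import java.io.BufferedReader;
import java.io.BufferedWriter;
import java.io.IOException;
import java.io.InputStreamReader;
import java.io.OutputStreamWriter;
import java.nio.charset.StandardCharsets;
import java.nio.file.Files;
import java.nio.file.Path;
import java.util.LinkedHashMap;
import java.util.Map;

// Hypothetical helper for illustration: a tiny key=value config format in UTF-8.
class Utf8Config {
    // Writes one "key=value" line per entry, encoded as UTF-8.
    static void save(Path file, Map<String, String> settings) throws IOException {
        try (BufferedWriter w = new BufferedWriter(
                new OutputStreamWriter(Files.newOutputStream(file), StandardCharsets.UTF_8))) {
            for (Map.Entry<String, String> e : settings.entrySet()) {
                w.write(e.getKey() + "=" + e.getValue());
                w.newLine();
            }
        }
    }

    // Reads the file back, decoding it as UTF-8; lines without '=' are skipped.
    static Map<String, String> load(Path file) throws IOException {
        Map<String, String> settings = new LinkedHashMap<>();
        try (BufferedReader r = new BufferedReader(
                new InputStreamReader(Files.newInputStream(file), StandardCharsets.UTF_8))) {
            String line;
            while ((line = r.readLine()) != null) {
                int eq = line.indexOf('=');
                if (eq > 0) {
                    settings.put(line.substring(0, eq).trim(), line.substring(eq + 1).trim());
                }
            }
        }
        return settings;
    }

    public static void main(String[] args) throws IOException {
        Path file = Files.createTempFile("config", ".txt");
        Map<String, String> settings = new LinkedHashMap<>();
        settings.put("greeting", "Gr\u00fc\u00dfe");
        save(file, settings);
        System.out.println(load(file));
        Files.delete(file);
    }
}
```

The point of passing the charset explicitly on both sides is that the result no longer depends on the platform default encoding. A user editing the file by hand would need to save it in the same encoding the program expects.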
Data can be written using the OutputStreamWriter class.
-
Sending unicode data to Excel from servlet displayed as "?"
Hi,
In my application we are exporting some Unicode data to Microsoft Excel 2003 from a servlet.
For that I am using the following code.
response.setHeader( "Content-Disposition", "attachment; filename=results.xls" );
response.setContentType( "text/xls" );
theHeader.append("\u30ec\u30dd\u30fc\u30c8 \u30bf\u30a4\u30c8\u30eb");
The above Unicode data (Japanese characters) is not displayed properly in Excel (it is displayed as "?"). Other, non-Unicode things are displayed properly.
Can anyone advise me?

I was having problems writing a Unicode (UTF-8 or UTF-16) CSV file for Excel. I found a solution in the end, so I'll post it here in case anyone else is looking.
You need to use the funky encoding x-UTF-16LE-BOM.
An example of how to do this is
Writer writer = new BufferedWriter(new OutputStreamWriter(new FileOutputStream(outputFile, true), "x-UTF-16LE-BOM"));
Then, instead of using commas (,), use tabs (\t).
Then in my web.xml I have
<mime-mapping>
<extension>csv</extension>
<mime-type>application/msexcel</mime-type>
</mime-mapping>
this site is a good reference
http://members.chello.at/robert.graf/CSV/
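
The encoding and delimiter steps from this post can also be sketched without depending on the x-UTF-16LE-BOM charset name, by writing the BOM character explicitly and encoding with the standard UTF-16LE charset. This is a sketch under that assumption, and ExcelTsvWriter and encode are names made up for the example. Whether a given Excel version opens the result correctly is something to test, as the poster notes.

```java
import java.nio.charset.StandardCharsets;
import java.util.Collections;
import java.util.List;

// Hypothetical helper for illustration: tab-separated rows as UTF-16LE with a BOM.
class ExcelTsvWriter {
    // Returns the file content: a BOM, then one tab-separated line per row.
    static byte[] encode(List<String[]> rows) {
        StringBuilder sb = new StringBuilder();
        sb.append('\uFEFF'); // byte-order mark, encoded below as the bytes FF FE
        for (String[] row : rows) {
            sb.append(String.join("\t", row)).append("\r\n");
        }
        // StandardCharsets.UTF_16LE adds no BOM of its own, so the one above is the only one.
        return sb.toString().getBytes(StandardCharsets.UTF_16LE);
    }

    public static void main(String[] args) {
        // The header text is the Japanese string from the question.
        String[] header = {"\u30ec\u30dd\u30fc\u30c8", "\u30bf\u30a4\u30c8\u30eb"};
        byte[] content = encode(Collections.singletonList(header));
        System.out.println(content.length + " bytes");
    }
}
```

In a servlet the returned bytes would be written to response.getOutputStream(); for a file, to a FileOutputStream. Note that this sketch does no quoting, so cell values containing tabs or line breaks would need extra handling.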
Hope this helps someone! =] I am using Excel 2007; this may or may not work with earlier versions.
-
Read numbers from a .txt file and display them in a graph
How can I get LabVIEW 7 to read from a .txt file containing a lot of
columns with different data? Only two of the columns are
interesting to me: the first, which contains the time of the measurement, and
one in the middle, which contains the measured temperatures. I want LabVIEW
to read this data and display it graphically.
Thanks from Stale

Here's one way.
You can also use Help -> Find Examples and search for "text".
2006 Ultimate LabVIEW G-eek.
Attachments:
Graph.vi 21 KB
-
We want to read data from weigh bridge and display in oracle forms & store
Sir/Madam,
In our organisation we have one requirement: reading data from a weigh bridge using a serial port, displaying that data in Oracle Forms and, whenever the user clicks the save button, storing it in our Oracle database. We are using Oracle 8i, Forms 6i and a Windows OS environment. We don't know how to read data from a serial port and place it into form items. Please help me as early as possible if there is any property available in D2K regarding this requirement.
Thank you,
vishnu

There's no property in Forms that lets you read serial ports. As far as I know, you need to know the API of the machine you want to read data from (it should come with the machine's manual), and then it will be easy to store the data in a Forms item.
Tony
-
Read Unicode string entry out of MS-Access though Java
I'm making a small Finnish dictionary programme using Java, so I first need to test whether I can display Unicode Finnish words read from Access in Java.
Java reads the words from my FFF.mdb, but when they are printed on the screen some Unicode characters are displayed as question marks (??????). By the way, I can retrieve and display my query results well in MS Access; the problem only appears in Java.
Before I print them on screen, do I need to apply some filter to those Unicode Finnish words after they are retrieved from the database, so that they can be displayed properly? Thanks!
=========
try {
    Class.forName("sun.jdbc.odbc.JdbcOdbcDriver");
    Connection conn2 = DriverManager.getConnection("jdbc:odbc:FFF");
    Statement stmt2 = conn2.createStatement();
    ResultSet rs = stmt2.executeQuery("Select * from fin_dictionary");
    while (rs.next()) {
        // here, Java cannot display the Finnish letters properly:
        System.out.println(rs.getString("words") + ",");
    }
    stmt2.close();
    conn2.close();
} catch (Exception e) {
    System.out.println("\n SQL Exception:" + e.getMessage() + "\n");
}
===end===

But also, when I use Java's SQL to insert Unicode characters into the database, the inserted characters are displayed as a kind of "???" too. So I think there should be some method to make Java handle Unicode properly.
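
One way to narrow this down (a diagnostic sketch, not a fix) is to check whether the String returned by rs.getString("words") already contains literal question marks, or whether it holds the right characters and only the console cannot show them. The helper below prints each code point in U+XXXX form. CodePointDump and dump are names made up for this example.

```java
// Hypothetical diagnostic helper: shows what a String really contains.
class CodePointDump {
    // Returns the code points of s as space-separated U+XXXX values.
    static String dump(String s) {
        StringBuilder sb = new StringBuilder();
        s.codePoints().forEach(cp -> {
            if (sb.length() > 0) {
                sb.append(' ');
            }
            sb.append(String.format("U+%04X", cp));
        });
        return sb.toString();
    }

    public static void main(String[] args) {
        // A Finnish word written with escapes so the source file encoding does not matter.
        System.out.println(dump("p\u00e4iv\u00e4"));
    }
}
```

If the dump of a value from the database shows U+003F where a Finnish letter should be, the character was already lost before printing, which points at the database access layer. If it shows values such as U+00E4, the String is fine and the console encoding is the more likely culprit.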
-
E90 Web Browser displays characters incorrectly
Hi there,
I've recently bought a Nokia E90 and have found it pretty cool so far, except for one thing: the Web Browser is not displaying characters correctly. It shows almost all of them as empty rectangles, with just one or two exceptions.
I've checked several web pages (including ones that I maintain and which, therefore, I /know/ send correct character encoding information), both online and local and it happens on all of them.
I've tried changing the settings to "Automatic" character encoding and "UTF-8" and it doesn't make any difference (I haven't tried any of the others, though most of them are for non-Latin alphabets).
The characters display correctly throughout the rest of the OS (notes, QuickOffice, Acrobat Reader, etc.) - it's just the Web Browser which is affected.
The only significant changes I've made from factory settings are installing three applications: Python, PuTTY and OggPlay. I removed OggPlay after I installed it because it didn't seem to work (in fact, it sort of "crashed" the phone - I had to remove the battery to get it to work again!).
Does anyone have any ideas why the Web Browser is not displaying characters correctly? Any help would be very much appreciated (considering how expensive the phone is!)
Thanks.

Hi Guys,
I have also encountered the same issue: most characters are not displayed properly in the E90 Web browser. I am currently using an E90 which is English/Arabic/French enabled.
I have checked the same websites with an N95 and there is not a single problem.
I am quite disappointed with the E90 web browsing experience, especially since I am running the latest FW (v210.34.75), so I keep hoping that this gets fixed in the next FW version.
-
Create unicode file and read unicode file
Hi
How can I create a Unicode file and open a Unicode file in LV?
Regards
Madhu

gmadhu wrote:
How can create a unicode file and open unicode file in LV

In principle you can't. LabVIEW does not support Unicode (yet)! When it will officially support it is a question I can't answer, since I don't know and, as far as I know, NI doesn't want to answer.
So the real question you have to ask first is where and why you want to read and write a Unicode file. And what type of Unicode? Unicode is definitely not just Unicode, as Windows has a different notion of Unicode (16-bit characters) than Unix has (32-bit characters). The 16-bit Unicode from Windows is able to cover most languages on this globe, but definitely not all without code expansion techniques.
If you want to do this on Windows and have decided that there is no other way to do what you want, you will probably have to access the WideCharToMultiByte() and MultiByteToWideChar() Windows APIs using the Call Library Node in order to convert between the 8-bit multibyte strings used in LabVIEW and the Unicode format necessary in your file.
Rolf Kalbermatter
CIT Engineering Netherlands
a division of Test & Measurement Solutions
-
Hi,
I am new to Java and am facing a problem with reading a Unicode filename. I tried to read an XML file stored on the desktop, with Chinese characters in its filename, using Java, but it throws an exception at this line of code: xmldoc = builder.build(inputFile);
Below is the code I have written:
Document xmldoc = null;
String xmlDirectory = null;
SAXBuilder builder = new SAXBuilder("org.apache.xerces.parsers.SAXParser");
xmlDirectory = "C:\\Documents and Settings\\user\\Desktop\\";
String inputFile = xmlDirectory + fileName;
xmldoc = builder.build(inputFile);
And the exception returned is as below:
java.io.FileNotFoundException: C:\Documents and Settings\user\Desktop\??_1_ref.xml (The filename, directory name, or volume label syntax is incorrect)
at java.io.FileInputStream.open(Native Method)
at java.io.FileInputStream.<init>(FileInputStream.java:106)
at java.io.FileInputStream.<init>(FileInputStream.java:66)
at sun.net.www.protocol.file.FileURLConnection.connect(FileURLConnection.java:69)
at sun.net.www.protocol.file.FileURLConnection.getInputStream(FileURLConnection.java:156)
at java.net.URL.openStream(URL.java:913)
at org.apache.xerces.impl.XMLEntityManager.startEntity(XMLEntityManager.java:731)
at org.apache.xerces.impl.XMLEntityManager.startDocumentEntity(XMLEntityManager.java:676)
at org.apache.xerces.impl.XMLDocumentScannerImpl.setInputSource(XMLDocumentScannerImpl.java:252)
at org.apache.xerces.parsers.StandardParserConfiguration.parse(StandardParserConfiguration.java:499)
at org.apache.xerces.parsers.StandardParserConfiguration.parse(StandardParserConfiguration.java:581)
at org.apache.xerces.parsers.XMLParser.parse(XMLParser.java:147)
at org.apache.xerces.parsers.AbstractSAXParser.parse(AbstractSAXParser.java:1157)
at org.jdom.input.SAXBuilder.build(SAXBuilder.java:453)
at org.jdom.input.SAXBuilder.build(SAXBuilder.java:891)
The fileName is "����_1_ref.xml" (it is actually in Chinese characters). However, with a normal fileName, such as "1_ref.xml", the above code works fine.
May I know how to solve this issue?

Retrieve the variable (unicode) from Oracle as a
string,
and append to the fileName. The field in DB is
NVARCHAR2, and is in chinese charactors instead of
something like this "\u4e00....".
Then for the case above, the fileName is pass in as
parameter through a java method.
eg.
public void startGenerate(String fileName) throws
ApplicationException
................

I just tried it myself. I stored a Chinese file name in a property file, retrieved the name, and opened the file, and I had no problem.
Have you tried debugging the program to see if the "?" characters appear immediately after rs.getString() or after they are appended to the fileName?
I found one reference site,
http://www.chinesecomputing.com/programming/java.html
Good luck
:D
-
Problem with reading String from Xlsx file.
Hi! I am trying to read string and numerical data from an xlsx file and display its contents in a Word file. I tried converting it to a .lvm file too. Using the "Read From Spreadsheet" tool, I get random characters as output. Using the "Read From Measurement File" tool, I get an error saying "Error 100 occurred at Read From Measurement File->Untitled 1". What do I do? In the end I need to display the output, row by row, in a Microsoft Word file. I am so lost. Please help.

bsvare wrote:
LabVIEW currently does not read directly from an xlsx file. If you convert the xlsx to an xls file first, then you can use the read from spreadsheet tool to load the data from the file.

Hey, I tried doing that. It still just gave the value 0.00 in all the cells of the indicator array. Plus, my file has strings in it too. The data type in the spreadsheet was "general". Here in the tool, it is "Double". I changed the tool data type to string and the spreadsheet to text, but I only got gibberish for my efforts.
Thanks anyway!
-
MacBook Pro wakes up when unplugging usb from cinema display
Hi all,
I have a 30" cinema display at work and a 23" at home, which I use with my MacBook Pro. At work, I use a USB keyboard which stays plugged into the monitor. At home, I have a wireless keyboard and a USB printer plugged into the monitor. Thus in both cases, I plug the USB cable from the display into my laptop. I typically leave the MBP open so as to have the main display and the cinema display at the same time. When I leave home or work, I close the MBP to put it to sleep. However, upon unplugging the display, and especially its USB cable, the laptop invariably wakes up, which is annoying.
I tried leaving the MBP closed and putting it to sleep with the display's on/off button. Same behavior.
Is there a setting to prevent USB 'signals' from waking up the computer? Should I proceed differently?
Any advice would be greatly appreciated. Thanks in advance.

I know this question is answered, but I didn't want to create another thread to discuss the same topic.
I think this issue is rooted in the Cinema Display being attached to port 3 of its own internal hub. This is to allow the display dimming, the power button that controls the sleep/wake of the laptop, etc. I just got my 30" cinema display a week or two ago, and before that was using an HP IPS LCD panel (24") that also had an integrated USB hub. I have the same devices attached too: a USB hard drive for Time Machine, and a keyboard (Mighty Mouse plugged in to the keyboard). When unplugging the cable to the HP monitor's hub, the MacBook would not wake up. It only happens with the Apple display.
The problem with the hack/bandaid/workaround suggested is that to unplug the USB cable before sleeping the laptop, I need to open Disk Utility and eject the external drive first, or else the system will complain. When the laptop goes to sleep, it runs a script (that I wrote) to eject the drive before sleeping, thus no error messages.
Apple really needs to work on making "docking" work right. Until then, I'm using a Belkin USB hub, which doesn't wake up the Mac when unplugging it.