Spool non english character names to a file

Hi There,
We have a table with around a million rows. We just need to select two columns (one of which is a name field containing English and non-English names) and spool the data to a file. The problem is that in SQL Developer, if I choose the csv or dsv option, the non-English names (Chinese characters, for example) show up as question marks. Is there a better way or format to do this? I tried xls, but although the export goes through successfully, the Excel file is empty when I open it. I was able to do it for around 200,000 rows in Excel.
Any suggestions?
Thanks,
Sun
Edited by: ryansun on Jun 23, 2012 4:24 AM

When dealing with non-ASCII characters, two different issues can exist:
1) data storage - an incorrect byte value is stored
2) data presentation - an incorrect character is displayed.
Can the utility used to view the CSV file actually display non-ASCII values properly?
Can you inspect the CSV file using a hexadecimal editor? What do you see inside the file?
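
A hex view separates the two cases above. The following is a minimal Java sketch, not something posted in the thread: CsvByteCheck and toHex are hypothetical names, and the sample character is an assumption. If the exported file holds byte 3F where a Chinese name should be, the character was already replaced during export; if it holds a multi-byte sequence, the data is intact and only the viewer is misreading it.

```java
import java.nio.charset.StandardCharsets;

// Hypothetical helper: renders bytes the way a hex editor would show them.
final class CsvByteCheck {
    static String toHex(byte[] bytes) {
        StringBuilder sb = new StringBuilder();
        for (byte b : bytes) {
            if (sb.length() > 0) {
                sb.append(' ');
            }
            // Mask to 0..255 so negative bytes print as two hex digits.
            sb.append(String.format("%02X", b & 0xFF));
        }
        return sb.toString();
    }

    public static void main(String[] args) {
        String name = "\u738b"; // one Chinese character, written as an escape
        // Intact data: a three-byte UTF-8 sequence.
        System.out.println(toHex(name.getBytes(StandardCharsets.UTF_8)));
        // Lossy export: the unmappable character becomes a question mark byte.
        System.out.println(toHex(name.getBytes(StandardCharsets.US_ASCII)));
    }
}
```

The same check can be run on the first few hundred bytes of the spooled file instead of on an in-memory string.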

Similar Messages

Error when import file with non-english character

Hi, I have image files with non-English (Unicode) characters in their names, for example ABC<X>.png, where <X> is a non-English character such as Japanese, Chinese, etc. Whenever I try to import such a file into After Effects (right click -> Import -> File), I always get this error:
Finding file/dir info for the file "C:\...\ABC?.png" -- file not found (-43) (3::30)
Can't import file "ABC?.png": unsupported filetype or extension. (0::1)
My PC runs Windows XP Professional 2002 SP2 English. How can I solve this problem? Thanks

Adjust your system language settings. Proper file name conventions require a consistent Unicode environment, so install the respective foreign language support files or switch the language system-wide. Mixing different zones/ code ranges is always a bad idea. If your system is not in Japanese, AE will always misinterpret the characters and refuse to import. If that's not feasible, simply rename the files.
Mylenium

Csv upload -- suggestion needed with non-English character in csv file

Hi All,
I have a process that uploads a csv file into a table. It works with plain English characters, but when the csv file contains non-English characters, the columns are not populated correctly.
My csv file content is
First Name | Middle Name | Last Name
José | # | Reema
Sam | # | Peter
The output looks like this (the last name comes out blank):
First Name | Middle Name | Last Name
Jos鬣 | Reema | blank 
Sam | # | Peter
http://apex.oracle.com/pls/otn/f?p=53121:1
workspace- gil_dev
user- apex
password- apex12
Thanks for your help.
Manish
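
The thread does not say which encoding the file was saved in or which one the upload process assumes, so the following Java sketch only illustrates one common cause: a file saved in a single-byte encoding but decoded as UTF-8. CsvDecodeDemo and decode are hypothetical names, and ISO-8859-1 stands in for whatever encoding the file really uses.

```java
import java.nio.charset.Charset;
import java.nio.charset.StandardCharsets;

// Hypothetical demo: the same bytes decoded with two different assumptions.
final class CsvDecodeDemo {
    static String decode(byte[] fileBytes, Charset assumed) {
        return new String(fileBytes, assumed);
    }

    public static void main(String[] args) {
        String row = "Jos\u00e9 | # | Reema";
        // Assumption: the csv file was saved in a single-byte encoding.
        byte[] fileBytes = row.getBytes(StandardCharsets.ISO_8859_1);
        // Decoding with the encoding used for saving round-trips cleanly.
        System.out.println(decode(fileBytes, StandardCharsets.ISO_8859_1).equals(row));
        // Decoding as UTF-8 damages the accented letter.
        System.out.println(decode(fileBytes, StandardCharsets.UTF_8).equals(row));
    }
}
```

If this is what is happening, saving the csv file in the encoding the upload expects (or declaring the file's real encoding to the upload) would be the thing to check first.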

Manish,
PROCEDURE csv_to_array (
   -- Utility to take a CSV string, parse it into a PL/SQL table
   -- Note that it takes care of some elements optionally enclosed
   -- by double-quotes.
   p_csv_string   IN       VARCHAR2,
   p_array        OUT      wwv_flow_global.vc_arr2,
   p_separator    IN       VARCHAR2 := ';'
)
IS
   l_start_separator   PLS_INTEGER    := 0;
   l_stop_separator    PLS_INTEGER    := 0;
   l_length            PLS_INTEGER    := 0;
   l_idx               BINARY_INTEGER := 0;
   l_quote_enclosed    BOOLEAN        := FALSE;
   l_offset            PLS_INTEGER    := 1;
BEGIN
   l_length := NVL (LENGTH (p_csv_string), 0);
   IF (l_length <= 0)
   THEN
      RETURN;
   END IF;
   LOOP
      l_idx := l_idx + 1;
      l_quote_enclosed := FALSE;
      -- A field that starts with a double quote ends at the next double quote
      IF SUBSTR (p_csv_string, l_start_separator + 1, 1) = '"'
      THEN
         l_quote_enclosed := TRUE;
         l_offset := 2;
         l_stop_separator :=
            INSTR (p_csv_string, '"', l_start_separator + l_offset, 1);
      ELSE
         l_offset := 1;
         l_stop_separator :=
            INSTR (p_csv_string,
                   p_separator,
                   l_start_separator + l_offset,
                   1
                  );
      END IF;
      -- No further separator: the field runs to the end of the string
      IF l_stop_separator = 0
      THEN
         l_stop_separator := l_length + 1;
      END IF;
      p_array (l_idx) :=
         (SUBSTR (p_csv_string,
                  l_start_separator + l_offset,
                  (l_stop_separator - l_start_separator - l_offset
                  )
                 )
         );
      EXIT WHEN l_stop_separator >= l_length;
      -- Skip the closing quote so the next field starts at the separator
      IF l_quote_enclosed
      THEN
         l_stop_separator := l_stop_separator + 1;
      END IF;
      l_start_separator := l_stop_separator;
   END LOOP;
END csv_to_array;
and
PROCEDURE get_records (p_clob IN CLOB, p_records OUT varchar2_t)
IS
   l_record_separator   VARCHAR2 (2) := CHR (13) || CHR (10);
   l_last               INTEGER;
   l_current            INTEGER;
BEGIN
   -- If HTMLDB has generated the file,
   -- it will be a Unix text file. If user has manually created the file, it
   -- will have DOS newlines.
   -- If the file has a DOS newline (cr+lf), use that
   -- If the file does not have a DOS newline, use a Unix newline (lf)
   IF (NVL (DBMS_LOB.INSTR (p_clob, l_record_separator, 1, 1), 0) = 0)
   THEN
      l_record_separator := CHR (10);
   END IF;
   l_last := 1;
   LOOP
      l_current := DBMS_LOB.INSTR (p_clob, l_record_separator, l_last, 1);
      EXIT WHEN (NVL (l_current, 0) = 0);
      -- The original post wrapped this in REPLACE (...), but the remaining
      -- arguments were lost; the raw substring is stored here instead.
      p_records (p_records.COUNT + 1) :=
         DBMS_LOB.SUBSTR (p_clob, l_current - l_last, l_last);
      l_last := l_current + LENGTH (l_record_separator);
   END LOOP;
END get_records;
Denes Kubicek
http://deneskubicek.blogspot.com/
http://www.opal-consulting.de/training
http://htmldb.oracle.com/pls/otn/f?p=31517:1
-------------------------------------------------------------------

Getting a request in a non English character

Hi ,
In an attempt to solve the problem of reading a request parameter that contains non-English characters, I use the following code, taken from O'Reilly's "Java Servlet Programming", first edition:
import javax.servlet.*;
import javax.servlet.http.*;
import java.io.*;
public class MyServlet extends HttpServlet {
    public void doGet(HttpServletRequest req, HttpServletResponse res)
            throws ServletException, IOException {
        try {
            // set encoding of request and response
            req.setCharacterEncoding("Cp1255"); // for Hebrew Windows
            res.setCharacterEncoding("Cp1255");
            res.setContentType("Text/html; Cp1255");
            String value = req.getParameter("param");
            // Now convert it from an array of bytes to an array of characters.
            // Here we bother to read only the first line.
            BufferedReader reader = new BufferedReader(
                    new InputStreamReader(new StringBufferInputStream(value), "Cp1255"));
            String valueInUnicode = reader.readLine();
        } catch (Exception e) {
            e.printStackTrace();
        }
    }
}
This works fine; the only problem is that StringBufferInputStream is deprecated.
Is there an alternative?
Thanks in advance
Yair
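
One non-deprecated route, offered as a sketch and not as the book's method: StringBufferInputStream uses only the low eight bits of each character, and for a string the container decoded as ISO-8859-1, getBytes(ISO_8859_1) recovers the same raw bytes, which can then be decoded with the real charset. Redecode and redecode are hypothetical names, and the approach assumes the container really did apply ISO-8859-1 to the parameter.

```java
import java.nio.charset.Charset;
import java.nio.charset.StandardCharsets;

// Hypothetical helper: undo a wrong ISO-8859-1 decode without StringBufferInputStream.
final class Redecode {
    static String redecode(String misdecoded, Charset actual) {
        // ISO-8859-1 maps chars 0..255 to the same byte values, so this
        // recovers the bytes that originally arrived on the wire.
        byte[] raw = misdecoded.getBytes(StandardCharsets.ISO_8859_1);
        return new String(raw, actual);
    }

    public static void main(String[] args) {
        String original = "\u05d9\u05d0\u05d9\u05e8"; // "yair" in Hebrew
        // Simulate the container decoding UTF-8 bytes as ISO-8859-1.
        String misdecoded = new String(
                original.getBytes(StandardCharsets.UTF_8), StandardCharsets.ISO_8859_1);
        System.out.println(redecode(misdecoded, StandardCharsets.UTF_8).equals(original));
    }
}
```

For the servlet above, the second argument would be Charset.forName("Cp1255") in place of UTF-8, assuming that charset is available in the JVM in use.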

Hi Again ..
To get to the root of things, here are a test servlet and a test HTTP client that demonstrate the behaviour with and without the above patch.
The servlet:
import javax.servlet.*;
import javax.servlet.http.*;
import java.io.BufferedReader;
import java.io.IOException;
import java.io.InputStreamReader;
import java.io.PrintWriter;
import java.io.StringBufferInputStream;
public class Hebrew2test extends HttpServlet {
    public void doGet(HttpServletRequest request, HttpServletResponse response)
            throws ServletException, IOException {
        request.setCharacterEncoding("Cp1255");
        response.setCharacterEncoding("Cp1255");
        response.setContentType("Text/html; Cp1255");
        PrintWriter out = response.getWriter();
        String name = request.getParameter("name");
        // print without any patch
        out.println(name);
        // a try with patch 1 DEPRECATED
        out.println("patch 1:");
        BufferedReader reader =
                new BufferedReader(new InputStreamReader(new StringBufferInputStream(name), "cp1255"));
        String patch_name = reader.readLine();
        out.println(patch_name);
        // a try with patch 2 which doesn't work
        out.println("patch 2:");
        String valueInUnicode = new String(name.getBytes("Cp1255"), "UTF8");
        out.println(valueInUnicode);
    }
}
And now for a test client:
import java.io.*;
import java.net.*;
public class HttpClient_cp1255 {
    private static void printUsage() {
        System.out.println("usage: java HttpClient host port");
    }
    public static void main(String[] args) {
        if (args.length < 2) {
            printUsage();
            return;
        }
        // Host is the first parameter, port is the second
        String host = args[0];
        int port;
        try {
            port = Integer.parseInt(args[1]);
        } catch (NumberFormatException e) {
            printUsage();
            return;
        }
        try {
            // Open a socket to the server
            Socket s = new Socket(host, port);
            // Start a thread to send the request to the server
            new Request_(s).start();
            // Now print everything we receive from the socket
            BufferedReader in = new BufferedReader(new InputStreamReader(s.getInputStream(), "cp1255"));
            String line;
            File f = new File("in.txt");
            FileWriter out = new FileWriter(f);
            while ((line = in.readLine()) != null) {
                System.out.println(line);
                out.write(line);
            }
            out.close();
        } catch (Exception e) {
            e.printStackTrace();
        }
    }
}
class Request_ extends Thread {
    Socket s;
    public Request_(Socket s) {
        this.s = s;
        setPriority(MIN_PRIORITY); // socket reads should have a higher priority
        // Wish I could use a select() !
        setDaemon(true); // let the app die even when this thread is running
    }
    public void run() {
        try {
            OutputStreamWriter server = new OutputStreamWriter(s.getOutputStream(), "cp1255");
            //String query= "GET /userprofiles/hebrew2test?name=yair"; //yair in English ..
            String query = "GET /userprofiles/hebrew2test?name=\u05d9\u05d0\u05d9\u05e8"; //yair in hebrew - in unicode
            System.out.println("Connected... your HTTP request is sent");
            System.out.println("------------------------------------------");
            server.write(query);
            server.write("\r\n"); // HTTP lines end with \r\n
            server.flush();
            System.out.println(server.getEncoding());
            server = new OutputStreamWriter(new FileOutputStream("out.txt"), "cp1255");
            server.write(query);
            server.flush();
        } catch (Exception e) {
            e.printStackTrace();
        }
    }
}

Reading a non-english character

Hi, I have trouble reading a non-English character from an HTML page.
I take the word from the HTML page and compare it with the same word in my code,
like this
string.equals("BİTTİ")
but it returns false.
Is it possible to correct this?

Specify an encoding for your InputStreamReader, for example:
BufferedReader in = new BufferedReader(
new InputStreamReader(new FileInputStream("infilename"), "8859_1"));
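
Here is a small sketch of why the charset passed to the reader decides whether equals() succeeds. ExplicitCharsetRead and firstLine are hypothetical names, and UTF-8 is an assumption about how the page is encoded. Note that ISO-8859-1 has no İ, so for Turkish text the reader would need the page's real encoding (for example UTF-8 or ISO-8859-9).

```java
import java.io.BufferedReader;
import java.io.ByteArrayInputStream;
import java.io.IOException;
import java.io.InputStreamReader;
import java.io.UncheckedIOException;
import java.nio.charset.Charset;
import java.nio.charset.StandardCharsets;

// Hypothetical helper: reads the first line of some bytes using a given charset.
final class ExplicitCharsetRead {
    static String firstLine(byte[] data, Charset charset) {
        try (BufferedReader in = new BufferedReader(
                new InputStreamReader(new ByteArrayInputStream(data), charset))) {
            return in.readLine();
        } catch (IOException e) {
            throw new UncheckedIOException(e);
        }
    }

    public static void main(String[] args) {
        String expected = "B\u0130TT\u0130"; // escapes avoid source-encoding surprises
        byte[] page = expected.getBytes(StandardCharsets.UTF_8); // assumed page encoding
        // Matching charset: the comparison succeeds.
        System.out.println(firstLine(page, StandardCharsets.UTF_8).equals(expected));
        // Wrong charset: each UTF-8 byte becomes its own character, so it fails.
        System.out.println(firstLine(page, StandardCharsets.ISO_8859_1).equals(expected));
    }
}
```

Writing the literal with \u escapes, as above, also removes the compiler's source encoding as a second possible cause of the mismatch.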

How to validate for non-english character on a single line text field

In a "Single Line Text" field we would like to allow users to enter alphanumeric values only. We should show an error when the user enters non-English values like
carácter
Vijayaragavan, MCTS

Hi,
According to your post, my understanding is that you want to validate a single line text field for non-English characters.
I recommend using jQuery to attach regular expression validation. Please refer to:
Using #jQuery to attach regular expression validation to a #SharePoint list form field
In addition, for custom validations you can create your own field types. Refer to
this for creating a custom field type
More information:
SharePoint Custom Field - Regex Validator
Thanks,
Linda Li

Does querybuilder support non-english character?

I want to run a query with querybuilder that contains non-English (Chinese) characters.
I tried it with http://localhost:4502/libs/cq/search/content/querydebug.html but it does not work.
Below is my query string:
property=contenttext
property.value=你好嗎
I have converted the Chinese characters (你好嗎) to Unicode.
Can anyone help me?

That's a bug in the debugger UI. But it's easy to fix:
in crxde lite, overlay /libs/cq/search/components/querydebug/querydebug.jsp by copying it to /apps/cq/search/components/querydebug/querydebug.jsp
open /apps/cq/search/components/querydebug/querydebug.jsp
find the line "props.load(new ByteArrayInputStream(queryParam.getBytes("ISO-8859-1")));"
and replace with "props.load(new StringReader(queryParam));"
Will be fixed in 5.6.1.
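
The difference between the two lines can be reproduced outside CQ with plain java.util.Properties. This is only a sketch of the mechanism: QueryParamLoading and its two methods are hypothetical names, not part of querydebug.jsp. Encoding the string to ISO-8859-1 bytes replaces every Chinese character with a question mark before Properties ever sees it, while loading from a Reader keeps the characters.

```java
import java.io.ByteArrayInputStream;
import java.io.IOException;
import java.io.StringReader;
import java.io.UncheckedIOException;
import java.nio.charset.StandardCharsets;
import java.util.Properties;

// Hypothetical demo of the two ways of loading the query parameters.
final class QueryParamLoading {
    // Mirrors the line being replaced: characters outside ISO-8859-1 are lost.
    static Properties loadViaBytes(String queryParam) {
        Properties props = new Properties();
        try {
            props.load(new ByteArrayInputStream(
                    queryParam.getBytes(StandardCharsets.ISO_8859_1)));
        } catch (IOException e) {
            throw new UncheckedIOException(e);
        }
        return props;
    }

    // Mirrors the replacement line: the characters are read as they are.
    static Properties loadViaReader(String queryParam) {
        Properties props = new Properties();
        try {
            props.load(new StringReader(queryParam));
        } catch (IOException e) {
            throw new UncheckedIOException(e);
        }
        return props;
    }

    public static void main(String[] args) {
        String query = "property=contenttext\nproperty.value=\u4f60\u597d\u55ce";
        System.out.println(loadViaBytes(query).getProperty("property.value"));
        System.out.println(loadViaReader(query).getProperty("property.value"));
    }
}
```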

Linux or JVM: cannot display non english character

hi,
I am trying to implement a GUI that supports both Turkish and English. The user can switch between them on the fly.
import java.awt.event.ActionEvent;
import java.awt.event.ActionListener;
import java.util.Locale;
import java.util.ResourceBundle;
import javax.swing.JButton;
import javax.swing.JLabel;
public class SampleGUI {
    JButton trTranslate = new JButton(); /* Button, to translate into Turkish */
    /* Label text will be translated */
    JLabel label = new JLabel("Text to Be Translated!");
    public SampleGUI() {
        trTranslate.addActionListener(new ActionListener() {
            public void actionPerformed(ActionEvent e) {
                String language = "tr";
                String country = "TR";
                Locale currentLocale;
                ResourceBundle messages;
                currentLocale = new Locale(language, country);
                messages = ResourceBundle.getBundle("TranslateMessages", currentLocale);
                /* get from properties file the Turkish match of "TextToTranslate" */
                label.setText(messages.getString("TextToTranslate"));
            }
        });
    }
}
Finally, my problem is that my application does not display non-English characters such as "ş" and "ğ" in the GUI after the translation is triggered. However, if I do not use ResourceBundle and instead assign the Turkish text directly to the label (i.e. label.setText("şşşşş")), the GUI displays the Turkish characters correctly. What may be the problem? Which encoding does not match?
PS: I am using Red Hat Linux 8.0 and j2sdk1.4.1. The current locale is "tr_TR.UTF-8", and in /etc/sysconfig/keyboard, keyTable = "trq". The OS itself seems fine, as I can input and output Turkish characters, and the JVM gets the current encoding from the OS. It seems as if the problem is that the properties file is read in an inappropriate encoding.
Thanks for your time and effort,
hELin
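
The suspicion about the properties file can be tested in isolation. The sketch below is an illustration and not the poster's code: Utf8Bundle and load are hypothetical names, and the sample value is made up. It reads a UTF-8 properties file through an explicit Reader, and also shows that \uXXXX escapes inside the file load correctly without any encoding assumption. The Reader constructor of PropertyResourceBundle was added in Java 6, so on j2sdk 1.4.1 only the escaped form would apply.

```java
import java.io.ByteArrayInputStream;
import java.io.IOException;
import java.io.InputStreamReader;
import java.io.Reader;
import java.io.UncheckedIOException;
import java.nio.charset.StandardCharsets;
import java.util.PropertyResourceBundle;
import java.util.ResourceBundle;

// Hypothetical helper: loads properties-file bytes with an explicit UTF-8 reader.
final class Utf8Bundle {
    static ResourceBundle load(byte[] propertiesFile) {
        try (Reader reader = new InputStreamReader(
                new ByteArrayInputStream(propertiesFile), StandardCharsets.UTF_8)) {
            return new PropertyResourceBundle(reader);
        } catch (IOException e) {
            throw new UncheckedIOException(e);
        }
    }

    public static void main(String[] args) {
        // File content saved as UTF-8 with a literal Turkish letter.
        byte[] utf8File = "TextToTranslate=\u015fifre\n".getBytes(StandardCharsets.UTF_8);
        // File content that spells the same letter as an ASCII escape sequence.
        byte[] escapedFile = "TextToTranslate=\\u015fifre\n".getBytes(StandardCharsets.US_ASCII);
        System.out.println(load(utf8File).getString("TextToTranslate")
                .equals(load(escapedFile).getString("TextToTranslate")));
    }
}
```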

I would suspect it would work in vim only if vim supported the UTF8 character set. I have no idea if it does.
Here is one blurb I found on google:
USING UNICODE IN THE GUI
The nice thing about Unicode is that other encodings can be converted to it
and back without losing information. When you make Vim use Unicode
internally, you will be able to edit files in any encoding.
Unfortunately, the number of systems supporting Unicode is still limited.
Thus it's unlikely that your language uses it. You need to tell Vim you want
to use Unicode, and how to handle interfacing with the rest of the system.
Let's start with the GUI version of Vim, which is able to display Unicode
characters. This should work:
     :set encoding=utf-8
     :set guifont=-misc-fixed-medium-r-normal--18-120-100-100-c-90-iso10646-1
The 'encoding' option tells Vim the encoding of the characters that you use.
This applies to the text in buffers (files you are editing), registers, Vim
script files, etc. You can regard 'encoding' as the setting for the internals
of Vim.
This example assumes you have this font on your system. The name in the
example is for X-Windows. This font is in a package that is used to enhance
xterm with Unicode support. If you don't have this font, you might find it
here:
     http://www.cl.cam.ac.uk/~mgk25/download/ucs-fonts.tar.gz

Problem with Vcard and non-English character

The vCard feature is what I would like to use, but I have quite a few contacts with non-English (Korean) names.
I know the iPod can display Korean, but when I create a vCard with Korean characters and copy the vCard file into the /contacts folder, I can see the file name as the person's name (in Windows Explorer), but I can ONLY see the first character of the file name when I display contacts on the iPod.
Does anyone have tips/tricks for displaying the whole file name in iPod contacts?
Thanks.
Windows XP Pro

Because I use the string nota in a JSP page and print it into a textarea, the text comes out with no newlines. Example:
<textarea name="nota" rows="4" cols="60"><%= nota %></textarea>
The text in the textarea is:
first linesecond linethird line
but I want the text displayed in the textarea to match the text in the CDATA section:
first line
second line
third line

How does one install non-English character sets for use with the "find" function in Acrobat Pro 11?

I have pdf files in European languages and want to be able to enter non-English characters in the "find" function. How does one install other character sets for use with Acrobat Pro XI?

Have you tried applying the update by going to Help>Updates within Photoshop Lightroom? The update should be using the same licensing? Did you perhaps customize the installation location? Finally which operating system are you using?

Non-english character display as square box

Hi all,
I'm not very sure if this question should be asked here or in the JRE board, thus I'm trying here also
I have been trying an opensourced application called Alliancep2p (could be obtained from www.alliancep2p.com) using JRE 1.6 on an English Windows XP Pro machine.
The problem:
all chinese input are displayed as "square box". It looks like the programme "gets" the correct character, only that everything is displayed as "square box".
It looks like a font issue, though I'm not sure. Is there any way to change the default fonts, or otherwise get the characters displayed correctly?
Note: I have east asian fonts installed, and the Java config panel can display chinese or other non-english characters correctly.
I tried the same application under GNU/Linux (locale is UTF-8) and chinese input/display correctly without any problem at all. Does it mean that it is not the problem of the application, or?
The original question in the JRE board:
http://forum.java.sun.com/thread.jspa?threadID=5265369&tstart=0
Thanks for all the input.

I'm not really sure if it's a problem of the application or not. But the fact that it works perfectly under Linux makes me think maybe it's not the problem of the program, and actually their developers said that unicode is being used all over the program and seems like they're not CJK users also.
I'm not a java guru so I can't really tell from the source if there's anything wrong.

Identify Non English Character in a String

All,
We have a requirement to identify non-English characters in data keyed in by the user and to return an error message saying that only valid English, numeric and some special characters are allowed.
For example, if the user enters data like "This is a Test data" then the return value should be true. If he enters something like "My Native Language is inglés" then it should return false. Similarly, any Chinese, Russian or Japanese character entries should also return false.
How can we achieve this?
Thanks,
Nagarajan.

Hi Nagarajan,
You could use Unicode character blocks or simply craft a regular expression that contains all the characters you need. The latter is easy to understand and gives you full control over which characters you want to allow. E.g. I assume you might want something like this:
if(!"This is a proper input string".matches("[\\s\\w\\p{Punct}]+")) {
// Issue error message and re-get input string
}
The String method matches() takes a regular expression as its input parameter. If you haven't dealt with regular expressions before, check out the Java API help for class java.util.regex.Pattern. Here's a short breakdown of the pattern I used:
1. The square brackets [] enclose a list of allowed characters; here you can explicitly list all allowed characters.
2. Inside the brackets you can specify ranges like a-z, list individual characters like ;:| or use predefined character classes (\s for any whitespace character, \w for all letters a-z and A-Z, underscore and 0-9, and the POSIX class \p{Punct} for punctuation symbols). For a complete list check the Java API help on java.util.regex.Pattern.
3. The + at the end indicates that the characters listed can occur once or more.
There are other ways to achieve what you want, but I think this might be an easy way to start with.
Cheers, harald
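
The pattern above can be wrapped in a small reusable check. This is a sketch under the assumption that "English" means the ASCII letters, digits, whitespace and punctuation that these classes match by default in Java; EnglishInputValidator and isValid are hypothetical names.

```java
import java.util.regex.Pattern;

// Hypothetical wrapper around the pattern from the reply above.
final class EnglishInputValidator {
    // Without the UNICODE_CHARACTER_CLASS flag, \s, \w and \p{Punct} are ASCII-only.
    private static final Pattern ALLOWED = Pattern.compile("[\\s\\w\\p{Punct}]+");

    static boolean isValid(String input) {
        return input != null && ALLOWED.matcher(input).matches();
    }

    public static void main(String[] args) {
        System.out.println(isValid("This is a Test data"));
        System.out.println(isValid("My Native Language is ingl\u00e9s"));
    }
}
```

Compiling the pattern once keeps repeated validation cheap; the trailing + also means an empty string is rejected.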

Connecting to Wireless AP with non-english SSID name

Hi everyone
I have a wireless AP with a non-English name (the name is "IsolÈ coffee"). When I scan for wireless networks in Mac OS X, this network does not show up. iStumbler does show it, but it is not able to connect to it.
Any Windows PC or Linux machine can connect to the wireless AP with no issue.
Is there any way to connect to this wireless AP? (And NO, I can't change the SSID, because I don't own it; it's a public AP.)
Thanks

This post is a little old, and I don't know if it's proper to add a reply, but I just ran into the same problem and hope someone will work on it. The SSID consists of some Chinese characters, and I don't have the authority to change it. As AhmadT mentioned, AirPort can't find anything, and even if the SSID is typed in manually, it doesn't work.
The only workaround I could find is to use another Wi-Fi card (a USB one) and have it driven by Windows XP running in VMware Fusion, so that I can access the Wi-Fi router and the internet from the virtual machine.

Non-English character problem in Oracle 10g Express Edition

Hi There;
I have a table named INSTITUTION. It has a NUMBER column INS_ID and an NVARCHAR2(50) column INS_NAME. INS_NAME can contain Turkish characters, such as "ğ,ü,ş,ç,ö". According to the business logic, INS_NAME values must not repeat.
The user enters the institution name in an ASP.NET textbox, and I check this name in the database from C# code; if there is no repetition, we add the record.
The problem is that when the user enters an institution name that contains Turkish characters, duplicates get through. If there is an institution named "su işleri", both queries, SELECT * FROM INSTITUTION WHERE INS_NAME = 'su işleri'; and SELECT * FROM INSTITUTION WHERE INS_NAME = 'su isleri';, return no result, even though the row is there.
But if the institution name is "oracle corporation" (no Turkish characters), the query succeeds. I have the same problem in Toad for Oracle 11.5.1.2. When I run SELECT * FROM INSTITUTION from Toad, the phrase "su işleri" appears. But when I run SELECT * FROM INSTITUTION WHERE INS_NAME = 'su işleri';, there is again no result.
When I connect to the Oracle database directly and run SELECT * FROM INSTITUTION, the phrase "su isleri" (not "su işleri") appears.
Here are the language settings of the database:
National Language Support
National Language Parameter   Value
NLS_CALENDAR                  GREGORIAN
NLS_CHARACTERSET              WE8MSWIN1252
NLS_COMP                      BINARY
NLS_CURRENCY                  TL
NLS_DATE_FORMAT               DD/MM/RRRR
NLS_DATE_LANGUAGE             TURKISH
NLS_DUAL_CURRENCY             YTL
NLS_ISO_CURRENCY              TURKEY
NLS_LANGUAGE                  TURKISH
NLS_LENGTH_SEMANTICS          BYTE
NLS_NCHAR_CHARACTERSET        AL16UTF16
NLS_NCHAR_CONV_EXCP           FALSE
NLS_NUMERIC_CHARACTERS        ,.
NLS_SORT                      TURKISH
NLS_TERRITORY                 TURKEY
NLS_TIME_FORMAT               HH24:MI:SSXFF
NLS_TIMESTAMP_FORMAT          DD/MM/RRRR HH24:MI:SSXFF
NLS_TIMESTAMP_TZ_FORMAT       DD/MM/RRRR HH24:MI:SSXFF TZR
NLS_TIME_TZ_FORMAT            HH24:MI:SSXFF TZR
How can I resolve that problem? Thanks in advance.
Edited by: 963344 on 05.Eki.2012 01:00
Edited by: 963344 on 05.Eki.2012 01:01
Edited by: 963344 on 05.Eki.2012 01:06

This type of question/discussion belongs in the {forum:id=50} forum.
Very recently a thread there touched on the topic of Turkish character support.
Please read it: Western European Characterset to Turkish in sql
> NLS_CHARACTERSET WE8MSWIN1252
Check the character set repertoire of win-1252 (look for the typical Turkish characters you've mentioned above).
http://msdn.microsoft.com/en-us/goglobal/cc305145.aspx
Look at the character names, such as "... letter s with cedilla".
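
The repertoire check can also be done programmatically. This is a sketch, not part of the reply: RepertoireCheck and unencodable are hypothetical names, and it uses Java's "windows-1252" charset as a stand-in for Oracle's WE8MSWIN1252 (assumed to cover the same repertoire). It lists the characters of a string that the charset cannot represent.

```java
import java.nio.charset.Charset;
import java.nio.charset.CharsetEncoder;

// Hypothetical helper: finds characters a charset cannot encode.
final class RepertoireCheck {
    static String unencodable(String text, Charset charset) {
        CharsetEncoder encoder = charset.newEncoder();
        StringBuilder missing = new StringBuilder();
        // Checks one char at a time, which is enough for these Turkish letters.
        for (char c : text.toCharArray()) {
            if (!encoder.canEncode(c)) {
                missing.append(c);
            }
        }
        return missing.toString();
    }

    public static void main(String[] args) {
        Charset win1252 = Charset.forName("windows-1252");
        // g-breve, u-umlaut, s-cedilla, c-cedilla, o-umlaut from the question
        String turkish = "\u011f\u00fc\u015f\u00e7\u00f6";
        System.out.println(unencodable(turkish, win1252));
        System.out.println(unencodable("su i\u015fleri", win1252));
    }
}
```

Any character reported here cannot survive a conversion to that character set, which would be consistent with "su işleri" turning into "su isleri" somewhere between client and database.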

Contact bug with non English last name

I put the Chinese name in the last name field, so that it shows up in the contact next to the English first name, i.e. George [chinese]. This worked with no problem, but when I send an SMS to more than one such contact it fails with "xxx is not on the contact...", although I can send to one contact at a time with no problem. I reset the phone and redid the contacts via OVI Suite, but the problem is still the same. The only way to make it work is to put the Chinese name in as part of the first name; this looks good in the contact and I can send SMS to multiple contacts with no problem, but it is a very stupid workaround. I didn't have this problem on previous N phones.

That *IS* a bug! If you can, have a search around on bugs.maemo.org and log a new bug, especially since it can be reproduced!
good find!
