Non-ASCII characters in a Unix file

Hi,
Could you please tell me how I can find the non-ASCII/ASCII characters in a Unix file?
Thanks

in vi
:set list
$ cat -v abc.txt | more
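If a quick Java check is handier than vi or cat, here is a minimal sketch (the file name is just an example) that reports the line and column of every byte outside the 7-bit ASCII range. On the command line, LC_ALL=C grep -n '[^ -~]' abc.txt gives similar line-level information (note that tabs are also flagged).

import java.io.BufferedInputStream;
import java.io.FileInputStream;
import java.io.IOException;

public class FindNonAscii {
    public static void main(String[] args) throws IOException {
        try (BufferedInputStream in = new BufferedInputStream(new FileInputStream("abc.txt"))) {
            int b, line = 1, col = 0;
            while ((b = in.read()) != -1) {
                col++;
                if (b == '\n') { line++; col = 0; continue; }
                if (b > 0x7F) {                                   // outside 7-bit ASCII
                    System.out.printf("line %d, col %d: 0x%02X%n", line, col, b);
                }
            }
        }
    }
}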

Similar Messages

  • SQL to remove brackets and non-ASCII characters

    Oracle Database 11g Enterprise Edition Release 11.2.0.3.0 - 64bit Production
    PL/SQL Release 11.2.0.3.0 - Production
    Hi there,
    I have a requirement to remove any non-alphabetic characters such as brackets, quotes and non-ASCII characters (languages other than English), and to return everything in lower case. The select is simple:
    select lower(customer_name)
      from customer_data
     where CONVERT(customer_name, 'US7ASCII') = customer_name
    However, I also want to add maybe the replace function on the customer name, so that the brackets, ", ', / - or any other such characters can be removed. I tried something like this, unsuccessfully:
    select replace('cvp, llc','''-/,,".') from dual
    I wanted to do something along those lines in my first query so that I only get English-alphabet customer names, stripping off all the non-English characters.
    Could you please advise?
    Thanks,
    Ryan
    Edited by: ryansun on Nov 15, 2012 6:09 AM
    Edited by: ryansun on Nov 15, 2012 6:10 AM

    Hi Chris,
    if you want to use TRANSLATE then you have to modify it a bit:
    WITH customer_data AS
      (SELECT 'cvp, llc' AS customer_name FROM DUAL)
    SELECT translate(customer_name
                    ,'abcdefghijklmnopqrstuvwxyzABCDEFGHIJKLMNOPQRSTUVWXYZ1234567890'
                    ,'abcdefghijklmnopqrstuvwxyzABCDEFGHIJKLMNOPQRSTUVWXYZ1234567890') txt
      FROM customer_data;
    TXT
    cvp, llc
    WITH customer_data AS
      (SELECT 'cvp, llc' AS customer_name FROM DUAL)
    SELECT translate(customer_name,' '
                     ||'[''-/,".]' -- chars to remove here
                    ,' ') txt
      FROM customer_data;
    TXT
    cvp llc
    Regards.
    Al
    Edited by: Alberto Faenza on Nov 15, 2012 3:52 PM
    Modified query
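    If the cleanup ever has to be done in application code rather than in the query, a rough Java equivalent of the same idea is sketched below (purely illustrative; class and method names are made up, and only a-z and spaces are kept):

    public class NameCleaner {
        // Lowercase, drop everything except a-z and spaces, collapse whitespace.
        public static String clean(String customerName) {
            return customerName
                    .toLowerCase()
                    .replaceAll("[^a-z ]", "")
                    .replaceAll(" +", " ")
                    .trim();
        }

        public static void main(String[] args) {
            System.out.println(clean("CVP, LLC"));   // prints: cvp llc
        }
    }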

  • Non-ASCII characters

    Does acrobat.com support non-ASCII characters in file names?
    Please have a look at the screenshots.
    Regards
    Bartek

    Is it possible to use Unicode names?

  • Problems with non-ASCII characters on Linux Unit Test Import

    I found a problem with non-ASCII characters in the Unit Test Import for Linux.  This problem does not appear in the Unit Test Import for Windows.
    I have attached a Unit Test export called PROC1.XML. It tests a procedure that is included in another attachment called PROC1.txt. The unit test includes 2 implementations. Both implementations pass non-ASCII characters to the procedure and return them unchanged.
    In Linux, the unit test import will change the non-ASCII characters in the XML file to xFFFD. If I copy/paste the non-ASCII characters into the Unit Test after the import, they will be stored and executed correctly.
    Amazon Ubuntu 3.13.0-45-generic / lubuntu-core
    Oracle 11g Express Edition - AL32UTF8
    SQL*Developer 4.0.3.16 Build MAIN-16.84
    Java(TM) SE Runtime Environment (build 1.7.0_76-b13)
    Java HotSpot(TM) 64-Bit Server VM (build 24.76-b04, mixed mode)
    In Windows, the unit test will import the non-ASCII characters unchanged from the XML file.
    Windows 7 Home Premium, Service Pack 1
    Oracle 11g Express Edition - AL32UTF8
    SQL*Developer 4.0.3.16 Build MAIN-16.84
    Java(TM) SE Runtime Environment (build 1.8.0_31-b13)
    Java HotSpot(TM) 64-Bit Server VM (build 25.31-b07, mixed mode)
    If SQL*Developer is coded the same between Windows and Linux, the JVM must be causing the problem.
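    xFFFD is the Unicode replacement character, which typically appears when text is decoded with the wrong or an incapable charset. A quick way to see what the JVM on the Linux box actually defaults to (a diagnostic sketch, not an official SQL*Developer procedure):

    import java.nio.charset.Charset;

    public class CharsetCheck {
        public static void main(String[] args) {
            // If these do not report UTF-8 on the Linux machine, the default
            // charset difference between the two JVMs could explain the xFFFD result.
            System.out.println("file.encoding   = " + System.getProperty("file.encoding"));
            System.out.println("default charset = " + Charset.defaultCharset());
        }
    }

    If it is not UTF-8, a commonly tried workaround (an assumption, not verified against this build) is adding AddVMOption -Dfile.encoding=UTF-8 to sqldeveloper.conf, or launching with a UTF-8 LANG setting.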

    Set the System property "mail.mime.decodeparameters" to "true" to enable the RFC 2231 support.
    See the javadocs for the javax.mail.internet package for the list of properties.
    Yes, the FAQ entry should contain those details as well.

  • Unable to open a Unix file in a Unicode system which was created in a non-Unicode system

    Unable to open a Unix file in a Unicode system when it was created in a non-Unicode system.
    We have two SAP systems, both ECC 6.0, but System 1 is non-Unicode and System 2 is a Unicode system.
    There is a common Unix directory/folder for both systems.
    Our requirement is to create one file in the common Unix folder and write data to it from System 1.
    In System 2, the same file is opened in appending mode to write further data.
    The file in System 1 is created with the statement below.
    OPEN DATASET g_unix_file FOR OUTPUT IN TEXT MODE ENCODING UTF-8.
    Now I have to append data from System 2 to the same file.
    I have tried the statements below in System 2 to open the file, but sy-subrc comes back as '8'.
    1> OPEN DATASET g_unix_file FOR APPENDING IN TEXT MODE ENCODING UTF-8.
    2> OPEN DATASET g_unix_file FOR APPENDING IN LEGACY TEXT MODE CODE PAGE cdp IGNORING CONVERSION ERRORS.
    3> OPEN DATASET g_unix_file FOR APPENDING IN TEXT MODE ENCODING DEFAULT.
    4> OPEN DATASET g_unix_file FOR APPENDING IN TEXT MODE ENCODING NON-UNICODE.
    I have tried all the possibilities in the F1 help for OPEN DATASET, but there is still a problem opening the file in appending as well as output mode. However, the file opens successfully in input mode (read).
    Please advise how to resolve this issue.
    Thanks.

    The message captured is 'Permission Denied'. The program gets triggered with the system user ID PPID.
    How do I check the security access of that user ID?

  • Problem with non-ASCII file name in content disposition header

    Hi All,
    I am facing some problems with non-ASCII file names in the Content-Disposition header. I read in RFC 2183 that if the file name contains non-ASCII characters, it should be encoded before being sent to the browser. I did that, but ran into 2 problems:
    1. The name of the file is truncated in case the file name is slightly long for e.g. �����������j�b�g��������������������������.txt
    2. Also when the same file is opened in notepad, the title is showing encoded name %E6%9C%80%E4%B8%8A%E4%BD%8D.....
    Overall, I feel that the browser is not understanding or responding to the encoded header values.
    Is there any solution to this problem? I am using Microsoft IE 6.0.
    The code snippet is given below:
    protected void doGet(HttpServletRequest req, HttpServletResponse resp) throws ServletException, IOException {
              String fileName = "�����������j�b�g��������������������������.txt";          
              fileName = URLEncoder.encode(fileName, "UTF-8");
              resp.setCharacterEncoding("UTF-8");
              resp.setHeader("Content-Disposition", "attachment; filename=\"" + fileName + "\"");
              resp.setContentType("application/download-binary");
              String s = "This is inside txt file";
              resp.getOutputStream().write(s.getBytes("UTF-8"));
              return;
          }
    Any help or pointer would be highly appreciated.
    Thanks and Regards,
    Ashish

    The MIME standards for non-ASCII filenames are not widely implemented.
    Many mailers use an ad hoc method for encoding filenames. JavaMail
    supports both methods, but you need to set properties, such as the
    mail.mime.encodefilename property. See the JavaMail javadocs for
    the javax.mail.internet package.
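    For the servlet question above, a sketch of a commonly used approach (not taken from this thread; the helper class is made up): send a plain ASCII fallback in filename plus an RFC 6266/5987 filename* parameter. Note that IE 6, which the poster is using, predates filename* and generally needs the percent-encoded UTF-8 name placed directly in filename instead.

    import java.io.UnsupportedEncodingException;
    import java.net.URLEncoder;

    public class ContentDispositionUtil {
        // Builds: attachment; filename="fallback.txt"; filename*=UTF-8''%E6%9C%80...
        public static String attachment(String fileName) throws UnsupportedEncodingException {
            String fallback = fileName.replaceAll("[^\\x20-\\x7E]", "_").replace("\"", "_");
            // URLEncoder is close enough to RFC 5987 percent-encoding for a sketch,
            // except that it encodes spaces as '+'.
            String encoded = URLEncoder.encode(fileName, "UTF-8").replace("+", "%20");
            return "attachment; filename=\"" + fallback + "\"; filename*=UTF-8''" + encoded;
        }
    }

    Usage in the doGet() above would be resp.setHeader("Content-Disposition", ContentDispositionUtil.attachment(fileName)); with the unencoded file name passed in.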

  • Cannot rename file with non-ASCII characters when using the

    My application moves files from one directory to another by calling File[] srcFiles = srcDir.listFiles() to get a list of files in the source directory, and then calling srcFiles[i].renameTo(destFile) to rename each file.
    This does not work (renameTo returns false and the file is not moved) under the following circumstances:
    - the file's leaf name contains non-ASCII characters, for example "�"
    - the OS is Solaris 9
    - the LANG and LC_* environment variables are unset, i.e. the C locale is being used
    If I set the LANG environment variable to, for example, en_GB.UTF-8 then the rename succeeds.
    I have tried calling srcFiles[index].getName().getBytes("UTF-8") and the non-ASCII characters are being replaced with ? (0x3f) characters when LANG is unset.
    Is this a bug in the JRE? I would argue that since my code does not actually manipulate the filename (I just use the File object that File.listFiles() gives me) then the rename should succeed. Of course I would not expect the file name to be displayed correctly if I printed it out.
    I have reproduced this behaviour with JDK 1.4.2_05 and 1.5.0_04 on Solaris 9.
    Francis

    Thanks for the info Alan.
    I considered setting the locale in the environment (this sounds like the "correct" fix to me and we might implement it later), but this application shares a WebLogic server with many other applications so we would have to do a huge amount of testing to make sure that the locale change wouldn't break the other apps. In the end I worked around the problem by making the code that generates the filenames in the first place strip out any non-ASCII characters (the names of the files are not critically important).
    Looking forward to JSR-203, in the meantime perhaps a note about this behaviour in the java.io.File javadoc would be useful.
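    A sketch of the workaround described in the last paragraph (the underscore replacement and the class name are assumptions, not the poster's actual code):

    import java.io.File;

    public class AsciiNames {
        // Replace anything outside 7-bit ASCII so renameTo() works even under the C locale.
        static String toAscii(String name) {
            StringBuilder sb = new StringBuilder(name.length());
            for (char c : name.toCharArray()) {
                sb.append(c <= 0x7F ? c : '_');
            }
            return sb.toString();
        }

        public static void main(String[] args) {
            File src = new File("/tmp/résumé.txt");                       // example file only
            File dest = new File(src.getParent(), toAscii(src.getName())); // /tmp/r_sum_.txt
            System.out.println(src.renameTo(dest));
        }
    }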

  • How to send an attached file containing non-ASCII code?

    Hi,
    I want to send an attached text file containing non-ASCII code (Traditional Chinese). Is there any way to solve the encoding problem?
    Currently, it turns into meaningless characters on the receiving side.
    Thanks for the help in advance.

    Here is the code:
    Session _gSession = null;
    MimeMessage message = null;
    Properties props = new Properties();
    props.put("mail.smtp.host", smtpHost);
    _gSession = javax.mail.Session.getInstance(props, null);
    message = new MimeMessage(_gSession);
    message.setFrom(new InternetAddress(emailSender , emailSender));
    InternetAddress ia[] = new InternetAddress[1];
    ia[0] = new InternetAddress(emailReceiver, emailReceiver);
    message.setRecipients(Message.RecipientType.TO, ia);
    message.setSubject("Test Encoding Attached File");
    message.saveChanges();
    BodyPart messageBodyPart = new MimeBodyPart();
    DataSource fds = new FileDataSource("Big5_Code.txt");
    messageBodyPart.setDataHandler(new DataHandler(fds));
    messageBodyPart.addHeader("Content-ID","meme");
    MimeMultipart multipart = new MimeMultipart("related");
    multipart.addBodyPart(messageBodyPart);
    message.setContent(multipart);
    Transport.send(message);
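    One way to make the Traditional Chinese content survive, assuming Big5_Code.txt really is Big5-encoded plain text (a sketch that replaces the plain FileDataSource attachment above):

    // additionally needs: java.io.*, javax.mail.Part
    StringBuilder text = new StringBuilder();
    try (Reader r = new InputStreamReader(new FileInputStream("Big5_Code.txt"), "Big5")) {
        int c;
        while ((c = r.read()) != -1) {
            text.append((char) c);                   // decode the Big5 bytes explicitly
        }
    }
    MimeBodyPart attachment = new MimeBodyPart();
    attachment.setText(text.toString(), "UTF-8");    // declares charset=UTF-8 on the part
    attachment.setFileName("Big5_Code.txt");
    attachment.setDisposition(Part.ATTACHMENT);
    multipart.addBodyPart(attachment);               // 'multipart' from the code above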

  • File upload with non-ascii name

    I'm designing a system that includes file uploads. My problem is that any non-ASCII chars in the filename are encoded strangely when saved: ä is encoded to å, etc.
    I use Tomcat with -Dfile.encoding="UTF-8" in the Catalina file. I get the same result regardless of method: my own implementation, Apache Commons, or JavaZoom's UploadBean. All the JSP charset parameters are set.
    Any ideas?

    Hi amitads,
    I'm sure you've used Java enough. Also, you must have learned about the \ (backslash) escape character?
    So, if you want to include a character like a single quote in a string, you can write \' and for a double quote you can write \".
    Try this ... I'm sure it will help.
    Keep me posted.
    Cheers !!
    Sherbir.
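    The "ä becomes å" symptom usually means UTF-8 bytes are being decoded as ISO-8859-1 somewhere. A sketch of the usual fixes, assuming the Apache Commons FileUpload path mentioned in the question (the exact setup in the original app is unknown):

    // org.apache.commons.fileupload.*
    request.setCharacterEncoding("UTF-8");            // regular form fields
    ServletFileUpload upload = new ServletFileUpload(new DiskFileItemFactory());
    upload.setHeaderEncoding("UTF-8");                // part headers, including the file name
    List<FileItem> items = upload.parseRequest(request);

    For names arriving via the URL or query string, the Tomcat connector's URIEncoding="UTF-8" attribute in server.xml is the usual companion setting.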

  • JavaMail with non-ascii file name attachment.

    Hello,
    I add a file to the mail whose file name is a Chinese name. After I send it and receive it in another email client (MS Outlook), the attachment's file name is not displayed correctly.
    How can I solve this?

    The MIME standards for non-ASCII filenames are not widely implemented.
    Many mailers use an ad hoc method for encoding filenames. JavaMail
    supports both methods, but you need to set properties, such as the
    mail.mime.encodefilename property. See the JavaMail javadocs for
    the javax.mail.internet package.
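    A sketch of the reply's suggestion (property names are from the JavaMail documentation; the file name and the bodyPart variable are placeholders for the attachment part in your code):

    System.setProperty("mail.mime.encodefilename", "true");   // encode non-ASCII names when sending
    System.setProperty("mail.mime.decodefilename", "true");   // decode them when reading
    // or encode one file name explicitly before setting it:
    bodyPart.setFileName(MimeUtility.encodeText("中文檔名.txt", "UTF-8", "B"));

    Whether Outlook shows the name correctly still depends on which of the two encoding methods it understands; that is the "ad hoc method" the reply refers to.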

  • Problem searching some PDF files in Acrobat Reader – Non-ASCII characters

    Acrobat Reader cannot search some .pdf files.  I have put an example document up on Scribd here.
    Any attempt to search for any word that can be clearly seen to be in the document fails with “No matches were found.”
    This example document is NOT a scanned document – words and characters can be selected.
    A hex display tool shows that the characters in a PDF document that can be successfully searched are in the ASCII/1252 range (A=0x41, etc).
    Copying and pasting characters in the example document to a hex display tool shows that the characters in the document are not in the ASCII range.
    For example the letters A to Z in the example document are in the range ‘A’ = 0xDF (decimal 223), ‘B’ = 0xDE (decimal 222), through to ‘Z’ = 0xC6 (decimal 198).
    However, characters in these non-ASCII ranges are displayed perfectly by Acrobat Reader, as can be seen if the example document is opened.
    Therefore, as Acrobat Reader knows what these characters are, it doesn’t seem unreasonable to say that it should be able to search for and find them.
    Tests were performed using Acrobat Reader X v10.1.4.
    Can anyone say what this problem is?

    Hi Pat, thanks for your reply. 
    Your reference to the title of that page being 'HARNESSES' indicates that, when you view that document in Adobe Reader, you are seeing 'HARNESSES', not
    "ØßÎÒÛÍÍÛÍ".  And that the remainder of the document is similarly being displayed in readable English language.
    Yes as you say, you can search for 'ß' and get hits on 'A' (to use that as an example) in the example document.
    But the need to form a word to be searched for into whatever code mapping this is using (for example having to enter "ØßÎÒÛÍÍ" for HARNESSES - I'm not even sure how that would be entered from a keyboard) doesn't seem to be very convenient.
    It's clear the example document is using some code mapping other than ASCII / Windows-1252 (which has 'A' as 0x41). But it is also clear that Adobe Reader knows what that mapping is, and knows to use it, as it's displaying (for example) 'A' for the code 0xDF.
    So I guess the question is - why isn't Adobe Reader's knowledge of this mapping being extended to its search input? 

  • Problem with special character in Unix file

    Hi All,
    Need some help here. We recently converted our system to Unicode.
    It's regarding a special character (an umlaut) that comes through a 3rd-party system to a Unix file. It gets displayed in the AL11 file as # instead of ö.
    When our program picks the file up from Unix, it also gets # but we want ö.
    We could have done some fixes in our code if the AL11 file at least had ö and we got # in our program.
    But it's the other way round. So how can we get rid of this issue? Please suggest.
    Regards,
    Sanj.

    How is AL11 reading the file? Look for
    OPEN DATASET "yourfilename" IN TEXT MODE ENCODING DEFAULT FOR INPUT
                                  IGNORING CONVERSION ERRORS.
    If the above code is there, you might need to play with the different OPEN DATASET options.
    Also look at Note 1174468 - Non-7bit-ASCII characters used in ABAP Workbench
    and Note 1227961 - Names of text fields with non-7-bit ASCII characters.
    Good Luck !
    ^Saquib

  • Spool non-English character names to a file

    Hi There,
    We have a table which has around a million rows. We just need to select two columns (one of which is a name field with English and non-English names) and spool the data to a file. The problem is that through SQL Developer, if I choose the CSV or DSV option, the non-English names, like Chinese characters etc., show up as question marks. Is there a better way or format to do this? I tried XLS, but although it goes through successfully, when I open the Excel file it has nothing. I was able to do it for around 200,000 rows in Excel.
    Any suggestions?
    Thanks,
    Sun
    Edited by: ryansun on Jun 23, 2012 4:24 AM

    When dealing with non-ASCII characters, two different issues can exist:
    1) data storage - an incorrect byte value is stored
    2) data presentation - an incorrect character is displayed
    Can the utility used to view the CSV file actually display the non-ASCII values properly?
    Can you inspect the CSV file with a hexadecimal editor? What do you see inside the file?
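    If a hex editor is not at hand, a few lines of Java can answer that question: 0x3F bytes ('?') in the file mean the data was already lost during export, while multi-byte UTF-8 sequences mean only the viewer cannot display them. (The file name below is an example.)

    import java.io.FileInputStream;
    import java.io.IOException;

    public class HexDump {
        public static void main(String[] args) throws IOException {
            try (FileInputStream in = new FileInputStream("export.csv")) {
                int b, count = 0;
                while ((b = in.read()) != -1 && count < 256) {   // first 256 bytes are enough to spot the pattern
                    System.out.printf("%02X ", b);
                    if (++count % 16 == 0) {
                        System.out.println();
                    }
                }
            }
        }
    }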

  • Keyword import fails on non-ASCII character

    I recently tried to import a long set of keywords (about 4000 terms). I set up the file in Excel and then tried to import the records. I kept getting this message:
    only text files encoded with ascii or unicode UTF-8 are supported when importing keywords.
    I finally tracked down the problem when I converted the file to an MS Word text file, broke it into parts and eventually found the problem record. For some reason, the apostrophe in the words "don't know" had been corrupted to a weird character. After I corrected this, everything worked.
    However, this took a long time. It would have been helpful if Lightroom could have at least pinpointed the line where the import failed, or offered to convert non-compliant characters to some specific character or set of characters.

    Yeah, that didn't work so well since SuperDuper ran across repeated errors trying to do so; I suspect it's something to do with the drive. (SuperDuper complains about WD's MyBook, which is what the drive is.) Because SD stops the entire copy operation on single errors, it'd be a painstaking process.
    Besides that, I like doing fresh installs of all the bits.

  • Unable to play videos with non-ASCII characters in the filename

    Hi!
    I use a MediaPlayer to display MP4 videos in my application. This works quite well. Unfortunately, I have a problem if the filename of the video to be shown contains non-ASCII characters.
    I get the following message:
    -->file:D:\daten\avi\��� ����.MPG
    Error: Unable to realize com.sun.media.amovie.AMController@4b7651
    Failed to realize
    The first line shows the filename I pass to the setMediaLocation() method of the MediaPlayer object.
    What's wrong? If I rename the file to ABC.mpg it works fine.
    Thanks for your help
    Thomas
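    One thing worth trying (a sketch, not a confirmed fix): build the media location from a File so that the non-ASCII characters end up percent-encoded, instead of concatenating "file:" + path by hand. Whether the underlying com.sun.media.amovie code then accepts the name is a separate question.

    import java.io.File;

    File video = new File("D:\\daten\\avi\\video.MPG");    // hypothetical path with non-ASCII characters
    String location = video.toURI().toASCIIString();       // file:/D:/daten/avi/... with escapes
    player.setMediaLocation(location);                     // 'player' is the MediaPlayer from the post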


Maybe you are looking for

  • Recovery of files from external hard drive after restoring pc

    I had to restore my PC due to losing Windows apps. During the process I was instructed to download all my files to an external source for recovery later. Done now on an external hard drive. PC is now up and running fine. When trying to recover files it asks

  • Media Manager not working properly after installing update

    I have an 8110 and upgraded to desktop ver. 5.0. Since that time, Media Manager does not recognize the 8110 to put pictures on the computer. What do I need to do to correct this? Thanks. Solved! Go to Solution.

  • Please, if you can help me

    Dear Sir, I have a file sprot.dat, each set of rows separated by //. When I use DataInputStream in = new DataInputStream(new FileInputStream("test.dat")); while ((in.readLine()) != null) System.out.println(in.readLine()); the result is not as I expec

  • Powerview - How to print or export a complete data set

    Does anyone know how to export or print all the records from a PowerView table?  It only prints or exports what is visible.  Thanks!

  • Need script to automate creation of AD sites,subnets and site links

    Hello, I have a requirement to create hundreds of sites and thousands of subnets in AD and also site links. I have an excel file with the details.  Can somebody help me with a vbscript/powershell script to automate this and create all the AD sites, s