Shapefile encoded in UTF-8

Hi all,
I am interested in importing a shapefile that is encoded in UTF-8. When I import the shapefile with the help of Map Builder, I lose my Unicode characters.
The NLS_NCHAR_CHARACTERSET value of my database is AL16UTF16.
Any suggestion on how I can import these Unicode characters?

If your CSV API offers the option of saving as UTF-8 (and it should), that would be the best way to go. Otherwise, you can use InputStreamReader and OutputStreamWriter to convert the file.
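A minimal sketch of that stream-based conversion, assuming the source file is in the platform default encoding (the file names and buffer size here are placeholders, not anything from the original thread):

    import java.io.*;
    import java.nio.charset.Charset;

    public class ToUtf8 {
        public static void main(String[] args) throws IOException {
            // Read with the assumed source encoding, write with UTF-8;
            // try-with-resources closes both streams.
            try (Reader in = new InputStreamReader(
                     new FileInputStream("in.csv"), Charset.defaultCharset());
                 Writer out = new OutputStreamWriter(
                     new FileOutputStream("out.csv"), "UTF-8")) {
                char[] buf = new char[4096];
                int n;
                while ((n = in.read(buf)) != -1) {
                    out.write(buf, 0, n);
                }
            }
        }
    }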

Similar Messages

  • File encoding with UTF-8

    Hello all,
    My scenario is IDoc -> XI -> File (txt).
    Everything was working fine until I had to handle Eastern European languages with unusual symbols.
    So in my receiver file adapter I'm using the file encoding UTF-8, and when I look at my fields in the output, everything is fine.
    BUT when I look at the binary, the length of these fields is no longer fixed, because a special character takes 2 bytes instead of one.
    I would like to know if it's possible to handle those characters with UTF-8 file encoding in a fixed-length field of 40 characters, for example; I don't want a variable length for my fields...
    Thanks by advance,
    JP

    I agree with you. In XI I don't have this problem; I have it in my output file when I edit my text file in binary mode!
    My fields should be 40 characters, but the special symbols which take 2 bytes instead of 1 make the length of my output fields variable!
    My question was whether there is a way to have a fixed length in my output file (one approach is sketched below).
    Sorry if I wasn't clear in my first post.
    JP
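    One way to get fixed physical field widths with UTF-8 is to pad and truncate by bytes rather than by characters. A minimal Java sketch of the idea, assuming a field width of 40 bytes and space padding (a real implementation would also avoid cutting a multi-byte character in half):

        import java.io.UnsupportedEncodingException;
        import java.util.Arrays;

        public class FixedWidthField {
            // Pad or truncate a value so it always occupies exactly `width` bytes in UTF-8.
            static byte[] toFixedBytes(String value, int width) throws UnsupportedEncodingException {
                byte[] raw = value.getBytes("UTF-8");
                byte[] fixed = new byte[width];
                Arrays.fill(fixed, (byte) ' ');  // space padding
                System.arraycopy(raw, 0, fixed, 0, Math.min(raw.length, width));
                return fixed;                    // always exactly `width` bytes
            }
        }

    The trade-off is that a 40-byte field then holds fewer than 40 characters whenever special characters occur.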

  • Unable to set JVM file encoding to UTF-8 on Windows

    Hi,
    I am running Tomcat on the 1.5.0_05 JRE. I have tried several things to set the JVM file encoding to UTF-8 instead of the default Cp1252, but no luck yet.
    The most intuitive approach seems to be to use a JVM option like
    "-Dfile.encoding=UTF-8"
    but this does not seem to have any effect. I have a WinXP Pro machine. I saw some bug reports which seemed to indicate that changing the JVM file encoding is not an available feature... is that correct? I would really appreciate any help/pointers on this. I will post the solution if I find something in the meantime.
    Thanks,
    Sriram

    I failed to set it too. I think it would be better to split file.encoding in two: one for talking to the local OS, and another for compiling .java, .jsp, and similar files. Then we could change them independently and there would be fewer bugs!
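    Before concluding that the flag has no effect, it can help to print what the JVM actually picked up; Charset.defaultCharset() reflects the encoding the runtime settled on. A small sketch:

        import java.nio.charset.Charset;

        public class EncodingCheck {
            public static void main(String[] args) {
                // The raw system property, as passed via -Dfile.encoding (if at all)
                System.out.println("file.encoding  = " + System.getProperty("file.encoding"));
                // The charset the runtime actually uses for new readers/writers
                System.out.println("defaultCharset = " + Charset.defaultCharset());
            }
        }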

  • How to set File Encoding to UTF-8 On Save action in JDeveloper 11G R2?

    Hello,
    I am facing an issue when modifying a file using JDeveloper 11G R2: JDeveloper is changing the encoding of the file to the system default encoding (ANSI) instead of UTF-8. I have updated the encoding to UTF-8 in the "Tools | Preferences | Environment | Encoding" option and restarted JDeveloper. I have also updated the "Project Properties | Compiler | Character Encoding" option to UTF-8. Neither of them is working.
    I am using the below version of JDeveloper:
    Oracle JDeveloper 11g Release 2 11.1.2.3.0
    Studio Edition Version 11.1.2.3.0
    Product Version: 11.1.2.3.39.62.76.1
    I created a file in UTF-8 encoding, opened it, made some changes, and saved it.
    When I open the "Properties" tab using the "Help | About" menu, I can see that the properties of JDeveloper show the encoding as Cp1252. Is that related?
    Properties:
    sun.jnu.encoding = Cp1252
    file.encoding = Cp1252
    Any idea how to make sure JDeveloper always saves the file in UTF-8?
    - Sujay

    I have already done that. That is the first thing I did, as mentioned in my thread. I have also added the below 2 options to jdev.conf and restarted JDeveloper, but that also did not work.
    AddVMOption -Dfile.encoding=UTF-8
    AddVMOption -Dsun.jnu.encoding=UTF-8
    - Sujay

  • CSV file encoded as UTF-8 loses characters when displayed with Excel 2010

    Hello everybody,
    I have adapted a customer report to be able to send certain data via mail as a CSV attachment.
    For that purpose I am using class cl_bcs.
    Everything goes fine, but since the mail attachment contains certain German characters such as Ü, those characters appear corrupted when the file is displayed with Excel.
    It seems the problem is with Excel, because when opening the same file with Notepad, the Ü is there. If I import the file into Excel with the import wizard, it is correct too.
    Anyway, is there any solution to this problem?
    I have tried concatenating byte_order_mark_utf8 at the beginning of the file, but Excel still does not recognize it.
    Thanks in advance,
    Pablo.
    Edited by: katathema on Jan 31, 2012 2:05 PM

    - Does MS Excel actually support UTF-8?
    Yes. I believe we installed some international add-on which is not in the default installation. Anyway, other UTF-8 or UTF-16 files can be opened and viewed by Excel without any problem.
    - Have you verified that the file is viewable as a UTF-8-encoded file?
    I think so. If I open it in Notepad and choose "Save As", the file type is UTF-8.
    - Try opening the file in a program you are confident supports UTF-8, e.g. Mozilla...
    I will try that.
    - Check that your UTF-8-encoded file has a UTF-8 identifier (0xFEFF?) as the first character.
    The UTF-16 (LE or BE) files I got from the internet always have two bytes at the front (0xFEFF or 0xFFFE). My UTF-8 file generated by Java doesn't have that. But should a UTF-8 file also have this kind of special bytes at the front? When I manually added these bytes at the front of my file using UltraEdit and opened it in Excel 2000, it didn't help.
    - Try using another spreadsheet program that supports UTF-8.
    Do you know any other spreadsheet program that supports CSV files and UTF-8?
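    For Excel specifically, the usual trick is to write the three-byte UTF-8 BOM (0xEF 0xBB 0xBF, which is U+FEFF encoded in UTF-8, not the raw UTF-16 bytes FE FF) before the CSV content. A hedged Java sketch with a placeholder file name; recent Excel versions detect this BOM, though as noted above Excel 2000 apparently does not:

        import java.io.FileOutputStream;
        import java.io.OutputStreamWriter;
        import java.io.Writer;

        public class CsvWithBom {
            public static void main(String[] args) throws Exception {
                FileOutputStream fos = new FileOutputStream("report.csv");
                // UTF-8 BOM: the byte sequence Excel looks for to auto-detect UTF-8
                fos.write(new byte[] { (byte) 0xEF, (byte) 0xBB, (byte) 0xBF });
                Writer w = new OutputStreamWriter(fos, "UTF-8");
                w.write("Name;City\nMüller;Düsseldorf\n");
                w.close();
            }
        }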

  • How to set the system default file character encoding to UTF-8?

    Hi all. This is driving me nuts on both my Windows box and Snow Leopard; I figure there's a much better chance of finding the answer for OS X.
    My language and locale are set to Australian English. $LANG=en_AU.UTF-8
    However, as I believe is expected, OS X (and Windows for that matter) will create files by default with character encoding of Cp1252 (Latin-1). That is, the FILE encoding in the file metadata - the Byte Order Mark I believe. The file itself, not the characters written to it.
    This, in a word, bites. I don't want to be restricted to only ASCII by default, and it is causing me problems with certain software (a Firefox plugin) that creates text files, passing in UTF-8 encoded content, which is then mangled because the file encoding itself is still Cp1252. (I know, I've tested this by changing the file encoding manually and having it overwritten again by the plugin: works correctly.)
    As a simple example, just `touch somefile` from terminal creates a file in Cp1252 -- I'm obtaining that info by opening in jEdit by the way (anyone know of something better?).
    In other locales that are not English-based, I believe the default file encoding is UTF-8. But surely this can be controlled independently? There must be a system configuration value somewhere that specifies file encoding default. Can someone please tell me what it is?
    Thanks!

    However, as I believe is expected, OS X (and Windows for that matter) will create files by default with character encoding of Cp1252 (Latin-1). That is, the FILE encoding in the file metadata - the Byte Order Mark I believe. The file itself, not the characters written to it.
    Apps like TextEdit and Mail have settings that let you determine the encoding of text produced. The default would normally depend on the character content of the file, ranging from ASCII for basic English to Windows Latin-1 (Win 1252) or ISO Latin -1 (ISO 8859-1) to UTF-8 for other content.
    Win 1252 is not ASCII; it has twice the number of characters of the latter.
    Byte Order Mark is something totally different: it's a particular character used to signal certain encodings.
    http://en.wikipedia.org/wiki/Byte_order_mark
    As a simple example, just `touch somefile` from terminal creates a file in Cp1252 -- I'm obtaining that info by opening in jEdit by the way (anyone know of something better?).
    For what Terminal does and how to change it, it might be best to post in the Unix forum:
    http://discussions.apple.com/forum.jspa?forumID=735
    For problems with a Firefox plugin, it might be good to ask on their own forums as well.

  • File adapter, File encoding national characters

    Hi,
    I have a problem with national characters (ÅÄÖ) when sending (receiver adapter) files with the file adapter.
    When I specify Transfer Mode = Binary and File Type = Binary everything works fine, but when I use Transfer Mode = Text the national characters get converted to "?". I have tried to set File Type = Text and tried File Encoding with UTF-8 and ISO-8859-1, without success.
    Please help!
    Regards
    Claes

    Hi,
    Check this out: "How To… Work with Character Encodings in Process Integration": https://www.sdn.sap.com/irj/sdn/go/portal/prtroot/docs/library/uuid/502991a2-45d9-2910-d99f-8aba5d79fb42
    Regards,
    Jakub

  • Corrupt XML file encoding utf-8 special chars (IDOC - File scenario)

    Dear experts,
    I have a problem with the XML output files of XI and could not find the answer in one of the current posts.
    I'm sending Master Data from R/3 with IDocs through XI to an FTP directory. These files include characters such as Á, Ê, etc.
    The XI server includes the utf-8 encoding declaration in the output XML message. However, when opening these files I receive errors (I tried it in multiple programs): they tell me that Á is not UTF-8.
    They will not accept Á. I was under the impression that UTF-8 includes extended Latin and thus would accept these characters, which implies that the message was created wrong. Importing these files into MDM Import Manager gives errors too.
    All rfc destinations are on Unicode.
    By the way, we experience the same problem when syndicating files from the MDM server.
    Any suggestions?
    Cheers.
    * Will reward points for helpful answers.

    Hi,
    Check out this guide:
    https://www.sdn.sap.com/irj/sdn/go/portal/prtroot/docs/library/uuid/502991a2-45d9-2910-d99f-8aba5d79fb42
    Make use of the MessageTransformBean: http://help.sap.com/saphelp_nw04/helpdata/en/57/0b2c4142aef623e10000000a155106/content.htm
    Also, for further reference, go through this thread: Change encoding from utf-8 to iso-8859-1 in JMS receiver
    regards
    sasi.........
    Reward if useful
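    A quick way to see the underlying issue: in ISO-8859-1, Á is the single byte 0xC1, and a lone 0xC1 can never be valid UTF-8 (bytes 0xC0-0xDF must be followed by a continuation byte in the range 0x80-0xBF). So a file that is really ISO-8859-1 but declared as UTF-8 produces exactly the "Á is not utf-8" error. A hypothetical Java sketch:

        import java.nio.charset.Charset;

        public class WhyNotUtf8 {
            public static void main(String[] args) {
                byte[] latin1Bytes = { (byte) 0xC1 };   // 'Á' in ISO-8859-1
                // Decoding the lone Latin-1 byte as UTF-8 yields the replacement character:
                System.out.println(new String(latin1Bytes, Charset.forName("UTF-8")));
                // Correctly encoded, 'Á' (U+00C1) is two bytes in UTF-8: C3 81
                byte[] utf8Bytes = "\u00C1".getBytes(Charset.forName("UTF-8"));
                System.out.printf("%02X %02X%n", utf8Bytes[0] & 0xFF, utf8Bytes[1] & 0xFF);
            }
        }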

  • UTF-8 file encoding issues within Java?

    I'm working on an application that takes data from an IBM mainframe (z/OS), converts it from IBM-1047 encoding to UTF-8 (via the iconv utility), and binary-FTPs it to a Unix box, where we process the file with our Java app and return the processed file.
    Within our Java app on the Unix platform we stream the file into a byte array and then create a new String from the byte array specifying "UTF-8" as the encoding parameter.
    The problem is that Java appears to be taking certain 2 byte UTF-8 characters and converting them to a single char.
    E.g. I have a \uC3A6 char in the input file; I can view the bytes in the byte array that's read in, and it's still \uC3A6, but as soon as I create the new String with UTF-8 encoding and view the bytes, those 2 bytes are now shown as a single byte (0xE6). The code I have that's looking for the char \uC3A6 then fails.
    Can anyone explain what's happening here? Sorry for the long message.

    The encodings which convert the character (char)0xC3A6 to the 2-entry byte array {0xC3, 0xA6} (unsigned) are "UTF-16BE", "UnicodeBigUnmarked", and "UnicodeBig." These are essentially identical except for the use of byte-order mark. As was said above UTF-8 converts (char)0xC3A6 to the 3-entry byte array {0xEC, 0x8E, 0xA6} (unsigned).
    http://java.sun.com/j2se/1.4.1/docs/guide/intl/encoding.doc.html
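    In other words, the behavior described above is UTF-8 decoding working as designed: the two bytes 0xC3 0xA6 in the file are the UTF-8 encoding of the single character U+00E6 (æ), so the resulting String holds one char. A small sketch:

        import java.io.UnsupportedEncodingException;

        public class DecodeDemo {
            public static void main(String[] args) throws UnsupportedEncodingException {
                byte[] fromFile = { (byte) 0xC3, (byte) 0xA6 };   // two bytes on disk
                String s = new String(fromFile, "UTF-8");
                System.out.println(s.length());                   // 1: a single char
                System.out.printf("U+%04X%n", (int) s.charAt(0)); // U+00E6 ('æ')
            }
        }

    Code searching for that character in the decoded String should therefore compare against '\u00E6', not against the byte pair.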

  • Text file attachment in UTF-8 encoding

    Hi
    I have written a program which sends mails to the users with a text file attached. The problem is that when you save the text file to the local desktop (by clicking on Save As), the encoding is ANSI by default. I want to make the encoding UTF-8. Is it possible to change this in the program?
    thanks
    sankar

    OPEN DATASET - encoding
    Syntax
    ... ENCODING { DEFAULT
                 | {UTF-8 [SKIPPING|WITH BYTE-ORDER MARK]}
                 | NON-UNICODE } ... .
    Alternatives:
    1. ... DEFAULT
    2. ... UTF-8 [SKIPPING|WITH BYTE-ORDER MARK]
    3. ... NON-UNICODE
    Effect
    : The additions after ENCODING determine the character representation in which the content of the file is handled. The addition ENCODING must be specified in Unicode programs and may only be omitted in non-Unicode programs. If the addition ENCODING is not specified in non-Unicode programs, the addition NON-UNICODE is used implicitly.
    Note
    : It is recommended that files are always written in UTF-8, if all readers can process this format. Otherwise, the code page can depend on the text environment and it is difficult to identify the code page from the file content.
    Alternative 1
    ... DEFAULT
    Effect
    : In a Unicode system, the specification DEFAULT corresponds to UTF-8, and in a non-Unicode system, it corresponds to NON-UNICODE.
    Alternative 2
    ... UTF-8 [SKIPPING|WITH BYTE-ORDER MARK]
    Addition:
    ... SKIPPING|WITH BYTE-ORDER MARK
    Effect
    : The characters in the file are handled according to the Unicode character representation UTF-8.
    Notes
    : The class CL_ABAP_FILE_UTILITIES contains the method CHECK_UTF8 for determining whether a file is a UTF-8 file.
    A UTF-16 file can only be opened as a binary file.
    Addition
    ... SKIPPING|WITH BYTE-ORDER MARK
    Effect
    : This addition defines how the byte order mark (BOM), with which a file encoded in the UTF-8 format can begin, is handled. The BOM is a sequence of 3 bytes that indicates that a file is encoded in UTF-8.
    SKIPPING BYTE-ORDER MARK
    is only permitted if the file is opened for reading or changing using FOR INPUT or FOR UPDATE. If there is a BOM at the start of the file, this is ignored and the file pointer is set after it. Without the addition, the BOM is handled as normal file content.
    WITH BYTE-ORDER MARK
    is only permitted if the file is opened for writing using FOR OUTPUT. When the file is opened, a BOM is inserted at the start of the file. Without the addition, no BOM is inserted.
    The addition BYTE-ORDER MARK cannot be used together with the addition AT POSITION.
    Notes
    : When opening UTF-8 files for reading, it is recommended to always enter the addition SKIPPING BYTE-ORDER MARK so that a BOM is not handled as file content.
    It is recommended to always open a file for writing as UTF-8 with the addition WITH BYTE-ORDER MARK, if all readers can process this format.
    Alternative 3
    ... NON-UNICODE
    Effect
    : In a non-Unicode system, the data is read or written without conversion. In a Unicode system, the characters of the file are handled according to the non-Unicode codepage that would be assigned at the time of reading or writing in a non-Unicode system according to the entry in the database table TCP0C of the current text environment.

  • How to Set "file.encoding" System Property to default "UTF-8"

    When I execute my code, some special characters are not displayed correctly, so I am trying to set the "file.encoding" system property to "UTF-8" programmatically, using the command System.setProperty("file.encoding", "UTF-8"), and it is not working.
    If I run my jar using the command java -Dfile.encoding=UTF-8 -jar myprog.jar, it works and my special characters also look right.
    Can I set this default encoding programmatically?
    Thanks
    Ashish Pancholi

    Hello,
    I have the same problem. I have a Java program that is started with "-Dfile.encoding=ISO-8859-1". Now in this program I want to print some characters using the UTF-8 encoding, because I know that the terminal I will be printing on has this encoding. I tried using InputStreamReader without success:
        InputStreamReader isr = new InputStreamReader(new ByteArrayInputStream("Müller".getBytes()), "UTF-8");
        BufferedReader br = new BufferedReader(isr);
        String line = null;
        while ((line = br.readLine()) != null) {
            System.out.println(line);
        }
    EDIT:
    The above example is for reading something into my Java program. If I want to write something from my Java class to an output, it goes like this:
        Writer out = new BufferedWriter(new OutputStreamWriter(System.out, "UTF8"));
        out.write("Müller\n");
        out.flush();
    ... in that case I get the correct encoding.
    Thanks,
    T
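    Since file.encoding is read once at JVM startup, System.setProperty has no reliable effect afterwards; the portable alternative is to set the encoding per stream rather than globally. A sketch that rewraps standard output (replacing System.out is just one option, shown here as an assumption about what is wanted):

        import java.io.FileDescriptor;
        import java.io.FileOutputStream;
        import java.io.PrintStream;

        public class Utf8Stdout {
            public static void main(String[] args) throws Exception {
                // Wrap stdout with an explicit encoding instead of relying on file.encoding
                PrintStream utf8 = new PrintStream(
                        new FileOutputStream(FileDescriptor.out), true, "UTF-8");
                System.setOut(utf8);
                System.out.println("Müller");
            }
        }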

  • When creating a new table in a SQLite db via Flex it becomes encoded as "utf-16le"

    Hi Guys
    I have an annoying problem with my AIR application.
    The application is communicating with a local DB (SQLite).
    As part of the initial installation I'm checking if the db exists.
    In case it does not, then:
    I create one (file),
    create the relevant tables inside,
    and populate them.
    For some reason, on the table creation step the SQLite db becomes encoded as UTF-16le instead of UTF-8.
    The question is: how can I make the table creation step leave the db as UTF-8?
    thanks in advance for your help.
    This is my creation code.
    The "connection" is of type flash.data.SQLConnection.
    The "file" contains the following information:
    <sql>
    <statement>
    CREATE TABLE IF NOT EXISTS MYTABLE (
          MYTABLE_VERSION NUMBER NOT NULL,
          MYTABLE_INSERT_DATE DATE NOT NULL
    )
    </statement></sql>
    Below is the relevant code:
        var stream:FileStream = new FileStream();
        stream.open(file, FileMode.READ);
        var xml:XML = XML(stream.readUTFBytes(stream.bytesAvailable));
        stream.close();
        var statement:XML = null;
        try
        {
            connection.begin(lockType);
            for each (statement in xml.statement)
            {
                var stmt:SQLStatement = new SQLStatement();
                stmt.sqlConnection = connection;
                stmt.text = statement.toString();
                stmt.execute();
            }
        }
        catch (err:Error)
        {
            connection.rollback();
            throw err;
        }
        connection.commit();

    It doesn't look like you're using the DBSequence domain for the OrderLinesId attribute. If you are, then you do not need to fill in the sequence as you've done in the create method.
    Getting back to the create issue: you may want to set the 'order' id (foreign key) values before calling super, and then call the getOrder() (or getXXX, where XXX is the order accessor in this entity) method to verify that the order with the given ID exists/is found in the cache.
    By the way, are you also using a similar create() in Order, with DBSequence as the type for the PK, where you force a sequence value on top of it via setAttribute?
    Yes, this is the create method inside CrpOrderLinesImpl.java
    protected void create(AttributeList attributeList) {
        super.create(attributeList);
        SequenceImpl s = new SequenceImpl("CRP_ORDER_LINES_ID_SEQ", getDBTransaction());
        setAttribute("OrderLinesId", s.getSequenceNumber());
    }
    Thanks,
    Brad

  • How to specify file.encoding=ANSI format in the J2SE adapter

    Hi All,
    We are using the J2SE plain adapter; we need the output data in ANSI format.
    The default is file.encoding=UTF-8.
    How can we achieve this?
    thanks in advance.
    Regards,
    Mohamed Asif KP

    The file adapter would behave in a similar fashion on J2EE. Providing you the link to the ongoing discussion:
    is ANSI ENCODING possible using file/j2see adapter
    Regards,
    Prateek

  • How to save a file in unicode (UTF-8)

    Hello,
    I'm trying to save an XML file in Unicode (UTF-8) on a 4.6C system. I tried OPEN DATASET 'file' IN TEXT MODE FOR OUTPUT ENCODING UTF-8, but this is not available in 4.6C. Does anybody have an idea how to do this?
    Thanks in advance
    Kind regards
    Roel

    Hi Roel,
    There is a workaround for this issue.
    Use code below:
    data: encoding(10) type c value 'utf-8',
          codepage     type cpcodepage.
      call function 'SCP_CODEPAGE_BY_EXTERNAL_NAME'
        exporting
          external_name = encoding
        importing
          sap_codepage  = codepage
        exceptions
          not_found     = 1
          others        = 2.
      if sy-subrc <> 0.
      endif.
      call function 'SCP_TRANSLATE_CHARS'
        exporting
          inbuff           = sourcedata_xml
          inbufflg         = length
          incode           = codepage
          outcode          = codepage
          substc_space     = 'X'
          substc           = '00035'
        importing
          outbuff          = custom_data
        exceptions
          invalid_codepage = 1
          internal_error   = 2
          cannot_convert   = 3
          fields_bad_type  = 4
          others           = 5.
    Now write this custom_data to the application server by using OPEN DATASET and TRANSFER.
    Also have a look at this weblog, there is a code sample in it.
    /people/thomas.jung3/blog/2004/08/31/bsp-150-a-developer146s-journal-part-x--igs-charting
    Hope it'll help.
    Cheers
    Ankur

  • File encoding question

    Just attempting to open some file and read it to the screen with BufferedReader, and no encoding is handled anywhere...
    I tried to save an HTML file as .txt and just use that as a test file, but the encoding was different from ??? (the standard encoding, I guess) and reading stopped at some point in the text file.
    I opened Notepad2 and it says under Encoding: ANSI; I read a bit on the forums and see UTF-16 mentioned a bit as well... So my question is just: what type of encoding is a normal standard text file?

    Hi,
    rugae wrote:
    So if I change my regional settings to Chinese or something and create a document with Chinese characters and attempt to read the file
    If you create the file with an editor, you save it with the encoding that is set in your editor. If you save it with Java, use OutputStreamWriter with a defined charset (see http://www.exampledepot.com/egs/java.io/WriteToUTF8.html).
    rugae wrote:
    say with BufferedReader or just Scanner line by line with no implementation at all to handle different encodings, to console or another file is that going to give me correct outputs? (yes/no would be sufficient to clear it up for me...)
    Maybe ;-) Finally you have to know the encoding of the file, or you have to guess it. If it is an HTML document, there are HTML headers that show the encoding. If it is transported via HTTP, then there is also an HTTP header Content-Type with an encoding declaration. Firefox shows this to the user with [Tools]-[Page Info]. http://www.beijing.gov.cn/ for example is encoded in GB2312. Some Taiwanese pages may be encoded in BIG5. Encodings aka charsets are defined by IANA: http://www.iana.org/assignments/character-sets.
    If you know the encoding (charset) of the file, then read it with InputStreamReader and the right charset, as shown in http://www.exampledepot.com/egs/java.io/ReadFromUTF8.html.
    greetings
    Axel
    ps: In java Charset.defaultCharset() shows you the default charset of this Java virtual machine.
    Edited by: Axel_Richter on Oct 9, 2007 8:12 PM
    Edited by: Axel_Richter on Oct 9, 2007 8:17 PM
