Byte to Chars converter

Hi,
I'm looking for a converter from a byte array to chars for different encodings (ANSI, UTF-8, UTF-16BE and UTF-16LE).
I strongly believe there is a library/class that can do this.
Thanks in advance.
Lubos

Well, the problem is that I have to count the number of bytes and look for special characters (like Form Feed).
The only possibility I've found is to write my own parser.
I've tried InputStreamReader with my own implementation of InputStream for counting the bytes, but it seems the method read(byte[]) is not called every time.
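
For byte data that is already in memory, the standard java.nio.charset classes may be enough. The following is only a sketch, assuming Java 7 or later, well-formed input without a byte order mark, and windows-1252 as a stand-in for "ANSI"; the class and method names are made up for illustration. It finds the form feed in the decoded text and re-encodes the prefix to get the byte offset, instead of counting bytes inside a custom InputStream.

```java
import java.nio.charset.Charset;
import java.nio.charset.StandardCharsets;

class ByteDecodingSketch {
    /** Decodes the whole array with the given charset. */
    static String decode(byte[] data, Charset cs) {
        return new String(data, cs);
    }

    /**
     * Returns the byte offset of the first form feed (U+000C) in data,
     * or -1 if there is none. Assumes data is well-formed in cs, so that
     * re-encoding the decoded prefix gives back the original bytes.
     */
    static int formFeedByteOffset(byte[] data, Charset cs) {
        String text = decode(data, cs);
        int charIndex = text.indexOf('\f');
        if (charIndex < 0) {
            return -1;
        }
        return text.substring(0, charIndex).getBytes(cs).length;
    }

    public static void main(String[] args) {
        String sample = "caf\u00e9\fnext page";
        Charset[] charsets = {
            StandardCharsets.UTF_8,
            StandardCharsets.UTF_16BE,
            StandardCharsets.UTF_16LE,
            Charset.forName("windows-1252") // assumption: stands in for "ANSI"
        };
        for (Charset cs : charsets) {
            byte[] bytes = sample.getBytes(cs);
            System.out.println(cs + ": " + bytes.length + " bytes, form feed at byte "
                    + formFeedByteOffset(bytes, cs));
        }
    }
}
```

For streams too large to hold in memory, a CharsetDecoder fed from a ByteBuffer would be the closer fit, since the buffer position tells how many bytes have been consumed.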

Similar Messages

  • From English Char (Single Byte) to Double Byte (Japanese Char)

    Hello everyone!
    I need help!
    I am new to Java. I got an assignment where I need to check a string: if the string contains any non-Japanese character (a~z, A~Z, 0~9), that character should be replaced with its double-byte (Japanese) form.
    I am using Java 1.2.
    Please guide me.
    Thanks and regards,
    Maruti Chavan

    Hello,
    As you all asked for the detailed requirement, here I am pasting C code where 'a' is passed as the input character. After processing, it gives me the double-byte Japanese "A", and that is what I want. Using this I am able to convert alphanumerics from single byte to double byte (Japanese).
    I want the same program in Java, so please guide me.
    #include <stdio.h>
    #include <string.h>
    int main( int argc, char *argv[] )
    {
        char c[2];
        char d[3];
        strcpy( c, "a" );      // a is the input char
        d[0] = 0xa3;           // lead byte of the full-width alphanumeric row (EUC-JP)
        d[1] = c[0] + 0x80;    // trail byte: the ASCII code with the high bit set
        d[2] = '\0';           // terminate the string before printing it
        printf( ":%s:\n", c ); // original single-byte char
        printf( ":%s:\n", d ); // converted double-byte char
        return 0;
    }
    Please help.
    Thanks and regards,
    Maruti Chavan
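
    In Java a String holds Unicode characters rather than EUC-JP bytes, so the lead-byte arithmetic of the C program above does not carry over directly. A rough equivalent, offered as a sketch rather than a definitive port, is to shift each ASCII letter or digit into the Unicode full-width block (U+FF01 to U+FF5E mirrors ASCII 0x21 to 0x7E at an offset of 0xFEE0). The class and method names are made up for illustration.

    ```java
    class FullWidthSketch {
        /** Maps ASCII letters and digits to their full-width forms; other chars are left alone. */
        static String toFullWidth(String s) {
            StringBuffer sb = new StringBuffer(s.length());
            for (int i = 0; i < s.length(); i++) {
                char ch = s.charAt(i);
                boolean asciiAlnum = (ch >= '0' && ch <= '9')
                        || (ch >= 'A' && ch <= 'Z')
                        || (ch >= 'a' && ch <= 'z');
                // full-width forms sit at a fixed offset of 0xFEE0 from ASCII
                sb.append(asciiAlnum ? (char) (ch + 0xFEE0) : ch);
            }
            return sb.toString();
        }

        public static void main(String[] args) {
            // prints the full-width forms of a, 1 and Z
            System.out.println(toFullWidth("a1Z"));
        }
    }
    ```

    Writing the result out as double-byte data is then a separate encoding step, done with whatever Japanese charset the target system expects.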

  • How to find out if a column is defined as VARCHAR2 in bytes or char?

    Hello,
    I'd like to know if it is possible to find out whether a column of a table (or view) is defined as VARCHAR2 in BYTE or in CHAR on Oracle 10g.
    When I do a desc, it shows only VARCHAR2 with its length, but not whether it is in bytes or chars. How can I know for sure?
    Thanks,

    SQL> create table t
      (
      id    varchar2 (10 char),
      id2   varchar2 (10 byte)
      )
      /
    Table created.
    SQL> select column_name, data_type, char_used
      from cols
     where table_name = 'T'
      /
    COLUMN_NAME                                   DATA_TYPE       CHAR_USED
    ID                                            VARCHAR2        C
    ID2                                           VARCHAR2        B
    2 rows selected.

  • Oracle best practices for changing BYTE to CHAR on VARCHAR2 columns

    Dear Team,
    The application team wants to change BYTE to CHAR on VARCHAR2 columns to accommodate multibyte characters on a couple of production tables.
    I want to know whether it is safe to have a mixture of BYTE and CHAR semantics in the same table. I have read in a couple of documents that it's good practice to avoid mixing BYTE and CHAR semantics columns in the same table.
    What happens if we have a mixture of BYTE and CHAR semantics columns in the same table?
    Do we need to gather stats and rebuild indexes on the table after these column changes?
    Thanks in advance!
    SK

    "Application Team wanted to change Byte to Char on Varchar2 columns to accommodate Multi byte character on couple of production tables.
    Wanted to know is it safe to have mixture of BYTE and CHAR semantics in the same table..."
    No change is needed to 'accommodate multibyte characters'. That support has NOTHING to do with whether a column is specified using BYTE or CHAR.
    In 11g the limit for a VARCHAR2 column is 4000 bytes, period. If you specify CHAR and try to insert 1001 characters that each take 4 bytes, you will get an exception, since that would require 4004 bytes and the limit is 4000 bytes.
    In practice the use of CHAR is mostly a convenience to the developer when defining columns for multibyte characters. For example, for a NAME column you might want to make sure Oracle will allow room for 50 characters REGARDLESS of the actual length in bytes.
    If you provide a name of 50 one-byte characters then only 50 bytes will be used. Provide a name of 50 four-byte characters and 200 bytes will be used.
    So if that NAME column was defined using BYTE, how would you know what length to use for the column? Fifty BYTES will seldom be long enough, and 200 bytes SEEMS large since the business user wants a limit of FIFTY characters.
    That is why such columns would typically use CHAR: so that the length (fifty) defined for the column matches the logical length in characters.
    "What happens if we have mixture of BYTE and CHAR semantics columns in the same table?"
    Nothing happens - Oracle couldn't care less.
    "Do we need to gather stats & rebuild indexes on the table after these column changes."
    No - not if by 'need' you mean simply because you made ONLY that change.
    But that raises the question: if the table already exists, has data and has been in use without there being any problems, then why bother changing things now?
    In other words: if it ain't broke, why try to fix it?
    So back to your question of 'best practices'.
    Best practice is to set the length semantics at the database level when the database is first created, and then to use that same setting (BYTE or CHAR) when you create new objects or make DDL changes.
    Best practice is also to not fix things that aren't broken.
    See the 'Length Semantics' section of the globalization support guide for more best practices:
    http://docs.oracle.com/cd/E11882_01/server.112/e10729/ch2charset.htm#i1006683
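
    The byte counts in the reply above can be checked outside the database. A minimal Java sketch, assuming the database character set is UTF-8 (AL32UTF8 in Oracle terms) and Java 7 or later; the class and method names are made up for illustration:

    ```java
    import java.nio.charset.StandardCharsets;

    class LengthSemanticsSketch {
        /** Number of Unicode characters (code points) in s: the CHAR-semantics length. */
        static int charLength(String s) {
            return s.codePointCount(0, s.length());
        }

        /** Number of bytes s occupies in UTF-8: the BYTE-semantics length. */
        static int utf8ByteLength(String s) {
            return s.getBytes(StandardCharsets.UTF_8).length;
        }

        /** Builds a string made of count copies of one code point. */
        static String repeat(int codePoint, int count) {
            StringBuilder sb = new StringBuilder();
            for (int i = 0; i < count; i++) {
                sb.appendCodePoint(codePoint);
            }
            return sb.toString();
        }

        public static void main(String[] args) {
            String ascii = repeat('a', 50);     // one byte per character in UTF-8
            String wide = repeat(0x1F600, 50);  // four bytes per character in UTF-8
            System.out.println(charLength(ascii) + " chars, " + utf8ByteLength(ascii) + " bytes");
            System.out.println(charLength(wide) + " chars, " + utf8ByteLength(wide) + " bytes");
        }
    }
    ```

    Both strings fit a 50 CHAR column, but only the first fits a 50 BYTE column.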

  • NLS_LENGTH_SEMANTICS from BYTE to CHAR

    Hi,
    For supporting multibyte characters, we need to change the NLS_LENGTH_SEMANTICS parameter from BYTE to CHAR. But this setting takes effect only for database tables created thereafter. To change the storage characteristics of existing tables, we explicitly executed ALTER statements for the columns with datatype "Varchar2" and "Char".
    Problem:
    ======
    Since the number of database tables in PRODUCTION is very high (approx. 600 million records spread over 400 tables), we cannot afford the time it would take to alter these tables in PRODUCTION.
    We ran the alter script in the System Test environment; altering 150 tables holding 200 million records took almost 16-20 hrs.
    APPROACHES WE HAVE IN MIND
    ==========================
    1. Alter all the table columns (we tried this and it takes too much time).
    2. Export/import with NLS_LENGTH_SEMANTICS set to CHAR (we discussed this approach with our DBA and found that it will also take too much time, and there is a RISK of data inconsistency).
    3. Drop the indexes of the table, run the alter script changing BYTE to CHAR, and rebuild the indexes (this also takes too much time).
    All the above approaches are very costly in terms of time, which we cannot afford.
    If anyone has a better solution, please suggest it.
    Thanks in advance,
    Syed

    Hi
    We are also facing a similar problem.
    We ran the alter table scripts and are now compiling the objects,
    and that is taking a lot of time.
    We had around 4000 invalid objects, which parallel recompilation brought down to 2000, but the remaining 2000, mostly packages, are giving us a hard time.
    If anyone has faced a similar issue or found a solution, please post it or mail me directly.
    Sunil Choudhary

  • Change NLS_LENGTH_SEMANTICS from BYTE to CHAR on Oracle 9i2 problem!

    Hi,
    I have created a new database on Oracle 9i R2, but I did not find the correct pfile parameter for the NLS_LENGTH_SEMANTICS setting.
    So I created a standard UTF8 database, and now I am trying to change the default NLS_LENGTH_SEMANTICS=BYTE to CHAR. When I execute the following command in SQL*Plus: "ALTER SYSTEM SET NLS_LENGTH_SEMANTICS=CHAR SCOPE=BOTH"
    the system tells me that the command executed successfully.
    But when I look at the NLS_DATABASE_PARAMETERS table I do not see any change to the NLS_LENGTH_SEMANTICS parameter.
    I have also restarted the instance, but everything is still the same as it was before.
    Do you know what I am doing wrong?
    Regards
    RobH

    Hi,
    Yeah, you are right: in nls_session_parameters, "NLS_LENGTH_SEMANTICS" for the app user is set to CHAR.
    Does this mean that NLS_DATABASE_PARAMETERS is the view from the SYS or SYSTEM user?
    Thanks a lot
    Regards
    RobH

  • Byte or CHAR? - as Unit

    Hi all,
    DB - 9.2
    Which is the better option for the length unit, BYTE or CHAR, when using the VARCHAR2 datatype?
    If BYTE is selected as the unit, isn't it cumbersome to calculate the maximum number of characters that can be stored in the field?

    Hi!
    I'd propose CHAR, because with BYTE the number of characters you can store in your field depends on the character set. This means -> calculate :)
    Best regards,
    Daniel

  • Mixing byte and char input

    Hi,
    I have a setup where I wish to arbitrarily read bytes or chars from the same underlying input stream, under JDK 1.3.x.
    As far as I can see, InputStreamReader will always over-read from the underlying stream (to the tune of 8k, if it can) and never put back the data it didn't use. This could remove data that should have been read as binary data, and thus corrupt the stream.
    Given that there seems to be no way of detecting the number of bytes per character for a given encoding, the only solution I can think of is to provide an intermediate buffering stream that drip-feeds the reader a conservative number of bytes. This obviously has profound efficiency implications if the reader has to re-fill() itself that often.
    Has anyone else successfully dealt with this problem?

    "I'm still not sure I understand the problem. Is it that the char reader will keep reading past the chars, removing the binary data from what is left to be read and preventing it from being read by a 'binary' reader?"
    Yes - InputStreamReader, one of two ways to ensure that byte to char conversion is done properly pre-1.4, will use a fixed buffer size of 8192 bytes to read from the underlying stream.
    "It seems that you may need to read in everything as binary data and delegate chars off to another Reader as you come across them. Of course you will have to know when the data should be interpreted as chars and when it should not."
    Only achievable insofar as the underlying mechanism will be told by clients what to read (i.e. readLine() / readBytes(), or something).
    Some sort of home-rolled throttle between the underlying stream and the reader seems to be the only option, but this will be really clunky.
    "There will have to be some sort of separator between the types. Am I understanding that this will be used with several different protocols?"
    Yes, potentially a large number (this will, eventually, be a port of the Indy set of components for Delphi, http://www.nevrona.com/indy, FYI).
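
    The "read everything as bytes and decode only when asked" idea from this exchange can be sketched as below. This is an illustration under stated assumptions, not the poster's eventual solution: it assumes a current JDK (java.nio.charset did not exist in the 1.3.x of the thread) and an encoding in which the byte 0x0A always means newline (true for ASCII, ISO-8859-1 and UTF-8, not for UTF-16). The class and method names are hypothetical.

    ```java
    import java.io.BufferedInputStream;
    import java.io.ByteArrayInputStream;
    import java.io.ByteArrayOutputStream;
    import java.io.EOFException;
    import java.io.IOException;
    import java.io.InputStream;
    import java.nio.charset.Charset;
    import java.nio.charset.StandardCharsets;

    class MixedStreamSketch {
        /**
         * Reads bytes up to and including the next '\n' and decodes them as text.
         * It never reads past the newline, so binary data that follows stays in
         * the stream. Returns null at end of stream.
         */
        static String readLine(InputStream in, Charset cs) throws IOException {
            ByteArrayOutputStream line = new ByteArrayOutputStream();
            int b;
            while ((b = in.read()) != -1) {
                if (b == '\n') {
                    return decode(line.toByteArray(), cs);
                }
                line.write(b);
            }
            return line.size() == 0 ? null : decode(line.toByteArray(), cs);
        }

        private static String decode(byte[] raw, Charset cs) {
            int len = raw.length;
            if (len > 0 && raw[len - 1] == '\r') {
                len--; // drop the CR of a CRLF pair
            }
            return new String(raw, 0, len, cs);
        }

        /** Reads exactly n bytes of binary data, or throws if the stream ends early. */
        static byte[] readBytes(InputStream in, int n) throws IOException {
            byte[] buf = new byte[n];
            int off = 0;
            while (off < n) {
                int got = in.read(buf, off, n - off);
                if (got < 0) {
                    throw new EOFException("stream ended after " + off + " of " + n + " bytes");
                }
                off += got;
            }
            return buf;
        }

        public static void main(String[] args) throws IOException {
            ByteArrayOutputStream wire = new ByteArrayOutputStream();
            wire.write("LEN 3\r\n".getBytes(StandardCharsets.UTF_8));
            wire.write(new byte[] {0x00, (byte) 0xFF, 0x0A}); // binary payload
            wire.write("caf\u00e9\n".getBytes(StandardCharsets.UTF_8));

            // One shared byte-level buffer keeps the single-byte reads cheap.
            InputStream in = new BufferedInputStream(new ByteArrayInputStream(wire.toByteArray()));
            System.out.println(readLine(in, StandardCharsets.UTF_8));
            System.out.println(readBytes(in, 3).length + " binary bytes");
            System.out.println(readLine(in, StandardCharsets.UTF_8));
        }
    }
    ```

    Because all buffering happens below the decoding step, nothing is lost when the caller switches between text and binary reads.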

  • Using byte or char array?

    I am trying to use byte[] or char[] to read from a socket. I tried both byte[] and char[], and both work fine. However, I know a byte is 8 bits and a char is 16 bits. Does it matter which one I use?
    Thanks.

    When you deal with streams of plain bytes you should use input/outputstreams, not readers or writers that are meant for character streams. The inputstream equivalent to BufferedReader is BufferedInputStream and there is a method for reading a full byte[] there...
    But using a "bigger" data type than you need will not cause you big problems.

  • Byte and char !!!

    Hi!
    What is the difference between a byte and a char in Java? Could I use char instead of byte and vice versa?
    Thanks.

    TYPE  BYTES  BITS  SIGN      RANGE
    byte  1      8     SIGNED    -128 to 127
    char  2      16    UNSIGNED  \u0000 to \uFFFF
    Both data types can be cast to each other.
    Have a look at this:
    class UText
    {
         public static void main(String[] args)
         {
              byte c1;
              char c2= 'a';
              c1 = (byte) c2;   // narrowing cast: keeps only the low 8 bits
              c2 = (char) c1;   // sign-extends, so values above 127 do not round-trip
              System.out.println(c1 +" "+c2);   // prints "97 a"
         }
    }
    But the casts lead to confusion because byte is signed, and the range of byte is much smaller. So for text it's better to prefer char.

  • Exp/imp, conversion from byte to char.

    Hi,
    I have a dump file exported from a database with NLS_LENGTH_SEMANTICS=BYTE. When I import it into a database with NLS_LENGTH_SEMANTICS=CHAR, it retains the BYTE characteristics. Is there any way to change it to CHAR while importing? This is for globalization support.
    Thanks
    Muneer

    Hi Muneer,
    No, import always preserves the length semantics of the original columns.
    The workaround is to create your schema objects in the target database first, with the desired semantics, prior to importing the data.
    Nat

  • Converting bytes to chars.

    Can't get the right ByteToCharConverter...
    This works (but not quite as it is supposed to):
    InputStreamReader isr;
    isr = new InputStreamReader(socket.getInputStream());
    Whereas this
    InputStreamReader isr;
    isr = new InputStreamReader(socket.getInputStream(),"UTF-8");
    does not work.
    I get the UnsupportedEncodingException:
    java.io.UnsupportedEncodingException: UTF-8 [Could not load class: sun.io.ByteToCharUTF-8]
         at sun/io/ByteToCharConverter.getConverter (ByteToCharConverter.java)
         at java/io/InputStreamReader.<init> (InputStreamReader.java)
    How do I solve this?
    The code that doesn't crash makes all the special characters (non-English letters, the copyright sign, ...) wrong!
    Please help me!
    Ragnvald

    HURRAY!!!
    UTF8 not UTF-8 !!!!!
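
    The fix above was a matter of which encoding name that particular JVM recognised. On a current JDK the charset can be passed as a constant instead of a name, which sidesteps the naming question. A sketch, assuming Java 7 or later (for StandardCharsets) and with a ByteArrayInputStream standing in for socket.getInputStream(); the class and method names are made up for illustration:

    ```java
    import java.io.ByteArrayInputStream;
    import java.io.IOException;
    import java.io.InputStream;
    import java.io.InputStreamReader;
    import java.io.Reader;
    import java.nio.charset.StandardCharsets;

    class Utf8ReaderSketch {
        /** Reads the whole stream and decodes it as UTF-8 text. */
        static String readAllUtf8(InputStream in) throws IOException {
            // The constant cannot be misspelled, so no UnsupportedEncodingException is possible here.
            Reader reader = new InputStreamReader(in, StandardCharsets.UTF_8);
            StringBuilder sb = new StringBuilder();
            char[] buf = new char[1024];
            int n;
            while ((n = reader.read(buf)) != -1) {
                sb.append(buf, 0, n);
            }
            return sb.toString();
        }

        public static void main(String[] args) throws IOException {
            // Sample bytes: "blåbær ©" encoded as UTF-8.
            byte[] sample = "bl\u00e5b\u00e6r \u00a9".getBytes(StandardCharsets.UTF_8);
            System.out.println(readAllUtf8(new ByteArrayInputStream(sample)));
        }
    }
    ```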

  • Using bytes or chars for String phonetic algorithm?

    Hi all. I'm working on a phonetic algorithm, much like Soundex.
    Basically the program receives a String, reads it either char by char or in chunks of chars, and returns its phonetic version.
    The question is which approach works better for this: treating each String "letter" as a char or as a byte?
    For example, let's assume one of the rules is to remove every repeated character (e.g., "jagged" becomes "jaged"). Currently this is done as follows:
    public final String removeRepeated(String s){
        char[] schar=s.toCharArray();
        StringBuffer sb =new StringBuffer();
        int lastIndex=s.length()-1;
        for(int i=0;i<lastIndex;i++){
            if(schar[i]!=schar[i+1]){
                sb.append(schar[i]);
            }else{
                sb.append(schar[++i]);//due to increment it wont work for 3+ repetions e.g. jaggged -> jagged
            }
        }
        sb.append(schar[lastIndex]);
        return sb.toString();
    }
    Would there be any improvement in this computation:
    public final String removeRepeated(String s){
        byte[] sbyte=s.getBytes();
        int lastIndex=s.length()-1;
        for(int i=0;i<lastIndex;i++){
            if(sbyte[i]==sbyte[i+1]){
                sbyte[++i]=32; //the " " String
            }
        }
        return new String(sbyte).replace(" ","");
    }
    Well, in case there isn't much improvement from the short (16-bit) to the byte (8-bit) computation, I would very much appreciate it if anyone could explain to me how a 32-bit/64-bit processor handles this kind of data, so that it makes no difference whether I work with short or byte in this case.

    You may already know that getBytes() converts the string to a byte array according to the system default encoding, so the result can be different depending on which platform you're running the code on. If the encoding happens to be UTF-8, a single character can be converted to a sequence of up to four bytes. You can specify a single-byte encoding like ISO-8859-1 using the getBytes(String) method, but then you're limited to using characters that can be handled by that encoding. As long as the text contains only ASCII characters you can get away with treating bytes and characters as interchangeable, but it could turn around and bite you later.
    Your purpose in using bytes is to make the program more efficient, but I don't think it's worth the effort. First, you'll be constantly converting between bytes and chars, which will wipe out much of your efficiency gain. Second, when you do comparisons and arithmetic on bytes, they tend to get promoted to ints, so you'll be constantly casting them back to bytes, but you have to watch for values changing as the JVM tries to preserve their signs.
    In short, converting the text to bytes is not going to do anywhere near enough good to justify the extra work it entails. I recommend you leave the text in the form of chars and concentrate on minimizing the number of passes you make over it.
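
    Following that advice, one way to stay with chars and make a single pass is sketched below. It is not the original poster's code: it compares each char with its predecessor instead of skipping ahead, so runs of any length collapse to one char. The class name is made up for illustration.

    ```java
    class RemoveRepeatedSketch {
        /** Collapses every run of identical adjacent chars into a single char. */
        static String removeRepeated(String s) {
            StringBuilder sb = new StringBuilder(s.length());
            for (int i = 0; i < s.length(); i++) {
                char ch = s.charAt(i);
                // keep a char only if it differs from the one just before it
                if (i == 0 || ch != s.charAt(i - 1)) {
                    sb.append(ch);
                }
            }
            return sb.toString();
        }

        public static void main(String[] args) {
            System.out.println(removeRepeated("jagged"));   // jaged
            System.out.println(removeRepeated("jaggged"));  // jaged
        }
    }
    ```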

  • Byte to Char ...

    Hi,
    I have written an encryption function in PHP which at a certain point uses the built-in chr() function to convert a byte back to a character. This works fine; however, when I do the same thing in Java (by casting the byte to a char) I have problems.
    It seems characters with ASCII values greater than 128 are being displayed as the wrong characters and sometimes not at all (they become ?).
    Some example output from the PHP:
    Byte           ASCII   Char
    -785314906     166     �
    805139163      219     �
    I'm assuming this is because the way PHP handles the bytes differs from Java. Does anyone know how I can get my Java representation of these bytes to work for conversion to characters?
    My Java code goes like this:
    System.out.println((char)(byte) byteCode);
    * byteCode is an integer variable containing values like those shown above.
    Many thanks for any insight ...
    Chris

    A byte is only 8 bits and it's signed as well so 1 bit is reserved for the sign.
    As a consequence a byte in Java has the numerical range of -128 to 127.
    Just for the record, as far as I know a byte could never have the values -785314906 or 805139163 in any language unless that language defines a byte to be something other than a byte.
    Try the following to convert your "byteCode"s:
    System.out.println((char)(0xff & byteCode));
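
    The difference between the two expressions in this thread comes down to sign extension, which a small sketch can make visible. The class and method names are made up for illustration; the input values are the two from the question.

    ```java
    class ByteToCharSketch {
        /** Keeps only the low 8 bits and treats them as an unsigned code 0..255. */
        static char lowByteToChar(int byteCode) {
            return (char) (byteCode & 0xff);
        }

        /** The cast chain from the question: the byte is sign-extended before it becomes a char. */
        static char castViaByte(int byteCode) {
            return (char) (byte) byteCode;
        }

        public static void main(String[] args) {
            int[] samples = {-785314906, 805139163};
            for (int code : samples) {
                System.out.println(code
                        + " masked=" + (int) lowByteToChar(code)
                        + " cast=0x" + Integer.toHexString(castViaByte(code)));
            }
        }
    }
    ```

    Masking yields 166 and 219, matching the PHP column, while the cast chain yields U+FFA6 and U+FFDB. Note that masking treats the byte as an ISO-8859-1 code; if the PHP side used another single-byte encoding, the bytes would need to be decoded with that charset instead.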

  • Byte to char casting

    Hi friends
    I am using a third-party API for my application. This API expects a UTF-8 string from my application. I wrote some dirty methods to convert my string to UTF-8. They work properly except in some cases when the second byte of a character contains only digits.
    For example, the Ukrainian char "І" is D0 86.
    Casting 0x0086 to char returns "" instead of †.
    Could anyone explain to me why (char)0x0086 returns ""?
    Thank you!

    These classes provide some nice functionality that you might need
    https://msdn.microsoft.com/en-us/library/system.text.encoding%28v=vs.110%29.aspx
    https://msdn.microsoft.com/en-us/library/system.text.encoding.utf8%28v=vs.110%29.aspx
    String encoding is fully supported by .net. Most common encoding are supported, so no need for weird methods.
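
    The reply points at .NET's Encoding classes; the underlying point is the same on any platform, and it is sketched here in Java to match the other examples on this page. In Unicode, 0x86 is an invisible C1 control character, while † sits at 0x86 only in the windows-1252 code page. The sketch assumes Java 7 or later and a runtime that provides windows-1252; the class and method names are made up for illustration.

    ```java
    import java.nio.charset.Charset;
    import java.nio.charset.StandardCharsets;

    class Utf8BytesSketch {
        /** Formats bytes as upper-case hex pairs separated by spaces. */
        static String hex(byte[] bytes) {
            StringBuilder sb = new StringBuilder();
            for (byte b : bytes) {
                if (sb.length() > 0) {
                    sb.append(' ');
                }
                sb.append(String.format("%02X", b & 0xff));
            }
            return sb.toString();
        }

        /** Interprets a single byte as a windows-1252 character. */
        static String decodeCp1252(byte b) {
            return new String(new byte[] {b}, Charset.forName("windows-1252"));
        }

        public static void main(String[] args) {
            // Let the library produce the UTF-8 bytes instead of casting by hand.
            byte[] utf8 = "\u0406".getBytes(StandardCharsets.UTF_8); // Ukrainian capital I
            System.out.println(hex(utf8));

            // (char) 0x86 is a control character, so nothing visible is printed for it.
            System.out.println(Character.isISOControl((char) 0x86));

            // The dagger only appears when 0x86 is read as windows-1252.
            System.out.println(decodeCp1252((byte) 0x86));
        }
    }
    ```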
