Unicode(String) to actual Unicode !

Hi, I have Unicode data retrieved from a database, but it is stored in string form as escape sequences (\u4eba\u53c2). How do I turn it into the actual Unicode characters "\u4eba\u53c2"? My problem is that the value from the database gives me the escape-sequence text itself rather than the characters it represents. Please comment on it. Thanks.
_calv

Hi Calv,
I'm pretty sure that the conversion from the ASCII string to Unicode is not available within the API. (If someone knows otherwise, please jump in.) However, it should be fairly easy for you to program this conversion: for example, you could parse your string into the six-character substrings that each represent one character, strip off the \u, parse the remaining four hex digits as a sixteen-bit integer, and cast that integer to a char.
In case it's helpful, I am pasting a couple of methods I wrote to go in the opposite direction:
// Returns an (ASCII) string that represents the specified character by a Unicode escape sequence.
static public String toUnicodeString(char character) {
    short unicode = (short) character;
    char hexDigit[] = {
        '0', '1', '2', '3', '4', '5', '6', '7',
        '8', '9', 'a', 'b', 'c', 'd', 'e', 'f'
    };
    // Extract the four hex digits, most significant first.
    char[] array = { hexDigit[(unicode >> 12) & 0x0f], hexDigit[(unicode >> 8) & 0x0f],
        hexDigit[(unicode >> 4) & 0x0f], hexDigit[unicode & 0x0f] };
    String result = "\\u" + new String(array);
    return result;
}
// Returns an (ASCII) string representing the Java string argument,
// e.g. -> "\u1234\u5678"
static public String toUnicodeString(String string) {
    String result = "\"";
    for (int index = 0; index < string.length(); index++) {
        result = result + toUnicodeString(string.charAt(index));
    }
    result = result + "\"";
    return result;
}
Regards,
Joe
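
The conversion Joe describes in words (find each six-character escape, strip the backslash and the u, parse the four hex digits, cast to a char) could be sketched as follows. This is a minimal sketch, not part of the original reply; the class and method names are made up for illustration, and it assumes the database value contains only four-digit escapes of the form shown in the question.

```java
final class UnicodeUnescapeSketch {
    /** Replaces each backslash-u escape followed by four hex digits with the char it names. */
    static String unescape(String escaped) {
        StringBuilder out = new StringBuilder(escaped.length());
        int i = 0;
        while (i < escaped.length()) {
            // An escape is exactly six characters: backslash, 'u', four hex digits.
            if (escaped.startsWith("\\u", i) && i + 6 <= escaped.length() && isHex(escaped, i + 2)) {
                out.append((char) Integer.parseInt(escaped.substring(i + 2, i + 6), 16));
                i += 6;
            } else {
                // Anything else (including a malformed escape) is copied through unchanged.
                out.append(escaped.charAt(i));
                i++;
            }
        }
        return out.toString();
    }

    /** True if the four characters starting at 'start' are all hex digits. */
    private static boolean isHex(String s, int start) {
        for (int k = start; k < start + 4; k++) {
            if (Character.digit(s.charAt(k), 16) < 0) {
                return false;
            }
        }
        return true;
    }
}
```

Calling unescape on the twelve-character string from the question should give back the two-character string. Text that is not a complete escape is left as it is rather than rejected, which seemed the safer choice for data coming from a database.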

Similar Messages

  • Barcode reader and insert String into currently selected JTextField

    Hi everyone,
    I can't come up with an appropriate approach for my idea. I would like to write a program for barcode reading. I have working code which gets a text string from a reader connected over RS232. But I have to send this String to the currently selected JTextField in another Java program. I am thinking of using the clipboard to overcome this problem, but I'm not sure if it's a good solution: copy the String to the clipboard and auto-paste... Any ideas?
    Please help me!
    Many thanks for any advice :)

    Hmm... I missed that bit about having to poke it into
    another Java program. In that case I would
    look into modifying the other Java program instead of
    trying to write a separate program to deal with it.
    Otherwise you run into management issues like making
    sure the other program is running, and not minimized,
    and located at the right place on the screen, and has
    the JTextField in question in focus, and so on.
    In most cases, I would agree. But if his Java program is headless and just responds to the serial events and calls Robot.keyPress() and Robot.keyRelease(), he will just be imitating the keyboard, which is exactly what most barcode readers can already do. And this would work in any program that can get keyboard input, no matter what language it was written in.
    We are currently doing this with a web-based application. The web page just has a text field and when they scan the barcode it submits the page. Of course the barcode reader we are using just imitates the keyboard, no mucking around converting serial data into keyboard events.
    I bet if the OP looks around he could find software that will already convert the barcode RS232 data into keyboard events.

  • What does String.intern() actually improve?

    Everyone says that calling String.intern() and comparing with == is a much better way to compare strings than calling String.equals(). I ran a quick test to check this before using it in my program. To my surprise, over a loop of 10 million iterations, the equals() version takes 800 ms while the intern() version takes 3000 ms, which is much longer.
    Does anyone have any idea why? Is .intern() really an improvement?
    My code:
    class testIntern {
        public static void main(String[] args) {
            String sTest = new String("Hello Everyone");
            System.out.println(sTest.intern() == "Hello Everyone");
            System.out.println("version".intern() == "version");
            long x = System.currentTimeMillis();
            int j = 0;
            for (int i = 0; i < 10000000; i++) {
                sTest = new String("Hello Everyone");
                // intern() is called on every iteration, so its lookup cost is part of the timing
                if (sTest.intern() == "Hello Everyone") {
                    j++;
                }
            }
            System.out.println(System.currentTimeMillis() - x);
            System.out.println(j);
        }
    }

    Grepping the JDK 1.5 source for the pattern "intern *(", I found 56 files. So far, I'm considering it an example of how not to use intern(). For example, in com.sun.org.apache.xml.internal.utils.NamespaceSupport2, we have this gem:
        void declarePrefix (String prefix, String uri)
        {
                                    // Lazy processing...
            if (!tablesDirty) {
                copyTables();
            }
            if (declarations == null) {
                declarations = new Vector();
            }
            prefix = prefix.intern();
            uri = uri.intern();
            if ("".equals(prefix)) {
                if ("".equals(uri)) {
                    defaultNS = null;
                } else {
                    defaultNS = uri;
                }
            } else {
                prefixTable.put(prefix, uri);
                uriTable.put(uri, prefix); // may wipe out another prefix
            }
            declarations.addElement(prefix);
        }
    So, they intern both the namespace prefix and URI, taking up space in the permgen. Then they use .equals() to compare those to a constant string, and put both values in standard Hashtables!
    There are several cases where intern() is used to prevent the compiler from performing constant substitution:
        public final static String PREFIX_XMLNS = "xmlns".intern();
    It's a nice little trick when your constants are likely to change: no need to recompile dependent classes. However, since this particular prefix is defined by the W3C namespace spec, the trick is of very dubious value here.
    There are a few cases, such as in java.lang.Class, where intern() is used so that you can do a fast string search over a fixed array. I suspect these cases exist to avoid a dependency on Map. Whether or not a simple equals() would have been sufficient is an open question.
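
As a minimal sketch of the semantics being discussed (the class and method names are made up for illustration): a string literal is already in the pool, a string created with new is a separate object, and intern() returns the pooled instance. Because intern() performs a pool lookup each time it is called, calling it inside the loop pays that cost on every iteration, which is consistent with the timings in the question; any benefit would only come from interning once and reusing the reference for many == comparisons.

```java
final class InternSketch {
    /** Reference comparison: true only if both variables point at the same object. */
    static boolean sameObject(String a, String b) {
        return a == b;
    }

    public static void main(String[] args) {
        String literal = "Hello Everyone";
        String copy = new String("Hello Everyone"); // a distinct object with equal content

        System.out.println(sameObject(copy, literal));          // different objects
        System.out.println(sameObject(copy.intern(), literal)); // intern() returns the pooled literal
        System.out.println(copy.equals(literal));               // content comparison
    }
}
```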

  • Characters in String : Unicode 16-bit to custom 32-bit

    I understand that internally in Java, characters in Strings are actually Unicode characters, with each character represented with 16 bits.
    So, character 'L' in Unicode is 0x004C
    which is also 0000 0000 0100 1100
    Now, I wish to encode each of the 4 hex digits (4-bit groups) above into individual ASCII characters:
    = 0 0 4 C
    = 0x30 0x30 0x34 0x43
    = 00110000 00110000 00110100 01000011
    So, from the original 16-bit character in Java, I want a final 32-bit value.
    Eventually, I'll need to send the final result over the network, via OutputStream/Writer and socket.
    Can someone help me with this? Or give me some ideas... Thanks.

    Trick: prepend a 1 to the number and use substring, e.g. int charWith1 = c + 0x10000. That makes charWith1 have the form 0x1XXXX. Then call a hex-string conversion such as Integer.toHexString on it to get a string like "1XXXX", and drop the leading 1 with a call to substring.
    Of course, there are methods that use only bit operations and additions to do it, making it a bit faster, like this:
    byte byte0 = (byte) ((c & 0x000F) + '0');        // lowest hex digit; adding '0' alone is only correct for values 0-9
    byte byte1 = (byte) (((c & 0x00F0) >> 4) + '0'); // values A-F need a lookup table or an extra offset
    ...
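
Both approaches from the reply can be sketched as below. This is an illustrative sketch with made-up names, not the poster's code; it uses a lookup table instead of adding '0', since plain addition yields the right ASCII byte only for hex digits 0 to 9, and it assumes upper-case hex digits as in the question's example.

```java
import java.nio.charset.StandardCharsets;

final class HexNibbleSketch {
    // ASCII codes of the sixteen hex digits, indexed by nibble value.
    private static final byte[] HEX = "0123456789ABCDEF".getBytes(StandardCharsets.US_ASCII);

    /** Encodes one 16-bit char as four ASCII bytes, most significant hex digit first. */
    static byte[] encode(char c) {
        return new byte[] {
            HEX[(c >> 12) & 0x0F],
            HEX[(c >> 8) & 0x0F],
            HEX[(c >> 4) & 0x0F],
            HEX[c & 0x0F]
        };
    }

    /** The "prepend a 1" trick: 0x1XXXX always prints as five hex digits, so the zeros survive. */
    static String encodeViaSubstring(char c) {
        return Integer.toHexString(c + 0x10000).substring(1).toUpperCase();
    }
}
```

The byte array returned by encode could be passed directly to OutputStream.write for each character of the string.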

  • How to deal with such Unicode source data in BI 7.0?

    I encountered an error when activating DSO data. It turned out that the source data is Unicode in the HTML representation style. For example, the source character string is:
    ABCDEFG& #65288;XYZ  (I added a space between & and # so that the web browser won't interpret it as Unicode in SDN)
    After some analysis, I see it's actually the Unicode string
    ABCDEFG（XYZ
    Please notice the wide left parenthesis. It's the actual character for the HTML &#xxx style above. To compare, here is the Unicode parenthesis '（' and here is the ASCII one '('. You see they are different.
    My question is: as I have trouble loading the &#... string, I think I should translate the string to the actual Unicode character (like '（' in this case). But how can I achieve this?
    Thanks!
    Message was edited by:
            Tom Jerry

    I found this is called a "numeric character reference", or NCR, in HTML terms. So the question is how to convert a string in NCR form back to Unicode. Thanks.
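
The thread concerns SAP BI, where the conversion would be written in ABAP; the Java below only illustrates the logic and is not an SAP API. It is a rough sketch with made-up names that handles decimal (&#65288;) and hexadecimal (&#xFF08;) references and leaves anything it cannot interpret untouched.

```java
import java.util.regex.Matcher;
import java.util.regex.Pattern;

final class NcrDecoderSketch {
    // Group 1 captures a hexadecimal reference, group 2 a decimal one.
    private static final Pattern NCR =
            Pattern.compile("&#(?:[xX]([0-9a-fA-F]{1,6})|([0-9]{1,7}));");

    /** Replaces every numeric character reference in the input with the character it denotes. */
    static String decode(String input) {
        Matcher m = NCR.matcher(input);
        StringBuilder out = new StringBuilder();
        int last = 0;
        while (m.find()) {
            int codePoint = m.group(1) != null
                    ? Integer.parseInt(m.group(1), 16)
                    : Integer.parseInt(m.group(2), 10);
            out.append(input, last, m.start()); // text before the reference
            if (Character.isValidCodePoint(codePoint)) {
                out.appendCodePoint(codePoint);
            } else {
                out.append(m.group()); // out-of-range reference: keep it as written
            }
            last = m.end();
        }
        out.append(input, last, input.length()); // text after the last reference
        return out.toString();
    }
}
```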

  • TABLE_ENTRIES_GET_VIA_RFC in unicode system

    Hi all,
    I know this is going to be a long initial post, but please please take the time to read it. Otherwise there would be many unnecessary questions.
    We are using a middleware for mobile devices that connects to SAP to read table data (via the above-mentioned RFC) and to post RFCs/BAPIs.
    Now we are trying to connect to a Unicode SAP system (6.20). The statement
    SELECT * FROM (TABNAME) INTO TABLE TABENTRY WHERE (SEL_TAB)
    where TABENTRY is a table of type char(2048) does not work any more as in unicode systems the structure of the db table and the internal table have to be the same.
    So we found the fm CRM_CODEX_GET_TABLE_VIA_RFC in CRM which is built from a copy of the a.m. fm and solves this problem by
    1.) dynamically creating an internal table of the same type as the db table,
    2.) selecting the data into this new internal table,
    3.) looping over the internal table, converting each field of its structure to a char variable and then appending it to the result char(2048).
    Theoretically everything's ok. The fm works now and returns correct data. But there's still one problem, the middleware doesn't convert the data correctly, as the values of fields of type 'p' are passed differently.
    non unicode (standard fm):
    1000000000000280401011000COMPDL              200408 ###E8##############################
    unicode (changed fm):
    1000000000000280401016000COMPDL              200504 10.000       0.000        0.000        0.000
    As you can see, the select statement from the top of this post just puts the data into the string without actually converting the numbers in fields of type p (or QUAN, CURR in db).
    The changed fm with converting every field also converts the number values, now they appear as char fields.
    The middleware tries to convert the number values, but always returns 0 (I can only see the results, as the actual programming is a black box for me).
    Has anyone any idea how to solve this problem? (beside getting help from the middleware vendor, which is difficult, as there is a new release working with unicode systems. But we will stay on the old release for some months from now...)

    Hi Raja,
    thanks for your answer.
    I had already searched the forum and found your document about RFC_READ_TABLE which I think is quite interesting and a good solution.
    But unfortunately, I cannot change the middleware's RFC logic, e. g. change the BAPI or make changes to the in-/output streams.
    I now live with a workaround:
    I modified the RFC to convert all p type fields to character fields and also changed the metadata RFCs accordingly, which works OK.
    For all RFCs I use to post data to SAP, I write a wrapper RFC with character only structures and convert them to the internal RFCs inside SAP.
    This is not my preferred solution, but I am very short of time and it works pretty well.
    Regards,
    Hans

  • Unicode characters longer than 2 bytes

    It seems that Flex 3 only handles double-byte Unicode characters.  Unicode has characters outside the BMP (Basic Multilingual Plane), which have codes greater than 2^16 and cannot be encoded in two bytes, but can be encoded in UTF-8.  Will such characters be supported in the future, e.g. in Flex 4?
    Thanks,
    Francisco

    How to tell whether a "character" (really a UTF-16 code unit) in an AS String is actually part of a surrogate pair:
    D800..DBFF: high surrogate
    DC00..DFFF: low surrogate
    everything else: a character
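
The ranges above translate directly into code. The thread is about ActionScript, but Java strings are also sequences of UTF-16 code units, so the same checks can be sketched in Java; the names here are made up for illustration (Java also ships Character.isHighSurrogate and Character.isLowSurrogate, which perform the same range tests).

```java
final class SurrogateSketch {
    /** D800..DBFF: first half of a surrogate pair. */
    static boolean isHighSurrogate(char unit) {
        return unit >= 0xD800 && unit <= 0xDBFF;
    }

    /** DC00..DFFF: second half of a surrogate pair. */
    static boolean isLowSurrogate(char unit) {
        return unit >= 0xDC00 && unit <= 0xDFFF;
    }

    /** Counts characters, treating each high+low pair as a single character. */
    static int countCodePoints(String s) {
        int count = 0;
        for (int i = 0; i < s.length(); i++) {
            if (isHighSurrogate(s.charAt(i))
                    && i + 1 < s.length()
                    && isLowSurrogate(s.charAt(i + 1))) {
                i++; // skip the low half of the pair
            }
            count++;
        }
        return count;
    }
}
```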

  • Display String - UCS2

    Hi,
    I have a web based application. The default language is English.
    I have some data in UCS2. Using Java, how can I display the decoded value of the UCS2 String on the screen? Say the UCS2 String actually contains Chinese or Thai characters; how can I do so?
    Thank you,

    I have some data in UCS2. Using Java, how can I
    display the decoded value of the UCS2 String on the
    screen? Say, if the UCS2 String is actually
    Chinese/Thailand Language Character, how can I do
    so?
    What do you mean by 'on the screen'? If you're using System.out.println(), it may be impossible. See
    http://forum.java.sun.com/thread.jspa?threadID=525433&messageID=2519054
    If you want to learn more, maybe take a look at
    http://www.jorendorff.com/articles/unicode/java.html
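
The decoding step itself is short if the data arrives as raw bytes. The sketch below assumes big-endian UCS-2 with no byte-order mark, which the question does not confirm; the class name is made up. Whether the resulting String then displays correctly depends on the output encoding and fonts, as the reply points out.

```java
import java.nio.charset.StandardCharsets;

final class Ucs2DecodeSketch {
    /**
     * Decodes big-endian UCS-2 bytes into a Java String.
     * UTF-16BE is a superset of UCS-2, so it decodes UCS-2 data unchanged.
     */
    static String decode(byte[] ucs2BigEndian) {
        return new String(ucs2BigEndian, StandardCharsets.UTF_16BE);
    }
}
```

For little-endian data, StandardCharsets.UTF_16LE would be the corresponding choice.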

  • Null String and Empty String problem

    Hello everyone,
    since I am totally new to JSP, I am having problems handling strings.
    Suppose I have a variable users = ""; then
    I want to ask when to use:
    if (users.equals(""))
    and
    if(users == "")
    In my code, the variable users has the value "regional" for regional users,
    and I am checking it like this:
    if (users.equals{"regional")) {
    out.print ("I am inside code");
    At that point, the code throws an error (a run-time error),
    and when I changed the code to:
    if (users == "regional") {
    out.print ("I am inside code");
    this time the code does not generate an error, but the message "I am inside code" is not displayed. Execution does not enter the if block.
    I hope you understand my problem. Can anybody help me out with this?

    This has basically nothing to do with JSP, but with basic Java knowledge.
    When you use the '==' operator to compare Objects (yes, String is actually a subclass of Object), it checks whether they are the same reference. Using the '==' operator to compare primitive datatypes (int, boolean, char, etc.) checks whether they have the same value.
    That is why the Object class has the equals() method, which gives you the ability to compare with another object. And you can only invoke it when the Object is actually instantiated, that is, when it is not null.
    if (string != null && string.equals("somevalue")) {
    }
    // or
    if ("somevalue".equals(string)) {
    }
    should work.
    Edit rym82: this will not throw a NPE, but an ordinary compilation error ;)
    Message was edited by:
    BalusC
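
A small sketch of the two checks from the reply, applied to the "regional" example in the question. The class and method names are made up for illustration, and the assumption that the value arrives as a separately created String (for example from a request parameter) is mine, not the poster's.

```java
final class StringCompareSketch {
    /** Null-safe content check: returns false for null instead of throwing. */
    static boolean isRegional(String users) {
        return "regional".equals(users);
    }

    /** True only for a non-null string of length zero. */
    static boolean isEmptyString(String users) {
        return users != null && users.isEmpty();
    }
}
```

Putting the literal on the left means the method is always invoked on a non-null object, so no separate null check is needed.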

  • Using a String in the "IN" clause

    Hello folks,
        I am trying to output results from a table based on a String which I was planning on using in the "IN" clause. When I run the query through the PL/SQL procedure, I get no results.
    In the PL/SQL program, I have a variable p_string where I am appending the ID's in a loop. So, when I print the string, I am seeing '1001','1002','1003' but when I do the following I get nothing.
    select * from test_tb
    where ID IN (p_string);
    Is this because there is an extra quote at the beginning and end of the string and the string is actually ''1001','1002','1003''?
    create table test_tb(ID varchar2(4), description varchar2(20));
    INSERT INTO TEST_TB (ID, DESCRIPTION) VALUES ('1001', 'Testing 1001');
    INSERT INTO TEST_TB (ID, DESCRIPTION) VALUES ('1002', 'Testing 1002');
    INSERT INTO TEST_TB (ID, DESCRIPTION) VALUES ('1003', 'Testing 1003');
    INSERT INTO TEST_TB (ID, DESCRIPTION) VALUES ('1004', 'Testing 1004');
    INSERT INTO TEST_TB (ID, DESCRIPTION) VALUES ('1005', 'Testing 1005');
    Thanks

    Thanks for the link, Greg.
    I was able to find another link which worked for me as I am not too familiar with Collections. I used Regular Expressions instead.
    https://blogs.oracle.com/aramamoo/entry/how_to_split_comma_separated_string_and_pass_to_in_clause_of_select_statement

  • "scan from string" to timestamp doesn't work for 18:00:00 (6PM)

    I just found a strange issue in LabVIEW.  I hope I'm doing something silly, but I just may have found an unusual bug.
    run the snippet below with the following for the input string: 03:00:00,18:00:00,17:00:00
    Time converts fine for just about any other time EXCEPT 18:00:00 (6 PM) for which it is returned as 00:00:00 (midnight). If you even add a second to it (18:00:01) you get back the expected result.
    Here's hoping I'm not losing my mind
    Matt Holt
    Certified LabVIEW Architect
    Attachments:
    TimeParseBug.vi ‏11 KB

    As annoying as it may seem, this exact scenario is an abuse of the timestamp. A timestamp is meant to be used for absolute times. And that includes a date. As Ravens Fan already pointed out, the 0 seconds since January 1, 1904 GMT is used in all timestamp display routines to mean the canonical invalid timestamp and hence the timestamp control displays the default string indicating the actual date/time format rather than a specific date/time.
    If you need an absolute timestamp, for instance because you do want to have a local time indication, although the date is not relevant, adding an offset of 86400 to all values would fix it once and for all. Now the timezone offset can't cause the timestamp to reach 0 ever, even if you reside west of GMT, and it will be fine (until you start to do timestamp arithmetic that involves subtraction of relative timespans, then you would have to make the offset big enough that this will never get an issue). The current date would serve as a nice offset for that, which would be MattH's last suggestion. Nice to see that the Scan from String routine actually does use the passed in timestamp as default value and only replaces the values it is configured to parse.
    Rolf Kalbermatter
    CIT Engineering Netherlands
    a division of Test & Measurement Solutions

  • Obtaining the " #text " string in a nodeList !!!!

    hello guys,
    I just have a bizarre problem: I'm getting the #text string in a NodeList, whereas this string does not exist in any of the .xml files.
    here is my function:
    private Vector<String> getClassAttributs(Document doc)
    {
          String nN="";
          NodeList listeClassAttributes = doc.getElementsByTagName("attribut");
          for(int i=0;i<listeClassAttributes.getLength();i++)
          {
             nN=listeClassAttributes.item(i).getParentNode().getNodeName();
          //  System.out.println(nN);
             if (nN.equals("classe"))
             {
                 NodeList children=listeClassAttributes.item(i).getChildNodes();
                 for(int j=0;j<children.getLength();j++){
                    System.out.println( children.item(j).getNodeName());// here it displays, among other strings (that actually exist), the #text (which does not exist)
                   //if(children.item(j).getNodeName().equals("valeur"))
                     // System.out.println(children.item(j).getNodeValue());
                  //classAttributes.add(children.item(i).getNodeValue());
                 }
             }
          }
          return classAttributes;
    }
    thank you

    In your for loop check for node type before printing the value. Something like this...
    if (children.item(j).getNodeType()== Node.ELEMENT_NODE)
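
The #text entries are most likely the whitespace (line breaks and indentation) between the child elements, which a DOM parser reports as text nodes by default. A self-contained sketch of the suggested node-type check follows; the class and method names are made up, and the element names in the usage note are only an assumption based on the question.

```java
import java.io.ByteArrayInputStream;
import java.nio.charset.StandardCharsets;
import java.util.ArrayList;
import java.util.List;
import javax.xml.parsers.DocumentBuilderFactory;
import org.w3c.dom.Document;
import org.w3c.dom.Node;
import org.w3c.dom.NodeList;

final class ElementChildrenSketch {
    /** Returns the names of the element children only, skipping #text and other node types. */
    static List<String> elementChildNames(Node parent) {
        List<String> names = new ArrayList<>();
        NodeList children = parent.getChildNodes();
        for (int j = 0; j < children.getLength(); j++) {
            Node child = children.item(j);
            if (child.getNodeType() == Node.ELEMENT_NODE) {
                names.add(child.getNodeName());
            }
        }
        return names;
    }

    /** Parses an XML string with the default (non-validating) parser settings. */
    static Document parse(String xml) {
        try {
            return DocumentBuilderFactory.newInstance().newDocumentBuilder()
                    .parse(new ByteArrayInputStream(xml.getBytes(StandardCharsets.UTF_8)));
        } catch (Exception e) {
            throw new IllegalStateException("Could not parse XML", e);
        }
    }
}
```

With an indented document such as a classe element containing an attribut element with nom and valeur children on separate lines, getChildNodes() on attribut also returns the whitespace between them as text nodes, while elementChildNames returns only the two element names.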

  • Scan From String White Space

    Hi all,
    I'm trying to use Scan From String in order to parse some data coming in from UDP. 
    Input String: ASCII [00 01 02 03 ... FF]
    What I want: s[00 .. 30]  d[12], d[34], d[56] leftover s[37 38 39 ... FF]
    ATTEMPT1
    Format String: %49s%2d%2d%2d
    What I get: s[00-09] RUNTIME ERROR!
    ATTEMPT2
    Format String: %49[^]%2d%2d%2d
    What I get: Only allows first output. Will error out if I use any additional outputs from Scan From String
    ATTEMPT3
    Format String: %49[^(0xFF)]%2d%2d%2d Value in () is ASCII character FF.
    What I get: s[00 .. 30] d[12], d[34], d[56] leftover s[37 38 39 ... FF]
    It appears as though when I use %##[^] it thinks I'm looking for the ENTIRE string, so it will not let me add any more formatting.  If I add a delimiter other than ^ it will run, and it will work presuming that character isn't within the first 49 characters... and I can't guarantee that it won't be.
    I'm aware I can parse my string using subsets and whatnot... but Scan From String is so elegant.  It would be great if %S allowed for white space... or if %##[^] would simply take the first ## characters and allow me to format after that.
    Is there a simple, elegant way to do this?  I wish my dataset was only 3-4 outputs. It'd be ideal if I could.  Thanks.
    Edit:
    It might be more helpful if I provide a less abstract example:
    I have an ASCII Header (Finite Length String), a Sender IP (Finite Length String), a Timestamp, a Message ID (Finite Length Decimal), A Message in ASCII ( '1' actually means 0x31, not 0x01)  And for some ungodly reason... no delimiters.
    So I was HOPING %##s%##s%<%H:%M:%s>t%##d   (With leftover string to be my message)  would work, but if any white space is contained within there... it messes up. 

    I cannot provide exact strings because the string is actually ASCII characters, most of which aren't displayable. 
    I have a string where I have:
    24 ASCII Characters representing 6x U32 Header Data
    13 ASCII Characters representing the sender IP (String)
    12 ASCII Characters representing the name of the message (String)
    12 ASCII Characters representing 3x U32 Data
    12 ASCII Characters representing the name of the packet (String)
    12 ASCII Characters representing 12x U8 Data
    256 ASCII Characters representing 256x U8 Data
    etc...
    It would be ideal to simply scan from the string and output the data with the appropriate data types already assigned, instead of splitting the string and type casting each piece individually.  But if, for example, my header starts with an ASCII representation of a U32 of 2560 (decimal) it would look like this:  [00][00][0A][00].  ASCII 0A is considered white space.  So my header would only contain 2 ASCII characters instead of the desired 24.

  • From string to method

    Hi all,
    I have a property file.
    I need to read some string from the file and invoke a method with the same name.
    I know how to read the file.
    My question is how I can get from the string to the actual method.
    Gabi

    How much of this are you doing? If it's a lot, then you may be reinventing the Spring .
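
For the occasional case, plain reflection is enough to go from a name to a method call. The sketch below is illustrative only: the Actions class and its methods are hypothetical stand-ins for whatever the property file names, and only public methods without parameters are handled.

```java
import java.lang.reflect.Method;

final class MethodByNameSketch {
    // Hypothetical target class; its method names would match values in the property file.
    public static final class Actions {
        public String start() { return "started"; }
        public String stop() { return "stopped"; }
    }

    /** Looks up a public no-argument method by name and invokes it on the target. */
    static Object invokeByName(Object target, String methodName) {
        try {
            Method method = target.getClass().getMethod(methodName);
            return method.invoke(target);
        } catch (ReflectiveOperationException e) {
            // Covers a missing method, an inaccessible method, and an exception thrown by the method.
            throw new IllegalArgumentException("Cannot invoke method: " + methodName, e);
        }
    }
}
```

The name would typically come from java.util.Properties, for example props.getProperty("action"), where "action" is a made-up key. Since the name comes from a file, it may be worth checking it against a list of allowed method names before invoking anything.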

  • Eliminating ###'s after hex reconversion to char string

    This is interesting.
    I was originally required to pass a segment to my subroutine. I had to do so using text literals. The   symbol is not recognizable by text literals. Therefore, I had to convert the entire string to hex.
    I then passed the hex segment to the subroutine, and reconverted it back to char string.
    After this I found a ton of unexpected # symbols in my segment. I tried doing a
    REPLACE '###' WITH '' INTO CHARSTRING.
    , but unsuccessfully. There is not much documentation on hex_to_char conversion, there are 2 FMs on 46, but I cannot get them to work. If anyone knows how to eliminate these unexpected # symbols please let me know.

    DATA: HEXSTRINGER(6000) TYPE 'X'.
    DATA: STRINGER(3000) TYPE 'C'.
    FIELD-SYMBOLS: <FSHEX>, <RECONVERT2>.
    CALL FUNCTION 'ARCHIVE_GET_NEXT_RECORD'
                   EXPORTING
                         ARCHIVE_HANDLE       = READ_HANDLE
                   IMPORTING
                         RECORD               = ARC_BUFFER-SEGMENT
                         RECORD_STRUCTURE     = ARC_BUFFER-RNAME
               MOVE ARC_BUFFER-SEGMENT TO SEGSTER.
               assign SEGSTER to <fsHEX> type 'X'.
               MOVE <fsHEX> to HEXSTRINGER.
    APPEND 'REPORT ZSUBR.' TO CODE.
    APPEND 'FORM DYN1 USING HEXSTRINGER.' TO CODE.
    APPEND 'DATA: SEGSTER2(3000) TYPE ''C''.' TO CODE.
    APPEND 'ASSIGN HEXSTRINGER TO <RECONVERT2> TYPE ''C'' .' TO CODE.
    APPEND 'MOVE <RECONVERT2> TO SEGSTER2.' TO CODE.
    APPEND 'WRITE:/ SEGSTER2.' TO CODE.
    APPEND 'ENDFORM.' TO CODE.
    I write out Segster2 and compare it to segster.
    I've done some variations of this where, after I change the char string to a hex string, I replace 'C' with '7C' in hexstringer, then re-replace it in the subroutine, and then reconvert it to a char string, but actually I don't think that was a necessary step. I believe I can just convert it to hex, and then reconvert it back to char in the subroutine.
