Double-byte parser

Hey,
I need to load a 400-character string of data into a text using the FM SAVE_TEXT in chunks of 60 characters.  If the user logs on in Japanese, they could be loading a mix of single-byte and double-byte characters.  I need to be able to check whether breaking my string of data at the 60th character will corrupt a double-byte character.
Does anyone know how to perform this test in ABAP?
Thanks for the feedback! Bill

Hi Bill,
If you are on version 6.2+ you could try using the NUMOFCHAR( ) function.
I guess you could do something like:
* First determine the number of actual
* characters in the 60th and 61st bytes
* (offset 59, length 2).
w_num_of_chars = NUMOFCHAR( w_text+59(2) ).
* If there is only 1, then split on 59.
* (w_split_at is just a placeholder name for the split length.)
IF w_num_of_chars = 1.
  w_split_at = 59.
* Otherwise, we have two characters so we can split at 60.
ELSE.
  w_split_at = 60.
ENDIF.
I must admit I'm a complete novice when it comes to double byte languages and their handling, so that may just be complete gobbledegook.
Hope that helps.
Cheers,
Brad
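
For comparison, here is the same idea as a rough Java sketch rather than ABAP: build each chunk one whole character at a time and start a new chunk whenever the next character would push the encoded size past the limit. It assumes a stateless encoding and uses UTF-8 only as a stand-in for whatever code page the Japanese logon uses. The class and method names are made up for illustration.

```java
import java.nio.charset.Charset;
import java.nio.charset.StandardCharsets;
import java.util.ArrayList;
import java.util.List;

final class SafeChunker {

    /** Splits text into chunks whose encoded size never exceeds maxBytes. */
    static List<String> chunk(String text, Charset charset, int maxBytes) {
        List<String> chunks = new ArrayList<>();
        StringBuilder current = new StringBuilder();
        int currentBytes = 0;
        int i = 0;
        while (i < text.length()) {
            // Take one whole code point so a multi-unit character is never split.
            int codePoint = text.codePointAt(i);
            String ch = new String(Character.toChars(codePoint));
            int size = ch.getBytes(charset).length;
            if (size > maxBytes) {
                throw new IllegalArgumentException("maxBytes is smaller than one character");
            }
            // Start a new chunk if this character would overflow the current one.
            if (currentBytes + size > maxBytes) {
                chunks.add(current.toString());
                current.setLength(0);
                currentBytes = 0;
            }
            current.append(ch);
            currentBytes += size;
            i += Character.charCount(codePoint);
        }
        if (current.length() > 0) {
            chunks.add(current.toString());
        }
        return chunks;
    }

    public static void main(String[] args) {
        // 'a' and 'b' take 1 byte each in UTF-8, each kana takes 3 bytes.
        String mixed = "a\u3042\u3044b";
        for (String piece : chunk(mixed, StandardCharsets.UTF_8, 4)) {
            System.out.println(piece + " (" + piece.getBytes(StandardCharsets.UTF_8).length + " bytes)");
        }
    }
}
```

Measuring each character separately is slow for large inputs, but for a 400-character string it keeps the logic easy to check.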

Similar Messages

  • Does Oracle XML Parser support double byte charset?

    Hi,
    Does Oracle XML Parser support double byte characters such as Korean or Chinese? If so, please tell me what version and how to construct xml/xsl files (...encoding="???")?
    Thanks for any help,
    Tuan

    Hi Raymond,
    Thank you for your help. It worked when I ran your posted code in JDeveloper. However, when I tried it in my real application, it did not work.
    The problem is that, for localization purposes, the texts my application displays in the browser are saved in a Unicode file. At run time, depending on the language settings in the browser, a Java servlet retrieves those texts and saves them in a formatted XML StringBuffer. Then it uses an existing XSL stylesheet file and OracleXMLParser to generate the output HTML.
    It has worked fine with English, French and other single-byte languages, but it fails
    for double-byte characters such as Korean or Chinese. I also tried different charsets in the XML file.
    The following is one of returning errors:
    -- oracle.xml.parser.v2.XSLException: XSL-1004: Error while parsing input XML document (<Line 1, Column 552>: XML-0221: (Fatal Error) Invalid char in text.)
    I run this app in win2000/IIS with ServletExec3.0, JDK1.2.2 and OracleXMLParser v2.0.2.10
    Thank you for any help,
    Tuan
    Originally posted by Raymond Hayes Jr:
    Nothing fancy 'cause I'm half asleep but I used your xml/xsl and it seemed to work. No errors anyway. This is what I put together in JDeveloper 3.2
    package demo;
    import javax.servlet.*;
    import javax.servlet.http.*;
    import java.io.*;
    import java.net.*;
    import java.util.*;
    import oracle.xml.parser.v2.*;
    public class CuriosityKilledTheCat extends HttpServlet {
        /**
         * Initialize global variables
         */
        public void init(ServletConfig config) throws ServletException {
            super.init(config);
        }
        /**
         * Service the request
         */
        public void service(HttpServletRequest request, HttpServletResponse response) throws ServletException, IOException {
            try {
                XSLStylesheet xsl = new XSLStylesheet( new URL ("file:///c:\\temp\\input.xsl") , null );
                XSLProcessor xp = new XSLProcessor();
                XMLDocument xd = new XMLDocument ();
                XMLDocumentFragment xf = new XMLDocumentFragment();
                xf = xp.processXSL ( xsl , new URL ( "file:///c:\\temp\\input.xml") , null );
                System.out.println ( "here" );
                xd.appendChild( xf );
                xd.print ( response.getOutputStream() );
            } catch ( Exception e ) {
                System.out.println ( e.getMessage() );
            }
        }
        /**
         * Get Servlet information
         * @return java.lang.String
         */
        public String getServletInfo() {
            return "demo.CuriosityKilledTheCat Information";
        }
    }
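
One thing that may be worth checking in a case like this is whether the bytes handed to the parser really match the encoding named in the XML declaration. The sketch below is only an illustration: it uses the standard JAXP parser instead of OracleXMLParser, and the `greeting` element and class name are invented. It builds the XML in memory, converts it to bytes with an explicit UTF-8 charset, and parses those bytes.

```java
import java.io.ByteArrayInputStream;
import java.nio.charset.StandardCharsets;
import javax.xml.parsers.DocumentBuilderFactory;
import org.w3c.dom.Document;

final class XmlEncodingDemo {

    /** Wraps text in a small XML document, parses it, and returns the text node. */
    static String roundTrip(String text) {
        // The declared encoding and the charset used for getBytes must agree.
        String xml = "<?xml version=\"1.0\" encoding=\"UTF-8\"?><greeting>" + text + "</greeting>";
        byte[] bytes = xml.getBytes(StandardCharsets.UTF_8);
        try {
            Document doc = DocumentBuilderFactory.newInstance()
                    .newDocumentBuilder()
                    .parse(new ByteArrayInputStream(bytes));
            return doc.getDocumentElement().getTextContent();
        } catch (Exception e) {
            throw new IllegalStateException("XML could not be parsed", e);
        }
    }

    public static void main(String[] args) {
        // Two Hangul syllables, written as escapes to keep this source ASCII-only.
        System.out.println(roundTrip("\uc548\ub155"));
    }
}
```

If the string were turned into bytes with the platform default charset while the declaration still said UTF-8, non-ASCII text could reach the parser as invalid byte sequences. Whether that is what happened in the application above is a guess.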

  • Regular Expressions and Double Byte Characters ?

    Is it possible to use Java Regular Expressions to parse
    a file that will contain double byte characters ?
    For example, I want a regular expression to match the following line
    tag="double byte stuff" id="double byte stuff"

    The comments on the bytes/strings were helpful. Thanks.
    But I'm still confused as to what matching pattern could be used.
    For example a pattern like:
    [A-Za-z]
    I assume would not match any double byte characters.
    I also assume the following won't work either:
    [\\p{Alpha}]
    because it is posix - US-ASCII only.
    So how do you say "match the tag, then take any characters,
    double byte, ascii, whatever, then match the next tag", per the
    original example?
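
A possible approach, sketched below: Java regular expressions work on Unicode characters rather than bytes, so a negated class such as `[^"]*` matches any run of characters up to the next quote regardless of script. `\p{L}` matches letters in any script, whereas `\p{Alpha}` is ASCII-only unless the `UNICODE_CHARACTER_CLASS` flag is set. The class and method names are invented for the example.

```java
import java.util.regex.Matcher;
import java.util.regex.Pattern;

final class TagMatcher {

    // [^"]* accepts any characters except a double quote, whatever the script.
    private static final Pattern LINE =
            Pattern.compile("tag=\"([^\"]*)\"\\s+id=\"([^\"]*)\"");

    /** Returns {tag value, id value}, or null if the line does not match. */
    static String[] parse(String line) {
        Matcher m = LINE.matcher(line);
        if (!m.find()) {
            return null;
        }
        return new String[] { m.group(1), m.group(2) };
    }

    /** True if the string consists only of letters from any script. */
    static boolean allLetters(String s) {
        return s.matches("\\p{L}+");
    }

    public static void main(String[] args) {
        String[] values = parse("tag=\"\u65e5\u672c\" id=\"\u8a9e\"");
        System.out.println(values[0] + " / " + values[1]);
    }
}
```

This assumes the file has already been read with the right charset, so that the strings hold real characters and not mis-decoded bytes.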

  • ASCII representations of double-byte characters

    My file contains ASCII representations of double-byte CJK characters (output of native2ascii). How do I restore them back to the original native characters?
    I mean, when I load the file with FileInputStream, what I get are all strings like \uabcd. How do I get the characters represented by these strings?

    I am no expert in Unicode so I don't know if this is correct, but I assume that if a String starts with "\u" then there will be 4 more characters that are a hexadecimal representation of the char value. If that's right, then you should be able to parse out the "\uxxxx" and convert it to a char by parsing the hex. For example:
    // the variable unicode is a String like \uabcd
    String hex = unicode.substring(2);
    char result = (char) (Integer.parseInt(hex,16));
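
Extending that idea to a whole line of text, a minimal sketch might look like the following. It walks the string, converts every backslash-u sequence followed by four hex digits into the corresponding char, and copies everything else unchanged. The class name is made up, and malformed escapes are simply left as they are, which is a design choice rather than what native2ascii itself does.

```java
final class UnicodeEscapes {

    /** Replaces each backslash-u-plus-four-hex-digits sequence with its char. */
    static String decode(String input) {
        StringBuilder out = new StringBuilder(input.length());
        int i = 0;
        while (i < input.length()) {
            char c = input.charAt(i);
            boolean isEscape = c == '\\'
                    && i + 6 <= input.length()
                    && input.charAt(i + 1) == 'u'
                    && isHex(input, i + 2);
            if (isEscape) {
                out.append((char) Integer.parseInt(input.substring(i + 2, i + 6), 16));
                i += 6;
            } else {
                // Not a complete escape: copy the character as-is.
                out.append(c);
                i++;
            }
        }
        return out.toString();
    }

    /** True if the four characters starting at 'start' are all hex digits. */
    private static boolean isHex(String s, int start) {
        for (int k = 0; k < 4; k++) {
            if (Character.digit(s.charAt(start + k), 16) < 0) {
                return false;
            }
        }
        return true;
    }

    public static void main(String[] args) {
        // The doubled backslashes produce literal backslash-u text at run time.
        System.out.println(decode("\\u65e5\\u672c\\u8a9e"));
    }
}
```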

  • Encoded double byte characters string conversion

    I have double byte encoded strings stored in a properties file. A sample of such string below (I think it is japanese):
    \u30fc\u30af\u306e\u30a2
    I am supposed to read it from the file, convert it to the actual string and use it in the UI. I am not able to figure out how to do the conversion: the string contains the text as is, i.e. a backslash character, the character u, and so on. How do I convert it to the correct text (either using ai::UnicodeString or otherwise)?
    Thanks.

    Where did this file come from? Some kind of Java or Ruby export? I don't think AI has anything in its SDK that would natively read that. You could just parse the string, looking for \u[4 characters]. I believe if you created a QChar and initialized it with the integer value of the four-character hex string, it would properly create the character.
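
If the file did come from a Java export, it may help to know how Java itself reads that format, since this gives a reference result to compare a hand-written parser against. The sketch below is in Java, not the AI SDK, and the key name `label` is invented. It relies on `java.util.Properties.load`, which decodes these escapes on its own.

```java
import java.io.IOException;
import java.io.StringReader;
import java.io.UncheckedIOException;
import java.util.Properties;

final class PropertiesEscapes {

    /** Loads properties-format text and returns the decoded value for key. */
    static String read(String fileContent, String key) {
        Properties props = new Properties();
        try {
            // load() turns backslash-u escapes into real characters.
            props.load(new StringReader(fileContent));
        } catch (IOException e) {
            throw new UncheckedIOException(e);
        }
        return props.getProperty(key);
    }

    public static void main(String[] args) {
        // The doubled backslashes stand for the literal text stored in the file.
        String content = "label=\\u30fc\\u30af\\u306e\\u30a2\n";
        System.out.println(read(content, "label"));
    }
}
```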

  • Text strings from VISA read don't match identical looking text constants - could it be double byte characters?

    Our RS232-enabled instrument sends ASCII strings to COM 1 and I read strings in. For example I get the string "TPM", or at least it looks like "TPM" if I display it. However, if I send that to the selector input of a Case structure, and create a case for "TPM", whether the two appear to match varies. Sometimes it matches, and measuring its length returns 3. Sometimes it measures 7 or 11 or 12 characters long, and it doesn't match. I can reproduce a match or a mismatch by my choice of the command that went to the instrument prior to the command that causes the TPM response, but have made no sense of this clue. I have run it through Trim Whitespace, with Both Ends (the default) explicitly selected. I have also turned the string into a byte array, autoindexed a For loop on that, and only passed the bytes if they don't equal 32, or if they don't equal 0, thinking spaces or nulls might be in there, but no better.
    The Trim Whitespace function's Help remarks that it does not remove "double byte characters". But I can't find anything else about "double byte characters". Could this be the problem? Are there functions that can tell whether there are "double byte characters", or convert into or out of them? By "double byte characters", do they just mean Unicode?

    Cebailey,
    The double byte characters are generally used for characters specific to languages other than English.  If you display your message in "'\' Codes Display" in a string indicator, do you see any other characters?  You could also use Hex Display to count the number of bytes in the message.  You are probably getting messages with non-printable characters that might need to be trimmed before you use them in your application.  If you want more information on the '\' Codes Display, there's a detailed description in the LabVIEW Help, which is also available on our website: Backslash ('\') Codes Display
    Caleb W
    National Instruments
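
The same check can be sketched in text form. The Java below is only an illustration of the logic, not LabVIEW code, and the names are made up. It prints the raw bytes in hex so hidden characters become visible, then keeps only printable ASCII. Filtering out just 32 and 0, as described in the question, would still let through bytes such as carriage return (0D), line feed (0A) or escape (1B).

```java
final class ByteInspector {

    /** Formats bytes as space-separated two-digit hex values. */
    static String toHex(byte[] data) {
        StringBuilder sb = new StringBuilder();
        for (int i = 0; i < data.length; i++) {
            if (i > 0) {
                sb.append(' ');
            }
            sb.append(String.format("%02X", data[i] & 0xFF));
        }
        return sb.toString();
    }

    /** Keeps only printable ASCII (0x20 to 0x7E) and trims surrounding spaces. */
    static String printableAscii(byte[] data) {
        StringBuilder sb = new StringBuilder();
        for (byte b : data) {
            if (b >= 0x20 && b <= 0x7E) {
                sb.append((char) b);
            }
        }
        return sb.toString().trim();
    }

    public static void main(String[] args) {
        // A made-up reply: an escape byte, then "TPM", then CR and LF.
        byte[] reply = { 0x1B, 0x54, 0x50, 0x4D, 0x0D, 0x0A };
        System.out.println(toHex(reply));
        System.out.println(printableAscii(reply));
    }
}
```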

  • Using Double Byte Characters in URL For Session Variables

    When I supply the value for a session variable in the URL for an IRPT page where the value contains double byte characters, Japanese in this case, the characters are corrupted by the time they are entered for the session variables.  Does anyone know a solution to this problem or experience in this area?  Currently using xMII 11.5 SR3.

    Hi Bryan,
    I would suspect that under the covers the session variable is of datatype string.  For double byte characters, it would need to be wstring.  There is a better explanation to be found at:
    Link: [Kanji and Java Datatypes|http://www.unix.com.ua/orelly/java-ent/jenut/ch10_04.htm] or you can try google on  Kanji Datatype  OR Kanji Java Datatype
    It could also be a problem with the operating system which I ran into about 10 years ago, but I would hope that Microsoft had moved beyond that by now.
    Maybe some more technical folks could chime in to confirm or deny my explanation.
    Mike
    Edited by: Michael Appleby on Jul 8, 2008 5:23 PM
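
Another thing that might be worth ruling out is how the value is encoded in the URL itself. Non-ASCII characters in a query string are normally percent-encoded, and both sides have to agree on the charset. The sketch below shows that round trip in plain Java. That the xMII server decodes as UTF-8 is an assumption, and the class name is invented.

```java
import java.io.UnsupportedEncodingException;
import java.net.URLDecoder;
import java.net.URLEncoder;

final class QueryValue {

    /** Percent-encodes a value for use in a URL query string, using UTF-8. */
    static String encode(String value) {
        try {
            return URLEncoder.encode(value, "UTF-8");
        } catch (UnsupportedEncodingException e) {
            // UTF-8 is required on every Java platform, so this should not happen.
            throw new IllegalStateException(e);
        }
    }

    /** Reverses encode(), again assuming UTF-8. */
    static String decode(String encoded) {
        try {
            return URLDecoder.decode(encoded, "UTF-8");
        } catch (UnsupportedEncodingException e) {
            throw new IllegalStateException(e);
        }
    }

    public static void main(String[] args) {
        String encoded = encode("\u65e5\u672c");
        System.out.println(encoded);
        System.out.println(decode(encoded));
    }
}
```

If the sender encodes with one charset and the receiver decodes with another, the characters arrive corrupted even though each side is consistent on its own.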

  • Crystal XI R2 exporting issues with double-byte character sets

    NOTE: I have also posted this in the Business Objects General section with no resolution, so I figured I would try this forum as well.
    We are using Crystal Reports XI Release 2 (version 11.5.0.313).
    We have an application that can be run using multiple cultures/languages, chosen at login time. We have discovered an issue when exporting a Crystal report from our application while using a double-byte character set (Korean, Japanese).
    The original text when viewed through our application in the Crystal preview window looks correct:
    性能 著概要
    When exported to Microsoft Word, it also looks correct. However, when we export to PDF or even RPT, the characters are not being converted. The double-byte characters are rendered as boxes instead. It seems that the PDF and RPT exports are somehow not making use of the linked fonts Windows provides for double-byte character sets. This same behavior is exhibited when exporting a PDF from the Crystal report designer environment. We are using Tahoma, a TrueType font, in our report.
    I did discover some new behavior that may or may not have any bearing on this issue. When a text field containing double-byte characters is just sitting on the report in the report designer, the box characters are displayed where the Korean characters should be. However, when I double click on the text field to edit the text, the Korean characters suddenly appear, replacing the boxes. And when I exit edit mode of the text field, the boxes are back. And they remain this way when exported, whether from inside the design environment or outside it.
    Has anyone seen this behavior? Is SAP/Business Objects/Crystal aware of this? Is there a fix available? Any insights would be welcomed.
    Thanks,
    Jeff

    Hi Jeff,
    I searched on the forums and got the following information:
    1) If font linking is enabled on your device, you can examine the registry by enumerating the subkeys of the registry key at HKEY_LOCAL_MACHINE\SOFTWARE\Microsoft\Windows NT\CurrentVersion\FontLink\SystemLink to determine the mappings of linked fonts to base fonts. You can add links by using Regedit to create additional subkeys. Once you have located the registry key that has just been mentioned, highlight the font face name of the font you want to link to and then, from the Edit menu, click Modify. On a new line in the "Value data" field of the Edit Multi-String dialog box, enter "path and file to link to," "face name of the font to link".
    2) "Fonts in general, especially TrueType and OpenType, are 'Unicode'."
    Since you are using a TrueType font, it may already be a Unicode font. However, if Bud's suggestion works, then nothing is better than that.
    Also, could you please check the output from the Crystal designer with a different version of PDF than the current one?
    Meanwhile, I will look out for any additional/suitable information on this issue.

  • Invoke-WebRequest - Double byte characters issue in windows 8.1

    I am trying to write a PowerShell script to download a file from a web server, but it fails. The path contains double-byte characters.
    It runs successfully on Windows Server 2012 and 2012 R2, but fails on Windows 8 and 8.1.
    Do there any difference below Windows server and client powershell?
    The region settings are the same on Windows Server 2012 and Windows 8.
    Script as below
    Invoke-WebRequest -Uri " http://hostname/m/%E9%...../......./...../xxx.jpg"

    Security settings are one possible cause of this.
    Since we don't have your URL we cannot reproduce this. 
    It is "different". Using "difference" had me confused for a bit.  I thought you were trying to figure out the difference between two things.
    Use:
    $wc=New-Object System.Net.WebClient
    $wc.DownloadFile($url,'c:\file.jpg')
    You will see fewer issues and it is faster.
    ¯\_(ツ)_/¯

  • Given filename or path contains Unicode or double-byte characters.Retry using ASCII characters for filename and path What does this mean? it happen when I publish an OAM

    Given file name or path contains Unicode or double-byte characters. Retry using ASCII characters for filename and path
    What does this mean? It is happening when I try to publish an OAM for Dreamweaver.
    Also: How can I specify the browser in Edge Animate? It is just going wherever. Are there no Preferences for Edge Animate?
    BTW. Just call it Edge. Seriously. Do you call it Illustrator Draw? Photoshop Retouching?

    No, my file name is mainContent.oam
    My project name is mainContent.an
    This error happens when I try to import into Dreamweaver. Sorry, I wasn't clear on that earlier.
    I thought maybe it was because I had saved my image as a PNG, so I re-saved it as an SVG, but I still get the error.
    Do I have a setting in Dreamweaver CC that is wrong? Should I try this in Dreamweaver CS6? I might try that next.
    Why is this program so difficult? I know Flash. I know After Effects. I can work the timeline part just great. It's always in the export that I have problems.
    On a MacPro, 10.7.
    Are you an Adobe person or just a nice helper?

  • How do I convert a double-byte encoded file to single-byte ASCII?

    Hello,
    I am working with XML files (apparently coded in UTF-8) which are encoded in double-byte characters.
    The problem is the characters for end of line: 00 0D 00 0A
    This double byte end of line is causing a problem with a legacy conversion tool (which deals with 0D 0A). The file itself contains no
    accented/international characters, so in principle converting to single-byte should not cause any problems.
    I have tried to convert this file with tools like native2ascii and the conversion tools that are part of Notepad++ but without
    any luck - the "00 0D 00 0A" are still present in the output
    Can anyone point me to a tool or some code that can convert this file into single-byte?
    Thank you.

    Amiens wrote:
    native2ascii.exe -encoding UTF-16 -reverse INPUT.xml OUTPUT.xml
    gives 00 00 0 0D 00 00 00 0A
    so clearly that is not the required output.

    What you've got there is UTF-16 encoded text that's been converted to UTF-16 again. Get rid of the "-reverse" option and you should see the result you expect.
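
If a few lines of code are acceptable instead of a tool, the conversion could be sketched in Java roughly as follows: decode the bytes as UTF-16 and write them back out as UTF-8, which is byte-for-byte identical to ASCII when the content has no accented or international characters. The class and method names are invented for the example.

```java
import java.io.IOException;
import java.nio.charset.StandardCharsets;
import java.nio.file.Files;
import java.nio.file.Path;
import java.nio.file.Paths;

final class Utf16Transcoder {

    /** Decodes UTF-16 bytes (BOM optional, big-endian if absent) and re-encodes as UTF-8. */
    static byte[] transcode(byte[] utf16Bytes) {
        String text = new String(utf16Bytes, StandardCharsets.UTF_16);
        return text.getBytes(StandardCharsets.UTF_8);
    }

    /** Converts a whole file; fine for files that fit in memory. */
    static void convert(Path input, Path output) throws IOException {
        Files.write(output, transcode(Files.readAllBytes(input)));
    }

    public static void main(String[] args) throws IOException {
        if (args.length == 2) {
            convert(Paths.get(args[0]), Paths.get(args[1]));
        } else {
            System.out.println("usage: java Utf16Transcoder INPUT.xml OUTPUT.xml");
        }
    }
}
```

If the XML declaration names an encoding, it would need to be changed to match the new file, otherwise a strict parser may reject the result.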

  • Report in PDF format can not support double byte character ?

    Hi :
    I am developing a Simplified Chinese application. When I generate a report in PDF format, none of the double-byte characters display correctly. I am using dev6i+patch5. Why?
    Is it a bug in the form or not?
    HELP!!!

    Try this: install Acrobat 4.0 or above and make sure you have Acrobat Distiller, which adds an Acrobat Distiller printer to Windows. Print the report directly to that printer (don't change desformat) and you should get your PDF file with the correct font settings.

  • JSF and Double Byte Character

    Hi,
    I wanted to know how to handle <h:outputText> with chinese character or double byte character.
    See sample code below :
    <%@ page language="java" contentType="text/html; charset=UTF-8"      pageEncoding="UTF-8"%>
    <meta http-equiv="Content-Type" content="text/html; charset=UTF-8">
    <h:form styleClass="form" id="form1">
    <% request.setCharacterEncoding("UTF-8"); %>
    <h:inputText styleClass="inputText" id="text1"></h:inputText>
    <hx:commandExButton type="submit" value="Submit" styleClass="commandExButton" id="button1"      action="#{pc_SubmitTest.doButton1Action}"></hx:commandExButton>
    <h:outputText styleClass="outputText" id="text2"></h:outputText>
    </h:form>
    When you enter a double-byte character and submit,
    the value on the output screen does not render properly.
    I tried similar code in a JSP and it works fine.
    Does anybody know how to solve this problem?
    Is there anything I need to do at the pagecode level?
    Thank you.
    Reinardy

    Problem was due to the fact that I was trying to generate the excel file in char stream instead of byte stream

  • To read Double Byte Character

    Hello All,
    What is meant by a double-byte character?
    I want to read this double-byte character. Is there any function module or class to read it, or any other way to get the double-byte character?
    This is very urgent; please help, I will reward points.
    Thanks,
    Karan

    Hello Indrakaran,
    To determine field properties in Unicode, please take a look at the new Unicode class CL_ABAP_CHAR_UTILITIES.
    CHARSIZE attributes:
    The CHARSIZE attribute declares the length of a C(1) field in bytes - that is, one byte in NUS, and either two or four in US, UTF-8 or UTF-16 respectively.
    Hope it helps,
    Heinz

  • PDF acceleration F5 BigIP WA and double byte characters

    We have been trying to use the F5 appliance from BigIP to accelerate the delivery of PDF files from SharePoint over the WAN.  However, we encountered problems with the double-byte files many months ago and have been trying to resolve the problem with F5.  We have turned off PDF acceleration on the F5 because of the problems.  The problem occurs when PDF files have Kanji characters in the file name.  If the file names are English (single byte) the problem does not occur, even if the content of the PDF contains Kanji characters.
    After many months of working with F5, they are now saying that the problem is with the Adobe plug-in to Internet Explorer.  Specifically they say:
    The issue is a result of Adobe's (not F5's) handling of the linearization request of PDF’s with the Japanese character set over 300 KB when the Web Accelerator is enabled on the BigIP (F5) appliance.  We assume the issue exists for all double-byte languages, not only Japanese.  If a non-double byte character set is used, this works fine.  “Linearization” is a feature which allows the Adobe web plug-in to start displaying the PDF file while it is still being downloaded in the background.
    The F5 case number is available to anybody from Adobe if interested.
    The F5 product management  and the F5 Adobe relationship manager have been made aware of this and will bring this issue up to Adobe.  But this is as far as F5 is willing to pursue as a resolution.  F5 consider this an Adobe issue, not a F5 issue.
    Anybody know if this is truly a bug with the PDF browser plug-in?  Anybody else experienced this?

    Your searches should have also come up with the fact that CR XI R2 is not supported in .NET 2008. Only CR 2008 (12.x) and Crystal Reports Basic for Visual Studio 2008 (10.5) are supported in .NET 2008. I realize this is not good news given the release time line, but support or non support of cr xi r2 in .net 2008 is well documented - from [Supported Platforms|https://www.sdn.sap.com/irj/sdn/go/portal/prtroot/docs/library/uuid/7081b21c-911e-2b10-678e-fe062159b453] to [KBases|http://www.sdn.sap.com/irj/servlet/prt/portal/prtroot/com.sap.km.cm.docs/oss_notes_boj/sdn_oss_boj_dev/sap(bD1lbiZjPTAwMQ==)/bc/bsp/spn/scn_bosap/notes.do], to [Wiki|https://wiki.sdn.sap.com/wiki/display/BOBJ/WhichCrystalReportsassemblyversionsaresupportedinwhichversionsofVisualStudio+.NET].
    Best I can suggest is to try SP6:
    https://smpdl.sap-ag.de/~sapidp/012002523100015859952009E/crxir2win_sp6.exe
    MSM:
    https://smpdl.sap-ag.de/~sapidp/012002523100000634042010E/crxir2sp6_net_mm.zip
    MSI:
    https://smpdl.sap-ag.de/~sapidp/012002523100000633302010E/crxir2sp6_net_si.zip
    Failing that, you will have to move to a supported environment...
    Ludek
    Follow us on Twitter http://twitter.com/SAPCRNetSup
    Edited by: Ludek Uher on Jul 20, 2010 7:54 AM
