Weird character filter

I used to code on Red Hat. Something nasty happened and I switched to Mandrake to give it a try. Among my source files I had a class with a method that filtered characters: some usual ones (dollar, punctuation) and some not so usual (yen, euro, mid-height dot, long hyphen, long underline, ...). When I reopened the source file in Mandrake, all of the characters in the second group had turned into "?\200", "?\202\2004" and things of the sort, sometimes even worse. I guess this has to do with encoding and Mandrake, but it is not only a question of display: my classes, which have compiled for a long time without any problems, no longer compile because of the characters in this source file. Is there a way around this? I've tried to find out what character set those escape sequences come from, but without success.
Any ideas?
Here's (a version of) the method:
    private String filterDoc(String in){
     char c;
     char prec;
     StringBuffer out = new StringBuffer();
     for (int i = 0; i < in.length(); i++){
         c = in.charAt(i);
         if ( ( c >= 'a' ) && ( c <= 'z' ) ) out.append(c);
         else switch ( c ) {
         case '.' : {
          prec = in.charAt( i > 1 ? i-2 : i+2);
          if ( (prec == '.') || (prec == ' ') ) out.append(".");
          else out.append("\n");
         } break;
         case '��' : out.append('.');break;  // this line : mid-height dot
         case ',' : out.append('\n'); break;
         case ';' : out.append('\n'); break;
         case ':' : out.append('\n'); break;
         case '?' : out.append('\n'); break;
         case '$' : out.append('\n'); break;
         case '��' : out.append('\n'); break;  // this line : pound symbol
         case '�,B,(B' : out.append('\n'); break; // this line : euro symbol
         case '��' : out.append('\n'); break;  // this line : yen symbol
         case '��' : out.append('\n'); break;  // this line: copyright symbol
         case '*' : out.append('\n'); break;
         case '-' : out.append('-'); break;
         case '��$(D' (B: out.append('-'); break; // this line : hyphen
         case '_' : out.append('-'); break;
         default : out.append(c); break;
         }
     }
     return out.toString();
    }
I say it's 'a version' of the method because in Windows it looks just like it does in Mandrake, and when I copy and paste it here I can see that some things have changed. For instance, I see squares (or black squares) up here, but I don't see any squares in Emacs; there I get the \20something sequences.

The Java compiler uses the platform's default character encoding to read source files, so for portability you shouldn't write non-ASCII characters directly. Instead, use the \uXXXX Unicode escape sequences, for instance \u20AC for the euro sign.
It appears that in RedHat your default encoding was UTF-8 in which the characters you mention are encoded with multiple bytes, and when you transferred to Mandrake the default encoding changed and now it's some encoding where each character is encoded with a single byte.
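To see why multi-byte characters fall apart this way, here is a minimal sketch. It assumes the new default encoding is ISO-8859-1, which is only a guess about the Mandrake setup; the class and method names are made up for illustration.

```java
import java.nio.charset.StandardCharsets;

public class MojibakeDemo {
    // Encodes the text as UTF-8, then decodes those bytes as ISO-8859-1,
    // which is what happens when a UTF-8 file is read with a single-byte
    // default encoding (ISO-8859-1 is an assumption here).
    static String misread(String text) {
        byte[] utf8 = text.getBytes(StandardCharsets.UTF_8);
        return new String(utf8, StandardCharsets.ISO_8859_1);
    }

    public static void main(String[] args) {
        String euro = String.valueOf((char) 0x20AC); // euro sign
        String garbled = misread(euro);
        // One character in, three characters out.
        System.out.println(garbled.length());
        // Print each resulting character as an octal escape, the way Emacs
        // displays raw bytes.
        for (char c : garbled.toCharArray()) {
            System.out.printf("\\%o ", (int) c);
        }
        System.out.println();
    }
}
```

The euro sign comes back as three separate characters, printed as \342 \202 \254, which is the same family of octal sequences as the ones Emacs is showing you.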
There are a number of things you can do:
o Convert the source files to use Unicode escape sequences. The easiest way is to use the native2ascii tool that comes with the SDK. Run this command on all your source files:
    native2ascii -encoding UTF-8 inputfile outputfile
  Also, if you are looking for the escape sequence of a particular character, you can find it at http://www.unicode.org/charts/
o Continue using UTF-8 and pass the option -encoding UTF-8 to the compiler:
    javac -encoding UTF-8 MyClass.java
o Change the platform's default character encoding by modifying locale settings, for instance:
    export LANG=en_US.UTF-8
  You can use the command "locale -a" to list all available locales.
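As a rough sketch of the first option, this is what the non-ASCII cases of the filter might look like once written with Unicode escapes. The code points are inferred from the comments in the question (the "long hyphen" is assumed to be the em dash, U+2014), and the class is a trimmed-down, hypothetical version of the original method, not a drop-in replacement.

```java
public class EscapedFilter {
    // Simplified, hypothetical version of the filter from the question:
    // only the non-ASCII cases are shown, each as a Unicode escape so the
    // source file stays pure ASCII.
    static String filterDoc(String in) {
        StringBuilder out = new StringBuilder();
        for (int i = 0; i < in.length(); i++) {
            char c = in.charAt(i);
            switch (c) {
                case '\u00B7': out.append('.'); break;   // mid-height dot
                case '\u00A3':                            // pound sign
                case '\u20AC':                            // euro sign
                case '\u00A5':                            // yen sign
                case '\u00A9': out.append('\n'); break;  // copyright sign
                case '\u2014': out.append('-'); break;   // em dash (assumed "long hyphen")
                default: out.append(c); break;
            }
        }
        return out.toString();
    }

    public static void main(String[] args) {
        // Input is also built from escapes, so this file compiles the same
        // way under any default encoding.
        System.out.println(filterDoc("a\u00B7b\u2014c"));
    }
}
```

Because the file contains only ASCII, it should compile identically on Red Hat, Mandrake and Windows without any -encoding option.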

Similar Messages

  • When typing an 'L' followed by forward slash '/' I get a weird character

    I can't figure this out. whenever I type a string of text that requires an L and forward slash together I get some weird character that resembles an 'f' like for designating a 'function' in an equation only its backwards. this occurs in terminal, like in this example: cd /usr/local/name the 'l' at the end of local and the forward slash become one character. I can't duplicate this character any other time, just when l/ are in a string. Any ideas? I get this in terminal and entourage (only in plain text mode - html emails the character doesn't show) mainly.

    What you are seeing is a diacritic of an L with a line through it. It does the same for o/. You are probably using a font like Palatino, Hoefler, Chicago, or New York.
    I don’t use Terminal so I don’t know how to change it in that application, but in TextEdit
    1. open the Font panel (Format/Fonts/Show Fonts, or Command-t)
    2. from the Action menu (small gear shape in lower left corner) choose Typography
    3. in the Typography window select the Diacritics section (if there is no Diacritics, select in the Fonts panel one of the fonts mentioned above, like Chicago and the Typography window will change to show a Diacritics panel)
    4. click Don’t Compose Diacritics button
    I couldn’t find a Show Fonts command in Terminal, but now that you know the procedure, maybe you can figure it out.

  • Weird Character in Code 128 Codeset B Barcode

    Hello Everyone,
    I am encountering two issues dealing with barcodes in Crystal Reports 2008:
    I have an old Seagate Crystal Reports 7 report that uses Code 128 Codeset B barcodes.  I have converted the report to Crystal Reports 2008 and have tried using the new built-in 'Change to Barcode' function and the old BarcodeC128B() function that the old report used to create the barcode. The barcode is set to use font 'Code128'.  However, in both instances, and when I start fresh with the built-in function in a blank report, strange characters are appended to the end of the barcode.  Most of the characters are capital A's with different accent marks over the character, and sometimes they are capital E's with accent marks.  These barcodes do not match the barcodes that are generated out of the old version of the report.  Unfortunately, I do not have a scanner to test to see if they are working the same, but my instinct says they would not work with these extra characters.
    My next issue occurs when I export the report output to Microsoft Word.  The barcodes seem to change, i.e. they become smaller and the weird character on the end changes.  When I export report output from the old version of the Crystal Report to Word, the barcodes do not change.  Has anyone else experienced issues exporting barcodes to Word from Crystal Reports 2008?  I am using Crystal Reports 2008 version 12.3.5.925.  Thank you for your help.

    Hi Brian,
    I'm not convinced yet that it's the code set.  I am using Microsoft Server 2008 R2, and the fonts have been copied to the fonts folder just as I have done on every other platform where I have needed to produce or view the UPCs.  I have copied a file with UPCs that were generated correctly from Crystal Reports 8 over to the server and the UPCs are displayed correctly, which tells me the font is working on the server.
    I have contacted Azalea's technical support twice but I have gotten no response.  Has anyone else had a similar issue?

  • Weird character in search result for HitHighlightedSummary property

    We have an INTEG search center that displays the managed property 'HitHighlightedSummary', which is an out-of-the-box managed property.
    One of the results is displayed as below on the search results page:
    "Issue 9 brings us the latest news from the Home Oxygen Service team … Don't forget to read CorpNEWS next week to get the latest Home Oxygen Service update.  "
    Notice the weird character at the end. How does that happen? We have the same setup in the DEV environment, but we don't get this issue there. How can we track it down?

    Character code 8203 (U+200B) is the Unicode zero-width space, which marks a permitted line break position:
    http://www.fileformat.info/info/unicode/char/200b/index.htm
    It is commonly abbreviated ZWSP. This character is intended for invisible word separation and for line break control; it has no width, but its presence between two characters does not prevent increased letter spacing in justification.
    A third-party editor may be adding them if they are not originally present in your script/code.
    Are you using any such editor tools?
    It will be helpful to open a troubleshooting ticket with Microsoft so that we can look at the issue in more depth.
    http://social.msdn.microsoft.com/Forums/sharepoint/en-US/23804eed-8f00-4b07-bc63-7662311a35a4/why-does-sharepoint-put-in-character-code-8203-in-a-richtext-field?forum=sharepointdevelopment
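
    If you want to confirm that the stray character really is a zero-width space, or strip it before display, something along these lines might help. It is a plain Java sketch, not SharePoint API code, and the class and method names are invented for illustration.

```java
public class ZeroWidthSpaceCleaner {
    // U+200B ZERO WIDTH SPACE, decimal 8203.
    static final char ZWSP = (char) 8203;

    // Counts how many zero-width spaces a string contains.
    static int countZwsp(String s) {
        int count = 0;
        for (int i = 0; i < s.length(); i++) {
            if (s.charAt(i) == ZWSP) {
                count++;
            }
        }
        return count;
    }

    // Returns a copy of the string with every zero-width space removed.
    static String stripZwsp(String s) {
        return s.replace(String.valueOf(ZWSP), "");
    }

    public static void main(String[] args) {
        // Sample text with an invisible trailing zero-width space.
        String summary = "latest Home Oxygen Service update." + ZWSP;
        System.out.println("ZWSP count: " + countZwsp(summary));
        System.out.println("Cleaned: [" + stripZwsp(summary) + "]");
    }
}
```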

  • Weird character input in textfield when hitting CTRL+somekeys

    Hi,
    Whenever an input textfield has the focus, some ctrl+something key combinations (for example ctrl+s, ctrl+o) produce some weird character input, that is, these weird characters (certainly non-ascii) appear in the textfield as if you had typed them.
    This is obviously a bug, but is there any workaround? How can I prevent these combinations of keys to produce any text input at all? (without removing the focus from the textfield)?
    This only happens in Flash Player (and Adobe AIR): I've never seen those characters appear in any other application as a result of ctrl+S, ctrl+O or any key combination (except alt+<numerical character code>).
    Thanks in advance
    m.

    I'm not entirely sure that the issue we're facing is a font issue - I've managed to set a font for the TextField in question that can handle Japanese as well as English characters (in this case I'm using the Bitstream Cyberbit font), following guidelines in an article on the subject:
    http://www.onjava.com/pub/a/onjava/2001/04/12/internationalization.html
    If I enter the Japanese characters in a window of a native application, and then copy & paste into the TextField in my Java applet, they appear when using this font, and it seems like the applet can interpret them OK (in IE at least, still haven't tested in NS as yet) - the problem is that when entering the characters directly into the applet using the Win2k IME facility, they only appear as question marks.
    Damian.

  • Weird character while printing amount field in ADOBE form

    When the amount field is set to any display pattern in the PDF form, it is displayed like ¤7.41.
    I have tried all the display patterns, with the same result.
    All display patterns show $ as the first character, but the output displays this weird character '¤' instead.
    Please advise me how to eliminate this, or is there a way to hide this character?

    Duplicate post - thread locked.

  • A weird character becomes another one after saving it with RSKC!

    When we load a master data from R3 to BW, get an invalid character error which shows the weird character is kind of like two vertical parallel bars (sorry I am unable to give it here since whenever I submit this question, it will become other characters which could confuse you guys), we run RSKC to input this weird character, when inputting it, it shows as it is, but after saving it, find it becomes ##.  And the data load still fails.
    Any solution?
    Thanks
    Message was edited by: Kevin Smith

    hi Kevin,
    check if helps
    RSKC and # sign in the middle of a word
    Permitted characters during BW data load
    Load problem.......Special Chars..Urgent..

  • Weird character 'u00A4' display  at output of  ADOBE form in SAP for amount

    When a display pattern is set for the amount field of an ADOBE form in SAP, the output displays ' ¤ '.
    All display patterns show '$' at the beginning, but when the output is generated it displays '¤'.
    Please advise me how to eliminate this, or is there a way to hide this character?

    I am not saying this is your problem BUT I was troubled for quite a while by having unprintable characters imbedded within text fields.  It turns out that Ctrl V character was being embedded within text by user.
    So I suggest that you view the data while in debug and do it in hex.  This was the only way I was able to see that I had a problem with the text itself.  Then I was able to add logic to filter for these errors.
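
    The same idea (look at the field in hex, then filter) can be sketched outside SAP. The example below is plain Java rather than ABAP, and the class and method names are hypothetical; it only illustrates the technique.

```java
public class ControlCharInspector {
    // Renders each character as a four-digit hex code so that
    // unprintable characters become visible.
    static String toHex(String s) {
        StringBuilder sb = new StringBuilder();
        for (int i = 0; i < s.length(); i++) {
            if (i > 0) {
                sb.append(' ');
            }
            sb.append(String.format("%04X", (int) s.charAt(i)));
        }
        return sb.toString();
    }

    // Drops ISO control characters. Note that this also removes tabs and
    // line breaks, which may or may not be what you want.
    static String stripControlChars(String s) {
        StringBuilder sb = new StringBuilder();
        for (int i = 0; i < s.length(); i++) {
            char c = s.charAt(i);
            if (!Character.isISOControl(c)) {
                sb.append(c);
            }
        }
        return sb.toString();
    }

    public static void main(String[] args) {
        // 0x16 is the control code produced by Ctrl+V.
        String field = "7.41" + (char) 0x16;
        System.out.println(toHex(field));
        System.out.println(stripControlChars(field));
    }
}
```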

  • Weird character code

    In some pages, character code shows up in a weird stylized font that is not readable. My settings are in Unicode UTF-8 and in Preference, fonts are Western. Why would it select the font Westwood LET Plain and Zapfino?
    I even tried to uninstall and reinstall Firefox but nothing changed.

    Thanks Cor-el
    I had resolved the issue temporarily by changing the default font to something that didn't glitch.
    Your link and suggestion to review and validate fonts from the Font Book is the best way to make sure the issue stops.
    NOTE: Make sure you verify each font that may cause problems (duplicates) before deleting them... lost one that I liked by selecting all and confirming to resolve duplicates all at once ... :( not good.
    Cheers.

  • Weird character spacing in paragraphs

    We are seeing random spacing in some of our paragraphs. It's not all paragraphs in a file, and it always seems to be the first 1-2 characters of a line, but not every line in a paragraph. All our publications are generated at server level with Typefi, using templates where the main text paragraph styles are set to Adobe World-Ready Paragraph Composer. We also use the In Tools "World Tools" plugin because we occasionally use Arabic, Hebrew, Thai and other scripts, and we don't want to have to activate or deactivate the plugin depending on which titles come through production. Our font is MillerDailyOne, and as we have used this font for some time without issues across many products, I tend to think it's not a font issue per se.
    We have just recently upgraded to CS6 from CS4, and it's only since the upgrade that we are seeing this issue.
    Our current fix for affected paragraphs is to override the Adobe World-Ready Paragraph Composer with Adobe Paragraph Composer .... however this isn't ideal as it is a time consuming process going through an entire book identifying the affected paragraphs and we are finding that we fix one lot of paragraphs only to find that others are then affected somehow during our layout phase. Also not all of our titles have this issue... some titles go through production without any problems at all!
    So my question is: has anyone seen this before and does anyone know exactly what the issue is?  Is it a bug with the Adobe World-Ready Paragraph Composer? The only other reference I've found to this issue was the reverse of what we are experiencing: extra space when using Minion Pro Font
    ie. in first paragraph: E lvebakken/l ounge/a n/k itchen; in the 2nd paragraph: after the bracket; in third paragraph: G rillen

    OK, so I exported two files to IDML and then opened them in CS6 again, and the spacing remained the same. However, when I opened each of these IDMLs in CS4, the issue disappeared!
    As for PDFing, the plot thickens! When "exporting" as a PDF the issue disappears, BUT if I generate the PDF via the "print" option in InDesign (which is how we generate our "to printer" PDFs) the issue remains.
    Also, what's interesting is that in one chapter the "extra space" is consistently after the first character of the line, whilst in the second chapter it is consistently after the second character of the line, except for the word "The", where it comes in as "T he". And whilst files predominantly come off the server with this issue, I've just had feedback from production that it's happening randomly in chapters during layout. If chapters are kerned it appears OK at first, only to revert to the weird spacing later on. Also, it appears to be mostly confined to a few pages here and there rather than consistently throughout a chapter. FYI, master pages are all the same. Looking at this more closely, it seems that lines where words are hyphenated over two lines are mostly affected, but not always.
    Here are two more examples with non-printing characters (I've scaled them up so you can hopefully see the invisibles). The first example is one character then a space; the second example is two characters then a space.

  • Weird: Character in View cut out.

    I have a very strange behavior in Crystal Reports. Somehow I feel that there is a simple cause for this.
    OK.
    I have a field in Crystal Reports containing the STATIC string "2-û". The string is displayed correctly in the designer.
    When I save the report and display it through the .NET Crystal Reports viewer, the - char disappears. It only displays "2û".
    You can test this behavior easily. It seems to be only the combination of the two characters -û that causes this. The reason I need this is that I am displaying a barcode, which miraculously couldn't be read anymore when the calculated checksum was this - character; the û is the stop character of that particular font.
    Does anyone have an idea?
    Thank You!

    I have found a related article:
    Likely Bug: Chr(173) not printing in CR Basic 2008

  • Weird character counting

    I'm having a problem with this bit of code. It works perfectly fine if there are no line breaks (\n) in the TextPane. If there are line breaks, the indexOf method seems to ignore them but the TextPane doesn't, so when the format is applied it is shifted to the right by as many characters as there are line breaks before it.
    please help.
    clarkie
           StyledDocument doc = mainTextArea.getStyledDocument();
            // create format for keywords.
            MutableAttributeSet keyWordFormat = new SimpleAttributeSet();
            StyleConstants.setForeground(keyWordFormat, Color.red);
            // create format for default text
            MutableAttributeSet defaultFormat = new SimpleAttributeSet();
            StyleConstants.setForeground(defaultFormat, Color.black);
            //String mainText = mainTextArea.getText();
            // reset the text to the default format
            doc.setCharacterAttributes(0,mainTextArea.getText().length(),defaultFormat,true);
            int x = 0;
            String keyWord = "<HTML>";
            // search through the string and highlight keywords.
            while ( mainTextArea.getText().indexOf(keyWord,x )!= -1 ) {
                doc.setCharacterAttributes(mainTextArea.getText().indexOf(keyWord,x),keyWord.length(),keyWordFormat,true);
                x = mainTextArea.getText().indexOf(keyWord, x)+1;
                System.out.println(x);
            }

    In a Document only a single character is stored for each new line. When you use the getText(...) method, the new line String for the Document is inserted for each new line character. On a Windows platform this new line String is "\r\n". You can override the default by using:
    document.putProperty( DefaultEditorKit.EndOfLineStringProperty, "\n" );
    By the way your while loop is extremely inefficient. Each time through the loop you use the getText() method twice. You should just initialize a variable containing the text of the Document outside of your loop.
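
    A minimal sketch of both points, assuming it is acceptable to read the text from the Document instead of from the text pane: the model stores a single "\n" per line break, so the offsets found by indexOf line up with what setCharacterAttributes expects, and the text is fetched only once. The class and method names are made up for illustration, and a bare DefaultStyledDocument stands in for mainTextArea.getStyledDocument().

```java
import java.awt.Color;
import java.util.ArrayList;
import java.util.List;
import javax.swing.text.BadLocationException;
import javax.swing.text.DefaultStyledDocument;
import javax.swing.text.MutableAttributeSet;
import javax.swing.text.SimpleAttributeSet;
import javax.swing.text.StyleConstants;
import javax.swing.text.StyledDocument;

public class KeywordHighlighter {
    // Colors every occurrence of keyWord red and returns the offsets found.
    static List<Integer> highlight(StyledDocument doc, String keyWord)
            throws BadLocationException {
        MutableAttributeSet keyWordFormat = new SimpleAttributeSet();
        StyleConstants.setForeground(keyWordFormat, Color.red);
        MutableAttributeSet defaultFormat = new SimpleAttributeSet();
        StyleConstants.setForeground(defaultFormat, Color.black);

        // Read the text once, from the Document itself, so that offsets
        // match the model (one character per line break).
        String text = doc.getText(0, doc.getLength());
        // Reset everything to the default format first.
        doc.setCharacterAttributes(0, text.length(), defaultFormat, true);

        List<Integer> offsets = new ArrayList<>();
        int pos = text.indexOf(keyWord);
        while (pos != -1) {
            doc.setCharacterAttributes(pos, keyWord.length(), keyWordFormat, true);
            offsets.add(pos);
            pos = text.indexOf(keyWord, pos + 1);
        }
        return offsets;
    }

    public static void main(String[] args) throws BadLocationException {
        // Stand-in for mainTextArea.getStyledDocument().
        StyledDocument doc = new DefaultStyledDocument();
        doc.insertString(0, "line one\n<HTML>\nmore <HTML>", null);
        System.out.println(highlight(doc, "<HTML>"));
    }
}
```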

  • Weird character

    Hi all
    I am kinda new to Java!
    I have an anemic class (defines no behaviour) that has a field of primitive type char. I know that fields are always initialized even if I don't do it, so I created this class to find out what character Java assigns to a character field.
    I haven't found out yet!
    The code is working, and as a shortcut I prefer to run it in the command prompt by typing:
    javac CharacterTest.java
    java CharacterTest (Here you have to omit the extension)
    /****************** BEGINNING OF THE CLASS ******************/
    public class CharacterTest {
         public static void main(String args[]) {
              System.out.println("AnemicClass.caractere " + AnemicClass.caractere);
              boolean isWhiteSpace = (AnemicClass.caractere == ' ');
              System.out.println("Is AnemicClass.caractere field a whitespace using primitive type char comparison? " + isWhiteSpace);
              String character = String.valueOf(AnemicClass.caractere);
              isWhiteSpace = " ".equals(character);
              System.out.println("Is AnemicClass.caractere field a whitespace using String object comparison? " + isWhiteSpace);
              //Just to be sure if the string has any character in its state
              System.out.println("How many characters are there in the String object that encapsulates the character? " + character.length());
              //print the codepoint of our Unicode character
              System.out.println("What codepoint represents such character? " + character.codePointAt(0));
              //Is this character defined in Unicode?
              System.out.println("Is this character defined in Unicode? " + Character.isDefined(AnemicClass.caractere));
              //Is this character (Unicode code point) an ISO control character?
              System.out.println("Is this character (Unicode code point) an ISO control character? " + Character.isISOControl(AnemicClass.caractere));
              //Is this character (Unicode code point) a white space according to Java?
              System.out.println("Is this character (Unicode code point) a white space according to Java? " + Character.isWhitespace(AnemicClass.caractere));
              //Is this character (Unicode code point) a letter or a digit?
              System.out.println("Is this character (Unicode code point) a letter or a digit? " + Character.isLetterOrDigit(AnemicClass.caractere));
              System.out.println("Just to make sure the field character from class AnemicClass is not a whitespace let´s check the whitespace character codepoint: "
                   + " ".codePointAt(0));
         }
    }
    // This is an anemic (defines no behaviour) class. It is only a repository for data.
    class AnemicClass {
         public static char caractere; //primitive type char
    }
    /****************** END OF THE CLASS ******************/
    AND THIS IS THE PROGRAM OUTPUT
    AnemicClass.caractere
    Is AnemicClass.caractere field a whitespace using primitive type char comparison? false
    Is AnemicClass.caractere field a whitespace using String object comparison? false
    How many characters are there in the String object that encapsulates the character?  1
    What codepoint represents such character? 0
    Is this character defined in Unicode? true
    Is this character (Unicode code point) an ISO control character? true
    Is this character (Unicode code point) a white space according to Java? false
    Is this character (Unicode code point) a letter or a digit? false
    Just to make sure the field character from class AnemicClass is not a whitespace let┤s check the whitespace character codepoint 32
    Edited by: charllescuba1008 on Mar 20, 2009 11:20 AM
    Edited by: charllescuba1008 on Mar 20, 2009 11:21 AM


  • Adobe CS4 showing weird character in Lion

    Hi, I'm using Adobe CS4 (Photoshop, Illustrator, ...) with Lion and it has started to mess up my keyboard.
    After using either of them for a while, my keyboard layout changes to something very weird.
    For example, if I type ASDFG it becomes ß∂ƒ¸
    At first I thought my keyboard's "alt" key had got stuck, but it hadn't. Obviously it is CS4, because everything works perfectly until I start the program.
    Is anyone else having the same problem?

    Oh, the problem is fixed. Sorry, it wasn't a CS4 problem; it was the Wacom.

  • Weird character substitutions ?

    When accessing one of my url parameters it somehow gets changed to the actual parameter name with the jsp tag around it. But not a regular jsp tag, this one has a question mark in it.
    String projectID = (String)request.getParameter("projectid") ;
    Remember projectID has to be a number in String format.
    When I do the call on the bean and print out the resulting sql statement that is giving me errors I get this-
    INSERT(...) VALUES(<?rojectID%>...) I have no idea why it does that, I have checked the call to the bean and also to see if there was anything done in between the call to the bean and the execution of the statement and there was nothing wrong. At least from what I saw.
    Thanks a lot.
    Sunil Nicholas

    You are right about missing the little things, that is something I am notorious for actually. The problem here though is that I am using a bean for all my sql stuff. So the function is called from the jsp and then uses the param. list to execute the insert statement. Also the statement I use to call the function uses a defined variable in the "java" part of the jsp. I have no clue as to why it would want to even insert the <%= %> tags. Catch my drift? Here's a little more of the code.
    //this is to make sure there is a project id otherwise you have to start
    // over
    <%
    String projectID = (String)request.getParameter("projectid") ;
    if(projectID == null){
    %>
    <jsp:forward page="./prj_home.jsp" />
    <%}
    %>
    /* Add Entry to DATABASE*/
    prjb.addIssue(projectID, issueopendate, issuetargetdate, issueresolvedate, issuescope, issuequality, issuetime, issuecost, issueoriginator, originatortype, issueassignedto, assignedtotype, issuedescription, issuemitigationplan, issueimpact, issueclosed, issuevisible);
