Removing unwanted characters

Hey guys,
I'm back for help again. Unfortunately my brain isn't creative enough, so please help! :-)
Ok, I need to remove unwanted characters from a file... the problem is that the characters show up as little squares in any text editor. I'm using JEdit, and it's set to ISO-8859-1 encoding. The text was initially from an HTML file, and I think that most of the HTML is displayed well as text in JEdit. But these squares, which are bits of info that I don't need, are making it a little tricky to do my extraction.
Ex: the word I want to extract is "trouble". But in the file, it looks like this:
troble....
Anyone know how to get rid of all that stuff???
Thanks in advance.
...DJVege...

You could try to set a filter on the characters you accept. Process each character and only accept those whose character code falls within some range. If you accept only characters with values between 33 and 255 (plus tab and space), most blocks should be eliminated. Strictly speaking, only 0-127 is ASCII; 128-255 is the upper half of ISO-8859-1.
Something like this should help:
import java.io.*;
public class Example {
     protected static final int MIN_ASCII = 33;
     protected static final int MAX_ASCII = 255;
     // Copies <file> to <file>_fix.txt, dropping every character rejected by valid().
     public Example(String file) throws IOException {
          // The file is ISO-8859-1, so read and write with that charset explicitly.
          BufferedReader b = new BufferedReader(new InputStreamReader(new FileInputStream(file), "ISO-8859-1"));
          PrintStream p = new PrintStream(new FileOutputStream(file+"_fix.txt"), false, "ISO-8859-1");
          String s = "";
          int j;
          while ((s = b.readLine()) != null) {
               for (j = 0;j < s.length();j++) {
                    if (valid(s.charAt(j))) {
                         p.print(s.charAt(j));
                    }
               }
               p.println();
          }
          b.close();
          p.close();
     }
     protected boolean valid(char c) {
          int asc = (int)c;
          // allow for tabs and spaces
          if (asc == 9 || asc == 32) {
               return true;
          }
          return (asc >= MIN_ASCII && asc <= MAX_ASCII);
     }
     public static void main(String args[]) {
          if (args.length > 0) {
               try {
                    new Example(args[0]);
               }
               catch (IOException e) {
                    e.printStackTrace();
               }
          }
     }
}
usage: java Example <file>
Of course, something like this will probably only work on "English" files, as I don't have an understanding of how foreign characters are encoded.
If this doesn't solve your problem, you might want to adjust the range to 33-126 (127 is the DEL control character), which will eliminate all "block" characters and all "special formatted" characters (i.e. accented characters, currency signs, etc.).
See http://asciitable.com/ for more information on ASCII characters.
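
One possible refinement, offered as a sketch rather than a definitive fix: the 33-255 range still lets through 127 (DEL) and 128-159, which in ISO-8859-1 are control codes and are a likely source of the squares. The class and method names below (PrintableFilter, clean) are made up for illustration, and the exact ranges are an assumption about what counts as "wanted" text.

```java
final class PrintableFilter {
    // Keeps tab, space, printable ASCII (33-126) and the printable
    // upper half of ISO-8859-1 (160-255); drops everything else.
    static String clean(String s) {
        StringBuilder out = new StringBuilder(s.length());
        for (int i = 0; i < s.length(); i++) {
            char c = s.charAt(i);
            boolean ascii = c == '\t' || (c >= 32 && c <= 126);
            boolean latin1 = c >= 160 && c <= 255;
            if (ascii || latin1) {
                out.append(c);
            }
        }
        return out.toString();
    }

    public static void main(String[] args) {
        // Two control characters are hidden inside the word.
        System.out.println(clean("tro\u0001u\u0090ble")); // prints "trouble"
    }
}
```

The same check could replace the body of valid() in the Example class above if accented characters need to survive.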

Similar Messages

  • Actively remove unwanted characters

    Hi, I'm writing a function to actively remove unwanted characters from an input field. It will run every time a key press event occurs on the selected input fields. It is going to remove quotes and things like that. This is what I have so far..
    void test(){
        char unwantedCharSet[50]={"abcD"};
        char tempString2[200];
        int matchedCharIndex;
        /* tempString and panelHandle are declared elsewhere */
        GetCtrlVal (panelHandle[MAIN], MAIN_STRING,tempString);
        do{
            RemoveSurroundingWhiteSpace (tempString);
            matchedCharIndex = strcspn (tempString, unwantedCharSet);
            if(matchedCharIndex==strlen(tempString)){
                SetCtrlVal (panelHandle[MAIN], MAIN_STRING, tempString);
                return;
            }
            else{
                strncpy (tempString2, tempString, matchedCharIndex);
                CopyString (tempString, matchedCharIndex, tempString,matchedCharIndex+1,(strlen(tempString)-matchedCharIndex));
                strcat (tempString2, tempString);
                sprintf(tempString,"%s",tempString2);
            }
        } while(1);
        return;
    }
    It is still glitchy. Does anyone have any advice to efficiently do this?
    Thanks in advance!

    This was my solution, although, it takes a long time to perform. Does anyone have any tips on a better (faster/cleaner) approach?
    /*=====================================================================*/
    // TEST
    /*=====================================================================*/
    void test(){
        GetCtrlVal(panelHandle[MAIN],MAIN_STRING,tempString);
        SetCtrlVal(panelHandle[MAIN],MAIN_STRING,filter(tempString,unwantedCharSet));
        return;
    }
    /*=====================================================================*/
    // FILTER
    /*=====================================================================*/
    char *filter(char *inputString,char *filterString){
        int ptr=0,matchedCharIndex=0;
        char outputString[100];
        do{
            /* index of the first unwanted character, or the string length if none */
            matchedCharIndex = strcspn (inputString,filterString);
            if(matchedCharIndex!=strlen(inputString)){
                strcpy (outputString, inputString);
                /* shift everything after the matched character one place left */
                for(ptr=matchedCharIndex;ptr<=strlen(inputString);ptr++){
                    outputString[ptr]=inputString[ptr+1];
                    DebugPrintf("%s\n",outputString);
                    if(outputString[ptr]=='\n'||outputString[ptr]=='\0')
                        break;
                }
                strcpy(inputString,outputString);
            }
            else
                return inputString;
        } while(1);
        return inputString;
    }
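
On the "faster/cleaner" question, one common approach is a single pass that copies only the wanted characters, instead of rescanning and shifting the string once per match. The sketch below shows the idea in Java (the language of the first thread here), not in C for LabWindows/CVI; SinglePassFilter and filter are hypothetical names, and the same loop would translate to C with two indices into one buffer.

```java
final class SinglePassFilter {
    // Copies every character of input that does not occur in unwanted.
    // Each input character is examined exactly once.
    static String filter(String input, String unwanted) {
        StringBuilder out = new StringBuilder(input.length());
        for (int i = 0; i < input.length(); i++) {
            char c = input.charAt(i);
            if (unwanted.indexOf(c) < 0) {
                out.append(c);
            }
        }
        return out.toString();
    }

    public static void main(String[] args) {
        // Strip double and single quotes from a field value.
        System.out.println(filter("say \"hello\" to 'them'", "\"'"));
    }
}
```

There is also no per-character debug printing here, which on its own may account for part of the slowdown in the posted version.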

  • Removing unwanted characters from imported string

    Hello,
    I have a tab-delimited .txt file which I have to import into InDesign for further processing.
    The file consists of a 3-column header row at the beginning (Code, Description, Price) followed by a series of 3-column data rows.
    The problem is that sometimes, depending on the way the txt/csv file was created, it may include unwanted characters (such as spaces, double spaces, etc.).
    Is there a way to "clean" these unwanted characters out of the imported strings?
    This is my starting code:
    function processImportedTxt(){
        //Open .csv file
        var csvFile = File.openDialog("Open file .csv","tab-delimited(*.csv):*.csv;");
        var datafile = new File(csvFile);
        if (datafile.exists){
            datafile.open('r');
            var csvData = new Array();
            while(!datafile.eof){//read every row till the end of file
                csvData.push(datafile.readln());
            }
            datafile.close();
            //start at 1 to skip the header row
            for(var a=1; a<csvData.length; a++){
                var myRowData = csvData[a];//row of data
                var mySplitData = myRowData.toString().split("\t");//divide columns
                var myRowCode = mySplitData[0];
                var myRowDesc = mySplitData[1];
                var myRowPrice = mySplitData[2];
                // Here goes code for cleaning strings from unwanted characters
            }
        }
    }
    processImportedTxt();
    Any help would be much appreciated
    Thanks in advance

    Hi,
    If you want to keep single-space occurrences, just a small correction:
    i.e.:
    var myRowCode = mySplitData[0].replace(/\s\s+/g,'');
    Jarek
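
Note that the pattern above deletes each run of two or more whitespace characters outright, so "Blue  widget" would become "Bluewidget". If the goal is to collapse such runs to one space and trim the ends, the idea looks roughly like this. It is sketched in Java rather than InDesign JavaScript, and FieldCleaner, cleanField and cleanRow are hypothetical names.

```java
final class FieldCleaner {
    // Trims leading/trailing whitespace and collapses inner runs
    // of two or more whitespace characters to a single space.
    static String cleanField(String raw) {
        return raw.trim().replaceAll("\\s{2,}", " ");
    }

    // Splits a tab-delimited row and cleans every column.
    static String[] cleanRow(String row) {
        String[] cols = row.split("\t", -1); // -1 keeps trailing empty columns
        for (int i = 0; i < cols.length; i++) {
            cols[i] = cleanField(cols[i]);
        }
        return cols;
    }

    public static void main(String[] args) {
        for (String col : cleanRow(" A100 \tBlue  widget\t 9.99 ")) {
            System.out.println("[" + col + "]");
        }
    }
}
```

The row has to be split on tabs before cleaning; otherwise the tab delimiters themselves would be treated as whitespace and lost.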

  • Removing unwanted from string

    I am trying to remove unwanted text from a string. For example:
    The string is "boy the movie part 2 (2009)". From this string I want to remove only "(2009)", so that my string looks like "boy the movie part 2".
    Please provide me logic to do this.
    Thanks,
    Amol

    gimbal2 wrote:
    Hey if that's all you need:
    [snip]
    Just substring minus the last six characters and to not have to make any assumptions about whitespace being present in the result, a trim() is added for the fun of it.
    This is absolutely not fail proof, but it does what you ask.
    @OP: Most likely this is not what you really want. However, as gimbal2 points out, it does what you ask. The lesson is, you need to be clear and precise in expressing your requirements. To drive the point home, here is some more code that does what you asked for, but probably not what you really want it to do.
    if (str.equals("boy the movie part 2 (2009)")) {
      str = "boy the movie part 2";
    }
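
If the actual requirement turns out to be "drop a four-digit year in parentheses at the end of the title", which is only a guess at what the OP meant, a regular expression is one way to express it. TitleCleaner and stripTrailingYear are names invented for this sketch.

```java
final class TitleCleaner {
    // Removes a trailing "(dddd)" group plus the whitespace around it.
    // Titles without such a suffix are returned unchanged.
    static String stripTrailingYear(String title) {
        return title.replaceAll("\\s*\\(\\d{4}\\)\\s*$", "");
    }

    public static void main(String[] args) {
        System.out.println(stripTrailingYear("boy the movie part 2 (2009)"));
    }
}
```

Because the pattern is anchored with $, a year in the middle of the title, or a bare number such as the "2" in "part 2", is left alone.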

  • HT4915 I share my iTunes account with my kids. Many of the songs that they purchase, I do not want in my iPhone library, and I don't want them to play when I have my phone on shuffle. How can I remove unwanted songs from my phone library?

    I share my iTunes account with my kids. Many of the songs that they purchase, I do not want in my iPhone library, and I don't want them to play when I have my phone on shuffle. How can I remove unwanted songs from my phone library without affecting their devices?

    These are the downsides of a shared library, because if you delete the song from your library it will also be automatically deleted from your kids' library. If you want to proceed, then please follow this guide: https://support.apple.com/kb/HT4915

  • How can i remove unwanted e-mail address from my list?

    how can i remove unwanted e-mail address from my list?

    Go to Settings -> Mail, Contacts, Calendars -> tap on the desired account and choose Delete at the bottom of the page. This will delete the email account from the device.

  • How do I remove unwanted updates from the App Store?

    How do I remove unwanted updates from the App Store? So you understand better what I'm referring to:
    I have the App Store icon in the Dock. When there's an update a red number appears on said icon. Well, I went to check on the updates and it's for 4 different applications I either no longer have or use. One of them is an update for Lion OS users. I'm still on Snow Leopard, so it doesn't even apply to me.
    So there they sit... I'm not going to download the updates... so how do I get rid of them?

    Hi Andy ..
    no longer have or use
    If there are updates available for apps you have deleted, try this.
    Go to ~/Library/Caches/com.apple.appstore
    Move the Cache.db and Updates files from the com.apple.appstore folder to the Trash.
    Empty the Trash, restart your Mac.
    For any apps you still have installed but do not use, the updates will still be available from the Updates top of the App Store window and show on the red badge on the App Store icon in the Dock.
    edited by:  cs

  • How can I remove unwanted downloads from my macbook pro?

    How can I remove unwanted downloads from my macbook pro?

    Go to the Download folders as specified in the browser's or mail app's preferences and delete what you don't want to keep.

  • I need urgent help to remove unwanted adware from my iMac. Help!

    I need urgent help to remove unwanted adware from my iMac. I have somehow got green-underlined in-text ads, pop-up ads that come in from the sides of the screen, and ads that pop up selling similar products all over the page wherever I go. It's getting worse, and I have researched and researched to no avail. I am really hesitant to download any software to remove whatever it is that is causing this problem. I have removed and reinstalled Chrome. I have cleared Chrome and Safari cookies. I checked extensions; there are none to remove. I need to find an answer on how to get rid of these ads. Help please!

    You installed the "DownLite" trojan, perhaps under a different name. Remove it as follows.
    Back up all data.
    Triple-click anywhere in the line below on this page to select it:
    /Library/Application Support/VSearch
    Right-click or control-click the line and select
    Services ▹ Reveal in Finder (or just Reveal)
    from the contextual menu.* A folder should open with an item named "VSearch" selected. Drag the selected item to the Trash. You may be prompted for your administrator login password.
    Repeat with each of these lines:
    /Library/LaunchAgents/com.vsearch.agent.plist
    /Library/LaunchDaemons/com.vsearch.daemon.plist
    /Library/LaunchDaemons/com.vsearch.helper.plist
    /Library/LaunchDaemons/Jack.plist
    /Library/PrivilegedHelperTools/Jack
    /System/Library/Frameworks/VSearch.framework
    Some of these items may be absent, in which case you'll get a message that the file can't be found. Skip that item and go on to the next one.
    Restart and empty the Trash. Don't try to empty the Trash until you have restarted.
    From the Safari menu bar, select
    Safari ▹ Preferences... ▹ Extensions
    Uninstall any extensions you don't know you need, including any that have the word "Spigot" in the description. If in doubt, uninstall all extensions. Do the equivalent for the Firefox and Chrome browsers, if you use either of those.
    This trojan is distributed on illegal websites that traffic in pirated movies. If you, or anyone else who uses the computer, visit such sites and follow prompts to install software, you can expect much worse to happen in the future.
    You may be wondering why you didn't get a warning from Gatekeeper about installing software from an unknown developer, as you should have. The reason is that the DownLite developer has a codesigning certificate issued by Apple, which causes Gatekeeper to give the installer a pass. Apple could revoke the certificate, but as of this writing, has not done so, even though it's aware of the problem. It must be said that this failure of oversight is inexcusable and has seriously compromised the value of Gatekeeper and the Developer ID program. You cannot rely on Gatekeeper alone to protect you from harmful software.
    *If you don't see the contextual menu item, copy the selected text to the Clipboard by pressing the key combination  command-C. In the Finder, select
    Go ▹ Go to Folder...
    from the menu bar and paste into the box that opens by pressing command-V. You won't see what you pasted because a line break is included. Press return.

  • Need to remove special characters

    Hello All
        Some input is coming from a source field. If any special characters come from the source field, I need to remove them and send the data from the source to the target field. Please suggest how I can do this.
    Thanks&Regards,
    Venkat

    Hi Venkat,
    check this thread.
    Handling Special Characters
    check the document :
    https://www.sdn.sap.com/irj/sdn/go/portal/prtroot/docs/library/uuid/502991a2-45d9-2910-d99f-8aba5d79fb42
    The handling will differ depending on the encoding method;
    you can use ISO-8859-1 or ISO-8859-2 instead of UTF-8 for some special characters.
    check this blog:
    /people/ulrich.brink/blog/2005/08/18/unicode-file-handling-in-abap
    cheers
    Sunil

  • Sql to remove special characters from my search

    Hi everyone. I'm very new to SQL and have hit another roadblock. I am running a query on my database in Oracle SQL Developer. I want to search manufacturer numbers, but sometimes they were entered with dashes (999-99-9999) and other times not (999999999). Is it possible to apply a function that ignores the dashes in both my query numbers and the database mfr_nbr column?
    any help would be appreciated.
    Kelly

    OK, I have built a nested string of REPLACE calls to remove all of my special characters, and it worked perfectly, but now I am not sure where to place the nest later in the query so that it is also applied to the mfr numbers I search for. The reason I need to do it again is that I want to remove the characters on both sides, so I am comparing "apples to apples", so to speak. Here is my query so far. I still need to add the part where I put in my search for the manuf_item_nbr. My question is: where do I need to place the nested REPLACE calls to strip the characters from the numbers I'm going to search for?
    SELECT  MAX(item_nbr) ,REPLACE(REPLACE(REPLACE(REPLACE(REPLACE(REPLACE(REPLACE(REPLACE(REPLACE(REPLACE(REPLACE(REPLACE(REPLACE(REPLACE(REPLACE(REPLACE(REPLACE(REPLACE(REPLACE(REPLACE(REPLACE(REPLACE(REPLACE(REPLACE(REPLACE(REPLACE(REPLACE(REPLACE(REPLACE(REPLACE(REPLACE(REPLACE(REPLACE( manuf_item_nbr, ' '), ','), '<'), '.'), '>'), '?'), ''), '"'), ';'), ':'), '\'), '|'), ']'), '}'), '['), '{'), '='), '+'), '_'), '-'), ')'), '('), '*'), '&'), '^'), '^'), '%'), '$'), '#'), '@'), '!'), '~'), '`'),
      manuf_item_nbr  ,MAX(description), MAX(description2), MAX(GHX_FULL_ITEM_DESCR), MAX(Cntrct_nbr_txt), MAX(uom_cd)   ,
    MAX(item_qty), MAX(tier_descr), MAX(tier_prc_amt),MAX(list_prc_amt), MAX(vndr_nm), MAX(vndr_id), MAX(unspsc_nbr),MAX(iss_account)   FROM
    ( SELECT '' AS item_nbr, manuf_item_nbr,'' AS description, '' AS description2,'' AS GHX_FULL_ITEM_DESCR, Cntrct_nbr_txt, uom_cd, CAST(item_qty AS VARCHAR (255)) AS item_qty,tier_descr, CAST ( tier_prc_amt AS VARCHAR (255)) AS tier_prc_amt, CAST (list_prc_amt AS VARCHAR (255)) AS list_prc_amt,
    vndr_nm, '' AS vndr_id, '' AS unspsc_nbr,'' AS iss_account FROM ROI.CNTRCT_PRC_LIST
    WHERE ACTN_CD <> 'D'
    AND ROW_UPDT_TSP IS NULL 
    UNION ALL
    SELECT item_nbr, manuf_item_nbr,'', '', GHX_FULL_ITEM_DESCR,'',  purch_uom_txt  AS uom_cd,
      purch_qoe_txt  AS item_qty, '',  '' AS tier_prc_amt,'' AS list_prc_amt,
    vndr_nm, vndr_id, unspsc_nbr,
      gl_cd  AS iss_account
      FROM ROI.ROI_ITEM_ENRCHD_NUVIA
       UNION ALL
    SELECT  trim(item)  AS item_nbr,
       trim(manuf_nbr)  AS manuf_item_nbr,
       trim(description),
       trim(description2), '' AS GHX_FULL_ITEM_DESCR, '',
        trim(stock_uom ) AS uom_cd,
        ''  AS item_qty,'', '','' AS tier_prc_amt, '' AS list_prc_amt,'' AS vndr_id, '' AS unspsc_nbr,
        CAST( trim(iss_account) AS VARCHAR(255))
          FROM ITEMMAST_LAW
    )GROUP BY manuf_item_nbr
       ORDER BY manuf_item_nbr
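
The "apples to apples" idea is to run the stored value and the search value through the same normalization before comparing them. In the query that would mean applying the same stripping expression to both manuf_item_nbr and the search term in the comparison. The sketch below shows only the logic, in Java rather than SQL; MfrNumber, normalize and sameNumber are hypothetical names, and treating every non-alphanumeric character as noise is an assumption.

```java
final class MfrNumber {
    // Drops everything except ASCII letters and digits.
    static String normalize(String s) {
        return s.replaceAll("[^A-Za-z0-9]", "");
    }

    // Compares two manufacturer numbers after normalizing both sides.
    static boolean sameNumber(String stored, String searched) {
        return normalize(stored).equalsIgnoreCase(normalize(searched));
    }

    public static void main(String[] args) {
        System.out.println(sameNumber("999-99-9999", "999999999")); // prints "true"
    }
}
```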

  • Check a string in bash for unwanted characters

    Hello,
    I'm trying to do a bash script that checks a variable against a list of unwanted characters, e.g. to parse a file name.
    This doesn't really sound like a difficult task, but for some reason, whatever I've tried so far does not work, including my last attempt, shown below. Perhaps I'm doing something silly here - and I'm getting tired of it. What would be the best way for instance to parse a file name for invalid characters, or to accomplish or fix the below?
    #!/bin/bash
    read -p "Enter a filename: " fname
    invalid_chars=", . ! @ # \$ % ^ & \* ( ) + = ? { } [ ] | ~"
    i=0
    while (( i <= ${#fname} )); do
       char=${fname:$i:1}
       for char in `echo $invalid_chars`; do
         echo "$char"
       done
       (( i += 1 ))
    done
    Thanks.

    Meanwhile I figured out the mistake I made in the for loop. The below finally works catching the list of characters. It won't catch * and ? though. I wonder if there wasn't an easier way to do it, beside using "grep".
    #!/bin/bash
    # Returns 1 if $1 contains one of the listed characters, 0 otherwise.
    # The offending character is left in the global variable $char.
    f_varcheck()
    {
      ifs_orig=$IFS
      count=0
      score=0
      while (( count <= ${#1} )); do
        char=${1:$count:1}
        if [ "$score" = "0" ]; then
          wanted_char='~,!,@,#,$,%,^,&,(,),+,`,='
          IFS=$','
          for item in `echo "$wanted_char"`; do
            [ "$item" = "$char" ] && score=1
          done
        fi
        if [ "$score" = "0" ]; then
          wanted_char='{,},|,[,],\,:,",;,<,>,., ,/,'
          IFS=$','
          for item in `echo "$wanted_char"`; do
            [ "$item" = "$char" ] && score=1
          done
        fi
        if [ "$score" = "0" ]; then
          wanted_char=','
          IFS=$' '
          for item in `echo "$wanted_char"`; do
            [ "$item" = "$char" ] && score=1
          done
        fi
      if [ "$score" = "0" ]; then
        (( count += 1 ))
      else
        break
      fi
      done
      IFS=$ifs_orig
      [ "$score" = "1" ] && return 1 || return 0
    }
    read -p "Enter a filename: " fname
    if ! f_varcheck "$fname"; then
       echo "Invalid character \`$char\` found."
    else
       echo "Input is ok."
    fi
    Edited by: Dude on Feb 11, 2012 8:51 AM
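
For comparison, the check itself is a membership test: walk the name and stop at the first character that appears in the forbidden set. The sketch below is in Java, not bash. The set is assembled from the characters listed in this thread, including * and ?, and FilenameCheck and firstInvalid are made-up names.

```java
final class FilenameCheck {
    // Characters treated as invalid, taken from the lists in the thread.
    private static final String INVALID = ",.!@#$%^&*()+=?{}[]|~`\\:\";<>/ ";

    // Returns the index of the first invalid character, or -1 if none.
    static int firstInvalid(String name) {
        for (int i = 0; i < name.length(); i++) {
            if (INVALID.indexOf(name.charAt(i)) >= 0) {
                return i;
            }
        }
        return -1;
    }

    public static void main(String[] args) {
        String name = args.length > 0 ? args[0] : "what?";
        int pos = firstInvalid(name);
        if (pos >= 0) {
            System.out.println("Invalid character `" + name.charAt(pos) + "` found.");
        } else {
            System.out.println("Input is ok.");
        }
    }
}
```

Since the set is a plain string, * and ? are ordinary members and need no special handling. As in the thread's list, the dot is treated as invalid, so a name like "file.txt" would be rejected.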

  • Remove control characters in txt file (saved from Excel)

    Hi,
    I have a txt file that contains invisible control characters and I want to remove those characters. I've been thinking of 2 options
    1/ Read the content of the file into a string, then go through each character and keep only alphanumerics, newlines, and the Alt+Enter character (the character Excel inserts to break a line inside a cell). With this approach, I'm stuck on getting the character code for Alt+Enter, so if anyone could point it out, that would help a great deal.
    2/ Use some pattern matching {ctrnl} or something to remove all control characters. I've tried this approach and it didn't work for me.
    Please help me with this problem. Any help or suggestion is greatly appreciated.

    (saved from Excel) Why not save it as csv?
    trivektor wrote:
    With this approach, I'm stuck on getting the character code for Alt+Enter so if anyone could point out. That helps a great deal.
    You can figure that out with a hex editor or just write a small app that prints int values for each byte, not character, and print the file.
    Presumably you already found the Character class and its methods.
    Edited by: jschell on Sep 22, 2008 4:29 PM
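
Both suggestions can be sketched briefly. The code below first prints the numeric code of each character, so the Alt+Enter character can be identified, and then strips control characters while keeping tab, CR and LF. ControlCharStripper, codes and strip are hypothetical names, and which control characters to keep is an assumption. In Excel on Windows the Alt+Enter break is commonly a line feed (code 10), but that is worth confirming against the actual file.

```java
final class ControlCharStripper {
    // Lists the numeric code of every character, separated by spaces.
    static String codes(String s) {
        StringBuilder out = new StringBuilder();
        for (int i = 0; i < s.length(); i++) {
            if (i > 0) {
                out.append(' ');
            }
            out.append((int) s.charAt(i));
        }
        return out.toString();
    }

    // Removes ASCII control characters except tab, CR and LF.
    // The && inside the class is Java's character-class intersection.
    static String strip(String s) {
        return s.replaceAll("[\\p{Cntrl}&&[^\\r\\n\\t]]", "");
    }

    public static void main(String[] args) {
        System.out.println(codes("A\nB")); // prints "65 10 66"
        System.out.println(strip("a\u0000b\u0007c"));
    }
}
```

In Java's regex engine \p{Cntrl} covers only the ASCII control range by default, so accented and other non-ASCII characters are untouched by strip.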

  • I want to purchase Aperture, but in the advertising blurb I don't see how to remove unwanted objects from a photo. I know it can be done in Adobe Elements, but I would rather have Aperture.

    I want to purchase Aperture, but in the advertising blurb I don't see how to remove unwanted objects from a photo. I know it can be done in Adobe, but I would rather have Aperture.

    What DiploStrat said.
    To expand a little bit:  Aperture allows you to assign a graphics program (such as PSE) as an "external editor".  In general, you use Aperture to organize your photos, to develop them (make them as good as possible for each use), and to publish them (create share-able files).  Your external editor is used when you want to create a new graphics file, which is what happens in all "destructive pixel editing".
    Aperture provides tools to remove sensor spots from skies (and like operations), but for removing people and billboards you'll want a proper compositor, a/k/a a graphics program.

  • Can't I remove unwanted App's Updates in the App Store ?

    Can't I remove unwanted App's Updates in the App Store (found them there by pure magic indeed)...thanx !
    <Edited By Host>

    Hello Orionquest70
    Check out the troubleshooting article below for issues with access to the iTunes Store. Also there was a small amount of outage accessing the iTunes Store this past Wednesday.
    Can't connect to the iTunes Store
    http://support.apple.com/kb/TS1368
    Apple Services, Stores, and iCloud
    http://www.apple.com/support/systemstatus/
    Thanks for using Apple Support Communities.
    Regards,
    -Norm G.
