Regular expression weirdness: egrep vs awk

Having used scripting before on other UNIX flavours I ran into the following question recently (Mac OS X 10.4.10 but also earlier versions):
Consider two files which I created for illustration purposes
a) text file t1.txt
b) awk script t1.awk
and run the commands egrep and awk against t1.txt
/Users/me/bin>cat t1.txt
1
2
12
1
2
34
3
4
/Users/me/bin>egrep '^[0-9]{2}$' t1.txt
12
34
/Users/me/bin>cat t1.awk
/^[0-9]$/ { printf("Number: %i has one digit\n",$1) }
/^[0-9]{2}$/ { printf("Number: %i has two digits\n",$1) }
/Users/me/bin>awk -f t1.awk < t1.txt
Number: 1 has one digit
Number: 2 has one digit
Number: 1 has one digit
Number: 2 has one digit
Number: 3 has one digit
Number: 4 has one digit
/Users/me/bin>
Same regular expression, egrep acts as expected, awk does not. I am obviously too dumb to see what I am doing wrong, anybody any idea?
Thanks
DocJoc

Ok, I checked this out a little further following the suggestions given. I find
1) awk is installed, nawk is not
2) I got gawk and
a) it generates the very same error if called without addtl parameters
b) it runs as expected if adding option "--posix" as suggested by LittleSaint
Well, looks like a bug to me because according to the man pages awk should already show the expected behaviour. Anyway, using gawk solves the problem here.
Thanks for the answers!

Similar Messages

  • Regular expressions: find files with exactly 'n' digits in a row

    Hi there,
    I want to filter files that contain only a fixed number of digits, but not more (at least not in after the digits).
    For example, I have
    01.mp3
    02.mp3
    test10.txt
    test000110101010.txt
    04.flac
    and for n=2 I want to get all files except 'test000110101010.txt'.
    The following is not working, and I'm a total newb regarding regular expressions
    ls -l | grep '^-' | awk '{print $9}' | grep '([0-9]\{2\})[^0-9]\{2\}'
    Thanks for help.
    Regards,
    drm

    Thanks!
    I wrote a python script to scan e.g. a music folder for missing files and needed to extract the file numbers from the files to get the "highest" number.
    You can get it from here: http://pastebin.com/Sg9yDHiw (Python3, expires in 1 month)
    Regards,
    drm
    Edit: found a bug
    Last edited by drm00 (2011-02-04 13:57:43)

  • [SOLVED]ZSH and regular expressions

    Hi
    I am getting into regular expressions and i have noticed that with my .zshrc file i have some problem. In bash this expression works:
    \^\[^#]
    but not also in zsh. I have also noted that regular expression works fine with other zshrc configurations found in archwiki (like grml) but i want to have my configuration. And i really can't find what command make a difference
    My .zshrc file is pulled from this site https://github.com/slashbeast/things/bl … s/DOTzshrc.
    # .zshrc
    # Author: Piotr Karbowski <[email protected]>
    # License: beerware.
    # Basic zsh config.
    umask 077
    ZDOTDIR=${ZDOTDIR:-${HOME}}
    ZSHDDIR="${HOME}/.config/zsh.d"
    HISTFILE="${ZDOTDIR}/.zsh_history"
    HISTSIZE='10000'
    SAVEHIST="${HISTSIZE}"
    export EDITOR="/usr/bin/vim"
    export TMP="$HOME/tmp"
    export TEMP="$TMP"
    export TMPDIR="$TMP"
    export TMPPREFIX="${TMPDIR}/zsh"
    if [ ! -d "${TMP}" ]; then mkdir "${TMP}"; fi
    if ! [[ "${PATH}" =~ "^${HOME}/bin" ]]; then
    export PATH="${HOME}/bin:${PATH}"
    fi
    # Not all servers have terminfo for rxvt-256color. :<
    if [ "${TERM}" = 'rxvt-256color' ] && ! [ -f '/usr/share/terminfo/r/rxvt-256color' ] && ! [ -f '/lib/terminfo/r/rxvt-256color' ] && ! [ -f "${HOME}/.terminfo/r/rxvt-256color" ]; then
    export TERM='rxvt-unicode'
    fi
    # Colors.
    red='\e[0;31m'
    RED='\e[1;31m'
    green='\e[0;32m'
    GREEN='\e[1;32m'
    yellow='\e[0;33m'
    YELLOW='\e[1;33m'
    blue='\e[0;34m'
    BLUE='\e[1;34m'
    purple='\e[0;35m'
    PURPLE='\e[1;35m'
    cyan='\e[0;36m'
    CYAN='\e[1;36m'
    NC='\e[0m'
    # Functions
    if [ -f '/etc/profile.d/prll.sh' ]; then
    . "/etc/profile.d/prll.sh"
    fi
    run_under_tmux() {
    # Run $1 under session or attach if such session already exist.
    # $2 is optional path, if no specified, will use $1 from $PATH.
    # If you need to pass extra variables, use $2 for it as in example below..
    # Example usage:
    # torrent() { run_under_tmux 'rtorrent' '/usr/local/rtorrent-git/bin/rtorrent'; }
    # mutt() { run_under_tmux 'mutt'; }
    # irc() { run_under_tmux 'irssi' "TERM='screen' command irssi"; }
    # There is a bug in linux's libevent...
    # export EVENT_NOEPOLL=1
    command -v tmux >/dev/null 2>&1 || return 1
    if [ -z "$1" ]; then return 1; fi
    local name="$1"
    if [ -n "$2" ]; then
    local file_path="$2"
    else
    local file_path="command ${name}"
    fi
    if tmux has-session -t "${name}" 2>/dev/null; then
    tmux attach -d -t "${name}"
    else
    tmux new-session -s "${name}" "${file_path}" \; set-option status \; set set-titles-string "${name} (tmux@${HOST})"
    fi
    t() { run_under_tmux rtorrent; }
    irc() { run_under_tmux irssi "TERM='screen' command irssi"; }
    over_ssh() {
    if [ -n "${SSH_CLIENT}" ]; then
    return 0
    else
    return 1
    fi
    reload () {
    exec "${SHELL}" "$@"
    confirm() {
    local answer
    echo -ne "zsh: sure you want to run '${YELLOW}$@${NC}' [yN]? "
    read -q answer
    echo
    if [[ "${answer}" =~ ^[Yy]$ ]]; then
    command "${=1}" "${=@:2}"
    else
    return 1
    fi
    confirm_wrapper() {
    if [ "$1" = '--root' ]; then
    local as_root='true'
    shift
    fi
    local runcommand="$1"; shift
    if [ "${as_root}" = 'true' ] && [ "${USER}" != 'root' ]; then
    runcommand="sudo ${runcommand}"
    fi
    confirm "${runcommand}" "$@"
    poweroff() { confirm_wrapper --root $0 "$@"; }
    reboot() { confirm_wrapper --root $0 "$@"; }
    hibernate() { confirm_wrapper --root $0 "$@"; }
    detox() {
    if [ "$#" -ge 1 ]; then
    confirm detox "$@"
    else
    command detox "$@"
    fi
    has() {
    local string="${1}"
    shift
    local element=''
    for element in "$@"; do
    if [ "${string}" = "${element}" ]; then
    return 0
    fi
    done
    return 1
    begin_with() {
    local string="${1}"
    shift
    local element=''
    for element in "$@"; do
    if [[ "${string}" =~ "^${element}" ]]; then
    return 0
    fi
    done
    return 1
    termtitle() {
    case "$TERM" in
    rxvt*|xterm|nxterm|gnome|screen|screen-*)
    local prompt_host="${(%):-%m}"
    local prompt_user="${(%):-%n}"
    local prompt_char="${(%):-%~}"
    case "$1" in
    precmd)
    printf '\e]0;%s@%s: %s\a' "${prompt_user}" "${prompt_host}" "${prompt_char}"
    preexec)
    printf '\e]0;%s [%s@%s: %s]\a' "$2" "${prompt_user}" "${prompt_host}" "${prompt_char}"
    esac
    esac
    git_check_if_worktree() {
    # This function intend to be only executed in chpwd().
    # Check if the current path is in git repo.
    # We would want stop this function, on some big git repos it can take some time to cd into.
    if [ -n "${skip_zsh_git}" ]; then
    git_pwd_is_worktree='false'
    return 1
    fi
    # The : separated list of paths where we will run check for git repo.
    # If not set, then we will do it only for /root and /home.
    if [ "${UID}" = '0' ]; then
    # running 'git' in repo changes owner of git's index files to root, skip prompt git magic if CWD=/home/*
    git_check_if_workdir_path="${git_check_if_workdir_path:-/root:/etc}"
    else
    git_check_if_workdir_path="${git_check_if_workdir_path:-/home}"
    git_check_if_workdir_path_exclude="${git_check_if_workdir_path_exclude:-${HOME}/_sshfs}"
    fi
    if begin_with "${PWD}" ${=git_check_if_workdir_path//:/ }; then
    if ! begin_with "${PWD}" ${=git_check_if_workdir_path_exclude//:/ }; then
    local git_pwd_is_worktree_match='true'
    else
    local git_pwd_is_worktree_match='false'
    fi
    fi
    if ! [ "${git_pwd_is_worktree_match}" = 'true' ]; then
    git_pwd_is_worktree='false'
    return 1
    fi
    # todo: Prevent checking for /.git or /home/.git, if PWD=/home or PWD=/ maybe...
    # damn annoying RBAC messages about Access denied there.
    if [ -d '.git' ] || [ "$(git rev-parse --is-inside-work-tree 2> /dev/null)" = 'true' ]; then
    git_pwd_is_worktree='true'
    git_worktree_is_bare="$(git config core.bare)"
    else
    unset git_branch git_worktree_is_bare
    git_pwd_is_worktree='false'
    fi
    git_branch() {
    git_branch="$(git symbolic-ref HEAD 2>/dev/null)"
    git_branch="${git_branch##*/}"
    git_branch="${git_branch:-no branch}"
    git_dirty() {
    if [ "${git_worktree_is_bare}" = 'false' ] && [ -n "$(git status --untracked-files='no' --porcelain)" ]; then
    git_dirty='%F{green}*'
    else
    unset git_dirty
    fi
    precmd() {
    # Set terminal title.
    termtitle precmd
    if [ "${git_pwd_is_worktree}" = 'true' ]; then
    git_branch
    git_dirty
    git_prompt=" %F{blue}[%F{253}${git_branch}${git_dirty}%F{blue}]"
    else
    unset git_prompt
    fi
    preexec() {
    # Set terminal title along with current executed command pass as second argument
    termtitle preexec "${(V)1}"
    chpwd() {
    git_check_if_worktree
    man() {
    if command -v vimmanpager >/dev/null 2>&1; then
    PAGER="vimmanpager" command man "$@"
    else
    command man "$@"
    fi
    # Are we running under grsecurity's RBAC?
    rbac_auth() {
    local auth_to_role='admin'
    if [ "${USER}" = 'root' ]; then
    if ! grep -qE '^RBAC:' "/proc/self/status" && command -v gradm > /dev/null 2>&1; then
    echo -e "\n${BLUE}*${NC} ${GREEN}RBAC${NC} Authorize to '${auth_to_role}' RBAC role."
    gradm -a "${auth_to_role}"
    fi
    fi
    #rbac_auth
    # Check if we started zsh in git worktree, useful with tmux when your new zsh may spawn in source dir.
    git_check_if_worktree
    if [ "${git_pwd_is_worktree}" = 'true' ]; then
    git_branch
    git_dirty
    git_prompt=" %F{blue}[%F{253}${git_branch}${git_dirty}%F{blue}]"
    else
    unset git_prompt
    fi
    # Le features!
    # extended globbing, awesome!
    setopt extendedGlob
    # zmv - a command for renaming files by means of shell patterns.
    autoload -U zmv
    # zargs, as an alternative to find -exec and xargs.
    autoload -U zargs
    # Turn on command substitution in the prompt (and parameter expansion and arithmetic expansion).
    setopt promptsubst
    # Control-x-e to open current line in $EDITOR, awesome when writting functions or editing multiline commands.
    autoload -U edit-command-line
    zle -N edit-command-line
    bindkey '^x^e' edit-command-line
    # Include user-specified configs.
    if [ ! -d "${ZSHDDIR}" ]; then
    mkdir -p "${ZSHDDIR}" && echo "# Put your user-specified config here." > "${ZSHDDIR}/example.zsh"
    fi
    for zshd in $(ls -A ${HOME}/.config/zsh.d/^*.(z)sh$); do
    . "${zshd}"
    done
    # Completion.
    autoload -Uz compinit
    compinit
    zstyle ':completion:*' matcher-list 'm:{a-z}={A-Z}'
    zstyle ':completion:*' completer _expand _complete _ignored _approximate
    zstyle ':completion:*' menu select=2
    zstyle ':completion:*' select-prompt '%SScrolling active: current selection at %p%s'
    zstyle ':completion::complete:*' use-cache 1
    zstyle ':completion:*:descriptions' format '%U%F{cyan}%d%f%u'
    # If running as root and nice >0, renice to 0.
    if [ "$USER" = 'root' ] && [ "$(cut -d ' ' -f 19 /proc/$$/stat)" -gt 0 ]; then
    renice -n 0 -p "$$" && echo "# Adjusted nice level for current shell to 0."
    fi
    # Fancy prompt.
    if over_ssh && [ -z "${TMUX}" ]; then
    prompt_is_ssh='%F{blue}[%F{red}SSH%F{blue}] '
    elif over_ssh; then
    prompt_is_ssh='%F{blue}[%F{253}SSH%F{blue}] '
    else
    unset prompt_is_ssh
    fi
    case $USER in
    root)
    PROMPT='%B%F{cyan}%m%k %(?..%F{blue}[%F{253}%?%F{blue}] )${prompt_is_ssh}%B%F{blue}%1~${git_prompt}%F{blue} %# %b%f%k'
    PROMPT='%B%F{blue}%n@%m%k %(?..%F{blue}[%F{253}%?%F{blue}] )${prompt_is_ssh}%B%F{cyan}%1~${git_prompt}%F{cyan} %# %b%f%k'
    esac
    # Ignore lines prefixed with '#'.
    setopt interactivecomments
    # Ignore duplicate in history.
    setopt hist_ignore_dups
    # Prevent record in history entry if preceding them with at least one space
    setopt hist_ignore_space
    # Nobody need flow control anymore. Troublesome feature.
    #stty -ixon
    setopt noflowcontrol
    # Fix for tmux on linux.
    case "$(uname -o)" in
    'GNU/Linux')
    export EVENT_NOEPOLL=1
    esac
    # Aliases
    alias cp='cp -iv'
    alias rcp='rsync -v --progress'
    alias rmv='rsync -v --progress --remove-source-files'
    alias mv='mv -iv'
    alias rm='rm -iv'
    alias rmdir='rmdir -v'
    alias ln='ln -v'
    alias chmod="chmod -c"
    alias chown="chown -c"
    if command -v colordiff > /dev/null 2>&1; then
    alias diff="colordiff -Nuar"
    else
    alias diff="diff -Nuar"
    fi
    alias grep='grep --colour=auto'
    alias egrep='egrep --colour=auto'
    alias ls='ls --color=auto --human-readable --group-directories-first --classify'
    # Keys.
    case $TERM in
    rxvt*|xterm*)
    bindkey "^[[7~" beginning-of-line #Home key
    bindkey "^[[8~" end-of-line #End key
    bindkey "^[[3~" delete-char #Del key
    bindkey "^[[A" history-beginning-search-backward #Up Arrow
    bindkey "^[[B" history-beginning-search-forward #Down Arrow
    bindkey "^[Oc" forward-word # control + right arrow
    bindkey "^[Od" backward-word # control + left arrow
    bindkey "^H" backward-kill-word # control + backspace
    bindkey "^[[3^" kill-word # control + delete
    linux)
    bindkey "^[[1~" beginning-of-line #Home key
    bindkey "^[[4~" end-of-line #End key
    bindkey "^[[3~" delete-char #Del key
    bindkey "^[[A" history-beginning-search-backward
    bindkey "^[[B" history-beginning-search-forward
    screen|screen-*)
    bindkey "^[[1~" beginning-of-line #Home key
    bindkey "^[[4~" end-of-line #End key
    bindkey "^[[3~" delete-char #Del key
    bindkey "^[[A" history-beginning-search-backward #Up Arrow
    bindkey "^[[B" history-beginning-search-forward #Down Arrow
    bindkey "^[Oc" forward-word # control + right arrow
    bindkey "^[Od" backward-word # control + left arrow
    bindkey "^H" backward-kill-word # control + backspace
    bindkey "^[[3^" kill-word # control + delete
    esac
    bindkey "^R" history-incremental-pattern-search-backward
    bindkey "^S" history-incremental-pattern-search-forward
    if [ -f ~/.alert ]; then cat ~/.alert; fi
    Thanks for all the help.
    Last edited by Shark (2013-05-11 22:32:24)

    Raynman wrote:
    "This expression doesn't work", "It doesn't work" ...
    Could you try being a bit more specific?
    Firstly, i am sorry i didn't post the output. I should have know better.
    Secondly, chill out.
    I have used above regex with grep command. Output from terminal is:
    zsh: bad pattern: ^[^#]
    In bash it works perfectly.
    If i issue "setopt re_match_pcre" i have the same ouput as above.
    EDIT: If i issue "unsetopt no_match" it actually works but i have to change the regex from "\^\[^#]" to "\^[^#]" otherwise i get the same output as above. In bash both options work.
    Last edited by Shark (2013-05-11 22:07:21)

  • Regular Expressions and Numbers in Strings

    Looking for someone who has had experience in using Regular Expressions in the Classification Rule Builder.
    We have an eVar that is collecting the number of search results in this fashion:
    <Total Results>_<# of Item 1>_<# of Item 2>_<# of Item 3>_<# of Item 4>
    Example output would look like this:
    150_50_0_25_75
    What we've done is initially create a Regular Expression that looks like this:
    ^(.+)\_(.+)\_(.+)\_(.+)\_(.+)$
    The problem is, it appears in situations where the output contains a zero in one of the slots, the value is ignored and it receives the value in the next place over.  Using the example output shown above, I would end up with values like this:
    $0 150_50_0_25_75
    $1 150
    $2 50
    $3 25
    $4 75
    $5 {null}
    Here's the weird part.  When I perform a test of a single record, it appears like it will work just fine, but when it actually runs in Omniture, it's not working as expected.  Here's something else I'd like to know if it's possible to address.  The five-place string is only the newest iteration of this approach.  In the past, we started out with a two-place version, then three-place and then four.  Any recommendations for handling all scenarios?
    Any and all advice is welcome.  Thanks in advance!

    Doing some playing around on rubular.com and thinking the Regular Expression should be build this way instead:
    ^(\d+)\_(\d+)\_(\d+)\_(\d+)\_(\d+)$
    Again, still looking for any additional guidance from more experienced individuals.  Thanks!

  • Need help in unix regular expressions

    Hi All,
    I'm new to shell scripting. Please help me in achieving this
    I am trying to a find regular expression that need to pick a file with begin with the below format and mask variable is called in xml file.
    currently the script accepts:
    mask="CLIENT_ID+'_ADHSUITE_IN_'+date2str(now,'MMddyy','US/Eastern')+'.txt'"
    But it should accept in the below format
    2595_ADHSUITE_IN_ANNWEL_030309_2009-02-10_15-12-46-000_648.TXT715.outpgp_out
    where CLIENT_ID=2595. How to place wild card character '*' in the below to accept file in the above format. here is what i made changes.
    mask="CLIENT_ID+'_ADHSUITE_IN_'*+date2str(now,'MMddyy','US/Eastern')*+'.TXT'*+'.outpgp_out'"
    Please help.
    Thanks

    I believe your statement is being passed over twice:
    First Pass: (This is done by something like javascript)
    CLIENT_ID+'_ADHSUITE_IN_'+'.*'+date2str(now,'MMddyy','US/Eastern')+'.*'+'.TXT'+'.*'+'.outpgp_out'In this pass the variables and functions that are enclosed in literals are processed:
    (1) CLIENT_ID is replaced by 2595 or whatever is current value is:
    (2) date2str(now,'MMddyy','US/Eastern') gets replaced by 040609 (if the current time now is 4th april 2009).
    So at the end of this first pass we have a string:
    2595_ADHSUITE_IN_.\*040609.\*.TXT.*.outpgp_outThis string at the end of the first pass is a Posix basic regular expression. (ref: [http://en.wikipedia.org/wiki/Regular_expression] ) accessed at time of post).
    This is the string I put in the Regular Expression text box on [http://www.fileformat.info/tool/regex.htm]
    and it matches "2595_ADHSUITE_IN_ANNWEL_040609_2009-01-27_17-02-28-000_631.TXT715.outpgp_out" for me (though I prefer my egrep test).
    I hope this is somewhat clearer. Remember I have very little information about your system/application and I make big guesses.
    NB: (I should thank Frits earlier for pointing my sloppiness between wildcards (for eg unix shell filename expansion) and regular expressions).
    For the second pass this used to compared other strings to see

  • Using Regular Expressions to replace Quotes in Strings

    I am writing a program that generates Java files and there are Strings that are used that contain Quotes. I want to use regular expressions to replace " with \" when it is written to the file. The code I was trying to use was:
    String temp = "\"Hello\" i am a \"variable\"";
    temp = temp.replaceAll("\"","\\\\\"");
    however, this does not work and when i print out the code to the file the resulting code appears as:
    String someVar = ""Hello" i am a "variable"";
    and not as:
    String someVar = "\"Hello\" i am a \"variable\"";
    I am assumming my regular expression is wrong. If it is, could someone explain to me how to fix it so that it will work?
    Thanks in advance.

    Thanks, appearently I'm just doing something weird that I just need to look at a little bit harder.

  • Regular expressions API in v1.3.1

    hi,
    is there any package like java.util.regex of java version 1.4.1 in java version 1.3.1? is it possible to use regular expressions in 1.3.1 version? i need methods like matches() and replaceAll() in Matcher class of version 1.4.1
    if not, is there any external packages for this purpose?

    You can use ORO from Apache Jakarta Project.
    "The Jakarta-ORO Java classes are a set of text-processing Java classes that provide Perl5 compatible regular expressions, AWK-like regular expressions, glob expressions, and utility classes for performing substitutions, splits, filtering filenames, etc. "
    URL: http://jakarta.apache.org/oro/index.html

  • Regular Expressions Character Class shortcuts

    I have been learning to use regular expressions to modify some of my text files. I noticed that on my ARCH box the Character Class shortcuts do not work e.g. [[:digit:]] in an expression works but \d does not. Is this normal or is my installation broken in some way?

    Bebo wrote:
    There are several regexp "dialects". It's quite painful actually For instance, as far as I know, \d works in perl, but not in sed or grep.
    So, yes, this is normal.
    Yeah -- Henry Spencer's regexp stuff is always generally considered the portable form for sed, awk, since they're all based from it.  Newer versions of grep though do allow for a -P flag for perl-regexps to be used, but this is non-portable, obviously.
    -- Thomas Adam

  • Is there a way to be sure that created Regular expression (Class RegExp) is valid?

    Is it possible to create wrong Regular expression in  ActionScript/Flex which will cause runtime error? Or how to validate RegExp?
    I've tried so many  weird regexpes in Flash and Flex, it never complained! How do I know If my  regexp valid?

    "It is VERY reliable. If match() returns null - this means that String doesn't have any parts that match pattern."
    Exactly, but is doenst mean that syntax of RegExp is correct since you dont know what gave you null
    Apparemtly /foo([/ is wrong expression, and Java compiller complains:
    Unclosed character class near index 4 foo([     ^
    but in AS3 i can create
    var xxd:RegExp = /foo([/;
    when I do xxd.exec("") it got null;
    and no complains!! How to understand that?
    But in same time
    var xxd:RegExp = /foo/;
    xxd.exec(""), will be null as well!?

  • "Stupid design of regular expression in java"

    I bet majority will stumble on this:
    String inputStr = "a,,b";
    String patternStr = ",";
    String[] fields = inputStr.split(patternStr);
    result: ["a", "", "c"]
    so if inputStr = ,,
    then result: ["","",""] ??
    If you think so, give yourself a treat.
    Now give yourself a bigger treat. cause
    the result is [""]
    Unbelievable about inconsistency of regular expression.
    It look so irregular.

    Take a look at the site which quote
    http://www.devarticles.com/c/a/Java/Regular-Expressions/10/
    Perl is probably the most popular language to offer regular expression support. As such, it makes sense to put Java&#8217;s regex support in context by comparing it to that of Perl. The distinctions you should be aware of are highlighted in the sections that follow. Generally speaking, J2SE doesn&#8217;t include some Perl constructs, because Java is a full-featured programming language that offers sophisticated condition and logical paths of execution that are reasonable alternatives to the constructs offered by Perl."
    Since, Java doesn't offer means it is perl, why not fix the "weirdness" of perl is in this case ? let the split function work as intended.
    Check this out from Java almanac
    // Parse a comma-separated string
    String inputStr = "a,,b";
    String patternStr = ",";
    String[] fields = inputStr.split(patternStr);
    // ["a", "", "b"]
    // Parse a line whose separator is a comma followed by a space
    inputStr = "a, b, c,d";
    patternStr = ", ";
    fields = inputStr.split(patternStr, -1);
    // ["a", "b", "c,d"]
    // Parse a line with and's and or's
    inputStr = "a, b, and c";
    patternStr = "[, ]+(and|or)*[, ]*";
    fields = inputStr.split(patternStr, -1);
    // ["a", "b", "c"]
    Everything is mentions except to solve the common issue I have.
    http://javaalmanac.com/egs/java.util.regex/ParseLine.html
    The above will failed if any empty data within pipe. It is how you manage to detect the empty data that is relatively important because it screw up your program. And too code to do that increase your chance of program error.
    Sometimes we just accept thing but never question why there aren't better way.

  • GUI Based tool for search and replace using regular expression

    Hi,
    I have developed this small tool which can be used for search and replace in multiple files using reqular expressions.
    Features:
    1. Full regular expression
    2. GUI based with Highlighted results
    3. Preview for replace available
    4. Pure Java based.
    5. Its like unix sed and grep
    Please visit below site for download/more information :
    http://sourceforge.net/projects/regexsearchrepl/
    Thanks,
    Hitesh Viseria

    I agree with you, it cannot compete grep/sed/awk combination. I couldnt find anything even close to grep/sed for windows which will also have a preview and most important, free. That is what made me write this tool. I am trying to
    improve its performance and more features like history etc.
    Any suggestions on additional features are most welcome.

  • Regular Expression Validator Problem

    Hi,
    I have a text area that I want to use a regular expression
    validator on to verify that one or more 9 char alphanumeric serial
    numbers were entered.
    <mx:RegExpValidator
    source="{entryField}"
    property="text"
    flags="gi"
    expression="{/[a-zA-Z0-9][a-zA-Z0-9][a-zA-Z0-9][a-zA-Z0-9][a-zA-Z0-9][a-zA-Z0-9][a-zA-Z0- 9][a-zA-Z0-9][a-zA-Z0-9](
    |,|\\n|\\r|\\t)/}"
    valid="handleResult(event)"
    invalid="handleResult(event)"
    trigger="{entryField}"
    triggerEvent="textInput"
    />
    //for testing
    private function
    handleResult(eventObj:ValidationResultEvent):void {
    if (eventObj.type == ValidationResultEvent.VALID)
    entryField.setStyle("borderColor", "green");
    entryField.setStyle("borderThickness", "2");
    else
    entryField.setStyle("borderColor", "red");
    entryField.setStyle("borderThickness", "2");
    The problem is the handler function always comes back
    invalid, even when it should be valid, such as: 123456789 a12345678
    lk231jkop
    Can anyone advise where I might be going wrong?
    Thanks!

    This is weird, i just tried your RegExp using the Adobe's
    Live Docs example and it worked perfectly.
    Try it at:
    http://livedocs.adobe.com/flex/201/langref/mx/validators/RegExpValidator.html
    Scroll down to the end of the page, then type your RegExp and
    your TestValue and click the Validate button, it works. I am
    guessing that maybe it is your
    i
    flag, try removing it from your validator.
    Hope this helps

  • [solved]Issue concerning regular expressions

    Perhaps the problem in it's entirety is a bit more involved that just regular expressions, but for a start, this small problem is what I need help with.
    In short, I have some data which can be in the of some text, an IP address or an URL.
    In case that it's URL, it will usually, if not always, be in the form of a base URL, with or without one or more subdomains.
    I need to strip away these sub domains so that only the base URL remain, thus:
    abc.def.ghi.com
    becomes
    ghi.com
    forgetting for the moment that some top level domains have two parts, it seems so easy, yes I cannot for the life of me figure out how to do it, using only the basic command line tool available such as sed, awk, and so on and so forth, and thus I hope that someone here can and will help me. It is of course also possible that the process is simply to involved that a simple command line will do, in which case I suppose a bahs or python script will have to do, but I really hope not. In any case I will appreciate any help and/or advise you can give me.
    Best regards.
    Last edited by zacariaz (2015-05-07 13:38:47)

    WorMzy wrote:
    Is this a homework exercise?
    What have you tried so far? It should be easy enough to accomplish using a combination of rev and cut, although I'm sure there are more elegant solutions using awk.
    Mod note: Moving to Newbie Corner
    I wish it were homework, but no.
    The rev idea is interesting, and I'll look into it, but as it is, an awk solution would be preferable, as this is only part of a larger issue, and I'd like to keep it as simple as possible. Also, as for topdomain with two parts, it's still somewhat involved.
    As for what I've tried, a lot as I only stopped and went to bed when I realized the sun was raising, but the main problem is that I don't really know how to attack the problem, and of course I'm not exactly a master of regular expressions.
    Anyway thanks.

  • Logical AND in Java Regular Expressions

    I'm trying to implement logical AND using Java Regular Expressions.
    I couldn't figure out how to do it after reading Java docs and textbooks. I can do something like "abc.*def", which means that I'm looking for strings which have "abc", then anything, then "def", but it is not "pure" logical AND - I will not find "def.*abc" this way.
    Any ideas, how to do it ?
    Baken

    First off, looks like you're really talking about an "OR", not an "AND" - you want it to match abc.*def OR def.*abc right? If you tried to match abc.*def AND def.*abc nothing would ever match that, as no string can begin with both "abc" and "def", just like no numeric value can be both 2 and 5.
    Anyway, maybe regex isn't the right tool for this job. Can you not simply programmatically match it yourself using String methods? You want it to match if the string "starts with" abc and "ends with" def, or vice-versa. Just write some simple code.

  • Help in Regular expression

    Hello..
    I wanted to write a regular expression to match the foll string..
    <!--endclickprintexclude--><!--startclickprintexclude--> <!--endclickprintexclude-->
    <p> <b>NEW ORLEANS, Louisiana (CNN) </b>
    -- Two years after Hurricane Katrina devastated coastal areas of Louisiana and Mississippi, residents say much of America has forgotten their plight.
    </p> <!--startclickprintexclude-->
    I tried doing..
    Matcher matcher= Pattern.compile("<!--endclickprintexclude--> <p><b>([^<^>]+?)</p><!--startclickprintexclude-->", Pattern.CASE_INSENSITIVE).matcher(story);
    Its not working...
    is there any other soln?

    Theres probably a better way to do this but here's a way that works.
    import java.util.regex.*;
    public class RegexTester{
    public static void main(String[] args){
         String text =
         "<!--endclickprintexclude--><!--startclickprintexclude--> <!--endclickprintexclude-->" +
         "<p> <b>NEW ORLEANS, Louisiana (CNN) </b>" +
         "-- Two years after Hurricane Katrina devastated coastal areas of Louisiana and Mississippi," +
         "residents say much of America has forgotten their plight." +
         "</p> <!--startclickprintexclude-->";
         String regex = ">((?:\\s*[\\S&&[^<>]]+\\s*)*?)<";
         Pattern p = Pattern.compile(regex);
         Matcher m = p.matcher(text);
         while(m.find()){
         System.out.println("Match: '" + m.group(1) + "'");
    }

Maybe you are looking for

  • Adjusting color in part of an image in a photo.

    Hi, just bought Photoshop CC and am stuck.  I just received a photo of my uncle that was taken through the glass of the frame it was in.  The right side of his face is lighter than the rest of his face.  How do I adjust this color? You can also see s

  • Does anyone have any information about a VB SDK for LDAP?

    I would like to know if anyone has information/documentation on a VB SDK for Netscape LDAP SERVERS? Thanks.

  • Themes and group themes

    is it possible to implement group themes in oracle spatial? i have been searching for a while but i haven't found anything about this thanks!!

  • Rendering Files Deleted by...?

    Something truly effed up just happened and I'm curious to know if this is a problem with Final Cut or xmas gremlins. I'm working on an FCE project with source files on an external drive. The drive filled up so I set project settings to point to a loc

  • Can Front Row continue playing the next video? (continuous playing)

    Hey everybody, I tried searching the forum, but no hits, so I thought: let's post something I got lots of small (get-a-mac) video's in my Movies folder, and I want to play them on my TV through Front Row. But after every video I have to select the ne