Regular expression on words with % wildcard

Hi,
I've got some processing working using regular expression where I need to process words e.g.
regexp_replace('word1 word2','(\w+)','myprefix{\1}') - results in - 'myprefixword1 myprefixword2'
However, if I'm presented with this; '%word0 word1% wo%d2 word3', then I need to treat % as special case and leave the word as is, so result here would be; - '%word0 word1% wo%d2 myprefixword3', is this achievable using regexp ?

And for those who don't know, I guess we should explain why we're having to expand single spaces to double spaces...
(I'll use the "¬" character to represent spaces to make it clearer to see)
If we have a string such as
word1¬word2¬word3and we want to identify the words in the string (without using any special regexp word identifier) then we are going to use the spaces to identify the start and end of words. To make life easy, we manually put a space at the start and end of the string so we can say that each word in the string will have a space before and after it regardless of where it is in the string...
¬word1¬word2¬word3¬However, when we specify what we want to search for we are going to say we want a space, followed by a number of characters (not spaces), followed by a space...
¬[^¬]*¬So, ideally, you'd expect it to look through the string and say
¬word1¬word2¬word3¬
\_____/... found word1
¬word1¬word2¬word3¬
      \_____/... found word2
¬word1¬word2¬word3¬
            \_____/... found word3
Unfortunately, there is a problem. Once the first word has been found the pointer for searching the rest of the string is located on the next character after the match i.e.
¬word1¬word2¬word3¬
       ^So it won't be able to pick out word2 and will only get to word3. Let's see it in action...
SQL> ed
Wrote file afiedt.buf
1 with t as (select ' word1 word2 word3 ' as txt from dual)
2 --
3 select regexp_replace(txt, ' [^ ]* ', 'xxxxx') as txt
4* from t
SQL> /
TXT
xxxxxword2xxxxx
SQL>In order to deal with this, if we replace the single spaces with double spaces (not required at the start and end) our string looks like...
¬word1¬¬word2¬¬word3¬So as it searches it finds word1 as a match and then the pointer in the string is located...
¬word1¬¬word2¬¬word3¬
       ^... so the next match for the pattern of space-characters-space is word2 and then the pointer is located...
¬word1¬¬word2¬¬word3¬
              ^... ready to find word 3. Example...
SQL> ed
Wrote file afiedt.buf
1 with t as (select ' word1 word2 word3 ' as txt from dual)
2 --
3 select regexp_replace(txt, ' [^ ]* ', 'xxxxx') as txt
4* from t
SQL> /
TXT
xxxxxxxxxxxxxxx
SQL>Hopefully that's a little clearer. You just have to remember the "pointer" principle and the fact that once a match is found it is located on the character after the match.
;)

Similar Messages

Regular Expression - Extract words before the PLUS Sign ?

Dear All,
I had many words with having a symbol plus. I need to extract the words before the plus sign.
I can able to do this by using String.indexOf or String.contains. But i like to know is there is any way to do this using Regular Expression.
sample string
Kathire+san Output Kathire
World+islike Output World
Thanks,
J.Kathir

Here's one way.
import java.util.regex.Pattern;
String input = "abc+def";
Pattern pat = pat.compile("\\+");
String beforePlus = pat.split(input)[0];
Sun's Regular Expression Tutorial for Java
Regular-Expressions.info

Regular Expression Character Sets with Pattern and Matcher

Hi,
I am a little bit confused about a regular expressions I am writing, it works in other languages but not in Java.
The regular expressions is to match LaTeX commands from a file, and is as follows:
\\begin{command}([.|\n\r\s]*)\\end{command}
This does not work in Java but does in PHP, C, etc...
The part that is strange is the . character. If placed as .* it works but if placed as [.]* it doesnt. Does this mean that . cannot be placed in a character range in Java?
Any help very much appreciated.
Kind Regards
Paul Bain

In PHP it seems that the "." still works as a all character operator inside character classes.
The regular expression posted did not work, but it does if I do:
\\begin{command}((.|[\n\r\s])*)?\\end{command}
Basically what I'm trying to match is a block of LaTeX, so the \\begin{command} and \\end{command} in LaTeX, not regex, although the \\ is a single one in LaTeX. I basically want to match any block which starts with one of those and ends in the end command. so really the regular expression that counts is the bit in the middle, ((.|[\n\r\s])*)?
Am I right it saying that the "?" will prevent the engine matching the first and last \\bein and \\end in the following example:
\\begin{command}
some stuff
\\end{command}
\\begin{command}
some stuff
\\end{command}

Safari May Not Interpret Regular Expression in Compliance with W3C Standard

We are troubleshooting why some websites are no longer working when rendered with the Safari 2.0.4 browser. The failure begins when client entered data is validated using regular expressions.
We have localized the issue to Safari's not interpretting regular expressions consistently.
For example:
Does the regex \u00e9 match the literal character é? (Validates the regular expression engine understands Unicode escape sequences for extended characters.)? - NO, but it does on IE and FireFox
Does the regex \u0041 match the literal character A? (Validates the regular expression engine understands Unicode escape sequences for ASCII characters.)? - NO, but it does on IE and FireFox
Does the regex é match the literal character é? (Validates the regular expression engine understands literal characters outside the ASCII range – this is against ECMAScript spec.)? - Sometimes, but always on IE and FireFox
Write a Unicode escape sequence to the screen on the client side. (Validates the string parsing and display in the JS engine works.) - Works on all 3
Is escape sequence \u00e9 equivalent to literal character é? (Validates the string functionality in the JS engine works with extended characters.)? Yes on all 3.
Is escape sequence \u0041 equivalent to literal character A? (Validates the string functionality in the JS engine works with ASCII characters.)? Yes on all 3
Does the regex A match the literal character A? (Validates the regular expression engine understands literal characters in the ASCII range – this is ECMAScript spec.)? Yes on all 3
Please help. It's hard for me to believe that the regular expression / javascript interpreter(s) for Safari aren't working as they have in the past - but all roads are pointed that way....
Thank you for your review.
Mac OS X (10.4.7)

We are troubleshooting why some websites are no longer working when rendered with the Safari 2.0.4 browser. The failure begins when client entered data is validated using regular expressions.
We have localized the issue to Safari's not interpretting regular expressions consistently.
For example:
Does the regex \u00e9 match the literal character é? (Validates the regular expression engine understands Unicode escape sequences for extended characters.)? - NO, but it does on IE and FireFox
Does the regex \u0041 match the literal character A? (Validates the regular expression engine understands Unicode escape sequences for ASCII characters.)? - NO, but it does on IE and FireFox
Does the regex é match the literal character é? (Validates the regular expression engine understands literal characters outside the ASCII range – this is against ECMAScript spec.)? - Sometimes, but always on IE and FireFox
Write a Unicode escape sequence to the screen on the client side. (Validates the string parsing and display in the JS engine works.) - Works on all 3
Is escape sequence \u00e9 equivalent to literal character é? (Validates the string functionality in the JS engine works with extended characters.)? Yes on all 3.
Is escape sequence \u0041 equivalent to literal character A? (Validates the string functionality in the JS engine works with ASCII characters.)? Yes on all 3
Does the regex A match the literal character A? (Validates the regular expression engine understands literal characters in the ASCII range – this is ECMAScript spec.)? Yes on all 3
Please help. It's hard for me to believe that the regular expression / javascript interpreter(s) for Safari aren't working as they have in the past - but all roads are pointed that way....
Thank you for your review.
Mac OS X (10.4.7)

Regular Expression wierdness - problem with $ character

If I use the following KM code in Beanshell Technology - it works correctly and replaces "C$_0MYREMOTETABLE RMTALIAS, MYLOCALTABLE LOCALIAS, " with "C$_0MYREMOTETABLE_000111 RMTALIAS, MYLOCALTABLE LOCALIAS, "
But when I try to use the same exact code in 'Undefined' technology - it does not match anything in the source string - and does not replace anything.
If I change the regular expression to not use the $ it still does not work.
But if I change the source string to remove the $ - then the regular expression works.
If I use the same code in Beanshell technology - it works fine - but then I can't use the value in a later 'Undefined' technology step.
Does anyone know if the java technology does something special with $ characters when ODI parses the KM code?
Does anyone know if there is a way to use the value from a Beanshell variable in a 'Undefined' technology step?
String newSourceTableList = "";
String sessionNum ="<%=odiRef.getSession("SESS_NO") %>";
String sourceTableList = "<%=odiRef.getSrcTablesList("", "[WORK_SCHEMA].[TABLE_NAME] [POP_TAB_ALIAS]" , ",", ",") %>";
String matchExpr = "(C\\$_\\S*)"; (should end with two backslashes followed by 'S*' - this editor mangles it)
String replaceExpr = "$0_"+sessionNum+ " ";
newSourceTableList = sourceTableList.replaceAll(matchExpr,replaceExpr);
---------------------------------------------------

Phases of substitution in ODI:
The way ODI works allows for three separate phases of substitution, and you can use them all. The three phases are:
- First Phase: <% %> You will see these appear in the knowledge moduiles etc and these are substituted on generation. (when you generate a scenario, or tell ODI to execute an interface directly) this phase is used to generate the column names, table names etc which are known from the metadata at that phase.
- Second Phase: <? ?> This phase is substituted when the scenario is instatntiuated as an excution - session generation. At this point, ODI has the additional information which allows it to generate the schema names, as it has resolved the Logical/Physical Schemas through the use of the Context (which is provided for the execution to take place. All the substitutions at this point are written to the execution log.
- Third Phase <@ @> This phase is substituted when the execution code is read from the session log for execution. You will note that anything substituted in this phase is NEVER written to the execution log. (see PASSWORDS as a prime example, you don't want those written to the logs, with the security risks associated with that!)
Anything in <@ @> is always interpreted for substitution by the java beanshell, it does not have to be a Java Beanshell step, it can be any kind of step, it will be interpreted at that run-time point.

Regular expressions: find files with exactly 'n' digits in a row

Hi there,
I want to filter files that contain only a fixed number of digits, but not more (at least not in after the digits).
For example, I have
01.mp3
02.mp3
test10.txt
test000110101010.txt
04.flac
and for n=2 I want to get all files except 'test000110101010.txt'.
The following is not working, and I'm a total newb regarding regular expressions
ls -l | grep '^-' | awk '{print $9}' | grep '([0-9]\{2\})[^0-9]\{2\}'
Thanks for help.
Regards,
drm

Thanks!
I wrote a python script to scan e.g. a music folder for missing files and needed to extract the file numbers from the files to get the "highest" number.
You can get it from here: http://pastebin.com/Sg9yDHiw (Python3, expires in 1 month)
Regards,
drm
Edit: found a bug
Last edited by drm00 (2011-02-04 13:57:43)

[SOLVED]ZSH and regular expressions

Hi
I am getting into regular expressions and i have noticed that with my .zshrc file i have some problem. In bash this expression works:
\^\[^#]
but not also in zsh. I have also noted that regular expression works fine with other zshrc configurations found in archwiki (like grml) but i want to have my configuration. And i really can't find what command make a difference
My .zshrc file is pulled from this site https://github.com/slashbeast/things/bl … s/DOTzshrc.
# .zshrc
# Author: Piotr Karbowski <[email protected]>
# License: beerware.
# Basic zsh config.
umask 077
ZDOTDIR=${ZDOTDIR:-${HOME}}
ZSHDDIR="${HOME}/.config/zsh.d"
HISTFILE="${ZDOTDIR}/.zsh_history"
HISTSIZE='10000'
SAVEHIST="${HISTSIZE}"
export EDITOR="/usr/bin/vim"
export TMP="$HOME/tmp"
export TEMP="$TMP"
export TMPDIR="$TMP"
export TMPPREFIX="${TMPDIR}/zsh"
if [ ! -d "${TMP}" ]; then mkdir "${TMP}"; fi
if ! [[ "${PATH}" =~ "^${HOME}/bin" ]]; then
export PATH="${HOME}/bin:${PATH}"
fi
# Not all servers have terminfo for rxvt-256color. :<
if [ "${TERM}" = 'rxvt-256color' ] && ! [ -f '/usr/share/terminfo/r/rxvt-256color' ] && ! [ -f '/lib/terminfo/r/rxvt-256color' ] && ! [ -f "${HOME}/.terminfo/r/rxvt-256color" ]; then
export TERM='rxvt-unicode'
fi
# Colors.
red='\e[0;31m'
RED='\e[1;31m'
green='\e[0;32m'
GREEN='\e[1;32m'
yellow='\e[0;33m'
YELLOW='\e[1;33m'
blue='\e[0;34m'
BLUE='\e[1;34m'
purple='\e[0;35m'
PURPLE='\e[1;35m'
cyan='\e[0;36m'
CYAN='\e[1;36m'
NC='\e[0m'
# Functions
if [ -f '/etc/profile.d/prll.sh' ]; then
. "/etc/profile.d/prll.sh"
fi
run_under_tmux() {
# Run $1 under session or attach if such session already exist.
# $2 is optional path, if no specified, will use $1 from $PATH.
# If you need to pass extra variables, use $2 for it as in example below..
# Example usage:
# torrent() { run_under_tmux 'rtorrent' '/usr/local/rtorrent-git/bin/rtorrent'; }
# mutt() { run_under_tmux 'mutt'; }
# irc() { run_under_tmux 'irssi' "TERM='screen' command irssi"; }
# There is a bug in linux's libevent...
# export EVENT_NOEPOLL=1
command -v tmux >/dev/null 2>&1 || return 1
if [ -z "$1" ]; then return 1; fi
local name="$1"
if [ -n "$2" ]; then
local file_path="$2"
else
local file_path="command ${name}"
fi
if tmux has-session -t "${name}" 2>/dev/null; then
tmux attach -d -t "${name}"
else
tmux new-session -s "${name}" "${file_path}" \; set-option status \; set set-titles-string "${name} (tmux@${HOST})"
fi
t() { run_under_tmux rtorrent; }
irc() { run_under_tmux irssi "TERM='screen' command irssi"; }
over_ssh() {
if [ -n "${SSH_CLIENT}" ]; then
return 0
else
return 1
fi
reload () {
exec "${SHELL}" "$@"
confirm() {
local answer
echo -ne "zsh: sure you want to run '${YELLOW}$@${NC}' [yN]? "
read -q answer
echo
if [[ "${answer}" =~ ^[Yy]$ ]]; then
command "${=1}" "${=@:2}"
else
return 1
fi
confirm_wrapper() {
if [ "$1" = '--root' ]; then
local as_root='true'
shift
fi
local runcommand="$1"; shift
if [ "${as_root}" = 'true' ] && [ "${USER}" != 'root' ]; then
runcommand="sudo ${runcommand}"
fi
confirm "${runcommand}" "$@"
poweroff() { confirm_wrapper --root $0 "$@"; }
reboot() { confirm_wrapper --root $0 "$@"; }
hibernate() { confirm_wrapper --root $0 "$@"; }
detox() {
if [ "$#" -ge 1 ]; then
confirm detox "$@"
else
command detox "$@"
fi
has() {
local string="${1}"
shift
local element=''
for element in "$@"; do
if [ "${string}" = "${element}" ]; then
return 0
fi
done
return 1
begin_with() {
local string="${1}"
shift
local element=''
for element in "$@"; do
if [[ "${string}" =~ "^${element}" ]]; then
return 0
fi
done
return 1
termtitle() {
case "$TERM" in
rxvt*|xterm|nxterm|gnome|screen|screen-*)
local prompt_host="${(%):-%m}"
local prompt_user="${(%):-%n}"
local prompt_char="${(%):-%~}"
case "$1" in
precmd)
printf '\e]0;%s@%s: %s\a' "${prompt_user}" "${prompt_host}" "${prompt_char}"
preexec)
printf '\e]0;%s [%s@%s: %s]\a' "$2" "${prompt_user}" "${prompt_host}" "${prompt_char}"
esac
esac
git_check_if_worktree() {
# This function intend to be only executed in chpwd().
# Check if the current path is in git repo.
# We would want stop this function, on some big git repos it can take some time to cd into.
if [ -n "${skip_zsh_git}" ]; then
git_pwd_is_worktree='false'
return 1
fi
# The : separated list of paths where we will run check for git repo.
# If not set, then we will do it only for /root and /home.
if [ "${UID}" = '0' ]; then
# running 'git' in repo changes owner of git's index files to root, skip prompt git magic if CWD=/home/*
git_check_if_workdir_path="${git_check_if_workdir_path:-/root:/etc}"
else
git_check_if_workdir_path="${git_check_if_workdir_path:-/home}"
git_check_if_workdir_path_exclude="${git_check_if_workdir_path_exclude:-${HOME}/_sshfs}"
fi
if begin_with "${PWD}" ${=git_check_if_workdir_path//:/ }; then
if ! begin_with "${PWD}" ${=git_check_if_workdir_path_exclude//:/ }; then
local git_pwd_is_worktree_match='true'
else
local git_pwd_is_worktree_match='false'
fi
fi
if ! [ "${git_pwd_is_worktree_match}" = 'true' ]; then
git_pwd_is_worktree='false'
return 1
fi
# todo: Prevent checking for /.git or /home/.git, if PWD=/home or PWD=/ maybe...
# damn annoying RBAC messages about Access denied there.
if [ -d '.git' ] || [ "$(git rev-parse --is-inside-work-tree 2> /dev/null)" = 'true' ]; then
git_pwd_is_worktree='true'
git_worktree_is_bare="$(git config core.bare)"
else
unset git_branch git_worktree_is_bare
git_pwd_is_worktree='false'
fi
git_branch() {
git_branch="$(git symbolic-ref HEAD 2>/dev/null)"
git_branch="${git_branch##*/}"
git_branch="${git_branch:-no branch}"
git_dirty() {
if [ "${git_worktree_is_bare}" = 'false' ] && [ -n "$(git status --untracked-files='no' --porcelain)" ]; then
git_dirty='%F{green}*'
else
unset git_dirty
fi
precmd() {
# Set terminal title.
termtitle precmd
if [ "${git_pwd_is_worktree}" = 'true' ]; then
git_branch
git_dirty
git_prompt=" %F{blue}[%F{253}${git_branch}${git_dirty}%F{blue}]"
else
unset git_prompt
fi
preexec() {
# Set terminal title along with current executed command pass as second argument
termtitle preexec "${(V)1}"
chpwd() {
git_check_if_worktree
man() {
if command -v vimmanpager >/dev/null 2>&1; then
PAGER="vimmanpager" command man "$@"
else
command man "$@"
fi
# Are we running under grsecurity's RBAC?
rbac_auth() {
local auth_to_role='admin'
if [ "${USER}" = 'root' ]; then
if ! grep -qE '^RBAC:' "/proc/self/status" && command -v gradm > /dev/null 2>&1; then
echo -e "\n${BLUE}*${NC} ${GREEN}RBAC${NC} Authorize to '${auth_to_role}' RBAC role."
gradm -a "${auth_to_role}"
fi
fi
#rbac_auth
# Check if we started zsh in git worktree, useful with tmux when your new zsh may spawn in source dir.
git_check_if_worktree
if [ "${git_pwd_is_worktree}" = 'true' ]; then
git_branch
git_dirty
git_prompt=" %F{blue}[%F{253}${git_branch}${git_dirty}%F{blue}]"
else
unset git_prompt
fi
# Le features!
# extended globbing, awesome!
setopt extendedGlob
# zmv - a command for renaming files by means of shell patterns.
autoload -U zmv
# zargs, as an alternative to find -exec and xargs.
autoload -U zargs
# Turn on command substitution in the prompt (and parameter expansion and arithmetic expansion).
setopt promptsubst
# Control-x-e to open current line in $EDITOR, awesome when writting functions or editing multiline commands.
autoload -U edit-command-line
zle -N edit-command-line
bindkey '^x^e' edit-command-line
# Include user-specified configs.
if [ ! -d "${ZSHDDIR}" ]; then
mkdir -p "${ZSHDDIR}" && echo "# Put your user-specified config here." > "${ZSHDDIR}/example.zsh"
fi
for zshd in $(ls -A ${HOME}/.config/zsh.d/^*.(z)sh$); do
. "${zshd}"
done
# Completion.
autoload -Uz compinit
compinit
zstyle ':completion:*' matcher-list 'm:{a-z}={A-Z}'
zstyle ':completion:*' completer _expand _complete _ignored _approximate
zstyle ':completion:*' menu select=2
zstyle ':completion:*' select-prompt '%SScrolling active: current selection at %p%s'
zstyle ':completion::complete:*' use-cache 1
zstyle ':completion:*:descriptions' format '%U%F{cyan}%d%f%u'
# If running as root and nice >0, renice to 0.
if [ "$USER" = 'root' ] && [ "$(cut -d ' ' -f 19 /proc/$$/stat)" -gt 0 ]; then
renice -n 0 -p "$$" && echo "# Adjusted nice level for current shell to 0."
fi
# Fancy prompt.
if over_ssh && [ -z "${TMUX}" ]; then
prompt_is_ssh='%F{blue}[%F{red}SSH%F{blue}] '
elif over_ssh; then
prompt_is_ssh='%F{blue}[%F{253}SSH%F{blue}] '
else
unset prompt_is_ssh
fi
case $USER in
root)
PROMPT='%B%F{cyan}%m%k %(?..%F{blue}[%F{253}%?%F{blue}] )${prompt_is_ssh}%B%F{blue}%1~${git_prompt}%F{blue} %# %b%f%k'
PROMPT='%B%F{blue}%n@%m%k %(?..%F{blue}[%F{253}%?%F{blue}] )${prompt_is_ssh}%B%F{cyan}%1~${git_prompt}%F{cyan} %# %b%f%k'
esac
# Ignore lines prefixed with '#'.
setopt interactivecomments
# Ignore duplicate in history.
setopt hist_ignore_dups
# Prevent record in history entry if preceding them with at least one space
setopt hist_ignore_space
# Nobody need flow control anymore. Troublesome feature.
#stty -ixon
setopt noflowcontrol
# Fix for tmux on linux.
case "$(uname -o)" in
'GNU/Linux')
export EVENT_NOEPOLL=1
esac
# Aliases
alias cp='cp -iv'
alias rcp='rsync -v --progress'
alias rmv='rsync -v --progress --remove-source-files'
alias mv='mv -iv'
alias rm='rm -iv'
alias rmdir='rmdir -v'
alias ln='ln -v'
alias chmod="chmod -c"
alias chown="chown -c"
if command -v colordiff > /dev/null 2>&1; then
alias diff="colordiff -Nuar"
else
alias diff="diff -Nuar"
fi
alias grep='grep --colour=auto'
alias egrep='egrep --colour=auto'
alias ls='ls --color=auto --human-readable --group-directories-first --classify'
# Keys.
case $TERM in
rxvt*|xterm*)
bindkey "^[[7~" beginning-of-line #Home key
bindkey "^[[8~" end-of-line #End key
bindkey "^[[3~" delete-char #Del key
bindkey "^[[A" history-beginning-search-backward #Up Arrow
bindkey "^[[B" history-beginning-search-forward #Down Arrow
bindkey "^[Oc" forward-word # control + right arrow
bindkey "^[Od" backward-word # control + left arrow
bindkey "^H" backward-kill-word # control + backspace
bindkey "^[[3^" kill-word # control + delete
linux)
bindkey "^[[1~" beginning-of-line #Home key
bindkey "^[[4~" end-of-line #End key
bindkey "^[[3~" delete-char #Del key
bindkey "^[[A" history-beginning-search-backward
bindkey "^[[B" history-beginning-search-forward
screen|screen-*)
bindkey "^[[1~" beginning-of-line #Home key
bindkey "^[[4~" end-of-line #End key
bindkey "^[[3~" delete-char #Del key
bindkey "^[[A" history-beginning-search-backward #Up Arrow
bindkey "^[[B" history-beginning-search-forward #Down Arrow
bindkey "^[Oc" forward-word # control + right arrow
bindkey "^[Od" backward-word # control + left arrow
bindkey "^H" backward-kill-word # control + backspace
bindkey "^[[3^" kill-word # control + delete
esac
bindkey "^R" history-incremental-pattern-search-backward
bindkey "^S" history-incremental-pattern-search-forward
if [ -f ~/.alert ]; then cat ~/.alert; fi
Thanks for all the help.
Last edited by Shark (2013-05-11 22:32:24)

Raynman wrote:
"This expression doesn't work", "It doesn't work" ...
Could you try being a bit more specific?
Firstly, i am sorry i didn't post the output. I should have know better.
Secondly, chill out.
I have used above regex with grep command. Output from terminal is:
zsh: bad pattern: ^[^#]
In bash it works perfectly.
If i issue "setopt re_match_pcre" i have the same ouput as above.
EDIT: If i issue "unsetopt no_match" it actually works but i have to change the regex from "\^\[^#]" to "\^[^#]" otherwise i get the same output as above. In bash both options work.
Last edited by Shark (2013-05-11 22:07:21)

Regular Expressions, please help.

Hello everyone.
Can I get a Java Regular Expression to match with a word of the following language...
Start --> Expression;
Expression --> [0-9]+;
Expression --> Expression * Expression;
So the regexp should match with words like:
4;
4664;
4 * 763;
5 * 4534 * 23534;
04 * 002 * 1 * 10 * ...
I would be very happy, if anyone could help.

I dont think that I need to learn anything more.
I am sure it is not possible to make, what I want.
I want to build a compiler.
I just finished the abstract syntax of my language. Now I need a possibility to compile the concrete syntax of my language to the abstract one.
But I think, it is not possible with regular expressions.
Cause I need possibility to match a syntax of type chomsky 2.
I think regular expressions only match chomsky 3 languages.
But the "Backtracking"-mechanism of Java RegExp could do this.
I am not sure with this.
If you have any ideas please post.

Regular expressions: serious design flaw?

I wondered why sometimes, the replaceAll() method works in my code and sometimes it throws a java.util.regex.PatternSyntaxException. I wrote a little test case to show the problem
private void regularExpressionTest() {
 String escapers = "\\([{^$|)?*+."; //characters to escape to avoid PatternSyntaxException
 String text = "The quick brown fox jumps over the lazy dog.";
 System.out.println("Original: "+text);
 String regExp = "o";
 String word = "HELLO";
 String result = text.replaceAll(regExp, word);
 System.out.println("Replace all '"+regExp+"' with "+word+": "+result);
 text = "The quick brown {animal} jumps over the lazy dog.";
 System.out.println("Original: "+text);
 regExp = "{animal}";
// regExp = escapeChars(regExp, escapers); //escape possible regular expression characters
 word = "catterpillar";
 result = text.replaceAll(regExp, word);
 System.out.println("Replace all '"+regExp+"' with "+word+": "+result);
}//regularExpressionTest()The output is:
Original: The quick brown fox jumps over the lazy dog.
Replace all 'o' with HELLO: The quick brHELLOwn fHELLOx jumps HELLOver the lazy dHELLOg.
Original: The quick brown {animal} jumps over the lazy dog.
java.util.regex.PatternSyntaxException: Illegal repetition
{animal}
 at java.util.regex.Pattern.error(Pattern.java:1528)
 at java.util.regex.Pattern.closure(Pattern.java:2545)
 at java.util.regex.Pattern.sequence(Pattern.java:1656)
 at java.util.regex.Pattern.expr(Pattern.java:1545)
 at java.util.regex.Pattern.compile(Pattern.java:1279)
 at java.util.regex.Pattern.<init>(Pattern.java:1035)
 at java.util.regex.Pattern.compile(Pattern.java:779)
 at java.lang.String.replaceAll(String.java:1663)
 at de.icomps.prototypes.Test.regularExpressionTest(Test.java:57)
 at de.icomps.prototypes.Test.main(Test.java:34)
Exception in thread "main" As the first replacement works as espected, the second throws an Exception. Possible because '{' is a special character for the regular expression API. In case I know the regular expression, i could escape these special characters. But in my generic server app, the strings are all parameters, so the only way for replaceAll() to work is, to escape all possible special characters in the regular expression.
1. Is there a complete list of all special characters for regular expressions that need to be escaped?
2. Is there a similar replaceAll() method without regular expressions? So that all occurences of a substring can be replaced by another string? (So far, I wrote my own method but this is of course more time consuming for massive string operations.)

1. The complete list of specially-recognized characters are listed in the Java 1.4.* API. (Of course, new ones may eventually be added as the regex package matures).
2. It is time consuming to program in general. You should have written your own utility method that goes through a String using indexOf and building up the String in a StringBuffer, and apparently you did. Now you have the utility method...you no longer need to write that method again.
3. Or you could have written some kind of method that automatically escapes the specially-recognized characters...

Using Regular Expressions to replace Quotes in Strings

I am writing a program that generates Java files and there are Strings that are used that contain Quotes. I want to use regular expressions to replace " with \" when it is written to the file. The code I was trying to use was:
String temp = "\"Hello\" i am a \"variable\"";
temp = temp.replaceAll("\"","\\\\\"");
however, this does not work and when i print out the code to the file the resulting code appears as:
String someVar = ""Hello" i am a "variable"";
and not as:
String someVar = "\"Hello\" i am a \"variable\"";
I am assumming my regular expression is wrong. If it is, could someone explain to me how to fix it so that it will work?
Thanks in advance.

Thanks, appearently I'm just doing something weird that I just need to look at a little bit harder.

Question sql and regular expressions

I am trying to write a query where a like with a regular % wont work.
select <cols>
from <tab>
where col2 starts with 'SOME CHARACTERS' The field can have 0 or more empty spaces before there are more characters. However, I know what those characters are. How do I do 0 or more characters?
select mycols
from myemptable
where empname like 'JO%JO'There could be 0 or more empty spaces between JO and JO, but the next character is 'JO' . This is a general case. So it could be something else.
so % won't work. since i could get back 'JO HA JO' which I dont want.

Regular expressions where introduced with Unix in 1973.
It shouldn't be difficult to find online resources.
You seem to belong to the big class of users in this forum who doesn't know how to use the Internet, or find that to demanding.
The solution can of course be found in 5 minutes, of 2 to start my database especially for you at almost midnight.
with a as (select 'JO JO' col1 from dual
 union
 select 'JO JO' col1 from dual
 union
 select 'JO HA JO' col1 from dual
 union all
 select 'JO JO BLEH' from dual)
 select * from a where regexp_like(col1,'^JO( +)JO.*$')
You find out what it means as exercise.
Sybrand Bakker
Senior Oracle DBA

Setting set-search-data-hiding-rule-prop target-dn-regular-expressions

Hi. I'm trying to set up a data hiding rule in DPS 6.3. I want to have the data hiding rule apply to any DN that ends with o=ny,c=us. I tried setting the target-dn-regular-expressions to 'o=ny,c=us$'. I figured if DPS used regular expression matching, the $ at the end should be used as an anchor for that string. But the rule is not firing. I proved it by setting the regular expression to none and then setting the target-dns to match exactly the dn I'm returning. Anyone know what I would need to put in for the expression to match what I want? Thanks

Hi,
Regular expressions must comply with the Java regular expression specificationa, available at [http://download.oracle.com/javase/6/docs/api/java/util/regex/Pattern.html ]
For instance, the regular expression (.*)o=ny,c=us filters out the whole o=ny,c=us tree.
Hope this helps
Sylvain

Regular Expression to Locate Words with Character

I want to identify all the words in a document that are followed by the register mark (®) symbol.
I built, what I thought was a regular expression that would search for a register mark preceeded by alpha number characters and a space. So if my text contained the sentence "Adobe InDesign® is a great product.", the regular expression would find "InDesign®"
Below is the regular expression I composed. It grabs anything with a register mark, not just the register marks preceded by a space and alpha numeric characters. Where did I go wrong? I though the \s would restrict the search to complete words with a register mark.
\s[a-zA-Z0-9]|®

\s is the special GREP code for "any kind of space" -- a regular space, a tab, hard return, or any of ID's own white space codes. It has nothing to do with "complete words", because a word can appear at the start of a story, without any preceding space. It would also not find "InDesign®" because there is no space before it, there is a double quote instead.
Your GREP does not work because, well, you got the general idea (words may consist of the set of characters "a-z", "A-Z", and "0-9") but since you use the [..] without any other code, GREP will apply this rule once -- per character. If you want to find words of more than one character, you need to tell GREP "one or more of these, please": with a +.
Second, where did that | come from? It's the OR operator. Essentially, you are looking for
any space followed by one character from the set "a-z", "A-Z", and "0-9"
OR
the ® character
The 'word break' you were looking for is this code: \b, so you could search for "\b[a-zA-Z0-9]+" (note the '+' to allow more than one instance) -- but it's not necessary, because by default GREP grabs as much as it can. The set 'a-zA-Z0-9' etc. describes the allowed "word" characters, but you might want to prefer these: \l (ell) and \u for all lowercase and all uppercase characters -- they are shorter, and they automatically include accented characters, Greek, Russian, and a lot more. Similar, \d (for "digits") is the short-cut for "0-9". And even better: \w is the shortcut for "word character", i.e., your set but then shorter and a bit better.
Try this one:
\w+~r

Regular expression to replace "emtpy space" ( ) bitween words with +

Hallo!
When I wish to find in code something like this:
12144541 FirstWord SecondWord
regular expression for that is:
(\d{1,100})[\s-]\D{1,100}[\s-]\D{1,100}
Now, please help me tu find regular expression to replace
"emtpy space" ( ) bitween words with +
12144541 FirstWord SecondWord to become
12144541+FirstWord+SecondWord
Thank you very, very, very much!

A simple-minded solution is to use \s to match all
whitespace; e.g. find \s and replace with +. DW CS3, at least, is
smart enough to not replace end of line characters with the '+'
character if you limit your search & replace to text.

Regular Expression Find and Replace with Wildcards

Hi!
For the world of me, I can't figure out the right way to do this.
I basically have a list of last names, first names. I want the last name to have a different css style than the first name.
So this is what I have now:
AAGAARD, TODD, S. 
AAMOT, KARI, 
AARON, MARJORIE, C. 
and this is what I need to have:
AAGAARD , TODD, S. 
AAMOT , KARI, 
AARON , MARJORIE, C. 
Any ideas?
Thanks!

Make a backup first.
In the Find field use:
(\w+),\s+([^<]+)<\/b>\s* 
In the Replace field use:
$1 $2 
Select Use regular expression. Light the blue touch paper, and click Replace All.

Regular expression on words with % wildcard

Similar Messages

Maybe you are looking for