Regular Expression Abbreviation of Words
Suppose I have got data in my column like
Balla Ram Chog Mal College
Maharishi Dayanand University
Cambridge Public School
Now I want to write a query using regular expressions to find out the abbreviations, e.g. the resulting data set should be:
BRCMC
MDU
CPS
How should I write a regexp for it?
One way, using SUBSTR and INSTR, tested on 10g.
with data as (
  select 'Balla Ram Chog Mal College' col from dual union all
  select 'Maharishi Dayanand University' col from dual union all
  select 'Cambridge Public School' col from dual
)
select col, replace(ltrim(max(sys_connect_by_path(str, ',')) keep (dense_rank last order by r), ','), ',') abbr
from (
  select col, substr(col, decode(level, 1, 1, instr(col, ' ', 1, level - 1) + 1), 1) str, level, row_number() over (partition by col order by level) r
  from data
  connect by level <= length(col) - length(replace(col, ' ')) + 1
  and col = prior col
  and prior sys_guid() is not null
  order by col, level
)
group by col
start with r = 1
connect by r - 1 = prior r
and col = prior col
and prior sys_guid() is not null;
COL                            ABBR
Balla Ram Chog Mal College     BRCMC
Cambridge Public School        CPS
Maharishi Dayanand University  MDU
With 11g, you do not need the outer query to concatenate the results; you can use LISTAGG directly, as demonstrated by Hashim.
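For comparison, the same abbreviation can be sketched with a single regular expression outside the database. This is only an illustration in Java, not the Oracle answer above; the class and method names are made up for the example, and it assumes words are separated by whitespace.

```java
class Abbreviator {
    // Keep the first character of every word and drop the rest of the word
    // together with the whitespace that follows it.
    static String abbreviate(String name) {
        return name.trim().replaceAll("(\\w)\\w*\\s*", "$1");
    }

    public static void main(String[] args) {
        System.out.println(abbreviate("Balla Ram Chog Mal College"));    // BRCMC
        System.out.println(abbreviate("Maharishi Dayanand University")); // MDU
        System.out.println(abbreviate("Cambridge Public School"));       // CPS
    }
}
```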
Similar Messages
-
Regular Expression to Locate Words with Character
I want to identify all the words in a document that are followed by the register mark (®) symbol.
I built what I thought was a regular expression that would search for a register mark preceded by alphanumeric characters and a space. So if my text contained the sentence "Adobe InDesign® is a great product.", the regular expression would find "InDesign®".
Below is the regular expression I composed. It grabs anything with a register mark, not just the register marks preceded by a space and alphanumeric characters. Where did I go wrong? I thought the \s would restrict the search to complete words with a register mark.
\s[a-zA-Z0-9]|®
\s is the special GREP code for "any kind of space" -- a regular space, a tab, hard return, or any of ID's own white space codes. It has nothing to do with "complete words", because a word can appear at the start of a story, without any preceding space. It would also not find "InDesign®" because there is no space before it; there is a double quote instead.
Your GREP does not work because, well, you got the general idea (words may consist of the set of characters "a-z", "A-Z", and "0-9") but since you use the [..] without any other code, GREP will apply this rule once -- per character. If you want to find words of more than one character, you need to tell GREP "one or more of these, please": with a +.
Second, where did that | come from? It's the OR operator. Essentially, you are looking for
any space followed by one character from the set "a-z", "A-Z", and "0-9"
OR
the ® character
The 'word break' you were looking for is this code: \b, so you could search for "\b[a-zA-Z0-9]+" (note the '+' to allow more than one instance) -- but it's not necessary, because by default GREP grabs as much as it can. The set 'a-zA-Z0-9' etc. describes the allowed "word" characters, but you might prefer these: \l (ell) and \u for all lowercase and all uppercase characters -- they are shorter, and they automatically include accented characters, Greek, Russian, and a lot more. Similarly, \d (for "digits") is the shortcut for "0-9". And even better: \w is the shortcut for "word character", i.e., your set but shorter and a bit better.
Try this one:
\w+~r -
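The "one or more word characters, then the register mark" idea from the InDesign thread can be tried outside InDesign as well. The sketch below uses Java rather than InDesign GREP, so ~r is written as the Unicode escape \u00AE; the class and method names are invented for the example. Note that Java's \w covers only ASCII letters, digits and underscore by default.

```java
import java.util.ArrayList;
import java.util.List;
import java.util.regex.Matcher;
import java.util.regex.Pattern;

class RegisteredWords {
    // \w+ = one or more word characters; \u00AE = the registered-trademark sign.
    private static final Pattern WORD_WITH_MARK = Pattern.compile("\\w+\u00AE");

    // Returns every word that is immediately followed by the register mark.
    static List<String> find(String text) {
        List<String> hits = new ArrayList<>();
        Matcher m = WORD_WITH_MARK.matcher(text);
        while (m.find()) {
            hits.add(m.group());
        }
        return hits;
    }

    public static void main(String[] args) {
        // Prints the single hit, the word InDesign with its register mark.
        System.out.println(find("Adobe InDesign\u00AE is a great product."));
    }
}
```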
Regular Expression for non-words
hello all!
can you help me construct a regular expression that will match non-word strings say "������". I will be needing this to filter words from a Microsoft Word Document.
Thanx!
I don't think this is a problem that should be solved with regex. You would have to convert your Word document to a String and use replaceAll() with "\\W" as the regex.
Correct me if I am wrong but I thought that Word files were binary so your first problem will be to convert the file(s) to a String. -
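For the non-word filtering thread above, the replaceAll() suggestion might look like the sketch below. It assumes the document text has already been extracted into a String by some other means; the class and method names are hypothetical. Keep in mind that \W also matches spaces, and that Java's \w is ASCII-only by default, so accented letters would be stripped too.

```java
class NonWordFilter {
    // Removes every character that is not a letter, digit or underscore.
    static String stripNonWord(String text) {
        return text.replaceAll("\\W", "");
    }

    public static void main(String[] args) {
        System.out.println(stripNonWord("Hello, world! 42")); // Helloworld42
    }
}
```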
Regular Expression to split words
Hi all,
I want to extract the last word of a string, i.e. everything after the last space. The string has at most five words.
I used the following query, but it does not work:
SQL> SELECT REGEXP_SUBSTR('system hello sidval',
  '[a-z]+\S+') RESULT
  FROM DUAL;
RESULT
system
Examples:
1- If the string is
Daivd from uk
the output is
uk
2- If the string is
David john
the output is
john
Regards
Edited by: Ayham on Oct 7, 2012 12:01 PM
Edited by: Ayham on Oct 7, 2012 12:18 PM
Try this:
SQL> SELECT REGEXP_SUBSTR('system hello sidval', '[a-z]+\S*$') RESULT FROM DUAL;
The extra $ anchors the match to the end of the string. With * instead of +, the \S part may also match nothing, so the pattern still matches when the last word is a single letter.
bye
TPD
Edited by: TPD Opitz-Consulting com on 07.10.2012 21:35 -
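The end-of-string anchor used in the last-word answer above works the same way in other regex engines. A minimal Java sketch follows; the class name is invented, and tolerating trailing whitespace is an extra assumption that the original query does not make.

```java
import java.util.regex.Matcher;
import java.util.regex.Pattern;

class LastWord {
    // (\S+) captures a run of non-space characters; \s*$ requires that only
    // optional whitespace follows it before the end of the string.
    private static final Pattern LAST = Pattern.compile("(\\S+)\\s*$");

    // Returns the last word, or an empty string if there is none.
    static String lastWord(String s) {
        Matcher m = LAST.matcher(s);
        return m.find() ? m.group(1) : "";
    }

    public static void main(String[] args) {
        System.out.println(lastWord("system hello sidval")); // sidval
        System.out.println(lastWord("David john"));          // john
    }
}
```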
Regular Expression - Select two words after specific string
Hi,
I am trying to select the two words/strings after the first word "door". I am using the search pattern (?<=door).\w+ but in this case I get the complete text after the word "door". I only want to select the two words after the first "door" in the complete text.
Can anybody help me?
Thanks!
Marco Snels
Hi Marco,
I'm relatively handy with RegEx but this seems like a problem where I would employ a little bit of RegEx and CTL, just to make life easier.
You can use the following RegEx (note: I didn't test this in Integrator, only in a RegEx testing tool) to extract the two words after door (but including door, unfortunately):
(?:door)[\s]\w+[\s]\w+
This would give you something like the following in your extracted field:
door is brown
You could then pass through a re-formatter to remove "door" and the whitespace and be on your way. Not the best answer but should perform reasonably well and get you up and going.
Regards,
Patrick Rafferty
http://branchbird.com -
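If the tool in the "door" thread allows capture groups, the re-formatting step can be avoided by capturing only the two words. The following is a plain Java sketch, not tested in Integrator; the class and method names are hypothetical.

```java
import java.util.regex.Matcher;
import java.util.regex.Pattern;

class AfterDoor {
    // "door" as a whole-word prefix, then whitespace, then two words in group 1.
    private static final Pattern TWO_WORDS = Pattern.compile("\\bdoor\\s+(\\w+\\s+\\w+)");

    // Returns the two words after the first "door", or null if there are none.
    static String twoWordsAfterDoor(String text) {
        Matcher m = TWO_WORDS.matcher(text);
        return m.find() ? m.group(1) : null;
    }

    public static void main(String[] args) {
        System.out.println(twoWordsAfterDoor("The door is brown and the door is open")); // is brown
    }
}
```

Because find() stops at the first match, later occurrences of "door" are ignored.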
Quick regular expression question/help
Can someone help me with two regular expressions I need? I could spend a while trying to figure it out myself; however, time's short and I really would like to get a foolproof, optimal solution (my attempt would be buggy).
Sample sentence
The population, is projected to reach 200,000, or more (by 2020).[7] This is {dummy} text.
The first regular expression
I need all brackets and every thing between them to be removed from a sentence.
Brackets such as: ( ), [ ] and { } .
I.e. Given the above sentence the following would be returned:
The population, is projected to reach 200,000, or more. This is text.
The second regular expression
If a word has a trailing comma character I need to add a whitespace between the word and the comma.
I.e. Given the sentence returned from the first regular expression, this regex would return:
The population *,* is projected to reach 200,000 *,* or more. This is text.
Many thanks to anyone who can help me with this!
Edited by: Myles on Jan 18, 2008 8:12 AM
http://java.sun.com/docs/books/tutorial/extra/regex/index.html
http://www.regular-expressions.info -
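The reply to the bracket question gives only links, so here is one possible sketch of the two expressions in Java. It assumes brackets are never nested and that a "trailing comma" means a comma followed by whitespace or the end of the text; the class and method names are invented for the example.

```java
class SentenceCleaner {
    // Removes (...), [...] and {...} groups, plus any whitespace directly before them.
    static String stripBrackets(String s) {
        return s.replaceAll("\\s*(\\([^)]*\\)|\\[[^\\]]*\\]|\\{[^}]*\\})", "");
    }

    // Puts a space before a comma that ends a word; commas inside
    // numbers such as 200,000 are followed by a digit and are left alone.
    static String spaceBeforeTrailingComma(String s) {
        return s.replaceAll(",(?=\\s|$)", " ,");
    }

    public static void main(String[] args) {
        String in = "The population, is projected to reach 200,000, or more (by 2020).[7] This is {dummy} text.";
        String stripped = stripBrackets(in);
        System.out.println(stripped);
        System.out.println(spaceBeforeTrailingComma(stripped));
    }
}
```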
SQL Injection and Java Regular Expression: How to match words?
Dear friends,
I am handling sql injection attack to our application with java regular expression. I used it to match that if there are malicious characters or key words injected into the parameter value.
The denied characters and key words can be " ' ", " ; ", "insert", "delete" and so on. The expression I write is String pattern_str="('|;|insert|delete)+".
I know it is not correct. It could not be used to only match the whole word insert or delete. Each character in the two words can be matched and it is not what I want. Do you have any idea to only match the whole word?
Thanks,
Ricky
Edited by: Ricky Ru on 28/04/2011 02:29
Avoid dynamic SQL, avoid string concatenation, and use bind variables; then the risk is negligible.
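On the narrower regex question of matching only whole keywords, a word boundary (\b) around the alternation is the usual tool. The sketch below is illustrative only: the class name and the deny list are assumptions taken from the question, and, as the reply says, bind variables rather than filtering are the real protection. A filter like this also rejects legitimate input such as O'Brien.

```java
import java.util.regex.Pattern;

class DeniedInputCheck {
    // A quote or semicolon anywhere, or "insert"/"delete" as a whole word.
    private static final Pattern DENIED =
        Pattern.compile("[';]|\\b(?:insert|delete)\\b", Pattern.CASE_INSENSITIVE);

    static boolean containsDenied(String value) {
        return DENIED.matcher(value).find();
    }

    public static void main(String[] args) {
        System.out.println(containsDenied("insert into t")); // true
        System.out.println(containsDenied("inserted"));      // false
    }
}
```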
-
Regular Expression - Extract words before the PLUS Sign ?
Dear All,
I have many words containing a plus symbol. I need to extract the part before the plus sign.
I am able to do this using String.indexOf or String.contains, but I would like to know whether there is a way to do it using a regular expression.
sample string
Kathire+san Output Kathire
World+islike Output World
Thanks,
J.Kathir
Here's one way.
import java.util.regex.Pattern;
String input = "abc+def";
// compile is a static method, so it is called on the Pattern class itself
Pattern pat = Pattern.compile("\\+");
String beforePlus = pat.split(input)[0];
Sun's Regular Expression Tutorial for Java
Regular-Expressions.info -
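Wrapped into a runnable class, the split approach from the plus-sign answer could look like this. The class and method names are made up, and the limit of 2 is an added detail so that the input is split at most once.

```java
import java.util.regex.Pattern;

class BeforePlus {
    // The plus sign is a regex metacharacter, so it has to be escaped.
    private static final Pattern PLUS = Pattern.compile("\\+");

    // Returns the text before the first '+', or the whole input if there is none.
    static String beforePlus(String input) {
        return PLUS.split(input, 2)[0];
    }

    public static void main(String[] args) {
        System.out.println(beforePlus("Kathire+san"));  // Kathire
        System.out.println(beforePlus("World+islike")); // World
    }
}
```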
Regular expression to replace "empty space" ( ) between words with +
Hallo!
When I wish to find in code something like this:
12144541 FirstWord SecondWord
regular expression for that is:
(\d{1,100})[\s-]\D{1,100}[\s-]\D{1,100}
Now, please help me to find a regular expression to replace
the "empty space" ( ) between words with +
12144541 FirstWord SecondWord to become
12144541+FirstWord+SecondWord
Thank you very, very, very much!
A simple-minded solution is to use \s to match all whitespace; e.g. find \s and replace with +. DW CS3, at least, is smart enough not to replace end-of-line characters with the '+' character if you limit your search & replace to text. -
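Outside Dreamweaver, the same find-and-replace can be sketched in Java. To leave line endings alone without relying on the editor, this version matches only spaces and tabs instead of \s; the class and method names are hypothetical.

```java
class SpaceToPlus {
    // Replaces each run of spaces or tabs with a single '+'; newlines are untouched.
    static String join(String line) {
        return line.replaceAll("[ \\t]+", "+");
    }

    public static void main(String[] args) {
        System.out.println(join("12144541 FirstWord SecondWord")); // 12144541+FirstWord+SecondWord
    }
}
```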
Want to replace a string containing consecutive repeating words to one using regular expression
Hi Experts,
I need a regular expression to replace all duplicate words in a string with one.
eg: 'Hello Hello World 4-4-5 etc etc' should be changed to 'Hello World 4-4-5 etc'.
I tried many of them but they had one or the other problem. like (\w+\S\W)\1+' replace with ' \1' and ' (\w+\W)\1+' replaced with ' \1' , etc
Thanks in advance
Tarique
Hi,
Translating what frank said to JAVA would be something like this:
StringBuilder result = new StringBuilder();
String myString = "This is right right, that is wrong.";
String[] words = myString.split(" ");
String lastWord = "";
for (String str : words) {
    if (!str.startsWith(lastWord)) {
        // Not a repeat of the previous word: keep it whole.
        result.append(str);
    } else {
        // Starts with the previous word: keep only what follows it (e.g. the ",").
        // Rough heuristic: "a" followed by "apple" would be cut as well.
        result.append(str.substring(lastWord.length()));
    }
    lastWord = str;
    result.append(" ");
}
System.out.println(result);
If you didn't have periods and commas in your message, it would be easier. The code is not 100% correct, and you will need to adapt it to your requirements. -
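The original request in the duplicate-words thread was for a regular expression, and a backreference can express "a word followed by one or more repeats of itself". This is a sketch in Java with an invented class name, assuming the repeats are separated only by whitespace.

```java
class DuplicateWords {
    // \b(\w+) captures a whole word; (?:\s+\1\b)+ matches one or more
    // whitespace-separated repeats of that same word, which are dropped.
    static String collapse(String s) {
        return s.replaceAll("\\b(\\w+)(?:\\s+\\1\\b)+", "$1");
    }

    public static void main(String[] args) {
        System.out.println(collapse("Hello Hello World 4-4-5 etc etc")); // Hello World 4-4-5 etc
    }
}
```

The digits in 4-4-5 survive because they are separated by hyphens, not whitespace.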
Regular expression to check a value if it contains a specific word.
Hi All,
How can i check if a certain word exists in a value in regular expression ?
I have an attribute called Race. Race can contain the following:
White, Non-Hispanic
Black, Non-Hispanic
White, Non Hispanic
Black, Non Hispanic
White, NonHispanic
Non-Hispanic, white
Non Hispanic - black
What i want is to check if my value contains the word "NON" (NON can be at the beginning, middle or end), if it does, parse it and return it.
This is what I have, however I want to make sure it covers all cases and not missing anything else
select REGEXP_SUBSTR(UPPER(trim('Black, Non-Hispanic')), '[NON]+') from dual;
Thanks in advance.
Rooney wrote:
Could you please explain what are the 2 ones's for ?
The two 1s are not really needed for this. It is just that the syntax requires those parameters when I add the fifth parameter.
http://docs.oracle.com/cd/E14072_01/server.112/e10592/functions148.htm
The first 1 is where the search starts (same as in substr('Abc',1)).
The second 1 is the occurrence number. Here it means: return the first occurrence that was found. Replace it with 2 in my next example to see a (very slight) difference.
Also 'NON' alone will not cover all cases ?
But you don't have NON alone. You have a regexp with NON plus UPPER. The 'i' replaces the UPPER. The output is also slightly different: the 'i' version returns the match with the same capitalization as in the original. It depends a little on what you want to achieve. And of course INSTR will give the same information as your version: if the result is > 0, NON was found.
with testdata as (select 'White,Non-Hispanic' str from dual union all
select 'Non-White,nOn-Hispanic' str from dual union all
select 'White,Hispanic' str from dual
)
/* end of test data creation */
select str,
REGEXP_SUBSTR(UPPER(TRIM(str)), 'NON') regexp1,
REGEXP_SUBSTR(str, 'NON',1,1,'i') regexp2,
instr(upper(str),'NON') instr
from testdata;
STR                     REGEXP1  REGEXP2  INSTR
White,Non-Hispanic      NON      Non      7
Non-White,Non-Hispanic  NON      Non      1
White,Hispanic                            0 -
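One detail in the Race question worth spelling out: '[NON]+' is a character class, so it matches any run of the letters N and O, not the word NON. The Java sketch below contrasts it with a case-insensitive literal search, which behaves like the 'i' parameter above; the class and method names are hypothetical.

```java
import java.util.regex.Matcher;
import java.util.regex.Pattern;

class NonFinder {
    private static final Pattern NON = Pattern.compile("non", Pattern.CASE_INSENSITIVE);
    private static final Pattern CHAR_CLASS = Pattern.compile("[NON]+");

    // Returns "non" as it is capitalized in the input, or null if absent.
    static String findNon(String race) {
        Matcher m = NON.matcher(race);
        return m.find() ? m.group() : null;
    }

    // Shows what the character-class version from the question matches.
    static String charClassMatch(String race) {
        Matcher m = CHAR_CLASS.matcher(race);
        return m.find() ? m.group() : null;
    }

    public static void main(String[] args) {
        System.out.println(findNon("White,Non-Hispanic")); // Non
        System.out.println(charClassMatch("ONE"));         // ON
    }
}
```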
Regular expression on words with % wildcard
Hi,
I've got some processing working using regular expression where I need to process words e.g.
regexp_replace('word1 word2','(\w+)','myprefix{\1}') - results in - 'myprefixword1 myprefixword2'
However, if I'm presented with this: '%word0 word1% wo%d2 word3', then I need to treat % as a special case and leave the word as is, so the result here would be '%word0 word1% wo%d2 myprefixword3'. Is this achievable using regexp?
And for those who don't know, I guess we should explain why we're having to expand single spaces to double spaces...
(I'll use the "¬" character to represent spaces to make it clearer to see)
If we have a string such as
word1¬word2¬word3
and we want to identify the words in the string (without using any special regexp word identifier) then we are going to use the spaces to identify the start and end of words. To make life easy, we manually put a space at the start and end of the string so we can say that each word in the string will have a space before and after it regardless of where it is in the string...
¬word1¬word2¬word3¬
However, when we specify what we want to search for we are going to say we want a space, followed by a number of characters (not spaces), followed by a space...
¬[^¬]*¬
So, ideally, you'd expect it to look through the string and say
¬word1¬word2¬word3¬
\_____/... found word1
¬word1¬word2¬word3¬
      \_____/... found word2
¬word1¬word2¬word3¬
            \_____/... found word3
Unfortunately, there is a problem. Once the first word has been found the pointer for searching the rest of the string is located on the next character after the match i.e.
¬word1¬word2¬word3¬
       ^
So it won't be able to pick out word2 and will only get to word3. Let's see it in action...
SQL> with t as (select ' word1 word2 word3 ' as txt from dual)
  --
  select regexp_replace(txt, ' [^ ]* ', 'xxxxx') as txt
  from t
  /
TXT
xxxxxword2xxxxx
In order to deal with this, if we replace the single spaces with double spaces (not required at the start and end) our string looks like...
¬word1¬¬word2¬¬word3¬
So as it searches it finds word1 as a match and then the pointer in the string is located...
¬word1¬¬word2¬¬word3¬
       ^
... so the next match for the pattern of space-characters-space is word2 and then the pointer is located...
¬word1¬¬word2¬¬word3¬
              ^
... ready to find word3. Example...
SQL> with t as (select ' word1  word2  word3 ' as txt from dual)
  --
  select regexp_replace(txt, ' [^ ]* ', 'xxxxx') as txt
  from t
  /
TXT
xxxxxxxxxxxxxxx
Hopefully that's a little clearer. You just have to remember the "pointer" principle and the fact that once a match is found it is located on the character after the match.
;) -
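The "pointer" behaviour described above is not specific to Oracle; Java's replaceAll consumes matched characters in the same way. The sketch below reproduces both results and then shows an alternative that only works in engines with lookaround (Java has it; Oracle's regexp functions do not): prefix a word only if neither it nor its neighbours contain %. The class and method names are invented for the example.

```java
class WordPrefixer {
    // Same pattern as the SQL example: space, non-spaces, space.
    static String replaceSpaced(String txt) {
        return txt.replaceAll(" [^ ]* ", "xxxxx");
    }

    // Prefixes only words that are not touched by a word character or '%'
    // on either side, so words containing % are left as they are.
    static String prefixPlainWords(String txt) {
        return txt.replaceAll("(?<![\\w%])(\\w+)(?![\\w%])", "myprefix$1");
    }

    public static void main(String[] args) {
        System.out.println(replaceSpaced(" word1 word2 word3 "));   // xxxxxword2xxxxx
        System.out.println(replaceSpaced(" word1  word2  word3 ")); // xxxxxxxxxxxxxxx
        System.out.println(prefixPlainWords("%word0 word1% wo%d2 word3"));
    }
}
```

Because lookarounds do not consume characters, no space doubling is needed in this variant.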
Regular Expression - replaceAll() - how to replace words?
Hiya,
I have this regex to replace all instances of myWord:
String oldWord = "oldWord";
String newWord = "newWord";
String sentence = "some sentence that contains " + oldWord;
String newSentence = replaceWordsInSentence(sentence, oldWord, newWord);
private String replaceWordsInSentence(String sentence, String oldWord, String newWord) {
return sentence.replaceAll("\b" + oldWord + "\b", newWord);
}
...it works in most instances, but when oldWord is at the end of the sentence it is not replaced. Presumably the problem is that "\b" is not a sufficient word boundary. Can someone help me out with the correct regular expression code?
Thanks,
James
Mel, you did appear to misunderstand as you thought points 2 and 3 were alternatives, but you now recognise that they are additional "shoulds".
Of course, I applied the extra backslash as soon as Joachim advised. Maybe you don't agree with my rationale, but I prefer the complete solution that will work in all instances... so was simply waiting for him to post a code example that included the latter 2 points as (although I understood the point of them perfectly) I was not sure how to implement them.
Have come up with the following, expanded, method...
private String replaceWordsInSentence(String sentence, String oldWord, String newWord) {
return sentence.replaceAll("\\b" + Pattern.quote(oldWord) + "\\b", Matcher.quoteReplacement(newWord));
}
...works fine with the tests I have run. Joachim, can you confirm this is correct? -
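For anyone wanting to try the expanded method from the thread above, here is a self-contained version with the imports it needs. The wrapper class name and the sample strings are assumptions for the example. In a Java string literal "\b" is a backspace character, which is why the doubled backslash matters.

```java
import java.util.regex.Matcher;
import java.util.regex.Pattern;

class WordReplacer {
    // Pattern.quote treats oldWord as literal text; Matcher.quoteReplacement
    // does the same for '$' and '\' in newWord.
    static String replaceWordsInSentence(String sentence, String oldWord, String newWord) {
        return sentence.replaceAll("\\b" + Pattern.quote(oldWord) + "\\b",
                                   Matcher.quoteReplacement(newWord));
    }

    public static void main(String[] args) {
        // The word at the end of the sentence is replaced as well.
        System.out.println(replaceWordsInSentence("some sentence that contains oldWord", "oldWord", "newWord"));
    }
}
```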
Regular expression to add underscore before single capital
I am trying to convert the names of attributes that use capitalization sort of like camelcase to distinguish multiple words, e.g. VehicleColor to use underscores instead, eg. Vehicle_Color.
I have a regular expression that does this, however I have a problem when an abbreviation consisting of multiple upper case characters is present, e.g. AverageMPG becomes Average_M_P_G. I am trying to come up with a pattern that only adds the underscores to the first occurrence of a capital letter in a series which should result in the abbreviation MPG becoming Average_MPG.
SQL> select * from v$version where rownum = 1;
BANNER
Oracle Database 11g Enterprise Edition Release 11.1.0.7.0 - 64bit Production
SQL> with test_data as
(
select 'VehicleColor' str from dual union all
select 'WeightClass' str from dual union all
select 'AverageMPG' str from dual union all
select 'HighMPG' str from dual union all
select 'LowMPG' str from dual union all
select 'ABS_System' str from dual
)
select
str,
regexp_replace(str, '([A-Z])', '_\1', 2) result
from
test_data;
STR           RESULT
VehicleColor  Vehicle_Color
WeightClass   Weight_Class
AverageMPG    Average_M_P_G
HighMPG       High_M_P_G
LowMPG        Low_M_P_G
ABS_System    A_B_S__System
6 rows selected.
These are the results I would like, but I don't know how to modify the pattern so that the replace acts only on the first capital letter in a series of capitals, or whether that is possible.
STR           RESULT
VehicleColor  Vehicle_Color
WeightClass   Weight_Class
AverageMPG    Average_MPG
HighMPG       High_MPG
LowMPG        Low_MPG
ABS_System    ABS_System
with test_data as
(
select 'VehicleColor' str from dual union all
select 'WeightClass' str from dual union all
select 'AverageMPG' str from dual union all
select 'HighMPG' str from dual union all
select 'LowMPG' str from dual union all
select 'ABS_System' str from dual
)
select str, replace(regexp_replace(replace(str,'_',' '), '([^[:upper:]])([[:upper:]]{1,})([^[:upper:]]|$)', '\1_\2\3' ),' ') result
from test_data
STR           RESULT
VehicleColor  Vehicle_Color
WeightClass   Weight_Class
AverageMPG    Average_MPG
HighMPG       High_MPG
LowMPG        Low_MPG
ABS_System    ABS_System -
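In a regex engine with lookaround, the same rule ("underscore only where a lowercase letter is followed by a capital") can be written without the space-swapping trick used above. This is a Java sketch with a made-up class name, not an Oracle solution, since Oracle's regexp functions do not support lookaround.

```java
class AttributeNamer {
    // Zero-width match: the position between a lowercase letter and a capital.
    static String toUnderscored(String name) {
        return name.replaceAll("(?<=[a-z])(?=[A-Z])", "_");
    }

    public static void main(String[] args) {
        System.out.println(toUnderscored("VehicleColor")); // Vehicle_Color
        System.out.println(toUnderscored("AverageMPG"));   // Average_MPG
        System.out.println(toUnderscored("ABS_System"));   // ABS_System
    }
}
```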
Help in regular expression matching
I have three expressions like
1) [(y2009)(y2011)]
2) [(y2008M5)(y2011M3)] or [(y2009M5)(y2010M12)]
3) [(y2009M1d20)(y2011M12d31)]
i want regular expression pattern for the above three expressions
I am using :
REGEXP_LIKE(timedomainexpression, '???[:digit:]{4}*[:digit:]{1,2}???[:digit:]{4}*[:digit:]{1,2}??', 'i');
but its giving results for all above expressions while i want different expression for each.
I have used * after [:digit:]{4}; when I use ? or . instead, it gives no results. Please help in this situation ASAP.
Thanks
I don't get your question. Can you post your desired output and also give some sample data?
Please consider the following when you post a question.
1. New features keep coming in every Oracle version, so please provide your Oracle DB version to get the best possible answer.
You can use the following query and copy and paste the output.
select * from v$version
2. This forum has a very good Search Feature. Please use that before posting your question, because for most of the questions
that are asked the answer is already there.
3. We don't know your DB structure or what your data looks like, so you need to let us know. The best way would be to give some sample data like this.
I have the following table called sales
with sales
as
(
select 1 sales_id, 1 prod_id, 1001 inv_num, 120 qty from dual
union all
select 2 sales_id, 1 prod_id, 1002 inv_num, 25 qty from dual
)
select *
from sales
4. Rather than describing what you want in words, it is easier when you give your expected output.
For example, in the above sales table, I want to know the total quantity and number of invoices for each product.
The output should look like this
Prod_id sum_qty count_inv
1       145     2
5. Whenever you get an error message, post the entire error message, with the error number, the message and the line number.
6. The next thing is very important to remember. Please post only well-formatted code. Unformatted code is very hard to read.
Your code format gets lost when you post it in the Oracle Forum. So in order to preserve it you need to
use the {noformat}{noformat} tags.
The usage of the tag is like this.
<place your code here>\
7. If you are posting a *Performance Related Question*, please read
{thread:id=501834} and {thread:id=863295}.
Following those guides will be very helpful.
8. Please keep in mind that this is a public forum. Here no question is URGENT.
So use of words like *URGENT* or *ASAP* (As Soon As Possible) is considered to be rude.
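Returning to the three time-domain expressions in the question: one way to tell them apart is a separate, fully anchored pattern per shape. The sketch below is in Java with invented class, method and label names, and it assumes the expressions look exactly like the three samples. In Oracle the digit class has to be written inside a bracket expression, as [[:digit:]], rather than as a bare [:digit:].

```java
import java.util.regex.Pattern;

class TimeDomainExpression {
    // [(y2009)(y2011)]
    private static final Pattern YEAR =
        Pattern.compile("\\[\\(y\\d{4}\\)\\(y\\d{4}\\)\\]");
    // [(y2008M5)(y2011M3)]
    private static final Pattern MONTH =
        Pattern.compile("\\[\\(y\\d{4}M\\d{1,2}\\)\\(y\\d{4}M\\d{1,2}\\)\\]");
    // [(y2009M1d20)(y2011M12d31)]
    private static final Pattern DAY =
        Pattern.compile("\\[\\(y\\d{4}M\\d{1,2}d\\d{1,2}\\)\\(y\\d{4}M\\d{1,2}d\\d{1,2}\\)\\]");

    // matches() requires the whole string to fit, so the shapes cannot overlap.
    static String classify(String expr) {
        if (YEAR.matcher(expr).matches()) return "year";
        if (MONTH.matcher(expr).matches()) return "month";
        if (DAY.matcher(expr).matches()) return "day";
        return "unknown";
    }

    public static void main(String[] args) {
        System.out.println(classify("[(y2009)(y2011)]"));            // year
        System.out.println(classify("[(y2009M5)(y2010M12)]"));       // month
        System.out.println(classify("[(y2009M1d20)(y2011M12d31)]")); // day
    }
}
```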