Regular Expression Abbreviation of Words

Suppose I have data in a column like this:
Balla Ram Chog Mal College
Maharishi Dayanand University
Cambridge Public School
Now I want to write a query using regular expressions to find the abbreviations, e.g. the resulting data set should be:
BRCMC
MDU
CPS
How should I write the regexp for it?

One way, using SUBSTR and INSTR, tested on 10g:
with data as
(
  select 'Balla Ram Chog Mal College' col from dual union all
  select 'Maharishi Dayanand University' col from dual union all
  select 'Cambridge Public School' col from dual
)
-- outer query: chain the rows r = 1..n of each col, keep the longest path, strip the commas
select col, replace(ltrim(max(sys_connect_by_path(str, ',')) keep (dense_rank last order by r), ','), ',') abbr
  from (
-- inner query: one row per word; str is the first character of that word
select col, substr(col, decode(level, 1, 1, instr(col, ' ', 1, level - 1) + 1), 1) str, level, row_number() over (partition by col order by level) r
  from data
connect by level <= length(col) - length(replace(col, ' ')) + 1
       and col = prior col
       and prior sys_guid() is not null
order by col, level
)
group by col
start with r = 1
connect by r - 1 = prior r
       and col = prior col
       and prior sys_guid() is not null;
COL                           ABBR
Balla Ram Chog Mal College    BRCMC
Cambridge Public School       CPS 
Maharishi Dayanand University MDU
On 11g you do not need the outer query to concatenate the results; you can use LISTAGG directly, as demonstrated by Hashim.
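Since the question asked for a regular expression, here is a minimal sketch of the same idea in Java rather than Oracle SQL. It is not the method used in the answer above: the class and method names are made up for illustration, and it assumes words are separated by whitespace and consist only of word characters.

```java
class WordAbbreviation {
    // (\w) captures the first character of a word, \w* consumes the rest of
    // the word and \s* the whitespace after it; only the capture is kept.
    static String abbreviate(String text) {
        return text.trim().replaceAll("(\\w)\\w*\\s*", "$1");
    }

    public static void main(String[] args) {
        String[] rows = {
            "Balla Ram Chog Mal College",
            "Maharishi Dayanand University",
            "Cambridge Public School"
        };
        for (String row : rows) {
            System.out.println(row + " -> " + abbreviate(row));
        }
    }
}
```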

Similar Messages

  • Regular Expression to Locate Words with Character

    I want to identify all the words in a document that are followed by the register mark (®) symbol.
    I built what I thought was a regular expression that would search for a register mark preceded by alphanumeric characters and a space. So if my text contained the sentence "Adobe InDesign® is a great product.", the regular expression would find "InDesign®".
    Below is the regular expression I composed. It grabs anything with a register mark, not just the register marks preceded by a space and alphanumeric characters. Where did I go wrong? I thought the \s would restrict the search to complete words with a register mark.
    \s[a-zA-Z0-9]|®

    \s is the special GREP code for "any kind of space" -- a regular space, a tab, hard return, or any of ID's own white space codes. It has nothing to do with "complete words", because a word can appear at the start of a story, without any preceding space. It would also not find "InDesign®" because there is no space before it, there is a double quote instead.
    Your GREP does not work because, well, you got the general idea (words may consist of the set of characters "a-z", "A-Z", and "0-9") but since you use the [..] without any other code, GREP will apply this rule once -- per character. If you want to find words of more than one character, you need to tell GREP "one or more of these, please": with a +.
    Second, where did that | come from? It's the OR operator. Essentially, you are looking for
          any space followed by one character from the set "a-z", "A-Z", and "0-9"
    OR
          the ® character
    The 'word break' you were looking for is this code: \b, so you could search for "\b[a-zA-Z0-9]+" (note the '+' to allow more than one instance) -- but it's not necessary, because by default GREP grabs as much as it can. The set 'a-zA-Z0-9' etc. describes the allowed "word" characters, but you might prefer these: \l (ell) and \u for all lowercase and all uppercase characters -- they are shorter, and they automatically include accented characters, Greek, Russian, and a lot more. Similarly, \d (for "digits") is the shortcut for "0-9". And even better: \w is the shortcut for "word character", i.e., your set but shorter and a bit better.
    Try this one:
    \w+~r
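For readers working outside InDesign, the same "one or more word characters followed by the mark" idea can be sketched with Java's regex engine. This is only an illustration: `~r` is InDesign's own code for the registered sign, so the sketch writes the character itself (as `\u00AE`), and the class name is hypothetical.

```java
import java.util.ArrayList;
import java.util.List;
import java.util.regex.Matcher;
import java.util.regex.Pattern;

class RegisteredWords {
    // \w+ = one or more word characters, followed by the registered sign.
    private static final Pattern WORD_WITH_MARK = Pattern.compile("\\w+\u00AE");

    // Returns every word that is immediately followed by the registered sign.
    static List<String> find(String text) {
        List<String> hits = new ArrayList<>();
        Matcher m = WORD_WITH_MARK.matcher(text);
        while (m.find()) {
            hits.add(m.group());
        }
        return hits;
    }

    public static void main(String[] args) {
        System.out.println(find("Adobe InDesign\u00AE is a great product."));
    }
}
```

No leading `\s` is needed, so a word at the start of the text or right after a quote is found as well.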

  • Regular Expression for non-words

    Hello all!
    Can you help me construct a regular expression that will match non-word strings, say "������"? I need this to filter words from a Microsoft Word document.
    Thanks!

    I don't think this is a problem that should be solved with regex. You would have to convert your Word document to a String and use replaceAll() with "\\W" as the regex.
    Correct me if I am wrong, but I thought that Word files were binary, so your first problem will be to convert the file(s) to a String.
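A minimal sketch of the replaceAll() suggestion, assuming the document text has already been extracted into a String (the extraction itself is not shown, and the class name is made up). Note that without the UNICODE_CHARACTER_CLASS flag, Java's `\W` treats accented and other non-ASCII letters as non-word characters too.

```java
class NonWordFilter {
    // Replace every run of non-word characters with a single space.
    static String stripNonWord(String text) {
        return text.replaceAll("\\W+", " ").trim();
    }

    public static void main(String[] args) {
        System.out.println(stripNonWord("Hello, world! 42"));
    }
}
```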

  • Regular Expression to split words

    Hi all,
    I want to extract the last word of a string, i.e. everything after the last space. The string contains at most five words.
    I used the following query, but it does not work:
    SQL> SELECT REGEXP_SUBSTR('system hello sidval',
                '[a-z]+\S+') RESULT
         FROM DUAL;
    RESULT
    system
    Examples:
    If the string is
    David  from  uk
    the output is
    uk
    If the string is
    David john
    the output is
    john
    Regards
    Edited by: Ayham on Oct 7, 2012 12:01 PM
    Edited by: Ayham on Oct 7, 2012 12:18 PM

    Try this:
    SQL> SELECT REGEXP_SUBSTR('system hello sidval', '[a-z]+\S*$') RESULT FROM DUAL;
    The extra $ anchors the match to the end of the line. Using * instead of + means the pattern also matches when the last word is a single letter, because \S* may match nothing.
    bye
    TPD
    Edited by: TPD Opitz-Consulting com on 07.10.2012 21:35
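The end-of-line anchoring can also be sketched in Java. This is an illustration rather than the Oracle query above; the lookahead that tolerates trailing spaces is an addition of this sketch, and the class name is hypothetical.

```java
import java.util.regex.Matcher;
import java.util.regex.Pattern;

class LastWord {
    // \S+ is a run of non-space characters; the lookahead requires that only
    // optional whitespace separates it from the end of the input.
    private static final Pattern LAST = Pattern.compile("\\S+(?=\\s*$)");

    // Returns the last word, or an empty string if the input has no word.
    static String lastWord(String text) {
        Matcher m = LAST.matcher(text);
        return m.find() ? m.group() : "";
    }

    public static void main(String[] args) {
        System.out.println(lastWord("system hello sidval"));
        System.out.println(lastWord("David  from  uk    "));
    }
}
```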

  • Regular Expression - Select two words after specific string

    Hi,
    I am trying to select the two words/strings after the first word "door". I am using the search pattern (?<=door).\w+ but in this case I get the complete text after the word "door". I only want to select the two words after the first "door" in the complete text.
    Can anybody help me?
    Thanks!
    Marco Snels

    Hi Marco,
    I'm relatively handy with RegEx but this seems like a problem where I would employ a little bit of RegEx and CTL, just to make life easier.
    You can use the following RegEx (note: I didn't test this in Integrator, only in a RegEx testing tool) to extract the two words after door (but including door, unfortunately):
    (?:door)[\s]\w+[\s]\w+
    This would give you something like the following in your extracted field:
    door is brown
    You could then pass through a re-formatter to remove "door" and the whitespace and be on your way. Not the best answer but should perform reasonably well and get you up and going.
    Regards,
    Patrick Rafferty
    http://branchbird.com
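If the regex engine supports capture groups, the extra re-formatting step may not be needed. The sketch below uses Java's java.util.regex, not Integrator or CTL, so treat it as an assumption about what the target tool can do; the class name and the sample sentence are made up.

```java
import java.util.regex.Matcher;
import java.util.regex.Pattern;

class AfterDoor {
    // Group 1 holds the two words that follow the first whole word "door".
    private static final Pattern TWO_AFTER_DOOR =
            Pattern.compile("\\bdoor\\s+(\\w+\\s+\\w+)");

    // Returns the two words after the first "door", or "" if there are none.
    static String twoWordsAfterDoor(String text) {
        Matcher m = TWO_AFTER_DOOR.matcher(text);
        return m.find() ? m.group(1) : "";
    }

    public static void main(String[] args) {
        System.out.println(twoWordsAfterDoor("The door is brown and the door is open"));
    }
}
```

Because find() stops at the first match, later occurrences of "door" are ignored.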

  • Quick regular expression question/help

    Can someone help me with two regular expressions I need? I could spend a while trying to figure it out myself, however time's short and I would really like to get a foolproof, optimal solution (my attempt would be buggy).
    Sample sentence
    The population, is projected to reach 200,000, or more (by 2020).[7] This is {dummy} text.
    The first regular expression
    I need all brackets and every thing between them to be removed from a sentence.
    Brackets such as: ( ), [ ] and { } .
    I.e. Given the above sentence the following would be returned:
    The population, is projected to reach 200,000, or more. This is text.
    The second regular expression
    If a word has a trailing comma character I need to add a whitespace between the word and the comma.
    I.e. Given the sentence returned from the first regular expression, this regex would return:
    The population *,* is projected to reach 200,000 *,* or more. This is text.
    Many thanks to anyone who can help me with this!
    Edited by: Myles on Jan 18, 2008 8:12 AM

    http://java.sun.com/docs/books/tutorial/extra/regex/index.html
    http://www.regular-expressions.info
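As a starting point for the two expressions requested in the question, here is a hedged Java sketch. It assumes brackets are never nested and that a "trailing comma" means a comma followed by whitespace or the end of the text; the class and method names are hypothetical.

```java
class SentenceCleaner {
    // Optional leading whitespace, then (...), [...] or {...} with no nesting.
    private static final String BRACKETED =
            "\\s*(\\([^)]*\\)|\\[[^\\]]*\\]|\\{[^}]*\\})";

    // A comma directly after a word character and before whitespace or the end.
    private static final String TRAILING_COMMA = "(?<=\\w),(?=\\s|$)";

    static String removeBracketed(String sentence) {
        return sentence.replaceAll(BRACKETED, "");
    }

    static String spaceBeforeComma(String sentence) {
        return sentence.replaceAll(TRAILING_COMMA, " ,");
    }

    public static void main(String[] args) {
        String s = "The population, is projected to reach 200,000, or more (by 2020).[7] This is {dummy} text.";
        String cleaned = removeBracketed(s);
        System.out.println(cleaned);
        System.out.println(spaceBeforeComma(cleaned));
    }
}
```

The comma inside 200,000 is left alone because it is followed by a digit rather than whitespace.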

  • SQL Injection and Java Regular Expression: How to match words?

    Dear friends,
    I am trying to guard our application against SQL injection attacks with a Java regular expression. I use it to check whether malicious characters or keywords have been injected into a parameter value.
    The denied characters and keywords can be " ' ", " ; ", "insert", "delete" and so on. The expression I wrote is String pattern_str="('|;|insert|delete)+".
    I know it is not correct. It does not match only the whole word insert or delete; the keywords are also matched when they appear inside a longer word, which is not what I want. Do you have any idea how to match only the whole word?
    Thanks,
    Ricky
    Edited by: Ricky Ru on 28/04/2011 02:29

    Avoid dynamic sql, avoid string concatenation and use bind variables and the risk is negligible.
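For the narrower question of matching whole words only, a word boundary (`\b`) on each side of the keyword is the usual approach. The sketch below illustrates just that regex point; it is not a substitute for the bind-variable advice in the reply, and a keyword filter like this should not be relied on as an injection defence. The class name and the denied list are taken from the question or made up for illustration.

```java
import java.util.regex.Pattern;

class KeywordCheck {
    // A quote or semicolon anywhere, or the whole words insert / delete.
    private static final Pattern DENIED = Pattern.compile(
            "[';]|\\b(?:insert|delete)\\b", Pattern.CASE_INSENSITIVE);

    static boolean containsDenied(String value) {
        return DENIED.matcher(value).find();
    }

    public static void main(String[] args) {
        System.out.println(containsDenied("please insert coin")); // whole word
        System.out.println(containsDenied("inserted rows"));      // part of a longer word
    }
}
```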

  • Regular Expression - Extract words before the PLUS Sign ?

    Dear All,
    I have many words that contain a plus sign. I need to extract the part of each word before the plus sign.
    I am able to do this using String.indexOf or String.contains, but I would like to know whether there is a way to do it using a regular expression.
    sample string
    Kathire+san Output Kathire
    World+islike Output World
    Thanks,
    J.Kathir

    Here's one way.
    import java.util.regex.Pattern;

    String input = "abc+def";
    // Pattern.compile is a static factory; the plus sign is escaped because it is a quantifier.
    Pattern pat = Pattern.compile("\\+");
    String beforePlus = pat.split(input)[0];
    Sun's Regular Expression Tutorial for Java
    Regular-Expressions.info

  • Regular expression to replace "empty space" ( ) between words with +

    Hello!
    When I wish to find something like this in code:
    12144541 FirstWord SecondWord
    the regular expression for that is:
    (\d{1,100})[\s-]\D{1,100}[\s-]\D{1,100}
    Now, please help me to find a regular expression to replace
    the "empty space" ( ) between words with +, so that
    12144541 FirstWord SecondWord becomes
    12144541+FirstWord+SecondWord
    Thank you very, very, very much!

    A simple-minded solution is to use \s to match all
    whitespace; e.g. find \s and replace with +. DW CS3, at least, is
    smart enough to not replace end of line characters with the '+'
    character if you limit your search & replace to text.
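Outside Dreamweaver, the same find-and-replace can be sketched in Java. The version below is an assumption-laden variant of the reply's idea: it matches only spaces and tabs that sit between two non-space characters, so line ends and leading or trailing blanks are left alone. The class name is hypothetical.

```java
class SpaceToPlus {
    // One or more spaces/tabs with a non-space character on each side.
    static String joinWithPlus(String line) {
        return line.replaceAll("(?<=\\S)[ \\t]+(?=\\S)", "+");
    }

    public static void main(String[] args) {
        System.out.println(joinWithPlus("12144541 FirstWord SecondWord"));
    }
}
```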

  • Want to replace a string containing consecutive repeating  words to one using regular expression

    Hi Experts,
    I need a regular expression to replace all duplicate words in a string with one.
    eg: 'Hello Hello World 4-4-5 etc etc' should be changed to 'Hello World 4-4-5 etc'.
    I tried several patterns, but each had one problem or another, e.g. '(\w+\S\W)\1+' replaced with ' \1', and ' (\w+\W)\1+' replaced with ' \1', etc.
    Thanks in advance
    Tarique

    Hi,
    Translating what Frank said to Java would be something like this:
            StringBuffer result = new StringBuffer();
            String myString = "This is right right, that is wrong.";
            String[] words = myString.split(" ");
            String lastWord = "";
            for (String str : words){
                // Caution: contains() also fires when lastWord merely occurs inside str.
                if (!str.contains(lastWord))
                    result.append(str);
                else
                    result.append(str.substring((lastWord.length() >= 0 ? lastWord.length() : 0 ) , str.length()));
                lastWord = str;
                result.append(" ");
            }
            System.out.println(result);
    If you didn't have periods and commas in your message it would be easier. The code is not 100% correct, and you will need to adapt it to your requirements.
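A regex-only alternative, sketched in Java with a backreference. It is not the approach posted above: it assumes "duplicate" means the same whole word repeated consecutively with only whitespace in between, and the class name is made up.

```java
class DuplicateWords {
    // \b(\w+) captures a whole word; (\s+\1\b)+ matches one or more
    // immediate repetitions of that same word; only the first copy is kept.
    static String collapse(String text) {
        return text.replaceAll("\\b(\\w+)(\\s+\\1\\b)+", "$1");
    }

    public static void main(String[] args) {
        System.out.println(collapse("Hello Hello World 4-4-5 etc etc"));
    }
}
```

The 4-4-5 part is untouched because the repeated digits are separated by hyphens, not whitespace.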

  • Regular expression to check a value if it contains a specific word.

    Hi All,
    How can i check if a certain word exists in a value in regular expression ?
    I have an attribute called Race. Race can contain the following:
    White, Non-Hispanic
    Black, Non-Hispanic
    White, Non Hispanic
    Black, Non Hispanic
    White, NonHispanic
    Non-Hispanic, white
    Non Hispanic - black
    What i want is to check if my value contains the word "NON" (NON can be at the beginning, middle or end), if it does, parse it and return it.
    This is what I have, however I want to make sure it covers all cases and not missing anything else
    select REGEXP_SUBSTR(UPPER(trim('Black, Non-Hispanic')), '[NON]+') from dual;
    Thanks in advance.

    Rooney wrote: "Could you please explain what are the 2 ones's for ?"
    The two 1s are not really needed for this. The syntax simply requires those parameters when I add the fifth parameter.
    http://docs.oracle.com/cd/E14072_01/server.112/e10592/functions148.htm
    The first 1 is the position where the search starts (same as in substr('Abc',1)).
    The second 1 is the occurrence number. Here it means: return the first occurrence that was found. Replace it with 2 in my next example to see a (very slight) difference.
    Rooney wrote: "Also 'NON' alone will not cover all cases ?"
    But you don't have NON alone. You have a regexp with NON plus UPPER. The 'i' replaces the UPPER. The output is also slightly different: the 'i' version returns the same capitalization as was found in the original. It depends a little on what you want to achieve. And of course INSTR gives the same information as your version: if the result is > 0, NON was found.
    with testdata as (select 'White,Non-Hispanic' str from dual union all
                      select 'Non-White,nOn-Hispanic' str from dual union all
                      select 'White,Hispanic' str from dual
                     )
    /* end of test data creation */
    select str,
          REGEXP_SUBSTR(UPPER(TRIM(str)), 'NON') regexp1,
          REGEXP_SUBSTR(str, 'NON',1,1,'i') regexp2,
          instr(upper(str),'NON') instr
    from testdata;
    STR                    REGEXP1                REGEXP2                INSTR
    White,Non-Hispanic     NON                    Non                        7
    Non-White,nOn-Hispanic NON                    Non                        1
    White,Hispanic                                                           0
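For comparison, the effect of the 'i' match parameter and of the occurrence argument can be sketched in Java. This is an analogy, not Oracle code: `Pattern.CASE_INSENSITIVE` stands in for 'i', the loop over find() stands in for the occurrence argument, and the class and method names are hypothetical.

```java
import java.util.regex.Matcher;
import java.util.regex.Pattern;

class NonFinder {
    private static final Pattern NON = Pattern.compile("non", Pattern.CASE_INSENSITIVE);

    // Returns the n-th match (1-based) with its original capitalization,
    // or null when there are fewer than n matches.
    static String nthMatch(String text, int occurrence) {
        if (occurrence < 1) {
            throw new IllegalArgumentException("occurrence must be >= 1");
        }
        Matcher m = NON.matcher(text);
        for (int i = 0; i < occurrence; i++) {
            if (!m.find()) {
                return null;
            }
        }
        return m.group();
    }

    public static void main(String[] args) {
        System.out.println(nthMatch("Non-White,nOn-Hispanic", 1));
        System.out.println(nthMatch("Non-White,nOn-Hispanic", 2));
    }
}
```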

  • Regular expression on words with % wildcard

    Hi,
    I've got some processing working using regular expression where I need to process words e.g.
    regexp_replace('word1 word2','(\w+)','myprefix{\1}') - results in - 'myprefixword1 myprefixword2'
    However, if I'm presented with this; '%word0 word1% wo%d2 word3', then I need to treat % as special case and leave the word as is, so result here would be; - '%word0 word1% wo%d2 myprefixword3', is this achievable using regexp ?

    And for those who don't know, I guess we should explain why we're having to expand single spaces to double spaces...
    (I'll use the "¬" character to represent spaces to make it clearer to see)
    If we have a string such as
    word1¬word2¬word3
    and we want to identify the words in the string (without using any special regexp word identifier) then we are going to use the spaces to identify the start and end of words. To make life easy, we manually put a space at the start and end of the string so we can say that each word in the string will have a space before and after it regardless of where it is in the string...
    ¬word1¬word2¬word3¬
    However, when we specify what we want to search for we are going to say we want a space, followed by a number of characters (not spaces), followed by a space...
    ¬[^¬]*¬
    So, ideally, you'd expect it to look through the string and say
    ¬word1¬word2¬word3¬
    \_____/... found word1
    ¬word1¬word2¬word3¬
          \_____/... found word2
    ¬word1¬word2¬word3¬
                \_____/... found word3
    Unfortunately, there is a problem. Once the first word has been found the pointer for searching the rest of the string is located on the next character after the match i.e.
    ¬word1¬word2¬word3¬
           ^
    So it won't be able to pick out word2 and will only get to word3. Let's see it in action...
    SQL> ed
    Wrote file afiedt.buf
      1  with t as (select ' word1 word2 word3 ' as txt from dual)
      2  --
      3  select regexp_replace(txt, ' [^ ]* ', 'xxxxx') as txt
      4* from t
    SQL> /
    TXT
    xxxxxword2xxxxx
    SQL>
    In order to deal with this, if we replace the single spaces with double spaces (not required at the start and end) our string looks like...
    ¬word1¬¬word2¬¬word3¬
    So as it searches it finds word1 as a match and then the pointer in the string is located...
    ¬word1¬¬word2¬¬word3¬
           ^
    ... so the next match for the pattern of space-characters-space is word2 and then the pointer is located...
    ¬word1¬¬word2¬¬word3¬
                  ^
    ... ready to find word 3. Example...
    SQL> ed
    Wrote file afiedt.buf
      1  with t as (select ' word1  word2  word3 ' as txt from dual)
      2  --
      3  select regexp_replace(txt, ' [^ ]* ', 'xxxxx') as txt
      4* from t
    SQL> /
    TXT
    xxxxxxxxxxxxxxx
    SQL>
    Hopefully that's a little clearer. You just have to remember the "pointer" principle and the fact that once a match is found it is located on the character after the match.
    ;)
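The same "pointer" behaviour can be reproduced in Java, whose replaceAll() also resumes searching after the end of the previous match. The first method mirrors the pattern from the post; the lookahead variant is an addition of this sketch (not something proposed in the thread) that avoids doubling the spaces by not consuming the trailing space. The class name is hypothetical.

```java
class NonOverlappingMatches {
    // Same pattern as in the post: space, non-spaces, space.
    static String mask(String text) {
        return text.replaceAll(" [^ ]* ", "xxxxx");
    }

    // Variant: the trailing space is only checked by a lookahead, not consumed,
    // so it is still available as the leading space of the next word.
    static String maskWithLookahead(String text) {
        return text.replaceAll(" [^ ]*(?= )", "xxxxx");
    }

    public static void main(String[] args) {
        System.out.println(mask(" word1 word2 word3 "));              // word2 is skipped
        System.out.println(mask(" word1  word2  word3 "));            // doubled spaces
        System.out.println(maskWithLookahead(" word1 word2 word3 ")); // single spaces
    }
}
```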

  • Regular Expression - replaceAll() - how to replace words?

    Hiya,
    I have this regex to replace all instances of myWord:
    String oldWord = "oldWord";
    String newWord = "newWord";
    String sentence = "some sentence that contains " + oldWord;
    String newSentence = replaceWordsInSentence(sentence, oldWord, newWord);
    private String replaceWordsInSentence(String sentence, String oldWord, String newWord) {
        return sentence.replaceAll("\b" + oldWord + "\b", newWord);
    }
    ...it works in most instances, but when oldWord is at the end of the sentence it is not replaced. Presumably the problem is that "\b" is not a sufficient word boundary. Can someone help me out with the correct regular expression code?
    Thanks,
    James

    Mel, you did appear to misunderstand as you thought points 2 and 3 were alternatives, but you now recognise that they are additional "shoulds".
    Of course, I applied the extra backslash as soon as Joachim advised. Maybe you don't agree with my rationale, but I prefer the complete solution that will work in all instances... so was simply waiting for him to post a code example that included the latter 2 points as (although I understood the point of them perfectly) I was not sure how to implement them.
    Have come up with the following, expanded, method...
        private String replaceWordsInSentence(String sentence, String oldWord, String newWord) {
            return sentence.replaceAll("\\b" + Pattern.quote(oldWord) + "\\b", Matcher.quoteReplacement(newWord));
        }
    ...works fine with the tests I have run. Joachim, can you confirm this is correct.
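To see why the extra backslash matters, here is a small self-contained sketch that puts the two versions side by side. In a Java string literal "\b" is a backspace character, while "\\b" reaches the regex engine as the word-boundary token. The class name and the test sentences are made up for illustration.

```java
import java.util.regex.Matcher;
import java.util.regex.Pattern;

class WordReplacer {
    // Whole-word replacement; quote() and quoteReplacement() keep regex
    // metacharacters in either word from being interpreted.
    static String replaceWord(String sentence, String oldWord, String newWord) {
        String regex = "\\b" + Pattern.quote(oldWord) + "\\b";
        return sentence.replaceAll(regex, Matcher.quoteReplacement(newWord));
    }

    // The version from the question: "\b" is a literal backspace here,
    // so the pattern only matches text surrounded by backspace characters.
    static String replaceWordWithBackspaces(String sentence, String oldWord, String newWord) {
        return sentence.replaceAll("\b" + oldWord + "\b", newWord);
    }

    public static void main(String[] args) {
        String s = "some sentence that contains oldWord";
        System.out.println(replaceWord(s, "oldWord", "newWord"));
        System.out.println(replaceWordWithBackspaces(s, "oldWord", "newWord"));
    }
}
```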

  • Regular expression to add underscore before single capital

    I am trying to convert the names of attributes that use capitalization sort of like camelcase to distinguish multiple words, e.g. VehicleColor to use underscores instead, eg. Vehicle_Color.
    I have a regular expression that does this, however I have a problem when an abbreviation consisting of multiple upper case characters is present, e.g. AverageMPG becomes Average_M_P_G. I am trying to come up with a pattern that only adds the underscores to the first occurrence of a capital letter in a series which should result in the abbreviation MPG becoming Average_MPG.
    SQL> select * from v$version where rownum = 1;
    BANNER
    Oracle Database 11g Enterprise Edition Release 11.1.0.7.0 - 64bit Production
    SQL> with test_data as
         (
         select 'VehicleColor' str from dual union all
         select 'WeightClass' str from dual union all
         select 'AverageMPG' str from dual union all
         select 'HighMPG' str from dual union all
         select 'LowMPG' str from dual union all
         select 'ABS_System' str from dual
         )
         select
             str,
             regexp_replace(str, '([A-Z])', '_\1', 2) result
         from
             test_data;
    STR          RESULT
    VehicleColor Vehicle_Color
    WeightClass  Weight_Class
    AverageMPG   Average_M_P_G
    HighMPG      High_M_P_G
    LowMPG       Low_M_P_G
    ABS_System   A_B_S__System
    6 rows selected.
    These are the results I would like, but I don't know how to modify the pattern so that the replace acts only on the first capital letter in a series of capitals, or whether that is possible.
    STR          RESULT
    VehicleColor Vehicle_Color
    WeightClass  Weight_Class
    AverageMPG   Average_MPG
    HighMPG      High_MPG
    LowMPG       Low_MPG
    ABS_System   ABS_System 

    with test_data as
            (
            select 'VehicleColor' str from dual union all
            select 'WeightClass' str from dual union all
            select 'AverageMPG' str from dual union all
            select 'HighMPG' str from dual union all
            select 'LowMPG' str from dual union all
            select 'ABS_System' str from dual
            )
            select str, replace(regexp_replace(replace(str,'_',' '), '([^[:upper:]])([[:upper:]]{1,})([^[:upper:]]|$)', '\1_\2\3' ),' ') result
            from test_data
    STR          RESULT
    VehicleColor Vehicle_Color
    WeightClass  Weight_Class
    AverageMPG   Average_MPG
    HighMPG      High_MPG
    LowMPG       Low_MPG
    ABS_System   ABS_System
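In a regex flavour with lookaround, which the Oracle functions used above do not rely on, the same result can be sketched with a zero-width match. The Java below is an illustration under that assumption, with a made-up class name: it inserts an underscore at every position that has a lowercase letter or digit on the left and an uppercase letter on the right.

```java
class CamelToUnderscore {
    // Zero-width match: lowercase letter or digit behind, uppercase letter ahead.
    static String convert(String name) {
        return name.replaceAll("(?<=[a-z0-9])(?=[A-Z])", "_");
    }

    public static void main(String[] args) {
        String[] names = {"VehicleColor", "WeightClass", "AverageMPG", "HighMPG", "LowMPG", "ABS_System"};
        for (String name : names) {
            System.out.println(name + " -> " + convert(name));
        }
    }
}
```

Runs of capitals such as MPG stay together because there is no lowercase character between them, and ABS_System is unchanged because the character before System is already an underscore.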

  • Help in regular expression matching

    I have three kinds of expressions:
    1) [(y2009)(y2011)]
    2) [(y2008M5)(y2011M3)] or [(y2009M5)(y2010M12)]
    3) [(y2009M1d20)(y2011M12d31)]
    I want a regular expression pattern for each of the three.
    I am using:
    REGEXP_LIKE(timedomainexpression, '???[:digit:]{4}*[:digit:]{1,2}???[:digit:]{4}*[:digit:]{1,2}??', 'i');
    but it returns results for all of the above expressions, while I want a different expression for each.
    I have used * after [:digit:]{4}; when I use ? or . instead, it returns no results. Please help in this situation ASAP.
    Thanks
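One way to tell the three forms apart is to anchor a separate pattern to each whole expression and escape the literal brackets and parentheses. The Java sketch below only illustrates that idea; the class name and the labels YEAR, MONTH and DAY are made up, and it is not the Oracle REGEXP_LIKE call from the question. As a side note, in POSIX syntax the digit class has to be written inside a bracket expression, as [[:digit:]].

```java
import java.util.regex.Pattern;

class TimeDomainExpression {
    // Builds "[(bound)(bound)]" with the brackets and parentheses escaped.
    private static Pattern pair(String bound) {
        return Pattern.compile("\\[\\(" + bound + "\\)\\(" + bound + "\\)\\]");
    }

    private static final Pattern YEAR = pair("y\\d{4}");
    private static final Pattern MONTH = pair("y\\d{4}M\\d{1,2}");
    private static final Pattern DAY = pair("y\\d{4}M\\d{1,2}d\\d{1,2}");

    // matches() requires the whole string to fit, so the forms cannot overlap.
    static String classify(String expr) {
        if (YEAR.matcher(expr).matches()) return "YEAR";
        if (MONTH.matcher(expr).matches()) return "MONTH";
        if (DAY.matcher(expr).matches()) return "DAY";
        return "UNKNOWN";
    }

    public static void main(String[] args) {
        System.out.println(classify("[(y2009)(y2011)]"));
        System.out.println(classify("[(y2008M5)(y2011M3)]"));
        System.out.println(classify("[(y2009M1d20)(y2011M12d31)]"));
    }
}
```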

    I don't get your question. Can you post your desired output, and also give some sample data?
    Please consider the following when you post a question.
    1. New features keep coming in every Oracle version, so please provide your Oracle DB version to get the best possible answer.
    You can use the following query and copy and paste the output.
    select * from v$version
    2. This forum has a very good Search feature. Please use it before posting your question, because for most of the questions that are asked the answer is already there.
    3. We don't know your DB structure or what your data looks like, so you need to let us know. The best way is to give some sample data like this.
    I have the following table called sales
    with sales
    as
    (
          select 1 sales_id, 1 prod_id, 1001 inv_num, 120 qty from dual
          union all
          select 2 sales_id, 1 prod_id, 1002 inv_num, 25 qty from dual
    )
    select *
      from sales
    4. Rather than describing what you want in words, it is easier when you give your expected output.
    For example, in the above sales table, I want to know the total quantity and number of invoices for each product.
    The output should look like this
    Prod_id   sum_qty   count_inv
    1         145       2
    5. Whenever you get an error message, post the entire error message, with the error number, the message and the line number.
    6. The next thing is very important to remember. Please post only well formatted code. Unformatted code is very hard to read.
    Your code format gets lost when you post it in the Oracle Forum. So in order to preserve it you need to
    use the {noformat}{noformat} tags.
    The usage of the tag is like this.
    <place your code here>\
    7. If you are posting a *Performance Related Question*, please read
       {thread:id=501834} and {thread:id=863295}.
       Following those guides will be very helpful.
    8. Please keep in mind that this is a public forum. Here no question is URGENT.
       So use of words like *URGENT* or *ASAP* (As Soon As Possible) is considered rude.
