Regexp_substr

My earlier post was answered by Frank Kulash. Thanks, Frank.
However, I have a few questions.
SELECT owner_id
,    owner_name        AS old_owner_name
,    REGEXP_SUBSTR ( owner_name
              , '^\(([^)]+)'
              , 1
              , 1
              , NULL
              , 1
              )        AS legacy_id
,    REGEXP_SUBSTR ( owner_name
              , '-([^(]+)'
              , 1
              , 1
              , NULL
              , 1
              )        AS DEPT_ID
,    REGEXP_SUBSTR ( owner_name
              , '\(([^)]+)\) *(\(OLD\))?$'
              , 1
              , 1
              , NULL
              , 1
              )        AS NEW_OWNER_NAME
FROM    stg_wms_setup_owner_unpiv
ORDER BY owner_id
;
The problem I'm encountering is with OWNER_NAME. The original field in the table has a record '(1104) - (History) FORT HARRISON' under OWNER_NAME. In the above query, to get NEW_OWNER_NAME I have:
REGEXP_SUBSTR ( owner_name
              , '\(([^)]+)\) *(\(OLD\))?$'
              , 1
              , 1
              , NULL
              , 1
              )        AS NEW_OWNER_NAME
The above SQL handles the case where there is '(OLD)' at the end of the record, but I want to get rid of '(History)' as well. Please help. I'm thinking: if I change the back reference, will that help?
Thanks

Hi,
897837 wrote:
... Frank just like using a negative value in SUBSTR can we do a similar thing in back references? How can i copy the resultant dataset as a table and paste , because once i posted the dataset the fields are again getting closer and difficult for the users to read.
Is this one question or two completely unrelated questions?
just like using a negative value in SUBSTR can we do a similar thing in back references?
Unfortunately, no. I don't believe there's any good way to tell REGEXP_SUBSTR to return the last matching pattern, or the next-to-last, or the N-th from the end. Often you can work around that by specifying what comes after the part you really want, and anchoring the expression to the end of the string ($). Do you really need to find the N-th-to-last expression in this case? You've never explained exactly how you get the new columns from the original owner_name.
How can i copy the resultant dataset as a table and paste , because once i posted the dataset the fields are again getting closer and difficult for the users to read.
Use {code} tags, just like you did around the CREATE TABLE statement, and again around the INSERT statements. (By the way, it looks like an editor or something replaced the single-quotes with some kind of fancy leaning quotes. I had to edit that out manually.)
The word "code" in the {code} tags means that the text is to be displayed in the style of code; that is, with a fixed-width font and no whitespace compressed. It has nothing to do with the content. You can use {code} tags to format query results, or poetry, or graphics, or any other kind of text.
The query you posted looks pretty close to what you need. If 'History' is case-insensitive, then use REGEXP_REPLACE instead of REPLACE to get rid of it. If you're using REGEXP_REPLACE, it's easy to remove any spaces that come before or after '(History)' at the same time.
SELECT owner_id
,      REGEXP_SUBSTR ( owner_name
                     , '^\(([^)]+)'
                     , 1
                     , 1
                     , NULL
                     , 1
                     ) AS legacy_id
,      REGEXP_SUBSTR ( owner_name
                     , '\(([^)]+)\) *(\(OLD\))?$'
                     , 1
                     , 1
                     , NULL
                     , 1
                     ) AS dept_id
,      owner_name     AS old_owner_name
,      REGEXP_SUBSTR ( REGEXP_REPLACE ( owner_name
                                      , ' *\(History\) *'
                                      , NULL
                                      , 1
                                      , 1
                                      , 'i'     -- Case-insensitive
                                      )
                     , '-([^(]+)'
                     , 1
                     , 1
                     , NULL
                     , 1
                     ) AS new_owner_name
FROM t
ORDER BY owner_id
;
This seems to be right except for new_owner_name. The expression above says there must be a hyphen before new_owner_name, but in your desired results (if I understand them correctly) you say you want new_owner_name to include the only hyphen in one case:
2424 PS065413 FW - HM (PS065413) (OLD) FW - HM
If you can explain the rules about how to find new_owner_name, I can help you find a regular expression to do it. Apparently, the rule is not "everything from the hyphen to the next left parenthesis, or the end of the string", but I don't know what the rule is.
Are there other places where this is not doing what you need? Point out those places, and explain how you get the correct results in those places.
Sorry, I'm not at an Oracle 11 database today. I tested the query above as well as I could using Oracle 10.2.
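For the '(History)' part on its own, here is a minimal sketch of the REGEXP_REPLACE step, run against the sample value from the question. Replacing the match with a single space (rather than NULL, as in the query above) is an assumption made here so that the words on either side stay separated. This form does not use the sub-expression argument, so it should not need Oracle 11.

```sql
-- Remove '(History)' in any letter case, together with the spaces around it,
-- and leave one space in its place.
SELECT REGEXP_REPLACE ( '(1104) - (History) FORT HARRISON'
                      , ' *\(history\) *'   -- optional spaces, literal parentheses
                      , ' '                 -- assumed replacement: a single space
                      , 1                   -- start at the first character
                      , 0                   -- replace every occurrence
                      , 'i'                 -- case-insensitive
                      ) AS cleaned_name
FROM    dual;
-- cleaned_name: (1104) - FORT HARRISON
```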

Similar Messages

Problem with REGEXP_SUBSTR related query.

I am having a problem with this query:
SELECT *
FROM (    SELECT REGEXP_SUBSTR ('{SUMMER}|{POINT OF SALE}',
                                           '({)([A-Z]+ *[A-Z]*)(})',
                                           1,
                                           LEVEL,
                                           'i',
                                           2)
                               val
              FROM DUAL
        CONNECT BY LEVEL <=
                      REGEXP_COUNT ('{SUMMER}|{POINT OF SALE}', '|') + 1)
WHERE val IS NOT NULL
I need the output in 2 rows in this format:
VAL
====
SUMMER
POINT OF SALE
But I am not able to get 'POINT OF SALE' in the output because of the blank space, or maybe some other reason. Can anyone correct my query?

So you want something like this?
SELECT *
  FROM (    SELECT REGEXP_SUBSTR ('{SUMMER}|{POINT OF SALE}', '[^{|}]+', 1, LEVEL, 'i') val
              FROM DUAL
        CONNECT BY LEVEL <=
                      REGEXP_COUNT ('{SUMMER}|{POINT OF SALE}', '[^|]+'))
 WHERE val IS NOT NULL
/
VAL
SUMMER
POINT OF SALE
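As for why the original query failed: the pattern '[A-Z]+ *[A-Z]*' allows at most two words between the braces, so the three-word 'POINT OF SALE' can never match, and an unescaped | in REGEXP_COUNT is the alternation operator, not a literal bar. If you would rather keep the braces in the pattern and return only what is inside them, something like the following sketch may work. It relies on the sub-expression argument of REGEXP_SUBSTR and on REGEXP_COUNT, so it assumes Oracle 11g or later; the name src is made up for the example.

```sql
WITH src AS (SELECT '{SUMMER}|{POINT OF SALE}' AS str FROM dual)
SELECT  REGEXP_SUBSTR ( str
                      , '\{([^}]+)\}'   -- '{', then anything except '}', then '}'
                      , 1
                      , LEVEL           -- the N-th braced item
                      , NULL
                      , 1               -- return only the part inside the braces
                      ) AS val
FROM    src
CONNECT BY LEVEL <= REGEXP_COUNT (str, '\{[^}]+\}');
```

Because src has exactly one row, the CONNECT BY simply generates one row per braced item.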

Regexp_substr unexpected behaviour

While trying to come up with a constraint to check for correct RGB color codes
(RvalueGvalueBvalue) where the value is between 0 and 255
I saw the following:
database used: Oracle Database 10g Express Edition Release 10.2.0.1.0
CREATE TABLE test
(color varchar2(12)
,status varchar2(1)
);
insert into test
values ('R2G2B2','G');
insert into test
values ('R22G22B22','G');
insert into test
values ('R222G222B222','G');
insert into test
values ('R300G2B2','F');
insert into test
values ('R300G256B2','F');
insert into test
values ('R300G256B256','F');
select t.status
     , t.color
     , REGEXP_SUBSTR(t.color,'^R(\d{1,2}|1\d{2}|2[0-4]\d{1}|25[0-5])') red
     , REGEXP_SUBSTR(t.color,'G(\d{1,2}|1\d{2}|2[0-4]\d{1}|25[0-5])') green
     , REGEXP_SUBSTR(t.color,'B(\d{1,2}|1\d{2}|2[0-4]\d{1}|25[0-5])$') blue
     , REGEXP_SUBSTR(t.color,'^R(\d{1,2}|1\d{2}|2[0-4]\d{1}|25[0-5])G(\d{1,2}|1\d{2}|2[0-4]\d{1}|25[0-5])B(\d{1,2}|1\d{2}|2[0-4]\d{1}|25[0-5])$','1','1','i') total
from test t
order by red;
Output
status   color        red    green    blue total
G      R2G2B2       R2      G2       B2    R2G2B2
G      R22G22B22    R22     G22      B22   R22G22B22
G      R222G222B222 R22     G22      B222 R222G222B222
F      R300G2B2     R30     G2       B2
F      R300G300B2   R30     G30      B2
F      R300G300B300 R30     G30
Expected output
status   color        red    green    blue total
G      R2G2B2       R2      G2       B2    R2G2B2
G      R22G22B22    R22     G22      B22   R22G22B22
G      R222G222B222 R222    G222     B222 R222G222B222
F      R300G2B2             G2       B2
F      R300G300B2                    B2
F      R300G300B300
Both the total and blue columns have the expected output,
but the red and green columns return a maximum of 2 digits instead of 3,
even when the string should not match at all (300 > 255).
I have read the information about regular expressions in the Application Developer's Guide - Fundamentals and the SQL Reference, but neither explains the above behaviour.
Can someone explain why this is happening?
Or give a hint on how to correct the select statement so that the results match the expected output?

First thanks everybody for their response.
I was wrong in assuming that R300G0B0 with the above-mentioned regular expression for red should return null instead of R30. Because 30 is in the range 0-255, and I didn't say that the expression should take all the digits into account, R30 is a valid match.
Akeeti Jyuuzou's solution does just that, by anchoring the red to the green, the green to the blue, and the blue to the end of the line. In the example below I have anchored the red to any non-digit character.
select t.kleur color
     , REGEXP_SUBSTR(t.kleur,'^R(25[0-5]|2[0-4]\d{1}|1\d{2}|\d{1,2})\D') red
     , REGEXP_SUBSTR(t.kleur,'G(\d{1,2}|1\d{2}|2[0-4]\d{1}|25[0-5])') green
     , REGEXP_SUBSTR(t.kleur,'B(\d{1,2}|1\d{2}|2[0-4]\d{1}|25[0-5])$') blue
     , REGEXP_SUBSTR(t.kleur,'^R(\d{1,2}|1\d{2}|2[0-4]\d{1}|25[0-5])G(\d{1,2}|1\d{2}|2[0-4]\d{1}|25[0-5])B(\d{1,2}|1\d{2}|2[0-4]\d{1}|25[0-5])$','1','1','i') total
from test t
/
COLOR                 RED   GREEN   BLUE    TOTAL
R300G0B300                   G0
R0G0B20X            R0G    G0
Slow moe is indeed right that the alternations are read from left to right and that matching stops at the first alternative that fits, as is shown below.
In the example below I have reversed the order of the alternations for the red but left the green unchanged, to show the difference.
select t.kleur color
     , REGEXP_SUBSTR(t.kleur,'^R(25[012345]|2[01234]\d{1}|1\d{2}|\d{1,2})') red
     , REGEXP_SUBSTR(t.kleur,'G(\d{1,2}|1\d{2}|2[01234]\d{1}|25[012345])') green
     , REGEXP_SUBSTR(t.kleur,'B(\d{1,2}|1\d{2}|2[01234]\d{1}|25[012345])$') blue
     , REGEXP_SUBSTR(t.kleur,'^R(\d{1,2}|1\d{2}|2[01234]\d{1}|25[012345])G(\d{1,2}|1\d{2}|2[01234]\d{1}|25[012345])B(\d{1,2}|1\d{2}|2[01234]\d{1}|25[012345])$','1','1','i') total
from test t
order by id
/
COLOR                   RED     GREEN     BLUE     TOTAL
R222G222B222        R222     G22         B222       R222G222B222
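Since the original goal was a validity check rather than extraction, one option is to test the whole string at once with REGEXP_LIKE, anchored at both ends, so that every digit has to be accounted for. The following is only a sketch: the inline sample rows and the name colors are made up for the example, [0-9] is used instead of \d purely as a matter of taste, and the longest alternatives are listed first.

```sql
WITH colors AS (
    SELECT 'R222G222B222' AS color FROM dual UNION ALL
    SELECT 'R300G2B2'              FROM dual UNION ALL
    SELECT 'R25G255B0'             FROM dual
)
SELECT  color
,       CASE
            WHEN REGEXP_LIKE ( color
                             ,    '^R(25[0-5]|2[0-4][0-9]|1[0-9]{2}|[0-9]{1,2})'
                               || 'G(25[0-5]|2[0-4][0-9]|1[0-9]{2}|[0-9]{1,2})'
                               || 'B(25[0-5]|2[0-4][0-9]|1[0-9]{2}|[0-9]{1,2})$'
                             )
            THEN 'valid'
            ELSE 'invalid'     -- e.g. R300G2B2: no alternative covers 300
        END AS verdict
FROM    colors;
```

The same REGEXP_LIKE condition could in principle go into a CHECK constraint, provided the existing rows already satisfy it.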

[Mostly Sorted] Extracting tags - regexp_substr and count help needed!

My original query got sorted, but additional regexp_substr and count help is required further on down!
Hi,
I have a table on a 10.2.0.3 database which contains a clob field (sql_stmt), with contents that look something like:
SELECT <COB_DATE>, col2, .... coln
FROM   tab1, tab2, ...., tabn
WHERE tab1.run_id = <RUNID>
AND    tab2.other_col = '<OTHER TAG>'
(That's a highly simplified sql_stmt example, of course - if they were all that small we'd not be needing a clob field!)
I wanted to extract all the tags from the sql_stmt field for a given row, so that my PL/SQL can replace the tags with the correct values. (Well, not "my" PL/SQL - I'd never have designed something like this, but hey, it works, sorta, and I'm improving it as and where I can!) A tag is anything that's in angle brackets (e.g. <RUNID> in the above example).
So, I did this:
SELECT     SUBSTR (sql_stmt,
                   INSTR (sql_stmt, '<', 1, LEVEL),
                   INSTR (substr(sql_stmt, INSTR (sql_stmt, '<', 1, LEVEL)), '>', 1, 1)
                   ) tag
FROM       export_jobs
WHERE      exp_id = p_exp_id
CONNECT BY LEVEL <= (LENGTH (sql_stmt) - LENGTH (REPLACE (sql_stmt, '<')))
I thought this would be fine (having tested it on a text column). However, it runs very poorly against a clob column, for some reason (probably it doesn't like the substr, instr, etc. on the clob, at a guess) - the waits show "direct path read".
When I cast the sql_stmt as a varchar2 like so:
with my_tab as (select cast(substr(sql_stmt, instr(sql_stmt, '<', 1), instr(sql_stmt, '>', -1) - instr(sql_stmt, '<', 1) + 1) as varchar2(4000)) sql_stmt
                from export_jobs
                WHERE      exp_id = p_exp_id)
SELECT     SUBSTR (sql_stmt,
                   INSTR (sql_stmt, '<', 1, LEVEL),
                   INSTR (substr(sql_stmt, INSTR (sql_stmt, '<', 1, LEVEL)), '>', 1, 1)
                   ) tag
FROM       my_tab
CONNECT BY LEVEL <= (LENGTH (sql_stmt) - LENGTH (REPLACE (sql_stmt, '<')))
it runs blisteringly fast in comparison, except when the substr'd sql_stmt is over 4000 chars, of course! Using dbms_lob instr and substr etc. doesn't help either.
So, I thought maybe I could find an XML-related method, and based on the thread "get xml node name in loop", I tried:
select t.column_value.getrootelement() node
from (select sql_stmt xml from export_jobs where exp_id = 28) xml,
table (xmlsequence(xml.xml.extract('//*'))) t
But I get this error: ORA-22806: not an object or REF. (It might not be the way to go after all, as it's not proper XML, there being no corresponding close tags, but I was trying to think outside the box. I've not needed to use XML stuff before, so I'm a bit clueless about it, really!)
I tried casting sql_stmt into an xmltype, but I got: ORA-22907: invalid CAST to a type that is not a nested table or VARRAY
Is anyone able to suggest a better method of trying to extract my tags from the clob column, please?
Message was edited by:
Boneist

I don't know if it will work for you, but I had a similar task where I defined SQL statements with bind variables (:var_name) and then simply looked up which variables to bind in each statement with this query.
with x as (
     select ':var1
     /*a block comment
     :varname_dontcatch*/
     select hello, --line comment :var_no
          ''a string with double quote '''' and a :variable '', --:variable
          :var3,
          :var2, '':var1'''':varno'',
     from dual'     as string
     from dual
), fil as (
     select string,
          regexp_replace(string,'(/\*[^*]*\*/)'||'|'||'(--.*)'||'|'||'(''([^'']|(''''))*'')',null) as res
     from x
)
select string,res,
     regexp_substr(res,'\:[[:alpha:]]([[:alnum:]]|_)*',1,level)
from fil
connect by regexp_instr(res,'\:[[:alpha:]]([[:alnum:]]|_)*',1,level) > 0
/
Or through these procedures
     function get_binds(
          inp_string in varchar2
     ) return string_table
     deterministic
     is
          loc_str varchar2(32767);
          loc_idx number;
          out_tab string_table;
     begin
          --dbms_output.put_line('cond = '||inp_string);
          loc_str := regexp_replace(inp_string,'(/\*[^*]*\*/)'||'|'||'(--.*)'||'|'||'(''([^'']|(''''))*'')',null);
          loc_idx := 0;
          out_tab := string_table();
          --dbms_output.put_line('fcond ='||loc_str);
          loop
               loc_idx := regexp_instr(loc_str,'\:[[:alpha:]]([[:alnum:]]|_)*',loc_idx+1);
               exit when loc_idx = 0;
               out_tab.extend;
               out_tab(out_tab.last) := regexp_substr(loc_str,'[[:alpha:]]([[:alnum:]]|_)*',loc_idx+1);
          end loop;
          return out_tab;
     end;
     function divide_string (
          inp_string in varchar2
          --,inp_length in number
     )
     --return string_table
     return dbms_sql.varchar2a
     is
          inp_length number := 256;
          loc_ind_1 pls_integer;
          loc_ind_2 pls_integer;
          loc_string_length pls_integer;
          loc_curr_string varchar2(32767);
          --out_tab string_table;
          out_tab dbms_sql.varchar2a;
     begin
          --out_tab := dbms_sql.varchar2a();
          loc_ind_1 := 1;
          loc_ind_2 := 1;
          loc_string_length := length(inp_string);
          while ( loc_ind_2 < loc_string_length ) loop
               --out_tab.extend;
               loc_curr_string := substr(inp_string,loc_ind_2,inp_length);
               dbms_output.put(loc_curr_string);
               out_tab(loc_ind_1) := loc_curr_string;
               loc_ind_1 := loc_ind_1 + 1;
               loc_ind_2 := loc_ind_2 + length(loc_curr_string);
          end loop;
          dbms_output.put_line('');
          return out_tab;
     end;
     function execute_statement(
          inp_statement in varchar2,
          inp_binds in string_table,
          inp_parameters in parametri
     )
     return number
     is
          loc_stat dbms_sql.varchar2a;
          loc_dyn_cur number;
          out_rows number;
     begin
          loc_stat := divide_string(inp_statement);
          loc_dyn_cur := dbms_sql.open_cursor;
          dbms_sql.parse(c => loc_dyn_cur,
               statement => loc_stat,
               lb => loc_stat.first,
               ub => loc_stat.last,
               lfflg => false,
               language_flag => dbms_sql.native
          );
          for i in inp_binds.first .. inp_binds.last loop
               DBMS_SQL.BIND_VARIABLE(loc_dyn_cur, inp_binds(i), inp_parameters(inp_binds(i)));
               dbms_output.put_line(':'||inp_binds(i)||'='||inp_parameters(inp_binds(i)));
          end loop;
          dbms_output.put_line('');
          --out_rows := DBMS_SQL.EXECUTE(loc_dyn_cur);
          DBMS_SQL.CLOSE_CURSOR(loc_dyn_cur);
          return out_rows;
     end;
Bye Alessandro
Message was edited by:
Alessandro Rossi
There is something missing in the functions but if there is something that may interest you you can ask.
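Applied to the angle-bracket tags from the question, the same REGEXP_SUBSTR / REGEXP_INSTR idea might look like the sketch below. The sample statement is an inline VARCHAR2 literal rather than the real CLOB column, export_jobs_demo is a made-up name, and how this performs against large CLOBs has not been measured here. Note that it treats any '<...>' run as a tag, so a statement containing comparison operators such as "a < b AND c > d" would produce false matches.

```sql
WITH export_jobs_demo AS (
    SELECT    'SELECT <COB_DATE>, col2 FROM tab1 '
           || 'WHERE tab1.run_id = <RUNID> '
           || 'AND tab1.other_col = ''<OTHER TAG>''' AS sql_stmt
    FROM    dual
)
SELECT  REGEXP_SUBSTR (sql_stmt, '<[^<>]+>', 1, LEVEL) AS tag   -- N-th '<...>' run
FROM    export_jobs_demo
CONNECT BY REGEXP_INSTR (sql_stmt, '<[^<>]+>', 1, LEVEL) > 0;    -- stop when there is no N-th tag
```

With the single sample row this should list <COB_DATE>, <RUNID> and <OTHER TAG>, one per row.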

REGEXP_SUBSTR - Take part before '*'

Hello everyone,
I was wondering if it is possible to do this with REGEXP_SUBSTR?
WITH T as (
SELECT 'Hello * The* World!' as str FROM dual)
SELECT trim(SUBSTR(str, 1, instr(str, '*')-1))
FROM t;
Not that I absolutely need it; I'm just curious about how REGEXP_SUBSTR actually works. I would like to use REGEXP_SUBSTR to retrieve the first part of a string, before a given character, in this case '*'. Is this possible?
Thanks

WITH t AS
     (SELECT 'Hello * The* World!' AS str
        FROM DUAL)
SELECT REGEXP_SUBSTR (str, '[^*]+'),
       REGEXP_SUBSTR (str, '[^*]+', 1, 1),
       REGEXP_SUBSTR (str, '[^*]+', 1, 2),
       REGEXP_SUBSTR (str, '[^*]+', 1, 3),
       REGEXP_SUBSTR (str, '[^*]+', 1, 4)
FROM t
        ( 1 )   ( 2 )   ( 3 )   ( 4-NULL )
Hello   Hello   The     World   (Null)
Edited by: user2361373 on May 3, 2011 1:55 AM

Problem with REGEXP_SUBSTR and the desired results

I have a table like the one below:
USERNAME  DOCNAME  DESCRIPTION
user1     doc1     yes|no|none
user1     doc2     ok|not
user1     doc3     allryt
Now I want to display the table like below:
USERNAME  DOCNAME  DESCRIPTION
user1     doc1     yes
user1     doc1     no
user1     doc1     none
user1     doc2     ok
user1     doc2     not
user1     doc3     allryt
Now this is the query which I am executing to separate the rows:
SELECT a.*, REGEXP_SUBSTR (description, '[^|]+', 1, LEVEL, 'i') val
                 FROM (SELECT * FROM sample_table WHERE username='user1') a
           CONNECT BY LEVEL <=
                         REGEXP_COUNT (description, '[^|]+')
But I am getting results like below (rows getting duplicated):
USERNAME  DOCNAME  DESCRIPTION  VAL
user1     doc1     yes|no|none  yes
user1     doc1     yes|no|none  no
user1     doc1     yes|no|none  none
user1     doc2     ok|not       not
user1     doc1     yes|no|none  none
user1     doc2     ok|not       ok
user1     doc1     yes|no|none  no
user1     doc1     yes|no|none  none
user1     doc2     ok|not       not
user1     doc1     yes|no|none  none
user1     doc3     allryt       allryt
user1     doc1     yes|no|none  no
user1     doc1     yes|no|none  none
user1     doc2     ok|not       not
user1     doc1     yes|no|none  none
Can anyone correct my query or modify it?

Try this
with t
as
(
select 'user1' user_name, 'doc1' doc_name, 'yes|no|none' descr from dual union all
select 'User1' user_name, 'doc2' doc_name, 'ok|not' descr from dual union all
select 'user1' user_name, 'doc3' doc_name, 'allryt' descr from dual
)
  select user_name
       , doc_name
       , regexp_substr(descr, '[^\|]+', 1, level) descr
    from t
 connect by level <= length(descr) - length(replace(descr, '|')) + 1
     and descr = prior descr
     and prior sys_guid() is not null
   order
      by user_name
       , doc_name
       , level;
USER_ DOC_ DESCR
User1 doc2 ok
User1 doc2 not
user1 doc1 yes
user1 doc1 no
user1 doc1 none
user1 doc3 allryt
6 rows selected.
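The duplicates in the original query come from the CONNECT BY clause: it has no condition tying a child row to its parent, so at every level each row is connected to every other row. The query above fixes that with the PRIOR conditions. An alternative sketch that avoids a hierarchical query over the data table altogether is to join to a small row generator; the upper bound of 10 items per description and the name n are assumptions made for this example.

```sql
WITH t AS (
    SELECT 'user1' AS user_name, 'doc1' AS doc_name, 'yes|no|none' AS descr FROM dual UNION ALL
    SELECT 'user1', 'doc2', 'ok|not' FROM dual UNION ALL
    SELECT 'user1', 'doc3', 'allryt' FROM dual
)
, n AS (
    SELECT  LEVEL AS lvl
    FROM    dual
    CONNECT BY LEVEL <= 10            -- assumed maximum number of items per row
)
SELECT  t.user_name
,       t.doc_name
,       REGEXP_SUBSTR (t.descr, '[^|]+', 1, n.lvl) AS descr   -- n.lvl-th item
FROM    t
JOIN    n ON n.lvl <= LENGTH (t.descr) - LENGTH (REPLACE (t.descr, '|')) + 1
ORDER BY t.user_name, t.doc_name, n.lvl;
```

Each source row is matched only with as many generator rows as it has items, so nothing is duplicated.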

Want to select with REGEXP_SUBSTR a double vertical bar

Hi all,
I want to have this result
CMC US Equity;09/07/2008;2008|0;2009|100;2010|0;2011|0
CMC US Equity;09/07/2008;2008|0;2009|71;2010|400;2011|0
with this data
CMC US Equity;09/07/2008;2008|0;2009|100;2010|0;2011|0||2008|0;2009|71;2010|400;2011|0
I try with
select
regexp_substr('CMC US Equity;09/07/2008;2008|0;2009|100;2010|0;2011|0||2008|0;2009|71;2010|400;2011|0','[^\||]+',1,1) from dual;
but I have only
CMC US Equity;09/07/2008;2008
as result
how can I do ?
thanks
Babata

Input String:
CMC US Equity;09/07/2008;2008|0;2009|100;2010|0;2011|0||2008|0;2009|71;2010|400;2011|0
Output String:
CMC US Equity;09/07/2008;2008|0;2009|100;2010|0;2011|0
CMC US Equity;09/07/2008;2008|0;2009|71;2010|400;2011|0
(A)
If you want it in 1 row and 1 column:
test@ora>
test@ora> --
test@ora> with t as (
2    select 'CMC US Equity;09/07/2008;2008|0;2009|100;2010|0;2011|0||2008|0;2009|71;2010|400;2011|0' as x from dual)
3 --
4 select
5    x,
6    regexp_replace(x,'(([^;\|]*?;){2})([0-9\|;]*?)(\|\|)(.*)','\1\3'||chr(10)||'\1\5') as mod_x
7 from t;
X                                                                                      MOD_X
CMC US Equity;09/07/2008;2008|0;2009|100;2010|0;2011|0||2008|0;2009|71;2010|400;2011|0 CMC US Equity;09/07/2008;2008|0;2009|100;2010|0;2011|0
                                                                                       CMC US Equity;09/07/2008;2008|0;2009|71;2010|400;2011|0
1 row selected.
(B)
If you want it in the 1 row and 2 columns:
test@ora>
test@ora>
test@ora> --
test@ora> with t as (
2    select 'CMC US Equity;09/07/2008;2008|0;2009|100;2010|0;2011|0||2008|0;2009|71;2010|400;2011|0' as x from dual)
3 --
4 select
5    x,
6    regexp_replace(x,'(([^;\|]*?;){2})([0-9\|;]*?)(\|\|)(.*)','\1\3') as mod_x1,
7    regexp_replace(x,'(([^;\|]*?;){2})([0-9\|;]*?)(\|\|)(.*)','\1\5') as mod_x2
8 from t;
X                                                                                      MOD_X1                                                  MOD_X2
CMC US Equity;09/07/2008;2008|0;2009|100;2010|0;2011|0||2008|0;2009|71;2010|400;2011|0 CMC US Equity;09/07/2008;2008|0;2009|100;2010|0;2011|0 CMC US Equity;09/07/2008;2008|0;2009|71;2010|400;2011|0
1 row selected.
(C)
If you want it in the 2 rows and 1 column:
test@ora>
test@ora>
test@ora> --
test@ora> with t as (
2    select 'CMC US Equity;09/07/2008;2008|0;2009|100;2010|0;2011|0||2008|0;2009|71;2010|400;2011|0' as x from dual)
3 --
4 select
5    x,
6    regexp_replace(x,'(([^;\|]*?;){2})([0-9\|;]*?)(\|\|)(.*)','\1\3') as mod_x
7 from t
8 union all
9 select
10    x,
11    regexp_replace(x,'(([^;\|]*?;){2})([0-9\|;]*?)(\|\|)(.*)','\1\5')
12 from t;
X                                                                                      MOD_X
CMC US Equity;09/07/2008;2008|0;2009|100;2010|0;2011|0||2008|0;2009|71;2010|400;2011|0 CMC US Equity;09/07/2008;2008|0;2009|100;2010|0;2011|0
CMC US Equity;09/07/2008;2008|0;2009|100;2010|0;2011|0||2008|0;2009|71;2010|400;2011|0 CMC US Equity;09/07/2008;2008|0;2009|71;2010|400;2011|0
2 rows selected.
HTH
isotope
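On why the original attempt stopped early: '[^\||]+' is a bracket expression, which lists single characters, so it cannot describe the two-character delimiter '||' and the match ends at the first single bar. If there is exactly one '||' in the string (an assumption of this sketch), plain INSTR and SUBSTR can do the split, with a small REGEXP_SUBSTR to pick up the two leading ';'-terminated fields for the second part.

```sql
WITH t AS (
    SELECT 'CMC US Equity;09/07/2008;2008|0;2009|100;2010|0;2011|0||2008|0;2009|71;2010|400;2011|0' AS x
    FROM   dual
)
SELECT  SUBSTR (x, 1, INSTR (x, '||') - 1)            AS first_part    -- everything before '||'
,       REGEXP_SUBSTR (x, '^([^;]*;){2}')                              -- 'CMC US Equity;09/07/2008;'
        || SUBSTR (x, INSTR (x, '||') + 2)            AS second_part   -- prefix + everything after '||'
FROM    t;
```

If the string contains no '||' at all, first_part comes back NULL, so that case would need separate handling.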

Help me about built-in functions like regexp_substr,regexp_replace

Hi everybody
Can anyone help me understand functions like regexp_substr and regexp_replace?
It would be better if the documentation included examples for different situations; links would help too.
Thx

also - if you're just trying to learn regular expressions - which are generic and very much not specific to oracle, there are plenty of tutorial websites around the place that you'll find by googling.
well worth learning, regardless of the programming language you're working with.

Putting variable in regexp_substr pattern returns nothing

I'm trying to create a function to simply extract positioned text from a string, ie I want the text in position two from string X with a pattern of ' - '. I want to create a function so it is simpler for the developers but regexp_substr doesn't seem to like it.
So I want a function like below to be executed, select text_extractor('ABC - DEF - HIJ',' - ',2) from dual; which would return DEF;
But regexp_substr doesn't like matching on ' - '. If I ask for position 2, I get '-', and if I ask for position 3, I get DEF. It seems to work fine if the '-' is substituted with a ':'
ie
select regexp_substr('ABC - DEF - HIJ','[^ - ]+',1,2) from dual;
results
-
select regexp_substr('ABC - DEF - HIJ','[^ - ]+',1,3) from dual;
results
DEF
select regexp_substr('ABC : DEF : HIJ','[^ : ]+',1,2) from dual;
results
DEF
I would further like to make a function to wrap it in but it didn't work at all.
The following examples return nothing.
create or replace function text_extractor (p_text varchar2, p_delimiter varchar2, p_position number)
return varchar2
is
begin
return (regexp_substr(p_text,'^p_delimiter]+', 1,p_position));
end;
I've also tried
create or replace function text_extractor (p_text varchar2, p_delimiter varchar2, p_position number)
return varchar2
is
v_search_expression varchar2(2000);
begin
v_search_expression := '''[^'||p_delimiter||']+''';
return (regexp_substr(p_text,v_search_expression),1,p_position);
end;
But I get nothing. Any ideas?

When you do this:
select regexp_substr('ABC - DEF - HIJ','[^ - ]+',1,2) from dual;
results
-
Your search string is a set of characters, so you are saying anything that is NOT a space or dash (or space again, but that's redundant). So the first match is ABC, the second match is "-" as that's the second non-space match, so that's why you get that. Probably what you want is to ignore the spaces and use the "-" as your delimiter, then just trim the spaces off after...
select trim(regexp_substr('ABC - DEF - HIJ','[^-]+',1,2)) from dual
/
TRI
DEF
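As for the two function attempts: in the first one, p_delimiter sits inside the quotes, so it is literal pattern text rather than the parameter; in the second, the extra quotes become part of the pattern and the position arguments ended up outside the REGEXP_SUBSTR call. More fundamentally, a bracket expression cannot represent a multi-character delimiter such as ' - '. The sketch below swaps the regular expression for plain INSTR and SUBSTR, which treat the delimiter as an ordinary string; the local variable names are made up for the example.

```sql
CREATE OR REPLACE FUNCTION text_extractor (
    p_text      IN VARCHAR2,
    p_delimiter IN VARCHAR2,
    p_position  IN PLS_INTEGER
) RETURN VARCHAR2
IS
    l_start PLS_INTEGER := 1;   -- first character of the wanted field
    l_end   PLS_INTEGER;        -- position of the delimiter that ends it
BEGIN
    IF p_position > 1 THEN
        -- The wanted field starts right after the (p_position - 1)-th delimiter.
        l_start := INSTR (p_text, p_delimiter, 1, p_position - 1);
        IF l_start = 0 THEN
            RETURN NULL;        -- the string has fewer fields than requested
        END IF;
        l_start := l_start + LENGTH (p_delimiter);
    END IF;

    l_end := INSTR (p_text, p_delimiter, l_start);
    IF l_end = 0 THEN
        RETURN SUBSTR (p_text, l_start);              -- last field: take the rest
    END IF;
    RETURN SUBSTR (p_text, l_start, l_end - l_start);
END text_extractor;
/
```

With this in place, select text_extractor('ABC - DEF - HIJ', ' - ', 2) from dual; should return DEF.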

REGEXP_SUBSTR() help

Hi:
I've got a bunch of triggers that I'm trying to get rid of. They are BEFORE UPDATE OF x1,x2,x3... triggers and they basically prevent the user from updating the columns. I want to do away with these and make the table declarations handle it (grant select only on the columns not found in the trigger of a given table). So step 1 is I'm trying to go through the user_triggers description column and parse out the columns that the current trigger applies to. I've simplified the example by including text directly into the REGEXP call but I'm really using it in a larger query on user_triggers and the REGEXP is working off of the description column contents.
My problem is that I want a way to extract the columns and only the columns. I currently have:
select upper(REGEXP_SUBSTR(
'TRIGGER "GAFF".access_tbus
   BEFORE UPDATE OF access_code,foo1_col,foo2_col ON access_tbl
BEGIN
   s2_error_pck.update_not_allowed(''Access_TBL column Access_Code, foo1_col, foo2_col'');
END;',
       '(((before update)[ ]{1})'                  -- keywords followed by a space
      || '(\s*[[:alnum:]_]+\s*,){0,10})'        -- 0 to 10 instances of column_name followed by comma
      || '(\s*([[:alnum:]_]\s*).*)',1,1,'i'))     -- the final (perhaps only) column name, not followed by comma
from dual
but the result is
BEFORE UPDATE OF ACCESS_CODE,FOO1_COL,FOO2_COL ON ACCESS_TBL
I currently have this working (I think!) with 1-n columns in the list, but I get more than I want in the output. How do I use the BEFORE UPDATE OF to match but not include it or the table name in the result? (I'd ideally like to tack on a ' ON ' in the match as well, if I can do that without getting stuck with it in the output.)
Thanks,
Gaff

Hi,
Example:
Connected to Oracle Database 10g Express Edition Release 10.2.0.1.0
Connected as hr
SQL>
SQL> SELECT ut.trigger_name,
            ut.trigger_type,
            ut.triggering_event,
            utc.table_name,
            utc.column_name
       FROM user_triggers     ut,
            user_trigger_cols utc
      WHERE ut.trigger_name = utc.trigger_name
      ORDER BY ut.trigger_name,
               utc.table_name,
               utc.column_name;
TRIGGER_NAME                   TRIGGER_TYPE     TRIGGERING_EVENT                                                                 TABLE_NAME                     COLUMN_NAME
UPDATE_JOB_HISTORY             AFTER EACH ROW   UPDATE                                                                           EMPLOYEES                      DEPARTMENT_ID
UPDATE_JOB_HISTORY             AFTER EACH ROW   UPDATE                                                                           EMPLOYEES                      EMPLOYEE_ID
UPDATE_JOB_HISTORY             AFTER EACH ROW   UPDATE                                                                           EMPLOYEES                      HIRE_DATE
UPDATE_JOB_HISTORY             AFTER EACH ROW   UPDATE                                                                           EMPLOYEES                      JOB_ID
SQL>
Regards,
Edited by: Walter Fernández on Feb 23, 2009 5:13 PM
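The dictionary view above is probably the simpler route. If you still want to parse the description with a regular expression, the sub-expression argument of REGEXP_SUBSTR lets the keywords take part in the match without appearing in the result. The following is a sketch only: it assumes Oracle 11g or later and a trigger description laid out like the one in the question.

```sql
SELECT  UPPER ( REGEXP_SUBSTR (
                   'TRIGGER "GAFF".access_tbus BEFORE UPDATE OF access_code,foo1_col,foo2_col ON access_tbl'
                 ,    'before update of[[:space:]]+'
                   -- sub-expression 1: a name, then any number of ", name" repeats
                   || '([^[:space:],]+([[:space:]]*,[[:space:]]*[^[:space:],]+)*)'
                   || '[[:space:]]+on[[:space:]]'
                 , 1
                 , 1
                 , 'i'      -- case-insensitive keywords
                 , 1        -- return only the column list
                 ) ) AS column_list
FROM    dual;
```

For the sample text this should give ACCESS_CODE,FOO1_COL,FOO2_COL, without the keywords or the table name.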

REGEXP_SUBSTR read whole lines

Hi,
I'm trying to work with REGEXP_SUBSTR.
But I’ve a problem with it:
For example I’ve this text in a varchar field:
SCAFFOLD REF: 529
LOCATION: CD4
DATE OF ERECTION: 120607
LAST INSPECTION: 110308
LADDER INSPECTION DATE: 110308
COMMENTS: Scaffold OK - issue over time logged at inspection
IO130 am this appears to be on all tags
DATE SIGNED: 170308
Now I want all the text from "COMMENTS" (without “comment”) until DATE also without date itself. Is that possible?
Thanks

Hi,
Maybe another approach is to remove all data before "COMMENTS:" and after "DATE SIGNED:". Because using regexp_substr, you will not be able to get rid of "COMMENTS: " and "DATE SIGNED: ". See example below :
SQL> drop table t;
Table dropped
SQL> create table t(info varchar2(1000));
Table created
SQL> insert into t(info)
values ('SCAFFOLD REF: 529
LAST INSPECTION: 110308
LADDER INSPECTION DATE: 110308
COMMENTS: Scaffold OK - issue over time logged at inspection
IO130 am this appears to be on all tags

DATE SIGNED: 170308');
1 row inserted
SQL> select regexp_replace(info, '([^,]*COMMENTS: )([^,]*)(DATE SIGNED: [^,]*)', '\2') as regexp_replace from t;
REGEXP_REPLACE
Scaffold OK - issue over time logged at inspection
IO130 am this appears to be on all tags
SQL>
regards,
François
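Note that the REGEXP_REPLACE pattern above relies on the text containing no commas. On Oracle 11g or later, the sub-expression argument of REGEXP_SUBSTR offers another way to return only the text between the two labels. This is a sketch under that version assumption; the sample text is rebuilt inline with CHR(10) line breaks.

```sql
WITH t AS (
    SELECT    'LADDER INSPECTION DATE: 110308' || CHR(10)
           || 'COMMENTS: Scaffold OK - issue over time logged at inspection' || CHR(10)
           || 'IO130 am this appears to be on all tags' || CHR(10)
           || 'DATE SIGNED: 170308' AS info
    FROM    dual
)
SELECT  RTRIM ( REGEXP_SUBSTR ( info
                              , 'COMMENTS: (.*)DATE SIGNED:'
                              , 1
                              , 1
                              , 'n'     -- let "." match line breaks as well
                              , 1       -- return only the text between the labels
                              )
              , CHR(10) || CHR(13) || ' '   -- drop trailing line breaks and spaces
              ) AS comments
FROM    t;
```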

Problem with REGEXP_SUBSTR

Hi
I am trying to use REGEXP_SUBSTR which looks a very powerful function but one that you have to get the hang of!
The following gives me the results I expect:
select
REGEXP_SUBSTR('123456^789^AB^CDEF', '[^\^]+[^\^]',1,1) test1,
REGEXP_SUBSTR('123456^789^AB^CDEF', '[^\^]+[^\^]',1,2) test2,
REGEXP_SUBSTR('123456^789^AB^CDEF', '[^\^]+[^\^]',1,3) test3,
REGEXP_SUBSTR('123456^789^AB^CDEF', '[^\^]+[^\^]',1,4) test4
FROM dual
123456,789,AB,CDEF
But this does not:
select
REGEXP_SUBSTR('123456^7^A^CDEF', '[^\^]+[^\^]',1,1) test1,
REGEXP_SUBSTR('123456^7^A^CDEF', '[^\^]+[^\^]',1,2) test2,
REGEXP_SUBSTR('123456^7^A^CDEF', '[^\^]+[^\^]',1,3) test3,
REGEXP_SUBSTR('123456^7^A^CDEF', '[^\^]+[^\^]',1,4) test4
FROM dual
123456,CDEF,,
Please help!!
Thanks

Hi,
Actually, the two patterns are not equivalent.
(1) '[^\^]+' means "1 or more consecutive non-carets"
(2) '[^\^]+[^\^]' means "1 or more consecutive non-carets, followed immediately by a non-caret"
(3) '[^\^]{2,}' means "2 or more consecutive non-carets", which is the same result as (2), but (in my opinion) clearer

Oracle regular expressions REGEXP_SUBSTR

Hi,
I'm relatively new here and this might be a silly question.
I have started using regular expressions and do not know how to get the second item from the end (7 in this case).
select REGEXP_SUBSTR('1/2/3/4/5/6/7/8' ,'[^/]+$',1, 1),
REGEXP_SUBSTR('1/2/3/4/5/6/7/8' ,'[^/]+$',1, 2),
REGEXP_SUBSTR('1/2/3/4/5/6/7/8' ,'[^/]+$')
from dual;
Please help.
Edited by: lsy_nn on Jul 21, 2010 1:51 PM

RegExp_Replace is useful ;-)
Let us read these threads.
I have created part4 :8}
Introduction to regular expressions part1 to part4
Introduction to regular expressions ...
Introduction to regular expressions ... continued.
Introduction to regular expressions ... last part.
Introduction to regular expressions part4
col extStr for a10
select
RegExp_Replace('1/2/3/4/5/6/7/8',
'^.*([^/]+)/.*$',
'\1') as extStr
from dual;
EXTSTR
7
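One caveat with the pattern above: because the leading .* is greedy, the captured group is squeezed down to a single character, so it appears to work only when the items are one character long. A sketch that anchors to the end of the string and names what follows the wanted item should cope with longer items as well; it uses the sub-expression argument of REGEXP_SUBSTR and therefore assumes Oracle 11g or later.

```sql
SELECT  REGEXP_SUBSTR ('1/2/3/4/5/6/7/8', '([^/]+)/[^/]*$', 1, 1, NULL, 1) AS single_char_items
,       REGEXP_SUBSTR ('10/20/30/40',     '([^/]+)/[^/]*$', 1, 1, NULL, 1) AS longer_items
FROM    dual;
-- 7 and 30
```

The pattern reads as: an item, one slash, then a final item running to the end of the string; only the first item is returned.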

Regexp_substr question

Oracle 10g and Oracle 11g
with mike_test as
(select
'Dont want this Line
Name want this line
Name want this line
Dont want this line
Name want this line' xx
from dual)
select regexp_substr(xx,'Name(.*)') from mike_test
I'm looking to return
Name want this line
Name want this line
Name want this line
but can only get one occurrence. I keep trying to use the matching mode 'm' to search through all lines but I'm not having success.
thanks
Mike

This should also work. As long as the "Name" doesn't include a special char used to do some regexp_conversions. I used the char "§".
with mike_test as
(select
'Dont want this Line
Name want this line
Naem not want this line !
Dont want this line
Name want this line' xx
from dual)
select regexp_substr(xx,'^Name.*$',1,1,'m') first_result
   ,regexp_count(xx, '^Name.*$', 1,'m') number_of_lines
   ,regexp_replace(
       regexp_replace(
           regexp_replace(xx,'(Name)','§',1,0,'m')
           ,'^[^§](.*)$','',1,0,'m')
       ,'§','Name',1,0,'m') all_lines
from mike_test
ALL_LINES
Name want this line
Name want this line
The problem with regexp_substr is that it delivers only one occurrence of the string.
Regexp_replace can eliminate all occurrences. However, you can't easily say something like: everything that does not start with "Name".
Therefore I first replaced the "Name" part with the single character § and then used the negated class [^§] to eliminate the lines that are not needed. Finally I replaced the substitution character § with the wanted "Name" again.
Edited by: Sven W. on Dec 3, 2012 10:56 PM
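If one row per matching line is acceptable, the occurrence argument of REGEXP_SUBSTR can be driven by a row generator instead. This sketch assumes Oracle 11g (for REGEXP_COUNT) and rebuilds the sample text inline with CHR(10) line breaks.

```sql
WITH mike_test AS (
    SELECT    'Dont want this Line' || CHR(10)
           || 'Name want this line' || CHR(10)
           || 'Name want this line' || CHR(10)
           || 'Dont want this line' || CHR(10)
           || 'Name want this line' AS xx
    FROM    dual
)
SELECT  REGEXP_SUBSTR (xx, '^Name.*$', 1, LEVEL, 'm') AS name_line   -- LEVEL-th matching line
FROM    mike_test
CONNECT BY LEVEL <= REGEXP_COUNT (xx, '^Name.*$', 1, 'm');           -- one row per match
```

In 'm' mode the ^ and $ anchors apply to each line, and since "." does not match a line break by default, each match stays within a single line.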

Regexp_substr to extract date

I have a field that has a date (mm/dd/yyyy format) followed by a newline character. How do I extract just the date from the field using regexp_substr?
I can use substr for this, but i want to see if i can use regexp_substr
Thanks
Billu

Perhaps:
select regexp_substr(dt, '\d{2}/\d{2}/\d{4}') from t
Sample execution:
Connected to Oracle Database 10g Enterprise Edition Release 10.2.0.4.0
Connected as fsitja
SQL>
SQL> with t as (
select '01/31/2010' ||chr(10) dt from dual union all
select ' 02/28/2010' ||chr(10) dt from dual
)
--
select to_date(regexp_substr(dt, '\d{2}/\d{2}/\d{4}'), 'MM/DD/YYYY') from t;
TO_DATE(REGEXP_SUBSTR(DT,'\D{2
31/1/2010
28/2/2010
SQL>
