RegExp and \n

Hi,
today I need to find out whether or not a line ends with a \n.
I am parsing a CSV file with 10 semicolon-separated values per line.
So, here's my (working) regex so far:
Pattern p = Pattern.compile("((.*)(;{1})){10}");
BUT... I'd like to add the '\n' parsing at the end of my expression and do not know how. Maybe the whole expression mentioned before is rubbish, so please help...
Thanks

I made a test program, but I really don't know if it is what the OP is expecting. Please, check it out:
import java.util.regex.Matcher;
import java.util.regex.Pattern;

public class CSVTest {
     public static void main(String[] args) {
          // Sample data: ten semicolon-separated values per line, mixed \n and \r\n line endings
          String csv =
                    "337;111;55dfg3;247;61;528;470;117;312;379\n" +
                    "294;131;566;511;339;424;140;498;444;5dfg96\n" +
                    "276;54;589;510;44;160;68;268;105;112\n" +
                    "42;393;195;442;591;11;261;407;340;1 dfg 44\n" +
                    "448;492;297;185;204;66;390;518;584;36\n" +
                    "406;452;464;24;535;542;156;256;349;226\r\n" +
                    "294;18;499;451;212;260;266;84;331;229\n" +
                    "22;128;309;88;3;257;413;94;532;95\r\n" +
                    "197;81;350;164;419;528;349;457;277;172\n" +
                    "420;267;555;108;182;274;335;40;187;67";
          // Group 1 swallows leading line breaks; group 2 captures nine "value;" fields plus the last value
          String regex = "([\\n\\r])*((?:[^;]*;){9}[^\\r\\n]*)";
          Pattern p = Pattern.compile(regex);
          Matcher m = p.matcher(csv);
          int i = 1;
          while (m.find()) {
               System.out.println("Match #" + i++ + " [" + m.group(2) + "]");
          }
     }
}
I hope it helps you.
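As an alternative to consuming the line breaks explicitly, here is a minimal sketch that relies on Pattern.MULTILINE, so that ^ and $ anchor at every line start and line end (both \n and \r\n). The class name CsvLineCheck and the helper validLines are made up for this illustration and are not part of the program above; the sketch assumes that fields never contain ';' or a line break.

```java
import java.util.ArrayList;
import java.util.List;
import java.util.regex.Matcher;
import java.util.regex.Pattern;

class CsvLineCheck {
    // Nine "field;" groups plus a final field; no field may contain ';' or a line break.
    // MULTILINE makes ^ and $ match at the start and end of every line, not only of the input.
    private static final Pattern TEN_FIELDS =
            Pattern.compile("^(?:[^;\\r\\n]*;){9}[^;\\r\\n]*$", Pattern.MULTILINE);

    // Returns every line of the input that consists of exactly ten fields.
    static List<String> validLines(String csv) {
        List<String> lines = new ArrayList<>();
        Matcher m = TEN_FIELDS.matcher(csv);
        while (m.find()) {
            lines.add(m.group());
        }
        return lines;
    }

    public static void main(String[] args) {
        String csv = "1;2;3;4;5;6;7;8;9;10\n"
                   + "a;b;c;d;e;f;g;h;i;j\r\n"
                   + "too;few;fields\n"
                   + "k;l;m;n;o;p;q;r;s;t";
        for (String line : validLines(csv)) {
            System.out.println(line);
        }
    }
}
```

The line with only three fields is skipped, and the line terminators never end up in the matched text, so no extra group is needed to strip them.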

Similar Messages

REGEXP and BLOB

Hi
One of my BLOB columns contains text which I need to check in my SQL query.
Can we achieve this by using REGEXP?
Is there any other way to determine the value of fields/data inside a BLOB in an SQL query (not in a procedure or function)?
Kapilk

You mean CLOB? BLOB is a binary type; you can't get reasonable text out of it.
Yes, regular expressions work with CLOB:
http://download.oracle.com/docs/cd/B19306_01/appdev.102/b14251/adfns_regexp.htm#sthref534
Oracle Database Implementation of Regular Expressions
Oracle Database implements regular expression support with a set of Oracle Database SQL functions and conditions that enable you to search and manipulate string data. You can use these functions in any environment that supports Oracle Database SQL. You can use these functions on a text literal, bind variable, or any column that holds character data such as CHAR, NCHAR, CLOB, NCLOB, NVARCHAR2, and VARCHAR2 (but not LONG).

Regular Expressions and variables OR RegExp and var

Sorry about that, my browser hiccuped and sent three
times....

Newlines are only a problem if you're reading the
text line-by-line and applying the regexp to each
line. It wouldn't catch expressions that span
lines.
@sabre150: your note re: CharSequence -- so what
you're suggesting is to implement a CharSequence that
wraps the file contents, and then use the regexps on
the whole thing? I like the idea but it seems like
it would only be easy to implement if the file uses a
fixed-width character set. Or am I missing
something...?
You are correct for the most basic implementation. It is very easy to create a CharSequence for fixed-width character sets using RandomAccessFile. Once you go to character sets such as UTF-8, more effort is required.
As long as the regex moves forward through the CharSequence one char at a time there is no problem, because one can wrap a Reader; but once it backtracks, one needs random access and therefore a buffer. I have used a ring buffer for this, which seems to work OK, but of course this will not allow the regex to move to an arbitrary point in the CharSequence.
'uncle_alice' is the regex king round here so listen to him.
:-( I should read further ahead next time!
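To illustrate the point about newlines: the following is a minimal sketch, not the ring-buffer implementation described above. It simply holds the whole text in a String (which is already a CharSequence) and compares that with line-by-line matching. The pattern BEGIN.*?END and the class name SpanningMatch are invented for the example.

```java
import java.util.regex.Pattern;

class SpanningMatch {
    // DOTALL lets '.' match line terminators, so a match may span several lines.
    static final Pattern BLOCK = Pattern.compile("BEGIN.*?END", Pattern.DOTALL);

    // Applies the regex to each line separately; a match that spans lines is missed.
    static boolean foundLineByLine(String text) {
        for (String line : text.split("\\R")) {
            if (BLOCK.matcher(line).find()) {
                return true;
            }
        }
        return false;
    }

    // Applies the regex to the whole content at once (any CharSequence will do).
    static boolean foundInWhole(CharSequence text) {
        return BLOCK.matcher(text).find();
    }

    public static void main(String[] args) {
        String text = "x BEGIN a\nb END y";
        System.out.println(foundLineByLine(text)); // false
        System.out.println(foundInWhole(text));    // true
    }
}
```

For files too large to hold in memory, a custom CharSequence over the file would take the place of the String here, with the buffering caveats mentioned in the reply.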

Tricky Regexp and string to column-row SQL

Hi All,
I am TRYING to build an SQL query that will convert a string passed as
HP|250 GB * 2 + 80 GB * 3 + 100 GB | SATA
to
HP | 250 GB | SATA
HP | 250 GB | SATA
HP | 80 GB | SATA
HP | 80 GB | SATA
HP | 80 GB | SATA
HP | 100 GB | SATA
My attempt so far (which tells me to learn more about regexp) is:
WITH T AS
( SELECT q'[HP|250 GB * 2 + 80 GB * 3 + 100 GB | SATA]' str FROM DUAL
),
t2 AS
(SELECT trim(regexp_substr(str,'[^|]+',1,level)) val
   FROM T
   CONNECT BY level <= LENGTH (str)-LENGTH(REPLACE(str,'|'))+1
),t3 AS
(SELECT DISTINCT trim(regexp_substr(val,'[^+]+',1,level)) val
FROM t2 WHERE VAL LIKE '%*%' OR VAL LIKE '%+%'
CONNECT BY level <= LENGTH (val)-LENGTH(REPLACE(val,'+'))+1
),t4 as
(SELECT VAL,ROWNUM RN FROM T2 A1
   WHERE VAL NOT LIKE '%*%' OR VAL NOT LIKE '%+%'),
t5 as
(SELECT A.VAL MK, T3.VAL CONFG, B.VAL TYP
   FROM   T3, (SELECT VAL FROM T4 WHERE RN = 1)A,(SELECT VAL FROM T4 WHERE RN = 2) B)
   SELECT *
   FROM   T5;
And the output I got so far is:
MK         CONFG        TYP
HP         80 GB * 3    SATA
HP         250 GB * 2   SATA
HP         100 GB       SATA
Please suggest what more I should do to get the desired output (in SQL).
BANNER
Oracle Database 11g Enterprise Edition Release 11.2.0.1.0 - Production
PL/SQL Release 11.2.0.1.0 - Production
CORE     11.2.0.1.0     Production
TNS for 32-bit Windows: Version 11.2.0.1.0 - Production
NLSRTL Version 11.2.0.1.0 - Production
Thanks for reading this post
*009*

with t1 as (
            select 'HP|250 GB * 2 + 80 GB * 3 + 100 GB | SATA' str from dual union all
            select 'INTEL|40 GB + 55 GB| IDE' from dual
           ),
     t2 as (
            select regexp_substr(str,'^[^|]+') mk,
                   trim(regexp_substr(replace(str,'|','+'),'[^+]+',1,column_value + 1)) element,
                   regexp_substr(str,'[^|]+$') typ
              from t1,
                   table(
                         cast(
                              multiset(
                                       select level
                                         from dual
                                         connect by level <= length(regexp_replace(str || '+','[^+]'))
                                      )
                              as sys.OdciNumberList
                             )
                        )
           )
select mk,
       trim(regexp_replace(element,'\*.*$')) val,
       typ
  from t2,
       table(
             cast(
                  multiset(
                           select level
                             from dual
                             connect by level <= substr(regexp_substr(element,'\*.*$'),2)
                          )
                  as sys.OdciNumberList
                 )
            )
MK         VAL        TYP
HP         250 GB      SATA
HP         250 GB      SATA
HP         80 GB       SATA
HP         80 GB       SATA
HP         80 GB       SATA
HP         100 GB      SATA
INTEL      40 GB       IDE
INTEL      55 GB       IDE
8 rows selected.
SQL> SY.
P.S. Post your version. If 11.1 it can be simplified. If 11.2 can be simplified even more.

Regexp and performance

ahoj!
I want to create an index on a column that I query via a regular expression, for example ...where regexp_like(phone.extension_col, whitelist.extension_col)...
phone.extension_col is the column with the phone-number extensions I want to query, and whitelist.extension_col is a list of extensions that are allowed. whitelist contains plain extensions like 2450 AND extensions written as regular expressions like ^8...$!
I tried to create an index on phone.extension_col, but it doesn't work:
create index idx_extension on phone (regexp_like(extension_col, '^8...$'));
-> ORA-00904 column name not valid
Can someone help me? Thanks!
bye,
christian

hi,
I think you have given a wrong column name:
create index idx_extension on phone (regexp_like('phone.extension_col', 'whitelist.extension_col'));
I don't know whether the above would work or not, but you have to find the correct column name by looking at the error.
http://download.oracle.com/docs/cd/B19306_01/server.102/b14200/statements_5010.htm
regards
Jafar

Regexp and ampersand

Hi all,
I am using regexp to take a substring of a CLOB of data. The data contains an ampersand (&), so I need to add this to my regexp. The trouble is that it doesn't take it as a literal character (even when using the escape [\] char before it); instead it thinks I am trying to define a variable input field.
Any ideas?
Thanks for your time,
James

Hi!
Read and try this:
SQL*Plus
When issuing SQL from within SQL*Plus several considerations need to be made.
An ampersand (&) will be treated as a define variable, so should it appear in the
pattern as a text literal, you will be prompted to enter its value. This behaviour can
be removed by setting the SQL*Plus variable DEFINE to OFF.
If the SQL*Plus variable ESCAPE is set to ON then any instance of the escape
variable will be stripped. This is unfortunate for regular expressions as the default
escape character is backslash (\) which is a common metacharacter. To be safe it is
best to ensure that ESCAPE is set to OFF before issuing a regular expression
query.
link: http://www.oracle.com/technology/products/database/application_development/pdf/TWP_Regular_Expressions.pdf
Hope this helps!

ReplaceAll regexp and JavaScript

I want to write a function that will escape special characters in a String so that they render correctly in JavaScript.
For example, if a string contains a tab, I want the tab to be replaced by \t and so on for all the escape characters : \', \", \\, \b, \f, \n, \t and \r.
Here is my function so far :
public static String JSStringFormat(String value) {
    String result = "";
    if (value != null) {
        result = value;
        result = result.replaceAll("\'", "\\\\'");
        result = result.replaceAll("\b", "\\\\b");
        result = result.replaceAll("\f", "\\\\f");
        result = result.replaceAll("\n", "\\\\n");
        result = result.replaceAll("\t", "\\\\t");
        result = result.replaceAll("\r", "\\\\r");
    }
    return result;
}
So far, this works... But I can't find the solution for \\ and \"... I've tried many things but nothing works...
Can anyone help?

replaceAll("\"", "\\\""); // I think
It throws compilation error on this line itself. If
you are using an IDE, it wont allow you to go further
with that line.
replaceAll("\\\\", "\\\\\\\\"); // I think.
I had a slight error in the quote one: the replacement string needed two more backslashes. This works:
        String str = "\\a \"xyz\" \\ b";
        System.out.println(str);
        //System.out.println(str.replaceAll("\\" , "\\\\\\"));
        System.out.println(str.replaceAll("\"", "\\\\\""));
        System.out.println(str.replaceAll("\\\\", "\\\\\\\\"));
Your line (commented above) gives
Exception in thread "main" java.util.regex.PatternSyntaxException: Unexpected internal error near index 1
^
regEx is particular about special characters like $
and \.
Yes, I know.
So when you are using replaceAll, you have to
be careful with regex, and replace them. For example,
if you have a $ to be replaced, you have to replace
it the following way.
str.replaceAll("\\$", "\\\\\\$");
Yes.
so, instead of directly using the $, you have to use
"\\" that tells the regEx, yes this is a special
character and treat it normally, like any other
character.
Yes.
Along those same lines, like I said, you need \\ to get a single \ in a Java String. Then, if you want a literal \ in the regex, you need to provide the regex compiler with \\, which means your Java String literal must be \\\\.
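One way to sidestep the double layer of escaping is String.replace(CharSequence, CharSequence), which treats both arguments literally instead of as a regex and a replacement pattern. The sketch below swaps that technique in for replaceAll; the class name JsEscape and the method name escape are made up for the example. The backslash is handled first so that the backslashes introduced by the later replacements are not escaped a second time.

```java
class JsEscape {
    // Escapes characters that are special inside a JavaScript string literal.
    // String.replace works on literal text, so only Java string escaping is needed.
    static String escape(String value) {
        if (value == null) {
            return "";
        }
        return value
                .replace("\\", "\\\\")  // backslash first: \  becomes \\
                .replace("\"", "\\\"")  // double quote:    "  becomes \"
                .replace("'", "\\'")    // single quote:    '  becomes \'
                .replace("\b", "\\b")
                .replace("\f", "\\f")
                .replace("\n", "\\n")
                .replace("\t", "\\t")
                .replace("\r", "\\r");
    }

    public static void main(String[] args) {
        // Input is: say "hi", a real tab, then a\b with a real backslash
        System.out.println(escape("say \"hi\"\ta\\b"));
    }
}
```

This requires Java 5 or later, since that is when the CharSequence overload of replace was introduced.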

Regexp and group capturing

Hi,
I have trouble with a capturing group, as shown in the example below:
String s = "KLASSE3";
Pattern p = Pattern.compile("KLASSE(\\d)");
Matcher m= p.matcher(s);
System.out.println( m.groupCount());
System.out.println( m.group(1));
Running this gives :
1
Exception in thread "main" java.lang.IllegalStateException: No match found
at java.util.regex.Matcher.group(Matcher.java:421)
Tried with 1.5.0_08 and 1.6.0-b105
Thanks for any hint

shame on me !!!
thanks very much dude
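For readers who hit the same exception: Matcher.groupCount() only reports how many groups the pattern declares, which is why it prints 1 without any match having been attempted. Matcher.group(int) needs a successful matches(), find() or lookingAt() call first. A minimal sketch follows; the class name GroupDemo and the helper classDigit are made up for the example.

```java
import java.util.regex.Matcher;
import java.util.regex.Pattern;

class GroupDemo {
    private static final Pattern KLASSE = Pattern.compile("KLASSE(\\d)");

    // Returns the digit captured by group 1, or null if the input does not match.
    static String classDigit(String s) {
        Matcher m = KLASSE.matcher(s);
        // matches() performs the actual match; only afterwards is group(1) available.
        return m.matches() ? m.group(1) : null;
    }

    public static void main(String[] args) {
        System.out.println(classDigit("KLASSE3")); // prints 3
    }
}
```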

RegExp and group

Hi I'm currently working on an IDL compiler and was wondering if somebody can help me with a regular expression I use to read out methods from an IDL file.
this.methodPattern = Pattern.compile("\\s*" +
     "(\\w+)" +
     "\\s+(\\w+)\\s*" +
     "[\\(]\\s*" +
     "(?:(?:\\s*?(\\w+)\\s*(\\w+)\\s*?[,]?)?)*" +
     "[\\)]" +
     "\\s*(?:\\s*raises\\s*(\\w+))?;");
Now this is supposed to match something like
String abc(int a, int b, int c) raises RemoteException
which it does :)
The problem lies with the quantifier on the parameter group, which overwrites my parameters: it only stores the last parameter. In this case it would store:
group(i) -> int
group(i+1) -> c
The problem only lies in the parameter part of the pattern; the rest is matched and stored properly.
It overwrites the previous parameters. Is there a way to store all parameters instead of only the last one? I thought the quantifier wouldn't cause issues like that.
I'm using JRE 1.6
If someone has an idea as to what I can do in this situation I would be most grateful for suggestions.
- I know I could just match the whole contents and the split the String into an Array, however I would prefer to keep it to one Pattern.

You probably want a compiler generator like
ANTLR or
JavaCC.
Note that JavaCC has an IDL grammar; you can get it here:
https://javacc.dev.java.net/servlets/ProjectDocumentList?folderID=110&expandFolder=110&folderID=0
To use it, download it and do something like this:
$ cat Hello.idl
module HelloApp
{
   interface HelloCallback
   {
      void callback(in string message);
   };
   interface Hello
   {
      string sayHello(in HelloCallback objRef, in string message);
   };
};
$ javacc IDL.jj && javac *.java && java IDLParser Hello.idl
Java Compiler Compiler Version 4.0 (Parser Generator)
(type "javacc" with no arguments for help)
Reading from file IDL.jj . . .
File "TokenMgrError.java" does not exist. Will create one.
File "ParseException.java" does not exist. Will create one.
File "Token.java" does not exist. Will create one.
File "SimpleCharStream.java" does not exist. Will create one.
Parser generated successfully.
Note: IDLParser.java uses unchecked or unsafe operations.
Note: Recompile with -Xlint:unchecked for details.
IDL Parser Version 0.1: Reading from file Hello.idl . . .
IDL Parser Version 0.1: IDL definitions parsed successfully.
You can either add syntactic actions to collect the method names or convert this to a JJTree grammar and use a Visitor; the latter is the cleaner technique but the former may be a bit simpler.
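If a regex-only approach is still wanted for simple declarations, note that in java.util.regex a capturing group inside a repeated group keeps only the text of its last iteration, which is the overwriting described in the question. A common workaround is to capture the whole parameter list once and walk it with a second Matcher. The sketch below assumes plain "type name" parameters, as in the question's example (not IDL's "in string message" form); the class ParamDemo and both patterns are made up for the illustration and are not the original poster's pattern.

```java
import java.util.ArrayList;
import java.util.List;
import java.util.regex.Matcher;
import java.util.regex.Pattern;

class ParamDemo {
    // Groups: 1 = return type, 2 = method name, 3 = raw parameter list, 4 = raised exception (optional)
    static final Pattern METHOD = Pattern.compile(
            "\\s*(\\w+)\\s+(\\w+)\\s*\\(([^)]*)\\)\\s*(?:raises\\s+(\\w+))?\\s*;?");
    // One "type name" pair inside the parameter list
    static final Pattern PARAM = Pattern.compile("(\\w+)\\s+(\\w+)");

    // Returns every "type name" pair of the declaration, or an empty list if it does not match.
    static List<String> params(String declaration) {
        List<String> result = new ArrayList<>();
        Matcher m = METHOD.matcher(declaration);
        if (m.matches()) {
            Matcher p = PARAM.matcher(m.group(3));
            while (p.find()) { // one find() per parameter, so nothing is overwritten
                result.add(p.group(1) + " " + p.group(2));
            }
        }
        return result;
    }

    public static void main(String[] args) {
        System.out.println(params("String abc(int a, int b, int c) raises RemoteException"));
        // prints [int a, int b, int c]
    }
}
```

This is two patterns rather than one, but it stays within the regex API; for anything beyond flat declarations a generated parser, as suggested above, is likely the more robust route.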

JS for search and creation of automated hyperlinks from the results

Hi,
Is it possible to create a javascript that searches a PDF document for a part of phrase (with regexp) and then creates a hyperlink of the whole row where the phrase is?
I'll explain a little bit more....
In a PDF catalog I have 10-digit part numbers that always start with "5010", followed by a short descriptive text for the product and finally a price. The part number and the description are separated by two spaces and a "pipe" (|), and the price is separated from the description in the same way.
Example: 5010101538 | This is the describing text for the product | $4996
Now, I want to create a hyperlink to my website so that each product row is clickable in Acrobat. The link starts with a static part and ends with the product number (10 digits) followed by the extension .aspx (http://www.myweb.com/pd_5010.......aspx).
I know that this should be done at the creative stage, but the DB connection plugin for the parts does not support URL linking in InDesign... so I'm stuck with Acrobat for the 900 links that need to be created. ;)
Since I'm new to JS in Acrobat I hope there is help out there!
Kindly
Magnus

Hi Magnus,
It's a bit tricky, but it can be done. Contact me by email for more info.

ReplaceAll / regexp / escaping a string

Hi, I have a question about escaping a string for use with replaceAll. My objective is to replace all instances of string1 with string2, so I have been using replaceAll. This works well; however, since it relies on regexps, I have some questions about escaping.
I want to, for example, replace [%PARAM%] with '100%', but this starts to get into the murky (to me) world of regular expressions. After some testing I realize I have lots of characters that are significant to the regexp, and I can avoid problems by escaping them like this:
\\[\\%PARAM\\%\\] with '100\\%'
This works; however, the values will be parsed out of strings the user enters, so clearly I don't want them to have to add the escape characters. Is there a way I can:
a) Tell it to ignore special characters so it is treated like a basic search and replace, or
b) Automatically escape any problematic characters in the string before calling replaceAll?
If there is another approach you would recommend I'd be very interested to hear it too.
Thanks in advance,
Chris.

Don't get me started on the evils of writing Java inside JSP!
Anyway, if you look at the documentation I helpfully linked for you, you'll see that the replace method that can take two Strings (two CharSequences to be precise) versus just two chars was introduced in version 1.5, so you must be using an older compiler.
You should consider upgrading to 1.6. Even 1.5, let alone the version you must be using, is in its Java Technology End of Life (EOL) transition period.
[http://java.sun.com/javase/downloads/index_jdk5.jsp]
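Both options from the question exist in the standard library from Java 5 on. A minimal sketch follows; the class name LiteralReplace and the method names are made up for the example. Option a) is String.replace(CharSequence, CharSequence), which does a plain search and replace. Option b) keeps replaceAll but passes the search text through Pattern.quote and the replacement through Matcher.quoteReplacement, so that characters such as [ ] $ and \ lose their special meaning.

```java
import java.util.regex.Matcher;
import java.util.regex.Pattern;

class LiteralReplace {
    // Option a): plain search and replace, no regex involved.
    static String viaReplace(String text, String target, String replacement) {
        return text.replace(target, replacement);
    }

    // Option b): still replaceAll, but both arguments are quoted automatically.
    static String viaQuoting(String text, String target, String replacement) {
        return text.replaceAll(Pattern.quote(target), Matcher.quoteReplacement(replacement));
    }

    public static void main(String[] args) {
        System.out.println(viaReplace("Rate: [%PARAM%]", "[%PARAM%]", "100%")); // Rate: 100%
        System.out.println(viaQuoting("Cost: [%PARAM%]", "[%PARAM%]", "$5"));   // Cost: $5
    }
}
```

The "$5" case is where the replacement quoting matters: passed unquoted to replaceAll, it would be read as a reference to capture group 5.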

Regular Expression + Find and Replace

Hey there- I have a question about regExp and the Find and Replace. Basically I want to search a wildcard between a href tag, how would that look, because the code below does not work.
countryLink = "<a href=\"http://www.whateve.com\" target=\"_parent\">";
[code]
countryLink = "([^"]*)";
[code]
Thanks! Any help is appreciated!
Also, how do i add code blocks to this forum?

Yes, I meant the <a> tag, but thank you for showing the href attribute solution as well. This solved my issue. Thanks! I thought I would show what I did with your code, in case someone is interested in using it to convert a JavaScript string to XML.
Query this:
countryLink = "<a href=\"http://www.whateve.com\" target=\"_parent\">";
Add this to the Find box:
countryLink = "<a href=\\"([-\w:/.?=&;]+)\\" target=\\"_parent\\">";
Add this to the Replace box:
<countryLink>$1</countryLink>
This creates the following output:
<countryLink>http://www.whateve.com</countryLink>
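The same capture-and-reuse idea can be sketched in Java, where $1 in the replacement string likewise refers to the first capturing group. This is only an illustration of the technique, not the editor's Find and Replace dialog: the class name HrefToXml is made up, and the input is assumed to be the plain tag without the JavaScript backslash escapes.

```java
import java.util.regex.Pattern;

class HrefToXml {
    // Group 1 captures everything between the quotes of the href attribute.
    static final Pattern LINK =
            Pattern.compile("<a href=\"([^\"]*)\" target=\"_parent\">");

    // Replaces each matching <a> tag with a <countryLink> element holding the URL.
    static String convert(String html) {
        return LINK.matcher(html).replaceAll("<countryLink>$1</countryLink>");
    }

    public static void main(String[] args) {
        System.out.println(convert("<a href=\"http://www.whateve.com\" target=\"_parent\">"));
        // prints <countryLink>http://www.whateve.com</countryLink>
    }
}
```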

Substr, regexp & clean-up

Hallo everyone,
I am loading a file with some extremely messed-up data that I am trying to 'unpick'. The basic method that I am using is the following where with (many such) if clauses, I look for some key text (here STNR:AT) and then extrapolate from there where the core information is.
In the example below I find the text and then take the 'AT' and the remainder of the string ...
if INSTR(p_POSTEXT, 'STNR:AT', 1) > 0 then
     RetVal := SUBSTR(p_POSTEXT, INSTR(p_POSTEXT, 'STNR:AT', 1) + 5);
end if;
This works fine but will be a lot of effort to program and so I am looking for short-cuts.
The first thing I am wondering is whether I can 'preserve' the value returned by the INSTR call in the if-clause and use this value in the processing (to get RetVal) without having to call INSTR a second time. I know how to do this in VBA but am not sure of the syntax in PL/SQL.
The second thing, I was thinking might help is the use of regular expression (regexp) processing but cannot see what advantage this would give me over SUBSTR and INSTR.
Anyway, if anyone has any help for me with this, then that would be a great help.
Regards and thanks,
Alan Searle

Hi Michael,
Many thanks for the tip and, yes, I took a deep look at 'REGEXP' and found a whole stack of features that I could use. In the end I came up with this code below which really simplified matters for me.
Regards and many thanks,
Alan.
CREATE OR REPLACE FUNCTION fnc_at_bearb(p_POSTEXT varchar) RETURN VARCHAR AS
PRAGMA AUTONOMOUS_TRANSACTION;
strATNR varchar2(255);
numStart number;
numEnd number;
strRetVal varchar2(255);
BEGIN
numStart := REGEXP_INSTR(p_POSTEXT, 'AT', 1, 1, 1);
if numStart = 0 Then
     numStart := REGEXP_INSTR(p_POSTEXT, 'STOERANF.NR.|STNR:AXT|STOENR|STOENR|STOERNR|STOEM|STNR|STBNR|STOE|SNR', 1, 1, 1);
end if;
if numStart > 0 then
     numEnd := REGEXP_INSTR(p_POSTEXT, '[[:digit:]]+', numStart, 1, 1);
     strATNR := TRIM(REGEXP_REPLACE(SUBSTR(p_POSTEXT, numStart, numEnd - numStart), '[[:punct:]]', Null));
     strRetVal := SUBSTR(SUBSTR('AT000000000000000', 1, 15 - LENGTH(strATNR)) || strATNR, 1, 15);
else
strRetVal := numStart;
end if;
RETURN strRetVal;
END;
/

MODEL clause to process a comma separated string

Hi,
I'm trying to parse a comma separated string using SQL so that it will return the parsed values as rows;
eg. 'ABC,DEF GHI,JKL' would return 3 rows;
'ABC'
'DEF GHI'
'JKL'
I'm thinking that I could possibly use the MODEL clause combined with regular expressions to solve this, as I've already got a bit of SQL which does the opposite, i.e. turning the rows into one comma-separated string:
select id, substr( concat_string, 2 ) as string
from (select 1 id, 'ABC' string from dual union all select 1, 'DEF GHI' from dual union all select 1, 'JKL' from dual)
model
return updated rows
partition by ( id )
dimension by ( row_number() over (partition by id order by string) as position )
measures ( cast(string as varchar2(4000) ) as concat_string )
rules
upsert
iterate( 1000 )
until ( presentv(concat_string[iteration_number+2],1,0) = 0 )
( concat_string[0] = concat_string[0] || ',' || concat_string[iteration_number+1] )
order by id;
Can anyone give me some pointers how to parse the comma separated string using regexp and create as many rows as needed using the MODEL clause?

Yes, you could do it without using ITERATE, but FOR ... INCREMENT is pretty much the same loop. A couple of improvements:
a) there is no need for the CHAINE measure
b) there is no need for CASE in the RULES clause
c) NVL can be applied at the measures level
SQL> with t as (select 1 id, 'ABC,DEF GHI,JKL,DEF GHI,JKL,DEF GHI,JKL,DEF,GHI,JKL' string from dual
        union all
         select 2,'MNO' string from dual
        union all
         select 3,null string from dual
       )
     SELECT id,
            string
     FROM   T
       MODEL
        RETURN UPDATED ROWS
        partition by (id)
        DIMENSION BY (0 POSITION)
        MEASURES(
                 string,
                 NVL(LENGTH(REGEXP_REPLACE(string,'[^,]+','')),0)+1 NB_MOT
                )
        RULES
        (
         string[FOR POSITION FROM 1 TO NB_MOT[0] INCREMENT 1] = REGEXP_SUBSTR(string[0],'[^,]+',1,CV(POSITION))
        )
     /
        ID STRING
         1 ABC
         1 DEF GHI
         1 JKL
         1 DEF GHI
         1 JKL
         1 DEF GHI
         1 JKL
         1 DEF
         1 GHI
         1 JKL
         2 MNO
        ID STRING
         3
12 rows selected.
SQL> SY.

Regular Expressions in Oracle

Hello All,
I come from Perl scripting language background. Perl's regular expressions are rock solid, robust and very fast.
Now I am planning to master Regular Expressions in Oracle.
Could someone please point the correct place to start with it like official Oracle documentation on Regular Expressions, any good book on Regex or may be any online link etc.
Cheers,
Parag
Edited by: Parag Kalra on Dec 19, 2009 11:03 AM

Hi, Parag,
Look under R in the index of the SQL language manual (http://download.oracle.com/docs/cd/B28359_01/server.111/b28286/index.htm#R). All the regular expression functions and operators start with "REGEXP", and there are a couple of entries under "regular expressions".
That applies to the Oracle 11 and 10.2 documentation. Regular expressions were hidden in the Oracle 10.1 SQL Language manual; you had to look up some similar function (like REGR_SYY, itself hidden under S for "SQL Functions") and then step through the pages one at a time.
Sorry, I don't know a good tutorial or introduction.
If you find something hopeful, please post a reference here. I think a lot of people would be interested.
