Regular expressions for URLs to match extensions

I am working on a simple method that will assign a specific extension
(e.g. ".jsp", ".php", ".cfm", etc.) to the end of a URL if it doesn't
find anything marking a valid extension, however, I do not want to add
an extension if one is found.
Consider my code:
import java.util.regex.Pattern;
public static final String urlEndSlashPattern = "/?";
public static final String urlQSPattern = "\\??([a-zA-Z0-9\\-_\\.]
+=[^&]*&?)*";
public static final String urlAnchorPattern = "#[^#]*$";
public static void addExtToUrl(String url, String myExt, String[]
exts) {
   StringBuffer sb = new StringBuffer();
   boolean hasExt = false;
   for (int i = 0; i < exts.length; i++) {
sb.append(".").append(exts).append(urlEndSlashPattern).append(urlQSPatte�rn).append(urlAnchorPattern);
if (Pattern.matches(sb.toString(), url)) {
hasExt = true;
sb = new StringBuffer();
if (!hasExt) {
url += "." + myExt;
The issue I want to bring up is the regular expression pattern I'm
using appears to fail.  I want to check and see if the URL I provide
ends with a valid extension, followed by optional "/" or a query
string or an anchor or any combination of these.
Like say if I have
http://www.blah.com/index.html
Then don't add the ".jsp" extension
But if I have
http://www.blah.com/registration/
Then I *want* to add the ".jsp" extension:
http://www.blah.com/registration.jsp
Or if I have:
http://www.blah.com/registration/?foo=bar#baz
Then it needs to change to
http://www.blah.com/registration.jsp?foo=bar#baz
But if I have
http://www.blah.com/registration/index.php?foo=bar#baz
Then I do *not* add the ".jsp" extension.
Hope that makes sense now.  Bottom line is that the pattern above
doesn't seem to work.  Ideas?
Thanks                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                               

Ok, just for fun, let's try it.
URL: http://www.blah.com/?bar=baz#anchor
Absolutely valid URL and something possible (remember, Balus, this is from a database entry, thus, it can be anything at all!)
Let's split onto the last /
Your last entry will be
?bar=baz#anchor
So if I re-insert the .jsp that it would not find I get
http://www.blah.com/.jsp?bar=baz#anchor
INVALID URL
Now let's try this one, BalusC:
http://www.blah.com/index.jsp?forwardURL=http://www.wee.com/gotcha_balus/
Now we split according to the ... last /
http://www.blah.com/index.jsp?forwardURL=http://www.wee.com/gotcha_balus.jsp
Valid URL, but possibly wrong when the query string value for "forwardURL" is read, thus, making the URL ultimately wrong.
Two examples that show that splitting by the last / can't work here.

Similar Messages

  • Regular expressions for URLs

    Hi everyone,
    I have a question about regular expressions.
    Let's say I want my program to extract last 10-digits from any URL that will be found (every found URL will end up on 10digit number!) and insert that number in the middle of other URL.
    Would anyone tell me please how to do that?
    Thank you

    I am not sure how to do that either...
    Actually I just figured out that there is no garantee that the URL will be ended on 10-digit number.
    Ok, my program is meant to search for the movie info on Yahoo (user enters keyword to search and chooses either 'title', 'actor', 'trailer', 'review' in drop-down menu). After the 'search' button is clicked the appropriate page is supposed to be found.
    For example, if the user types in 'shrek' and chooses 'trailer', the result is supposed to be this link http://movies.yahoo.com/movie/1808405861/trailer and not the
    following ones:
    http://movies.yahoo.com/mv/search?p=shrek
    or
    http://movies.yahoo.com/shop?d=hv&cf=info&id=1808405861
    So in my program the line for the 'title' search is
    url = "http://movies.yahoo.com/mv/search?type=all&p=";and it works for the titles. I thought if the found link has 10 -digit number on the end I can somehow 'catch' that number and insert into another link -so the page with trailers would be pulled up (it's an ID number in yahoo database) .
    But now since I am not sure if 10-digit number is going to be in the found link at all, I have no idea how to 'catch' that number.
    Does anyone have any ideas for my case?

  • Need a regular expression for URL

    Regular expression: ^(udp|norm)://(?:(?:25[0-5]|2[0-4]\\d|[01]\\d\\d|\\d?\\d)(?(?=\\.?\\d)\\.)){4}:(6553[0-5]|655[0-2][0-9]\\d|65[0-4](\\d){2}|6[0-4](\\d){3}|[1-5](\\d){4}|[1-9](\\d){0,3})$
    Source: udp://125.3.4.121:23456
    Getting error:java.util.regex.PatternSyntaxException: Unknown inline modifier near index 54
    ^(udp|norm)://(?:(?:25[0-5]|2[0-4]\d|[01]\d\d|\d?\d)(?(?=\.?\d)\.)){4}:(6553[0-5]|655[0-2][0-9]\d|65[0-4](\d){2}|6[0-4](\d){3}|[1-5](\d){4}|[1-9](\d){0,3})$

    paulcw wrote:
    I can't help wondering if a simpler solution would involve creating a java.net.URL object, passing it the string to be checked, and let URL do the parsing and an initial pass at checking the syntax. Then you could query URL for individual components for further checking (e.g., getting the transfer type to make sure it's one of the kinds you want). This approach would probably be more self-documenting, if nothing else.Nice idea, but java.net.URL doesn't actually do much work at all. It basically takes the protocol part, looks up a handler for it, and delegates everything else to that. The handlers themselves are down in the sun packages (or whatever your Java impl is) and since there isn't one for the OPs protocol, he'd have to write and register his own anyway, which kind-of negates the entire purpose of leveraging it.
    @OP: Paul's suggestion can still be of use. Apache, Codehaus and the like will have written their own protocol handlers, it's worth grabbing the source code for, say, Apache HttpClient, and seeing if they've got anything useful in there. Then again, who's to say that what's a valid URL for this 'norm' protocol, other than whoever designed the protocol?

  • How to form a regular expression for matching the xml tag?

    hi i wanted to find the and match the xml tag for that i required to write the regex.
    for exmple i have a string[] str={"<data>abc</data>"};
    i want this string has to be splitted like this <data>, abc and </data>. so that i can read the splitted string value.
    the above is for a small excercise but the tagname and value can be of combination of chars/digits/spl symbols like wise.
    so please help me to write the regular expression for the above requirement

    your suggestion is most appreciable if u can give the startup like how to do this. which parser is to be used and stuff like that

  • Regular expressions, using pattern and matcher but not include the pattern

    Hi all
    i have a regular expression but in my matcher it is including the text that is in my regular expression.
    ie
    String str="-------------------stuff===========";
    Pattern mainBody = Pattern.compile("-----(.*?)=====", Pattern.MULTILINE);this matches -------------------stuff=====
    now i expect to get some - in there as i match from the start, but i dont want to have the = in the match. how do i do a match that excludes the matching expressions.

    nevermind, figured it out
    when i do myMatch.group(1) it gives me just my match
    sorry for wasting time :)

  • Using regular expressions for validation in i18n

    Can we use regular expressions for validation of inputs in a java application taking care of i18N aspects too. Zip code for different locales are different. Can we use regular expressions to validate zipcode inputs from different locales

    hi,
    For that shall i have to create individual patterns for matching the inputs from different locales or a single pattern will do in the case of validating phone nos. around the world, zip codes etc. In case different patterns are required, programmer should have a konwledge of difference in patters for different locales.
    regards
    sdas

  • Regular Expression For Dreamweaver

    I still haven't had the time to really become a professional when it comes to regular expressions, and sadly I am in need of one an finding it difficult to wrap my head around.
    In a text file I have hundreds of instances like the following:
    {Click here to visit my website}{http://www.adobe.com/}
    I need a regular expression for Dreamweaver that I can run within the "Find and Replace" window to switch the order of the above elements to:
    {http://www.adobe.com/}{Click here to visit my website}
    Can anyone provide some guidance? I'm coming up short due to my lack of experience with regular expressions.
    Thank you in advance!

    So you have a string that starts { and goes until the first }.  Then you have another string exactly the same.  And you want to swap them.  I'm not making any assumption that the second one has to look like a URL (that's a whole other minefield, but perhaps you could do something simple like it must start with http). 
    You don't specify how your text file is divided up, have you got this as a complete line to itself, or is it just  a huge block of text?  Preferably as individual lines.
    I don't have Dreamweaver, but this worked for me in Notepad++
    Find: ^{(.*?)}{(.*?)}$
    Replace with: {\2}{\1}
    My file looked like this:
    {Click here to visit my website}{http://www.adobe.com/}
    {some other site}{http://www.example.com/foo}
    And doing a Replace All ended up like this:
    {http://www.adobe.com/}{Click here to visit my website}
    {http://www.example.com/foo}{some other site}

  • Regular Expression for PathName???

    Anyone have a "ready to go" regular expression for detecting a pathname?
    for example I need to detect the following:
    myfile.txt
    ./myfile.txt
    ../my-file.ini
    /home/my-home/myFile.foo
    etc.
    Now, in a perfect world, it could also do Windows (or ANY OS for that matter) pathnames (though this is not terrbibly important for my case at least).
    TIA,
    /m

    import java.util.regex.*;
    * @author  Ian Schneider
    public class FileRegex {
        static Pattern pattern;
        /** Creates a new instance of FileRegex */
        public FileRegex() {
        public Pattern getPattern() {
            if (pattern == null) {
                pattern = Pattern.compile("([\\/]?(\\w+|\\.|\\.\\.)[\\/])*(\\w+)\\.?(\\w+)?");
            return pattern;
        public String[] parts(String path) {
            Matcher m = getPattern().matcher(path);
            if (m.find()) {
                return new String[] { m.group(1),m.group(3),m.group(4) };
            return null;
        public boolean matches(String path) {
            return getPattern().matcher(path).matches();
        public static final void main(String[] args) throws Exception {
            FileRegex regex = new FileRegex();
            String[] files = {
                "myfile.txt",
                "../myfile.txt",
                "./myfile.txt",
                "/a/b/c/myfile.txt",
                "/a/../myfile.txt",
                "myfile"
            for (int i = 0, ii = files.length; i < ii; i++) {
                System.out.println( files[i] + " match " + regex.matches(files));
    String[] pieces = regex.parts(files[i]);
    if (pieces != null)
    System.out.println(" path : " + pieces[0] + " file : " + pieces[1] + " ext : " + pieces[2]);
    I will leave it to you as an excercise to add support for spaces in path names, different separator characters, etc..

  • Regular Expression for IPAddress

    Hello members.....
    I am a new member to this forum
    I am in need of the Regular Expression for IPAddress...
    "[0-9]{1,3}\\.[0-9]{1,3}\\.[0-9]{1,3}\\.[0-9]{1,3}"....is the expression i wrote..But it is taking 0.0.0.0 as a valid IPAddress. (0.0.0.0 is not a valid IPAddress)
    Please reply....awaiting
    Rajeshwar

    I am in need of the Regular Expression for
    IPAddress...
    "[0-9]{1,3}\\.[0-9]{1,3}\\.[0-9]{1,3}\\.[0-9]{1,3}"
    ....is the expression i wrote..But it is taking
    0.0.0.0 as a valid IPAddress. (0.0.0.0 is not a valid
    IPAddress)Your regex matches "999.999.999.999", which (of course) isn't a vaild IP address as well.
    This one is closer, but still allows 0.0.0.0:
    \\b(25[0-5]|2[0-4][0-9]|[01]?[0-9][0-9]?)\\.(25[0-5]|2[0-4][0-9]|[01]?[0-9][0-9]?)\\.(25[0-5]|2[0-4][0-9]|[01]?[0-9][0-9]?)\\.(25[0-5]|2[0-4][0-9]|[01]?[0-9][0-9]?)\\bBut why not roll your own method which checks an IP address?

  • What is regular expression for cheking a set operation

    I want a regular expression for checking the syntax of the set operation given as input
    ex:
    the input should be
    [1,2,3]+[2,3] or [1,2]-[2] or [1,2,3]*[1,2]
    is should check the square bracket and the operator between two set operands

    I think you should code that from scratch. When you validate input, you usually want to tell the user what they did wrong, but a regex will only tell you whether it matches or not.

  • Wat should be the regular expression for string MT940_UB_*.txt to be used in SFTP sender channel in PI 7.31 ??

    Hi All,
    What should be the regular expression for string MT940_UB_*.txt and MT940_MB_*.txt to be used as filename inSFTP sender channel in PI 7.31 ??
    If any one has any idea on this please let me know.
    Thanks
    Neha

    Hi All,
    None of the file names suggested is working.
    I have tried using - MT940_MB_*\.txt , MT940_MB_*.*txt , MT940*.txt
    None of them is able to pick this filename - MT940_MB_20142204060823_1.txt
    Currently I am using generic regular expression which picks all .txt files. - ([^\s]+(\.(txt))$)
    Let me know ur suggestion on this.
    Thanks
    Neha Verma

  • Regular Expression for a Person's Name

    Hi,
    I am using the org.apache.regexp package and trying to find the regular expression for a person's name. It allows only the alphabetic string.
    I tried [a-zA-Z]+. But this also accepts the thing like "BUSH88", which is not what I want...
    Can anybody help me figure this out?
    Thanks in advance,
    Tong

    Hi,
    I am using the org.apache.regexp package and trying to
    find the regular expression for a person's name. It
    allows only the alphabetic string.
    I tried [a-zA-Z]+. But this also accepts the thing
    like "BUSH88", which is not what I want...
    Can anybody help me figure this out?
    Thanks in advance,
    Tongtry this:
    ^[a-zA-Z]+$
    the ^ represents the start of the String and the $ represents the end.
    So the expression is saying: "between the beginning and the end of the String there will only be alphbetical characters"

  • How to write the regular expression for Square brackets?

    Hi,
    I want regular expression for the [] ‘Square brackets’.
    I have tried to insert in the below code but the expression not validate the [] square brackets.
    If anyone knows please help me how to write the regular expression for ‘[]’ Square brackets.
    private static final Pattern DESC_PATTERN = Pattern.compile("({1}[a-zA-Z])" +"([a-zA-Z0-9\\s.,_():}{/&#-]+)$");Thanks
    Raghav

    Since square brackets are meta characters in regex they need to be escaped when they need to be used as regular characters so prefix them with \\ (the escape character).

  • Need a regular expression for the text field

    Hi ,
    I need a regular expression for a text filed.
    if the value is alphanumeric then min 3 char shud be there
    and if the value is numeric then no limit of chars in that field.[0-9].
    Any help is appriciated...
    thanks
    bharathi.

    Try the following in the change event:
    r=/^[a-z]{1,3}$|^\d+$/i;
    if (!r.test(xfa.event.newText))
    xfa.event.change="";
    Kyle

  • Regular Expression for /, \, #, -, & ‘

    Hi,
    Can anybody tell me the regular expression for provided characters.
    Code is preferable.
    Thanks in advance.

    "[-/\\\\#&']"

Maybe you are looking for

  • Deliveries not getting generated through VL10H

    Hi All, I've a scenario where users are trying to create deliveries against Sales Order through VL10H but it displays an error log "Delivery group 005 is not complete (delivery group will be deleted)". But while using VL04 no erros are there. This is

  • PC Suite/no device recognition

    I have downloaded the latest version of PC Suite 6.85 and installed on my PC (windows XP home edition service pack 2) My phone is a Nokia 6233. I follow the procedure and connect to the PC with the USB cable supplied. After a while I get the message

  • How to include jar file ??

    Hi, I want to use some classes from a .jar file in my application. In NT I have placed the .jar file in ....iplanet/ias6/ias/lib/java and included the path in CLASSPATH variable in .../iplanet/ias6/ias60/lib/IAB . I am facing problem in doing the sam

  • RabbitMQ/AMQP messaging: example for LabVIEW

    RabbitMQ (http://www.rabbitmq.com/) is an open source message broker based on the AMQP protocol (http://www.amqp.org/). I plan to use it to connect data processing modules written in different programming languages (LabVIEW, Ruby, Python and Java) an

  • Text missing in po preview

    HI In me21n t.code in header i maintained text in header text its shows in po preview,if i maintain text in header note its not showing in preview issue,kindly help Regards Sam