Xml regex help - can't use xml parser

Hi,
I receive XML documents from a legacy system and use JAXB to parse them into Java objects.
The xml looks roughly like this:
<root-Tag>
   <general-Info-Tag-1>blah</general-Info-Tag-1>
   <general-Info-Tag-2>blah</general-Info-Tag-2>
   <general-Info-Tag-3>blah</general-Info-Tag-3>
   <entry-Tag>
      <entry-info-Tag-1>info</entry-info-Tag-1>
      <entry-info-Tag-2>info</entry-info-Tag-2>
       etc...
   </entry-Tag>
   <entry-Tag>
      <entry-info-Tag-1>info</entry-info-Tag-1>
      <entry-info-Tag-2>info</entry-info-Tag-2>
       etc...
   </entry-Tag>
</root-Tag>
The xml contains a root tag.
The root element contains some general info tags and entry tags.
The entry elements contain some entry relevant info inside nested tags.
It's important to note that the xml is not nested any deeper than the entry-info tags.
My Problem:
The info in the entry-info-tags sometimes contains illegal chars like <,>,& and is not wrapped
properly by a CDATA section.
So I need to find these elements and wrap them with <![CDATA[info]]> before using JAXB.
How can I achieve this using Pattern/Matcher? I've been unsuccessful so far...
Edited by: Samuelz on Jul 1, 2010 1:58 AM

Here's what I've got so far:
A helpful function to print out matches for regex on an input string:
private static void printMatches(final String regex, final String input) {
     Pattern p = Pattern.compile(regex, Pattern.DOTALL);
     Matcher m = p.matcher(input);
     while (m.find()) {
          System.out.println("found:");
          // prints groups 0 .. groupCount()-1, so the last group is not shown
          for (int i = 0; i < m.groupCount(); i++)
               System.out.println(i + ":" + m.group(i));
     }
}
My input string for testing:
String s =
"<entries>\n"+
     "<entry>\n" +
          "<TagName>sam&max</TagName>\n" +
     "</entry>\n" +
     "<entry>\n" +
          "<TagName><![CDATA[sam&max]]></TagName>\n" +
     "</entry>\n" +
"</entries>\n";
Regex to get the contents of TagName:
printMatches("<(TagName)>((.*?))</\\1>+", s);
found:
0:<TagName>sam&max</TagName>
1:TagName
2:sam&max
found:
0:<TagName><![CDATA[sam&max]]></TagName>
1:TagName
2:<![CDATA[sam&max]]>
Regex to get the content of TagName when it's wrapped by CDATA:
printMatches("<(TagName)><!\\[CDATA\\[((.*?))\\]\\]></\\1>", s);
found:
0:<TagName><![CDATA[sam&max]]></TagName>
1:TagName
2:sam&max
I'm trying to build a regex that will find the contents of TagName without knowing whether it's already wrapped by CDATA, and then replace each match with something like <$1><![CDATA[$2]]></$1>
How do I "combine" these 2 regular expressions?
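
One way to combine the two patterns, offered as a minimal sketch rather than a definitive fix: make the CDATA opener and closer optional non-capturing groups, so the content always lands in group 2 whether or not it was already wrapped. The class name CdataWrapper and the method wrap are made up for illustration, and the tag name is hard-coded to TagName as in the test input above. The sketch assumes the content never contains ]]> itself and that the matched tags are not nested.

```java
import java.util.regex.Pattern;

public class CdataWrapper {
    // <(TagName)>        group 1: the tag name
    // (?:<!\[CDATA\[)?   optional CDATA opener, not captured
    // (.*?)              group 2: the content (lazy, DOTALL so it may span lines)
    // (?:\]\]>)?         optional CDATA closer, not captured
    // </\1>              closing tag matching group 1
    private static final Pattern TAG = Pattern.compile(
            "<(TagName)>(?:<!\\[CDATA\\[)?(.*?)(?:\\]\\]>)?</\\1>",
            Pattern.DOTALL);

    // Hypothetical helper: rewrites every match so its content is wrapped in CDATA exactly once.
    public static String wrap(String xml) {
        return TAG.matcher(xml).replaceAll("<$1><![CDATA[$2]]></$1>");
    }

    public static void main(String[] args) {
        String s =
            "<entries>\n" +
            "<entry>\n<TagName>sam&max</TagName>\n</entry>\n" +
            "<entry>\n<TagName><![CDATA[sam&max]]></TagName>\n</entry>\n" +
            "</entries>\n";
        System.out.println(wrap(s));
    }
}
```

Because an already wrapped element is matched with its CDATA markers outside group 2, running wrap on its own output should leave the text unchanged. For the real documents, (TagName) would presumably be replaced by an alternation or pattern covering the entry-info tag names.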

Similar Messages

  • Using Siri asking any localised question in the UK, it responds with 'cannot help' can only use US locations etc. and in US English. Have I missed something or set something up incorrectly? I also thought it would interface with your Facebook but it just

    Using Siri asking any localised question in the UK, it responds with 'cannot help' can only use US locations etc. and in US English. Have I missed something or set something up incorrectly? I also thought it would interface with your Facebook but it just comes back with a 'I can't help you with Facebook' message

    Yeah, Siri - a headline feature on the iPhone 4s page - is beta at the mo.
    But then anyone who's used Macs for a few years will know that most new software is pretty much still in beta when it's released anyway. Only Apple could get away with it and retain customers

  • Need Help Can i use Merge command along with exist function in oracle?

    I am using Merge command to update the destination table and updating the rows which are already in the destination table.
    But what i want is to delete the existing rows from the destination table and insert fresh rows instead of updating the existing rows in the destination table.
    So can we use exist function to check the existing rows and delete them and use merge command to insert the rows in the table.

    You definitely need to do a DELETE then INSERT since MERGE will not delete rows, although I'm not really sure what that gets you since the net effect would be the same as a MERGE over the same pair of tables.
    If you really want to do it this way, then I would likely do something like:
    DELETE FROM target_table
    WHERE (columns_you_would_match_on) IN (SELECT columns_you_would_match_on
                                           FROM source_table
                                           WHERE predicate_you_would_use_in_using);
    INSERT INTO target_table (column_list)
    SELECT column_list
    FROM source_table
    WHERE predicate_you_would_use_in_using;
    John

  • Nokia 5000d help (can i use this is spain with a u...

    Hiya Guys
    Can someone please tell me if i can use a Nokia 5000d Mobile in spain with a UK T-MOBILE SIM?
    Many Thanks

    Thanks for your reply, Im not sure if its sim unlocked but it currently works with a T-mobile uk sim card here in the uk, What id like to know is will the phone work in spain because im not sure if its tri band or a dual band phone & i cant seem to find out. If you know this you'd be a great help. Thanks

  • Please help - Can not use stored procedure with CTE and temp table in OLEDB source

    Hi,
       I am going to create a simple package. It has OLEDB source , a Derived transformation and a OLEDB Target database.
    Now, for the OLEDB Source, I have a stored procedure with a CTE and many temp tables inside it. When I call it with EXEC <Procedure name>, I get an error like "The metadata could not be determined because statement with CTE.......uses temp table".
    Please help me how to resolve this ?

    you write to the temp tables that get created at the time the procedure runs I guess
    Instead do it a staged approach, run Execute SQL to populate them, then pull the data using the source.
    You must set retainsameconnection to TRUE to be able to use the temp tables
    Arthur My Blog

  • Help: can't use class: Cipher, Signature, MessageDigest

    Hi all,
    When I use these classes (Signature or SignatureMessageRecovery, MessageDigest, Cipher) in my applet and send the CreateApplet APDU to the applet, I get an error (SW1SW2=0x6444) and the CREF throws the exception SYSTEMEXCEPT_NO_TRANSIENT_SPACE.
    If I change the Signature algorithm to ALG_RSA_SHA_ISO9796_MR and
    cast the Signature Object to SignatureMessageRecovery, I get the same error.
    If I change the Signature algorithm to ALG_DES_MAC8_ISO9797_M2, the applet return success(SW1SW2=9000).
    somebody help me!~
    Thanks!
    code:
    private Signature signature;
    private MessageDigest digest;
    private Cipher cipher;
    private void initSecurityData(){
        digest = MessageDigest.getInstance(MessageDigest.ALG_SHA, false);
        signature = Signature.getInstance(Signature.ALG_RSA_SHA_PKCS1, false);
        cipher = Cipher.getInstance(Cipher.ALG_DES_CBC_ISO9797_M2, false);
    }
    protected CreditCard(byte[] bArray, short bOffset, byte bLength){
        initSecurityData();
        byte aidLen = bArray[bOffset];
        if (aidLen == (byte)0){
            register();
        } else {
            register(bArray, (short)(bOffset+1), aidLen);
        }
    }

    The file is saved as the class name.
    Here's what happens:
    C:\jdk1.2.2\bin>javac HelloWorld.java
    C:\jdk1.2.2\bin>java HelloWorld
    Exception in thread "main" java.lang.NoClassDefFoundError: HelloWorld
    Here's the code that I'm trying to get to run:
    // The HelloWorld application program
    public class HelloWorld {
         public static void main(String argv[]) {
              System.out.println("Hello World!");
         }
    }

  • Help: Can you use software trigger on a digital line.

    Hi:
    We have a legacy DAQCard-700 which does not support hardware triggering.
    We have a trigger from our instrument that is 5 V+ and we would like to
    trigger when it drops to 0 V. We have several questions.
    1. Do we need to invert the signal in MAX or is that only for V that are
    negative?
    2. Is there a good example in the examples where the state of this digital
    line is used to trigger an analog acquisition? When we tried using the
    digital trigger examples, we had an error saying that the hardware did not
    support that mode.
    3. We have an analog software trigger mode (based on the Analog Software
    Trigger example) working which we would like to modify over to read the
    digital line, but we have done very little with digital I/O.
    4. The digital trigger has been assigned a virtual channel of dtrg.
    Any help would be much appreciated.
    Thanks in advance,
    Pete

    Thanks Doug,
    If we read the digital line instead of as an analog line would it improve
    the accuracy of the triggering. Everything works now except we have a
    little bit of timing jitter within about 1 data point scanning at 50 kHz.
    However, I've never done any digital i/o with LabView, but maybe I should work
    through some of the tutorials. If you thought that this might solve the
    jitter problem. Would checking the state of the digital line allow a faster
    response with softtrig, I guess is my question?
    Pete
    "Doug Norman" wrote in message
    news:[email protected]..
    > Hello Pete,
    >
    > You are correct that the DAQCard-700 has no digital (or analog)
    > hardware trigger. The analog trigger example that is working for you
    > is using conditional retrieval. This is where data is always being
    > acquired and the driver looks at the values to determine when to
    > "trigger" and read the data into LabVIEW. To answer your questions:
    > 1. I don't think you need to invert the signal. This is for when you
    > want a digital low (below 0.8 volts) to show up as a digital high, and
    > a high (above 2.0 volts) to be read as a low.
    > 2. I don't know of a good example. You would basically have to
    > monitor the digital line. When it goes from high to low you would
    > then start your analog acquisition.
    > 3. I think this could be your best bet. If you have enough analog
    > input lines, why not just connect this digital signal as one of your
    > analog inputs. Then use this example to trigger when the 5 volts
    > drops to 0. It won't hurt to acquire your digital signal on an analog
    > input along with your other data.
    > 4. I don't understand this question.
    >
    > Best Regards,
    >
    > Doug Norman

  • Help can't use my credit card.

    I recently had a billing snafu and entered the wrong card number when there wasn't enough money on it. Now that it is reloaded I want to use it and I can't. How do I remedy this?

    Go to Settings > Store
    Tap on your Apple ID
    Tap on Billing Information
    Delete, then re-enter your billing information

  • Help,can not use oms client!

    oracle 9.2.0.1 on Redhat linux 7.3
    I install oms on linux and installed client on my win2000
    but when I use client connect to my OMS,it throw a Exception:java.lang.NullPointerException

    Hi Hungnghiem1964,
    Have you checked that your phone is on General profile and the ringer volume settings is not at the minimum? And have you restarted your phone after the installation, as that might help. If you are still encountering problems, then it is probably best to contact the developer of this application by either Googling for them and seeing if they have a website / contact, or to contact them via the Nokia Store.

  • Corrupt fonts? PLEASE HELP CAN'T USE PROGRAMS!!

    When I boot up my computer and try to open Mail, it won't open. It stalls and I have to force quit. This also happens with Safari, System Preferences, etc., but I can get into Photoshop and some other non-system programs. I have logged out and gone into another profile, and everything works fine there. Sometimes when I log back into my profile everything works fine, but other times it freezes at login and I have to restart and try all over again. I have loaded quite a few fonts on my profile and I believe that some of the fonts, potentially system fonts, have been corrupted and that's affecting the programs. My question is: is there a way to fix this problem without having to reinstall OS X?

    Hi Jason,
    First see if it's just corrupt font cache files. Download and run Font Finagler. If that doesn't do it and you're using Font Book, try this. Follow the steps in Undoing Font Book. Note that you will lose any font collections you have created. Doing this will also clear the system's font cache files.
    If after that you're still having trouble, then the fonts are the likely problem. To replace all fonts OS X came with without having to install the entire OS, follow the instructions at the bottom of my article, Font Management in Mac OS X Tiger and Panther.
    If after that the problems persist, then one or more of the third party fonts you've added are bad.
    The link, or one of the links above directs you to my personal web site. While the information is free, it does ask for a contribution. As such, I am required by Apple's rules for these discussions to include the following disclaimer.
    I may receive some form of compensation, financial or otherwise, from my recommendation or link.
    Edit: Heh! roam beat me.

  • Need help can not use my Photoshop Elements

    Hi
    I bought the Samsung Series7 Ultra "CORE i5" from Bestbuy,
    but every time I try to open Photoshop Elements, I can't. Why? Error 213:19

    http://helpx.adobe.com/photoshop-elements/kb/licensing-problem.html

  • Help - Can't use regular keyboard for typing!

    My torch is not allowing me to type with the regular keyboard.  I tried re-installing the software and cleaning the phone and this still isn't helping.
    Any suggestions?

    Hello onfire99jr,
    Welcome to the BlackBerry Support Community.
    The following article will assist you regarding the issue you are experiencing with the keyboard on your BlackBerry® smartphone:
    Trackpad, trackball, or keyboard not working on a BlackBerry Smartphone - http://www.blackberry.com/btsc/KB29640
    We hope this helps.
    Thank you.
    -FB
    Come follow your BlackBerry Technical Team on Twitter! @BlackBerryHelp
    Be sure to click Kudos! for those who have helped you.
    Click "Accept as a Solution" for posts that have solved your issue(s)!

  • Is Oracle XML parser thread safe ?

    Hi,
    Is XML Parser thread safe ? In other words, if I call parse() from one thread, can I call the same method (or a different method) from another thread ? Is it safe ?. Is there an API that will let me use parser in a thread -safe way ?
    Thanks
    Vissu

    No, the parser is not thread safe. But you can re-use the parser in the same or a diff thread after the current parse completes.
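
    A common way to live with a parser that is not thread safe is to give each thread its own instance and reuse it only after the previous parse has finished. The sketch below uses the standard JAXP DocumentBuilder as a stand-in, since the Oracle parser classes are not shown in this thread; PerThreadParser and rootName are hypothetical names chosen for illustration.

```java
import java.io.ByteArrayInputStream;
import java.nio.charset.StandardCharsets;
import javax.xml.parsers.DocumentBuilder;
import javax.xml.parsers.DocumentBuilderFactory;
import javax.xml.parsers.ParserConfigurationException;
import org.w3c.dom.Document;

public class PerThreadParser {
    // Each thread lazily creates its own DocumentBuilder; no instance is shared across threads.
    private static final ThreadLocal<DocumentBuilder> BUILDER = ThreadLocal.withInitial(() -> {
        try {
            return DocumentBuilderFactory.newInstance().newDocumentBuilder();
        } catch (ParserConfigurationException e) {
            throw new IllegalStateException(e);
        }
    });

    // Parses the XML with the calling thread's builder and returns the root element name.
    public static String rootName(String xml) {
        try {
            Document doc = BUILDER.get().parse(
                    new ByteArrayInputStream(xml.getBytes(StandardCharsets.UTF_8)));
            return doc.getDocumentElement().getNodeName();
        } catch (Exception e) {
            throw new IllegalStateException("parse failed", e);
        }
    }
}
```

    Within one thread the same builder is reused sequentially, which matches the advice above about reusing the parser once the current parse completes.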

  • Can Java be used to parse Microsoft Word(.doc) files?

    Hi guys ,
    I want to know whether Java can be used to parse Microsoft Word(.doc) files for searching a string or for checking for grammatical errors, etc
    Thanks in advance.
    Avichal

    Hey man, anything and every thing can be done these days.
    About ur question doc is like all other normal text files with some extra features and extra character supports and other stuffs.
    If u neglect those parts and if u consider it to be a normal text file then its a much simpler job.
    Here is a code that searches for the key word in all the doc files, txt files, pdf files and html files
    in the mentioned folder and sub folders. Any way its a servlet u can change it to a normal program.
    It first check the file to know whether they are doc, pdf, html or txt files if yes then it will read the file and
    store the contents in the vector and parse the vector for the search string and display the result.
    Along with the result the below code will also display the time taken and the number of search string found in the document
    import java.io.*;
    import java.util.*;
    import java.net.*;
    import javax.servlet.*;
    import javax.servlet.http.*;

    public class search_local extends HttpServlet {
        public void service( HttpServletRequest _req, HttpServletResponse _res ) throws ServletException, IOException {
            long startTime = System.currentTimeMillis();
            File RootDir = new File( _req.getRealPath( "/docs/" ) );
            if ( RootDir.isDirectory() == false ) {
                System.out.println( "Invalid directory" );
                _res.setStatus( HttpServletResponse.SC_NO_CONTENT );
                return;
            }
            //- Split the "+"-separated search text into keywords
            Vector kList = new Vector( 3 );
            StringTokenizer st = new StringTokenizer( _req.getParameter( "search_text" ), "+" );
            while ( st.hasMoreTokens() )
                kList.addElement( st.nextToken().trim() );
            //- Run through list
            Vector toBeDone = new Vector( 10 );
            Vector found = new Vector( 10 );
            String dir[] = RootDir.list( new htmlFilter() );
            cDirInfo tX = new cDirInfo( RootDir, dir );
            toBeDone.addElement( tX );
            while ( toBeDone.isEmpty() == false ) {
                tX = (cDirInfo)toBeDone.firstElement();
                //- Bounded loop instead of relying on ArrayIndexOutOfBoundsException; list() may return null
                for ( int x = 0; tX.dirList != null && x < tX.dirList.length; x++ ) {
                    File newFile = new File( tX.rootDir, tX.dirList[x] );
                    if ( newFile.isDirectory() ) {
                        //- Queue sub folders for a later pass
                        String a[] = newFile.list( new htmlFilter() );
                        toBeDone.addElement( new cDirInfo( newFile, a ) );
                    } else {
                        int freq = searchFile( kList, newFile );
                        if ( freq != 0 )
                            found.addElement( new cPage( freq, newFile ) );
                    }
                }
                toBeDone.removeElementAt(0);
                dir = null;
            }
            long totalTime = System.currentTimeMillis() - startTime;
            formatResults( found, kList, totalTime, _req.getRealPath( "/docs" ), _res );
        }

        private void formatResults( Vector _fList, Vector _kList, long time, String _root, HttpServletResponse _res ) throws IOException {
            _res.setContentType("text/html");
            PrintWriter Out = new PrintWriter( _res.getOutputStream() );
            Out.println( "<HTML><HEAD><TITLE>Search results</TITLE></HEAD>" );
            Out.println( "<BODY><H3>Search Results</H3><BR>" );
            Out.println( "Keywords:<B> " );
            Enumeration E = _kList.elements();
            while ( E.hasMoreElements() )
                Out.println( (String)E.nextElement() + " : " );
            Out.println( "</B><BR><BR><CENTER><HR WIDTH=100%></CENTER><BR>" );
            E = _fList.elements();
            cPage sPage;
            String link;
            while ( E.hasMoreElements() ) {
                sPage = (cPage)E.nextElement();
                link  = sPage.cFile.toString();
                link  = "http://localhost/BugFix/docs/" + link.substring( link.indexOf( _root )+_root.length(), link.length() );
                Out.println( "<FONT SIZE=+1><A HREF=" + link + ">" + sPage.cFile.getName() + "</A></FONT>" );
                Out.println( "<FONT SIZE=-2>(" + sPage.freq + ")</FONT><BR>" );
            }
            if ( _fList.size() == 0 )
                Out.println( "<I><B>No sites found!</B></I><BR>");
            Out.println( "<BR><CENTER><HR WIDTH=100%></CENTER>" );
            Out.println( "<BR><FONT SIZE=-1>Time to complete: " + ((double)time/1000) + " seconds</FONT>" );
            Out.println( "</BODY></HTML>" );
            Out.flush();
        }

        private int searchFile( Vector _klist, File _filename ) {
            //- Counts keyword hits in the file, skipping text inside certain tags
            int frequency=0;
            try {
                DataInputStream In = new DataInputStream( new FileInputStream( _filename ) );
                String LineIn, token;
                boolean bValid = true;
                Enumeration E;
                cLineParse lp;
                while ( (LineIn = In.readLine()) != null ) {
                    lp = new cLineParse( LineIn.toUpperCase() );
                    //- Compare by length, not with != "", which tests object identity
                    while ( (token=lp.nextToken()).length() != 0 ) {
                        if ( token.indexOf( "<" ) != -1 && (
                              token.indexOf( "<A" ) != -1 ||
                              token.indexOf( "<HE" ) != -1 ||
                              token.indexOf( "<APP" ) != -1 ||
                              token.indexOf( "<SER" ) != -1 ||
                              token.indexOf( "<TEX" ) != -1  ))
                            bValid  = false;
                        else if ( token.indexOf( "<" ) != -1 && (
                                   token.indexOf( "</A" ) != -1 ||
                                   token.indexOf( "</HE" ) != -1 ||
                                   token.indexOf( "</APP" ) != -1 ||
                                   token.indexOf( "</SER" ) != -1 ||
                                   token.indexOf( "</TEX" ) != -1  ))
                            bValid  = true;
                        else if ( bValid ) {
                            E = _klist.elements();
                            String key;
                            while ( E.hasMoreElements() ) {
                                key = ((String)E.nextElement()).toUpperCase();
                                if ( token.indexOf( key ) != -1 )
                                    frequency++;
                            }
                        }
                    }
                }
                In.close();
            } catch( IOException E ){}
            return frequency;
        }
    }
    class cPage extends Object {
        public int freq;
        public File cFile;
        public cPage( int _freq, File _cFile ) {
            freq = _freq;
            cFile = _cFile;
        }
    }
    //- End of file
    //----- Supporting classes
    class htmlFilter implements FilenameFilter {
        //- Accepts directories and files ending in html, pdf, txt or doc
        public boolean accept(File dir, String name) {
            File tF = new File( dir, name );
            if ( tF.isDirectory() )
                return true;
            int indx = name.lastIndexOf( "." );
            if ( indx == -1 )
                return false;
            String Ext = name.substring( indx+1, name.length() ).toLowerCase();
            if ( Ext.equals( "html" ) ||
                  Ext.equals( "pdf" ) ||
                  Ext.equals( "txt" ) ||
                  Ext.equals( "doc" ) )
                return true;
            return false;
        }
    }
    class cDirInfo {
        public File rootDir;
        public String[] dirList;
        public cDirInfo( File _r, String[] _d ) {
            rootDir = _r;
            dirList = _d;
        }
    }
    class cLineParse {
        String L;
        public cLineParse( String _s ) {
            L = _s;
        }
        //- Returns the next tag or word from the line, or "" when the line is used up
        public String nextToken() {
            String ns="";
            boolean bStart = false;
            for ( int x=0; x < L.length(); x++ ) {
                if ( L.charAt(x) == '<' && ns.length() != 0 ) {
                    L = L.substring( x, L.length() );
                    return ns;
                } else if ( L.charAt(x) == '<' ) {
                    ns = ns + L.charAt( x );
                    bStart = true;
                } else if ( L.charAt(x) == '>' ||
                            L.charAt(x) == '\r' ||
                            ( L.charAt(x) == ' ' && bStart == false ) ) {
                    ns = ns + L.charAt( x );
                    L = L.substring( x+1, L.length() );
                    return ns;
                } else
                    ns = ns + L.charAt( x );
            }
            L = "";
            return ns;
        }
    }

  • Can i Use java Class

    Can anyone help? Can I use a Java class in
    Forms 6, or has anyone designed a form
    with a progress bar? Please help me.

    Thanks for your reply...
    Actually the task is quite simple... It requires me to do a better interface for a tomcat server folders...
    For example, users are free to access
    http://apache.oss.eznetsols.org/jakarta/tomcat-5/v5.0.12-beta/
    to download the tomcat... However, its index may not be nice and easy for browsing.. Therefore, the task requires a new dynamic page to access this server folder, and get all file names under this folder, analyse whether each is a file or a directory..and finally get its path....
    It is quite easy to do a JSP page, but I don't know whether it is possible to do by using JavaScript .... Although JS is running in client side, the user/client also can access the tomcat folder, why not JavaScript?
    Thanks a lot!
    SD
