Query to extract HTML tag with data

Hi All,
I have a string.
'<HTML><HEAD>THIS IS HEAD.</HEAD><BODY>THIS IS BODY.<P>THIS IS P1.</P>NIMISH<P>THIS IS P2.</P></BODY></HTML>'
I want to extract an HTML tag, including its opening and closing tags, together with its data.
If I say P1, then the output should be
'<P>THIS IS P1.</P>'
For P2, the output should be
'<P>THIS IS P2.</P>'
Please help me write this query with a regular expression.
I have tried the following, but it does not give the desired result:
WITH T AS
(
SELECT
    '<HTML><HEAD>THIS IS HEAD.</HEAD><BODY>THIS IS BODY.<P>THIS IS P1.</P>NIMISH<P>THIS IS P2.</P></BODY></HTML>' STR
FROM
    DUAL
)
SELECT REGEXP_SUBSTR(STR, '<P>.+P2.+</P>') FROM T
Thanks & Regards
Nimish Garg
Edited by: Nimish Garg on May 7, 2012 5:49 PM
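
The attempt above returns too much because '.+' is greedy: the match runs from the first <P> through to the last </P>. A minimal sketch of one possible fix, written in Java for illustration, replaces '.+' with '[^<]*' so the match cannot cross into another tag. The class and method names are hypothetical, and the sketch assumes the tag carries no attributes and contains no nested tags. The same pattern, '<P>[^<]*P2[^<]*</P>', should in principle also work with REGEXP_SUBSTR, but that has not been verified here.

```java
import java.util.regex.Matcher;
import java.util.regex.Pattern;

public class TagRegexDemo {
    // Returns the first <tag>...</tag> whose text contains needle, or null if none.
    // Assumption: the tag has no attributes and its data holds no nested tags.
    public static String findTag(String html, String tag, String needle) {
        String t = Pattern.quote(tag);
        // [^<]* stops at the next '<', so the match stays inside a single element
        Pattern p = Pattern.compile(
            "<" + t + ">[^<]*" + Pattern.quote(needle) + "[^<]*</" + t + ">");
        Matcher m = p.matcher(html);
        return m.find() ? m.group() : null;
    }

    public static void main(String[] args) {
        String str = "<HTML><HEAD>THIS IS HEAD.</HEAD><BODY>THIS IS BODY."
                   + "<P>THIS IS P1.</P>NIMISH<P>THIS IS P2.</P></BODY></HTML>";
        System.out.println(findTag(str, "P", "P1"));
        System.out.println(findTag(str, "P", "P2"));
    }
}
```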

Nimish Garg wrote:
My requirement is to extract a <tag>data</tag> from a HTML/XML string where data contains any specified value.

HTML is not XML.
And that is a critical distinction to make. HTML parsing is horribly complex. XML is quite easy. For HTML you have to code your own parser in PL/SQL. XML can be parsed using the XMLTYPE class/data type in PL/SQL.
So if you need to find a single specific tag in HTML - I would not try to treat it as XML. I may not even try to use regular expressions.
I would do a basic substring search for the start of the tag, then read the data following it until the end tag is reached, checking that the data contains no nested or embedded tags. That check matters because HTML is so widely abused, and the abuse is an accepted norm, since the parsers used by browsers deal with it without complaining.
Proper HTML is mostly a myth in my experience of "screen scraping" web servers for data extraction as they do not have web services supplying the data.
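
The substring approach described above can be sketched roughly as follows. This is an illustration in Java rather than PL/SQL, with hypothetical class and method names; it assumes an exact-case tag without attributes, so real-world HTML would need more care.

```java
public class TagScanner {
    // Finds the first <tag>...</tag> block whose data contains needle and
    // holds no embedded tags. Returns null when nothing qualifies.
    public static String extract(String html, String tag, String needle) {
        String open = "<" + tag + ">";
        String close = "</" + tag + ">";
        int from = 0;
        while (true) {
            int start = html.indexOf(open, from);      // locate the start tag
            if (start < 0) {
                return null;
            }
            int dataStart = start + open.length();
            int end = html.indexOf(close, dataStart);  // read up to the end tag
            if (end < 0) {
                return null;
            }
            String data = html.substring(dataStart, end);
            // Accept only plain data: no '<' inside, and the wanted value present
            if (data.indexOf('<') < 0 && data.contains(needle)) {
                return html.substring(start, end + close.length());
            }
            from = dataStart;                          // try the next start tag
        }
    }

    public static void main(String[] args) {
        String str = "<HTML><HEAD>THIS IS HEAD.</HEAD><BODY>THIS IS BODY."
                   + "<P>THIS IS P1.</P>NIMISH<P>THIS IS P2.</P></BODY></HTML>";
        System.out.println(extract(str, "P", "P2"));
    }
}
```

A candidate whose data contains another tag is skipped rather than repaired, which keeps the sketch short at the cost of missing nested matches.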

Similar Messages

  • HTML Tags in data field

    Does Documaker support reading and applying HTML tags in data fields? For example, we are publishing a list of questions that are being passed to me from the database. These questions contain embedded HTML formatting tags:
    Are you a member or do you intend to become a member of the armed forces <.i>including reserves</.i>?
    ** Please note: I had to add the . into the tag because the forum was formatting italics tag appropriately.
    I want Documaker to read the string and apply the formats that the tags specify.
    Any help would be appreciated.
    Dave

    Dave,
    Have a look at the rule MessageFromExtr, it may help you.
    Gaétan

  • Oracle function to strip HTML tags in data

    Hi, may I know if there is any Oracle function that can strip HTML tags from the data? I am currently using Oracle 9i, so I cannot use the REGEXP_REPLACE function provided by Oracle 10g.
    Any help will be appreciated.
    Thanks a lot!

    I found this function:
    function str_html ( line in varchar2 ) return
    varchar2 is
    x varchar2(32767) := null;
    Let's hope whoever uses this doesn't want to deal with large XML's.
    This question would have been better asked over in the XDB forum. All depends whether there is a schema available to describe the XML, what version of the database etc. With a schema, the schema could be registered into the Oracle 10g database and that used to create relational tables automatically, from where you can load in the XML files. Loads of information available on the XDB forum, most likely in the FAQ's on there.

  • How to set an attribute of a HTML tag with a value in the Servlet

    I have a HTML page and a Servlet.
    The HTML page sends a request to the Servlet.
    The Servlet has to read the contents of the HTML page. When the Servlet encounters the body tag, it should set the bgcolor attribute of the body tag with a string (e.g. a string called color with the value blue) in the servlet.
    After doing this the Servlet has to update the original HTML page with the changes (in the body tag).
    I need a help on this.

    Hi sangee,
    you could get what you want to do by using a Java Server Page instead of both a HTML page and a Servlet.
    I would code something like this:
    <% String color = "yourColor"; %>
    <html><head><title></title></head>
    <body bgcolor="<%= color %>">
    </body>
    </html>

  • RMAN TAG with date stamp

    We have daily RMAN backup with TAG FULL_BACKUP. So every day we have backupsets with the same tag. I would like to know if it possible to create some kind dynamic tag with every backup. For example: FULL_BACKUP_11012007
    FULL_BACKUP_11022007 ...
    We use UNIX AIX and Oracle 10g

    Why would anyone use bash on AIX or any other Unix?
    Agreed for Unix (I use ksh too), but on Linux I use bash, and it works:
    $ echo $SHELL
    /bin/bash
    $ cat back.sh
    TAG=FULL_BACKUP_`date +%Y%m%d`
    rman target / << EOF
    run {
    backup database tag $TAG;
    }
    EOF
    $ ./back.sh
    Recovery Manager: Release 10.2.0.3.0 - Production on Thu Nov 1 17:17:55 2007
    Copyright (c) 1982, 2005, Oracle.  All rights reserved.
    connected to target database: DB102 (DBID=XXXXXXXXX)
    RMAN> 2> 3>
    Starting backup at 01-NOV-07
    using target database control file instead of recovery catalog
    allocated channel: ORA_DISK_1
    channel ORA_DISK_1: sid=211 devtype=DISK
    channel ORA_DISK_1: starting full datafile backupset
    channel ORA_DISK_1: specifying datafile(s) in backupset
    input datafile fno=00001 name=/home/oracle/base/oradata/db102/system01.dbf
    input datafile fno=00003 name=/home/oracle/base/oradata/db102/sysaux01.dbf
    input datafile fno=00004 name=/home/oracle/base/oradata/db102/users01.dbf
    input datafile fno=00006 name=/home/oracle/base/oradata/db102/undotbs02.dbf
    channel ORA_DISK_1: starting piece 1 at 01-NOV-07
    channel ORA_DISK_1: finished piece 1 at 01-NOV-07
    piece handle=/home/oracle/base/flash_recovery_area/DB102/backupset/2007_11_01/o1_mf_nnndf_FULL_BACKUP_20071101_3lmz1qfk_.bkp tag=FULL_BACKUP_20071101 comment=NONE
    channel ORA_DISK_1: backup set complete, elapsed time: 00:01:46
    Finished backup at 01-NOV-07
    Starting Control File Autobackup at 01-NOV-07
    piece handle=/home/oracle/base/flash_recovery_area/DB102/autobackup/2007_11_01/o1_mf_n_637521584_3lmz5254_.bkp comment=NONE
    Finished Control File Autobackup at 01-NOV-07
    RMAN>
    Recovery Manager complete.
    $ rman target /
    Recovery Manager: Release 10.2.0.3.0 - Production on Thu Nov 1 17:19:57 2007
    Copyright (c) 1982, 2005, Oracle.  All rights reserved.
    connected to target database: DB102 (DBID=XXXXXXXXX)
    RMAN> list backup of database;
    using target database control file instead of recovery catalog
    List of Backup Sets
    ===================
    BS Key  Type LV Size       Device Type Elapsed Time Completion Time
    15      Full    834.42M    DISK        00:01:38     01-NOV-07     
            BP Key: 15   Status: AVAILABLE  Compressed: NO  Tag: FULL_BACKUP_20071101
            Piece Name: /home/oracle/base/flash_recovery_area/DB102/backupset/2007_11_01/o1_mf_nnndf_FULL_BACKUP_20071101_3lmz1qfk_.bkp
      List of Datafiles in backup set 15
      File LV Type Ckp SCN    Ckp Time  Name
      1       Full 2485667575733 01-NOV-07 /home/oracle/base/oradata/db102/system01.dbf
      3       Full 2485667575733 01-NOV-07 /home/oracle/base/oradata/db102/sysaux01.dbf
      4       Full 2485667575733 01-NOV-07 /home/oracle/base/oradata/db102/users01.dbf
      6       Full 2485667575733 01-NOV-07 /home/oracle/base/oradata/db102/undotbs02.dbf
    RMAN>

  • How to Extract email subject with date from outlook?

    Hello,
    I am new to powershell and was wondering how i can extract the email subject with date for entire last month? i need to generate a report every month end and have to go through all the emails which can be very cumbersome at times. 
    Divyansh 

     OK, I was able to find the commands, but they only list emails that are exactly 2 weeks old; the recent items are not listed.
     Add-type -assembly "Microsoft.Office.Interop.Outlook" | out-null
     $olFolders = "Microsoft.Office.Interop.Outlook.olDefaultFolders" -as [type] 
     $outlook = new-object -comobject outlook.application
     $namespace = $outlook.GetNameSpace("MAPI")
     $folder = $namespace.getDefaultFolder($olFolders::olFolderSentMail)
     $folder.items  | where { $_.SentOn -gt [datetime]"3/1/2014" -AND $_.SentOn -lt [datetime]"3/25/2014" }  | Select-Object -Property Subject, SentOn, Importance, SenderName
    Divyansh

  • Custom html tags with JEditorPane

    I'm trying to use my own tags within HTML in the JEditorPane. An insert parses without exception, but the HTML in the pane is missing the custom tags that I attempted to insert. I have tried to use the 'setPreservesUnknownTags' command on the HTML document, but it appears to have no effect. This is the code I have been trying to use:
    edtHTML.insertHTML(docHTML, 0, "<p><my:link id=\"12\">ABC</my:link></p>", 0, 0, HTML.Tag.P);
    I have also tried
    edtHTML.insertHTML(docHTML, 0, "<my:link id=\"12\">ABC</my:link>", 0, 0, new HTML.UnknownTag("my:link"));
    Can anyone help?
    Thanks

    I have no example for custom Tags but I found out that every tag is stored as an attribute.
    You can access the attributes if you use a JTextPane. There is a method (I don't know the name yet; I will find out tomorrow in the office) which gives you the AttributeSet of each character at the caret position. All characters have an attribute "name", and the value of this attribute should be the name of the tag you've inserted.
    The only problem is that I had to read out every character, so it's quite slow.
    If you could find a better solution please let me know ([email protected]).
    Thank you.
    Jörn

  • Java library to extract html tags in the given text

    hello friends
    I would like to know whether there is any Java library through which I can access the HTML tags that are present in my plain text.
    waiting for reply

    jainshasha wrote:
    well in html parser we require to give some url in which it parse out the tags as per the requirement
    Really? Which parser did you choose which had that limitation?
    but what i want is i want to give my simple string as an input and in that it parse out the html tags
    The standard XML parser built into Java allows that. Convert your HTML to XHTML and use that, if you can't figure out how to get your chosen HTML parser to parse from a string. Or better, go back to the documentation for the HTML parser you chose and look again.
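
As a rough illustration of the suggestion above, the JDK's built-in DOM parser can read markup straight from a string by wrapping it in a StringReader. The class and method names below are hypothetical, and the sketch only works when the input is already well-formed XHTML with a single root element; ordinary HTML will usually be rejected by the parser.

```java
import java.io.StringReader;
import java.util.ArrayList;
import java.util.List;
import javax.xml.parsers.DocumentBuilder;
import javax.xml.parsers.DocumentBuilderFactory;
import org.w3c.dom.Document;
import org.w3c.dom.NodeList;
import org.xml.sax.InputSource;

public class XhtmlTagLister {
    // Parses a well-formed XHTML string and returns its element names
    // in document order. Malformed input raises IllegalArgumentException.
    public static List<String> tagNames(String xhtml) {
        try {
            DocumentBuilder builder =
                DocumentBuilderFactory.newInstance().newDocumentBuilder();
            // InputSource over a StringReader: no URL or file is needed
            Document doc = builder.parse(new InputSource(new StringReader(xhtml)));
            NodeList all = doc.getElementsByTagName("*"); // "*" matches every element
            List<String> names = new ArrayList<>();
            for (int i = 0; i < all.getLength(); i++) {
                names.add(all.item(i).getNodeName());
            }
            return names;
        } catch (Exception e) {
            throw new IllegalArgumentException("Input is not well-formed XHTML", e);
        }
    }

    public static void main(String[] args) {
        System.out.println(tagNames("<p>Hello <b>world</b><br/></p>"));
    }
}
```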

  • Why do I get html tags with my Flash forms CGI-mailed input text?

    I'm using a Flash form to send input text to a cgi page
    (using load variable). The following text is on the CGI page.
    To: [email protected]
    From: [email]
    Errors-To: [email protected]
    Subject: [subject]
    Type of Project: [subject]
    Deadline: [details]
    Name: [realname]
    e-mail: [email]
    Phone: [phone]
    e-mails me results like this...
    Type of Project: <TEXTFORMAT LEADING="2"><P
    ALIGN="LEFT"><FONT
    FACE="_sans" SIZE="12" COLOR="#000000"
    LETTERSPACING="0" KERNING="0">by
    george</FONT></P></TEXTFORMAT>
    Deadline: <TEXTFORMAT LEADING="2"><P
    ALIGN="LEFT"><FONT
    FACE="_sans" SIZE="12" COLOR="#000000"
    LETTERSPACING="0"
    KERNING="0"></FONT></P></TEXTFORMAT><TEXTFORMAT
    LEADING="2"><P ALIGN="LEFT"><FONT FACE="_sans"
    SIZE="12" COLOR="#000000" LETTERSPACING="0"
    KERNING="0">it!!!</FONT></P></TEXTFORMAT>
    Name:
    e-mail: <TEXTFORMAT LEADING="2"><P
    ALIGN="LEFT"><FONT
    FACE="_sans" SIZE="12" COLOR="#000000"
    LETTERSPACING="0"
    KERNING="0">[email protected]</FONT></P></TEXTFORMAT>
    Phone: <TEXTFORMAT LEADING="2"><P
    ALIGN="LEFT"><FONT
    FACE="_sans" SIZE="12" COLOR="#000000"
    LETTERSPACING="0" KERNING="0">I&apos;ve
    got</FONT></P></TEXTFORMAT>
    Does anyone know how I can get it to drop all the extraneous html
    crap? I just want the input text items.
    Also, I can't figure out how to do a Flash pulldown list as a
    form element. All tutorials I find on the subject are for Flash 5
    and say to use Smartclips from the common libraries tab, but there
    are no "smartclips" in my Flash CS3. I see other things in
    components (menu, list) but cannot figure out how to add my items
    to the list.

    is there some way I could use an "expression" to subtract the
    unwanted text strings from the info being passed by the variables?
    If so, does anyone have an example of the context I would use to
    subtract 2 strings from that info...
    this string of html tags appears before the passed input text
    <TEXTFORMAT LEADING="2"><P ALIGN="LEFT"><FONT
    FACE="_sans" SIZE="12" COLOR="#000000"
    LETTERSPACING="0" KERNING="0">
    and this appears after it...
    </FONT></P></TEXTFORMAT>
    it's always the same.

  • Removing html tags with \n

    Hi all,
    I have the following html tag to remove:
    <a onClick="s_code_linktrack('Article-MultiPagePageNum2');"
                           title="Page 2"
                           href="?pagewanted=2&ei=5088&en=dea221a153307225&ex=1284955200&partner=rssnyt&emc=rss">2<a onClick="s_code_linktrack('Article-MultiPagePageNum3');"
                           title="Page 3"
                           href="?pagewanted=3&ei=5088&en=dea221a153307225&ex=1284955200&partner=rssnyt&emc=rss">3<a onClick="s_code_linktrack('Article-MultiPagePageNum4');"
                           title="Page 4"
                           href="?pagewanted=4&ei=5088&en=dea221a153307225&ex=1284955200&partner=rssnyt&emc=rss">4<a class="next" onClick="s_code_linktrack('Article-MultiPage-Next');"
                 title="Next Page"
                 href="?pagewanted=2&ei=5088&en=dea221a153307225&ex=1284955200&partner=rssnyt&emc=rss">
    I tried using
    bText = bText.replaceAll("\\<.*?\\>","");
    and
    bText = bText.replaceAll("\\<.*?m)\\>","");
    Neither works. Any thoughts?
    Thanks in advance

    What is the difference between this thread and your previous thread on the topic?
    http://forum.java.sun.com/thread.jspa?threadID=665617
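
One likely reason the first pattern fails on these tags is that '.' does not match line terminators by default in Java regular expressions, so a tag that spans several lines is never matched as a whole. A minimal sketch, assuming the goal is simply to delete everything between '<' and the next '>': prefixing the pattern with (?s) turns on DOTALL mode. The class and method names are hypothetical, and a '>' inside an attribute value would still break this.

```java
public class TagStripper {
    // Removes every <...> run, including tags that span several lines.
    public static String stripTags(String text) {
        // (?s) enables DOTALL, so '.' also matches '\n'; .*? keeps each match short
        return text.replaceAll("(?s)<.*?>", "");
    }

    public static void main(String[] args) {
        String bText = "<a onClick=\"s_code_linktrack('x');\"\n"
                     + "   title=\"Page 2\"\n"
                     + "   href=\"?pagewanted=2\">2</a> done";
        System.out.println(stripTags(bText));
    }
}
```

The character class form "<[^>]*>" is an alternative that matches across lines without any flag, because a negated class already includes newlines.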

  • How can i fill selection box on html form with data on the clientside?

    hi
    i want to make a html form that reads option values from the client.
    Because there is too much data, it's not reasonable for me to design a page that connects to the server each time to fill the selection boxes. Instead, I want to check whether the data resides on the client side: if so, fill the selection boxes with that data; if not, download it the first time and store it on the client for later local retrieval. In addition, I must be able to update the data residing on the client when I want.

    Don't see where Java comes into this. Sounds like you'd be using JavaScript on the client.
    A cookie would probably be the only way to save data on the client.

  • Query taking too much time with dates??

    hello folks,
    I am trying to pull some data using a date condition, and for some reason it is taking too much time to return the data:
       and trunc(al.activity_date) = TRUNC (SYSDATE, 'DD') - 1     -- if I use this, it takes too much time
      and al.activity_date >= to_date('20101123 000000', 'YYYYMMDD HH24MISS')
       and al.activity_date <= to_date('20101123 235959', 'YYYYMMDD HH24MISS') -- if I use this, it returns the data in a second. Why is that?
    How do I get the previous day without the hardcoded to_date('20101123 000000', 'YYYYMMDD HH24MISS'), if I need to retrieve it faster?

    Presumably you've got an index on activity_date.
    If you apply a function like TRUNC to activity_date, a plain index on that column can no longer be used.
    Post execution plans to verify.
    Compare the dates as a range instead, leaving the column bare:
    and al.activity_date >= TRUNC (SYSDATE, 'DD') - 1
    and al.activity_date < TRUNC (SYSDATE, 'DD')

  • JDBC MS Access--- cannot extract entry with null value with data type Memo

    I'm trying to extract a data entry with null value by using JDBC. The database is MS Access.
    The question is how to extract null entry with data type memo? The following code works when the label has data type Text, but it throws sqlException when the data type is memo.
    Any advice will be appreciated! thanks!
    Following are the table description and JDBC code:
    test table has the following attributes:
    Field name Data Type
    name Text
    label Memo
    table contents:
    name label
    me null
    you gates
    Code:
    String query = "SELECT name, label FROM test where name like 'me' ";
    try {
        ResultSet rs = stmt.executeQuery(query);
        while (rs.next()) {
            String name = rs.getString("name");
            rs.getString("val");
            String label = rs.getString("label");
            System.out.println("\t"+name+"\t"+label);
        }
    } catch (SQLException ex) {
        System.out.println(ex.getSQLState());
        System.out.println(ex.getErrorCode());
        System.out.println("in sqlexception");
    }
    output:
    C:\Temp\SEFormExtractor>java DBTest
    yet SELECT name, label FROM test
    null
    0
    in sqlexception

    The question is how to extract null entry with data type memo?
    Okay, what you need to do is this:
    if (rs.getString("val") == null) {
      // do something
    }
    This way, when it's a null value, you can check it first, and then handle it how you want, rather than getting an exception.

  • HTML tags not displayed when using Data Template

    Hi All...
    I'm developing a BI Publisher report in which one of the columns is a clob data type. I'm using an xsl stylesheet to format the data present in the clob column.
    I've developed the report using a data template as the data set. The problem is that the clob column, which has the HTML tags, is not displayed properly. For example,
    the tag starting with
    <
    is replaced with
    & lt;
    I did a couple of searches in this forum and in tim's blog but couldn't find a proper solution...
    http://blogs.oracle.com/xmlpublisher/2007/01/formatting_html_with_templates.html
    API and HTML Formated Content
    Re: Problem with text data elements containing escaped HTML codes
    HTML Output from CDATA
    Re: HTML formatted output
    Re: Special characters in CLOB are making report fail
    Re: Formatting of HTML tag problem
    I'm using BI Publisher standalone:Release 10.1.3.2. In one of the threads..
    Re: Special characters in CLOB are making report fail
    I came to know that data template cannot generate proper HTML tags for release 10.1.3.2. Is there any work around way to get the proper HTML tags when data template is used as a data set?
    Thanks in Advance...
    Edited by: user10280715 on Dec 9, 2008 3:13 PM

    Issue could be with the data that is selected in the other environment. It generally happens that the ALV will not give the same results as in the DEV in the other systems.
    Possible errors could be the control break statements in the loop...endloop block. validate the correctness of the control break stmts if any.

  • Extracting delta's with ODP-data source 0MATERIAL_ATTR from SAP ECC

    We are using ODP-replication to extract data from SAP ECC to a MS SQL-database by SAP BusinessObjects Data Services.
    For extracting Material-data we use the standard data source 0MATERIAL_ATTR in SAP ECC.
    This data source delivers a full load with all Materials in it.
    In the properties of this data source (tab attributes) we saw that this data source is Delta_Extractor_Capable.
    Therefore we set the extraction mode from 'Query' to CDC (= Change Data Capable) and did a new initial load to get all Materials again.
    After changing a Material in ECC, we did a delta-load, but unfortunately we received ZERO records.
    Is there anyone with experience on extracting delta's with data source 0MATERIAL_ATTR or another standard masterdata data source in ECC?
    Thank you for your reaction in advance,
    Best regards,
    Jan-Hendrik

    Hi,
    You can check the delta records in RSA3; see the articles at https://wiki.sdn.sap.com/wiki/display/profile/Surendra+Reddy
    Checking the Data using Extractor Checker (RSA3) in ECC Delta Repeat Delta etc...
    http://www.sdn.sap.com/irj/scn/index?rid=/library/uuid/80f4c455-1dc2-2c10-f187-d264838f21b5&overridelayout=true 
    Data Flow from LBWQ/SMQ1 to RSA7 in ECC and Delta Extraction in BI
    http://www.sdn.sap.com/irj/servlet/prt/portal/prtroot/com.sap.km.cm.docs/library/business-intelligence/d-f/data%20flow%20from%20lbwq_smq1%20to%20rsa7%20in%20ecc%20and%20delta%20extraction%20in%20bi.pdf
    Delete the Queue in LBWQ/RSA7 and then freshly load Init and Check the Delta.
    Once you load the Init then if there are no new records then it won't get any data. so 0 records
    Check in SM37 in ECC why the job is terminated. See in ST22 also.
    Thanks
    Reddy
