JS/HTML: HTML to DOM so I can XPath

I'm new to all this, and all I want to do is get info from a
page and redisplay it. I'd really like to get that info the way
the title suggests, though: start with the HTML source, parse it
into a DOM, and then use XPath to pull out the info I want. I've
only ever done this sort of thing in Greasemonkey before, and I
don't seem to be able to get it working here.
Any help would be appreciated.

Just to answer my own question ;).......
quote:
<html>
<head>
<title>Get It!</title>
<link href="sample.css" rel="stylesheet" type="text/css"/>
<script type="text/javascript" src="lib/air/AIRAliases.js"></script>
<script type="text/javascript" src="lib/air/AIRIntrospector.js"></script>
<script type="text/javascript">
// AIR-related functions created by the developer
var html; // HTMLLoader holding the parsed copy of the remote page

function onHTMLLoadComplete(e) {
    // get a reference to the top-level HTML document
    var doc = html.window.document;
    //var doc = e.target.window.document;
    var node = doc.evaluate("//title", doc).iterateNext();
    // To walk several matches instead of taking the first one:
    // var nodes = doc.evaluate("//title", doc);
    // var thisNode;
    // while (thisNode = nodes.iterateNext()) {
    //     alert( thisNode.textContent );
    // }
    var elem = document.createElement( 'div' );
    elem.innerText = 'Title of Page is: ' + (node ? node.textContent : '(no title found)');
    document.body.appendChild( elem );
}

// loads the content of a remote URL
function doRequest(url) {
    var req = new XMLHttpRequest();
    req.onreadystatechange = function() {
        if (req.readyState == 4) {
            var str = req.responseText;
            html = new air.HTMLLoader();
            html.addEventListener(air.Event.COMPLETE, onHTMLLoadComplete);
            html.loadString(str);
        }
    };
    req.open('GET', url, true);
    req.send(null);
}

function openInBrowser(url) {
    air.navigateToURL( new air.URLRequest(url));
}
</script>
</head>
<body>
<h3>HTML to DOM for XPath</h3>
<ul>
<li>XMLHttpRequest object can reach into remote
domains &mdash; the following loads
http://www.adobe.com:
<br/>
<input type="button" onclick='doRequest("http://www.adobe.com");'
value='doRequest("http://www.adobe.com");'/>
</li>
</ul>
</body>
</html>
Now I'd like to know if there's an option for it not to load
images when it parses the DOM. I assume it's still loading the
images, given how long the load took (I'm on dial-up). If there's
no such option, I wonder whether a regular expression could wreck
the image URLs first (by renaming each img's src attribute to
something else), and then I could search for the renamed attribute
with XPath. Pity I'm no good at regular expressions.
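
One way the regular-expression idea could look, as a rough sketch rather than a tested AIR recipe: rename the src attribute (images are referenced by src, not href) on every img tag in the raw source before it is parsed, so the loader finds no image URL to fetch. The function name disableImageSources and the replacement attribute name data-orig-src are made up for this example. The pattern is deliberately simple and can be fooled, for instance by the text " src=" inside another attribute's value. Whether this actually shortens the load is an assumption that would need checking, and stylesheets and scripts are left untouched.

```javascript
// Hypothetical helper: rename src on <img> tags so the parser has no image URL to download.
// "data-orig-src" is an arbitrary replacement name chosen for this sketch.
function disableImageSources(source) {
    // $1 = everything from "<img" up to the whitespace before src, $2 = the "=" part
    return source.replace(/(<img\b[^>]*?\s)src(\s*=)/gi, "$1data-orig-src$2");
}

var sample = '<p><img alt="logo" src="http://example.com/logo.png"></p>';
console.log(disableImageSources(sample));
// <p><img alt="logo" data-orig-src="http://example.com/logo.png"></p>
```

In the page above this would be called as html.loadString(disableImageSources(str)), and the original URLs could then be read back with an XPath expression such as //img/@data-orig-src.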

Similar Messages

  • Convert html to Dom object possible

    Hi all,
    My question is: can we convert simple HTML to a DOM object or not?
    Thanks in advance.

    That would print the code as a PDF, not the rendered page.
    Murray --- ICQ 71997575
    Adobe Community Expert
    (If you *MUST* email me, don't LAUGH when you do so!)
    ==================
    http://www.projectseven.com/go
    - DW FAQs, Tutorials & Resources
    http://www.dwfaq.com - DW FAQs,
    Tutorials & Resources
    ==================
    "Ian Edwards" <[email protected]>
    wrote in message
    news:g0q95l$ego$[email protected]..
    > Hi
    >
    > Not really sure what you want to achieve, but assuming you have the full
    > suite, you have the full version of Acrobat. This installs a "PDF" creator
    > as a printer, so you can print your code to it by selecting File--Print Code
    > and then selecting the Adobe PDF printer. (If you have FlashPaper, you can
    > print to that and then save as PDF.)
    >
    > If you want the design view printed, I think you will have to preview in a
    > browser and then print as above.
    >
    > HTH
    >
    > Ian
    >
    > --
    > [email protected]
    > http://www.edwards-micros.co.uk
    >

  • How to update the HTML file so that we can Control our process in real time

    After installing following three steps as per the lookout 4 online help I am unable to Monitor and control the Process in HTML format, which was exported manually in lookout server.
    1) Creating a Web Client Page in Lookout
    2) Download a Lookout Web Client
    3) Setting Up Own Web Server
    My browser shows only the instance, which I have uploaded manually without any update
    Problem: How to automatically update/refresh the HTML file so that we can Monitor/Control our process in real time/bi-directional mode.

    Hi,
    It seems like your process is not updating. When you create a Web Client, it uses ActiveX which lets you control the Lookout process fully. Make sure that you run the process. You can do this by pressing CTRL+Spacebar which puts it in Run-mode. Perhaps then you may see your graphs, etc updating.
    Also, please refer to page 11-1 of the Users Manual linked below:
    http://www.ni.com/pdf/manuals/322390a.pdf
    What kind of Web Server are you using? Make sure all the settings in it are done properly. If you have LabVIEW, you can use the LabVIEW Web Server.
    Hope this information is helpful. Please let us know if you have any further questions.
    Regards,
    A Saha
    Applications Engineer
    National Instruments
    Anu Saha
    Academic Product Marketing Engineer
    National Instruments

  • XML DOM API vs. XPATH

    Hi,
    We are using Oracle v9.2.0.4. Can anyone share their knowledge regarding the XML DOM API and XPath? What are the pros and cons of each approach? In terms of performance, which approach is better? Also, which approach can preserve CDATA?
    Here is a sample XML document. Please show how to get the values of the SHOP elements using both the XML DOM API and XPath.
    <FILTER>
    <USER_NAME></USER_NAME>
    <PROFILE>
    <SHOP>TOYS 'R US</SHOP>
    <SHOP>TONY & MARIE'S SHOP</SHOP>
    <SHOP>A HUGE STORE <Sample Name></SHOP>
    </PROFILE>
    </FILTER>
    Thank you in advance.

    Refer
    http://forums.oracle.com/forums/thread.jspa;jsessionid=8d922006ce6dda4555e27134d9c9502b63e2f3808d1.rQjwahaM-AXMmkbGngTxpQOUaNaKaxD3lN4OagSLb3mIaN8IbwSLa30Qn3qP-x4KbhiKbA8Omh8Q-wOSa30K8Oz1iNyKb2TRnk8LaN8IpR9vmQLz-AbJpgaTahaPbN8Kc3uxf2bdeNf+lQnJqBjHqNeL8QfznA5Pp7ftolbGmkTy?messageID=449186&#449186

  • HTML window in AIR app can't open new window

    Hi, By allowing HTML content to be displayed inside our AIR app it's possible for our partner organization to write their own custom features hosted on HTML pages at their site, but for their content to appear integrated seamlessly within our AIR app's container so that it looks like it belongs there...
    We've successfully got an HTML window within our AIR app that navigates to content in a sub-folder on a web-hosted domain. Content displays correctly and hyperlinks function within the HTML window as we'd expect apart from three scenarios that appear to be manifestations of the same problem:
    A hyperlink on a page shown in-app with a link to a PDF stored on the web server has no action
    A hyperlink on a page shown in-app with a link to a video file stored on the web server has no action
    A hyperlink on a page shown in-app with a link to another site (target="_blank" parameter) has no action
    All three hyperlink scenarios work as we'd wish if we navigate to the page in a standard browser.
    How can we code the HTML so that we can display selected content in an HTML window inside the AIR app; but have selected hyperlinks invoke the user's standard web browser, or launch Adobe Reader, or play a video file etc?
    Note, we understand how to do those things within AIR itself, but can't figure out how to achieve this from inside the HTML window in the app.

    Hello,
    As to "_blank" links:
    This is a long-standing missing feature: there is no event that lets you inspect this kind of navigation, so it simply happens and cannot be prevented. One can either handle all navigation with the system browser (all links open in the external browser) or handle all of it in the embedded browser. (A similar issue bugs people using PhoneGap with jQuery Mobile: the application eats either all external links or none at all.) There are solutions for this, including inspecting the DOM at runtime to retrieve all anchor (a) tags in the rendered document and attaching a custom click handler via the host object, for example on complete:
    var document:Object = html.htmlLoader.window.document;
    // bail out if there is no document or it lacks the DOM method we need
    if(!document || !document.hasOwnProperty("getElementsByTagName")) return;
    var linksArray:Object = document.getElementsByTagName("a");
    if(!linksArray) return;
    var a:Object = null;
    for(var i:Number = 0; i < linksArray.length; i++)
    {
         a = linksArray[i];
         if(a)
         {
              a.onclick = function(event:Object):Boolean
              {
                   if(event.target.hasOwnProperty("href") && event.target.hasOwnProperty("target"))
                   {
                        // ordinary links stay in the embedded view
                        if(event.target.target != "_blank") return true;
                        // "_blank" links go to the system browser
                        flash.net.navigateToURL(new URLRequest(event.target.href));
                   }
                   return false;
              };
         }
    }
    But if you control the content being served, you can handle the links in JavaScript itself, using runtime feature detection:
    <script type="text/javascript">
         function handleClick(a)
         {
              // outside AIR there is no window.runtime: let the browser follow the link
              if(!window.runtime) return true;
              if(a.hasAttribute("target") && a.getAttribute("target") == "_blank")
              {
                   // inside AIR: hand "_blank" links to the system browser
                   var href = a.getAttribute("href");
                   var req = new window.runtime.flash.net.URLRequest(href);
                   if(req) window.runtime.flash.net.navigateToURL(req);
                   return false;
              }
              return true;
         }
    </script>
    <a href="http://www.bbc.co.uk/" target="_blank" onclick="return handleClick(this);">BBC</a>
    <br />
    <a href="http://www.google.com/" onclick="return handleClick(this);">Google</a>
    (The above could be applied globally to all links, for example with the help of jQuery, without much coding.)
    In 2.7 a new event was introduced to help with introspection, so one can prevent the event if the link is internal and do whatever the application expects:
    http://help.adobe.com/en_US/FlashPlatform/reference/actionscript/3/flash/filesystem/File.html
    If you post some details on how the PDF and video content is expected to be shown in HTML, I'm sure someone will share some hints as well.
    regards,
    Peter Blazejewicz
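
The idea of handling all links globally, instead of adding an onclick attribute to each one, might look roughly like this without jQuery. It is a sketch under assumptions: opensExternally and findAnchor are names invented here, and the window.runtime.flash.net calls are the ones used in the reply above, which only exist when the page runs inside AIR.

```javascript
// Decide whether a link should be handed to the system browser.
// inAir is truthy when the page runs inside AIR (window.runtime exists).
function opensExternally(targetAttr, inAir) {
    return Boolean(inAir) && targetAttr === "_blank";
}

// Walk up from the clicked node to the nearest <a> element, if any.
function findAnchor(node) {
    while (node && node.nodeName !== "A") {
        node = node.parentNode;
    }
    return node || null;
}

// One delegated click handler for the whole page (skipped when there is no DOM).
if (typeof document !== "undefined") {
    document.addEventListener("click", function (event) {
        var a = findAnchor(event.target);
        if (!a || !opensExternally(a.getAttribute("target"), window.runtime)) {
            return; // let the embedded view navigate as usual
        }
        event.preventDefault();
        var req = new window.runtime.flash.net.URLRequest(a.href);
        window.runtime.flash.net.navigateToURL(req);
    }, false);
}
```

Because the handler sits on the document, links added to the page later are covered as well, and the same page still behaves normally in an ordinary browser where window.runtime is undefined.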

  • HTML attachments of the mail can not be opened

    I can not open HTML attachments of the mail in the PlayBook. Is there any solution to this???

    Please review the following information.
    http://btsc.webapps.blackberry.com/btsc/microsites/search.do?cmd=displayKC&docType=kc&externalId=KB2... 0 1716682976
    Be a Shepard and not an iSheep.

  • Parsing HTML into DOM using HTMLEditorKit

    I am trying to parse an HTML file using javax.swing.text.html.HTMLEditorKit. My limitations are that I cannot install new libraries like JTidy and I must use a .jsp file, not a servlet. I'm able to get the URL and parse it using ParserCallback, but the new handleText method will not write to the page. Furthermore, I cannot pass anything out of this method to use later because it is void. I want to get some data back from this method or at least do something useful within it. Is that possible?
         java.net.URL url = new java.net.URL("http://" + request.getServerName() + "/" + urls.get(i));
         java.io.InputStream is = url.openStream();
         java.io.InputStreamReader isr = new java.io.InputStreamReader(is);
         java.io.BufferedReader br = new java.io.BufferedReader(isr);
         javax.swing.text.html.HTMLEditorKit.ParserCallback callback =
           new javax.swing.text.html.HTMLEditorKit.ParserCallback () {
             public void handleText(char[] data, int pos) {
                 out.println(data);
             }
           };
         new javax.swing.text.html.parser.ParserDelegator().parse(br, callback, false);
    Attempting to print from within this method gives this error:
    Attempt to use a non-final variable out from a different method. From enclosing blocks, only final local variables are available.
    Maybe I need to try and write the output xml file all from inside the parserCallback?

    Those are rather stupid requirements. Okay, I can see the one about not using external libraries because nobody knows how to deal with the licences. But making you use a JSP instead of a servlet just gets in the way of writing the Java code which you could probably do perfectly well if you didn't have to cram it into a JSP scriptlet. Stupid.
    But anyway: the error message says you need a final local variable. So don't just sit there, give it a final local variable. I forget just what type "out" is supposed to be, but something like "final JspWriter fakeOut = out", followed by using "fakeOut" rather than "out", should work. (JspWriter's print methods throw IOException, so the call inside handleText will also need a try/catch.)

  • Start site is not uploaded correctly. html is missing. What can I do?

    Hello,
    I am searching for a new web host and created my site with iWeb.
    Unfortunately I cannot upload it correctly to the test site. The pictures and the blog are all right, but the start page with its links is missing.
    Apparently it is an iWeb fault. What can I do?
    Thanks for your help. My test site expires tomorrow:

    The method for uploading files using the iWeb FTP is shown here...
    http://www.iwebformusicians.com/iWeb/Publish-Website.html
    This uploads the external index.html and a folder containing the rest of the website files which has the same name as the site name in iWeb (the one at the top of the left column in the iWeb window). The URL to any page of your site is...
    http://www.domain-name/website-name/page-name.html
    Entering domain-name.com into the browser should take you to the Home page.
    When uploading files using this method, it's not possible for files to be mislaid or corrupted, so the problem would appear to be on the server.
    The other method of publishing requires that you publish to a local folder and upload the files inside the main folder, and this is shown here...
    http://www.iwebformusicians.com/iWeb/URLs-Favicons.html
    Using this method, it's possible to miss some files, which would cause the site not to load properly.
    Uploading via FTP is discussed here...
    http://www.iwebformusicians.com/Search-Engine-Optimization/Upload.html

  • All of a sudden, I can't print .html documents on Firefox (I can on IE.) What happened?

    As of yesterday, I cannot print an .html document while on Firefox. I print a lot of recipes from different sites and also was trying to print a credit card bill. I went to Internet Explorer and can print just fine there. What is going on?

    See this:
    http://kb.mozillazine.org/Problems_printing_web_pages

  • Can I run Mac Os 10.4 on my iMac G4 Dome? Where can I get install disks?

    I have an iMac G4 that currently has Mac OS 10.3.9 installed. I am unable to connect to the internet using my AirPort Extreme even though there is an AirPort card installed. Can I install Mac OS 10.4, and if so, where can I get installation CDs/DVDs for a PPC G4?

    Check in the Airport forums here:
    AirPort
    There may be a way to change security settings on the AEBS to accommodate the older OS.
    As for OS versions, any iMac G4 can run OS 10.4.11, and those with a 1 GHz or faster processor can run OS 10.5.8. However, demand has pushed prices of legitimate retail install packages for older Mac OS versions sky-high (often more than the computer is worth used) and has also attracted the attention of scam artists. There are NO downloadable versions of the Mac OS since 7.5. Any offered are scams and piracy.
    You cannot use the gray system install/restore disks that came with another Mac model. A retail disk is black with a big X and will NOT say "Up to Date" on it.
    Another caution: Some older G4 iMacs shipped with CD drives, not DVD drives, and most Tiger retail disks out there are DVDs. Make sure your computer can use a DVD system disk before you order.

  • HT201304 I can't remember the password I put on my restrictions setting. How many times can I try before I get locked out, and what do I do if I can't remember it?

    I cannot remember the password I set when I restricted my account (a precaution for my 10-year-old daughter). I'm on my 9th try. Will I get locked out after a certain number of attempts? How many tries do I get? Any suggestions? I don't want to have to start over (I think I read that somewhere).

    Not sure if there is a limit for the number of attempts for it, but if you can't remember the restrictions code then was it on the iPad when you last backed up ? If not then you can restore to that backup and it should be removed. If it was on the iPad when you last backed up then the only way to remove it is to reset the iPad back to factory defaults and you can then re-sync your content back to the iPad - you won't be able to restore to the backup as that will keep the code in place.

  • Can Xpath be used in LiveCycle Designer ES?

    Is it possible to use xpath in LiveCycle Designer ES?
    Thanks.

    Hi Capiono,
    If you are trying to use XPaths in the PDF, there is an applyXPath method available in the Acrobat API, documented in js_api_reference.pdf, which is part of the Acrobat X SDK available at http://www.adobe.com/devnet/acrobat/sdk/eula.html or in the Acrobat 8 version directly at http://wwwimages.adobe.com/www.adobe.com/content/dam/Adobe/en/devnet/acrobat/pdfs/js_api_reference.pdf#page=728
    So if my form had some xml like;
    <formData>
        <requestId name="RequestId">1263678</requestId>
    </formData>
    The following will display 1263678 on the console;
    console.println(XMLData.applyXPath($data, '//*[@name = "RequestId"]').item(0).value);
    That is the value of the first element with a name attribute of RequestId.
    I’ve never got this method to work with namespaces and the context is not what I would expect, in this example the root should be $data, but it is still possible to pick up elements from the XFA template.  This is one reason why the XPath expressions I used ended up being more complicated than they could have been, like the one above.
    So I stick with SOM expression, but for simple relative XPaths you can use the following to convert to a SOM;
    // Attempt to turn a relative XPath into a relative SOM expression
    // e.g. ../../clientId/text()  becomes  parent.parent.clientId
    function xpathToSom(xpath)
    {
        xpath = stripNamespacePrefixes(xpath);
        xpath = xpath.replace(/\.\.\//g, "parent.");
        xpath = xpath.replace(/\/text\(\)/, "");
        xpath = xpath.replace(/\//g, ".");
        return xpath;
    }
    // Strip all namespace prefixes from an xpath expression
    // e.g. /default:collections/default:collection/@reportingPeriod
    //       will become
    //  /collections/collection/@reportingPeriod
    function stripNamespacePrefixes(xpath)
    {
        return xpath.replace(/\/\w*?:/g, "/");
    }
    This is far from a general routine but maybe somewhere to start.
    Bruce

  • Can xpath be a number?

    an xml doc:
    <a>
    <b>b</b>
    <1>1</1>
    </a>
    Can I use this query:
    for $i in collection('////xmlContainer.dbxml')/a
    where $i/1 = 1
    return $i
    to query this doc?
    An exception occurred in my program:
    com.sleepycat.dbxml.XmlException: Error: Equality operator for given types not supported [err:XPTY0004], <query>:2:12, errcode = XPATH_EVALUATION_ERROR
    Can anybody help me? Thanks~

    The real question is how you created a document using an element name of "1", since that is not well-formed XML. Element names cannot start with a numeric character.
    Regards,
    George
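
The naming rule mentioned in this reply can be checked mechanically. Here is a small sketch, limited to ASCII names for brevity: the full XML Name production also allows many non-ASCII characters, which this pattern ignores. The function name isAsciiXmlName is invented for the example.

```javascript
// Simplified, ASCII-only test of the XML 1.0 Name rule:
// first character is a letter, "_" or ":"; later characters may also be digits, "-" or ".".
function isAsciiXmlName(name) {
    return /^[A-Za-z_:][A-Za-z0-9_:.\-]*$/.test(name);
}

console.log(isAsciiXmlName("b"));   // true
console.log(isAsciiXmlName("1"));   // false: starts with a digit
console.log(isAsciiXmlName("n1"));  // true: digits are fine after the first character
```

Renaming the element to something like n1 (a hypothetical name) and querying $i/n1 would sidestep the problem.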

  • How to send HTML DOM to Servlet?

    How to send HTML DOM to Servlet?

    What exactly do you mean by sending a DOM to a servlet? If you want to post the entire HTML to the servlet, use the XMLHttp object (XMLHttpRequest) and post the entire HTML with it. You can get more info on XMLHttp at Microsoft's MSDN site.
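
A minimal sketch of the posting approach described in this reply, assuming the goal is to send the page's current markup as text. The names postHtml and sendCurrentPage and the URL /myapp/receiveDom are hypothetical; the optional createXhr argument only exists so the function can be exercised without a browser.

```javascript
// Sketch: POST a string of HTML to a server-side endpoint.
// createXhr is an optional factory returning an XMLHttpRequest-like object.
function postHtml(url, html, createXhr) {
    var req = createXhr ? createXhr() : new XMLHttpRequest();
    req.open("POST", url, true);
    req.setRequestHeader("Content-Type", "text/html; charset=UTF-8");
    req.send(html);
    return req;
}

// In a page: serialize the live document and send it.
// "/myapp/receiveDom" is a hypothetical servlet mapping.
function sendCurrentPage() {
    return postHtml("/myapp/receiveDom", document.documentElement.outerHTML);
}
```

On the servlet side, the posted markup arrives as the request body and can be read in doPost from the request's reader (HttpServletRequest.getReader()).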

  • Parsing HTML to get DOM structure

    I have been looking at the various XML libraries such as JTidy, HotSax, Xalan, Tagsoup, htmlparser, etc. trying to find a library which would allow me to parse some HTML, retrieving the DOM structure of the document, without trying to make it any better.
    My goal is to write an application which is able to go through a huge bunch of html templates to modify some parts of it, and since these can be footers, headers, or just pieces of content, I don't want some HTML and BODY tags to be automatically generated...
    Is there any way I could achieve that? All the libraries I tried ended up generating some extra HTML in the DOM structure which I wasn't able to get rid of...

    Well, what I'm doing is a program which can process existing HTML templates so that I can refactor some patterns we have targeted to make everything more uniform.
    Thus I want to be able to read HTML code, alter it, and then produce the result without adding any extra tags guessed by a cleaner. The reason is simple, since the templates are only pieces of a final page, I don't want to end up with <html> tags inside every template piece!
    Oh and it is true that TagSoup is SAX based, but I mixed it with Xalan so that it produces a DOM tree. Here's the resource I found which helped me do that:
    http://www.hackdiary.com/archives/000041.html

Maybe you are looking for

  • NULL and Empty String

    Hi There, As far as I know, Null is not the same as an empty string; however, when I try this out, I get some unexpected results (well, at least unexpected for my liking): SQL> CREATE TABLE TS (MID NUMBER,   2  MDESC VARCHAR2(20) DEFAULT '' NOT NULL)

  • CD Burn Fail. Error 4280

    The attempt to burn a disk failed. An unknown error occured. 4280 How do I fix this? What does it mean? Diagnostics is able to read and report the drive and error. (Mashita UJ840D)

  • Freaky new User log-in option at start-up-arrgrhghghhh

    When I start-up (10.4...) I am presented with the log-in screen per my usual preference settings - I recently had to reset my passwords using the install cd to access the options for such a task (don't ask). Everything went as planned but when the co

  • Bug w/ xsl:attribute-set in java xdk

    Hi - I found what seems to be a bug in the 9.0.0.2.0.0A xdk beta for java. In xsl, if you have an empty xsl:attribute-set (one that uses other attribute sets but does not define any of it's own), the XSL Processor gives an internal error. Adding an a

  • How to Transfer iTunes content from old computer to new

    I have a new Windows 7 computer. I downloaded iTunes and now I want to transfer all my content from my old Windows Vista notebook to the new computer. How do I do it? I don't see any import/transfer solution. Thanks