Get content from html page

Hey guys, I'm looking at accessing a webpage, downloading the content, then stripping out the parts I want.
http://sunsolve.sun.com/search/document.do?assetkey=1-34-9-1
For example, I would like to be left with just the patches and their information, not the heading and intro. Where should I start?

Here is a class that can read a URL:

import java.io.ByteArrayOutputStream;
import java.io.IOException;
import java.io.InputStream;
import java.net.HttpURLConnection;
import java.net.URL;
import java.net.URLConnection;
import java.io.OutputStream;
import java.net.URLEncoder;

public class test {
     public static void main(String args[]) {
          if (args.length != 0) {
               new test(args);
          } else {
               new test();
          }
     }

     public test() {
          this.openURL("http://www.google.com", null);
     }

     public test(String[] args) {
          int i = 0;
          while (i < args.length) {
               // do something with the encoding, I am assuming UTF-8
               // but the openURL method can check the header for you
               try {
                    // fetch the i-th URL given on the command line
                    System.out.println(new String(this.openURL(args[i], null), "UTF-8"));
               } catch (Exception e) {
                    e.printStackTrace();
               }
               i++;
          }
     }

     public byte[] openURL(String urlpath, URL u) {
          // it is VERY important to read the entire response
          // if you want to connect to the same server again
          // this is because closing the inputstream does not close the socket
          // and response data from a previous request could be mixed up with the current
          InputStream is;
          OutputStream os;
          byte[] buf = new byte[1024];
          URLConnection urlc = null;
          try {
               URL a = null;
               if (u != null) {
                    a = u;
               } else {
                    a = new URL(urlpath);
               }
               urlc = a.openConnection();
               urlc.setDoOutput(false);
               // either setDoOutput to false or POST some info
//               os = urlc.getOutputStream();
//               String name = "key="+URLEncoder.encode("value", "UTF-8");
//               os.write(name.getBytes("UTF-8"));
//               os.close();
               is = urlc.getInputStream();
               int len = 0;
               ByteArrayOutputStream bos = new ByteArrayOutputStream();
               // read() returns -1 at the end of the stream
               while ((len = is.read(buf)) != -1) {
                    bos.write(buf, 0, len);
               }
               // close the inputstream
               is.close();
               return bos.toByteArray();
          } catch (Exception e) {
               e.printStackTrace();
               try {
                    // failing to read the inputstream does not mean the server did not send
                    // any data; here is how you can read that data. This is needed for the same
                    // reason mentioned above.
                    if (urlc instanceof HttpURLConnection) {
                         ((HttpURLConnection) urlc).getResponseCode();
                         InputStream es = ((HttpURLConnection) urlc).getErrorStream();
                         // getErrorStream() returns null if there is no error body
                         if (es != null) {
                              // read (and discard) the response body
                              while (es.read(buf) != -1) {
                              }
                              // close the errorstream
                              es.close();
                         }
                    }
               } catch (IOException ex) {
                    ex.printStackTrace();
                    // deal with the exception
               }
          }
          return new byte[0];
     }
}
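
Once the page has been downloaded, the heading and intro can be skipped by parsing the HTML and keeping only the text that follows some marker. The following is a minimal sketch using the JDK's built-in javax.swing.text.html parser, not a definitive solution: the class name TextAfterMarker, the method textAfter and the marker text "Patches" are assumptions made for illustration, and the real page may need a different marker.

```java
import java.io.IOException;
import java.io.StringReader;
import javax.swing.text.html.HTMLEditorKit;
import javax.swing.text.html.parser.ParserDelegator;

public class TextAfterMarker {

    /** Returns the text that follows the first text node containing the marker. */
    public static String textAfter(String html, final String marker) throws IOException {
        final StringBuilder out = new StringBuilder();
        final boolean[] seen = { false };
        HTMLEditorKit.ParserCallback callback = new HTMLEditorKit.ParserCallback() {
            @Override
            public void handleText(char[] data, int pos) {
                String text = new String(data);
                if (seen[0]) {
                    // everything after the marker is kept, one text node per line
                    out.append(text).append('\n');
                } else if (text.contains(marker)) {
                    // the marker itself is dropped, what follows is kept
                    seen[0] = true;
                }
            }
        };
        // third argument true: ignore charset directives in <meta> tags
        new ParserDelegator().parse(new StringReader(html), callback, true);
        return out.toString();
    }

    public static void main(String[] args) throws IOException {
        // hypothetical page: the real document will look different
        String html = "<html><body><h1>Intro</h1><p>Welcome</p>"
                + "<h2>Patches</h2><p>patch-001</p><p>patch-002</p></body></html>";
        System.out.print(textAfter(html, "Patches"));
    }
}
```

The bytes returned by openURL can be turned into a String with the right encoding and passed to textAfter.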
Here is some code to set a proxy:

// IF YOUR PROXY NEEDS AUTHENTICATION
// The Base64Encoder is part of the W3C tools:
// download Jigsaw and look for the base64... file
// http://www.google.nl/search?hl=nl&q=site%3Aw3c.org+jigsaw&lr=
// compile it and put it in [jre home]/lib/ext
// put this jar file in the classpath when you compile
String proxyUrl = "myproxy";
String user = "myUser";
String password = "myPassword";
URLConnection conn = url.openConnection();
if (proxyUrl != null) {
     System.getProperties().put("proxySet", "true");
     System.getProperties().put("proxyHost", proxyUrl);
     System.getProperties().put("proxyPort", "80");
     // encode "user:password", not just the password
     String pwd = user + ":" + password;
     Base64Encoder enc = new Base64Encoder(pwd);
     String encodedPassword = enc.processString();
     // optional: Basic authentication expects the "Basic " prefix
     conn.setRequestProperty("Proxy-Authorization", "Basic " + encodedPassword);
}
// start opening output or inputstream on the connection
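
On Java 8 or later the same idea can be sketched without the Jigsaw encoder, because java.util.Base64 ships with the JDK, and java.net.Proxy lets the proxy be chosen per connection instead of through system properties. This is only a sketch under those assumptions: the class name ProxyAuthSketch and its two methods are made up for illustration, and whether a given proxy accepts a hand-set Proxy-Authorization header depends on the proxy and the JDK configuration.

```java
import java.io.IOException;
import java.net.InetSocketAddress;
import java.net.Proxy;
import java.net.URL;
import java.net.URLConnection;
import java.nio.charset.StandardCharsets;
import java.util.Base64;

public class ProxyAuthSketch {

    /** Builds the value of a Basic authentication header from user and password. */
    public static String basicAuthHeader(String user, String password) {
        String credentials = user + ":" + password;
        return "Basic " + Base64.getEncoder()
                .encodeToString(credentials.getBytes(StandardCharsets.UTF_8));
    }

    /** Prepares (but does not yet connect) a connection that goes through an HTTP proxy. */
    public static URLConnection openViaProxy(URL url, String proxyHost, int proxyPort,
                                             String user, String password) throws IOException {
        Proxy proxy = new Proxy(Proxy.Type.HTTP, new InetSocketAddress(proxyHost, proxyPort));
        URLConnection conn = url.openConnection(proxy);
        conn.setRequestProperty("Proxy-Authorization", basicAuthHeader(user, password));
        return conn;
    }

    public static void main(String[] args) {
        // placeholder credentials taken from the snippet above
        System.out.println(basicAuthHeader("myUser", "myPassword"));
    }
}
```

The connection returned by openViaProxy is used like any other URLConnection: call getInputStream() and read the whole response.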

Similar Messages

  • Get parameters from html page from java application standalone ...

    Hi all,
    I am working on a solution where I have values in an HTML page, and I want to get the parameter values from the HTML and catch them in a standalone Java application.
    The HTML page is on the same host as the Java application.
    I want to know if this is possible, and whether I can get the parameters from a pure HTML page without HttpServlet.
    Thanks in Advance for the ideas,
    Antonio.

    Hi Abdul,
    The problem is that my client wants a solution with one simple HTML page and one standalone Java application. The application runs on one machine, but we don't have a web server. So the question is: is it possible, without a web server, to get the parameter values that are inside the HTML page from the Java application? Remember that the Java application is a .jar that runs from crontab with the command line "java -jar teste.jar".
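
If the values sit in <input> fields of an HTML file on disk, one possible approach needs no web server at all: read the file and let the JDK's HTML parser report each input tag. The sketch below assumes exactly that layout. The class name FormFieldReader, the method readInputs and the field names in the test page are hypothetical.

```java
import java.io.FileReader;
import java.io.IOException;
import java.io.Reader;
import java.util.LinkedHashMap;
import java.util.Map;
import javax.swing.text.MutableAttributeSet;
import javax.swing.text.html.HTML;
import javax.swing.text.html.HTMLEditorKit;
import javax.swing.text.html.parser.ParserDelegator;

public class FormFieldReader {

    /** Collects name/value pairs of all <input> tags, in document order. */
    public static Map<String, String> readInputs(Reader html) throws IOException {
        final Map<String, String> fields = new LinkedHashMap<>();
        HTMLEditorKit.ParserCallback callback = new HTMLEditorKit.ParserCallback() {
            @Override
            public void handleSimpleTag(HTML.Tag t, MutableAttributeSet a, int pos) {
                if (t == HTML.Tag.INPUT) {
                    Object name = a.getAttribute(HTML.Attribute.NAME);
                    Object value = a.getAttribute(HTML.Attribute.VALUE);
                    if (name != null) {
                        // inputs without a value attribute are stored as empty strings
                        fields.put(name.toString(), value == null ? "" : value.toString());
                    }
                }
            }
        };
        new ParserDelegator().parse(html, callback, true);
        return fields;
    }

    public static void main(String[] args) throws IOException {
        // usage: java FormFieldReader page.html
        try (Reader reader = new FileReader(args[0])) {
            for (Map.Entry<String, String> e : readInputs(reader).entrySet()) {
                System.out.println(e.getKey() + "=" + e.getValue());
            }
        }
    }
}
```

This only reads values that are written into the file itself. Values a user types into a browser are never stored in the HTML file, so they cannot be recovered this way.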

  • Help with getting links from HTML page

    Hello all. I found the Sun tutorial for getting HREF values from a tags in an HTML document at <http://java.sun.com/developer/TechTips/1999/tt0923.html>. My question now is how would a person add the ability to get the text of the link to this code?
    For example:
    Provided the HTML code:
        <a href="link.html">example</a>
    Returned is:
        href=link.html text=example

    I think the TechTip you've linked to is quite old (1999). I would write a simple SAXParser that uses TagSoup (http://www.ccil.org/~cowan/XML/tagsoup/) as its input source. In your handler, simply set a flag and reset a StringBuffer to collect the contents of any <a>...</a> element. Simplified:
        public void startElement(String uri, String localName, String qName, Attributes attributes) throws SAXException {
            if ("a".equals(localName)) {
                currentHref = attributes.getValue("href");
                if (currentHref != null && currentHref.length() > 0) {
                    inLink = true;
                    //reset the string buffer
                    buffer.setLength(0);
                }
            }
        }
        public void characters(char[] ch, int start, int length) throws SAXException {
            // collect text only while inside an <a> element
            if (inLink) buffer.append(ch, start, length);
        }
        public void endElement(String uri, String localName, String qName) throws SAXException {
            if ("a".equals(localName) && inLink) {
                inLink = false;
                //add link to the stack
                links.add(new Link(currentHref, buffer.toString()));
            }
        }
    Completely untested, of course... Good luck...
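
For anyone who prefers not to add TagSoup, the same flag-and-buffer idea can be sketched with the HTML parser that ships with the JDK. Note the swap: this uses HTMLEditorKit.ParserCallback rather than a SAX handler, and the class name LinkExtractor and the plain string format of the results are assumptions made for illustration.

```java
import java.io.IOException;
import java.io.StringReader;
import java.util.ArrayList;
import java.util.List;
import javax.swing.text.MutableAttributeSet;
import javax.swing.text.html.HTML;
import javax.swing.text.html.HTMLEditorKit;
import javax.swing.text.html.parser.ParserDelegator;

public class LinkExtractor {

    /** Returns one "href=... text=..." entry per <a href> element. */
    public static List<String> extract(String html) throws IOException {
        final List<String> links = new ArrayList<>();
        HTMLEditorKit.ParserCallback callback = new HTMLEditorKit.ParserCallback() {
            private String currentHref;
            private boolean inLink;
            private final StringBuilder buffer = new StringBuilder();

            @Override
            public void handleStartTag(HTML.Tag t, MutableAttributeSet a, int pos) {
                if (t == HTML.Tag.A) {
                    Object href = a.getAttribute(HTML.Attribute.HREF);
                    if (href != null) {
                        currentHref = href.toString();
                        inLink = true;
                        buffer.setLength(0); // reset the buffer for the new link
                    }
                }
            }

            @Override
            public void handleText(char[] data, int pos) {
                if (inLink) {
                    buffer.append(data); // collect the link text
                }
            }

            @Override
            public void handleEndTag(HTML.Tag t, int pos) {
                if (t == HTML.Tag.A && inLink) {
                    inLink = false;
                    links.add("href=" + currentHref + " text=" + buffer);
                }
            }
        };
        new ParserDelegator().parse(new StringReader(html), callback, true);
        return links;
    }

    public static void main(String[] args) throws IOException {
        String html = "<html><body><a href=\"link.html\">example</a></body></html>";
        for (String link : extract(html)) {
            System.out.println(link);
        }
    }
}
```

Anchors without an href attribute (named anchors) are skipped, just as in the SAX version.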

  • Values not getting displayed from first page of the report.

    Values in the report are displayed only from the second page.
    The first page of the report displays only the report title and column names.
    From the second page onwards, data and column names are generated.
    Can anyone please help me find the cause of the problem?

    what reporting tool?
    Interactive Reporting
    Financial Reporting

  • I have a 120gb Classic that has no space left on it. i am going to buy a larger gb capacity ipod as soon as i get clearer instruction on how to get the old ipod content to the new ipod. How do you get content from one ipod to the other?

    How do you get content from one iPod to a new one? My content is on an external hard drive, not on my PC, and I have run out of space on my 120 GB Classic. Can you get old iPod content onto a new one? My iTunes has only shortcuts; the real content is on an external drive. Can this be done? Please help.

    If the content is on an external drive, but your library knows where to find it, then it should all work. Connect your device, make some selections for what to put on it, and sync. If, on the other hand, your current iPod is the only place holding some of your media then see this user tip: Recover your iTunes library from your iPod or iOS device.
    tt2

  • How do I get content from my iPad to show up on the tv screen using Apple TV without going thru iTunes?

    How can I get content from my iPad and my air book to show up on the tv screen using Apple TV, without going thru iTunes?

    You will need to use AirPlay to see that.
    Assuming both devices are on the same network and that AirPlay is not turned off on the Apple TV, then simply tap on the screen when you are watching content you wish to stream to your Apple TV, then tap the airplay icon that appears in the control bar, choose the Apple TV from the menu that appears.
    When displaying the content you wish to mirror on the iPad 2 (or better), iPad Mini, iPhone 4S (or better), double tap the home button (quickly) and swipe the bottom row of apps to the right to reveal the playback controls, tap the AirPlay icon and select your Apple TV from the list of available devices.

  • Read Text from HTML-Pages and want to solve "ChangedCharSetException"

    Hello,
    I have an app that connects to pages via threads, parses them, and gives me only the text version of an HTML page. It works fine, but if it finds a page where the text is within images, the whole app stops and gives me this message:
    javax.swing.text.ChangedCharSetException
            at javax.swing.text.html.parser.DocumentParser.handleEmptyTag(DocumentParser.java:169)
            at javax.swing.text.html.parser.Parser.startTag(Parser.java:372)
            at javax.swing.text.html.parser.Parser.parseTag(Parser.java:1846)
            at javax.swing.text.html.parser.Parser.parseContent(Parser.java:1881)
            at javax.swing.text.html.parser.Parser.parse(Parser.java:2047)
            at javax.swing.text.html.parser.DocumentParser.parse(DocumentParser.java:106)
            at javax.swing.text.html.parser.ParserDelegator.parse(ParserDelegator.java:78)
            at aufruf.main(aufruf.java:33)
    So I tried to catch it with "getCharSetSpec()" and "keyEqualsCharSet()" from the class "javax.swing.text.ChangedCharSetException" and hoped that this would solve the problem. But it still doesn't work...
    Then I searched the web and found that I have to add the line:
    doc.putProperty("IgnoreCharsetDirective", new Boolean(true));
    "doc" is a new HTML document, created with the HTMLEditorKit. I do not have much knowledge about that, so I hope that someone can explain to me how I can solve the problem within my code.
    Here we go:
    import javax.swing.text.*;
    import java.lang.*;
    import java.util.*;
    import java.net.*;
    import java.io.*;
    import javax.swing.text.html.*;
    import javax.swing.text.html.parser.*;
    public class myParser extends Thread {
            private String name;
            public void run() {
                    try {
                            URL viele = new URL(name);   // "name" is a variable with a lot of links
                            URLConnection hs = viele.openConnection();
                            hs.connect();
                            if (hs.getContentType().startsWith("text/html")) {
                                    InputStream is = hs.getInputStream();
                                    InputStreamReader isr = new InputStreamReader(is);
                                    BufferedReader br = new BufferedReader(isr);
                                    Lesen los = new Lesen();
                                    ParserDelegator parser = new ParserDelegator();
                                    parser.parse(br,los, false);
                            }
                    } catch (MalformedURLException e) {
                            System.err.print("Doesn't work");
                    } catch (ChangedCharSetException e) {
                            e.getCharSetSpec();
                            e.keyEqualsCharSet();
                            e.printStackTrace();
                    } catch (Exception o) {
                    }
            }
            public void vowi(String n) {
                    name = n;
            }
    }
    And in case it is important, here is the class "Lesen":
    import java.net.*;
    import java.io.*;
    import javax.swing.text.*;
    import javax.swing.text.html.*;
    import javax.swing.text.html.parser.*;
    class Lesen extends HTMLEditorKit.ParserCallback {
            public void handleStartTag(HTML.Tag t, MutableAttributeSet a, int pos) {
                    try {
                            if ((t==HTML.Tag.P) || (t==HTML.Tag.H1) || (t==HTML.Tag.H2) || (t==HTML.Tag.H3) || (t==HTML.Tag.H4) || (t==HTML.Tag.H5) || (t==HTML.Tag.H6)) {
                                    System.out.println();
                            }
                    } catch (Exception q) {
                            System.out.println(q.getMessage());
                    }
            }
            public void handleSimpleTag(HTML.Tag t,MutableAttributeSet a, int pos) {
                    try {
                            if (t==HTML.Tag.BR) {
                                    System.out.println(); // new line
                                    System.out.println();
                            }
                    } catch (Exception qw) {
                            System.out.println(qw.getMessage());
                    }
            }
            public void handleText(char[] data, int pos) {
                    try {
                            System.out.print(data);   // prints the text from HTML pages
                    } catch (Exception ab) {
                            System.out.println(ab.getMessage());
                    }
            }
    }
    Thanks a lot for helping...
    Stephan

    Change
    parser.parse(br,los, false);
    to
    parser.parse(br,los, true);
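
The third argument of ParserDelegator.parse is the ignoreCharSet flag. With true, a charset declaration in a <meta> tag no longer interrupts parsing with a ChangedCharSetException. A small sketch of that behaviour follows; the class name IgnoreCharsetDemo and the sample page are made up for illustration.

```java
import java.io.IOException;
import java.io.StringReader;
import javax.swing.text.html.HTMLEditorKit;
import javax.swing.text.html.parser.ParserDelegator;

public class IgnoreCharsetDemo {

    /** Returns the text of the page, parsed with charset directives ignored. */
    public static String visibleText(String html) throws IOException {
        final StringBuilder text = new StringBuilder();
        HTMLEditorKit.ParserCallback callback = new HTMLEditorKit.ParserCallback() {
            @Override
            public void handleText(char[] data, int pos) {
                text.append(data).append(' ');
            }
        };
        // true = ignoreCharSet: the <meta> charset directive is skipped
        new ParserDelegator().parse(new StringReader(html), callback, true);
        return text.toString().trim();
    }

    public static void main(String[] args) throws IOException {
        String html = "<html><head><meta http-equiv=\"Content-Type\" "
                + "content=\"text/html; charset=ISO-8859-1\"></head>"
                + "<body><p>Hello</p></body></html>";
        System.out.println(visibleText(html));
    }
}
```

Ignoring the directive means the Reader's own encoding is used for the whole page, so the InputStreamReader should be created with the charset taken from the HTTP Content-Type header where one is available.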

  • Parsing the FRAME tag from HTML pages

    Hello to everybody,
    I am trying to parse the A tags & the Frame tags from HTML pages. I have developed the code below, which works for the A tags but it does not work for the Frame tags. Is there any idea about this?
    private void getLinks() throws Exception {
         System.out.println(diskName);
         links=new ArrayList();
         frames=new ArrayList();
         BufferedReader rd = new BufferedReader(new FileReader(diskName));
         // Parse the HTML
         EditorKit kit = new HTMLEditorKit();
         HTMLDocument doc = (HTMLDocument)kit.createDefaultDocument();
         doc.putProperty("IgnoreCharsetDirective", new Boolean(true));
         try {
              kit.read(rd, doc, 0);
         }
         catch (RuntimeException e) {return;}
         // Find all the FRAME elements in the HTML document, It finds nothing
         HTMLDocument.Iterator it = doc.getIterator(HTML.Tag.FRAME);
         while(it.isValid()) {
              SimpleAttributeSet s = (SimpleAttributeSet)it.getAttributes();
              String frameSrc = (String)s.getAttribute(HTML.Attribute.SRC);
              frames.add(frameSrc);
              // advance the iterator, otherwise this loop never ends once a FRAME is found
              it.next();
         }
         // Find all the A elements in the HTML document, it works ok
         it = doc.getIterator(HTML.Tag.A);
         while (it.isValid()) {
              SimpleAttributeSet s = (SimpleAttributeSet)it.getAttributes();
              String link = (String)s.getAttribute(HTML.Attribute.HREF);
              int endOfSet=it.getEndOffset(),
                  startOfSet=it.getStartOffset();
              String text=doc.getText(startOfSet,endOfSet-startOfSet);
              if (link != null)
                   links.add(new Link(link,text));
              it.next();
         }
    }

  • XMII Login from Html page

    Hi,
    How can we log in to xMII from an HTML page? For example, if I enter the username and password in an HTML page, it should log in to xMII directly. How can this be done?
    - senthil

    Jeremy,
    When I use the following URL to open a specific page, it works.
    http://server/Lighthammer/Login.jsp?IllumLoginName=accountname&IllumLoginPassword=accountpassword&session=true&target=/Test/report.irpt
    Question:
    1) This URL opens the html or irpt page itself directly, without the associated xMII navigation/menu and navigation bar. Is it possible to open the page with the associated xMII menu/navigation through a URL?

  • XMII Login from Html page under xMII Version 12

    Hi,
    I found this thread
    xMII Login from Html page
    but I'm not sure this will also work under xMII  Version 12.
    I have now this question:
    Is it possible to use a URL login with login name and password under xMII V12, similar to this example for Version 11.5:
    http://server/Lighthammer/Login.jsp?IllumLoginName=accountname&IllumLoginPassword=accountpassword&session=true&target=/Test/report.irpt
    Many thanks in advance

    Hi,
    Has anyone had any luck with this - displaying a v12 MII screen without requiring login?
    We need to be able to do this as well in order to display read-only screens on large screen monitors on the manufacturing floor without requiring login to MII.
    Under v11.5 it worked with no issues.  Under v12, we haven't figured out how to do it yet.
    I've waded through the NetWeaver UME documentation, have searched through the NetWeaver forums, etc. but to this point have had no luck in making it work.
    We've tried enabling the UME Guest account, assigning Guest to the anonymous group and guest role (and xMII Users role), creating a Navigation for the Guest user, but still the NetWeaver login screen is displayed.
    MII experts - if you are aware of how to do this can you please give detailed instructions instead of just referencing the NetWeaver / UME documentation?
    Thank you for your help!

  • How do I get content from my old iPod Touch to my new iPhone 5?  My iPod Touch is too old to use iCloud

    How do I get content from my old iPod Touch to my new iPhone 5?  My iPod Touch is too old to use iCloud. I have activated the phone but am stuck at the point of the set-up where my 3 choices are:
    1. Set up as new phone
    2.  Restore from icloud Backup
    3.  Restore from iTunes Backup
    thanks!

    If your iPod touch music is in your iTunes library, you just need to sync the new iPod nano with iTunes, the songs will be transfer to your new iPod. If not, you can follow this guide to transfer the songs from iPod touch to iTunes first, and then re-sync them to your new iPod. Hope it helps. Feel free to email me if you need further help.

  • Need to Copy a subform along with content from one page to another page

    Hi All,
    I am new to Adobe LiveCycle.
    I am facing a particular problem in one scenario.
    I have a growing list of items, i.e. the number of items is uncertain. I have put all these items in a subform.
    Now I need a copy of this subform from the first page on the second page.
    Basically, I want to copy a subform along with its content from one page to another.
    Can anybody please help me?

    In the source project, open the Tempo List (the one that is a list editor). Select all tempo changes and copy them (command+C).
    Close the project.
    Open the destination project, open the Tempo List, delete all information and paste (command+V). Remember that Logic should be stopped at the exact position where the first tempo event happens. This is usually 1.1.1.1, but check it in the source before closing it.
    Hope this helps.
    Regards

  • ExtendedScript from html page

    Hello,
    Can I write a script in the ExtendScript Toolkit (the script places a document in the InDesign application) and run this script from an HTML page?
    Or is there any integration between ExtendScript and HTML?
    thanks

    HTML pages are usually displayed in web browsers, whose security model is designed specifically against any access to local resources, especially such calls to local applications.
    Besides browsers, on the Mac the "Dashboard" allows you to write little applications ("widgets") exactly the way you're suggesting. It uses WebKit to display HTML.
    http://www.apple.com/downloads/dashboard/
    In order to communicate with ExtendScript, you'd have to go through the operating system's command line.
    http://developer.apple.com/documentation/AppleApplications/Conceptual/Dashboard_ProgTopics/Articles/CommandLine.html
    From there you'd invoke the scripting system with "OSAScript"
    http://developer.apple.com/documentation/Darwin/Reference/Manpages/man1/osascript.1.html
    then pass on your extendscript and arguments into InDesign via doScript.
    Another alternative to the main HTML browser is AIR, which also has an instance of WebKit for HTML display.
    http://livedocs.adobe.com/labs/air/1/quickstartshtml/
    Apparently it is possible to invoke BridgeTalk - the Creative Suite's inter application communication used by extendscript - straight from AIR. I don't yet have references how to do that, found the link below just yesterday.
    http://www.inthemod.com/bps/?p=165
    Dirk

  • How to send information from HTML page to JSP without reloading HTML page?

    Hello,
    Is it possible to send information(row number selected by user) from HTML page to JSP without reloading HTML page?
    Thanks.
    Oleg.

    Yes, you can do this with framesets and a hidden frame.
    You need a bit of JavaScript in the "visible" frame that
    sets the location of the hidden frame to the JSP.
    Add the user's choice as a parameter to the JSP URL.

  • How do i get content from my ipod touch to my new desk top computer

    how do i get content from my ipod touch to my new desktop computer?

    For iTunes purchases >  iTunes Store: Transferring purchases from your iPhone, iPad, or iPod to a computer
