Parsing an HTML page

I am trying to parse an HTML page read from the internet.
I assume I need to create a URL for its address, but after that I'm not sure how to go about reading the lines of that HTML page.
I would like to load each line as a String into an array or a Vector so that I can easily parse each line from there...
Any tips?
Thanks.

Haha... this is what I ended up doing, and I was able to parse it all pretty easily:
BufferedReader in = new BufferedReader(new InputStreamReader(url.openStream()));
in.readLine();
Thanks!
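For anyone who lands here later, here is that approach fleshed out into a complete loop that stores each line for later parsing. This is only a minimal sketch, and the URL is a placeholder:

import java.io.BufferedReader;
import java.io.InputStreamReader;
import java.net.URL;
import java.util.ArrayList;
import java.util.List;

public class PageLines {
    public static void main(String[] args) throws Exception {
        URL url = new URL("http://www.example.com/");   // placeholder address
        BufferedReader in = new BufferedReader(new InputStreamReader(url.openStream()));
        List<String> lines = new ArrayList<String>();   // a Vector would work the same way
        String line;
        while ((line = in.readLine()) != null) {        // readLine() returns null at the end of the stream
            lines.add(line);
        }
        in.close();
        System.out.println("Read " + lines.size() + " lines");
        // each element of 'lines' can now be parsed individually
    }
}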

Similar Messages

  • How to parse an HTML page to get its variables and values?

    Hi, everyone, here is my situation:
    I need to parse an HTML page to get the variables and their associated values between the <form>...</form> tags. For example, if you have a piece of HTML like the one below:
    <form>
    <input type = "hidden" name = "para1" value = "value1">
    <select name = "para2">
    <option>value2</option>
    </form>
    The actual page is much more complex than this. I want to retrieve para1 = value1 and para2 = value2. I tried JTidy but it doesn't recognize select. Could you recommend a good package for this purpose, preferably with sample code?
    Thanks a lot
    Kevin

    See for example the Request taglib from the Coldtags suite:
    http://www.servletsuite.com/jsp.htm

  • Parsing an HTML page

    I want to parse a page for specified content.
    I feel it should be easy to do, but my problem is that there are many URLs in the HTML page, and the parser has to follow each link and grab the specified content from that page as well. In this way it has to parse all the links.
    Can anyone help me with this? Also, if anyone has code for it, please share it. I have been trying a lot.
    Thanks, Swetha.

    Sounds like you are making a web spider. Here are a few open source spiders you could "dissect" (pun intended):
    http://java-source.net/open-source/crawlers
    There are also a fair number of tutorials on this kind of project if you poke around Google for a little bit. There are several ways you could approach this; most likely the one you choose will be based on how many URLs you plan on visiting and what you plan on doing there.
    If you only plan on visiting a small number of URLs, you could simply maintain a list of unvisited pages and a list of visited pages. These could be LinkedLists if you don't care about seeing the same page twice, or perhaps a HashSet if you do. So you pull off your first URL, read the contents of the page, find the occurrences of http://, and add each such URL to your unvisited list. When you are done with the current page, move its URL to the visited list (a rough sketch follows below).
    When you are parsing out the URLs, you could do something as simple as using a StringTokenizer to break the HTML code into words. Then you could tell a token is a link by calling something like s.startsWith("href=") and go from there...
    If you are going to be visiting many pages, you might want to investigate using multiple threads. In that case you'll need a list that is thread-safe, and you might want to throttle the threads (have them sleep a little between URL requests) so you don't go blasting their/your bandwidth...
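    A rough sketch of the small-scale approach described above -- single-threaded, with deliberately naive link extraction. The class name and seed URL are just placeholders:

    import java.io.BufferedReader;
    import java.io.InputStreamReader;
    import java.net.URL;
    import java.util.HashSet;
    import java.util.LinkedList;
    import java.util.Queue;
    import java.util.Set;

    // Visits a handful of pages, pulling out http:// links with a very naive scan.
    public class TinySpider {
        public static void main(String[] args) throws Exception {
            Queue<String> unvisited = new LinkedList<String>();
            Set<String> visited = new HashSet<String>();      // so the same page is not fetched twice
            unvisited.add("http://www.example.com/");         // placeholder seed URL

            while (!unvisited.isEmpty() && visited.size() < 10) {
                String current = unvisited.poll();
                if (!visited.add(current)) continue;          // already seen this one

                StringBuilder page = new StringBuilder();
                try {
                    BufferedReader in = new BufferedReader(
                            new InputStreamReader(new URL(current).openStream()));
                    String line;
                    while ((line = in.readLine()) != null) {
                        page.append(line).append(' ');
                    }
                    in.close();
                } catch (Exception e) {
                    continue;                                 // skip pages (or junk tokens) that will not load
                }

                // naive link extraction: keep any token that starts with http://
                for (String token : page.toString().split("[\\s\"'<>]+")) {
                    if (token.startsWith("http://")) {
                        unvisited.add(token);
                    }
                }
                System.out.println("visited " + current + ", queue size " + unvisited.size());
            }
        }
    }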

  • How to parse an HTML page and get useful information?

    Now I try to get the page by its URL. After getting the whole page, is there any way to get the useful text and discard the rest, such as ads and other related links?
    I tried to use java.util.regex.*;
    Is there any other method for doing this?

    Regex isn't a good method unless your requirements are quite simple. In general if you want a Java HTML parser they are not hard to find -- "java html parser" is a good choice of keywords for an internet search.

  • How to parse an HTML page

    What API or package can I use to parse an HTML page and obtain HTML DOM interfaces?

    Use JTidy to make the HTML well-formed, then use the DOM parser in the Xerces API.
    JTidy (recommended by the W3C, so it's probably pretty good):
    http://www.w3.org/People/Raggett/tidy/
    http://sourceforge.net/projects/jtidy
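    A minimal sketch of that combination. JTidy can hand back a W3C DOM Document directly, so you can use standard DOM calls on the result (you could equally feed its cleaned-up output to the Xerces DOM parser); the URL below is a placeholder:

    import java.io.InputStream;
    import java.net.URL;
    import org.w3c.dom.Document;
    import org.w3c.dom.NodeList;
    import org.w3c.tidy.Tidy;

    public class TidyDomExample {
        public static void main(String[] args) throws Exception {
            URL url = new URL("http://www.example.com/");         // placeholder page
            InputStream in = url.openStream();
            Tidy tidy = new Tidy();
            tidy.setQuiet(true);                                  // keep the warning noise down on messy HTML
            tidy.setShowWarnings(false);
            Document dom = tidy.parseDOM(in, null);               // tidied page as a W3C DOM Document
            in.close();
            NodeList anchors = dom.getElementsByTagName("a");     // standard DOM calls from here on
            System.out.println("Found " + anchors.getLength() + " links");
        }
    }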

  • Recordset onto basic HTML page

    How easy is it to insert a recordset onto a basic HTML page? Obviously I could convert the page to ASP and be able to do it, but this particular page has links to it all over the shop and changing the extension is not really possible.
    Thanks
    Dan...

    > How easy is it to insert a recordset onto a basic html page
    Impossible if you really mean "basic html". HTML has no facility for working with server-side data.
    Now - you could ask your host to parse all html pages as if they were ASP, but then you'd have to write the code yourself, because DW won't know that foo.html is actually capable of supporting ASP/VBScript.
    Murray --- ICQ 71997575, Adobe Community Expert
    "dan brown5" <[email protected]> wrote in
    message
    news:fna676$636$[email protected]..
    > How easy is it to insert a recordset onto a basic html
    page. Obviously I
    > could
    > convert the page to ASP and be able to do it, but this
    particular page has
    > links to it all over the shop and changing the extension
    is not really
    > possible.
    >
    > Thanks
    > Dan...
    >

  • Parsing the FRAME tag from HTML pages

    Hello everybody,
    I am trying to parse the A tags and the FRAME tags from HTML pages. I have developed the code below, which works for the A tags but does not work for the FRAME tags. Does anyone have an idea about this?
    private void getLinks() throws Exception {
        System.out.println(diskName);
        links = new ArrayList();
        frames = new ArrayList();
        BufferedReader rd = new BufferedReader(new FileReader(diskName));
        // Parse the HTML
        EditorKit kit = new HTMLEditorKit();
        HTMLDocument doc = (HTMLDocument) kit.createDefaultDocument();
        doc.putProperty("IgnoreCharsetDirective", new Boolean(true));
        try {
            kit.read(rd, doc, 0);
        } catch (RuntimeException e) {
            return;
        }
        // Find all the FRAME elements in the HTML document -- it finds nothing
        HTMLDocument.Iterator it = doc.getIterator(HTML.Tag.FRAME);
        while (it.isValid()) {
            AttributeSet s = it.getAttributes();
            String frameSrc = (String) s.getAttribute(HTML.Attribute.SRC);
            frames.add(frameSrc);
            it.next();   // without this the loop would never advance
        }
        // Find all the A elements in the HTML document -- this part works ok
        it = doc.getIterator(HTML.Tag.A);
        while (it.isValid()) {
            AttributeSet s = it.getAttributes();
            String link = (String) s.getAttribute(HTML.Attribute.HREF);
            int endOfSet = it.getEndOffset(), startOfSet = it.getStartOffset();
            String text = doc.getText(startOfSet, endOfSet - startOfSet);
            if (link != null) {
                links.add(new Link(link, text));
            }
            it.next();
        }
    }

  • Parse table data from HTML page

    Hello. I have a program that creates an HTML page with several tables to present some data. What I would like to do is extract the data row by row from one of the tables, and parse the data I'm interested in from each row. Can anyone suggest how I should approach this problem?

    Andrew,
    1. If you want to append these data to the existing ones, you have to read the XMP of the file.
    2. You have to add or modify the Dublin Core Description field <dc:description>. For example:
    <rdf:Description rdf:about='uuid:d659be9a-21d7-11d9-9b6a-c1fd593acb83'
      xmlns:dc='http://purl.org/dc/elements/1.1/'>
     <dc:format>image/jpeg</dc:format>
     <dc:description>
      <rdf:Alt>
       <rdf:li xml:lang='x-default'>Image Caption</rdf:li>
      </rdf:Alt>
     </dc:description>
    </rdf:Description>
    3. You have to replace the APP1 block in the JPG with the new XMP.
    Regards,
    Juan Pablo
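    For the original table question, one workable approach is the Swing HTML parser's callback interface. A minimal sketch follows; the class name and the idea of grouping cell text by row are illustrative, not from the thread:

    import java.io.FileReader;
    import java.util.ArrayList;
    import java.util.List;
    import javax.swing.text.MutableAttributeSet;
    import javax.swing.text.html.HTML;
    import javax.swing.text.html.HTMLEditorKit;
    import javax.swing.text.html.parser.ParserDelegator;

    // Collects the text of each table cell, grouped by row.
    public class TableGrabber extends HTMLEditorKit.ParserCallback {
        private final List<List<String>> rows = new ArrayList<List<String>>();
        private List<String> currentRow;
        private StringBuilder cell;

        public void handleStartTag(HTML.Tag t, MutableAttributeSet a, int pos) {
            if (t == HTML.Tag.TR) currentRow = new ArrayList<String>();
            if (t == HTML.Tag.TD || t == HTML.Tag.TH) cell = new StringBuilder();
        }

        public void handleText(char[] data, int pos) {
            if (cell != null) cell.append(data);     // only collect text while inside a cell
        }

        public void handleEndTag(HTML.Tag t, int pos) {
            if ((t == HTML.Tag.TD || t == HTML.Tag.TH) && currentRow != null && cell != null) {
                currentRow.add(cell.toString().trim());
                cell = null;
            }
            if (t == HTML.Tag.TR && currentRow != null) {
                rows.add(currentRow);
                currentRow = null;
            }
        }

        public List<List<String>> getRows() { return rows; }

        public static void main(String[] args) throws Exception {
            TableGrabber grabber = new TableGrabber();
            new ParserDelegator().parse(new FileReader(args[0]), grabber, true);
            for (List<String> row : grabber.getRows()) {
                System.out.println(row);             // one List of cell strings per table row
            }
        }
    }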

  • Retrieving an HTML page before parsing it

    Hi:
    I'm trying to parse some query results returned from a web site. The following code returned me nothing, not even the HTML header and the pre-filled tags. When I replace while(s2!=null) with a finite number of loops, say 100, it worked. But when I increased the number of loops to 300 (the actual page returns 327 lines of HTML code), it gave me nothing again. Could anybody please let me know what's wrong with my code, or what I should do to retrieve an HTML page before I parse it? Thanks.
    String s1 = new String();
    String s2 = new String();
    try {
        URL u = new URL(url);
        InputStream ins = u.openStream();
        InputStreamReader isr = new InputStreamReader(ins);
        BufferedReader br = new BufferedReader(isr);
        while (s2 != null) {
            s2 = br.readLine();
            s1 = s1.concat(s2);   // NOTE: this throws a NullPointerException once readLine() returns null at the end of the stream
        }
        // test part
        response.setContentType("text/html");
        PrintWriter out = response.getWriter();
        out.print(ServletUtilities.headWithTitle("Hello WWW") +
                  "<BODY>\n" + s1 +
                  "MANUALLY-ADDED" +
                  "</BODY></HTML>");
    } catch (Exception e) {
        e.printStackTrace();
    }

    Here is a simple [url http://forum.java.sun.com/thread.jsp?forum=31&thread=285107]example. Don't use String.concat(..) method. Use a StringBuffer and convert it to a String once the entire file has been read.
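    A minimal sketch of the suggested fix, accumulating the page in a StringBuilder (a StringBuffer works the same way) instead of using String.concat; the class and variable names are illustrative:

    import java.io.BufferedReader;
    import java.io.InputStreamReader;
    import java.net.URL;

    public class PageFetcher {
        // Reads the whole page into one String without the null-concat problem.
        public static String fetch(String address) throws Exception {
            URL url = new URL(address);
            BufferedReader br = new BufferedReader(new InputStreamReader(url.openStream()));
            StringBuilder page = new StringBuilder();
            String line;
            while ((line = br.readLine()) != null) {   // stops cleanly at the end of the stream
                page.append(line).append('\n');
            }
            br.close();
            return page.toString();
        }
    }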

  • Can I use data from a servlet in my static HTML page?

    First of all, I can NOT use JSP because of a web server restriction.
    I have a servlet which will give me some image links in an HTML file via the doGet and doPost methods. I also need the sizes of the images, and to compress the images if they are too large.
    My question is how I can pass the image sizes to the HTML page and how I can use them in the HTML files.
    Please advise me of some solutions to this problem.

    Yeah, you have 2 choices:
    1) Change your web server to one that allows JSP.
    2) Re-build the JSP system from scratch so that the one you make will work in your server. This would involve changing your so-called static HTML to have markings (like <% %> tags) where you should insert the values you need to insert. You would then have a servlet that reads the 'static' HTML, parses out the insertion tags, and inserts the values. It would then stream the results back to the user (see the sketch below).
    Of course, your HTML is not really static; it is dynamic, because the values you are inserting are capable of changing.
    If you don't want to upgrade the server to one that supports JSPs (if yours really doesn't), then have fun making your own system.
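    A rough sketch of option 2, assuming a made-up placeholder syntax like ${imageWidth} inside the "static" HTML. The servlet name, the template path, and the placeholder names are all hypothetical:

    import java.io.BufferedReader;
    import java.io.FileReader;
    import java.io.IOException;
    import java.io.PrintWriter;
    import java.util.HashMap;
    import java.util.Map;
    import javax.servlet.http.HttpServlet;
    import javax.servlet.http.HttpServletRequest;
    import javax.servlet.http.HttpServletResponse;

    // Reads a "static" HTML template, replaces ${...} placeholders, and streams the result back.
    public class TemplateServlet extends HttpServlet {
        protected void doGet(HttpServletRequest req, HttpServletResponse resp) throws IOException {
            Map<String, String> values = new HashMap<String, String>();
            values.put("imageWidth", "640");      // in reality these would come from your image-processing code
            values.put("imageHeight", "480");

            StringBuilder html = new StringBuilder();
            BufferedReader in = new BufferedReader(
                    new FileReader(getServletContext().getRealPath("/gallery.html")));   // hypothetical template
            String line;
            while ((line = in.readLine()) != null) {
                html.append(line).append('\n');
            }
            in.close();

            // parse out the insertion tags and put the values in
            String page = html.toString();
            for (Map.Entry<String, String> e : values.entrySet()) {
                page = page.replace("${" + e.getKey() + "}", e.getValue());
            }

            resp.setContentType("text/html");
            PrintWriter out = resp.getWriter();
            out.print(page);
        }
    }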

  • How IAS can integrate with Snacktory to get the main text from an HTML page

    Hi All,
    I am new to Endeca and IAS. I have a requirement: I need to get the main text from the whole HTML page before IAS saves text to the Endeca_Document_Text property.
    Because IAS saves all the text on the page to the Endeca_Document_Text property, it is not readable when shown in a web page, so I use a third-party API to filter out the main text from the original page, and now I want to save that text to the Endeca_Document_Text property.
    Another question: I get zero pages when doing the main-text filtering logic in my ParseFilter (HTMLMetatagFilter implements ParseFilter) using Snacktory.
    If it only does a little, it works fine; if it does more, the crawler fails to crawl the page. Does anyone know how to fix this?
    Log for the crawler:
    Successfully set recordstore configuration.
    INFO    2013-09-03 00:56:42,743    0    com.endeca.eidi.web.Main    [main]    Reading seed URLs from: /home/oracle/oracle/endeca/IAS/3.0.0/sample/myfirstcrawl/conf/endeca.lst
    INFO    2013-09-03 00:56:42,744    1    com.endeca.eidi.web.Main    [main]    Seed URLs: [http://www.liferay.com/community/forums/-/message_boards/category/]
    INFO    2013-09-03 00:56:43,497    754    com.endeca.eidi.web.db.CrawlDbFactory    [main]    Initialized crawldb: com.endeca.eidi.web.db.BufferedDerbyCrawlDb
    INFO    2013-09-03 00:56:43,498    755    com.endeca.eidi.web.Crawler    [main]    Using executor settings: numThreads = 100, maxThreadsPerHost=1
    INFO    2013-09-03 00:56:44,163    1420    com.endeca.eidi.web.Crawler    [main]    Fetching seed URLs.
    INFO    2013-09-03 00:56:46,519    3776    com.endeca.eidi.web.parse.HTMLMetatagFilter    [pool-1-thread-1]    come into EndecaHtmlParser getParse
    INFO    2013-09-03 00:56:46,519    3776    com.endeca.eidi.web.parse.HTMLMetatagFilter    [pool-1-thread-1]    come into HTMLMetatagFilter
    INFO    2013-09-03 00:56:46,519    3776    com.endeca.eidi.web.parse.HTMLMetatagFilter    [pool-1-thread-1]    meta tag viewport ==minimum-scale=1.0, width=device-width
    INFO    2013-09-03 00:56:52,889    10146    com.endeca.eidi.web.parse.HTMLMetatagFilter    [pool-1-thread-1]    come into EndecaHtmlParser getParse
    INFO    2013-09-03 00:56:52,889    10146    com.endeca.eidi.web.parse.HTMLMetatagFilter    [pool-1-thread-1]    come into HTMLMetatagFilter
    INFO    2013-09-03 00:56:52,890    10147    com.endeca.eidi.web.parse.HTMLMetatagFilter    [pool-1-thread-1]    meta tag viewport ==minimum-scale=1.0, width=device-width
    INFO    2013-09-03 00:56:59,184    16441    com.endeca.eidi.web.parse.HTMLMetatagFilter    [pool-1-thread-2]    come into EndecaHtmlParser getParse
    INFO    2013-09-03 00:56:59,185    16442    com.endeca.eidi.web.parse.HTMLMetatagFilter    [pool-1-thread-2]    come into HTMLMetatagFilter
    INFO    2013-09-03 00:56:59,185    16442    com.endeca.eidi.web.parse.HTMLMetatagFilter    [pool-1-thread-2]    meta tag viewport ==minimum-scale=1.0, width=device-width
    INFO    2013-09-03 00:57:07,057    24314    com.endeca.eidi.web.parse.HTMLMetatagFilter    [pool-1-thread-2]    come into EndecaHtmlParser getParse
    INFO    2013-09-03 00:57:07,057    24314    com.endeca.eidi.web.parse.HTMLMetatagFilter    [pool-1-thread-2]    come into HTMLMetatagFilter
    INFO    2013-09-03 00:57:07,057    24314    com.endeca.eidi.web.parse.HTMLMetatagFilter    [pool-1-thread-2]    meta tag viewport ==minimum-scale=1.0, width=device-width
    INFO    2013-09-03 00:57:07,058    24315    com.endeca.eidi.web.Crawler    [main]    Seeds complete.
    INFO    2013-09-03 00:57:07,090    24347    com.endeca.eidi.web.Crawler    [main]    Starting crawler shut down
    INFO    2013-09-03 00:57:07,095    24352    com.endeca.eidi.web.Crawler    [main]    Waiting for running threads to complete
    INFO    2013-09-03 00:57:07,095    24352    com.endeca.eidi.web.Crawler    [main]    Progress: Level: Cumulative crawl summary (level)
    INFO    2013-09-03 00:57:07,095    24352    com.endeca.eidi.web.Crawler    [main]    host-summary: www.liferay.com to depth 1
    host    depth    completed    total    blocks
    www.liferay.com    0    0    1    1
    www.liferay.com    1    0    0    0
    www.liferay.com    all    0    1    1
    INFO    2013-09-03 00:57:07,096    24353    com.endeca.eidi.web.Crawler    [main]    host-summary: total crawled: 0 completed. 1 total.
    INFO    2013-09-03 00:57:07,096    24353    com.endeca.eidi.web.Crawler    [main]    Shutting down CrawlDb
    INFO    2013-09-03 00:57:07,160    24417    com.endeca.eidi.web.Crawler    [main]    Progress: Host: Cumulative crawl summary (host)
    INFO    2013-09-03 00:57:07,162    24419    com.endeca.eidi.web.Crawler    [main]   Host: www.liferay.com:  0 fetched. 0.0 mB. 0 records. 0 redirected. 4 retried. 0 gone. 0 filtered.
    INFO    2013-09-03 00:57:07,162    24419    com.endeca.eidi.web.Crawler    [main]    Progress: Perf: All (cumulative) 23.6s. 0.0 Pages/s. 0.0 kB/s. 0 fetched. 0.0 mB. 0 records. 0 redirected. 4 retried. 0 gone. 0 filtered.
    INFO    2013-09-03 00:57:07,162    24419    com.endeca.eidi.web.Crawler    [main]    Crawl complete.
    ======================================
    Source code for the ParseFilter:
    package com.endeca.eidi.web.parse;

    import java.util.Map;
    import java.util.Properties;
    import org.apache.hadoop.conf.Configuration;
    import org.apache.log4j.Logger;
    import org.apache.nutch.metadata.Metadata;
    import org.apache.nutch.parse.HTMLMetaTags;
    import org.apache.nutch.parse.Parse;
    import org.apache.nutch.parse.ParseData;
    import org.apache.nutch.parse.ParseFilter;
    import org.apache.nutch.protocol.Content;
    import de.jetwick.snacktory.ArticleTextExtractor;
    import de.jetwick.snacktory.JResult;

    public class HTMLMetatagFilter implements ParseFilter {

        public static String METATAG_PROPERTY_NAME_PREFIX = "Endeca.Document.HTML.MetaTag.";
        public static String CONTENT_TYPE = "text/html";
        private static final Logger logger = Logger.getLogger(HTMLMetatagFilter.class);

        public Parse filter(Content content, Parse parse) throws Exception {
            logger.info("come into EndecaHtmlParser getParse");
            logger.info("come into HTMLMetatagFilter");
            // update the content with the main text in the html page
            // content.setContent(HtmlExtractor.extractMainContent(content));
            parse.getData().getParseMeta().add("FILTER-HTMLMETATAG", "ACTIVE");
            ParseData parseData = parse.getData();
            if (parseData == null) return parse;
            extractText(content, parse);
            logger.info("update the content with the main text content");
            return parse;
        }

        private void extractText(Content content, Parse parse) {
            try {
                ParseData parseData = parse.getData();
                if (parseData == null) return;
                Metadata md = parseData.getParseMeta();
                ArticleTextExtractor extractor = new ArticleTextExtractor();
                String sourceHtml = new String(content.getContent());
                JResult res = extractor.extractContent(sourceHtml);
                String text = res.getText();
                md.set("Endeca_Document_Text", text);
            } catch (Exception e) {
                // TODO: handle exception
            }
        }

        public static void log(String msg) {
            System.out.println(msg);
        }

        public Configuration getConf() {
            return null;
        }

        public void setConf(Configuration conf) {
        }
    }

    > but it only extracts URLs from <A> (anchor) tags. I want to be able to extract URLs from <MAP> tags as well
    Gee, do you think you could modify the code to check for MAP tags as well?
    > Can someone maybe point me to a page containing info on the HTML toolkit?
    It's called the API. Since you are using the HTMLEditorKit, an ElementIterator and an AttributeSet, I would start there.
    There is no API call that says "get me all the links", so you have to do a little work on your own.
    Maybe you could use a ParserCallback and, every time you get a new tag, check for the "href" attribute (a sketch follows below).
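    A minimal sketch of that ParserCallback idea, collecting href values from A and AREA (image map) tags and src values from FRAME tags. The class name is illustrative:

    import java.io.FileReader;
    import java.util.ArrayList;
    import java.util.List;
    import javax.swing.text.MutableAttributeSet;
    import javax.swing.text.html.HTML;
    import javax.swing.text.html.HTMLEditorKit;
    import javax.swing.text.html.parser.ParserDelegator;

    // Collects URLs from A, AREA and FRAME tags as the parser walks the page.
    public class LinkCollector extends HTMLEditorKit.ParserCallback {
        private final List<String> urls = new ArrayList<String>();

        public void handleStartTag(HTML.Tag t, MutableAttributeSet a, int pos) {
            grab(t, a);
        }

        public void handleSimpleTag(HTML.Tag t, MutableAttributeSet a, int pos) {
            grab(t, a);   // tags with no end tag, such as AREA and FRAME, usually arrive here
        }

        private void grab(HTML.Tag t, MutableAttributeSet a) {
            if (t == HTML.Tag.A || t == HTML.Tag.AREA) {
                Object href = a.getAttribute(HTML.Attribute.HREF);
                if (href != null) urls.add(href.toString());
            } else if (t == HTML.Tag.FRAME) {
                Object src = a.getAttribute(HTML.Attribute.SRC);
                if (src != null) urls.add(src.toString());
            }
        }

        public List<String> getUrls() { return urls; }

        public static void main(String[] args) throws Exception {
            LinkCollector collector = new LinkCollector();
            new ParserDelegator().parse(new FileReader(args[0]), collector, true);
            System.out.println(collector.getUrls());
        }
    }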

  • HTML page with FLASH object doesn't reload upon a redirect

    Am using:  ECC 6.0 and ABAP SAPGUI development
    Hi All,
    I'm seeing a frustrating issue that I'm hoping others have seen and resolved.
    Here is what I'm attempting, and below that is the issue:
    1) I have created an ABAP program in which I'm using the HTML viewer (class: cl_gui_html_viewer) within a container that is on one of my ABAP screens.
    2) I load up a web page on our (intra)network that displays an HTML page with a nice FLASH navigation object. This navigation object operates as such: when a node is clicked, it goes to another HTML page (on the same network) that parses out which node was clicked and, via JavaScript, submits a form for which I've defined a SAPEVENT.
    3) My ABAP program has defined the event handler for this SAP event and calls the appropriate method just fine (ON_SAPEVENT). I am able to trap the event details and do some other things.
    This is all working just fine, except... I want it to go back automatically to the first page (the original page), the one containing the FLASH navigation, after the page that trapped the SAPEVENT is complete. Easy? That's what I thought... I tried several different ways to do this: "go_back" on the HTML control, "show_url" (with the original URL)... even a redirect in the actual HTML page itself to go back.
    They all DO go back to the first page, but my Flash navigation object on that page NEVER shows up! It's almost like the frontend thinks it is already loaded and will not load it again. I thought maybe I needed to do a "flush" or some such, but that didn't seem to solve it. Has anyone seen this and resolved it, or knows what it is? The thing is... I can completely exit out of the program and the Flash object will load just fine (but only after waiting a minute or so). What gives... anyone know?
    Thanks in advance,
    Matt

  • Reading text from HTML pages and solving "ChangedCharSetException"

    Hello,
    I have an app that connects via threads to pages, parses them, and gives me only the text version of an HTML page. It works fine, but when it finds a page where the text is inside images, the whole app stops and gives me the message:
    javax.swing.text.ChangedCharSetException
            at javax.swing.text.html.parser.DocumentParser.handleEmptyTag(DocumentParser.java:169)
            at javax.swing.text.html.parser.Parser.startTag(Parser.java:372)
            at javax.swing.text.html.parser.Parser.parseTag(Parser.java:1846)
            at javax.swing.text.html.parser.Parser.parseContent(Parser.java:1881)
            at javax.swing.text.html.parser.Parser.parse(Parser.java:2047)
            at javax.swing.text.html.parser.DocumentParser.parse(DocumentParser.java:106)
            at javax.swing.text.html.parser.ParserDelegator.parse(ParserDelegator.java:78)
            at aufruf.main(aufruf.java:33)
    So I tried to catch it with "getCharSetSpec()" and "keyEqualsCharSet()" from the class "javax.swing.text.ChangedCharSetException" and hoped that this would solve the problem. But it still doesn't work...
    Then I looked around the web and found that I have to add the line:
    doc.putProperty("IgnoreCharsetDirective", new Boolean(true));
    "doc" is a new HTMLDocument, created with the HTMLEditorKit. I do not have much knowledge about that, so I hope someone can explain how I can solve that problem within my code.
    Here we go:
    import javax.swing.text.*;
    import java.util.*;
    import java.net.*;
    import java.io.*;
    import javax.swing.text.html.*;
    import javax.swing.text.html.parser.*;

    public class myParser extends Thread {
        private String name;

        public void run() {
            try {
                URL viele = new URL(name);                       // "name" is a variable with a lot of links
                URLConnection hs = viele.openConnection();
                hs.connect();
                if (hs.getContentType().startsWith("text/html")) {
                    InputStream is = hs.getInputStream();
                    InputStreamReader isr = new InputStreamReader(is);
                    BufferedReader br = new BufferedReader(isr);
                    Lesen los = new Lesen();
                    ParserDelegator parser = new ParserDelegator();
                    parser.parse(br, los, false);
                }
            } catch (MalformedURLException e) {
                System.err.print("Doesn't work");
            } catch (ChangedCharSetException e) {
                e.getCharSetSpec();
                e.keyEqualsCharSet();
                e.printStackTrace();
            } catch (Exception o) {
                // ignored
            }
        }

        public void vowi(String n) {
            name = n;
        }
    }

    And in case it is important, here is the class "Lesen":

    import java.net.*;
    import java.io.*;
    import javax.swing.text.*;
    import javax.swing.text.html.*;
    import javax.swing.text.html.parser.*;

    class Lesen extends HTMLEditorKit.ParserCallback {
        public void handleStartTag(HTML.Tag t, MutableAttributeSet a, int pos) {
            try {
                if ((t == HTML.Tag.P) || (t == HTML.Tag.H1) || (t == HTML.Tag.H2) || (t == HTML.Tag.H3)
                        || (t == HTML.Tag.H4) || (t == HTML.Tag.H5) || (t == HTML.Tag.H6)) {
                    System.out.println();
                }
            } catch (Exception q) {
                System.out.println(q.getMessage());
            }
        }

        public void handleSimpleTag(HTML.Tag t, MutableAttributeSet a, int pos) {
            try {
                if (t == HTML.Tag.BR) {
                    System.out.println(); // new line
                    System.out.println();
                }
            } catch (Exception qw) {
                System.out.println(qw.getMessage());
            }
        }

        public void handleText(char[] data, int pos) {
            try {
                System.out.print(data);                 // prints the text from the HTML pages
            } catch (Exception ab) {
                System.out.println(ab.getMessage());
            }
        }
    }

    Thanks a lot for helping...
    Stephan

    Change
        parser.parse(br, los, false);
    to
        parser.parse(br, los, true);
    The third argument is the "ignore charset" flag; passing true tells the parser to ignore the page's charset directive, which is what triggers the ChangedCharSetException.

  • How to link from an HTML page to a specific frame in Flash CS5 AS3

    Hi!
    I'm kind of new around here. I am interested in knowing how to link from a specific HTML page to a specific frame in Flash CS5 AS3.
    I have a website that I originally began to design in Flash but later started developing new pages for in HTML. The Flash part of it has several pages on different frames, and I have created links from the Flash part to the other HTML pages, but I can only link the HTML pages back to the main Flash home page, not to the other pages in the Flash part of the website.
    I have read that in CS3 it was possible using the FlashVars skip variable, but I don't know how to do it. I have not yet seen any working examples and I could not find any instructions / tutorials online for CS5.
    Can someone help here?

    Add a query string to the SWF's embedding HTML, with a variable/value pair indicating the frame you want to display in your SWF. Add a JavaScript function to return the query string (or the entire URL), and call that JavaScript function from Flash using the ExternalInterface class. Finally, add code to your SWF to parse the returned URL or query string and then direct your timeline to the appropriate frame.

  • Displaying content on multiple HTML pages

    I’m building a basic website for a business/charity I work in. I’m no pro, so all my pages and templates are written in HTML. For convenience it would be nice to have certain bits of info appear throughout the website on different pages. As I understand it, the best way to do this is to create an RSS feed and then have the relevant web pages display the feed.
    However, I’ve been reading up on how to do this and I am finding it very complicated, and I am not even sure if it can be done with an HTML page. All the examples I have come across seem to be done in PHP. I’m not even sure what PHP is.
    My question therefore is: firstly, is RSS what I need, or is there a simpler way of having bits of text appear on multiple web pages? And, if so, can I have RSS feeds display in an HTML page? And, again, if so, can someone point me in the right direction to do this in the most simple but still efficient/reliable way?
    Thank you
    PS, Merry Christmas and Happy New Year

    > Firstly, the web page displaying the SSI code seems to require a .shtml extension, is this correct?
    Yes. It is true *unless* the host enables server parsing of all extensions.
    > Secondly, I don't seem to have to change the SSI source file's extension from .html to .ssi. And in fact I think it will make updating the website easier for my colleagues if I keep the extension to .html as, when I change the file extension to .ssi, Dreamweaver's properties inspector and CSS panels become inactive, meaning that the only way to edit the file is going straight into the code rather than using a point and click interface. Is there any reason the SSI source file should not keep an .html extension?
    Name the file being included anything you want. It doesn't matter to its functionality as an included file.
    > Is there any open source or low cost software out there that would make it easier for my colleagues to update the website's SSI files and still be able to format the text with the same CSS file the whole website uses. Is this what the Contribute program in Macromedia Studio does?
    A properly constructed include file should only contain references to CSS rules specified in the parent page. That being the case, if you are editing the include file directly, you cannot style the text unless you are doing it in Dreamweaver, or unless you make reference to the existing styles specified in the parent page.
    Contribute does lots more than what you ask. Go to Adobe's site and read about it.
    Murray --- ICQ 71997575, Adobe Community Expert
    "Chopo^2" <[email protected]> wrote in
    message
    news:[email protected]...
    > Ok, cool, thanks for all the help, I've even managed to
    get the text in
    > the
    > .ssi document to obey the same CSS rules as the rest of
    my web pages. Just
    > a
    > few last question concerning extensions and user
    friendliness before I?m
    > perfectly comfortable with using SSI.
    >
    > Firstly, the web page displaying the SSI code seems to
    require a .shtml
    > extension, is this correct?
    >
    > Secondly, I don?t seem to have to change the SSI source
    file?s extension
    > from
    > .html to .ssi. And in fact I think it will make updating
    the website
    > easier for
    > my colleagues if I keep the extension to .html as, when
    I change the file
    > extension to .ssi, Dreamweaver?s properties inspector
    and CSS panels
    > become
    > inactive meaning that the only way to edit the file is
    going state into
    > the
    > code rather than using a point and click interface. Is
    there any reason
    > the SSI
    > source file should not keep an .html extension.
    >
    > Is there any open source or low cost software out there
    that would make it
    > easier for my colleagues to update the website?s SSI
    files and still be
    > able to
    > format the text with the same CSS fill the whole website
    uses. Is this
    > what the
    > Contribute program in Macromedia Studio does?
    >
    > Thanks a lot everyone,
    >
    > Chopo
    >
