Java HTML Parser

What seems to be the best tool for HTML parsing, without converting to XHTML unless it's very robust and can handle any HTML page?
I've been looking at JTidy - http://java-source.net/open-source/html-parsers/jtidy but it has some problems converting from HTML to XHTML.
Jericho seems to parse HTML without converting to XHTML and looks reasonable:
http://jerichohtml.sourceforge.net/doc/index.html
The list at http://java-source.net/open-source/html-parsers has other HTML parsers; has anyone tried them?
I would like more information on other HTML parsers people have tried, preferably ones that convert to XHTML without any problems, so we can use a SAX parser to interpret the XML. Looking forward to your input.
Kind Regards
Abs

It kind of depends what you need to use it for.
Rent me and I'll tell you more. Googled it ;-)
http://sourceforge.net/projects/htmlparser/

Similar Messages

  • STYLE tag problem in HTML Parser.

    Hi,
    I am trying to parse an HTML file. I am able to extract the content of various tags like Tag.SPAN, Tag.DIV and so on.
    I want to extract the text content of Tag.STYLE. What should I do? The problem is that the HTML parser currently does not support this tag, along with 5 more tags such as Tag.META, Tag.PARAM and so on.
    Please help me out.

    Before responding to this posting, you may want to check out the discussion in the OP's previous posting on this topic:
    http://forum.java.sun.com/thread.jspa?threadID=634938

  • Don't understand error message from HTML parser?

    I've written a simple test program to parse a simple HTML file.
    Everything works fine except for the <img src="test.gif"> tag.
    The parser understands the img tag and handleSimpleTag gets called.
    I can even pick out the src attribute, but I get a very strange error message.
    When I run the test program below on the test.html file (also below) I get the following output:
    handleError(134) = req.att srcimg?
    What does "req.att srcimg?" mean?
    /John
    This is my test program:
    import javax.swing.text.html.*;
    import javax.swing.text.*;
    import javax.swing.text.html.parser.*;
    import java.io.*;
    public class htmltest extends HTMLEditorKit.ParserCallback {
        public htmltest() {
            super();
        }
        // Called by the parser for every problem it reports in the markup.
        public void handleError(String errorMsg, int pos) {
            System.err.println("handleError(" + pos + ") = " + errorMsg);
        }
        public static void main(String[] argv) throws Exception {
            Reader reader = new FileReader("test.html");
            new ParserDelegator().parse(reader, new htmltest(), false);
        }
    }
    This is the "test.html" file:
    <html>
    <head>
    </head>
    <body>
    This is a plain text.<br>
    This is <b>bold</b> and this is <i>itallic</i>!<br>
    <img src="test.gif">
    "This >is also a plain test text."<br>
    </body>
    </html>
    ----------------------------------------------------------------------

    The handleError() method is no better documented than the rest of the javax.swing.text.html package and its design. You can ignore what it reports if the parser's other results are correct and your HTML file is valid.

  • Attempting to use HTML parser - getAttribute() not performing as expected.

    How am I misusing getAttribute()?
    I am expecting (String)a.getAttribute((String)"name") to give me a value other than null in the example below. What am I doing wrong?
    The HTML test source (missing headers/body, so yes, it's not proper):
    <input name="unit_1" size=5 maxsize=5 value="hr">
    <input name="qty_1" size=5 value=4>
    <input name="unit_1" size=5 maxsize=5 value="hr">
    <input name="partnumber_1" size=10 value="Java Work">
    <input name="description_1" size=50 value="Slip shod work at outragous prices">
    <input name="sellprice_1" size=9 value=185.00>
    <input name="discount_1" size=3 value=>
    What I'd like to see is this:
    About to parse test
    Parsing error: invalid.tagattmaxsizeinput? at 39
    Tag start(<html>, 1 attrs)
    Tag start(<head>, 1 attrs)
    Tag end(</head>)
    Tag start(<body>, 1 attrs)
    Tag(<input>, 4 attrs)
    found input
    unit_1
    hr
    Tag(<input>, 3 attrs)
    found input
    qty_1
    4
    Rather than this:
    About to parse test
    Parsing error: invalid.tagattmaxsizeinput? at 39
    Tag start(<html>, 1 attrs)
    Tag start(<head>, 1 attrs)
    Tag end(</head>)
    Tag start(<body>, 1 attrs)
    Tag(<input>, 4 attrs)
    found input
    null
    null
    Tag(<input>, 3 attrs)
    found input
    null
    null
    The code that reads the HTML and gives the output looks like this:
    import java.io.*;
    import java.net.*;
    import javax.swing.text.*;
    import javax.swing.text.html.*;
    import javax.swing.text.html.parser.*;
    /*
     * This small demo program shows how to use the
     * HTMLEditorKit.Parser and its implementing class
     * ParserDelegator in the Swing system.
     */
    class DataSaved {
        String InputName;
        String InputValue;
        boolean IsHidden;
    }
    public class HtmlParseDemo {
        public static void main(String [] args) {
            DataSaved DataSet[];
            Reader r;
            if (args.length == 0) {
                System.err.println("Usage: java HTMLParseDemo [url | file]");
                System.exit(0);
            }
            String spec = args[0];
            try {
                if (spec.indexOf("://") > 0) {
                    URL u = new URL(spec);
                    Object content = u.getContent();
                    if (content instanceof InputStream) {
                        r = new InputStreamReader((InputStream)content);
                    }
                    else if (content instanceof Reader) {
                        r = (Reader)content;
                    }
                    else {
                        throw new Exception("Bad URL content type.");
                    }
                }
                else {
                    r = new FileReader(spec);
                }
                HTMLEditorKit.Parser parser;
                System.out.println("About to parse " + spec);
                parser = new ParserDelegator();
                parser.parse(r, new HTMLParseLister(), true);
                r.close();
            }
            catch (Exception e) {
                System.err.println("Error: " + e);
                e.printStackTrace(System.err);
            }
        }
    }
    /*
     * HTML parsing proceeds by calling a callback for
     * each and every piece of the HTML document. This
     * simple callback class simply prints an indented
     * structural listing of the HTML data.
     */
    class HTMLParseLister extends HTMLEditorKit.ParserCallback {
        int indentSize = 0;
        protected void indent() {
            indentSize += 3;
        }
        protected void unIndent() {
            indentSize -= 3; if (indentSize < 0) indentSize = 0;
        }
        protected void pIndent() {
            for(int i = 0; i < indentSize; i++) System.out.print(" ");
        }
        public void handleText(char[] data, int pos) {
            pIndent();
            System.out.println("Text(" + data.length + " chars)");
        }
        public void handleComment(char[] data, int pos) {
            pIndent();
            System.out.println("Comment(" + data.length + " chars)");
        }
        public void handleStartTag(HTML.Tag t, MutableAttributeSet a, int pos) {
            pIndent();
            System.out.println("Tag start(<" + t.toString() + ">, " +
                a.getAttributeCount() + " attrs)");
            indent();
        }
        public void handleEndTag(HTML.Tag t, int pos) {
            unIndent();
            pIndent();
            System.out.println("Tag end(</" + t.toString() + ">)");
        }
        public void handleSimpleTag(HTML.Tag t, MutableAttributeSet a, int pos) {
            String name;
            String value;
            boolean hidden;
            pIndent();
            System.out.println("Tag(<" + t.toString() + ">, " +
                a.getAttributeCount() + " attrs)");
            if( t==HTML.Tag.INPUT) {
                System.out.println("found input");
                // These String-keyed lookups are the ones that return null.
                name = (String)a.getAttribute((String)"name");
                value = (String)a.getAttribute((String)"value");
                System.out.println(name);
                System.out.println(value);
            }
        }
        public void handleError(String errorMsg, int pos){
            System.out.println("Parsing error: " + errorMsg + " at " + pos);
        }
    }

    System.out.println( a.getAttribute(HTML.Attribute.NAME) );
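
    The reply above works because the parser stores attributes under HTML.Attribute keys, not String keys, so a lookup with the string "name" finds nothing. A minimal sketch of the difference, assuming a hypothetical AttributeKeyDemo class and an inline HTML string (neither is from the original post):

```java
import java.io.IOException;
import java.io.StringReader;
import java.io.UncheckedIOException;
import java.util.ArrayList;
import java.util.List;
import javax.swing.text.MutableAttributeSet;
import javax.swing.text.html.HTML;
import javax.swing.text.html.HTMLEditorKit;
import javax.swing.text.html.parser.ParserDelegator;

// Hypothetical demo class: compares a String key with HTML.Attribute keys.
class AttributeKeyDemo extends HTMLEditorKit.ParserCallback {
    // One row per <input>: { lookup by "name", lookup by NAME, lookup by VALUE }
    final List<Object[]> inputs = new ArrayList<>();

    @Override
    public void handleSimpleTag(HTML.Tag t, MutableAttributeSet a, int pos) {
        if (t == HTML.Tag.INPUT) {
            inputs.add(new Object[] {
                a.getAttribute("name"),               // String key
                a.getAttribute(HTML.Attribute.NAME),  // typed key
                a.getAttribute(HTML.Attribute.VALUE)
            });
        }
    }

    // Parses the given markup and returns the rows collected for its <input> tags.
    static List<Object[]> parse(String html) {
        AttributeKeyDemo callback = new AttributeKeyDemo();
        try {
            new ParserDelegator().parse(new StringReader(html), callback, true);
        } catch (IOException e) {
            throw new UncheckedIOException(e);
        }
        return callback.inputs;
    }

    public static void main(String[] args) {
        for (Object[] row : parse("<input name=\"qty_1\" size=5 value=4>")) {
            System.out.println("String key: " + row[0]
                + ", HTML.Attribute keys: " + row[1] + " / " + row[2]);
        }
    }
}
```

    In the original handleSimpleTag, swapping "name" and "value" for HTML.Attribute.NAME and HTML.Attribute.VALUE should be the only change needed.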

  • Exception in html parser under Linux

    Hi all,
    The following code is copied from Tech Tip 23Sep1999. I have compiled it and run it under Win98, where it works fine for any URI. However, when I try to run it under Linux, it throws exceptions. I noticed that some web sites can be parsed with the program on Linux but some can't. I wonder what the difference between the platforms is. Can anyone tell me how to make the program work under Linux?
    Rgds,
    unplug
    configuration
    RedHat 7.1
    JDK1.3.1
    Failed: java GetLinks http://java.sun.com
    Worked: java GetLinks http://www.apache.org
    --beginning of code
    import java.io.*;
    import java.net.*;
    import javax.swing.text.*;
    import javax.swing.text.html.*;
    class GetLinks {
        public static void main(String[] args) {
            EditorKit kit = new HTMLEditorKit();
            Document doc = kit.createDefaultDocument();
            // The Document class does not yet
            // handle charset's properly.
            doc.putProperty("IgnoreCharsetDirective",
                Boolean.TRUE);
            try {
                // Create a reader on the HTML content.
                Reader rd = getReader(args[0]);
                // Parse the HTML.
                kit.read(rd, doc, 0);
                // Iterate through the elements
                // of the HTML document.
                ElementIterator it = new ElementIterator(doc);
                javax.swing.text.Element elem;
                while ((elem = it.next()) != null) {
                    SimpleAttributeSet s = (SimpleAttributeSet)
                        elem.getAttributes().getAttribute(HTML.Tag.A);
                    if (s != null) {
                        System.out.println(
                            s.getAttribute(HTML.Attribute.HREF));
                    }
                }
            } catch (Exception e) {
                e.printStackTrace();
            }
            System.exit(1);
        }
        // Returns a reader on the HTML data. If 'uri' begins
        // with "http:", it's treated as a URL; otherwise,
        // it's assumed to be a local filename.
        static Reader getReader(String uri)
            throws IOException {
            if (uri.startsWith("http:")) {
                // Retrieve from Internet.
                URLConnection conn=
                    new URL(uri).openConnection();
                return new
                    InputStreamReader(conn.getInputStream());
            } else {
                // Retrieve from file.
                return new FileReader(uri);
            }
        }
    }
    --End of code
    --Exception in Linux
    Exception in thread "main" java.lang.NoClassDefFoundError
    at java.lang.Class.forName0(Native Method)
    at java.lang.Class.forName(Class.java:120)
    at java.awt.Toolkit$2.run(Toolkit.java:512)
    at java.security.AccessController.doPrivileged(Native Method)
    at java.awt.Toolkit.getDefaultToolkit(Toolkit.java:503)
    at javax.swing.text.html.CSS.getValidFontNameMapping(CSS.java:932)
    at javax.swing.text.html.CSS$FontFamily.parseCssValue(CSS.java:1789)
    at javax.swing.text.html.CSS.getInternalCSSValue(CSS.java:531)
    at javax.swing.text.html.CSS.addInternalCSSValue(CSS.java:516)
    at javax.swing.text.html.StyleSheet.addCSSAttribute(StyleSheet.java:436)
    at javax.swing.text.html.HTMLDocument$HTMLReader$ConvertAction.start(HTMLDocument.java:2536)
    at javax.swing.text.html.HTMLDocument$HTMLReader.handleStartTag(HTMLDocument.java:1992)
    at javax.swing.text.html.parser.DocumentParser.handleStartTag(DocumentParser.java:145)
    at javax.swing.text.html.parser.Parser.startTag(Parser.java:333)
    at javax.swing.text.html.parser.Parser.parseTag(Parser.java:1786)
    at javax.swing.text.html.parser.Parser.parseContent(Parser.java:1821)
    at javax.swing.text.html.parser.Parser.parse(Parser.java:1980)
    at javax.swing.text.html.parser.DocumentParser.parse(DocumentParser.java:109)
    at javax.swing.text.html.parser.ParserDelegator.parse(ParserDelegator.java:74)
    at javax.swing.text.html.HTMLEditorKit.read(HTMLEditorKit.java:239)
    at GetLinks.main(GetLinks.java:23)

    Support for CSS is not clearly defined. Also, Dictionary getDocumentProperties() is not properly explained, meaning it doesn't give methods to get all the properties an HTML document can have.
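
    The stack trace shows the failure coming from the HTMLDocument/StyleSheet/CSS path, which calls Toolkit.getDefaultToolkit(). One possible workaround, offered as an assumption rather than a confirmed fix for this setup, is to skip HTMLDocument and collect links straight from the parser callbacks. LinkCollector and the sample markup below are hypothetical names for illustration:

```java
import java.io.IOException;
import java.io.Reader;
import java.io.StringReader;
import java.io.UncheckedIOException;
import java.util.ArrayList;
import java.util.List;
import javax.swing.text.MutableAttributeSet;
import javax.swing.text.html.HTML;
import javax.swing.text.html.HTMLEditorKit;
import javax.swing.text.html.parser.ParserDelegator;

// Hypothetical helper: gathers href values without building an HTMLDocument.
class LinkCollector extends HTMLEditorKit.ParserCallback {
    private final List<String> links = new ArrayList<>();

    @Override
    public void handleStartTag(HTML.Tag t, MutableAttributeSet a, int pos) {
        if (t == HTML.Tag.A) {
            Object href = a.getAttribute(HTML.Attribute.HREF);
            if (href != null) {       // <a name="..."> anchors have no href
                links.add(href.toString());
            }
        }
    }

    // Reads HTML from the reader and returns every href found on an <a> tag.
    static List<String> extract(Reader reader) throws IOException {
        LinkCollector collector = new LinkCollector();
        new ParserDelegator().parse(reader, collector, true);
        return collector.links;
    }

    // Convenience overload for in-memory markup.
    static List<String> extract(String html) {
        try {
            return extract(new StringReader(html));
        } catch (IOException e) {
            throw new UncheckedIOException(e);
        }
    }

    public static void main(String[] args) {
        String html = "<html><body><a href=\"http://www.apache.org\">Apache</a>"
            + "<a name=\"top\">anchor</a></body></html>";
        for (String link : extract(html)) {
            System.out.println(link);
        }
    }
}
```

    The getReader() method from the post could feed extract(Reader) unchanged.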

  • Error on HTML Parser

    Hi,
    I'm trying to parse a HTML page but I always get the same error, which is the following exception:
    javax.swing.text.ChangedCharSetException
    In the class ParserCallback I'm using the method handleError and it shows:
    req.att contentmeta?
    ioexception???
    just before the exception occurs.
    The only line where this error occurs in the html page is:
    <meta http-equiv="Content-Type" content="text/html; charset=iso-8859-1">
    and I know that the exact point is the attribute 'content'. If it is removed or changed to 'contenttype', the error disappears.
    The problem is that I can't change the attribute because the HTML page is not mine; it is fetched from the Web. And I don't want to remove it.
    Does anybody know what is happening?
    Thanks!!

    I am also having a problem with HTML parsing in Java.
    I have given a complete description of the problem at this link, along with the log and my sample code:
    http://forum.java.sun.com/thread.jspa?threadID=643683&tstart=0
    If you could take a look ...
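
    The third argument of ParserDelegator.parse() is ignoreCharSet. When it is false, a META tag declaring a charset makes the parser throw ChangedCharSetException; passing true tells it to carry on. A minimal sketch, assuming a hypothetical CharsetTolerantParse class and an inline copy of the META line quoted above:

```java
import java.io.IOException;
import java.io.Reader;
import java.io.StringReader;
import java.io.UncheckedIOException;
import javax.swing.text.ChangedCharSetException;
import javax.swing.text.html.HTMLEditorKit;
import javax.swing.text.html.parser.ParserDelegator;

// Hypothetical helper: collects page text, optionally ignoring charset directives.
class CharsetTolerantParse extends HTMLEditorKit.ParserCallback {
    private final StringBuilder text = new StringBuilder();

    @Override
    public void handleText(char[] data, int pos) {
        text.append(data).append('\n');
    }

    // Parses the stream; ignoreCharSet is passed straight through to the parser.
    static String parse(Reader reader, boolean ignoreCharSet) throws IOException {
        CharsetTolerantParse callback = new CharsetTolerantParse();
        new ParserDelegator().parse(reader, callback, ignoreCharSet);
        return callback.text.toString();
    }

    // Convenience wrapper for in-memory markup with the directive ignored.
    static String parseIgnoringCharset(String html) {
        try {
            return parse(new StringReader(html), true);
        } catch (IOException e) {
            throw new UncheckedIOException(e);
        }
    }

    public static void main(String[] args) throws IOException {
        String page = "<html><head><meta http-equiv=\"Content-Type\" "
            + "content=\"text/html; charset=iso-8859-1\"></head>"
            + "<body><p>Hello</p></body></html>";
        try {
            parse(new StringReader(page), false);
        } catch (ChangedCharSetException e) {
            // Reported when the directive is not ignored.
            System.out.println("Charset directive seen: " + e.getCharSetSpec());
        }
        System.out.print(parseIgnoringCharset(page));
    }
}
```

    If the declared charset matters, an alternative is to catch ChangedCharSetException, read getCharSetSpec(), and reopen the stream with that encoding.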

  • JRE1.5 swing.html parser fails to parse data between script tags

    Hi all...
    I've written a class that extends the java-provided default HTML parser to parse for text inside a table. In JRE1.4.* the parser works fine and extracts data between <script> tags as text. However now that I've upgraded to 1.5, the data between <script> tags are no longer parsed. Any suggestion anyone?
    Steve

    According to the API docs, the 1.5 parser supports HTML 3.2, for which the spec states that the content of SCRIPT and STYLE tags should not be rendered. I assume it doesn't have a scripting engine, so it won't get executed either.

  • Webpage (HTML) parsing...

    Any ideas on how to parse an HTML page? I'm trying to do it with a StreamTokenizer but with little success. I don't think this class was made to do this sort of thing, ordinarily anyway. Is there a better choice? StringTokenizer? Here's what I have so far:
      URLConnection uc = url.openConnection();
      BufferedReader br = new BufferedReader(new InputStreamReader
                                            (uc.getInputStream()));
      StreamTokenizer stok = new StreamTokenizer(br);
      stok.eolIsSignificant(false);
      String inputLine;
      for (int i=0; (stok.nextToken() != StreamTokenizer.TT_EOF); i++) {
        System.out.println("token #" + i + stok.toString());
      }
    It gives me a result like this:
    token #0Token['<'], line 3
    token #1Token[script], line 3
    token #2Token[language], line 3
    token #3Token['='], line 3
    token #4Token[javascript], line 3
    token #5Token['>'], line 3
    token #6Token['<'], line 4
    token #7Token['!'], line 4
    token #8Token['-'], line 4
    token #9Token['-'], line 4
    token #10Token[function], line 5
    token #11Token[dojump], line 5
    token #12Token['('], line 5
    token #13Token[')'], line 5
    token #14Token['{'], line 6
    token #15Token[document.location.href], line 7
    token #16Token['='], line 7
    token #17Token[play247.asp?page=promo&id=72&r=R2], line 7
    What I want is all the links that have "promo" as a parameter. Any suggestions?

    Java has a callback parser, which notifies you when start/end tags are found. Then you can query the attributes and search for the desired string. Here's a sample to get you started:
    import java.io.FileReader;
    import java.io.IOException;
    import java.io.Reader;
    import javax.swing.text.MutableAttributeSet;
    import javax.swing.text.html.HTML;
    import javax.swing.text.html.HTMLEditorKit;
    import javax.swing.text.html.parser.ParserDelegator;
    public class TestParser extends HTMLEditorKit.ParserCallback {
         boolean ignoreText;
         public static void main(String[] args) throws IOException {
              TestParser parser = new TestParser();
              // args[0] is the file to parse
              Reader reader = new FileReader(args[0]);
              try {
                   new ParserDelegator().parse(reader, parser, false);
              } catch (IOException e) {
                   System.out.println(e);
              }
         }
         public void handleComment(char[] data, int pos) {
              System.out.println(data);
         }
         public void handleEndOfLineString(String eol) {
         }
         public void handleEndTag(HTML.Tag tag, int pos) {
              System.out.println("/" + tag);
         }
         public void handleError(String errorMsg, int pos) {
              System.out.println(pos + ":" + errorMsg);
         }
         public void handleMutableTag(HTML.Tag tag, MutableAttributeSet a, int pos) {
              System.out.println("mutable:" + tag + ": " + pos + ": " + a);
         }
         // Tags without a closing tag, such as <img> or <br>
         public void handleSimpleTag(HTML.Tag tag, MutableAttributeSet a, int pos) {
              System.out.println( tag + ":" + a );
         }
         // Opening tags such as <a href="...">; the attributes are in 'a'
         public void handleStartTag(HTML.Tag tag, MutableAttributeSet a, int pos) {
              System.out.println( tag + ":" + a );
         }
         public void handleText(char[] data, int pos) {
              System.out.println( data );
         }
    }

  • Problem with HTML Parser and multiple instances

    I have a parser program which queries an online shopping comparison web page and extracts the information needed. I am trying to run this program with different search terms, which are created by splitting a sentence, so each one is sent separately. However, the outputs (text files) are the same for each word, even though the correct term and output file appear to be passed. I suspect the connection is not being closed each time, but I am not sure why this is happening.
    If I create an identical copy of the program and run that after the first one, it works, but this is not an appropriate solution.
    Any help would be much appreciated. Here is some of my code; if more is required I will post it.
    To run the program:
    StringTokenizer t = new StringTokenizer("red green yellow", " ");
            int c = 0;
            Parser1 p = new Parser1();
            while (t.hasMoreTokens()) {
                c++;
                String tok = t.nextToken();
                File tem = new File("C:/"+c+".txt");
                    p.mainprog(tok, tem);
                    p.mainprog(tok, tem);
                    p.mainprog(tok, tem);
    }
    The parser:
    import javax.swing.text.html.parser.*;
    import javax.swing.text.html.*;
    import javax.swing.text.*;
    import java.awt.*;
    import java.util.*;
    import javax.swing.*;
    import java.io.*;
    import java.net.*;
    public class Parser1 extends HTMLEditorKit.ParserCallback {
        variable declarations
       public void handleStartTag(HTML.Tag t, MutableAttributeSet a, int pos){
    ...methods
      public void handleText(char[] data, int pos){
           ...methods
      public void handleTitleTag(HTML.Tag t, char[] data){
      public void handleEmptyTag(HTML.Tag t, char[] data){       
      public void handleSimpleTag(HTML.Tag t, MutableAttributeSet a, int pos){
    ...methods
          static void mainprog(String term, File file) {   
    ...proxy and authentication methods
                        Authenticator.setDefault(new MyAuthenticator() );
                        HTMLEditorKit editorKit = new HTMLEditorKit();
                        HTMLDocument HTMLDoc;
                        Reader HTMLReader;
                      try {
                            String temp = new String(term);
                            String fullurl = new String(MainUrl+temp);
                            url = new URL(fullurl);
                            InputStream myInStream;
                            myInStream = url.openConnection().getInputStream();
                            HTMLReader = (new InputStreamReader(myInStream));
                            HTMLDoc = (HTMLDocument) editorKit.createDefaultDocument();
                            HTMLDoc.putProperty("IgnoreCharsetDirective", new Boolean(true));
                            ParserDelegator parser = new ParserDelegator();
                            HTMLEditorKit.ParserCallback callback = new Parser1();
                            parser.parse(HTMLReader, callback, true);
                            callback.flush();
                            HTMLReader.close();
                            myInStream.close();
                     catch (IOException IOE) {
                        IOE.printStackTrace();
                    catch (Exception e) {
                        e.printStackTrace();
          try {
                FileWriter writer = new FileWriter(file);
                BufferedWriter bw = new BufferedWriter(writer);
                for (int i = 0; i < vect.size(); i++){
                    bw.write((String)vect.elementAt(i));
                    if (vect.elementAt(i)!=vect.lastElement()){
                        bw.newLine();
                bw.flush();
                bw.close();
                writer.close();
            catch (IOException IOE) {
                        IOE.printStackTrace();
                    catch (Exception e) {
                        e.printStackTrace();
              }   catch (IOException IOE) {
                     System.out.println("User options not found.");
    }

    How many Directory Servers are you using?
    Are both serverconfig.xml files of PS instances the same?
    Set debug level to message in the appropriate AMConfig.properties of your portal instances and look into AM debug files.
    For some reason amSDK seems not to get the correct service values.
    -Bernhard

  • Using the HTML Parser as a filter

    I need to take an HTML file, filter it so that the SRC attribute on all of the IMG tags is modified, and then write the filtered file out. This seems pretty straightforward, but I can't seem to figure out how to get the Swing HTML parser to do what I want.
    If I had some way to parse the file into some structure, and then be able to take that structure and "unparse" it back into a file I would be fine. This assumes that I'd be able to override the handling of the IMG tag (or I guess all of the simple tags) on the parser side so that I'd be able to replace one of the attributes with something that's more meaningful for my purpose.
    I know this is not so difficult, I just can't see how to do it. Any help? Thanks in advance.
    Sander Smith

    If the problem is that simple, I'd use the java.util.regex classes (Pattern and Matcher) and could write the program in an hour.
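
    A rough sketch of that regex approach, under the assumption that every src value is quoted and that a pattern match is good enough for the files in question (a regex is not a full HTML parser). ImgSrcRewriter and the mapping function are hypothetical names, not something from the thread:

```java
import java.util.function.UnaryOperator;
import java.util.regex.Matcher;
import java.util.regex.Pattern;

// Hypothetical helper: rewrites the src value of every <img> tag in a string.
class ImgSrcRewriter {
    // Group 1: "<img ... src=", group 2: the quote character, group 3: the src value.
    private static final Pattern IMG_SRC = Pattern.compile(
        "(<img\\b[^>]*?\\bsrc\\s*=\\s*)([\"'])(.*?)\\2",
        Pattern.CASE_INSENSITIVE | Pattern.DOTALL);

    // Returns a copy of html in which each quoted img src is replaced by mapper's result.
    static String rewrite(String html, UnaryOperator<String> mapper) {
        Matcher m = IMG_SRC.matcher(html);
        StringBuffer out = new StringBuffer();
        while (m.find()) {
            String replacement = m.group(1) + m.group(2)
                + mapper.apply(m.group(3)) + m.group(2);
            // quoteReplacement keeps '$' and '\' in the new value literal.
            m.appendReplacement(out, Matcher.quoteReplacement(replacement));
        }
        m.appendTail(out);
        return out.toString();
    }

    public static void main(String[] args) {
        String html = "<p><img src=\"test.gif\" alt=\"x\"></p>";
        System.out.println(rewrite(html, src -> "images/" + src));
    }
}
```

    Everything outside the matched src values is copied through untouched, which is what makes this usable as a filter; unquoted attributes and look-alike attribute names such as data-src are not handled by this pattern.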

  • Control of HTML parsing

    Hello,
    I am trying to remove certain tags from an HTML source, but I am unfamiliar with the use of the ParserDelegator or ParserCallback classes. Say you have this example here:
    import javax.swing.text.html.parser.*;
    import javax.swing.text.html.*;
    import javax.swing.text.*;
    import java.io.*;
    public class TextFromHtml extends HTMLEditorKit.ParserCallback {
      public void handleText(char[] data, int pos) {
          System.out.println(new String(data));  // or other code you like
      }
      public static void main(String argv[]) {
        try {
          Reader r = new FileReader("testPrint.html");
          ParserDelegator parser = new ParserDelegator();
          HTMLEditorKit.ParserCallback callback = new TextFromHtml();
          parser.parse(r, callback, true);  // or 'false' if you like
        }
        catch (IOException e) {
          e.printStackTrace();
        }
      }
    }
    Is there any way I can control which HTML lines are parsed?
    Thanks in advance!
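
    The parser always reads the whole document, but the callback can choose what to keep. One way to do that, sketched here with a hypothetical SelectiveTextExtractor class and an arbitrary choice of tags to skip, is to count how deep the parser currently is inside unwanted elements and ignore text while that count is above zero:

```java
import java.io.IOException;
import java.io.StringReader;
import java.io.UncheckedIOException;
import java.util.ArrayList;
import java.util.Arrays;
import java.util.HashSet;
import java.util.List;
import java.util.Set;
import javax.swing.text.MutableAttributeSet;
import javax.swing.text.html.HTML;
import javax.swing.text.html.HTMLEditorKit;
import javax.swing.text.html.parser.ParserDelegator;

// Hypothetical helper: keeps text only when it is outside the skipped tags.
class SelectiveTextExtractor extends HTMLEditorKit.ParserCallback {
    private final Set<HTML.Tag> skipped;
    private final List<String> kept = new ArrayList<>();
    private int skipDepth = 0;   // > 0 while inside at least one skipped element

    SelectiveTextExtractor(Set<HTML.Tag> skipped) {
        this.skipped = skipped;
    }

    @Override
    public void handleStartTag(HTML.Tag t, MutableAttributeSet a, int pos) {
        if (skipped.contains(t)) {
            skipDepth++;
        }
    }

    @Override
    public void handleEndTag(HTML.Tag t, int pos) {
        if (skipped.contains(t) && skipDepth > 0) {
            skipDepth--;
        }
    }

    @Override
    public void handleText(char[] data, int pos) {
        if (skipDepth == 0) {
            kept.add(new String(data));
        }
    }

    // Parses html and returns the text pieces found outside the skipped tags.
    static List<String> extract(String html, Set<HTML.Tag> skipped) {
        SelectiveTextExtractor callback = new SelectiveTextExtractor(skipped);
        try {
            new ParserDelegator().parse(new StringReader(html), callback, true);
        } catch (IOException e) {
            throw new UncheckedIOException(e);
        }
        return callback.kept;
    }

    public static void main(String[] args) {
        Set<HTML.Tag> skip = new HashSet<>(Arrays.asList(HTML.Tag.H1, HTML.Tag.PRE));
        String html = "<html><body><h1>Title</h1><p>keep me</p></body></html>";
        for (String piece : extract(html, skip)) {
            System.out.println(piece);
        }
    }
}
```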

    I was able to extract the img src tag code using some string manipulation, but now I have a problem with parsing the rest of the HTML source...
    import javax.swing.text.html.parser.*;
    import javax.swing.text.html.*;
    import javax.swing.text.*;
    import javax.swing.*;
    import java.io.*;
    public class TextFromHtml extends HTMLEditorKit.ParserCallback {
            static BufferedReader reader;
            static PrintWriter pw;
            // Trims the text and collapses runs of whitespace into single spaces.
            public static String trimmer(String trimText) {
                    return (trimText.trim()).replaceAll("\\s+", " ");
            }
            public void handleText(char[] data, int pos) {
                    String text = new String(data);
                    text = trimmer(text);
                    if (text.indexOf(">") != -1) {
                            String[] temp = text.split(">");
                            if (temp.length > 1)
                                    text = temp[1].trim();
                            else text = "";
                    }
                    pw.println(text);
            }
            //public void removeTags(String file)
            public static void main(String[] args) {
                    String file = "";
                    String queue = "";
                    try {
                            String s = args[0];
                            reader = new BufferedReader(new FileReader(s));
                            file = s.substring(0, s.lastIndexOf('.')) + ".txt";
                            pw = new PrintWriter(new BufferedWriter(new FileWriter(file)));
                            ParserDelegator parser = new ParserDelegator();
                            HTMLEditorKit.ParserCallback callback = new TextFromHtml();
                            parser.parse(reader, callback, true);
                            pw.close();
                            reader = new BufferedReader(new FileReader(file));
                            BufferedWriter fout = new BufferedWriter(new FileWriter("out.txt"));
                            while ((queue = reader.readLine()) != null) {
                                    if (queue.length() != 0) {
                                            System.out.println(queue);
                                            fout.write(queue);
                                    }
                            }
                            reader.close();
                    } catch (IOException e) {
                            e.printStackTrace();
                    }
                    //return array;
            }
    }
    For some reason, I can only output queue to the CMD prompt; anything else, such as output to another file or returning queue as an array of Strings, gives me null values. Does anyone see what is wrong with the code?
    Thanks in advance!
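
    One likely cause, stated as an assumption since the post does not show the resulting file: fout is a BufferedWriter that is never flushed or closed, so the buffered lines may never reach out.txt, and write() adds no line separator either. A minimal sketch of the copy step using try-with-resources, with a hypothetical NonEmptyLineCopier class:

```java
import java.io.BufferedReader;
import java.io.BufferedWriter;
import java.io.IOException;
import java.nio.charset.StandardCharsets;
import java.nio.file.Files;
import java.nio.file.Path;
import java.nio.file.Paths;

// Hypothetical helper: copies the non-empty lines of one text file to another.
class NonEmptyLineCopier {
    // Returns the number of lines written; both files are closed on exit.
    static int copyNonEmptyLines(Path in, Path out) throws IOException {
        int count = 0;
        try (BufferedReader reader = Files.newBufferedReader(in, StandardCharsets.UTF_8);
             BufferedWriter writer = Files.newBufferedWriter(out, StandardCharsets.UTF_8)) {
            String line;
            while ((line = reader.readLine()) != null) {
                if (!line.isEmpty()) {
                    writer.write(line);
                    writer.newLine();   // write() alone does not end the line
                    count++;
                }
            }
        }   // closing the writer here flushes its buffer to disk
        return count;
    }

    public static void main(String[] args) throws IOException {
        int written = copyNonEmptyLines(Paths.get(args[0]), Paths.get("out.txt"));
        System.out.println(written + " lines written");
    }
}
```

    In the posted code, adding fout.newLine() after fout.write(queue) and fout.close() after the loop should have the same effect.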

  • HTML parsing, AttributeSet.getAttribute() doesn't work

    I parsed a website using javax.swing.text.html.parser.
    When I get a javax.swing.text.html.parser.Element, elem, I use elem.getAttributSet to get the AttributeSet of elem, atts. Then I use atts.getAttribute(HTML.Tag.FORM) to get the surrounding form tag. This works fine in jdk 1.3.8, but for jdk 1.4.2 and after, it just returns null.
    Is this a parsing bug in Java? Is there any way to get around this problem?

    Well, it won't work as iWeb has no import facility so cannot open html files.
    What you could do is upload the html file to wherever you are hosting your site and create a link to it from iWeb, or find another package similar to the one you are using at present that is for Mac rather then PC.

  • How to parse a HTML file using HTML parser in J2SE?

    I want to parse an HTML file using an HTML parser. Can anybody help me by providing sample code to parse the HTML file?
    Thanks and cheers,
    Amaresh

    What HTML parser and what does "parsing" mean to you?

  • How to make a horizontal line in java html browser?? without hr

    I use java html browser
    I want to make a black horizontal line
    the <hr> line is not good
    for my customer it seems that there is a double spacing in this case
    I tried to write
    <td style = "BORDER-BOTTOM: 1px solid #000000"> ...
    but it does not work
    I suppose this style is not supported by the default StyleSheet
    I tried to write
    <td bgcolor="black" height=1>
    but the table has all rows with equal sizes
    therefore the real height is too big
    Please help me
    I tried to write
    <table border="1">... </table>
    even in this case I have table without any border

    Hi,
    the best way might be to use a blind table with all cells set to have a border on the bottom. The problem is that the standard Java runtime does not render such setting automatically. There is plenty of work involved to adapt it and still it then would work only in the adapted version and not the standard runtime environment.
    Anyway, you can find a working example of how to accomplish individual borders around table cells in open source application SimplyHTML at http://www.lightdev.com/dev/sh.htm
    Ulrich

  • Java XML Parser:Null Pointer exception in EntityReader

    I got a NullPointerException when trying to parse an XML file that is referenced by a network URL (say "http://www..."). The code causing the problem is like:
    parser.parse(new URL("http://www.../demo.xml"));
    The exception I got is:
    java.lang.NullPointerException
    java.lang.NullPointerException
    at oracle.xml.parser.EntityReader.initXMLInput(Compiled Code)
    at oracle.xml.parser.EntityReader.<init>(EntityReader.java:64)
    at oracle.xml.parser.XMLParser.parse(XMLParser.java:245)
    at DemoXML.main(DemoXML.java:47)
    The very same code works fine with a simple local file URL, and we also know that the URL exists and is in correct XML format because we can open the URL with IE5.
    Another question: where can we get the source for the Java XML Parser?

    This bug has already been reported and will be fixed in our next
    release.
    Oracle XML Team
    http://technet.oracle.com
    Oracle Technology Network
