HTML parser in J2ME

Hi all,
I'm stuck with the same problem too. I'm developing a J2ME (MIDlet) application in which I have to open an HTTP connection. I also want to parse the HTML response and display the contents using J2ME elements on the mobile device. I'm not able to solve this problem. Please help me if anyone has come across a solution.
The links below are related threads:
http://forums.sun.com/thread.jspa?forumID=76&threadID=250460
http://forums.sun.com/thread.jspa?forumID=76&threadID=5235530
Thanks in advance
Nandy

> Hi All,
> I like to ask if anyone knows if there is a HTML
> parser available in J2ME? I am building an application
> that needs to display HTML on the client.
Try Google; a few do exist, but I don't know about free ones.
> Alternatively I may consider using XML, however I
> learnt that parsing XML is expensive in terms of
> computing power - is it the same for HTML?
If you are controlling the content returned, the two would be about the same, as XML and HTML have the same roots. Some XML parsers do exist, and are free to use.
You might be best off returning a custom format, designed around the limitations of the device you are using.
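
To make the custom-format suggestion concrete, here is a minimal sketch. The format itself is purely an assumption for illustration: one item per line, written as `TYPE|text`, where the type codes (`H` for a heading, `P` for a paragraph) are invented here. The parser relies only on `String.indexOf` and `substring`; it is written as plain Java SE so it can be tried on a desktop, and on a CLDC device the generics and the `ArrayList` would need replacing (for example with `Vector`).

```java
import java.util.ArrayList;
import java.util.List;

public class CustomFormatSketch {
    /** Splits "TYPE|text" lines into {type, text} pairs; lines without a '|' are skipped. */
    public static List<String[]> parse(String payload) {
        List<String[]> items = new ArrayList<String[]>();
        int start = 0;
        while (start < payload.length()) {
            int end = payload.indexOf('\n', start);
            if (end < 0) {
                end = payload.length();
            }
            String line = payload.substring(start, end);
            int bar = line.indexOf('|');
            if (bar > 0) {
                // Hypothetical layout: type code before the bar, display text after it.
                items.add(new String[] { line.substring(0, bar), line.substring(bar + 1) });
            }
            start = end + 1;
        }
        return items;
    }

    public static void main(String[] args) {
        for (String[] item : parse("H|News\nP|Hello from the server\n")) {
            System.out.println(item[0] + " -> " + item[1]);
        }
    }
}
```

Each pair could then be mapped to a UI element on the device, which avoids shipping a full HTML or XML parser with the MIDlet.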

Similar Messages

  • Xml Push Parsing in J2Me applications

    Hi all,
    I would like to use a push parser in a J2ME application. Can you please tell me how I can use a SAX parser in a mobile application? When I included "jaxp-api.jar" in the mobile application, it gave the error "FactoryFinder class not found". So please tell me how I can overcome this error, or whether any other push parser is available for J2ME applications.
    Thanks
    Regards
    Sourab

    Please review this article.
    http://developers.sun.com/techtopics/mobility/midp/articles/parsingxml/
    There is a link to a push parser mid way through the article. If you search "push parser j2me" you should be able to find others.
    If you still are not able to find a push parser you could create your own. How complex is the xml you need to parse?
    I created an example called y!p! some time ago that uses a push parser to parse Yahoo's Image Search web service. (http://developer.yahoo.com/search/image/V1/imageSearch.html)
    You can download the example src code. For a more complex example you could look at w!k!. It uses a push parser to parse html wikipedia articles.
    w!k! push parser src:
    http://hostj2me.cliqcafe.com/www/forumtopicview.html?fid=58&categoryId=36&fpn=0
    w!k! src code:
    http://www.hostj2me.com/appdetails.html?id=1569

  • How to parse a HTML file using HTML parser in J2SE?

    I want to parse an HTML file using HTML parser. Can any body help me by providing a sample code to parse the HTML file?
    Thanks and Cheers,
    Amaresh

    What HTML parser and what does "parsing" mean to you?

  • STYLE tag problem in HTML Parser.

    Hi,
    I am trying to parse an HTML file. I am able to extract the content of various tags like Tag.SPAN, Tag.DIV and so on.
    I want to extract the text content of Tag.STYLE. What should I do? The problem is that the HTML parser currently does not support this tag, along with 5 more tags, among them Tag.META and Tag.PARAM.
    Please help me out.

    Before responding to this posting, you may want to check out the discussion in the OP's previous posting on this topic:
    http://forum.java.sun.com/thread.jspa?threadID=634938

  • Don't understand error message from HTML parser?

    I've written a simple test program to parse a simple html file.
    Everything works fine except for the <img src="test.gif"> tag.
    It understands the img tag and the handleSimpleTag gets called.
    I can even pick out the src attribute. But I get a very strange error message.
    When I run the test program below on the test.html file (also below) I get the following output:
    handleError(134) = req.att srcimg?
    What does "req.att srcimg?" mean?!?!?
    /John
    This is my test program:
    import javax.swing.text.html.*;
    import javax.swing.text.*;
    import javax.swing.text.html.parser.*;
    import java.io.*;

    public class htmltest extends HTMLEditorKit.ParserCallback {
        public htmltest() {
            super();
        }

        public void handleError(String errorMsg, int pos) {
            System.err.println("handleError("+pos+") = " + errorMsg);
        }

        static public void main (String[] argv) throws Exception {
            Reader reader = new FileReader("test.html");
            new ParserDelegator().parse(reader, new htmltest(), false);
        }
    }
    This is the "test.html" file
    <html>
    <head>
    </head>
    <body>
    This is a plain text.<br>
    This is <b>bold</b> and this is <i>itallic</i>!<br>
    <img src="test.gif">
    "This >is also a plain test text."<br>
    </body>
    </html>
    ----------------------------------------------------------------------

    The handleError() method is poorly documented, as is the whole javax.swing.text.html package and its design. You can ignore this message as long as the rest of the parser output is correct and your HTML file is valid.

  • Attempting to use HTML parser - getAttribute() not performing as expected.

    How am I mis-using getAttribute()?
    I am expecting (String)a.getAttribute((String)"name") to give me a value other than null in the below example. What am I doing wrong?
    The HTML test source (missing headers/body, so yes, it's not proper):
    <input name="unit_1" size=5 maxsize=5 value="hr">
    <input name="qty_1" size=5 value=4>
    <input name="unit_1" size=5 maxsize=5 value="hr">
    <input name="partnumber_1" size=10 value="Java Work">
    <input name="description_1" size=50 value="Slip shod work at outragous prices">
    <input name="sellprice_1" size=9 value=185.00>
    <input name="discount_1" size=3 value=>
    What I'd like to see is this:
    About to parse test
    Parsing error: invalid.tagattmaxsizeinput? at 39
    Tag start(<html>, 1 attrs)
    Tag start(<head>, 1 attrs)
    Tag end(</head>)
    Tag start(<body>, 1 attrs)
    Tag(<input>, 4 attrs)
    found input
    unit_1
    hr
    Tag(<input>, 3 attrs)
    found input
    qty_1
    4
    Rather than this:
    About to parse test
    Parsing error: invalid.tagattmaxsizeinput? at 39
    Tag start(<html>, 1 attrs)
    Tag start(<head>, 1 attrs)
    Tag end(</head>)
    Tag start(<body>, 1 attrs)
    Tag(<input>, 4 attrs)
    found input
    null
    null
    Tag(<input>, 3 attrs)
    found input
    null
    null
    The code that reads the HTML and give the output looks like this:
    import java.io.*;
    import java.net.*;
    import javax.swing.text.*;
    import javax.swing.text.html.*;
    import javax.swing.text.html.parser.*;

    /*
     * This small demo program shows how to use the
     * HTMLEditorKit.Parser and its implementing class
     * ParserDelegator in the Swing system.
     */
    class DataSaved {
        String InputName;
        String InputValue;
        boolean IsHidden;
    }

    public class HtmlParseDemo {
        public static void main(String [] args) {
            DataSaved DataSet[];
            Reader r;
            if (args.length == 0) {
                System.err.println("Usage: java HTMLParseDemo [url | file]");
                System.exit(0);
            }
            String spec = args[0];
            try {
                if (spec.indexOf("://") > 0) {
                    URL u = new URL(spec);
                    Object content = u.getContent();
                    if (content instanceof InputStream) {
                        r = new InputStreamReader((InputStream)content);
                    }
                    else if (content instanceof Reader) {
                        r = (Reader)content;
                    }
                    else {
                        throw new Exception("Bad URL content type.");
                    }
                }
                else {
                    r = new FileReader(spec);
                }
                HTMLEditorKit.Parser parser;
                System.out.println("About to parse " + spec);
                parser = new ParserDelegator();
                parser.parse(r, new HTMLParseLister(), true);
                r.close();
            }
            catch (Exception e) {
                System.err.println("Error: " + e);
                e.printStackTrace(System.err);
            }
        }
    }

    /*
     * HTML parsing proceeds by calling a callback for
     * each and every piece of the HTML document. This
     * simple callback class simply prints an indented
     * structural listing of the HTML data.
     */
    class HTMLParseLister extends HTMLEditorKit.ParserCallback
    {
        int indentSize = 0;
        protected void indent() {
            indentSize += 3;
        }
        protected void unIndent() {
            indentSize -= 3; if (indentSize < 0) indentSize = 0;
        }
        protected void pIndent() {
            for(int i = 0; i < indentSize; i++) System.out.print(" ");
        }
        public void handleText(char[] data, int pos) {
            pIndent();
            System.out.println("Text(" + data.length + " chars)");
        }
        public void handleComment(char[] data, int pos) {
            pIndent();
            System.out.println("Comment(" + data.length + " chars)");
        }
        public void handleStartTag(HTML.Tag t, MutableAttributeSet a, int pos) {
            pIndent();
            System.out.println("Tag start(<" + t.toString() + ">, " +
                a.getAttributeCount() + " attrs)");
            indent();
        }
        public void handleEndTag(HTML.Tag t, int pos) {
            unIndent();
            pIndent();
            System.out.println("Tag end(</" + t.toString() + ">)");
        }
        public void handleSimpleTag(HTML.Tag t, MutableAttributeSet a, int pos) {
            String name;
            String value;
            boolean hidden;
            pIndent();
            System.out.println("Tag(<" + t.toString() + ">, " +
                a.getAttributeCount() + " attrs)");
            if( t==HTML.Tag.INPUT) {
                System.out.println("found input");
                name = (String)a.getAttribute((String)"name");
                value = (String)a.getAttribute((String)"value");
                System.out.println(name);
                System.out.println(value);
            }
        }
        public void handleError(String errorMsg, int pos){
            System.out.println("Parsing error: " + errorMsg + " at " + pos);
        }
    }

    System.out.println( a.getAttribute(HTML.Attribute.NAME) );
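
The one-line reply above points at the likely cause: the Swing parser stores attributes under `HTML.Attribute` keys rather than plain strings, so a lookup with the string "name" finds nothing. The following is a small self-contained sketch of that lookup; the class name `InputNameSketch` and the helper `inputNames` are made up for this illustration and are not from the original post.

```java
import java.io.IOException;
import java.io.StringReader;
import java.util.ArrayList;
import java.util.List;
import javax.swing.text.MutableAttributeSet;
import javax.swing.text.html.HTML;
import javax.swing.text.html.HTMLEditorKit;
import javax.swing.text.html.parser.ParserDelegator;

public class InputNameSketch extends HTMLEditorKit.ParserCallback {
    final List<String> names = new ArrayList<String>();

    @Override
    public void handleSimpleTag(HTML.Tag t, MutableAttributeSet a, int pos) {
        if (t == HTML.Tag.INPUT) {
            // Look the attribute up by its HTML.Attribute constant, not by a String.
            Object name = a.getAttribute(HTML.Attribute.NAME);
            names.add(name == null ? null : name.toString());
        }
    }

    /** Hypothetical helper: returns the name attribute of every <input> tag, in document order. */
    public static List<String> inputNames(String html) throws IOException {
        InputNameSketch callback = new InputNameSketch();
        new ParserDelegator().parse(new StringReader(html), callback, true);
        return callback.names;
    }

    public static void main(String[] args) throws IOException {
        System.out.println(inputNames(
            "<input name=\"unit_1\" size=5 value=\"hr\"><input name=\"qty_1\" size=5 value=4>"));
    }
}
```

The same change should apply to the value lookup in the posted code, using `HTML.Attribute.VALUE`.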

  • How to use SAX parser in J2ME

    Please help me how to use SAX parser in J2ME.
    Is there any function to find the value of a particular element from a XML file?
    I am able to get element names, values and attributes. But the control flow is not in my hands: DefaultHandler automatically invokes the characters() method, and only then am I able to get values. But I don't know when this method gets invoked.
    Is there any way or method so that I can get value of any element or attribute just by passing element name as parameter in SAX parser?
    Is there any other parser through which I can perform this task in J2ME?
    Thanks in advance.

    Hi..
    have a look at this.
    http://www-128.ibm.com/developerworks/library/wi-parsexml/
    MeTitus
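
As a rough illustration of the question above: SAX has no "get value by element name" call, but a handler can remember the text of one element while the parser pushes events at it. The sketch below assumes the standard `javax.xml.parsers` and `org.xml.sax` classes and is written for Java SE so it can be run on a desktop; whether the same classes are present on a given handset depends on its XML support, which is an assumption to check. The class name `ElementTextSketch` and the helper `firstText` are invented for this example.

```java
import java.io.ByteArrayInputStream;
import javax.xml.parsers.SAXParser;
import javax.xml.parsers.SAXParserFactory;
import org.xml.sax.Attributes;
import org.xml.sax.helpers.DefaultHandler;

public class ElementTextSketch extends DefaultHandler {
    private final String wanted;                       // element name to capture
    private final StringBuffer text = new StringBuffer();
    private boolean inside;                            // true while inside the wanted element
    private String result;                             // text of the first match, or null

    public ElementTextSketch(String wanted) {
        this.wanted = wanted;
    }

    public void startElement(String uri, String localName, String qName, Attributes atts) {
        if (result == null && wanted.equals(qName)) {
            inside = true;
            text.setLength(0);
        }
    }

    // The parser may call characters() several times for one element, so append.
    public void characters(char[] ch, int start, int length) {
        if (inside) {
            text.append(ch, start, length);
        }
    }

    public void endElement(String uri, String localName, String qName) {
        if (inside && wanted.equals(qName)) {
            inside = false;
            result = text.toString();
        }
    }

    /** Hypothetical helper: text of the first element with the given name, or null if absent. */
    public static String firstText(String xml, String element) throws Exception {
        SAXParser parser = SAXParserFactory.newInstance().newSAXParser();
        ElementTextSketch handler = new ElementTextSketch(element);
        parser.parse(new ByteArrayInputStream(xml.getBytes("UTF-8")), handler);
        return handler.result;
    }

    public static void main(String[] args) throws Exception {
        System.out.println(firstText("<order><item>pen</item><qty>4</qty></order>", "qty"));
    }
}
```

This sketch does not handle an element nested inside another element of the same name; it only shows when characters() fires relative to startElement() and endElement().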

  • Exception in html parser under Linux

    Hi all,
    The following code is copied from Tech Tip 23Sep1999. I have compiled it and run it under Win98. It works fine for any URI. However, when I try to run it under Linux, it throws exceptions. I noticed that some web sites can be parsed with the program on Linux but some can't. I wonder what the difference between those platforms is. Can anyone tell me how to make the program work under Linux?
    Rgds,
    unplug
    configuration
    RedHat 7.1
    JDK1.3.1
    Failed: java GetLinks http://java.sun.com
    Worked: java GetLinks http://www.apache.org
    --beginning of code
    import java.io.*;
    import java.net.*;
    import javax.swing.text.*;
    import javax.swing.text.html.*;

    class GetLinks {
        public static void main(String[] args) {
            EditorKit kit = new HTMLEditorKit();
            Document doc = kit.createDefaultDocument();
            // The Document class does not yet
            // handle charset's properly.
            doc.putProperty("IgnoreCharsetDirective",
                Boolean.TRUE);
            try {
                // Create a reader on the HTML content.
                Reader rd = getReader(args[0]);
                // Parse the HTML.
                kit.read(rd, doc, 0);
                // Iterate through the elements
                // of the HTML document.
                ElementIterator it = new ElementIterator(doc);
                javax.swing.text.Element elem;
                while ((elem = it.next()) != null) {
                    SimpleAttributeSet s = (SimpleAttributeSet)
                        elem.getAttributes().getAttribute(HTML.Tag.A);
                    if (s != null) {
                        System.out.println(
                            s.getAttribute(HTML.Attribute.HREF));
                    }
                }
            } catch (Exception e) {
                e.printStackTrace();
            }
            System.exit(1);
        }

        // Returns a reader on the HTML data. If 'uri' begins
        // with "http:", it's treated as a URL; otherwise,
        // it's assumed to be a local filename.
        static Reader getReader(String uri)
            throws IOException {
            if (uri.startsWith("http:")) {
                // Retrieve from Internet.
                URLConnection conn=
                    new URL(uri).openConnection();
                return new
                    InputStreamReader(conn.getInputStream());
            } else {
                // Retrieve from file.
                return new FileReader(uri);
            }
        }
    }
    --End of code
    --Exception in Linux
    Exception in thread "main" java.lang.NoClassDefFoundError
    at java.lang.Class.forName0(Native Method)
    at java.lang.Class.forName(Class.java:120)
    at java.awt.Toolkit$2.run(Toolkit.java:512)
    at java.security.AccessController.doPrivileged(Native Method)
    at java.awt.Toolkit.getDefaultToolkit(Toolkit.java:503)
    at javax.swing.text.html.CSS.getValidFontNameMapping(CSS.java:932)
    at javax.swing.text.html.CSS$FontFamily.parseCssValue(CSS.java:1789)
    at javax.swing.text.html.CSS.getInternalCSSValue(CSS.java:531)
    at javax.swing.text.html.CSS.addInternalCSSValue(CSS.java:516)
    at javax.swing.text.html.StyleSheet.addCSSAttribute(StyleSheet.java:436)
    at javax.swing.text.html.HTMLDocument$HTMLReader$ConvertAction.start(HTMLDocument.java:2536)
    at javax.swing.text.html.HTMLDocument$HTMLReader.handleStartTag(HTMLDocument.java:1992)
    at javax.swing.text.html.parser.DocumentParser.handleStartTag(DocumentParser.java:145)
    at javax.swing.text.html.parser.Parser.startTag(Parser.java:333)
    at javax.swing.text.html.parser.Parser.parseTag(Parser.java:1786)
    at javax.swing.text.html.parser.Parser.parseContent(Parser.java:1821)
    at javax.swing.text.html.parser.Parser.parse(Parser.java:1980)
    at javax.swing.text.html.parser.DocumentParser.parse(DocumentParser.java:109)
    at javax.swing.text.html.parser.ParserDelegator.parse(ParserDelegator.java:74)
    at javax.swing.text.html.HTMLEditorKit.read(HTMLEditorKit.java:239)
    at GetLinks.main(GetLinks.java:23)

    Support for CSS is not clearly defined. Also, Dictionary getDocumentProperties() is not properly explained, meaning it doesn't give methods to get all the properties an HTML document can have.

  • Validating XML parser for J2ME?

    Hello,
    does anybody know a validating XML parser for J2ME?
    Thanks
    Volker

    I mean the XML parser.
    I want to know if there are XML parsers which can validate a XML document against a DTD or Schema
    Thanks
    Volker

  • Error on HTML Parser

    Hi,
    I'm trying to parse a HTML page but I always get the same error, which is the following exception:
    javax.swing.text.ChangedCharSetException
    In the class ParserCallback I'm using the method handleError and it shows:
    req.att contentmeta?
    ioexception???
    just before the exception occurs.
    The only line where this error occurs in the html page is:
    <meta http-equiv="Content-Type" content="text/html; charset=iso-8859-1">
    and I know that the exact point is the attribute 'content'. If it is removed or changed to 'contenttype' the error disappears.
    The problem is that I can't change the attribute because the html page is not mine, it is caught on the Web. And I don't want to remove it.
    Anybody knows what is happening?
    Thanks!!

    I am also having a problem with HTML parsing in Java.
    I have given a complete description of the problem at this link, along with the log and my sample code:
    http://forum.java.sun.com/thread.jspa?threadID=643683&tstart=0
    Could you take a look?
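
Neither post in this thread gives a fix, so the following is only a possible workaround rather than a confirmed answer. `ParserDelegator.parse` takes a third argument, `ignoreCharSet`; passing `true` asks the parser not to react to a charset declared in a `<meta>` tag, which is the situation in which `ChangedCharSetException` is raised. The class name `CharsetTolerantSketch` and the helper `textOf` are invented for this sketch.

```java
import java.io.IOException;
import java.io.StringReader;
import javax.swing.text.html.HTMLEditorKit;
import javax.swing.text.html.parser.ParserDelegator;

public class CharsetTolerantSketch {
    /** Hypothetical helper: concatenates the text nodes of a page, ignoring any meta charset. */
    public static String textOf(String html) throws IOException {
        final StringBuilder out = new StringBuilder();
        HTMLEditorKit.ParserCallback callback = new HTMLEditorKit.ParserCallback() {
            @Override
            public void handleText(char[] data, int pos) {
                out.append(data);
            }
        };
        // Third argument true: do not stop on a charset directive in the document.
        new ParserDelegator().parse(new StringReader(html), callback, true);
        return out.toString();
    }

    public static void main(String[] args) throws IOException {
        String html = "<html><head><meta http-equiv=\"Content-Type\" "
            + "content=\"text/html; charset=iso-8859-1\"></head><body>Hello</body></html>";
        System.out.println(textOf(html));
    }
}
```

If the declared charset matters for the page, the alternative is to catch the exception, read the charset from it, and reopen the stream with a reader for that encoding.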

  • JRE1.5 swing.html parser fails to parse data between script tags

    Hi all...
    I've written a class that extends the java-provided default HTML parser to parse for text inside a table. In JRE1.4.* the parser works fine and extracts data between <script> tags as text. However now that I've upgraded to 1.5, the data between <script> tags are no longer parsed. Any suggestion anyone?
    Steve

    According to the API docs, the 1.5 parser supports HTML 3.2, for which the spec states that the content of SCRIPT and STYLE tags should not be rendered. I assume it doesn't have a scripting engine, so it won't get executed either.

  • Webpage (HTML) parsing...

    Any ideas on how to parse an HTML page? I'm trying to do it with a StreamTokenizer but with little success. I don't think this class was made to do this sort of thing, ordinarily anyway. Is there a better choice? StringTokenizer? Here's what I have so far:
      URLConnection uc = url.openConnection();
      BufferedReader br = new BufferedReader(new InputStreamReader
                                            (uc.getInputStream()));
      StreamTokenizer stok = new StreamTokenizer(br);
      stok.eolIsSignificant(false);
      String inputLine;
      for (int i=0; (stok.nextToken() != stok.TT_EOF); i++)
      {
        System.out.println("token #" + i + stok.toString());
      }
    It gives me a result like this:
    token #0Token['<'], line 3
    token #1Token[script], line 3
    token #2Token[language], line 3
    token #3Token['='], line 3
    token #4Token[javascript], line 3
    token #5Token['>'], line 3
    token #6Token['<'], line 4
    token #7Token['!'], line 4
    token #8Token['-'], line 4
    token #9Token['-'], line 4
    token #10Token[function], line 5
    token #11Token[dojump], line 5
    token #12Token['('], line 5
    token #13Token[')'], line 5
    token #14Token['{'], line 6
    token #15Token[document.location.href], line 7
    token #16Token['='], line 7
    token #17Token[play247.asp?page=promo&id=72&r=R2], line 7
    What I want is all the links that have "promo" as a parameter e.g. . Any suggestions?

    Java has a callback parser, which notifies you when start/end tags are found. Then you can query the attributes and search for the desired string. Here's a sample to get you started:
    import java.io.FileReader;
    import java.io.IOException;
    import java.io.Reader;
    import javax.swing.text.MutableAttributeSet;
    import javax.swing.text.html.HTML;
    import javax.swing.text.html.HTMLEditorKit;
    import javax.swing.text.html.parser.ParserDelegator;

    public class TestParser extends HTMLEditorKit.ParserCallback
    {
        boolean ignoreText;

        public static void main(String[] args)
        throws IOException
        {
            TestParser parser = new TestParser();
            // args[0] is the file to parse
            Reader reader = new FileReader(args[0]);
            try
            {
                new ParserDelegator().parse(reader, parser, false);
            }
            catch (IOException e)
            {
                System.out.println(e);
            }
        }

        public void handleComment(char[] data, int pos)
        {
            System.out.println(data);
        }

        public void handleEndOfLineString(String eol)
        {
        }

        public void handleEndTag(HTML.Tag tag, int pos)
        {
            System.out.println("/" + tag);
        }

        public void handleError(String errorMsg, int pos)
        {
            System.out.println(pos + ":" + errorMsg);
        }

        public void handleMutableTag(HTML.Tag tag, MutableAttributeSet a, int pos)
        {
            System.out.println("mutable:" + tag + ": " + pos + ": " + a);
        }

        public void handleSimpleTag(HTML.Tag tag, MutableAttributeSet a, int pos)
        {
            System.out.println( tag + ":" + a );
        }

        public void handleStartTag(HTML.Tag tag, MutableAttributeSet a, int pos)
        {
            System.out.println( tag + ":" + a );
        }

        public void handleText(char[] data, int pos)
        {
            System.out.println( data );
        }
    }
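
Building on the callback approach in the reply above, the sketch below narrows it to the original question: collecting only the links whose `href` contains a given string such as "promo". The class name `PromoLinksSketch`, the helper `linksContaining`, and the simple substring match are assumptions made for this illustration. Links that exist only inside JavaScript, as in the token dump in the question, would not be found this way, since only `<a>` tags are inspected.

```java
import java.io.IOException;
import java.io.StringReader;
import java.util.ArrayList;
import java.util.List;
import javax.swing.text.MutableAttributeSet;
import javax.swing.text.html.HTML;
import javax.swing.text.html.HTMLEditorKit;
import javax.swing.text.html.parser.ParserDelegator;

public class PromoLinksSketch extends HTMLEditorKit.ParserCallback {
    private final String needle;
    final List<String> links = new ArrayList<String>();

    public PromoLinksSketch(String needle) {
        this.needle = needle;
    }

    @Override
    public void handleStartTag(HTML.Tag tag, MutableAttributeSet a, int pos) {
        if (tag == HTML.Tag.A) {
            Object href = a.getAttribute(HTML.Attribute.HREF);
            // Keep the link only if its target mentions the search string.
            if (href != null && href.toString().indexOf(needle) >= 0) {
                links.add(href.toString());
            }
        }
    }

    /** Hypothetical helper: all href values of <a> tags that contain the given string. */
    public static List<String> linksContaining(String html, String needle) throws IOException {
        PromoLinksSketch callback = new PromoLinksSketch(needle);
        new ParserDelegator().parse(new StringReader(html), callback, true);
        return callback.links;
    }

    public static void main(String[] args) throws IOException {
        String html = "<a href=\"play247.asp?page=promo\">x</a><a href=\"index.html\">y</a>";
        System.out.println(linksContaining(html, "promo"));
    }
}
```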

  • APPLESCRIPT AND HTML PARSING.

    hi,
    I'm new to AppleScript, so I'm not quite sure if what I want to do is actually called HTML parsing. Basically, I want a variable in AppleScript that is linked to a value in the actual HTML, but I don't know how to make AppleScript access data inside HTML code. To give you a better idea, inside the HTML is something like this:
    100
    That value "100" changes, but its maximum is 100. I want to create a script that responds when that value starts to drop by loading another link.
    Am I making sense? Again, what I'd like to achieve is to make AppleScript use that value INSIDE the HTML as its own variable (and perform the right actions as that value changes).
    any help would be appreciated.

    First, you could open the site you mentioned in Safari and run a little JavaScript via AppleScript to get that value.
    JavaScript is the "best" way to get a specific value out of an HTML element, but it only works in browsers.
    e.g.
    tell application "Safari"
    open location "http://apple.com"
    delay 6
    set mypromo to do JavaScript "document.getElementById('promos').getElementsByTagName('a')[0].title" in document 1
    display dialog "Title of first Promo is:" & return & mypromo
    end tell
    Or you could just download the raw source, convert it to text, and search for the phrase you are looking for
    e.g.
    set mysource_html to do shell script "curl http://mysite.org/bla.html"
    set mysource_txt to do shell script "curl http://mysite.org/bla.html | textutil -stdin -convert txt -format html -stdout"
    if mysource_html contains "<a>100</a>" then
    display dialog "Hey, value of 100 is reached"
    end if
    --or something like
    if mysource_txt contains "100" then
    display dialog "Hey, value of 100 is reached"
    end if

  • Java HTML Parser

    What seems to be the best tool for HTML parsing, ideally one that does not convert to XHTML unless it's very robust and can handle any HTML page?
    I've been looking at JTidy - http://java-source.net/open-source/html-parsers/jtidy but that has some problems trying to convert from HTML to XHTML.
    Jericho seems to parse HTML without converting to XHTML and looks reasonable:
    http://jerichohtml.sourceforge.net/doc/index.html
    Has anyone tried other HTML parsers from this list: http://java-source.net/open-source/html-parsers
    I would like more information on other HTML parsers people have tried, preferably ones that convert to XHTML without any problems, so we can use a SAX parser to interpret the XML. Looking forward to your input.
    Kind Regards
    Abs

    It kind of depends on what you need to use it for.
    Rent me and I'll tell you more

    googled it ;-)
    http://sourceforge.net/projects/htmlparser/

  • Problem with HTML Parser and multiple instances

    I have a parser program which queries a online shopping comparison web page and extracts the information needed. I am trying to run this program with different search terms which are created by entering a sentence, so each one is sent separately, however the outputs (text files) are the same for each word, despite the correct term and output file seeming passed. I suspect it might be that the connection is not being closed each time but am not sure why this is happening.
    If i create an identical copy of the program and run that after the first one it works but this is not an appropriate solution.
    Any help would be much appreciated. Here is some of my code, if more is required i will post.
    To run the program:
    StringTokenizer t = new StringTokenizer("red green yellow", " ");
            int c = 0;
            Parser1 p = new Parser1();
            while (t.hasMoreTokens()) {
                c++;
                String tok = t.nextToken();
                File tem = new File("C:/"+c+".txt");
                    p.mainprog(tok, tem);
                    p.mainprog(tok, tem);
                    p.mainprog(tok, tem);
    }
    The parser:
    import javax.swing.text.html.parser.*;
    import javax.swing.text.html.*;
    import javax.swing.text.*;
    import java.awt.*;
    import java.util.*;
    import javax.swing.*;
    import java.io.*;
    import java.net.*;
    public class Parser1 extends HTMLEditorKit.ParserCallback {
        variable declarations
       public void handleStartTag(HTML.Tag t, MutableAttributeSet a, int pos){
    ...methods
      public void handleText(char[] data, int pos){
           ...methods
      public void handleTitleTag(HTML.Tag t, char[] data){
      public void handleEmptyTag(HTML.Tag t, char[] data){       
      public void handleSimpleTag(HTML.Tag t, MutableAttributeSet a, int pos){
    ...methods
          static void mainprog(String term, File file) {   
    ...proxy and authentication methods
                        Authenticator.setDefault(new MyAuthenticator() );
                        HTMLEditorKit editorKit = new HTMLEditorKit();
                        HTMLDocument HTMLDoc;
                        Reader HTMLReader;
                      try {
                            String temp = new String(term);
                            String fullurl = new String(MainUrl+temp);
                            url = new URL(fullurl);
                            InputStream myInStream;
                            myInStream = url.openConnection().getInputStream();
                            HTMLReader = (new InputStreamReader(myInStream));
                            HTMLDoc = (HTMLDocument) editorKit.createDefaultDocument();
                            HTMLDoc.putProperty("IgnoreCharsetDirective", new Boolean(true));
                            ParserDelegator parser = new ParserDelegator();
                            HTMLEditorKit.ParserCallback callback = new Parser1();
                            parser.parse(HTMLReader, callback, true);
                            callback.flush();
                            HTMLReader.close();
                            myInStream.close();
                     catch (IOException IOE) {
                        IOE.printStackTrace();
                    catch (Exception e) {
                        e.printStackTrace();
          try {
                FileWriter writer = new FileWriter(file);
                BufferedWriter bw = new BufferedWriter(writer);
                for (int i = 0; i < vect.size(); i++){
                    bw.write((String)vect.elementAt(i));
                    if (vect.elementAt(i)!=vect.lastElement()){
                        bw.newLine();
                bw.flush();
                bw.close();
                writer.close();
            catch (IOException IOE) {
                        IOE.printStackTrace();
                    catch (Exception e) {
                        e.printStackTrace();
              }   catch (IOException IOE) {
                     System.out.println("User options not found.");
    }

    How many Directory Servers are you using?
    Are both serverconfig.xml files of PS instances the same?
    Set debug level to message in the appropriate AMConfig.properties of your portal instances and look into AM debug files.
    For some reason amSDK seems not to get the correct service values.
    -Bernhard
