TREX python extension for HTML attribute extraction

How is the TREX python extension for HTML attribute extraction supposed to work?
I activated this extension and indexed an HTML document containing:
<HTML>
<HEAD>
<META content="debian, GNU, linux, unix" name=Keywords>
</HEAD>
</HTML>
Why is this document not found by the system when I submit the query "unix"?
(TREX version 6.1.11.02)
Davide

The trick doesn't work on my system (TREX v. 6.1.11.02).
1) This is the edited line in getDCAttributes.py:
knownAttributes = ['description','Keywords']
2) Only the HTML Attribute Extractor extension is activated in extensions.py. In particular, the Dublin Core extension is deactivated.
3) This is the pythonextension section of TREXPreprocessor.ini:
extensiontype= beforeHTTP
4) This is the parametrization of the newly created property in CM:
Unique ID: html_keywords
Description: not set
Property ID: Keywords
Namespace Alias: trilog
Type: String
Group: html
Mandatory: not checked
Multi-Valued: not checked
Read Only: not checked
Maintainable: not checked
Indexable: checked
Default Value: not set
Allowed Values: not set
Key for Label: not set
Meta Data Extension: not set
Folder Validity Patterns: /
Document Validity Patterns: /
Resource Types: not set
Mime Types: not set
Default Sorting     Ascending: not set
Label Icon: not set
Hidden: not checked
Dependencies: not checked
Additional Metadata: not set
Property Renderer: not set
Virtual: not checked
Composed of: not set
Comparator Class: not set
5) I've added this property to the parameters Allowed Predefined Properties and Predefined Properties under Content Management -> User Interface -> Search.
6) Now it's possible to filter by the Keywords predefined property in the Search UI, but no matches are ever found.
7) No significant message is found in PythonExtension.log:
# running global extensions.py at Tue Jul 12 11:13:03 2005
### import getHtmlAttributes
# running global extensions.py at Tue Jul 12 11:13:04 2005
### import getHtmlAttributes
8) I've run another test, activating the Dublin Core extension in extensions.py as well and setting extensiontype to beforeLEXICON in TREXPreprocessor.ini. It still didn't work.
9) Here is some output from the DC extension:
### Parse document with key: '/davide_test/attributi/GNU_Emacs.htm'
### extracted attributes: []
This is the GNU_Emacs.htm's <head>:
<META content=emacs name=keywords>
Cheers, Davide
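
One thing that may be worth checking (an assumption, not a confirmed diagnosis) is whether the attribute name comparison is case-sensitive: the list above contains 'Keywords', while GNU_Emacs.htm uses name=keywords. The following is a minimal standalone sketch of case-insensitive META extraction using Python's standard html.parser module. It is not the code of getHtmlAttributes.py or getDCAttributes.py; KNOWN_ATTRIBUTES, MetaExtractor and extract_meta are hypothetical names used only for illustration.

```python
from html.parser import HTMLParser

# Hypothetical list mirroring the edited knownAttributes line above.
KNOWN_ATTRIBUTES = ['description', 'Keywords']


class MetaExtractor(HTMLParser):
    """Collect <meta name=... content=...> pairs whose name is in a known list."""

    def __init__(self, known):
        super().__init__()
        # Map lowercased names back to the configured spelling, so that
        # 'Keywords' in the list also matches name=keywords in the document.
        self.known = {name.lower(): name for name in known}
        self.found = {}

    def handle_starttag(self, tag, attrs):
        # html.parser already lowercases tag and attribute names.
        if tag != 'meta':
            return
        attr_map = {key: (value or '') for key, value in attrs}
        name = attr_map.get('name', '').lower()
        if name in self.known:
            self.found[self.known[name]] = attr_map.get('content', '')


def extract_meta(html_text, known=KNOWN_ATTRIBUTES):
    """Return a dict of known META names to their content values."""
    parser = MetaExtractor(known)
    parser.feed(html_text)
    parser.close()
    return parser.found
```

Run against the two META lines quoted in this thread, a comparison of this kind picks up both name=Keywords and name=keywords; a plain string comparison against 'Keywords' would miss the second.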

Similar Messages

  • Is there a way to develop an HTML based extension for Dreamweaver like ones for PS or AI?

    Is there a way to develop an HTML based extension for Dreamweaver like the ones for PS or AI? I mean like the extensions produced by Extension Builder 3.

    Snippets is a good idea. But you can also do this with a template. Each time you use FILE | New > Page from Template, as you would to create a new child page, you have the option to enable or disable an option called "Update page when template changes". If that option is DISABLED (i.e., unchecked), then the child page will be created but with NO TEMPLATE MARKUP (i.e., no editable/non-editable regions)! Seems that's exactly what you want, no?

  • What is the file extension for photoshop for the blanks following  the period in "photoshop,___"?    I have  an older desktop which includes older photoshop files which I have asked a technician to extract for me.  He asks for the extension.  I can't

    I need the extension for the Photoshop program, that is, the letters following the period as shown by the blanks in this: "photoshop.____". I cannot find the extension on my laptop. The Photoshop program is a couple of years old, and my files are on a desktop that has been out of use for a couple of years. I brought the desktop to a technician, asking him to extract the photos. He asks me for the file extension.

    If you are using Windows, you can use the Folder Options control panel, or in Windows File Explorer the menu Tools > Folder Options, View tab, to change what Microsoft hides from you. Have Windows display file extensions. Apple hides information from users as well.

  • Parsing Non-Standard HTML Attributes

    Hi, recently I have been making use of the HTMLEditorKit to parse HTML pages and extract the values of certain attributes. However, I run into problems when attempting to extract the values of non-standard html attributes since javax.swing.text.html.HTML.Attribute does not allow you to define new attributes only use its existing values. Does anyone know a way of getting values from non-standard attributes?
    Thanks,
    Ross

    Ray.R wrote:
    Hi Darin.
    How have you been?  Wishing you a belated Happy New Year!
    I like your expression:  "step away from the mouse".  Very much to the point.  
    Thanks for the advice.  I'll check out XPath.
    Cheers,
    RayR
    Doing well, Thanks.  Wishing you an early Chinese New Year.  (that's going to get bleeped). [Edit:  Yay, no more bleeping Chinese]
    Be a little careful with XPath.  Some people have been known to wrap C++ libraries to have access to XPath 2.0 and an XSLT engine for awesome search and replace capabilities.  Others have been known to find ways to program XPath graphically:

  • Packaging an extension for Adobe Exchange?

    Hello,
    I developed a small extension for Photoshop CC, with a little help from the HTML 5 Extension guides scattered around the web.
    But there's one thing that no guide covered in a way that helped me get this to work:
    packing it up as a ZXP that will work after installing it with the Extension Manager.
    My extension currently works only when debugged (with debugging mode enabled in the registry).
    After disabling debugging mode and installing the extension from the ZXP package that I signed, the extension simply won't show in the Extensions menu within Photoshop.
    What I suspect is causing this (please throw me a bone or two here) is the installation location. While uploading the package I was asked to specify a location for the extension to be installed in, and I don't know what it should be.
    Some things you might want to know:
    I developed this small extension using Eclipse
    My 'CSXS' folder is inside of a folder named '.staged-extension' (I noticed it varies, and might be causing this?)
    When debugging it all works just fine
    One of the guides I went through is located here: Introducing HTML5 extensions | Adobe Developer Connection
    There was another page in Adobe's site, specifying the changes required to have the extension to work (It didn't work before then, but I already went through it.)
    Thank you so much for your time in advance

    Hey Davide and thanks for your reply!
    I attempted going through your guide, but I'm still having the same issue.
    I'll try to describe what happens, step by step:
    Done developing & debugging the extension with debug mode on.
    Packing the extension with ZXPSignCmd (just like you instructed)
    Upload the ZXP package to Exchange. It then asks me for an 'install location'; I can't set it to the location you provided, I can only choose from a list.
    Nothing really fits (Desktop, Downloads, Brushes, etc.), so I went for Desktop just to see what happens.
    After the upload process is complete, I download the package and install it using Extension Manager CC.
    The ZXP package that I packed and uploaded to Exchange appears on my desktop.
    It looks like Adobe Exchange wraps my ZXP package in another one, so Extension Manager extracts the outer one and puts mine in the install location provided during upload.
    I'm clueless.
    EDIT:
    I just realized that for some reason Adobe Exchange only let me upload files that will later be packed into a ZXP package, instead of just using the one I made.
    Removing the whole product and starting over let me choose to upload my self packed one.
    I'll update whether or not this actually works though~
    2nd Edit:
    Nope.
    Probably was a step in the right direction, but it still won't work.

  • Data not coming from DOE to Mobile After defining Rule for device attribute

    Hi All,
    I have created a DO and a rule for it. When the rule is defined as a Bulk rule for all and I trigger an extract from the Portal, all the data arrives in the outbound queue. But when I define a rule on a device attribute, no data arrives in my outbound queue. Here is what I am doing:
    1. I have order header in my backend which has a field named "Work_Center" and this will be criteria field.
    2. In CDS table i have all the records for all the work center.
    3. Now in RMM under customized , i have added an attribute named "Work_center".
    4. Now i defined a rule with Device attribute mapping and activated the rule.
    5. Now on the Portal I assigned this data object, and in the device attribute tab I assigned a work center value (one that exists in the CDS table for a few orders) to the attribute "Work_Center".
    6. Then I triggered an extract, but the outbound queue is empty. What could be the reason?
    Is my approach correct?
    Regards,
    Abhishek

    Hi Abhishek,
    You can check one more thing after you have performed all the steps up to step 5, i.e. just before triggering
    the extract. Check whether the AT table for your DO has entries based on the criteria you specified:
    1. In the workbench click on the Data Object, and then right click and select "View Metadata".
    2. Select Distribution Model tab.
    3. Now select your DO's Association table.
    4. In the input field DEVICE ID specify your corresponding device ID, for the status field specify
        "I", and execute.
    If there are entries in the AT table now, but they do not reach the outbound queue when you trigger the
    extract, then some extract queue is blocked. If there are no entries in the AT table, then the specified
    rule is not being satisfied.
    Thanks,
    Swarna
    Now if you have entries w

  • How to Package Photoshop Extension for cs4/cs5 with manifest.xml

    Hello,
    tl;dr: How can I package my extension for both cs4 and cs5 in a way that respects the extension's window geometry?
    I have a panel that specifies window geometry in its manifest.xml. When the panel is installed into Photoshop's panels/ folder, the geometry gets ignored, so I'm trying to package the panel as an extension to be installed via the Extension Manager. I have run into different problems for each CS version. I've read quite a few PDFs about the Extension Manager, UCF command line packaging, etc., and have not been able to find a solution that works for both versions.
    My understanding
    My trial and error research has led me to understand that the extension's files must eventually end up in the (mac) /Library/Application Support/Adobe/<CSversion>ServiceManager/extensions/ directory. In CS5, if the extension's folder contains a manifest.xml file (ex: /extensions/GuideGuide/CSXS/manifest.xml) that specifies window geometry, Photoshop will respect that window geometry when the panel opens. However, in CS4, this is not enough. From my tests, Photoshop CS4 doesn't seem to do anything with the manifest.xml file. Instead, I had to modify /Library/Application Support/Adobe/CS4ServiceManager/ServiceManifest.xml and add my extension to its contents. Once I did this, Photoshop CS4 launched my panel and respected its window geometry.
    The problem with this is that it's a manual installation. I can't ask my users to dig around in their system files to install my panel. In addition, since the panel is unsigned with this method, it won't work unless they are flagged for debugging. I've started exploring the Extension Manager, but I have run into problems that I cannot find answers to in the few PDFs about packaging that I've been able to find.
    CS5 Problems
    If I use the UCF command line tool, I can package and sign my file. It installs fine and does what I want. However, using this method, I haven't been able to find a way to specify the author and description that shows up in the Extension Manager.
    CS4 Problems
    The UCF command line tool doesn't appear to make packages that can be installed by CS4, and I haven't been able to find one that is compatible. I've had to resort to using the Extension Manager to package my extension based on an .mxi file. The problem I have here is that installing files this way limits me to putting them in the /panels directory, which then causes the panel to ignore the window geometry. Is there a way with an .mxi file to install in the /Library/Application Support/Adobe/<CSversion>ServiceManager/extensions/ directory and modify the /Library/Application Support/Adobe/CS4ServiceManager/ServiceManifest.xml file?
    There must be, because Kuler and other extensions are installed there.
    Thank you SO much for any help you might provide. I've spent weeks trying to get this to work and have run into nothing but dead ends and wild goose chases.

    Your extension is a CSXS extension. Support for CSXS extensions was introduced in Extension Manager 2.1. (You can download Extension Manager 2.1 from http://www.adobe.com/exchange/em_download/em20_download.html)
    Extension Manager 2.1 supports only MXP packages, while in Extension Manager CS5 a CSXS extension must be packaged in ZXP format. So you have to generate two packages: one for CS4 (MXP) and one for CS5 (ZXP).
    For CS5, you can use ucf.jar to generate the ZXP package.
    For CS4, you have to create an MXI file and package it with Extension Manager 2.1 into an MXP package. Here is a sample CSXS MXI file:
    <macromedia-extension
               name="CSXS_TEST_EXTENSION"
               version="1.0.0"
               type="Command"
               requires-restart="true">
              <author name="Macromedia" />
              <products>
                        <product name="Dreamweaver" version="10" primary="true" />
                        <product name="Fireworks" version="10" primary="true" />
                        <product name="Flash" version="10" primary="true" />
                        <product name="" version="11" familyname="Photoshop" primary="true" />
                        <product name="Illustrator" version="14" primary="true" />
                        <product name="CSXS" version="1" />
              </products>
              <description>
              <![CDATA[
              CSXS extension sample.
              ]]>
              </description>
              <ui-access>
              <![CDATA[
              Extension Name: kuler
              ]]>
              </ui-access>
              <license-agreement>
              <![CDATA[
              ]]>
              </license-agreement>
              <files>
                        <file source="test_extension" destination="" />
              </files>
    </macromedia-extension>

  • Is there a way for html code to be automatically loaded in a html document?

    I am using DreamWeaver (CS5.5) what I am looking for is two (2) things:
    Most webpages are consistently formatted: header, navigation, content and footer. Typically the header, navigation and footer are identical from page to page in the website, be it two (2) pages or 2,000 pages.
    It would be nice if there were an HTML tag or a JavaScript procedure that would allow the designer to create a series of html pages that only contained one of the page elements, header, navigation or footer and after testing be able to import this code into a  page that is under development like you can a snippet.
    I know that I could create the code and test it then copy and paste it into each of the new pages being created.
    What would be ideal is html tag like the link tag that would automatically import the tested code into the new page, like is done with css, so that the designer and/or the coder only has to create and test this code once.
    Obviously, integration testing would still have to be performed.

    To add to Preran's response, you could also consider using php includes. Look here for an example: http://www.tizag.com/phpT/include.php
    The only change is your file extension from .html to .php. You can then let your server handle the parsing of PHP. Note that in order for PHP to work on your local (development) system, you need WAMP (Windows) or MAMP (Mac).
    WAMP: http://www.wampserver.com/en/
    MAMP: http://www.mamp.info/en/

  • IOS Mail changes .mhtml extension to .html when opening in another app

    Hey there guys,
    Just noticed this, this morning. Recently, our Backup server admin changed the report sent out by our Microsoft DPM server from PDF to MHTML output. (DPM could *never* format the PDF correctly and, surprise surprise, can format the MHTML output just fine).
    Anyways, I purchased an MHTML viewer from the Mac App Store for OSX - all fine. I purchased the iOS version of the same developer's viewer this morning and tried to open the mail attachment in that app. Oddly, it just showed as text - just like it does inside the iOS Mail app itself.
    It was then that I noticed that the attachment extension was .html, not .mhtml - yet the file attached to the email is definitely showing as .mhtml.
    I tried a wee experiment and opened the attachment into Dropbox instead - sure enough, the extension was changed to .html again.
    I then forwarded that message out to my gmail using the Mailbox app and opened the attachment into the MHTML viewer from there - perfect! Because it didn't change the extension as it passed the attachment to the opening app.
    So, I have what obviously is a bug in the iOS Mail app - I just want to report it to Apple so it can be fixed. I can't find anywhere to log this bug without having to pay money for a support incident.
    Knowing that Apple frequent these Communities (I have been contacted by them in the past following posts I've made here), I thought this might be the best course of action.  Or, if I am wrong, someone would let me know!  (Either about the bug, or how to report it)... ;-)
    Thanks!
    William

    Apple doesn't officially monitor these forums.  If you want to talk to Apple directly, then go to https://getsupport.apple.com or http://www.apple.com/feedback/

  • Developing and using Adobe AIR native extensions for Android devices

    I was using this tutorial:
    "Developing and using Adobe AIR native extensions for Android devices"
    http://www.adobe.com/devnet/air/articles/ane-android-devices.html
    When packing the Flex mobile ANESampleTest to deploy on an Android device, the below error happens
    Error occurred while packaging the application:
    aapt tool failed:invalid resource directory name: /private/var/folders/k8/1thhvkf92h947n_g22hg_v9m0000gn/T/52ba05aa-9001-4d46-9438-db81ef8306f0/res/drawable-xxhdpi
    invalid resource directory name: /private/var/folders/k8/1thhvkf92h947n_g22hg_v9m0000gn/T/52ba05aa-9001-4d46-9438-db81ef8306f0/res/values-sw600dp
    invalid resource directory name: /private/var/folders/k8/1thhvkf92h947n_g22hg_v9m0000gn/T/52ba05aa-9001-4d46-9438-db81ef8306f0/res/values-sw720dp-lan
    Does anyone know what the issue might be?

    Did you find a workaround for the Error? I'm getting the same and I can't seem to find any solution.

  • Remove an html attribute in a html document

    Hi...
    Can someone give me a piece of code describing how I can remove a html attribute for the text under the current caret position or selection?
    If I have a <font class="..."> tag, I want to delete the class attribute, but maybe still want to have the font tag because it can contain some other attributes, face, size etc.
    I can get the current attribute with the AttributeSet attr = getCharacterElement().getAttributes().
    To be able to delete an attribute in the AttributeSet, I have to put it in a SimpleAttributeSet object?
    If I do that and delete the class attribute with the removeAttribute method and then put the changed AttributeSet back with setCharacterAttributes and setting the replace parameter to true, the selected text is removed...
    Any solutions?

    I don't know the exact answer; these are only suggestions.
    The foreground color in an HTMLDocument differs from that in a DefaultStyledDocument:
    in the HTMLDocument, color is specified with the appropriate tags (like <FONT>).
    When you change the source HTML text, the document tree structure is rebuilt from the HTML content and your settings disappear.
    Try changing the source HTML text instead.
    Could you tell me what you want to achieve?
    Maybe I can suggest a solution...
    I tried to create an example, but it's unstable.
    import java.awt.*;
    import java.awt.event.*;
    import javax.swing.*;
    import javax.swing.text.*;
    import javax.swing.text.html.*;
    import javax.swing.event.*;
    import javax.swing.tree.*;
    import java.util.*;

    class Test extends JFrame {
        JEditorPane edit;

        public Test() {
            super("Test");
            this.setDefaultCloseOperation(JFrame.EXIT_ON_CLOSE);
            getContentPane().setLayout(new BorderLayout());
            edit = new JEditorPane();
            edit.setEditorKit(new HTMLEditorKit());
            //edit.setDocument(new MyHTMLDocument());
            edit.setEditable(true);
            edit.setText("<HTML><FONT CLASS='c1' COLOR=RED><B>red text</B></FONT></HTML>");
            JScrollPane scroll = new JScrollPane(edit);
            getContentPane().add(scroll, BorderLayout.CENTER);
            // Show the document's element tree next to the editor.
            JTree tree = new JTree((TreeNode) edit.getDocument().getDefaultRootElement());
            getContentPane().add(tree, BorderLayout.WEST);
            JButton btn = new JButton("test");
            ActionListener lst = new ActionListener() {
                public void actionPerformed(ActionEvent e) {
                    HTMLDocument html_doc = (HTMLDocument) edit.getDocument();
                    Element el = html_doc.getCharacterElement(1);
                    MutableAttributeSet newAttr = new SimpleAttributeSet();
                    StyleConstants.setForeground(newAttr, Color.blue);
                    html_doc.setCharacterAttributes(1, 3, newAttr, false);
                    try {
                        html_doc.insertString(0, "", null); // this call does nothing but notify that the document has changed
                    } catch (Exception ex) {
                        ex.printStackTrace();
                    }
                    Element font = el.getParentElement();
                    MutableAttributeSet attr = (MutableAttributeSet) el.getAttributes();
                }
            };
            btn.addActionListener(lst);
            getContentPane().add(btn, BorderLayout.SOUTH);
            setSize(300, 300);
            setVisible(true);
        }

        public static void main(String a[]) {
            new Test();
        }
    }
    and don't worry about dukes:-))
    i just want to help you.
    best regards
    Stas

  • How to change the color for HTML words in JEditorPane?

    Hi Sir,
    In a JTextPane, we can change a word's color by using:
    Style style = doc.addStyle("test",null);
    StyleConstants.setForeground(style, Color.red);
    doc.setCharacterAttributes(10,20,style,true);
    This turns the text red over the range from offset 10 to 30 (offset 10, length 20).
    But how can I change the color of HTML words in a JEditorPane?

    Hi,
    you can use an AttributeSet to apply the foreground color. Let's say doc is an HTMLDocument; then
      SimpleAttributeSet set = new SimpleAttributeSet();
      doc.getStyleSheet().addCSSAttribute(set, CSS.Attribute.COLOR, "#0D0D0D");
    would apply a color to a given AttributeSet. The AttributeSet with your color can then be applied to a selected range of text in a JEditorPane by
      /**
       * set the attributes for a given editor. If a range of
       * text is selected, the attributes are applied to the selection.
       * If nothing is selected, the input attributes of the given
       * editor are set thus applying the given attributes to future
       * inputs.
       * @param editor  the editor pane to apply the attributes to
       * @param a  the set of attributes to apply
       */
      public void applyAttributes(JEditorPane editor, AttributeSet a) {
        // The color was already added to the attribute set before this call.
        HTMLDocument doc = (HTMLDocument) editor.getDocument();
        editor.requestFocus();
        int start = editor.getSelectionStart();
        int end = editor.getSelectionEnd();
        if(end != start) {
          doc.setCharacterAttributes(start, end - start, a, false);
        }
        else {
          MutableAttributeSet inputAttributes =
            ((SHTMLEditorKit) editor.getEditorKit()).getInputAttributes();
          inputAttributes.addAttributes(a);
        }
      }
    Ulrich

  • How to design Flat file for loading attribute dimension in a planning application

    Dear Gurus,
    I have a requirement to extract attribute dimensions from an essbase application and load it to another planning application. I have a dimension called Program and two attribute dimensions Sales Manager, Accounts manager associated with Program dimension in Essbase application. I will Extract these dimensions using Essbase outline extractor. After Extracting the attribute dimensions I have to load these dimensions to planning applications using outline load utility. Kindly guide me how to design the flat file for loading attribute dimensions in planning application.
    Thanks and Regards
    SC

    You could dig through the docs and try to figure out the file format manually, or you could do this the easy way.  Simply use the Outline Load Utility to export your attribute dimension from Planning.  The export file format is the same as the import file format.  You might have to manually add a couple of test members to your attribute dimension so that your export file has some content.  Then simply update the file you exported, and import it.
    (I am assuming you have already manually created the Attribute dimension in Planning, and that you simply need to add members to it.)
    Hope this helps,
    - Jake

  • Browser is not opening from the webgui or SAP GUI for HTML

    Hi Experts
    I am in a critical situation and need help with the scenario below.
    I have installed NW 7.0 with a few functional components. At the users' request I activated WebGUI, and it works fine over the internet. For this we did the following:
    1. Activated the webgui/mimes in SICF.
    2. Published WEBGUI and SYSTEM in SE80.
    3. Created a reverse proxy in Microsoft ISA, through our network team, for the internet users.
    example : http://servername.xyz.com:8005/sap/bc/.../webgui
    reverse proxy URL : http://sapabap.domain.com/sap/bc../webgui -- working URL
    I can access SM50, SM21 and so on. But when I try the SOAMANAGER transaction, or any other transaction that opens a browser to display its screen, it does not work. It picks up the intranet link, or the server name with port ( http://servername.xyz.com:8005/sncrr or soamanager..). Does HTTPS need to be enabled? (I think it is not necessary, since it is a secured login procedure.) In SE80 -> Settings -> ITS tab I can see servername.xyz.com:8005.
    May I know why it does not pick up the reverse proxy URL? I have tried many options in the hosts file, but it is still not working.
    Regards
    Bala

    Hi Bala,
    If you go to transaction SE80 -> Repository Information System -> Other objects -> Transactions and you search for SOAMANAGER, you get into the screen where you can modify attributes for this transaction. Please check if there are strange values entered in there... Or maybe SAPGUI for HTML is not selected.
    Kind regards,
    Mark

  • Question about HTML attributes

    I am parsing HTML code and need some help. Maybe it's because
    I don't understand HTML code or something.
    Anyways, I'm trying to get the content of an HTML element. Given the
    following Java code:
         // Iterate through the elements of the HTML document.
         ElementIterator it = new ElementIterator(doc);
         javax.swing.text.Element elem;
         while ((elem = it.next()) != null) {
              SimpleAttributeSet s = (SimpleAttributeSet)elem.getAttributes().getAttribute(HTML.Tag.A);
              if (s != null) {
                   System.out.println(s.toString());
                   System.out.println(s.getAttribute(HTML.Attribute.HREF));
              }
              System.out.println("element: " + elem.toString());
              if (elem.getName().equals("tr")) {
                   System.out.print("found TR: ");
                   Enumeration e = elem.getAttributes().getAttributeNames();
                   while (e.hasMoreElements())
                        System.out.print(e.nextElement().toString() + " ");
                   System.out.println();
              }
         }
    I can't figure out how to parse the content from table data.
    Here is my HTML code:
    <TR VALIGN="top">
    <TD ALIGN="left" COLSPAN="2">
    <FONT FACE="Monospace,Courier">KDEN 191753Z 10007KT 050V150 10SM FEW090 SCT120 SCT200 29/M01 A3004 RMK AO2
    SLP087 VIRGA DSNT S-NW TCU DSNT SW T02891011 10294 20128 58007</FONT><BR>
    <FONT FACE="Monospace,Courier">KDEN 191653Z VRB05KT 10SM FEW090 SCT120 SCT200 28/00 A3006 RMK AO2 SLP095 VIRGA
    DSNT SW-NW T02830000</FONT><BR>
    <BR>
    </TD>
    </TR>
    The code picks out the HTML attribute "valign" for the TR element. But
    how do I get to the actual content?
    What I would like is to get the string that begins with KDEN.
    Any comments would be helpful.
    Thanks.
    -brad w.

    I was able to make this work.
    What I did was the following:
    <code>
         // Parse the HTML.
         kit.read(rd, doc, 0);
         // Iterate through the elements of the HTML document.
         ElementIterator it = new ElementIterator(doc);
         javax.swing.text.Element elem;
         while ((elem = it.next()) != null) {
    //          System.out.println("element: " + elem.toString());
              // Text runs appear as leaf elements named "content".
              if (elem.getName().equals("content")) {
                   start = elem.getStartOffset();
                   end = elem.getEndOffset();
                   // System.out.println("found content: beg_offset: " + start + "end_offset: " + end);
                   if ((end-start) < 4)
                        continue;
                   try {
                        loc = doc.getText(start, 4);
                   } catch (BadLocationException ex) { continue; }
                   if (loc.startsWith("KDEN"))
                        System.out.println(doc.getText(start, end - start));
              }
         }
    </code>
    I need to use the start and end position of the element to get
    the text from the Document.
    -brad w.
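
    For comparison, the same idea (collect the text runs inside the table cells, then keep the ones that start with the station code) can be sketched outside Swing. The block below uses Python's standard html.parser rather than HTMLEditorKit; TextCollector and reports_for are hypothetical names, and the sketch assumes the reports sit inside TD cells as in the sample above.

```python
from html.parser import HTMLParser


class TextCollector(HTMLParser):
    """Collect the text runs found inside <td> cells."""

    def __init__(self):
        super().__init__()
        self.depth = 0    # greater than 0 while inside a <td>
        self.chunks = []

    def handle_starttag(self, tag, attrs):
        if tag == 'td':
            self.depth += 1

    def handle_endtag(self, tag):
        if tag == 'td' and self.depth:
            self.depth -= 1

    def handle_data(self, data):
        if self.depth:
            # Collapse line breaks inside a report into single spaces.
            text = ' '.join(data.split())
            if text:
                self.chunks.append(text)


def reports_for(html_text, station='KDEN'):
    """Return the text runs that start with the given station code."""
    parser = TextCollector()
    parser.feed(html_text)
    parser.close()
    return [chunk for chunk in parser.chunks if chunk.startswith(station)]
```

    Here the parser hands over each text run directly, so no start and end offsets are needed; the prefix check plays the same role as loc.startsWith("KDEN") in the Java version.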

Maybe you are looking for

  • GUI_UPLOAD to read data from an Excel File

    Hi Folks, I'm using FM GUI_UPLOAD to read data from an Excel File. But all I see in the table returned is 1 row with garbage values (special characters). Excel Workbook has proper data in the sheet, but its not getting uploaded properly. Sy-subrc is 0

  • Unimplemented error

    Aslam o Alikum (Hi) i am using Reports 6i. From Screen previewer when i click Generate to file -> pdf or other formats it show me error message "Unimplemented error". But report run well in previewer. solution urgently needed. Allah Hafiz

  • 11i post install problem on winXP

    Hi, I installed 11i on my xp machine. Staging is done, pre-install checks are also done. Then it begins installation, progress bar appears and then it goes fast and progress bar is 100% in 2 mins and the progress bar window disappears. I am not getti

  • Empty blue screen

    as the computer starts up, after the grey screen it gets stuck at "Login Window starting" and then goes to a blue screen with a cursor. I can connect a USB mouse which moves the cursor, but the wireless keyboard does not work. Please help!as the comp

  • Dummy business place

    hi experts, I`m looking for inputs on use of dummy business place with respect to WHT (TDS & TCS) configuration for India. Under the current scenario, we have multiple business place and one section code to be setup. But I need inputs on whether we n