XSL transformation from HTML to text.
Hi, I want to transform an HTML source and produce text output. All I want to do is output the values of the input fields in my HTML source. Any ideas on how I should construct my XSL file?
Example:
HTML:
<html>
<body>
<input name="od" type="text" value="123">
<input name="id" type="text" value="456">
</body>
</html>
would simply give :
123
456
Thanks for your help !!!
Here is what I came up with. I changed the regular HTML into XHTML, then created a stylesheet that uses XPath to find and display the values of the value attributes:
test.xml (XHTML version of the HTML you posted)
======================================
<?xml version="1.0" encoding="UTF-8"?>
<?xml-stylesheet type="text/xsl" href="test.xsl"?>
<html>
<body>
<input name="od" type="text" value="123"/>
<input name="id" type="text" value="456"/>
</body>
</html>
test.xsl
======
<?xml version="1.0" encoding="UTF-8"?>
<xsl:stylesheet version="1.0" xmlns:xsl="http://www.w3.org/1999/XSL/Transform">
<xsl:template match="/">
<html>
<head>
<title>Input values</title>
</head>
<body>
<xsl:for-each select="html/body/input">
<xsl:value-of select="@value"/>
</xsl:for-each>
</body>
</html>
</xsl:template>
</xsl:stylesheet>
This shows the desired values when test.xml is opened in a browser, although the output is an HTML page rather than plain text.
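For plain-text output, as the question asks, a variant of the stylesheet can declare xsl:output method="text" and emit a newline after each value. The sketch below is only an illustration: it runs such a stylesheet through the JDK's built-in javax.xml.transform API, and the class and method names (InputValuesToText, extractValues) are made up for this example. It assumes the input is already well-formed XHTML without a default namespace, as in test.xml.

```java
import java.io.StringReader;
import java.io.StringWriter;
import javax.xml.transform.Transformer;
import javax.xml.transform.TransformerException;
import javax.xml.transform.TransformerFactory;
import javax.xml.transform.stream.StreamResult;
import javax.xml.transform.stream.StreamSource;

// Hypothetical helper, not part of the original answer.
class InputValuesToText {

    // Same XPath as test.xsl, but with text output and one value per line.
    static final String XSL =
        "<xsl:stylesheet version='1.0' xmlns:xsl='http://www.w3.org/1999/XSL/Transform'>"
      + "<xsl:output method='text'/>"
      + "<xsl:template match='/'>"
      + "<xsl:for-each select='html/body/input'>"
      + "<xsl:value-of select='@value'/>"
      + "<xsl:text>&#10;</xsl:text>"
      + "</xsl:for-each>"
      + "</xsl:template>"
      + "</xsl:stylesheet>";

    // Applies the stylesheet to an XHTML string and returns the text result.
    static String extractValues(String xhtml) {
        try {
            Transformer t = TransformerFactory.newInstance()
                .newTransformer(new StreamSource(new StringReader(XSL)));
            StringWriter out = new StringWriter();
            t.transform(new StreamSource(new StringReader(xhtml)), new StreamResult(out));
            return out.toString();
        } catch (TransformerException e) {
            throw new IllegalStateException("transformation failed", e);
        }
    }

    public static void main(String[] args) {
        String xhtml = "<html><body>"
            + "<input name='od' type='text' value='123'/>"
            + "<input name='id' type='text' value='456'/>"
            + "</body></html>";
        System.out.print(extractValues(xhtml));
    }
}
```

The same stylesheet text could be saved as test.xsl and run with any XSLT 1.0 processor; the Java wrapper is only there to make the example runnable on its own.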
Similar Messages
-
Xsl transformation from version1 to version2, problem with namespaces
Guys!
In my current project we need an interface in Oracle ESB that is built on, let's say, WSDL version 1, and another interface built on WSDL version 2.
In the ESB I need to define a transformation that converts a version 1 request to version 2. Because the XSD for the operation is really huge (1000+ items), I made some XSL templates to do most of the work. That works great, but I'm having a few issues now.
To re-order items from source to target I do the following in a template:
<nameGroep>
  <xsl:copy-of select="andhere the xpath from source"/>
  <xsl:copy-of select="andhere the xpath from source"/>
  <xsl:copy-of select="andhere the xpath from source"/>
</nameGroep>
The only problem with xsl:copy-of is that it also copies the namespace along. So if my target document uses another namespace, it fails.
To correct this I hoped I could make use of <xsl:namespace-alias>, but this doesn't work on a literal/text tag (I hope I'm explaining this correctly).
The other option is to do something like this for every element:
<elementname>
  <xsl:value-of select=""/>
</elementname>
But this will always create <elementname> in the target, whether or not it's in the source. You could check whether it's in the source, but that isn't a solution because I would then need a check for each of the 1000+ items in the source document, so we skip this idea.
So I've reached a point where I'm still searching for a good solution, and I hoped you guys could help me a bit with it.
If the problem isn't explained well, please say so and I will add extra info.
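One common way around the namespace problem is to rebuild each element with xsl:element and local-name() instead of xsl:copy-of, so that the elements end up in whatever namespace the stylesheet chooses, and elements that are absent from the source are simply never created. The sketch below is an illustration under assumptions, not Oracle ESB code: the namespaces urn:example:v1 and urn:example:v2, the sample element names and the class NamespaceRewrite are all invented, and the stylesheet is run with the JDK's built-in XSLT 1.0 processor.

```java
import java.io.StringReader;
import java.io.StringWriter;
import javax.xml.transform.Transformer;
import javax.xml.transform.TransformerException;
import javax.xml.transform.TransformerFactory;
import javax.xml.transform.stream.StreamResult;
import javax.xml.transform.stream.StreamSource;

// Hypothetical example class; namespaces and element names are placeholders.
class NamespaceRewrite {

    // Re-creates every element under its local name in the target namespace.
    // Attributes are copied; text is handled by the built-in template rules.
    static final String XSL =
        "<xsl:stylesheet version='1.0' xmlns:xsl='http://www.w3.org/1999/XSL/Transform'>"
      + "<xsl:output method='xml' omit-xml-declaration='yes'/>"
      + "<xsl:template match='*'>"
      + "<xsl:element name='{local-name()}' namespace='urn:example:v2'>"
      + "<xsl:copy-of select='@*'/>"
      + "<xsl:apply-templates select='node()'/>"
      + "</xsl:element>"
      + "</xsl:template>"
      + "</xsl:stylesheet>";

    static String rewrite(String xml) {
        try {
            Transformer t = TransformerFactory.newInstance()
                .newTransformer(new StreamSource(new StringReader(XSL)));
            StringWriter out = new StringWriter();
            t.transform(new StreamSource(new StringReader(xml)), new StreamResult(out));
            return out.toString();
        } catch (TransformerException e) {
            throw new IllegalStateException("transformation failed", e);
        }
    }

    public static void main(String[] args) {
        String v1 = "<a:order xmlns:a='urn:example:v1'><a:id>42</a:id></a:order>";
        // Prints the same structure with every element in urn:example:v2.
        System.out.println(rewrite(v1));
    }
}
```

To re-order items, the generic template can be combined with explicit xsl:apply-templates calls in the desired order, for example inside a template that matches the parent element.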
-
Error in generating a new XSL Transformer from large xslt File
Good day to all,
I am currently facing a problem: whenever I try to create a Transformer object from a TransformerFactory, a TransformerConfigurationException is thrown. I did some research on the net and understand that it is related to a 64 KB limit in the JVM. Is there an external package or project that has already addressed this problem? I checked Apache, and they have already patched the problem in Xalan 2.7.1, but I couldn't find any release of 2.7.1.
Please help
Regards
RollinMao
If you have the transformation rules in a separate XSLT file, you can use the com.icl.saxon package to transform the XML files. I have used this package with large XSL files and it has worked well.
-
When I set up my Yahoo email I chose HTML. However, in my messages I get all the code garbage. How can I switch to text mode? Thanks
Alan
The ability to read messages formatted in HTML will be available when you update your OS to 4.5. You can either wait until your carrier releases it, or follow these steps to install 4.5 onto your phone:
After you download and install the handheld software from a carrier's website that has the 4.5 OS, do a search on your computer to find and delete the vendor.xml file. The file is most likely in the following location: C:\Program Files\Common Files\Research In Motion\AppLoader. Afterwards just start the Desktop Manager and the software will be available for you to install.
**NOTE** - the handheld software you have must be for the same model as you have. You cannot install software for any other model.
Message Edited by jmrmb80 on 09-29-2008 07:48 PM
-
Process to Generate PDF from HTML Rich Text Editor
Hi,
I have a HTML form with the Yahoo Rich Text Editor.
The Form posts the rich text to a Livecycles process. [REST]
The input type is String. How could I convert that string into a PDF?
TIA
Michael
Thank you,
I saved the input in a temporary file, used the HTML2PDF component, then deleted the temp file.
Cheers
-
XSL transformation not working
Hi!
I am having problems when trying to run an XSL transformation from XML to XML (where the XML output is actually XHTML). It always fails on an <xsl:call-template name="something"/> when that <xsl:call-template/> is executed from another <xsl:template> which was itself invoked with <xsl:call-template>. The database version is 10.2.0.4.0; the error received is: ORA-00604: invalid character value 'burek' for attribute 'name'.
The transformation works in Java and Altova XMLSpy.
PL/SQL code:
procedure process_xsl(p_xml in clob, p_xsl in clob, p_result out clob) is
  w_xsl_proc dbms_XSLProcessor.Processor;
  w_xsl_ss dbms_XSLProcessor.Stylesheet;
  w_dom_xsl dbms_xmldom.DOMDocument;
  w_dom_xml dbms_xmldom.DOMDocument;
  w_parser dbms_xmlparser.Parser;
begin
  -- parse the XML and XSL CLOBs into DOMDocuments
  w_parser := dbms_xmlparser.newParser;
  dbms_xmlparser.parseClob(w_parser, p_xml);
  w_dom_xml := dbms_xmlparser.getDocument(w_parser);
  dbms_xmlparser.freeParser(w_parser);
  w_parser := dbms_xmlparser.newParser;
  dbms_xmlparser.parseClob(w_parser, p_xsl);
  w_dom_xsl := dbms_xmlparser.getDocument(w_parser);
  dbms_xmlparser.freeParser(w_parser);
  -- XSL processing
  w_xsl_proc := dbms_XSLProcessor.newProcessor;
  w_xsl_ss := dbms_XSLProcessor.newStylesheet(w_dom_xsl, null); -- the error is raised here
END;
Stylesheet:
<?xml version="1.0" encoding="UTF-8"?>
<xsl:stylesheet version="1.0" xmlns:xsl="http://www.w3.org/1999/XSL/Transform">
<xsl:output method="xml" version="1.0" encoding="UTF-8" indent="yes" doctype-public="-//W3C//DTD XHTML 1.0 Transitional//EN" doctype-system="http://www.w3.org/TR/xhtml1/DTD/xhtml1-transitional.dtd"></xsl:output>
<xsl:decimal-format name="dec" decimal-separator="," grouping-separator="."/>
<!-- Predefined constants from einvoice xml schema -->
<xsl:variable name="einvoiceIssuerCode" select="'II'"></xsl:variable>
<xsl:variable name="einvoiceRecipientCode" select="'IV'"></xsl:variable>
<xsl:variable name="einvoiceIssueLocationCode" select="91"></xsl:variable>
<xsl:variable name="einvoiceIssueDateCode" select="137"></xsl:variable>
<!-- Constants directly from document which is a part of transformation -->
<xsl:variable name="einvoiceNumber" select="/IzdaniRacunEnostavni/Racun/GlavaRacuna/StevilkaRacuna/text()"></xsl:variable>
<!-- Intro template -->
<xsl:template name="burek"> <!-- Second template called with xsl:call-template -->
<xsl:text>TEST</xsl:text>
</xsl:template>
<!-- Template in which we create html structure including css -->
<xsl:template name="einvoice">
<html xmlns="http://www.w3.org/1999/xhtml">
<head>
<meta http-equiv="Content-Type" content="text/html; charset=utf-8" />
<title>Vizualizacija e-računa št. </title>
<xsl:call-template name="burek"></xsl:call-template>
</head>
<body>
</body>
</html>
</xsl:template>
<!-- Intro template -->
<xsl:template match="/">
<xsl:call-template name="einvoice"></xsl:call-template> <!-- This call is OK -->
</xsl:template>
</xsl:stylesheet>
XML document
<?xml version="1.0" encoding="UTF-8"?>
<IzdaniRacunEnostavni>
<Racun Id="data">
<GlavaRacuna>
<VrstaRacuna>380</VrstaRacuna>
<StevilkaRacuna>1205019908211</StevilkaRacuna>
<FunkcijaRacuna>9</FunkcijaRacuna>
</GlavaRacuna>
<DatumiRacuna>
<VrstaDatuma>137</VrstaDatuma>
<DatumRacuna>2012-05-07T00:00:00.0Z</DatumRacuna>
</DatumiRacuna>
<DatumiRacuna>
<VrstaDatuma>263</VrstaDatuma>
<DatumRacuna>2012-04-28T00:00:00.0Z</DatumRacuna>
</DatumiRacuna>
<DatumiRacuna>
<VrstaDatuma>263</VrstaDatuma>
<DatumRacuna>2012-05-27T00:00:00.0Z</DatumRacuna>
</DatumiRacuna>
<DatumiRacuna>
<VrstaDatuma>263</VrstaDatuma>
<DatumRacuna>2012-03-28T00:00:00.0Z</DatumRacuna>
</DatumiRacuna>
<DatumiRacuna>
<VrstaDatuma>263</VrstaDatuma>
<DatumRacuna>2012-04-26T00:00:00.0Z</DatumRacuna>
</DatumiRacuna>
<DatumiRacuna>
<VrstaDatuma>263</VrstaDatuma>
<DatumRacuna>2012-04-27T00:00:00.0Z</DatumRacuna>
</DatumiRacuna>
<Lokacije>
<VrstaLokacije>91</VrstaLokacije>
<NazivLokacije>Ljubljana</NazivLokacije>
</Lokacije>
</Racun>
</IzdaniRacunEnostavni>
Edited by: 938026 on 01-Jun-2012 00:35
Hi,
I think your problem lies in the <title>. It contains a non-ASCII character (š), while the XML is declared as UTF-8. Make sure the title is actually encoded as UTF-8 and it should work.
Herald ten Dam
http://htendam.wordpress.com
-
How do we stop the XSL transform creating an empty tag when there is no inp
Here is a way to stop the XSL transform from creating an empty tag when there is no input.
1. Open the XSL Map in Jdev
2. Go to the Design Tab
3. Right click the tag in the target tree and select "Add XSL node -> xsl:if"
4. Create a new link from the source tag (the same one that is linked to the target tag) to the newly created xsl:if
For anyone coming in to find this, I located my answer here:
[Special Applet Attributes|http://java.sun.com/javase/6/docs/technotes/guides/plugin/developer_guide/special_attributes.html#codebase]
Thanks for reading.
Sorry for the interruption.
-
XSL Transformation crashes DW CS3
Whenever I select XSL Transformation from any of the menus,
it crashes DW CS3 asking the typical, "do you want to tell
microsoft about this error."
Anyone have this problem with using XSL Transformations? I'm
thinking about reinstalling DW to see if it fixes the problem.
Thanks
Forgot to mention that this is a Windows XP SP2 machine.
-
Extra text gets spit out from XML XSL transformation?
Hi:
I was trying to output transformed text from an XML file using an XSL stylesheet. It seems to spit out all the values from the XML file in between the element tags, but all I really need is a small chunk of it. Does anyone know how to get rid of the extra stuff in the generated text?
Thanks
If your XSL doesn't specify what to do with a particular node, the built-in template rules apply: elements are processed recursively and the text they contain is copied to the output. That's what is happening to you. The remedy is for your XSL to specify a template for the root node ("/") that produces only the output you want. Or something like that; your details are rather sketchy.
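The effect of the built-in rules can be reproduced with a two-element document. The sketch below is a minimal illustration with invented names (BuiltInRulesDemo, the elements r, a and b); it uses the JDK's built-in XSLT processor. The first stylesheet only has a template for b, so the text of a leaks through; the second adds a template for "/" that selects just the wanted node.

```java
import java.io.StringReader;
import java.io.StringWriter;
import javax.xml.transform.Transformer;
import javax.xml.transform.TransformerException;
import javax.xml.transform.TransformerFactory;
import javax.xml.transform.stream.StreamResult;
import javax.xml.transform.stream.StreamSource;

// Hypothetical demo class; element names are placeholders.
class BuiltInRulesDemo {

    static final String HEAD =
        "<xsl:stylesheet version='1.0' xmlns:xsl='http://www.w3.org/1999/XSL/Transform'>"
      + "<xsl:output method='text'/>";
    static final String B_TEMPLATE =
        "<xsl:template match='b'><xsl:value-of select='.'/></xsl:template>";

    // No template for "/": the built-in rules walk the tree and copy all text.
    static final String LEAKY = HEAD + B_TEMPLATE + "</xsl:stylesheet>";

    // A root template that only visits the nodes we care about.
    static final String FIXED = HEAD + B_TEMPLATE
      + "<xsl:template match='/'><xsl:apply-templates select='//b'/></xsl:template>"
      + "</xsl:stylesheet>";

    static String run(String xsl, String xml) {
        try {
            Transformer t = TransformerFactory.newInstance()
                .newTransformer(new StreamSource(new StringReader(xsl)));
            StringWriter out = new StringWriter();
            t.transform(new StreamSource(new StringReader(xml)), new StreamResult(out));
            return out.toString();
        } catch (TransformerException e) {
            throw new IllegalStateException("transformation failed", e);
        }
    }

    public static void main(String[] args) {
        String xml = "<r><a>noise</a><b>wanted</b></r>";
        System.out.println(run(LEAKY, xml)); // noisewanted
        System.out.println(run(FIXED, xml)); // wanted
    }
}
```

An alternative with the same effect is an empty template such as match="text()" that suppresses stray text, with explicit xsl:value-of calls where output is wanted.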
-
XSL Fragment into HTML via Client-Side Transform
I am designing a site for a school. I searched and found the
post here from July 25, and I have also read the Dreamweaver
help file till I'm blue in the face. They talk all around the
answer but never definitively say if it's possible to do this.
Dreamweaver help mentions:
-- Workflow for performing client-side XSL transformations
Do one of the following:
In your Dreamweaver site, create an entire XSLT page. See
Creating entire XSLT pages.
Convert an existing HTML page to an entire XSLT page. See
Converting HTML pages to XSLT pages.
All the online tutorials show server-side transforms but I'm
not skilled in that...nor do I know if the hosting entity will
provide that level of access to their .NET server.
---- ok. that's the background of the situation. Now to my
problem. ---
We plan to have two mutually exclusive areas on the home
page, such as news & events, that will be updated by a single
school employee. The plan is to create two XML text files that one
teacher can update.
The XML files will be manually uploaded to the web site and
the home page will read that data into properly formatted
information on the home page. I would greatly prefer to keep the
entire process as a client-side procedure.
I have created and linked XSL fragments to the XML data.
If I try to copy and paste code from the XSL fragment into
the index HTML page, I get nothing.
Success comes only after converting the home page into an
XSLT 1.0 file using Dreamweaver and copying and pasting the code
from the XSL file into the newly created XSLT file.
Hence my questions:
1 Can I bring these XSL fragments into an HTML home page or
do I have to convert it to XSLT?
2. If I must convert the HTML file to an XSLT file, can
people still type the website address in as www dot site dot com
and the XSLT file will open without anyone knowing the difference?
3. Can I even do this with a client-side transform?
4. Is it possible for one page to reference two separate XSL
fragments pointing to the two separate respective XML files?
Thank you very much for your help.
Hi Eric,
these are the cache control headers of the request that serves the XSLT:
GET http://www.carsten-leue.de/test/iframe_xslt/xslt.php HTTP/1.1
Accept: */*
Referer: http://www.carsten-leue.de/test/iframe_xslt/xslt.php
Accept-Encoding: gzip, deflate
User-Agent: Mozilla/5.0 (Windows NT 6.1; WOW64; Trident/7.0; rv:11.0) like Gecko
Host: www.carsten-leue.de
DNT: 1
Connection: Keep-Alive
There does not seem to be a header involved that prevents caching.
You mention the "legacy ActiveX" control. In which sense is this control involved in the usecase? In my scenario I am pointing the browser to the XML document that has an associated stylesheet and the browser automatically executes the transform.
I am not explicitly triggering the transform via some script in the page.
Does the ActiveX control still play a role in this scenario?
Carsten
-
Xmltype.transform and xsl:output method="html"
Hi, 9.2.0.4 on WinXP,
I wonder whether xmltype.transform respects any output instructions in the stylesheet. I requested each of xml, html and text and always got the same result.
Any ideas or hints to more info?
Regards, Peter
Sorry for jumping in on this thread, but I have a question regarding your reply. I have an XSL stylesheet that performs XML to HTML conversion. Everything works correctly with the exception of those HTML tags that are not well formed. Using your example, if I have something like:
<xsl:stylesheet xmlns:xsl="http://www.w3.org/1999/XSL/Transform">
<xsl:output method="html"/>
<input type="text" name="{NAME}" size="{DISPLAY_LENGTH}" maxlength="{LENGTH}"></input>
</xsl:stylesheet>
It would render HTML in the format of
<HTML>
<input type="text" name="in1" size="10" maxlength="20"/>
</HTML>
While IE can handle this, Netscape cannot. Is there any way to generate completely cross-browser compliant HTML with XSL?
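In XSLT 1.0 the html output method is specified to write empty elements such as input without an end tag and without a trailing "/>", so a processor that honours xsl:output method="html" should not produce the XML-style form. Whether xmltype.transform honours it is exactly what this thread questions; the sketch below only shows the behaviour of the JDK's built-in processor. The class name HtmlOutputDemo and the FIELD source element are invented for the example.

```java
import java.io.StringReader;
import java.io.StringWriter;
import javax.xml.transform.Transformer;
import javax.xml.transform.TransformerException;
import javax.xml.transform.TransformerFactory;
import javax.xml.transform.stream.StreamResult;
import javax.xml.transform.stream.StreamSource;

// Hypothetical demo class comparing the xml and html output methods.
class HtmlOutputDemo {

    // Made-up source document supplying the attribute values.
    static final String SOURCE =
        "<FIELD><NAME>in1</NAME><DISPLAY_LENGTH>10</DISPLAY_LENGTH><LENGTH>20</LENGTH></FIELD>";

    // Builds the same stylesheet with a configurable output method.
    static String stylesheet(String method) {
        return "<xsl:stylesheet version='1.0' xmlns:xsl='http://www.w3.org/1999/XSL/Transform'>"
             + "<xsl:output method='" + method + "'/>"
             + "<xsl:template match='/'>"
             + "<html><body>"
             + "<input type='text' name='{FIELD/NAME}' size='{FIELD/DISPLAY_LENGTH}'"
             + " maxlength='{FIELD/LENGTH}'/>"
             + "</body></html>"
             + "</xsl:template>"
             + "</xsl:stylesheet>";
    }

    static String render(String method) {
        try {
            Transformer t = TransformerFactory.newInstance()
                .newTransformer(new StreamSource(new StringReader(stylesheet(method))));
            StringWriter out = new StringWriter();
            t.transform(new StreamSource(new StringReader(SOURCE)), new StreamResult(out));
            return out.toString();
        } catch (TransformerException e) {
            throw new IllegalStateException("transformation failed", e);
        }
    }

    public static void main(String[] args) {
        System.out.println(render("xml"));  // the input element is written in XML empty-element form
        System.out.println(render("html")); // the input element is written as an HTML start tag only
    }
}
```

If a given processor ignores the output method, a workaround sometimes used is to serialize the result separately with a serializer that does support HTML output.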
Thanks!
-
How Do I Display HTML Formatted Text From A Data Table In Crystal Reports?
I'm creating reports in Crystal XI. The information being displayed in the reports comes from data tables where the text is formatted in HTML.
I've worked with Crystal Reports enough to know that HTML text pulled from a data table doesn't appear in Crystal the same way it does in a web browser. Crystal Reports ignores all the tags (...unless I'm missing something...) and just displays the text.
Someone far more Crystal savvy than I (...who I don't have access to...) came up with a Formula Field workaround that tricks Crystal Reports into displaying some basic HTML tags. Here's that workaround:
<!--
stringVar TableName := ;
TableName := Replace (TableName, "<ul>","<br> <br>");
TableName := Replace (TableName, "<li>", "<br> • ");
TableName := Replace (TableName, "</li>", "");
TableName := Replace (TableName, "</ul>","<br> <br>");
TableName := Replace (TableName, "<a", "<u><font color='blue'");
TableName := Replace (TableName, "</a>", "</font></u>");
TableName
-->
QUESTION - Does any similar workaround exist so I can display an HTML table in Crystal Reports? If not, is there any way to display HTML-formatted text from a data table in Crystal Reports as it would appear in a web browser?
Hi Steven,
To display HTML text in Crystal Reports, follow these steps:
1. Right-click the field and select the Paragraph tab.
2. Under 'Text Interpretation' select 'HTML Text' and click OK.
I have tried this, but it never works. So please reply if there is any way to solve the issue.
-
Problem to extract text from HTML document
I have to extract some text from HTML files into my database (about 1000 files).
The HTML files come from the ACM Digital Library. http://portal.acm.org/dl.cfm
Each HTML page holds the information about a paper. I only want to get the text of "Title", "Abstract", "Classification" and "Keywords".
The problem is that I can't find any pattern for parsing the HTML files.
EX: I need to get the Classification = "Theory of Computation","ANALYSIS OF ALGORITHMS AND PROBLEM COMPLEXITY","Numerical Algorithms and Problems","Mathematics of Computing","NUMERICAL ANALYSIS"......etc .
The section of code for "Classification" is below.
Please give me any idea of how to do this, or how to find a pattern for extracting the text.
<div class="indterms"><a href="#CIT"><img name="top" src=
"img/arrowu.gif" hspace="10" border="0" /></a><span class=
"heading"><a name="IndexTerms">INDEX TERMS</a></span>
<p class="Categories"><span class="heading"><a name=
"GenTerms">Primary Classification:</a></span><br />
&nbsp; <b>F.</b> <a href=
"results.cfm?query=CCS%3AF%2E%2A&coll=ACM&dl=ACM&CFID=22820732&CFTOKEN=38147335"
target="_self">Theory of Computation</a><br />
&nbsp; <img src="img/tree.gif" border="0" height="20" width=
"20" /> <b>F.2</b> <a href=
"results.cfm?query=CCS%3A%22F%2E2%22&coll=ACM&dl=ACM&CFID=22820732&CFTOKEN=38147335"
target="_self">ANALYSIS OF ALGORITHMS AND PROBLEM
COMPLEXITY</a><br />
&nbsp; &nbsp; &nbsp; <img src="img/tree.gif" border="0" height=
"20" width="20" /> <b>F.2.1</b> <a href=
"results.cfm?query=CCS%3A%22F%2E2%2E1%22&coll=ACM&dl=ACM&CFID=22820732&CFTOKEN=38147335"
target="_self">Numerical Algorithms and Problems</a><br />
</p>
<p class="Categories"><span class="heading"><a name=
"GenTerms">Additional&nbsp;Classification:</a></span><br />
&nbsp; <b>G.</b> <a href=
"results.cfm?query=CCS%3AG%2E%2A&coll=ACM&dl=ACM&CFID=22820732&CFTOKEN=38147335"
target="_self">Mathematics of Computing</a><br />
&nbsp; <img src="img/tree.gif" border="0" height="20" width=
"20" /> <b>G.1</b> <a href=
"results.cfm?query=CCS%3A%22G%2E1%22&coll=ACM&dl=ACM&CFID=22820732&CFTOKEN=38147335"
target="_self">NUMERICAL ANALYSIS</a><br />
&nbsp; &nbsp; &nbsp; <img src="img/tree.gif" border="0" height=
"20" width="20" /> <b>G.1.6</b> <a href=
"results.cfm?query=CCS%3A%22G%2E1%2E6%22&coll=ACM&dl=ACM&CFID=22820732&CFTOKEN=38147335"
target="_self">Optimization</a><br />
&nbsp; &nbsp; &nbsp; &nbsp; &nbsp; <img src="img/tree.gif" border=
"0" height="20" width="20" /> <b>Subjects:</b> <a href=
"results.cfm?query=CCS%3A%22Linear%20programming%22&coll=ACM&dl=ACM&CFID=22820732&CFTOKEN=38147335"
target="_self">Linear programming</a><br />
</p>
<br />
<p class="GenTerms"><span class="heading"><a name=
"GenTerms">General Terms:</a></span><br />
<a href=
"results.cfm?query=genterm%3A%22Algorithms%22&coll=ACM&dl=ACM&CFID=22820732&CFTOKEN=38147335"
target="_self">Algorithms</a>, <a href=
"results.cfm?query=genterm%3A%22Theory%22&coll=ACM&dl=ACM&CFID=22820732&CFTOKEN=38147335"
target="_self">Theory</a></p>
<br />
<p class="keywords"><span class="heading"><a name=
"Keywords">Keywords:</a></span><br />
<a href=
"results.cfm?query=keyword%3A%22Simplex%20method%22&coll=ACM&dl=ACM&CFID=22820732&CFTOKEN=38147335"
target="_self">Simplex method</a>, <a href=
"results.cfm?query=keyword%3A%22complexity%22&coll=ACM&dl=ACM&CFID=22820732&CFTOKEN=38147335"
target="_self">complexity</a>, <a href=
"results.cfm?query=keyword%3A%22perturbation%22&coll=ACM&dl=ACM&CFID=22820732&CFTOKEN=38147335"
target="_self">perturbation</a>, <a href=
"results.cfm?query=keyword%3A%22smoothed%20analysis%22&coll=ACM&dl=ACM&CFID=22820732&CFTOKEN=38147335"
target="_self">smoothed analysis</a></p>
</div>
One approach is to download HTMLParser from SourceForge (http://htmlparser.sourceforge.net/) and write rules to match the title, abstract, etc.
Another approach is to write your own parser that extracts only the title, abstract, etc.:
1. Tokenize the HTML file, i.e. convert the HTML into tokens (tags and values).
2. Write a simple parser to extract certain information.
Find the pattern of the text you want to extract, for instance <p class="abstract">, then write a rule for extracting the abstract, such as:
if (tag is abstract) then extract abstract text
Apply the same concept to the other tags.
Attached is the sample parser that was used to extract the title and abstract from ACM HTML files. Please modify it to include keywords and other fields.
Good luck
import java.io.BufferedReader;
import java.io.FileReader;
import java.io.IOException;
import java.io.InputStream;
import java.io.InputStreamReader;
import java.util.ArrayList;
import java.util.List;

public class ACMHTMLParser {
    private String m_filename;
    private URLLexicalAnalyzer lexical;
    List<String> urls = new ArrayList<String>();

    public ACMHTMLParser(String filename) {
        super();
        m_filename = filename;
    }

    /**
     * parses only title and abstract
     */
    public void parse() throws Exception {
        lexical = new URLLexicalAnalyzer(m_filename);
        String word = lexical.getNextWord();
        boolean isabstract = false;
        while (null != word) {
            if (isTag(word)) {
                if (isTitle(word)) {
                    System.out.println("TITLE: " + lexical.getNextWord());
                } else if (isAbstract(word) && !isabstract) {
                    parseAbstract();
                    isabstract = true;
                }
            }
            word = lexical.getNextWord();
        }
        lexical.close();
    }

    public static void main(String[] args) throws Exception {
        ACMHTMLParser parser = new ACMHTMLParser("./acm_html.html");
        parser.parse();
    }

    public static boolean isTag(String word) {
        return (word.startsWith("<") && word.endsWith(">"));
    }

    public static boolean isTitle(String word) {
        return ("<title>".equals(word));
    }

    // please modify according to the html source
    public static boolean isAbstract(String word) {
        return ("<p class=\"abstract\">".equals(word));
    }

    // prints the first non-tag token that follows the abstract tag
    private void parseAbstract() throws Exception {
        while (true) {
            String abs = lexical.getNextWord();
            if (abs == null) break; // end of input reached before any text
            if (!isTag(abs)) {
                System.out.println(abs);
                break;
            }
        }
    }
}

class URLLexicalAnalyzer {
    private BufferedReader m_reader;
    private boolean isTag; // true when the previous value scan consumed a '<'

    public URLLexicalAnalyzer(String filename) {
        try {
            m_reader = new BufferedReader(new FileReader(filename));
        } catch (IOException io) {
            System.out.println("ERROR, file not found " + filename);
            System.exit(1);
        }
    }

    public URLLexicalAnalyzer(InputStream in) {
        m_reader = new BufferedReader(new InputStreamReader(in));
    }

    public void close() {
        try {
            if (null != m_reader) m_reader.close();
        } catch (IOException ignored) {}
    }

    // returns the next token (a whole tag or a run of text), or null at end of input
    public String getNextWord() throws IOException {
        int c = m_reader.read();
        if (-1 == c) return null;
        if (Character.isWhitespace((char)c))
            return getNextWord();
        if ('<' == c || isTag)
            return scanTag(c);
        else
            return scanValue(c);
    }

    private String scanTag(final int c) throws IOException {
        StringBuilder result = new StringBuilder();
        if ('<' != c) result.append('<');
        result.append((char)c);
        int ch = -1;
        while (true) {
            ch = m_reader.read();
            if (-1 == ch) throw new IllegalArgumentException("un-terminate tag");
            if ('>' == ch) {
                isTag = false;
                break;
            }
            result.append((char)ch);
        }
        result.append((char)ch); // append the closing '>'
        return result.toString();
    }

    private String scanValue(final int c) throws IOException {
        StringBuilder result = new StringBuilder();
        result.append((char)c);
        int ch = -1;
        while (true) {
            ch = m_reader.read();
            if (-1 == ch) throw new IllegalArgumentException("un-terminate value");
            if ('<' == ch) {
                isTag = true;
                break;
            }
            result.append((char)ch);
        }
        return result.toString();
    }
}
-
Read Text from HTML-Pages and want to solve "ChangedCharSetException"
Hello,
I have an app that connects to pages via threads, parses them and gives me only the text version of each HTML page. It works fine, but if it finds a page where the text is within images, the whole app stops and gives me the message:
javax.swing.text.ChangedCharSetException
at javax.swing.text.html.parser.DocumentParser.handleEmptyTag(DocumentParser.java:169)
at javax.swing.text.html.parser.Parser.startTag(Parser.java:372)
at javax.swing.text.html.parser.Parser.parseTag(Parser.java:1846)
at javax.swing.text.html.parser.Parser.parseContent(Parser.java:1881)
at javax.swing.text.html.parser.Parser.parse(Parser.java:2047)
at javax.swing.text.html.parser.DocumentParser.parse(DocumentParser.java:106)
at javax.swing.text.html.parser.ParserDelegator.parse(ParserDelegator.java:78)
at aufruf.main(aufruf.java:33)
So I tried to catch it with "getCharSetSpec()" and "keyEqualsCharSet()" from the class "javax.swing.text.ChangedCharSetException" and hoped that this would solve the problem. But it still doesn't work...
Then I searched the web and found that I have to add the line:
doc.putProperty("IgnoreCharsetDirective", new Boolean(true));
"doc" is a new HTML document, created with the HTMLEditorKit. I do not have much knowledge about that, so I hope that someone can explain to me how I can solve that problem within my code.
Here we go:
import javax.swing.text.*;
import java.lang.*;
import java.util.*;
import java.net.*;
import java.io.*;
import javax.swing.text.html.*;
import javax.swing.text.html.parser.*;

public class myParser extends Thread {
    private String name;

    public void run() {
        try {
            URL viele = new URL(name); // "name" is a variable with a lot of links
            URLConnection hs = viele.openConnection();
            hs.connect();
            if (hs.getContentType().startsWith("text/html")) {
                InputStream is = hs.getInputStream();
                InputStreamReader isr = new InputStreamReader(is);
                BufferedReader br = new BufferedReader(isr);
                Lesen los = new Lesen();
                ParserDelegator parser = new ParserDelegator();
                parser.parse(br,los, false);
            }
        } catch (MalformedURLException e) {
            System.err.print("Doesn't work");
        } catch (ChangedCharSetException e) {
            e.getCharSetSpec();
            e.keyEqualsCharSet();
            e.printStackTrace();
        } catch (Exception o) {
        }
    }

    public void vowi(String n) {
        name = n;
    }
}
And in case it is important, here is the class "Lesen":
import java.net.*;
import java.io.*;
import javax.swing.text.*;
import javax.swing.text.html.*;
import javax.swing.text.html.parser.*;

class Lesen extends HTMLEditorKit.ParserCallback {
    public void handleStartTag(HTML.Tag t, MutableAttributeSet a, int pos) {
        try {
            if ((t==HTML.Tag.P) || (t==HTML.Tag.H1) || (t==HTML.Tag.H2) || (t==HTML.Tag.H3) || (t==HTML.Tag.H4) || (t==HTML.Tag.H5) || (t==HTML.Tag.H6)) {
                System.out.println();
            }
        } catch (Exception q) {
            System.out.println(q.getMessage());
        }
    }

    public void handleSimpleTag(HTML.Tag t,MutableAttributeSet a, int pos) {
        try {
            if (t==HTML.Tag.BR) {
                System.out.println(); // new line
                System.out.println();
            }
        } catch (Exception qw) {
            System.out.println(qw.getMessage());
        }
    }

    public void handleText(char[] data, int pos) {
        try {
            System.out.print(data); // prints the text from HTML pages
        } catch (Exception ab) {
            System.out.println(ab.getMessage());
        }
    }
}
Thanks a lot for helping...
Stephan
Change the third argument of the parse call (ignoreCharSet) from false to true, i.e. replace
parser.parse(br,los, false);
with
parser.parse(br,los, true);
-
Creating an XML From a Deep Structure using XSL Transformation
Hi ABAPers,
I have a requirement to use XSL transformations on an ABAP deep structure.
Currently I have an API that fills this deep structure, and by using CALL TRANSFORMATION ID... I get a big XML document with hundreds of nodes. But from the deep structure I actually need only some nodes (say 50). So I tried writing an XSLT in transaction STRANS, but when I use the transformation I wrote, I get an error message like INVALID XML...
Am I on the right track, or is there a better solution?
My sample transformation is below:
<xsl:transform version="1.0"
xmlns:xsl="http://www.w3.org/1999/XSL/Transform">
<xsl:strip-space elements="*"/>
<xsl:template match="/">
<xsl:value-of select="DATA/NODE_ELEMENTS/UUID_KEY/UUID"/>
<xsl:value-of select="DATA/NODE_ELEMENTS/SEMANTICAL_NAME"/>
<xsl:value-of select="DATA/NODE_ELEMENTS/STRUCT_CAT"/>
<xsl:value-of select="DATA/NODE_ELEMENTS/USAGE_CAT"/>
<xsl:value-of select="DATA/NODE_ELEMENTS/RESTRICTED_IND"/>
<xsl:value-of select="VALUES/DATA/NODE_ID"/>.
</xsl:template>
</xsl:transform>
Please help me in solving this issue....
Thanks,
Linda.
Hi Linda,
I am replying based on your sample code.
Try the following suggestions.
Here 'GRPHDR' is the node from which I am selecting the data.
IGRPHDR is the name of the reference.
First, call the transformation in your program.
TYPES: BEGIN OF tl_hdr,
msgid(20) TYPE c,
END OF tl_hdr.
DATA : t_hdr TYPE STANDARD TABLE OF tl_hdr.
GET REFERENCE OF t_hdr INTO l_result_xml-value.
l_result_xml-name = 'IGRPHDR'.
APPEND l_result_xml TO t_result_xml.
TRY.
CALL TRANSFORMATION yfi_xml_read
SOURCE XML it_xml_data
RESULT (t_result_xml).
CATCH cx_root INTO l_rif_ex.
l_var_text = l_rif_ex->get_text( ).
l_bapiret-type = 'E'.
l_bapiret-message = l_var_text.
APPEND l_bapiret TO errormsgs.
EXIT.
ENDTRY.
In the XSL transformation:
First write a block that specifies the node from which you are taking the data.
It does not matter whether it is a node or a sub-node.
<xsl:transform xmlns:xsl="http://www.w3.org/1999/XSL/Transform" version="1.0">
<xsl:output encoding="iso-8859-1" indent="yes" method="xml" version="1.0"/>
<xsl:strip-space elements="*"/>
<xsl:template match="/">
<asx:abap xmlns:asx="http://www.sap.com/abapxml" version="1.0">
<asx:values>
<IGRPHDR> <!-- reference name of internal table -->
<xsl:apply-templates select="//GrpHdr"/>
</IGRPHDR>
</asx:values>
</asx:abap>
</xsl:template>
Next, select the data from the nodes under the node specified in the transformation.
Here MSGID is the field whose value I am selecting.
<xsl:template match="GrpHdr">
<item>
<MSGID> <!-- field in the internal table t_hdr where the data has to go -->
<xsl:value-of select="MsgId"/>
</MSGID>
</item>
</xsl:template>
</xsl:transform>
Reply back if further clarification is needed.
Thanks and regards,
Kannan N
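One possible reason for an "invalid XML" result from a stylesheet like Linda's is that it writes several values with no enclosing element, so the output has no single root element; Kannan's version avoids this by wrapping everything in asx:abap. The sketch below illustrates just that point outside SAP, using the JDK's built-in XSLT processor. The class name SingleRootDemo, the SELECTED wrapper element and the tiny DATA sample are invented for the example, and the asx:abap envelope that CALL TRANSFORMATION expects is not reproduced.

```java
import java.io.StringReader;
import java.io.StringWriter;
import javax.xml.parsers.DocumentBuilderFactory;
import javax.xml.transform.Transformer;
import javax.xml.transform.TransformerException;
import javax.xml.transform.TransformerFactory;
import javax.xml.transform.stream.StreamResult;
import javax.xml.transform.stream.StreamSource;
import org.xml.sax.InputSource;

// Hypothetical demo class; the sample data and wrapper element are placeholders.
class SingleRootDemo {

    static final String SAMPLE =
        "<DATA><NODE_ELEMENTS><UUID_KEY><UUID>U1</UUID></UUID_KEY>"
      + "<SEMANTICAL_NAME>NAME1</SEMANTICAL_NAME></NODE_ELEMENTS></DATA>";

    static final String HEAD =
        "<xsl:stylesheet version='1.0' xmlns:xsl='http://www.w3.org/1999/XSL/Transform'>"
      + "<xsl:template match='/'>";
    static final String TAIL = "</xsl:template></xsl:stylesheet>";

    // Bare values, as in the question: the result has no root element.
    static final String FLAT = HEAD
      + "<xsl:value-of select='DATA/NODE_ELEMENTS/UUID_KEY/UUID'/>"
      + "<xsl:value-of select='DATA/NODE_ELEMENTS/SEMANTICAL_NAME'/>"
      + TAIL;

    // The same values wrapped in one root element with a child per value.
    static final String WRAPPED = HEAD
      + "<SELECTED>"
      + "<UUID><xsl:value-of select='DATA/NODE_ELEMENTS/UUID_KEY/UUID'/></UUID>"
      + "<SEMANTICAL_NAME><xsl:value-of select='DATA/NODE_ELEMENTS/SEMANTICAL_NAME'/></SEMANTICAL_NAME>"
      + "</SELECTED>"
      + TAIL;

    static String run(String xsl, String xml) {
        try {
            Transformer t = TransformerFactory.newInstance()
                .newTransformer(new StreamSource(new StringReader(xsl)));
            StringWriter out = new StringWriter();
            t.transform(new StreamSource(new StringReader(xml)), new StreamResult(out));
            return out.toString();
        } catch (TransformerException e) {
            throw new IllegalStateException("transformation failed", e);
        }
    }

    // True if the string parses as a well-formed XML document.
    static boolean isWellFormed(String xml) {
        try {
            DocumentBuilderFactory.newInstance().newDocumentBuilder()
                .parse(new InputSource(new StringReader(xml)));
            return true;
        } catch (Exception e) {
            return false;
        }
    }

    public static void main(String[] args) {
        System.out.println(isWellFormed(run(FLAT, SAMPLE)));    // false
        System.out.println(isWellFormed(run(WRAPPED, SAMPLE))); // true
    }
}
```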