Searching shtml docs returns raw html tags

I have a v6.1 web server running on Solaris 9.
I created a small collection of docs that are mainly .shtml documents, with a couple of .html pages thrown in.
The result page of every search shows the raw HTML of all .shtml matches instead of properly formatted text, while the formatting of results for .html matches is fine.
You can see a sample here:
http://www-personal.umich.edu/~vaughan/searchresult.jpg
I have a default doc type of text/html set for the server, and have tried it with parsing of .shtml set, as well as parsing all files.
Anyone know how to get search results for .shtml files to be formatted correctly?
thanks!

Similar Messages

How to add raw HTML tags inside JSF tags...

Hi
I would like to use a raw HTML <input> tag in one area of my project. The following code hides the raw HTML input and forwards the click event from a JSF command button to it. After a file is selected, its value should be copied to a JSF text field.
My code is below.
<h:form id="detailForm" onsubmit="printElements(detailForm,this)">
<f:verbatim>
<input id="uploadFile" type="file" style ="dispaly:none"size="100" />
</f:verbatim>
<h:inputText id="docName" style="width:650px;" maxlength="100"/>
<h:commandButton id="visibleBrowseButton" value="Select File..." onclick="'detailForm:uploadFile'.click();callClick();">
</h:commandButton>
</h:form>
<script type="text/javascript">
function callClick() {
    var val = document.detailForm.uploadFile.value;
    document.getElementById('detailForm:docName').value = val;
}
</script>
This page works fine in IE, but in Mozilla Firefox it fails at 'detailForm:uploadFile'.click().
I suspect the JSF page cannot detect the raw HTML tag inside the JSF tags. Even wrapping it in <f:verbatim> does not help.
I would like to know:
1. Whether the code is right, and if it is wrong, why it runs in IE but not in Firefox.
2. How raw HTML tags can be integrated inside JSF tags.

First of all, why are you ignoring valuable answers about a JSF fileupload component in your previous topic?
Second, you can just nest raw HTML anywhere in your JSF page. Your problem is rather related to JavaScript. It has nothing at all to do with Java or JSF. Learn JavaScript (there is a nice tutorial at w3schools.com) and look for a JavaScript forum if you are still stuck. There are ones at webdeveloper.com and dynamicdrive.com.
The f:verbatim is only required if you were using JSF 1.1 or older, which is not the case. Otherwise you would have run into completely different problems.

XML with HTML Tags... (easy points) 11g question

Dear Programming Gods,
I have been researching this for about 2 weeks now, and I have hit the final road block.
Using BI Publisher 11.1.1
In APEX 4.0 I have a Rich Text Field. The data from the Rich Text Field is stored in a CLOB. The data has HTML formatting tags. I have a data model that selects the data. I create an XML file which has the HTML tags. I export the XML and import it into MS Word using the BI Publisher add-in. I import my subtemplate, which handles almost all of the formatting tags. I apply the template to the CLOB field so that the HTML formatting tags will be rendered when printed.
The problem is this. The subtemplate is looking for < and />, but BI Publisher converts the tags stored in the CLOB from raw HTML tags to &lt; and &gt;, so the subtemplate cannot match them.
Here is what I need to figure out, and please explain it in very novice terms.
When I generate and export the XML from BI Publisher, how do I prevent it from converting my raw tags?
Here is some further information to help when preparing your answer.
My subtemplate is based on the htmlmarkup.xsl from the following blog but has been modified heavily to include support for simple tables and more formatting such as subscripts and superscripts, etc.
http://blogs.oracle.com/xmlpublisher/2007/01/formatting_html_with_templates.html
I am also familiar with this blog but I do not understand how to implement it using BI 11g.
http://blogs.oracle.com/xmlpublisher/2009/08/raw_data.html
I have tried adding this to my layout but it doesn't seem to work.
<xsl: element name="R_CLOB" dataType="xdo:xml" value="R_CLOB" / >
Please, help me. I have to have this working in 4 days.
Richard
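
The conversion described above is ordinary XML text escaping: when markup is stored as the text of an element, < and > are written out as &lt; and &gt;. Below is a minimal Python sketch of that round trip using only the standard library. The element name R_CLOB comes from the post; the sample string is made up, and the sketch only illustrates the escaping itself, not how BI Publisher is configured.

```python
import xml.etree.ElementTree as ET
from html import unescape

# Made-up sample of what a rich text field might store.
stored_html = "<b>test</b> my <i>test</i>"

# Storing the markup as element text escapes the angle brackets on output.
elem = ET.Element("R_CLOB")
elem.text = stored_html
serialized = ET.tostring(elem, encoding="unicode")
print(serialized)

# An XML parser reverses the escaping when it reads the text back.
round_tripped = ET.fromstring(serialized).text
print(round_tripped)

# The same reversal done by hand on the escaped element body.
escaped_body = serialized[len("<R_CLOB>"):-len("</R_CLOB>")]
unescaped_body = unescape(escaped_body)
print(unescaped_body)
```

So a template that matches on literal < characters will only see them if the consumer treats the field as markup rather than as escaped text.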

This did not work either. Here's more info on what I have so far.
My data template looks like this:
<dataTemplate name="Data" description="Template">
 <parameters>
 <parameter name="p_person_id" dataType="character" defaultValue="1"/>
 </parameters>
 <dataQuery>
 <sqlStatement name="Q1">
 select TEMPORARY_TEMPLATE_DATA.line_id as LABEL_line_ID,
TEMPORARY_TEMPLATE_DATA.column_id as LABEL_column_ID,
TEMPORARY_TEMPLATE_DATA.person_id as LABEL_PERSON_ID,
TEMPORARY_TEMPLATE_DATA.label as LABEL_DATA
from MY_BIO.clm_TEMPORARY_TEMPLATE_DATA TEMPORARY_TEMPLATE_DATA
Where person_id = :p_person_id
and style = 'L'
 </sqlStatement>
 <sqlStatement name="Q2" parentQuery="Q1" parentColumn="LABEL_DATA">
 select TEMPORARY_TEMPLATE_DATA.LINE_ID as LINE_ID,
TEMPORARY_TEMPLATE_DATA.COLUMN_ID as COLUMN_ID,
TEMPORARY_TEMPLATE_DATA.label as COLUMN_LABEL,
to_nclob(TEMPORARY_TEMPLATE_DATA.COLUMN_DATA) as COLUMN_DATA,
TEMPORARY_TEMPLATE_DATA.STYLE as STYLE,
TEMPORARY_TEMPLATE_DATA.ATTRIBUTE as ATTRIBUTE,
NVL(TEMPORARY_TEMPLATE_DATA.JUSTIFY,'L') as JUSTIFY
from MY_BIO.clm_TEMPORARY_TEMPLATE_DATA TEMPORARY_TEMPLATE_DATA
Where person_id =:p_person_id
and label = :LABEL_DATA
and style != 'L'
Order by line_id, column_id
 </sqlStatement>
 </dataQuery>
 <dataStructure>
 <group name="G_LABEL" source="Q1">
 <element name="LColumnData" value="label_data"/>
 <group name="G_DATA" parentGroup="G_Label" source="Q2">
 <element name="LineID" value="line_id"/>
 <element name="ColumnID" value="column_id"/>
 <element name="ColumnData" value="column_data"/>
 <element name="Style" value="style"/>
 <element name="Attribute" value="attribute"/>
 <element name="Justify" value="justify"/>
 </group>
 </group>
 </dataStructure>
</dataTemplate>
After running this data template there was no change in the generated XML file; see the partial output below. Note:
my test actually has the B with the html tags
</G_DATA>
- <G_DATA>
<LINEID>20</LINEID>
<COLUMNID>1</COLUMNID>
<COLUMNDATA>test test my test</COLUMNDATA>
<STYLE>R</STYLE>
<ATTRIBUTE />
<JUSTIFY>C</JUSTIFY>
</G_DATA>
- <G_DATA>
<LINEID>21</LINEID>
I loaded it into MS Word but there was no change; the document still looks the same. I left the import file command and the xsl:apply-templates command in the Word document template.
I really appreciate you helping me.
cheryl

Text Catalog showing HTML tags

We are having an issue after applying Bundle #22 for HCM 8.9 where the calls to the Text Catalog are now showing HTML tags. Has anyone else seen this? I'm trying to figure out if it's the bundle or something in our customizations that caused this change. Basically, the page where the text from the Text Catalog displays now shows not only the text, but raw HTML tags as well. Example: BR, B
Thanks!
Edited by: CoryU on May 11, 2010 2:00 PM

Text Search skiping HTML tags

I have a table containing clob column.
select code, details from search order by code;
CODE DETAILS
4 just a test insert
5 just a test insert
9 <HTML>just a test insert</HTML>
10 checking test insert
I have created a context index and added the HTML tags to the stoplist.
exec ctx_ddl.create_stoplist('mystop', 'BASIC_STOPLIST');
exec ctx_ddl.add_stopword('mystop', '');
exec ctx_ddl.add_stopword('mystop', '');
CREATE INDEX searchi ON search(details)
INDEXTYPE IS CTXSYS.CONTEXT PARAMETERS
('FILTER CTXSYS.AUTO_FILTER SECTION GROUP CTXSYS.AUTO_SECTION_GROUP STOPLIST MYSTOP');
But when I search 'test insert' it only shows the following rows
SQL> SELECT score(1), code, details FROM search WHERE CONTAINS(details, 'test insert', 1) > 0 ORDER BY score(1);
SCORE(1) CODE DETAILS
5 10 checking test insert
5 9 <HTML>just a test insert</HTML>
I would like to define a text index which skips the HTML keywords and returns all the rows that contain the search phrase.

Since you did not use code tags in your post, most of your html does not show, so it is difficult to tell what html is in your data or what values you set for your stopwords. One problem with stopwords is that, although the word is not indexed, it still expects some word where the stopword was, so searching for "word1 word2" will not find "word1 removed_stopword word2". How about using a procedure_filter as demonstrated below? I only removed a few tags, so you would need to either expand it to include others or search for starting and ending tags and remove what is in between.
SCOTT@orcl_11g> CREATE TABLE search
2 (code NUMBER,
3 details CLOB)
4 /
Table created.
SCOTT@orcl_11g> INSERT ALL
2 INTO search VALUES (4, 'just a test insert')
3 INTO search VALUES (5, 'just a test insert')
4 INTO search VALUES (9, '<HTML>just a test insert</HTML>')
5 INTO search VALUES (10, 'checking test insert')
6 SELECT * FROM DUAL
7 /
4 rows created.
SCOTT@orcl_11g> CREATE OR REPLACE PROCEDURE myproc
2 (p_rowid IN ROWID,
3 p_in_clob IN CLOB,
4 p_out_clob IN OUT NOCOPY CLOB)
5 AS
6 BEGIN
7 p_out_clob := REPLACE (p_in_clob, '<html>', '');
8 p_out_clob := REPLACE (p_out_clob, '</html>', '');
9 p_out_clob := REPLACE (p_out_clob, '<HTML>', '');
10 p_out_clob := REPLACE (p_out_clob, '</HTML>', '');
11 p_out_clob := REPLACE (p_out_clob, '', '');
12 p_out_clob := REPLACE (p_out_clob, '', '');
13 p_out_clob := REPLACE (p_out_clob, '', '');
14 p_out_clob := REPLACE (p_out_clob, '', '');
15 p_out_clob := REPLACE (p_out_clob, '', '');
16 p_out_clob := REPLACE (p_out_clob, '', '');
17 p_out_clob := REPLACE (p_out_clob, '', '');
18 p_out_clob := REPLACE (p_out_clob, '', '');
19 END myproc;
20 /
Procedure created.
SCOTT@orcl_11g> SHOW ERRORS
No errors.
SCOTT@orcl_11g> BEGIN
2 CTX_DDL.CREATE_PREFERENCE ('myfilter', 'PROCEDURE_FILTER');
3 CTX_DDL.SET_ATTRIBUTE ('myfilter', 'PROCEDURE', 'myproc');
4 CTX_DDL.SET_ATTRIBUTE ('myfilter', 'ROWID_PARAMETER', 'TRUE');
5 CTX_DDL.SET_ATTRIBUTE ('myfilter', 'INPUT_TYPE', 'CLOB');
6 CTX_DDL.SET_ATTRIBUTE ('myfilter', 'OUTPUT_TYPE', 'CLOB');
7 END;
8 /
PL/SQL procedure successfully completed.
SCOTT@orcl_11g> CREATE INDEX searchi
2 ON search (details)
3 INDEXTYPE IS CTXSYS.CONTEXT
4 PARAMETERS ('FILTER myfilter')
5 /
Index created.
SCOTT@orcl_11g> SELECT token_text FROM dr$searchi$i
2 /
TOKEN_TEXT
CHECKING
INSERT
TEST
3 rows selected.
SCOTT@orcl_11g> COLUMN details FORMAT A35
SCOTT@orcl_11g> SELECT score (1), code, details
2 FROM search
3 WHERE CONTAINS (details, 'test insert', 1) > 0
4 ORDER BY score (1)
5 /
SCORE(1) CODE DETAILS
 3 4 just a test insert
 3 5 just a test insert
 3 9 <HTML>just a test insert</HTML>
 3 10 checking test insert
4 rows selected.
SCOTT@orcl_11g>
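
The stopword remark at the start of this reply can be pictured with a toy model in Python. This is only a sketch of the behaviour described there (a stopword is not indexed but still occupies a word position), not a description of how Oracle Text is implemented; the function names are made up for illustration.

```python
import re

def index_positions(text, stopwords):
    """Map each indexed token to the word positions where it occurs.

    Stopwords are not indexed, but they still use up a position."""
    positions = {}
    for pos, word in enumerate(re.findall(r"\w+", text.lower())):
        if word not in stopwords:
            positions.setdefault(word, []).append(pos)
    return positions

def phrase_match(text, phrase, stopwords):
    """True if the phrase words occur at consecutive positions in text."""
    index = index_positions(text, stopwords)
    words = phrase.lower().split()
    return any(
        all(start + i in index.get(word, []) for i, word in enumerate(words))
        for start in index.get(words[0], [])
    )

stop = {"removed_stopword"}
adjacent = phrase_match("word1 word2", "word1 word2", stop)
with_gap = phrase_match("word1 removed_stopword word2", "word1 word2", stop)
print(adjacent, with_gap)
```

In this model the second call does not match because word2 sits two positions after word1, which is why removing the tags before indexing (as the procedure filter does) behaves differently from declaring them as stopwords.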

Verity Search - exclude html tags when indexing?

I have a table I want to index, but some data stored in the
table contains HTML.
I want to index the content, but I want all HTML tags to be
excluded.
This is a problem, say you had a table storing all retail
stores, there's some HTML in the data, and you do a Verity search
for "Target" (as in the Target retail store). You will return a lot
of irrelevant results if there's a
target="blank" attribute in an A HREF tag, for example. Can
you strip these out during the <CFINDEX> ?
Any idea on how to accomplish this?
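
One general approach, independent of Verity, is to strip the markup from each value before it is handed to the indexer, so that attribute text such as target="blank" never reaches the index. A minimal sketch in Python using the standard library HTML parser; the sample row is made up, and whether the cleaning happens in the query, in application code or in a separate column is left open.

```python
from html.parser import HTMLParser

class TextOnly(HTMLParser):
    """Collects only the text between tags; tag names and attributes are ignored."""

    def __init__(self):
        super().__init__()
        self.parts = []

    def handle_data(self, data):
        self.parts.append(data)

def strip_tags(markup):
    parser = TextOnly()
    parser.feed(markup)
    parser.close()
    # Join the text pieces and collapse runs of whitespace.
    return " ".join(" ".join(parser.parts).split())

row = 'Visit <a href="/stores/1" target="blank">our store</a> today'
clean = strip_tags(row)
print(clean)
```

Joining the pieces with a space keeps words on either side of a tag from running together, at the cost of splitting a word that has a tag in the middle of it.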

Hi user494326,
I'm actually having trouble getting html into a CLOB. It sounds like you were able to do this successfully. What did you do to get it in as far as escaping characters, etc. I'd love to see how you handled that, it would be greatly appreciated!

CF on server adding an /html tag to doc!

Hi all,
I'm not sure what the culprit is here. I'm looking at a file
on my localhost that renders fine. But when I push it to my live
server, it's got a big break between the header and body content
(which is a .cfm template include).
As far as I can tell, the problem is that the javascript
added on the server by CF is adding a </html> tag to the very
beginning of my file, causing the break.
Here's the code I see when there's a problem:
<!DOCTYPE HTML PUBLIC "-//W3C//DTD HTML 4.01
Transitional//EN">
<html>
<head>
<title>Application</title>
<SCRIPT LANGUAGE="JavaScript" TYPE="text/javascript"
SRC="/CFIDE/scripts/cfform.js"></SCRIPT>
<SCRIPT LANGUAGE="JavaScript" TYPE="text/javascript">

</SCRIPT>
</head>
</html>
<!DOCTYPE html PUBLIC "-//W3C//DTD XHTML 1.0
Transitional//EN" "
http://www.w3.org/TR/xhtml1/DTD/xhtml1-transitional.dtd">
<html>
<head>
Any ideas where this </html> is coming from?
Thanks!
Rick

Use CSS.
hr { margin:5px;padding:5px; }
Murray --- ICQ 71997575
Adobe Community Expert
"DJDEL0530" <[email protected]> wrote in message
news:f4esuk$p8o$[email protected]..
> Hello All,
> Inserting a separator bar using the "HTML" tab and clicking the "hr" gives me
> a nice separator.
> The page I'm talking about is
> http://empdigital.net/boxingpreview/home_testv1.html in the "Sponsors"
> column.
> I can't figure out how to control the margin above & below the rule.
> Any help?
> thanks
>

Search for strings inside html tags ?

Is there any way to get Spotlight to find search strings inside html documents? One example is I want to find any file that includes a certain alt=" " string inside an img src tag.
Spotlight does not seem to include anything inside html tags in its search, as far as I can tell.
Am I missing something? Is there something I need to do?
Thanks for any ideas,
KarenD
G5 Mac OS X (10.4.4)
G5 Mac OS X (10.3.5)

Very strange: it does find text inside the html files, but indeed does not seem to index text that appears only within a tag. One solution is EasyFind by Christian Grunenberg:
http://www.grunenberg.com
You can do a content search on files in a particular folder, which I recommend since it does a brute force search and can take a while if it has to search everything.
Francine
Schwieder
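
The brute force idea mentioned above can also be sketched in a few lines of Python: read every HTML file under a folder and test the alt text of its img tags. This is an illustration only, unrelated to Spotlight or EasyFind; the function name, the file pattern and the demo files are assumptions, and the pattern only handles double-quoted alt values.

```python
import re
import tempfile
from pathlib import Path

# Captures the alt text of <img ... alt="..."> tags (double-quoted values only).
ALT_ATTR = re.compile(r'<img\b[^>]*?\balt\s*=\s*"([^"]*)"', re.IGNORECASE)

def files_with_alt(folder, needle):
    """Return files matching *.htm* under folder whose img alt text contains needle."""
    needle = needle.lower()
    hits = []
    for path in sorted(Path(folder).rglob("*.htm*")):
        if not path.is_file():
            continue
        text = path.read_text(encoding="utf-8", errors="replace")
        if any(needle in alt.lower() for alt in ALT_ATTR.findall(text)):
            hits.append(path)
    return hits

# Small self-contained demo on throwaway files.
with tempfile.TemporaryDirectory() as tmp:
    root = Path(tmp)
    (root / "a.html").write_text('<img src="x.png" alt="Company Logo">', encoding="utf-8")
    (root / "b.html").write_text("<p>logo appears only in body text</p>", encoding="utf-8")
    matches = [p.name for p in files_with_alt(root, "logo")]

print(matches)
```

As with any brute force search, pointing it at a specific folder rather than the whole disk keeps the run time reasonable.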

HTML Tags search using TREX

Hi All,
Using the following 2 documents I tried to configure TREX to search HTML tags. After following all the steps, I don't get any results when I try to search.
1) How to set up web repository and crawling it for indexing.pdf
2) How-to-guide for searchable HTML tags.pdf
Any buddies please help?
Thanks in advance.
Mahesh

Hi Suman,
I am Manu's colleague. We have the hierarchy of objects like this:
Cubes --> MultiProvider --> Aggregation level.
We had two transport requests, one for the planning objects including aggregation levels and the other for data model objects including cubes, DSOs and MultiProviders. Both were deletion requests.
We moved the first transport request to production, checked using the normal Find Objects function, and found no results for the aggregation levels. We assumed all the objects were deleted.
Then we moved the data model transport request to Quality and it failed, stating that the MultiProviders are used in an aggregation level. (This happened in Q.)
Then when we checked the aggregation level in Planning Modeler we found it there (in both Q and P), but not in the RSA1 transaction until we used TREX to retrieve the result. (This was in P, as we don't have TREX in the D and Q systems.)
This is the issue, and because of it we are not able to delete the data models in the system.
Thanks for all your previous replies; it will be helpful if you have any ideas on this.
Regards.
Shafi.

Search for files by OS X tags in iOS

Hi all.
I recently started adding OS X tags to my documents on my MacBook, and when I opened Numbers or Pages on my iPad or iPhone today I found that we can't search for files by tags.
Would that be possible?
Tks.
OS X Yosemite, iOS 8

I don't have that many old files to play with it, but you can use the Finder's "Find…" to do a raw query of Spotlight's metadata. In the search terms, add a *Raw Query* type with something like *kMDItemLastUsedDate < $time.this_year(-2)* - this example will search for items with a last used date less than year 2005. The time and query syntax is explained a bit in [this developer document|http://developer.apple.com/documentation/Carbon/Conceptual/SpotlightQuery/Concepts/QueryFormat.html#//apple_ref/doc/uid/TP40001849-CJBEJBHH].

How can I eliminate HTML tags from Oracle Text Snippet?

I perform a search on many tables and on many columns of those tables.
Some of those columns are VARCHAR2 and some CLOB.
Also, some of the searchable data are HTML and some are plain text.
My problem is that ctx_doc.snippet fetches the HTML tags.
For example I get this, as a snippet result in one of my searches: Qual Germany n1 Test Qual Germany n1
I want the result to be fetched without the HTML tags.
In my index configuration I have used NULL_FILTER and HTML_SECTION_GROUP. With that configuration I managed to eliminate the HTML tags, but not in all cases!
For example:
I search table CONTENTS columns TITLE(VARCHAR2) and MAIN_TEXT(CLOB)
I created the following procedure that concatenates the two columns:
CREATE OR REPLACE PROCEDURE CONTENTS_PROC( p_id in rowid, p_lob IN OUT clob)
IS
BEGIN
FOR c1 IN (SELECT main_text||' '||title data FROM contents WHERE ROWID = p_id)
LOOP
dbms_lob.copy( p_lob, c1.data,
dbms_lob.getlength( c1.data ));
END LOOP;
END;
I created a user Datastore:
BEGIN
ctx_ddl.create_preference( 'content_trans_datastore', 'user_datastore' );
ctx_ddl.set_attribute( 'content_trans_datastore', 'procedure', 'CONTENTS_PROC' );
END;
and finally I create the index:
CREATE INDEX content_trans_ot_idx ON contents(ORACLE_TEXT_COLUMN)
INDEXTYPE IS ctxsys.CONTEXT PARAMETERS ('datastore content_trans_datastore SYNC(ON COMMIT) STORAGE INDEX_STORAGE filter ctxsys.null_filter section group ctxsys.html_section_group');
When I perform the search on those data: Test Doc-Test the snippet I get is: Test Doc-Test.
That's fine, the html tags are removed!
In another case I search table NCP columns NAME(VARCHAR2) and BODY(VARCHAR2)
I created the following procedure that concatenates the two columns:
CREATE OR REPLACE PROCEDURE NCP_PROC( p_id in rowid, p_lob IN OUT clob)
IS
BEGIN
FOR c1 IN (SELECT name||' '||body data FROM ncp WHERE ROWID = p_id)
LOOP
dbms_lob.copy( p_lob, c1.data,
dbms_lob.getlength( c1.data ));
END LOOP;
END;
I created a user Datastore:
BEGIN
ctx_ddl.create_preference( 'ncp_trans_datastore', 'user_datastore' );
ctx_ddl.set_attribute( 'ncp_trans_datastore', 'procedure', 'NCP_PROC' );
END;
and finally I create the index:
CREATE INDEX ncp_trans_ot_idx ON ncp(ORACLE_TEXT_COLUMN)
INDEXTYPE IS ctxsys.CONTEXT PARAMETERS('datastore ncp_trans_datastore SYNC(ON COMMIT) STORAGE INDEX_STORAGE filter ctxsys.null_filter section group ctxsys.html_section_group');
When I perform the search on those data: test http://deleteme.com the snippet I get is: test http://deleteme.com!!!!!!!!!!
How is this possible? Why in the first case the HTML tags are eliminated and in the second case they are not?
Thanks,
Margarita
Edited by: user13312701 on 07-Sep-2010 08:51

Doing various tests I found out that the problem is when I need to search in multiple columns of a table.
That is when I create a user_datastore that uses a procedure that concatenates the columns.
And especially when the data with the html tags is in a VARCHAR2 column.
e.g
--Create the table
CREATE TABLE CONTENT_TRANS (content_trans_id NUMBER,
main_text CLOB,
title VARCHAR2(2000),
oracle_text_column VARCHAR2(1));
alter table "CONTENT_TRANS" add constraint CONTENT_PK primary key("CONTENT_TRANS_ID") ;
--Insert dummy data
Insert into CONTENT_TRANS
(CONTENT_TRANS_ID,MAIN_TEXT,TITLE)
values
(1,'lorem','lorem qualification 2.1 ');
Insert into CONTENT_TRANS
(CONTENT_TRANS_ID,MAIN_TEXT,TITLE)
values
(2,'lorem','lorem qualification 2.1 ');
--Create the procedure that concatenates main_text (CLOB) and title (VARCHAR2)
CREATE OR REPLACE PROCEDURE CONTENT_TRANS_PROC( p_id in rowid, p_lob IN OUT clob)
IS
BEGIN
FOR c1 IN (SELECT main_text||' '||title data FROM content_trans WHERE ROWID = p_id)
LOOP
dbms_lob.copy( p_lob, c1.data,
dbms_lob.getlength( c1.data ));
END LOOP;
END;
--Create the user datastore
BEGIN
ctx_ddl.create_preference( 'content_trans_datastore', 'user_datastore' );
ctx_ddl.set_attribute( 'content_trans_datastore', 'procedure', 'CONTENT_TRANS_PROC' );
END;
--Create the index
CREATE INDEX content_trans_ot_idx ON content_trans(ORACLE_TEXT_COLUMN)
INDEXTYPE IS ctxsys.CONTEXT PARAMETERS ('datastore content_trans_datastore SYNC(ON COMMIT) filter ctxsys.null_filter section group ctxsys.html_section_group');
exec ctx_doc.set_key_type('PRIMARY_KEY');
--Perform the query
SELECT SCORE(1),ct.content_trans_id, ctx_doc.snippet('content_trans_ot_idx', ct.content_trans_id, 'lorem') as snippet
from content_trans ct
where contains(ct.ORACLE_TEXT_COLUMN, 'lorem', 1) > 1;
Results WITH NOT WANTED HTML TAGS:
6 1 lorem lorem qualification 2.1
6 2 lorem lorem qualification 2.1
Edited by: user13312701 on 13-Oct-2010 01:18

How to remove html-tags from a text.

Hello!
I have a text field from which I want to remove the HTML tags.
Example:
"This is a test and another test"
The function must return a similar text, but without the html-
tags and (in this case).
Anybody that can help me with this little problem?
Thanks in advance for any help :-)
Best regards
Kjetil Klxve

You can wait for some kind person to post a complete code
solution... But if you want to fix this yourself (which is good
for the soul) here are some hints:
- You can use SUBSTR to get at chunks of text
- You can use INSTR to find particular characters.
- You can use INSTR as an argument of SUBSTR
Hence:
-- text before the first '<'
bit_of_text := SUBSTR(text, 1, INSTR(text, '<') - 1);
-- everything after the first '>'
chopped_text := SUBSTR(text, INSTR(text, '>') + 1);
-- append the text up to the next '<'
bit_of_text := bit_of_text || SUBSTR(chopped_text, 1, INSTR(chopped_text, '<') - 1);
will give you the first bit of text that doesn't contain any
angle brackets.
From this you should be able to work out how to functionalise
this (you'll need to store the offsets and use them in a loop
construct).
Note that this assumes that the text only contains the '<'
character when it's part of a HTML tag. If you can't guarantee
this then you'll have to explicitly search for all the tags e.g.
bit_of_text := SUBSTR(text, 1, INSTR(lower(text), ''));
bit_of_text := SUBSTR(text, 1, INSTR(lower(text), ' '));
This will be a bit of pain. And completely rules out XML!
rgds APC
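
The loop hinted at above (find the next '<', keep the text before it, skip to the matching '>') can be sketched as follows. The sketch is in Python rather than PL/SQL, with str.find playing the role of INSTR and slicing the role of SUBSTR; the tags in the sample string are an assumption, since the original example lost its markup. It shares the limitation noted above: any '<' followed later by a '>' is treated as a tag.

```python
def strip_tags_by_offsets(text):
    """Drop everything from each '<' up to and including the next '>'."""
    pieces = []
    pos = 0
    while True:
        start = text.find("<", pos)
        if start == -1:
            # No more tags: keep the remaining text.
            pieces.append(text[pos:])
            break
        pieces.append(text[pos:start])
        end = text.find(">", start)
        if end == -1:
            # A '<' that is never closed is kept as literal text.
            pieces.append(text[start:])
            break
        pos = end + 1
    return "".join(pieces)

sample = "This is a <b>test</b> and <i>another</i> test"
print(strip_tags_by_offsets(sample))
```

The same structure carries over to a PL/SQL function: two offsets held in variables and a loop that exits when INSTR returns 0.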

Need to copy Data from a specific Html Tag

Hello,
I am trying to use CF to access a website, capture the data from a specific tag through to the end of that tag, and store it in a CSV file or database.
The tag-based search of an opened file is where I am not able to make any headway. Has anyone done this?

You'll need to use a regular expression for that. CF supports regular expressions with the REFind, REFindNoCase and REReplace functions. Here's an example of using regular expressions to capture the value within an HTML tag:
http://www.javamex.com/tutorials/regular_expressions/example_scraping_html.shtml
It's in Java, but the syntax for regular expressions is the same in CF.
Dave Watts, CTO, Fig Leaf Software
http://www.figleaf.com/
http://training.figleaf.com/
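
As a rough illustration of the regular expression approach described in this reply, here is a sketch in Python (not CFML or Java, so the function calls differ even where the pattern idea carries over). The function name and the sample page are made up, and a simple pattern like this breaks on nested tags of the same name.

```python
import csv
import io
import re

def tag_contents(markup, tag):
    """Return the text between each <tag ...> and its closing </tag>."""
    name = re.escape(tag)
    pattern = re.compile(
        rf"<{name}\b[^>]*>(.*?)</{name}\s*>",
        re.IGNORECASE | re.DOTALL,
    )
    return [match.strip() for match in pattern.findall(markup)]

page = (
    "<html><head><title>Price List</title></head>"
    "<body><td class='p'>42.50</td><td>13.00</td></body></html>"
)
titles = tag_contents(page, "title")
cells = tag_contents(page, "td")
print(titles, cells)

# Write the captured values as one CSV row (to memory here, for the demo).
buffer = io.StringIO()
csv.writer(buffer).writerow(cells)
print(buffer.getvalue())
```

Fetching the page and choosing between a CSV file and a database are separate steps; only the capture is shown here.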

Stripping all HTML tags from a CLOB

Hi all,
Running Oracle 9.2.0.8 on AIX...
We have a table which stores HTML document fragments in a clob. I have a requirement to convert these to plain/text (strip all HTML tags) for sending in a plain/text email body.
I have read the following solution from Tom Kyte's site:
http://asktom.oracle.com/pls/asktom/f?p=100:11:0::::P11_QUESTION_ID:25695084847068
Basically creating an Oracle text index on the CLOB column and calling ctx_doc.filter with "plaintext" parameter set to true.
I noticed in Tom's example, he uses the default filter, which based on the docs, is NULL_FILTER, which applies no filtering. I have tried his example in my dev box, creating the text index on the CLOB column with no parameters.
The call to ctx_doc.filter did not filter the html at all. I re-created the index and specified the INSO_FILTER and the filtering was done. I was under the impression that INSO_FILTER was for filtering binary content to plaintext...
create table filter ( query_id number, document clob );
create table demo
( id int primary key,
 theclob clob
);
create index demo_idx on demo(theClob) indextype is ctxsys.context;
SET DEFINE OFF;
Insert into DEMO
 (ID, THECLOB)
Values
 (1, '<html><body>This is a test of ctx_doc.filter and plaintext filtering.</body></html>');
COMMIT;
exec ctx_doc.filter('demo_idx',1, 'filter',1, true);
The above code does not convert the html to plaintext...
Now re-create the index with INSO_FILTER
drop index demo_idx;
create index demo_idx on demo(theClob) indextype is ctxsys.context parameters ('filter ctxsys.inso_filter');
exec ctx_doc.filter('demo_idx',1, 'filter',1, true);
The above scenario returns the string "This is a test of ctx_doc.filter and plaintext filtering."
The Oracle documentation doesn't specify any special filter parameter that needs to be set... just wondering if I'm missing something here... or better yet, if there is a better solution to my problem. ;-)
Thanks
Stephane

The difference between what you did and what Tom Kyte did is that you created your index on a clob column and Tom created his index on a blob column. What I don't know is why that makes a difference. I have demonstrated below with one blob column and one clob column, one index on the blob and one index on the clob, using the same code on both, with different results.
SCOTT@orcl_11gR2> create table filter
2 (query_id number,
3 document clob)
4 /
Table created.
SCOTT@orcl_11gR2> create table demo
2 (id int primary key,
3 theblob blob,
4 theclob clob)
5 /
Table created.
SCOTT@orcl_11gR2> create index demo_blob_idx
2 on demo (theblob)
3 indextype is ctxsys.context
4 /
Index created.
SCOTT@orcl_11gR2> create index demo_clob_idx
2 on demo (theclob)
3 indextype is ctxsys.context
4 /
Index created.
SCOTT@orcl_11gR2> insert into demo values
2 (1,
3 utl_raw.cast_to_raw (
4 '<html>
5 <body>
6 
7 This is a test of
8 ctx_doc.filter 
9 and plaintext filtering.
10 
11 </body>
12 </html>'),
13 '<html>
14 <body>
15 
16 This is a test of
17 ctx_doc.filter 
18 and plaintext filtering.
19 
20 </body>
21 </html>')
22 /
1 row created.
SCOTT@orcl_11gR2> exec ctx_doc.filter ('demo_blob_idx', 1, 'filter', 1, true)
PL/SQL procedure successfully completed.
SCOTT@orcl_11gR2> exec ctx_doc.filter ('demo_clob_idx', 1, 'filter', 2, true)
PL/SQL procedure successfully completed.
SCOTT@orcl_11gR2> select id, utl_raw.cast_to_varchar2 (theblob), theclob from demo
2 /
 ID
UTL_RAW.CAST_TO_VARCHAR2(THEBLOB)
THECLOB
 1
<html>
 <body>
 
 This is a test of
 ctx_doc.filter 
 and plaintext filtering.
 
 </body>
 </html>
<html>
 <body>
 
 This is a test of
 ctx_doc.filter 
 and plaintext filtering.
 
 </body>
 </html>
1 row selected.
SCOTT@orcl_11gR2> select query_id, document from filter
2 /
QUERY_ID
DOCUMENT
 1
This is a test of ctx_doc.filter and plaintext filtering.
 2
<html>
 <body>
 
 This is a test of
 ctx_doc.filter 
 and plaintext filtering.
 
 </body>
 </html>
2 rows selected.
SCOTT@orcl_11gR2>

cm:search is not returning any result when logical operator '!' is used.

<cm:search is not returning any result when logical operator '!' is used.
I am using BEA 9.1 content management services API. When I run the following query I am not receiving any results. Also no error or exceptions are seen in the weblogic or cmspi log.
The query is <cm:search id="docs" query="!(object_name like 'Sport*')" />

Hi cam,
Thanks for your reply, but I found the problem: my server administrator password had been changed by the network guys, and because of that the crawler was unable to access the content.
I wrote up my solution here; I hope it will help other people.
http://bvs-sharepoint.blogspot.com/2015/03/sharepoint-search-is-not-returning.html
RB
