Parsing text for keywords
Hi there, I'm about to start a project where I analyse text and pick out all of the meaningful words, throwing away the determiners (such as "the", "a", "an", etc.).
I realise that I could compare my String against a whole list of these words, but I was wondering if anybody had any suggestions before I start. Is there a package that would save me the trouble of writing out a long list of words, or even any sample code? I'm googling as we speak but just thought I would ask here as well.
Thanks in advance
oookiezooo
Yeah, I already have a lot of knowledge of parsing text, but frankly the idea of writing a HUGE list of words I don't want to include doesn't appeal to me; that's why I asked. Thanks for trying though!
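For reference, a minimal sketch of the word-list filtering described in the question, written in Python. The `STOP_WORDS` set is a tiny hand-picked sample and the function name `extract_keywords` is hypothetical; a real project would more likely load a fuller list from a file or an existing text-processing package than type one out.

```python
import re

# Illustrative sample only; a real stop-word list is much longer.
STOP_WORDS = {"the", "a", "an", "of", "and", "to", "in", "is"}


def extract_keywords(text):
    """Return the words in text that are not stop words, in original order."""
    # Lowercase first so "The" and "the" are treated the same.
    words = re.findall(r"[a-z']+", text.lower())
    return [w for w in words if w not in STOP_WORDS]


keywords = extract_keywords("The cat jumped over a fence in the garden")
print(keywords)
```

Keeping the list in a set makes each lookup cheap, so the cost is dominated by splitting the text into words.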
Similar Messages
-
Search text for keywords - innodb table
I have a longtext column in a table that I need to search
through for keywords. The table is in InnoDB format. I don't want to
change it to MyISAM because I can't afford to have it lock at the
table level; I prefer the row-level locking of InnoDB.
How can I build a search around this? It would be nice to
have all words, any words, and exact phrase as options, as seen
in Tom Muck's extension (which I own). However, this recordset has
so many arrays that it's completely hand-coded, and the extension isn't
supposed to work with anything but default recordsets.
Any suggestions? How can I have a more comprehensive search
using an InnoDB table?
Tom Muck's extension lets you use keywords 3 ways: all
words, any words, and exact phrase.
http://www.tom-muck.com/extensions/help/DynamicSearchPHP/ -
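One way the three modes (all words, any words, exact phrase) could be hand-coded is sketched below in Python. This is not how the extension works internally; the function name `build_search_clause` is hypothetical, the `%s` placeholder style assumes a MySQL driver that uses it, and the column name is assumed to come from trusted code rather than user input. Note that `LIKE '%term%'` cannot use an ordinary index, and that MySQL 5.6 and later also offer FULLTEXT indexes on InnoDB tables, which may suit large tables better.

```python
def build_search_clause(column, query, mode):
    """Build a parameterized WHERE fragment; mode is "all", "any" or "exact".

    Returns (sql, params). LIKE wildcards inside the query are not escaped here.
    """
    if mode not in ("all", "any", "exact"):
        raise ValueError("mode must be 'all', 'any' or 'exact'")
    # An exact phrase is one term; otherwise each word is its own term.
    terms = [query.strip()] if mode == "exact" else query.split()
    terms = [t for t in terms if t]
    if not terms:
        return "1=1", []  # empty query matches everything
    joiner = " AND " if mode == "all" else " OR "
    sql = joiner.join(f"{column} LIKE %s" for _ in terms)
    params = [f"%{t}%" for t in terms]
    return f"({sql})", params


sql, params = build_search_clause("body", "red bolt", "all")
print(sql, params)
```

The returned fragment and parameter list can be passed to the driver's `execute` call, so the search words never get concatenated into the SQL text.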
Best Method For Keyword Search (Full Text Search)
I have some cataloged columns that I search for keywords.
The table is getting huge, at over 6.7 million records, and it is
becoming slow. What is the best method to optimize the DB for this
search?
Do I need to create a column that
holds a keyword associated with the description of each record,
and search that column for the records? My
client's records are at times stored like "Bolt, Flange" and some are
stored as "Bolt, Flange 1/4inch.....". Does anyone have an idea of
the best methodology for getting this keyword search optimized and
returning faster query results?
Thanks
Consider creating a Verity collection on the appropriate
columns in your database. The frequency of database updates will
help in determining whether this is the appropriate thing to do. -
When trying to send messages in linkedin, firefox takes me back to the log in screen without sending the message. On the error console it shows the error; error in parsing value for "background" and error in parsing value for "filter". What is causing this?
Pages does not support the Apple font used for color emoji, so that behavior is normal.
With what app are you reading the Yahoo mail? There is really no guarantee that any other email service will show the special Apple font involved.
You should have no problem putting emoji directly into Mail or Text edit via drag drop from the Character Viewer as shown below.
You should also be able to upload graphics here easily by clicking on the camera icon. My email is tom at bluesky dot org. -
I have ca. 30 PDF documents I need to search for keywords; how can I do this on my Mac?
I have ca. 30 PDF documents I need to search for keywords. When I open these documents in Adobe Reader on my Mac, it shows a Search tool; however, when I search for keywords I know are in the document, none are found. How can I do a keyword search?
Do you know if the text has been OCR recognised? Are the original documents "scans"?
An easy way to find out is to check whether you can select an individual word or letter. If you can only select a whole block of text, then the document will need to be put through Optical Character Recognition (OCR) software first to enable keyword searching. -
Problem for using oracle xml parser v2 for 8.1.7
My first posting was messed up, so I am re-posting the same question.
I have a stylesheet with
<xsl:stylesheet xmlns:xsl="http://www.w3.org/TR/WD-xsl">.
It works fine if I refer this xsl file in xml file as follows:
<?xml-stylesheet type="text/xsl" href="http://...../GN.xsl"?>.
When I use this xsl in pl/sql package, I got
ORA-20100: Error occurred while processing: XSL-1009: Attribute 'xsl:version' not found in 'xsl:stylesheet'.
After I changed name space definition to
<xsl:stylesheet xmlns:xsl="http://www.w3.org/1999/XSL/Transform" version="1.0"> in xsl file, I got
ORA-20100: Error occurred while processing: XSL-1019: Expected ']' instead of '$'.
I am using xml parser v2 for 8.1.7
Can anyone explain why it happens? What is the solution?
Yi
Quote, originally posted by Steven Muench:
Elements don't have text content; they contain text node children.
So instead of trying to setNodeValue() on the element, construct a Text node and use the appendChild method on the element to append the text node as a child of the element.
Steve,
We are also creating an XML DOM from java and are having trouble getting the tags created as we want. When we use XMLText it creates the tag as <tagName/>value rather than <tagName>value</tagName>. We want separate open and close tags. Any ideas?
Lori -
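The advice quoted above (build a Text node and append it as a child of the element) can be sketched with Python's standard `xml.dom.minidom`, used here only as a stand-in for the Oracle Java parser discussed in the thread. The W3C DOM method names `createElement`, `createTextNode` and `appendChild` are the same in both, but Oracle's own classes such as XMLText may behave differently, so treat this as an illustration of the DOM idea rather than of that parser.

```python
from xml.dom import minidom

doc = minidom.Document()
element = doc.createElement("tagName")
doc.appendChild(element)

# The text lives in a child Text node, not in the element's own node value.
element.appendChild(doc.createTextNode("value"))

result = element.toxml()
print(result)
```

Because the text node is a child of the element, serialization places it between the opening and closing tags.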
Hi all,
I am in the process of building a shell script as part of an auditing utility. It will search a specified directory for keywords and output the file path and the line number on which each word was found. I built a test script (shown below) that does just this, but egrep apparently cannot read MS Word, Excel, etc. documents. I was wondering if someone could point me in an alternate direction that would allow me to search these types of documents as well? (wordlist is a file that is created elsewhere with a list of words to search for, e.g. bus.)
Thanks!
cat << EOF > ${TMPDIR}/scanit
rm -f ${TMPDIR}/strings
strings "\$1" | egrep -n -i -f ${TMPDIR}/wordlist >> ${TMPDIR}/strings
if [ -s ${TMPDIR}/strings ]
then
echo >> ${TMPDIR}/${HOSTNAME}.o
echo "File: \$1" >> ${TMPDIR}/${HOSTNAME}.o
file "\$1" >> ${TMPDIR}/${HOSTNAME}.o
cat ${TMPDIR}/strings >> ${TMPDIR}/${HOSTNAME}.o
fi
rm -f ${TMPDIR}/strings
EOF
HOSTNAME=`hostname`
export HOSTNAME
if [ $# -eq 0 ]
then
echo "You must specify the start of the directory tree to search"
exit
fi
find $1 -type f 2> ${TMPDIR}/${HOSTNAME}finderrors | tee ${TMPDIR}/${HOSTNAME}_filelist | \
head -100 |\
sed -e "s+^+sh -x ${TMPDIR}/scanit \"+" -e 's/$/"/' > ${TMPDIR}/scanitnow
sh -x ${TMPDIR}/scanitnow 1> ${TMPDIR}/${HOSTNAME}scanrun 2>&1
cd ${TMPDIR}
if [ -s ${HOSTNAME}.o ]
then
date "+%Y%m%d_%H:%M:%S: indicators found on ${HOSTNAME}" > ${HOSTNAME}scanresults.csv
cat ${HOSTNAME}.o >> ${HOSTNAME}scanresults.csv
else
date "+%Y%m%d_%H:%M:%S: No indicators found on ${HOSTNAME}" > ${HOSTNAME}scanresults.csv
fi
zip ${HOSTNAME}_scan.zip ${HOSTNAME}finderrors ${HOSTNAME}_filelist ${HOSTNAME}scanrun ${HOSTNAME}scanresults.csv
I don't think that info is included in metadata (though I could be wrong; check out Query Programming and Metadata attributes). If line numbers are a key part of this, then you're probably going to have to (a) make a quick conversion of Office files to plain text using textutil, or (b) use osascript to search Word via AppleScript. Trying to read a Word doc as plain text in Unix is going to give you mounds of headaches (particularly if the 'fast save' option is on in Office, since that will save changes non-sequentially on disk).
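Once a document has been converted to plain text, the line-numbered, case-insensitive search that the script performs with egrep can be sketched in Python as follows. The function name `find_keyword_lines` is hypothetical, and unlike `egrep -f` this sketch does plain substring matching rather than regular-expression matching.

```python
def find_keyword_lines(lines, keywords):
    """Return (line_number, line) pairs for lines containing any keyword, ignoring case."""
    lowered = [k.lower() for k in keywords if k]
    hits = []
    # Number lines from 1, as egrep -n does.
    for number, line in enumerate(lines, start=1):
        text = line.lower()
        if any(k in text for k in lowered):
            hits.append((number, line.rstrip("\n")))
    return hits


sample = ["The bus left\n", "nothing here\n", "School BUS stop\n"]
hits = find_keyword_lines(sample, ["bus"])
print(hits)
```

Passing an open file object as `lines` works as well, since it iterates line by line.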
-
Hi,
I have a text file with a structure like this.
"sfdgasdf" "sadsadsadf" "sadfsdfasfd"
"qwevsdf" "sdgfasdfsafd" "yxvcyxvcyxvc"
"hgfddfhhfdfdf" "ewrtqwrwqewqr" "dfgdgdgsdgsdfgsdgg"
My aim is to read this text file (*.txt) and parse it into a string array (or whatever is best). The contents between the quotation marks should be inserted into this array line by line.
For example:
array[0][0] = "sfdgasdf";
array[0][1] = "sadsadsadf";
array[0][2] = "sadfsdfasfd";
array[1][0] = "qwevsdf";
How can I achieve this?
Thanks
Jonny
That's how far I got (not very far).
File file = new File("c:\\temp\\text.txt");
FileReader stream = new FileReader(file);
Hi,
I'm still facing some problems.
My text which I want to parse:
"sfdgasdf" "sadsadsadf" "sadfsdfasfd"
"qwevsdf" "sdgfasdfsafd" "yxvcyxvcyxvc"
"hgfddfhhfdfdf" "ewrtqwrwqewqr" "dfgdgdgsdgsdfgsdgg"
My code:
String[] parse = text.split("\"");
The array that is created has whitespace and line breaks as elements. I only want the characters in between the quotation marks.
How can I "tell" the split function not to insert them in my array?
Cheers
Jonny -
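Instead of splitting on the quote character, one option is to match the quoted fields directly, which leaves the whitespace and line breaks out of the result. The sketch below is in Python for brevity; the function name `parse_quoted_fields` is hypothetical, and it assumes the fields never contain an escaped quote. The same regular expression should also work with Java's `java.util.regex.Pattern` and `Matcher`.

```python
import re


def parse_quoted_fields(text):
    """Return one list of field strings per non-empty line of text."""
    rows = []
    for line in text.splitlines():
        # Capture only what sits between each pair of double quotes.
        fields = re.findall(r'"([^"]*)"', line)
        if fields:
            rows.append(fields)
    return rows


sample = '"sfdgasdf" "sadsadsadf" "sadfsdfasfd"\n"qwevsdf" "sdgfasdfsafd" "yxvcyxvcyxvc"\n'
rows = parse_quoted_fields(sample)
print(rows[0][1], rows[1][0])
```

The result is a list of rows, so `rows[0][1]` corresponds to `array[0][1]` in the question.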
Scan textfield for keyword and apply formatting
I was interested in searching through text in a textfield, and applying text formatting to keywords. For example, every time the word 'the' appears, apply a text format that changes it to green and 14pt. Here is an example of a format and text applied to a textfield. How would I go about searching through the textfield and applying this format only to specific words?
my_txt.text = 'The cat jumped over the house.'
/// my format I want to apply
with (_lt_fmt) {
align = 'left';
blockIndent = 0;
bold = false;
bullet = false;
color = _green;
font = FontNames.ARIAL;
indent = 0;
italic = false;
kerning = false;
leading = 0;
leftMargin = 0;
letterSpacing = 0;
rightMargin = 0;
size = 14;
tabStops = [];
target = "";
underline = false;
url = "";
}
"I replaced some var names b/c they were reserved words"
There were no reserved words for the current or application scope.
"How can I keep all the words highlighted in the different formats?"
Comment out this line:
main_txt.setTextFormat(main_txt.defaultTextFormat);
Also, the code you showed is too verbose. You can combine declaration and instantiation in one place and have 5 lines instead of 10:
var highLightFormat0:TextFormat = new TextFormat("Arial",14,0xff00ff,"bold");
var highLightFormat1:TextFormat = new TextFormat("Arial",7,0xff0000,"bold");
var highLightFormat2:TextFormat = new TextFormat("Arial",9,0xCCCCCC,"bold");
var highLightFormat3:TextFormat = new TextFormat("Arial", 8, 0xffEE00, "bold");
var main_txt:TextField = new TextField();
In addition, function getTxtFmt and the way you get TextFormats is overkill: conditionals are worse than direct references. So, I suggest your code be:
import flash.text.TextField;
import flash.text.TextFormat;
// keywords to highlight
var wordsToSearch:Vector.<String> = new <String>['the','interested','text', 'applying'];
// TextFormats, one per keyword (same index as wordsToSearch)
var highLightFormats:Vector.<TextFormat> = new <TextFormat>[new TextFormat("Arial", 14, 0xff00ff, "bold"), new TextFormat("Arial", 7, 0xff0000, "bold"), new TextFormat("Arial", 9, 0xCCCCCC, "bold"), new TextFormat("Arial", 8, 0xffEE00, "bold")];
// Create TextField and add to display list
var main_txt:TextField = new TextField();
with (main_txt) {
multiline = main_txt.wordWrap = true;
autoSize = "left";
width = 400;
defaultTextFormat = new TextFormat("Arial",12);
x = main_txt.y = 20;
text = "I was interested in searching through text in a textfield, and applying text formatting to keywords. For example, every time the word 'the' appears, apply a text format that changes it to green and 14pt. Here is an example of a format and text applied to a textfield. How would I go about searching through the textfield and applying this format only to specific words?";
}
addChild(main_txt);
// Iterate through Vector of keywords
for (var i:int = 0; i < wordsToSearch.length; i++){
search(wordsToSearch[i], i);
}
// find whole words
function search(keyword:String, fmtChoice:int):void {
//main_txt.setTextFormat(main_txt.defaultTextFormat);
var txt:String = main_txt.text;
// \b matches at word boundaries, so a keyword is also found at the start of the text or next to punctuation
var pattern:RegExp = new RegExp("\\b" + keyword + "\\b","ig");
var theResult:Object = pattern.exec(txt);
while (theResult) {
main_txt.setTextFormat( highLightFormats[fmtChoice], theResult.index, theResult.index + keyword.length);
theResult = pattern.exec(txt);
}
}
-
Error while trying to retrieve text for error ORA-12154
Hello,
I have been trying for several days, without success, to install PHP 5.1.2 with the OCI8 extension on a Windows 2003 server running IIS 6.
On my server I have a 920 Oracle client and the 10.1 Instant Client; I copied the tnsnames.ora into the Instant Client's directory.
I've declared several environment variables:
- NLS_LANG : AMERICAN_AMERICA.WE8MSWIN1252
- TNS_ADMIN : E:\...\oracle\instantclient_10_1
- ORA_NLS33 : E:\..\oracle\920\ocommon\nls\ADMIN\DATA
With the PHP command line, the oci_connect function works correctly: the PHP command line uses the Instant Client's tnsnames.ora, and I can query my database successfully.
When I try to load a web PHP script (the same as the command-line script), I get the following error: "Error while trying to retrieve text for error ORA-12154" (from oci_connect( $user , $pass, $sid )). The $sid variable holds an alias declared in the tnsnames.ora.
If I replace the SID alias with something like " (DESCRIPTION=(ADDRESS_LIST=(ADDRESS=(PROTOCOL=TCP)(HOST=xx.xx.xx.xx)(PORT=1521)))(CONNECT_DATA=(SID=xx)" in the oci_connect function, I get a different error: "Error while trying to retrieve text for error ORA-12705".
A web page calling the phpinfo function displays the following about the oci8 extension, which seems to be correct:
oci8
OCI8 Support enabled
Revision $Revision: 1.269.2.8 $
Active Persistent Connections 0
Active Connections 0
Temporary Lob support enabled
Collections support enabled
Do you have any idea? Thanks a lot
The web server is not seeing the Oracle environment correctly. You need to set PATH to the instant client libraries. ORA_NLS33 is not used for Oracle 10g clients. Perhaps you have some library conflict with two versions of Oracle on the machine?
These may help:
http://www.oracle.com/technology/tech/php/htdocs/php_troubleshooting_faq.html#envvars
http://blogs.oracle.com/opal/2006/05/01 -
Having custom text for 'Actual' and 'Target' in Funnel chart
Hi,
We see 'Actual' and 'Target' label values in funnel chart when we hover the mouse on the chart.
But I need to change the text to custom text for a SINGLE graph. I don't want to change any XML or config files; I need this change in a single report only.
I appreciate any posts that help.
Regards
MuRam
Edited by: MuRam on Dec 31, 2012 7:36 PM
http://www.adobe.com/cfusion/mmform/index.cfm?name=wishform
-
Tool Tip Text for field values in ALV report
Hi,
How do I get tooltip text for the field values in an ALV report?
Thanks & Regards,
Pallavi.
Hi,
In the field catalog, specify the TOOLTIP field:
LVC_S_FCAT-TOOLTIP
In this field, specify the tooltip you want.
Then append this to the field catalog.
Hope this solves your problem. -
Activate text for Cost Center for ME51N, ME52N, ME53N
Hi, experts
As a requirement, on transactions ME51N, ME52N and ME53N we need to activate the text for the Cost Center field on the "Account assignment" tab. How can I do this?
Thanks in advance.
Is there any path or exit that could help with it?
I need to add, on the "Account assignment" tabstrip, a text description field (on the right side) for each of the CO Area and Cost Center fields.
How can I do this? Thanks in advance. -
Help Text for Field Name.....
Hi Experts,
In an ALV report there are field names like Order No., Qty, etc.
When the user moves the cursor to a field name, e.g. Qty, it should show help text such as "This Qty is for A-B...".
How can I show help text for a field name when the cursor moves to it?
Please guide.
Yusuf
Hi Shiva,
There is no TOOLTIP field in SLIS_FIELDCAT_ALV.
My syntax is:
w_fcat-col_pos = 9.
w_fcat-fieldname = 'FACTOR'.
w_fcat-seltext_l = 'Stock Value (55 %)'.
w_fcat-outputlen = 18.
w_fcat-do_sum = 'X'.
APPEND w_fcat.
CLEAR w_fcat.
Is there any other way, because there is no field like tooltip?
Yusuf -
How to change Alt text for the Popup Key LOV Image in Apex 3.2.1.00.10
We are using Application Express version 3.2.1.00.10.
There is an icon to click to pop up the LOV search box; the alt text for that image is currently "Popup Lov".
Would it be possible to change the text to something more meaningful, e.g. "Lookup Person name" or "Search Directory for Person names"?
I have tried updating it
from
Shared Components>Templates> Popup List of Values Template > Popup Icon Attr --> width="13" height="13" alt="Popup Lov"
(under Popup List of Values)
to
alt="#CURRENT_ITEM_NAME#"
It didn't work.
Your response will help in getting accessibility sign-off.
Venu,
Try adding title = "Lookup Person name" to the Image Attributes of your icon or button.
Jeff