String stemming

Hi everyone,
I have a program that searches a file for a string. I need to extend it so that it also finds similar strings. For example, when I search for "run", I want "runner" and "running" to be displayed as well as "run". I have thought about storing an array of possible stems and searching for them as well as the original string, but this won't suffice because I need to be able to stem strings that aren't always familiar to me. I have included my search method code. I search a file by first reading it into a string and then splitting that string into other strings: title, abstract, body, keyword and index. I then search each of these strings for the target string using indexOf(). Does anyone have any idea how I can add string stemming to this?
public void search() {
    String paf = pathText.getText();
    path = new File(paf);
    target = searchText.getText();
    if (path.isFile()) {
        searchfile(paf);
    }
    if (path.isDirectory()) {
        //folderlen = path.length();
        processdir(path);
    }
    // Open the selected result on double-click.
    MouseListener ml = new MouseAdapter() {
        public void mousePressed(MouseEvent e) {
            if ((list.getSelectedIndex() != -1) && (e.getClickCount() == 2)) {
                String selection = (String) list.getSelectedValue();
                display(selection);
            }
        }
    };
    list.addMouseListener(ml);
}
public void processdir(File path) {
    File[] theFiles = path.listFiles();
    if (theFiles == null) {
        return; // listFiles() returns null if path is not a readable directory
    }
    for (int i = 0; i < theFiles.length; i++) {
        if (theFiles[i].isDirectory()) {
            // Recurse into subdirectories.
            processdir(theFiles[i]);
        }
        if (theFiles[i].isFile()) {
            String pop = theFiles[i].getAbsolutePath();
            //System.out.print(pop+"\n");
            searchfile(pop);
        }
    }
    //list.setListData(files);
}
public void searchfile(String pah) {
    pqs = new File(pah);
    String co = readFile(pqs);
    // documentSpliter presumably fills title, abs, body, keyword and index.
    ArrayList text = documentSpliter(co);
    int rank = 0;
    String pan = pqs.getName();
    String st = pqs.getPath();
    // Each section that contains the target adds its own weight to the rank.
    if (title.indexOf(target) >= 0) {
        rank += 10;
        files.add("\nthe string was found in the title");
        boolean intit = true;
    }
    if (abs.indexOf(target) >= 0) {
        rank += 5;
        files.add("\nthe string was found in the abstract");
        boolean inabs = true;
    }
    if (body.indexOf(target) >= 0) {
        rank += 2;
        files.add("\nthe string was found in the body");
        boolean inbod = true;
    }
    if (keyword.indexOf(target) >= 0) {
        rank += 3;
        files.add("\nthe string was found in the keyword");
        boolean inkey = true;
    }
    if (index.indexOf(target) >= 0) {
        rank += 2;
        files.add("\nthe string was found in the index");
        boolean inind = true;
    }
    if (rank == 0) {
        files.add("\nword not found in file");
        boolean eof = true;
    }
    if (rank != 0) {
        files.add(pan + " " + "rank score:" + rank);
        files.add("\nthe document " + pan + " has rating:" + rank + "\n\n");
        files.add("\n");
        files.add(st + "\n");
    }
    list.setListData(files);
}

I think the only likely approach is to have a table of common word extensions ("-es" "-ing" "-ed" "-ly" and so on). You might extend that to actual rules (e.g. for plural formation) and exceptions.
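As a rough illustration of the suffix-table idea, here is a minimal sketch. The class name SuffixStemmer, the suffix list and the minimum stem length are all assumptions made for this example, and the rules are far too crude for general English. It strips one common ending, undoes a doubled final consonant ("runn" becomes "run"), and compares the stems of whole words.

```java
import java.util.Locale;

final class SuffixStemmer {
    // Hypothetical suffix table, longer endings listed before shorter ones.
    private static final String[] SUFFIXES = {"ing", "ers", "er", "ed", "es", "ly", "s"};
    // Do not strip a suffix if fewer than this many characters would remain.
    private static final int MIN_STEM = 3;

    /** Strips at most one known suffix and undoes consonant doubling. */
    static String stem(String word) {
        String w = word.toLowerCase(Locale.ROOT);
        for (String suffix : SUFFIXES) {
            if (w.endsWith(suffix) && w.length() - suffix.length() >= MIN_STEM) {
                String stem = w.substring(0, w.length() - suffix.length());
                int n = stem.length();
                // "running" -> "runn" -> "run"; keep doubled vowels and l, s, z ("fall").
                if (n >= 2 && stem.charAt(n - 1) == stem.charAt(n - 2)
                        && "aeioulsz".indexOf(stem.charAt(n - 1)) < 0) {
                    stem = stem.substring(0, n - 1);
                }
                return stem;
            }
        }
        return w;
    }

    /** True if any word in text has the same stem as target. */
    static boolean containsStem(String text, String target) {
        String wanted = stem(target);
        for (String token : text.split("\\W+")) {
            if (!token.isEmpty() && stem(token).equals(wanted)) {
                return true;
            }
        }
        return false;
    }
}
```

In searchfile, a call such as SuffixStemmer.containsStem(title, target) could stand in for title.indexOf(target) >= 0. A published algorithm such as Porter's handles many more cases than this table does.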

Similar Messages

Trying to create a web service in NetBeans

Hi Folks let's cut right to the chase
I've followed the demos / tutorials and built a few simple web services running on glassfish / mysql / netBeans IDE. All well and good. What I really want now is to be able to return an XML document or a string with xml data from my service.
I tried building an XML document using xerces / document builder (hopefully this is familiar to you) Unfortunately, the results I get back are not XML encoded, and also as I add elements & attributes to my document, it often creates a transformException (which I don't get when doing it from the console). I also get an exception if I try to add two elements as children to the same parent, as indicated in the code snippet below - I don't know what I'm doing wrong there.
          Document doc = DocumentBuilderFactory.newInstance().newDocumentBuilder().newDocument();
          doc.createElement("Word");
          Element top = doc.createElement("myapp");
        Element word = doc.createElement("word");//new Element("word");
        //Attr attr = doc.createAttribute("testval");
        word.setAttribute("test", "חֹשֶׁך");
        //word.setNodeValue("חֹשֶׁך");
        word.setTextContent("חֹשֶׁך");
        doc.appendChild(top);
        top.appendChild(word);
        Element word2 = doc.createElement("word");
        top.appendChild(word2);
So I redid my web service just using a PrintWriter / StringWriter, and at the end replacing '<' and '>' with &lt; and &gt;. That's all well and good, and I get back the proper response, with the xml tags showing. But I still get an error message: Service invocation threw an exception with message :
" com.sun.org.apache.xerces.internal.impl.io.MalformedByteSequenceException: Invalid byte 2 of 2-byte UTF-8 sequence.; Refer to the server log for more details"
(more details at the very bottom). As you may have noticed in the code snippet I posted above, I'm using strings of hebrew characters.
I don't understand how I can get a response back and STILL get a Service invocation exception message.
I also don't know how to fix whatever's causing the exception. Anyone dealt with this kind of thing before?
Thanks! (Code below)
    private static void addWord(PrintWriter pw, Words w, List<Definition> definitions) {
        //Element word = doc.createElement("word");//new Element("word");
        pw.append("<Word>");
        pw.append("<HW>" + w.getWord() + "</HW>");
//        word.setTextContent(w.getWord());
//        doc.appendChild(word);
        for (Definition def : definitions) {
//          Element definition = doc.createElement("Definition");
            pw.append("<Definition>" + def.getDefinition() + "</Definition>");
//          definition.setTextContent(def.getDefinition());
            String pos = def.getPos();
            if (pos == null) pos = "";
            String stem = def.getStem();
            if (stem == null) stem = "";
            pw.append("<POS>" + pos + "</POS>");
            pw.append("<Stem>" + stem + "</Stem>");
//            Element posElem = doc.createElement("POS");
//            posElem.setTextContent(pos);
//            Element stemElem = doc.createElement("Stem");
//            stemElem.setTextContent(stem);
//            definition.setAttribute("POS", def.getPos());
//            definition.setAttribute("stem", def.getStem());
//            word.appendChild(definition);
//            word.appendChild(posElem);
//            word.appendChild(stemElem);
        }
        pw.append("</Word>");
    }
SOAP Response
Service invocation threw an exception with message : com.sun.org.apache.xerces.internal.impl.io.MalformedByteSequenceException: Invalid byte 2 of 2-byte UTF-8 sequence.; Refer to the server log for more details
Exceptions details : javax.xml.transform.TransformerException: com.sun.org.apache.xerces.internal.impl.io.MalformedByteSequenceException: Invalid byte 2 of 2-byte UTF-8 sequence.
javax.servlet.ServletException: javax.xml.transform.TransformerException: com.sun.org.apache.xerces.internal.impl.io.MalformedByteSequenceException: Invalid byte 2 of 2-byte UTF-8 sequence. at org.glassfish.webservices.monitoring.WebServiceTesterServlet.doPost(WebServiceTesterServlet.java:330) at org.glassfish.webservices.monitoring.WebServiceTesterServlet.invoke(WebServiceTesterServlet.java:106) at org.glassfish.webservices.EjbWebServiceServlet.service(EjbWebServiceServlet.java:114) at javax.servlet.http.HttpServlet.service(HttpServlet.java:848) at com.sun.grizzly.http.servlet.ServletAdapter$FilterChainImpl.doFilter(ServletAdapter.java:1002) at com.sun.grizzly.http.servlet.ServletAdapter$FilterChainImpl.invokeFilterChain(ServletAdapter.java:942) at com.sun.grizzly.http.servlet.ServletAdapter.doService(ServletAdapter.java:404) at com.sun.grizzly.http.servlet.ServletAdapter.service(ServletAdapter.java:354) at com.sun.grizzly.tcp.http11.GrizzlyAdapter.service(GrizzlyAdapter.java:168) at com.sun.enterprise.v3.server.HK2Dispatcher.dispath(HK2Dispatcher.java:117) at com.sun.enterprise.v3.services.impl.ContainerMapper.service(ContainerMapper.java:234) at com.sun.grizzly.http.ProcessorTask.invokeAdapter(ProcessorTask.java:822) at com.sun.grizzly.http.ProcessorTask.doProcess(ProcessorTask.java:719) at com.sun.grizzly.http.ProcessorTask.process(ProcessorTask.java:1013) at com.sun.grizzly.http.DefaultProtocolFilter.execute(DefaultProtocolFilter.java:225) at com.sun.grizzly.DefaultProtocolChain.executeProtocolFilter(DefaultProtocolChain.java:137) at com.sun.grizzly.DefaultProtocolChain.execute(DefaultProtocolChain.java:104) at com.sun.grizzly.DefaultProtocolChain.execute(DefaultProtocolChain.java:90) at com.sun.grizzly.http.HttpProtocolChain.execute(HttpProtocolChain.java:79) at com.sun.grizzly.ProtocolChainContextTask.doCall(ProtocolChainContextTask.java:54) at com.sun.grizzly.SelectionKeyContextTask.call(SelectionKeyContextTask.java:59) at 
com.sun.grizzly.ContextTask.run(ContextTask.java:71) at com.sun.grizzly.util.AbstractThreadPool$Worker.doWork(AbstractThreadPool.java:532) at com.sun.grizzly.util.AbstractThreadPool$Worker.run(AbstractThreadPool.java:513) at java.lang.Thread.run(Thread.java:662) Caused by: javax.xml.transform.TransformerException: com.sun.org.apache.xerces.internal.impl.io.MalformedByteSequenceException: Invalid byte 2 of 2-byte UTF-8 sequence. at com.sun.org.apache.xalan.internal.xsltc.trax.TransformerImpl.transform(TransformerImpl.java:719) at com.sun.org.apache.xalan.internal.xsltc.trax.TransformerImpl.transform(TransformerImpl.java:313) at org.glassfish.webservices.monitoring.WebServiceTesterServlet.dumpMessage(WebServiceTesterServlet.java:362) at org.glassfish.webservices.monitoring.WebServiceTesterServlet.doPost(WebServiceTesterServlet.java:320) ... 24 more Caused by: com.sun.org.apache.xerces.internal.impl.io.MalformedByteSequenceException: Invalid byte 2 of 2-byte UTF-8 sequence. at com.sun.org.apache.xerces.internal.impl.io.UTF8Reader.invalidByte(UTF8Reader.java:684) at com.sun.org.apache.xerces.internal.impl.io.UTF8Reader.read(UTF8Reader.java:369) at com.sun.org.apache.xerces.internal.impl.XMLEntityScanner.load(XMLEntityScanner.java:1742) at com.sun.org.apache.xerces.internal.impl.XMLEntityScanner.skipChar(XMLEntityScanner.java:1416) at com.sun.org.apache.xerces.internal.impl.XMLDocumentFragmentScannerImpl$FragmentContentDriver.next(XMLDocumentFragmentScannerImpl.java:2792) at com.sun.org.apache.xerces.internal.impl.XMLDocumentScannerImpl.next(XMLDocumentScannerImpl.java:648) at com.sun.org.apache.xerces.internal.impl.XMLNSDocumentScannerImpl.next(XMLNSDocumentScannerImpl.java:140) at com.sun.org.apache.xerces.internal.impl.XMLDocumentFragmentScannerImpl.scanDocument(XMLDocumentFragmentScannerImpl.java:511) at com.sun.org.apache.xerces.internal.parsers.XML11Configuration.parse(XML11Configuration.java:808) at 
com.sun.org.apache.xerces.internal.parsers.XML11Configuration.parse(XML11Configuration.java:737) at com.sun.org.apache.xerces.internal.parsers.XMLParser.parse(XMLParser.java:119) at com.sun.org.apache.xerces.internal.parsers.AbstractSAXParser.parse(AbstractSAXParser.java:1205) at com.sun.org.apache.xalan.internal.xsltc.trax.TransformerImpl.transformIdentity(TransformerImpl.java:609) at com.sun.org.apache.xalan.internal.xsltc.trax.TransformerImpl.transform(TransformerImpl.java:707)

Okay I think I get what's happening this is just a problem with NetBeans printing the SOAP response. So I think I'm good
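For comparison with the hand-built string, here is a minimal sketch of the DOM route using only the JDK's javax.xml APIs. The class and method names (XmlBuilderSketch, build) are made up for this example. It appends two children to the same parent, as in the snippet above, and lets a Transformer serialize the tree, so markup characters in the text are escaped by the serializer instead of by string replacement. It does not attempt to reproduce the exception seen in the web service tester.

```java
import java.io.StringWriter;
import javax.xml.parsers.DocumentBuilderFactory;
import javax.xml.transform.OutputKeys;
import javax.xml.transform.Transformer;
import javax.xml.transform.TransformerFactory;
import javax.xml.transform.dom.DOMSource;
import javax.xml.transform.stream.StreamResult;
import org.w3c.dom.Document;
import org.w3c.dom.Element;

final class XmlBuilderSketch {
    /** Builds <myapp> with two <word> children and returns it as a string. */
    static String build(String headword) {
        try {
            Document doc = DocumentBuilderFactory.newInstance()
                    .newDocumentBuilder().newDocument();
            Element top = doc.createElement("myapp");
            doc.appendChild(top);           // a document has exactly one root element

            Element word = doc.createElement("word");
            word.setAttribute("test", headword);
            word.setTextContent(headword);
            top.appendChild(word);

            Element word2 = doc.createElement("word");
            top.appendChild(word2);         // second child of the same parent

            Transformer t = TransformerFactory.newInstance().newTransformer();
            t.setOutputProperty(OutputKeys.ENCODING, "UTF-8");
            t.setOutputProperty(OutputKeys.OMIT_XML_DECLARATION, "yes");
            StringWriter out = new StringWriter();
            t.transform(new DOMSource(doc), new StreamResult(out));
            return out.toString();
        } catch (Exception e) {
            throw new IllegalStateException("could not build XML", e);
        }
    }
}
```

Because the result is a Java String, no byte encoding is involved until the string is written to a stream; at that point the writer's charset needs to match the encoding the consumer expects.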

Non-servlet class in servlet program

hi,
I declared a non-servlet class of my own and use it inside a servlet class. The code compiles, but at runtime I get a NoClassDefFoundError. Can anyone help me? Thanks.
The following is my code.
//get the search string from web form
import java.io.*;
import javax.servlet.*;
import javax.servlet.http.*;
import java.net.*;
import java.util.*;
public class SearchEngines extends HttpServlet {
public void doGet(HttpServletRequest request,
HttpServletResponse response)
throws ServletException, IOException {
String searchString = (String) request.getParameter("searchString");
     String searchType = (String) request.getParameter("searchType");
     Date date = new java.util.Date();
     response.setContentType("text/html");
PrintWriter out = response.getWriter();
Vector doc_retrieved = new Vector();
BooleanSearch bs = new BooleanSearch();
doc_retrieved=bs.beginSearch(searchString, searchType);
out.println("<HTML><HEAD><TITLE>Hello Client!</TITLE>" +
               "</HEAD><BODY>Hello Client! " + doc_retrieved.size() + " documents have been found.</BODY></HTML>");
out.close();
response.sendError(response.SC_NOT_FOUND,
"No recognized search engine specified.");
public void doPost(HttpServletRequest request,
HttpServletResponse response)
throws ServletException, IOException {
doGet(request, response);
// a search engine implements the boolean search
import java.io.*;
import java.util.*;
import au.com.pharos.gdbm.GdbmFile;
import au.com.pharos.gdbm.GdbmException;
import au.com.pharos.packing.StringPacking;
import IRUtilities.Porter;
public class BooleanSearch{
     BooleanSearch(){;}
     public Vector beginSearch(String searchString, String searchType){
          Vector query_vector = queryVector(searchString);
          Vector doc_retrieved = new Vector();
          if (searchType.equals("AND"))
               doc_retrieved = andSearch(query_vector);
          else
               doc_retrieved = orSearch(query_vector);
          return doc_retrieved;
     private Vector queryVector(String query){
     Vector query_vector = new Vector();
          try{
               GdbmFile dbTerm = new GdbmFile("Term.gdbm", GdbmFile.READER);
          dbTerm.setKeyPacking(new StringPacking());
          dbTerm.setValuePacking(new StringPacking());
          query = query.toLowerCase();
          StringTokenizer st = new StringTokenizer(query);
          String word = "";
          String term_id = "";
          while (st.hasMoreTokens()){
               word = st.nextToken();
               if (!search(word)){
                    word = Stemming(word);
                    if (dbTerm.exists(word)){
               //          System.out.println(word);
                         term_id = (String) dbTerm.fetch(word);
                         query_vector.add(term_id);
          catch(GdbmException e){
               System.out.println(e.getMessage());
          return query_vector;
     private Vector orSearch(Vector query_vector){
          Vector doc_retrieved = new Vector();
          try{
               GdbmFile dbVector = new GdbmFile("Vector.gdbm", GdbmFile.READER);
               dbVector.setKeyPacking(new StringPacking());
               dbVector.setValuePacking(new StringPacking());
               int doc_num = dbVector.size();
               String doc_id = "";
               String temp = "";
               for (int i = 1; i <= doc_num; i++){
                    boolean found = false;
                    doc_id = String.valueOf(i);
                    temp = (String) dbVector.fetch(doc_id);
                    StringTokenizer st = new StringTokenizer(temp);
                    while (st.hasMoreTokens() && !found){
                         temp = st.nextToken();
                         StringTokenizer st1 = new StringTokenizer(temp, ",");
                         String term = st1.nextToken();
                         if (query_vector.contains(term)){
                              doc_retrieved.add(doc_id);
                              found = true;
          catch(GdbmException e){
               System.out.println(e.getMessage());
          return doc_retrieved;
     private Vector andSearch(Vector query_vector){
          Vector doc_retrieved = new Vector();
          try{
               GdbmFile dbVector = new GdbmFile("Vector.gdbm", GdbmFile.READER);
               dbVector.setKeyPacking(new StringPacking());
               dbVector.setValuePacking(new StringPacking());
               int doc_num = dbVector.size();
               String doc_id = "";
               String temp = "";
               for (int i = 1; i <= doc_num; i++){
                    Vector doc_vector = new Vector();
                    boolean found = true;
                    doc_id = String.valueOf(i);
                    temp = (String) dbVector.fetch(doc_id);
                    StringTokenizer st = new StringTokenizer(temp);
                    while (st.hasMoreTokens()){
                         temp = st.nextToken();
                         StringTokenizer st1 = new StringTokenizer(temp, ",");
                         String term = st1.nextToken();
                         doc_vector.add(term);
                    for (int j = 0; j < query_vector.size(); j++){
                         temp = (String) query_vector.get(j);
                         if (doc_vector.contains(temp))
                              found = found & true;
                         else
                              found = false;
                    if (found)
                         doc_retrieved.add(doc_id);
          catch(GdbmException e){
               System.out.println(e.getMessage());
          return doc_retrieved;
     private String Stemming(String str){
          Porter st = new Porter ();
          str = st.stripAffixes(str);
          return str;
     private boolean search(String str){
          //stop word list
          String [] stoplist ={"a","about","above","according","across","actually","adj","after","afterwards","again",
                                   "against","all","almost","alone","along","already","also","although","always","am","among",
                                   "amongst","an","and","another","any","anyhow","anyone","anything","anywhere","are",
                                   "aren't","around","as","at","away","be","became","because","become","becomes","becoming",
                                   "been","before","beforehand","begin","beginning","behind","being","below","beside",
                                   "besides","between","beyond","billion","both","but","by","can","cannot","can't",
                                   "caption","co","co.","could","couldn't","did","didn't","do","does","doesn't","don't",
                                   "down","during","each","eg","eight","eighty","either","else","elsewhere","end","ending",
                                   "enough","etc","even","ever","every","everyone","everything","everywhere","except",
                                   "few","fifty","first","five","for","former","formerly","forty","found","four","from",
                                   "further","had","has","hasn't","have","haven't","he","he'd","he'll","hence","her","here",
                                   "hereafter","hereby","herein","here's","hereupon","hers","he's","him","himself","his",
                                   "how","however","hundred","i'd","ie","if","i'll","i'm","in","inc.","indeed","instead",
                                   "into","is","isn't","it","its","it's","itself","i've","last","later","latter","latterly",
                                   "least","less","let","let's","like","likely","ltd","made","make","makes","many","maybe",
                                   "me","meantime","meanwhile","might","million","miss","more","moreover","most","mostly",
                                   "mr","mrs","much","must","my","myself","namely","neither","never","nevertheless","next",
                                   "nine","ninety","no","nobody","none","nonetheless","noone","nor","not","nothing","now",
                                   "nowhere","of","off","often","on","once","one","one's","only","onto","or","other","others",
                                   "otherwise","our","ours","ourselves","out","over","overall","own","per","perhaps","pm",
                                   "rather","recent","recently","same","seem","seemed","seeming","seems","seven","seventy",
                                   "several","she","she'd","she'll","she's","should","shouldn't","since","six","sixty",
                                   "so","some","somehow","someone","sometime","sometimes","somewhere","still","stop",
                                   "such","taking","ten","than","that","that'll","that's","that've","the","their","them",
                                   "themselves","then","thence","there","thereafter","thereby","there'd","therefore",
                                   "therein","there'll","there're","there's","thereupon","there've","these","they","they'd",
                                   "they'll","they're","they've","thirty","this","those","though","thousand","three","through",
                                   "throughout","thru","thus","to","together","too","toward","towards","trillion","twenty",
                                   "two","under","unless","unlike","unlikely","until","up","upon","us","used","using",
                                   "very","via","was","wasn't","we","we'd","well","we'll","were","we're","weren't","we've",
                                   "what","whatever","what'll","what's","what've","when","whence","whenever","where",
                                   "whereafter","whereas","whereby","wherein","where's","whereupon","wherever","whether",
                                   "which","while","whither","who","who'd","whoever","whole","who'll","whom","whomever",
                                   "who's","whose","why","will","with","within","without","won't","would","wouldn't",
                                   "yes","yet","you","you'd","you'll","your","you're","yours","yourself","you've"};
          int i = 0;
          int j = stoplist.length;
          int mid = 0;
          boolean found = false;
          while (i < j && !found){
               mid = (i + j)/2;
               if (str.compareTo(stoplist[mid]) == 0)
                    found = true;
               else
                    if (str.compareTo(stoplist[mid]) < 0)
                         j = mid;
                    else
                         i = mid + 1;
          return found;
     }
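A side note on the stop-word lookup in search(): a binary search with compareTo only works if the array is ordered by compareTo, and the list above is not quite. For example, "herein" is listed before "here's", but the apostrophe sorts before letters, so "here's".compareTo("herein") is negative. Some entries can therefore be missed. A hash set avoids the ordering requirement altogether. The sketch below uses a shortened list, and the class name StopWords is an assumption for this example.

```java
import java.util.Arrays;
import java.util.HashSet;
import java.util.Locale;
import java.util.Set;

final class StopWords {
    // Shortened list for illustration; the full list from search() would go here.
    private static final Set<String> STOP = new HashSet<>(Arrays.asList(
            "a", "about", "he", "herein", "here's", "hers", "he's", "the"));

    /** Order of the entries does not matter for a set lookup. */
    static boolean isStopWord(String word) {
        return STOP.contains(word.toLowerCase(Locale.ROOT));
    }
}
```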

please show us the full error message.
it sounds like a classpath problem...
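One way to narrow a classpath problem down is to ask the class loader directly. The sketch below is a hypothetical helper (ClasspathCheck is not part of the posted code) that reports whether a class name can be loaded at runtime. In a servlet container, application classes are normally expected under WEB-INF/classes and libraries under WEB-INF/lib, so BooleanSearch, the au.com.pharos.gdbm classes and IRUtilities.Porter would each need to be reachable from there.

```java
final class ClasspathCheck {
    /** Returns true if the named class can be found by this class's loader. */
    static boolean isLoadable(String className) {
        try {
            // initialize=false: locate the class without running static initializers
            Class.forName(className, false, ClasspathCheck.class.getClassLoader());
            return true;
        } catch (ClassNotFoundException | LinkageError e) {
            return false;
        }
    }
}
```

Calling ClasspathCheck.isLoadable("IRUtilities.Porter") from inside the servlet, for instance, would show whether that dependency is visible to the web application.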

Film guys: Sample Rates?

Do people work at 48khz in Logic, or work in 44.1 then change rates later?
I'm working in 48khz just now, but it means that importing 44.1khz files is a drag, seems they need to be converted to playback properly. Working at 48 also seems to mean that my 828MkII is 'stuck' in 48khz, so that neither iTunes or Waveburner will playback as long as Logic is running.
Is there a more elegant solution, or just "shut-up and get on with it"? ;o)

Irv,
I'm surprised to hear that you can't get iTunes to playback while Logic is running @ 48KHz. I never have a problem with this. Perhaps it's got something to do with my audio system, tho I'm not sure. One thing I could swear I remember reading is that CoreAudio will do real-time sample rate conversion, which (I guess) explains why I can play an audio CD while my session is @ 48K.
I just did a project where most of the music was originally recorded at 44.1, the mixes and stems ultimately bumped up to 48K. Personally, I'll never do that again if I can help it, for a variety of reasons. One, because it generated too many files with similar names (strings stem 44, strings stem 48, etc.), and Two, because when we had to do tweaks, we had to go back to the 44.1 session, re-print stuff, then redo the sample rate conversions. Real PITA.

3D Stem Graph String-based Axes

When using 3D Stem Plots and real-world data, I successfully generate a 3D Graph. However, I was wondering if it is possible to customize the X and Y axis values with strings rather than numbers. I still want the Z-Axis to be a numerical intensity, but rather than having the X and Y values be an integer that I have to define in a note, I would like to make these values show strings.
My VI (attached) involves sample baseball batting/running statistics for each position (except pitcher).
Thanks in advance,
T16626
"Whether you think you can or can't, you're right."
~Henry Ford
Attachments:
STATS_for-3D-data_CHOOSE-THIS-ONE.csv 1 KB
SQUIRES_real-life-3d-data.vi 17 KB

Thanks Shane, but this isn't exactly what I was asking. Maybe I convoluted my question.
Imagine a 2 dimensional vertical bar graph.
The Y-Axis shows intensity, while the X-Axis is marked with what each bar refers to.
So let's say we're looking at a pizza parlor. They graph how many cheese, pepperoni, and sausage pizzas are sold on average per week.
Cheese---25 per week
Pepperoni---20 per week
Sausage---18 per week
so the X-Axis would have 3 tick marks, one designating each bar (cheese, pepperoni, and sausage) and the Y-Axis would be marked from 0 to 25, and the bars would reach 25, 20, and 18 units high respectively.
Here's a visualization of an ideal X-Axis (from Wikipedia's page "Bar Chart"): http://upload.wikimedia.org/wikipedia/commons/3/35/Incarceration_Rates_Worldwide_ZP.svg
What I am looking to do is find a way to create these custom ticks on the X-Axis instead of writing off to the side:
"1 refers to cheese, 2 refers to pepperoni, and 3 refers to sausage"
Please note that my goal is to apply this concept to a 3-D graph, rather than a 2-D graph.
Hope this clarifies my question.
~T16626

Why String.getBytes() throws BufferOverflowException exception?

The following is my code:
String gbStr = new String(s_content.getBytes(),"ISO_8859_1");
but sometimes it throws an exception like this:
java.nio.BufferOverflowException
at java.nio.charset.CoderResult.throwException(CoderResult.java:259)
at java.lang.StringCoding$CharsetSE.encode(StringCoding.java:338)
at java.lang.StringCoding.encode(StringCoding.java:372)
at java.lang.StringCoding.encode(StringCoding.java:378)
at java.lang.String.getBytes(String.java:608)
why?

One explanation offered is:
We took a look at the source code of the JVM. The
problem stems from the fact that float values are used
to indicate the maximum number of bytes per character
in java.nio.charset.CharsetEncoder.maxBytesPerChar.
The issue is that floats cannot accurately represent every
integer above 2^24, which is equal to 16,777,216.
After that value is reached, the encoding operation in
the character set classes incorrectly rounds down the
amount of memory needed for the buffer. The correct
solution would be to use doubles instead, or account
for the round-off problem by increasing the buffer size.
SUGGESTED WORKAROUND
The workaround that we are using is to call
getBytes() on substrings that are smaller than 16MB,
and combine the results by using either a
ByteArrayOutputStream or a ByteBuffer.
NOTE: If you are planning on using character sets with
more than one byte per character, then you have to make
sure that your buffer is sized accordingly.
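The suggested workaround could look roughly like the sketch below. The class name ChunkedEncoder and the chunkChars parameter are assumptions for this example. It encodes the string one substring at a time and collects the bytes in a ByteArrayOutputStream, taking care not to cut a surrogate pair in half. It assumes a stateless charset such as UTF-8 or ISO-8859-1; a stateful encoding could emit extra shift sequences at each chunk boundary.

```java
import java.io.ByteArrayOutputStream;
import java.nio.charset.Charset;

final class ChunkedEncoder {
    /** Encodes s in pieces of about chunkChars characters each. */
    static byte[] encode(String s, Charset charset, int chunkChars) {
        if (chunkChars < 1) {
            throw new IllegalArgumentException("chunkChars must be positive");
        }
        ByteArrayOutputStream out = new ByteArrayOutputStream();
        int start = 0;
        while (start < s.length()) {
            int end = start + Math.min(chunkChars, s.length() - start);
            // Extend the chunk by one char rather than split a surrogate pair.
            if (end < s.length()
                    && Character.isHighSurrogate(s.charAt(end - 1))
                    && Character.isLowSurrogate(s.charAt(end))) {
                end++;
            }
            byte[] part = s.substring(start, end).getBytes(charset);
            out.write(part, 0, part.length);
            start = end;
        }
        return out.toByteArray();
    }
}
```

For the 16MB limit described above, a chunkChars value of a few million characters would keep each getBytes() call well below the threshold.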

Building the varchar string to return from a pl sql function

I'm new to PL/SQL and I'm having trouble building the string that I want to return from a function inside a package. It seems my problem stems from the fact that I'm trying to incorporate a variable (varchar2) into the string to be returned. Below are two attempts that I've made, neither of which works:
function test_policy (p_schema_name IN varchar2, p_object_name IN varchar2) return varchar2 as
predicate_value varchar2(2000);
user_name varchar2(100);
begin
select first_name
into user_name
from employees
where first_name = SYS_CONTEXT('hr_app_context', 'username');
predicate_value := 'first_name = ' || user_name;
predicate_value := 'first_name = ' || '' || user_name || '';
return predicate_value;
end test_policy;
Can someone help me with the proper syntax to build my string for the return value? Thanks.

This function implements the code for a policy I've created. Basically, the policy says that when I do a select on the employees table, I should only see a record whose first_name = sys_context('hr_app_context', 'username'). So, when I perform a simple select * from employees, I get an error which says the policy predicate has an error. I'm pretty sure the error is caused by how I'm building the return value for that function. If I hard-code some return value like:
predicate_value := 'first_name = ''HR''' ;
the select statement above works fine, and I only see the record from employees where first_name = 'HR'

Query string read as part of file name, throwing not found errors

Hi all, I host a number of Web sites under a CF7 installation, Win2003.
One site in particular is throwing not-found errors in response to certain search bot requests.
In the IIS log, I noticed that for these requests, the query part of the URL is part of cs-uri-stem field value, but is not in the cs-uri-query field where it belongs:
cs-uri-stem= /index.cfm?template=24hour5.cfm
cs-uri-query=<blank>
instead of
cs-uri-stem= /index.cfm
cs-uri-query=template=24hour5.cfm
Evidently something somewhere is interpreting the entire URL as a filename, instead of a file name and a query string. When CF tries to locate the file it is throwing a not-found error.
Maybe there is something weird about the question mark, but it looks normal to me.
I can't seem to stop this error, since it could be occurring at the OS, IIS, CF or JRun layer. Does anyone have any idea what is going on here, and what I can do about it?
Thanks in advance.
Joe

Hey Reed, thanks for responding.
I have a CF utility that parses logs, so I modified it to print out the ASCII codes for each character. They look normal, as far as I can tell. The question mark has a code of 63 which is correct, and no non-alphabetic characters precede or follow.
One interesting thing is that the stem being called is an index.cfm file, and the query string argument happens to be a template name, and it ends in .cfm. That's why it is making it all the way to CF, which chokes on it, instead of IIS logging a 404 error.
Often an identifiable bot is requesting these bad URLs, though I have spotted another request with agent 'Mozilla/4.0.' I suspect that is some kind of automated scan. (I also see other requests with the same agent name, though a different IP, that look like erroneously URL-encoded requests. These get filtered by URLScan.)
What I don't know for sure is whether the specific clients that make these bad calls always make them the wrong way. They appear to. Most clients that access the site do so normally.
I wonder if there could be something in the request header, perhaps that instructs IIS to expect a different charset than what is actually used, or something like that.
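One hypothesis that would fit the mention of erroneously URL-encoded requests: if a client sends the question mark percent-encoded as %3F, a URL parser treats it as part of the path, so nothing ends up in the query. The sketch below only shows how java.net.URI splits the two forms. Whether IIS behaves and logs the same way is an assumption that this thread does not confirm, and the class name UriSplitSketch is made up for the example.

```java
import java.net.URI;

final class UriSplitSketch {
    /** Returns {decoded path, query}; the query is null when there is none. */
    static String[] split(String requestTarget) {
        URI uri = URI.create(requestTarget);
        return new String[] { uri.getPath(), uri.getQuery() };
    }
}
```

With a literal question mark, split("/index.cfm?template=24hour5.cfm") gives the path /index.cfm and the query template=24hour5.cfm. With split("/index.cfm%3Ftemplate=24hour5.cfm"), the decoded path is the whole string /index.cfm?template=24hour5.cfm and the query is null, which matches the pattern seen in the log fields.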

Newbie Question: Rules: Functions: How to compare String based type?

I have some XML facts in my rules dictionary defined by the following schema (fragments shown)
<xs:simpleType name="VarType">
   <xs:restriction base="xs:string">
      <xs:enumeration value="Foo"/>
      <xs:enumeration value="Bar"/>
      <xs:enumeration value="Baz"/>
      <xs:enumeration value="Qux"/>
   </xs:restriction>
</xs:simpleType>
<xs:complexType name="ProgType">
   <xs:sequence>
      <xs:element name="ID" type="xs:string"/>
      <xs:element name="var" type="VarType" maxOccurs="unbounded"/>
   </xs:sequence>
</xs:complexType>
Which means that a Prog of ProgType has an ID and a "list" of "var" strings restricted to bounds specified by VarType.
The issue comes when I try to create a Rules Function operating on these types.
Function-> boolean containsVar(ProgType prog,VarType var) (built using the Functions tab of the Rules editor)
for (String v : prog.var ){
   if (v == var){
      return true
   }
}
return false
The problem we run into here is typing. If v is declared a String, as here, then v == var is invalid because types don't match. But I can't declare v a VarType due to
RUL-05583: a primitive type or fact type is expected, but neither can be found.
This problem may stem from the fact that Java's String is declared final and can't be subclassed, so the JAXB translation to Java may have to wrap it, futzing ==/equals() in the process.
SO... How do I create this method and compare these values?
TIA
Edited by: wylderbeast on Mar 10, 2011 9:15 AM - typos
Edited by: wylderbeast on Mar 10, 2011 9:18 AM

And here's the answer.
var.value() seems to return the String value of the type
so the comparison becomes
(v == var.value())
Live and learn....
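
For anyone who lands here later, this is roughly what the comparison looks like in plain Java rather than in the Rules function editor. It is only a sketch: the VarType enum below is a hand-written stand-in for whatever JAXB actually generates from the schema, and VarCheck and containsVar taking a list of strings are names made up for the illustration. Plain Java needs equals() for string content, which is the main difference from the rule expression above.

```java
import java.util.Arrays;
import java.util.List;

// Hypothetical stand-in for the class generated from the VarType schema.
// The assumption is that each constant carries its XML string and exposes it via value().
enum VarType {
    FOO("Foo"), BAR("Bar"), BAZ("Baz"), QUX("Qux");

    private final String value;

    VarType(String value) {
        this.value = value;
    }

    public String value() {
        return value;
    }
}

class VarCheck {
    // Compares each String against the enum's underlying string value.
    static boolean containsVar(List<String> vars, VarType var) {
        for (String v : vars) {
            if (v.equals(var.value())) {
                return true;
            }
        }
        return false;
    }

    public static void main(String[] args) {
        List<String> vars = Arrays.asList("Foo", "Qux");
        System.out.println(containsVar(vars, VarType.FOO));
        System.out.println(containsVar(vars, VarType.BAR));
    }
}
```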

The unit string stays editable for numerics in a non strict typedef

The change is not effective, as the typedef instance keeps its defined units but may display the wrong unit string.
This is not the expected behavior to change for example m to m/s as the units are part of the type definition.
It is expected to be able to change meters to feet since it affects only the displayed number and it is not a change in physical dimensions.
LabVIEW, C'est LabVIEW

It seems that if you change the base unit, everything goes awry. The control has the new units, but the unit label is never updated. This is definitely a bug. I think the bug stems from the fact of the desired behavior when you don't change the base unit (i.e. change m/s to km/s). The non-strict typedef keeps the unit label at m/s and everything works as it should. Based on what NI does for other control components for type defs, I expect this is desired behavior. The problem is that the user can change m/s to m in the typedef which is incompatible with the m/s unit label and therefore needs to be forced to update the unit string.
With a typedef, you cannot change the base unit using the unit label (take a good VI with both set to m/s, and try to change the typedef instance of the control to m). LabVIEW will allow the change, but the VI will not break because LabVIEW didn't really change the units. This may be considered a second bug or part of the same bug.
NOTE: Non strict typedef units are m/s. Instance Unit Label changed to m. VI did not break and runs as if the units are m/s.
Perhaps the desired behavior is to break the VI when the instance of the typedef doesn't have compatible units with the typedef. That would cause the VIs to break when a typedef is updated. So, maybe in addition, the typedef needs to force a unit label update if the old and new units are not compatible.
Message Edited by Matthew Kelton on 03-24-2009 12:02 PM

Exception Stack Trace as String?

Hello all!
I need to convert the stack trace returned by the printStackTrace() method (return type void) into a String. Does anyone know of a good way of doing this?
My interest in this stems from a desire to send any Java error messages to myself via email rather than having them displayed on the computer screen, which is remote to my workstation. Any help, hints, etc. are much appreciated.
Thanks in advance.

I don't know of a way to solve this via the Throwable API alone, but you could do one of the following things: a) send the stack trace to a known file via printStackTrace(PrintWriter s), then parse that file routinely and mail the information or b) extend PrintWriter with a class that sends the information to you when its write methods are called. Personally, I would write a log class that writes the stack trace to a file and then emails that file.
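
A variation on option (a) that avoids the file: printStackTrace(PrintWriter) accepts any PrintWriter, so it can be pointed at an in-memory StringWriter and the text read back as a String. A minimal sketch, with StackTraceText and stackTraceToString being made-up names for the example:

```java
import java.io.PrintWriter;
import java.io.StringWriter;

class StackTraceText {
    // Prints the stack trace into an in-memory buffer and returns it as a String.
    static String stackTraceToString(Throwable t) {
        StringWriter buffer = new StringWriter();
        PrintWriter out = new PrintWriter(buffer);
        t.printStackTrace(out);
        out.flush();
        return buffer.toString();
    }

    public static void main(String[] args) {
        try {
            Integer.parseInt("not a number");
        } catch (NumberFormatException e) {
            // The resulting String could be used as the body of an email.
            String trace = stackTraceToString(e);
            System.out.println(trace);
        }
    }
}
```

The mailing part is left out here; the returned String can be handed to whatever mail or logging code is already in place.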

in String Buffer

I need to insert a < sign in a StringBuffer.
I'm using the following code:
tmpStrBuf.insert(0, '<');
tmpStrBuf.insert(tmpStrBuf.length(), ">");
While the > sign inserts correctly, for the < sign, I get
& lt; (space inserted to prevent forum from parsing)
Any help or suggestions appreciated.
Thanks,
Jim

Good point... I added some more System.out.println calls to the code, and the StringBuffer is not the problem.
I'm actually building an xml string, from an existing string. I need to change part of the original string, add some items, and delete some of it.
Then it is added to an ArrayList, then later on in the code, I step through the ArrayList, adding each item in the ArrayList to the XML document
Code:
tmpStrBuf.insert(0, "<");
tmpStrBuf.insert(tmpStrBuf.length(), ">");
ArrayList literalExLines = new ArrayList();
literalExLines.add(tmpStrBuf.toString());
// Create the literal example text node
ltetext = question.createTextNode(literalExLines.get(n).toString()) ;
// Add the text to the element
lte.appendChild(ltetext);
// Add the "<literalExample>" tag to "<stem>"
stem.appendChild(lte);
// Add the "<stem>" tag to "<question>"
if (debug) System.out.println("Adding stem");
qst.appendChild(stem);
So, somewhere in appending it to the xmlDocument, it is getting converted.
Any suggestions on where it might be getting converted?
Thanks,
Jim
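
One likely place to look, offered as a guess since the serializing code isn't shown: a DOM text node holds the literal characters, and the escaping to &lt; normally happens when the document is written out, because a raw < inside text would not be well-formed XML. The sketch below uses the standard JAXP classes; TextNodeEscaping, build and serialize are names invented for the example, and the Transformer is just one way the document might be written.

```java
import java.io.StringWriter;
import javax.xml.parsers.DocumentBuilderFactory;
import javax.xml.transform.OutputKeys;
import javax.xml.transform.Transformer;
import javax.xml.transform.TransformerFactory;
import javax.xml.transform.dom.DOMSource;
import javax.xml.transform.stream.StreamResult;
import org.w3c.dom.Document;
import org.w3c.dom.Element;

class TextNodeEscaping {
    // Builds <literalExample> containing a single text node with the given text.
    static Document build(String text) {
        try {
            Document doc = DocumentBuilderFactory.newInstance().newDocumentBuilder().newDocument();
            Element lte = doc.createElement("literalExample");
            lte.appendChild(doc.createTextNode(text));
            doc.appendChild(lte);
            return doc;
        } catch (Exception e) {
            throw new IllegalStateException(e);
        }
    }

    // Writes the document to a String; markup characters in text get escaped at this point.
    static String serialize(Document doc) {
        try {
            Transformer t = TransformerFactory.newInstance().newTransformer();
            t.setOutputProperty(OutputKeys.OMIT_XML_DECLARATION, "yes");
            StringWriter out = new StringWriter();
            t.transform(new DOMSource(doc), new StreamResult(out));
            return out.toString();
        } catch (Exception e) {
            throw new IllegalStateException(e);
        }
    }

    public static void main(String[] args) {
        Document doc = build("<abc>");
        System.out.println(serialize(doc));                             // serialized form, with escaping
        System.out.println(doc.getDocumentElement().getTextContent()); // text as stored in the DOM
    }
}
```

If the angle brackets are meant to be real markup rather than text, the usual route is to create elements with createElement instead of putting the brackets into a text node. If they are meant to be text, the escaped form is what a parser will turn back into < when the file is read.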

Stemming

Hi,
Can someone explain the following code? I need a detailed explanation.
class Stemmer
{  private char[] b;
   private int i,     /* offset into b */
               i_end, /* offset to end of stemmed word */
               j, k;
   private static final int INC = 50;
                     /* unit of size whereby b is increased */
   public Stemmer()
   {  b = new char[INC];
      i = 0;
      i_end = 0;
   }

   /**
    * Add a character to the word being stemmed. When you are finished
    * adding characters, you can call stem(void) to stem the word.
    */
   public void add(char ch)
   {  if (i == b.length)
      {  char[] new_b = new char[i+INC];
         for (int c = 0; c < i; c++) new_b[c] = b[c];
         b = new_b;
      }
      b[i++] = ch;
   }

   /** Adds wLen characters to the word being stemmed contained in a portion
    * of a char[] array. This is like repeated calls of add(char ch), but
    * faster.
    */
   public void add(char[] w, int wLen)
   {  if (i+wLen >= b.length)
      {  char[] new_b = new char[i+wLen+INC];
         for (int c = 0; c < i; c++) new_b[c] = b[c];
         b = new_b;
      }
      for (int c = 0; c < wLen; c++) b[i++] = w[c];
   }

   /**
    * After a word has been stemmed, it can be retrieved by toString(),
    * or a reference to the internal buffer can be retrieved by getResultBuffer
    * and getResultLength (which is generally more efficient.)
    */
   public String toString() { return new String(b,0,i_end); }

   /**
    * Returns the length of the word resulting from the stemming process.
    */
   public int getResultLength() { return i_end; }

   /**
    * Returns a reference to a character buffer containing the results of
    * the stemming process. You also need to consult getResultLength()
    * to determine the length of the result.
    */
   public char[] getResultBuffer() { return b; }

   /* cons(i) is true <=> b[i] is a consonant. */
   private final boolean cons(int i)
   {  switch (b[i])
      {  case 'a': case 'e': case 'i': case 'o': case 'u': return false;
         case 'y': return (i==0) ? true : !cons(i-1);
         default: return true;
      }
   }

   /* m() measures the number of consonant sequences between 0 and j. if c is
      a consonant sequence and v a vowel sequence, and <..> indicates arbitrary
      presence,

         <c><v>       gives 0
         <c>vc<v>     gives 1
         <c>vcvc<v>   gives 2
         <c>vcvcvc<v> gives 3
   */
   private final int m()
   {  int n = 0;
      int i = 0;
      while(true)
      {  if (i > j) return n;
         if (! cons(i)) break; i++;
      }
      i++;
      while(true)
      {  while(true)
         {  if (i > j) return n;
            if (cons(i)) break;
            i++;
         }
         i++;
         n++;
         while(true)
         {  if (i > j) return n;
            if (! cons(i)) break;
            i++;
         }
         i++;
      }
   }

   /* vowelinstem() is true <=> 0,...j contains a vowel */
   private final boolean vowelinstem()
   {  int i; for (i = 0; i <= j; i++) if (! cons(i)) return true;
      return false;
   }

   /* doublec(j) is true <=> j,(j-1) contain a double consonant. */
   private final boolean doublec(int j)
   {  if (j < 1) return false;
      if (b[j] != b[j-1]) return false;
      return cons(j);
   }

   /* cvc(i) is true <=> i-2,i-1,i has the form consonant - vowel - consonant
      and also if the second c is not w,x or y. this is used when trying to
      restore an e at the end of a short word. e.g.

         cav(e), lov(e), hop(e), crim(e), but
         snow, box, tray.
   */
   private final boolean cvc(int i)
   {  if (i < 2 || !cons(i) || cons(i-1) || !cons(i-2)) return false;
      {  int ch = b[i];
         if (ch == 'w' || ch == 'x' || ch == 'y') return false;
      }
      return true;
   }

   private final boolean ends(String s)
   {  int l = s.length();
      int o = k-l+1;
      if (o < 0) return false;
      for (int i = 0; i < l; i++) if (b[o+i] != s.charAt(i)) return false;
      j = k-l;
      return true;
   }

   /* setto(s) sets (j+1),...k to the characters in the string s, readjusting
      k. */
   private final void setto(String s)
   {  int l = s.length();
      int o = j+1;
      for (int i = 0; i < l; i++) b[o+i] = s.charAt(i);
      k = j+l;
   }
}

Use code tags when posting code.
I would suggest adding some println()s and then running the code.
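
As a starting point for that kind of experiment, here is a small standalone sketch of the two helpers the rest of the posted class leans on: the consonant test and the "measure" described in the m() comment. It is not the posted class itself. MeasureDemo, isConsonant and measure are made-up names, and it works on a whole String instead of the internal buffer b and the offset j.

```java
class MeasureDemo {
    // Same rule as cons(i) in the posted class: a, e, i, o, u are vowels,
    // and 'y' counts as a consonant only at position 0 or right after a vowel.
    static boolean isConsonant(String w, int i) {
        switch (w.charAt(i)) {
            case 'a': case 'e': case 'i': case 'o': case 'u':
                return false;
            case 'y':
                return i == 0 || !isConsonant(w, i - 1);
            default:
                return true;
        }
    }

    // Counts the places where a vowel is followed by a consonant, which is
    // the number of "vc" groups in the <c>vcvc...<v> pattern from the m() comment.
    static int measure(String w) {
        int n = 0;
        for (int i = 1; i < w.length(); i++) {
            if (isConsonant(w, i) && !isConsonant(w, i - 1)) {
                n++;
            }
        }
        return n;
    }

    public static void main(String[] args) {
        for (String w : new String[] {"tree", "trouble", "troubles", "ivy"}) {
            System.out.println(w + " -> m = " + measure(w));
        }
    }
}
```

This prints m = 0 for tree, 1 for trouble, 2 for troubles and 1 for ivy. In the posted class the measure is taken over positions 0..j only, that is, over the part of the word in front of the suffix that ends() has just matched.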

Using 5.1 Stems in Premiere Pro CC20141

I'm setting up a Premiere project for creating deliverables of a film and I'm wondering if I'm properly setting up this sequence. I've done this before but I'm either forgetting a step or something has changed in a CC update. I'm probably forgetting a step.
I tried this method but no matter what I do, all of my audio goes to tracks 1-2 and 5-6.
The solution I have come up with is this. Before I place the audio in a sequence, I modify the audio of each stem by selecting the 5.1 preset and then mapping it to the appropriate channel.
(Clip settings for my center channel audio as an example)
I then select all of my stems, right click and choose "New Sequence From Clip." This strings them out in a single track but everything is quickly placed in sync on the respective tracks. Now my sequence audio settings look like this (exactly like my clip settings):
(Sequence Settings)
I now have audio in all 6 channels and I am able to export a Quicktime with 5.1 audio and everything sounds great.
At face value, this looks like everything is fine. My question is does this actually work? For instance, if I send a file to a theatre or festival, will they be able to play the 5.1 audio?

If you're getting your correct sound placements over in Qt, that's a good sign you're doing it right. You could also go through the work of outputting this to DVD through Encore (bit of a pain) and playing that on a couple surround-setup tv systems. The next release coming in a few months has an easier to understand dialog box, but you might have this one down.
Neil

Verity Multilanguage Locale Not Stemming

I am not able to get stemming to work with my collections
that are in unicode. Everything else seems fine but whenever I
search on something like choice, it only returns choice and not
choices as well. I redid the collections using englishx and the
stemming worked but unfortunately there are display issues then,
black diamond with question mark, and my content is in multiple
languages (spanish, french, swedish, english, etc.) so it would be
better to have the stemming in multiple languages.
Does anyone know what sort of problem this could be?
Configuration or installation issue is my guess but I have no idea
where to start.
Oh btw I am running redhat enterprise and using a simple
query in cfsearch.
Thanks,
jt

A footnote to this.
Ken & the guys from Adobe came back to me after my last
comment, and did a bit of a turn around. Ken's helped me through
this with sample code and some pointers in the right direction to
assist me to utilise LUCENE's Russian-stemming capabilities (which
I can confirm do actually work!), as a "pre-processor" which
tokenises documents before Verity indexes them, and then does the
same thing to search strings before passing them to Verity. This
works OK, and is a workable band-aid to the situation.
My next move is to factor Verity out of the equation (and, I
hasten to add, any FUTURE equation involving searching - Russian or
otherwise), and produce a pure Lucene solution.
Thanks Ken and Skip for helping us out on this issue. It's
restored my faith in Adobe.
Cheers.
Adam
