A slightly unusual String question
Hi everybody,
Does anyone know if there is a limit to how long Strings can be? I'm parsing an html file and the opening and closing paragraph tags are not always on the same line. The only way I could think of to make sure I parsed out the paragraphs correctly was to squish the whole HTML body (which is very long) into one big String and scan through it for the tags. Hence my question; I don't want to kill my JVM with an OutOfMemoryError.
Any help on this would be great.
Thanks,
Jezzica85
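On the length question itself: a Java String is indexed by int, so the ceiling is roughly two billion characters, and in practice the heap runs out long before that. An ordinary HTML page is nowhere near either limit. A minimal sketch of the "one big String" approach, assuming the page fits in memory and the paragraphs are plain, non-nested <p>...</p> pairs (the class and method names are made up for illustration, and a real HTML parser would be more robust):

```java
import java.util.ArrayList;
import java.util.List;
import java.util.regex.Matcher;
import java.util.regex.Pattern;

// Hypothetical helper, not from the thread.
class ParagraphScanner {
    // DOTALL lets '.' match line breaks, so a paragraph may span several lines.
    // The reluctant ".*?" stops at the first closing tag.
    private static final Pattern PARAGRAPH = Pattern.compile(
            "<p\\b[^>]*>(.*?)</p\\s*>", Pattern.CASE_INSENSITIVE | Pattern.DOTALL);

    // Returns the trimmed text between each opening and closing paragraph tag.
    static List<String> paragraphs(String html) {
        List<String> result = new ArrayList<>();
        Matcher m = PARAGRAPH.matcher(html);
        while (m.find()) {
            result.add(m.group(1).trim());
        }
        return result;
    }

    public static void main(String[] args) {
        String html = "<body><p>first\nparagraph</p>\n<P class=\"x\">second</P></body>";
        for (String p : paragraphs(html)) {
            System.out.println("[" + p + "]");
        }
    }
}
```

To build the big String from a file, append each line to a StringBuilder rather than using += in a loop.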
Extract from Word docs, you say?
I've done that:
package com.doesthatevencompile.desktopsearch.filetypes.framework.msdoc;
import org.apache.lucene.document.Document;
import org.textmining.text.extraction.WordExtractor;
import com.doesthatevencompile.desktopsearch.filetypes.framework.DocumentFieldHelper;
import com.doesthatevencompile.desktopsearch.filetypes.framework.DocumentHandler;
import com.doesthatevencompile.desktopsearch.filetypes.framework.DocumentHandlerException;
import java.io.InputStream;
public class TextMiningWordDocHandler implements DocumentHandler {
    public Document getDocument(InputStream is)
            throws DocumentHandlerException {
        String bodyText = null;
        try {
            // pull the plain text out of the Word document
            bodyText = new WordExtractor().extractText(is);
        } catch (Exception e) {
            throw new DocumentHandlerException(
                    "Cannot extract text from a Word document", e);
        }
        if ((bodyText != null) && (bodyText.trim().length() > 0)) {
            Document doc = new Document();
            DocumentFieldHelper.addFieldToDocument(doc, DocumentFieldHelper.KEYWORD_ALL_TEXT, bodyText);
            DocumentFieldHelper.setDocumentType(doc, DocumentFieldHelper.TYPE_DOC);
            return doc;
        }
        return null; // nothing extractable in the document
    }
}
The lib you need can be found here:
http://doesthatevencompile.com/current-projects/code-sniplets/lib/
called tm-extractors
There is some Lucene code mixed into the snippet which you don't need to worry about. Hopefully this is enough to set you on your way.
Similar Messages
-
Unusual string insertion question
Hi everybody,
I'm trying an experiment with "encoding" a text file, and I was wondering, is there a specific way to insert a random character, say, "&", into random places in a text string a random number of times? I've heard of Math.random, but I don't think that would do what I'm looking for, can anyone help me out?
Thanks,
Jezzica85
Cipher.java
import java.util.*;
public abstract class Cipher {
    // encodes the text word by word, separating the results with spaces
    public String encrypt(String s) {
        StringBuffer result = new StringBuffer("");
        StringTokenizer words = new StringTokenizer(s);
        while (words.hasMoreTokens()) {
            result.append(encode(words.nextToken()) + " ");
        }
        return result.toString();
    }
    // decodes the text word by word, separating the results with spaces
    public String decrypt(String s) {
        StringBuffer result = new StringBuffer("");
        StringTokenizer words = new StringTokenizer(s);
        while (words.hasMoreTokens()) {
            result.append(decode(words.nextToken()) + " ");
        }
        return result.toString();
    }
    public abstract String encode(String word);
    public abstract String decode(String word);
}
Caesar.java
public class Caesar extends Cipher {
    // shifts each letter forward by 3, wrapping around; assumes lowercase a-z only
    public String encode(String word) {
        StringBuffer result = new StringBuffer();
        for (int k = 0; k < word.length(); k++) {
            char ch = word.charAt(k);
            ch = (char)('a' + (ch - 'a' + 3) % 26);
            result.append(ch);
        }
        return result.toString();
    }
    // shifting forward by 23 is the same as shifting back by 3
    public String decode(String word) {
        StringBuffer result = new StringBuffer();
        for (int k = 0; k < word.length(); k++) {
            char ch = word.charAt(k);
            ch = (char)('a' + (ch - 'a' + 23) % 26);
            result.append(ch);
        }
        return result.toString();
    }
}
TestEncrypt.java
public class TestEncrypt {
    public static void main(String argv[]) {
        Caesar caesar = new Caesar();
        //here's the message
        String plain = "this is the secret message";
        //encrypt the message
        String secret = caesar.encrypt(plain);
        System.out.println(" ********* Caesar Cipher Encryption *********");
        System.out.println("PlainText: " + plain);
        System.out.println("Encrypted: " + secret);
        System.out.println("Decrypted: " + caesar.decrypt(secret));
    }
}
Message was edited by: fastmike -
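The cipher classes in that thread do not touch the original question about inserting a character at random positions. A minimal sketch of one way to do that with java.util.Random and StringBuilder.insert; the class name, method name and the 1-to-5 range are assumptions for illustration only:

```java
import java.util.Random;

// Hypothetical helper, not from the thread.
class RandomInserter {
    // Inserts `count` copies of `marker` at random positions in `text`.
    static String insertRandomly(String text, char marker, int count, Random rng) {
        StringBuilder sb = new StringBuilder(text);
        for (int i = 0; i < count; i++) {
            // nextInt(n) returns 0..n-1, so the +1 also allows insertion at the very end
            int pos = rng.nextInt(sb.length() + 1);
            sb.insert(pos, marker);
        }
        return sb.toString();
    }

    public static void main(String[] args) {
        Random rng = new Random();
        int count = 1 + rng.nextInt(5); // a random number of insertions, 1 to 5
        System.out.println(insertRandomly("this is the secret message", '&', count, rng));
    }
}
```

Stripping every '&' from the result gives back the original text, provided the original contained no '&' of its own.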
Hi!
I have two questions:
1) Can I do toString() on a null value?
2) Is there any difference in writing:
String s = new String();
and
String s;
> isn't it true that in the first case s, an object of the String class, is created??
Yes, the empty string, length zero, as he said.
> in second case just defining a variable type String??
Yes. If s is a member variable it starts out as null; if it is a local variable it has no value at all, and the compiler rejects any read before an assignment. -
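For the first question in that thread (toString() on a null value): calling any method on a null reference throws a NullPointerException. A small sketch of the null-safe alternatives in the standard library; the class and method names are made up for illustration:

```java
import java.util.Objects;

// Hypothetical demo class, not from the thread.
class NullToString {
    // String.valueOf(Object) returns the text "null" for a null reference
    static String describe(Object o) {
        return String.valueOf(o);
    }

    public static void main(String[] args) {
        Object nothing = null;
        System.out.println(describe(nothing));             // safe
        System.out.println(Objects.toString(nothing, "")); // safe, with a chosen default
        String empty = new String();                       // a real object of length 0
        System.out.println(empty.length());
        try {
            nothing.toString();                            // there is no object to call the method on
        } catch (NullPointerException e) {
            System.out.println("toString() on null throws NullPointerException");
        }
    }
}
```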
Convert string question... "\"
I tried the code below in a JSP, but the result is: a , a\b
but I want the result to be: a\\b, a\\\\b
Thanks !
<%
String str = "a\b, a\\b";
String outStr = "";
for (int i=0;i<str.length();i++){
    char c = str.charAt(i);
    if (c == '\\'){
        outStr += "\\";
    }
    else{
        outStr += c;
    }
}
%>
<%=outStr%>
Right, because the compiler treats the backslash as an escape character: "\\" in source code is a single backslash, and "\b" is the backspace escape, so the first item contains neither a backslash nor the letter b. (If you tried to just use a String "\", you'd get a compilation error, since the compiler would think you were trying to escape the closing quotation mark.)
I think you're going to need to use RegExp... this question has been asked a bunch before, so search through the forums for an answer. -
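For the backslash question above, a regular expression is not strictly required: String.replace (available since Java 5) works on literal text. A minimal sketch, with a made-up class name; note that the backslashes have to be doubled in the source literals too:

```java
// Hypothetical demo class, not from the thread.
class BackslashDoubler {
    // Doubles every backslash; String.replace matches literal text, no regex involved.
    static String doubleBackslashes(String s) {
        return s.replace("\\", "\\\\");
    }

    public static void main(String[] args) {
        // In source code "a\\b" is the three characters a, \, b.
        // "a\b" would instead be 'a' followed by a backspace character.
        String str = "a\\b, a\\\\b";
        System.out.println(str);                    // a\b, a\\b
        System.out.println(doubleBackslashes(str)); // a\\b, a\\\\b
    }
}
```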
Hello,
I have the following problem:
I have a cursor that contains the following:
WHERE LKP_INDUSTRY1_ID IN ( p_sector )
p_sector is delivered as a varchar: '23;22;36'
My question is: is it possible to manipulate p_sector in a way that I can paste it into my cursor? I can replace the ';' with ',' but I don't know how I can make it read the numbers as numbers and not as strings.
Thanks in advance.
Oli
Does this solve your problem?
SQL> create or replace type tab_n is table of number;
2 /
Type created.
SQL> declare
2 p_sector varchar2(20) := '7369;7499;7521';
3 t tab_n := tab_n();
4 element number;
5 pos integer := 0;
6 cursor a is select ename from emp where empno in (select * from table(t));
7 begin
8 for i in 1..length(translate(p_sector || ';',';0123456789',';')) loop
9 t.extend;
10 t(t.count) := substr(p_sector,pos+1,instr(p_sector || ';',';',1,i)-pos-1);
11 pos := instr(p_sector,';',1,i);
12 end loop;
13 for v in a loop
14 dbms_output.put_line(v.ename);
15 end loop;
16 end;
17 /
SMITH
ALLEN
WARD
PL/SQL procedure successfully completed.
Rgds. -
Concat substring and string question
Hello,
So I am trying to add together a string \\fafs10\home and the concat of a substring of the first initial of a GivenName and LastName. I keep getting a NaN error.
Here's what I have:
=string("\\fafs10\home\") + concat(substring(GivenName,0,1), LastName)
I want to return this as a default value to my list.
Any help would be greatly appreciated. Thank you.
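Matthew
Hi
are you doing this in a calculated column or within a rule on an InfoPath form or another way?
If this is an InfoPath (XPath) formula, the + operator does numeric addition, which is where the NaN comes from, so you should probably just go with
concat("\\fafs10\home\",substring(GivenName,1,1),LastName)
(XPath counts substring positions from 1, so 1,1 picks the first initial.)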
Regards
Sergio Giusti -
I would like to know how to make a string with the same characters as another string. Also, how can I set an int to the number of characters in a string? It would really help if you gave me an example, because I am new to Java and pretty much lost.
> That page has a lot of information, but I really don't know how to use any of it. It would be really helpful if someone gave me an example. I see something like "int length ( )", but I don't know how to use it.
You don't know how to call a method? Then you need to start from the very beginning:
Sun's basic Java tutorial
Sun's New To Java Center. Includes an overview of what Java is, instructions for setting up Java, an intro to programming (that includes links to the above tutorial or to parts of it), quizzes, a list of resources, and info on certification and courses.
http://javaalmanac.com. A couple dozen code examples that supplement The Java Developers Almanac.
jGuru. A general Java resource site. Includes FAQs, forums, courses, more.
JavaRanch. To quote the tagline on their homepage: "a friendly place for Java greenhorns." FAQs, forums (moderated, I believe), sample code, all kinds of goodies for newbies. From what I've heard, they live up to the "friendly" claim.
Bruce Eckel's Thinking in Java (Available online.)
Joshua Bloch's Effective Java
Bert Bates and Kathy Sierra's Head First Java.
James Gosling's The Java Programming Language. Gosling is the creator of Java. It doesn't get much more authoritative than this.
Here's a freebie though:
String str = ...;
int len = str.length(); -
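A slightly fuller version of that freebie, covering both parts of the question (a string with the same characters, and an int holding the character count). This is only a sketch; the class and method names are made up:

```java
// Hypothetical demo class, not from the thread.
class CopyAndLength {
    // Returns a new String object holding the same characters as the argument.
    static String copyOf(String original) {
        return new String(original);
    }

    public static void main(String[] args) {
        String str = "hello";
        String same = str;          // Strings are immutable, so sharing the reference is usually enough
        String copy = copyOf(str);  // a distinct object with equal contents
        int len = str.length();     // length() is called on the object with empty parentheses
        System.out.println(copy + " has " + len + " characters");
        System.out.println(copy.equals(same));
    }
}
```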
is there a way to remove a character from a string, for example:
"< head>"
and do some sort of trim to it to create
"<head>"
thanks
replaceAll(...) can handle this quite easily:
String text = "abc <head > def </ head > ghi";
System.out.println(text.replaceAll("\\s++(?=[^<>]*+>)", ""));
But when your tags contain attributes, you can't use the code above, because the spaces that separate the attributes would be removed too. But you would have mentioned such an important piece of information in your original post, right? -
String question regarding "\"
I am trying to read in a CSV (comma separated value) file containing text like this
"2007/10/04","22:47:24","C:\test\tp2c266b.BAT","deleted","",""
"2007/10/04","22:48:06","C:\Program Files\Common Files\Symantec Shared\CCPD-LC\symlcrst.dll","changed","",""
"2007/10/04","22:48:19","C:\PROGRA~1\Symantec\LIVEUP~1\ludirloc.dat","changed","",""
Using a CSV parser from:
http://ostermiller.org/utils/CSV.html
This code is easy to use and looks just like the example shown. However, when I parse out the array and print it to the screen, the output for the path name looks like this:
C:testtp2c266b.BAT
C:Program FilesCommon FilesSymantec SharedCCPD-LCsymlcrst.dll ---- no \ is there
I realize that the \ is an escape character, but I need it; without it the directory listing is pointless. Please help, I have no idea what to do.
How do I read in the \ from the file and keep it, or replace it with a /? Anything is better than no indicator of directory structure.
CODE:
import com.Ostermiller.util.CSVParser;
import java.io.InputStreamReader;
import java.io.FileInputStream;
import java.io.*;
import hansen.filespy;
import java.net.*;
public class EventReader {
    public static void main(String[] args) throws Exception
    {
        java.net.InetAddress i = java.net.InetAddress.getLocalHost();
        getMachineID ID = new getMachineID();
        int machineid = ID.getIT(i.getHostName());
        while(true)
        {
            runit(machineid);
            try{Thread.sleep(10000);}catch(Exception e){}//sleep for 10sec
        }
    }
    public static void runit(int machineid) throws Exception{
        filespy spy = new filespy();
        File f = new File("C:\\WINDOWS\\system32\\scl.csv");
        //test code ---- usb key??
        File[] roots = File.listRoots();
        for ( File root : roots )
        {
            System.out.println( root );
        }
        //test code ---- end
        if (f.exists()) {
            CSVParser shredder = new CSVParser(new InputStreamReader(new FileInputStream("C:\\WINDOWS\\system32\\scl.csv")));
            String[] t;
            //remove header
            String[] header; //------------------------------create string array to contain strings
            header = shredder.getLine();
            while ((t = shredder.getLine()) != null) {
                System.out.println(t[1]); //date //-----------------------contains date
                System.out.println(t[2]); //file //------------------------contains filepath ------ however it shows C:testprogram files instead of C:\test\program files
                System.out.println(t[3]); //event //-----------contains event
                System.out.println(t[4]); //empty line
                spy.spy(t[1], t[2], t[3], "Non-System", machineid); //--------------sending to outside class for further parsing.... see code example from above for what im doing
            }
            shredder.close();
            boolean f1 = new File( "C:\\WINDOWS\\system32\\scl.csv").delete();
            if (!f1) {
                System.out.println("failed to delete, file not there");
            }//end if
        }
        else{System.out.println("no events");}
    }
} -
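For the CSV thread above: the backslashes vanish because the parser treats \ as an escape character, as the poster suspected. I believe the same library also ships an ExcelCSVParser that does not use backslash escapes, but check its documentation before relying on that. As a library-independent illustration, here is a minimal hand-rolled sketch (hypothetical class name) that splits one line, treats a doubled quote as a literal quote, and leaves backslashes alone. It assumes fields never contain line breaks:

```java
import java.util.ArrayList;
import java.util.List;

// Hypothetical helper, not part of the Ostermiller library.
class PlainCsvLine {
    // Splits one CSV line. Quotes delimit fields, "" inside a quoted field is a
    // literal quote, and a backslash is an ordinary character.
    static List<String> parse(String line) {
        List<String> fields = new ArrayList<>();
        StringBuilder field = new StringBuilder();
        boolean inQuotes = false;
        for (int i = 0; i < line.length(); i++) {
            char c = line.charAt(i);
            if (inQuotes) {
                if (c == '"') {
                    if (i + 1 < line.length() && line.charAt(i + 1) == '"') {
                        field.append('"'); // doubled quote: keep one
                        i++;
                    } else {
                        inQuotes = false;  // closing quote
                    }
                } else {
                    field.append(c);       // includes backslashes, unchanged
                }
            } else if (c == '"') {
                inQuotes = true;
            } else if (c == ',') {
                fields.add(field.toString());
                field.setLength(0);
            } else {
                field.append(c);
            }
        }
        fields.add(field.toString());
        return fields;
    }

    public static void main(String[] args) {
        String line = "\"2007/10/04\",\"22:47:24\",\"C:\\test\\tp2c266b.BAT\",\"deleted\",\"\",\"\"";
        System.out.println(parse(line).get(2)); // C:\test\tp2c266b.BAT
    }
}
```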
Quick string question finding if a string contains a character
hello peeps
is there a quick way of checking if a string contains a specific character
e.g.
myString="test:test";
myString.contains(":");
would be true
any ideas on a quick way of doing it
> is there a contains() method in 1.4.2? i couldnt see it in the docs
No, there isn't. But 1.5 has a contains(CharSequence s) method. -
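For the contains() question above, indexOf has been in String from the start and gives the same answer on 1.4.2. A minimal sketch with a made-up class name:

```java
// Hypothetical helper, not from the thread.
class ContainsChar {
    // indexOf returns -1 when the character does not occur in the string.
    static boolean containsChar(String s, char c) {
        return s.indexOf(c) != -1;
    }

    public static void main(String[] args) {
        String myString = "test:test";
        System.out.println(containsChar(myString, ':')); // true
    }
}
```

There is also an indexOf(String) overload for checking a whole substring the same way.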
Hi guys, I'm still learning the intricacies of Java at the moment so bear with me, as I'm sure the answer is going to be simple as heck ...
I'm trying to sort a String, where I can make the letters in order. Basically, if the String is "ddeab" then it should be converted to "abdde".
So far, I'm using an int variable to get the value of the char in the first and second letter (via charAt to pick the position) and swap them if the second is lower. However, this is where I'm stuck: how do I get them to "swap" in a string?
Edited by: Phoom on Dec 1, 2007 11:58 AM
Phoom wrote:
> After a quick reading on Strings, I found toCharArray() useful in this case. I guess I can convert it back to string format by using a for loop and make it add to a string.
> ...
No need to loop over that array: have a look at the various constructors of the String class.
And unless it a (homework) requirement to implement your own sorting algorithm, have a look at java.util.Arrays' sort methods.
Good luck. -
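The hints in the sorting thread above (toCharArray(), java.util.Arrays.sort, and a String constructor) fit together roughly like this. A minimal sketch with a made-up class name, assuming the library sort is allowed rather than a hand-written one:

```java
import java.util.Arrays;

// Hypothetical helper, not from the thread.
class SortLetters {
    static String sortLetters(String s) {
        char[] chars = s.toCharArray(); // Strings are immutable, so work on a copy
        Arrays.sort(chars);             // orders by char value, so uppercase sorts before lowercase
        return new String(chars);       // the String(char[]) constructor rebuilds the text
    }

    public static void main(String[] args) {
        System.out.println(sortLetters("ddeab")); // abdde
    }
}
```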
A co-worker had a query he was running with a where clause something like this...
Where
Name='BLAH & BLAHBLAH'
How do I prevent SQL Developer from treating the & symbol as a request for a variable prompt?
It would prompt for a variable named BLAHBLAH...
Thanks
Obe
ps. So far the Java SDK 6_10 is working...
In SQL*Plus you can "SET DEFINE OFF" prior to the SQL statement to get it to ignore the ampersand, but as far as I know (and I may be wrong), you can't do that in SQL Developer Worksheet.
What I normally do in this situation is break the string up and use "||" and "CHR". To use your example:
WHERE name = 'BLAH ' || CHR(38) || ' BLAHBLAH'
Ed. H. -
Hi All:
I am stuck with a small situation, I would appreciate if someone can help me with that. I have 2 strings:
String 1 - "abc"
String 2 - "I want to check if abc is in the middle"
How can I check if the string 1 "abc" is in the middle of the string 2. I mean the String 2 does not start with "abc" and it does not end with "abc", I want to check if it is somewhere between the string.
Thanks,
Kunal
int i = s2.indexOf(s1);
if(i == -1) {
// nowhere (checked first so that -1 never reaches the arithmetic below)
} else if(i == 0) {
// start
} else if((i + s1.length()) == s2.length()) {
// end
} else {
// somewhere in the middle
} -
The effect of repeated concatenation of Strings can cause a lot of unwanted objects in memory. If I am creating a query statement as follows:
String query = "Select * from "+tablename+" where something "+x+" something "+y.......(All in a single statement)
Will this also create a lot of unwanted objects before query finally gets its value? Should I do this concatenation with a StringBuffer?
I found the following listing and hope it helps...
http://java.sun.com/docs/books/jls/second_edition/html/expressions.doc.html#39990
15.18.1.2 Optimization of String Concatenation
An implementation may choose to perform conversion and concatenation in one step to avoid creating and then discarding an intermediate String object. To increase the performance of repeated string concatenation, a Java compiler may use the StringBuffer class or a similar technique to reduce the number of intermediate String objects that are created by evaluation of an expression.
For primitive types, an implementation may also optimize away the creation of a wrapper object by converting directly from a primitive type to a string.
Hope this also helps. -
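In practice, for the concatenation thread above, a single expression with several + operators is typically compiled into one buffer with a chain of appends, so the one-statement query is not the problem case. Concatenating inside a loop is, because each pass creates a new String. A minimal sketch of the explicit-builder version, using StringBuilder (Java 5+; StringBuffer works the same way on older versions). The class, method and column names are made up:

```java
// Hypothetical helper, not from the thread.
class QueryBuilder {
    // Builds "SELECT a, b FROM table" with one builder instead of repeated +=.
    static String buildQuery(String table, String[] columns) {
        StringBuilder sb = new StringBuilder("SELECT ");
        for (int i = 0; i < columns.length; i++) {
            if (i > 0) {
                sb.append(", ");
            }
            sb.append(columns[i]);
        }
        sb.append(" FROM ").append(table);
        return sb.toString();
    }

    public static void main(String[] args) {
        System.out.println(buildQuery("emp", new String[] {"ename", "empno"}));
    }
}
```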
A slightly different backlight question
I've heard some other people have had a problem with the backlight of their iBooks turning off and then not coming back. I've had the same problem but it tends to happen when I adjust the screen. I'll push it back slightly and it will turn off. If I sleep it, it will sometimes come back but usually I have to shut it down, close the lid, open it up and start up again. Most of the time, it comes back. I've tried zapping the PRAM and resetting the PMU. Anything else to be done? Or might I have a frayed wire or something between my computer and screen which becomes disabled when I move the screen? It's an old computer and I don't really want to take it in for hardware repairs if I can avoid it, but it's been a major problem. Thanks.
Hi, and welcome to Apple Discussions.
You are not alone.
Dominion Tech in Colchester, Vermont used to do the repair for $79.90. You may want to call and see if the price still stands. You may also want to check your local Yellow Pages to see if someone will meet that price to save the shipping charges.