Image based Text Search

I created a table with four columns: article_id, format, title, and text (a BLOB column that stores documents and images).
I loaded the following file types into the BLOB column using SQL*Loader:
1. PDF document
2. GIF image
3. JPG image
4. DOC document
5. TXT text file, etc.
I created the index using the following statement:
CREATE INDEX TXT_INDEX ON TXT_TABLE(text)
INDEXTYPE IS ctxsys.context;
When I search for text inside these documents, it works fine for the PDF, DOC, and TXT file types. If the same text appears anywhere in a GIF or JPG image, it is not retrieved.
Can you please help me with this? It is urgent.
Thanks in advance.
Regards
Raj

The text searching component (interMedia Text) does not index images.

Similar Messages

How theme-based text search work?

Hello all,
I want to use the theme-based text search feature of interMedia, but I don't fully understand how it works.
Do I need to build my own thesaurus that maps words to a concept? For example, the concept 'Internet' would include words such as 'Computer', 'Web', and 'http'. Then, when I run a theme-based search for the word 'Internet', would it return all the records that contain the words 'Computer', 'Web', and 'http'?
If that is the case, does it mean that I have to build many thesauri myself?
Thanks for everyone's help in advance.
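
The behaviour described in the question, expanding a concept word into its related terms before matching, can be sketched roughly as follows. This is a plain Python illustration of query expansion only, not how interMedia Text implements themes; the thesaurus contents come from the question, and the names THESAURUS, expand, search and DOCS are hypothetical.

```python
# Hypothetical thesaurus: concept -> related terms (contents taken from the question).
THESAURUS = {"internet": {"computer", "web", "http"}}


def expand(term, thesaurus=THESAURUS):
    """Return the lower-cased term plus any related terms listed for it."""
    key = term.lower()
    return {key} | thesaurus.get(key, set())


def search(term, documents, thesaurus=THESAURUS):
    """Return sorted ids of documents containing the term or a related term."""
    wanted = expand(term, thesaurus)
    hits = []
    for doc_id, text in documents.items():
        words = set(text.lower().split())
        if words & wanted:  # at least one expanded term occurs in the document
            hits.append(doc_id)
    return sorted(hits)


# Hypothetical sample documents, keyed by id.
DOCS = {
    1: "The web server speaks http",
    2: "A recipe for bread",
    3: "My computer is slow",
}
```

With this data, a search for 'Internet' matches documents 1 and 3 even though neither contains the word itself, which is the effect the question asks about.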

I am trying to do the same thing. I have succeeded in adding these related terms to the thesaurus, but not in updating the knowledge base. If you have had some success, let me know.
Gagan
[email protected]

Apex 3.1, Interactive Report Row Text Search, image bitmap as TEXT?

I think this IR thing is powerful which could save me lots of time in development.
One question: does the row text search (default: all columns) treat an image column as regular text (a string)? I ran the following search:
In SAMPLE APPLICATION --> Products, I put 300 in the search field (to search for a $300 list price). The search returns 3 rows, though it should return only 2. The third row's list price is $1999; I looked at it in SQL*Plus and saw that its image bitmap (a long string) includes "300", so I believe the default all-columns search treats the image as a regular string.
How can I keep the image bitmap out of the IR search? These bitmap strings are very long for each image and can EASILY match search conditions on something like PRODUCT DESCRIPTION or PRODUCT PRICE in our product data (about 25000). Thanks
sean

Sean / Russell,
Thanks for reporting this, it's certainly a bug.
By the way, the search is performed in SQL, on whatever column values are being displayed (run the page in debug mode to see the full SQL). So in the case of the sample application, it is not matching the image bitmap, but the image size, which is selected in the SQL. The bug is that the full search should not include columns which have filtering disabled or one of the special image format masks. We'll try to fix this for an upcoming patch.
Thanks,
Marco
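
The effect Marco describes, a search that runs over every selected column including a numeric image size, can be reproduced in a small sketch. This is plain Python rather than the SQL that APEX generates; the rows, the column names and the row_text_search helper are all hypothetical and chosen only to show the false match.

```python
def row_text_search(rows, needle, columns=None):
    """Return rows where `needle` occurs in any of the searched columns.

    With columns=None every column is searched, mimicking an all-columns search.
    """
    matches = []
    for row in rows:
        searched = columns if columns is not None else row.keys()
        if any(needle in str(row[c]) for c in searched):
            matches.append(row)
    return matches


# Hypothetical product rows; image_size stands in for the selected image length.
PRODUCTS = [
    {"name": "Item A", "list_price": 300, "image_size": 4521},
    {"name": "Item B", "list_price": 1300, "image_size": 5120},
    {"name": "Item C", "list_price": 1999, "image_size": 63008},
]

all_columns = row_text_search(PRODUCTS, "300")
without_image = row_text_search(PRODUCTS, "300", ["name", "list_price"])
```

Searching all columns returns three rows because "300" occurs inside the image size 63008; restricting the search to the name and price columns returns only the two expected rows.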

How to convert from image based pdf to text based pdf

I have Adobe 9 Pro. How to convert from image based pdf to text based pdf? For example, if someone emails a scanned pdf to me, how do I convert that document into a text based pdf?

To perform OCR, open the document and select: Document > OCR Text Recognition > Recognize Text Using OCR
More information on the various options is in the Acrobat help doc.

How to help an image-based site's Google indexing?

I've a client whose site, due to budget and time constraints, is made up mostly of images. It's not a high-traffic site, but it's a site whose target audience requires more visual "wow" than most. I am prefacing with this because I just know someone's going to say "use more text" and that's not what I'm here asking about.
Instead, my question has to do with Search Engine Indexing of sites made up mostly of images. Perhaps Flash-based sites have the same problem, but I don't believe in those. The site in question has no Flash. It's 100% valid HTML markup, only that it's got more images than text. Even the company address and phone numbers are images. Don't argue with me, it just is, and had to be this one unique time. ;-)
Could I compensate for the lack of text by creating a DIV loaded with company info, putting it immediately after the body tag in the code, and placing it outside the viewport so no one ever sees it?
I thought of simply putting display:none on it, but crawlers like Google's probably look for that bit of code. It's so easy to check for, after all.
Will placing a div outside the viewport with pertinent text in it help with the SEO concerns? At least a little bit?

I think you already know the answer to this question so I won't beat a dead horse here.
"Could I compensate for the lack of text by creating a DIV loaded with company info, putting it immediately after the body tag in the code, and placing it outside the viewport so no one ever sees it?"
Google frowns on site owners who try to manipulate search results by seeding pages with hidden keywords. I'm not suggesting for one moment that YOU would do that but lots of people have tried this with very bad results -- blacklisting or total removal from Search results.
Let's say you have a graphic banner in your header with the company name, slogan and phone number on it. It's obviously got content on it but it's not indexable. I think you could safely use an object replacement technique by adding a real text equivalent to the header and setting text-indent to -99999px. This makes the text readable by screen readers and bots that don't use CSS while keeping it out of view of sighted humans. Just make certain that whatever text you use has a graphic equivalent on the page or you run the risk of being blacklisted.
In addition, be sure to apply alt and title attributes to every graphic. Search engines don't place much importance on them but they won't penalize you for using them either.
Best of luck,
Nancy O.
Alt-Web Design & Publishing
Web | Graphics | Print | Media Specialists
http://alt-web.com/
http://twitter.com/altweb
http://alt-web.blogspot.com

Error: This page has graphics other than images or text on it

I recently upgraded to Adobe Acrobat 9 Pro, and I am experiencing a problem
with OCR text recognition.
If I use "Create PDF from Scanner" or "Create PDF from file" (using a
monochrome TIFF image) to create a PDF, I can then perform OCR on the PDF
file in Acrobat.
But if I open a monochrome TIFF image in some other program (Irfanview) and
print that to the Adobe PDF printer, then open that PDF in Acrobat and try
to perform OCR on it, I get the following message:
"Acrobat could not perform recognition (OCR) on this page because: This page
has graphics other than images or text on it. It cannot be captured."
If I select the scanned image in the PDF file using the TouchUp Object tool,
and then cut it to the clipboard, I am then able to select a second
blank object on the page, that has the same dimensions as the scan (it is
full page size). If I delete that object, then paste the first one back in
(the scanned image), then OCR will work. So what is this additional object
that is preventing OCR? I did not have this issue with versions 4-8 of
Acrobat.
Also, this problem appears to only pertain to 1-bit monochrome images. I
scanned the same page to 8-bit grayscale, then printed that to Adobe PDF
from Irfanview, and that PDF file allows OCR to be performed. Does anyone
else have a problem using OCR when the PDF was created by printing a
monochrome raster image to PDF?

Not sure this will work for indesign, but for any other searches about the OCR problem with "This page has graphics other than images or text on it", I used this workaround for a huge scanned text that Acrobat 9 Pro wouldn't OCR:
1. Save As.... > TIFF image
This saved each page as a tiff, so if your file is big, you might want to create a special folder for it. (I had 1100 pages!)
2. Open the folder with the TIFF images.
3. Select all files.
4. Right-click > Open with.... > Choose.... > Other.... > Acrobat
5. Popup will ask if you want to open all files in a single document. Obviously a good choice for 1100 pages.
6. After the new PDF generates, save it.
7. Document > OCR --- should work now.
8. If it does, don't forget to delete the folders with the TIFFs.
Hope that helps someone.

I have a Flash sample to rotate images and text but I cannot find a way to display special characters

Hello everyone.
I bought a very nice Flash application that rotates images and text of any color and size. It uses an XML input file.
I've posted a complete copy here, so any of you can download, view and use it freely.
I would appreciate it if any of you could tell me how to make the displayed text include the characters I use in my language (Spanish), such as á, é, í, ó, ú, ñ, and other special characters.
So far I have not found a way to do it, because I'm not an expert in Flash, and even less so in ActionScript.
If any of you would help me with that, I would be grateful if you could make the appropriate adjustments, compress the result into a .zip file, and let me know where to download it, or if you prefer you can send it to my email: [email protected]
Compressed in .zip format, it is a very small file: 430K.
Click here to download the complete sample.
Thanks.
=====================================
Translated using http://translate.google.es
=====================================

Hello Rinus,
If I understood your last post correctly, then problem 2 is resolved, right?
Regarding problem 3:
I'm not asking you to share exact VIs.
I just want to see a very simple VI that explains the concept of what you're trying to do, what should happen (this can be in words that refer to the front panel elements) and what you've tried.
The terminology you're using isn't clear to me without an extra explanation.
This could even be only a Front Panel with a few buttons on where you just describe what should happen with specific controls/indicators.
Based on the first post it is not clear to me what you mean with:
- A "button element":
Are you talking about a control, an indicator, or a cluster that contains multiple controls?
- The structure:
Is this an event structure, case structure, for loop, ...?
It seems like you want to programmatically control Front Panel objects, which in itself is no problem at all, regardless of how many objects you want to control.
Please share a simple example of what goes wrong and explain which things should happen on that specific Front Panel.
This will allow me to help you and also allow me to guide you along the right path.
Kind Regards,
Thierry C - Applications Engineering Specialist Northern European Region - National Instruments
CLD, CTA
If someone helped you, let them know. Mark as solved and/or give a kudo.

Converting a web-based text page to XMP

I have a live, web-based text feed that contains all of the metadata I want to capture. I want to tag this metadata to video and image files, and I want that information to stay with the video and image files throughout the production process (ie. the metadata information should be available in transcoded copies of the video files and so on).
I'd also like to use this metadata to create data-driven graphics in Photoshop. At this stage I can get basic datasets working in PS, but I have to cut-and-paste the info from some cells into a .csv file first. Ideally I'd like to eliminate this manual step as it introduces the possibility of human error, which is what I'm trying to eliminate in the first place.
Photoshop can't read the whole text file as a dataset because I think it's in an array (I'm not sure, I don't get it), and from my understanding datasets will only work with .csv files.
It should be a relatively simple process to grab information from a live, web-based text feed and convert it to useable metadata, but I find the XMP SDK completely indecipherable. Apologies for my ignorance but I have no coding experience whatsoever and it seems like it's written in an alien language. Can someone please offer some advice as to how I go about this? Attached is a copy of the text file I'm trying to use.
Thanks to anyone with enough smarts or nerd power to crack it... it's beyond me. I've wasted weeks staring at a blank page and I'm more confused now than when I started.

Hi,
What you are planning to do seems technically possible with our XMP C++ SDK, but you really need programming skills in C/C++ to carry out this project.
Regards,
Samy

Want to send a email with images and text in the body of email in iOS

In iOS, we have written code to send an email with embedded images and text in the body of the email (not as an attachment) using the mail composer. It works well on iOS devices like the iPhone and iPad, but does not work on Windows-based OSes. Can anybody help? The code is

Thanks, James! Do you have any idea how to find the Windows resource, which I believe will be included in our application pack?
I forgot to copy the code above; here it is. This might help you to help me.
#import <MessageUI/MessageUI.h>

// Build an HTML body with the PNG embedded as a base64 data URI.
// (The original alloc/init followed by retain over-retained the string.)
NSMutableString *imgContent = [NSMutableString stringWithString:@"<html><body>"];
UIImage *image = [UIImage imageNamed:@"Midhun.png"];
NSData *pngData = UIImagePNGRepresentation(image);
// base64EncodedStringWithOptions: is the built-in NSData method (iOS 7 and later).
NSString *base64String = [pngData base64EncodedStringWithOptions:0];
[imgContent appendFormat:@"<img src='data:image/png;base64,%@'>", base64String];
[imgContent appendString:@"</body></html>"];
// Hand the HTML to the mail composer as the message body.
MFMailComposeViewController *emailWin = [[MFMailComposeViewController alloc] init];
[emailWin setMessageBody:imgContent isHTML:YES];

How To Create Image Based PDF

Hi,
Currently we convert text-based PDFs to image-based PDFs using Adobe Reader/Adobe Acrobat via
Print (Adobe PDF) -> Advanced -> Print as image.
That way I can save the file as an image-based PDF.
How can I do this in C# using the Adobe DLLs? Please give me sample code for this. Which DLL do I have to use, and what code?
Regards,
R.Balajiprasad

You would need to license the Adobe PDF Library from Datalogics (http://www.datalogics.com).

How to do full text search in mobile apps?

Hi,
I want to use the FTS module in SQLite to leverage its powerful full text capability, but I was sad to learn that modules are disabled in AIR. (Has anybody gotten the FTS module to work in AIR?)
Is there any way I can use the FTS module with SQLite? Android native developers use this all the time, so I think not having it is a serious limitation for text-based apps.
My alternatives right now are to:
Write my own indexer and scorer (lots of work)
Do full text search on the server side (breaks offline capability)
I hate both options. I am leaning toward writing my own indexer, but what do you guys suggest I do?
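
For a sense of scale, the first alternative (a hand-written indexer and scorer) can be sketched in a few dozen lines. The sketch below is in Python rather than ActionScript and is only an illustration of the idea: an inverted index with a raw term-frequency score and AND semantics. The class name TinyIndex and its methods are hypothetical, and it has none of the features of SQLite's FTS module such as stemming, phrase queries or on-disk storage.

```python
import re
from collections import Counter, defaultdict


def tokenize(text):
    """Split text into lower-case alphanumeric terms."""
    return re.findall(r"[a-z0-9]+", text.lower())


class TinyIndex:
    """Minimal in-memory inverted index with term-frequency scoring."""

    def __init__(self):
        # term -> Counter mapping doc_id to how often the term occurs there
        self.postings = defaultdict(Counter)

    def add(self, doc_id, text):
        for term in tokenize(text):
            self.postings[term][doc_id] += 1

    def search(self, query):
        """Return ids of documents containing every query term, best score first."""
        terms = tokenize(query)
        if not terms:
            return []
        candidates = None
        for term in terms:
            docs = set(self.postings.get(term, ()))
            candidates = docs if candidates is None else candidates & docs
        # Score = summed term frequencies; ties are broken by document id.
        scores = {d: sum(self.postings[t][d] for t in terms) for d in candidates}
        return sorted(scores, key=lambda d: (-scores[d], d))


idx = TinyIndex()
idx.add(1, "full text search in mobile apps")
idx.add(2, "text text search")
idx.add(3, "offline maps")
```

Here idx.search("text search") ranks document 2 ahead of document 1 because "text" occurs twice in it. Persisting the postings and updating them when rows change is where most of the "lots of work" mentioned above would go.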

I've got nearly the same concern. I want to add full text search on a mobile and desktop application, without any server.
I read in an archived discussion that FTS support was already being considered before AIR 2.0.
Will we see it soon?

Information on full text search in Oracle Database

Hi,
We are looking to implement full text search using Oracle database. Where can I find info on this topic? Specifically, I'm looking for
1) an overview of how to implement it in Oracle Database: column type, size/limitations, etc.
2) whether Oracle Database comes with filters to extract and filter data from different file formats such as MS Office, PDF (images), etc.
Appreciate your reply

Look into the Oracle Text documentation. It has the answers to your questions.

Re-save a PDF so contents are image-based?

Is there a quick and easy way to do this? I know it will inflate the file size, but that's not important for my applications. Basically, I want all the contents of an existing text-based PDF to be image-based, as if it were a document I had optically scanned, but without the need to print it out and run it through a scanner. Is this possible? Thanks

Ah nice idea. Now, is it possible to perform this in a batch fashion?
Ideally, I want my user to be able to drop a bunch of PDF's in a directory, run a batch file, and then they will be "converted" as described above.
Are there command line arguments for Acrobat that could do this? Using X Standard here.

Full-text search containstable AND logical operator performance problem

I have the following clause in my SQL statement:
CONTAINSTABLE(subject_ifts, SearchText, '"smith*" AND "n*"', LANGUAGE 1033)
It takes approximately 60 seconds to return 840 rows from a total of 4 million rows in the searched table. If the search condition is changed to '"smith*"', it returns 840 rows in less than 1 second (i.e. all rows containing the text "smith" also contain the text "n"). It seems that the search for "n*" takes a long time to return rows because almost all 4 million rows contain this text. (Note: the search criteria are passed as a parameter into a stored procedure at runtime, based on the search criteria entered by the user in the UI.)
Is there any way to make SQL Server perform its search for the text "n*" only on the result set of 840 rows returned by the "smith*" search? Theoretically this should return rows a lot quicker, since the search would run on 840 rows rather than 4 million. However, I cannot seem to implement this effectively. I have tried using CTEs and JOINs to no avail. Any help greatly appreciated.
Graham Goodwin Email: [email protected]

Hello,
I am trying to involve someone more familiar with this topic to take a further look at this issue. Some delay might be expected while the case is transferred. Your patience is greatly appreciated.
Thank you for your understanding and support.
Regards,
Fanny Liu
TechNet Community Support

Text search

I still don't seem to understand what's wrong with text searches in a VI.
I got a VI, in which there's a string constant, which is a command line in DOS style.
This string is something like "C:\Program Files\MySoftware\ABC_2_5_6.exe -f -g"
If I perform a text search in the VI for "2_5_6", nothing is found.
I tried "*2_5_6", "2_5_6*", and "*2_5_6*" as well, but LabVIEW finds nothing.
Am I missing something?

Runawaycode
I tried searching for the same string in my LabVIEW and it works fine.
BTW, which LabVIEW version are you using?
If you need to search for a string in a specific VI, press Ctrl+F, select "Text" (radio button) and enter the search text (image below).
Also check the other options provided in the Find window.
=========================================
Please remember to accept a solutions and show your appreciation by giving Kudos to helpful messages...
Mangesh D.
CLAD | Project Engineer
==
VIPM, LabVIEW 8.2, 2009, 2011SP1, 2012, 2012SP1, 2013, cRIO,cDAQ, PXI, ELVIS, Multisim, Smart Camera....
Attachments:
2013-02-02_1634.png ‏57 KB
