Search text in PDF file

I would like to search for text in a PDF file from Java (VJ++). Is this possible with java.io? When I read the file that way I get junk text.
I also tried adding a COM wrapper through VJ++, but the file does not get loaded. Are there any examples?
Thank you
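
A likely reason for the junk text: the page content in most PDF files is stored in compressed streams (commonly the FlateDecode filter, which is zlib/deflate), so the raw bytes read through java.io are not plain text. The sketch below only illustrates that effect with the JDK's own java.util.zip classes. The class name FlateStreamDemo, its helper methods, and the sample content string are made up for the illustration, and it does not parse a real PDF. Real files also need object parsing and font-encoding handling, which is where a PDF library such as PDFBox comes in.

```java
import java.io.ByteArrayOutputStream;
import java.nio.charset.StandardCharsets;
import java.util.zip.DataFormatException;
import java.util.zip.Deflater;
import java.util.zip.Inflater;

// Hypothetical demo class: shows why compressed stream bytes look like junk.
class FlateStreamDemo {

    /** Compresses bytes in zlib format, as a PDF writer does for a /FlateDecode stream. */
    static byte[] deflate(byte[] plain) {
        Deflater deflater = new Deflater();
        deflater.setInput(plain);
        deflater.finish();
        ByteArrayOutputStream out = new ByteArrayOutputStream();
        byte[] buf = new byte[1024];
        while (!deflater.finished()) {
            out.write(buf, 0, deflater.deflate(buf));
        }
        deflater.end();
        return out.toByteArray();
    }

    /** Decompresses zlib data; rejects truncated or malformed input. */
    static byte[] inflate(byte[] compressed) {
        Inflater inflater = new Inflater();
        inflater.setInput(compressed);
        ByteArrayOutputStream out = new ByteArrayOutputStream();
        byte[] buf = new byte[1024];
        try {
            while (!inflater.finished()) {
                int n = inflater.inflate(buf);
                if (n == 0 && (inflater.needsInput() || inflater.needsDictionary())) {
                    throw new IllegalArgumentException("truncated or unsupported stream");
                }
                out.write(buf, 0, n);
            }
        } catch (DataFormatException e) {
            throw new IllegalArgumentException("not valid zlib data", e);
        } finally {
            inflater.end();
        }
        return out.toByteArray();
    }

    /** Byte-for-byte substring test (ISO-8859-1 maps each byte to one char). */
    static boolean containsText(byte[] data, String needle) {
        return new String(data, StandardCharsets.ISO_8859_1).contains(needle);
    }

    public static void main(String[] args) {
        // Made-up content stream: two text-showing operators.
        byte[] content = ("BT /F1 12 Tf 72 720 Td (Hello PDF search) Tj ET\n"
                + "BT /F1 12 Tf 72 700 Td (Hello PDF again) Tj ET\n")
                .getBytes(StandardCharsets.ISO_8859_1);
        byte[] stored = deflate(content);
        System.out.println("found in stored bytes:   " + containsText(stored, "Hello PDF"));
        System.out.println("found in inflated bytes: " + containsText(inflate(stored), "Hello PDF"));
    }
}
```

Even after inflating, the text in a real file sits inside operators such as Tj and may use a font-specific encoding, so a plain substring search is only reliable on text that a PDF library has already extracted.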


Similar Messages

  • How to search text in pdf file?

    Hi all
    I have to store the cover of a newspaper, which includes images and text, and then be able to search for keywords in the cover.
    I've read about storing it in PDF format and using interMedia Text.
    I am just wondering how to store it and how to do the search.
    Thanks all

    Hi,
    You need to store the PDF document in a BLOB column and create a CTXSYS.CONTEXT index on it.
    e.g. (for .doc files):
        CREATE INDEX I_DOC ON DOC_TABLE (DOC_COLUMN) INDEXTYPE IS CTXSYS.CONTEXT PARAMETERS ('SYNC (ON COMMIT)');
    Then you can test it with this SQL:
        select score(1) from DOC_TABLE where contains(DOC_COLUMN, 'My text', 1) > 0;
    In my case, I use this index to search Word documents (.doc).
    Maybe this link will help you create an index using filters, in order to search PDF files:
    http://www.oracle.com/technology/products/text/htdocs/altfilters.htm
    Cheers

  • How to write a unicode text in pdf file

    Dear Friends,
    I am a beginner in Acrobat PDF plug-in development. I was trying to write Unicode text (Tamil text) into a PDF file.
    Using the same API I am able to write English text in Times Roman, Arial, and other fonts, but I am not able to write Tamil text.
    The code is as below:
            memset(&pdeFontAttrs, 0, sizeof(pdeFontAttrs));
            pdeFontAttrs.name = ASAtomFromString("Latha");
            pdeFontAttrs.type = ASAtomFromString("TrueType");
            pdeFont    = PDEFontCreateFromSysFont(                                        \
                            PDFindSysFont(&pdeFontAttrs, sizeof(pdeFontAttrs), 0),    \
                            kPDEFontCreateEmbedded);
            pdeText = PDETextCreate();
            PDETextAdd(pdeText, kPDETextRun, 0, (ASUInt8 *)buffer, _tcslen(buffer),
                                    pdeFont, &gState, sizeof(gState), NULL, 0, &textMatrix, NULL);
            PDEContentAddElem(pdeContent, kPDEAfterLast, (PDEElement)pdeText);
            PDPageSetPDEContent(pdPage, gExtensionID);  
            PDPageReleasePDEContent (pdPage, gExtensionID);
    Kindly assume that PDEGraphicsState and PDETextMatrix are set properly; I am not pasting the entire code, to avoid complexity.
    Thank you,
    Safiq

    Dear lrosenth,
    I went through some code and suggestions on the internet and found that I need a CMap file and a CID font file for the respective font, since PDF doesn't support Unicode fonts directly.
    Can you help me find out where I can get the CMap file and CID font file for the Tamil font Latha (a Microsoft TrueType font)?
    Regards,
    Safiq

  • Why can't I "Save as Text" a pdf file received as an email attachment?

    I can "Save as text" a pdf file which I have created on my own computer (that is, it goes into MS notebook, from which I can then Copy and Save as an MS Word file) but not when I receive a pdf as an email attachment. (The file is saved, but it is empty.) Why would I want to convert my own pdf back to text? Well, in case I no longer have the original Word document, I suppose. But the thing is, "Save as text" works with my pdf, but not with those I receive from others. How come? Thanks!

    Is this a scanned PDF? If so, it must first be OCR'd.

  • Always scrolling back to the search bar for pdf files in ibooks. Is there a way to fix this?

    ALWAYS scrolling back up to the search bar for pdf files in ibooks. Is there a way to fix this?

    Care to share your fix with the rest of the community in case anyone else has the same problem or since you found the solution are you off to never be heard from again?

  • How to Extract the Highlight Text in PDF File

    Hi Scripters,
    I want to know how to extract the highlighted text in PDF files and save it as a text-only (*.txt) file.
    regards
    baby

    Hi,
    Okay, I'll try my best.
    Thanks for your reply.
    Regards
    Baby

  • Search text in PDF and MS Word document

    Can anybody tell me how to search text in PDF and MS Word documents through Java code? Does anybody have code or any suggestions to give?
    Thank You
    Adnan

    > Can any body tell me how search text in PDF
    > and MS Word document through Java code, any
    > body has code or any suggestion to give
    Yes.
    First, you need to work out how to read each document type from Java.
    E.g, for MS Word you could use Apache Jakarta POI - HWPF: http://jakarta.apache.org/poi/hwpf/index.html
    Then, you use Apache Lucene to index and search.
    See http://lucene.apache.org/java/docs/index.html
    ~D
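
    To make the "index and search" step concrete, here is a minimal sketch of the idea behind it: a toy inverted index written with the JDK collections only. It is not Lucene and does not use Lucene's API. The class TinyTextIndex, its methods, and the sample document names are hypothetical, and the sketch assumes the text has already been extracted from each PDF or Word file.

```java
import java.util.ArrayList;
import java.util.Collections;
import java.util.HashMap;
import java.util.List;
import java.util.Locale;
import java.util.Map;
import java.util.Set;
import java.util.TreeSet;

// Hypothetical toy index: maps each term to the documents that contain it.
class TinyTextIndex {
    private final Map<String, Set<String>> postings = new HashMap<>();

    /** Indexes already-extracted plain text under a document name. */
    void add(String docName, String text) {
        for (String term : tokenize(text)) {
            postings.computeIfAbsent(term, k -> new TreeSet<>()).add(docName);
        }
    }

    /** Returns the documents that contain every term of the query (AND semantics). */
    Set<String> search(String query) {
        Set<String> result = null;
        for (String term : tokenize(query)) {
            Set<String> docs = postings.getOrDefault(term, Collections.<String>emptySet());
            if (result == null) {
                result = new TreeSet<>(docs);
            } else {
                result.retainAll(docs);
            }
        }
        return result == null ? new TreeSet<String>() : result;
    }

    /** Lower-cases the text and splits it on anything that is not a letter or digit. */
    private static List<String> tokenize(String text) {
        List<String> terms = new ArrayList<>();
        for (String t : text.toLowerCase(Locale.ROOT).split("[^\\p{L}\\p{Nd}]+")) {
            if (!t.isEmpty()) {
                terms.add(t);
            }
        }
        return terms;
    }

    public static void main(String[] args) {
        TinyTextIndex index = new TinyTextIndex();
        index.add("report.pdf", "Quarterly sales report for the northern region");
        index.add("memo.doc", "Memo: sales meeting moved to Friday");
        System.out.println(index.search("sales"));        // [memo.doc, report.pdf]
        System.out.println(index.search("sales report")); // [report.pdf]
    }
}
```

    A real search library adds ranking, stemming, phrase queries, and on-disk storage on top of this basic structure, which is why the reply points to Lucene rather than hand-rolled code.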

  • Searching text in PDF

    I believe that I have heard that in GW 8 it will be possible to search text in PDF documents.
    I have tried it, but it doesn't work.
    Is there a way to make it work in GW 8?
    Thanks,
    Tomislav

    Dave Parkes wrote:
    > I don't know enough about the Linux setup to know precisely what is called
    > on that OS.
    It's still called the document conversion agent on Linux. I would set off
    an indexing run to see if it kicks it all off properly. My PDFs have been
    indexing here for a good long time :)
    Danita
    Time to upgrade to GW8!
    http://www.caledonia.net/gw8upg.html

  • Read text in pdf files

    Hi Ppl,
    Is it possible to read text from a PDF file? We can use ActiveX controls to open and display PDF files, but these ActiveX controls don't seem to support reading text from the files. Please help me out.
    Thanks 

    The full PDF format is VERY complex, which is probably why PDFBox was choking on one of the PDF files of a former poster. You are of course free to implement a PDF parser in LabVIEW, but expect this to be a project where a man-year of effort certainly won't be enough to even get close to what PDFBox can do. Then decide whether you want to give it away for free just for the good karma of it, or attempt to sell it with a potential of maybe one license every year.
    Just look at the opposite direction: creating a PDF file from within LabVIEW. There are several toolkits out there that can do that, and they already took a considerable amount of time to develop. Yet generating a small subset of PDF features in a file is several orders of magnitude easier than parsing and interpreting any existing PDF document that might have been created by tools like Adobe Acrobat, with Adobe as the creator of PDF potentially using all the bells and whistles they put into the PDF standard over those two or more decades, including quite a few bugs that eventually got documented as features.
    Rolf Kalbermatter
    CIT Engineering Netherlands
    a division of Test & Measurement Solutions

  • Insert Text in PDF files?

    Prior to reformatting my computer, I was able to insert text into PDF files.
    Now I am not seeing the insert text option. Can you assist?
    Neil Borne
    Moderator's Note: Removing personal details.

    Thank you. It would appear that it should work, but I must click on the
    Comment box at the far top right.
    When choosing Text, I receive the "Failed to load Application resource
    (internal Error)" message.
    I have looked for updates; none are available. I have also clicked on repair
    installation.
    What should I do next?
    Neil

  • Read the text in pdf file

    Dear all,
    I have checked a lot of posts about reading PDFs in this forum. However, is it possible to read the text in a PDF file? In my case, I need to read the content of the PDF for further processing. Could anybody give me some suggestions? Thank you.

    I have a similar problem, can anybody help us....

  • How to use OCR Font A type by the time of writing some text into Pdf fil

    Hi,
    I am generating a PDF file in Java. How can I use the OCR-A font for the text of the PDF file? Can anyone tell me where I can get OCR-A and how to use it in Java? I want to write some text into the PDF file, and that text should use the OCR-A font family.
    Thanks.

    This document shows how to disable OCR during conversion; just do the opposite: https://forums.adobe.com/docs/DOC-3062

  • Editing text from pdf file

    how to edit text from pdf file?

    Adobe Reader does not allow editing the text of a PDF document. You will need to get Acrobat on your Windows or Mac to do that.

  • How to remove a hidden text in pdf file with Acrobat Pro 9. How to save pdf file and remove hidden text?

    I made this file in InDesign. The highlighted empty spaces indicate that there is hidden text, and it pops up when searching for some words in the PDF file. How can I save the PDF file so that it keeps only the visible text?


  • Add text to PDF file

    The question is: Is it possible to add text on each page of a PDF file?
    I have tried to do this and searched a lot on the internet and in this forum.
    Most of the threads are about converting to PDF from OTF format or other formats.
    I need to add text to an already existing PDF that is archived in R/3 as a PDF, not an OTF document.
    Thanks,
    Olek

    Does anyone have any idea?
    How could this be done?
    Maybe some Java code or something else?
