Fastest method for searching within a PDF file?

I have created a PDF document that holds all document reports for the last 10 years. It is currently 30,000 pages. As time goes on, incoming reports append to this PDF. This has proven to be an excellent way of holding and retrieving information, allowing me to pull up reports using specific search criteria.
My question is this: is there any faster way to search this mammoth document than the Adobe search feature? The Adobe search feature has great functionality, without a doubt, and even with 30,000 pages I can search the entire document in about 5 minutes. However, in this day and age 5 minutes can be a lifetime, especially to a client, lawyer, doctor, etc.
Does anyone know of a way to speed up Adobe's search function, or of any existing third-party programs that can do this (free or paid)?
I appreciate your time, love the community, and await your response.
Have a great day.

Hi,
You can try going to Finder and pressing Cmd+F. Below "Search This Mac" you should see "Kind". Try switching that to "Contents" and the second option to "Documents".
Hope this helps,
Zevie
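
One general way to avoid rescanning 30,000 pages on every query is to build a word index once and then look words up in it. Below is a minimal sketch in Python, assuming the text of each page has already been extracted into a list of strings by some other tool (the extraction step is not shown, and `build_index` and `search` are hypothetical names, not part of any Adobe product):

```python
import re
from collections import defaultdict


def build_index(pages):
    """Map each lowercase word to the set of 1-based page numbers containing it."""
    index = defaultdict(set)
    for page_number, text in enumerate(pages, start=1):
        for word in re.findall(r"\w+", text.lower()):
            index[word].add(page_number)
    return index


def search(index, query):
    """Return the sorted page numbers that contain every word of the query."""
    words = re.findall(r"\w+", query.lower())
    if not words:
        return []
    result = set(index.get(words[0], set()))
    for word in words[1:]:
        result &= index.get(word, set())
    return sorted(result)
```

The index has to be rebuilt or extended when new reports are appended, but each lookup is then a dictionary access rather than a scan of the whole document.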

Similar Messages

Searching for links across multiple pdf files

We have thousands of PDF files that are being moved to a new website. An unknown number of these files contain links (either as plain text or as hyperlinks).
The issue is how to programmatically search across multiple PDF files (numbering in the thousands) for links, using a regular expression or part of a path. The search has to look behind the visible text and match the underlying link URL.
We first need to identify the number of files with links and create a list of the files with links that need modifying. If the number is too great to modify manually, then we would need the ability to programmatically edit these links.
The pdf files are stored in a database. Also, the pdf files are different versions and some are password protected.
Is there an Adobe product that will perform this? If not, are there any 3rd party vendor products that will accomplish this?
Thanks in advance for your help.

I have no solution, but a thought: the database factor may seem to be
a killer. But you could look for a solution designed to read PDF files
from a web site (by spidering or from a list), which would presumably
load them.
Or you could do a one-off extraction of the files from the database into a
directory and use that for your process. Probably a very good idea,
since extracting all files from the database is likely to be costly
and hammer the server (but can be scheduled at a sensible pace), while
the search process will (if it is possible at all) doubtless need to
be run countless times.
Aandi Inston
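
Once the files have been extracted into a directory, a first pass to list the files that contain links could look like the sketch below. It is written in Python and relies on the fact that hyperlink actions are stored in a PDF as `/URI (…)` entries. It only sees entries that are stored uncompressed and unencrypted, so password-protected files and files that use compressed object streams will be missed. `find_links` and `scan_directory` are hypothetical helper names.

```python
import re
from pathlib import Path

# Matches uncompressed URI actions such as: /URI (http://example.com/page)
# Backslash-escaped characters inside the parentheses are allowed.
URI_PATTERN = re.compile(rb"/URI\s*\(((?:\\.|[^\\)])*)\)", re.DOTALL)


def find_links(pdf_bytes, path_fragment=""):
    """Return the link URLs in raw PDF bytes that contain path_fragment."""
    links = []
    for match in URI_PATTERN.finditer(pdf_bytes):
        url = match.group(1).decode("latin-1")
        if path_fragment in url:
            links.append(url)
    return links


def scan_directory(root, path_fragment=""):
    """Map each PDF path under root to its matching links, skipping files with none."""
    report = {}
    for pdf_path in sorted(Path(root).rglob("*.pdf")):
        links = find_links(pdf_path.read_bytes(), path_fragment)
        if links:
            report[str(pdf_path)] = links
    return report
```

The resulting dictionary gives both the count of affected files and the list that needs modifying. A real PDF library would be needed to cover the encrypted and compressed cases.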

Search for content within a PDF

Is it possible to search for content within a PDF across a mapped drive?

Yes. Use Edit > Advanced Search. Choose "All PDF Documents in" and select the mapped drive.
They have to be searchable PDFs, of course...

How to invoke alt-text for images in a PDF file by Automation

Hi,
Can anyone help me?
How do I invoke alt-text for images in a PDF file using a script?
Thanks for looking into this.
Regards,
Sudhakar

What do you mean "invoke" alt-text? If Alt-text is there, then it will be presented to a screen reader.

How do I import my edge animation in indesign so that it can play within a PDF file

How do I import my Edge animation into InDesign so that it can play within a PDF file?
Please help.

Hi there,
Have you tried reading this tutorial on exporting Edge Animate files to be used in InDesign?
http://www.adobe.com/devnet/digitalpublishingsuite/articles/enhancing-your-dps-folios-with-edge-animations.edu.html
Edge Animate files can be played in DPS folios. PDFs are not capable of playing Edge Animate files at this time. I did find a work around to get SWF files to play in PDF documents:
http://indesignsecrets.com/how-to-get-animations-to-work-in-pdf-working-title.php
I hope this helps!

Preview asking for user password on PDF file. There is no password. Anyone have any suggestions?

Preview asking for user password on PDF file. There is no password. Anyone have any suggestions?

OK. Open your PDF with Preview and at the top of the window (near the file's title) you will see Locked in grey.
Click on it and in the menu that appears click Unlock.
It looks like this in OS X 10.7.3 (screenshot not preserved).
For information you can view the permissions of a PDF file in Preview by choosing in the menu bar Tools > Inspector, and then click Encryption (the lock).
Hope this will help.

Best method for encrypting/decrypting large XML files ( 100MB)

I need to encrypt large XML part files that can reach 100 MB or more.
I found some articles and code, but the only example I got working used XMLCipher, which takes a Document, parses it, and then encrypts it.
Obviously, 100 MB files do not cooperate well with DOM, so I want to find a better method for encrypting and decrypting these files.
I found some articles using CipherInputStream and CipherOutputStream, but I am not clear whether this is the way to go and whether it will avoid memory errors.
import java.io.*;
import java.security.spec.AlgorithmParameterSpec;
import javax.crypto.*;
import javax.crypto.spec.IvParameterSpec;
public class DesEncrypter {
    Cipher ecipher;
    Cipher dcipher;
    public DesEncrypter(SecretKey key) {
        // Create an 8-byte initialization vector
        byte[] iv = new byte[]{
            (byte)0x8E, 0x12, 0x39, (byte)0x9C,
            0x07, 0x72, 0x6F, 0x5A
        };
        AlgorithmParameterSpec paramSpec = new IvParameterSpec(iv);
        try {
            ecipher = Cipher.getInstance("DES/CBC/PKCS5Padding");
            dcipher = Cipher.getInstance("DES/CBC/PKCS5Padding");
            // CBC requires an initialization vector
            ecipher.init(Cipher.ENCRYPT_MODE, key, paramSpec);
            dcipher.init(Cipher.DECRYPT_MODE, key, paramSpec);
        } catch (java.security.InvalidAlgorithmParameterException e) {
        } catch (javax.crypto.NoSuchPaddingException e) {
        } catch (java.security.NoSuchAlgorithmException e) {
        } catch (java.security.InvalidKeyException e) {
            // Exceptions are silently ignored, as in the example I found
        }
    }
    // Buffer used to transport the bytes from one stream to another
    byte[] buf = new byte[1024];
    public void encrypt(InputStream in, OutputStream out) {
        try {
            // Bytes written to out will be encrypted
            out = new CipherOutputStream(out, ecipher);
            // Read in the cleartext bytes and write to out to encrypt
            int numRead = 0;
            while ((numRead = in.read(buf)) >= 0) {
                out.write(buf, 0, numRead);
            }
            out.close();
        } catch (java.io.IOException e) {
        }
    }
    public void decrypt(InputStream in, OutputStream out) {
        try {
            // Bytes read from in will be decrypted
            in = new CipherInputStream(in, dcipher);
            // Read in the decrypted bytes and write the cleartext to out
            int numRead = 0;
            while ((numRead = in.read(buf)) >= 0) {
                out.write(buf, 0, numRead);
            }
            out.close();
        } catch (java.io.IOException e) {
        }
    }
}
This looks like it might fit, but there is one more twist: I am using a persistence manager and XML encoding to accomplish that, so I am not sure how (or where) to implement this method without affecting persistence.
Any guidance on what would work best in this situation would be appreciated.
Regards,
vbplayr2000

I can give some general guidelines that might help, having done much similar work:
You have 2 different issues, at least from my reading of your problem:
1) How to deal with large XML docs that most parsers will not handle without memory issues
2) Where to hide or "black box" the encrypt/decrypt routines
#1: Check into XPP3/XMLPull. Yes, it's different from the other XML parsers you are used to, and more work is involved, but it is blazing fast and can parse a stream as it is being read. You can populate beans and process as needed, since there is really not much "inversion of control" involved compared to parsers that run through the entire document or load it all into memory.
#2: Extend Serializable and write your own readObject/writeObject methods. Place the encrypt/decrypt in there as appropriate. That will "hide" the implementation and should be what any persistence manager can deal with.
Regards,
antarti
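
To illustrate the pull-parsing idea from point #1: the sketch below uses Python's standard `xml.etree.ElementTree.iterparse` as a stand-in for XPP3/XMLPull (it is not the XPP3 API). It handles one element at a time while the stream is read and then clears it, so the whole document is never held in memory as a tree of fully populated elements. The `part` tag and the helper name `iter_records` are assumptions made up for the example.

```python
import io
import xml.etree.ElementTree as ET


def iter_records(stream, tag):
    """Yield (attributes, text) for each element named tag as the stream is parsed."""
    for _event, elem in ET.iterparse(stream, events=("end",)):
        if elem.tag == tag:
            yield dict(elem.attrib), (elem.text or "").strip()
            # Drop the element's contents so processed records do not accumulate.
            elem.clear()


if __name__ == "__main__":
    xml_bytes = b"<parts><part id='1'>bolt</part><part id='2'>nut</part></parts>"
    for attributes, text in iter_records(io.BytesIO(xml_bytes), "part"):
        print(attributes["id"], text)
```

The same stream could come from a decrypting input stream, which is what makes the streaming cipher and the streaming parser fit together.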

Performing a search within multiple .as files

In Flash CS4 is it possible to perform a search within
multiple .as files, or an entire project?
thanks!

I don't know if that's possible in CS4, but you can search
using the OS searching tools:
I'm not sure about Mac OS, but on Windows, you can click on
Start, Search, and under the Containing Text field, you can type in
'trace' (without quotes). Then click on the drop-down box under
'Look In:' and select Browse...
Browse to the location of your project and select it, then
click the Search Now button to begin searching through the files
there.
It will at least tell you which file(s) have trace in them.
Once you know that, then you can search through the file using CS4
to find the actual command. It would be much more convenient to
have that as a feature in CS4, but I don't know if it is.
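
The same search can be scripted so that it reports the line as well as the file. This is a minimal sketch in Python, assuming the project is an ordinary directory tree of text files; `search_project` is a hypothetical helper name and has nothing to do with Flash CS4 itself.

```python
from pathlib import Path


def search_project(root, needle, pattern="*.as"):
    """Return (path, line number, line) for every line containing needle."""
    hits = []
    for path in sorted(Path(root).rglob(pattern)):
        text = path.read_text(encoding="utf-8", errors="replace")
        for line_number, line in enumerate(text.splitlines(), start=1):
            if needle in line:
                hits.append((str(path), line_number, line.strip()))
    return hits
```

For example, `search_project("path/to/project", "trace")` lists every `trace` call in the `.as` files under that folder, including subfolders.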

Will not search for a word in pdf file. Using Windows 7

Like most I have used Acrobat reader for viewing PDF files for years. I recently upgraded to Windows 7 from XP. I have tried to search a PDF file for a word and it always comes up with none found even though I search for a word that is displayed in the document I'm looking at. Any advice?
Regards,
Dwight

If the PDF is not a scanned image, can you share it with us: https://forums.adobe.com/thread/1070933

Programmatically search for a word in PDF file

In the program I am developing, I open a PDF file from the application. Is there any way to programmatically search for a particular word in the PDF file and move to the page containing the first occurrence of that word? I am using VC++ to develop the application.
Any guidance is appreciated and thanks in advance.

Thanks for your reply, Leonard. I am not using any library now. Currently I am just opening the PDF file in Adobe Acrobat Reader using the ShellExecute() API, passing the PDF filename. Now I want to open the PDF file in Adobe Acrobat Reader and move automatically to the page containing a particular text. Is that possible through command-line arguments to AcroRd32.exe, something like AcroRd32.exe <filename.pdf> search:<wordtosearch>, or do I have to use a third-party library to do this?
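
Adobe's "PDF Open Parameters" document describes a `/A` command-line switch that takes open parameters, including `search=`. The sketch below only builds such a command line, in Python for brevity. Whether a given Reader version actually jumps to the first match should be treated as an assumption to verify, and `build_reader_command` is a hypothetical helper name.

```python
import subprocess


def build_reader_command(reader_exe, pdf_path, word):
    """Build an argument list that asks Reader to open pdf_path and search for word."""
    # Keep to a single plain word; multi-word lists need extra quoting rules.
    if not word or any(ch.isspace() for ch in word) or "&" in word or '"' in word:
        raise ValueError("word must be a single term without spaces, '&' or quotes")
    return [reader_exe, "/A", "search=" + word, pdf_path]


if __name__ == "__main__":
    # Both paths are placeholders; adjust them to the local installation.
    command = build_reader_command(r"C:\path\to\AcroRd32.exe", r"C:\reports\all.pdf", "invoice")
    subprocess.run(command, check=False)
```

From VC++ the same two pieces (the `/A "search=word"` parameter string and the file name) would be passed to ShellExecute() in place of the bare file name.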

How to search for text inside multiple pdf file at once in ipad

Hi
I am a student and I need to search for a word, subject, or sentence across all my PDF files. I have tested some applications such as iBooks, iPDF, Adobe Reader, GoodReader, and others, but couldn't find what I need.
Please kindly help me to find the best application.
Thanks

I found an article that claims PDF Expert does exactly what you are asking.
Article: http://www.imore.com/pdf-expert-ipad-brings-full-text-search-pdf-library
PDF Expert: https://itunes.apple.com/us/app/pdf-expert-fill-forms-annotate/id393316844?mt=8

Search String in PDF file - MAC Apple Script

I want to write a script for searching for a string in a PDF file.
I started with AppleScript and have spent the last 2 days trying to eliminate the error in the script.
set Datei to choose file
tell application "Adobe Acrobat Professional"
open Datei
activate
find text document 1 string "01.07.08" -- search the opened document, not the file alias
end tell
The script stops at the find text line.
The script should load a PDF file and activate it, and then I want it to search for "01.07.08" and highlight it.
Please help me.
Thanks michael

Right, I think I've got on to something here:
1. I must not have been able to search this particular pdf document in the past, despite what I think I remember.
2. When I select the Yamaha TDM pdf in Finder and show the Inspector window (CMD-opt-I) I see under "More Info" that Security Method is 'Password Encrypted'. In fact, try to select and copy some of the text: you can't. BUT: print the whole file to a new PDF, then save that new pdf file and use it for searching, ... ta-da! it works!
I was a bit disappointed in ColorSync's inability to open the file and then Save it As... In the Tiger days, I had used this work-around: I took a password encrypted pdf document, opened it in ColorSync utility (in Applications/Utilities) and saved it under a new filename somewhere on the disk, and this process sort of neglected to cloak it up in its password-protection.
Oh well, I suppose I can use the Print to PDF method which I describe just here to achieve the same goal. Until, that is, the big-wigs in the publishing companies and Apple are alerted to this fact and strip the 'functionality' from the print process.
Thanks for posting.

Search text in PDF file

I would like to search for text in a PDF file through Java (VJ++). Is it possible through java.io? I'm getting junk text.
I also tried adding a COM wrapper through VJ++, but the file does not get loaded. Any examples?
Thank you

any ideas
searching for PDF
help required
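
On the "junk text" point: the page content inside a PDF is usually stored in Flate-compressed streams, so reading the file as plain bytes shows compressed data rather than words. Below is a rough sketch in Python (not Java) that inflates those streams and looks for a string. It assumes an unencrypted file and simple single-byte text encoding. Many real PDFs split or re-encode text in ways this will not catch, so a proper PDF library is the dependable route. `inflate_streams` and `contains_text` are hypothetical names.

```python
import re
import zlib

# Raw bytes between the "stream" and "endstream" keywords of each PDF object.
STREAM_PATTERN = re.compile(rb"stream\r?\n(.*?)\r?\nendstream", re.DOTALL)


def inflate_streams(pdf_bytes):
    """Yield the decompressed contents of every Flate-compressed stream."""
    for match in STREAM_PATTERN.finditer(pdf_bytes):
        try:
            yield zlib.decompressobj().decompress(match.group(1))
        except zlib.error:
            # Not Flate data (for example an image with another filter, or encrypted).
            continue


def contains_text(pdf_bytes, needle):
    """Return True if needle appears literally in any inflated stream."""
    target = needle.encode("latin-1")
    return any(target in data for data in inflate_streams(pdf_bytes))
```

The same two steps (locate the streams, inflate them) can be done in Java with `java.util.zip.Inflater`.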

How to search text in pdf file?

Hi all
I have to store the cover of a newspaper, which includes images and text, and then be able to search for keywords in the cover.
I've read about storing it in PDF format and using interMedia Text.
I am just wondering how to store it and how to do the search.
Thanks all

Hi,
You need to store the PDF document in a BLOB column and create a CTXSYS.CONTEXT index on it.
e.g. (.doc files):
CREATE INDEX I_DOC ON DOC_TABLE (DOC_COLUMN) INDEXTYPE IS CTXSYS.CONTEXT PARAMETERS ('SYNC (ON COMMIT)');
Then you can test it with the SQL below:
select score(1) from DOC_TABLE where contains(DOC_COLUMN, 'My text', 1) > 0;
In my case, I use this index to search Word documents (.doc).
Maybe this link will help you create an index using FILTERS, in order to search PDF files:
http://www.oracle.com/technology/products/text/htdocs/altfilters.htm
Cheers

What are methods for converting otf to pdf format in sap script

Hi,
I have a requirement in a SAPscript: I have to convert the OTF file to PDF format. When I use function modules, the PDF file gets corrupted, so I want to convert OTF to PDF using a class method. Can anyone help with that, ideally with sample code for the class method?
Thanks.

ok
CALL FUNCTION 'CONVERT_OTF'
 EXPORTING
 format = 'PDF'
* max_linewidth = 255
 IMPORTING
 bin_filesize = lv_bin_filesize
* bin_file = pdf_xstring
 TABLES
 otf = lt_otf
 lines = lt_pdf_table
 EXCEPTIONS
 err_max_linewidth = 1
 err_format = 2
 err_conv_not_possible = 3
 err_bad_otf = 4
 OTHERS = 5.
CALL FUNCTION 'GUI_DOWNLOAD'
 EXPORTING
 bin_filesize = lv_bin_filesize
 filename = c_name
 filetype = 'BIN'
* APPEND = ' '
* WRITE_FIELD_SEPARATOR = ' '
* HEADER = '00'
* TRUNC_TRAILING_BLANKS = ' '
* WRITE_LF = 'X'
* COL_SELECT = ' '
* COL_SELECT_MASK = ' '
* DAT_MODE = ' '
* CONFIRM_OVERWRITE = ' '
* NO_AUTH_CHECK = ' '
* CODEPAGE = ' '
* IGNORE_CERR = ABAP_TRUE
* REPLACEMENT = '#'
* WRITE_BOM = ' '
* TRUNC_TRAILING_BLANKS_EOL = 'X'
* WK1_N_FORMAT = ' '
* WK1_N_SIZE = ' '
* WK1_T_FORMAT = ' '
* WK1_T_SIZE = ' '
* WRITE_LF_AFTER_LAST_LINE = ABAP_TRUE
* SHOW_TRANSFER_STATUS = ABAP_TRUE
* VIRUS_SCAN_PROFILE = '/SCET/GUI_DOWNLOAD'
* IMPORTING
* FILELENGTH =
 TABLES
 data_tab = lt_pdf_table
* FIELDNAMES =
 EXCEPTIONS
 file_write_error = 1
 no_batch = 2
 gui_refuse_filetransfer = 3
 invalid_type = 4
 no_authority = 5
 unknown_error = 6
 header_not_allowed = 7
 separator_not_allowed = 8
 filesize_not_allowed = 9
 header_too_long = 10
 dp_error_create = 11
 dp_error_send = 12
 dp_error_write = 13
 unknown_dp_error = 14
 access_denied = 15
 dp_out_of_memory = 16
 disk_full = 17
 dp_timeout = 18
 file_not_found = 19
 dataprovider_exception = 20
 control_flush_error = 21
 OTHERS = 22.
 IF sy-subrc <> 0.
* MESSAGE ID SY-MSGID TYPE SY-MSGTY NUMBER SY-MSGNO
* WITH SY-MSGV1 SY-MSGV2 SY-MSGV3 SY-MSGV4.
 ENDIF.
TRY.
 GET PARAMETER ID 'RECEIPTENT' FIELD lvs_recipient1.
 send_request = cl_bcs=>create_persistent( ).
* lt_attach_bin = cl_document_bcs=>xstring_to_solix( ip_xstring = lt_solix ).
 APPEND 'Test message' TO lt_text.
 l_sub_50 = lc_test1.
 document = cl_document_bcs=>create_document( i_type = 'RAW'
 i_text = lt_text
 i_subject = l_sub_50 ).
 document->add_attachment( i_attachment_type = 'PDF'
 i_attachment_subject = 'script.pdf'
 i_att_content_hex = lt_attach_bin ).
 l_sub_line = lc_test1.
 TRY.
* Build subject line for email.
 CALL METHOD send_request->set_message_subject
 EXPORTING
 ip_subject = l_sub_line.
 CATCH cx_send_req_bcs INTO loref_obj_error.
 PERFORM sub_catch_error1 USING loref_obj_error.
 ENDTRY.
 send_request->set_document( document ).
* sender = cl_cam_address_bcs=>create_internet_address( '[email protected]' ).
 recipient1 = lvs_recipient1-objkey.
 recipient = cl_cam_address_bcs=>create_internet_address( '[email protected]' ).
* send_request->set_sender( sender ).
 send_request->add_recipient( i_recipient = recipient
 i_express = 'X' ).
 sent_to_all = send_request->send( i_with_error_screen = 'X' ).
 COMMIT WORK.
 CATCH cx_bcs INTO bcs_exception.
 MESSAGE 'eee' TYPE 'S'.
 EXIT.
 ENDTRY.
