Help renaming pdf files based on internal content

I work for a company that has thousands of e-tickets coming in daily, weekly, monthly, etc.
These tickets arrive with names like bafhakfbaifh.pdf, and we have to rename them manually, or print them all out, sort through them, and put them in order.
What I would like to do is:
1. Split any PDFs that have more than one page (or "ticket", in my case). I know how to do this easily with Automator, but I'd love to keep it all in one program.
2. Search the file for Event Name (i.e. Madonna)
3. Search the file for Date of event (August 12, 2012)
4. Search the file for Section, Row, Seat (124 3 12)
5. Rename the file based on content found (Madonna August 12 2012 124 3 12.pdf)
6. Move from original download folder to organized folders based on artist/team.
7. Automatically print in alphabetical or some sort of designated order.
Any help is much appreciated. So far, I have found a PC program called A-PDF rename, but it is not automated enough to be practical. Hazel is awesome at OCRing the PDF and moving it from folder to folder, but it does not do enough.
Thank you.

You're looking for a PDF parser or PDF mining tool (PDFMiner, for example) as a starting framework, and you'll almost certainly be writing custom code around it. Parsing text that's effectively free-form and that originates from multiple different sources almost always (always?) involves customized processing code, plus an ongoing series of tweaks as the suppliers of the PDFs change their ticket formats.  (Even apparently simple details such as time and date formats can vary by geography, language, and supplier, and can derail common processing.)
In some cases that I can envision, it's entirely possible that the data you're after is located in an embedded image and not in text that can be parsed.
The best approach is to get folks to send you JSON or XML or some other format intended for interchange, and avoid the whole mess that is parsing or mining a printer-oriented format.
The other obvious option is to use something like Amazon's Mechanical Turk or some other explicitly outsourced help.  Depending on how often the formats change and how many of these PDF files you're dealing with and how varied the formats are, sometimes throwing staff at the problem can be the most cost-effective approach.
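
As a rough illustration of the custom processing code described above, here is a minimal JavaScript sketch. It assumes the text of one ticket has already been extracted from the PDF (for example with a PDF mining tool) and that the ticket uses labelled lines such as "Event:", "Date:", "Section:", "Row:" and "Seat:". Those labels, the names `FIELD_PATTERNS`, `parseTicket` and `buildFileName`, and the sample text are all hypothetical; real tickets from each supplier will need their own patterns.

```javascript
// Hypothetical field labels; every supplier's ticket layout will differ.
var FIELD_PATTERNS = {
  event: /^Event:[ \t]*(.+)$/mi,
  date: /^Date:[ \t]*(.+)$/mi,
  section: /^Section:[ \t]*(\S+)/mi,
  row: /^Row:[ \t]*(\S+)/mi,
  seat: /^Seat:[ \t]*(\S+)/mi
};

// Pull each field out of the extracted text.
// Returns null if any field is missing, so the file can be set aside for manual handling.
function parseTicket(text) {
  var ticket = {};
  for (var field in FIELD_PATTERNS) {
    var match = text.match(FIELD_PATTERNS[field]);
    if (!match) return null;
    ticket[field] = match[1].trim();
  }
  return ticket;
}

// Join the fields into a file name, dropping characters that are unsafe in file names.
function buildFileName(ticket) {
  var raw = [ticket.event, ticket.date, ticket.section, ticket.row, ticket.seat].join(" ");
  var safe = raw.replace(/[\/\\:*?"<>|,]/g, "").replace(/\s+/g, " ").trim();
  return safe + ".pdf";
}

// Hypothetical extracted text for a single ticket.
var sample = "Event: Madonna\nDate: August 12, 2012\nSection: 124\nRow: 3\nSeat: 12";
console.log(buildFileName(parseTicket(sample))); // Madonna August 12 2012 124 3 12.pdf
```

Returning null for an unrecognised layout is a deliberate choice here: when a supplier changes its format, the file is left for a person to look at rather than being renamed to something wrong.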

Similar Messages

  • Automatically renaming pdf files based on excel data

    I am creating PDF certificates using variable data from Excel files with InDesign.  This creates a multipage PDF file with a different person's name on each page.  The end result needs to be individual PDF files, each named for one person.  I can extract the single pages from the main PDF, but I end up with a set of files that all share the same name apart from a number at the end.  Question:  Is there an automated process for renaming the individual files using the data from the Excel file?


  • HowTo Rename file based on message content?

    Hi All,
    Newbie here.
    I have a message which I'm pushing out to a file via a File eWay.
    I need to name/rename the file based on the ID of the message.
    What is the best way to do this?
    Thanks,
    Ken

    I'm not aware of a way to do this via the File eWay, which is very limited in functionality. Of course, you could do it with standard java.io but that isn't very Java EE. So, I typically use the Batch eWay (Batch Local File) where more robust capabilities such as this are quite easy using the dynamic configuration features.

  • .PDF files in DVD ROM content

    I am using a Mac and Encore CS4. I am trying to include a .pdf file as DVD ROM content. Whenever I try this, all .pdf files are "grayed out" and I can't use them. Adobe Acrobat Reader 9 and Adobe Acrobat Pro are both installed on the computer. I can read .pdf files outside of Encore.
    Any help would be appreciated.

    To add to Ruud's comment. When adding DVD-ROM content (Build tab; about half way down), you cannot add files; you must add folders. But you can have anything you want inside the folder.
    If you really want the file at the root of the DVD (i.e. not in a folder), build from Encore to a DVD folder without the file(s). Then use ImgBurn, add the Video_TS folder and any other files, then burn. I've never done this before, so I did a quick test (but I am CS3). Such a disk plays as a DVD in a set top DVD player and on a PC through wmp, and, on the PC, I can open the file from the disk.

  • How to Extract Data from the PDF file to an internal table.

    Hi friends,
    How can I extract data from a PDF file into an internal table?
    Thanks in Advance
    Shankar

    Shankar,
    Have a look at these threads:-
    extracting the data from pdf  file to internal table in abap
    Adobe Form (data extraction error)
    Chintan

  • Sorting incoming files based on payload content

    Hi,
    I have a File > XI > File scenario. I need to sort the incoming file based on payload content, but my understanding is that in XI each line item is processed one at a time. Is it possible to sort the entire file before processing begins?
    -Ken

    Hi Ken,
    you can try to create generic content sorting with java mapping:
    /people/ravikumar.allampallam/blog/2005/06/24/convert-any-flat-file-to-any-idoc-java-mapping
    Regards,
    michal

  • Exporting to Separate PDF files based on Group

    Post Author: Tanya Sherin
    CA Forum: Exporting
    Hello,
    I have a report that needs to be exported into separate PDF files based on one of the groups already established in the report. I would like to automate this process as much as possible because of the report size. Has anyone encountered this need?
    Thanks for your assistance.
    Regards,
    Tanya

    Post Author: synapsevampire
    CA Forum: Exporting
    You'll need CR XI or a third party solution.
    Here's one I suggest:
    http://www.milletsoftware.com/Visual_CUT.htm
    Contact Ido (owner) for a free trial and confirmation that it meets your needs.
    -k

  • Browser based AIR Help and PDF Files

    Hi everyone,
    I am so glad to have a forum like this, and I hope that sharing your experiences will help me a little further.
    Just recently I tried to generate a browser based AIR Help in combination with merged projects. By the way, Peter Grainge's tutorial was very helpful here, thanks Peter!
    My project also contained a single PDF file in a baggage container. After I had published my file, I noticed that it was not possible to open the PDF in any way.
    So I tried the same using the Adobe AIR Help Application, and there the PDF worked. Now I am wondering whether I did something wrong or whether I need to be aware of something.
    Personally I don't mind using the Adobe AIR Help Application. The only disadvantage I discovered, besides the local installation, is that the document spacing seemed a little odd: there was no space between the border and the content. With every other single source output there is, including the browser based AIR Help.
    I didn't find any similar problems on this forum, but maybe any of you has a little hint for me.
    Thanks and greetings from Germany,
    Christian

    PDF should work from browser based AIR Help. Try a different file.
    See the AIR topics on my site about the margins. In short, you create a copy of your CSS and just change the body tag margins in Notepad. Use the ordinary CSS when you are working or creating other outputs, use myproject_AIR.css when you generate the help.
    See www.grainge.org for RoboHelp and Authoring tips
    @petergrainge

  • Can I create a custom table of contents and link to other .pdf files based on responses to a form?

    Hey Everyone! First post ever, so bear with me:
    I'm trying to create a streamlined method to use a form  to let myself and others add information and select certain options to put together a custom table of contents. Basically, I would like to have a form with a series of text fill and single/multiple choice options that will automatically populate a table of contents based on the selections and will link to other .pdf files that are associated with the selections. I was hoping this would be possible with a form, but I'm relatively new to the function of the software as a whole and my research came up short. Any suggestions on how to start are more than welcome, and if I wasn't quite clear enough I would be happy to elaborate.
    Thanks for your time!

    You would need to search for other PDF creation software that can accomplish what you desire.
    There are many cheaper  PDF creation alternatives other than Adobe's Acrobat Pro software.
    Also, try doing a web search under these terms to see if you can find an app/software/solution that may work for you.
    How to create table of contents in PDF files

  • Can I rename a PDF file based on the metadata? (prefer a bulk rename)

    I had to do a file recovery on a hard drive with a program called TestDisk. It recovered 30GB+ of files, but it recovered the data by ignoring the partition/partition table, so it used its own naming structure during the recovery. The names of the files are f_______.XXX. I noticed that when I hover over the PDF files, the original file name is displayed. I looked in the file properties and noticed the PDF tab, which contains this name. Does anyone know if I can rename the file(s) using this data? The summary tab in the file properties is blank, but the file name is located under this PDF tab (I didn't even know this existed). I don't know what I'm going to do with the rest of the files, but being able to bulk rename 8GB of PDFs would be great!  :-)
    Thanks,
    Sundiata

    OK, I'm not into programming and I'm not familiar with JavaScript. Does anyone know of any websites that could help me write this batch file, or at least provide enough basic JavaScript information to get me started so that I can figure this out? Or is this JavaScript something that is created automatically when the renaming process is set up?
    Thank

  • Is there a way to batch rename PDFs in a folder with content from the file? i.e. the title field

    Hi, I have a load of documents in a folder. Most have names like 2334.pdf, 3645.pdf etc. Does anyone know of a way to automatically get the title from within each file and rename the file with that title, i.e. doc title.pdf?
    I found a product from a-odf.com that claims to do this, but I was hoping to use adobe's product.
    Thanks in advance
    Nelson

    There is no "out of box" tool per se that does this type of task, but you could probably script it with an Action that uses JavaScript.  If you are not familiar with Acrobat JavaScript then you will need to find someone who is familiar with it to assist you in building it.
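
As a starting point for such a script, the part that turns a document title into a safe file name can be written as plain JavaScript. This is only a sketch: the helper name `titleToFileName` and the fallback behaviour are hypothetical. In an Acrobat Action the title would typically be read from the document's `info.Title` property, and the actual save or rename step is not shown here.

```javascript
// Turn a PDF title into a file name.
// If the title is empty, keep the existing name (fallbackName) instead.
function titleToFileName(title, fallbackName) {
  var cleaned = (title || "")
    .replace(/[\/\\:*?"<>|]/g, " ") // characters that are not allowed in file names
    .replace(/\s+/g, " ")            // collapse runs of whitespace
    .trim();
  if (cleaned === "") return fallbackName;
  return cleaned + ".pdf";
}

console.log(titleToFileName("Annual Report: 2011/2012", "2334.pdf")); // Annual Report 2011 2012.pdf
console.log(titleToFileName("", "3645.pdf"));                         // 3645.pdf
```

Keeping the original name when the title field is empty avoids ending up with several files all called ".pdf".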

  • Print Series of PDF Files Based on Table of Contents PDF

    Hello,
    Background: Procedure "manual" that is composed of a table of contents PDF (TOC). Each entry in the TOC has a link to a separate PDF file. When a user (in Reader) selects a link, the appropriate section of the manual opens, i.e. the correct PDF file opens. They can then print that particular file.
    My question: Is there a script that would allow my user to print all of the manual, that is, all of the PDF files linked to the TOC, in the same order, using a menu or button on the TOC PDF?
    Thanks,
    Tom

    I have developed a script that can do just that, but with locally saved PDF
    files:
    http://try67.blogspot.com/2009/10/combine-pdf-files-from-text-list.html
    It might be possible to adjust it to do what you describe. If you're
    interested, contact me personally by PM or at try6767 at gmail dot com.

  • PDF File from an internal table data

    I have an internal table ready for grid display of a report. But according to the new requirement, it shouldn't create a spool request, as that affects the performance of the system (because it's a huge file with 3000+ pages).
    So now we want to create a (non-editable) PDF file and store it on an application server.
    From the forum, I see that I have only one option, which is to convert the internal table data to OTF format and then to PDF.
    Right now I am using PRINT_TEXT to convert to an OTF file and FM CONVERT_OTF for the PDF file.
    The challenges I am facing right now are:
    1. After getting the OTF file from PRINT_TEXT, I couldn't get a correct PDF file. It shows random characters instead.
    2. My internal table has 103 columns. So how should I accommodate data from all these columns in a single row of a PDF?
    3. I couldn't create a table-like view/display in the PDF.
    Please help/advice me on the above issue.
    Thanks,
    Sarada

    Hi,
    103 columns need to be spread over at least 7 pages. They can't be displayed on just 1 page.
    I guess it would be better to create a Smart Form and pass all the information from the table to it.
    There you can create multiple pages.
    That huge table probably needs to be separated into 7 tables, so that each page gets its information from one table.
    I would try to do something like that.
    Regards
    Miguel

  • I need help renaming a file using regular expressions in Bridge.

    Hi,
    I work at a university, and we are working through files for our Thesis and Dissertations. We have been renaming them to make them more consistent. I am just wondering if there is a regular expression that could help with this process?
    Here are some examples of current file names:
    THESIS 1981 H343G
    Thesis 1981 g996e
    THESIS-1981-A543G
    I don't need to change the actual names of the files. just how they are formatted.
    Proper case on Thesis.
    Hyphens(-) in all white space.
    First letter capital, last letter lowercase on the call no (H343g)
    So the list above should look like;
    Thesis-1981-H343g
    Thesis-1981-G996e
    Thesis-1981-A543g
    I have seen people do some pretty cool things with regular expressions! Any help would be greatly appreciated. Thanks!

    You would be better off using a script to do this, as I don't think it would be possible with the Bridge rename. Here is an example.
    Using ExtendScript Toolkit or a plain text editor, copy the code into either and save it as Filename.jsx
    This needs to be saved into the correct folder. You can find it by going to the preferences in Bridge and selecting Startup Scripts; this will open the folder where the script is to be saved.
    Once this is done, close and restart Bridge.
    To use: go to the Tools menu and select Rename PDFs
    Make sure you test the code first with a few files copied into a separate folder, to make sure it does what you want.
    The script will process all PDF files in the selected folder.
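    #target bridge
    if( BridgeTalk.appName == "bridge" ) {
        // Add a "Rename PDFs" command at the end of the Tools menu.
        renamePDFs = MenuElement.create("command", "Rename PDFs", "at the end of Tools");
        renamePDFs.onSelect = function () {
            // Clear the selection so that every PDF in the current folder is processed.
            app.document.deselectAll();
            var thumbs = app.document.getSelection("pdf");
            for( var z in thumbs ){
                var Name = decodeURI(thumbs[z].spec.name);
                // Lowercase, hyphenate whitespace, then split into word, year, call no. and extension.
                var parts = Name.toLowerCase().replace(/\s/g,'-').match(/(.*)(-)(.*)(-)(.*)(\.pdf)/);
                if( parts == null ) continue; // skip names that do not fit the pattern
                // Capitalise the first letter of "thesis".
                var NewName = parts[1].replace(/^[a-z]/, function(s){ return s.toUpperCase() });
                // Upper-case the call number, then lower-case its final letter.
                NewName += parts[2]+parts[3]+parts[4]+parts[5].toUpperCase().replace(/[A-Z]$/, function(s){ return s.toLowerCase() });
                NewName += parts[6];
                thumbs[z].spec.rename(NewName);
            }
        };
    }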
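
The renaming rule used by the script above can also be tried outside Bridge before touching any real files. The following is a minimal sketch in plain JavaScript, not part of the original script: the function name `normalizeThesisName` is hypothetical, and it works on names without the .pdf extension, such as the examples in the question.

```javascript
// Normalise a name such as "THESIS 1981 H343G" to "Thesis-1981-H343g".
// Returns null when the name does not consist of exactly three parts.
function normalizeThesisName(name) {
  var parts = name.trim().split(/[\s-]+/); // split on whitespace or hyphens
  if (parts.length !== 3) return null;
  // Proper case on the first word.
  var word = parts[0].charAt(0).toUpperCase() + parts[0].slice(1).toLowerCase();
  var year = parts[1];
  // Upper-case the call number, then lower-case a trailing letter.
  var callNo = parts[2].toUpperCase().replace(/[A-Z]$/, function (s) {
    return s.toLowerCase();
  });
  return [word, year, callNo].join("-");
}

console.log(normalizeThesisName("THESIS 1981 H343G")); // Thesis-1981-H343g
console.log(normalizeThesisName("Thesis 1981 g996e")); // Thesis-1981-G996e
console.log(normalizeThesisName("THESIS-1981-A543G")); // Thesis-1981-A543g
```

Splitting on whitespace or hyphens, rather than matching one long pattern, makes it easy to reject names that do not have exactly three parts.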

  • I downloaded Reader 11 and now I can't open any PDF files; I receive an internal error

    After downloading Reader 11, I cannot open any PDF files on my computer. I receive an internal error.

    Well, what computer?
    Mylenium
