AppleScript: Extract data from PDF

I am trying to extract a string from a PDF and then rename the PDF with that string. The string varies in length, but it always comes between "Name:" and "ID:".
Ideally, I could drop a multi-page PDF onto the script, and it would split out the individual pages and save each one as a new document named with this string.
From another thread, I've tried using this shell script (there, "Citing Patent" and "CLASSIFICATIONS" were the delimiters):
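for f in "$@"
do
     # Record the file name, then the text between the two delimiter lines.
     echo "$f" >> ~/Desktop/Patent01.txt
     # The second sed strips the closing delimiter; the original trailing "p" flag printed that line twice.
     sed -n '/Citing Patent/,/CLASSIFICATIONS/p' "$f" | sed 's/CLASSIFICATIONS//' >> ~/Desktop/Patent01.txt
done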
Thanks!
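
The string-matching part of this can be sketched separately from the PDF handling. The following is a minimal Python sketch, assuming the page text has already been extracted by some other step; the function name `extract_name` is made up for illustration.

```python
import re

# Assumption: the page text has already been extracted from the PDF
# by a separate PDF-to-text step; this sketch only handles the string.
NAME_PATTERN = re.compile(r"Name:\s*(.*?)\s*ID:", re.DOTALL)


def extract_name(page_text):
    """Return the text between "Name:" and "ID:", or None if it is absent."""
    match = NAME_PATTERN.search(page_text)
    if match is None:
        return None
    # Collapse line breaks and repeated spaces into single spaces.
    return " ".join(match.group(1).split())
```

The non-greedy `.*?` stops at the first "ID:" after "Name:", so the string can be any length.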

Replying to myself with progress; maybe someone can help.
I drop a single-page PDF onto the script. It calls an Automator workflow that converts the PDF to plain text, AppleScript then reads the text file for the data I want, and the last step is to rename the original file with the string that I want (a name).
I'm getting error -10006 (can't rename). Below is the script and a screenshot of the Automator workflow.
on open fileList
  tell application "Finder"
    --set thePDFfile to (choose file)
    repeat with thePDFfile in fileList
      set theInfo to info for thePDFfile
      set theFile to name of theInfo
      set qtdstartpath to quoted form of (POSIX path of thePDFfile)
      set workflowpath to "/Users/Galen/PDFextract/NoInput.workflow"
      set qtdworkflowpath to quoted form of (POSIX path of workflowpath)
      set command to "/usr/bin/automator -i " & qtdstartpath & " " & qtdworkflowpath
      set output to do shell script command
      --do shell script "automator /Users/Galen/PDFextract/NoInput.workflow"
      -- Split the extracted text into lines. The delimiter string was cut off in the post; a line break is assumed here.
      set AppleScript's text item delimiters to linefeed
      set thetext to text items of (read "/Users/Galen/Desktop/ExtractOutput.txt")
      set studentName to item 2 of thetext
      -- Split the second line into words.
      set AppleScript's text item delimiters to " "
      set thetext to text items of studentName
      set lastName to item 2 of thetext
      set firstName to item 3 of thetext
      set lastFirst to (lastName & " " & firstName)
      --return lastFirst
      set AppleScript's text item delimiters to ""
      -- theFile is text (the file's name), not a Finder item, which is likely why this line fails with error -10006.
      set the name of theFile to ((lastFirst as text) & ".pdf")
    end repeat
  end tell
end open
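
For comparison, the final rename step can be sketched outside of Finder. This is a hedged Python sketch, not a fix for the AppleScript above: the helper name `rename_pdf`, the character replacements, and the numbering of duplicates are all assumptions made for illustration.

```python
from pathlib import Path


def rename_pdf(pdf_path, new_stem):
    """Rename pdf_path to '<new_stem>.pdf' in the same folder; return the new path."""
    source = Path(pdf_path)
    # Assumption: "/" and ":" are replaced because they are awkward in macOS file names.
    safe_stem = new_stem.replace("/", "-").replace(":", "-").strip()
    target = source.with_name(safe_stem + ".pdf")
    if target == source:
        return source  # already has the desired name
    counter = 2
    # Avoid overwriting an existing file: "Smith John 2.pdf", "Smith John 3.pdf", ...
    while target.exists():
        target = source.with_name(f"{safe_stem} {counter}.pdf")
        counter += 1
    source.rename(target)
    return target
```

The point of the sketch is that the rename operates on the file's path, not on a string holding only its name.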

Similar Messages

  • Extracting data From PDF to Excel

    I have inherited a large library of PDF invoices from which I need to extract data into Excel or some other spreadsheet. The other option is to open thousands of PDF documents and run the numbers by hand, which is just dumb. I am new to Acrobat, and an entire afternoon of trial by fire and Google hasn't gotten me very far, so even pointers in the right direction are appreciated.
    Ideally, I would like to tell Acrobat what data is important on each document (can I use the form tool to do this?), extract the data from the relevant files (with the batch processing tool, I presume?), compile the data, and export it to a CSV.
    It looks like the functionality is there; I am just unsure how it all needs to fit together. Any suggestions?

    Hi,
    There is software out there that will convert PDFs to Excel; look for ABBYY or Able2Extract. If you have a lot of files with the same layout, merge them together before using the software. Remember that if the data comes from a scanned image, the results will only be as good as the OCR engine contained in the software. You can experiment with the software to create tables, etc.

  • Extract data from PDF

    Hello,
    I am using Adobe Acrobat Professional 6.0 to create a bunch of survey questionnaires for respondents to fill out using an off-line computer. I used check boxes and radio buttons to set up the forms and assign output values. However, I couldn't figure out how to export the response values into one single file (preferably .csv). Does anyone know how to make that happen? Thanks in advance.

    Thanks for the reply!
    So, how do I get data from fdf to csv?
    I have participants coming in to fill out the questionnaire in my lab, and we save their files. For example, the PDF file for participant #001 was saved as Questionnaire_001, the one for participant #002 as Questionnaire_002, and so on. If I have 50 participants, I will have 50 PDF files stored on the computer. This is the method used by the person who worked here before me, and he was somehow able to extract data from those saved files.
    I know that in Adobe Acrobat Professional 6.0 I can get an FDF file by going to Advanced > Forms > Export Forms Data. But how do I get a CSV file that has all 50 people's responses, with each column a response field (Q1, Q2, Q3, etc.) and each row a participant?
    Thanks a lot.
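
The target layout described above (one row per participant, one column per response field) can be sketched in Python. This sketch assumes the field values have already been read out of each exported file into a dictionary; it does not parse FDF itself, and the function name `responses_to_csv` and the `participant` column are made up for illustration.

```python
import csv
import io


def responses_to_csv(responses):
    """Build CSV text from {participant ID: {field name: value}}."""
    # Collect every field name seen, sorted so the column order is stable.
    fields = sorted({name for answers in responses.values() for name in answers})
    buffer = io.StringIO()
    writer = csv.DictWriter(buffer, fieldnames=["participant"] + fields, restval="")
    writer.writeheader()
    for participant in sorted(responses):
        row = {"participant": participant}
        row.update(responses[participant])
        # Fields a participant left blank are written as empty cells (restval).
        writer.writerow(row)
    return buffer.getvalue()
```

Calling it with two hypothetical participants, `{"001": {"Q1": "1", "Q2": "Yes"}, "002": {"Q1": "3"}}`, gives a header row followed by one row for each of them.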

  • Extract data from PDF to SAP

    Hi all,
    I have created an offline form in transaction SFP and emailed it successfully.
    The receiver has now sent the filled-in PDF form back to my Outlook ID (because my mail ID is configured in SMTP).
    Now I want to update a table with the values filled in on the received PDF.
    1) What steps should I follow now?
    2) What are guided procedures or workflow for?
    3) Do I have the option to receive the mail in my Business Workplace inbox instead of my personal mail ID?
    I went through all the related threads on this topic but could not get the idea.
    If someone knows, please advise.
    Thank you.
    Rgrds.
    Edited by: Deepa K on Feb 25, 2008 1:30 PM

    Hi,
    When you create an ABAP object based on the standard interface IF_INBOUND_EXIT_BCS, you get two methods.
    First, here are the attributes I define in my object; all are private instance attributes.
    XML_DOCUMENT type ref to IF_IXML_DOCUMENT,
    CONVERTER type ref to CL_ABAP_CONV_IN_CE,
    ATTACHEMENT_ATTRIBUTES type BCSS_DBPA,
    ATTACHEMENT_FILE type BCSS_DBPC ,
    BINARY_FILE Type XSTRING,
    FORMXML      Type STRING,
    PDF_FORM_DATA Type XSTRING ,
    XML_NODE Type Ref To IF_IXML_NODE,
    XML_NODE_VALUE Type STRING.
    Put this code in method CREATE_INSTANCE:
    * Check if the singleton instance has already
    * been created.
    IF instance is INITIAL.
      CREATE OBJECT instance.
    ENDIF.
    * Return the instance.
    ro_ref = instance.
    The other method is where the mail is processed.
    Here is sample code for method PROCESS_INBOUND:
    * Data definition :
      DATA : pdf_line    TYPE solix  .
      DATA : nb_att(10) TYPE n.
      DATA w_part TYPE int4 .
      FIELD-SYMBOLS : <pdf_line> TYPE solix.
    ** Set return code so no other Inbound Exit will be done.
      e_retcode = if_inbound_exit_bcs=>gc_terminate.
      TRY .
    * Get the email document that was sent.
          mail = io_sreq->get_document( ).
    * Get number of attachement in the mail
    * If number is lower than 2 that means no attachement to the mail
          nb_att = mail->get_body_part_count( ) - 1.
          CHECK nb_att GT 0.
          CLEAR w_part.
    * Process each document
          DO nb_att TIMES.
            w_part  = sy-index + 1 .
            CLEAR xml_document .
    * Get attachement attributes
            attachement_attributes =
               mail->get_body_part_attributes( im_part = w_part ).
            IF attachement_attributes-doc_type IS INITIAL.
              DATA w_pos TYPE i .
              FIND '.' IN attachement_attributes-filename
                IN CHARACTER MODE MATCH OFFSET w_pos.
              ADD 1 TO w_pos.
              attachement_attributes-doc_type =
                 attachement_attributes-filename+w_pos.
            ENDIF.
    * Get the attachement
            attachement_file = mail->get_body_part_content( w_part ).
    * If attachement is not a binary one ,
    * transform it to binary.
            IF attachement_attributes-binary IS INITIAL.
              CALL FUNCTION 'SO_SOLITAB_TO_SOLIXTAB'
                EXPORTING
                  ip_solitab  = attachement_file-cont_text
                IMPORTING
                  ep_solixtab = attachement_file-cont_hex.
            ENDIF.
    * Convert the attachement file into an xstring.
            CLEAR binary_file.
            LOOP AT attachement_file-cont_hex ASSIGNING <pdf_line>.
              CONCATENATE binary_file <pdf_line>-line
                 INTO binary_file IN BYTE MODE.
            ENDLOOP.
            TRANSLATE attachement_attributes-doc_type TO UPPER CASE.
    * Process the file depending on file extension
    * Only XML and PDF files are allowed
            CASE attachement_attributes-doc_type  .
              WHEN 'PDF'.
    * Process an interactive form
                me->process_pdf_file( ).
              WHEN 'XML'.
    * Process XML data
                me->process_xml_file( input_xstring = binary_file ).
              WHEN OTHERS.
    * Nothing to do , process next attachement
            ENDCASE.
          ENDDO.
        CATCH zcx_pucl003 .
      ENDTRY.
    As you can see, I added several specific methods to my object to make the code clearer.
    Here is the code for all the specific methods:
    PROCESS_PDF_FILE
      TRY.
    * Extract the Data of the PDF as a XSTRING stream
          me->process_form( pdf = binary_file ).
          me->process_xml_file( input_xstring = pdf_form_data ).
        CATCH zcx_pucl003 INTO v_exception.
          RAISE EXCEPTION v_exception.
      ENDTRY.
    PROCESS_FORM with inbound parameter PDF type XSTRING
      DATA :
         l_fp          TYPE REF TO if_fp ,
         l_pdfobj      TYPE REF TO if_fp_pdf_object .
    TRY.
    * Get a reference to the form processing class.
          l_fp = cl_fp=>get_reference( ).
    * Get a reference to the PDF Object class.
          l_pdfobj = l_fp->create_pdf_object( ).
    * Set the pdf in the PDF Object.
          l_pdfobj->set_document( pdfdata = pdf ).
    * Set the PDF Object to extract data the Form data.
          l_pdfobj->set_extractdata( ).
    * Execute call to ADS
          l_pdfobj->execute( ).
    * Get the PDF Form data.
          l_pdfobj->get_data( IMPORTING formdata = pdf_form_data ).
        CATCH cx_fp_runtime_internal
              cx_fp_runtime_system
              cx_fp_runtime_usage.
      ENDTRY.
    PROCESS_XML_FILE with inbound parameter INPUT_XSTRING type XSTRING.
      TRY.
          me->create_xml_document( input_xstring = input_xstring ).
          me->process_xml( ).
        CATCH ZCX_PUCL003 INTO v_exception.
          RAISE EXCEPTION v_exception.
      ENDTRY.
    CREATE_XML_DOCUMENT with inbound parameter INPUT_XSTRING type XSTRING.
      DATA :
         l_ixml        TYPE REF TO if_ixml,
         streamfactory TYPE REF TO if_ixml_stream_factory ,
         istream       TYPE REF TO if_ixml_istream,
         parser        TYPE REF TO if_ixml_parser.
      DATA: parseerror TYPE REF TO if_ixml_parse_error,
            str        TYPE string,
            i          TYPE i,
            count      TYPE i,
            index      TYPE i.
    * Convert the xstring form data to string so it can be
    * processed using the iXML classes.
      TRY.
          converter = cl_abap_conv_in_ce=>create( input = input_xstring ).
          converter->read( IMPORTING data = formxml ).
    * Get a reference to iXML object.
          l_ixml = cl_ixml=>create( ).
    * Get iStream object from StreamFactory
          streamfactory = l_ixml->create_stream_factory( ).
          istream = streamfactory->create_istream_string( formxml ).
    * Create an XML Document class that will be used to process the XML
          xml_document = l_ixml->create_document( ).
    * Create the Parser class
          parser = l_ixml->create_parser( stream_factory = streamfactory
                                          istream        = istream
                                          document       = xml_document ).
    * Parse the XML
          parser->parse( ).
          IF sy-subrc NE 0
            AND parser->num_errors( ) NE 0.
            count = parser->num_errors( ).
            index = 0.
            WHILE index < count.
              parseerror = parser->get_error( index = index ).
              str = parseerror->get_reason( ).
              index = index + 1.
            ENDWHILE.
            EXIT.
          ENDIF.
        CATCH cx_parameter_invalid_range
              cx_sy_codepage_converter_init
              cx_sy_conversion_codepage
              cx_parameter_invalid_type.
      ENDTRY.
    Method PROCESS_XML
      DATA v_formname TYPE fpname.
    * For each node of the XML file you want to retrieve the value
    * Then use the specific method PROCESS_NODE .
    * Find Node where System Id is store
      CLEAR : xml_node ,
              xml_node_value.
      TRY.
          me->process_node( node_name     = 'SYSID' ).
          CHECK NOT xml_node_value IS INITIAL.
          CASE xml_node_value.
            WHEN sy-sysid.
    * Search for Form name.
              me->process_node( node_name = 'FORM_NAME' ).
              CHECK NOT xml_node_value IS INITIAL.
              v_formname = xml_node_value.
            WHEN OTHERS.
          ENDCASE.
        CATCH cx_root.
      ENDTRY.
    Method PROCESS_NODE with inbound parameter NODE_NAME type STRING
      CLEAR : xml_node , xml_node_value .
      xml_node = xml_document->find_from_name( name = node_name ).
      IF xml_node IS INITIAL.
    * Missing one node in the form, nothing will be done
          RAISE EXCEPTION TYPE ....
      ELSE.
        xml_node_value = xml_node->get_value( ).
      ENDIF.
    Hope this helps.
    Best regards
    Bertrand

  • Reg Extracting data from PDF using file adapter

    Hi Experts,
    In my business process I will receive different files in PDF form. I have to extract the fields from each file and send them to the ECC system. Can anyone suggest how to do this without using CA?
    Regards
    Suresh

    You might have to use a custom solution.
    You will find tips in the thread "Trouble writing out a PDF in XI/PI?"

  • Extracting Data from PDF forms in Reader created in Livecycle

    Hello
    We would like users who complete a PDF document in Adobe Reader (created in LiveCycle) to be able to export the completed fields and the accompanying questions to an MS Word document, in a format that looks similar to the PDF, so that it can be pasted into future documents.
    Is there a simple step-by-step procedure that the users can follow?
    Any assistance would be much appreciated.

    Hi,
    I think you selected "3.x Datasource" as the type when you were replicating the metadata from the second client.
    If so, delete the datasource (in BIW) from the second client, and then replicate the datasource once more. This time, select the "As Datasource" option only.
    with rgds,
    Anil Kumar Sharma .P

  • How to Extract Data from the PDF file to an internal table.

    Hi friends,
    How can I extract data from a PDF file to an internal table?
    Thanks in Advance
    Shankar

    Shankar,
    Have a look at these threads:-
    extracting the data from pdf  file to internal table in abap
    Adobe Form (data extraction error)
    Chintan

  • Need to pre-populate and Extract data from static PDF form

    Hi Jasmin or Jayan or anyone else that can answer.
    I have a requirement to use digital signatures. Because of that, the forms must be static PDFs and the form variables will be “document form”. I want to pre-populate the form via an SQL query and a custom render process, and render it as PDF so that the submitter can apply a digital signature when he/she is done and ready to submit for approval. Subsequent approvers will also digitally sign the form. I know that I will configure the custom render to render only once and thereby preserve the signature(s) on the form. I do, however, need to extract data from the form to control the business process. I cannot access the data in the form the same way I do with an xdp, and I also cannot pre-populate it the same way I do with an xdp.
    Any suggestions on how to attack this?

    Parth, one problem with your approach is that he will submit a PDF, and therefore you won't be able to put the PDF in a variable that's supposed to contain just XML.
    The prepopulation should be the same. If you start off with an xdp, then you will call a render service that merges data with your xdp to create a PDF.
    Now when you submit, you will submit the entire PDF back in the Document Form variable. In Workbench, you can use the FormDataIntegration service to extract data from that PDF that's being stored under Document Form var/object/document and put it in an xml variable. Then you can just use xPath to do your condition.
    I'm assuming you'll just pass that same Document Form variable to the next step, because if you make any change to the PDF it will break the signature.
    Let me know if I missed anything.
    Jasmin

  • Extract data from Dynamic Table in Pdf

    Hi all,
    How can I extract data from a dynamically created table (rows are added and removed by the user in an offline scenario) in a PDF form?
    Regards,
    Michael

    Hi Michael,
    I have a scenario similar to yours. I want to extract table data from the offline form. When I extract the data, I get values only for the first row of the table. Can you please guide me on how to fetch the data for a table (this table also has dynamically increasing rows offline)? I need the solution urgently. Please help me with this.
    Will reward points for sure.
    Thanks and Regards,
    Srividya.

  • Applescript or workflow to extract text from PDF and rename PDF with the results

    Hi Everyone,
    I get supplied hundreds of PDFs which each contain a stock code, but the PDFs themselves are not named consistently, or they are supplied as multi-page PDFs.
    What I need to do is name each PDF with the code which is in the text on the PDF.
    It would work like this in an ideal world:
    1. Split PDF into single pages
    2. Extract text from PDF
    3. Rename PDF using the extracted text
    I'm struggling with part 3!
    I can get a text file with just the code (I'm extracting the code using a call to BBEdit).
    I did think about using a variable for the name, but the rename function doesn't let me use variables.

    Hello
    You may also try the following AppleScript, which is a wrapper around a RubyCocoa script. It will ask you to choose the source PDF files and a destination directory. It then scans the text of each page of the PDF files for the predefined pattern and saves the page as a new PDF file in the destination directory, named with the string the pattern extracted. Pages that do not contain a string matching the pattern are ignored. (Ignored pages, if any, are reported in the result of the script.)
    Currently the regex pattern is set to:
    /HB-.._[0-9]{6}/
    which means HB- followed by two characters and _ and 6 digits.
    Minimally tested under 10.6.8.
    Hope this may help,
    H
    _main()
    on _main()
        script o
            property aa : choose file with prompt ("Choose pdf files.") of type {"com.adobe.pdf"} ¬
                default location (path to desktop) with multiple selections allowed
            set my aa's beginning to choose folder with prompt ("Choose destination folder.") ¬
                default location (path to desktop)
            set args to ""
            repeat with a in my aa
                set args to args & a's POSIX path's quoted form & space
            end repeat
            considering numeric strings
                if (system info)'s system version < "10.9" then
                    set ruby to "/usr/bin/ruby"
                else
                    set ruby to "/System/Library/Frameworks/Ruby.framework/Versions/1.8/usr/bin/ruby"
                end if
            end considering
            do shell script ruby & " <<'EOF' - " & args & "
    require 'osx/cocoa'
    include OSX
    require_framework 'PDFKit'
    outdir = ARGV.shift.chomp('/')
    ARGV.select {|f| f =~ /\\.pdf$/i }.each do |f|
        url = NSURL.fileURLWithPath(f)
        doc = PDFDocument.alloc.initWithURL(url)
        path = doc.documentURL.path
        pcnt = doc.pageCount
        (0 .. (pcnt - 1)).each do |i|
            page = doc.pageAtIndex(i)
            page.string.to_s =~ /HB-.._[0-9]{6}/
            name = $&
            unless name
                puts \"no matching string in page #{i + 1} of #{path}\"
                next # ignore this page
            end
            doc1 = PDFDocument.alloc.initWithData(page.dataRepresentation) # doc for this page
            unless doc1.writeToFile(\"#{outdir}/#{name}.pdf\")
                puts \"failed to save page #{i + 1} of #{path}\"
            end
        end
    end
    EOF"
        end script
        tell o to run
    end _main
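
To try the pattern on sample text before running it on real PDFs, the matching step alone can be sketched in Python. This is only an illustration of the regex from the Ruby script above, assuming the page text is already available as strings; the function name `output_names` is hypothetical.

```python
import re

# Same pattern as in the Ruby script above:
# "HB-", any two characters, "_", then six digits.
CODE_PATTERN = re.compile(r"HB-.._[0-9]{6}")


def output_names(page_texts):
    """Return the output file name for each page, or None where no code is found."""
    names = []
    for text in page_texts:
        match = CODE_PATTERN.search(text)
        names.append(match.group(0) + ".pdf" if match else None)
    return names
```

Pages that map to `None` correspond to the pages the script ignores and reports.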

  • Extract data from database tables and download in pdf and csv

    Hi, how can I rewrite my old Forms procedure in ADF Java? The procedure extracted data from different tables and downloaded it as PDF and CSV. I am not downloading an image; I want to extract data from different tables in my database and download that data as PDF and CSV. I would like to write this in ADF Java. I just want direction. I am not asking anyone to do my work; this is my learning curve.
    The Forms code is:
    function merge_header3 return varchar2 is
    begin
         return '~FACILITY DESCRIPTION~ACCOUNT NO~BRANCH CODE~BANK REF NO.~P/P/ AMOUNT~Postal Address 1~Postal Address 2~Box Postal Code~Dep. Date~Month~BANK NAME~BRANCH NAME~ACCOUNT TYPE~DESCRIPTION~OBJECTIVE DESCRIPTION';
    end;
    procedure download_file (i_pbat integer) is
      dir varchar2(80);
      file_name1 varchar2(80);
      file_name2 varchar2(80);
      appl_code varchar2(80);
      fil1 client_text_io.file_type;
      fil2 client_text_io.file_type;
      dat varchar2(1000);
      DATA VARCHAR2(1000);
      bvspro varchar2(100);
      ssch   varchar2(100);
      bvspro_total number(20,2);
      ssch_total   number(20,2);
      grand_total  number(20,2);
      cnt    integer;
      cursor pbat is
           select *
           from sms_payment_batches
           where id = i_pbat;
      cursor pay  (pb_id integer) is
           select *
           from sms_payment_vw
           where pbat_id = pb_id
           order by subsidy ASC,programme,beneficiary_name;
      cursor cgref (low varchar2) is
           select *
           from cg_ref_codes
           where rv_domain ='SMS'
           and rv_low_value = low;
      success boolean;     
      begin  
           set_application_property(cursor_style,'busy');
           appl_code := sms_global.ref_code('SMS','APP_CODE','SMS',0);
        dir       := sms_global.ref_code('SMS','PAY_DIR','c:\sms\batch_payments',0);
             success := webutil_file.create_directory(dir);
         if webutil_file.file_is_directory(dir) then
             null;
    --         message ('directory exists');
        else
    --                  message ('create directory ');
             success := webutil_file.create_directory(dir);
    --         if success then        message ('directory exists');    end if;
        end if;     
        for c_pbat in pbat loop
             file_name1 := dir ||'\' || appl_code||c_pbat.batch_number||'-'||to_char(c_pbat.batch_dt,'yyyymmdd')||'pay.txt';
             file_name2 := dir ||'\' || appl_code||c_pbat.batch_number||'-'||to_char(c_pbat.batch_dt,'yyyymmdd')||'merge.txt';
    --message('create files ');
    --         fil1  := client_text_io.fopen (file_name1,'W');
    --         fil2  := client_text_io.fopen (file_name2,'W');
        fil1  := client_text_io.fopen (file_name1,'W','');
        fil2  := client_text_io.fopen (file_name2,'W','');
                   dat :=                       'FROM ACCOUNT NUMBER'
                                                                ||'~'||'FROM ACCOUNT DESCRIPTION'
                                                                ||'~'||'MY STATEMENT DESCRIPTION'
                                                                ||'~'||'BENEFICIARY ACCOUNT NUMBER'
                                                                ||'~'||'BENEFICIARY SUB ACCOUNT NUMBER'        
                                                                ||'~'||'BENEFICIARY BRANCH CODE'
                                                                ||'~'||'BENEFICIARY NAME'
                                                                ||'~'||'BENEFICIARY STATEMENT DESCRIPTION'
                                                                ||'~'||'AMOUNT';
             --     client_text_io.put_line(fil1,dat);
             bvspro:= null;
             ssch  := null;
             cnt := 0;     
             dat := '~'||lpad('~',16,'~');
             for c_pay in pay(c_pbat.id) loop
    --message('cpay loop ' || cnt);              
               if bvspro is null then
                     dat := lpad('~',16,'~');
                     dat := utility.put_field(1,c_pay.programme,dat,'~');     
               client_text_io.put_line(fil2,dat);
               dat := utility.put_field(1,c_pay.subsidy,dat,'~');
               client_text_io.put_line(fil2,dat);
               dat := merge_header3;
                     client_text_io.put_line(fil2,dat);
                     bvspro := c_pay.programme;
                     ssch := c_pay.subsidy;
                     grand_total := 0;
                     bvspro_total := 0;
                     ssch_total := 0;
               end if;
               if bvspro <> c_pay.programme then
                     dat := lpad('~',16,'~');
                     dat := utility.put_field(5,ssch_total,dat,'~');
                     dat := lpad('~',16,'~');
                     dat := utility.put_field(5,bvspro_total,dat,'~');
               dat := utility.put_field(1,'Total:' || bvspro,dat,'~');
                     client_text_io.put_line(fil2,dat);
                     dat := lpad('~',16,'~');
               client_text_io.put_line(fil2,dat);
                     dat := utility.put_field(1,c_pay.programme,dat,'~');     
               client_text_io.put_line(fil2,dat);
                     bvspro := c_pay.programme;
               dat := utility.put_field(1,c_pay.subsidy,dat,'~');
               client_text_io.put_line(fil2,dat);
               dat := merge_header3;
                     client_text_io.put_line(fil2,dat);
                     bvspro := c_pay.programme;
                     ssch := c_pay.subsidy;
                     bvspro_total := 0;
                     ssch_total := 0;
                     cnt :=0;
             end if;                           
               if ssch <> c_pay.subsidy then
                     dat := lpad('~',16,'~');
                     dat := utility.put_field(5,ssch_total,dat,'~');
                     dat := lpad('~',16,'~');
               client_text_io.put_line(fil2,dat);
               dat := utility.put_field(1,c_pay.subsidy,dat,'~');
               client_text_io.put_line(fil2,dat);
               dat := merge_header3;
                     client_text_io.put_line(fil2,dat);
                     ssch := c_pay.subsidy;
                     ssch_total := 0;
                     cnt :=0;
             end if;                           
            bvspro_total := bvspro_total + c_pay.amount;
            ssch_total   := ssch_total   + c_pay.amount;              
                  grand_total  := grand_total  + c_pay.amount;              
            cnt := cnt +1;
    --message('bfore write file 2 ' );              
            client_text_io.put_line(fil2
                                   ,cnt
                            ||'~'|| c_pay.beneficiary_name
                                                                ||'~'||c_pay.BENEFICIARY_ACCOUNT_NUMBER ||''            
                                                                ||'~'||c_pay.BRANCH_CODE             ||''           
                                                                ||'~'|| c_pay.BENEFICIARY_STATEMENT_DESC            
                                                                ||'~'|| c_pay.AMOUNT                                
                            ||'~'|| c_pay.address_line1
                            ||'~'|| c_pay.address_line2
                                                    ||'~'|| c_pay.postal_code
                                                    ||'~'|| TO_CHAR(c_pay.deposit_date,'DD-Mon-YYYY')
                                                    ||'~'|| c_pay.month
                                                    ||'~'|| c_pay.bank
                                                    ||'~'|| c_pay.bank_branch
                                                    ||'~'|| c_pay.account_type
                                                    ||'~'|| c_pay.subsidy
                                                    ||'~'|| c_pay.programme);
                  DATA :=                                  c_pay.FROM_ACCOUNT_NUMBER                   
                                                                ||'~'||c_pay.FROM_ACCOUNT_DESCR                    
                                                                ||'~'||c_pay.MY_STATEMENT_DESCR                    
                                                                ||'~'||c_pay.BENEFICIARY_ACCOUNT_NUMBER
                                                                ||'~'
                                                                ||'~'||c_pay.BRANCH_CODE            
                                                                ||'~'||c_pay.BENEFICIARY_NAME                      
                                                                ||'~'||c_pay.BENEFICIARY_STATEMENT_DESC            
                                                                ||'~'||c_pay.AMOUNT;                                
            DATA := REPLACE(DATA, ',' , ' ' );
            DATA := REPLACE(DATA, '~' , ',' );
    --message (cnt ||' ' || data);       
    --message('bfore write file 1 ' );              
                  client_text_io.put_line(fil1, data);
             end loop;
    --message ('end of write');         
                 dat := lpad('~',16,'~');
                 dat := utility.put_field(6,ssch_total,dat,'~');
                 dat := lpad('~',16,'~');
           dat := utility.put_field(1,'Total:' || bvspro,dat,'~');
                 dat := utility.put_field(5,bvspro_total,dat,'~');
              client_text_io.put_line(fil2,dat);
              dat := lpad('~',16,'~');
           client_text_io.put_line(fil2,dat);
           dat := utility.put_field(1,'Grand Total:' ,dat,'~');
                 dat := utility.put_field(5,grand_total,dat,'~');
              client_text_io.put_line(fil2,dat);
             -- close file
    for i in 1..50 loop  
           if substr(i,-1) = 0 then
                 message ('flush ' || i);
           end if;                 
                  client_text_io.put_line(fil1, lpad(' ',2000));
                  client_text_io.put_line(fil2, lpad(' ',2000));
                  client_text_io.put_line(fil1, lpad(' ',2000));
                  client_text_io.put_line(fil2, lpad(' ',2000));
    end loop;
             client_text_io.fclose(fil1);
             client_text_io.fclose(fil2);
        end loop;
       set_application_property(cursor_style,'default');
        exception
             when others then
                  message(sqlcode ||' ' ||sqlerrm);
       end download_file;
    I tried this, but this code only downloads an image, not data from the database tables:
        public void downloadImage(FacesContext facesContext, OutputStream outputStream)
        {
            BindingContainer bindings = BindingContext.getCurrent().getCurrentBindingsEntry();
            // get an ADF attribute value from the ADF page definitions
            AttributeBinding attr = (AttributeBinding) bindings.getControlBinding("DocumentImage");
            if (attr == null)
            {
                return;
            }
            // the value is a BlobDomain data type
            BlobDomain blob = (BlobDomain) attr.getInputValue();
            try
            {   // copy the data from the BlobDomain to the output stream
                IOUtils.copy(blob.getInputStream(), outputStream);
                // close the blob to release the resources
                blob.closeInputStream();
                // flush the output stream
                outputStream.flush();
            }
            catch (IOException e)
            {
                // handle errors
                e.printStackTrace();
                FacesMessage msg = new FacesMessage(FacesMessage.SEVERITY_ERROR, e.getMessage(), "");
                FacesContext.getCurrentInstance().addMessage(null, msg);
            }
        }

    You should ask your question in the ADF forum.

  • Extract data from a scanned PDF Chart

    Hi,
    I have a scanned PDF chart that shows a linear relationship between two variables. Is there a way to extract data from the scanned PDF using Acrobat?
    I want to avoid the errors that eyeballing the data would introduce into my calculations. The measuring tool may be an option, but I wanted to ask whether any forum members know a better, more efficient way to extract the data so it can later be used in spreadsheet software.
    As an example, I would like to extract the data from Figure 6 in this document: http://www.seas.columbia.edu/earth/wtert/sofos/nawtec/nawtec13/nawtec13-3164.pdf
    FYI, I have Acrobat X installed on my Windows computer.
    Thanks in advance for your help.

    Using AA9 Pro I was able to use the TouchUp Object Tool, right-click, and open the graph in Photoshop. From there, or from Paint or another image editor, one might be able to clean it up or re-create the paths. The gray values are too close for it to be simple.
    The measure tool does not seem a good choice; although you see measurement lines, it would not actually produce the line you desire until maybe, just maybe, you flattened annotations, a feature within Fixups that is perhaps limited to the Professional version.
    IMO, it is a fairly linear graph with only a variation at the 10 Power (MW). Your eyeball is as good as mine on this one; the actual values may be available from the authoring entity.
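    If the chart is opened in an image editor as described above, one possible way to turn points read off the image into numbers is a linear axis calibration: note the pixel positions of two tick marks with known values on each axis, then interpolate. The sketch below is plain JavaScript, not an Acrobat feature. The function name makeAxisScale and all pixel values are made up for illustration, and it assumes linear (not logarithmic) axes and an unrotated scan.

```javascript
// Hypothetical helper: build a pixel -> data mapping for one linear axis from
// two reference ticks, each given as { pixel: ..., value: ... }.
function makeAxisScale(tickA, tickB) {
    var slope = (tickB.value - tickA.value) / (tickB.pixel - tickA.pixel);
    return function (pixel) {
        return tickA.value + (pixel - tickA.pixel) * slope;
    };
}

// Made-up calibration: x ticks 0 and 20 sit at pixels 100 and 420;
// y ticks 0 and 50 sit at pixels 400 and 200 (image y grows downward).
var xScale = makeAxisScale({ pixel: 100, value: 0 }, { pixel: 420, value: 20 });
var yScale = makeAxisScale({ pixel: 400, value: 0 }, { pixel: 200, value: 50 });

// Convert a point read at pixel (260, 300) into a CSV line for a spreadsheet
var csvLine = xScale(260) + ',' + yScale(300);
```

    Each point read from the image then becomes one CSV line that a spreadsheet can import; accuracy is limited by how precisely the pixel positions can be read.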

  • I want to extract data from a PDF using Java

    I would prefer to extract data from a PDF and convert it to XML. Is there an API that will convert a PDF to some Adobe-format XML? Ideally I would like to add some JAR files to my classpath, similar to PDFBox. I don't want to install a bunch of server-side components or anything like that.
    Thanks!

    Thank you for the reply!
    If I installed the server side components, how would a Java client invoke a service to export data from a PDF? RMI, Web Services?

  • Extracting data from a pdf form

    Hi,
    livecycle es2, workbench 9.0
    I'm new to workbench and have a problem extracting data from a pdf form submitted to a short lived process.
    I have set up the following very simple process :
    default startpoint >  ProcessForm > exportData > set value > set value > Write Document
    The intention is to update the document and write it to disk. So far, each step works except for the 'export data' where I cannot get the pdf to extract to xml.
    The Input to the 'export data' step is a variable (myDoc), Data Type: Document,  created from the incoming PDF form.
    If I write out myDoc it is an exact copy of the incoming document, so I guess the start and finish steps of the process are OK.
    The incoming (PDF) form I was given had no data schema, but  I thought I could access the form data by exporting to an xml variable....
      Service : FormDataIntegration  / exportData
    input (PDF Document)    variable : myDoc
      output(Data extracted)     variable : myXMLData
    Then in the next step (set value) access the xml element I am after ..
    Mappings
    Location:  /process_data/@groupId      Expression: /process_data/myXMLData/xdp/datasets/data/form1/mainPage/groupId
    This did not work, so I took the incoming form, exported the form data to an XML file, and created a schema using Stylus Studio. I then imported that into the myXMLData definition. (BTW: do I need to specify the root node after importing it?)
    Still not working !
    Extra info : The XML view of my incoming  form shows I have a minimal dataset definition- is this OK ??
    <connectionSet xmlns="http://www.xfa.org/schema/xfa-connection-set/2.8/">
       <?originalXFAVersion http://www.xfa.org/schema/xfa-connection-set/2.4/?></connectionSet>
    <xfa:datasets xmlns:xfa="http://www.xfa.org/schema/xfa-data/1.0/">
       <xfa:data xfa:dataNode="dataGroup"/>
    </xfa:datasets>
    The schema created by stylus studio has none of the xfdf, xfa settings I have seen on other schemas - is this OK ?
    Any help to get this fixed greatly appreciated
    thanks
    steve

    Hey, thanks for the offer, but I am now sorted after finding a simple working example online.
    This is a similar process to the one I am working on, and is clearly described and easy to follow...
    http://eslifeline.wordpress.com/2009/04/25/extracting-data-from-signed-pdf-using-livecycle-server/
    girish bedekar - I thank you !

  • Extracting data from Excel To Illustrator javascript or vbscript

    Hi all-
    I was wondering if there was a way to extract data from Excel to be used in Illustrator. I know there is an option of variables and xml, and I don't want that. I've seen and tried out how to read illustrator and write to excel, and I get that.  What I would like to do is pretty much the opposite:
    1. Pre-fill an Excel file (.xls, .csv, doesn't matter) with data, such as a filename in column 1 and the replacement text in column 2, then close it manually.
    2. Run the script (VBScript, JavaScript, doesn't matter).
    3. For each row in the Excel file where the cell in the first column is not empty, open the Illustrator template containing a placeholder text frame named "DWG" and replace the contents of that frame with the replacement text from column 2.
    4. Save each result to a PDF file, named with the text from column 1 (Filename).
    In a nutshell, there will be a single Illustrator template with a premade textFrame named "DWG". Excel will contain two columns: one for the filename to save as and one for the text that replaces the placeholder in AI. I hope I explained this well enough without causing too much confusion. Thanks in advance.
    Filename     Replacement Text
    test1.pdf    DWG01
    test2.pdf    DWG02
    test3.pdf    DWG03
    test4.pdf    DWG04

    As text, \n is the newline character and \r is the return character. I can't remember which one Excel uses, but they both equate to a line/paragraph break. I very quickly threw together an example for you:
    #target Illustrator
    textToPDF();
    function textToPDF() {
        if ( app.documents.length == 0 ) { return; }
        var doc, csvFile, i, fileArray, opts;
        csvFile = File( '~/Desktop/ScriptTest/Test.csv' );
        if ( !csvFile.exists ) { return; }
        fileArray = readInCSV( csvFile );
        doc = app.activeDocument;
        opts = new PDFSaveOptions();
        opts.pDFPreset = '[Press Quality]';
        // Here we loop the main array
        for ( i = 0; i < fileArray.length; i++ ) {
            // Here we get the second item of sub array i
            doc.textFrames.getByName( 'DWG' ).contents = fileArray[i][1];
            // Here we get the first item of sub array i
            doc.saveAs( File( fileArray[i][0] ), opts );
        }
    }
    // Read the CSV file into an array of rows, each row an array of fields
    function readInCSV( fileObj ) {
        var fileArray, thisLine, csvArray;
        fileArray = [];
        fileObj.open( 'r' );
        while( !fileObj.eof ) {
            thisLine = fileObj.readln();
            csvArray = thisLine.split( ',' );
            fileArray.push( csvArray );
        }
        fileObj.close();
        return fileArray;
    }
    I haven't tested it but it should be close…?
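    On the \n versus \r question: a minimal sketch, in plain JavaScript, of splitting CSV text into rows regardless of which line ending the export used. The helper name parseCsvRows is hypothetical (it is not part of the script above), and it assumes simple fields with no quoted commas or embedded line breaks.

```javascript
// Hypothetical helper: split raw CSV text into rows of fields, accepting
// \r\n, \n, or a bare \r as the line separator.
// Assumes simple fields: no quoted commas or embedded line breaks.
function parseCsvRows(text) {
    var lines = text.split(/\r\n|\n|\r/);
    var rows = [];
    for (var i = 0; i < lines.length; i++) {
        // Skip blank lines, e.g. the one left by a trailing line break
        if (lines[i].replace(/\s/g, '') === '') { continue; }
        rows.push(lines[i].split(','));
    }
    return rows;
}

// Sample input mixing all three line endings
var rows = parseCsvRows('test1.pdf,DWG01\r\ntest2.pdf,DWG02\rtest3.pdf,DWG03\n');
// rows[0][0] is the filename and rows[0][1] the replacement text
```

    Splitting the whole text this way avoids depending on which line ending the spreadsheet export produced, and skipping blank lines keeps a trailing line break from producing an empty row.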
