Extract highlighted words from a pdf (Acrobat SDK, OLE)

Hello Acrobat gurus! :-)
I'm new to the SDK, so please excuse any "stupid" questions I might have.
Here is what I want to do:
I want to search for a group of words in a PDF document. According to the SDK documentation, AcroExch.AVDoc.FindText() "Finds the specified text, scrolls so that it is visible, and highlights it."
I assumed that once the string is found, I would have access to the coordinates of the rectangle containing the highlighted group of words (I presumed those words would automatically be held in an object of type AcroExch.HiliteList) and to the coordinates of the individual words. But I can't find any function that gives me that kind of access.
So the question is:
Is it possible to access the coordinates of the rectangle/words that are highlighted in a PDF after calling FindText()? Can someone help me get on the right track?
Thanks

OK, let me give you a more elaborate example; maybe I'm not asking the right question.
Let's say I have a PDF containing the following text on the first page:
--- arbitrary number of ":"
Mother's Name: Joanna
Father's Name: Josh
other text
If I call the function like this: FindText("Mother's Name:"), Acrobat is going to find the first occurrence of my string. What I want is to get the coordinates of this WHOLE string OR the coordinates of the last character in the string (in this case ":").
The problem is that if I go for the coordinates of the colon, I cannot just search for ":" in the PDF, because there may be an unknown number of colons before the one I'm interested in. The logical solution would be to get the coordinates of the entire string ("Mother's Name:" in this case) and from that get the coordinates of the colon I'm interested in.
Would that be possible?
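
To make the idea concrete, here is a minimal Python sketch of the lookup logic only. It assumes you already have the page's words in reading order together with their bounding boxes; how to get that list out of Acrobat is not shown. The Word class, the sample coordinates, and the assumption that the colon stays attached to "Name:" as a single word are all hypothetical.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class Word:
    """One word on a page plus its box in PDF user space (origin bottom-left)."""
    text: str
    left: float
    bottom: float
    right: float
    top: float


def find_phrase(words, phrase):
    """Return (first_index, last_index) of the first run of words matching phrase, or None."""
    tokens = phrase.split()
    if not tokens:
        return None
    for start in range(len(words) - len(tokens) + 1):
        if all(words[start + i].text == tok for i, tok in enumerate(tokens)):
            return start, start + len(tokens) - 1
    return None


def bounding_box(words):
    """Smallest (left, bottom, right, top) rectangle enclosing all given words."""
    return (
        min(w.left for w in words),
        min(w.bottom for w in words),
        max(w.right for w in words),
        max(w.top for w in words),
    )


# Hypothetical page content: stray colons first, then the two name lines.
page_words = [
    Word(":", 72, 700, 75, 712),
    Word(":", 80, 700, 83, 712),
    Word("Mother's", 72, 680, 120, 692),
    Word("Name:", 124, 680, 156, 692),
    Word("Joanna", 160, 680, 200, 692),
    Word("Father's", 72, 660, 118, 672),
    Word("Name:", 122, 660, 154, 672),
    Word("Josh", 158, 660, 185, 672),
]

span = find_phrase(page_words, "Mother's Name:")
first, last = span
phrase_box = bounding_box(page_words[first:last + 1])   # box of the whole string
last_word_box = bounding_box([page_words[last]])        # box of "Name:"
```

Because the match is made on the whole phrase, the stray colons earlier on the page are never considered. The right edge of last_word_box marks where the colon of interest ends.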

Similar Messages

  • Extract Tag Tree from existing PDF

    Hello,
    We are starting a new project where users can check their PDFs for accessibility. They upload the PDF file, and on a new screen we are supposed to show the tag tree of the PDF (if it has tags).
    Can the tag tree of an existing PDF be extracted on the server by running some command-line code in the background or by calling a method in the .NET SDK? Does anyone have any ideas?
    If this isn't possible, does anyone know of any other software I might use to get this information? As long as I can get the tag tree, I shouldn't have any problem marking it up as HTML and rendering it in the person's browser.
    Thanks,
    Dustin Michaels

    Are you sure you can't install it on a Windows server? This link says you can.
    http://www.adobe.com/products/acrobatpro/productinfo/systemreqs/
    Microsoft® Windows® 2000 with Service Pack 4; Windows Server® 2003 (32-bit or 64-bit editions) with Service Pack 1; Windows XP Professional, Home, Tablet PC, or 64-bit Editions with Service Pack 2; or Windows Vista™ Home Basic, Home Premium, Ultimate, Business, or Enterprise (32-bit or 64-bit editions).
    Anyway, even if you couldn't install it on a server, we could always have it installed on a Windows XP machine and have the Windows server contact that machine to get the tag tree of the uploaded PDF.
    Do you have any idea how to extract the tag tree using some .NET code, regardless of which operating system Adobe Acrobat is running on?
    Thanks,
    Dustin Michaels

  • How do I extract one page from a PDF document and save it as a new PDF?

    How do I extract one page from a pdf document and create a new document?

    In Acrobat: Tools - Pages - Extract.
    In Reader it's not possible.

  • Extract a page from a PDF document

    Can I extract a page from a PDF using Preview? Under Windows using Adobe Acrobat Pro you can extract a page from a PDF, so I was wondering.

    Under Windows using Adobe Acrobat Pro you can extract a page from a PDF
    Under Mac OS X, using Adobe Acrobat Pro, you can do the same thing.
    But there are a few free utilities to do the same thing, as mentioned.
    For the Print option mentioned, you can also use the Quartz filters to compress the "printed" PDF.

  • When converting Word Doc to PDF Acrobat converts all words and tables but will not convert graphs

    When converting a Microsoft Word document to PDF, Acrobat converts the words, tables, and everything else except the graphs in the document. It does not convert the graphs; it leaves that area blank.

    Cvest group wrote:
    From Microsoft Office 2007 created word doc from word doc to Acrobat to Create PDF.
    This is very confusing; what actually converted the document: Word, Acrobat, or CreatePDF (now renamed to "Adobe PDF Pack")?
    Either way it's not an Adobe Reader problem;
    if Word: post in the Microsoft Word forum
    if Acrobat: post in the Adobe Acrobat forum
    if ExportPDF: post in the ExportPDF forum

  • Is there a way to copy non-contiguous words from a PDF

    Is there a way to copy non-contiguous words from a PDF? Is there a way to use a template or mask that isolates certain areas to be copied, with spaces between the words, so that when you paste the contents into multiple fields, the separate words drop into the appropriate fields? The fields will be in a submission form.
    Please let me know if this is possible.
    Thanks
    Linda

    No it's not.

  • Extracting two words from the article in english

    I have an English article which is to be classified into a particular category based on keywords. There are lakhs of keywords stored in a database. What I have to do is obtain the keywords from the article and match them against the database; if a match is found, the article belongs to that category. I did this keyword matching for single words by using split(" "), but now I want to do it for two words from an article, that is, find a pair of words that is repeated many times in the article and then search for it in the DB (here the two words are treated as one keyword).
    What should I do to get appropriate two-word keywords from the article while skipping generic words such as a, am, the, is, when, etc.?
    Any help will be appreciated.

    hi,
    thanks for reply!
    I know it's a bad algorithm to classify an article written in English based only on a few words appearing in it.
    But what I want to do is first extract the words from the article, leaving out the generic words, then count each word. Then I sort the words by count and take the five words with the highest counts. I have a database where millions of keywords are stored. Each keyword refers to a particular category;
    i.e. if we consider sports as a category, then under this category I have many keywords stored in the database, such as cricket, football, worldcup, tennis, etc.
    If I find an appropriate word in the article, it is treated as a keyword and searched for in the database. If a match is found, the article belongs to the sports category.
    The problem is that sometimes an article has two words which can be considered one keyword and which would classify the article much better.
    The question is how to get such word pairs from the article.
    E.g. if "Hero's Journey" is a combined term appearing many times in the article, then it can be used to classify the article much better than any single word.
    Can anybody help me in this regard?
    Any help will be appreciated.
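
    One way to approach the two-word case is to count adjacent word pairs (bigrams) in which neither word is a generic word, then keep the most frequent pairs as candidate keywords. The following is a minimal Python sketch of that idea, not a tuned classifier; the stop-word list, the tokenizer, and the function names are illustrative assumptions. It also counts pairs that straddle a sentence boundary.

```python
import re
from collections import Counter

# Small illustrative stop-word list; a real one would be much longer.
STOP_WORDS = frozenset({"a", "am", "an", "and", "the", "is", "when", "of", "in", "to", "it"})


def tokenize(text):
    """Lower-case the text and return its words (letters and apostrophes only)."""
    return re.findall(r"[a-z']+", text.lower())


def top_bigrams(text, n=5, stop_words=STOP_WORDS):
    """Return the n most frequent adjacent word pairs that contain no stop word."""
    tokens = tokenize(text)
    pairs = Counter(
        (first, second)
        for first, second in zip(tokens, tokens[1:])
        if first not in stop_words and second not in stop_words
    )
    return [(" ".join(pair), count) for pair, count in pairs.most_common(n)]


article = (
    "The hero's journey begins at home. A hero's journey is long. "
    "Every hero's journey ends."
)
best = top_bigrams(article, n=1)
```

    Each returned phrase can then be looked up in the keyword database exactly as the single words are today.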

  • How to extract inline styles from a PDF document using Acrobat 9?

    I have a requirement to extract all the content of a PDF document, along with the paragraph-level and character-level styles, in the form of XML. While doing so I'm getting a lot of additional tags. In addition, I'm not able to find the inline (character-level) tags such as bold, italics, and superscripts, or the page numbers. It would be of great help if someone could throw light on this.
    Thanks.

    Moved to Acrobat Forum.

  • How can I extract the text from PDF, PowerPoint, and Word files?

    hi friends,
    I need to extract text from PDF, PowerPoint, and MS Word files. Is it possible with Java? If yes, how can I extract text from those files? Please suggest a solution to this problem; I would be thankful.
    regards,
    prakash.

    Find an API which could read each of those files and start coding.

  • Is it possible to extract an annotation from a pdf document using sdk in c#

    We need to extract annotations from multiple PDF files and import them into a different PDF file.
    Thanks in advance.

    So if we are dealing with a desktop system, is it possible?

  • How to extract specific pages from a PDF

    Hello. I'm using Windows XP Pro on a custom PC with Adobe Acrobat 8.0. I work for a small magazine (abqarts.com) that publishes its online version in PDF format, which is created by our production dept. I need to extract specific pages from the magazine as PDFs to send to a client. I tried to look up how in the Help file, but I think the terminology is defeating me.
    I can load the magazine's PDF into Acrobat, but can't manage to save, print or export two pages and the cover as individual PDF files. I'd sure appreciate some help.
    Thanks,
    Peggy

    Graffiti, thanks for your quick response! When you say "open the pages view", that's the drop-down View menu, right? Then I select Page Display but don't know which one to choose after that. Single, two-up, etc.
    And Control>click on a page selects an image on that page--not the entire page, which is what I want.
    That said, I'm way happy you pointed out Document>Extract Pages. That works great for me, one page at a time. Maybe I don't need the other things clarified because I can use this one, but I'd like to get working all the tips you provided.
    Gratefully,
    Peggy

  • Can I extract images/items from a pdf?

    I lost my hard drive! (D'oh!)
    But I do have some hi-res pdf files made from the original files (InDesign).
    Is it possible to extract discrete components from pdfs? Such as images, text blocks, etc?
    It seems like it should be possible, but I'm wondering if one must be a PostScript coder or somesuch.
    Cheers!
    ~Ben

    Excellent! Thank you both, George and Steve.
    I have CS3, so v8.3.1 of Acrobat. So that process is Advanced > Document Processing > Export all images.
    Oddly, it tells me that it can't extract/export vector images. I suppose that means AS SUCH, since it managed to export JPEG versions of images that I know were .eps format. Strange, but true!
    Thanks again!
    Ben

  • Extract Pantone color from the PDF using c#.

    Hi, I have a requirement to extract Pantone colors from a PDF, so I have decided to go with acrobat.dll or the Illustrator library to get Pantone and other colors from the PDF. Can you please help me with how to proceed and get the Pantone colors from a PDF using C#? Thank you in advance.

    If the PDF has Pantone colors in it, you can use output preview to see it. If you want a Pantone equivalent of a CMYK value you can do this:
    In Illustrator, make a cmyk swatch of the desired color, open the Color Guide panel (Window > Color Guide), then press the Swatch Library button in the panel's lower left corner and choose color books> Pantone+solid coated. Clicking the first color in the Color Harmony section (the Base Color) at the top of the panel will add the closest PMS Spot Color equivalent to your Swatches panel.
    There is often no exact 4/C match for a spot color, but this method will get you in the ballpark.

  • Does the email button work from a PDF report generated from 9iAS?

    Has anyone been able to get the email button to work from a PDF report generated from 9iAS? I try to email the report as an attachment, and the file doesn't attach, and I don't receive anything. If I email a link, it just sends a link that directs you to the report server webpage, but you can't see the file.
    Besides saving the file to the local file system and attaching the file to an email, has anyone been able to get this button to work?

    Hi Jim,
    What command are you using in your link or button?
    For sending report output by email you must use:
    http://machine:port/reports/rwservlet?report=...+server=...+destype=mail+desname=<email_address>+desformat=<your_desired_format>+from=...
    You can also use CC, BCC, REPLYTO and SUBJECT. See Publishing Reports manual for more information:
    http://download.oracle.com/docs/html/A92102_01/pbr_cla.htm#658500
    For more advanced emailing options you can also use distribution. See Publishing Reports:
    http://download.oracle.com/docs/html/A92102_01/pbr_dist.htm#1005563
    Navneet.
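
    If the link is generated by a program rather than typed by hand, the query string can be assembled from the keywords listed above. This is a small Python sketch under assumptions: the host, port, report name, server name, and addresses are made-up placeholders, and it joins the keywords with "&" on the assumption that rwservlet accepts that separator as well as "+".

```python
from urllib.parse import urlencode


def build_mail_url(host, port, report, server, recipient, fmt, sender):
    """Build an rwservlet URL asking the Reports server to email the output."""
    params = {
        "report": report,
        "server": server,
        "destype": "mail",      # deliver by email instead of to the browser
        "desname": recipient,   # recipient address
        "desformat": fmt,       # output format of the attachment
        "from": sender,
    }
    # urlencode percent-escapes characters such as "@" in the addresses.
    return f"http://{host}:{port}/reports/rwservlet?{urlencode(params)}"


# Hypothetical values, for illustration only.
url = build_mail_url(
    "machine", 8888, "sales.rdf", "rep_server",
    "user@example.com", "pdf", "reports@example.com",
)
```

    Building the string this way avoids dropping a separator between desname, desformat, and from, which is easy to do when concatenating by hand.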

  • Extracting stem words from text index

    Hello all,
    I am trying to categorize some records in a table. I wonder if Oracle Text has some searching capabilities inside the text index. So, what I'm trying to achieve is to find the minimum amount of stem words that can be found in a set of records. Basically, it's kind of reverse searching, I have a subset of records from a table that can be found using a regular query (no text query) - the table has a text index on one column - and I want to find, using the text index, the minimum amount of stem words in that column that can generate a hit for the whole subset if queried using only the text query.
    Thanks,
    Danny

    Here is a method for viewing the stem word of any given word by using a function that inserts one row into a table, dynamically rebuilds the index, then selects the stem word from the domain index table. I have then added some code to use that to loop through all the words in the original domain index table, insert them and their roots into another table, and select the roots and corresponding concatenated words.
    SCOTT@orcl_11gR2> SELECT banner FROM v$version
      2  /
    BANNER
    Oracle Database 11g Enterprise Edition Release 11.2.0.1.0 - 64bit Production
    PL/SQL Release 11.2.0.1.0 - Production
    CORE     11.2.0.1.0     Production
    TNS for 64-bit Windows: Version 11.2.0.1.0 - Production
    NLSRTL Version 11.2.0.1.0 - Production
    5 rows selected.
    SCOTT@orcl_11gR2> CREATE TABLE test_tab (test_col VARCHAR2 (40))
      2  /
    Table created.
    SCOTT@orcl_11gR2> INSERT ALL
      2  INTO test_tab VALUES ('The cats ran quickly from the dogs.')
      3  INTO test_tab VALUES ('The mice were running from the cats.')
      4  INTO test_tab VALUES ('Some people walk their dogs every day.')
      5  INTO test_tab VALUES ('The dogs chased the cats.')
      6  SELECT * FROM DUAL
      7  /
    4 rows created.
    SCOTT@orcl_11gR2> BEGIN
      2    CTX_DDL.CREATE_PREFERENCE ('test_lex', 'AUTO_LEXER');
      3    CTX_DDL.SET_ATTRIBUTE ('test_lex', 'INDEX_STEMS', 'YES');
      4  END;
      5  /
    PL/SQL procedure successfully completed.
    SCOTT@orcl_11gR2> CREATE INDEX test_idx ON test_tab (test_col)
      2  INDEXTYPE IS CTXSYS.CONTEXT
      3  PARAMETERS
      4    ('LEXER       test_lex
      5        STOPLIST CTXSYS.EMPTY_STOPLIST')
      6  /
    Index created.
    SCOTT@orcl_11gR2> CREATE TABLE stem_tab
      2    (test_word  VARCHAR2 (4000))
      3  /
    Table created.
    SCOTT@orcl_11gR2> CREATE INDEX stem_idx on stem_tab (test_word)
      2  INDEXTYPE IS CTXSYS.CONTEXT
      3  PARAMETERS
      4    ('LEXER        test_lex
      5        STOPLIST  CTXSYS.EMPTY_STOPLIST')
      6  /
    Index created.
    SCOTT@orcl_11gR2> CREATE OR REPLACE FUNCTION get_stem
      2    (p_word IN VARCHAR2)
      3    RETURN VARCHAR2
      4  AS
      5    v_word       VARCHAR2 (32767);
      6  BEGIN
      7    DELETE FROM stem_tab;
      8    COMMIT;
      9    INSERT INTO stem_tab (test_word) VALUES (p_word);
    10    COMMIT;
    11    EXECUTE IMMEDIATE 'ALTER INDEX stem_idx REBUILD';
    12    SELECT MIN (token_text)
    13    INTO   v_word
    14    FROM   dr$stem_idx$i;
    15    RETURN v_word;
    16  EXCEPTION
    17    WHEN NO_DATA_FOUND THEN RETURN p_word;
    18  END get_stem;
    19  /
    Function created.
    SCOTT@orcl_11gR2> SHOW ERRORS
    No errors.
    SCOTT@orcl_11gR2> CREATE TABLE words_and_stems
      2    (word  VARCHAR2 (20),
      3       stem  VARCHAR2 (20))
      4  /
    Table created.
    SCOTT@orcl_11gR2> SET SERVEROUTPUT ON
    SCOTT@orcl_11gR2> DECLARE
      2    v_stem VARCHAR2 (32767);
      3  BEGIN
      4    FOR r IN
      5        (SELECT DISTINCT token_text
      6         FROM      dr$test_idx$i
      7         WHERE  token_text != '.')
      8    LOOP
      9        v_stem := get_stem (r.token_text);
    10        INSERT INTO words_and_stems
    11        VALUES (r.token_text, v_stem);
    12    END LOOP;
    13    COMMIT;
    14  END;
    15  /
    PL/SQL procedure successfully completed.
    SCOTT@orcl_11gR2> COLUMN words FORMAT A45 WORD_WRAPPED
    SCOTT@orcl_11gR2> SELECT stem,
      2           LISTAGG (word, ', ') WITHIN GROUP (ORDER BY word)
      3             AS words
      4  FROM   words_and_stems
      5  GROUP  BY stem
      6  /
    STEM                 WORDS
    BE                   BE, WERE
    CAT                  CAT, CATS
    CHASE                CHASE, CHASED
    DAY                  DAY
    DOG                  DOG, DOGS
    EVERY                EVERY
    FROM                 FROM
    MICE                 MICE
    MOUSE                MOUSE
    PEOPLE               PEOPLE
    QUICKLY              QUICKLY
    RAN                  RAN
    RUN                  RUN, RUNNING
    SOME                 SOME
    THE                  THE
    THEIR                THEIR
    WALK                 WALK
    17 rows selected.
    SCOTT@orcl_11gR2>
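
    Once each record's stems are known, the "minimum set of stems that hits every record" part of the question can be treated as a set-cover problem. The Python sketch below uses the usual greedy heuristic, which gives a small set but not always the smallest possible one. It assumes a "hit" means the record contains at least one of the chosen stems; the function name and the hand-written stem sets for the four sample rows are illustrative.

```python
def greedy_stem_cover(record_stems):
    """Pick stems until every non-empty record contains at least one chosen stem.

    record_stems maps a record id to the set of stems found in that record.
    """
    uncovered = {rid for rid, stems in record_stems.items() if stems}
    chosen = []
    while uncovered:
        # Count how many still-uncovered records each stem would hit.
        counts = {}
        for rid in uncovered:
            for stem in record_stems[rid]:
                counts[stem] = counts.get(stem, 0) + 1
        # Take the stem hitting the most records; break ties alphabetically.
        best = min(counts, key=lambda s: (-counts[s], s))
        chosen.append(best)
        uncovered = {rid for rid in uncovered if best not in record_stems[rid]}
    return chosen


# Stems written out by hand for the four rows of test_tab.
sample = {
    1: {"THE", "CAT", "RAN", "QUICKLY", "FROM", "DOG"},
    2: {"THE", "MICE", "BE", "RUN", "FROM", "CAT"},
    3: {"SOME", "PEOPLE", "WALK", "THEIR", "DOG", "EVERY", "DAY"},
    4: {"THE", "DOG", "CHASE", "CAT"},
}
cover = greedy_stem_cover(sample)
```

    For the sample rows no single stem appears in all four records, so any cover needs at least two stems.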
