How to identify line and paragraph breaks on word enumeration

Hi,
I am using PDWordFinder to extract text from a PDF document and then enumerate each word.
I need to identify the line breaks and replace them with white space,
and paragraph breaks should be replaced with "\r".
I use the following to identify the line breaks:
(PDWordGetAttrEx(docWord, 0) & WXE_LAST_WORD_ON_LINE)
Is there any way to identify paragraph breaks?
Thanks in advance
Vatspal

Is the PDF tagged/structured or not? If so, then you have all the
paragraph information you need by using the PDSEdit APIs. If not, then
you will need to do MUCH MORE heuristic analysis of the content to
determine paragraphs.
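
For the untagged case, one possible heuristic is to compare the vertical gap between consecutive lines with the typical line spacing and treat a noticeably larger gap as a paragraph break. The sketch below is not Acrobat SDK code: the `Word` class, its `baseline` field and the `gap_factor` threshold are assumptions made for illustration. In a plug-in the baseline would have to be derived from each word's bounding quad, and `last_on_line` stands in for the WXE_LAST_WORD_ON_LINE test from the question.

```python
from dataclasses import dataclass
from statistics import median


@dataclass
class Word:
    # Hypothetical stand-in for a PDWord; not an Acrobat SDK type.
    text: str
    baseline: float      # y of the word's baseline; larger y = higher on the page
    last_on_line: bool   # stand-in for (PDWordGetAttrEx(w, 0) & WXE_LAST_WORD_ON_LINE)


def join_words(words, gap_factor=1.5):
    # Baseline of each line, taken from the word that ends it.
    line_ys = [w.baseline for w in words if w.last_on_line]
    # Distances between consecutive lines that move down the page.
    pitches = [a - b for a, b in zip(line_ys, line_ys[1:]) if a > b]
    typical = median(pitches) if pitches else 0.0

    parts = []
    for i, w in enumerate(words):
        parts.append(w.text)
        if i == len(words) - 1:
            break  # no separator after the final word
        if not w.last_on_line:
            parts.append(" ")
            continue
        # End of a line: a gap well above the typical spacing is assumed
        # to mark a paragraph break; anything else becomes a space.
        gap = w.baseline - words[i + 1].baseline
        is_paragraph = typical > 0 and gap > gap_factor * typical
        parts.append("\r" if is_paragraph else " ")
    return "".join(parts)


words = [
    Word("Hello", 700, False), Word("world", 700, True),
    Word("again", 688, True),
    Word("New", 664, False), Word("para", 664, True),
    Word("ends", 652, True),
]
print(repr(join_words(words)))  # 'Hello world again\rNew para ends'
```

This only looks at line spacing. Column or page changes (where the next line is higher on the page) are treated as ordinary line breaks here, and cues such as first-line indentation or font changes are ignored, so real documents will likely need additional rules.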

Similar Messages

  • How to Identify incoming and outgoing IDOC in the system for last one year.

    Dear All,
    How to Identify incoming and outgoing IDOC in the system for last one year.
    Regards
    Ashok

    Dear Anindya,
    I used WE05 with the direction set to outbound, then entered the date in the created-on field. I am getting the IDoc numbers as output.
    Is this the only way to identify them?
    Thanks
    Ashok
    Edited by: ashok singh on Oct 22, 2009 8:40 AM

  • How to identify display and navigational attributes in report?

    Thanks to everyone on SDN; please send your clarifications.
    Q. How to identify display and navigational attributes in a report? Are there any naming convention differences between them?

    This works fine for "power users", but for informational users you have to create your own naming conventions for all navigational attributes which are not unique. For example 0COUNTRY may be a navigational attribute of 0CUSTOMER, 0SOLD_TO, 0SHIP_TO, 0BILLTOPRTY, 0PAYER. To make the text clear for your users in reports, name it CS Country, SO Country, SH Country, BT Country, ... as 0COUNTRY could also be a characteristic from the document.
    This means quite some work, especially if you have a multilingual installation, but the information users will be very thankful.
    hope this helps
    mich

  • How to display Lines and Boxes in SAP Script

    hi,
    Can anyone help me with how to display Lines and Boxes in SAP Script?
    Regards
    kiran

    The SAP printer drivers based on page-oriented printers use these commands when creating output whereas the line printers and non-supported page-oriented printers ignore these commands.
    Syntax: /: BOX [XPOS] [YPOS] [WIDTH] [HEIGHT] [FRAME] [INTENSITY]
    This command draws a box of the specified size at the specified position.
    The following calculation is performed internally to determine the absolute output position of a box on the page:
    X(abs) = XORIGIN + XPOS
    Y(abs) = YORIGIN + YPOS
    WIDTH determines the width of the box.
    Default: WIDTH value of the SIZE command.
    HEIGHT determines height of the box.
    Default: HEIGHT value of the SIZE command.
    FRAME determines the thickness of frame.
    Default: 0 (no frame).
    INTENSITY determines the grayscale of the box contents as a percentage.
    Default: 100 (full black)
    I guess the other details have been provided by friends who posted before me.
    Regards,
    K.Sibi

  • How to identify the text color in a word doc.?

    How to identify the text color in a Word doc?
    I need to read a Word document using Java code. It contains many strings in different colors.
    I need to identify the color and give marks accordingly, like
    test in blue color so
    test marks=2
    How can I do this using Java? I only want to know how I can identify the text color using Java code.

    morgalr wrote:
    >I guarantee it is not pretty.
    Indeed.
    I created a Word doc that simply has the word "Blue" in blue, then a space, then the word "Red" in red, all in the default font that Word started with (Times New Roman). The resulting document is 24,064 bytes. It starts off with 80 bytes of various hex values, mostly 0x00. Then 432 bytes of just 0xFF. Then 2048 bytes of various hex values, mostly 0x00. Then the text "Blue Red" (which appears twice more in the file). And so on...
    Edited by: jverd on May 10, 2010 8:45 AM

  • How can I copy and paste text into Word in Acrobat 9 Pro

    How can I copy and paste text into Word from Acrobat 9 Pro?

    You might find it easier to export the file to Word under File > Export > Word in Acrobat 9.

  • How do I download and work on a Word document?

    How do I download and work on a Word document?

    Download from where? The internet or ?
    Second, in order to work on a Windows file, you need software that will be able to do it - I'm sure there are more - this is one: Pages (iWork) or install a Windows version.

  • How to insert a character and paragraph break at the beginning of a paragraph using Grep

    Dear Community!
    I am stuck...
    I have several books in InDesign, which I want to export to EPUB. I have a standardized procedure, so that I can assign the task to students.
    My problem is that I have a paragraph style 'heading2' and I would like to insert a paragraph break before it which is styled 'new page'.
    Here is the theory I have thought about:
    -with find and change find the style 'heading2'
    -insert a special character + paragraph break before the first character of the paragraph
    -with a second find and change find the special character
    -replace with nothing and style with 'new page'
    My question is how to insert anything before the first character with GREP (?).
    Please help me!
    Thanks in advance
    ND

    Look in the flyout panel of your Paragraph panel.
    (Or in the Online Help -- a resource a lot of people seem to overlook... It's even *called* "Keep Options". ...
    >I have a problem along these lines that I have not found the answer to by searching, and neither have I been able to start a new discussion on this. (These forums are kind of hard to figure out)
    ... on the top right of THIS web page I can see a heading "Actions", with below it the text "Start a discussion" ...)

  • How to identify Parked and Posted Documents in GL Line Item Report

    Is there any way to identify parked and posted documents in GL Line Item Report S_ALR_87012282? If I tick the parked documents option in the further selection tab of the selection parameters, the report lists all documents (parked and posted).
    There is no available field in the layout. Is there any other way?
    Thanks!

    Hi,
    Another way to get the parked document details:
    use tcode: FBV3 - Display
    OR
    you can go to SE16 and enter table name VBKPF - Document Header for Document Parking
    BSTAT = V
    V = parked document
    and execute; you will get the details of the parked documents.
    Edited by: Manohar Mathkunti on Sep 13, 2008 11:11 AM

  • How to get rid of paragraph breaks in a lengthy text instead of manually

    I have imported a large piece of text - it is very narrow and has many paragraph breaks I guess you call it - or return breaks. I started doing it manually but it is a nightmare. Is there an easier way to do it? I snooped around and tried a few things but cannot seem to get this to work. I have a lot more work to do. Thanks again.

    Position the mouse pointer after the last printed character of a paragraph. Press the mouse button and hold it while you drag down over several of the empty lines. Release the mouse button. Simultaneously press command+C keys (then release) to copy the selection to the clipboard. Simultaneously press command+F keys (then release) to open the Find&Replace dialog. Click once in the "Find:" window, then press command+V keys (then release) to paste the contents of the clipboard there. Click in the "Replace:" window then press "Return" to enter a single carriage return. Click on the "Replace & Find" button.

  • How to identify fact and dimension tables

    Hi ,
    We have a list of parent-child relationship info for each database table. Based on the parent-child tables, we need to identify which table is a fact table and which is a dimension table. Could you please help me with how to identify the fact tables and dimension tables?
    Thanks in advance

    Hi,
    Refer to this link:
    http://www.oraclebidwh.com/2007/12/fact-dimension-tables-in-obiee/
    Please mark if it helps you.......

  • How to identify transfer and update routines are applied to my ods/ cubes?

    hi all,
    How can I identify whether transfer or update routines have been applied to my ODS/cube?
    Regards
    hari

    You need to go through the update rule and transfer rule mappings; if you see a routine there, then a routine is applied. It is a manual process only, but it is not too bad to go through them.
    thanks.
    Wond

  • [ID5] Find line without paragraph break

    Hello,
    I want to Find lines containing a specific character style. But when that line is at the end of a paragraph, it also selects the paragraph break. Which GREP do I have to use to select the line without the break?
    Regards, Sjoerd

    Almost forgot.
    Neither Find Text nor Character Styles have anything to do with lines. "Find Text" never selects an entire 'line'. Not even GREP can do this. (The Online Help for GREP contains the word "line" a couple of times [*] but it shouldn't, except in the negated sense of "GREP cannot distinguish a separate line".)
    [*] From memory. I tried to verify this, but the Online Help failed to appear. Again.
    ... Adding Insult to Injury: a pop-up offering this:
    Would you like to add Community Help search on Adobe® InDesign™ to your browser's list of search engines?
    Well, I don't think so.

  • Table Names ( for quotations , quotation lines and price breaks)

    Hi Experts,
    What are the table names for quotation, quotation lines and quotation price breaks.
    Thanks,
    MPH

    Quotes and POs share the same tables.
    The type_lookup_code in po_headers_all identifies the type of the document.
    quotations => po_headers_all (or you can use PO_HEADERS_RFQQT_V)
    lines => po_lines_all (or you can use PO_LINES_RFQQT_V)
    price breaks => po_line_locations_all (or you can use PO_LINE_LOCATIONS_V )
    Hope this answers your question,
    Sandeep Gandhi

  • How to install Line and Telegram in Q10?

    I want to install Line and Telegram in my BB Q10, but I can't find them on App World.
    Is there a way to install them on my Q10?

    Hello bb2015
    Welcome to the Community
    You can install Line and Telegram using Snap and the guidance here: You Upgraded to OS 10.2.1 and want to run Android ... Credit to John Clark. 
    Cheers
