Working with .pdf files and JAVA

Hi,
does anyone know where I can find more information on working with .pdf files?
I would like to convert .pdf files to text files and/or XML files. I cannot find anything for this in the J2SE edition, and someone told me it can be found in the J2EE edition, but I cannot find anything there either. Please help.
thanks,
R.

Thanks for your reply. What tools do you mean? I know lots of tools for converting text to a .pdf file, but no tools for the other direction. There is a commercial API available that lets you work with PDF in Java, but I am interested in the other possibilities.
Regards

Similar Messages

Quicklook does not work with WMV files and quick look no longer maintains resized views when viewing from a folder using the up/down arrows

Quicklook does not work with WMV files and quick look no longer maintains resized views when viewing from a folder using the up/down arrows. Any fixes?

Same problem here...

Creating DVD with PDF files and web links

Hi all, first I'd like to say that these forums are a big help. I've spent DAYS scouring through topics learning. Of course, I know this opens it up for someone to post a link to a thread where my question has already been answered. Unfortunately, I haven't been able to find the specific help I need and would like to open a dialogue with experts.
I am creating a marketing DVD for a product. We produced a video for it, but the client also wants the audience to have access to a large amount of research in this specific field. This exists as PDF files and links to websites.
His previous marketing CD was just that: a CD made with FileMaker that held the files and links and only worked on PCs. I do not want to go back in that direction.
I want to make an informative DVD with the video and a few pages of selling points and cool tricks (I discovered multilayered menus working on this!) for those viewing on TV or Computer, and then an option for Computer users to click for more info.
How do I put PDF files on the disc and how do I put web links on there?
Thanks,
Byron

DVDSP uses a tool called DVD@Access. It enables a user to link to URLs and open documents such as PDFs. The problem is that it never was reliable, especially on the PC side of things.
There have been many posts in the last 2 days about its use. Do a search and you'll see.
DVD was not designed with the web in mind; it was conceived long before that time. Linking to "outside" documents requires a third-party tool to take over.
Just beware! In fact, if your client came to me I would refuse to do the job. I've seen the problems that exist doing work like this, especially if you're distributing to a large audience with different OS setups. If one of the users has Vista you can forget about it working at all.
My suggestion would be to design a menu that tells the user the file paths to your pdfs or URLs on the disc.

Working with pdf files in swing applications

Hi,
I have a Swing application which displays a PDF file and contains a text box. I want to display the current page number of the PDF file in the text box.
Can anyone please guide me on how to implement this functionality?
Regards,
Tommy
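
One possible approach, sketched under assumptions: the post does not say which PDF viewer component is used, so the sketch below keeps the page index in a small model of its own and lets the text box listen to it. `PageTracker`, `label` and `bindTo` are hypothetical names for illustration and not part of any PDF library; only the `java.beans` and Swing text calls are standard JDK API.

```java
import java.beans.PropertyChangeListener;
import java.beans.PropertyChangeSupport;
import javax.swing.text.JTextComponent;

/** Hypothetical model holding the page currently shown by the PDF viewer. */
final class PageTracker {
    private final PropertyChangeSupport support = new PropertyChangeSupport(this);
    private final int pageCount;
    private int currentPage = 1; // pages are counted from 1

    PageTracker(int pageCount) {
        if (pageCount < 1) {
            throw new IllegalArgumentException("pageCount must be >= 1");
        }
        this.pageCount = pageCount;
    }

    int getCurrentPage() {
        return currentPage;
    }

    /** Call this from wherever the viewer changes page (buttons, scrolling, ...). */
    void setCurrentPage(int page) {
        int clamped = Math.max(1, Math.min(pageCount, page));
        int old = currentPage;
        currentPage = clamped;
        // No event is fired when old and new values are equal.
        support.firePropertyChange("currentPage", old, clamped);
    }

    void addPropertyChangeListener(PropertyChangeListener listener) {
        support.addPropertyChangeListener(listener);
    }

    /** Text for the page box, e.g. "3 / 10". */
    String label() {
        return currentPage + " / " + pageCount;
    }

    /** Shows the current page in the given text box and keeps it up to date. */
    void bindTo(JTextComponent field) {
        field.setText(label());
        addPropertyChangeListener(event -> field.setText(label()));
    }
}
```

In the application this would be wired up with something like `tracker.bindTo(pageField)` for a `JTextField`, with the viewer's next/previous handlers calling `tracker.setCurrentPage(...)` on the event dispatch thread. If the viewer component already fires page-change events of its own, listening to those directly is probably simpler.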

Change in behavior when working with PDF files in illustrator CC and CC2014. HELP IS NEEDED!

Create a new file in CC and save it as a PDF. Open the same PDF in CC 2014, make a change and save it. Open the file in CC again: a dialog box now appears saying the file was made in a newer version of Illustrator. This new behavior is stopping our entire production! What can we do? NEED HELP ASAP
Cheers
Jesper G

How can I down-save a PDF file in CC 2014?
This is very unfortunate, because we use some VB scripts together with Illustrator. That process now stops because of this message!
I don't know how to solve this issue!

Full-Text search is not working with PDF files - SQL Server 2012 64 bit

Hi,
We are in the process of storing PDF files in SQL Server 2012 with Full-Text search capability.
I followed the steps below and it works fine with Word documents but not with PDF files. I tried PDF iFilter 11 and 9, and both were unsuccessful.
Server/DB level settings:
1) Enable FileStream.
2) Install Full-Text, then restart.
3) Use [specific db]:
   alter database [db name] add filegroup Files contains filestream;
   alter database [db name] add file (name = N'Files', filename = N'D:\SQL\DATA') to filegroup [Files];
3) Database level settings:
   FileStream:
   FileStream Directory name: [Set the name]
   FileStream non-transacted Access: [set Appropriate]
3a) Add a datafile to DB with filestreamdata filetype.
4) Share the D:\SQL\DATA directory and add specific accounts with read/write access.
5) Give bulkadmin access to those specific accounts at server level.
6) From the page (link) download and install the *.pdf IFilter for FTS. Link: http://www.adobe.com/support/downloads/detail.jsp?ftpID=5542
7) To the PATH global system variable add the path to the catalog where you installed the plugin. Default for this version is: C:\Program Files\Adobe\Adobe PDF iFilter 9 for 64-bit platforms\bin
8) From the page (link) download FilterPackx64.exe and install it. Link: http://www.microsoft.com/en-us/download/confirmation.aspx?id=20109
9) Now from SSMS execute the following procedures:
   -sp_fulltext_service 'load_os_resources', 1
   -sp_fulltext_service 'verify_signature', 0
   EXEC sp_fulltext_service 'update_languages'; -- update language list
   EXEC sp_fulltext_service 'restart_all_fdhosts'; -- restart daemon
   reconfigure with override;
10) Restart the server.
11) select document_type, path from sys.fulltext_document_types where document_type = '.pdf'
   -select document_type, path from sys.fulltext_document_types where document_type = '.docx'
12) Results are OK.
Following is my table/index/catalog script:

CREATE TABLE dbo.DocumentFilesTest
(
    DocumentId INT IDENTITY(1,1) NOT NULL PRIMARY KEY,
    AddDate datetime NOT NULL,
    Name nvarchar(50) NOT NULL,
    Extension nvarchar(10) NOT NULL,
    Description nvarchar(1000) NULL,
    FileStream_Id UNIQUEIDENTIFIER ROWGUIDCOL NOT NULL UNIQUE DEFAULT NEWSEQUENTIALID(),
    FileSource varbinary(MAX) FILESTREAM DEFAULT(0x)
)
go
--Add default add date for document
ALTER TABLE dbo.DocumentFilesTest
    ADD CONSTRAINT DF_DocumentFilesTest_AddDate DEFAULT sysdatetime() FOR AddDate
EXEC sp_fulltext_database 'enable'
GO
IF NOT EXISTS (SELECT TOP 1 1 FROM sys.fulltext_catalogs WHERE name = 'Ducuments_Catalog_test')
BEGIN
    EXEC sp_fulltext_catalog 'Ducuments_Catalog_test', 'create', 'D:\SQL\PDFBlob';
END
--EXEC sp_fulltext_catalog 'Ducuments_Catalog_test', 'drop'
DECLARE @indexName nvarchar(255) = (SELECT Top 1 i.Name
    from sys.indexes i
    Join sys.tables t on i.object_id = t.object_id
    WHERE t.Name = 'DocumentFilesTest' AND i.type_desc = 'CLUSTERED')
PRINT @indexName
EXEC sp_fulltext_table 'DocumentFilesTest', 'create', 'Ducuments_Catalog_test', @indexName
EXEC sp_fulltext_column 'DocumentFilesTest', 'FileSource', 'add', 0, 'Extension'
EXEC sp_fulltext_table 'DocumentFilesTest', 'activate'
EXEC sp_fulltext_catalog 'Ducuments_Catalog_test', 'start_full'
ALTER FULLTEXT INDEX ON [dbo].[DocumentFilesTest] ENABLE
ALTER FULLTEXT INDEX ON [dbo].[DocumentFilesTest] SET CHANGE_TRACKING = AUTO
ALTER FULLTEXT CATALOG Ducuments_Catalog_test REBUILD WITH ACCENT_SENSITIVITY=OFF;

INSERT INTO DocumentFilesTest(Extension, Name, FileSource)
SELECT 'pdf', 'BOL12006553.pdf', *
FROM OPENROWSET(BULK 'd:\SQL\PDFBlob\BOL12006553.pdf', SINGLE_BLOB) AS BLOB;
GO
INSERT INTO DocumentFilesTest(Extension, Name, FileSource)
SELECT 'docx', 'test.docx', *
FROM OPENROWSET(BULK 'd:\SQL\PDFBlob\test.docx', SINGLE_BLOB) AS Document;
GO

SELECT d.* FROM dbo.DocumentFilesTest d WHERE Contains(d.FileSource, 'BILL')

Returns nothing. It should return the row for the PDF file.

SELECT d.* FROM dbo.DocumentFilesTest d WHERE Contains(d.FileSource, 'TEST')

Returns the row for the Word document, as follows:
2   2014-06-04 10:11:41.393   test.docx   docx   NULL   [BINARY Value]   [Binary Value]
Any help is appreciated. It's been a long wait.
Thanks,
Vel
Vel Thavasi

Hello,
Did you check the full-text log files for more details about the errors? If the filter isn't working, there should be errors in the error log file.
The following thread is about similar issue, please refer to:
http://social.msdn.microsoft.com/forums/sqlserver/en-US/69535dbc-c7ef-402d-a347-d3d3e4860d72/sql-server-2008-64bit-fulltext-indexing-pdf-not-working-cant-find-ifilter
Regards,
Fanny Liu
TechNet Community Support

File, Place only works with PDF files...why?

I create documents in Mac Pages that I want to then create an interactive PDF (mainly navigation). I am using the demo copy of Indesign to see if it fits the bill.
The Mac Pages document is fully formatted and ready for export to a static PDF. As a test, I took a few pages of it and exported them to PDF, Word and RTF.
The only file format that InDesign would import/place is PDF (Pages, Word and RTF were all grayed out and could not be selected via ID Place).
I had hoped that ID would import/place Pages files directly, but I cannot seem to get it to import any format other than PDF.
I tried some other .doc files (actually created with Word) and they were selectable, but only the table of contents was imported (no red arrow at the lower right of the text box to continue the place).
Any suggestions?
thanks
bob

ID is certainly capable of placing RTF as well as native Word files (DOC and DOCX). What you're seeing is quite unusual.
Try trashing your preferences.
For the other DOC files, you need to hold down the Shift key when you click the page to place them.
Bob

Working with PDF files

Hello, we would like to write some functionality that generates PDF files from our Java application and additionally, some functionality that reads them into the app also. What is the best API to use for this? Would it be iText?

Aha, I'll show my code and say nothing more.

1. jacob for extracting text from pdf, word and excel.
jacob is a bridge which connects Java and COM or Win32 functions. It needs a DLL, but the author of jacob provides it.
jacob: http://www.matrix.org.cn/down_view.asp?id=13
Put the DLL on the path and the jar file on the classpath.

import java.io.File;
import com.jacob.com.*;
import com.jacob.activeX.*;

public class FileExtracter {
    public static void main(String[] args) {
        // Start Word through COM automation
        ActiveXComponent app = new ActiveXComponent("Word.Application");
        String inFile = "c:\\test.doc";
        String tpFile = "c:\\temp.htm";
        String otFile = "c:\\temp.xml";
        boolean flag = false;
        try {
            app.setProperty("Visible", new Variant(false));
            Object docs = app.getProperty("Documents").toDispatch();
            // Open the .doc file and save it as HTML (file format 8)
            Object doc = Dispatch.invoke(docs, "Open", Dispatch.Method, new Object[]{inFile, new Variant(false), new Variant(true)}, new int[1]).toDispatch();
            Dispatch.invoke(doc, "SaveAs", Dispatch.Method, new Object[]{tpFile, new Variant(8)}, new int[1]);
            Variant f = new Variant(false);
            Dispatch.call(doc, "Close", f);
            flag = true;
        } catch (Exception e) {
            e.printStackTrace();
        } finally {
            // Always shut Word down again
            app.invoke("Quit", new Variant[] {});
        }
    }
}

2. Apache's poi for extracting text from word and excel.
poi package: http://www.matrix.org.cn/down_view.asp?id=14
Put it on the classpath.

import java.io.*;
import org.textmining.text.extraction.WordExtractor;

/**
 * Title: pdf extraction
 * Description: email:[email protected]
 * Copyright: Matrix Copyright (c) 2003
 * Company: Matrix.org.cn
 * @author chris
 * @version 1.0,who use this example pls remain the declare
 */
public class PdfExtractor {
    public PdfExtractor() {
    }

    public static void main(String args[]) throws Exception {
        // Read a Word document and print its plain text
        FileInputStream in = new FileInputStream("c:\\a.doc");
        WordExtractor extractor = new WordExtractor();
        String str = extractor.extractText(in);
        System.out.println("the result length is" + str.length());
        System.out.println("the result is" + str);
    }
}

3. pdfbox for pdf
http://www.matrix.org.cn/down_view.asp?id=12

import org.pdfbox.pdmodel.PDDocument;
import org.pdfbox.pdfparser.PDFParser;
import java.io.*;
import org.pdfbox.util.PDFTextStripper;
import java.util.Date;

/**
 * Title: pdf extraction
 * Description: email:[email protected]
 * Copyright: Matrix Copyright (c) 2003
 * Company: Matrix.org.cn
 * @author chris
 * @version 1.0,who use this example pls remain the declare
 */
public class PdfExtracter {
    public PdfExtracter() {
    }

    public String GetTextFromPdf(String filename) throws Exception {
        String temp = null;
        PDDocument pdfdocument = null;
        FileInputStream is = new FileInputStream(filename);
        try {
            // Parse the PDF, then strip its text into an in-memory buffer
            PDFParser parser = new PDFParser(is);
            parser.parse();
            pdfdocument = parser.getPDDocument();
            ByteArrayOutputStream out = new ByteArrayOutputStream();
            OutputStreamWriter writer = new OutputStreamWriter(out);
            PDFTextStripper stripper = new PDFTextStripper();
            stripper.writeText(pdfdocument.getDocument(), writer);
            writer.close();
            byte[] contents = out.toByteArray();
            String ts = new String(contents);
            System.out.println("the string length is" + contents.length + "\n");
            return ts;
        } finally {
            // Release the document and the input stream (not in the original post)
            if (pdfdocument != null) {
                pdfdocument.close();
            }
            is.close();
        }
    }

    public static void main(String args[]) {
        PdfExtracter pf = new PdfExtracter();
        PDDocument pdfDocument = null;
        try {
            String ts = pf.GetTextFromPdf("c:\\a.pdf");
            System.out.println(ts);
        } catch (Exception e) {
            e.printStackTrace();
        }
    }
}
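
Since the original question also asked about XML output, here is a minimal sketch of how text already returned by something like `GetTextFromPdf` could be wrapped in a simple XML document using only the JDK. The class name `TextToXml`, the method `toXml`, and the `document`/`page` element layout are assumptions made for illustration; they are not part of PDFBox or any standard schema. If the extractor only returns one string for the whole file, pass a one-element list.

```java
import java.io.StringWriter;
import java.util.List;
import javax.xml.parsers.DocumentBuilderFactory;
import javax.xml.transform.OutputKeys;
import javax.xml.transform.Transformer;
import javax.xml.transform.TransformerFactory;
import javax.xml.transform.dom.DOMSource;
import javax.xml.transform.stream.StreamResult;
import org.w3c.dom.Document;
import org.w3c.dom.Element;

/** Hypothetical helper: wraps already-extracted page text in a small XML document. */
final class TextToXml {
    private TextToXml() {
    }

    static String toXml(String sourceName, List<String> pages) throws Exception {
        Document doc = DocumentBuilderFactory.newInstance().newDocumentBuilder().newDocument();

        // <document source="..."> is an invented layout, not a standard one.
        Element root = doc.createElement("document");
        root.setAttribute("source", sourceName);
        doc.appendChild(root);

        for (int i = 0; i < pages.size(); i++) {
            Element page = doc.createElement("page");
            page.setAttribute("number", String.valueOf(i + 1));
            // Setting text through the DOM lets the serializer escape & and <.
            page.setTextContent(pages.get(i));
            root.appendChild(page);
        }

        // An identity transform serializes the DOM tree to a string.
        Transformer transformer = TransformerFactory.newInstance().newTransformer();
        transformer.setOutputProperty(OutputKeys.OMIT_XML_DECLARATION, "yes");
        StringWriter out = new StringWriter();
        transformer.transform(new DOMSource(doc), new StreamResult(out));
        return out.toString();
    }
}
```

Building the document through the DOM rather than by string concatenation means characters such as `&` in the extracted text are escaped by the serializer. Extracted PDF text can also contain control characters that are not legal in XML 1.0, which this sketch does not filter.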

Issues with .pdf files and Firefox Android Beta

I read the local paper online, and the individual pages are presented as .pdf files. With the non-beta Firefox for Android, clicking on such a file downloads it and automatically calls up the default .pdf reader, which opens the page. Since some recent change, the Beta version simply downloads the file, and I then have to click manually to open it. I tried clearing defaults on both Firefox and the Kindle app and force-stopping both, all to no avail. Suggestions?
Bob

The change was made in Firefox 25, which is why you're currently seeing this behaviour only in Firefox Beta. The general release of Firefox 25 is scheduled for next week. I originally also thought of recommending the pdf.js add-on as an alternative, but I found it to be very slow on my Android too. Unfortunately, I don't know of any other solution currently.

Best way to work with FCP files and Adobe Premiere

Hi,
I have a bit of a technical problem. I'm editing a doco for broadcast on TV. It was shot on HDV and downloaded onto our system as PAL, Apple ProRes 422, 25 fps. However, our graphics editor is on Adobe Premiere CS4.
What is the best way to get it onto her system and back off again and onto ours, with minimum or no loss of quality?
I've heard that XML might work, but will it work with Adobe CS4? I don't know how this will pan out if it's XML. Will we be able to open it all back up in FCP when she hands it back? What sort of file do we get back from the Adobe Premiere Pro CS4 system?
I'm a bit nervous about it all.
Hope you can help.
Cheers,
Margie

This got me curious, Eagleray, as I go from FCS to CS on my Mac all the time. But going to a PC got me a googlin. Hope this discussion http://www.animotion.nl/en/tutorials/fcp-xml-premiere-pro-cs5/ helps, as quite a few students seem to have Macs at school and PCs at home. Also be aware of the different formatting of some PC drives, like FAT16, FAT32 and NTFS, vs the Mac GUID. The Mac can read them all with the proper system add-ons/drivers; they show up in System Preferences under "OTHER". I use NTFS-3G to mount NTFS PC drives. I think it was free; if not, there is another one I used to use that is free from some open source NTFS group. FAT has a size limit of, I think, 2 or 4 GB (you can't copy anything bigger than that at one time to a Mac from a FAT drive). I'm on Snow Leopard on my 8-core. My partner, who edits in PP CS3 on a PC, brought over his Maxtor USB drive the other day for me to copy some video files from, and it's been having problems mounting, I believe since I upgraded to SL. I threw it on my 6-year-old PowerBook G4 running 10.5.8 and it mounted right away. Just another weird gotcha. But take a quick look: though the title says CS5, a few students were using CS4 on a PC.
Message was edited by: TimeKoder13

Working with RAW Files and JPG

I have a high-end digital camera that takes pictures in RAW format as well as JPG format. When I import the photos, iPhoto only imports the JPG format. Is there a way to import RAW photos into iPhoto?

It does (and should by default). However, after it stores them in the ORIGINALS folder, it creates a .jpg copy in the MODIFIED folder and uses that, by the looks of things. (Use Spotlight or Finder to see if you've actually imported any RAWs.)
You should see the RAWs displayed in the iPhoto window after import. If not, check the connection settings in your camera menu, and make sure you are using a program or creative mode (my 20D does not take RAWs in the preset or Green mode).
To work on RAWs you'd need to go to Adobe Photoshop with Adobe Camera Raw (included in CS2) or Apple Aperture, which is now significantly cheaper.

Adobe reader not associating with pdf files

I've installed Adobe Reader X after reinstalling Windows, but Reader is not associating with PDF files and does not show in the "Open with..." list. Even when I browse to it and try to associate it manually, it does not appear in the recommended programs list or the other programs list. To open PDF files I have to open Adobe Reader first and then open the files from there.

Please use the "Adobe Reader and Acrobat Cleaner Tool" from http://labs.adobe.com/downloads/acrobatcleaner.html and remove any traces of Adobe Reader already on your system.
After successfully removing the application, please re-install Adobe Reader from: http://get.adobe.com/reader
Hope this helps
Ankit

How to open a pdf file and then attach it with images

I am new to Indesign Server.
I'm currently working on a PDF.
I have a blank white PDF template that I want to attach/glue images to.
How do I open a PDF file and then attach images to it?
Please, help me.
Thanks.

First step would be to make yourself familiar with InDesign desktop version.
Whatever you intend to achieve, do it there manually. (see regular app docs or forums)
Then try to automate your steps with scripting (see scripting docs or forum)
If you can do it with a script in the desktop version, that script will likely also run in ID Server. (see server forum).
If you can identify missing features that are not achievable through scripting or manual use, consider writing a plugin (this forum).
A seasoned C++ programmer will need a few months to learn the basics, wade through tons of documentation etc. Alternatively, consider hiring a consultant to do the development work for you.
Dirk

I tried opening a pdf file and I set it to use always and firefox which did not work, I was wondering how to undo that and what I should use to open pdf files in the future?

I was attempting to open a document from a site I use frequently but had never visited on my new Mac. When I tried to open the PDF file, I was asked which program to use, and I mistakenly chose Firefox and clicked "use always". I do not know how to undo this or what I should use in the future to open PDF files.

Hi Melfour-
Here is a Support article detailing how to work with your Firefox PDF preferences:
[[Opening PDF files within Firefox]]
Hope that helps.

Issue with PDF links and opened Adobe files

I have tested this, and IE 8 opens the links inside PDFs correctly.
When I open a link in Mozilla or Chrome it does not work.
Is this related to the Adobe plugin which is installed in Add-ons?
Strangely, the link inside the PDF shows the following: resource://pdf.js/web/
Checking earlier support issues, I found this quoted:
The current Firefox 27 beta release still doesn't show the links properly using version 0.8.641, but the current Aurora 28.0a2 build does show the links to the Appendices properly using version 0.8.759.
So that would mean that when Firefox 28 gets released in March it will work for users with this Firefox version.

This is likely a problem with the way those links are coded in the PDF file, so the file may have to be saved again using different settings to make it work with PDF Viewers other than the Adobe Reader.
