PDF to text

Hi there,

I'm looking for a Java library that will extract the text from a PDF
while retaining whitespace formatting, i.e. paragraphs, newlines, etc.

I've looked at, and tested, PDFBox. It does extract the text, but it
does not preserve, or insert, paragraphs and newlines in its output.
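
To illustrate the output I'm after, here is a minimal sketch of the
behaviour. It assumes the extractor can hand back text fragments together
with their vertical position on the page. The Fragment class, the layout
method and the gap thresholds are all hypothetical, made up for this
example; they are not part of PDFBox or any other library.

```java
import java.util.Arrays;
import java.util.List;

final class LayoutSketch {

    /** Hypothetical fragment: a run of text and its baseline y (points, growing down the page). */
    static final class Fragment {
        final String text;
        final double y;

        Fragment(String text, double y) {
            this.text = text;
            this.y = y;
        }
    }

    /**
     * Joins fragments (assumed to be in reading order) and picks the separator
     * from the vertical gap to the previous fragment:
     *   gap under half a line height      -> same line, single space
     *   gap over one and a half heights   -> paragraph break (blank line)
     *   anything in between               -> plain newline
     * The 0.5 and 1.5 factors are arbitrary guesses, not measured values.
     */
    static String layout(List<Fragment> fragments, double lineHeight) {
        StringBuilder out = new StringBuilder();
        Fragment prev = null;
        for (Fragment f : fragments) {
            if (prev != null) {
                double gap = f.y - prev.y;
                if (gap < 0.5 * lineHeight) {
                    out.append(' ');
                } else if (gap > 1.5 * lineHeight) {
                    out.append("\n\n");
                } else {
                    out.append('\n');
                }
            }
            out.append(f.text);
            prev = f;
        }
        return out.toString();
    }

    public static void main(String[] args) {
        List<Fragment> page = Arrays.asList(
                new Fragment("Hello", 100),
                new Fragment("world", 100),
                new Fragment("second line", 112),
                new Fragment("New paragraph", 148));
        System.out.println(layout(page, 12));
    }
}
```

In other words, text on the same baseline stays on one line, the next
baseline starts a new line, and a larger vertical gap becomes a blank
line between paragraphs.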

I've looked at iText, but according to its FAQ it will not extract the
text from a PDF.

I'd rather not use an external program like "pdftotext"; a pure-Java,
library-based solution would be better.

Any ideas?

Cheers

Lord0

8/3/2005 1:59:04 PM
comp.lang.java.programmer