Enhancing scanned text in a PDF file


Are there any programs you would recommend for cleaning up scanned
images before OCR? I have several multi-page image files in PDF format
that I want to convert to text. I have a couple of good OCR
programs that almost do a good enough job of converting them, but I think
they would work better if I could run some filters and do some manual
editing on the page images first. Any suggestions for programs that can do
this without my having to tear the PDF files apart into individual images?

Thanks,

Michael

Reply michalchik (7) 8/30/2006 2:51:57 AM

Photoshop Elements will open a multi-page PDF document, then let you 
open and edit one page at a time, then save that page back into (I'm 
pretty sure) the same position in the document.  (Illustrator will do 
this also with the vector/PS content in PDF documents.)

Whether you can clean up the text enough to improve the OCRing is a 
separate question -- no experience with that myself.
Reply AES 8/30/2006 3:52:29 AM
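
For readers wondering what "cleaning up" a scan for OCR might involve, here is a minimal sketch of two common filters: binarization (forcing every pixel to pure ink or pure paper) and despeckling (dropping isolated ink dots). It is not taken from any program mentioned in this thread. The function names are hypothetical, and the page is assumed to be already available as a grid of 0-255 grayscale values; extracting page images from the PDF is left out.

```python
def binarize(gray, threshold=128):
    """Map each 0-255 grayscale pixel to 0 (ink) or 255 (paper).

    The threshold of 128 is an assumed default; real scans usually
    need it tuned per document.
    """
    return [[0 if px < threshold else 255 for px in row] for row in gray]


def despeckle(binary):
    """Turn isolated ink pixels white.

    An ink pixel is treated as a speck when none of its (up to) eight
    neighbours is ink. Input is a grid of 0/255 values from binarize().
    """
    h = len(binary)
    w = len(binary[0]) if h else 0
    out = [row[:] for row in binary]  # copy so the input stays unchanged
    for y in range(h):
        for x in range(w):
            if binary[y][x] != 0:
                continue
            has_ink_neighbour = any(
                binary[ny][nx] == 0
                for ny in range(max(0, y - 1), min(h, y + 2))
                for nx in range(max(0, x - 1), min(w, x + 2))
                if (ny, nx) != (y, x)
            )
            if not has_ink_neighbour:
                out[y][x] = 255
    return out


# Tiny made-up "page": a two-pixel stroke plus one stray speck.
page = [
    [250, 240, 250, 250],
    [250,  30,  40, 250],
    [250, 250, 250, 250],
    [250, 250, 250,  20],
]
cleaned = despeckle(binarize(page))
```

In this toy page the two adjacent dark pixels survive as ink while the lone dark pixel in the corner is removed. Whether filters like these actually improve recognition depends on the scans and the OCR engine.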


Even if you can improve the percentage of correct output from the OCR 
program, I've never seen one whose output is 100% correct, so you'll still 
have to edit the output of the OCR program. I think you'd be just as far 
ahead leaving the PDF as-is and finding a way to work on the text output 
(dictionary, grammar checker, etc.).
--
Gerry
Reply Gerry 8/30/2006 1:42:42 PM
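
As a rough illustration of the dictionary approach suggested above, the sketch below replaces words that are not in a word list with the closest entry that is, using Python's standard `difflib` module. The function names, the tiny vocabulary, and the 0.8 similarity cutoff are all assumptions made for the example, and a real word list would be far larger. It also ignores capitalization and only looks at runs of letters, so it is a starting point, not a finished spell checker.

```python
import difflib
import re


def correct_word(word, vocabulary, cutoff=0.8):
    """Return the closest vocabulary entry, or the word itself if none is close.

    vocabulary is assumed to be a set of lowercase words. A replacement
    comes back in lowercase, so original capitalization is lost.
    """
    lowered = word.lower()
    if lowered in vocabulary:
        return word
    matches = difflib.get_close_matches(lowered, vocabulary, n=1, cutoff=cutoff)
    return matches[0] if matches else word


def correct_text(text, vocabulary, cutoff=0.8):
    """Apply correct_word to every run of ASCII letters in text."""
    return re.sub(
        r"[A-Za-z]+",
        lambda m: correct_word(m.group(0), vocabulary, cutoff),
        text,
    )


# Hypothetical word list and OCR output ("rn" misread for "m", "l" for "i").
vocabulary = {"the", "document", "recognition"}
fixed = correct_text("the docurnent recognltion", vocabulary)
```

Words with no sufficiently close match are left untouched, so the output still needs the manual proofreading pass described above.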

Hi Michael,

I would try ABBYY's PDF Transformer. It will re-OCR the PDF pages and 
convert them to an Office format for editing, then let you convert the 
result back to a text-searchable PDF using our own PDF-XChange, which is 
now included as part of their bundle in V2.x.

I have found it exceptional for PDF-to-Office conversions, irrespective of 
the Office-to-PDF conversion back again.

You can download a trial version from their web site :

http://www.pdftransformer.com/

-- 
Best Regards

John Verbeeten
Tracker Software Products
PDF-XChange & SDK, PDF-XChange Viewer;
PDF-Tools & SDK, Image-XChange SDK,
TIFF-XChange & SDK, Raster-XChange.
Email : johnV@docu-track.com
Support: http://www.docu-track.com/forum/index.php
Web site : http://www.docu-track.com
Reply John 8/30/2006 7:27:56 PM
