Can't open more than 20 files in Acrobat 5



I've had a large number (c. 4,000) of scans made to PDF for a newspaper archive
I'm collating. However, I have discovered that some of the pages have been
scanned upside down, or turned 90 degrees during the OCR process, and I'm
looking for a quick way of isolating them.
I had been dragging and dropping them issue by issue to Acrobat 5 and then
just closing them one by one - but I then realised it would only open 20
files at a time, so I was missing them.
Anyone know a way to:
a) quickly get more than 20 files open at once
b) check the files in another way

I know I can just pick them off in batches, but there are so many of them
I'm looking for a quicker method.

James


Reply James 8/3/2003 5:02:03 PM

James Goffin wrote:

> I've had a large number (c4000) scans made to PDF for a newspaper archive
> I'm collating. However, I have discovered that some of the pages have been
> scanned upside down, or turned 90 degrees during the OCR process, and I'm
> looking for a quick way of isolating them.
> I had been dragging and dropping them issue by issue to Acrobat 5 and then
> just closing them one by one - but I then realised it would only open 20
> files at a time, so I was missing them.
> Anyone know a way to:
> a) quickly get more than 20 files open at once
> b) check the files in another way

Use free software.

Copy (or link) all files to the same directory.
Run Ghostscript with the following parameters.
gs -dLastPage=1 -dNOPAUSE -dBATCH check.ps

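%!
% This is check.ps
% Shows each PDF in the current directory, prompts for a note,
% and writes "file_name note" lines to file.log.
/in (%stdin) (r) file def        % console input
/out (file.log) (w) file def     % log file, overwritten on each run
/stdout (%stdout) (w) file def   % console output for the prompt
/s 120 string def                % buffer for the typed note (120 chars max)

% foo is called with a file name on the stack.
/foo {
   dup out exch writestring      % log the file name
   out ( ) writestring
   run                           % render the file (-dLastPage=1 limits it to page 1)
   stdout (Orientation ? ) writestring
   stdout flushfile
   in s readline pop             % read one line, discard the success flag
   dup (x) eq {pop exit} if      % typing x stops the loop early
   out exch writestring          % log the typed note
   out (\n) writestring
} bind def

(*.pdf) //foo =string filenameforall
out closefile                    % flush and close the log
%EOF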

Tile the Ghostscript and console windows, then type the file
orientation (or anything else) at the "Orientation ?" prompt.
You need to press Enter at the end because Ghostscript has a
line-buffered console. Typing x on its own stops the run early.
On Windows use gswin32c instead of gs.

The result will be saved in file.log in the same directory.
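
Once file.log exists, the files that need fixing can be pulled out of it with a few lines of Python. This is only a minimal sketch: it assumes each log line is the file name, one space, then whatever was typed at the prompt, and that file names end in lower-case .pdf. The function names, the sample file names, and the convention of typing "ok" for a correctly oriented page are assumptions for illustration, not part of check.ps.

```python
def parse_log(lines):
    """Split each 'file_name.pdf note' line into (file_name, note)."""
    entries = []
    for line in lines:
        line = line.rstrip("\r\n")
        # Split after the first ".pdf " so names containing spaces survive.
        head, sep, note = line.partition(".pdf ")
        if not sep:
            continue  # blank line, or no ".pdf " separator found
        entries.append((head + ".pdf", note.strip()))
    return entries


def needs_attention(entries, ok_note="ok"):
    """Return the names whose note is anything other than ok_note."""
    return [name for name, note in entries if note.lower() != ok_note]


if __name__ == "__main__":
    # Hypothetical log lines; real ones would come from open("file.log").
    sample = ["1923-01-05.pdf ok", "1923-01-12.pdf 180", "1923-01-19.pdf 90"]
    print(needs_attention(parse_log(sample)))
```

To run it against the real log, pass the open file instead of the sample list, for example `parse_log(open("file.log"))`. A file left with an empty note (because the run was stopped with x) is reported as needing attention.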

Reply Alex 8/3/2003 9:04:53 PM


Thanks Alex.

I've never used Ghostscript, but I'll give this a go. What would the
orientation attributes be? "Portrait" "Landscape"?

And presumably this wouldn't catch scans that are upside down (180 degrees
rotated from normal) because the orientation would be correct. Is there a
way to check whether the OCR process has worked (the PDFs are in Image+Text
format)?

James.

"Alex Cherepanov" <alexcher@quadnet.net> wrote in message
news:VNeXa.10529$iZ4.1236@nwrddc02.gnilink.net...
> Make Ghostscript and console windows tiled and type in the file
> orientation (or anything else) at "Orientation ?" prompt.
>
> The result will be saved in file.log in the same directory.

Reply James 8/3/2003 9:22:55 PM

> I've never used Ghostscript, but I'll give this a go. What would the
> orientation attributes be? "Portrait" "Landscape"?
Ghostscript shows you the 1st page of every file.
You are free to type anything you like about it.
The result will be saved as:

file_name.pdf your string
file_name2.pdf another string

> And presumably this wouldn't catch scans that are upside down (180 degrees
> rotated from normal) because the orientation would be correct.
If the text on the screen is upside down it's 180 degrees rotated.

> Is there a
> way to check whether the OCR process has worked (the PDFs are in Image+Text
> format)?

There are many ways to do it.
You can try to extract the text and count the number of dictionary
words vs. garbage.

You can try to highlight the text and visually check whether it matches
the image.

There may be implementation-dependent clues in the file.

Unfortunately 4000 files is too few to justify any serious development.
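
The word-counting idea can be sketched in Python as follows. The sketch assumes the text layer of a page has already been extracted to a string by some other tool (none is named in this thread) and that a word list is available. The name `dictionary_ratio`, the tiny sample word set, and the sample strings are assumptions for illustration.

```python
import re


def dictionary_ratio(text, known_words):
    """Return the fraction of alphabetic tokens found in known_words."""
    tokens = re.findall(r"[A-Za-z]+", text.lower())
    if not tokens:
        return 0.0  # no text layer at all, or nothing alphabetic in it
    hits = sum(1 for token in tokens if token in known_words)
    return hits / len(tokens)


if __name__ == "__main__":
    # Hypothetical word list; a real check would load a full dictionary file.
    words = {"the", "council", "met", "on", "monday", "to", "discuss", "new", "road"}
    good = "The council met on Monday to discuss the new road."
    bad = "peoJ mau ayl ssnosip ol Aepuow uo law lounoo ayl"
    print(dictionary_ratio(good, words))
    print(dictionary_ratio(bad, words))
```

A page whose ratio is well below that of known-good pages is a candidate for a rotated scan or failed OCR. Where to put the cutoff is not something this sketch can say; it would need testing on a sample of the archive.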

Reply Alex 8/3/2003 9:45:22 PM

Alex - I misunderstood what the script was doing initially, but now I know
what you mean.

One small problem, though: the display clears before the "Orientation ?"
prompt appears, and with the -dNOPAUSE switch the page disappears so quickly
that you can't see it in time. Removing -dNOPAUSE gives you time to see it,
but adds another keypress per file.

Any idea why this is - or more importantly a way around it? I'm running
Ghostscript 8 on Windows ME.

Thanks for all your help.

James.

"Alex Cherepanov" <alexcher@quadnet.net> wrote in message
news:SnfXa.10729$iZ4.5845@nwrddc02.gnilink.net...
> Ghostscript shows you the 1st page of every file.
> You are free to type anything you like about it.


Reply James 8/9/2003 10:29:58 AM
