I have a problem in that I have created a PDF of some 114MB using
Ghostscript, which someone else has reduced to 8MB using Adobe Pdfwriter.
Unfortunately, his 8MB version will not work on my old Sun, which only
has the reader version 5.0.10, which can't be upgraded, as that is the
latest from Adobe.
Is there any way I can reduce the size of my file somewhat, without
decreasing quality or compatibility?
My process for creation is:
a) Scan each page, saving as an uncompressed TIFF.
b) Save each page as a PostScript file too.
c) Use a slightly modified version of 'ps2pdf' in 'AFPL Ghostscript 6.50
(2000-12-02)' to create a PDF. Here's my 'slightly modified' version of
ps2pdf:
#!/bin/sh
# Echo the arguments for reference.
echo "$*"
echo "$@" -1
# Backslashes continue the command so gs receives every option.
gs -q -dNOPAUSE -sPAPERSIZE=a4 -dBATCH -sDEVICE=pdfwrite \
   -sOutputFile=foo.pdf -c \
   save pop -f "$@"
which creates foo.pdf, given a list of .ps files on the command line.
Posted by Dave, 4/30/2005 1:43:35 PM

Dave wrote:
> I have a problem in that I have created a PDF of some 114MB using
> Ghostscript, which someone else has reduced to 8MB using Adobe Pdfwriter.
You can try to force a different compression filter using the
setdistillerparams operator. Can I take a look at the files?
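For illustration only, here is roughly how a distiller parameter could be passed to pdfwrite from a wrapper script. This is a sketch under assumptions, not a tested recipe for this file: the choice of /FlateEncode, the function name build_pdf and the DRY_RUN switch are made up for the example, and whether Ghostscript 6.50 honours these keys should be checked against its own documentation. By default the script only prints the command it would run.

```shell
#!/bin/sh
# Sketch: hand distiller parameters to pdfwrite via setdistillerparams.
# Assumption: force lossless Flate instead of automatic filter selection.
PARAMS='<< /AutoFilterGrayImages false /GrayImageFilter /FlateEncode >> setdistillerparams'

# build_pdf OUTPUT INPUT.ps... (hypothetical helper name)
build_pdf() {
    out=$1; shift
    if [ "${DRY_RUN:-1}" = 1 ]; then
        # Dry run: only show what would be executed.
        printf '%s\n' "gs -q -dNOPAUSE -dBATCH -sPAPERSIZE=a4 -sDEVICE=pdfwrite -sOutputFile=$out -c \"$PARAMS\" -f $*"
    else
        gs -q -dNOPAUSE -dBATCH -sPAPERSIZE=a4 -sDEVICE=pdfwrite \
           -sOutputFile="$out" -c "$PARAMS" -f "$@"
    fi
}

build_pdf foo.pdf 4-8.ps 4-9.ps
```

Running it with DRY_RUN=0 would invoke gs for real. Lossless Flate will probably give a larger file than JPEG, so this mainly helps when image quality matters more than size.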
Posted by Alex, 4/30/2005 6:47:27 PM

Alex Cherepanov wrote:
> Dave wrote:
>
>> I have a problem in that I have created a PDF of some 114MB using
>> Ghostscript, which someone else has reduced to 8MB using Adobe Pdfwriter.
>
>
> You can try to force a different compression filter using
> setdistillerparams operator. Can I take a look at the files?
>
>
This is the 114MB file.
http://www.medphys.ucl.ac.uk/~drkirkby/5370B-incomplete.pdf
And here is the much smaller version I can't open on my Sun.
http://www.medphys.ucl.ac.uk/~drkirkby/5370B-incomplete-acrobat6.pdf
Here are 3 representative PostScript files.
http://www.medphys.ucl.ac.uk/~drkirkby/4-8.ps (copied to server)
http://www.medphys.ucl.ac.uk/~drkirkby/4-9.ps (being copied as I write)
http://www.medphys.ucl.ac.uk/~drkirkby/4-10.ps (will be copied in a few
mins)
I have Acrobat Distiller 5 on a low-spec (733MHz Celeron, 1GB RAM) PC,
but have not tried using that.
Posted by Dave, 4/30/2005 7:36:03 PM

Dave wrote:
> Alex Cherepanov wrote:
>
>> Dave wrote:
>>
>>> I have a problem in that I have created a PDF of some 114MB using
>>> Ghostscript, which someone else has reduced to 8MB using Adobe
>>> Pdfwriter.
>>
>>
>>
>> You can try to force a different compression filter using
>> setdistillerparams operator. Can I take a look at the files?
>>
>>
> This is the 114MB file.
>
> http://www.medphys.ucl.ac.uk/~drkirkby/5370B-incomplete.pdf
Short analysis with tool.pdf.Info from Multivalent:
PDF version: 1.3
Content: 71 DCT-coded *huge* (about 5040x6496) greyscale images,
probably scanned from a manual.
Producer: AFPL Ghostscript 6.50
On A4 size paper, this is equivalent to a resolution of around 600 dpi.
Each image is about 1-2 MByte in size => total file size about 120 MB.
> and there is the much smaller version I can't open on my Sun.
>
> http://www.medphys.ucl.ac.uk/~drkirkby/5370B-incomplete-acrobat6.pdf
Short analysis with tool.pdf.Info from Multivalent:
PDF version: 1.5
Content: 71 JBIG2-coded images with about 1232x1628 (which is about
1/4*1/4 = 1/16 of the size of the original images)
AR6 resampled the images to far fewer pixels (1/16 of the original
pixel count). On A4 paper, this is equivalent to a resolution of around
150 dpi. AR6 probably did that because the user selected the "e-book"
profile, a set of settings for low-res documents for online use.
Let's do the math:
If you resample the original JPEG images to 1/4*1/4 size, the resulting
file size will be around 120/16 = 7.5 MBytes. This is about the size of
the PDF 1.5 document. The additional JBIG2 compression thus has limited
effect on file size in this case. Thus: PDF 1.5 features are unnecessary
to achieve a reasonable file size.
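The figures above can be re-derived with a few lines of awk. This is only a sketch: it assumes an A4 page width of 8.27 inches and that file size scales linearly with pixel count, and the function name estimate is made up for the example. The inputs are the image widths and the 120 MB total quoted in this post.

```shell
#!/bin/sh
# Sketch: re-derive the resolution and file-size estimates.
# Assumptions: A4 is 8.27 in wide; size scales with the number of pixels.
# estimate ORIGINAL_WIDTH_PX RESAMPLED_WIDTH_PX ORIGINAL_SIZE_MB
estimate() {
    awk -v w="$1" -v w2="$2" -v mb="$3" 'BEGIN {
        ratio = (w / w2) ^ 2               # pixel-count ratio
        printf "original: %.0f dpi\n", w / 8.27
        printf "resampled: %.0f dpi\n", w2 / 8.27
        printf "pixel ratio: 1/%.1f\n", ratio
        printf "estimated size: %.1f MB\n", mb / ratio
    }'
}

estimate 5040 1232 120
```

With the exact pixel widths the ratio comes out slightly above 16, so the estimate lands a little under the 8 MB of the Acrobat 6 file, which is consistent with the argument.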
Conclusion: Scan at a lower resolution (around 120 dpi) and the file
size will be ok. In addition you can reduce the image to 16 levels of
gray instead of 256 and use PNG instead of JPEG. JPEG is not really
appropriate for this type of image.
> Here are 3 representative PostScript files.
>
> http://www.medphys.ucl.ac.uk/~drkirkby/4-8.ps (copied to server)
> http://www.medphys.ucl.ac.uk/~drkirkby/4-9.ps (being copied as I write)
> http://www.medphys.ucl.ac.uk/~drkirkby/4-10.ps (will be copied in a few
> mins)
What sense does it make to put a single scanned image into a PostScript
wrapper with all of its overhead (ASCII85 in this case)? PNG (or TIFF if
you like) would be much more appropriate.
> I have Acrobat Distiller 5 on a low-spec (733MHz Celeron, 1GB RAM) PC,
> but have not tried using that.
I would use ImageMagick's convert or libtiff's tiff2pdf (both free) to
put a PDF frame around a bunch of images to create a single file.
Ralf
Posted by Ralf, 5/1/2005 3:05:05 PM

If the pages you are scanning consist mostly of text and other line art
(i.e. no photos), then you should be saving the files as TIFF with CCITT
Group 4 compression. This is the fax format, and it does an amazing job of
compressing scanned text pages. I think gs supports this format, but
genuine Acrobat certainly does.
Posted by Dan, 5/3/2005 12:49:12 AM

"Dave" <nospam@nowhere.com> wrote in message news:4273de28@212.67.96.135...
> and there is the much smaller version I can't open on my Sun.
>
> http://www.medphys.ucl.ac.uk/~drkirkby/5370B-incomplete-acrobat6.pdf
Just for info, that opens fine using version 6 of the reader for Windows.
Posted by CWatters, 5/3/2005 8:42:18 AM

CWatters wrote:
> "Dave" <nospam@nowhere.com> wrote in message news:4273de28@212.67.96.135...
>
>>and there is the much smaller version I can't open on my Sun.
>>
>>http://www.medphys.ucl.ac.uk/~drkirkby/5370B-incomplete-acrobat6.pdf
>
>
> Just for info that opens fine using Version 6 of the reader for Windows.
>
>
The latest version of Acrobat on a Sun is 5.0.11, so I can't see it
myself. My PC is rather slow and I have rather an aversion to using
PCs. That said, I think producing files for the latest Acrobat is not
wise, as few people have it installed.
Posted by Dave, 5/3/2005 9:13:45 AM

Ralf Koenig wrote:
>> Here are 3 representative PostScript files.
>>
>> http://www.medphys.ucl.ac.uk/~drkirkby/4-8.ps (copied to server)
>> http://www.medphys.ucl.ac.uk/~drkirkby/4-9.ps (being copied as I write)
>> http://www.medphys.ucl.ac.uk/~drkirkby/4-10.ps (will be copied in a
>> few mins)
>
>
> What sense does it make to put a single scanned image into a Postscript
> framing with all of its overhead (ASCII85 in this case)? PNG (or TIFF if
> you like) would be much more appropriate.
>
>> I have Acrobat Distiller 5 on a low-spec (733MHz Celeron, 1GB RAM) PC,
>> but have not tried using that.
>
>
> I would use Imagemagick's convert or tifflib's tiff2pdf (both free) to
> put a PDF frame around a bunch of images to create a single file.
>
> Ralf
For some reason my old version of ImageMagick could not read TIFF files.
So I updated to the latest (6.2.2) from source, and during the
configure stage I get:
TIFF --with-tiff=yes no (failed tests)
I've updated the TIFF library, but it still fails. The self-tests on the
TIFF library pass OK, so I don't know why.
I've joined the ImageMagick mailing list in the hope of resolving this,
but do you know any other tool that can put a few hundred TIFF files
into one PostScript file?
Posted by Dave, 5/3/2005 12:05:03 PM

Ralf Koenig wrote:
> Dave wrote:
>
>> Alex Cherepanov wrote:
>>
>>> Dave wrote:
>>>
>>>> I have a problem in that I have created a PDF of some 114MB using
>>>> Ghostscript, which someone else has reduced to 8MB using Adobe
>>>> Pdfwriter.
>>>
>>>
>>>
>>>
>>> You can try to force a different compression filter using
>>> setdistillerparams operator. Can I take a look at the files?
>>>
>>>
>> This is the 114MB file.
>>
>> http://www.medphys.ucl.ac.uk/~drkirkby/5370B-incomplete.pdf
>
>
> Short analysis with tool.pdf.Info from Multivalent:
>
> PDF version: 1.3
> Content: 71 DCT-coded *huge* (about 5040x6496) greyscale images,
> probably scanned from a manual.
> Producer: AFPL Ghostscript 6.50
>
> On A4 size paper, this is equivalent to a resolution of around 600 dpi.
They were scanned at 600 dpi. I am borrowing the manual, so wanted to
get all the information, then worry about perhaps losing some later.
> Each image is about 1-2 MByte in size => total file size about 120 MB.
>
Each TIFF image is about 30MB. The PostScript files are about a third of
that size. Since lots of areas are of the same gray level (there's a lot
of white), does PostScript realise this, and so have some sort of lossless
compression?
>> and there is the much smaller version I can't open on my Sun.
>>
>> http://www.medphys.ucl.ac.uk/~drkirkby/5370B-incomplete-acrobat6.pdf
>
>
> Short analysis with tool.pdf.Info from Multivalent:
>
> PDF version: 1.5
> Content: 71 JBIG2-coded images with about 1232x1628 (which is about
> 1/4*1/4 = 1/16 of the size of the original images)
I did not realise the resolution had been reduced.
> AR6 resampled the images to much fewer pixels (1/16 of the original
> pixels). On A4 sized paper, this is equivalent to a resolution of around
> 150 dpi. AR6 did that probably due to the user selecting the "e-book"
> profile. This is a set of settings for low-res documents for online use.
I'd probably want to change that, as it is likely to be printed.
> Let's do the math:
> If you resample the original JPEG images to 1/4*1/4 size, the resulting
> file size will be around 120/16 = 8 MBytes. This is about the size of
> the PDF 1.5 document. The additional JBIG-2 compression thus has limited
> effect on file size in this case. Thus: PDF 1.5 features are unnecessary
> to achieve a reasonable file size.
>
> Conclusion: Scan at a lower resolution (around 120 dpi) and the file
> size will be ok. In addition you can reduce the image to 16 levels of
> gray instead of 256 and use PNG instead of JPEG. JPEG is not really
> appropriate for this type of images.
I did not use JPEG - the files were saved as uncompressed TIFFs. As I
said, I only have one chance to get this right, although my scanner does
seem to be doing some rather funny things, which is not helping. It
seems to scan some gray areas as groups of dark and white dots.
>> Here are 3 representative PostScript files.
>>
>> http://www.medphys.ucl.ac.uk/~drkirkby/4-8.ps (copied to server)
>> http://www.medphys.ucl.ac.uk/~drkirkby/4-9.ps (being copied as I write)
>> http://www.medphys.ucl.ac.uk/~drkirkby/4-10.ps (will be copied in a
>> few mins)
>
>
> What sense does it make to put a single scanned image into a Postscript
> framing with all of its overhead (ASCII85 in this case)? PNG (or TIFF if
> you like) would be much more appropriate.
>
>> I have Acrobat Distiller 5 on a low-spec (733MHz Celeron, 1GB RAM) PC,
>> but have not tried using that.
>
>
> I would use Imagemagick's convert or tifflib's tiff2pdf (both free) to
> put a PDF frame around a bunch of images to create a single file.
>
> Ralf
I did not scan to JPEG, but TIFF.
I'm basically borrowing that manual for a short period, so decided to
scan at the highest resolution and worry about reducing information
later, if possible.
Since huge areas are white, it seems sensible that some sort of
compression will work well.
Posted by Dave, 5/3/2005 1:37:32 PM

Dave wrote:
> Ralf Koenig wrote:
>
>> Dave wrote:
>>
>>> Alex Cherepanov wrote:
>>>
>>>> Dave wrote:
>>>>
>>>>> I have a problem in that I have created a PDF of some 114MB using
>>>>> Ghostscript, which someone else has reduced to 8MB using Adobe
>>>>> Pdfwriter.
>>>>
>>> [...]
>>> This is the 114MB file.
>>>
>>> http://www.medphys.ucl.ac.uk/~drkirkby/5370B-incomplete.pdf
>>
>> Short analysis with tool.pdf.Info from Multivalent:
>>
>> PDF version: 1.3
>> Content: 71 DCT-coded *huge* (about 5040x6496) greyscale images,
>
> They were scanned at 600 dpi. I am borrowoing the manual, so wanted to
> get all the information, then worry about perhaps loosing some later.
>
>> Each image is about 1-2 MByte in size => total file size about 120 MB.
>>
>
> Each TIFF image is about 30MB.
This is the raw size: 5000*6000 pixels * 1 byte = 30 MB.
> The PostScript files are about a third of
> that size. Since lots of areas are of the same gray level (there's a lot
> of white), does PostScript realise this, and so have some sort of lossless
> compression?
Well, PostScript does not realize anything, just as the "White House" does
not realize anything. The question is whether the Gimp export plugin
(which you use to save the raw scans in PS format) uses the compression
features of the PostScript format.
-> Yes, it does: it uses lossless RunLength compression, as I could see
in your PS files. At the same time it uses the ASCII85 filter, which
converts binary content to an ASCII representation and thereby needs more
space than necessary.
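To put a number on the ASCII85 overhead: the encoding writes 5 printable characters for every 4 bytes of binary data, so the data grows by about 25% before line breaks are counted. A small sketch follows; the function name ascii85_size and the 8 MB input are hypothetical, chosen only to show the arithmetic.

```shell
#!/bin/sh
# Sketch: upper bound on the size of data after ASCII85 encoding.
# Every 4 input bytes become 5 output characters; a partial final group is
# rounded up here, and line breaks and the "z" shortcut are ignored.
ascii85_size() {
    echo $(( ($1 + 3) / 4 * 5 ))
}

ascii85_size 8000000    # 8 MB of RunLength data -> prints 10000000
```

So roughly a fifth of each PS file of this kind is encoding overhead rather than image data, which is why a binary container such as PNG or TIFF is the better intermediate format.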
>>> and there is the much smaller version I can't open on my Sun.
>>>
>>> http://www.medphys.ucl.ac.uk/~drkirkby/5370B-incomplete-acrobat6.pdf
>>
>> Short analysis with tool.pdf.Info from Multivalent:
>>
>> PDF version: 1.5
>> Content: 71 JBIG2-coded images with about 1232x1628 (which is about
>> 1/4*1/4 = 1/16 of the size of the original images)
>
>
> I did not realise the resolution had been reduced.
>
>> AR6 resampled the images to much fewer pixels (1/16 of the original
>> pixels). On A4 sized paper, this is equivalent to a resolution of
>> around 150 dpi. AR6 did that probably due to the user selecting the
>> "e-book" profile. This is a set of settings for low-res documents for
>> online use.
>
>
> I'd probably want to change that, as it is likely to be printed.
>
>> Let's do the math:
>> If you resample the original JPEG images to 1/4*1/4 size, the
>> resulting file size will be around 120/16 = 8 MBytes. This is about
>> the size of the PDF 1.5 document. The additional JBIG-2 compression
>> thus has limited effect on file size in this case. Thus: PDF 1.5
>> features are unnecessary to achieve a reasonable file size.
>>
>> Conclusion: Scan at a lower resolution (around 120 dpi) and the file
>> size will be ok. In addition you can reduce the image to 16 levels of
>> gray instead of 256 and use PNG instead of JPEG. JPEG is not really
>> appropriate for this type of images.
>
>
> I did not use JPEG - the files were saved as uncompressed TIFFs. As I
> said, I only have one chance to get this right, although my scanner does
> seem to be doing some rather funny things, which is not helping. It
> seems to scan some gray areas as groups of dark a white dots
ps2pdf converted them to JPEG inside the PDF during conversion. This is
reasonable to achieve a file size you can live with. Otherwise the file
would have been about 6 MB (this is an estimate for FlateDecode) * 71 =
roughly 426 MB!
>>> Here are 3 representative PostScript files.
>>>
>>> http://www.medphys.ucl.ac.uk/~drkirkby/4-8.ps (copied to server)
>>> http://www.medphys.ucl.ac.uk/~drkirkby/4-9.ps (being copied as I write)
>>> http://www.medphys.ucl.ac.uk/~drkirkby/4-10.ps (will be copied in a
>>> few mins)
>>
>>
>>
>> What sense does it make to put a single scanned image into a
>> Postscript framing with all of its overhead (ASCII85 in this case)?
>> PNG (or TIFF if you like) would be much more appropriate.
>>
>>> I have Acrobat Distiller 5 on a low-spec (733MHz Celeron, 1GB RAM)
>>> PC, but have not tried using that.
>>
>>
>>
>> I would use Imagemagick's convert or tifflib's tiff2pdf (both free) to
>> put a PDF frame around a bunch of images to create a single file.
>>
>> Ralf
>
> I did not scan to jpeg, but TIFF.
Well, you scanned to raw pixels, then saved the raw pixels to TIFF.
But ps2pdf (a wrapper for Ghostscript) converted the images to JPEG inside
the PDF to reduce file size. There are options to stop GS from doing this.
> I'm basically borrowing that manual for a short period, so decided to
> scan at the highest resolution and worry about reducing information
> later, if possible.
Well, this is the right approach. Now you have to find a clever way to
reduce information. ImageMagick's convert is fine for this purpose.
1. Convert the TIFF files to PNG (this is lossless) so that your version
of ImageMagick, which lacks TIFF support, can access the images.
2. Reduce to 300 dpi, which is still fine for printing.
3. [Optional] Adjust levels with convert -level; the reduction of entropy
will result in better compression.
4. Use convert to wrap the PNGs in a PDF frame. You can also use
"sam2p" for this last step: http://www.inf.bme.hu/~pts/sam2p/
OR:
1. Use a recent version of libtiff.
2. Reduce resolution to 300 dpi.
3. Compress the TIFFs with LZW or Zip compression.
4. Use tiff2pdf to combine the TIFFs into one PDF.
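The two routes could be scripted roughly as follows. This is a sketch under several assumptions: the pages already exist as 600 dpi PNGs (route 1) or TIFFs (route 2), the file names and helper names (run, route_imagemagick, route_libtiff) are made up, "-resize 50%" is used to get from 600 to 300 dpi, and the libtiff route leaves out the resolution step, which would need another tool. By default the commands are only printed; set DRY_RUN=0 to execute them.

```shell
#!/bin/sh
# Sketch of both routes; prints the commands unless DRY_RUN=0.
run() {
    if [ "${DRY_RUN:-1}" = 1 ]; then printf '%s\n' "$*"; else "$@"; fi
}

# Route 1: ImageMagick. Halve each 600 dpi PNG, then wrap all pages in a PDF.
route_imagemagick() {
    out=$1; shift
    pngs=
    for f in "$@"; do
        small=${f%.png}-300.png
        run convert "$f" -resize 50% -units PixelsPerInch -density 300 "$small"
        pngs="$pngs $small"
    done
    # $pngs is left unquoted on purpose so it splits into one argument
    # per page (assumes file names without spaces).
    run convert $pngs -compress Zip "$out"
}

# Route 2: libtiff. Merge the pages into one LZW-compressed multi-page
# TIFF, then convert that file to PDF.
route_libtiff() {
    out=$1; shift
    run tiffcp -c lzw "$@" all-pages.tif
    run tiff2pdf -o "$out" all-pages.tif
}

route_imagemagick manual.pdf page-01.png page-02.png
route_libtiff manual.pdf page-01.tif page-02.tif
```

Both routes keep the images losslessly compressed, so the size reduction comes from the lower resolution rather than from JPEG artefacts.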
> Since huge areas are white, it seems sensible that some sort of
> compression will work well.
Right, PNG with lossless compression is perfect for such material with
256 levels of gray. You can also convert all pages that only have black
and white content without grey levels to 1-bit images and have them
compressed with lossless Group 4 compression (TIFF G4 format).
In PDF, every image can have a different compression method. PDF in this
case is just glue to hold everything together.
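As an illustration of the 1-bit idea, a pure black-and-white page could be converted along these lines. The sketch assumes an ImageMagick build with working TIFF support; the 50% threshold, the file name and the helper name to_g4 are guesses made for the example, and the threshold would need tuning per scan. It prints the command by default.

```shell
#!/bin/sh
# Sketch: turn a black-and-white page into a 1-bit Group 4 TIFF.
# Assumptions: ImageMagick can write TIFF; a 50% threshold suits the scan.
to_g4() {
    target=${1%.*}-g4.tif
    if [ "${DRY_RUN:-1}" = 1 ]; then
        # Dry run: only show the command.
        printf '%s\n' "convert $1 -threshold 50% -compress Group4 $target"
    else
        convert "$1" -threshold 50% -compress Group4 "$target"
    fi
}

to_g4 page-07.png
```

Pages with photos or grey shading would stay as greyscale PNGs, and both kinds can then be combined in the same PDF.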
Ralf
Posted by Ralf, 5/3/2005 6:21:19 PM

Dave <nospam@nowhere.com> wrote:
> The latest version of acrobat on a Sun is 5.0.11, so I can't see it
> myself. My PC is rather slow and I ahve rather an averasion to using
> them. That said, I think producing files for the latest acrobat is not
> wise, as few people have it installed.
It's 7.0; check the Adobe site.
Posted by pisz_na, 6/5/2005 11:26:43 AM