I was reading the Adobe PDF specification, version 1.3. In section 5.5.3
("Font subsets") Adobe wrote:
"For a font subset, the PostScript name of the font - the value of the
font's BaseFont entry and the font descriptor's FontName entry - begins
with a tag followed by a plus sign. The tag consists of exactly six
uppercase letters; the choice of letters is arbitrary, but different
subsets in the same PDF file must have different tags. For example,
EOODIA+Poetica is the name of a subset of Poetica, a Type 1 font."
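A minimal sketch of the naming rule quoted above, assuming the font names have already been extracted from the file as plain strings. The helper names `split_subset_name` and `duplicate_tags` are made up for illustration and do not belong to any PDF library:

```python
import re
from collections import Counter

# Six uppercase letters, a plus sign, then the original font name.
SUBSET_NAME = re.compile(r"^([A-Z]{6})\+(.+)$")


def split_subset_name(name):
    """Return (tag, base_name), or (None, name) if there is no subset tag."""
    match = SUBSET_NAME.match(name)
    if match:
        return match.group(1), match.group(2)
    return None, name


def duplicate_tags(font_names):
    """Return the tags carried by more than one entry in font_names.

    Each entry is assumed to stand for a distinct subset in one PDF file.
    """
    tags = [split_subset_name(name)[0] for name in font_names]
    counts = Counter(tag for tag in tags if tag is not None)
    return sorted(tag for tag, count in counts.items() if count > 1)
```

For the merged file described below, `duplicate_tags(["ECKDJC+Arial", "ECKDJC+Arial"])` would flag `ECKDJC`.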
I don't understand why it's necessary to use different "tags". In fact,
the text-drawing commands reference the fonts by their internal
object number (50 0 obj), not by the BaseFont or FontName values.
In fact, it's possible to make a "bad" PDF with two different subsets
that have the same name (ECKDJC+Arial). I created one by merging two
PDFs with PDFlib (www.pdflib.com).
You can get them from:
http://www.fhtino.it/misc/file1.pdf
http://www.fhtino.it/misc/file2.pdf
http://www.fhtino.it/misc/file1_file2.pdf
The resulting PDF file opens correctly in Acrobat and Ghostscript.
So why did Adobe write that?
______________________________________________
Fabrizio Accatino - fhtino@yahoo.com
fhtino, 5/18/2004 2:55:22 PM

"Fabrizio Accatino" <fhtino@yahoo.com> wrote:
>I was reading Adobe Pdf standard 1.3. In paragraph 5.5.3 ("Font
>subsets") Adobe wrote:
>
>"For a font subset, the PostScript name of the font-the value of the
>font's
>BaseFont entry and the font descriptor's FontName entry-begins with a
>tag...
>
>I don't understand why it's necessary to use different "tags". Infact
>the "drawing" text commands reference the fonts using their internal
>obj number (50 0 obj) and not the BaseFont or the FontName values.
Because much software will assume that two fonts with the same base
name are the same font.
Font subset prefixes are vital.
For instance, in many (older?) versions of Acrobat, if you had two
subsets with the same name, in two different PDFs, and you joined the
PDFs, one subset would be discarded.
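To make the failure concrete, here is a toy sketch of a merger that treats equal names as the same font. It is only an illustration of the assumption described above, not how Acrobat or any real tool is implemented; each "font" is modelled as a set of characters, and the file contents are hypothetical:

```python
def merge_fonts_by_name(documents):
    """Naively merge font tables, treating equal names as the same font.

    documents is a list of dicts mapping a font name to the set of
    characters its embedded subset contains (a stand-in for glyph data).
    """
    merged = {}
    for fonts in documents:
        for name, glyphs in fonts.items():
            # The first subset seen under a name wins; later ones are dropped.
            merged.setdefault(name, glyphs)
    return merged


# Two different subsets that happen to carry the same tag (hypothetical data).
file1 = {"ECKDJC+Arial": set("abcdef")}
file2 = {"ECKDJC+Arial": set("efghilmnop")}

merged = merge_fonts_by_name([file1, file2])

# Characters that pages from file2 need but the surviving subset lacks.
missing = set("efghilmnop") - merged["ECKDJC+Arial"]
print(sorted(missing))
```

Pages that came from the second file would then point at a subset that lacks some of their glyphs.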
>
>Infact it's possible to make a "bad" pdf with 2 different subset with
>the same name (ECKDJC+Arial).
That's even worse - in more cases this will cause damage.
Follow the strict letter of the PDF specification even if it seems
unnecessary, and even if the software you are using seems happy with
the result. Many, many, bad PDFs created in the past have caused
problems as software is updated.
> The resulting pdf file is correctly opened by Acrobat
You didn't even try combining files?
----------------------------------------
Aandi Inston quite@dial.pipex.com http://www.quite.com
Please support usenet! Post replies and follow-ups, don't e-mail them.
quite, 5/20/2004 1:28:10 PM

"Aandi Inston" <quite@dial.pipex.con> wrote:
>
> Because much software will assume that two fonts with the same base
> name are the same font.
>
> Font subset prefixes are vital.
OK. But my question is slightly different. I don't understand why it's
necessary to add the 6-character prefix. In fact, when a PDF uses the
font, it "calls" it by OBJ number and not by the prefix+name sequence.
> For instance, in many (older?) versions of Acrobat, if you had two
> subsets with the same name, in two different PDFs, and you joined the
> PDFs, one subset would be discarded.
What do you mean by "same name"? Font_name or Prefix+Font_Name?
> >In fact it's possible to make a "bad" pdf with 2 different subsets
> >with the same name (ECKDJC+Arial).
>
> That's even worse - in more cases this will cause damage.
Where? When? It opens perfectly in Acrobat 4, 5 and 6, Ghostscript and
Xpdf.
> Follow the strict letter of the PDF specification even if it seems
> unnecessary, and even if the software you are using seems happy with
> the result. Many, many, bad PDFs created in the past have caused
> problems as software is updated.
>
> > The resulting pdf file is correctly opened by Acrobat
>
> You didn't even try combining files?
I have done a lot of tests. I'll try to explain some of the results.
TEST 1: two PDFs with two subsets with different names
------------------------------------------------------
Source files:
  abcdef_emb.pdf        : ECKDJC+Arial
  efghilmnop_emb.pdf    : EDBBFI+Arial
Merged files:
  merge_acrobat.4.0.pdf : ECKDJC+Arial EDBBFI+Arial
  merge_ghostscript.pdf : MXVVJJ+Arial RVNFOH+Arial
As you can see, Ghostscript changed the prefixes but did not merge the
subsets.
Acrobat 4 neither merged the subsets nor changed the prefixes.
The resulting subsets are exactly the same as those in the source files.
TEST 2: two PDFs with two subsets with the same name
----------------------------------------------------
Source files:
  file1.pdf             : ECKDJC+Arial
  file2.pdf             : ECKDJC+Arial
Merged files:
  merge_pdflib.pdf      : ECKDJC+Arial ECKDJC+Arial
  merge_GhostSscipt.pdf : MXVVJJ+Arial RVNFOH+Arial
  merge_Acrobat_6.0.pdf : ECKDJC+Arial EDBBFI+Arial
Acrobat 4 was unable to merge file1 and file2; it raised an error. But
Acrobat 6 works perfectly.
Ghostscript and PDFlib (www.pdflib.com) also work perfectly, and they
don't merge the subsets.
In particular, PDFlib uses a different approach. First it creates an
XObject from all the elements to be imported into the output PDF. Then
it places that XObject on the output PDF.
This way it handles every input page as one "big object" carrying all
the resources it needs. (Well, all of this is IMHO.)
Best regards
______________________________________________
Fabrizio Accatino - fhtino@yahoo.com
Fabrizio, 5/21/2004 9:54:40 AM

"Fabrizio Accatino" <fhtino@yahoo.com> wrote:
>"Aandi Inston" <quite@dial.pipex.con> wrote:
>>
>> Because much software will assume that two fonts with the same base
>> name are the same font.
>>
>> Font subset prefixes are vital.
>
>OK. But my question is slightly different. I don't understand why it's
>necessary to add the 6 char prefix. Infact when pdf uses the font, it
>"call" it using the OBJ number and not the prefix+name sequence.
Yes, it would not be ambiguous in PDF display. But that's not the
issue.
Why not just follow the specification?
Just because you can't see a good reason for things in the spec,
doesn't mean it's a good idea to break it.
----------------------------------------
Aandi Inston quite@dial.pipex.com http://www.quite.com
Please support usenet! Post replies and follow-ups, don't e-mail them.
quite, 5/21/2004 1:39:01 PM

"Aandi Inston" <quite@dial.pipex.con> wrote:
>
> Why not just follow the specification?
>
> Just because you can't see a good reason for things in the spec,
> doesn't mean it's a good idea to break it.
My company receives pdfs from our customers. Then we merge them, add
OMR codes, print and mail.
Our "pdf print stream" is a collection of many pdfs. So it could
contain pdfs with the same subset_name. It's very rare but it's
possible.
______________________________________________
Fabrizio Accatino - fhtino@yahoo.com
Fabrizio, 5/21/2004 3:35:32 PM

"Fabrizio Accatino" <fhtino@yahoo.com> wrote:
>"Aandi Inston" <quite@dial.pipex.con> wrote:
>>
>> Why not just follow the specification?
>>
>> Just because you can't see a good reason for things in the spec,
>> doesn't mean it's a good idea to break it.
>
>My company receives pdfs from our customers. Then we merge them, add
>OMR codes, print and mail.
>
>Our "pdf print stream" is a collection of many pdfs. So it could
>contain pdfs with the same subset_name. It's very rare but it's
>possible.
Ok, I understand where your focus is coming from now.
Unfortunately, it does sometimes happen that different subsets have
the same subset name. It happens much more often than pure randomness
would suggest, so there must be some other factor at work.
When combining PDFs, the simplest approach is to keep all the font
objects separate. If files have subsets that do not mark themselves
with any prefix, some versions of Acrobat may have printing problems,
but overall that seems very rare.
There could also be cases, perhaps less rare, where the subsets have
the same prefix but aren't identical. A case can be made for
modifying the subset prefixes as you get them. Our Quite Revealing
software has an option to do that, and it has helped some people out
of font nightmares.
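A rough sketch of what re-tagging could look like, assuming font names are plain strings. This is not how Quite Revealing does it; `fresh_tag` and `retag` are hypothetical helpers. In a real PDF both the BaseFont entry and the font descriptor's FontName entry would have to be updated together:

```python
import random
import string


def fresh_tag(used_tags, rng=random):
    """Pick a six-uppercase-letter tag that is not already in used_tags."""
    while True:
        tag = "".join(rng.choice(string.ascii_uppercase) for _ in range(6))
        if tag not in used_tags:
            return tag


def retag(font_names, rng=random):
    """Give every subset name in font_names a new tag, unique in the result."""
    used = set()
    result = []
    for name in font_names:
        head, plus, rest = name.partition("+")
        is_subset = (
            plus == "+"
            and len(head) == 6
            and all(ch in string.ascii_uppercase for ch in head)
        )
        if not is_subset:
            result.append(name)  # names without a subset tag are left alone
            continue
        tag = fresh_tag(used, rng)
        used.add(tag)
        result.append(tag + "+" + rest)
    return result
```

Passing a seeded `random.Random` instance as `rng` makes the output reproducible, which can help when comparing merged files between runs.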
You can also, of course, merge font objects that are identical in every
respect (including all references and embedded font data). There can,
however, be significant costs in doing a deep comparison. You can see
this cost with Acrobat as the files to be combined get larger, even
though it does (I think) only a shallow comparison.
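One way to picture the deep comparison is to fingerprint each font over its name and its complete embedded data, and merge only on an exact match. This is a simplified sketch: the `FontObject` type is hypothetical, and it ignores the other referenced objects that a real comparison would also have to cover:

```python
import hashlib
from dataclasses import dataclass


@dataclass(frozen=True)
class FontObject:
    base_font: str    # e.g. "ECKDJC+Arial"
    font_file: bytes  # embedded font program (hypothetical stand-in)


def fingerprint(font):
    """Hash the name together with every byte of the embedded data."""
    digest = hashlib.sha256()
    digest.update(font.base_font.encode("utf-8"))
    digest.update(b"\0")  # separator between name and data
    digest.update(font.font_file)
    return digest.hexdigest()


def merge_identical(fonts):
    """Keep one copy of each font whose name and data both match exactly."""
    unique = {}
    for font in fonts:
        unique.setdefault(fingerprint(font), font)
    return list(unique.values())
```

Every byte of every embedded font has to be read to build the fingerprints, which is where the cost comes from on large files.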
----------------------------------------
Aandi Inston quite@dial.pipex.com http://www.quite.com
Please support usenet! Post replies and follow-ups, don't e-mail them.
quite, 5/21/2004 6:19:36 PM