Convert PostScript to PDF or text in OS X

I understand that you can convert PostScript files to PDF using a
command-line utility built into OS X.

(I came across this while reading about PStill,
http://www.versiontracker.com/dyn/moreinfo/macosx/3629
an Adobe Distiller type of program for OS X that costs $70.  But the command is free.)


What command do I use?
% man pdf
returned nothing.

I actually want to convert the PostScript files to plain text.
They are all text: columns and rows of names and numbers.  The idea is
to get test scores with associated ID#s into a spreadsheet.  The text
returned will have spaces instead of tabs, so it will be clumsy.
Is there a better way?

-- 
email:  tmcdanel   at   fastmail   dot   fm
tmcdanel (67)
9/24/2003 3:26:38 AM
comp.sys.mac.system

Terry McDanel <tmcdanel@NO.iname.SPAM.com> wrote:

> I understand that you can convert postscript files to pdf using some
> command line utility built into OSX.  

The way I would do this is with Ghostscript. Or you could download
MacGhostView, which is a front end to Ghostscript and includes it. m.

-- 
matt neuburg, phd = matt@tidbits.com, http://www.tidbits.com/matt/
Read TidBITS! It's free and smart. http://www.tidbits.com
matt231 (2249)
9/24/2003 4:38:39 AM
Terry McDanel wrote in <news:230920032226386684%tmcdanel@NO.iname.SPAM.com>:

> I understand that you can convert postscript files to pdf using some
> command line utility built into OSX.

Nope -- not at the moment. For Mac OS X 10.3, Apple has licensed the Adobe
distiller core and makes it available in the form of the PSNormalizer
framework. From 10.3 on there is both a CUPS filter to convert from PS to PDF
and a simple shell tool called pstopdf in /usr/bin.

> (When i was reading about PStill
> http://www.versiontracker.com/dyn/moreinfo/macosx/3629
> an Adobe Distill type of program for OSX $70.

But PStill can do many more things for you than just convert PS to PDF. Read
the detailed description at

    <http://www.stone.com/>

> But the command is free.)

If you have installed Ghostscript, you already have a free tool on your
machine to do the job. But there are minor problems with some kinds of fonts.
I believe PStill would be the better choice in this regard.

> I actually want to convert postscript files to the simple text format.

Which applications produce the PostScript files? There are several
scenarios where the application itself produces plain PDF, which is then
converted into PostScript on the fly by the Mac OS X printing system (via the
cgpdftops filter).

So it might be a good idea to avoid this conversion and to deal directly
with PDF. Why? Well, there is a great tool called Xpdf, which can
extract text from PDFs via its pdftotext command.

    <http://www.foolabs.com/xpdf/about.html>

You can compile it yourself or simply install the demo version of DevonThink

    <http://www.devon-technologies.com/products/devonthink.php>

and copy the pdftotext binary (which is free, of course) out of the
application bundle to /usr/bin for example.
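
A minimal sketch of driving pdftotext from a script, assuming the binary is
on the PATH and that its -layout option (which tries to keep the physical
layout of the page) suits tabular data. The function names and file names
are made up for illustration.

```python
import shutil
import subprocess


def pdftotext_command(pdf_path, txt_path, keep_layout=True):
    """Build the argument list for an Xpdf pdftotext call."""
    cmd = ["pdftotext"]
    if keep_layout:
        # -layout asks pdftotext to maintain the original physical layout,
        # which helps keep table columns aligned in the text output.
        cmd.append("-layout")
    cmd += [pdf_path, txt_path]
    return cmd


def convert(pdf_path, txt_path):
    """Run pdftotext if it is installed; raise otherwise."""
    if shutil.which("pdftotext") is None:
        raise FileNotFoundError("pdftotext not found on PATH")
    subprocess.run(pdftotext_command(pdf_path, txt_path), check=True)
```

Calling convert("scores.pdf", "scores.txt") would then write the extracted
text next to the PDF; whether -layout gives cleaner columns than the default
mode depends on the file.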

> They are all text, columns and rows of names and numbers.  The idea is
> to get test scores with associated ID#s into a spreadsheet.  The text
> returned will have spaces instead of tabs so it will be clumsy.

So you will have to do some post-processing based on regular expressions or
the like.
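
One possible shape for that post-processing, as a sketch only: it assumes
the columns in the extracted text are separated by runs of two or more
spaces, while single spaces occur only inside a field (for example between
a last and first name). The function name and the sample data are invented.

```python
import re


def columns_to_tsv(text, min_gap=2):
    """Turn space-aligned columns into tab-separated lines.

    Any run of at least `min_gap` spaces is treated as a column
    separator; shorter runs are left inside the field.
    """
    gap = re.compile(r" {%d,}" % min_gap)
    out = []
    for line in text.splitlines():
        line = line.strip()
        if line:  # skip blank lines
            out.append(gap.sub("\t", line))
    return "\n".join(out)


if __name__ == "__main__":
    sample = "Smith John   12345   88\nLee Ann      67890   92\n"
    print(columns_to_tsv(sample))
```

The result can be saved to a file and opened in a spreadsheet as
tab-delimited text. If two columns are ever separated by a single space,
this heuristic will merge them, so the output is worth checking by eye.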

Regards,

Thomas

9/24/2003 7:02:52 AM
In article <230920032226386684%tmcdanel@NO.iname.SPAM.com>,
 Terry McDanel <tmcdanel@NO.iname.SPAM.com> wrote:

> I understand that you can convert postscript files to pdf using some
> command line utility built into OSX.  
> 
> (When i was reading about PStill 
> http://www.versiontracker.com/dyn/moreinfo/macosx/3629
> an Adobe Distill type of program for OSX $70.  But the command is free.)
> 
> 
> What command do i use?
> % man pdf 
> returned nothing.
> 

Try 'man -k pdf'

-- 
Tom Stiller

PGP fingerprint =  5108 DDB2 9761 EDE5 E7E3 
                   7BDA 71ED 6496 99C0 C7CF
tomstiller (3053)
9/24/2003 10:47:49 AM
In article <230920032226386684%tmcdanel@NO.iname.SPAM.com>, Terry
McDanel wrote: 
> I understand that you can convert postscript files to pdf using some
> command line utility built into OSX.  

You mean ps2pdf?  Yep, that works.  But it's not quite built in. It is,
however, trivial to install with fink.


> I actually want to convert postscript files to the simple text format. 

In that case the conversion to pdf isn't really buying you anything.
It's no closer to straight text than postscript is.

The same fink package that provides ps2pdf also provides something
called ps2ascii.  I've never used this myself, but it sounds like it
might do what you want.  Here's the description from the man page:


DESCRIPTION
    ps2ascii  uses gs(1) to extract ASCII text from PostScript(tm) or Adobe
    Portable Document Format (PDF) files. If no files are specified on  the
    command  line,  gs  reads  from standard input; but PDF input must come
    from an explicitly-named file, not standard input.  If no  output  file
    is specified, the ASCII text is written to standard output.

    ps2ascii  doesn't look at font encoding, and isn't very good at dealing
    with kerning, so for PostScript (but not currently PDF), you might con-
    sider pstotext (see below).

[note: fink does not seem to provide 'ps2text', at least not in any
package I have loaded]
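
For what it's worth, a rough sketch of calling ps2ascii from a script,
following the usage in the man page excerpt above (input file, optional
output file, text on standard output otherwise). It assumes ps2ascii and
Ghostscript are installed and on the PATH; the function names are
hypothetical.

```python
import subprocess


def ps2ascii_command(ps_path, txt_path=None):
    """Build the argument list for ps2ascii."""
    cmd = ["ps2ascii", ps_path]
    if txt_path is not None:
        # With a second argument, ps2ascii writes to that file
        # instead of standard output.
        cmd.append(txt_path)
    return cmd


def extract_text(ps_path):
    """Run ps2ascii and return the text it prints on standard output."""
    result = subprocess.run(
        ps2ascii_command(ps_path),
        check=True,
        capture_output=True,
        text=True,
    )
    return result.stdout
```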

tristero (21)
9/24/2003 11:12:06 AM
tristero wrote in <news:a_ecb.142601$mp.70004@rwcrnsc51.ops.asp.att.net>:

[ps2pdf]
> It is however trivial to install with fink.

There is a more convenient installer available for ESP Ghostscript 7.0.6,
too... No need to deal with fink, just double-click and you're done.

[ps2ascii]
> I've never used this myself, but it sounds like it might do what you want.

No -- ps2ascii still isn't ready for prime time...

> Here's the description from the man page:
>   ps2ascii  doesn't look at font encoding, and isn't very good at dealing
>   with kerning, so for PostScript (but not currently PDF), you might con-
>   sider pstotext (see below).

Which is what I've always preferred when I had to deal with PostScript
output in the past.
 
> [note: fink does not seem to provide 'ps2text', at least not in any
> package I have loaded]

I made an installer package for pstotext 1.8g a few minutes ago. You can
download it from my site. It installs both the binary and the man page under
/usr/local and requires Ghostscript in your path:

    <http://users.phg-online.de/tk/MOSXS/pstotext.dmg>

But the results pdftotext gives you are way better! :-)

Regards,

Thomas

9/24/2003 5:05:35 PM