Converting PDF to text

What would people recommend for converting PDF to Text that:

	a) can be purchased on CDROM (no downloading)

	b) is compatible with MacOS X 10.5.1

	c) can do this to a document that is 400 pages long

I realize the result will be ugly compared to PDF.

I am hoping for something less ugly than reading the PDF file with
a text editor.

I have looked at the Adobe.com website and I do not know enough about
their various offerings that have Acrobat in the name.  Besides, you
folks might be less biased.
Kilgallen (2738)
12/23/2007 6:41:12 PM
comp.sys.mac.system

33 Replies

In article <R8UDlPb3bL2d@eisner.encompasserve.org>,
 Kilgallen@SpamCop.net (Larry Kilgallen) wrote:

> What would people recommend for converting PDF to Text that:
> 
> 
> I have looked at the Adobe.com website and I do not know enough about
> their various offerings that have Acrobat in the name.  Besides, you
> folks might be less biased.

Well, if you're doing much PDF'ing, Adobe Acrobat Standard (the bottom 
of their commercial product line, one notch below Acrobat Pro) may well 
be worth its price to you.  It will certainly do what you want (and of 
course many other things); has always been reliable and solid, for me 
anyway; and you won't need to upgrade it for many years, if at all.  
And, it "plays well" with Illustrator and/or Adobe Photoshop or 
Photoshop Elements, if you use those.
siegman (1559)
12/23/2007 7:27:02 PM
In article <R8UDlPb3bL2d@eisner.encompasserve.org>,
 Kilgallen@SpamCop.net (Larry Kilgallen) wrote:

> I realize the result will be ugly compared to PDF.
> 
> I am hoping for something less ugly than reading the PDF file with
> a text editor.
> 
> I have looked at the Adobe.com website and I do not know enough about
> their various offerings that have Acrobat in the name.  Besides, you
> folks might be less biased.



Adobe Reader 8.1.1 is free, works with Leopard 10.5.1, and has a "Save as 
Text" option under File. You can open the result in Word and format it 
prettily if that's what you want. It does a pretty good job of "save as 
text" too, though it won't save images that way, of course. I don't know 
if you can get it on a CD unless it comes bundled with some other 
software; I've always downloaded Adobe Reader.
-- 
W. Oates
warren.oates (3828)
12/23/2007 8:18:35 PM
On 2007-12-23, Larry Kilgallen <Kilgallen@SpamCop.net> wrote:
> What would people recommend for converting PDF to Text that:
>
> 	a) can be purchased on CDROM (no downloading)
>
> 	b) is compatible with MacOS X 10.5.1
>
> 	c) can do this to a document that is 400 pages long

You could try pdftotext which comes with the xpdf package.
I have no idea how you would get it on CDROM - I would just
type "sudo port install xpdf" and Bob's your uncle, but if
you can't connect to the Internet.... 
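
For anyone who wants to script the pdftotext route, here is a minimal Python sketch. It assumes pdftotext from xpdf is already installed and on the PATH; the file names are hypothetical, and the helper names (build_pdftotext_command, convert) are made up for illustration. The -layout, -f and -l options are standard pdftotext flags; the page range is handy if you would rather convert a 400-page document in chunks.

```python
import shutil
import subprocess
from typing import List, Optional


def build_pdftotext_command(pdf: str, txt: str,
                            first: Optional[int] = None,
                            last: Optional[int] = None,
                            keep_layout: bool = False) -> List[str]:
    """Build the argument list for xpdf's pdftotext (hypothetical helper)."""
    cmd = ["pdftotext"]
    if keep_layout:
        cmd.append("-layout")        # try to keep the physical page layout
    if first is not None:
        cmd += ["-f", str(first)]    # first page to convert
    if last is not None:
        cmd += ["-l", str(last)]     # last page to convert
    cmd += [pdf, txt]
    return cmd


def convert(pdf: str, txt: str, **options) -> None:
    """Run pdftotext; raises if the tool is missing or exits with an error."""
    if shutil.which("pdftotext") is None:
        raise FileNotFoundError("pdftotext not found on PATH (install xpdf first)")
    subprocess.run(build_pdftotext_command(pdf, txt, **options), check=True)


if __name__ == "__main__":
    # Hypothetical file names, for illustration only.
    print(" ".join(build_pdftotext_command("manual.pdf", "manual.txt",
                                           keep_layout=True)))
```

Calling convert("manual.pdf", "part1.txt", first=1, last=100) would write only the first hundred pages, assuming the file exists.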

Ian

-- 
Ian Gregory
http://www.zenatode.org.uk/ian/
foo37 (895)
12/23/2007 8:43:56 PM
Larry Kilgallen wrote:
> What would people recommend for converting PDF to Text that:
> 
> 	a) can be purchased on CDROM (no downloading)
> 
> 	b) is compatible with MacOS X 10.5.1
> 
> 	c) can do this to a document that is 400 pages long
> 
> I realize the result will be ugly compared to PDF.
> 
> I am hoping for something less ugly than reading the PDF file with
> a text editor.
> 
> I have looked at the Adobe.com website and I do not know enough about
> their various offerings that have Acrobat in the name.  Besides, you
> folks might be less biased.

If the .pdf was made from a text editor then the embedded text can be 
extracted.  *IF* the .pdf was made by scanning an original paper 
document, then the .pdf is really just a series of .tiff page pictures.  
Extracting text from the .tiffs is not easy or maybe not even possible.
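
One rough way to tell the two cases apart is to run an extractor first and look at how much text comes back. The Python sketch below is only a heuristic under stated assumptions: it assumes the extracted text separates pages with a form feed character (as pdftotext does by default), and both the function name and the threshold of 20 characters per page are arbitrary choices for illustration.

```python
def looks_scanned(extracted_text: str, min_chars_per_page: int = 20) -> bool:
    """Guess whether extracted text came from an image-only (scanned) PDF."""
    # Assumption: pages are separated by form feed characters.
    pages = extracted_text.split("\f")
    # A trailing form feed leaves an empty final chunk; drop it.
    if pages and pages[-1] == "":
        pages.pop()
    if not pages:
        return True  # nothing at all was extracted
    # Count non-whitespace characters across all pages.
    total = sum(len("".join(page.split())) for page in pages)
    return total / len(pages) < min_chars_per_page
```

If this returns True, the document probably needs OCR rather than plain text extraction.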
breyfogle (27)
12/23/2007 9:51:09 PM
In article <476ed81a$0$36353$742ec2ed@news.sonic.net>,
 breyfogle <breyfogle@aol.com> wrote:

> If the .pdf was made from a text editor then the imbedded text can be 
> extracted.  *IF* the .pdf was made by scanning an original paper 
> document, then the .pdf is really just a series of .tiff page 
> pictures. 
>   Extracting text from the .tiffs is not easy or maybe not even 
>   possible.

An OCR program can probably do it.

-- 
Support the troops:  Bring them home ASAP.
michelle14 (19004)
12/23/2007 10:22:50 PM
In article <michelle-234EC3.15225023122007@news.east.cox.net>, Michelle
Steiner <michelle@michelle.org> wrote:

> In article <476ed81a$0$36353$742ec2ed@news.sonic.net>,
>  breyfogle <breyfogle@aol.com> wrote:
> 
> > If the .pdf was made from a text editor then the imbedded text can be 
> > extracted.  *IF* the .pdf was made by scanning an original paper 
> > document, then the .pdf is really just a series of .tiff page 
> > pictures. 
> >   Extracting text from the .tiffs is not easy or maybe not even 
> >   possible.
> 
> An OCR program can probably do it.

Acrobat can do it. (The full version, not Reader).

-- 
Help improve usenet. Kill-file Google Groups.
http://improve-usenet.org/
dave16 (4224)
12/23/2007 10:48:04 PM
In article <R8UDlPb3bL2d@eisner.encompasserve.org>,
 Kilgallen@SpamCop.net (Larry Kilgallen) wrote:

> What would people recommend for converting PDF to Text that:
> 
> 	a) can be purchased on CDROM (no downloading)
> 
> 	b) is compatible with MacOS X 10.5.1
> 
> 	c) can do this to a document that is 400 pages long
> 
> I realize the result will be ugly compared to PDF.
> 
> I am hoping for something less ugly than reading the PDF file with
> a text editor.

Free software:  Preview (included with Mac OS X), or Adobe Reader 7 or 
later.

Choose the text selection tool.  Select All.  Copy.  Paste into your 
favorite word processor.  All the text will be copied.  Some of the 
formatting might be preserved, but you'll probably have to reformat it.

This will only work if the PDF really contains text and its security 
properties don't prohibit copying of the content.

As breyfogle said, some PDFs contain only graphics; if you have one of 
those, you'll need an OCR program that can scan multipage PDF files, and 
the results will probably need a lot of proofreading.

Some PDFs are locked so you can't copy text or graphics from them, 
though you can make copies of the PDF file.  There are utilities like 
PDFKey Pro that will unlock a PDF file, but I don't know if there are 
any that can be purchased on CD.
wayne.morris (951)
12/24/2007 5:54:03 AM
> > What would people recommend for converting PDF to Text that:
> > 
> > 	a) can be purchased on CDROM (no downloading)
> > 
> > 	b) is compatible with MacOS X 10.5.1
> > 
> > 	c) can do this to a document that is 400 pages long
> > 
> > I realize the result will be ugly compared to PDF.
> > 
> > I am hoping for something less ugly than reading the PDF file with
> > a text editor.
> 
> Free software:  Preview (included with Mac OS X), or Adobe Reader 7 or 
> later.

I noticed the Save as Text option under File.  Is this ever grayed out, 
or is it always available?  That is a mighty useful trick if it is 
always available.  Many thanks to the person recommending that.
noemailhere (606)
12/24/2007 6:57:35 AM
On 2007-12-24 06:57:35 +0000, The New guy <noemailhere@please.comm> said:

>>> What would people recommend for converting PDF to Text that:
>>> 
>>> 	a) can be purchased on CDROM (no downloading)
>>> 
>>> 	b) is compatible with MacOS X 10.5.1
>>> 
>>> 	c) can do this to a document that is 400 pages long
>>> 
>>> I realize the result will be ugly compared to PDF.
>>> 
>>> I am hoping for something less ugly than reading the PDF file with
>>> a text editor.
>> 
>> Free software:  Preview (included with Mac OS X), or Adobe Reader 7 or
>> later.
> 
> I noticed the Save as Text option under File.  Is this ever grayed out?
> Or is it always available.  That is a mighty useful trick if it is
> always available.  Many thanks to the person recommending that.

If the other alternatives are too laborious, take a look at File Juicer 
- http://echoone.com/filejuicer/ - which may do the job more neatly 
(and can extract images as well).

Stan

man208 (241)
12/24/2007 10:18:16 AM
Larry Kilgallen wrote:
> What would people recommend for converting PDF to Text that:
> 
> 	a) can be purchased on CDROM (no downloading)
> 
> 	b) is compatible with MacOS X 10.5.1
> 
> 	c) can do this to a document that is 400 pages long
> 
> I realize the result will be ugly compared to PDF.
> 
> I am hoping for something less ugly than reading the PDF file with
> a text editor.
> 
> I have looked at the Adobe.com website and I do not know enough about
> their various offerings that have Acrobat in the name.  Besides, you
> folks might be less biased.

Why can't you just put the PDF on the CD-ROM?
nospamatall2 (3238)
12/24/2007 10:58:32 AM
Wayne C. Morris <wayne.morris@this.is.invalid> wrote:


> 
> Some PDFs are locked so you can't copy text or graphics from them, 
> though you can make copies of the PDF file.  There are utilities like
> PDFKey Pro that will unlock a PDF file, but I don't know if there are
> any that can be purchased on CD.

One workaround here is to open the PDF in ColorSync Utility and resave
it as a PDF, which removes the copy lock. I just did this last week and it
works well.

Dave

-- 
There's a fine line between stupid and clever.
12/24/2007 11:44:24 AM
In article <476f876a$0$21089$da0feed9@news.zen.co.uk>,
 Stan The Man <man@pr100.com> wrote:

> 
> If the other alternatives are too laborious, take a look at File Juicer 
> - http://echoone.com/filejuicer/ - which may do the job more neatly 
> (and can extract images as well).
> 

That's an interesting suggestion.

File Juicer is advertised as a utility to extract images from many 
different kinds of documents, and I've used it for that.  It works very 
well -- more or less overdoes that job, in fact.

But yes, I recall that among the multiple output files and folders it 
produces when it chops up a document are also some that are labelled 
"text" or something similar.  Never thought to look and see if the text 
of a PDF file would be in one of those.
siegman (1559)
12/24/2007 4:03:07 PM
In article <1i9mlyt.12pxpniyjzagvN%dave_devine@nospamcop.net>,
 dave_devine@nospamcop.net (Dave Devine) wrote:

> 
> One work around here is to open the PDF in ColorSync Utility and resave
> as a PDF, which removes the copy lock. Just did this last week and it
> works well.
> 

Any other simple ways to remove the copy lock (on a Mac)?
siegman (1559)
12/24/2007 4:48:09 PM
In article <siegman-D05B64.08480924122007@nntp.stanford.edu>,
 AES <siegman@stanford.edu> wrote:

> Any other simple ways to remove the copy lock (on a Mac)?

Have you tried Preview? It has a Save As.
-- 
W. Oates
warren.oates (3828)
12/24/2007 5:01:24 PM
Warren Oates <warren.oates@gmail.com> wrote:

> In article <siegman-D05B64.08480924122007@nntp.stanford.edu>,
>  AES <siegman@stanford.edu> wrote:
> 
> > Any other simple ways to remove the copy lock (on a Mac)?
> 
> Have you tried Preview? It has a Save As.

On the file I had to process, the Save As option was greyed out. I
assume that is the case for all edit locked files.

Dave
-- 
There's a fine line between stupid and clever.
12/24/2007 11:43:12 PM
> > > Any other simple ways to remove the copy lock (on a Mac)?
> > 
> > Have you tried Preview? It has a Save As.
> 
> On the file I had to process, the Save As option was grayed out. I
> assume that is the case for all edit locked files.

Try the same file with Adobe Reader. Is it still grayed out?
noemailhere (606)
12/25/2007 2:08:08 AM
In article <wayne.morris-6D2090.23540223122007@shawnews.wp.shawcable.net>, "Wayne C. Morris" <wayne.morris@this.is.invalid> writes:
> In article <R8UDlPb3bL2d@eisner.encompasserve.org>,
>  Kilgallen@SpamCop.net (Larry Kilgallen) wrote:
> 
>> What would people recommend for converting PDF to Text that:
>> 
>> 	a) can be purchased on CDROM (no downloading)
>> 
>> 	b) is compatible with MacOS X 10.5.1
>> 
>> 	c) can do this to a document that is 400 pages long
>> 
>> I realize the result will be ugly compared to PDF.
>> 
>> I am hoping for something less ugly than reading the PDF file with
>> a text editor.
> 
> Free software:  Preview (included with Mac OS X),

I looked briefly at that.

> or Adobe Reader 7 or later.

Where do I buy a CDROM of that (see requirement a above)?

> Choose the text selection tool.  Select All.  Copy.  Paste into your 
> favorite word processor.  All the text will be copied.

Doing that for 400 pages does not seem viable.

> This will only work if the PDF really contains text and its security 
> properties don't prohibit copying of the content.

That is not an issue with my 400 page input document.
Kilgallen (2738)
12/25/2007 10:29:51 PM
In article <476f876a$0$21089$da0feed9@news.zen.co.uk>, Stan The Man <man@pr100.com> writes:
> On 2007-12-24 06:57:35 +0000, The New guy <noemailhere@please.comm> said:
> 
>>>> What would people recommend for converting PDF to Text that:
>>>> 
>>>> 	a) can be purchased on CDROM (no downloading)

> If the other alternatives are too laborious, take a look at File Juicer 
> - http://echoone.com/filejuicer/ - which may do the job more neatly 
> (and can extract images as well).

That web page shows me no option for purchasing the product on CDROM.
Kilgallen (2738)
12/25/2007 10:30:51 PM
In article <fko3cu$sg5$2@aioe.org>, nospamatall <nospamatall@iol.ie> writes:
> Larry Kilgallen wrote:
>> What would people recommend for converting PDF to Text that:
>> 
>> 	a) can be purchased on CDROM (no downloading)
>> 
>> 	b) is compatible with MacOS X 10.5.1
>> 
>> 	c) can do this to a document that is 400 pages long
>> 
>> I realize the result will be ugly compared to PDF.
>> 
>> I am hoping for something less ugly than reading the PDF file with
>> a text editor.
>> 
>> I have looked at the Adobe.com website and I do not know enough about
>> their various offerings that have Acrobat in the name.  Besides, you
>> folks might be less biased.
> 
> Why can't you just put the PDF on the CD-ROM?

I don't see how that would convert it to text.
Kilgallen (2738)
12/25/2007 10:31:50 PM
In article <UFy09eyGe$m5@eisner.encompasserve.org>,
 Kilgallen@SpamCop.net (Larry Kilgallen) wrote:

> That web page shows me no option for purchasing the product on CDROM.

Why do you have to buy it on CD ROM?

-- 
Support the troops:  Bring them home ASAP.
michelle14 (19004)
12/26/2007 12:17:47 AM
In article <QICyglVD+WZC@eisner.encompasserve.org>,
 Kilgallen@SpamCop.net (Larry Kilgallen) wrote:

> In article <fko3cu$sg5$2@aioe.org>, nospamatall <nospamatall@iol.ie> writes:
> > Larry Kilgallen wrote:
> >> What would people recommend for converting PDF to Text that:
> >> 
> >> 	a) can be purchased on CDROM (no downloading)
> >> 
> >> 	b) is compatible with MacOS X 10.5.1
> >> 
> >> 	c) can do this to a document that is 400 pages long
> >> 
> >> I realize the result will be ugly compared to PDF.
> >> 
> >> I am hoping for something less ugly than reading the PDF file with
> >> a text editor.
> >> 
> >> I have looked at the Adobe.com website and I do not know enough about
> >> their various offerings that have Acrobat in the name.  Besides, you
> >> folks might be less biased.
> > 
> > Why can't you just put the PDF on the CD-ROM?
> 
> I don't see how that would convert it to text.

It wouldn't, but it would be as easy to read a PDF with Preview as it 
would a text file with TextEdit.

-- 
Tom Stiller

PGP fingerprint =  5108 DDB2 9761 EDE5 E7E3  7BDA 71ED 6496 99C0 C7CF
tomstiller (3053)
12/26/2007 2:18:48 AM
In article <michelle-B9EF8F.17174725122007@news.east.cox.net>, Michelle Steiner <michelle@michelle.org> writes:
> In article <UFy09eyGe$m5@eisner.encompasserve.org>,
>  Kilgallen@SpamCop.net (Larry Kilgallen) wrote:
> 
>> That web page shows me no option for purchasing the product on CDROM.
> 
> Why do you have to buy it on CD ROM?

It is a security rule around here.  No downloaded software.
Kilgallen (2738)
12/26/2007 12:52:43 PM
In article <tomstiller-0692C3.21184825122007@newsgroups.comcast.net>, Tom Stiller <tomstiller@comcast.net> writes:
> In article <QICyglVD+WZC@eisner.encompasserve.org>,
>  Kilgallen@SpamCop.net (Larry Kilgallen) wrote:
> 
>> In article <fko3cu$sg5$2@aioe.org>, nospamatall <nospamatall@iol.ie> writes:
>> > Larry Kilgallen wrote:
>> >> What would people recommend for converting PDF to Text that:
>> >> 
>> >> 	a) can be purchased on CDROM (no downloading)
>> >> 
>> >> 	b) is compatible with MacOS X 10.5.1
>> >> 
>> >> 	c) can do this to a document that is 400 pages long
>> >> 
>> >> I realize the result will be ugly compared to PDF.
>> >> 
>> >> I am hoping for something less ugly than reading the PDF file with
>> >> a text editor.
>> >> 
>> >> I have looked at the Adobe.com website and I do not know enough about
>> >> their various offerings that have Acrobat in the name.  Besides, you
>> >> folks might be less biased.
>> > 
>> > Why can't you just put the PDF on the CD-ROM?
>> 
>> I don't see how that would convert it to text.
> 
> It wouldn't, but it would be as easy to read a PDF with Preview as it 
> would a text file with TextEdit.

But the task at hand is to convert the PDF to text, not to read it.
If the task were just to read it, all I have to do is double-click.
Kilgallen (2738)
12/26/2007 12:54:25 PM
On Wed, 26 Dec 2007 07:52:43 -0500, Larry Kilgallen wrote
(in article <TWGBa8eYR3zG@eisner.encompasserve.org>):

> In article <michelle-B9EF8F.17174725122007@news.east.cox.net>, Michelle 
> Steiner <michelle@michelle.org> writes:
>> In article <UFy09eyGe$m5@eisner.encompasserve.org>,
>> Kilgallen@SpamCop.net (Larry Kilgallen) wrote:
>> 
>>> That web page shows me no option for purchasing the product on CDROM.
>> 
>> Why do you have to buy it on CD ROM?
> 
> It is a security rule around here.  No downloaded software.

Then you need to buy either ReadIRIS Pro 
<http://www.irislink.com/c2-532-189/OCR-Software---Product-list.aspx>, which 
is available either by download or on CD, or Adobe Acrobat Pro 
<http://www.adobe.com/> and proceed to the online store. So far as I know the 
full version of Acrobat is available only on CD.

Both options cost substantially more than File Juicer, but that's just how it 
is if you insist on software on CD. (Which, just so you know, especially in 
the case of the Adobe software, most definitely did traverse the Internet 
before being burned to disc.)

-- 
email to oshea dot j dot j at gmail dot com.

try.not.to (2779)
12/26/2007 1:09:01 PM
In article <TWGBa8eYR3zG@eisner.encompasserve.org>,
 Kilgallen@SpamCop.net (Larry Kilgallen) wrote:

> >> That web page shows me no option for purchasing the product on 
> >> CDROM.
> > 
> > Why do you have to buy it on CD ROM?
> 
> It is a security rule around here.  No downloaded software.

*nod*  It's a stupid rule, an overreaction, but you have to live with 
it.

-- 
Support the troops:  Bring them home ASAP.
michelle14 (19004)
12/26/2007 4:01:58 PM
In article <8wjmwN+ZHWqD@eisner.encompasserve.org>,
 Kilgallen@SpamCop.net (Larry Kilgallen) wrote:

> In article <wayne.morris-6D2090.23540223122007@shawnews.wp.shawcable.net>, 
> "Wayne C. Morris" <wayne.morris@this.is.invalid> writes:
> > In article <R8UDlPb3bL2d@eisner.encompasserve.org>,
> >  Kilgallen@SpamCop.net (Larry Kilgallen) wrote:
> > 
> >> What would people recommend for converting PDF to Text that:
> >> 
> > 
> > Free software:  Preview (included with Mac OS X),
> 
> I looked briefly at that.
> 
> > Choose the text selection tool.  Select All.  Copy.  Paste into your 
> > favorite word processor.  All the text will be copied.
> 
> Doing that for 400 pages does not seem viable.
> 

I've just used a combination of Preview and TextEdit to extract the text 
from a 600 page VMS manual via copy and paste. Surprisingly it managed 
that with only a small amount of beach ball spinning even on my G3 iBook.

I'll add that I used Tiger, plus a couple of notes:

a) TextEdit will by default output to an RTF file, so you need to
   use Make Plain Text from the Format menu.
b) The default when saving a text file in TextEdit is Unicode (UTF-16).
   To get it in plain(ish) ASCII, I needed to select Non-lossy ASCII;
   when I selected any of the Latin encodings, it refused to do the
   save. Note that in Non-lossy ASCII, characters such as the copyright
   sign are written as octal escapes, e.g. \251 in my example.

I agree it could indeed be tedious for 400 pages, as for starters, 
paragraphs in the original were not separated in the final output by 
blank lines.
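
If the \251-style escapes get in the way later, they can be turned back into real characters. The Python sketch below handles only the three-digit octal form seen in the example above (\251 is octal for 169, the copyright sign); whether TextEdit emits other escape forms is not covered here, and a literal backslash followed by three octal digits in the original text would be decoded by mistake. The function name is hypothetical.

```python
import re

# Matches a backslash followed by three octal digits, e.g. \251.
_OCTAL_ESCAPE = re.compile(r"\\([0-3][0-7]{2})")


def decode_octal_escapes(text: str) -> str:
    """Replace \\ooo octal escapes with the characters they stand for."""
    return _OCTAL_ESCAPE.sub(lambda m: chr(int(m.group(1), 8)), text)


if __name__ == "__main__":
    # Sample string for illustration; prints the line with a real copyright sign.
    print(decode_octal_escapes(r"Copyright \251 2007"))
```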

-- 
Paul Sture

Sue's OpenVMS bookmarks:
http://eisner.encompasserve.org/~sture/ovms-bookmarks.html
12/26/2007 7:13:51 PM
In article <paul.sture.nospam-62E93E.20135126122007@mac.sture.ch>,
 "P. Sture" <paul.sture.nospam@hispeed.ch> wrote:

> a) TextEdit will by default output to an RTF file, so you need to
>    use Make Plain Text from the Format menu.

Or change it in preferences so you don't have to change it in the Format 
menu every time.

> b) The default when saving a text file in TextEdit is Unicode 
> (UTF-16)

Or change it in preferences so you don't have to change it in the Save 
As dialog every time.

-- 
Support the troops:  Bring them home ASAP.
michelle14 (19004)
12/26/2007 7:49:27 PM
In article <michelle-DD0EEC.12492626122007@news.east.cox.net>,
 Michelle Steiner <michelle@michelle.org> wrote:

> In article <paul.sture.nospam-62E93E.20135126122007@mac.sture.ch>,
>  "P. Sture" <paul.sture.nospam@hispeed.ch> wrote:
> 
> > a) TextEdit will by default output to an RTF file, so you need to
> >    use Make Plain Text from the Format menu.
> 
> Or change it in preferences so you don't have to change it in the Format 
> menu every time.
> 
> > b) The default when saving a text file in TextEdit is Unicode 
> > (UTF-16)
> 
> Or change it in preferences so you don't have to change it in the Save 
> As dialog every time.

Thanks Michelle, I'd missed the Preference settings, although I'll note 
that doing a Restore All Defaults in Preferences sets new documents to 
RTF, and the encoding when saving plain text files to "automatic".

-- 
Paul Sture

Sue's OpenVMS bookmarks:
http://eisner.encompasserve.org/~sture/ovms-bookmarks.html
12/26/2007 11:16:01 PM
In article <0001HW.C397BC9D011CB174F040E6D8@newsgroups.comcast.net>, J.J. O'Shea <try.not.to@but.see.sig> writes:
> On Wed, 26 Dec 2007 07:52:43 -0500, Larry Kilgallen wrote
> (in article <TWGBa8eYR3zG@eisner.encompasserve.org>):

> Then you need to buy either ReadIRIS Pro 
> <http://www.irislink.com/c2-532-189/OCR-Software---Product-list.aspx>, which 
> is available either by download or on CD,

The description of that is so centered on the OCR task that I do not
think it is the product for me.

> or Adobe Acrobat Pro 
> <http://www.adobe.com/> and proceed to the online store.

Or elsewhere :-)

> So far as I know the 
> full version of Acrobat is available only on CD.

Thanks for the clarification that I should stop looking for Standard.
I stopped by MicroCenter today, bought a copy, and my conversion is
complete.
Kilgallen (2738)
12/27/2007 12:52:11 AM
Larry Kilgallen wrote:
> In article <fko3cu$sg5$2@aioe.org>, nospamatall <nospamatall@iol.ie> writes:
>> Larry Kilgallen wrote:
>>> What would people recommend for converting PDF to Text that:
>>>
>>> 	a) can be purchased on CDROM (no downloading)
>>>
>>> 	b) is compatible with MacOS X 10.5.1
>>>
>>> 	c) can do this to a document that is 400 pages long
>>>
>>> I realize the result will be ugly compared to PDF.
>>>
>>> I am hoping for something less ugly than reading the PDF file with
>>> a text editor.
>>>
>>> I have looked at the Adobe.com website and I do not know enough about
>>> their various offerings that have Acrobat in the name.  Besides, you
>>> folks might be less biased.
>> Why can't you just put the PDF on the CD-ROM?
> 
> I don't see how that would convert it to text.

OK, I see now that you meant the program itself must be on CD-ROM. I
thought you meant you were going to sell CDs with the text on. If you
know anyone who would qualify for academic status, you can get Acrobat
Pro for 149 dollars:
http://bookstore.ucdavis.edu/Display.cfm?itemId=5148

Acrobat Reader (at least from version 6) has a basic ability to save as
text. I don't think Adobe supplies it other than as a download, but a
lot of those magazines with 'free' CDs include it. Try looking in a good
newsagent at the Mac magazines, or see if anyone you know has old ones.

Andy

nospamatall2 (3238)
12/27/2007 1:54:42 AM
The New guy <noemailhere@please.comm> wrote:

> > > > Any other simple ways to remove the copy lock (on a Mac)?
> > > 
> > > Have you tried Preview? It has a Save As.
> > 
> > On the file I had to process, the Save As option was grayed out. I
> > assume that is the case for all edit locked files.
> 
> Try the same file with Adobe Reader. Is it still grayed out?

Yep.
-- 
There's a fine line between stupid and clever.
12/27/2007 1:57:06 PM
Dave Balderstone wrote:
> In article <michelle-234EC3.15225023122007@news.east.cox.net>, Michelle
> Steiner <michelle@michelle.org> wrote:
> 
>> In article <476ed81a$0$36353$742ec2ed@news.sonic.net>,
>>  breyfogle <breyfogle@aol.com> wrote:
>>
>>> If the .pdf was made from a text editor then the imbedded text can be 
>>> extracted.  *IF* the .pdf was made by scanning an original paper 
>>> document, then the .pdf is really just a series of .tiff page 
>>> pictures. 
>>>   Extracting text from the .tiffs is not easy or maybe not even 
>>>   possible.
>> An OCR program can probably do it.
> 
> Acrobat can do it. (The full version, not Reader).
> 
Which version?  5.0 won't.
breyfogle (27)
12/30/2007 9:11:32 PM
breyfogle wrote:
> Dave Balderstone wrote:
>> In article <michelle-234EC3.15225023122007@news.east.cox.net>, Michelle
>> Steiner <michelle@michelle.org> wrote:
>>
>>> In article <476ed81a$0$36353$742ec2ed@news.sonic.net>,
>>>  breyfogle <breyfogle@aol.com> wrote:
>>>
>>>> If the .pdf was made from a text editor then the imbedded text can
>>>> be extracted.  *IF* the .pdf was made by scanning an original paper
>>>> document, then the .pdf is really just a series of .tiff page
>>>> pictures.   Extracting text from the .tiffs is not easy or maybe not
>>>> even   possible.
>>> An OCR program can probably do it.
>>
>> Acrobat can do it. (The full version, not Reader).
>>
> Which version ?  5.0 won't.

I remember trying to do it in v4. It should have been able to, but for
some reason it wouldn't, so I opened the PDF in Photoshop, saved the
pages out as TIFFs, and then used Acrobat to import and OCR them. I can't
remember how successful that was, but it did work. I do remember that
OmniPage was vastly superior in its recognition, and gave me a txt file
that was much easier to edit.

Andy
nospamatall2 (3238)
12/31/2007 1:31:39 AM