Ghostscript and copying text from a pdf gets junk

Dear group,

I have downloaded Ghostscript 8.14 and RedMon 1.7 to make PDFs on Win98SE.

I use the Apple Color LaserWriter 12/600 driver, which outputs
to RedMon and makes a PDF.

The PDFs look great; the only problem is that when I try to copy
the text in Acrobat and paste it into Notepad, it comes out like this:

7\SH)RQW)RUPDW
\SH([WHQVLRQ9HUVLRQ&RPSUHVVLRQ&RORU'HSWK0DLQWDLQHU$GREH6\VWHPV6SHFLILFDWLRQ

GSview and text extract give similar results.

It doesn't always do this; some PDFs don't have this problem.
But if I ask the printer driver to use TrueType fonts only (no substitution),
it seems to do it every time.

It appears that a "shifting" process is occurring:

copy text: 5HVXOWV
orig text: Results

s->V 3-letter shift (not counting the change from lower case to upper case)
t->W 3 letters
e->H 3 letters
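
The shift above works out to a constant offset in ASCII: lower case to upper case is -32 and three letters forward is +3, so each pasted character sits 29 code points below the original. Here is a minimal Python sketch that reverses it, assuming the offset is the same for every character. The function name `unshift` is made up for illustration, and this only demonstrates the pattern; it is not a fix for the PDF itself.

```python
# Assumption: every pasted character is exactly 29 code points below the
# original (-32 for lower-to-upper case, +3 for the three-letter shift).
OFFSET = 29


def unshift(garbled: str, offset: int = OFFSET) -> str:
    """Undo the constant shift by adding the offset back to each code point."""
    return "".join(chr(ord(ch) + offset) for ch in garbled)


print(unshift("5HVXOWV"))           # Results
print(unshift("7\\SH)RQW)RUPDW"))   # TypeFontFormat
```

The recovered strings have no spaces, presumably because a space (code 32) shifted down by 29 lands on a control character that does not survive the paste.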

When the print driver uses "standard substitutions" like Arial ->
Helvetica, it copies fine.

Any ideas?

I don't think it's Mozilla (which is the app I'm printing from).

Thanks,

Don

4/26/2004 8:28:05 PM
comp.lang.postscript
