How to make text copiable from pdf generated by ghostscript?

If you try to copy text from the following PDF (Ctrl+C in Acrobat),
you will see that the copied text is not correct. I'm wondering if
there is a way to fix this PDF file so that the text can be copied
correctly.

http://projects.scipy.org/scipy/raw-attachment/ticket/620/loader2000Fast.pdf
pengyu.ut (763)
7/10/2011 11:15:01 PM
comp.lang.postscript

On Sun, 10 Jul 2011 16:15:01 -0700 (PDT), Peng Yu wrote:
 > If you try to copy text from the following PDF (Ctrl+C in Acrobat),
 > you will see that the copied text is not correct. I'm wondering if
 > there is a way to fix this PDF file so that the text can be copied
 > correctly.
 >
 > http://projects.scipy.org/scipy/raw-attachment/ticket/620/loader2000Fast.pdf

Use another PDF viewer. xpdf and evince can copy the text. It might help
if the embedded fonts were Type 1.

Bob T.
BobT (1484)
7/11/2011 2:53:18 AM
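
A side note for anyone who lands here with the same symptom. One pattern sometimes seen with subsetted TrueType fonts is that the viewer hands back glyph indices instead of character codes, so the pasted text is the original with every character code shifted by a constant (often 29, so that "Results" pastes as "5HVXOWV"). Whether that is what happens with the PDF linked above is an assumption and has not been verified. If it is, a minimal Python sketch for repairing text that has already been copied could look like the following. The function name `unshift` and the default offset are made up for illustration.

```python
def unshift(garbled: str, offset: int = 29) -> str:
    """Add a constant offset back to every character code.

    Assumption: the pasted text is the original with each character
    code reduced by `offset`. 29 is only a commonly seen value, not a
    rule; try other offsets if the result is still unreadable.
    """
    return "".join(chr(ord(ch) + offset) for ch in garbled)


if __name__ == "__main__":
    print(unshift("5HVXOWV"))  # prints "Results"
```

This only repairs the pasted text. It does not change the PDF itself, so switching viewers or regenerating the file with differently embedded fonts, as suggested above, is still the more robust route.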