html2ps, gnome-web-photo, etc.


I wonder if any of you have used html2ps, gnome-web-photo, or any
other means of converting HTML pages to images which could thereafter
be included in a PDF file.  I have several hundred such pages which
ought to be converted.  By and large they are pretty simple and do not
themselves contain images.  (The pages are mostly query screens
and the reports which are produced by running the queries.)  I am
interested in alternatives to firing up a graphics editor and scraping
each screen by hand, which given the number of pages will be a very
tedious undertaking.  The documentation is currently being
produced on a Linux machine and I would like to keep work on it
there if possible, hence excursions to Windows programs are not
desirable although not completely out of the question.  The ideal
would be a GUI-free command-line program.

Reply anarcissie (8) 9/7/2006 3:02:05 AM

Hi,

On 6 Sep 2006 20:02:05 -0700 anarcissie@gmail.com wrote:

> I wonder if any of you have used html2ps, gnome-web-photo, or any
> other means of converting HTML pages to images which could thereafter
> be included in a PDF file.  I have several hundred such pages which
> ought to be converted.  By and large they are pretty simple and do not
> themselves contain images.

So do you want them as images or as PDF? I ask because I don't see any
reason to make them images if it's the latter. You can go directly from
PostScript to PDF, and your files will probably be much smaller and
look smoother. So my suggestion would be to go with html2ps
(http://user.it.uu.se/~jan/html2ps.html) and, if conversion to
something other than PostScript is needed, use Ghostscript's ps2pdf to
convert to PDF.
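
A minimal sketch of how that route could be batched over several hundred
pages, assuming html2ps and ps2pdf are both on the PATH and that the HTML
files sit in one directory. The function names and directory layout are
made up for illustration; only the "html2ps -o" and "ps2pdf in out"
invocations are the tools' own.

```python
import subprocess
from pathlib import Path


def conversion_commands(html_file, out_dir):
    """Return the html2ps and ps2pdf command lines for one HTML file."""
    html_file = Path(html_file)
    out_dir = Path(out_dir)
    ps_file = out_dir / (html_file.stem + ".ps")
    pdf_file = out_dir / (html_file.stem + ".pdf")
    return [
        # html2ps writes to stdout unless -o names an output file.
        ["html2ps", "-o", str(ps_file), str(html_file)],
        # ps2pdf is the Ghostscript wrapper: input PS, output PDF.
        ["ps2pdf", str(ps_file), str(pdf_file)],
    ]


def convert_all(src_dir, out_dir):
    """Convert every *.html file in src_dir; stop at the first failure."""
    Path(out_dir).mkdir(parents=True, exist_ok=True)
    for html_file in sorted(Path(src_dir).glob("*.html")):
        for cmd in conversion_commands(html_file, out_dir):
            subprocess.run(cmd, check=True)
```

Calling convert_all("screens", "screens-pdf") (both directory names are
placeholders) would leave one .ps and one .pdf per page in the output
directory, with no GUI involved.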

-hwh
Reply Hans 9/7/2006 1:55:27 PM


Hans-Werner Hilse wrote:
> Hi,
>
> On 6 Sep 2006 20:02:05 -0700 anarcissie@gmail.com wrote:
>
> > I wonder if any of you have used html2ps, gnome-web-photo, or any
> > other means of converting HTML pages to images which could thereafter
> > be included in a PDF file.  I have several hundred such pages which
> > ought to be converted.  By and large they are pretty simple and do not
> > themselves contain images.
>
> So do you want them as images or PDF? That's because I don't see any
> reason to make them images if it's the latter. You can go directly from
> postscript to PDF, and your files will be probably much smaller and
> smoother. So my suggestion would be to go with html2ps
> (http://user.it.uu.se/~jan/html2ps.html) and - if conversion to
> something other than postscript is needed - use ghostscript's ps2pdf to
> convert to PDF.

Ideally I would be including pictures of the pages with the article
which refers to them.

Right now I have two outputs from each LaTeX file: an HTML page, from
which the screenshots are linked, and a PDF file, which has no
screenshots.  I'd like to include the screenshots in the PDF version
as illustrations just in case my clients decide they want the paper
version.  (The HTML screenshots are mostly just neutralized HTML
at this point, which I display in a dependent window on request.)

Reply anarcissie 9/7/2006 6:53:28 PM

On 7 Sep 2006 11:53:28 -0700, anarcissie@gmail.com wrote:

>> So do you want them as images or PDF? That's because I don't see any
>> reason to make them images if it's the latter. You can go directly from
>> postscript to PDF, and your files will be probably much smaller and
>> smoother. So my suggestion would be to go with html2ps
>> (http://user.it.uu.se/~jan/html2ps.html) and - if conversion to
>> something other than postscript is needed - use ghostscript's ps2pdf to
>> convert to PDF.
>
>Ideally I would be including pictures of the pages with the article
>which refers to them.
>
>Right now I have two outputs from each LaTeX file: an HTML page, from
>which the screenshots are linked, and a PDF file, which has no
>screenshots.  I'd like to include the screenshots in the PDF version
>as illustrations just in case my clients decide they want the paper
>version.  (The HTML screenshots are mostly just neutralized HTML
>at this point, which I display in a dependent window on request.)

I am not sure if I understand how your followup relates to the
suggestion you're quoting. Have you tried going that route, i.e.

  html => ps => pdf => inclusion into the pdf through latex

?
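
For the last step of that chain, a rough sketch of generating the LaTeX
side, assuming the document is built with pdfLaTeX and loads the graphicx
package (neither is stated in this thread). The helper name, default
width, and file names are hypothetical.

```python
from pathlib import Path


def figure_snippet(pdf_file, caption, width=r"0.9\textwidth"):
    """Return a LaTeX figure environment that includes a converted page."""
    # Forward slashes keep the path usable inside \includegraphics.
    name = Path(pdf_file).as_posix()
    return "\n".join([
        r"\begin{figure}[htbp]",
        r"  \centering",
        r"  \includegraphics[width=%s]{%s}" % (width, name),
        r"  \caption{%s}" % caption,
        r"\end{figure}",
    ])
```

Printing figure_snippet("screens-pdf/query1.pdf", "Query screen") gives a
block that can be pasted or \input into the article. The preamble needs
\usepackage{graphicx} for this to compile.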


Michele
-- 
>It's because the universe was programmed in C++.
No, no, it was programmed in Forth.  See Genesis 1:12:
"And the earth brought Forth ..."
- Robert Israel in sci.math, thread "Why numbers?"
Reply Michele 9/7/2006 8:22:38 PM

anarcissie@gmail.com wrote:
> I wonder if any of you have used html2ps, gnome-web-photo, or any
> other means of converting HTML pages to images which could thereafter
> be included in a PDF file.  I have several hundred such pages which
> ought to be converted.  By and large they are pretty simple and do not
> themselves contain images.  (The pages are mostly query screens
> and the reports which are produced by running the queries.)  I am
> interested in alternatives to firing up a graphics editor and scraping
> each screen by hand, which given the number of pages will be a very
> tedious undertaking.  The documentation is currently being
> produced on a Linux machine and I would like to keep work on it
> there if possible, hence excursions to Windows programs are not
> desirable although not completely out of the question.  The ideal
> would be a GUI-free command-line program.

Netscape used to have a command-line option something along the lines of

$ netscape -print -exit <uri>

which when run *without* a GUI present would print the page to a PS
file and then exit. It was then trivial to snip off pages >1 and turn
the result into an image format.
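
Whatever produces the print file, the "keep page 1 and make it an image"
step could be sketched with Ghostscript along these lines. This assumes
the output has already been converted to PDF (for instance with ps2pdf),
which is the cautious choice because page selection in Ghostscript was, I
believe, originally a PDF-only feature. The function name, the 150 dpi
default, and the file names are illustrative.

```python
from pathlib import Path


def first_page_png_command(pdf_file, png_file=None, dpi=150):
    """Return a Ghostscript command that renders only page 1 to PNG."""
    pdf_file = Path(pdf_file)
    if png_file is None:
        # Default: same name as the input, with a .png suffix.
        png_file = pdf_file.with_suffix(".png")
    return [
        "gs", "-q", "-dSAFER", "-dBATCH", "-dNOPAUSE",
        "-sDEVICE=png16m",                # 24-bit colour PNG device
        "-r%d" % dpi,                     # rendering resolution
        "-dFirstPage=1", "-dLastPage=1",  # drop everything after page 1
        "-sOutputFile=%s" % png_file,
        str(pdf_file),
    ]
```

Passing the returned list to subprocess.run(..., check=True) runs it
without any GUI. A raster image will be larger and coarser than the PDF
itself, so this only makes sense where an image is really required.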

But I have no idea if this technique still exists. Firefox would do
well to implement it if it's been dropped...

///Peter
-- 
Crossposted additionally to comp.infosystems.www.browsers.misc and 
netscape.public.mozilla.browser
Reply Peter 9/7/2006 9:38:44 PM

Peter Flynn wrote:
> anarcissie@gmail.com wrote:
> > I wonder if any of you have used html2ps, gnome-web-photo, or any
> > other means of converting HTML pages to images which could thereafter
> > be included in a PDF file.  I have several hundred such pages which
> > ought to be converted.  By and large they are pretty simple and do not
> > themselves contain images.  (The pages are mostly query screens
> > and the reports which are produced by running the queries.)  I am
> > interested in alternatives to firing up a graphics editor and scraping
> > each screen by hand, which given the number of pages will be a very
> > tedious undertaking.  The documentation is currently being
> > produced on a Linux machine and I would like to keep work on it
> > there if possible, hence excursions to Windows programs are not
> > desirable although not completely out of the question.  The ideal
> > would be a GUI-free command-line program.
>
> Netscape used to have a commandline option something along the lines of
>
> $ netscape -print -exit <uri>
>
> which when run *without* a GUI present would print the page to a PS
> file and then exit. It was then trivial to snip off pages >1 and turn
> the result into an image format.
>
> But I have no idea if this technique still exists. Firefox would do
> well to implement it if it's been dropped...
>
> ///Peter
> --
> Crossposted additionally to comp.infosystems.www.browsers.misc and
> netscape.public.mozilla.browser

Discussions in Mozilla-related newsgroups and mailing lists I
have read indicated that this function no longer exists, and I
have not tried it.  Some called for its revival, but the
appearance of gnome-web-photo was held by others to have
answered the need.  My problem with gnome-web-photo is
that the Linux system I am working with may not have the
necessary components, and I do not have administrative
privileges or even very ready access to the administrators.

Reply anarcissie 9/8/2006 2:25:36 PM

Michele Dondi wrote:
> On 7 Sep 2006 11:53:28 -0700, anarcissie@gmail.com wrote:
>
> >> So do you want them as images or PDF? That's because I don't see any
> >> reason to make them images if it's the latter. You can go directly from
> >> postscript to PDF, and your files will be probably much smaller and
> >> smoother. So my suggestion would be to go with html2ps
> >> (http://user.it.uu.se/~jan/html2ps.html) and - if conversion to
> >> something other than postscript is needed - use ghostscript's ps2pdf to
> >> convert to PDF.
> >
> >Ideally I would be including pictures of the pages with the article
> >which refers to them.
> >
> >Right now I have two outputs from each LaTeX file: an HTML page, from
> >which the screenshots are linked, and a PDF file, which has no
> >screenshots.  I'd like to include the screenshots in the PDF version
> >as illustrations just in case my clients decide they want the paper
> >version.  (The HTML screenshots are mostly just neutralized HTML
> >at this point, which I display in a dependent window on request.)
>
> I am not sure if I understand how your followup relates to the
> suggestion you're quoting. Have you tried going that route, i.e.
>
>   html => ps => pdf => inclusion into the pdf through latex

That's my plan, if I can get html2ps to work.  I was just giving
some context for my previous message.  I was also trying to
ward off the fate of a previous request of this type in these
newsgroups of some years past, which was repeatedly answered
by someone pushing a Windows GUI one-page-at-a-time HTML
imager, which is not what I want.  (I studied Google Groups
before piping up!)

Reply anarcissie 9/8/2006 2:32:43 PM
