I'm trying to run some software on an Alpha PWS600au VMS 7.2 system
with the output piped into first a search command and then InfoZip to
compress the results, without creating a large intermediate file.
Later I can use pipe and unzip to decompress into further search
commands again without large intermediate files.
The problem is that using the DCL Pipe command seems very slow
compared to writing things to disk and reading back. I'd like to avoid
the intermediate files being stored uncompressed as they can be as
large as 5GB but even minimum compression with InfoZip reduces them by
better than 20:1.
The examples below perform a very short run and seem to suffer
very badly from the use of Pipe.
Is Pipe always slow?
Any Suggestions?
Tim
Example :-
> PIPE @run_prog_to_sysoutput.com | -
search sys$input "interesting stuff" | -
zip z1.zip - -1
> PIPE unzip -p z1.zip | -
search sys$input/out=results1.txt "more interesting stuff"
Takes 82 seconds
> @run_prog_to_file
> search intermediate_1.txt/out=intermediate_2.txt "interesting stuff"
> zip z2.zip intermediate_2.txt -1
> delete intermediate_2.txt;*
> unzip z2.zip
> search intermediate_2.txt/out=results2.txt "more interesting stuff"
Takes 27 seconds
Zip options used are:
zip -      compress from pipe
zip -1     use minimum compression
unzip -p   extract to pipe
(Posted by tim-DOT-ffrench-HYPHEN-lynch, 8/2/2004 10:22:55 AM)
Tim ffrench-Lynch wrote:
> I'm trying to run some software on an Alpha PWS600au VMS 7.2 system
> with the output piped into first a search command and then InfoZip to
> compress the results, without creating a large intermediate file.
> Later I can use pipe and unzip to decompress into further search
> commands again without large intermediate files.
>
> The problem is that using the DCL Pipe command seems very slow
> compared to writing things to disk and reading back. I'd like to avoid
> the intermediate files being stored uncompressed as they can be as
> large as 5GB but even minimum compression with InfoZip reduces them by
> better than 20:1.
>
> The examples below perform a very short run and seem to suffer
> very badly from the use of Pipe.
>
> Is Pipe always slow?
> Any Suggestions?
>
> Tim
>
>
> Example :-
>
> > PIPE @run_prog_to_sysoutput.com | -
> search sys$input "interesting stuff" | -
> zip z1.zip - -1
> > PIPE unzip -p z1.zip | -
> search sys$input/out=results1.txt "more interesting stuff"
>
> Takes 82 seconds
>
> > @run_prog_to_file
> > search intermediate_1.txt/out=intermediate_2.txt "interesting
> stuff"
> > zip z2.zip intermediate_2.txt -1
> > delete intermediate_2.txt;*
> > unzip z2.zip
> > search intermediate_2.txt/out=results2.txt "more interesting
> stuff"
>
> Takes 27 seconds
>
>
> Zip options used are
> zip - compress from pipe
> zip -1 use minimum compression
> unzip -p extract to pipe
There have been various reports here in COV that...
$ PIPE /NOLOGICAL_NAMES/NOSYMBOLS
will improve performance, assuming your pipe doesn't need predefined
logicals or symbols.
Cheers!
Keith Cayemberg
(Posted by keith.cayemberg2, 8/2/2004 10:46:07 AM)
Doesn't PIPE always/mostly create temp files for the
"piped" data?
Jan-Erik.
Tim ffrench-Lynch wrote:
>
> I'm trying to run some software on an Alpha PWS600au VMS 7.2 system
> with the output piped into first a search command and then InfoZip to
> compress the results, without creating a large intermediate file.
> Later I can use pipe and unzip to decompress into further search
> commands again without large intermediate files.
>
> The problem is that using the DCL Pipe command seems very slow
> compared to writing things to disk and reading back. I'd like to avoid
> the intermediate files being stored uncompressed as they can be as
> large as 5GB but even minimum compression with InfoZip reduces them by
> better than 20:1.
>
> The examples below perform a very short run and seem to suffer
> very badly from the use of Pipe.
>
> Is Pipe always slow?
> Any Suggestions?
(Posted by aaa23, 8/2/2004 8:38:39 PM)
Jan-Erik Söderholm wrote:
>
> Doesn't PIPE always/mostly create temp files for the
> "piped" data?
Not to my knowledge, but how could one test for that?
D.J.D.
(Posted by djesys.nospam3, 8/3/2004 12:50:02 AM)
In article <410EE139.6E9DE3FB@comcast.net>, David J Dachtera <djesys.nospam@comcast.net> wrote:
> Jan-Erik Söderholm wrote:
>>
>> Doesn't PIPE always/mostly create temp files for the
>> "piped" data?
>
> Not to my knowledge, but how could one test for that?
Set an alarm ACE for file creation/deletion, and enable that for long
enough to do a single PIPE test?
-Dan
(Posted by Dan, 8/3/2004 1:21:38 AM)
Tim ffrench-Lynch wrote:
> >
> > I'm trying to run some software on an Alpha PWS600au VMS 7.2 system
> > with the output piped into first a search command and then InfoZip to
> > compress the results, without creating a large intermediate file.
> > Later I can use pipe and unzip to decompress into further search
> > commands again without large intermediate files.
> >
> > The problem is that using the DCL Pipe command seems very slow
> > compared to writing things to disk and reading back. I'd like to avoid
> > the intermediate files being stored uncompressed as they can be as
> > large as 5GB but even minimum compression with InfoZip reduces them by
> > better than 20:1.
> >
> > The examples below perform a very short run and seem to suffer
> > very badly from the use of Pipe.
> >
> > Is Pipe always slow?
> > Any Suggestions?
There is a zipGrep.pl script in the examples directory of the Perl
extension Archive::Zip. See:
http://search.cpan.org/~nedkonz/Archive-Zip-1.12/
You will also need to install the Compress::Zlib extension, and, of
course, Perl. If you go this route you can avoid the subprocess
creation and communication through mailboxes, which I'd think would
help, though I haven't actually done any tests.
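As a rough illustration of the in-process idea (this is not the zipGrep.pl script itself, and it is written in Python rather than Perl), the sketch below filters lines into a zip member and later searches that member without extracting it, so no subprocesses, mailboxes or uncompressed intermediate files are involved. The function names, the member name `filtered.txt` and the sample strings are hypothetical, and the matching is case-sensitive, unlike the DCL SEARCH default.

```python
import io
import zipfile


def filter_into_zip(lines, needle, zip_path, member="filtered.txt"):
    """Stream lines into a zip member, keeping only those containing needle.

    zip_path may be a filename or a writable, seekable file-like object.
    Returns the number of lines kept.
    """
    kept = 0
    # compresslevel=1 is the fastest deflate setting, similar in spirit to "zip -1".
    with zipfile.ZipFile(zip_path, "w", compression=zipfile.ZIP_DEFLATED,
                         compresslevel=1) as archive:
        # force_zip64=True allows the member to grow past 4 GB if needed.
        with archive.open(member, "w", force_zip64=True) as out:
            for line in lines:
                if needle in line:  # case-sensitive (assumption of this sketch)
                    out.write(line.encode("utf-8"))
                    kept += 1
    return kept


def search_zip(zip_path, needle, member="filtered.txt"):
    """Yield matching lines read directly from the archive member."""
    with zipfile.ZipFile(zip_path) as archive:
        with archive.open(member) as raw:
            for line in io.TextIOWrapper(raw, encoding="utf-8"):
                if needle in line:
                    yield line.rstrip("\n")
```

In use, `lines` could be any iterable of text lines, for example the standard output of the program being run, read one line at a time. Whether this would actually beat PIPE on VMS is untested here.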
(Posted by craigberry, 8/3/2004 2:29:37 AM)
In article <410EE139.6E9DE3FB@comcast.net>, David J Dachtera <djesys.nospam@comcast.net> writes:
> Jan-Erik Söderholm wrote:
>>
>> Doesn't PIPE always/mostly create temp files for the
>> "piped" data?
>
> Not to my knowledge, but how could one test for that?
$ pipe show u | show log sys$pipe
"SYS$PIPE" = "_EISNER$MPA1287:" (LNM$PROCESS_TABLE)
It uses the MP pseudo-device driver. It's mostly like an ordinary
mailbox. The device driver does not (as far as I know -- though
I cringe to think otherwise) make use of any disk files.
John Briggs
(Posted by briggs3, 8/3/2004 7:29:41 PM)
In article <y388I8c2gTXT@eisner.encompasserve.org>, briggs@encompasserve.org writes:
> In article <410EE139.6E9DE3FB@comcast.net>, David J Dachtera <djesys.nospam@comcast.net> writes:
>> Jan-Erik Söderholm wrote:
>>>
>>> Doesn't PIPE always/mostly create temp files for the
>>> "piped" data?
>>
>> Not to my knowledge, but how could one test for that?
>
> $ pipe show u | show log sys$pipe
> "SYS$PIPE" = "_EISNER$MPA1287:" (LNM$PROCESS_TABLE)
>
For completeness...
$ pipe show log sys$output | -
( type sys$pipe ; show log sys$pipe ) | -
type sys$pipe
"SYS$OUTPUT" = "_ALPHA$MPA127:" (LNM$PROCESS_TABLE)
"SYS$PIPE" = "_ALPHA$MPA127:" (LNM$PROCESS_TABLE)
So both reader and writer access the same named pipe device.
John Briggs
(Posted by briggs3, 8/3/2004 7:52:29 PM)
In article <410EA64E.69A2F6E9@aaa.com>, Jan-Erik Söderholm <aaa@aaa.com> writes:
> Doesn't PIPE always/mostly create temp files for the
> "piped" data?
>
There was an implementation of PIPE for VMS 6 and earlier which did
just that. The PIPE built into VMS since 7.0 does not use temporary
disk files, it uses a "pipe", which is basically a variation on the
concept of a mailbox.
(Posted by koehler2, 8/3/2004 8:10:41 PM)
Bob Koehler wrote:
>
> In article <410EA64E.69A2F6E9@aaa.com>, Jan-Erik Söderholm <aaa@aaa.com> writes:
> > Doesn't PIPE always/mostly create temp files for the
> > "piped" data?
> >
>
> There was an implementation of PIPE for VMS 6 and earlier which did
> just that. The PIPE built into VMS since 7.0 does not use temporary
> disk files, it uses a "pipe", which is basically a variation on the
> concept of a mailbox.
OK.
The O.P was talking of some 8Gb of data, right ? I would not expect PIPE
to be the most efficient method to deal with that amount.
Jan-Erik.
(Posted by aaa23, 8/3/2004 9:32:57 PM)
Tim ffrench-Lynch wrote:
> I'm trying to run some software on an Alpha PWS600au VMS 7.2 system
> with the output piped into first a search command and then InfoZip to
> compress the results, without creating a large intermediate file.
> Later I can use pipe and unzip to decompress into further search
> commands again without large intermediate files.
>
> The problem is that using the DCL Pipe command seems very slow
> compared to writing things to disk and reading back. I'd like to avoid
> the intermediate files being stored uncompressed as they can be as
> large as 5GB but even minimum compression with InfoZip reduces them by
> better than 20:1.
>
> The examples below perform a very short run and seem to suffer
> very badly from the use of Pipe.
>
> Is Pipe always slow?
> Any Suggestions?
<snip>
Yep, don't use PIPE.
PIPE does two undesirable things from a performance point of view:
it creates multiple subprocesses, something like one for each
"segment" in the pipe, and it uses mailboxes for transmitting the
data from one subprocess to the next, and those mailboxes, I believe,
use the default device characteristics, i.e., they may be limited
to 1056 bytes or so (is it DEFMBXBUFQUO or DEFMBXMXMSG that controls
this?). In any case, it's a small "pipe" (sorry for the pun) you're
trying to force a lot of data through.
I tell our developers that PIPE is an aid for the programmer, not
the program. You do NOT gain performance, only your personal time
if this is a quick one-off sort of thing. If you need a more
permanent tool, and/or you intend to do this processing regularly,
using temporary files is much more efficient and faster.
BTW, I use PIPE all the time for simple stuff from the command
line. However, I've made a rule (for myself and my team) to
NEVER use it in DCL procedures...especially stuff that runs in
batch several times a day or more...
Regards, Ken
--
I don't speak for Intel, Intel doesn't speak for me...
Ken Fairfield
D1C Automation VMS System Support
who: kenneth dot h dot fairfield
where: intel dot com
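To put a rough number on the "small pipe" concern, the sketch below counts how many separate transfers are needed to push a given amount of data through a channel with a fixed maximum message size. It assumes the "1056 bytes or so" figure from the post and treats 5GB as 5 * 1024**3 bytes; whether the pipe device really uses that limit is not settled in this thread, so the result is only an order-of-magnitude illustration.

```python
def transfers_needed(total_bytes, max_message_bytes):
    """Return how many messages of at most max_message_bytes carry total_bytes."""
    if max_message_bytes <= 0:
        raise ValueError("max_message_bytes must be positive")
    # Integer ceiling division, avoiding floating-point rounding on large sizes.
    return (total_bytes + max_message_bytes - 1) // max_message_bytes


# Assumed figures: 5 GB of program output, 1056-byte messages (from the post).
FIVE_GB = 5 * 1024**3
print(transfers_needed(FIVE_GB, 1056))
```

With those assumed figures the count comes out at a little over five million writes, each matched by a read in the next subprocess, which is one plausible reason a long pipeline could lag behind bulk disk I/O.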
(Posted by My.Full.Name, 8/3/2004 11:20:31 PM)
Ken Fairfield wrote:
>
> Tim ffrench-Lynch wrote:
>
> > I'm trying to run some software on an Alpha PWS600au VMS 7.2 system
> > with the output piped into first a search command and then InfoZip to
> > compress the results, without creating a large intermediate file.
> > Later I can use pipe and unzip to decompress into further search
> > commands again without large intermediate files.
> >
> > The problem is that using the DCL Pipe command seems very slow
> > compared to writing things to disk and reading back. I'd like to avoid
> > the intermediate files being stored uncompressed as they can be as
> > large as 5GB but even minimum compression with InfoZip reduces them by
> > better than 20:1.
> >
> > The examples below perform a very short run and seem to suffer
> > very badly from the use of Pipe.
> >
> > Is Pipe always slow?
> > Any Suggestions?
>
> <snip>
>
> Yep, don't use PIPE.
>
> PIPE does two undesirable things from a performance point of view:
> it creates multiple subprocesses, something like one for each
> "segment" in the pipe, and it uses mailboxes for transmitting the
> data from one subprocess to the next, and those mailboxes, I believe
> use the default device characteristics, i.e., they may be limited
> to 1056 bytes or so (is it DEFMBXBUFQUO or DEFMBXMXMSG that controls
> this?).
For MBA devices, yeah - I'd expect. Not sure about MPAs (pipes).
> In any case, it's a small "pipe" (sorry for the pun) you're
> trying to force a lot of data through.
>
> I tell our developers that PIPE is an aid for the programmer, not
> the program. You do NOT gain performance, only your personal time
> if this is a quick one-off sort of thing. If you need a more
> permanent tool, and/or you intend to do this processing regularly,
> using temporary files is much more efficient and faster.
>
> BTW, I use PIPE all the time for simple stuff from the command
> line. However, I've made a rule (for myself and my team) to
> NEVER use it in DCL procedures...especially stuff that runs in
> batch several times a day or more...
If you look at:
http://www.djesys.com/freeware/vms/vmspipe.zip
..., you'll see the product of my efforts to build "pipe lines" on
OpenVMS-VAX V5.5-2. The problem was that I needed to process more data
than I had freespace on disk. So, I put some programs together to create
permanent mailboxes, used DEFINEs and SPAWN/NOWAITs strategically and
hooked together some of my code and CONVERT/FDL to convert multi-GB's
worth of data from dd-format ASCII on 8mm (from AIX machines) to
ANSI-labelled EBCDIC on 9-track tape (for IBM mainframe) without the
need for intermediate files.
D.J.D.
(Posted by djesys.nospam3, 8/4/2004 1:38:55 AM)
On Mon, 02 Aug 2004 11:22:55 +0100, Tim ffrench-Lynch
<tim-DOT-ffrench-HYPHEN-lynch@baesystems.com> wrote:
>I'm trying to run some software on an Alpha PWS600au VMS 7.2 system
>with the output piped into first a search command and then InfoZip to
>compress the results, without creating a large intermediate file.
>Later I can use pipe and unzip to decompress into further search
>commands again without large intermediate files.
>
>The problem is that using the DCL Pipe command seems very slow
>compared to writing things to disk and reading back. I'd like to avoid
>the intermediate files being stored uncompressed as they can be as
>large as 5GB but even minimum compression with InfoZip reduces them by
>better than 20:1.
Apart from PIPE, you might also need to watch out for files > 4GB,
owing to signed/unsigned issues integral to InfoZip
(and perhaps > 2GB on compressed output).
There are some (beta) InfoZip versions available that'll handle both
(Google should be your friend here: Zip 2.4h, Unzip 5.51f, or later).
[Future InfoZip versions 3+ promise to settle these
32-bit limitations, IIRC. I haven't checked for v3 betas in the
past few weeks/months.]
In re usage of temp files, your best bet (if you can do so)
is to arrange for ample scratch space that you can point to
via "sys$scratch:".
Best-case performance might involve a dedicated disk
(physical, or perhaps virtual: LDAn: or VDan:) and playing
with process RMS params, "SET RMS/SEQ/BLOCK=64/BUFF=128/EXT=nnnnn".
Beyond that, XFC, if enabled (and with ample memory), will help you on the reads.
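The 2GB and 4GB thresholds mentioned above are simply the largest values a signed and an unsigned 32-bit integer can hold. The sketch below shows that arithmetic only; it is not taken from the InfoZip sources, and the function name is hypothetical.

```python
SIGNED_32_MAX = 2**31 - 1    # 2,147,483,647 bytes, just under 2 GB
UNSIGNED_32_MAX = 2**32 - 1  # 4,294,967,295 bytes, just under 4 GB


def fits_32bit(size_bytes):
    """Report whether a byte count fits in signed and unsigned 32-bit fields."""
    return {
        "signed": size_bytes <= SIGNED_32_MAX,
        "unsigned": size_bytes <= UNSIGNED_32_MAX,
    }


# A 5 GB input (taken here as 5 * 1024**3 bytes) overflows both field widths.
print(fits_32bit(5 * 1024**3))
```

A tool that stores sizes or offsets in either kind of 32-bit field can therefore misbehave on the 5GB files discussed here, while a 20:1 compressed result of about 256MB stays well inside both limits.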
(Posted by JBloggs, 8/4/2004 2:26:29 AM)
>
> there are some (beta) InfoZip versions available that'll handle both.
> (google should be your friend here, Zip 2.4h, Unzip 5.51f, or later)
>
> [future InfoZip versions 3+ promise to settle these
> 32 bit limitations. iirc. I haven't checked for v3 betas in the
> past few weeks/months]
I am using this Zip 2.4h version, but I have difficulties with VERY
large files, esp. when compressing image backups of big disks. The zip
files seem to be OK and there is no error while compressing, but when
uncompressing, they are often not usable ... the problem is: some are
good, some are bad, and I did not find any pattern in it.
Regards
Dieter
(Posted by dieter.rossbach, 8/4/2004 9:41:32 AM)
On 4 Aug 2004 02:41:32 -0700, dieter.rossbach@gmx.de (dieter rossbach)
wrote:
>> there are some (beta) InfoZip versions available that'll handle both.
>> (google should be your friend here, Zip 2.4h, Unzip 5.51f, or later)
>>
>> [future InfoZip versions 3+ promise to settle these
>> 32 bit limitations. iirc. I haven't checked for v3 betas in the
>> past few weeks/months]
>
>I am using this Zip 2.4h version, but I have difficulties with VERY
>large files, esp. when compressing image backups of big disks. The zip
>files seem to be OK and there is no error while compressing, but when
>uncompressing, they are often not usable ... the problem is: some are
>good, some are bad, and I did not find any pattern in it.
Hmm. that's alarming.
I've done ok (so far) on about 20 occasions with Zip 2.4h
using a single input file >4gb (~4.5 -- 5gb backup savesets)
but I haven't zipped anything resulting in an output .Zip
with size greater than 2gb.
(Posted by JBloggs, 8/4/2004 2:39:37 PM)
David J Dachtera wrote:
>
> ..., you'll see the product of my efforts to build "pipe lines" on
> OpenVMS-VAX V5.5-2. The problem was that I needed to process more data
> than I had freespace on disk. So, I put some programs together to create
> permanent mailboxes, used DEFINEs and SPAWN/NOWAITs strategically and
> hooked together some of my code and CONVERT/FDL to convert multi-GB's
> worth of data from dd-format ASCII on 8mm (from AIX machines) to
> ANSI-labelled EBCDIC on 9-track tape (for IBM mainframe) without the
> need for intermediate files.
Just as you can run RMU /UNLOAD and RMU /LOAD between two
Rdb databases using a mailbox without needing the diskspace
for the whole unload file...
Jan-Erik.
(Posted by aaa23, 8/4/2004 5:01:48 PM)