I'm a bit confused about this... hoping somebody can help.
So I have one program writing to stdout and another reading on stdin
(using fwrite() and fread()); I run them from the command line piping
the output of one to the input of the other. Experimentation reveals
that if the first program writes faster than the second reads, then
there is a fairly small buffer which, after it fills up, the first
program blocks on writing until the second has a chance to catch up.
Likewise, if the reader is faster, it blocks while the pipe is empty.
That's all logical enough. Question: How big is this buffer by default?
Can I change it?
Also (this is the part I'm really interested in), what happens if the
first program writes a little bit at a time, but the second one reads a
large chunk at a time, and it tries to read chunks significantly larger
than the size of that buffer (say, 2-3 times larger)? Will the first
program block when it fills the buffer, and then the second program will
block forever, because the amount of data it wants is never available?
Conversely, what if the first program writes in large chunks... can it
write in chunks bigger than the buffer size?
Thanks,
Josh
PS: I hope I don't offend anyone by not posting a real email address...
I get enough spam as it is. What's the etiquette on that? Most people
seem to include their email... is it considered impolite not to?
Posted by jh, 2/4/2006 6:53:36 AM

jh <no@thanks.com> writes:
> Question: How big is this buffer by default?
$ grep PIPE_BUF /usr/include/*/*.h
/usr/include/bits/posix1_lim.h:#define _POSIX_PIPE_BUF 512
/usr/include/linux/limits.h:#define PIPE_BUF 4096 /* # bytes in atomic write to a pipe */
> Can I change it?
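Not portably. (Note that PIPE_BUF is the largest write guaranteed to be
atomic, not the pipe's capacity. Linux 2.6.35 and later can resize a pipe
with fcntl(fd, F_SETPIPE_SZ, size).)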
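One way to see the real capacity on a given system, rather than inferring it from a header, is to fill a pipe until it refuses more. A minimal sketch, assuming POSIX pipe(), fcntl() and O_NONBLOCK; measure_pipe_capacity is a made-up name, not a library function:

```c
#include <errno.h>
#include <fcntl.h>
#include <unistd.h>

/* Hypothetical helper: count how many bytes a fresh pipe accepts before
 * a write would block. Returns the count, or -1 on error. */
long measure_pipe_capacity(void)
{
    int fds[2];
    if (pipe(fds) == -1)
        return -1;

    /* Non-blocking write end: a full pipe gives EAGAIN instead of
     * putting us to sleep with nobody left to drain the pipe. */
    int flags = fcntl(fds[1], F_GETFL);
    if (flags == -1 || fcntl(fds[1], F_SETFL, flags | O_NONBLOCK) == -1) {
        close(fds[0]);
        close(fds[1]);
        return -1;
    }

    long total = 0;
    int saved_errno = 0;
    const char byte = 'x';
    for (;;) {
        ssize_t n = write(fds[1], &byte, 1);
        if (n == 1) {
            total++;
        } else if (n == -1 && errno == EINTR) {
            continue;               /* interrupted by a signal, retry */
        } else {
            saved_errno = errno;    /* expected: EAGAIN, the pipe is full */
            break;
        }
    }

    close(fds[0]);
    close(fds[1]);
    return (saved_errno == EAGAIN || saved_errno == EWOULDBLOCK) ? total : -1;
}
```

Printing the result with printf("%ld\n", measure_pipe_capacity()) from main() should show the figure for the system at hand; the number varies between kernels, so treat it as a measurement and not a constant.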
> Also (this is the part I'm really interested in), what happens if the
> first program writes a little bit at a time, but the second one reads a
> large chunk at a time, and it tries to read chunks significantly larger
> than the size of that buffer (say, 2-3 times larger)?
Reads from a pipe can be "short": read() may return less data than the
program requested.
> PS: I hope I don't offend anyone by not posting a real email address...
> I get enough spam as it is. What's the etiquette on that? Most people
> seem to include their email... is it considered impolite not to?
Most people nowadays obfuscate their e-mail (as I have done), and
provide instructions that are easy for humans but hard for e-mail
bots to decode.
Cheers,
--
In order to understand recursion you must first understand recursion.
Remove /-nsp/ for email.
Posted by Paul, 2/4/2006 8:05:57 AM

"jh" <no@thanks.com> wrote in message
news:no-7CBD72.01533604022006@wonka.hampshire.edu...
> So I have one program writing to stdout and another reading on stdin
> (using fwrite() and fread()); I run them from the command line piping
> the output of one to the input of the other.
[snip]
> Question: How big is this buffer by default? Can I change it?
Which buffer? There are three: the output stdio buffer in the process with
the write end of the pipe, the pipe buffer itself, and the input stdio
buffer in the process with the read end of the pipe.
The stdio buffer sizes are implementation-defined, and can be changed with
setvbuf(). The pipe buffer's capacity is also implementation-defined; PIPE_BUF
(at least 512 bytes) is the largest write guaranteed to be atomic, not the
capacity itself.
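For the stdio side, a small sketch of the setvbuf() call mentioned above. The helper name use_stream_buffer is made up for illustration; the only real API used is setvbuf() from <stdio.h>:

```c
#include <stdio.h>

/* Hypothetical helper: make `stream` fully buffered using the caller's
 * `buf` of `size` bytes. Call it before any other I/O on the stream, and
 * keep `buf` alive until the stream is closed. Returns 0 on success. */
int use_stream_buffer(FILE *stream, char *buf, size_t size)
{
    return setvbuf(stream, buf, _IOFBF, size);
}
```

A typical use would be a static array, e.g. static char big[1 << 16]; followed by use_stream_buffer(stdout, big, sizeof big); at the very top of main(). Passing _IONBF instead of _IOFBF would turn stdio buffering off, which leaves only the pipe buffer between the two programs.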
> Also (this is the part I'm really interested in), what happens if the
> first program writes a little bit at a time, but the second one reads a
> large chunk at a time, and it tries to read chunks significantly larger
> than the size of that buffer (say, 2-3 times larger)? Will the first
> program block when it fills the buffer, and then the second program will
> block forever, because the amount of data it wants is never available?
fread() and fwrite() do not return until they have, respectively, read or
written the number of bytes implied by their arguments, unless there is an
error or EOF is reached (the latter for fread() only).
fread() and fwrite() internally call read() and write() respectively.
Ignoring signals, write() behaves like fwrite(). But read() has different
semantics: it returns as soon as data is available.
If you write a little bit at a time, fwrite() will probably copy the data in
each call to the stdio buffer, calling write() to drain that buffer when it
is full.
At the other end of the pipe, fread() will loop calling read(), which will
block if no data is available at the time of the call, until enough has been
read (assuming EOF is not reached).
> Conversely, what if the first program writes in large chunks... can it
> write in chunks bigger than the buffer size?
Yes; fwrite() will block in write() if necessary.
Alex
Posted by Alex, 2/4/2006 11:30:31 AM

On 2006-02-04, jh <no@thanks.com> wrote:
> I'm a bit confused about this... hoping somebody can help.
>
> So I have one program writing to stdout and another reading on stdin
> (using fwrite() and fread()); I run them from the command line piping
> the output of one to the input of the other. Experimentation reveals
> that if the first program writes faster than the second reads, then
> there is a fairly small buffer which, after it fills up, the first
> program blocks on writing until the second has a chance to catch up.
> Likewise, if the reader is faster, it blocks while the pipe is empty.
> That's all logical enough. Question: How big is this buffer by default?
> Can I change it?
>
> Also (this is the part I'm really interested in), what happens if the
> first program writes a little bit at a time, but the second one reads a
> large chunk at a time, and it tries to read chunks significantly larger
> than the size of that buffer (say, 2-3 times larger)? Will the first
> program block when it fills the buffer, and then the second program will
> block forever, because the amount of data it wants is never available?
No, it will read what it can. It may or may not then block on a second
attempt to read, but by then the buffer is empty and the writer can
write more.
> Conversely, what if the first program writes in large chunks... can it
> write in chunks bigger than the buffer size?
No. The write will (I believe) succeed in writing what it can. It may
then block on its next attempt to write, until the buffer is drained.
> PS: I hope I don't offend anyone by not posting a real email address...
> I get enough spam as it is. What's the etiquette on that? Most people
> seem to include their email... is it considered impolite not to?
You should really use *.invalid for that purpose - and while it's
technically against the rules, no-one will really care unless you're
a troll using it to hide your identity.
Posted by Jordan, 2/4/2006 5:50:09 PM

On 04/02/2006, Alex Fraser wrote:
> Which buffer? There are three: the output stdio buffer in the process
> with the write end of the pipe, the pipe buffer itself, and the input
> stdio buffer in the process with the read end of the pipe.
I've seen some apps (e.g. the ALSA aplay utility) which use read() and
write() instead of fread() and fwrite() to talk to stdin and stdout,
getting the file descriptor from fileno(stdin) or fileno(stdout).
I can see the rationale for this: it gives more fine-grained control
over the I/O, and presumably the file descriptors can be made
non-blocking.
Does anyone think this is a particularly good/bad idea?
--
Simon Elliott http://www.ctsn.co.uk
Posted by Simon, 2/4/2006 6:11:02 PM

"Simon Elliott" <Simon at ctsn.co.uk> wrote in message
news:43e4ee36$0$1170$bed64819@news.gradwell.net...
> I've seen some apps (eg the ALSA aplay utility) which use read() and
> write() instead of fread() and fwrite(), to talk to stdin and stdout,
> getting the file descriptor from fileno(stdin) or fileno(stdout).
>
> I can see the rationale for this: it gives more fine-grained control
> over the I/O, and presumably the file descriptors can be made non
> blocking.
>
> Does anyone think this is a particularly good/bad idea?
IMO: if you use POSIX I/O functions because the standard C I/O ("stdio")
functions can't do the job, then it is obviously a good idea. Otherwise it
is a bad idea.
There are definitely cases where the stdio functions can't do the job. If
you want to multiplex I/O using select() or poll(), including stdin/out/err,
then the underlying descriptors must be non-blocking for robustness, and the
stdio functions require blocking descriptors. If you are handling signals,
the interaction with stdio functions is unspecified, whereas the interaction
with POSIX functions is specified.
You might want to use SIGALRM or select()/poll() to implement I/O with
timeouts. This rules out stdio.
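As an illustration of the timeout case, a minimal sketch using poll() on a raw descriptor. read_with_timeout and its -2 return code are invented for this example. For brevity it uses a blocking descriptor, so it is weaker than the non-blocking setup recommended above:

```c
#include <errno.h>
#include <poll.h>
#include <unistd.h>

/* Hypothetical helper: wait up to `timeout_ms` for `fd` to become
 * readable, then read() once. Returns the byte count (0 means EOF),
 * -1 on error, or -2 if nothing arrived in time. */
ssize_t read_with_timeout(int fd, void *buf, size_t len, int timeout_ms)
{
    struct pollfd pfd = { .fd = fd, .events = POLLIN };
    int ready;

    do {
        /* Simplification: after a signal the full timeout starts over. */
        ready = poll(&pfd, 1, timeout_ms);
    } while (ready == -1 && errno == EINTR);

    if (ready == -1)
        return -1;
    if (ready == 0)
        return -2;                  /* timed out, nothing to read */
    return read(fd, buf, len);      /* may still be a short read */
}
```

Something like read_with_timeout(fileno(stdin), buf, sizeof buf, 1000) would wait at most about a second for input. Doing the same through fread() is not possible, because stdio gives no way to bound how long the underlying read() blocks.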
Alex
Posted by Alex, 2/5/2006 2:13:27 PM

"Jordan Abel" <random832@gmail.com> wrote in message
news:slrndu9qgl.lc9.random832@random.yi.org...
> On 2006-02-04, jh <no@thanks.com> wrote:
> > So I have one program writing to stdout and another reading on stdin
> > (using fwrite() and fread()); I run them from the command line piping
> > the output of one to the input of the other.
[snip]
> > Also (this is the part I'm really interested in), what happens if the
> > first program writes a little bit at a time, but the second one reads a
> > large chunk at a time, and it tries to read chunks significantly larger
> > than the size of that buffer (say, 2-3 times larger)? Will the first
> > program block when it fills the buffer, and then the second program
> > will block forever, because the amount of data it wants is never
> > available?
>
> No, it will read what it can.
This is basically true for read(), but not fread().
> It may or may not then block on a second attempt to read, but by then the
> buffer is empty and the writer can write more.
For read() on a blocking descriptor, the "may or may not" is determined by
whether or not there is (more) data available. That is, ignoring EOF and
signals, read() blocks if no bytes are available, else it returns whatever
data it can (normally the lesser of the number of bytes available and the
specified size, but theoretically anything from one byte up to that amount).
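The "returns whatever data it can" rule is easy to observe within a single process. A small sketch, with short_read_demo as a made-up name: five bytes go into a pipe, then read() is asked for 4096.

```c
#include <unistd.h>

/* Hypothetical demo: put 5 bytes in a pipe, then request 4096 with one
 * read(). Returns read()'s result, or -1 if the setup failed. */
ssize_t short_read_demo(void)
{
    int fds[2];
    char big[4096];

    if (pipe(fds) == -1)
        return -1;
    if (write(fds[1], "hello", 5) != 5) {
        close(fds[0]);
        close(fds[1]);
        return -1;
    }

    /* Data is available, so read() returns at once with what is there
     * instead of waiting for the full 4096 bytes. */
    ssize_t n = read(fds[0], big, sizeof big);

    close(fds[0]);
    close(fds[1]);
    return n;
}
```

On the systems I know of this returns the five bytes that were available. A second read() on the now-empty pipe would block, since the write end is still open.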
> > Conversely, what if the first program writes in large chunks... can it
> > write in chunks bigger than the buffer size?
>
> No. The write will (i believe) succeed in writing what it can.
On a blocking descriptor, write() will write as many bytes as requested
(blocking if necessary), unless there is an error or a signal causes it to
return early.
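Because a signal (or a non-blocking descriptor) can make write() return early, code that uses write() directly usually wraps it in a loop. A common pattern, sketched here under the name write_all, which is my own and not a standard function:

```c
#include <errno.h>
#include <unistd.h>

/* Hypothetical helper: keep calling write() until all `len` bytes are
 * written. Returns 0 on success, -1 on error (errno is left set). */
int write_all(int fd, const void *buf, size_t len)
{
    const char *p = buf;

    while (len > 0) {
        ssize_t n = write(fd, p, len);
        if (n == -1) {
            if (errno == EINTR)
                continue;           /* interrupted before writing, retry */
            return -1;
        }
        p += n;                     /* partial write: advance and go on */
        len -= (size_t)n;
    }
    return 0;
}
```

With a blocking pipe this can be handed a buffer much larger than the pipe's capacity; it simply sleeps inside write() whenever the pipe is full and resumes as the reader drains it.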
Alex
Posted by Alex, 2/5/2006 2:13:32 PM

Simon> I've seen some apps (eg the ALSA aplay utility) which use read()
Simon> and write() instead of fread() and fwrite(), to talk to stdin and
Simon> stdout, getting the file descriptor from fileno(stdin) or
Simon> fileno(stdout).
Isn't fileno(stdin) --- resp. fileno(stdout) --- just a fancy way to
write 0, resp. 1?
Simon> I can see the rationale for this: it gives more fine-grained
Simon> control over the I/O, and presumably the file descriptors can be
Simon> made non blocking. Does anyone think this is a particularly
Simon> good/bad idea?
Using read or write alone is not the bad idea; _mixing_ read and write
with fread, fwrite and the rest of buffered I/O is.
--
A true pessimist won't be discouraged by a little success.
Posted by Ian, 2/5/2006 4:11:22 PM

On 2006-02-05, Alex Fraser <me@privacy.net> wrote:
> "Jordan Abel" <random832@gmail.com> wrote in message
> news:slrndu9qgl.lc9.random832@random.yi.org...
>> On 2006-02-04, jh <no@thanks.com> wrote:
>> > So I have one program writing to stdout and another reading on stdin
>> > (using fwrite() and fread()); I run them from the command line piping
>> > the output of one to the input of the other.
> [snip]
>> > Also (this is the part I'm really interested in), what happens if the
>> > first program writes a little bit at a time, but the second one reads a
>> > large chunk at a time, and it tries to read chunks significantly larger
>> > than the size of that buffer (say, 2-3 times larger)? Will the first
>> > program block when it fills the buffer, and then the second program
>> > will block forever, because the amount of data it wants is never
>> > available?
>>
>> No, it will read what it can.
>
> This is basically true for read(), but not fread().
There is no such thing as fread(). I was talking in terms of the actual
system calls inevitably made by the program, since for these purposes it
doesn't matter what language they're actually in. In the case of
fread(), the "second attempt to read" is made in a loop within fread.
>> It may or may not then block on a second attempt to read, but by then the
>> buffer is empty and the writer can write more.
>
> For read() on a blocking descriptor, the "may or may not" is determined by
> whether or not there is (more) data available.
I was referring to after it drains a full buffer. [therefore, there's no
data left until the writer puts in more]
>> > Conversely, what if the first program writes in large chunks... can it
>> > write in chunks bigger than the buffer size?
>>
>> No. The write will (i believe) succeed in writing what it can.
>
> On a blocking descriptor, write() will write as many bytes as requested
> (blocking if necessary), unless there is an error or a signal causes it to
> return early.
Are you sure? It can't time out?
Now, if write is blocking, at least the ones that fit in the buffer will
then be available to the reader, right?
Posted by Jordan, 2/5/2006 8:21:43 PM

"Jordan Abel" <random832@gmail.com> wrote in message
news:slrnducnp0.qjm.random832@random.yi.org...
> On 2006-02-05, Alex Fraser <me@privacy.net> wrote:
> > "Jordan Abel" <random832@gmail.com> wrote in message
> > news:slrndu9qgl.lc9.random832@random.yi.org...
> >> On 2006-02-04, jh <no@thanks.com> wrote:
> >> > So I have one program writing to stdout and another reading on stdin
> >> > (using fwrite() and fread()); I run them from the command line
> >> > piping the output of one to the input of the other.
> > [snip]
> >> > [if the first program writes a little bit at a time, but the second
> >> > one reads large chunks, then] the second program will block forever,
> >> > because the amount of data it wants is never available?
> >>
> >> No, it will read what it can.
> >
> > This is basically true for read(), but not fread().
>
> There is no such thing as fread().
Are you sure?
> I was talking in terms of the actual system calls inevitably made by the
> program,
But given that the OP only mentioned fread() and fwrite(), you didn't think
that fact was worth mentioning? (This was really my point.)
[snip]
> > On a blocking descriptor, write() will write as many bytes as requested
> > (blocking if necessary), unless there is an error or a signal causes it
> > to return early.
>
> Are you sure? It can't time out?
Not usually, but a timeout would constitute an error.
Alex
Posted by Alex, 2/5/2006 10:54:11 PM

Ian Zimmerman <nobrowser@gmail.com> wrote, on Sun, 05 Feb 2006:
> Isn't fileno(stdin) --- resp. fileno(stdout) --- just a fancy way to
> write 0, resp. 1?
They start out with those values, but there are ways they can get
changed. E.g.:
    close(0);
    freopen(somefile, "w", stdout);
Now fileno(stdout) is 0.
--
Geoff Clare <netnews@gclare.org.uk>
Posted by Geoff, 2/7/2006 6:12:15 PM