[CALUG] More I/O buffering foolishness
Mordechai T. Abzug
morty at frakir.org
Thu Apr 13 21:17:09 CDT 2006
On Wed, Mar 01, 2006 at 09:39:22PM -0500, Jason C. Miller wrote:
> Arg. Alright....some more details. :)
>
> 1. It is a Solaris 8 machine
> 2. My code is not as simple as a single perl print() statement. That
> was simply there to provide a simple example that doesn't work on my
> system. It works fine at the command prompt but not from a script
> (which, after more reading, I understand a lot more about how programs
> see what devices STDOUT is attached to and how they act accordingly).
> 3. As mentioned, it doesn't matter what command (perl, awk, sed, grep,
> etc) comes after that next pipe. All the STDIN for those programs are
> block-buffered.
[Was going through old email, and saw this. Apparently, the problem is
not fixed, so I'm responding even though it's an old post.]
This issue has bitten me in the past. As someone else said, the
underlying problem is that *nix libc and other *nix I/O
implementations tend to autodetect whether their stdout is a TTY or
a file/pipe, and adjust buffering accordingly. The easiest way to
demonstrate this is to run a program with slow output, and then run
the same program while piping to cat; if you see different results,
you're seeing buffering. Examples:
perl -le 'while (1) {print ++$a; sleep 1}'
perl -le 'while (1) {print ++$a; sleep 1}'|cat
There are various fixes:
(1) If you have expect, it comes with a script called "unbuffer"
that wraps any program you run, redirecting its output through a
pty. The upshot is that the program switches to line-buffered
output. So instead of "command1 | command2 | command3", you can use
"unbuffer command1 | unbuffer command2 | command3". For the sample
case:
unbuffer perl -le 'while (1) {print ++$a; sleep 1}'|cat
This is my preferred solution for this problem in the general
case, since it is portable and requires no source-code
modifications.
The only issues I have had with it are: (1) some Linux distros
don't put "unbuffer" in the default path even after you install
expect; and (2) older versions of unbuffer took no arguments,
while newer versions require -p for unbuffer to work with stdin.
(2) If you can modify the script/program in question, you can usually
have it specify no buffering or line buffering. Perl's "$|"
variable, when set true, disables output buffering entirely on the
currently selected filehandle (stdout by default); or, if you use
FileHandle/IO::Handle, you can call setvbuf() to set line
buffering, or call flush() manually after each line. C has a
setvbuf(3) call, or you can call fflush(3) manually
after each line. For the sample case:
perl -le '$|=1; while (1) {print ++$a; sleep 1}'|cat
This is portable, but only works for scripts/programs for which
you have the source code and are willing to modify them.
In general, it's better to set up line buffering than to disable
buffering entirely; turning buffering off altogether can hurt
performance.
(3) Some canned commands have special options to disable buffering or
enable line buffering. tcpdump's "-l" leaps to mind. This is not
a general solution, but it is worth keeping in mind.
- Morty