[CALUG] More I/O buffering foolishness

Mordechai T. Abzug morty at frakir.org
Thu Apr 13 21:17:09 CDT 2006


On Wed, Mar 01, 2006 at 09:39:22PM -0500, Jason C. Miller wrote:
> Arg.  Alright....some more details.  :)
> 
> 1. It is a Solaris 8 machine.
> 2. My code is not as simple as a single perl print() statement.  That
> was just there as a minimal example that doesn't work on my system.
> It works fine at the command prompt but not from a script (after more
> reading, I now understand a lot more about how programs detect what
> device STDOUT is attached to and act accordingly).
> 3. As mentioned, it doesn't matter what command (perl, awk, sed, grep,
> etc.) comes after that next pipe.  The STDIN of each of those programs
> is block-buffered.

[I was going through old email and saw this.  Apparently the problem
was never fixed, so I'm responding even though it's an old post.]

This issue has bitten me in the past.  As someone else said, the
underlying problem is that *nix libc and other *nix I/O
implementations tend to autodetect whether their stdout is a TTY or a
file/pipe and adjust buffering accordingly: typically line-buffered
for a TTY, block-buffered for anything else.  The easiest way to demo
this is to run a program with slow output, and then run the same
program while piping to cat; if the bare version prints once a second
but the piped version sits silent until the block buffer fills,
you're seeing buffering.  Examples:

perl -le 'while (1) {print ++$a; sleep 1}'
perl -le 'while (1) {print ++$a; sleep 1}'|cat

There are various fixes:

(1) If you have expect, it comes with a script called "unbuffer" that
    acts as a wrapper for any program you run, redirecting the
    program's output through a pty.  The upshot is that the program
    switches to line-buffered output.  So instead of "command1 |
    command2 | command3", you can use "unbuffer command1 | unbuffer
    command2 | command3".  For the sample case:

    unbuffer perl -le 'while (1) {print ++$a; sleep 1}'|cat

    This is my preferred solution for the general case, since it is
    portable and requires no source-code modifications.

    The only issues I have had with it: (a) some Linux distros don't
    put "unbuffer" in the default path even after you install expect;
    and (b) older versions of unbuffer took no options, while newer
    versions require -p for unbuffer to read from stdin.
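
    So with a newer unbuffer, a pipeline where the wrapped command
    reads from stdin becomes:

    unbuffer command1 | unbuffer -p command2 | command3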

(2) If you can modify the script/program in question, you can usually
    have it specify line buffering or no buffering.  Perl has a "$|"
    variable that disables buffering completely on the currently
    selected output filehandle (STDOUT by default); or, if you use
    FileHandle/IO::Handle, you can use setvbuf() to set line
    buffering, or you can manually flush() after each line.  C has a
    setvbuf(3) call, or you can manually fflush(3) after each line.
    For the sample case:

    perl -le '$|=1; while (1) {print ++$a; sleep 1}'|cat
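
    Or, flushing explicitly by hand via IO::Handle:

    perl -MIO::Handle -le 'while (1) {print ++$a; STDOUT->flush; sleep 1}'|cat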

    This is portable, but only works for scripts/programs for which
    you have the source code and are willing to modify them.

    In general, it's better to set up line buffering than no
    buffering: disabling buffering altogether can hurt performance,
    since every output operation becomes a separate write.
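
    In C, that is a single setvbuf(3) call before any output.  A
    minimal sketch of the sample case (untested, but standard C):

    #include <stdio.h>
    #include <unistd.h>

    int main(void)
    {
        int i;

        /* Line-buffer stdout even when it is a pipe, so every
           printf() is flushed at its newline. */
        setvbuf(stdout, NULL, _IOLBF, BUFSIZ);

        for (i = 1; ; i++) {
            printf("%d\n", i);
            sleep(1);
        }
    }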

(3) Some canned commands have special options to disable buffering or
    enable line buffering.  tcpdump's "-l" leaps to mind.  This is
    not a general solution, but it's worth keeping in mind.
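
    For example, to watch packets as they arrive while also saving
    them to a file (filename arbitrary):

    tcpdump -l | tee capture.txt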

- Morty

