An irritating bug in bash

Here’s what kernel developers talk about at coffee: bugs in our command shells.  At least, that was the topic the other day when Stephen, Dave and I were complaining about the various troubles we’d had with different command shells.

While others moved to zsh some years ago, I have been a bash user since kicking the tcsh habit.  But for years I have been plagued by a subtle nuisance in bash: sometimes it doesn’t catch terminal window resizes properly.  The result is that command line editing works very poorly until bash finally figures this out. After a while, I worked out that this behavior happens only when the window size change happens while you’re in some application spawned by bash.  So if you’re in an editor like vim, change the window size, and then exit (or suspend) the editor, bash will be confused about the terminal size.

While this has always annoyed me, it never quite reached the threshold for me to do anything about it.  But recently it has been bugging me more and more.  After we returned from coffee, I dug into the bash manual and discovered a little-known shell option, checkwinsize.  You can set it as follows:

    shopt -s checkwinsize

which the bash manual describes this way: “If set, Bash checks the window size after each command and, if necessary, updates the values of LINES and COLUMNS.” Sounds great!  As an aside, I think that as a modern shell, bash should set this option by default.  (Others think so too.)

With much self-satisfaction I set this option and got ready for line editing bliss.  But, no joy.  I checked and rechecked, and finally started using truss, and then DTrace, to try to understand the problem.  After some digging I eventually discovered the following bug in the shell.  Here’s the meat of the writeup I submitted to the bash-bug list:

On Solaris/OpenSolaris platforms, I have discovered what I believe is a
bug in lib/sh/winsize.c.
I discovered with a debugger that the get_new_window_size() function
has no effect on Solaris.  In fact, here is what this file looks like if
you compile it:
    $ dis winsize.o
    disassembly for winsize.o
    section .text
    get_new_window_size:     c3                 ret
That's it-- an empty function.  The problem is that the appropriate header
file is not getting pulled in, in order to #define TIOCGWINSZ.
As a result, even with 'shopt -s checkwinsize' set on Solaris, bash
does not check the win size on suspend of a program, or on program
exit.  This is massively frustrating, and I know of several Solaris
users who have switched to zsh as a result of this bug.
I have not tried bash 4.0, but looking at the source code, it appears
that the bug is present there as well.
I added an ifdef clause which looks to see if the HAVE_TERMIOS_H define
is set, after the #include of config.h.  If it is, then I #include the
termios.h header file.  This solves the problem, which I confirmed by
rebuilding and dis'ing the function.  I also ran my recompiled bash
and confirmed that it now worked correctly.
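
For readers who want to see the shape of the fix, here is a minimal sketch in C. It is not the actual patch or the real lib/sh/winsize.c: the helper name query_window_size is hypothetical, and HAVE_TERMIOS_H is defined by hand only so the example stands alone (in bash, that macro would come from config.h). The point it illustrates is the one from the writeup: if no header defines TIOCGWINSZ, the guarded body is compiled out and the function does nothing useful.

```c
/* Illustrative sketch only; not the bash source or the submitted patch. */
#include <assert.h>
#include <stddef.h>
#include <sys/ioctl.h>   /* declares ioctl() */
#include <unistd.h>

/* Assumption: stands in for the define that config.h would provide. */
#define HAVE_TERMIOS_H 1

/* The kind of guarded include the writeup describes adding. */
#if defined(HAVE_TERMIOS_H)
#  include <termios.h>   /* per the writeup, this brings in TIOCGWINSZ on Solaris */
#endif

/*
 * Hypothetical helper: ask the terminal on fd for its current size.
 * Returns 0 and fills *rows and *cols on success, -1 otherwise.
 */
int query_window_size(int fd, int *rows, int *cols)
{
    assert(rows != NULL && cols != NULL);
#if defined(TIOCGWINSZ)
    struct winsize ws;

    if (ioctl(fd, TIOCGWINSZ, &ws) == 0 && ws.ws_row > 0 && ws.ws_col > 0) {
        *rows = ws.ws_row;
        *cols = ws.ws_col;
        return 0;
    }
#endif
    /* If TIOCGWINSZ is undefined, the block above vanishes and only this remains. */
    return -1;
}
```

Calling query_window_size(STDIN_FILENO, &rows, &cols) from a terminal should fill in the current dimensions. Removing the termios.h include on a platform where no other header defines TIOCGWINSZ would leave a function that always fails, which is the analogue of the one-instruction disassembly shown in the writeup.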

Hopefully the bash maintainers will take note and fix this bug.  In the meantime, I’m going to see if we can get the fix for this applied to the Nevada (and hence, OpenSolaris) version of bash.

Update: The bash maintainers have fixed this bug in the following patch to bash 4.x.  Hurray!

