Commit c2bd98f6 authored by David Johnson

Work around dhclient/resolvconf problem in Ubuntu 16.

This replaces the first attempt, which just masked the race condition,
since I didn't understand what tmcc bossinfo was really doing.  This
appears to fix it satisfactorily for now; it doesn't seem that we will
run into the case where the file exists but has no nameserver.

  resolvconf on Linux also breaks DNS momentarily via dhclient exit
  hook, or something.  On Ubuntu 16, resolvconf is set up to run via
  the dhclient enter hook (the hook redefines make_resolv_conf, which
  dhclient-script eventually executes prior to the exit hook execution).
  For whatever reason, though, sometimes when our exit hook (this
  script) runs, /etc/resolv.conf is a dangling symlink.  I was not able
  to find the source of the asynchronous behavior, so I can't say for
  sure.  But sethostname.dhclient is an immediate casualty, because it
  calls tmcc bossinfo, and the tmcc binary calls res_init, reads the
  resolver address, and uses that as boss.  If there is no
  /etc/resolv.conf (or it is a broken symlink into /run, as it is on
  resolvconf systems before resolvconf runs for the first time on
  boot), res_init will return localhost, and there is no way for us in
  tmcc to know that is inappropriate (taking the res_init resolver
  might not be the best choice, but we do not dare to add a
  special-case rejection of localhost in tmcc).
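
  To make the failure state above concrete, here is a minimal sketch of
  how a hook could tell a dangling /etc/resolv.conf symlink apart from a
  missing or usable file.  The helper names is_dangling_symlink and
  resolv_conf_state are hypothetical and are not part of this commit;
  only standard POSIX test operators are used.

```shell
#!/bin/sh
# Hypothetical helpers for illustration; not part of the commit.

# Succeed when $1 is a symlink whose target does not exist.
is_dangling_symlink() {
    # -L is true for any symlink; -e follows the link and is false
    # when the target is missing.
    [ -L "$1" ] && [ ! -e "$1" ]
}

# Print "dangling", "present", or "missing" for a resolver file.
# Defaults to /etc/resolv.conf when no argument is given.
resolv_conf_state() {
    rcs_path="${1:-/etc/resolv.conf}"
    if is_dangling_symlink "$rcs_path"; then
        echo "dangling"
    elif [ -f "$rcs_path" ]; then
        # -f follows symlinks, so a link to a regular file counts.
        echo "present"
    else
        echo "missing"
    fi
}
```

  Calling resolv_conf_state with no arguments inspects /etc/resolv.conf;
  the "dangling" result corresponds to the state in which, per the
  message above, res_init falls back to localhost.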
parent a3ea0297
@@ -90,10 +90,7 @@ fi
 # leaves DNS a little wonky. So we whack on it til it responds so that
 # the sethostname script won't fail.
 #
-# resolveconf on Linux also breaks DNS momentarily via dhclient exit hook.
-# So if we have that, avoid it, too.
-#
-if [ "$new_network_number" = "10.200.1.0" -o -x /sbin/resolvconf ]; then
+if [ "$new_network_number" = "10.200.1.0" ]; then
     for i in 0 1 2; do
         if `$BINDIR/tmcc bossinfo >/dev/null 2>&1`; then
             break
@@ -103,6 +100,37 @@ if [ "$new_network_number" = "10.200.1.0" -o -x /sbin/resolvconf ]; then
     done
 fi
 
+#
+# resolvconf on Linux also breaks DNS momentarily via dhclient exit
+# hook, or something. On Ubuntu 16, resolvconf is setup to run via
+# dhclient enter hook (the hook redefines make_resolv_conf, which
+# dhclient-script eventually executes prior to the exit hook execution).
+# For whatever reason, though, sometimes when our exit hook (this
+# script) runs, /etc/resolv.conf is a dangling symlink. I was not able
+# to find the source of the asynch behavior, so I can't say for sure.
+# But sethostname.dhclient is an immediate casualty, because it calls
+# tmcc bossinfo(), and the tmcc binary attempts to use res_init and read
+# the resolver and use that as boss. If there is no /etc/resolv.conf
+# (or it is a broken symlink into /run, as it is on resolvconf systems
+# before resolvconf runs for the first time on boot), res_init will
+# return localhost, and there is no way for us in tmcc to know that is
+# inappropriate (taking the res_init resolver might not be the best
+# choice, but we do not dare to add a special-case rejection of
+# localhost in tmcc... you never know what crazy proxy schemes might
+# arise in the future).
+#
+if [ -x /sbin/resolvconf ]; then
+    rcwaittime=0
+    while [ ! -f `readlink -f /etc/resolv.conf` -a $rcwaittime -lt 5 ]; do
+        echo "`date`: waiting for /etc/resolv.conf to exist..." >>$LOGDIR/dhclient-exit.log 2>&1
+        sleep 1
+        rcwaittime=`expr $rcwaittime + 1`
+    done
+    if [ ! -f `readlink -f /etc/resolv.conf` ]; then
+        echo "*** WARNING: /etc/resolv.conf does not exist; this will likely cause problems!" >>$LOGDIR/dhclient-exit.log 2>&1
+    fi
+fi
+
 #
 # See if the Testbed configuration software wants to change the hostname.
 #
......
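
The wait loop added by this commit can be sketched as a reusable function with quoted expansions. This is an illustrative variant, not the committed code: the function name wait_for_resolv_conf and its two parameters are assumptions. It relies on `[ -f ]` following symlinks, so it skips the `readlink -f` call; an unquoted `readlink -f` that prints nothing (for example, when an intermediate directory is missing) would otherwise leave `[` with a malformed expression. It also splits the `-a` conjunction into two `[ ]` tests, since POSIX marks `-a` as obsolescent.

```shell
#!/bin/sh
# Hypothetical hardened variant of the wait loop; names are illustrative.
# Usage: wait_for_resolv_conf [path] [max_seconds]
# Returns 0 once the path resolves to a regular file, 1 on timeout.
wait_for_resolv_conf() {
    rc_path="${1:-/etc/resolv.conf}"
    rc_max="${2:-5}"
    rc_waited=0
    # [ -f ] follows symlinks, so it is false for a dangling link.
    while [ ! -f "$rc_path" ] && [ "$rc_waited" -lt "$rc_max" ]; do
        sleep 1
        rc_waited=$((rc_waited + 1))
    done
    # Final check decides the return status.
    [ -f "$rc_path" ]
}
```

In the exit hook this could be called as `wait_for_resolv_conf /etc/resolv.conf 5 || echo "*** WARNING: /etc/resolv.conf does not exist" >>"$LOGDIR/dhclient-exit.log"`, mirroring the warning the commit writes. Like the committed loop, the sketch only checks that the file exists, not that it contains a nameserver line.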