Commit 74df028a authored by David Johnson

Fix ctl node reboot races on Liberty/Ubuntu 15.10.

Rebooting the ctl node on the Liberty version would leave mysql
unable to start, which renders all openstack services
inoperable.

Recall that in the common case (because we have many testbeds whose
nodes only have one expt interface), we set up the openstack mgmt lan as
a VPN over the control net between all the nodes, served from the nm
node.

Well, mysql binds to and listens on the ip addr of the mgmt net device,
and when the ctl node is rebooted, mysql starts long before openvpn can
bring up the vpn client net device.  Moreover, rabbitmq would fail to
start for the same reason, and rabbitmq is the AMQP messaging service
that underlies all openstack RPC.

For various reasons, it's not sufficient to just make the mysql
initscript (which on 15.10 is still legacy LSB!) depend on the openvpn
legacy LSB initscript.

So I wrote a little initscript (embedded in setup-controller.sh) that
spins in a sleep 1; loop, looking for the mgmt net to get its known IP
from the openvpn client.  It has a reverse dependency on mysql
(X-Start-Before: mysql), so it runs to completion before mysql starts.
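
The wait loop described above can be sketched as a standalone function.
This is a minimal sketch, not the script embedded in setup-controller.sh:
the function name wait_for_ip, the timeout argument, and the IP_CMD
override are hypothetical additions for illustration.  The initscript in
this commit has no timeout and spins until the address appears.

```shell
#!/bin/bash
# Hypothetical helper: poll once per second until an IP address shows up
# on some local interface, giving up after a timeout (in seconds).
# IP_CMD is an assumed override hook so the loop can be exercised without
# touching real network state; by default it runs "ip addr show".
wait_for_ip() {
    local want_ip="$1"
    local timeout="${2:-60}"
    local waited=0

    while :; do
        # -F: fixed string (dots are literal); -w: whole-word match, so
        # 10.0.0.1 does not match 10.0.0.10.
        if ${IP_CMD:-ip addr show} 2>/dev/null | grep -qwF -- "$want_ip"; then
            return 0
        fi
        if [ "$waited" -ge "$timeout" ]; then
            return 1
        fi
        sleep 1
        waited=$((waited + 1))
    done
}
```

A caller could run something like `wait_for_ip "$MGMTIP" 120 || exit 1`.
Whether a timeout is desirable here is a judgment call: giving up lets the
boot proceed, but then mysql may start with no mgmt address to bind to.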

Then, we had to handle the rabbitmq case... but rabbitmq has a modern
systemd unit file, not an LSB initscript.  So I wrote a systemd unit
file that invokes my mgmt net LSB initscript to wait for the mgmt net
IP... and that has a reverse dep on rabbitmq-server.service.
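
One way to sanity-check the reverse dependency is to confirm that the
waiter unit file really declares Before= on the service it is meant to
hold back.  The helper below is a hypothetical sketch (the name
unit_orders_before is not part of this commit) and it only inspects the
file text; on a live system, `systemctl show -p After
rabbitmq-server.service` should report the ordering systemd resolved.

```shell
#!/bin/bash
# Hypothetical helper: succeed if the unit file at $1 has a Before= line
# that lists the unit named in $2.
unit_orders_before() {
    local unit_file="$1"
    local target="$2"
    # Before= may hold several space-separated units; split them one per
    # line and look for an exact, whole-line match.
    grep -E '^Before=' "$unit_file" \
        | sed 's/^Before=//' \
        | tr -s ' ' '\n' \
        | grep -qxF -- "$target"
}
```

For example, `unit_orders_before
/etc/systemd/system/openvpn-net-waiter.service rabbitmq-server.service`
would be expected to succeed after setup-controller.sh has run.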

Now all is good.  mysql and rabbitmq-server are certainly blocked for a
few extra seconds, while the VPN comes up, but all the openstack
services themselves are written defensively to handle RPC server
disconnects, or database disconnects (doh).
parent a814e07f
@@ -112,6 +112,63 @@ if [ -z "${DB_ROOT_PASS}" ]; then
service_enable mysql
# Save the passwd
echo "DB_ROOT_PASS=\"${DB_ROOT_PASS}\"" >> $SETTINGS
if [ -z "${MGMTLAN}" -a $OSVERSION -ge $OSLIBERTY ]; then
# Make sure mysqld won't start until after the openvpn
# mgmt net is up.
cat <<EOF >/etc/init.d/legacy-openvpn-net-waiter
#!/bin/bash
#
### BEGIN INIT INFO
# Provides: legacy-openvpn-net-waiter
# Required-Start: \$network openvpn
# Required-Stop:
# Should-Start: \$network openvpn
# X-Start-Before: mysql
# Default-Start: 2 3 4 5
# Default-Stop: 0 1 6
# Short-Description: Waits for an IP address to appear on the mgmt net device.
# Description: Waits for an IP address to appear on the mgmt net device.
### END INIT INFO
#
. /lib/lsb/init-functions
case "\${1:-''}" in
'start')
while [ 1 -eq 1 ]; do
ip addr show | grep -q "$MGMTIP"
if [ \$? -eq 0 ]; then
log_daemon_msg "Found net device with ip addr $MGMTIP; allowing services to start" "openvpn"
break
else
sleep 1
fi
done
;;
'stop')
exit 0
;;
'restart')
exit 0
;;
*)
exit 1
;;
esac
exit 0
EOF
chmod 755 /etc/init.d/legacy-openvpn-net-waiter
#sed -i -e 's/^# Required-Start:\(.*\)$/# Required-Start:\1 mgmt-net-waiter/' /etc/init.d/mysql
#sed -i -e 's/^# Should-Start:\(.*\)$/# Should-Start:\1 mgmt-net-waiter/' /etc/init.d/mysql
update-rc.d legacy-openvpn-net-waiter defaults
update-rc.d legacy-openvpn-net-waiter enable
#update-rc.d mysql enable
fi
fi
#
@@ -161,6 +218,31 @@ EOF
sleep 1
rabbitmqctl start_app
done
if [ -z "${MGMTLAN}" -a $OSVERSION -ge $OSLIBERTY ]; then
# Make sure rabbitmq won't start until after the openvpn
# mgmt net is up.
cat <<EOF >/etc/systemd/system/openvpn-net-waiter.service
[Unit]
Description=OpenVPN Device Waiter
After=network.target network-online.target local-fs.target
Wants=network.target
Before=rabbitmq-server.service
Requires=rabbitmq-server.service
[Service]
Type=oneshot
RemainAfterExit=yes
ExecStart=/etc/init.d/legacy-openvpn-net-waiter start
StandardOutput=journal+console
StandardError=journal+console
[Install]
WantedBy=multi-user.target
EOF
systemctl enable openvpn-net-waiter.service
fi
fi
#