tbsetup/os_setup.in · 5ab15776ba74b5b5cace251dccee4f8d3406b536 · emulab / emulab-devel

Changes for setting up jailed nodes, which need checks similar to what · 5ab15776

Leigh B. Stoller authored Jan 08, 2003

real nodes get. Also, run a proper os_select on jailed nodes, *after*
the OS for the physical node is set up, since otherwise stated will not
be happy.

Fixes for dealing with a failed os_load. Previously, if os_load
failed, os_setup would wait for those nodes anyway, since it had no idea
which nodes had failed (and we do not want to just quit from os_setup,
since that might cause a lot of extra power cycles). Now, for each
node that got an os_load, check its eventstate; it should be in ISUP
immediately after os_load exits (since that's what os_load waited for),
and if it's not, then mark that node as failed. Note though that failed
loads no longer result in the node going into hwdown, since 99 percent
of the time it's a busted user image, not a hardware problem. I figure
we will catch real hw errors via the reload daemon, when it sends
email about nodes not finishing.
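
The eventstate check described above can be sketched roughly as follows.
This is an illustration in Python, not the actual os_setup code;
`partition_loaded_nodes`, the `get_eventstate` callback, and the node names
and non-ISUP state value are all hypothetical stand-ins for whatever
os_setup and libdb really use.

```python
ISUP = "ISUP"  # the state os_load waits for before it exits


def partition_loaded_nodes(loaded_nodes, get_eventstate):
    """Split nodes that went through os_load into (ok, failed).

    get_eventstate is a hypothetical callback returning the node's
    current eventstate as a string (or None if it is unknown).
    """
    ok, failed = [], []
    for node in loaded_nodes:
        if get_eventstate(node) == ISUP:
            ok.append(node)
        else:
            # Mark as failed, but do NOT move the node to hwdown:
            # most failures are bad user images, not bad hardware.
            failed.append(node)
    return ok, failed


# Made-up example data: pc2 never reached ISUP after its load.
states = {"pc1": "ISUP", "pc2": "SHUTDOWN", "pc3": "ISUP"}
ok, failed = partition_loaded_nodes(["pc1", "pc2", "pc3"], states.get)
```

Only the nodes in `ok` would then be waited on; the nodes in `failed` are
reported without holding up the rest of the setup.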

Do not bother with doing the vnode setup if any of the phys nodes
failed to set up. Doing so leads to cascading errors and prolongs the
agony by another few minutes. Might revisit this later.
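
A minimal sketch of that short-circuit, again in Python for illustration
only. The function `setup_all` and the `setup_phys` / `setup_vnode`
callbacks are hypothetical and are assumed to return True on success;
none of these names come from os_setup itself.

```python
def setup_all(phys_nodes, vnodes, setup_phys, setup_vnode):
    """Set up physical nodes first; skip vnodes if any phys node failed."""
    failed_phys = [n for n in phys_nodes if not setup_phys(n)]
    if failed_phys:
        # Any physical failure: do not attempt vnode setup at all,
        # to avoid cascading errors and extra waiting.
        return {"failed_phys": failed_phys,
                "failed_vnodes": [],
                "skipped_vnodes": list(vnodes)}
    failed_vnodes = [v for v in vnodes if not setup_vnode(v)]
    return {"failed_phys": [],
            "failed_vnodes": failed_vnodes,
            "skipped_vnodes": []}


# Made-up example: pc2 fails, so neither vnode is attempted.
attempted = []
result = setup_all(
    ["pc1", "pc2"],
    ["vnode1", "vnode2"],
    setup_phys=lambda n: n != "pc2",
    setup_vnode=lambda v: attempted.append(v) or True,
)
```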

Remove local WaitTillAlive() function, and switch to using the version
I put into libdb a couple of weeks ago.

Fix up a bunch of print statements to be nicer.
