    New approach to dealing with nodes that fail to boot in os_setup and land in hwdown. · 5cf6aad2
    Leigh Stoller authored
    
    Currently, if a node fails to boot in os_setup and the node is running
    a system image, it is moved into hwdown. 99% of the time this is
    wasted work; the node did not fail for hardware reasons, but for some
    other reason that is transient.
    
    The new approach is to move the node into another holding experiment,
    emulab-ops/hwcheckup. The daemon watches that experiment, and nodes
    that land in it are freshly reloaded with the default image and
    rebooted. If the node reboots okay after reload, it is released back
    into the free pool. If it fails any part of the reload/reboot, it is
    officially moved into hwdown.
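
The flow described above can be sketched as a small decision function. This is only an illustration of the described behavior, not the actual os_setup or daemon code: the names `Pool`, `on_boot_failure`, `check_node`, and the two callbacks are hypothetical and do not come from Node.pm or libdb.pm.

```python
from enum import Enum


class Pool(Enum):
    """Places a node can end up (labels are illustrative)."""
    FREE = "free pool"
    HWCHECKUP = "emulab-ops/hwcheckup"
    HWDOWN = "hwdown"


def on_boot_failure():
    """os_setup side: a node that failed to boot is parked in hwcheckup
    instead of being sent straight to hwdown."""
    return Pool.HWCHECKUP


def check_node(node, reload_default_image, reboot):
    """Daemon side: reload the default image, then reboot.

    Both callbacks are assumed to take a node name and return True on
    success. Any failure sends the node to hwdown; success on both
    steps releases it to the free pool.
    """
    if not reload_default_image(node):
        return Pool.HWDOWN
    if not reboot(node):
        return Pool.HWDOWN
    return Pool.FREE
```

For example, `check_node("pc1", lambda n: True, lambda n: False)` models a node that reloads cleanly but then fails to reboot, so it ends up in hwdown.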
    
    Another possible use: if you have a suspect node and go wiggle some
    hardware, then instead of releasing it into the free pool you can
    move it into hwcheckup to see whether it reloads and reboots. If
    not, it lands in hwdown again. Then you break out the hammer.
    
    Most of the changes in Node.pm, libdb.pm, and os_setup are
    organizational changes to make the code cleaner.
Name
account
apache
assign
autoconf
backend
bugdb
capture
cdrom
collab
daikon
db
delay
dhcpd
discvr
doc
event
firewall
flash
hw_config
hyperviewer
image-test
install
ipod
lib
mfs
mote
named
node_usage
os
patches
pelab
protogeni
pxe
rc.d
robots
rpms
security
sensors
sql
ssl
sysadmin
tbsetup
testsuite
tip
tmcd
tools
utils
vis
wiki
www
xmlrpc
.loc-ignore
AGPL-COPYING
GNUmakefile.in
GNUmakerules
GPL-COPYING
LGPL-COPYING
MOVED-TO-WIKI
Makeconf.in
README
TODO
TODO.plab
WEBtemplate.in
config.h.in
configure
configure.in
defs-aerolab
defs-calfeld-emulab
defs-davidand-emulab
defs-default
defs-duerig-emulab
defs-elabinelab
defs-example
defs-example-privatecnet
defs-fish-emulab
defs-gatech
defs-gtw-emulab
defs-johnsond-emulab
defs-kevina-emulab
defs-kwebb-emulab
defs-newbold-emulab
defs-newbold-macdb
defs-ricci-emulab
defs-shash-emulab
defs-stoller-emulab
defs-stoller-home
defs-stoller-lbsdb
defs-uky
defs-wide