    Mike Hibler authored · 5f413b47

    First crack at surviving down planetlab nodes.  If the master barrier sync
    node sits in the stub or monitor barrier sync for more than the SYNCTIMO
    timeout value in common-env.sh, it will send a HUP to syncd, which will
    knock all the other nodes out of their barrier sync.  If that happens,
    all nodes will print a warning message and continue.
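
The timeout-then-HUP behavior described above can be sketched roughly as follows. This is an illustration in Python, not the code from this commit: `wait_or_kick`, `wait_cmd`, and `kick` are hypothetical names, and the real logic lives in the experiment's shell scripts, with SYNCTIMO set in common-env.sh. On the master node, `kick` would be something like sending SIGHUP to the syncd process.

```python
import subprocess
import sys


def wait_or_kick(wait_cmd, timeout_s, kick):
    """Wait on a barrier client; on timeout, knock everyone out and continue.

    wait_cmd  -- argv list for a stand-in barrier client (hypothetical)
    timeout_s -- plays the role of SYNCTIMO, in seconds
    kick      -- callable run on timeout, e.g. one that HUPs syncd (assumption)
    Returns True if the barrier completed, False if it timed out.
    """
    proc = subprocess.Popen(wait_cmd)
    try:
        proc.wait(timeout=timeout_s)
        return True  # everyone reached the barrier in time
    except subprocess.TimeoutExpired:
        kick()  # release all waiters, including our own client
        try:
            proc.wait(timeout=timeout_s)
        except subprocess.TimeoutExpired:
            # Our own client was not released; clean it up ourselves.
            proc.kill()
            proc.wait()
        print("WARNING: barrier sync timed out, continuing", file=sys.stderr)
        return False
```

A master node might call this once for the stub barrier and once for the monitor barrier, passing the same timeout to both.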
    
    All nodes wait for both a stub sync and a monitor sync, so if one plab node
    is down, they will time out on both barrier syncs.  Race conditions?  Sure.
    If for example everyone times out on the stub barrier due to a slow node,
    and then that node reaches the barrier, it will hang there while everyone
    else waits on the monitor barrier.  When the latter times out, it will
    kick the slow node out of the stub sync and it will then proceed to hang
    in the monitor sync until the experiment is stopped.  Got that?
    
    As an aside, it would be nice if the initializer of a barrier could specify
    a timeout value, and return a special error code to everyone if it timed out,
    but that would require an incompatible change to the sync protocol.
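
For comparison, the semantics wished for in the aside above (a timeout fixed when the barrier is created, with every waiter getting a distinct error if it expires) are what Python's in-process `threading.Barrier` provides. The sketch below only illustrates those semantics. It is not the sync protocol or syncd, and `run_nodes` is a hypothetical helper.

```python
import threading


def run_nodes(parties, arrivals, timeout_s):
    """Create a barrier for `parties` waiters, let only `arrivals` show up,
    and return what each arriving node observed."""
    # The timeout is chosen once, by whoever initializes the barrier.
    barrier = threading.Barrier(parties, timeout=timeout_s)
    results = {}

    def node(name):
        try:
            barrier.wait()
            results[name] = "synced"
        except threading.BrokenBarrierError:
            # Every waiter sees the same error once the barrier times out.
            results[name] = "timed out"

    threads = [threading.Thread(target=node, args=(n,)) for n in arrivals]
    for t in threads:
        t.start()
    for t in threads:
        t.join()
    return results
```

With three expected parties and only two arrivals, both arriving nodes report "timed out" instead of hanging. With two expected and two arriving, both report "synced".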
Name
account
apache
assign
autoconf
bugdb
capture
cdrom
collab
daikon
db
delay/linux
dhcpd
discvr
doc
event
firewall
hw_config
hyperviewer
image-test
install
ipod
lib
mote
named
os
patches
pelab
pxe
rc.d
robots
rpms
security
sensors
sql
ssl
sysadmin
tbsetup
testsuite
tip
tmcd
tools
utils
vis
wiki
www
xmlrpc
.loc-ignore
BUGS
GNUmakefile.in
GNUmakerules
GPL-COPYING
LGPL-COPYING
LICENSE
Makeconf.in
README
TODO
TODO.plab
config.h.in
configure
configure.in
defs-aerolab
defs-calfeld-emulab
defs-davidand-emulab
defs-default
defs-duerig-emulab
defs-elabinelab
defs-example
defs-fish-emulab
defs-gatech
defs-johnsond-emulab
defs-kwebb-emulab
defs-newbold-emulab
defs-newbold-macdb
defs-ricci-emulab
defs-shash-emulab
defs-stoller-emulab
defs-stoller-home
defs-stoller-lbsdb
defs-uky
defs-wide