tbsetup/reload_daemon.in · 74d218443ac9dfcbd0a9aefda48332e24ba46cf1 · emulab / emulab-devel

Leigh B. Stoller authored Mar 30, 2001

Reloading daemon. Looks for free nodes that have not been reloaded
since the last reservation (as determined by last_reservation table).
Picks one (randomly) from that set of nodes, and calls sched_reload on
it. Then waits until the node has finished reloading, as determined by
the reserved table, which gets cleared by the tmcd when the node first
reboots after a scheduled reload. Sleeps 30 seconds, and then goes
around again. So at most one node is tied up in a reload at a time,
which seems like a good balance between trying to keep the machines in
a pristine state, and having nodes available for use.
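
The loop described above can be sketched roughly as follows. This is
an illustration in Python, not the daemon's actual code: the callables
find_candidates, sched_reload and is_reserved are hypothetical
stand-ins for the database queries and the sched_reload call, and the
5 second poll interval is an assumption. Only the 30 second pause
between passes comes from the description.

```python
import random
import time


def reload_cycle(find_candidates, sched_reload, is_reserved,
                 sleep=time.sleep, poll_interval=5, rng=random):
    """One pass: pick a random candidate, schedule it, wait for it.

    All three callables are hypothetical stand-ins:
      find_candidates() -> iterable of free nodes not reloaded since
                           their last reservation
      sched_reload(node) -> schedule a reload of that node
      is_reserved(node)  -> True while the node still has an entry in
                            the reserved table (reload not finished)
    Returns the node that was reloaded, or None if there was none.
    """
    candidates = list(find_candidates())
    if not candidates:
        return None
    node = rng.choice(candidates)      # pick one at random
    sched_reload(node)
    while is_reserved(node):           # cleared when the node reboots
        sleep(poll_interval)           # poll interval is an assumption
    return node


def run_daemon(find_candidates, sched_reload, is_reserved,
               sleep=time.sleep, pause=30, max_cycles=None):
    """Repeat reload_cycle, pausing 30 seconds between passes.

    Because each pass blocks until its node is done, at most one node
    is tied up in a reload at any time. max_cycles exists only so the
    sketch can be exercised; a real daemon would loop forever.
    """
    cycles = 0
    while max_cycles is None or cycles < max_cycles:
        reload_cycle(find_candidates, sched_reload, is_reserved,
                     sleep=sleep)
        sleep(pause)
        cycles += 1
```

Passing sleep as a parameter keeps the sketch testable without real
delays; the serial structure is what limits the load to one reload at
a time.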

The advantage of this approach is that instead of calling sched_reload
on 40 nodes (after generating a new image) and watching the network
melt down, we can let the nodes reload at a slower pace. We could call
sched_reload on allocated nodes so that they will reload when freed, but
we would run into the problem of big experiments ending and causing a
meltdown.

The downside is that this approach is a little too aggressive. Nodes
will end up reloading after just a single experiment. Finer-grained
control over when to reload is needed, but I will leave that as an
exercise for later.
