Commit bc4f9b05 authored by Robert Ricci

Change the way we decide that a node is up and ready for use. Instead
of just pinging it, poll its eventstate in the database, and wait
until it's reported ISUP. This way, we don't report that a node is
ready before it really is. Also bumped up the timeout from 5 to 7
minutes to account for the extra time it takes for the OS to boot.

Right now, if the OS is considered pingable, we assume that it will
report ISUP. In the future, however, when the node state is more
formalized, we will have better ways of determining this.
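
The polling approach described above can be sketched in isolation. This is
only a minimal illustration, not the code from this commit: wait_till_isup,
NODESTATE_ISUP, the $get_state callback (a stand-in for
TBGetNodeEventState), and the "SHUTDOWN"/"BOOTING" state strings are all
hypothetical names chosen for the example, and the timeout here is measured
from the start of the call rather than from the time of the first reboot.

```perl
use strict;
use warnings;

# Hypothetical stand-in for the testbed's ISUP state constant.
use constant NODESTATE_ISUP => "ISUP";

# Poll a state-lookup callback until it reports ISUP or the timeout expires.
# $get_state is an assumed code ref: it takes a node name and a scalar ref,
# fills in the state, and returns true on success.
# Returns 0 when the node is up, 1 on lookup error or timeout.
sub wait_till_isup {
    my ($node, $get_state, $maxwait, $interval) = @_;
    my $deadline = time() + $maxwait;

    while (1) {
        my $state;
        if (!$get_state->($node, \$state)) {
            print "*** Error getting event state for $node.\n";
            return 1;
        }
        return 0 if defined($state) && $state eq NODESTATE_ISUP;

        if (time() >= $deadline) {
            print "*** $node did not report ISUP within $maxwait seconds.\n";
            return 1;
        }
        sleep($interval) if $interval > 0;
    }
}

# Usage: a fake lookup that walks through made-up states and then stays ISUP.
my @states = ("SHUTDOWN", "BOOTING", "ISUP");
my $calls  = 0;
my $fake_lookup = sub {
    my ($node, $ref) = @_;
    $$ref = $states[$calls < $#states ? $calls : $#states];
    $calls++;
    return 1;
};

my $rc = wait_till_isup("pc1", $fake_lookup, 60 * 7, 0);
print "rc=$rc after $calls polls\n";
```

Passing the lookup as a callback keeps the sketch runnable without the
testbed library; in the real script the database query plays that role.
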
parent 6be80975
@@ -39,7 +39,6 @@ use libtestbed;
 my $nodereboot = "$TB/bin/node_reboot";
 my $os_load = "$TB/bin/os_load";
 my $vnode_setup = "$TB/sbin/vnode_setup";
-my $ping = "/sbin/ping";
 my $dbg = 0;
 my $failed = 0;
 my @nodes = ();
@@ -513,7 +512,7 @@ sub WaitTillAlive ($) {
     #
     # Seems like a long time to wait, but it ain't!
     #
-    my $maxwait = (60 * 5);
+    my $maxwait = (60 * 7);
     #
     # Start a counter going, relative to the time we rebooted the first
@@ -523,20 +522,16 @@ sub WaitTillAlive ($) {
     my $minutes = 0;
     #
-    # Sigh, a long ping results in the script waiting until all the
-    # packets are sent from all the pings, before it will exit. So,
-    # loop doing a bunch of shorter pings.
+    # Wait for the node to finish booting, as recorded in database
     #
     while (1) {
-        system("$ping -q -c 4 -t 4 $pc >/dev/null 2>&1");
-        $status = $? >> 8;
+        my $state;
+        if (!TBGetNodeEventState($pc,\$state)) {
+            print "*** Error getting event state for $pc.\n";
+            return 1;
+        }
-        #
-        # Returns 0 if any packets are returned. Returns 2 if pingable
-        # but no packets are returned. Other non-zero error codes indicate
-        # other problems. Any non-zero return indicates "not pingable" to us.
-        #
-        if (! $status) {
+        if ($state eq TBDB_NODESTATE_ISUP) {
             print "$pc is alive and well\n" if $dbg;
             return 0;
         }
......