    Add check in "wedged" code to verify that the node is not already reloading. · 45ec3557
    Mike Hibler authored
    Due to a race with collecting events, it looks like some events will still
    slip through the cracks, and we might wind up having missed a transition after
    five minutes. If we see that the node is already in RELOADING (the state
    transition we are looking for) when we would declare it wedged, then fake the
    transition and continue.
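The check described above can be sketched roughly as follows. This is a Python illustration only, not the Perl code in libosload.pm.in; `check_wedged`, `get_state`, the `saw_reloading` flag, and the node dictionary layout are all hypothetical names introduced here. The five-minute timeout comes from the message above.

```python
WEDGE_TIMEOUT = 5 * 60  # seconds; the "five minutes" mentioned in the commit message


def check_wedged(node, waited, get_state, timeout=WEDGE_TIMEOUT):
    """Decide what to do with a node we are waiting on.

    node      -- dict with "name" and "saw_reloading" keys (hypothetical layout)
    waited    -- seconds spent waiting for the RELOADING transition
    get_state -- callable returning the node's current state as a string
                 (hypothetical stand-in for a direct state lookup)

    Returns "reloading", "waiting", or "wedged".
    """
    if node["saw_reloading"]:
        return "reloading"      # the transition event arrived normally
    if waited < timeout:
        return "waiting"        # still within the grace period
    # We would declare the node wedged here. Before doing so, look at the
    # state directly in case the transition event was lost in the race.
    if get_state(node["name"]) == "RELOADING":
        node["saw_reloading"] = True   # fake the missed transition
        return "reloading"
    return "wedged"
```

The direct state lookup happens only at the point where the node would otherwise be declared wedged, so the normal event-driven path is unchanged.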
    I suspect this would not happen if I just looped on event_poll until there
    were no more events, but I am afraid of letting that loop go unbounded.
    So until I gather more data, let's go with this hack check.
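One way to address the unbounded-loop concern would be to cap the number of events drained per pass. The sketch below is a Python illustration under stated assumptions, not what this commit does: `drain_events` and `max_events` are hypothetical, and `poll` is a stand-in callable that returns True when it handled an event and False when none was pending. It does not reproduce the real `event_poll` signature.

```python
def drain_events(poll, max_events=100):
    """Call poll() until it reports no pending event or the cap is reached.

    poll       -- callable returning True if an event was handled,
                  False if the queue was empty (hypothetical stand-in)
    max_events -- upper bound on events handled in one pass (assumed value)

    Returns the number of events handled.
    """
    handled = 0
    while handled < max_events:
        if not poll():
            break           # queue is empty; nothing left to drain
        handled += 1
    return handled
```

With a cap in place, a steady stream of events cannot starve the rest of the wait loop; any leftover events are simply picked up on the next pass.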
libosload.pm.in 42 KB