• Mac Newbold authored · 71b82cc4
    First batch of changes for adding TBCOMMAND events. Currently, here's what
    is supported:
    - stated listens for TBCOMMAND events, and currently handles REBOOT,
      POWEROFF, POWERON, and POWERCYCLE events. It does everything except make
      the actual calls to node_reboot and power. And it accepts batches of
      nodes instead of just single ones.
    - Timeouts were added to the db for these commands, with no timeout for
      the power ones (since the node can't hang during those), and a 15-second
      timeout from reboot until the SHUTDOWN state.
    - If a reboot times out, it tries it again, up to 3 times. If it gets to
      three times without working, it sends mail to tbops and turns the
      machine off instead of continuing to reboot it. Right now I haven't
      made it do node_reboot -f or power cycle on retries, but it easily could.
    - Stuff to be done before they work: make node_reboot send an event
      instead of doing the work, and make a new script that has node_reboot's
      old guts. Note that this requires authentication in our events for these
      commands, and a way to make sure that the command that came in as an
      event was properly authenticated.
    - For future growth and expansion, it is set up so it should be relatively
      easy to add other commands that do different things, even if they take
      arbitrary params that aren't nodes or lists of nodes.
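
    For illustration only, here is a rough sketch of the timeout and retry
    behavior described above. It is not the stated code: it is written in
    Python for readability, and every name in it (CommandHandler,
    handle_command, reboot_timed_out, the TIMEOUTS table) is made up for the
    example. It assumes "up to 3 times" means three reboot attempts in total,
    and it only records what would be done rather than calling node_reboot or
    power, much like this first batch of changes.

```python
from dataclasses import dataclass, field

# Assumption: three reboot attempts in total before giving up.
MAX_REBOOT_TRIES = 3

# Timeouts in seconds, standing in for the values stored in the db.
# None means no timeout (the power commands cannot hang).
TIMEOUTS = {
    "REBOOT": 15,
    "POWEROFF": None,
    "POWERON": None,
    "POWERCYCLE": None,
}


@dataclass
class NodeRecord:
    """Hypothetical per-node bookkeeping."""
    tries: int = 0
    actions: list = field(default_factory=list)


class CommandHandler:
    """Hypothetical stand-in for the TBCOMMAND handling in stated."""

    def __init__(self):
        self.nodes = {}
        self.mail = []  # (recipient, message) pairs that would be sent

    def handle_command(self, command, node_ids):
        """Record a command for a batch of nodes and return its timeout."""
        if command not in TIMEOUTS:
            raise ValueError(f"unknown TBCOMMAND: {command}")
        for node_id in node_ids:
            rec = self.nodes.setdefault(node_id, NodeRecord())
            if command == "REBOOT":
                rec.tries = 1  # first attempt
            rec.actions.append(command)
        return TIMEOUTS[command]

    def reboot_timed_out(self, node_id):
        """Called when a node did not reach SHUTDOWN in time."""
        rec = self.nodes[node_id]
        if rec.tries < MAX_REBOOT_TRIES:
            rec.tries += 1
            rec.actions.append("REBOOT")
            return "REBOOT"
        # Out of attempts: notify tbops and turn the machine off.
        self.mail.append(
            ("tbops", f"{node_id} failed to reboot after {rec.tries} tries")
        )
        rec.actions.append("POWEROFF")
        return "POWEROFF"
```

    A new command would be added by giving it an entry in the timeout table
    and a branch in handle_command; commands whose params are not node lists
    would need a different argument shape than the one sketched here.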