    • Merge the newstated branch with the main tree. · 5c961517
      Mac Newbold authored
      Changes to watch out for:
      - db calls that change boot info in nodes table are now calls to os_select
      - whenever you want to change a node's pxe boot info, or def or next boot
      osids or paths, use os_select.
      - when you need to wait for a node to reach some point in the boot process
      (like ISUP), check the state in the database using the lib calls
      - Proxydhcp now sends a BOOTING state for each node that it talks to.
      - OSs that don't send ISUP will have one generated for them by stated
      either when they ping (if they support ping) or immediately after they get
      to BOOTING.
      - States now have timeouts. Actions aren't currently carried out, but they
      will be soon. If you notice problems here, let me know... we're still
      tuning it. (Before, all timeouts were set to "none" in the db.)
      One temporary change:
      - While I make our new free node manager daemon (freed), all nodes are
      forced into reloading when they're nfreed and the calls to reset the os
      are disabled (that will move into freed).
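      The rule above for OSs that never send ISUP can be sketched as
      follows. This is only an illustration in Python, not the stated
      code itself; the OsInfo record, its field names, and the "PING"
      label are hypothetical.

```python
from dataclasses import dataclass


@dataclass
class OsInfo:
    # Hypothetical per-OS flags; the real schema is not shown in this log.
    sends_isup: bool      # the OS reports ISUP on its own
    supports_ping: bool   # the OS answers ping once it is up


def synthesized_isup_trigger(os_info):
    """Return the event that should make the daemon generate ISUP, or None.

    None means the OS sends ISUP itself, so nothing is synthesized.
    """
    if os_info.sends_isup:
        return None
    if os_info.supports_ping:
        # Wait until the node answers a ping before declaring it up.
        return "PING"
    # No ping support: generate ISUP right after the node reaches BOOTING.
    return "BOOTING"
```

      An OS with neither capability is therefore treated as up as soon
      as it reaches BOOTING.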
    • A few changes for use with the testsuite's 'full' mode: · 509c7b38
      Robert Ricci authored
      1) Checks database redirects for nodes, and ignores events that aren't
         directed to its database.
      2) Doesn't insist on being run as root (doesn't need to be right now,
      3) '-f' option that prevents it from forking into the background, for
         easier killing.
    • First pass at operational mode support for node states. · 4db415f5
      Robert Ricci authored
      Operational mode (op_mode in the database) affects the state diagram
      and timeouts for a node. Modes planned so far are:
      NORMAL    - Normal operation
      DELAYING  - Acting as a delay node
      UNKNOWNOS - Running an OS that does not report its state (OSKit kernels, etc.)
      RELOADING - Disk reloading
      stated now responds to TBNODEOPMODE events, and sets database state
      accordingly. The set of state timeouts and valid state transitions are
      affected by a node's operational mode.
      The nodes table now stores information about operational modes, and
      the state_transitions and state_timeouts tables include the operational
      mode in addition to states.
      Next step will be to get the appropriate programs to send TBNODEOPMODE
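      One way to picture how the operational mode feeds into the state
      diagram is a lookup keyed by (op_mode, state). The sketch below is
      in Python for illustration only. The mode names come from the list
      above and BOOTING and ISUP appear elsewhere in this log, but the
      SHUTDOWN state and every transition listed here are assumptions,
      not the contents of the real state_transitions table.

```python
# Hypothetical stand-in for the state_transitions table:
# (op_mode, current_state) -> set of states the node may move to next.
VALID_TRANSITIONS = {
    ("NORMAL", "SHUTDOWN"): {"BOOTING"},
    ("NORMAL", "BOOTING"): {"ISUP"},
    ("NORMAL", "ISUP"): {"SHUTDOWN"},
    # An OS that never reports state may go straight back to SHUTDOWN.
    ("UNKNOWNOS", "SHUTDOWN"): {"BOOTING"},
    ("UNKNOWNOS", "BOOTING"): {"ISUP", "SHUTDOWN"},
}


def is_valid_transition(op_mode, old_state, new_state):
    """True if new_state is allowed after old_state in the given op_mode."""
    allowed = VALID_TRANSITIONS.get((op_mode, old_state), set())
    return new_state in allowed
```

      The same transition can thus be valid in one mode and invalid in
      another, which is why the tables carry the mode alongside the state.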
    • Changed behavior when reloading node state from database. Now, if we · 67d3205d
      Robert Ricci authored
      find a node that we already knew about, and it hasn't changed state or
      timestamp, we just use the old entry. This allows us to still notice
      new nodes, or nodes that have had their state changed externally (say,
      by hand), but not forget about nodes we've already sent mail about.
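      The reload rule described in this commit can be sketched roughly
      as below. This is a Python illustration, not the script's actual
      code; the dictionary layout and the "notified" flag are
      hypothetical, and dropping nodes that have vanished from the
      database is an assumption.

```python
def merge_node_entries(known, from_db):
    """Combine in-memory node entries with rows freshly read from the db.

    known:   node_id -> {"state", "timestamp", "notified"}
    from_db: node_id -> {"state", "timestamp"}
    """
    merged = {}
    for node_id, row in from_db.items():
        old = known.get(node_id)
        unchanged = (
            old is not None
            and old["state"] == row["state"]
            and old["timestamp"] == row["timestamp"]
        )
        if unchanged:
            # Keep the old entry so we remember mail was already sent.
            merged[node_id] = old
        else:
            # New node, or state changed externally: start a fresh entry.
            merged[node_id] = {
                "state": row["state"],
                "timestamp": row["timestamp"],
                "notified": False,
            }
    return merged
```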
    • Remove SWIG from the build process - unfortunately, it's slightly · 8ffb8cf4
      Robert Ricci authored
      broken. Also, it made me slightly uneasy that there was no way to
      prevent swig from putting one of its generated files in the source
      directory. So, I've just checked in the two major files that get
      generated by SWIG, so that the make rule that runs it never gets
      One of the reasons for doing this is that swig generates slightly
      broken code when the -exportall (which does perl module exports
      correctly) argument is given. A very minor amount of manual tweaking
      of the generated .pm file can fix this problem. So, the checked in
      copy of event.pm has these tweaks applied.
      As a result of all of this, exports work correctly in the event perl
      module, so the hacky practice of putting your program in the event
      namespace is no longer necessary.
    • New script: stated · 447bb8a5
      Robert Ricci authored
      Watches for events sent by TMCD regarding the state of nodes. Records
      this information in the database. Also watches for nodes that undergo
      invalid state transitions, or stay in the same state for too long.
      Right now, the only action it takes is to send email, but in the
      future it will take action to 'unstick' nodes.
      Not yet installed by default.
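      Spotting nodes that stay in one state for too long might look
      something like the following. It is a Python sketch for
      illustration only; the data layout, the use of seconds, and the
      function name are assumptions rather than details of the script.

```python
def find_timed_out(nodes, timeouts, now):
    """Return ids of nodes that have sat in their state past its timeout.

    nodes:    node_id -> {"state": str, "timestamp": seconds}
    timeouts: state -> limit in seconds, or None for no timeout
    now:      current time in seconds
    """
    stuck = []
    for node_id, entry in nodes.items():
        limit = timeouts.get(entry["state"])
        if limit is None:
            # No timeout configured for this state: never flag it.
            continue
        if now - entry["timestamp"] > limit:
            stuck.append(node_id)
    return sorted(stuck)
```

      Each id returned would then be a candidate for the notification
      email described above.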