1. 18 Apr, 2008 1 commit
  2. 13 Apr, 2008 1 commit
  3. 28 Mar, 2008 1 commit
  4. 06 Feb, 2008 1 commit
    • c44c47c9 · David Johnson authored
      Add support for including nodes from multiple PLCs in experiments. Right
      now, this is keyed off nodetype.  Lots of hardcoded constants and config
      stuff moved to attributes in the db.  You can now set per-PLC and
      per-slice attributes, so you can (for instance) use different auth info
      whenever you want.  Experiments can use preexisting slices if somebody
      sets up the db before swapin.  Also, we no longer have to rely on
      slices.xml to sync up nodes/sites with PLC... can use xmlrpc instead.
      
      Lots of code cleanup, improved some abstractions, etc.
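
      A rough sketch of what the xmlrpc-based sync could look like, using Perl's
      Frontier::Client against the public PLCAPI GetNodes call; the endpoint URL,
      credentials, and returned fields here are placeholder assumptions, not the
      actual Emulab code.

        #!/usr/bin/perl -w
        use strict;
        use Frontier::Client;

        # Placeholder PLC endpoint and credentials; per-PLC auth info would
        # come from the per-PLC attributes in the db.
        my $plcurl = "https://www.planet-lab.org/PLCAPI/";
        my $auth   = { "AuthMethod" => "password",
                       "Username"   => "user\@example.org",
                       "AuthString" => "secret" };

        my $plc   = Frontier::Client->new("url" => $plcurl);
        # GetNodes(auth, filter, return_fields), per the public PLCAPI.
        my $nodes = $plc->call("GetNodes", $auth, {}, [ "hostname", "site_id" ]);

        foreach my $node (@$nodes) {
            print $node->{"hostname"} . " (site " . $node->{"site_id"} . ")\n";
        }
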
  5. 21 Sep, 2007 1 commit
  6. 11 Apr, 2007 1 commit
  7. 06 Mar, 2007 1 commit
  8. 05 Dec, 2006 1 commit
  9. 12 Sep, 2006 1 commit
    • 52dcfd48 · Kirk Webb authored
      Added secondary logging for node setup/teardown success/failure.  Also log
      node pool membership changes in this log.
  10. 28 Aug, 2006 1 commit
    • 37f4392e · Kirk Webb authored
      Updates to the plab monitor.  Fixed a couple of bugs and created a
      separate libplabmon library module.
  11. 21 Aug, 2006 1 commit
    • af0d6629 · Kirk Webb authored
      Some bugfixes and updates to the monitor.
      
      * Added load average monitoring and initial test startup randomization
      
      The load the monitor was exerting, especially at startup, was pretty high.
      This change appears to have brought that under control.
      
      * Fixed window size bug(s)
      
      There were a few bugs related to tracking the outstanding child process
      window that are corrected by this checkin.
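
      A minimal sketch of the startup-randomization idea: splay each node's first
      test over a random interval instead of firing them all at once. The splay
      constant and the scheduling step are assumptions, not the monitor's code.

        #!/usr/bin/perl -w
        use strict;

        # Spread initial tests over up to 10 minutes (splay value is a guess).
        my $STARTUP_SPLAY = 600;

        foreach my $node (@ARGV) {
            my $delay = int(rand($STARTUP_SPLAY));
            print "Scheduling first test of $node in $delay seconds\n";
            # ... record time() + $delay as the node's next test time ...
        }
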
  12. 17 Aug, 2006 1 commit
    • f1fa5a51 · Kirk Webb authored
      New plab vnode monitor framework, now with proactive node checking action!
      
      The old monitor has been completely replaced.  The new one uses modular pools
      to test and track plab nodes.  There are currently two pool modules:
      good and bad.  The good pool tests nodes that are not known to have
      issues, to proactively find problems and push nodes into the "bad" pool
      when necessary.  The bad pool acts similarly to the old plabmonitor; it
      does an end-to-end test on nodes, and if and when they finally come up,
      moves them to the good pool.  Both pools have a testing backoff mechanism
      that works as follows:
      
        * The node is tested right away upon entering either pool
        * Node fails to set up:
          * goodpool: node is sent to bad pool (hwdown)
          * badpool:  node is scheduled to be retested according to
                      an additive backoff function, maxing out at 1 hour.
        * Node setup succeeds:
          * goodpool: node is scheduled to be retested according to
                      an additive backoff function, maxing out at 1 hour.
          * badpool:  node is moved to good pool.
      
      The backoff thing may be bogus; we'll see.  It seems like a reasonable thing
      to do, though: no need to hammer a node with tests if it consistently
      succeeds or fails.  Nodes that flop back and forth will get the most
      testing punishment.  A future enhancement will be to watch for flopping
      and force nodes that exhibit this behavior to pass several consecutive
      tests before being eligible for return back into the good pool.
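
      A small sketch of an additive backoff schedule like the one described above;
      the one-hour cap comes from the description, but the names and the
      five-minute step are assumptions.

        #!/usr/bin/perl -w
        use strict;

        my $BACKOFF_INCR = 300;   # assumed step: add 5 minutes per repeat result
        my $BACKOFF_MAX  = 3600;  # cap at one hour, per the description above

        # Return the next test time for a node, given how many times in a
        # row it has produced the same result (success in the good pool,
        # failure in the bad pool).
        sub next_test_time {
            my ($consecutive) = @_;
            my $delay = $consecutive * $BACKOFF_INCR;
            $delay = $BACKOFF_MAX if ($delay > $BACKOFF_MAX);
            return time() + $delay;
        }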
      
      The monitor only allows a configurable window's worth of outstanding
      tests to go on at once.  When tests finish, more node tests are allowed
      to start up right away.
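
      One way the bounded test window could work, sketched with plain
      fork/waitpid; the window size, node list, and run_test() body are
      placeholders, not the real monitor code.

        #!/usr/bin/perl -w
        use strict;

        my $WINDOW   = 10;                          # assumed window size
        my @pending  = map { "plab$_" } (1 .. 50);  # placeholder node list
        my %children = ();

        sub run_test { my ($node) = @_; sleep(int(rand(5))); exit(0); }

        while (@pending || keys(%children)) {
            # Fill the window with outstanding tests.
            while (@pending && keys(%children) < $WINDOW) {
                my $node = shift(@pending);
                my $pid  = fork();
                die("fork: $!") if (!defined($pid));
                if ($pid == 0) { run_test($node); }
                $children{$pid} = $node;
            }
            # As soon as any test finishes, its slot frees for the next node.
            my $pid = waitpid(-1, 0);
            delete($children{$pid}) if ($pid > 0);
        }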
      
      Some refactoring needs to be done.  Currently the good and bad pools share
      quite a bit of duplicated code.  I don't know if I dare venture into
      inheritance with perl, but that would be a good way to approach this.
      
      Some other pool module ideas:
      
      * dynamic setup pools
      
      When experiments w/ plab vnodes are swapped in, use the plab monitor to
      manage setting up the vnodes by dynamically creating pools on a per-experiment
      basis.  This has the advantage that the monitor can keep a global cap on
      the number of outstanding setup operations.  These pools might also later
      try to bring up vnodes that failed to set up during swapin, along with
      other vnode monitoring tasks.
      
      * "all nodes" pools
      
      Similar to the dynamic pools just mentioned, but with the mission to extend
      experiments to all plab nodes possible (as nodes come and go).  Useful for
      services.