1. 23 Mar, 2004 2 commits
    • Couple of items: · ed848fd5
      Kirk Webb authored
      * Small fix to DBQueryFatal in libdb.py: () is a valid return value (e.g.,
        from insert/replace), so don't fail on it; do fail if DBQuery returns None, though.
      
      * Fix up libplab.py to not choke on new plab_slices column.
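The distinction in the first item can be sketched as follows. This is a minimal illustration, not the libdb.py code: `db_query`, `db_query_fatal`, and `DBQueryError` are hypothetical stand-ins, and the convention that a failed query returns None while a row-less statement returns () is taken from the message above.

```python
class DBQueryError(Exception):
    """Raised when a query fails outright (hypothetical name)."""


def db_query(run):
    # Hypothetical stand-in for DBQuery: returns a tuple of rows,
    # () when the statement yields no rows, or None on failure.
    try:
        return tuple(run())
    except Exception:
        return None


def db_query_fatal(run):
    result = db_query(run)
    # Compare against None explicitly. A plain "if not result" would also
    # reject (), which is a legitimate result for INSERT or REPLACE.
    if result is None:
        raise DBQueryError("query failed")
    return result
```

The point is only the `is None` test: an empty tuple is falsy in Python, so a truthiness check cannot tell "no rows" from "query failed".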
    • Snapshot. · fd6d8cc9
      Kirk Webb authored
      * Incompatible option handling and use removed from general-purpose libs
      * Global PLC mutex implemented, but currently disabled
      * plabmonitord parallelization cut in half (for now)
      
      I'm still very frustrated with option handling/passing.  Needs more thought,
      but the primary issue is that there really isn't a global variable space in
      python (global to file, yes, but not global to interpreter invocation).
      
      I've learned that __builtin__ might work for this, but it seems hacky.
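One common alternative to `__builtin__` (named `builtins` in Python 3) is a small module that exists only to hold shared state: Python caches modules in `sys.modules`, so every importer within one interpreter invocation sees the same object. The sketch below assumes a hypothetical module name, `tbopts`, and fakes the file with `types.ModuleType` so the example fits in one file. It is not what the plab scripts do.

```python
import sys
import types

# Stand-in for a file such as "tbopts.py" (hypothetical name).
# A real project would simply put "OPTIONS = {}" in that file.
_mod = types.ModuleType("tbopts")
_mod.OPTIONS = {}
sys.modules["tbopts"] = _mod


def top_level_script():
    import tbopts
    tbopts.OPTIONS["debug"] = True


def library_function():
    import tbopts  # same module object, fetched from sys.modules
    return tbopts.OPTIONS.get("debug", False)


top_level_script()
print(library_function())
```

The library code sees the value set by the top-level script without either side passing it explicitly, which is the "global to the interpreter" behaviour the message is after.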
  2. 18 Mar, 2004 1 commit
  3. 17 Mar, 2004 2 commits
    • More updates: · 3ae7da68
      Kirk Webb authored
      * Added comments
      * Added Emulab copyright
      * made mod_PLC handle the "not assigned" error case in freeNode()
        - optimization and less log clutter.
      * bug fix in plabmonitord (ISUP detection)
    • Snapshot. · 856c2509
      Kirk Webb authored
      * Changed the way options are parsed in the python scripts so that modules
        can easily add and use their own options independent of top-level scripts.
      
      * Added --noIS and --pollNodes module options.
      
      * Added batch option to vnode_setup (degree of parallelization)
        - defaults to 10
      
      * Major updates to plabmonitord
        - batches testing, currently to 40
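A rough sketch of letting a module own its options independently of the top-level script, using `argparse.parse_known_args` rather than whatever mechanism the scripts actually use. The function names are hypothetical, `--noIS` and `--pollNodes` are treated as boolean flags purely for illustration, and `--batch` is an assumed spelling for the batch option.

```python
import argparse


def parse_script_options(argv):
    """Parse the top-level script's options; return (options, leftover argv)."""
    parser = argparse.ArgumentParser(add_help=False)
    parser.add_argument("--batch", type=int, default=10)
    return parser.parse_known_args(argv)


def parse_module_options(argv):
    """Parse only this module's options; return (options, leftover argv)."""
    parser = argparse.ArgumentParser(add_help=False)
    parser.add_argument("--noIS", action="store_true")
    parser.add_argument("--pollNodes", action="store_true")
    return parser.parse_known_args(argv)


argv = ["--batch", "20", "--noIS", "pc42"]
script_opts, rest = parse_script_options(argv)
module_opts, rest = parse_module_options(rest)
print(script_opts.batch, module_opts.noIS, module_opts.pollNodes, rest)
```

Each parser consumes what it recognizes and hands the remainder on, so a module can add an option without the top-level script knowing about it.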
  4. 03 Mar, 2004 1 commit
    • More plab updates/changes · d4a887e5
      Kirk Webb authored
      * implemented PLC slice renewal
      * restructured daemon code/startup
        - removed getfree daemon (replaced by plabdiscover; run from cron)
        - moved generic daemonizing code into libtestbed (class)
        - created plabrenewd - small script that utilizes daemonizing class
        - removed plabdaemon file.
        - updated bossnode startup scripts
      * changed slice prefix - PLC denies permission w/ anything other than "utah"
      * Minor semantic changes to module API to be more consistent with other parts.
      * Some bug fixes.
  5. 02 Mar, 2004 1 commit
    • Some updates to plab support: · b7c376c3
      Kirk Webb authored
      * removed unused and not generally useful ping checking
      * reorganized node discovery and added node info updating
        - e.g., update IP, SITE, or HOSTNAME when they have changed
        - no longer part of the backend module as this is independent of
          which backend is used; may modularize it due to plab's new "trumpet"
          service, which is basically its node DB available via a decentralized
          transport/API.
      * introduced new method of getting node info - use plab sites.xml file
      * various other cleanups.
  6. 26 Feb, 2004 1 commit
  7. 25 Feb, 2004 2 commits
    • Kirk Webb's avatar
      e6075372
    • Kirk takes the weed whacker to the plab code. This is the first pass result. · ae2eec76
      Kirk Webb authored
      I'll come along for a closer cut in the future.
      
      * Modularized the plab communications 'adaptor' interface and moved the
        dslice- and PLC-specific code into their own modules.
      
      * Wrote an API definition README
      
      * Separated out generic routines from libplab into their own library modules
        (libtestbed.py and libdb.py)
      
      Functionally, not much has changed - this was just a massive re-org with some
      other cleanup.  Should be much easier to code up new PLAB interfaces as the
      plab folks flail around in their attempt to standardize on something.
      
      XXX: may want to re-think where the generic library modules should go.  If
      more python code enters Elab, we'll probably want to move 'em to more standard
      locations.
      
      This isn't the end of the cleanup - I would eventually like to go back and
      rethink the class structures, beef up the comments, and extend the API.
  8. 10 Jan, 2004 1 commit
  9. 06 Jan, 2004 1 commit
  10. 03 Jan, 2004 2 commits
  11. 31 Dec, 2003 1 commit
  12. 30 Dec, 2003 3 commits
    • Commit to usher in the new PLC regime. · 6d205dc5
      Kirk Webb authored
      Added a config variable to vnode_setup for the timeout on waiting for
      child processes.  I've
      set it to 10 minutes since all ancillary setup programs have their own
      time bounds (I think - the plab ones do anyway).
      
      The function of plabmonitord has changed slightly.  Instead of setting
      up and tearing down vnodes, its job is to just set up the emulab management
      sliver on plab nodes in hwdown.  Once the vserver comes up and reports isalive,
      it moves the node out of hwdown.  Currently, it first tries to tear down the
      vserver before reinstantiating it.  In the future, we could get fancier and
      try interacting with the service sliver directly before simply tearing it down.
      
      All new plab nodes now start life in hwdown, and must be summoned forth
      into production by plabmonitord.
      
      This commit does NOT include support for the node-local httpd.  That will
      come soon.
    • Mike Hibler's avatar
    • Mods to getfree daemon to grab list of available nodes from plab central. · 5471f18e
      Kirk Webb authored
      Also, back out Mike's hack, and use the ALLOWED_LIST feature Austin
      originally had to limit node scope.
  13. 29 Dec, 2003 1 commit
  14. 23 Dec, 2003 2 commits
  15. 15 Dec, 2003 1 commit
  16. 12 Dec, 2003 1 commit
  17. 09 Dec, 2003 1 commit
    • e664ad58
      Kirk Webb authored
      A couple of things:
      
      1) Added PLAB_SLICEPREFIX so that we can separately instantiate plab slices
      from mini, or elsewhere.  On the mainbed, it's set to "emulab".  On mini, it's
      set to "emulab_mini".  The "emulab" part has to exist first so that the new
      plab node manager doesn't nuke our dslice slivers.
      
      2) Fixed up Plab.getFree() so that it doesn't try to add the same IP twice
      to the DB if a new one is found and is listed more than once.
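The second fix amounts to de-duplicating the discovered list before inserting. A minimal sketch of that idea, assuming IPs arrive as strings and the already-known set comes from the DB; the function name is hypothetical and this is not the Plab.getFree() code itself.

```python
def new_unique_ips(discovered, known):
    """Return IPs from `discovered` that are not in `known`,
    each at most once, preserving first-seen order."""
    seen = set(known)
    fresh = []
    for ip in discovered:
        if ip not in seen:
            seen.add(ip)  # remember it so a repeat listing is skipped
            fresh.append(ip)
    return fresh
```

For example, `new_unique_ips(["10.0.0.2", "10.0.0.2"], set())` yields a single entry, so only one insert would be attempted.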
  18. 08 Dec, 2003 1 commit
  19. 02 Dec, 2003 2 commits
  20. 01 Dec, 2003 2 commits
  21. 17 Nov, 2003 1 commit
  22. 05 Nov, 2003 1 commit
  23. 04 Nov, 2003 1 commit
  24. 01 Nov, 2003 1 commit
    • Couple important, but small fixes: · 92eb1d5e
      Kirk Webb authored
      1) properly disable alarm before exiting ForkCmd
         - this was causing SIGALRM to get sent when it shouldn't have, and
           probably caused the renewal failures.
         - was introduced accidentally yesterday when I unwittingly committed
           some beta libplab code along with the rootball version string fix.
      
      2) Changed semantics of the renew daemon s.t. it only sends a single message
         for each invocation of the renewal loop - summarizes the ones that failed.
      
      The rest of the code I committed accidentally yesterday seems to be working
      just fine.  It all looks sane on perusal.
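The first fix describes a general hazard with `signal.alarm`: if the alarm is left armed when the timed section exits, SIGALRM arrives later in unrelated code. A minimal sketch of the usual guard, assuming a Unix host and the main thread; `run_with_alarm` and `CommandTimeout` are hypothetical names, not the ForkCmd implementation.

```python
import signal


class CommandTimeout(Exception):
    """Raised when the timed section exceeds its limit (hypothetical name)."""


def _on_alarm(signum, frame):
    raise CommandTimeout()


def run_with_alarm(func, timeout):
    """Run func() under a SIGALRM timeout of `timeout` whole seconds."""
    old_handler = signal.signal(signal.SIGALRM, _on_alarm)
    signal.alarm(timeout)
    try:
        return func()
    finally:
        # Cancel any pending alarm on every exit path; otherwise a late
        # SIGALRM can hit code that runs after this function returns.
        signal.alarm(0)
        signal.signal(signal.SIGALRM, old_handler)
```

Putting the cancel in `finally` covers normal return, timeout, and any other exception from `func`.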
  25. 31 Oct, 2003 2 commits
  26. 24 Oct, 2003 1 commit
    • Commit the stuff necessary to copy out new plab rootballs, versions of · d12f9b61
      Robert Ricci authored
      which had been hanging around in my home directory for a while.
      
      There are a few new things in plab/etc/netbed_files that set up a
      directory of the same name in @prefix@. This will get rsync'ed with
      netbed_files/ on each planetlab node.
      log/  - just needs to exist for the httpd server
      sbin/ - contains thttpd, and scripts to manipulate it
      www/  - the directory served by thttpd. Contains symlinks to the 'real'
              location of the rootballs (etc/plab)
      
      I've committed a binary of thttpd - this is simply because it'd be a
      PITA to compile a Linux binary for every devel tree, etc.
      
      PLAB_ROOTBALL has now become a configure option. The idea is that we
      will keep the latest version number in configure.in, but you can
      override it in your defs file. This way, we don't have to update every
      defs file when there's a new version, but people can still play around
      with their own version if they want.
      
      The two scripts that interact with the plab nodes skip ones that are
      down. They ssh in as 'utah1', meaning that one of us who has access to
      that account needs to run them, so that they can have access to our
      keys. We can put boss's public key (or something) out there to remove
      this requirement.
      
      plabdist runs an rsync between @prefix@/etc/plab/netbed_files and a
      file of the same name on the planetlab nodes. It's intended to be run
      from the main install tree - the local rsync directory is not normally
      set up in devel trees. It runs in parallel, but is limited to 4 to
      avoid beating up boss too much. Takes about 1:40 with the current set
      of plab nodes (took > 10 minutes doing one at a time).
      
      plabhttpd (re)starts the mini web server on all plab nodes
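The "in parallel, but limited to 4" behaviour of plabdist can be approximated with a bounded worker pool. This is a sketch under assumptions: `run_limited` is a hypothetical helper, the per-node task is a placeholder for the real rsync invocation, and a thread pool is swapped in for whatever mechanism the script actually uses.

```python
from concurrent.futures import ThreadPoolExecutor


def run_limited(task, nodes, max_parallel=4):
    """Apply task(node) to every node, at most max_parallel at a time.
    Returns a dict mapping each node to its result."""
    nodes = list(nodes)
    with ThreadPoolExecutor(max_workers=max_parallel) as pool:
        # pool.map preserves input order, so results line up with nodes.
        results = pool.map(task, nodes)
        return dict(zip(nodes, results))
```

A call such as `run_limited(sync_node, plab_nodes)` would then keep at most four transfers in flight, where `sync_node` and `plab_nodes` are supplied by the caller.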
  27. 23 Oct, 2003 1 commit
    • 5b52831c
      Kirk Webb authored
      Well, here it is:  The checkin implementing robust recovery/retry and
      asynchronous safe termination in plab allocation/deallocation/setup.
      
      Here are some of the more prominent changes/additions:
      
      * Bounded plab agent communication
        Scripts should never hang waiting for plab xmlrpc commands to complete;
        they have their own internal timeouts.  Node.create() in libplab is an
        exception, but is always run under a timeout constraint in vnode_setup
        and can be changed easily if the need arises.
      
      * Wrote functions in libplab to do the retry/recovery/timeout of remote
        command execution.
      
      * Wrapped critical sections with a signal watcher.
      
      * Added code to handle various error conditions properly
      
      * Added a libtestbed function, TBForkCmd, which runs a given program in
        a child process, and can optionally catch incoming SIGTERMs and terminate
        the child (then exit itself).
      
      * Fixed up vnode_setup to batch the 'plabnode free' operation along with
        a few other cleanups.  This should alleviate Jay's concern about how
        long it used to take to tear down a plab expt.
      
      * Whacked plabmonitord into better shape; fixed a couple bugs, taught it how
        to daemonize, and implemented a priority list for testing broken plab nodes.
        This list causes new (as yet unseen) nodes to be tried first over ones that
        have been tested already.
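The retry/recovery idea from the list above can be sketched as a small wrapper. This is an illustration only, not the libplab functions: `call_with_retries` is a hypothetical name, and it assumes the wrapped call enforces its own per-attempt timeout (for example, a socket timeout on the xmlrpc connection).

```python
import time


def call_with_retries(func, attempts=3, delay=0.0, retry_on=(Exception,)):
    """Call func() up to `attempts` times, sleeping `delay` seconds
    between tries; re-raise the last error if every attempt fails."""
    last_error = None
    for attempt in range(1, attempts + 1):
        try:
            return func()
        except retry_on as err:
            last_error = err
            if attempt < attempts:
                time.sleep(delay)
    raise last_error
```

Because both the number of attempts and the per-attempt time are bounded, the caller's total wait is bounded too, which is the property the first bullet asks for.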
  28. 20 Oct, 2003 1 commit
  29. 15 Oct, 2003 1 commit
  30. 14 Oct, 2003 1 commit
    • 4deac149
      Kirk Webb authored
      Update to libplab.plab.renew:
      
        * Make renewal robust against various kinds of failures.  These changes
          will augment my larger set of libplab and plab* updates/fixes coming
          soon to an Emulab near you.