1. 21 Apr, 2008 1 commit
  2. 13 Apr, 2008 1 commit
  3. 07 Feb, 2008 1 commit
    • David Johnson authored · c44c47c9
      Add support for including nodes from multiple PLCs in experiments. Right
      now, this is keyed off nodetype.  Lots of hardcoded constants and config
      stuff moved to attributes in the db.  You can now set per-PLC and
      per-slice attributes, so you can (for instance) use different auth info
      whenever you want.  Experiments can use preexisting slices if somebody
      sets up the db before swapin.  Also, we no longer have to rely on
      slices.xml to sync up nodes/sites with PLC... can use xmlrpc instead.
      
      Lots of code cleanup, improved some abstractions, etc.
  4. 05 Sep, 2007 1 commit
  5. 16 Jul, 2007 1 commit
    • David Johnson authored · 3bbd843b
      Several things in this commit:
        * Prior to this commit, libplab depended on db state to create slivers
          and slices.  Now it can be done using the regular command line tools
          without the metadata in the db.  This makes development and debugging
          much easier and allows us to use the command-line tools even if state has
          been cleared out of the db (i.e., for sliver garbage collection).
        * Add support for sliver start/stop/restart via the v4 NM.
        * Some support for sliver garbage collection.
        * Various other improvements and cleanup.
  6. 10 Mar, 2007 1 commit
    • David Johnson authored · 1b6ef602
      These are the rest of the changes that have been accumulating in my dev
      tree for v4 planetlab node support.  Currently, we support both v3 and
      v4 NMs via a little wrapper, and we dist out different versions of the
      rootball depending on NM version.  Also updated various parts of libplab
      to log success and failure from interactions with planetlab nodes to the
      db, and there are beginnings of support for that in plabmonitord.in.
  7. 05 Dec, 2006 1 commit
  8. 11 Sep, 2006 1 commit
    • Kirk Webb authored · aa446875
      plab logging enhancements.
      
      timing information for various RPCs is now logged to
      /usr/testbed/log/plabtiming.log.  This info will be useful for extracting
      trends for the various plab nodes, and in calculating reliability and
      timing metrics.  These could be used, for example, to pick nodes that tend to
      come up more quickly.
      
      This update also squelches much of the python backtrace noise when plab nodes
      fail to set up correctly (backtraces can be turned back on with the debug flag).  Instead, failures
      are summarized on a single line.
      
      Oh, and pay no attention to the aspect behind the curtain!  Yes, you may
      groan and moan if you wish - I'm using aspects to help do the logging.  I
      find this to be a really slick way of wrapping several functions!
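
A minimal sketch of the idea, not the code from this commit: it swaps the aspect library for a plain Python 3 decorator that logs one timing line per call, with the outcome on the same line. The names `timed`, `wrap_functions`, and the `plabtiming` logger are hypothetical, and no handler for /usr/testbed/log/plabtiming.log is attached here.

```python
import functools
import logging
import time

# Hypothetical logger name.  A real setup would attach a FileHandler
# pointing at the timing log; none is attached in this sketch.
timing_log = logging.getLogger("plabtiming")


def timed(func):
    """Wrap func so each call logs its name, duration, and outcome."""
    @functools.wraps(func)
    def wrapper(*args, **kwargs):
        start = time.monotonic()
        status = "ok"
        try:
            return func(*args, **kwargs)
        except Exception as exc:
            # Summarize the failure on a single line, then re-raise.
            status = "failed: %s" % exc
            raise
        finally:
            elapsed = time.monotonic() - start
            timing_log.info("%s %.3fs %s", func.__name__, elapsed, status)
    return wrapper


def wrap_functions(namespace, names):
    """Apply timed() to several attributes of a class or module at once."""
    for name in names:
        setattr(namespace, name, timed(getattr(namespace, name)))
```

Calling `wrap_functions(SomeClass, ["create", "free"])` once at import time would wrap both methods without touching their bodies, which is the property the message credits to aspects.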
  9. 15 Dec, 2005 1 commit
    • Kirk Webb authored · 41c54939
      The revived Plab interface is here!
      
      Lots of updates to the plab backend, including improved plab <-> elab node
      id translation and update handling.  Includes support for the current PLC
      API, and the new pl_conf node manager interface API.  Several more db library
      routines were ported from the perl library to the python one to support the
      new code (mostly the node_id tracking stuff).  Fixes to the client side and
      also a rootball creation cleanup (binaries removed from the CVS repo).
      
      There are also enhancements to the experiment view page for experiments
      including plab nodes: site and widearea hostname are now displayed along
      with the other node information.
      
      Note that the way setup timeout for vnodes is calculated has been changed a
      bit.  Instead of using a hardwired base timeout, the base timeout is now
      based on the reload_waittime database field, which comes from the 'OS'
      (e.g., FBSD-JAIL, RHL-PLAB) the vnode runs.
      
      The default max duration for a plab slice created through the plab_ez interface
      is set to 1 year, and linktest is currently disabled and hidden through
      the ez interface.
      
      There is still work to do, but this checkin brings with it a functional
      plab portal!
  10. 23 Mar, 2004 1 commit
    • Kirk Webb authored · fd6d8cc9
      Snapshot.
      * Removed incompatible option handling and use from the general-purpose libs
      * Global PLC mutex implemented, but currently disabled
      * plabmonitord parallelization cut in half (for now)
      
      I'm still very frustrated with option handling/passing.  Needs more thought,
      but the primary issue is that there really isn't a global variable space in
      python (global to file, yes, but not global to interpreter invocation).
      
      I've learned that __builtin__ might work for this, but it seems hacky.
  11. 18 Mar, 2004 1 commit
    • Kirk Webb authored · 3ae7da68
      More updates:
      * Added comments
      * Added Emulab copyright
      * made mod_PLC handle the "not assigned" error case in freeNode()
        - optimization and less log clutter.
      * bug fix in plabmonitord (ISUP detection)
  12. 17 Mar, 2004 1 commit
    • Kirk Webb authored · 856c2509
      Snapshot.
      * Changed the way options are parsed in the python scripts so that modules
        can easily add and use their own options independent of top-level scripts.
      
      * Added --noIS and --pollNodes module options.
      
      * Added batch option to vnode_setup (degree of parallelization)
        - defaults to 10
      
      * Major updates to plabmonitord
        - batches testing, currently to 40
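
One way to let a module own its options independently of the top-level script, sketched under assumptions: it uses Python 3's `argparse.parse_known_args` rather than whatever parser the scripts used at the time, treats `--noIS` and `--pollNodes` as boolean flags, and spells the vnode_setup batch option as `--batch` (the message gives only its default of 10). The function names are hypothetical.

```python
import argparse


def parse_toplevel_options(argv):
    """Parse options owned by the top-level script; return (opts, leftover)."""
    parser = argparse.ArgumentParser(add_help=False)
    # Degree of parallelization; the default of 10 comes from the commit message.
    parser.add_argument("--batch", type=int, default=10)
    return parser.parse_known_args(argv)


def parse_module_options(argv):
    """Parse options owned by a module; return (opts, leftover)."""
    parser = argparse.ArgumentParser(add_help=False)
    # Assumed to be simple on/off flags.
    parser.add_argument("--noIS", action="store_true")
    parser.add_argument("--pollNodes", action="store_true")
    return parser.parse_known_args(argv)


def parse_all(argv):
    """Let each layer consume what it recognizes and pass the rest along."""
    top, rest = parse_toplevel_options(argv)
    mod, rest = parse_module_options(rest)
    return top, mod, rest
```

Because each parser ignores arguments it does not recognize, a module can add an option without the top-level script having to know about it.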
  13. 26 Feb, 2004 1 commit
  14. 25 Feb, 2004 2 commits
    • Kirk Webb's avatar
      e6075372
    • Kirk Webb authored · ae2eec76
      Kirk takes the weed whacker to the plab code. This is the first pass result.
      I'll come along for a closer cut in the future.
      
      * Modularized the plab communications 'adaptor' interface and moved the
        dslice- and PLC-specific code into their own modules.
      
      * Wrote an API definition README
      
      * Separated out generic routines from libplab into their own library modules
        (libtestbed.py and libdb.py)
      
      Functionally, not much has changed - this was just a massive re-org with some
      other cleanup.  Should be much easier to code up new PLAB interfaces as the
      plab folks flail around in their attempt to standardize on something.
      
      XXX: may want to re-think where the generic library modules should go.  If
      more python code enters Elab, we'll probably want to move 'em to more standard
      locations.
      
      This isn't the end of the cleanup - I would eventually like to go back and
      rethink the class structures, beef up the comments, and extend the API.
  15. 31 Dec, 2003 1 commit
    • Kirk Webb authored · 6d205dc5
      Commit to usher in the new PLC regime. Added a config variable to
      vnode_setup for the timeout on waiting for child processes.  I've
      set it to 10 minutes since all ancillary setup programs have their own
      time bounds (I think - the plab ones do anyway).
      
      The function of plabmonitord has changed slightly.  Instead of setting
      up and tearing down vnodes, its job is to just set up the emulab management
      sliver on plab nodes in hwdown.  Once the vserver comes up and reports isalive,
      it moves the node out of hwdown.  Currently, it first tries to tear down the
      vserver before reinstantiating it.  In the future, we could get fancier and
      try interacting with the service sliver directly before simply tearing it down.
      
      All new plab nodes now start life in hwdown, and must be summoned forth
      into production by plabmonitord.
      
      This commit does NOT include support for the node-local httpd.  That will
      come soon.
  16. 24 Dec, 2003 1 commit
  17. 23 Oct, 2003 1 commit
    • Kirk Webb authored · 5b52831c
      Well, here it is:  The checkin implementing robust recovery/retry and
      asynchronous safe termination in plab allocation/deallocation/setup.
      
      Here are some of the more prominent changes/additions:
      
      * Bounded plab agent communication
        Scripts should never hang waiting for plab xmlrpc commands to complete;
        they have their own internal timeouts.  Node.create() in libplab is an
        exception, but is always run under a timeout constraint in vnode_setup
        and can be changed easily if the need arises.
      
      * Wrote functions in libplab to do the retry/recovery/timeout of remote
        command execution.
      
      * Wrapped critical sections with a signal watcher.
      
      * Added code to handle various error conditions properly
      
      * Added a libtestbed function, TBForkCmd, which runs a given program in
        a child process, and can optionally catch incoming SIGTERMs and terminate
        the child (then exit itself).
      
      * Fixed up vnode_setup to batch the 'plabnode free' operation along with
        a few other cleanups.  This should alleviate Jay's concern about how
        long it used to take to tear down a plab expt.
      
      * Whacked plabmonitord into better shape; fixed a couple bugs, taught it how
        to daemonize, and implemented a priority list for testing broken plab nodes.
        This list causes new (as yet unseen) nodes to be tried first over ones that
        have been tested already.
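
The retry/timeout behavior described above can be sketched roughly as follows. This is an illustration under assumptions, not the libplab functions themselves: `retry_call` and its parameters are hypothetical, and the deadline only bounds how long retrying continues. It does not interrupt a call that is itself hung, which the commit handles with per-command timeouts.

```python
import time


def retry_call(func, attempts=3, delay=1.0, deadline=60.0,
               clock=time.monotonic, sleep=time.sleep):
    """Call func() until it succeeds, attempts run out, or the deadline passes.

    Re-raises the last exception if no attempt succeeds.
    """
    if attempts < 1:
        raise ValueError("attempts must be at least 1")
    start = clock()
    last_error = None
    for attempt in range(attempts):
        if attempt > 0:
            # Give up early if waiting again would exceed the overall bound.
            if clock() - start + delay > deadline:
                break
            sleep(delay)
        try:
            return func()
        except Exception as exc:
            last_error = exc
    raise last_error
```

Passing `clock` and `sleep` as parameters keeps the helper easy to exercise without real waiting.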
  18. 24 Sep, 2003 1 commit
    • Kirk Webb authored · 3239c722
      A couple of quick bug fixes
      
       - extract/format traceback properly for email message
       - libplab.py is already disabling line buffering for plabnode,
         so the code here to disable it has been removed.  We were led to believe
         there was a buffering problem from the plabnode scripts that were never
         actually getting killed off.
  19. 23 Sep, 2003 2 commits
  20. 17 Sep, 2003 1 commit
    • Kirk Webb authored · 56e67515
      Several updates to libplab.py and plabnode.in
      - getfree daemon doesn't die anymore when communication with the plab dslice
        agent fails.
      
      - the link classifier logic has been changed slightly to allow nodes
        to be classified as inet2 even if they don't reverse resolve.  The problem
        here is that intl nodes that don't resolve, but which go through abilene
        will look like inet2 nodes, which is wrong.  Manual verification of the
        node_auxtypes table is still recommended.
      
      - The fping verifier has been disabled for now (since some plab nodes
        block ICMP traffic).
      
      - made some error messages more descriptive
      
      - plabnode script now handles more agent communication errors gracefully
        (retries when it encounters them).
      
      - rearranged plabnode's retry loops to be a little easier to read, and
        more general.
  21. 16 Sep, 2003 1 commit
    • Kirk Webb authored · e1a2fabc
      Some PLAB dslice manager updates:
      
      - in addition to asking the dslice agent (on plab) for a list of available
        nodes, we now also fping them all to weed out unresponsive ones.  One problem
        here is that several plab nodes block ICMP; could be solved by pinging with
        nmap (tries both an ICMP and a TCP ping).  This affects the plabdaemon getfree
        command, and subsequently which plab nodes appear as "up" in the DB
      
      - Changed slice naming scheme:  we now append the experiment index onto the
        slice name to try to ensure uniqueness (emulab_<pid>_<eid>_<idx>)
      
      - Modified plabnode to try to cope with flaky nodes - there is some retry
        code in there now
      
      - Added the "fixsudo" shell script, which is run first as root (via the
        cumbersome "su" command) to fix sudoers for later sudo use on plab nodes.
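
For illustration, the naming scheme above (emulab_<pid>_<eid>_<idx>) amounts to something like the following. The function name `slice_name` is hypothetical and not taken from the code.

```python
def slice_name(pid, eid, idx):
    """Build a slice name from project id, experiment id, and experiment index."""
    # The trailing index is what keeps names unique across experiments
    # that reuse the same pid/eid pair.
    return "emulab_%s_%s_%s" % (pid, eid, idx)
```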
  22. 22 Aug, 2003 1 commit
    • Austin Clements authored · 6348a02e
      * Rewrote argument handling code to use getopt.
      * Various improvements to new node stuff, including reworking node
        status updates so that they use the right table, and don't update
        vnodes that are alive (since their watchdog will do this).
      
      * Added renewal code to automatically renew all leases that are going
        to expire within two days.
      
      * Moved Emulabification directly into the node abstraction.  Now the
        libplab wrapper scripts are all just plain wrapper scripts, instead
        of having the knowledge spread out.
      
      * Switched from using a Plab-specific keypair to using the normal
        Emulab one, which makes it possible to use sshtb to Plab nodes.
      
      * Removed node booting code, since vnode_setup takes care of this.
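
A rough sketch of the lease renewal selection mentioned above, assuming lease expiry times are available as a mapping from node id to `datetime`. The data shape and the name `leases_to_renew` are assumptions; the real code keeps this state in the DB. Leases that have already expired are also selected here, which the actual code may treat differently.

```python
from datetime import datetime, timedelta

# Renewal window taken from the commit message: two days.
RENEW_WINDOW = timedelta(days=2)


def leases_to_renew(leases, now):
    """Return sorted node ids whose lease expires within the renewal window.

    leases maps node id -> expiry datetime (hypothetical representation).
    """
    cutoff = now + RENEW_WINDOW
    return sorted(node for node, expires in leases.items() if expires <= cutoff)
```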
  23. 19 Aug, 2003 1 commit
    • Austin Clements authored · e6ce08a1
      This is the Planetlab manager. It includes a combination dslice
      service manager and resource broker that works closely with the
      control flow through the Emulab experiment swap process.  It keeps all
      slice and node data in the DB.  Node allocation automatically unpacks
      and configures the node to come up as an Emulab/Plab node when it is
      booted (later, via vnode_setup).  It also takes care of other
      necessary bits of interfacing with Planetlab, including discovering
      which nodes are available, adding new Plab nodes to the DB, and
      maintaining status information on Plab nodes.