1. 27 Dec, 2006 1 commit
  2. 14 Dec, 2006 1 commit
  3. 06 Dec, 2006 1 commit
  4. 04 Dec, 2006 2 commits
  5. 01 Dec, 2006 1 commit
  6. 27 Nov, 2006 1 commit
  7. 27 Oct, 2006 1 commit
  8. 25 Oct, 2006 1 commit
    • Leigh Stoller's avatar
      Makefile Whacking! Try to deal with the problem caused by the delay · 7590f9c5
      Leigh Stoller authored
      between when something is installed and when post-install runs. Short
      of a global lock (which we probably need anyway someday), my solution
      is this. In your makefiles, add these variables before the line that
      has the include of $(TESTBED_SRCDIR)/GNUmakerules:
      
      	SETUID_BIN_SCRIPTS   =
      	SETUID_SBIN_SCRIPTS  =
      
      I have added three new rules to GNUmakerules that look like this:
      
      	$(addprefix $(SBINDIR)/, $(SETUID_SBIN_SCRIPTS)): $(SBINDIR)/%: %
      		echo "Installing (setuid) $<"
      		-mkdir -p $(INSTALL_SBINDIR)
      		$(SUDO) $(INSTALL) -o root -m 4755 $< $@
      
      Yep, your eyes ain't lying to you; use sudo to run the target so that
      install does the right thing (which is that the old file is not
      replaced until the new one has the proper attributes on it).
      
      Note that post-install is still needed for the initial install, but
      should no longer be needed for day to day installs since all that other
      stuff post-install does is mkdir/chmod on directories.
      7590f9c5
  9. 20 Oct, 2006 1 commit
    • Mike Hibler's avatar
      Wow, this should make me look important! · afa5e919
      Mike Hibler authored
      Two-day boondoggle to support "/scratch", an optional large, shared filesystem
      for users.  To do this, I needed to find all the instances where /proj is used
      and behave accordingly.  The boondoggle part was the decision to gather up all
      the hardwired instances of shared directory names ("/proj", "/users", etc.)
      so that they are set in a common place (via unexposed configure variables).
      This is a boondoggle because:
      
      1. I didn't change the client-side scripts.  They need a different mechanism
         (e.g., tmcd) to get the info, configure is the wrong way.
      
      2. Even if I had done #1 it is likely--no, certain--that something would
         fail if you tried to rename "/proj" to be "/mike".  These names are just
         too ingrained.
      
      3. We may not even use "/scratch" as it turns out.
      
      Note, I also didn't fix any of the .html documentation.  Anyway, it is done.
      To maintain my illusion in the future you should:
      
      1. Have perl scripts include "use libtestbed" and use the defined PROJROOT(),
         et.al. functions where possible.  If not possible, make sure they run
         through configure and use @PROJROOT_DIR@, etc.
      
      2. Use the configure method for python, C, php and other languages.
      
      3. There are perl (TBValidUserDir) and php (VALIDUSERPATH) functions which
         you should call to determine if an NS, template parameter, tarball or
         other file are in "an acceptable location."  Use these functions where
         possible.  They know about the optional "scratch" filesystem.  Note that
         the perl function is over-engineered to handles cases that don't occur
         in nature.
      afa5e919
  10. 28 Aug, 2006 1 commit
    • Kirk Webb's avatar
      · 37f4392e
      Kirk Webb authored
      Updates to the plab monitor.  Fixed a couple of bugs and created a
      separate libplabmon library module.
      37f4392e
  11. 17 Aug, 2006 1 commit
    • Kirk Webb's avatar
      · f1fa5a51
      Kirk Webb authored
      New plab vnode monitor framework, now with proactive node checking action!
      
      The old monitor has been completely replaced.  The new one uses modular pools
      to test and track plab nodes.  There are currently two pool modules:
      good and bad.  THe good pool tests nodes that have are not known to have
      issues to proactively find problems and push nodes into the "bad" pool
      when necessary.  The bad pool acts similarly to the old plabmonitor; it
      does and end to end test on nodes, and if and when they finally come up,
      moves them to the good pool.  Both pools have a testing backoff mechanism
      that works as follows:
      
        * The node is tested right away upon entering either pool
        * Node fails to setup:
          * goodpool: node is sent to bad pool (hwdown)
          * badpool:  node is scheduled to be retested according to
                      an additive backoff function, maxing out at 1 hour.
        * Node setup succeeds:
          * goodpool: node is scheduled to be retested according to
                      an additive backoff function, maxing out at 1 hour.
          * badpool:  node is moved to good pool.
      
      The backoff thing may be bogus, we'll see.  It seems like a reasonable thing
      to do though - no need to hammer a node with tests if it consistently
      succeeds or fails.  Nodes that flop back and forth will get the most
      testing punishment.  A future enhancement will be to watch for flopping
      and force nodes that exhibit this behavior to pass several consecutive
      tests before being eligible for return back into the good pool.
      
      The monitor only allows a configurable window's worth of outstanding
      tests to go on at once.  When tests finish, more nodes tests are allowed
      to start up right away.
      
      Some refactoring needs to be done.  Currently the good and bad pools share
      quite a bit of duplicated code.  I don't know if I dare venture into
      inheritance with perl, but that would be a good way to approach this.
      
      Some other pool module ideas:
      
      * dynamic setup pools
      
      When experiments w/ plab vnodes are swapped in, use the plab monitor to
      manage setting up the vnodes by dynamically creating pools on a per-experiment
      basis.  This has the advantage that the monitor can keep a global cap on
      the number of outstanding setup operations.  These pools might also try to
      bring up vnodes that failed to setup during swapin later on, along with other
      vnode monitoring tasks.
      
      * "all nodes" pools
      
      Similar to the dynamic pools just mentioned, but with the mission to extend
      experiments to all plab nodes possible (as nodes come and go).  Useful for
      services.
      f1fa5a51
  12. 06 Jun, 2006 1 commit
  13. 02 Jun, 2006 1 commit
  14. 01 Jun, 2006 1 commit
    • Leigh Stoller's avatar
      Add suport for building per project, group, experiment DBs on ops. At · adbcfd47
      Leigh Stoller authored
      present the per-experiment stuff is not hooked in, but will be for
      templates later. Anyway, each user gets a mysql account on ops, with
      password set to the same as their mailman password (which is also
      their jabber password, etc). Each project gets a DB named by the
      project, and each group gets a DB named by pid,gid. Users are placed
      on the access lists for the DBs as you would expect.
      
      There is a little bit of complexity to make sure that we can create
      DBs on ops outside the Emulab path and grant access to them, without
      Emulab getting confused or mucking things up.
      
      I'll get a news item done ...
      adbcfd47
  15. 25 May, 2006 1 commit
  16. 13 Apr, 2006 1 commit
  17. 30 Mar, 2006 1 commit
  18. 21 Mar, 2006 2 commits
    • Kirk Webb's avatar
      · 7a3a7bf9
      Kirk Webb authored
      Fix up botched configure[.in].
      7a3a7bf9
    • Kevin Atkinson's avatar
      · be95cc12
      Kevin Atkinson authored
      Added libtblog.pm to configure
      be95cc12
  19. 16 Mar, 2006 1 commit
  20. 13 Mar, 2006 1 commit
    • Leigh Stoller's avatar
      A set of changes to run "prepare" on a node just prior to an image · d8f8f9b4
      Leigh Stoller authored
      being taken.
      
      The basic strategy is to have node_reboot (when -p option supplied)
      invoke a special command on the node that will cause the shutdown
      procedure to run prepare as it goes single user, but before the
      network is turned off and the machine rebooted. The output of the
      prepare run is capture and send back via the tmcd BOOTLOG command and
      stored in the DB, so that create_image can dump that to the logfile
      (so that the person taking the image can know for certain that the
      prepare ran and finished okay).
      
      On linux this is pretty easy to arrange since reboot is actually
      shutdown and shutdown runs the K scripts in /etc/rc.d/rc6.d, and at
      the end the node is basically single user mode. I just added a new
      script to run prepare and send back the output.
      
      On FreeBSD this is a lot harder since there are no decent hooks.
      Instead, I had to hack up init (see tmcd/freebsd/init/{4,5,6}) with
      some simple code that looks for a command to run instead of going to a
      single user shell. The command (script) runs prepare, sends the output
      back to tmcd, and then does a real reboot.
      
      Okay, so how to get -p passed to node_reboot? I hacked up the
      libadminmfs code slightly to do that, with new 'prepare' argument
      option. This may not be the best approach; might have to do this as a
      real state transition if problems develop. I will wait and see.
      
      Also, I changed www/loadimage.php3 to spew the output of the
      create_image to the browser.
      d8f8f9b4
  21. 08 Mar, 2006 1 commit
  22. 06 Mar, 2006 1 commit
  23. 17 Feb, 2006 1 commit
  24. 15 Feb, 2006 1 commit
    • David Johnson's avatar
      * Makeconf.in, configure, configure.in, defs-default, defs-johnsond-emulab: · 4982b9cd
      David Johnson authored
          - added a new defs var, TBROBOCOPSEMAIL
      
        * tbsetup/power_mail.pm.in:
          - add some new info to robot powerup mails
      
        * db/libdb.pm.in:
          - add a new function to determine if an experiment contains nodes of a
            given class/type
      
        * tbsetup/swapexp.in:
          - check if exp is a robot exp; that is, if it has robots or motes; if
            so, cc error msgs to TBROBOCOPSEMAIL in addition to TBOPS
      4982b9cd
  25. 26 Jan, 2006 1 commit
  26. 23 Jan, 2006 2 commits
    • Leigh Stoller's avatar
    • Timothy Stack's avatar
      · add602df
      Timothy Stack authored
      Parse the NS file with the real NS parser so we can make sure linktest is
      doing the "right" thing.
      
      	* configure, configure.in: Add tbsetup/nsverify files.
      
      	* tbsetup/GNUmakefile.in: Add nsverify subdir.
      
      	* tbsetup/tbprerun.in: Run verify-ns on the experiments NS file.
      
      	* tbsetup/ns2ir/nstb_compat.tcl: Bring up-to-date with the current
      	world.
      
      	* tbsetup/nsverify/GNUmakefile.in: Makefile.
      
      	* tbsetup/nsverify/ns-2.27.patch: Patch file for NS version 2.27.
      
      	* tbsetup/nsverify/nstbparse.in: Wrapper for the NS parser.
      
      	* tbsetup/nsverify/tb_compat.tcl: Different version of
      	tb_compat.tcl that is used to verify linktest parameters.
      
      	* tbsetup/nsverify/verify-ns.in: Script that runs on boss and
      	verifies that the testbed parser worked correctly.
      
      	* tbsetup/ns2ir/parse-ns.in, tbsetup/ns2ir/parse.proxy.in: Tweaked
      	a bit so parse.proxy can be used to run the regular NS parser in
      	addition to the testbed one.
      add602df
  27. 09 Jan, 2006 1 commit
  28. 05 Jan, 2006 1 commit
  29. 02 Jan, 2006 1 commit
    • Timothy Stack's avatar
      · bd20dd17
      Timothy Stack authored
      First cut at a daemon that does regular checkups of the testbed
      hardware/software.
      
      	* configure, configure.in: Add tbsetup/checkup directory.
      
      	* db/audit.in: Add a listing of stuck checkups.
      
      	* install/boss-install.in: Add 'elabckup' user.
      
      	* rc.d/3.testbed.sh.in: Startup the checkup_daemon.
      
      	* sql/database-create.sql, sql/database-migrate.txt: Add the
      	checkups tables.
      
      	* tbsetup/GNUmakefile.in: Descend into the checkup directory.
      
      	* tbsetup/checkup: The checkup daemon, man page, and
      	  associated scripts.
      
      	* tbsetup/ptopgen.in: Add a feature with a value of 0.9 to
      	  prereserved nodes to keep them from being allocated unless
      	  they're really wanted.
      
      	* utils/firstuser.in: Add some other options so the script can be
      	  used to create other pseudo users.
      bd20dd17
  30. 19 Dec, 2005 1 commit
    • Kevin Atkinson's avatar
      · 45f997fd
      Kevin Atkinson authored
      Updates to to Error Logging API Code.
      
      You should start seeing much better error messages coming from my
      system.  Errors coming from parse.proxy and assign (the two most
      frequent sources of errors) should now be concise and to the point.
      Errors coming from libosload/libreboot (the next most frequent source
      of errors) should now also be much better, but not perfect.  Getting
      perfect errors will likely a rework of how errors are handled in
      libosload/libreboot, just adding tberror/tbwarn/tbnotice calls is not
      enough.  I can do this at a latter date if necessary.
      
      A few minor database changes.
      
      Some changes to the API.  A few bug fixes. Lots of tberror/tbwarn/tbnotice
      added to scripts.
      
      Since assign is a C program, and at this time my API is perl only, I wrote a
      second wrapper around assign, assign_wrapper2.  When assign fails errors are
      now parsed in assign_wrapper2, sent to stderr and logged.  This means that
      RunAssign() just returns when assign fails rather than echoing some of
      assign.log output and then quiting.  The output to the activity log remains
      unchanged.
      
      Since "parse.proxy" is run from ops I couldn't use my API in it, even though
      it is a perl program.  Instead I parse the errors coming form it in
      parse-ns.
      45f997fd
  31. 16 Dec, 2005 1 commit
  32. 15 Dec, 2005 1 commit
    • Kirk Webb's avatar
      The revived Plab interface is here! · 41c54939
      Kirk Webb authored
      Lots of updates to the plab backend, including improved plab <-> elab node
      id translation and update handling.  Includes support for the current PLC
      API, and the new pl_conf node manager interface API.  Several more db library
      routines were ported from the perl library to the python one to support the
      new code (mostly the node_id tracking stuff).  Fixes to the client side and
      also a rootball creation cleanup (binaries removed from the CVS repo).
      
      There are also enhancements to the experiment view page for experiments
      including plab nodes: site and widearea hostname are now displayed along
      with the other node information.
      
      Note that the way setup timeout for vnodes is calculated has been changed a
      bit.  Instead of using a hardwired base timeout, the base timeout is now
      based on the reload_waittime database field, which comes from the 'OS'
      (e.g., FBSD-JAIL, RHL-PLAB) the vnode runs.
      
      The default max duration for a plab slice created through the plab_ez interface
      is set to 1 year, and linktest is currently disabled and hidden through
      the ez interface.
      
      There is still work to do, but this checkin brings with it a functional
      plab portal!
      41c54939
  33. 13 Dec, 2005 1 commit
    • Kirk Webb's avatar
      · 981030d5
      Kirk Webb authored
      Update SSH_ARGS to remove "Protocol 1" specification.  Everyone needs to
      reconfig their object trees!
      981030d5
  34. 12 Dec, 2005 2 commits
  35. 01 Dec, 2005 1 commit
  36. 28 Nov, 2005 1 commit
    • Timothy Stack's avatar
      Make the netlab client applet available to locals. · 19dbd0fd
      Timothy Stack authored
      	* configure, configure.in: Add xmlrpcpipe.php3.
      
      	* xmlrpc/emulabserver.py.in: Add missing virtual_tables.  Add
      	getareas call to get the list of robot areas.  Add node.getlist
      	and node.typeinfo methods for getting information about the nodes.
      	Add a "nic" argument to node.available to get the count of
      	wireless nodes.  Add "exclude" argument to
      	experiment.virtual_topology so we don't have to download the
      	massive route table, also delete the 'pid'/'eid' fields for the
      	same reason.  Don't return string output of virtual_topology, it's
      	huge.  Return some more info in experiment.getlist().
      
      	* www/GNUmakefile.in: Add xmlrpcpipe.php3.
      
      	* www/beginexp_form.php3, www/modifyexp.php3: Add links to the
      	client gui.
      
      	* www/netlab-client.jar: The client binary.
      19dbd0fd