1. 20 Oct, 2010 1 commit
    • Support for no shared filesystem (unsupport for shared filesystem?) and · c1c1bce2
      Mike Hibler authored
      (eventual) support for NFS servers without race conditions!
      
      This means no NFS between nodes and ops/fs. There are still NFS mounts of
      ops on boss however.
      
      Added new defs-* variable NOSHAREDFS, which when set non-zero will disable
      the export of NFS filesystems to nodes.  Involved lots of little changes:
      
       * /users, /proj, and /share filesystems are not exported to nodes.
      
       * Returned mount info now includes an FSTYPE key which will be set to "LOCAL"
         if NOSHAREDFS is in effect (by default it is set to "NFS-RACY"; more on
         this later).  In the case where it is set to LOCAL, the other mount lines
         no longer contain REMOTE=foo settings.  Because of this change,
         THE TMCD VERSION NUMBER HAS BEEN BUMPED TO 32.
      
       * The client rc.mounts script will now create local versions of /users/*,
         /proj/<pid>, and /share when FSTYPE=LOCAL.  It first runs mkextrafs to
         create a large partition for these, since someday we will likely want
         to pre-populate these with a non-trivial amount of data.  Right now,
         the only thing that is put in the user's homedir is the standard dotfiles
         for the OS and the Emulab authorized_keys file (so you can login).
      
       * Linktest had to be modified to fetch the various results files (via
         loghole) rather than just assuming they were in /proj.  And also changed
         to invoke tevc with the local copy of the event key so it won't try to
         read it over NFS.
      
       * create_image was modified to ssh to the node and run the imagezip
         command, capturing the output of ssh.  This is controlled via the "-s"
         option which defaults to on for a NOSHAREDFS system, but can also be
         used on a normal system.
      
       * elabinelabs can be configured with/without a shared FS via the
         CONFIG_SHAREDFS attribute (note polarity change) which defaults to 1.
      
      Another new defs-* variable, NFSRACY, will some day allow you to specify
      (by setting to 0) that your NFS server does NOT have the nefarious mountd
      race condition when changing /etc/exports.  Currently, this defaults to 1
      since all versions of FreeBSD supported as an "fs" node have this "feature."
      Rumor has it that FreeBSD 8 does not have this problem nor, presumably,
      would a Linux NFS server.
      
      The only use of this variable right now is to set the FSTYPE returned by the
      tmcd "mounts" call, which in turn is used by one client script, rc.topomap
      (via a libsetup function) to determine whether it should try copying
      the topo file multiple times.
      
      Random: add python2.6 to list of pythons checked for in configure.
      Random: resync defs-example-privatecnet with defs-example.
      Random: did a little code-pissin here and there.
      c1c1bce2
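A minimal sketch of how a client might act on the FSTYPE key described above. The KEY=VALUE line format, the function names, and the retry count of 5 are assumptions for illustration; only the FSTYPE values "LOCAL" and "NFS-RACY" come from the commit message.

```python
def parse_mounts(reply: str) -> dict:
    """Parse a hypothetical KEY=VALUE mounts reply into an fstype and mount list."""
    fstype = "NFS-RACY"  # default value named in the commit message
    mounts = []
    for line in reply.splitlines():
        # Each whitespace-separated token of the form KEY=VALUE becomes a field.
        fields = dict(tok.split("=", 1) for tok in line.split() if "=" in tok)
        if "FSTYPE" in fields:
            fstype = fields["FSTYPE"]
        elif fields:
            mounts.append(fields)
    return {"fstype": fstype, "mounts": mounts}


def topo_copy_attempts(fstype: str, racy_attempts: int = 5) -> int:
    """Retry the topo file copy only when the NFS server may be racy."""
    # racy_attempts is an assumed value, not taken from rc.topomap.
    return racy_attempts if fstype == "NFS-RACY" else 1
```

With this shape, a reply carrying FSTYPE=LOCAL yields a single copy attempt, while a reply with no FSTYPE line falls back to the racy default and retries.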
  2. 11 Oct, 2010 1 commit
    • Work on an optimization to the perl code. Maybe you have noticed, but · 92f83e48
      Leigh B Stoller authored
      starting any one of our scripts can take a second or two. That time is
      spent including and compiling tens of thousands of lines of perl
      code, both from our libraries and from the perl libraries.
      
      Mostly this is just a maintenance thing; we just never thought about
      it much and we have a lot more code these days.
      
      So I have done two things.
      
      1) I have used SelfLoader() on some of our biggest perl modules.
         SelfLoader delays compilation until code is used. This is not as
         good as AutoLoader() though, and so I did it with just a few 
         modules (the biggest ones).
      
      2) Mostly I reorganized things:
      
        a) Split libdb into an EmulabConstants module and all the rest of
           the code, which is slowly getting phased out.
      
        b) Move little things around to avoid including libdb or Experiment
           (the biggest files).
      
        c) Change "use foo" in many places to a "require foo" in the
           function that actually uses that module. This was really a big
           win because we have dozens of cases where we would include a
           module, but use it in only one place and typically not at all.
      
      Most things are now starting up in 1/3 the time. I am hoping this will
      help to reduce the load spiking we see on boss, and also help with the
      upcoming Geni tutorial (which killed boss last time).
      92f83e48
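The idea behind item 2c (load a module inside the function that needs it rather than at startup) can be illustrated with a rough Python analogue. This is not the Perl code from the commit; `colorsys` merely stands in for a large, rarely used module.

```python
def rgb_to_hsv(r, g, b):
    """Convert RGB to HSV, loading the helper module only on first use."""
    # Deferred import: the module is loaded when this function first runs,
    # so scripts that never call it do not pay the load cost at startup.
    import colorsys
    return colorsys.rgb_to_hsv(r, g, b)
```

A script that imports this file but never calls `rgb_to_hsv` skips loading the helper entirely, which is the same trade the commit makes by replacing "use foo" with a "require foo" inside the function.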
  3. 22 Jul, 2010 1 commit
  4. 22 Mar, 2010 1 commit
  5. 07 Nov, 2009 1 commit
    • Change to infodir (/usr/testbed/expinfo) handling; experiment · 41d34103
      Leigh B. Stoller authored
      directories are now placed in a project subdirectory, to avoid
      blowing out the max number of subdirs (32K in FreeBSD). Dirs are
      now called $pid/$eid/$idx.
      
      Added sanity checks to batchexp, swapexp, and endexp to watch for the
      case that testbed admin installed the new code but did not run the
      fixup script as instructed in doc/UPDATING.
      41d34103
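As a small illustration of the new layout, assuming only the base directory and the $pid/$eid/$idx naming stated above (the function name is hypothetical):

```python
from pathlib import PurePosixPath

INFODIR = PurePosixPath("/usr/testbed/expinfo")


def expinfo_dir(pid: str, eid: str, idx: int) -> PurePosixPath:
    """Build the per-experiment info directory as $pid/$eid/$idx under INFODIR."""
    # Nesting under the project keeps any single directory well below
    # the per-directory subdirectory limit mentioned in the commit.
    return INFODIR / pid / eid / str(idx)
```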
  6. 18 Aug, 2009 1 commit
  7. 05 Aug, 2009 1 commit
  8. 23 Jul, 2009 1 commit
  9. 26 Jun, 2009 1 commit
  10. 11 Jun, 2009 1 commit
  11. 03 Jun, 2009 1 commit
    • Add support for returning more detailed information in the case of a · 340f10a4
      Kevin Atkinson authored
      swap failure in the xmlrpc server.  To use it, the extra option
      "extrainfo" needs to be set to true.  This will cause "value" to be a
      structure (in the XML-RPC sense) with useful information instead of
      just the exit value of the script.  The structure will contain at
      least the following fields:
        cause: "temp", "user", etc
        cause_desc
        mesg: more specific error information
        exitval: script return value
        log: activity log
      Note that value may still be an integer in the case of some other
      failure that is not swap related.
      
      To support this, the "-X" option was added to swapexp and batchexp
      which will output an XML-RPC method response to stdout when possible.
      340f10a4
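Because "value" may be either a structure or a plain integer, a caller has to handle both shapes. A minimal client-side sketch, assuming the structure arrives as a Python dict with the field names listed above (the function name and the output wording are illustrative only):

```python
def summarize_swap_value(value) -> str:
    """Describe the 'value' of a swap response that used the extrainfo option."""
    if isinstance(value, int):
        # Failures that are not swap related still return a bare exit value.
        return f"exit value {value}"
    # Swap failure: a structure with at least cause, cause_desc, mesg,
    # exitval, and log, per the commit message.
    return "{cause}: {mesg} (exit value {exitval})".format(
        cause=value.get("cause", "unknown"),
        mesg=value.get("mesg", ""),
        exitval=value.get("exitval", "?"),
    )
```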
  12. 28 May, 2009 1 commit
    • Add "-N" option to batchexp/swapexp/endexp to suppress most email to · 649cd68c
      Kevin Atkinson authored
      testbed-ops and the user.  Important email that requires testbed-ops
      attention, such as on a recursive cleanup error, will still be sent.
      In addition mail normally sent to testbed-logs will still be sent.
      
      Also, add "noemail" option to xmlrpc server methods corresponding to
      those commands, and "-N" option to related commands in script_wrapper.
      649cd68c
  13. 09 Jul, 2008 1 commit
    • My attempt to improve swapmod ... · 3593d9c6
      Leigh B. Stoller authored
      Previously, any error in assign_wrapper would cause the experiment to
      swap out because the "DB had been modified" ... well I have isolated
      all of the changes that are made, and errors in assign_wrapper proper
      no longer do that. tbswap now restores the experiment back to the way it
      was. Note that errors after assign_wrapper (like in os_setup) are still
      a problem.
      
      In addition, rather than kill off all of the vlans, leave them in
      place and then do a comparison after assign_wrapper, removing obsolete
      and modified vlans only. I have made use of the obsolete vlans table
      for this by having snmpit track its changes in that table. There is a
      bunch of new code in Lan.pm for doing the comparisons.
      3593d9c6
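The comparison step can be sketched as a set difference over VLAN definitions. This is an illustration under assumed data shapes (a mapping from VLAN name to a set of member ports), not the code in Lan.pm:

```python
def diff_vlans(old: dict, new: dict) -> dict:
    """Compare VLANs before and after remapping; only changed ones are touched."""
    obsolete = sorted(n for n in old if n not in new)
    modified = sorted(n for n in old if n in new and old[n] != new[n])
    unchanged = sorted(n for n in old if n in new and old[n] == new[n])
    added = sorted(n for n in new if n not in old)
    return {
        "remove": obsolete + modified,  # torn down on the switches
        "create": modified + added,     # (re)built from the new mapping
        "keep": unchanged,              # left in place untouched
    }
```

Leaving unchanged VLANs in place is what avoids the full teardown the commit describes.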
  14. 03 Jun, 2008 1 commit
  15. 11 Feb, 2008 1 commit
  16. 28 Nov, 2007 1 commit
  17. 26 Nov, 2007 1 commit
  18. 05 Nov, 2007 3 commits
  19. 02 Nov, 2007 1 commit
  20. 06 Aug, 2007 1 commit
  21. 02 Aug, 2007 1 commit
  22. 19 Jun, 2007 1 commit
    • Big update to the stats gathering code ... · 495f6803
      Leigh B. Stoller authored
      This change attempts to make the stats gathering code more reliable by
      not relying on the testbed_stats records to reconstruct usage
      statistics.  The main source of errors and total confusion in the
      current stats code is that testbed_stats includes all the errors and
      transitions, from which I have to reconstruct what happened in order
      to determine usage by a project or user.
      
      The new stats code still generates the testbed_stats records, but actual
      usage is recorded as it happens, in the experiment_resources table, as
      swapins, swapouts, and swapmods occur. It's also much faster to compute
      the data for the tables in the web interface, not having to scan a
      zillion testbed_stats records in php.
      
      There is a time consuming update to the records that takes place with
      a lot of tables locked.
      495f6803
  23. 15 May, 2007 1 commit
    • Checkpoint changes that have been discussed in the last few weeks: · c4f53202
      Leigh B. Stoller authored
      * Records are now "held open" when a run is stopped. When the next run
        is started, a check is made to see if the files
        (/project/$pid/exp/$eid) have changed, and if so a new version of the
        archive is committed before the next run is started.
      
      * Change the way swapmod is handled within an instance. A new option
        on the ShowExp page called Modify Resources. The intent is to allow
        an instance to be modified without having to start and stop runs,
        which tends to clutter things up, according to our user base. So, if
        you are within a run, that run is reset (reused) after the swapmod is
        finished. You can do this as many times as you like. If you are
        between runs (last operation was a stoprun), do the swapmod and then
        "speculatively" start a new run. Subsequent modifies reuse that
        run again, as above.
      
        I think this is what Kevin was after ... there are some UI issues
        that may need to be resolved, will wait to hear what people have to
        say.
      
      * Revising a record is now supported. Export, change in place, and
        then use the Revise link on the ShowRun page. Currently this has to
        happen from the export directory on ops, but eventually we will allow an
        upload (to correspond to downloaded exports).
      
      * Check to see if export already exists, and give warning. Added a
        checkbox that allows user to overwrite the export.
      
      * A bunch of minor UI changes to the various template pages.
      c4f53202
  24. 13 Mar, 2007 1 commit
  25. 23 Jan, 2007 1 commit
  26. 10 Jan, 2007 1 commit
  27. 09 Jan, 2007 2 commits
  28. 08 Dec, 2006 2 commits
  29. 04 Dec, 2006 1 commit
  30. 06 Nov, 2006 1 commit
    • libaudit related changes: · e89ee617
      Kevin Atkinson authored
        - Added "LIBAUDIT_FANCY" option to AuditStart.  When this option is
          used libaudit will send a different email than it normally sends,
          and on error call tblog_find_error() to determine the error.
      
        - Also add audit function AddAuditInfo which adds additional
          information for libaudit to use in SendAuditMail when AUDIT_FANCY
          is set.
      
        - Modify template_swapin, template_instantiate, and template_create
          to use the new audit functionality.
      
        - Suppress calling tblog_find_error and sending the error email
          when auditing in swapexp and batchexp
      
      tblog changes:
      
        - Shorten the message sent to the user when the error is unknown.
          Remove all parts about lack of free nodes as it no longer really
          applies as tblog now correctly identifies those errors and handles
          them separately.  The message is now just "Please look at the log
          below to see what happened."
      
        - Improve algo. used to determine the other error when canceled.
          Will now work by removing all errors related to the cancel request
          and then essentially rerunning tblog_find_error.  If the cause of
          the error is still canceled, repeat and try again until the cause
          is something other than canceled or no errors are left.
      
        - Refactor tblog_find_error, which involves creating new internal
          functions: tblog_determine_single_error, tblog_store_error,
          tblog_dump_error
      
        - Add section on Primary vs Secondary Errors to the inline POD
          documentation.
      
        - Other minor enhancements and bug fixes.
      e89ee617
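The cancel-handling loop described under "tblog changes" can be sketched as follows. The error records, the "group" field used to decide which errors relate to a cancel request, and the stand-in `find_error` rule are all assumptions for illustration; the real tblog_find_error logic is not shown in the commit message.

```python
def find_error(errors):
    """Hypothetical stand-in for tblog_find_error: the last logged error wins."""
    return errors[-1] if errors else None


def find_non_cancel_cause(errors):
    """Repeat error selection until the cause is not 'canceled' or none remain."""
    remaining = list(errors)
    primary = find_error(remaining)
    while primary is not None and primary["cause"] == "canceled":
        # Drop every error related to this cancel request, then rerun.
        remaining = [e for e in remaining if e["group"] != primary["group"]]
        primary = find_error(remaining)
    return primary["cause"] if primary else None
```

Each pass removes at least the selected error, so the loop always terminates.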
  31. 20 Oct, 2006 1 commit
    • Wow, this should make me look important! · afa5e919
      Mike Hibler authored
      Two-day boondoggle to support "/scratch", an optional large, shared filesystem
      for users.  To do this, I needed to find all the instances where /proj is used
      and behave accordingly.  The boondoggle part was the decision to gather up all
      the hardwired instances of shared directory names ("/proj", "/users", etc.)
      so that they are set in a common place (via unexposed configure variables).
      This is a boondoggle because:
      
      1. I didn't change the client-side scripts.  They need a different mechanism
         (e.g., tmcd) to get the info; configure is the wrong way.
      
      2. Even if I had done #1 it is likely--no, certain--that something would
         fail if you tried to rename "/proj" to be "/mike".  These names are just
         too ingrained.
      
      3. We may not even use "/scratch" as it turns out.
      
      Note, I also didn't fix any of the .html documentation.  Anyway, it is done.
      To maintain my illusion in the future you should:
      
      1. Have perl scripts include "use libtestbed" and use the defined PROJROOT(),
         et al. functions where possible.  If not possible, make sure they run
         through configure and use @PROJROOT_DIR@, etc.
      
      2. Use the configure method for python, C, php and other languages.
      
      3. There are perl (TBValidUserDir) and php (VALIDUSERPATH) functions which
         you should call to determine if an NS, template parameter, tarball or
         other file are in "an acceptable location."  Use these functions where
         possible.  They know about the optional "scratch" filesystem.  Note that
         the perl function is over-engineered to handle cases that don't occur
         in nature.
      afa5e919
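For item 3, the kind of check such a function performs might look like the sketch below. It is not TBValidUserDir or VALIDUSERPATH; the root list, the parameter names, and the purely lexical normalization (symlinks are not resolved) are assumptions.

```python
import posixpath


def is_valid_user_path(path, roots=("/proj", "/users", "/share"), scratch_root=None):
    """Return True if path lies under one of the accepted shared roots."""
    allowed = list(roots)
    if scratch_root:
        # The scratch filesystem is optional, so it is accepted only when configured.
        allowed.append(scratch_root)
    # Collapse "." and ".." so "/proj/../etc" cannot pass as a /proj path.
    norm = posixpath.normpath(path)
    return any(norm.startswith(root.rstrip("/") + "/") for root in allowed)
```

Requiring the trailing slash after the root means "/projfoo/x" is not mistaken for a path under "/proj".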
  32. 18 Oct, 2006 1 commit
  33. 04 Oct, 2006 1 commit
  34. 26 Sep, 2006 3 commits