1. 01 May, 2003 1 commit
  2. 30 Apr, 2003 1 commit
    • Some batch mode changes. In the early days we did not have such fancy · 0197f41d
      Leigh B. Stoller authored
      tb tools! I've changed the batch system to "preload" the experiment in
      foreground mode (results of parse spit back to user directly). The
      batch daemon now uses swapexp instead of startexp. Upon failure, the
      experiment goes back to the "swapped" state; previously its virt state
      was blasted, and reentered on the next try. This is nice because you can
      actually look at the batch experiment (vis, virt tables, etc) while it
      is posted and not running.
      
      Not sure if all the Ts are crossed. Will find out ...
  3. 29 Apr, 2003 1 commit
    • Robust Experiment Modify -and- · 7308f458
      Chad Barb authored
      Various Other changes to get Expt Modify ready for prime time.
      
       - If assign fails on a modify, experiment will
         be restored to old state, *not* swapped out.
      
       - Reboot option has been improved to reboot all
         nodes as part of os_setup, not in separate
         step.
      
       - Different assign error codes result in different
         retry behavior for assign_wrapper
         (Follows Rob's change to assign to make it
          pass back special code for non-retriable faults)
      
       - '64' bit in assign_wrapper exit code indicates to tbswap
         that db/phys state hadn't been mucked with before
         the exit occurred
         (ergo, '65' and '1' are the common return codes,
          though the old 4,8,16,32 are still there for assign failing.)
      
       - (tbswap still returns codes from assign wrapper)
      
       - Added 5 sec pause between assign attempts.
      
       - Cleaned up tbswap code.
      
       - Physical state backup/restore removed from tbprerun,
         put into swapexp.
      
       - Interfaces table now getting cleaned up correctly
         (Mike noticed problem)
      
       - Changed menu display in showexp to show
         the "modify" menu option for swapped out experiments
         (like it used to.)
      
       - A couple other changes.
      
      Note:
       Still admin-only, but I plan to change that soon.
      
      To do:
       - Erase expt backups in /tmp after using them.
       - Re-viz failed experiments.
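The exit-code convention in the commit above can be illustrated with a short sketch. It assumes only what the message states: the '64' bit flags that db/phys state was untouched, so '65' is 64 combined with 1. The function and constant names are hypothetical, the restore decision is an assumption about how a caller such as tbswap might use the bit, and Python is used purely for illustration; this is not the actual assign_wrapper or tbswap code.

```python
# Hypothetical names. Only the '64' bit and the 1/65 pairing come from the
# commit message; everything else is an illustrative assumption.
STATE_UNTOUCHED_BIT = 64


def decode_exit_code(code):
    """Split an assign_wrapper-style exit code into (state_untouched, base_code)."""
    state_untouched = bool(code & STATE_UNTOUCHED_BIT)
    base_code = code & ~STATE_UNTOUCHED_BIT
    return state_untouched, base_code


def needs_state_restore(code):
    """Assumed policy: a failure without the '64' bit may have changed db/phys state."""
    state_untouched, base_code = decode_exit_code(code)
    return base_code != 0 and not state_untouched
```

Under these assumptions, an exit code of 65 decodes to a plain failure (1) that left state alone, while a bare 1 signals that the old state may need to be restored.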
  4. 28 Apr, 2003 2 commits
    • Add several minor experiment_stats fields; swap_errors (a count), · 27558935
      Leigh B. Stoller authored
      swap_exitcode (last error), idle_swaps (a count), batch (a flag to
      indicate a batch experiment).
      
      Add an operational log. Okay, it's not actually a log, but a table that
      will grow forever until it consumes the earth. It's a small table
      though, so it will take a few years. It's cross-indexed with the
      experiment_stats table, so by massaging this table along with the
      stats table, we can get a good picture of what was running on the
      testbed when, and how many resources it was using. Sorry, not a log
      file, but we can easily generate a log file from the table if the Boss
      really wants one. The table entry averages 28 bytes.
      
      Move stats to their own main menu item (admin mode only). Remove from
      the showexp_list page since that was bogus.
    • Add support for new {user,group,project,experiment}_stats tables. · 5e5508bf
      Leigh B. Stoller authored
      The first three are aggregate tables, while the experiment stats table
      gets a record for each new experiment, and is updated when an
      experiment is swapped in/out, modified, or terminated. Look at the table
      to see what is tracked. Once the experiment_stats record is updated,
      the aggregate tables are updated as necessary. There are a bunch of
      ugly changes to assign_wrapper to get the stats. Note that pnodes is
      not incremented until an experiment successfully swaps in. This is in
      lieu of getting status codes; I'm not tracking failed operations yet,
      nor creating the log file that Jay wants. I'll do that in the next
      round of changes when we see how useful these numbers are.
      
      Most of the changes are to create/delete table entries where
      appropriate, and to display the records. Display is only under admin
      mode, and the display is raw; just a dump of the assoc tables in php.
      The last 100 experiment stats records are available via the Experiment
      List page, using the "Stats" show option at the top. Bad place, but
      will do for now.
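The update order described in the commit above (experiment record first, then the aggregate tables, with pnodes counted only after a successful swap-in) might look roughly like the following. This is a minimal in-memory sketch: the dictionaries stand in for the real DB tables, and every name except pnodes (including the swapins counter) is a hypothetical placeholder, not the actual schema.

```python
from collections import defaultdict


def new_counters():
    # "pnodes" is named in the commit message; "swapins" is a hypothetical column.
    return {"swapins": 0, "pnodes": 0}


# In-memory stand-ins for the aggregate tables and the per-experiment table.
user_stats = defaultdict(new_counters)
group_stats = defaultdict(new_counters)
project_stats = defaultdict(new_counters)
experiment_stats = {}


def create_experiment(exptidx, user, group, project):
    """Each new experiment gets its own experiment_stats record."""
    experiment_stats[exptidx] = {
        "user": user, "group": group, "project": project, **new_counters()
    }


def record_swapin(exptidx, pnodes, succeeded):
    """Update the experiment record, then the aggregates, on success only."""
    if not succeeded:
        return  # failed operations are not tracked yet
    rec = experiment_stats[exptidx]
    rec["swapins"] += 1
    rec["pnodes"] += pnodes
    for table, key in ((user_stats, rec["user"]),
                       (group_stats, rec["group"]),
                       (project_stats, rec["project"])):
        table[key]["swapins"] += 1
        table[key]["pnodes"] += pnodes
```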
  5. 17 Apr, 2003 1 commit
    • · 4233af4e
      Chad Barb authored
      For the benefit of our users,
      added 'reboot nodes in experiment' checkbox,
      on by default, with a stern warning.
  6. 16 Apr, 2003 1 commit
    • Add support for idleswapping an experiment as the creator of the · ff5a57de
      Leigh B. Stoller authored
      experiment, rather than as an administrator, which presents group
      permission problems when the experiment is in a subgroup (requires two
      additional groups, whereas suexec adds only one group). That aside, the
      correct approach is to run the swap as the creator. To do that, must
      flip to the user (from the admin person) in the backend using the new
      idleswap script, and then run the normal swapexp. Add new option to
      swapexp (-i) which changes the email slightly to make it clear that
      the experiment was idleswapped, and so that the From: is tbops not the
      user (again, to make it more clear).
  7. 03 Apr, 2003 1 commit
    • · 765de560
      Chad Barb authored
      Added new feature 'Experiment Modify'.
      Now available (to admins only for now) from the showexp page.
      
      Warning! doing a modify which alters the topology will probably
      require a "reboot all nodes" afterwards.
      (There will be a checkbox soon in the modify experiment page.)
      
      Adding/removing delay nodes seems to work fine without reboots, though.
      
      Warning! If the new version of the experiment cannot be mapped
       (not enough nodes available, for instance) the experiment will be
       swapped out! This will get fixed later.
      
      Prerun backs up the experiment topology, so using a bad NS
      file doesn't result in experiment termination.
      
      As part of this, added library functions to libdb to
      delete, backup, and restore both virtual and physical experiment state.
  8. 27 Mar, 2003 1 commit
  9. 11 Mar, 2003 1 commit
    • · caad3a35
      Chad Barb authored
      New version of unified tbswap in/out.
      startexp/endexp/swapexp have been changed to use new script.
      
      tbswapin and tbswapout have been replaced with a script which
      spits out a warning message, then calls tbswap appropriately.
      
      The README has also been modified.
  10. 18 Dec, 2002 1 commit
    • New "restart" or perhaps better if named "replay" mode to swapexp. · d651dd42
      Leigh B. Stoller authored
      Attempts to replay an experiment by rebooting all the nodes, clearing
      the various startup bits (ready, startstatus, bootstatus, portstats),
      and then restarting the event system. I am dubious that this is a
      workable solution because of the asynchronous nature of the testbed
      (nodes happily cruise from TBRESET to ISUP and beyond without
      stopping), and so it's hard to truly replicate the initial lack of
      state that a freshly swapped in experiment has. Still, people
      requested it and I cheerfully provided it cause that's what I do;
      service with a smile and not a whit of complaint. Is anyone reading
      this?
  11. 16 Sep, 2002 1 commit
    • Reorg of working directory and log file stuff for start/swap/end · 533dc18f
      Leigh B. Stoller authored
      experiment. Here is mail to tbops:
      
      * Moved the working directory for experiment setup/swap/end to a new
        directory located on boss instead of over NFS to /proj/$pid/$eid. This
        new location is /usr/testbed/expwork/$pid/$eid.
      
      * Changed the name of the directories we create in /usr/testbed/expinfo to
        $pid-$eid.$index where $index is a new autoincrement field in the DB
        table. I really hated the names that were created before.
      
      * Changed where logs are written from /tmp to the new location in
        /usr/testbed/expwork/$pid/$eid.
      
      Okay, why.
      
      * We no longer operate on NFS mounted directories that might hang. It's
        easier to catch the situation where a copy of the log file over at the
        end of experiment creation fails cause of an NFS problem.
      
      * We no longer have user writable files that are inputs to other parts of
        the system (like top and ptop files).  Not that a user would be bad, but
        it closes a hole.
      
      * We no longer copy user writable files from /proj to boss where we might
        fill up an important filesystem cause the user put a .ndz file in the
        working directory. Not that a user would be bad, but it closes a hole.
      
      * It's easier to save all the log files this way, for each swap in and
        out.
      
      * Removing a directory over NFS is a royal irritant when someone is CD'ed
        into that directory or looking at a file on the other side (the astute
        observer will peg this as the reason I went down this idiotic path in the
        first place!).
      
      * About 6 other reasons that I can no longer remember. Seriously, I really
        had more reasons I can no longer remember! :-)
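The directory layout described in the commit above can be summarized in a few lines. This is a sketch for illustration only: the two root paths and the $pid-$eid.$index naming come from the message, while the function names and the example values are hypothetical, and the real scripts are not written in Python.

```python
from pathlib import PurePosixPath

# Root paths as given in the commit message.
EXPWORK_ROOT = PurePosixPath("/usr/testbed/expwork")
EXPINFO_ROOT = PurePosixPath("/usr/testbed/expinfo")


def work_dir(pid, eid):
    """Working directory on boss for experiment setup/swap/end, and for logs."""
    return EXPWORK_ROOT / pid / eid


def info_dir(pid, eid, index):
    """Per-experiment info directory, named $pid-$eid.$index."""
    return EXPINFO_ROOT / f"{pid}-{eid}.{index}"
```

For a hypothetical project `testbed`, experiment `myexp`, and index 42, these yield `/usr/testbed/expwork/testbed/myexp` and `/usr/testbed/expinfo/testbed-myexp.42`.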
  12. 11 Jul, 2002 1 commit
    • A bunch of logfile changes. Logs are now saved in the experiment · 51bc0de4
      Leigh B. Stoller authored
      directory so that they can be viewed later after the operation is
      complete. I've also cleaned up the mechanism for determining when
      a log file is active (for the web spew) by using another slot in the
      experiments table, and added some libdb routines to manage that slot.
      At present just the last (or latest) log can be viewed after the fact,
      but we can change that later if we think it's really necessary. At the
      same time, make it possible for admin types to view the log files for
      other people's experiments; spew is setuid, but flips back after
      opening the file (does usual checks too). I've also incorporated the
      log changes into the batch daemon, so you can view the last batch log
      too, although I have not tested that yet!
  13. 07 Jul, 2002 1 commit
  14. 16 Jun, 2002 1 commit
    • Some fixes to the spewlogfile stuff so that you do not get the · d9c3dd68
      Leigh B. Stoller authored
      transition error when you click too fast after creating it. Instead of
      looking at experiment state, use the logfile slot of the experiments
      table, and make sure it's cleared/set properly in start/swap experiment
      scripts.
      
      Also added a spew option to the swap page so you can watch experiments
      swap in/out.
  15. 16 May, 2002 1 commit
  16. 19 Mar, 2002 1 commit
  17. 12 Feb, 2002 1 commit
  18. 28 Dec, 2001 1 commit
  19. 27 Nov, 2001 1 commit
  20. 07 Nov, 2001 1 commit
  21. 24 Oct, 2001 1 commit
    • Add swappable and priority bits to experiment creation form. Not used, · 28c1968f
      Leigh B. Stoller authored
      but simply entered into the DB record for the experiment until we know
      what to do with them. Add to batchexp script arguments, since all that
      stuff is done outside the web interface. Add a swapexp perl script to
      swap an experiment in/out from the command line. Add web links on
      the Experiment Information page to do this from the web interface. A
      bunch of locking changes. Previously expt_terminating in the
      experiment record prevented multiple calls to terminate an experiment,
      but now we have a more general locking problem with
      start,swapin,swapout, and terminate, so change expt_terminating to
      expt_locked (still a datetime field) and add locking to all of
      startexp, swapexp, and endexp. Note that batch experiments cannot be
      swapped yet because of locking issues still to be resolved. Minor
      cleanup in tbreport to make email message look better.
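The expt_locked scheme described in the commit above amounts to using a datetime column as a per-experiment lock. A minimal sketch of that idea follows, using Python and an in-memory SQLite database purely for illustration. The table layout, the pid/eid key columns, and the function names are assumptions; only the expt_locked field name and its datetime type come from the message, and the real startexp/swapexp/endexp scripts may take the lock differently.

```python
import sqlite3
from datetime import datetime, timezone


def make_db():
    """Create a hypothetical experiments table with an expt_locked slot."""
    db = sqlite3.connect(":memory:")
    db.execute(
        "CREATE TABLE experiments ("
        " pid TEXT, eid TEXT, expt_locked TEXT,"
        " PRIMARY KEY (pid, eid))"
    )
    return db


def try_lock(db, pid, eid):
    """Set expt_locked only if it is currently NULL; return True on success."""
    now = datetime.now(timezone.utc).isoformat(timespec="seconds")
    cur = db.execute(
        "UPDATE experiments SET expt_locked = ?"
        " WHERE pid = ? AND eid = ? AND expt_locked IS NULL",
        (now, pid, eid),
    )
    db.commit()
    # Exactly one row changes when the lock was free; zero when it was held.
    return cur.rowcount == 1


def unlock(db, pid, eid):
    """Clear the lock once the start/swap/terminate operation finishes."""
    db.execute(
        "UPDATE experiments SET expt_locked = NULL WHERE pid = ? AND eid = ?",
        (pid, eid),
    )
    db.commit()
```

Doing the check and the set in a single UPDATE avoids a window between reading the field and writing it; storing a timestamp rather than a flag also records when the lock was taken, which helps when debugging a stuck experiment.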
  22. 17 Oct, 2001 1 commit
    • Rework of the batch experiment code. Unified it with the immediate · 4d420b21
      Leigh B. Stoller authored
      experiment code. No longer uses another table. Rather, the experiment
      record contains a couple of extra fields for the batch system. Also
      combined some of the backend code (no longer a killbatch script).
      Also added scriptable experiments; the batchexp program in the bin
      directory can start an experiment from the command line, and in fact
      is used from the web page for both batch experiments and immediate
      experiments (-i option). All of the DB code that was in the web
      interfaces was moved to batchexp.
  23. 16 Oct, 2001 1 commit
  24. 26 Sep, 2001 1 commit
  25. 24 Sep, 2001 2 commits
  26. 28 Aug, 2001 2 commits
  27. 22 Aug, 2001 1 commit
  28. 19 Jul, 2001 1 commit
  29. 29 Jun, 2001 1 commit
  30. 20 Jun, 2001 1 commit
  31. 10 May, 2001 1 commit
    • Lots of little changes for sending email to the right places, with · 3285bc3e
      Leigh B. Stoller authored
      proper headers. Split out some of the mail into testbed-logs,
      testbed-ops, and testbed-approval. Added a library for including from
      our perl scripts. Contains a couple of mail helper functions, but will
      hopefully contain more as time goes by.
      
      Fixed a bug in the web interface that was causing breakage for people
      with multiple accounts. Mac and Jay have noticed this, when logging
      out and trying to join or create a project under a new or different
      name.
  32. 19 Mar, 2001 1 commit
  33. 09 Mar, 2001 1 commit
  34. 07 Mar, 2001 1 commit
    • Rework of experiment termination. The wrapper script that is invoked · 9f692188
      Leigh B. Stoller authored
      by the web server forks a child to do the actual work of calling tbend
      and other stuff. The parent returns right away and the script ends.
      When the experiment termination (child) ends, an email message is sent
      to the user that issued the termination request. To prevent multiple
      clicks, I added a DB field called expt_terminating that is a DATETIME
      field.  If the field is set, the script fails and the user is told to
      be more patient. I used a DATETIME field mostly for debugging purposes
      so we can track any future problems.
  35. 03 Jan, 2001 1 commit
  36. 15 Dec, 2000 1 commit
  37. 01 Dec, 2000 1 commit