1. 26 Jul, 2004 1 commit
    • Leigh B. Stoller's avatar
      Okay, let's clear up some confusion when swapmod fails and 1) the · fdac8b89
      Leigh B. Stoller authored
      experiment is swapped or 2) the experiment is completely terminated.
      In these cases, let's put explicit swapout/destroy events into
      testbed_stats so that the record is not confused by experiments that
      appear to start when they are still running. This really throws off
      the summary stats web page!
  2. 15 Jul, 2004 1 commit
  3. 29 Jun, 2004 1 commit
  4. 17 May, 2004 1 commit
  5. 13 May, 2004 2 commits
  6. 29 Apr, 2004 1 commit
    • Leigh B. Stoller's avatar
      Add prelim support for using linktest. Because of problems, this is · 6cdccbd2
      Leigh B. Stoller authored
      currently available to only people with stud=1 status in the DB.
      
      * www/tbauth.php3: Add a STUDLY() function to check that bit.
      
      * www/linktest.php3: New page to run linktest on the fly. The level
        defaults to the current level in the experiments table, but you can
        override that via the form on the page.
      
      * www/showexp.php3: Add link to aforementioned page. STUDLY() only.
      
      * www/beginexp_form.php3: Add an option (selection) to set the linktest
        level for create/swapin. Defaults to 0 (no linktest). STUDLY() only.
      
      * www/editexp.php3: Add an option to edit the default linktest level
        for an experiment. STUDLY() only.
      
      * tbsetup/batchexp.in and tbsetup/swapexp.in: Add code to optionally run
        the linktest, sending email if it fails (exits with non-zero status).
        Failure does not affect the swapin.
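The non-fatal linktest step in the last item can be sketched roughly as follows. This is a minimal Python illustration, not the batchexp/swapexp code itself: `run_optional_check`, the `notify` callback (standing in for "send email"), and the stand-in command are all hypothetical names.

```python
import subprocess
import sys

def run_optional_check(cmd, notify):
    """Run an optional post-swapin check; report failure but never raise.

    Returns True if the command exited with status 0, False otherwise.
    `notify` is a hypothetical callback standing in for sending email.
    """
    result = subprocess.run(cmd, capture_output=True, text=True)
    if result.returncode != 0:
        notify(f"check failed with status {result.returncode}")
        return False
    return True

messages = []
# A stand-in command that exits non-zero, as a failing linktest would.
failing = [sys.executable, "-c", "import sys; sys.exit(2)"]
ok = run_optional_check(failing, messages.append)
# The swapin itself would continue regardless of `ok`.
```

The caller only records the result; nothing in the swapin path depends on it.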
  7. 07 Apr, 2004 1 commit
  8. 15 Mar, 2004 1 commit
  9. 12 Feb, 2004 1 commit
    • Leigh B. Stoller's avatar
      * Removed startexp, and merged its contents into batchexp. There has been · aef08532
      Leigh B. Stoller authored
        no reason for the separation for a long time, and it made maintenance more
        difficult because of duplication between batchexp and startexp (batch was
        the sole user of startexp). Cleaner solution.
      
      * Check argument processing for batchexp, swapexp, endexp to make sure the
        taint checks are correct. All three of these scripts will now be
        available from ops. I especially watched the filename processing, which was
        pretty loose before and could allow someone to grab a file on boss by trying
        to use it as an NS file (the scripts all run as the user, of course). The web
        interface generates filenames that are hard to guess, so rather than
        wrapping these scripts when invoked from ops, just allow the usual paths
        (/proj, /groups, /users) but also /tmp/$uid-XXXXXX.nsfile pattern, which
        should be hard enough to guess that users will not be able to get
        anything they are not supposed to.
      
      * Add -w (waitmode) options to all three scripts. In waitmode, the backend
        detaches, but the parent remains waiting for the child to finish so it
        can exit with the appropriate status (for scripting). The user can
        interrupt (^C), but it has no effect on the backend; it just kills the
        parent side that is waiting (backend is in a new session ID). Log output
        still goes to the file (available from web page) and is emailed.
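The waitmode behavior described in the last item can be approximated in a few lines. This is a sketch under assumptions, not the scripts' actual code: `run_in_waitmode` is a hypothetical name, the backend is modeled as an ordinary child process rather than a detached daemon, and the status 130 returned on interrupt is only a common shell convention.

```python
import subprocess
import sys

def run_in_waitmode(cmd):
    """Start cmd in its own session, then wait and return its exit status."""
    # start_new_session=True makes the child call setsid() (POSIX), so a ^C
    # typed at the terminal reaches the waiting parent only, not the backend.
    child = subprocess.Popen(cmd, start_new_session=True)
    try:
        return child.wait()
    except KeyboardInterrupt:
        # The parent stops waiting; the backend keeps running on its own.
        return 130

# Stand-in backend that exits with status 3.
status = run_in_waitmode([sys.executable, "-c", "import sys; sys.exit(3)"])
```

Because the parent returns the child's status, a calling script can branch on success or failure of the operation.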
  10. 05 Feb, 2004 1 commit
  11. 08 Jan, 2004 1 commit
  12. 18 Nov, 2003 4 commits
    • Leigh B. Stoller's avatar
      Minor additions for Shashi: · def28c32
      Leigh B. Stoller authored
      * Make the NS file an optional argument to swapexp modify; when not
        given the prerun phase is skipped. Instead, go directly to tbswap
        (run assign, etc).
      
      * Add NSESWAP event so that Shashi can fire off the above modify using
        tevc from an experimental node.
      
      	tevc -e pid/eid now ns nseswap
      
      * Change event scheduler to react to above event, and fire off:
      
      	nseswap pid eid
      
        as the user. The script should do its thing, and *exec* swapexp with
        the proper args as quickly as possible (so that the event scheduler
        is not hung up for too long). The script is invoked as the user,
        since the event scheduler is running as the user.
    • Leigh B. Stoller's avatar
      Remove some special handling for the nsfiles table; make it part · 942cdc07
      Leigh B. Stoller authored
      of virt_tables so that it is saved and restored like the rest of
      the virtual state.
    • Leigh B. Stoller's avatar
      Ah, just get rid of the expt_locked check. Not worth the trouble and · c78e9c71
      Leigh B. Stoller authored
      it's going to get replaced at some point by a busy state. The swap
      scripts properly set the next state before unlocking the experiments
      table, which possibly leaves some small races as experiments
      transition through states (which happens with the table unlocked,
      cause I used to have this really handy variable called expt_locked,
      which no one really likes anymore).
      
      We either have to use more table locking, fix up expt_locked, or punt
      and say it won't happen more than once in a few thousand operations!
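The "set the next state before unlocking the table" step can be pictured as a check-and-set under a lock. The sketch below is only an illustration: a `threading.Lock` and a dictionary stand in for the locked experiments table, and `begin_transition` is a hypothetical name. The state names "active" and "swapping" come from earlier commit messages in this history.

```python
import threading

table_lock = threading.Lock()    # stands in for locking the experiments table
state = {"pid/eid": "active"}    # stands in for the state column

def begin_transition(key, required, transitional):
    """Check the current state and set the next one while the table is locked."""
    with table_lock:
        if state[key] != required:
            return False
        state[key] = transitional
        return True

first = begin_transition("pid/eid", "active", "swapping")
# A second caller arriving now sees "swapping" and is turned away.
second = begin_transition("pid/eid", "active", "swapping")
```

Anything that changes the state later, with the table unlocked, is outside this protection, which is where the small races mentioned above would come from.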
    • Leigh B. Stoller's avatar
      Change die() to ExitWithStatus(1) so that the user sees the message · 4aae6ce5
      Leigh B. Stoller authored
      instead of testbed-ops. Either way, Mike gets to see it.
  13. 17 Nov, 2003 1 commit
    • Leigh B. Stoller's avatar
      Merge the two state machines (batchstate and state) into a single · 2025e0bd
      Leigh B. Stoller authored
      state machine (state). All of the stuff that was previously handled by
      using batchstate is now embedded into the one state machine. Of
      course, these mostly overlapped, so it's not that much of a change,
      except that we also redid the machine, adding more states (for
      example, modify phases are now explicit). To get a picture of the
      actual state machine, on boss:
      
      		stategraph -o newstates EXPTSTATE
      		gv newstates.ps
      
      Things to note:
      
      * The "batchstate" slot of the experiments table is now used solely to
        provide a lock for batch daemon. A secondary change will be to
        change the slot name to something more appropriate, but it can
        happen anytime after this new stuff is installed.
      
      * I have left expt_locked for now, but another later change will be to remove
        expt_locked, and change it to active_busy or some such new state name in
        the state machine. I have removed most uses of expt_locked, except those
        that were necessary until there is a new state to replace it.
      
      * These new changes are an implementation of the new state machine,
        but I have not done anything fancy. Most of the code is the same as
        it was before.
      
      * I suspect that there are races with the batch daemon now, but they
        are going to be rare, and the end result is probably that a
        cancelation is delayed a little bit.
  14. 29 Oct, 2003 1 commit
  15. 16 Oct, 2003 1 commit
    • Leigh B. Stoller's avatar
      Fix bug with respect to modified experiments that abort and get · 589e97d2
      Leigh B. Stoller authored
      swapped out (non-recoverable) by tbswap. swapexp was leaving the
      experiment in the running state instead of paused. We need to check
      this after tbswap since we do not get reasonable error codes back.
      Also some cleanup with respect to how aborted modifies are handled.
      I think I understand what Chad did ...
      
      A general comment; we need to be better about returning meaningful
      error codes!
  16. 30 Sep, 2003 1 commit
    • Leigh B. Stoller's avatar
      Up to now we have had two state variables associated with an experiment, · 4269dad1
      Leigh B. Stoller authored
      plus a lock field. The lock field was a simple "experiment locked, go away"
      slot that is easy to use when you do not care about the actual state that
      an experiment is in, just that it is in "transition" and should not be
      messed with.
      
      The other two state variables are "state" and "batchstate". The former
      (state) is the original variable that Chris added, and was used by the tb*
      scripts to make sure that the experiment was in the state each particular
      script wanted them to be in. But over time (and with the addition of so
      much wrapper goo around them), "state" has leaked out all over the place to
      determine what operations on an experiment are allowed, and if/when it
      should be displayed in various web pages. There is a set of transition
      states (like "swapping") in addition to the usual "active", "swapped", etc.
      that makes testing state a pain in the butt.
      
      I added the other state variable ("batchstate") when I did the batch
      system, obviously! It was intended as a wrapper state to control access to
      the batch queue, and to prevent batch experiments from being messed with
      except when it was really okay (for example, it's okay to terminate a
      swapped out batch experiment, but not a swapped in batch experiment since
      that would confuse the batch daemon). There are fewer of these states, plus
      one additional state for "modifying" experiments.
      
      So what I have done is change the system to use "batchstate" for all
      experiments to control entry into the swap system, from the web interface,
      from the command line, and from the batch daemon. The other state variable
      still exists, and will be brutally pushed back under the surface until it's
      just a vague memory, used only by the original tb* scripts. This will
      happen over time, and the "batchstate" variable will be renamed once I am
      convinced that this was the right thing to do and that my changes actually
      work as intended.
      
      Only people who have bothered to read this far will know that I also added
      the ability to cancel experiment swapin in progress. For that I am using
      the "canceled" flag (ah, this one was named properly from the start!), and
      I test that at various times in assign_wrapper and tbswap. A minor downside
      right now is that a canceled swapin looks too much like a failed swapin,
      and so tbops gets email about it. I'll fix that at some point (sometime
      after the boss complains).
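Testing a "canceled" flag at various points during swapin is a form of cooperative cancellation. A minimal sketch, assuming a flag that is polled between phases; `SwapCanceled`, `run_phases`, and the phase names are hypothetical and are not taken from assign_wrapper or tbswap.

```python
class SwapCanceled(Exception):
    """Raised when the canceled flag is seen between phases."""

def run_phases(phases, is_canceled):
    """Run each phase in order, checking the cancel flag before each one."""
    completed = []
    for name, action in phases:
        if is_canceled():
            raise SwapCanceled(f"canceled before {name}")
        action()
        completed.append(name)
    return completed

flag = {"canceled": False}
log = []
phases = [
    ("assign", lambda: log.append("assign")),
    # This phase simulates a user canceling while it runs.
    ("setup", lambda: flag.update(canceled=True)),
    ("boot", lambda: log.append("boot")),
]
try:
    run_phases(phases, lambda: flag["canceled"])
    outcome = "done"
except SwapCanceled as exc:
    outcome = str(exc)
```

Using a dedicated exception type is one way a caller could tell a cancel apart from a genuine failure, which is the distinction the message says is currently blurred.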
      
      I also cleaned up various bits of code, replacing direct calls to exec
      with calls to the recently improved SUEXEC interface. This removes
      some cruft from each script that calls an external script.
      
      Cleaned up modifyexp.php3 quite a bit, reformatting and indenting.
      Also fixed it to not run the parser directly! This was very wrong; it should
      call nscheck instead. Changed to use "nobody" group instead of group
      flux (made the same change in nscheck).
      
      There is a script in the sql directory called newstates.pl. It needs
      to be run to initialize the batchstate slot of the experiments table
      for all existing experiments.
  17. 07 Aug, 2003 1 commit
  18. 06 Aug, 2003 1 commit
    • Leigh B. Stoller's avatar
      Clean up temporary files used in modify. The temp dirs were being · 05bd80ff
      Leigh B. Stoller authored
      created in /tmp and left behind. I've moved them to the expwork
      directory instead, and added a routine in the library to clear them
      out.
      
      Clear out the nsfile (stored in /tmp) used in modify. The web page was
      creating a temp file, but never removing it. swapexp now copies the
      nsfile in so that the web page can remove the temporary after the
      script exits. The temp is placed in the expwork directory as well, but
      left behind for debugging.
      
      When swapmod fails, send along the nsfile in the email message.
  19. 30 Jul, 2003 1 commit
    • Leigh B. Stoller's avatar
      Change the prerender code to run in the background so that Mike does · 11d792e3
      Leigh B. Stoller authored
      not have to wait 3 minutes for it to finish before he can watch his
      experiment swapin fail for some other reason.
      
      I adopted the same pid mechanism as in eventsys_control.in, which uses
      a slot in the experiments table.
      
      Running "prerender" puts the render into the background and stores
      the pid. Running "prerender -r" kills a running prerender and removes
      the existing info from the DB.
      
      Fixed the problem with swapmod not restoring the old vis; swapmod now
      kills any running prerender, and restarts one if the swapmod fails
      (the prerun of the new NS file starts up another prerender in the
      background).
      
      Add setpriority() call in prerender to nice it and children to 15.
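The background-render pattern (start it, remember the pid, kill it on request, run it niced) might look like the following on a POSIX system. This is a sketch under assumptions rather than the prerender script: a dictionary stands in for the pid slot in the experiments table, and `start_render`, `stop_render`, and the sleeping stand-in command are hypothetical.

```python
import os
import signal
import subprocess
import sys

pid_slot = {}  # stands in for the pid slot in the experiments table

def lower_priority():
    # Runs in the child before exec; its own children inherit the nice value.
    # max() avoids asking for a higher priority than the process already has.
    current = os.getpriority(os.PRIO_PROCESS, 0)
    os.setpriority(os.PRIO_PROCESS, 0, max(15, current))

def start_render(key, cmd):
    """Start cmd in the background at reduced priority and record its pid."""
    proc = subprocess.Popen(cmd, preexec_fn=lower_priority)
    pid_slot[key] = proc.pid
    return proc

def stop_render(key):
    """Kill a recorded render, if any, and clear its record."""
    pid = pid_slot.pop(key, None)
    if pid is not None:
        os.kill(pid, signal.SIGTERM)

sleeper = [sys.executable, "-c", "import time; time.sleep(60)"]
proc = start_render("pid/eid", sleeper)
stop_render("pid/eid")
proc.wait()
```

Clearing the record before sending the signal keeps a second stop request from targeting a pid that may already have been reused.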
  20. 29 Jul, 2003 1 commit
    • Leigh B. Stoller's avatar
      Some cleanup on the batch mode stuff. Make it more explicit in the · 29b820b1
      Leigh B. Stoller authored
      showexp page that it's a batch experiment, by the menu options. Same
      deal in the swapexp output, plus some other minor cleanup. The only
      bug I found while trying to figure out the batchmode problem reported
      this morning by the FileMover people, is that the cancelflag is not
      cleared after swapping a running batch experiment out, so even after
      reinjecting it into the queue, it will not run. Still, that does seem
      to be what the FileMover people reported.
  21. 27 Jul, 2003 1 commit
  22. 17 Jul, 2003 1 commit
  23. 11 Jun, 2003 1 commit
  24. 09 Jun, 2003 1 commit
  25. 05 Jun, 2003 2 commits
  26. 04 Jun, 2003 1 commit
  27. 03 Jun, 2003 1 commit
  28. 28 May, 2003 1 commit
  29. 25 May, 2003 1 commit
  30. 24 May, 2003 1 commit
    • Mac Newbold's avatar
      Round of changes related to idleswapping and autoswapping. The web and · 02aaf8e4
      Mac Newbold authored
      back end scripts now support 3 different kinds of forced swaps:
      
      1. Idle-Swap : this is the same one we had before. Email message to them
      says it was swapped "because it was idle for too long"
      
      2. Auto-Swap : A new one, typically for user-requested timed swapouts.
      Email says it was swapped "because it was swapped in too long"
      
      3. Force swap: Generic one, for "none of the above" cases. Just says
      Experiment "has been forcibly swapped out by Testbed Operations."
      
      The force swap option on the web now lets you choose which of these three
      you want. Only "Idle-Swap" counts as an idleswap in the stats. Soon
      idleswap and autoswap will be used by idlemail when it does automatic
      swapping.
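The three forced-swap kinds and their email wording can be summarized as a small lookup table. The sketch below quotes the message fragments from the text above; the `ForcedSwap` enum, `REASON_TEXT`, and `counts_as_idleswap` are hypothetical names used only for illustration.

```python
from enum import Enum

class ForcedSwap(Enum):
    IDLE = "idle"
    AUTO = "auto"
    FORCE = "force"

# Message fragments as quoted in the commit message.
REASON_TEXT = {
    ForcedSwap.IDLE: "because it was idle for too long",
    ForcedSwap.AUTO: "because it was swapped in too long",
    ForcedSwap.FORCE: "has been forcibly swapped out by Testbed Operations.",
}

def counts_as_idleswap(kind):
    """Only Idle-Swap is counted as an idleswap in the stats."""
    return kind is ForcedSwap.IDLE
```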
  31. 22 May, 2003 1 commit
    • Leigh B. Stoller's avatar
      Reorg the batch system slightly as per Eric's request that batch mode · da97ba35
      Leigh B. Stoller authored
      experiments look more like regular experiments. Batch mode experiments
      can now be preloaded and swapped. When preloaded, they go into a
      "Pause" state. Swapping a batch mode experiment in puts them into the
      "posted" state so the batch daemon will see them. Swapping out a
      batchmode experiment does the expected; it puts them back into the
      Pause state. Terminating a batch mode experiment does the expected;
      it's gone. When a batch mode experiment finishes normally, it goes back
      into the pause state, which allows batches to be reinjected as many
      times as Eric likes.
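The batch-mode lifecycle above reduces to a mapping from operation to resulting state. A minimal sketch using only the states the message names (pause and posted); the operation names, the lowercase "paused" spelling, and the use of `None` for a terminated experiment are assumptions, and any states the batch daemon uses while an experiment runs are left out.

```python
# Resulting state for each operation, as described in the commit message.
NEXT_STATE = {
    "preload": "paused",
    "swapin": "posted",     # visible to the batch daemon
    "swapout": "paused",
    "finish": "paused",     # can be reinjected later
    "terminate": None,      # the experiment is gone
}

def final_state(operations):
    """Return the state after applying the operations in order."""
    state = None
    for op in operations:
        state = NEXT_STATE[op]
    return state

# Reinjection: a finished batch can be swapped in again.
reinjected = final_state(["preload", "swapin", "finish", "swapin"])
```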
  32. 21 May, 2003 1 commit
    • Leigh B. Stoller's avatar
      Minor stats changes for dealing with swapmodify; be sure to credit for · cb309ff2
      Leigh B. Stoller authored
      each portion of the experiment as it is modified.
      
      Also add expt_swap_uid so that we know who did the last operation, and
      so we can charge/credit the right person. So, if joe swaps in the
      experiment and jane swaps it out, joe gets charged. If jane swaps in
      the experiment and joe modifies it, jane gets credit for the first
      portion, and joe will later get charged for the second portion.
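The charging rule in the joe/jane example can be worked through in code. This is a sketch of the rule as described, not the stats code: each portion is charged to whoever started it (by swapin or modify), no matter who ends it. The `charge` function, the event tuples, and the time units are hypothetical.

```python
def charge(events):
    """Total the time charged per user.

    events: list of (time, uid, op) tuples in time order, with op one of
    "swapin", "modify", or "swapout".
    """
    totals = {}
    owner = start = None
    for time, uid, op in events:
        if owner is not None:
            # Close the current portion and charge whoever started it.
            totals[owner] = totals.get(owner, 0) + (time - start)
        if op == "swapout":
            owner = start = None
        else:
            # swapin or modify: this uid owns the next portion.
            owner, start = uid, time
    return totals

# jane swaps in at t=0, joe modifies at t=10, jane swaps out at t=25.
usage = charge([(0, "jane", "swapin"), (10, "joe", "modify"), (25, "jane", "swapout")])
```

In this example jane is charged for the first 10 units and joe for the remaining 15, even though jane performed the swapout.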
      
      Took longer to explain than to implement ...
      
      Lbs
  33. 15 May, 2003 1 commit
    • Leigh B. Stoller's avatar
      Split the experiment stats table into two parts. The first is the · a382994d
      Leigh B. Stoller authored
      per-experiment instantiation with aggregate data like the number of
      swapins, the dates and the like. The other part is the per
      swapin/modify stats. These are number of pnodes, links, lans,
      etc. Long term, I think we want more precise swapin stats, and with
      experiment modify in the mix, we need to have multiple stat records
      per experiment, but do not need to duplicate all the stuff in the
      other table just mentioned.
      
      To reduce the table size, we cross reference the tables by
      index only instead of with pid,eid and the like. We use exptidx to
      link experiments, experiment_stats, and the new experiment_resources
      table. experiment_resources and stats are linked by another index in
      the resources table, which indicates which is the current resource
      row. On a modify, a new resource record is created, and the stats
      record updated to point to the new (latest) resource record.
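The table linkage can be tried out with an in-memory SQLite database. Only the table names and `exptidx` come from the text above; the other column names (`idx`, `rsrcidx`, `pnodes`, `swapin_count`) and the `record_resources` helper are hypothetical, and the real schema is not shown here.

```python
import sqlite3

db = sqlite3.connect(":memory:")
db.executescript("""
CREATE TABLE experiments (exptidx INTEGER PRIMARY KEY, pid TEXT, eid TEXT);
CREATE TABLE experiment_resources (
    idx INTEGER PRIMARY KEY AUTOINCREMENT,
    exptidx INTEGER,
    pnodes INTEGER);
CREATE TABLE experiment_stats (
    exptidx INTEGER PRIMARY KEY,
    swapin_count INTEGER,
    rsrcidx INTEGER);  -- points at the current resource row
""")
db.execute("INSERT INTO experiments VALUES (1, 'testproj', 'testexp')")
db.execute("INSERT INTO experiment_stats VALUES (1, 0, NULL)")

def record_resources(db, exptidx, pnodes):
    """Add a resource record and make the stats row point at it."""
    cur = db.execute(
        "INSERT INTO experiment_resources (exptidx, pnodes) VALUES (?, ?)",
        (exptidx, pnodes))
    db.execute(
        "UPDATE experiment_stats SET rsrcidx = ? WHERE exptidx = ?",
        (cur.lastrowid, exptidx))
    return cur.lastrowid

record_resources(db, 1, 4)   # initial swapin
record_resources(db, 1, 6)   # modify: new record, pointer moves forward

current = db.execute(
    "SELECT r.pnodes FROM experiment_stats s "
    "JOIN experiment_resources r ON r.idx = s.rsrcidx "
    "WHERE s.exptidx = 1").fetchone()[0]
history = db.execute(
    "SELECT COUNT(*) FROM experiment_resources WHERE exptidx = 1").fetchone()[0]
```

Older resource rows stay in place, so the per-modify history is kept while the stats row always names the latest one.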
      
      Web Changes: Improve showstats and showexpstats. Make them user
      accessible so that mere users can see stats for themselves and for
      their projects. No ability for mere users (PIs) to look at another
      person's stats. Generally, these two pages need more work, but now
      they are more useful. I added Show Stats to the user info and project
      info pages to display per-usr/proj stats. Add more info in the
      showstats display, but the showexpstats display is still not pretty
      printed; just the raw tables.
      
      Rename a few fields, add some indexes, and otherwise make some minor
      changes that are sure to annoy everyone.
  34. 05 May, 2003 1 commit
  35. 01 May, 2003 1 commit