1. 30 Sep, 2003 1 commit
    • Leigh B. Stoller's avatar
      Leigh B. Stoller authored · 4269dad1
      Up to now we have had two state variables associated with an experiment,
      plus a lock field. The lock field was a simple "experiment locked, go away"
      slot that is easy to use when you do not care about the actual state that
      an experiment is in, just that it is in "transition" and should not be
      messed with.
      
      The other two state variables are "state" and "batchstate". The former
      (state) is the original variable that Chris added, and was used by the tb*
      scripts to make sure that the experiment was in the state each particular
      script wanted them to be in. But over time (and with the addition of so
      much wrapper goo around them), "state" has leaked out all over the place to
      determine what operations on an experiment are allowed, and if/when it
      should be displayed in various web pages. There is a set of transition
      states, such as "swapping", in addition to the usual "active", "swapped",
      etc., which makes testing state a pain in the butt.
      
      I added the other state variable ("batchstate") when I did the batch
      system, obviously! It was intended as a wrapper state to control access to
      the batch queue, and to prevent batch experiments from being messed with
      except when it was really okay (for example, it's okay to terminate a
      swapped out batch experiment, but not a swapped in batch experiment since
      that would confuse the batch daemon). There are fewer of these states, plus
      one additional state for "modifying" experiments.
      
      So what I have done is change the system to use "batchstate" for all
      experiments to control entry into the swap system, from the web interface,
      from the command line, and from the batch daemon. The other state variable
      still exists, and will be brutally pushed back under the surface until it's
      just a vague memory, used only by the original tb* scripts. This will
      happen over time, and the "batchstate" variable will be renamed once I am
      convinced that this was the right thing to do and that my changes actually
      work as intended.
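
A minimal sketch of what gating entry into the swap system on a single state variable could look like. The operation names, the contents of the `ALLOWED` table, and the function `operation_allowed` are assumptions made for illustration; only the state names and the batch terminate rule come from the description above, and the real code is not Python.

```python
# Hypothetical table: which operations each batchstate value permits.
# Transition states permit nothing, so callers never need to test them
# one by one.
ALLOWED = {
    "swapped": {"swapin", "modify", "terminate"},
    "active": {"swapout", "modify", "terminate"},
    "swapping": set(),
    "modifying": set(),
}


def operation_allowed(batchstate: str, operation: str, is_batch: bool) -> bool:
    """Return True if `operation` may enter the swap system right now."""
    # Rule from the commit message: a batch experiment may be terminated
    # only while swapped out, so the batch daemon is not confused.
    if is_batch and operation == "terminate":
        return batchstate == "swapped"
    # Unknown states are treated as "in transition": deny everything.
    return operation in ALLOWED.get(batchstate, set())
```

The same check can then be shared by the web interface, the command line, and the batch daemon, which is the point of funnelling everything through one variable.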
      
      Only people who have bothered to read this far will know that I also added
      the ability to cancel experiment swapin in progress. For that I am using
      the "canceled" flag (ah, this one was named properly from the start!), and
      I test that at various times in assign_wrapper and tbswap. A minor downside
      right now is that a canceled swapin looks too much like a failed swapin,
      and so tbops gets email about it. I'll fix that at some point (sometime
      after the boss complains).
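
One way to picture testing the "canceled" flag at various times is a loop that polls it between steps. This is only a sketch: `run_swapin`, `SwapinCanceled`, and the `is_canceled` callback are hypothetical names, not the interfaces of assign_wrapper or tbswap.

```python
class SwapinCanceled(Exception):
    """Raised when the canceled flag is seen between swapin steps."""


def run_swapin(steps, is_canceled):
    """Run (name, callable) steps in order, polling the flag before each.

    `is_canceled` is a zero-argument callable; in a real system it would
    presumably re-read the flag from the database (an assumption here).
    """
    completed = []
    for name, step in steps:
        if is_canceled():
            raise SwapinCanceled("canceled before step: " + name)
        step()
        completed.append(name)
    return completed
```

Raising a distinct exception type is one possible way to let a caller tell a canceled swapin apart from a failed one.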
      
      I also cleaned up various bits of code, replacing direct calls to exec
      with calls to the recently improved SUEXEC interface. This removes
      some cruft from each script that calls an external script.
      
      Cleaned up modifyexp.php3 quite a bit, reformatting and indenting.
      Also fixed it so that it does not run the parser directly! That was very
      wrong; it should call nscheck instead. Changed to use "nobody" group
      instead of group flux (made the same change in nscheck).
      
      There is a script in the sql directory called newstates.pl. It needs
      to be run to initialize the batchstate slot of the experiments table
      for all existing experiments.
  2. 29 Sep, 2003 7 commits
  3. 28 Sep, 2003 1 commit
  4. 27 Sep, 2003 1 commit
  5. 26 Sep, 2003 14 commits
  6. 25 Sep, 2003 10 commits
  7. 24 Sep, 2003 6 commits
    • Leigh B. Stoller's avatar
      Leigh B. Stoller authored · 59c5d5bb
      Commit my daemon to monitor the status of plab physnodes in hwdown,
      trying to bring them back from the dead periodically by trying to
      instantiate a vserver/vnode on them, and then tearing it down. If we
      can do that, then the node is usable, and it gets moved back into the
      normal holding experiment so that ptopgen will add it to ptop files.
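
A rough sketch of one pass of such a monitor, assuming the probe (instantiate a vserver/vnode, then tear it down) and the move back to the holding experiment are supplied as callables. `revive_pass`, `probe`, and `move_to_holding` are hypothetical names; the actual daemon is not shown here.

```python
def revive_pass(hwdown_nodes, probe, move_to_holding):
    """Try each node in hwdown once; return the nodes that came back.

    `probe(node)` should return True if a vnode could be set up and torn
    down on the node. Any exception is treated as "still dead".
    """
    revived = []
    for node in hwdown_nodes:
        try:
            usable = bool(probe(node))
        except Exception:
            usable = False
        if usable:
            # Node is usable again: hand it back so it can be allocated.
            move_to_holding(node)
            revived.append(node)
    return revived
```

A daemon would call this periodically, sleeping between passes.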
      
      This daemon is not turned on yet; waiting for other little bits and
      pieces to be done.
      
      There is an equiv change in os_setup that moves physnodes into hwdown
      when a setup on a vnode fails.
      
      Lbs
    • Jay Lepreau's avatar
      Jay Lepreau authored · 89d6ea0f
      Elab interface to Plab announcement, tweaked with one addition and
      a little html.  Ref'ed from news page.  Installed on 9/22.
    • Robert Ricci's avatar
      Robert Ricci authored · f855d010
      For wide-area nodes, include the site as a feature in the ptop file, so that
      assign can attempt to spread an experimenter's nodes across sites.
    • Leigh B. Stoller's avatar
      Leigh B. Stoller authored · 0eba3e76
      Convert install-rpm/install-tarfile to use the web server instead of
      tmcd (which is bad, since tying up the tmcd threads blocks all nodes
      in the testbed). The old functionality is left in tmcd for now.
      
      On the server side, a new web page (www/spewrpmtar.php3) receives a
      request for a file, along with the nodeid (pcXXX) making the request,
      and the secret key that is generated for each new experiment and
      transferred to the node via tmcd. If the key matches, the operation is
      handed off to tbsetup/spewrpmtar.in which verifies that the file is in
      the list of rpm/tar files for that node, and then spits it out to
      stdout. The web page uses fpassthru() to send the file out to the
      client. The client is using wget, and is required to use https (the
      web page checks).
      
      At present, the external script is run as the creator of the
      experiment, and gid of the experiment. Perhaps this is not a good
      idea. In any event, the file must be in the list of rpm/tarfiles,
      either owned by the experiment creator or with a group of the
      experiment, and the file must reside in either /proj or /groups.
      I use the realpath() function to make sure there are no symlink tricks
      pointing to outside those filesystems. I use the standard NFS read goo to
      prevent transient mount problems that we all know and love.
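
The checks described above (key match, file on the node's list, real path confined to /proj or /groups) can be sketched roughly as follows. This is an illustration in Python, not the actual PHP/Perl code; `may_serve`, `under_allowed_root`, and their parameters are hypothetical, and the ownership and group checks are left out.

```python
import hmac
import os

ALLOWED_ROOTS = ("/proj", "/groups")


def under_allowed_root(path, roots=ALLOWED_ROOTS):
    """True if `path`, with symlinks resolved, lies under one of `roots`."""
    real = os.path.realpath(path)
    for root in roots:
        real_root = os.path.realpath(root)
        if os.path.commonpath([real, real_root]) == real_root:
            return True
    return False


def may_serve(path, presented_key, experiment_key, listed_files,
              roots=ALLOWED_ROOTS):
    """Decide whether a requested rpm/tar file may be sent to a node."""
    # Constant-time comparison of the per-experiment secret key.
    if not hmac.compare_digest(presented_key.encode(), experiment_key.encode()):
        return False
    # The file must be on the list of rpm/tar files for the node.
    if path not in listed_files:
        return False
    # Resolving symlinks first defeats links that point outside the roots.
    return under_allowed_root(path, roots)
```

Comparing resolved paths rather than raw strings is what makes the symlink check meaningful: a link inside /proj that points elsewhere resolves to a path outside the allowed roots and is rejected.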
    • Mike Hibler's avatar