1. 12 Jan, 2005 1 commit
  2. 10 Sep, 2004 1 commit
    • Leigh B. Stoller's avatar
      Small change to suexec code. This change has the potential for creating · 7e731fba
      Leigh B. Stoller authored
      unanticipated breakage. If that happens, just need to back out the
      changes under the "suexec-stuff" tag. However, the better solution will
      probably be to fix the PHP scripts that break by adding the proper
      groups in the call to suexec (in the web page, see below) or by fixing
      the backend Perl script that breaks.
      
      This fix is primarily to address the problem of some users being in more
      groups (cause of subgroups) then the max number of groups allowed
      (NGROUPS).  The groups that really mattered (say, for creating an
      experiment in a subgroup) could be left out cause they were at the end
      of the list.
      
      * suexec.c: Change how groups are handled. Instead of taking a single
        gid argument (the gid to setgid as), now takes a comma separated list
        of groups. Further, instead of doing a setgroups to the user's entire
        group list as specified in the groups file (getgroups), setgroups to
        just the groups listed on the command line, plus the user's primary
        group from the password file (this is to prevent potential breakage
        with accessing files from the users homedir, although might not really
        be necessary).
      
        This change is somewhat rational in the sense that in our case, suexec
        is not being used to run arbitrary user code (CGIs), but only to run
        specific scripts that we say should be run. The environment for
        running those scripts can be more tightly controlled then it would
        otherwise need to be if running some random CGI the user has in his
        public html directory.
      
      * www: Change the gid argument to SUEXEC() in a number of scripts so
        that the project and subgroup are explicitly given to suexec, as
        described above. For example, in beginexp:
      
      	SUEXEC(gid, "$pid,$unix_gid", ....);
      
        Aside: note that project names (pid) are always one to one with their
        unix group name, but subgroup names are not, and *always* have to be
        looked up in the DB, hence the "unix_gid" argument.
      
        Script breakage should require nothing more then adding the proper
        group to the list as above.
      7e731fba
  3. 12 Feb, 2004 1 commit
  4. 10 Feb, 2004 1 commit
  5. 17 Nov, 2003 1 commit
    • Leigh B. Stoller's avatar
      Merge the two state machines (batchstate and state) into a single · 2025e0bd
      Leigh B. Stoller authored
      state machine (state). All of the stuff that was previously handled by
      using batchstate is now embedded into the one state machine. Of
      course, these mostly overlapped, so its not that much of a change,
      except that we also redid the machine, adding more states (for
      example, modify phases are now explicit. To get a picture of the
      actual state machine, on boss:
      
      		stategraph -o newstates EXPTSTATE
      		gv newstates.ps
      
      Things to note:
      
      * The "batchstate" slot of the experiments table is now used solely to
        provide a lock for batch daemon. A secondary change will be to
        change the slot name to something more appropriate, but it can
        happen anytime after this new stuff is installed.
      
      * I have left expt_locked for now, but another later change will be to remove
        expt_locked, and change it to active_busy or some such new state name in
        the state machine. I have removed most uses of expt_locked, except those
        that were necessary until there is a new state to replace it.
      
      * These new changes are an implementation of the new state machine,
        but I have not done anything fancy. Most of the code is the same as
        it was before.
      
      * I suspect that there are races with the batch daemon now, but they
        are going to be rare, and the end result is probably that a
        cancelation is delayed a little bit.
      2025e0bd
  6. 30 Sep, 2003 1 commit
    • Leigh B. Stoller's avatar
      Up to now we have had two state variables associated with an experiment, · 4269dad1
      Leigh B. Stoller authored
      plus a lock field. The lock field was a simple "experiment locked, go away"
      slot that is easy to use when you do not care about the actual state that
      an experiment is in, just that it is in "transition" and should not be
      messed with.
      
      The other two state variables are "state" and "batchstate". The former
      (state) is the original variable that Chris added, and was used by the tb*
      scripts to make sure that the experiment was in the state each particular
      script wanted them to be in. But over time (and with the addition of so
      much wrapper goo around them), "state" has leaked out all over the place to
      determine what operations on an experiment are allowed, and if/when it
      should be displayed in various web pages. There are a set of transition
      states in addition to the usual "active", "swapped", etc like "swapping"
      that make testing state a pain in the butt.
      
      I added the other state variable ("batchstate") when I did the batch
      system, obviously! It was intended as a wrapper state to control access to
      the batch queue, and to prevent batch experiments from being messed with
      except when it was really okay (for example, its okay to terminate a
      swapped out batch experiment, but not a swapped in batch experiment since
      that would confuse the batch daemon). There are fewer of these states, plus
      one additional state for "modifying" experiments.
      
      So what I have done is change the system to use "batchstate" for all
      experiments to control entry into the swap system, from the web interface,
      from the command line, and from the batch daemon. The other state variable
      still exists, and will be brutally pushed back under the surface until its
      just a vague memory, used only by the original tb* scripts. This will
      happen over time, and the "batchstate" variable will be renamed once I am
      convinced that this was the right thing to do and that my changes actually
      work as intended.
      
      Only people who have bothered to read this far will know that I also added
      the ability to cancel experiment swapin in progress. For that I am using
      the "canceled" flag (ah, this one was named properly from the start!), and
      I test that at various times in assign_wrapper and tbswap. A minor downside
      right now is that a canceled swapin looks too much like a failed swapin,
      and so tbops gets email about it. I'll fix that at some point (sometime
      after the boss complains).
      
      I also cleaned up various bits of code, replacing direct calls to exec
      with calls to the recently improved SUEXEC interface. This removes
      some cruft from each script that calls an external script.
      
      Cleaned up modifyexp.ph3 quite a bit, reformatting and indenting.
      Also fixed to not run the parser directly! This was very wrong; should
      call nscheck instead. Changed to use "nobody" group instead of group
      flux (made the same change in nscheck).
      
      There is a script in the sql directory called newstates.pl. It needs
      to be run to initialize the batchstate slot of the experiments table
      for all existing experiments.
      4269dad1
  7. 09 Sep, 2003 1 commit
  8. 06 Aug, 2003 1 commit
    • Leigh B. Stoller's avatar
      Clean up temporary files used in modify. The temp dirs were being · 05bd80ff
      Leigh B. Stoller authored
      created in /tmp and left behind. I've moved them to the expwork
      directory instead, and added a routine in the library to clear them
      out.
      
      Clear out the nsfile (stored in /tmp) used in modify. The web page was
      creating a temp file, but never removing it. swapexp now copies the
      nsfile in so that the web page can remove the temporary after the
      script exits. The temp is placed in the expwork directory as well, but
      left behind for debugging.
      
      When swapmod fails, send along the nsfile in the email message.
      05bd80ff
  9. 30 Jul, 2003 1 commit
  10. 14 Jul, 2003 1 commit
  11. 06 Jun, 2003 1 commit
    • Chad Barb's avatar
      · 2fad09a4
      Chad Barb authored
      NetBuild modify. Woot.
      
      Available only to admins for now.
      
      Link is available off "modify experiment page" in admin mode.
      2fad09a4
  12. 21 May, 2003 1 commit
    • Chad Barb's avatar
      · f3a7da49
      Chad Barb authored
      Changed submit button to work on NS4.7
      (by using <input type='submit'> instead of <button>)
      for our retrocomputing friends.
      f3a7da49
  13. 14 May, 2003 1 commit
    • Chad Barb's avatar
      · 84bffaa3
      Chad Barb authored
      Modify experiment is now deployed/documented...
      We can announce it tomorrow.
      84bffaa3
  14. 17 Apr, 2003 1 commit
    • Chad Barb's avatar
      · 4233af4e
      Chad Barb authored
      For the benefit of our users,
      added 'reboot nodes in experiment' checkbox,
      on by default, with a stern warning.
      4233af4e
  15. 03 Apr, 2003 1 commit
    • Chad Barb's avatar
      · 765de560
      Chad Barb authored
      Added new feature 'Experiment Modify'.
      Now available (to admins only for now) from the showexp page.
      
      Warning! doing a modify which alters the topology will probably
      require a "reboot all nodes" afterwards.
      (There will be a checkbox soon in the modify experiment page.)
      
      Adding/removing delay nodes seems to work fine without reboots, though.
      
      Warning! If the new version of the experiment cannot be mapped
       (not enough nodes available, for instance) the experiment will be
       swapped out! This will get fixed later.
      
      Prerun backs up the experiment topology, so using a bad NS
      file doesn't result in experiment termination.
      
      As part of this, added library functions to libdb to
      delete, backup, and restore both virtual and physical experiment state.
      765de560