1. 22 Jun, 2005 1 commit
    • Leigh B. Stoller's avatar
      Added my simplistic link tracing and monitoring. Example usage and · 7942119e
      Leigh B. Stoller authored
      some details can be found in the advanced tutorial that I wrote up.
      See this link:
      The basic idea is that each virt_lan entry gets a couple of new slots
      describing the type of tracing that is desired.
        traced tinyint(1) default '0',
        trace_type enum('header','packet','monitor') NOT NULL default 'header',
        trace_expr tinytext,
        trace_snaplen int(11) NOT NULL default '0',
        trace_endnode tinyint(1) NOT NULL default '0',
      There is a new physical table called "traces" that is a little bit
      like the current delays table. A new tmcd command returns the trace
      configuration to the client nodes (tmcd/common/config/rc.trace).
      The delays table got a new boolean called "noshaping" that tells the
      delay node to bridge, but not set up any pipes. This allows us to
      capture traffic at the delay node, but without much less overhead on
      the packets.
      The pcapper got bloated up to do packet capture and more event stuff.
      I also had to add some mutex locking around calls into the pcap
      library and around malloc, since the current setup used linuxthreads,
      which is not compatable with the standard libc_r library. I was
      getting all kinds of memory corruption, and I am sure that if someone
      breathes on the pcapper again, it will break in some new way.
  2. 10 Sep, 2004 1 commit
    • Leigh B. Stoller's avatar
      Small change to suexec code. This change has the potential for creating · 7e731fba
      Leigh B. Stoller authored
      unanticipated breakage. If that happens, just need to back out the
      changes under the "suexec-stuff" tag. However, the better solution will
      probably be to fix the PHP scripts that break by adding the proper
      groups in the call to suexec (in the web page, see below) or by fixing
      the backend Perl script that breaks.
      This fix is primarily to address the problem of some users being in more
      groups (cause of subgroups) then the max number of groups allowed
      (NGROUPS).  The groups that really mattered (say, for creating an
      experiment in a subgroup) could be left out cause they were at the end
      of the list.
      * suexec.c: Change how groups are handled. Instead of taking a single
        gid argument (the gid to setgid as), now takes a comma separated list
        of groups. Further, instead of doing a setgroups to the user's entire
        group list as specified in the groups file (getgroups), setgroups to
        just the groups listed on the command line, plus the user's primary
        group from the password file (this is to prevent potential breakage
        with accessing files from the users homedir, although might not really
        be necessary).
        This change is somewhat rational in the sense that in our case, suexec
        is not being used to run arbitrary user code (CGIs), but only to run
        specific scripts that we say should be run. The environment for
        running those scripts can be more tightly controlled then it would
        otherwise need to be if running some random CGI the user has in his
        public html directory.
      * www: Change the gid argument to SUEXEC() in a number of scripts so
        that the project and subgroup are explicitly given to suexec, as
        described above. For example, in beginexp:
      	SUEXEC(gid, "$pid,$unix_gid", ....);
        Aside: note that project names (pid) are always one to one with their
        unix group name, but subgroup names are not, and *always* have to be
        looked up in the DB, hence the "unix_gid" argument.
        Script breakage should require nothing more then adding the proper
        group to the list as above.
  3. 17 Nov, 2003 1 commit
    • Leigh B. Stoller's avatar
      Merge the two state machines (batchstate and state) into a single · 2025e0bd
      Leigh B. Stoller authored
      state machine (state). All of the stuff that was previously handled by
      using batchstate is now embedded into the one state machine. Of
      course, these mostly overlapped, so its not that much of a change,
      except that we also redid the machine, adding more states (for
      example, modify phases are now explicit. To get a picture of the
      actual state machine, on boss:
      		stategraph -o newstates EXPTSTATE
      		gv newstates.ps
      Things to note:
      * The "batchstate" slot of the experiments table is now used solely to
        provide a lock for batch daemon. A secondary change will be to
        change the slot name to something more appropriate, but it can
        happen anytime after this new stuff is installed.
      * I have left expt_locked for now, but another later change will be to remove
        expt_locked, and change it to active_busy or some such new state name in
        the state machine. I have removed most uses of expt_locked, except those
        that were necessary until there is a new state to replace it.
      * These new changes are an implementation of the new state machine,
        but I have not done anything fancy. Most of the code is the same as
        it was before.
      * I suspect that there are races with the batch daemon now, but they
        are going to be rare, and the end result is probably that a
        cancelation is delayed a little bit.
  4. 30 Sep, 2003 1 commit
    • Leigh B. Stoller's avatar
      Up to now we have had two state variables associated with an experiment, · 4269dad1
      Leigh B. Stoller authored
      plus a lock field. The lock field was a simple "experiment locked, go away"
      slot that is easy to use when you do not care about the actual state that
      an experiment is in, just that it is in "transition" and should not be
      messed with.
      The other two state variables are "state" and "batchstate". The former
      (state) is the original variable that Chris added, and was used by the tb*
      scripts to make sure that the experiment was in the state each particular
      script wanted them to be in. But over time (and with the addition of so
      much wrapper goo around them), "state" has leaked out all over the place to
      determine what operations on an experiment are allowed, and if/when it
      should be displayed in various web pages. There are a set of transition
      states in addition to the usual "active", "swapped", etc like "swapping"
      that make testing state a pain in the butt.
      I added the other state variable ("batchstate") when I did the batch
      system, obviously! It was intended as a wrapper state to control access to
      the batch queue, and to prevent batch experiments from being messed with
      except when it was really okay (for example, its okay to terminate a
      swapped out batch experiment, but not a swapped in batch experiment since
      that would confuse the batch daemon). There are fewer of these states, plus
      one additional state for "modifying" experiments.
      So what I have done is change the system to use "batchstate" for all
      experiments to control entry into the swap system, from the web interface,
      from the command line, and from the batch daemon. The other state variable
      still exists, and will be brutally pushed back under the surface until its
      just a vague memory, used only by the original tb* scripts. This will
      happen over time, and the "batchstate" variable will be renamed once I am
      convinced that this was the right thing to do and that my changes actually
      work as intended.
      Only people who have bothered to read this far will know that I also added
      the ability to cancel experiment swapin in progress. For that I am using
      the "canceled" flag (ah, this one was named properly from the start!), and
      I test that at various times in assign_wrapper and tbswap. A minor downside
      right now is that a canceled swapin looks too much like a failed swapin,
      and so tbops gets email about it. I'll fix that at some point (sometime
      after the boss complains).
      I also cleaned up various bits of code, replacing direct calls to exec
      with calls to the recently improved SUEXEC interface. This removes
      some cruft from each script that calls an external script.
      Cleaned up modifyexp.ph3 quite a bit, reformatting and indenting.
      Also fixed to not run the parser directly! This was very wrong; should
      call nscheck instead. Changed to use "nobody" group instead of group
      flux (made the same change in nscheck).
      There is a script in the sql directory called newstates.pl. It needs
      to be run to initialize the batchstate slot of the experiments table
      for all existing experiments.
  5. 30 May, 2003 1 commit
  6. 23 Apr, 2003 1 commit
    • Leigh B. Stoller's avatar
      More bloat! Check for experiments in transition. Check state for · 5987ba7f
      Leigh B. Stoller authored
      active. Add explanation of how to use the page at the top. Add check
      box for modifying the base experiment (virt_lans). Redo Red/Gred stuff
      so that its "greyed" out on links that are not red/gred. Clean up the
      script argument processing and proper permission checks.
      The missing feature that would be nice to add is the ability to modify
      the base experiment config when the experiment is swapped out. I think
      it would be very handy. The problem is that instead of looking at
      delays and linkdelays, we have to read virt_lans directly and do a
      bunch of stuff that assign_wrapper does when creating delays and
      linkdelays; namely putting together the individual member parameters
      of a link or lan into a coherent set of values. Not rocket science,
      but a lot of ugly script hacking, and I am not in the mood. Note
      though that I modified the backend (delay_config) to allow this to be
      done in the future.
  7. 22 Apr, 2003 1 commit
    • Leigh B. Stoller's avatar
      Inject some serious bloat! Add queue limit/type and the RED/GRED · b20ae26a
      Leigh B. Stoller authored
      params. Allow "easier" changing of entire link/lan (rather than having
      to specify each node). A lot data structure hacking to allow for
      reducing the number of actual backend (delay_config) commands issued.
      Used to be 1 per change! Now its one per node or link/lan.
      This is almost ready ...
  8. 05 Mar, 2003 1 commit
  9. 02 Oct, 2002 1 commit
    • Chad Barb's avatar
      Initial version of delay web control. · dd27f82a
      Chad Barb authored
      Functional, but needs some work.
      Won't allow non-admins to use it (since it doesn't do "proper" permission checking yet.)
      Input is aggressively checked for bad mojo before being pasted into any command line.
      Run from /delaycontrol.php3?eid=exptname&pid=projname
      Admin bit must be on.