1. 14 Sep, 2006 9 commits
  2. 13 Sep, 2006 10 commits
  3. 12 Sep, 2006 9 commits
    • Robert Ricci authored · a9e89bd0
      Fix a bug I introduced with a careless copy and paste - I was copying
      an int in twice.
      
      Also fix another bug (masked by the previous) I introduced into
      census()
    • Kirk Webb authored · 52dcfd48
      Added secondary logging for node setup/teardown success/failure.  Also log
      node pool membership changes in this log.
    • Leigh B. Stoller authored · cbdc4178
      This started out as a simple little hack to add a StopRun "ns" event, but
      it got more complicated as it progressed.
      
      The bulk of the change was changing template_exprun so that it can take a
      pid/eid as an alternative to eid/guid. This is a big convenience since it's
      easy to find the template from a running experiment, and it makes it
      possible to invoke from the event scheduler, which has never heard of a
      template before (and it's not something I wanted to teach it about).  It's
      also easier on users.
      
      Anyway, back to the stoprun event. You can now do this:
      
      	$ns at 100 "$ns stoprun"
      or
      	tevc -e pid/eid now ns stoprun
      
      You can add the -w option to wait for the completion event that is sent,
      but this brings me to the glaring problems with this whole thing.
      
      * First, the scheduler has to fire off the stoprun in the background,
        since if it waits, we get deadlock. Why? Cause the implementation of
        stoprun uses the event system (SNAPSHOT event, other things), and if
        the scheduler is sitting and waiting, nothing happens.
      
        Okay, the solution to this was to generate a COMPLETION event from
        template_exprun once the stop operation is complete. This brings me
        to the second problem ...
      
      * Worse, the "ns" events that are sent to implement stoprun (like
        snapshot) send their own completion events, and that confuses anyone
        waiting on the original stoprun event (it returns early).
      
        So what to do about this? There is a "token" field in the completion
        event structure, which I presume is to allow you to match things up.  But
        there is no way to set this token using tevc (and then wait for it), and
        besides, the event scheduler makes them up anyway and sticks them into
        the event. So, the seeds of a fix are already germinating in my mind, but
        I wanted to get this commit in so that Mike would have fun reading this
        commit log.
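
      As an illustration only (not necessarily the fix alluded to above), a waiter could filter on the token field so that completion events from nested events such as snapshot are skipped. CompletionEvent and wait_for_completion are hypothetical names, not part of the event system API:

```python
from dataclasses import dataclass
from typing import Iterable, Optional


@dataclass
class CompletionEvent:
    # Hypothetical stand-in for the completion event structure;
    # only the "token" field is mentioned in the commit message.
    objname: str
    token: int


def wait_for_completion(events: Iterable[CompletionEvent],
                        token: int) -> Optional[CompletionEvent]:
    """Return the first completion event carrying `token`, skipping all others."""
    for ev in events:
        if ev.token == token:
            return ev
    return None


# Assumed scenario: the nested snapshot completes first with its own token,
# then the original stoprun completes with the token the waiter asked for.
stream = [
    CompletionEvent("ns", 7),
    CompletionEvent("ns", 42),
]
matched = wait_for_completion(stream, token=42)
```

      With this kind of filter the waiter no longer returns early on the snapshot completion, provided the caller can choose or learn the token in advance.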
    • Robert Ricci authored · 84cf1d12
      Add a ton of debugging output, showing the byte locations that each
      'field' is written to and read from. This was done to aid the
      debugging of reading and writing replay files.
      
      However, this output is ridiculously verbose, so it's commented out.
    • Robert Ricci authored · 4d11d3ec
      Try to make sure we get core dumps, by upping the coredump size
      rlimit.
      
      Also, check for error in packet size calculation vs. how much data
      is actually saved.
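
      A minimal sketch of raising the core dump size limit, shown in Python for brevity. It wraps the same POSIX getrlimit/setrlimit calls a C or C++ program would use; raise_core_limit is a hypothetical helper name, and the committed code may differ:

```python
import resource


def raise_core_limit():
    """Raise the soft core-dump size limit to the hard limit (POSIX only)."""
    soft, hard = resource.getrlimit(resource.RLIMIT_CORE)
    # An unprivileged process may always raise its soft limit up to the hard limit.
    resource.setrlimit(resource.RLIMIT_CORE, (hard, hard))
    return resource.getrlimit(resource.RLIMIT_CORE)
```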
    • Robert Ricci authored · 5b3b2838
      Serious bugfix - PacketInfo::census() was undercounting the number
      of bytes required to save the packet. This was causing us to create
      a buffer too small to hold the packet, causing memory corruption bugs
      and causing us to write invalid replay files.
      
      The way that the packet size calculation is separated from the saving
      of the packet is a serious problem, and needs to be re-designed!
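
      One possible shape for such a redesign, sketched in Python under assumed names: derive both the size and the saved bytes from a single field table, so the two cannot drift apart. The field layout below is hypothetical, since the real packet record format is not shown in this log:

```python
import struct

# Hypothetical field layout: (name, struct format code).
FIELDS = [("seq", "I"), ("ack", "I"), ("length", "H")]
# "<" selects little-endian with no padding, so the size is the sum of the fields.
RECORD_FORMAT = "<" + "".join(code for _, code in FIELDS)


def census() -> int:
    """Number of bytes needed to save one record, derived from the layout."""
    return struct.calcsize(RECORD_FORMAT)


def save(packet: dict) -> bytes:
    """Serialize a record into a buffer sized by census()."""
    buf = bytearray(census())
    struct.pack_into(RECORD_FORMAT, buf, 0, *(packet[name] for name, _ in FIELDS))
    return bytes(buf)
```

      Adding a field to FIELDS changes both the computed size and the saved bytes together, which is the property the separated calculation lacked.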
    • Leigh B. Stoller authored · 4820df1b
      Checkpoint little web page to spew the event stream out. The bulk of
      this change was actually refactoring Tim's spewlog code to be more
      general so that it can be used elsewhere. I still need to go back and
      change Tim's original code to use the stuff.
    • Jonathon Duerig authored · 9c9b43b4
      Quick fix to LOG_EVERYTHING.
    • Jonathon Duerig authored · c82c98d8
      Finished adding the REPLAY option for logging. Added an explanation of how to add new logging options to the comments at the top.
  4. 11 Sep, 2006 9 commits
  5. 10 Sep, 2006 2 commits
    • Leigh B. Stoller authored · e8bb6bca
      The bulk of this commit adds the ability to run the program agent on ops
      so that users can schedule program events to run there. For example:
      
      	set myprog [new Program $ns]
      	$myprog set node "ops"
      	$myprog set command "/usr/bin/env >& /tmp/foo"
      
      	$ns at 10 "$myprog start"
      or
      	tevc -e pid/eid now myprog start
      
      Since the program agent cannot talk to tmcd from ops, there are new
      routines to create the config files that the program agent uses, in
      the experiment tbdata directory.
      
      I also rewrote the eventsys.proxy script that starts the event
      scheduler on ops; I rolled the startup of the program agent into this
      script, via a new -a option which is passed over from boss when an ops
      program agent is detected in the virt topology. This keeps the number
      of new processes on ops to a small number.
      
      Also part of the above rewrite is that we now catch when the event
      scheduler (or the program agent) exits abnormally, sending email to
      tbops and the swapper of the experiment. We have been seeing abnormal
      exits of the scheduler and it would be good to detect and see if we can
      figure out what is going wrong.
      
      Other small bug fixes in experiment run.
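
      The abnormal-exit detection described above can be sketched roughly as follows. This is an assumption-laden illustration, not the eventsys.proxy code: supervise and exited_abnormally are hypothetical names, and the notify callback stands in for sending email to tbops and the swapper:

```python
import subprocess
import sys


def exited_abnormally(returncode: int) -> bool:
    # A nonzero status, or a negative one (killed by a signal on POSIX),
    # is treated as abnormal.
    return returncode != 0


def supervise(argv, notify):
    """Run a child to completion and report an abnormal exit via notify()."""
    proc = subprocess.run(argv)
    if exited_abnormally(proc.returncode):
        notify(f"{argv[0]} exited abnormally with status {proc.returncode}")
    return proc.returncode


# Stand-in child process that exits with status 3.
messages = []
status = supervise([sys.executable, "-c", "import sys; sys.exit(3)"],
                   messages.append)
```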
    • Jonathon Duerig authored · 9c6f20f0
      Added a first rough draft of the least squares path saturation sensor. There are a lot of rough edges detailed earlier in a message to Rob. This is totally untested code.
  6. 08 Sep, 2006 1 commit