1. 24 Aug, 2006 3 commits
    • Leigh B. Stoller's avatar
      Add a limit option to help with debugging. The limit is the number of · 95feb639
      Leigh B. Stoller authored
      rows pushed across to the DP.
    • Mike Hibler's avatar
    • Jonathon Duerig's avatar
      Fixed the traffic model so that it is now in line with what Rob and I... · c6fa578a
      Jonathon Duerig authored
      Fixed the traffic model so that it is now in line with what Rob and I discussed. Delays are before the write with the write size cached. Writes expire based on an expiration date.
      Miscellaneous fixes. Open problem: PacketSensor does not deal correctly with zero-sized packets. I changed KernelTcp to pass such packets because otherwise there is no way to do state changes based on SYN/FIN/other packets. SYN is handled ok for now because of a change noted below. FIN is not.
      Added a StateSensor. As I discovered, using the kernel tcp_info data structure isn't useful when dealing with fields that change on a packet-by-packet basis because the kernel information is retrieved at processing time and not capture time. For instance, it is useless when trying to determine whether a connection was established by the time a particular packet was sent (to determine whether it is part of the three-way handshake). The StateSensor keeps track of the state machine and correlates it to the packet involved. This allows the other sensors to rely on it to distinguish between connection setup and the rest of the connection traffic. Added references to it in the PacketSensor, the DelaySensor and the ThroughputSensor.
      Changed the way packet information was transmitted to the sensors to make it easier to add new packet types (as will be necessary when accept()s are handled).
      Fixed all outstanding flaws in the basic feedback mechanism. In short, "It's alive!". Currently the only data being transmitted is the base RTT (MinDelay). Now that the feedback is working in a basic form, it will be easier to get the other characteristics online.
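      The StateSensor idea above (follow the connection state in capture order, so each packet is judged against the state at capture time rather than at processing time) might be sketched roughly as follows. This is only an illustration: `HandshakeTracker`, `Flags`, and `ConnState` are hypothetical names, and the real StateSensor may track more states than the three-way handshake.

```cpp
// Hypothetical, simplified view of the TCP flags on one captured packet.
struct Flags {
    bool syn;
    bool ack;
};

enum class ConnState { Closed, SynSeen, SynAckSeen, Established };

// Assumed sketch: packets are fed in capture order, and each one is
// judged against the state the connection had just before it was seen.
class HandshakeTracker {
public:
    HandshakeTracker() : state_(ConnState::Closed) {}

    // Returns true if the connection was already established before
    // this packet, i.e. the packet is not part of the handshake.
    bool observe(const Flags& f) {
        const bool wasEstablished = (state_ == ConnState::Established);
        if (state_ == ConnState::Closed && f.syn && !f.ack) {
            state_ = ConnState::SynSeen;          // SYN
        } else if (state_ == ConnState::SynSeen && f.syn && f.ack) {
            state_ = ConnState::SynAckSeen;       // SYN+ACK
        } else if (state_ == ConnState::SynAckSeen && !f.syn && f.ack) {
            state_ = ConnState::Established;      // final handshake ACK
        }
        return wasEstablished;
    }

private:
    ConnState state_;
};
```

      Other sensors could then consult the returned flag to skip handshake packets, which is the kind of distinction the commit describes.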
  2. 23 Aug, 2006 1 commit
    • Kirk Webb's avatar
      · 19389112
      Kirk Webb authored
      Fix up the /etc/passwd file creation process to work correctly when
      the slice is rebooted (i.e., when rc.inplab gets re-run, don't append
      all the existing service slice user entries - exponential blowup!).
      This is what caused the bgmon sliver on the German node to go haywire with
      memory - the sliver was rebooted many times.
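      One way to make such a re-run idempotent is to append only entries whose user name is not already in the file. The sketch below is an assumption about the general technique, not the actual rc.inplab change; `appendMissing` is a hypothetical helper and the file is modeled as a vector of lines.

```cpp
#include <set>
#include <string>
#include <vector>

// Append to `passwd` only those entries whose user name (the text
// before the first ':') is not already present. Running this twice
// with the same `entries` leaves `passwd` unchanged the second time.
void appendMissing(std::vector<std::string>& passwd,
                   const std::vector<std::string>& entries) {
    std::set<std::string> names;
    for (const std::string& line : passwd) {
        names.insert(line.substr(0, line.find(':')));
    }
    for (const std::string& entry : entries) {
        const std::string name = entry.substr(0, entry.find(':'));
        if (names.insert(name).second) {  // true only for a new name
            passwd.push_back(entry);
        }
    }
}
```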
  3. 22 Aug, 2006 6 commits
  4. 21 Aug, 2006 14 commits
    • Kirk Webb's avatar
      · af0d6629
      Kirk Webb authored
      Some bugfixes and updates to the monitor.
      * Added load average monitoring and initial test startup randomization
      The load the monitor was exerting, especially at startup, was pretty high.
      This change appears to have brought that under control.
      * Fixed window size bug(s)
      There were a few bugs related to tracking the outstanding child process
      window that are corrected by this checkin.
    • Mike Hibler's avatar
      Tweaks to make auto-* work correctly with BSD /bin/sh · 9b8d05a3
      Mike Hibler authored
      Change dbmonitor to, by default, use the latest of A->B and B->A
      values when setting characteristics on site A.  Do this for both latency
      and bandwidth values, though should probably allow BW values to continue
      to be asymmetric.
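      The "latest of A->B and B->A" rule can be pictured as picking the more recent of two timestamped samples. A minimal sketch, assuming a hypothetical `Sample` record and a hypothetical `latestOf` helper (dbmonitor itself is not shown in this log):

```cpp
// Hypothetical measurement record: a value plus the unix time it was taken.
struct Sample {
    long timestamp;
    double value;
};

// Return whichever of the two directional samples is more recent.
// Ties go to the A->B sample, an arbitrary choice for this sketch.
Sample latestOf(const Sample& aToB, const Sample& bToA) {
    return (bToA.timestamp > aToB.timestamp) ? bToA : aToB;
}
```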
    • Robert Ricci's avatar
      Fix a bug noticed by Jon - make sure to not iterate off the end of the · 4b247a8d
      Robert Ricci authored
      list, which could happen if we somehow saw an ack for a packet whose
      send() we missed (i.e., the kernel dropped it on the way in to pcap).
    • Robert Ricci's avatar
      Bug fixes: · fbda5f5d
      Robert Ricci authored
      Fix range check - this was necessary because of my change to the
      ackFor calculation.
      Tricky detail about STL list.erase() - it does *not* erase the end iterator,
      which meant that we were failing to remove the packet being acked (we were
      removing all packets up to that one). So, we have it increment the iterator
      in localAck() even if we find the one we're looking for.
      Tested well for lossless connections - still needs testing for lossy
      connections and SACK (see below). Hopefully, it works for the former,
      but I know it doesn't work for the latter.
      Added more debugging output to localAck()
      Added a check which should warn us if we see any SACKs, though I
      can't be sure it works, because I haven't seen any yet (that I know
      of! :)
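      The half-open behavior of the range form of `std::list::erase()` described in this commit can be shown with a small sketch. `Packet`, its `seq` field, and `ackThrough` are hypothetical stand-ins, not the names used in localAck().

```cpp
#include <iterator>
#include <list>

// Hypothetical stand-in for an unacknowledged packet record.
struct Packet {
    unsigned seq;
};

// Remove every packet up to AND including the one with sequence number
// `acked`. Returns false (and removes nothing) if that packet is not in
// the list, e.g. because its send() was never captured.
bool ackThrough(std::list<Packet>& unacked, unsigned acked) {
    std::list<Packet>::iterator it = unacked.begin();
    while (it != unacked.end() && it->seq != acked) {
        ++it;
    }
    if (it == unacked.end()) {
        return false;  // never walk past the end of the list
    }
    // erase(first, last) removes [first, last): `last` itself survives,
    // so step one past the acked packet before erasing.
    unacked.erase(unacked.begin(), std::next(it));
    return true;
}
```

      Passing `it` rather than `std::next(it)` as the second argument would leave the acked packet in the list, which is the bug this commit describes.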
    • Kevin Atkinson's avatar
      · 018dd59b
      Kevin Atkinson authored
      Small script to set the cloudinfo from an input file.  Only works
      on elab experiments for now.
    • Mike Hibler's avatar
      Simple perl script to grab latency/bw records from DB and print them out: · 4b9ba9ec
      Mike Hibler authored
      usage: showsamples.pl [-Bbdl] <srcix>|all <dstix>|all
         show database records for given site indices
             -B        show both srcix -> dstix and dstix -> srcix
             -b        show bandwidth
             -d        show delay (the default)
             -l        show loss
             -n <num>  show only the last <num> records
             -S <time> show records no later than unix timestamp <time>
      usage: showsamples.pl -e pid/eid
         show mapping of name->ix for all plab nodes in <pid>/<eid>
    • Robert Ricci's avatar
      Major re-work of the handleTcp() function, which had a fundamentally · e93ed0f4
      Robert Ricci authored
      wrong interpretation of how part of TCP works.
      Being an ACK and being a data packet are *not* mutually exclusive,
      though the old code assumed that they were.
      There are basically four things we might need to do:
      If outgoing
          (a) Handle an outgoing ACK
          (b) Handle an outgoing data packet
      If incoming
          (c) Handle an incoming ACK
          (d) Handle an incoming data packet
      Note that a and b can be done on the same packet, as well as c and d. Right
      now, however, our code only handles b and c. We will need to support a and
      d before we can model real applications, I think!
      Also, make handleTcp() more robust.
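      The four cases (a) through (d) can be expressed as independent flags rather than an either/or choice. The sketch below is illustrative only: `TcpPacket`, `Actions`, and `classify` are hypothetical, and it assumes a packet counts as data when it carries a non-empty payload and as an ACK when the ACK bit is set.

```cpp
#include <cstddef>

// Hypothetical, simplified view of a captured TCP segment.
struct TcpPacket {
    bool outgoing;           // true if sent by the monitored host
    bool ackFlag;            // TCP ACK bit
    std::size_t payloadLen;  // bytes of application data
};

// Which of the four handlers (a)-(d) apply to one packet.
struct Actions {
    bool outgoingAck;   // (a)
    bool outgoingData;  // (b)
    bool incomingAck;   // (c)
    bool incomingData;  // (d)
};

// ACK-ness and data-ness are computed separately, so one packet can
// trigger both (a) and (b), or both (c) and (d).
Actions classify(const TcpPacket& p) {
    const bool isAck = p.ackFlag;
    const bool isData = p.payloadLen > 0;
    Actions a;
    a.outgoingAck = p.outgoing && isAck;
    a.outgoingData = p.outgoing && isData;
    a.incomingAck = !p.outgoing && isAck;
    a.incomingData = !p.outgoing && isData;
    return a;
}
```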
    • Robert Ricci's avatar
    • Kevin Atkinson's avatar
      · 9b718661
      Kevin Atkinson authored
      Avoid counting planetlab vnodes twice.
    • Kevin Atkinson's avatar
      · 815e21b0
      Kevin Atkinson authored
      tbreport related changes from Mike Kasick <mkasick@andrew.cmu.edu>:
      - assign_wrapper2 now passes violations to the tbreport error parser.
      - Fixed improper regexp in assign_wrapper2 tbreport error parser.
      - Readded create_vlan_failed error in snmpit (left out of original commit
        for unknown reason).
      - Modified endexp to use libtblog, added tbswap_out_failed error.
    • Dan Gebhardt's avatar
    • Robert Ricci's avatar
      Bug fix for my last commit - still open pcap socket if we're saving · 01acaff8
      Robert Ricci authored
      replay information.
    • Robert Ricci's avatar
      Don't open the pcap device if doing replay · e0866db5
      Robert Ricci authored
    • Robert Ricci's avatar
      Typo police · df01a7e9
      Robert Ricci authored
  5. 18 Aug, 2006 14 commits
  6. 17 Aug, 2006 2 commits
    • Kirk Webb's avatar
      · f1fa5a51
      Kirk Webb authored
      New plab vnode monitor framework, now with proactive node checking action!
      The old monitor has been completely replaced.  The new one uses modular pools
      to test and track plab nodes.  There are currently two pool modules:
      good and bad.  The good pool tests nodes that are not known to have
      issues to proactively find problems and push nodes into the "bad" pool
      when necessary.  The bad pool acts similarly to the old plabmonitor; it
      does an end-to-end test on nodes, and if and when they finally come up,
      moves them to the good pool.  Both pools have a testing backoff mechanism
      that works as follows:
        * The node is tested right away upon entering either pool
        * Node fails to setup:
          * goodpool: node is sent to bad pool (hwdown)
          * badpool:  node is scheduled to be retested according to
                      an additive backoff function, maxing out at 1 hour.
        * Node setup succeeds:
          * goodpool: node is scheduled to be retested according to
                      an additive backoff function, maxing out at 1 hour.
          * badpool:  node is moved to good pool.
      The backoff thing may be bogus, we'll see.  It seems like a reasonable thing
      to do though - no need to hammer a node with tests if it consistently
      succeeds or fails.  Nodes that flop back and forth will get the most
      testing punishment.  A future enhancement will be to watch for flopping
      and force nodes that exhibit this behavior to pass several consecutive
      tests before being eligible for return back into the good pool.
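      The pool transitions and additive backoff described above could be modeled along these lines. This is a sketch under stated assumptions: the 300-second step is invented for illustration (the commit only gives the 1 hour cap), and `NodeState`, `Pool`, and `applyResult` are hypothetical names; the real monitor is written in Perl.

```cpp
#include <algorithm>

enum class Pool { Good, Bad };

struct NodeState {
    Pool pool;
    int retestDelaySec;  // 0 means "test right away"
};

const int kBackoffStepSec = 300;  // assumed step; not given in the commit
const int kBackoffMaxSec = 3600;  // "maxing out at 1 hour"

// Apply one test result to a node and return its updated state.
NodeState applyResult(NodeState n, bool setupSucceeded) {
    const bool staysInPool = (n.pool == Pool::Good) == setupSucceeded;
    if (staysInPool) {
        // good+success or bad+failure: back off additively, capped.
        n.retestDelaySec =
            std::min(n.retestDelaySec + kBackoffStepSec, kBackoffMaxSec);
    } else {
        // good+failure moves to the bad pool; bad+success to the good pool.
        n.pool = (n.pool == Pool::Good) ? Pool::Bad : Pool::Good;
        n.retestDelaySec = 0;  // tested right away on entering a pool
    }
    return n;
}
```

      A node that flips between pools keeps resetting its delay to zero, which matches the remark that flopping nodes get the most testing.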
      The monitor only allows a configurable window's worth of outstanding
      tests to go on at once.  When tests finish, more node tests are allowed
      to start up right away.
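      A bounded window of outstanding tests might look like the following sketch. `TestWindow` and its methods are hypothetical; the actual monitor tracks child processes, which is not modeled here.

```cpp
#include <cstddef>
#include <deque>
#include <string>
#include <vector>

// Hypothetical scheduler that keeps at most `window` tests running.
class TestWindow {
public:
    explicit TestWindow(std::size_t window) : window_(window), running_(0) {}

    void enqueue(const std::string& node) { pending_.push_back(node); }

    // Start pending tests until the window is full; returns the nodes started.
    std::vector<std::string> fill() {
        std::vector<std::string> started;
        while (running_ < window_ && !pending_.empty()) {
            started.push_back(pending_.front());
            pending_.pop_front();
            ++running_;
        }
        return started;
    }

    // Record that one test finished, then immediately refill the window.
    std::vector<std::string> finishOne() {
        if (running_ > 0) {
            --running_;
        }
        return fill();
    }

    std::size_t running() const { return running_; }

private:
    std::size_t window_;
    std::size_t running_;
    std::deque<std::string> pending_;
};
```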
      Some refactoring needs to be done.  Currently the good and bad pools share
      quite a bit of duplicated code.  I don't know if I dare venture into
      inheritance with perl, but that would be a good way to approach this.
      Some other pool module ideas:
      * dynamic setup pools
      When experiments w/ plab vnodes are swapped in, use the plab monitor to
      manage setting up the vnodes by dynamically creating pools on a per-experiment
      basis.  This has the advantage that the monitor can keep a global cap on
      the number of outstanding setup operations.  These pools might also try to
      bring up vnodes that failed to setup during swapin later on, along with other
      vnode monitoring tasks.
      * "all nodes" pools
      Similar to the dynamic pools just mentioned, but with the mission to extend
      experiments to all plab nodes possible (as nodes come and go).  Useful for
    • Jonathon Duerig's avatar
      Rationalized Rob's previous checkin with mine to remove the additional... · 70cbdf5e
      Jonathon Duerig authored
      Rationalized Rob's previous checkin with mine to remove the additional dependencies that I had made to the now defunct IpHeader