  1. Dec 12, 2011
  2. Dec 06, 2011
  3. Dec 05, 2011
    • David Johnson's avatar
      Fixup the last commit so newer-style linux shaping with netem works. · 72fb9e3a
      David Johnson authored
      I didn't know this initially, but it turns out that with newer netems,
      you can't add another netem qdisc instance (nor an htb instance)
      inside another netem instance.  The linux maintainers removed
      "classful" qdisc support from the netem qdisc (which makes it
      possible for another qdisc to be "nested" inside a netem qdisc)
      because 1) netem couldn't even nest an instance of itself inside
      itself (which isn't strictly necessary for us because we can do
      both delay and plr in one netem instance), and 2) because apparently
      non-work-conserving qdiscs already didn't work inside netem (a
      work-conserving qdisc is one that always has a packet ready when its
      underlying device is ready to transmit a packet -- thus, a
      bandwidth-shaping qdisc that might not have a packet ready because
      it's slowing down the send rate is non-work-conserving), and 3) to
      support code cleanups.
      
      So -- what this means for us is that by using modern netem, we are now
      doing bandwidth shaping first, then plr and delay.  With our old
      custom kernel modules, we were doing plr, delay, then bandwidth.
      
      I talked this strategy over with Jon (because adding classful support
      back to netem is nontrivial and defeats the point of trying to use
      what's in the kernel directly without patching it more), and we believe
      it's ok to do -- first because it doesn't always change the shaped rate
      from the old way we used to do things, and second because using these
      params *in tandem* to do link shaping is kind of a poor man's way
      of actually modeling real link behavior -- a la flexlab.
      
      So we'll just document it for users, call it beta for now, and test
      it against the old way and BSD.  If it looks reasonable, we'll stick
      with it; otherwise we'll look at reviving the old style.
      72fb9e3a
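A minimal sketch of one way the newer-style arrangement described in 72fb9e3a could be expressed with stock `tc`: an htb qdisc at the root for bandwidth, with a single netem leaf carrying both delay and loss, since nothing can be nested inside netem. The function name, handles, and class ids are hypothetical, and this is not taken from the commit itself; the sketch only builds the command lines and does not run them.

```python
def shaping_commands(dev, rate, delay, loss_pct):
    """Build tc command lines for an htb root with one netem leaf.

    Assumption: handles "1:" and "10:" and class "1:1" are arbitrary
    illustrative choices, not values used by the real scripts.
    """
    return [
        # Root qdisc: htb does the bandwidth shaping.
        ["tc", "qdisc", "add", "dev", dev, "root",
         "handle", "1:", "htb", "default", "1"],
        # One htb class carrying the configured rate.
        ["tc", "class", "add", "dev", dev, "parent", "1:",
         "classid", "1:1", "htb", "rate", rate],
        # Leaf qdisc: a single netem instance does both delay and loss.
        ["tc", "qdisc", "add", "dev", dev, "parent", "1:1",
         "handle", "10:", "netem", "delay", delay, "loss", f"{loss_pct}%"],
    ]


if __name__ == "__main__":
    for cmd in shaping_commands("eth0", "10mbit", "20ms", 1):
        print(" ".join(cmd))
```

Because netem is always the leaf here, no qdisc is ever attached beneath it, which is the constraint the commit message describes.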
  4. Dec 02, 2011
  5. Dec 01, 2011
  6. Nov 30, 2011
  7. Nov 29, 2011
    • Leigh B Stoller's avatar
      Fix bug that was causing reserved vlantags to be left behind, causing · 235db86c
      Leigh B Stoller authored
      snmpit to fail at seemingly random times. Also add an update script
      to delete the stale tags.
      235db86c
    • David Johnson's avatar
      Support using Linux netem modules for delay and loss shaping. · 35f1deaa
      David Johnson authored
      ... instead of using our custom kernel modules.  I got tired of
      pulling our patches forward and adapting to the packet sched API
      changes in the kernel!  netem is more advanced than our stuff,
      anyway, and should do a fine job.
      35f1deaa
    • David Johnson's avatar
      Lots of changes: debug; macvlans; details below. · fdf97b51
      David Johnson authored
      I added debug options for each LVM and vzctl call; you can toggle it
      on by touching /vz/.lvmdebug, /vz.save/.lvmdebug, /.lvmdebug, and
      /vz/.vzdebug, /vz.save/.vzdebug, /.vzdebug.  I also added dates to
      debug timestamps for debugging longer-term shared node problems.
      
      I added support for using macvlan devices instead of openvz veths
      for experiment interfaces.  Basically, you can add macvlan devices
      atop any other ethernet device to "virtualize" it using fake mac
      addresses.  We use them like this: if the virtual link/lan needs to
      leave the vhost on a phys device or vlan device, we attach the macvlan
      devices to the appropriate real device.  If the virtlan is completely
      internal to the vhost, we create a dummy ethernet device and attach
      the macvlan devices to that.
      
      The difference between macvlan devices and veths is that macvlan
      devices are created only in the root context, and are moved into
      the container context when the vnodes boot.  There is no "root
      context" half -- the device is fully in the container's network
      namespace.  BUT, the underlying device is in the root network
      namespace.
      
      We use macvlans in "bridge" mode, so that when one macvlan device sends
      a packet, the device driver checks any other macvlan devices attached
      to the underlying physical, vlan, or dummy device, and delivers the packet
      accordingly.  The difference between this fake bridge and a real bridge
      is that the macvlan driver knows the mac of each attached interface,
      and does not have to do any learning whatsoever.  I haven't looked at
      the code, but it should be a very, very simple, fast, and zero-copy
      transmit from one macvlan device onto another.
      
      This is essentially the same as the planetlab shortbridge, but since
      I haven't looked at the code, I can't say that there aren't more
      opportunities to optimize.  Still, this should hopefully be faster
      than openvz veths.
      
      Oh, and I also added support for using Linux tc's netem modules
      for doing delay and loss shaping, instead of using our custom
      kernel modules.  I got tired of pulling our patches forward and
      adapting to the packet sched API changes in the kernel!  netem is
      more advanced than our stuff, anyway, and should do a fine job.
      fdf97b51
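A rough sketch of the macvlan setup described in fdf97b51, using standard iproute2 `ip link` syntax: attach a bridge-mode macvlan to a real device when the link leaves the vhost, or to a dummy device when the virtual LAN is internal. The function name and the `dummy0` device name are hypothetical; moving the device into the container is not shown because the commit does not say how that step is done. The sketch only builds command lines.

```python
def macvlan_commands(name, mac, lower=None):
    """Build ip-link command lines for one bridge-mode macvlan device.

    lower is the underlying physical or vlan device; when it is None the
    virtual LAN is assumed to be internal to the vhost, so a dummy
    ethernet device is created first (the name "dummy0" is illustrative).
    """
    cmds = []
    if lower is None:
        lower = "dummy0"
        cmds.append(["ip", "link", "add", lower, "type", "dummy"])
        cmds.append(["ip", "link", "set", lower, "up"])
    # Bridge mode lets macvlans on the same lower device reach each other.
    cmds.append(["ip", "link", "add", "link", lower, "name", name,
                 "address", mac, "type", "macvlan", "mode", "bridge"])
    return cmds


if __name__ == "__main__":
    for cmd in macvlan_commands("mv0", "02:01:0a:01:02:03"):
        print(" ".join(cmd))
```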
  8. Nov 28, 2011
    • David Johnson's avatar
      Fix a couple echo strings. · afed8661
      David Johnson authored
      afed8661
    • David Johnson's avatar
      Add build_fake_macs and use it in getifconfig. · fe6c2807
      David Johnson authored
      build_fake_macs generates fake mac addresses for the inside and outside
      halves of a veth.  For openvz vnodes, we have to uniquely address
      both halves.  tmcd gives us the vmac for the inside of the container;
      it is basically 00:00:ipOct0:ipOct1:ipOct2:ipOct3.  Normally, for openvz
      veths, this works fine, because only the inside of the container ever sees
      the vmac.  BUT, if we're not using openvz veths (i.e., using macvlan devices),
      we might not have inside/outside halves of the veth.  Consequently, we
      have to give the device a mac addr that is unique in both the
      root and container contexts.  This is trivial for non-shared vhosts, but
      if the vhost is shared, we can't just use the vmac as specified above.  So,
      we do the following:
      
          # We have to set the locally administered bit (0x02) in the first
          # octet, and we can't set the unicast/multicast bit (0x01).  So
          # we have the first two octets to play with, minus those two bits,
          # leaving us with 14 total bits.  But then, for veths, we need
          # a MAC for the root context, and for the container.  So there goes
          # another bit.
          #
          # So, what we're going to do is, if the vmid fits in 13 bits,
          # take the 5 MSB and shift them into bits 3-7 of the first octet,
          # and take the 8 LSB and make them the second octet.  Then, we
          # always set bit 2, and the container MAC gets bit 8 set.
      
      Of course, this requires getifconfig to check for these "hacked" vmacs
      when ifsetup configures interfaces inside the container -- so now
      getifconfig checks for these special hacked vmacs if it can't find
      a device with the vmac itself.  Good times...
      fe6c2807
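The bit layout in the fe6c2807 comment can be sketched as follows. This is an interpretation, not the actual build_fake_macs code: it reads the bit numbers as 1-based from the least significant bit (bit 1 = 0x01, bit 2 = 0x02, bit 8 = 0x80), assumes the last four octets stay the IP octets as in the tmcd vmac, and rejects vmids that do not fit in 13 bits, a case the message leaves unspecified. The function name is hypothetical.

```python
def sketch_fake_macs(vmid, ip):
    """Return (root_mac, container_mac) for a vmid and dotted-quad IP.

    Assumed layout of the first octet (bits numbered from 1 at the LSB):
      bit 1  (0x01) multicast bit, always clear
      bit 2  (0x02) locally administered bit, always set
      bits 3-7      the 5 most significant bits of the 13-bit vmid
      bit 8  (0x80) set only for the container-side MAC
    The second octet holds the 8 least significant bits of the vmid.
    """
    if not 0 <= vmid < (1 << 13):
        raise ValueError("vmid does not fit in 13 bits")
    ip_octets = [int(part) for part in ip.split(".")]
    if len(ip_octets) != 4 or any(not 0 <= o <= 255 for o in ip_octets):
        raise ValueError("expected a dotted-quad IPv4 address")

    first = 0x02 | ((vmid >> 8) << 2)   # local bit plus the 5 MSBs of vmid
    second = vmid & 0xFF                # the 8 LSBs of vmid

    def fmt(first_octet):
        octets = [first_octet, second] + ip_octets
        return ":".join(f"{o:02x}" for o in octets)

    return fmt(first), fmt(first | 0x80)


if __name__ == "__main__":
    print(sketch_fake_macs(1, "10.1.2.3"))
```

With this layout the root and container MACs differ only in the top bit of the first octet, so a lookup that fails on the plain vmac can recognize either form by masking that bit.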
    • Leigh B Stoller's avatar
      Bug fix to -m option; Look at the total port set and use · 376a71d2
      Leigh B Stoller authored
      mapPortsToSwitches to get the minimal set of switches needed in the
      stack (including trunk links). This avoids contacting the protogeni
      switches.
      376a71d2
    • Leigh B Stoller's avatar
      Remove debugging. · 0b284ba8
      Leigh B Stoller authored
      0b284ba8
    • Leigh B Stoller's avatar
      Bug fix to -m option; Look at the total port set and use · a86487f6
      Leigh B Stoller authored
      mapPortsToSwitches to get the minimal set of switches needed in the
      stack (including trunk links). This avoids contacting the protogeni
      switches.
      a86487f6
    • Leigh B Stoller's avatar
      Apply fix from snmpit_new. · 7f584271
      Leigh B Stoller authored
      7f584271
    • Leigh B Stoller's avatar
      Minor bug fix to syncVlansFromTables(); when there are no old or new · 0ff5e378
      Leigh B Stoller authored
      vlans, skip entirely to avoid contacting all switches for no reason.
      0ff5e378
    • Leigh B Stoller's avatar
      Fix minor typo. · 417e4727
      Leigh B Stoller authored
      417e4727