1. 27 Feb, 2017 2 commits
  2. 16 Feb, 2017 1 commit
    • More robustness improvements for FreeBSD vnodes. · aa9bc39b
      Mike Hibler authored
      Not sure how I got headed down this path, but here I am (a rough
      sketch of the resulting config and commands follows the list):
       * replace use of "ps" and "grep" with, wait for it..."pgrep"!
       * explicitly specify type=vif so we don't wind up with the extra
         vifN.M-emu backend interface that gets left lying around,
       * add the -F option to "xl shutdown", which is needed for HVMs;
         otherwise shutdown will fail, the domain won't go away (qemu left
         behind), and the FBSD filesystem can be messed up,
       * use "hd" instead of "sd" to avoid the emulated SCSI driver, which
         has caused me grief in the past (though it should never actually
         get used due to the PVHVM config of the kernel).
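
      A minimal sketch of what those changes amount to at the xl level; the
      domain name, MAC address, and LV path below are made up, not taken
      from libvnode_xen.pm:

          # Hypothetical xl config fragment: "type=vif" avoids the extra
          # vifN.M-emu backend device, and an "hd" disk name avoids the
          # emulated SCSI driver.
          #   vif  = [ 'mac=00:16:3e:00:00:01,bridge=xenbr0,type=vif' ]
          #   disk = [ 'phy:/dev/xen-vg/pcvm1-1.disk,hda,w' ]

          # -F falls back to an ACPI power event when the guest ignores the
          # PV shutdown request, so the HVM domain (and its qemu device
          # model) really goes away.
          xl shutdown -F pcvm1-1

          # Look for a leftover qemu device model without the ps|grep dance.
          pgrep -lf 'qemu.*pcvm1-1'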
  3. 15 Feb, 2017 1 commit
  4. 18 Nov, 2016 1 commit
  5. 24 Oct, 2016 1 commit
  6. 29 Sep, 2016 1 commit
    • Performance improvements to the vnode startup path. · 87eed168
      Mike Hibler authored
      The biggest improvement happened on day one, when I took out the 20 second
      sleep between vnode starts in bootvnodes. That appears to have been an
      artifact of an older time and an older Xen. Or, someone smarter than me
      saw the potential of getting bogged down for, oh, say, three weeks trying
      to micro-optimize the process and instead just went for the conservative fix!
      
      Following day one, the ensuing couple of weeks were a long, strange trip to
      find the maximum number of simultaneous vnode creations that could be done
      without failure. In that time I tried a lot of things, generated a lot of
      graphs, produced and tweaked a lot of new constants, and in the end wound
      up with the same two magic numbers (3 and 5) that were in the original code!
      To distinguish myself, I added a third magic number (1, the loneliest of
      them all).
      
      All I can say is that now, the choice of 3 or 5 (or 1), is based on more
      solid evidence than before. Previously it was 5 if you had a thin-provisioning
      LVM, 3 otherwise. Now it is based more directly on host resources, as
      described in a long comment in the code, the important part of which is:
      
       #
       # if (dom0 physical RAM < 1GB) MAX = 1;
       # if (any swap activity) MAX = 1;
       #
       #    This captures pc3000s/other old machines and overloaded (RAM) machines.
       #
       # if (# physical CPUs <= 2) MAX = 3;
       # if (# physical spindles == 1) MAX = 3;
       # if (dom0 physical RAM <= 2GB) MAX = 3;
       #
       #    This captures d710s, Apt r320, and Cloudlab m510s. We may need to
       #    reconsider the latter since its single drive is an NVMe device.
       #    But first we have to get Xen working with them (UEFI issues)...
       #
       # else MAX = 5;
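
      As a rough illustration only (the probes, thresholds, and function name
      below are mine, and the spindle-count check is omitted), the heuristic
      boils down to something like:

          # Illustrative sketch for a Linux dom0, not the actual bootvnodes code.
          max_vnode_creations() {
              mem_mb=$(( $(awk '/^MemTotal:/ {print $2}' /proc/meminfo) / 1024 ))
              ncpu=$(nproc)
              swapped=$(awk '/^pswpout/ {print $2}' /proc/vmstat)

              if [ "$mem_mb" -lt 1024 ] || [ "${swapped:-0}" -gt 0 ]; then
                  echo 1      # old or RAM-overloaded machines
              elif [ "$ncpu" -le 2 ] || [ "$mem_mb" -le 2048 ]; then
                  echo 3      # small hosts (d710, r320, m510 class)
              else
                  echo 5
              fi
          }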
      
      In my defense, I did fix some bugs and stuff too (and did I mention
      the cool graphs?). See the comments in the code and GitLab
      emulab/emulab-devel issue #148.
  7. 02 Sep, 2016 1 commit
  8. 31 Aug, 2016 1 commit
  9. 12 Aug, 2016 1 commit
  10. 28 Jul, 2016 1 commit
  11. 21 Jul, 2016 1 commit
  12. 07 Jun, 2016 1 commit
  13. 09 Feb, 2016 1 commit
  14. 04 Jan, 2016 1 commit
  15. 21 Dec, 2015 1 commit
  16. 01 Dec, 2015 1 commit
    • Add an "interruptible" option to TBScriptLock(). · 08ce72b6
      Leigh B Stoller authored
      When set, each time through the loop we look to see if signals are
      pending, and if so we return early with an error. The caller
      (libvnode_xen) can use this to avoid really long waits when the server
      has said to stop what it is doing. For example, a vnode setup is
      waiting for an image lock, but the server comes along and tells it to
      stop setting up. Previously we would wait for the lock; now we return
      early. This helps with cancellation, where it is nice if the server
      can stop a CreateSliver() in its tracks when it is safe to do so.
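
      A shell analogy of the idea (the real TBScriptLock() is Perl, and the
      lock file path here is invented): wait for the lock in short slices
      and bail out as soon as the server signals us to stop.

          exec 9>/var/emulab/lock/image.lock
          stop=0
          trap 'stop=1' TERM INT

          until flock -n 9; do
              if [ "$stop" -ne 0 ]; then
                  echo "interrupted while waiting for image lock" >&2
                  exit 1
              fi
              sleep 1
          done
          # ... proceed with vnode setup while holding the lock ...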
  17. 24 Nov, 2015 2 commits
  18. 20 Nov, 2015 1 commit
  19. 27 Oct, 2015 3 commits
  20. 14 Oct, 2015 1 commit
  21. 26 Aug, 2015 1 commit
  22. 04 May, 2015 1 commit
  23. 24 Apr, 2015 1 commit
  24. 27 Mar, 2015 1 commit
  25. 05 Mar, 2015 1 commit
    • Revamp Xen vnode code to take advantage of "Xen mode" in capture. · 7a59bc05
      Mike Hibler authored
      A per-domain capture process is now started up in vnodeCreate and
      shut down in vnodeDestroy. It should remain running for the entire
      time in between (across reboots, etc.).
      
      This should help ensure that you don't ever miss your favorite console
      output, even the thrilling early-stage boot messages!
  26. 04 Mar, 2015 2 commits
  27. 03 Mar, 2015 1 commit
    • Numerous fixes to vnode code based on testing. · ee834259
      Mike Hibler authored
       * Create a modest partition 4 (1G), since we need some space for
         local FSes.
       * Rename many of the LVM routines to follow a common naming scheme.
       * Make an extra effort to remove partitions on an LV; something about
         a FreeBSD VM makes kpartx forget one of its partitions (see the
         sketch after this list).
       * Handle deltas on whole-disk images; some code was missing for this
         case.
       * Make sure the multi-image case works; the golden image was being
         named incorrectly.
       * Make sure the extra FS case works; for the golden image case we
         were using the golden image and ignoring what was specified for
         the extra FS.
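
      The kpartx workaround amounts to retrying the teardown; a rough sketch
      (the VG/LV names are made up, and the real code lives in the Perl LVM
      routines):

          # LVM doubles hyphens in /dev/mapper names, hence "pcvm1--1" below.
          for try in 1 2 3; do
              kpartx -dv /dev/xen-vg/pcvm1-1.disk
              # done once only the base LV mapping is left, i.e. no
              # partition mappings (base name plus a partition number) remain
              [ "$(ls /dev/mapper/xen--vg-pcvm1--1.disk* 2>/dev/null | wc -l)" -le 1 ] && break
              sleep 1
          done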
  28. 24 Feb, 2015 1 commit
  29. 23 Feb, 2015 1 commit
  30. 20 Feb, 2015 1 commit
  31. 19 Feb, 2015 3 commits
  32. 17 Feb, 2015 1 commit
    • Major overhaul to support thin snapshot volumes and also fixup locking. · a9e75f33
      Mike Hibler authored
      A "thin volume" is one in which storage allocation is done on demand; i.e.,
      space is not pre-allocated, hence the "thin" part. If thin snapshots and
      the associated base volume are all part of a "thin pool", then all snapshots
      and the base share blocks from that pool. If there are N snapshots of the
      base, and none have written a particular block, then there is only one copy
      of that block in the pool that everyone shares.
      
      Anyway, we now create a global thin pool in which the thin snapshots can be
      created. We currently allocate up to 75% of the available space in the VG
      to the pool (note: space allocated to the thin pool IS statically allocated).
      The other 25% is for Things That Will Not Be Shared and serves as a
      fallback in case something on the thin volume path fails. That is, we
      can disable thin volume creation and go back to the standard path.
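
      In LVM terms the pool setup looks roughly like this (the VG name is
      made up, and the real code computes the size rather than passing a
      percentage on the command line):

          # Give the thin pool ~75% of the free space in the volume group.
          # This space IS allocated to the pool up front, but the thin
          # volumes and snapshots inside it allocate blocks on demand.
          lvcreate -l 75%FREE -T xen-vg/thinpool
          lvs xen-vg      # sanity check: pool size and data usage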
      
      Images are still downloaded and saved in compressed form in individual
      LVs. These LVs are not allocated from the pool since they are TTWNBS.
      
      When the first vnode comes along that needs an image, we imageunzip the
      compressed version to create a "golden disk" LV in the pool. That first
      node and all subsequent nodes get thin snapshots of that volume.
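
      Roughly, in LVM/imageunzip terms (the image and LV names are invented;
      the real naming scheme is in libvnode_xen.pm):

          # The compressed image sits in a regular (non-pool) LV. Unzip it
          # once into a thin "golden" volume in the pool...
          lvcreate -V 16G -T xen-vg/thinpool -n FBSD102-64.golden
          imageunzip /dev/xen-vg/FBSD102-64.ndz /dev/xen-vg/FBSD102-64.golden

          # ...then give each vnode a thin snapshot; blocks are shared with
          # the golden volume until the vnode writes to them.
          lvcreate -s -n pcvm1-1.disk xen-vg/FBSD102-64.golden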
      
      When the last vnode that uses a golden disk goes away we...well,
      do nothing. Unless $REAP_GDS (linux/xen/libvnode_xen.pm) is set non-zero,
      in which case we reap the golden disk. We always leave the compressed
      image LV around. Leigh says he is going to write a daemon to GC all these
      things when we start to run short of VG space...
      
      This speedup in the creation of vnodes that share an image turned up
      some more race conditions, particularly around iptables. I closed a
      couple more holes (in particular, ensuring that we lock iptables when
      setting up enet interfaces as we do for the cnet interface) and added
      some optional lock debug logging (turned off right now).
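
      As a shell-level analogy of the locking (the real code serializes
      through TBScriptLock(); the lock file and the example rule are made
      up): every concurrent vnode setup funnels its iptables changes through
      the same lock, for enet interfaces just as for the cnet interface.

          flock /var/emulab/lock/iptables -c \
              'iptables -A FORWARD -i vif5.1 -o xenbr0 -j ACCEPT'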
      
      Timestamped those messages, and a variety of other important ones, so
      that we could merge (the important parts of) the assorted logfiles and
      get a sequential picture of what happened:
      
          grep TIMESTAMP *.log | sort +2
      
      (Think of it as Weir lite!)
  33. 01 Feb, 2015 1 commit