1. 09 Aug, 2016 6 commits
  2. 08 Aug, 2016 2 commits
      Handle control net on newer systemd/udevds correctly on Ubuntu. · 9c456003
      David Johnson authored
      Prior to this commit, in Ubuntu 16, our control net hook was getting
      invoked accidentally by udev rules that look for bridge ports or vlan
      ports via ifquery.  Those rules call ifquery -l, but do not add
      the --no-mappings argument to skip mapping processing --- and thus our
      mapping hook got run.  But it was not getting run via systemd's
      networking.service, which is where it needs to run.  That service
      guarantees that udev has 'settled' (flushed the event queue it
      accumulated during boot), which is important for devices with slow
      firmware/drivers/etc.
      
      Sadly, our mapping hook could *not* get run by the normal
      networking.service, because we cannot predict the control net device
      name (the possibilities are determined now by hardware and firmware, and
      could range from enoX to enpXsYfZdA).  ifup -a requires that the real
      device name be present and be set to auto in /etc/network/interfaces.
      You can run ifup -a --force to bring up a non-existent device, but you
      cannot bring it down with ifdown.  Interestingly enough, ifquery does
      not require that all 'auto' devices it returns be real devices, and
      that's why things were working.
      
      First, we have to make sure our findcnet hook does not run via the
      builtin udev rules.  That's easy; we fixed up findcnet to look for some
      udev/systemd env vars, and do nothing in that case.  Hopefully we got
      env vars that are always present...
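
      A minimal sketch of the kind of guard described above, written in
      Python purely for illustration (findcnet itself is not shown here).
      The variable names checked (ACTION, DEVPATH, SUBSYSTEM, SEQNUM) are
      an assumption based on what udev normally exports to programs it
      spawns; the commit does not say which variables findcnet tests, and
      invoked_by_udev/should_run_hook are hypothetical names.

```python
import os

# Assumption: variables udev typically exports to the programs it runs.
# The commit does not say which variables findcnet actually checks.
UDEV_HINT_VARS = ("ACTION", "DEVPATH", "SUBSYSTEM", "SEQNUM")


def invoked_by_udev(env=None):
    """Return True if the environment looks like a udev-spawned process."""
    env = os.environ if env is None else env
    # Require DEVPATH plus at least one other hint to limit false positives.
    if "DEVPATH" not in env:
        return False
    return any(name in env for name in UDEV_HINT_VARS if name != "DEVPATH")


def should_run_hook(env=None):
    """The mapping hook does nothing when udev triggered it."""
    return not invoked_by_udev(env)
```

      Requiring two variables rather than one is a design choice of this
      sketch, meant to hedge against a stray DEVPATH in an ordinary shell.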
      
      There are basically 3 strategies we can try after that.  We can make
      our own networking-emulab.service that brings up and down the Emulab
      control net, and make networking.service pull that in.  This way,
      'service networking restart' or 'systemctl restart networking.service'
      would still work.  However, ifup/ifdown would not work, because the
      control net iface is not present in /etc/network/interfaces.  So nix
      that.
      
      Two other options require us to dynamically edit /etc/network/interfaces
      on first boot of a debian/systemd machine, to place all ethernet devices
      into it along with our mapping hook and set them to auto, *and* to
      remove those customizations in prepare.  This sort of sucks, but it
      doesn't suck much worse than if prepare fails in some other part of the
      process.  What is more, we can make it suck less by always checking to
      assure ourselves that the real control net device is present in
      /etc/network/interfaces, and is present on the system.  If we encounter
      anything to the contrary, we can recreate the Emulab section from
      scratch.  Thus if there are prepare failures, the image will still boot
      because any inconsistent cruft will get wiped away.  We can do this
      either by adding a networking-emulab.service that runs and finishes
      prior to networking.service, OR we can add a udev rule that calls a
      script to ensure all ethernet devices are added to
      /etc/network/interfaces prior to running.  At this point, I favor the
      latter approach, if we can guarantee that it finishes prior to anything
      looking at /etc/network/interfaces.  We can't guarantee anything about
      udev events being "finished" for a subsystem, AFAIK.
      
      Finally (and the best way), we can use yet another interfaces(5)
      mechanism and some strategic udev rules of our own!  We add udev rules
      (/etc/udev/rules.d/99-emulab-control-network.rules) that populate the
      /run/emulab-interfaces.d-auto-added dir (listed as a source dir in
      /etc/network/interfaces for the ifup/ifdown/ifquery commands below) with
      files that contain simply 'auto <IFACE>'.  Those rules are careful to do
      only that for certain valid wired Ethernet devices (and deliberately not
      wireless devices!).  Then, once we've got 'auto ...' stanzas for each
      possible Ethernet device, we can continue to utilize the mapping stanzas
      below like previous versions of this file did.  And we don't have
      anything to clean out on reboot or on image capture, because /run is
      automatically cleared.  ifup/ifdown/ifquery are not bothered by the
      absence of the sourced directory in /run, if that didn't exist for any
      reason.
      
      If you need to add another foo* device name, you'll need to edit the
      interfaces file (with another mapping stanza) and update the match rules
      in 99-emulab-control-network.rules .
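
      To make the mechanism above concrete, here is a rough Python sketch
      of what a helper run from such a udev rule might do: write a
      one-line 'auto <IFACE>' file into the sourced directory for wired
      devices only.  The helper name add_auto_stanza, the prefix list, and
      the one-file-per-interface naming are assumptions for illustration;
      the actual rules file is not reproduced in this commit message.

```python
import os

# Assumption: name prefixes treated as wired Ethernet for this sketch.
WIRED_PREFIXES = ("eth", "en")

# Directory named in the commit message; cleared automatically with /run.
AUTO_DIR = "/run/emulab-interfaces.d-auto-added"


def add_auto_stanza(iface, directory=AUTO_DIR):
    """Write an 'auto <iface>' file for a wired device.

    Returns the path written, or None if the device is skipped.
    """
    if not iface.startswith(WIRED_PREFIXES):
        return None  # e.g. wireless devices are deliberately left out
    os.makedirs(directory, exist_ok=True)
    path = os.path.join(directory, iface)
    with open(path, "w") as f:
        f.write("auto %s\n" % iface)
    return path
```

      For example, add_auto_stanza("eno1") would create a file containing
      the single line 'auto eno1', while add_auto_stanza("wlan0") writes
      nothing.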
      9c456003
  3. 06 Aug, 2016 1 commit
  4. 04 Aug, 2016 1 commit
  5. 03 Aug, 2016 4 commits
      Fix findcnet for newer udev reliable fixed device names. · bbb4ebe0
      David Johnson authored
      In the latest udev world, udev generates predictable device names using
      firmware info and/or PCI bus info (e.g., eno1 or enps4f0).  So, we now
      try to run dhclient only on real ethernet devices (i.e., eth*, en*,
      sl*).  There are other kinds of ethernet devices (e.g., wireless, wl*,
      ww*) or virtual devices, but we don't care about finding the control net
      on those.  Might need to add another device name prefix for PV devices
      in Xen guests... we'll see.
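
      The prefix filter described above could be sketched roughly as
      follows, in Python for illustration only.  The prefixes come from
      this message (eth*, en*, sl*); the function names and the sample
      device list are hypothetical, and the real findcnet logic may
      differ.

```python
# Prefixes named in the commit message as "real" ethernet devices.
CANDIDATE_PREFIXES = ("eth", "en", "sl")


def is_candidate(name):
    """Return True if the device name may carry the control net."""
    return name.startswith(CANDIDATE_PREFIXES)


def candidate_ifaces(names):
    """Filter device names down to those worth running dhclient on."""
    return sorted(n for n in names if is_candidate(n))
```

      Given the names lo, wlp3s0, eno1, eth0, wwan0, and virbr0, only
      eno1 and eth0 would be kept.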
      bbb4ebe0
      Workaround dhclient/resolvconf problem in Ubuntu 16. · c2bd98f6
      David Johnson authored
      This replaces the first attempt, which just masked the race condition,
      since I didn't understand what tmcc bossinfo was really doing.  This
      appears to fix it satisfactorily for now; it doesn't seem that we will
      run into the case where the file exists but has no nameserver.
      
        resolvconf on Linux also breaks DNS momentarily via dhclient exit
        hook, or something.  On Ubuntu 16, resolvconf is set up to run via
        dhclient enter hook (the hook redefines make_resolv_conf, which
        dhclient-script eventually executes prior to the exit hook execution).
        For whatever reason, though, sometimes when our exit hook (this
        script) runs, /etc/resolv.conf is a dangling symlink.  I was not able
        to find the source of the asynchronous behavior, so I can't say for sure.
        But sethostname.dhclient is an immediate casualty, because it calls
        tmcc bossinfo(), and the tmcc binary attempts to use res_init and read
        the resolver and use that as boss.  If there is no /etc/resolv.conf
        (or it is a broken symlink into /run, as it is on resolvconf systems
        before resolvconf runs for the first time on boot), res_init will
        return localhost, and there is no way for us in tmcc to know that is
        inappropriate (taking the res_init resolver might not be the best
        choice, but we do not dare to add a special-case rejection of
        localhost in tmcc).
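
      This message does not spell out the workaround itself.  As a hedged
      illustration of the failure mode only, the Python sketch below
      checks whether a resolv.conf path is usable: it must not be a
      dangling symlink and must contain a nameserver line.  The function
      name resolv_conf_usable is hypothetical, and the nameserver check
      goes beyond what the message says is strictly needed.

```python
import os


def resolv_conf_usable(path="/etc/resolv.conf"):
    """Return True if path resolves to a file with a nameserver entry."""
    # os.path.exists follows symlinks, so a dangling symlink yields False.
    if not os.path.exists(path):
        return False
    try:
        with open(path) as f:
            # Only the leading keyword is checked, not the address itself.
            return any(line.split()[:1] == ["nameserver"] for line in f)
    except OSError:
        return False
```

      A caller such as an exit hook could poll this check briefly before
      invoking anything that depends on res_init.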
      c2bd98f6
      Update pubsub port for KEEPALIVE fixes. · a3ea0297
      Mike Hibler authored
      a3ea0297
  6. 02 Aug, 2016 1 commit
  7. 01 Aug, 2016 1 commit
      Small DB changes for supporting secure transfer of datasets between · 43c7c976
      Leigh B Stoller authored
      clusters using credentials to provide permission to access the datasets.
      
      * Add authority_urn to the images table, which is the urn of the origin
        dataset (similar to the slice urn, the Portal mints a credential in
        its namespace, so that the Portal always has permission to do anything
        it wants to the dataset at the remote cluster).
      
      * Add a slot to the apt_datasets table to store a credential from the
        cluster where the dataset lives. This credential gives the owner
        permission to download the dataset, which the portal will delegate to
        any cluster that might need to get that dataset.
      43c7c976
  8. 29 Jul, 2016 8 commits
  9. 28 Jul, 2016 5 commits
  10. 27 Jul, 2016 1 commit
  11. 26 Jul, 2016 4 commits
  12. 22 Jul, 2016 5 commits
  13. 21 Jul, 2016 1 commit