    Handle control net on newer systemd/udevds correctly on Ubuntu. · 9c456003
    David Johnson authored
    Prior to this commit, in Ubuntu 16, our control net hook was getting
    invoked accidentally by udev rules that look for bridge ports or vlan
    ports via ifquery.  Those rules call ifquery -l, but do not add
    the --no-mappings argument to skip mapping processing --- and thus our
    mapping hook got run.  But it was not getting run via systemd's
    networking.service, which is where it needs to run.  That service
    guarantees that udev has 'settled' (flushed the event queue it
    accumulated during boot), which is important for devices with slow
    firmware/drivers/etc.
    
    Sadly, our mapping hook could *not* get run by the normal
    networking.service, because we cannot predict the control net device
    name (the possibilities are determined now by hardware and firmware, and
    could range from enoX to enpXsYfZdA).  ifup -a requires that the real
    device name be present and be set to auto in /etc/network/interfaces.
    You can run ifup -a --force to bring up a non-existent device, but you
    cannot bring it down with ifdown.  Interestingly enough, ifquery does
    not require that all 'auto' devices it returns be real devices, and
    that's why things were working.
    
    First, we have to make sure our findcnet hook does not run via the
    builtin udev rules.  That's easy; we fixed up findcnet to look for some
    udev/systemd env vars, and do nothing in that case.  Hopefully we got
    env vars that are always present...
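    
    A minimal sketch of what such a guard might look like; this is not the
    actual findcnet code.  It assumes the hook is a shell script and that
    checking DEVPATH, SUBSYSTEM, and ACTION (which udev exports to the
    programs it runs) is enough.  The function name invoked_by_udev is
    hypothetical, and the variables the real hook checks may differ.
    
    ```shell
    # Hypothetical guard: succeed only when the environment looks like a
    # udev event (assumption: all three variables are set in that case).
    invoked_by_udev() {
        [ -n "${DEVPATH:-}" ] && [ -n "${SUBSYSTEM:-}" ] && [ -n "${ACTION:-}" ]
    }
    
    # Possible use at the top of the hook (left commented out here):
    # if invoked_by_udev; then exit 0; fi
    ```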
    
    There are basically 3 strategies we can try after that.  We can make
    our own networking-emulab.service that brings up and down the Emulab
    control net, and make networking.service pull that in.  This way,
    'service networking restart' or 'systemctl restart networking.service'
    would still work.  However, ifup/ifdown would not work, because the
    control net iface is not present in /etc/network/interfaces.  So nix
    that.
    
    Two other options require us to dynamically edit /etc/network/interfaces
    on first boot of a debian/systemd machine, to place all ethernet devices
    into it along with our mapping hook and set them to auto, *and* to
    remove those customizations in prepare.  This sort of sucks, but it
    doesn't suck much worse than if prepare fails in some other part of the
    process.  What is more, we can make it suck less by always checking to
    assure ourselves that the real control net device is present in
    /etc/network/interfaces, and is present on the system.  If we encounter
    anything to the contrary, we can recreate the Emulab section from
    scratch.  Thus if there are prepare failures, the image will still boot
    because any inconsistent cruft will get wiped away.  We can do this
    either by adding a networking-emulab.service that runs and finishes
    prior to networking.service, OR we can add a udev rule that calls a
    script to ensure all ethernet devices are added to
    /etc/network/interfaces prior to running.  At this point, I favor the
    latter approach, if we can guarantee that it finishes prior to anything
    looking at /etc/network/interfaces.  We can't guarantee anything about
    udev events being "finished" for a subsystem, AFAIK.
    
    Finally (and the best way), we can use yet another interfaces(5)
    mechanism and some strategic udev rules of our own!  We add udev rules
    (/etc/udev/rules.d/99-emulab-control-network.rules) that populate the
    /run/emulab-interfaces.d-auto-added dir (listed as a source dir in
    /etc/network/interfaces for the ifup/ifdown/ifquery commands below) with
    files that contain simply 'auto <IFACE>'.  Those rules are careful to do
    only that for certain valid wired Ethernet devices (and deliberately not
    wireless devices!).  Then, once we've got 'auto ...' stanzas for each
    possible Ethernet device, we can continue to utilize the mapping stanzas
    below like previous versions of this file did.  And we don't have
    anything to clean out on reboot or on image capture, because /run is
    automatically cleared.  ifup/ifdown/ifquery are not bothered if the
    sourced directory in /run is missing for any reason.
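    
    The per-device step could be sketched roughly as follows.  This is an
    illustration under assumptions, not the contents of
    99-emulab-control-network.rules: the helper name add_auto_iface and its
    arguments are hypothetical, and the real rules may write the files
    differently.  The wireless check relies on the 'wireless' or 'phy80211'
    entries the kernel exposes under /sys/class/net/<IFACE>.
    
    ```shell
    # Hypothetical helper: write an 'auto <IFACE>' file for one interface.
    #   $1 = interface name
    #   $2 = target dir (defaults to the /run dir named above)
    #   $3 = sysfs root (defaults to /sys; overridable for testing)
    add_auto_iface() {
        iface=$1
        dir=${2:-/run/emulab-interfaces.d-auto-added}
        sysfs=${3:-/sys}
        # Deliberately skip wireless devices.
        if [ -e "$sysfs/class/net/$iface/wireless" ] ||
           [ -e "$sysfs/class/net/$iface/phy80211" ]; then
            return 0
        fi
        mkdir -p "$dir" || return 1
        # The file contains simply 'auto <IFACE>'.
        printf 'auto %s\n' "$iface" > "$dir/$iface"
    }
    ```
    
    Called as 'add_auto_iface eno1', this would leave a file named eno1
    containing 'auto eno1' in the sourced directory.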
    
    If you need to add another foo* device name, you'll need to edit the
    interfaces file (with another mapping stanza) and update the match rules
    in 99-emulab-control-network.rules .