    Attempt to safely work around systemd swap service on Ubuntu 16. · 7afd5f24
    David Johnson authored
    systemd.swap is one of systemd's special builtin services.  Basically, swap
    devices are parsed out of fstab, or by examining a disk's GPT.  Any such
    devices are turned into instantiated units.  This happens via the
    systemd-fstab-generator.  Generators in systemd are almost
    uncontrollable.  They run immediately, prior to on-disk unit file
    parsing, and all you can do is disable or replace them.  You cannot
    express dependencies on the resulting units (unless you write your own
    generator).  Generators also run in an impoverished environment (think
    read-only /etc), so we cannot just add another generator that does
    basically what fixup-fstab-swaps does.  Finally, we cannot write a
    template unit file for all swap devices (we would use this to inject a
    blocking dependency so that these swap units don't conflict with us).
    Lennart has recognized the value in this, but thought the implementation
    effort would be pretty hard.  This makes sense, because the generators
    run prior to unit
    file load from disk (and presumably that would nix templates for
    generated units)... and I gather there are other problems as well.
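
    To make the fstab-to-unit mapping concrete, here is a minimal sketch (not
    part of fixup-fstab-swaps) that lists the swap entries in an fstab-format
    file and guesses the resulting unit name for plain device paths.  The
    function names are made up for illustration, and the naming helper only
    handles simple paths like /dev/sda3; for anything else, systemd's own
    escaping (systemd-escape -p --suffix=swap) is the thing to trust.

```shell
# List the swap devices named in an fstab-format file. These are the
# entries a generator would be expected to turn into .swap units.
# Defaults to /etc/fstab; pass another path to inspect a copy.
list_fstab_swaps() {
    fstab="${1:-/etc/fstab}"
    # awk strips leading whitespace, so comment lines have a first field
    # starting with '#'. Field 3 is the filesystem type.
    awk '$1 !~ /^#/ && NF >= 3 && $3 == "swap" { print $1 }' "$fstab"
}

# Naive unit-name guess for plain device paths such as /dev/sda3.
# Assumption: the path contains only letters, digits and slashes; anything
# else needs systemd's real escaping rules.
naive_swap_unit_name() {
    path="${1#/}"
    printf '%s.swap\n' "$(printf '%s' "$path" | tr '/' '-')"
}
```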
    
    This is quite problematic for us because we rely on the ability to
    update /etc/fstab with the name of the real swap device, and to mkswap
    on it.  However, on machines with lots of cores, systemd is at its
    parallelizing best, and inevitably systemd tries to start up one of its
    instantiated swap device units at the same time as our fixup-fstab-swaps
    script is running.
    
    So I've done several things to try to deal with this situation.  First,
    this Ubuntu 16-specific version of fixup-fstab-swaps no longer adds a
    swap line to fstab with options=defaults -- instead it uses
    options=noauto,x-emulab-auto .  The noauto causes systemd's instantiated
    swap units to not automatically run on boot (don't worry, they become
    active if fixup-fstab-swaps swapons them, and thus they get swapped off
    prior to umount -- it is important that this happens, to avoid hangs); but
    our script will swapon the noauto,x-emulab-auto swap partitions as if they'd
    had options=defaults|auto.  What this does break is swapon/off -a -- but
    who cares.  The x-* comment option in fstab is something I didn't know
    about, I'll admit.
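
    The selection step described above could be sketched roughly as follows.
    This is an illustration, not the actual fixup-fstab-swaps code: the
    function name swaps_with_option is hypothetical, and it only picks out
    fstab swap entries carrying a given marker option such as x-emulab-auto.

```shell
# Print the device of every swap entry whose option list (field 4)
# contains the given marker option.
swaps_with_option() {
    fstab="$1"
    marker="$2"
    awk -v marker="$marker" '
        $1 !~ /^#/ && NF >= 4 && $3 == "swap" {
            n = split($4, opts, ",")
            for (i = 1; i <= n; i++)
                if (opts[i] == marker) { print $1; break }
        }' "$fstab"
}

# Usage sketch (needs root, so it is left commented out):
#   for dev in $(swaps_with_option /etc/fstab x-emulab-auto); do
#       swapon "$dev"
#   done
```

    Matching on the whole option rather than a substring avoids picking up
    some other x-* option that merely contains the marker text.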
    
    Second, I've made emulab-fstab-fixup.service Conflict with
    swap.target, but also be pulled in by swap.target!  The hope was that
    this would ensure that our service *always* runs successfully, even if
    it kills off swap.target to "handle" the conflict.  Well, the problem is
    that we need to Conflict with the instantiated swap unit files, not
    swap.target... so I think that isn't really working.  But I left it in
    -- maybe it is helping us win races.
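
    For reference, a unit with that shape might look something like the
    sketch below, written out by a small shell helper.  This is not the
    actual unit file from this commit: the helper name, the ExecStart path,
    and the use of WantedBy= as the "pulled in by" mechanism are all
    assumptions made for illustration.  Only the Conflicts= relationship to
    swap.target comes from the description above.

```shell
# Write an illustrative unit file to the given path (a scratch file in the
# current directory by default, so nothing under /etc is touched).
write_fixup_unit() {
    out="${1:-./emulab-fstab-fixup.service}"
    cat > "$out" <<'EOF'
[Unit]
Description=Emulab fstab swap fixup (illustrative sketch)
Conflicts=swap.target

[Service]
Type=oneshot
# Hypothetical install path; the real location is not stated here.
ExecStart=/usr/local/etc/emulab/fixup-fstab-swaps

[Install]
# Assumed mechanism for being pulled in by swap.target.
WantedBy=swap.target
EOF
}
```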
    
    The one thing I cannot block is that systemd looks at the partition
    types of at least one of our hardware types (d820) and generates swap
    unit files by the partition UUID.  How it is doing this, I have no idea
    -- that behavior is only supposed to happen if your disk is GPT.  So we
    get failures on the d820s from the systemd instantiated swap units on
    first boot, but our scripts always do the right thing.