Commit 828e7acd authored by David Johnson

Use the last resort systemd swap strategy: wrap the systemd-fstab-generator.

As noted in a previous commit, systemd generators run at boot, with the
rootfs mounted read-only, prior to systemd reading any unit files on
disk.  The systemd-fstab-generator reads /etc/fstab and generates units
for any devices in there, including swap devices.  Thus, the only way to
ensure our swap device fixing happens correctly (and we have to do this
in case the image was loaded from an old MFS that might write a swap
device with /dev/hda or whatever into /etc/fstab), and doesn't race with
systemd, is to preempt it.

So now our generator runs the systemd generator each boot, but it
post-processes the auto-generated unit files, removing the units for
any device that 1) is an Emulab auto-generated swap device, and 2) does
not exist.

Moreover, if there is *no* swap device in /etc/fstab, we hunt down the
canonical Emulab swap partition on the root disk, and mkswap it, and
generate a unit for it.  Of course, we don't use the system template (it
is an m4 file in the src dist, so we can't), but hopefully the template
won't change much --- and it's basic.

We now only mkswap devices if they don't appear as swap devs via blkid.
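
The blkid check can be sketched roughly as below. This is an illustration, not the generator's actual code: the function name blkid_line_is_swap is hypothetical, and it assumes the usual one-line blkid output format (device, then KEY="value" pairs).

```shell
#!/bin/sh
# Hypothetical helper: succeeds if a line of blkid output describes a
# device that already carries a swap signature (TYPE="swap").
blkid_line_is_swap() {
    case "$1" in
        # The leading space keeps SEC_TYPE="..." from matching.
        *' TYPE="swap"'*) return 0 ;;
        *) return 1 ;;
    esac
}

# Example use (assumes blkid is installed and $dev is a block device):
#   blkid_line_is_swap "$(blkid "$dev" 2>/dev/null)" || mkswap "$dev"
```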

To clean up old Emulab systemd images, do these steps:

  $ systemctl disable emulab-fstab-fixup.service
  $ make client-install
  $ rm -f /usr/local/etc/emulab/initscripts/emulab-systemd-swaps

I think that will do it.  Worked for me on Ubuntu 16 and CentOS 7.

Here are the comments from the generator for posterity:

  This is a systemd generator that wraps systemd-fstab-generator.

  First, we always run the system generator; and any swap file units
  it generates are modified to contain a
  Before=emulab-fstab-fixup.service dependency to ensure the fixup
  script never races with legitimate systemd unit targets.

  If it has already run on an Emulab node (and if the second-stage
  fixup script (emulab-fstab-fixup.service) has also run), we don't
  run it again; we just run the system fstab generator directly, and
  modify the generated swap unit files to run before the fixup script.

  Otherwise, on first boot of an Emulab image, we run the system
  generator and let it generate swap units for any swap devices that
  were in /etc/fstab.  We then check all the auto-Emulab-added swap
  device entries in /etc/fstab, and if that device does not exist, we
  remove the auto-generated unit files/symlinks that correspond to it
  (they are in $1).  We cannot edit /etc/fstab ourselves because we're
  run prior to remount-/-rw; so we make a copy of it in /run/emulab
  and move the copy to /etc/fstab in emulab-fstab-fixup.service ---
  the second part of our solution (the fixup service).

  NB: we do not remove user-added swap devices, even if they are
  invalid!

  We check to see if each auto-Emulab-added swap device that was in
  /etc/fstab is currently a valid swap device (i.e., that it shows up
  via blkid with a TYPE="swap").  If it is valid, we *do not* run
  mkswap to create it!  If it is not currently a valid swap device,
  but if the partition is marked as a Linux swap partition, we will
  run mkswap on it to ensure it is valid by the time systemd runs
  swapon on it.  If we *did* plan to run mkswap, that can be prevented
  by creating the
  /etc/emulab/emulab_systemd_fstab_generator_never_mkswap file.  This
  gives the user the ability to create a disk image where swap
  partitions are not created/wiped on first boot of the image.  I
  can't see a use case for this, but it's easy to do.
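
  The mkswap decision described above might look something like the sketch below. It is a reading of this comment, not the generator's code: should_mkswap and the NEVER_MKSWAP_FLAG variable are hypothetical names, and the two "yes"/"no" arguments stand in for the blkid and partition-type checks.

```shell
#!/bin/sh
# Flag file named in the comment above; the variable name is hypothetical.
NEVER_MKSWAP_FLAG=${NEVER_MKSWAP_FLAG:-/etc/emulab/emulab_systemd_fstab_generator_never_mkswap}

# should_mkswap VALID_SWAP PART_IS_SWAP
#   VALID_SWAP:   "yes" if blkid already reports TYPE="swap" for the device
#   PART_IS_SWAP: "yes" if the partition is marked as a Linux swap partition
# Succeeds only when mkswap should be run.
should_mkswap() {
    # Already a valid swap device: never re-create it.
    [ "$1" = "yes" ] && return 1
    # Not marked as a Linux swap partition: leave it alone.
    [ "$2" = "yes" ] || return 1
    # The user opted out of mkswap on first boot.
    [ -e "$NEVER_MKSWAP_FLAG" ] && return 1
    return 0
}
```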

  If there is a swap partition on the device containing the root
  partition, and if it is not already in /etc/fstab, we try to add a
  unit ourselves for that swap partition.  We do it in the new style,
  too, by using its UUID, in this case.
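
  As a rough illustration of a UUID-style swap unit, the sketch below derives the unit name and a minimal unit body. Both function names and the Description text are hypothetical, and the escaping is simplified to handle only '/' and '-', which is enough for /dev/disk/by-uuid paths; systemd-escape -p --suffix=swap is the general tool.

```shell
#!/bin/sh
# Hypothetical helper: map an absolute device path to the swap unit name
# systemd expects. Simplified escaping: '-' becomes \x2d, '/' becomes '-'.
swap_unit_name() {
    printf '%s\n' "$1" |
        sed -e 's|^/||' -e 's|-|\\x2d|g' -e 's|/|-|g' -e 's|$|.swap|'
}

# Hypothetical helper: print a minimal swap unit for the given UUID.
swap_unit_body() {
    printf '[Unit]\nDescription=Emulab swap partition (illustrative)\n\n'
    printf '[Swap]\nWhat=/dev/disk/by-uuid/%s\n' "$1"
}

# Example use inside a generator, where $1 is the output directory:
#   uuid=$(blkid -o value -s UUID "$dev")
#   swap_unit_body "$uuid" > "$1/$(swap_unit_name "/dev/disk/by-uuid/$uuid")"
```

  A real generator would presumably also need to link the unit into swap.target's wants or requires directory under $1 so that it is actually started; that step is omitted here.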

  When scanning /etc/fstab to find auto-Emulab-added devices, we look
  for a comment above a line containing a swap device, and the comment
  must match the regexp /^#.*the following.* added by / .  Then if the
  line below the comment refers to an invalid swap device, we remove
  the unit files that correspond to the device.  Otherwise
parent 923fddeb
......@@ -177,6 +177,7 @@ systemd-dir-install: dir-install
$(INSTALL) -m 755 -o root -g $(DIRGROUP) -d $(SYSETCDIR)/systemd
$(INSTALL) -m 755 -o root -g $(DIRGROUP) -d $(SYSETCDIR)/systemd/system
$(INSTALL) -m 755 -o root -g $(DIRGROUP) -d $(SYSETCDIR)/systemd/system/multi-user.target.wants
$(INSTALL) -m 755 -o root -g $(DIRGROUP) -d $(SYSETCDIR)/systemd/system-generators
common-install: dir-install
(cd ../common; $(MAKE) DESTDIR=$(DESTDIR) local-install)
......@@ -255,7 +256,11 @@ sysetc-install-systemd: systemd-dir-install
$(INSTALL) -m 755 -o root -g $(DIRGROUP) \
$(SRCDIR)/testbed $(BINDIR)/initscripts/
$(INSTALL) -m 755 -o root -g $(DIRGROUP) \
$(SRCDIR)/emulab-systemd-swaps $(BINDIR)/initscripts
$(SRCDIR)/fstab-generator-finish $(BINDIR)/initscripts
# Install our fstab generator wrapper
$(INSTALL) -m 755 -o root -g $(DIRGROUP) \
$(SRCDIR)/fstab-generator \
$(SYSETCDIR)/systemd/system-generators/systemd-fstab-generator
# Install the service unit files
$(INSTALL) -m 644 -o root -g $(DIRGROUP) \
$(SRCDIR)/testbed.service $(SYSETCDIR)/systemd/system
......
[Unit]
Description=Emulab fstab fixup (swap)
Before=swap.target
After=remount-rootfs.service
After=swap.target
After=systemd-remount-fs.service
DefaultDependencies=no
Conflicts=shutdown.target
[Service]
Type=oneshot
RemainAfterExit=no
ExecStart=/usr/local/etc/emulab/initscripts/emulab-systemd-swaps
ExecStart=/usr/local/etc/emulab/initscripts/fstab-generator-finish
Restart=no
[Install]
......
#!/usr/bin/perl -w
#
# Copyright (c) 2011 University of Utah and the Flux Group.
#
# {{{EMULAB-LICENSE
#
# This file is part of the Emulab network testbed software.
#
# This file is free software: you can redistribute it and/or modify it
# under the terms of the GNU Affero General Public License as published by
# the Free Software Foundation, either version 3 of the License, or (at
# your option) any later version.
#
# This file is distributed in the hope that it will be useful, but WITHOUT
# ANY WARRANTY; without even the implied warranty of MERCHANTABILITY or
# FITNESS FOR A PARTICULAR PURPOSE. See the GNU Affero General Public
# License for more details.
#
# You should have received a copy of the GNU Affero General Public License
# along with this file. If not, see <http://www.gnu.org/licenses/>.
#
# }}}
#
#
# This is a hack to avoid Emulab and systemd conflicts. Basically, sometimes
# our MFSes add swap devices to /etc/fstab that are incorrect (i.e., /dev/hda
# instead of /dev/sda). We could always add 'noauto' to the mount options so
# systemd would ignore it, but we have legacy or deployed MFSes to deal with.
# When systemd encounters one of these, it tries to start the device and halts
# the boot process for a long time while waiting for the device to "start".
#
# We can't edit the bogus /etc/fstab entry before systemd reads it, because
# systemd reads it while the root is still mounted read-only, very early
# on in the startup process.
#
# systemd doesn't allow us to remove units (a swap device is a unit, just like
# a service is a unit), so all we can do is, before the swap unit runs, cancel
# any pending systemd jobs before they try to initialize the bogus device(s).
#
use English;
use strict;
# Drag in path stuff so we can find emulab stuff.
BEGIN { require "/etc/emulab/paths.pm"; import emulabpaths; }
my $FIXER = "$BINDIR/fixup-fstab-swaps";
# Turn off line buffering on output
$| = 1;
open(LOG,">>$LOGDIR/emulab-systemd-swap.log");
my @output = `$FIXER`;
my %removed = ();
my %added = ();
foreach my $line (@output) {
chomp($line);
if ($line =~ /^Removing.*\/dev\/(.*)$/) {
$removed{$1} = 0;
}
elsif ($line =~ /^Using\s+\/dev\/([^ ]+)\s+.*$/) {
$added{$1} = 0;
}
}
foreach my $rdev (keys(%removed)) {
# it got added back, so don't bother deleting the systemd unit.
if (exists($added{$rdev})) {
delete $removed{$rdev};
}
}
if (!keys(%removed)) {
print LOG "No bogus Emulab swap devs to fix.\n";
exit(0);
}
print LOG "Will try to remove systemd jobs for bogus swap devices: " .
join(' ',keys(%removed)) . "\n";
@output = `systemctl --full list-jobs`;
foreach my $line (@output) {
chomp($line);
if ($line =~ /^\s*(\d+)\s+([^\s]+)\s+/) {
my ($job,$unit) = ($1,$2);
foreach my $rdev (keys(%removed)) {
if ($unit =~ /^dev-$rdev\.(swap|device)$/) {
print LOG "Emulab canceling bogus swap device init job ($job,$unit).\n";
system("systemctl cancel $job");
$removed{$rdev} += 1;
}
}
}
}
foreach my $rdev (keys(%removed)) {
if ($removed{$rdev} == 2) {
delete $removed{$rdev};
print LOG "Successfully canceled systemd jobs for bogus swap device $rdev.\n";
}
}
if (keys(%removed)) {
print LOG "Failed to cancel systemd jobs for bogus swap devices: " .
join(' ',keys(%removed)) . "\n";
exit(keys(%removed));
}
exit(0);
#!/bin/sh
. /etc/emulab/paths.sh
RUNDIR=/run/emulab
FSTAB=/etc/fstab
#
# These are the things the Emulab version of the systemd-fstab-generator
# could not do, since the root device isn't mounted read-write during
# generator run. So, do them now if necessary.
#
if [ -f $RUNDIR/generated_swaps ]; then
echo "Recording Emulab-generated swaps"
mv $RUNDIR/generated_swaps $BOOTDIR/
if [ -f $RUNDIR/generated_fstab ]; then
echo "Moving Emulab-generated fstab to /etc/fstab"
mv $RUNDIR/generated_fstab $FSTAB
fi
fi
exit 0
#!/usr/bin/perl -w
#
# Copyright (c) 2007 University of Utah and the Flux Group.
# Copyright (c) 2007, 2016 University of Utah and the Flux Group.
#
# {{{EMULAB-LICENSE
#
......@@ -27,7 +27,8 @@ use Getopt::Std;
# Drag in path stuff so we can find emulab stuff.
BEGIN { require "/etc/emulab/paths.pm"; import emulabpaths; }
if (-e "$BINDIR/fixup-fstab-swaps") {
# Only do this on non-systemd systems.
if (-e "$BINDIR/fixup-fstab-swaps" && ! -e "$ETCDIR/uses-systemd") {
exec("$BINDIR/fixup-fstab-swaps");
}
......
......@@ -76,7 +76,6 @@ dir-install:
$(INSTALL) -m 755 -o root -g root -d $(SYSETCDIR)/systemd/system/network-online.target.wants
$(INSTALL) -m 755 -o root -g root -d $(SYSETCDIR)/systemd/system/networking.service.wants
$(INSTALL) -m 755 -o root -g root -d $(SYSETCDIR)/systemd/system/multi-user.target.wants
$(INSTALL) -m 755 -o root -g root -d $(SYSETCDIR)/systemd/system/swap.target.wants
$(INSTALL) -m 755 -o root -g root -d $(SYSETCDIR)/udev
$(INSTALL) -m 755 -o root -g root -d $(SYSETCDIR)/udev/rules.d
......@@ -88,7 +87,6 @@ bin-install: dir-install
$(INSTALL) -m 755 $(SRCDIR)/findcnet $(BINDIR)/findcnet
$(INSTALL) -m 755 $(SRCDIR)/emulab-udev-network-interfaces-handler \
$(BINDIR)/emulab-udev-network-interfaces-handler
$(INSTALL) -m 755 $(SRCDIR)/fixup-fstab-swaps $(BINDIR)
etc-install: dir-install common-sysetc-install
$(INSTALL) -m 644 $(SRCDIR)/group $(ETCDIR)/group
......@@ -139,10 +137,6 @@ systemd-install: dir-install
$(INSTALL) -m 644 $(SRCDIR)/ntp.service $(SYSETCDIR)/systemd/system
ln -sf $(SYSETCDIR)/systemd/system/ntp.service \
$(SYSETCDIR)/systemd/system/multi-user.target.wants/ntp.service
$(INSTALL) -m 644 -o root -g $(DIRGROUP) \
$(SRCDIR)/emulab-fstab-fixup.service $(SYSETCDIR)/systemd/system
ln -sf $(SYSETCDIR)/systemd/system/emulab-fstab-fixup.service \
$(SYSETCDIR)/systemd/system/swap.target.wants/emulab-fstab-fixup.service
# Kick the init process to read our newly-installed unit files
# (i.e., so an immediate tbprepare will work...)
@if [ -z "$(DESTDIR)" ]; then \
......
[Unit]
Description=Emulab fstab fixup (swap)
Before=swap.target
After=systemd-remount-fs.service
DefaultDependencies=no
Conflicts=shutdown.target
Conflicts=swap.target
[Service]
Type=oneshot
RemainAfterExit=no
ExecStart=/usr/local/etc/emulab/initscripts/emulab-systemd-swaps
Restart=no
[Install]
WantedBy=swap.target
#!/usr/bin/perl -w
#
# Copyright (c) 2007-2016 University of Utah and the Flux Group.
#
# {{{EMULAB-LICENSE
#
# This file is part of the Emulab network testbed software.
#
# This file is free software: you can redistribute it and/or modify it
# under the terms of the GNU Affero General Public License as published by
# the Free Software Foundation, either version 3 of the License, or (at
# your option) any later version.
#
# This file is distributed in the hope that it will be useful, but WITHOUT
# ANY WARRANTY; without even the implied warranty of MERCHANTABILITY or
# FITNESS FOR A PARTICULAR PURPOSE. See the GNU Affero General Public
# License for more details.
#
# You should have received a copy of the GNU Affero General Public License
# along with this file. If not, see <http://www.gnu.org/licenses/>.
#
# }}}
#
use English;
use Getopt::Std;
# Drag in path stuff so we can find emulab stuff.
BEGIN { require "/etc/emulab/paths.pm"; import emulabpaths; }
sub usage()
{
print "Usage: fixup-fstab-swaps [-U] [-E]\n";
exit(1);
}
my $optlist = "UE";
my $FSTAB = "/etc/fstab";
my $SWAPS = "/proc/swaps";
my $FIRST_BOOT_FLAG = "$BOOTDIR/added_swaps";
my $script = $0;
my $noupdate = 0;
my $noenable = 0;
sub removeswaps();
sub findswaps();
sub swapenable();
# Turn off line buffering on output
$| = 1;
#
# Parse command arguments. Once we return from getopts, all that should be
# left are the required arguments.
#
%options = ();
if (! getopts($optlist, \%options)) {
usage();
}
if (defined($options{"U"})) {
$noupdate = 1;
}
if (defined($options{"E"})) {
$noenable = 1;
}
if (@ARGV) {
usage();
}
# Remove any existing swap devices from fstab; we can't ensure that they'll
# work on this node. This only runs the first time the node is booted after
# being imaged. Then look for linux swap partitions on the root disk and add
# them to fstab. This way we have sane defaults for swap. The user is, of
# course, free to change the swap configuration later if necessary.
if (! -f $FIRST_BOOT_FLAG && !$noupdate) {
removeswaps();
findswaps();
open FD, ">$FIRST_BOOT_FLAG";
close FD;
}
#
# Figure out if there is a swap device, add it to fstab and initialize if
# necessary.
#
swapenable()
if (!$noenable);
exit(0);
sub removeswaps()
{
my @buffer;
if (!open(FD, "+<$FSTAB")) {
print STDERR "*** WARNING: could not open $FSTAB for writing: $!\n";
return;
}
while (<FD>) {
next if (/^\s*$/ || /^#/);
my ($fs,$mpt,$type,$opt) = split;
if ($type eq "swap") {
print "Removing old swap device $fs\n";
next;
}
push @buffer, $_;
}
truncate FD, 0;
seek(FD, 0, 0);
print FD $_ for (@buffer);
close(FD);
}
sub findswaps()
{
#
# No swap device.
# Identify the root disk and see if we can locate a Linux swap partition
# on it.
#
my @swapdevs;
my $rdisk;
my $rootfs = `df / | grep /dev/`;
if ($rootfs =~ /^(\/dev\/nvme\d+n\d+)p\d+/) {
$rdisk = $1;
} elsif ($rootfs =~ /^(\/dev\/[a-z]+)\d+/) {
$rdisk = $1;
}
elsif ($rootfs =~ /^(\/dev\/[^\s]+)\s/) {
$rootfs = `readlink -f $1`;
if ($rootfs =~ /^(\/dev\/[a-z]+)\d+/) {
$rdisk = $1;
}
else {
print STDERR "*** WARNING: could not identify root disk, ".
"no swap enabled\n";
return;
}
}
else {
print STDERR "*** WARNING: could not identify root disk, ".
"no swap enabled\n";
return;
}
@lines = `fdisk -l $rdisk 2>/dev/null | grep 'Linux swap'`;
if ($? != 0) {
print STDERR "*** WARNING: could not read MBR of $rdisk, ".
"no swap enabled\n";
return;
}
chomp(@lines);
foreach $line (@lines) {
if ($line =~ /^(\/dev\/\S+)\s+\d+\s+\d+\s+(\d+)\+?/) {
my $dev = $1;
my $size = $2;
print "Using $dev ($size sectors) for swap\n";
push @swapdevs, $dev;
}
}
if (!@swapdevs) {
print STDERR "*** WARNING: could not locate a suitable swap device, ".
"no swap enabled\n";
return;
}
if (!open(FD, ">>$FSTAB")) {
print STDERR "*** WARNING: could not add swap devices to $FSTAB: $!\n";
return;
}
print FD "# the following swap devices added by $script\n";
for (@swapdevs) {
print FD "$_\t\tswap\tswap\tnoauto,x-emulab-auto\t0\t0\n";
}
close(FD);
}
sub swapenable()
{
my %curswaps;
if (!open(FD, "<$SWAPS")) {
print STDERR "*** WARNING: could not open $SWAPS for reading: $!\n";
return;
}
<FD>; # Throw away the header
while(<FD>) {
@_ = split;
$curswaps{$_[0]} = 1;
}
close FD;
if (!open(FD, "<$FSTAB")) {
print STDERR "*** WARNING: could not open $FSTAB for reading, ".
"no swap enabled\n";
return;
}
while (<FD>) {
next if (/^\s*$/ || /^#/);
my ($fs,$mpt,$type,$opt) = split;
next if ($type ne "swap" || ! -b $fs || exists $curswaps{$fs});
next if ($opt =~ /\bnoauto\b/ && $opt !~ /x-emulab-auto/);
if (system("mkswap $fs")) {
print STDERR "*** WARNING: could not initialize swap on $fs\n";
next;
}
if (system("swapon $fs")) {
print STDERR "*** WARNING: could not enable swap on $fs\n";
next;
}
}
close(FD);
}