Update Using the Recovery MFS (authored by David Johnson)

Using the Linux Recovery MFS
============================

The Linux Recovery MFS is a memory-based, network-booted environment
that can be used to examine and/or fix a non-booting experiment node.

Booting into the Recovery MFS
-----------------------------

When looking at your Emulab/CloudLab/POWDER experiment status page, you
can click on the troubled node, and select the `Recovery` option from
the popup menu. You will then be given more information and a `Confirm`
button in a dialogue that pops up. Click `Confirm`, and your node will
be rebooted and/or power cycled, and then network-booted into this
environment. Your account and ssh public keys will be installed as the
environment boots and configures, so once the Recovery MFS has booted,
log in to your node via `ssh`. Note that we do not install account
passwords, so you cannot log in on the serial console as your own uid; but
you can log in to the console via the root account and its random
password. This random password is available once you've opened the
console from the experiment status page.

Using the Recovery MFS
----------------------

There are many reasons why your node may not boot. We'll cover the most
common ones in this document, but you may have to do some of
your own research based on the information here. It's best if you can
look at the console logs for your node (an option on the experiment
status page), and use any error information or timeouts to point you to
the right solution. Consider also what packages you may have
installed---note carefully if you upgraded the kernel, or grub.

Because the Linux MFS is network-booted via PXE/TFTP, it is optimized
for space. It has a carefully-chosen set of utilities that will help
you diagnose and repair problems related to your root filesystem and
bootloader. Its Linux kernel configuration is small, so it may not
support everything you might want to try, even if you've used
`chroot` to enter your root filesystem and run programs (and the goal of
this guide is to get you back up and running your own kernel and on-disk
environment, in any case).

It is non-trivial to *add* software to the MFS, due to its optimization
for space. The MFS is based on `buildroot`, and we use `uclibc` to
provide the standard C library, and customize its configuration for
space. This means that (at best) you should only expect
statically-linked executables to run in the MFS, and only if those
executables were built against something approximating the Linux ~4.14
ABI. If your statically-linked binary depends at all on the GNU libc
libraries (e.g. if your gcc was not built with `--enable-static-nss` and
your program needed NSS support), you won't be able to use it in this
context. The best solution would be to build your software against our
build environment, but this is non-trivial, so we don't currently
support it.

Finding the Partition Containing the Root Filesystem
----------------------------------------------------

We have two primary standard Linux image types. The newest are GPT-based, dual BIOS/UEFI-bootable (https://gitlab.flux.utah.edu/emulab/emulab-devel/-/wikis/Dual-boot-UEFI-BIOS-Images), and place the rootfs in the 3rd partition. Older BIOS bootable-only images have an MBR partition table, and place the rootfs in the 1st partition.

Our newer dual BIOS/UEFI-bootable images install both a UEFI loader in the EFI System Partition (GPT partition 1), and a grub loader in the MBR (note also the presence of the BIOS boot partition, GPT partition 2). Older BIOS-bootable images require that the bootloader (typically grub) be installed to
the first partition, *not* to the MBR.

You can list the detected filesystems and non-empty partitions by
running the `blkid` command. For instance, `/dev/sda1` holds the root
filesystem in the output below:

```
# blkid
/dev/sda1: UUID="5fac4c69-51f6-47af-9773-a0d722426942" TYPE="ext3"
/dev/sda3: UUID="71e2e601-c555-466b-b925-e9d672f72835" TYPE="swap"
```

On nodes with NVMe devices, the root partition would instead be named
`/dev/nvme0n1p1`, or similar. Do not assume that `/dev/sda1` is the
root.
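
Because the partition suffix differs between SATA/SAS-style and NVMe-style
device names, a small helper can save some typos. The following is a minimal
sketch, not part of the MFS: `part_dev` is a hypothetical function name, and
it relies only on the Linux convention that a `p` is inserted before the
partition number when the disk name itself ends in a digit.

```shell
#!/bin/sh
# part_dev DISK NUM: print the device path of partition NUM on DISK.
# (Hypothetical helper for illustration; not provided by the MFS.)
# Linux inserts "p" before the partition number when the disk name
# ends in a digit (nvme0n1 -> nvme0n1p3), and nothing otherwise
# (sda -> sda3).
part_dev() {
    case "$1" in
        *[0-9]) printf '%sp%s\n' "$1" "$2" ;;
        *)      printf '%s%s\n'  "$1" "$2" ;;
    esac
}

# Candidate root partitions for the two image layouts described above:
# partition 3 on GPT dual-boot images, partition 1 on older MBR images.
part_dev /dev/sda 3
part_dev /dev/nvme0n1 1
```

Always confirm the candidate against the `blkid` output before running
`fsck` or `mount` on it.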

Checking for Root Filesystem Errors
-----------------------------------

If your node will not boot and you see on console that the root
filesystem failed to mount due to unfixable errors, you will need to use
the `fsck` program to correct these errors:

```
# fsck /dev/sda1
fsck 1.44.1 (24-Mar-2018)
e2fsck 1.44.1 (24-Mar-2018)
/dev/sda1: clean, 124348/1048576 files, 659357/4194304 blocks
```

You can run `fsck -h` to learn more about specific options, depending on
how badly your filesystem may be damaged.
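
When scripting a check, it can help to decode the exit status that `fsck`
returns, which is a sum of the bit values listed in the fsck(8) manual page.
The sketch below assumes those standard values; `fsck_status` is a
hypothetical helper name, and the suggested invocation in the comments
(`fsck -n`, a check that makes no changes) needs a real device, so it is
left commented out.

```shell
#!/bin/sh
# fsck_status CODE: describe an fsck exit status in words.
# (Hypothetical helper for illustration; bit values per fsck(8).)
fsck_status() {
    code=$1
    if [ "$code" -eq 0 ]; then
        echo "no errors"
        return 0
    fi
    out=""
    for pair in "1:errors corrected" "2:reboot recommended" \
                "4:errors left uncorrected" "8:operational error" \
                "16:usage error" "32:canceled" \
                "128:shared library error"; do
        bit=${pair%%:*}     # number before the colon
        text=${pair#*:}     # description after the colon
        if [ $((code & bit)) -ne 0 ]; then
            out="${out:+$out, }$text"
        fi
    done
    echo "$out"
}

# On a real node you might run (assumes /dev/sda1 is your root partition):
#   fsck -n /dev/sda1; fsck_status $?
fsck_status 4
```

A status that includes "errors left uncorrected" usually means a further
`fsck` run that is allowed to make repairs is needed before mounting.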

Checking Filesystem Metadata
----------------------------

Our standard Linux images' root partitions are formatted using `ext3`
or `ext4`. You can use a program called `tune2fs` to view (and change)
metadata and filesystem options. Be careful when using `tune2fs`---know
what you are doing before using it to make changes! Here is an example
of listing root filesystem metadata on the root partition of our
`UBUNTU18-64-STD` image:

```
# tune2fs -l /dev/sda1
tune2fs 1.44.1 (24-Mar-2018)
Filesystem volume name: <none>
Last mounted on: /
Filesystem UUID: 5fac4c69-51f6-47af-9773-a0d722426942
Filesystem magic number: 0xEF53
Filesystem revision #: 1 (dynamic)
Filesystem features: has_journal ext_attr resize_inode dir_index filetype sparse_super large_file
Filesystem flags: signed_directory_hash
Default mount options: user_xattr acl
Filesystem state: clean
Errors behavior: Continue
Filesystem OS type: Linux
Inode count: 1048576
Block count: 4194304
Reserved block count: 209715
Free blocks: 3534947
Free inodes: 924228
First block: 0
Block size: 4096
Fragment size: 4096
Reserved GDT blocks: 1023
Blocks per group: 32768
Fragments per group: 32768
Inodes per group: 8192
Inode blocks per group: 512
Filesystem created: Wed May 9 17:38:11 2018
Last mount time: Thu Mar 5 09:19:03 2020
Last write time: Thu Mar 5 10:05:48 2020
Mount count: 140
Maximum mount count: -1
Last checked: Wed May 9 17:38:11 2018
Check interval: 0 (<none>)
Lifetime writes: 280 GB
Reserved blocks uid: 0 (user root)
Reserved blocks gid: 0 (group root)
First inode: 11
Inode size: 256
Required extra isize: 32
Desired extra isize: 32
Journal inode: 8
Default directory hash: half_md4
Directory Hash Seed: 91b8dc1a-0f8b-4236-8b4e-a6bfe93db18b
Journal backup: inode blocks
```
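
The listing is long, and often only one or two fields matter (for example
`Filesystem state` or `Last mount time`). As a rough sketch, assuming `awk`
is available in your environment, a single field can be pulled out of the
`tune2fs -l` output like this; `tune2fs_field` is a hypothetical helper name,
and the sample input is copied from the listing above so the example runs
without a real device.

```shell
#!/bin/sh
# tune2fs_field NAME: read `tune2fs -l` output on stdin and print the
# value of the field called NAME (the text after the first colon).
# (Hypothetical helper for illustration.)
tune2fs_field() {
    awk -F: -v name="$1" '
        $1 == name {
            sub(/^[^:]*:[ \t]*/, "")   # drop "NAME:" and any padding
            print
            exit
        }'
}

# Sample input taken from the listing above. On a real node you might run:
#   tune2fs -l /dev/sda1 | tune2fs_field "Filesystem state"
tune2fs_field "Filesystem state" <<'EOF'
Filesystem state: clean
Errors behavior: Continue
EOF
```

Only the text up to the first colon is stripped, so values that contain
colons themselves, such as timestamps, are printed intact.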

Mounting the Root Filesystem
----------------------------

Once you've verified that your root filesystem is intact, you'll want to
`mount` it at a mountpoint (typically `/mnt`). As described above, our
standard Linux image root partitions are formatted with the `ext3` or
`ext4` filesystems. The Recovery MFS does not support all filesystem
types, so YMMV if you're trying to examine a different kind of
filesystem. A straightforward use of `mount` should work (of course
change `/dev/sda1` to the device that hosts your root filesystem):

```
mount /dev/sda1 /mnt
```

If `mount` complains, you may need to manually specify the filesystem
type (and/or first run `fsck` on the partition as described above):

```
mount -t ext3 /dev/sda1 /mnt
```

Make sure you haven't already mounted something else at `/mnt`. If you
need more than one partition mounted at once, you'll need to create more
directories to serve as additional mountpoints. Always make sure to
`umount` *everything* you mount before rebooting or exiting the Recovery
MFS!

At this point, you can browse filesystem contents and make any necessary
changes. For instance, you might want to view the syslog:

```
less /mnt/var/log/messages
```

If you forget whether or where you've mounted the root filesystem, you
can simply invoke `mount` with no options:

```
mount
```

The final line should tell you if and where you've mounted the root
filesystem.
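
If you would rather test for a mount in a script than read the `mount`
output by eye, the kernel's mount table in `/proc/mounts` can be queried
directly. This is a minimal sketch assuming `awk` is available;
`mounted_at` is a hypothetical helper name, and its optional second
argument exists only so that the function can be tried against a saved
copy of the table.

```shell
#!/bin/sh
# mounted_at DIR [MOUNTS_FILE]: print the device mounted at DIR and
# return 0, or print nothing and return nonzero if nothing is mounted
# there. Reads /proc/mounts unless another file is given.
# (Hypothetical helper for illustration.)
mounted_at() {
    awk -v dir="$1" '
        $2 == dir { dev = $1 }       # the last matching entry wins
        END {
            if (dev == "") exit 1
            print dev
        }' "${2:-/proc/mounts}"
}

if dev=$(mounted_at /mnt); then
    echo "/mnt currently holds $dev"
else
    echo "nothing is mounted at /mnt"
fi
```

Checking this before each `mount`, and again after each `umount`, makes it
harder to reboot with a filesystem still mounted.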

Using `chroot` to Run On-Disk Programs
--------------------------------------

Once you have mounted your root partition, you may want or need to run
on-disk programs to make repairs. To do this, you need to mount
additional special filesystems and files into the mountpoint containing
your root filesystem. The details of these special filesystems are
beyond the scope of this document, and not all programs require all of
these filesystems to be mounted; other more comprehensive guides are
easy to find. Not all on-disk software will run reliably in this
configuration, but most tools will.

Again, make sure you already have your root filesystem
mounted at `/mnt`. Then:

```
mount -o bind /proc /mnt/proc
mount -o bind /dev /mnt/dev
mount -o bind /dev/pts /mnt/dev/pts
mount -o bind /sys /mnt/sys
# If you need internet access and name resolution:
mount -o bind /etc/resolv.conf /mnt/etc/resolv.conf
# Or, if the above command fails because /mnt/etc/resolv.conf
# is a symlink to the systemd stub resolver, try
mount -o bind /etc/resolv.conf /mnt/run/systemd/resolve/stub-resolv.conf
# Change your root to be the root of the on-disk filesystem:
chroot /mnt /bin/bash
# Run whatever programs you need, such as apt-get or grub-install.
# e.g., $ apt-get update && apt-get install ...
# e.g., $ grub-install --force /dev/sda1
# Exit the chroot environment
exit
# Unmount special filesystems (in reverse order from above).
# If you bind-mounted onto stub-resolv.conf instead, unmount
# /mnt/run/systemd/resolve/stub-resolv.conf in place of /mnt/etc/resolv.conf.
umount /mnt/etc/resolv.conf /mnt/sys /mnt/dev/pts /mnt/dev /mnt/proc
# Unmount the root filesystem:
umount /mnt
```
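
The choice between the two `resolv.conf` bind-mount targets can be made
by checking whether the on-disk file is a symlink. The following is only a
sketch: `resolv_bind_target` is a hypothetical helper name, and it assumes
that any symlink at `etc/resolv.conf` points at the systemd stub resolver
file, which may not hold for every image. It prints the command rather than
running it.

```shell
#!/bin/sh
# resolv_bind_target ROOT: print the path under ROOT onto which the
# MFS /etc/resolv.conf should be bind-mounted.
# (Hypothetical helper; assumes a symlinked resolv.conf means the
# systemd stub resolver is in use on the on-disk image.)
resolv_bind_target() {
    if [ -L "$1/etc/resolv.conf" ]; then
        echo "$1/run/systemd/resolve/stub-resolv.conf"
    else
        echo "$1/etc/resolv.conf"
    fi
}

target=$(resolv_bind_target /mnt)
echo "would run: mount -o bind /etc/resolv.conf $target"
echo "and later: umount $target"
```

Recording the chosen target in a variable also ensures that the same path is
unmounted afterwards.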

Reinstalling the Bootloader (grub2)
-----------------------------------

If you don't see any messages from the on-disk bootloader (usually
`grub2`) in the console log of your failing node, you may well have
broken your bootloader installation or its configuration. As mentioned
earlier, the bootloader for our standard Linux images *must* be
installed to the first partition of the boot device, and this partition
also contains the root filesystem. Do not reinstall to the MBR; this
will not help you unless your disk image is a whole-disk image, and this
is almost never the case.

We do not provide `grub2` binaries in the MFS, so you will need to use
the `chroot` strategy above to run `grub-install` or `grub2-install`
from within your on-disk root filesystem, once you've mounted it as
described above. Once you've entered the `chroot` environment, and know
the device containing your root filesystem, you can do

```
grub-install --force /dev/sda1
```

(Replace `/dev/sda1` with the path of the partition containing your root
filesystem.)

Make sure to `umount` special filesystems and the root filesystem when
you're finished, before rebooting!
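
Since the installer is named `grub-install` on some distributions and
`grub2-install` on others, you can check which one your on-disk image
provides once you are inside the `chroot`. This is a small sketch using only
the shell's `command -v`; `find_grub_install` is a hypothetical helper name,
and the example prints the command instead of running it.

```shell
#!/bin/sh
# find_grub_install: print the name of the first GRUB installer found
# on PATH, or return nonzero if neither name is present.
# (Hypothetical helper for illustration.)
find_grub_install() {
    for cmd in grub-install grub2-install; do
        if command -v "$cmd" >/dev/null 2>&1; then
            echo "$cmd"
            return 0
        fi
    done
    return 1
}

if installer=$(find_grub_install); then
    # Replace /dev/sda1 with the partition holding your root filesystem.
    echo "would run: $installer --force /dev/sda1"
else
    echo "no GRUB installer found on PATH" >&2
fi
```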

Advanced Filesystem Configurations
----------------------------------

The Recovery MFS does provide LVM (e.g., `lvdisplay` et al.) and software
RAID (e.g., `dmraid`) tools, so you could use them to examine, expose,
and mount logical volumes or software RAID devices---but these advanced
configurations are beyond the scope of this document. You will need to
read and understand either toolset to proceed.

Recovering non-Linux filesystems
--------------------------------

If your node is running FreeBSD rather than Ubuntu or CentOS Linux, then the Linux recovery filesystem won't help you right now. Contact testbed-ops for help.