- 04 Jan, 2019 1 commit
-
-
Dan Reading authored
-
- 03 Jan, 2019 1 commit
-
-
Leigh Stoller authored
-
- 17 Dec, 2018 1 commit
-
-
David Johnson authored
-
- 11 Dec, 2018 2 commits
-
-
Leigh Stoller authored
-
Leigh Stoller authored
* Makefile changes to build and install nossl versions of capture and console on a rack control node (or more generally, a physical node hosting boss/ops VMs that are not built on our XEN49 image).
* Add -I (insecure) option to capture; with it, capture listens on localhost only.
* Add systemd startup files for capture on ops and boss. I tested these on Ubuntu18.

Basic instructions:

* Clone the emulab-devel repo to the control node.

    git clone https://gitlab.flux.utah.edu/emulab/emulab-devel.git

* On the control node, install the libssl devel code:

    sudo apt-get update
    sudo apt-get install libssl-dev

* Configure and build capture. Note that the obj-clientside directory might already exist; you can just rm -rf the directory.

    control> cd ~elabman
    control> mkdir obj-clientside
    control> cd obj-clientside
    control> /path/to/emulab-devel/clientside/configure
    control> make rack-control
    control> sudo make rack-control-install
    control> (cd os/capture; sudo make rack-control-startup-install)

* Start capture.

    control> sudo systemctl daemon-reload
    control> sudo systemctl start capture-boss
    control> sudo systemctl start capture-ops
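For readers unfamiliar with what "listens on localhost only" means in practice, here is a minimal Python sketch. It is not capture's code; `open_listener` and its `insecure` parameter are hypothetical names used only to illustrate binding to the loopback address.

```python
import socket


def open_listener(port: int, insecure: bool) -> socket.socket:
    """Open a TCP listener (hypothetical stand-in for an -I style switch)."""
    # Assumption: the "insecure" mode restricts the listener to the loopback
    # interface, so only processes on the same host can connect.
    host = "127.0.0.1" if insecure else "0.0.0.0"
    sock = socket.socket(socket.AF_INET, socket.SOCK_STREAM)
    sock.setsockopt(socket.SOL_SOCKET, socket.SO_REUSEADDR, 1)
    sock.bind((host, port))
    sock.listen(5)
    return sock


if __name__ == "__main__":
    # Port 0 asks the kernel for any free port; print where we ended up.
    listener = open_listener(0, insecure=True)
    print(listener.getsockname())
    listener.close()
```

Dropping encryption is only reasonable when the socket is unreachable from other hosts, which is presumably why the two behaviours are tied to one option.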
-
- 06 Dec, 2018 1 commit
-
-
Leigh Stoller authored
* Stop using the ALWAYSUP state machine for switches; this causes ISUP to always get sent, which, in certain cases, results in stated rebooting the switch! Added new ONIE state machine, which handles the way switches actually boot into ONIE first and then does the bootinfo/grub dance, or does a reload, or does admin mode.
* Do not send PXEBOOTING from ONIE; this was a mistake. It throws us into the PXEKERNEL state machine, which sometimes results in stated rebooting the switch! We still use PXEWAIT (it is sent by bootinfod), since that is the "waiting" state that is wired into a lot of Emulab; it just happens to now be a state in the ONIE state machine, so it's legal.
* Fix a bug in libossetup that was fooling libossetup_switch into thinking the wrong thing.
* Add some timeouts to the libosload_mlnx code; sshd sometimes refuses to answer after a failed login. Strange.
* Fix a fork() problem in the switch reload code; gotta call exit, not return! This was wreaking subtle (okay not so subtle) havoc in libossetup.
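The fork() pitfall in the last item is a general one, and a minimal Python sketch shows the shape of the fix. This is not the Emulab code (which is Perl); `run_in_child` is a hypothetical helper. If the child returns instead of exiting, it keeps running the caller's code as a second copy of the parent.

```python
import os


def run_in_child(task) -> int:
    """Fork, run task() in the child, and return the child's exit code."""
    pid = os.fork()
    if pid == 0:
        # Child: it must terminate here. Returning would let the child fall
        # back into the caller and duplicate everything the parent does next.
        code = 1
        try:
            task()
            code = 0
        finally:
            # os._exit skips cleanup handlers inherited from the parent.
            os._exit(code)
    # Parent: wait for the child and decode its exit status.
    _, status = os.waitpid(pid, 0)
    return os.WEXITSTATUS(status) if os.WIFEXITED(status) else -1


if __name__ == "__main__":
    print(run_in_child(lambda: None))
```

The `finally` clause guarantees the child exits even when the task raises, so no code path lets the child return to the caller.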
-
- 05 Dec, 2018 1 commit
-
-
Dan Reading authored
-
- 29 Nov, 2018 1 commit
-
-
Leigh Stoller authored
-
- 28 Nov, 2018 2 commits
-
-
Leigh Stoller authored
clientside subdir so they can be installed on nodes.
-
Mike Hibler authored
Most important: if a <2TB blockstore has an ext4 filesystem, make sure we create it without the 64bit and huge_file features. The former will make it impossible (currently) to take a snapshot since imagezip does not handle 64-bit blocknumbers (working on it...).

Don't stripe an LVM LV over more than 8 devices. Some of the Clemson nodes have 20+ disks and we won't buy much (and it might even be counterproductive) to try to stripe writes over all devices all the time.

Still trying to get lvcreate to not prompt when one of the devices has an old metadata prompt. -Zy is supposed to prevent that, but it doesn't. Try adding -y as well.

Not related: in the BEGIN block, don't cat $ETCDIR/genvmtype unless it actually exists. Not everything is a docker container ya know...
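The two sizing rules above can be sketched in Python as follows. This is an illustration, not the actual Emulab code: the function names are hypothetical, and treating "2TB" as 2 TiB in bytes is an assumption. The `-O ^feature` form is the standard mke2fs syntax for disabling a feature.

```python
from typing import List

TWO_TB = 2 * 1024**4   # assumption: the 2TB threshold is read as 2 TiB
MAX_STRIPES = 8        # stripe cap stated in the commit message


def ext4_mkfs_args(device: str, size_bytes: int) -> List[str]:
    """Build an mke2fs command line for an ext4 blockstore (hypothetical)."""
    args = ["mke2fs", "-t", "ext4"]
    if size_bytes < TWO_TB:
        # Small volumes do not need 64-bit block numbers; leaving the
        # features off keeps the filesystem within what imagezip can handle.
        args += ["-O", "^64bit,^huge_file"]
    args.append(device)
    return args


def stripe_count(num_devices: int) -> int:
    """Number of devices to stripe an LV across, capped at MAX_STRIPES."""
    return max(1, min(num_devices, MAX_STRIPES))


if __name__ == "__main__":
    print(ext4_mkfs_args("/dev/vg/lv", 500 * 1024**3))
    print(stripe_count(20))
```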
-
- 08 Nov, 2018 1 commit
-
-
Dan Reading authored
-
- 06 Nov, 2018 3 commits
-
-
David Johnson authored
(We don't want systemd sending them SIGTERM before bootvnodes can get them!)
-
David Johnson authored
-
David Johnson authored
-
- 05 Nov, 2018 2 commits
-
-
Leigh Stoller authored
* The primary problem with the Mellanox is that the install image does a kexec out of ONIE into Linux, spends 30+ minutes doing stuff, and then reboots. This throws the reload state machine out of whack because we do not get a chance to send the RELOADDONE state. So ... some changes to rc.testbed and rc.reload on the USB dongle: the ONIE MFS sends RELOADING and writes a flag file to the ONIE partition on the "disk" (not the USB). Then it kexecs into MLNX, the install happens, and it reboots. The next boot into ONIE sees the flag file, erases it, and sends RELOADDONE. It waits for a bit, and then continues on the normal path. This abuses stated in that there are whiny messages in the stated log file, but I am immune to stated whining.
* Another item of note is that the switch DHCPs, but only to get the IP info; there is no ability to give it an initial config file like we can with the Dell switches. The main problem here is that the switch comes up with its default login/password, which is obviously well known because it's in the manual. That means there is a window where the switch is vulnerable, but since we block the switches from the public side, this is not a serious problem. As soon as we can get in (sshd is running) we log in and update the config with passwords, keys, etc.
* Other changes to the machine dependent osload library module. I had done some of this before switching to the Dells way back when, but it needed to be updated/completed.
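The flag-file handshake in the first item can be sketched as a small Python function. This only illustrates the control flow; the real logic lives in rc.testbed and rc.reload. The function name, return values, and the `send_state` callback are hypothetical.

```python
import os
import tempfile


def onie_boot(flag_path: str, want_reload: bool, send_state) -> str:
    """Decide what one ONIE boot does (hypothetical sketch of the handshake)."""
    if os.path.exists(flag_path):
        # Second boot: the installer ran and rebooted, so finish the reload.
        os.remove(flag_path)
        send_state("RELOADDONE")
        return "continue-normal-boot"
    if want_reload:
        # First boot: report the reload, then leave a marker that survives
        # the kexec into the installer and the reboot that follows it.
        send_state("RELOADING")
        with open(flag_path, "w") as f:
            f.write("reload in progress\n")
        return "kexec-installer"
    return "continue-normal-boot"


if __name__ == "__main__":
    sent = []
    with tempfile.TemporaryDirectory() as d:
        flag = os.path.join(d, "reload.flag")
        print(onie_boot(flag, True, sent.append))   # first boot
        print(onie_boot(flag, True, sent.append))   # boot after the install
    print(sent)
```

The flag has to live on storage that survives the installer's reboot, which is why the commit puts it on the ONIE partition instead of the USB dongle.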
-
Leigh Stoller authored
-
- 30 Oct, 2018 2 commits
-
-
David Johnson authored
This is necessary for clusters that run an arp lockdown on boss. This eluded me for a long time. None of the documented ways to set the mac address of an endpoint on container create work (they only work on post-create network attach). You have to use some special, weird, undocumented magic.
-
David Johnson authored
(Most of these got lost in some other commit storm, I believe. The firewall fixes are new, for newer Dockers that drop traffic by default.)
-
- 26 Oct, 2018 6 commits
-
-
David Johnson authored
-
David Johnson authored
-
David Johnson authored
-
David Johnson authored
-
David Johnson authored
-
Mike Hibler authored
Turns out we have not been installing (via slicefix) the local site certs on nodes after they have been imaged. We haven't noticed because we don't usually use SSL-enabled tmcd. Leigh noticed because we do use it in the script that locks down ARP entries.
-
- 25 Oct, 2018 3 commits
-
-
David Johnson authored
(Also, add support for the user to change the container entrypoint at runtime. Note also that the server side now stores the entrypoint/cmd/env attributes as base64url-encoded virt_node_attributes, so that we can just use the existing table_regex for those values.)

We add a new runit service (/etc/service/dockerentrypoint) to clientside/tmcc/linux/docker/dockerfiles/common to handle the entrypoint/cmd/env/workingdir/user emulation. From the comments:

Docker's semantics for ENTRYPOINT/CMD vary depending on whether those values are specified as arrays of strings, or simply as single strings (which must be interpreted by /bin/sh -c). Handling all the quoting possibilities in the shell is a major pain. So, this script handles the basic stuff (in particular, sourcing env vars, because we want the shell to interpret them!) -- then execs our perl companion script (run.pl) to deal with the entrypoint/command files that libvnode_docker::emulabizeImage and libvnode_docker::vnodeCreate populated.

libvnode_docker creates these single-line files in /etc/emulab/docker as either string:hexstr(<entrypoint-or-cmd-string>), or array:hexstr(a[0]),hexstr(a[1])... . This allows us to preserve the original type of the image's entrypoint/cmd as well as the runtime entrypoint/cmd, and to preserve the exact bytes for the eventual final call to exec.

The static files built into an emulabized image are /etc/emulab/docker/{entrypoint.image,cmd.image}, and those created dynamically at runtime if the user changes the entrypoint or cmd are bind-mounted to /etc/emulab/docker/{entrypoint.runtime,cmd.runtime}. Given the presence (or absence!) of those files, this script implements the emulation, based upon the content in those files.
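The string:/array: file format described above can be illustrated with a short Python round trip. This is a sketch, not run.pl or libvnode_docker: the function names are hypothetical, and it assumes hexstr means the lowercase hex of the UTF-8 bytes.

```python
from typing import List, Union

Value = Union[str, List[str]]


def encode_entry(value: Value) -> str:
    """Encode an entrypoint/cmd as 'string:<hex>' or 'array:<hex>,<hex>...'."""
    if isinstance(value, str):
        return "string:" + value.encode("utf-8").hex()
    return "array:" + ",".join(v.encode("utf-8").hex() for v in value)


def decode_entry(line: str) -> Value:
    """Decode one single-line file, preserving the string/array type."""
    kind, _, payload = line.strip().partition(":")
    if kind == "string":
        return bytes.fromhex(payload).decode("utf-8")
    if kind == "array":
        # Sketch limitation: an empty payload is read as an empty array, so
        # an array holding one empty string does not round-trip.
        if payload == "":
            return []
        return [bytes.fromhex(p).decode("utf-8") for p in payload.split(",")]
    raise ValueError("unknown entry type: %r" % kind)


def to_argv(value: Value) -> List[str]:
    """Shell-form strings go through /bin/sh -c; arrays are used as-is."""
    return ["/bin/sh", "-c", value] if isinstance(value, str) else list(value)


if __name__ == "__main__":
    line = encode_entry(["/bin/echo", "hello world"])
    print(line)
    print(to_argv(decode_entry(line)))
```

Hex encoding means quotes, spaces, and commas in the original values never need escaping, which is the quoting problem the commit message mentions.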
-
David Johnson authored
-
David Johnson authored
-
- 02 Oct, 2018 1 commit
-
-
David Johnson authored
(Also link the dbus machine-id file to the one systemd will generate on the next boot. This seems safe and correct.) Certain things (like systemd's dhcp client) use the machine-id as a seed for derived values. For instance, systemd's dhcp client offers a ClientIdentifier in the new client style, and some servers will return the same address to *all* requesting clients, instead of returning only based on source MAC. Can't have any of that confusion.
-
- 26 Sep, 2018 1 commit
-
-
Leigh Stoller authored
-
- 04 Sep, 2018 1 commit
-
-
Kirk Webb authored
-
- 29 Aug, 2018 3 commits
-
-
Leigh Stoller authored
-
Leigh Stoller authored
tables from outer Emulab, use dumpuser/newuser since in a target system setup, we do not do any DB state transfer from the outer Emulab.
-
Leigh Stoller authored
-
- 24 Aug, 2018 2 commits
-
-
Dan Reading authored
-
Leigh Stoller authored
-
- 22 Aug, 2018 1 commit
-
-
Dan Reading authored
-
- 21 Aug, 2018 3 commits
-
-
Dan Reading authored
-
Dan Reading authored
-
Dan Reading authored
-
- 17 Aug, 2018 1 commit
-
-
Mike Hibler authored
Also add partial support for 11.2 MFS (just kernel right now, binaries are still 10.3).
-