1. 17 May, 2019 2 commits
  2. 13 May, 2019 1 commit
    • Mike Hibler's avatar
      Properly cleanup when exportSlice fails. · 638b8731
      Mike Hibler authored
      We were relying on the subsequent unexportSlice call to do the
      cleanup, but it lacks the necessary state to know what needs to
      be cleaned up. The result was leftover targets/target groups/extents.
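      The idea of cleaning up inside the failing call can be sketched as
      follows. This is only an illustration in Python, not the Emulab code:
      `export_slice`, `ExportError`, and the (name, create, destroy) step
      triples are hypothetical stand-ins for creating extents, targets, and
      target groups.

```python
class ExportError(Exception):
    """Raised when a (hypothetical) export step fails."""


def export_slice(steps):
    """Run (name, create, destroy) steps; undo completed ones on failure.

    `steps` is a hypothetical list of triples standing in for the real
    extent/target/target-group operations.
    """
    done = []  # destroy callbacks for the steps that actually succeeded
    try:
        for name, create, destroy in steps:
            create()
            done.append((name, destroy))
    except Exception as exc:
        # Only this function knows what was created, so it cleans up here
        # instead of relying on a later unexport call to guess.
        for _name, destroy in reversed(done):
            destroy()
        raise ExportError(f"export failed: {exc}") from exc
    return [name for name, _ in done]
```

      The point of the sketch is that the list of completed steps lives in
      the exporting function, which is exactly the state a separate
      unexport call would lack.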
  3. 26 Apr, 2019 2 commits
  4. 19 Apr, 2019 1 commit
    • David Johnson's avatar
      Better handle systemd-networkd chattiness in control net search. · 126ef78e
      David Johnson authored
      systemd-networkd and friends have become very chatty.  This commit
      is about turning down the noise.  It also removes PreferredLifetime=forever
      because it is no longer valid where it used to be, and cannot be used
      apparently in the DHCP case.  Seems that the default is now "forever"
      anyway, so it's now irrelevant to us.  (Older systemd-networkds would
      set the address lifetime to the advertised lease.)
      We also only mark an iface with CriticalConnection=yes once that iface
      has been chosen as the control net.  We used to just mark them all
      in the udev helper so that we didn't have to modify the generated
      config after successful detection, but now systemd-networkd complains
      about bringing down a searched-but-not-control-net interface if
      it is critical.  So, avoid that.
      Finally, I added `-q` to our invocation of systemd-networkd-wait-online,
      and increased the timeout with which we call it.  Timeout increase is
      because we would get spurious event loop disconnect messages without it;
      and `-q` to quiet it in other ways.  Ugh.
  5. 15 Apr, 2019 3 commits
    • Mike Hibler's avatar
      Initial steps to enable jumbo frames on experiment interfaces. · 33beb373
      Mike Hibler authored
      This is just mods to the tmcd "ifconfig" command to include an MTU= arg.
      Right now we don't have anything in the DB for MTU, so tmcd is just returning
      "MTU=" which says to not explicitly set the MTU.
      It also includes the basic client-side support which I have tested on a
      physical interface with MTU=1500. Further changes will be needed to DTRT
      on virtual interfaces and their physical carrier interface.
      But the hope is to get the client-side part nailed down before the next
      set of images are rolled, so that we will be ready when support for the
      front-side (UI and DB state) gets added.
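      A minimal sketch of how a client might interpret the new argument,
      assuming the reply is a space-separated list of KEY=VALUE tokens. The
      helper names and the sample lines are hypothetical; only the "MTU="
      token and its "do not set" meaning come from the message above.

```python
def parse_kv_line(line):
    """Split a space-separated KEY=VALUE line into a dict."""
    fields = {}
    for token in line.split():
        key, sep, value = token.partition("=")
        if sep:  # ignore tokens that have no "=" at all
            fields[key] = value
    return fields


def mtu_from_line(line):
    """Return the MTU as an int, or None when it should be left alone.

    An empty value ("MTU=") and a missing MTU token are both treated as
    "do not explicitly set the MTU".
    """
    value = parse_kv_line(line).get("MTU", "")
    return int(value) if value else None
```

      Under these assumptions, `mtu_from_line("IFACE=eth1 MTU=")` yields
      None and `mtu_from_line("IFACE=eth1 MTU=9000")` yields 9000.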
    • David Johnson's avatar
      Fix breakage to raw xmlrpc mode in 13ee8406. · 535c8d7a
      David Johnson authored
      (The hack to get "raw" xml mode from xmlrpclib is quite different than
      for m2crypto.  Basically, the response is parsed in the Transport, so
      not only do we need a special raw input method on the ServerProxy, but
      also a custom "raw" transport that skips the parser.)
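      The "raw" transport idea can be sketched with Python 3's
      `xmlrpc.client`, the successor to `xmlrpclib`. This is an assumption
      about the general shape, not the code from the commit: `RawTransport`
      is a hypothetical name, and gzip-encoded responses are not handled.

```python
import io
import xmlrpc.client


class RawTransport(xmlrpc.client.Transport):
    """Transport that returns the response body without parsing it."""

    def parse_response(self, response):
        # The stock Transport feeds the body to an XML parser here.
        # Returning the bytes untouched gives the caller the raw XML.
        return response.read()


# Any object with a read() method works for a quick check.
raw = RawTransport().parse_response(io.BytesIO(b"<methodResponse/>"))
```

      Because ServerProxy post-processes whatever the transport returns, a
      raw entry point on the proxy side is still needed, as the message
      notes.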
  6. 09 Apr, 2019 1 commit
    • Mike Hibler's avatar
      Hack-ish change to allow ixl0 as control net. · 2452ef19
      Mike Hibler authored
      Our old hack-ish heuristic would only consider 10Gb interfaces if there
      were no 1Gb interfaces. But the new Powder nodes have both and we want
      one of the 10Gb interfaces to be the control net.
  7. 03 Apr, 2019 1 commit
    • Leigh Stoller's avatar
      Watch for a bogus handshake; I saw this happen on one of the FEs, we did · 58e1192e
      Leigh Stoller authored
      a handshake even though capserver was not running. But the uid/gid
      values were totally bogus. So sanity check them, and if they look
      whacky, abort the handshake until the next time we wake up, to do it
      I've got no good theories as to how this happened. A bad theory is that
      maybe some transient startup process bound that socket for a while, but
      that seems incredibly unlikely.
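      The kind of sanity check described above might look like the sketch
      below. The bound `MAX_SANE_ID` and the function name are assumptions
      chosen for illustration; the commit does not say which values it
      treats as bogus.

```python
MAX_SANE_ID = 65535  # assumed upper bound, not taken from the commit


def handshake_ids_look_sane(uid, gid, max_id=MAX_SANE_ID):
    """Return True when both ids are integers within [0, max_id]."""
    return all(isinstance(v, int) and 0 <= v <= max_id for v in (uid, gid))
```

      A caller would skip the handshake when this returns False and try
      again on its next wakeup.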
  8. 02 Apr, 2019 1 commit
  9. 26 Mar, 2019 2 commits
  10. 19 Mar, 2019 2 commits
  11. 14 Mar, 2019 1 commit
  12. 06 Mar, 2019 1 commit
  13. 25 Feb, 2019 1 commit
  14. 21 Feb, 2019 1 commit
  15. 15 Feb, 2019 1 commit
    • Mike Hibler's avatar
      Make sure iscsid picks up change of initiator name. · 1c2d994c
      Mike Hibler authored
      Otherwise you get a lot of this action:
      WARNING: (iqn.1993-08.org.debian:01:5ad48b44316d): session reinstatement from different address
      The issue was that we would restart iscsid after changing the name, but if the
      iSCSI sessions were already open, the name change would not immediately take
      effect (til next reboot).
      This will happen if the open-iscsi and/or iscsid service is started at boot time
      prior to the Emulab blockstore config running. We explicitly turned these services
      off in Ubuntu 14, but not in 16 and above cuz that would involve interaction with
      systemd and some of us don't speak systemd. Anyway, this will work even if the
      services are enabled at boot.
  16. 29 Jan, 2019 6 commits
  17. 28 Jan, 2019 1 commit
  18. 11 Jan, 2019 1 commit
  19. 04 Jan, 2019 1 commit
  20. 03 Jan, 2019 1 commit
  21. 17 Dec, 2018 1 commit
  22. 11 Dec, 2018 2 commits
    • Leigh Stoller's avatar
      Changes for building/installing capture/console on control nodes: · fabd07a7
      Leigh Stoller authored
      * Makefile changes to build and install nossl versions of capture and
        console on a rack control node (or more generally, a physical node
        hosting boss/ops VMs that are not built on our XEN49 image).
      * Add -I (insecure) option to capture, that listens on localhost only.
      * Add systemd startup files for capture on ops and boss, I tested these
        on Ubuntu18.
      Basic instructions:
      * Clone the emulab-devel repo to the control node.
        git clone https://gitlab.flux.utah.edu/emulab/emulab-devel.git
      * On the control node, install the libssl devel code:
        sudo apt-get update
        sudo apt-get install libssl-dev
      * Configure and build capture. Note that the obj-clientside directory might
        already exist; if so, you can just rm -rf it first.
        control> cd ~elabman
        control> mkdir obj-clientside
        control> cd obj-clientside
        control> /path/to/emulab-devel/clientside/configure
        control> make rack-control
        control> sudo make rack-control-install
        control> (cd os/capture; sudo make rack-control-startup-install)
      * start capture.
        control> sudo systemctl daemon-reload
        control> sudo systemctl start capture-boss
        control> sudo systemctl start capture-ops
  23. 06 Dec, 2018 1 commit
    • Leigh Stoller's avatar
      Various fixes for ualloc switches: · cdcbedc7
      Leigh Stoller authored
      * Stop using the ALWAYSUP state machine for switches, this causes ISUP
        to always get sent, which in certain cases, results in stated
        rebooting the switch!
        Added new ONIE state machine, which handles the way switches actually
        boot into ONIE first and then does the bootinfo/grub dance, or does a
        reload or does admin mode.
      * Do not send PXEBOOTING from ONIE; this was a mistake, it throws us
        into the PXEKERNEL state machine, which sometimes results in stated
        rebooting the switch!
        We still use PXEWAIT (it is sent by bootinfod), since that is the
        "waiting" state that is wired into a lot of Emulab, it just happens to
        now be a state in the ONIE state machine, so it's legal.
      * Fix a bug in libossetup, that was fooling libossetup_switch into
        thinking the wrong thing.
      * Add some timeouts to the libosload_mlnx code, sshd sometimes refuses to
        answer after a failed login. Strange.
      * Fix a fork() problem in the switch reload code; gotta call exit, not
        return! This was wreaking subtle (okay not so subtle) havoc in
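      The fork() pitfall in the last item is general: a child that returns
      instead of exiting keeps running the parent's code. A minimal POSIX
      sketch in Python, where `run_in_child` is a hypothetical helper and
      not part of the reload code:

```python
import os


def run_in_child(work):
    """Run work() in a forked child and return the child's exit status."""
    pid = os.fork()
    if pid == 0:
        # Child: always leave via os._exit(). Returning from here would
        # let the child carry on executing the caller's code.
        status = 1
        try:
            work()
            status = 0
        finally:
            os._exit(status)
    # Parent: wait for the child and report how it exited.
    _, wait_status = os.waitpid(pid, 0)
    return os.WEXITSTATUS(wait_status)
```

      The `finally` clause guarantees the child exits even when work()
      raises, so it can never fall back into the parent's control flow.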
  24. 05 Dec, 2018 1 commit
  25. 29 Nov, 2018 1 commit
  26. 28 Nov, 2018 2 commits
    • Leigh Stoller's avatar
      Part of issue #472; move tip and client part of xmlrpc, into the · 33b207d3
      Leigh Stoller authored
      clientside subdir so they can be installed on nodes.
    • Mike Hibler's avatar
      Various Linux local blockstore changes: · 7bd23fb1
      Mike Hibler authored
      Most important: if a <2TB blockstore has an ext4 filesystem, make sure we
      create it without the 64bit and huge_file features. The former will make
      it impossible (currently) to take a snapshot since imagezip does not handle
      64-bit blocknumbers (working on it...)
      Don't stripe an LVM LV over more than 8 devices. Some of the Clemson nodes
      have 20+ disks and we won't buy much (and it might even be counterproductive)
      to try to stripe writes over all devices all the time.
      Still trying to get lvcreate to not prompt when one of the devices has an
      old metadata prompt. -Zy is supposed to prevent that, but it doesn't. Try
      adding -y as well.
      Not related: in the BEGIN block, don't cat $ETCDIR/genvmtype unless it
      actually exists. Not everything is a docker container ya know...
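      The two sizing rules above can be sketched as small helpers. The
      function names are hypothetical, and the 2TB threshold is assumed to
      mean binary terabytes; the `-O ^feature` form is mke2fs's syntax for
      turning a feature off.

```python
TWO_TB = 2 * 1024**4  # bytes; assumes "2TB" means binary terabytes
MAX_STRIPES = 8       # cap taken from the message above


def ext4_feature_args(size_bytes):
    """Return mke2fs arguments for a blockstore of the given size."""
    args = ["-t", "ext4"]
    if size_bytes < TWO_TB:
        # Disable 64bit and huge_file on small filesystems.
        args += ["-O", "^64bit,^huge_file"]
    return args


def stripe_count(num_devices):
    """Stripe over every device, but never over more than MAX_STRIPES."""
    return max(1, min(num_devices, MAX_STRIPES))
```

      With these assumptions, a node with 20 disks gets an 8-way stripe and
      a 1TB blockstore is created without the two features.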
  27. 08 Nov, 2018 1 commit