1. 06 Dec, 2018 1 commit
    • Various fixes for ualloc switches: · cdcbedc7
      Leigh Stoller authored
      * Stop using the ALWAYSUP state machine for switches. It causes ISUP
        to always get sent, which in certain cases results in stated
        rebooting the switch!
      
        Added a new ONIE state machine, which handles the way switches actually
        boot: into ONIE first, and then either the bootinfo/grub dance, a
        reload, or admin mode.
      
      * Do not send PXEBOOTING from ONIE. This was a mistake; it throws us
        into the PXEKERNEL state machine, which sometimes results in stated
        rebooting the switch!
      
        We still use PXEWAIT (it is sent by bootinfod), since that is the
        "waiting" state that is wired into a lot of Emulab. It just happens to
        now be a state in the ONIE state machine, so it's legal.
      
      * Fix a bug in libossetup that was fooling libossetup_switch into
        thinking the wrong thing.
      
      * Add some timeouts to the libosload_mlnx code; sshd sometimes refuses to
        answer after a failed login. Strange.
      
      * Fix a fork() problem in the switch reload code; gotta call exit, not
        return! This was wreaking subtle (okay not so subtle) havoc in
        libossetup.
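
      The fork() pitfall in the last item can be illustrated with a minimal
      Python sketch. This is not the Emulab code; the helper name
      run_in_child and the convention that work() returns an integer exit
      code are assumptions made for illustration. The point is only that the
      child must terminate with an exit call, because a plain return would
      let it carry on running the parent's code as a second copy.

```python
import os


def run_in_child(work):
    """Fork, run work() in the child, and return the child's exit code.

    Hypothetical helper: work() is assumed to return an integer exit code.
    """
    pid = os.fork()
    if pid == 0:
        # Child: it must terminate here. Returning instead would let the
        # child fall back into the caller and duplicate the parent's work.
        status = 1  # reported if work() raises
        try:
            status = int(work())
        finally:
            # os._exit ends the process immediately, skipping parent cleanup.
            os._exit(status)
    # Parent: wait for the child and report how it exited.
    _, wait_status = os.waitpid(pid, 0)
    if os.WIFEXITED(wait_status):
        return os.WEXITSTATUS(wait_status)
    return -1  # child was killed by a signal
```

      Only the parent ever returns from run_in_child, so the caller sees
      exactly one result per call.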
  2. 29 Nov, 2018 1 commit
  3. 28 Nov, 2018 1 commit
  4. 05 Nov, 2018 1 commit
    • Working Mellanox user alloc switch support (issue #445): · 95e7bded
      Leigh Stoller authored
      * The primary problem with the Mellanox is that the install image does a
        kexec out of ONIE into Linux, spends 30+ minutes doing stuff, and then
        reboots. This throws the reload state machine out of whack because we
        do not get a chance to send the RELOADDONE state. So ... some changes
        to rc.testbed and rc.reload on the USB dongle: the ONIE MFS sends
        RELOADING and writes a flag file to the ONIE partition on the
        "disk" (not the USB). Then comes the kexec into MLNX, the install
        happens, and the switch reboots. The next boot into ONIE sees the flag
        file, erases it, and sends RELOADDONE. It waits for a bit, and then
        continues on the normal path. This abuses stated in that there are
        whiny messages in the stated log file, but I am immune to stated
        whining.
      
      * Another item of note is that the switch DHCPs, but only to get the IP
        info; there is no ability to give it an initial config file like we
        can with the Dell switches. The main problem here is that the switch
        comes up with its default login/password, which is obviously well known
        because it's in the manual. That means there is a window where the
        switch is vulnerable, but since we block the switches from the public
        side, this is not a serious problem. As soon as we can get in (sshd is
        running) we log in and update the config with passwords, keys,
        etc.
      
      * Other changes to the machine dependent osload library module. I had
        done some of this before switching to the Dells way back when, but it
        needed to be updated/completed.
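
      The flag-file handshake from the first item can be sketched roughly as
      follows. This is a Python illustration, not the actual rc.testbed or
      rc.reload logic; the flag file name, the onie_boot function, the
      send_state callback, and the return values are all hypothetical. The
      "waits for a bit" step is left out.

```python
import os
import tempfile

FLAG_NAME = "reload_in_progress"  # hypothetical flag file name


def onie_boot(onie_partition, send_state, reload_requested):
    """Decide what a single ONIE boot reports, using a flag file that
    survives the installer's unannounced reboot.

    Hypothetical sketch: send_state stands in for whatever reports a state
    to stated; the return value names the next step.
    """
    flag = os.path.join(onie_partition, FLAG_NAME)
    if os.path.exists(flag):
        # The previous boot started an install, and the installer rebooted
        # before anything could report completion. Report it now.
        os.remove(flag)
        send_state("RELOADDONE")
        return "continue"
    if reload_requested:
        send_state("RELOADING")
        # The flag goes on the on-disk ONIE partition, not the USB dongle.
        with open(flag, "w") as f:
            f.write("reloading\n")
        return "kexec"
    return "continue"


if __name__ == "__main__":
    # Simulate two boots against a scratch directory.
    sent = []
    scratch = tempfile.mkdtemp()
    onie_boot(scratch, sent.append, True)   # first boot: start the reload
    onie_boot(scratch, sent.append, False)  # next boot: flag found
    print(sent)
```

      Keeping the state in a file on the ONIE partition is what lets the
      second boot finish a transition that the installer itself never
      reports.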
  5. 11 Apr, 2018 1 commit