    • Rationalize execute services: · 241b3ab5
      Leigh B Stoller authored
      This has been bothering me for a while: a single execute service
      operates like a traditional Emulab startup command, with output going to
      /proj/pid/exp/eid/logs, which is a pain in the ass because I don't know
      the name of the file and there are 1000s of files in the directory.
      But when there are multiple execute services, we wrap them in a script
      and redirect the output to an easy-to-find spot: /var/tmp. Much
      Now we always wrap them up in a script so the output files go to
      /var/tmp. And drop a note in the original file that says where to go
      The script is written to /proj/pid/exp/eid, but thinking to the future
      when there are no NFS mounts at all, we now bundle that script into a
      little tarball and append that to the install services. Tarballs already
      handle a no-NFS world, asking the web server for the file. QED
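      The wrapping described above can be sketched roughly as follows. This is
      a minimal illustration, not the actual Emulab code: the function name
      build_wrapper, the execute-N.log file names, and the wording of the
      note are all assumptions made up for the example. It only generates the
      text of a wrapper script; it does not run anything or build the tarball.

```python
import shlex


def build_wrapper(commands, outdir="/var/tmp", prefix="execute"):
    """Return the text of a /bin/sh script that runs each command in order.

    Hypothetical sketch: the log file naming (prefix-N.log) is an
    assumption for illustration, not what Emulab actually uses.
    """
    lines = ["#!/bin/sh"]
    # The wrapper's own stdout still goes to the original log file,
    # so leave a pointer there saying where the real output lives.
    lines.append("echo " + shlex.quote(f"Output is in {outdir}/{prefix}-*.log"))
    for i, cmd in enumerate(commands):
        log = f"{outdir}/{prefix}-{i}.log"
        # Run each command through sh -c, with stdout and stderr
        # redirected to its own log file under outdir.
        lines.append(f"sh -c {shlex.quote(cmd)} > {shlex.quote(log)} 2>&1")
    return "\n".join(lines) + "\n"
```

      Calling print(build_wrapper(["echo hi", "make test"])) shows the
      generated script; a single command and several commands take the same
      path, which is the point of always wrapping.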
    • Various fixes for ualloc switches: · cdcbedc7
      Leigh B Stoller authored
      * Stop using the ALWAYSUP state machine for switches; it causes ISUP
        to always get sent, which in certain cases results in stated
        rebooting the switch!
        Added a new ONIE state machine, which handles the way switches actually
        boot: into ONIE first, and then either the bootinfo/grub dance, a
        reload, or admin mode.
      * Do not send PXEBOOTING from ONIE; this was a mistake. It throws us
        into the PXEKERNEL state machine, which sometimes results in stated
        rebooting the switch!
        We still use PXEWAIT (it is sent by bootinfod), since that is the
        "waiting" state that is wired into a lot of Emulab; it just happens to
        now be a state in the ONIE state machine, so it's legal.
      * Fix a bug in libossetup that was fooling libossetup_switch into
        thinking the wrong thing.
      * Add some timeouts to the libosload_mlnx code; sshd sometimes refuses to
        answer after a failed login. Strange.
      * Fix a fork() problem in the switch reload code; gotta call exit, not
        return! This was wreaking subtle (okay not so subtle) havoc in