1. 03 Apr, 2019 1 commit
  2. 11 Mar, 2019 1 commit
  3. 19 Feb, 2019 1 commit
  4. 14 Feb, 2019 1 commit
  5. 01 Feb, 2019 1 commit
  6. 28 Jan, 2019 1 commit
  7. 22 Jan, 2019 1 commit
  8. 06 Dec, 2018 1 commit
    • Various fixes for ualloc switches: · cdcbedc7
      Leigh Stoller authored
      * Stop using the ALWAYSUP state machine for switches; this causes ISUP
        to always get sent, which in certain cases results in stated
        rebooting the switch!
      
        Added a new ONIE state machine, which handles the way switches
        actually boot: into ONIE first, and then on to the bootinfo/grub
        dance, a reload, or admin mode.
      
      * Do not send PXEBOOTING from ONIE; this was a mistake. It throws us
        into the PXEKERNEL state machine, which sometimes results in stated
        rebooting the switch!
      
        We still use PXEWAIT (it is sent by bootinfod), since that is the
        "waiting" state that is wired into a lot of Emulab; it just happens to
        now be a state in the ONIE state machine, so it's legal.
      
      * Fix a bug in libossetup that was fooling libossetup_switch into
        thinking the wrong thing.
      
      * Add some timeouts to the libosload_mlnx code; sshd sometimes refuses
        to answer after a failed login. Strange.
      
      * Fix a fork() problem in the switch reload code; gotta call exit, not
        return! This was wreaking subtle (okay not so subtle) havoc in
        libossetup.
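
The fork() fix in the last item can be illustrated with a minimal sketch. It is in Python rather than the project's own code, and `run_in_child` is a hypothetical helper, not anything from libossetup. The point it shows is that the child must terminate with an exit call; if it returned instead, two processes would continue through the caller's remaining logic.

```python
import os
import sys


def run_in_child(work):
    """Run work() in a forked child and return the child's exit code.

    Hypothetical helper for illustration only; work() is assumed to
    return a small integer exit code (or None for success).
    """
    sys.stdout.flush()  # avoid duplicating buffered output in the child
    pid = os.fork()
    if pid == 0:
        # Child: must exit here, never return. Returning would leave a
        # second copy of the process running the parent's code path.
        try:
            code = int(work() or 0)
        except Exception:
            code = 1
        os._exit(code)
    # Parent: wait for the child and decode its exit status.
    _, status = os.waitpid(pid, 0)
    return os.WEXITSTATUS(status) if os.WIFEXITED(status) else -1
```

`os._exit` is used in the child so that no cleanup handlers inherited from the parent run a second time.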
  9. 30 Nov, 2018 1 commit
  10. 28 Nov, 2018 1 commit
  11. 08 Nov, 2018 1 commit
  12. 01 Oct, 2018 1 commit
    • More work on the aggregate monitoring. · 9f3205c9
      Leigh Stoller authored
      1. Split the resource stuff (where we ask for an advertisement and
         process it) into a separate script, since that takes a long time to
         cycle through because of the size of the ads from the big clusters.
      
      2. On the monitor, distinguish offline (nologins) from actually being
         down.
      
      3. Add a table to store changes in status so we can see over time how
         much time the aggregates are usable.
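
Items 2 and 3 can be sketched roughly as follows: given a log of status changes, add up how long an aggregate spent in each status and derive the usable fraction. The status names, the tuple layout, and both function names are assumptions made for illustration; the commit does not describe the actual table schema.

```python
from datetime import datetime, timedelta

# Hypothetical status values; "offline" (nologins) is kept distinct
# from "down", as item 2 describes.
UP, OFFLINE, DOWN = "up", "offline", "down"


def time_in_status(changes, end):
    """Total time spent in each status.

    changes: list of (timestamp, status) sorted by time; each status is
    assumed to hold until the next change, and the last one until `end`.
    """
    totals = {}
    boundaries = changes[1:] + [(end, None)]
    for (start, status), (stop, _) in zip(changes, boundaries):
        totals[status] = totals.get(status, timedelta()) + (stop - start)
    return totals


def usable_fraction(changes, end):
    """Fraction of the observed window in which the aggregate was up."""
    totals = time_in_status(changes, end)
    whole = sum(totals.values(), timedelta())
    if not whole:
        return 0.0
    return totals.get(UP, timedelta()) / whole
```

Storing only the changes, rather than periodic samples, keeps the log small while still allowing the per-status totals to be recomputed for any window.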
  13. 15 Aug, 2018 1 commit
  14. 14 Aug, 2018 1 commit
  15. 12 Aug, 2018 1 commit
  16. 07 Aug, 2018 1 commit
  17. 30 Jul, 2018 3 commits
  18. 26 Jul, 2018 1 commit
  19. 20 Jul, 2018 1 commit
  20. 13 Jul, 2018 2 commits
  21. 09 Jul, 2018 1 commit
  22. 22 Jun, 2018 1 commit
  23. 18 Jun, 2018 1 commit
    • Add automated cancellation of reservations that are not used: · 67a0e58e
      Leigh Stoller authored
      * If unused at six hours, schedule for cancel in three hours and send
        email.
      
      * If reservation becomes used within those three hours, rescind the
        cancellation.
      
      * Add an override bit so that cancel/uncancel on the command line
        supersedes the automation (so an explicit cancel, or rescinding a
        cancel, means no more automated checks for unused are made).
      
      * Rework cancel to be more library friendly.
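
The cancellation rules above can be sketched as a small decision function. This is an illustration under assumptions, not the actual reservation code: the `Reservation` fields, the function names, and the reading of "six hours" as six hours after the reservation starts are all hypothetical.

```python
from dataclasses import dataclass
from datetime import datetime, timedelta
from typing import Optional

UNUSED_GRACE = timedelta(hours=6)   # unused this long: schedule a cancel
CANCEL_DELAY = timedelta(hours=3)   # the cancel fires this much later


@dataclass
class Reservation:
    start: datetime
    used: bool = False
    cancel_at: Optional[datetime] = None
    override: bool = False  # set by an explicit cancel/uncancel


def check_unused(res, now):
    """One automated pass; returns "schedule", "rescind", or None."""
    if res.override:
        return None  # a manual decision supersedes the automation
    if res.used:
        if res.cancel_at is not None:
            res.cancel_at = None
            return "rescind"
        return None
    if res.cancel_at is None and now - res.start >= UNUSED_GRACE:
        res.cancel_at = now + CANCEL_DELAY
        return "schedule"  # the caller would also send the email
    return None


def manual_cancel(res, when):
    res.cancel_at = when
    res.override = True


def manual_uncancel(res):
    res.cancel_at = None
    res.override = True
```

Returning an action string instead of sending mail directly keeps the check usable from a library, in the spirit of the last item.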
  24. 08 Jun, 2018 1 commit
  25. 30 May, 2018 1 commit
  26. 25 May, 2018 1 commit
  27. 23 May, 2018 2 commits
  28. 15 May, 2018 1 commit
  29. 11 Apr, 2018 3 commits
  30. 10 Apr, 2018 1 commit
  31. 03 Apr, 2018 2 commits
  32. 29 Mar, 2018 1 commit
    • Reservations system changes: · df90d7a7
      Leigh Stoller authored
      1) Rework so that instead of relying on swapin_last + autoswap timeout,
         set expt_expires for classic experiments at the beginning of swapin
         time. This is because swapin_last is not set till the end of swapin,
         and so during swapin the reservation system is in an inconsistent
         state since there is no way to determine when the experiment ends.
      
      2) On the Geni path, simplify expiration handling; do not allow a slice
         modification and expiration change at the same time. The bookkeeping
         and failure rollback is a pain, especially wrt the reservation
         system, and this rarely ever actually happens, so get rid of a lot
         of complication.
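
The difference described in item 1 can be sketched as below. Both function names are hypothetical and the timeout is passed as a `timedelta` so that no assumption is made about its stored unit; only `swapin_last` and `expt_expires` are names taken from the commit message.

```python
from datetime import datetime, timedelta
from typing import Optional


def expiry_from_swapin_last(swapin_last: Optional[datetime],
                            autoswap_timeout: timedelta) -> Optional[datetime]:
    """Old approach: unknown while swapin is still in progress, because
    swapin_last is only set when swapin finishes."""
    if swapin_last is None:
        return None
    return swapin_last + autoswap_timeout


def expt_expires_at_swapin_start(swapin_start: datetime,
                                 autoswap_timeout: timedelta) -> datetime:
    """New approach: fix the expiration when swapin begins, so an end
    time is always available to the reservation system."""
    return swapin_start + autoswap_timeout
```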
  33. 09 Mar, 2018 1 commit