  1. 01 Oct, 2018 1 commit
    • More work on the aggregate monitoring. · 9f3205c9
      Leigh Stoller authored
      1. Split the resource stuff (where we ask for an advertisement and
         process it) into a separate script, since that takes a long time to
         cycle through because of the size of the ads from the big clusters.
      
      2. On the monitor, distinguish offline (nologins) from actually being
         down.
      
      3. Add a table to store changes in status so we can see over time how
         much time the aggregates are usable.
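A status-change table like the one in item 3 can be turned into a "time usable" figure by summing the intervals between consecutive changes. The following is only a minimal sketch of that idea, not the code from this commit: the status names (`"up"`, `"offline"`, `"down"`), the `(timestamp, status)` row shape, and the function name `usable_fraction` are all assumptions made for illustration.

```python
from datetime import datetime, timedelta


def usable_fraction(changes, window_start, window_end):
    """Fraction of [window_start, window_end) during which status was "up".

    `changes` is a hypothetical list of (timestamp, status) rows sorted by
    timestamp; each status is assumed to hold until the next row.
    """
    total = (window_end - window_start).total_seconds()
    if total <= 0:
        raise ValueError("window_end must be after window_start")

    up_seconds = 0.0
    for i, (stamp, status) in enumerate(changes):
        # The status lasts until the next change, or until the window closes.
        next_stamp = changes[i + 1][0] if i + 1 < len(changes) else window_end
        begin = max(stamp, window_start)
        end = min(next_stamp, window_end)
        # Only "up" counts as usable; "offline" (nologins) and "down" do not.
        if status == "up" and end > begin:
            up_seconds += (end - begin).total_seconds()
    return up_seconds / total


t0 = datetime(2018, 10, 1)
rows = [
    (t0, "up"),
    (t0 + timedelta(hours=6), "offline"),
    (t0 + timedelta(hours=8), "down"),
    (t0 + timedelta(hours=12), "up"),
]
print(usable_fraction(rows, t0, t0 + timedelta(hours=24)))
```

Keeping "offline" and "down" as separate statuses, as item 2 describes, means the same rows could also report how much of the unusable time was an administrative choice rather than a failure.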
  2. 15 Aug, 2018 1 commit
  3. 14 Aug, 2018 1 commit
  4. 12 Aug, 2018 1 commit
  5. 07 Aug, 2018 1 commit
  6. 30 Jul, 2018 3 commits
  7. 26 Jul, 2018 1 commit
  8. 20 Jul, 2018 1 commit
  9. 13 Jul, 2018 2 commits
  10. 09 Jul, 2018 1 commit
  11. 22 Jun, 2018 1 commit
  12. 18 Jun, 2018 1 commit
    • Add automated cancellation of reservations that are not used: · 67a0e58e
      Leigh Stoller authored
      * If unused at six hours, schedule for cancel in three hours and send
        email.
      
      * If reservation becomes used within those three hours, rescind the
        cancellation.
      
      * Add an override bit so that cancel/uncancel on the command line
        supersedes (so an explicit cancel, or rescinding a cancel, means do
        not make any more automated checks for unused).
      
      * Rework cancel to be more library friendly.
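The rules in the commit above can be read as a small state machine. The sketch below is one hedged interpretation, not the actual implementation: the `Reservation` fields, the function names, and the assumption that "unused at six hours" is measured from the reservation start are all hypothetical.

```python
from dataclasses import dataclass
from datetime import datetime, timedelta
from typing import Optional

UNUSED_THRESHOLD = timedelta(hours=6)  # "if unused at six hours"
CANCEL_DELAY = timedelta(hours=3)      # "schedule for cancel in three hours"


@dataclass
class Reservation:
    start: datetime
    used: bool = False
    cancel_at: Optional[datetime] = None
    override: bool = False  # set by an explicit command-line cancel/uncancel


def check_unused(res: Reservation, now: datetime) -> str:
    """Run one automated check; return "schedule", "rescind", or "none"."""
    if res.override:
        # An explicit cancel/uncancel supersedes the automated checks.
        return "none"
    if res.used:
        if res.cancel_at is not None:
            res.cancel_at = None
            return "rescind"
        return "none"
    if res.cancel_at is None and now - res.start >= UNUSED_THRESHOLD:
        res.cancel_at = now + CANCEL_DELAY
        return "schedule"  # the caller would also send the email here
    return "none"


def manual_cancel(res: Reservation, when: datetime) -> None:
    res.cancel_at = when
    res.override = True


def manual_uncancel(res: Reservation) -> None:
    res.cancel_at = None
    res.override = True
```

Returning an action string rather than sending email directly keeps the check free of side effects beyond the reservation record, which is one way to make such logic "library friendly" in the sense of the last bullet.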
  13. 08 Jun, 2018 1 commit
  14. 30 May, 2018 1 commit
  15. 25 May, 2018 1 commit
  16. 23 May, 2018 2 commits
  17. 15 May, 2018 1 commit
  18. 11 Apr, 2018 3 commits
  19. 10 Apr, 2018 1 commit
  20. 03 Apr, 2018 2 commits
  21. 29 Mar, 2018 1 commit
    • Reservations system changes: · df90d7a7
      Leigh Stoller authored
      1) Rework so that instead of relying on swapin_last + autoswap timeout,
         set expt_expires for classic experiments at the beginning of swapin
         time. This is because swapin_last is not set until the end of swapin,
         so during swapin the reservation system is in an inconsistent state:
         there is no way to determine when the experiment ends.
      
      2) On the Geni path, simplify expiration handling; do not allow a slice
         modification and expiration change at the same time. The bookkeeping
         and failure rollback are a pain, especially with respect to the
         reservation system, and this rarely ever actually happens, so get
         rid of a lot of complication.
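The gap described in item 1 can be illustrated with a toy model. This is a sketch under stated assumptions only: the dictionary keys mirror the field names mentioned in the commit message, but the record shape, both function names, and the rule that the expiration equals the swapin start plus the autoswap timeout are guesses for illustration, not the real schema or code.

```python
from datetime import datetime, timedelta


def expiration_from_swapin_last(expt):
    """Older approach: derive the end time from swapin_last.

    swapin_last is unset until swapin finishes, so during swapin there is
    no answer to give.
    """
    if expt.get("swapin_last") is None:
        return None
    return expt["swapin_last"] + expt["autoswap_timeout"]


def begin_swapin(expt, now):
    """Newer approach: record expt_expires as soon as swapin begins."""
    expt["expt_expires"] = now + expt["autoswap_timeout"]
    return expt["expt_expires"]


# Hypothetical experiment record that is in the middle of swapping in.
expt = {"swapin_last": None, "autoswap_timeout": timedelta(hours=16)}
print(expiration_from_swapin_last(expt))             # no end time is known yet
print(begin_swapin(expt, datetime(2018, 3, 29, 12)))  # end time known at once
```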
  22. 09 Mar, 2018 2 commits
  23. 08 Mar, 2018 1 commit
  24. 01 Mar, 2018 1 commit
  25. 21 Feb, 2018 1 commit
  26. 31 Jan, 2018 1 commit
  27. 01 Jan, 2018 1 commit
  28. 26 Dec, 2017 1 commit
    • Adjust another stated timeout for the new HPs: RELOAD/SHUTDOWN. · 6cc159aa
      Mike Hibler authored
      Note that node_type_attributes.bios_waittime could be used to
      dynamically adjust the stated timeout, but I don't want to embed
      semantics of a particular state in stated, so we would have to
      have some more general mechanism to tell stated to adjust the
      timeout value based on a database field.
  29. 23 Dec, 2017 2 commits
  30. 13 Dec, 2017 1 commit
  31. 12 Dec, 2017 1 commit