1. 02 Apr, 2019 1 commit
  2. 07 Aug, 2018 1 commit
  3. 04 Jun, 2018 1 commit
  4. 01 Jun, 2018 1 commit
  5. 31 May, 2018 2 commits
  6. 30 May, 2018 1 commit
    • Several backend/RPC changes for reservations: · 8266ae51
      Leigh Stoller authored
      1. Return current set of reservations (if any) for a user when getting
         the max extension (piggybacking on the call to reduce overhead).
      
      2. Add RPC to get the reservation history for a user (all past
         reservations that were approved).
      
         Aside: the reservation_history table was not being updated properly.
         Only expired reservations were saved, not deleted (but used)
         reservations, so we lost a lot of history. We could regenerate some
         of it from the history tables I added at the Portal for Dmitry, but
         I am not sure it is worth the trouble.
      
      3. The main content of this commit is that for both of the lists
         above, we also return the experiment usage history for the project
         and the user who created the reservation. This takes the form of a
         timeline of allocation changes so that we can graph node usage
         against the reservation bounds, to show graphically how well utilized
         the reservation is.
      8266ae51
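A minimal sketch of the usage timeline described in item 3, assuming the history arrives as (timestamp, delta) allocation changes. The `Reservation` fields and both function names are hypothetical and do not come from the commit:

```python
from dataclasses import dataclass


@dataclass
class Reservation:
    # Hypothetical fields: epoch seconds and a reserved node count.
    start: int
    end: int
    nodes: int


def usage_timeline(changes, reservation):
    """Turn (timestamp, delta) allocation changes into (timestamp, nodes_in_use)
    points that span exactly the reservation window."""
    ordered = sorted(changes)
    in_use = 0
    # Changes at or before the start only set the initial level.
    for stamp, delta in ordered:
        if stamp > reservation.start:
            break
        in_use += delta
    points = [(reservation.start, in_use)]
    # Each change strictly inside the window becomes a step in the graph.
    for stamp, delta in ordered:
        if reservation.start < stamp < reservation.end:
            in_use += delta
            points.append((stamp, in_use))
    points.append((reservation.end, in_use))
    return points


def utilization(points, reservation):
    """Fraction of reserved node-seconds actually used, capped at the
    reserved node count."""
    used = 0
    for (t0, n), (t1, _) in zip(points, points[1:]):
        used += min(n, reservation.nodes) * (t1 - t0)
    return used / (reservation.nodes * (reservation.end - reservation.start))
```

The points can be plotted as a step function against the reservation bounds; `utilization` reduces the same data to a single number.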
  7. 20 Apr, 2018 1 commit
    • Couple of small reservation system changes: · 636cea07
      Leigh Stoller authored
      1. Show the reason on the listing page, as a popover in the last
         column. Makes it easier to approve directly from the listing page if
         you can see the reason.
      
      2. Add optional approval message to pass along to the user.
      636cea07
  8. 16 Feb, 2018 2 commits
    • A lot of work on the RPC code, among other things. · 56f6d601
      Leigh Stoller authored
      I spent a fair amount of time improving error handling along the RPC
      path, as well as making the code more consistent across the various
      files. Also be more consistent in how the web interface invokes the
      backend and gets errors back, specifically for errors that are generated
      when talking to a remote cluster.
      
      Add checks before every RPC to make sure the cluster is not disabled in
      the database. Also check that we can actually reach the cluster, and
      that the cluster is not offline (NoLogins()) before we try to do
      anything. I might have to relax this a bit, but in general it takes a
      couple of seconds to check, which is a small fraction of what most RPCs
      take. Return precise errors for clusters that are not available to the
      web interface and show them to the user.
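The pre-RPC checks in the paragraph above can be sketched roughly as follows. This is an illustration only: the `Cluster` fields, `check_cluster`, and `call_rpc` are hypothetical names, and `no_logins` merely stands in for the NoLogins() check mentioned in the commit.

```python
from dataclasses import dataclass
from typing import Callable, Optional


@dataclass
class Cluster:
    name: str
    disabled: bool                 # hypothetical flag read from the database
    ping: Callable[[], bool]       # hypothetical reachability probe
    no_logins: Callable[[], bool]  # stands in for the NoLogins() check


def check_cluster(cluster: Cluster) -> Optional[str]:
    """Return a precise, user-presentable error, or None if the RPC may proceed."""
    if cluster.disabled:
        return f"{cluster.name} is disabled"
    if not cluster.ping():
        return f"{cluster.name} is not reachable"
    if cluster.no_logins():
        return f"{cluster.name} is offline"
    return None


def call_rpc(cluster: Cluster, rpc: Callable[[], object]) -> dict:
    """Run the checks before the RPC; skip the RPC entirely on failure."""
    error = check_cluster(cluster)
    if error is not None:
        return {"ok": False, "error": error}
    return {"ok": True, "value": rpc()}
```

Checking the database flag first keeps the cheapest test ahead of the network probes.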
      
      Use webtasks more consistently between the web interface and backend
      scripts. Watch specifically for scripts that exit abnormally (exit
      before setting the exitcode in the webtask), which always means an
      internal failure; do not show those to users.
      
      Show just those RPC errors that would make sense to users. Stop spewing
      script output to the user; send it just to tbops via the email that is
      already generated when a backend script fails fatally.
      
      But do not spew email for clusters that are not reachable or are
      offline. Ditto for several other cases that were generating mail to
      tbops instead of just showing the user a meaningful error message.
      
      Stop using ParRun for single-site experiments, which are 99% of experiments.
      
      For create_instance, add a new "async" mode that tells CreateSliver() to
      return before the first mapper run, which typically happens very quickly.
      Then watch for errors or for the manifest with Resolve or for the slice
      to disappear. I expect this to be bounded and so we do not need to worry
      so much about timing this wait out (which is a problem on very big
      topologies). When we see the manifest, the RedeemTicket() part of the
      CreateSliver is done and now we are into the StartSliver() phase.
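A rough sketch of the wait loop this paragraph describes, with no timeout since the wait is expected to be bounded. All three callables and the function name are hypothetical stand-ins (for example, `resolve` stands in for a Resolve call that returns the manifest or None):

```python
import time


def wait_for_manifest(resolve, slice_exists, last_error,
                      poll_interval=1.0, sleep=time.sleep):
    """Poll until one of the three outcomes occurs and return
    (outcome, payload), where outcome is "error", "manifest", or "gone"."""
    while True:
        error = last_error()
        if error:
            return ("error", error)
        manifest = resolve()
        if manifest is not None:
            # The RedeemTicket() part is done; StartSliver() comes next.
            return ("manifest", manifest)
        if not slice_exists():
            return ("gone", None)
        sleep(poll_interval)
```

Passing `sleep` as a parameter is only a convenience so the loop can be exercised without real delays.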
      
      For the StartSliver phase, watch for errors and show them to users.
      Previously we mostly lost those errors and just sent the experiment into
      the failed state. I am still working on this.
      56f6d601
  9. 07 Feb, 2018 1 commit
  10. 25 Jan, 2018 1 commit
  11. 06 Dec, 2017 2 commits
  12. 19 Nov, 2017 1 commit
  13. 07 Nov, 2017 1 commit
  14. 27 Oct, 2017 1 commit
  15. 23 Oct, 2017 1 commit
  16. 14 Sep, 2017 2 commits
  17. 18 Aug, 2017 1 commit
  18. 08 Aug, 2017 3 commits
    • Reservation announcements get lower priority so they display after system announcements. · 3edade48
      Leigh Stoller authored
      3edade48
    • 7bb7b252
    • Reservation system changes: · c7c93e9f
      Leigh Stoller authored
      1. Allow UUIDs to be used to specify reservations, and change pretty
         much everything in the web interface to use UUIDs so we stop
         exporting database indexes to the client side.
      
      2. Change RPC path to return a blob of data when approving a
         reservation. Ditto for initial creation, so that we can see precisely
         what the local cluster has done.
      
      3. When a reservation is created/approved, insert an announcement in the
         announce system for that user, set to go off 24 hours ahead of the
         reservation. Update that announcement when the reservation is modified.
      c7c93e9f
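As a small illustration of item 3, the announcement time can be derived from the reservation start. The function name is hypothetical, and clamping to the current time for reservations that start in less than 24 hours is an assumption, not something the commit states:

```python
from datetime import datetime, timedelta, timezone

ANNOUNCE_LEAD = timedelta(hours=24)


def announcement_time(reservation_start: datetime, now: datetime) -> datetime:
    """Show the announcement 24 hours ahead of the reservation, but never
    schedule it in the past (assumed behavior)."""
    return max(reservation_start - ANNOUNCE_LEAD, now)
```

Recomputing this value whenever the reservation is modified keeps the announcement in step with the new start time.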
  19. 23 May, 2017 1 commit
  20. 17 Apr, 2017 1 commit
    • Attempt to operate in an admin mode for reservations · 188f041f
      Leigh Stoller authored
      One reason the fast RPC path is fast is that we do not normally
      operate with credentials, but with reservations we have to, since we want
      the reservation creator to be a real user and of course the project has
      to exist. We need credentials for that. But when an admin is editing or
      creating a reservation in another project, we need the admin user to
      exist too, and we might need the project to be created. That requires
      different credentials. So in an attempt to deal more generally with the
      admin problem, export an entrypoint to create a user (the admin user)
      before trying to create a reservation. Not sure this is the best way to
      go, but it's one way to go.
      
      In general, I think we need a more explicit user/project management API
      for the Portal. Needs more thought.
      188f041f
  21. 23 Mar, 2017 1 commit
  22. 09 Mar, 2017 1 commit
  23. 06 Mar, 2017 2 commits
    • Minor improvements: · 7ad899a4
      Leigh Stoller authored
      1. Return REFUSED for an admission control violation.
      
      2. Treat REFUSED errors as a user error instead of a fatal error.
      
      3. Fix up confirmation modal to make it more clear that the reservation
         needs to be submitted.
      7ad899a4
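Items 1 and 2 amount to splitting failures into those the user can act on and those that are internal. A minimal sketch, assuming a hypothetical `RpcStatus` enum in which only REFUSED is taken from the commit message:

```python
from enum import Enum


class RpcStatus(Enum):
    # Only REFUSED comes from the commit; the other members are placeholders.
    SUCCESS = "SUCCESS"
    REFUSED = "REFUSED"
    ERROR = "ERROR"


def classify(status: RpcStatus) -> str:
    """Map an RPC status to how the web interface should treat it."""
    if status is RpcStatus.SUCCESS:
        return "ok"
    if status is RpcStatus.REFUSED:
        # Admission control said no: show the user a message, do not alarm ops.
        return "user_error"
    return "fatal"
```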
    • Two changes to reservations: · 5e7e613b
      Leigh Stoller authored
      1. Plumb through a prediction RPC to return the reservation system
         pressure and outstanding reservations for a list of projects. This is
         invoked from the instantiate page when loaded, using the projects
         the user has permission to create experiments in. The results are
         stored in a script global variable for someone else to make sense of.
      
      2. When checking to see if a reservation can be accommodated, check with
         the admission control library first to see if there is a project
         limit on the type that would be violated. Need to do a little
         rearranging of the deck chairs in the admission control library.
      5e7e613b
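The ordering in item 2 (admission control before the capacity check) might look like the sketch below. The function name, the `limits` mapping, and the `fits_schedule` callback are all hypothetical:

```python
def check_reservation(project, node_type, count, limits, fits_schedule):
    """Return (ok, reason). `limits` maps (project, node_type) to the maximum
    number of nodes of that type the project may hold; a missing key means
    no limit."""
    limit = limits.get((project, node_type))
    if limit is not None and count > limit:
        # Admission control is consulted first, so a limit violation is
        # reported even when the nodes would otherwise be free.
        return (False, "project limit exceeded")
    if not fits_schedule(node_type, count):
        return (False, "not enough free nodes")
    return (True, "")
```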
  24. 20 Feb, 2017 1 commit
  25. 17 Feb, 2017 1 commit
  26. 15 Feb, 2017 1 commit
  27. 17 Jan, 2017 1 commit
    • Various tweaks to reservation UI: · 29258b2c
      Leigh Stoller authored
      * Allow start to be optional; means "now".
      
      * When selecting the current day, disable hours in the past.
      
      * Catch a few more form errors.
      
      * When editing, the start time might be in the past. Do not consider
        that an error, just pass it through since the backend is okay with
        it.
      29258b2c
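The first two tweaks can be sketched as below. Both function names are hypothetical, and treating the current (partially elapsed) hour as still selectable is an assumption:

```python
from datetime import date, datetime
from typing import List, Optional


def resolve_start(start: Optional[datetime], now: datetime) -> datetime:
    """An omitted start time means "now"."""
    return now if start is None else start


def disabled_hours(selected_day: date, now: datetime) -> List[int]:
    """Hours to grey out in the picker: only fully elapsed hours on the
    current day are disabled; other days are unrestricted."""
    if selected_day == now.date():
        return list(range(now.hour))
    return []
```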
  28. 03 Jan, 2017 1 commit
  29. 28 Nov, 2016 1 commit
  30. 09 Nov, 2016 1 commit
  31. 03 Nov, 2016 1 commit