1. 10 Apr, 2018 1 commit
  2. 09 Apr, 2018 1 commit
  3. 03 Apr, 2018 1 commit
  4. 02 Apr, 2018 1 commit
  5. 30 Mar, 2018 1 commit
    • Support for frisbee direct image upload to fs node. · 99943a19
      Mike Hibler authored
      We have had issues with uploading images to boss where they are then written
      across NFS to ops. That seems to be a network hop too far on CloudLab Utah
      where we have a 10Gb control network. We get occasional transient timeouts
      from somewhere in the TCP code. With the convoluted path through real and
      virtual NICs, some with offloading, some without, packets wind up getting
      out of order and someone gets far enough behind to cause problems.
      
      So we work around it.
      
      If IMAGEUPLOADTOFS is defined in the defs-* file, we will run a frisbee
      master server on the fs (ops) node and the image creation path directs the
      nodes to use that server. There is a new hack configuration for the master
      server, "upload-only", which is extremely specific to ops: it validates the
      upload with the boss master server and, if allowed, fires up an upload
      server for the client to talk to. The image will thus be directly uploaded
      to the local (ZFS) /proj or /groups filesystems on ops. This seems to be
      enough to get around the problem.
      
      Note that we could allow this master server to serve downloads as well to
      avoid the analogous problem in that direction, but this to date has not
      been a problem.
      
      NOTE: the ops node must be in the nodes table in the DB or else boss will
      not validate proxied requests from it. The standard install procedure is
      supposed to add ops, but we have a couple of clusters where it is not in
      the table!
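The "upload-only" flow described in this commit message can be sketched roughly as below. This is only an illustration, not the frisbee master server code: `UploadRequest`, `handle_upload_only`, and the two callbacks are hypothetical names, and the path check against `/proj` and `/groups` is an assumption drawn from the filesystems the message mentions.

```python
from dataclasses import dataclass
from typing import Callable, Optional


@dataclass
class UploadRequest:
    node_id: str      # requesting client node (hypothetical field)
    image_path: str   # destination path for the uploaded image


def handle_upload_only(
    req: UploadRequest,
    validate_with_boss: Callable[[UploadRequest], bool],
    start_upload_server: Callable[[UploadRequest], int],
) -> Optional[int]:
    """Return the port of a newly started upload server, or None if refused."""
    # Assumption: only the local /proj and /groups filesystems are valid targets.
    if not req.image_path.startswith(("/proj/", "/groups/")):
        return None
    # Step 1: proxy the request to the boss master server for validation.
    if not validate_with_boss(req):
        return None
    # Step 2: if allowed, start an upload server for the client to talk to.
    return start_upload_server(req)


# Usage with stand-in callbacks (a real deployment would contact boss).
port = handle_upload_only(
    UploadRequest("pc1", "/proj/demo/images/test.ndz"),
    validate_with_boss=lambda r: True,
    start_upload_server=lambda r: 4000,
)
```

The point of the design is that the image data never crosses boss: boss only answers the validation query, while the bytes go straight from the node to ops.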
  6. 26 Mar, 2018 3 commits
  7. 02 Mar, 2018 2 commits
  8. 22 Feb, 2018 1 commit
  9. 21 Feb, 2018 4 commits
  10. 20 Feb, 2018 1 commit
  11. 12 Feb, 2018 1 commit
  12. 03 Feb, 2018 1 commit
  13. 30 Jan, 2018 2 commits
  14. 25 Jan, 2018 1 commit
  15. 12 Jan, 2018 1 commit
  16. 09 Jan, 2018 1 commit
    • Yet another layer of backward compat... · afa1569d
      Mike Hibler authored
      If we support provenance but not deltas, then we do not use the
      newer create-versioned-image when creating images from Xen vnodes.
      However, we had a bug in that path where we would then not pass the
      imageid argument to the old script, resulting in us spewing the image
      out to stdout which got put in the logfile.
  17. 01 Jan, 2018 1 commit
  18. 23 Dec, 2017 1 commit
  19. 14 Dec, 2017 1 commit
  20. 06 Dec, 2017 1 commit
  21. 27 Nov, 2017 1 commit
  22. 21 Nov, 2017 2 commits
  23. 20 Nov, 2017 1 commit
  24. 19 Nov, 2017 6 commits
  25. 17 Nov, 2017 2 commits
  26. 09 Nov, 2017 1 commit
    • Introduce a "failed" state for resource allocation. · 7e13f79b
      Mike Hibler authored
      If a background resource allocation fails, we put the lease in the "failed"
      state instead of destroying it. There were some ripple effects, specifically,
      the lease_daemon now checks for "failed" leases and sends messages to us at
      the same frequency as for "unapproved" leases. The correct response here is
      almost certainly to destroy the lease, though you can put it back in the
      "unapproved" state (via modlease) and try to approve it to see what happened.
      
      Also add background mode to approvelease since it can do time consuming
      resource allocation.
      
      Nit: clean up logfiles used in background operation.
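The lease state handling described in this commit message can be illustrated with a small state sketch. It is a simplified model, not the actual lease code: only "unapproved" and "failed" come from the message, while `APPROVED`, the function names, and the `allocate` callback are hypothetical.

```python
from enum import Enum
from typing import Callable


class LeaseState(Enum):
    UNAPPROVED = "unapproved"
    FAILED = "failed"
    APPROVED = "approved"  # hypothetical name for the successful state


def approve(state: LeaseState, allocate: Callable[[], bool]) -> LeaseState:
    """Try to approve a lease; a failed allocation yields FAILED, not removal."""
    if state is not LeaseState.UNAPPROVED:
        raise ValueError(f"cannot approve a lease in state {state.value!r}")
    return LeaseState.APPROVED if allocate() else LeaseState.FAILED


def reset_to_unapproved(state: LeaseState) -> LeaseState:
    """Model of moving a failed lease back to 'unapproved' (as via modlease)."""
    if state is not LeaseState.FAILED:
        raise ValueError("only failed leases are reset in this sketch")
    return LeaseState.UNAPPROVED


def needs_admin_notice(state: LeaseState) -> bool:
    """The daemon reports 'failed' leases just as it does 'unapproved' ones."""
    return state in (LeaseState.UNAPPROVED, LeaseState.FAILED)


# Usage: an allocation that fails leaves the lease around for inspection.
state = approve(LeaseState.UNAPPROVED, allocate=lambda: False)
retry = reset_to_unapproved(state)
```

Keeping the lease in a "failed" state rather than destroying it preserves the record, so the approval can be retried to see what went wrong.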