1. 09 May, 2018 1 commit
  2. 08 May, 2018 6 commits
    • Fix a nasty docker/mkvnode.pl race inspired by bootvnodes/vnodesetup. · e468cc49
      David Johnson authored
      This is probably true for Xen too: in some cases, the 30-second
      timeout in vnodesetup's early-release hackwaitandexit causes a race
      condition.  Normally, the first vnode sets up significant network
      state, and sometimes flips MAC addresses around from interface to
      interface -- OR puts a physical interface into a bridge, then changes
      the bridge's MAC address.  There is a short window of time where both
      the bridge and the new member interface share a MAC address -- and if
      the tmcc ifconfig assembly process for a vnode following the first
      vnode resolves the wrong device's MAC address and uses that to flesh
      out the ifconfig info, that vnodesetup will be in a world of hurt
      (e.g., you might see an attempt to make a vlan device out of a vlan
      device).  The chance of this happening is minuscule, but I've seen it.
      
      So, at least for docker for now, we protect the first vnode against
      the 30-second timeout in vnodesetup hackwaitandexit, and instead wait
      for the actual running file to be written, or for an error.
      
      This is probably applicable to any Linux mkvnode.pl path, but I suppose
      it would have been another hundred thousand vnode creates before I saw
      it again.
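A minimal Python sketch of the "wait for the running file, or an error" idea described above. This is not the mkvnode.pl code: the function name, the marker-file paths, and the use of a separate error marker file are all assumptions made for illustration.

```python
import os
import time

def wait_for_vnode(running_path, error_path, poll_interval=0.5, timeout=None):
    """Poll until a marker file shows up.

    Returns "running" or "error". Returns "timeout" only when a timeout
    was given and it expired. timeout=None waits indefinitely, which is
    the behaviour sketched here for the first vnode (hypothetical API).
    """
    deadline = None if timeout is None else time.monotonic() + timeout
    while True:
        # Check the error marker first so a failure is never masked.
        if os.path.exists(error_path):
            return "error"
        if os.path.exists(running_path):
            return "running"
        if deadline is not None and time.monotonic() >= deadline:
            return "timeout"
        time.sleep(poll_interval)
```

Under these assumptions, the first vnode would be waited on with timeout=None, while later vnodes could keep a finite early-release timeout.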
    • Do not fail iptables rules gen on name resolution failure. · a9827417
      David Johnson authored
      Under high load, of course we can have DNS problems.  However, Perl
      seems to get stuck on retry; it's as if the negative response gets
      cached (which would be extremely odd, but can't argue with the evidence).
      
      Anyway, if resolution continues to fail, give up and feed the name to
      iptables, and let it try :).
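The fallback described above could look roughly like the following Python sketch. The original is Perl; the function name, retry count, and delay here are assumptions, not values from the commit.

```python
import socket
import time

def resolve_or_passthrough(name, attempts=3, delay=1.0,
                           resolver=socket.gethostbyname):
    """Try to resolve `name`; if every attempt fails, return it unchanged.

    The caller can then hand the unresolved name to iptables and let
    iptables attempt the lookup itself. `attempts` and `delay` are
    illustrative defaults only.
    """
    for attempt in range(attempts):
        try:
            return resolver(name)
        except OSError:  # socket.gaierror is a subclass of OSError
            if attempt + 1 < attempts:
                time.sleep(delay)
    return name
```

The point of the design is that a resolution failure degrades to "pass the name through" instead of failing rule generation outright.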
    • David Johnson's avatar
      533e1714
    • Minor debug message fix. · 7515dc50
      David Johnson authored
    • Do not run dijkstra while holding the global lock in docker clientside. · ee3694f4
      David Johnson authored
      (All we need to do while holding the global lock is allocate IFBs; the
      generation of routing scripts and traffic shaping scripts is both
      unlikely to fail and potentially slow due to running dijkstra.  So, also
      let the vnode release early prior to those things, immediately after IFB
      allocation.)
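The lock-narrowing change above can be sketched in Python as follows. The three callables and the function name are hypothetical stand-ins; the real code lives in the docker clientside Perl scripts.

```python
import threading

# Stand-in for the global lock mentioned in the commit message.
global_lock = threading.Lock()

def setup_vnode(allocate_ifbs, release_early, generate_scripts):
    """Hold the global lock only for IFB allocation.

    The slow script generation (routing and traffic shaping, which may
    run dijkstra) happens after the lock is dropped and after the vnode
    has been released early. All three callables are hypothetical.
    """
    with global_lock:
        ifbs = allocate_ifbs()   # the only step that needs the lock
    release_early()              # signal early release right after allocation
    generate_scripts(ifbs)       # potentially slow; runs without the lock
    return ifbs
```

Keeping the critical section to the allocation step means other vnodes are not serialized behind the route computation.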
  3. 07 May, 2018 9 commits
  4. 06 May, 2018 4 commits
  5. 04 May, 2018 7 commits
  6. 03 May, 2018 1 commit
  7. 02 May, 2018 2 commits
  8. 01 May, 2018 1 commit
  9. 30 Apr, 2018 6 commits
  10. 27 Apr, 2018 1 commit
  11. 26 Apr, 2018 2 commits
    • Several fixes: · 93b66ba9
      Leigh B Stoller authored
      1. When editing a reservation, request the forecast info specifically
         for the project the reservation is attached to. This is really
         important when looking at it as an admin, since we want to edit in
         the user's project context.
      
      2. But to do that, we have to wait for reservation info to come back,
         and then ask for the forecast data.
      
      3. And if the user clicks on the "Search" button before the res data
         comes back, we have to wait for it before we can search.
      
      4. Fix the Search function; I was handling duplicates in the forecast
         data incorrectly.
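The ordering in items 2 and 3 above (reservation info first, then the forecast, with Search waiting on both) can be sketched with Python's asyncio. The actual Portal code is not shown in this log; the class, the fetch callables, and the "project" key are assumptions for illustration only.

```python
import asyncio

class ReservationEditor:
    """Load reservation info, then the forecast for its project."""

    def __init__(self, fetch_reservation, fetch_forecast):
        # Both fetchers are hypothetical async callables.
        self._fetch_reservation = fetch_reservation
        self._fetch_forecast = fetch_forecast
        self._loaded = None

    def start(self, reservation_id):
        # Start loading in the background; must be called inside a running loop.
        self._loaded = asyncio.create_task(self._load(reservation_id))

    async def _load(self, reservation_id):
        reservation = await self._fetch_reservation(reservation_id)
        # The forecast request needs the reservation's project, so it
        # cannot be issued until the reservation info has come back.
        forecast = await self._fetch_forecast(reservation["project"])
        return reservation, forecast

    async def search(self):
        # A Search click before loading finishes simply waits for it.
        _reservation, forecast = await self._loaded
        return forecast
```

Here search() awaits the same task that start() created, so an early click cannot run against missing data.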
    • Convert an image server search failure into a BADARGS response, which is
      the generic error code for all rspec errors, so that the Portal does
      the right thing when displaying the error. · f9c1180b
      Leigh B Stoller authored