1. 16 Jan, 2018 1 commit
    • Lots of changes for SSL enabled pubsub: · e44fc90d
      Leigh B Stoller authored
      Pubsub libraries are now SSL enabled by default, so that we can talk SSL
      from a perl client. To do this we need another entry point from SWIG
      into the event code, event_register_withssl. At the same time there is a
      new entry point called event_set_sockbufsizes that calls a new pubsub
      entry point pubsub_set_sockbufsizes.
      
      The problem is that current swig generates code that does not compile,
      and since I don't know anything about swig, I just hand crafted the two
      new routines that were needed in event_wrap.c and the few extra lines
      that go into event.pm.
      
      Also change all the link lines to include the ssl/crypto libraries when
      linking.
  2. 20 Dec, 2017 1 commit
    • Revenge of the Delta Images. · a79af843
      Mike Hibler authored
      Can't live with em, can't kill em dead... When writing my hack
      routine to convert an image path into an imageid, I failed to
      consider the .ddz (delta image) suffix.
  3. 07 Dec, 2017 1 commit
  4. 30 Nov, 2017 1 commit
  5. 18 Aug, 2017 1 commit
  6. 30 Mar, 2017 1 commit
  7. 10 Mar, 2017 1 commit
  8. 04 Mar, 2017 1 commit
  9. 04 Feb, 2017 1 commit
  10. 02 Feb, 2017 1 commit
  11. 31 Jan, 2017 2 commits
    • Sort out issues w.r.t. unlimited bandwidth and dynamic bandwidth. · 0126a71d
      Mike Hibler authored
      Setting bandwidth==0 in either the sitevars or via subboss_attributes
      will now get properly translated to no bandwidth cap in frisbeed.
      Previously it was ignored. But I still do not recommend this setting
      as slow clients can get swamped.
      
      Allow setting of dynamic BW control along with unlimited bandwidth.
      Likewise not recommended (though marginally better than the above).
    • More tweaks to frisbee heartbeat code. · df78b7ef
      Mike Hibler authored
      Make sure server doesn't exit as long as it is getting heartbeats
      from known clients. We used to exit when we stopped getting requests,
      however clients often finish their network activity long before they
      are actually done.
      
      Emulab event now reports Mebibytes rather than bytes. It is accurate
      enough and avoids perl bigints in the receiver(s).
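
The byte-to-mebibyte conversion mentioned above can be sketched as follows. This is an illustration in Python, not the frisbee C code; the function name and the round-down behaviour are assumptions, since the commit does not say how the value is rounded.

```python
MIB = 1 << 20  # bytes per mebibyte


def bytes_to_mib(nbytes: int) -> int:
    """Return the number of whole mebibytes in nbytes (rounded down; an assumption)."""
    return nbytes // MIB


# A 6 GiB byte count needs more than 32 bits, but the MiB figure stays small.
written = 6 * 1024**3
print(bytes_to_mib(written))  # 6144
```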
  12. 30 Jan, 2017 2 commits
  13. 20 Jan, 2017 1 commit
  14. 19 Jan, 2017 1 commit
    • Redo the frisbee heartbeat code again. · 5407463e
      Mike Hibler authored
      My "shortcut" to enable a heartbeat via a client-side command line
      proved to be untenable. There are just too many places where we fire
      off the client and getting the right heartbeat interval value to all
      those places would have been...challenging.
      
      So back to the original plan of having a server-side command line
      option and letting the server tell the client when/what to report.
      This limits the changes to just the frisbee master server, in particular
      I now just have to get the value to master server instances running
      on the subbosses (not done yet, just hardwiring a value for now).
      
      All this said, I still had to modify the various places we invoke
      the frisbee client to add an option to enable the heartbeat, but at
      least I didn't need to know a specific value.
  15. 18 Jan, 2017 1 commit
    • Add the ability to "report" for proxied nodes. · 3665ce07
      Mike Hibler authored
      Not even 48 hours old and I have already had to change it...
      Forgot about vnode hosts that proxy on behalf of their vnodes.
      For those, we want to be reporting the actual client identity and
      not the physical host's. So add a "who" field to the report message
      to explicitly identify who requested the data. Previously we were
      just using the IP of the sender of the report message to identify
      the client.
      
      Note that this is NOT backward compatible with yesterday's version.
      Since I know everywhere that version was running, I could just
      eliminate them and pretend that version never existed. Dead men
      tell no tales.
  16. 17 Jan, 2017 1 commit
    • Implement heartbeat/status reports in Frisbee. · 2be46ba4
      Mike Hibler authored
      There are three pieces here: a change to the frisbee protocol itself, an
      Emulab event component to get status back to the portal, and the surrounding
      infrastructure to make it all work.
      
      Frisbee heartbeat messages:
      
      Added a new message type to the frisbee protocol, "Progress". In theory it
      operates by having the server send a multicast progress request to its clients
      which includes an interval at which to report (or "just once") and an
      indication of what to report (nothing, progress summary, or full stats). The
      client then sends unicast "fire and forget" UDP replies according to that
      schedule. However, I took a shortcut for the moment and just added a command
      line option to the client to tell it to report a summary at the indicated
      interval (-H <interval>).  So the server never sends requests.
      
      This is implemented in the client by a fourth thread since I wanted it to
      operate independent of packet reception (which would cause clients to report
      in a highly synchronized fashion due to multicast). The server instance just
      logs progress reports into its log.
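
A reporting thread of the kind described above might look roughly like this. It is a sketch in Python, not the frisbee client's C code: `start_reporter`, `get_stats`, and the `REPORT_FMT` wire layout are all hypothetical, chosen only to show a timer-driven sender that runs independently of packet reception.

```python
import socket
import struct
import threading

# Hypothetical wire layout: sequence number, chunks received, chunks
# decompressed, bytes written. The real frisbee message format is not
# shown in this log.
REPORT_FMT = "!IIIQ"


def start_reporter(server, interval, get_stats, stop):
    """Send a summary to `server` every `interval` seconds until `stop` is set.

    `server` is an (address, port) pair, `get_stats` returns a
    (chunks_recv, chunks_decomp, bytes_written) tuple, and `stop` is a
    threading.Event.
    """
    def run():
        sock = socket.socket(socket.AF_INET, socket.SOCK_DGRAM)
        seq = 0
        try:
            # Event.wait() returns False on timeout, so the loop body runs
            # once per interval and exits promptly when stop is set.
            while not stop.wait(interval):
                recv, decomp, written = get_stats()
                msg = struct.pack(REPORT_FMT, seq, recv, decomp, written)
                try:
                    # Fire and forget: no reply is expected, errors are ignored.
                    sock.sendto(msg, server)
                except OSError:
                    pass
                seq += 1
        finally:
            sock.close()

    thread = threading.Thread(target=run, daemon=True)
    thread.start()
    return thread
```

Because the timer lives in its own thread, each client reports on a schedule tied to when it started rather than to the arrival of multicast data, which is the desynchronization the commit message is after.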
      
      This protocol addition should be fully backward compatible as both client and
      server ignore (but log) unknown messages.
      
      Emulab progress report events:
      
      When this is compiled in (-DEMULAB_EVENTS) and turned on (-E <server>), the
      frisbee server instances will send a FRISBEEPROGRESS event to the indicated
      event server for every progress report it receives (in addition to logging the
      events to its own log). Right now it will create an event with key/value pairs
      for the information in a client summary reply:
      
      TSTAMP is the client's time at which it sends the event. Could be used by the
      receiver to determine latency of the report if it cared (and if it assumed
      that the clocks are in sync). We don't care about this.
      
      SEQUENCE is the report number. Again, could be used by the receiver, in this
      case to detect loss, if it cared. We don't.
      
      CHUNKS_RECV is complete chunks that the client has received from the network.
      CHUNKS_DECOMP is chunks decompressed by the client.  BYTES_WRITTEN is bytes
      written to disk by the client.
      
      Any of the three can be used by the event receiver as an indication of life
      and/or progress. However, only the last would be a reasonable indicator of
      time remaining since it is the last (and slowest) phase of imaging. To
      estimate time remaining we could compare that value to the amount of
      uncompressed data that is in the image. This makes the sketchy assumptions
      that the time for writes to the disk is uniform and that the number and
      distance of seeks are uniform, but it is better than a sharp stick in the eye.
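
The time-remaining estimate suggested above amounts to a linear extrapolation from BYTES_WRITTEN. A minimal sketch in Python, assuming the receiver knows the image's uncompressed size and how long the client has been running (the function and parameter names are hypothetical):

```python
from typing import Optional


def estimate_seconds_remaining(bytes_written: int,
                               total_uncompressed: int,
                               elapsed: float) -> Optional[float]:
    """Extrapolate linearly from progress so far.

    Returns None until some data has been written, since no rate can be
    computed before then. Assumes a uniform write rate, which the commit
    message itself calls a sketchy assumption.
    """
    if bytes_written <= 0 or elapsed <= 0:
        return None
    remaining = max(total_uncompressed - bytes_written, 0)
    return remaining * elapsed / bytes_written


# 256 MiB of a 1024 MiB image written in 30 seconds.
print(estimate_seconds_remaining(256 << 20, 1024 << 20, 30.0))  # 90.0
```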
      
      Emulab infrastructure:
      
      There is a new sitevar "images/frisbee/heartbeat" which can be set to a
      non-zero value to tell the frisbee MFS to fire off frisbee with -H <value>
      and thus make reports. The default value of zero means to not make reports.
      The tmcd "loadinfo" command sends this through via the HEARTBEAT=<value>
      param.
      
      REQUIRED A TMCD VERSION BUMP TO 41.
  17. 10 Jan, 2017 1 commit
  18. 24 Oct, 2016 1 commit
  19. 23 Mar, 2016 1 commit
  20. 22 Mar, 2016 1 commit
  21. 15 Feb, 2016 2 commits
  22. 14 Jan, 2016 1 commit
  23. 13 Nov, 2015 1 commit
  24. 22 Oct, 2015 1 commit
  25. 22 Sep, 2015 1 commit
  26. 15 Sep, 2015 1 commit
  27. 14 Aug, 2015 1 commit
  28. 13 Aug, 2015 1 commit
    • Before I lose them: (disabled) experimental performance changes. · 8f5a2158
      Mike Hibler authored
      Did this quite a while back, but haven't tested yet.
      
      Changes to make a client even more passive when there are blocks in
      flight (requested by someone else) that they can use. The goal here
      is to keep a late joining client from making requests for chunks that
      all the others have already seen. This is not a big problem in the
      default case where clients randomize the order of chunks they request,
      but when they are making sequential requests it can be a problem.
  29. 16 Jul, 2015 2 commits
    • Generalize the retries in the TRYAGAIN case. · 7fb680d2
      Mike Hibler authored
      Note that this does not fix the problem I was chasing (that is fixed
      by the emulab_config change just committed), but it is still a good idea.
    • Hopefully fix frisbee uploader "no such file or directory" errors. · 3399abdf
      Mike Hibler authored
      When using AMD, the uploader path wound up "realpath"ed in the form
      of /.amd_mnt/ops/... which is the location at which AMD does the NFS
      mount when triggered. However, if the mount times out, that path is
      no longer valid.
      
      So for the AMD case, we have to strip the AMD prefix from the path.
      This ensures that subsequent stats and other accesses of the path go
      through the AMD mountpoint (e.g., "/proj/foo") and not the NFS mountpoint
      (e.g., "/.amd_mnt/ops/proj/foo"), and thus trigger AMD to do the NFS mount.
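
The prefix stripping described above can be illustrated as follows. This is a Python sketch, not the uploader's actual code; it assumes, based only on the single example in the commit message, that exactly one path component (the server name) follows the AMD root.

```python
def strip_amd_prefix(path: str, amd_root: str = "/.amd_mnt") -> str:
    """Map /.amd_mnt/<server>/rest to /rest; return other paths unchanged."""
    if not path.startswith(amd_root + "/"):
        return path
    rest = path[len(amd_root) + 1:]           # e.g. "ops/proj/foo"
    _server, sep, tail = rest.partition("/")  # drop the server component
    # With nothing after the server name there is no AMD path to map to.
    return "/" + tail if sep else path


print(strip_amd_prefix("/.amd_mnt/ops/proj/foo"))  # /proj/foo
print(strip_amd_prefix("/proj/foo"))               # /proj/foo
```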
  30. 29 Jun, 2015 1 commit
  31. 09 Jun, 2015 1 commit
    • Pooched the backward-compat case. · 222ab05d
      Mike Hibler authored
      I thought I would be clever and check the client ID on a JOIN reply
      to make sure we match up the proper request/reply (an issue when replies
      are multicast). However, on a JOINv1, the reply does not include the
      clientid. Unfortunately, for previous backward-compat situations, we
      still do generate JOINv1 requests most of the time.
      
      So for JOINv1's we don't try to check the clientid. This means that
      with a newer server that MCs replies, a JOINv1 client may see someone
      else's reply and start transmitting requests even before the server
      has seen that client's join request. This is not an issue for us right
      now since the server only loosely tracks its clients and does so just
      so that it can report stats when the client leaves. I say "loosely"
      because, e.g., the server does not require that block requests come
      from a JOIN'ed client.
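
The resulting acceptance rule is small enough to state as code. A sketch in Python with hypothetical names (the real check lives in the frisbee client's C code and is not shown in this log):

```python
from typing import Optional


def join_reply_is_ours(version: int,
                       reply_clientid: Optional[int],
                       my_clientid: int) -> bool:
    """Decide whether a (possibly multicast) JOIN reply answers our request."""
    if version == 1:
        # JOINv1 replies carry no client ID, so there is nothing to compare;
        # the client may end up accepting a reply meant for someone else.
        return True
    return reply_clientid == my_clientid
```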
  32. 01 Jun, 2015 1 commit
  33. 27 May, 2015 3 commits
    • Mike Hibler's avatar
      d564150c
    • Mike Hibler's avatar
    • Avoid (as best we can) port collisions on the frisbee client/server/uploader. · 0a7fd856
      Mike Hibler authored
      Two different fixes here.
      
      The first affects frisbeed ("the server") and frisuploadd ("the uploader").
      In both, the master server was choosing the port to use as an obscure function
      of the current value of emulab_indicies frisbee_index without regard to whether
      the port was already in use by someone else.
      
      To fix this, the "-p <port>" option of both programs has been changed to
      allow a value of 0 to indicate that the program (rather, the kernel) should
      choose the first available port. It will also take a port range (e.g.,
      "-p 50000-50100") which says to find the first available port in that range.
      To communicate The Chosen Port back to the master server, there is a new
      option to frisbeed and frisuploadd, "-A <file>", which says to write the
      address info into the indicated file in the <IP-addr>:<port> format.
      Note that we don't care about the <IP-addr> part since that is just the
      multicast address (frisbeed) or our unicast address (frisuploadd) that we
      pass in to the program. The "Emulab configuration" of the master server uses
      the defs file FRISEBEEMCASTPORT and FRISEBEENUMPORTS vars to determine what
      to pass via the "-p" option. See the comment in defs-example. The "null
      configuration" (aka, on a subboss) just passes "-p 0" to frisbeed.
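
The "-p" and "-A" behaviour described above can be sketched as follows. This is a Python illustration under stated assumptions, not the frisbeed source: the helper names are hypothetical, and the trailing newline in the address file is a guess, since the commit only gives the `<IP-addr>:<port>` format.

```python
import errno
import socket
from typing import Tuple


def parse_port_spec(spec: str) -> Tuple[int, int]:
    """'0' -> (0, 0); '50000' -> (50000, 50000); '50000-50100' -> (50000, 50100)."""
    lo, sep, hi = spec.partition("-")
    first = int(lo)
    last = int(hi) if sep else first
    if not 0 <= first <= last <= 65535:
        raise ValueError(f"bad port spec: {spec!r}")
    return first, last


def bind_first_free(sock: socket.socket, addr: str, first: int, last: int) -> int:
    """Bind sock to the first free port in [first, last]; return the port bound."""
    for port in range(first, last + 1):
        try:
            sock.bind((addr, port))
        except OSError as e:
            if e.errno != errno.EADDRINUSE:
                raise
            continue  # port taken, try the next one
        # For port 0 the kernel picks; getsockname() reveals its choice.
        return sock.getsockname()[1]
    raise OSError(errno.EADDRINUSE, f"no free port in {first}-{last}")


def write_addr_file(path: str, addr: str, port: int) -> None:
    """Record the chosen address as <IP-addr>:<port> for the master server."""
    with open(path, "w") as f:
        f.write(f"{addr}:{port}\n")  # trailing newline is an assumption
```

A caller would combine the three: parse the option value, bind, then write the result so the master server can read the port back.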
      
      The second fix was an attempt to avoid port conflicts on the client side
      (frisbee). There is only so much we can do since all clients of a multicast
      frisbee session have to use the same port, but we can avoid conflicts with
      other UDP apps that bind to INADDR_ANY:<port>.
      
      We make use of the REUSEADDR socket option and bind specifically to
      <mcaddr>:<port>. This also requires that the server multicast the JOIN
      reply that was previously unicast. Note that use of REUSEADDR will also
      allow multiple frisbee clients on the same host to be in the same session
      (not that we ever do that).
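
The client-side bind described above can be sketched like this. It is a Python illustration with a hypothetical function name, not the code in network.c; joining the multicast group (IP_ADD_MEMBERSHIP) is omitted, and the exact semantics of SO_REUSEADDR differ between operating systems, so treat the collision-avoidance behaviour as the commit's claim rather than something this snippet guarantees.

```python
import socket


def open_client_socket(mcaddr: str, port: int) -> socket.socket:
    """Create a UDP socket bound specifically to <mcaddr>:<port>."""
    sock = socket.socket(socket.AF_INET, socket.SOCK_DGRAM)
    # The option must be set before bind() to influence the bind.
    sock.setsockopt(socket.SOL_SOCKET, socket.SO_REUSEADDR, 1)
    # Bind to the given address instead of INADDR_ANY, as the commit describes.
    sock.bind((mcaddr, port))
    return sock
```

Since every client in a session binds to the same group address and port, the server's JOIN reply has to be sent to that group as well, which is why the commit switches the reply from unicast to multicast.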
      
      Since the server is typically updated whenever the Emulab software is, but
      the client is embedded in images and MFSes, there can be pretty much any
      combo of {old,new} server and {old,new} client in the field. So backward
      compatibility was essential and there are a variety of implementation details
      related to that. See the comment in network.c::ClientNetInit().
  34. 30 Apr, 2015 1 commit