1. 24 Nov, 2010 1 commit
    • First crack at a frisbee "master server" for handling GET (download) requests. · a2a896ab
      Mike Hibler authored
      There are a couple of new packet types in the frisbee protocol which are
      exchanged via TCP with the master server: GETREQUEST and GETREPLY.  The
      client passes to the master server an opaque imageid and a couple of options
      and gets back the addr/port to use to actually download the image.  The
      implementation of the master server is fragile and is more of a test
      framework; Grant is working on a more robust master server.  I am mostly
      doing a backend that communicates with the Emulab DB to do its authentication
      and making the client changes.
      
      The client now uses the -S option to specify the IP address of the master
      server and the -F option to specify an imageid.  If no error is returned,
      the image is downloaded using the returned addr/port.  If -Q is used in place
      of -F, then the client makes a "status only" call getting back info about
      whether the named image is accessible to the client and whether a server is
      currently running.
      
      On the server side, the new master server (mserver.c) has an Emulab
      configuration "backend" that supports host-based authentication.
      The IP address of the caller is mapped to a node_id/pid/gid/eid combo
      that is used to determine access.  On a request, the specified imageid is
      treated either as a pathname (if it starts with '/') or an image identifier
      of the form "<pid>/<imagename>".  If it is a pathname, we check to make
      sure that pathname (after running through "realpath") is contained in one
      of the directories accessible to that node in its current experiment context;
      i.e., /share, /proj/<pid>, /groups/<pid>/<gid>, or /users/<swapper-uid>.
      If it is an image identifier, the DB is queried to ensure that access is
      allowed to that image; i.e., it must be "global" or in the appropriate
      project/group.
      
      The master server forks a frisbeed for each valid request, if one is not
      already running.  The multicast address selection is still based on the
      emulab_indicies.frisbee_index field, but the address/port/server info is no
      longer stored in the frisbee_blobs table (frisbee_pid, load_address,
      load_busy are not set).
      
      Note that this is not yet integrated in the os_load path.  Further work is
      required to replace frisbeelauncher.
  2. 12 Nov, 2010 1 commit
  3. 29 Oct, 2010 1 commit
    • Improve regular (non-image) file transfer via frisbee. · 78f3f8d3
      Mike Hibler authored
      Basically, make it possible to transfer a non-imagezip image.  Previously
      you had to wrap a regular file as an image in order to transfer it.  The
      big hang-up was that the frisbee protocol could only transfer files that
      were a multiple of 1MB (the chunk size).
      
      This commit changes the frisbee protocol slightly to allow transfer of
      non-1MB-multiple files.  The protocol change was to add a new JOIN message
      that returns the size of the file in bytes rather than in blocks.  This
      allows the client to know that the file in question is not a multiple of 1MB
      and allows it to request the correct partial number of blocks for the
      final chunk and to extract the correct amount of data from the final 1K block
      (that block is still padded to 1K by the server).  For the server side, the
      request mostly allows it to do some sanity checking.  The fact that the
      server is started with a file that is not a multiple of 1MB is what triggers
      it to know about partial chunks.  The sanity checking is that the server will
      not acknowledge clients that attempt to join with a version 1 JOIN message,
      since nothing good would come of that pairing.
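
As a rough illustration of the arithmetic the client has to do once it knows
the size in bytes, here is a minimal sketch.  The struct, function, and macro
names are made up for this example and are not the ones in the frisbee
source; it assumes the default 1024-byte blocks and 1024 blocks per chunk.

```c
#include <stdint.h>

#define BLOCKSIZE 1024u     /* bytes per block (assumed default) */
#define CHUNKSIZE 1024u     /* blocks per chunk (assumed default) */

/* Hypothetical summary of how a file of arbitrary size maps onto chunks. */
struct file_layout {
    uint32_t chunks;            /* total chunks, counting a partial final one */
    uint32_t last_chunk_blocks; /* blocks to request in the final chunk */
    uint32_t last_block_bytes;  /* bytes of real data in the final block */
};

struct file_layout layout_for_size(uint64_t bytes)
{
    struct file_layout l = { 0, 0, 0 };

    if (bytes == 0)
        return l;

    /* Round up: a trailing partial block still occupies a whole block. */
    uint64_t blocks = (bytes + BLOCKSIZE - 1) / BLOCKSIZE;

    l.chunks = (uint32_t)((blocks + CHUNKSIZE - 1) / CHUNKSIZE);
    l.last_chunk_blocks =
        (uint32_t)(blocks - (uint64_t)(l.chunks - 1) * CHUNKSIZE);
    l.last_block_bytes = (uint32_t)(bytes - (blocks - 1) * BLOCKSIZE);
    return l;
}
```

For a file of 1MB plus 1500 bytes this gives two chunks, two blocks in the
final chunk, and 476 bytes of data in the final (padded) block.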
      
      On the client side, frisbee must be invoked with the -N (nodecompress) option
      in order to issue a v2 JOIN.  See the comment in the code for the rationale,
      but it is largely a backward compat feature.
      
      While I was changing the JOIN message, I added a couple of other future
      features.  One is that by passing back a 64-bit value for the size of the
      image in bytes, we can feed bigger images.  However, there is still much to
      be done to realize this.  The other was to add blocksize/chunksize fields
      in the message so that the server/client can negotiate the transfer parameters,
      e.g., 1024 blocks of 1024 bytes vs. 256 blocks of 8192 bytes, the latter being
      for "jumbo" packets on a Gb ethernet.  But there is still more to be done to
      get this working too.
  4. 21 Dec, 2009 1 commit
  5. 18 Dec, 2009 1 commit
  6. 07 Oct, 2009 1 commit
  7. 28 Sep, 2009 1 commit
    • Changes: · 8fd4b67e
      Mike Hibler authored
      Support for jumbo packets.  Setting WITH_JUMBO on the make command line
      will change the image block size to 8192 bytes and reduce the number of
      blocks per chunk to 256 (to maintain the 1MB chunk size for compat with old
      images).  The default block size is still 1024 bytes.
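
A compile-time switch of this kind might look like the sketch below.  Only
WITH_JUMBO and the numbers come from the description above; the BLOCKSIZE and
CHUNKSIZE macro names and the chunk_bytes() helper are assumptions made for
illustration.

```c
/* Block geometry selected at build time, e.g. by adding -DWITH_JUMBO. */
#ifdef WITH_JUMBO
#define BLOCKSIZE 8192      /* bytes per block, sized for jumbo packets */
#define CHUNKSIZE 256       /* blocks per chunk */
#else
#define BLOCKSIZE 1024      /* default bytes per block */
#define CHUNKSIZE 1024      /* default blocks per chunk */
#endif

/* Either way a chunk must stay 1MB so that old images remain usable. */
_Static_assert(BLOCKSIZE * CHUNKSIZE == 1024 * 1024,
               "chunk size must remain 1MB");

/* Hypothetical helper: size of one chunk in bytes. */
unsigned long chunk_bytes(void)
{
    return (unsigned long)BLOCKSIZE * CHUNKSIZE;
}
```

The static assertion makes the compatibility constraint fail at build time
rather than at transfer time if someone changes one value without the other.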
      
      Added the notion of a "dubious" chunk buffer in the client.  If an incoming
      chunk buffer is marked as CHUNK_DUBIOUS, then its contents can be evicted and
      the buffer reused for a more promising chunk.  This is a crude replacement
      mechanism that is currently only used in one place: if we miss part of a
      chunk and the server switches to sending a new chunk for which we have no
      free buffer, we switch to collecting the new chunk.  The reasoning is that
      it will take a while for the server to switch back to completing the former
      chunk, during which time it may send one or more complete chunks that we
      could more fruitfully use (decompress and write out).
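
The replacement policy can be sketched as below.  CHUNK_DUBIOUS is the name
used above; the other state names, the struct, and the function are
hypothetical and only meant to show the "prefer a free buffer, otherwise
evict a dubious one" idea.

```c
#include <stddef.h>

/* Only CHUNK_DUBIOUS is named in the commit; the rest are illustrative. */
enum chunk_state { CHUNK_FREE, CHUNK_FILLING, CHUNK_DUBIOUS, CHUNK_FULL };

struct chunk_buf {
    enum chunk_state state;
    int chunkno;            /* chunk currently held, if any */
};

/*
 * Find a buffer for a newly arriving chunk.  A free buffer is preferred;
 * failing that, the first dubious buffer is evicted and reused.  Returns
 * NULL when every buffer holds data worth keeping.
 */
struct chunk_buf *claim_buffer(struct chunk_buf *bufs, size_t n, int chunkno)
{
    struct chunk_buf *victim = NULL;

    for (size_t i = 0; i < n; i++) {
        if (bufs[i].state == CHUNK_FREE) {
            victim = &bufs[i];
            break;
        }
        if (bufs[i].state == CHUNK_DUBIOUS && victim == NULL)
            victim = &bufs[i];
    }
    if (victim != NULL) {
        victim->state = CHUNK_FILLING;
        victim->chunkno = chunkno;
    }
    return victim;
}
```

In this sketch the caller would mark a partially received chunk as
CHUNK_DUBIOUS when it notices the server has moved on to a different chunk.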
      
      Changed the meaning of the "done" field for a chunk.  It used to mean either
      that we have completely processed the chunk or that we are currently collecting
      it.  It took additional work (scanning all chunk buffers) to differentiate
      these cases, so I made it explicit.
      
      Allow the client and server to dynamically determine the maximum socket
      buffer size.
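
One common way to probe for the largest usable socket buffer is to ask for a
big size and back off until the kernel accepts the request.  The sketch below
shows that approach; it is an assumption about the general technique, not a
copy of what frisbee does, and the function name is made up.

```c
#include <sys/socket.h>

/*
 * Try to set the given buffer option (SO_RCVBUF or SO_SNDBUF) to 'want'
 * bytes, halving the request on failure until it drops below 'floor'.
 * Returns the size that was accepted, or -1 if nothing worked.
 */
int set_max_sockbuf(int sock, int optname, int want, int floor)
{
    for (int size = want; size > 0 && size >= floor; size /= 2) {
        if (setsockopt(sock, SOL_SOCKET, optname, &size, sizeof(size)) == 0)
            return size;
    }
    return -1;
}
```

Some systems reject an oversized request with an error while others silently
clamp it, so reading the value back with getsockopt() is a useful cross-check
on what was actually granted.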
      
      Fix a couple more on-the-wire data structure size/alignment issues that
      showed up on a 64-bit OS.
      
      A few minor speedups to the bitmap handling code.  Think: "rearranging deck
      chairs on the Titanic" here.  We need more serious algorithmic changes
      to scale all this code going forward.
      
      Add some more TRACE events and refine what is already there.
      
      Added some hacks to allow frisbee client/server to run on the same machine.
      We had made it remarkably hard to do this.  But then again, why would you
      want to!  Look for SAME_HOST_HACK in the makefile.
  8. 24 Sep, 2009 1 commit
  9. 15 Sep, 2009 2 commits
  10. 14 Sep, 2009 1 commit
  11. 11 Sep, 2009 1 commit
  12. 19 Aug, 2009 1 commit
  13. 20 Oct, 2008 1 commit
  14. 27 Jun, 2007 2 commits
    • Print rates as quads rather than longs. · 87235b5c
      Mike Hibler authored
    • Don't know if this is a BSD linuxthread thing or just a pthread semantic, · e25ef575
      Mike Hibler authored
      but if the child thread calls exit(-1) from fatal, the frisbee process
      exits, but with a code of zero; i.e., the child exit code is lost.
      Granted, a multi-threaded program should not be calling exit willy-nilly,
      but it does so we deal with it as follows.
      
      Since the child should never exit during normal operation (we always
      kill it), if it does exit we know there is a problem.  So, we catch
      all exits and if it is the child, we set a flag.  The parent thread
      will see this and exit with an error.
      
      Since I don't understand this fully, I am making it a FreeBSD-only
      thing for now.
  15. 25 May, 2007 1 commit
  16. 09 Jan, 2007 1 commit
    • Frisbee MFS changes: · 346c0562
      Mike Hibler authored
       * support FreeBSD 6
       * client-side changes to support enable/disable of ACPI via slicefix
       * use dynamically linked Emulab binaries in frisbee MFS (for size)
  17. 01 Dec, 2006 1 commit
    • Bug fixes from Annette DeSchon <deschon@ISI.EDU> and Keith Sklower · 8ea641c1
      Mike Hibler authored
      <sklower@vangogh.CS.Berkeley.EDU> for the following, related to the -z (zero)
      option in imageunzip/frisbee:
      
        1. For the case where a full-disk image is smaller than
           the disk the image is being unzipped onto, we added
           code to zero the area between the end of the image and
           the end of the disk.
      
        2. During the unzipping process, when zeros are being written
           at the end of a chunk, a write() that returned a length
           different from the expected value previously caused an
           infinite loop.  We noticed this problem at ISI, on a number
           of pc733s, which we suspect may have (relatively minor)
           hardware disk problems.
      
      The latter addressed a Mike-o that has existed for 4 years.  Call it failure
      resilient computing or just plain denial, but because of a botched conditional,
      I was ignoring failed writes to the disk.  This led to one of those infinite loop
      thingees if you actually had a bad disk.
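
For reference, a write loop that cannot spin forever on a short or failed
write looks something like the sketch below.  It is a generic illustration of
the pattern, not the imageunzip code, and the function name is made up.

```c
#include <errno.h>
#include <stddef.h>
#include <unistd.h>

/*
 * Write exactly 'len' bytes from 'buf' to 'fd'.
 * Returns 0 on success, -1 on error with errno set.
 */
int write_fully(int fd, const void *buf, size_t len)
{
    const char *p = buf;

    while (len > 0) {
        ssize_t n = write(fd, p, len);

        if (n < 0) {
            if (errno == EINTR)
                continue;       /* interrupted before writing: retry */
            return -1;          /* real error: report it, do not loop */
        }
        if (n == 0) {
            errno = EIO;        /* no progress: treat as an error */
            return -1;
        }
        p += n;                 /* short write: advance and keep going */
        len -= (size_t)n;
    }
    return 0;
}
```

The key point is that every path through the loop either makes progress or
returns, so a bad disk produces an error instead of a hang.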
  18. 21 Nov, 2006 1 commit
  19. 16 Dec, 2005 1 commit
  20. 02 Dec, 2005 1 commit
  21. 16 Nov, 2005 1 commit
  22. 11 May, 2005 1 commit
    • Hack multicast "keep alive" mechanism. The "-K <seconds>" option can be · b9425e72
      Mike Hibler authored
      used to force the server to send an IGMP report if it doesn't receive any
      packets within <seconds> seconds.  As long as the server is receiving
      packets, it won't send the report.
      
      What I'm not lovin here, is that to send a report I have to drop membership
      in the group (socket opt IP_DROP_MEMBERSHIP) and rejoin (IP_ADD_MEMBERSHIP).
      Simply trying to do an add membership doesn't work because the kernel thinks
      you are already in the group and errs out.  I'm hoping all the up and down
      activity doesn't make the switch behave any worse than it already does.
  23. 16 Mar, 2005 1 commit
    • Really important stuff: · a8ef625f
      Mike Hibler authored
      Unified the 'dot' handling (status printing) of frisbee and imagezip.
      They now both report the number of chunks remaining along with the dots.
      Also put out a periodic splat for every GB of uncompressed data we write.
      This is useful when you are zero-filling, since otherwise it appears that
      frisbee has hung when it is really just zeroing the last unused 100GB of
      your disk.
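
Deciding when to print the per-GB marker is a small piece of arithmetic, as
in the sketch below.  The helper name is made up, and treating a GB as 2^30
bytes is an assumption; the commit does not say which definition is used.

```c
#include <stdint.h>

#define GB (1024ULL * 1024ULL * 1024ULL)    /* assumed: 2^30 bytes */

/*
 * Given the running total before a write and the size of that write,
 * return how many GB boundaries the write crossed, i.e. how many
 * markers should be printed.
 */
unsigned gb_marks_crossed(uint64_t total_before, uint64_t nbytes)
{
    return (unsigned)((total_before + nbytes) / GB - total_before / GB);
}
```

Counting boundaries rather than testing for an exact multiple keeps the
output correct when a single large zero-fill write spans more than one GB.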
  24. 16 Nov, 2004 1 commit
  25. 12 Nov, 2004 2 commits
  26. 03 Nov, 2004 1 commit
  27. 28 Oct, 2004 1 commit
    • Minor tweaks from a one-day binge of performance analysis. · 1a76e634
      Mike Hibler authored
      The only meaningful change was to insert a sched_yield() in the frisbee
      decompressor path.  Apparently, the decompressor can run long enough to
      cause the incoming socket buffer to overflow.  I was under the assumption
      that the decompressor would not run much longer than a single time slice
      (0.001 seconds, about 8 packets) before its priority would force it to
      be context switched.  But it was running much longer than that!  Forcing
      a periodic yield seems to have taken care of this.
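
The idea of a periodic yield in a long-running loop can be sketched like
this.  The function, its parameters, and the yield interval are all
hypothetical; the real decompressor decides where to yield in its own way.

```c
#include <sched.h>
#include <stddef.h>

/*
 * Run 'work' once per block and give up the CPU every 'yield_every'
 * blocks so that another thread (e.g. the network receiver) gets a
 * chance to drain the socket buffer.  Returns the number of yields.
 * A 'yield_every' of 0 disables yielding.
 */
size_t process_with_yield(size_t nblocks, size_t yield_every,
                          void (*work)(size_t blockno))
{
    size_t yields = 0;

    for (size_t i = 0; i < nblocks; i++) {
        if (work != NULL)
            work(i);
        if (yield_every != 0 && (i + 1) % yield_every == 0) {
            sched_yield();
            yields++;
        }
    }
    return yields;
}
```

How often to yield is a tuning question: too rarely and the socket buffer
can still overflow, too often and the decompressor loses throughput.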
      
      One other cause of retransmitted blocks that I saw was where the server
      was taking a long time to read data from a file (up to 0.25 seconds).
      This would stall the clients and force them to rerequest blocks (which
      they do after about 0.10 seconds).  We can improve on this by splitting
      the file reading off to a separate thread.
      
      Most other changes are related to the event logging code.
  28. 29 Sep, 2004 1 commit
  29. 16 Jun, 2004 1 commit
  30. 10 May, 2004 1 commit
  31. 22 Mar, 2004 1 commit
  32. 18 Mar, 2004 1 commit
  33. 09 Mar, 2004 1 commit
  34. 08 Mar, 2004 1 commit
  35. 14 Jan, 2004 1 commit
  36. 12 Dec, 2003 1 commit
  37. 17 Nov, 2003 1 commit