1. 08 Jan, 2003 1 commit
  2. 03 Jan, 2003 1 commit
  3. 11 Dec, 2002 1 commit
    • Server: back to using a condvar since they seem to be fixed. · 2e77122f
      Mike Hibler authored
      Server: make file readsize independent of burstsize (previously
      readsize had to be a divisor of burstsize).  A subtle side-effect
      is that the dynamic burst rate is recalculated at the conclusion
      of every burst instead of after every readsize count of blocks
      (less than a burst) has been sent.  This just seems more logical.
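
      For illustration, a minimal sketch of a send loop with this behavior
      (the names readsize, burstsize, and recalc_burst_rate here are
      hypothetical stand-ins, not the actual frisbee source):

      	/* Sketch: readsize is independent of burstsize; the burst
      	 * rate is re-evaluated once per completed burst. */
      	#include <unistd.h>

      	static int burstsize = 16;	/* blocks per burst */
      	static int readsize = 6;	/* blocks per disk read; need
      					   not divide burstsize */
      	static useconds_t burstinterval = 1000;	/* usec between bursts */

      	static void
      	recalc_burst_rate(void)
      	{
      		/* placeholder: with -D, adjust burstinterval from
      		   observed client re-request rates */
      	}

      	void
      	send_file(int totblocks)
      	{
      		int sent = 0, inburst = 0;

      		while (sent < totblocks) {
      			int n = readsize;
      			if (n > totblocks - sent)
      				n = totblocks - sent;
      			/* read n blocks from the file and send them */
      			sent += n;
      			inburst += n;
      			if (inburst >= burstsize) {
      				inburst -= burstsize;
      				recalc_burst_rate();	/* per burst now */
      				usleep(burstinterval);
      			}
      		}
      	}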
      
      Client: add "-T DOS-type" option to tell frisbee, when in slice
      mode, to set the type of the slice in the DOS partition table.
      This is useful if you are dropping, say, a BSD filesystem into
      an unused slice; you don't have to go back later and set the
      type with fdisk.  I considered making this info part of the
      image itself (recorded by imagezip when creating a slice image),
      but decided against it.
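
      For reference, setting the type byte amounts to patching the slice's
      entry in sector 0.  A rough standalone sketch (not the frisbee code;
      the function name is made up, error handling trimmed):

      	/* Hypothetical sketch: set the DOS type byte of slice 1-4.
      	 * The MBR partition table lives at offset 446 of sector 0,
      	 * with four 16-byte entries; the type byte is at offset 4
      	 * of each entry.  E.g. 0xA5 marks a FreeBSD slice. */
      	#include <fcntl.h>
      	#include <stdint.h>
      	#include <unistd.h>

      	int
      	set_dostype(const char *disk, int slice, uint8_t type)
      	{
      		uint8_t sect[512];
      		int fd = open(disk, O_RDWR);

      		if (fd < 0)
      			return -1;
      		if (pread(fd, sect, 512, 0) != 512) {
      			close(fd);
      			return -1;
      		}
      		sect[446 + (slice - 1) * 16 + 4] = type;
      		if (pwrite(fd, sect, 512, 0) != 512) {
      			close(fd);
      			return -1;
      		}
      		close(fd);
      		return 0;
      	}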
  4. 06 Dec, 2002 1 commit
  5. 26 Nov, 2002 1 commit
    • Commit of USENIX driven improvements: · 2ff95cee
      Mike Hibler authored
      1. Client: add "NAK avoidance."  We track our own block requests (and
         others', via snooping) and avoid making re-requests unless it has
         been "long enough" (see the first sketch after this list).
      
      2. Server: more aggressive merging of requests in the work queue.  For
         every new request, look for any overlap with an existing entry (see
         the second sketch after this list).
      
      3. Server: from Leigh: first cut at dynamic rate adjustment.  Can be enabled
         with -D option.
      
      4. Both: change a lot of the magic constants into runtime variables so that
         they can be adjusted on the command line or via the event interface (see
         below).
      
      5. Add code to do basic validation of incoming packets.
      
      6. Client: randomization of block request order is now optional.
      
      7. Client: startup delay is optional and specified via a parameter N which
         says "randomly delay between 0 and N seconds before attempting to join."
      
      8. Both: add a new LEAVE message which reports back all the client stats to
         the server (which logs them).
      
      9. Both: attempt to comment some of the magic values in decls.h.
      
      10. Both: add a cheesy hack to fake packet loss.  Disabled by default,
         see the GNUmakefile.  This code is coming out right after I archive
         it with this commit.
      
      11. Add tracing code.  Frisbee server/client will record a number of
         interesting events in a memory buffer and dump them at the end.  Not
         compiled in by default, see the GNUmakefile (NEVENTS) for turning this on.
      
      12. Not to be confused with the events above, also added testbed event system
         code so that frisbee clients can be remotely controlled.  This is a hack
         for measurement purposes (it requires a special rc.frisbee in the frisbee
         MFS).  Allows changing of all sorts of parameters as well as implementing
         a crude form of identification allowing you to start only a subset of
         clients.  Interface is via tevc with commands like:
      	tevc -e testbed,frisbee now frisbee start maxclients=5 readahead=5
      	tevc -e testbed,frisbee now frisbee stop exitstatus=42
         Again, this is not compiled in by default as it makes the client about
         4x bigger.  See the GNUmakefile for turning it on.
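
      A minimal sketch of the NAK-avoidance bookkeeping from item 1 (the
      names, table size, and delay are hypothetical, not the actual client
      code, and the BSD timeradd/timercmp macros are assumed): every
      request, ours or snooped, timestamps the chunk, and a re-request is
      suppressed until the redo interval has passed.

      	#include <sys/time.h>

      	#define MAXCHUNKS	1024
      	static struct timeval lastreq[MAXCHUNKS];
      	static struct timeval redodelay = { 0, 90000 };	/* "long enough" */

      	/* Called for requests we send and requests we snoop. */
      	void
      	note_request(int chunk)
      	{
      		gettimeofday(&lastreq[chunk], NULL);
      	}

      	/* May we (re)request this chunk yet? */
      	int
      	may_request(int chunk)
      	{
      		struct timeval now, due;

      		gettimeofday(&now, NULL);
      		timeradd(&lastreq[chunk], &redodelay, &due);
      		return timercmp(&now, &due, >);
      	}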
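
      And a sketch of the request merging from item 2: a new (chunk, start,
      count) request is folded into any overlapping or adjacent work queue
      entry instead of being queued separately (again hypothetical names,
      not the server source):

      	struct wqelem {
      		int chunk, start, count;	/* block range in chunk */
      		struct wqelem *next;
      	};
      	static struct wqelem *workq;

      	/* Returns 1 if merged into an existing entry, else 0 and
      	 * the caller appends a new entry to the queue. */
      	int
      	merge_request(int chunk, int start, int count)
      	{
      		struct wqelem *wq;

      		for (wq = workq; wq != 0; wq = wq->next) {
      			if (wq->chunk != chunk)
      				continue;
      			if (start <= wq->start + wq->count &&
      			    wq->start <= start + count) {
      				int lo = start < wq->start ?
      					start : wq->start;
      				int hi = start + count >
      					wq->start + wq->count ?
      					start + count :
      					wq->start + wq->count;
      				wq->start = lo;
      				wq->count = hi - lo;
      				return 1;
      			}
      		}
      		return 0;
      	}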
  6. 17 Nov, 2002 1 commit
  7. 07 Jul, 2002 1 commit
  8. 11 Jan, 2002 2 commits
  9. 10 Jan, 2002 1 commit
    • I noticed in the 12-node tests that CPU was running at 5-6% now. I · 5c02231f
      Leigh B. Stoller authored
      also noticed that the slower machines were getting very far behind the
      faster machines (the faster machines request chunks faster), and were
      actually dropping chunks because they had no room for them (chunkbufs
      at 32). I increased the timeout on the client (if no blocks received
      for this long, request something) from 30ms to 90ms.  This helped a
      bit, but the real help was increasing chunkbufs to 64.  Now the
      clients run at pretty much single-node speed (152/174), and the CPU
      usage on boss went back down to 2-3% during the run. The stats show
      far less data loss and resending of blocks. In fact, we had been
      resending upwards of 300MB of data because of client loss. That went
      down to about 14MB for the 12-node test.
      
      Then I ran a 24-node test. Very sweet. All 24 nodes ran in 155 to
      180 seconds. CPU peaked at 6%, and dropped off to a steady state of
      4%. None of the nodes saw any duplicate chunks. Note that the client
      is probably going to need some backoff code in case the server dies,
      to prevent swamping the boss with unanswerable packets. Next step is
      to have Matt run a test when he swaps in his 40 nodes.
  10. 08 Jan, 2002 1 commit
    • Remove all of the connection handling stuff. Clients come and go, and · 7523a98c
      Leigh B. Stoller authored
      idleness is defined as an empty work queue. We still use join/leave
      messages, but the join message is so that the client can be informed
      of the number of blocks in the file. The leave message is strictly
      informational, and includes the elapsed time on the client, so that it
      can be written to the log file. If that message is lost, no big deal.
      I ran a 6-node test on this new code, and all the clients ran in 174
      to 176 seconds, with frisbeed using 1% CPU on average (it typically
      starts out at about 3%, and quickly drops off to steady state).
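
      A rough sketch of what such a stateless join/leave exchange might
      look like on the wire (a hypothetical layout for illustration, not
      the real frisbee packet format):

      	#include <stdint.h>

      	#define PKT_JOIN	1
      	#define PKT_LEAVE	2

      	typedef struct {
      		uint32_t type;		/* PKT_JOIN or PKT_LEAVE */
      		uint32_t clientid;
      		union {
      			struct {
      				uint32_t blockcount;	/* server fills
      							   in on reply */
      			} join;
      			struct {
      				uint32_t elapsed;	/* client secs;
      							   server logs */
      			} leave;
      		} msg;
      	} pkt_t;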
  11. 07 Jan, 2002 1 commit
    • Checkpoint first working version of Frisbee Redux. This version · 86efdd9e
      Leigh B. Stoller authored
      requires the Linux threads package to give us kernel-level pthreads.
      
      From: Leigh Stoller <stoller@fast.cs.utah.edu>
      To: Testbed Operations <testbed-ops@fast.cs.utah.edu>
      Cc: Jay Lepreau <lepreau@cs.utah.edu>
      Subject: Frisbee Redux
      Date: Mon, 7 Jan 2002 12:03:56 -0800
      
      Server:
      The server is multithreaded. One thread takes in requests from the
      clients and adds each request to a work queue. The other thread
      processes the work queue in FIFO order, spitting out the desired block
      ranges. A request is a chunk/block/blockcount tuple, and most of the
      time the clients are requesting complete 1MB chunks. The exception of
      course is when individual blocks are lost, in which case the clients
      request just those subranges.  The server is totally asynchronous; it
      maintains a list of who is "connected", but that's just to make sure
      we can time the server out after a suitable period of inactivity. The
      server really only cares about the work queue; as long as the queue is
      non-empty, it spits out data.
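
      In outline, the two server threads look something like this (a
      skeleton under assumed names, not the actual source):

      	#include <pthread.h>

      	static void *
      	request_thread(void *arg)	/* takes in client requests */
      	{
      		(void)arg;
      		for (;;) {
      			/* recv a (chunk, block, blockcount) request
      			   and merge/append it onto the work queue */
      		}
      		return NULL;
      	}

      	static void *
      	data_thread(void *arg)		/* drains the work queue */
      	{
      		(void)arg;
      		for (;;) {
      			/* pop the queue in FIFO order, read the
      			   blocks, multicast them */
      		}
      		return NULL;
      	}

      	int
      	main(void)
      	{
      		pthread_t rt;

      		pthread_create(&rt, NULL, request_thread, NULL);
      		data_thread(NULL);	/* main thread pumps data */
      		return 0;
      	}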
      
      Client:
      The client is also multithreaded. One thread receives data packets and
      stuffs them in a chunkbuffer data structure. This thread also request more
      data, either to complete chunks with missing blocks, or to request new
      chunks. Each client can read ahead up 2 chunks, although with multiple
      clients it might actually be much further ahead as it also receives chunks
      that other clients requested. I set the number of chunk buffers to 16,
      although this is probably unnecessary as I will explain below. The other
      thread waits for chunkbuffers to be marked complete, and then invokes the
      imagunzip code on that chunk. Meanwhile, the other thread is busily getting
      more data and requesting/reading ahread, so that by the time the unzip is
      done, there is another chunk to unzip. In practice, the main thread never
      goes idle after the first chunk is received; there is always a ready chunk
      for it. Perfect overlap of I/O! In order to prevent the clients from
      getting overly synchronized (and causing all the clients to wait until the
      last client is done!), each client randomizes it block request order. This
      why we can retain the original frisbee name; clients end up catching random
      blocks flung out from the server until it has all the blocks.
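
      The randomization itself can be as simple as a Fisher-Yates shuffle
      of the request order, e.g. (a hypothetical helper, not the actual
      client code):

      	#include <stdlib.h>

      	/* Fill order[0..n-1] with 0..n-1 in random request order. */
      	void
      	randomize_order(int *order, int n)
      	{
      		int i, j, tmp;

      		for (i = 0; i < n; i++)
      			order[i] = i;
      		for (i = n - 1; i > 0; i--) {
      			j = random() % (i + 1);
      			tmp = order[i];
      			order[i] = order[j];
      			order[j] = tmp;
      		}
      	}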
      
      Performance:
      The single-node speed is about 180 seconds for our current full image.
      Frisbee V1 compares at about 210 seconds. The two-node times were 181
      and 174 seconds. The amount of CPU used for the two-node run ranged
      from 1% to 4%, typically averaging about 2% while I watched it with
      "top."
      
      The main problem on the server side is how to keep boss (1GHz with a
      Gbit ethernet) from spitting out packets so fast that 1/2 of them get
      dropped. I eventually settled on a static 1ms delay every 64K of
      packets sent. Nothing to be proud of, but it works.
      
      As mentioned above, the number of chunk buffers is 16, although only a
      few of them are used in practice. The reason is that the network
      transfer speed is perhaps 10 times faster than the decompression and
      raw device write speed. To know for sure, I would have to compare the
      per-byte transfer rate for 350MB over the network against the time to
      decompress and write the 1.2GB of data to the raw disk. With such a
      big difference, it's only necessary to ensure that you stay 1 or 2
      chunks ahead, since you can request 10 chunks in the time it takes to
      write one of them.