Commit 5c02231f authored by Leigh B. Stoller

I noticed in the 12 node tests that CPU was now running at 5-6%. I
also noticed that the slower machines were falling very far behind the
faster machines (the faster machines request chunks faster), and
actually dropping chunks because they had no room for them
(chunkbufs at 32). I increased the timeout on the client (if no blocks
are received for this long, request something) from 30ms to 90ms. This
helped a bit, but the real help was increasing chunkbufs up to 64.
Now the clients run at pretty much single node speed (152/174), and
the CPU usage on boss went back down 2-3% during the run. The stats
show far less data loss and resending of blocks. In fact, we were
resending upwards of 300MB of data because of client loss. That went
down to about 14MB for the 12 node test.
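
The 30ms to 90ms change follows from the two defines in the diff: the receive timeout stays at 30000 usecs, and the idle timer count goes from 1 to 3. A minimal sketch of that arithmetic, assuming the client counts consecutive receive timeouts before re-requesting; `struct idle_state`, `idle_tick`, and `idle_timeout_usecs` are hypothetical names for illustration, not the client's actual code:

```c
#include <assert.h>
#include <stddef.h>

/* Values as they appear after this commit. */
#define PKTRCV_TIMEOUT          30000   /* usecs per receive timeout */
#define CLIENT_IDLETIMER_COUNT  3       /* timeouts before re-requesting */

/* Hypothetical state: consecutive receive timeouts seen so far. */
struct idle_state {
	int timeouts;
};

/* Total idle time (usecs) before the client requests something. */
long
idle_timeout_usecs(void)
{
	return (long)PKTRCV_TIMEOUT * CLIENT_IDLETIMER_COUNT;
}

/*
 * Call once per receive attempt. got_packet is nonzero if a block
 * arrived before PKTRCV_TIMEOUT expired. Returns 1 when the client
 * has been idle long enough that it should issue a new request.
 */
int
idle_tick(struct idle_state *st, int got_packet)
{
	assert(st != NULL);

	if (got_packet) {
		st->timeouts = 0;
		return 0;
	}
	if (++st->timeouts >= CLIENT_IDLETIMER_COUNT) {
		st->timeouts = 0;
		return 1;
	}
	return 0;
}
```

With these values `idle_timeout_usecs()` gives 90000 usecs, and `idle_tick` only signals a re-request on the third consecutive timeout.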

Then I ran a 24 node test. Very sweet. All 24 nodes ran in 155 -
180 seconds. CPU peaked at 6%, and dropped off to steady state of 4%.
None of the nodes saw any duplicate chunks. Note that the client is
probably going to need some backoff code in case the server dies, to
prevent swamping the boss with unanswerable packets. Next step is to
have Matt run a test when he swaps in his 40 nodes.
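
The backoff code mentioned above does not exist yet in this commit. One possible shape for it is a capped exponential backoff on the request interval. This is only a sketch: the function name, the base interval (taken here to be the 90ms idle timeout), and the 5 second cap are all assumptions, not anything from the source:

```c
#include <assert.h>

/* Assumed values, for illustration only. */
#define BACKOFF_BASE_USECS   90000L     /* start at the 90ms idle timeout */
#define BACKOFF_MAX_USECS    5000000L   /* never wait longer than 5 seconds */

/*
 * Hypothetical helper: given the number of consecutive unanswered
 * requests, return how long to wait before sending the next one.
 * The delay doubles per failure and is clamped to BACKOFF_MAX_USECS.
 */
long
backoff_delay_usecs(int failures)
{
	long delay = BACKOFF_BASE_USECS;

	assert(failures >= 0);

	/* Stop doubling once the cap is reached, so delay cannot overflow. */
	while (failures-- > 0 && delay < BACKOFF_MAX_USECS)
		delay *= 2;

	return delay < BACKOFF_MAX_USECS ? delay : BACKOFF_MAX_USECS;
}
```

A client would reset its failure count to zero as soon as any block arrives, so a healthy server still sees requests at the normal rate.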
parent 30a76eee
@@ -17,17 +17,17 @@
 /*
  * The number of chunk buffers in the client.
  */
-#define MAXCHUNKBUFS 16
+#define MAXCHUNKBUFS 64
 /*
  * The number of read-ahead chunks that the client will request
- * at a time. No point in requesting to far ahead either, since they
+ * at a time. No point in requesting too far ahead either, since they
  * are uncompressed/written at a fraction of the network transfer speed.
  * Also, with multiple clients at different stages, each requesting blocks
  * it is likely that there will be plenty more chunks ready or in progress.
  */
 #define MAXREADAHEAD 2
-#define MAXINPROGRESS 4
+#define MAXINPROGRESS 8
 /*
  * Timeout (in usecs) for packet receive. The idletimer number is how
@@ -36,12 +36,13 @@
  * more.
  */
 #define PKTRCV_TIMEOUT 30000
-#define CLIENT_IDLETIMER_COUNT 1
+#define CLIENT_IDLETIMER_COUNT 3
 /*
  * Timeout (in seconds!) server will hang around with no active clients.
  * Make it zero to never exit.
  */
-#define SERVER_INACTIVE_SECONDS 30
+#define SERVER_INACTIVE_SECONDS 0
 /*
  * The number of disk read blocks in a single read on the server.
......