- 15 Oct, 2003 1 commit
Mike Hibler authored
as defined in the defs-* file (e.g., "TBLOGFACIL=local2"). The default is "local5", which is what we are set up to use, so you shouldn't need to mess with your defs- file! Perl scripts just get this value configured in when configure is run. C programs get the value in one of two ways. Programs that are intimate with the testbed infrastructure and include "config.h" just get it from that file. Programs that we sometimes use outside the Emulab build environment (e.g., frisbee, capture) and that don't include config.h get the value via "-DLOG_TESTBED=..." on the GNUmakefile build line. If the value isn't set, it defaults to what it used to be (usually LOG_USER). Still to do: healthd, hmcd (whose build doesn't seem to be completely integrated) and plabdaemon.in (since it's icky python :-)
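A minimal sketch of the C-side pattern this commit describes, with assumed program and facility names (not the verbatim Emulab source): config.h or a -DLOG_TESTBED=... build flag supplies the facility, and anything else falls back to the old LOG_USER behavior.

```c
#include <syslog.h>

/* Fall back to the historical facility when nothing was configured. */
#ifndef LOG_TESTBED
#define LOG_TESTBED	LOG_USER
#endif

int
main(void)
{
	/* "capture" is just a placeholder program name for this sketch. */
	openlog("capture", LOG_PID, LOG_TESTBED);
	syslog(LOG_INFO, "logging to the configured testbed facility");
	closelog();
	return 0;
}
```

A standalone build would pass something like `-DLOG_TESTBED=LOG_LOCAL5` on the compile line to match the "local5" default mentioned above.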
- 26 Nov, 2002 1 commit
Mike Hibler authored
1. Client: add "NAK avoidance." We track our (and others', via snooping) block requests and avoid making re-requests unless it has been "long enough" (sketched below).
2. Server: more aggressive merging of requests in the work queue. For every new request, look for any overlap with an existing entry.
3. Server: from Leigh: first cut at dynamic rate adjustment. Can be enabled with the -D option.
4. Both: change a lot of the magic constants into runtime variables so that they can be adjusted on the command line or via the event interface (see below).
5. Add code to do basic validation of incoming packets.
6. Client: randomization of block request order is now optional.
7. Client: startup delay is optional and specified via a parameter N which says "randomly delay between 0 and N seconds before attempting to join."
8. Both: add a new LEAVE message which reports all the client stats back to the server (which logs them).
9. Both: attempt to comment some of the magic values in decls.h.
10. Both: add a cheezy hack to fake packet loss. Disabled by default, see the GNUmakefile. This code is coming out right after I archive it with this commit.
11. Add tracing code. The frisbee server/client will record a number of interesting events in a memory buffer and dump them at the end. Not compiled in by default; see the GNUmakefile (NEVENTS) for turning this on.
12. Not to be confused with the events above, also added testbed event system code so that frisbee clients can be remotely controlled. This is a hack for measurement purposes (it requires a special rc.frisbee in the frisbee MFS). Allows changing all sorts of parameters as well as implementing a crude form of identification, allowing you to start only a subset of clients. Interface is via tevc with commands like:

    tevc -e testbed,frisbee now frisbee start maxclients=5 readahead=5
    tevc -e testbed,frisbee now frisbee stop exitstatus=42

    Again, this is not compiled in by default as it makes the client about 4x bigger. See the GNUmakefile for turning it on.
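Item 1's "NAK avoidance" boils down to remembering when each chunk was last requested, whether by us or by a client we overheard, and only re-requesting after "long enough" has passed. A rough sketch with assumed names and constants (not the actual frisbee client code):

```c
#include <sys/time.h>

#define MAXCHUNKS	2048		/* assumed number of 1MB chunks in an image */
#define REREQUEST_USECS	500000L		/* assumed "long enough" threshold (0.5 sec) */

static struct timeval lastreq[MAXCHUNKS];	/* all zero == never requested */

/* Note any request for a chunk, ours or one snooped from another client. */
void
note_request(int chunk)
{
	gettimeofday(&lastreq[chunk], NULL);
}

/* Should we (re)issue a request for this chunk right now? */
int
should_request(int chunk)
{
	struct timeval now;
	long elapsed;

	if (lastreq[chunk].tv_sec == 0)
		return 1;		/* never requested, go ahead */

	gettimeofday(&now, NULL);
	elapsed = (now.tv_sec - lastreq[chunk].tv_sec) * 1000000L +
	    (now.tv_usec - lastreq[chunk].tv_usec);
	return (elapsed >= REREQUEST_USECS);
}
```

In the real client the threshold would presumably be one of the runtime-adjustable variables mentioned in item 4 rather than a compile-time constant.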
- 07 Jul, 2002 1 commit
Leigh Stoller authored
- 14 Jan, 2002 1 commit
Leigh Stoller authored
- 10 Jan, 2002 1 commit
Leigh Stoller authored
statements from the two threads.
- 07 Jan, 2002 1 commit
Leigh Stoller authored
requires the linux threads package to give us kernel level pthreads.

From: Leigh Stoller <stoller@fast.cs.utah.edu>
To: Testbed Operations <testbed-ops@fast.cs.utah.edu>
Cc: Jay Lepreau <lepreau@cs.utah.edu>
Subject: Frisbee Redux
Date: Mon, 7 Jan 2002 12:03:56 -0800

Server: The server is multithreaded. One thread takes in requests from the clients and adds them to a work queue. The other thread processes the work queue in FIFO order, spitting out the desired block ranges (see the sketch below). A request is a chunk/block/blockcount tuple, and most of the time the clients are requesting complete 1MB chunks. The exception, of course, is when individual blocks are lost, in which case the clients request just those subranges. The server is totally asynchronous; it maintains a list of who is "connected", but that's just to make sure we can time the server out after a suitable inactive period. The server really only cares about the work queue; as long as the queue is non-empty, it spits out data.

Client: The client is also multithreaded. One thread receives data packets and stuffs them in a chunkbuffer data structure. This thread also requests more data, either to complete chunks with missing blocks or to request new chunks. Each client can read ahead up to 2 chunks, although with multiple clients it might actually be much further ahead as it also receives chunks that other clients requested. I set the number of chunk buffers to 16, although this is probably unnecessary as I will explain below. The other thread waits for chunkbuffers to be marked complete, and then invokes the imageunzip code on that chunk. Meanwhile, the first thread is busily getting more data and requesting/reading ahead, so that by the time the unzip is done, there is another chunk to unzip. In practice, the main thread never goes idle after the first chunk is received; there is always a ready chunk for it. Perfect overlap of I/O!

In order to prevent the clients from getting overly synchronized (and causing all the clients to wait until the last client is done!), each client randomizes its block request order. This is why we can retain the original frisbee name; clients end up catching random blocks flung out from the server until they have all the blocks.

Performance: The single node speed is about 180 seconds for our current full image. Frisbee V1 compares at about 210 seconds. The two node speeds were 181 and 174 seconds. The amount of CPU used for the two node run ranged from 1% to 4%, typically averaging about 2% while I watched it with "top." The main problem on the server side is how to keep boss (1GHz with a Gbit ethernet) from spitting out packets so fast that half of them get dropped. I eventually settled on a static 1ms delay every 64K of packets sent. Nothing to be proud of, but it works.

As mentioned above, the number of chunk buffers is 16, although only a few of them are used in practice. The reason is that the network transfer speed is perhaps 10 times faster than the decompression and raw device write speed. To know for sure, I would have to figure out the per-byte transfer rate for the 350MB sent over the network, versus the time to decompress and write the 1.2GB of data to the raw disk. With such a big difference, it's only necessary to ensure that you stay 1 or 2 chunks ahead, since you can request 10 chunks in the time it takes to write one of them.
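A sketch of the server-side work queue described in the Server paragraph, with assumed names and simplified locking (not the actual frisbee server code). One thread fields client requests and appends chunk/block/blockcount tuples; the other pops them in FIFO order and multicasts the corresponding blocks.

```c
#include <pthread.h>
#include <stdlib.h>

struct wqelem {
	int	chunk;		/* which 1MB chunk */
	int	block;		/* first block within the chunk */
	int	count;		/* number of blocks requested */
	struct wqelem *next;
};

static struct wqelem	*wqhead, *wqtail;
static pthread_mutex_t	 wqlock = PTHREAD_MUTEX_INITIALIZER;
static pthread_cond_t	 wqnonempty = PTHREAD_COND_INITIALIZER;

/* Request thread: queue a (chunk, block, count) tuple from a client. */
void
workqueue_enqueue(int chunk, int block, int count)
{
	struct wqelem *wq = calloc(1, sizeof(*wq));

	wq->chunk = chunk;
	wq->block = block;
	wq->count = count;

	pthread_mutex_lock(&wqlock);
	if (wqtail)
		wqtail->next = wq;
	else
		wqhead = wq;
	wqtail = wq;
	pthread_cond_signal(&wqnonempty);
	pthread_mutex_unlock(&wqlock);
}

/* Sender thread: block until there is work, then take the oldest entry. */
struct wqelem *
workqueue_dequeue(void)
{
	struct wqelem *wq;

	pthread_mutex_lock(&wqlock);
	while (wqhead == NULL)
		pthread_cond_wait(&wqnonempty, &wqlock);
	wq = wqhead;
	wqhead = wq->next;
	if (wqhead == NULL)
		wqtail = NULL;
	pthread_mutex_unlock(&wqlock);

	return wq;	/* caller sends the blocks, then free()s the entry */
}
```

The later "more aggressive merging" commit above would naturally live in workqueue_enqueue, scanning the list for an overlapping chunk/block range before appending a new entry.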