- 15 Jan, 2002 2 commits
-
-
Robert Ricci authored
-
Christopher Alfeld authored
-
- 14 Jan, 2002 11 commits
-
-
Robert Ricci authored
Made the snmpit_intel module conform to the same API as snmpit_cisco. Intel VLANs are now done per port rather than per MAC. This should give experimenters more flexibility on the experimental net, and is more consistent with the way VLANs are done on other switches. snmpit_intel_stack will need minor work to support stacks of multiple switches.
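For illustration, a minimal sketch of what per-port VLAN assignment over SNMP can look like, using the standard Q-BRIDGE-MIB dot1qPvid object and the net-snmp library. The actual Intel MIB objects snmpit_intel drives are not shown in this log, so the OID choice, host, and community handling here are assumptions, not the module's real implementation.

    #include <stdio.h>
    #include <string.h>
    #include <net-snmp/net-snmp-config.h>
    #include <net-snmp/net-snmp-includes.h>

    /*
     * Assign untagged traffic on a switch port to a VLAN by setting
     * dot1qPvid.<port> (Q-BRIDGE-MIB). Returns 0 on success. A sketch
     * only; the real module may use vendor-specific objects.
     */
    int
    set_port_vlan(const char *host, const char *community,
                  long port, long vlan_id)
    {
        netsnmp_session session, *ss;
        netsnmp_pdu *pdu, *response = NULL;
        oid name[MAX_OID_LEN];
        size_t namelen = MAX_OID_LEN;
        char oidstr[80], valstr[16];
        int rv = -1;

        init_snmp("setvlan");
        snmp_sess_init(&session);
        session.peername = (char *)host;
        session.version = SNMP_VERSION_2c;
        session.community = (u_char *)community;
        session.community_len = strlen(community);

        if ((ss = snmp_open(&session)) == NULL)
            return -1;

        /* dot1qPvid is indexed by bridge port number. */
        snprintf(oidstr, sizeof(oidstr),
                 ".1.3.6.1.2.1.17.7.1.4.5.1.1.%ld", port);
        snprintf(valstr, sizeof(valstr), "%ld", vlan_id);

        pdu = snmp_pdu_create(SNMP_MSG_SET);
        if (read_objid(oidstr, name, &namelen) &&
            snmp_add_var(pdu, name, namelen, 'u', valstr) == 0 &&
            snmp_synch_response(ss, pdu, &response) == STAT_SUCCESS &&
            response->errstat == SNMP_ERR_NOERROR)
            rv = 0;

        if (response)
            snmp_free_pdu(response);
        snmp_close(ss);
        return rv;
    }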
-
Robert Ricci authored
-
Christopher Alfeld authored
-
Leigh B. Stoller authored
-
Leigh B. Stoller authored
* Add appropriate goo to os/GNUMakefile so that the Frisbee daemon is built and installed.

* Rework the frisbee launcher slightly. Aside from little changes (send email to tbops when frisbeed dies, new command-line syntax for frisbeed), allow frisbeed to exit gracefully after a period of inactivity (no client requests for 30 minutes, at present). To prevent a race between a new client being added (and rebooted) and frisbeed terminating before that client gets started, add a load_busy indicator to the images table (next to the load_address slot) and set it to one each time frisbeelauncher is invoked. When frisbeed exits, test and clear that bit atomically (lock tables); if it was set, go around another time (restart frisbeed for another 30 minute period). See the sketch after this list.

* Rework waitmode in os_load. Wait for all of the nodes to finish at once, and track which nodes never finish. Retry those nodes by rebooting them. The number of retries is configurable in the script, and is currently set to one. This should take care of some PXE boot related problems, although obviously not all.

* Get rid of the -w option to os_load and make waitmode the default. The -s option can be used to start a reload, but not to wait for it to complete.

* Minor changes to sched_reload and reload_daemon; pass the -s option to os_load.
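A minimal sketch of that atomic test-and-clear, assuming a MySQL images table with the load_busy column described above. The libmysqlclient calls are real; the connection setup and the imageid parameter around them are illustrative.

    #include <stdio.h>
    #include <stdlib.h>
    #include <mysql/mysql.h>

    /* Returns 1 if another client arrived while frisbeed was exiting,
       in which case the launcher should restart frisbeed. */
    int
    test_and_clear_busy(MYSQL *db, const char *imageid)
    {
        MYSQL_RES *res;
        MYSQL_ROW row;
        char query[256];
        int busy = 0;

        /* Lock the table so the test and the clear are atomic. */
        mysql_query(db, "LOCK TABLES images WRITE");

        snprintf(query, sizeof(query),
                 "SELECT load_busy FROM images WHERE imageid='%s'",
                 imageid);
        mysql_query(db, query);
        res = mysql_store_result(db);
        if (res && (row = mysql_fetch_row(res)) && row[0])
            busy = atoi(row[0]);
        if (res)
            mysql_free_result(res);

        snprintf(query, sizeof(query),
                 "UPDATE images SET load_busy=0 WHERE imageid='%s'",
                 imageid);
        mysql_query(db, query);

        mysql_query(db, "UNLOCK TABLES");
        return busy;    /* set: go around again for another period */
    }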
-
Leigh B. Stoller authored
as independent options (-m and -p).
-
Leigh B. Stoller authored
-
Leigh B. Stoller authored
-
Leigh B. Stoller authored
-
Leigh B. Stoller authored
as requested by Jay.
-
Christopher Alfeld authored
code had been added in a bunch of places to clear the data out. The portmap table in the database can now be dropped.
-
- 11 Jan, 2002 10 commits
-
-
Leigh B. Stoller authored
The frisbee launcher now deals with multiple requests.
-
Christopher Alfeld authored
-
Christopher Alfeld authored
-
Christopher Alfeld authored
-
Christopher Alfeld authored
-
Christopher Alfeld authored
-
Leigh B. Stoller authored
flood (say, if there were 40 clients wanting chunks). I added some backoff code that slows the rate at which the clients make requests in the face of a non-answering daemon. It backs off in increments of the PKTRCV timeout value (30ms at present) until it gets to one second, then holds at one second intervals.
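A minimal sketch of that backoff scheme, assuming the timeout is tracked in microseconds; the constant names are made up, while the 30ms step and one second cap come from the text.

    #define PKTRCV_TIMEOUT  30000       /* 30ms, in microseconds */
    #define MAX_TIMEOUT     1000000     /* hold at one second */

    static int timeout = PKTRCV_TIMEOUT;

    /* Call when a chunk request goes unanswered: stretch the wait
       by one PKTRCV step until it caps at one second. */
    static void
    backoff(void)
    {
        if (timeout < MAX_TIMEOUT)
            timeout += PKTRCV_TIMEOUT;
        if (timeout > MAX_TIMEOUT)
            timeout = MAX_TIMEOUT;
    }

    /* Call when blocks start arriving again. */
    static void
    backoff_reset(void)
    {
        timeout = PKTRCV_TIMEOUT;
    }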
-
Leigh B. Stoller authored
status when we kill them off intentionally.
-
Leigh B. Stoller authored
-
Leigh B. Stoller authored
defs file. For mini, revert the domain back to .emulab.net and set the suffix to "-mini". For the others, default it to "" to avoid invalidating current users and logins. Tested with opera, and seems to work okay now.
-
- 10 Jan, 2002 17 commits
-
-
Mike Hibler authored
-
Mike Hibler authored
-
Leigh B. Stoller authored
Capserver and capture now handshake the owner/group of the tipline. The owner defaults to root, and the group defaults to root when the node is not allocated. Capture does the chmod after the handshake, so if boss is down when capture starts, the acl/run file will get 0,0, but will get the proper owner/group later, once capture is able to handshake. As a result, console_setup.proxy was trimmed down and cleaned up a bit, since it no longer has to muck with some of this stuff.

A second change was to support multiple tiplines per node. I have modified the tiplines table as such:

    +---------+-------------+------+-----+---------+-------+
    | Field   | Type        | Null | Key | Default | Extra |
    +---------+-------------+------+-----+---------+-------+
    | tipname | varchar(32) |      | PRI |         |       |
    | node_id | varchar(10) |      |     |         |       |
    | server  | varchar(64) |      |     |         |       |
    +---------+-------------+------+-----+---------+-------+

That is, the name of the tip device (given to capture) is the unique key, and there can be multiple tiplines associated with each node. console_setup now uses the tiplines table to determine which tiplines need to be reset; it used to be just the node_id passed into console_setup. Conversely, capserver uses the tipname to map back to the node_id, so that it can get the owner/group from the reserved table.

I also removed the shark hack from nalloc, nfree, and console_reset, since there is no longer any need for it; this can now be described completely with tiplines table entries. If we ever bring the sharks back, we will need to generate new entries. Hah!
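A minimal sketch of the capture side of that handshake, under the assumption that capserver replies with a uid/gid pair; the wire format, socket, file names, and mode bits here are all illustrative, not the actual protocol.

    #include <sys/types.h>
    #include <sys/stat.h>
    #include <unistd.h>

    struct capret {         /* assumed reply format from capserver */
        uid_t uid;          /* owner; root (0) if node unallocated */
        gid_t gid;          /* group; root (0) if node unallocated */
    };

    /* Apply owner/group to the acl/run files after the handshake.
       If boss is down, the reply never comes and the files keep 0,0
       until a later handshake succeeds. */
    static void
    handshake_perms(int sock, const char *aclfile, const char *runfile)
    {
        struct capret ret;

        if (read(sock, &ret, sizeof(ret)) != sizeof(ret))
            return;

        chown(aclfile, ret.uid, ret.gid);
        chown(runfile, ret.uid, ret.gid);
        chmod(aclfile, 0640);
        chmod(runfile, 0640);
    }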
-
Mike Hibler authored
-
Mike Hibler authored
-
Leigh B. Stoller authored
(somewhat) so that we can do submenu easily in other pages.
-
Leigh B. Stoller authored
-
Robert Ricci authored
account for the time it may take for changes made at the master to propagate to the slaves. Added a parameter to override this, since sometimes we know we're talking to the master, so the delay does not come into play. This should improve the running time of snmpit by about 10 seconds per VLAN created, since we can tell right away whether the VLAN already exists.
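A minimal sketch of that delay-with-override idea, in C for illustration; the function names are made up, and the 10-second figure is an assumption drawn from the running-time estimate above.

    #include <unistd.h>

    #define PROPAGATION_DELAY 10   /* seconds; assumed from the text */

    /* Hypothetical stand-in for the actual SNMP existence check. */
    static int
    lookup_vlan(int vlan_id)
    {
        (void)vlan_id;
        return 0;
    }

    /*
     * Check whether a VLAN exists, allowing time for a change made
     * on the master to reach the slaves. Pass on_master=1 when we
     * know we are talking to the master, so the delay is skipped.
     */
    static int
    vlan_exists(int vlan_id, int on_master)
    {
        if (!on_master)
            sleep(PROPAGATION_DELAY);
        return lookup_vlan(vlan_id);
    }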
-
Christopher Alfeld authored
-
Christopher Alfeld authored
-
Leigh B. Stoller authored
-
Robert Ricci authored
keys on boss and ops.
-
Leigh B. Stoller authored
-
Leigh B. Stoller authored
also noticed that the slower machines were getting very far behind the faster machines (the faster machines request chunks faster), and actually dropping chunks because they have no room for them (chunkbufs at 32). I increased the timeout on the client (if no blocks are received for this long, request something) from 30ms to 90ms. This helped a bit, but the real help was increasing chunkbufs to 64. Now the clients run at pretty much single node speed (152/174), and the CPU usage on boss went back down to 2-3% during the run. The stats show far less data loss and resending of blocks. In fact, we were resending upwards of 300MB of data because of client loss. That went down to about 14MB for the 12 node test.

Then I ran a 24 node test. Very sweet. All 24 nodes ran in 155-180 seconds. CPU peaked at 6%, and dropped off to a steady state of 4%. None of the nodes saw any duplicate chunks.

Note that the client is probably going to need some backoff code in case the server dies, to prevent swamping boss with unanswerable packets. Next step is to have Matt run a test when he swaps in his 40 nodes.
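For illustration, a sketch of the chunk-buffer limit at work: when every buffer is busy, an arriving chunk is dropped and must be resent, which is why raising chunkbufs from 32 to 64 helped the slow clients. The structure and the 1MB chunk size are assumptions.

    #define CHUNKBUFS 64    /* was 32; raised per the note above */

    struct chunkbuf {
        int  inuse;
        int  chunkno;
        char data[1024 * 1024];   /* assumed 1MB chunk size */
    };

    static struct chunkbuf chunkbufs[CHUNKBUFS];

    /* Find a free buffer for an arriving chunk; NULL means there is
       no room, so the chunk is dropped and later re-requested. */
    static struct chunkbuf *
    alloc_chunkbuf(int chunkno)
    {
        int i;

        for (i = 0; i < CHUNKBUFS; i++) {
            if (!chunkbufs[i].inuse) {
                chunkbufs[i].inuse = 1;
                chunkbufs[i].chunkno = chunkno;
                return &chunkbufs[i];
            }
        }
        return NULL;
    }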
-
Leigh B. Stoller authored
on for now, since it's minor code and spits out good info to the console.
-
Leigh B. Stoller authored
forever.
-
Leigh B. Stoller authored
-