- 08 Feb, 2002 1 commit
-
-
Leigh B. Stoller authored
supporting autocreating and autoloading images. The imageid form now sports a field to specify a nodeid to create the image from; if set, the backend create_image script is invoked. That's the easy part. Slightly harder is autoloading images based on the osid specified in the NS file. To support this, I have added a new DB table called osidtoimageid, which holds the mapping from osid/pctype to imageid. When users create images, they must specify what node types that image is good for. Obviously, the mappings have to be unique or it would be impossible to figure it out! Anyway, once that image mapping is in place and the image created, the user can specify that ID in the NS file. I've changed os_setup to look for IDs that are not loaded, and to try to find one in the osidtoimageid table. If found, it invokes os_load. To keep things running in parallel as much as possible, os_setup issues all the loads/reboots (could be more than a single set of loads if multiple IDs are in the NS file) at once, and waits for all the children to exit. I've hacked up os_load a bit to try to be more robust in the face of PXE failures, which still happen and are rather troublesome. Need an event system! Contained in this revision are unrelated changes to make the OS and Image IDs per-project unique instead of globally unique, since that's a pain for the users. This turns out to be very messy, since underneath we do not want to pass around pid/ID in all the various places it's used. Rather, I create a globally unique name and extended the OS and Image tables to include pid/name/ID. The user selects pid/name, and I create the globally unique ID. For the most part this is invisible throughout the system, except where we interface with the user, say in the web pages; the user should see his chosen name where possible, and should invoke scripts (os_load, create_image, etc.) using his/her name, not the internal ID. Also, in the front end the NS file should use the user name, not the ID.
All in all, this accounted for a number of annoying changes and some special cases that are unavoidable.
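The osid/pctype lookup described above can be sketched roughly as follows. This is a minimal Python illustration, not the actual os_setup code: the class `OsidToImageid`, the function `plan_loads`, and all IDs used are hypothetical, and an in-memory dict stands in for the osidtoimageid table.

```python
# Hypothetical in-memory stand-in for the osidtoimageid table.
class OsidToImageid:
    def __init__(self):
        self._map = {}  # (osid, pctype) -> imageid

    def add(self, osid, pctype, imageid):
        # Mappings must be unique: one image per osid/pctype pair.
        key = (osid, pctype)
        if key in self._map and self._map[key] != imageid:
            raise ValueError("duplicate mapping for %s/%s" % key)
        self._map[key] = imageid

    def lookup(self, osid, pctype):
        # Returns None when no image provides this osid on this node type.
        return self._map.get((osid, pctype))


def plan_loads(nodes, table):
    """Group the nodes that need a load by the image to load.

    nodes is a list of (node_id, pctype, wanted_osid, loaded_osid) tuples
    (an assumed shape, chosen for this sketch).
    """
    loads = {}
    for node_id, pctype, wanted, loaded in nodes:
        if wanted == loaded:
            continue  # the requested OS is already on the disk
        imageid = table.lookup(wanted, pctype)
        if imageid is None:
            raise LookupError("no image for %s on %s" % (wanted, pctype))
        loads.setdefault(imageid, []).append(node_id)
    return loads
```

Each key of the returned dict corresponds to one set of loads that could be issued together, which is why more than one set may be needed when several IDs appear in the NS file.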
-
- 07 Feb, 2002 1 commit
-
-
Robert Ricci authored
retry (or warn) about nodes that get stuck more than once.
-
- 01 Feb, 2002 1 commit
-
-
Mike Hibler authored
understanding of how mountd operates. Things that should be fixed:
1. It iterates over every node calculating what directories are exported, what FSes they are on, etc. Most of that work only needs to happen per experiment.
2. The algorithm to determine what FS a directory is on is a hack. I just take the first component of the path provided. As long as we mount all our FSes at the top level and configure with canonical paths, this is ok. Other solutions require calling out to the ops node to get actual mount info.
3. Once shared experiments are revived, the code to determine exported directories will need to change. The algorithm for computing the exports lines should still be correct.
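The "first path component" heuristic from item 2 can be illustrated with a short Python sketch. The function names and example paths are hypothetical, and the sketch assumes, as the message does, that every filesystem is mounted at the top level and that paths are canonical.

```python
def fs_for_directory(path):
    """Guess the filesystem of a directory from its first path component."""
    parts = [p for p in path.split("/") if p]
    if not path.startswith("/") or not parts:
        raise ValueError("expected an absolute, non-root path: %r" % path)
    return "/" + parts[0]


def group_by_fs(directories):
    """Group exported directories by the filesystem they are assumed to be on."""
    groups = {}
    for d in directories:
        groups.setdefault(fs_for_directory(d), []).append(d)
    return groups
```

The heuristic breaks as soon as a filesystem is mounted below the top level, which is why the message calls it a hack.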
-
- 31 Jan, 2002 1 commit
-
-
Robert Ricci authored
the Intels take so long to apply changes, 20 seconds isn't long enough when multiple snmpit processes are run simultaneously.
-
- 30 Jan, 2002 2 commits
-
-
Robert Ricci authored
en/disabling of ports through other methods, like creating and deleting VLANs.)
-
Leigh B. Stoller authored
-
- 28 Jan, 2002 2 commits
-
-
Leigh B. Stoller authored
for admin types. This is independent of the unix groups table, but I check it for a duplicate just in case.
-
Robert Ricci authored
not the creation succeeded, which caused snmpit to erroneously think that it had failed.
-
- 23 Jan, 2002 4 commits
-
-
Robert Ricci authored
database. Intended to be used to recover switch state after a crash or power outage. This option is fairly dangerous, as it temporarily disrupts all experimental traffic, and will remove all hand-created VLANs. So, it interactively asks for confirmation that the user (who must be an admin, of course) really wants to do this.
-
Robert Ricci authored
-
Robert Ricci authored
things from getting cleaned up on failure.
-
Robert Ricci authored
First, the stack-level createVlan() function no longer takes as an argument a list of devices the VLAN exists on, since it looks like this will never be needed. In its place, createVlan() now takes a list of ports, so that it can (if so desired) put the ports in the VLAN without a separate lock and unlock. The snmpit_intel module now uses its 'nested locking' feature to avoid additional locking in these cases. Note, though, that the way that this is done is not safe for multiple switches in a stack. If we ever have to support multiple Intels (looks doubtful), this will have to be removed, or locking will need to be moved a level up to snmpit_intel_stack. Yuck. For Intels, the removeVlan() function calls removePortsFromVlan() itself, again to save locking overhead. The Cisco behavior, however, is unchanged, as locking is not expensive, and this would be too messy.
-
- 22 Jan, 2002 5 commits
-
-
Robert Ricci authored
slow, but we seem to have problems otherwise.
-
Robert Ricci authored
multiple operations. For example, you can now remove multiple VLANs with one command, like:

    snmpit -o foo -o bar

You can now give more than one -i option, so that you can give a list of more than one switch to operate on, like:

    snmpit -i cisco3 -i cisco4 -l

Converted from using Getopt::Std to Getopt::Long, which is more flexible and better documented. All the 'worker' ( do*() ) functions now take a list of stacks as the first argument, rather than using a global @stacks variable. Fixed up the usage message, which was out of date in some cases, and inaccurate in others.
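For readers unfamiliar with repeatable options, here is a rough Python analogue using argparse. It is not the snmpit implementation, which is Perl and uses Getopt::Long; the parser name and the `dest` names are assumptions made for the sketch. Only the -i, -o and -l flags come from the examples above.

```python
import argparse


def parse_args(argv):
    # action="append" collects every occurrence of an option into a list,
    # so repeating -i or -o builds up a list of values.
    parser = argparse.ArgumentParser(prog="snmpit-sketch")
    parser.add_argument("-i", dest="switches", action="append", default=[])
    parser.add_argument("-o", dest="remove_vlans", action="append", default=[])
    parser.add_argument("-l", dest="list_vlans", action="store_true")
    return parser.parse_args(argv)
```

Passing the parsed lists to worker functions explicitly, rather than reading a global, mirrors the change to the do*() functions described above.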
-
Robert Ricci authored
cleaner, but they seem to cause problems in some cases.
-
Robert Ricci authored
out of alignment with the columns.
-
Robert Ricci authored
we treat Ciscos (in which we ignore VLAN #1)
-
- 18 Jan, 2002 8 commits
-
-
Leigh B. Stoller authored
fact is defaulted in configure.in but can be overridden in the defs file. Changed the one perl script that had a hardwired flux group. The other dozen or so uses are in the web page code. I'll do those next.
-
Robert Ricci authored
immensely. Also removed some unnecessary code for verifying the results of certain SNMP requests that turned out to be broken when both BLOCK and CONFIRM are off.
-
Leigh B. Stoller authored
-
Leigh B. Stoller authored
-
Robert Ricci authored
They're now named /etc/namedb/@OURDOMAIN@.db*
-
Leigh B. Stoller authored
-
Leigh B. Stoller authored
-
Leigh B. Stoller authored
-
- 17 Jan, 2002 4 commits
-
-
Leigh B. Stoller authored
image is bigger than any previous image we have dealt with! Probably need to make this dynamic in some way.
-
Robert Ricci authored
2.70. The new code will be compatible with the old firmware version (2.42), and should be slightly faster. Also fixed some return values, so that they don't appear to have failed when they actually succeeded.
-
Robert Ricci authored
more helpful error than complaining about calling methods on an undefined variable. Also fixed some logic in snmpit_intel_stack that was causing it to report failure when it had, in fact, succeeded (this was purely cosmetic.)
-
Robert Ricci authored
-r. When this check is made, frisbeelauncher is still running as root, and -R uses the real, rather than effective, uid for the check.
-
- 16 Jan, 2002 1 commit
-
-
Robert Ricci authored
Appropriate environment cleaning and taint checking is done, and it drops root privileges immediately after opening the logfile, so frisbeed still runs as the invoking user rather than root.
-
- 15 Jan, 2002 1 commit
-
-
Christopher Alfeld authored
-
- 14 Jan, 2002 5 commits
-
-
Robert Ricci authored
snmpit_intel module conform to the same API as snmpit_cisco. Intel VLANs are now done per port rather than per MAC. This should give experimenters more flexibility on the experimental net, and is more consistent with the way that VLANs are done on other switches. snmpit_intel_stack will need to undergo minor work to support stacks of multiple switches.
-
Robert Ricci authored
-
Leigh B. Stoller authored
-
Leigh B. Stoller authored
* Add appropriate goo to os/GNUMakefile so that Frisbee daemon is built and installed.
* Rework the frisbee launcher slightly. Aside from little changes (send email to tbops when frisbeed dies, new cmdline syntax to frisbeed), allow for frisbeed to exit gracefully after a period of inactivity (no client requests for 30 minutes, at present). In order to prevent a race condition with a new client being added (and rebooted) and frisbeed terminating before the client gets started, add a load_busy indicator to the images table (next to load_address slot) and set that to one each time frisbeelauncher is invoked. When frisbeed exits, test and clear that bit atomically (lock tables) and go around another time (restart frisbeed for another 30 minute period).
* Rework waitmode in os_load. Wait for all of the nodes to finish at once, and track which nodes never finish. Retry those nodes again by rebooting. The number of retries is configurable in the script, and is currently set to one. This should take care of some PXE boot related problems, although obviously not all.
* Got rid of -w option to os_load and made waitmode the default. The -s option can be used to start a reload, but not to wait for it to complete.
* Minor changes to sched_reload and reload_daemon; pass in -s option to os_load.
-
Christopher Alfeld authored
code had been added in a bunch of places to clear the data out. The portmap table in the database can now be dropped.
-
- 11 Jan, 2002 2 commits
-
-
Christopher Alfeld authored
-
Christopher Alfeld authored
-
- 10 Jan, 2002 2 commits
-
-
Leigh B. Stoller authored
Capserver and capture now handshake the owner/group of the tipline. Owner defaults to root, and the group defaults to root when the node is not allocated. Capture will do the chmod after the handshake, so if boss is down when capture starts, the acl/run file will get 0,0, but will get the proper owner/group later after it's able to handshake. As a result, console_setup.proxy was trimmed down and cleaned up a bit, since it no longer has to muck with some of this stuff.

A second change was to support multiple tiplines per node. I have modified the tiplines table as such:

+---------+-------------+------+-----+---------+-------+
| Field   | Type        | Null | Key | Default | Extra |
+---------+-------------+------+-----+---------+-------+
| tipname | varchar(32) |      | PRI |         |       |
| node_id | varchar(10) |      |     |         |       |
| server  | varchar(64) |      |     |         |       |
+---------+-------------+------+-----+---------+-------+

That is, the name of the tip device (given to capture) is the unique key, and there can be multiple tiplines associated with each node. console_setup now uses the tiplines table to determine what tiplines need to be reset; it used to be just the name of the node_id passed into console_setup. Conversely, capserver uses the tipname to map back to the node_id, so that it can get the owner/group from the reserved table. I also removed the shark hack from nalloc, nfree, and console_reset, since there is no longer any need for that; this can be described completely now with tiplines table entries. If we ever bring the sharks back, we will need to generate new entries. Hah!
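The two lookups described above (node to tiplines, and tipname back to node) can be sketched in Python against an in-memory SQLite copy of the table. The column names and sizes follow the message; the helper functions, tip names, node names, and server name are hypothetical, and the real tools run against the testbed database rather than SQLite.

```python
import sqlite3


def make_tiplines_db():
    db = sqlite3.connect(":memory:")
    # tipname is the unique key; node_id may repeat across rows.
    db.execute(
        "CREATE TABLE tiplines ("
        " tipname VARCHAR(32) NOT NULL PRIMARY KEY,"
        " node_id VARCHAR(10) NOT NULL,"
        " server  VARCHAR(64) NOT NULL)"
    )
    return db


def tiplines_for_node(db, node_id):
    """All tip devices attached to a node (what a reset would iterate over)."""
    rows = db.execute(
        "SELECT tipname FROM tiplines WHERE node_id = ? ORDER BY tipname",
        (node_id,),
    )
    return [r[0] for r in rows]


def node_for_tipname(db, tipname):
    """Map a tip device name back to its node, or None if unknown."""
    row = db.execute(
        "SELECT node_id FROM tiplines WHERE tipname = ?", (tipname,)
    ).fetchone()
    return row[0] if row else None
```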
-
Robert Ricci authored
account for the time it may take for changes made at the master to propagate to the slaves. Added a parameter to override this, as sometimes we know that we're talking to the master, so the delay does not come into play. This should improve the running time of snmpit by about 10 seconds per VLAN created, since we can tell right away if the VLAN already exists or not.
-