- 23 Mar, 2004 2 commits
-
-
Kirk Webb authored
* Small fix to DBQueryFatal in libdb.py: () is a valid return value, so don't fail on it (insert/replace); do fail if DBQuery returns None, though.
* Fix up libplab.py to not choke on new plab_slices column.
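For illustration, the return-value check described above might look like the following minimal sketch. The `run_query` callable, the `DBQueryError` exception, and raising instead of exiting are assumptions made for this example; none of them is taken from libdb.py.

```python
class DBQueryError(Exception):
    """Hypothetical error type; the real libdb.py may exit instead."""


def db_query_fatal(run_query, sql):
    """Run a query and fail only when the query layer reports failure.

    run_query is a hypothetical stand-in for DBQuery: it is assumed to
    return a (possibly empty) tuple of rows on success and None on failure.
    """
    rows = run_query(sql)
    # Compare against None explicitly: "if not rows" would also reject (),
    # which is what INSERT/REPLACE statements legitimately return.
    if rows is None:
        raise DBQueryError("query failed: %s" % sql)
    return rows
```

The point of the sketch is the explicit `is None` test; a truthiness test would treat the empty tuple as a failure.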
-
Kirk Webb authored
* incompatible option handling and use removed from general-purpose libs
* Global PLC mutex implemented, but currently disabled
* plabmonitord parallelization cut in half (for now)

I'm still very frustrated with option handling/passing. Needs more thought, but the primary issue is that there really isn't a global variable space in python (global to file, yes, but not global to interpreter invocation). I've learned that __builtin__ might work for this, but it seems hacky.
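As a sketch of the `__builtin__` idea mentioned above: a name stored in the builtins namespace is visible from every module, because name lookup falls back to it after the module's own globals. The module is called `builtins` in Python 3 (`__builtin__` in Python 2). `PLAB_DEBUG` and the in-memory helper module are hypothetical names used only for this example.

```python
import builtins
import types

# Build a throwaway module in memory so the example is self-contained.
# Its function refers to PLAB_DEBUG, which the module itself never defines.
helper = types.ModuleType("hypothetical_helper")
exec("def debug_enabled():\n    return PLAB_DEBUG\n", helper.__dict__)

# Top-level script: publish the option once, interpreter-wide.
builtins.PLAB_DEBUG = True

# The lookup misses the helper module's globals and falls through to builtins.
visible_in_helper = helper.debug_enabled()

# Clean up so the name does not leak into unrelated code.
del builtins.PLAB_DEBUG
```

A dedicated options module that every file imports is the less surprising alternative, since it keeps shared names out of the builtin namespace.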
-
- 18 Mar, 2004 1 commit
-
-
Jay Lepreau authored
-
- 17 Mar, 2004 2 commits
-
-
Kirk Webb authored
* Added comments
* Added Emulab copyright
* made mod_PLC handle the "not assigned" error case in freeNode() - optimization and less log clutter.
* bug fix in plabmonitord (ISUP detection)
-
Kirk Webb authored
* Changed the way options are parsed in the python scripts so that modules can easily add and use their own options independent of top-level scripts.
* Added --noIS and --pollNodes module options.
* Added batch option to vnode_setup (degree of parallelization) - defaults to 10
* Major updates to plabmonitord - testing is now batched, currently 40 at a time
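One way a module can own its options independently of the top-level script is to parse only what it recognizes and pass the rest along. The sketch below uses the standard `argparse` module, which is not necessarily what these scripts use. Treating `--noIS` and `--pollNodes` as boolean flags is an assumption, and `--batch` is a hypothetical top-level option.

```python
import argparse


def parse_module_options(argv):
    """Pick out this module's options and hand back everything else.

    Treating --noIS and --pollNodes as boolean flags is an assumption;
    the commit message does not say whether they take arguments.
    """
    parser = argparse.ArgumentParser(add_help=False)
    parser.add_argument("--noIS", action="store_true")
    parser.add_argument("--pollNodes", action="store_true")
    # parse_known_args() returns unrecognized arguments instead of failing,
    # so the top-level script can parse its own options afterwards.
    opts, remaining = parser.parse_known_args(argv)
    return opts, remaining


# "--batch 10" stands in for an option owned by the top-level script.
opts, remaining = parse_module_options(["--noIS", "--batch", "10"])
```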
-
- 03 Mar, 2004 1 commit
-
-
Kirk Webb authored
* implemented PLC slice renewal
* restructured daemon code/startup
  - removed getfree daemon (replaced by plabdiscover; run from cron)
  - moved generic daemonizing code into libtestbed (class)
  - created plabrenewd - small script that utilizes daemonizing class
  - removed plabdaemon file.
  - updated bossnode startup scripts
* changed slice prefix - PLC denies permission w/ anything other than "utah"
* Minor semantic changes to module API to be more consistent with other parts.
* Some bug fixes.
-
- 02 Mar, 2004 1 commit
-
-
Kirk Webb authored
* removed unused and not generally useful ping checking
* reorganized node discovery and added node info updating
  - e.g., update IP, SITE, or HOSTNAME when they have changed
  - no longer part of the backend module as this is independent of which backend is used; may modularize it due to plab's new "trumpet" service, which is basically its node DB available via a decentralized transport/API.
* introduced new method of getting node info - use plab sites.xml file
* various other cleanups.
-
- 26 Feb, 2004 1 commit
-
-
Kirk Webb authored
-
- 25 Feb, 2004 2 commits
-
-
Kirk Webb authored
-
Kirk Webb authored
I'll come along for a closer cut in the future.

* Modularized the plab communications 'adaptor' interface and moved the dslice- and PLC-specific code into their own modules.
* Wrote an API definition README
* Separated out generic routines from libplab into their own library modules (libtestbed.py and libdb.py)

Functionally, not much has changed - this was just a massive re-org with some other cleanup. Should be much easier to code up new PLAB interfaces as the plab folks flail around in their attempt to standardize on something.

XXX: may want to re-think where the generic library modules should go. If more python code enters Elab, we'll probably want to move 'em to more standard locations.

This isn't the end of the cleanup - I would eventually like to go back and rethink the class structures, beef up the comments, and extend the API.
-
- 10 Jan, 2004 1 commit
-
-
Kirk Webb authored
-
- 06 Jan, 2004 1 commit
-
-
Kirk Webb authored
-
- 03 Jan, 2004 2 commits
- 31 Dec, 2003 1 commit
-
-
Kirk Webb authored
mechanism in the service sliver).
-
- 30 Dec, 2003 3 commits
-
-
Kirk Webb authored
vnode_setup for the timeout on waiting for child processes. I've set it to 10 minutes since all ancillary setup programs have their own time bounds (I think - the plab ones do anyway).

The function of plabmonitord has changed slightly. Instead of setting up and tearing down vnodes, its job is just to set up the emulab management sliver on plab nodes in hwdown. Once the vserver comes up and reports isalive, it moves the node out of hwdown. Currently, it first tries to tear down the vserver before reinstantiating it. In the future, we could get fancier and try interacting with the service sliver directly before simply tearing it down.

All new plab nodes now start life in hwdown, and must be summoned forth into production by plabmonitord.

This commit does NOT include support for the node-local httpd. That will come soon.
-
Mike Hibler authored
-
Kirk Webb authored
central. Also, back out Mike's hack, and use the ALLOWED_LIST feature Austin originally had to limit node scope.
-
- 29 Dec, 2003 1 commit
-
-
Mike Hibler authored
-
- 23 Dec, 2003 2 commits
-
-
Kirk Webb authored
-
Mike Hibler authored
than contacting the dslice agent which no longer exists.
-
- 15 Dec, 2003 1 commit
-
-
Kirk Webb authored
lease for a slice before we've _successfully_ set one up.
-
- 12 Dec, 2003 1 commit
-
-
Kirk Webb authored
us correlate better with log entries on plab nodes.
-
- 09 Dec, 2003 1 commit
-
-
Kirk Webb authored
A couple of things:

1) Added PLAB_SLICEPREFIX so that we can separately instantiate plab slices from mini, or elsewhere. On the mainbed, it's set to "emulab". On mini, it's set to "emulab_mini". The "emulab" part has to exist first so that the new plab node manager doesn't nuke our dslice slivers.

2) Fixed up Plab.getFree() so that it doesn't try to add the same IP twice to the DB if a new one is found and is listed more than once.
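The duplicate handling in item 2 can be sketched as below. This is only an illustration of the idea, not the code in Plab.getFree(); the function name and the list-of-strings inputs are assumptions.

```python
def new_unique_ips(reported_ips, known_ips):
    """Return IPs that are not yet known, each at most once, in report order.

    reported_ips: IPs as listed by the node source (may contain repeats).
    known_ips: IPs already recorded in the DB (hypothetical input).
    """
    seen = set(known_ips)
    fresh = []
    for ip in reported_ips:
        if ip not in seen:
            # Remember the IP right away so a later repeat is skipped.
            seen.add(ip)
            fresh.append(ip)
    return fresh
```

Only the returned list would then be inserted into the DB, so a repeated listing cannot trigger a second insert for the same IP.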
-
- 08 Dec, 2003 1 commit
-
-
Leigh B. Stoller authored
node's primary virtual type.
-
- 02 Dec, 2003 2 commits
- 01 Dec, 2003 2 commits
- 17 Nov, 2003 1 commit
-
-
Kirk Webb authored
-
- 05 Nov, 2003 1 commit
-
-
Kirk Webb authored
-
- 04 Nov, 2003 1 commit
-
-
Kirk Webb authored
when trying to renew any node. Needs further review later.
-
- 01 Nov, 2003 1 commit
-
-
Kirk Webb authored
1) properly disable alarm before exiting ForkCmd
   - this was causing SIGALRM to get sent when it shouldn't have, and probably caused the renewal failures.
   - was introduced accidentally yesterday when I unwittingly committed some beta libplab code along with the rootball version string fix.

2) Changed semantics of the renew daemon such that it only sends a single message for each invocation of the renewal loop - summarizes the ones that failed.

The rest of the code I committed accidentally yesterday seems to be working just fine. It all looks sane on perusal.
-
- 31 Oct, 2003 2 commits
- 24 Oct, 2003 1 commit
-
-
Robert Ricci authored
which had been hanging around in my home directory for a while.

There are a few new things in plab/etc/netbed_files that set up a directory of the same name in @prefix@. This will get rsync'ed with netbed_files/ on each planetlab node.

log/ - just needs to exist for the httpd server
sbin/ - contains thttpd, and scripts to manipulate it
www/ - the directory served by thttpd. Contains symlinks to the 'real' location of the rootballs (etc/plab)

I've committed a binary of thttpd - this is simply because it'd be a PITA to compile a Linux binary for every devel tree, etc.

PLAB_ROOTBALL has now become a configure option. The idea is that we will keep the latest version number in configure.in, but you can override it in your defs file. This way, we don't have to update every defs file when there's a new version, but people can still play around with their own version if they want.

The two scripts that interact with the plab nodes skip ones that are down. They ssh in as 'utah1', meaning that one of us who has access to that account needs to run them, so that they can have access to our keys. We can put boss's public key (or something) out there to remove this requirement.

plabdist runs an rsync between @prefix@/etc/plab/netbed_files and a file of the same name on the planetlab nodes. It's intended to be run from the main install tree - the local rsync directory is not normally set up in devel trees. It runs in parallel, but is limited to 4 to avoid beating up boss too much. Takes about 1:40 with the current set of plab nodes (took > 10 minutes doing one at a time).

plabhttpd (re)starts the mini web server on all plab nodes
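The bounded parallelism described for plabdist can be sketched with a fixed-size worker pool. This is an illustration only; plabdist itself is not shown in this log, and `distribute` and `push_to_node` are hypothetical names. The per-node action is passed in as a callable so the sketch does not assume any particular rsync command line.

```python
from concurrent.futures import ThreadPoolExecutor


def distribute(nodes, push_to_node, max_parallel=4):
    """Call push_to_node(node) for every node, at most max_parallel at once.

    nodes must be a list; push_to_node is a hypothetical callable that
    syncs one node and returns its exit status.
    """
    with ThreadPoolExecutor(max_workers=max_parallel) as pool:
        # map() keeps results in the same order as the input nodes.
        results = list(pool.map(push_to_node, nodes))
    return dict(zip(nodes, results))
```

The pool size caps the load on the machine doing the pushing, while still letting slow nodes overlap instead of being waited on one at a time.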
-
- 23 Oct, 2003 1 commit
-
-
Kirk Webb authored
Well, here it is: The checkin implementing robust recovery/retry and asynchronous safe termination in plab allocation/deallocation/setup.

Here are some of the more prominent changes/additions:

* Bounded plab agent communication
  Scripts should never hang waiting for plab xmlrpc commands to complete; they have their own internal timeouts. Node.create() in libplab is an exception, but is always run under a timeout constraint in vnode_setup and can be changed easily if the need arises.
* Wrote functions in libplab to do the retry/recovery/timeout of remote command execution.
* Wrapped critical sections with a signal watcher.
* Added code to handle various error conditions properly
* Added a libtestbed function, TBForkCmd, which runs a given program in a child process, and can optionally catch incoming SIGTERMs and terminate the child (then exit itself).
* Fixed up vnode_setup to batch the 'plabnode free' operation along with a few other cleanups. This should alleviate Jay's concern about how long it used to take to teardown a plab expt.
* Whacked plabmonitord into better shape; fixed a couple bugs, taught it how to daemonize, and implemented a priority list for testing broken plab nodes. This list causes new (as yet unseen) nodes to be tried first over ones that have been tested already.
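The behavior described for TBForkCmd (run a program in a child, optionally forward SIGTERM to it and then exit) could look roughly like the sketch below. It is written against Python 3's `subprocess` module and is not the libtestbed implementation; `fork_cmd`, its signature, and the exit status of 1 are assumptions.

```python
import signal
import subprocess
import sys


def fork_cmd(argv, catch_sigterm=False):
    """Run argv in a child process and return its exit status.

    With catch_sigterm=True, a SIGTERM received while waiting is passed
    on to the child, and this process then exits as well.
    """
    child = subprocess.Popen(argv)
    old_handler = None
    if catch_sigterm:
        def _on_term(signum, frame):
            # Pass the termination on to the child, reap it, then exit.
            child.terminate()
            child.wait()
            sys.exit(1)
        old_handler = signal.signal(signal.SIGTERM, _on_term)
    try:
        return child.wait()
    finally:
        # Restore the previous handler if there was one to restore.
        if old_handler is not None:
            signal.signal(signal.SIGTERM, old_handler)
```

Note the small window between starting the child and installing the handler; a production version would need to close that gap.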
-
- 20 Oct, 2003 1 commit
-
-
Leigh B. Stoller authored
-
- 15 Oct, 2003 1 commit
-
-
Kirk Webb authored
-
- 14 Oct, 2003 1 commit
-
-
Kirk Webb authored
Update to libplab.plab.renew:

* Make renewal robust against various kinds of failures.

These changes will augment my larger set of libplab and plab* updates/fixes coming soon to an Emulab near you.
-