- 09 Dec, 2005 3 commits
-
-
Mike Hibler authored
-
Mike Hibler authored
-
Mike Hibler authored
-
- 08 Dec, 2005 1 commit
-
-
Mike Hibler authored
-
- 07 Dec, 2005 1 commit
-
-
Mike Hibler authored
Don't whine about missing swapinfo for nodes in experiments that are not saving disk state.
-
- 06 Dec, 2005 7 commits
-
-
Mike Hibler authored
Exec summary: after this checkin, the infrastructure exists (once enabled) to create swapout-time "delta" images for all machines in experiments. There is only a single, cumulative swap image per node (i.e., all diffs are from the base image, not from the previous swap). What doesn't yet exist is the mechanism for reloading the delta at swapin time. That is Phase III.

The nitty-gritty:

1. Keep disk image signature files for all nodes in an experiment.

   New fields in the DB track, for each disk partition, what image the partition was loaded from. This enables us at swapin or os_load time to create signature files in /proj/<pid>/exp/<eid>/swapinfo for the current contents of a node disk/partition. All nodes with the same image loaded will share (via symlink) the same signature file. TODO: signature files that are no longer referenced should be removed.

   Signature info is only collected in the swapinfo directory if the experiment is set to have disk state saving enabled (see #5 below). Info consists of the <vname>.sig file, which is the file created by imagehash, and <vname>.part, which says what the root disk is for the node and whether to look at the whole disk or just a single partition when crafting the delta image.

2. Swapout-time hook for creating swapout image.

   If the experiment is marked as allowing disk state saving, tbswap will arrange to run and then monitor the create-swapimage command on each node. This script will run the modified version of imagezip which uses the signature file to create a delta image. The command to run and maximum timeout are specified via sitevars (previously checked in).

   Note that the tbswap script currently has special knowledge of /usr/local/bin/create-swapimage as a swapout time script. If the swap/swapout_command sitevar is set to that, Magic Stuff shall occur (i.e., it will monitor the command and make periodic reports of progress). The sitevars are a total hack and will disappear at some point.

3. Client-side script for creating swapout image.

   os/create-swapimage, very similar to create-image. Uses the info stashed in /proj/..blahblah../swapinfo to create a delta image.

   XXX fer now hack: the script first looks in /proj/<pid>/bin for an imagezip binary to use. Failing that, it uses the one in the MFS. This allows for easier development of the imagezip changes (i.e., don't have to update the MFS every time).

4. Auto creation of signature files for new images.

   The create_image script (the one that runs on boss when creating images for users) has been modified to automatically create a signature via imagehash. The .sig file winds up in /usr/testbed/images/sigs or in /proj/<pid>/images/sigs. From there it will be copied at swapin/os_load time to the per-expt swapinfo directory for any node that uses the image.

   The process for creating standard system images (aka, "Mike") has not yet been modified. When the image creation/installation procedure is formalized into a script, this will be done.

5. Web changes to set/clear saving of disk state at swapout time.

   Add a checkbox to the experiment create page to allow setting "save swap state". Also added to the experiment modify page, but currently "if (0)"ed out as it will need some additional support. The showstuff page will show it.

   Taking a page from Leigh's hack book, if EXPOSESTATESAVE in defs.php3 is set to zero (as it is now), then the checkbox doesn't appear in the create experiment page except for STUDLY users.
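As a rough illustration of the signature-driven delta described in this message, here is a minimal Python sketch. It is not imagehash or imagezip: the fixed block size, the use of SHA-1, and the function names `make_signature` and `make_delta` are all assumptions made for the example. It only shows the idea that a delta is always computed against the base image's signature, so the result is cumulative rather than relative to the previous swap.

```python
import hashlib
from typing import Dict, List

BLOCK_SIZE = 4096  # hypothetical block size, chosen only for this sketch


def make_signature(image: bytes, block_size: int = BLOCK_SIZE) -> List[str]:
    """Hash each fixed-size block of the base image (stand-in for a .sig file)."""
    return [
        hashlib.sha1(image[off:off + block_size]).hexdigest()
        for off in range(0, len(image), block_size)
    ]


def make_delta(disk: bytes, signature: List[str],
               block_size: int = BLOCK_SIZE) -> Dict[int, bytes]:
    """Return {block_index: data} for every block that differs from the base.

    The comparison is always against the base signature, so repeated calls
    yield a cumulative delta, not a chain of incremental ones.
    """
    delta: Dict[int, bytes] = {}
    for idx, off in enumerate(range(0, len(disk), block_size)):
        block = disk[off:off + block_size]
        changed = (idx >= len(signature)
                   or hashlib.sha1(block).hexdigest() != signature[idx])
        if changed:
            delta[idx] = block
    return delta
```

For example, if only the second block of a three-block disk is modified after loading, `make_delta` returns a single entry keyed by block index 1.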
-
Timothy Stack authored
underscores don't resolve. Also, fix a couple minor bugs in the filehandle resolver and add some stats for lookups.
-
Mike Hibler authored
Common mistake: forget the -i before the imagename, e.g., "os_load FBSD54-STD pcNN", which results in pcNN getting loaded with the default image. So if the first arg fails as a node, but is an image ID, assume they have made this mistake and stop.
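The check described above can be sketched as follows. This is a hypothetical Python illustration, not the os_load code: the function name `classify_first_arg` and the sets of known node and image names are assumptions standing in for whatever lookups the real script performs.

```python
from typing import Set


def classify_first_arg(arg: str, known_nodes: Set[str],
                       known_images: Set[str]) -> str:
    """Decide what to do with the first positional argument.

    Returns "ok" if it names a node, "missing -i" if it is not a node but
    does match an image ID (the likely forgotten -i case), and
    "unknown node" otherwise.
    """
    if arg in known_nodes:
        return "ok"
    if arg in known_images:
        # Probably "os_load IMAGE node" without -i: stop rather than
        # silently loading the default image.
        return "missing -i"
    return "unknown node"
```

With nodes `{"pc1"}` and images `{"FBSD54-STD"}`, the argument `FBSD54-STD` is classified as `"missing -i"`, so the caller can stop before reloading anything.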
-
Leigh B. Stoller authored
linktest at level 3 if a mere user. Studly users still have control though. Note that errors are no longer mailed to user by linktest_control. Also moved duplicated code to get dbuid (and email address) to top of file.
-
Mike Hibler authored
function.
-
Leigh B. Stoller authored
-
Leigh B. Stoller authored
-
- 05 Dec, 2005 2 commits
-
-
Timothy Stack authored
-
Timothy Stack authored
the swap log otherwise.
-
- 02 Dec, 2005 1 commit
-
-
Mike Hibler authored
the port and retry with a different address/port if so. Also, check for immediate failure of the TBBackground call so that we can return an error if it fails (most likely because someone forgot to do a post-install after boss-install and it cannot write the indicated logfile).
-
- 01 Dec, 2005 4 commits
-
-
Timothy Stack authored
the sql cluestick, so the queries shouldn't be as bad as before.
-
Leigh B. Stoller authored
ltmap file so that it does the proper calculation of delay, bandwidth, and loss, as it is done in assign_wrapper. More specifically, the delay/bandwidth/loss between two nodes in a lan is a function of nodea delay/bandwidth/loss, and nodeb rdelay/rbandwidth/rloss, where the "r" values are for the "from the lan to the node" direction. These might very well be different than the other values for asymmetric links.
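To make the direction-dependence concrete, here is a small Python sketch. The message only says the result is "a function of" the two sets of values; the specific combination used below (delays add, bandwidth is the minimum, loss rates compound) is an assumption for illustration and may not match what assign_wrapper actually computes. The `LanMember` class and `path_characteristics` function are hypothetical names.

```python
from dataclasses import dataclass
from typing import Tuple


@dataclass
class LanMember:
    # Node-to-lan direction.
    delay: float      # ms
    bandwidth: float  # kbps
    loss: float       # packet loss rate, 0.0 to 1.0
    # Lan-to-node direction (the "r" values).
    rdelay: float
    rbandwidth: float
    rloss: float


def path_characteristics(a: LanMember, b: LanMember) -> Tuple[float, float, float]:
    """Characteristics of traffic from node a, through the lan, to node b.

    Uses a's node-to-lan values and b's lan-to-node ("r") values.
    ASSUMED combination: delay sums, bandwidth is the bottleneck,
    loss compounds as independent events.
    """
    delay = a.delay + b.rdelay
    bandwidth = min(a.bandwidth, b.rbandwidth)
    loss = 1.0 - (1.0 - a.loss) * (1.0 - b.rloss)
    return delay, bandwidth, loss
```

Because `path_characteristics(a, b)` and `path_characteristics(b, a)` read different fields, the two directions can differ whenever a member's link is asymmetric.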
-
Leigh B. Stoller authored
rebuilt. Seems to be a moving target ...
-
Timothy Stack authored
-
- 30 Nov, 2005 4 commits
-
-
Timothy Stack authored
-
Leigh B. Stoller authored
point.
-
Timothy Stack authored
-
Leigh B. Stoller authored
going to be a lot easier!
-
- 29 Nov, 2005 1 commit
-
-
Timothy Stack authored
-
- 28 Nov, 2005 1 commit
-
-
Timothy Stack authored
-
- 22 Nov, 2005 1 commit
-
-
Mike Hibler authored
(e.g., swapin was cancelled before any nodes were allocated)
-
- 19 Nov, 2005 1 commit
-
-
Robert Ricci authored
before deleting it.
-
- 18 Nov, 2005 2 commits
-
-
Robert Ricci authored
Convert most SNMP interaction to use the snmpit*() library, so that they get support for retrying failures, etc. Add new library calls for wrapping bulkwalk() - so now, we will retry those on error as well. Before, we had the bad behavior that many functions, like listVlans(), would just see empty lists instead of errors.

When making a new Cisco object, we now test network connectivity right away, by fetching an OID that should exist on all SNMP devices. Before, we wouldn't find out we couldn't contact the switch until we actually did something on it.

Also, make VLAN number choosing go a bit faster by converting it to bulkwalk() (using the new library function) so we can grab all VLAN numbers at once.
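The retry-and-report behavior described here can be sketched generically. The following Python example is not the snmpit library (which is not shown in this log); `SnmpError`, `with_retries`, and the retry count are hypothetical. The point it illustrates is that a failed walk is retried and, if it keeps failing, surfaces as an error instead of looking like an empty result.

```python
import time
from typing import Callable, TypeVar

T = TypeVar("T")


class SnmpError(Exception):
    """Hypothetical error type standing in for a failed SNMP operation."""


def with_retries(operation: Callable[[], T], retries: int = 3,
                 delay: float = 0.0) -> T:
    """Run operation(), retrying on SnmpError up to `retries` attempts.

    Raises SnmpError after the final failure, so callers never mistake
    a communication failure for an empty list.
    """
    last_error = None
    for attempt in range(retries):
        try:
            return operation()
        except SnmpError as err:
            last_error = err
            if attempt + 1 < retries:
                time.sleep(delay)
    raise SnmpError(f"operation failed after {retries} attempts") from last_error
```

A caller would wrap each walk, for example `with_retries(lambda: walk_vlans(switch))`, where `walk_vlans` is whatever function performs the actual request.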
-
Robert Ricci authored
we're going to remove. We used to do all VLANs in one lock for performance reasons - however, I'm discovering that the lock can get held for such a long time when many VLANs are being deleted that other VLAN operations, such as listing VLANs, can fail. And, it's not actually that much slower to grab a new lock each time.
-
- 17 Nov, 2005 4 commits
-
-
Mike Hibler authored
Produces a different message in the web page. Also fix up a couple of minor firewalled elabinelab issues.
-
Timothy Stack authored
run before the main timeline is started. Also changed the scheduler to load events before adding the TIME START events so we can add setup events before the main timeline.
-
Mike Hibler authored
* Add libadminmfs.pm with routines for entering/exiting, and executing commands in, the admin MFS. Node admin and firewall swapout (see below) now use this, the image creation process does not yet.

* Add swapout time hooks for running an admin mode process, likely to be used to collect swapout time state. Currently controlled globally by two new sitevars.

* Modified node_admin to use the library and added a "-c <command>" option to have nodes go into admin mode and run a command. I don't really expect this to be useful, it was just a testing vehicle for the library.

2. Improved the swapout process for firewalled experiments.

   Largely just generalized what we already did for panic'ed experiments. At swapout, firewalled nodes are:
   - powered off
   - set to boot into admin mode and run a disk zapper
   - powered on

   The swapout process then waits for all nodes to successfully complete disk zapage, at which point the nodes are nfree'ed as usual. Any failure of the above process marks the experiment as panic'ed (to ensure that we are involved in cleanup) and sends mail to testbed-ops describing the state of the nodes.

3. Added the aforementioned disk zapper, a little C program in the MFS which zeroes out the MBR and partition boot blocks (but not the MBR partition table or FS superblocks).

   This is added insurance that if a node somehow gets diverted after being nfree'd but before getting the disk reloaded (e.g., goes to hwdown), we cannot accidentally boot from the disk. This program gets installed in the admin MFS.

4. Related to firewalls, modified swapin to use the new documented "snmpit -N" to get the firewall VLAN number rather than parsing the output that was a side-effect of VLAN creation.
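The disk zapper itself is a C program that is not shown in this log. As a rough sketch of the idea, the Python below operates on an in-memory disk image and relies only on the standard MBR layout: 446 bytes of boot code, then four 16-byte partition entries, then the 2-byte signature. Zeroing exactly one 512-byte sector at the start of each partition is an assumption made for the example, as is the function name `zap_boot_blocks`.

```python
import struct

SECTOR = 512
BOOT_CODE_LEN = 446        # MBR boot code occupies bytes 0..445
PART_TABLE_OFF = 446       # four 16-byte entries follow the boot code
PART_ENTRY_LEN = 16


def zap_boot_blocks(disk: bytearray) -> None:
    """Zero the MBR boot code and each partition's first sector in place.

    The partition table (bytes 446..509) and the 0x55AA signature are left
    alone, so the disk layout stays readable but nothing is bootable.
    """
    # Read partition start sectors before modifying anything.
    starts = []
    for i in range(4):
        off = PART_TABLE_OFF + i * PART_ENTRY_LEN
        ptype = disk[off + 4]                                 # partition type
        start_lba = struct.unpack_from("<I", disk, off + 8)[0]
        if ptype != 0 and start_lba != 0:
            starts.append(start_lba)

    disk[0:BOOT_CODE_LEN] = bytes(BOOT_CODE_LEN)

    for lba in starts:
        begin = lba * SECTOR
        # ASSUMPTION: one sector per partition is enough for this sketch.
        if begin + SECTOR <= len(disk):
            disk[begin:begin + SECTOR] = bytes(SECTOR)
```

Reading the partition table before zeroing anything keeps the function safe even if a partition entry happens to point back at sector 0.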
-
Timothy Stack authored
-
- 16 Nov, 2005 1 commit
-
-
Timothy Stack authored
making into the eventlist.
-
- 15 Nov, 2005 1 commit
-
-
Leigh B. Stoller authored
but probably going to mangle everything again when I get back.
-
- 14 Nov, 2005 1 commit
-
-
Leigh B. Stoller authored
-
- 11 Nov, 2005 1 commit
-
-
Leigh B. Stoller authored
start,swap,swapmod,terminate.
-
- 04 Nov, 2005 3 commits
-
-
Kevin Atkinson authored
Move libtblog.sql from tbsetup/ to sql/ directory.
-
Kevin Atkinson authored
Added error logging API. See tbsetup/libtblog.pm.in and tbsetup/libtblog.sql.
-
Kevin Atkinson authored
-