- 06 Dec, 2005 1 commit
-
-
Mike Hibler authored
Exec summary: after this checkin, the infrastructure exists (once enabled) to create swapout-time "delta" images for all machines in experiments. There is only a single, cumulative swap image per node (i.e., all diffs are from the base image, not from the previous swap). What doesn't yet exist is the mechanism for reloading the delta at swapin time. That is Phase III.

The nitty-gritty:

1. Keep disk image signature files for all nodes in an experiment.

New fields in the DB to track, for each disk partition, what image the partition was loaded from. This enables us at swapin or os_load time to create signature files in /proj/<pid>/exp/<eid>/swapinfo for the current contents of a node disk/partition. All nodes with the same image loaded will share (via symlink) the same signature file. TODO: signature files that are no longer referenced should be removed.

Signature info is only collected in the swapinfo directory if the experiment is set to have disk state saving enabled (see #5 below). Info consists of the <vname>.sig file, which is the file created by imagehash, and <vname>.part, which says what the root disk is for the node and whether to look at the whole disk or just a single partition when crafting the delta image.

2. Swapout-time hook for creating swapout image.

If the experiment is marked as allowing disk state saving, tbswap will arrange to run and then monitor the create-swapimage command on each node. This script will run the modified version of imagezip, which uses the signature file to create a delta image. The command to run and maximum timeout are specified via sitevars (previously checked in).

Note that the tbswap script currently has special knowledge of /usr/local/bin/create-swapimage as a swapout-time script. If the swap/swapout_command sitevar is set to that, Magic Stuff shall occur (i.e., it will monitor the command and make periodic reports of progress). The sitevars are a total hack and will disappear at some point.

3. Client-side script for creating swapout image.

os/create-swapimage, very similar to create-image. Uses the info stashed in /proj/..blahblah../swapinfo to create a delta image. XXX fer now hack: the script first looks in /proj/<pid>/bin for an imagezip binary to use. Failing that, it uses the one in the MFS. This allows for easier development of the imagezip changes (i.e., don't have to update the MFS every time).

4. Auto creation of signature files for new images.

The create_image script (the one that runs on boss when creating images for users) has been modified to automatically create a signature via imagehash. The .sig file winds up in /usr/testbed/images/sigs or in /proj/<pid>/images/sigs. From there it will be copied at swapin/os_load time to the per-expt swapinfo directory for any node that uses the image.

The process for creating standard system images (aka, "Mike") has not yet been modified. When the image creation/installation procedure is formalized into a script, this will be done.

5. Web changes to set/clear saving of disk state at swapout time.

Add a checkbox to the experiment create page to allow setting "save swap state". Also added to the experiment modify page, but currently "if (0)"ed out as it will need some additional support. The showstuff page will show it. Taking a page from Leigh's hack book, if EXPOSESTATESAVE in defs.php3 is set to zero (as it is now), then the checkbox doesn't appear in the create experiment page except for STUDLY users.
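The signature-driven delta described above can be pictured with a small sketch. This is not the imagehash or imagezip implementation: the fixed chunk size, the use of SHA-1, and the function names make_signature and make_delta are all assumptions made for illustration. It only shows the idea of hashing the base image in chunks and then saving just the chunks of the current disk whose hash no longer matches. Because the comparison is always against the base signature, the result is cumulative, as the summary says.

```python
import hashlib
from typing import Dict, List

# Hypothetical chunk size; the real signature granularity is not stated here.
CHUNK_SIZE = 64 * 1024


def make_signature(data: bytes, chunk_size: int = CHUNK_SIZE) -> List[str]:
    """Return one SHA-1 hex digest per fixed-size chunk of the base image."""
    return [
        hashlib.sha1(data[off:off + chunk_size]).hexdigest()
        for off in range(0, len(data), chunk_size)
    ]


def make_delta(current: bytes, signature: List[str],
               chunk_size: int = CHUNK_SIZE) -> Dict[int, bytes]:
    """Return {chunk_index: chunk_bytes} for chunks that differ from the base.

    Chunks beyond the end of the signature are always treated as changed.
    """
    delta: Dict[int, bytes] = {}
    for index, off in enumerate(range(0, len(current), chunk_size)):
        chunk = current[off:off + chunk_size]
        digest = hashlib.sha1(chunk).hexdigest()
        if index >= len(signature) or digest != signature[index]:
            delta[index] = chunk
    return delta


if __name__ == "__main__":
    base = b"a" * 8 + b"b" * 8 + b"c" * 8
    sig = make_signature(base, chunk_size=8)
    disk_now = b"a" * 8 + b"X" * 8 + b"c" * 8
    # Only the middle chunk was modified, so only it lands in the delta.
    print(sorted(make_delta(disk_now, sig, chunk_size=8)))
```

Nodes loaded from the same image can share one signature because the signature depends only on the base image, not on the node.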
-
- 31 May, 2005 1 commit
-
-
Leigh B. Stoller authored
I fixed a couple of minor problems, but mostly this worked fine. Note that I have tested this with the installed perl, *NOT* perl 5.8. I am just making sure this stuff gets committed before too much more bitrot sets in.
-
- 18 May, 2005 1 commit
-
-
Leigh B. Stoller authored
approach is that the person is already overquota and that the initial mkexpdir script is what fails. Eventually, might need to be more proactive by checking to see if there is some minimal amount of room before going overquota.
-
- 10 Dec, 2003 1 commit
-
-
Leigh B. Stoller authored
-
- 19 Nov, 2003 1 commit
-
-
Leigh B. Stoller authored
solution, but no time today.
-
- 17 Nov, 2003 1 commit
-
-
Leigh B. Stoller authored
state machine (state). All of the stuff that was previously handled by using batchstate is now embedded into the one state machine. Of course, these mostly overlapped, so it's not that much of a change, except that we also redid the machine, adding more states (for example, modify phases are now explicit). To get a picture of the actual state machine, on boss:

    stategraph -o newstates EXPTSTATE
    gv newstates.ps

Things to note:

* The "batchstate" slot of the experiments table is now used solely to provide a lock for the batch daemon. A secondary change will be to change the slot name to something more appropriate, but it can happen anytime after this new stuff is installed.

* I have left expt_locked for now, but another later change will be to remove expt_locked, and change it to active_busy or some such new state name in the state machine. I have removed most uses of expt_locked, except those that were necessary until there is a new state to replace it.

* These new changes are an implementation of the new state machine, but I have not done anything fancy. Most of the code is the same as it was before.

* I suspect that there are races with the batch daemon now, but they are going to be rare, and the end result is probably that a cancelation is delayed a little bit.
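To illustrate what folding everything into one state machine with explicit modify phases could look like, here is a minimal sketch. The state names and transitions below are hypothetical and are not taken from the testbed code; the real graph is whatever stategraph draws for EXPTSTATE. The point is only that every legal move, including the modify phases, lives in a single table that is checked on each transition.

```python
# Hypothetical state names and transitions, for illustration only.
TRANSITIONS = {
    "swapped": {"activating"},
    "activating": {"active", "swapped"},
    "active": {"modify_parse", "swapping"},
    "modify_parse": {"modify_reswap", "active"},
    "modify_reswap": {"active"},
    "swapping": {"swapped"},
}


class IllegalTransition(Exception):
    """Raised when a requested state change is not in the table."""


def advance(current: str, target: str) -> str:
    """Return target if current -> target is allowed, else raise."""
    if target not in TRANSITIONS.get(current, set()):
        raise IllegalTransition(f"{current} -> {target}")
    return target


if __name__ == "__main__":
    state = "active"
    for step in ("modify_parse", "modify_reswap", "active"):
        state = advance(state, step)
    print(state)
```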
-
- 27 Mar, 2003 1 commit
-
-
Leigh B. Stoller authored
in /proj/$pid/exp/$eid, since this would seem to violate group privacy (/proj exported to all experiments). Minor changes to owner and mode to allow non-group members to swap/terminate without getting a copy error (of logs).
-
- 16 Sep, 2002 1 commit
-
-
Leigh B. Stoller authored
experiment. Here is mail to tbops:

* Moved the working directory for experiment setup/swap/end to a new directory located on boss instead of over NFS to /proj/$pid/$eid. This new location is /usr/testbed/expwork/$pid/$eid.

* Changed the name of the directories we create in /usr/testbed/expinfo to $pid-$eid.$index where $index is a new autoincrement field in the DB table. I really hated the names that were created before.

* Changed where logs are written from /tmp to the new location in /usr/testbed/expwork/$pid/$eid.

Okay, why.

* We no longer operate on NFS mounted directories that might hang. It's easier to catch the situation where a copy of the log file over at the end of experiment creation fails because of an NFS problem.

* We no longer have user writable files that are inputs to other parts of the system (like top and ptop files). Not that a user would be bad, but it closes a hole.

* We no longer copy user writable files from /proj to boss where we might fill up an important filesystem because the user put a .ndz file in the working directory. Not that a user would be bad, but it closes a hole.

* It's easier to save all the log files this way, for each swap in and out.

* Removing a directory over NFS is a royal irritant when someone is CD'ed into that directory or looking at a file on the other side (the astute observer will peg this as the reason I went down this idiotic path in the first place!).

* About 6 other reasons that I can no longer remember. Seriously, I really had more reasons I can no longer remember! :-)
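The two directory layouts named in the mail can be written down as a small sketch. The helper names work_dir and info_dir are made up for illustration and are not functions from the testbed code; only the path patterns come from the text above.

```python
import posixpath

# Roots as given in the mail above.
EXPWORK_ROOT = "/usr/testbed/expwork"
EXPINFO_ROOT = "/usr/testbed/expinfo"


def work_dir(pid: str, eid: str) -> str:
    """Working directory on boss: /usr/testbed/expwork/$pid/$eid."""
    return posixpath.join(EXPWORK_ROOT, pid, eid)


def info_dir(pid: str, eid: str, index: int) -> str:
    """Info directory: /usr/testbed/expinfo/$pid-$eid.$index."""
    return posixpath.join(EXPINFO_ROOT, f"{pid}-{eid}.{index}")


if __name__ == "__main__":
    # "testbed", "demo", and 42 are placeholder values.
    print(work_dir("testbed", "demo"))
    print(info_dir("testbed", "demo", 42))
```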
-
- 07 Jul, 2002 1 commit
-
-
Leigh B. Stoller authored
-
- 16 Oct, 2001 1 commit
-
-
Leigh B. Stoller authored
-
- 23 Mar, 2001 1 commit
-
-
Leigh B. Stoller authored
-
- 03 Jan, 2001 1 commit
-
-
Leigh B. Stoller authored
the testbed list.
-
- 06 Dec, 2000 1 commit
-
-
Leigh B. Stoller authored
via the create-os directive in the NS file.

tbsetup/ir/handle_os.tcl - Do a validity check for the image given with set-node-os in the NS file, and propagate that information through to the IR file. Nothing is added to the DB.

tbsetup/mkexpdir - Add a tftpboot to the list of experiment directories. The tftpd daemon now allows kernels from /proj.

tbsetup/os_setup - Very hacky changes to allow for multiboot kernels. Read local images table and cross check against that for nodeos spec. Hardwire in "mb" as a special partition tag that says to not try and do too much with it. This should be changed to a DB check of some kind. On reboot, do not wait for these nodes to come alive since there is no way to determine if an oskit kernel (or any foreign kernel) is running.
-
- 01 Dec, 2000 1 commit
-
-
Leigh B. Stoller authored
-