- 19 Oct, 2005 1 commit
-
-
Leigh B. Stoller authored
-
- 30 Sep, 2005 1 commit
-
-
Leigh B. Stoller authored
into my own project.
-
- 22 Sep, 2005 1 commit
-
-
Mike Hibler authored
Doesn't make any sense, but...
-
- 13 Jul, 2005 1 commit
-
-
Leigh B. Stoller authored
created, swapped in or modified when overquota. Ditto for creating images.
-
- 31 May, 2005 1 commit
-
-
Leigh B. Stoller authored
I fixed a couple of minor problems, but mostly this worked fine. Note that I have tested this with the installed perl, *NOT* perl 5.8. I am just making sure this stuff gets committed before too much more bitrot sets in.
-
- 27 May, 2005 1 commit
-
-
Leigh B. Stoller authored
-
- 18 May, 2005 1 commit
-
-
Leigh B. Stoller authored
approach is that the person is already overquota and that the initial mkexpdir script is what fails. Eventually, might need to be more proactive by checking to see if there is some minimal amount of room before going overquota.
-
- 19 Apr, 2005 1 commit
-
-
Leigh B. Stoller authored
reasonable!
-
- 22 Feb, 2005 1 commit
-
-
Leigh B. Stoller authored
must have pruned this out by accident when cleaning up batchexp a while back (combined it with startexp).
-
- 05 Nov, 2004 1 commit
-
-
Leigh B. Stoller authored
Use this option from boss-install when creating the initial experiments. This option should not be exported via the XMLRPC server.
-
- 30 Aug, 2004 1 commit
-
-
Leigh B. Stoller authored
path to the experiment's logs directory (exp/$eid/logs/linktest.log).
-
- 29 Jul, 2004 1 commit
-
-
Leigh B. Stoller authored
* The first involves swapmod. When a swapmod on an active experiment fails, tbswap will reswap the experiment back to the original configuration. The problem is that it is reswapping it with the *new* virtual state of the experiment in the DB. It is not until later, when control returns to swapexp, that the virtual state is restored. This is plainly wrong, and in fact was causing the event scheduler grief because it was starting up and reading the virtual topo, which was different, wrong, and about to be blown away. I reorganized the modify section of swapexp so that virtual state is restored only when it's a swapmod on a swapped experiment. On an active experiment, I moved that code down into tbswap, which now does all of the virtual and physical state restore before it does the reswap back to the original experiment. Just for kicks, it's also done if tbswap decides to swap the experiment because of a fatal error.

  Cleanups: I changed $NoRecover to $CanRecover. My feeble brain cannot deal with !$NoRecover. I know, two knots make a wright for most people.

  Renderer: I was annoyed by the fact that we rerun the renderer on a failed swapmod. The original reason is that the renderer runs in the background, so vis_nodes cannot be saved with the rest of the virtual state tables because the renderer might still be running when the user fires off the swapmod. Well, the hell with that. We lock the vis_nodes table anyway in the renderer during update, so we are certain to get a consistent snapshot. We store the renderer pid in the experiments table, so if the renderer was running, just fire off another one; mostly this is not going to happen. In addition, tbprerun no longer starts a new renderer when doing the swapmod; I start the new renderer later, after the swapmod succeeds. I might end up tweaking this a bit depending on what people notice as being different.
* Termination changes to batchexp and swapexp: I've rearranged the termination code using an END block so that any uncontrolled exit from either batchexp or swapexp will go through the cleanup code, and hopefully insert a stats record, as well as not leave the experiment in some in-between state. I've set the max DB retry count to zero in both cases, which means infinite retry. I've also added SIGTERM handlers to both so that, again, we can kill a hung batch/swap and have it clean things up more or less. Note that END blocks are not run when an uncaught signal causes the program to die; you have to catch the signal and then die() so that the END block is executed. Eventually, we need to clean up the various libraries so that we do not use DBQueryFatal(), but rather use DBQueryWarn(), and look for failure. Ditto for the event system interface.
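
The termination pattern described above (cleanup that runs on any exit, plus a SIGTERM handler that converts the signal into an ordinary exit) can be sketched in Python. This is only an analog of the Perl END block approach, not the batchexp/swapexp code: `cleanup`, `cleanup_log`, and `install_handlers` are hypothetical names, and the real cleanup would insert a stats record and reset experiment state.

```python
import atexit
import signal
import sys

# Hypothetical record of what the cleanup did, so the effect is observable.
cleanup_log = []


def cleanup():
    """Stand-in for the real cleanup (stats record, experiment state reset)."""
    cleanup_log.append("cleanup ran")


def on_sigterm(signum, frame):
    # Exit handlers are skipped when an unhandled signal kills the process,
    # so turn the signal into a normal exit; that lets atexit handlers run.
    sys.exit(128 + signum)


def install_handlers():
    """Register the cleanup and the SIGTERM handler (call from the main thread)."""
    atexit.register(cleanup)
    signal.signal(signal.SIGTERM, on_sigterm)
```

A script would call `install_handlers()` once at startup; after that, both a normal exit and a `kill -TERM` go through `cleanup()`.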
-
- 28 Jul, 2004 2 commits
-
-
Leigh B. Stoller authored
-
Leigh B. Stoller authored
to be reused if the DB is dropped and recreated, since when that happens, auto_increment history is lost and it will go back to using the latest highest index in the table. Usually not a problem, but since we cross index three other tables using the experiment index, this causes quite a bit of grief. So, my solution is to do my own auto_increment using the experiment_stats table (locked of course), which we never delete entries from without deleting all entries from the other cross referenced tables.

    DBQueryFatal("select MAX(exptidx) from experiment_stats");

I also added a sanity check to make sure the new index is not currently in use in any of the tables. I also cleaned up the error path when something goes wrong.
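
A minimal sketch of the hand-rolled auto_increment described above, using Python's sqlite3 with an in-memory database so that it is self-contained. Only `experiment_stats` and `exptidx` come from the commit message; the names in `CROSS_INDEXED`, the `make_demo_db` helper, and the use of `BEGIN IMMEDIATE` as a stand-in for a table lock are assumptions for illustration.

```python
import sqlite3

# Hypothetical names for tables that are cross-indexed by the experiment index.
CROSS_INDEXED = ("experiments", "experiment_resources")


def make_demo_db():
    """Create an in-memory database with the tables the sketch needs."""
    conn = sqlite3.connect(":memory:", isolation_level=None)  # manual transactions
    conn.execute("CREATE TABLE experiment_stats (exptidx INTEGER PRIMARY KEY)")
    for table in CROSS_INDEXED:
        conn.execute(f"CREATE TABLE {table} (exptidx INTEGER)")
    return conn


def allocate_exptidx(conn):
    """Return MAX(exptidx) + 1 from experiment_stats, refusing an index in use."""
    conn.execute("BEGIN IMMEDIATE")  # stand-in for locking experiment_stats
    try:
        (highest,) = conn.execute(
            "SELECT MAX(exptidx) FROM experiment_stats"
        ).fetchone()
        new_idx = (highest or 0) + 1
        # Sanity check: the new index must not appear in any cross-indexed table.
        for table in CROSS_INDEXED:
            (count,) = conn.execute(
                f"SELECT COUNT(*) FROM {table} WHERE exptidx = ?", (new_idx,)
            ).fetchone()
            if count:
                raise RuntimeError(f"index {new_idx} already in use in {table}")
        conn.execute("INSERT INTO experiment_stats (exptidx) VALUES (?)", (new_idx,))
        conn.execute("COMMIT")
        return new_idx
    except Exception:
        conn.execute("ROLLBACK")
        raise
```

Because rows are never deleted from `experiment_stats` without also clearing the cross-referenced tables, the maximum there keeps growing even if the database's own auto_increment history is lost.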
-
- 15 Jul, 2004 1 commit
-
-
Leigh B. Stoller authored
get sent properly; need to call TBdbfork(), and add a couple more event sends in libdb.
-
- 29 Jun, 2004 1 commit
-
-
Leigh B. Stoller authored
so that the process ID is tracked in the DB and so that the user can stop a linktest in progress from the web interface, even if it is started directly from experiment swapin.
-
- 21 May, 2004 1 commit
-
-
Leigh B. Stoller authored
experiment cause of failure!
-
- 17 May, 2004 1 commit
-
-
Leigh B. Stoller authored
system as well as the summary page.
-
- 29 Apr, 2004 1 commit
-
-
Leigh B. Stoller authored
currently available only to people with stud=1 status in the DB.

* www/tbauth.php3: Add a STUDLY() function to check that bit.
* www/linktest.php3: New page to run linktest on the fly. The level defaults to the current level in the experiments table, but you can override that via the form on the page.
* www/showexp.php3: Add link to aforementioned page. STUDLY() only.
* www/beginexp_form.php3: Add an option (selection) to set the linktest level for create/swapin. Defaults to 0 (no linktest). STUDLY() only.
* www/editexp.php3: Add an option to edit the default linktest level for an experiment. STUDLY() only.
* tbsetup/batchexp.in and tbsetup/swapexp.in: Add code to optionally run the linktest, sending email if it fails (exits with non-zero status). Failure does not affect the swapin.
-
- 07 Apr, 2004 1 commit
-
-
Leigh B. Stoller authored
down when invoked from the RPC server.
-
- 15 Mar, 2004 1 commit
-
-
Leigh B. Stoller authored
these scripts!
-
- 09 Mar, 2004 1 commit
-
-
Leigh B. Stoller authored
* Add proper check_slot() calls to all of the user input that is going into the DB (already had taint checking), since batchexp is now available for interactive use from ops.
* Remove separate DB insertions of noswap/noidleswap reasons from web script, and pass on the command line from web to batchexp. Now inserted in the backend script so that they can be provided on the command line when batchexp is used interactively.
* Change defaults in backend script; experiments now default to swappable and idleswap; previously defaulted to not swappable and no idleswap.
* Remove [-s] (swappable) and add [-S <reason>] option. -S sets experiment to not swappable, with supplied reason (text string).
* Add [-L <reason>] option. -L sets experiment to no idleswap, with supplied reason (text string).
* Add several missing table_regex entries for experiments table.
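
The new defaults and the -S/-L options listed above can be illustrated with a small Python argparse sketch. This is not the batchexp option parser (which is Perl); `build_parser`, `experiment_settings`, and the dictionary keys are hypothetical names, and only the -S and -L flags and their meaning come from the commit message.

```python
import argparse


def build_parser():
    """Parser with only the two options described in the commit message."""
    parser = argparse.ArgumentParser(prog="batchexp-sketch")
    parser.add_argument("-S", dest="noswap_reason", metavar="reason", default=None,
                        help="mark the experiment not swappable, with a reason")
    parser.add_argument("-L", dest="noidleswap_reason", metavar="reason", default=None,
                        help="disable idleswap for the experiment, with a reason")
    return parser


def experiment_settings(argv):
    """Map the command line to settings; swappable and idleswap default to on."""
    args = build_parser().parse_args(argv)
    return {
        "swappable": args.noswap_reason is None,
        "noswap_reason": args.noswap_reason,
        "idleswap": args.noidleswap_reason is None,
        "noidleswap_reason": args.noidleswap_reason,
    }
```

With no options the experiment is swappable and idleswapped; giving a reason is the only way to turn either off, so a reason is always on record when the default is overridden.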
-
- 20 Feb, 2004 1 commit
-
-
Leigh B. Stoller authored
batchexp, and forgot to finish the changes! The result was a fairly broken batch system, which is now hopefully fixed! Took the opportunity to remove the -x (expires) and -l (priority) options, which are no longer referenced anywhere. Fix up email message so that idle/auto swap times are in hours, not minutes. Provide a proper usage() function that describes the morass of options (for interactive use from ops).
-
- 12 Feb, 2004 1 commit
-
-
Leigh B. Stoller authored
no reason for the separation for a long time, and it made maintenance more difficult because of duplication between batchexp and startexp (batch was the sole user of startexp). Cleaner solution.

* Check argument processing for batchexp, swapexp, endexp to make sure the taint checks are correct. All three of these scripts will now be available from ops. I especially watched the filename processing, which was pretty loose before and could allow someone to grab a file on boss by trying to use it as an NS file (the scripts all run as the user, of course). The web interface generates filenames that are hard to guess, so rather than wrapping these scripts when invoked from ops, just allow the usual paths (/proj, /groups, /users) but also the /tmp/$uid-XXXXXX.nsfile pattern, which should be hard enough to guess that users will not be able to get anything they are not supposed to.
* Add -w (waitmode) options to all three scripts. In waitmode, the backend detaches, but the parent remains waiting for the child to finish so it can exit with the appropriate status (for scripting). The user can interrupt (^C), but it has no effect on the backend; it just kills the parent side that is waiting (backend is in a new session ID). Log output still goes to the file (available from web page) and is emailed.
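
The filename rule described above (the usual /proj, /groups, /users trees, plus the /tmp/$uid-XXXXXX.nsfile pattern) might be checked along these lines. This is a hedged Python sketch, not the scripts' actual taint check: `nsfile_allowed` is a hypothetical name, reading XXXXXX as exactly six alphanumeric characters is an assumption, and symlinks are not considered.

```python
import posixpath
import re

# Directory trees named in the commit message.
ALLOWED_ROOTS = ("/proj/", "/groups/", "/users/")


def nsfile_allowed(path, uid):
    """Return True if path is an acceptable NS file location for user uid."""
    # Reject anything that is not already in normal form ("..", "//", "./").
    if posixpath.normpath(path) != path:
        return False
    if any(path.startswith(root) for root in ALLOWED_ROOTS):
        return True
    # Assumption: XXXXXX stands for six random alphanumeric characters.
    pattern = r"/tmp/" + re.escape(uid) + r"-[A-Za-z0-9]{6}\.nsfile"
    return re.fullmatch(pattern, path) is not None
```

For example, `nsfile_allowed("/tmp/alice-a1B2c3.nsfile", "alice")` is accepted, while the same file is refused for a different uid and `/proj/../etc/passwd` is refused outright.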
-
- 09 Feb, 2004 1 commit
-
-
Leigh B. Stoller authored
possible use in non-interactive web access. Currently, the web key is used just to download tar/rpm files to remote nodes.
-
- 02 Dec, 2003 1 commit
-
-
Leigh B. Stoller authored
Exit with positive value so that web interface reports error to the user, not to us!
-
- 18 Nov, 2003 1 commit
-
-
Leigh B. Stoller authored
of virt_tables so that it is saved and restored like the rest of the virtual state.
-
- 17 Nov, 2003 1 commit
-
-
Leigh B. Stoller authored
state machine (state). All of the stuff that was previously handled by using batchstate is now embedded into the one state machine. Of course, these mostly overlapped, so it's not that much of a change, except that we also redid the machine, adding more states (for example, modify phases are now explicit). To get a picture of the actual state machine, on boss:

    stategraph -o newstates EXPTSTATE
    gv newstates.ps

Things to note:

* The "batchstate" slot of the experiments table is now used solely to provide a lock for the batch daemon. A secondary change will be to change the slot name to something more appropriate, but it can happen anytime after this new stuff is installed.
* I have left expt_locked for now, but another later change will be to remove expt_locked, and change it to active_busy or some such new state name in the state machine. I have removed most uses of expt_locked, except those that were necessary until there is a new state to replace it.
* These new changes are an implementation of the new state machine, but I have not done anything fancy. Most of the code is the same as it was before.
* I suspect that there are races with the batch daemon now, but they are going to be rare, and the end result is probably that a cancelation is delayed a little bit.
-
- 05 Nov, 2003 1 commit
-
-
Leigh B. Stoller authored
* Generate a shared secret key for the event system. This key is stored into the DB, and passed to the node via tmcd. It is also stashed into a file in the experiment directory (can be accessed only by the project/group members). The key is used to attach an HMAC (hashed message authentication code) to each event, which is checked by the receivers to ensure that the event is not bogus. More details on this later when I commit the event library/client changes.
* Added "virt_programs" table to store info about each program object defined by the user. The intent is to no longer send the command string in the event, but to fix it in the DB, and transfer it via tmcd. This removes our "remote execution facility", which was always a bad idea (we have ssh for that, and that is a lot more secure than the event system!). Note that for the time being we need to continue to send the command in the event because of old images, but the new images will now ignore that part of the event.
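
The shared-key HMAC scheme described above can be sketched with Python's standard hmac module. This is an illustration under assumptions, not the event library's wire format: `make_key`, `sign_event`, and `verify_event` are hypothetical names, and the choice of SHA-256 and a 32-byte key is an assumption (the commit message does not name a digest or key size).

```python
import hashlib
import hmac
import secrets


def make_key():
    """Generate a per-experiment shared secret (32 random bytes, an assumption)."""
    return secrets.token_bytes(32)


def sign_event(key, payload):
    """Compute the HMAC tag a sender would attach to an event payload (bytes)."""
    return hmac.new(key, payload, hashlib.sha256).digest()


def verify_event(key, payload, tag):
    """Receiver-side check; compare_digest avoids leaking timing information."""
    return hmac.compare_digest(sign_event(key, payload), tag)
```

A receiver holding the same key accepts an event only if the tag matches the payload it actually received, so an event forged or altered without the key is rejected.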
-
- 01 Oct, 2003 1 commit
-
-
Leigh B. Stoller authored
-
- 30 Sep, 2003 1 commit
-
-
Leigh B. Stoller authored
plus a lock field. The lock field was a simple "experiment locked, go away" slot that is easy to use when you do not care about the actual state that an experiment is in, just that it is in "transition" and should not be messed with.

The other two state variables are "state" and "batchstate". The former (state) is the original variable that Chris added, and was used by the tb* scripts to make sure that the experiment was in the state each particular script wanted it to be in. But over time (and with the addition of so much wrapper goo around them), "state" has leaked out all over the place to determine what operations on an experiment are allowed, and if/when it should be displayed in various web pages. There is a set of transition states, like "swapping", in addition to the usual "active", "swapped", etc., that make testing state a pain in the butt.

I added the other state variable ("batchstate") when I did the batch system, obviously! It was intended as a wrapper state to control access to the batch queue, and to prevent batch experiments from being messed with except when it was really okay (for example, it's okay to terminate a swapped out batch experiment, but not a swapped in batch experiment, since that would confuse the batch daemon). There are fewer of these states, plus one additional state for "modifying" experiments.

So what I have done is change the system to use "batchstate" for all experiments to control entry into the swap system, from the web interface, from the command line, and from the batch daemon. The other state variable still exists, and will be brutally pushed back under the surface until it's just a vague memory, used only by the original tb* scripts. This will happen over time, and the "batchstate" variable will be renamed once I am convinced that this was the right thing to do and that my changes actually work as intended.
Only people who have bothered to read this far will know that I also added the ability to cancel an experiment swapin in progress. For that I am using the "canceled" flag (ah, this one was named properly from the start!), and I test that at various times in assign_wrapper and tbswap. A minor downside right now is that a canceled swapin looks too much like a failed swapin, and so tbops gets email about it. I'll fix that at some point (sometime after the boss complains).

I also cleaned up various bits of code, replacing direct calls to exec with calls to the recently improved SUEXEC interface. This removes some cruft from each script that calls an external script.

Cleaned up modifyexp.php3 quite a bit, reformatting and indenting. Also fixed to not run the parser directly! This was very wrong; should call nscheck instead. Changed to use "nobody" group instead of group flux (made the same change in nscheck).

There is a script in the sql directory called newstates.pl. It needs to be run to initialize the batchstate slot of the experiments table for all existing experiments.
-
- 23 Sep, 2003 1 commit
-
-
Leigh B. Stoller authored
-
- 11 Sep, 2003 1 commit
-
-
Mike Hibler authored
swapping. This is primarily for the "queue an interactive job" form of batching, but applies to all batch jobs at the moment.
-
- 17 Jul, 2003 1 commit
-
-
Mac Newbold authored
Lots of changes to the form, both functional and aesthetic. See the testbed ops mail logs for a list of all of them, and the rationale. Corresponding updates to the showexp "edit meta-data" stuff, so that it gets all the same error checks as the real form. Also some backend changes in batchexp to pass through all the new form values.
-
- 10 Jul, 2003 1 commit
-
-
Leigh B. Stoller authored
* In the parser, make -n (impotent) and -a (anonymous) more independent. Used to be that -n required -a, but that makes the preparse less useful, since it cannot catch project related errors (like bad osids, or node type permissions), and so the user does not get that until the email message later. That's so annoying, even Mike whined about it. Note that impotent mode is sorta misnamed now, since the parse never operates on the DB. Rather, impotent mode now skips doing the XML output phase (still aptly named updateDB!).
* Add -p (pass) option. I added this for my script that was parsing all the old NS files to get renderings. In this case, I do not want -n or -a; I want to upload the results into the DB, but the project related checks are obviously going to fail since I was doing it inside the testbed project. So, -p turns on some of the anon checks, and later might be used to turn off certain features that are no longer supported, since all we really care about is the topology (some NS files failed on old features and syntax).

Upon reflection I think these three options could probably be rolled into just two, by cleaning up the current impotent and anonymous flags.
-
- 09 Jul, 2003 2 commits
-
-
Leigh B. Stoller authored
-
Leigh B. Stoller authored
can swapmod an experiment created this way, think again ...
-
- 30 Jun, 2003 1 commit
-
-
Leigh B. Stoller authored
do the actual parse. The parser now spits out XML instead of DB queries, and the wrapper on boss converts that to DB insertions after verification. There are some makefile changes as well to install the new parser on ops via NFS, since otherwise the parser could be intolerably out of date on ops!
-
- 09 Jun, 2003 1 commit
-
-
Mac Newbold authored
-
- 31 May, 2003 1 commit
-
-
Mac Newbold authored
-