- 21 May, 2003 1 commit
-
-
Leigh B. Stoller authored
each portion of the experiment as it is modified. Also add expt_swap_uid so that we know who did the last operation, and so we can charge/credit the right person. So, if joe swaps in the experiment and jane swaps it out, joe gets charged. If jane swaps in the experiment and joe modifies it, jane gets credit for the first portion, and joe will later get charged for the second portion. Took longer to explain than to implement ... Lbs
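A minimal sketch of the charging rule described above, in Python. The event tuples, the operation names, and the `charge_portions` function are hypothetical and are not the testbed's code; only the rule itself (each portion is charged to whoever started it with a swapin or modify) comes from the message.

```python
from collections import defaultdict


def charge_portions(events):
    """Attribute elapsed time to the user who started each portion.

    events: list of (timestamp_seconds, op, uid), sorted by time, where
    op is "swapin", "modify", or "swapout" (hypothetical names).
    Returns {uid: seconds_charged}.
    """
    usage = defaultdict(int)
    current_uid = None   # who is being charged for the running portion
    started = None       # when that portion began
    for ts, op, uid in events:
        # Close out the portion that was running, if any.
        if current_uid is not None:
            usage[current_uid] += ts - started
        if op in ("swapin", "modify"):
            # A swapin or modify starts a new portion charged to this user.
            current_uid, started = uid, ts
        else:
            # A swapout ends charging; the user who swaps out is not charged.
            current_uid, started = None, None
    return dict(usage)
```

With jane swapping in at time 0, joe modifying at 100, and anyone swapping out at 250, jane is charged for the first 100 seconds and joe for the remaining 150.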
-
- 15 May, 2003 1 commit
-
-
Leigh B. Stoller authored
per-experiment instantiation with aggregate data like the number of swapins, the dates and the like. The other part is the per swapin/modify stats. These are number of pnodes, links, lans, etc. Long term, I think we want more precise swapin stats, and with experiment modify in the mix, we need to have multiple stat records per experiment, but do not need to duplicate all the stuff in the other table just mentioned. To reduce the table size, we cross reference the tables by index only instead of with pid,eid and the like. We use exptidx to link experiments, experiment_stats, and the new experiment_resources table. experiment_resources and stats are linked by another index in the resources table, which indicates which is the current resource row. On a modify, a new resource record is created, and the stats record updated to point to the new (latest) resource record. Web Changes: Improve showstats and showexpstats. Make them user accessible so that mere users can see stats for themselves and for their projects. No ability for mere users (PIs) to look at another person's stats. Generally, these two pages need more work, but now they are more useful. I added Show Stats to the user info and project info pages to display per-user/project stats. Add more info in the showstats display, but the showexpstats display is still not pretty printed; just the raw tables. Rename a few fields, add some indexes, and otherwise make some minor changes that are sure to annoy everyone.
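The index-only cross-referencing described above might look roughly like this. This is a sketch using Python's sqlite3 module, not the real schema: `exptidx`, `experiment_stats`, and `experiment_resources` are named in the message, while every other column name (`idx`, `rsrcidx`, `swapin_count`) and both function names are assumptions made for illustration.

```python
import sqlite3


def make_stats_db():
    """Create an in-memory database with the two linked tables."""
    db = sqlite3.connect(":memory:")
    db.executescript(
        """
        CREATE TABLE experiment_stats (
            exptidx      INTEGER PRIMARY KEY,  -- shared experiment index
            swapin_count INTEGER DEFAULT 0,    -- hypothetical aggregate column
            rsrcidx      INTEGER               -- hypothetical: current resource row
        );
        CREATE TABLE experiment_resources (
            idx     INTEGER PRIMARY KEY AUTOINCREMENT,
            exptidx INTEGER NOT NULL,          -- links back by index, not pid/eid
            pnodes  INTEGER,
            links   INTEGER,
            lans    INTEGER
        );
        """
    )
    return db


def record_resources(db, exptidx, pnodes, links, lans):
    """On swapin or modify: add a resource row and repoint the stats row at it."""
    cur = db.execute(
        "INSERT INTO experiment_resources (exptidx, pnodes, links, lans) "
        "VALUES (?, ?, ?, ?)",
        (exptidx, pnodes, links, lans),
    )
    db.execute(
        "UPDATE experiment_stats SET rsrcidx = ? WHERE exptidx = ?",
        (cur.lastrowid, exptidx),
    )
    db.commit()
    return cur.lastrowid
```

Older resource rows stay in place, so the history of each modify is kept while the stats row always names the latest one.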
-
- 05 May, 2003 1 commit
-
-
Leigh B. Stoller authored
idleswaps are not showing up in the testbed stats.
-
- 01 May, 2003 1 commit
-
-
Mac Newbold authored
but later changes to where/how the email is sent took it out.)
-
- 30 Apr, 2003 1 commit
-
-
Leigh B. Stoller authored
tb tools! I've changed the batch system to "preload" the experiment in foreground mode (results of parse spit back to user directly). The batch daemon now uses swapexp instead of startexp. Upon failure, the experiment goes back to the "swapped" state; previously its virt state was blasted, and reentered again next try. This is nice cause you can actually look at the batch experiment (vis, virt tables, etc) while it is posted and not running. Not sure if all the Ts are crossed. Will find out ...
-
- 29 Apr, 2003 1 commit
-
-
Chad Barb authored
Various Other changes to get Expt Modify ready for prime time.
  - If assign fails on a modify, experiment will be restored to old state, *not* swapped out.
  - Reboot option has been improved to reboot all nodes as part of os_setup, not in separate step.
  - Different assign error codes result in different retry behavior for assign_wrapper (Follows Rob's change to assign to make it pass back special code for non-retriable faults)
  - '64' bit in assign_wrapper exit code indicates to tbswap that db/phys state hadn't been mucked with before the exit occurred (ergo, '65' and '1' are the common return codes, though the old 4,8,16,32 are still there for assign failing.)
  - (tbswap still returns codes from assign wrapper)
  - Added 5 sec pause between assign attempts.
  - Cleaned up tbswap code.
  - Physical state backup/restore removed from tbprerun, put into swapexp.
  - Interfaces table now getting cleaned up correctly (Mike noticed problem)
  - Changed menu display in showexp to show the "modify" menu option for swapped out experiments (like it used to.)
  - A couple other changes.
Note: Still admin-only, but I plan to change that soon.
To do:
  - Erase expt backups in /tmp after using them.
  - Re-viz failed experiments.
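A small sketch of how a caller could read the '64' bit out of an exit code as described above. The constant and function names are hypothetical, and this is Python for illustration only, not the tbswap code; the message does not say what the remaining bits mean beyond the values it lists.

```python
STATE_UNTOUCHED = 64  # the bit named in the message above


def decode_exit(code):
    """Split an exit code into (state_untouched, remaining_error_bits)."""
    untouched = bool(code & STATE_UNTOUCHED)
    error_bits = code & ~STATE_UNTOUCHED
    return untouched, error_bits
```

Under this reading, 65 decodes as "failed with code 1, db/phys state untouched" and a plain 1 as "failed with code 1, state may have been changed".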
-
- 28 Apr, 2003 2 commits
-
-
Leigh B. Stoller authored
swap_exitcode (last error), idle_swaps (a count), batch (a flag to indicate a batch experiment). Add an operational log. Okay, it's not actually a log, but a table that will grow forever until it consumes the earth. It's a small table though, so it will take a few years. It's cross indexed with the experiment_stats table, so by massaging this table along with the stats table, we can get a good picture of what was running on the testbed when, and how many resources it was using. Sorry, not a log file, but we can easily generate a log file from the table if the Boss really wants one. The table entry averages 28 bytes. Move stats to their own main menu item (admin mode only). Remove from the showexp_list page since that was bogus.
-
Leigh B. Stoller authored
The first three are aggregate tables, while the experiment stats table gets a record for each new experiment, and is updated when an experiment is swapped in, swapped out, modified, or terminated. Look at the table to see what is tracked. Once the experiment_stats record is updated, the aggregate tables are updated as necessary. There are a bunch of ugly changes to assign_wrapper to get the stats. Note that pnodes is not incremented until an experiment successfully swaps in. This is in lieu of getting status codes; I'm not tracking failed operations yet, nor creating the log file that Jay wants. I'll do that in the next round of changes when we see how useful these numbers are. Most of the changes are to create/delete table entries where appropriate, and to display the records. Display is only under admin mode, and the display is raw; just a dump of the assoc tables in php. The last 100 experiment stats records are available via the Experiment List page, using the "Stats" show option at the top. Bad place, but will do for now.
-
- 17 Apr, 2003 1 commit
-
-
Chad Barb authored
For the benefit of our users, added 'reboot nodes in experiment' checkbox, on by default, with a stern warning.
-
- 16 Apr, 2003 1 commit
-
-
Leigh B. Stoller authored
experiment, rather than as an administrator, which presents group permission problems when the experiment is in a subgroup (requires two additional groups, whereas suexec adds only one group). That aside, the correct approach is to run the swap as the creator. To do that, must flip to the user (from the admin person) in the backend using the new idleswap script, and then run the normal swapexp. Add new option to swapexp (-i) which changes the email slightly to make it clear that the experiment was idleswapped, and so that the From: is tbops not the user (again, to make it more clear).
-
- 03 Apr, 2003 1 commit
-
-
Chad Barb authored
Added new feature 'Experiment Modify'. Now available (to admins only for now) from the showexp page. Warning! Doing a modify which alters the topology will probably require a "reboot all nodes" afterwards. (There will be a checkbox soon in the modify experiment page.) Adding/removing delay nodes seems to work fine without reboots, though. Warning! If the new version of the experiment cannot be mapped (not enough nodes available, for instance) the experiment will be swapped out! This will get fixed later. Prerun backs up the experiment topology, so using a bad NS file doesn't result in experiment termination. As part of this, added library functions to libdb to delete, backup, and restore both virtual and physical experiment state.
-
- 27 Mar, 2003 1 commit
-
-
Leigh B. Stoller authored
make one in /usr/testbed/expwork/$pid. Too much cruft getting left behind and it was causing even more log copy errors! Besides, typically it's just tbops people who need to look at that stuff.
-
- 11 Mar, 2003 1 commit
-
-
Chad Barb authored
New version of unified tbswap in/out. startexp/endexp/swapexp have been changed to use new script. tbswapin and tbswapout have been replaced with a script which spits out a warning message, then calls tbswap appropriately. The README has also been modified.
-
- 18 Dec, 2002 1 commit
-
-
Leigh B. Stoller authored
Attempts to replay an experiment by rebooting all the nodes, clearing the various startup bits (ready, startstatus, bootstatus, portstats), and then restarting the event system. I am dubious that this is a workable solution because of the asynchronous nature of the testbed (nodes happily cruise from TBRESET to ISUP and beyond without stopping), and so it's hard to truly replicate the initial lack of state that a freshly swapped in experiment has. Still, people requested it and I cheerfully provided it cause that's what I do; service with a smile and not a whit of complaint. Is anyone reading this?
-
- 16 Sep, 2002 1 commit
-
-
Leigh B. Stoller authored
experiment. Here is mail to tbops:
  * Moved the working directory for experiment setup/swap/end to a new directory located on boss instead of over NFS to /proj/$pid/$eid. This new location is /usr/testbed/expwork/$pid/$eid.
  * Changed the name of the directories we create in /usr/testbed/expinfo to $pid-$eid.$index where $index is a new autoincrement field in the DB table. I really hated the names that were created before.
  * Changed where logs are written from /tmp to the new location in /usr/testbed/expwork/$pid/$eid.
Okay, why.
  * We no longer operate on NFS mounted directories that might hang. It's easier to catch the situation where a copy of the log file over at the end of experiment creation fails cause of an NFS problem.
  * We no longer have user writable files that are inputs to other parts of the system (like top and ptop files). Not that a user would be bad, but it closes a hole.
  * We no longer copy user writable files from /proj to boss where we might fill up an important filesystem cause the user put a .ndz file in the working directory. Not that a user would be bad, but it closes a hole.
  * It's easier to save all the log files this way, for each swap in and out.
  * Removing a directory over NFS is a royal irritant when someone is CD'ed into that directory or looking at a file on the other side (the astute observer will peg this as the reason I went down this idiotic path in the first place!).
  * About 6 other reasons that I can no longer remember. Seriously, I really had more reasons I can no longer remember! :-)
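For reference, the two directory layouts named above can be written out as path builders. A sketch in Python only; the function names are hypothetical, and the base directories and the $pid-$eid.$index pattern are taken from the message.

```python
from pathlib import PurePosixPath

EXPWORK = PurePosixPath("/usr/testbed/expwork")
EXPINFO = PurePosixPath("/usr/testbed/expinfo")


def work_dir(pid, eid):
    """Working (and log) directory on boss: /usr/testbed/expwork/$pid/$eid."""
    return EXPWORK / pid / eid


def info_dir(pid, eid, index):
    """Archive directory: /usr/testbed/expinfo/$pid-$eid.$index."""
    return EXPINFO / f"{pid}-{eid}.{index}"
```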
-
- 11 Jul, 2002 1 commit
-
-
Leigh B. Stoller authored
directory so that they can be viewed later after the operation is complete. I've also cleaned up the mechanism for determining when a log file is active (for the web spew) by using another slot in the experiments table, and added some libdb routines to manage that slot. At present just the last (or latest) log can be viewed after the fact, but we can change that later if we think it's really necessary. At the same time, make it possible for admin types to view the log files for other people's experiments; spew is setuid, but flips back after opening the file (does usual checks too). I've also incorporated the log changes into the batch daemon, so you can view the last batch log too, although I have not tested that yet!
-
- 07 Jul, 2002 1 commit
-
-
Leigh B. Stoller authored
-
- 16 Jun, 2002 1 commit
-
-
Leigh B. Stoller authored
transition error when you click too fast after creating it. Instead of looking at experiment state, use the logfile slot of the experiments table, and make sure it's cleared/set properly in start/swap experiment scripts. Also added a spew option to the swap page so you can watch experiments swap in/out.
-
- 16 May, 2002 1 commit
-
-
Leigh B. Stoller authored
-
- 19 Mar, 2002 1 commit
-
-
Leigh B. Stoller authored
cleanup of tbreport.
-
- 12 Feb, 2002 1 commit
-
-
Leigh B. Stoller authored
line in all email from the system. Remove all of the TESTBED: tags and modify the email function in the web server and perl library to prepend @DOMAIN@: to the message.
-
- 28 Dec, 2001 1 commit
-
-
Leigh B. Stoller authored
are created 664, so that group members other than the experiment creator can swap them.
-
- 27 Nov, 2001 1 commit
-
-
Leigh B. Stoller authored
-
- 07 Nov, 2001 1 commit
-
-
Leigh B. Stoller authored
all those top and ptop files are collecting in /usr/testbed/www!
-
- 24 Oct, 2001 1 commit
-
-
Leigh B. Stoller authored
but simply entered into the DB record for the experiment until we know what to do with them. Add to batchexp script arguments, since all that stuff is done outside the web interface. Add a swapexp perl script to swap an experiment in/out from the command line. Add web links on the Experiment Information page to do this from the web interface. A bunch of locking changes. Previously expt_terminating in the experiment record prevented multiple calls to terminate an experiment, but now we have a more general locking problem with start,swapin,swapout, and terminate, so change expt_terminating to expt_locked (still a datetime field) and add locking to all of startexp, swapexp, and endexp. Note that batch experiments cannot be swapped yet because of locking issues still to be resolved. Minor cleanup in tbreport to make email message look better.
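One way a datetime lock field like expt_locked can guard start/swap/terminate is a conditional update that only succeeds when the field is empty. This is a sketch with Python's sqlite3 module under that assumption; the table layout and the function names are hypothetical, and the message does not say how the scripts actually take the lock.

```python
import sqlite3


def make_locking_db():
    """In-memory stand-in for the experiments table (hypothetical layout)."""
    db = sqlite3.connect(":memory:")
    db.execute(
        "CREATE TABLE experiments ("
        " pid TEXT, eid TEXT, expt_locked TEXT,"
        " PRIMARY KEY (pid, eid))"
    )
    return db


def try_lock(db, pid, eid, now):
    """Set expt_locked only if it is currently NULL; True if we got the lock."""
    cur = db.execute(
        "UPDATE experiments SET expt_locked = ? "
        "WHERE pid = ? AND eid = ? AND expt_locked IS NULL",
        (now, pid, eid),
    )
    db.commit()
    # Exactly one row changes when the lock was free; none when it was held.
    return cur.rowcount == 1


def unlock(db, pid, eid):
    """Clear the lock once the operation has finished."""
    db.execute(
        "UPDATE experiments SET expt_locked = NULL WHERE pid = ? AND eid = ?",
        (pid, eid),
    )
    db.commit()
```

Storing the time rather than a plain flag also records when the lock was taken, which helps when tracking down a stuck operation.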
-
- 17 Oct, 2001 1 commit
-
-
Leigh B. Stoller authored
experiment code. No longer uses another table. Rather, the experiment record contains a couple of extra fields for the batch system. Also combined some of the backend code (no longer a killbatch script). Also added scriptable experiments; the batchexp program in the bin directory can start an experiment from the command line, and in fact is used from the web page for both batch experiments and immediate experiments (-i option). All of the DB code that was in the web interfaces was moved to batchexp.
-
- 16 Oct, 2001 1 commit
-
-
Leigh B. Stoller authored
-
- 26 Sep, 2001 1 commit
-
-
Leigh B. Stoller authored
-
- 24 Sep, 2001 2 commits
-
-
Leigh B. Stoller authored
terminated.
-
Leigh B. Stoller authored
experiment.
-
- 28 Aug, 2001 2 commits
-
-
Leigh B. Stoller authored
-
Leigh B. Stoller authored
-
- 22 Aug, 2001 1 commit
-
-
Leigh B. Stoller authored
-
- 19 Jul, 2001 1 commit
-
-
Leigh B. Stoller authored
email, for the case when someone other than creator does the termination.
-
- 29 Jun, 2001 1 commit
-
-
Leigh B. Stoller authored
will terminate properly through web interface.
-
- 20 Jun, 2001 1 commit
-
-
Leigh B. Stoller authored
-
- 10 May, 2001 1 commit
-
-
Leigh B. Stoller authored
proper headers. Split out some of the mail into testbed-logs, testbed-ops, and testbed-approval. Added a library for including from our perl scripts. Contains a couple of mail helper functions, but will hopefully contain more as time goes by. Fixed a bug in the web interface that was causing breakage for people with multiple accounts. Mac and Jay have noticed this, when logging out and trying to join or create a project under a new or different name.
-
- 19 Mar, 2001 1 commit
-
-
Leigh B. Stoller authored
Change to startexp/endexp, which are almost scriptable now (can be called directly in addition to from the web page). Add front ends to these for the web page (webstartexp and webendexp). These changes are mostly support for batch mode.
-
- 09 Mar, 2001 1 commit
-
-
Leigh B. Stoller authored
termination, setup now exits immediately and sends email to the user when the experiment is fully configured.
-
- 07 Mar, 2001 1 commit
-
-
Leigh B. Stoller authored
by the web server forks a child to do the actual work of calling tbend and other stuff. The parent returns right away and the script ends. When the experiment termination (child) ends, an email message is sent to the user that issued the termination request. To prevent multiple clicks, I added a DB field called expt_terminating that is a DATETIME field. If the field is set, the script fails and the user is told to be more patient. I used a DATETIME field mostly for debugging purposes so we can track any future problems.
-