- 12 Jan, 2005 1 commit
-
-
Leigh Stoller authored
table that will prevent an experiment from being swapped/modified. The toggle is on the showexp page and is *not* admin-overridable; you must turn the toggle off (and, of course, you must be an admin to do that).
-
- 16 Dec, 2004 3 commits
-
-
Robert Ricci authored
the web interface.
-
Robert Ricci authored
PAGEHEADER() and PAGEFOOTER(). Pop it up in a new window.
-
Leigh Stoller authored
* tbsetup/panic.in: New backend script to implement the panic button feature. When used, it severs the connection to the firewall node by using snmpit to disable the port. It sets the panic bit (and date) in the experiments table, and changes the state of the experiment from "active" to "paniced" to ensure that the experiment cannot be messed with (swapped out or modified). It sends email to tbops when the panic button is pressed. Used with the -r option, it reverses the above: the state is set back to active, the panic bit is cleared, and the port is re-enabled with snmpit.
* tbsetup/tbswap.in: During swapout, a firewalled experiment that has been paniced will get a cleaning: the nodes are powered off, then the osids for all the nodes are reset (with os_select) so that they will boot the MFS, and then the nodes are powered on. Then the control network is turned back on, and then I wait for the nodes to reboot. (This is simply because we do not record in the DB that a node is turned off, and if I do not wait, the reload daemon will end up hitting the power button again if they do not reboot in time. We can fix this later.) I am not planning to apply this to general firewalled experiments yet, as the power cycling is going to be hard on the nodes, so I would rather that we at least have a half-baked plan before we do that.
* www/showexp.php3: If the experiment is firewalled, show the Panic Button, linked to the panic button web script. If the experiment has already had the panic button pressed, show a big warning message and explain that the user must talk to tbops to swap the experiment out. Also fiddle with menu options so that the terminate link is gone, and the swap link is visible only in admin mode. In other words, only an admin person can swap an experiment once it is paniced. And of course, an admin person can run the backend panic script above with the -r option, but that's not something to be done lightly.
* db/libdb.pm.in: Add "paniced" as an experiment state (EXPTSTATE_PANICED). Add utility functions: TBExptSetPanicBit(), TBExptGetPanicBit(), and TBExptClearPanicBit().
* tbsetup/swapexp.in: Minor state fiddling so that an experiment can be swapped while in paniced state, but only when in admin mode. Also clear the panic bit when the experiment is swapped out.
* www/dbdefs.php3.in: Add "paniced" as an experiment state. Add a utility function TBExptFirewall() to see if the experiment is firewalled.
* www/panicbutton.php3: New web script to invoke the backend panic script mentioned above, after the usual confirm song and dance.
* www/panicbutton.gif: New gif of a red panic button that I stole off the net. If anyone sees/has a better one, feel free to replace this one.
* utils/node_statewait.in: Add -s option so that I can pass in the state I want to wait for (used from tbswap above to wait for nodes to reach ISUP after power on).
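The panic and reverse (-r) transitions described above can be sketched roughly as follows. This is an illustrative Python model only, not the actual panic.in or libdb code: the Experiment class, its fields, and the panic() and may_swap() helpers are hypothetical names, and the requirement that the experiment be firewalled is an assumption drawn from the showexp description. Only the "active" and "paniced" state names come from the commit message.

```python
from dataclasses import dataclass
from datetime import datetime, timezone
from typing import Optional

# State names from the commit message; everything else here is hypothetical.
EXPTSTATE_ACTIVE = "active"
EXPTSTATE_PANICED = "paniced"


@dataclass
class Experiment:
    """Hypothetical stand-in for one row of the experiments table."""
    state: str = EXPTSTATE_ACTIVE
    firewalled: bool = True
    panic_bit: bool = False
    panic_date: Optional[datetime] = None
    firewall_port_enabled: bool = True  # stands in for the snmpit port state


def panic(expt: Experiment, reverse: bool = False) -> None:
    """Press the panic button, or undo it when reverse is True (the -r case)."""
    if not expt.firewalled:
        # Assumption: the panic button only applies to firewalled experiments.
        raise ValueError("experiment is not firewalled")
    if reverse:
        if expt.state != EXPTSTATE_PANICED:
            raise ValueError("experiment is not paniced")
        expt.firewall_port_enabled = True   # port re-enabled
        expt.panic_bit = False              # panic bit cleared
        expt.panic_date = None
        expt.state = EXPTSTATE_ACTIVE       # state set back to active
    else:
        if expt.state != EXPTSTATE_ACTIVE:
            raise ValueError("experiment is not active")
        expt.firewall_port_enabled = False  # port disabled
        expt.panic_bit = True               # panic bit (and date) set
        expt.panic_date = datetime.now(timezone.utc)
        expt.state = EXPTSTATE_PANICED


def may_swap(expt: Experiment, is_admin: bool) -> bool:
    """Once paniced, only an admin may swap the experiment."""
    if expt.state == EXPTSTATE_PANICED:
        return is_admin
    return expt.state == EXPTSTATE_ACTIVE
```

Keeping the admin check in one helper mirrors the intent that a paniced experiment is off limits to everyone except an admin.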
-
- 15 Dec, 2004 1 commit
-
-
Robert Ricci authored
Display a new 'Blinky Lights' button on the showexp page. In order to do this, I have to get a list of which classes/types are in use in the experiment. This leads to moteleds.php3, which displays the blink lights using Tim's cool Java applet.
-
- 12 Jul, 2004 1 commit
-
-
Timothy Stack authored
-
- 29 Jun, 2004 1 commit
-
-
Leigh Stoller authored
in the DB, change that to Stop Linktest instead.
-
- 13 May, 2004 1 commit
-
-
Leigh Stoller authored
inactive experiments.
-
- 29 Apr, 2004 1 commit
-
-
Leigh Stoller authored
currently available only to people with stud=1 status in the DB.
* www/tbauth.php3: Add a STUDLY() function to check that bit.
* www/linktest.php3: New page to run linktest on the fly. The level defaults to the current level in the experiments table, but you can override that via the form on the page.
* www/showexp.php3: Add link to aforementioned page. STUDLY() only.
* www/beginexp_form.php3: Add an option (selection) to set the linktest level for create/swapin. Defaults to 0 (no linktest). STUDLY() only.
* www/editexp.php3: Add an option to edit the default linktest level for an experiment. STUDLY() only.
* tbsetup/batchexp.in and tbsetup/swapexp.in: Add code to optionally run the linktest, sending email if it fails (exits with non-zero status). Failure does not affect the swapin.
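The "optional linktest, failure does not affect the swapin" behavior can be illustrated with a small Python sketch. The function name swapin_with_linktest and its run_linktest and notify callbacks are hypothetical; the real logic lives in the batchexp and swapexp scripts and is not shown in this log.

```python
from typing import Callable


def swapin_with_linktest(level: int,
                         run_linktest: Callable[[int], int],
                         notify: Callable[[str], None]) -> bool:
    """Return the swapin result; a linktest failure only triggers a notification."""
    swapin_ok = True  # stands in for the real swapin, assumed to have succeeded
    if level > 0:     # level 0 means "no linktest", the stated default
        status = run_linktest(level)
        if status != 0:
            # Non-zero exit status: send email, but leave the swapin result alone.
            notify(f"linktest level {level} failed with exit status {status}")
    return swapin_ok
```

Passing the linktest runner and the mailer in as callbacks keeps the sketch testable without touching a real testbed.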
-
- 15 Apr, 2004 2 commits
-
-
Leigh Stoller authored
active.
-
Leigh Stoller authored
experiment. If non-zero, then add a menu option pointing to the wireless floormaps for that experiment.
-
- 14 Jan, 2004 1 commit
-
-
Leigh Stoller authored
-
- 17 Nov, 2003 1 commit
-
-
Leigh Stoller authored
state machine (state). All of the stuff that was previously handled by using batchstate is now embedded into the one state machine. Of course, these mostly overlapped, so it's not that much of a change, except that we also redid the machine, adding more states (for example, modify phases are now explicit). To get a picture of the actual state machine, on boss:

    stategraph -o newstates EXPTSTATE
    gv newstates.ps

Things to note:

* The "batchstate" slot of the experiments table is now used solely to provide a lock for the batch daemon. A secondary change will be to change the slot name to something more appropriate, but it can happen anytime after this new stuff is installed.
* I have left expt_locked for now, but another later change will be to remove expt_locked, and change it to active_busy or some such new state name in the state machine. I have removed most uses of expt_locked, except those that were necessary until there is a new state to replace it.
* These new changes are an implementation of the new state machine, but I have not done anything fancy. Most of the code is the same as it was before.
* I suspect that there are races with the batch daemon now, but they are going to be rare, and the end result is probably that a cancelation is delayed a little bit.
-
- 23 Oct, 2003 1 commit
-
-
Leigh Stoller authored
what window is what.
-
- 14 Oct, 2003 1 commit
-
-
Leigh Stoller authored
-
- 09 Oct, 2003 1 commit
-
-
Leigh Stoller authored
* install-rpm, install-tarfile, spewrpmtar.php3, spewrpmtar.in: Pumped up even more! The db file we store in /var/db now records both the timestamp (of the file, or if remote the install time) and the MD5 of the file that was installed. Locally, we can get this info when accessing the file via NFS (copymode on or off). Remote, we use wget to get the file, and so pass the timestamp along in the URL request, and let spewrpmtar.in determine if the file has changed. If the timestamp it gets is >= the timestamp of the file, an error code of 304 (Not Modified) is returned. Otherwise the file is returned.

  If the timestamps are different (remote, server sends back an actual file), the MD5 of the file is compared against the value stored. If they are equal, update the timestamp in the db file to avoid repeated MD5s (or server downloads) in the future. If the MD5 is different, then reinstall the tarball or rpm, and update the db file with the new timestamp and MD5. Presto, we have auto update capability!

  Caveat: I pass along the old MD5 in the URL, but it is currently ignored. I do not know if doing the MD5 on the server is a good idea, but obviously it is easy to add later. At the moment it happens on the node, which means wasted bandwidth when the timestamp has changed, but the file has not (probably not something that will happen in typical usage).

  Caveat: The timestamp used on remote nodes is the time the tarfile is installed (GM time of course). We could arrange to return the timestamp of the local file back to the node, but that would mean complicating the protocol (or using an http header) and I was not in the mood for that. In typical usage, I do not think that people will be changing tarfiles and rpms so rapidly that this will make a difference, but if it does, we can change it.

* node_update.in, client side watchdog, and various web pages: Deflated node_update, removing all of the older ssh code. We now assume that all nodes will auto update on a periodic basis, via the watchdog that runs on all client nodes, including plab nodes.

  Changed the permission check to look for the new UPDATE permission (used to be UPDATEACCOUNT). As before, it requires local_root or better. The reason for this is that node_update now implies more than just updating the accounts/mounts. The web pages have been changed to explain that in addition to mounts/accounts, rpms and tarfiles will also be updated. At the moment, this is still tied to a single variable (update_accounts) in the nodes table, but as Kirk requested at the meeting, it will probably be nice to split these out in the future.

  Added the ability to node_update a single node in an experiment (in addition to the all nodes option on the showexp page). This has been added to the shownode webpage menu options.

  Changed locking code to use the newer wrapper states, and to move the experiment to RUNNING_LOCKED until the update completes. This is to prevent mayhem in the rest of the system (which could be dealt with, but is not worth the trouble; people have to wait until their initiated update is complete before they can swap out the experiment).

  Added "short" mode to the shownode routine, equivalent to the recently added short mode for showexp. I use this on the confirmation page for updating a single node, giving the user a couple of pertinent (feel good) facts before they confirm.
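The timestamp-then-MD5 check described for remote nodes can be sketched in Python as below. This is a minimal model under stated assumptions, not the install-rpm or spewrpmtar.in code: InstallRecord, server_response() and node_update() are hypothetical names, timestamps are plain integers, and the HTTP exchange is reduced to a (status, body) pair.

```python
import hashlib
from dataclasses import dataclass
from typing import Optional, Tuple


@dataclass
class InstallRecord:
    """Hypothetical model of one entry in the node's db file under /var/db."""
    timestamp: int  # on remote nodes: the time the file was installed
    md5: str        # MD5 of the file that was installed


def server_response(file_mtime: int, file_bytes: bytes,
                    client_timestamp: int) -> Tuple[int, Optional[bytes]]:
    """Server side: 304 if the client's timestamp is >= the file's, else the file."""
    if client_timestamp >= file_mtime:
        return 304, None
    return 200, file_bytes


def node_update(record: InstallRecord, status: int,
                body: Optional[bytes], now: int) -> str:
    """Node side: decide whether to reinstall, and update the record in place."""
    if status == 304 or body is None:
        return "unchanged"
    digest = hashlib.md5(body).hexdigest()
    if digest == record.md5:
        # Same contents: only refresh the timestamp to avoid repeat downloads.
        record.timestamp = now
        return "timestamp-refreshed"
    # Contents differ: reinstall, then record the new timestamp and MD5.
    record.timestamp = now
    record.md5 = digest
    return "reinstalled"
```

As in the first caveat above, the MD5 comparison happens on the node in this sketch, so a changed timestamp with unchanged contents still costs one download.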
-
- 30 Sep, 2003 1 commit
-
-
Leigh Stoller authored
plus a lock field. The lock field was a simple "experiment locked, go away" slot that is easy to use when you do not care about the actual state that an experiment is in, just that it is in "transition" and should not be messed with.

The other two state variables are "state" and "batchstate". The former (state) is the original variable that Chris added, and was used by the tb* scripts to make sure that the experiment was in the state each particular script wanted it to be in. But over time (and with the addition of so much wrapper goo around them), "state" has leaked out all over the place to determine what operations on an experiment are allowed, and if/when it should be displayed in various web pages. There is a set of transition states in addition to the usual "active", "swapped", etc., like "swapping", that make testing state a pain in the butt.

I added the other state variable ("batchstate") when I did the batch system, obviously! It was intended as a wrapper state to control access to the batch queue, and to prevent batch experiments from being messed with except when it was really okay (for example, it's okay to terminate a swapped out batch experiment, but not a swapped in batch experiment, since that would confuse the batch daemon). There are fewer of these states, plus one additional state for "modifying" experiments.

So what I have done is change the system to use "batchstate" for all experiments to control entry into the swap system, from the web interface, from the command line, and from the batch daemon. The other state variable still exists, and will be brutally pushed back under the surface until it's just a vague memory, used only by the original tb* scripts. This will happen over time, and the "batchstate" variable will be renamed once I am convinced that this was the right thing to do and that my changes actually work as intended.

Only people who have bothered to read this far will know that I also added the ability to cancel an experiment swapin in progress. For that I am using the "canceled" flag (ah, this one was named properly from the start!), and I test that at various times in assign_wrapper and tbswap. A minor downside right now is that a canceled swapin looks too much like a failed swapin, and so tbops gets email about it. I'll fix that at some point (sometime after the boss complains).

I also cleaned up various bits of code, replacing direct calls to exec with calls to the recently improved SUEXEC interface. This removes some cruft from each script that calls an external script.

Cleaned up modifyexp.php3 quite a bit, reformatting and indenting. Also fixed it to not run the parser directly! This was very wrong; it should call nscheck instead. Changed to use the "nobody" group instead of group flux (made the same change in nscheck).

There is a script in the sql directory called newstates.pl. It needs to be run to initialize the batchstate slot of the experiments table for all existing experiments.
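Testing the "canceled" flag "at various times" during a swapin can be illustrated with the Python sketch below. It is a hypothetical model: run_swapin(), the phase list, and the SwapinCanceled exception are invented for illustration and do not reflect how assign_wrapper or tbswap are actually structured. Raising a dedicated exception is one possible way to tell a cancel apart from a failure, which the commit message says is not yet done.

```python
from typing import Callable, List, Tuple


class SwapinCanceled(Exception):
    """Raised when the canceled flag is seen before a phase starts (hypothetical)."""

    def __init__(self, phase: str) -> None:
        super().__init__(f"swapin canceled before phase {phase!r}")
        self.phase = phase


def run_swapin(phases: List[Tuple[str, Callable[[], None]]],
               is_canceled: Callable[[], bool]) -> List[str]:
    """Run each phase in order, checking the canceled flag before every one."""
    completed: List[str] = []
    for name, phase in phases:
        if is_canceled():
            # Distinct from an ordinary failure, so a caller could skip the
            # failure email for a user-requested cancel.
            raise SwapinCanceled(name)
        phase()
        completed.append(name)
    return completed
```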
-
- 19 Sep, 2003 1 commit
-
-
Leigh Stoller authored
based page that looks like the original Begin Experiment page. Be sure to look at the page in both admin mode and non-admin mode, since I had some trouble determining how swappable is treated these days. Oh, added the ability to convert non-batch experiments into batch, and back. The experiment must be unlocked and in the swapped state to go in either direction. Also added the cpu_usage and mem_usage slots for editing. I added a comment about planetlab only, since otherwise we would just confuse normal users who have no idea what they mean. I could conditionalize them on having plab nodes, but that's difficult to figure out in the web page when the experiment is swapped out, so let's not worry about it.
-
- 22 Aug, 2003 1 commit
-
-
Leigh Stoller authored
-
- 08 Aug, 2003 1 commit
-
-
Mac Newbold authored
-
- 07 Aug, 2003 1 commit
-
-
Leigh Stoller authored
Soon image when the image is not available (because the renderer is not finished yet). The default thumbsize is now 160, so do not call out to the renderer to generate the thumb on the showexp page; just take it from the DB.
-
- 29 Jul, 2003 1 commit
-
-
Leigh Stoller authored
showexp page that it's a batch experiment, by the menu options. Same deal in the swapexp output, plus some other minor cleanup. The only bug I found while trying to figure out the batchmode problem reported this morning by the FileMover people is that the cancelflag is not cleared after swapping a running batch experiment out, so even after reinjecting it into the queue, it will not run. Still, that does seem to be what the FileMover people reported.
-
- 18 Jul, 2003 1 commit
-
-
Jay Lepreau authored
-
- 17 Jul, 2003 1 commit
-
-
Mac Newbold authored
Lots of changes to the form, both functional and aesthetic. See the testbed ops mail logs for a list of all of them, and the rationale. Corresponding updates to the showexp "edit meta-data" stuff, so that it gets all the same error checks as the real form. Also some backend changes in batchexp to pass through all the new form values.
-
- 16 Jul, 2003 1 commit
-
-
Mac Newbold authored
-
- 25 Jun, 2003 1 commit
-
-
Leigh Stoller authored
-
- 04 Jun, 2003 3 commits
-
-
Mac Newbold authored
-
Mac Newbold authored
-
Mac Newbold authored
-
- 03 Jun, 2003 1 commit
-
-
Mac Newbold authored
more useful, by including the reasons and such. Also add a similar email message when they change the reasons or timeouts.
-
- 29 May, 2003 2 commits
-
-
Mac Newbold authored
-
Mac Newbold authored
name, your idleswap time, and your unswappable/noidleswap reasons if applicable. Also, make toggle send mail if people try to go unswappable or turn off their idleswap bit.
-
- 20 May, 2003 1 commit
-
-
Leigh Stoller authored
per Shashi's request.
-
- 15 May, 2003 1 commit
-
-
Leigh Stoller authored
-
- 14 May, 2003 1 commit
-
-
Chad Barb authored
Modify experiment is now deployed/documented... We can announce it tomorrow.
-
- 29 Apr, 2003 1 commit
-
-
Chad Barb authored
Various other changes to get Expt Modify ready for prime time.
- If assign fails on a modify, the experiment will be restored to its old state, *not* swapped out.
- Reboot option has been improved to reboot all nodes as part of os_setup, not in a separate step.
- Different assign error codes result in different retry behavior for assign_wrapper. (Follows Rob's change to assign to make it pass back a special code for non-retriable faults.)
- '64' bit in the assign_wrapper exit code indicates to tbswap that db/phys state hadn't been mucked with before the exit occurred (ergo, '65' and '1' are the common return codes, though the old 4,8,16,32 are still there for assign failing.)
- (tbswap still returns codes from assign wrapper)
- Added 5 sec pause between assign attempts.
- Cleaned up tbswap code.
- Physical state backup/restore removed from tbprerun, put into swapexp.
- Interfaces table now getting cleaned up correctly (Mike noticed problem)
- Changed menu display in showexp to show the "modify" menu option for swapped out experiments (like it used to.)
- A couple other changes.

Note: Still admin-only, but I plan to change that soon.

To do:
- Erase expt backups in /tmp after using them.
- Re-viz failed experiments.
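The '64' bit convention for the assign_wrapper exit code can be decoded as in the Python sketch below. The AssignResult type and decode_assign_wrapper_exit() are hypothetical names used only for illustration; the sketch assumes nothing about the exit code beyond what the message states, namely that the 64 bit means db/phys state was untouched and that 65 and 1 are the common codes.

```python
from dataclasses import dataclass

# From the commit message: the '64' bit means db/phys state was not modified.
STATE_UNTOUCHED_BIT = 64


@dataclass(frozen=True)
class AssignResult:
    """Hypothetical decoded view of an assign_wrapper exit code."""
    failed: bool
    state_untouched: bool
    base_code: int


def decode_assign_wrapper_exit(code: int) -> AssignResult:
    """Split an exit code into the 'untouched' flag and the remaining code."""
    return AssignResult(
        failed=(code != 0),
        state_untouched=bool(code & STATE_UNTOUCHED_BIT),
        base_code=code & ~STATE_UNTOUCHED_BIT,
    )
```

With this split, an exit code of 65 reads as "failed with code 1, nothing to restore", while a plain 1 reads as "failed with code 1 after state had been changed".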
-
- 28 Apr, 2003 1 commit
-
-
Leigh Stoller authored
The first three are aggregate tables, while the experiment stats table gets a record for each new experiment, and is updated when an experiment is swapped in/out, modified, or terminated. Look at the table to see what is tracked. Once the experiment_stats record is updated, the aggregate tables are updated as necessary. There are a bunch of ugly changes to assign_wrapper to get the stats. Note that pnodes is not incremented until an experiment successfully swaps in. This is in lieu of getting status codes; I'm not tracking failed operations yet, nor creating the log file that Jay wants. I'll do that in the next round of changes when we see how useful these numbers are. Most of the changes are to create/delete table entries where appropriate, and to display the records. Display is only under admin mode, and the display is raw; just a dump of the assoc tables in php. The last 100 experiment stats records are available via the Experiment List page, using the "Stats" show option at the top. Bad place, but will do for now.
-
- 23 Apr, 2003 1 commit
-
-
Leigh Stoller authored
-
- 21 Apr, 2003 1 commit
-
-
Mac Newbold authored
its own subsection, and admin funcs in another one. (These both only show up when you're red-dotted.) Reusable SUBMENUSECTION($title) calls added, so you can do this in any of these submenus.
-
- 17 Apr, 2003 1 commit
-
-
Mac Newbold authored
Reorder some options to put force swap (idle-swap) up near the other swap button, so I don't accidentally use the wrong one.
-