- 10 Jan, 2003 4 commits
-
-
Leigh B. Stoller authored
pcvm nodes until a better fix is decided.
-
Robert Ricci authored
-
Mike Hibler authored
-
Leigh B. Stoller authored
reloads. Usage is:
    sched_reload <options> -t pctype [pctype ...]
In other words, schedule a reload for nodes of a particular type (or types) like pc600, pc850, etc.
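A minimal sketch of the argument shape in that usage line, written in Python rather than the script's own language. Only `-t` and its one-or-more node types come from the commit message; the `build_parser` helper and the `types` attribute are hypothetical, and the unspecified `<options>` are deliberately left out.

```python
import argparse

def build_parser():
    # Hypothetical stand-in for the real script's option handling.
    parser = argparse.ArgumentParser(prog="sched_reload")
    # -t takes one or more node types, e.g. pc600 pc850.
    parser.add_argument("-t", dest="types", nargs="+", required=True,
                        metavar="pctype")
    return parser

# Example invocation: sched_reload -t pc600 pc850
args = build_parser().parse_args(["-t", "pc600", "pc850"])
print(args.types)
```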
-
- 09 Jan, 2003 2 commits
-
-
Mike Hibler authored
-
Leigh B. Stoller authored
-
- 08 Jan, 2003 8 commits
-
-
Mike Hibler authored
Once I actually did the math (duh), 64 or even 48 is way too high. 16 works much better for the static case.
-
Mike Hibler authored
Minor nits.
-
Kirk Webb authored
Another set of changes to make the script clean up logs left around by prior runs (skipped due to timeout waiting for close).
-
Kirk Webb authored
Small change to continue when one of several logfiles is not closed before the script times out on it (rather than exiting immediately). Can potentially leave behind old logs, but these will be picked up by subsequent runs.
-
Robert Ricci authored
-
Mac Newbold authored
-
Mike Hibler authored
-
Mike Hibler authored
-
- 07 Jan, 2003 16 commits
-
-
Mac Newbold authored
-
Leigh B. Stoller authored
real nodes get. Also, run a proper os_select on jailed nodes, *after* the os for the physical node is set up, since otherwise stated will not be happy.
Fixes for dealing with failed os_load. Previously, if os_load failed, os_setup would wait for those nodes anyway since it had no idea which nodes had failed (and we do not want to just quit from os_setup since that might cause a lot of extra power cycles). Now, for each node that got an os_load, check its eventstate; it should be in ISUP immediately after os_load exits (since that's what os_load waited for), and if it's not, then mark that node as failed. Note though that failed loads no longer result in the node going into hwdown, since 99 percent of the time it's a busted user image, not a hardware problem. I figure we will catch real hw errors via the reload daemon, when it sends email about nodes not finishing.
Do not bother with doing the vnode setup if any of the phys nodes failed to set up. That leads to cascading errors and prolongs the agony by another few minutes. Might revisit this later.
Remove local WaitTillAlive() function, and switch to using the version I put into libdb a couple of weeks ago.
Fix up a bunch of print statements to be nicer.
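The failed-load check described in this message can be sketched roughly as follows. This is an illustration in Python, not the actual os_setup code: `find_failed_loads`, the `eventstate_of` lookup, the node names, and the `RELOADING` placeholder state are all hypothetical; only the ISUP test and the decision to skip vnode setup come from the message.

```python
def find_failed_loads(loaded_nodes, eventstate_of):
    """Return the nodes that are not in ISUP right after os_load exits."""
    failed = []
    for node in loaded_nodes:
        # os_load waited for ISUP, so any other state means the load failed.
        if eventstate_of(node) != "ISUP":
            failed.append(node)
    return failed

# Hypothetical snapshot of event states taken after os_load returned.
states = {"pc1": "ISUP", "pc2": "RELOADING", "pc3": "ISUP"}
failed = find_failed_loads(["pc1", "pc2", "pc3"], states.get)

# Skip vnode setup entirely if any physical node failed.
run_vnode_setup = not failed
print(failed, run_vnode_setup)
```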
-
Robert Ricci authored
-
Robert Ricci authored
-
Robert Ricci authored
Simple command-line interface to the ready bits. Its primary purposes are:
* Manually report ready for nodes that can't do it themselves
* Get a list of which nodes are ready, so that you can figure out which one(s) aren't reporting in
* Clear ready bits so you can use them again without restarting the experiment
* Make it possible to poll ready bits on boss/ops
-
Leigh B. Stoller authored
-
Leigh B. Stoller authored
-
Leigh B. Stoller authored
calculation based on the size of the image file. Okay, to save all you folks from going to see what bit of dreck I came up with, here it is:
    my $sb = stat($imagepath);
    my $chunks = $sb->size / (1024 * 1024);
    $maxwait = int((($chunks / 100.0) * 25) + (4 * 60));
Note the replacement of one hardwired number (15) with several dozen new ones! I like it anyway, 'cause I hate waiting 2*15 minutes when a 60 second load fails.
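To make the arithmetic concrete, here is the same formula transliterated into Python and evaluated for a few image sizes. The function name `max_wait` and the sample sizes are made up for illustration; the result is presumably in seconds, given the `4 * 60` term, but the message does not say so explicitly.

```python
def max_wait(image_size_bytes):
    # One "chunk" per megabyte of image file.
    chunks = image_size_bytes / (1024 * 1024)
    # 25 units of wait per 100 chunks, plus a fixed base of 4 * 60.
    return int(((chunks / 100.0) * 25) + (4 * 60))

MB = 1024 * 1024
for size_mb in (0, 100, 400, 1000):
    print(size_mb, max_wait(size_mb * MB))
```

Under this formula an empty image gets the base wait of 240, and each additional 100 MB adds 25, so a 400 MB image yields 340.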
-
Leigh B. Stoller authored
-
Leigh B. Stoller authored
-
Mac Newbold authored
as the op_mode it is currently running. If not, force it and send mail. This fixes the "stuck-in-reloading" phenomenon we occasionally see when the states get messed up. Now anytime it loads PXEFRISBEE it should force the mode to RELOAD, and it will stop reloading the first time it hits RELOADDONE.
-
Robert Ricci authored
and osid in the partitions table, use that, but, if not, use the one from the os_info table. Previously, only the osid from the partitions table was used.
-
Mac Newbold authored
-
Mac Newbold authored
-
Mike Hibler authored
-
Mac Newbold authored
then remove special case for sending REBOOTING event in node_reboot/power when using NORMAL mode. Now SHUTDOWN is always sent. (Important side note: SHUTDOWN needs to be a valid state in every machine now.)
-
- 06 Jan, 2003 10 commits
-
-
Robert Ricci authored
(the IXPs) with more. Will probably put this check back in later, with the maximum number of links coming from the database, per node type.
-
Leigh B. Stoller authored
script name (basename, not full path). Also moved the "use" lines below the package declaration, as I think they are supposed to be.
-
Leigh B. Stoller authored
builds properly on Linux.
-
Leigh B. Stoller authored
-
Leigh B. Stoller authored
on Linux. Add a makefile ifdef for Linux/FreeBSD that sets up the proper set of programs. Note that the new image I'm making will have the event libraries installed!
-
Robert Ricci authored
in terms of not putting them in the reloadpending experiment, etc.
-
Mike Hibler authored
-
Mike Hibler authored
frisbee.redux makefile, not the imagezip makefile. Besides making more sense, this ensures that all the frisbee client objects get built with the same compiler (the imagezip/unzip code will be built with gcc30 if imagezip NTFS support is enabled).
-
Leigh B. Stoller authored
-
Leigh B. Stoller authored
percent, and optimize for space. Prelude to creating smaller jails on local nodes, as soon as I can get SFS running inside a jail the way I want it (in which case users will have access to their project and home dirs on the file server). Add Mike's IPADDR change, with slight modification. tmcd will specify the IP addresses as a comma-separated list, which is converted to -i options to pass to jail. The kernel will restrict bind to these IPs.
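The IPADDR conversion described here might look something like the sketch below. It is a Python illustration, not the actual script: the function name `jail_ip_options` and the sample addresses are hypothetical, and only the comma-separated input and the `-i` option come from the message.

```python
def jail_ip_options(ip_list):
    """Turn a comma-separated IP list into repeated -i options."""
    opts = []
    for ip in ip_list.split(","):
        ip = ip.strip()
        if ip:  # ignore empty entries such as a trailing comma
            opts.extend(["-i", ip])
    return opts

print(jail_ip_options("10.0.0.1,10.0.0.2"))
```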
-