- 17 Jul, 2001 1 commit
-
-
Leigh B. Stoller authored
a bootstatus field to the nodes table. os_setup sets this to one of okay, failed, or unknown. This is to be used with the still-to-be-defined method of specifying certain nodes that can fail to reboot on experiment creation. Right now sharks are wired to this, and this information is presented in the web page. It's also essential for the batch system, which needs to account for nodes that failed to reboot; otherwise batch experiments would never end. Might still need a way for an experiment to tell the batch system it's done, though.
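As a rough illustration of why the batch system cares about this field, here is a minimal Python sketch. It is not the testbed's code (the real scripts are Perl and read the nodes table); the names `BootStatus`, `nodes_settled`, and `failed_nodes` are hypothetical, and treating "unknown" as "still waiting on the node" is an assumption.

```python
from enum import Enum


class BootStatus(Enum):
    # The three values the commit message says os_setup can record.
    OKAY = "okay"
    FAILED = "failed"
    UNKNOWN = "unknown"


def nodes_settled(bootstatus_by_node):
    """Return True once no node is still 'unknown' (assumed to mean pending).

    A failed node counts as settled, so a batch run is not left waiting
    forever on a machine that will never come back.
    """
    return all(
        status is not BootStatus.UNKNOWN
        for status in bootstatus_by_node.values()
    )


def failed_nodes(bootstatus_by_node):
    """List the nodes that failed to reboot, sorted by name."""
    return sorted(
        node
        for node, status in bootstatus_by_node.items()
        if status is BootStatus.FAILED
    )
```

With this shape, a batch loop can stop waiting once `nodes_settled(...)` is true and then report whatever `failed_nodes(...)` returns.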
-
- 16 Jul, 2001 1 commit
-
-
Leigh B. Stoller authored
it's hardwired to the shark type, but needs to come from the DB somehow later. Mail is sent to the user (CC'ed to testbed-ops) when a node fails but the experiment continues setup.
-
- 13 Jul, 2001 1 commit
-
-
Leigh B. Stoller authored
Minor cleanups in os_setup, and move some code to libdb.
-
- 10 Jul, 2001 1 commit
-
-
Leigh B. Stoller authored
-
- 05 Jul, 2001 1 commit
-
-
Leigh B. Stoller authored
-
- 20 Jun, 2001 1 commit
-
-
Leigh B. Stoller authored
-
- 08 Jun, 2001 1 commit
-
-
Mac Newbold authored
-
- 16 May, 2001 1 commit
-
-
Leigh B. Stoller authored
to reboot the node.
-
- 10 May, 2001 1 commit
-
-
Leigh B. Stoller authored
proper headers. Split out some of the mail into testbed-logs, testbed-ops, and testbed-approval. Added a library to include from our Perl scripts. It contains a couple of mail helper functions, but will hopefully contain more as time goes by. Fixed a bug in the web interface that was causing breakage for people with multiple accounts. Mac and Jay noticed this when logging out and trying to join or create a project under a new or different name.
-
- 07 May, 2001 2 commits
-
-
Leigh B. Stoller authored
Mike's request.
-
Leigh B. Stoller authored
Makes for easier failure termination if one of the files does not exist.
-
- 03 May, 2001 1 commit
-
-
Leigh B. Stoller authored
replaced by the "images" table. New os_info table is added. New web pages to add and delete OSIDs to/from the os_info table, for use in the NS file. tb-create-os is gone. handle_os no longer operates on the tbcmds file, and no longer writes anything into the ir file. Moved the setting up of os state (nodes table) from os_setup to handle_os, where it should be. os_load and sched_reload now take a single argument, the name of the imageid from the images table.
-
- 11 Apr, 2001 1 commit
-
-
Leigh B. Stoller authored
mere users. os_load and os_setup reworked to use node_reboot.
-
- 27 Mar, 2001 1 commit
-
-
Leigh B. Stoller authored
-
- 26 Mar, 2001 1 commit
-
-
Leigh B. Stoller authored
-
- 25 Mar, 2001 2 commits
-
-
Leigh B. Stoller authored
-
Leigh B. Stoller authored
-
- 18 Mar, 2001 1 commit
-
-
Leigh B. Stoller authored
-
- 07 Mar, 2001 1 commit
-
-
Leigh B. Stoller authored
(or was it months)?
-
- 27 Jan, 2001 1 commit
-
-
Mac Newbold authored
Additions for detecting when nodes are down and moving them to testbed/down expt. Sends mail if something is down, waits 2.5 minutes currently before giving up.
-
- 16 Jan, 2001 1 commit
-
-
Leigh B. Stoller authored
wait for all of them. Also protect against an ssh hang with a surrounding alarm. Also reduce output for web page.
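The guard against a hung ssh can be sketched as follows. This is an illustrative Python version, not the original script: it uses the `timeout` argument of `subprocess.run` instead of a surrounding alarm, and the helper name `run_with_timeout` is hypothetical.

```python
import subprocess


def run_with_timeout(cmd, seconds):
    """Run cmd, giving up if it does not finish within `seconds`.

    Returns the command's exit code, or None if the time limit was hit.
    subprocess.run kills the child itself when the timeout expires.
    """
    try:
        result = subprocess.run(
            cmd,
            stdout=subprocess.DEVNULL,  # keep output quiet, as for a web page
            stderr=subprocess.DEVNULL,
            timeout=seconds,
        )
    except subprocess.TimeoutExpired:
        return None
    return result.returncode
```

A caller would pass the ssh command line as a list and treat a `None` result as a hung connection rather than a failed command.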
-
- 03 Jan, 2001 1 commit
-
-
Leigh B. Stoller authored
the testbed list.
-
- 22 Dec, 2000 1 commit
-
-
Leigh B. Stoller authored
rest to STDERR.
-
- 14 Dec, 2000 1 commit
-
-
Leigh B. Stoller authored
now, but don't actually do anything with the node. Leave it to the user to run the os_load script.
-
- 13 Dec, 2000 1 commit
-
-
Leigh B. Stoller authored
-
- 06 Dec, 2000 3 commits
-
-
Leigh B. Stoller authored
-
Leigh B. Stoller authored
os_setup and ifc_setup read the features list from the database for the currently assigned OS. os_setup will not enter a ping wait for the node to come back alive if the node does not support ping! ifc_setup will not try to do ifconfig stuff if the OS does not support ifconfig.
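The feature gating described here can be sketched like this. The comma-separated storage format and the names `parse_features` and `plan_setup` are assumptions made for illustration; the commit message does not say how the features list is stored in the database.

```python
def parse_features(raw):
    """Split a feature string such as "ping,ifconfig" into a set.

    The comma-separated format is assumed, not taken from the source.
    """
    return {item.strip().lower() for item in raw.split(",") if item.strip()}


def plan_setup(raw_features):
    """Decide which setup steps make sense for the assigned OS."""
    features = parse_features(raw_features)
    return {
        # Only wait for the node to answer pings if the OS can answer them.
        "wait_for_ping": "ping" in features,
        # Only attempt interface configuration if the OS has ifconfig.
        "run_ifconfig": "ifconfig" in features,
    }
```

An OS with an empty feature list would then skip both the ping wait and the ifconfig step.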
-
Leigh B. Stoller authored
via the create-os directive in the NS file. tbsetup/ir/handle_os.tcl - Do a validity check for the image given with set-node-os in the NS file, and propagate that information through to the IR file. Nothing is added to the DB. tbsetup/mkexpdir - Add a tftpboot to the list of experiment directories. The tftpd daemon now allows kernels from /proj. tbsetup/os_setup - Very hacky changes to allow for multiboot kernels. Read local images table and cross check against that for nodeos spec. Hardwire in "mb" as a special partition tag that says to not try and do too much with it. This should be changed to a DB check of some kind. On reboot, do not wait for these nodes to come alive since there is no way to determine if an oskit kernel (or any foreign kernel) is running.
-
- 05 Dec, 2000 1 commit
-
-
Leigh B. Stoller authored
for no more replies. Still not great, and this causes the loop to reboot all the machines to get kinda long. More important is that we have to wait until all the nodes reboot and come back so that the next part, tbrun, does not fail. That adds a bunch of time to this. Needs to parallelize the reboot and wait, but that's too hard to deal with right now.
-
- 04 Dec, 2000 3 commits
-
-
Leigh B. Stoller authored
next_boot_path so that OSKit-NETBOOT.SILENT does the right thing. This gets pulled from the disk_images table. Temporary though, since we want to allow more flexibility in how the OS and related stuff gets specified.
-
Leigh B. Stoller authored
appropriate for the node type). This prevents the user from editing the IR file directly and specifying a bogus image ID.
-
Leigh B. Stoller authored
checks to make sure that the OS image that has been specified in the IR file is valid. Need a database check for this. Also, no delta stuff yet; just the main OS partitions that are known to exist on the disk already. Guess we also need to worry about reloading the disk from this script. Sigh, still lots to do, but this will be okay for a little while.
-
- 01 Dec, 2000 1 commit
-
-
Leigh B. Stoller authored
-