1. 14 Dec, 2007 1 commit
  2. 13 Aug, 2007 1 commit
  3. 04 May, 2006 1 commit
    • Kirk Webb authored · 511e860b
      Function prototypes for perl 5.8
  4. 28 Mar, 2006 1 commit
    • Mike Hibler authored · 69b90e79
      Attempt to make firewall experiment swapout more robust by addressing a
      couple of MFS booting problems:
       * in the RPC power controller, make sure that an "on" command succeeds
         by checking the status, retrying if it failed (we already did this for
         "off")
       * if nodes fail to boot up the MFS after a power on, try again with a
         power cycle.  I have seen "power on" leave pc600s hung, and a power
         cycle seems to cure it.
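
      The two fixes above can be sketched roughly as follows. This is an
      illustrative Python sketch, not the actual power-controller code: the
      `send`, `status`, and `booted` callables, the retry count, and the
      command strings are all assumptions made for the example.

```python
from typing import Callable


def power_on_verified(node: str,
                      send: Callable[[str, str], None],
                      status: Callable[[str], str],
                      retries: int = 3) -> bool:
    """Issue "on" and confirm it took effect by checking status; retry if not."""
    for _ in range(retries):
        send(node, "on")
        if status(node) == "on":
            return True
    return False


def boot_mfs(node: str,
             send: Callable[[str, str], None],
             status: Callable[[str], str],
             booted: Callable[[str], bool],
             retries: int = 3) -> bool:
    """Power the node on; if it does not come up, try once more with a cycle."""
    if not power_on_verified(node, send, status, retries):
        return False
    if booted(node):
        return True
    # "power on" may have left the node hung, so fall back to a power cycle.
    send(node, "cycle")
    return booted(node)
```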
  5. 13 Mar, 2006 2 commits
    • Leigh Stoller authored · d8f8f9b4
      A set of changes to run "prepare" on a node just prior to an image
      being taken.
      
      The basic strategy is to have node_reboot (when -p option supplied)
      invoke a special command on the node that will cause the shutdown
      procedure to run prepare as it goes single user, but before the
      network is turned off and the machine rebooted. The output of the
      prepare run is captured and sent back via the tmcd BOOTLOG command and
      stored in the DB, so that create_image can dump that to the logfile
      (so that the person taking the image can know for certain that the
      prepare ran and finished okay).
      
      On Linux this is pretty easy to arrange since reboot is actually
      shutdown and shutdown runs the K scripts in /etc/rc.d/rc6.d, and at
      the end the node is basically in single user mode. I just added a new
      script to run prepare and send back the output.
      
      On FreeBSD this is a lot harder since there are no decent hooks.
      Instead, I had to hack up init (see tmcd/freebsd/init/{4,5,6}) with
      some simple code that looks for a command to run instead of going to a
      single user shell. The command (script) runs prepare, sends the output
      back to tmcd, and then does a real reboot.
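
      The shape of such a command (run prepare, send the output back, then
      reboot) might look like the sketch below. It is a Python illustration
      only: the real scripts are not shown in this log, and `report` and
      `reboot` are hypothetical callables standing in for the tmcd BOOTLOG
      upload and the real reboot.

```python
import subprocess
from typing import Callable, Sequence


def run_and_report(command: Sequence[str],
                   report: Callable[[str], None],
                   reboot: Callable[[], None]) -> int:
    """Run a command, hand its combined output to report(), then reboot()."""
    proc = subprocess.run(command,
                          stdout=subprocess.PIPE,
                          stderr=subprocess.STDOUT,  # keep errors in the log
                          text=True)
    # Send the output back even if the command failed, so that the person
    # taking the image can see what happened.
    report(proc.stdout)
    reboot()
    return proc.returncode
```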
      
      Okay, so how to get -p passed to node_reboot? I hacked up the
      libadminmfs code slightly to do that, with a new 'prepare' argument
      option. This may not be the best approach; might have to do this as a
      real state transition if problems develop. I will wait and see.
      
      Also, I changed www/loadimage.php3 to stream the output of
      create_image to the browser.
    • Mike Hibler authored · 66dfc7a3
      Reduce power cycle/on batch size when booting into the admin MFS because:
       * admin MFS is larger and had more problems with simultaneous reboots
      
       * power command did not support batching anyway (only node_reboot), so
         power ons were performed en masse, exacerbating problems
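
      Batching of this kind can be sketched as below. This is a Python
      illustration rather than the libadminmfs code; the function names, the
      `power_on` and `wait` callables, and the batch size are assumptions.

```python
from typing import Callable, Iterator, List, Sequence


def batches(nodes: Sequence[str], size: int) -> Iterator[List[str]]:
    """Yield consecutive groups of at most `size` nodes."""
    if size < 1:
        raise ValueError("batch size must be at least 1")
    for i in range(0, len(nodes), size):
        yield list(nodes[i:i + size])


def power_on_in_batches(nodes: Sequence[str],
                        size: int,
                        power_on: Callable[[str], None],
                        wait: Callable[[List[str]], None]) -> None:
    """Power on one small group at a time instead of all nodes at once."""
    for group in batches(nodes, size):
        for node in group:
            power_on(node)
        # Let this group get through MFS boot before starting the next one.
        wait(group)
```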
  6. 06 Dec, 2005 1 commit
  7. 17 Nov, 2005 1 commit
    • Mike Hibler authored · 4ec701e7
      1. Beef up "admin mode" support.
      * Add libadminmfs.pm with routines for entering and exiting the admin
        MFS and for executing commands in it.  Node admin and firewall
        swapout (see below) now use this; the image creation process does
        not yet.
      
      * Add swapout time hooks for running an admin mode process, likely to
        be used to collect swapout time state.  Currently controlled globally
        by two new sitevars.
      
      * Modified node_admin to use the library and added a "-c <command>"
        option to have nodes go into admin mode and run a command.  I don't
        really expect this to be useful, it was just a testing vehicle for
        the library.
      
      2. Improved the swapout process for firewalled experiments.  Largely
         just generalized what we already did for panic'ed experiments.
         At swapout, firewalled nodes are:
      
         - powered off
         - set to boot into admin mode and run a disk zapper
         - powered on
      
         The swapout process then waits for all nodes to successfully complete
         disk zapping, at which point the nodes are nfree'ed as usual.  Any
         failure of the above process marks the experiment as panic'ed (to
         ensure that we are involved in cleanup) and sends mail to testbed-ops
         describing the state of the nodes.
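
         The control flow described above can be sketched as follows. This is
         a Python illustration, not the swapout code itself: every callable
         and the "zapdisk" command name are hypothetical stand-ins, and
         `on_failure` represents both marking the experiment panic'ed and
         mailing testbed-ops.

```python
from typing import Callable, List, Set

ZAP_COMMAND = "zapdisk"  # hypothetical name for the disk zapper command


def swapout_firewalled(nodes: List[str],
                       power: Callable[[str, str], bool],
                       set_admin_command: Callable[[str, str], bool],
                       wait_for_zap: Callable[[List[str]], Set[str]],
                       free_nodes: Callable[[List[str]], None],
                       on_failure: Callable[[List[str]], None]) -> bool:
    """Power off, arm the zapper, power on, wait; free nodes only if all pass."""
    failed: Set[str] = set()
    steps = [
        lambda n: power(n, "off"),
        lambda n: set_admin_command(n, ZAP_COMMAND),
        lambda n: power(n, "on"),
    ]
    # Run each phase across all nodes before moving on to the next phase.
    for step in steps:
        for node in nodes:
            if node not in failed and not step(node):
                failed.add(node)

    pending = [n for n in nodes if n not in failed]
    done = wait_for_zap(pending)  # set of nodes that finished zapping
    failed.update(n for n in pending if n not in done)

    if failed:
        # Any failure: report it and leave the nodes allocated for cleanup.
        on_failure(sorted(failed))
        return False
    free_nodes(list(nodes))
    return True
```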
      
      3. Added the aforementioned disk zapper, a little C program in the MFS
         which zeroes out the MBR and partition boot blocks (but not the MBR
         partition table or FS superblocks).  This is added insurance that if
         a node somehow gets diverted after being nfree'ed but before getting
         the disk reloaded (e.g., goes to hwdown), we cannot accidentally
         boot from the disk.  This program gets installed in the admin MFS.
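
         The idea can be illustrated on an in-memory disk image. This Python
         sketch is not the C program from the MFS. It relies on the standard
         MBR layout (boot code in bytes 0-445, four 16-byte partition entries
         starting at byte 446) and makes two assumptions of its own: only the
         first `boot_sectors` sectors of each partition are cleared, and the
         two signature bytes at the end of the MBR are left in place.

```python
import struct

SECTOR = 512
PTABLE_OFFSET = 446   # the four 16-byte MBR partition entries start here
PTABLE_ENTRIES = 4


def zap_boot_blocks(disk: bytearray, boot_sectors: int = 1) -> None:
    """Zero MBR boot code and partition boot sectors; keep the partition table."""
    # Collect partition start sectors before touching the image.
    starts = []
    for i in range(PTABLE_ENTRIES):
        off = PTABLE_OFFSET + 16 * i
        entry = disk[off:off + 16]
        ptype = entry[4]                                   # 0 means unused
        lba_start, nsectors = struct.unpack_from("<II", entry, 8)
        if ptype != 0 and nsectors > 0:
            starts.append(lba_start)

    # Wipe the MBR boot code but not the partition table that follows it.
    disk[0:PTABLE_OFFSET] = bytes(PTABLE_OFFSET)

    # Wipe the leading boot sector(s) of each partition (assumed count).
    for lba in starts:
        begin = lba * SECTOR
        end = min(begin + boot_sectors * SECTOR, len(disk))
        if begin < end:
            disk[begin:end] = bytes(end - begin)
```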
      
      4. Related to firewalls, modified swapin to use the new documented
         "snmpit -N" to get the firewall VLAN number rather than parsing the
         output that was a side-effect of VLAN creation.