1. 08 Jun, 2009 1 commit
  2. 05 Dec, 2007 1 commit
  3. 17 Nov, 2005 1 commit
    • Mike Hibler's avatar
      1. Beef up "admin mode" support. · 4ec701e7
      Mike Hibler authored
      * Add libadminmfs.pm with routines for entering/exiting and executing
        commands in, the admin MFS.  Node admin and firewall swapout (see
        below) now use this, the image creation process does not yet.
      
      * Add swapout time hooks for running an admin mode process, likely to
        be used to collect swapout time state.  Currently controlled globally
        by two new sitevars.
      
      * Modified node_admin to use the library and added a "-c <command>"
        option to have nodes go into admin mode and run a command.  I don't
        really expect this to be useful, it was just a testing vehicle for
        the library.
      
      2. Improved the swapout process for firewalled experiments.  Largely
         just generalized what we already did for paniced experiments.
         At swapout, firewalled nodes are:
      
         - powered off
         - set to boot into admin mode and run a disk zapper
         - powered on
      
        The swapout process then waits for all nodes to successfully complete
        disk zapage, at which point the nodes are nfree'ed as usual.  Any
        failure of the above process, marks the experiment as panic'ed (to
        ensure that we are involved in cleanup) and sends mail to testbed-ops
        describing the state of the nodes.
      
      3. Added the aforementioned disk zapper, a little C program in the MFS
         which zeroes out the MBR and partition boot blocks (but not the MBR
         partition table or FS superblocks).  This is added insurance that if
         a node somehow gets diverted after being nfree'd but before getting
         the disk reloaded (e.g., goes to hwdown), that we cannot accidentally
         boot from the disk.  This program gets installed in the admin MFS.
      
      4. Related to firewalls, modified swapin to use the new documented
         "snmpit -N" to get the firewall VLAN number rather than parsing the
         output that was a side-effect of VLAN creation.
      4ec701e7
  4. 19 Oct, 2005 1 commit
  5. 18 May, 2005 1 commit
  6. 06 Jan, 2005 1 commit
  7. 22 Dec, 2004 3 commits
  8. 03 Aug, 2004 1 commit
  9. 12 Jan, 2004 1 commit
    • Leigh B. Stoller's avatar
      Death to proxydhcp; one less specialized daemon. DHCP will return the · 2b2b8ca1
      Leigh B. Stoller authored
      filename to boot, and all local nodes will boot the same pxeboot kernel,
      which has been extended to allow for jumping directly into a specific MFS
      (in addition to the usual testbed boot into a partition or multiboot
      kernel).
      
      Bootinfo and the bootwhat protocol extended to tell the client node what
      MFS to jump into directly, without a reboot. pxe_boot_path and
      next_pxe_boot_path are now deprecated, with bootinfo used to control which
      MFS to boot. Nodes now boot a single pxeboot kernel, and bootinfo tells
      them what to do next.
      
      Bootinfo greatly simplifed. temp_boot_osid has been added to allow for
      temporary booting of different kernels (such as with ndoe_admin or
      create_image). Unlike next_boot_osid which is a one-shot boot,
      temp_boot_osid causes the node to boot that OS until told not too.
      
      next_boot_path and def_boot_path in the nodes table are now ignored.
      Bootinfo gets path info strictly from the os_info table entry for the osid
      given in one of def_boot_osid, temp_boot_osid, or next_boot_osid.  This
      makes the selection of what to do in bootinfo a lot simpler (and for
      TBBootWhat in libdb). The os_info table also modified to include an MFS
      flag so that bootinfo knows to tell the client that the path refers to an
      MFS and not a multiboot kernel.
      
      Change to boot sequence; free nodes no longer boot into the default OSID.
      Instead, they are told to wait in pxeboot until told what to do, which
      will typically be when the node is allocated and a specific OSID
      picked. If the node needs to be reloaded, then the node is told to jump
      directly into the Frisbee MFS, which saves one complete reboot cycle
      whether the node has the requested OS installed, or not.  New program
      added called "bootinfosend" that is used by node_reboot to "wake up" up
      nodes sitting in pxewait mode, so that they query bootinfo again and boot.
      
      node_reboot changed to look at the event state of a node, and use
      bootinfosend to wake up nodes, rather then power cycle, since pxeboot does
      not repsond to pings. Retry (if the UDP packet is lost) is handled by
      stated.
      
      Event support added to bootinfo, to replace the event generation that was
      in proxydhcp. I have not included the caching that Mac had in proxydhcp
      since it does not appear that bootinfo packets are lost very
      often. Cleaned up all of the event and DB queury code to use lib/libtb for
      DB access, and moved all of the event code into a separate file.  The
      event sequence when a node boots now looks like this:
      
      	'SHUTDOWN'    --> 'PXEBOOTING'  (BootInfo)
      	'PXEBOOTING', --> 'PXEBOOTING'  (BootInfo Retry)
      	'PXEBOOTING', --> 'BOOTING'     (Node Not Free)
      	'PXEBOOTING', --> 'PXEWAIT'     (Node is Free)
      	'PXEWAIT',    --> 'PXEWAKEUP'   (Node Allocated)
      	'PXEWAKEUP',  --> 'PXEWAKEUP'   (Bootinfo Retry)
      	'PXEWAKEUP',  --> 'PXEBOOTING'  (Node Woke Up)
      
      Change stated to support resending PXEWAKEUP events when node times out.
      After 3 tries, node is power cycled. Other minor cleanup in stated.
      
      Clean up and simplify os_select, while adding support for temp_next_boot
      and removing all trace of def_boot_path and next_boot_path processing.
      Remove all pxe_boot_path and next_pxe_boot_path processing.  Changed
      command line interface to support "clearing" fields. For example,
      node_admin changed to call os_select like this to have the node
      temporarily boot the FreeBSD MFS:
      
      	os_select -t FREEBSD-MFS pcXXX
      
      which sets temp_boot_osid. To turn admin mode off:
      
      	os_select -c -t pcXXX
      
      which says to clear temp_boot_osid.
      
      sql/database-fill-supplemental.sql modifed to add os_info table
      entries for the FreeBSD, Frisbee, and newnode MFS's.
      
      Be sure to change dhcpd config, restart dhcp, kill proxydhcp, restart
      bootinfo,
      2b2b8ca1
  10. 18 Oct, 2002 1 commit
    • Mac Newbold's avatar
      Merge the newstated branch with the main tree. · 5c961517
      Mac Newbold authored
      Changes to watch out for:
      
      - db calls that change boot info in nodes table are now calls to os_select
      
      - whenever you want to change a node's pxe boot info, or def or next boot
      osids or paths, use os_select.
      
      - when you need to wait for a node to reach some point in the boot process
      (like ISUP), check the state in the database using the lib calls
      
      - Proxydhcp now sends a BOOTING state for each node that it talks to.
      
      - OSs that don't send ISUP will have one generated for them by stated
      either when they ping (if they support ping) or immediately after they get
      to BOOTING.
      
      - States now have timeouts. Actions aren't currently carried out, but they
      will be soon. If you notice problems here, let me know... we're still
      tuning it. (Before all timeouts were set to "none" in the db)
      
      One temporary change:
      
      - While I make our new free node manager daemon (freed), all nodes are
      forced into reloading when they're nfreed and the calls to reset the os
      are disabled (that will move into freed).
      5c961517
  11. 07 Jul, 2002 1 commit
  12. 23 May, 2002 2 commits
  13. 15 Oct, 2001 1 commit