1. 03 Aug, 2012 2 commits
  2. 01 Aug, 2012 2 commits
    • This commit adds some simple support for using the Infiniband on the · 997b21b5
      Leigh B Stoller authored
      Probe Cluster. The problem is that the IFB is a shared network that
      every node attaches to, which looks like an ethernet device that
      can be ifconfig'ed. In other words, one big lan.
      
      But we still want the user to be able to create a lan so that they can
      interact with it in their NS file like any other network.
      
      The NS syntax is:
      
      	set lan2 [$ns make-lan "node1 node2 node3" * 0ms]
      	tb-set-switch-fabric $lan2 "infiniband"
      
      The switch fabric tells the backend to do IP assignment for the
      specific global network. Yes, I tried to be a little bit general
      purpose. Let's see how this actually turns out.
      
      This first commit treats the fabric as a single big lan on the same
      subnet.
      
      NOTE 1: Since the unroutable IP space is kinda small, but the Probe
      Cluster is really big, we can easily run out of bits if we tried to do
      assignment on virtual topos. Instead, fabrics get their IP allocation
      at swapin time, and the allocations are deleted when the experiment is
      swapped out. The rationale is that the number of swapped in
      experiments is much smaller than the number of possible topos
      that can be loaded into the DB. Still might run out, but less likely.
      
      The primary impact of the above is that IP assignments can change from
      one swap to another, but this is easy to deal with if the user is
      scripting their experiment; the IP allocation is available via the
      XMLRPC interface.
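The swapin/swapout allocation described in NOTE 1 can be sketched roughly as follows. This is only an illustration, not the backend code: `FabricAllocator` and its methods are hypothetical names, the in-memory dict stands in for whatever the DB actually stores, and the subnet is the one from the example fabric definition in this commit message.

```python
import ipaddress


class FabricAllocator:
    """Swapin-time address pool for one shared fabric (hypothetical helper)."""

    def __init__(self, subnet, netmask):
        self.network = ipaddress.ip_network(f"{subnet}/{netmask}")
        self.in_use = {}  # (experiment, node) -> allocated address

    def swapin(self, experiment, nodes):
        """Give each node a free address; allocate nothing if the pool runs dry."""
        taken = set(self.in_use.values())
        free = (ip for ip in self.network.hosts() if ip not in taken)
        assigned = {}
        for node in nodes:
            ip = next(free, None)
            if ip is None:
                raise RuntimeError("fabric address space exhausted")
            assigned[node] = ip
        # Commit only after every node got an address.
        for node, ip in assigned.items():
            self.in_use[(experiment, node)] = ip
        return assigned

    def swapout(self, experiment):
        """Delete every allocation held by the experiment."""
        for key in [k for k in self.in_use if k[0] == experiment]:
            del self.in_use[key]


fabric = FabricAllocator("192.168.0.0", "255.255.0.0")
first = fabric.swapin("exp1", ["node1", "node2", "node3"])
fabric.swapin("exp2", ["nodeA"])
fabric.swapout("exp1")
fabric.swapin("exp3", ["nodeX"])      # reuses an address exp1 gave back
second = fabric.swapin("exp1", ["node1", "node2", "node3"])
print(first["node1"], second["node1"])
```

In this sketch node1 gets a different address on its second swapin, which is why a scripted experiment should look the allocation up rather than hardcode it.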
      
      NOTE 2: The current code allocates from a single big network, which
      makes it easy for users to mess each other up if they start doing
      things by hand. Ultimately, we want each lan in each experiment to use
      its own subnet, but that is going to take more work, so let's do it
      in the second phase.
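One way the second phase could look is to carve the fabric network into smaller per-lan subnets. The sketch below is an assumption about that future work, not something this commit implements: `carve_lan_subnets` is a hypothetical name, the /24 size is arbitrary, and the lan keys are made up.

```python
import ipaddress


def carve_lan_subnets(fabric_cidr, lan_names, new_prefix=24):
    """Assign each lan its own non-overlapping subnet of the fabric network."""
    fabric = ipaddress.ip_network(fabric_cidr)
    candidates = fabric.subnets(new_prefix=new_prefix)
    plan = {}
    for name in lan_names:
        subnet = next(candidates, None)
        if subnet is None:
            raise RuntimeError("no subnets left in fabric")
        plan[name] = subnet
    return plan


# Hypothetical "experiment/lan" keys, for illustration only.
plan = carve_lan_subnets("192.168.0.0/16", ["exp1/lan2", "exp2/lan0"])
for lan, subnet in plan.items():
    print(lan, subnet)
```

With a /16 fabric and /24 lans this gives 256 lans of up to 254 nodes each, so the prefix length would have to be chosen with the cluster size in mind.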
      
      The definition of "network fabrics" is in the new network_fabrics
      tables. As an example for probe:
      
      	INSERT INTO `network_fabrics` set
      		idx=NULL,
      		name='ifband',
      		created=now(),
      		ipalloc=1, ipalloc_onenet=1,
      		ipalloc_subnet='192.168.0.0',ipalloc_netmask='255.255.0.0'
    • Leigh B Stoller's avatar
      9155e015
  3. 30 Jul, 2012 1 commit
  4. 25 Jul, 2012 1 commit
  5. 24 Jul, 2012 1 commit
    • Add a 'disabled' field to the subbosses table. · e08bfeec
      Mike Hibler authored
      This allows us to more easily disable a subboss in the event of a temporary
      subboss outage (e.g., hardware failure). Previously we would have to remove
      the related rows from the DB and restore them later.
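A rough sketch of how such a flag might be used, with SQLite standing in for the real DB. Only the table name and the `disabled` column come from the commit message; the other columns (`node_id`, `service`, `subboss_id`), the sample rows, and the `active_subboss` helper are assumptions for illustration.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.execute(
    "CREATE TABLE subbosses ("
    " node_id TEXT, service TEXT, subboss_id TEXT,"
    " disabled INTEGER NOT NULL DEFAULT 0)"
)
conn.execute("INSERT INTO subbosses VALUES ('pc1', 'frisbee', 'subboss1', 0)")


def active_subboss(conn, node_id, service):
    """Return the subboss serving this node, ignoring disabled rows."""
    row = conn.execute(
        "SELECT subboss_id FROM subbosses"
        " WHERE node_id = ? AND service = ? AND disabled = 0",
        (node_id, service),
    ).fetchone()
    return row[0] if row else None


before = active_subboss(conn, "pc1", "frisbee")

# Temporary outage: flip the flag instead of deleting the row.
conn.execute("UPDATE subbosses SET disabled = 1 WHERE subboss_id = 'subboss1'")
during = active_subboss(conn, "pc1", "frisbee")

# Outage over: the mapping comes back without any restore step.
conn.execute("UPDATE subbosses SET disabled = 0 WHERE subboss_id = 'subboss1'")
after = active_subboss(conn, "pc1", "frisbee")
```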
  6. 23 Jul, 2012 1 commit
    • Minor tweak to "waitmode" in libreboot. · 735758ce
      Mike Hibler authored
      Previously ISUP and TBFAILED were the two states that signified that a
      node had rebooted to the satisfaction of libreboot (with waitmode==1).
      Add RELOAD/RELOADING to that list since the frisbee MFS never sends either
      of ISUP or TBFAILED.
      
      Required a modification to TBNodeStateWait() to allow waiting for an
      op-mode/state combo as well as just a state.
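The matching rule can be illustrated like this. It is a sketch of the idea only, not the TBNodeStateWait() code: `satisfies_wait` is a hypothetical name, it reads RELOAD/RELOADING as an (op-mode, state) pair, and the op-mode values in the usage lines are just examples.

```python
def satisfies_wait(op_mode, state, targets):
    """True if the node's current op-mode/state matches any wait target.

    A target is either a bare state name, matched in any op-mode,
    or an (op_mode, state) tuple, matched only as a pair.
    """
    for target in targets:
        if isinstance(target, tuple):
            if target == (op_mode, state):
                return True
        elif target == state:
            return True
    return False


# States that count as "rebooted" for waitmode==1, per this commit.
REBOOT_DONE = ["ISUP", "TBFAILED", ("RELOAD", "RELOADING")]

print(satisfies_wait("NORMALv2", "ISUP", REBOOT_DONE))
print(satisfies_wait("RELOAD", "RELOADING", REBOOT_DONE))
print(satisfies_wait("NORMALv2", "SHUTDOWN", REBOOT_DONE))
```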
      
      I made this change in anticipation that it would be useful for more
      responsive monitoring of failure in tbswap/os_load. But now I am not so sure.
  7. 17 Jul, 2012 3 commits
  8. 13 Jul, 2012 1 commit
  9. 11 Jul, 2012 1 commit
  10. 03 Jul, 2012 1 commit
  11. 02 Jul, 2012 4 commits
  12. 29 Jun, 2012 1 commit
  13. 26 Jun, 2012 1 commit
  14. 21 Jun, 2012 1 commit
  15. 12 Jun, 2012 3 commits
  16. 11 Jun, 2012 1 commit
  17. 07 Jun, 2012 1 commit
    • New script, clone_image to simplify create/snapshot from a node. · b01c991d
      Leigh B Stoller authored
      clone_image is a wrapper around newimageid_ez and create_image that
      simplifies the most common operation: creating a new imageid derived
      from the image/os that is currently running on the node, and then
      taking a snapshot of the node. So for example, if node pcXXX is
      running image FREEBSD, and you want to create a custom image from that
      node, all you need to do is:
      
      	boss> clone_image myfreebsd pcXXX
      
      which will create the new descriptor, deriving everything from the
      FREEBSD image on the node, and then take a snapshot from pcXXX. If
      the descriptor already exists, just take the snapshot.
      
      So what if you do:
      
      	boss> clone_image FREEBSD pcXXX
      
      Well, the image is always looked up in the project the node is
      currently attached to, so in fact a new descriptor is created in that
      project, and you do not actually overwrite an image from some other
      project. 
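The lookup rule can be sketched as below. This is a toy model, not the script itself: the dict keyed by (project, image name), the `clone_image` function signature, and the project names are all assumptions made for illustration.

```python
def clone_image(images, node, name):
    """Create the descriptor in the node's project if needed, then snapshot.

    images: dict keyed by (project, image name)   [hypothetical model]
    node:   dict with the node's current project and running image key
    """
    key = (node["project"], name)
    created = key not in images
    if created:
        # Derive the new descriptor from the image running on the node.
        images[key] = {"derived_from": node["running"], "snapshots": 0}
    images[key]["snapshots"] += 1   # stands in for taking the snapshot
    return key, created


images = {("emulab-ops", "FREEBSD"): {"derived_from": None, "snapshots": 0}}
node = {"id": "pcXXX", "project": "myproj", "running": ("emulab-ops", "FREEBSD")}

key1, created1 = clone_image(images, node, "FREEBSD")
key2, created2 = clone_image(images, node, "FREEBSD")
```

The first call creates ("myproj", "FREEBSD") and leaves the original image untouched; the second call finds the existing descriptor and only takes another snapshot.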
      
      I've added some locking to images to prevent concurrent snapshots.
      This seemed like a good idea since this script is going to be used
      from the ProtoGeni interface. More on this in another commit.
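The commit does not say how the image locking is implemented, so the following is only a guess at the general shape: a per-image lock that makes a second snapshot attempt fail fast instead of running concurrently. It uses in-process threading locks purely for illustration; a real multi-process setup would presumably need a lock held in the DB or the filesystem. `image_lock` is a hypothetical name.

```python
import threading
from contextlib import contextmanager

_locks = {}
_locks_guard = threading.Lock()


@contextmanager
def image_lock(image_key):
    """Hold an exclusive, non-blocking lock on one image for a snapshot."""
    with _locks_guard:
        lock = _locks.setdefault(image_key, threading.Lock())
    if not lock.acquire(blocking=False):
        raise RuntimeError(f"snapshot already in progress for {image_key}")
    try:
        yield
    finally:
        lock.release()


def second_attempt_blocked(image_key):
    """Try to start a second snapshot while the first still holds the lock."""
    with image_lock(image_key):
        try:
            with image_lock(image_key):
                return False
        except RuntimeError:
            return True
```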
  18. 06 Jun, 2012 2 commits
  19. 18 May, 2012 1 commit
  20. 17 May, 2012 1 commit
  21. 16 May, 2012 1 commit
    • Another protogeni checkbox; scriptify and simplify adding "special" · cf517af6
      Leigh B Stoller authored
      devices with network interfaces. Emulab's spp and bbg nodes are
      examples, but I did all that by hand, inserting SQL. An spp node is a
      shared node with some interfaces. Users can allocate one or more of
      those interfaces and establish vlans to the interfaces. The node is a
      "fakenode" in "shared" mode, and everything else falls out. The mapper
      assigns virtual nodes until all of the interfaces are allocated,
      snmpit does its work on the interfaces, and the user then does the
      rest.
      
      Anyway, to add a special device:
      
        boss> wap addspecialdevice -s -t goober goober1
      
      The -t argument is the name of the node type, created if it does not
      exist. The last argument is the name of the fakenode to create in the
      DB. The -s option says the special device is shared. Without -s, the
      device is allocated exclusively.
      
      Then to add interfaces to the device:
      
        boss> wap addspecialiface -b 1Gb -s cisco4,100,100 goober1 eth0
      
      The -b option is the speed (either 100Mb or 1Gb). The -s option is the
      switch side of the interface (switchname,card,port). The last two
      arguments are the nodename and iface name for the interfaces table.
      
      After the interface and wires table entry are added to the DB, snmpit
      is called to put the switch port into tagged mode (if the node is
      shared). To skip the snmpit step, add the -t option.
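For illustration, the -b and -s argument formats described above could be validated along these lines. This is not the addspecialiface code: the function names and the Kbps values in the speed table are assumptions.

```python
# Accepted -b values per the commit message; the Kbps numbers are assumed.
SPEEDS_KBPS = {"100Mb": 100_000, "1Gb": 1_000_000}


def parse_speed(label):
    """Map a -b argument to a numeric speed, rejecting anything else."""
    if label not in SPEEDS_KBPS:
        raise ValueError(f"speed must be one of {sorted(SPEEDS_KBPS)}")
    return SPEEDS_KBPS[label]


def parse_switch_side(spec):
    """Split a -s argument of the form switchname,card,port."""
    parts = spec.split(",")
    if len(parts) != 3:
        raise ValueError("expected switchname,card,port")
    name, card, port = parts
    return name, int(card), int(port)


switch, card, port = parse_switch_side("cisco4,100,100")
speed = parse_speed("1Gb")
```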
  22. 15 May, 2012 1 commit
  23. 07 May, 2012 2 commits
  24. 03 May, 2012 2 commits
    • Add support for fully initializing the ilo on geni rack nodes. · 164da3ba
      Leigh B Stoller authored
      The basic operational model is as follows.
      
      * We turn the nodes on.
      
      * Since there is nothing on the disks, they will fall through to
        booting from the PXE and will boot the newnode MFS. They all check
        in.
      
      * We run Jon's script that adds the nodes. They are now in hwdown,
        still nothing on the disks.
      
      * We run my script, which is driven from a datafile we are supposed to
        get from HP. The datafile has the ilomac, ilopswd, and control mac. I
        will add another column initially: the permanent IP to assign to the
        ilo. The script does:
      
       + Reads the datafile to get all the stuff.
       + Reads the dhcpd.leases file to find the temporary IPs of the ilos.
       + Finds the corresponding nodes in the DB.
       + Sends over an XML file that does the following:
           - Adds the elabman user.
           - Adds local root's dsa pub key to the new elabman user.
           - Adds Utah's root dsa key to the Administrator user.
           - Sets the power on mode to auto (so that the node turns on!).
           - Sets the idle timeout to 2 hours.
       + Sets the bootorder so that PXE is first. This has to be done
         with ssh and some expect stuff I culled from power_ilo. Sigh.
       + Calls out to another script that adds the ilo interface to the
         DB (this is the management_iface script I did last month).
       + Sends another XML file that tells the ilo to reset itself, so that
         it picks up its permanent IP address.
      
      * Now we can free the nodes from hwdown.
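The step that reads dhcpd.leases to find the temporary IPs of the ilos can be sketched as follows. This is not the actual script: the function names, the sample lease text, and the MAC and IP values are made up, and the regex only handles the plain `lease ... { hardware ethernet ...; }` form without nested blocks.

```python
import re

LEASE_RE = re.compile(
    r"lease\s+(?P<ip>\d+\.\d+\.\d+\.\d+)\s*\{(?P<body>.*?)\}", re.DOTALL
)
MAC_RE = re.compile(r"hardware\s+ethernet\s+(?P<mac>[0-9a-fA-F:]{17})\s*;")


def leases_by_mac(text):
    """Map each MAC to its most recent leased IP (later entries win)."""
    result = {}
    for lease in LEASE_RE.finditer(text):
        mac = MAC_RE.search(lease.group("body"))
        if mac:
            result[mac.group("mac").lower()] = lease.group("ip")
    return result


def temporary_ilo_ips(ilo_macs, leases_text):
    """Look up the temporary IP for every ilo MAC listed in the datafile."""
    leases = leases_by_mac(leases_text)
    return {mac: leases.get(mac.lower()) for mac in ilo_macs}


SAMPLE = """
lease 10.1.1.50 {
  starts 4 2012/05/03 10:00:00;
  hardware ethernet 00:1e:0b:aa:bb:01;
}
lease 10.1.1.51 {
  hardware ethernet 00:1E:0B:AA:BB:02;
}
lease 10.1.1.60 {
  hardware ethernet 00:1e:0b:aa:bb:01;
}
"""

found = temporary_ilo_ips(
    ["00:1e:0b:aa:bb:01", "00:1e:0b:aa:bb:02", "00:1e:0b:aa:bb:03"], SAMPLE
)
```

An ilo that has not checked in yet maps to None, which the caller would need to treat as "skip and retry later".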
    • Give management ports a hostname so that we know what it is · f00b2176
      Leigh B Stoller authored
      in the conf file.
  25. 01 May, 2012 1 commit
  26. 30 Apr, 2012 2 commits
  27. 24 Apr, 2012 1 commit