1. 08 Feb, 2002 1 commit
    • Leigh B. Stoller's avatar
      Big round of image/osid changes. This is the first cut (final cut?) at · a73e627e
      Leigh B. Stoller authored
      supporting autocreating and autoloading images. The imageid form now
      sports a field to specify a nodeid to create the image from; if set,
      the backend create_image script is invoked. That's the easy part.
      Slightly harder is autoloading images based on the osid specified in
      the NS file. To support this, I have added a new DB table called
      osidtoimageid, which holds the mapping from osid/pctype to imageid.
      When users create images, they must specify what node types that image
      is good for. Obviously, the mappings have to be unique or it would be
      impossible to figure it out! Anyway, once that image mapping is
      in place and the image created, the user can specify that ID in the NS
      file. I've changed os_setup to look for IDs that are not loaded,
      and to try and find one in the osidtoimageid table. If found, it invokes
      os_load. To keep things running in parallel as much as possible,
      os_setup issues all the loads/reboots (could be more than a single set
      of loads if multiple IDs are in the NS file) at once, and waits for
      all the children to exit. I've hacked up os_load a bit to try and be
      more robust in the face of PXE failures, which still happen and are
      rather troublesome. Need an event system!
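
The osid-to-image lookup described above can be pictured roughly as follows. This is a minimal sketch in Python, not the testbed's own code: the dictionary stands in for the osidtoimageid table, and `add_mapping`, `find_image`, `plan_loads`, and all sample IDs are hypothetical names chosen for illustration.

```python
# Stand-in for the osidtoimageid table: (osid, pctype) -> imageid.
# All names and sample values here are illustrative assumptions.
osidtoimageid = {}


def add_mapping(osid, pctype, imageid):
    """Register an image for an osid/pctype pair, enforcing uniqueness."""
    key = (osid, pctype)
    if key in osidtoimageid and osidtoimageid[key] != imageid:
        raise ValueError(f"mapping for {key} already exists")
    osidtoimageid[key] = imageid


def find_image(osid, pctype):
    """Return the imageid to load for this osid on this node type, or None."""
    return osidtoimageid.get((osid, pctype))


def plan_loads(nodes, loaded):
    """Group nodes that need a load by imageid, so each group can be issued at once.

    nodes:  dict of node_id -> (wanted_osid, pctype)
    loaded: dict of node_id -> osid currently on disk
    """
    loads = {}
    for node_id, (osid, pctype) in nodes.items():
        if loaded.get(node_id) == osid:
            continue  # already has the requested OS
        imageid = find_image(osid, pctype)
        if imageid is None:
            raise LookupError(f"no image for {osid} on {pctype}")
        loads.setdefault(imageid, []).append(node_id)
    return loads
```

Grouping by imageid mirrors the note that more than one set of loads may be needed when several IDs appear in the NS file; how the real os_setup batches its children is not shown here.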
      
      Contained in this revision are unrelated changes to make the OS and
      Image IDs per-project unique instead of globally unique, since that's a
      pain for the users. This turns out to be very messy, since underneath
      we do not want to pass around pid/ID in all the various places it's
      used. Rather, I create a globally unique name and extended the OS and
      Image tables to include pid/name/ID. The user selects pid/name, and I
      create the globally unique ID. For the most part this is invisible
      throughout the system, except where we interface with the user, say in
      the web pages; the user should see their chosen name where possible, and
      should invoke scripts (os_load, create_image, etc) using that
      name, not the internal ID. Also, in the front end the NS file should
      use the user's name, not the ID. All in all, this accounted for a number
      of annoying changes and some special cases that are unavoidable.
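
As a rough illustration of the pid/name/ID split described above, the following Python sketch keeps a per-project name on the user-facing side and a separate internal ID underneath. The commit does not say how the globally unique ID is built, so the counter-based scheme, the `register` and `resolve` helpers, and the sample names are all assumptions.

```python
import itertools

# (pid, name) -> internal ID; a stand-in for the extended OS/Image tables.
images = {}
_counter = itertools.count(1)


def register(pid, name):
    """Create an entry under a per-project name and return its internal ID."""
    key = (pid, name)
    if key in images:
        raise ValueError(f"{name} already exists in project {pid}")
    # Assumption: any scheme that yields a globally unique value would do.
    internal_id = f"{name}-{next(_counter)}"
    images[key] = internal_id
    return internal_id


def resolve(pid, name):
    """Translate the name the user typed into the ID used internally."""
    return images[(pid, name)]
```

Two projects can then both call `register(pid, "FBSD45")` without colliding, while scripts translate the user's name once at the boundary and pass only the internal ID further down.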
  2. 07 Feb, 2002 1 commit
  3. 01 Feb, 2002 1 commit
    • Mike Hibler's avatar
      Yet another attempt to Get It Right based on our latest half-assed · 609c547e
      Mike Hibler authored
      understanding of how mountd operates.
      
      Things that should be fixed:
      
      1. It iterates over every node calculating what directories are exported,
         what FSes they are on, etc.  Most of that work only needs to happen
         per experiment.
      
      2. The algorithm to determine what FS a directory is on is a hack.  I just
         take the first component of the path provided.  As long as we mount all
         our FSes at the top level and configure with canonical paths, this is
         ok.  Other solutions require calling out to the ops node to get actual
         mount info.
      
      3. Once shared experiments are revived, the code to determine exported
         directories will need to change.  The algorithm for computing the
         exports lines should still be correct.
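
The first-path-component heuristic from item 2 can be sketched as below. This is an illustrative Python version, not the script's actual code; `fs_of` and `group_by_fs` are hypothetical names, and grouping directories by filesystem is only one plausible use of the result.

```python
import posixpath


def fs_of(directory):
    """Guess the filesystem of a directory from its first path component.

    Only valid while every filesystem is mounted at the top level and
    paths are canonical, as the commit message notes.
    """
    path = posixpath.normpath(directory)
    if not path.startswith("/"):
        raise ValueError("expected an absolute path")
    parts = [p for p in path.split("/") if p]
    return "/" + parts[0] if parts else "/"


def group_by_fs(directories):
    """Group exported directories by the filesystem they are assumed to be on."""
    groups = {}
    for d in directories:
        groups.setdefault(fs_of(d), []).append(d)
    return groups
```

A symlinked or nested mount point would defeat this guess, which is why the alternative mentioned above is to ask the ops node for the actual mount information.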
  4. 31 Jan, 2002 1 commit
  5. 30 Jan, 2002 2 commits
  6. 28 Jan, 2002 2 commits
  7. 23 Jan, 2002 4 commits
    • Robert Ricci's avatar
      New option: -c. Clears all VLANs and re-creates them from the · cee6ab9c
      Robert Ricci authored
      database.  Intended to be used to recover switch state after a
      crash or power outage.
      
      This option is fairly dangerous, as it temporarily disrupts all
      experimental traffic, and will remove all hand-created VLANs.
      So, it interactively asks for confirmation that the user (who
      must be an admin, of course) really wants to do this.
    • Robert Ricci's avatar
      There should _NOT_ be die()s in the modules, as this prevents · ae79f02d
      Robert Ricci authored
      things from getting cleaned up on failure.
    • Robert Ricci's avatar
      Some minor API changes to increase efficiency for Intels. · 9bd1dded
      Robert Ricci authored
      First, the stack-level createVlan() function no longer takes as an
      argument a list of devices the VLAN exists on, since it looks like
      this will never be needed.
      
      In its place, createVlan() now takes a list of ports, so that it can
      (if so desired) put the ports in the VLAN without a separate lock and
      unlock.
      
      The snmpit_intel module now uses its 'nested locking' feature to avoid
      additional locking in these cases. Note though, that the way that this
      is done is not safe for multiple switches in a stack. If we ever have
      to support multiple Intels (looks doubtful), this will have to be
      removed, or locking will need to be moved a level up to
      snmpit_intel_stack. Yuck.
      
      For Intels, the removeVlan() function calls removePortsFromVlan()
      itself, again to save locking overhead. The Cisco behavior, however,
      is unchanged, as locking is not expensive, and this would be too
      messy.
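
The idea of nested locking, where an inner operation reuses a lock its caller already holds, can be sketched as follows. This is a Python illustration only: `NestedLock`, `Switch`, `set_ports`, and `create_vlan` are hypothetical names, the counter stands in for real lock traffic to the device, and nothing here reflects the actual snmpit_intel code.

```python
class NestedLock:
    """Counts nested acquisitions so only the outermost pair reaches the device."""

    def __init__(self):
        self.depth = 0
        self.device_ops = 0  # number of real lock/unlock round trips made

    def __enter__(self):
        if self.depth == 0:
            self.device_ops += 1  # hypothetical: take the switch's edit lock
        self.depth += 1
        return self

    def __exit__(self, *exc):
        self.depth -= 1
        if self.depth == 0:
            self.device_ops += 1  # hypothetical: release the switch's edit lock
        return False


class Switch:
    def __init__(self):
        self.lock = NestedLock()
        self.vlans = {}

    def set_ports(self, vlan, ports):
        with self.lock:
            self.vlans[vlan].update(ports)

    def create_vlan(self, vlan, ports=()):
        with self.lock:
            self.vlans[vlan] = set()
            if ports:
                # Nested acquisition: no extra lock round trip to the device.
                self.set_ports(vlan, ports)
```

Creating a VLAN together with its ports costs one lock and one unlock here instead of two of each. The sketch keeps its state per switch object and is not thread-safe, in keeping with the caveat above about stacks of multiple switches.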
  8. 22 Jan, 2002 5 commits
  9. 18 Jan, 2002 8 commits
  10. 17 Jan, 2002 4 commits
  11. 16 Jan, 2002 1 commit
  12. 15 Jan, 2002 1 commit
  13. 14 Jan, 2002 5 commits
    • Robert Ricci's avatar
      Intel support is now fully functional. This mostly involved making the · 223fff16
      Robert Ricci authored
      snmpit_intel module conform to the same API as snmpit_cisco.
      
      Intel VLANs are now done per port rather than per MAC. This should give
      experimenters more flexibility on the experimental net, and is more consistent
      with the way that VLANs are done on other switches.
      
      snmpit_intel_stack will need to undergo minor work to support stacks of
      multiple switches.
    • Leigh B. Stoller's avatar
      Fix minor Chris typo. · 7bf7645d
      Leigh B. Stoller authored
    • Leigh B. Stoller's avatar
      Make Frisbee.Redux live: · d08b5e41
      Leigh B. Stoller authored
      * Add appropriate goo to os/GNUMakefile so that Frisbee daemon is
        built and installed.
      
      * Rework the frisbee launcher slightly. Aside from little changes
        (send email to tbops when frisbeed dies, new cmdline syntax to
        frisbeed), allow for frisbeed to exit gracefully after a period of
        inactivity (no client requests for 30 minutes, at present). In order
        to prevent a race condition with a new client being added (and
        rebooted) and frisbeed terminating before the client gets started,
        add a load_busy indicator to the images table (next to load_address
        slot) and set that to one each time frisbeelauncher is invoked.
        When frisbeed exits, test and clear that bit atomically (lock
        tables) and go around another time (restart frisbeed for another 30
        minute period).
      
      * Rework waitmode in os_load. Wait for all of the nodes to finish at
        once, and track which nodes never finish. Retry those nodes again by
        rebooting. The number of retries is configurable in the script, and
        is currently set to one. This should take care of some PXE boot
        related problems, although obviously not all.
      
      * Got rid of -w option to os_load and made waitmode the default. The
        -s option can be used to start a reload, but not to wait for it to
        complete.
      
      * Minor changes to sched_reload and reload_daemon; pass in -s option
        to os_load.
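
The load_busy handshake described in the launcher item above amounts to an atomic test-and-clear. The Python sketch below illustrates the pattern under stated assumptions: a `threading.Lock` stands in for the database table lock, a dictionary stands in for the images table row, and the function names are hypothetical.

```python
import threading

# Stand-in for one row of the images table; the lock plays the role of
# locking the table around the read-and-reset.
table_lock = threading.Lock()
image = {"load_busy": 0}


def launcher_invoked():
    """Called each time the launcher is invoked for this image."""
    with table_lock:
        image["load_busy"] = 1


def test_and_clear_busy():
    """Atomically read and reset load_busy; True means a client showed up."""
    with table_lock:
        was_busy = image["load_busy"] == 1
        image["load_busy"] = 0
        return was_busy


def should_restart_after_idle_exit():
    """When the daemon exits after its idle period, decide whether to go around again."""
    return test_and_clear_busy()
```

Because the read and the reset happen under the same lock, a client that registers just before the daemon exits is seen as busy and triggers another idle period rather than being stranded.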
    • Christopher Alfeld's avatar
      Stripped out the portmap stuff. It wasn't doing anything anyway since · 7773b1e0
      Christopher Alfeld authored
      code had been added in a bunch of places to clear the data out.
      
      The portmap table in the database can now be dropped.
  14. 11 Jan, 2002 2 commits
  15. 10 Jan, 2002 2 commits
    • Leigh B. Stoller's avatar
      A set of capture/capserver/DB changes. · 8ec05f0d
      Leigh B. Stoller authored
      Capserver and capture now handshake the owner/group of the tipline.
      Owner defaults to root, and the group defaults to root when the
      node is not allocated. Capture will do the chmod after the handshake,
      so if boss is down when capture starts, the acl/run file will get 0,0,
      but will get the proper owner/group later after it's able to handshake.
      As a result, console_setup.proxy was trimmed down and cleaned up a
      bit, since it no longer has to muck with some of this stuff.
      
      A second change was to support multiple tiplines per node. I have
      modified the tiplines table as such:
      
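      	+---------+-------------+------+-----+---------+-------+
      	| Field   | Type        | Null | Key | Default | Extra |
      	+---------+-------------+------+-----+---------+-------+
      	| tipname | varchar(32) |      | PRI |         |       |
      	| node_id | varchar(10) |      |     |         |       |
      	| server  | varchar(64) |      |     |         |       |
      	+---------+-------------+------+-----+---------+-------+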
      
      That is, the name of the tip device (given to capture) is the unique
      key, and there can be multiple tiplines associated with each node.
      console_setup now uses the tiplines table to determine what tiplines
      need to be reset; used to be just the name of the node_id passed into
      console_setup. Conversely, capserver uses the tipname to map back to
      the node_id, so that it can get the owner/group from the reserved
      table.
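
The two directions of lookup described above, node to tiplines and tipname back to node, can be sketched as follows. This is a Python illustration with made-up rows; the sample tip names and servers and the helper names `tiplines_for_node` and `node_for_tipline` are assumptions, not taken from the real scripts.

```python
# Rows of a hypothetical tiplines table, keyed by the unique tipname.
tiplines = {
    "pc1": {"node_id": "pc1", "server": "tipserv1"},
    "pc1-aux": {"node_id": "pc1", "server": "tipserv1"},
    "pc2": {"node_id": "pc2", "server": "tipserv2"},
}


def tiplines_for_node(node_id):
    """What console_setup needs: every tipline attached to a node."""
    return sorted(name for name, row in tiplines.items() if row["node_id"] == node_id)


def node_for_tipline(tipname):
    """What capserver needs: map a tip device name back to its node."""
    row = tiplines.get(tipname)
    return row["node_id"] if row else None
```

With tipname as the key, a node with two serial lines is simply two rows sharing a node_id, which is the kind of case the special-purpose hack mentioned below would otherwise have to handle.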
      
      I also removed the shark hack from nalloc, nfree, and console_reset,
      since there is no longer any need for that; this can be described
      completely now with tiplines table entries. If we ever bring the
      sharks back, we will need to generate new entries. Hah!
    • Robert Ricci's avatar
      Most of the time, findVlan() retries up to 10 times to find a VLAN, to · 6a3140fa
      Robert Ricci authored
      account for the time it may take for changes made at the master to
      propagate to the slaves. Added a parameter to override this, as sometimes,
      we know that we're talking to the master so the delay does not come into
      play.
      
      This should improve the running time of snmpit by about 10 seconds per VLAN
      created, since we can tell right away if the VLAN already exists or not.
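
A retry loop with an overridable attempt count, as described above, might look like the following Python sketch. The `lookup` callable, the `retries`, `delay`, and `sleep` parameters, and the one-second delay are all assumptions for illustration; the real findVlan() signature is not shown in the commit message.

```python
import time


def find_vlan(lookup, vlan_id, retries=10, delay=1.0, sleep=time.sleep):
    """Look up a VLAN, retrying to allow for master-to-slave propagation.

    lookup:  hypothetical callable returning the VLAN number or None
    retries: total attempts; pass 1 when talking to the master directly
    sleep:   injectable so callers and tests can avoid real waiting
    """
    for attempt in range(retries):
        result = lookup(vlan_id)
        if result is not None:
            return result
        if attempt < retries - 1:
            sleep(delay)  # give the change time to propagate
    return None
```

Calling `find_vlan(lookup, vlan_id, retries=1)` returns after a single query, which is where the saving comes from when checking whether a VLAN already exists on the master.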