1. 24 Jan, 2003 3 commits
  2. 21 Jan, 2003 2 commits
  3. 17 Jan, 2003 2 commits
    • Leigh B. Stoller's avatar
      Ah yes, I can waste time like the best of the best. Actually, I'm just · 88fdc7f0
      Leigh B. Stoller authored
      waiting for Mike to work on Jail. This is an auditing module. It's
      intended to serve two purposes. 1) Provide a common set of routines
      for generating all that audit email from various scripts and 2)
      provide a debugging hook for when things screw up via the web
      interface and the user is too clueless to help us out, or the
      information just got lost someplace.
      
      The main function is:
      
      	AuditStart($daemonize;$logname);
      
      To start an audit, call AuditStart(). The first arg indicates whether the
      caller wants to daemonize. If not, just redirect stdout/stderr to a logfile
      and return. The logfile is optional; if not provided, one will be created
      with mktemp, based on the name of the script. If the caller wants to
      daemonize, also fork and detach. This is provided as a convenience for
      those scripts that tend to combine redirecting output and daemonizing. The
      parent is expected to exit, like any good mother of a daemon.
      
      Okay, so all output is redirected to the log file. If a subscript is
      invoked that also calls audit, that call is ignored, under the assumption
      that the logging can be rolled into the parent, and besides it would create
      a blizzard of email.
      
      There is a package destructor (END) that is set up to email a short record
      to the audit list, and if a log was created, the log goes to the logs
      list. This is important. The audit list never gets any logs; it just gets a
      two-line record of what was done ("rmacct mike (by stoller)"). The log goes
      separately to the logs list for inspection if needed at some point. This
      makes the audit list very concise and easy to distill.
      
      Like all good package destructors, you can tell if the script was
      exiting with an error. If it was, then instead of sending the mail to
      the logs list, send the message to tbops! The nice thing is that this
      gets invoked no matter how you exit! No need to explicitly send the
      email, unless of course you want it, but I have not written that
      function yet!
      
      Oh, for debugging. We can go stick in Audit calls when scripts
      misbehave, and we can watch the output.
      
      Cool, right? Really useful, right? Handy Dandy, right? Mike?
      88fdc7f0
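The routing described in the message above can be sketched roughly as follows. This is a Python illustration, not the module itself (the real code is Perl and is not shown here); `default_logname` and `mail_plan` are hypothetical names, and the sketch assumes the audit list still gets its short record when the script exits with an error, which the message does not spell out.

```python
import os
import tempfile


def default_logname(script_path):
    """Create a unique, empty logfile named after the calling script.

    Stands in for the mktemp-based default described in the commit message.
    """
    base = os.path.basename(script_path)
    fd, path = tempfile.mkstemp(prefix=base + ".", suffix=".log")
    os.close(fd)
    return path


def mail_plan(exit_status, log_path=None):
    """Return (list_name, payload) pairs describing who gets what at exit.

    Assumption: the audit list always gets the short record; the log itself
    goes to "logs" on success and to "tbops" on a non-zero exit.
    """
    plan = [("audit", "summary")]
    if log_path is not None:
        dest = "tbops" if exit_status != 0 else "logs"
        plan.append((dest, log_path))
    return plan
```

In a real script the plan would be computed in an exit hook (the END block in Perl), so it runs no matter how the script exits.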
    • Robert Ricci's avatar
      New features: · 2d7b6b82
      Robert Ricci authored
          Accept [switch.]<module>/<port> format for ports, so that we can
      	deal with ports not in the database (mostly for my own
      	debugging sanity.)
          A -n option that prevents assign from changing hardware settings
          	(though, unlike TESTMODE, does read some information from
      	the switches)
          Private VLAN support, through the -x, -y, and -z switches. There
      	are only 5 letters of the alphabet left, so I've given up on
      	mnemonic switches.
          Worked a bit on making VLAN deletion more efficient, but with
          	little success
      
      Private VLANs work like so:
      Make a primary private VLAN with:
      snmpit -m myvlan-primary -y primary
      Attach a community VLAN to it like so:
      snmpit -m myvlan-community -y community -x myvlan-primary -z cisco2.1/15
      Put some ports into the community VLAN:
      snmpit -m myvlan-community pc1:0 pc2:0
      2d7b6b82
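For illustration, the `[switch.]<module>/<port>` format mentioned above could be parsed along these lines. This is a Python sketch, not snmpit's actual parser; it assumes that module and port are decimal numbers and that switch names contain no `.`, `/`, or `:`, none of which the message confirms. `parse_port` is a hypothetical name.

```python
import re

# Optional "switch." prefix, then module/port; all assumptions as noted above.
PORT_RE = re.compile(
    r"^(?:(?P<switch>[^./:]+)\.)?(?P<module>\d+)/(?P<port>\d+)$"
)


def parse_port(spec):
    """Return (switch_or_None, module, port), or None if spec is another format."""
    m = PORT_RE.match(spec)
    if m is None:
        return None
    return m.group("switch"), int(m.group("module")), int(m.group("port"))
```

With these assumptions, `cisco2.1/15` from the example above splits into switch `cisco2`, module 1, port 15, while a node-style port such as `pc1:0` does not match and would be handled by the existing database lookup.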
  4. 15 Jan, 2003 7 commits
  5. 14 Jan, 2003 1 commit
  6. 13 Jan, 2003 1 commit
  7. 10 Jan, 2003 3 commits
  8. 09 Jan, 2003 1 commit
  9. 07 Jan, 2003 7 commits
    • Leigh B. Stoller's avatar
      Changes for setting up jailed nodes, which need checks similar to what · 5ab15776
      Leigh B. Stoller authored
      real nodes get. Also, run a proper os_select on jailed nodes, *after*
      the os for the physical node is set up, since otherwise stated will not
      be happy.
      
      Fixes for dealing with failed os_load. Previously, if os_load
      failed, os_setup would wait for those nodes anyway since it had no idea
      what nodes had failed (and we do not want to just quit from os_setup
      since that might cause a lot of extra power cycles). Now, for each
      node that got an os_load, check its eventstate; it should be in ISUP
      immediately after os_load exits (since that's what os_load waited for),
      and if it's not, then mark that node as failed. Note though that failed
      loads no longer result in the node going into hwdown, since 99 percent
      of the time it's a busted user image, not a hardware problem. I figure
      we will catch real hw errors via the reload daemon, when it sends
      email about nodes not finishing.
      
      Do not bother with doing the vnode setup if any of the phys nodes
      failed to set up. Leads to cascading errors and prolongs the agony by
      another few minutes. Might revisit this later.
      
      Remove local WaitTillAlive() function, and switch to using the version
      I put into libdb a couple of weeks ago.
      
      Fix up a bunch of print statements to be nicer.
      5ab15776
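The failed-load check described above amounts to partitioning the loaded nodes by their event state. A minimal Python sketch, assuming the states are available as a plain mapping from node name to state string (the real code queries the database, and `split_by_eventstate` is a hypothetical name):

```python
def split_by_eventstate(loaded_nodes, eventstate):
    """Split nodes that got an os_load into (ok, failed).

    A node counts as ok only if it is in ISUP right after os_load exits;
    anything else, including a missing state, is treated as a failed load.
    """
    ok, failed = [], []
    for node in loaded_nodes:
        if eventstate.get(node) == "ISUP":
            ok.append(node)
        else:
            failed.append(node)
    return ok, failed
```

os_setup would then wait only on the `ok` list and report the `failed` list, without sending those nodes to hwdown.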
    • Leigh B. Stoller's avatar
      Remove hardwired 15 minute wait, and replace with a hardwired · f7b3e7b7
      Leigh B. Stoller authored
      calculation based on the size of the image file. Okay, to save all
      you folks from going to see what bit of dreck I came up with, here it
      is:
      
          my $sb     = stat($imagepath);
          my $chunks = $sb->size / (1024 * 1024);
          $maxwait   = int((($chunks / 100.0) * 25) + (4 * 60));
      
      Note the replacement of one hardwired number (15) with several dozen
      new ones!
      
      I like it anyway, cause I hate waiting 2*15 minutes when a 60 second
      load fails.
      f7b3e7b7
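To get a feel for the numbers, here is the same calculation transliterated from the Perl above into Python. The function name is hypothetical, and the result is presumably in seconds (the `4 * 60` term suggests a four-minute floor), though the message does not state the unit.

```python
def max_wait(image_size_bytes):
    """Mirror of the commit's formula: 25 per 100 MB of image, plus 4 * 60."""
    chunks = image_size_bytes / (1024 * 1024)  # image size in megabytes
    return int(((chunks / 100.0) * 25) + (4 * 60))
```

Under that reading, a 100 MB image gets 265 and a 400 MB image gets 340, well short of the old 15 minutes.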
    • Leigh B. Stoller's avatar
      Minor tweaks. · 4a309126
      Leigh B. Stoller authored
      4a309126
    • Mac Newbold's avatar
      81fea716
    • Mac Newbold's avatar
      Add SHUTDOWN as a legal state (identical to REBOOTING) in NORMAL mode, · 64ad3e5d
      Mac Newbold authored
      then remove special case for sending REBOOTING event in node_reboot/power
      when using NORMAL mode. Now SHUTDOWN is always sent. (Important side note:
      SHUTDOWN needs to be a valid state in every machine now.)
      64ad3e5d
  10. 06 Jan, 2003 2 commits
  11. 31 Dec, 2002 4 commits
    • Leigh B. Stoller's avatar
      Clean up permission check. · c832fa47
      Leigh B. Stoller authored
      Remove the sanity check of the experiment state.
      Add check for a local node and do not setup/teardown since the reboot
      will take care of that (jailed nodes are set up at boot time, and obviously
      they are going to get torn down when the node goes down!).
      c832fa47
    • Leigh B. Stoller's avatar
      Do not allow users to specify the osid for nodes that are virtual · d50073b0
      Leigh B. Stoller authored
      (jailed) or for the nodes that are hosting virtual nodes. The checks
      are here so that errors are caught early on, and because it's better
      than messing with assign_wrapper!
      d50073b0
    • Leigh B. Stoller's avatar
      cb57bc7d
    • Leigh B. Stoller's avatar
      Add support for rebooting jailed (virtual) nodes, either remote or · ab8b901f
      Leigh B. Stoller authored
      local. For local nodes, need to cull out jailed nodes if the phys node
      is also going to reboot. Jailed nodes are rebooted serially since they
      go down much faster.
      
      Fix up recently added wait mode for jailed nodes. Also, I noticed that
      I was having problems with events not filtering through stated before
      going into the ISUP wait loop; I was catching the nodes still in ISUP
      instead of SHUTDOWN. I added a sleep(2) before going into wait mode,
      but this might be something to watch out for elsewhere too.
      ab8b901f
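The workaround described above (pause briefly so the SHUTDOWN event can reach stated, then poll for ISUP) might look roughly like this in Python. It is a sketch under stated assumptions, not the actual node_reboot code: `get_state` is a hypothetical callable returning the node's current event state, and the timeout and poll values are made up.

```python
import time


def wait_for_isup(get_state, timeout=60.0, settle=2.0, poll=1.0,
                  sleep=time.sleep, clock=time.monotonic):
    """Return True once the node reports ISUP, False on timeout.

    The initial settle delay mirrors the sleep(2) in the commit: without it,
    the first poll can still see the stale pre-reboot ISUP.
    """
    sleep(settle)
    deadline = clock() + timeout
    while clock() < deadline:
        if get_state() == "ISUP":
            return True
        sleep(poll)
    return False
```

A fixed delay only narrows the race; a stricter version would first wait to see the node leave ISUP before waiting for it to return.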
  12. 30 Dec, 2002 3 commits
  13. 23 Dec, 2002 1 commit
  14. 20 Dec, 2002 2 commits
  15. 19 Dec, 2002 1 commit
    • Leigh B. Stoller's avatar
      Two real changes: · 04c20246
      Leigh B. Stoller authored
      1) Add support for local jailed nodes. This support overlaps in a nasty way
         with remote jailed nodes, but I added this for testing purposes, and as
         it turns out it's pretty handy. A second pass is needed to unify remote
         and local jails, but for now this is how it goes:
      
        	tb-set-hardware $node3  pc600
        	tb-set-hardware $nodev1 pcvm600
        	tb-fix-node $nodev1 $node3
      
        So, "fix" $nodev1 to $node3. The intent is that once $node3 is
        allocated by assign to a real testbed node, we can then allocate a
        virtual node on pcXX to $nodev1. I did this primarily to allow for
        easy testing of jails via my NS file, without having to hack
        assign_wrapper too deeply. So, after assign runs, I use avail to get the
        available vnodes on the assigned pcXX, and allocate those for the virtual
        nodes. At present, we still depend on pre-existing pcvm nodes for each
        real node.
      
      2) Add code to assign non-overlapping port ranges to each experiment. This
         could be moved to an external script, but is fine right here. There is
         an ipport_ranges table for determining a testbed wide range (currently
         256 ports). This is of course only meaningful when using jailed nodes,
         so do not bother to set a range (and use up the port space) if no jailed
         (virtual) nodes.
      04c20246
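One way to hand out non-overlapping ranges like those in item 2 is a first-fit scan over the ranges already allocated. The Python sketch below is illustrative only: it assumes the existing allocations are available as inclusive `(low, high)` pairs (presumably what the ipport_ranges table provides), and the base port of 30000 is a made-up value, not one from the commit.

```python
def next_port_range(allocated, base=30000, size=256):
    """Return the lowest free inclusive (low, high) range of `size` ports.

    `allocated` holds the inclusive (low, high) ranges already given out;
    `base` is a hypothetical start of the testbed-wide port space.
    """
    low = base
    for a_low, a_high in sorted(allocated):
        if low + size - 1 < a_low:
            break  # the gap before this allocation is big enough
        low = max(low, a_high + 1)
    return low, low + size - 1
```

As the message notes, a caller would skip this entirely for experiments with no jailed (virtual) nodes, so the port space is not used up needlessly.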