1. 15 Jan, 2003 2 commits
    • Leigh B. Stoller authored · 76396510
      Fixes to how port configuration is done (for trafgens and tunnels,
      and now sshdport). First off, take the DB out of the loop. There is no
      reason to do DB locking since local nodes are not shared, and even if
      they were, we now allocate port ranges to each experiment when it uses
      virtual nodes (which is the only way to share a node anyway), so there
      is never a case that port allocation between experiments has to be
      coordinated. The only locking that is done is when the range is
      allocated to the experiment, but that is done only once, and only if the
      experiment uses virtual nodes (local or remote).
      
      Also, now that it's easy to create lots of jails on a single node, it's
      a good idea to give each one its own ssh port. This is recorded in
      the DB nodes table (sshdport) and returned in the jail config from
      tmcd.
    • Leigh B. Stoller · e5e423b7
  2. 14 Jan, 2003 1 commit
  3. 13 Jan, 2003 1 commit
  4. 10 Jan, 2003 3 commits
  5. 09 Jan, 2003 1 commit
  6. 07 Jan, 2003 7 commits
    • Leigh B. Stoller authored · 5ab15776
      Changes for setting up jailed nodes, which need checks similar to what
      real nodes get. Also, run a proper os_select on jailed nodes, *after*
      the os for the physical node is set up, since otherwise stated will not
      be happy.
      
      Fixes for dealing with failed os_load. Previously, if os_load would
      fail, os_setup would wait for those nodes anyway since it had no idea
      what nodes had failed (and we do not want to just quit from os_setup
      since that might cause a lot of extra power cycles). Now, for each
      node that got an os_load, check its eventstate; it should be in ISUP
      immediately after os_load exits (since that's what os_load waited for),
      and if it's not, then mark that node as failed. Note though that failed
      loads no longer result in the node going into hwdown, since 99 percent
      of the time it's a busted user image, not a hardware problem. I figure
      we will catch real hw errors via the reload daemon, when it sends
      email about nodes not finishing.
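
      The eventstate check described above can be sketched roughly as follows.
      This is an illustration in Python, not the os_setup code itself: the
      function name, the node names, and the RELOADING state are hypothetical;
      only the ISUP state comes from the commit message.

```python
def find_failed_loads(loaded_nodes, eventstate_of):
    """Return the nodes that are not in ISUP right after os_load exits.

    loaded_nodes:  node names that were handed to os_load (hypothetical names)
    eventstate_of: callable mapping a node name to its current eventstate
    """
    return [node for node in loaded_nodes if eventstate_of(node) != "ISUP"]


# Hypothetical snapshot of eventstates taken when os_load returns.
states = {"pc1": "ISUP", "pc2": "RELOADING", "pc3": "ISUP"}

failed = find_failed_loads(["pc1", "pc2", "pc3"], states.get)

# Only nodes that did not fail are worth waiting for in the rest of setup.
still_waiting = [node for node in states if node not in failed]
```

      Failed nodes are simply dropped from the wait list here; per the message
      above, they are not moved to hwdown.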
      
      Do not bother with doing the vnode setup if any of the phys nodes
      failed to set up. Doing so leads to cascading errors and prolongs the agony by
      another few minutes. Might revisit this later.
      
      Remove local WaitTillAlive() function, and switch to using the version
      I put into libdb a couple of weeks ago.
      
      Fix up a bunch of print statements to be nicer.
    • Leigh B. Stoller authored · f7b3e7b7
      Remove hardwired 15 minute wait, and replace with a hardwired
      calculation based on the size of the image file. Okay, to save all
      you folks from going to see what bit of dreck I came up with, here it
      is:
      
          my $sb     = stat($imagepath);
          my $chunks = $sb->size / (1024 * 1024);
          $maxwait   = int((($chunks / 100.0) * 25) + (4 * 60));
      
      Note the replacement of one hardwired number (15) with several dozen
      new ones!
      
      I like it anyway, cause I hate waiting 2*15 minutes when a 60 second
      load fails.
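
      To get a feel for the numbers, here is a Python transcription of the
      Perl calculation quoted above. It assumes the result is in seconds
      (suggested by the 4 * 60 term, but not stated in the message); the
      function name is made up for this sketch.

```python
def max_wait_seconds(image_size_bytes):
    """Mirror the quoted Perl: 25 units per 100 MB of image, plus 4 * 60."""
    chunks = image_size_bytes / (1024 * 1024)  # image size in megabytes
    # int() truncates toward zero, like Perl's int().
    return int(((chunks / 100.0) * 25) + (4 * 60))


MB = 1024 * 1024

empty_image = max_wait_seconds(0)          # only the fixed 4 * 60 part
small_image = max_wait_seconds(100 * MB)   # one 100 MB step on top
large_image = max_wait_seconds(600 * MB)   # six 100 MB steps on top
```

      Under that assumption, a 600 MB image gets a 390 second limit instead
      of the old fixed 15 minutes.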
    • Leigh B. Stoller authored · 4a309126
      Minor tweaks.
    • Mac Newbold · 81fea716
    • Mac Newbold authored · 64ad3e5d
      Add SHUTDOWN as a legal state (identical to REBOOTING) in NORMAL mode,
      then remove special case for sending REBOOTING event in node_reboot/power
      when using NORMAL mode. Now SHUTDOWN is always sent. (Important side note:
      SHUTDOWN needs to be a valid state in every machine now.)
  7. 06 Jan, 2003 2 commits
  8. 31 Dec, 2002 4 commits
    • Leigh B. Stoller authored · c832fa47
      Clean up permission check.
      Remove the sanity check of the experiment state.
      Add check for a local node and do not set up/tear down since the reboot
      will take care of that (jailed nodes are set up at boot time, and obviously
      they are going to get torn down when the node goes down!).
    • Leigh B. Stoller authored · d50073b0
      Do not allow users to specify the osid for nodes that are virtual
      (jailed) or for the nodes that are hosting virtual nodes. The checks
      are here so that errors are caught early on, and because it's better
      than messing with assign_wrapper!
    • Leigh B. Stoller · cb57bc7d
    • Leigh B. Stoller authored · ab8b901f
      Add support for rebooting jailed (virtual) nodes, either remote or
      local. For local nodes, need to cull out jailed nodes if the phys node
      is also going to reboot. Jailed nodes are rebooted serially since they
      go down much faster.
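
      The culling step for local nodes might look something like the sketch
      below. It is written in Python for illustration only; the function name,
      the node names, and the host_of mapping are hypothetical and not taken
      from node_reboot.

```python
def cull_jailed_nodes(reboot_list, host_of):
    """Drop jailed nodes whose physical host is also being rebooted.

    reboot_list: node names requested for reboot, physical and jailed
    host_of:     maps a jailed node to its physical host; physical
                 nodes do not appear as keys
    """
    rebooting = set(reboot_list)
    # host_of.get() returns None for physical nodes, which is never in
    # the reboot set, so physical nodes are always kept.
    return [node for node in reboot_list if host_of.get(node) not in rebooting]


# Hypothetical layout: one jail on pc3, one jail on pc7.
host_of = {"pcvm3-1": "pc3", "pcvm7-1": "pc7"}

to_reboot = cull_jailed_nodes(["pc3", "pcvm3-1", "pcvm7-1"], host_of)
```

      Here pcvm3-1 is dropped because rebooting pc3 takes it down anyway,
      while pcvm7-1 stays on the list since pc7 is not being rebooted.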
      
      Fix up recently added wait mode for jailed nodes. Also, I noticed that
      I was having problems with events not filtering through stated before
      going into the ISUP wait loop; I was catching the nodes still in ISUP
      instead of SHUTDOWN. I added a sleep(2) before going into wait mode,
      but this might be something to watch out for elsewhere too.
  9. 30 Dec, 2002 3 commits
  10. 23 Dec, 2002 1 commit
  11. 20 Dec, 2002 2 commits
  12. 19 Dec, 2002 3 commits
    • Leigh B. Stoller authored · 04c20246
      Two real changes:
      1) Add support for local jailed nodes. This support overlaps in a nasty way
         with remote jailed nodes, but I added this for testing purposes, and as
         it turns out it's pretty handy. A second pass is needed to unify remote
         and local jails, but for now this is how it goes:
      
        	tb-set-hardware $node3  pc600
        	tb-set-hardware $nodev1 pcvm600
        	tb-fix-node $nodev1 $node3
      
        So, "fix" $nodev1 to $node3. The intent is that once $node3 is
        allocated by assign to a real testbed node, we can then allocate a
        virtual node on pcXX to $nodev1. I did this primarily to allow for
        easy testing of jails via my NS file, without having to hack assign
        wrapper too deeply. So, after assign runs, I use avail to get the
        available vnodes on the assigned pcXX, and allocate those for the virtual
        nodes. At present, we still depend on pre-existing pcvm nodes for each
        real node.
      
      2) Add code to assign non-overlapping port ranges to each experiment. This
         could be moved to an external script, but is fine right here. There is
         an ipport_ranges table for determining a testbed wide range (currently
         256 ports). This is of course only meaningful when using jailed nodes,
         so do not bother to set a range (and use up the port space) if no jailed
         (virtual) nodes.
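
      As a rough illustration of non-overlapping range assignment, here is a
      Python sketch. It assumes the 256 figure is the size of each
      experiment's block, and it invents a base port and upper limit; the
      real bounds live in the ipport_ranges table and are not given here.
      The function and parameter names are hypothetical.

```python
RANGE_SIZE = 256  # ports per experiment (assumed reading of "256 ports")


def allocate_range(allocated_lows, base=30000, limit=65536, size=RANGE_SIZE):
    """Return (low, high) for the first free block, or None if none is left.

    allocated_lows: low ports of blocks already handed to experiments
    base, limit:    hypothetical bounds of the testbed-wide port space
    """
    taken = set(allocated_lows)
    # Blocks start at fixed multiples of `size` above `base`, so two
    # blocks with different low ports can never overlap.
    for low in range(base, limit - size + 1, size):
        if low not in taken:
            return (low, low + size - 1)
    return None


first = allocate_range([])
third = allocate_range([30000, 30256])
exhausted = allocate_range([30000], base=30000, limit=30256)
```

      An experiment with no jailed (virtual) nodes would skip this call
      entirely, so it does not use up any of the port space.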
    • Leigh B. Stoller authored · ab4657d2
      Commit my little pc601 change so that pc601 nodes never get used in
      the main tree. Note that this hack should be generalized (as we have
      discussed many times).
    • Leigh B. Stoller authored · 8f38aab2
      Add tbrestart for install.
  13. 18 Dec, 2002 6 commits
  14. 16 Dec, 2002 2 commits
  15. 11 Dec, 2002 2 commits