1. 27 Oct, 2003 1 commit
  2. 17 Oct, 2003 1 commit
  3. 16 Oct, 2003 3 commits
    • Shashi Guruprasad's avatar
      Distributed nse changes · 1630611a
      Shashi Guruprasad authored
      1) IP address based routes can now be added
         - The IP address is set on a link object
         - An "$ns rlink" is used to instantiate links
           that get cut and cross physical partitions
         - Traffic agents that are across physical
           partitions (i.e. different instances of nse)
           are connected by a new "$ns ip-connect"
           mechanism
         - A new Node instproc "add-route-to-ip" adds
           IP address based routes.
         - Changed ns multicast addressing to use 3 bits
           instead of the default 1
         - Currently, the classifier does a lookup on a
           complete 32 bit IP and if a target to route to
           is not found, uses a 24 bit IP mask. It does not
           try to match IP prefixes of all lengths. I'll add
           that later if necessary
      2) NS packets that cross partitions are encapsulated in
         IPPROTO_ENCAP IP packets.
      3) RAW IP sockets used to inject packets into the network
         now use a rtabid parameter so that packets can be
         routed according to different routing tables
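
The two-stage lookup described in item 1 (full 32-bit match first, then a 24-bit mask) can be sketched as follows. This is a minimal illustration, not the nse classifier itself: the dictionary-based table, the function names, and the target strings are all assumptions made for the example.

```python
import ipaddress

MASK_24 = 0xFFFFFF00  # 24-bit network mask


def add_route(routes, dst, target):
    """Store a target under the integer form of dst (hypothetical table layout)."""
    routes[int(ipaddress.IPv4Address(dst))] = target


def lookup(routes, dst):
    """Try the complete 32-bit address, then fall back to the 24-bit prefix."""
    addr = int(ipaddress.IPv4Address(dst))
    if addr in routes:                    # exact 32-bit match
        return routes[addr]
    return routes.get(addr & MASK_24)     # 24-bit mask; None if still not found


routes = {}
add_route(routes, "10.1.3.3", "via 10.1.2.3")  # host route
add_route(routes, "10.1.1.0", "via rlink")     # entry keyed by the masked /24
```

With this table, `lookup(routes, "10.1.1.7")` falls through to the /24 entry, while an address in an unknown network returns `None`. No other prefix lengths are tried, matching the limitation noted above.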
      
      Tested with 2 test cases, one with UDP/CBR traffic
      and another with default NS TCP/FTP traffic. Setup was done
      manually. As I do testbed integration, there may be more changes.
      Here's the test setup:
      
         2.2    2.3   1.2      1.3   3.2      3.3
      n0 --------- n1 ----------- n2 ------------ n3
      
      n0,n1 are on one physical node and n2,n3 are on another. The n1-n2 link
      is cut.
      
      A TCP example:
      
      ---------------------physnode0---------------------
      set ns [new Simulator]
      $ns use-scheduler RealTime
      
      set n0 [$ns node]
      set n1 [$ns node]
      
      $ns duplex-link $n0 $n1 10Mb 5ms DropTail
      [$ns link $n0 $n1] set-ip 10.1.2.2
      [$ns link $n1 $n0] set-ip 10.1.2.3
      
      set rl0 [$ns rlink $n1 10.1.1.3 2Mb 40ms DropTail]
      $rl0 set-ip 10.1.1.2
      
      set tcp0 [new Agent/TCP]
      # The last parameter specifies the port
      $ns attach-agent $n0 $tcp0 20
      $ns ip-connect $tcp0 10.1.3.3 20
      set ftp0 [new Application/FTP]
      $ftp0 attach-agent $tcp0
      
      $n0 add-route-to-ip 10.1.3.3 10.1.2.3
      $n1 add-route-to-ip 10.1.3.3 10.1.1.3
      
      $ns at 1.0 "$ftp0 start"
      $ns at 10.0 "$ftp0 stop"
      -----------------end physnode0---------------------
      
      ---------------------physnode1---------------------
      set ns [new Simulator]
      $ns use-scheduler RealTime
      
      set n2 [$ns node]
      set n3 [$ns node]
      
      $ns duplex-link $n2 $n3 10Mb 5ms DropTail
      [$ns link $n2 $n3] set-ip 10.1.3.2
      [$ns link $n3 $n2] set-ip 10.1.3.3
      
      set rl1 [$ns rlink $n2 10.1.1.2 2Mb 40ms DropTail]
      $rl1 set-ip 10.1.1.3
      
      set tcpsink0 [new Agent/TCPSink]
      $ns attach-agent $n3 $tcpsink0 20
      $ns ip-connect $tcpsink0 10.1.2.2 20
      
      $n3 add-route-to-ip 10.1.2.2 10.1.3.2
      $n2 add-route-to-ip 10.1.2.2 10.1.1.2
      -----------------end physnode1---------------------
      1630611a
    • Leigh Stoller's avatar
      Fix getopt call (args reversed). · c0996d36
      Leigh Stoller authored
      Add install target to stick into the sbin directory.
      c0996d36
    • Robert Ricci's avatar
  4. 15 Oct, 2003 2 commits
    • Shashi Guruprasad's avatar
      The ns2.26 distribution is now in www/downloads directory. I've changed the · 140144bb
      Shashi Guruprasad authored
      install file to reflect this.
      140144bb
    • Mike Hibler's avatar
      Uniform syslog'ing. Change everything I could find to use a syslog facility · cc6d6fa7
      Mike Hibler authored
      as defined in the defs-* file (e.g. "TBLOGFACIL=local2").  The default is
      "local5" which is what we are set up to use so you shouldn't need to mess
      with your defs- file!
      
      perl scripts just get this value configured in when configure is run.
      C programs get the value in two ways.  For programs that are intimate with
      the testbed infrastructure, and include "config.h", they just get it from
      that file.  For programs that we sometimes use outside the Emulab build
      environment (e.g., frisbee, capture) and that don't include config.h,
      the value is set via a "-DLOG_TESTBED=..." in the GNUmakefile build line.
      If the value isn't set, it defaults to what it used to be (usually LOG_USER).
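
The fallback rule (use the configured facility if there is one, otherwise keep the old default) can be illustrated roughly like this. The real programs resolve the value at configure or compile time; this runtime Python analogue, including the `resolve_facility` name and the `FACILITIES` table, is an assumption made purely for illustration.

```python
import syslog

# Facility names this sketch accepts: "user" plus local0..local7.
FACILITIES = {"user": syslog.LOG_USER}
FACILITIES.update(
    {"local%d" % i: getattr(syslog, "LOG_LOCAL%d" % i) for i in range(8)}
)

DEFAULT_FACILITY = syslog.LOG_USER  # what most programs used before the change


def resolve_facility(name=None):
    """Return the syslog facility for a configured name, or the old default."""
    if not name:
        return DEFAULT_FACILITY
    return FACILITIES.get(name.lower(), DEFAULT_FACILITY)
```

A program would then pass the result to `syslog.openlog()` as its `facility` argument.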
      
      Still to do: healthd, hmcd (whose build doesn't seem to be completely
      integrated) and plabdaemon.in (since it's icky Python :-)
      cc6d6fa7
  5. 13 Oct, 2003 4 commits
  6. 10 Oct, 2003 2 commits
    • Mac Newbold's avatar
      8890a9cb
    • Mac Newbold's avatar
      New StateWait changes - the main point of all this is to move to our new · 2b2a306d
      Mac Newbold authored
      model of waiting for state changes. Before we were watching the database
      (which means we can only watch for terminal/stable/long-lived states, and
      have to poll the db). Now things that are waiting for states to change
      become event listeners, and watch the stream of events flow by, and don't
      have to do any polling. They can now watch for any state, and even
      sequences of states (e.g. a Shutdown followed by an Isup).
      
      To do this, there is now a cool StateWait.pm library that encapsulates the
      functionality needed. To use it, you call initStateWait before you start
      the chain of events (i.e. before you call node reboot). Then do your stuff,
      and call waitForState() when you're ready to wait. It can be told to
      return periodically with the results so far, and you can cancel waiting
      for things. An example program called waitForState is in
      testbed/event/stated/ , and can also be used nicely as a command line tool
      that wraps up the library functionality.
      
      This also required the introduction of a TBFAILED event that can be sent
      when a node isn't going to make it to the state that someone may be
      waiting for. I.e., if it gets wedged coming up, and stated retries, but
      eventually gives up on it, it sends this to let things know that the node
      is hosed and won't ever come up.
      
      Another thing that is part of this is that node_reboot moves (back) to the
      fully-event-driven model, where users call node reboot, and it does some
      checks and sends some events. Then stated calls node_reboot in "real mode"
      to actually do the work, and handles doing the appropriate retries until
      the node either comes up or is deemed "failed" and stated gives up on it.
      This means stated is also the gatekeeper of when you can and cannot reboot
      a node. (See mail archives for extensive discussions of the details.)
      
      A big part of the motivation for this was to get uninformed timeouts and
      retries out of os_load/os_setup and put them in stated where we can make a
      wiser choice. So os_load and os_setup now use this new stuff and don't
      have to worry about timing out on nodes and rebooting. Stated makes sure
      that they either come up, get retried, or fail to boot. tbrestart also
      underwent a similar change.
      2b2a306d
  7. 01 Oct, 2003 1 commit
  8. 30 Sep, 2003 2 commits
  9. 21 Aug, 2003 1 commit
  10. 07 Aug, 2003 1 commit
  11. 01 Aug, 2003 1 commit
  12. 10 Jul, 2003 1 commit
  13. 01 Jul, 2003 1 commit
  14. 19 Jun, 2003 3 commits
    • Robert Ricci's avatar
      Fix a bug relating to multiple handles - the cleanup function should · 608de04f
      Robert Ricci authored
      only be called after we're done with _all_ handles. So, add a simple
      count of how many handles we've given out, and only call cleanup on
      the last one to get unregistered.
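
The fix amounts to reference counting the handles. A minimal sketch of that idea, with a hypothetical `HandleRegistry` class and cleanup callback that are not taken from the actual code:

```python
class HandleRegistry:
    """Counts outstanding handles and runs cleanup only after the last one."""

    def __init__(self, cleanup):
        self._cleanup = cleanup
        self._count = 0

    def register(self):
        """Hand out a new opaque handle."""
        self._count += 1
        return object()

    def unregister(self, handle):
        """Release a handle; call cleanup once none remain."""
        if self._count == 0:
            raise RuntimeError("no handles outstanding")
        self._count -= 1
        if self._count == 0:
            self._cleanup()
```

Unregistering the first of two handles leaves the shared state alone; only the second call triggers cleanup.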
      608de04f
    • Mac Newbold's avatar
      The new and fully functional rebooting-via-events stuff and the · 1daaa992
      Mac Newbold authored
      really-reboot-nodes-that-timeout stuff.
      
      NOTE: Until the timeout/retry stuff is gone from os_load/os_setup, it is
      disabled in stated. It will still only send email. But all the stuff is
      there and has been tested.
      
      NOTE: Until other things don't depend on the old behavior of node_reboot
      (when it returns, all nodes are in SHUTDOWN), the event stuff is disabled.
      Real mode is the default, and can be run by anyone.
      
      In short, this commit is new versions of stated and node_reboot that act
      almost exactly like the old ones. But I wanted to commit them before I go
      on making a bunch more changes, to have a checkpoint that I know works.
      1daaa992
    • Mac Newbold's avatar
      Fix a couple of minor glitches. · 243921d5
      Mac Newbold authored
      243921d5
  15. 18 Jun, 2003 2 commits
    • Mac Newbold's avatar
      The first working version of the StateWait library. The API changed a bit: · 33251a18
      Mac Newbold authored
      # $rv = initStateWait(\@states, @nodes);
      #
      # Call this first. Make sure to call it _before_ performing any
      # action that may trigger one or more of the states you're
      # expecting to see. Each node must see the list of states in the
      # proper order, ie if @states=(stateA, stateB), it will not
      # complete until the first time stateB is seen after stateA has
      # been seen. Returns 0 on success.
      #
      # $rv = waitForState(\@finished, \@failed[, $timeout]);
      #
      # Do the actual waiting. Blocks until each node has either reached the
      # desired state(s) or failed, or timeout is reached (if given and
      # non-zero). Returns lists of nodes.
      #
      # $rv = endStateWait();
      #
      # Stop listening for states. Call this soon after waitForState.
      # This must be called before calling initStateWait again.
      
      Also, commit a command line tool that uses the lib. The waitForState
      script can be used by other programs to do the state waiting for you, or
      you can use the lib directly for more control, using this script as an
      example of how to do it.
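
The ordering rule in the comment above (stateB only counts once stateA has been seen) can be sketched as a small per-node tracker. This is an illustration of the matching logic only, not the StateWait.pm implementation; the class name, the `"FAILED"` marker, and the method names are assumptions.

```python
class StateSequenceWaiter:
    """Tracks, per node, progress through an ordered list of expected states."""

    FAILED = "FAILED"  # hypothetical marker for a node that will never arrive

    def __init__(self, states, nodes):
        self.states = list(states)
        self.progress = {node: 0 for node in nodes}  # index of next expected state
        self.failed_nodes = set()

    def on_event(self, node, state):
        """Feed one observed state change into the tracker."""
        if node not in self.progress or node in self.failed_nodes:
            return
        idx = self.progress[node]
        if idx >= len(self.states):
            return                      # node already finished
        if state == self.FAILED:
            self.failed_nodes.add(node)
        elif state == self.states[idx]:
            self.progress[node] = idx + 1
        # any other state, including later states seen too early, is ignored

    def finished(self):
        """Nodes that have seen every expected state, in order."""
        return [n for n, i in self.progress.items() if i == len(self.states)]

    def done(self):
        """True once every node has either finished or failed."""
        return all(
            i == len(self.states) or n in self.failed_nodes
            for n, i in self.progress.items()
        )
```

An ISUP that arrives before SHUTDOWN does not advance a node waiting for (SHUTDOWN, ISUP), which is why the wait has to be initialised before the action that triggers the states.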
      33251a18
    • Mac Newbold's avatar
  16. 16 Jun, 2003 1 commit
  17. 10 Jun, 2003 1 commit
  18. 09 Jun, 2003 2 commits
  19. 06 Jun, 2003 3 commits
    • Shashi Guruprasad's avatar
      fd2f7a62
    • Mac Newbold's avatar
      First batch of changes for adding TBCOMMAND events. Currently, here's what · 71b82cc4
      Mac Newbold authored
      is supported:
      
      - stated listens for TBCOMMAND events, and currently handles REBOOT,
        POWEROFF, POWERON, and POWERCYCLE events. It does everything except make
        the actual calls to node_reboot and power. And it accepts batches of
        nodes instead of just single ones.
      
      - Timeouts were added to the db for these commands, with no timeout for
        the power ones (since the node can't hang during those), and a 15 second
        timeout from reboot until the SHUTDOWN state.
      
      - If a reboot times out, it tries it again, up to 3 times. If it gets to
        three times without working, it sends mail to tbops and turns the
        machine off instead of continuing to reboot it. Right now I haven't
        made it do node_reboot -f or power cycle on retries, but it easily
        could.
      
      - Stuff to be done before they work: make node_reboot send an event
        instead of doing the work, and make a new script that has node_reboot's
        old guts. Note that this requires authentication in our events for these
        commands, and a way to make sure that the command that came in as an
        event was properly authenticated.
      
      - For future growth and expansion, it is set up so it should be relatively
        easy to add other commands that do different things, even if they take
        arbitrary params that aren't nodes or lists of nodes.
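
The timeout and retry policy from the bullets above can be sketched like this. It assumes "up to 3 times" means three attempts in total, and every callback (`reboot`, `reached_shutdown`, `notify`, `power_off`) is a hypothetical stand-in, since the commit itself does not yet make the real calls.

```python
MAX_TRIES = 3          # attempts before giving up (assumed to be total attempts)
REBOOT_TIMEOUT = 15    # seconds allowed from reboot until SHUTDOWN


def reboot_with_retries(node, reboot, reached_shutdown, notify, power_off,
                        timeout=REBOOT_TIMEOUT, max_tries=MAX_TRIES):
    """Reboot a node, retrying on timeout; power it off if it never shuts down.

    Returns the attempt number that succeeded, or None after giving up.
    """
    for attempt in range(1, max_tries + 1):
        reboot(node)
        if reached_shutdown(node, timeout):
            return attempt
    notify("tbops", "%s failed to reboot after %d tries" % (node, max_tries))
    power_off(node)
    return None
```

Passing the actions in as callbacks keeps the policy separate from the mechanism, so a forced reboot or power cycle could be substituted on later attempts.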
      71b82cc4
    • Mac Newbold's avatar
      Checkpoint changes for StateWait module. Mostly just stubs now, so it · 12a6a5f4
      Mac Newbold authored
      obviously doesn't get used anywhere yet.
      12a6a5f4
  20. 05 Jun, 2003 2 commits
    • Leigh Stoller's avatar
      New event proxy. This proxy is used in lieu of Elvin clustering or · b5d82850
      Leigh Stoller authored
      federation, which is not supported in the version we have source to.
      Basically, we run an elvind on each node. The proxy on each node
      subscribes to all events for that node from the boss elvind, and hands
      them to the local elvind. Each client on the node subscribes to the
      local elvind, and gets its events via the proxy. This should reduce
      the number of connections to boss, and makes it possible to run agents
      inside each virtual node without an FD explosion on boss.
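
The fan-out arrangement can be sketched with a toy publish/subscribe broker. This does not use the Elvin API; the `Broker` class, `attach_proxy`, and the event dictionaries are assumptions chosen to show why boss sees one connection per node rather than one per client.

```python
class Broker:
    """Toy stand-in for an event server: delivers each event to all subscribers."""

    def __init__(self):
        self.subscribers = []

    def subscribe(self, callback):
        self.subscribers.append(callback)

    def publish(self, event):
        for callback in self.subscribers:
            callback(event)


def attach_proxy(boss, local, node):
    """Forward events addressed to `node` from boss to the node-local broker."""
    def forward(event):
        if event["node"] == node:
            local.publish(event)
    boss.subscribe(forward)   # the only subscription boss sees for this node


boss = Broker()
local = Broker()
attach_proxy(boss, local, "pc1")

# Three local clients subscribe to the local broker, not to boss.
inboxes = [[] for _ in range(3)]
for inbox in inboxes:
    local.subscribe(inbox.append)

boss.publish({"node": "pc1", "type": "START"})
boss.publish({"node": "pc2", "type": "START"})  # filtered out by the proxy
```

Adding more clients on the node grows `local.subscribers` only; `boss.subscribers` stays at one entry per node.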
      b5d82850
    • Leigh Stoller's avatar
      d6559d18
  21. 30 May, 2003 2 commits
    • Leigh Stoller's avatar
      Add code to write pidfile, and -i option to specify pid file. · 66f8949f
      Leigh Stoller authored
      Add -u option to specify the user. Do the uid flip here instead of in
      the perl wrapper, but only if root of course. Otherwise runs as the
      user that invoked the program-agent.
      Add mandatory -e option to specify the pid/eid to use in event tuple
      instead of the ipaddr, since in jails without their own IP address,
      using the ipaddr is broken (all jails see all program events).
      Add mandatory -a option to specify a list of object names, so that the
      agent will get just the events it should. There is a corresponding new
      tmcc command that specifies the list of program objects for the node
      (or vnode).
      66f8949f
    • Leigh Stoller's avatar
      Change maxlinks from 4 to 256! · b68c79c7
      Leigh Stoller authored
      Add -l option to specify the logfile.
      Add code to write pidfile, and add -i option to specify pidfile name.
      Useful with jails where there is a delay agent per jail, and thus
      a logfile and pidfile per jail.
      b68c79c7
  22. 23 May, 2003 1 commit
    • Mac Newbold's avatar
      Fix two problems: · c8848155
      Mac Newbold authored
      1. timeouts for nodes weren't getting reset when they had a mode
      transition, so they were timing out in shutdown after changing modes.
      2. It was still going back into a blocking wait, even though a signal had
      been received, and not quitting back up to the main loop to handle it.
      c8848155
  23. 22 May, 2003 2 commits