  1. 30 Aug, 2004 2 commits
    • The bulk of the event system changes. · 9aa6b5ca
      Leigh B. Stoller authored
      * The per-experiment event scheduler now runs on ops instead of boss.
        Boss still runs elvind and uses events internally, but the user part
        of the event system has moved.
      
      * Part of the guts of eventsys_control moved to new script, eventsys.proxy,
        which runs on ops and fires off the event scheduler. The only tricky part
        of this is that the scheduler runs as the user, but killing it has to be
        done as root since a different person might swap out the experiment. So,
        the proxy is a perl wrapper invoked from a root ssh from boss, which
        forks, writes the pid file into /var/run/emulab/evsched/$pid_$eid.pid,
        then flips to the user and execs the event scheduler (which is careful
        not to fork). Obviously, if the kill is done as root, the pid file has to
        be stored someplace the user is not allowed to write.
      
      * The event scheduler has been rewritten to use Tim's C++ interface to the
        sshxmlrpc server on boss. Actually, I reorg'ed the scheduler so that it
        can be built either as a mysql client, or as RPC client. Note that it can
        also be built to use the SSL version of the XMLRPC server, but that will
        not go live until I finish the server stuff up. Also some goo for dealing
        with building the scheduler with C++.
      
      * Changes to several makefiles to install the ops binaries over NFS to
        /usr/testbed/opsdir. Makes life easier, but only if boss and ops are
        running the same OS. For now, using static linking on the event scheduler
        until ops is upgraded to the same rev as boss.
      
      * All of the event clients got little tweaks for dealing with the new CNAME
        for the event system server (event-server). Will need to build new images
        at some point. Old images and clients will continue to work because of an
        inetd hack on boss that uses netcat to transparently redirect elvind
        connections to ops.
      
      * Note that eventdebug needs some explaining. In order to make the inetd
        redirect work, elvind cannot be listening on the standard port. So, the
        boss event system uses an alternate port since there are just a few
        subsystems on boss that use the server, and it's easy to propagate changes
        on boss. Anyway, the default for eventdebug is to connect to the standard
        port on localhost, which means it will work as expected on ops, but will
        require the -b argument on boss.
      
      * Linktest changes were slightly more involved. No longer run linktest on
        boss when called from the experiment swapin path, but ssh over to ops to
        fire it off. This is done as the user of course, and there are some
        tricks to make it possible to kill a running linktest and its ssh when
        experiment swapin is canceled (or from the command line) by forcing
        allocation of a tty. I will probably revisit this at some point, but I
        did not want to spend a bunch of time on linktest.
      
      * The upgrade path detailed in doc/UPDATING is necessarily complicated and
        bound to cause consternation at remote sites doing an upgrade.
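
The pid file handling described for eventsys.proxy can be sketched roughly as follows. This is a Python illustration under stated assumptions, not the actual Perl wrapper: `pidfile_path` and `launch_as_user` are hypothetical names, `pid` is taken to mean the project ID (as in the `$pid_$eid` naming), and only the directory and file naming come from the commit message.

```python
import os

# Directory named in the commit message; assumed to be writable by root only.
PIDFILE_DIR = "/var/run/emulab/evsched"


def pidfile_path(pid, eid, base=PIDFILE_DIR):
    """Return <base>/<pid>_<eid>.pid for project `pid` and experiment `eid`."""
    return os.path.join(base, f"{pid}_{eid}.pid")


def launch_as_user(argv, uid, gid, pidfile):
    """Fork, record the child's process id, then exec argv as uid/gid."""
    child = os.fork()
    if child:
        # Parent keeps its privileges and writes the pid file, so that root
        # can later kill the scheduler no matter who swaps the experiment out.
        with open(pidfile, "w") as fh:
            fh.write(f"{child}\n")
        return child
    try:
        # Child: drop the group first, then the user, and exec without
        # forking again so the recorded process id stays valid.
        if os.getgid() != gid:
            os.setgid(gid)
        if os.getuid() != uid:
            os.setuid(uid)
        os.execvp(argv[0], argv)
    finally:
        os._exit(127)  # reached only if dropping privileges or exec failed
```

Having the privileged parent write the file is one possible ordering; the commit message does not say which side of the fork writes it.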
    • Add makefile for installing elvind.conf on both boss and ops. · cefba647
      Leigh B. Stoller authored
      Add elvind-inetd.conf for local inetd startup that redirects event
      traffic from old images over to event server on ops. Note that this
      will not be needed on new testbeds. Also note that this makefile install
      is not tied into the toplevel install; it is for the metaports install.
  2. 19 Aug, 2004 2 commits
  3. 18 Aug, 2004 2 commits
    • Add elvind.conf file specific to boss/ops. The ops version is vanilla, · a397e0eb
      Leigh B. Stoller authored
      but the boss one is special. It does not listen on the standard port,
      since we are going to transparently forward that port to ops. Instead,
      it listens on localhost:@BOSSEVENTPORT@ which is where all the
      internal event based programs on boss are going to connect. Note, this
      file should not be installed yet ...
    • Minor extension to stated. Add a trigger mechanism for invoking an · 6cf3e936
      Leigh B. Stoller authored
      "arbitrary" script as defined in the stated_triggers table. Currently
      using this to invoke the new opsreboot script whenever ISUP comes in
      from ops.
      
      The opsreboot script is currently a skeleton. All it does is send
      email.  I'll add the rest later (which really won't be much at first;
      just getting the event schedulers started).
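
A trigger lookup of the kind described could look something like the sketch below. The schema of the stated_triggers table is not given in the commit message, so the `(node, state) -> script` mapping and the `TriggerTable` class are assumptions made purely for illustration.

```python
from collections import defaultdict


class TriggerTable:
    """In-memory stand-in for rows of a stated_triggers style table."""

    def __init__(self):
        # (node, state) -> list of script names; this layout is assumed.
        self._rows = defaultdict(list)

    def add(self, node, state, script):
        self._rows[(node, state)].append(script)

    def fire(self, node, state, run):
        """Call run(script, node, state) for each matching trigger."""
        scripts = list(self._rows.get((node, state), ()))
        for script in scripts:
            run(script, node, state)
        return scripts
```

With a row added for `("ops", "ISUP", "opsreboot")`, an ISUP report from ops would invoke the callback once with the opsreboot script name, and any other state would invoke nothing.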
  4. 13 Aug, 2004 1 commit
  5. 09 Aug, 2004 1 commit
  6. 05 Aug, 2004 1 commit
  7. 27 Jul, 2004 1 commit
  8. 26 Jul, 2004 1 commit
  9. 22 Jul, 2004 1 commit
  10. 15 Jul, 2004 2 commits
    • Make it work on linux too. · d87e70ee
      Mike Hibler authored
      More error checking.
    • Overview: Add Event Groups: · ed964507
      Leigh B. Stoller authored
      	set g1 [new EventGroup $ns]
      	$g1 add  $link0 $link1
      	$ns at 60.0 "$g1 down"
      
      See the new advanced tutorial section on event groups for a better
      example.
      
      Changed tbreport to dump the event groups table when in summary mode.
      At the same time, I changed tbreport to use the recently added
      virt_lans:vnode and ip slots, deprecating virt_nodes:ips in one more
      place. I also changed the web interface to always dump the event and
      event group summaries.
      
      The parser gets a new file (event.tcl), and the "at" method deals with
      event group events by expanding them inline into individual events
      sent to each member. For some agents, this is unavoidable; traffic
      generators get the initial params in the event, so it is not possible
      to send a single event to all members of the group. Same goes for
      program objects, although program objects do default to the initial
      command now, at least on new images.
      
      Changed the event scheduler to load the event groups table. The
      current operation is that the scheduler expands events sent to a
      group, into a set of distinct events sent to each member of the
      group. At some point we probably want to optimize this by telling the
      agents (running on the nodes) what groups they are members of.
      
      Other News: Added a "mustdelay" slot to the virt_lans table so the
      parser can tell assign_wrapper that a link needs to be delayed, say if
      there are events or if the link is red/gred. Previously,
      assign_wrapper tried to figure this out by looking at the event list,
      etc. I have removed that code; see database-migrate for instructions
      on how to initialize this slot in existing experiments. assign_wrapper
      is free to ignore or insert delays anyway, but having the parser do
      this makes more sense.
      
      I also made some "rename" changes to the parser wrt queues and lans
      and links. Not really necessary, but I got sidetracked (for several
      hours!) trying to understand that rename stuff a little better, and
      now I do.
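
The group expansion performed by the scheduler, as described in this commit, amounts to something like the following. This is a minimal Python sketch, not the scheduler's code: the dictionary event layout and the `expand_event` name are assumptions for illustration.

```python
def expand_event(event, groups):
    """Return one event per group member, or [event] if the target is no group.

    `groups` maps a group name to its member names, standing in for the
    event groups table the scheduler loads.
    """
    members = groups.get(event["target"])
    if members is None:
        return [event]
    # Copy the event for each member, changing only the target.
    return [dict(event, target=member) for member in members]
```

For the example at the top of this commit, a "down" event sent to g1 at time 60.0 would become two events, one for link0 and one for link1.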
  11. 12 Jul, 2004 1 commit
  12. 29 Jun, 2004 2 commits
    • Some "improvements" to linktest ... · 159076bf
      Leigh B. Stoller authored
      * The linktest daemon (the one that runs on the nodes) no longer talks
        to boss directly, but instead contacts the local elvind; rc.linktest
        is changed to reflect that.
      
      * A bunch of signal handler changes to run_linktest.pl; do not rely on
        events to stop linktest when it is running on boss; when the user
        kills a running linktest make sure all the processes are killed
        explicitly.
      
      * New wrapper script (linktest_control) for use on boss, specifically
        when being called from the web interface. This script handles the DB
        part (getting linktest_level and linktest_pid), making sure that
        only one linktest is running at a time (on boss) and resetting the
        pid in the DB as needed. The -k option kills a running linktest, and
        is invoked from the web interface when the user wants to kill one in
        progress. This gets the pid from the DB and sends it a TERM signal,
        which sends a TERM to the run_linktest.pl script, which sends a TERM
        to the ltevent helper app.
      
        Note that this wrapper is also suitable for the XMLRPC interface,
        although I have not added it there yet.
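
The TERM chain described above (wrapper to run_linktest.pl to ltevent) relies on each process passing the signal on to its child. A minimal Python sketch of one link in that chain, assuming hypothetical helper names `make_term_forwarder` and `run_with_forwarding`, might be:

```python
import signal
import subprocess


def make_term_forwarder(child):
    """Build a signal handler that passes SIGTERM on to a running child."""
    def handler(signum, frame):
        if child.poll() is None:  # only signal a child that is still alive
            child.send_signal(signal.SIGTERM)
    return handler


def run_with_forwarding(argv):
    """Run argv, forward any TERM received meanwhile, return its exit status."""
    child = subprocess.Popen(argv)
    previous = signal.signal(signal.SIGTERM, make_term_forwarder(child))
    try:
        return child.wait()
    finally:
        signal.signal(signal.SIGTERM, previous)  # restore the old handler
```

Forwarding explicitly, rather than relying on process groups or events, matches the intent stated in this commit of making sure every process is killed when the user stops a run.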
    • Minor tweaks, nothing special. · 4cb78154
      Leigh B. Stoller authored
  13. 28 Jun, 2004 1 commit
    • Fix a few things that cropped up while debugging for jails. · 2861a1a6
      Leigh B. Stoller authored
      * Do not have linktest daemon connect to boss; have it connect to
        local node elvind like all other local agents. Remove the event
        generation code (linktest was sending a KILL event to all other
        linktest programs), and replace with a system() call to tevc, which
        sends the event through the scheduler and exits; this will avoid a
        zillion tcp connections to boss from the linktest daemon.
      
      * A couple of process group changes to linktest daemon; the daemon
        appeared to be killing itself off.
      
      * Fix to run_linktest.pl; it was just hanging after it completed,
        because its child ltevent process was still running. Changed to record
        child pid, and kill/close ltevent child before exiting.
  14. 27 Jun, 2004 1 commit
    • This commit has various changes to Linktest to make it more reliable. · 9a23fe83
      David Anderson authored
      1. The Linktest daemon, linktest.c, now listens for a KILL event. If received,
         the daemon will kill the linktest.pl child process and all of its subchildren.
      2. The daemon also listens for SIGSTP events from the linktest.pl child, and
         will kill the linktest.pl process and its children if linktest.pl dies
         unexpectedly.
      3. Locking has been implemented in linktest.c to ignore requests to start linktest
         while a current run is executing.
      4. The controller script run_linktest.pl now includes the following new options:
         -t   allows the user to specify a timeout in seconds for Linktest.
         -v   prints out better feedback from the Linktest run as it proceeds.
      
      Major remaining items are:
      1. Avoid NFS mount hups
      2. More testing, especially on vnodes
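
The locking in item 3 of the change list (ignore start requests while a run is in progress) can be illustrated with a non-blocking lock. This is a Python sketch of the idea only; linktest.c is C and its actual mechanism is not shown in the commit message, and `SingleRun` is a hypothetical name.

```python
import threading


class SingleRun:
    """Run at most one job at a time; refuse (rather than queue) extras."""

    def __init__(self):
        self._lock = threading.Lock()

    def start(self, job):
        # Non-blocking acquire: a request arriving mid-run is simply ignored.
        if not self._lock.acquire(blocking=False):
            return False
        try:
            job()
        finally:
            self._lock.release()
        return True
```

A caller gets `False` back when a run is already executing and `True` once its own job has completed.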
  15. 24 Jun, 2004 3 commits
  16. 21 Jun, 2004 2 commits
  17. 16 Jun, 2004 2 commits
  18. 20 May, 2004 1 commit
    • Add EventFork() to event.pm (perl interface to event system) and to · 116539b6
      Leigh B. Stoller authored
      the tail file of course. Called from TBdbfork() in libdb, EventFork
      resets the event handle so that the child does a reconnect. Note that
      I do not disconnect in the child since I have no idea what that is
      going to do to the parent's connection to the elvind, as Elvin makes no
      mention of what to do in the presence of a process that forks after
      connecting to the event server. At the least, this avoids a bunch of
      warnings and errors from vnodesetup!
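
The reset-without-disconnect behavior of EventFork() can be sketched as below. This is an illustrative Python analogue, not the Perl in event.pm: the `EventClient` class and its `connect` callable are hypothetical.

```python
class EventClient:
    """Lazily connected client whose handle can be dropped after a fork."""

    def __init__(self, connect):
        self._connect = connect  # hypothetical callable returning a handle
        self._handle = None

    def handle(self):
        # Connect on first use, or again after fork_reset().
        if self._handle is None:
            self._handle = self._connect()
        return self._handle

    def fork_reset(self):
        # Forget the inherited handle WITHOUT disconnecting, so the parent's
        # connection to the event server is left untouched.
        self._handle = None
```

A child process would call `fork_reset()` right after the fork; its next `handle()` call then opens a fresh connection.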
  19. 19 May, 2004 1 commit
    • Add new syntax for modifying a link in link_config (with changes in · 1dfb77af
      Leigh B. Stoller authored
      the link agent). You can now do this:
      
      	link_config -s nodew1 testbed three-wireless lan0 ENABLE=yes
      
      to bring a link up (or down; ENABLE=no). This just gets passed along
      in the event arguments. Basically an alias for "tevc now lan0 UP/DOWN"
      to make things a little easier. At present, wireless lans are brought
      up/down with ifconfig since "txpower off" does not work properly.
      
      I suppose I should add the equivalent changes to delay_config ...
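
How an agent might interpret ENABLE among the event arguments is sketched below. The commit message only says that ENABLE is passed along in the event arguments and acts like UP/DOWN, so the `link_action` function, its return shape, and the case-insensitive yes/no check are all assumptions.

```python
def link_action(settings):
    """Split KEY=VALUE words into (state, other_args).

    state is "UP" or "DOWN" when ENABLE is present, otherwise None.
    """
    args = dict(word.split("=", 1) for word in settings)
    enable = args.pop("ENABLE", None)
    if enable is None:
        return None, args
    if enable.lower() not in ("yes", "no"):
        raise ValueError(f"ENABLE must be yes or no, got {enable!r}")
    return ("UP" if enable.lower() == "yes" else "DOWN"), args
```

For the command shown in this commit, `link_action(["ENABLE=yes"])` yields the UP state with no other arguments.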
  20. 18 May, 2004 1 commit
    • Fixes for CD boot: · 8174369d
      Mike Hibler authored
      1. tbbootconfig: ensure block is zeroed on first init, fix cut/paste error
      2. rc.frisbee: cleanup interface to slicefix
      3. slicefix: cleanup, make it work correctly, init tbboot block for cd boot
      
      Unrelated:
      1. link-agent makefile: build link-agent when doing client-install
  21. 11 May, 2004 3 commits
    • New event agent to control wireless links. At present, this agent is · 6cf05acb
      Leigh B. Stoller authored
      very specific to wireless links in general, and to iwconfig on Redhat
      9.0. It allows you to control the entire lan or an individual member
      of a wireless lan via the event system. For example to change the
      accesspoint of a wireless lan, you could do this:
      
      	tevc -e foo/bar now lan0 modify accesspoint=00:09:5B:93:0B:A4
      
      The agent deciphers the event arguments and calls iwconfig with the
      appropriate arguments as needed. Note that there are many ways to make the lan
      unusable doing this, so you want to be careful. You can get the MAC
      addresses from the experiment info page (tbreport).
      
      New script called link_config (which might be badly named since it
      implies generality) to front end tevc. Operates mostly like
      delay_config in that it will change the physical table settings, and
      optionally (-m) the virtual table entries. So,
      
      	link_config testbed two-wireless lan0 accesspoint=00:09:5B:93:0B:A4
      
      You can change individual members of a lan too:
      
      	link_config -s nodew1 testbed two-wireless lan0 txpower=50
      
      Currently no web interface; too much work. I will add an xmlrpc
      interface though since that is easy to do.
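
The step where the agent turns event arguments into iwconfig calls might look roughly like this. It is a sketch only: the mapping from the argument names used above to iwconfig's `ap` and `txpower` keywords is an assumption, as are the `iwconfig_commands` name and the interface name in the usage note.

```python
# Assumed mapping from event argument names (as used in this commit message)
# to iwconfig parameter keywords.
IWCONFIG_KEYWORDS = {"accesspoint": "ap", "txpower": "txpower"}


def iwconfig_commands(iface, event_args):
    """Turn 'key=value key=value' into a list of iwconfig argv lists."""
    commands = []
    for word in event_args.split():
        key, _, value = word.partition("=")
        keyword = IWCONFIG_KEYWORDS.get(key.lower())
        if keyword is None or not value:
            raise ValueError(f"unsupported setting: {word!r}")
        commands.append(["iwconfig", iface, keyword, value])
    return commands
```

For example, `iwconfig_commands("eth2", "txpower=50")` builds a single argv list that could be handed to a process launcher; rejecting unknown keys is one way to limit the risk of making the lan unusable.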
    • Makefile hacks that allow stuff to build on Redhat 9.0 with the screwy · d1f572a3
      Leigh B. Stoller authored
      ssh libraries that want kerberos.
  22. 29 Apr, 2004 3 commits
  23. 28 Apr, 2004 1 commit
  24. 26 Apr, 2004 1 commit
    • Cleanup Makefiles: · 297019fb
      Mike Hibler authored
      1. "make clean" will just remove stuff built in the process of a regular build
      2. "make distclean" will also clean out configure generated files.
      
      This is how it was always supposed to be, there was just some bitrot.
  25. 22 Apr, 2004 3 commits