    The bulk of the event system changes. · 9aa6b5ca
    Leigh B. Stoller authored
    * The per-experiment event scheduler now runs on ops instead of boss.
      Boss still runs elvind and uses events internally, but the user part
      of the event system has moved.
    * Part of the guts of eventsys_control moved to a new script,
      eventsys.proxy, which runs on ops and fires off the event scheduler. The
      only tricky part is that the scheduler runs as the user, but killing it
      has to be done as root, since a different person might swap out the
      experiment. So the proxy is a perl wrapper invoked via a root ssh from
      boss; it forks, writes the pid file to
      /var/run/emulab/evsched/$pid_$eid.pid, then flips to the user and execs
      the event scheduler (which is careful not to fork, so the recorded pid
      stays valid). Because the kill is done as root, the pid file has to be
      stored someplace the user is not allowed to write; otherwise the user
      could point the root kill at an arbitrary process.
    * The event scheduler has been rewritten to use Tim's C++ interface to the
      sshxmlrpc server on boss. Actually, I reorg'ed the scheduler so that it
      can be built either as a mysql client or as an RPC client. Note that it
      can also be built to use the SSL version of the XMLRPC server, but that
      will not go live until I finish the server stuff up. Also includes some
      goo for dealing with building the scheduler with C++.
    * Changes to several makefiles to install the ops binaries over NFS to
      /usr/testbed/opsdir. Makes life easier, but only if boss and ops are
      running the same OS. For now, the event scheduler is statically linked
      until ops is upgraded to the same rev as boss.
    * All of the event clients got little tweaks for dealing with the new CNAME
      for the event system server (event-server). Will need to build new images
      at some point. Old images and clients will continue to work because of an
      inetd hack on boss that uses netcat to transparently redirect elvind
      connections to ops.
    * Note that eventdebug needs some explaining. In order to make the inetd
      redirect work, elvind cannot be listening on the standard port. So, the
      boss event system uses an alternate port since there are just a few
      subsystems on boss that use the server, and it's easy to propagate changes
      on boss. Anyway, the default for eventdebug is to connect to the standard
      port on localhost, which means it will work as expected on ops, but will
      require the -b argument on boss.
    * Linktest changes were slightly more involved. Linktest no longer runs on
      boss when called from the experiment swapin path; instead, boss sshes
      over to ops to fire it off. This is done as the user, of course, and
      forcing allocation of a tty makes it possible to kill a running linktest
      and its ssh when experiment swapin is canceled (or from the command
      line). I will probably revisit this at some point, but I did not want to
      spend a bunch of time on linktest.
    * The upgrade path detailed in doc/UPDATING is necessarily complicated and
      bound to cause consternation at remote sites doing an upgrade.
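
    As a rough illustration of the eventsys.proxy launch sequence described
    above (fork, write the pid file while still privileged, flip to the user,
    exec without forking again), here is a minimal Python sketch. The real
    proxy is a perl wrapper; the function names, the uid/gid parameters, and
    the reading of $pid/$eid as project and experiment IDs are assumptions
    made for illustration, not the committed code.

```python
import os


def pidfile_path(pid, eid, run_dir="/var/run/emulab/evsched"):
    """Build the pid file path in the $pid_$eid.pid form from the text.

    Assumption: pid and eid are the project and experiment IDs.
    """
    return os.path.join(run_dir, f"{pid}_{eid}.pid")


def spawn_with_pidfile(argv, pidfile, uid=None, gid=None):
    """Fork, record the child's process ID, drop privileges, then exec.

    Hypothetical helper. The parent gets the child's process ID back.
    The child never returns: it either becomes argv or exits with 127.
    """
    child = os.fork()
    if child != 0:
        return child  # parent: nothing more to do here

    try:
        # Still running with the caller's (root) privileges, so the file
        # can live in a directory the user is not allowed to write.
        with open(pidfile, "w") as fh:
            fh.write(f"{os.getpid()}\n")

        # Flip to the user. Groups must be dropped before the uid,
        # because setgid() is no longer permitted after setuid().
        if gid is not None:
            os.setgroups([gid])
            os.setgid(gid)
        if uid is not None:
            os.setuid(uid)

        # exec replaces this process in place, so the process ID written
        # above stays valid as long as the program itself does not fork.
        os.execvp(argv[0], argv)
    except Exception:
        pass
    os._exit(127)  # only reached if something above failed
```

    Writing the file before the setuid() call is the point of the ordering:
    afterwards the process no longer has permission to create files in a
    root-owned directory. The parent returns right after the fork, so a
    caller that needs the pid file immediately would have to wait for it.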