1. 03 Sep, 2008 1 commit
  2. 16 Aug, 2007 1 commit
  3. 23 Apr, 2007 1 commit
  4. 23 Mar, 2007 1 commit
  5. 07 Mar, 2005 1 commit
    • Timothy Stack's avatar
      · 898cf9a2
      Timothy Stack authored
      Checkin some changes related to experiment automation and vnode feedback:
      
      	* configure, configure.in: Add sensors/canaryd/feedbacklogs
      	template.
      
      	* db/libdb.pm.in, db/xmlconvert.in: Add "virt_user_environment"
      	table that holds environment variable names and values.
      
      	* event/lib/event.c: Allocate memory of the right size for
      	event_notifications.
      
      	* event/program-agent/GNUmakefile.in: Add version.c file and
      	add install targets for the man page.
      
      	* event/program-agent/program-agent.8: Man page describing the
      	program-agent daemon.
      
      	* event/program-agent/program-agent.c: Add a bunch of convenience
      	features: let the user specify the working directory for commands;
      	save output to separate files on every invocation of an agent; let
      	the user specify a timeout for a command; make the set of
      	environment variables sane and add vars given in the NS file in
      	the opt array; a "status" file containing process information is
      	written out when children are collected.  Internal changes: child
      	processes are collected immediately, instead of waiting for the
      	next START event, so we can send back COMPLETE events; the daemon
      	now runs with a real-time priority, to increase the chances of
      	receiving events.
      
      	* event/proxy/evproxy.c: Made it bidirectional so the
      	program-agent's COMPLETE events make it back to the scheduler.
      
      	* event/sched/error-record.c: Change the default log directory.
      
      	* event/sched/event-sched.h, event/sched/event-sched.c: Setup an
      	environment similar to a program-agent to run the user's log
      	digester.
      
      	* event/sched/node-agent.cc: Add a handler for the SNAPSHOT event
      	that runs create_image for the node.
      
      	* event/sched/simulator-agent.h, event/sched/simulator-agent.cc:
      	Let the user specify a "DIGESTER" script that digests the log
      	files into a summary of the results.  Add event handler for
      	remapping a vnode experiment.
      
      	* event/sched/timeline-agent.c: Accept the RUN event as well as
      	the START event.
      
      	* os/GNUmakefile.in: Install the install-tarfile.1 man page.
      
      	* os/install-tarfile: Automatically chown/chgrp any files that do
      	not have valid user or group IDs, the new owner will be the user
      	that swapped in the experiment.  Include the install directory in
      	the DB file.  Add a "list" mode that just dumps what files have
      	been installed and where.  Add a "force" option so the user can
      	forcefully install the file, even though the DB says its already
      	there.
      
      	* os/install-tarfile.1: Man page describing the install-tarfile
      	tool.
      
      	* os/syncd/GNUmakefile.in: Install man pages on ops.
      
      	* sensors/canaryd/GNUmakefile.in: Link canaryd statically and
      	install "feedbacklogs" tool.
      
      	* sensors/canaryd/canaryd.c: Dump dummynet pipe data.
      
      	* sensors/canaryd/canarydEvents.c: Log errors.
      
      	* sensors/canaryd/feedbacklogs.in: Tool used to generate feedback
      	data from canaryd log files.
      
      	* sensors/slothd/GNUmakefile.in: Install digest-slothd on ops.
      
      	* sensors/slothd/digest-slothd: Fix some bugs and write out an
      	"alert" file with all the nodes/links that were overloaded.
      
      	* tbsetup/os_load.in, tbsetup/libosload.pm.in: Add "waitmode"
      	argument that lets you specify that you want to wait for the disk
      	to finish loading and/or wait for the node to come back up in the
      	new OS.
      
      	* tbsetup/power.in: Remove debugging printf.
      
      	* tbsetup/ns2ir/node.tcl, tbsetup/ns2ir/program.tcl,
      	tbsetup/ns2ir/sequence.tcl, tbsetup/ns2ir/sim.tcl.in: Fix some
      	quoting problems with event-sequences.  Add -expected-exit-code
      	and -tag options to the "$program run" event.  Add -digester to
      	the "$ns report" event that lets the user specify a program to run
      	to digest the log files.
      
      	* tbsetup/ns2ir/tb_compat.tcl.in: Change the initial scaling
      	factor for feedback nodes to 1%, instead of 100%.
      
      	* tmcd/tmcd.c, tmcd/common/libtmcc.pm: Add "userenv" command that
      	returns the values in "virt_user_environment".  Return new program
      	agent fields: dir, timeout, and expected_exit_code.
      
      	* tmcd/common/GNUmakefile.in: Install rc.canaryd.
      
      	* tmcd/common/bootvnodes: Add hack to boost the program-agents to
      	a real-time priority, they can't do it from inside the jail.
      
      	* tmcd/common/rc.canaryd: Rc script for canaryd.
      
      	* tmcd/common/watchdog: Don't fail outright if there is a bad line
      	in the battery.log
      
      	* tmcd/common/rc.progagent: Append "userenv" data to the
      	program-agent config file.
      
      	* utils/GNUmakefile.in: Install loghole and its man page on ops.
      
      	* utils/loghole.1: Document "clean" command and the change in
      	loghole directories.
      
      	* utils/loghole.in: Add "clean" command and parallelization.
      
      	* xmlrpc/emulabserver.py.in: Add "virt_user_environment" table.
      	Order the eventlist by "idx" and time, needed for sequences.  And
      	removed unnecessary nologin checks.
      898cf9a2
  6. 30 Aug, 2004 1 commit
    • Leigh Stoller's avatar
      The bulk of the event system changes. · 9aa6b5ca
      Leigh Stoller authored
      * The per-experiment event scheduler now runs on ops instead of boss.
        Boss still runs elvind and uses events internally, but the user part
        of the event system has moved.
      
      * Part of the guts of eventsys_control moved to new script, eventsys.proxy,
        which runs on ops and fires off the event scheduler. The only tricky part
        of this is that the scheduler runs as the user, but killing it has to be
        done as root since a different person might swap out the experiment. So,
        the proxy is a perl wrapper invoked from a root ssh from boss, which
        forks, writes the pid file into /var/run/emulab/evsched/$pid_$eid.pid,
        then flips to the user and execs the event scheduler (which is careful
        not to fork). Obviously, if the kill is done as root, the pid file has to
        be stored someplace the user is not allowed to write.
      
      * The event scheduler has been rewritten to use Tim's C++ interface to the
        sshxmlrpc server on boss. Actually, I reorg'ed the scheduler so that it
        can be built either as a mysql client, or as RPC client. Note that it can
        also be built to use the SSL version of the XMLRPC server, but that will
        not go live until I finish the server stuff up. Also some goo for dealing
        with building the scheduler with C++.
      
      * Changes to several makefiles to install the ops binaries over NFS to
        /usr/testbed/opsdir. Makes life easier, but only if boss and ops are
        running the same OS. For now, using static linking on the event scheduler
        until ops upgraded to same rev as boss.
      
      * All of the event clients got little tweaks for dealing with the new CNAME
        for the event system server (event-sever). Will need to build new images
        at some point. Old images and clients will continue to work cause of an
        inetd hack on boss that uses netcat to transparently redirect elvind
        connections to ops.
      
      * Note that eventdebug needs some explaining. In order to make the inetd
        redirect work, elvind cannot be listening on the standard port. So, the
        boss event system uses an alternate port since there are just a few
        subsystems on boss that use the server, and its easy to propogate changes
        on boss. Anyway, the default for eventdebug is to connect to the standard
        port on localhost, which means it will work as expected on ops, but will
        require -b argument on boss.
      
      * Linktest changes were slightly more involved. No longer run linktest on
        boss when called from the experiment swapin path, but ssh over to ops to
        fire it off. This is done as the user of course, and there are some
        tricks to make it possible to kill a running linktest and its ssh when
        experiment swapin is canceled (or from the command line) by forcing
        allocation of a tty. I will probably revisit this at some point, but I
        did not want to spend a bunch of time on linktest.
      
      * The upgrade path detailed in doc/UPDATING is necessarily complicated and
        bound to cause consternation at remote sites doing an upgrade.
      9aa6b5ca
  7. 08 Mar, 2004 1 commit
  8. 05 Jun, 2003 1 commit
    • Leigh Stoller's avatar
      New event proxy. This proxy is used in lieu of Elvin clustering or · b5d82850
      Leigh Stoller authored
      federation, which is not supported in the version we have source to.
      Basically, we run an elvind on each node. The proxy on each node
      subscribes to all events for that node from the boss elvind, and hands
      them to the local elvind, Each client on the node subscribes to the
      local elvind, and gets its events via the proxy. This should reduce
      the number of connections to boss, and makes it possible to run agents
      inside each virtual node without an FD explosion on boss.
      b5d82850