1. 07 Mar, 2005 1 commit
    • Timothy Stack's avatar
      · 898cf9a2
      Timothy Stack authored
      Checkin some changes related to experiment automation and vnode feedback:
      
      	* configure, configure.in: Add sensors/canaryd/feedbacklogs
      	template.
      
      	* db/libdb.pm.in, db/xmlconvert.in: Add "virt_user_environment"
      	table that holds environment variable names and values.
      
      	* event/lib/event.c: Allocate memory of the right size for
      	event_notifications.
      
      	* event/program-agent/GNUmakefile.in: Add version.c file and
      	add install targets for the man page.
      
      	* event/program-agent/program-agent.8: Man page describing the
      	program-agent daemon.
      
      	* event/program-agent/program-agent.c: Add a bunch of convenience
      	features: let the user specify the working directory for commands;
      	save output to separate files on every invocation of an agent; let
      	the user specify a timeout for a command; make the set of
      	environment variables sane and add vars given in the NS file in
      	the opt array; a "status" file containing process information is
      	written out when children are collected.  Internal changes: child
      	processes are collected immediately, instead of waiting for the
      	next START event, so we can send back COMPLETE events; the daemon
      	now runs with a real-time priority, to increase the chances of
      	receiving events.
      
      	* event/proxy/evproxy.c: Made it bidirectional so the
      	program-agent's COMPLETE events make it back to the scheduler.
      
      	* event/sched/error-record.c: Change the default log directory.
      
      	* event/sched/event-sched.h, event/sched/event-sched.c: Setup an
      	environment similar to a program-agent to run the user's log
      	digester.
      
      	* event/sched/node-agent.cc: Add a handler for the SNAPSHOT event
      	that runs create_image for the node.
      
      	* event/sched/simulator-agent.h, event/sched/simulator-agent.cc:
      	Let the user specify a "DIGESTER" script that digests the log
      	files into a summary of the results.  Add event handler for
      	remapping a vnode experiment.
      
      	* event/sched/timeline-agent.c: Accept the RUN event as well as
      	the START event.
      
      	* os/GNUmakefile.in: Install the install-tarfile.1 man page.
      
      	* os/install-tarfile: Automatically chown/chgrp any files that do
      	not have valid user or group IDs, the new owner will be the user
      	that swapped in the experiment.  Include the install directory in
      	the DB file.  Add a "list" mode that just dumps what files have
      	been installed and where.  Add a "force" option so the user can
      	forcefully install the file, even though the DB says its already
      	there.
      
      	* os/install-tarfile.1: Man page describing the install-tarfile
      	tool.
      
      	* os/syncd/GNUmakefile.in: Install man pages on ops.
      
      	* sensors/canaryd/GNUmakefile.in: Link canaryd statically and
      	install "feedbacklogs" tool.
      
      	* sensors/canaryd/canaryd.c: Dump dummynet pipe data.
      
      	* sensors/canaryd/canarydEvents.c: Log errors.
      
      	* sensors/canaryd/feedbacklogs.in: Tool used to generate feedback
      	data from canaryd log files.
      
      	* sensors/slothd/GNUmakefile.in: Install digest-slothd on ops.
      
      	* sensors/slothd/digest-slothd: Fix some bugs and write out an
      	"alert" file with all the nodes/links that were overloaded.
      
      	* tbsetup/os_load.in, tbsetup/libosload.pm.in: Add "waitmode"
      	argument that lets you specify that you want to wait for the disk
      	to finish loading and/or wait for the node to come back up in the
      	new OS.
      
      	* tbsetup/power.in: Remove debugging printf.
      
      	* tbsetup/ns2ir/node.tcl, tbsetup/ns2ir/program.tcl,
      	tbsetup/ns2ir/sequence.tcl, tbsetup/ns2ir/sim.tcl.in: Fix some
      	quoting problems with event-sequences.  Add -expected-exit-code
      	and -tag options to the "$program run" event.  Add -digester to
      	the "$ns report" event that lets the user specify a program to run
      	to digest the log files.
      
      	* tbsetup/ns2ir/tb_compat.tcl.in: Change the initial scaling
      	factor for feedback nodes to 1%, instead of 100%.
      
      	* tmcd/tmcd.c, tmcd/common/libtmcc.pm: Add "userenv" command that
      	returns the values in "virt_user_environment".  Return new program
      	agent fields: dir, timeout, and expected_exit_code.
      
      	* tmcd/common/GNUmakefile.in: Install rc.canaryd.
      
      	* tmcd/common/bootvnodes: Add hack to boost the program-agents to
      	a real-time priority, they can't do it from inside the jail.
      
      	* tmcd/common/rc.canaryd: Rc script for canaryd.
      
      	* tmcd/common/watchdog: Don't fail outright if there is a bad line
      	in the battery.log
      
      	* tmcd/common/rc.progagent: Append "userenv" data to the
      	program-agent config file.
      
      	* utils/GNUmakefile.in: Install loghole and its man page on ops.
      
      	* utils/loghole.1: Document "clean" command and the change in
      	loghole directories.
      
      	* utils/loghole.in: Add "clean" command and parallelization.
      
      	* xmlrpc/emulabserver.py.in: Add "virt_user_environment" table.
      	Order the eventlist by "idx" and time, needed for sequences.  And
      	removed unnecessary nologin checks.
      898cf9a2
  2. 14 Jan, 2005 1 commit
    • Timothy Stack's avatar
      · dee46d59
      Timothy Stack authored
      Cross compilation fixes for the stargates, 'gmake client' should now
      build, link, and install properly.  Haven't really tried to run stuff though.
      
      	* GNUmakerules: Add target for stripping executables, used instead
      	of "install -s" since that doesn't work for cross-compiling.
      
      	* Makeconf.in: Add ELVIN_CONFIG variable that refers to
      	'elvin-config'.
      
      	* configure, configure.in: Detect and save the elvin-config path
      	since we need a different one for cross-compiling.
      
      	* event/lib/GNUmakefile.in, event/link-agent/GNUmakefile.in,
      	event/linktest/GNUmakefile.in, event/program-agent/GNUmakefile.in,
      	event/proxy/GNUmakefile.in, event/tbgen/GNUmakefile.in,
      	event/trafgen/GNUmakefile.in, os/dijkstra/GNUmakefile.in,
      	os/syncd/GNUmakefile.in, sensors/slothd/GNUmakefile.in,
      	tmcd/GNUmakefile.in, tmcd/linux/GNUmakefile.in: Cross compilation
      	fixes, don't statically link on arm, create "foo-debug"
      	executables with debugging info and install separately stripped
      	ones instead of passing "-s" to install.
      dee46d59
  3. 03 Jan, 2005 1 commit
  4. 30 Aug, 2004 1 commit
    • Leigh B. Stoller's avatar
      The bulk of the event system changes. · 9aa6b5ca
      Leigh B. Stoller authored
      * The per-experiment event scheduler now runs on ops instead of boss.
        Boss still runs elvind and uses events internally, but the user part
        of the event system has moved.
      
      * Part of the guts of eventsys_control moved to new script, eventsys.proxy,
        which runs on ops and fires off the event scheduler. The only tricky part
        of this is that the scheduler runs as the user, but killing it has to be
        done as root since a different person might swap out the experiment. So,
        the proxy is a perl wrapper invoked from a root ssh from boss, which
        forks, writes the pid file into /var/run/emulab/evsched/$pid_$eid.pid,
        then flips to the user and execs the event scheduler (which is careful
        not to fork). Obviously, if the kill is done as root, the pid file has to
        be stored someplace the user is not allowed to write.
      
      * The event scheduler has been rewritten to use Tim's C++ interface to the
        sshxmlrpc server on boss. Actually, I reorg'ed the scheduler so that it
        can be built either as a mysql client, or as RPC client. Note that it can
        also be built to use the SSL version of the XMLRPC server, but that will
        not go live until I finish the server stuff up. Also some goo for dealing
        with building the scheduler with C++.
      
      * Changes to several makefiles to install the ops binaries over NFS to
        /usr/testbed/opsdir. Makes life easier, but only if boss and ops are
        running the same OS. For now, using static linking on the event scheduler
        until ops upgraded to same rev as boss.
      
      * All of the event clients got little tweaks for dealing with the new CNAME
        for the event system server (event-sever). Will need to build new images
        at some point. Old images and clients will continue to work cause of an
        inetd hack on boss that uses netcat to transparently redirect elvind
        connections to ops.
      
      * Note that eventdebug needs some explaining. In order to make the inetd
        redirect work, elvind cannot be listening on the standard port. So, the
        boss event system uses an alternate port since there are just a few
        subsystems on boss that use the server, and its easy to propogate changes
        on boss. Anyway, the default for eventdebug is to connect to the standard
        port on localhost, which means it will work as expected on ops, but will
        require -b argument on boss.
      
      * Linktest changes were slightly more involved. No longer run linktest on
        boss when called from the experiment swapin path, but ssh over to ops to
        fire it off. This is done as the user of course, and there are some
        tricks to make it possible to kill a running linktest and its ssh when
        experiment swapin is canceled (or from the command line) by forcing
        allocation of a tty. I will probably revisit this at some point, but I
        did not want to spend a bunch of time on linktest.
      
      * The upgrade path detailed in doc/UPDATING is necessarily complicated and
        bound to cause consternation at remote sites doing an upgrade.
      9aa6b5ca
  5. 24 Jun, 2004 2 commits
    • Mike Hibler's avatar
      Improve the client-side install. With these changes, it should now be · 976133e4
      Mike Hibler authored
      possible to:
      
      	gmake client
      	sudo gmake client-install
      
      on a FBSD4, FBSD5, RHL7.3, and RHL9.0 client node.
      
      There are still some dependencies that are not explicit and which would
      prevent a build/install from working on a "clean" OS.  Two that I know of are:
      you must install our version of the elvin libraries and you must install boost.
      976133e4
    • Mike Hibler's avatar
      Minor lint for GCC 3.3 · a36ccc7b
      Mike Hibler authored
      (early stages of getting Emulab software to build under FreeBSD 5)
      a36ccc7b
  6. 11 May, 2004 1 commit
  7. 20 Apr, 2004 1 commit
    • Mike Hibler's avatar
      Improve the client-install. You can now do a "make client-install" from · 361ee691
      Mike Hibler authored
      the top level.  This will build all the necessary binaries and then install
      them.  This works on FBSD4 and RHL7.3.  It still doesn't work on FBSD5
      (newer compiler that no longer supports a style of use of _FUNCTION_ in the
      event lib) or RHL9 (event lib needs SSL lib which has a bad dependency
      on Kerberos).  Notes:
      
      - requires that elvin libraries be installed on nodes (they are) to build
        event agents, requires linuxthreads be installed on FBSD (it is now) to
        build imagezip (which is installed, but is not strictly necessary)
      
      - installed event-agents and other binaries are stripped
      
      - added a few missing files to the source tree for bsd (healthd.conf)
        and linux (healthd.conf, rc.local)
      
      - the only thing that doesn't get rebuilt in /usr/local/etc/emulab is
        healthd, I couldn't quickly find how it gets built
      
      - uses a scaled down version of libtb with no DB functions (since mysql
        isn't installed on nodes).  N.B. DO NOT DO A CLIENT INSTALL FROM YOUR
        REGULAR OBJ TREE OR ELSE YOU MAY WIND UP WITH A NEUTERED VERSION OF
        libtb.a!
      
      The build-as-well-as-install semantics are counter to the regular install
      targets, but this is what we gotta do for now.  Once the TB source builds
      under Linux and newer BSDs, we could undo this and just require that people
      do a regular "make" followed by "make client-install"  OTOH, there should
      be no reason to require installation of mysql and other server-side packages
      just to build clients (or make them sit through the compilation of assign),
      so maybe we will keep the client build special.
      361ee691
  8. 08 Mar, 2004 1 commit
  9. 06 Nov, 2003 1 commit
  10. 05 Nov, 2003 1 commit
    • Leigh B. Stoller's avatar
      Middle part of the event system changes. The main part of this change · 54bc15c4
      Leigh B. Stoller authored
      is to add HMACs to events to ensure they that events cannot be
      injected into an experiment by an unauthorized client.
      
      * The frontend now generates a secret key for each experiment and
        stores that into a file and in the DB.
      
      * Each of the event clients, as well as the event producers
        (scheduler, tevc) have a new -k option to specify the name of the
        file. Two new event library functions were added for clients to give
        the key:
      
          event_handle_t
          event_register_withkeyfile(char *name, int threaded, char *keyfile);
      
          event_handle_t
          event_register_withkeydata(char *name, int threaded,
      	   		       unsigned char *keydata, int keylen);
      
      * When the library is in possesion of a key, it will generate an HMAC
        and attach it to outgoing notifications. A client receiving a
        notification will compute an HMAC and compare it against the HMAC in
        the notification. If they do not compare, the notification is
        dropped with a warning message printed (the client callback never
        gets the notification). If the client has not provided a key, then
        the HMAC in the incoming notification is ignored.
      
      * The scheduler also takes a -k option, and will compute HMACs for all
        of the static events ahead of time. That keeps it off the critical
        path.
      
      * The tevc client also takes a -k option. However, tevc will always
        try to find the keyfile (default path) so that it can attach the
        HMAC to dynamic events before sending them to the scheduler (which
        will check to make sure it matches). The scheduler will not accept
        dynamic events without unless the HMAC is present and matches.
      
      * I have rebuilt the elvin librarys, removing all of the X goop and
        the SSL goop. Smaller binaries. So, I had to add -lcrypto to all of
        the client makefiles to that programs link.
      
      * The program-agent got a few more changes. The command string is no
        longer passed inside the event; it comes in when the program agent
        is started, via a config file generated from tmcd data. This gets
        rid of our mostly insecure remote execution facility.
      54bc15c4
  11. 05 Jun, 2003 1 commit
    • Leigh B. Stoller's avatar
      New event proxy. This proxy is used in lieu of Elvin clustering or · b5d82850
      Leigh B. Stoller authored
      federation, which is not supported in the version we have source to.
      Basically, we run an elvind on each node. The proxy on each node
      subscribes to all events for that node from the boss elvind, and hands
      them to the local elvind, Each client on the node subscribes to the
      local elvind, and gets its events via the proxy. This should reduce
      the number of connections to boss, and makes it possible to run agents
      inside each virtual node without an FD explosion on boss.
      b5d82850