- 03 Aug, 2006 1 commit
-
-
Leigh B. Stoller authored
into per-experiment databases on ops. Additional support for reconstituting those databases back into temporary databases on ops, for post processing.

* This revision relies on the "snort" port (/usr/ports/security/snort) to read the pcap files and load them into a database. The schema is probably not ideal, but it's better than nothing. See the file ops:/usr/local/share/examples/snort/create_mysql for the schema.

* For simplicity, I have hooked into loghole, which already had all the code for downloading the trace data. I added some new methods to the XMLRPC server for loghole to use, to get the user's DB password and the name of the per-experiment database. There is a new slot in the traces table that indicates that the trace should be snorted to its DB. In case you forgot, at the end of a run or when the instance is swapped out, loghole is run to download the trace data.

* For reconstituting, there are lots of additions to opsdb_control and opsdb_control.proxy to create "temporary" databases and load them from a dump file that is stored in the archive. I've added a button to the Template Record page, inappropriately called "Analyze" since right now all it does is reconstitute the trace data into a DB on ops. Currently, the only indication of what has been done (the name of the DBs created on ops) is the log email that the user gets. A future project is to tell the user this info in the web interface.

* To turn on database capturing of trace data, do this in your NS file:

      set link0 ...
      $link0 trace
      $link0 trace_snaplen 128
      $link0 trace_db 1

  The increase in snaplen is optional, but a good idea if you want snort to understand more than just IP headers.

* Also some changes to the parser to allow plain experiments to take advantage of all this stuff. To simply get yourself a per-experiment DB, put this in your NS file:

      tb-set-dpdb 1

  However, any time you turn trace_db on for a link or lan, you automatically get a per-experiment DB.

* To capture the trace data to the DB, you can run loghole by hand:

      loghole sync -s

  The -s option turns on the "post-process" phase of loghole.
-
- 28 Jul, 2006 1 commit
-
-
Leigh B. Stoller authored
create a new template (well, really a modify) from the current swapped in experiment. This allows you to create a template, swap in an instance, modify the datastore in the instance (which is a copy of the datastore in the template), and then create a new template using the datastore and nsfile from the instance. This is a new menu item on the showexp page for the instance. Also in this commit are fixes and improvements to the new navigation bar that I recently added.
-
- 26 Jul, 2006 1 commit
-
-
Russ Fish authored
-
- 11 Jul, 2006 1 commit
-
-
Mike Hibler authored
Technically this binary should be installed on the FS server and we should have an "fsdir" link in case fs/ops are different. But for now...
-
- 21 Jun, 2006 1 commit
-
-
Leigh B. Stoller authored
-
- 30 May, 2006 1 commit
-
-
Leigh B. Stoller authored
Record page lets you export the contents of the archive that corresponds to that record, along with an XML file that describes the various DB bits for the template and instance. This is just a first cut so that Mike can start playing around. Subject to change, I'm sure.

The archive is dumped to /proj/$pid/exports/$guid/$vers/$exptidx, which is basically the last commit of the instance when it was terminated. The xml file is called export.xml and is placed in the top level directory of the above directory. The file is created with XML::Simple, and a typical XML file might look like:

    <instance>
      <bindings>
        <name>NodeCount</name>
        <description>Number of nodes!</description>
        <value>1</value>
      </bindings>
      <bindings>
        <name>OS</name>
        <description></description>
        <value>RHL90-STD</value>
      </bindings>
      <bindings>
        <name>ScriptArgs</name>
        <description></description>
        <value>-b</value>
      </bindings>
      <eid>NewOne-V2</eid>
      <guid>10149/2</guid>
      <metadata>
        <name>M1</name>
        <guid>10162/1</guid>
        <value>Some metadata</value>
      </metadata>
      <pid>testbed</pid>
      <runs>
        <name>1</name>
        <archive_tag>T20060526-082533-172_endexp</archive_tag>
        <description></description>
        <exptidx>110</exptidx>
        <idx>1</idx>
        <runid>NewOne-V2</runid>
        <start_time>2006-05-26 08:23:02</start_time>
        <stop_time>2006-05-26 08:25:16</stop_time>
      </runs>
      <uid>stoller</uid>
    </instance>
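
A consumer of export.xml could read the template bindings back out along these lines. This is only a minimal Python sketch that assumes the layout shown above; the helper name read_bindings and the abbreviated inline sample are illustrative and not part of the commit.

```python
import xml.etree.ElementTree as ET

# Abbreviated sample in the layout shown above (illustrative input only).
SAMPLE = """
<instance>
  <bindings>
    <name>NodeCount</name>
    <description>Number of nodes!</description>
    <value>1</value>
  </bindings>
  <bindings>
    <name>OS</name>
    <description></description>
    <value>RHL90-STD</value>
  </bindings>
  <eid>NewOne-V2</eid>
  <guid>10149/2</guid>
  <pid>testbed</pid>
</instance>
"""


def read_bindings(xml_text):
    """Return {binding name: value} for every <bindings> element."""
    root = ET.fromstring(xml_text)
    return {
        b.findtext("name"): b.findtext("value", default="")
        for b in root.findall("bindings")
    }


bindings = read_bindings(SAMPLE)
print(bindings)
```

Values come back as strings, so a caller that wants NodeCount as a number has to convert it itself.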
-
- 15 May, 2006 1 commit
-
-
Mike Hibler authored
    tb-set-node-plab-role $plc plc

to make it the PLC node. Then any number of other nodes are declared as:

    tb-set-node-plab-role $plab1 node

to make them inner plab nodes. Unlike elabinelab, there is no magic "tb-plab-in-elab" command which implies the topology; you put all the plab nodes in a LAN or whatever yourself. This may or may not be a good idea.

Anyway, these NS commands set DB state in virt_nodes and reserved much like elabinelab. During swapin, the dhcpd.conf file is rewritten so that inner plab nodes have their "filename" set to "pxelinux.0" and their "next-server" set to the designated PLC node. The PLC node will then be loaded/booted before anything is done to the inner-plab nodes. After it comes up, the inner plab nodes are rebooted and declared as up.

There is a new tmcd command "eplabconfig" (suggestions for a new name welcome!), which returns info like:

    NAME=plc ROLE=plc IP=155.98.36.3 MAC=00d0b713f57d
    NAME=plab1 ROLE=node IP=155.98.36.10 MAC=0002b3877a4f
    NAME=plab2 ROLE=node IP=155.98.36.34 MAC=00d0b7141057

to just the PLC node (returns nothing to any other node).

The implications of this setup are:

* The PLC node must act as a TFTP server as we have discussed in the past. The TMCC info above is hopefully enough to configure pxelinux; if not, we can change it.

* The PLC node is responsible for loading the disks of inner plab nodes. This is implied by the setup, where we change the dhcpd.conf file before doing anything to the inner nodes. Thus, once the inner nodes are rebooted, they will be talking pxelinux with PLC, and not to boss. This step is dubious, as we could no doubt load the disks faster than whatever plab uses can. But it simplified the setup (and is more realistic!). The alternative, which is something that might be useful anyway, is to introduce a "state" after which nodes have been reloaded but before they are rebooted. With that, we can reload the plab nodes and then change the dhcpd.conf file so when they reboot they start talking to the PLC.
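
On the PLC node, the eplabconfig reply could be turned into per-node records before generating a pxelinux or TFTP configuration. The Python sketch below assumes one node per line with space-separated KEY=VALUE fields, as in the sample reply above; parse_eplabconfig is a hypothetical helper, not something in the tree.

```python
def parse_eplabconfig(text):
    """Parse 'KEY=VALUE KEY=VALUE ...' lines into one dict per node."""
    nodes = []
    for line in text.splitlines():
        line = line.strip()
        if not line:
            continue  # skip blank lines
        nodes.append(dict(field.split("=", 1) for field in line.split()))
    return nodes


# Sample reply copied from the commit message above.
SAMPLE = """\
NAME=plc ROLE=plc IP=155.98.36.3 MAC=00d0b713f57d
NAME=plab1 ROLE=node IP=155.98.36.10 MAC=0002b3877a4f
NAME=plab2 ROLE=node IP=155.98.36.34 MAC=00d0b7141057
"""

nodes = parse_eplabconfig(SAMPLE)
plc = next(n for n in nodes if n["ROLE"] == "plc")
inner = [n["NAME"] for n in nodes if n["ROLE"] == "node"]
print(plc["IP"], inner)
```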
-
- 12 May, 2006 1 commit
-
-
Leigh B. Stoller authored
"object" and this was a good opportunity to see if they are useful and easy enough to use. Yep they are; the code is much cleaner with many fewer utility functions to get at stuff. I recommend this approach from now on. The problem is the php side, which ends up duplicating some stuff, but in the old style. This is not so bad for the template code since I have made it a point not to do anything but display functions in php; all modifications are handled in the backend.
-
- 05 May, 2006 1 commit
-
-
Leigh B. Stoller authored
-
- 30 Mar, 2006 1 commit
-
-
Leigh B. Stoller authored
now in place (create and display), and the backend part that deals with setting up the DB records. Nothing actually happens on the nodes yet.
-
- 28 Mar, 2006 1 commit
-
-
Leigh B. Stoller authored
you can turn it on in your devel tree by setting the $EXPOSETEMPLATES variable in www/defs.php3.in to 1. BE SURE NOT TO CHECK THAT CHANGE IN!
-
- 22 Feb, 2006 1 commit
-
-
Leigh B. Stoller authored
fixed.
-
- 07 Feb, 2006 1 commit
-
-
Leigh B. Stoller authored
This is fine for the purposes of getting Emulab software base updated.
-
- 26 Jan, 2006 1 commit
-
-
Kevin Atkinson authored
Merged in changes from tblog-2-branch:

Move parts of libtblog into libtblog_simple. libtblog_simple provides the basic logging functions but doesn't touch anything. Moreover, including libtblog_simple doesn't automatically start the logging subsystem. It also doesn't have testbed dependencies, which means 1) it can be used in the core testbed libraries (such as libdb, libtestbed) without introducing a circular dependency and 2) it can be used independently.

Reworked DBFatal and DBWarn to use tblog. It will still email testbed-ops, however.

Make use of the "cause" field to determine the cause of the bug. In particular, tblog_find_error will look at the value of this field and report the "cause". In the future, different actions can be taken based on the ultimate "cause" of the bug, such as whether testbed-ops should be notified.

Change format of the error message reported by libtblog, as per the email "Format or Error Messages" to testbed-dev.

Have libtblog use its own database handle to avoid problems with locked tables. Also set DBCONN_MAXTRIES to 3 for most important queries. For queries that are not important, don't send mail on error.
-
- 23 Jan, 2006 1 commit
-
-
Timothy Stack authored
Parse the NS file with the real NS parser so we can make sure linktest is doing the "right" thing.

* configure, configure.in: Add tbsetup/nsverify files.
* tbsetup/GNUmakefile.in: Add nsverify subdir.
* tbsetup/tbprerun.in: Run verify-ns on the experiment's NS file.
* tbsetup/ns2ir/nstb_compat.tcl: Bring up-to-date with the current world.
* tbsetup/nsverify/GNUmakefile.in: Makefile.
* tbsetup/nsverify/ns-2.27.patch: Patch file for NS version 2.27.
* tbsetup/nsverify/nstbparse.in: Wrapper for the NS parser.
* tbsetup/nsverify/tb_compat.tcl: Different version of tb_compat.tcl that is used to verify linktest parameters.
* tbsetup/nsverify/verify-ns.in: Script that runs on boss and verifies that the testbed parser worked correctly.
* tbsetup/ns2ir/parse-ns.in, tbsetup/ns2ir/parse.proxy.in: Tweaked a bit so parse.proxy can be used to run the regular NS parser in addition to the testbed one.
-
- 05 Jan, 2006 1 commit
-
-
Leigh B. Stoller authored
of the archive.
-
- 02 Jan, 2006 1 commit
-
-
Timothy Stack authored
First cut at a daemon that does regular checkups of the testbed hardware/software.

* configure, configure.in: Add tbsetup/checkup directory.
* db/audit.in: Add a listing of stuck checkups.
* install/boss-install.in: Add 'elabckup' user.
* rc.d/3.testbed.sh.in: Startup the checkup_daemon.
* sql/database-create.sql, sql/database-migrate.txt: Add the checkups tables.
* tbsetup/GNUmakefile.in: Descend into the checkup directory.
* tbsetup/checkup: The checkup daemon, man page, and associated scripts.
* tbsetup/ptopgen.in: Add a feature with a value of 0.9 to prereserved nodes to keep them from being allocated unless they're really wanted.
* utils/firstuser.in: Add some other options so the script can be used to create other pseudo users.
-
- 19 Dec, 2005 1 commit
-
-
Kevin Atkinson authored
Updates to Error Logging API Code.

You should start seeing much better error messages coming from my system. Errors coming from parse.proxy and assign (the two most frequent sources of errors) should now be concise and to the point. Errors coming from libosload/libreboot (the next most frequent source of errors) should now also be much better, but not perfect. Getting perfect errors will likely require a rework of how errors are handled in libosload/libreboot; just adding tberror/tbwarn/tbnotice calls is not enough. I can do this at a later date if necessary.

A few minor database changes. Some changes to the API. A few bug fixes. Lots of tberror/tbwarn/tbnotice added to scripts.

Since assign is a C program, and at this time my API is perl only, I wrote a second wrapper around assign, assign_wrapper2. When assign fails, errors are now parsed in assign_wrapper2, sent to stderr and logged. This means that RunAssign() just returns when assign fails rather than echoing some of the assign.log output and then quitting. The output to the activity log remains unchanged.

Since "parse.proxy" is run from ops I couldn't use my API in it, even though it is a perl program. Instead I parse the errors coming from it in parse-ns.
-
- 15 Dec, 2005 1 commit
-
-
Leigh B. Stoller authored
studly users in the testbed project on the mainsite.
-
- 28 Nov, 2005 1 commit
-
-
Timothy Stack authored
-
- 17 Nov, 2005 1 commit
-
-
Mike Hibler authored
* Add libadminmfs.pm with routines for entering/exiting, and executing commands in, the admin MFS. Node admin and firewall swapout (see below) now use this; the image creation process does not yet.

* Add swapout time hooks for running an admin mode process, likely to be used to collect swapout time state. Currently controlled globally by two new sitevars.

* Modified node_admin to use the library and added a "-c <command>" option to have nodes go into admin mode and run a command. I don't really expect this to be useful; it was just a testing vehicle for the library.

2. Improved the swapout process for firewalled experiments. Largely just generalized what we already did for paniced experiments. At swapout, firewalled nodes are:

   - powered off
   - set to boot into admin mode and run a disk zapper
   - powered on

   The swapout process then waits for all nodes to successfully complete disk zapping, at which point the nodes are nfree'ed as usual. Any failure of the above process marks the experiment as panic'ed (to ensure that we are involved in cleanup) and sends mail to testbed-ops describing the state of the nodes.

3. Added the aforementioned disk zapper, a little C program in the MFS which zeroes out the MBR and partition boot blocks (but not the MBR partition table or FS superblocks). This is added insurance that if a node somehow gets diverted after being nfree'd but before getting the disk reloaded (e.g., goes to hwdown), we cannot accidentally boot from the disk. This program gets installed in the admin MFS.

4. Related to firewalls, modified swapin to use the new documented "snmpit -N" to get the firewall VLAN number rather than parsing the output that was a side-effect of VLAN creation.
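
To illustrate what "zero the boot code but keep the partition table" means for the MBR sector, here is a small Python sketch. It is not the zapper itself (that is a C program in the MFS, and the exact bytes it touches are not stated here); it assumes the classic PC MBR layout of 446 bytes of boot code, four 16-byte partition entries, and a 2-byte signature, works on an in-memory copy of one sector, and does not cover the per-partition boot blocks. The name zap_boot_code is made up for the example.

```python
SECTOR = 512
BOOT_CODE_END = 446   # bytes 0..445: MBR boot code
# bytes 446..509: four 16-byte partition entries; bytes 510..511: signature


def zap_boot_code(sector):
    """Return a copy of a 512-byte MBR with only the boot code zeroed."""
    if len(sector) != SECTOR:
        raise ValueError("expected exactly one 512-byte sector")
    # Zero-filled boot code area, followed by the untouched table + signature.
    return bytes(BOOT_CODE_END) + bytes(sector[BOOT_CODE_END:])


# Fake MBR for demonstration: junk boot code, dummy table, 0x55AA signature.
mbr = bytes([0xEB] * BOOT_CODE_END) + bytes(range(64)) + b"\x55\xaa"
zapped = zap_boot_code(mbr)
print(zapped[:4], zapped[-2:])
```

With the boot code gone the BIOS has nothing to execute from that disk, while the partition layout is still readable for a later reload.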
-
- 04 Nov, 2005 1 commit
-
-
Kevin Atkinson authored
-
- 20 Oct, 2005 1 commit
-
-
Kirk Webb authored
New node_attributes facility and table.

Auxiliary node attributes, such as service tag #, BIOS version, etc., should now be placed into the node_attributes table. This can be accomplished by either using the node_attributes command line tool, or by using the modnodeattributes_form.php3 form (not linked in anywhere yet, but will be in a moment). Attribute names and values are checked for sanity using table_regex entries. Also note that I started with the nodecontrol stuff as a template.

The command line tool and web form (which simply calls the command line tool to actually do the modifications) can add, delete, and/or remove attributes.

Finally, note that the bios_version column has been moved from the nodes table to the node_attributes table. The Node Information page will show the list of current attributes at the bottom of the info table.
-
- 19 Sep, 2005 1 commit
-
-
Leigh B. Stoller authored
into a single new script called modgroups. Usage:

    modgroups [-a pid:gid:trust[,pid:gid:trust]...]
              [-m pid:gid:trust[,pid:gid:trust]...]
              [-r pid:gid[,pid:gid]...] user

So, -a to add groups, -r to remove groups, and -m to modify the trust value for a member of a group.

The reason for doing this is that previously, we had no idea in the backend what group changes actually happened; we just knew what the current groups are. This makes it hard to add and remove users from mailing lists, chat server buddy lists, etc. This is cleaner ...
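
The comma-separated pid:gid:trust lists in the usage line can be split as follows. This Python sketch only illustrates the argument format; modgroups itself is a backend script, and the function name, the validation, and the sample project/group/trust values are assumptions made for the example.

```python
def parse_memberships(arg, with_trust=True):
    """Split 'pid:gid:trust,...' (or 'pid:gid,...' for -r) into tuples."""
    width = 3 if with_trust else 2
    result = []
    for item in arg.split(","):
        parts = item.split(":")
        # Every field must be present and non-empty.
        if len(parts) != width or not all(parts):
            raise ValueError(f"malformed group spec: {item!r}")
        result.append(tuple(parts))
    return result


# Illustrative values for -a and -r arguments.
adds = parse_memberships("testbed:testbed:local_root,emulab:dev:user")
removes = parse_memberships("oldproj:oldgrp", with_trust=False)
print(adds, removes)
```

Because each change arrives as an explicit add, modify, or remove, a backend can act on exactly the memberships that changed instead of diffing the current group list.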
-
- 14 Sep, 2005 1 commit
-
-
Mike Hibler authored
Entailed new instructions for manual setup as well as integration into the elabinelab framework.

First, the manual path:

setup.txt, setup-boss.txt, setup-ops.txt and new setup-fs.txt: Updated to reflect potential for separate fs node. The org here is a little dicey and could be confusing with ops+fs vs. ops and fs. Has not been field tested yet.

*/GNUmakefile.in: new fs-install target.

configure, configure.in, defs-*: Somewhat unrelated, make min uid/gid to use be a defs setting. Also add config of fs-install.in script.

boss-install.in, ops-install.in and new fs-install.in: Handle distinct fs node. If you have one, fs-install is run before ops-install. All scripts rely on the defs file settings of FSNODE and USERNODE to determine if the fs node is separate.

utils/checkquota.in: Just return "ok" if quotas are not used (i.e., if the defs file FS_WITH_QUOTA string is null).

install/ports/emulab-fs: Meta port for fs node specific stuff. Also a patch for the samba port Makefile so it doesn't drag in CUPS, etc. Note that the current samba port Makefile has this change; I am just backporting to our version.

Elabinelab specific changes:

elabinelab-withfs.ns: NS fragment used in conjunction with tb-elab-in-elab-topology "withfs" to setup inner-elab with fs node.

elabinelab.ns: The hard work on the boss side. Recognize separate-fs config and handle running of rc.mkelab on that node. fs setup happens before ops setup.

rc.mkelab: The hard work on the client side. Recognize FsNode setup as well as differentiate ops+fs from ops setup.

Related stuff either not part of the repo or checked in previously: emulab-fs package
-
- 13 Jun, 2005 1 commit
-
-
Timothy Stack authored
Initial checkin of a "repositioning" daemon that moves robots back to their pens on swapout.

* configure, configure.in: Add tbsetup/repos_daemon.
* db/libdb.pm.in: Add constants for the repositionpending/repositioning experiments.
* db/nfree.in: When freeing garcias, send them to repositionpending instead of reloadpending.
* event/sched/event-sched.c: Deal with the rare case of no SIMULATOR object being in the agent list for an experiment.
* robots/emc/emcd.c, robots/emc/locpiper.in: Fix some typos.
* robots/rmcd/masterController.h, robots/rmcd/masterController.c, robots/rmcd/obstacles.h, robots/rmcd/obstacles.c: Ignore dynamic obstacles that are far away and remove dynamic obstacles where the robot is inside the natural obstacle area.
* sql/database-create.sql, sql/database-migrate.txt: Add a reposition_status table that tracks the status of robots that are being moved back to their pens.
* tbsetup/GNUmakefile.in: Install the repos_daemon script.
* tbsetup/reload_daemon.in: Move robots to the repositionpending experiment, if they haven't already reached their pen.
* tbsetup/repos_daemon.in: Daemon that takes care of seeing robots back to their pens after they are freed from an experiment.
-
- 26 May, 2005 1 commit
-
-
Robert Ricci authored
need it.
-
- 18 Mar, 2005 1 commit
-
-
Mike Hibler authored
To enable WhOL, you need to add outlets table entries for nodes which are whol-enabled. The power_id encodes the interface on boss to use:

    +---------+-----------+--------+----------------+
    | node_id | power_id  | outlet | last_power     |
    +---------+-----------+--------+----------------+
    | pcwf6   | whol-fxp0 |      0 | 20050318152119 |
    +---------+-----------+--------+----------------+

You then need interfaces and wires table entries for that interface on boss, so that snmpit works with syntax like "boss:1". This is probably not really needed once the VLAN has been setup.

You need a magic VLAN called "WhOL"; I used VLAN 999. Add it to all the switches and trunks. Put boss's port in it. It will remain there, enabled, forever.

In the interfaces table entry for every interface that supports WhOL, you need to set the 'whol' field to 1.
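
Since the power_id carries the boss interface name after a "whol-" prefix, a caller can recover it with a simple split. This is a Python sketch of that naming convention only, inferred from the example row above; whol_interface is a hypothetical helper and not part of the power backend.

```python
WHOL_PREFIX = "whol-"


def whol_interface(power_id):
    """Return the boss interface encoded in a WhOL power_id, else None."""
    if power_id.startswith(WHOL_PREFIX) and len(power_id) > len(WHOL_PREFIX):
        return power_id[len(WHOL_PREFIX):]
    return None  # not a WhOL-controlled outlet


print(whol_interface("whol-fxp0"))
```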
-
- 04 Feb, 2005 1 commit
-
-
Kirk Webb authored
Support for reloading garcias. Currently this is a total hack; we simply rsync the stargate as it is freed from the experiment. However, until we unravel os_load dependencies between nodes and subnodes (motes and stargates in this case), doing this through the appropriate setup channels won't work. The tbrsync script borrows much of its infrastructure from Rob's tbuisp script.
-
- 24 Jan, 2005 1 commit
-
-
Timothy Stack authored
Robot related stuff: power via e-mail, client-install fixups, checking coords against camera boundaries.

* configure, configure.in: Add tbsetup/power_mail.pm to the list of template files.
* doc/cross-compiling.txt: More stargate notes.
* event/sched/rpc.cc: Updates for the addition of the cameras table.
* robots/GNUmakefile.in, robots/emc/GNUmakefile.in, robots/mtp/GNUmakefile.in, robots/rmcd/GNUmakefile.in, robots/tbsetdest/GNUmakefile.in, robots/vmcd/GNUmakefile.in: client-install fixups.
* tbsetup/GNUmakefile.in: Add power_mail.pm.
* tbsetup/os_setup.in: Don't skip reboot of robots anymore.
* tbsetup/power.in: Add special case for a power_id of "mail", which calls into the power_mail.pm backend.
* tbsetup/power_mail.pm.in: E-mail backend for power; it sends an e-mail to tbops and waits for the outlets.last_power value to be updated from the power.php3 web page.
* tbsetup/ns2ir/parse-ns.in: Add the contents of the cameras table to the TBCOMPAT namespace.
* tbsetup/ns2ir/sim.tcl.in: More checking of "setdest" inputs.
* tbsetup/ns2ir/topography.tcl: Update the checkdest method to check destination points against the camera list.
* www/powertime.php3: Webpage used to update the last power time for nodes.
* www/shownode.php3: Add "Update Power Time" menu button.
-
- 13 Jan, 2005 1 commit
-
-
Mike Hibler authored
The web interface could use some work...
-
- 16 Dec, 2004 3 commits
-
-
Leigh B. Stoller authored
-
Robert Ricci authored
the web interface.
-
Leigh B. Stoller authored
* tbsetup/panic.in: New backend script to implement the panic button feature. When used, it will sever the connection to the firewall node by using snmpit to disable the port. Sets the panic bit (and date) in the experiments table, and changes the state of the experiment from "active" to "paniced" to ensure that the experiment cannot be messed with (swapped out or modified). Sends email to tbops when the panic button is pressed.

  Used with the -r option, reverses the above. State is set back to active, the panic bit is cleared, and the port is re-enabled with snmpit.

* tbsetup/tbswap.in: During swapout, a firewalled experiment that has been paniced will get a cleaning: the nodes are powered off, then the osids for all the nodes are reset (with os_select) so that they will boot the MFS, and then the nodes are powered on. Then the control network is turned back on, and then I wait for the nodes to reboot (this is simply because we do not record in the DB that a node is turned off, and if I do not wait, the reload daemon will end up hitting the power button again if they do not reboot in time). We can fix this later.

  I am not planning to apply this to general firewalled experiments yet as the power cycling is going to be hard on the nodes, so I would rather that we at least have a 1/2 baked plan before we do that.

* www/showexp.php3: If the experiment is firewalled, show the Panic Button, linked to the panic button web script. If the experiment has already had the panic button pressed, show a big warning message and explain that the user must talk to tbops to swap the experiment out. Also fiddle with menu options so that the terminate link is gone, and the swap link is visible only in admin mode. In other words, only an admin person can swap an experiment once it is paniced. And of course, an admin person can run the backend panic script above with the -r option, but that's not something to be done lightly.

* db/libdb.pm.in: Add "paniced" as an experiment state (EXPTSTATE_PANICED). Add utility functions: TBExptSetPanicBit(), TBExptGetPanicBit(), and TBExptClearPanicBit().

* tbsetup/swapexp.in: Minor state fiddling so that an experiment can be swapped while in paniced state, but only when in admin mode. Also clear the panic bit when the experiment is swapped out.

* www/dbdefs.php3.in: Add "paniced" as an experiment state. Add a utility function TBExptFirewall() to see if the experiment is firewalled.

* www/panicbutton.php3: New web script to invoke the backend panic script mentioned above, after the usual confirm song and dance.

* www/panicbutton.gif: New gif of a red panic button that I stole off the net. If anyone sees/has a better one, feel free to replace this one.

* utils/node_statewait.in: Add -s option so that I can pass in the state I want to wait for (used from tbswap above to wait for nodes to reach ISUP after power on).
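
The state changes described for panic and swapexp can be summarized as a tiny state machine. The Python below is an in-memory model only: the real script talks to snmpit and the experiments table, and the class, function names, and precondition checks here are assumptions for illustration.

```python
ACTIVE, PANICED = "active", "paniced"


class Experiment:
    """In-memory stand-in for an experiments table row (hypothetical)."""

    def __init__(self):
        self.state = ACTIVE
        self.panic_bit = False
        self.firewall_port_enabled = True


def panic(expt, reverse=False):
    """Press the panic button, or undo it when reverse=True (the -r case)."""
    if reverse:
        if expt.state != PANICED:
            raise ValueError("experiment is not paniced")
        expt.firewall_port_enabled = True    # real script: snmpit enables port
        expt.panic_bit = False
        expt.state = ACTIVE
    else:
        if expt.state != ACTIVE:
            raise ValueError("only an active experiment can be paniced")
        expt.firewall_port_enabled = False   # real script: snmpit disables port
        expt.panic_bit = True
        expt.state = PANICED


def may_swap(expt, is_admin):
    """A paniced experiment may be swapped only in admin mode."""
    return expt.state != PANICED or is_admin


expt = Experiment()
panic(expt)
print(expt.state, may_swap(expt, is_admin=False), may_swap(expt, is_admin=True))
```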
-
- 14 Dec, 2004 1 commit
-
-
Robert Ricci authored
'power' command.
-
- 16 Nov, 2004 1 commit
-
-
Leigh B. Stoller authored
download images from the outer emulab. This script is invoked from frisbeelauncher when ELABINELAB=1 and the filename does not exist (thus attempting to get the image file before bailing). The frisbeeimage script uses a new method in the RPC server to fire up a frisbeed (using frisbeelauncher on the outer Emulab), subject to the usual permission checks against creator of the elabinelab experiment (I assume that the creator will have access to any outer images that are used inside the inner emulab). If outer frisbeelauncher succeeds, its return value is the load_address (IP:port), which is used to fire up a frisbee client to get the image file and write it out (using Mike's new -N option that just dumps the raw data to file). Once the image is downloaded, control returns to inner frisbeelauncher and proceeds as normal. I whacked this together pretty quickly. Under heavy usage it might hit a race condition or two, but I do not expect that to happen in an inner elab for a while.
-
- 15 Nov, 2004 1 commit
-
-
Leigh B. Stoller authored
* snmpit: When ElabInElab is true, use the routines in the new snmpit_remote.pm library for setting up and tearing down vlans for an experiment. At present, only these two operations are proxied out to the outer emulab.

* snmpit_remote.pm: A new little library that uses the XMLRPC server on the outer emulab to setup and destroy vlans for an inner experiment. This code is used from snmpit (see above).

* snmpit_lib.pm: A couple of minor changes for the server side of the proxy operation.

* snmpit.proxy.in: A new perl module that is invoked from the RPC server. This proxy sets up and tears down vlans for an inner elab. The basic model is that the container experiment will have lots of vlans for various individual experiments running on the inner emulab.

* swapexp: A couple of minor elabinelab hacks.

* tbswap: For elabinelab experiments, reconfig/restart dhcpd when tearing down the experiment, and call out to the new elabinelab script when setting up an elabinelab experiment. There is no provision for swapmod at this time.

* elabinelab: A new script to create the inner emulab. Does all kinds of gross DB stuff then more gross stuff on the inner ops and boss.
-
- 12 Nov, 2004 1 commit
-
-
Robert Ricci authored
Contributed by Keith Sklower at Berkeley.
-
- 08 Oct, 2004 1 commit
-
-
Leigh B. Stoller authored
by more than just plab code.
-
- 30 Aug, 2004 1 commit
-
-
Leigh B. Stoller authored
* The per-experiment event scheduler now runs on ops instead of boss. Boss still runs elvind and uses events internally, but the user part of the event system has moved.

* Part of the guts of eventsys_control moved to a new script, eventsys.proxy, which runs on ops and fires off the event scheduler. The only tricky part of this is that the scheduler runs as the user, but killing it has to be done as root since a different person might swap out the experiment. So, the proxy is a perl wrapper invoked from a root ssh from boss, which forks, writes the pid file into /var/run/emulab/evsched/$pid_$eid.pid, then flips to the user and execs the event scheduler (which is careful not to fork). Obviously, if the kill is done as root, the pid file has to be stored someplace the user is not allowed to write.

* The event scheduler has been rewritten to use Tim's C++ interface to the sshxmlrpc server on boss. Actually, I reorg'ed the scheduler so that it can be built either as a mysql client, or as an RPC client. Note that it can also be built to use the SSL version of the XMLRPC server, but that will not go live until I finish the server stuff up. Also some goo for dealing with building the scheduler with C++.

* Changes to several makefiles to install the ops binaries over NFS to /usr/testbed/opsdir. Makes life easier, but only if boss and ops are running the same OS. For now, using static linking on the event scheduler until ops is upgraded to the same rev as boss.

* All of the event clients got little tweaks for dealing with the new CNAME for the event system server (event-server). Will need to build new images at some point. Old images and clients will continue to work because of an inetd hack on boss that uses netcat to transparently redirect elvind connections to ops.

* Note that eventdebug needs some explaining. In order to make the inetd redirect work, elvind cannot be listening on the standard port. So, the boss event system uses an alternate port since there are just a few subsystems on boss that use the server, and it's easy to propagate changes on boss. Anyway, the default for eventdebug is to connect to the standard port on localhost, which means it will work as expected on ops, but will require the -b argument on boss.

* Linktest changes were slightly more involved. No longer run linktest on boss when called from the experiment swapin path, but ssh over to ops to fire it off. This is done as the user of course, and there are some tricks to make it possible to kill a running linktest and its ssh when experiment swapin is canceled (or from the command line) by forcing allocation of a tty. I will probably revisit this at some point, but I did not want to spend a bunch of time on linktest.

* The upgrade path detailed in doc/UPDATING is necessarily complicated and bound to cause consternation at remote sites doing an upgrade.
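
The fork / write pid file / drop privileges / exec sequence that eventsys.proxy performs can be sketched as below. The real proxy is a perl wrapper; this Python version is only an illustration under stated assumptions: $pid in the path is taken to be the project ID, the child (rather than the parent) records its own process ID, and the function names are invented for the example. launch_scheduler needs root to do anything useful and is not called here.

```python
import os
import pwd

PIDDIR = "/var/run/emulab/evsched"   # root-only directory named in the message


def pidfile_path(proj, eid, piddir=PIDDIR):
    """Pid file path in the $pid_$eid.pid form (proj = project ID)."""
    return os.path.join(piddir, f"{proj}_{eid}.pid")


def launch_scheduler(user, proj, eid, argv, piddir=PIDDIR):
    """Fork; the child records its pid as root, drops to user, execs argv."""
    child = os.fork()
    if child != 0:
        return child                      # parent: hand back the child's pid
    try:
        # Still root here, so the pid file lands where users cannot write.
        with open(pidfile_path(proj, eid, piddir), "w") as f:
            f.write(f"{os.getpid()}\n")
        pw = pwd.getpwnam(user)
        os.initgroups(user, pw.pw_gid)
        os.setgid(pw.pw_gid)              # drop group first, while still root
        os.setuid(pw.pw_uid)
        os.execvp(argv[0], argv)          # scheduler must not fork after this
    finally:
        os._exit(1)                       # reached only if something failed


print(pidfile_path("testbed", "myexp"))
```

Because the exec'ed scheduler keeps the child's process ID, root can later kill it by reading the pid file, regardless of which user swaps the experiment out.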
-