1. 14 Aug, 2006 4 commits
    • commit.log · 09d78ac6
      Kevin Atkinson authored
    • Checkpoint my dynamic event stuff, crude as it is. The idea for this first · 9d021a07
      Leigh B. Stoller authored
      draft is that the user will, at the end of an experiment run, log into one
      of his nodes and perform some analysis which is intended to be repeated at
      the end of the next run, and in future instantiations of the template.
      A new table called experiment_template_events holds the dynamic events for
      the template. Right now I am supporting just program events, but it will be
      easy to support arbitrary events later. As an absurd example:
      	node6> /usr/local/bin/template_analyze ~/data_analyze arg arg ...
      The user is currently responsible for making sure the output goes into a
      file in the archive. I plan to make the template_analyze wrapper handle
      that automatically later, but for now what you really want is to invoke a
      script that encapsulates that, redirecting output to $ARCHIVE (this
      variable is installed in the environment by template_analyze).
      The wrapper script will save the current time, and then run the program.
      If the program terminates with a zero exit status, it will ssh over to ops
      and invoke an xmlrpc routine to tell boss to add a program event to both
      the eventlist for the current instance, and to the template_eventlist for
      future instances. The time of the event is the relative start time that was
      saved above (remember, each experiment run replays the event stream from
      time zero).
      For the future, we want to allow this to be done on ops as well, but
      that will take more infrastructure, to run "program agents" on ops.
      It would be nice to install the ssl xmlrpc client side on our images so
      that we do not have to ssh to ops to invoke the client.
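The wrapper behavior described above (save the relative start time, run the program, report an event only on success) can be sketched as follows; this is an illustration, not the actual wrapper, and the event-recording step is injected as a callable since the real version makes an SSL XML-RPC call by sshing to ops:

```python
import subprocess
import sys
import time

def run_and_record(program_args, run_start, add_event):
    """Run the analysis program; if it exits zero, record a program event
    at the offset (relative to time zero of the run) at which the program
    started, so future runs replay it at the same relative time."""
    event_time = time.time() - run_start   # relative start time, saved first
    status = subprocess.call(program_args)
    if status == 0:
        # The real wrapper would ssh to ops and invoke an XML-RPC routine
        # on boss here; a plain callable keeps the sketch self-contained.
        add_event(event_time, program_args)
    return status
```

Note that the time is captured before the program runs, matching the description: the saved relative start time is what goes into the event lists, regardless of how long the analysis takes.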
    • Mike Hibler authored
    • Change for templates. A new experiment run will cause the program · 0607b3b4
      Leigh B. Stoller authored
      agent to exit. rc.progagent now loops, restarting the program agent,
      but first getting new copies of the agent list and the environment
      from tmcd.
      Note that this conflicts slightly with the pa-wrapper used on plab
      nodes, which also loops. I think we can just get rid of pa-wrapper
      now, along with a slight change to rc.progagent. I'm gonna let Kirk
      comment on this.
      Need new images ...
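The restart loop described above can be sketched like this; `fetch_config` and `run_agent` stand in for the real tmcd fetch and agent invocation (rc.progagent itself is not Python, and the real loop runs forever — `max_runs` exists only to bound the sketch):

```python
def progagent_loop(fetch_config, run_agent, max_runs=None):
    """Restart the program agent each time it exits (a new experiment
    run makes it exit), refetching the agent list and the environment
    from tmcd before every restart."""
    runs = 0
    while max_runs is None or runs < max_runs:
        agents, env = fetch_config()   # fresh copies from tmcd each pass
        run_agent(agents, env)         # blocks until the agent exits
        runs += 1
```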
  2. 11 Aug, 2006 21 commits
  3. 10 Aug, 2006 13 commits
    • Add TBDBDisconnect to go with TBDBConnect. · d6dd8938
      Mike Hibler authored
    • Major re-do of the initial condition gathering. · e0420109
      Dan Gebhardt authored
      Available data elements in initial condition structure:
      - Exponential average for bandwidth and latency,
      - Number of samples used
      - Number of error-val samples
      - Number of sequential error-val from newest measurement, backwards
      - timestamp of most recent measurement
      - source node
      - destination node
      Testing needed.
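The data elements listed above might be captured in a record like the following; the field names, the smoothing factor `alpha`, and the update logic are assumptions for illustration, not the actual gathering code:

```python
from dataclasses import dataclass

@dataclass
class InitialCondition:
    src_node: str                 # source node
    dst_node: str                 # destination node
    bw_ewma: float = 0.0          # exponential average, bandwidth
    lat_ewma: float = 0.0         # exponential average, latency
    n_samples: int = 0            # number of samples used
    n_errors: int = 0             # number of error-val samples
    n_seq_errors: int = 0         # sequential error-vals, newest backwards
    last_ts: float = 0.0          # timestamp of most recent measurement

    def add_sample(self, bw, lat, ts, alpha=0.3, is_error=False):
        if is_error:
            self.n_errors += 1
            self.n_seq_errors += 1       # run of errors grows
        else:
            self.n_seq_errors = 0        # a good sample resets the run
            if self.n_samples == 0:
                self.bw_ewma, self.lat_ewma = bw, lat
            else:
                # standard exponential moving average update
                self.bw_ewma = alpha * bw + (1 - alpha) * self.bw_ewma
                self.lat_ewma = alpha * lat + (1 - alpha) * self.lat_ewma
            self.n_samples += 1
        self.last_ts = ts
```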
    • 7901885c
      Kirk Webb authored
      The other half of the changes that cause the plab event proxy to now try
      to get the routable IP of the node from tmcd rather than relying on
      the success of a hostname lookup.  It will still fall back to trying a
      hostname lookup if it can't get the IP from tmcd.
    • e087f217
      Kirk Webb authored
      Send along the IP address of the plab node in the return string from
      the 'plabconfig' command.  We can't trust that the node will have
      a resolvable hostname (or even working DNS), so slap down the IP
      we have on record in the DB into a file.  This will be used by the
      event proxy, which needs to know the node's routable IP in order to
      subscribe to elvind on ops properly.
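The fallback logic these two commits describe (prefer the DB-recorded IP handed out by tmcd, resolve the hostname only as a last resort) amounts to something like this sketch; the function name and arguments are illustrative:

```python
import socket

def node_routable_ip(tmcd_ip, hostname):
    """Prefer the routable IP recorded in the DB (delivered via tmcd's
    plabconfig response); fall back to a hostname lookup only if tmcd
    gave us nothing, since plab nodes may lack resolvable names."""
    if tmcd_ip:
        return tmcd_ip
    try:
        return socket.gethostbyname(hostname)
    except socket.gaierror:
        return None   # no IP on record and DNS failed too
```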
    • Minor fixes. · 5766e95e
      Leigh B. Stoller authored
    • Robert Ricci authored
    • Jonathon Duerig authored
    • Add the timestamp at which the connect() occurs for Jon. Uses a new · 10fa9140
      Robert Ricci authored
      function, fprintTime(), which will be used to standardize the time.
      Also added const to some declarations to keep the compiler happy.
    • Use the C99 standard · c4e21e5a
      Robert Ricci authored
    • Okay, now we can view graphs from the historical data (template record). · 0c1b1a23
      Leigh B. Stoller authored
      A couple of things to note:
      * When requesting a graph, we have to have a checkout of the archive
        (the DB dump file) so that we can create a temporary DB with the data.
        This is done on demand, and the DB is left in place since it's a
        fairly time-consuming operation to do the checkout and the dbload.
        I do not delete the DBs though; we will need to age them out as needed.
      * Even so, when returning to a page we end up getting the graphs
        again, and that still takes more time than I like to wait. Perhaps
        add a refresh button so that the user has to force a redraw. Might
        need to add a time/date stamp to the graph.
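The on-demand, leave-in-place behavior of the first bullet can be sketched like this; the expensive checkout-and-dbload step is injected as `load_dump`, and the marker file is a stand-in for however the real code detects an already-created DB:

```python
import os

def graph_db(checkout_dir, dbname, load_dump, marker=".loaded"):
    """Create the per-template temporary DB on first request and leave
    it in place afterwards, since checkout + dbload is expensive; later
    requests reuse the cached DB.  (Aging out old DBs is not handled,
    matching the commit message's open TODO.)"""
    stamp = os.path.join(checkout_dir, marker)
    if not os.path.exists(stamp):
        load_dump(dbname)            # expensive: archive checkout + dbload
        open(stamp, "w").close()     # remember the DB is now in place
    return dbname
```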
    • First crack at surviving down planetlab nodes. If the master barrier sync · 5f413b47
      Mike Hibler authored
      node sits in the stub or monitor barrier sync for more than the SYNCTIMO
      timeout value in common-env.sh, it will send a HUP to syncd which will
      knock all the other nodes out of their barrier sync.  If that happens,
      all nodes will print a warning message and continue.
      All nodes wait for both a stub sync and a monitor sync, so if one plab node
      is down, they will timeout on both barrier syncs.  Race conditions?  Sure.
      If for example everyone times out on the stub barrier due to a slow node,
      and then that node reaches the barrier, it will hang there while everyone
      else waits on the monitor barrier.  When the latter times out, it will
      kick the slow node out of the stub sync and it will then proceed to hang
      in the monitor sync until the experiment is stopped.  Got that?
      As an aside, it would be nice if the initializer of a barrier could specify
      a timeout value, and return a special error code to everyone if it timed out,
      but that would require an incompatible change to the sync protocol.
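As it happens, Python's `threading.Barrier` implements exactly what the aside wishes for: a barrier with a per-wait timeout that, on expiry, breaks the barrier and hands every waiter an error instead of leaving stragglers hung. A small illustration (this is not the Emulab syncd protocol, just the desired semantics):

```python
import threading

def sync_or_warn(barrier, name, timeout):
    """Wait at the barrier; if anyone times out, the barrier breaks and
    every waiter gets BrokenBarrierError, so all nodes can print a
    warning and continue rather than hang forever."""
    try:
        barrier.wait(timeout=timeout)
        return "%s: synced" % name
    except threading.BrokenBarrierError:
        return "%s: warning, barrier timed out, continuing" % name
```

With these semantics the race described above disappears: a slow node arriving at an already-broken barrier errors out immediately instead of hanging until the experiment is stopped.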
    • Minor tweaks: · 4c5005da
      Mike Hibler authored
      * add getopt processing
      * adjust delay to be one way before calling tevc
    • Next checkpoint of graphing code. On a currently active template · 79ae0bfe
      Leigh B. Stoller authored
      instance there are graphs on the instance show page and on the
      individual run show pages. On the run pages, the graphs select just
      the packets between start and stop of the run. I also added drop down
      menus to select particular source and destination vnodes.
  4. 09 Aug, 2006 2 commits