OVERVIEW
********
The testbed event system is composed of four parts: (1) the event
library, which implements a simple, transport-independent API that
clients may use to trigger and respond to events; (2) the event
scheduler, an event system client that triggers events at specified
future times; (3) frontends that take events specified by the user as
input and pass them to the event scheduler; and (4) backends that
respond to various system events on behalf of the testbed.  Each of
these four parts is described in more detail in the following
sections.

Quick Elvin Overview
====================

The testbed event system currently uses the Elvin publish/subscribe
system for communicating between event producers and event consumers.
Elvin uses a client/server architecture for delivering notifications.
The server accepts subscriptions from clients and routes notifications
from producers to the consumers that wish to receive them.
Notifications are arbitrary-length lists of name/value pairs (e.g.,
"type: NODE_FAILED, time: 23"), and subscriptions are expressed using
a simple syntax (e.g., "type == NODE_FAILED && time > 20").  Clients
can be both producers and consumers in the same session.  Clients may
be written in C, C++, Java, Python, and Emacs Lisp.  For more
information about how Elvin works, please see http://elvin.dstc.com/.


Event library
=============

Clients in the system trigger and respond to events using the event
library.  The event library is currently implemented as a layer above
the Elvin publish/subscribe system (see "Quick Elvin Overview" above
for a brief overview of how Elvin works).  A library and API separate
from Elvin exist for several reasons:

    - The event library provides a much simpler interface than Elvin,
      providing only the functionality that should be needed in
      the testbed.  This should make it significantly easier to write
      clients that use the event system.

    - The event library API is independent of Elvin, so if we change
      to a system other than Elvin in the future (or write our own),
      only the event library will need to be ported.  In particular,
      no event system clients should need to be changed.

    - The event library API is significantly simpler than the Elvin
      API, so it should be much easier to implement in environments
      such as the Linux kernel, the OSKit, or Scout.

The event library is currently available as a C library on Unix, and
as a Perl module.  The C API is described in the API file, while the
Perl interface is described in API.PERL.
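
As a quick illustration, here is a minimal sketch of a C consumer.
Only event_register() is named in this document; the subscription
call, dispatch loop, tuple helpers, and server string below are
assumptions, so treat the API file as authoritative:

    /*
     * Minimal consumer sketch.  event_register() is named in this
     * document; event_subscribe(), event_main(), address_tuple_alloc(),
     * the callback signature, and the server string are assumed names
     * (see the API file for the authoritative C API).
     */
    #include <stdio.h>
    #include "event.h"

    static void
    callback(event_handle_t handle, event_notification_t note, void *data)
    {
	printf("got an event\n");	/* React to the event here. */
    }

    int
    main(int argc, char **argv)
    {
	event_handle_t handle = event_register("elvin://localhost", 0);
	address_tuple_t tuple = address_tuple_alloc();

	tuple->expt    = "pid/eid";	/* Scope to our experiment. */
	tuple->objname = "cbr0";	/* The object we care about. */

	event_subscribe(handle, callback, tuple, NULL);
	event_main(handle);		/* Loop, dispatching callbacks. */
	return 0;
    }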


Event representation
====================

Events are represented by Elvin notifications that contain a number of
system-defined fields that specify which clients should receive the
notification.  The system-defined fields include:

    typedef struct {
	char		*site;		/* Which Emulab site. */
	char		*expt;		/* Project and experiment IDs */
	char		*group;		/* Deprecated */
	char		*host;		/* A specific host (ipaddr) */
	char		*objtype;	/* LINK, TRAFGEN, etc ... */
	char		*objname;	/* link0, cbr0, cbr1, etc ... */
	char		*eventtype;	/* START, STOP, UP, DOWN, etc ... */
	int		scheduler;	/* A dynamic event to schedule */
    } address_tuple;

Important Note: Event groups are *not* implemented by the group field
in the tuple.  Rather, they are implemented by treating the objname
field as a comma-separated list of names.  Since names are not unique,
the event is sent to every agent registered under the group name.

The last field, scheduler, is used internally to route notifications to the
scheduler for an experiment. This is described in more detail below.

In addition to the system-defined fields, event notifications may
contain additional attributes (in the form of name/value pairs) that
provide further information about the event that has occurred.  Each
event client is responsible for extracting those additional fields as
necessary.

To further hide the details of Elvin, event clients use a simple API to
construct a notification or a subscription. The API functions take an
"address_tuple" structure as above. The library converts the fields in this
structure into an Elvin notification or subscription statement. 
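
For example, a producer might build and fire a START notification as
below.  This is a sketch: only event_notify() is named in this
document, and the allocation and attribute helpers are assumed names
(see the API file):

    /*
     * Producer sketch.  address_tuple_alloc(),
     * event_notification_alloc(), and the attribute helper are
     * assumed names; only event_notify() is named in this document.
     */
    void
    send_start_event(event_handle_t handle)
    {
	address_tuple_t      tuple = address_tuple_alloc();
	event_notification_t note;

	tuple->expt      = "pid/eid";	/* Project/experiment IDs. */
	tuple->objtype   = "TRAFGEN";
	tuple->objname   = "cbr0";
	tuple->eventtype = "START";

	note = event_notification_alloc(handle, tuple);
	/* An extra name/value attribute, as described above. */
	event_notification_put_string(handle, note, "ARGS", "interval=0.2");
	event_notify(handle, note);
    }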


Event scheduler
===============

The event scheduler is a plain event system client responsible for
triggering future events.  There is an event scheduler running for each
swapped-in experiment on the testbed.  The scheduler subscribes to all
events for its experiment (see the "expt" field above) that have the
"scheduler" bit set.

To schedule an event for firing at a future time, event system clients may
call the event library API function event_schedule.  event_schedule accepts
a notification and a firing time, alters the notification so that the
"scheduler" bit is on and includes a firing time attribute, and then sends
the altered notification using event_notify. The event scheduler receives
this event since it is listening for events with the scheduler bit on!

As events arrive, the scheduler clears the scheduler bit in each
notification and removes the firing time attribute.  The notification
is then enqueued in a priority queue for firing at the indicated time.
When the time arrives, the scheduler removes the notification from the
queue and resends it using event_notify.
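
For instance, to have a notification like the one built in the
producer sketch above fire 30 seconds into the run, a client could
hand it to the scheduler as follows (the struct timeval form of the
firing-time argument is an assumption here; see the API file):

    /*
     * Given the handle and notification from the producer sketch
     * above, schedule the event for future firing.  The library sets
     * the scheduler bit and attaches the firing time; the timeval
     * argument form is an assumption.
     */
    struct timeval when;

    when.tv_sec  = 30;		/* 30 seconds into the run. */
    when.tv_usec = 0;
    event_schedule(handle, note, &when);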


Frontend support
================

There are two frontends to the event system, one static and one dynamic.

Static Events - The frontend for static events is the NS script, using the
ns "at" statement:

    $ns at 0.5 "$cbr0 start"

Each "at" statement is converted by the NS parser to an entry in the DB for
that experiment. When the experiment is swapped in and the scheduler
started, it reads the static event list from the DB, building the initial
queue of events to be fired at their respective times. Each DB entry is
converted to a corresponding event notification structure based on the name
of the object, the type of event, and any arguments to be sent along with
it.
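
Roughly, the "at" statement shown above becomes an entry like the
following (illustrative only; the actual DB schema is not shown here):

    $ns at 0.5 "$cbr0 start"
        ==> objname=cbr0, objtype=TRAFGEN, eventtype=START,
            firing time 0.5 seconds after the experiment starts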

Dynamic Events - The frontend for dynamic events is the "tevc" client
program that runs on users.emulab.net or on any of the local nodes in an
experiment. Briefly, the tevc client allows you to inject an event that
will be received by the scheduler for an experiment. For example, to
control the "cbr0" object above:

    tevc -e foo/bar now cbr0 start

This will construct a notification and send it using the event_schedule()
API function described above.  The per-experiment scheduler will receive
the notification and schedule it accordingly.  More information on the
tevc client can be found at:

	www.emulab.net/tutorial/docwrapper.php3?docname=advanced.html
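
A couple of additional illustrative invocations (the "+N" relative-time
form and the parameter syntax are assumptions here; see the tutorial
page above for the full syntax):

    tevc -e foo/bar +10 cbr0 stop
    tevc -e foo/bar now cbr0 set interval_=0.2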


Backend support
===============

The backends are event system clients responsible for responding to
certain events on behalf of the testbed.  For example, the testbed
traffic generators respond to events so that they can be controlled via
the NS file or with dynamic events: a trafgen can be started or
stopped, or its parameters can be changed on the fly during an
experiment.

Each client subscribes to the event system with a subscription that
includes at least the name of the object ("cbr0") and the experiment
(pid/eid) in which it is running.  The subscription can, of course, be
more finely tuned, say by listening for specific types of events
(START, STOP, etc.), or a client can listen for all events of a
particular type (LINK, TRAFGEN, etc.).  A client can also listen for
all events in an experiment by setting all of the fields to ANY except
the expt field, which should always be set to pid/eid.  (Nothing
actually enforces this; a client may listen for events in any
experiment, but since nothing sensitive is ever sent via the event
system, this is deemed okay.)
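
A backend agent's subscription might thus be built as below.  Leaving
a field unset (NULL) to mean ANY is an assumption here about how the
library expresses wildcards:

    /*
     * Subscription sketch for a trafgen agent: all events for object
     * "cbr0" in experiment pid/eid.  Fields left NULL are assumed to
     * act as ANY wildcards; event_subscribe() is an assumed name.
     */
    address_tuple_t tuple = address_tuple_alloc();

    tuple->expt    = "pid/eid";	/* Always scope to our experiment. */
    tuple->objname = "cbr0";	/* This agent's object. */
    /* eventtype left NULL: receive START, STOP, etc. */

    event_subscribe(handle, callback, tuple, NULL);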

More information on backend clients can be found at:
www.emulab.net/tutorial/docwrapper.php3?docname=advanced.html

Implementation Details
======================

The original implementation was very simple: there was an instance of
elvind running on boss, and each event scheduler, as well as all of the
event clients on every node, connected to that elvind via TCP.  This
quickly got out of hand as we added more physical nodes, and then added
jailed nodes (which also run event clients inside).

We experimented with Elvin's UDP mode, but found that it does not work
properly (nor does its unix domain socket mode); even if it did, it has
no built-in reliability model.

We then changed to the following model:

         Boss                   |                Testbed Node
                                |
   Event      Elvin             |           Event      Elvin       Node
   ----->---->------>--->control network--->----->---->----->---->------
   Sched      Daemon            |           Proxy      Daemon     Agents
                                |
	event flow ===>  	|


The above scheme reduces the number of TCP connections to one per physical
node. Each agent on a node connects to a local elvind. If the node is
hosting jails, then agents running inside the jail also connect to the
elvind on the physical node via "localhost".

An "event proxy" runs on each node. This is a poor mans implementation of
"clustering" which also does not work in 4.0. Instead, we wrote a simple
proxy that subscribes to both the local elvind and the elvind on boss. The
proxy subscribes to all events for that physical node. Each event the proxy
receives is then forwarded to the local elvind (via the usual event
notification API function) without change. The local elvind delivers the
event to proper agent, as specified in the agent's subscription.
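
In sketch form, the proxy is just a forwarding loop between two
handles.  The server strings, helper names, and overall structure
below are illustrative:

    /*
     * Event proxy sketch: subscribe on boss, republish locally.
     * Server strings and helper names are assumptions; forwarding
     * via event_notify() is per the paragraph above.
     */
    static event_handle_t localh;

    static void
    forward(event_handle_t bossh, event_notification_t note, void *data)
    {
	event_notify(localh, note);	/* Forward unchanged. */
    }

    int
    main(int argc, char **argv)
    {
	event_handle_t bossh = event_register("elvin://boss", 0);
	address_tuple_t tuple = address_tuple_alloc();

	localh = event_register("elvin://localhost", 0);
	tuple->host = "<this node's ipaddr>"; /* All events for this node. */

	event_subscribe(bossh, forward, tuple, NULL);
	event_main(bossh);
	return 0;
    }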

The event scheduler handles both static (from the NS file) and dynamic
(command line) events. Static events follow the path described above
exactly, since they originate from the scheduler at the proper time.

Dynamic events are injected with the tevc client. This causes an additional
hop through elvind on boss:

    users.emulab.net     |                  Boss
                         |
     Tevc                |             Elvin        Event
    ------>------>control network----->------>----->------> ...
    Client               |             Daemon       Sched
                         |
    event flow ===>      |

tevc sets the scheduler bit in the notification, as well as the
pid/eid.  As described above, the event scheduler on boss is listening
for these events.

While this may sound really complicated (and it is!), the time it
takes to deliver a static event is not too bad: about 1ms.  Dynamic
events take a little longer, but still less than 2ms.


Security
========

This section describes work in progress that is not yet finished.  It
should be finished real soon.

Security is based on end-to-end HMACs using a shared secret key that
is transferred to the client via the secure tmcd connection and placed
on the local disk.  Notifications and the attached HMAC are transferred
in the clear; there is no encryption of the event stream.  The goal is
strictly authentication of notifications before they are delivered to
the clients.

Further, widearea clients will not be able to inject events at this
time.  We will enforce that by blocking the elvin port at the
firewall.  They will be able to receive events only.  Event clients on
widearea nodes *must* use the aforementioned authentication mechanism
as well, to ensure that they do not get bogus events.  This is
described in more detail below in the section on widearea nodes.

All of the event client programs take a [-k keyfile] argument that points
to the file containing the key. This obviously has to be kept secret! The
name of the file is passed to the event library when the client registers,
thus the key is totally opaque to the clients and servers.

The frontend programs (event scheduler and tevc) attach the HMAC to
each event before it is sent.  Note that events generated with tevc
flow through the scheduler; because the notification includes the
scheduler info, which must be removed, the HMAC must be recomputed
after the notification is altered.  For static events, the scheduler
computes the HMAC when the event list is read from the DB, thus keeping
that computation out of the delivery path.

The scheduler gets the secret key from the DB, while tevc gets the key from
the /proj/$pid/exp/$eid/tbdata directory. Since only people in the
project/experiment can access the tbdata directory, only authorized people
will be able to inject events for an experiment.

Each node that cares about security (which should be *all* remote
nodes) passes the keyfile name to the event client on the command line.
If the event library has a keyfile, it computes an HMAC and attaches it
to each notification it sends, and computes and compares the HMAC for
each notification it receives.  In other words, a notification might
have an HMAC, but if the client has not provided a keyfile, no check
will be done on the notification.  This is fairly coarse control, of
course, and we might need to extend it.  If the HMACs do not compare,
the notification is dropped in the client-side library (and a warning
is printed).

The API change for supplying the key to the event system consists of
two new functions, which will eventually replace the original
event_register():

    event_handle_t
    event_register_withkeyfile(char *name, int threaded, char *keyfile);

    event_handle_t
    event_register_withkeydata(char *name, int threaded,
	   		       unsigned char *keydata, int keylen);


The first function allows you to provide a file from which the event
library will read the key.  The second allows you to supply the key
directly, say if you were to read it from the DB.
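
For example, a client that reads the key from the experiment's tbdata
directory might register like this (the server string and the key
filename are illustrative; the prototype is as above):

    /*
     * Register with HMAC checking enabled.  The server string and
     * the key filename ("eventkey") are illustrative assumptions.
     */
    event_handle_t handle =
	event_register_withkeyfile("elvin://localhost", 0,
				   "/proj/pid/exp/eid/tbdata/eventkey");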

What about subscriptions and snooping? Since we are not using Elvin's
builtin security mechanisms, the Elvin subscription protocol makes it easy
to capture events simply by connecting to the server and supplying an
appropriate subscription. Since we do not send any sensitive information
via the event system, this is okay. The event stream is also sent in clear
text, and in the widearea it would be easy to snoop the events. Again, this
is not considered a problem.


Widearea Nodes
==============