<!--
   EMULAB-COPYRIGHT
   Copyright (c) 2000-2003 University of Utah and the Flux Group.
   All rights reserved.
  -->
<center>
<h1>Emulab Tutorial</h1>
</center>

<h2>Contents</h2>
<ul>
<li> <a href="#GettingStarted">Getting Started</a>
    <ul>
    <li> <a href="#LoggingIn">Logging into the Web Interface</a>
    <li> <a href="#Designing">Designing a Network Topology</a>
    <li> <a href="#Beginning">Beginning the Experiment</a>
    <li> <a href="#UsingNodes">Using your Nodes</a>
    <li> <a href="#RootAccess">I need <b>root</b> access!</a>
    <li> <a href="#Wedged">My node is wedged!</a>
    <li> <a href="#Scrogged">I've scrogged my disk!</a>
    <li> <a href="#Finished">I've finished my experiment</a>
    <li> <a href="#Help">Getting Help!</a>
    </ul>
<li> <a href="#Advanced">Advanced Topics</a>
    <ul>
    <li> <a href="#ADVEX">A more advanced example</a>
    <li> <a href="#RPMS">Installing RPMS automatically</a>
    <li> <a href="#TARBALLS">Installing Tar files automatically</a>
    <li> <a href="#Startupcmd">
            Starting your application automatically</a>
    <li> <a href="#ReadyBits">
            How do I know when all my nodes are ready?</a>
    <li> <a href="#Routing">
            Setting up IP routing between nodes</a>
    <li> <a href="#Simem">
           Hybrid Experiments with Simulation and Emulation</a>
	   <img src="../new.gif" alt="&lt;NEW&gt;">
    </ul>
<li> <a href="#BatchMode">Batch Mode Experiments</a>
<li> <a href="#CustomOS">Creating your own disk image</a>
</ul>

<hr>
<a NAME="GettingStarted"></a>
<center>
<h2>Getting Started</h2>
</center>

<p>
This section of the tutorial describes how to run your first Testbed
experiment. We cover basic NS syntax and various operational issues
that you will need to know in order to conduct experiments to completion.
Later sections of the tutorial cover more advanced topics such as
loading your own RPMs automatically, running programs automatically,
running batch jobs, creating your own disk images, and loading those
images on your nodes.
</p>

<ul>
<li> <a href="#LoggingIn">Logging into the Web Interface</a>
<li> <a href="#Designing">Designing a Network Topology</a>
<li> <a href="#Beginning">Beginning the Experiment</a>
<li> <a href="#UsingNodes">Using your Nodes</a>
<li> <a href="#RootAccess">I need <b>root</b> access!</a>
<li> <a href="#Wedged">My node is wedged!</a>
<li> <a href="#Scrogged">I've scrogged my disk!</a>
<li> <a href="#Finished">I've finished my experiment</a>
<li> <a href="#Help">Getting Help!</a>
</ul>

<ul>
<a NAME="LoggingIn"></a>
<li> <h3>Logging Into the Web Interface</h3>

<p>
If you already have an account on the Testbed, all you need to do is
go to the <a href="https://www.emulab.net">Emulab Home
Page</a>, enter your login name and your password, and then click on
the "Login" button. If you don't have an account, click on the "Join
Project" or "Start Project" links. For an overview of how you go about
getting an Emulab account, go to the <b><a href =
"http://www.emulab.net/auth.html">"How To Get Started"</a></b> page.

<p>
<li> <a NAME="Designing"></a>
     <h3>Designing a Network Topology</h3>
<p>
Part of the Testbed's power lies in its ability to assume many different
topologies; the description of such a topology is a necessary part
of an experiment.

<p>
Emulab uses the "NS" ("Network Simulator") format to describe network
topologies.  This is substantially the same <a
href="http://www.scriptics.com/software/tcltk/">Tcl</a>-based format
used by <a href="http://www.isi.edu/nsnam/ns/">ns-2</a>.  Since the
Testbed offers emulation, rather than simulation, these files are
interpreted in a somewhat different manner than ns-2.  Therefore, some
ns-2 functionality may work differently than you expect, or may not be
implemented at all. Please look for warnings of the form:
	<code><pre>
	*** WARNING: Unsupported NS Statement!
	    Link type BAZ, using DropTail!</code></pre>

If you feel there is useful functionality missing, please let us know.
Also, some <a href="docwrapper.php3?docname=nscommands.html">
testbed-specific syntax</a> has been added which, with the inclusion
of a compatibility module (<a href="tb_compat.tcl">tb_compat.tcl</a>),
will be ignored by the NS simulator. This
allows the same NS file to work on both the Testbed and ns-2, most of
the time.

<p>
For those unfamiliar with the NS format, here is a small example
(<em>We urge all new Emulab users to begin with a small 3-5 node experiment
such as this, so that you will become familiar with NS syntax and the
practical aspects of Emulab operation</em>). Let's say we are trying to
create a test network which looks like the following:
  <br>
  <center>
    <img src="abcd.png"><br><br>
    (A is connected to B, B to C, and B to D.)
  </center>
  <br>

<p>
An NS file which would describe such a topology is as follows. First
off, all NS files start with a simple prolog, declaring a simulator 
and including a file that allows you to use the special 
<code>tb-</code> commands:
	<code><pre>
	# This is a simple ns script. Comments start with #.
	set ns [new Simulator]			
        source tb_compat.tcl                   </code></pre>

<p>
Then define the 4 nodes in the topology.
	<code><pre>
	set NodeA [$ns node]
	set NodeB [$ns node]
	set NodeC [$ns node]
	set NodeD [$ns node]			</code></pre>
</p>

<p>
Next define the 3 links between the nodes. NS syntax permits you to
specify the bandwidth, latency, and queue type. For our example, we
will define full-speed links from B to C and from B to D, and a
delayed link from node A to B.
	<code><pre>
	$ns duplex-link $NodeA $NodeB 100Mb 50ms DropTail
	$ns duplex-link $NodeB $NodeC 100Mb  0ms DropTail
	$ns duplex-link $NodeB $NodeD 100Mb  0ms DropTail</code></pre>

<p>
In addition to the standard NS syntax above, a number of
<a href="docwrapper.php3?docname=nscommands.html">
extensions</a> have been added that allow you
to better control your experiment. For example, you may specify which
operating system is booted on your nodes. We currently support FreeBSD
4.5 and Red Hat Linux 7.1, as well as
<a href="http://www.cs.utah.edu/flux/oskit/">OSKit</a> kernels on the
testbed PCs. By default, Red Hat Linux 7.1 is selected.

	<code><pre>
	tb-set-node-os $NodeA FBSD-STD
	tb-set-node-os $NodeC RHL-STD		</code></pre>

<p>
You may also control what IP addresses are assigned to the
experimental interfaces on your nodes. The experiment configuration
software will select IP addresses for you, but if your experiment
depends on particular IP addresses, you may specify them at each
link. The following example sets the IP address of node B on the port
going to node C:

	<code><pre>
	tb-set-ip-interface $NodeB $NodeC 192.168.42.42	</code></pre>

<p>
Lastly, all NS files end with an epilogue that instructs the simulator
to start.

	<code><pre>
	# Go!
	$ns run					</code></pre>
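
<p>
Put together in order, the fragments shown above form one complete
file. The listing below is only an assembly of those fragments and may
differ in detail from the linked <tt>basic.ns</tt>:

	<code><pre>
	# This is a simple ns script. Comments start with #.
	set ns [new Simulator]
	source tb_compat.tcl

	# The four nodes of the topology.
	set NodeA [$ns node]
	set NodeB [$ns node]
	set NodeC [$ns node]
	set NodeD [$ns node]

	# One delayed link (A to B) and two full-speed links (B to C, B to D).
	$ns duplex-link $NodeA $NodeB 100Mb 50ms DropTail
	$ns duplex-link $NodeB $NodeC 100Mb  0ms DropTail
	$ns duplex-link $NodeB $NodeD 100Mb  0ms DropTail

	# Testbed extensions: choose operating systems and pin one IP address.
	tb-set-node-os $NodeA FBSD-STD
	tb-set-node-os $NodeC RHL-STD
	tb-set-ip-interface $NodeB $NodeC 192.168.42.42

	# Go!
	$ns run
	</pre></code>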

<p>
If you would like to try the above example, the completed <a
href="basic.ns" target=stuff>NS file</a> can be run as an experiment in your
project. Another example ns script that shows off using the power of
Tcl to generate topologies is <a href="loop.ns" target=stuff>here</a>.
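
<p>
As a rough illustration of what generating a topology with Tcl can
look like, the sketch below creates several nodes in a loop and joins
them in a single LAN. It is not a copy of the linked
<tt>loop.ns</tt>; the node count and the variable names
(<tt>maxnodes</tt>, <tt>lanstr</tt>, <tt>biglan</tt>) are chosen here
for illustration only.

	<code><pre>
	set ns [new Simulator]
	source tb_compat.tcl

	# Hypothetical experiment size.
	set maxnodes 4
	set lanstr ""

	# Create each node and collect its handle in a space separated string.
	for {set i 1} {$i <= $maxnodes} {incr i} {
	    set node($i) [$ns node]
	    append lanstr "$node($i) "
	}

	# Join all of the nodes in one full-speed LAN.
	set biglan [$ns make-lan "$lanstr" 100Mb 0ms]

	$ns run
	</pre></code>

Changing <tt>maxnodes</tt> is then enough to resize the experiment,
which is the main attraction of generating the topology rather than
writing each node out by hand.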

<p>
<li> <a NAME="Beginning"></a>
     <h3>Beginning the Experiment</h3>

<p>
After logging on to the Testbed Web Interface, choose the "Begin
Experiment" option from the menu. First select which project you want
the experiment to be configured in. Most people are members of just
one project, and will not have a choice. If you are a member of
multiple projects, be sure to select the correct project from the
menu.

<p>
Next fill in the `Name' and `Description' fields. The Name should be a
single-word identifier (no spaces), while the Description is a
multi-word description of your experiment. In the "Your NS file" field,
place the <i>local path</i> of an NS file which you have created to
describe your network topology. This file will be uploaded through
your browser when you choose "Submit."

<p>
After submission, the Testbed interface will begin processing your
request. This will likely take several minutes, depending on how large
your topology is, and what other features (such as delay nodes and
bandwidth limits) you are using. When processing finishes, you will
receive an email message indicating success or failure and, if
successful, a listing of the nodes and IP addresses that were allocated
to your experiment.

<p>
For the NS file described above, you would receive a listing that looks
similar to this:

	<code><pre>
        Node Info:
        ID              Type       OSID           
        --------------- ---------- ---------------
        NodeA           pc         FBSD-STD       
        NodeB           pc                        
        NodeC           pc         RHL-STD        
        NodeD           pc                        

        Node Mapping:
        Virtual         Physical        Qualified Name
        --------------- --------------- --------------------
        NodeA           pc18            NodeA.myexp.myproj.emulab.net
        NodeB           pc29            NodeB.myexp.myproj.emulab.net
        NodeC           pc28            NodeC.myexp.myproj.emulab.net
        NodeD           pc35            NodeD.myexp.myproj.emulab.net
        tbsdelay0       pc15            tbsdelay0.myexp.myproj.emulab.net

        Lan/Link Info:
        ID              Member          IP              Delay     BW (Kbs)  Loss Rate
        --------------- --------------- --------------- --------- --------- ---------
        l6              NodeB:eth0      192.168.42.42   0.00      100000    0.000    
        l5              NodeB:eth2      192.168.1.3     25.00     100000    0.000    
        l5              NodeA:eth0      192.168.1.2     25.00     100000    0.000    
        l6              NodeC:eth0      192.168.42.2    0.00      100000    0.000    
        l7              NodeB:eth1      192.168.2.2     0.00      100000    0.000    
        l7              NodeD:eth0      192.168.2.3     0.00      100000    0.000    

        Delay Node Info:
        LinkID          Virtual         Physical        Pipe Numbers   
        --------------- --------------- --------------- --------------- 
        l5              tbsdelay0       pc15            100,110		</code></pre>

<p>
A few points should be noted:
<ul>
<li> A single delay node was allocated and inserted into the link
     between NodeA and NodeB. The delay node is invisible from your
     perspective, except for the fact that it adds latency, error,
     or reduced bandwidth. However, the information for the delay links
     is included so that you can
     <a href="../faq.php3?#HDS-6">modify the delay parameters</a>
     after the experiment has been created.
<p>
<li> Delays of less than 2ms (per trip) are too small to be
     accurately modeled at this time, and will be silently ignored.
     A delay of 0ms can be used to indicate that you do not want
     added delay; the two interfaces will be "directly" connected to
     each other. 
     Also, please see the
     <a href="docwrapper.php3?docname=nscommands.html#LOSS">
     <i>Link Loss Commands</i></a> section in the
     <a href="docwrapper.php3?docname=nscommands.html">
     Extensions</a> reference.
     
<p>
<li> The names in the "Qualified Name" column refer to the control
     network interfaces for each of your allocated nodes. These names
     are added to the Emulab nameserver map on the fly, and are
     immediately available for you to use so that you do not have to
     worry about the actual physical node names that were chosen. In
     the names listed above, `myproj' is the name of the project that
     you chose to work in, and `myexp' is the name of the experiment
     that you provided in the "Begin Experiment" page.
<p>
<li> Since the IP address for the link from NodeB to NodeC was set in
     the NS file, the system selected an appropriate IP address for
     the other end of the link. All of the other links were configured
     by the system to use 192.168.XXX.XXX subnets.
</ul>
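
<p>
The link parameters that appear in the listing can be set from the NS
file. The sketch below names a link so that later commands can refer
to it, and then asks for packet loss on it. The
<tt>tb-set-link-loss</tt> command belongs to the Link Loss Commands
mentioned above; the argument order used here (link, then loss rate as
a fraction) and the 5% figure are assumptions made for illustration,
so please check the Extensions reference before relying on them.

	<code><pre>
	set ns [new Simulator]
	source tb_compat.tcl

	set NodeA [$ns node]
	set NodeB [$ns node]

	# Keep the link handle so that later commands can refer to the link.
	set link0 [$ns duplex-link $NodeA $NodeB 100Mb 50ms DropTail]

	# Assumed form: tb-set-link-loss LINK RATE, with RATE between 0 and 1.
	# The value 0.05 (5% loss) is arbitrary.
	tb-set-link-loss $link0 0.05

	$ns run
	</pre></code>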


<p>
<li> <a NAME="UsingNodes"></a>
     <h3>Using your Nodes</h3>

<p>
By the time you receive the email message listing your nodes, the
Testbed configuration system will have ensured that your nodes are
fully configured and ready to use. If you have selected one of the
Testbed-supported operating system images (FreeBSD, Linux, NetBSD),
this configuration process includes:

<p>
<ul>
<li> loading fresh disk images so that each node is in a known clean
     state;
<li> rebooting each node so that it is running the OS specified in the
     NS script;
<li> configuring each of the network interfaces so that each one is
     "up" and talking to its virtual LAN (VLAN);
<li> creating user accounts for each of the project members;
<li> mounting the project's NFS directory in /proj so that project
     files are easily shared amongst all the nodes in the experiment;
<li> creating a /etc/hosts file on each node so that you may refer to the
     experimental interfaces of other nodes by name instead of IP number;
<li> configuring all of the delay parameters;
<li> configuring the serial console lines so that project members may access the
     console ports from users.emulab.net or directly from their desktop.
</ul>

<p>
At this point you may log into any of the nodes in your experiment.
You will need to use Secure Shell (ssh), and you should use the
`qualified name' from the node mapping table so that you do not form
dependencies on any particular physical node. Your login name and
password will be the same as your Web Interface login and password.

<p>
The /etc/hosts file on each node will provide a local name mapping for
the other nodes in your experiments. You should take care to use these
names (or IP numbers) and <b>not</b> the .emulab.net names listed in
the node mapping, since the emulab names refer to the control network
LAN that is shared amongst all nodes in all experiments. It is only
the experimental interfaces that are entirely private to your
experiment.

<p>
<b> NOTE:</b> The configuration process just described occurs only on
Emulab constructed operating system images. If you are using an OSKit
kernel, or your own disk image with your own operating system, you
will be responsible for all of the configuration. At some point we
hope to provide tools to assist in the configuration, but for now you
are on your own.


<p>
<li> <a NAME="RootAccess"></a>
     <h3>I need <b>root</b> access!</h3>

<p>
If you need to customize the configuration, or perhaps reboot nodes,
you can use the "sudo" command, located in <code>/usr/local/bin</code>
on FreeBSD and Linux, and <code>/usr/pkg/bin</code> on NetBSD. Our
policy is very liberal; you can customize the configuration in any way
you like, provided it does not violate Emulab's
<a href="../docwrapper.php3?docname=policies.html">
administrative policies</a>. As an example, to reboot a node that is
running FreeBSD:

	<code><pre>
	/usr/local/bin/sudo reboot			</code></pre>

	
<p>
<li> <a NAME="Wedged"></a>
     <h3>My node is wedged!</h3>

<p>
This is bound to happen when running experimental software and/or
experimental operating systems. Fortunately we have an easy way for
you to power cycle nodes without requiring Testbed Operations to get
involved. If you must power cycle a node, log on to users.emulab.net
and use the "node_reboot" command:

	<code><pre><xmp>
	node_reboot <node> [node ... ]			</xmp></code></pre>

where `node' is the physical name, as listed in the node mapping
table. You may provide more than one node on the command line. Be
aware that you may power cycle only nodes in projects that you are a
member of. Also, <tt>node_reboot</tt> does its very best to perform a
clean reboot before resorting to cycling the power to the node. This
is to prevent the damage that can occur from constant power cycling
over a long period of time.  For this reason, <tt>node_reboot</tt> may
delay a minute or two if it detects that the machine is still
responsive to network transmission.  In any event, please try to
reboot your nodes first (see above).

<p>
You may also reboot all the nodes in an experiment by using the <tt>-e</tt>
option to specify the project and experiment names. For example:

	<code><pre><xmp>
	node_reboot -e testbed,multicast		</xmp></code></pre>

will reboot all of the nodes reserved in the "multicast" experiment in
the "testbed" project. This option is provided as a shorthand method
for rebooting large groups of nodes.

<p>
<li> <a NAME="Scrogged"></a>
     <h3>I've scrogged my disk!</h3>

<p>
Scrogging your disk is certainly not as common, but it does happen.
You can either terminate your experiment, and recreate it (which will
allocate another group of nodes), or if you prefer you can reload the
disk image yourself. You will of course lose anything you have stored
on that disk; it is a good idea to store only data that can be easily
recreated, or else store it in your project directory in <tt>/proj</tt>.
Reloading your disk with a fresh copy of the default image is easy,
and requires no intervention by Emulab staff:

	<code><pre><xmp>
	os_load <node> [node ... ]			</xmp></code></pre>

os_load will wait (not exit) until the nodes have been reloaded,
so that you do not need to check the console lines of each node
to determine when the load is done.

<p>
<li> <a NAME="Finished"></a>
     <h3>I've finished my experiment</h3>

When your experiment is completed, and you no longer need the
resources that have been allocated to it, you will need to terminate
the experiment via the Emulab Web Interface. Click on the "End An
Experiment" link. You will be presented with a list of all of the
experiments in all of the projects for which you have the
authorization to terminate experiments. Select the experiment you want
to terminate by clicking on the button in the "Terminate" column on
the right hand side. You will be asked to <b>confirm</b> your choice.
The Testbed configuration system will then tear down your experiment,
and send you an email message when the process is complete. At this
point you are allowed to reuse the experiment name (say, if you wanted
to create a similar experiment with different parameters).

<p>
<li> <a NAME="Help"></a>
     <h3>Getting Help!</h3>

If you have any questions or problems, or just want to comment on
Emulab's operation (maybe you want to suggest an improvement to one of
the Web pages), feel free to contact us by sending email to
<a href="../sendemail.php3">Testbed Operations</a>. Also note that much
of the software is in development, and occasionally things might break
or not work as you expect. Again, please feel free to contact us.

<!-- This ends the Basic Tutorial Section -->
</ul>

<hr>
<a NAME="Advanced"></a>
<center>
<h2>Advanced Topics</h2>
</center>

<ul>
<li> <a href="#ADVEX">A more advanced example</a>
<li> <a href="#RPMS">Installing RPMS automatically</a>
<li> <a href="#TARBALLS">Installing Tar files automatically</a>
<li> <a href="#Startupcmd">Starting your application automatically</a>
<li> <a href="#ReadyBits">How do I know when all my nodes are ready?</a>
<li> <a href="#Routing">Setting up IP routing between nodes</a>
<li> <a href="#Simem">Hybrid Experiments with Simulation and Emulation</a> 
     <img src="../new.gif" alt="&lt;NEW&gt;">
</ul>
<p>

<ul>

<li> <a NAME="ADVEX"></a>
     <h3>A more advanced example</h3>
<p>

We have a more <a href="docwrapper.php3?docname=advanced.html">
advanced example</a> demonstrating the use of RED queues, traffic
generators, the event system, and the integration of network
simulation (NS).

</p>
<li> <a NAME="RPMS"></a>
     <h3>Installing RPMS automatically</h3>
<p>

The Testbed NS extension <tt>tb-set-node-rpms</tt> allows you to
specify a (space separated) list of RPMs to install on each of your
nodes when it boots:

<code><pre>
tb-set-node-rpms $nodeA /proj/pid/rpms/silly-freebsd.rpm
tb-set-node-rpms $nodeB /proj/pid/rpms/silly-linux.rpm	</code></pre>

The above NS code says to install the <tt>silly-freebsd.rpm</tt> file
on <tt>nodeA</tt>, and the <tt>silly-linux.rpm</tt> on <tt>nodeB</tt>.
RPMs are installed as root, and must reside either in the project's
<tt>/proj</tt> directory or, if the experiment has been created in a
subgroup, in the <tt>/group</tt> directory. You may not place your
RPMs in your home directory.
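
<p>
Since the list is space separated, several RPMs can be given to one
node in a single command. The following is a minimal sketch; the
package file names are hypothetical and stand in for files under your
own project directory.

<code><pre>
set ns [new Simulator]
source tb_compat.tcl

set nodeA [$ns node]

# Two hypothetical packages, both stored under the project's /proj directory.
tb-set-node-rpms $nodeA /proj/pid/rpms/first.rpm /proj/pid/rpms/second.rpm

$ns run
</pre></code>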

</p>

<li> <a NAME="TARBALLS"></a>
     <h3>Installing TAR files automatically</h3>
<p>

The Testbed NS extension <tt>tb-set-node-tarfiles</tt> allows you to
specify a set of tarfiles to install on each of your nodes when it
boots. This command is similar to the <a href="#RPMS">tb-set-node-rpms</a>
above, and is provided for those people who do not want to spend a month
trying to figure out how to build an RPM! The format of this command
is slightly different though in that you must specify a directory in
which to unpack the tar file. This avoids problems with having to
specify absolute pathnames in your tarfile, which many modern tar
programs balk at. 

<code><pre>
tb-set-node-tarfiles $nodeA /usr/site /proj/pid/tarfiles/silly.tar.gz </code></pre>

The above NS code says to install the <tt>silly.tar.gz</tt> tar file
on <tt>nodeA</tt> from the working directory <tt>/usr/site</tt> when
the node first boots. The tarfile must reside either in the project's
<tt>/proj</tt> directory or, if the experiment has been created in a
subgroup, in the <tt>/group</tt> directory. You may not place your
tarfiles in your home directory. You may specify as many tarfiles as
you wish, as long as each one is preceded by the directory it should
be unpacked in, all separated by spaces.
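
<p>
The directory/tarfile pairing described above might look like this
when two tarfiles are installed on the same node. The second tarfile
name and its <tt>/usr/local</tt> target are hypothetical, added only
to show the pairing.

<code><pre>
set ns [new Simulator]
source tb_compat.tcl

set nodeA [$ns node]

# Arguments come in pairs: the unpack directory first, then the tarfile.
tb-set-node-tarfiles $nodeA /usr/site /proj/pid/tarfiles/silly.tar.gz /usr/local /proj/pid/tarfiles/extra.tar.gz

$ns run
</pre></code>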

</p>

<li> <a NAME="Startupcmd"></a>
     <h3>Starting your application automatically</h3>
<p>

You can start your application automatically when your nodes boot by
using the <tt>tb-set-node-startup</tt> NS extension. The argument is
the pathname of a script or program that is run as the <tt>UID</tt> of
the experiment creator, after the node has reached multiuser mode. You
can specify the same program for each node, or a different program.
For example:

<code><pre>
tb-set-node-startup $nodeA /proj/pid/runme.nodeA
tb-set-node-startup $nodeB /proj/pid/runme.nodeB	</code></pre>

will run <tt>/proj/pid/runme.nodeA</tt> on nodeA and
<tt>/proj/pid/runme.nodeB</tt> on nodeB. The programs must reside on
the node's local filesystem, or in a directory that can be reached via
NFS. This is either the project's <tt>/proj</tt> directory, the
<tt>/group</tt> directory if the experiment has been created in a
subgroup, or a project member's home directory in <tt>/users</tt>.
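
<p>
When every node should run the same program, a Tcl loop saves
repeating the command. This is a sketch only; the script path
<tt>/proj/pid/runme</tt> is hypothetical.

<code><pre>
set ns [new Simulator]
source tb_compat.tcl

set nodeA [$ns node]
set nodeB [$ns node]
$ns duplex-link $nodeA $nodeB 100Mb 0ms DropTail

# Assign the same (hypothetical) startup program to each node in turn.
foreach n [list $nodeA $nodeB] {
    tb-set-node-startup $n /proj/pid/runme
}

$ns run
</pre></code>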

<p>
The exit value of the startup command is reported back to the Web
Interface, and is made available to you via the "Experiment
Information" link. There is a listing for all of the nodes in the
experiment, and the exit value is recorded in this listing. The
special symbol <tt>none</tt> indicates that the node is still running
the startup command. A log file containing the output of the startup
command is created in the project's <tt>logs</tt> directory
(<tt>/proj/pid/logs</tt>).

<p>
The startup command is especially useful when
combined with <a href="#BatchMode"><i>batch mode</i></a> experiments.

<p>
<li> <a NAME="ReadyBits"></a>
     <h3>How do I know when all my nodes are ready?</h3>
<p>

It is often necessary for your startup program to determine when all
of the other nodes in the experiment have started, and are ready to
proceed. Sometimes called a <i>barrier</i>, this allows programs to
wait at a specific point, and then all proceed at once. Emulab
provides a primitive form of this mechanism using experiment <i>ready
bits</i>, which are set and read using the
<a href="../doc/docwrapper.php3?docname=tmcd.html">
TMCD/TMCC</a>. When an experiment is first configured, the ready bit
for each node is cleared. As each node starts its application and
reaches the point where it must be sure that all other nodes have
started up, it issues a TMCC <tt>ready</tt> command:

<code><pre>
tmcc ready				</code></pre>

which tells Emulab's configuration system that the node is ready to
proceed. The node can then poll for the <i>ready count</i> to
determine how many nodes are ready (have issued a tmcc ready command):

<code><pre>
tmcc readycount				</code></pre>

which will return the ready count as a string:

<code><pre>
READY=N TOTAL=M				</code></pre>

where <tt>N</tt> is the number of nodes that are ready, and <tt>M</tt>
is the total number of nodes in the experiment. An application can
poll the ready count with a simple script, or it can encode the ready
bits check directly into its program. For example, here is a simple
Perl fragment that issues the ready command, and then polls for the
ready count, being sure to delay a small amount between each poll.

<code><pre>
system("tmcc ready");
while (1) {
    my $bits = `tmcc readycount`;
    # \d+ requires at least one digit, so an empty reply never matches.
    if ($bits =~ /READY=(\d+) TOTAL=(\d+)/) {
        if ($1 == $2) {
            last;
        }
    }
    #
    # Please sleep to avoid swamping the TMCD!
    # 
    sleep(5);
}					</code></pre>

<i>Note that the ready count is essentially a use-once feature: the
ready count cannot be reinitialized to zero since there is no actual
synchronization happening.  If in the future it appears that a
generalized barrier synchronization would be more useful, we will
investigate the implementation of such a feature.</i>

</p>

<li> <a NAME="Routing"></a>
     <h3>Setting up IP routing between nodes</h3>
<p>

As Emulab strives to make all aspects of the network controllable by the
user, we do not attempt to impose any IP routing architecture or protocol
by default.
However, many users are more interested in end-to-end aspects and don't
want to be bothered with setting up routes.  For those users we provide
an option to automatically set up routes on nodes which run one of our
provided FreeBSD or Linux disk images.

<p>
You can use the NS <tt>rtproto</tt> syntax in your NS file to enable
routing:
<code><pre>
$ns rtproto <i>protocol</i>
</pre></code>
where the <i>protocol</i> option is limited to one of
<code>Session</code>, <code>Static</code>, or <code>Manual</code>.
<code>Session</code> routing provides fully automated routing support, 
and is implemented by enabling <code>gated</code> running the OSPF protocol
on all nodes in the experiment. <code>Static</code> routing also
provides automatic routing support, but rather than computing the
routes dynamically, the routes are precomputed when the experiment is
created, and then loaded on each node when it boots.
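
<p>
As a small sketch of how this fits into a complete file, the
following describes a client and a server separated by a router and
asks for precomputed routes. The node names are illustrative only.

<code><pre>
set ns [new Simulator]
source tb_compat.tcl

set client [$ns node]
set router [$ns node]
set server [$ns node]

$ns duplex-link $client $router 100Mb 0ms DropTail
$ns duplex-link $router $server 100Mb 0ms DropTail

# Routes are precomputed when the experiment is created.
# Use Session instead to have gated compute them dynamically with OSPF.
$ns rtproto Static

$ns run
</pre></code>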

<p>
<code>Manual</code> routing allows you to explicitly specify per-node
routing information in the NS file. To do this, use the
<tt>Manual</tt> routing option to <tt>rtproto</tt>, 
followed by a list of routes using the <tt>add-route</tt> command:
<code><pre>
$node add-route $dst $nexthop
</pre></code>
where the <tt>dst</tt> can be either a node, a link, or a LAN. For
example: 
<code><pre>
$client add-route $server $router
$client add-route [$ns link $server $router] $router
$client add-route $serverlan $router
</pre></code>
Note that you would need a separate <code>add-route</code> command to
establish a route for the reverse direction, which allows you to
specify differing forward and reverse routes if so desired.
These statements are converted into appropriate <tt>route(8)</tt>
commands on your experimental nodes when they boot.
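
<p>
To make the forward and reverse pairing concrete, here is a minimal
sketch of a complete file in which a client and a server reach each
other through a router. The topology and node names are assumptions
chosen for illustration.

<code><pre>
set ns [new Simulator]
source tb_compat.tcl

set client [$ns node]
set router [$ns node]
set server [$ns node]

$ns duplex-link $client $router 100Mb 0ms DropTail
$ns duplex-link $router $server 100Mb 0ms DropTail

$ns rtproto Manual

# Forward route: client to server, via the router.
$client add-route $server $router
# Reverse route: server to client, via the same router.
$server add-route $client $router

$ns run
</pre></code>

The router needs no route of its own in this sketch because it is
directly connected to both endpoints.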

<p>
In the above
examples, the first form says to set up a manual route between
<tt>$client</tt> and <tt>$server</tt>, using <tt>$router</tt> as the
nexthop; <tt>$client</tt> and <tt>$router</tt> should be directly
connected, and the interface on <tt>$server</tt> should be
unambiguous; either directly connected to the router, or an edge node
that has just a single interface.

<p>
<img src=routing.png align=right>

If the destination has multiple interfaces configured, and it is not
connected directly to the nexthop, the interface that you are
intending to route to is ambiguous.
In the topology shown to the right,
<tt>$nodeD</tt> has two interfaces configured. If you attempted to 
set up a route like this:
<code><pre>
$nodeA add-route $nodeD $nodeB
</pre></code>
you would receive an error since it cannot be determined (easily, with
little programmer effort, by Emulab staff!) which of the two links on
<tt>$nodeD</tt> you are referring to. Fortunately, there is an easy
solution, courtesy of an Emulab extension. Instead of a node, specify the
link directly:
<code><pre>
$nodeA add-route [$ns link $nodeD $nodeC] $nodeB
</pre></code>
This tells us exactly which link you mean, enabling us to convert
that information into a proper <tt>route</tt> command on <tt>$nodeA</tt>.

<p>
The last form of <tt>add-route</tt> command is used when adding a
route to an entire LAN. It would be tedious and error prone to specify
a route to each node in a LAN by hand. Instead, just route to the
entire network:
<code><pre>
set clientlan [$ns make-lan "$nodeE $nodeF $nodeG" 100Mb 0ms]
$nodeA add-route $clientlan $nodeB
</pre></code>

While all this manual routing infrastructure sounds really nifty, it's
probably a good idea to use either <tt>Session</tt> or <tt>Static</tt>
routing for all but small, simple topologies.  Explicitly setting up
all the routes in even a moderately-sized experiment is extremely
error prone.  Consider this: a recently created experiment with 17
nodes and 10 subnets required 140 hand-created routes in the NS
file. Yow!
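<p>
By contrast, a sketch of the automatic alternative needs only one line in
place of all the <tt>add-route</tt> commands. This assumes the routing
type is selected with the <code>rtproto</code> command, and that
<tt>Session</tt> can be substituted for <tt>Static</tt> in the same way:

<code><pre>
# Have the testbed set up routes instead of listing them by hand
$ns rtproto Static
</pre></code>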

<p>
Two final, cautionary notes on routing:
<ul>
<li> You might be tempted to set the default route on your nodes
     to reduce the number of explicit routes used.  <b>Don't do it.</b>
     That would prevent nodes from contacting the outside world, i.e., you.
     The default route <em>must</em> be set to use the control network
     interface.
</p>
<p>
<li> If you use your own routing daemon, you must avoid using the
     control network interface in the configuration.  Since every node
     in the testbed is directly connected to the control network LAN,
     a naive routing daemon configuration will discover that any node
     is just one hop away, via the control network, from any other node
     and <em>all</em> inter-node traffic will be routed via that interface.
</ul>
</p>

<a NAME="Simem"></a>
<li><h3>Hybrid Experiments with Simulation and Emulation 
<img src="../new.gif" alt="&lt;NEW&gt;"></h3>

<p>
Emulab has integrated network simulation using
<a href="http://www.isi.edu/nsnam/ns/doc/node487.html">NS Emulation (NSE)</a>,
enabling an experimenter to combine simulation and real
hardware emulation. This allows experiments to scale beyond the limit of
physical resources, and lets real application traffic interact with
simulated traffic. The latter makes it possible to validate simulation
models against the real world, or to use simulated cross traffic when a
particular model is still experimental and not available in a real
implementation.
</p>

<p>
To create an experiment with simulated resources in it, a user simply has to
enclose a block of NS Tcl code in <code>$ns make-simulated {
}</code>. You specify connections between simulated and physical nodes as usual,
with the current restriction that they must be lexically outside the
<code>make-simulated</code> block. The following code gives an example:
</p>

<code><pre>

set ns [new Simulator]

set realnode1 [$ns node]
set realnode2 [$ns node]

$ns make-simulated {

    # All the code here runs in the simulation
    set simnode1 [$ns node]
    set simnode2 [$ns node]

    # A duplex link inside the simulation
    $ns duplex-link $simnode1 $simnode2 1.5Mb 40ms DropTail
}

# Connect real and simulated nodes; this must be outside make-simulated
$ns duplex-link $realnode1 $simnode1 5Mb 10ms DropTail
$ns duplex-link $realnode2 $simnode2 5Mb 10ms DropTail

</pre></code>

<p>
A hybrid experiment like this causes the simulation to run in
best-effort real time. The number of simulation objects that
can be supported without falling behind real time depends on
the amount of external traffic and the number of internal
simulation events that need to be processed. Please read 
<a href="docwrapper.php3?docname=nse.html">nse scaling, accuracy and capacity</a> 
 to get a better picture.
</p>

<p>
The output from the simulation, including errors such as those that
report an inability to keep up with real time, is logged to the file
<code>/proj/&lt;project_name&gt;/exp/&lt;experiment_name&gt;/logs/nse-&lt;nodename&gt;.log</code>.
</p>

<p>
<i>
NSE support is still under development. Please let us know if you run
into problems with this system. Here are some caveats:

<ul>
<li>
Currently, all simulated nodes in an experiment
are mapped to one physical PC in Emulab. This limits the scalability
of this system. Support for automatically mapping simulated resources
onto multiple physical nodes is coming soon.

<li>
Enabling NS tracing causes huge file I/O overhead resulting in 
nse not keeping up with real time. Therefore, do not enable tracing.

<li>
Remember that each link between the simulated nodes and the physical
nodes is a real physical link, so there can't be more of them than
there are Ethernet links on a physical node (currently 4).

</ul>

</i>
</p>
</ul>

<!-- Batch Mode -->

<hr>
<a NAME="BatchMode"></a>
<center>
<h2>Batch Mode</h2>
</center>

<ul>
<li> <a href="#BatchIntro">Batch Mode Introduction</a>
<li> <a href="#BatchExample">A Batch Mode Example</a>
</ul>

<ul>
<li> <a NAME="BatchIntro"></a>
     <h3>Batch Mode Introduction</h3>

<p>
Batch Mode experiments can be created on the Testbed via the "Create
an Experiment" link in the operations menu to your left. There is a
checkbox near the bottom of the form that indicates you want to use
the batch system. There are several important differences between a
regular experiment and a batch mode experiment:

<p>
<ul>
<li> The experiment is run when enough resources (i.e., nodes) are
     available. This might be immediately, or it might be sometime in
     the future.
     <p>
<li> Once your NS file is handed off to the system, the batch system
     is responsible for setting up the experiment and tearing it down
     once the experiment has completed. You will receive email
     notifying you when the experiment has been scheduled and when it
     has been terminated.
     <p>
<li> Your NS file must define a <i>startup</i> command to run on each
     node using the <a href="#Startupcmd"><tt>tb-set-node-startup</tt></a>
     NS extension. It is the exit value(s) of the startup command(s) that
     indicates that the experiment is completed; when all of the
     nodes have run their respective startup commands and exited, the
     batch system will then tear down the experiment. The output of
     the startup command is stored in a file in your home directory so
     you can follow what has happened.
</ul>
<p>

<li> <a NAME="BatchExample"></a>
     <h3>A Batch Mode Example</h3>

Consider the example NS file <a href="batch.ns" target=stuff>batch.ns</a>.
First off, we have to arrange for the experimental software to be
automatically installed when the nodes boot. This is done with the <a
href="#RPMS"><tt>tb-set-node-rpms</tt></a> NS extension:

<code><pre>
tb-set-node-rpms $nodeA /proj/testbed/rpms/silly-1.0-1.i386-freebsd.rpm
tb-set-node-rpms $nodeB /proj/testbed/rpms/silly-1.0-1.i386-freebsd.rpm
</pre></code>

The next two lines of the NS file specify what program should be run
on each of the nodes. Using the <a href="#Startupcmd">
<tt>tb-set-node-startup</tt></a> NS extension, we say that the program
<tt>run-silly</tt> (installed by the <tt>silly-1.0</tt> RPM) is to be
run on both nodes:

<code><pre>
tb-set-node-startup $nodeA /usr/site/bin/run-silly
tb-set-node-startup $nodeB /usr/site/bin/run-silly
</pre></code>

After you have been notified via email that the batch experiment is
running, you can track the progress of your experiment by looking in
the "Experiment Information" page. As each node completes the startup
command, the listing for that node will be updated to reflect the exit
status of the command (you may need to hit the Reload button to see
the changes). Once all of the nodes have reported an exit status,
the batch system will tear down the experiment and send you email.  If
your experiment is such that one node is the controller, and runs
commands on all the other nodes, then simply run a dummy startup
command on the other nodes so that the batch system will receive an
exit value for that node. Since the batch is not terminated until
<em>all</em> nodes have reported in, be sure that the controlling node
does not exit from its startup command until all of the nodes have
finished. A dummy startup command can be set up like this:

<code><pre>
tb-set-node-startup $nodeC /bin/echo
</pre></code>
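<p>
Putting these pieces together, a controller-style batch experiment might
look like the following sketch. The script path
<tt>/proj/myproj/bin/controller.sh</tt> is hypothetical; it stands for a
program of your own that drives the other nodes and exits only after
they have all finished:

<code><pre>
# Hypothetical controller program; it must not exit until all work is done
tb-set-node-startup $nodeA /proj/myproj/bin/controller.sh

# Dummy startup commands so these nodes still report an exit status
tb-set-node-startup $nodeB /bin/echo
tb-set-node-startup $nodeC /bin/echo
</pre></code>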

<p>
The status of your batch experiment can be viewed via the "Experiment
Information" link in the Web Interface Options menu. You may also
cancel a batch after you have submitted it using the "Terminate"
option in the information display. As noted in the section on <a
href="#Startupcmd">startup commands</a>, the output of the startup command
on each node is written to separate files in your project log
directory. You can use these log files to debug your batch experiment.

<p>
<i>
The batch system is still under development. It appears to be
functional, but there are bound to be kinks in the system. Please help
us debug and improve it by letting us know what you think and if you
have problems with it. Currently, the batch system tries every 10
minutes to run your batch. It will send you email every 5 or so
attempts to let you know that it is trying, but that resources are not
available. It is a good idea to glance at the message to make sure
that the problem is lack of resources and not an error in your NS
file.</i>

<!-- This ends the Basic Tutorial Section -->
</ul>

<!-- Custom OS Images -->

<hr>
<a NAME="CustomOS"></a>
<center>
<h2>Custom OS Images</h2>
</center>

If your set of operating system customizations cannot be easily
contained within an RPM/TAR (or multiple RPM/TARs), then you can
create your own custom OS image. Emulab allows you to create your own
disk images and load them on your experimental nodes automatically
when your experiment is created or swapped in. Once you have created a
custom disk image (and the associated <a
href="https://www.emulab.net/newimageid_ez.php3">image/osid
descriptor</a> for it), you can use that OSID in your NS file. When
your experiment is swapped in, the testbed system will arrange for
your disks to be loaded in parallel using a locally written multicast
disk loading protocol. Experience has shown that it is much faster to
load a disk image on 10 nodes at once than it is to load a bunch of
RPMS or tarballs on each node as it boots. So, while it may seem like
overkill to create your own disk image, we can assure you it is not!

<p>
The most common approach is to use the
<a href="https://www.emulab.net/newimageid_ez.php3">New Image Descriptor</a>
form to create a disk image that contains a customized version of the
standard Red Hat Linux partition or the FreeBSD partition. Or, you can
start from scratch and load your own operating system in any of the
DOS partitions, and then capture that partition when you are
done. Either way, all you need to do is enter the node name in the
form, and the testbed system will create the image for you
automatically, notifying you via email when it is finished.  You can
then use that image in subsequent experiments by specifying the
descriptor name in your NS file with the
<a href="docwrapper.php3?docname=nscommands.html#OS">
<tt>tb-set-node-os</tt></a> directive. When the experiment is
configured, the proper image will be loaded on each node automatically by
the Testbed system.
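<p>
For instance, assuming you named your descriptor <tt>MY-IMAGE</tt> (a
hypothetical name used only for illustration), the NS file might select
it on each node as in this sketch:

<code><pre>
# MY-IMAGE is a placeholder for your own image descriptor name
tb-set-node-os $nodeA MY-IMAGE
tb-set-node-os $nodeB MY-IMAGE
</pre></code>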