<!--
   EMULAB-COPYRIGHT
   Copyright (c) 2000-2005 University of Utah and the Flux Group.
   All rights reserved.
  -->
<center>
<h1>Linktest Tutorial</h1>
</center>

<h2>Contents</h2>
<ul>
<li> <a href="#QuickStart">Quick Start</a>
<li> <a href="#Limits">Limitations</a>
<li> <a href="#Understand">Understanding Linktest</a>
    <ul>
    <li> <a href="#Level0">Level 0 - Do not run Linktest</a>
    <li> <a href="#Level1">Level 1 - Connectivity and Latency</a>
    <li> <a href="#Level2">Level 2 - Static Routing</a>
    <li> <a href="#Level3">Level 3 - Loss</a>
    <li> <a href="#Level4">Level 4 - Bandwidth</a>
    </ul>
<li> <a href="#Advanced">Advanced Topics</a>
    <ul>
    <li> <a href="#Web">Running Linktest from the Web Interface</a>
    <li> <a href="#Ops">Running Linktest on Ops</a>
    <li> <a href="#Log">Linktest Log Directory</a>
    </ul>
</ul>

<hr>
<a name="QuickStart"></a>
<center>
<h2>Quick Start</h2>
</center>

<p>
To run linktest, select a Linktest test level from the dropdown on the
<a href="beginexp_html.php3">Begin an Experiment</a> page.
<ul>
<li> <a href="#Level1"><b>Level 1</b> - Connectivity and Latency</a>:
     For the fastest response, select level 1. This checks that all nodes
     respond to ping and that latency is correct.
<br>
<li> <a href="#Level3"><b>Level 3</b> - Link Loss</a>: Check link loss
     characteristics. Test levels are cumulative, so this option also
     includes connectivity, latency, and static routing (if
     applicable).
<br>
<li> <a href="#Level4"><b>Level 4</b> - Bandwidth</a>: If bandwidth is
     important in your experiment, select level 4. Be warned that the
     bandwidth test takes up to 40 seconds per link. <b>Also note that
     not all bandwidths can be accurately measured</b>. Linktest places
     a warning in the log file for each link that is out of range for
     bandwidth testing.
</ul>
<p>
If you select a test level other than zero, Linktest will run after
the experiment completes its swapin. If a problem is found,
testbed-ops is automatically notified and a message will appear in the
activation log. If no problem is found, no notification appears. A
failure in linktest will <em><b>not</b> cause the swapin to fail!</em>
If traffic shaping parameters are of critical importance to your
experiments, take a closer look whenever linktest reports failures!

</p>

<hr>
<a name="Limits"></a>
<center>
<h2>Limitations (Important, Please Read)</h2>
</center>

<ul>
<li> Not all bandwidths can be accurately measured, and linktest will
     skip links that it knows will give false results. Please check
     the output, and be sure to test those links yourself if your
     results depend on total accuracy.

<li> As with any automated testing procedure, we have to balance the desire
     for accuracy with the possibility of false positives. To reduce
     the number of false positives, we allow for a small amount of
     fudge on any link. <b>If your results are dependent on total
     accuracy, then you should test your links yourself!</b>

<li> When using linkdelays (endnodeshaping), latency is less accurate
     because the kernel clock runs at a 1ms resolution. At worst,
     latency can be off by up to 2ms, for example on a roundtrip ping
     packet.

<li> Linktest can take a <b>long time</b> on large experiments. Even
     on very small experiments (5-10 nodes), doing the full bandwidth test
     can take 3-4 minutes. You should probably <em>not</em> do bandwidth
     tests at swapin on any experiment over 20 nodes unless you are
     prepared to wait a <em>long</em> time for the experiment to swap in.
 
</ul>


<hr>
<a name="Understand"></a>
<center>
<h2>Understanding Linktest</h2>
</center>

<p>
Linktest is an end-to-end validation test for Emulab experiments. It
verifies that experiment nodes are up, that they are reachable by
static routes (when applicable), and that traffic shaping on delay
nodes matches the experiment NS script.
</p>
<p>
Linktest works by parsing the experiment NS script, then invoking
external measurement tools -- namely
ping, <a href="http://rude.sourceforge.net">Rude and Crude</a> and <a
href="http://www.cc.gatech.edu/fac/Constantinos.Dovrolis/pathrate.html">Pathrate</a>.
Linktest compares the results against margins of error calculated in
advance to identify major errors in configuration.
</p>
<p>
Linktest runs on each experiment node. The Linktest daemon waits for a
custom event instructing it to begin testing. When it receives the
event, it invokes the Linktest script to conduct the actual tests. The
script invokes external processes to parse the NS script, validate
links and log any errors found.
</p>
<p>
If a node detects an error, it writes
an explanatory message to the experiment tbdata/linktest
directory. Otherwise, no messages appear in the directory after
Linktest completes its run.
</p>
<p>
Linktest uses test levels to select which tests to perform. Test
levels are cumulative, so that selecting a higher test level ensures
lower-level tests are also run. Test levels are ordered in length of
time to complete, so that <a href="#Level4">Level 4 - Bandwidth</a> takes the most time and
<a href="#Level1">Level 1 - Connectivity and Latency</a> takes the least.
</p>
<p>
Read more about each test level in the following sections:
</p>
<p>
    <ul>
    <li> <a href="#Level0">Level 0 - Do not run Linktest</a>
    <li> <a href="#Level1">Level 1 - Connectivity and Latency</a>
    <li> <a href="#Level2">Level 2 - Static Routing</a>
    <li> <a href="#Level3">Level 3 - Loss</a>
    <li> <a href="#Level4">Level 4 - Bandwidth</a>
    </ul>
</p>

<ul>
<a NAME="Level0"></a>
<li> <h3>Level 0 - Do not run Linktest</h3>

<p>
The default test level is Level 0 - Do not run Linktest. Use this
level to leave Linktest turned off, performing no validation of
experiment links after swapin.
</p>

<a NAME="Level1"></a>
<li> <h3>Level 1 - Connectivity and Latency</h3>

<p>
Each Linktest node on a lan or direct link pings the node on the other
side of the link. From the responses, the node detects whether the
link is up and the latency of the link. Linktest compares the measured
latency with the expected latency of the link, adjusting for known
delay crossing the testbed backplane. If the measured latency
is outside the 99% confidence interval for latencies at that setting, Linktest reports an error.
</p>
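<p>
The comparison described above can be sketched roughly as follows. This is
only an illustration, not Linktest's actual code: the function name
<tt>latency_ok</tt> and all of its parameters are hypothetical, and a fixed
tolerance stands in for the 99% confidence interval, whose exact
computation is not given here.
</p>

```python
from statistics import mean


def latency_ok(rtt_samples_ms, expected_rtt_ms, backplane_ms=0.0, tolerance_ms=1.0):
    """Return True if the mean measured latency is close to the expected one.

    All names and default values are hypothetical. tolerance_ms is a
    stand-in for the 99% confidence interval mentioned in the text.
    """
    if not rtt_samples_ms:
        # No ping replies at all: treat the link as down.
        return False
    # Adjust the expected latency for the known backplane delay.
    adjusted_ms = expected_rtt_ms + backplane_ms
    return abs(mean(rtt_samples_ms) - adjusted_ms) <= tolerance_ms
```

<p>
A caller would collect ping round-trip times for one link and pass them in
together with the latency expected from the NS script.
</p>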

<a NAME="Level2"></a>
<li> <h3>Level 2 - Static Routing</h3>

<p>
If the routing mode of the experiment is static, each Linktest node
pings the remainder of nodes in the experiment. If any node cannot be
reached, the Linktest node reports an error.
</p>


<a NAME="Level3"></a>
<li> <h3>Level 3 - Loss</h3>

<p>
Each Linktest node on a lan or direct link with loss > 0 sends a burst
of packets to the node on the other side of the link using Rude and
Crude, a real-time packet emitter and collector. If the percentage of
packets lost is outside the 99% confidence interval for
normally-distributed loss at that setting, Linktest reports an
error.
</p>
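<p>
One plausible way to build such an interval is the normal approximation to
the binomial distribution, sketched below. The function names and the
choice of approximation are assumptions made for illustration; the exact
formula Linktest uses is not stated here.
</p>

```python
import math

Z_99 = 2.576  # two-sided 99% z-score of the standard normal distribution


def loss_interval(expected_loss, packets_sent):
    """Return (low, high) bounds on the loss fraction at 99% confidence.

    Uses the normal approximation to the binomial distribution, which is
    an assumption of this sketch rather than Linktest's documented method.
    """
    if packets_sent <= 0:
        raise ValueError("packets_sent must be positive")
    sigma = math.sqrt(expected_loss * (1.0 - expected_loss) / packets_sent)
    low = max(0.0, expected_loss - Z_99 * sigma)
    high = min(1.0, expected_loss + Z_99 * sigma)
    return low, high


def loss_ok(expected_loss, packets_sent, packets_received):
    """Return True if the measured loss fraction lies inside the interval."""
    low, high = loss_interval(expected_loss, packets_sent)
    measured = 1.0 - packets_received / packets_sent
    return low <= measured <= high
```

<p>
With this approximation the interval narrows as the burst gets longer, so a
short burst tolerates a larger deviation from the configured loss rate.
</p>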


<a NAME="Level4"></a>
<li> <h3>Level 4 - Bandwidth</h3>

<p>
Each Linktest node on a lan or direct link uses Pathrate to measure
the bandwidth of the link, provided that the link is &gt;= 1 Mbps and &lt;=
45 Mbps. If the measured bandwidth is outside the margin of error
of 1 Mb, Linktest reports an error.
</p>

<p>
The Bandwidth test adds up to 40 seconds per distinct link in the
experiment, or 15-20 seconds in each direction. Linktest attempts to run tests in parallel whenever
possible, but topologies such as a star will lead to longer runtimes
because Linktest allows only one sender or receiver to run on a node at a
time.
</p>
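<p>
The range check, the margin of error, and the time estimate above can be
put together in a short sketch. The names are hypothetical, the margin is
read as 1 Mbps, and the runtime estimate assumes the worst case in which no
links are tested in parallel.
</p>

```python
MIN_MBPS = 1.0         # lowest bandwidth the test will measure
MAX_MBPS = 45.0        # highest bandwidth the test will measure
MARGIN_MBPS = 1.0      # margin of error, read here as 1 Mbps (assumption)
SECONDS_PER_LINK = 40  # upper bound per distinct link, from the text


def bandwidth_testable(configured_mbps):
    """Return True if the link falls inside the measurable range."""
    return MIN_MBPS <= configured_mbps <= MAX_MBPS


def bandwidth_ok(configured_mbps, measured_mbps):
    """Return True if the measurement is within the margin of error."""
    return abs(measured_mbps - configured_mbps) <= MARGIN_MBPS


def worst_case_seconds(link_count):
    """Upper bound on added runtime if every link is tested one at a time."""
    return link_count * SECONDS_PER_LINK
```

<p>
For example, an experiment with 5 distinct links adds at most 200 seconds
under this serial assumption, and a 100 Mbps link is skipped because it is
out of range.
</p>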

</ul>
<hr>
<a name="Advanced"></a>
<center>
<h2>Advanced Topics</h2>
</center>

<p>
To run Linktest after experiment swapin, you may use Emulab's Web
Interface, or you may manually invoke the script <tt>run_linktest.pl</tt> on
ops. You may also examine Linktest output in its log directory. Read
about these options in the following sections:
</p>
<p>
    <ul>
    <li> <a href="#Web">Running Linktest from the Web Interface</a>
    <li> <a href="#Ops">Running Linktest on Ops</a>
    <li> <a href="#Log">Linktest Log Directory</a>
    </ul>
</p>
<ul>
<a NAME="Web"></a>
<li> <h3>Running Linktest from the Web Interface</h3>
<p>
If you go to the "Show Experiment" page for your experiment, you will
see an option called "Run Linktest" in the auxiliary menu for the
experiment. Clicking on that link will take you to the run linktest
page, where you can select a level, and then start linktest running by
clicking on the Start button. Once linktest starts running, you can
stop it by clicking on the Stop button. Please be patient; linktest
can take a long time to run. Eventually, you will be notified of its
results in the window below the Start/Stop button. 
</p>
<a NAME="Ops"></a>
<li> <h3>Running Linktest on Ops</h3>
<p>
Use run_linktest.pl to run Linktest on ops. The "-e" option is
mandatory and specifies the project and experiment id. With "-q",
run_linktest.pl runs all tests except the bandwidth test; without it,
all tests are run. The "-o" option specifies an output directory
for log messages. Invoke run_linktest.pl with no options for a help
message.
Example:
</p>
<p>
<pre><code>run_linktest.pl -e utahstud/simple -q -o /tmp/linktest.log</code></pre>
</p>
<p>You may also specify the test level using the "-l" option. Example:
</p>
<p>
<pre><code>run_linktest.pl -e utahstud/simple -l 1</code></pre>
</p>

<p>
After Linktest completes, run_linktest.pl prints any errors found
during the run. For scripting, run_linktest.pl returns 0 if no
errors were found, or a nonzero value if at least one error was found.
</p>
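<p>
A script can rely on that exit status. The sketch below wraps
run_linktest.pl in Python, using only the "-e", "-l", "-q" and "-o" options
described above. The helper names are hypothetical, and it assumes that
run_linktest.pl is on the PATH of the machine where the script runs.
</p>

```python
import subprocess


def linktest_command(pid, eid, level=None, quick=False, output_dir=None):
    """Build the argument list for run_linktest.pl (hypothetical helper)."""
    cmd = ["run_linktest.pl", "-e", f"{pid}/{eid}"]
    if level is not None:
        cmd += ["-l", str(level)]  # explicit test level
    if quick:
        cmd.append("-q")           # skip the bandwidth test
    if output_dir:
        cmd += ["-o", output_dir]  # where log messages are written
    return cmd


def run_linktest(pid, eid, **options):
    """Run Linktest and return True if it reported no errors.

    Assumes run_linktest.pl is on PATH; exit status 0 means no errors.
    """
    result = subprocess.run(linktest_command(pid, eid, **options))
    return result.returncode == 0
```

<p>
For example, <tt>run_linktest("utahstud", "simple", level=1)</tt> runs the
same command as the second example above and returns False if any error
was found.
</p>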


<a NAME="Log"></a>
<li> <h3>Linktest Log Directory</h3>
<p>

Linktest logs the results of parsing the NS script and any error reports
in the tbdata/linktest directory for each experiment. By default this
is in ops:/proj/$pid/exp/$eid.
</p>
</ul>