Commit 51917b15 authored by Mike Hibler

Linktest version 1.2, not completely done but check it in before I lose it.

See event/linktest/CHANGES for details.
parent b4e203a5
@@ -1902,6 +1902,7 @@ else
event/stated/GNUmakefile event/stated/stated \
event/linktest/GNUmakefile \
event/linktest/iperf/GNUmakefile \
event/linktest/rude/GNUmakefile \
event/linktest/ \
event/linktest/weblinktest event/linktest/linktest.proxy \
event/linktest/linktest_control \
@@ -565,6 +565,7 @@ else
event/stated/GNUmakefile event/stated/stated \
event/linktest/GNUmakefile \
event/linktest/iperf/GNUmakefile \
event/linktest/rude/GNUmakefile \
event/linktest/ \
event/linktest/weblinktest event/linktest/linktest.proxy \
event/linktest/linktest_control \
Version 1.2:
* Implement the "symmetric LAN" optimization where we only test each
  "leg" of a LAN once if every leg has the same attributes. We perform
  this optimization for the loss and bandwidth tests; other tests continue as before.
* Optional "ARP" pass. Performs a single ping to all directly connected
  hosts, intended to prime the ARP caches and so avoid excessive first-packet
  latency during the real latency test. Obviously this is not going to do
  much for high-loss links. Currently enabled by default; can be disabled
  with the DOARP=0 command line option.
* gentopomap now provides a second map file (ltpmap) with "physical"
information about the topology. This includes the canonical host names,
OSes being run on the nodes, MAC addresses of interfaces, and type of
link multiplexing in use.
* Allow a linktest run to involve only a subset of an experiment's nodes.
The subset could be passed as an argument, but currently we only use this
option to exclude nodes that do not support linktest (determined on the
nodes themselves via the ltpmap).
* Have everyone read/parse the topofile up front, so that the barrier
master can be chosen dynamically if necessary (i.e., if the synchserver
node is not part of the run).
* Custom version of rude which supports sleeping rather than spinning
between intervals. Used on vnodes.
* Adjust the way parallelism is handled on vnodes: allow a small number
of test pairs (currently 2) to run on each vnode host machine rather
than allowing one test pair per vnode!
* Run crude at a higher priority than rude, since we value receiving packets
  more highly during the loss tests.
* Start crude listener up at the beginning along with the iperf listener.
The hope is to even further minimize any chance that crude isn't ready
when the loss test starts (still looking for Missing Packet 401 :-).
* Reduce iperf (bandwidth test) packet size to 1450 to avoid fragmentation
with encapsulating veths.
* Don't report during the bandwidth tests; just wait until the end like all
  the other tests. Reduces the number of barrier synchs.
* Add a "print schedule" option which causes linktest to just record the
  order in which it would have run tests in /var/emulab/logs/linktest.sched
  rather than actually running the tests. A new script
  can then be used to merge the schedule files from all nodes to show
  how much parallelism we manage (and also to check for missing tests).
* Add a "report only" option telling linktest to run the tests but not
  to analyse the results, just to send them back.
* Compensate for variable header overheads due to linkdelays and veth
encapsulated devices.
* Compensate for decreased accuracy of latency measurements when using
FreeBSD linkdelays.
* Arrange to print physical node/link attributes when reporting errors.
Post-mortem analysis is easier if you don't have to go back and reconstruct
what machine "node0" was mapped to during a specific swapin, and what
interface got mapped to "link0".
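The "symmetric LAN" leg deduplication above can be sketched roughly as follows. This is an illustrative Python reconstruction, not the actual linktest code; the leg representation and the function name `legs_to_test` are invented for the example:

```python
def legs_to_test(legs):
    """If every leg of a LAN has identical shaping attributes,
    test each leg only once instead of in both directions;
    otherwise fall back to testing every leg."""
    # Collect the distinct (bw, delay, loss) attribute tuples.
    attrs = {(l["bw"], l["delay"], l["loss"]) for l in legs}
    if len(attrs) == 1:
        # Symmetric LAN: keep one direction per node<->LAN pair.
        seen, out = set(), []
        for l in legs:
            key = frozenset((l["src"], l["dst"]))
            if key not in seen:
                seen.add(key)
                out.append(l)
        return out
    # Asymmetric attributes: every leg must be tested.
    return legs
```

With four symmetric legs (two directions each for two nodes on a LAN) this keeps two; changing any one attribute forces all four to be tested.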
Other changes:
* "Fix" dummynet. A previous official fix to dummynet (unfortunately inspired
by a comment in my bug report) caused the per-tick bandwidth "credit"
(aka "numbytes") to be cleared whenever the BW queue was idle. For low
bandwidth usage, this meant that there was never any credit when a new
packet arrived. The result is that every packet would get queued to the
next tick before it could move on and thus, when just shaping BW, every
packet incurred up to 1 tick of delay. This did have a beneficial effect
on delays, as every packet would enter the delay queue (if any) on exactly
a tick boundary when the BW queues were processed, so delays were "exact".
Now, when packets bypass the BW queue (due to there being sufficient credit),
they move to the delay queue at an arbitrary time during the current tick,
and the resulting delay will be from 0-1 tick short of the specified value.
* Fix the Linux linkdelay setup script. Linux tc apparently considers
"kilobits" to be 1024 bits, not 1000 bits as we were assuming. So we now
specify bandwidth in raw bits rather than kilobits.
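The tc unit fix reduces to simple arithmetic; here is a hypothetical Python sketch of the discrepancy (function names are invented for illustration):

```python
def tc_rate_bits(kbits):
    """Convert a bandwidth given in 1000-bit kilobits to a raw
    bit rate, sidestepping any kbit-suffix interpretation."""
    return kbits * 1000

def tc_misread(kbits):
    """What a tool that treats 1 kilobit as 1024 bits would
    shape to if handed the same number with a kbit suffix."""
    return kbits * 1024
```

For a nominal 100 kbit link the two interpretations differ by 2400 bits/s, which is enough to throw off linktest's bandwidth validation.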
@@ -108,20 +108,21 @@ control-install: binaries
client: all
$(MAKE) -C iperf client
$(MAKE) -C rude client
client-install: client
@if test ! -x '/usr/local/bin/rude$(EXE)' -o \
! -x '/usr/local/bin/crude$(EXE)' -o \
! -x '/usr/local/bin/emulab-iperf$(EXE)'; then \
@if test ! -x '$(CLIENT_BINDIR)/emulab-rude$(EXE)' -o \
! -x '$(CLIENT_BINDIR)/emulab-crude$(EXE)' -o \
! -x '$(CLIENT_BINDIR)/emulab-iperf$(EXE)'; then \
echo "**********************************************************"; \
echo "* *"; \
echo "* WARNING: Some tools needed by linktest were not found. *"; \
echo "* *"; \
echo "* Make sure the following executables are installed: *"; \
echo "* *"; \
echo "* /usr/local/bin/rude$(EXE) *"; \
echo "* /usr/local/bin/crude$(EXE) *"; \
echo "* /usr/local/bin/emulab-iperf$(EXE) *"; \
echo "* $(CLIENT_BINDIR)/emulab-rude$(EXE) *"; \
echo "* $(CLIENT_BINDIR)/emulab-crude$(EXE) *"; \
echo "* $(CLIENT_BINDIR)/emulab-iperf$(EXE) *"; \
echo "* *"; \
echo "**********************************************************"; \
@@ -130,7 +131,9 @@ client-install: client
$(MAKE) -C iperf client-install
$(MAKE) -C rude client-install
rm -f *.o $(TESTS) $(SCRIPT) $(SCRIPT_RUN) weblinktest linktest_control
-$(MAKE) -C iperf clean
-$(MAKE) -C rude clean
The overly convoluted flow of control looks like:
The Web:
<user> ->
-> www/linktest.php3 -> weblinktest -> linktest_control ->
-> linktest.proxy ->
-> linktest ->
Description of files...
@@ -7,11 +18,13 @@ Name:
A daemon that waits for LINKTEST events.
Where it runs:
On nodes only. It should be invoked by rc.setup
so that it is running before any events arrive.
On nodes only. Invoked at boot time by rc.linktest
(tmcd/common/rc.linktest) so that it is running before
any events arrive. Runs in the current directory.
What it does:
After receiving the START event, forks
to conduct tests. Waits for to exit.
to conduct tests. Waits for to exit or
for a KILL event.
Who it should run as:
experiment swapper
@@ -21,68 +34,45 @@ Name:
A test suite for Emulab experiments.
Where it runs:
On nodes only.
The synch node does some extra processing, so it will
cd to a directory where it finds tb_compat.tcl. Otherwise
the directory is default.
On nodes only. Runs in the current directory.
What it does:
Parses the experiment ns script, then conducts
Parses the experiment link maps (ltmap, ltpmap)
to reconstruct the topology and then conducts
various tests for connectivity and link attributes.
If errors are found, it saves them to a directory
under tbdata for the experiment. It also sends
a STOP event when all tests are completed.
under tbdata for the experiment. It sends LOG and
REPORT events to report on its progress and ultimately,
a COMPLETE event when all tests are completed.
Who it should run as:
Invoked by linktest; therefore experiment swapper.
A patchfile for NS that adds extra data structures
useful for parsing experiment datafiles.
Where it runs:
The patched version of NS is run by the synch node
inside a directory where it can find tb_compat.tcl
What it does:
Who it should run as:
A modified version of testbed tb_compat.tcl that
has support for parsing NS scripts using
a patched version of ns (using ns-patchfile).
Where it runs:
What it does:
Who it should run as:
A script to run Linktest and report results.
Where it runs:
Nodes, ops or boss.
Nodes or ops.
What it does:
Sends a START event to Linktest, then waits for the
STOP event. If errors were found, exit code is 1, else 0.
Wrapper for starting linktest remotely. Sends initial START
command and waits for report and status events.
If errors were found, exit code is 1, else 0.
Prints out contents of error files saved by Linktest.
This script could be used by boss to run linktest when
starting up the experiment.
Important usage note: use -q to skip the (time-consuming)
bandwidth test.
Who it should run as:
Experimenter either manually or as invoked indirectly from
boss at swapin or during explicit web-page invocation.
A program to either send an event or wait for an event.
Where it runs:
Anywhere is called, or
On nodes by to inject events.
On nodes or ops by to inject or extract
events to/from linktest on the nodes.
What it does:
Accepts command line arguments to either send or wait for
an event.
@@ -90,6 +80,38 @@ Who it should run as:
Experiment swapper when invoked by;
experimenter when invoked by
A shim run on ops to run as the correct user.
Where it runs:
On ops, invoked via ssh from boss as root.
What it does:
Runs on ops as correct user with the correct groups.
Who it should run as:
Root, changes to correct user.
A setuid-root wrapper script run on boss to communicate between
web pages on boss and run_linktest (via linktest.proxy) on ops.
Where it runs:
On boss.
What it does:
Primary action script for running linktest from the web/DB server.
Performs authentication and invokes on ops.
Who it should run as:
Anyone, typically the non-privileged web user.
Yet another wrapper script, called by PHP code to invoke
linktest_control on boss.
Where it runs:
On boss.
What it does:
Adds another layer of indirection for no apparent reason.
Who it should run as:
Web user.
Mike's immediate list:
* Now that a "physical" topology map is provided, the error messages,
at least the ones that involve completely failed links, should include
info like "eth3 on pc38 could not send packets to eth2 on pc53". This
would make debugging sooo much easier.
* Resolve the "minimum of 1ms latency for dummynet" problem. I think
this is technically a bug, but fixing it has the unintended side-effect
of making delays less accurate. To wit: having every packet get put
on a BW queue has the effect of ensuring that every packet then gets
fed into a delay queue at exactly a tick boundary thus ensuring that
the delay is "exactly" the specified number of ticks (I'm talking
linkdelays here where 1 tick == 1ms). If packets are allowed to bypass
the bandwidth queue (because the current tick BW "credit" has not been
exceeded) then it will get put on a delay queue sometime in the middle
of a tick. That tick counts in the delay calculation, so the delay
will be, on average, 1/2 a tick short of the indicated value.
So the trade-off is no added delay for packets for which only BW/loss shaping
is intended vs. more accurate delay values in general. I favor the
former. But this will require that linktest be recalibrated for latency.
* Investigate BW inaccuracies further.
* Investigate unexpected losses further.
* Can we do a partial linktest using just nodes that are marked as
> Linktest needs its own tb_compat.tcl because the linktest version of
> tb_compat.tcl overrides function definitions in order to parse out the
diff -Naur --exclude-from=xfile tcl/lan/vlan.tcl ../fbsd/ns-2.26/tcl/lan/vlan.tcl
--- tcl/lan/vlan.tcl Wed Feb 26 15:09:37 2003
+++ ../fbsd/ns-2.26/tcl/lan/vlan.tcl Wed Oct 1 17:27:06 2003
@@ -138,8 +138,12 @@
$src add-neighbor $self
set sid [$src id]
- set link_($sid:$id_) [new Vlink $ns_ $self $src $self $bw 0]
- set link_($id_:$sid) [new Vlink $ns_ $self $self $src $bw 0]
+ set link_($sid:$id_) [new Vlink $ns_ $self $src $self $bw $delay]
+ set link_($id_:$sid) [new Vlink $ns_ $self $self $src $bw $delay]
+ # linktest: add to the linktest set of links.
+ $ns_ addLTLink $sid:$id_
+ $ns_ addLTLink $id_:$sid
$src add-oif [$link_($sid:$id_) head] $link_($sid:$id_)
$src add-iif [[$nif set iface_] label] $link_($id_:$sid)
@@ -382,6 +386,7 @@
set dst_ $dst
set bw_ $b
set delay_ $d
Vlink instproc src {} { $self set src_ }
Vlink instproc dst {} { $self set dst_ }
@@ -509,6 +514,10 @@
-mactrace $mactrace]
$lan addNode $nodelist $bw $delay $llType $ifqType $macType \
$phyType $mactrace
+ # linktest renaming
+ global last_lan
+ set last_lan $lan
return $lan
diff -Naur --exclude-from=xfile tcl/lib/ns-lib.tcl ../fbsd/ns-2.26/tcl/lib/ns-lib.tcl
--- tcl/lib/ns-lib.tcl Wed Feb 26 15:09:37 2003
+++ ../fbsd/ns-2.26/tcl/lib/ns-lib.tcl Wed Oct 1 17:29:20 2003
@@ -539,6 +539,11 @@
$node set ns_ $self
$self check-node-num
+ # linktest renaming
+ global last_host
+ set last_host $node
return $node
@@ -1092,6 +1097,26 @@
$n1 set-neighbor [$n2 id]
$n2 set-neighbor [$n1 id]
+ ### linktest -- set up DupLink class to return
+ set dup [new Duplink]
+ $dup set from $link_($i1:$i2)
+ $dup set to $link_($i2:$i1)
+ # add the duplink ref to the simplex links.
+ $link_($i1:$i2) set linkRef_ $dup
+ $link_($i2:$i1) set linkRef_ $dup
+ # and add to the linktest list of links.
+ $self addLTLink $i1:$i2
+ $self addLTLink $i2:$i1
+ # naming
+ global last_link
+ set last_link $dup
+ return $dup
Simulator instproc duplex-intserv-link { n1 n2 bw pd sched signal adc args } {
diff -Naur --exclude-from=xfile tcl/lib/ns-link.tcl ../fbsd/ns-2.26/tcl/lib/ns-link.tcl
--- tcl/lib/ns-link.tcl Wed Feb 26 15:09:37 2003
+++ ../fbsd/ns-2.26/tcl/lib/ns-link.tcl Wed Oct 1 16:47:32 2003
@@ -192,7 +192,7 @@
set link_ [new $lltype]
$link_ set bandwidth_ $bw
$link_ set delay_ $delay
$queue_ target $link_
$link_ target [$dst entry]
$queue_ drop-target $drophead_
# -*- tcl -*-
# Copyright (c) 2000-2003 University of Utah and the Flux Group.
# All rights reserved.
source nstb_compat.tcl
# Linktest-specific functions. Source these before running
# linktest-ns.
# holds a pair of simplex links.
Class Duplink
# rename set in order to capture the variable names used in the ns file.
variable last_host {}
variable last_lan {}
variable last_link {}
# arrays mapping tcl hostnames to variable names
variable hosts
variable lans
variable links
# optional items
variable rtproto {}
rename set real_set
proc set {args} {
    global last_host last_lan last_link
    global hosts lans links
    real_set var [lindex $args 0]
    # Here we change ARRAY(INDEX) to ARRAY-INDEX
    regsub -all {[\(]} $var {-} out
    regsub -all {[\)]} $out {} val
    if {[llength $last_host] > 0 } {
	array set hosts [list $last_host $val]
	real_set last_host {}
    }
    if {[llength $last_lan] > 0 } {
	array set lans [list $last_lan $val]
	real_set last_lan {}
    }
    if {[llength $last_link] > 0 } {
	array set links [list $last_link $val]
	real_set last_link {}
    }
    # in all cases do a real set.
    if {[llength $args] == 1} {
	return [uplevel real_set \{[lindex $args 0]\}]
    } else {
	return [uplevel real_set \{[lindex $args 0]\} \{[lindex $args 1]\}]
    }
}
# converts internal ns structures to linktest structure
# where it's easier for me to get to them.
Class LTLink
LTLink instproc init {} {
$self instvar lanOrLink_ src_ dst_ bw_ delay_ loss_
# note: lanOrLink is the "owner" entity.
set loss_ 0
LTLink instproc lanOrLink {} {
$self instvar lanOrLink_
set lanOrLink_
LTLink instproc src {} {
$self instvar src_
set src_
LTLink instproc dst {} {
$self instvar dst_
set dst_
LTLink instproc bw {} {
$self instvar bw_
set bw_
LTLink instproc delay {} {
$self instvar delay_
set delay_
LTLink instproc loss {} {
$self instvar loss_
set loss_
LTLink instproc set_lanOrLink { lanOrLink } {
$self instvar lanOrLink_
set lanOrLink_ $lanOrLink
LTLink instproc set_src { src } {
$self instvar src_
set src_ $src
LTLink instproc set_dst { dst } {
$self instvar dst_
set dst_ $dst
# use some parsing procs provided by ns.
LTLink instproc set_bw { bw } {
$self instvar bw_
set bw_ [bw_parse $bw]
LTLink instproc set_delay { delay } {
$self instvar delay_
set delay_ [time_parse $delay]
LTLink instproc set_loss { loss } {
$self instvar loss_
set loss_ $loss
#LTLink instproc clone {} {
# $self instvar lanOrLink_ src_ dst_ bw_ delay_ loss_
# set newLink [new LTLink]
# $newLink set_src $src_
# $newLink set_dst $dst_
# $newLink set_bw $bw_
# $newLink set_delay $delay_
# $newLink set_loss $loss_
# return $newLink
# for final printing, always resolve lans to actual lists of hosts.
LTLink instproc toString {} {
$self instvar lanOrLink_ src_ dst_ bw_ delay_ loss_
global hosts
return [format "l $hosts($src_) $hosts($dst_) %10.0f %.4f %.4f" $bw_ $delay_ $loss_ ]
# linktest representation of links, containing LTLinks.
variable lt_links {}
# called by ns to add the ns link to the linktest representation
# nice accessors not necessarily available so in some cases I get
# the inst vars directly
Simulator instproc addLTLink { linkref } {
$self instvar Node_ link_
global hosts lans links lt_links
set newLink [new LTLink]
$newLink set_src [$link_($linkref) src ]
$newLink set_dst [$link_($linkref) dst ]
if {0 == [string compare [$link_($linkref) info class ] "Vlink"]} {
$newLink set_bw [$link_($linkref) set bw_ ]
$newLink set_delay [$link_($linkref) set delay_ ]
# lan reference
$newLink set_lanOrLink [$link_($linkref) set lan_ ]
# netbed-specific implementation for lans: add 1/2 the delay
$newLink set_delay [expr [$newLink delay] / 2.0]
} elseif {0 == [string compare [$link_($linkref) info class ] "SimpleLink"]} {
$newLink set_bw [$link_($linkref) bw ]
$newLink set_delay [$link_($linkref) delay ]
# duplink reference
$newLink set_lanOrLink [$link_($linkref) set linkRef_ ]
} else {
error "unknown link type!"
lappend lt_links $newLink
# just print the representation to stdout
Simulator instproc run {args} {
# store the rtproto
Simulator instproc rtproto {arg} {
global rtproto
set rtproto $arg
# update lt_links such that lans become new links containing destination hosts
# delay: sum both delays
# loss: product both losses
# bandwidth: min of both bandwidths (the bottleneck)
proc join_lans {} {
global lt_links lans
set new_links {}
set all_lans [array names lans]
foreach srclink $lt_links {
# dst a lan link?
if { [lsearch $all_lans [$srclink dst]] > -1 } {
set lan [$srclink dst]
# find all of the "receivers" for this lan.
foreach dstlink $lt_links {
if { $lan == [$dstlink src] &&
     [$srclink src] != [$dstlink dst]
} {
set newLink [new LTLink]
$newLink set_src [$srclink src]
$newLink set_dst [$dstlink dst]
$newLink set_bw [expr [$srclink bw] < [$dstlink bw] ? [$srclink bw] : [$dstlink bw]]
$newLink set_delay [expr [$srclink delay ] + [$dstlink delay]]
$newLink set_loss [expr 1.0 - (1.0 - [$srclink loss] ) * (1.0 - [$dstlink loss])]
# puts [$newLink toString]
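The combination rules in the join_lans comment above (min bandwidth, summed delays, complement-product losses) reduce to a few lines of arithmetic. A minimal Python sketch, with an invented leg representation (not the linktest code itself):

```python
def join_legs(src_leg, dst_leg):
    """Combine the src->LAN leg and the LAN->dst leg into one
    logical link: bandwidth is the bottleneck (min of the two),
    delays add, and combined loss is 1 - (1-l1)(1-l2) since a
    packet survives only if it survives both legs."""
    return {
        "bw":    min(src_leg["bw"], dst_leg["bw"]),
        "delay": src_leg["delay"] + dst_leg["delay"],
        "loss":  1.0 - (1.0 - src_leg["loss"]) * (1.0 - dst_leg["loss"]),
    }
```

For example, two 10%-loss legs of 5 ms each, one at 100 and one at 10 units of bandwidth, join into a 10-unit, 10 ms link with 19% loss.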