This section of the tutorial describes how to run your first Testbed experiment. We cover basic NS syntax and the operational issues that you will need to know in order to conduct experiments to completion. Later sections of the tutorial cover more advanced topics such as loading your own RPMs automatically, running programs automatically, running batch jobs, creating your own disk images, and loading those images on your nodes.
If you already have an account on the Testbed, all you need to do is go to the Emulab Home Page, enter your login name and your password, and then click on the "Login" button. If you don't have an account, click on the "Join Project" or "Start Project" links. For an overview of how you go about getting an Emulab account, go to the "How To Get Started" page.
The Testbed's power lies in its ability to assume many different topologies; a description of such a topology is a necessary part of an experiment.
Emulab uses the "NS" ("Network Simulator") format to describe network topologies. This is substantially the same Tcl-based format used by ns-2. Since the Testbed offers emulation, rather than simulation, these files are interpreted in a somewhat different manner than in ns-2. Therefore, some ns-2 functionality may work differently than you expect, or may not be implemented. If you feel there is useful functionality missing, please let us know. Also, some testbed-specific syntax has been added which, with the inclusion of a compatibility module (tb_compat.tcl), will be ignored by the NS simulator. This allows the same NS file to work on both the Testbed and ns-2, most of the time.
For those unfamiliar with the NS format, here is a small example.
(We urge all new Emulab users to begin with a small 3-5 node experiment
such as this, so that you become familiar with NS syntax and the
practical aspects of Emulab operation.) Let's say we are trying to
create a test network of four nodes, A through D, in which node B is
linked directly to each of the other three.
An NS file which would describe such a topology is as follows. First
off, all NS files start with a simple prolog, declaring a simulator
and including a file that allows you to use the special tb- commands:
# This is a simple ns script. Comments start with #.
set ns [new Simulator]
source tb_compat.tcl
Then define the 4 nodes in the topology.
set NodeA [$ns node]
set NodeB [$ns node]
set NodeC [$ns node]
set NodeD [$ns node]
Next define the 3 links between the nodes. NS syntax permits you to
specify the bandwidth, latency, and queue type. For our example, we
will define full speed links from B to C and from B to D, and a
delayed link from node A to B.
$ns duplex-link $NodeA $NodeB 100Mb 50ms DropTail
$ns duplex-link $NodeB $NodeC 100Mb .1ms DropTail
$ns duplex-link $NodeB $NodeD 100Mb .1ms DropTail
In addition to the standard NS syntax above, a number of extensions
have been added that allow you to better control your experiment. For
example, you may specify what Operating System is booted on your
nodes. We currently support FreeBSD 4.3 and Linux RedHat 7.1, as well
as OSKit kernels on the testbed PCs. By default, Linux RedHat 7.1 is
selected.
tb-set-node-os $NodeA FBSD-STD
tb-set-node-os $NodeC RHL-STD
You may also control what IP addresses are assigned to the
experimental interfaces on your nodes. The experiment configuration
software will select IP addresses for you, but if your experiment
depends on particular IP addresses, you may specify them at each
link. The following example sets the IP address of node B on the port
going to node C:
tb-set-ip-interface $NodeB $NodeC 192.168.42.42
Lastly, all NS files end with an epilog that instructs the simulator
to start.
# Go!
$ns run
If you would like to try the above example, the completed NS file can be run as an experiment in your project. Another example ns script that shows off using the power of Tcl to generate topologies is here.
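For convenience, the fragments above can be assembled into a single file before uploading. The following shell sketch simply writes them out in order; the file name first.ns is an arbitrary choice made here, not something the Testbed requires, and the NS content is taken unchanged from the fragments above.

```shell
#!/bin/sh
# Write the tutorial's example topology to first.ns (hypothetical file name).
# The quoted 'EOF' stops the shell from expanding $ns, $NodeA, and so on.
cat > first.ns <<'EOF'
# This is a simple ns script. Comments start with #.
set ns [new Simulator]
source tb_compat.tcl

# The 4 nodes in the topology.
set NodeA [$ns node]
set NodeB [$ns node]
set NodeC [$ns node]
set NodeD [$ns node]

# A delayed link from A to B, and full speed links from B to C and D.
$ns duplex-link $NodeA $NodeB 100Mb 50ms DropTail
$ns duplex-link $NodeB $NodeC 100Mb .1ms DropTail
$ns duplex-link $NodeB $NodeD 100Mb .1ms DropTail

# Testbed extensions: operating system and IP address selection.
tb-set-node-os $NodeA FBSD-STD
tb-set-node-os $NodeC RHL-STD
tb-set-ip-interface $NodeB $NodeC 192.168.42.42

# Go!
$ns run
EOF
```

The resulting file is what you would name in the "Your NS file" field when creating the experiment.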
After logging on to the Testbed Web Interface, choose the "Begin Experiment" option from the menu. First select which project you want the experiment to be configured in. Most people will be a member of just one project, and will not have a choice. If you are a member of multiple projects, be sure to select the correct project from the menu.
Next fill in the `Name' and `Long Name' fields. The Name should be a single word (no spaces) identifier, while the Long Name is a multi-word description of your experiment. In the "Your NS file" field, place the local path of an NS file which you have created to describe your network topology. This file will be uploaded through your browser when you choose "Submit."
After submission, the Testbed interface will begin processing your request. This will likely take several minutes, depending on how large your topology is, and what other features (such as delay nodes and bandwidth limits) you are using. When processing finishes, you will receive an email message indicating success or failure and, if successful, a listing of the nodes and IP addresses that were allocated to your experiment.
For the NS file described above, you would receive a listing that looks
similar to this:
Node Mapping:
Virtual         Physical        Qualified Name
--------------- --------------- --------------------
nodeA           pc12            nodeA.myproj.myexp.emulab.net
nodeB           pc14            nodeB.myproj.myexp.emulab.net
nodeC           pc16            nodeC.myproj.myexp.emulab.net
nodeD           pc18            nodeD.myproj.myexp.emulab.net
delay0          pc7             delay0.myproj.myexp.emulab.net
IP Addresses:
Node            IFC                  Destination          IP
--------------- -------------------- -------------------- ---------------
nodeB           eth0                 nodeC                192.168.42.42
nodeB           eth2                 nodeA                192.168.2.2
nodeA           eth0                 nodeB                192.168.2.3
nodeC           eth0                 nodeB                192.168.42.2
nodeD           eth0                 nodeB                192.168.3.2
nodeB           eth1                 nodeD                192.168.3.3
A few points should be noted:
By the time you receive the email message listing your nodes, the Testbed configuration system will have ensured that your nodes are fully configured and ready to use. If you have selected one of the Testbed supported operating system images (FreeBSD, Linux, NetBSD), this configuration process includes:
At this point you may log into any of the nodes in your experiment. You will need to use Secure Shell (ssh), and you should use the `qualified name' from the node mapping table so that you do not form dependencies on any particular physical node. Your login name and password will be the same as your Web Interface login and password.
The /etc/hosts file on each node will provide a local name mapping for the other nodes in your experiment. You should take care to use these names (or IP numbers) and not the .emulab.net names listed in the node mapping, since the emulab names refer to the control network LAN that is shared amongst all nodes in all experiments. It is only the experimental interfaces that are entirely private to your experiment.
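As a small illustration of logging in by qualified name rather than by physical node, the shell sketch below builds names in the form shown in the sample node mapping and prints the matching ssh commands. The helper name qualified_name is hypothetical, and the order of the name components is copied from the sample listing (nodeA.myproj.myexp.emulab.net), not from any other reference.

```shell
#!/bin/sh
# qualified_name NODE PROJECT EXPERIMENT
# Prints a name in the form used by the sample node mapping,
# for example nodeA.myproj.myexp.emulab.net.
qualified_name() {
    printf '%s.%s.%s.emulab.net\n' "$1" "$2" "$3"
}

# Print (rather than run) an ssh command for each node of the example.
for node in nodeA nodeB nodeC nodeD; do
    echo "ssh $(qualified_name "$node" myproj myexp)"
done
```

Once logged in, remember that traffic between your nodes should use the names from /etc/hosts, since the .emulab.net names refer to the shared control network.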
NOTE: The configuration process just described occurs only on Emulab constructed operating system images. If you are using an OSKit kernel, or your own disk image with your own operating system, you will be responsible for all of the configuration. At some point we hope to provide tools to assist in the configuration, but for now you are on your own.
If you need to customize the configuration, or perhaps reboot nodes,
you can use the "sudo" command, located in /usr/local/bin on FreeBSD
and Linux, and /usr/pkg/bin on NetBSD. Our policy is very liberal; you
can customize the configuration in any way you like, provided it does
not violate Emulab's administrative policies. As an example, to reboot
a node that is running FreeBSD:
/usr/local/bin/sudo reboot
This is bound to happen when running experimental software and/or
experimental operating systems. Fortunately we have an easy way for
you to power cycle nodes without requiring Testbed Operations to get
involved. If you must power cycle a node, log on to users.emulab.net
and use the "node_reboot" command:
where `node' is the physical name, as listed in the node mapping
table. You may provide more than one node on the command line. Be
aware that you may power cycle only nodes in projects that you are
a member of. Also, node_reboot does its very best to perform a
clean reboot before resorting to cycling the power to the node. This
is to prevent the damage that can occur from constant power cycling
over a long period of time. For this reason, node_reboot may
delay a minute or two if it detects that the machine is still
responsive to network transmission. In any event, please try to
reboot your nodes first (see above).
You may also reboot all the nodes in an experiment by using the -e
option to specify the project and experiment names. For example:
will reboot all of the nodes reserved in the "multicast" experiment in
the "testbed" project. This option is provided as a shorthand method
for rebooting large groups of nodes.
The Testbed NS extension tb-set-node-rpms allows you to
specify a (space separated) list of RPMs to install on each of your
nodes when it boots:
tb-set-node-rpms $nodeA /proj/pid/rpms/silly-freebsd.rpm
tb-set-node-rpms $nodeB /proj/pid/rpms/silly-linux.rpm
The above NS code says to install the silly-freebsd.rpm file
on nodeA, and the silly-linux.rpm file on nodeB.
RPMs are installed as root when the node first boots, and must reside
on the node's local filesystem, or in a directory that can be reached
via NFS. The latter is either the project's /proj directory, or a
project member's home directory in /users.
You can start your application automatically when your nodes boot by
using the tb-set-node-startup NS extension. The argument is
the pathname of a script or program that is run as the UID of
the experiment creator, after the node has reached multiuser mode. You
can specify the same program for each node, or a different program.
For example:
tb-set-node-startup $nodeA /proj/pid/runme.nodeA
tb-set-node-startup $nodeB /proj/pid/runme.nodeB
will run /proj/pid/runme.nodeA on nodeA and
/proj/pid/runme.nodeB on nodeB. The programs must reside on
the node's local filesystem, or in a directory that can be reached via
NFS. The latter is either the project's /proj directory, or a
project member's home directory in /users.
The exit value of the startup command is reported back to the Web Interface, and is made available to you via the "Experiment Information" link. There is a listing for all of the nodes in the experiment, and the exit value is recorded in this listing. The special symbol none indicates that the node is still running the startup command. A log file containing the output of the startup command is created in the project's logs directory (/proj/pid/logs).
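Because the exit value and the output of the startup command are what you later see on the Web Interface and in the log file, it can help to make the startup script pass both through cleanly. The following is a minimal sketch of such a wrapper; the function name run_startup and the workload path in the usage comment are hypothetical.

```shell
#!/bin/sh
# run_startup COMMAND [ARGS...]
# Runs the real workload, prints a short trace to stdout, and returns
# the workload's own exit status so the caller can exit with it.
run_startup() {
    echo "startup: running $*"
    "$@"
    status=$?
    echo "startup: exit status $status"
    return "$status"
}

# Hypothetical usage at the end of a startup script:
#   run_startup /proj/pid/bin/my-workload
#   exit $?
```

Exiting with the workload's own status means a non-zero value in the node listing points directly at the program that failed.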
The startup command is especially useful when combined with batch mode experiments.
It is often necessary for your startup program to determine when all
of the other nodes in the experiment have started, and are ready to
proceed. Sometimes called a barrier, this allows programs to
wait at a specific point, and then all proceed at once. Emulab
provides a primitive form of this mechanism using experiment ready
bits, which are set and read using TMCD/TMCC. When an experiment is
first configured, the ready bit for each node is cleared. As each node
starts its application and reaches the point where it must be sure
that all other nodes have started up, it issues a TMCC ready command:
tmcc ready
which tells Emulab's configuration system that the node is ready to
proceed. The node can then poll for the ready count to
determine how many nodes are ready (have issued a tmcc ready command):
tmcc readycount
which will return the ready count as a string:
READY=N TOTAL=M
where N is the number of nodes that are ready, and M
is the total number of nodes in the experiment. An application can
poll the ready count with a simple script, or it can encode the ready
bits check directly into its program. For example, here is a simple
Perl fragment that issues the ready command, and then polls for the
ready count, being sure to delay a small amount between each poll.
system("tmcc ready");
while (1) {
    my $bits = `tmcc readycount`;
    if ($bits =~ /READY=(\d+) TOTAL=(\d+)/) {
        # Leave the loop once every node has reported ready.
        if ($1 == $2) {
            last;
        }
    }
    #
    # Please sleep to avoid swamping the TMCD!
    #
    sleep(5);
}
Note that the ready count is essentially a use-once feature; the
ready count cannot be reinitialized to zero since there is no actual
synchronization happening. If in the future it appears that a
generalized barrier synchronization would be more useful, we will
investigate the implementation of such a feature.
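The same ready-bit barrier can also be sketched as a pair of shell functions, which may be handy when the startup program is itself a shell script. This is only an illustration: the function names and the TMCC override variable are assumptions introduced here, and the sketch relies solely on the two commands described above (tmcc ready and tmcc readycount) and on the READY=N TOTAL=M reply format.

```shell
#!/bin/sh
# Command used to talk to the TMCD. The override exists only so the
# functions can be exercised without a real tmcc (an assumption made here).
TMCC=${TMCC:-tmcc}

# all_ready LINE
# Succeeds when LINE contains "READY=N TOTAL=M" and N equals M.
all_ready() {
    pair=$(printf '%s\n' "$1" |
        sed -n 's/.*READY=\([0-9][0-9]*\) TOTAL=\([0-9][0-9]*\).*/\1 \2/p')
    [ -n "$pair" ] || return 1
    [ "${pair% *}" -eq "${pair#* }" ]
}

# wait_for_all [INTERVAL]
# Sets this node's ready bit, then polls the ready count every INTERVAL
# seconds (default 5) until all nodes have reported ready.
wait_for_all() {
    interval=${1:-5}
    $TMCC ready
    until all_ready "$($TMCC readycount)"; do
        sleep "$interval"   # please sleep to avoid swamping the TMCD
    done
}
```

A startup script would call wait_for_all once, at the point where it must be sure that all other nodes have started; since the ready count is a use-once feature, calling it a second time would return immediately.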
If your set of operating system customizations cannot be easily contained within an RPM (or multiple RPMs), or if you are just not familiar with the RPM mechanism, then you can create your own operating system delta. A delta is like an RPM or tar file in that it contains a bunch of files to be unpacked onto the node. The difference is that with a delta you do not have to figure out which files you changed, or how to automate the installation process. Instead, you just allocate a node, change it any way you like, and then issue the create-delta command. The resulting delta file can then be specified in your NS file using the Testbed NS extension tb-set-node-deltas. You can create one delta to install on all of your nodes, or several different deltas for various nodes in your experiment. When the nodes in your new experiment boot for the first time, the delta will be installed (all of the files unpacked) very early in the boot phase, and the node is rebooted again (in case you have installed daemons that need to be started during initialization). Your experiment can then proceed.
The key point is that the Testbed configuration software deals with figuring out what files you changed, installing the delta on your nodes, rebooting the nodes that have new software installed, and ensuring that any particular delta is installed only once on each node.
Let's step through an example. The first thing you need to do is
create an experiment with a single node in it. The following NS file
can be submitted to the "Begin Experiment" page.
set ns [new Simulator]
source tb_compat.tcl
set nodeA [$ns node]
tb-set-node-os $nodeA FBSD-STD
$ns run
When you have received email notification that the experiment has been
configured, log into the node with ssh. Install whatever
software you like, making sure to update the necessary files if you
have installed daemons that need to be started automatically at boot
time. After all of your software is installed, create the delta file
with:
sudo /usr/local/bin/create-delta /proj/testbed/foo.delta
The argument to the create-delta command is a complete
pathname, which must reside someplace in your /proj directory
(a subdirectory is fine). You cannot write the delta file to any
other filesystem. This restriction is enforced so that disk space (and
resources in general) can be accounted for on a per-project basis.
It should be noted that a delta created on one OS
cannot be installed on another. In other words, a delta created on a
FreeBSD machine can only be installed on a FreeBSD machine. If you
need the same software installed on a Linux machine as well, you will
need to repeat this process with a node running Linux. See the section
on tb-set-node-os in the Extensions reference.
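Since create-delta only accepts a pathname under /proj, a script that builds deltas may want to check the path before invoking it. The sketch below is one way to do that; both function names are hypothetical, the check is purely textual (it does not resolve symbolic links), and the create-delta invocation is the one shown in this section.

```shell
#!/bin/sh
# is_proj_path PATH
# Succeeds if PATH is an absolute path under /proj/ and contains no
# ".." component. This is a textual check only.
is_proj_path() {
    case "$1" in
        */../*|*/..) return 1 ;;
        /proj/?*)    return 0 ;;
        *)           return 1 ;;
    esac
}

# create_delta_checked PATH
# Refuses paths outside /proj, otherwise runs create-delta as in the text.
create_delta_checked() {
    if ! is_proj_path "$1"; then
        echo "refusing: $1 is not under /proj" >&2
        return 1
    fi
    sudo /usr/local/bin/create-delta "$1"
}
```

For example, create_delta_checked /proj/testbed/foo.delta would run the command from the example, while create_delta_checked /tmp/foo.delta would stop with an error before sudo is ever invoked.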
After you have created your delta, you can then use it in subsequent
experiments by using the Testbed NS extension
tb-set-node-deltas. For example, here is an NS file that
creates a two node experiment, installs a different delta on each
node, and then runs a program automatically on each node. Presumably,
the startup program is installed by the delta, and encapsulates the
experiment being performed.
set ns [new Simulator]
source tb_compat.tcl
set nodeA [$ns node]
set nodeB [$ns node]
tb-set-node-os $nodeA FBSD-STD
tb-set-node-os $nodeB RHL-STD
tb-set-node-deltas $nodeA /proj/testbed/deltas/silly-freebsd.delta
tb-set-node-deltas $nodeB /proj/testbed/deltas/silly-linux.delta
tb-set-node-startup $nodeA /usr/site/bin/run-my-experiment
tb-set-node-startup $nodeB /usr/site/bin/run-my-experiment
$ns run
Implementation Notes:
Batch Mode experiments can be created on the Testbed via the "Create a Batch Experiment" link in the operations menu to your left. A batch mode experiment is a lot like a regular experiment, but with a few important differences:
tb-set-node-rpms $nodeA /proj/testbed/rpms/silly-1.0-1.i386-freebsd.rpm
tb-set-node-rpms $nodeB /proj/testbed/rpms/silly-1.0-1.i386-freebsd.rpm
The next two lines of the NS file specify what program should be run
on each of the nodes. Using the
tb-set-node-startup NS extension, we say that the program
run-silly (installed by the silly-1.0 RPM) is to be
run on both nodes:
tb-set-node-startup $nodeA /usr/site/bin/run-silly
tb-set-node-startup $nodeB /usr/site/bin/run-silly
After you have been notified via email that the batch experiment is
running, you can track the progress of your experiment by looking in
the "Experiment Information" page. As each node completes the startup
command, the listing for that node will be updated to reflect the exit
status of the command (you may need to hit the Reload button to see
the changes). Once all of the nodes have reported an exit status,
the batch system will tear down the experiment and send you email. If
your experiment is such that one node is the controller, and runs
commands on all the other nodes, then simply run a dummy startup
command on the other nodes so that the batch system will receive an
exit value for each of them. Since the batch is not terminated until
all nodes have reported in, be sure that the controlling node
does not exit from its startup command until all of the nodes have
finished. A dummy startup command can be set up like this:
tb-set-node-startup $nodeC /bin/echo
The status of your batch experiment can be viewed via the "Experiment Information" link in the Web Interface Options menu. You may also cancel a batch after you have submitted it using the "Terminate" option in the information display. As noted in the section on the startup command, the output of the startup command on each node is written to a separate file in your project log directory. You can use these log files to debug your batch experiment.
The batch system is still under development. It appears to be functional, but there are bound to be kinks in the system. Please help us debug and improve it by letting us know what you think and if you have problems with it. Currently, the batch system tries every 10 minutes to run your batch. It will send you email every 5 or so attempts to let you know that it is trying, but that resources are not available. It is a good idea to glance at the message to make sure that the problem is lack of resources and not an error in your NS file.