Commit 83e8e3c6 authored by Kevin Atkinson

Removed doc files now in the WIKI, see MOVED-TO-WIKI

parent a3fde1f8
# The following documentation files are now available on the WIKI
# at http://users.emulab.net/trac/emulab/wiki/<PAGE> and will eventually
# be removed from CVS.
#
FILE WIKI PAGE
......
Testbed Architecture Hierarchy
(from 4/7/03 testbed mtg)
(This shows hierarchy only... any apparent ordering between sibling
nodes in the tree is irrelevant and insignificant.)
NETBED SYSTEM ARCHITECTURE
* DB
- schema
- state:
- access control/admin
- virt expt config
- phys node config
* UI
- web
- command line
- NS
- Netbuild GUI
- Visualization
* TB Administration
- ? create new testbed
- ? boss/ops install
- add nodes
- local
- widearea - netbed CD
* Scheduling
- Idle detection/monitoring
- manual scheduling
- batch queue
* Access control
- User accounts
- Projects
- Groups
- permissions model?
- security
- isolation
* Experiment Configuration and Control
- Node configuration
- different impls: local, wa, sim, mux
- virt info via tmcd
- Link config
- different impls: local, wa, sim, mux
- tunnels, etc?
- Resource allocation
- Storage config
- Run-time Control
- events
- consoles
- control net?
- expt life cycle (state machine)
* WHERE???
- stated / node state machines
- control net?
- disk loading
EMERGENCIES
-----------
Disable the web interface:
wap webcontrol -m "Testbed Closed while we debug switch problems" -l nologin
CONTACTING USERS
----------------
Web interface sitevars:
web/banner - Message to place in large lettering at top of home page
(typically a special message).
web/message - Message to place in large lettering under the login
message on the Web Interface.
Mailing lists. NOTE: If these are filtered for spam you will probably
have to add an X-auth header with the appropriate password.
emulab-active-users - All the users in projects that have an
experiment swapped in.
emulab-recently-active-users - All users that have been active in
the last N days, where N comes from the general/recently_active
sitevar.
emulab-recently-active-projects - All the users in projects that
have been active in the last N days, where N comes from the
general/recently_active sitevar.
emulab-allusers - All approved users that haven't been frozen.
emulab-project-leaders - The users who are project_roots.
MALFUNCTIONING HARDWARE
-----------------------
Move a node to the hardware down experiment when an experiment is
swapped out:
wap sched_reserve emulab-ops hwdown pcXXX
Release a node from hardware down after it has been fixed:
nfree emulab-ops hwdown pcXXX
FRISBEE
-------
Frisbeed for an image dies and won't restart:
Use 'edit image descriptor' to clear the load address.
How to make a generic image from a Utah one:
1. Load up the current FBSD+RHL image.
2. FreeBSD:
2a. Update the Emulab software
See for example, ~mike/obj/doclient.
2b. Turn off cvsup
sudo rm -rf /root/.cvsup
sudo cp /dev/null /etc/emulab/nosup
sudo rm /etc/emulab/supfile
2c. Change the root password.
Set it to something generic like the newnode MFS password via
sudo passwd root
Once you set it in master.passwd, you will have to hand copy
the password hash to /etc/emulab/master.passwd. Make sure you change
both the root and toor password hashes (in the /etc/emulab file).
2d. Remove root's known_hosts file and authorized_keys.
sudo sh -c 'rm /root/.ssh/*'
Note that we used to leave our boss' authorized_keys file in the image,
but now that file is automatically overwritten as part of node setup
so there is no point.
2e. Install generic kernels.
[ As of 8/13/07 there are prebuilt versions of these in
http://www.emulab.net/downloads/generic-kernels-4.10.tar.gz.
For Utah, there are prebuilt kernels (and master.passwd files
with the usual newnode root password) in
~mike/distimages/generic-image/fbsd* ]
Build kernels from the various TESTBED-* configs and install them.
For FBSD 4.x:
TESTBED-GENERIC -> /kernel.100HZ
TESTBED-LINKDELAY-GENERIC -> /kernel.1000HZ
TESTBED-DELAY-GENERIC -> /kernel.10000HZ
TESTBED-JAIL-GENERIC -> /kernel.jail
Make sure that all the proper aliases exist too:
ln -f /kernel.1000HZ /kernel.linkdelay
ln -f /kernel.10000HZ /kernel.delay
cp -p /kernel.100HZ /kernel
For FBSD 5.x and higher:
TESTBED-GENERIC -> /boot/kernel
TESTBED-LINKDELAY-GENERIC -> /boot/kernel.linkdelay
TESTBED-DELAY-GENERIC -> /boot/kernel.delay
(there is no vnode/jail support on FBSD 5+).
2f. Shut down to single user and run the prepare script.
I always first umount NFS directories:
umount -h fs
Then I remove /users and /proj mount points:
rmdir /users/* /proj/*
(Note that this looks really dangerous in the case where something is
still NFS mounted, but it isn't, since I use "rmdir", which will not
remove a directory that is non-empty or is a mount point.)
Then run prepare:
cd /usr/local/etc/emulab
./prepare
2g. Remove the Utah SSL certs and SSH host keys
rm /etc/ssh/ssh_host*
rm /etc/emulab/*.pem
2h. Reboot into Linux
3. Linux:
3a. Update the Emulab software.
3b. Turn off cvsup
sudo rm -rf /root/.cvsup
sudo cp /dev/null /etc/emulab/nosup
sudo rm /etc/emulab/supfile
3c. Change the root password.
Set it to something generic like the newnode MFS password via
sudo passwd root
Once you set it in /etc/shadow, you will have to hand copy
the password hash to /etc/emulab/shadow. Make sure you change
both the root and toor password hashes (in the /etc/emulab file).
3d. Remove root's known_hosts file and authorized_keys.
sudo sh -c 'rm -f /root/.ssh/*'
Note that we used to leave our boss' authorized_keys file in the image,
but now that file is automatically overwritten as part of node setup
so there is no point.
3e. Install a generic kernel.
We do not yet have such a thing for Linux. So we hope the
current kernel is "generic enough".
3f. Shut down to single user and run the prepare script.
I always first umount NFS directories:
umount -at nfs
Then I remove /users and /proj mount points:
rmdir /users/* /proj/*
(Note that this looks really dangerous in the case where something is
still NFS mounted, but it isn't, since I use "rmdir", which will not
remove a directory that is non-empty or is a mount point.)
Then run prepare:
cd /usr/local/etc/emulab
./prepare
3g. Remove the Utah SSL certs and SSH host keys
rm /etc/ssh/ssh_host*
rm /etc/emulab/*.pem
3h. Reboot into the admin MFS
4. Minor admin stuff.
Once in the admin MFS, I run fsck on the filesystems just to be safe:
fsck /dev/ad0s1[aef]
e2fsck -f -y /dev/ad0s2
The latter in particular ensures that the last-fsck timestamp is updated
so the filesystem won't be forced into an fsck for another 180 days or so.
5. Create the image.
cd /proj/<pid>/images
imagezip -o /dev/ad0 <somename>-GENERIC.ndz
Cross Compiling
---------------
These are some quick notes on cross-compiling the testbed tree for the
stargates on the garcia. Autoconf generates some of the necessary
code for dealing with this stuff; you just need the necessary magic to
turn it on. In this case, running the following command line on the
RHL90-XSCALE-CROSS disk image should do the trick:
$ ../testbed/configure --host=arm-linux --build=i686-linux \
--with-brainstem=/proj/testbed/src/brainstem
The "--host" argument tells configure what CPU/OS we want the code to
run on and the "--build" argument tells it what CPU/OS we are building
on. Both arguments must be given for this to work.
XXX ... more junk ... clean up later ... can't talk in detail now ...
The magic incantations needed to build the testbed client side for the
stargates. I've put most of the results in:
/proj/testbed/src/
To do it from scratch, you'll need to allocate a node with the
RHL90-XSCALE-CROSS image. I went after libelvin and elvind first;
they needed a quick hack on their configure scripts to ignore some
tests...
libelvin: (not sure if the without/disable flags are all right...)
env PATH=/usr/local/arm/3.4.1/bin:${PATH} \
CC=arm-linux-gcc CXX=arm-linux-g++ RANLIB=arm-linux-ranlib \
./configure --prefix=/tmp/elvin-install \
--without-x --without-xt --without-gtk --disable-http \
--disable-cluster --disable-mgmt
!! Note the elvin-install dir for the prefix, we need to use this
because that is what elvin-config will report, and we don't want it
talking about the x86 config !!
env PATH=/usr/local/arm/3.4.1/bin:${PATH} gmake
env PATH=/usr/local/arm/3.4.1/bin:${PATH} gmake install
elvind: (not sure if the without/disable flags are all right...)
env PATH=/usr/local/arm/3.4.1/bin:${PATH} \
CC=arm-linux-gcc CXX=arm-linux-g++ RANLIB=arm-linux-ranlib \
./configure --with-elvin=/tmp/elvin-install/ \
--disable-mgmt --disable-cluster
env PATH=/usr/local/arm/3.4.1/bin:${PATH} gmake
env PATH=/usr/local/arm/3.4.1/bin:${PATH} gmake install \
DESTDIR=/tmp/stargate-install
Cross-compiling openssl for arm:
http://dudu.dyn.2-h.org/nist/qt-notes.php#ccSsl
env PATH=/usr/local/arm/3.4.1/bin:${PATH} gmake install_sw \
INSTALL_PREFIX=/tmp/stargate-install
gmake install_sw INSTALL_PREFIX=/tmp/stargate-install \
SHARED_LIBS="libssl.so.0.9.7 libcrypto.so.0.9.7"
Finally, build the testbed tree:
env ELVIN_CONFIG=/tmp/elvin-install/bin/elvin-config \
LDFLAGS=-L/tmp/stargate-install/usr/lib \
CFLAGS=-I/tmp/stargate-install/usr/include \
../testbed/configure --host=arm-linux --build=i686-linux \
--with-brainstem=/proj/testbed/src/brainstem
!! Add LDFLAGS and CPPFLAGS to pickup openssl !!
gmake client
gmake client-install DESTDIR=/tmp/stargate-install
env CC=arm-linux-gcc LDSHARED="arm-linux-gcc -shared -Wl,-soname,libz.so.1" \
./configure --prefix=/usr --shared
Random notes:
hostname isn't set right, it's always "stargate" instead of the emulab
one, have to run sethostname.dhclient manually.
had to build, statically link, and install mktemp since it's missing
and needed by install-tarfile.
made a fake /sbin/consoletype
copied /etc/sysconfig/network from a pc
added /var/tmp, /var/db to /etc/rcS.d/S05mountall.sh for install-tarfile
openssh:
./configure --prefix=/usr --sysconfdir=/etc/ssh --with-libs="-lresolv" --disable-strip
gmake CC=arm-linux-gcc \
LDFLAGS="-L/tmp/stargate-install/usr/lib -L. -Lopenbsd-compat" \
CFLAGS="-I/tmp/stargate-install/usr/include -g -O2 -Wall -Wpointer-arith -Wuninitialized -Wsign-compare -std=gnu99" RANLIB=arm-linux-ranlib \
AR=arm-linux-ar LD=arm-linux-gcc
gmake install CC=arm-linux-gcc \
LDFLAGS="-L/tmp/stargate-install/usr/lib -L. -Lopenbsd-compat" \
CFLAGS=-I/tmp/stargate-install/usr/include RANLIB=arm-linux-ranlib \
AR=arm-linux-ar LD=arm-linux-gcc DESTDIR=/tmp/stargate-install/
rsync flags: -avzpog --exclude-from=garcia.exclude
contents of garcia.exclude:
/proj/*
/share/*
/var/*
/proc/*
/users/*
/dev/*
/tmp/*
/mnt/*
This file gives a basic overview of experiment creation and teardown, though
it is very out-of-date (as of 4/28/03).
Prerun Outline:
Step 1 - parse.tcl
Here we convert the NS file into a temporary IR file. We
assign port numbers and generate a virtual topology in terms
of these virtual node:port's.
We also do as many checks on the tb-* commands as possible at
this point.
By the end of this step all tb-* commands should have resulted in
data in the IR file. The topology we have is very close to the
final emulation. All that is missing is the delay nodes and
extra links that go along with them.
Step 2 - Traffic Generation
We set up any traffic sources and sinks that may have been requested.
Step 3 - IP address allocation
We now fill out all unassigned IP addresses. Whenever a link or
LAN has already had some of its nodes assigned IP addresses we
preserve the subnet. In cases where no IP addresses have been
assigned we generate a unique subnet and unique IP addresses
within that subnet.
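
A minimal sketch of the fill-in rule described in Step 3. It is not the actual parser code: the /24 prefix length and the choice to start host numbering at .2 are assumptions inferred from the example later in this file, and `fill_lan_ips` and its arguments are hypothetical names.

```python
import ipaddress

def fill_lan_ips(members, assigned, fresh_subnets, prefix_len=24):
    """Fill in missing IPs for one link or LAN.

    members       -- list of "node:port" strings in the link/LAN
    assigned      -- dict of member -> IP string for the ones already set
    fresh_subnets -- iterator of ipaddress.IPv4Network, used only when
                     no member has an address yet (hypothetical source)
    """
    result = dict(assigned)
    preset = [result[m] for m in members if m in result]
    if preset:
        # Preserve the subnet of an address that is already assigned.
        net = ipaddress.ip_network(f"{preset[0]}/{prefix_len}", strict=False)
    else:
        # Nothing assigned yet: take a new, unique subnet.
        net = next(fresh_subnets)
    used = {ipaddress.ip_address(ip) for ip in preset}
    # Assumption: host numbers start at .2, as in the example tables.
    candidates = (h for h in net.hosts()
                  if int(h) - int(net.network_address) >= 2 and h not in used)
    for m in members:
        if m not in result:
            result[m] = str(next(candidates))
    return result

lan1 = fill_lan_ips(["n1:0", "n2:0"], {"n1:0": "6.6.6.2"}, iter([]))
link1 = fill_lan_ips(["n2:1", "n3:0"], {},
                     iter([ipaddress.ip_network("192.168.1.0/24")]))
```

With the inputs shown, `lan1` keeps n1 on 6.6.6.2 and places n2 in the same subnet, while `link1` draws both addresses from the fresh subnet.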
Step 4 - Update DB
If no errors have occurred then we upload our representation into
the DB (the virt_nodes and virt_lans tables).
--
Swap In Outline:
Step 1 - Snapshot current testbed state
We generate a ptop file from the current testbed state.
Step 2 - DB to top
We take the topology from the database and convert it into a top
file. This involves generating delay nodes.
Step 3 - Assign
We check that resources exist and run assign. If successful we
convert the output into a virtual node <-> physical node mapping
and a virtual node:port <-> physical node:port mapping.
Step 4 - Port Shuffling
We now shuffle ports as much as possible to match the portmap
table. The portmap table holds the last mapping (or none if this
is the first swapin). We can swap two ports if physically they
have the same bandwidth and go to the same destination. In terms
of the Utah testbed this means we can always swap ports of the
same type (ethernet, gigabit, etc.) and so should always be able
to match the previous mapping.
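
One way the shuffle in Step 4 could look, as a sketch only. The data shapes, the function name `shuffle_ports`, and the `interchangeable` predicate (standing in for "same bandwidth and same destination") are all assumptions for illustration. It also assumes the previous mapping never gives two virtual ports on one node the same physical port.

```python
def shuffle_ports(assigned, previous, interchangeable):
    """Return a copy of `assigned` moved as close to `previous` as allowed.

    assigned, previous -- dicts of (vnode, vport) -> pport name
    interchangeable    -- callable(vnode, pport_a, pport_b) -> bool,
                          hypothetical stand-in for the physical check
    """
    result = dict(assigned)
    for key, wanted in previous.items():
        if key not in result or result[key] == wanted:
            continue  # not in this swapin, or already consistent
        vnode = key[0]
        current = result[key]
        if not interchangeable(vnode, current, wanted):
            continue  # physically different ports; keep assign's choice
        # If another virtual port on the same node holds the wanted
        # physical port, hand it our current one (a plain swap).
        holder = next((k for k, p in result.items()
                       if k[0] == vnode and p == wanted), None)
        if holder is not None:
            result[holder] = current
        result[key] = wanted
    return result

previous = {("n2", 0): "eth1", ("n2", 1): "eth3"}
assigned = {("n2", 0): "eth3", ("n2", 1): "eth1"}
shuffled = shuffle_ports(assigned, previous, lambda node, a, b: True)
unchanged = shuffle_ports(assigned, previous, lambda node, a, b: False)
```

When every pair of ports is interchangeable, as the text says is the case for same-type ports at Utah, the result reproduces the previous mapping exactly.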
Step 5 - Reserve Resources
At this point we call nalloc and grab the resources we need. If
the resources are no longer available we go back to step 1 (or
terminate).
Step 6 - DB
We need to calculate and set up vlans and delays. We also update
portmap, nodes, and interfaces.
Step 7 - tbrun
We do all the old tbrun tasks. Should be identical except for
snmpit taking from vlans instead of IR file.
Step 8 - Final DB
At the very end we set the state column of the experiments table
to 'active'.
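
The retry behaviour in Step 5 ("go back to step 1 (or terminate)") can be sketched as a loop. This is an illustration under assumptions, not the real tbswapin: every callable is a hypothetical placeholder for the corresponding step, the attempt limit is invented, and port shuffling (Step 4) is left out for brevity.

```python
def swap_in(snapshot, db_to_top, assign, reserve, commit, max_attempts=3):
    """Drive the swap-in steps, retrying when reservation loses a race.

    All arguments are hypothetical callables:
      snapshot()        -> ptop describing free resources   (Step 1)
      db_to_top()       -> top built from the database      (Step 2)
      assign(top, ptop) -> mapping, or None if none exists  (Step 3)
      reserve(mapping)  -> True if nodes were grabbed       (Step 5)
      commit(mapping)   -> DB updates, tbrun, mark active   (Steps 6-8)
    """
    for _ in range(max_attempts):
        ptop = snapshot()
        top = db_to_top()
        mapping = assign(top, ptop)
        if mapping is None:
            raise RuntimeError("assign found no mapping")
        if reserve(mapping):
            commit(mapping)
            return mapping
        # Someone else took the nodes between assign and reserve:
        # start over from a fresh snapshot.
    raise RuntimeError("resources kept disappearing; giving up")

committed = []
outcomes = iter([False, True])  # first reservation fails, second succeeds
mapping = swap_in(lambda: "ptop", lambda: "top",
                  lambda top, ptop: {"n1": "pc33"},
                  lambda m: next(outcomes), committed.append)
```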
--
Swap Out Outline:
Step 1 - Teardown
We reset VLANs, clear up the named maps, and clean up any other
node state.
Step 2 - DB Teardown
We do any DB changes to show that the experiment is no longer
active. This involves changing vlans, delays, interfaces, and
nodes and changing the expt_state flag in experiments.
--
End Experiment Outline:
Step 1 - Swap Out
If the experiment is currently running we swap it out.
Step 2 - Clean up DB
We remove the experimental data from the DB. This involves
virt_nodes, virt_lans.
----------------------------------------------------------------------
Crash Recovery:
If a crash occurs during tbprerun then we should run tbend which will
clean up all partial state.
If a crash occurs during swapin we should do any necessary tasks to
reset node state and run tbswapout to clean up DB state. There is one
potential problem if the crash occurred at exactly the right moment so
that portmap was only partially updated. In this case we would need
to clear the portmap table for that experiment and would lose our
port consistency. I foresee this as being extremely unlikely and not
worth worrying about. On the Utah testbed, as the portmap should
never change after it is setup on the first swapin, this is even more
unlikely. If we decide we really care about this we can add
generation numbers to the portmap table, add the new generation before
removing the old, and add checks for crash recovery.
If a crash occurs during tbend we should be able to rerun tbend
without adverse effect.
----------------------------------------------------------------------
DB tables:
The existing tables: delays, vlans, nodes stick around to represent
current DB state.
We'll need tables to store experiments:
virt_nodes:
pid
eid
virtual name
type Indexed into node_types
ips This is a list of <port>:<ip>
osid
cmd_line
rpms
deltas
startupcmd
tarfiles
The following table is both lans and links. Links are just lans with
only two nodes.
virt_lans:
pid
eid
virtual name
members (node:ports)
delay
bandwidth
lossrate
We need a table that stores the previous host:port virtual<->physical
mapping. This table is used in the port shuffling step of tbswapin to
try to maximize port consistency across swaps. While an experiment is
swapped in this table is redundant with the vname columns of the
nodes and interfaces table. However, as soon as an experiment is
swapped out this becomes a useful record of the previous mapping.
portmap:
pid
eid
vnode
vport
pport
Add expt_state to experiments with values 'active'|'dormant'.
Add vname to interfaces table which holds virtual name of each
interface.
--
How it connects:
Virtual Physical
virt_nodes nodes
virt_lans delays, vlans
--
Example:
NS File:
----------------------------------------------------------------------
source tb_compat.tcl
set ns [new Simulator]
set n1 [$ns node]
set n2 [$ns node]
set n3 [$ns node]
set lan1 [$ns make-lan "$n1 $n2" 100Mb 0ms]
set link1 [$ns duplex-link $n3 $n2 100Mb 0ms DropTail]
tb-set-node-ip $n1 6.6.6.2
----------------------------------------------------------------------
This will generate a virt_nodes table of something like:
pid eid vname ips ...
testbed chris n1 0:6.6.6.2
testbed chris n2 0:6.6.6.3 1:192.168.1.2
testbed chris n3 0:192.168.1.3
and a virt_lans table of:
pid eid vname members ...
testbed chris lan1 n1:0 n2:0
testbed chris link1 n2:1 n3:0
Now assign runs and comes up with the following assignment:
vnode pnode
n1 pc33
n2 pc19
n3 pc2
vport pport
n1:0 pc33:eth0
n2:0 pc19:eth1
n2:1 pc19:eth3
n3:0 pc2:eth2
As this is the first run we do no port shuffling. We do update the
portmap table:
pid eid vnode vport pport
testbed chris n1 0 eth0
testbed chris n2 0 eth1
testbed chris n2 1 eth3
testbed chris n3 0 eth2
We now generate the vlans table:
id pid eid virtual members
0 testbed chris lan1 pc33:eth0 pc19:eth1
1 testbed chris link1 pc19:eth3 pc2:eth2
Finally we fill in the reserved and interfaces table:
reserved:
node_id pid eid vname ...
pc33 testbed chris n1
pc19 testbed chris n2
pc2 testbed chris n3
interfaces:
node_id port IP ...
pc33 eth0 6.6.6.2
pc19 eth1 6.6.6.3
pc19 eth3 192.168.1.2
pc2 eth2 192.168.1.3
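
The translation from virt_lans to vlans in this example can be reproduced with a few lines. This is only a sketch of the lookup, using hypothetical names (`build_vlans` and its arguments); it ignores delay nodes, which is fine here because both the LAN and the link are 0ms.

```python
def build_vlans(virt_lans, vnode_to_pnode, vport_to_pport):
    """Turn virtual members ("n1:0 n2:0") into physical ones.

    virt_lans      -- list of (vname, members-string) rows
    vnode_to_pnode -- dict of vnode -> pnode (from assign)
    vport_to_pport -- dict of (vnode, vport) -> pport (from assign)
    """
    rows = []
    for vlan_id, (vname, members) in enumerate(virt_lans):
        phys = []
        for member in members.split():
            vnode, vport = member.split(":")
            pnode = vnode_to_pnode[vnode]
            pport = vport_to_pport[(vnode, int(vport))]
            phys.append(f"{pnode}:{pport}")
        rows.append((vlan_id, vname, " ".join(phys)))
    return rows

# Data copied from the example tables above.
virt_lans = [("lan1", "n1:0 n2:0"), ("link1", "n2:1 n3:0")]
vnode_to_pnode = {"n1": "pc33", "n2": "pc19", "n3": "pc2"}
vport_to_pport = {("n1", 0): "eth0", ("n2", 0): "eth1",
                  ("n2", 1): "eth3", ("n3", 0): "eth2"}
rows = build_vlans(virt_lans, vnode_to_pnode, vport_to_pport)
```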
This file contains notes about the error codes from a few testbed programs
that provide more information than success/failure in their return values.
NOTE: It's possible that this file has gotten out of sync with reality, so be
sure to check the program in question before using these.
assign:
0 - success
1 - failure; try assign again
2 - failure; do not try assign again - the given top can never map to
the given ptop
assign_wrapper:
0 - success
>0 = failure, returncode is 1 OR a subset of:
2 - max_concurrent violation
4 - bandwidth violation
8 - linkusers violation
16 - desires violation
32 - unassigned (e.g., not enough nodes, I think)
64 - "recoverability":
Think of this as "clean";
indicates that assign_wrapper did not get past the assigning phase;
[If a swap-modify fails, this must be set for recovery to happen]
tbswap:
0 - success
1 - failure not involving assign_wrapper
>1 - failure from assign_wrapper;
returncode is whatever assign_wrapper returned,
WITHOUT the '64' bit set.
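
Since the assign_wrapper codes above are powers of two, a caller can test them bit by bit. The following is a sketch under that reading of the list, not code from the testbed tree; all names are hypothetical, and the bit values should be checked against the program itself, as the NOTE above warns.

```python
# Bit values copied from the assign_wrapper list above.
ASSIGN_WRAPPER_BITS = {
    1: "failure (no specific violation)",
    2: "max_concurrent violation",
    4: "bandwidth violation",
    8: "linkusers violation",
    16: "desires violation",
    32: "unassigned",
}
RECOVERABLE = 64  # the "recoverability" / "clean" bit

def decode_assign_wrapper(rc):
    """List the failure reasons encoded in an assign_wrapper code."""
    return [name for bit, name in sorted(ASSIGN_WRAPPER_BITS.items())
            if rc & bit]

def is_recoverable(rc):
    """True if assign_wrapper did not get past the assigning phase."""
    return bool(rc & RECOVERABLE)

def as_tbswap_code(rc):
    """What tbswap would report: the same code without the 64 bit."""
    return rc & ~RECOVERABLE

rc = 4 | 32 | 64  # bandwidth + unassigned, still recoverable
reasons = decode_assign_wrapper(rc)
```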
How we assume IP (or more correctly, IPv4):
1. Most obviously, assignment of node addresses, either explicitly by the
user or implicitly by us, is IPv4. IPv6 addresses get assigned to
interfaces as well (at least by FreeBSD) as a side-effect, but in general,
we support nothing else.
2. Routing (manual, static, session) is all IPv4 as well.
3. As of FreeBSD 4.7, ipfw only handles IPv4. When bridging, as in our
delay nodes, non-IPv4 packets are forwarded without applying any rules.
This means that ARP and VETH packets do not get shaped. This is arguably
wrong. It would be completely wrong for VETH except that all veth
shaping is done using link delays and applied before encapsulation is
done. And no veth traffic should ever go through a delayed link.
4. Current jail support restricts interface access indirectly by specifying
the IPv4 addresses that a jail can use. Raw socket access in jails is
likewise restricted.
5. The multiple routing table hack we use for vnodes only applies to IPv4.
What we need to do to support raw ethernet access:
1. Disable IP assignment and setup. Do we need to give the user a way
(on a node) to associate an NS-named link with a MAC address?
2. Switch to at least FreeBSD 4.9 and use ipfw2 which supports ipfw/dummynet
at the MAC level. Then we can shape non-IP packets in general.
3. Switch jails to explicitly bind to interfaces.
What we need to do to support IPv6:
1. Support IPv6 addresses in our extended NS syntax for node address
assignment as well as internally for our implicit address
assignments.
2. Support IPv6 addresses in routing. For static/manual this is just a
matter of internal DB representation (I think). For session, we'll
have to see if gated even supports IPv6.
3. Delay nodes. May need a different ipfw (ip6fw?). But probably not for
the granularity of (ipfw and dummynet) rules we use (since they are
largely interface-based, but we do use "next-hop").
4. Fix jails to understand IPv6. This is more than just restricting jail
interfaces directly, there is *no* jail support in the IPv6 code.
5. Use something other than the multiple-routing table hack. Probably
switch to the multiple network stack implementation. We could hack
the multi-routing table stuff, but...ugh!
6. Configure/enable IPv6 access in any number of applications. Hopefully,
most are set up to handle either v4 or v6 by default.
7. Support IPv6 on the control net? To communicate with the outside
world, we cannot do this. Our router doesn't support IPv6. They
will have to use GIF tunnels. Ditto to communicate from nodes to
boss/ops, since that goes through the router. For node to node
control net communication, it shouldn't be any different than the
experimental interfaces.
8. Multicast? Probably won't be able to do this, the switch will not
recognize IPv6 multicast packets.
FREEBSD 4.3 Patches:
/usr/src/lib/libstand/net.c:
*** Minor errno=0 fix to avoid repeated error condition never being
cleared.
/usr/src/lib/libstand/tftp.c: