Robert Ricci authored
files in doc/, but got the important ones.
#
#
# EMULAB-COPYRIGHT
# Copyright (c) 2003 University of Utah and the Flux Group.
# All rights reserved.
#
======================
Emulab Source Tree Map
======================
This file documents roughly the contents of our source tree as of
April, 2003. Some of the entries in here are per-script, others are
for a group of scripts, in which case the documentation inside the
individual scripts should be sufficient explanation. The end of the
file also has some overview-ish stuff about abstractions and things
like that.
[This file maintained by testbed-ops@emulab.net]
For big picture and some details, read the OSDI'02 paper, in
doc/papers/netbed-osdi02* and on the Web.
Accounts
- unix accounts
- unix group management (per-proj and per-group)
- ssh key distribution
- sfs key distribution
- account permissions (web only, ron/wa, root/non-root, etc.)
- emulab permissions
- control of hardware/hw config.
- administrative control
- hierarchical organization
- delegation at all levels
- trust models and their security impact
Assign (resource allocation algorithms)
- the Testbed Mapping Problem: read draft of our upcoming CCR paper in doc/papers
- NP Hard
- in some ways, constraint satisfaction problem
- but more, because not all satisfactory solutions are equal
- time constraints: we're an interactive system, and need to
perform on interactive timescales - a few seconds max to get a
good answer
- variation in wide area
- soft matching
- complicated more by fact that we can combine the unknown
(wide-area link) with something we control (traffic shaping)
- Emulab solution
- many "valid" solutions, but difference between near-optimal and
random valid soln. is huge and important
- sim. annealing core
- highly optimized
- clever domain specific tricks
- main purpose is to conserve scarce resources (nodes,
interswitch bandwidth, soon special hw like GigE)
- lots of parameters, not always clear how to tune them
- Netbed solution
- typically no exact match, just some that may be closer than
others - very fuzzy matching
- genetic algo. core
- not as highly developed yet, but meets our needs
- main purpose is to find a real-world overlay that matches the
supplied topology as closely as possible
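To make the simulated-annealing idea in the Emulab solution concrete, here
is a minimal Python sketch. It is not the assign code: the cost function
(number of virtual links that cross switches), the swap move, the cooling
schedule, and all names (anneal, interswitch_links, switch_of) are
assumptions chosen for illustration. The real solver also has to handle
node types, bandwidth limits, and the many tunable parameters noted above.

```python
import math
import random


def interswitch_links(mapping, links, switch_of):
    """Count virtual links whose endpoints land on different switches."""
    return sum(1 for a, b in links
               if switch_of[mapping[a]] != switch_of[mapping[b]])


def anneal(vnodes, pnodes, links, switch_of, steps=5000, seed=0):
    """Map each virtual node to a distinct physical node, trying to
    minimize the number of links that cross switches (assumed cost)."""
    if not vnodes:
        return {}, 0
    if len(pnodes) < len(vnodes):
        raise ValueError("not enough physical nodes")
    rng = random.Random(seed)
    # Random valid start: the first len(vnodes) slots of the shuffled
    # pool are the physical nodes currently in use.
    pool = list(pnodes)
    rng.shuffle(pool)
    used = len(vnodes)
    mapping = dict(zip(vnodes, pool))
    cost = interswitch_links(mapping, links, switch_of)
    best, best_cost = dict(mapping), cost
    temp = 1.0
    for _ in range(steps):
        # Move: swap an in-use slot with any other slot in the pool.
        i = rng.randrange(used)
        j = rng.randrange(len(pool))
        if i == j:
            continue
        pool[i], pool[j] = pool[j], pool[i]
        candidate = dict(zip(vnodes, pool))
        candidate_cost = interswitch_links(candidate, links, switch_of)
        delta = candidate_cost - cost
        # Always accept improvements; accept worse moves with a
        # probability that shrinks as the temperature drops.
        if delta <= 0 or rng.random() < math.exp(-delta / temp):
            mapping, cost = candidate, candidate_cost
            if cost < best_cost:
                best, best_cost = dict(mapping), cost
        else:
            pool[i], pool[j] = pool[j], pool[i]  # undo the swap
        temp = max(temp * 0.999, 0.001)
    return best, best_cost
```

Every mapping the loop visits is valid; the annealing only decides which
valid mapping to keep, which is the "not all satisfactory solutions are
equal" point made above.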
Capture/console (node consoles - "'zero-penalty' remote research")
- serial line consoles to nodes replace kbd/vga
- fine-grained access control
- changes quickly when node changes "ownership"
- simple, secure remote access
- ACLs, authenticated ssl tunnel program + standard telnet client
CD-ROM (remote node mgmt/robustness, adding nodes to the system)
- simple to add a node
- fallback boot method (CD-ROM) when disk is hosed
- path for self-update and disk reimaging
- goal is to reduce need for human intervention whenever possible
Database (centralized store for persistent shared system state)
- lots o' stuff here
- most stuff falls into one of several categories
- semi-permanent hw setup info (wires, ifaces, nodes, outlets)
- current hardware configs (reservations, ifaces, vlans, etc)
- semi-permanent sw setup info (disk images, OS's, etc.)
- current sw setup (traffic shaping, trafgen, routing, etc.)
- virtualized expt info (topology, config, etc)
- administrative info (users, groups, projects, etc.)
- misc. config bits and logging
- sw engineering issues
- db schema must match sw build
IXP (special hw resources) [not released due to Intel license restrictions]
- use as testbed infrastructure
- traffic shaping
- use for experimentation
- shared facil. gives more people access, increases usage
- emulab is good environment w/many tools
Event system (distributed event coordination/communication)
- "Elvin" publish/subscribe system underneath (imported from elsewhere)
- used in several directions
- emulab to nodes/programs
- nodes to emulab
- programs on emulab server to each other
- can be nodes to nodes too
- delay agent
- coordinated control of traffic shaping
- changes can initiate anywhere
- automatic timed changes from emulab
- manual changes from emulab server or a node
- allows for reactive traffic shaping, trace playback, etc.
- nsetrafgen
- control of NSE simulators and their traffic generation
- program agent
- start/stop arbitrary program
- timed or manual, and allows reactivity
- event scheduler
- controls timed events
- may be submitted a priori or during a run
- stated uses it heavily, but is described elsewhere
- tevc/tevd
- simple command line client for use on any server or node
- trafgen
- traffic generation via TG toolkit
- patched to allow control via events
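The event scheduler's job (timed events that may be submitted before or
during a run) can be sketched with a priority queue. This is only an
illustration in Python: the class EventScheduler, its methods, and the
(agent, action) tuple layout are assumptions, and the sketch leaves out
Elvin and the real delivery of events to agents.

```python
import heapq
import itertools


class EventScheduler:
    """Hold timed events and release them in time order."""

    def __init__(self):
        self._queue = []
        # Tie-breaker so events with equal times fire in submit order.
        self._order = itertools.count()

    def submit(self, when, agent, action):
        """Queue an event; may be called before or during a run."""
        heapq.heappush(self._queue, (when, next(self._order), agent, action))

    def run_until(self, now):
        """Return all events due at or before `now`, earliest first."""
        fired = []
        while self._queue and self._queue[0][0] <= now:
            when, _, agent, action = heapq.heappop(self._queue)
            fired.append((when, agent, action))
        return fired
```

For example, a delay change queued up front for t=5 and a program start
submitted later for t=8 come out of run_until() in time order, regardless
of when each one was submitted.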
install (emulab cluster site configuration tools)
- for making more emulabs
- mostly automated install process
- FreeBSD "port"/"meta-port"-style install script
- installs dependencies as needed
- performs emulab-specific install tasks
- one for configuring a "boss" node (secure server)
- one for configuring an "ops" node (public server)
ipod/apod (node control without power control hardware)
- "ICMP Ping-Of-Death" and big brother, "Authenticated Ping-Of-Death"
- reboot pingable but hung node without external intervention
- adds robustness and greater control
- especially important where only other alternative is a human
Libraries (Software engineering?)
- shared constants
- common interfaces
- database routines and abstractions
- important for robust, maintainable software
OS tools (disk images, etc)
- management of disk contents
- image creation
- imagezip
- lots of cool tricks here - read the frisbee paper
- image distribution/installation
- frisbee
- lots to say here... read the paper in USENIX'03 and doc/papers
- growdisk - partition management on heterogeneous nodes
- deltas
- deprecated - dump/restore
- with our incredible disk image tools, it is way faster to
just reload the disk instead of checking it first
- tarfile installation
- easy changes without forcing a customized disk image
PXE/DHCP - node boot process
- automatic database-driven control of nodes
- can't assume anything about the disk
- node always boots off of PXE so we get control
- talk to the database (via bootinfo)
- may be told to boot a tftp kernel or a specific partition
- tftp kernels (often with Memory file systems) used for:
- disk image creation/installation
- NetBoot
- OSKit kernels
- in emulab disk images, nodes self-configure using a pull model
- see also TMCD
- progress monitored by stated
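The choice that bootinfo hands back to a PXE-booted node can be pictured
as a small lookup. This Python sketch uses a plain dict in place of the
database; the function name boot_decision and the field names tftp_kernel
and partition are hypothetical, as are any node names or kernel paths
passed to it.

```python
def boot_decision(node_id, boot_table):
    """Decide what a PXE-booted node should do next.

    boot_table stands in for the database: {node_id: {field: value}}.
    Returns ("tftp", kernel), ("partition", number), or None when no
    instruction is recorded for the node.
    """
    entry = boot_table.get(node_id)
    if entry is None:
        return None  # nothing recorded; the caller decides how to react
    if entry.get("tftp_kernel"):
        # A tftp kernel takes priority, since it does not depend on
        # anything being usable on the node's disk.
        return ("tftp", entry["tftp_kernel"])
    return ("partition", entry["partition"])
```

Because the node asks every time it boots, changing the table entry is
enough to redirect the next boot.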
Security
- always conscious of threat model
- segregate public server (ops)
- limited shells on secure server
- secure server trusted by all nodes
- emulab performs config tasks on behalf of user
- plasticwrap/paperbag - transparently run commands on secure server
- suexec during web execution adds extra layer of security and
permission checks
- lastlogs
- track logins on servers and nodes, report into main db
- giving away root on the nodes causes issues
- passwords
- we enforce good ones via checkpass/cracklib
- have expirations
Sensors
- monitor nodes
- healthd - temperature, etc
- slothd - activity measurements
- detect tty, network, cpu activity and report it
- low overhead
- agile
- extremely low latency in detecting new activity in an idle node
- higher latency okay for detecting beginning of inactivity
- when it's active, stay out of the way...
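The asymmetric latency goal for slothd (notice new activity quickly, but
back off while the node is busy) can be sketched as a poller whose
interval depends on the last sample. This is a Python illustration, not
slothd itself: the class name ActivityPoller and the 1-second and
300-second intervals are assumptions.

```python
class ActivityPoller:
    """Pick the next polling delay from the most recent activity sample."""

    def __init__(self, idle_delay=1.0, active_delay=300.0):
        self.idle_delay = idle_delay      # short: catch new activity fast
        self.active_delay = active_delay  # long: stay out of the way
        self.was_active = False

    def poll(self, tty, net, cpu):
        """Record one sample of tty/network/cpu activity.

        Returns (changed, delay): whether the active/idle status changed
        since the last sample, and how many seconds to wait before the
        next one.
        """
        active = bool(tty or net or cpu)
        changed = active != self.was_active
        self.was_active = active
        delay = self.active_delay if active else self.idle_delay
        return changed, delay
```

With these defaults, the start of activity on an idle node is seen within
about a second, while the end of activity may take up to five minutes to
notice, which matches the trade-off described above.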
TBSetup
- core of testbed software
- primary focus: expt config tasks
- and auxiliary functions necessary for expt config stuff
- assign_wrapper
- interface between db data representation and resource allocation
algorithms. Call the solver and use the output to set up the
database state that runs the rest of the process.
- batch daemon
- core of a pretty typical batch system
- allows for more automation
- submit expt even when no resources are avail., runs later
- checkports - ?
- console reset/setup
- control console access (see also capture section)
- db2ns - dump our db data rep back into an ns file
- eventsys start/control
- start up event schedulers for each expt - see event section
- exports setup
- control access to files via NFS on nodes
- create an /etc/exports file based on current node "ownership" and
group membership
- controls access to all home dirs, proj dirs, and group dirs
- frisbeelauncher
- wrapper to set up a frisbee server when trying to load a disk
- libaudit - track requests for certain control actions
- libtbsetup - see libraries section
- libtestbed - see libraries section
- mkgroup/mkproj, rmgroup/rmproj, rmuser
- manage users, groups, and projects (sync unix world to match db)
- named_setup
- set up dns subdomains for each expt
- create aliases for each node that are consistent across swapins
- node_control - change node sw setup params (boot params, startup)
- node_reboot
- reboot a node as gracefully as possible
- try 'ssh reboot', IPOD, then power cycle, as needed.
- node_update - push mounts/accounts changes to nodes
- nscheck - syntax check an ns file for use in emulab
- os_load - start a frisbee disk reload
- os_select - configure node boot params
- os_setup
- major part of expt config
- db says what nodes should be running, so make it happen
- may load disks, then reboots nodes and waits for them to come up
- portstats - diag. tool for switch port counters
- power - power control program
- ptopgen - generate description of currently available hw
- reload_daemon
- first-cut node manager
- reload disks when nodes get freed
- resetvlans - clear any vlans made up of a set of nodes
- routecalc - generate shortest path routes for a topology
- sched_reload - set up a disk reload for later
- sched_reserve - set up a node to go to an expt when freed
- setgroups - update unix groups file with current membership
- sfskey update - sync live sfskey config with db config
- snmpit - SNMP switch control
- supports multiple switch types
- configures VLANs into "links" and "LANs" in topologies
- read other switch data (i.e., for portstats)
- startexp/endexp - begin/end experiments
- wrappers called from web
- start takes a "new" expt and an ns file
- prerun it and swap it in, and send mail, leaving "active" expt
- end takes an expt that is "new", "swapped", "active", or "terminated"
- swap out if needed, and tbend it, then clean up the last bits
- staticroutes
- take db topology info and pass it to routecalc to generate static
shortest-path routes for the expt. Save result in db.
- swapexp
- called from web - swap in, out, or restart an expt.
- performs some checks, some locking, and calls tbswap or tbrestart
- tbprerun
- parse an ns file into the database, fully preparing it for swapin
- tbswap
- swap an expt in or out
- performs a long list of sw/hw setup tasks
- tbend
- end an expt that has been swapped out
- clean out virtual state
- tbreport
- dump a report of the experiment's configuration (virt and phys)
- tbresize
- older interface for rudimentary expt editing
- add nodes to an expt, either unconnected or in a LAN
- tbrestart
- restart an expt without completely swapping out and back in
- restart event system, reset ready/startup/boot status, port cntrs
- vnode_setup
- called from os_setup
- configures multiplexed virtual nodes
- mechanism: ssh runs a script on the disk
- wanassign/wanlinksolve (see assign section)
- wanlinkinfo - display info on wide-area nodes from db
- checkpass - see security section
- ns2ir
- The Parser
- similar to/based on ns parser
- rewrote methods to put info into database
- performs emulab-specific checks
- we supply a library that they use to get access to
emulab-specific commands
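The routecalc and staticroutes entries above describe generating
shortest-path routes for a topology. A minimal Python sketch of that idea
follows, assuming an undirected topology where every link costs one hop;
the function name shortest_path_routes and the next-hop table layout are
illustrative and are not taken from routecalc.

```python
from collections import deque


def shortest_path_routes(links):
    """Return {src: {dst: next_hop}} using hop count as the metric.

    links is a list of (node_a, node_b) pairs, treated as undirected.
    """
    neighbors = {}
    for a, b in links:
        neighbors.setdefault(a, set()).add(b)
        neighbors.setdefault(b, set()).add(a)

    routes = {}
    for src in neighbors:
        first_hop = {}
        visited = {src}
        queue = deque()
        # Direct neighbors are their own first hop.
        for n in sorted(neighbors[src]):
            first_hop[n] = n
            visited.add(n)
            queue.append(n)
        # Breadth-first search: each newly reached node inherits the
        # first hop of the node it was reached through.
        while queue:
            node = queue.popleft()
            for n in sorted(neighbors[node]):
                if n not in visited:
                    visited.add(n)
                    first_hop[n] = first_hop[node]
                    queue.append(n)
        routes[src] = first_hop
    return routes
```

For a chain a-b-c-d this gives node a the next hop b for every
destination, which is the kind of per-node table that could then be saved
in the db.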
Testsuite (regression testing - software engineering?)
- automated system runs lists of tests in different modes
- modes are levels of reality
- used for regression testing ("did we break something?")
- and development ("does this new thing work?")
- test mode (aka frontend mode):
- all scripts run like normal, but whenever something would have
touched hardware, assume it succeeded, and return
- doesn't touch nodes/switches, etc, but does all the db changes
- full mode:
- reserve some nodes from the testbed
- set up "redirect" for certain critical daemons
- set up an alternate db, make our nodes the only free ones
- run alternate daemons (or live daemons use alt. db for our nodes)
- entire system runs like normal, but off of a separate installed
set of scripts
- very flexible
- tests can modify db, run arbitrary scripts
- simple to use in normal case
- check that normal expt path runs w/o errors
- work in progress:
- use full mode to verify accuracy/precision of traffic shaping
- some parts may evolve to a set of tests that we run quickly
after swapping in before turning it over to the user
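Test (frontend) mode, where anything that would touch hardware is assumed
to succeed while the db changes still happen, can be sketched as a guard
around hardware calls. This Python sketch is an assumption about shape
only: the TESTMODE flag, the touches_hardware decorator, and the
power_cycle and reboot_and_record functions are hypothetical names, and a
dict stands in for the database.

```python
import functools

TESTMODE = True  # assumed global switch for this sketch


def touches_hardware(func):
    """In test mode, skip the real call and report success (0)."""
    @functools.wraps(func)
    def wrapper(*args, **kwargs):
        if TESTMODE:
            return 0  # assume it succeeded, and return
        return func(*args, **kwargs)
    return wrapper


@touches_hardware
def power_cycle(node):
    # Stand-in for a real hardware action; never reached in test mode.
    raise RuntimeError("would touch real hardware: " + node)


def reboot_and_record(node, db):
    """Run the normal code path: hardware call first, then db update."""
    status = power_cycle(node)
    db[node] = "rebooting" if status == 0 else "failed"
    return status
```

The calling script does not know which mode it is in, so the same code
path and the same db updates are exercised either way.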
TMCD - Testbed Master Control Daemon
- Server for node self-configuration
- provides controlled access to the database
- supports a pull model
- receives various reports/messages from nodes
- TMCC - Testbed Master Control Client
- currently supported on FreeBSD and Linux, and ported to OpenBSD
- tool for nodes<->emulab communication
- part of a set of node initialization scripts
- Node self-configuration process
- report "I'm alive"
- update config scripts (currently via sup)
- run the config, which sets up:
- interfaces, accounts, mounts, agents, startup programs, testbed
daemons, installs tarfiles/rpms/etc, starts ntp, traffic shaping,
virtual nodes, routing (gated/ospf and static/manual routes),
hostname, /etc/hosts, IPOD/APOD, sfs, etc.
- used on local nodes and widearea nodes, as well as inside jails
Tools (built for emulab, but useful outside of it too)
- pcapper
- traffic visualization tool
- realtime tcl/tk graph of packets/throughput
- categorized by traffic types
Visualization
- graphical view of topologies in the database
Web Interface
- Main configuration/administrative interface
- Manage projects, groups, users
- edit user info, ssh keys, sfs keys, etc.
- push account updates to nodes
- Control nodes/experiments
- start/end/swap expts
- control nodes, delays, etc.
- NetBuild GUI for creating expts/nsfiles
- node status/monitoring
- Get info about Emulab/Netbed
- even download a CD, and get a key to join Netbed
- all the documentation
- tutorials, FAQs, etc.
- publications, photos, some of our users, etc.
- manage project data
- disk images, custom OS's, etc.
- for admins etc, also provides web db access and cvs web access
Stated ("state-dee") - node state management daemon
- listens for node state events
- performs triggered actions
- watches for problems/timeouts
- sends notifications at times
- updates the database with current state
- watches how nodes reboot, reload, etc
- several "state machines" (operational modes) define what is correct
- each node is somewhere in some state machine always
- reports successful boots, reloads, etc.
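The state-machine and timeout checks that stated performs can be sketched
as a table of allowed transitions plus per-state time limits. This Python
sketch is illustrative only: the class NodeStateTracker, the state names
passed to it, and the timeout values are assumptions, and the real daemon
also triggers actions and sends notifications.

```python
class NodeStateTracker:
    """Track each node's state and flag bad transitions and timeouts."""

    def __init__(self, transitions, timeouts):
        self.transitions = transitions  # {state: set of allowed next states}
        self.timeouts = timeouts        # {state: max seconds in that state}
        self.current = {}               # node -> (state, time entered)

    def handle_event(self, node, new_state, now):
        """Record a state event; return False if the transition is not
        allowed by the state machine (the new state is still recorded)."""
        old = self.current.get(node)
        valid = (old is None
                 or new_state in self.transitions.get(old[0], set()))
        self.current[node] = (new_state, now)
        return valid

    def timed_out(self, now):
        """Return nodes that have stayed too long in a timed state."""
        return sorted(
            node for node, (state, since) in self.current.items()
            if state in self.timeouts and now - since > self.timeouts[state]
        )
```

A node that enters a booting state and never reports that it is up would
show up in timed_out() once the limit for that state passes, which is the
point at which a problem could be reported.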
Netbed Wide-area nodes
- Most emulab abstractions have netbed wide-area counterpart
- same methods/abstractions/tools used in LAN or WAN environment
- easy to switch from a wide-area run to an emulated run (or simulated)
- Boot process a little different
- Many parallels to local area case
- SFS instead of NFS for shared homedirs
- Can set up links as tunnels with 192.168.* addresses
- Accounts same (except for rootness)
- Traffic generation
Simulated Nodes
- many nodes simulated inside NSE on a single phys. node
- can interact with real network
- traffic gen can happen inside
- links, etc. all work like normal
- Due to NS limitations/abstractions, lots of things in the real
world don't have a parallel here
Multiplexed Nodes
- many nodes run on one physical node, and appear as many individual nodes
- Implemented with "jail" on FreeBSD, or "____" on Linux
- Goal to be as close to normal physical nodes as possible
- creates lots of issues with multiplexing of virtual links onto
physical links
- routing, demultiplexing, etc
Cross-cutting Abstractions
- Four different environments
- Emulab/emulation (dedicated phys.) nodes, wide-area nodes,
simulated nodes, and multiplexed ("virtual") nodes
- can mix and match in same expt
- in many cases, same expt can run in any (or several) of the
environments with few or no changes
- Nodes
- Emulated/emulab: dedicated physical nodes in a cluster
- get root, can reboot, serial console, total control of node
- including OS, disk imaging, etc.
- Widearea: shared nodes, geographically distributed
- get an account (non-root, typically)
- sometimes get a jail / "virtual server"
- less control (of OS, rebooting, etc.)
- Simulated: nodes inside of an NS simulator
- nodes are simulated, don't run an OS, etc.
- functionality programmed via NS models
- Multiplexed: jails / virtual servers on cluster nodes
- Almost as real as emulation nodes
- allows bigger scale, risks potential for side-effects
- same level of control as emulation nodes
- Links
- Emulated/emulab:
- completely controllable network characteristics
- including LAN speeds or shaped links
- isolated control network
- very realistic, predictable, repeatable
- Widearea:
- network is the real/raw internet
- tunnels are optionally configured
- no separate control network
- completely realistic, but unpredictable
- Simulated:
- links inside NSE (NS Emulator)
- NSE does shaping
- real and sim worlds can talk to each other
- Multiplexed:
- Same capabilities as normal emulated/emulab links
- some tricks involved to get everything to work right
---EOF---