Commit 16321a1e authored by Mac Newbold

Checkpoint, for historical purposes, etc.

parent fb1fb484
Testbed Architecture Hierarchy
(from 4/7/03 testbed mtg)
(This shows hierarchy only... any apparent ordering between sibling
nodes in the tree is irrelevant and insignificant.)
* DB
- schema
- state:
- access control/admin
- virt expt config
- phys node config
* UI
- web
- command line
- NS
- Netbuild GUI
- Visualization
* TB Administration
- ? create new testbed
- ? boss/ops install
- add nodes
- local
- widearea - netbed CD
* Scheduling
- Idle detection/monitoring
- manual scheduling
- batch queue
* Access control
- User accounts
- Projects
- Groups
- permissions model?
- security
- isolation
* Experiment Configuration and Control
- Node configuration
- different impls: local, wa, sim, mux
- virt info via tmcd
- Link config
- different impls: local, wa, sim, mux
- tunnels, etc?
- Resource allocation
- Storage config
- Run-time Control
- events
- consoles
- control net?
- expt life cycle (state machine)
* WHERE???
- stated / node state machines
- control net?
- disk loading
Testbed Architecture
Overview of parts and organization
(Started April 2, 2003)
Parts, in no particular order yet:
- unix accounts
- unix group management (per-proj and per-group)
- ssh key distribution
- sfs key distribution
- account permissions (web only, ron/wa, root/non-root, etc.)
- emulab permissions
- control of hardware/hw config.
- administrative control
- hierarchical organization
- delegation at all levels
- trust models and their security impact
Assign (resource allocation algorithms)
- the Testbed Mapping Problem
- NP Hard
- in some ways, constraint satisfaction problem
- but more, because not all satisfactory solutions are equal
- time constraints: we're an interactive system, and need to
perform on interactive timescales - a few seconds max to get a
good answer
- variation in wide area
- soft matching
- complicated more by fact that we can combine the unknown
(wide-area link) with something we control (traffic shaping)
- Emulab solution
- many "valid" solutions, but difference between near-optimal and
random valid soln. is huge and important
- sim. annealing core
- highly optimized
- clever domain specific tricks
- main purpose is to conserve scarce resources (nodes,
interswitch bandwidth, soon special hw like GigE)
- lots of parameters, not always clear how to tune them
- Netbed solution
- typically no exact match, just some that may be closer than
others - very fuzzy matching
- genetic algo. core
- not as highly developed yet, but meets our needs
- main purpose is to find a real-world overlay that matches the
supplied topology as closely as possible
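The simulated-annealing core described above can be illustrated with a toy mapper. This is purely a sketch: the node names, the single interswitch-bandwidth cost term, and all parameters are invented for illustration, and bear no relation to assign's actual internals or tuning.

```python
import math
import random

def anneal(vnodes, vlinks, pnodes, switch_of, steps=5000, seed=0):
    # Toy simulated-annealing mapper: place virtual nodes one-to-one on
    # physical nodes, minimizing virtual links that cross between
    # switches (the scarce interswitch bandwidth mentioned above).
    rng = random.Random(seed)

    def cost(mapping):
        return sum(1 for a, b in vlinks
                   if switch_of[mapping[a]] != switch_of[mapping[b]])

    mapping = dict(zip(vnodes, rng.sample(pnodes, len(vnodes))))
    best = dict(mapping)
    temp = 1.0
    for _ in range(steps):
        v = rng.choice(vnodes)
        p = rng.choice(pnodes)
        old = mapping[v]
        if p == old:
            continue
        # if p is already taken, swap, so the placement stays one-to-one
        other = next((u for u in vnodes if mapping[u] == p), None)
        before = cost(mapping)
        mapping[v] = p
        if other is not None:
            mapping[other] = old
        delta = cost(mapping) - before
        if delta > 0 and rng.random() >= math.exp(-delta / max(temp, 1e-9)):
            mapping[v] = old          # reject the uphill move
            if other is not None:
                mapping[other] = p
        elif cost(mapping) < cost(best):
            best = dict(mapping)
        temp *= 0.999                 # cooling schedule
    return best, cost(best)
```

The "many valid solutions, but near-optimal matters" point shows up directly: any one-to-one placement is valid, but the cost function decides which ones conserve interswitch links.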
Capture/console (node consoles - "'zero-penalty' remote research")
- serial line consoles to nodes replace kbd/vga
- fine-grained access control
- changes quickly when node changes "ownership"
- simple, secure remote access
- ACLs, authenticated ssl tunnel program + standard telnet client
- [do we still want to talk about tip?]
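The fine-grained, ownership-tracking access control could be modeled as below. This is an in-memory sketch with invented names; the real system keeps ACL state that the capture process consults, not a Python class.

```python
class ConsoleACL:
    # Toy per-node console ACL table; ACLs change quickly when a node
    # changes "ownership", and a freed node admits nobody.
    def __init__(self):
        self.acl = {}  # node -> set of users allowed on its console

    def assign(self, node, users):
        self.acl[node] = set(users)   # new owner: old ACL replaced wholesale

    def release(self, node):
        self.acl.pop(node, None)      # node freed: nobody may connect

    def may_connect(self, user, node):
        return user in self.acl.get(node, set())
```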
CD-ROM (remote node mgmt/robustness, adding nodes to the system)
- simple to add a node
- fallback boot method (CD-ROM) when disk is hosed
- path for self-update and disk reimaging
- goal is to reduce need for human intervention whenever possible
Database (centralized store for persistent shared system state)
- lots o' stuff here
- most stuff falls into one of several categories
- semi-permanent hw setup info (wires, ifaces, nodes, outlets)
- current hardware configs (reservations, ifaces, vlans, etc)
- semi-permanent sw setup info (disk images, OS's, etc.)
- current sw setup (traffic shaping, trafgen, routing, etc.)
- virtualized expt info (topology, config, etc)
- administrative info (users, groups, projects, etc.)
- misc. config bits and logging
- sw engineering issues
- db schema must match sw build
IXP (special hw resources?)
- use as testbed infrastructure
- traffic shaping
- use for experimentation
- shared facil. gives more people access, increases usage
- emulab is good environment w/many tools
discvr? (topology discovery tool(s))
- verification tool?
Event system (distributed event coordination/communication)
- publish/subscribe system (written by someone else)
- used in several directions
- emulab to nodes/programs
- nodes to emulab
- programs on emulab server to each other
- can be nodes to nodes too
- delay agent
- coordinated control of traffic shaping
- changes can initiate anywhere
- automatic timed changes from emulab
- manual changes from emulab server or a node
- allows for reactive traffic shaping, trace playback, etc.
- nsetrafgen
- control of NSE simulators and their traffic generation
- program agent
- start/stop arbitrary program
- timed or manual, and allows reactivity
- event scheduler
- controls timed events
- may be submitted a priori or during a run
- stated uses it heavily, but is described elsewhere
- tevc/tevd
- simple command line client for use on any server or node
- trafgen
- traffic generation via TG toolkit
- patched to allow control via events
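The multi-directional publish/subscribe pattern the agents above share can be sketched in a few lines. The event-type name and handler signature here are invented; the real system is built on a third-party pub/sub library and runs across machines, not in one process.

```python
from collections import defaultdict

class EventBus:
    # Minimal in-process publish/subscribe sketch of the event system.
    def __init__(self):
        self.subs = defaultdict(list)  # event type -> handler list

    def subscribe(self, etype, handler):
        self.subs[etype].append(handler)

    def publish(self, etype, **args):
        for handler in list(self.subs[etype]):
            handler(**args)

# A toy "delay agent": it reacts to link-shaping events no matter
# where they originated (scheduler, server, or another node).
applied = []
bus = EventBus()
bus.subscribe("LINKDELAY", lambda link, delay_ms: applied.append((link, delay_ms)))
bus.publish("LINKDELAY", link="link0", delay_ms=50)
```

Because delivery is by subscription rather than direct connection, "changes can initiate anywhere" falls out for free: any publisher of LINKDELAY reaches the agent.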
install (emulab cluster site configuration tools)
- for making more emulabs
- mostly automated install process
- FreeBSD "port"/"meta-port"-style install script
- installs dependencies as needed
- performs emulab-specific install tasks
- one for configuring a "boss" node (secure server)
- one for configuring an "ops" node (public server)
ipod/apod (node control without power control hardware)
[this should fit under something]
- "ICMP Ping-Of-Death" and big brother, "Authenticated Ping-Of-Death"
- reboot pingable but hung node without external intervention
- adds robustness and greater control
- especially important where only other alternative is a human
Libraries (Software engineering?)
[this should fit somewhere else probably]
- shared constants
- common interfaces
- database routines and abstractions
- important for robust, maintainable software
OS tools (disk images, etc)
- management of disk contents
- image creation
- imagezip
- lots of cool tricks here - read the frisbee paper
- image distribution/installation
- frisbee
- lots to say here... read the paper in USENIX'03
- growdisk - partition management on heterogeneous nodes
- deltas
- deprecated - dump/restore
- with our incredible disk image tools, it is way faster to
just reload the disk instead of checking it first
- tarfile installation
- easy changes without forcing a customized disk image
PXE/DHCP - node boot process
- automatic database-driven control of nodes
- can't assume anything about the disk
- node always boots off of PXE so we get control
- talk to the database (via bootinfo)
- may be told to boot a tftp kernel or a specific partition
- tftp kernels (often with Memory file systems) used for:
- disk image creation/installation
- NetBoot
- OSKit kernels
- in emulab disk images, nodes self-configure using a pull model
- see also TMCD
- progress monitored by stated
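The database-driven boot decision might look like the sketch below. The field names and the MFS kernel path are illustrative only, not the real schema or filesystem layout.

```python
def bootinfo(node, db):
    # Toy version of the bootinfo decision: after PXE, each node asks
    # what to boot, and the answer is driven entirely by database state.
    row = db.get(node, {})
    if row.get("reload_pending"):
        # tftp/MFS kernel used for disk image creation/installation
        return ("tftp", "/tftpboot/frisbee-mfs")
    part = row.get("default_partition", 1)
    return ("partition", part)
```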
Security
- always conscious of threat model
- segregate public server (ops)
- limited shells on secure server
- secure server trusted by all nodes
- emulab performs config tasks on behalf of user
- plasticwrap/paperbag - transparently run commands on secure server
- suexec during web execution adds extra layer of security and
permission checks
- lastlogs [should be somewhere else?]
- track logins on servers and nodes, report into main db
- giving away root on the nodes causes issues [discuss elsewhere?]
- passwords
- we enforce good ones via checkpass/cracklib
- have expirations
- monitor nodes
- healthd - temperature, etc
- slothd - activity measurements
- detect tty, network, cpu activity and report it
- low overhead
- agile
- extremely low latency in detecting new activity in an idle node
- higher latency okay for detecting beginning of inactivity
- when it's active, stay out of the way...
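slothd's asymmetric latency goal could be captured by a reporting policy like this sketch. The function name and the 300-second interval are invented for illustration.

```python
def should_report(now, last_report, was_idle, active_now, slow_interval=300):
    # Sketch of asymmetric reporting: fresh activity on an idle node is
    # reported immediately (extremely low latency for "node woke up"),
    # while the onset of inactivity can wait for the next periodic
    # report, keeping overhead low on busy nodes.
    if was_idle and active_now:
        return True
    return now - last_report >= slow_interval
```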
SQL (database schema and prefilled data, and how to upgrade schemas)
[discuss elsewhere, under database]
SSL (secure wide-area communications)
[discuss elsewhere, if at all]
TBSetup [break into several groups? Organize some other way?]
- core of testbed software
- primary focus: expt config tasks
- and auxiliary functions necessary for expt config stuff
- assign_wrapper
- interface between db data representation and resource allocation
algorithms. Call the solver and use the output to set up the
database state that runs the rest of the process.
- batch daemon
- core of a pretty typical batch system
- allows for more automation
- submit expt even when no resources are avail., runs later
- checkports - ?
- console reset/setup [move somewhere?]
- control console access [see also capture section]
- db2ns - dump our db data rep back into an ns file
- eventsys start/control
- start up event schedulers for each expt - see event section
- exports setup
- control access to files via NFS on nodes
- create an /etc/exports file based on current node "ownership" and
group membership
- controls access to all home dirs, proj dirs, and group dirs
- frisbeelauncher
- wrapper to set up a frisbee server when trying to load a disk
- libaudit - track requests for certain control actions
- libtbsetup - see libraries section
- libtestbed - see libraries section
- mkgroup/mkproj, rmgroup/rmproj, rmuser
- manage users, groups, and projects (sync unix world to match db)
- named_setup
- set up dns subdomains for each expt
- create aliases for each node that are consistent across swapins
- node_control - change node sw setup params (boot params, startup)
- node_reboot
- reboot a node as gracefully as possible
- try 'ssh reboot', IPOD, then power cycle, as needed.
- node_update - push mounts/accounts changes to nodes
- nscheck - syntax check an ns file for use in emulab
- os_load - start a frisbee disk reload
- os_select - configure node boot params
- os_setup
- major part of expt config
- db says what nodes should be running, so make it happen
- may load disks, then reboots nodes and waits for them to come up
- portstats - diag. tool for switch port counters
- power - power control program
- ptopgen - generate description of currently available hw
- reload_daemon
- first-cut node manager
- reload disks when nodes get freed
- resetvlans - clear any vlans made up of a set of nodes
- routecalc - generate shortest path routes for a topology
- sched_reload - set up a disk reload for later
- sched_reserve - set up a node to go to an expt when freed
- setgroups - update unix groups file with current membership
- sfskey update - sync live sfskey config with db config
- snmpit - SNMP switch control
- supports multiple switch types
- configures VLANs into "links" and "LANs" in topologies
- read other switch data (ie for portstats)
- startexp/endexp - begin/end experiments
- wrappers called from web
- start takes a "new" expt and an ns file
- prerun it and swap it in, and send mail, leaving "active" expt
- end takes an expt that is "new", "swapped", "active", or "terminated"
- swap out if needed, and tbend it, then clean up the last bits
- staticroutes
- take db topology info and pass it to routecalc to generate static
shortest-path routes for the expt. Save result in db.
- swapexp
- called from web - swap in, out, or restart an expt.
- performs some checks, some locking, and calls tbswap or tbrestart
- tbprerun
- parse an ns file into the database, fully preparing it for swapin
- tbswap
- swap an expt in or out
- performs a long list of sw/hw setup tasks
- tbend
- end an expt that has been swapped out
- clean out virtual state
- tbreport
- dump a report of the experiment's configuration (virt and phys)
- tbresize
- older interface for rudimentary expt editing
- add nodes to an expt, either unconnected or in a LAN
- tbrestart
- restart an expt without completely swapping out and back in
- restart event system, reset ready/startup/boot status, port cntrs
- vnode_setup
- called from os_setup
- configures multiplexed virtual nodes
- mechanism: ssh runs a script on the disk
- wanassign/wanlinksolve (see assign section)
- wanlinkinfo - display info on wide-area nodes from db
- checkpass - see security section
- ns2ir
- The Parser
- similar to/based on ns parser
- rewrote methods to put info into database
- performs emulab-specific checks
- we supply a library that they use to get access to
emulab-specific commands
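The shortest-path step that routecalc/staticroutes perform could be done with plain Dijkstra over the topology, roughly as below. The adjacency format and next-hop convention are assumptions for the sketch, not the real interface.

```python
import heapq

def shortest_paths(adj, src):
    # Dijkstra over a link topology: for each reachable node, compute
    # total distance and the first hop a static route from src would use.
    # adj maps node -> list of (neighbor, link_weight).
    dist = {src: 0}
    nexthop = {}
    pq = [(0, src, None)]  # (distance, node, first hop from src)
    while pq:
        d, u, first = heapq.heappop(pq)
        if d > dist.get(u, float("inf")):
            continue  # stale queue entry
        if first is not None:
            nexthop[u] = first
        for v, w in adj.get(u, []):
            nd = d + w
            if nd < dist.get(v, float("inf")):
                dist[v] = nd
                heapq.heappush(pq, (nd, v, v if first is None else first))
    return dist, nexthop
```

Run once per source node, the nexthop table is exactly what per-node static routes need, which is presumably why the result is saved back into the db.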
Testsuite (regression testing - software engineering?)
- automated system runs lists of tests in different modes
- modes are levels of reality
- used for regression testing ("did we break something?")
- and development ("does this new thing work?")
- test mode (aka frontend mode):
- all scripts run like normal, but whenever something would have
touched hardware, assume it succeeded, and return
- doesn't touch nodes/switches, etc, but does all the db changes
- full mode:
- reserve some nodes from the testbed
- set up "redirect" for certain critical daemons
- set up an alternate db, make our nodes the only free ones
- run alternate daemons (or live daemons use alt. db for our nodes)
- entire system runs like normal, but off of a separate installed
set of scripts
- very flexible
- tests can modify db, run arbitrary scripts
- simple to use in normal case
- check that normal expt path runs w/o errors
- work in progress:
- use full mode to verify accuracy/precision of traffic shaping
- some parts may evolve into a set of tests that we run quickly
  after swapping in, before turning the expt over to the user
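Test mode's "pretend hardware succeeded" idea amounts to stubbing at the hardware boundary, along these lines. The function names and return conventions are invented; the real scripts are not Python.

```python
# Sketch of "test mode" (aka frontend mode): scripts run normally, but
# anything that would touch hardware is replaced by a stub that
# pretends to succeed, so all the db-side logic still gets exercised.
TEST_MODE = True

def power_cycle(node):
    if TEST_MODE:
        return 0  # pretend the power controller reported success
    raise NotImplementedError("real power-control path not shown")

def swap_in(nodes):
    results = {n: power_cycle(n) for n in nodes}
    # database bookkeeping would happen here in either mode
    return all(r == 0 for r in results.values())
```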
TMCD - Testbed Master Control Daemon
- Server for node self-configuration
- provides controlled access to the database
- supports a pull model
- receives various reports/messages from nodes
- TMCC - Testbed Master Control Client
- currently supported on FreeBSD and Linux, and ported to OpenBSD
- tool for nodes<->emulab communication
- part of a set of node initialization scripts
- Node self-configuration process [move elsewhere? up a level?]
- report "I'm alive"
- update config scripts (currently via sup)
- run the config, which sets up:
- interfaces, accounts, mounts, agents, startup programs, testbed
daemons, installs tarfiles/rpms/etc, starts ntp, traffic shaping,
virtual nodes, routing (gated/ospf and static/manual routes),
hostname, /etc/hosts, IPOD/APOD, sfs, etc.
- used on local nodes and widearea nodes, as well as inside jails
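The pull model above can be sketched as a node-driven loop against a config server. The command names and db layout here are illustrative stand-ins, not the real TMCD protocol.

```python
class TMCDServer:
    # Toy pull-model server: a node asks for its own configuration by
    # command name; answers come from central (database) state, giving
    # controlled access rather than raw db queries.
    def __init__(self, db):
        self.db = db

    def handle(self, node, command):
        return self.db.get(node, {}).get(command, "")

def self_configure(server, node, commands):
    # The node drives the process: it pulls each piece it needs.
    return {cmd: server.handle(node, cmd) for cmd in commands}
```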
Tools (built for emulab, but useful outside of it too)
- pcapper
- traffic visualization tool
- realtime tcl/tk graph of packets/throughput
- categorized by traffic types
- graphical view of topologies in the database
Web Interface
- Main configuration/administrative interface
- Manage projects, groups, users
- edit user info, ssh keys, sfs keys, etc.
- push account updates to nodes
- Control nodes/experiments
- start/end/swap expts
- control nodes, delays, etc.
- NetBuild GUI for creating expts/nsfiles
- node status/monitoring
- Get info about Emulab/Netbed
- even download a CD, and get a key to join Netbed
- all the documentation
- tutorials, FAQs, etc.
- publications, photos, some of our users, etc.
- manage project data
- disk images, custom OS's, etc.
- for admins etc, also provides web db access and cvs web access
Stated ("state-dee") - node state management daemon
- listens for node state events
- performs triggered actions
- watches for problems/timeouts
- sends notifications at times
- updates the database with current state
- watches how nodes reboot, reload, etc
- several "state machines" (operational modes) define what is correct
- each node is somewhere in some state machine always
- reports successful boots, reloads, etc.
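A stated-style operational mode boils down to a per-node state machine where only listed transitions are legal. The states and edges below are illustrative, not the real tables.

```python
class NodeStateMachine:
    # Sketch of one "operational mode": stated tracks where each node
    # is in a machine like this, and an out-of-order state event is a
    # problem worth flagging (or timing out on).
    TRANSITIONS = {
        ("SHUTDOWN", "BOOTING"),
        ("BOOTING", "ISUP"),
        ("BOOTING", "TBFAILED"),
        ("ISUP", "SHUTDOWN"),
    }

    def __init__(self, state="SHUTDOWN"):
        self.state = state

    def event(self, new_state):
        if (self.state, new_state) not in self.TRANSITIONS:
            raise ValueError(f"illegal transition {self.state} -> {new_state}")
        self.state = new_state
        return self.state
```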
Netbed Wide-area nodes
- Most emulab abstractions have netbed wide-area counterpart
- same methods/abstractions/tools used in LAN or WAN environment
- easy to switch from a wide-area run to an emulated run (or simulated)
- Boot process a little different
- [need some details here... Leigh? can you tell me how it works?]
- Many parallels to local area case
- SFS instead of NFS for shared homedirs
- Can set up links as tunnels with 192.168.* addresses
- Accounts same (except for rootness)
- Traffic generation
- [more?]
Simulated Nodes
- many nodes simulated inside NSE on a single phys. node
- can interact with real network
- traffic gen can happen inside
- links, etc. all work like normal
- Due to NS limitations/abstractions, lots of things in the real
world don't have a parallel here
Multiplexed Nodes
- many nodes run on one physical node, and appear as many individual nodes
- Implemented with "jail" on FreeBSD, or "____" on Linux
- Goal to be as close to normal physical nodes as possible
- creates lots of issues with multiplexing of virtual links onto
physical links
- routing, demultiplexing, etc
Cross-cutting Abstractions
- Four different environments
- Emulab (dedicated phys.) nodes, wide-area nodes, simulated nodes,
and multiplexed ("virtual") nodes
- can mix and match in same expt
- in many cases, same expt can run in any (or several) of the
environments with few or no changes
- Nodes
- E: (emulab) dedicated physical nodes
- completely controllable network characteristics
- get root, can reboot, serial console, total control of node
- including OS, disk imaging, etc.
- W: (widearea) shared nodes, geographically distributed
- get an account (non-root)
HW config (switch/router configs specific to emulab)?
rc.d (daemons on boss/ops/tipservers for running emulab)?
sysadmin (apachelogroll)?