FreeBSD Jail-based Virtual Node Implementation

This page describes the changes we made to FreeBSD jails to support Emulab virtual nodes and describes the boot time setup process for those jails.

Jail Changes

The following is a list of the features we added and the bugs we fixed in FreeBSD jails. All of the new features are optional, controlled by sysctl MIBs and per-jail flags. The new jail implementation is backward compatible with the original implementation: all new features are disabled by default.

Starting a FreeBSD Virtual Node

The goal for Emulab jail-based virtual nodes (henceforth known just as "jails") is to set up an environment that is as much like the standard Emulab node environment as possible. This makes things easier for both the Emulab infrastructure and the Emulab user. Also note that the intent is to use jails both locally (Emulab cluster nodes) and remotely (wide-area, RON nodes), where the security considerations differ; hence the need for the per-jail permission bits mentioned above.

Setting up the jail is broken into two parts: the work that must be done outside the jail because the jail does not have enough permission (creating the jail filesystem, setting up interfaces, tunnels, and routes, mounting shared filesystems), and the work that can be done inside the jail (creating accounts, installing software, starting programs and traffic generators). Following is a description of those two phases.

Setting up the jail, phase one:

To set up the outer environment it is necessary to:

Setting up the filesystem for the jail is a long, arduous process:

The other complication in setting up the jailed environment involves access to tmcd. Wide-area testbed nodes are not allowed to contact tmcd without an SSL certificate, but we do not want to hand out per-jail certificates that could be easily copied. My approach was to not allow a jail to contact tmcd directly, but to have it go through a proxy running outside the jail instead. This has the added benefit of ensuring that the jail is not able to spoof another jail in another experiment.

The implementation of this was to add a proxy mode to the tmcc client. Outside the jail, a tmcc proxy is started that creates a unix domain socket whose path is inside the jail filesystem. In other words, the socket is named such that a tmcc client running inside the jail sees it too. The tmcc client inside the jail connects to the tmcc proxy via the unix domain socket; the proxy sanitizes the request string, relays the request to tmcd, and then relays the answer back to the tmcc client inside the jail. The proxy ensures that there is no spoofing of the jail id. There are many other ways to accomplish this, but this one was fairly easy to do.
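The proxy arrangement described above can be sketched roughly as follows. This is a minimal illustration in Python, not the tmcc implementation: the function names, the `VNODEID=` request field, the socket path, and the plain TCP connection to the upstream server are all assumptions made for the example, and the SSL layer used by wide-area nodes is left out entirely.

```python
import os
import socket

# Hypothetical request field carrying the jail identity; the real tmcd
# wire format may differ.
JAIL_ID_KEY = b"VNODEID="


def sanitize_request(raw: bytes, jail_id: str) -> bytes:
    """Drop any identity the client supplied and prepend the trusted one."""
    # Keep only the first line so a client cannot smuggle a second request.
    lines = raw.splitlines()
    first = lines[0] if lines else b""
    tokens = [t for t in first.split()
              if not t.upper().startswith(JAIL_ID_KEY)]
    return b" ".join([JAIL_ID_KEY + jail_id.encode()] + tokens) + b"\n"


def make_listener(jail_root: str, rel_path: str) -> socket.socket:
    """Create the unix domain socket at a path inside the jail filesystem."""
    path = os.path.join(jail_root, rel_path.lstrip("/"))
    if os.path.exists(path):
        os.unlink(path)
    listener = socket.socket(socket.AF_UNIX, socket.SOCK_STREAM)
    listener.bind(path)
    listener.listen(5)
    return listener


def serve_once(listener, jail_id, upstream_addr):
    """Relay one request from inside the jail to the server and back."""
    conn, _ = listener.accept()
    with conn:
        request = conn.recv(4096)
        with socket.create_connection(upstream_addr) as upstream:
            upstream.sendall(sanitize_request(request, jail_id))
            # Assumes the server closes the connection after replying.
            while True:
                chunk = upstream.recv(4096)
                if not chunk:
                    break
                conn.sendall(chunk)
```

Because the proxy, not the client, supplies the identity, a process inside the jail cannot claim to be a different jail no matter what it writes to the socket. One proxy per jail, each bound to its own socket and its own fixed id, keeps that property simple to reason about.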

Setting up the jail, phase two:

Once the jail system call has been issued, it is up to the inner environment to finish getting itself set up. Inside the jail, the first program to run is a small program (injail) that is intended to act like "init": it starts the initial shell and then waits until it receives a signal to terminate. The easiest way to ensure that all processes inside the jail are terminated is for injail to send a TERM to the entire process group, and then a KILL to pick up any stragglers. This is done from inside because killing all of the processes from outside the jail is difficult (it is hard to see inside the jail), and because the jail will not actually terminate until all the processes inside are really dead.
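The init-like behavior described above might look something like the following Python sketch. It is illustrative only and is not the injail source: `run_init`, `terminate_group`, the grace period, and the polling interval are hypothetical names and values chosen for the example. The sketch places the initial shell in its own process group so the whole group can be signalled at once; daemons that move themselves into a new session would escape that group and are not handled here.

```python
import os
import signal
import subprocess
import time


def signal_group(pgid, sig):
    """Signal a whole process group; return False if it no longer exists."""
    try:
        os.killpg(pgid, sig)
        return True
    except ProcessLookupError:
        return False


def terminate_group(child, grace=2.0, poll=0.1):
    """Send TERM to the child's group, wait up to grace seconds, then KILL."""
    pgid = child.pid  # the child leads its own group (start_new_session)
    signal_group(pgid, signal.SIGTERM)
    deadline = time.monotonic() + grace
    while time.monotonic() < deadline and child.poll() is None:
        time.sleep(poll)
    signal_group(pgid, signal.SIGKILL)  # pick up any stragglers
    child.wait()
    return child.returncode


def run_init(argv, grace=2.0, poll=0.1):
    """Start the initial shell, wait for SIGTERM, then tear the group down."""
    stop = {"requested": False}

    def on_term(signum, frame):
        stop["requested"] = True

    previous = signal.signal(signal.SIGTERM, on_term)
    child = subprocess.Popen(argv, start_new_session=True)
    try:
        # Keep waiting even if the shell itself exits: the services it
        # started are expected to outlive it.
        while not stop["requested"]:
            time.sleep(poll)
        return terminate_group(child, grace, poll)
    finally:
        signal.signal(signal.SIGTERM, previous)
```

Under these assumptions, a call such as `run_init(["/bin/sh", "/etc/rc"])` would block until the process receives a TERM and then tear down everything still in the shell's process group.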

The initial shell mentioned above is /etc/rc, which proceeds to do all of the same boot time configuration that normally happens when a node boots. The difference, of course, is that the jail has a heavily constrained /etc/rc.conf that starts up just a few essential services such as syslogd, cron, and sshd (on the specific port assigned to sshd for the jail; see above). The last part of the configuration to run is the standard testbed setup, although again in a somewhat restricted manner. Currently the following testbed mechanisms are supported within the jailed environment: