Remaining infrastructure for control network "ARP lockdown".
It works like this. Certain nodes that are on the node control net (right now just subbosses, but ops coming soon) can set static ARP entries for the nodes they serve. This raises the bar for (but does not eliminate the possibility of) nodes spoofing servers. Currently this is implemented only for FreeBSD.

When such a server boots, it runs /etc/rc.d/arplock.sh early on, which in turn runs /usr/local/etc/emulab/fixarpinfo. fixarpinfo asks boss, via an SSL tmcc call, for "arpinfo" (using SSL ensures that the info coming back really is from boss). Tmcd on boss returns the arpinfo appropriate for the node (subboss, ops, fs, etc.) along with the type of lockdown being done. The script uses this info to update the ARP cache on the machine, adding, removing, or making permanent entries as appropriate. (A rough sketch of this client-side flow is included at the end of this note.)

fixarpinfo is intended to be called not just at boot, but whenever we might need to update the ARP info on a server. The only other use right now is in subboss_dhcpd_makeconf, which is called whenever DHCP info may need to change on a subboss (we hook this because a call to that script may also indicate a change in the set of nodes served by the subboss). In the future, fixarpinfo might be called from the newnode path (for ops/fs, when a node is added to the testbed), from the deletenode path, or maybe from the watchdog (if we start locking down ARP entries on experiment nodes).

The type of lockdown is controlled by a sitevar on boss, general/arplockdown, which can be set to 'none', 'static', or 'staticonly'. 'none' means do nothing, 'static' means create static ARP entries for the given nodes but continue to dynamically ARP for others, and 'staticonly' means use only this set of static ARP entries and disable dynamic ARP on the control net interface. The latter implies that the server will only be able to talk to the set of nodes for which it got ARP info.

As mentioned, tmcd is responsible for returning the correct set of ARP info for a given request. The logic currently is (see the second sketch below):

* Only return ARP info to nodes which are on the CONTROL_NETWORK. If the requester is elsewhere (e.g., Utah's boss and ops are currently segregated on different IP subnets), then this whole infrastructure does not apply and nothing is returned.

* If the requester is a subboss, return info for all other servers that are on the node control network as well as for the set of nodes which the subboss serves.

* If the requester is an ops or fs node, again return info for all other servers, plus info for all testnodes or virtnodes whose control net IP is on the node control net.

* Otherwise, return nothing.

One final note: the ARP info for servers such as boss/ops/fs or the gateway router is not readily available in most Emulab instances, since those machines are not in the DB nodes or interfaces tables. Eventually we will fix that, but for now the info must come from new site variables. To help initially populate those variables, I added the utils/update_sitevars script, which attempts to determine which servers are on the node control net and gathers the appropriate IP and MAC info from them (see the last sketch below).
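To make the boot-time flow concrete, here is a rough sketch (in Python, for illustration only) of what fixarpinfo conceptually does on a FreeBSD server. It is not the actual script: the tmcc path and invocation, the control-net interface name, and the line format of the "arpinfo" reply are all assumptions; only the arp(8) and ifconfig(8) usage reflects real FreeBSD commands. Removal of stale entries is omitted for brevity.

    #!/usr/bin/env python3
    # Illustrative sketch only -- not the real /usr/local/etc/emulab/fixarpinfo.
    # The tmcc invocation and the "arpinfo" reply format are assumptions.
    import subprocess

    TMCC = "/usr/local/etc/emulab/tmcc"   # path assumed
    CNET_IFACE = "em0"                    # control-net interface, assumed

    def get_arpinfo():
        """Ask tmcd on boss for arpinfo; returns (lockdown_type, entries).

        The real script makes this an SSL tmcc call so the reply is known
        to come from boss."""
        out = subprocess.run([TMCC, "arpinfo"], capture_output=True,
                             text=True, check=True).stdout
        lockdown, entries = "none", []
        for line in out.splitlines():
            # Assumed reply format: "TYPE=static" on one line, then lines
            # like "HOST=pc1 IP=10.1.1.1 MAC=00:11:22:33:44:55".
            kv = dict(f.split("=", 1) for f in line.split() if "=" in f)
            if "TYPE" in kv:
                lockdown = kv["TYPE"]
            elif "IP" in kv and "MAC" in kv:
                entries.append((kv["IP"], kv["MAC"]))
        return lockdown, entries

    def apply_arpinfo(lockdown, entries):
        if lockdown == "none":
            return
        for ip, mac in entries:
            # "arp -S" installs a permanent entry, replacing any existing one.
            subprocess.run(["arp", "-S", ip, mac], check=True)
        if lockdown == "staticonly":
            # Disable dynamic ARP on the control-net interface entirely.
            subprocess.run(["ifconfig", CNET_IFACE, "staticarp"], check=True)

    if __name__ == "__main__":
        apply_arpinfo(*get_arpinfo())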
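The tmcd-side selection logic in the list above can be restated as the following sketch. The real implementation lives in tmcd on boss; the data structures and helper names here are purely illustrative, not the actual tmcd code.

    from dataclasses import dataclass
    from ipaddress import IPv4Address, IPv4Network
    from typing import Optional

    @dataclass
    class Host:
        name: str
        ip: IPv4Address
        mac: str
        role: str                      # "subboss", "ops", "fs", "testnode", "virtnode", ...
        subboss: Optional["Host"] = None   # which subboss serves this node, if any

    def arpinfo_for(requester, control_net: IPv4Network, servers, nodes):
        """Return the hosts whose ARP info tmcd would hand back to 'requester'."""
        # Only requesters on the node control network get anything at all.
        if requester.ip not in control_net:
            return []

        others = [s for s in servers if s.ip in control_net and s is not requester]

        if requester.role == "subboss":
            # Other control-net servers plus the nodes this subboss serves.
            return others + [n for n in nodes if n.subboss is requester]

        if requester.role in ("ops", "fs"):
            # Other servers plus every test/virt node on the node control net.
            return others + [n for n in nodes
                             if n.role in ("testnode", "virtnode")
                             and n.ip in control_net]

        # Anyone else gets nothing.
        return []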
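Finally, a sketch of one plausible way to gather the server IP/MAC info that the new site variables need. This is not necessarily how utils/update_sitevars does it; it just illustrates the idea of probing each candidate server from boss and reading its MAC out of the local ARP cache. The server names are placeholders.

    import re
    import subprocess

    SERVERS = ["ops", "fs", "subboss1"]    # candidate servers, names assumed

    def mac_for(host):
        """Ping the host so an ARP entry exists, then parse 'arp -n <host>'."""
        subprocess.run(["ping", "-c", "1", host],
                       stdout=subprocess.DEVNULL, check=False)
        out = subprocess.run(["arp", "-n", host],
                             capture_output=True, text=True).stdout
        m = re.search(r"at ((?:[0-9a-f]{1,2}:){5}[0-9a-f]{1,2})", out, re.IGNORECASE)
        return m.group(1) if m else None

    for server in SERVERS:
        print(server, mac_for(server))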