tmcd/tmcd.c · f12491790572204c53353c6ea84e32e2ee02f7d9 · emulab / emulab-devel

Another Kludge for returning mounts to VMs. What a pain. Here
are the details, so they are recorded someplace.

Leigh B Stoller authored Aug 27, 2013

The Racks do not have a real 172 router for the "jail" network.
This is a mild pain, and one possibility would be to make the
router be the physical node, so that each set of VMs is using its own
router thus spreading the load.

Well, that does not work because we use bridge mode on the physical
host, and so the packets leave the node before they have a chance to
go through the routing code. Yes, iptables does have something called
a brouter via ebtables, but I could not make that work after a lot of
trying and tearing my hair out.

So the next not so best thing is to make the control node be the
router by sticking an alias on xenbr0 for 172.16.0.1. Fine, that works
although performance could suffer.

But what about NFS traffic to ops? It would be really silly to send
that through the routing code on the control node, just to end up
bridging into the ops VM. So I figured I would optimize that by
changing domounts to return mounts that reference ops address on the
jail network. And in fact this worked fine, but only for shared
nodes.

But it failed for exclusive VMs! In this case, we add a SNAT rule on
the physical host that changes the source IP to be that of the
physical host so that users cannot spoof a VM on a shared node and
mount an NFS filesystem they should not have access to. In fact, it
failed for UDP mounts but not for TCP mounts. When I looked at the
traffic with tcpdump, it appeared that return TCP traffic from ops was
using its jail IP, but return UDP traffic was using the public IP.
This confuses SNAT and so the packets never get back into the VM.
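
To make the failure concrete, here is a toy model of how connection
tracking pairs a reply with an outgoing flow. It is only a sketch: the
struct, the function name, and the exact-mirror rule are simplifications
made up for illustration, not the kernel's real data structures.

```c
#include <stdbool.h>
#include <stdint.h>

/* Hypothetical, simplified flow record (addresses in host byte order). */
struct flow {
    uint32_t src_ip;
    uint32_t dst_ip;
    uint16_t src_port;
    uint16_t dst_port;
};

/*
 * A reply is tied back to a translated flow only when it is the exact
 * mirror of what was sent: its source must be the address the request
 * was sent to. A reply sourced from a different address of the same
 * server does not match, so the SNAT is never undone for it.
 */
static bool
reply_matches(const struct flow *sent, const struct flow *reply)
{
    return reply->src_ip == sent->dst_ip &&
           reply->dst_ip == sent->src_ip &&
           reply->src_port == sent->dst_port &&
           reply->dst_port == sent->src_port;
}
```

In these terms, the TCP replies from ops mirrored the request, while the
UDP replies came back with the public IP as source and so matched nothing.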

So, this change basically looks at the sharing mode of the node, and
if it is shared we use the jailip in the mounts, and if it is exclusive
we use the public IP (and thus, that traffic gets routed through the
control node). This sucks, but I am worn down on this.
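
The selection described above boils down to something like the sketch
below. The enum, struct, and function names are hypothetical and do not
come from tmcd.c, and the fallback to the public IP when no jail address
is known is an assumption added for the example.

```c
#include <stddef.h>

/* Hypothetical names for illustration; not taken from tmcd.c. */
enum sharing_mode { NODE_EXCLUSIVE, NODE_SHARED };

struct fs_addrs {
    const char *jail_ip;    /* ops address on the jail network */
    const char *public_ip;  /* ops public address */
};

/*
 * Pick the server address to place in the mounts returned to a VM.
 * Shared node: use the jail address so NFS traffic stays on the bridge.
 * Exclusive node: use the public address so the traffic is routed
 * through the control node and replies line up with the SNAT rule.
 * Assumption: fall back to the public address if no jail address is set.
 */
static const char *
mount_server_ip(enum sharing_mode mode, const struct fs_addrs *fs)
{
    if (mode == NODE_SHARED && fs->jail_ip != NULL)
        return fs->jail_ip;
    return fs->public_ip;
}
```

Keeping the choice in one small function means the mount-generating code
only has to ask for "the server address" and never cares which network
it ends up on.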