Support for frisbee direct image upload to fs node. · 99943a19

Mike Hibler authored 6 years ago

We have had issues with uploading images to boss where they are then written
across NFS to ops. That seems to be a network hop too far on CloudLab Utah
where we have a 10Gb control network. We get occasional transient timeouts
from somewhere in the TCP code. With the convoluted path through real and
virtual NICs, some with offloading, some without, packets wind up getting
out of order and someone gets far enough behind to cause problems.

So we work around it.

If IMAGEUPLOADTOFS is defined in the defs-* file, we will run a frisbee
master server on the fs (ops) node and the image creation path directs the
nodes to use that server. There is a new hack configuration for the master
server, "upload-only", which is extremely specific to ops: it validates the
upload with the boss master server and, if allowed, fires up an upload
server for the client to talk to. The image will thus be directly uploaded
to the local (ZFS) /proj or /groups filesystems on ops. This seems to be
enough to get around the problem.
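
The "upload-only" flow described above can be sketched roughly as follows.
This is a minimal illustration in Python, not the actual frisbee master
server code: the names Request, handle_request, validate_with_boss and
start_upload_server are hypothetical, and the boss validation and the upload
server startup are passed in as callables because their real interfaces are
not described here.

```python
from dataclasses import dataclass
from typing import Callable, Optional


@dataclass
class Request:
    """Hypothetical stand-in for a request arriving at the ops master server."""
    node_id: str     # node making the request
    kind: str        # "upload" or "download"
    image_path: str  # target path of the image on ops


def handle_request(
    req: Request,
    validate_with_boss: Callable[[Request], bool],
    start_upload_server: Callable[[str], int],
) -> Optional[int]:
    """Return the port of a freshly started upload server, or None if refused.

    Both callables are assumptions made for this sketch: one proxies the
    validation to the boss master server, the other starts an upload server
    that writes to the given path and returns the port it listens on.
    """
    # "upload-only": this master server does not serve downloads.
    if req.kind != "upload":
        return None

    # boss stays the authority: the request is validated there first.
    if not validate_with_boss(req):
        return None

    # Allowed: fire up an upload server that writes directly to local storage.
    return start_upload_server(req.image_path)
```

The point of the split is that boss still decides whether an upload is
allowed, while the image data itself only ever travels from the node to ops.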

Note that we could allow this master server to serve downloads as well to
avoid the analogous problem in that direction, but this to date has not
been a problem.

NOTE: the ops node must be in the nodes table in the DB or else boss will
not validate proxied requests from it. The standard install procedure is
supposed to add ops, but we have a couple of clusters where it is not in
the table!
