Eliminate tipserver mounts on ops
Our remaining tipserv* nodes are getting old and flaky. If one of them goes down for an extended period of time, it (eventually) takes ops with it, because we NFS mount the tipserv nodes on ops. Even though the mounts are interruptible (intr) and set to time out (soft), neither option seems to work. A 30-minute reboot of ops because a tipserv node fails is not acceptable.
Why are tipserv nodes mounted on ops? Originally, it was so that users could get to the log/run files for captures. But now that users can no longer log in to ops directly, maybe that is not needed. @stoller confirms that the portal interface accesses logs by ssh'ing directly to the tipserv nodes. The remaining concern is that ops does directly access the .acl file for authentication.
A reasonably simple solution for this is to reverse the mounts so that tipserv nodes mount ops instead of the other way around. We would want to isolate the .acl files so that they are the only thing shared over NFS with ops; we do not want to have every tipserv node writing the console logs themselves across NFS to ops. ops does not need more NFS load.