    Major overhaul to support thin snapshot volumes and also fixup locking. · a9e75f33
    Mike Hibler authored
    A "thin volume" is one in which storage allocation is done on demand; i.e.,
    space is not pre-allocated, hence the "thin" part. If thin snapshots and
    the associated base volume are all part of a "thin pool", then all snapshots
    and the base share blocks from that pool. If there are N snapshots of the
    base, and none have written a particular block, then there is only one copy
    of that block in the pool that everyone shares.
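
    For illustration, thin snapshots of a thin volume are created without a
    pre-allocated size (the VG and LV names here are made up, not from this
    commit):

        # Each snapshot shares the base's blocks until it writes to them
        lvcreate -s -n snap1 xen-vg/base
        lvcreate -s -n snap2 xen-vg/base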
    
    Anyway, we now create a global thin pool in which the thin snapshots can be
    created. We currently allocate up to 75% of the available space in the VG
    to the pool (note: space allocated to the thin pool IS statically allocated).
    The other 25% is reserved for Things That Will Not Be Shared and serves
    as a fallback in case something on the thin-volume path fails. That is,
    we can disable thin volume creation and go back to the standard path.
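
    In LVM terms the pool creation amounts to something like this (pool and
    VG names hypothetical; the actual code computes the size itself):

        # Statically allocate 75% of the VG's free space to the thin pool
        lvcreate --type thin-pool -l 75%FREE -n thinpool xen-vg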
    
    Images are still downloaded and saved in compressed form in individual
    LVs. These LVs are not allocated from the pool since they are TTWNBS.
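
    I.e., each compressed image lives in an ordinary, fully allocated LV
    (name and size here are hypothetical):

        lvcreate -L 1G -n image-fbsd xen-vg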
    
    When the first vnode comes along that needs an image, we imageunzip the
    compressed version to create a "golden disk" LV in the pool. That first
    vnode and all subsequent vnodes get thin snapshots of that volume.
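
    Roughly, the golden-disk path looks like this (names, sizes, and the
    exact imageunzip invocation are illustrative, not lifted from the code):

        # Create a thin LV in the pool and decompress the image into it
        lvcreate -T xen-vg/thinpool -V 10G -n golden
        imageunzip /dev/xen-vg/image-fbsd /dev/xen-vg/golden
        # Each vnode then gets a thin snapshot of the golden disk
        lvcreate -s -n vnode1.disk xen-vg/golden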
    
    When the last vnode that uses a golden disk goes away we...well,
    do nothing. Unless $REAP_GDS (linux/xen/libvnode_xen.pm) is set non-zero,
    in which case we reap the golden disk. We always leave the compressed
    image LV around. Leigh says he is going to write a daemon to GC all these
    things when we start to run short of VG space...
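
    Reaping presumably boils down to an lvremove once no snapshots reference
    the golden disk; a hypothetical sketch (field names per lvs(8)):

        # If no thin snapshots of the golden disk remain, remove it;
        # the compressed image LV is left alone.
        if [ -z "$(lvs --noheadings -o lv_name --select 'origin=golden' xen-vg)" ]; then
            lvremove -f xen-vg/golden
        fi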
    
    This speedup in the creation of vnodes that share an image turned up some
    more race conditions, particularly around iptables. I closed a couple more
    holes (in particular, ensuring that we lock iptables when setting up
    enet interfaces as we do for the cnet interface) and added some optional
    lock debug logging (turned off right now).
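
    The locking pattern, sketched in shell (the real code is Perl, and the
    lock file path here is made up):

        (
            flock -x 9
            # ... iptables manipulation goes here, e.g.:
            iptables -A FORWARD -i xenbr0 -j ACCEPT
        ) 9>/var/run/iptables.lock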
    
    I timestamped those messages, and a variety of other important ones,
    so that we could merge (the important parts of) the assorted logfiles
    and get a sequential picture of what happened:
    
        grep TIMESTAMP *.log | sort +2
    
    (Think of it as Weir lite!)
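
    (Note: "sort +2" is the old field-offset syntax; with modern coreutils
    the equivalent is "sort -k 3", i.e. sort starting at the third
    whitespace-separated field.)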