In the beginning, the number and size of experiments were small, so
storing the entire slice/sliver status blob as JSON in the web task was
fine, even though we had to lock tables to prevent races between the
event updates and the local polling.
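For reference, the old update path looked roughly like this (table and
column names here are made up for illustration, not the real schema):

    -- Hypothetical sketch of the old scheme: the entire status blob
    -- sits in one column of the per-instance web task row, so every
    -- update takes a full table lock to avoid races with the poller.
    LOCK TABLES web_tasks WRITE;
    UPDATE web_tasks
       SET task_data = '{"slices": {...}, "slivers": {...}}'
     WHERE task_id = 'some-instance-task';
    UNLOCK TABLES;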
But lately those JSON blobs have been getting huge, and the lock is
bogging things down; we cannot keep up with the number of events coming
from all the clusters and fall really far behind.
So I have moved the status blobs out of the per-instance web task and
into new tables, one per slice and one per node (sliver). This keeps
each blob very small and thus the lock time very short, so now we can
keep up with the event stream.
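A rough sketch of the new layout, using hypothetical table and column
names (the real schema differs):

    -- One small blob per slice and one per sliver, so each write
    -- locks a table whose rows are all tiny.
    CREATE TABLE slice_status (
        slice_uuid  varchar(40) NOT NULL,
        status      mediumtext,       -- small JSON blob for the slice
        PRIMARY KEY (slice_uuid)
    ) ENGINE=MyISAM;

    CREATE TABLE sliver_status (
        sliver_uuid varchar(40) NOT NULL,
        slice_uuid  varchar(40) NOT NULL,
        status      mediumtext,       -- small JSON blob for one node
        PRIMARY KEY (sliver_uuid),
        KEY slice_idx (slice_uuid)
    ) ENGINE=MyISAM;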
If we grow enough for this to become a problem again, we can switch to
InnoDB for the per-sliver table and get row locking instead of table
locking, but I do not think that will happen.
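If it ever comes to that, the conversion itself is trivial (again using
the hypothetical table name from the sketch above):

    -- InnoDB locks rows rather than whole tables, so concurrent
    -- updates to different slivers would no longer block each other.
    ALTER TABLE sliver_status ENGINE=InnoDB;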