Leigh B Stoller authored
In the beginning, the number and size of experiments were small, so storing the entire slice/sliver status blob as JSON in the web task was fine, even though we had to lock tables to prevent races between the event updates and the local polling. But lately those JSON blobs have gotten huge and the lock is bogging things down; we cannot keep up with the number of events coming from all the clusters, and we get really far behind. So I have moved the status blobs out of the per-instance web task and into new tables, one per slice and one per node (sliver). This keeps the blobs very small and thus the lock time very short, so now we can keep up with the event stream. If we grow big enough that this problem comes back, we can switch to InnoDB for the per-sliver table and do row locking instead of table locking, but I do not think that will happen.
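
A minimal sketch of the per-slice / per-sliver layout described above, not the actual schema from this commit. The table names, column names, and helper functions are all hypothetical, and SQLite stands in for the real database only so the example runs on its own. The point it illustrates is that an event for one node rewrites one small row rather than the whole per-instance blob.

```python
import json
import sqlite3

# Hypothetical schema: every table and column name here is an assumption.
conn = sqlite3.connect(":memory:")
conn.executescript("""
    CREATE TABLE slice_status (
        slice_id    TEXT PRIMARY KEY,
        status_json TEXT NOT NULL
    );
    CREATE TABLE sliver_status (
        slice_id    TEXT NOT NULL,
        node_id     TEXT NOT NULL,
        status_json TEXT NOT NULL,
        PRIMARY KEY (slice_id, node_id)
    );
""")


def update_sliver_status(conn, slice_id, node_id, status):
    """Store the status of one node; only that node's small row is written."""
    with conn:  # commits on success, rolls back on an exception
        conn.execute(
            "INSERT OR REPLACE INTO sliver_status VALUES (?, ?, ?)",
            (slice_id, node_id, json.dumps(status)),
        )


def get_sliver_statuses(conn, slice_id):
    """Reassemble the per-node view of a slice from its small rows."""
    rows = conn.execute(
        "SELECT node_id, status_json FROM sliver_status WHERE slice_id = ?",
        (slice_id,),
    ).fetchall()
    return {node_id: json.loads(blob) for node_id, blob in rows}


# Two events for node1 and one for node2; each touches a single row.
update_sliver_status(conn, "slice-a", "node1", {"state": "booting"})
update_sliver_status(conn, "slice-a", "node2", {"state": "ready"})
update_sliver_status(conn, "slice-a", "node1", {"state": "ready"})
statuses = get_sliver_statuses(conn, "slice-a")
```

With this shape, the time spent holding a lock per event depends on the size of one node's status, not on the size of the whole experiment.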
e5d36e0d