Quick fix for watchdog/backup interaction; use a script lock.
From Slack:

What I notice is that mysqldump read-locks all of the tables for a long time, and that time keeps growing as the DB gets bigger. Last night enough stuff backed up (waiting on various write locks) that we hit the 500 thread limit. I only know this because mysql prints "killing 501" threads at 2:03am. That makes me wonder whether our thread limit is too small (though it seems like it would have to be much bigger), or whether our backup strategy is inappropriate for how big the DB is and how busy the system is. To be clear, I am not even sure that mysqld throws in the towel when it hits 500 threads; I am in the midst of reading obtuse mysql documentation. There are a bunch of other error messages that I do not understand yet.

I can reproduce this in my elabinelab with a 10 line perl script. Two problems: 1) We do not use the permission system, so we cannot use dynamic permissions, which means that the single thread that is left for just this case can be used by anyone, and so the server is fully out of threads. 2) The Emulab mysql watchdog then cannot perform its query, so it thinks mysqld has gone catatonic and kills it, right in the middle of the backup. Yuck * 2.

And if anyone is curious about a more typical approach: "If you want to do this for MyISAM or mixed tables without any downtime from locking the tables, you can set up a slave database, and take your snapshots from there. Setting up the slave database, unfortunately, causes some downtime to export the live database, but once it's running, you should be able to lock its tables, and export using the methods others have described. When this is happening, it will lag behind the master, but won't stop the master from updating its tables, and will catch up as soon as the backup is complete"
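To illustrate the "script lock" idea, here is a minimal sketch of how a backup job and a watchdog could coordinate through an advisory file lock. It is not the code in this change: the lock path and the names `script_lock` and `backup_in_progress` are hypothetical, and it assumes both scripts run on the same host with a POSIX `flock`.

```python
import fcntl
import os
from contextlib import contextmanager

# Hypothetical lock file shared by the backup script and the watchdog.
LOCK_PATH = "/tmp/mysql_backup.lock"


@contextmanager
def script_lock(path=LOCK_PATH, blocking=True):
    """Take an exclusive advisory lock on `path`.

    Yields True if the lock was acquired. In non-blocking mode, yields
    False when another holder already has the lock.
    """
    fd = os.open(path, os.O_CREAT | os.O_RDWR, 0o644)
    held = False
    try:
        flags = fcntl.LOCK_EX | (0 if blocking else fcntl.LOCK_NB)
        try:
            fcntl.flock(fd, flags)
            held = True
        except BlockingIOError:
            # Someone else holds the lock and we asked not to wait.
            held = False
        yield held
    finally:
        if held:
            fcntl.flock(fd, fcntl.LOCK_UN)
        os.close(fd)


def backup_in_progress(path=LOCK_PATH):
    """Return True if some other holder (e.g. the backup) has the lock."""
    with script_lock(path, blocking=False) as held:
        return not held
```

Under these assumptions, the backup script would wrap its mysqldump run in `with script_lock():`, and the watchdog would call `backup_in_progress()` before deciding that an unresponsive mysqld is catatonic, skipping the kill while the backup holds the lock. This only hides the symptom; the long read locks and the thread limit described above are still there.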
Showing with 26 additions and 2 deletions