Watch for the "not enough nodes" error status from startexp, and send
email only every now and then.

Also change how experiments are selected for configuration. Instead of trying to start the same experiment over and over every 15 seconds, use a select to pick out experiments that have not been tried within the last 10 minutes. This favors brand-new experiments on their first attempt; after that, all failed experiments are treated the same. Among the experiments last attempted more than 10 minutes ago, the least recently attempted one is selected next.
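
The selection rule described above can be sketched roughly as follows. This is only an illustration in Python with an in-memory SQLite database, not the daemon's actual query: the table name `batch_experiments`, the column `last_attempt`, and the helpers `make_db`, `pick_next`, and `record_attempt` are all hypothetical. It assumes a never-attempted experiment has a NULL `last_attempt`, which is treated as time 0 so that new experiments sort first.

```python
import sqlite3

RETRY_INTERVAL = 10 * 60  # seconds; the 10-minute window from the message


def make_db():
    """Create an in-memory table of queued experiments (hypothetical schema)."""
    db = sqlite3.connect(":memory:")
    db.execute(
        "CREATE TABLE batch_experiments ("
        " eid TEXT PRIMARY KEY,"
        " last_attempt INTEGER)"  # epoch seconds; NULL means never attempted
    )
    return db


def pick_next(db, now):
    """Return the eid to try next, or None if everything was tried recently.

    Only experiments not attempted within RETRY_INTERVAL are eligible.
    Among those, the least recently attempted wins; never-attempted
    experiments count as attempted at time 0, so they come first.
    """
    row = db.execute(
        "SELECT eid FROM batch_experiments"
        " WHERE COALESCE(last_attempt, 0) < ?"
        " ORDER BY COALESCE(last_attempt, 0) ASC, eid ASC"
        " LIMIT 1",
        (now - RETRY_INTERVAL,),
    ).fetchone()
    return row[0] if row else None


def record_attempt(db, eid, now):
    """Stamp an experiment so it is skipped for the next RETRY_INTERVAL."""
    db.execute(
        "UPDATE batch_experiments SET last_attempt = ? WHERE eid = ?",
        (now, eid),
    )
```

In a polling loop, the daemon would call `pick_next` on each wakeup, attempt that experiment, and call `record_attempt` whether or not the attempt succeeded, so a failing experiment moves to the back of the queue instead of being retried immediately.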
Showing with 16 additions and 31 deletions