Kevin Atkinson authored
Many changes to tblog code. Database update needed.

1) Added a summary of failed nodes in os_setup. The cause of the error is now classified as "user" if only user images failed and the user image failed on every pc of a particular type. Otherwise I leave the cause as "unknown", since it is really hard to tell what the real cause is.

2) Raised the confidence threshold for most errors so that they appear at the top.

3) Added a special error for when an experiment is canceled. The cause is "canceled", and testbed-ops won't see these errors.

4) Fixed a bug in assign_wrapper where it incorrectly reported "This experiment cannot be instantiated on this testbed..." when the user had really canceled the swapin.

5) Fixed a bug where os_setup errors were being incorrectly reported as assign errors. This happens when os_setup fails for some reason and tbswap tries again, but the second time around there are not enough nodes. The last error then comes from assign, even though the true cause is the failed nodes. The fix involved adding a new column to the log table, "attempt", which is 1 for the first attempt and is incremented for each new attempt. tblog_find_error now simply ignores any errors with "attempt > 1".

6) Also fixed a potential problem when there is an error during the cleanup phase, by adding another column, "cleanup". tblog_find_error also ignores any errors with the cleanup bit set.
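
The filtering described in items 5 and 6 can be sketched roughly as follows. This is an illustration only, not the actual tblog code: the LogEntry shape, the find_error name, and the "highest confidence wins" tie-break are assumptions. Only the "attempt" and "cleanup" columns and the rule of ignoring "attempt > 1" and cleanup errors come from the description above.

```python
from dataclasses import dataclass
from typing import Iterable, Optional


@dataclass
class LogEntry:
    # Hypothetical row shape. Only "attempt" and "cleanup" are named in the
    # commit message; "message" and "confidence" are illustrative.
    message: str
    confidence: float
    attempt: int = 1       # 1 for the first attempt, incremented on each retry
    cleanup: bool = False  # set for errors logged during the cleanup phase


def find_error(entries: Iterable[LogEntry]) -> Optional[LogEntry]:
    """Pick the error to report, skipping retries and cleanup-phase errors."""
    candidates = [e for e in entries if e.attempt <= 1 and not e.cleanup]
    if not candidates:
        return None
    # Assumption: the highest-confidence error is the one reported.
    return max(candidates, key=lambda e: e.confidence)


# The scenario from item 5: os_setup fails, tbswap retries, assign then fails.
log = [
    LogEntry("os_setup: node failed to boot", 0.9, attempt=1),
    LogEntry("assign: not enough nodes", 0.9, attempt=2),
    LogEntry("error during teardown", 0.5, attempt=1, cleanup=True),
]
reported = find_error(log)
```

With this log, the assign error from the second attempt and the cleanup-phase error are both skipped, so the os_setup failure is what gets reported.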
183040de