1. 23 May, 2019 1 commit
    • Changes related to parameter sets and experiment bindings: · 03e4d8bc
      Leigh Stoller authored
      * Show the parameter bindings on the status page for an experiment, and
        on the memlane page. This is strictly informational so that users can
        quickly see the parameters that are/were chosen at the time the
        experiment was created.
      * Add a Save Parameters button on the memlane and status pages. This
        will generate a json structure and store it in the DB for that profile
        and user. Optionally, mark the parameter set as specific to a profile
        version or repo hash, so a user can quickly link to that version/hash
        and apply the parameter set.
      * On the instantiate page, the parameters step includes new buttons to
        1) reset the form to default, 2) apply the parameters used in the most
        recent experiment (current, then history), 3) choose from a dropdown
        of parameters the user has saved for that profile, and 4) take the
        user to their activation history for the profile, to pick one to run
        again or save parameters.
      * Add a new tab to the user dashboard to show the user's saved parameter
      * Lots of changes to the new version of the ppwizard for applying
        parameter sets and showing warnings about them. This code has NOT been
        applied to the old ppwizard.
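
The "Save Parameters" step above stores a JSON structure per profile and user, optionally tied to a profile version or repo hash. A minimal sketch of what such a record might look like follows. The function name, field names, and example values are all hypothetical; the commit message does not give the actual schema.

```python
import json


def build_parameter_set(profile, user, bindings, version=None, repo_hash=None):
    """Serialize a saved parameter set as JSON (hypothetical schema)."""
    record = {
        "profile": profile,          # which profile the set belongs to
        "user": user,                # who saved it
        "bindings": dict(bindings),  # parameter name -> chosen value
    }
    # Optional markers: only present when the set is specific to a
    # profile version or to a repository hash.
    if version is not None:
        record["profile_version"] = version
    if repo_hash is not None:
        record["repo_hash"] = repo_hash
    return json.dumps(record, sort_keys=True)


if __name__ == "__main__":
    print(build_parameter_set("example-profile", "alice",
                              {"nodeCount": 4}, repo_hash="abc123"))
```

Keeping the version and hash as optional keys means a set saved without them can be applied to any version of the profile.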
  2. 08 May, 2019 1 commit
  3. 30 Apr, 2019 1 commit
  4. 26 Apr, 2019 3 commits
  5. 13 Mar, 2019 1 commit
  6. 11 Mar, 2019 2 commits
  7. 06 Mar, 2019 1 commit
  8. 28 Feb, 2019 1 commit
  9. 12 Feb, 2019 1 commit
    • Recovery mode: · bde6c94d
      Leigh Stoller authored
      * Add a new Portal context menu option to nodes, to boot into "recovery"
        mode, which will be a Linux MFS (rather than the FreeBSD MFS, which
        99% of users will not know what to do with).
      * Plumb all through to the Geni RPC interface, which invokes node_admin
        with a new option, to use the recovery mfs nodetype attribute.
      * recoverymfs_osid is a distinct osid from adminmfs_osid; we use that in
        the CM to add an Emulab name space attribute to the manifest, which
        tells the Portal that a node supports recovery mode (and thus gets a
        context menu option).
      * Add an inrecovery flag to the sliver status blob, which the Portal
        uses to determine that a node is currently in recovery mode, so that
        we can indicate that in the topology and list tabs.
  10. 08 Feb, 2019 2 commits
  11. 04 Feb, 2019 2 commits
  12. 02 Jan, 2019 1 commit
  13. 13 Dec, 2018 1 commit
  14. 06 Dec, 2018 1 commit
  15. 30 Nov, 2018 1 commit
  16. 28 Nov, 2018 1 commit
  17. 16 Nov, 2018 1 commit
  18. 07 Nov, 2018 1 commit
    • Quick fix for watchdog/backup interaction; use a script lock. · 72b4ba32
      Leigh Stoller authored
      From Slack:
      What I notice is that mysqldump is read locking all of the tables for a
      long time. This time gets longer and longer of course as the DB gets
      bigger. Last night enough stuff backed up (trying to get various write
      locks) that we hit the 500 thread limit. I only know this cause mysql
      prints "killing 501" threads at 2:03am. Which makes me wonder if our
      thread limit is too small (but seems like it would have to be much
      bigger) or if our backup strategy is inappropriate for how big the DB is
      and how busy the system is. But to be clear, I am not even sure if
      mysqld throws in the towel when it hits 500 threads, I am in the midst
      of reading obtuse mysql documentation. There are a bunch of other
      error messages that I do not understand yet.
      I can reproduce this in my elabinelab with a 10 line perl script. Two
      problems; one is that we do not use the permission system, so we cannot
      use dynamic permissions, which means that the single thread that is left
      for just this case, can be used by anyone, and so the server is fully
      out of threads. And 2) then the Emulab mysql watchdog cannot perform its
      query, and so it thinks mysqld has gone catatonic and kills it, right in
      the middle of the backup. Yuck * 2.
      And if anyone is curious about a more typical approach: "If you want to
      do this for MyISAM or mixed tables without any downtime from locking the
      tables, you can set up a slave database, and take your snapshots from
      there. Setting up the slave database, unfortunately, causes some
      downtime to export the live database, but once it's running, you should
      be able to lock its tables, and export using the methods others have
      described. When this is happening, it will lag behind the master, but
      won't stop the master from updating its tables, and will catch up as
      soon as the backup is complete"
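
One plausible reading of the "script lock" fix is that the backup holds an exclusive lock file while mysqldump runs, and the watchdog skips its probe (rather than killing mysqld) whenever that lock is held. The sketch below illustrates the idea with `fcntl.flock`. The helper names and the lock path are assumptions, and this is not the actual Emulab implementation.

```python
import fcntl
import os
from contextlib import contextmanager


@contextmanager
def script_lock(path, blocking=True):
    """Hold an exclusive advisory lock on `path`; yields True if acquired."""
    fd = os.open(path, os.O_CREAT | os.O_RDWR, 0o644)
    flags = fcntl.LOCK_EX | (0 if blocking else fcntl.LOCK_NB)
    try:
        try:
            fcntl.flock(fd, flags)
            acquired = True
        except BlockingIOError:
            # Someone else (e.g. the backup) already holds the lock.
            acquired = False
        try:
            yield acquired
        finally:
            if acquired:
                fcntl.flock(fd, fcntl.LOCK_UN)
    finally:
        os.close(fd)


def watchdog_should_probe(lock_path):
    """Return False while the backup holds the lock, True otherwise."""
    with script_lock(lock_path, blocking=False) as free:
        return free
```

In this sketch the backup script would wrap its mysqldump call in `with script_lock(path):`, and the watchdog would call `watchdog_should_probe(path)` before deciding that mysqld is unresponsive.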
  19. 05 Nov, 2018 1 commit
  20. 30 Oct, 2018 1 commit
  21. 29 Oct, 2018 1 commit
  22. 23 Oct, 2018 1 commit
  23. 11 Oct, 2018 1 commit
  24. 24 Sep, 2018 1 commit
    • Fix an IsFeasible bug where start and end of different reservations overlap. · a7089186
      David Johnson authored
      When IsFeasible processes the list of events (i.e. reservation
      start/end, expt start/end), it processes them in sorted order of event
      time, but if times are equal, there is no secondary sort, and thus the
      additive (incoming) reservation might be processed before the reductive
      (outgoing) reservation, which would create a false negative hole in the
      forecast.  This commit adds the secondary sort.
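
The effect of the secondary sort can be illustrated with a small sketch. It is not the IsFeasible code itself: the event representation (time, change in node demand), the `tie_break` switch, and the capacity check are simplifications assumed for illustration.

```python
from typing import List, Tuple

# (time, delta): delta > 0 is an additive (incoming) event that takes nodes,
# delta < 0 is a reductive (outgoing) event that releases nodes.
Event = Tuple[int, int]


def is_feasible(capacity: int, events: List[Event], tie_break: bool = True) -> bool:
    """Replay events in time order and check demand never exceeds capacity."""
    if tie_break:
        # Secondary sort: at equal times, process releases before acquisitions.
        ordered = sorted(events, key=lambda e: (e[0], 0 if e[1] < 0 else 1))
    else:
        # Primary sort only: ties keep their input order (sorted() is stable).
        ordered = sorted(events, key=lambda e: e[0])
    in_use = 0
    for _, delta in ordered:
        in_use += delta
        if in_use > capacity:
            return False
    return True


if __name__ == "__main__":
    # Reservation A holds 10 nodes over [0, 5); reservation B wants 10 over [5, 9).
    events = [(0, +10), (5, +10), (5, -10), (9, -10)]
    print(is_feasible(10, events, tie_break=False))  # spurious failure at t=5
    print(is_feasible(10, events, tie_break=True))   # back-to-back reservations fit
```

Without the tie-break, B's start at t=5 is counted before A's end at t=5, so demand briefly appears to be 20 nodes on a 10-node pool.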
  25. 12 Sep, 2018 1 commit
  26. 29 Aug, 2018 3 commits
  27. 30 Jul, 2018 2 commits
  28. 20 Jul, 2018 1 commit
  29. 16 Jul, 2018 1 commit
  30. 12 Jul, 2018 1 commit
  31. 09 Jul, 2018 1 commit
  32. 18 Jun, 2018 1 commit
    • Add automated cancellation of reservations that are not used: · 67a0e58e
      Leigh Stoller authored
      * If unused at six hours, schedule for cancel in three hours and send
      * If reservation becomes used within those three hours, rescind the
      * Add an override bit so that cancel/uncancel on the command line
        supersedes (so explicit cancel or rescinding a cancel, means do not
        make any more automated checks for unused).
      * Rework cancel to be more library friendly.
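
The policy described above can be sketched as a small periodic check. This is an illustration under stated assumptions: the six hours are measured from the reservation start, what gets rescinded is the pending cancellation, and the class, field, and return-value names are all hypothetical.

```python
from dataclasses import dataclass
from datetime import datetime, timedelta
from typing import Optional

UNUSED_THRESHOLD = timedelta(hours=6)  # unused this long: schedule a cancel
CANCEL_GRACE = timedelta(hours=3)      # time between scheduling and cancelling


@dataclass
class Reservation:
    start: datetime
    used: bool = False
    cancel_at: Optional[datetime] = None
    override: bool = False  # set by an explicit command-line cancel/uncancel


def check_unused(res: Reservation, now: datetime) -> str:
    """Return the action the automated check would take for this reservation."""
    if res.override:
        return "skip"          # manual action supersedes automated checks
    if res.used:
        if res.cancel_at is not None:
            res.cancel_at = None
            return "rescind"   # became used during the grace period
        return "ok"
    if res.cancel_at is None:
        if now - res.start >= UNUSED_THRESHOLD:
            res.cancel_at = now + CANCEL_GRACE
            return "schedule"
        return "ok"
    return "cancel" if now >= res.cancel_at else "ok"
```

A caller would run `check_unused` on each active reservation at a regular interval and act on the returned string.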