1. 13 Sep, 2017 5 commits
    • loadinfo: if docker vnode and docker-format image, also send the PATH. · 9a3697de
      David Johnson authored
      (The image path in the docker case is a pointer to a fully qualified
      image -- registryhost[:port]/repo/si/tory/image:tag.)
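A minimal sketch of how a client might split such a path into its parts, assuming only the `registryhost[:port]/repo/si/tory/image:tag` layout given above. The names `ImageRef` and `parse_image_path` are hypothetical and are not taken from the loadinfo code.

```python
from typing import NamedTuple, Optional


class ImageRef(NamedTuple):
    """Parts of a fully qualified image path (hypothetical helper type)."""
    host: str
    port: Optional[int]
    repository: str  # may contain several '/'-separated components
    tag: str


def parse_image_path(path: str) -> ImageRef:
    """Split 'registryhost[:port]/repo/si/tory/image:tag' into its parts."""
    # Everything before the first '/' is the registry host and optional port.
    registry, sep, rest = path.partition("/")
    if not sep or not rest:
        raise ValueError(f"no repository component in {path!r}")
    host, _, port = registry.partition(":")
    if not host:
        raise ValueError(f"no registry host in {path!r}")
    # The tag follows the last ':' of the remainder and cannot contain '/'.
    name, sep, tag = rest.rpartition(":")
    if not sep or not name or not tag or "/" in tag:
        raise ValueError(f"no tag in {path!r}")
    return ImageRef(host, int(port) if port else None, name, tag)
```

The example host in a call such as `parse_image_path("registry.example.net:5000/repo/si/tory/image:tag")` is a placeholder; the sketch rejects paths that lack a tag because the format above always carries one.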
    • Leigh B Stoller's avatar
    • Fix to previous revision. · 270cadf0
      Leigh B Stoller authored
    • Return number of nodes being used for a reservation, if the reservation · 23ef549a
      Leigh B Stoller authored
      has started. I attempted to deal with multiple reservations in the same
      project; let's see if it makes sense.
    • Introduce sitevars to control the sensitivity of alerts. · 2962b32f
      Mike Hibler authored
      The sitevars are a bit obscure:
        # cnetwatch/check_interval
        #   Interval at which to collect info.
        #   Zero means don't run cnetwatch (exit immediately).
        # cnetwatch/alert_interval
        #   Interval over which to calculate packet/bit rates and to log alerts.
        #   Should be an integer multiple of the check_interval.
        # cnetwatch/pps_threshold
        #   Packet rate (packets/sec) in excess of which to log an alert.
        #   Zero means don't generate packet rate alerts.
        # cnetwatch/bps_threshold
        #   Data rate (bits/sec) in excess of which to log an alert.
        #   Zero means don't generate data rate alerts.
        # cnetwatch/mail_interval
        #   Interval at which to send email for all alerts logged during the interval.
        #   Zero means don't ever send email.
        # cnetwatch/mail_max
        #   Maximum number of alert emails to send; after this alerts are only logged.
        #   Zero means no limit to the emails.
      Basically you can tweak pps_threshold and bps_threshold to define what you
      think an unusual "burst" of cnet traffic is and then alert_interval to
      determine how long a burst has to last before you will send an alert.
      Why would you have check_interval less than alert_interval? You probably
      wouldn't unless you want to record finer-grained port stats using the -l
      option to write stats to a logfile. We do it on the mothership as a data
      source for some student machine learning projects. Note that in an environment
      with lots of control net switches, a single instance of gathering port
      counters from the switches could take 30 seconds or longer (on the mothership
      it can take minutes). So don't set check_interval too low.
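The interplay of check_interval, alert_interval, and the two thresholds can be sketched roughly as follows. This is not the cnetwatch implementation: it assumes both intervals are in seconds, that port counters are cumulative, and that counter wraparound can be ignored. `Sample`, `rates_over_window`, and `triggered_alerts` are hypothetical names.

```python
from dataclasses import dataclass
from typing import List, Optional, Tuple


@dataclass
class Sample:
    """One reading of a port's cumulative counters (hypothetical type)."""
    packets: int
    octets: int


def rates_over_window(samples: List[Sample], check_interval: int,
                      alert_interval: int) -> Optional[Tuple[float, float]]:
    """Return (packets/sec, bits/sec) over the most recent alert_interval.

    Samples are assumed to be taken every check_interval seconds, so one
    alert window spans alert_interval / check_interval sampling steps.
    """
    if check_interval <= 0 or alert_interval < check_interval:
        raise ValueError("need 0 < check_interval <= alert_interval")
    if alert_interval % check_interval:
        raise ValueError("alert_interval should be a multiple of check_interval")
    steps = alert_interval // check_interval
    if len(samples) < steps + 1:
        return None  # not enough history for a full window yet
    old, new = samples[-(steps + 1)], samples[-1]
    pps = (new.packets - old.packets) / alert_interval
    bps = 8 * (new.octets - old.octets) / alert_interval
    return pps, bps


def triggered_alerts(pps: float, bps: float,
                     pps_threshold: int, bps_threshold: int) -> List[str]:
    """List which thresholds were exceeded; a zero threshold disables that alert."""
    alerts = []
    if pps_threshold > 0 and pps > pps_threshold:
        alerts.append("pps")
    if bps_threshold > 0 and bps > bps_threshold:
        alerts.append("bps")
    return alerts
```

Because the rate is averaged over the whole alert_interval, a short spike inside one check_interval only raises an alert if it is large enough to lift the window average above the threshold.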
      The mail_* variables are paranoia about sending too much email due to runaway
      nodes. The mail_interval just coalesces alerts to reduce messages, and
      mail_max is the maximum number of emails that one instance of cnetwatch will
      send. The latter is a pretty silly mechanism as a long-running cnetwatch will
      probably hit the limit legitimately after 6 months or so and you will have to
      restart it.
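A rough sketch of the mail_interval and mail_max behaviour described above, assuming times are in seconds and that alerts arriving after the limit is reached are simply dropped from email (they would still be logged elsewhere). The class `AlertMailer` and its `send` callback are hypothetical and do not come from cnetwatch.

```python
from typing import Callable, List


class AlertMailer:
    """Coalesce alerts into at most one email per mail_interval (hypothetical)."""

    def __init__(self, mail_interval: int, mail_max: int,
                 send: Callable[[str], None]) -> None:
        self.mail_interval = mail_interval  # 0 means never send email
        self.mail_max = mail_max            # 0 means no limit on emails
        self.send = send
        self.pending: List[str] = []
        self.sent = 0
        self.last_flush = 0.0

    def log_alert(self, message: str) -> None:
        """Queue an alert for the next coalesced email."""
        self.pending.append(message)

    def tick(self, now: float) -> bool:
        """Call periodically; returns True if an email was sent."""
        if self.mail_interval == 0:
            self.pending.clear()
            return False
        if now - self.last_flush < self.mail_interval:
            return False
        self.last_flush = now
        if not self.pending:
            return False
        batch, self.pending = self.pending, []
        if self.mail_max and self.sent >= self.mail_max:
            return False  # limit reached: alerts are no longer mailed
        self.send("\n".join(batch))
        self.sent += 1
        return True
```

Since `sent` is never reset, the sketch shares the limitation noted above: once mail_max emails have gone out, only a restart re-enables email.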
  2. 12 Sep, 2017 8 commits
  3. 11 Sep, 2017 2 commits
  4. 10 Sep, 2017 1 commit
  5. 08 Sep, 2017 2 commits
  6. 07 Sep, 2017 2 commits
  7. 06 Sep, 2017 5 commits
  8. 05 Sep, 2017 3 commits
  9. 01 Sep, 2017 5 commits
  10. 31 Aug, 2017 4 commits
  11. 30 Aug, 2017 3 commits