- 02 Nov, 2017 1 commit
-
-
Mike Hibler authored
Defaults to 0, which means do not pass user password hashes to nodes via tmcc. Non-zero will restore the old behavior.
-
- 12 Sep, 2017 1 commit
-
-
Mike Hibler authored
The sitevars are a bit obscure:

    # cnetwatch/check_interval
    #   Interval at which to collect info.
    #   Zero means don't run cnetwatch (exit immediately).
    #
    # cnetwatch/alert_interval
    #   Interval over which to calculate packet/bit rates and to log alerts.
    #   Should be an integer multiple of the check_interval.
    #
    # cnetwatch/pps_threshold
    #   Packet rate (packets/sec) in excess of which to log an alert.
    #   Zero means don't generate packet rate alerts.
    #
    # cnetwatch/bps_threshold
    #   Data rate (bits/sec) in excess of which to log an alert.
    #   Zero means don't generate data rate alerts.
    #
    # cnetwatch/mail_interval
    #   Interval at which to send email for all alerts logged during the interval.
    #   Zero means don't ever send email.
    #
    # cnetwatch/mail_max
    #   Maximum number of alert emails to send; after this alerts are only logged.
    #   Zero means no limit to the emails.

Basically you can tweak pps_threshold and bps_threshold to define what you think an unusual "burst" of cnet traffic is, and then alert_interval to determine how long a burst has to last before you will send an alert.

Why would you have check_interval less than alert_interval? You probably wouldn't unless you want to record finer-grained port stats using the -l option to write stats to a logfile. We do it on the mothership as a data source for some student machine learning projects.

Note that in an environment with lots of control net switches, a single instance of gathering port counters from the switches could take 30 seconds or longer (on the mothership it can take minutes). So don't set check_interval too low.

The mail_* variables are paranoia about sending too much email due to runaway nodes. The mail_interval just coalesces alerts to reduce messages, and mail_max is the maximum number of emails that one instance of cnetwatch will send. The latter is a pretty silly mechanism, as a long-running cnetwatch will probably hit the limit legitimately after 6 months or so and you will have to restart it.
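To make the interplay between check_interval, alert_interval and the two thresholds concrete, here is a minimal Python sketch. It is not the cnetwatch code: the `Sample` type, the `rate_alerts` function and the assumption that intervals are in seconds are all hypothetical, chosen only to illustrate rates computed over an alert window from counters collected at each check.

```python
from dataclasses import dataclass
from typing import List


@dataclass
class Sample:
    """One hypothetical reading of a port's cumulative counters."""
    time: float    # seconds (assumed unit)
    packets: int   # cumulative packet count
    bits: int      # cumulative bit count


def rate_alerts(samples: List[Sample], alert_interval: float,
                pps_threshold: float, bps_threshold: float) -> List[str]:
    """Return which thresholds ("pps", "bps") were exceeded over the
    most recent alert_interval. A threshold of zero disables that alert."""
    if len(samples) < 2:
        return []
    newest = samples[-1]
    # Oldest sample (taken every check_interval) still inside the window.
    start = next(s for s in samples if newest.time - s.time <= alert_interval)
    elapsed = newest.time - start.time
    if elapsed <= 0:
        return []
    pps = (newest.packets - start.packets) / elapsed
    bps = (newest.bits - start.bits) / elapsed
    alerts = []
    if pps_threshold > 0 and pps > pps_threshold:
        alerts.append("pps")
    if bps_threshold > 0 and bps > bps_threshold:
        alerts.append("bps")
    return alerts
```

With samples collected every 10 seconds and an alert_interval of 30, a burst is averaged over the whole window, so a short spike has to be large to trip the threshold, which is the "how long a burst has to last" effect described above.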
-
- 01 Sep, 2017 1 commit
-
-
Mike Hibler authored
This is for the per-experiment root keypair. Note that the sitevar is not "hooked in" yet; I just wanted to get it in place for testing.
-
- 29 Mar, 2017 1 commit
-
-
Mike Hibler authored
-
- 24 Mar, 2017 2 commits
-
-
Mike Hibler authored
-
Mike Hibler authored
A crude tool to control whether node-local blockstores use SSD drives.
-
- 03 Mar, 2017 1 commit
-
-
Leigh B Stoller authored
-
- 26 Jan, 2017 1 commit
-
-
Leigh B Stoller authored
-
- 17 Jan, 2017 1 commit
-
-
Mike Hibler authored
There are three pieces here: a change to the frisbee protocol itself, an Emulab event component to get status back to the portal, and the surrounding infrastructure to make it all work.

Frisbee heartbeat messages:

Added a new message type to the frisbee protocol, "Progress". In theory it operates by having the server send a multicast progress request to its clients which includes an interval at which to report (or "just once") and an indication of what to report (nothing, progress summary, or full stats). The client then sends unicast "fire and forget" UDP replies according to that schedule.

However, I took a shortcut for the moment and just added a command line option to the client to tell it to report a summary at the indicated interval (-H <interval>). So the server never sends requests. This is implemented in the client by a fourth thread since I wanted it to operate independent of packet reception (which would cause clients to report in a highly synchronized fashion due to multicast). The server instance just logs progress reports into its log.

This protocol addition should be fully backward compatible as both client and server ignore (but log) unknown messages.

Emulab progress report events:

When this is compiled in (-DEMULAB_EVENTS) and turned on (-E <server>), the frisbee server instances will send a FRISBEEPROGRESS event to the indicated event server for every progress report it receives (in addition to logging the events to its own log). Right now it will create an event with key/value pairs for the information in a client summary reply:

    TSTAMP is the client's time at which it sends the event. Could be used by the receiver to determine latency of the report if it cared (and if it assumed that the clocks are in sync). We don't care about this.

    SEQUENCE is the report number. Again, could be used by the receiver, in this case to detect loss, if it cared. We don't.

    CHUNKS_RECV is complete chunks that the client has received from the network.

    CHUNKS_DECOMP is chunks decompressed by the client.

    BYTES_WRITTEN is bytes written to disk by the client.

Any of the three can be used by the event receiver as an indication of life and/or progress. However, only the last would be a reasonable indicator of time remaining since it is the last (and slowest) phase of imaging. To estimate time remaining we could compare that value to the amount of uncompressed data that is in the image. This makes the sketchy assumptions that the time for writes to the disk is uniform and that the number and distance of seeks is uniform, but it is better than a sharp stick in the eye.

Emulab infrastructure:

There is a new sitevar "images/frisbee/heartbeat" which can be set to a non-zero value to tell the frisbee MFS to fire off frisbee with -H <value> and thus make reports. The default value of zero means to not make reports. The tmcd "loadinfo" command sends this through via the HEARTBEAT=<value> param. REQUIRED A TMCD VERSION BUMP TO 41.
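The time-remaining estimate suggested above can be sketched in a few lines of Python. This is only an illustration under the same uniform-write-rate assumption the message calls sketchy; the function name and its parameters are hypothetical and not part of frisbee or the event receiver.

```python
from typing import Optional


def estimate_seconds_remaining(bytes_written: int,
                               elapsed_seconds: float,
                               total_uncompressed_bytes: int) -> Optional[float]:
    """Extrapolate time left from the BYTES_WRITTEN value of a progress report.

    Assumes disk writes proceed at a uniform rate. Returns None when there
    is not yet enough information to compute a rate.
    """
    if bytes_written <= 0 or elapsed_seconds <= 0:
        return None
    rate = bytes_written / elapsed_seconds  # average bytes/sec so far
    remaining = max(total_uncompressed_bytes - bytes_written, 0)
    return remaining / rate
```

For example, a client that has written a quarter of the uncompressed image data in 10 seconds would be estimated to need three times that long to finish.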
-
- 10 Jan, 2017 1 commit
-
-
Mike Hibler authored
Some messages said bytes when they really meant bits; sizes were actually Mib/Gib instead of Mb/Gb.
-
- 13 Dec, 2016 1 commit
-
-
Gary Wong authored
-
- 17 Nov, 2016 1 commit
-
-
Mike Hibler authored
The interval (60 minutes) was compiled into tmcd before. N.B.: DYNAMICROOTPASSWORD must be defined for this sitevar to have any effect. Otherwise, the root password is *never* set to the Emulab value. This is not a change in behavior, just sayin...
-
- 12 Oct, 2016 1 commit
-
-
Leigh B Stoller authored
all nodes are untyped.
-
- 09 Aug, 2016 1 commit
-
-
Mike Hibler authored
-
- 16 Jul, 2016 1 commit
-
-
Leigh B Stoller authored
-
- 01 Apr, 2016 1 commit
-
-
Kirk Webb authored
-
- 04 Dec, 2015 1 commit
-
-
Kirk Webb authored
-
- 03 Dec, 2015 1 commit
-
-
Kirk Webb authored
Also add sitevar for PhantomNet portal banner message.
-
- 21 Oct, 2015 1 commit
-
-
Leigh B Stoller authored
switches do not support it, we want to fail earlier than snmpit.
-
- 16 Oct, 2015 1 commit
-
-
Mike Hibler authored
In createdataset, if the "usequotas" sitevar is set for the dataset type in question but a quota does not exist for the dataset's project, we create a quota object using the value from the new "default_quota" sitevar for that dataset type. If that sitevar does not exist or has a value of zero, we do NOT create a quota object and hence createdataset will fail.
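The decision described above can be summarized as a small Python sketch. It is not the createdataset code: the function, its arguments and the exception type are hypothetical, and returning None when "usequotas" is unset is an assumption, since the message does not describe that case.

```python
from typing import Optional


class QuotaError(LookupError):
    """Hypothetical error standing in for a createdataset failure."""


def resolve_quota(usequotas: bool,
                  project_quota: Optional[int],
                  default_quota: Optional[int]) -> Optional[int]:
    """Return the quota to apply to the project for this dataset type.

    usequotas     -- value of the "usequotas" sitevar for the dataset type
    project_quota -- existing quota for the project, or None if there is none
    default_quota -- value of the "default_quota" sitevar, or None if unset
    """
    if not usequotas:
        return None  # assumption: no quota is enforced for this type
    if project_quota is not None:
        return project_quota  # an existing quota object is used as-is
    if default_quota:  # sitevar exists and is non-zero
        return default_quota  # a new quota object would be created with this value
    # No quota exists and no usable default: creation fails.
    raise QuotaError("no quota for project and no default_quota set")
```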
-
- 10 Aug, 2015 1 commit
-
-
Leigh B Stoller authored
-
- 18 May, 2015 1 commit
-
-
Leigh B Stoller authored
types in the images/default_typelist sitevar.
-
- 31 Mar, 2015 2 commits
-
-
Mike Hibler authored
Yes, out of the blue and off the wall. But I got tired of trying to guess what we had Linux and FreeBSD use. I was surprised to discover that we were using UDP on Linux (which caused Clemson CloudLab to fail because they have jumbo frames enabled on their control net switches but ops had the MTU set to 1500). Anyway, here it is. The default setting is UDP for backward compat. We should probably set it to TCP nowadays. There is also an 'osdefault' setting which says use the default setting on the client OS.
-
Leigh B Stoller authored
-
- 25 Mar, 2015 1 commit
-
-
Leigh B Stoller authored
are granted for free. Better than hardwiring it to seven in the code, and the new code treats zero as no free extensions for mere users.
-
- 04 Feb, 2015 1 commit
-
-
Leigh B Stoller authored
-
- 16 Jan, 2015 1 commit
-
-
Mike Hibler authored
This is the time in seconds that a frisbeed will hang around after the last time it receives a packet. Traditionally, this was 1800 (30 minutes!) but now we default it to 180.
-
- 12 Jan, 2015 1 commit
-
-
Kirk Webb authored
-
- 02 Dec, 2014 2 commits
-
-
Leigh B Stoller authored
XEN43-64-STD, but is XEN44-64-BIGFS on APT and probably Cloud.
-
Mike Hibler authored
-
- 12 May, 2014 1 commit
-
-
Leigh B Stoller authored
-
- 12 Feb, 2014 1 commit
-
-
Mike Hibler authored
For the Emulab configuration, we add the new site variable "images/frisbee/maxrate_dyn" which should be set non-zero to enable dynamic adjustment. If maxrate_dyn is enabled, then the maxrate_{std,usr} values are used as both the initial and maximum values for the BW of any instance. Really, if maxrate_dyn is on, then both of those should be set to the same value so that all servers are operating the same and the value should be just above the link BW.

For the "null" configuration (aka, the subboss configuration), this is set by adding command line options:

    -O dynamicbw=1,bandwidth=1100000000

which would enable it and start/cap the BW at 1.1Gb/sec.
-
- 08 Jan, 2014 1 commit
-
-
Leigh B Stoller authored
When sitevar general/xenvifrouting is true, use the mac of the physical host for the arp entry, since packets will be coming from the host itself (via proxy arp).
-
- 06 Jan, 2014 1 commit
-
-
Mike Hibler authored
-
- 17 Oct, 2013 1 commit
-
-
Leigh B Stoller authored
-
- 09 Aug, 2013 1 commit
-
-
Leigh B Stoller authored
-
- 17 Jun, 2013 1 commit
-
-
Mike Hibler authored
Controlled by new sitevars.
-
- 18 Jan, 2013 1 commit
-
-
Leigh B Stoller authored
the metadata even if there is nothing in the logfile. Mostly so that the URL link works and we can get the header info if needed. Also add slice_urn and slice_idx to the metadata so we find all the logs associated with a slice. Fixes.
-
- 06 Dec, 2012 1 commit
-
-
Leigh B Stoller authored
daemon.
-
- 30 Oct, 2012 1 commit
-
-
Mike Hibler authored
-