- 04 Jun, 2018 1 commit
Leigh B Stoller authored
Initially intended for debugging, but now it's more useful. :-)

- 16 Apr, 2018 1 commit
Leigh B Stoller authored
added to the testbed. Useful when we know that all of the old images will work on the new types.

- 21 Feb, 2018 1 commit
Leigh B Stoller authored

- 27 Nov, 2017 1 commit
Leigh B Stoller authored
directories from the proj and groups tree.

- 13 Oct, 2017 1 commit
Leigh B Stoller authored

- 12 Sep, 2017 1 commit
Mike Hibler authored
The sitevars are a bit obscure:

# cnetwatch/check_interval
# Interval at which to collect info.
# Zero means don't run cnetwatch (exit immediately).
#
# cnetwatch/alert_interval
# Interval over which to calculate packet/bit rates and to log alerts.
# Should be an integer multiple of the check_interval.
#
# cnetwatch/pps_threshold
# Packet rate (packets/sec) in excess of which to log an alert.
# Zero means don't generate packet rate alerts.
#
# cnetwatch/bps_threshold
# Data rate (bits/sec) in excess of which to log an alert.
# Zero means don't generate data rate alerts.
#
# cnetwatch/mail_interval
# Interval at which to send email for all alerts logged during the interval.
# Zero means don't ever send email.
#
# cnetwatch/mail_max
# Maximum number of alert emails to send; after this alerts are only logged.
# Zero means no limit to the emails.

Basically, you can tweak pps_threshold and bps_threshold to define what you think an unusual "burst" of cnet traffic is, and then alert_interval to determine how long a burst has to last before an alert is sent.

Why would you have check_interval less than alert_interval? You probably wouldn't, unless you want to record finer-grained port stats using the -l option to write stats to a logfile. We do it on the mothership as a data source for some student machine learning projects.

Note that in an environment with lots of control net switches, a single pass of gathering port counters from the switches could take 30 seconds or longer (on the mothership it can take minutes), so don't set check_interval too low.

The mail_* variables are paranoia about sending too much email due to runaway nodes. The mail_interval just coalesces alerts to reduce messages, and mail_max is the maximum number of emails that one instance of cnetwatch will send. The latter is a pretty silly mechanism, as a long-running cnetwatch will probably hit the limit legitimately after six months or so and you will have to restart it.

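For example, to alert on any node that sustains more than 50,000 packets/sec or 100 Mbit/sec over a one-minute interval, checking once a minute, something like this should work (a sketch, assuming the standard setsitevar utility; the threshold values are purely illustrative):

    boss> wap setsitevar cnetwatch/check_interval 60
    boss> wap setsitevar cnetwatch/alert_interval 60
    boss> wap setsitevar cnetwatch/pps_threshold 50000
    boss> wap setsitevar cnetwatch/bps_threshold 100000000
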
- 30 Aug, 2017 1 commit
Leigh B Stoller authored

- 23 Aug, 2017 1 commit
Leigh B Stoller authored
stuff in them.

- 18 Aug, 2017 1 commit
Leigh B Stoller authored
the DB please.

- 26 Jul, 2017 1 commit
Mike Hibler authored
Provide automated setup of an ssh keypair enabling root to login without a password between nodes. The biggest challenge here is to get the private key onto nodes in such a way that a non-root user on those nodes cannot obtain it; otherwise that user would be able to ssh as root to any node. This precludes simple distribution of the private key using tmcd/tmcc, as any user can do a tmcc (tmcd authentication is based on the node, not the user).

This version does a post-imaging "push" of the private key from boss using ssh. The key is pushed from tbswap after nodes are imaged but before the event system, and thus any user startup scripts, are started. We actually use "pssh" (really "pscp") to scale a bit better, so YOU MUST HAVE THE PSSH PACKAGE INSTALLED. Be sure to do a:

    pkg install -r Emulab pssh

on your boss node. See the new utils/pushrootkeys.in script for more.

The public key is distributed via the "tmcc localization" command, which was already designed to handle adding multiple public keys to root's authorized_keys file on a node. This approach should be backward compatible with old images.

I BUMPED THE VERSION NUMBER OF TMCD so that newer clients can also get back (via rc.localize) a list of keys and the names of the files they should be stashed in. This is used to allow us to pass along the SSL and SSH versions of the public key so that they can be placed in /root/.ssl/<node>.pub and /root/.ssh/id_rsa.pub respectively. Note that this step is not necessary for inter-node ssh to work.

Also passed along is an indication of whether the returned key is encrypted. This might be used in Round 2 if we securely implant a shared secret on every node at imaging time and then use that to encrypt the ssh private key, so that we can return it via rc.localize. But the client side script currently does not implement any decryption, so the client side would need to be changed again in the future.

The per-experiment root keypair mechanism has been exposed to the user via old school NS experiments right now by adding a node "rootkey" method. To export the private key to "nodeA" and the public key to "nodeB" do:

    $nodeA rootkey private 1
    $nodeB rootkey public 1

This enables an asymmetric relationship such that "nodeA" can ssh into "nodeB" as root, but not vice versa. For a symmetric relationship you would do:

    $nodeA rootkey private 1
    $nodeB rootkey private 1
    $nodeA rootkey public 1
    $nodeB rootkey public 1

These user specifications will be overridden by hardwired Emulab restrictions. The current restrictions are that we do *not* distribute a root pubkey to tainted nodes (as it opens a path to root on a node where no one should be root), or any keys at all to firewall nodes, virtnode hosts, delay nodes, subbosses, storagehosts, etc., which are not really part of the user topology.

For more on how we got here and what might happen in Round 2, see: #302

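A quick way to convince yourself the asymmetric setup above took effect (node names hypothetical; the second command should fail, since nodeB was given no private key):

    nodeA> sudo ssh root@nodeB true    # succeeds without a password
    nodeB> sudo ssh root@nodeA true    # refused
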
- 05 Jun, 2017 1 commit
Leigh B Stoller authored
Add new script to "deprecate" images:

    boss> wap deprecate_image
    Usage: deprecate_image [-e|-w] <image> [warning message to users]
    Options:
       -e     Use of image is an error; default is warning
       -w     Use of image is a warning

When an image is deprecated with just warnings, new classic experiments generate warnings in the output. Swapping in an existing experiment also generates warnings in the output, and additionally sends email to the user. When the image is set for error, both new experiments and swapins will fail with prejudice. Same deal on the Geni path; we generate warnings/errors and send email. Errors are reflected back in the Portal interface.

At the moment the image server knows nothing about deprecated images, so the Portal constraint checker will not be bothered, nor tell the user, until later when the cluster throws an error. As a result, when we deprecate an image, we need to do it on all clusters. Need to think about this a bit more.

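For instance, following the usage above (image names hypothetical):

    boss> wap deprecate_image UBUNTU14-64-STD "Please move to a newer Ubuntu image"
    boss> wap deprecate_image -e UBUNTU12-64-STD "This image no longer boots on current hardware"
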
- 30 May, 2017 1 commit
Mike Hibler authored
Add a setzfsquotas script to handle fixup of existing quotas, add an update script to do a one-time invocation of this script at boss-install time, and fix accountsetup so that it properly sets both quotas going forward.

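Presumably the two quotas involved are the ZFS quota and refquota properties; setting them by hand on the file server would look something like this (dataset name and size are illustrative):

    zfs set quota=100G z/users/joe
    zfs set refquota=100G z/users/joe
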
- 04 May, 2017 1 commit
Gary Wong authored

- 14 Mar, 2017 1 commit
Leigh B Stoller authored
up millions of rows and tell us nothing useful.

- 20 Jan, 2017 1 commit
Gary Wong authored

- 06 Jan, 2017 1 commit
Gary Wong authored
(By assuming nobody swaps in, extends, voluntarily swaps out, or requests reservations.)

- 07 Nov, 2016 1 commit
Leigh B Stoller authored

- 12 Oct, 2016 1 commit
Leigh B Stoller authored
Usage: createimagealias [-r] <image> target1,target2,...
   -h     This message
   -r     Delete alias

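So creating an alias that maps to two target images, and later deleting it, would presumably look like (alias and image names hypothetical):

    boss> wap createimagealias MYALIAS UBUNTU16-64,CENTOS7-64
    boss> wap createimagealias -r MYALIAS
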
- 17 Jun, 2016 1 commit
Mike Hibler authored

- 26 May, 2016 1 commit
Jonathon Duerig authored

- 25 May, 2016 1 commit
Gary Wong authored
Right now this is strictly advisory. In particular, swap-ins go through the normal path and are NOT forced to comply with admission control wrt future reservations; therefore, reservations don't yet come with any guarantees at all.

- 12 Apr, 2016 1 commit
Leigh B Stoller authored
interface; I do not want to reimplement any feature stuff in PHP, and it's fast enough to call out to the perl script.

- 11 Apr, 2016 1 commit
Gary Wong authored

- 23 Mar, 2016 1 commit
Mike Hibler authored

- 21 Mar, 2016 1 commit
Leigh B Stoller authored
from the server, keeping it in sync with the server as new versions of the image are added. Also handles importing deltas if the metadata says there is a delta. Note that downloading the image files is still lazy; we will not import all 15 versions of an image unless they are actually needed.

Lots of work still to do. This is a bit of a nightmare because of client/server (backward) compatibility issues wrt provenance/noprovenance and deltas/nodeltas. I might change my mind and say the hell with compatibility!

Along these same lines, there is the issue of what to do when a site that is running with provenance turned on gets this new code. Up to now, the client and server never tried to stay in sync, but now they have to (because of deltas), and so the client image descriptors have to be upgraded. That will be a hassle too.

- 08 Dec, 2015 1 commit
Gary Wong authored

- 15 May, 2015 1 commit
Leigh B Stoller authored
Soon, we will have images with both full images and deltas for the same image version. To make this possible, the image path will now be a directory instead of a file, and all of the version files (ndz, sig, sha1, delta) will reside in that directory. A new config variable IMAGEDIRECTORIES turns this on; there is also a check for the ImageDirectories feature. This is applied only when a brand new image is created; a clone version of the image inherits the path it started with. Yes, you can have a mix of directory based and file based image descriptors.

When it is time to convert all images over, there is a script called imagetodir that will go through all image descriptors, create the directory, move/rename all the files, and update the descriptors. Ultimately, we will not support file based image paths.

I also added versioning to the image metadata descriptors so that, going forward, old clients can handle a descriptor from a new server.

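After conversion with imagetodir, a directory based image would look roughly like this on disk (path and file names are illustrative; the exact layout may differ):

    boss> ls /proj/myproj/images/myimage/
    myimage.ndz    myimage.ndz.sha1    myimage.ndz.sig

with any additional versions and deltas of the image sitting alongside in the same directory.
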
- 05 Mar, 2015 1 commit
Leigh B Stoller authored

- 27 Jan, 2015 1 commit
Leigh B Stoller authored
1) Implement the latest dataset read/write access settings from frontend to backend. Also updates for simultaneous read-only usage.

2) New configure options: PROTOGENI_LOCALUSER and PROTOGENI_GENIWEBLOGIN.

The first changes the way that projects and users are treated at the CM. When set, we create real accounts (marked as nonlocal) for users and also create real projects (also marked as nonlocal). Users are added to those projects according to their credentials. The underlying experiment is thus owned by the user and in the project, although all the work is still done by the geniuser pseudo user. The advantage of this approach is that we can use standard Emulab access checks to control access to objects like datasets. Maybe images too at some point. NOTE: users are not removed from projects once they are added; we are going to need to deal with this, perhaps by adding an expiration stamp to the groups_membership tables and using the credential expiration to mark it.

The second option turns on web login via the Geni trusted signer. So, if I create a sliver on a backend cluster when both options are set, I can use the trusted signer to log into my newly created account on the cluster and see it (via the Emulab classic web interface).

All of this is in flux; it might end up being a bogus approach in the end.

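Like other Emulab configure options, these would presumably be switched on in the site's defs file and picked up on the next build (a sketch, assuming the usual defs-file workflow; paths are illustrative):

    # In your defs file:
    PROTOGENI_LOCALUSER=1
    PROTOGENI_GENIWEBLOGIN=1

    # Then rebuild and reinstall on boss:
    boss> cd obj && ../configure --with-TBDEFS=/path/to/defs && gmake && gmake boss-install
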
- 15 Dec, 2014 1 commit
Leigh B Stoller authored

- 25 Nov, 2014 1 commit
Mike Hibler authored
Keeping them up to date throughout the node lifecycle is not a lot of fun...

- 04 Nov, 2014 1 commit
Leigh B Stoller authored
Usage: runsonxen [-p <parent>] <imageid>
       runsonxen -a [-p <parent>]
       runsonxen -c <imageid>
Options:
   -n     Impotent mode
   -c     Clear XEN parent settings completely
   -a     Operate on all current XEN capable images
   -p     Set default parent; currently XEN43-64-STD

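For example, to mark a single image as runnable on XEN with the current default parent, or to sweep all XEN capable images with an explicit parent (image name hypothetical):

    boss> wap runsonxen MYIMAGE
    boss> wap runsonxen -a -p XEN43-64-STD
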
- 28 Oct, 2014 1 commit
Leigh B Stoller authored

- 09 Jul, 2014 1 commit
Leigh B Stoller authored

- 01 Jul, 2014 1 commit
Leigh B Stoller authored

- 13 Jun, 2014 1 commit
Leigh B Stoller authored
down the entire testbed.

- 06 Jun, 2014 1 commit
Leigh B Stoller authored
was driving me nuts that we do not have an easy way to see what is going on *inside* the fabric. So this one reports on traffic across trunk links and interconnects out of the fabric. Basic operation is pretty simple:

    Usage: switch_traffic [-rs] [-i seconds] [switch[:switch] ...]
    Reports traffic across trunk links and interconnects
       -h           This message
       -i seconds   Show stats over a <seconds>-period interval

With no arguments it will give portstats style output of all trunk links and interconnects in the database. Trunk links are aggregate numbers for all of the trunk wires that connect two switches. The -i option gives traffic over an interval, which is much more useful than the raw packet numbers, since on most of our switches those numbers have probably rolled over a few times.

You can optionally specify specific switches and interconnects on the command line. For example:

    boss> wap switch_traffic -i 10 cisco3 ion
    Trunk             InOctets    InUpkts   InNUpkts  ...
    ------------------------------------------------  ...
    cisco3:cisco10         128          0          1  ...
    cisco3:cisco8         2681          7          4  ...
    cisco3:cisco1         4493         25          7  ...
    cisco3:cisco9          192          0          1  ...
    cisco3:cisco4          128          0          2  ...
    pg-atla:ion              0          0          0  ...
    pg-hous:ion              0          0          0  ...
    pg-losa:ion              0          0          0  ...
    pg-salt:ion           2952          0         42  ...
    pg-wash:ion              0          0          0  ...

NOTE that the above output is abbreviated so it does not wrap in the git log, but you get the idea. Or you can specify a specific trunk link:

    boss> wap switch_traffic -i 10 cisco3:cisco8

Okay, this is all pretty basic, and eventually it would be nice to take these numbers and feed them into mrtg or rrdtool so we can view pretty graphs, but this is as far as I can take it for now. Maybe in the short term it would be enough to record the numbers every 5 minutes or so and put the results into a file.

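Following that last thought, a crontab entry on boss along these lines would capture a 10-second sample every five minutes (the install and log paths are illustrative):

    */5 * * * * /usr/testbed/sbin/switch_traffic -i 10 >> /usr/testbed/log/switch_traffic.log 2>&1
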
- 09 May, 2014 1 commit
Mike Hibler authored
This should be run whenever an image is created or updated, and possibly periodically over existing images. It makes sure that various image metadata fields are up to date:

 * hash: the SHA1 hash of the image. This field has been around for awhile and was previously maintained by "imagehash".
 * size: the size of the image file.
 * range: the sector range covered by the uncompressed image data.
 * mtime: modification time of the image. This is the "updated" datetime field in the DB. Its intent was always to track the update time of the image, but it wasn't always exact (create-image would update it with the current time at the start of the image capture process).

Documentation? Umm...the usage message is comprehensive! It sports a variety of useful options, but the basics are:

 * imagevalidate -p <image> ...
   Print current DB metadata for the indicated images. <image> can either be a <pid>/<imagename> string or the numeric imageid.
 * imagevalidate <image> ...
   Check the mtime, size, hash, and image range of the image file, compare them to the values in the DB, and whine about the ones which are out of date.
 * imagevalidate -u <image> ...
   Compare and then update DB metadata fields that are out of date.

Fixed a variety of scripts that either used imagehash or computed the SHA1 hash directly to now use imagevalidate.

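So a typical check-then-fix pass over one image would be (image name hypothetical):

    boss> wap imagevalidate emulab-ops/UBUNTU14-64-STD
    boss> wap imagevalidate -u emulab-ops/UBUNTU14-64-STD
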
- 17 Mar, 2014 1 commit
Kirk Webb authored
This will currently work with os descriptors and nodes.

- 21 Jan, 2014 1 commit
Leigh B Stoller authored
the resource mapper. To add a node restriction:

    boss> node_exclude pcXXX

To remove the restriction:

    boss> node_exclude -r pcXXX
