- 27 Feb, 2019 1 commit
-
-
Leigh B Stoller authored
-
- 12 Feb, 2019 1 commit
-
-
Leigh B Stoller authored
* Add a new Portal context menu option to nodes, to boot into "recovery" mode, which will be a Linux MFS (rather than the FreeBSD MFS, which 99% of users will not know what to do with).
* Plumb it all through to the Geni RPC interface, which invokes node_admin with a new option to use the recovery MFS nodetype attribute.
* recoverymfs_osid is a distinct osid from adminmfs_osid; we use that in the CM to add an Emulab namespace attribute to the manifest that tells the Portal that a node supports recovery mode (and thus gets a context menu option).
* Add an inrecovery flag to the sliver status blob, which the Portal uses to determine that a node is currently in recovery mode, so that we can indicate that in the topology and list tabs.
-
- 29 Nov, 2018 1 commit
-
-
Leigh B Stoller authored
-
- 16 Nov, 2018 1 commit
-
-
Mike Hibler authored
Entering admin mode clears nodes.startupcmd so we need to restore it afterward. This applies when either returning from taking an image or doing "node_admin off".
-
- 05 Nov, 2018 1 commit
-
-
Leigh B Stoller authored
-
- 26 Oct, 2018 1 commit
-
-
Leigh B Stoller authored
it is a jail, and its MAC is the same as boss.
-
- 16 Oct, 2018 1 commit
-
-
Mike Hibler authored
-
- 15 Oct, 2018 1 commit
-
-
Mike Hibler authored
Also, allow percentages (-P) of greater than 100%.
-
- 11 Oct, 2018 1 commit
-
-
Mike Hibler authored
-
- 10 Oct, 2018 2 commits
-
-
Mike Hibler authored
...and an option to specify whether to consider logical CPUs (hyperthreading), and an option to specify an absolute minimum load average to use when doing a percentage. The latter is for cases such as a node with 1 CPU (pc3000), where it would not be uncommon to have a load average > 1 even if nothing special is going on.
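
As a rough illustration of how a percentage threshold, the logical-CPU option, and the minimum load floor might combine, here is a minimal Python sketch. The function name, its parameters, and the exact formula are assumptions made for illustration; the commit message does not show the actual implementation.

```python
def load_threshold(percent, physical_cpus, logical_cpus=None,
                   use_logical=False, min_load=0.0):
    """Hypothetical helper: load average above which a node counts as busy.

    percent       -- percentage of the CPU count; may exceed 100
    physical_cpus -- number of physical cores
    logical_cpus  -- number of hyperthreads, if known
    use_logical   -- count logical CPUs instead of physical ones
    min_load      -- absolute floor for the resulting threshold
    """
    cpus = logical_cpus if (use_logical and logical_cpus) else physical_cpus
    # Assumed rule: scale the CPU count by the percentage, but never go
    # below the absolute minimum load average.
    return max(min_load, cpus * percent / 100.0)
```

Under these assumptions, a single-CPU node with a floor of 2.0 is not flagged at a load average of 1.5, even with a 150% setting, because the floor wins over the scaled CPU count.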
-
Leigh B Stoller authored
(reservation_autoapprove_limit) that overrides the site variable. Also expressed in node hours; zero means zero instead of unlimited.
-
- 09 Oct, 2018 1 commit
-
-
Mike Hibler authored
Periodically looks at the slothd RRD files collected on boss. This is just an initial attempt to see if doing this is feasible or if the false positive rate is just too high.
-
- 29 Aug, 2018 3 commits
-
-
Leigh B Stoller authored
the new Portal interface. Used by the web interface when a Classic user logs into the Portal for the first time.
-
Leigh B Stoller authored
-
Leigh B Stoller authored
-
- 20 Aug, 2018 1 commit
-
-
Mike Hibler authored
Also, avoid excess set statements in the dump file.
-
- 17 Aug, 2018 2 commits
-
-
Mike Hibler authored
-
Mike Hibler authored
Also add partial support for 11.2 MFS (just kernel right now, binaries are still 10.3).
-
- 13 Aug, 2018 1 commit
-
-
Leigh B Stoller authored
-
- 30 Jul, 2018 1 commit
-
-
Leigh B Stoller authored
-
- 13 Jul, 2018 1 commit
-
-
Mike Hibler authored
Cuz you can never have too many sitevars!
-
- 09 Jul, 2018 3 commits
-
-
Leigh B Stoller authored
-
Leigh B Stoller authored
hand). Also add enable sitevar since we run this only on clusters that support portstats on the control network.
-
Leigh B Stoller authored
easily get to the experiment (or portal status page).
-
- 25 Jun, 2018 1 commit
-
-
Mike Hibler authored
-
- 21 Jun, 2018 1 commit
-
-
Leigh B Stoller authored
-
- 19 Jun, 2018 1 commit
-
-
Leigh B Stoller authored
-
- 18 Jun, 2018 4 commits
-
-
Leigh B Stoller authored
-
Leigh B Stoller authored
-
Leigh B Stoller authored
-
Leigh B Stoller authored
* If unused at six hours, schedule for cancel in three hours and send email.
* If the reservation becomes used within those three hours, rescind the cancellation.
* Add an override bit so that cancel/uncancel on the command line supersedes the automated check (an explicit cancel, or rescinding a cancel, means no more automated checks for unused).
* Rework cancel to be more library friendly.
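
The unused-reservation policy described in this commit can be sketched as a small state machine. This is a minimal Python illustration, not the actual Emulab code: the `Reservation` class, its field names, and the helper functions are all hypothetical, and the email notification is only noted in a comment.

```python
from dataclasses import dataclass
from datetime import datetime, timedelta
from typing import Optional

UNUSED_GRACE = timedelta(hours=6)   # unused this long => schedule a cancel
CANCEL_DELAY = timedelta(hours=3)   # cancel fires this long after scheduling


@dataclass
class Reservation:                  # hypothetical record, for illustration
    start: datetime
    used: bool = False
    cancel_at: Optional[datetime] = None
    override: bool = False          # set by an explicit cancel/uncancel


def check_unused(res: Reservation, now: datetime) -> str:
    """Run one automated check; return 'none', 'schedule', or 'rescind'."""
    if res.override:
        return "none"               # manual action supersedes automation
    if res.cancel_at is not None:
        if res.used:
            res.cancel_at = None    # became used in time: rescind
            return "rescind"
        return "none"
    if not res.used and now - res.start >= UNUSED_GRACE:
        res.cancel_at = now + CANCEL_DELAY
        return "schedule"           # the real system would also send email
    return "none"


def manual_cancel(res: Reservation, when: datetime) -> None:
    res.cancel_at = when
    res.override = True


def manual_uncancel(res: Reservation) -> None:
    res.cancel_at = None
    res.override = True
```

Keeping the override as a separate bit, rather than inferring it from `cancel_at`, is what lets a manual uncancel suppress later automated checks even though no cancel is pending.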
-
- 09 Jun, 2018 1 commit
-
-
Leigh B Stoller authored
image.
-
- 08 Jun, 2018 1 commit
-
-
Leigh B Stoller authored
elsewhere to compute reservation utilization.
-
- 06 Jun, 2018 2 commits
-
-
David Johnson authored
-
Leigh B Stoller authored
-
- 05 Jun, 2018 1 commit
-
-
David Johnson authored
-
- 04 Jun, 2018 2 commits
-
-
David Johnson authored
The docker VM server-side goo is mostly identical to Xen, with slightly different handling for parent images. We also support loading external Docker images (i.e. those without a real imageid in our DB); in that case, the user has to set a specific stub image and some extra per-vnode metadata (a URI that points to a Docker registry/image repo/tag), and the Docker clientside handles the rest.

Emulab Docker images map to an Emulab imageid:version pretty seamlessly. For instance, the Emulab `emulab-ops/docker-foo-bar:1` image would map to `<local-registry-URI>/emulab-ops/emulab-ops/docker-foo-bar:1`; the mapping is `<local-registry-URI>/pid/gid/imagename:version`. Docker repository names are lowercase-only, so we handle that for the user, but I would prefer that users use lowercase Emulab imagenames for all Docker images; that will help us. That is not enforced in the code; it will appear in the documentation, and we'll see.

Full Docker imaging relies on several other libraries (https://gitlab.flux.utah.edu/emulab/pydockerauth, https://gitlab.flux.utah.edu/emulab/docker-registry-py). Each Emulab-based cluster must currently run its own private registry to support image loading/capture (note, however, that if capture is unnecessary, users can use the external images path instead). The pydockerauth library is a JWT token server that runs out of boss's Apache and implements authn/authz for the per-Emulab Docker registry (probably running on ops, but could be anywhere) that stores images and arbitrates upload/download access. For instance, nodes in an experiment securely pull images using their pid/eid eventkey, and the pydockerauth emulab authz module knows what images the node is allowed to pull (e.g. sched_reloads, the current image the node is running, etc). Real users can also pull images via user/pass, or bogus user/pass + Emulab SSL cert. GENI credential-based authn/z was way too much work, sadly. There are other auth/z paths (e.g. for admins, temp tokens for secure operations) as well.

As far as Docker image distribution in the federation goes, we use the same model as for regular ndz images. Remote images are pulled in to the local cluster's Docker registry on-demand from their source cluster via admin token auth (note that all clusters in the federation have read-only access to the entire registries of any other cluster in the federation, so they can pull images). Emulab imageid handling is the same as the existing ndz case. For instance, image versions are lazily imported, on-demand; local version numbers may not match the remote image source cluster's version numbers. This will potentially be a bigger problem in the Docker universe; Docker users expect to be able to reference any image version at any time anywhere. But that is of course handleable with some ex post facto synchronization flag day, at least for the Docker images.

The big new thing supporting native Docker image usage is the guts of a refactor of the utils/image* scripts into a new library, libimageops; this is necessary to support Docker images, which are stored in their own registry using their own custom protocols, and so are not amenable to our file-based storage. Note: the utils/image* scripts currently call out to libimageops *only if* the image format is docker; all other images continue on the old paths in utils/image*, which all still remain intact, or are minorly changed to support libimageops. libimageops->New is the factory-style mechanism to get a libimageops that works for your image format or node type. Once you have a libimageops instance, you can invoke normal image logical operations (CreateImage, ImageValidate, ImageRelease, et al). I didn't do every single operation (for instance, I haven't yet dealt with image_import beyond essentially generalizing DownLoadImage by image format). Finally, each libimageops is stateless; another design would have been some statefulness for more complicated operations. You will see that CreateImage, for instance, is written in a helper-subclass style that blurs some statefulness; however, it was the best match for the existing body of code. We can revisit that later if the current argument-passing convention isn't loved.

There are a couple of outstanding issues. Part of the security model here is that some utils/image* scripts are setuid, so direct libimageops library calls are not possible from a non-setuid context for some operations. This is non-trivial to resolve, and might not be worthwhile to resolve any time soon. Also, some of the scripts write meaningful, traditional content to stdout/stderr, and this creates a tension for direct library calls that is not entirely resolved yet. Not hard, just only partly resolved. Note that tbsetup/libimageops_ndz.pm.in is still incomplete; it needs imagevalidate support. Thus, I have not even featurized this yet; I will get to that as I have cycles.
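
The `<local-registry-URI>/pid/gid/imagename:version` mapping described in this commit message, including the lowercasing of repository names, can be illustrated with a minimal Python sketch. The function name and the registry host used in the example are hypothetical; the real logic lives in the Emulab Perl code and may differ in detail.

```python
def docker_image_ref(registry_uri, pid, gid, imagename, version):
    """Hypothetical helper: build a registry reference for an Emulab image.

    Docker repository names must be lowercase, so the pid/gid/imagename
    path components are lowercased; the registry URI is left untouched.
    """
    repo = "/".join(part.lower() for part in (pid, gid, imagename))
    return f"{registry_uri}/{repo}:{version}"


# Example with a made-up registry host:
ref = docker_image_ref("registry.example.net:5000",
                       "emulab-ops", "emulab-ops", "docker-foo-bar", 1)
```

With those inputs, `ref` follows the same shape as the `emulab-ops/docker-foo-bar:1` example in the message, with the made-up host standing in for `<local-registry-URI>`.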
-
Leigh B Stoller authored
Initially intended for debugging, but now it's more useful. :-)
-
- 30 May, 2018 1 commit
-
-
Leigh B Stoller authored
caution.
-
- 18 May, 2018 1 commit
-
-
Mike Hibler authored
-