1. 05 Jun, 2018 6 commits
  2. 04 Jun, 2018 8 commits
    • David Johnson's avatar
      Docker server-side core, esp new libimageops support for Docker images. · 66366489
      David Johnson authored
      The docker VM server-side goo is mostly identical to Xen, with slightly
      different handling for parent images.  We also support loading external
      Docker images (i.e. those without a real imageid in our DB); in that
      case, the user has to set a specific stub image and some extra per-vnode
      metadata (a URI that points to a Docker registry/image repo/tag), and
      the Docker clientside handles the rest.
      
      Emulab Docker images map to an Emulab imageid:version pretty seamlessly.
      For instance, the Emulab `emulab-ops/docker-foo-bar:1` image would map
      to `<local-registry-URI>/emulab-ops/emulab-ops/docker-foo-bar:1`; the
      mapping is `<local-registry-URI>/pid/gid/imagename:version`.  Docker
      repository names are lowercase-only, so we handle that for the user; but
      I would prefer that users use lowercase Emulab imagenames for all Docker
      images; that will help us.  That is not enforced in the code; it will
      appear in the documentation, and we'll see.
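
A minimal sketch of the mapping described above, in Python for illustration only (it is not the server-side code). The function name and the registry host are made up; the lowercasing mirrors the note that Docker repository names are lowercase-only.

```python
def docker_repo_ref(registry, pid, gid, imagename, version):
    """Build <registry>/pid/gid/imagename:version with a lowercase repository path."""
    # Docker repository names must be lowercase, so normalize each component.
    repo = "/".join(part.lower() for part in (pid, gid, imagename))
    return f"{registry}/{repo}:{version}"


# Hypothetical registry host; a mixed-case imagename is folded to lowercase.
print(docker_repo_ref("registry.example.net", "emulab-ops", "emulab-ops",
                      "Docker-Foo-Bar", 1))
```

One consequence of the folding is that two Emulab imagenames differing only in case land on the same repository, which is one reason to prefer lowercase imagenames up front.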
      
      Full Docker imaging relies on several other libraries
      (https://gitlab.flux.utah.edu/emulab/pydockerauth,
      https://gitlab.flux.utah.edu/emulab/docker-registry-py).  Each
      Emulab-based cluster must currently run its own private registry to
      support image loading/capture (note however that if capture is
      unnecessary, users can use the external images path instead).  The
      pydockerauth library is a JWT token server that runs out of boss's
      Apache and implements authn/authz for the per-Emulab Docker registry
      (probably running on ops, but could be anywhere) that stores images and
      arbitrates upload/download access.  For instance, nodes in an experiment
      securely pull images using their pid/eid eventkey; and the pydockerauth
      emulab authz module knows what images the node is allowed to pull
      (e.g. sched_reloads, the current image the node is running, etc).  Real
      users can also pull images via user/pass, or bogus user/pass + Emulab
      SSL cert.  GENI credential-based authn/z was way too much work, sadly.
      There are other auth/z paths (e.g. for admins, temp tokens for secure
      operations) as well.
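
To illustrate the node authz decision above, here is a rough Python sketch. It assumes the registry asks the token server for a scope in the usual Docker registry token form, `repository:<name>:<actions>`, and that the set of repositories a node may pull has already been computed. The function names are hypothetical and this is not pydockerauth's actual code.

```python
def parse_scope(scope):
    """Split a scope such as 'repository:pid/gid/name:pull,push'."""
    resource_type, rest = scope.split(":", 1)
    # The action list follows the last colon; the name itself may contain colons.
    name, actions = rest.rsplit(":", 1)
    return resource_type, name, set(actions.split(","))


def granted_actions(scope, pullable_repos):
    """Return the subset of requested actions a node is allowed to perform."""
    resource_type, name, requested = parse_scope(scope)
    if resource_type != "repository":
        return set()
    # Nodes only ever get pull access, and only to repositories on their list.
    allowed = {"pull"} if name in pullable_repos else set()
    return requested & allowed


node_repos = {"emulab-ops/emulab-ops/docker-foo-bar"}
print(granted_actions("repository:emulab-ops/emulab-ops/docker-foo-bar:pull,push",
                      node_repos))
```

The token handed back would then list only the granted actions, so a node that asks for push simply does not get it.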
      
      As far as Docker image distribution in the federation, we use the same
      model as for regular ndz images.  Remote images are pulled in to the
      local cluster's Docker registry on-demand from their source cluster via
      admin token auth (note that all clusters in the federation have
      read-only access to the entire registries of any other cluster in the
      federation, so they can pull images).  Emulab imageid handling is the
      same as the existing ndz case.  For instance, image versions are lazily
      imported, on-demand; local version numbers may not match the remote
      image source cluster's version numbers.  This will potentially be a
      bigger problem in the Docker universe; Docker users expect to be able to
      reference any image version at any time anywhere.  But that is of course
      handleable with some ex post facto synchronization flag day, at least
      for the Docker images.
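
A small Python sketch of why local and remote version numbers drift under lazy import. The class, the key layout, and the choice to start local versions at 0 are all assumptions made for illustration, not the actual imageid handling.

```python
class VersionMap:
    """Assign local version numbers to remote image versions on first import."""

    def __init__(self):
        self._local = {}  # (cluster, image, remote_version) -> local version
        self._next = {}   # image -> next unused local version

    def import_version(self, cluster, image, remote_version):
        key = (cluster, image, remote_version)
        if key not in self._local:
            # First request for this remote version: take the next local slot.
            local = self._next.get(image, 0)
            self._local[key] = local
            self._next[image] = local + 1
        return self._local[key]


vmap = VersionMap()
# Remote version 7 is requested first, so it becomes local version 0.
print(vmap.import_version("remote-cluster", "docker-foo-bar", 7))
# Remote version 3 is requested later and becomes local version 1.
print(vmap.import_version("remote-cluster", "docker-foo-bar", 3))
```

Local numbering follows import order rather than the source cluster's numbering, which is exactly what a Docker user referencing a tag by number would not expect.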
      
      The big new thing supporting native Docker image usage is the guts of a
      refactor of the utils/image* scripts into a new library, libimageops;
      this is necessary to support Docker images, which are stored in their
      own registry using their own custom protocols, so not amenable to our
      file-based storage.  Note: the utils/image* scripts currently call out
      to libimageops *only if* the image format is docker; all other images
      continue on the old paths in utils/image*, which all still remain
      intact, or minorly-changed to support libimageops.
      
      libimageops->New is the factory-style mechanism to get a libimageops
      that works for your image format or node type.  Once you have a
      libimageops instance, you can invoke normal image logical operations
      (CreateImage, ImageValidate, ImageRelease, et al).  I didn't do every
      single operation (for instance, I haven't yet dealt with image_import
      beyond essentially generalizing DownLoadImage by image format).
      Finally, each libimageops is stateless; another design would have been
      some statefulness for more complicated operations.   You will see that
      CreateImage, for instance, is written in a helper-subclass style that
      blurs some statefulness; however, it was the best match for the existing
      body of code.  We can revisit that later if the current argument-passing
      convention isn't loved.
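
For readers unfamiliar with the pattern, the factory-plus-stateless-operations shape described above looks roughly like this in Python. Only the operation name CreateImage and the docker/ndz formats come from the text; the class names, the dispatch table, and the return strings are placeholders.

```python
class ImageOps:
    """Base class: holds no per-operation state, everything arrives as arguments."""

    def CreateImage(self, image, node):
        raise NotImplementedError


class DockerImageOps(ImageOps):
    def CreateImage(self, image, node):
        # Placeholder result; the real operation talks to a Docker registry.
        return f"docker: create {image} from {node}"


class NdzImageOps(ImageOps):
    def CreateImage(self, image, node):
        # Placeholder result; the real operation writes a file-based image.
        return f"ndz: create {image} from {node}"


_BY_FORMAT = {"docker": DockerImageOps, "ndz": NdzImageOps}


def new_image_ops(image_format):
    """Factory: return an ImageOps implementation for the given image format."""
    try:
        return _BY_FORMAT[image_format]()
    except KeyError:
        raise ValueError(f"unsupported image format: {image_format}") from None


ops = new_image_ops("docker")
print(ops.CreateImage("emulab-ops/docker-foo-bar", "pcvm1-1"))
```

Because the instances carry no state, callers can fetch one per call or share one freely; the cost is that longer operations have to thread their context through arguments.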
      
      There are a couple outstanding issues.  Part of the security model here
      is that some utils/image* scripts are setuid, so direct libimageops
      library calls are not possible from a non-setuid context for some
      operations.  This is non-trivial to resolve, and might not be worthwhile
      to resolve any time soon.  Also, some of the scripts write meaningful,
      traditional content to stdout/stderr, and this creates a tension for
      direct library calls that is not entirely resolved yet.  Not hard, just
      only partly resolved.
      
      Note that tbsetup/libimageops_ndz.pm.in is still incomplete; it needs
      imagevalidate support.  Thus, I have not even featurized this yet; I
      will get to that as I have cycles.
    • Leigh Stoller's avatar
      c5259a31
    • Leigh Stoller's avatar
      Fix a bug that was introduced when we shifted to using os_setup · e59fc714
      Leigh Stoller authored
      directly (on the Cloudlab clusters); we were losing a lock out that
      allowed DeleteSliver() to run while in the middle of a CreateSliver().
      This was resulting in a lot of email about node failures since the nodes
      were getting yanked out from underneath the CreateSliver(). From the
      user perspective, this did not matter much, since they wanted the slice
      gone, but it finally bothered me enough to look more closely.
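
The kind of lockout described above can be sketched in Python as follows. This is only an illustration of the idea: the class, the method names, and the choice to report "busy" rather than wait are assumptions, not how the real CreateSliver()/DeleteSliver() are written.

```python
import threading


class Slice:
    def __init__(self):
        self._lock = threading.Lock()
        self.nodes = []

    def create_sliver(self, nodes, during_setup=None):
        # Hold the lock for the whole setup so nothing can yank the nodes away.
        with self._lock:
            self.nodes = list(nodes)
            if during_setup:
                during_setup()  # stands in for the long-running node setup
            return "created"

    def delete_sliver(self):
        # Refuse to run while a create is still in progress.
        if not self._lock.acquire(blocking=False):
            return "busy"
        try:
            self.nodes = []
            return "deleted"
        finally:
            self._lock.release()


s = Slice()
attempts = []
# A delete that arrives in the middle of the create is turned away.
s.create_sliver(["node1"], during_setup=lambda: attempts.append(s.delete_sliver()))
print(attempts, s.nodes)
```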
    • Leigh Stoller's avatar
      adefb6f5
    • Leigh Stoller's avatar
      New script to compute reservation timelines and utilization number. · 872a5af1
      Leigh Stoller authored
      Initially intended for debugging, but now it's more useful. :-)
    • Leigh Stoller's avatar
      Minor fix. · 2692b61f
      Leigh Stoller authored
    • Leigh Stoller's avatar
      bbf42391
    • Leigh Stoller's avatar
      Minor changes. · ed2dce21
      Leigh Stoller authored
  3. 01 Jun, 2018 2 commits
  4. 31 May, 2018 9 commits
  5. 30 May, 2018 15 commits
    • Leigh Stoller's avatar
      Bug fix. · 5d264a32
      Leigh Stoller authored
    • Leigh Stoller's avatar
      Add support for linkwide properties which are far more efficient wrt the · aba79edd
      Leigh Stoller authored
      XML size on really big lans. I do not expect this to be used very often,
      but it is handy. On the geni-lib side:
      
      class setProperties(object):
          """Added to a Link or LAN object, this extension tells Emulab based
          clusters to set the symmetrical properties of the entire link/lan to
          the desired characteristics (bandwidth, latency, plr). This produces
          more efficient XML than setting a property on every source/destination
          pair, especially on a very large lan. Bandwidth is in Kbps, latency in
          milliseconds, plr a floating point number between 0 and 1. Use keyword
          based arguments, all arguments are optional:
      
              link.setProperties(bandwidth=100000, latency=10, plr=0.5)
      
          """
      aba79edd
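
To make the XML-size argument concrete, here is a back-of-the-envelope Python sketch comparing the two encodings for a 50-interface LAN. It assumes one property element per ordered source/destination pair in the per-pair case. The `link_properties` element name is invented for illustration and is not the actual Emulab extension element.

```python
import xml.etree.ElementTree as ET
from itertools import permutations


def per_pair_link(ifaces, **props):
    """One <property> element for every ordered source/destination pair."""
    link = ET.Element("link")
    for src, dst in permutations(ifaces, 2):
        ET.SubElement(link, "property", source_id=src, dest_id=dst, **props)
    return link


def linkwide_link(ifaces, **props):
    """A single element carrying the symmetric properties of the whole LAN."""
    link = ET.Element("link")
    ET.SubElement(link, "link_properties", **props)  # hypothetical element name
    return link


ifaces = [f"node{i}:if0" for i in range(50)]
props = dict(capacity="100000", latency="10", packet_loss="0.5")
pairwise = per_pair_link(ifaces, **props)
linkwide = linkwide_link(ifaces, **props)
print(len(pairwise), len(linkwide))  # 2450 elements versus 1
print(len(ET.tostring(pairwise)), len(ET.tostring(linkwide)))
```

The per-pair count grows as n(n-1), so the difference only matters on really big lans, which matches the expectation that this will not be used often.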
    • Leigh Stoller's avatar
      Minor fixes to previous revision(s). · 428d54d3
      Leigh Stoller authored
    • Leigh Stoller's avatar
      Oops, left this out of previous revision. · 32569e18
      Leigh Stoller authored
    • Leigh Stoller's avatar
      Change to run DHCP on a specific set of interfaces. When XENVIFROUTING · 3c3918cb
      Leigh Stoller authored
      is off, this is just the control net interface (xenbr0). But when
      XENVIFROUTING is on, we want to listen on the control net bridge plus
      all of the container vifs. Since these are not created until the
      container is started, we have to call restartDHCP from emulab-cnet (we
      were already doing that), and now we also call reconfigDHCP() when the
      container is destroyed so that the interface list is correct (note that DHCPD
      does not seem to care if an interface disappears, or even if an
      interface does not exist when starting).
      
      The main point here is that on shared nodes we have to restrict the
      number of interfaces that DHCPd listens on (or even looks at) since it
      can be 100s, and dhcpd was taking well over a minute to start up each
      time.
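
The interface selection described in this commit boils down to something like the following Python sketch. The function name and the vif names in the example are made up; only xenbr0 and the on/off behavior of XENVIFROUTING come from the text.

```python
def dhcp_interfaces(vif_routing, container_vifs, bridge="xenbr0"):
    """Return the interfaces dhcpd should listen on."""
    if not vif_routing:
        # XENVIFROUTING off: only the control net interface.
        return [bridge]
    # XENVIFROUTING on: the control net bridge plus every existing container vif.
    return [bridge] + sorted(container_vifs)


print(dhcp_interfaces(False, {"vif3.0", "vif1.0"}))
print(dhcp_interfaces(True, {"vif3.0", "vif1.0"}))
```

Since the vif set changes whenever a container starts or is destroyed, the list has to be recomputed at both points.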
      
      Aside; minor change to not look at the IP config for bridges, just the
      mac. Takes too long when there are 100s of bridges.
    • Leigh Stoller's avatar
      When XENVIFROUTING is on, and going offline, call new function to · d045249f
      Leigh Stoller authored
      rewrite the interface list in /etc/default/isc-dhcp-server. We do
      not need to restart DHCP, it does not mind that the vif is gone.
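
A rough Python sketch of rewriting the interface list in a shell-style defaults file. It works on the file contents as a string and assumes the variable is named INTERFACES; newer isc-dhcp-server packages use INTERFACESv4 instead, hence the parameter. The function name is hypothetical and this is not the actual clientside code.

```python
import re


def rewrite_interfaces(text, interfaces, var="INTERFACES"):
    """Replace (or append) the VAR="..." line with the given interface list."""
    line = f'{var}="{" ".join(interfaces)}"'
    pattern = re.compile(rf"^{re.escape(var)}=.*$", re.MULTILINE)
    if pattern.search(text):
        # Use a function replacement so the new line is inserted literally.
        return pattern.sub(lambda m: line, text, count=1)
    # No existing setting: append one, keeping the file newline-terminated.
    sep = "" if text == "" or text.endswith("\n") else "\n"
    return text + sep + line + "\n"


before = '# Defaults for isc-dhcp-server\nINTERFACES="xenbr0 vif1.0 vif2.0"\n'
print(rewrite_interfaces(before, ["xenbr0", "vif1.0"]), end="")
```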
    • Leigh Stoller's avatar
      With XENVIFROUTING on, no point in restarting DHCP when adding an entry, · 04ebcf11
      Leigh Stoller authored
      since the vif does not exist yet, and we call restartDHCP() again in
      emulab-cnet after the container and vif exist. In fact, no point in
      restarting DHCP when removing an entry, since by that time the vif is
      gone and dhcpd does not seem to mind that anyway.
    • Leigh Stoller's avatar
      Web UI changes for reservations, for backend/RPC changes in 039f27b1: · 26f77c59
      Leigh Stoller authored
      1. Show current reservations on the admin extend page (if any) for the
         user who started the experiment.
      
      2. Add a reservation history page, to see historical reservations for a
         user.
      
      3. Changes to the reservation listing page.
      
      4. And then the main content of this commit is that for the pages above,
         show the experiment usage history for the project and the user who
         created the reservation. This takes the form of a time line of
         allocation changes so that we can graph node usage against the
         reservation bounds, to show graphically how well utilized the
         reservation is.
    • Leigh Stoller's avatar
      Several backend/RPC changes for reservations: · 8266ae51
      Leigh Stoller authored
      1. Return current set of reservations (if any) for a user when getting
         the max extension (piggy backing on the call to reduce overhead).
      
      2. Add RPC to get the reservation history for a user (all past
         reservations that were approved).
      
         Aside; the reservation_history table was not being updated properly,
         only expired reservations were saved, not deleted (but used)
         reservations, so we lost a lot of history. We could regen some of it
         from the history tables I added at the Portal for Dmitry, but not
         sure it is worth the trouble.
      
      3. And then the main content of this commit is that for both of the
         lists above, also return the experiment usage history for the project
         and the user who created the reservation. This takes the form of a
         time line of allocation changes so that we can graph node usage
         against the reservation bounds, to show graphically how well utilized
         the reservation is.
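
As an illustration of what such a timeline supports, here is a Python sketch that turns a list of allocation changes into a single utilization figure for one reservation. The data layout, the function name, and the decision to cap usage at the reserved node count are assumptions; the actual RPC return format is not shown in this commit message.

```python
def utilization(timeline, start, end, reserved):
    """Fraction of reserved node-time actually used within [start, end).

    timeline is a time-sorted list of (timestamp, nodes_in_use) change points.
    """
    used = 0
    current = 0
    prev = start
    for ts, count in timeline:
        if ts <= start:
            current = count  # allocation already in effect when the window opens
            continue
        if ts >= end:
            break
        # Usage beyond the reservation does not count toward utilization.
        used += min(current, reserved) * (ts - prev)
        prev = ts
        current = count
    used += min(current, reserved) * (end - prev)
    return used / (reserved * (end - start))


changes = [(0, 0), (10, 4), (30, 8), (50, 0)]
print(utilization(changes, start=0, end=100, reserved=4))  # 0.4
```

In the example, the project holds 4 or more nodes from t=10 to t=50, so 160 of the 400 reserved node-time units are used.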
    • Leigh Stoller's avatar
      Switch to graceful restart of apache instead of killing it outright, so · 735250d3
      Leigh Stoller authored
      that backend processes in flight do not get killed in their tracks. This
      might not work right, but elabinelab testing does not tell me much; let's
      see how it goes.
    • Leigh Stoller's avatar
      Add swapper to the project Usage() listing so we can correlate usage · 3b4626e7
      Leigh Stoller authored
      within a project with reservations active during an experiment.
    • Leigh Stoller's avatar
      Add TypesInUse() for experiments. · 879fc245
      Leigh Stoller authored
    • Leigh Stoller's avatar
      75545174
    • Leigh Stoller's avatar
      Minor debugging change. · a6074bec
      Leigh Stoller authored