  1. 17 Aug, 2018 1 commit
  2. 16 Aug, 2018 1 commit
  3. 10 Aug, 2018 1 commit
  4. 08 Aug, 2018 1 commit
    • Add Docker container blockstore support. · 9bf09981
      David Johnson authored
      Docker containers may be (and default to, and in the shared host case,
      must be) deprivileged; thus, they cannot mount devices, much less tell
      the kernel (via iSCSI userspace tools, etc.) to create devices.
      Therefore, we must set up any storage backing devices (temp LVs, iSCSI
      attachments) outside the container.  This commit makes that possible for
      rc.storage and the Linux liblocstorage.  rc.storage now supports
      (for the Linux liblocstorage and Docker) the -j vnodeid calling
      convention; if it is called on behalf of a vnodeid, it uses a
      per-vnodeid fstab for any mounts, storage.conf for its state, etc.
      I modified libvnode_docker to *not* create virtual networks for
      remote blockstore links, because those are pinned to /30s, and thus I
      have no client blockstore link address to place on a device in the root
      context.  However, I (ab)used the existing Docker network setup for the
      blockstore links, and that all happens the same as it used to; we just
      no longer create the Docker virtual network nor attach the container to
      Finally, I modified tmcd dostorageconfig slightly to return
      HOSTIP/HOSTMASK for remote blockstores; and now
      libsetup::getstorageconfig will use HOSTIP in preference to its own
      HOSTID->HOSTIP translation.  I had to do this so that libvnode_docker in
      the root context would not have to go through the mess of translating
      HOSTID on behalf of a vnode.
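      The HOSTIP-over-HOSTID preference described above can be sketched
      roughly as follows.  This is only an illustration: the KEY=VALUE line
      format, parse_kv, blockstore_host_ip, and translate_hostid are
      hypothetical names, not the actual tmcd or libsetup code.

```python
def parse_kv(line):
    """Split a 'KEY=VALUE KEY=VALUE' line into a dict (line format is assumed)."""
    return dict(tok.split("=", 1) for tok in line.split() if "=" in tok)


def blockstore_host_ip(cfg, translate_hostid):
    """Prefer an explicit HOSTIP; fall back to translating HOSTID.

    translate_hostid is a hypothetical callback standing in for the
    client's own HOSTID->HOSTIP lookup.
    """
    host_ip = cfg.get("HOSTIP")
    if host_ip:
        return host_ip
    return translate_hostid(cfg["HOSTID"])
```

      With HOSTIP present, the fallback translation is never invoked, which
      is the property the root context needs.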
  5. 30 Jul, 2018 3 commits
  6. 18 May, 2018 1 commit
  7. 08 May, 2018 1 commit
    • Fix a nasty docker/mkvnode.pl race inspired by bootvnodes/vnodesetup. · e468cc49
      David Johnson authored
      This is probably true for Xen too, but in some cases, the
      vnodesetup early-release hackwaitandexit timeout of 30 seconds
      causes a race condition.  Normally, the first node sets up
      significant network state, and sometimes flips MAC addresses
      around from interface to interface -- OR puts a physical interface
      into a bridge, then changes the bridge's MAC address.  There is a
      short window of time where both the bridge and the new member
      interface share a MAC address -- and if the tmcc ifconfig assembly
      process for vnodes following the first vnode resolves
      the wrong device's MAC address and uses that to flesh out the
      ifconfig info, the vnodesetup will be in a world of hurt (i.e., you
      might see an attempt to make a vlan device out of a vlan device).
      The chance of this happening is minuscule, but I've seen it.
      So, at least for docker for now, we protect the first vnode against
      the 30-second timeout in vnodesetup hackwaitandexit, and we wait for the
      actual running file to be written, or error.
      This is probably applicable to any linux mkvnode.pl path, but I suppose
      it would have been another hundred thousand vnode creates before I saw
      it again.
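      A minimal sketch of the "wait for the running file, or error" idea,
      assuming a hypothetical has_failed callback that reports whether the
      vnode creation died.  The function names, the polling interval, and
      the way the first vnode is exempted are illustrative assumptions, not
      the vnodesetup implementation.

```python
import os
import time


def timeout_for(is_first_vnode, early_release=30.0):
    """The first vnode gets no early-release timeout; later ones keep it."""
    return None if is_first_vnode else early_release


def wait_for_running(running_path, has_failed, timeout=None, poll=0.5):
    """Poll until the running file exists, the creation fails, or time runs out.

    Returns "running", "error", or "timeout".  timeout=None waits forever.
    """
    start = time.monotonic()
    while True:
        if os.path.exists(running_path):
            return "running"
        if has_failed():
            return "error"
        if timeout is not None and time.monotonic() - start >= timeout:
            return "timeout"
        time.sleep(poll)
```

      A caller would pass timeout=timeout_for(is_first_vnode), so only the
      first vnode blocks until its network state has settled.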
  8. 05 May, 2018 1 commit
  9. 02 Apr, 2018 1 commit
    • Fix a race in kill/restart of pubsubd in rc.bootsetup. · a3b1a555
      David Johnson authored
      pubsubd wasn't restarting, surely because the existing pubsubd was still
      running and/or socket state was still live in the kernel even after
      putative death.  This took a long time to manifest, and it's not clear
      exactly what the problem was, but making sure pubsubd is dead (and is no
      longer holding its specific port) is appropriate even if we assume
      REUSEADDR is working, and fixes the current problem.  This was only
      observable on the pc3000s and c220g2s, as far as I saw.
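      The "make sure it is dead and no longer holding its port" check could
      look something like the sketch below.  The signal choice, timeouts,
      and all function names are assumptions made for illustration;
      rc.bootsetup itself is not written this way.

```python
import os
import signal
import socket
import time


def pid_alive(pid):
    """True if a process with this pid exists (unreaped zombies count too)."""
    try:
        os.kill(pid, 0)
    except ProcessLookupError:
        return False
    except PermissionError:
        return True
    return True


def port_is_free(port, host="127.0.0.1"):
    """True if we can bind the port without SO_REUSEADDR."""
    with socket.socket(socket.AF_INET, socket.SOCK_STREAM) as s:
        try:
            s.bind((host, port))
        except OSError:
            return False
        return True


def wait_until(predicate, timeout, interval=0.1):
    """Poll predicate() until it is true or the timeout expires."""
    deadline = time.monotonic() + timeout
    while True:
        if predicate():
            return True
        if time.monotonic() >= deadline:
            return False
        time.sleep(interval)


def stop_daemon(pid, port, timeout=10.0, host="127.0.0.1"):
    """Escalate SIGTERM -> SIGKILL, then wait for the port to be released."""
    for sig in (signal.SIGTERM, signal.SIGKILL):
        if not pid_alive(pid):
            break
        try:
            os.kill(pid, sig)
        except ProcessLookupError:
            break
        if wait_until(lambda: not pid_alive(pid), timeout):
            break
    return (not pid_alive(pid)) and wait_until(lambda: port_is_free(port, host), timeout)
```

      Only when stop_daemon returns True would the caller start the new
      daemon; checking the port without SO_REUSEADDR is deliberately strict.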
  10. 18 Jan, 2018 4 commits
  11. 11 Jan, 2018 1 commit
    • Make clientside startcmdstatus reporting more reliable. · cb5ab9f5
      David Johnson authored
      I had a disk image containing unmodifiable binary software that would
      overwrite dhcpcd's sane copy of /etc/resolv.conf, at a nondeterministic
      point in time, with something completely bogus.  That screwed up
      startcmdstatus reports; this helps out with that case (in combination
      with other custom scripting that returns /etc/resolv.conf to sanity).
      Note though that we only retry infinitely once runstartup has
      successfully gone to the background; up until then, we're limited to about
      a minute's worth of retries.  Likewise, we don't retry forever if
      runstartup itself experiences an error.  We only retry forever if we
      actually have a status to send.
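      The retry policy described above can be sketched as below.  The
      5-second delay, the 60-second bounded window, and the names
      retry_budget and send_with_retries are assumptions chosen for the
      example, not values taken from the client code.

```python
import time


def retry_budget(backgrounded, have_status, delay=5.0, bounded_window=60.0):
    """Return None (retry forever) only when runstartup has gone to the
    background and there is a real status to send; otherwise return a
    bounded number of tries covering roughly bounded_window seconds."""
    if backgrounded and have_status:
        return None
    return max(1, int(bounded_window / delay))


def send_with_retries(send, max_tries=None, delay=5.0, sleep=time.sleep):
    """Call send() until it returns True; max_tries=None means no limit."""
    tries = 0
    while True:
        if send():
            return True
        tries += 1
        if max_tries is not None and tries >= max_tries:
            return False
        sleep(delay)
```

      Passing sleep as a parameter keeps the loop easy to exercise without
      real delays.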
  12. 08 Jan, 2018 1 commit
    • Add some debugging support to clientside TBScriptLock; use it in libvnode_xen. · 5d0ff72b
      David Johnson authored
      If the TBScriptLock caller provides a debug message, it will be stored
      in a file, and other blocked TBScriptLock callers will get (possibly
      slightly racy) info about who holds the lock.
      Then, use this in libvnode_xen to get some info about long calls to xl
      Also enable lockdebug in libvnode_xen for now.
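      As a rough illustration of the idea (not the TBScriptLock code, which
      is Perl), a lock holder can leave a note in a sidecar file that blocked
      callers read.  The ".debug" suffix and the function names are
      hypothetical, and the holder info is racy in the same way the commit
      message warns about.

```python
import fcntl
import os


def read_holder(lock_path):
    """Return the holder's debug note, or None if there is none."""
    try:
        with open(lock_path + ".debug") as f:
            return f.read().strip()
    except FileNotFoundError:
        return None


def try_lock(lock_path, debug_msg=None):
    """Try to take an exclusive flock without blocking.

    Returns (fd, None) on success, or (None, holder_info) when the lock
    is already held by someone else.
    """
    fd = os.open(lock_path, os.O_RDWR | os.O_CREAT, 0o644)
    try:
        fcntl.flock(fd, fcntl.LOCK_EX | fcntl.LOCK_NB)
    except BlockingIOError:
        os.close(fd)
        return None, read_holder(lock_path)
    if debug_msg is not None:
        # Written after the lock is taken, so readers may briefly see stale info.
        with open(lock_path + ".debug", "w") as f:
            f.write(f"pid={os.getpid()} {debug_msg}\n")
    return fd, None


def unlock(fd, lock_path):
    """Drop the debug note, then release the lock."""
    try:
        os.unlink(lock_path + ".debug")
    except FileNotFoundError:
        pass
    fcntl.flock(fd, fcntl.LOCK_UN)
    os.close(fd)
```

      A blocked caller can log the returned holder_info while it waits,
      which is the kind of hint that helps when tracking down long-running
      xl calls.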
  13. 12 Dec, 2017 1 commit
    • Add Linux exp firewall support for virt_node_public_addr addresses. · 798f9b6f
      David Johnson authored
      A new tmcd command, publicaddrinfo, just dumps the relevant bits of
      virt_node_public_addr to any node in an experiment that has addrs
      allocated (we don't want to restrict based on calling node_id or
      Then the generic getfwconfig() function calls that, and sets some bits.
      I also extended this function to add some dynamic clientside vars so that
      firewall rule writers can use them to refer to the control net IPs of
      nodes in their experiment (i.e., node-0.EMULAB_EXPDOMAIN), and so that
      rules can be written over EMULAB_PUBLICADDRS (a comma-delimited
      list of IP addrs).
      Finally, I extended the Linux firewalling code to allow any experiment
      node to answer ARPs for the public IP addresses; we can't know a priori
      which node should answer -- and it could change.
      This closes #353.
  14. 05 Dec, 2017 1 commit
  15. 17 Nov, 2017 1 commit
  16. 26 Oct, 2017 1 commit
  17. 07 Aug, 2017 1 commit
    • Issue #316 emulab/emulab-devel · c5ce9d4c
      Dan Reading authored
      In the checknode code for FreeBSD, don't check the /dev/ad* device if it is a symlink.
      [I think there is an error in the test command for -c]
  18. 26 Jul, 2017 1 commit
    • Support for per-experiment root keypairs (Round 1). See issue #302. · c6150425
      Mike Hibler authored
      Provide automated setup of an ssh keypair enabling root to log in without
      a password between nodes. The biggest challenge here is to get the private
      key onto nodes in such a way that a non-root user on those nodes cannot
      obtain it. Otherwise that user would be able to ssh as root to any node.
      This precludes simple distribution of the private key using tmcd/tmcc as
      any user can do a tmcc (tmcd authentication is based on the node, not the
      This version does a post-imaging "push" of the private key from boss using
      ssh. The key is pushed from tbswap after nodes are imaged but before the
      event system, and thus any user startup scripts, are started. We actually
      use "pssh" (really "pscp") to scale a bit better, so YOU MUST HAVE THE
      PSSH PACKAGE INSTALLED. So be sure to do a:
          pkg install -r Emulab pssh
      on your boss node. See the new utils/pushrootkeys.in script for more.
      The public key is distributed via the "tmcc localization" command which
      was already designed to handle adding multiple public keys to root's
      authorized_keys file on a node.
      This approach should be backward compatible with old images. I BUMPED THE
      VERSION NUMBER OF TMCD so that newer clients can also get back (via
      rc.localize) a list of keys and the names of the files they should be stashed
      in. This is used to allow us to pass along the SSL and SSH versions of the
      public key so that they can be placed in /root/.ssl/<node>.pub and
      /root/.ssh/id_rsa.pub respectively. Note that this step is not necessary for
      inter-node ssh to work.
      Also passed along is an indication of whether the returned key is encrypted.
      This might be used in Round 2 if we securely implant a shared secret on every
      node at imaging time and then use that to encrypt the ssh private key such
      that we can return it via rc.localize. But the client side script currently
      does not implement any decryption, so the client side would need to be changed
      again in this future.
      The per experiment root keypair mechanism has been exposed to the user via
      old school NS experiments right now by adding a node "rootkey" method. To
      export the private key to "nodeA" and the public key to "nodeB" do:
          $nodeA rootkey private 1
          $nodeB rootkey public 1
      This enables an asymmetric relationship such that "nodeA" can ssh into
      "nodeB" as root but not vice-versa. For a symmetric relationship you would do:
          $nodeA rootkey private 1
          $nodeB rootkey private 1
          $nodeA rootkey public 1
          $nodeB rootkey public 1
      These user specifications will be overridden by hardwired Emulab restrictions.
      The current restrictions are that we do *not* distribute a root pubkey to
      tainted nodes (as it opens a path to root on a node where no one should be
      root) or any keys to firewall nodes, virtnode hosts, delay nodes, subbosses,
      storagehosts, etc. which are not really part of the user topology.
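      The asymmetric and symmetric relationships above reduce to a simple
      rule: a node can ssh as root into another when the first holds the
      private key and the second holds the public key.  The sketch below
      models only that rule; the dict layout and can_root_ssh are
      hypothetical, and it ignores the hardwired Emulab restrictions.

```python
def can_root_ssh(src, dst, spec):
    """True if src holds the private key and dst holds the public key.

    spec maps node name -> {"private": bool, "public": bool}, mirroring
    the per-node rootkey settings (layout assumed for illustration).
    """
    has_private = spec.get(src, {}).get("private", False)
    has_public = spec.get(dst, {}).get("public", False)
    return bool(has_private and has_public)


# Asymmetric example: nodeA gets the private key, nodeB the public key.
asymmetric = {
    "nodeA": {"private": True, "public": False},
    "nodeB": {"private": False, "public": True},
}

# Symmetric example: both nodes get both halves.
symmetric = {
    "nodeA": {"private": True, "public": True},
    "nodeB": {"private": True, "public": True},
}
```

      Under this model, can_root_ssh("nodeA", "nodeB", asymmetric) holds
      while the reverse direction does not.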
      For more on how we got here and what might happen in Round 2, see:
  19. 06 Jul, 2017 1 commit
  20. 03 Jul, 2017 2 commits
  21. 01 Jul, 2017 1 commit
  22. 22 Jun, 2017 1 commit
  23. 21 Jun, 2017 1 commit
  24. 19 Jun, 2017 3 commits
  25. 30 May, 2017 1 commit
  26. 18 May, 2017 1 commit
  27. 02 May, 2017 1 commit
    • Fix a race in common/mkvnode.pl vnodeCreate. · 345bf9bd
      David Johnson authored
      safeLibOp blocks all our vnodesetup-related signals from interrupting
      libvnode ops to ensure at least op-level consistency.  However, there
      was an opportunity for signals to sneak in, in between a successful
      vnodeCreate and the writing of the vnode.info file (that mkvnode.pl uses
      to know if the vnode was created or not).
      So I redid safeLibOp to make blocking signals optional (of course it's
      on for nearly all calls, except now vnodeCreate, and formerly
      vnodePoll).  Now there's a signal-safe zone all the way around
      vnodeCreate, including a StoreState() before we unblock.  This should
      ensure consistency in that particular spot.  I didn't think about
      whether this affects anything else.
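      The signal-safe zone can be illustrated with POSIX signal masks.  This
      is a sketch under assumptions: the blocked signal set (SIGUSR1,
      SIGHUP) and the names safe_lib_op and create_and_store are invented
      for the example and are not the mkvnode.pl code.

```python
import signal

# Hypothetical stand-in for the vnodesetup-related signals.
BLOCKED = (signal.SIGUSR1, signal.SIGHUP)


def safe_lib_op(op, block_signals=True):
    """Run op(); by default, hold the listed signals pending while it runs."""
    if not block_signals:
        return op()
    old_mask = signal.pthread_sigmask(signal.SIG_BLOCK, BLOCKED)
    try:
        return op()
    finally:
        signal.pthread_sigmask(signal.SIG_SETMASK, old_mask)


def create_and_store(create, store_state):
    """Block signals around both the create and the state write.

    The create itself runs with block_signals=False because the caller
    already holds the mask; unblocking happens only after store_state().
    """
    old_mask = signal.pthread_sigmask(signal.SIG_BLOCK, BLOCKED)
    try:
        result = safe_lib_op(create, block_signals=False)
        store_state(result)
        return result
    finally:
        signal.pthread_sigmask(signal.SIG_SETMASK, old_mask)
```

      A signal that arrives during the create stays pending until the state
      has been stored, so a handler never observes a created vnode with no
      recorded state.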
  28. 29 Apr, 2017 1 commit
  29. 27 Apr, 2017 2 commits
  30. 26 Apr, 2017 1 commit
  31. 24 Apr, 2017 1 commit
    • Clientside Docker vnode support. · 96794781
      David Johnson authored
      See clientside/tmcc/linux/docker/README.md for design notes.
      See clientside/tmcc/linux/docker/dockerfiles/README.md for a description
      of how we automatically Emulabize existing Docker images.
      Also, this mostly fits within the existing vnodesetup path, but I did modify
      mkvnode.pl to allow the libvnode backend to provide a vnodePoll wait
      loop instead of the builtin vnodeState loop.