  1. 26 Oct, 2018 9 commits
  2. 25 Oct, 2018 10 commits
    • Add defs file for amaricq · 2f41610c
      Aleksander Maricq authored
    • Leigh Stoller's avatar
    • Replace the Docker entrypoint/cmd/env implementation for augmented images. · a986a085
      David Johnson authored
      (Also, add support for the user to change the container entrypoint at runtime.
      Note also that the server side now stores the entrypoint/cmd/env
      attributes as base64url-encoded virt_node_attributes, so that we can
      just use the existing table_regex for those values.)
      
      We add a new runit service (/etc/service/dockerentrypoint) to
      clientside/tmcc/linux/docker/dockerfiles/common to handle the
      entrypoint/cmd/env/workingdir/user emulation.  From the comments:
      
        Docker's semantics for ENTRYPOINT/CMD vary depending on whether those
        values are specified as arrays of strings, or simply as single strings
        (which must be interpreted by /bin/sh -c).
      
        Handling all the quoting possibilities in the shell is a major pain.
        So, this script handles the basic stuff (in particular, sourcing env
        vars, because we want the shell to interpret them!) -- then execs our
        perl companion script (run.pl) to deal with the entrypoint/command
        files that libvnode_docker::emulabizeImage and
        libvnode_docker::vnodeCreate populated.
      
        libvnode_docker creates these single-line files in /etc/emulab/docker
        as either string:hexstr(<entrypoint-or-cmd-string>), or
        array:hexstr(a[0]),hexstr(a[1])... .  This allows us to preserve the
        original type of the image's entrypoint/cmd as well as the runtime
        entrypoint/cmd, and to preserve the exact bytes for the eventual final
        call to exec.
      
        The static files built into an emulabized image are
        /etc/emulab/docker/{entrypoint.image,cmd.image}, and those created
        dynamically at runtime if the user changes the entrypoint or cmd are
        bind-mounted to /etc/emulab/docker/{entrypoint.runtime,cmd.runtime}.
      
        Given the presence (or absence!) of those files, this script
        implements the emulation, based upon the content in those files.
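The single-line file format described above (string:hexstr(...) or array:hexstr(a[0]),hexstr(a[1])...) can be illustrated with a small round-trip sketch. This is not the actual run.pl or libvnode_docker code: the function names are hypothetical, and lowercase hex with no separators inside each element is an assumption about what "hexstr" means.

```python
def encode_spec(value):
    """Encode an entrypoint/cmd as one line.

    A bytes value stands for the single-string (shell) form; a list of
    bytes stands for the array (exec) form. Hex keeps the exact bytes.
    """
    if isinstance(value, bytes):
        return "string:" + value.hex()
    return "array:" + ",".join(item.hex() for item in value)


def decode_spec(line):
    """Decode a line written by encode_spec back to bytes or a list of bytes."""
    kind, _, payload = line.strip().partition(":")
    if kind == "string":
        return bytes.fromhex(payload)
    if kind == "array":
        # Assumption: an empty payload means an empty array. A real format
        # would need a rule to tell that apart from one empty element.
        if not payload:
            return []
        return [bytes.fromhex(part) for part in payload.split(",")]
    raise ValueError("unknown spec type: %r" % kind)
```

Because the type tag survives the round trip, a consumer can tell whether to hand the value to /bin/sh -c (string form) or to exec it directly (array form), without re-parsing any shell quoting.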
    • David Johnson's avatar
      993e9f8c
    • David Johnson's avatar
      e48155a7
    • Tweaks for 2018Q4 port set. · f3dc1bfe
      Mike Hibler authored
    • Minor fix to repo based profile update. · 671c9a48
      Leigh Stoller authored
    • Turn on image tracking. · d43e6a81
      Leigh Stoller authored
    • Mike Hibler's avatar
    • Introduce a full port of m2crypto rather than a wrapper. · 7257198b
      Mike Hibler authored
      The full port is fixed at version 0.29.1. The latest version that was
      wrapped, version 0.30.1, has problems with unicode to "string" conversions.
      This explicitly caused an exception from the m2crypto SWIG stubs for libssl.
      Even after fixing that, we still could not verify a certificate due to
      apparently missing chars in strings.
  3. 24 Oct, 2018 3 commits
    • Fixes for DeleteNodes(): · c14472f9
      Leigh Stoller authored
      * When deleting a lan and there is only one interface left, we need to go
        back and delete the interface from the last node. Otherwise it's a
        malformed rspec (which we have been ignoring), but it was passing
        through to the manifest, which made it a malformed manifest.
      
      * But a later bug was causing that now removed interface to sneak back
        in via the old copy of the manifest in the database.
      
      * Also fix a bug that was causing multiple versions of the site_info
        element to get inserted during an update.
      
      * Remove code that updates the manifest in the DB, use the existing
        Aggregate->UpdateManifest() method instead.
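One reading of the first fix above, sketched on a toy in-memory model rather than a real rspec: once nodes are deleted, any lan left with fewer than two interfaces is dropped, and the leftover interface is also removed from the last remaining node. The data layout and the function name delete_nodes are hypothetical and only illustrate the cleanup rule.

```python
def delete_nodes(nodes, lans, doomed):
    """Remove nodes, then prune lans that no longer connect two interfaces.

    nodes: dict mapping node name -> set of interface names (toy model)
    lans:  dict mapping lan name  -> set of (node, interface) pairs
    doomed: iterable of node names to delete
    Both dicts are modified in place.
    """
    for node in doomed:
        nodes.pop(node, None)

    for lan in list(lans):
        # Keep only the members whose node still exists.
        members = {(n, i) for (n, i) in lans[lan] if n in nodes}
        if len(members) < 2:
            # A lan with a single interface is malformed: drop the lan and
            # go back and delete the dangling interface from the last node.
            for node, iface in members:
                nodes[node].discard(iface)
            del lans[lan]
        else:
            lans[lan] = members
```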
    • Changes for Arduino I did a while back. · c2387c9b
      Mike Hibler authored
      Avoid gratuitous serial line signal changes when opening up the USB
      device for the Arduino. Otherwise, the Arduino will reset its state.
    • Minor fix; we let users delete profiles (or versions) while there is an · e234b170
      Leigh Stoller authored
      experiment running that uses that profile. A small bug here prevented
      the Terminate button from getting enabled. In general though, I wonder
      if we should not allow a profile to be deleted while it's instantiated. :-)
  4. 23 Oct, 2018 10 commits
    • Leigh Stoller's avatar
    • Leigh Stoller's avatar
      238fcb83
    • Fix gaping race condition in ParRun() that was causing an infinite loop · 833d0937
      Leigh Stoller authored
      when getting a termination signal. Also add an option to not redefine
      the HUP handler, which is needed for the portal_monitor, which uses the
      HUP signal to reopen the logfile (from syslogd).
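The idea of an option that leaves the HUP handler alone can be sketched as follows. This is a Python illustration on a POSIX system, not the Perl ParRun() code; the function name and the redefine_hup parameter are hypothetical.

```python
import signal


def install_termination_handlers(on_terminate, redefine_hup=True):
    """Install on_terminate for termination signals.

    With redefine_hup=False the caller's existing SIGHUP handler (for
    example one that reopens a logfile) is left untouched.
    Returns the previous handlers so they can be restored later.
    """
    signals = [signal.SIGTERM, signal.SIGINT]
    if redefine_hup:
        signals.append(signal.SIGHUP)

    previous = {}
    for sig in signals:
        # signal.signal() returns the handler that was installed before.
        previous[sig] = signal.signal(sig, on_terminate)
    return previous
```

A daemon that uses HUP for log rotation would install its own HUP handler first and then call this with redefine_hup=False, so a later HUP still reopens the log instead of being treated as a request to terminate.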
    • Minor fix. · 6f628c59
      Leigh Stoller authored
    • New version of the portal monitor that is specific to the Mothership. · 2a5cbb2a
      Leigh Stoller authored
      This version is intended to replace the old autostatus monitor on bas,
      except for monitoring the Mothership itself. We also notify the Slack
      channel like the autostatus version. Driven from the apt_aggregates
      table in the DB, we do the following.
      
      1. fping all the boss nodes.
      
      2. fping all the ops nodes and dboxen. Aside: there are two special
         cases for now, that will eventually come from the database. 1)
         powder wireless aggregates do not have a public ops node, and 2) the
         dboxen are hardwired into a table at the top of the file.
      
      3. Check all the DNS servers. Different from autostatus (which just
         checks that port 53 is listening), we do an actual lookup at the
         server. This is done with dig @ the boss node with recursion turned
         off. At the moment this is a serialized test of all the DNS servers;
         we might need to change that later. I've lowered the timeout, and if
         things are operational 99% of the time (which I expect), then this
         will be okay until we get a couple of dozen aggregates to test.
      
         Note that this test is skipped if the boss is not pingable in the
         first step, so in general this test will not be a bottleneck.
      
      4. Check all the CMs with a GetVersion() call. As with the DNS check, we
         skip this if the boss does not ping. This test *is* done in parallel
         using ParRun() since it's slower and the most likely to time out when
         the CM is busy. The timeout is 20 seconds. This seems to be the best
         balance between too much email and not hanging for too long on any
         one aggregate.
      
      5. Send email and slack notifications. The current loop is every 60
         seconds, and each test has to fail twice in a row before marking a
         test as a failure and sending notification. Also send a 24 hour
         update for anything that is still down.
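The notification rule in step 5 (two failures in a row before alerting, then a 24 hour update while still down) can be sketched as a small state tracker. The class and return values are hypothetical, and what happens on recovery is an assumption: here a success simply resets the state without sending anything.

```python
class FailureTracker:
    """Decide when a repeatedly run test should trigger a notification."""

    def __init__(self, threshold=2, reminder_seconds=24 * 3600):
        self.threshold = threshold
        self.reminder_seconds = reminder_seconds
        self.state = {}  # test name -> {"fails": int, "notified_at": float or None}

    def record(self, name, ok, now):
        """Record one result; return "failed", "still-down", or None."""
        st = self.state.setdefault(name, {"fails": 0, "notified_at": None})
        if ok:
            # Assumption: a success clears the failure streak silently.
            st["fails"] = 0
            st["notified_at"] = None
            return None

        st["fails"] += 1
        if st["notified_at"] is None:
            # Not yet reported: require `threshold` consecutive failures.
            if st["fails"] >= self.threshold:
                st["notified_at"] = now
                return "failed"
            return None

        # Already reported: only repeat after the reminder interval.
        if now - st["notified_at"] >= self.reminder_seconds:
            st["notified_at"] = now
            return "still-down"
        return None
```

With a 60 second loop, a single dropped probe produces no message; the first alert goes out on the second consecutive failure, roughly a minute after the first one.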
      
      At the moment, the full set of tests takes 15 seconds on our seven
      aggregates when they are all up. Will need more tuning later, as the
      number of aggregates goes up.
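As a rough illustration of why the GetVersion() checks in step 4 run in parallel under one timeout, the sketch below runs a check against every target concurrently and treats anything slow or failing as down. It uses threads in place of ParRun()'s children, and check_all and its parameters are hypothetical names.

```python
import time
from concurrent.futures import ThreadPoolExecutor


def check_all(check, targets, timeout=20.0):
    """Run check(target) for all targets concurrently.

    Returns a dict of target -> bool. A check that raises, or has not
    finished by the shared deadline, is reported as False.
    """
    if not targets:
        return {}

    results = {}
    deadline = time.monotonic() + timeout
    with ThreadPoolExecutor(max_workers=len(targets)) as pool:
        futures = {target: pool.submit(check, target) for target in targets}
        for target, future in futures.items():
            remaining = max(0.0, deadline - time.monotonic())
            try:
                results[target] = bool(future.result(timeout=remaining))
            except Exception:
                # Either the wait timed out or check() itself raised.
                results[target] = False
    # Note: threads cannot be killed, so leaving the "with" block still waits
    # for stragglers; a real check should enforce its own network timeout.
    return results
```

Run this way, the total time is bounded by the slowest single check rather than the sum of all of them, which matters as the number of aggregates grows.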
    • More tweaks to powder fixed node build. · 3dcc45bc
      Leigh Stoller authored
    • Add timeout override to PingAggregate(). · 076547b6
      Leigh Stoller authored
    • When searching for an IP on the history page, let's also show a matching · 10383734
      Leigh Stoller authored
      current experiment if there is one. This is convenient.
    • Allow HTML in warn/kill message to user. · 74258700
      Leigh Stoller authored
    • With Apache 2.4, there is a new option to allow CAs with no CRLs · c1220b25
      Leigh Stoller authored
      when CRLs are enabled. This used to be the default but is now an
      option we need to turn on.
  5. 22 Oct, 2018 1 commit
  6. 16 Oct, 2018 1 commit
  7. 15 Oct, 2018 1 commit
  8. 11 Oct, 2018 5 commits