1. 19 Aug, 2016 1 commit
    • Add Mitaka; unified controller/networkmanager; Manila; linuxbridge. · 6d23a989
      David Johnson authored
      The feature notes:
      
        * Mitaka is now the default OpenStack release configured by this
          profile.  Kilo and Juno are deprecated, and we are no longer testing
          the profile's functionality under those versions (although we have
          no concrete plans to remove the code at this point).  They may
          continue to work, or they may not.  You should update to Mitaka if
          possible, of course.
      
        * The default topology is now down to two nodes: a controller (`ctl`)
          node and a compute (`cp-1`) node; the networkmanager node's
          functionality has been moved to the controller, as is the default in
          the OpenStack Ubuntu/Apt documentation.  You can return to the old
          three-node configuration by changing the name of the
          "networkmanager" node in the Advanced Parameters from `ctl` to `nm`.
      
        * One of the bigger Mitaka features is shared filesystem support
          (Manila).  We download a shared filesystem image and configure
          Manila so that you can immediately create a share and connect it to
          guests.
      
        * We have added support for the Neutron ML2 "Linuxbridge" driver,
          although we continue to install the "OpenVSwitch" ML2 driver by
          default.  The Linuxbridge driver has not been tested as thoroughly
          as the OpenVSwitch driver across all possible configurations of
          this profile.  Although OpenStack has switched to the Linuxbridge
          driver as its default, we have no plans to do that yet.
      
        * You can now choose an Apt mirror and set a custom mirror path if you
          require fast localized access to a mirror.
      
        * The MTU that dnsmasq pushes to your OpenStack VMs has been reduced
          from 1454 bytes to 1450 bytes.  1454 is an adequate setting for GRE
          tunnels, of course, but not for VXLAN networks, which require 1450
          on a normal physical network with 1500-byte MTU.  Somehow this
          mistake escaped prior testing.
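
      The MTU arithmetic in the last note can be checked with a short
      sketch.  The header sizes below assume an IPv4 underlay, an untagged
      inner Ethernet frame, and GRE tunnels that carry a 4-byte key; those
      assumptions are mine, not something this commit states.

```python
# Encapsulation overhead in bytes (assumed: IPv4 underlay, untagged inner frame).
OUTER_IPV4 = 20
UDP = 8
VXLAN_HEADER = 8
GRE_HEADER_WITH_KEY = 8   # 4-byte base header + 4-byte key (assumed)
INNER_ETHERNET = 14

VXLAN_OVERHEAD = OUTER_IPV4 + UDP + VXLAN_HEADER + INNER_ETHERNET
GRE_OVERHEAD = OUTER_IPV4 + GRE_HEADER_WITH_KEY + INNER_ETHERNET


def guest_mtu(physical_mtu, overhead):
    """Largest guest MTU whose encapsulated packets still fit physical_mtu."""
    return physical_mtu - overhead


vxlan_mtu = guest_mtu(1500, VXLAN_OVERHEAD)
gre_mtu = guest_mtu(1500, GRE_OVERHEAD)

# The old 1454 setting fits under the GRE limit but exceeds the VXLAN limit.
print("VXLAN guest MTU:", vxlan_mtu)
print("1454 fits GRE:", 1454 <= gre_mtu)
print("1454 fits VXLAN:", 1454 <= vxlan_mtu)
```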
      
      A few details:
      
        * I refactored the Neutron ML2 plugin setup code, since all nodes
          have to be configured essentially the same way.  The refactored
          code supports either openvswitch or linuxbridge.
      
        * I haven't set up Manila for aarch64 because there is no available
          Manila service image for aarch64.  I will have to build one of my
          own.
  2. 14 Jul, 2016 2 commits
    • Use a different mechanism to tell dpkg automatic conffile settings. · 4ace7c90
      David Johnson authored
      Now I see why I hadn't enabled the
      
        -o Dpkg::Options::="--force-confdef" -o Dpkg::Options::="--force-confold"
      
      directly on the apt-get command lines.  apt-get must have a bug, because
      when you specify these options in noninteractive mode (and non-pty, I
      assume, because this is via startup command-then-ssh), at least one of
      the dpkg commands invoked by apt-get has no dpkg action.

      So, put these two options into /etc/dpkg/dpkg.cfg.d/cloudlab, and then
      there are no problems.
      
      Of course, this means the same behavior will apply to users if they
      run apt-get or dpkg later on.  On the one hand, this is preferable,
      because then they can't possibly screw up openstack config files through
      package upgrades.  On the other hand, they might get fooled when
      they're upgrading some other package.
      
      Probably will just document this and call it good :).
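
      For reference, a dpkg configuration file takes the same options as the
      command line, minus the leading dashes, and dpkg reads drop-in files
      from /etc/dpkg/dpkg.cfg.d/.  The sketch below only renders the text
      such a file might contain; the helper names and the comment line are
      illustrative, and nothing here is taken from the profile's scripts.

```python
def dpkg_cfg_lines(cli_options):
    """Turn dpkg command-line options into dpkg.cfg-style lines.

    dpkg.cfg uses the long option names without the leading "--".
    """
    return [opt.lstrip("-") for opt in cli_options]


def render_dpkg_cfg(cli_options, comment):
    """Render the full text of a drop-in file, with one leading comment."""
    lines = ["# " + comment] + dpkg_cfg_lines(cli_options)
    return "\n".join(lines) + "\n"


# Hypothetical content for a drop-in such as /etc/dpkg/dpkg.cfg.d/cloudlab.
CONTENT = render_dpkg_cfg(
    ["--force-confdef", "--force-confold"],
    "Keep existing conffiles on upgrade without prompting.",
)
print(CONTENT, end="")
```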
  3. 13 Jul, 2016 3 commits
  4. 06 Jul, 2016 1 commit
    • Add a couple missing fallback-to-EOL wgets. · d74be89c
      David Johnson authored
      The openstack people rename their branches to -eol variants.  Since we
      need to download some things, and want the tarballs to keep working in
      perpetuity, we first try the stable/<release> ref, then fall back to
      the <release>-eol tag.  Sigh...
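
      The fallback order can be sketched as follows.  The real code uses
      wget from shell; this is a Python illustration with a made-up
      `fetch` callable standing in for the download, so the function names
      and the fake fetcher are assumptions.

```python
def candidate_refs(release):
    """Refs to try, in order: the stable branch first, then the EOL tag."""
    return ["stable/" + release, release + "-eol"]


def fetch_first(release, fetch):
    """Return (ref, data) for the first ref that fetch() can retrieve.

    fetch is any callable that takes a ref and either returns data or raises.
    """
    errors = []
    for ref in candidate_refs(release):
        try:
            return ref, fetch(ref)
        except Exception as exc:
            errors.append("%s: %s" % (ref, exc))
    raise RuntimeError("no ref could be fetched: " + "; ".join(errors))


def fake_fetch(ref):
    """Stand-in downloader: pretends the stable branch no longer exists."""
    if ref.startswith("stable/"):
        raise IOError("404 Not Found")
    return b"tarball for " + ref.encode()


ref, data = fetch_first("kilo", fake_fetch)
print(ref)
```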
  5. 17 Jun, 2016 1 commit
    • Several changes to openstack slothd gatherer. · 3b8bde10
      David Johnson authored
      Added the entire epoch as a period named '__EPOCH__'.

      Stopped collecting the cpu_util and network.(incoming,outgoing).bytes.rate
      meters for periods; we now collect 'instance' meters per period (these
      just show how many instances were on each hypervisor during the period).

      Stopped collecting port.(delete,create,update) events: there are far too
      many, and they essentially relate n-to-1 with VMs (where n is usually 1,
      i.e. one port per VM).
      
      Handle some weird inconsistencies in resource metadata (it seemed that
      for some VMs, some of the metadata wasn't set and thus we were not
      including said VMs in the primary VM info dict -- and this seemed to
      cause problems for the javascript graph stuff).
      
      Also, if a resource/meter value isn't associated with a hostname, use
      the 'UNKNOWN' hostname instead of None/null.  For whatever reason, this
      occurs with image.update now; funny.
      
      Get rid of the __FLATTEN__ stuff; that was just a lingering turd.
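
      The 'UNKNOWN' hostname substitution amounts to something like the
      sketch below.  The sample layout (a list of dicts with 'hostname' and
      'value' keys) is invented for illustration and is not the gatherer's
      actual data model.

```python
from collections import defaultdict

UNKNOWN_HOST = "UNKNOWN"


def group_by_hostname(samples):
    """Group sample values by hostname, filing host-less samples under UNKNOWN."""
    grouped = defaultdict(list)
    for sample in samples:
        # A missing key and an explicit None are treated the same way.
        hostname = sample.get("hostname") or UNKNOWN_HOST
        grouped[hostname].append(sample["value"])
    return dict(grouped)


samples = [
    {"meter": "image.update", "hostname": None, "value": 1},
    {"meter": "instance", "hostname": "cp-1", "value": 3},
    {"meter": "image.update", "value": 1},
]
grouped = group_by_hostname(samples)
print(grouped)
```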
  6. 09 Jun, 2016 1 commit
  7. 01 Jun, 2016 2 commits
  8. 31 May, 2016 3 commits
  9. 26 May, 2016 1 commit
  10. 21 May, 2016 2 commits
    • Reorg top level; add more resource info; add timestamp/runtime metadata. · acbf5767
      David Johnson authored
      Now the top-level keys are: 'META' (metadata about the collection run,
      so that whoever pulls this file back to boss doesn't have to check its
      ctime/mtime to know how stale the data is -- times are in GMT); 'info',
      which has keys like 'images', 'vms', 'networks', 'subnets', 'ports',
      and 'routers', each holding a UUID->dict mapping where the dict has a
      'name' field (HRN), 'status' (if the resource has status; all do), and
      'deleted' (True or False); and 'periods', a dict whose keys are the
      periods that were previously top-level keys.
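
      One reading of that layout, as a sketch: the three top-level keys and
      the 'name'/'status'/'deleted' fields come from the description above,
      while the field names inside 'META', the period name, and the sample
      UUIDs and resource names are placeholders I made up.

```python
import json
import time


def gmt(ts):
    """Format a Unix timestamp as a GMT string."""
    return time.strftime("%Y-%m-%dT%H:%M:%SZ", time.gmtime(ts))


def build_stats(info, periods, started, finished):
    """Assemble the top-level dict: META, info, periods."""
    return {
        "META": {  # hypothetical field names
            "start": gmt(started),
            "end": gmt(finished),
            "runtime_seconds": finished - started,
        },
        "info": info,
        "periods": periods,
    }


info = {
    "vms": {
        "vm-uuid-1": {"name": "web-1", "status": "ACTIVE", "deleted": False},
    },
    "networks": {
        "net-uuid-1": {"name": "lan-1", "status": "ACTIVE", "deleted": False},
    },
}
stats = build_stats(info, {"last_hour": {}}, started=0, finished=90)
print(json.dumps(stats, indent=2, sort_keys=True))
```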
    • Add network in/out byte rate meters, and API meters. · f77aac78
      David Johnson authored
      Openstack reports in/out byte rates for each vm and for each device
      on those VMs, but I aggregate the per-device stats into per-VM in/out
      totals.
      
      Currently, I'm reporting these API calls:
        * (network,subnet,port,router).(create,update,delete)
        * (image).(upload,update)
      API calls are reported by the "host" from which they were issued (I
      think); if there is no host info logged (like for images), the hostname
      is "null".
      f77aac78
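
      The per-device to per-VM aggregation described in this commit could
      look roughly like this.  The sample format and key names ('vm',
      'device', 'in_rate', 'out_rate') are assumptions for the sake of the
      example.

```python
from collections import defaultdict


def per_vm_totals(device_samples):
    """Sum per-device in/out byte rates into one in/out total per VM."""
    totals = defaultdict(lambda: {"in": 0.0, "out": 0.0})
    for sample in device_samples:
        vm_total = totals[sample["vm"]]
        vm_total["in"] += sample["in_rate"]
        vm_total["out"] += sample["out_rate"]
    return dict(totals)


device_samples = [
    {"vm": "vm-1", "device": "tap0", "in_rate": 100.0, "out_rate": 20.0},
    {"vm": "vm-1", "device": "tap1", "in_rate": 50.0, "out_rate": 5.0},
    {"vm": "vm-2", "device": "tap2", "in_rate": 10.0, "out_rate": 1.0},
]
totals = per_vm_totals(device_samples)
print(totals)
```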
  11. 20 May, 2016 1 commit
    • A simple cpu_util statistics gatherer. · 5300a37f
      David Johnson authored
      This collects openstack cpu_util stats, grouped by hypervisor, and dumps
      them into a JSON file.  The JSON file will be written into
      /root/setup/cloudlab-openstack-stats.json.  Currently it gets written
      every 2 minutes (however, openstack by default collects CPU stats only
      every 600 seconds...).

      The format is quite simple.  It's a dict of time periods -- currently
      the last 10 minutes, last hour, last 6 hours, last day, and last
      week. Each period is also a dict, currently with two keys: vm_info and
      cpu_util. vm_info contains a dict for each physical hypervisor node, and
      that dict contains a mapping of openstack VM uuid to VM
      shortname. cpu_util also contains a dict for each physical hypervisor
      node, and that dict contains two keys: a total of the average cpu utils
      for all the VMs on that node; and a "vms" dict containing the avg cpu
      util for each VM.
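
      A sketch of how one period's entry might be assembled from per-VM
      averages.  The 'vm_info', 'cpu_util', and 'vms' keys follow the
      description above; the 'total' key name, the input sample format, and
      the sample values are my own placeholders.

```python
import json


def summarize_period(samples):
    """Build one period's dict: vm_info and cpu_util, both keyed by hypervisor."""
    vm_info = {}
    cpu_util = {}
    for sample in samples:
        host = sample["hypervisor"]
        # vm_info: hypervisor -> {VM uuid: VM shortname}
        vm_info.setdefault(host, {})[sample["uuid"]] = sample["name"]
        # cpu_util: hypervisor -> {"total": sum of averages, "vms": {uuid: average}}
        host_util = cpu_util.setdefault(host, {"total": 0.0, "vms": {}})
        host_util["vms"][sample["uuid"]] = sample["avg_cpu_util"]
        host_util["total"] += sample["avg_cpu_util"]
    return {"vm_info": vm_info, "cpu_util": cpu_util}


samples = [
    {"hypervisor": "cp-1", "uuid": "uuid-a", "name": "vm-a", "avg_cpu_util": 25.0},
    {"hypervisor": "cp-1", "uuid": "uuid-b", "name": "vm-b", "avg_cpu_util": 50.0},
]
period = summarize_period(samples)
print(json.dumps(period, indent=2, sort_keys=True))
```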
  12. 17 May, 2016 1 commit
    • Fix ctl node reboot races on Liberty/Ubuntu 15.10. · 74df028a
      David Johnson authored
      Reboots of the ctl node for the Liberty version would result in
      failures to start up mysql, which renders all openstack services
      inoperable.

      Recall that in the common case (because we have many testbeds whose
      nodes only have one expt interface), we set up the openstack mgmt lan as
      a VPN over the control net between all the nodes, served from the nm
      node.
      
      Well, mysql binds to and listens on the ip addr of the mgmt net device,
      and when the ctl node is rebooted, mysql starts long before openvpn can
      bring up the vpn client net device.  Moreover, rabbitmq would fail to
      start for the same reason, and rabbitmq is the AMQP messaging service
      that underlies all openstack RPC.
      
      For various reasons, it's not sufficient to just make the mysql
      initscript (which on 15.10 is still legacy LSB!) depend on the openvpn
      legacy LSB initscript.
      
      So I wrote a little initscript (embedded in setup-controller.sh) that
      spins in a sleep 1; loop, looking for the mgmt net to get its known IP
      from the openvpn client.  It has a reverse dependency on mysql, so it
      runs to completion before mysql starts.
      
      Then, we had to handle the rabbitmq case... but rabbitmq has a modern
      systemd unit file, not an LSB initscript.  So I wrote a systemd unit
      file that invokes my mgmt net LSB initscript to wait for the mgmt net
      IP... and that has a reverse dep on rabbitmq-server.service.
      
      Now all is good.  mysql and rabbitmq-server are certainly blocked for a
      few extra seconds, while the VPN comes up, but all the openstack
      services themselves are written defensively to handle RPC server
      disconnects, or database disconnects (doh).
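
      The wait loop itself is simple.  The actual implementation is a shell
      initscript; the Python below only illustrates the polling logic, with
      an injected `get_ips` callable in place of inspecting the real
      interface.  The optional timeout, the function names, and the sample
      address are my additions.  Ordering the wait ahead of mysql and
      rabbitmq-server is handled by the init system and is not shown.

```python
import time


def wait_for_ip(get_ips, wanted_ip, interval=1.0, timeout=None,
                sleep=time.sleep, clock=time.monotonic):
    """Poll get_ips() until wanted_ip appears; return False on timeout.

    With timeout=None this spins indefinitely, like a plain sleep loop.
    """
    deadline = None if timeout is None else clock() + timeout
    while wanted_ip not in get_ips():
        if deadline is not None and clock() >= deadline:
            return False
        sleep(interval)
    return True


# Fake address source: the mgmt IP only shows up on the third poll.
polls = []


def fake_get_ips():
    polls.append(1)
    return ["192.168.0.1"] if len(polls) >= 3 else []


ok = wait_for_ip(fake_get_ips, "192.168.0.1", sleep=lambda seconds: None)
print(ok, len(polls))
```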
  13. 04 May, 2016 1 commit
  14. 03 May, 2016 1 commit
  15. 21 Apr, 2016 1 commit
  16. 19 Apr, 2016 2 commits
  17. 25 Mar, 2016 1 commit
  18. 03 Mar, 2016 2 commits
  19. 27 Feb, 2016 2 commits
  20. 26 Feb, 2016 1 commit
  21. 25 Feb, 2016 3 commits
  22. 22 Feb, 2016 2 commits
  23. 19 Feb, 2016 2 commits
    • On Liberty aarch64, enable vnc using vga driver disabling usb input. · fde6f527
      David Johnson authored
      Liberty didn't seem to like the disable_vnc flag to Nova on aarch64 that
      we relied on so that images would boot.  Fortuitously, qemu/libvirt have
      been upgraded enough so that you can actually attach a VGA adapter to an
      aarch64 KVM qemu instance.  So we do that, and now we mark images with a
      specific flag that says to use the vga display driver instead of the
      'cirrus' default, which qemu/libvirt aarch64 does *not* support.
      
      Probably I should just find a way to fix the vnc disablement :).
    • 27c65db3
  24. 17 Feb, 2016 3 commits