  1. 29 Oct, 2004 7 commits
    • Leigh B. Stoller authored · 41b95692
      Make sure that boss' host keys end up in ops:/etc/ssh/known_host_keys
      to avoid silly questions that the event-scheduler cannot answer.
    • Timothy Stack · b1dfb5fb
    • Timothy Stack authored · c61858c7
      Make the hurting stop.  Make sshxmlrpc auto-detect things, fail over
      properly, and dump useful information when it is unable to deal with
      the peer.
      
        * xmlrpc/sshxmlrpc.py: Major update.  It now tries to autoconfigure
          itself by scanning the path for "ssh" and "plink.exe" (although I
          haven't actually tried it on Windows).  Environment variables can
          now be used to turn on debugging and set the command to use for
          doing the ssh.  Before running ssh, it will check for an agent or
          a passphrase-less key and print a warning if it finds neither.
          The last five lines read from the server, as well as the standard
          error output, are stored so they can be dumped later; helpful for
          figuring out what is actually being run on the other side.  The
          protocol layer between ssh and xml-rpc will now respond to a
          "probe" header so that clients can figure out who they are talking
          to.  The server side will now properly detect a closed connection
          and not write anything, which means...
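The autoconfiguration and the five-line history described in the commit above can be sketched roughly as follows. This is a minimal illustration, not the code in xmlrpc/sshxmlrpc.py: the environment variable names `SSHXMLRPC_SSH` and `SSHXMLRPC_DEBUG`, the functions `find_ssh_command` and `debug_enabled`, and the class `LineHistory` are all hypothetical, since the commit message does not name them.

```python
import collections
import os
import shutil

# Hypothetical environment variable names; the commit does not name them.
ENV_SSH_COMMAND = "SSHXMLRPC_SSH"
ENV_DEBUG = "SSHXMLRPC_DEBUG"


def debug_enabled(environ=None):
    """Return True if debugging was requested through the environment."""
    environ = os.environ if environ is None else environ
    return bool(environ.get(ENV_DEBUG))


def find_ssh_command(environ=None, which=shutil.which):
    """Return the ssh-like command to run, or None if nothing is found.

    An explicit override from the environment wins; otherwise the
    search path is scanned for "ssh" and then "plink.exe".
    """
    environ = os.environ if environ is None else environ
    override = environ.get(ENV_SSH_COMMAND)
    if override:
        return override
    for candidate in ("ssh", "plink.exe"):
        path = which(candidate)
        if path:
            return path
    return None


class LineHistory:
    """Remember only the most recent lines read from the peer."""

    def __init__(self, size=5):
        # A bounded deque silently drops the oldest line when full.
        self._lines = collections.deque(maxlen=size)

    def record(self, line):
        self._lines.append(line.rstrip("\n"))

    def dump(self):
        return list(self._lines)
```

Passing `which` as a parameter is only there so the lookup can be exercised without a real ssh installation; a bounded deque keeps the "last five lines" bookkeeping to a single append per line read.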
    • Timothy Stack
    • Leigh B. Stoller authored · 0749ef9c
      Such a brutal ElabinElab hack ... When trying to swapin an actual
      experiment from the web interface, I ran into another control network
      problem, this time in bootinfo. When a node is sitting free, it waits
      in pxeboot for a bootinfo packet from boss to tell it what to do (this
      is different than when the node is allocated, and bootinfo tells it
      what to do in a reply to the initial request). In the PXEWAIT case, we
      *send* it a packet, addressed to its *control network* address, which
      in the inner DB, is on the inner control network, but of course PXE is
      really using the outer control network, so packets addressed to inner
      control network are never seen by pxeboot.
      
      This is the only (known) case of this happening, and rather than try
      for some general, over-engineered solution, I did something unusual,
      and put in a hack, ifdefed for ELABINELAB (meaning, it's an inner
      elab). I know, you're thinking, how could he have done such a thing,
      it's so unlike him!
      
      Well, it was damn easy! Anyway, this little hack checks the DB for an
      interface tagged as role='outer_ctrl' and uses that IP instead of the
      inner control network. When I create the inner DB from the outer DB, I
      was already leaving the outer control network in place so that
      bootinfo could find the proper node (again, cause the bootinfo request
      packets are coming from the outer control network, and so its IP would
      not match any nodes in the DB).
      
      I'd like to say that this is the last problem with swapin, but I see
      in my other window that the event scheduler failed to start on inner
      ops with some silly ssh permission denied error. What's that all
      about?
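The address selection described in the commit above can be sketched as follows. This is an illustration under stated assumptions, not the bootinfo code: `pxewait_target_ip` is a hypothetical name, the list of dicts stands in for rows from the DB, the role name `ctrl` for the inner control network is assumed, and falling back to it when no `outer_ctrl` interface exists is also an assumption.

```python
def pxewait_target_ip(interfaces, elabinelab):
    """Pick the address a PXEWAIT bootinfo packet should be sent to.

    `interfaces` is a list of dicts with 'role' and 'ip' keys, a
    stand-in for interface rows in the DB; the real schema may differ.
    """

    def ip_for(role):
        for iface in interfaces:
            if iface["role"] == role:
                return iface["ip"]
        return None

    if elabinelab:
        # In an inner elab, pxeboot listens on the outer control network,
        # so prefer the interface tagged role='outer_ctrl'.
        outer = ip_for("outer_ctrl")
        if outer is not None:
            return outer
    # Assumed default: the node's ordinary control network address.
    return ip_for("ctrl")
```

For a node with both interfaces, `pxewait_target_ip(rows, True)` returns the `outer_ctrl` address, while `pxewait_target_ip(rows, False)` returns the ordinary control network address.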
    • Leigh B. Stoller authored · cafbb325
      I (finally) have an Elabinelab hack that is too ugly to leave in unless
      it's an inner elab. Define ELABINELAB in the inner defs file. Actual hack
      is coming in a bit.
    • Leigh B. Stoller authored · 3afcab05
      dhclient changes for ElabInElab. The crux of this is that inner
      nodes are treated specially. For inner boss/ops, ignore most of what
      DHCPD returns; we need to do the DHCP so that we know what interface,
      but for the moment stuff is hardwired into /etc/rc.conf when the inner
      boss and ops are created. I can probably fix this up later as needed,
      to be more dynamic for supporting swapout/swapin of an inner emulab,
      but swapout and restore of an inner elab has so many open issues
      that I am not worrying about it now.
      
      For inner nodes, the change is simple: if no hostname is provided, ignore
      the DHCPD reply completely, favoring a full reply from the inner
      control network, and returning -1 from the exit hook so that dhclient
      keeps trying in the foreground.
      
      I am committing these so they get into new images.
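The inner-node rule in the commit above amounts to a small decision on each DHCP reply. The sketch below shows only that decision; it is not the actual dhclient exit hook (which would normally be a shell script). The function name `inner_node_exit_status`, the dict representation of a reply, and the use of 0 for "accept" are assumptions made for illustration.

```python
def inner_node_exit_status(reply):
    """Decide what an exit hook would return for one DHCP reply.

    `reply` is a dict of reply fields; only 'hostname' is consulted.
    Returns -1 to ignore the reply so that dhclient keeps trying,
    or 0 (an assumed "accept" status) when a hostname was provided.
    """
    hostname = reply.get("hostname")
    if not hostname:
        # No hostname: treat this as a reply from the wrong DHCPD and
        # wait for a full reply from the inner control network.
        return -1
    return 0
```

A reply without a hostname, such as `{"ip": "192.0.2.10"}`, is rejected, while one that carries a hostname is accepted.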
  2. 28 Oct, 2004 6 commits
  3. 27 Oct, 2004 13 commits
  4. 26 Oct, 2004 10 commits
  5. 25 Oct, 2004 4 commits
    • Mike Hibler
    • Mike Hibler authored · def95827
      Minor side-track: when Jay asked for frisbee numbers I noticed that our
      times have gotten worse since the USENIX paper.  Turns out we were operating
      at lower BW than the paper (62Mb/sec vs. 70Mb/sec) due to clock granularity.
      The disk was falling idle too much.  Cranked it back up to 72Mb/sec for
      "standard" (/usr/testbed) images.  Actually lowered it to about 54Mb/sec
      for "user" images that have to be read across NFS (/proj).
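One way clock granularity can lower a paced sender's bandwidth is sketched below. This is a generic illustration, not frisbee's pacing code: it assumes a sender that emits fixed-size bursts and sleeps between them, with each sleep rounded up to a whole clock tick. The function name `effective_rate` and all parameter values in the usage note are hypothetical.

```python
import math


def effective_rate(target_bps, burst_bits, tick_seconds):
    """Rate achieved when the inter-burst sleep is rounded up to whole ticks.

    target_bps   -- the rate the sender is configured for, in bits/sec
    burst_bits   -- bits sent per burst
    tick_seconds -- clock granularity; sleeps cannot be shorter or finer
    """
    ideal_interval = burst_bits / target_bps
    # The sleep can only end on a tick boundary, so round up.
    ticks = max(1, math.ceil(ideal_interval / tick_seconds))
    return burst_bits / (ticks * tick_seconds)
```

With a hypothetical 16KB burst, a 70Mb/sec target, and a 1ms tick, the ideal interval of about 1.87ms is stretched to 2ms, so `effective_rate(70e6, 16 * 1024 * 8, 0.001)` comes out at roughly 65.5Mb/sec. Under this model, configuring a somewhat higher nominal rate is one way to compensate.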
    • Timothy Stack · d439b3a1
    • Timothy Stack authored · 636aaa2b
      Changes to the "auto nice daemon" so it can work better in Emulab.
      
        * sensors/and/GNUmakefile.in: Emulab-specific make file.  Updated to
          work with a build tree separate from the source and gave it a new
          version number. Files are installed under "/usr/testbed/" on ops.
      
        * sensors/and/Makefile: Add a warning that this is not the real
          makefile for us.
      
        * sensors/and/and-OpenBSD.c: Update to work with FreeBSD and add
          support for reporting process start time.
      
        * sensors/and/and-emulab.conf.in: Emulab-specific configuration,
          similar to the standard one, except it sends mail to tbops when it
          does something.
      
        * sensors/and/and-emulab.priorities: Emulab-specific priorities
          database. It excludes daemon pseudo users and the event-scheduler;
          otherwise, niceness levels apply to everyone.
      
        * sensors/and/and.8.man: Add the pid file to the 'FILES' section.
      
        * sensors/and/and.c: Added support for running a command when a
          niceness level change occurs.  Also writes out the p...