1. 13 Dec, 2004 2 commits
    • Russ Fish's avatar
    • Timothy Stack's avatar
      · a83bc7d2
      Timothy Stack authored
      Start of an applet/php hack for showing mote LED status on web pages:
      
      	* www/BlinkenLichten.java, www/BlinkenLichten.class: Applet that
      	connects back to the server and changes its color based on output
      	from a php page.
      
      	* www/GNUmakefile.in: Install java classes.
      
      	* www/ledpipe.php3: Draft page for driving the BlinkenLichten
      	applet, it just alternates on/off and doesn't read the real
      	status.
      
      	* www/showstuff.php3: Add a function that outputs the html magic
      	to embed the applet.
      a83bc7d2
  2. 09 Dec, 2004 1 commit
  3. 01 Oct, 2004 1 commit
    • Mike Hibler's avatar
      Oft-desired (by me anyway) hack to print out most recent log entry in · 24938e57
      Mike Hibler authored
      the hwdown experiment listing.  Just a simple hack I though.  Ye Gads!
      That was painful.  On the plus side, I now know the different between all
      those SQL join types :-)
      
      Anyway, added a showlastlog variable in the SHOWNODES function.  Currently,
      it only gets set if the pid/eid is emulab-ops/hwdown.  Eventually it could
      be a parameter so it could be selected for any experiment.
      24938e57
  4. 09 Sep, 2004 1 commit
  5. 26 Jul, 2004 1 commit
  6. 07 Jul, 2004 1 commit
  7. 04 Jun, 2004 1 commit
  8. 18 May, 2004 1 commit
    • Leigh B. Stoller's avatar
      Two wireless changes: · 5713fac4
      Leigh B. Stoller authored
      * On the shownode page include the channels that a wireless interface
        is currently set to, by looking at the interface_settings table. So,
        if you click on an allocated node in the wireless map, you can see
        what channels it is using.
      
        Still to be done; map between a channel number and a frequency.
      
      * On the floormap page, create a small table at the top of the page,
        listing what frequencies are in use on each floor. This is rather
        primitive of course, especially the part where I assume we have just
        one building, but it will do for now.
      
      This is not complete. We probably need some sense of scale on the
      maps.
      5713fac4
  9. 12 Apr, 2004 1 commit
    • Leigh B. Stoller's avatar
      Minor change to SHOWNODE() routine, which is called from various · f99a377a
      Leigh B. Stoller authored
      places. Change $short optional argument to a flags argument, and add a
      NOPERM flag, which allows display of a node to people without
      permission to view that node. The display is cut back to node type,
      and a couple of other things that are not private. I also added a
      section on the Interfaces of a node, using the interfaces table for
      the node, joined with the interface_types and interface_capabilities
      tables. This gives more specific information about a node then is
      possible using the generic node_types table.
      f99a377a
  10. 22 Mar, 2004 1 commit
  11. 27 Feb, 2004 1 commit
  12. 09 Feb, 2004 1 commit
  13. 15 Jan, 2004 1 commit
  14. 14 Jan, 2004 1 commit
  15. 12 Jan, 2004 1 commit
    • Leigh B. Stoller's avatar
      Death to proxydhcp; one less specialized daemon. DHCP will return the · 2b2b8ca1
      Leigh B. Stoller authored
      filename to boot, and all local nodes will boot the same pxeboot kernel,
      which has been extended to allow for jumping directly into a specific MFS
      (in addition to the usual testbed boot into a partition or multiboot
      kernel).
      
      Bootinfo and the bootwhat protocol extended to tell the client node what
      MFS to jump into directly, without a reboot. pxe_boot_path and
      next_pxe_boot_path are now deprecated, with bootinfo used to control which
      MFS to boot. Nodes now boot a single pxeboot kernel, and bootinfo tells
      them what to do next.
      
      Bootinfo greatly simplifed. temp_boot_osid has been added to allow for
      temporary booting of different kernels (such as with ndoe_admin or
      create_image). Unlike next_boot_osid which is a one-shot boot,
      temp_boot_osid causes the node to boot that OS until told not too.
      
      next_boot_path and def_boot_path in the nodes table are now ignored.
      Bootinfo gets path info strictly from the os_info table entry for the osid
      given in one of def_boot_osid, temp_boot_osid, or next_boot_osid.  This
      makes the selection of what to do in bootinfo a lot simpler (and for
      TBBootWhat in libdb). The os_info table also modified to include an MFS
      flag so that bootinfo knows to tell the client that the path refers to an
      MFS and not a multiboot kernel.
      
      Change to boot sequence; free nodes no longer boot into the default OSID.
      Instead, they are told to wait in pxeboot until told what to do, which
      will typically be when the node is allocated and a specific OSID
      picked. If the node needs to be reloaded, then the node is told to jump
      directly into the Frisbee MFS, which saves one complete reboot cycle
      whether the node has the requested OS installed, or not.  New program
      added called "bootinfosend" that is used by node_reboot to "wake up" up
      nodes sitting in pxewait mode, so that they query bootinfo again and boot.
      
      node_reboot changed to look at the event state of a node, and use
      bootinfosend to wake up nodes, rather then power cycle, since pxeboot does
      not repsond to pings. Retry (if the UDP packet is lost) is handled by
      stated.
      
      Event support added to bootinfo, to replace the event generation that was
      in proxydhcp. I have not included the caching that Mac had in proxydhcp
      since it does not appear that bootinfo packets are lost very
      often. Cleaned up all of the event and DB queury code to use lib/libtb for
      DB access, and moved all of the event code into a separate file.  The
      event sequence when a node boots now looks like this:
      
      	'SHUTDOWN'    --> 'PXEBOOTING'  (BootInfo)
      	'PXEBOOTING', --> 'PXEBOOTING'  (BootInfo Retry)
      	'PXEBOOTING', --> 'BOOTING'     (Node Not Free)
      	'PXEBOOTING', --> 'PXEWAIT'     (Node is Free)
      	'PXEWAIT',    --> 'PXEWAKEUP'   (Node Allocated)
      	'PXEWAKEUP',  --> 'PXEWAKEUP'   (Bootinfo Retry)
      	'PXEWAKEUP',  --> 'PXEBOOTING'  (Node Woke Up)
      
      Change stated to support resending PXEWAKEUP events when node times out.
      After 3 tries, node is power cycled. Other minor cleanup in stated.
      
      Clean up and simplify os_select, while adding support for temp_next_boot
      and removing all trace of def_boot_path and next_boot_path processing.
      Remove all pxe_boot_path and next_pxe_boot_path processing.  Changed
      command line interface to support "clearing" fields. For example,
      node_admin changed to call os_select like this to have the node
      temporarily boot the FreeBSD MFS:
      
      	os_select -t FREEBSD-MFS pcXXX
      
      which sets temp_boot_osid. To turn admin mode off:
      
      	os_select -c -t pcXXX
      
      which says to clear temp_boot_osid.
      
      sql/database-fill-supplemental.sql modifed to add os_info table
      entries for the FreeBSD, Frisbee, and newnode MFS's.
      
      Be sure to change dhcpd config, restart dhcp, kill proxydhcp, restart
      bootinfo,
      2b2b8ca1
  16. 10 Dec, 2003 1 commit
  17. 17 Nov, 2003 1 commit
    • Leigh B. Stoller's avatar
      Merge the two state machines (batchstate and state) into a single · 2025e0bd
      Leigh B. Stoller authored
      state machine (state). All of the stuff that was previously handled by
      using batchstate is now embedded into the one state machine. Of
      course, these mostly overlapped, so its not that much of a change,
      except that we also redid the machine, adding more states (for
      example, modify phases are now explicit. To get a picture of the
      actual state machine, on boss:
      
      		stategraph -o newstates EXPTSTATE
      		gv newstates.ps
      
      Things to note:
      
      * The "batchstate" slot of the experiments table is now used solely to
        provide a lock for batch daemon. A secondary change will be to
        change the slot name to something more appropriate, but it can
        happen anytime after this new stuff is installed.
      
      * I have left expt_locked for now, but another later change will be to remove
        expt_locked, and change it to active_busy or some such new state name in
        the state machine. I have removed most uses of expt_locked, except those
        that were necessary until there is a new state to replace it.
      
      * These new changes are an implementation of the new state machine,
        but I have not done anything fancy. Most of the code is the same as
        it was before.
      
      * I suspect that there are races with the batch daemon now, but they
        are going to be rare, and the end result is probably that a
        cancelation is delayed a little bit.
      2025e0bd
  18. 05 Nov, 2003 1 commit
  19. 30 Oct, 2003 1 commit
    • Leigh B. Stoller's avatar
      After discussion with Rob and Mac about state machines, make some · da4eab96
      Leigh B. Stoller authored
      temporary changes to avoid user confusion.
      
      * Show just the batchstate variable to mere users.
      
      * Change the label we print from "terminating" to "swapping" and from
        "paused" to "swapped".
      
      When I get some time I will make these changes in the DB and we can
      take out the little bit of code that changes the labels.
      da4eab96
  20. 09 Oct, 2003 1 commit
    • Leigh B. Stoller's avatar
      Reorg of two aspects of node update. · 2641af4d
      Leigh B. Stoller authored
      * install-rpm, install-tarfile, spewrpmtar.php3, spewrpmtar.in: Pumped
        up even more! The db file we store in /var/db now records both the
        timestamp (of the file, or if remote the install time) and the MD5
        of the file that was installed. Locally, we can get this info when
        accessing the file via NFS (copymode on or off). Remote, we use wget
        to get the file, and so pass the timestamp along in the URL request,
        and let spewrpmtar.in determine if the file has changed. If the
        timestamp it gets is >= to the timestamp of the file, an error code
        of 304 (Not Modifed) is returned. Otherwise the file is returned.
      
        If the timestamps are different (remote, server sends back an actual
        file), the MD5 of the file is compared against the value stored. If
        they are equal, update the timestamp in the db file to avoid
        repeated MD5s (or server downloads) in the future. If the MD5 is
        different, then reinstall the tarball or rpm, and update the db file
        with the new timestamp and MD5. Presto, we have auto update capability!
      
        Caveat: I pass along the old MD5 in the URL, but it is currently
        ignored. I do not know if doing the MD5 on the server is a good
        idea, but obviously it is easy to add later. At the moment it
        happens on the node, which means wasted bandwidth when the timestamp
        has changed, but the file has not (probably not something that will
        happen in typical usage).
      
        Caveat: The timestamp used on remote nodes is the time the tarfile
        is installed (GM time of course). We could arrange to return the
        timestamp of the local file back to the node, but that would mean
        complicating the protocol (or using an http header) and I was not in
        the mood for that. In typical usage, I do not think that people will
        be changing tarfiles and rpms so rapidly that this will make a
        difference, but if it does, we can change it.
      
      * node_update.in, client side watchdog, and various web pages:
        Deflated node_update, removing all of the older ssh code. We now
        assume that all nodes will auto update on a periodic basis, via the
        watchdog that runs on all client nodes, including plab nodes.
      
        Changed the permission check to look for new UPDATE permission (used
        to be UPDATEACCOUNT). As before, it requires local_root or better.
        The reason for this is that node_update now implies more than just
        updating the accounts/mounts. The web pages have been changed to
        explain that in addition to mounts/accounts, rpms and tarfiles will
        also be updated. At the moment, this is still tied to a single
        variable (update_accounts) in the nodes table, but as Kirk requested
        at the meeting, it will probably be nice to split these out in the
        future.
      
        Added the ability to node_update a single node in an experiment (in
        addition to all nodes option on the showexp page). This has been
        added to the shownode webpage menu options.
      
        Changed locking code to use the newer wrapper states, and to move
        the experiment to RUNNING_LOCKED until the update completes. This is
        to prevent mayhem in the rest of the system (which could be dealt
        with, but is not worth the trouble; people have to wait until their
        initiated update is complete, before they can swap out the
        experiment).
      
        Added "short" mode to shownode routine, equiv to the recently added
        short mode for showexp. I use this on the confirmation page for
        updating a single node, giving the user a couple of pertinent (feel
        good) facts before they comfirm.
      2641af4d
  21. 07 Oct, 2003 1 commit
  22. 30 Sep, 2003 1 commit
    • Leigh B. Stoller's avatar
      Up to now we have had two state variables associated with an experiment, · 4269dad1
      Leigh B. Stoller authored
      plus a lock field. The lock field was a simple "experiment locked, go away"
      slot that is easy to use when you do not care about the actual state that
      an experiment is in, just that it is in "transition" and should not be
      messed with.
      
      The other two state variables are "state" and "batchstate". The former
      (state) is the original variable that Chris added, and was used by the tb*
      scripts to make sure that the experiment was in the state each particular
      script wanted them to be in. But over time (and with the addition of so
      much wrapper goo around them), "state" has leaked out all over the place to
      determine what operations on an experiment are allowed, and if/when it
      should be displayed in various web pages. There are a set of transition
      states in addition to the usual "active", "swapped", etc like "swapping"
      that make testing state a pain in the butt.
      
      I added the other state variable ("batchstate") when I did the batch
      system, obvious...
      4269dad1
  23. 19 Sep, 2003 1 commit
    • Leigh B. Stoller's avatar
      Redo the Edit Experiment Metadata page; turned it into a standard form · 3f56ba9b
      Leigh B. Stoller authored
      based page that looks like the original Begin Experiment page. Be sure
      to look at the page in both admin mode and non-admin mode since I had
      some trouble determining how swappable is treated these days.
      
      Oh, added the ability to convert non-batch experiments into batch, and
      back. The experiment must be unlocked and in the swapped state to go
      in either direction.
      
      Also added the cpu_usage and mem_usage slots for editing. I added a
      comment about planetlab only, since otherwise we would just confuse
      normal users who have no idea what they mean. I could conditionalize
      them on having plab nodes, but thats difficult to figure out in the
      web page when the experiment is swapped out, so lets not worry about
      it.
      3f56ba9b
  24. 18 Sep, 2003 2 commits
  25. 13 Sep, 2003 1 commit
  26. 12 Sep, 2003 1 commit
  27. 09 Sep, 2003 1 commit
  28. 26 Aug, 2003 2 commits
  29. 25 Aug, 2003 1 commit
    • Austin Clements's avatar
      Added knowledge of Plab nodes. The correct ssh port is now given when · 4da7262a
      Austin Clements authored
      appropriate.  The node status information now reports SSHD port
      separately; doesn't report IP port ranges for Plab nodes; does report
      RPM's for virtual nodes; only reports BIOS version if it's specified
      in the DB; and, if a control net IP isn't found for a node (the case
      with virtnodes), it reports the control net IP of the physical node.
      4da7262a
  30. 20 Aug, 2003 1 commit
  31. 19 Aug, 2003 1 commit
  32. 07 Aug, 2003 1 commit
  33. 05 Aug, 2003 1 commit
    • Leigh B. Stoller's avatar
      The rest of the sync server additions: · 212cc781
      Leigh B. Stoller authored
      * Parser: Added new tb command to set the name of the sync server:
      
      	tb-set-sync-server <node>
      
        This initializes the sync_server slot of the experiment entry to the
        *vname* of the node that should run the sync server for that
        experiment. In other words, the sync server is per-experiment, runs
        on a node in the experiment, and the user gets to chose which node
        it runs on.
      
      * tmcd and client side setup. Added new syncserver command which
        returns the name of the syncserver and whether the requesting node
        is the lucky one to run the daemon:
      
          SYNCSERVER SERVER='nodeG.syncserver.testbed.emulab.net' ISSERVER=1
      
        The name of the syncserver is written to /var/emulab/boot/syncserver
        on the nodes so that clients can easily figure out where the server
        is.
      
        Aside: The ready bits are now ignored (no DB accesses are made) for
        virtual nodes; they are forced to use the new sync server.
      
      * New os/syncd directory containing the daemon and the client. The
        daemon is pretty simple. It waits for TCP (and UDP, although that
        path is not complete yet) connections, and reads in a little
        structure that gives the name of the "barrier" to wait for, and an
        optional count of clients in the group (this would be used by the
        "master" who initializes barriers for clients). The socket is saved
        (no reply is made, so the client is blocked) until the count reaches
        zero. Then all clients are released by writting back to the
        sockets, and the sockets are closed. Obviously, the number of
        clients is limited by the numbed of FDs (open sockets), hence the
        need for a UDP variant, but that will take more work.
      
        The client has a simple command line interface:
      
          usage: emulab-sync [options]
          -n <name>         Optional barrier name; must be less than 64 bytes long
          -d                Turn on debugging
          -s server         Specify a sync server to connect to
          -p portnum        Specify a port number to connect to
          -i count          Initialize named barrier to count waiters
          -u                Use UDP instead of TCP
      
          The client figures out the server by looking for the file created
          above by libsetup (/var/emulab/boot/syncserver). If you do not
          specify a barrier "name", it uses an internal default. Yes, the
          server can handle multiple barriers (differently named of course)
          at once (non-overlapping clients obviously).
      
          Clients can wait before a barrier in "initialized." The count on
          the barrier just goes negative until someone initializes the
          barrier using the -i option, which increments the count by the
          count. Therefore, the master does not have to arrange to get there
          "first." As an example, consider a master and one client:
      
      	nodeA> /usr/local/etc/emulab/emulab-sync -n mybarrier
      	nodeB> /usr/local/etc/emulab/emulab-sync -n mybarrier -i 1
      
          Node A waits until Node B initializes the barrier (gives it a
          count).  The count is the number of *waiters*, not including the
          master. The master is also blocked until all of the waiters have
          checked in.
      
          I have not made an provision for timeouts or crashed clients. Lets
          see how it goes.
      212cc781
  34. 28 Jul, 2003 1 commit
  35. 16 Jul, 2003 1 commit
  36. 09 Jul, 2003 1 commit
  37. 27 Jun, 2003 1 commit