1. 03 Nov, 2004 1 commit
  2. 05 Oct, 2004 1 commit
  3. 29 Jul, 2004 1 commit
  4. 14 Jul, 2004 1 commit
  5. 24 Jun, 2004 1 commit
    • Improve the client-side install · 976133e4
      Mike Hibler authored
      With these changes, it should now be possible to:
      
      	gmake client
      	sudo gmake client-install
      
      on FBSD4, FBSD5, RHL7.3, and RHL9.0 client nodes.
      
      There are still some dependencies that are not explicit and would
      prevent a build/install from working on a "clean" OS. Two that I know
      of are: you must install our version of the elvin libraries, and you
      must install boost.
  6. 01 Jun, 2004 1 commit
  7. 25 May, 2004 1 commit
  8. 10 May, 2004 1 commit
  9. 26 Apr, 2004 1 commit
    • Cleanup Makefiles: · 297019fb
      Mike Hibler authored
      1. "make clean" will just remove stuff built in the process of a regular build
      2. "make distclean" will also clean out configure generated files.
      
      This is how it was always supposed to be; there was just some bitrot.
  10. 09 Oct, 2003 1 commit
    • Reorg of two aspects of node update. · 2641af4d
      Leigh Stoller authored
      * install-rpm, install-tarfile, spewrpmtar.php3, spewrpmtar.in: Pumped
        up even more! The db file we store in /var/db now records both the
        timestamp (of the file or, if remote, the install time) and the MD5
        of the file that was installed. Locally, we can get this info when
        accessing the file via NFS (copymode on or off). Remotely, we use
        wget to get the file, and so pass the timestamp along in the URL
        request and let spewrpmtar.in determine whether the file has
        changed. If the timestamp it gets is >= the timestamp of the file, a
        status code of 304 (Not Modified) is returned. Otherwise the file is
        returned.
      
        If the timestamps differ (remote case: the server sends back an
        actual file), the MD5 of the file is compared against the stored
        value. If they are equal, we update the timestamp in the db file to
        avoid repeated MD5s (or server downloads) in the future. If the MD5
        is different, we reinstall the tarball or rpm and update the db file
        with the new timestamp and MD5. Presto, we have auto update
        capability! (The client-side check is sketched after this item.)
      
        Caveat: I pass along the old MD5 in the URL, but it is currently
        ignored. I do not know if doing the MD5 on the server is a good
        idea, but obviously it is easy to add later. At the moment it
        happens on the node, which means wasted bandwidth when the timestamp
        has changed, but the file has not (probably not something that will
        happen in typical usage).
      
        Caveat: The timestamp used on remote nodes is the time the tarfile
        is installed (GM time of course). We could arrange to return the
        timestamp of the local file back to the node, but that would mean
        complicating the protocol (or using an http header) and I was not in
        the mood for that. In typical usage, I do not think that people will
        be changing tarfiles and rpms so rapidly that this will make a
        difference, but if it does, we can change it.
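
        For illustration, a rough Python sketch of that client-side check
        (the real code lives in install-tarfile/install-rpm; the
        query-string format and the helper names here are invented):

            import hashlib
            import time
            import urllib.error
            import urllib.request

            def check_and_install(url, stored_stamp, stored_md5, install):
                # One update pass for a single rpm/tarball. Returns the
                # (timestamp, md5) pair to write back to the /var/db file.
                # install() is a caller-supplied function that unpacks
                # the bits.
                req = url + "&timestamp=%d" % stored_stamp  # invented format
                try:
                    data = urllib.request.urlopen(req).read()
                except urllib.error.HTTPError as e:
                    if e.code == 304:           # Not Modified: nothing to do
                        return stored_stamp, stored_md5
                    raise
                new_md5 = hashlib.md5(data).hexdigest()
                if new_md5 == stored_md5:
                    # Timestamp changed but contents did not: refresh the
                    # stamp to avoid repeated downloads/MD5s in the future.
                    return int(time.time()), stored_md5
                install(data)                    # reinstall the tarball/rpm
                return int(time.time()), new_md5 # record new stamp and MD5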
      
      * node_update.in, client side watchdog, and various web pages:
        Deflated node_update, removing all of the older ssh code. We now
        assume that all nodes will auto update on a periodic basis, via the
        watchdog that runs on all client nodes, including plab nodes.
      
        Changed the permission check to look for the new UPDATE permission
        (it used to be UPDATEACCOUNT). As before, it requires local_root or
        better. The reason for this is that node_update now implies more
        than just updating the accounts/mounts. The web pages have been
        changed to explain that in addition to mounts/accounts, rpms and
        tarfiles will also be updated. At the moment, this is still tied to
        a single variable (update_accounts) in the nodes table, but as Kirk
        requested at the meeting, it would probably be nice to split these
        out in the future.
      
        Added the ability to node_update a single node in an experiment (in
        addition to all nodes option on the showexp page). This has been
        added to the shownode webpage menu options.
      
        Changed the locking code to use the newer wrapper states, and to
        move the experiment to RUNNING_LOCKED until the update completes.
        This is to prevent mayhem in the rest of the system (which could be
        dealt with, but is not worth the trouble; people have to wait until
        their initiated update is complete before they can swap out the
        experiment).
      
        Added "short" mode to shownode routine, equiv to the recently added
        short mode for showexp. I use this on the confirmation page for
        updating a single node, giving the user a couple of pertinent (feel
        good) facts before they comfirm.
  11. 03 Oct, 2003 1 commit
  12. 17 Sep, 2003 1 commit
  13. 05 Aug, 2003 1 commit
    • The rest of the sync server additions: · 212cc781
      Leigh Stoller authored
      * Parser: Added new tb command to set the name of the sync server:
      
      	tb-set-sync-server <node>
      
        This initializes the sync_server slot of the experiment entry to the
        *vname* of the node that should run the sync server for that
        experiment. In other words, the sync server is per-experiment, runs
        on a node in the experiment, and the user gets to choose which node
        it runs on.
      
      * tmcd and client side setup. Added new syncserver command which
        returns the name of the syncserver and whether the requesting node
        is the lucky one to run the daemon:
      
          SYNCSERVER SERVER='nodeG.syncserver.testbed.emulab.net' ISSERVER=1
      
        The name of the syncserver is written to /var/emulab/boot/syncserver
        on the nodes so that clients can easily figure out where the server
        is (parsing that response is sketched after this item).
      
        Aside: The ready bits are now ignored (no DB accesses are made) for
        virtual nodes; they are forced to use the new sync server.
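
        As a hypothetical illustration of that client-side step, in Python
        (the real parsing happens in libsetup; the function name is made
        up):

            import re

            def parse_syncserver(line):
                # e.g. SYNCSERVER SERVER='nodeG...' ISSERVER=1
                m = re.match(r"SYNCSERVER SERVER='([^']*)' ISSERVER=(\d)",
                             line)
                if m is None:
                    raise ValueError("malformed SYNCSERVER response: " + line)
                server, isserver = m.group(1), m.group(2) == "1"
                # Stash the name where clients can find it later.
                with open("/var/emulab/boot/syncserver", "w") as f:
                    f.write(server + "\n")
                return server, isserver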
      
      * New os/syncd directory containing the daemon and the client. The
        daemon is pretty simple. It waits for TCP (and UDP, although that
        path is not complete yet) connections, and reads in a little
        structure that gives the name of the "barrier" to wait for, and an
        optional count of clients in the group (this would be used by the
        "master" who initializes barriers for clients). The socket is saved
        (no reply is made, so the client stays blocked) until the count
        reaches zero. Then all clients are released by writing back to the
        sockets, and the sockets are closed. Obviously, the number of
        clients is limited by the number of FDs (open sockets), hence the
        need for a UDP variant, but that will take more work. (The barrier
        bookkeeping is sketched at the end of this item.)
      
        The client has a simple command line interface:
      
          usage: emulab-sync [options]
          -n <name>         Optional barrier name; must be less than 64 bytes long
          -d                Turn on debugging
          -s server         Specify a sync server to connect to
          -p portnum        Specify a port number to connect to
          -i count          Initialize named barrier to count waiters
          -u                Use UDP instead of TCP
      
          The client figures out the server by looking for the file created
          above by libsetup (/var/emulab/boot/syncserver). If you do not
          specify a barrier "name", it uses an internal default. Yes, the
          server can handle multiple barriers (differently named of course)
          at once (non-overlapping clients obviously).
      
          Clients can wait on a barrier before it is initialized. The count
          on the barrier just goes negative until someone initializes the
          barrier using the -i option, which bumps the count up by the given
          amount. Therefore, the master does not have to arrange to get
          there "first." As an example, consider a master and one client:
      
      	nodeA> /usr/local/etc/emulab/emulab-sync -n mybarrier
      	nodeB> /usr/local/etc/emulab/emulab-sync -n mybarrier -i 1
      
          Node A waits until Node B initializes the barrier (gives it a
          count).  The count is the number of *waiters*, not including the
          master. The master is also blocked until all of the waiters have
          checked in.
      
          I have not made any provision for timeouts or crashed clients.
          Let's see how it goes.
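
        A minimal sketch of the daemon's barrier bookkeeping, in Python (the
        real daemon is C and simply holds the raw sockets in a single
        thread; the wire format and port handling here are invented):

            import socket
            import struct
            import threading

            # Invented wire format: a 64-byte barrier name plus a signed
            # 32-bit count (> 0 means "initialize with this many waiters",
            # 0 means "wait").
            REQ = struct.Struct("!64si")

            barriers = {}                  # name -> [count, Condition]
            biglock = threading.Lock()

            def recv_exact(conn, n):
                buf = b""
                while len(buf) < n:
                    chunk = conn.recv(n - len(buf))
                    if not chunk:
                        raise ConnectionError("client went away")
                    buf += chunk
                return buf

            def handle(conn):
                name, count = REQ.unpack(recv_exact(conn, REQ.size))
                name = name.rstrip(b"\0")
                with biglock:
                    barrier = barriers.setdefault(
                        name, [0, threading.Condition()])
                cv = barrier[1]
                with cv:
                    if count > 0:
                        barrier[0] += count  # master; may release at once
                    else:
                        barrier[0] -= 1      # waiter; may go negative
                    if barrier[0] == 0:
                        cv.notify_all()      # everyone has checked in
                        with biglock:
                            barriers.pop(name, None)
                    else:
                        while barrier[0] != 0:
                            cv.wait()        # blocked until count is zero
                conn.sendall(b"OK\n")        # the reply releases the client
                conn.close()

            def serve(port):                 # the real port is not shown
                sock = socket.socket()
                sock.setsockopt(socket.SOL_SOCKET, socket.SO_REUSEADDR, 1)
                sock.bind(("", port))
                sock.listen(64)
                while True:
                    conn, _ = sock.accept()
                    threading.Thread(target=handle, args=(conn,),
                                     daemon=True).start()
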
  14. 18 Dec, 2002 1 commit
  15. 27 Nov, 2002 1 commit
  16. 23 Nov, 2002 1 commit
  17. 07 Jul, 2002 1 commit
  18. 21 Apr, 2002 1 commit
  19. 14 Jan, 2002 1 commit
    • Make Frisbee.Redux live: · d08b5e41
      Leigh Stoller authored
      * Add appropriate goo to os/GNUMakefile so that Frisbee daemon is
        built and installed.
      
      * Rework the frisbee launcher slightly. Aside from little changes
        (send email to tbops when frisbeed dies, new cmdline syntax to
        frisbeed), allow frisbeed to exit gracefully after a period of
        inactivity (no client requests for 30 minutes, at present). In order
        to prevent a race condition between a new client being added (and
        rebooted) and frisbeed terminating before the client gets started,
        add a load_busy indicator to the images table (next to the
        load_address slot) and set it to one each time frisbeelauncher is
        invoked. When frisbeed exits, test and clear that bit atomically
        (lock tables) and go around another time (restart frisbeed for
        another 30 minute period). (This loop is sketched below.)
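
        A sketch of that loop in Python pseudocode (db.query(),
        start_frisbeed(), and frisbeed_running() are hypothetical stand-ins
        for the real helpers, and the exact bookkeeping is guessed from the
        description above):

            def frisbee_launcher(db, imageid, start_frisbeed,
                                 frisbeed_running):
                # Every invocation records interest in the image.
                db.query("UPDATE images SET load_busy=1 WHERE imageid=%s",
                         imageid)
                if frisbeed_running(imageid):
                    return  # the owning instance will see the bit and loop
                # We own the daemon: clear our own mark and start serving.
                db.query("UPDATE images SET load_busy=0 WHERE imageid=%s",
                         imageid)
                busy = True
                while busy:
                    start_frisbeed(imageid)  # returns after 30 idle minutes
                    # Test and clear atomically (lock tables): if a client
                    # showed up while frisbeed was idling out, go around.
                    db.query("LOCK TABLES images WRITE")
                    busy = db.query(
                        "SELECT load_busy FROM images WHERE imageid=%s",
                        imageid)[0][0]
                    db.query("UPDATE images SET load_busy=0 "
                             "WHERE imageid=%s", imageid)
                    db.query("UNLOCK TABLES")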
      
      * Rework waitmode in os_load. Wait for all of the nodes to finish at
        once, and track which nodes never finish. Retry those nodes by
        rebooting them. The number of retries is configurable in the script,
        and is currently set to one. This should take care of some PXE boot
        related problems, although obviously not all. (A sketch follows.)
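
        Roughly, as a sketch (wait_until_done() and reboot() stand in for
        the real state-machine plumbing):

            def os_load_wait(nodes, wait_until_done, reboot, retries=1):
                # Wait for every node at once; wait_until_done() returns
                # the set of nodes that did not finish in time.
                pending = wait_until_done(set(nodes))
                for _ in range(retries):
                    if not pending:
                        break
                    for node in pending:   # e.g. wedged in PXE boot
                        reboot(node)
                    pending = wait_until_done(pending)
                return sorted(pending)     # nodes that never finished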
      
      * Got rid of -w option to os_load and made waitmode the default. The
        -s option can be used to start a reload, but not to wait for it to
        complete.
      
      * Minor changes to sched_reload and reload_daemon; pass in -s option
        to os_load.
  20. 01 Aug, 2001 1 commit
    • An attempt at making image creation an easy/automatic operation. HA! · 27f26d99
      Leigh Stoller authored
      This uses the PXE-booted FreeBSD kernel and MFS. In addition, I use
      the standard testbed mechanism of specifying a startup command to
      run, which will do the imagezip to the NFS-mounted /proj/<pid>/....
      The controlling script on paper sets up the database, reboots the
      node, and then waits for the startstatus to change. Then it resets
      the DB and reboots the node so that it returns back to its normal OS.
      The format of the operation is:
      
      	create_image <node> <imageid> <filename>
      
      Node must be under the user's control of course. The filename must
      reside in the node's project (/proj/<pid>/whatever) since that's the
      directory that is mounted by the testbed config software when the
      machine boots. The imageid must already exist in the DB, and is used
      to determine what part of the disk to zip up (say, using the slice
      option to the zipper). Since this operation is rather time consuming,
      it does the usual trick of going to background and sending email
      status later. (The control flow is sketched below.)
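
      The flow, as a sketch in Python (db and the reboot/wait helpers are
      hypothetical stand-ins for the real testbed primitives, and the
      imagezip invocation is only illustrative):

          import sys

          def create_image(node, imageid, filename, db, reboot,
                           wait_startstatus):
              # Which part of the disk to zip up comes from the imageid.
              disk_slice = db.lookup_image_slice(imageid)
              # Boot the node into the PXE/MFS kernel with a startup
              # command that zips the slice to the NFS-mounted file.
              db.set_startup_command(node,
                  "imagezip -s %d /dev/ad0 %s" % (disk_slice, filename))
              db.set_boot_mfs(node)
              reboot(node)
              status = wait_startstatus(node)  # blocks until it changes
              # Reset the DB and send the node back to its normal OS.
              db.clear_startup_command(node)
              db.clear_boot_mfs(node)
              reboot(node)
              if status != 0:
                  sys.exit("imagezip exited with status %d" % status)
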
  21. 16 May, 2001 1 commit
  22. 09 Apr, 2001 1 commit
  23. 01 Mar, 2001 1 commit
  24. 04 Jan, 2001 1 commit
  25. 03 Jan, 2001 1 commit
  26. 02 Jan, 2001 1 commit
  27. 13 Dec, 2000 1 commit
  28. 01 Dec, 2000 1 commit
  29. 25 Aug, 2000 1 commit
  30. 17 Aug, 2000 1 commit