os/syncd/emulab-sync.c · 212cc781e2af09d5bd0da28da6be0399762dfbca · emulab / emulab-devel

The rest of the sync server additions:
Leigh B. Stoller authored Aug 05, 2003
* Parser: Added new tb command to set the name of the sync server:

	tb-set-sync-server <node>

  This initializes the sync_server slot of the experiment entry to the
  *vname* of the node that should run the sync server for that
  experiment. In other words, the sync server is per-experiment, runs
  on a node in the experiment, and the user gets to choose which node
  it runs on.

* tmcd and client side setup. Added new syncserver command which
  returns the name of the syncserver and whether the requesting node
  is the lucky one to run the daemon:

    SYNCSERVER SERVER='nodeG.syncserver.testbed.emulab.net' ISSERVER=1

  The name of the syncserver is written to /var/emulab/boot/syncserver
  on the nodes so that clients can easily figure out where the server
  is.

  Aside: The ready bits are now ignored (no DB accesses are made) for
  virtual nodes; they are forced to use the new sync server.
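
  The SYNCSERVER reply line above can be split into its two fields
  with a few lines of code. This is only a minimal sketch, not the
  client-side setup code itself: the function name is hypothetical,
  and the exact grammar of the reply (single-quoted name, numeric
  ISSERVER flag) is assumed from the one example shown.

```python
import re

# Pattern assumed from the single example reply in the commit message.
_REPLY = re.compile(r"SYNCSERVER\s+SERVER='([^']*)'\s+ISSERVER=(\d+)")


def parse_syncserver_reply(line):
    """Return (server_name, is_server) for one SYNCSERVER reply line.

    Hypothetical helper; raises ValueError if the line does not match
    the assumed format.
    """
    match = _REPLY.fullmatch(line.strip())
    if match is None:
        raise ValueError("unrecognized SYNCSERVER reply: %r" % (line,))
    # ISSERVER is treated as a boolean flag: any nonzero value means
    # the requesting node should run the daemon.
    return match.group(1), int(match.group(2)) != 0
```

  The returned name is what would be written to
  /var/emulab/boot/syncserver, and the flag tells the node whether to
  start the daemon.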

* New os/syncd directory containing the daemon and the client. The
  daemon is pretty simple. It waits for TCP (and UDP, although that
  path is not complete yet) connections, and reads in a little
  structure that gives the name of the "barrier" to wait for, and an
  optional count of clients in the group (this would be used by the
  "master" who initializes barriers for clients). The socket is saved
  (no reply is made, so the client is blocked) until the count reaches
  zero. Then all clients are released by writing back to the
  sockets, and the sockets are closed. Obviously, the number of
  clients is limited by the number of FDs (open sockets), hence the
  need for a UDP variant, but that will take more work.
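
  To make the "little structure" concrete, here is a sketch of how a
  request carrying a barrier name and a count could be packed and
  unpacked. The layout is an assumption for illustration only (a
  64-byte NUL-padded name followed by a signed 32-bit count in network
  byte order); the daemon's actual structure may well differ, and the
  function names are hypothetical.

```python
import struct

# Assumed layout, NOT taken from the daemon source:
#   64-byte NUL-padded barrier name, then a signed 32-bit count,
#   network byte order, no padding.
REQUEST_FORMAT = "!64si"
MAX_NAME_BYTES = 64


def pack_request(name, count=0):
    """Encode a barrier request into the assumed fixed-size layout."""
    raw = name.encode("ascii")
    # Leave room for at least one terminating NUL byte.
    if len(raw) >= MAX_NAME_BYTES:
        raise ValueError("barrier name must be less than 64 bytes long")
    return struct.pack(REQUEST_FORMAT, raw, count)


def unpack_request(data):
    """Decode a request produced by pack_request."""
    raw, count = struct.unpack(REQUEST_FORMAT, data)
    name = raw.split(b"\0", 1)[0].decode("ascii")
    return name, count
```

  A fixed-size record keeps the read on the server side trivial: it
  can read exactly struct.calcsize(REQUEST_FORMAT) bytes and then
  park the socket.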

  The client has a simple command line interface:

    usage: emulab-sync [options]
    -n <name>         Optional barrier name; must be less than 64 bytes long
    -d                Turn on debugging
    -s server         Specify a sync server to connect to
    -p portnum        Specify a port number to connect to
    -i count          Initialize named barrier to count waiters
    -u                Use UDP instead of TCP

    The client figures out the server by looking for the file created
    above by libsetup (/var/emulab/boot/syncserver). If you do not
    specify a barrier "name", it uses an internal default. Yes, the
    server can handle multiple barriers (differently named of course)
    at once (non-overlapping clients obviously).
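
    The server lookup described above amounts to something like the
    following sketch. The function name is hypothetical, and two
    details are assumptions rather than facts from the client source:
    that an explicit -s value takes precedence over the file, and that
    the file holds just the server name, possibly followed by
    whitespace.

```python
DEFAULT_SERVER_FILE = "/var/emulab/boot/syncserver"


def find_sync_server(explicit=None, path=DEFAULT_SERVER_FILE):
    """Pick the sync server name for a client.

    `explicit` stands in for the value of the -s option; if it is not
    given, fall back to the name written to `path` at boot.
    """
    if explicit:
        return explicit
    with open(path) as handle:
        server = handle.read().strip()
    if not server:
        raise RuntimeError("no sync server name found in %s" % path)
    return server
```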

    Clients can wait before a barrier is "initialized." The count on
    the barrier just goes negative until someone initializes the
    barrier using the -i option, which increments the barrier's count
    by the given count. Therefore, the master does not have to arrange
    to get there "first." As an example, consider a master and one
    client:

	nodeA> /usr/local/etc/emulab/emulab-sync -n mybarrier
	nodeB> /usr/local/etc/emulab/emulab-sync -n mybarrier -i 1

    Node A waits until Node B initializes the barrier (gives it a
    count).  The count is the number of *waiters*, not including the
    master. The master is also blocked until all of the waiters have
    checked in.
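
    The count bookkeeping behind this example can be modeled in a few
    lines. This is an in-memory sketch, not the daemon: the class and
    method names are hypothetical, client names stand in for saved
    sockets, and it assumes each ordinary waiter lowers the count by
    one while the master raises it by its -i value.

```python
class Barrier:
    """State for one named barrier."""

    def __init__(self):
        self.count = 0      # goes negative if waiters arrive first
        self.blocked = []   # stand-ins for the saved client sockets


class SyncTable:
    """Toy model of the per-barrier counting described above."""

    def __init__(self):
        self.barriers = {}

    def arrive(self, name, client, init_count=None):
        """Record one arrival and return the clients released by it.

        init_count is None for an ordinary waiter, or the -i value
        for the master. An empty list means everyone stays blocked.
        """
        barrier = self.barriers.setdefault(name, Barrier())
        if init_count is None:
            barrier.count -= 1            # a waiter checks in
        else:
            barrier.count += init_count   # master supplies waiter count
        barrier.blocked.append(client)    # master blocks too
        if barrier.count != 0:
            return []
        # Count reached zero: release everyone and forget the barrier.
        del self.barriers[name]
        return barrier.blocked
```

    With this model, nodeA arriving first leaves the count at -1 and
    nothing is released; nodeB's "-i 1" brings it back to zero and
    both are released. The reverse order gives the same result, which
    is why the master does not need to get there first.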

    I have not made any provision for timeouts or crashed clients. Let's
    see how it goes.