.\"
.\" EMULAB-COPYRIGHT
.\" Copyright (c) 2004 University of Utah and the Flux Group.
.\" All rights reserved.
.\"
.TH EMULAB-SYNC 1 "April 5, 2004" "Emulab" "Emulab Commands Manual"
.OS
.SH NAME
emulab-sync \- Simple distributed synchronization client.
.SH SYNOPSIS
.BI emulab-sync
[\fB-hV\fR]
[\fB-n \fIname\fR]
[\fB-d\fR]
[\fB-s \fIserver\fR]
[\fB-p \fIport-number\fR]
[\fB-i \fIcount\fR]
[\fB-a\fR]
[\fB-u\fR]
[\fB-e \fIerror-number\fR]
[\fB-m\fR]
.SH DESCRIPTION
The
.B emulab-sync
utility is used to wait for other programs in your experiment to reach some
specific point before proceeding.  This utility and its server,
.B emulab-syncd\fR,
implement this synchronization through a simple counter based barrier.  The
counter is increased by
.B emulab-sync
when using the
.B -i
option and then decremented by one on each regular use.  Once the counter
reaches zero, all of the waiters will return and execution can proceed.
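.PP
As a rough illustration of the arithmetic only (a toy model, not the
.B emulab-syncd
implementation), the following sketch computes the counter value after one
.B -i
call and some number of regular calls.  The helper name
.I barrier_counter
is hypothetical.
.PP
.nf
```shell
#!/bin/sh
# Hypothetical helper, not part of emulab-sync: prints the counter value
# after one "emulab-sync -i count" call and "arrivals" regular calls.
barrier_counter() {
    count=$1
    arrivals=$2
    echo $((count - arrivals))
}

# Four nodes: one runs "emulab-sync -i 3", the other three run "emulab-sync".
barrier_counter 3 3    # prints 0: the barrier is released
barrier_counter 3 2    # prints 1: one node has not arrived yet
```
.fi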
.P
Multiple synchronizations can take place at the same time by using the
.B -n
option to specify a separate name for each barrier.  For example, scripts on
nodeA and nodeB can be waiting on a barrier named "foo" while (other) scripts
on nodeA and nodeC can be waiting on a barrier named "bar." You may reuse an
existing barrier (including the default barrier) once it has been released (all
clients arrived and woken up).
.P
Basic error reporting is also supported using the
.B -e
option and the program's exit code.  For example, if a client encountered an
error and would be unable to continue after crossing the barrier, it can signal
its peers by using the
.B -e
option with a non-zero error code.  Then, when all of the clients have arrived,
they will wake up and exit with this non-zero error code.  If multiple peers
encounter errors, the maximum of all of the errors will be returned.
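.PP
The aggregation rule can be sketched as follows.  This only models the
behavior described above; the helper name
.I max_error_code
is hypothetical and the server's actual code may differ.
.PP
.nf
```shell
#!/bin/sh
# Hypothetical helper, not part of emulab-sync: prints the largest of the
# error codes given as arguments, or 0 when none are given.
max_error_code() {
    max=0
    for code in "$@"; do
        if [ "$code" -gt "$max" ]; then
            max=$code
        fi
    done
    echo "$max"
}

# Three clients arrive with "-e 0", "-e 100" and "-e 3".
max_error_code 0 100 3    # prints 100, the exit code every client would see
```
.fi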
.P
Available options:
.P
.TP
\fB-h
Print out a usage message.
.TP
\fB-V
Print out version information and exit.
.TP
\fB-n \fIname
The name of the barrier, which must be less than 64 bytes long.  (Default: barrier)
.TP
\fB-d
Turn on debugging.
.TP
\fB-s \fIserver
Specify the host where the
.B emulab-syncd
server is running.  Normally, you do not need to use this option since Emulab
will automatically use the experiment's default sync server.  The experiment's
sync server is picked at random or specified by the
.B tb-set-sync-server
NS command when creating the experiment.
.TP
\fB-p \fIport-number
Specify the port number the server is listening on.  (Default: 16534)
.TP
\fB-i \fIcount
Increase the number of waiters by
.IR count .
.B NOTE\fR:
The number of waiters should not include this call, unless you are using the
.B -a
option and intend to do another sync later on.
.TP
\fB-a
Return immediately after increasing the barrier value.  Can only be used 
with the
.B -i
option.
.TP
\fB-u
Use UDP to contact the server.  (Default: TCP)
.TP
\fB-e \fIerror-number
Specify the error code for this client.  The value must be greater than or
equal to zero (zero indicating no error) and less than 240 because we reserve
the range [240, 256)
for internal use.  The error codes for all of the clients are collected by the
server and the maximum is returned by the clients when they exit.  (Default: 0)
.TP
\fB-m
Check if this node is where the default sync server is running and then
immediately exit.  If this node
.I is
running the server, the return value is zero, otherwise it is one.  This mode
is useful for detecting the "master" node when the experimental nodes all run
the same startup script.
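.PP
A script that computes its own error code may want to check it against the
range accepted by the
.B -e
option before calling
.B emulab-sync\fR.
The sketch below assumes that zero is accepted, since zero is the default and
denotes no error.  The helper name
.I valid_error_code
is hypothetical.
.PP
.nf
```shell
#!/bin/sh
# Hypothetical helper, not part of emulab-sync: succeeds when the argument
# is a whole number in the range 0 through 239.
valid_error_code() {
    case "$1" in
        ""|*[!0-9]*) return 1 ;;    # empty, or contains a non-digit
    esac
    [ "$1" -lt 240 ]
}

if valid_error_code 100; then
    echo "100 can be passed to -e"
fi
if ! valid_error_code 240; then
    echo "240 is in the reserved range"
fi
```
.fi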
.SH RETURN VALUES
.TP
A \fB-e \fRerror value
If the barrier was crossed, but at least one of the clients reported a non-zero
error code.
.TP
240
If the server received a SIGHUP from a user, causing all of the barriers 
to be cleared and the clients released.
.TP
76
If there was an error while communicating with the server.
.TP
68
If the server name lookup failed.
.TP
64
If there was an invalid command line argument.
.TP
1
If the
.B -m
option was used and this node is not the master.
.TP
0
If the barrier was successfully crossed without error or the
.B -m
option was used and this node is the master.
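.PP
A calling script can turn these values into messages with a small helper.
The sketch below is only one way to do it; the function name
.I describe_sync_status
and the message texts are hypothetical.  Note that a client-supplied
.B -e
value can coincide with 1, 64, 68, or 76, in which case the exit code alone
cannot tell the two causes apart.
.PP
.nf
```shell
#!/bin/sh
# Hypothetical helper, not part of emulab-sync: prints a short description
# of an emulab-sync exit code, following the list above.
describe_sync_status() {
    case "$1" in
        0)   echo "success" ;;
        1)   echo "not the master node (when -m was used)" ;;
        64)  echo "invalid command line argument" ;;
        68)  echo "server name lookup failed" ;;
        76)  echo "error communicating with the server" ;;
        240) echo "barriers cleared by SIGHUP on the server" ;;
        24[1-9]|25[0-5]) echo "reserved for internal use" ;;
        *)   echo "error code $1 reported by a client" ;;
    esac
}

# Typical use would be: emulab-sync; describe_sync_status $?
describe_sync_status 0
describe_sync_status 240
describe_sync_status 100
```
.fi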
.SH EXAMPLES
.PP
To synchronize one machine with another:
.PP
.RS
[gob@bluth1 ~] emulab-sync -i 1
.P
[michael@bluth2 ~] emulab-sync
.RE
.PP
To perform two different synchronizations at the same time using barriers named
"staircar" and "prison":
.PP
.RS
[marta@bluth1 ~] emulab-sync -i 1 -n staircar
.P
[michael@bluth2 ~] emulab-sync -n staircar
.P
[gob@bluth1 ~] emulab-sync -i 1 -n prison
.P
[george@bluth2 ~] emulab-sync -n prison
.RE
.PP
To synchronize and report an error code of "100":
.PP
.RS
.PD 0
[buster@bluth1 ~] emulab-sync -i 1 -e 100; echo $?
.P
100
.PD
.P
.PD 0
[annyong@bluth2 ~] emulab-sync -e 0; echo $?
.P
100
.PD
.RE
.PP
To use a single script to synchronize N nodes:
.PP
.RS
.PD 0
#! /bin/sh

# Get number of nodes to synchronize against.
.P
node_count=${1}

# Synchronize first.
.P
\fBif\fR emulab-sync -m; \fBthen\fR
.P
    # This is the master node, initialize the
.P
    # barrier to N - 1 since the master node
.P
    # is not counted.
.P
    emulab-sync -i `expr ${node_count} - 1`
.P
\fBelse\fR
.P
    # This is a regular node, just do a regular
.P
    # sync.
.P
    emulab-sync
.P
\fBfi\fR

# ... then do something interesting.
.RE
.PD
.SH FILES
.TP
/var/emulab/boot/syncserver
Contains the name and port number of the experiment's default sync server.
.SH NOTES
If a client using TCP is unable to connect to the server, it will continue to
retry at five second intervals.  Clients using UDP have no such retry
mechanism, so use UDP with care.
.P
If you terminate a TCP client before it returns, the server will recognize the
closed socket and adjust the counter so that it appears that the call 
never occurred.
.SH BUGS
Despite the above efforts, it is still possible for a barrier's counter to end
up with an unknown value.  Typical symptoms of this condition are clients that
refuse to wake up despite all of them checking in.  At this point it is probably
best to use a different barrier name or you can clear all of the barriers by
sending a SIGHUP to the experiment's
.B emulab-syncd\fR.
.SH SEE ALSO
emulab-syncd(1)
.SH AUTHOR
The Emulab project at the University of Utah.
.SH NOTES
The Emulab project can be found on the web at
.IR http://www.emulab.net