<!--
   Copyright (c) 2000-2003 University of Utah and the Flux Group.
   All rights reserved.
  -->
<a name="top"></a>
<h1>Node Use Policies</h1>

<li><a href="#summary">
What are your policies?</a>
<li><a href="#active">
What is "active use"?</a>
<li><a href="#idleness">
When is an experiment considered idle?</a>
<li><a href="#swapping">
What is "swapping"?</a>
<li><a href="#idleswap">
What is an "Idle-Swap"?</a>
<li><a href="#idle">
How long is too long for a node to be idle?</a>
<li><a href="#state">
What is "node state"?</a>
<li><a href="#email">
I just received an email asking me to swap or terminate my experiment</a>
<li><a href="#swapped">
Someone swapped my experiment!</a>
<li><a href="#autoswap">
What is "Maximum Duration"?</a>

<a name="summary"></a>
<h3>What are your policies?</h3>

As a courtesy to other experimenters, we ask that experiments be
swapped out or terminated when they are no longer in active use. There
are a limited number of nodes available, and node reservations are
exclusive, so it is important to free nodes that will be idle so that
others may use them. In summary, our policy is that experiments
should be swapped out when they are not in use.  We encourage you to
do that yourself.  In general, if an experiment is idle for several
hours, the system will send you mail about it, swap it out
automatically, or both, and an operator may also swap it out manually.  The actual
grace period will differ depending on the size of the experiment, the
current demand for resources, and other factors (such as whether
you've been a good Emulab citizen in the past!).  If you mark your
experiment "non-idle-swappable" at creation time or before swapin, and
testbed-ops approves your justification, we will make every effort to
contact you before swapping it, since local node state could be lost
on swapout.  Please see full details below.

<a name="active"></a>
<h3>What is "active use"?</h3>

A node or experiment that is being actively used will be doing
something related to your experiment. In almost all cases, someone
will either be logged in and using it interactively, or some program
will be running, sending and receiving packets, and performing the
operations necessary to carry out the experiment.

<a name="idleness"></a>
<h3>When is an experiment considered idle?</h3>

Your experiment will be considered idle if it has no measurable 
activity for a significant period of time (a few hours; the exact
time is typically set at swapin time). 
We detect the following types of activity:
<li>Any network activity on the experimental network
<li>Substantial activity on the control network
<li>TTY/console activity on nodes
<li>High CPU activity on nodes
<li>Certain external events, like rebooting a node with <tt>node_reboot</tt>
If your experiment's activity falls outside these measured types 
of activity, or it seems that Emulab is not assessing your idle
time correctly, please be sure to let us know when you create your 
experiment, or you may be swapped out unexpectedly.
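
The idleness check described above can be pictured roughly as follows. This is only an illustrative Python sketch, not Emulab's actual implementation: the activity labels and function names are assumptions made for the example, and the threshold is whatever idle time applies to your experiment.

```python
from datetime import datetime, timedelta

# Activity types named in the list above. The labels are illustrative
# and are not identifiers used by Emulab itself.
ACTIVITY_TYPES = (
    "experimental_net",
    "control_net",
    "tty",
    "cpu",
    "external_event",
)


def idle_duration(last_activity, now):
    """Return how long the experiment has shown no detected activity.

    last_activity maps an activity type to the datetime it was last seen;
    types that were never observed are simply absent from the mapping.
    Returns None when no activity of any type has been recorded.
    """
    seen = [last_activity[t] for t in ACTIVITY_TYPES if t in last_activity]
    if not seen:
        return None
    return now - max(seen)


def is_idle(last_activity, now, threshold):
    """True when the most recent activity is at least `threshold` old."""
    idle = idle_duration(last_activity, now)
    # Assumption: with no recorded activity at all, treat the experiment as idle.
    return idle is None or idle >= threshold
```

For example, with console activity three hours ago and CPU activity thirty minutes ago, <tt>is_idle</tt> with a two-hour threshold reports the experiment as active, because only the most recent signal of any type matters.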

<em>It is considered <b>abuse</b> to generate artificial activity in
order to prevent your experiment from being marked idle.  Abusers'
access to Emulab will be revoked, and their actions will be reported
to their project leader.  Don't do it! If you think you need special
treatment for some horrible deadline or demo or other reason, just
mail us; we're reasonable!</em>

<a name="swapping"></a>
<h3>What is "swapping"?</h3>

Swapping is the process of instantiating your experiment,
i.e., allocating nodes, configuring links, etc. It also refers 
to the reverse process, in which nodes are released. These are
called "swapping in" and "swapping out" respectively. See also
the <A href="docwrapper.php3?docname=faq.html#UTT-Swapping">
What is Swapping?</a> question on our FAQ.

<a name="idleswap"></a>
<h3>What is an "Idle-Swap"?</h3>

An "Idle-Swap" is when the Emulab system or its operators swap out
your experiment because it was idle for too long.  There are two ways
that your experiment may be idle-swapped: automatic and manual.  The
most common is automatic, which happens when Idle-Swap is enabled for
your experiment and the experiment has been continuously idle for the
idle-swap time that was set at creation/swapin time (usually a few
hours).  The Emulab system will then automatically swap it out.  The
other way to get idle-swapped is manually, by an Emulab operator.
This typically happens when there is very high resource demand and the
experiment has been idle for a substantial time, usually a few hours.  In
this case we will typically make every effort to contact you, since a
swapout may cause you to lose data stored on the nodes.
<em>Note that operators (and you) may swap your excessively idle
experiment whether or not it is marked idle-swappable!</em>
When you create your experiment, you may uncheck the "Idle-Swap" box,
disabling the <em>automatic</em> idle-swapping of your experiment.  If
you do so, you must specify the reason, which will be reviewed by
Testbed Ops.
<!-- Don't encourage this yet, but obviously, the tiny space
     in the form is too small for some of these justifications.
  (you may choose to elaborate in email). -->
If your reason is judged unacceptable or insufficient, we will explain
why, and your experiment will be marked idle-swappable.  Valid reasons
<em>might</em> be things such as:
<li>"Your idle-detection system fails to detect my experimental activity."
<li> "I have node-local state that is impractical to copy off in a
     timely or reliable manner, because ....."
<li>"My experiment takes a huge number of nodes, I have several runs
    to make with intervening think time, and if someone grabs some of
    these nodes while I'm swapped out and thinking, I'll miss my deadline 2
    days from now."
If an experiment is non-idle-swappable, our system will not
automatically swap it out, and testbed administrators will attempt to
contact you in the event a swapout becomes necessary.  However, we
expect <em>you</em> to manage your experiment responsibly, in
a way that uses Emulab's hardware resources efficiently.

When you create your experiment, you may decrease the idle-swap time
from the displayed default, but you may not raise it.  If lowering it
is compatible with your planned use, doing so helps you be a good
Emulab citizen.  If you want it raised, for example for reasons
similar to those given above, send mail to testbed-ops.
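
The rules for the Idle-Swap settings can be summarized in a small model. The Python sketch below is hypothetical: the class, its field names, and the error types are assumptions made for illustration, and the real checks are performed by the Emulab web forms and by Testbed Ops.

```python
from dataclasses import dataclass


@dataclass
class SwapSettings:
    """Hypothetical model of the swap settings described above."""
    default_idle_swap_hours: float   # default displayed on the form (site-specific)
    idle_swap_hours: float           # value currently in effect
    idle_swap_enabled: bool = True
    no_idle_swap_reason: str = ""

    def set_idle_swap_hours(self, hours):
        # Users may lower the idle-swap time but may not raise it
        # above the displayed default themselves.
        if hours <= 0:
            raise ValueError("idle-swap time must be positive")
        if hours > self.default_idle_swap_hours:
            raise PermissionError("to raise the idle-swap time, mail testbed-ops")
        self.idle_swap_hours = hours

    def disable_idle_swap(self, reason):
        # Unchecking "Idle-Swap" requires a reason, which Testbed Ops reviews.
        if not reason.strip():
            raise ValueError("a reason is required to disable idle-swap")
        self.idle_swap_enabled = False
        self.no_idle_swap_reason = reason.strip()

    def auto_swap_due(self, idle_hours):
        # Automatic idle-swap applies only while Idle-Swap is enabled.
        # Operators may still swap an idle experiment manually.
        return self.idle_swap_enabled and idle_hours >= self.idle_swap_hours
```

Note that disabling Idle-Swap in this model only turns off the automatic path, matching the policy that an excessively idle experiment may still be swapped by an operator.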

You can edit the swap settings (Idle-Swap, Max-Duration, and
corresponding reasons and timeouts) using the "Edit Experiment
Metadata" menu item on the experiment page for your experiment.

<a name="idle"></a>
<h3>How long is too long for a node to be idle?</h3>

Ideally, an experiment should be used nearly continuously from start
to finish, then swapped out or terminated. However,
this isn't always possible. In general, if your experiment is
idle for 2 hours or more, it should be swapped out. This is
especially true at night (in U.S. time zones) and on weekends. Many
experimenters take advantage of lower demand during evenings and
weekends to run their large-scale (50-150 node) tests. If your
experiment uses 10 nodes or more, it is even more important to release
your nodes as soon as possible. Swapin and swapout only take a few
minutes (typically 3-5 for swapin, and less than 1 for swapout), so
you won't lose much time by doing it.

Sometimes an experiment will run long enough that you cannot be online
to terminate it, for example, if the experiment completes in the
middle of the night. We provide three mechanisms to assist you in
terminating your experiment and releasing nodes in a timely
manner.  The first is the <a href="tutorial/tutorial.php3#BatchMode">
<em>batch system</em></a>, the second is
<a href="tutorial/tutorial.php3#Halting"><em>scheduled
termination/swapout</em></a>, and the third is the "Max Duration" option,
explained <a href="#autoswap">below</a>.

<a name="state"></a>
<h3>What is "node state"?</h3>

Some experiments have state that is stored exclusively on the nodes
themselves, on their local hard drives. This is state that is not
in your NS file, or in the files or disk images it references,
and therefore is not preserved in
our database across swapin/swapout. This is state you add to your
machines "by hand" after Emulab sets up your experiment, like files
you add or modify on filesystems local to test nodes. Local node state
does not include any data you store in /users, /proj, or /groups, since those are
saved on a fileserver, and not on the local nodes. 

Most experiments don't have any local node state, and can be swapped
out and in without losing any information. Avoiding local node state is highly recommended,
since it is more courteous to other experimenters.  It allows you, or
the Emulab system, or the operations staff, to easily free up your
nodes at any time, without losing any of your work.
Please make your experiments adhere to this guideline whenever possible.

An experiment that needs local state that inherently cannot be saved
(for some reason) or that you will not be able to copy off
before your experiment hits the "idle-swap time,"
should not be marked "idle-swap" when you create it. 
In the "begin experiment" form you must explain the reason.
<!-- (Contact Testbed Ops to have them set this for you.) -->
If you
must have node state, you can save it before you swap out by copying
it off by hand (e.g., into a tar or RPM file), or creating
a disk image of the node in question, and later reloading it to a new
node after you swap in again. Disk images in effect create a "custom
OS" that can be loaded automatically based on your NS file. More
information about disk images can be found on our <a
href="https://www.emulab.net/newimageid_ez.php3"> Disk Image
page</a> (you must be logged in to use it).  We will be developing
a system that will allow the swapping system to automatically save and
restore the local node state of an entire experiment.
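
Copying node-local state off by hand into a tar file, as suggested above, might look like the following Python sketch. It uses only the standard <tt>tarfile</tt> module. The function names and the example paths in the usage note are hypothetical, so substitute your own local directory and a destination under /proj, /users, or /groups.

```python
import tarfile
from pathlib import Path


def archive_node_state(local_dir, dest_dir, name):
    """Pack local_dir into dest_dir/name.tar.gz and return the archive path."""
    local_dir = Path(local_dir)
    dest_dir = Path(dest_dir)
    dest_dir.mkdir(parents=True, exist_ok=True)
    archive = dest_dir / f"{name}.tar.gz"
    with tarfile.open(archive, "w:gz") as tar:
        # arcname stores paths relative to the directory's own name,
        # not the full absolute path on the node.
        tar.add(local_dir, arcname=local_dir.name)
    return archive


def restore_node_state(archive, target_dir):
    """Unpack an archive made by archive_node_state under target_dir."""
    target_dir = Path(target_dir)
    target_dir.mkdir(parents=True, exist_ok=True)
    with tarfile.open(archive, "r:gz") as tar:
        tar.extractall(target_dir)
```

A call such as <tt>archive_node_state("/local/results", "/proj/myproj/saved", "node0")</tt> (all three arguments are placeholders) would be run on the node before swapout, and <tt>restore_node_state</tt> on the new node after swapin. Because the archive lives on the fileserver, it survives the swap.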

<a name="email"></a>
<h3>I just received an email asking me to swap or terminate my experiment</h3>

Emulab has a system for detecting node use, to help achieve more
efficient and fair use of Emulab's limited resources. This system
sends email messages to experiment leaders whose experiments have been
idle for several hours. If you get a message like this, your
experiment has been inactive for too long and you should free up its nodes.
If the experiment continues to be idle, more
reminders may be sent, and your project leader will soon
be included among the recipients.
After you have been notified, your experiment may be swapped at any 
time, depending on current demand for nodes, and other factors.

If you feel you received the message in error, please respond to <a
href="mailto:testbed-ops@flux.utah.edu"> Testbed Operations
(testbed-ops@flux.utah.edu)</a> as soon as possible, describing how
you have used your node in the last few hours. There are some types of
activity that are difficult to accurately detect, so we'd like to know
how we can improve our activity detection system. Above all, <b>do not
ignore these messages</b>. If you get several reminders and don't
respond, your experiment will be swapped out, potentially causing loss
of some of your work (see "node state" above). If there is a reason
you need to keep your experiment running, <b>tell us</b> so we don't
inadvertently cause problems for you.

<a name="swapped"></a>
<h3>Someone swapped my experiment!</h3>
As described above, the system automatically swaps out your experiment
after it reaches its idle time limit, or sometimes an Emulab
operator does it earlier when resources are in especially high demand.
In the latter case, we will typically try to contact you by email
before we swap it out.  However, especially if the experiment has been
idle for several hours, we may swap it out for you without
waiting very long to hear from you.
<!-- unless you marked it as "unswappable" when you created your experiment. -->
Because of this,
it is critical that you keep in close contact with us about an
experiment that we may perceive as idle if you want to avoid any loss
of your work.

<a name="autoswap"></a>
<h3>What is "Maximum Duration"?</h3>

Each experiment may have a Maximum Duration, with which the experimenter
specifies the maximum amount of time that the experiment should
stay swapped in. When that time is exceeded, the experiment is unconditionally swapped
out. The timer is reset every time the experiment swaps in. A reminder
message is sent about an hour before the experiment is swapped. This
swapout happens regardless of any activity on the nodes, and can be
averted by using the "Edit Metadata" menu item on the experiment's page
to turn off the Maximum Duration feature or to lengthen the duration.
This feature allows users to schedule experiment swapouts, helping
them to release nodes in a timely manner.  For
instance, if you plan to use your experiment throughout an 8 hour work
day, you can schedule a swapout for 8 hours after it is swapped
in. That way, if you forget to swap out before leaving for the day, the
nodes are freed automatically for other users instead of sitting
idle for several hours before being idle-swapped. This works even
if you leave your test programs running, making the experiment look non-idle.
For automated experiments, it lets you schedule a swapout for slightly
after the maximum amount of time your experiment should last.  It can
also help catch "runaway" experiments (typically batch).

"Max Duration" has an effect similar to <a href="tutorial/tutorial.php3#Halting">
<em>scheduled termination/swapout</em></a>, which is specified in the
<i>ns</i> file.  The differences are that the former lets you adjust
the duration while the experiment is running, you get a warning email,
and you're always swapped, never terminated.  (It's also implemented
differently, with a 5 minute scheduling granularity.)
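
The Maximum Duration timing described in this section can be sketched in a few lines of Python. This is an illustration under stated assumptions, not the scheduler Emulab runs: the function name is made up, the reminder lead is taken as exactly one hour where the text says "about an hour", and the 5 minute scheduling granularity is not modeled.

```python
from datetime import datetime, timedelta

# "About an hour" before swapout, per the text; exactly one hour is an assumption.
REMINDER_LEAD = timedelta(hours=1)


def max_duration_schedule(swapin_time, max_duration):
    """Return (reminder_time, swapout_time) for one swapin.

    The timer restarts at every swapin, so callers pass the most recent
    swapin time. Node activity plays no part in the calculation.
    """
    swapout_time = swapin_time + max_duration
    # Assumption: if the duration is shorter than the lead time,
    # the reminder is due immediately at swapin.
    reminder_time = max(swapin_time, swapout_time - REMINDER_LEAD)
    return reminder_time, swapout_time
```

For the 8 hour work day example above, a swapin at 09:00 with a Maximum Duration of 8 hours gives a reminder at 16:00 and an unconditional swapout at 17:00. Lengthening the duration from the experiment page simply moves both times later.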

<a href="#top"> Back to top of page</a>