#lang scribble/manual

@(require "defs.rkt")

@title[#:tag "basic-concepts" #:version apt-version]{Basic Concepts}

This chapter covers the basic concepts that you'll need to understand in
order to use @(tb).

@section[#:tag "profiles"]{Profiles}

A @italic{profile} encapsulates everything needed to run an
@seclink["experiments"]{@italic{experiment}}.  It consists of two main parts: a
@seclink["rspecs"]{description of the resources} (hardware, storage, network,
etc.) needed to run the experiment, and the @seclink["disk-images"]{software
artifacts} that run on those resources.

The resource specification is in the @seclink["rspecs"]{RSpec} format. The
RSpec describes an entire @italic{topology}: this includes the nodes (hosts)
that the software will run on, the storage that they are attached to, and the
network that connects them. The nodes may be
@seclink["virtual-machines"]{virtual machines} or
@seclink["physical-machines"]{physical servers}. The RSpec can specify the
properties of these nodes, such as how much RAM they should have, how many
cores, etc., or can directly reference a specific @seclink["hardware"]{class of
hardware} available in one of @(tb)'s clusters. The network topology can include
point-to-point links, LANs, etc., and may be built from either Ethernet or
InfiniBand.

The primary way that software is associated with a profile is through
@seclink["disk-images"]{disk images}. A disk image (often just called an
``image'') is a block-level snapshot of the contents of a real or virtual
disk---and it can be loaded onto either. A disk image in @(tb) typically has an
installation of a full operating system on it, plus additional packages,
research software, data files, etc. that comprise the software environment of
the profile. Each node in the RSpec is associated with a disk image, which it
boots from; more than one node in a given profile may reference the same disk
image, and more than one profile may use the same disk image as well.
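
The relationships described above can be sketched as a small data model. This
is only an illustration in Python, not the RSpec format itself; the class
names, field names, and the image name are hypothetical and chosen for this
example.

```python
from typing import NamedTuple


class DiskImage(NamedTuple):
    name: str  # hypothetical image name, for illustration only


class Node(NamedTuple):
    name: str
    kind: str          # "vm" or "physical"
    image: DiskImage   # each node boots from one disk image


class Link(NamedTuple):
    endpoints: tuple   # names of the nodes joined by this link or LAN


class Profile(NamedTuple):
    nodes: list
    links: list

    def images(self):
        """Return the distinct disk images referenced by this profile."""
        return set(node.image for node in self.nodes)


# Two nodes share one image, which the text above explicitly allows.
base = DiskImage("example-linux-image")
profile = Profile(
    nodes=[Node("node1", "physical", base), Node("node2", "vm", base)],
    links=[Link(("node1", "node2"))],
)
print(len(profile.nodes), "nodes,", len(profile.images()), "image")
```

The point of the sketch is that nodes refer to images rather than containing
them, which is what lets several nodes, or several profiles, share one image.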

Profiles come from two sources: some are provided by @(tb) itself; these tend
to be standard installations of popular operating systems and software
stacks. Profiles may also be provided by @(tb)'s users, as a way for communities
to share research artifacts.

@subsection[#:tag "on-demand-profiles"]{On-demand Profiles}

Profiles in @(tb) may be @italic{on-demand profiles}, which means that they are
designed to be instantiated for a relatively short period of time (hours or
days). Each person instantiating the profile gets their own
@seclink["experiments"]{experiment}, so everyone using the profile is doing so
independently on their own set of resources.

@subsection[#:tag "persistent-profiles"]{Persistent Profiles}

@(tb) also supports @italic{persistent} profiles, which are longer-lived (weeks
or months) and are set up to be shared by multiple users. A persistent profile
can be thought of as a ``testbed within a testbed''---a testbed facility built
on top of @(tb)'s hardware and provisioning capabilities. Examples of persistent
profiles include:

@itemlist[
@item{An instance of a cloud software stack, providing VMs to a large community}
@item{A cluster set up with a specific software stack for a class}
@item{A persistent instance of a database or other resource used by a large
research community}
@item{Machines set up for a contest, giving all participants access to the same
hardware}
@item{An HPC cluster temporarily brought up for the running of a particular
set of jobs}
]

A persistent profile may offer its own user interface, and its users may not
necessarily be aware that they are using @(tb).  For example, a cloud-style
profile might directly offer its own API for provisioning virtual machines.
Or, an HPC-style persistent profile might run a standard cluster scheduler,
which users interact with rather than the @(tb) website.

@apt-only{For the time being, allocations for persistent profiles on @(tb) are
handled by directly @seclink["getting-help"]{contacting the @(tb) staff}.}

@section[#:tag "experiments"]{Experiments}

@margin-note{See the chapter on @seclink["repeatable-research"]{repeatability}
for more information on repeatable experimentation in @(tb).}

An @italic{experiment} is an instantiation of a @seclink["profiles"]{profile}.
An experiment uses resources, @seclink["virtual-machines"]{virtual} or
@seclink["physical-machines"]{physical}, on one or more of the
@seclink["hardware"]{clusters} that @(tb) has access to. In most cases, the
resources used by an experiment are devoted to the individual use of the user
who instantiates the experiment. This means that no one else has an account,
access to the filesystems, etc. In the case of experiments using solely
@seclink["physical-machines"]{physical machines}, this also means strong
performance isolation from all other @(tb) users. @apt-only{In the case of
@seclink["virtual-machines"]{virtual machines}, there is still isolation from a
security and accounting standpoint, but weaker performance isolation.} (The
exceptions to this rule are @seclink["persistent-profiles"]{persistent
profiles}, which may offer resources to many users.)

Running experiments on @(tb) consume @bold{real resources}, which are
@bold{limited}. We ask that you be careful about not holding on to experiments
when you are not actively using them. If you are holding on to experiments
because getting your working environment set up takes time, consider
@seclink["creating-profiles"]{creating a profile}.

The contents of local disk on nodes in an experiment are considered
@italic{ephemeral}---that is, when the experiment ends (either by being
explicitly terminated by the user or expiring), the contents of the disk are
lost. So, you should copy important data off before the experiment ends. A
simple way to do this is through @tt{scp} or @tt{sftp}.  You may also
@seclink["creating-profiles"]{create a profile}, which captures the contents of
the disk in a @seclink["disk-images"]{disk image}.

All experiments have an @italic{expiration time}. By default, the expiration
time is short (a few hours), but users can use the ``Extend'' button on the
experiment page to request an extension. A request for an extension must
be accompanied by a short description that explains the reason for requesting
an extension, which will be reviewed by @(tb) staff.
@apt-only{@seclink["guest-users"]{Guest users} are not permitted to hold
experiments for very long; if you are using @(tb) as a guest, and find yourself
running out of time frequently, we recommend @seclink["register"]{registering
for an account}.} You will receive email a few hours before your experiment
expires reminding you to copy your data off or request an extension.

@subsection[#:tag "extending"]{Extending Experiments}

If you need more time to run an experiment, you may use the ``Extend'' button
on the experiment's page. You will be presented with a dialog that allows you
to select how much longer you need the experiment. Longer time periods require
more extensive approval processes. Short extensions are auto-approved, while
longer ones require the intervention of @(tb) staff or, in the case of
indefinite extensions, the steering committee.
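
The tiered approval described above can be sketched roughly as follows. This
is a minimal illustration in Python, not the code the testbed runs: the
one-day cutoff for a short extension and the function name are assumptions
made for this example, since the actual limits are not stated here.

```python
from datetime import timedelta
from typing import Optional

# Hypothetical cutoff: the text does not say how long a "short" extension is.
AUTO_APPROVE_LIMIT = timedelta(days=1)


def approver_for(extension: Optional[timedelta]) -> str:
    """Return who must approve an extension; None means indefinite."""
    if extension is None:
        return "steering committee"
    if extension <= AUTO_APPROVE_LIMIT:
        return "auto-approved"
    return "staff"


print(approver_for(timedelta(hours=6)))
print(approver_for(timedelta(days=14)))
print(approver_for(None))
```

In practice, this means that asking for only as much time as you need is
likely to get your request approved sooner.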

@screenshot["extend-experiment.png"]

@section[#:tag "projects"]{Projects}

Users are grouped into @italic{projects}. A project is, roughly speaking, a
group of people working together on a common research or educational goal. This
may be people in a particular research lab, a distributed set of
collaborators, instructors and students in a class, etc.

A project is headed by a @italic{project leader}. We require that project
leaders be faculty, senior research staff, or others in an authoritative
position. This is because we trust the project leader to approve other members
into the project, ultimately making them responsible for the conduct of the
users they approve. If @(tb) staff have questions about a project's activities,
its use of resources, etc., these questions will be directed to the project
leader. Some project leaders run a lot of experiments themselves, while some
choose to approve accounts for others in the project, who run most of the
experiments. Either style works just fine in @(tb).

Permissions for some operations and objects depend on the project that they
belong to. Currently, the only such permission is the ability to make a profile
visible only to the owning project. We expect to introduce more
project-specific permission features in the future.

@section[#:tag "virtual-machines"]{Virtual Machines}

@apt-only{
    The default node type in @(tb) is a @italic{virtual machine}, or VM. VMs in
    @(tb) are currently implemented on
    @hyperlink["http://blog.xen.org/index.php/2013/07/09/xen-4-3-0-released/"]{Xen
    4.3} using
    @hyperlink["http://wiki.xenproject.org/wiki/Paravirtualization_(PV)"]{paravirtualization}.
    Users have full root access to their VMs via @tt{sudo}.

    VMs in @(tb) are hosted on shared nodes; this means that while no one else
    has access to your VMs, there are other users on the same hardware whose
    activities may affect the performance of your VMs.  @(tb) currently @italic{does}
    oversubscribe virtual cores, meaning VMs may get less CPU time than would be
    indicated by the number of virtual cores they have, depending on others'
    activity. It @italic{does not} oversubscribe RAM, meaning that all virtual
    RAM is backed by physical pages, though performance can still be affected
    by others' use of the memory bus, which is a shared resource. @(tb) @italic{does
    not} provide performance isolation for disks, so I/O performance may also vary
    depending on use. Finally, @(tb) @italic{does not} oversubscribe bandwidth on
    network links attached to VMs, though variance in performance due to
    fine-grained timing effects is still possible.
}

@clab-only{
    While @(tb) does have the ability to provision virtual machines itself
    (using the Xen hypervisor), we expect that most users of @(tb) will
    provision @seclink["physical-machines"]{physical machines}.
    Users (or the cloud software stacks that they run) may build their own
    virtual machines on these physical nodes using whatever hypervisor they 
    wish.
}

@section[#:tag "physical-machines"]{Physical Machines}

Users of @(tb) may get exclusive, root-level control over @italic{physical
machines}. When allocated this way, no layers of virtualization or indirection
get in the way of performance, and users can be sure that no other
users have access to the machines at the same time. This is an ideal situation
for @seclink["repeatable-research"]{repeatable research}.

Physical machines are @seclink["disk-images"]{re-imaged} between users, so you
can be sure that your physical machines don't have any state left around from
the previous user. You can find descriptions of the
hardware in @(tb)'s clusters in the @seclink["hardware"]{hardware} chapter.

@apt-only{
Physical machines are relatively scarce, and getting access to large numbers of
them, or holding them for a long time, may require
@seclink["getting-help"]{contacting @(tb) staff}. 
    }