#lang scribble/manual

@(require "defs.rkt")

@title[#:tag "advanced-topics" #:style main-style #:version apt-version]{Advanced Topics}

@section[#:tag "disk-images"]{Disk Images}

Disk images in @(tb) are stored and distributed in the
@hyperlink[""]{Frisbee} disk
image format. They are stored at block level, meaning that, in theory,
any filesystem can be used. In practice, Frisbee's filesystem-aware
compression is used, which causes the image snapshotting and installation
processes to parse the filesystem and skip free blocks; this provides large
performance benefits and space savings. Frisbee has support for filesystems
that are variants of the BSD UFS/FFS, Linux ext, and Windows NTFS formats. The
disk images created by Frisbee are bit-for-bit identical with the original
image, with the caveat that free blocks are skipped and may contain leftover
data from previous users.

Disk images in @(tb) are typically created by starting with one of @(tb)'s
supplied images, customizing the contents, and taking a snapshot of the
resulting disk. The snapshotting process reboots the node, as it boots into an
MFS to ensure a quiescent disk. If you wish to bring in an image from outside
of @(tb) or create a new one from scratch, please
@seclink["getting-help"]{contact us} for help; if this is a common request, we
may add features to make it easier.

@(tb) has a default disk image for each node type; after a node is freed by
one experimenter, it is re-loaded with the default image before being released
back into the free pool. As a result, profiles that use the default disk
images typically instantiate faster than those that use custom images, as no
disk loading occurs.

Frisbee loads disk images using a custom multicast protocol, so loading large
numbers of nodes typically does not slow down the instantiation process.

Images may be referred to in requests in three ways: by URN, by an unqualified
name, and by URL. URNs refer to a specific image that may be hosted on any of
the @seclink["hardware"]{@(tb)-affiliated clusters}. An unqualified name refers
to the version of the image hosted on the cluster on which the experiment is
instantiated. If you have large images that @(tb) cannot store due to space
constraints, you may host them yourself on a webserver and put the URL for
the image into the profile. @(tb) will fetch your image on demand, and cache
it for some period of time for efficient distribution.
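
The three reference styles can be told apart by their prefix. The following
is a minimal sketch in Python, not part of @(tb) itself: the function name and
the sample values used with it are illustrative assumptions, and it assumes
that URNs begin with @tt{urn:} and that URLs use @tt{http} or @tt{https}.

```python
def classify_image_reference(ref):
    """Return 'urn', 'url', or 'name' for an image reference string."""
    lowered = ref.strip().lower()
    if lowered.startswith("urn:"):
        # A URN names a specific image, wherever it is hosted.
        return "urn"
    if lowered.startswith(("http://", "https://")):
        # A URL points at an image hosted on the user's own webserver.
        return "url"
    # Anything else is treated as an unqualified name, resolved on the
    # cluster where the experiment is instantiated.
    return "name"
```

A script that generates profiles might use a check like this to validate
user-supplied image references before placing them in a request.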

Images in @(tb) are versioned, and @(tb) records the provenance of images.
That is, when you create an image by snapshotting a disk that was previously
loaded with another image, @(tb) records which image was used as a base for
the new image, and stores only the blocks that differ between the two. Image
URLs and URNs can contain version numbers, in which case they refer to that
specific version of the image, or they may omit the version number, in which
case they refer to the latest version of the image at the time an experiment
is instantiated.

@section[#:tag "rspecs"]{RSpecs}

The resources (nodes, links, etc.) that define a profile are expressed in the
RSpec format from the GENI project. In general, RSpec should be thought of as a
sort of ``assembly language'': something it's best not to edit yourself, but
rather to manipulate with other tools or to create as a ``compiled'' target
from a higher-level language.

@section[#:tag "public-ip-access"]{Public IP Access}

@(tb) treats publicly-routable IP addresses as an allocatable resource.

By default, all @seclink["physical-machines"]{physical hosts} are given a
public IP address. This IP address is determined by the host, rather than the
experiment. There are two DNS names that point to this public address: a
static one that is the node's permanent hostname (such as
@tt{}), and a dynamic one that is assigned based on the
experiment; this one may look like @tt{<vname>.<exp>.<proj>},
where @tt{<vname>} is the name assigned in the @seclink["rspecs"]{request
RSpec}, @tt{<exp>} is the name assigned to the
@seclink["experiments"]{experiment}, and @tt{<proj>} is the
@seclink["projects"]{project} that the experiment belongs to. This name is
predictable regardless of the physical nodes assigned.
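
Because the dynamic name depends only on these three components, a script can
compute it before any node is allocated. The sketch below is illustrative
only: the function name is hypothetical, and the optional @tt{domain} argument
stands in for whatever cluster-specific suffix follows the three labels.

```python
def dynamic_hostname(vname, exp, proj, domain=""):
    """Join the node, experiment, and project names into a DNS-style name."""
    labels = [vname, exp, proj]
    if domain:
        # Hypothetical cluster suffix; strip stray dots so the join stays clean.
        labels.append(domain.strip("."))
    return ".".join(labels)
```

For example, @tt{dynamic_hostname("node1", "myexp", "myproj")} yields
@tt{node1.myexp.myproj}, whichever physical machine ends up backing
@tt{node1}.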

By default, @seclink["virtual-machines"]{virtual machines} are @italic{not}
given public IP addresses; basic remote access is provided through an @tt{ssh}
server running on a non-standard port, using the physical host's IP address.
This port can be discovered through the @seclink["rspecs"]{manifest} of an
instantiated experiment, or on the ``list view'' of the experiment page.
If a public IP address is required for a virtual machine (for example, to
host a webserver on it), a public address can be requested on a per-VM basis.
If using @seclink["jacks"]{the Jacks GUI}, each VM has a checkbox to request
a public address. If using @seclink["geni-lib"]{Python scripts and
@tt{geni-lib}}, setting the @tt{routable_control_ip} property of a node
has the same effect.
Different clusters will have different numbers of public addresses available
for allocation in this manner.
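
For VMs without a public address, the @tt{ssh} host and port can be pulled
out of the manifest programmatically. The following sketch assumes the
manifest uses the GENI RSpec version 3 namespace and advertises access through
@tt{login} elements inside each node's @tt{services} element; the sample
manifest, host name, port, and user name are made up for illustration.

```python
import xml.etree.ElementTree as ET

# Assumption: the manifest uses the GENI RSpec v3 XML namespace.
RSPEC_NS = {"r": "http://www.geni.net/resources/rspec/3"}

# Hypothetical manifest fragment; host name, port, and user are made up.
SAMPLE_MANIFEST = """<rspec xmlns="http://www.geni.net/resources/rspec/3" type="manifest">
  <node client_id="vm-0" exclusive="false">
    <services>
      <login authentication="ssh-keys" hostname="pc1.example.net" port="25010" username="alice"/>
    </services>
  </node>
</rspec>"""


def ssh_endpoints(manifest_xml):
    """Map each node's client_id to the (hostname, port) of its first login service."""
    endpoints = {}
    root = ET.fromstring(manifest_xml)
    for node in root.findall("r:node", RSPEC_NS):
        # A node may list one login entry per user; the first is enough
        # to learn the host and port.
        login = node.find("r:services/r:login", RSPEC_NS)
        if login is not None:
            endpoints[node.get("client_id")] = (
                login.get("hostname"),
                int(login.get("port")),
            )
    return endpoints
```

Calling @tt{ssh_endpoints(SAMPLE_MANIFEST)} returns a dictionary keyed by
client ID, which a wrapper script could turn into @tt{ssh -p} command lines.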

@subsection[#:tag "dynamic-public-ip"]{Dynamic Public IP Addresses}

In some cases, users would like to create their own virtual machines and
give them public IP addresses. To support this, profiles can request
a pool of dynamic addresses; VMs brought up by the user can then run DHCP to
be assigned one of these addresses.

Profiles using @seclink["geni-lib"]{Python scripts and @tt{geni-lib}} can
request dynamic IP address pools by constructing an @tt{AddressPool} object
(defined in the @tt{geni.rspec.igext} module), as in the following example:


The addresses assigned to the pool are found in the experiment
@seclink["rspecs"]{manifest}.

@section[#:tag "markdown"]{Markdown}

@(tb) supports Markdown
in the major text fields in @seclink["rspecs"]{RSpecs}. Markdown is a simple
formatting syntax with a straightforward translation to basic HTML elements
such as headers, lists, and pre-formatted text. You may find this useful in 
the description and instructions attached to your profile.

While editing a profile, you can preview the Markdown rendering of the
Instructions or Description field by double-clicking within the text


You will probably find the
@hyperlink[""]{Markdown manual} to
be useful.

@section[#:tag "introspection"]{Introspection}

@(tb) implements @hyperlink[""]{the
GENI APIs}, and in particular
@hyperlink[""]{the @tt{geni-get}
command}.  @tt{geni-get} is a generic means for nodes to query their own
configuration and metadata, and is pre-installed on all facility-provided
disk images.  (If you are supplying your own disk image built from scratch,
you can add the @tt{geni-get} client from

While @tt{geni-get} supports many options, there are five commands most
useful in the @(tb) context.

@subsection[#:tag "geni-get-client-id"]{Client ID}

Invoking @tt{geni-get client_id} will print a single line to standard
output showing the identifier specified in the profile corresponding to
the node on which it is run.  This is a particularly useful feature in
@seclink["geni-lib-example-os-install-scripts"]{execute services}, where
a script might want to vary its behaviour between different nodes.
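
As a minimal sketch of such a script, the Python below picks a role from the
client ID. The role names and the convention that a node called @tt{node-0}
acts as the server are hypothetical; only the @tt{geni-get client_id}
invocation comes from the description above.

```python
import subprocess


def get_client_id():
    """Run `geni-get client_id` and return its single line of output."""
    result = subprocess.run(
        ["geni-get", "client_id"],
        capture_output=True, text=True, check=True,
    )
    return result.stdout.strip()


def role_for(client_id):
    """Choose a role from the client ID (hypothetical naming convention)."""
    return "server" if client_id == "node-0" else "worker"
```

On a node, @tt{role_for(get_client_id())} would then select the branch of the
setup script to run.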

@subsection[#:tag "geni-get-control_mac"]{Control MAC}

The command @tt{geni-get control_mac} will print the MAC address of
the control interface (as a string of 12 hexadecimal digits with no
punctuation).  In some circumstances this can be a useful means
to determine which interface is attached to the control network, as
OSes are not necessarily consistent in assigning identifiers to
network interfaces.
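
Since the address is printed without punctuation, it usually needs
reformatting before it can be compared with what the OS reports. The sketch
below is one way to do that; the function names are illustrative, and the
interface lookup assumes a Linux node that exposes addresses under
@tt{/sys/class/net}.

```python
import os


def format_mac(raw):
    """Turn 12 bare hex digits into the usual colon-separated form."""
    digits = raw.strip().lower()
    if len(digits) != 12 or any(c not in "0123456789abcdef" for c in digits):
        raise ValueError(f"expected 12 hexadecimal digits, got {raw!r}")
    return ":".join(digits[i:i + 2] for i in range(0, 12, 2))


def find_interface(raw_mac, sys_net="/sys/class/net"):
    """Return the name of the interface with this MAC, or None (Linux only)."""
    wanted = format_mac(raw_mac)
    for name in sorted(os.listdir(sys_net)):
        try:
            with open(os.path.join(sys_net, name, "address")) as f:
                if f.read().strip().lower() == wanted:
                    return name
        except OSError:
            # Some entries have no readable address file; skip them.
            continue
    return None
```

Passing the output of @tt{geni-get control_mac} to @tt{find_interface} gives
the name of the control interface, which a script can then exclude when
configuring experiment networks.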

@subsection[#:tag "geni-get-manifest"]{Manifest}

To retrieve the @seclink["rspecs"]{manifest RSpec} for the instance,
you can use the command @tt{geni-get manifest}.  It will print the
manifest to standard output, including any annotations added during
instantiation.  For instance, this is an appropriate technique to use to
query the allocation of a @seclink["dynamic-public-ip"]{dynamic public
IP address pool}.

@subsection[#:tag "geni-get-key"]{Private key}

As a convenience, @(tb) will automatically generate an RSA private
key unique to each profile instance.  @tt{geni-get key} will retrieve
the private half of the keypair, which makes it a useful command for
profiles bootstrapping an authenticated channel.  For instance:
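
A minimal sketch of one such bootstrap step, written in Python: it fetches the
key and stores it in a file readable only by its owner. The function names
and the choice of destination path are assumptions for illustration, not a
prescribed procedure.

```python
import os
import subprocess


def fetch_key():
    """Run `geni-get key` and return the private key text."""
    result = subprocess.run(
        ["geni-get", "key"],
        capture_output=True, text=True, check=True,
    )
    return result.stdout


def install_key(key_text, path):
    """Write the key to `path` with owner-only (0600) permissions."""
    # Create the file with restrictive permissions from the start.
    fd = os.open(path, os.O_WRONLY | os.O_CREAT | os.O_TRUNC, 0o600)
    with os.fdopen(fd, "w") as f:
        f.write(key_text)
    # Tighten permissions in case the file already existed with broader ones.
    os.chmod(path, 0o600)
```

Running @tt{install_key(fetch_key(), path)} on every node leaves each node
holding the same private key, from which a matching public key can be derived
with standard @tt{ssh} tooling.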


Please note that the private key will be accessible to any user who
can invoke @tt{geni-get} from within the profile instance.  Therefore, it
is NOT suitable for an authentication mechanism for privilege within
a multi-user instance!

@subsection[#:tag "geni-get-param"]{Profile parameters}

When executing within the context of a profile instantiated with
@seclink["geni-lib-example-parameters"]{user-specified parameters},
@tt{geni-get} allows the retrieval of any of those parameters.  The
proper syntax is @tt{geni-get "param @italic{name}"}, where @tt{@italic{name}}
is the parameter name as specified in the @tt{geni-lib} script
@tt{defineParameter} call.  For example, @tt{geni-get "param n"} would
retrieve the number of nodes in an instance of the profile shown in
the @seclink["geni-lib-example-parameters"]{@tt{geni-lib} parameter section}.
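
Note that the quoting makes @tt{param} and the parameter name a single
argument. When calling @tt{geni-get} from a program rather than a shell, that
detail must be preserved; the sketch below shows one way to do so in Python,
with hypothetical helper names.

```python
import subprocess


def param_argv(name):
    """Build the argument vector for retrieving one profile parameter."""
    # "param <name>" is one argument, mirroring the quoted shell form.
    return ["geni-get", "param " + name]


def get_param(name):
    """Run geni-get for the named parameter and return its value as text."""
    result = subprocess.run(
        param_argv(name),
        capture_output=True, text=True, check=True,
    )
    return result.stdout.strip()
```

The value comes back as text, so a numeric parameter such as @tt{n} needs an
explicit conversion, for example @tt{int(get_param("n"))}.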

@section[#:tag "user-controlled-switches"]{User-controlled switches and layer-1 topologies}

Some experiments require exclusive access to Ethernet switches and/or the
ability for users to reconfigure those switches. One example of a good use
case for this feature is to enable or tune QoS features that cannot be enabled
on @(tb)'s shared infrastructure switches. 

User-allocated switches are treated similarly to the way @(tb) treats
servers: the switches appear as nodes in your topology, and you 'wire'
them to PCs and each other using point-to-point layer-1 links. When one
of these switches is allocated to an experiment, that experiment is the
exclusive user, just as it is for a raw PC, and the user has ssh access with
full administrative control. This means that users are free to enable and
disable features, tweak parameters, reconfigure at will, etc.  Users are
given the switches in a 'clean' state (we do little configuration on
them), and can reload and reboot them just as they would a server.

The list of available switches is found in our @seclink["hardware"]{hardware
chapter}, and the following example shows how to request a simple topology
using @seclink["geni-lib"]{@tt{geni-lib}}.

@profile-code-sample["PortalProfiles" "layer1-sw-2pcs"]

This feature is implemented using a set of @italic{layer-1} switches between some
servers and Ethernet switches. These switches act as ``patch panels,'' allowing us
to ``wire'' servers to switches with no intervening Ethernet packet processing and
minimal impact on latency. This feature can also be used to ``wire'' servers directly
to one another, and to create links between switches, as seen in the following two
examples.

@profile-code-sample["PortalProfiles" "layer1-2pcs"]

@profile-code-sample["PortalProfiles" "layer1-2sws-2pcs"]