#lang scribble/manual

@(require "defs.rkt")


@title[#:tag "advanced-topics" #:version apt-version]{Advanced Topics}

@section[#:tag "disk-images"]{Disk Images}

Disk images in @(tb) are stored and distributed in the
@hyperlink[""]{Frisbee} disk
image format. They are stored at block level, meaning that, in theory,
any filesystem can be used. In practice, Frisbee's filesystem-aware
compression is used, which causes the image snapshotting and installation
processes to parse the filesystem and skip free blocks; this provides large
performance benefits and space savings. Frisbee has support for filesystems
that are variants of the BSD UFS/FFS, Linux ext, and Windows NTFS formats. The
disk images created by Frisbee are bit-for-bit identical with the original
image, with the caveat that free blocks are skipped and may contain leftover
data from previous users.

Disk images in @(tb) are typically created by starting with one of @(tb)'s
supplied images, customizing the contents, and taking a snapshot of the
resulting disk. The snapshotting process reboots the node, as it boots into an
MFS to ensure a quiescent disk. If you wish to bring in an image from outside
of @(tb) or create a new one from scratch, please
@seclink["getting-help"]{contact us} for help; if this is a common request, we
may add features to make it easier.

@(tb) has a default disk image for each node type; after a node is freed by
one experimenter, it is re-loaded with the default image before being released
back into the free pool. As a result, profiles that use the default disk
images typically instantiate faster than those that use custom images, as no
disk loading occurs.

Frisbee loads disk images using a custom multicast protocol, so loading large
numbers of nodes typically does not slow down the instantiation process.

Images may be referred to in requests in three ways: by URN, by an unqualified
name, and by URL. URNs refer to a specific image that may be hosted on any of
the @seclink["hardware"]{@(tb)-affiliated clusters}. An unqualified name refers
to the version of the image hosted on the cluster on which the experiment is
instantiated. If you have large images that @(tb) cannot store due to space
constraints, you may host them yourself on a webserver and put the URL for
the image into the profile. @(tb) will fetch your image on demand, and cache
it for some period of time for efficient distribution.
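
As a rough illustration of the three reference styles, the sketch below
classifies a reference string by its shape. The helper name
@tt{classify_image_ref} is hypothetical, and the prefix checks are an
assumption about how the three styles can be told apart, not a description of
how @(tb) actually resolves image references.

```python
def classify_image_ref(ref):
    """Return 'urn', 'url', or 'name' for an image reference string.

    Assumption: URNs begin with "urn:", URLs begin with an http(s)
    scheme, and anything else is treated as an unqualified name.
    """
    lowered = ref.strip().lower()
    if lowered.startswith("urn:"):
        return "urn"
    if lowered.startswith(("http://", "https://")):
        return "url"
    return "name"
```

A profile-generation script could use a check like this to warn when a
reference that was meant to be a URL is missing its scheme and would otherwise
be treated as an unqualified name.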

Images in @(tb) are versioned, and @(tb) records the provenance of images.
That is, when you create an image by snapshotting a disk that was previously
loaded with another image, @(tb) records which image was used as a base for
the new image, and stores only the blocks that differ between the two. Image
URLs and URNs can contain version numbers, in which case they refer to that
specific version of the image, or they may omit the version number, in which
case they refer to the latest version of the image at the time an experiment
is instantiated.

@section[#:tag "rspecs"]{RSpecs}

The resources (nodes, links, etc.) that define a profile are expressed in the
RSpec format from the GENI project. In general, RSpec should be thought of as a
sort of ``assembly language'': something it's best not to edit yourself, but
rather to manipulate with other tools or create as a ``compiled'' target from a
higher-level language.

That said, the tools for manipulating RSpecs are still incomplete. (For a
preview of @(tb)'s plans in this regard, see our section on
@seclink["planned-easier-profiles"]{planned profile creation features}.) So,
there are still some cases in which it is unfortunately useful to look at or
manipulate RSpecs directly.

@italic{Still to come: documentation of @(tb)'s extensions to the RSpec format.}


@section[#:tag "public-ip-access"]{Public IP Access}

@(tb) treats publicly-routable IP addresses as an allocatable resource.

By default, all @seclink["physical-machines"]{physical hosts} are given a
public IP address. This IP address is determined by the host, rather than the
experiment. There are two DNS names that point to this public address: a
static one that is the node's permanent hostname (such as
@tt{}), and a dynamic one that is assigned based on the
experiment; this one may look like @tt{<vname>.<exp>.<proj>},
where @tt{<vname>} is the name assigned in the @seclink["rspecs"]{request
RSpec}, @tt{<exp>} is the name assigned to the
@seclink["experiments"]{experiment}, and @tt{<proj>} is the
@seclink["projects"]{project} that the experiment belongs to. This name is
predictable regardless of the physical nodes assigned.

By default, @seclink["virtual-machines"]{virtual machines} are @italic{not}
given public IP addresses; basic remote access is provided through an @tt{ssh}
server running on a non-standard port, using the physical host's IP address.
This port can be discovered through the @seclink["rspecs"]{manifest} of an
instantiated experiment, or on the ``list view'' of the experiment page.
If a public IP address is required for a virtual machine (for example, to
host a webserver on it), a public address can be requested on a per-VM basis.
If using @seclink["jacks"]{the Jacks GUI}, each VM has a checkbox to request
a public address. If using @seclink["geni-lib"]{Python scripts and
@tt{geni-lib}}, setting the @tt{routable_control_ip} property of a node
accomplishes the same effect.
Different clusters will have different numbers of public addresses available
for allocation in this manner.
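
A minimal sketch of the @tt{geni-lib} route, assuming @tt{geni-lib} is
installed and that the VM is a Xen VM created with @tt{geni.rspec.pg.XenVM}.
The function name @tt{build_request} and the node name @tt{vm1} are
placeholders chosen for illustration.

```python
def build_request():
    """Build a request containing one VM that asks for a public address."""
    # Imported lazily so this file can still be loaded where geni-lib
    # is not installed.
    import geni.rspec.pg as pg

    request = pg.Request()
    vm = pg.XenVM("vm1")           # "vm1" is an arbitrary example name
    vm.routable_control_ip = True  # the property described in the text
    request.addResource(vm)
    return request


if __name__ == "__main__":
    try:
        import geni.portal as portal
        # Print the request RSpec, as a profile script would.
        portal.context.printRequestRSpec(build_request())
    except ImportError:
        print("geni-lib is not installed; cannot build the request")
```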

@subsection[#:tag "dynamic-public-ip"]{Dynamic Public IP Addresses}

In some cases, users would like to create their own virtual machines, and would
like to give them public IP addresses. We also allow profiles to request
a pool of dynamic addresses; VMs brought up by the user can then run DHCP to
be assigned one of these addresses.

Profiles using @seclink["geni-lib"]{Python scripts and @tt{geni-lib}} can
request dynamic IP address pools by constructing an @tt{AddressPool} object
(defined in the @tt{geni.rspec.igext} module), as in the following example:
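
A minimal sketch of such a request, assuming @tt{geni-lib} is installed and
that @tt{AddressPool} takes a pool name followed by the number of addresses
wanted. The function name @tt{build_request}, the pool name, the node name,
and the count of four are all placeholders chosen for illustration.

```python
def build_request(pool_name="addr-pool", count=4):
    """Build a request with one physical host and a dynamic address pool."""
    # Imported lazily so this file can still be loaded where geni-lib
    # is not installed.
    import geni.rspec.pg as pg
    import geni.rspec.igext as ig

    request = pg.Request()

    # A physical host on which the user will start their own VMs.
    host = pg.RawPC("host1")
    request.addResource(host)

    # Ask for a pool of `count` dynamic public addresses (assumed
    # argument order: name, then count).
    pool = ig.AddressPool(pool_name, count)
    request.addResource(pool)
    return request


if __name__ == "__main__":
    try:
        import geni.portal as portal
        portal.context.printRequestRSpec(build_request())
    except ImportError:
        print("geni-lib is not installed; cannot build the request")
```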


The addresses assigned to the pool are found in the experiment
@seclink["rspecs"]{manifest}.

@section[#:tag "markdown"]{Markdown}

@(tb) supports Markdown
in the major text fields in @seclink["rspecs"]{RSpecs}. Markdown is a simple
formatting syntax with a straightforward translation to basic HTML elements
such as headers, lists, and pre-formatted text. You may find this useful in 
the description and instructions attached to your profile.

While editing a profile, you can preview the Markdown rendering of the
Instructions or Description field by double-clicking within the text box.


You will probably find the
@hyperlink[""]{Markdown manual} to
be useful.

@section[#:tag "tours"]{Tours}

@italic{This feature is under development.}

@section[#:tag "introspection"]{Introspection}

@(tb) implements @hyperlink[""]{the
GENI APIs}, and in particular
@hyperlink[""]{the @tt{geni-get}}
command.  @tt{geni-get} is a generic means for nodes to query their own
configuration and metadata, and is pre-installed on all facility-provided
disk images.  (If you are supplying your own disk image built from scratch,
you can add the @tt{geni-get} client from

While @tt{geni-get} supports many options, there are three commands most
useful in the @(tb) context:

@subsection[#:tag "geni-get-client-id"]{Client ID}

Invoking @tt{geni-get client_id} will print a single line to standard
output showing the identifier specified in the profile corresponding to
the node on which it is run.  This is a particularly useful feature in
@seclink["geni-lib-example-os-install-scripts"]{execute services}, where
a script might want to vary its behaviour between different nodes.
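
The sketch below shows one way a startup script might branch on the
identifier. The naming scheme (a node called @tt{server} and nodes whose names
start with @tt{client}) and the step names are hypothetical; only the
@tt{geni-get client_id} invocation comes from the text above.

```python
import subprocess


def get_client_id():
    """Return this node's client_id as printed by geni-get."""
    result = subprocess.run(
        ["geni-get", "client_id"],
        check=True, capture_output=True, text=True,
    )
    return result.stdout.strip()


def setup_steps(client_id):
    """Choose setup steps from the identifier (hypothetical naming scheme)."""
    if client_id == "server":
        return ["install-server"]
    if client_id.startswith("client"):
        return ["install-client"]
    return []


if __name__ == "__main__":
    try:
        print(setup_steps(get_client_id()))
    except (OSError, subprocess.CalledProcessError) as err:
        # geni-get is missing or failed, e.g. when run off the testbed.
        print("could not run geni-get:", err)
```

Keeping the decision in a pure function such as @tt{setup_steps} makes the
script easy to check on a machine where @tt{geni-get} is not available.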

@subsection[#:tag "geni-get-control_mac"]{Control MAC}

The command @tt{geni-get control_mac} will print the MAC address of
the control interface (as a string of 12 hexadecimal digits with no
other punctuation).  In some circumstances this can be a useful means
to determine which interface is attached to the control network, as
OSes are not necessarily consistent in assigning identifiers to
network interfaces.
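
As an illustration, the following sketch converts the 12-digit form into the
colon-separated form most tools print, then looks for the matching interface.
It assumes a Linux node that exposes interface addresses under
@tt{/sys/class/net}; the function names are hypothetical.

```python
import os


def format_mac(hex_digits):
    """Turn 12 bare hex digits into colon-separated lowercase form."""
    digits = hex_digits.strip().lower()
    if len(digits) != 12 or any(c not in "0123456789abcdef" for c in digits):
        raise ValueError("expected 12 hexadecimal digits: %r" % hex_digits)
    return ":".join(digits[i:i + 2] for i in range(0, 12, 2))


def find_interface(hex_digits, sys_net="/sys/class/net"):
    """Return the name of the interface with this MAC, or None.

    Assumption: Linux, where each interface directory under sys_net
    contains an "address" file holding its MAC address.
    """
    wanted = format_mac(hex_digits)
    if not os.path.isdir(sys_net):
        return None
    for name in sorted(os.listdir(sys_net)):
        try:
            with open(os.path.join(sys_net, name, "address")) as fh:
                if fh.read().strip().lower() == wanted:
                    return name
        except OSError:
            continue  # entry without a readable address file
    return None
```

A script would pass the output of @tt{geni-get control_mac} to
@tt{find_interface} to learn which interface to leave alone when configuring
experiment networks.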

@subsection[#:tag "geni-get-manifest"]{Manifest}

To retrieve the @seclink["rspecs"]{manifest RSpec} for the instance,
you can use the command @tt{geni-get manifest}.  It will print the
manifest to standard output, including any annotations added during
instantiation.  For instance, this is an appropriate technique to use to
query the allocation for a @seclink["dynamic-public-ip"]{dynamic public
IP address pool}.
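
A small sketch of reading the manifest from a script is shown below. It
matches elements by local name so that it does not depend on namespace
prefixes. The function names are hypothetical; in practice the input would be
the text printed by @tt{geni-get manifest}.

```python
import xml.etree.ElementTree as ET


def local_name(tag):
    """Strip the '{namespace}' prefix ElementTree puts on tag names."""
    return tag.rsplit("}", 1)[-1]


def attributes_of(manifest_xml, name):
    """Return the attributes of every element whose local name is `name`."""
    root = ET.fromstring(manifest_xml)
    return [dict(el.attrib) for el in root.iter()
            if local_name(el.tag) == name]


def node_client_ids(manifest_xml):
    """List the client_id of each node element in the manifest."""
    return [attrs["client_id"]
            for attrs in attributes_of(manifest_xml, "node")
            if "client_id" in attrs]
```

The same @tt{attributes_of} helper can be pointed at whichever element in
your manifest carries the address pool allocation; inspect the output of
@tt{geni-get manifest} to find its name.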