#
# EMULAB-COPYRIGHT
# Copyright (c) 2004-2005, 2007 University of Utah and the Flux Group.
# All rights reserved.
#

How to update the client-side Emulab software on a node and make new images.
This is still a bit ugly at the moment.

A. Things to understand up front:

1. The disk image I am talking about updating here is one which has both
   a FreeBSD partition and a Linux partition.  So obviously, you will need
   to update both parts before creating a new whole-disk image.  By Utah
   convention, FreeBSD is in DOS partition #1 and Linux in DOS partition #2.
   For the record, partition #3 is a Linux swap partition and partition #4
   is defined to contain the remaining space on the hard drive (courtesy of
   "growdisk" that is run after frisbee is run) and is available for users.

2. Since you will need to boot both the BSD and Linux partitions as well
   as the Emulab "admin MFS" (a scaled-down, PXE-loaded, memory filesystem
   based FreeBSD system), it is best to understand how to interact with
   the Emulab pxeboot program as that is the easiest way to get between
   them.  When pxeboot loads, it prompts the console for override input
   before contacting boss for its default behavior:

	Type a key for interactive mode (quick, quick!)

   So hit the space bar (quick, quick!) and you go into interactive mode
   where you can tell it to boot from the FreeBSD disk partition:
	part:1
   the Linux partition:
	part:2
   or to boot the admin MFS:
	loader:/tftpboot/freebsd
   (type "help" to get the complete list of commands).  So below, when I
   speak of "rebooting into Linux" or "rebooting into the admin MFS", this
   is how I expect you to do it.
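
   As a memory aid, the three pxeboot responses above can be captured in a
   small shell helper.  This is only a sketch: the function name and the
   target names (freebsd, linux, mfs) are made up here, and it does nothing
   more than print the text you would type at the interactive prompt.

```shell
# Print the text to type at the pxeboot interactive prompt.
# The target names are hypothetical labels chosen for this sketch.
pxeboot_cmd() {
    case "$1" in
        freebsd) echo "part:1" ;;                    # FreeBSD, DOS partition 1
        linux)   echo "part:2" ;;                    # Linux, DOS partition 2
        mfs)     echo "loader:/tftpboot/freebsd" ;;  # admin MFS
        *)       echo "unknown target: $1" >&2; return 1 ;;
    esac
}
```

   For example, "pxeboot_cmd mfs" prints the loader line for the admin MFS.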

3. When running in the admin MFS, use "ad0" if you have an IDE disk,
   "da0" if you have a SCSI disk, or "ad4" if you have SATA.  All of the
   following examples specify "ad0"; replace as necessary.
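
   If you are not sure which name applies on a given node, a sketch like
   the following can probe for the three device nodes named above.  It
   assumes a single boot disk and simply reports the first of ad0, da0 or
   ad4 that exists.  The function name and its optional directory argument
   are hypothetical, not part of the Emulab software.

```shell
# Report the first of the device names from the text that exists.
# Takes an optional device directory (default /dev) so it can be tried
# against a scratch directory first.  Assumes only one boot disk.
find_boot_disk() {
    devdir="${1:-/dev}"
    for d in ad0 da0 ad4; do
        if [ -e "$devdir/$d" ]; then
            echo "$d"
            return 0
        fi
    done
    echo "no ad0, da0 or ad4 under $devdir" >&2
    return 1
}
```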

4. To do an Emulab client software install, certain other software packages
   must be installed before you even try to configure the Emulab software.
   If you are updating a recent Emulab image, these should all be in place.
   But if you are updating a really old image,
   or you are installing the software for the first time, you will need
   these things:

   - GNU make.  On Linux it is the standard make.  On FreeBSD, you must
     install the port: /usr/ports/devel/gmake.

   - Python.  Python 2.4 is what we use right now.  For FreeBSD, go to
     /usr/ports/lang/python24 and do a "make install".

   - Utah's pubsub headers and libraries.  Grab the source tarball from:

       http://www.emulab.net/downloads/pubsub-0.8.tar.gz

     unpack it and do:

       gmake ELVIN_COMPAT= client
       sudo gmake ELVIN_COMPAT= install-client

   - Boost headers.  Check for the boost directory in the include directory
     path (probably in /usr/local/include or even /usr/include).
     For FreeBSD you can just install the package or port (version >= 1.30).
     For Linux, you may have a harder time.  The RedHat RPMs I have found
     only include the libraries, but you need just the headers (everything
     we use is implemented as a template, I think).  I think I just copied
     over the installed headers from a BSD box.

   - Dhclient.  On FreeBSD and RedHat > 7, this should be standard.
     (We used to use "pump" for RedHat 7, but we couldn't make it work
     efficiently for multiple interfaces.  So, we switched to dhclient
     there as well.)  For RedHat 7, you will need to grab an RPM from
     somewhere; I found one at pbone.net:

       http://rpm.pbone.net/index.php3/stat/4/idpl/1073819/com/dhclient-3.0pl2-1.norlug.i386.rpm.html

     When installing the RPM, you will need to use "--nodeps" to avoid its
     dependency on some initscripts RPM (those scripts presumably just
     provide the boot time rc boilerplate to call dhclient; we have our
     own and don't need them).

   - BPF devices.  Under FreeBSD, the DHCP client uses /dev/bpf* devices.
     In FreeBSD 4, there are only 4 devices by default so if you have more
     than 4 interfaces in the system, DHCP will fail.  So you may need to
     go out to /dev and:

         sudo ./MAKEDEV bpf4 bpf5 ...

     For FreeBSD 5 and Linux, you should not have to do this.

   - Perl.  On FreeBSD 5, perl is not installed by default.  Make sure you
     have a version of perl5 installed.

   - Perl HiRes timer module.  If debugging timestamps are enabled in the
     client scripts, you will need to install the HiRes module.  Currently
     this only happens in the mkjail script which is FreeBSD specific.
     To install on FreeBSD, install the devel/p5-Time-HiRes port.

   - Ethtool.  On Linux, with certain NICs, you will need ethtool (instead
     of mii-tool) so that the Emulab software can change link speed/duplex.
     Just install an RPM.
    
   UPDATE: for a November 2007 install of Ubuntu 7.04 (minimal),
   I needed to do the following (in addition to pubsub):

     apt-get install gcc
     apt-get install libc6-dev
     apt-get install python-dev
     apt-get install make
     apt-get install g++
     apt-get install byacc
     apt-get install libssl-dev
     apt-get install flex
     apt-get install libboost-dev
     apt-get install libboost-graph-dev
     apt-get install libpcap-dev
     apt-get install ntp-simple
     apt-get install tcsh
     apt-get install rpm
     apt-get install perl-suid
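
   Before trying to configure the Emulab software, it can save time to
   check which of the prerequisites above are actually on the PATH.  The
   following is a minimal sketch using only the POSIX "command -v" builtin.
   The function name is made up, and header-only prerequisites such as
   Boost or the pubsub libraries are not covered since they are not
   commands.

```shell
# Print the name of every given command that is not found on the PATH.
# Prints nothing if all of them are present.
missing_tools() {
    for tool in "$@"; do
        command -v "$tool" >/dev/null 2>&1 || echo "$tool"
    done
}
```

   For example, on a FreeBSD node you might run
   "missing_tools gmake python perl dhclient" and install whatever it lists.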

5. Another "first time, one time" thing to do is to set up the serial console
   in the OSes (if you are planning on using serial consoles).  We use 115200
   baud as the typical default of 9600 is too painful.

   For FreeBSD you need to build a new boot loader if you intend to use
   115200 baud (which is what our PXE boot loader and MFSes expect).  To
   do that, first add the line:

      BOOT_COMCONSOLE_SPEED=115200

   to /usr/src/sys/boot/i386/Makefile.inc.  Then:

      cd /usr/src/sys/boot
      sudo make obj
      sudo make
      sudo make install
      sudo disklabel -B ad0  # where "ad0" is your boot device

   and to /boot/loader.conf add:

      console="comconsole"

   Hang on, you're not done yet!  One last thing: change the "console" line
   in /etc/ttys to look like:

      console "/usr/libexec/getty std.115200"   unknown on secure

   Linux and lilo are a little simpler.  In /etc/lilo.conf add:

      serial=0,115200n8

   at the top and, for each kernel listed, add:

      append="console=tty0 console=ttyS0,115200"

   then run /sbin/lilo to record the changes.
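
   A quick way to confirm the Linux side took the serial settings is to
   grep lilo.conf for the two lines given above.  This is a sketch under
   stated assumptions: the function name is made up, and it only verifies
   that at least one kernel stanza carries the console arguments, not that
   every stanza does.

```shell
# Check a lilo.conf (default /etc/lilo.conf) for the serial console lines.
# Returns non-zero and explains why if either line is missing.
check_lilo_serial() {
    conf="${1:-/etc/lilo.conf}"
    grep -q '^[[:space:]]*serial=0,115200n8' "$conf" || {
        echo "missing serial=0,115200n8 in $conf" >&2
        return 1
    }
    # Only proves that some kernel entry has the serial console argument.
    grep -q 'append=.*console=ttyS0,115200' "$conf" || {
        echo "no append line with console=ttyS0,115200 in $conf" >&2
        return 1
    }
}
```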


B. Now we can begin the process:

1. Make sure you have testbed source and build trees in a filesystem
   that is visible to a testbed node, either in your home directory or
   /proj/emulab-ops.  You will need a build tree for both BSD and Linux
   (see ~hibler/obj for example).

2. Load up a node with the current image, set to boot either BSD or
   Linux; you'll need to boot both eventually.

3. Log in to the node, and fill in a little bit of missing source.
   We don't distribute the prototype password files for BSD/Linux,
   so you'll have to copy the "template" versions from the current node:

	# when running linux:
	sudo cp -p /etc/emulab/shadow <testbed-source-tree>/tmcd/linux/

	# when running BSD:
	sudo cp -p /etc/emulab/master.passwd <testbed-source-tree>/tmcd/freebsd/

   The only thing special about these (and the reason we don't distribute ours)
   is that they contain your site's node root password.
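
   Since this step is repeated once per OS, the choice of file can be keyed
   off "uname -s".  The sketch below only prints the copy command from this
   step so you can inspect it before running it.  The function name is
   hypothetical, and the source tree path is whatever you pass in.

```shell
# Print the password-template copy command for the running OS.
#   $1 = OS name as reported by "uname -s" (Linux or FreeBSD)
#   $2 = path to your testbed source tree
passwd_template_cmd() {
    os="$1"
    src="$2"
    case "$os" in
        Linux)   echo "sudo cp -p /etc/emulab/shadow $src/tmcd/linux/" ;;
        FreeBSD) echo "sudo cp -p /etc/emulab/master.passwd $src/tmcd/freebsd/" ;;
        *)       echo "unsupported OS: $os" >&2; return 1 ;;
    esac
}
```

   Typical use would be: passwd_template_cmd "$(uname -s)" <testbed-source-tree>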

4. Go to your build directory and install new client binaries.
   Paranoid guy that I am, I first back up the directories that will be
   affected, a la:

	sudo cp -pr /etc /Oetc
	sudo cp -pr /usr/local/etc/emulab /usr/local/etc/Oemulab

   then do the install.  For FreeBSD 4, FreeBSD 5, RedHat 7 and RedHat 9
   systems you can just do:

	cd <build-tree-for-this-os>
	gmake client

   and it will build the necessary client-side binaries.  If something
   doesn't build, most likely it is because of a missing software
   package; see A4 above.  After successfully building, install the
   binaries and scripts with:

	sudo gmake client-install

   If you did the backup, you can then compare the original to the new:

	sudo diff -r /Oetc /etc
	sudo diff -r /usr/local/etc/Oemulab /usr/local/etc/emulab

   The diffs can be significant, however, so they may not tell you much of value.
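
   When the full diff is too noisy, a brief listing (diff -rq) at least
   shows how many files changed.  The sketch below also derives the "O"
   backup name used above from a directory path.  Both function names are
   made up for illustration.

```shell
# Derive the backup name used in this step: /etc -> /Oetc,
# /usr/local/etc/emulab -> /usr/local/etc/Oemulab.
backup_path() {
    parent=$(dirname "$1")
    base=$(basename "$1")
    if [ "$parent" = "/" ]; then
        echo "/O$base"
    else
        echo "$parent/O$base"
    fi
}

# Count the files that differ or exist in only one of two trees.
# diff exits non-zero when the trees differ, so its status is ignored.
changed_count() {
    { diff -rq "$1" "$2" || true; } | awk 'END { print NR }'
}
```

   For example: sudo sh -c '. ./helpers.sh; changed_count /Oetc /etc', where
   helpers.sh is a hypothetical file holding the functions above.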

5. Make sure everything works.  Reboot the node once and make sure it comes
   up ok with the new binaries/scripts.

6. Clean up the filesystem prior to making the image.  Log in at the console
   and do a shutdown to go to single-user mode.  In single-user mode do:

	# BSD paranoia: unmount all NFS filesystems, this will already be
	# done for RHL
	umount -h fs

	cd /usr/local/etc/emulab
	sudo ./prepare

7. Now you need to do the same (steps 3-6) for the other OS on the disk.
   So reboot the machine and tell pxeboot (see A2 above) to boot from
   the other partition:

	sync
	reboot

	# wait for pxeboot prompt
	part:N		# N==1 for BSD, 2 for Linux

   When it comes up in the OS, go do steps 3-6 again.

9. All done?  Ok, now you can make the new images.  I don't use the form
   since you need to create three images: one for the whole disk and one
   each for the individual partitions.  You usually don't need the partition
   images, but we'll make 'em anyway!

   First step is to get into the admin MFS via pxeboot:

	sync
	reboot

	# wait for pxeboot prompt
	loader:/tftpboot/freebsd

   This will boot into the MFS.  Now you can ssh in as root from boss
   and make the images:
 
	cd /proj/emulab-ops/images
	imagezip -o /dev/ad0 FBSD410+RHL90-STD.ndz
	imagezip -o -s 1 /dev/ad0 FBSD410-STD.ndz
	imagezip -o -s 2 /dev/ad0 RHL90-STD.ndz
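
   If your disk device or image names differ, the three commands above can
   be generated rather than retyped.  This sketch only prints the commands,
   using the same imagezip options shown above.  The function name and the
   "<BSD>+<LINUX>-STD.ndz" naming pattern are assumptions taken from this
   example, not a requirement of imagezip.

```shell
# Print the whole-disk and per-partition imagezip commands.
#   $1 = disk device name (ad0, da0, ad4)
#   $2 = FreeBSD image tag, $3 = Linux image tag
imagezip_cmds() {
    disk="$1"
    bsd="$2"
    linux="$3"
    echo "imagezip -o /dev/${disk} ${bsd}+${linux}-STD.ndz"   # whole disk
    echo "imagezip -o -s 1 /dev/${disk} ${bsd}-STD.ndz"       # partition 1
    echo "imagezip -o -s 2 /dev/${disk} ${linux}-STD.ndz"     # partition 2
}
```

   Running "imagezip_cmds ad0 FBSD410 RHL90" reproduces the three lines above.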

10. Move the new images into place.  The only trick here is to make sure
    frisbeed isn't currently serving up the image.  This is another hack.
    On boss:

	cd /proj/emulab-ops/images
	sudo cp -p FBSD410+RHL90-STD.ndz FBSD410-STD.ndz RHL90-STD.ndz /usr/testbed/images/N/

	cd /usr/testbed/images
	sudo mv FBSD410+RHL90-STD.ndz FBSD410-STD.ndz RHL90-STD.ndz O/
	sudo mv N/* .

    Now the images are in place.  If there is no currently active frisbeed
    serving up that image, you are done.  If there is an existing frisbeed,
    it will still be serving up the old image since it has it open.  You
    need to kill that and start a new one.  There are three processes
    (threads) per frisbeed instance.  If you just kill the first one (it'll
    be the one that has used the most CPU), then you are done.  The parent
    frisbeelauncher will see that it has died and start a new one, which
    will open the new image file.  One catch: if the old frisbeed is actively
    sending out the image (as opposed to sitting around idle waiting for a
    client), you can really screw things up.  The new frisbeed will happily
    take up where the old one left off, continuing to feed blocks to any
    active client.  Unfortunately, it will be feeding blocks from a
    completely different image.  There is currently no unique serial number
    in an image that would enable us to detect this scenario.
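
    To see whether a frisbeed is holding one of the images before you swap
    it, you can filter ps output on boss.  The sketch below assumes that the
    image file path appears on the frisbeed command line, which I have not
    confirmed for every version.  The function name is made up.  It matches
    the image by exact file name so that RHL90-STD.ndz is not confused with
    FBSD410+RHL90-STD.ndz.

```shell
# Read "pid command..." lines on stdin and print the pid of every
# frisbeed process whose arguments name the given image file.
# Assumes the image path is visible in the process arguments.
frisbeed_pids_for_image() {
    awk -v img="$1" '
    {
        is_frisbeed = 0
        has_image = 0
        for (i = 2; i <= NF; i++) {
            n = split($i, parts, "/")      # compare the last path component
            if (parts[n] == "frisbeed") is_frisbeed = 1
            if (parts[n] == img) has_image = 1
        }
        if (is_frisbeed && has_image) print $1
    }'
}
```

    Typical use: ps axww -o pid,command | frisbeed_pids_for_image RHL90-STD.ndz
    An empty result suggests no frisbeed has that image on its command line.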

11. As long as you still have your node allocated, you might as well test
    the whole disk image.  On boss just do:

	os_load pc<XXX>

    and it will reload the node, and bring it back up in whatever OS is
    the default.  Make sure it comes up, and then use your pxeboot prowess
    to boot into the other OS and make sure it works.