Commit 32458d68 authored by Robert Ricci

Everything in this file was extremely out of date - and most of it
already done.

Yeah, we should get around to adding FS-aware compression to
imagezip one of these days...
parent a85d3094
Things that could be done:
1. Monitor temperature sensor on CPU (mike)
Linux and probably BSD have drivers/software for reading the temp
sensors. We should use them, perhaps in conjunction with Dave's
autostatus, to make sure things like a fan failure don't kill a CPU.
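A minimal sketch of such a watchdog, assuming the modern Linux hwmon sysfs interface (this note predates it); the sensor paths and the 70 C threshold are illustrative assumptions, not anything from the original:

```python
import glob

TEMP_LIMIT_C = 70.0  # illustrative threshold, not from the original note

def over_limit(millidegrees, limit_c=TEMP_LIMIT_C):
    """hwmon reports temperatures in millidegrees Celsius."""
    return millidegrees / 1000.0 > limit_c

def check_sensors():
    """Yield (sensor_path, temp_C) for every reading above the limit."""
    for path in glob.glob("/sys/class/hwmon/hwmon*/temp*_input"):
        with open(path) as f:
            reading = int(f.read().strip())
        if over_limit(reading):
            yield path, reading / 1000.0

if __name__ == "__main__":
    for path, temp in check_sensors():
        # A real version would feed this to autostatus instead of printing.
        print("WARNING: %s reads %.1f C" % (path, temp))
```

Run periodically (e.g. from cron), this gives autostatus the hook it needs to catch a failing fan before the CPU cooks.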
2. Filesystem compression (mike)
[ Leigh is working on this. ]
Jay's Holy Grail of Ultimate Disk Image Compression. I see three
ways of doing this:
- Have the "zip" part effectively do a tar of everything in the FS
and make a note of how big the FS is. Unzip will newfs a filesystem
of the same size and untar. Note that I don't necessarily mean a
literal tar/untar as you may want to operate beneath the FS level
for speed. Downside is that you don't create an exact copy of the
original. If something in the original relies on exact location
(e.g., LILO), ya be screwed.
- Modify zlib (or create a new OSKit library) to know about filesystem
formats, at least UFS and EXT2FS. The zipper will recognize free
blocks and encode them specially. The unzipper will note these special
runs and just leave room on the disk (rather than zeroing them). Note
the information-security issue here: we need to provide the user
with an option to wipe the disk when they are done so that their
old stuff doesn't leak through into a new experiment.
- A not so obvious solution with beneficial side-effects would be to
write a tool to expand filesystems like UFS and EXT2FS. Then we
could make our image filesystems just big enough to hold everything
we care about, and create regularly zipped images of those.
The unzipper expands the small disk image and then goes in and
grows the filesystem to some reasonable size. A filesystem
expansion (contraction?) tool would probably be a nice addition to
*BSD and/or Linux. Note that this assumes we are creating images
with only a single partition (OS), otherwise you still have to
fill out the first partition no matter how little of it you used.
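The free-block encoding in the second approach boils down to a run-length pass over the filesystem's block-allocation bitmap. A sketch (not imagezip's actual on-disk format):

```python
# Sketch of FS-aware zipping: walk the allocation bitmap and turn runs of
# free blocks into "skip" records, so they are never read, compressed, or
# written back.  Record shape here is an illustrative assumption.

def encode_runs(allocated):
    """allocated: iterable of booleans, one per disk block.
    Returns a list of ('data'|'skip', start_block, length) runs."""
    runs = []
    for blockno, used in enumerate(allocated):
        kind = 'data' if used else 'skip'
        if runs and runs[-1][0] == kind:
            runs[-1] = (kind, runs[-1][1], runs[-1][2] + 1)
        else:
            runs.append((kind, blockno, 1))
    return runs

# The unzipper writes 'data' runs and merely seeks past 'skip' runs --
# which is exactly why the user needs an option to wipe the skipped
# regions, or a prior experiment's blocks survive on the disk.
```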
FYI: long ago I asked Kirk about expanding an FFS filesystem and this
is what he said:
From: mckusick@chez.Berkeley.EDU (Kirk Mckusick)
To: mike@cs (Mike Hibler)
Subject: Re: a UFS question
Date: Mon, 15 Jul 91 17:11:17 PDT
Date: Fri, 5 Jul 91 17:43:36 -0600
From: (Mike Hibler)
To: mckusick@okeeffe.Berkeley.EDU
Subject: a UFS question
Occasionally we come across a situation where it would be
nice to expand a filesystem. Rather than create a new
filesystem it would be nice if you could extend an existing
one. How practical is that with the 4.2 FS? It seems like
things are fairly encapsulated at the cylinder group level
so it might not be that hard to add more groups. Is there
any static-sized global state (e.g. bitmaps in the superblock)
that would make this impossible?
Thought I would go straight to the source rather than
muddling around.
In theory expanding a filesystem is not too difficult. The basic
algorithm is to add more cylinder groups (possibly first expanding
an existing partial cylinder group at the end of the old filesystem).
The easiest way to do this is to update the superblock to reflect
the larger size and zero out the inode blocks in the new cylinder
groups (if any). Then run fsck to update the bit maps in the new or
expanded cylinder groups. The only caveat is that the superblock
summary maps are allocated in the data fragments immediately
following the inodes in the first cylinder group. If you add enough
new cylinder groups to overflow the previously allocated fragments,
you will have to relocate the immediately following fragment to make
room for the expanded summary information. On a filesystem with 1K
fragments, each fragment holds summaries for 64 cylinder groups,
so you can only easily increase (i.e. without relocating an existing
file) the total number of cylinder groups to a multiple of 64.
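Kirk's 64-groups-per-fragment figure follows from the size of the per-cylinder-group summary entry. If memory serves, FFS's struct csum holds four 32-bit counters (directories, free blocks, free inodes, free frags), so each entry is 16 bytes:

```python
# Checking the arithmetic in Kirk's reply: one summary entry per cylinder
# group, four int32 counters per entry (an assumption from the classic
# FFS struct csum layout).

CSUM_BYTES = 4 * 4  # four 32-bit counters per cylinder group

def groups_per_fragment(frag_size):
    return frag_size // CSUM_BYTES

assert groups_per_fragment(1024) == 64  # matches the 1K-fragment figure
# So growth is only "easy" up to the next multiple of 64 cylinder groups;
# past that, the fragment following the summary area must be relocated.
```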
3. A new tip (mike)
[ Mike fixed the easy stuff: ripped out most of the escape sequences
and dialer support, reduced to one process, cleanup tty on exit. ]
Things that are wrong with tip in the environment we use it for
(remote console access):
- Inflexible authentication model. If there is no lock file and you
can access the tty, you are in. We need to verify a user ID and tty
combo from a database.
- Obsolete escape sequences. Not only are they useless, they interfere
with running things like emacs.
- Ineffective cleanup. You kill one half of the tip, the other side
might get left around, the lock file left, the tty left screwed up,
you name it.
- Obsolete or inefficient model. The two-process input/output model
hardly seems worthwhile. A simple select-based scheme is probably
more efficient in the modern age.
- No direct remote access. You must run tip on the machine with the
serial line. You can r/slogin to that machine from anywhere, but
there is an extra layer of program.
- Inadequate logging. We introduced a layer between the tty and tip
to provide good logging. Shouldn't be necessary.
What we want is something along the lines of our current capture,
a server per serial port which:
- always logs to a disk file,
- allows one (or multiple) users to connect via sockets and get output
from (or provide input to) the serial line,
- ensures all such connections are access checked, with access checking
abstracted to allow for different implementations,
- supports few (no?) magic escape sequences,
- possibly encrypts traffic.
4. Make DNARDs useful (mike)
Have someone produce interesting software for the DNARDs. If they
are going to be source/sinks, make useful standalone OSKit kernels
to avoid needing NetBSD if possible. Start with the oskit traffic-gen
programs and beef them up (i.e., make them work on the DNARDs, make
them remotely, dynamically controllable).
5. Teach tcpdump about spanning tree and other inter-switch traffic (mike)
Even if it could just make the traffic easier to identify in the
output, or better yet, give us a simple handle so that we can exclude
it (e.g., "tcpdump not switchtraffic"), that would help. I am reminded
of this by seeing the never-ending stream of "Unknown IPX Data" packets.
6. Reduce (or eliminate) logging in default OSes (mike)
The standard daily/weekly/monthly activities are occurring and
sending mail to root (at least in our FreeBSD image). We should
either eliminate this or, better, reduce it and have it logged
on plastic or paper. We should at least do some security checking
and mail any anomalies to testbed-ops or someplace so we know if
some machine has been cracked.
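The security-check half of this could be a nightly sweep that reports only anomalies. A sketch; the patterns and the idea of grepping an auth log are illustrative assumptions (the note doesn't specify a mechanism):

```python
import re

# Hypothetical nightly sweep: scan a log for suspicious lines and report
# only the hits, instead of mailing root the full daily/weekly output.
SUSPICIOUS = [
    re.compile(r"Failed password"),
    re.compile(r"ROOT LOGIN"),
    re.compile(r"refused connect"),
]

def find_anomalies(lines):
    return [ln for ln in lines if any(p.search(ln) for p in SUSPICIOUS)]

# A cron job would mail the result to testbed-ops only when the list is
# non-empty, so mail means something actually happened.
```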
7. Reduce power usage (mike)
We should do what we can to reduce excess power usage. The most
obvious thing to do is turn machines off when they are not assigned.
The benefits are:
- maximum power savings
- reduced vulnerability to script kiddies
- longer component life?
- no rc5des!
The downsides are:
- can't use unassigned machines for other purposes (abone)
- don't save anything if the testbed is in use 24/7/365
- frequent transitions would be harder on the HW
At the very least, we should make our scripts capable of dealing
with this model. For example, os_setup would need to check and see
if the machine is powered down and, if so, turn it on. Likewise,
deassignment should turn the machine off. Note we will need to
differentiate "down to save power" from "down cuz it emits large
billows of black smoke when turned on." Autostatus would have to
be taught to show "available/assigned/down" instead of "up/down".
To avoid excess transitions due to frequent reassignment, we can
delay shutting down a machine "for a while" after an experiment
has ended. The end-experiment algorithm might look like:
if someone is waiting for an available node,
immediately reassign/reload this one
"clean" (reload/zero) the disk
if the node still isn't needed,
power it down
That would give a 5-10 minute window for reclamation.
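The end-experiment algorithm above might look like this; waiting_queue, clean_disk, power_off, and reassign are hypothetical hooks into os_setup and the power-control scripts, and the window is fixed here at the upper end of the 5-10 minutes from the text:

```python
import time

RECLAIM_WINDOW_SEC = 10 * 60  # grace period before powering down

def end_experiment(node, waiting_queue, clean_disk, power_off, reassign,
                   sleep=time.sleep):
    if waiting_queue:
        # Someone is waiting: immediate reuse, no power cycle at all.
        reassign(node, waiting_queue.pop(0))
        return
    clean_disk(node)                 # reload/zero while still powered up
    sleep(RECLAIM_WINDOW_SEC)        # window for quick reclamation
    if waiting_queue:
        reassign(node, waiting_queue.pop(0))
    else:
        power_off(node)              # nobody came; save the power
```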
What can we do for machines that are turned on? For assigned
machines we probably can't dictate much. The best we can do
is make our default kernels "green". Both Linux and FreeBSD
probably idle the CPU, so we are probably ok there. Not sure
about spinning down the disk. Because of daemons and logging it
might be difficult to keep the disk spun down. Dave/Kevin/Steve
worked on this some on the Jaz-disk test machines. Some keys are:
- shutdown as many daemons as possible
- reduce logging and log across NFS if possible
- move unimportant logging to a memory disk
Maybe we should disconnect one of those big case fans in each
machine or get temperature regulated fans.
Of course, the most important thing to do is determine whether
we really have a problem and what components suck the most power.
8. Get ARM/Linux running on the DNARDs (mike)
I spent a couple of days on this, long enough to get a diskless
system basically working, figure out how to build a kernel, and
discover all the problems that we need to fix. For the complete
lowdown on my long, strange trip into DNARD/Linux, see
Here are some of the things we still need to do:
1. Linux is not well setup for operating a large number
of diskless clients. It has provisions for booting each node
with a root of /tftpboot/<ip-addr> and then all mounting a
common /usr, but the root filesystem is still on the order
of 40MB. 10MB of this is /lib, which has lots of shared
libraries (glibc alone is 4MB) needed for binaries in /sbin.
14MB is /var, most of which is the RPM database (since this
disk image was loaded with about every package known to man).
Even if we go with the 40MB roots, I still had to hack some
startup files to deal with the NFS root. In particular, /
must be in the fstab, but fsck will fail if / is an NFS
filesystem (duh!). I made a gross hack to deal with that
(look for .I_am_an_NFS_rootfilesystem) in rc.sysinit.
Also make sure ONBOOT=no in ifcfg-eth0, or else it will hang
trying to initialize eth0 (which was already inited because
of NFS root).
2. There is a known NFS bug in pre-2.4 kernels which causes
much grief with diskless systems. It has to do with the old
open-and-unlink-a-file-but-still-have-access semantic.
We need a newer kernel.
3. Apparently you cannot use the PIT to get a periodic
interrupt on the DNARDs. Thought this was a Linux problem
but NetBSD doesn't use it either. Both use the RTC at 64Hz.
However, the Netwinder application base we are using doesn't
recognize 64Hz as a valid value and defaults to 100Hz.
This probably throws lots of timing-related things off.
We need to rebuild the appropriate shared library and
affected static binaries.
4. Related to #3 is just the general problem that the Linux
setup relies on a mish-mash of kernel/binary releases.
We should build our own system from the sources. The
kernel may always be a problem since the Shark code is
bit-rotting in the ARM linux tree.
5. Reboot doesn't work, it just hangs.