Commit 29005ab9 authored by Robert Ricci

Move some files that are Utah-specific to the Utah-specific doc directory.
parent 9e76e723
There are a few things that could/should be done to back up configurations
before the Ciscos lose power.
Save cisco2's MSFC (router) card's configuration. Unlike the switch
configuration, this _does not_ happen automatically, so we could lose
configuration if we forget to do this.
snake> tip cisco2
Console> (enable) session 15
Trying Router-15...
Connected to Router-15.
Escape character is '^]'.
Router#copy running-config startup-config
Destination filename [startup-config]?
Building configuration...
To be safe, we can save backups of the switch and router configuration on the
boss node, saved via TFTP. Before doing this, make sure that
boss:/tftpboot/cisco-config and all of the files it contains are world-writable
(and remove world-write permissions when you're done.) Files _must_ exist
before you can upload via TFTP, so if you want to upload to a file that isn't
already there, touch it (and fix the permissions) first.
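The touch-and-chmod preparation described above can be scripted. This is a minimal sketch, assuming Python is available on boss. The function names `prepare_tftp_target` and `lock_down` are hypothetical and not part of the testbed tree, and the sketch handles only individual files, not the permissions on the directory itself.

```python
import stat
from pathlib import Path


def prepare_tftp_target(path):
    """Create the file if it is missing and add world-write for the upload."""
    target = Path(path)
    target.touch(exist_ok=True)  # TFTP cannot create the file itself
    mode = stat.S_IMODE(target.stat().st_mode)
    target.chmod(mode | stat.S_IWOTH)
    return target


def lock_down(path):
    """Remove world-write again once the upload has finished."""
    target = Path(path)
    mode = stat.S_IMODE(target.stat().st_mode)
    target.chmod(mode & ~stat.S_IWOTH)
    return target
```

Call `prepare_tftp_target("/tftpboot/cisco-config/cisco2-MSFC.cfg")` before the copy and `lock_down(...)` on the same path afterwards.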
To save the router config:
Router#copy running-config tftp:
Address or name of remote host []?
Destination filename [router-confg]? /tftpboot/cisco-config/cisco2-MSFC.cfg
11267 bytes copied in 0.168 secs
To save switch config:
cisco1> (enable) copy config tftp
This command uploads non-default configurations only.
Use 'copy config tftp all' to upload both default and non-default configurations.
IP address or name of remote host []?
Name of file to copy to [/tftpboot/cisco-config/cisco1.cfg]?
Upload configuration to tftp:/tftpboot/cisco-config/cisco1.cfg, (y/n) [n]? y
Configuration has been copied successfully.
cisco1> (enable)
In case of emergency!
Booting paper/plastic/tipserv1
The tip lines for each of paper/plastic/tipserv1 are on
(standard desktop machine).
You can talk to power controller number 9 from snake by using "tip
apc" to talk to power9. User is "apc", password is written on the
white board in the flux area. The outlets are set up as follows:
plastic - 8
paper - 6
cyclades2 - 5
cyclades1 - 7
The boot blocks are set up so that they will let you interrupt the
boot on the VGA, but if you leave that alone, the last stage loader
will run on the serial line. You can interrupt the boot there.
Note that Mike and I built/installed new boot blocks on paper/plastic
and snake that run at 115200. The default is 9600 for all 3 stages.
For pc850's:
Port   | F | L | O  |
Bottom | 0 | 2 | 0  |
Top    | 1 | 3 | 1  |
Left   | 2 | 0 | 3? |
Right  | 3 | 1 | 4? |
Single | 4 | 4 | 2  |
Canonical Testbed order is the Linux order.
Example MACs, IRQs, and I/O bases:
Bottom: 00:02:b3:3f:73:5a (10,0xef00)
Top: 00:02:b3:3f:73:5b (10,0xee80)
Left: 00:03:47:73:a1:d7 (7,0xdf80)
Right: 00:03:47:73:a1:d8 (10,0xdf40)
Single: 00:03:47:94:c0:4e (5,0xed80)
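The pc850 port table can be encoded as a lookup for translating an interface index from one probe order to the canonical (Linux) one. This is a sketch only. It assumes the F and L columns stand for FreeBSD and Linux, which the notes do not spell out. It omits the O column because two of its entries are marked uncertain. The names `PROBE_ORDER` and `to_canonical` are hypothetical.

```python
# Interface index per physical port, copied from the pc850 table.
# Assumption: "F" is the FreeBSD probe order and "L" is the Linux one.
PROBE_ORDER = {
    "Bottom": {"F": 0, "L": 2},
    "Top": {"F": 1, "L": 3},
    "Left": {"F": 2, "L": 0},
    "Right": {"F": 3, "L": 1},
    "Single": {"F": 4, "L": 4},
}


def to_canonical(order, index):
    """Return (port name, Linux index) for an index given in `order`."""
    for port, indices in PROBE_ORDER.items():
        if indices[order] == index:
            return port, indices["L"]
    raise ValueError(f"no port with index {index} in order {order!r}")
```

For example, `to_canonical("F", 0)` looks up the port that is index 0 in the F column and reports its position in the Linux order.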
How to upgrade the BIOS on an ISP1100:
[ Notes made 10/01 when upgrading to TR440BXA.86B.0042.P15.0107200951
(aka "revision 15", needed for new PXE code). ]
The 10,000 foot view is that you boot a machine on a DOS floppy and run a
script and then reboot. The step-by-step details:
0. Reserve the node. There is a bios-update experiment for this purpose.
In fact, Rob has it set up so that all the nodes that haven't been
upgraded get put in this experiment when they are freed. So you can
likely skip this step, but just in case, you do (on boss):
nalloc emulab-ops bios-update pcXXX
1. Kill the capture process for the node. You will need to talk to the
node at 19200 baud after flashing the BIOS. So log in to ops (pc41-120)
or tipserv1 (pc121-128) and kill the capture process.
2. Stick in the floppy and boot from it. I have a whole pile (8) of ISP1100
BIOS update floppies. Grab one and stick it into the machine in question.
Tip to the direct console line ("pcXXX-tty") and reboot the node.
NOTE: because of a screw up on my part, the ISPs will boot
from the network before the floppy. There are two ways to
fix this: you can take the node out of the DHCP configuration
temporarily so it will fall back on a floppy boot, or you can
go into the BIOS and reset the boot order to have floppy first.
I do the latter, which is described here.
When you get the "type F2 to enter SETUP" message, type F2 and go
into setup. Go to the boot menu and set the first choice to floppy
(don't bother to do anything fancier, we are about to wipe out this
BIOS setting anyway). Save the change and let it boot from the floppy.
3. Flash the BIOS. Type "1" followed by newline at the DOS command prompt.
I then wait for it to put out the message about programming the BIOS
to make sure it is going. At that point I get out of tip, and get back
in with:
tip -19200 pcXXX-tty
You need to do this since, when the BIOS flash is done, it will reboot
and the serial console will be back to the default baud rate of 19200.
4. Restore the BIOS settings (note that floppy is now before PXE in
the boot order!):
1. Main menu
processor serial number -> disabled
2. Security menu
set supervisor password (NOT user password)
user level access -> view only
3. Boot menu
after power fail -> power on
1st boot -> floppy
2nd boot -> <first PXE choice>
3rd boot -> IDE-HD
the rest -> disable
4. System management menu
serial baud -> 115200
flow control -> none
5. Exit
exit saving changes -> yes
After hitting newline and saving the BIOS changes, get out of tip
and get back in again at 115200 baud:
tip pcXXX-tty
and prepare for the next step.
5. Shut up the PXE code. As the machine reboots again, you will need
to turn off 3 different PXE prompts (why 3 and not 4? No idea...)
So wait til you see the "type ^S blah blah blah", and type ^S.
Set the prompt to "disable" and set the wait time to "0 seconds"
and then type F4 to save and exit. Wait for the next prompt and
repeat. Wait for the final prompt and repeat.
6. Clean up. When you have neutered all the prompts, the machine will
reboot on the floppy again. Go down, pop out the floppy and hit the
reset button. Get out of tip and restart the capture (see
/usr/site/etc/capture.rc for the appropriate command line).
If you forgot to reboot the machine while you were downstairs,
(I always do) power cycle the node now.
NOTE: CTRL-ALT-DEL won't work on the serial line!
At best nothing will happen, at worst you will reboot
the machine you are sitting at!
You can then get back into tip:
tip pcXXX
and make sure everything comes back up ok.
7. Update the DB. Free the node from the bios-update experiment:
nfree emulab-ops bios-update pcXXX
and mark that the BIOS has been upgraded:
mysql tbdb
update nodes set bios_version="TR440BXA.86B.0042.P15.0107200951" where node_id='pcXXX';
The Master Plan (tm) for moving to our new IP subnets:
X Mail users explaining downtime
* Fix hardcoded things in database/files (see Leigh's mail for list)
X Set up experimental simulation on test Cisco
* Save configurations on ciscos 1 and 2
X Put Netcomm port in own VLAN, enable, set up routing in Cisco2
X Move plastic to new VLAN, change IP, route
X Move paper to new VLAN, change IP, route
* Move control network ports to new VLAN, reboot testbed machines
***** IP addresses to change
X Paper's .214 interface becomes .128 - Fix local routing table
X Plastic's .214 interface becomes .129 - Fix routing
* Cisco1's IP interface (becomes a .128)
* Cisco2's IP interface (becomes a .128, in VLAN 666)
* Cisco4's IP interface (becomes a .128)
* All testbed nodes become .132's (in dhcp and proxydhcp config)
* Make changes permanent on paper/plastic
***** Misc things to do
X Set up the UDP helper port to forward from the control VLAN to
* Get DNS records (forward and reverse) to agree with new setup
* Generate new SSL certificate for paper?
* Tell CS about new IPs for paper and plastic, for backup purposes (or, we
could leave their .212 interfaces alive)
* Once paper is using new address, update DNS records to use new address
***** VLANs to create
X Outside: Consists only of cable running to outside router and a fake router interface
X Private: Already exists as VLAN 666 - Add paper's interface (3/25) to this
(Power controllers are 4/30-44 (evens)) and cisco1 (4/23) and cisco4 (4/2)
X Public: For now, just plastic's interface (3/23) and router
X Control: All tbpcs (3/1-20,3/29-48), router, temporarily 214 net wire (4/27)
***** Some kludges to get around potential problems
* Paper and plastic should be able to keep their .212 interfaces for an
indefinite period of time
X Paper can get a .130 interface temporarily if the 'UDP helper' port (for
DHCP) is problematic
***** Simple IP allocation description
* Private: 155.101.128/24
* Public: 155.101.129/24
* Control (sharks): 155.101.130/24
* Control (PCs): 155.101.132/22
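The allocation above can be checked with Python's standard `ipaddress` module. This is a sketch that expands the shorthand prefixes (for example 155.101.128/24 becomes 155.101.128.0/24). The helper names `classify` and `overlapping_pairs` are hypothetical. Note that the /22 for the PCs spans 155.101.132.0 through 155.101.135.255.

```python
import ipaddress

# Subnets from the allocation list, with the shorthand prefixes expanded.
SUBNETS = {
    "Private": ipaddress.ip_network("155.101.128.0/24"),
    "Public": ipaddress.ip_network("155.101.129.0/24"),
    "Control (sharks)": ipaddress.ip_network("155.101.130.0/24"),
    "Control (PCs)": ipaddress.ip_network("155.101.132.0/22"),
}


def classify(address):
    """Return the name of the subnet containing `address`, or None."""
    addr = ipaddress.ip_address(address)
    for name, network in SUBNETS.items():
        if addr in network:
            return name
    return None


def overlapping_pairs():
    """Return every pair of subnet names whose address ranges overlap."""
    names = list(SUBNETS)
    return [
        (a, b)
        for i, a in enumerate(names)
        for b in names[i + 1:]
        if SUBNETS[a].overlaps(SUBNETS[b])
    ]
```

Running `overlapping_pairs()` is a quick sanity check that no two VLANs were given overlapping ranges before the new addresses go into the dhcp and DNS configuration.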
***** NOTE:
ip address
ip route
***** Overall responsibilities
Maintenance of this file: Rob
ISP network card installation: Rob
Rack assembly: Mac/Rob
Wiring: Mac
Temp. worker co-ordination: Mac
***** Hardware
Status: Received
Responsible: Rob
RMAs: 1 returned, replacement received
To be RMA'ed: (called Jamie 9/18)
Still in our possession:
pc83 (NMI errors)
Returned 9/26:
pc48 (simply dead, though it worked at first)
pc51 (often hangs)
pc111 (used to boot, lights come on, but no serial/network traffic)
pc110's hard drive (I/O errors when doing md5 of whole disk)
Re-received 10/3:
pc48 - put back in service 8/3
Network cables:
Status: Received
Responsible: Mac
Rackmount nuts/screws:
Status: Received
Responsible: Mac
3ft Power cables from ValCom:
Status: Received
Responsible: Rob
Short power cables from Standard:
Status: Received
Responsible: Rob or Leigh
Additional RPCs:
Status: Received
Additional APCs:
Status: Not ordered
Responsible: ??
Note: We've stolen 1 APC from the sharks, which will need to be
replaced if we decide to reconnect them
Status: Received
Responsible: ??
Note: One host adapter installed in plastic, expanders need to be
hooked up. The other is installed in the serial server.
Avery labels for cables - pick them up at OfficeMax, etc.
Status: Arrived
Responsible: Rob
Cable ties:
Pick them up at Home Depot
Status: Arrived
Responsible: Mike
Cable management bars:
Status: Cancelled - we're using cable ties instead
Simple horizontal metal bars to tie cables to
Responsible: Mac
Note: Corey suggests that the metal shop is the simplest way to go,
but they may take a week to deliver
Fiber cables:
Status: Received
Responsible: Leigh
***** Testbed Software
Correctly handle multiple switches
Responsible: Mac
Status: in progress, target date 9/20
Get it working over the network and on multiple serial servers
Responsible: Leigh
Status: Done
Extend it to talk to RPC (serial) power controllers
Responsible: Mac
Status: Done
Disk image:
Build a disk image that has any changes required for ISPs
Responsible: Rob
Status: Done
***** Problems to resolve
Pick control net interface:
Bottom on-board interface will be control
Responsible: Mike
Status: Done
Decide on machine numbering scheme:
Responsible: ??
Status: Done
Start at pc41
Serial server:
What machines can we use the Cyclades card in?
Responsible: Rob/Leigh/Mike
Status: Done
ASA box has become tipserv1
NIC probe order:
We need to determine if FreeBSD and Linux probe the NIC cards in the
same order, and if not, make the order consistent
Responsible: Leigh
Status: Done
***** Misc
Cable labels:
Responsible: Rob
Status: Printed
PC labels:
Responsible: Rob
Status: Printed
Note: We won't label machines until we know they work
Power outlets:
We need 2 outlets of at least 15 amps per rack
Responsible: Rob
Status: Found 3, need 3 more
Note: Any outlets we find under the floor are fair game
Note: We should wait until _after_ September 9th to pursue this
Label power cables:
Responsible: Rob
***** Testbed state to update
New machine type:
Status: Done
Responsible: Mac
Wires table:
Using a consistent wiring scheme, this can be filled by a script
Responsible: Rob
Status: Script created/checked in, first two racks done
Interfaces table:
Toughest part here is the MAC addresses - See "MAC Address gathering"
procedure below
Responsible: Rob
Status: First two racks done
Nodes table:
The only information in this table that is unique per-node is node_id,
ip, and precedence - should be easy to populate with a script
Responsible: Rob
Status: First two racks done
There's already a script (db/dhcpd_makeconf) to generate this once
the nodes and interfaces tables are populated
Responsible: Rob
Status: First two racks done
Outlets table:
Using a consistent wiring scheme, this can be filled by a script
Responsible: Rob
Status: First two racks done
DNS zone files on paper:
Need to add DNS entries for all new nodes
Responsible: Mac/Rob
Status: First two racks done
Information about the RPCs:
Will probably need entries in the nodes table
Responsible: Mac
Status: First two racks done
tip/capture state:
Not sure what needs to be done for the new tip/capture
Responsible: Leigh
Status: First two racks done
***** Hardware Testing
Boots and runs
Responsible: Rob
Status: 2 racks done
Ferret out flaky nodes through stress testing, eg lots of rebooting and power-cycling
Responsible: ?
Status: 2 racks partially done in course of everyday use; need real script done
Note: create an ns and run script for this.
Run nodes all out to verify temps don't get too high.
Responsible: ?
Status: zilch
Note: Create an ns and run script for this.
Probably a 4-hour run of a big build, over and over, will do it.
Believe the disk and processor are the only important heat sources.
Measure the board temp(s) thru on-board sensors, especially of
top-most node.
***** Procedures
Serial/BIOS setup: (not complete)
1. Main menu
processor serial number -> disabled
2. Security menu
set supervisor password (NOT user password)
user level access -> view only
3. Boot menu
after power fail -> power on
1st boot -> <first PXE choice>
2nd boot -> floppy
3rd boot -> IDE-HD
the rest -> disable
4. System management menu
serial baud -> 115200
flow control -> none
Note: Possibly scriptable, but probably not worth it
MAC Address gathering: (Rob and Mike will set up)
1. Set up a pool of dynamic DHCP addresses - unknown (aka new) machines
will get put in this pool, which will refer them to a
specially hacked proxydhcp, which will cause them to boot a
different PXE kernel than the other machines
2. New machines will boot Mike's MAC-printing kernel, and the output
will be saved by capture
3. A script will search through capture logs, harvesting MACs and
putting them in the database
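Step 3 could start from something like the following. It is a sketch under stated assumptions: the notes do not describe the capture log format, so the code simply pulls out anything shaped like a colon-separated MAC address. The name `harvest_macs` is hypothetical, and the database insertion is left out.

```python
import re

# Six colon-separated hex octets, e.g. 00:02:b3:3f:73:5a.
MAC_RE = re.compile(r"\b[0-9a-fA-F]{2}(?::[0-9a-fA-F]{2}){5}\b")


def harvest_macs(log_text):
    """Return the unique MACs found in `log_text`, lowercased, in first-seen order."""
    seen = []
    for match in MAC_RE.findall(log_text):
        mac = match.lower()
        if mac not in seen:
            seen.append(mac)
    return seen
```

Keeping first-seen order matters if the MAC-printing kernel reports interfaces in probe order, since that order is what ties each MAC to a physical port.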
Wiring: (Note: Mac has a file with some additional tips)
1. Start from the bottom of the rack
2. Hook up experimental network interfaces:
2.1 Affix labels - number sequentially starting from the left
2.2 Tape labels to secure them to the cable
2.3 Bundle as you go up the rack
2.4 Keep them out of the way of air vents
2.5 Plug them into sequential (NOT HORIZONTALLY) ports on the
3. Hook up the control network
3.1 Sequential going up the rack, sequential on the switch
3.2 Same procedure as experimental network
4. Hook up serial lines
4.1 Sequential going up the rack, sequential on the Cyclades
4.2 Same procedure as experimental network
5. Hook up power cables
5.1 Bottom machine goes on the outside power controller, second
machine goes on the inside one, third goes on the
outside, etc.
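The alternating power scheme in step 5.1 is regular enough to generate, which helps when filling the outlets table by script. This is a small sketch with hypothetical function names, counting positions from 1 at the bottom of the rack.

```python
def power_controller_side(position):
    """Return which power controller the machine at `position` plugs into.

    Position 1 is the bottom machine: odd positions use the outside
    controller, even positions the inside one.
    """
    if position < 1:
        raise ValueError("positions are counted from 1 at the bottom of the rack")
    return "outside" if position % 2 == 1 else "inside"


def rack_power_plan(machine_count):
    """Return (position, side) pairs for a rack, bottom machine first."""
    return [(pos, power_controller_side(pos)) for pos in range(1, machine_count + 1)]
```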
Need to keep track of:
Ranges of cables for the experimental network on machines:
eg. pc85 starts with cable 500, pc65 ends with cable 600
Ranges of experimental cables on the switch:
eg. ports 3/1 - 3/24 = cables 500 - 523
Ranges of cables on the control network
eg. pc85 - pc41 = cables 600 - 645
ports 5/1 - 5/48 = cables 600 - 647
Ranges of serial cables:
eg. pc85 - pc41 = cables 700-800
1st. serial expander ports 1 - 16 = cables 700 - 715
Which serial ports the power controllers are on
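Because ports and cables are both numbered sequentially, each range above reduces to a starting port and a starting cable number. The sketch below records that and derives the rest. The `CableRange` class and its field names are hypothetical, and the figures in the usage note are the "eg." values from the list above.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class CableRange:
    """A run of sequential ports wired with sequentially numbered cables."""

    label: str
    first_port: int
    last_port: int
    first_cable: int

    @property
    def last_cable(self):
        """Cable number attached to the last port in the range."""
        return self.first_cable + (self.last_port - self.first_port)

    def cable_for(self, port):
        """Return the cable number plugged into `port`."""
        if not self.first_port <= port <= self.last_port:
            raise ValueError(f"port {port} is outside {self.label}")
        return self.first_cable + (port - self.first_port)
```

For instance, `CableRange("switch module 3", 1, 24, 500)` gives cable 523 for port 24, matching the "ports 3/1 - 3/24 = cables 500 - 523" example.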
***** Schedule
To get the first rack set up, we need:
44 ISPs w/ rails (arrived, installed)
At least 1/3 of the network/serial cables
At least 1/3 of the power cables (enough have arrived)
'power' that can talk to RPCs
Working disk image
Automation tools
2 power outlets (we have them already)
Machine/cable labels
The second rack requires:
44 more ISPs
ALL rails (third rack must be built, as we're running serial cables to
Multi-switch snmpit
Network tip/capture
Serial line server
1/3 more network/serial cables
1/3 more power cables
2 more power outlets
The third rack requires:
The rest of the ISPs
The rest of the network/serial cables
The rest of the power cables
2 more power outlets
################################################## TODO
* Port cyclades driver
* Re-setup Networker
* Install new versions of ports/packages
* Setup sendmail
* Get correct inetd services running
* Set up forwarding http server (if necessary)
* Preserve ssh identity and host key
* resolv.conf
* Preserve password
* Compile custom kernel (w/ tons of mbufs, etc.)
* Install LEDA
################################################## ISSUES
Open issues:
* Do we need to move over stuff in /usr/site?
* NIS no longer necessary? (probably not)
* Do we still need a forwarding http server? (probably not)
* Preserved important files in /users/ricci/plastic-upgrade
Kernel issues:
* Lots o' mbufs
* Make sure to have the ccd driver
Custom stuff installed on plastic:
tip (from testbed tree)
plasticwrap (from testbed tree)
Stuff running out of inetd: