# Copyright (c) 2004,2007 University of Utah and the Flux Group.
# All rights reserved.

##### Some notes on the design and operation of snmpit 
##### Robert Ricci <>, Keith Sklower <sklower@cs.Berkeley.EDU>

##### File organization

snmpit - The command line tool. Contains most of the database accesses, does
permissions checking, formats output, and figures out which device-specific
backends it needs to invoke. Most emulab-specific knowledge is embedded in
snmpit. If you're going to add another switch backend to snmpit, the only part
of snmpit itself that you should have to change is the part that loads the
device library - search for 'cisco' in the file.

 - Functions useful to snmpit and common to multiple device
backends. The most important functions to be aware of in here are
ReadTranlationTable() and the snmpitGetWarn() and snmpitGetFatal() functions.
The latter two wrap SNMP commands, retry in case of timeout, and send mail to
the site's testbed-ops list if they fail. The Warn() version simply warns the
user and continues execution, whereas the Fatal() version exit()s.
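
The difference between the two wrappers can be sketched as follows. This is
an illustrative Python sketch only (snmpit itself is Perl), and every name in
it (SnmpTimeout, get_with_retry, get_warn, get_fatal, the notify callback) is
made up for the example, not taken from the real library.

```python
class SnmpTimeout(Exception):
    """Hypothetical stand-in for an SNMP request timing out."""


def get_with_retry(fetch, retries=3):
    """Call fetch() until it succeeds or retries run out; return (ok, value)."""
    for _ in range(retries):
        try:
            return True, fetch()
        except SnmpTimeout:
            continue
    return False, None


def get_warn(fetch, notify, retries=3):
    """Warn-style wrapper: report the failure and let the caller carry on."""
    ok, value = get_with_retry(fetch, retries)
    if not ok:
        notify("SNMP get failed after %d attempts" % retries)
    return value


def get_fatal(fetch, notify, retries=3):
    """Fatal-style wrapper: report the failure and stop the program."""
    ok, value = get_with_retry(fetch, retries)
    if not ok:
        notify("SNMP get failed after %d attempts" % retries)
        raise SystemExit(1)
    return value
```

Here notify stands in for whatever sends mail to testbed-ops; the caller of
the warn-style wrapper has to cope with a missing value itself.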

 - Contains the knowledge required to handle a collection
of switches which share a common set of VLANs. Does not actually do any
SNMP itself, but has the knowledge of how to deal with multiple switches. For
example, it knows that (in some configurations) one switch acts as a 'VLAN
server', and in order to create a VLAN, you just have to talk to it. But,
to get a list of ports in VLANs, you have to talk to all of the switches. Also
has the job of aggregating information from switches - i.e. doing a listVlans()
on the stack does a listVlans() on all individual switches and then collates
the results.  snmpit itself calls into the stack module, which creates a
snmpit_<switchtype> module for each switch, and calls into that to do the actual
SNMP work.
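
The collation step might look roughly like this. It is a Python sketch under
the assumption that each switch returns entries of the form
[id, 802.1Q tag, members] (the listVlans format described under "New switch
backends"); the function name collate_vlans is hypothetical.

```python
def collate_vlans(per_switch_lists):
    """Merge per-switch [id, tag, members] entries into one entry per VLAN id."""
    merged = {}
    for entries in per_switch_lists:
        for vlan_id, tag, members in entries:
            # First switch to report a VLAN creates the merged entry.
            entry = merged.setdefault(vlan_id, [vlan_id, tag, []])
            for port in members:
                if port not in entry[2]:
                    entry[2].append(port)
    # Sort by VLAN id so the output is stable across runs.
    return sorted(merged.values(), key=lambda e: e[0])
```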

 - Contains the knowledge required to handle a collection
of Cisco switches which share a common set of VLANs.  There are optimizations
that could be done for collections in which all the switches are Cisco switches,
but currently both Emulab and DETER are using instead.

 - Contains the actual SNMP commands to deal with Cisco
switches, and deals with error checking, retries, etc. Writing versions of
this for other switches will be the hardest part of porting snmpit, because it
has a lot of knowledge about the quirkiness of Cisco's SNMP implementation.

 - Contains the specific SNMP for Nortel 55[123]0 class
switches, and deals with error checking, retries, etc.

 - Contains the specific SNMP for Foundry 9604 and 1500 class
switches, and deals with error checking, retries, etc.

 - Similar to . We haven't used it
in quite some time, so it's quite possibly bitrotted.

 - Ditto - like .

 - For controlling APC-brand SNMP-controllable power controllers.
Don't be fooled by its name - it's not actually part of snmpit anymore, it
just used to be. It's now used only by the 'power' command, and you can ignore
it.

##### Design philosophy

One of the key issues is that we trust switch state over database state. Thus,
when we go to remove a VLAN, we do not trust the database to list all the
ports in that VLAN for us - we get the list of member ports from the switch.
The idea here is twofold - first, we don't want to get too confused if people
have been manually manipulating VLANs on the switch, which does happen from
time to time, and usually happens for good reasons. Second, we don't want to
have to worry too much about getting out of sync, which can be a huge mess -
for example, if the database has to be restored from a backup, or if someone
messes around with the database by hand.

On the other hand, we don't trust the switches _too_ much - we usually verify
that set operations have succeeded before reporting success. ie. we'll set the
port speed to 100Mbps, then check the speed to make sure the change actually
took place. It's been our experience that, since we're pushing this stuff far
harder than it was intended (really, how many sites have made hundreds of
thousands of VLANs?), we do occasionally hit bugs in the switches.
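
The set-then-verify habit can be sketched like this. It is a minimal Python
illustration, not snmpit's actual code; set_value and get_value are
hypothetical callbacks standing in for the SNMP set and get on one object.

```python
def set_and_verify(set_value, get_value, wanted, attempts=2):
    """Set a value, read it back, and only report success if it stuck."""
    for _ in range(attempts):
        set_value(wanted)
        if get_value() == wanted:
            return True
    # The switch accepted the request but never showed the new value.
    return False
```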

Many of the API choices were made in order to enable high performance in the
backends. For example, we can supply a list of ports to setPortVlan() (in the
stack module), rather than just a single port, because it may be faster to
affect multiple ports at once than to do them serially. Whether or not you
choose to exploit this bit of API design is up to you.

##### Switch stacks

snmpit has a concept of stacks of switches. These are a set of switches on
which we can create a VLAN that spans potentially all switches. Thus, they are
connected by trunk links. Right now, snmpit does not support creating a VLAN
across multiple stacks (that would go against the definition of a stack), but
it does support stacks that contain more than one type of switch.

Probably, all of your experimental-net switches will be in a single stack.

Stacks are, by convention, sometimes named after their leader.  In a Cisco
stack, the leader is the switch you talk to in order to create VLANs, etc.,
via VTP. In a generic stack, however, the two stacks are usually named
"Control" and "Experiment", where the "Control" stack is used for setting up
firewall VLANs and the "Experiment" stack is used for normal links
and LANs described in .ns files.

##### New switch backends

Essentially, what you need to do to port snmpit to a new switch vendor is make
a new module that exports the same API as snmpit_stack, which expects
the following object methods:

	returns a new object, blessed into the snmpit_<switch> class.

$device->vlanNumberExists($self, $vlan_number)
	returns 1 if the 802.1Q VLAN tag exists on the switch, 0 otherwise

$device->findVlans($self, @vlan_ids)
	returns a hash mapping VLAN ids to 802.1Q VLAN numbers
	any VLANs not found have NULL VLAN numbers
	If no VLAN id is given, returns mappings for the entire switch.

$device->findVlan($self, $vlan_id, $no_retry)
	returns the VLAN number for the given vlan_id if it exists
	returns undef if the VLAN id is not found
	Retries several times unless the $no_retry option is given.

(could be written in terms of $device->findVlans, but in the interest
of efficiency, some switches might do this in a single snmp request
instead of walking a table).

	returns a list of VLAN information
	Each entry is an array reference. The array is in the form
	[id, 802.1Q tag, members] where id is the human readable
	VLAN identifier (as stored in the database), and members is
	a reference to an array of ports that are VLAN members.

	returns: A list of port information. Each entry is an array reference.
	The array is in the form [id, enabled, link, speed, duplex] where:
	    id is the port identifier (in node:port form)
	    enabled is "yes" if the port is enabled, "no" otherwise
	    link is "up" if the port has carrier, "down" otherwise
	    speed is in the form "XMbps"
	    duplex is "full" or "half"

	 This function is largely device independent, but uses SNMP
	 to retrieve a list of standard SNMP quantities from the
	 non-proprietary interface MIB (octets in, out, unicast packets
	 in, out, etc.). It is put in the snmpit_device() file since
	 there might be a device-dependent mapping between modport
	 and ifIndex, and it does use SNMP.

	 The value is not really documented, but only is relevant
	 if you change or

$device->createVlan($self, $vlan_id, $vlan_number)
	Create a VLAN on this switch, with the identifier and 802.1Q tag
	returns the new VLAN number on success, 0 on failure.

(in the case of the Foundry switch, you cannot create a VLAN with no
elements, so the module caches the information, fibs about having done
it, and looks it up when ports are to be added).

$device->setPortVlan($self, $vlan_number, @ports)
	 Put the given ports in the given VLAN, given 802.1Q tag number
	 returns 0 on success, or the number of failed ports on failure.

There is a side-channel return value here. It would have been better as a
ref argument, but that was not done in the name of backwards compatibility.
The act of putting a port in one VLAN may remove it from another,
and when this happens you want to invalidate the forwarding caches
on the switches, which are frequently triples of (mac, last seen port, vlan).

A list of vlans which are thus affected is put in $device->{DISPLACED_VLANS}
during the course of that call.
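
A toy model of this side channel, in Python rather than Perl. FakeDevice, its
port_vlan table and the displaced_vlans attribute are all made up for
illustration; only the idea (record which VLANs lost a port during the call)
comes from the text above.

```python
class FakeDevice:
    """Hypothetical in-memory device: maps port name -> current VLAN number."""

    def __init__(self, port_vlan):
        self.port_vlan = dict(port_vlan)
        self.displaced_vlans = set()

    def set_port_vlan(self, vlan_number, *ports):
        """Move ports into vlan_number; return the number of failed ports."""
        self.displaced_vlans = set()  # reset for the duration of this call
        errors = 0
        for port in ports:
            if port not in self.port_vlan:
                errors += 1
                continue
            old = self.port_vlan[port]
            if old != vlan_number:
                # The port left another VLAN; its forwarding cache is stale.
                self.displaced_vlans.add(old)
            self.port_vlan[port] = vlan_number
        return errors
```

After the call, the caller would read displaced_vlans and flush the
forwarding entries for each VLAN listed there.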

$device->enablePortTrunking2($self, $port, $vlan_number, $equaltrunking)
#        modport: module.port of the trunk to operate on
#        nativevlan: VLAN number of the native VLAN for this trunk
#        equaltrunk: don't do dual mode; tag PVID also.
#        Returns 1 on success, 0 otherwise

$device->removePortsFromVlan($self, @target_vlans_by_tag)
	 remove all ports from the list of VLANs, given 802.1Q tag number
	 returns 0 on success, or the number of failed ports on failure.

$device->removeVlan($self, @target_vlans_by_tag)
	 remove each of the list of VLANs, given 802.1Q tag number
	 returns 0 on success, or the number of failed ports on failure.

(some switches require a separate action to remove the vlan after the ports
are emptied out of it).

$device->clearAllVlansOnTrunk($self, $modport)
#        modport: module.port of the trunk to operate on
#        Returns 1 on success, 0 otherwise (must be done before taking a
#	 port out of trunking mode).

(usually internal).

$device->disablePortTrunking($self, $modport)
	returns 1 on success, 0 on failure.

$device->getChannelIfIndex($self, @ports)
	This is used in the function immediately below. An interswitch
	trunk may be connected by several physical wires constituting
	a logical trunk.  It is necessary on Ciscos (and possibly
	others) to return a special cookie for trunk operations.
	This function only deals with one trunk at a time.

$device->setVlansOnTrunk($self, $trunkIndex, $value, @vlan_numbers)
#        $trunkIndex: cookie returned above for the trunk on which to operate.
#        $value: 0 to disallow the VLAN on the trunk, 1 to allow it
#        @vlan_numbers: An array of 802.1Q VLAN numbers to operate on
#        Returns 1 on success, 0 otherwise

$device->resetVlanIfOnTrunk($self, $modport, $vlan_number)
#        modport: module.port of the trunk to operate on
#        vlan_number: A 802.1Q VLAN tag number to check
#        return value currently ignored.  Takes vlan out of the trunk and puts
#	 it back in to flush the FDB.

$device->convertPortFormat($self, $output_format, @ports)
	returns a list of ports in the specified output format

Looking at to figure out the basics of switch configuration
with SNMP is not a bad idea, but keep in mind that there are several things
which make this module very complicated.

First, it supports two different switch operating systems, IOS and CatOS, so
there are some special cases for each.

Second, it supports some wacky features that you (hopefully) won't have to
worry about, like 'private VLANs'. This, in particular, leads to some complex
cases in setPortVlan() and createVlan().

Third, in different MIBs, ports are 'addressed' differently. In standardized
MIBs, ports tend to be referred to by an 'ifIndex', which is just an integer.
Cisco likes to refer to ports as a 'module.port' (at least, in CatOS), since
this is the 'native' way to name ports on their modular switches. So, we have
to convert back and forth between the two formats. So, keep in mind that
operations for which I convert the ports into $PORT_FORMAT_IFINDEX are ones
that _might_ be supported in your switch.
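
The back-and-forth conversion boils down to a pair of lookup tables. The
Python sketch below assumes the module.port to ifIndex pairs have already
been read from the switch; PORT_FORMAT_MODPORT, build_maps and convert_ports
are names invented for this example (only $PORT_FORMAT_IFINDEX is mentioned
above).

```python
PORT_FORMAT_IFINDEX = "ifindex"
PORT_FORMAT_MODPORT = "modport"  # assumed counterpart, for illustration


def build_maps(pairs):
    """Build both lookup directions from (modport, ifindex) pairs."""
    to_ifindex = dict(pairs)
    to_modport = {ifindex: modport for modport, ifindex in pairs}
    return to_ifindex, to_modport


def convert_ports(output_format, ports, to_ifindex, to_modport):
    """Convert each port to output_format; unknown ports become None."""
    result = []
    for port in ports:
        is_modport = isinstance(port, str)  # "1.2" style vs. integer ifIndex
        if output_format == PORT_FORMAT_IFINDEX:
            result.append(to_ifindex.get(port) if is_modport else port)
        elif output_format == PORT_FORMAT_MODPORT:
            result.append(port if is_modport else to_modport.get(port))
        else:
            raise ValueError("unknown port format: %r" % (output_format,))
    return result
```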

You can also look at for examples if you wish, but keep in
mind it's not actively maintained.

##### VLAN IDs and VLAN numbers

This part causes a lot of confusion, sorry. There are two different ways we
refer to VLANs, which can get confusing, 'cause they're both integers. A VLAN
ID is a made-up number that we use to identify a VLAN - it's an auto-increment
value in the vlans table. The actual VLAN on the switch also has a number
associated with it.

Let's give an example. I create an experiment called testbed/threenodes, and I
put the three members, Dusty, Lucky, and Ned, into a LAN called 'amigos'.
Let's say this gets an ID of 314 in the VLAN table. 

Now I need to create this VLAN on the switches. The snmpit_cisco_stack module
goes out and finds the list of VLAN _numbers_ that already exist on the
switch. This is the list you'd get from, say, a 'show vlan' command on a Cisco
switch. Let's say this list of VLANs is 1,2,3,4,5,7,9,10,12. The module finds
the first unused number (6), and uses that for the new VLAN. Note that there
are holes in the VLAN list, presumably from VLANs that were previously created
and then deleted.
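
Picking the number is a first-fit search. A minimal Python sketch follows;
the function name and the default bounds are assumptions (4094 is the highest
usable 802.1Q VLAN number), and the bounds play the role of the min_vlan and
max_vlan stack options described later.

```python
def first_unused_vlan_number(in_use, min_vlan=1, max_vlan=4094):
    """Return the lowest VLAN number in [min_vlan, max_vlan] not in use."""
    used = set(in_use)
    for number in range(min_vlan, max_vlan + 1):
        if number not in used:
            return number
    return None  # the allowed range is full
```

With the list from the example, first_unused_vlan_number([1, 2, 3, 4, 5, 7,
9, 10, 12]) returns 6.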

So now we have a VLAN with ID 314 and number 6. We have to store this mapping
somewhere so that future invocations of snmpit will be able to find the VLAN.
For Ciscos, we do this by setting the VLAN's 'name' field to the ID. So now
we have a VLAN with number 6 and the name '314'. Got it? This mapping may have
to be stored somewhere else, like in the database, for other switches, if you
can't set names for VLANs.
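
Looking the mapping up again is then a matter of inverting the switch's
number-to-name table. This Python sketch assumes that table has already been
fetched; find_vlans here is a hypothetical stand-in that mirrors the behaviour
documented for findVlans() (missing IDs map to no number, and no IDs given
means return everything).

```python
def find_vlans(vlan_names, wanted_ids=()):
    """Map VLAN IDs to VLAN numbers.

    vlan_names: dict of VLAN number -> name as stored on the switch.
    """
    by_id = {name: number for number, name in vlan_names.items()}
    if not wanted_ids:
        return by_id
    # IDs that are not present on the switch map to None.
    return {vlan_id: by_id.get(vlan_id) for vlan_id in wanted_ids}
```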

The upshot is, be careful as you're looking at the API to distinguish between
functions that take VLAN names and ones that take VLAN numbers. It should be
pretty clear from the comments and/or variable names. Send me mail if you find
any that aren't clear.

##### Stack options

The switch_stack_types table holds options for each stack. They are:
stack_type:		Used by snmpit to figure out which backend module to
			use
supports_private:	Cisco-specific, don't worry about it
single_domain:		Whether all switches in the stack share a VLAN domain
			(such as using VTP on Ciscos), or whether you need to
			talk to each switch individually to create VLANs. You
			can decide if it makes sense to implement this option
			for your switches or not.
snmp_community:		The SNMP community string that will be used for
			read/write access to the switches. You should support
			this option, and default to 'public' if no community
			is given
min_vlan:		The minimum VLAN number (remember, not ID) that your
			module is allowed to create on the switch. It would be
			good to implement this if possible. This can be useful
			if the switch is being used for more than one purpose
			- ie. someone else could be creating VLANs in the
			range 1-500, and your Emulab could be using VLANs 501
			- 1000.
max_vlan:		Like min_vlan, but the maximum VLAN number

##### Misc. things to be aware of

We want to disable ports that are not currently part of an experiment. So,
when we tear down a VLAN, we have to do something to the ports that were
previously in it. On Intel switches, you can actually have ports that are not
in any VLANs, so we disable them and remove them from their VLAN.
For Ciscos, however, ports are always in a VLAN - so we move them into VLAN 1,
and set them to 'disabled'.

In snmpit_cisco, we have two ways to create VLANs on switches - one is using
VTP, in which we create VLANs on a 'leader' and the switches do the job of
getting it created everywhere. The other scheme has some interesting
properties. Depending on what your switches support, you may have to deal with
some of the same issues. In it, we have to talk to all switches to create
VLANs. We have made a decision that, though poor from a performance
perspective, helps consistency. We create the VLAN on _all_ switches,
regardless of which ones actually have ports in it. This way, we do not have
to deal with issues that come from different switches having differing sets of
VLANs. Our locking protocol is such that when we create a VLAN, we do so on
the switches in lexicographically sorted order - when we delete a VLAN, we do
it in reverse lexicographic order. This way, we don't have different concurrent
instances of snmpit pick the same VLAN name because they are getting VLAN
lists from different switches.
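
The ordering rule can be sketched as below. This is a Python illustration
with hypothetical names (create_everywhere, delete_everywhere, and the create
and delete callbacks); it shows only the ordering, not snmpit's actual
locking.

```python
def create_everywhere(switches, create):
    """Create on every switch in sorted name order; stop at the first failure.

    Returns the switches on which creation succeeded, in the order visited.
    """
    done = []
    for name in sorted(switches):
        if not create(name):
            break
        done.append(name)
    return done


def delete_everywhere(switches, delete):
    """Delete in reverse sorted order; return the switches that succeeded."""
    return [name for name in sorted(switches, reverse=True) if delete(name)]
```

Because every instance visits the switches in the same order, two concurrent
creators both start at the same first switch, so they are working from the
same VLAN list rather than from lists taken from different switches.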

##### Where to learn about new switch vendors and SNMP

There are three ways in which we've found out which SNMP commands we needed to
invoke, in what order, to get this stuff to work.

Originally, we started by using the vendor's own SNMP configuration tools and
tcpdump-ing them. This can end up being a bit confusing, but it's the most
reliable source for discovering the switches' quirks.

We've also grabbed all the MIB files from the vendor (Cisco's are at: ), and
looked through the OID names and comments. Luckily, the comments in Cisco MIB
files are pretty good.

Finally, some vendors may provide documentation on how to perform some actions.
I wouldn't count on this one - Cisco's documentation, for example, is very
lacking in this area. I actually had a Cisco support person tell me that it was
not possible to do through SNMP something we'd been doing for years.

You can, of course, try out the OIDs we use in snmpit_cisco and snmpit_intel -
some of them may be supported on your switch.

##### switchmac

switchmac is a script, used at installation time, to print out all the MAC
addresses a switch has learned, and what ports they're on. The idea is to
eliminate the need for the testbed administrators to manually enter which nodes
are connected to which ports on which switches.

The stuff in this should be pretty standard - ie. this script has support for
Intel, Nortel, and Cisco switches, and I suspect that one of these should work
pretty similarly to whatever switch you're trying to port it to.

In your switch vendor's documentation, take a look for the dot1dTpFdbTable in
the BRIDGE-MIB (which is a standard one.)

The main ickiness in this script is something called 'Community String
Indexing' - see, the MIB is designed for switches that store one MAC table that
is VLAN-independent (ie. you can't use the same MAC in two different VLANs at
once.) Cisco keeps a MAC table per-VLAN. So, the way they get around this is to
change the community string you use to get access to the MAC table for each
VLAN.  So, if your community string is 'public', and you want to look at the
MAC table for VLAN 42, you talk to the switch with the community string
'public@42'. Terrible. I don't know if other switch vendors use this same hack,
or if they will have some other hack of their own.
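
In code, the hack amounts to building a different community string for each
VLAN and walking the MAC table once per VLAN. The Python sketch below uses
invented names (indexed_community, collect_macs) and takes the actual SNMP
walk as a callback, walk_fdb, which is assumed to yield (mac, port) pairs.

```python
def indexed_community(community, vlan_number):
    """Build the per-VLAN community string, e.g. 'public@42'."""
    return "%s@%d" % (community, vlan_number)


def collect_macs(community, vlan_numbers, walk_fdb):
    """Walk the MAC table once per VLAN; return {(vlan, mac): port}."""
    table = {}
    for vlan in vlan_numbers:
        for mac, port in walk_fdb(indexed_community(community, vlan)):
            table[(vlan, mac)] = port
    return table
```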