    Leigh B Stoller authored · 02a1c736

    What started as a small project to support B/LAGGs on stitching links,
    but developed into a giant BAGG of problems:
    
    For the first part, the complication stems from representing stitching
    links as a fake node with an interface. In general this is fine since up
    to now, every stitching link has been a plain wire, and so snmpit does
    the right thing because the port is in tagged mode. But this breaks down
    when the link is actually an aggregate (LAG/BAGG) since snmpit needs to
    take a different path for that, setVlanOnTrunks2(), which operates on all
    of the trunk links between switches, and knows how to deal with link
    aggregation. Trying to convince snmpit to handle fake switches and
    trunks with only one end seemed like a bad idea. So I opted for adding
    a LAG field to the interfaces table so we can mark those interfaces. And
    I changed snmpit_stack to look for those ports, and redirect them onto
    the trunk path.
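
    As a rough illustration of that redirect (not the actual snmpit_stack
    code), the idea is to split the port list on the new LAG marker and send
    the marked ports down the trunk path. The Port class, its lag field, and
    split_ports() are hypothetical names invented for this sketch; only the
    idea of a LAG flag on the interface comes from the change itself.

```python
from dataclasses import dataclass
from typing import Iterable, List, Tuple


@dataclass(frozen=True)
class Port:
    """Hypothetical stand-in for an interfaces-table row."""
    node: str
    iface: str
    lag: bool = False  # assumed representation of the new LAG field


def split_ports(ports: Iterable[Port]) -> Tuple[List[Port], List[Port]]:
    """Return (plain_ports, lag_ports), preserving input order.

    Plain ports would keep going through the normal per-port VLAN path;
    LAG-marked ports would be handed to the trunk path instead.
    """
    plain: List[Port] = []
    lagged: List[Port] = []
    for port in ports:
        (lagged if port.lag else plain).append(port)
    return plain, lagged
```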
    
    This worked great on our Dell switches, but not on scidmz (an HP). That
    was strange because it was failing in an identical situation that seemed
    to work fine on our Moonshot HP (bighp1).
    
    Many hours later ... we determined that Version 7 firmware has a
    different ifindex mapping for BAGGs, and that we have been lucky not to
    have hit that problem on the moonshot cluster. Many more hours later,
    Kirk discovered that the very recent firmware update to scidmz resulted
    in snmp no longer being able to change the membership of BAGGs. Holy Bat
    BAGG!
    
    The best alternative was to use the libNetconf module that is already
    used to speak CLI to the switch for OpenFlow configuration. Just
    manipulate the BAGG via the CLI; the commands are pretty simple.
    
    Well, that didn't quite work because scidmz does not allow password based
    authentication (at some point it might have), and the ssh key that
    scidmz does accept is in /root/.ssh/, and snmpit runs as the user. No
    problem, let's just add another key pair and stick that in /usr/testbed/etc
    where the user can access it. But ssh will not allow a 644 private
    key file to be used. So ... copy that file to /tmp (so that the
    user owns it) and chmod it to 600, and pass that filename down into the
    libNetconf module, which has been changed to optionally use an ssh key
    file.
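
    The copy-and-chmod step can be sketched roughly as follows. This is a
    Python illustration of the idea, not the code in snmpit; the function
    name private_copy_of_key() and the temp file prefix are made up for the
    example.

```python
import os
import shutil
import stat
import tempfile


def private_copy_of_key(shared_key_path: str) -> str:
    """Copy a shared (e.g. mode 644) key into a temp file owned by the
    calling user with mode 600, and return the new path.

    ssh refuses private keys that other users can read, so the copy is
    what gets handed to the ssh client, not the shared original.
    """
    # mkstemp creates the file readable and writable only by the caller.
    fd, tmp_path = tempfile.mkstemp(prefix="sshkey_copy_")
    try:
        with os.fdopen(fd, "wb") as dst, open(shared_key_path, "rb") as src:
            shutil.copyfileobj(src, dst)
        # Set 0600 explicitly rather than relying on the creation mode.
        os.chmod(tmp_path, stat.S_IRUSR | stat.S_IWUSR)
    except Exception:
        os.unlink(tmp_path)
        raise
    return tmp_path
```

    The caller is responsible for deleting the copy once the session is done.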
    
    So to sum up, there are two new node_attributes set on the stitching
    aggregate for scidmz:
    
    * snmpit_badBAGG: which says the firmware no longer allows snmpit to
      manipulate BAGGs.
    
    * snmpit_sshkey: which is the path to the ssh key for libNetconf,
      instead of password based authentication.
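
    One way to picture how the two attributes combine is the sketch below.
    It assumes the attributes arrive as a plain dict of strings and that any
    non-empty, non-"0" value of snmpit_badBAGG means "set"; both are
    assumptions for illustration, and bagg_plan() is a hypothetical helper,
    not a function in snmpit.

```python
from typing import Dict, Optional


def bagg_plan(attrs: Dict[str, str]) -> Dict[str, Optional[str]]:
    """Decide how BAGG membership changes would be made for one switch.

    Returns a dict with 'method' ('snmp' or 'cli'), plus 'auth' and
    'sshkey' when the CLI path is chosen.
    """
    bad_bagg = attrs.get("snmpit_badBAGG", "")
    if bad_bagg in ("", "0"):
        # Firmware still lets SNMP manipulate BAGGs.
        return {"method": "snmp", "auth": None, "sshkey": None}

    key_path = attrs.get("snmpit_sshkey") or None
    return {
        "method": "cli",
        # Use the key file when one is configured, else fall back to password.
        "auth": "sshkey" if key_path else "password",
        "sshkey": key_path,
    }
```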
    
    The bottom line is ... do not upgrade our other HP switches.