Skip to content
  • Jon Mason's avatar
    PCI: Set PCI-E Max Payload Size on fabric · b03e7495
    Jon Mason authored
    
    
    On a given PCI-E fabric, each device, bridge, and root port can have a
    different PCI-E maximum payload size.  There is a sizable performance
    boost for having the largest possible maximum payload size on each PCI-E
    device.  However, if improperly configured, fatal bus errors can occur.
    Thus, it is important to ensure that PCI-E payloads sends by a device
    are never larger than the MPS setting of all devices on the way to the
    destination.
    
    This can be achieved two ways:
    
    - A conservative approach is to use the smallest common denominator of
      the entire tree below a root complex for every device on that fabric.
    
    This means for example that having a 128 bytes MPS USB controller on one
    leg of a switch will dramatically reduce performances of a video card or
    10GE adapter on another leg of that same switch.
    
    It also means that any hierarchy supporting hotplug slots (including
    expresscard or thunderbolt I suppose, dbl check that) will have to be
    entirely clamped to 128 bytes since we cannot predict what will be
    plugged into those slots, and we cannot change the MPS on a "live"
    system.
    
    - A more optimal way is possible, if it falls within a couple of
      constraints:
    * The top-level host bridge will never generate packets larger than the
      smallest TLP (or if it can be controlled independently from its MPS at
      least)
    * The device will never generate packets larger than MPS (which can be
      configured via MRRS)
    * No support of direct PCI-E <-> PCI-E transfers between devices without
      some additional code to specifically deal with that case
    
    Then we can use an approach that basically ignores downstream requests
    and focuses exclusively on upstream requests. In that case, all we need
    to care about is that a device MPS is no larger than its parent MPS,
    which allows us to keep all switches/bridges to the max MPS supported by
    their parent and eventually the PHB.
    
    In this case, your USB controller would no longer "starve" your 10GE
    Ethernet and your hotplug slots won't affect your global MPS.
    Additionally, the hotplugged devices themselves can be configured to a
    larger MPS up to the value configured in the hotplug bridge.
    
    To choose between the two available options, two PCI kernel boot args
    have been added to the PCI calls.  "pcie_bus_safe" will provide the
    former behavior, while "pcie_bus_perf" will perform the latter behavior.
    By default, the latter behavior is used.
    
    NOTE: due to the location of the enablement, each arch will need to add
    calls to this function.  This patch only enables x86.
    
    This patch includes a number of changes recommended by Benjamin
    Herrenschmidt.
    
    Tested-by: default avatar <Jordan_Hargrave@dell.com>
    Signed-off-by: default avatarJon Mason <mason@myri.com>
    Signed-off-by: default avatarJesse Barnes <jbarnes@virtuousgeek.org>
    b03e7495