1. 06 Apr, 2016 7 commits
    • mac80211: add A-MSDU tx support · 6e0456b5
      Felix Fietkau authored
      Requires software tx queueing and fast-xmit support. For good
      performance, drivers need frag_list support as well. This avoids the
      need for copying data of aggregated frames. Running without it is only
      supported for debugging purposes.
      
      To avoid performance and packet size issues, the rate control module or
      driver needs to limit the maximum A-MSDU size by setting
      max_rc_amsdu_len in struct ieee80211_sta.
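
      As a rough illustration of the size limit described above, the sketch
      below checks whether one more MSDU still fits into an aggregate. The
      function names, the parameters and the convention that a limit of 0
      means "no limit" are assumptions made for this example and are not
      taken from mac80211. Only the 14-byte subframe header and the 4-byte
      subframe alignment come from the 802.11 A-MSDU format.

```c
#include <stdbool.h>
#include <stddef.h>

#define AMSDU_SUBFRAME_HDR_LEN 14u  /* DA (6) + SA (6) + length (2) */

/* Round up to the 4-byte boundary at which the next subframe starts. */
size_t amsdu_pad4(size_t len)
{
    return (len + 3u) & ~(size_t)3u;
}

/*
 * cur_len:  bytes already in the aggregate (0 if it is empty)
 * msdu_len: payload length of the candidate MSDU
 * max_len:  limit set by rate control or the driver; 0 is treated as
 *           "no limit" (an assumption of this sketch)
 */
bool amsdu_fits(size_t cur_len, size_t msdu_len, size_t max_len)
{
    size_t new_len = amsdu_pad4(cur_len) + AMSDU_SUBFRAME_HDR_LEN + msdu_len;

    return max_len == 0 || new_len <= max_len;
}
```

      With a hypothetical limit of 3839 bytes, two 1500-byte MSDUs fit and
      a third one does not.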
      Signed-off-by: Felix Fietkau <nbd@openwrt.org>
      [fix locking issue]
      Signed-off-by: Johannes Berg <johannes.berg@intel.com>
    • mac80211: enable collecting station statistics per-CPU · c9c5962b
      Johannes Berg authored
      If the driver advertises the new HW flag USE_RSS, make the
      station statistics on the fast-rx path per-CPU. This will
      enable calling the RX in parallel, only hitting locking or
      shared cachelines when the fast-RX path isn't available.
      Signed-off-by: Johannes Berg <johannes.berg@intel.com>
    • mac80211: add fast-rx path · 49ddf8e6
      Johannes Berg authored
      The regular RX path has a lot of code, but with a few
      assumptions on the hardware it's possible to reduce the
      amount of code significantly. Currently the assumptions
      on the driver are the following:
       * hardware/driver reordering buffer (if supporting aggregation)
       * hardware/driver decryption & PN checking (if using encryption)
       * hardware/driver did de-duplication
       * hardware/driver did A-MSDU deaggregation
       * AP_LINK_PS is used (in AP mode)
       * no client powersave handling in mac80211 (in client mode)
      
      of which some are actually checked per packet:
       * de-duplication
       * PN checking
       * decryption
      and additionally packets must
       * not be A-MSDU (have been deaggregated by driver/device)
       * be data packets
       * not be fragmented
       * be unicast
       * have RFC 1042 header
      
      Additionally dynamically we assume:
       * no encryption or CCMP/GCMP, TKIP/WEP/other not allowed
       * station must be authorized
       * 4-addr format not enabled
      
      Some data needed for the RX path is cached in a new per-station
      "fast_rx" structure, so that we only need to look at this and
      the packet, no other memory when processing packets on the fast
      RX path.
      
      After doing the above per-packet checks, the data path collapses
      down to a pretty simple conversion function taking advantage of
      the data cached in the small fast_rx struct.
      
      This should speed up the RX processing, and will make it easier
      to reason about parallelizing RX (for which statistics will need
      to be per-CPU still.)
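
      The per-packet conditions listed above can be sketched as a single
      predicate. This is a simplified userspace illustration, not the
      mac80211 code: struct rx_frame and fast_rx_eligible() are
      hypothetical, and the frame is assumed to be parsed already. The
      mask values are the standard 802.11 frame control and sequence
      control bit positions. The dynamic conditions (cipher, authorized
      station, no 4-addr format) are assumed to be handled when the
      per-station cache is built, so they do not appear here.

```c
#include <stdbool.h>
#include <stdint.h>
#include <string.h>

/* Frame control / sequence control masks as defined by IEEE 802.11. */
#define FCTL_FTYPE      0x000cu
#define FTYPE_DATA      0x0008u
#define FCTL_MOREFRAGS  0x0400u
#define SCTL_FRAG       0x000fu

/* Hypothetical, already-parsed view of a received frame. */
struct rx_frame {
    uint16_t frame_control;   /* host byte order */
    uint16_t seq_ctrl;        /* host byte order */
    uint8_t  addr1[6];        /* receiver address */
    bool     amsdu;           /* A-MSDU bit from the QoS control field */
    const uint8_t *payload;   /* start of the LLC/SNAP header */
    size_t   payload_len;
};

static const uint8_t rfc1042_hdr[6] = { 0xaa, 0xaa, 0x03, 0x00, 0x00, 0x00 };

/* Returns true if the frame may take the fast path. */
bool fast_rx_eligible(const struct rx_frame *f)
{
    if ((f->frame_control & FCTL_FTYPE) != FTYPE_DATA)
        return false;                       /* not a data frame */
    if ((f->frame_control & FCTL_MOREFRAGS) || (f->seq_ctrl & SCTL_FRAG))
        return false;                       /* fragmented */
    if (f->addr1[0] & 0x01)
        return false;                       /* group-addressed */
    if (f->amsdu)
        return false;                       /* still an A-MSDU */
    if (f->payload_len < sizeof(rfc1042_hdr) ||
        memcmp(f->payload, rfc1042_hdr, sizeof(rfc1042_hdr)) != 0)
        return false;                       /* no RFC 1042 header */
    return true;
}
```

      Any frame that fails the predicate would simply fall back to the
      regular RX path.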
      Signed-off-by: Johannes Berg <johannes.berg@intel.com>
    • mac80211: fix RX u64 stats consistency on 32-bit platforms · 0f9c5a61
      Johannes Berg authored
      On 32-bit platforms, the 64-bit counters we keep need to be protected
      so that they can be read consistently. Use the u64_stats_sync mechanism
      to do that.
      
      In order to not end up with overly long lines, refactor the tidstats
      assignments a bit.
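
      To illustrate why protection is needed, here is a userspace sketch
      of a sequence-counter scheme in the spirit of u64_stats_sync. The
      kernel helpers are not available outside the kernel, so C11 atomics
      serve as a stand-in. struct synced_u64 and both functions are
      hypothetical names, and a single writer at a time is assumed.

```c
#include <stdatomic.h>
#include <stdint.h>

/* A 64-bit value kept as two 32-bit halves, which is effectively how a
 * 32-bit CPU has to access it. */
struct synced_u64 {
    atomic_uint      seq;   /* odd while an update is in progress */
    _Atomic uint32_t lo;
    _Atomic uint32_t hi;
};

/* Writer side; assumes only one writer runs at a time. */
void synced_u64_add(struct synced_u64 *c, uint64_t delta)
{
    unsigned int s = atomic_load_explicit(&c->seq, memory_order_relaxed);
    uint64_t v = ((uint64_t)atomic_load_explicit(&c->hi, memory_order_relaxed) << 32) |
                 atomic_load_explicit(&c->lo, memory_order_relaxed);

    v += delta;
    atomic_store_explicit(&c->seq, s + 1, memory_order_relaxed);
    atomic_thread_fence(memory_order_release);
    atomic_store_explicit(&c->lo, (uint32_t)v, memory_order_relaxed);
    atomic_store_explicit(&c->hi, (uint32_t)(v >> 32), memory_order_relaxed);
    atomic_store_explicit(&c->seq, s + 2, memory_order_release);
}

/* Reader side: retry until both halves belong to the same update. */
uint64_t synced_u64_read(struct synced_u64 *c)
{
    unsigned int s0, s1;
    uint32_t lo, hi;

    do {
        s0 = atomic_load_explicit(&c->seq, memory_order_acquire);
        lo = atomic_load_explicit(&c->lo, memory_order_relaxed);
        hi = atomic_load_explicit(&c->hi, memory_order_relaxed);
        atomic_thread_fence(memory_order_acquire);
        s1 = atomic_load_explicit(&c->seq, memory_order_relaxed);
    } while ((s0 & 1u) || s0 != s1);

    return ((uint64_t)hi << 32) | lo;
}
```

      Without the sequence check, a reader racing with a carry from the
      low into the high half could combine a new low word with an old
      high word.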
      Signed-off-by: Johannes Berg <johannes.berg@intel.com>
    • mac80211: fix last RX rate data consistency · 4f6b1b3d
      Johannes Berg authored
      When storing the last_rate_* values in the RX code, there's nothing
      to guarantee consistency, so a concurrent reader could see, e.g.
      last_rate_idx on the new value, but last_rate_flag still on the old,
      getting completely bogus values in the end.
      
      To fix this, I lifted the sta_stats_encode_rate() function from my
      old rate statistics code, which encodes the entire rate data into a
      single 16-bit value, avoiding the consistency issue.
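
      The idea of packing the rate into one value can be sketched as
      follows. The bit layout, struct rx_rate and the function names are
      invented for this example. The layout actually used by
      sta_stats_encode_rate() may differ.

```c
#include <stdint.h>

/* Hypothetical bit layout, for illustration only. */
#define RATE_IDX_MASK    0x00ffu   /* bits 0-7: rate or MCS index */
#define RATE_FLAGS_MASK  0x0f00u   /* bits 8-11: flags such as short GI */
#define RATE_FLAGS_SHIFT 8
#define RATE_TYPE_MASK   0xf000u   /* bits 12-15: legacy/HT/VHT selector */
#define RATE_TYPE_SHIFT  12

struct rx_rate {
    uint8_t idx;
    uint8_t flags;   /* only the low 4 bits are kept */
    uint8_t type;    /* only the low 4 bits are kept */
};

/* Pack all three fields into one 16-bit value. */
uint16_t rate_encode(struct rx_rate r)
{
    return (uint16_t)(r.idx |
                      ((r.flags & 0x0fu) << RATE_FLAGS_SHIFT) |
                      ((r.type & 0x0fu) << RATE_TYPE_SHIFT));
}

/* Unpack a value produced by rate_encode(). */
struct rx_rate rate_decode(uint16_t v)
{
    struct rx_rate r = {
        .idx   = (uint8_t)(v & RATE_IDX_MASK),
        .flags = (uint8_t)((v & RATE_FLAGS_MASK) >> RATE_FLAGS_SHIFT),
        .type  = (uint8_t)((v & RATE_TYPE_MASK) >> RATE_TYPE_SHIFT),
    };
    return r;
}
```

      Because the writer publishes the whole rate with one 16-bit store,
      a reader sees either the old rate or the new one, never a mix of
      fields from both.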
      Signed-off-by: Johannes Berg <johannes.berg@intel.com>
    • mac80211: add separate last_ack variable · b8da6b6a
      Johannes Berg authored
      Instead of touching the rx_stats.last_rx from the status path, introduce
      and use a status_stats.last_ack variable. This will make rx_stats.last_rx
      indicate when the last frame was received, making it available for real
      "last_rx" and statistics gathering; statistics, when done per-CPU, will
      need to figure out which place was updated last for those items where the
      "last" value is exposed.
      Signed-off-by: Johannes Berg <johannes.berg@intel.com>
    • mac80211: move averaged values out of rx_stats · 0be6ed13
      Johannes Berg authored
      Move the averaged values out of rx_stats and into rx_stats_avg,
      to cleanly split them out. The averaged ones cannot be supported
      for parallel RX in a per-CPU fashion, while the other values can
      be collected per CPU and then combined/selected when needed.
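
      A small sketch of the "combined/selected" step, with hypothetical
      types and names: plain counters are summed over all slots, while
      "last" values are taken from whichever slot was updated most
      recently. An average has no such merge rule, since its value
      depends on the order in which samples arrived, which is presumably
      why it is kept apart.

```c
#include <stdint.h>

#define NR_SLOTS 4   /* stand-in for the number of CPUs */

/* Hypothetical per-CPU slice of the RX statistics. */
struct rx_slot {
    uint64_t packets;
    uint64_t bytes;
    uint64_t last_rx;      /* timestamp of the last frame on this CPU */
    int8_t   last_signal;
};

struct rx_totals {
    uint64_t packets;
    uint64_t bytes;
    uint64_t last_rx;
    int8_t   last_signal;
};

/* Counters are summed; "last" values come from the most recent slot. */
struct rx_totals rx_stats_combine(const struct rx_slot slots[NR_SLOTS])
{
    struct rx_totals t = { 0 };
    int newest = 0;

    for (int i = 0; i < NR_SLOTS; i++) {
        t.packets += slots[i].packets;
        t.bytes   += slots[i].bytes;
        if (slots[i].last_rx > slots[newest].last_rx)
            newest = i;
    }
    t.last_rx     = slots[newest].last_rx;
    t.last_signal = slots[newest].last_signal;
    return t;
}
```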
      Signed-off-by: Johannes Berg <johannes.berg@intel.com>
  2. 05 Apr, 2016 4 commits
  3. 24 Feb, 2016 4 commits
  4. 14 Jan, 2016 1 commit
    • mac80211: fix PS-Poll handling · 1a57081a
      Emmanuel Grumbach authored
      My commit below broke PS-Poll handling. In case the driver
      has no frames buffered, driver_release_tids will be 0, but
      calling find_highest_prio_tid() with 0 as a parameter is
      not a good idea:
      fls(0) - 1 = -1.
      This bug caused mac80211 to think that frames were buffered
      in the driver which in turn was confused because mac80211
      was asking to release frames that were not reported to
      exist.
      On iwlwifi, this led to the WARNING below:
      
      WARNING: CPU: 0 PID: 11230 at drivers/net/wireless/intel/iwlwifi/mvm/sta.c:1733 iwl_mvm_sta_modify_sleep_tx_count+0x2af/0x320 [iwlmvm]()
      ffffffffc0627c60 ffff8800069b7648 ffffffff81888913 0000000000000000
      0000000000000000 ffff8800069b7688 ffffffff81089d6a ffff8800069b7678
      0000000000000001 ffff88003b35abf0 ffff88000698b128 ffff8800069b76d4
      Call Trace:
      [<ffffffff81888913>] dump_stack+0x4c/0x65
      [<ffffffff81089d6a>] warn_slowpath_common+0x8a/0xc0
      [<ffffffff81089e5a>] warn_slowpath_null+0x1a/0x20
      [<ffffffffc05f36bf>] iwl_mvm_sta_modify_sleep_tx_count+0x2af/0x320 [iwlmvm]
      [<ffffffffc05dae41>] iwl_mvm_mac_release_buffered_frames+0x31/0x40 [iwlmvm]
      [<ffffffffc045d8b6>] ieee80211_sta_ps_deliver_response+0x6e6/0xd80 [mac80211]
      [<ffffffffc0461296>] ieee80211_sta_ps_deliver_poll_response+0x26/0x30 [mac80211]
      [<ffffffffc048f743>] ieee80211_rx_handlers+0xa83/0x2900 [mac80211]
      [<ffffffffc04917ad>] ieee80211_prepare_and_rx_handle+0x1ed/0xa70 [mac80211]
      [<ffffffffc045e3d5>] ? sta_info_get_bss+0x5/0x4a0 [mac80211]
      [<ffffffffc04925b6>] ieee80211_rx_napi+0x586/0xcd0 [mac80211]
      [<ffffffffc05eaa3e>] iwl_mvm_rx_rx_mpdu+0x59e/0xc60 [iwlmvm]
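
      The arithmetic hazard described above can be reproduced in
      isolation. fls_sketch() is a portable stand-in for the kernel's
      fls(), and the two helper functions are hypothetical. They only
      demonstrate the empty-bitmap case and do not reproduce
      find_highest_prio_tid().

```c
/* Position of the highest set bit, counting from 1; 0 if no bit is set. */
int fls_sketch(unsigned int x)
{
    int pos = 0;

    while (x) {
        pos++;
        x >>= 1;
    }
    return pos;
}

/* Unguarded: yields -1 for an empty bitmap. */
int highest_set_tid_unguarded(unsigned int tids)
{
    return fls_sketch(tids) - 1;
}

/* Guarded: returns 0 and leaves *tid untouched when nothing is set. */
int highest_set_tid(unsigned int tids, int *tid)
{
    if (!tids)
        return 0;
    *tid = fls_sketch(tids) - 1;
    return 1;
}
```

      A caller using the guarded variant has to handle the "nothing
      buffered" case explicitly instead of passing a bogus TID along.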
      
      Fixes: 0ead2510 ("mac80211: allow the driver to send EOSP when needed")
      Signed-off-by: Emmanuel Grumbach <emmanuel.grumbach@intel.com>
      Signed-off-by: Johannes Berg <johannes.berg@intel.com>
  5. 04 Dec, 2015 3 commits
  6. 21 Oct, 2015 2 commits
    • mac80211: move station statistics into sub-structs · e5a9f8d0
      Johannes Berg authored
      Group station statistics by where they're (mostly) updated
      (TX, RX and TX-status) and group them into sub-structs of
      the struct sta_info.
      
      Also rename the variables since the grouping now makes it
      obvious where they belong.
      
      This makes it easier to identify where the statistics are
      updated in the code, and thus easier to think about them.
      Signed-off-by: Johannes Berg <johannes.berg@intel.com>
    • mac80211: move beacon_loss_count into ifmgd · 976bd9ef
      Johannes Berg authored
      There's little point in keeping (and even sending to userspace)
      the beacon_loss_count value per station, since it can only apply
      to the AP on a managed-mode connection. Move the value to ifmgd,
      advertise it only in managed mode, and remove it from ethtool as
      it's available through better interfaces.
      Signed-off-by: Johannes Berg <johannes.berg@intel.com>
  7. 14 Oct, 2015 1 commit
  8. 05 Oct, 2015 1 commit
    • mac80211: use ktime_get_seconds · 84b00607
      Arnd Bergmann authored
      The mac80211 code uses ktime_get_ts to measure the connected time.
      As this uses monotonic time, it is y2038 safe on 32-bit systems,
      but we still want to deprecate the use of 'timespec' because most
      other users are broken.
      
      This changes the code to use ktime_get_seconds() instead, which
      avoids the timespec structure and is slightly more efficient.
      Signed-off-by: Arnd Bergmann <arnd@arndb.de>
      Cc: Johannes Berg <johannes@sipsolutions.net>
      Cc: linux-wireless@vger.kernel.org
      Signed-off-by: David S. Miller <davem@davemloft.net>
  9. 14 Aug, 2015 1 commit
    • mac80211: use DECLARE_EWMA · 40d9a38a
      Johannes Berg authored
      Instead of using the out-of-line average calculation, use the new
      DECLARE_EWMA() macro to declare a signal EWMA, and use that.
      
      This actually *reduces* the code size slightly (on x86-64) while
      also reducing the station info size by 80 bytes.
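
      For readers unfamiliar with the technique, a fixed-point EWMA can
      be sketched in a few lines. This is a standalone illustration and
      not the code that DECLARE_EWMA() generates. The type, the function
      names and the factor and weight values are assumptions.

```c
#include <stdint.h>

/* Hypothetical parameters, chosen for illustration. */
#define EWMA_FACTOR 1024u   /* fixed-point scaling of the stored value */
#define EWMA_WEIGHT 8u      /* a new sample contributes 1/8 */

struct ewma_sketch {
    uint64_t internal;      /* scaled average; 0 means "no sample yet" */
};

/* Fold one sample into the average; the first sample seeds it. */
void ewma_sketch_add(struct ewma_sketch *e, uint32_t val)
{
    uint64_t scaled = (uint64_t)val * EWMA_FACTOR;

    e->internal = e->internal
        ? (e->internal * (EWMA_WEIGHT - 1) + scaled) / EWMA_WEIGHT
        : scaled;
}

/* Return the current average, rounded down to an integer. */
uint32_t ewma_sketch_read(const struct ewma_sketch *e)
{
    return (uint32_t)(e->internal / EWMA_FACTOR);
}
```

      The sketch only handles unsigned samples, so a negative dBm signal
      value would have to be negated before being added.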
      Signed-off-by: Johannes Berg <johannes.berg@intel.com>
  10. 17 Jul, 2015 3 commits
  11. 10 Jun, 2015 1 commit
    • mac80211: convert HW flags to unsigned long bitmap · 30686bf7
      Johannes Berg authored
      As we're running out of hardware capability flags pretty quickly,
      convert them to use the regular test_bit() style unsigned long
      bitmaps.
      
      This introduces a number of helper functions/macros to set and to
      test the bits, along with new debugfs code.
      
      The occurrences of an explicit __clear_bit() are intentional, the
      drivers were never supposed to change their supported bits on the
      fly. We should investigate changing this to be a per-frame flag.
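
      A userspace sketch of the bitmap approach follows. The enum values
      and the helper names are hypothetical and are not the helpers this
      commit introduces. One flag is deliberately numbered past the width
      of a single unsigned long to show why an array is used.

```c
#include <limits.h>
#include <stdbool.h>

/* Hypothetical capability list; NUM_HW_FLAGS_SKETCH must stay last. */
enum hw_flag_sketch {
    HW_FLAG_A,
    HW_FLAG_B,
    HW_FLAG_C = 70,         /* beyond the bits of one unsigned long */
    NUM_HW_FLAGS_SKETCH
};

#define BITS_PER_LONG_SKETCH (sizeof(unsigned long) * CHAR_BIT)
#define FLAG_WORDS \
    ((NUM_HW_FLAGS_SKETCH + BITS_PER_LONG_SKETCH - 1) / BITS_PER_LONG_SKETCH)

struct hw_sketch {
    unsigned long flags[FLAG_WORDS];
};

/* Set one capability bit. */
void hw_set(struct hw_sketch *hw, enum hw_flag_sketch f)
{
    hw->flags[f / BITS_PER_LONG_SKETCH] |= 1UL << (f % BITS_PER_LONG_SKETCH);
}

/* Clear one capability bit. */
void hw_clear(struct hw_sketch *hw, enum hw_flag_sketch f)
{
    hw->flags[f / BITS_PER_LONG_SKETCH] &= ~(1UL << (f % BITS_PER_LONG_SKETCH));
}

/* Test one capability bit. */
bool hw_check(const struct hw_sketch *hw, enum hw_flag_sketch f)
{
    return (hw->flags[f / BITS_PER_LONG_SKETCH] >>
            (f % BITS_PER_LONG_SKETCH)) & 1UL;
}
```

      Adding a capability then only means adding an enum value; the array
      grows automatically once the count exceeds one word.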
      Signed-off-by: Johannes Berg <johannes.berg@intel.com>
  12. 24 Apr, 2015 1 commit
  13. 23 Apr, 2015 2 commits
  14. 22 Apr, 2015 1 commit
    • mac80211: add TX fastpath · 17c18bf8
      Johannes Berg authored
      In order to speed up mac80211's TX path, add the "fast-xmit" cache
      that will cache the data frame 802.11 header and other data to be
      able to build the frame more quickly. This cache is rebuilt when
      external triggers imply changes, but a lot of the checks done per
      packet today are simplified away to the check for the cache.
      
      There's also a more detailed description in the code.
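
      As a rough sketch of the caching idea, the header is built once and
      then only the per-packet fields are patched in. The struct layout,
      its field names and the function below are assumptions made for
      illustration and should not be read as the structure used in
      mac80211.

```c
#include <stdint.h>
#include <string.h>

#define ETH_ALEN_SKETCH 6
#define MAX_HDR_SKETCH  30

/* Hypothetical cache entry: a prebuilt 802.11 header plus the offsets of
 * the fields that still depend on the individual packet. */
struct fast_tx_sketch {
    uint8_t hdr[MAX_HDR_SKETCH];
    uint8_t hdr_len;
    uint8_t da_offs;
    uint8_t sa_offs;
};

/* Writes the 802.11 header for one packet into out[] and returns its
 * length. out must have room for MAX_HDR_SKETCH bytes. */
size_t fast_tx_build_hdr(const struct fast_tx_sketch *c,
                         const uint8_t da[ETH_ALEN_SKETCH],
                         const uint8_t sa[ETH_ALEN_SKETCH],
                         uint8_t *out)
{
    memcpy(out, c->hdr, c->hdr_len);
    memcpy(out + c->da_offs, da, ETH_ALEN_SKETCH);
    memcpy(out + c->sa_offs, sa, ETH_ALEN_SKETCH);
    return c->hdr_len;
}
```

      For a client sending to its AP, the cached template would hold the
      frame control field and the BSSID, with the source address at
      offset 10 and the destination address at offset 16.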
      Signed-off-by: Johannes Berg <johannes.berg@intel.com>
  15. 20 Apr, 2015 2 commits
    • mac80211: lock rate control · 35c347ac
      Johannes Berg authored
      Both minstrel (reported by Sven Eckelmann) and the iwlwifi rate
      control aren't properly taking concurrency into account. It's
      likely that the same is true for other rate control algorithms.
      
      In the case of minstrel this manifests itself in crashes when an
      update and other data access are run concurrently, for example
      when the stations change bandwidth or similar. In iwlwifi, this
      can cause firmware crashes.
      
      Since fixing all rate control algorithms will be very difficult,
      just provide locking for invocations. This protects the internal
      data structures the algorithms maintain.
      
      I've manipulated hostapd to test this, by having it change its
      advertised bandwidth roughly every 150ms. At the same time, I'm
      running a flood ping between the client and the AP, which causes
      this race of update vs. get_rate/status to easily happen on the
      client. With this change, the system survives this test.
      Reported-by: Sven Eckelmann <sven@open-mesh.com>
      Signed-off-by: Johannes Berg <johannes.berg@intel.com>
    • mac80211: introduce plink lock for plink fields · 48bf6bed
      Bob Copeland authored
      The mesh plink code uses sta->lock to serialize access to the
      plink state fields between the peer link state machine and the
      peer link timer.  Some paths (e.g. those involving
      mps_qos_null_tx()) unfortunately hold this spinlock across
      frame tx, which is soon to be disallowed.  Add a new spinlock
      just for plink access.
      Signed-off-by: Bob Copeland <me@bobcopeland.com>
      Signed-off-by: Johannes Berg <johannes.berg@intel.com>
  16. 01 Apr, 2015 2 commits
    • mac80211: add an intermediate software queue implementation · ba8c3d6f
      Felix Fietkau authored
      This allows drivers to request per-vif and per-sta-tid queues from which
      they can pull frames. This makes it easier to keep the hardware queues
      short, and to improve fairness between clients and vifs.
      
      The task of scheduling packet transmission is left up to the driver -
      queueing is controlled by mac80211. Drivers can only dequeue packets by
      calling ieee80211_tx_dequeue. This makes it possible to add active queue
      management later without changing drivers using this code.
      
      This can also be used as a starting point to implement A-MSDU
      aggregation in a way that does not add artificially induced latency.
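
      The division of labour described above (the stack owns the queues,
      the driver decides when to pull) can be modelled in userspace. All
      names here are hypothetical, frames are represented by non-negative
      integers, and the round-robin scheduler is just one possible policy
      a driver might choose.

```c
#include <stddef.h>

#define TXQ_DEPTH 8
#define NUM_TXQ   3   /* stand-in for the per-vif and per-sta-tid queues */

/* Hypothetical queue owned by the stack. */
struct txq_sketch {
    int    frames[TXQ_DEPTH];
    size_t head, len;
};

/* Stack side: returns 0 on success, -1 if the queue is full. */
int txq_enqueue(struct txq_sketch *q, int frame)
{
    if (q->len == TXQ_DEPTH)
        return -1;
    q->frames[(q->head + q->len) % TXQ_DEPTH] = frame;
    q->len++;
    return 0;
}

/* Driver side: the only way to take a frame. Returns -1 when empty. */
int txq_dequeue(struct txq_sketch *q)
{
    int frame;

    if (q->len == 0)
        return -1;
    frame = q->frames[q->head];
    q->head = (q->head + 1) % TXQ_DEPTH;
    q->len--;
    return frame;
}

/* Driver-side scheduler: one frame per queue per round, until the
 * hardware queue has no room left or all software queues are empty. */
size_t driver_fill_hw(struct txq_sketch qs[NUM_TXQ], int *hw, size_t hw_room)
{
    size_t n = 0;
    int progress = 1;

    while (n < hw_room && progress) {
        progress = 0;
        for (size_t i = 0; i < NUM_TXQ && n < hw_room; i++) {
            int f = txq_dequeue(&qs[i]);

            if (f >= 0) {
                hw[n++] = f;
                progress = 1;
            }
        }
    }
    return n;
}
```

      Because the driver pulls only as many frames as the hardware has
      room for, the hardware queue stays short and a busy queue cannot
      starve the others.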
      Signed-off-by: Felix Fietkau <nbd@openwrt.org>
      [resolved minor context conflict, minor changes, endian annotations]
      Signed-off-by: Johannes Berg <johannes.berg@intel.com>
    • mac80211: use rhashtable for station table · 7bedd0cf
      Johannes Berg authored
      We currently have a hand-rolled table with 256 entries and are
      using the last byte of the MAC address as the hash. This hash
      is obviously very fast, but collisions are easily created and
      we waste a lot of space in the common case of just connecting
      as a client to an AP where we just have a single station. The
      other common case of an AP is also suboptimal due to the size
      of the hash table and the ease of causing collisions.
      
      Convert all of this to use rhashtable with jhash, which gives
      us the advantage of a far better hash function (with random
      perturbation to avoid hash collision attacks) and of course
      that the hash table grows and shrinks dynamically with chain
      length, improving both cases above.
      
      Use a specialised hash function (using jhash, but with fixed
      length) to achieve better compiler optimisation as suggested
      by Sergey Ryazanov.
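
      The weakness of the last-byte hash is easy to show in isolation.
      In the sketch below, sta_hash_last_byte() models the old scheme as
      described above, while sta_hash_seeded() is a seeded FNV-1a used
      purely as a short stand-in. It is not jhash and not what rhashtable
      does internally, and both function names are hypothetical.

```c
#include <stdint.h>

/* Old scheme as described above: the last byte of the MAC address
 * selects one of 256 buckets. */
unsigned int sta_hash_last_byte(const uint8_t mac[6])
{
    return mac[5];
}

/* Seeded FNV-1a over all six bytes (stand-in for a real keyed hash). */
uint32_t sta_hash_seeded(const uint8_t mac[6], uint32_t seed)
{
    uint32_t h = 2166136261u ^ seed;

    for (int i = 0; i < 6; i++) {
        h ^= mac[i];
        h *= 16777619u;
    }
    return h;
}
```

      Any two addresses that share their last byte land in the same
      bucket under the old scheme, so collisions can be produced at
      will. Mixing in all bytes and a random seed makes the bucket
      depend on the whole address and on a value an outsider does not
      know.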
      Signed-off-by: Johannes Berg <johannes.berg@intel.com>
  17. 20 Mar, 2015 1 commit
  18. 28 Feb, 2015 1 commit
    • mac80211: remove TX latency measurement code · abfbc3af
      Johannes Berg authored
      Revert commit ad38bfc9 ("mac80211: Tx frame latency statistics")
      (along with some follow-up fixes).
      
      This code turned out not to be as useful in the current form as we
      thought, and we've internally hacked it up more, but that's not
      very suitable for upstream (for now), and we might just do that
      with tracing instead.
      
      Therefore, for now at least, remove this code. We might also need
      to use the skb->tstamp field for the TCP performance issue, which
      is more important than the debugging.
      Signed-off-by: Johannes Berg <johannes.berg@intel.com>
  19. 23 Jan, 2015 1 commit
    • mac80211: support beacon statistics · 225b8189
      Johannes Berg authored
      For drivers without beacon filtering, support beacon statistics
      entirely, i.e. report the number of beacons and average signal.
      
      For drivers with beacon filtering, give them the number of beacons
      received by mac80211 -- in case the device reports only the number
      of filtered beacons, the driver doesn't have to count all beacons
      again, as mac80211 already does.
      Signed-off-by: Johannes Berg <johannes.berg@intel.com>
  20. 09 Jan, 2015 1 commit
    • mac80211: fix handling TIM IE when stations disconnect · 9b7a86f3
      Johannes Berg authored
      When a station disconnects with frames still pending, we clear
      the TIM bit, but too late - it's only cleared when the station
      is already removed from the driver, and thus the driver can get
      confused (and hwsim will loudly complain.)
      
      Fix this by clearing the TIM bit earlier, when the station has
      been unlinked but not removed from the driver yet. To do this,
      refactor the TIM recalculation so that in that case it ignores
      traffic and simply assumes none is pending - this is correct for
      the disconnected station even though the frames haven't been freed
      yet at that point.
      
      This patch isn't needed for current drivers though as they don't
      check the station argument to the set_tim() operation and thus
      don't really run into the possible confusion.
      Reported-by: Jouni Malinen <j@w1.fi>
      Signed-off-by: Johannes Berg <johannes.berg@intel.com>