1. 09 Oct, 2015 1 commit
  2. 24 Sep, 2015 1 commit
  3. 17 Jan, 2015 1 commit
    • netlink: make nlmsg_end() and genlmsg_end() void · 053c095a
      Johannes Berg authored
      
      
      Contrary to common expectations for an "int" return, these functions
      return only a positive value -- if used correctly they cannot even
      return 0 because the message header will necessarily be in the skb.
      
      This makes the very common pattern of
      
        if (genlmsg_end(...) < 0) { ... }
      
      be a whole bunch of dead code. Many places also simply do
      
        return nlmsg_end(...);
      
      and the caller is expected to deal with it.
      
      This also commonly (at least for me) causes errors, because it is very
      common to write
      
        if (my_function(...))
          /* error condition */
      
      and if my_function() does "return nlmsg_end()" this is of course wrong.
      
      Additionally, there's not a single place in the kernel that actually
      needs the message length returned, and if anyone needs it later then
      it'll be very easy to just use skb->len there.
      
      Remove this, and make the functions void. This removes a bunch of dead
      code as described above. The patch adds lines because I did
      
      -	return nlmsg_end(...);
      +	nlmsg_end(...);
      +	return 0;
      
      I could have preserved all the function's return values by returning
      skb->len, but instead I've audited all the places calling the affected
      functions and found that none cared. A few places actually compared
      the return value with <= 0 in dump functionality, but that could just
      be changed to < 0 with no change in behaviour, so I opted for the more
      efficient version.
      
      One instance of the error I've made numerous times now is also present
      in net/phonet/pn_netlink.c in the route_dumpit() function - it didn't
      check for <0 or <=0 and thus broke out of the loop every single time.
      I've preserved this since it will (I think) have caused the messages to
      userspace to be formatted differently with just a single message for
      every SKB returned to userspace. It's possible that this isn't needed
      for the tools that actually use this, but I don't even know what they
      are so couldn't test that changing this behaviour would be acceptable.
      Signed-off-by: Johannes Berg <johannes.berg@intel.com>
      Signed-off-by: David S. Miller <davem@davemloft.net>
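The pitfall described in this commit message can be sketched in plain userspace C. This is a minimal illustration, not kernel code: `mock_skb`, `MOCK_HDRLEN`, and every function name below are hypothetical stand-ins. The only behaviour carried over from the message is that the old end function returned the positive message length.

```c
#include <stddef.h>

/* Hypothetical stand-in for a netlink message buffer (not the kernel's sk_buff). */
struct mock_skb {
	size_t len;	/* bytes used so far, header included */
};

#define MOCK_HDRLEN 16	/* assumed header size, illustrative only */

/* Old-style end function: returns the total length, which is always > 0. */
static int mock_nlmsg_end_old(struct mock_skb *skb)
{
	return (int)skb->len;
}

/* New-style end function: there is nothing useful to return. */
static void mock_nlmsg_end_new(struct mock_skb *skb)
{
	(void)skb;
}

/* Before the change: forwards the length, so "success" is a positive number. */
static int fill_info_old(struct mock_skb *skb)
{
	skb->len = MOCK_HDRLEN + 4;
	return mock_nlmsg_end_old(skb);
}

/* After the change: 0 on success, as most callers expect. */
static int fill_info_new(struct mock_skb *skb)
{
	skb->len = MOCK_HDRLEN + 4;
	mock_nlmsg_end_new(skb);
	return 0;
}

/* A caller written in the common "non-zero means error" style. */
static int caller_sees_error(int (*fill)(struct mock_skb *))
{
	struct mock_skb skb = { 0 };

	if (fill(&skb))
		return 1;	/* error path taken */
	return 0;
}
```

With these definitions, `caller_sees_error(fill_info_old)` takes the error path even though nothing failed, while `caller_sees_error(fill_info_new)` does not.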
  4. 16 Jan, 2015 2 commits
    • genetlink: synchronize socket closing and family removal · ee1c2442
      Johannes Berg authored
      
      
      In addition to the problem Jeff Layton reported, I looked at the code
      and reproduced the same warning by subscribing and removing the genl
      family with a socket still open. This is a fairly tricky race which
      originates in the fact that generic netlink allows the family to go
      away while sockets are still open - unlike regular netlink which has
      a module refcount for every open socket so in general this cannot be
      triggered.
      
      Trying to resolve this issue by the obvious locking isn't possible as
      it will result in deadlocks between unregistration and group unbind
      notification (which incidentally lockdep doesn't find due to the home
      grown locking in the netlink table.)
      
      To really resolve this, introduce a "closing socket" reference counter
      (for generic netlink only, as it's the only affected family) in the
      core netlink code and use that in generic netlink to wait for all the
      sockets that are being closed at the same time as a generic netlink
      family is removed.
      
      This fixes the race in which a closing socket should call the
      unbind, but if the family is removed at the same time the unbind
      will not find it, leading to the warning. The real problem though is
      that in this case the unbind could actually find a new family that is
      registered to have a multicast group with the same ID, and call its
      mcast_unbind(), leading to confusion.
      
      Also remove the warning since it would still trigger, but is now no
      longer a problem.
      
      This also moves the code in af_netlink.c to before unreferencing the
      module to avoid having the same problem in the normal non-genl case.
      Signed-off-by: Johannes Berg <johannes.berg@intel.com>
      Signed-off-by: David S. Miller <davem@davemloft.net>
    • genetlink: disallow subscribing to unknown mcast groups · 5ad63005
      Johannes Berg authored
      
      
      Jeff Layton reported that he could trigger the multicast unbind warning
      in generic netlink using trinity. I originally thought it was a race
      condition between unregistering the generic netlink family and closing
      the socket, but there's a far simpler explanation: genetlink currently
      allows subscribing to groups that don't (yet) exist, and the warning is
      triggered when unsubscribing again while the group still doesn't exist.
      
      Originally, I had a warning in the subscribe case and accepted it out of
      userspace API concerns, but the warning was of course wrong and removed
      later.
      
      However, I now think that allowing userspace to subscribe to groups that
      don't exist is wrong and could possibly become a security problem:
      Consider a (new) genetlink family implementing a permission check in
      the mcast_bind() function, similar to what the audit code does today;
      it would be possible to bypass the permission check by guessing the ID
      and subscribing to the group before it exists. This is only possible in case a
      family like that would be dynamically loaded, but it doesn't seem like a
      huge stretch, for example wireless may be loaded when you plug in a USB
      device.
      
      To avoid this, reject such subscription attempts.
      
      If this ends up causing userspace issues we may need to add a workaround
      in af_netlink to deny such requests but not return an error.
      Reported-by: Jeff Layton <jeff.layton@primarydata.com>
      Signed-off-by: Johannes Berg <johannes.berg@intel.com>
      Signed-off-by: David S. Miller <davem@davemloft.net>
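The idea of rejecting subscriptions to groups that do not exist yet can be sketched as follows. This is a simplified userspace model under stated assumptions: the fixed-size `group_registered` table, the function names, and the choice of `-ENOENT` as the error code are all hypothetical and are not taken from the kernel source.

```c
#include <errno.h>
#include <stdbool.h>

#define MAX_GROUPS 64	/* assumed table size for this sketch */

/* Hypothetical registry: one flag per multicast group ID. */
static bool group_registered[MAX_GROUPS];

/* Called when a family registers a multicast group. */
static void register_group(unsigned int id)
{
	if (id < MAX_GROUPS)
		group_registered[id] = true;
}

/* Reject subscriptions to IDs that no family has registered yet. */
static int bind_group(unsigned int id)
{
	if (id >= MAX_GROUPS || !group_registered[id])
		return -ENOENT;	/* error code chosen for the sketch */
	return 0;		/* a family's own permission check would run here */
}
```

Because the check happens at subscribe time, a client cannot reserve a guessed ID ahead of the family that will later own it, so any permission check the family performs in its bind hook cannot be skipped that way.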
  5. 29 Dec, 2014 1 commit
  6. 27 Dec, 2014 2 commits
    • netlink/genetlink: pass network namespace to bind/unbind · 023e2cfa
      Johannes Berg authored
      
      
      Netlink families can exist in multiple namespaces, and for the most
      part multicast subscriptions are per network namespace. Thus it only
      makes sense to have bind/unbind notifications per network namespace.
      
      To achieve this, pass the network namespace of a given client socket
      to the bind/unbind functions.
      
      Also do this in generic netlink, and there also make sure that any
      bind for multicast groups that only exist in init_net is rejected.
      This isn't really a problem if it is accepted since a client in a
      different namespace will never receive any notifications from such
      a group, but it can confuse the family if not rejected. It's also
      possible to accept it silently (without telling the family), but it
      would then also have to be ignored on unbind, so that families that
      take any kind of action on bind/unbind won't do unnecessary work for
      invalid clients like that.
      Signed-off-by: Johannes Berg <johannes.berg@intel.com>
      Signed-off-by: David S. Miller <davem@davemloft.net>
    • genetlink: pass multicast bind/unbind to families · c380d9a7
      Johannes Berg authored
      
      
      In order to make the newly fixed multicast bind/unbind
      functionality usable in generic netlink, pass the calls down to
      the appropriate family.
      Signed-off-by: Johannes Berg <johannes.berg@intel.com>
      Signed-off-by: David S. Miller <davem@davemloft.net>
  7. 02 Jun, 2014 1 commit
  8. 24 Apr, 2014 1 commit
  9. 06 Jan, 2014 1 commit
  10. 28 Nov, 2013 2 commits
    • genetlink/pmcraid: use proper genetlink multicast API · 5e53e689
      Johannes Berg authored
      
      
      The pmcraid driver is abusing the genetlink API and is using its
      family ID as the multicast group ID, which is invalid and may
      belong to somebody else (and likely will.)
      
      Make it use the correct API, but since this may already be used
      as-is by userspace, reserve a family ID for this code and also
      reserve that group ID to not break userspace assumptions.
      
      My previous patch broke event delivery in the driver as I missed
      that it wasn't using the right API and forgot to update it later
      in my series.
      
      While changing this, I noticed that the genetlink code could use
      the static group ID instead of a strcmp(), so also do that for
      the VFS_DQUOT family.
      
      Cc: Anil Ravindranath <anil_ravindranath@pmc-sierra.com>
      Cc: "James E.J. Bottomley" <JBottomley@parallels.com>
      Signed-off-by: Johannes Berg <johannes.berg@intel.com>
      Signed-off-by: David S. Miller <davem@davemloft.net>
    • genetlink: Fix uninitialized variable in genl_validate_assign_mc_groups() · 0f0e2159
      Geert Uytterhoeven authored
      net/netlink/genetlink.c: In function ‘genl_validate_assign_mc_groups’:
      net/netlink/genetlink.c:217: warning: ‘err’ may be used uninitialized in this
      function
      
      Commit 2a94fe48 ("genetlink: make multicast groups const, prevent
      abuse") split genl_register_mc_group() into multiple functions, but
      dropped the initialization of err.
      
      Initialize err to zero to fix this.
      Signed-off-by: Geert Uytterhoeven <geert@linux-m68k.org>
      Signed-off-by: David S. Miller <davem@davemloft.net>
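One common way such a warning arises is a status variable that is only assigned inside a loop. The sketch below shows that shape in isolation. It is not the kernel function: `mock_group`, `assign_one()`, and `validate_assign_groups()` are hypothetical names, and whether the real function had exactly this structure is an assumption.

```c
#include <stddef.h>

/* Hypothetical group descriptor for the sketch. */
struct mock_group {
	const char *name;
};

/* Pretend per-group validation: an empty or missing name is an error. */
static int assign_one(const struct mock_group *grp)
{
	return (grp->name && grp->name[0]) ? 0 : -1;
}

static int validate_assign_groups(const struct mock_group *grps, size_t n)
{
	int err = 0;	/* without this initializer, n == 0 would return garbage */
	size_t i;

	for (i = 0; i < n; i++) {
		err = assign_one(&grps[i]);
		if (err)
			break;
	}
	return err;
}
```

When the loop body never runs, the initializer is the only thing that gives `err` a defined value, which is why setting it to zero is enough to fix the warning.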
  11. 21 Nov, 2013 1 commit
    • genetlink: fix genlmsg_multicast() bug · 220815a9
      Johannes Berg authored
      
      
      Unfortunately, I introduced a tremendously stupid bug into
      genlmsg_multicast() when doing all those multicast group
      changes: it adjusts the group number, but then passes it
      to genlmsg_multicast_netns() which does that again.
      
      Somehow, my tests failed to catch this, so add a warning
      into genlmsg_multicast_netns() and remove the offending
      group ID adjustment.
      
      Also add a warning to the similar code in other functions
      so people who misuse them are more loudly warned.
      Signed-off-by: Johannes Berg <johannes.berg@intel.com>
      Signed-off-by: David S. Miller <davem@davemloft.net>
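The double adjustment described above can be modelled with a small sketch. Everything here is hypothetical: `mock_family`, its fields, and the `mock_multicast_*` helpers only mimic the shape of the problem, and the helpers return the resolved global group ID instead of sending anything.

```c
#include <errno.h>

/* Hypothetical family descriptor: groups are numbered 0..n_mcgrps-1 locally
 * and start at mcgrp_offset in the global group ID space. */
struct mock_family {
	unsigned int mcgrp_offset;
	unsigned int n_mcgrps;
};

/* Inner helper: translates a family-local group index to a global ID.
 * Returns the global ID, or -EINVAL if the local index is out of range. */
static int mock_multicast_netns(const struct mock_family *f, unsigned int group)
{
	if (group >= f->n_mcgrps)
		return -EINVAL;
	return (int)(f->mcgrp_offset + group);
}

/* Buggy wrapper: adjusts the group, then calls a helper that adjusts again. */
static int mock_multicast_buggy(const struct mock_family *f, unsigned int group)
{
	group = f->mcgrp_offset + group;	/* first adjustment */
	return mock_multicast_netns(f, group);	/* second adjustment happens inside */
}

/* Fixed wrapper: passes the local index through untouched. */
static int mock_multicast_fixed(const struct mock_family *f, unsigned int group)
{
	return mock_multicast_netns(f, group);
}
```

In this model the range check in the inner helper is what turns the double adjustment into a visible error. With an offset of 0 both wrappers agree, which is one way a test could miss the problem (an observation about the sketch, not something the commit message states).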
  12. 19 Nov, 2013 7 commits
  13. 18 Nov, 2013 1 commit
  14. 15 Nov, 2013 1 commit
  15. 14 Nov, 2013 3 commits
  16. 28 Aug, 2013 2 commits
  17. 22 Aug, 2013 1 commit
  18. 13 Aug, 2013 1 commit
    • genetlink: fix family dump race · 58ad436f
      Johannes Berg authored
      
      
      When dumping generic netlink families, only the first dump call
      is locked with genl_lock(), which protects the list of families,
      and thus subsequent calls can access the data without locking,
      racing against family addition/removal. This can cause a crash.
      Fix it - the locking needs to be conditional because the first
      time around it's already locked.
      
      A similar bug was reported to me on an old kernel (3.4.47), but
      the exact scenario that happened there is no longer possible:
      on those kernels the first round wasn't locked either. Looking
      at the current code I found the race described above, which had
      also existed on the old kernel.
      
      Cc: stable@vger.kernel.org
      Reported-by: Andrei Otcheretianski <andrei.otcheretianski@intel.com>
      Signed-off-by: Johannes Berg <johannes.berg@intel.com>
      Signed-off-by: David S. Miller <davem@davemloft.net>
  19. 30 Jul, 2013 1 commit
    • genetlink: fix usage of NLM_F_EXCL or NLM_F_REPLACE · e1ee3673
      Pablo Neira authored
      Currently, it is not possible to use either NLM_F_EXCL or
      NLM_F_REPLACE from genetlink. This is due to this check in
      genl_family_rcv_msg:
      
      	if (nlh->nlmsg_flags & NLM_F_DUMP)
      
      NLM_F_DUMP is NLM_F_MATCH|NLM_F_ROOT. Thus, if NLM_F_EXCL or
      NLM_F_REPLACE flag is set, genetlink believes that you're
      requesting a dump and it calls the .dumpit callback.
      
      The solution that I propose is to refine this check to
      make it stricter:
      
      	if ((nlh->nlmsg_flags & NLM_F_DUMP) == NLM_F_DUMP)
      
      And given the combination NLM_F_REPLACE and NLM_F_EXCL does
      not make sense to me, it removes the ambiguity.
      
      There was a patch that tried to fix this some time ago (0ab03c2b
      "netlink: test for all flags of the NLM_F_DUMP composite") but it
      tried to resolve this ambiguity in *all* existing netlink subsystems,
      not only genetlink. That patch was reverted since it broke iproute2,
      which is using NLM_F_ROOT to request the dump of the routing cache.
      Signed-off-by: Pablo Neira Ayuso <pablo@netfilter.org>
      Signed-off-by: David S. Miller <davem@davemloft.net>
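The flag overlap behind this fix can be checked with a short standalone sketch. The numeric values are copied from include/uapi/linux/netlink.h so the example builds without kernel headers; the two helper functions are hypothetical names used only to contrast the loose and strict tests.

```c
#include <stdbool.h>

/* Values as defined in include/uapi/linux/netlink.h. The GET-request flags
 * and the NEW-request flags deliberately share the same bits. */
#define NLM_F_ROOT	0x100
#define NLM_F_MATCH	0x200
#define NLM_F_DUMP	(NLM_F_ROOT | NLM_F_MATCH)
#define NLM_F_REPLACE	0x100
#define NLM_F_EXCL	0x200

/* Old check: any overlap with NLM_F_DUMP is treated as a dump request. */
static bool is_dump_loose(unsigned int flags)
{
	return (flags & NLM_F_DUMP) != 0;
}

/* Stricter check: both bits of the composite must be set. */
static bool is_dump_strict(unsigned int flags)
{
	return (flags & NLM_F_DUMP) == NLM_F_DUMP;
}
```

With the loose check, a request carrying only NLM_F_EXCL or only NLM_F_REPLACE is misread as a dump. With the strict check, only the combination of both bits is, which matches the message's remark that NLM_F_REPLACE together with NLM_F_EXCL is not a meaningful request.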
  20. 27 Jul, 2013 1 commit
    • genetlink: release cb_lock before requesting additional module · c74f2b26
      Stanislaw Gruszka authored
      Requesting an external module with cb_lock taken can result in
      a deadlock like the one shown below:
      
      [ 2458.111347] Showing all locks held in the system:
      [ 2458.111347] 1 lock held by NetworkManager/582:
      [ 2458.111347]  #0:  (cb_lock){++++++}, at: [<ffffffff8162bc79>] genl_rcv+0x19/0x40
      [ 2458.111347] 1 lock held by modprobe/603:
      [ 2458.111347]  #0:  (cb_lock){++++++}, at: [<ffffffff8162baa5>] genl_lock_all+0x15/0x30
      
      [ 2461.579457] SysRq : Show Blocked State
      [ 2461.580103]   task                        PC stack   pid father
      [ 2461.580103] NetworkManager  D ffff880034b84500  4040   582      1 0x00000080
      [ 2461.580103]  ffff8800197ff720 0000000000000046 00000000001d5340 ffff8800197fffd8
      [ 2461.580103]  ffff8800197fffd8 00000000001d5340 ffff880019631700 7fffffffffffffff
      [ 2461.580103]  ffff8800197ff880 ffff8800197ff878 ffff880019631700 ffff880019631700
      [ 2461.580103] Call Trace:
      [ 2461.580103]  [<ffffffff817355f9>] schedule+0x29/0x70
      [ 2461.580103]  [<ffffffff81731ad1>] schedule_timeout+0x1c1/0x360
      [ 2461.580103]  [<ffffffff810e69eb>] ? mark_held_locks+0xbb/0x140
      [ 2461.580103]  [<ffffffff817377ac>] ? _raw_spin_unlock_irq+0x2c/0x50
      [ 2461.580103]  [<ffffffff810e6b6d>] ? trace_hardirqs_on_caller+0xfd/0x1c0
      [ 2461.580103]  [<ffffffff81736398>] wait_for_completion_killable+0xe8/0x170
      [ 2461.580103]  [<ffffffff810b7fa0>] ? wake_up_state+0x20/0x20
      [ 2461.580103]  [<ffffffff81095825>] call_usermodehelper_exec+0x1a5/0x210
      [ 2461.580103]  [<ffffffff817362ed>] ? wait_for_completion_killable+0x3d/0x170
      [ 2461.580103]  [<ffffffff81095cc3>] __request_module+0x1b3/0x370
      [ 2461.580103]  [<ffffffff810e6b6d>] ? trace_hardirqs_on_caller+0xfd/0x1c0
      [ 2461.580103]  [<ffffffff8162c5c9>] ctrl_getfamily+0x159/0x190
      [ 2461.580103]  [<ffffffff8162d8a4>] genl_family_rcv_msg+0x1f4/0x2e0
      [ 2461.580103]  [<ffffffff8162d990>] ? genl_family_rcv_msg+0x2e0/0x2e0
      [ 2461.580103]  [<ffffffff8162da1e>] genl_rcv_msg+0x8e/0xd0
      [ 2461.580103]  [<ffffffff8162b729>] netlink_rcv_skb+0xa9/0xc0
      [ 2461.580103]  [<ffffffff8162bc88>] genl_rcv+0x28/0x40
      [ 2461.580103]  [<ffffffff8162ad6d>] netlink_unicast+0xdd/0x190
      [ 2461.580103]  [<ffffffff8162b149>] netlink_sendmsg+0x329/0x750
      [ 2461.580103]  [<ffffffff815db849>] sock_sendmsg+0x99/0xd0
      [ 2461.580103]  [<ffffffff810bb58f>] ? local_clock+0x5f/0x70
      [ 2461.580103]  [<ffffffff810e96e8>] ? lock_release_non_nested+0x308/0x350
      [ 2461.580103]  [<ffffffff815dbc6e>] ___sys_sendmsg+0x39e/0x3b0
      [ 2461.580103]  [<ffffffff810565af>] ? kvm_clock_read+0x2f/0x50
      [ 2461.580103]  [<ffffffff810218b9>] ? sched_clock+0x9/0x10
      [ 2461.580103]  [<ffffffff810bb2bd>] ? sched_clock_local+0x1d/0x80
      [ 2461.580103]  [<ffffffff810bb448>] ? sched_clock_cpu+0xa8/0x100
      [ 2461.580103]  [<ffffffff810e33ad>] ? trace_hardirqs_off+0xd/0x10
      [ 2461.580103]  [<ffffffff810bb58f>] ? local_clock+0x5f/0x70
      [ 2461.580103]  [<ffffffff810e3f7f>] ? lock_release_holdtime.part.28+0xf/0x1a0
      [ 2461.580103]  [<ffffffff8120fec9>] ? fget_light+0xf9/0x510
      [ 2461.580103]  [<ffffffff8120fe0c>] ? fget_light+0x3c/0x510
      [ 2461.580103]  [<ffffffff815dd1d2>] __sys_sendmsg+0x42/0x80
      [ 2461.580103]  [<ffffffff815dd222>] SyS_sendmsg+0x12/0x20
      [ 2461.580103]  [<ffffffff81741ad9>] system_call_fastpath+0x16/0x1b
      [ 2461.580103] modprobe        D ffff88000f2c8000  4632   603    602 0x00000080
      [ 2461.580103]  ffff88000f04fba8 0000000000000046 00000000001d5340 ffff88000f04ffd8
      [ 2461.580103]  ffff88000f04ffd8 00000000001d5340 ffff8800377d4500 ffff8800377d4500
      [ 2461.580103]  ffffffff81d0b260 ffffffff81d0b268 ffffffff00000000 ffffffff81d0b2b0
      [ 2461.580103] Call Trace:
      [ 2461.580103]  [<ffffffff817355f9>] schedule+0x29/0x70
      [ 2461.580103]  [<ffffffff81736d4d>] rwsem_down_write_failed+0xed/0x1a0
      [ 2461.580103]  [<ffffffff810bb200>] ? update_cpu_load_active+0x10/0xb0
      [ 2461.580103]  [<ffffffff8137b473>] call_rwsem_down_write_failed+0x13/0x20
      [ 2461.580103]  [<ffffffff8173492d>] ? down_write+0x9d/0xb2
      [ 2461.580103]  [<ffffffff8162baa5>] ? genl_lock_all+0x15/0x30
      [ 2461.580103]  [<ffffffff8162baa5>] genl_lock_all+0x15/0x30
      [ 2461.580103]  [<ffffffff8162cbb3>] genl_register_family+0x53/0x1f0
      [ 2461.580103]  [<ffffffffa01dc000>] ? 0xffffffffa01dbfff
      [ 2461.580103]  [<ffffffff8162d650>] genl_register_family_with_ops+0x20/0x80
      [ 2461.580103]  [<ffffffffa01dc000>] ? 0xffffffffa01dbfff
      [ 2461.580103]  [<ffffffffa017fe84>] nl80211_init+0x24/0xf0 [cfg80211]
      [ 2461.580103]  [<ffffffffa01dc000>] ? 0xffffffffa01dbfff
      [ 2461.580103]  [<ffffffffa01dc043>] cfg80211_init+0x43/0xdb [cfg80211]
      [ 2461.580103]  [<ffffffff810020fa>] do_one_initcall+0xfa/0x1b0
      [ 2461.580103]  [<ffffffff8105cb93>] ? set_memory_nx+0x43/0x50
      [ 2461.580103]  [<ffffffff810f75af>] load_module+0x1c6f/0x27f0
      [ 2461.580103]  [<ffffffff810f2c90>] ? store_uevent+0x40/0x40
      [ 2461.580103]  [<ffffffff810f82c6>] SyS_finit_module+0x86/0xb0
      [ 2461.580103]  [<ffffffff81741ad9>] system_call_fastpath+0x16/0x1b
      [ 2461.580103] Sched Debug Version: v0.10, 3.11.0-0.rc1.git4.1.fc20.x86_64 #1
      
      The problem started to happen after the net-pf-16-proto-16-family-nl80211
      alias name was added to the cfg80211 module by the commit below (though
      that commit itself is perfectly fine):
      
      commit fb4e1568
      Author: Marcel Holtmann <marcel@holtmann.org>
      Date:   Sun Apr 28 16:22:06 2013 -0700
      
          nl80211: Add generic netlink module alias for cfg80211/nl80211
      Reported-and-tested-by: Jeff Layton <jlayton@redhat.com>
      Reported-by: Richard W.M. Jones <rjones@redhat.com>
      Signed-off-by: Stanislaw Gruszka <sgruszka@redhat.com>
      Reviewed-by: Pravin B Shelar <pshelar@nicira.com>
      Signed-off-by: David S. Miller <davem@davemloft.net>
  21. 26 Apr, 2013 1 commit
  22. 24 Apr, 2013 1 commit
    • genl: Allow concurrent genl callbacks. · def31174
      Pravin B Shelar authored
      
      
      All genl callbacks are serialized by genl-mutex. This can become a
      bottleneck in the multi-threaded case.
      The following patch adds a parameter to genl_family so that a
      particular family can get concurrent netlink callbacks without
      genl_lock held.
      A new rw-sem is used to protect genl callbacks from genl family
      unregistration. In the parallel_ops case, the rw-sem read lock is taken
      for callbacks and the write lock is taken for registration or
      unregistration of any family.
      In the case of a locked genl family, both the semaphore and genl-mutex
      are taken for any operation.
      Signed-off-by: Pravin B Shelar <pshelar@nicira.com>
      Signed-off-by: David S. Miller <davem@davemloft.net>
  23. 20 Mar, 2013 1 commit
  24. 10 Sep, 2012 1 commit
  25. 08 Sep, 2012 2 commits
  26. 24 Jul, 2012 1 commit
  27. 11 Jul, 2012 1 commit