1. 23 Oct, 2015 1 commit
  2. 24 Aug, 2015 1 commit
    • net: Fix RCU splat in af_key · ba51b6be
      David Ahern authored
      Hit the following splat testing VRF change for ipsec:
      
      [  113.475692] ===============================
      [  113.476194] [ INFO: suspicious RCU usage. ]
      [  113.476667] 4.2.0-rc6-1+deb7u2+clUNRELEASED #3.2.65-1+deb7u2+clUNRELEASED Not tainted
      [  113.477545] -------------------------------
      [  113.478013] /work/monster-14/dsa/kernel.git/include/linux/rcupdate.h:568 Illegal context switch in RCU read-side critical section!
      [  113.479288]
      [  113.479288] other info that might help us debug this:
      [  113.479288]
      [  113.480207]
      [  113.480207] rcu_scheduler_active = 1, debug_locks = 1
      [  113.480931] 2 locks held by setkey/6829:
      [  113.481371]  #0:  (&net->xfrm.xfrm_cfg_mutex){+.+.+.}, at: [<ffffffff814e9887>] pfkey_sendmsg+0xfb/0x213
      [  113.482509]  #1:  (rcu_read_lock){......}, at: [<ffffffff814e767f>] rcu_read_lock+0x0/0x6e
      [  113.483509]
      [  113.483509] stack backtrace:
      [  113.484041] CPU: 0 PID: 6829 Comm: setkey Not tainted 4.2.0-rc6-1+deb7u2+clUNRELEASED #3.2.65-1+deb7u2+clUNRELEASED
      [  113.485422] Hardware name: QEMU Standard PC (i440FX + PIIX, 1996), BIOS rel-1.7.5.1-0-g8936dbb-20141113_115728-nilsson.home.kraxel.org 04/01/2014
      [  113.486845]  0000000000000001 ffff88001d4c7a98 ffffffff81518af2 ffffffff81086962
      [  113.487732]  ffff88001d538480 ffff88001d4c7ac8 ffffffff8107ae75 ffffffff8180a154
      [  113.488628]  0000000000000b30 0000000000000000 00000000000000d0 ffff88001d4c7ad8
      [  113.489525] Call Trace:
      [  113.489813]  [<ffffffff81518af2>] dump_stack+0x4c/0x65
      [  113.490389]  [<ffffffff81086962>] ? console_unlock+0x3d6/0x405
      [  113.491039]  [<ffffffff8107ae75>] lockdep_rcu_suspicious+0xfa/0x103
      [  113.491735]  [<ffffffff81064032>] rcu_preempt_sleep_check+0x45/0x47
      [  113.492442]  [<ffffffff8106404d>] ___might_sleep+0x19/0x1c8
      [  113.493077]  [<ffffffff81064268>] __might_sleep+0x6c/0x82
      [  113.493681]  [<ffffffff81133190>] cache_alloc_debugcheck_before.isra.50+0x1d/0x24
      [  113.494508]  [<ffffffff81134876>] kmem_cache_alloc+0x31/0x18f
      [  113.495149]  [<ffffffff814012b5>] skb_clone+0x64/0x80
      [  113.495712]  [<ffffffff814e6f71>] pfkey_broadcast_one+0x3d/0xff
      [  113.496380]  [<ffffffff814e7b84>] pfkey_broadcast+0xb5/0x11e
      [  113.497024]  [<ffffffff814e82d1>] pfkey_register+0x191/0x1b1
      [  113.497653]  [<ffffffff814e9770>] pfkey_process+0x162/0x17e
      [  113.498274]  [<ffffffff814e9895>] pfkey_sendmsg+0x109/0x213
      
      In pfkey_sendmsg the net mutex is taken and then pfkey_broadcast takes
      the RCU lock.
      
      Since pfkey_broadcast takes the RCU lock the allocation argument is
      pointless since GFP_ATOMIC must be used between the rcu_read_{,un}lock.
      The one call outside of rcu can be done with GFP_KERNEL.
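The reasoning above can be illustrated with a small userspace model. This is only a sketch: `model_rcu_read_lock`, `model_alloc`, `model_broadcast` and the `GFP_MODEL_*` values are hypothetical stand-ins, not kernel APIs. It shows why a caller-supplied allocation flag is pointless once the broadcast routine itself opens the read-side critical section.

```c
#include <assert.h>
#include <stdio.h>

/* Hypothetical stand-ins for the kernel's GFP_KERNEL / GFP_ATOMIC. */
enum gfp_model { GFP_MODEL_KERNEL, GFP_MODEL_ATOMIC };

/* Nesting depth of read-side critical sections (models rcu_read_lock). */
int rcu_depth;
/* Number of sleeping allocations attempted inside a read-side section. */
int splats;

void model_rcu_read_lock(void)   { rcu_depth++; }
void model_rcu_read_unlock(void) { rcu_depth--; }

/* A GFP_KERNEL-style request may sleep, which is illegal while a
 * read-side critical section is open; count each such violation. */
void model_alloc(enum gfp_model gfp)
{
	if (gfp == GFP_MODEL_KERNEL && rcu_depth > 0) {
		splats++;
		fprintf(stderr, "suspicious: sleeping allocation under RCU\n");
	}
}

/* Pre-fix shape: the caller picks the flag, but the allocation always
 * happens inside the read-side section. */
void model_broadcast_old(enum gfp_model gfp)
{
	model_rcu_read_lock();
	model_alloc(gfp);
	model_rcu_read_unlock();
}

/* Post-fix shape: no flag parameter; the allocation is always atomic. */
void model_broadcast(void)
{
	model_rcu_read_lock();
	model_alloc(GFP_MODEL_ATOMIC);
	model_rcu_read_unlock();
}
```

Calling `model_broadcast_old(GFP_MODEL_KERNEL)` increments `splats`, while `model_broadcast()` never does.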
      
      Fixes: 7f6b9dbd ("af_key: locking change")
      Signed-off-by: David Ahern <dsa@cumulusnetworks.com>
      Acked-by: Eric Dumazet <edumazet@google.com>
      Signed-off-by: David S. Miller <davem@davemloft.net>
  3. 27 May, 2015 1 commit
  4. 11 May, 2015 1 commit
  5. 31 Mar, 2015 1 commit
  6. 02 Mar, 2015 1 commit
  7. 24 Nov, 2014 1 commit
  8. 05 Nov, 2014 1 commit
    • net: Add and use skb_copy_datagram_msg() helper. · 51f3d02b
      David S. Miller authored
      This encapsulates all of the skb_copy_datagram_iovec() callers
      with call argument signature "skb, offset, msghdr->msg_iov, length".
      
      When we move to iov_iters in the networking, the iov_iter object will
      sit in the msghdr.
      
      Having a helper like this means there will be fewer places to touch
      during that transformation.
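A minimal userspace sketch of the wrapper idea follows. The `*_model` types and functions are simplified, hypothetical stand-ins for `struct sk_buff`, `struct msghdr` and the kernel copy routines. The point is only that callers pass the message header and the helper alone knows where the destination lives inside it.

```c
#include <assert.h>
#include <stddef.h>
#include <string.h>

/* Simplified, hypothetical stand-ins for the kernel types. */
struct iovec_model  { void *base; size_t len; };
struct msghdr_model { struct iovec_model *msg_iov; };
struct skb_model    { const unsigned char *data; size_t len; };

/* Models the older entry point: copy len bytes starting at offset
 * into a single destination buffer; returns -1 if out of range. */
int model_copy_datagram_iovec(const struct skb_model *skb, size_t offset,
			      struct iovec_model *to, size_t len)
{
	if (offset > skb->len || len > skb->len - offset || len > to->len)
		return -1;
	memcpy(to->base, skb->data + offset, len);
	return 0;
}

/* The helper: call sites hand over the msghdr, so a later change to
 * how the destination is stored touches only this function. */
int model_copy_datagram_msg(const struct skb_model *skb, size_t offset,
			    struct msghdr_model *msg, size_t len)
{
	return model_copy_datagram_iovec(skb, offset, msg->msg_iov, len);
}
```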
      
      Based upon descriptions and patch from Al Viro.
      Signed-off-by: David S. Miller <davem@davemloft.net>
  9. 15 Jul, 2014 1 commit
  10. 30 May, 2014 1 commit
  11. 23 Apr, 2014 1 commit
  12. 22 Apr, 2014 1 commit
    • xfrm: Remove useless secid field from xfrm_audit. · f1370cc4
      Tetsuo Handa authored
      It seems to me that commit ab5f5e8b "[XFRM]: xfrm audit calls" is doing
      something strange at xfrm_audit_helper_usrinfo().
      If secid != 0 && security_secid_to_secctx(secid) != 0, the caller calls
      audit_log_task_context(), which basically handles the
      secid != 0 && security_secid_to_secctx(secid) == 0 case,
      except that secid is obtained from the current thread's context.
      
      Oh, what happens if secid passed to xfrm_audit_helper_usrinfo() was
      obtained from other thread's context? It might audit current thread's
      context rather than other thread's context if security_secid_to_secctx()
      in xfrm_audit_helper_usrinfo() failed for some reason.
      
      Then, are all the callers of xfrm_audit_helper_usrinfo() passing either
      a secid obtained from the current thread's context or secid == 0?
      It seems to me that they are.
      
      If I didn't miss something, we don't need to pass secid to
      xfrm_audit_helper_usrinfo() because audit_log_task_context() will
      obtain secid from current thread's context.
      Signed-off-by: Tetsuo Handa <penguin-kernel@I-love.SAKURA.ne.jp>
      Signed-off-by: Steffen Klassert <steffen.klassert@secunet.com>
  13. 11 Apr, 2014 1 commit
    • net: Fix use after free by removing length arg from sk_data_ready callbacks. · 676d2369
      David S. Miller authored
      Several spots in the kernel perform a sequence like:
      
      	skb_queue_tail(&sk->sk_receive_queue, skb);
      	sk->sk_data_ready(sk, skb->len);
      
      But the moment we place the SKB onto the socket receive queue it
      can be consumed and freed.  So this skb->len access is potentially
      a read of freed memory.
      
      Furthermore, the skb->len can be modified by the consumer so it is
      possible that the value isn't accurate.
      
      And finally, no implementation of this callback actually uses
      the length argument.  And since nobody cared about its
      value, lots of call sites pass in arbitrary values such as '0' and
      even '1'.
      
      So just remove the length argument from the callback, that way there
      is no confusion whatsoever and all of these use-after-free cases get
      fixed as a side effect.
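As a rough illustration of the resulting callback shape, here is a userspace sketch. `struct sock_model` and both functions are hypothetical and not the kernel definitions. Once the buffer is queued, the notifier only needs the socket, so there is nothing left that could dereference a buffer the consumer may already have freed.

```c
#include <assert.h>
#include <stddef.h>

struct sock_model;
typedef void (*data_ready_fn)(struct sock_model *sk);

/* Hypothetical, heavily reduced socket: just enough to count events. */
struct sock_model {
	int queued;                  /* buffers placed on the receive queue */
	int wakeups;                 /* times the data-ready callback fired */
	data_ready_fn sk_data_ready; /* callback takes the socket only */
};

/* Default callback: wakes the reader; it needs no length. */
void model_def_readable(struct sock_model *sk)
{
	sk->wakeups++;
}

/* Enqueue, then notify. After the enqueue the buffer belongs to the
 * consumer, so this function does not touch it again. */
void model_queue_and_notify(struct sock_model *sk)
{
	sk->queued++;
	sk->sk_data_ready(sk);
}
```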
      
      Based upon a patch by Eric Dumazet and his suggestion to audit this
      issue tree-wide.
      Signed-off-by: David S. Miller <davem@davemloft.net>
  14. 10 Mar, 2014 2 commits
    • selinux: add gfp argument to security_xfrm_policy_alloc and fix callers · 52a4c640
      Nikolay Aleksandrov authored
      security_xfrm_policy_alloc can be called in atomic context so the
      allocation should be done with GFP_ATOMIC. Add an argument to let the
      callers choose the appropriate way. In order to do so a gfp argument
      needs to be added to the method xfrm_policy_alloc_security in struct
      security_operations and to the internal function
      selinux_xfrm_alloc_user. After that switch to GFP_ATOMIC in the atomic
      callers and leave GFP_KERNEL as before for the rest.
      The path that needed the gfp argument addition is:
      security_xfrm_policy_alloc -> security_ops.xfrm_policy_alloc_security ->
      all users of xfrm_policy_alloc_security (e.g. selinux_xfrm_policy_alloc) ->
      selinux_xfrm_alloc_user (here the allocation used to be GFP_KERNEL only)
      
      Now adding a gfp argument to selinux_xfrm_alloc_user requires us to also
      add it to security_context_to_sid which is used inside and prior to this
      patch did only GFP_KERNEL allocation. So add gfp argument to
      security_context_to_sid and adjust all of its callers as well.
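The call path described above can be sketched as a chain of functions that each forward the allocation flag. All names here are hypothetical models of the layers, not the real signatures. Only the innermost function allocates, so every layer above it has to carry the argument.

```c
#include <assert.h>
#include <stdbool.h>

/* Hypothetical stand-ins for GFP_KERNEL / GFP_ATOMIC. */
enum gfp_model { GFP_MODEL_KERNEL, GFP_MODEL_ATOMIC };

bool in_atomic_model; /* true while the caller is not allowed to sleep */
int  bad_allocs;      /* sleeping allocations requested in atomic context */

/* Innermost step (models selinux_xfrm_alloc_user): the only allocator. */
int model_alloc_user(enum gfp_model gfp)
{
	if (in_atomic_model && gfp != GFP_MODEL_ATOMIC)
		bad_allocs++;
	return 0;
}

/* Middle layer (models the xfrm_policy_alloc_security hook). */
int model_policy_alloc_hook(enum gfp_model gfp)
{
	return model_alloc_user(gfp);
}

/* Outer entry point (models security_xfrm_policy_alloc). */
int model_policy_alloc(enum gfp_model gfp)
{
	return model_policy_alloc_hook(gfp);
}
```

Atomic callers pass `GFP_MODEL_ATOMIC`; everyone else keeps `GFP_MODEL_KERNEL`, which matches the split the message describes.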
      
      CC: Paul Moore <paul@paul-moore.com>
      CC: Dave Jones <davej@redhat.com>
      CC: Steffen Klassert <steffen.klassert@secunet.com>
      CC: Fan Du <fan.du@windriver.com>
      CC: David S. Miller <davem@davemloft.net>
      CC: LSM list <linux-security-module@vger.kernel.org>
      CC: SELinux list <selinux@tycho.nsa.gov>
      Signed-off-by: Nikolay Aleksandrov <nikolay@redhat.com>
      Acked-by: Paul Moore <paul@paul-moore.com>
      Signed-off-by: Steffen Klassert <steffen.klassert@secunet.com>
    • net: af_key: fix sleeping under rcu · 87536a81
      Nikolay Aleksandrov authored
      There's a kmalloc with GFP_KERNEL in a helper
      (pfkey_sadb2xfrm_user_sec_ctx) used in pfkey_compile_policy which is
      called under rcu_read_lock. Adjust pfkey_sadb2xfrm_user_sec_ctx to have
      a gfp argument and adjust the users.
      
      CC: Dave Jones <davej@redhat.com>
      CC: Steffen Klassert <steffen.klassert@secunet.com>
      CC: Fan Du <fan.du@windriver.com>
      CC: David S. Miller <davem@davemloft.net>
      Signed-off-by: Nikolay Aleksandrov <nikolay@redhat.com>
      Signed-off-by: Steffen Klassert <steffen.klassert@secunet.com>
  15. 07 Mar, 2014 1 commit
  16. 20 Feb, 2014 1 commit
  17. 16 Feb, 2014 1 commit
    • ipsec: add support of limited SA dump · d3623099
      Nicolas Dichtel authored
      The goal of this patch is to allow userland to dump only a subset of the
      SAs by specifying a filter during the dump.
      The kernel does the filtering, which avoids generating useless netlink
      traffic (and also saves some CPU cycles). This is particularly useful when
      a large number of SAs are configured on the system.
      
      Note that I removed the union in struct xfrm_state_walk to fix a problem on arm.
      struct netlink_callback->args is defined as an array of 6 longs, and the first
      long is used in the xfrm code to flag the cb as initialized. Hence, we must have:
      sizeof(struct xfrm_state_walk) <= sizeof(long) * 5.
      With the union, this was false on arm (sizeof(struct xfrm_state_walk) was
      sizeof(long) * 7) due to padding.
      In fact, whatever the arch, this union seems useless: there will always be
      padding after it. Removing it does not increase the size of this struct (and
      reduces it on arm).
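The size constraint can be expressed as a compile-time check. The sketch below uses a hypothetical `struct walk_model` loosely shaped like a walker state (list linkage, a few byte-sized fields, a sequence number, a filter pointer); it is not the real `struct xfrm_state_walk`. It also assumes an ABI where pointers are no wider than `long`, as on Linux.

```c
#include <assert.h>
#include <stddef.h>

/* Models netlink_callback->args: six longs, the first one reserved
 * as the "initialized" flag, leaving five for the walker state. */
#define MODEL_CB_ARGS 6

/* Hypothetical walker layout, for illustration only. */
struct list_head_model { void *next, *prev; };
struct walk_model {
	struct list_head_model all;
	unsigned char state;
	unsigned char dying;
	unsigned char proto;
	unsigned int  seq;
	void         *filter;
};

/* Fails the build if the state no longer fits in args[1..5]. */
_Static_assert(sizeof(struct walk_model) <= sizeof(long) * (MODEL_CB_ARGS - 1),
	       "walker state must fit in the remaining callback slots");

/* Number of long-sized slots the walker state occupies, rounded up. */
size_t model_walk_slots(void)
{
	return (sizeof(struct walk_model) + sizeof(long) - 1) / sizeof(long);
}
```

A check of this kind turns a silent overflow of the callback argument area into a build failure on the affected architecture.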
      Signed-off-by: Nicolas Dichtel <nicolas.dichtel@6wind.com>
      Signed-off-by: Steffen Klassert <steffen.klassert@secunet.com>
  18. 12 Feb, 2014 1 commit
  19. 16 Dec, 2013 1 commit
  20. 05 Dec, 2013 3 commits
  21. 20 Nov, 2013 1 commit
    • net: rework recvmsg handler msg_name and msg_namelen logic · f3d33426
      Hannes Frederic Sowa authored
      This patch now always passes msg->msg_namelen as 0. recvmsg handlers must
      set msg_namelen to the proper size <= sizeof(struct sockaddr_storage)
      to return msg_name to the user.
      
      This prevents numerous uninitialized memory leaks we had in the
      recvmsg handlers and makes it harder for new code to accidentally leak
      uninitialized memory.
      
      Optimize for the case recvfrom is called with NULL as address. We don't
      need to copy the address at all, so set it to NULL before invoking the
      recvmsg handler. We can do so, because all the recvmsg handlers must
      cope with the case a plain read() is called on them. read() also sets
      msg_name to NULL.
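A small userspace sketch of this contract follows. `struct msghdr_model`, `MODEL_ADDR_MAX` and both functions are hypothetical stand-ins. The core starts every receive with a length of 0, and only a handler that actually fills in an address raises it, so a handler that forgets cannot expose stale bytes.

```c
#include <assert.h>
#include <stddef.h>
#include <string.h>

/* Stand-in for sizeof(struct sockaddr_storage). */
#define MODEL_ADDR_MAX 128

struct msghdr_model {
	void  *msg_name;    /* NULL when the caller does not want the address */
	size_t msg_namelen; /* bytes of msg_name the handler filled in */
};

/* Handler side: fill in the address only when asked, and report
 * exactly how many bytes were written. */
void model_handler(struct msghdr_model *msg, const void *addr, size_t addrlen)
{
	if (msg->msg_name && addrlen <= MODEL_ADDR_MAX) {
		memcpy(msg->msg_name, addr, addrlen);
		msg->msg_namelen = addrlen;
	}
}

/* Core side: the length always starts at 0 before the handler runs. */
size_t model_recvmsg(void *user_addr, const void *addr, size_t addrlen)
{
	struct msghdr_model msg = { user_addr, 0 };

	model_handler(&msg, addr, addrlen);
	return msg.msg_namelen;
}
```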
      
      Also document these changes in include/linux/net.h as suggested by David
      Miller.
      
      Changes since RFC:
      
      Set msg->msg_name = NULL if user specified a NULL in msg_name but had a
      non-null msg_namelen in verify_iovec/verify_compat_iovec. This doesn't
      affect sendto as it would bail out earlier while trying to copy-in the
      address. It also more naturally reflects the logic by the callers of
      verify_iovec.
      
      With this change in place I could remove "
      if (!uaddr || msg_sys->msg_namelen == 0)
      	msg->msg_name = NULL
      ".
      
      This change does not alter the user visible error logic as we ignore
      msg_namelen as long as msg_name is NULL.
      
      Also remove two unnecessary curly brackets in ___sys_recvmsg and change
      comments to netdev style.
      
      Cc: David Miller <davem@davemloft.net>
      Suggested-by: Eric Dumazet <eric.dumazet@gmail.com>
      Signed-off-by: Hannes Frederic Sowa <hannes@stressinduktion.org>
      Signed-off-by: David S. Miller <davem@davemloft.net>
  22. 17 Sep, 2013 1 commit
    • xfrm: Guard IPsec anti replay window against replay bitmap · 33fce60d
      Fan Du authored
      For the legacy IPsec anti-replay mechanism:
      
      The bitmap in struct xfrm_replay_state can only track a 32-bit
      window in the current design, so the user-level parameter
      sadb_sa_replay should honor this limit. Otherwise setkey -D
      reports a misleading value ("replay=244"):
      
      192.168.25.2 192.168.22.2
      	esp mode=transport spi=147561170(0x08cb9ad2) reqid=0(0x00000000)
      	E: aes-cbc  9a8d7468 7655cf0b 719d27be b0ddaac2
      	A: hmac-sha1  2d2115c2 ebf7c126 1c54f186 3b139b58 264a7331
      	seq=0x00000000 replay=244 flags=0x00000000 state=mature
      	created: Sep 17 14:00:00 2013	current: Sep 17 14:00:22 2013
      	diff: 22(s)	hard: 30(s)	soft: 26(s)
      	last: Sep 17 14:00:00 2013	hard: 0(s)	soft: 0(s)
      	current: 1408(bytes)	hard: 0(bytes)	soft: 0(bytes)
      	allocated: 22	hard: 0	soft: 0
      	sadb_seq=1 pid=4854 refcnt=0
      192.168.22.2 192.168.25.2
      	esp mode=transport spi=255302123(0x0f3799eb) reqid=0(0x00000000)
      	E: aes-cbc  6485d990 f61a6bd5 e5660252 608ad282
      	A: hmac-sha1  0cca811a eb4fa893 c47ae56c 98f6e413 87379a88
      	seq=0x00000000 replay=244 flags=0x00000000 state=mature
      	created: Sep 17 14:00:00 2013	current: Sep 17 14:00:22 2013
      	diff: 22(s)	hard: 30(s)	soft: 26(s)
      	last: Sep 17 14:00:00 2013	hard: 0(s)	soft: 0(s)
      	current: 1408(bytes)	hard: 0(bytes)	soft: 0(bytes)
      	allocated: 22	hard: 0	soft: 0
      	sadb_seq=0 pid=4854 refcnt=0
      
      Also optimize the window check in xfrm_replay_check by setting the
      desired x->props.replay_window once, when the xfrm_state is first
      created, so the comparison is not repeated on every check.
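The clamping step can be sketched as below. `struct replay_state_model` and `model_clamp_replay_window` are hypothetical names, and the 32-bit bitmap width is taken from the description above. The requested window is limited once to the number of bits the bitmap can track.

```c
#include <assert.h>
#include <limits.h>
#include <stdint.h>

/* Hypothetical reduced replay state: a 32-bit bitmap of seen packets. */
struct replay_state_model { uint32_t bitmap; };

/* Clamp the user-requested window to the bitmap width. Done once at
 * state creation, so later checks can trust the stored value. */
unsigned int model_clamp_replay_window(unsigned int requested)
{
	unsigned int max_bits =
		sizeof(((struct replay_state_model *)0)->bitmap) * CHAR_BIT;

	return requested < max_bits ? requested : max_bits;
}
```

With this in place, a request for a window of 244 is stored as 32, while smaller requests are kept unchanged.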
      Signed-off-by: Fan Du <fan.du@windriver.com>
      Signed-off-by: Steffen Klassert <steffen.klassert@secunet.com>
  23. 07 Aug, 2013 1 commit
  24. 05 Aug, 2013 2 commits
  25. 30 Jul, 2013 1 commit
  26. 26 Jun, 2013 1 commit
  27. 31 May, 2013 1 commit
  28. 07 Mar, 2013 1 commit
  29. 27 Feb, 2013 1 commit
    • hlist: drop the node parameter from iterators · b67bfe0d
      Sasha Levin authored
      I'm not sure why, but the hlist for each entry iterators were conceived
      differently from the list ones. The list ones take three parameters:
      
              list_for_each_entry(pos, head, member)
      
      The hlist ones were greedy and wanted an extra parameter:
      
              hlist_for_each_entry(tpos, pos, head, member)
      
      Why did they need an extra pos parameter? I'm not quite sure. Not only
      do they not really need it, it also prevents the iterator from looking
      exactly like the list iterator, which is unfortunate.
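To make the three-parameter form concrete, here is a userspace sketch of a singly linked hash-list iterator that needs no separate node cursor. The `model_*` macros and the `hnode`/`hhead`/`item` types are hypothetical and simplified; they are not the definitions in linux/list.h. The sketch relies on the `__typeof__` extension available in GCC and Clang, and its macros evaluate their pointer argument more than once.

```c
#include <assert.h>
#include <stddef.h>

struct hnode { struct hnode *next; };
struct hhead { struct hnode *first; };

/* Recover the enclosing object from a pointer to its embedded node. */
#define model_container_of(ptr, type, member) \
	((type *)((char *)(ptr) - offsetof(type, member)))

/* Enclosing entry for a node pointer, or NULL at the end of the chain. */
#define model_entry_safe(ptr, type, member) \
	((ptr) ? model_container_of(ptr, type, member) : NULL)

/* Three-argument iterator: the entry pointer is the only cursor. */
#define model_hlist_for_each_entry(pos, head, member)                          \
	for (pos = model_entry_safe((head)->first, __typeof__(*(pos)), member); \
	     pos;                                                               \
	     pos = model_entry_safe((pos)->member.next, __typeof__(*(pos)), member))

struct item { int value; struct hnode link; };

/* Example use: sum the values of all entries on a chain. */
int model_sum(struct hhead *head)
{
	struct item *pos;
	int sum = 0;

	model_hlist_for_each_entry(pos, head, link)
		sum += pos->value;
	return sum;
}
```

The next node is reached through `pos->member.next`, which is why the extra node argument carries no information the entry pointer does not already have.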
      
      Besides the semantic patch, there was some manual work required:
      
       - Fix up the actual hlist iterators in linux/list.h
       - Fix up the declaration of other iterators based on the hlist ones.
       - A very small number of places were using the 'node' parameter; these
       were modified to use 'obj->member' instead.
       - Coccinelle didn't handle the hlist_for_each_entry_safe iterator
       properly, so those had to be fixed up manually.
      
      The semantic patch which is mostly the work of Peter Senna Tschudin is here:
      
      @@
      iterator name hlist_for_each_entry, hlist_for_each_entry_continue, hlist_for_each_entry_from, hlist_for_each_entry_rcu, hlist_for_each_entry_rcu_bh, hlist_for_each_entry_continue_rcu_bh, for_each_busy_worker, ax25_uid_for_each, ax25_for_each, inet_bind_bucket_for_each, sctp_for_each_hentry, sk_for_each, sk_for_each_rcu, sk_for_each_from, sk_for_each_safe, sk_for_each_bound, hlist_for_each_entry_safe, hlist_for_each_entry_continue_rcu, nr_neigh_for_each, nr_neigh_for_each_safe, nr_node_for_each, nr_node_for_each_safe, for_each_gfn_indirect_valid_sp, for_each_gfn_sp, for_each_host;
      
      type T;
      expression a,c,d,e;
      identifier b;
      statement S;
      @@
      
      -T b;
          <+... when != b
      (
      hlist_for_each_entry(a,
      - b,
      c, d) S
      |
      hlist_for_each_entry_continue(a,
      - b,
      c) S
      |
      hlist_for_each_entry_from(a,
      - b,
      c) S
      |
      hlist_for_each_entry_rcu(a,
      - b,
      c, d) S
      |
      hlist_for_each_entry_rcu_bh(a,
      - b,
      c, d) S
      |
      hlist_for_each_entry_continue_rcu_bh(a,
      - b,
      c) S
      |
      for_each_busy_worker(a, c,
      - b,
      d) S
      |
      ax25_uid_for_each(a,
      - b,
      c) S
      |
      ax25_for_each(a,
      - b,
      c) S
      |
      inet_bind_bucket_for_each(a,
      - b,
      c) S
      |
      sctp_for_each_hentry(a,
      - b,
      c) S
      |
      sk_for_each(a,
      - b,
      c) S
      |
      sk_for_each_rcu(a,
      - b,
      c) S
      |
      sk_for_each_from
      -(a, b)
      +(a)
      S
      + sk_for_each_from(a) S
      |
      sk_for_each_safe(a,
      - b,
      c, d) S
      |
      sk_for_each_bound(a,
      - b,
      c) S
      |
      hlist_for_each_entry_safe(a,
      - b,
      c, d, e) S
      |
      hlist_for_each_entry_continue_rcu(a,
      - b,
      c) S
      |
      nr_neigh_for_each(a,
      - b,
      c) S
      |
      nr_neigh_for_each_safe(a,
      - b,
      c, d) S
      |
      nr_node_for_each(a,
      - b,
      c) S
      |
      nr_node_for_each_safe(a,
      - b,
      c, d) S
      |
      - for_each_gfn_sp(a, c, d, b) S
      + for_each_gfn_sp(a, c, d) S
      |
      - for_each_gfn_indirect_valid_sp(a, c, d, b) S
      + for_each_gfn_indirect_valid_sp(a, c, d) S
      |
      for_each_host(a,
      - b,
      c) S
      |
      for_each_host_safe(a,
      - b,
      c, d) S
      |
      for_each_mesh_entry(a,
      - b,
      c, d) S
      )
          ...+>
      
      [akpm@linux-foundation.org: drop bogus change from net/ipv4/raw.c]
      [akpm@linux-foundation.org: drop bogus hunk from net/ipv6/raw.c]
      [akpm@linux-foundation.org: checkpatch fixes]
      [akpm@linux-foundation.org: fix warnings]
      [akpm@linux-foundation.org: redo intrusive kvm changes]
      Tested-by: Peter Senna Tschudin <peter.senna@gmail.com>
      Acked-by: Paul E. McKenney <paulmck@linux.vnet.ibm.com>
      Signed-off-by: Sasha Levin <sasha.levin@oracle.com>
      Cc: Wu Fengguang <fengguang.wu@intel.com>
      Cc: Marcelo Tosatti <mtosatti@redhat.com>
      Cc: Gleb Natapov <gleb@redhat.com>
      Signed-off-by: Andrew Morton <akpm@linux-foundation.org>
      Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>
  30. 20 Feb, 2013 1 commit
  31. 18 Feb, 2013 2 commits
  32. 01 Feb, 2013 1 commit
  33. 29 Jan, 2013 1 commit
  34. 28 Jan, 2013 1 commit
  35. 18 Nov, 2012 1 commit
    • net: Allow userns root to control llc, netfilter, netlink, packet, and xfrm · df008c91
      Eric W. Biederman authored
      Allow an unprivileged user who has created a user namespace, and then
      created a network namespace, to effectively use the new network
      namespace, by reducing capable(CAP_NET_ADMIN) and
      capable(CAP_NET_RAW) calls to ns_capable(net->user_ns,
      CAP_NET_ADMIN) or ns_capable(net->user_ns, CAP_NET_RAW) calls.
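The difference between a global check and a namespace-relative one can be sketched with a simplified model. Everything here (`struct userns_model`, `struct task_model`, `model_ns_capable`, `MODEL_CAP_NET_ADMIN`) is hypothetical and leaves out most of what the real capability code considers. The idea it illustrates is that a capability is evaluated against the user namespace that owns the network namespace, not against the initial one.

```c
#include <assert.h>
#include <stdbool.h>
#include <stddef.h>

#define MODEL_CAP_NET_ADMIN (1u << 0)

/* Hypothetical user namespace: a parent link plus the creator's uid. */
struct userns_model {
	const struct userns_model *parent; /* NULL for the initial namespace */
	int owner_uid;
};

/* Hypothetical task: the namespace it lives in, its uid, its caps there. */
struct task_model {
	const struct userns_model *ns;
	int uid;
	unsigned int caps;
};

/* Does the task hold cap with respect to the target namespace?
 * Walk from target towards the root: caps held in a namespace apply to
 * it and its descendants, and the creator of a namespace is treated as
 * fully capable inside it. */
bool model_ns_capable(const struct task_model *t,
		      const struct userns_model *target, unsigned int cap)
{
	const struct userns_model *ns;

	for (ns = target; ns; ns = ns->parent) {
		if (ns == t->ns)
			return (t->caps & cap) != 0;
		if (ns->parent == t->ns && ns->owner_uid == t->uid)
			return true;
	}
	return false;
}
```

In this model, a task that is root inside a child namespace passes the check against that child but still fails it against the initial namespace, which is the behavior the global check enforced for everyone.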
      
      Allow creation of af_key sockets.
      Allow creation of llc sockets.
      Allow creation of af_packet sockets.
      
      Allow sending xfrm netlink control messages.
      
      Allow binding to netlink multicast groups.
      Allow sending to netlink multicast groups.
      Allow adding and dropping netlink multicast groups.
      Allow sending to all netlink multicast groups and port ids.
      
      Allow reading the netfilter SO_IP_SET socket option.
      Allow sending netfilter netlink messages.
      Allow setting and getting ip_vs netfilter socket options.
      Signed-off-by: "Eric W. Biederman" <ebiederm@xmission.com>
      Signed-off-by: David S. Miller <davem@davemloft.net>