1. 01 Jul, 2016 2 commits
  2. 30 Jun, 2016 1 commit
  3. 29 Jun, 2016 1 commit
    • bpf, perf: delay release of BPF prog after grace period · ceb56070
      Daniel Borkmann authored
      Commit dead9f29 ("perf: Fix race in BPF program unregister") moved
      destruction of BPF program from free_event_rcu() callback to __free_event(),
      which is problematic if used with tail calls: if prog A is attached as
      trace event directly, but at the same time present in a tail call map used
      by another trace event program elsewhere, then we need to delay destruction
      via RCU grace period since it can still be in use by the program doing the
      tail call (the prog first needs to be dropped from the tail call map, then
      trace event with prog A attached destroyed, so we get immediate destruction).
      
      Fixes: dead9f29 ("perf: Fix race in BPF program unregister")
      Signed-off-by: Daniel Borkmann <daniel@iogearbox.net>
      Acked-by: Alexei Starovoitov <ast@kernel.org>
      Cc: Jann Horn <jann@thejh.net>
      Signed-off-by: David S. Miller <davem@davemloft.net>
  4. 28 Jun, 2016 2 commits
  5. 24 Jun, 2016 5 commits
    • Revert "mm: make faultaround produce old ptes" · 315d09bf
      Kirill A. Shutemov authored
      This reverts commit 5c0a85fa.
      
      The commit causes ~6% regression in unixbench.
      
      Let's revert it for now and consider another solution for the reclaim
      problem later.
      
      Link: http://lkml.kernel.org/r/1465893750-44080-2-git-send-email-kirill.shutemov@linux.intel.com
      Signed-off-by: Kirill A. Shutemov <kirill.shutemov@linux.intel.com>
      Reported-by: "Huang, Ying" <ying.huang@intel.com>
      Cc: Linus Torvalds <torvalds@linux-foundation.org>
      Cc: Rik van Riel <riel@redhat.com>
      Cc: Mel Gorman <mgorman@suse.de>
      Cc: Michal Hocko <mhocko@suse.com>
      Cc: Minchan Kim <minchan@kernel.org>
      Cc: Vinayak Menon <vinmenon@codeaurora.org>
      Cc: Dave Hansen <dave.hansen@linux.intel.com>
      Signed-off-by: Andrew Morton <akpm@linux-foundation.org>
      Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>
    • mm: mempool: kasan: don't put mempool objects in quarantine · 9b75a867
      Andrey Ryabinin authored
      Currently we may put elements reserved by mempool into quarantine via
      kasan_kfree().  This is totally wrong since quarantine may really free
      these objects.  So when mempool tries to use such an element, a
      use-after-free will happen.  Or mempool may decide that it no longer
      needs that element and double-free it.
      
      So don't put object into quarantine in kasan_kfree(), just poison it.
      Rename kasan_kfree() to kasan_poison_kfree() to respect that.
      
      Also, we shouldn't use kasan_slab_alloc()/kasan_krealloc() in
      kasan_unpoison_element() because those functions may update allocation
      stacktrace.  This would be wrong for most of the remove_element() call
      sites.
      
      (The only call site where we may want to update alloc stacktrace is
       in mempool_alloc(). Kmemleak solves this by calling
       kmemleak_update_trace(), so we could make something like that too.
       But this is out of scope of this patch).
      
      Fixes: 55834c59 ("mm: kasan: initial memory quarantine implementation")
      Link: http://lkml.kernel.org/r/575977C3.1010905@virtuozzo.com
      Signed-off-by: Andrey Ryabinin <aryabinin@virtuozzo.com>
      Reported-by: Kuthonuzo Luruo <kuthonuzo.luruo@hpe.com>
      Acked-by: Alexander Potapenko <glider@google.com>
      Cc: Dmitriy Vyukov <dvyukov@google.com>
      Cc: Kostya Serebryany <kcc@google.com>
      Signed-off-by: Andrew Morton <akpm@linux-foundation.org>
      Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>
    • fix up initial thread stack pointer vs thread_info confusion · 7f1a00b6
      Linus Torvalds authored
      The INIT_TASK() initializer was similarly confused about the stack vs
      thread_info allocation that the allocators had, and that were fixed in
      commit b235beea ("Clarify naming of thread info/stack allocators").
      
      The task ->stack pointer only incidentally ends up having the same value
      as the thread_info, and in fact that will change.
      
      So fix the initial task struct initializer to point to 'init_stack'
      instead of 'init_thread_info', and make sure the ia64 definition for
      that exists.
      
      This actually makes the ia64 tsk->stack pointer be sensible for the
      initial task, but not for any other task.  As mentioned in commit
      b235beea, that whole pointer isn't actually used on ia64, since
      task_stack_page() there just points to the (single) allocation.
      
      All the other architectures seem to have copied the 'init_stack'
      definition, even if it tended to be generally unused.
      Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>
    • Clarify naming of thread info/stack allocators · b235beea
      Linus Torvalds authored
      We've had the thread info allocated together with the thread stack for
      most architectures for a long time (since the thread_info was split off
      from the task struct), but that is about to change.
      
      But the patches that move the thread info to be off-stack (and a part of
      the task struct instead) made it clear how confused the allocator and
      freeing functions are.
      
      Because the common case was that we share an allocation with the thread
      stack and the thread_info, the two pointers were identical.  That
      identity then meant that we would have things like
      
      	ti = alloc_thread_info_node(tsk, node);
      	...
      	tsk->stack = ti;
      
      which certainly _worked_ (since stack and thread_info have the same
      value), but is rather confusing: why are we assigning a thread_info to
      the stack? And if we move the thread_info away, the "confusing" code
      just gets to be entirely bogus.
      
      So remove all this confusion, and make it clear that we are doing the
      stack allocation by renaming and clarifying the function names to be
      about the stack.  The fact that the thread_info then shares the
      allocation is an implementation detail, and not really about the
      allocation itself.
      
      This is a pure renaming and type fix: we pass in the same pointer, it's
      just that we clarify what the pointer means.
      
      The ia64 code that actually only has one single allocation (for all of
      task_struct, thread_info and kernel thread stack) now looks a bit odd,
      but since "tsk->stack" is actually not even used there, that oddity
      doesn't matter.  It would be a separate thing to clean that up, I
      intentionally left the ia64 changes as a pure brute-force renaming and
      type change.
      Acked-by: Andy Lutomirski <luto@amacapital.net>
      Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>
    • locking/static_key: Fix concurrent static_key_slow_inc() · 4c5ea0a9
      Paolo Bonzini authored
      The following scenario is possible:
      
          CPU 1                                   CPU 2
          static_key_slow_inc()
           atomic_inc_not_zero()
            -> key.enabled == 0, no increment
           jump_label_lock()
           atomic_inc_return()
            -> key.enabled == 1 now
                                                  static_key_slow_inc()
                                                   atomic_inc_not_zero()
                                                    -> key.enabled == 1, inc to 2
                                                   return
                                                  ** static key is wrong!
           jump_label_update()
           jump_label_unlock()
      
      Testing the static key at the point marked by (**) will follow the
      wrong path for jumps that have not been patched yet.  This can
      actually happen when creating many KVM virtual machines with userspace
      LAPIC emulation; just run several copies of the following program:
      
          #include <fcntl.h>
          #include <unistd.h>
          #include <sys/ioctl.h>
          #include <linux/kvm.h>
      
          int main(void)
          {
              for (;;) {
                  int kvmfd = open("/dev/kvm", O_RDONLY);
                  int vmfd = ioctl(kvmfd, KVM_CREATE_VM, 0);
                  close(ioctl(vmfd, KVM_CREATE_VCPU, 1));
                  close(vmfd);
                  close(kvmfd);
              }
              return 0;
          }
      
      Every KVM_CREATE_VCPU ioctl will attempt a static_key_slow_inc() call.
      The static key's purpose is to skip NULL pointer checks and indeed one
      of the processes eventually dereferences NULL.
      
      As explained in the commit that introduced the bug:
      
        706249c2 ("locking/static_keys: Rework update logic")
      
      jump_label_update() needs key.enabled to be true.  The solution adopted
      here is to temporarily make key.enabled == -1, and to go down the
      slow path when key.enabled <= 0.
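
The approach described above can be modelled in userspace roughly as follows. This is only a sketch under stated assumptions, not the kernel code: `struct model_key`, `model_jump_label_update()` and the pthread mutex are hypothetical stand-ins for `struct static_key`, `jump_label_update()` and `jump_label_lock()`, and the `patched` flag merely records that the update ran.

```c
#include <pthread.h>
#include <stdatomic.h>

/* Hypothetical userspace stand-in for struct static_key. */
struct model_key {
	atomic_int enabled;	/* 0 = off, -1 = enable in progress, >0 = user count */
	int patched;		/* set once the (simulated) jump sites are patched */
};

/* Stand-in for jump_label_lock()/jump_label_unlock(). */
static pthread_mutex_t model_jump_label_mutex = PTHREAD_MUTEX_INITIALIZER;

/* Stand-in for jump_label_update(): here it only records that it ran. */
static void model_jump_label_update(struct model_key *key)
{
	key->patched = 1;
}

void model_key_slow_inc(struct model_key *key)
{
	int v = atomic_load(&key->enabled);

	/*
	 * Fast path: taken only while the count is strictly positive.
	 * A value of 0 (off) or -1 (first enabler still patching) falls
	 * through to the locked slow path, so nobody can return before
	 * the jump sites have been updated.
	 */
	while (v > 0) {
		/* On failure the CAS reloads v with the current value. */
		if (atomic_compare_exchange_weak(&key->enabled, &v, v + 1))
			return;
	}

	pthread_mutex_lock(&model_jump_label_mutex);
	if (atomic_load(&key->enabled) == 0) {
		/* First user: mark "in progress", patch, then publish 1. */
		atomic_store(&key->enabled, -1);
		model_jump_label_update(key);
		atomic_store(&key->enabled, 1);
	} else {
		/* Someone else finished enabling while we waited for the lock. */
		atomic_fetch_add(&key->enabled, 1);
	}
	pthread_mutex_unlock(&model_jump_label_mutex);
}
```

In this model a second caller that observes -1 cannot take the fast path, so it blocks on the mutex until the first caller has finished patching, which is the property the scenario at (**) was missing.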
      Reported-by: Dmitry Vyukov <dvyukov@google.com>
      Signed-off-by: Paolo Bonzini <pbonzini@redhat.com>
      Signed-off-by: Peter Zijlstra (Intel) <peterz@infradead.org>
      Cc: <stable@vger.kernel.org> # v4.3+
      Cc: Linus Torvalds <torvalds@linux-foundation.org>
      Cc: Peter Zijlstra <peterz@infradead.org>
      Cc: Thomas Gleixner <tglx@linutronix.de>
      Fixes: 706249c2 ("locking/static_keys: Rework update logic")
      Link: http://lkml.kernel.org/r/1466527937-69798-1-git-send-email-pbonzini@redhat.com
      [ Small stylistic edits to the changelog and the code. ]
      Signed-off-by: Ingo Molnar <mingo@kernel.org>
  6. 23 Jun, 2016 3 commits
  7. 22 Jun, 2016 1 commit
  8. 19 Jun, 2016 1 commit
  9. 18 Jun, 2016 2 commits
  10. 17 Jun, 2016 4 commits
  11. 16 Jun, 2016 1 commit
  12. 15 Jun, 2016 6 commits
    • gre: fix error handler · e582615a
      Eric Dumazet authored
      1) gre_parse_header() can be called from gre_err()
      
         At this point transport header points to ICMP header, not the inner
      header.
      
      2) We can not really change transport header as ipgre_err() will later
      assume transport header still points to ICMP header (using icmp_hdr())
      
      3) pskb_may_pull() logic in gre_parse_header() really works
        if we are interested in the zone pointed to by skb->data
      
      4) As Jiri explained in commit b7f8fe25 ("gre: do not pull header in
      ICMP error processing") we should not pull headers in error handler.
      
      So this fix :
      
      A) changes gre_parse_header() to use skb->data instead of
      skb_transport_header()
      
      B) Adds a nhs parameter to gre_parse_header() so that we can skip the
      not pulled IP header from error path.
        This offset is 0 for normal receive path.
      
      C) remove obsolete IPV6 includes
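
Points A) and B) can be illustrated with a small userspace parser. This is a hedged sketch, not the kernel function: `model_gre_parse()` and `struct gre_info` are hypothetical names, the buffer plays the role of skb->data, and only the 4-byte base GRE header (flags/version and protocol type, both big-endian) is read.

```c
#include <stddef.h>
#include <stdint.h>

/* Hypothetical result type: the two 16-bit fields of the base GRE header. */
struct gre_info {
	uint16_t flags;
	uint16_t proto;
};

/*
 * Parse a GRE base header located nhs bytes past the start of data.
 * nhs is 0 on the normal receive path and the length of the not yet
 * pulled IP header on the error path. Returns the number of GRE header
 * bytes read, or -1 if the buffer is too short.
 */
int model_gre_parse(const uint8_t *data, size_t len, size_t nhs,
		    struct gre_info *out)
{
	const uint8_t *gre;

	/* Bounds check relative to data, the analogue of pskb_may_pull(). */
	if (len < nhs || len - nhs < 4)
		return -1;

	gre = data + nhs;
	out->flags = (uint16_t)((gre[0] << 8) | gre[1]);
	out->proto = (uint16_t)((gre[2] << 8) | gre[3]);
	return 4;
}
```

Because every offset is computed from the start of the buffer, the caller never has to move a transport-header pointer, so the ICMP header position assumed by the error handler stays untouched.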
      Signed-off-by: Eric Dumazet <edumazet@google.com>
      Cc: Tom Herbert <tom@herbertland.com>
      Cc: Maciej Żenczykowski <maze@google.com>
      Cc: Jiri Benc <jbenc@redhat.com>
      Signed-off-by: David S. Miller <davem@davemloft.net>
    • net: Don't forget pr_fmt on net_dbg_ratelimited for CONFIG_DYNAMIC_DEBUG · daddef76
      Jason A. Donenfeld authored
      The implementation of net_dbg_ratelimited in the CONFIG_DYNAMIC_DEBUG
      case was added with 2c94b537 ("net: Implement net_dbg_ratelimited() for
      CONFIG_DYNAMIC_DEBUG case"). The implementation strategy was to take the
      usual definition of the dynamic_pr_debug macro, but alter it by adding a
      call to "net_ratelimit()" in the if statement. This is, in fact, the
      correct approach.
      
      However, while doing this, the author of the commit forgot to surround
      fmt by pr_fmt, resulting in unprefixed log messages appearing in the
      console. So, this commit adds back the pr_fmt(fmt) invocation, making
      net_dbg_ratelimited properly consistent across DEBUG, no DEBUG, and
      DYNAMIC_DEBUG cases, and bringing parity with the behavior of
      dynamic_pr_debug as well.
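
The effect of wrapping fmt in pr_fmt() can be shown with a userspace macro. This is a sketch under assumptions, not the kernel macro: `model_dbg_ratelimited()` and `model_ratelimit()` are hypothetical, the "demo_mod: " prefix is made up, and output goes to a caller-supplied buffer via snprintf() instead of the dynamic debug machinery.

```c
#include <stdio.h>

/* Hypothetical module prefix; kernel code defines pr_fmt per source file. */
#define pr_fmt(fmt) "demo_mod: " fmt

/* Stand-in for net_ratelimit(); this model always allows the message. */
static int model_ratelimit(void)
{
	return 1;
}

/*
 * Rate-limited debug print into buf. Passing pr_fmt(fmt) rather than the
 * bare fmt is what gives the message its prefix; with the bare fmt the
 * output would be unprefixed. Uses the GNU ##__VA_ARGS__ extension, as
 * kernel-style macros do.
 */
#define model_dbg_ratelimited(buf, size, fmt, ...)			\
	do {								\
		if (model_ratelimit())					\
			snprintf(buf, size, pr_fmt(fmt), ##__VA_ARGS__);	\
	} while (0)
```

String-literal concatenation happens at compile time, so the prefix costs nothing at run time and stays consistent with every other message that goes through pr_fmt().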
      
      Fixes: 2c94b537 ("net: Implement net_dbg_ratelimited() for CONFIG_DYNAMIC_DEBUG case")
      Signed-off-by: Jason A. Donenfeld <Jason@zx2c4.com>
      Cc: Tim Bingham <tbingham@akamai.com>
      Signed-off-by: David S. Miller <davem@davemloft.net>
    • Update my main e-mails at the Kernel tree · 5dc8a864
      Mauro Carvalho Chehab authored
      For the third time in three years, I'm changing my e-mail at Samsung.
      That's bad, as it may stop communications with me for a while.  So, this
      time, I'll also add the mchehab@kernel.org e-mail, as it has always
      remained stable.
      
      Cc: stable@vger.kernel.org
      Signed-off-by: Mauro Carvalho Chehab <mchehab@s-opensource.com>
      Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>
    • rpc: share one xps between all backchannels · 39a9beab
      J. Bruce Fields authored
      The spec allows backchannels for multiple clients to share the same tcp
      connection.  When that happens, we need to use the same xprt for all of
      them.  Similarly, we need the same xps.
      
      This fixes list corruption introduced by the multipath code.
      
      Cc: stable@vger.kernel.org
      Signed-off-by: J. Bruce Fields <bfields@redhat.com>
      Acked-by: Trond Myklebust <trondmy@primarydata.com>
    • nfsd4/rpc: move backchannel create logic into rpc code · d50039ea
      J. Bruce Fields authored
      Also simplify the logic a bit.
      
      Cc: stable@vger.kernel.org
      Signed-off-by: J. Bruce Fields <bfields@redhat.com>
      Acked-by: Trond Myklebust <trondmy@primarydata.com>
    • netfilter: nf_tables: reject loops from set element jump to chain · 8588ac09
      Pablo Neira Ayuso authored
      Liping Zhang says:
      
      "Users may add such a wrong nft rules successfully, which will cause an
      endless jump loop:
      
        # nft add rule filter test tcp dport vmap {1: jump test}
      
      This is because before we commit, the element in the current anonymous
      set is inactive, so ops->walk will skip this element and miss the
      validate check."
      
      To resolve this problem, this patch passes the generation mask to the
      walk function through the iter container structure depending on the code
      path:
      
      1) If we're dumping the elements, then we have to check if the element
         is active in the current generation. Thus, we check for the current
         bit in the genmask.
      
      2) If we're checking for loops, then we have to check if the element is
         active in the next generation, as we're in the middle of a
         transaction. Thus, we check for the next bit in the genmask.
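
The two cases above can be sketched with a two-generation bitmask. This is a simplified model, not the nf_tables code: the `model_*` names are hypothetical, and the convention that a set bit means "inactive in that generation" is an assumption of this sketch.

```c
#include <stdbool.h>
#include <stdint.h>

/* Hypothetical element: bit g set means "inactive in generation g". */
struct model_elem {
	uint8_t genmask;
};

/* cursor is 0 or 1 and names the generation that is currently live. */
static uint8_t model_genmask_cur(unsigned int cursor)
{
	return (uint8_t)(1u << cursor);
}

/* The other generation is the one the pending transaction will commit. */
static uint8_t model_genmask_next(unsigned int cursor)
{
	return (uint8_t)(1u << (cursor ^ 1u));
}

/* An element is visible to a walk if it is not masked out for genmask. */
static bool model_elem_active(const struct model_elem *e, uint8_t genmask)
{
	return (e->genmask & genmask) == 0;
}
```

An element added inside an uncommitted transaction is masked out of the current generation only, so a dump (current bit) skips it while a loop check (next bit) sees it and can validate the jump.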
      
      Based on original patch from Liping Zhang.
      Reported-by: Liping Zhang <liping.zhang@spreadtrum.com>
      Signed-off-by: Pablo Neira Ayuso <pablo@netfilter.org>
      Tested-by: Liping Zhang <liping.zhang@spreadtrum.com>
  13. 11 Jun, 2016 1 commit
    • net: diag: add missing declarations · c3ec5e5c
      Ben Dooks authored
      The functions inet_diag_msg_common_fill and inet_diag_msg_attrs_fill
      seem to have been missed from the include/linux/inet_diag.h header
      file. Add them to fix the following warnings:
      
      net/ipv4/inet_diag.c:69:6: warning: symbol 'inet_diag_msg_common_fill' was not declared. Should it be static?
      net/ipv4/inet_diag.c:108:5: warning: symbol 'inet_diag_msg_attrs_fill' was not declared. Should it be static?
      Signed-off-by: Ben Dooks <ben.dooks@codethink.co.uk>
      Signed-off-by: David S. Miller <davem@davemloft.net>
  14. 10 Jun, 2016 3 commits
    • much milder d_walk() race · ba65dc5e
      Al Viro authored
      d_walk() relies upon the tree not getting rearranged under it without
      rename_lock being touched.  And we do grab rename_lock around the
      places that change the tree topology.  Unfortunately, branch reordering
      is just as bad from d_walk() POV and we have two places that do it
      without touching rename_lock - one in handling of cursors (for ramfs-style
      directories) and another in autofs.  autofs one is a separate story; this
      commit deals with the cursors.
      	* mark cursor dentries explicitly at allocation time
      	* make __dentry_kill() leave ->d_child.next pointing to the next
      non-cursor sibling, making sure that it won't be moved around unnoticed
      before the parent is relocked on ascend-to-parent path in d_walk().
      	* make d_walk() skip cursors explicitly; strictly speaking it's
      not necessary (all callbacks we pass to d_walk() are no-ops on cursors),
      but it makes analysis easier.
      Signed-off-by: Al Viro <viro@zeniv.linux.org.uk>
    • pwm: Improve args checking in pwm_apply_state() · ef2bf499
      Brian Norris authored
      It seems like in the process of refactoring pwm_config() to utilize the
      newly-introduced pwm_apply_state() API, some args/bounds checking was
      dropped.
      
      In particular, I noted that we are now allowing invalid period
      selections, e.g.:
      
        # echo 1 > /sys/class/pwm/pwmchip0/export
        # cat /sys/class/pwm/pwmchip0/pwm1/period
        100
        # echo 101 > /sys/class/pwm/pwmchip0/pwm1/duty_cycle
        [... driver may or may not reject the value, or trigger some logic bug ...]
      
      It's better to see:
      
        # echo 1 > /sys/class/pwm/pwmchip0/export
        # cat /sys/class/pwm/pwmchip0/pwm1/period
        100
        # echo 101 > /sys/class/pwm/pwmchip0/pwm1/duty_cycle
        -bash: echo: write error: Invalid argument
      
      This patch reintroduces some bounds checks in both pwm_config() (for its
      signed parameters; we don't want to convert negative values into large
      unsigned values) and in pwm_apply_state() (which fixes the behavior
      described above, as well as other potential API misuses).
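
A minimal sketch of the kind of checks described, using hypothetical userspace types rather than the kernel's `struct pwm_state`. Rejecting a zero period is an assumption of this sketch; the duty-cycle and negative-value checks follow the text above.

```c
#include <errno.h>

/* Hypothetical stand-in for the PWM state; values are in nanoseconds. */
struct model_pwm_state {
	unsigned int period;
	unsigned int duty_cycle;
};

/* Analogue of the pwm_apply_state() checks: reject inconsistent state. */
int model_pwm_check_state(const struct model_pwm_state *state)
{
	/* Zero period is rejected here as an assumption of this sketch. */
	if (!state || !state->period || state->duty_cycle > state->period)
		return -EINVAL;
	return 0;
}

/*
 * Analogue of pwm_config(): the parameters are signed, so negative
 * values are rejected before they can turn into large unsigned ones.
 */
int model_pwm_config(int duty_ns, int period_ns, struct model_pwm_state *out)
{
	struct model_pwm_state tmp;
	int err;

	if (duty_ns < 0 || period_ns < 0)
		return -EINVAL;

	tmp.duty_cycle = (unsigned int)duty_ns;
	tmp.period = (unsigned int)period_ns;

	err = model_pwm_check_state(&tmp);
	if (err)
		return err;

	*out = tmp;	/* only commit a state that passed validation */
	return 0;
}
```

With these checks, the sysfs example above (duty cycle 101 with period 100) would fail with -EINVAL before any driver callback runs.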
      
      Fixes: 5ec803ed ("pwm: Add core infrastructure to allow atomic updates")
      Signed-off-by: Brian Norris <briannorris@chromium.org>
      Acked-by: Boris Brezillon <boris.brezillon@free-electrons.com>
      Signed-off-by: Thierry Reding <thierry.reding@gmail.com>
    • packet: compat support for sock_fprog · 719c44d3
      Willem de Bruijn authored
      Socket option PACKET_FANOUT_DATA takes a struct sock_fprog as argument
      if PACKET_FANOUT has mode PACKET_FANOUT_CBPF. This structure contains
      a pointer into user memory. If userland is 32-bit and kernel is 64-bit
      the two disagree about the layout of struct sock_fprog.
      
      Add compat setsockopt support to convert a 32-bit compat_sock_fprog to
      a 64-bit sock_fprog. This is analogous to compat_sock_fprog support for
      SO_REUSEPORT added in commit 19575988 ("soreuseport: add compat
      case for setsockopt SO_ATTACH_REUSEPORT_CBPF").
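
The layout mismatch and the conversion can be sketched in userspace as follows. The `model_*` types are hypothetical mirrors of the compat and native structures, with the 32-bit user pointer carried as a `uint32_t`; this is an illustration, not the kernel's setsockopt path.

```c
#include <stdint.h>

struct model_sock_filter;	/* opaque here; only the pointer matters */

/* Layout as seen by 32-bit userland: the pointer is a 32-bit value. */
struct model_compat_sock_fprog {
	uint16_t len;		/* number of filter blocks */
	uint32_t filter;	/* 32-bit user pointer */
};

/* Native layout: the pointer has the kernel's pointer width. */
struct model_sock_fprog {
	uint16_t len;
	const struct model_sock_filter *filter;
};

/* Widen the 32-bit pointer and copy the length field by field. */
static void model_fprog_from_compat(struct model_sock_fprog *dst,
				    const struct model_compat_sock_fprog *src)
{
	dst->len = src->len;
	dst->filter = (const struct model_sock_filter *)(uintptr_t)src->filter;
}
```

On a 64-bit build the two structures differ in size, so copying the user buffer straight into the native structure would misread the pointer; the field-wise conversion avoids that.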
      Reported-by: Daniel Borkmann <daniel@iogearbox.net>
      Signed-off-by: Willem de Bruijn <willemb@google.com>
      Acked-by: Daniel Borkmann <daniel@iogearbox.net>
      Signed-off-by: David S. Miller <davem@davemloft.net>
  15. 09 Jun, 2016 5 commits
  16. 08 Jun, 2016 2 commits
    • drivers: of: Fix of_pci.h header guard · 5c1d3310
      Robin Murphy authored
      The compilation of of_pci.c is governed by CONFIG_OF_PCI, but the
      corresponding declarations in of_pci.h are inconsistently guarded by
      CONFIG_OF, with the result that if CONFIG_PCI is disabled for an OF
      platform, the dangling external declarations are still active and the
      inline stub definitions not. So far this has managed to go unnoticed
      since it happens that the only references to these functions are from
      code which itself depends on CONFIG_PCI or CONFIG_OF_PCI.
      
      Fix this with the appropriate config guard so that any new callers
      outside PCI-specific code don't start unexpectedly breaking under
      certain configs.
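
The guard pattern in question looks roughly like this. The function name `model_of_pci_lookup()` and the -ENOSYS return value are hypothetical, chosen only to illustrate the shape of the header; the real declarations in of_pci.h differ.

```c
#include <errno.h>

struct model_device_node;	/* opaque stand-in for struct device_node */

/*
 * The guard must match the option that decides whether the .c file is
 * built (CONFIG_OF_PCI here). Guarding with a broader option such as
 * CONFIG_OF would expose the extern declaration even when no definition
 * is compiled, and hide the stub that callers need in that case.
 */
#ifdef CONFIG_OF_PCI
int model_of_pci_lookup(struct model_device_node *np);
#else
static inline int model_of_pci_lookup(struct model_device_node *np)
{
	(void)np;
	return -ENOSYS;	/* hypothetical "not available" result */
}
#endif
```

With the matching guard, a caller outside PCI-specific code links against either the real function or the inline stub in every configuration.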
      Signed-off-by: Robin Murphy <robin.murphy@arm.com>
      Signed-off-by: Rob Herring <robh@kernel.org>
    • locking/qspinlock: Fix spin_unlock_wait() some more · 2c610022
      Peter Zijlstra authored
      While this prior commit:
      
        54cf809b ("locking,qspinlock: Fix spin_is_locked() and spin_unlock_wait()")
      
      ... fixes spin_is_locked() and spin_unlock_wait() for the usage
      in ipc/sem and netfilter, it does not in fact work right for the
      usage in task_work and futex.
      
      So while the 2 locks crossed problem:
      
      	spin_lock(A)		spin_lock(B)
      	if (!spin_is_locked(B)) spin_unlock_wait(A)
      	  foo()			foo();
      
      ... works with the smp_mb() injected by both spin_is_locked() and
      spin_unlock_wait(), this is not sufficient for:
      
      	flag = 1;
      	smp_mb();		spin_lock()
      	spin_unlock_wait()	if (!flag)
      				  // add to lockless list
      	// iterate lockless list
      
      ... because in this scenario, the store from spin_lock() can be delayed
      past the load of flag, uncrossing the variables and losing the
      guarantee.
      
      This patch reworks spin_is_locked() and spin_unlock_wait() to work in
      both cases by exploiting the observation that while the lock byte
      store can be delayed, the contender must have registered itself
      visibly in other state contained in the word.
      
      It also allows for architectures to override both functions, as PPC
      and ARM64 have an additional issue for which we currently have no
      generic solution.
      Signed-off-by: Peter Zijlstra (Intel) <peterz@infradead.org>
      Cc: Andrew Morton <akpm@linux-foundation.org>
      Cc: Boqun Feng <boqun.feng@gmail.com>
      Cc: Davidlohr Bueso <dave@stgolabs.net>
      Cc: Giovanni Gherdovich <ggherdovich@suse.com>
      Cc: Linus Torvalds <torvalds@linux-foundation.org>
      Cc: Pan Xinhui <xinhui.pan@linux.vnet.ibm.com>
      Cc: Paul E. McKenney <paulmck@linux.vnet.ibm.com>
      Cc: Peter Zijlstra <peterz@infradead.org>
      Cc: Thomas Gleixner <tglx@linutronix.de>
      Cc: Waiman Long <waiman.long@hpe.com>
      Cc: Will Deacon <will.deacon@arm.com>
      Cc: stable@vger.kernel.org # v4.2 and later
      Fixes: 54cf809b ("locking,qspinlock: Fix spin_is_locked() and spin_unlock_wait()")
      Signed-off-by: Ingo Molnar <mingo@kernel.org>