1. 28 Mar, 2012 1 commit
  2. 08 Mar, 2012 2 commits
  3. 05 Mar, 2012 6 commits
    • Hugh Dickins's avatar
      memcg: fix GPF when cgroup removal races with last exit · 7512102c
      Hugh Dickins authored
      When moving tasks from old memcg (with move_charge_at_immigrate on new
      memcg), followed by removal of old memcg, hit General Protection Fault in
      mem_cgroup_lru_del_list() (called from release_pages called from
      free_pages_and_swap_cache from tlb_flush_mmu from tlb_finish_mmu from
      exit_mmap from mmput from exit_mm from do_exit).
      Somewhat reproducible, takes a few hours: the old struct mem_cgroup has
      been freed and poisoned by SLAB_DEBUG, but mem_cgroup_lru_del_list() is
      still trying to update its stats, and take page off lru before freeing.
      A task, or a charge, or a page on lru: each secures a memcg against
      removal.  In this case, the last task has been moved out of the old memcg,
      and it is exiting: anonymous pages are uncharged one by one from the
      memcg, as they are zapped from its pagetables, so the charge gets down to
      0; but the pages themselves are queued in an mmu_gather for freeing.
      Most of those pages will be on lru (and force_empty is careful to
      lru_add_drain_all, to add pages from pagevec to lru first), but not
      necessarily all: perhaps some have been isolated for page reclaim, perhaps
      some isolated for other reasons.  So, force_empty may find no task, no
      charge and no page on lru, and let the removal proceed.
      There would still be no problem if these pages were immediately freed; but
      typically (and the put_page_testzero protocol demands it) they have to be
      added back to lru before they are found freeable, then removed from lru
      and freed.  We don't see the issue when adding, because the
      mem_cgroup_iter() loops keep their own reference to the memcg being
      scanned; but when it comes to mem_cgroup_lru_del_list().
      I believe this was not an issue in v3.2: there, PageCgroupAcctLRU and
      PageCgroupUsed flags were used (like a trick with mirrors) to deflect view
      of pc->mem_cgroup to the stable root_mem_cgroup when neither set.
      38c5d72f ("memcg: simplify LRU handling by new rule") mercifully
      removed those convolutions, but left this General Protection Fault.
      But it's surprisingly easy to restore the old behaviour: just check
      PageCgroupUsed in mem_cgroup_lru_add_list() (which decides on which lruvec
      to add), and reset pc to root_mem_cgroup if page is uncharged.  A risky
      change?  just going back to how it worked before; testing, and an audit of
      uses of pc->mem_cgroup, show no problem.
      And there's a nice bonus: with mem_cgroup_lru_add_list() itself making
      sure that an uncharged page goes to root lru, mem_cgroup_reset_owner() no
      longer has any purpose, and we can safely revert 4e5f01c2 ("memcg:
      clear pc->mem_cgroup if necessary").
      Calling update_page_reclaim_stat() after add_page_to_lru_list() in swap.c
      is not strictly necessary: the lru_lock there, with RCU before memcg
      structures are freed, makes mem_cgroup_get_reclaim_stat_from_page safe
      without that; but it seems cleaner to rely on one dependency less.
      Signed-off-by: default avatarHugh Dickins <hughd@google.com>
      Cc: KAMEZAWA Hiroyuki <kamezawa.hiroyu@jp.fujitsu.com>
      Cc: Johannes Weiner <hannes@cmpxchg.org>
      Cc: Konstantin Khlebnikov <khlebnikov@openvz.org>
      Signed-off-by: default avatarAndrew Morton <akpm@linux-foundation.org>
      Signed-off-by: default avatarLinus Torvalds <torvalds@linux-foundation.org>
    • Oleg Nesterov's avatar
      vfork: kill PF_STARTING · 6e27f63e
      Oleg Nesterov authored
      Previously it was (ab)used by utrace.  Then it was wrongly used by the
      scheduler code.
      Currently it is not used, kill it before it finds the new erroneous user.
      Signed-off-by: default avatarOleg Nesterov <oleg@redhat.com>
      Acked-by: default avatarTejun Heo <tj@kernel.org>
      Signed-off-by: default avatarAndrew Morton <akpm@linux-foundation.org>
      Signed-off-by: default avatarLinus Torvalds <torvalds@linux-foundation.org>
    • Oleg Nesterov's avatar
      coredump_wait: don't call complete_vfork_done() · 57b59c4a
      Oleg Nesterov authored
      Now that CLONE_VFORK is killable, coredump_wait() no longer needs
      complete_vfork_done().  zap_threads() should find and kill all tasks with
      the same ->mm, this includes our parent if ->vfork_done is set.
      mm_release() becomes the only caller, unexport complete_vfork_done().
      Signed-off-by: default avatarOleg Nesterov <oleg@redhat.com>
      Acked-by: default avatarTejun Heo <tj@kernel.org>
      Signed-off-by: default avatarAndrew Morton <akpm@linux-foundation.org>
      Signed-off-by: default avatarLinus Torvalds <torvalds@linux-foundation.org>
    • Oleg Nesterov's avatar
      vfork: make it killable · d68b46fe
      Oleg Nesterov authored
      Make vfork() killable.
      Change do_fork(CLONE_VFORK) to do wait_for_completion_killable().  If it
      fails we do not return to the user-mode and never touch the memory shared
      with our child.
      However, in this case we should clear child->vfork_done before return, we
      use task_lock() in do_fork()->wait_for_vfork_done() and
      complete_vfork_done() to serialize with each other.
      Note: now that we use task_lock() we don't really need completion, we
      could turn task->vfork_done into "task_struct *wake_up_me" but this needs
      some complications.
      NOTE: this and the next patches do not affect in-kernel users of
      CLONE_VFORK, kernel threads run with all signals ignored including
      However this is obviously the user-visible change.  Not only a fatal
      signal can kill the vforking parent, a sub-thread can do execve or
      exit_group() and kill the thread sleeping in vfork().
      Signed-off-by: default avatarOleg Nesterov <oleg@redhat.com>
      Acked-by: default avatarTejun Heo <tj@kernel.org>
      Signed-off-by: default avatarAndrew Morton <akpm@linux-foundation.org>
      Signed-off-by: default avatarLinus Torvalds <torvalds@linux-foundation.org>
    • Oleg Nesterov's avatar
      vfork: introduce complete_vfork_done() · c415c3b4
      Oleg Nesterov authored
      No functional changes.
      Move the clear-and-complete-vfork_done code into the new trivial helper,
      Signed-off-by: default avatarOleg Nesterov <oleg@redhat.com>
      Acked-by: default avatarTejun Heo <tj@kernel.org>
      Signed-off-by: default avatarAndrew Morton <akpm@linux-foundation.org>
      Signed-off-by: default avatarLinus Torvalds <torvalds@linux-foundation.org>
    • Matthew Garrett's avatar
      kmsg_dump: don't run on non-error paths by default · c22ab332
      Matthew Garrett authored
      Since commit 04c6862c ("kmsg_dump: add kmsg_dump() calls to the
      reboot, halt, poweroff and emergency_restart paths"), kmsg_dump() gets
      run on normal paths including poweroff and reboot.
      This is less than ideal given pstore implementations that can only
      represent single backtraces, since a reboot may overwrite a stored oops
      before it's been picked up by userspace.  In addition, some pstore
      backends may have low performance and provide a significant delay in
      reboot as a result.
      This patch adds a printk.always_kmsg_dump kernel parameter (which can also
      be changed from userspace).  Without it, the code will only be run on
      failure paths rather than on normal paths.  The option can be enabled in
      environments where there's a desire to attempt to audit whether or not a
      reboot was cleanly requested or not.
      Signed-off-by: default avatarMatthew Garrett <mjg@redhat.com>
      Acked-by: default avatarSeiji Aguchi <seiji.aguchi@hds.com>
      Cc: Seiji Aguchi <seiji.aguchi@hds.com>
      Cc: David Woodhouse <dwmw2@infradead.org>
      Cc: Marco Stornelli <marco.stornelli@gmail.com>
      Cc: Artem Bityutskiy <Artem.Bityutskiy@nokia.com>
      Cc: KOSAKI Motohiro <kosaki.motohiro@jp.fujitsu.com>
      Cc: Vivek Goyal <vgoyal@redhat.com>
      Cc: Don Zickus <dzickus@redhat.com>
      Signed-off-by: default avatarAndrew Morton <akpm@linux-foundation.org>
      Signed-off-by: default avatarLinus Torvalds <torvalds@linux-foundation.org>
  4. 04 Mar, 2012 2 commits
  5. 02 Mar, 2012 7 commits
  6. 28 Feb, 2012 1 commit
  7. 27 Feb, 2012 1 commit
    • James Bottomley's avatar
      [PARISC] fix compile break caused by iomap: make IOPORT/PCI mapping functions conditional · 97a29d59
      James Bottomley authored
      The problem in
      commit fea80311
      Author: Randy Dunlap <rdunlap@xenotime.net>
      Date:   Sun Jul 24 11:39:14 2011 -0700
          iomap: make IOPORT/PCI mapping functions conditional
      is that if your architecture supplies pci_iomap/pci_iounmap, it expects
      always to supply them.  Adding empty body defitions in the !CONFIG_PCI
      case, which is what this patch does, breaks the parisc compile because
      the functions become doubly defined.  It took us a while to spot this,
      because we don't actually build !CONFIG_PCI very often (only if someone
      is brave enough to test the snake/asp machines).
      Since the note in the commit log says this is to fix a
      CONFIG_GENERIC_IOMAP issue (which it does because CONFIG_GENERIC_IOMAP
      supplies pci_iounmap only if CONFIG_PCI is set), there should actually
      have been a condition upon this.  This should make sure no other
      architecture's !CONFIG_PCI compile breaks in the same way as parisc.
      The fix had to be updated to take account of the GENERIC_PCI_IOMAP
      Reported-by: default avatarRolf Eike Beer <eike@sf-mail.de>
      Signed-off-by: default avatarJames Bottomley <JBottomley@Parallels.com>
  8. 26 Feb, 2012 1 commit
  9. 24 Feb, 2012 2 commits
    • Oleg Nesterov's avatar
      epoll: introduce POLLFREE to flush ->signalfd_wqh before kfree() · d80e731e
      Oleg Nesterov authored
      This patch is intentionally incomplete to simplify the review.
      It ignores ep_unregister_pollwait() which plays with the same wqh.
      See the next change.
      epoll assumes that the EPOLL_CTL_ADD'ed file controls everything
      f_op->poll() needs. In particular it assumes that the wait queue
      can't go away until eventpoll_release(). This is not true in case
      of signalfd, the task which does EPOLL_CTL_ADD uses its ->sighand
      which is not connected to the file.
      This patch adds the special event, POLLFREE, currently only for
      epoll. It expects that init_poll_funcptr()'ed hook should do the
      necessary cleanup. Perhaps it should be defined as EPOLLFREE in
      __cleanup_sighand() is changed to do wake_up_poll(POLLFREE) if
      ->signalfd_wqh is not empty, we add the new signalfd_cleanup()
      ep_poll_callback(POLLFREE) simply does list_del_init(task_list).
      This make this poll entry inconsistent, but we don't care. If you
      share epoll fd which contains our sigfd with another process you
      should blame yourself. signalfd is "really special". I simply do
      not know how we can define the "right" semantics if it used with
      The main problem is, epoll calls signalfd_poll() once to establish
      the connection with the wait queue, after that signalfd_poll(NULL)
      returns the different/inconsistent results depending on who does
      EPOLL_CTL_MOD/signalfd_read/etc. IOW: apart from sigmask, signalfd
      has nothing to do with the file, it works with the current thread.
      In short: this patch is the hack which tries to fix the symptoms.
      It also assumes that nobody can take tasklist_lock under epoll
      locks, this seems to be true.
      	- we do not have wake_up_all_poll() but wake_up_poll()
      	  is fine, poll/epoll doesn't use WQ_FLAG_EXCLUSIVE.
      	- signalfd_cleanup() uses POLLHUP along with POLLFREE,
      	  we need a couple of simple changes in eventpoll.c to
      	  make sure it can't be "lost".
      Reported-by: default avatarMaxime Bizon <mbizon@freebox.fr>
      Cc: <stable@kernel.org>
      Signed-off-by: default avatarOleg Nesterov <oleg@redhat.com>
      Signed-off-by: default avatarLinus Torvalds <torvalds@linux-foundation.org>
    • Jozsef Kadlecsik's avatar
      netfilter: ctnetlink: fix soft lockup when netlink adds new entries (v2) · 7d367e06
      Jozsef Kadlecsik authored
      Marcell Zambo and Janos Farago noticed and reported that when
      new conntrack entries are added via netlink and the conntrack table
      gets full, soft lockup happens. This is because the nf_conntrack_lock
      is held while nf_conntrack_alloc is called, which is in turn wants
      to lock nf_conntrack_lock while evicting entries from the full table.
      The patch fixes the soft lockup with limiting the holding of the
      nf_conntrack_lock to the minimum, where it's absolutely required.
      It required to extend (and thus change) nf_conntrack_hash_insert
      so that it makes sure conntrack and ctnetlink do not add the same entry
      twice to the conntrack table.
      Signed-off-by: default avatarJozsef Kadlecsik <kadlec@blackhole.kfki.hu>
      Signed-off-by: default avatarPablo Neira Ayuso <pablo@netfilter.org>
  10. 23 Feb, 2012 2 commits
  11. 22 Feb, 2012 1 commit
    • Peter Zijlstra's avatar
      sched/events: Revert trace_sched_stat_sleeptime() · 8c79a045
      Peter Zijlstra authored
      Commit 1ac9bc69 ("sched/tracing: Add a new tracepoint for sleeptime")
      added a new sched:sched_stat_sleeptime tracepoint.
      It's broken: the first sample we get on a task might be bad because
      of a stale sleep_start value that wasn't reset at the last task switch
      because the tracepoint was not active.
      It also breaks the existing schedstat samples due to the side
      effects of:
      -               se->statistics.sleep_start = 0;
      -               se->statistics.block_start = 0;
      Nor do I see means to fix it without adding overhead to the scheduler
      fast path, which I'm not willing to for the sake of redundant
      Most importantly, sleep time information can already be constructed
      by tracing context switches and wakeups, and taking the timestamp
      difference between the schedule-out, the wakeup and the schedule-in.
      Signed-off-by: default avatarPeter Zijlstra <a.p.zijlstra@chello.nl>
      Cc: Andrew Vagin <avagin@openvz.org>
      Cc: Arnaldo Carvalho de Melo <acme@redhat.com>
      Cc: Frederic Weisbecker <fweisbec@gmail.com>
      Link: http://lkml.kernel.org/n/tip-pc4c9qhl8q6vg3bs4j6k0rbd@git.kernel.orgSigned-off-by: default avatarIngo Molnar <mingo@elte.hu>
  12. 21 Feb, 2012 6 commits
    • Linus Torvalds's avatar
      sys_poll: fix incorrect type for 'timeout' parameter · faf30900
      Linus Torvalds authored
      The 'poll()' system call timeout parameter is supposed to be 'int', not
      Now, the reason this matters is that right now 32-bit compat mode is
      broken on at least x86-64, because the 32-bit code just calls
      'sys_poll()' directly on x86-64, and the 32-bit argument will have been
      zero-extended, turning a signed 'int' into a large unsigned 'long'
      We could just introduce a 'compat_sys_poll()' function for this, and
      that may eventually be what we have to do, but since the actual standard
      poll() semantics is *supposed* to be 'int', and since at least on x86-64
      glibc sign-extends the argument before invocing the system call (so
      nobody can actually use a 64-bit timeout value in user space _anyway_,
      even in 64-bit binaries), the simpler solution would seem to be to just
      fix the definition of the system call to match what it should have been
      from the very start.
      If it turns out that somebody somehow circumvents the user-level libc
      64-bit sign extension and actually uses a large unsigned 64-bit timeout
      despite that not being how poll() is supposed to work, we will need to
      do the compat_sys_poll() approach.
      Reported-by: default avatarThomas Meyer <thomas@m3y3r.de>
      Acked-by: default avatarEric Dumazet <eric.dumazet@gmail.com>
      Signed-off-by: default avatarLinus Torvalds <torvalds@linux-foundation.org>
    • Hitoshi Mitake's avatar
      asm-generic: architecture independent readq/writeq for 32bit environment · 797a796a
      Hitoshi Mitake authored
      This provides unified readq()/writeq() helper functions for 32-bit
      For some cases, readq/writeq without atomicity is harmful, and order of
      io access has to be specified explicitly.  So in this patch, new two
      header files which contain non-atomic readq/writeq are added.
       - <asm-generic/io-64-nonatomic-lo-hi.h> provides non-atomic readq/
         writeq with the order of lower address -> higher address
       - <asm-generic/io-64-nonatomic-hi-lo.h> provides non-atomic readq/
         writeq with reversed order
      This allows us to remove some readq()s that were added drivers when the
      default non-atomic ones were removed in commit dbee8a0a ("x86:
      remove 32-bit versions of readq()/writeq()")
      The drivers which need readq/writeq but can do with the non-atomic ones
      must add the line:
        #include <asm-generic/io-64-nonatomic-lo-hi.h> /* or hi-lo.h */
      But this will be nop in 64-bit environments, and no other #ifdefs are
      required.  So I believe that this patch can solve the problem of
       1. driver-specific readq/writeq
       2. atomicity and order of io access
      This patch is tested with building allyesconfig and allmodconfig as
      ARCH=x86 and ARCH=i386 on top of tip/master.
      Cc: Kashyap Desai <Kashyap.Desai@lsi.com>
      Cc: Len Brown <lenb@kernel.org>
      Cc: Ravi Anand <ravi.anand@qlogic.com>
      Cc: Vikas Chaudhary <vikas.chaudhary@qlogic.com>
      Cc: Matthew Garrett <mjg@redhat.com>
      Cc: Jason Uhlenkott <juhlenko@akamai.com>
      Cc: James Bottomley <James.Bottomley@parallels.com>
      Cc: Thomas Gleixner <tglx@linutronix.de>
      Cc: "H. Peter Anvin" <hpa@zytor.com>
      Cc: Roland Dreier <roland@purestorage.com>
      Cc: James Bottomley <jbottomley@parallels.com>
      Cc: Alan Cox <alan@lxorguk.ukuu.org.uk>
      Cc: Matthew Wilcox <matthew.r.wilcox@intel.com>
      Cc: Andrew Morton <akpm@linux-foundation.org>
      Signed-off-by: default avatarHitoshi Mitake <h.mitake@gmail.com>
      Signed-off-by: default avatarLinus Torvalds <torvalds@linux-foundation.org>
    • Greg Rose's avatar
      rtnetlink: Fix problem with buffer allocation · 115c9b81
      Greg Rose authored
      Implement a new netlink attribute type IFLA_EXT_MASK.  The mask
      is a 32 bit value that can be used to indicate to the kernel that
      certain extended ifinfo values are requested by the user application.
      At this time the only mask value defined is RTEXT_FILTER_VF to
      indicate that the user wants the ifinfo dump to send information
      about the VFs belonging to the interface.
      This patch fixes a bug in which certain applications do not have
      large enough buffers to accommodate the extra information returned
      by the kernel with large numbers of SR-IOV virtual functions.
      Those applications will not send the new netlink attribute with
      the interface info dump request netlink messages so they will
      not get unexpectedly large request buffers returned by the kernel.
      Modifies the rtnl_calcit function to traverse the list of net
      devices and compute the minimum buffer size that can hold the
      info dumps of all matching devices based upon the filter passed
      in via the new netlink attribute filter mask.  If no filter
      mask is sent then the buffer allocation defaults to NLMSG_GOODSIZE.
      With this change it is possible to add yet to be defined netlink
      attributes to the dump request which should make it fairly extensible
      in the future.
      Signed-off-by: default avatarGreg Rose <gregory.v.rose@intel.com>
      Signed-off-by: default avatarDavid S. Miller <davem@davemloft.net>
    • Ming Lei's avatar
      percpu: use raw_local_irq_* in _this_cpu op · e920d597
      Ming Lei authored
      It doesn't make sense to trace irq off or do irq flags
      lock proving inside 'this_cpu' operations, so replace local_irq_*
      with raw_local_irq_* in 'this_cpu' op.
      Also the patch fixes onelockdep warning[1] by the replacement, see
      In commit: 933393f5(percpu:
      Remove irqsafe_cpu_xxx variants), local_irq_save/restore(flags) are
      added inside this_cpu_inc operation, so that trace_hardirqs_off_caller
      will be called by trace_hardirqs_on_caller directly because
      __debug_atomic_inc is implemented as this_cpu_inc, which may trigger
      the lockdep warning[1], for example in the below ARM scenary:
      	kernel_thread_helper	/*irq disabled*/
      		->trace_hardirqs_on_caller	/*hardirqs_enabled was set*/
      			->trace_hardirqs_off_caller	/*hardirqs_enabled cleared*/
      			->trace_hardirqs_off_caller	/*irq disabled, so call here*/
      The 'unannotated irqs-on' warning will be triggered somewhere because
      irq is just enabled after the irq trace in kernel_thread_helper.
      [    0.162841] ------------[ cut here ]------------
      [    0.167694] WARNING: at kernel/lockdep.c:3493 check_flags+0xc0/0x1d0()
      [    0.174468] Modules linked in:
      [    0.177703] Backtrace:
      [    0.180328] [<c00171f0>] (dump_backtrace+0x0/0x110) from [<c0412320>] (dump_stack+0x18/0x1c)
      [    0.189086]  r6:c051f778 r5:00000da5 r4:00000000 r3:60000093
      [    0.195007] [<c0412308>] (dump_stack+0x0/0x1c) from [<c00410e8>] (warn_slowpath_common+0x54/0x6c)
      [    0.204223] [<c0041094>] (warn_slowpath_common+0x0/0x6c) from [<c0041124>] (warn_slowpath_null+0x24/0x2c)
      [    0.214111]  r8:00000000 r7:00000000 r6:ee069598 r5:60000013 r4:ee082000
      [    0.220825] r3:00000009
      [    0.223693] [<c0041100>] (warn_slowpath_null+0x0/0x2c) from [<c0088f38>] (check_flags+0xc0/0x1d0)
      [    0.232910] [<c0088e78>] (check_flags+0x0/0x1d0) from [<c008d348>] (lock_acquire+0x4c/0x11c)
      [    0.241668] [<c008d2fc>] (lock_acquire+0x0/0x11c) from [<c0415aa4>] (_raw_spin_lock+0x3c/0x74)
      [    0.250610] [<c0415a68>] (_raw_spin_lock+0x0/0x74) from [<c010a844>] (set_task_comm+0x20/0xc0)
      [    0.259521]  r6:ee069588 r5:ee0691c0 r4:ee082000
      [    0.264404] [<c010a824>] (set_task_comm+0x0/0xc0) from [<c0060780>] (kthreadd+0x28/0x108)
      [    0.272857]  r8:00000000 r7:00000013 r6:c0044a08 r5:ee0691c0 r4:ee082000
      [    0.279571] r3:ee083fe0
      [    0.282470] [<c0060758>] (kthreadd+0x0/0x108) from [<c0044a08>] (do_exit+0x0/0x6dc)
      [    0.290405]  r5:c0060758 r4:00000000
      [    0.294189] ---[ end trace 1b75b31a2719ed1c ]---
      [    0.299041] possible reason: unannotated irqs-on.
      [    0.303955] irq event stamp: 5
      [    0.307159] hardirqs last  enabled at (4): [<c001331c>] no_work_pending+0x8/0x2c
      [    0.314880] hardirqs last disabled at (5): [<c0089b08>] trace_hardirqs_on_caller+0x60/0x26c
      [    0.323547] softirqs last  enabled at (0): [<c003f754>] copy_process+0x33c/0xef4
      [    0.331207] softirqs last disabled at (0): [<  (null)>]   (null)
      [    0.337585] CPU0: thread -1, cpu 0, socket 0, mpidr 80000000
      Acked-by: default avatarChristoph Lameter <cl@linux.com>
      Signed-off-by: default avatarMing Lei <tom.leiming@gmail.com>
      Signed-off-by: default avatarTejun Heo <tj@kernel.org>
    • Konstantin Khlebnikov's avatar
      percpu: fix generic definition of __this_cpu_add_and_return() · 7d96b3e5
      Konstantin Khlebnikov authored
      This patch adds missed "__" into function prefix.
      Otherwise on all archectures (except x86) it expands to irq/preemtion-safe
      variant: _this_cpu_generic_add_return(), which do extra irq-save/irq-restore.
      Optimal generic implementation is __this_cpu_generic_add_return().
      Signed-off-by: default avatarKonstantin Khlebnikov <khlebnikov@openvz.org>
      Acked-by: default avatarChristoph Lameter <cl@linux.com>
      Signed-off-by: default avatarTejun Heo <tj@kernel.org>
    • Joerg Willmann's avatar
      netfilter: ebtables: fix alignment problem in ppc · 88ba136d
      Joerg Willmann authored
      ebt_among extension of ebtables uses __alignof__(_xt_align) while the
      corresponding kernel module uses __alignof__(ebt_replace) to determine
      the alignment in EBT_ALIGN().
      These are the results of these values on different platforms:
      x86 x86_64 ppc
      __alignof__(_xt_align) 4 8 8
      __alignof__(ebt_replace) 4 8 4
      ebtables fails to add rules which use the among extension.
      I'm using kernel 2.6.33 and ebtables 2.0.10-4
      According to Bart De Schuymer, userspace alignment was changed to
      _xt_align to fix an alignment issue on a userspace32-kernel64 system
      (he thinks it was for an ARM device). So userspace must be right.
      The kernel alignment macro needs to change so it also uses _xt_align
      instead of ebt_replace. The userspace changes date back from
      June 29, 2009.
      Signed-off-by: default avatarJoerg Willmann <joe@clnt.de>
      Signed-off by: Bart De Schuymer <bdschuym@pandora.be>
      Signed-off-by: default avatarPablo Neira Ayuso <pablo@netfilter.org>
  13. 20 Feb, 2012 1 commit
  14. 15 Feb, 2012 7 commits
    • Alexey Dobriyan's avatar
      crypto: sha512 - use standard ror64() · f2ea0f5f
      Alexey Dobriyan authored
      Use standard ror64() instead of hand-written.
      There is no standard ror64, so create it.
      The difference is shift value being "unsigned int" instead of uint64_t
      (for which there is no reason). gcc starts to emit native ROR instructions
      which it doesn't do for some reason currently. This should make the code
      Patch survives in-tree crypto test and ping flood with hmac(sha512) on.
      Signed-off-by: default avatarAlexey Dobriyan <adobriyan@gmail.com>
      Signed-off-by: default avatarHerbert Xu <herbert@gondor.apana.org.au>
    • Shawn Guo's avatar
      dt: add empty of_find_compatible_node function · 2261cc62
      Shawn Guo authored
      Add empty of_find_compatible_node function for !CONFIG_OF build.
      Signed-off-by: default avatarShawn Guo <shawn.guo@linaro.org>
      Signed-off-by: default avatarGrant Likely <grant.likely@secretlab.ca>
    • Ulisses Furquim's avatar
      Bluetooth: Remove usage of __cancel_delayed_work() · 6de32750
      Ulisses Furquim authored
      __cancel_delayed_work() is being used in some paths where we cannot
      sleep waiting for the delayed work to finish. However, that function
      might return while the timer is running and the work will be queued
      again. Replace the calls with safer cancel_delayed_work() version
      which spins until the timer handler finishes on other CPUs and
      cancels the delayed work.
      Signed-off-by: default avatarUlisses Furquim <ulisses@profusion.mobi>
      Acked-by: default avatarMarcel Holtmann <marcel@holtmann.org>
      Signed-off-by: default avatarJohan Hedberg <johan.hedberg@intel.com>
    • Andre Guedes's avatar
      Bluetooth: Fix potential deadlock · a51cd2be
      Andre Guedes authored
      We don't need to use the _sync variant in hci_conn_hold and
      hci_conn_put to cancel conn->disc_work delayed work. This way
      we avoid potential deadlocks like this one reported by lockdep.
      [ INFO: possible circular locking dependency detected ]
      3.2.0+ #1 Not tainted
      kworker/u:1/17 is trying to acquire lock:
       (&hdev->lock){+.+.+.}, at: [<ffffffffa0041155>] hci_conn_timeout+0x62/0x158 [bluetooth]
      but task is already holding lock:
       ((&(&conn->disc_work)->work)){+.+...}, at: [<ffffffff81035751>] process_one_work+0x11a/0x2bf
      which lock already depends on the new lock.
      the existing dependency chain (in reverse order) is:
      -> #2 ((&(&conn->disc_work)->work)){+.+...}:
             [<ffffffff81057444>] lock_acquire+0x8a/0xa7
             [<ffffffff81034ed1>] wait_on_work+0x3d/0xaa
             [<ffffffff81035b54>] __cancel_work_timer+0xac/0xef
             [<ffffffff81035ba4>] cancel_delayed_work_sync+0xd/0xf
             [<ffffffffa00554b0>] smp_chan_create+0xde/0xe6 [bluetooth]
             [<ffffffffa0056160>] smp_conn_security+0xa3/0x12d [bluetooth]
             [<ffffffffa0053640>] l2cap_connect_cfm+0x237/0x2e8 [bluetooth]
             [<ffffffffa004239c>] hci_proto_connect_cfm+0x2d/0x6f [bluetooth]
             [<ffffffffa0046ea5>] hci_event_packet+0x29d1/0x2d60 [bluetooth]
             [<ffffffffa003dde3>] hci_rx_work+0xd0/0x2e1 [bluetooth]
             [<ffffffff810357af>] process_one_work+0x178/0x2bf
             [<ffffffff81036178>] worker_thread+0xce/0x152
             [<ffffffff81039a03>] kthread+0x95/0x9d
             [<ffffffff812e7754>] kernel_thread_helper+0x4/0x10
      -> #1 (slock-AF_BLUETOOTH-BTPROTO_L2CAP){+.+...}:
             [<ffffffff81057444>] lock_acquire+0x8a/0xa7
             [<ffffffff812e553a>] _raw_spin_lock_bh+0x36/0x6a
             [<ffffffff81244d56>] lock_sock_nested+0x24/0x7f
             [<ffffffffa004d96f>] lock_sock+0xb/0xd [bluetooth]
             [<ffffffffa0052906>] l2cap_chan_connect+0xa9/0x26f [bluetooth]
             [<ffffffffa00545f8>] l2cap_sock_connect+0xb3/0xff [bluetooth]
             [<ffffffff81243b48>] sys_connect+0x69/0x8a
             [<ffffffff812e6579>] system_call_fastpath+0x16/0x1b
      -> #0 (&hdev->lock){+.+.+.}:
             [<ffffffff81056d06>] __lock_acquire+0xa80/0xd74
             [<ffffffff81057444>] lock_acquire+0x8a/0xa7
             [<ffffffff812e3870>] __mutex_lock_common+0x48/0x38e
             [<ffffffff812e3c75>] mutex_lock_nested+0x2a/0x31
             [<ffffffffa0041155>] hci_conn_timeout+0x62/0x158 [bluetooth]
             [<ffffffff810357af>] process_one_work+0x178/0x2bf
             [<ffffffff81036178>] worker_thread+0xce/0x152
             [<ffffffff81039a03>] kthread+0x95/0x9d
             [<ffffffff812e7754>] kernel_thread_helper+0x4/0x10
      other info that might help us debug this:
      Chain exists of:
        &hdev->lock --> slock-AF_BLUETOOTH-BTPROTO_L2CAP --> (&(&conn->disc_work)->work)
       Possible unsafe locking scenario:
             CPU0                    CPU1
             ----                    ----
       *** DEADLOCK ***
      2 locks held by kworker/u:1/17:
       #0:  (hdev->name){.+.+.+}, at: [<ffffffff81035751>] process_one_work+0x11a/0x2bf
       #1:  ((&(&conn->disc_work)->work)){+.+...}, at: [<ffffffff81035751>] process_one_work+0x11a/0x2bf
      stack backtrace:
      Pid: 17, comm: kworker/u:1 Not tainted 3.2.0+ #1
      Call Trace:
       [<ffffffff812e06c6>] print_circular_bug+0x1f8/0x209
       [<ffffffff81056d06>] __lock_acquire+0xa80/0xd74
       [<ffffffff81021ef2>] ? arch_local_irq_restore+0x6/0xd
       [<ffffffff81022bc7>] ? vprintk+0x3f9/0x41e
       [<ffffffff81057444>] lock_acquire+0x8a/0xa7
       [<ffffffffa0041155>] ? hci_conn_timeout+0x62/0x158 [bluetooth]
       [<ffffffff812e3870>] __mutex_lock_common+0x48/0x38e
       [<ffffffffa0041155>] ? hci_conn_timeout+0x62/0x158 [bluetooth]
       [<ffffffff81190fd6>] ? __dynamic_pr_debug+0x6d/0x6f
       [<ffffffffa0041155>] ? hci_conn_timeout+0x62/0x158 [bluetooth]
       [<ffffffff8105320f>] ? trace_hardirqs_off+0xd/0xf
       [<ffffffff812e3c75>] mutex_lock_nested+0x2a/0x31
       [<ffffffffa0041155>] hci_conn_timeout+0x62/0x158 [bluetooth]
       [<ffffffff810357af>] process_one_work+0x178/0x2bf
       [<ffffffff81035751>] ? process_one_work+0x11a/0x2bf
       [<ffffffff81055af3>] ? lock_acquired+0x1d0/0x1df
       [<ffffffffa00410f3>] ? hci_acl_disconn+0x65/0x65 [bluetooth]
       [<ffffffff81036178>] worker_thread+0xce/0x152
       [<ffffffff810407ed>] ? finish_task_switch+0x45/0xc5
       [<ffffffff810360aa>] ? manage_workers.isra.25+0x16a/0x16a
       [<ffffffff81039a03>] kthread+0x95/0x9d
       [<ffffffff812e7754>] kernel_thread_helper+0x4/0x10
       [<ffffffff812e5db4>] ? retint_restore_args+0x13/0x13
       [<ffffffff8103996e>] ? __init_kthread_worker+0x55/0x55
       [<ffffffff812e7750>] ? gs_change+0x13/0x13
      Signed-off-by: default avatarAndre Guedes <andre.guedes@openbossa.org>
      Signed-off-by: default avatarVinicius Costa Gomes <vinicius.gomes@openbossa.org>
      Reviewed-by: default avatarUlisses Furquim <ulisses@profusion.mobi>
      Acked-by: default avatarMarcel Holtmann <marcel@holtmann.org>
      Signed-off-by: default avatarJohan Hedberg <johan.hedberg@intel.com>
    • Octavian Purdila's avatar
      Bluetooth: silence lockdep warning · b5a30dda
      Octavian Purdila authored
      Since bluetooth uses multiple protocols types, to avoid lockdep
      warnings, we need to use different lockdep classes (one for each
      protocol type).
      This is already done in bt_sock_create but it misses a couple of cases
      when new connections are created. This patch corrects that to fix the
      following warning:
      <4>[ 1864.732366] =======================================================
      <4>[ 1864.733030] [ INFO: possible circular locking dependency detected ]
      <4>[ 1864.733544] 3.0.16-mid3-00007-gc9a0f62 #3
      <4>[ 1864.733883] -------------------------------------------------------
      <4>[ 1864.734408] t.android.btclc/4204 is trying to acquire lock:
      <4>[ 1864.734869]  (rfcomm_mutex){+.+.+.}, at: [<c14970ea>] rfcomm_dlc_close+0x15/0x30
      <4>[ 1864.735541]
      <4>[ 1864.735549] but task is already holding lock:
      <4>[ 1864.736045]  (sk_lock-AF_BLUETOOTH){+.+.+.}, at: [<c1498bf7>] lock_sock+0xa/0xc
      <4>[ 1864.736732]
      <4>[ 1864.736740] which lock already depends on the new lock.
      <4>[ 1864.736750]
      <4>[ 1864.737428]
      <4>[ 1864.737437] the existing dependency chain (in reverse order) is:
      <4>[ 1864.738016]
      <4>[ 1864.738023] -> #1 (sk_lock-AF_BLUETOOTH){+.+.+.}:
      <4>[ 1864.738549]        [<c1062273>] lock_acquire+0x104/0x140
      <4>[ 1864.738977]        [<c13d35c1>] lock_sock_nested+0x58/0x68
      <4>[ 1864.739411]        [<c1493c33>] l2cap_sock_sendmsg+0x3e/0x76
      <4>[ 1864.739858]        [<c13d06c3>] __sock_sendmsg+0x50/0x59
      <4>[ 1864.740279]        [<c13d0ea2>] sock_sendmsg+0x94/0xa8
      <4>[ 1864.740687]        [<c13d0ede>] kernel_sendmsg+0x28/0x37
      <4>[ 1864.741106]        [<c14969ca>] rfcomm_send_frame+0x30/0x38
      <4>[ 1864.741542]        [<c1496a2a>] rfcomm_send_ua+0x58/0x5a
      <4>[ 1864.741959]        [<c1498447>] rfcomm_run+0x441/0xb52
      <4>[ 1864.742365]        [<c104f095>] kthread+0x63/0x68
      <4>[ 1864.742742]        [<c14d5182>] kernel_thread_helper+0x6/0xd
      <4>[ 1864.743187]
      <4>[ 1864.743193] -> #0 (rfcomm_mutex){+.+.+.}:
      <4>[ 1864.743667]        [<c1061ada>] __lock_acquire+0x988/0xc00
      <4>[ 1864.744100]        [<c1062273>] lock_acquire+0x104/0x140
      <4>[ 1864.744519]        [<c14d2c70>] __mutex_lock_common+0x3b/0x33f
      <4>[ 1864.744975]        [<c14d303e>] mutex_lock_nested+0x2d/0x36
      <4>[ 1864.745412]        [<c14970ea>] rfcomm_dlc_close+0x15/0x30
      <4>[ 1864.745842]        [<c14990d9>] __rfcomm_sock_close+0x5f/0x6b
      <4>[ 1864.746288]        [<c1499114>] rfcomm_sock_shutdown+0x2f/0x62
      <4>[ 1864.746737]        [<c13d275d>] sys_socketcall+0x1db/0x422
      <4>[ 1864.747165]        [<c14d42f0>] syscall_call+0x7/0xb
      Signed-off-by: default avatarOctavian Purdila <octavian.purdila@intel.com>
      Acked-by: default avatarMarcel Holtmann <marcel@holtmann.org>
      Signed-off-by: default avatarJohan Hedberg <johan.hedberg@intel.com>
    • Vinicius Costa Gomes's avatar
      Bluetooth: Fix using an absolute timeout on hci_conn_put() · 33166063
      Vinicius Costa Gomes authored
      queue_delayed_work() expects a relative time for when that work
      should be scheduled.
      Signed-off-by: default avatarVinicius Costa Gomes <vinicius.gomes@openbossa.org>
      Acked-by: default avatarMarcel Holtmann <marcel@holtmann.org>
      Signed-off-by: default avatarJohan Hedberg <johan.hedberg@intel.com>
    • Andrzej Kaczmarek's avatar
      Bluetooth: l2cap_set_timer needs jiffies as timeout value · 6e1da683
      Andrzej Kaczmarek authored
      After moving L2CAP timers to workqueues l2cap_set_timer expects timeout
      value to be specified in jiffies but constants defined in miliseconds
      are used. This makes timeouts unreliable when CONFIG_HZ is not set to
      __set_chan_timer macro still uses jiffies as input to avoid multiple
      conversions from/to jiffies for sk_sndtimeo value which is already
      specified in jiffies.
      Signed-off-by: default avatarAndrzej Kaczmarek <andrzej.kaczmarek@tieto.com>
      Ackec-by: default avatarMarcel Holtmann <marcel@holtmann.org>
      Signed-off-by: default avatarJohan Hedberg <johan.hedberg@intel.com>