1. 27 May, 2015 8 commits
    • kernel/module.c: avoid ifdefs for sig_enforce declaration · 6727bb9c
      Luis R. Rodriguez authored
      There's no need to require an ifdef over the declaration
      of sig_enforce as IS_ENABLED() can be used. While at it,
      there's no harm in exposing this kernel parameter outside of
      CONFIG_MODULE_SIG as it'd be a no-op on non module sig
      kernels.
      
      Now, technically we should in theory be able to remove
      the #ifdef'ery over the declaration of the module parameter
      as we are also trusting the bool_enable_only code for
      CONFIG_MODULE_SIG kernels but for now remain paranoid
      and keep it.
      
      With time if no one can put a bullet through bool_enable_only
      and if there are no technical requirements over not exposing
      CONFIG_MODULE_SIG_FORCE with the measures in place by
      bool_enable_only we could remove this last ifdef.
      
      Cc: Rusty Russell <rusty@rustcorp.com.au>
      Cc: Andrew Morton <akpm@linux-foundation.org>
      Cc: Kees Cook <keescook@chromium.org>
      Cc: Tejun Heo <tj@kernel.org>
      Cc: Ingo Molnar <mingo@kernel.org>
      Cc: linux-kernel@vger.kernel.org
      Cc: cocci@systeme.lip6.fr
      Signed-off-by: Luis R. Rodriguez <mcgrof@suse.com>
      Signed-off-by: Rusty Russell <rusty@rustcorp.com.au>
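The idea of replacing an #ifdef around a declaration with IS_ENABLED() can be sketched in user space. This is a simplified illustration, not the kernel's header: the macro names and the CONFIG_DEMO_* options are hypothetical, and the trick only recognises options defined to exactly 1.

```c
#include <stdbool.h>

/*
 * Simplified user-space variant of the preprocessor trick behind
 * IS_ENABLED(). All names here are hypothetical stand-ins.
 * If the option is defined to 1, the pasted token expands to "0," and
 * shifts a literal 1 into the second argument slot; otherwise the
 * pasted token is junk and the second argument is 0.
 */
#define SKETCH_PLACEHOLDER_1 0,
#define SKETCH_SECOND_ARG(ignored, val, ...) val
#define SKETCH_IS_ENABLED(option) SKETCH_IS_ENABLED_(option)
#define SKETCH_IS_ENABLED_(val) SKETCH_IS_ENABLED__(SKETCH_PLACEHOLDER_##val)
#define SKETCH_IS_ENABLED__(arg1_or_junk) \
	SKETCH_SECOND_ARG(arg1_or_junk 1, 0, 0)

#define CONFIG_DEMO_SIG_FORCE 1	/* hypothetical option, enabled */
/* CONFIG_DEMO_OTHER is deliberately left undefined */

/* One unconditional declaration instead of an #ifdef/#else pair. */
static bool sig_enforce_demo = SKETCH_IS_ENABLED(CONFIG_DEMO_SIG_FORCE);
static bool other_demo = SKETCH_IS_ENABLED(CONFIG_DEMO_OTHER);
```

Because the macro always expands to a constant 0 or 1, the declaration compiles in both configurations and the compiler can still discard dead branches that test it.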
    • kernel/params.c: generalize bool_enable_only · d19f05d8
      Luis R. Rodriguez authored
      This takes out the bool_enable_only implementation from
      the module loading code and generalizes it so that others
      can make use of it.
      
      Cc: Rusty Russell <rusty@rustcorp.com.au>
      Cc: Jani Nikula <jani.nikula@intel.com>
      Cc: Andrew Morton <akpm@linux-foundation.org>
      Cc: Kees Cook <keescook@chromium.org>
      Cc: Tejun Heo <tj@kernel.org>
      Cc: Ingo Molnar <mingo@kernel.org>
      Cc: linux-kernel@vger.kernel.org
      Cc: cocci@systeme.lip6.fr
      Signed-off-by: Luis R. Rodriguez <mcgrof@suse.com>
      Signed-off-by: Rusty Russell <rusty@rustcorp.com.au>
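An "enable only" boolean parameter can be set to true but never cleared afterwards. The following is a minimal user-space sketch of that behaviour; parse_bool_sketch() and set_bool_enable_only_sketch() are hypothetical names, and the accepted spellings and the -EROFS return value are assumptions made for illustration.

```c
#include <errno.h>
#include <stdbool.h>
#include <string.h>

/* Hypothetical stand-in for a boolean parser; accepts 1/y/Y and 0/n/N. */
static int parse_bool_sketch(const char *val, bool *out)
{
	if (!strcmp(val, "1") || !strcmp(val, "y") || !strcmp(val, "Y")) {
		*out = true;
		return 0;
	}
	if (!strcmp(val, "0") || !strcmp(val, "n") || !strcmp(val, "N")) {
		*out = false;
		return 0;
	}
	return -EINVAL;
}

/* Set *flag from val, but refuse to clear it once it has been set. */
static int set_bool_enable_only_sketch(const char *val, bool *flag)
{
	bool new_value;
	int err = parse_bool_sketch(val, &new_value);

	if (err)
		return err;

	/* Don't allow a true flag to be unset. */
	if (!new_value && *flag)
		return -EROFS;

	if (new_value)
		*flag = true;

	return 0;
}
```

Parsing into a temporary first means a rejected write never touches the live flag.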
    • kernel/module.c: use generic module param operators for sig_enforce · 05f408dd
      Luis R. Rodriguez authored
      We're directly checking and modifying sig_enforce when needed instead
      of using the generic helpers. This prevents us from generalizing this
      helper so that others can use it. Use indirect helpers to allow us
      to generalize this code a bit and to make it a bit more clear what
      this is doing.
      
      Cc: Rusty Russell <rusty@rustcorp.com.au>
      Cc: Jani Nikula <jani.nikula@intel.com>
      Cc: Andrew Morton <akpm@linux-foundation.org>
      Cc: Kees Cook <keescook@chromium.org>
      Cc: Tejun Heo <tj@kernel.org>
      Cc: Ingo Molnar <mingo@kernel.org>
      Cc: linux-kernel@vger.kernel.org
      Cc: cocci@systeme.lip6.fr
      Signed-off-by: Luis R. Rodriguez <mcgrof@suse.com>
      Signed-off-by: Rusty Russell <rusty@rustcorp.com.au>
    • module: Rework module_addr_{min,max} · 4f666546
      Peter Zijlstra authored
      __module_address() does an initial bound check before doing the
      {list/tree} iteration to find the actual module. The bound variables
      are nowhere near the mod_tree cacheline, in fact they're nowhere near
      one another.
      
      module_addr_min lives in .data while module_addr_max lives in .bss
      (smarty pants GCC thinks the explicit 0 assignment is a mistake).
      
      Rectify this by moving the two variables into a structure together
      with the latch_tree_root to guarantee they all share the same
      cacheline and avoid hitting two extra cachelines for the lookup.
      
      While reworking the bounds code, move the bound update from allocation
      to insertion time, this avoids updating the bounds for a few error
      paths.
      
      Cc: Rusty Russell <rusty@rustcorp.com.au>
      Signed-off-by: Peter Zijlstra (Intel) <peterz@infradead.org>
      Signed-off-by: Rusty Russell <rusty@rustcorp.com.au>
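Grouping the bounds with the tree root, as described above, can be sketched as a single aligned structure. This is a user-space illustration under stated assumptions: the 64-byte cache line, the tree_root_stub placeholder, and all *_sketch names are hypothetical and do not reproduce the kernel's definitions.

```c
#include <limits.h>
#include <stdalign.h>
#include <stdbool.h>
#include <stddef.h>

#define SKETCH_CACHELINE 64	/* assumed cache line size */

/* Placeholder for the real tree root; only its presence matters here. */
struct tree_root_stub {
	void *node[2];
};

/* Root and both bounds live in one cache-line-aligned structure. */
struct mod_tree_sketch {
	alignas(SKETCH_CACHELINE) struct tree_root_stub root;
	unsigned long addr_min;
	unsigned long addr_max;
};

static struct mod_tree_sketch mod_tree = {
	.addr_min = ULONG_MAX,	/* empty range until the first insertion */
	.addr_max = 0,
};

/* Widen the bounds at insertion time, not at allocation time. */
static void bounds_update_sketch(struct mod_tree_sketch *t,
				 unsigned long base, unsigned long size)
{
	if (base < t->addr_min)
		t->addr_min = base;
	if (base + size > t->addr_max)
		t->addr_max = base + size;
}

/* Cheap pre-check done before any list or tree walk. */
static bool maybe_module_address_sketch(const struct mod_tree_sketch *t,
					unsigned long addr)
{
	return addr >= t->addr_min && addr <= t->addr_max;
}
```

With this layout the pre-check and the start of the tree walk touch the same cache line, provided the root placeholder is small enough for the bounds to fit behind it.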
    • module: Use __module_address() for module_address_lookup() · b7df4d1b
      Peter Zijlstra authored
      Use the generic __module_address() addr to struct module lookup
      instead of open coding it once more.
      
      Cc: Rusty Russell <rusty@rustcorp.com.au>
      Signed-off-by: Peter Zijlstra (Intel) <peterz@infradead.org>
      Signed-off-by: Rusty Russell <rusty@rustcorp.com.au>
    • module: Make the mod_tree stuff conditional on PERF_EVENTS || TRACING · 6c9692e2
      Peter Zijlstra authored
      Andrew worried about the overhead on small systems; only use the fancy
      code when either perf or tracing is enabled.
      
      Cc: Rusty Russell <rusty@rustcorp.com.au>
      Cc: Steven Rostedt <rostedt@goodmis.org>
      Requested-by: Andrew Morton <akpm@linux-foundation.org>
      Signed-off-by: Peter Zijlstra (Intel) <peterz@infradead.org>
      Signed-off-by: Rusty Russell <rusty@rustcorp.com.au>
    • module: Optimize __module_address() using a latched RB-tree · 93c2e105
      Peter Zijlstra authored
      Currently __module_address() is using a linear search through all
      modules in order to find the module corresponding to the provided
      address. With a lot of modules this can take a lot of time.
      
      One of the users of this is kernel_text_address() which is employed
      in many stack unwinders; which in turn are used by perf-callchain and
      ftrace (possibly from NMI context).
      
      So by optimizing __module_address() we optimize many stack unwinders
      which are used by both perf and tracing in performance sensitive code.
      
      Cc: Rusty Russell <rusty@rustcorp.com.au>
      Cc: Steven Rostedt <rostedt@goodmis.org>
      Cc: Mathieu Desnoyers <mathieu.desnoyers@efficios.com>
      Cc: Oleg Nesterov <oleg@redhat.com>
      Cc: "Paul E. McKenney" <paulmck@linux.vnet.ibm.com>
      Signed-off-by: Peter Zijlstra (Intel) <peterz@infradead.org>
      Signed-off-by: Rusty Russell <rusty@rustcorp.com.au>
    • module: Sanitize RCU usage and locking · 0be964be
      Peter Zijlstra authored
      Currently the RCU usage in module is an inconsistent mess of RCU and
      RCU-sched, this is broken for CONFIG_PREEMPT where synchronize_rcu()
      does not imply synchronize_sched().
      
      Most usage sites use preempt_{dis,en}able() which is RCU-sched, but
      (most of) the modification sites use synchronize_rcu(). With the
      exception of the module bug list, which actually uses RCU.
      
      Convert everything over to RCU-sched.
      
      Furthermore add lockdep asserts to all sites, because it's not at all
      clear to me the required locking is observed, esp. on exported
      functions.
      
      Cc: Rusty Russell <rusty@rustcorp.com.au>
      Acked-by: "Paul E. McKenney" <paulmck@linux.vnet.ibm.com>
      Signed-off-by: Peter Zijlstra (Intel) <peterz@infradead.org>
      Signed-off-by: Rusty Russell <rusty@rustcorp.com.au>
  2. 26 May, 2015 1 commit
    • module: Annotate module version magic · 926a59b1
      Peter Zijlstra authored
      Due to the new lockdep checks in the coming patch, we go:
      
      [    9.759380] ------------[ cut here ]------------
      [    9.759389] WARNING: CPU: 31 PID: 597 at ../kernel/module.c:216 each_symbol_section+0x121/0x130()
      [    9.759391] Modules linked in:
      [    9.759393] CPU: 31 PID: 597 Comm: modprobe Not tainted 4.0.0-rc1+ #65
      [    9.759393] Hardware name: Intel Corporation S2600GZ/S2600GZ, BIOS SE5C600.86B.02.02.0002.122320131210 12/23/2013
      [    9.759396]  ffffffff817d8676 ffff880424567ca8 ffffffff8157e98b 0000000000000001
      [    9.759398]  0000000000000000 ffff880424567ce8 ffffffff8105fbc7 ffff880424567cd8
      [    9.759400]  0000000000000000 ffffffff810ec160 ffff880424567d40 0000000000000000
      [    9.759400] Call Trace:
      [    9.759407]  [<ffffffff8157e98b>] dump_stack+0x4f/0x7b
      [    9.759410]  [<ffffffff8105fbc7>] warn_slowpath_common+0x97/0xe0
      [    9.759412]  [<ffffffff810ec160>] ? section_objs+0x60/0x60
      [    9.759414]  [<ffffffff8105fc2a>] warn_slowpath_null+0x1a/0x20
      [    9.759415]  [<ffffffff810ed9c1>] each_symbol_section+0x121/0x130
      [    9.759417]  [<ffffffff810eda01>] find_symbol+0x31/0x70
      [    9.759420]  [<ffffffff810ef5bf>] load_module+0x20f/0x2660
      [    9.759422]  [<ffffffff8104ef10>] ? __do_page_fault+0x190/0x4e0
      [    9.759426]  [<ffffffff815880ec>] ? retint_restore_args+0x13/0x13
      [    9.759427]  [<ffffffff815880ec>] ? retint_restore_args+0x13/0x13
      [    9.759433]  [<ffffffff810ae73d>] ? trace_hardirqs_on_caller+0x11d/0x1e0
      [    9.759437]  [<ffffffff812fcc0e>] ? trace_hardirqs_on_thunk+0x3a/0x3f
      [    9.759439]  [<ffffffff815880ec>] ? retint_restore_args+0x13/0x13
      [    9.759441]  [<ffffffff810f1ade>] SyS_init_module+0xce/0x100
      [    9.759443]  [<ffffffff81587429>] system_call_fastpath+0x12/0x17
      [    9.759445] ---[ end trace 9294429076a9c644 ]---
      
      As per the comment this site should be fine, but let's wrap it in
      preempt_disable() anyhow to placate lockdep.
      
      Cc: Rusty Russell <rusty@rustcorp.com.au>
      Acked-by: Paul E. McKenney <paulmck@linux.vnet.ibm.com>
      Signed-off-by: Peter Zijlstra (Intel) <peterz@infradead.org>
      Signed-off-by: Rusty Russell <rusty@rustcorp.com.au>
  3. 08 Apr, 2015 2 commits
  4. 23 Mar, 2015 3 commits
    • module: do not print allocation-fail warning on bogus user buffer size · cc9e605d
      Kirill A. Shutemov authored
      init_module(2) passes the user-specified buffer length directly to
      vmalloc(). This makes warn_alloc_failed() print a lot of info into
      dmesg if the user specifies an insane size, like -1.
      
      Let's silence the warning. It doesn't add much value to the -ENOMEM return
      code. Without the patch the syscall is prohibitively noisy for testing
      with trinity.
      Signed-off-by: Kirill A. Shutemov <kirill.shutemov@linux.intel.com>
      Cc: Dave Jones <davej@codemonkey.org.uk>
      Cc: Sasha Levin <sasha.levin@oracle.com>
      Signed-off-by: Rusty Russell <rusty@rustcorp.com.au>
    • kernel/module.c: fix typos in message about unused symbols · 7b63c3ab
      Yannick Guerrini authored
      Fix typos in pr_warn message about unused symbols
      Signed-off-by: Yannick Guerrini <yguerrini@tomshardware.fr>
      Signed-off-by: Rusty Russell <rusty@rustcorp.com.au>
    • lockdep: Fix the module unload key range freeing logic · 35a9393c
      Peter Zijlstra authored
      Module unload calls lockdep_free_key_range(), which removes entries
      from the data structures. Most of the lockdep code OTOH assumes the
      data structures are append only; in specific see the comments in
      add_lock_to_list() and look_up_lock_class().
      
      Clearly this has only worked by accident; make it work properly. The
      actual scenario to make it go boom would involve the memory freed by
      the module unload being re-allocated and re-used for a lock inside of
      an rcu-sched grace period. This is a very unlikely scenario, still
      better plug the hole.
      
      Use RCU list iteration in all places and amend the comments.
      
      Change lockdep_free_key_range() to issue a sync_sched() between
      removal from the lists and returning -- which results in the memory
      being freed. Further ensure the callers are placed correctly and
      comment the requirements.
      Signed-off-by: Peter Zijlstra (Intel) <peterz@infradead.org>
      Cc: Andrew Morton <akpm@linux-foundation.org>
      Cc: Andrey Tsyvarev <tsyvarev@ispras.ru>
      Cc: Linus Torvalds <torvalds@linux-foundation.org>
      Cc: Paul E. McKenney <paulmck@linux.vnet.ibm.com>
      Cc: Peter Zijlstra <peterz@infradead.org>
      Cc: Rusty Russell <rusty@rustcorp.com.au>
      Cc: Thomas Gleixner <tglx@linutronix.de>
      Signed-off-by: Ingo Molnar <mingo@kernel.org>
  5. 12 Mar, 2015 1 commit
    • kasan, module, vmalloc: rework shadow allocation for modules · a5af5aa8
      Andrey Ryabinin authored
      Current approach in handling shadow memory for modules is broken.
      
      Shadow memory may be freed only after the memory it shadows is no
      longer used.  vfree() called from interrupt context could use the memory it is
      freeing to store a 'struct llist_node' in it:
      
          void vfree(const void *addr)
          {
          ...
              if (unlikely(in_interrupt())) {
                  struct vfree_deferred *p = this_cpu_ptr(&vfree_deferred);
                  if (llist_add((struct llist_node *)addr, &p->list))
                          schedule_work(&p->wq);
      
      Later this list node is used in free_work(), which actually frees the memory.
      Currently module_memfree() called in interrupt context will free the shadow
      before freeing the module's memory, which could provoke a kernel crash.
      
      So shadow memory should be freed after module's memory.  However, such
      deallocation order could race with kasan_module_alloc() in module_alloc().
      
      Free the shadow right before releasing the vm area.  At this point the vfree()'d
      memory is not used anymore and not yet available for other allocations.
      A new VM_KASAN flag is used to indicate that the vm area has dynamically allocated
      shadow memory, so kasan frees the shadow only if it was previously allocated.
      Signed-off-by: Andrey Ryabinin <a.ryabinin@samsung.com>
      Acked-by: Rusty Russell <rusty@rustcorp.com.au>
      Cc: Dmitry Vyukov <dvyukov@google.com>
      Signed-off-by: Andrew Morton <akpm@linux-foundation.org>
      Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>
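The ordering problem above comes from a deferred free that stores its list node inside the very block being freed. A minimal user-space sketch of that pattern, with hypothetical names and malloc()/free() standing in for the kernel allocator, shows why the block (and anything describing it, such as shadow memory) must stay valid until the list is drained:

```c
#include <stddef.h>
#include <stdlib.h>

/* The list node is overlaid on the start of the block being freed. */
struct deferred_node {
	struct deferred_node *next;
};

static struct deferred_node *deferred_list;

/*
 * Queue a block for later freeing. The node is written INTO the block,
 * so the block is still being accessed after the "free" was requested.
 * The block must be at least sizeof(struct deferred_node) bytes.
 */
static void free_deferred_sketch(void *block)
{
	struct deferred_node *n = block;

	n->next = deferred_list;
	deferred_list = n;
}

/*
 * Later: walk the list and really free each block. n->next is read
 * from every block before it is released. Returns the number freed.
 */
static size_t drain_deferred_sketch(void)
{
	struct deferred_node *n = deferred_list;
	size_t freed = 0;

	deferred_list = NULL;
	while (n) {
		struct deferred_node *next = n->next;

		free(n);
		freed++;
		n = next;
	}
	return freed;
}
```

Anything that marks the block as inaccessible at queue time would make the later read of n->next look like a use-after-free, which is the crash scenario the commit message describes.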
  6. 06 Mar, 2015 1 commit
  7. 17 Feb, 2015 1 commit
  8. 13 Feb, 2015 1 commit
    • kasan: enable instrumentation of global variables · bebf56a1
      Andrey Ryabinin authored
      This feature lets us detect out-of-bounds accesses to global variables.
      It works both for globals in the kernel image and for globals in modules.
      Currently this won't work for symbols in user-specified sections (e.g.
      __init, __read_mostly, ...)
      
      The idea is simple.  The compiler enlarges each global variable by the
      redzone size and adds constructors invoking the __asan_register_globals()
      function.  Information about each global variable (address, size, size with
      redzone ...) is passed to __asan_register_globals() so we can poison the
      variable's redzone.
      
      This patch also forces module_alloc() to return an 8*PAGE_SIZE aligned
      address, making shadow memory handling
      (kasan_module_alloc()/kasan_module_free()) simpler.  Such alignment
      guarantees that each shadow page backing the modules address space corresponds
      to only one module_alloc() allocation.
      Signed-off-by: Andrey Ryabinin <a.ryabinin@samsung.com>
      Cc: Dmitry Vyukov <dvyukov@google.com>
      Cc: Konstantin Serebryany <kcc@google.com>
      Cc: Dmitry Chernenkov <dmitryc@google.com>
      Signed-off-by: Andrey Konovalov <adech.fo@gmail.com>
      Cc: Yuri Gribov <tetra2005@gmail.com>
      Cc: Konstantin Khlebnikov <koct9i@gmail.com>
      Cc: Sasha Levin <sasha.levin@oracle.com>
      Cc: Christoph Lameter <cl@linux.com>
      Cc: Joonsoo Kim <iamjoonsoo.kim@lge.com>
      Cc: Dave Hansen <dave.hansen@intel.com>
      Cc: Andi Kleen <andi@firstfloor.org>
      Cc: Ingo Molnar <mingo@elte.hu>
      Cc: Thomas Gleixner <tglx@linutronix.de>
      Cc: "H. Peter Anvin" <hpa@zytor.com>
      Cc: Christoph Lameter <cl@linux.com>
      Cc: Pekka Enberg <penberg@kernel.org>
      Cc: David Rientjes <rientjes@google.com>
      Signed-off-by: Andrew Morton <akpm@linux-foundation.org>
      Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>
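The 8*PAGE_SIZE alignment argument can be checked with a little arithmetic. The sketch below assumes a 4096-byte page and one shadow byte per 8 bytes of memory, and it leaves out the constant shadow offset, which does not change which allocations share a shadow page. All names are hypothetical.

```c
#include <stdbool.h>
#include <stdint.h>

#define SKETCH_PAGE_SIZE 4096UL	/* assumed page size */
#define SKETCH_SHADOW_SHIFT 3	/* assumed: 1 shadow byte per 8 bytes */

/* Index of the shadow page that covers addr (shadow offset omitted). */
static uintptr_t shadow_page_of(uintptr_t addr)
{
	return (addr >> SKETCH_SHADOW_SHIFT) / SKETCH_PAGE_SIZE;
}

/* Round x up to a multiple of a, where a is a power of two. */
static uintptr_t align_up_sketch(uintptr_t x, uintptr_t a)
{
	return (x + a - 1) & ~(a - 1);
}

/* True if the last byte of allocation A and the first byte of B are
 * covered by the same shadow page. */
static bool shares_shadow_page(uintptr_t a_start, uintptr_t a_size,
			       uintptr_t b_start)
{
	return shadow_page_of(a_start + a_size - 1) == shadow_page_of(b_start);
}
```

One shadow page covers 8 * 4096 bytes, so two merely page-aligned neighbours can land on the same shadow page, while starting every allocation on an 8*PAGE_SIZE boundary gives each one its own shadow pages and lets them be allocated and freed independently.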
  9. 10 Feb, 2015 2 commits
  10. 05 Feb, 2015 2 commits
  11. 21 Jan, 2015 1 commit
  12. 19 Jan, 2015 3 commits
    • module: fix race in kallsyms resolution during module load success. · c7496379
      Rusty Russell authored
      The kallsyms routines (module_symbol_name, lookup_module_* etc) disable
      preemption to walk the modules rather than taking the module_mutex:
      this is because they are used for symbol resolution during oopses.
      
      This works because there are synchronize_sched() and synchronize_rcu()
      in the unload and failure paths.  However, there's one case which doesn't
      have that: the normal case where module loading succeeds, and we free
      the init section.
      
      We don't want a synchronize_rcu() there, because it would slow down
      module loading: this bug was introduced in 2009 to speed module
      loading in the first place.
      
      Thus, we want to do the free in an RCU callback.  We do this in the
      simplest possible way by allocating a new rcu_head: if we put it in
      the module structure we'd have to worry about that getting freed.
      Reported-by: Rui Xiang <rui.xiang@huawei.com>
      Signed-off-by: Rusty Russell <rusty@rustcorp.com.au>
    • module: remove mod arg from module_free, rename module_memfree(). · be1f221c
      Rusty Russell authored
      Nothing needs the module pointer any more, and the next patch will
      call it from RCU, where the module itself might no longer exist.
      Removing the arg is the safest approach.
      
      This just codifies the use of the module_alloc/module_free pattern
      which ftrace and bpf use.
      Signed-off-by: Rusty Russell <rusty@rustcorp.com.au>
      Acked-by: Alexei Starovoitov <ast@kernel.org>
      Cc: Mikael Starvik <starvik@axis.com>
      Cc: Jesper Nilsson <jesper.nilsson@axis.com>
      Cc: Ralf Baechle <ralf@linux-mips.org>
      Cc: Ley Foon Tan <lftan@altera.com>
      Cc: Benjamin Herrenschmidt <benh@kernel.crashing.org>
      Cc: Chris Metcalf <cmetcalf@ezchip.com>
      Cc: Steven Rostedt <rostedt@goodmis.org>
      Cc: x86@kernel.org
      Cc: Ananth N Mavinakayanahalli <ananth@in.ibm.com>
      Cc: Anil S Keshavamurthy <anil.s.keshavamurthy@intel.com>
      Cc: Masami Hiramatsu <masami.hiramatsu.pt@hitachi.com>
      Cc: linux-cris-kernel@axis.com
      Cc: linux-kernel@vger.kernel.org
      Cc: linux-mips@linux-mips.org
      Cc: nios2-dev@lists.rocketboards.org
      Cc: linuxppc-dev@lists.ozlabs.org
      Cc: sparclinux@vger.kernel.org
      Cc: netdev@vger.kernel.org
    • module_arch_freeing_init(): new hook for archs before module->module_init freed. · d453cded
      Rusty Russell authored
      Archs have been abusing module_free() to clean up their arch-specific
      allocations.  Since module_free() is also (ab)used by BPF and trace code,
      let's keep it to simple allocations, and provide a hook called before
      that.
      
      This means that avr32, ia64, parisc and s390 no longer need to implement
      their own module_free() at all.  avr32 doesn't need module_finalize()
      either.
      Signed-off-by: Rusty Russell <rusty@rustcorp.com.au>
      Cc: Chris Metcalf <cmetcalf@ezchip.com>
      Cc: Haavard Skinnemoen <hskinnemoen@gmail.com>
      Cc: Hans-Christian Egtvedt <egtvedt@samfundet.no>
      Cc: Tony Luck <tony.luck@intel.com>
      Cc: Fenghua Yu <fenghua.yu@intel.com>
      Cc: "James E.J. Bottomley" <jejb@parisc-linux.org>
      Cc: Helge Deller <deller@gmx.de>
      Cc: Martin Schwidefsky <schwidefsky@de.ibm.com>
      Cc: Heiko Carstens <heiko.carstens@de.ibm.com>
      Cc: linux-kernel@vger.kernel.org
      Cc: linux-ia64@vger.kernel.org
      Cc: linux-parisc@vger.kernel.org
      Cc: linux-s390@vger.kernel.org
  13. 10 Nov, 2014 6 commits
  14. 28 Oct, 2014 1 commit
  15. 14 Oct, 2014 1 commit
    • modules, lock around setting of MODULE_STATE_UNFORMED · d3051b48
      Prarit Bhargava authored
      A panic was seen in the following situation.
      
      There are two threads running on the system. The first thread is a system
      monitoring thread that is reading /proc/modules. The second thread is
      loading and unloading a module (in this example I'm using my simple
      dummy-module.ko).  Note, in the "real world" this occurred with the qlogic
      driver module.
      
      When doing this, the following panic occurred:
      
       ------------[ cut here ]------------
       kernel BUG at kernel/module.c:3739!
       invalid opcode: 0000 [#1] SMP
       Modules linked in: binfmt_misc sg nfsv3 rpcsec_gss_krb5 nfsv4 dns_resolver nfs fscache intel_powerclamp coretemp kvm_intel kvm crct10dif_pclmul crc32_pclmul crc32c_intel ghash_clmulni_intel aesni_intel lrw igb gf128mul glue_helper iTCO_wdt iTCO_vendor_support ablk_helper ptp sb_edac cryptd pps_core edac_core shpchp i2c_i801 pcspkr wmi lpc_ich ioatdma mfd_core dca ipmi_si nfsd ipmi_msghandler auth_rpcgss nfs_acl lockd sunrpc xfs libcrc32c sr_mod cdrom sd_mod crc_t10dif crct10dif_common mgag200 syscopyarea sysfillrect sysimgblt i2c_algo_bit drm_kms_helper ttm isci drm libsas ahci libahci scsi_transport_sas libata i2c_core dm_mirror dm_region_hash dm_log dm_mod [last unloaded: dummy_module]
       CPU: 37 PID: 186343 Comm: cat Tainted: GF          O--------------   3.10.0+ #7
       Hardware name: Intel Corporation S2600CP/S2600CP, BIOS RMLSDP.86I.00.29.D696.1311111329 11/11/2013
       task: ffff8807fd2d8000 ti: ffff88080fa7c000 task.ti: ffff88080fa7c000
       RIP: 0010:[<ffffffff810d64c5>]  [<ffffffff810d64c5>] module_flags+0xb5/0xc0
       RSP: 0018:ffff88080fa7fe18  EFLAGS: 00010246
       RAX: 0000000000000003 RBX: ffffffffa03b5200 RCX: 0000000000000000
       RDX: 0000000000001000 RSI: ffff88080fa7fe38 RDI: ffffffffa03b5000
       RBP: ffff88080fa7fe28 R08: 0000000000000010 R09: 0000000000000000
       R10: 0000000000000000 R11: 000000000000000f R12: ffffffffa03b5000
       R13: ffffffffa03b5008 R14: ffffffffa03b5200 R15: ffffffffa03b5000
       FS:  00007f6ae57ef740(0000) GS:ffff88101e7a0000(0000) knlGS:0000000000000000
       CS:  0010 DS: 0000 ES: 0000 CR0: 0000000080050033
       CR2: 0000000000404f70 CR3: 0000000ffed48000 CR4: 00000000001407e0
       DR0: 0000000000000000 DR1: 0000000000000000 DR2: 0000000000000000
       DR3: 0000000000000000 DR6: 00000000ffff0ff0 DR7: 0000000000000400
       Stack:
        ffffffffa03b5200 ffff8810101e4800 ffff88080fa7fe70 ffffffff810d666c
        ffff88081e807300 000000002e0f2fbf 0000000000000000 ffff88100f257b00
        ffffffffa03b5008 ffff88080fa7ff48 ffff8810101e4800 ffff88080fa7fee0
       Call Trace:
        [<ffffffff810d666c>] m_show+0x19c/0x1e0
        [<ffffffff811e4d7e>] seq_read+0x16e/0x3b0
        [<ffffffff812281ed>] proc_reg_read+0x3d/0x80
        [<ffffffff811c0f2c>] vfs_read+0x9c/0x170
        [<ffffffff811c1a58>] SyS_read+0x58/0xb0
        [<ffffffff81605829>] system_call_fastpath+0x16/0x1b
       Code: 48 63 c2 83 c2 01 c6 04 03 29 48 63 d2 eb d9 0f 1f 80 00 00 00 00 48 63 d2 c6 04 13 2d 41 8b 0c 24 8d 50 02 83 f9 01 75 b2 eb cb <0f> 0b 66 0f 1f 84 00 00 00 00 00 0f 1f 44 00 00 55 48 89 e5 41
       RIP  [<ffffffff810d64c5>] module_flags+0xb5/0xc0
        RSP <ffff88080fa7fe18>
      
          Consider the two processes running on the system.
      
          CPU 0 (/proc/modules reader)
          CPU 1 (loading/unloading module)
      
          CPU 0 opens /proc/modules, and starts displaying data for each module by
          traversing the modules list via fs/seq_file.c:seq_open() and
          fs/seq_file.c:seq_read().  For each module in the modules list, seq_read
          does
      
                  op->start()  <-- this is a pointer to m_start()
                  op->show()   <-- this is a pointer to m_show()
                  op->stop()   <-- this is a pointer to m_stop()
      
          The m_start(), m_show(), and m_stop() module functions are defined in
          kernel/module.c. The m_start() and m_stop() functions acquire and release
          the module_mutex respectively.
      
          i.e., when reading /proc/modules, the module_mutex is acquired and released
          for each module.
      
          m_show() is called with the module_mutex held.  It accesses the module
          struct data and attempts to write out module data.  It is in this code
          path that the above BUG_ON() warning is encountered, specifically m_show()
          calls
      
          static char *module_flags(struct module *mod, char *buf)
          {
                  int bx = 0;
      
                  BUG_ON(mod->state == MODULE_STATE_UNFORMED);
          ...
      
          The other thread, CPU 1, in unloading the module calls the syscall
          delete_module() defined in kernel/module.c.  The module_mutex is acquired
          for a short time, and then released.  free_module() is called without the
          module_mutex.  free_module() then sets mod->state = MODULE_STATE_UNFORMED,
          also without the module_mutex.  Some additional code is called and then the
          module_mutex is reacquired to remove the module from the modules list:
      
              /* Now we can delete it from the lists */
              mutex_lock(&module_mutex);
              stop_machine(__unlink_module, mod, NULL);
              mutex_unlock(&module_mutex);
      
      This is the sequence of events that leads to the panic.
      
      CPU 1 is removing dummy_module via delete_module().  It acquires the
      module_mutex, and then releases it.  CPU 1 has NOT set dummy_module->state to
      MODULE_STATE_UNFORMED yet.
      
      CPU 0, which is reading the /proc/modules, acquires the module_mutex and
      acquires a pointer to the dummy_module which is still in the modules list.
      CPU 0 calls m_show for dummy_module.  The check in m_show() for
      MODULE_STATE_UNFORMED passed for dummy_module even though it is being
      torn down.
      
      Meanwhile CPU 1, which has been continuing to remove dummy_module without
      holding the module_mutex, now calls free_module() and sets
      dummy_module->state to MODULE_STATE_UNFORMED.
      
      CPU 0 now calls module_flags() with dummy_module and ...
      
      static char *module_flags(struct module *mod, char *buf)
      {
              int bx = 0;
      
              BUG_ON(mod->state == MODULE_STATE_UNFORMED);
      
      and BOOM.
      
      Acquire and release the module_mutex lock around the setting of
      MODULE_STATE_UNFORMED in the teardown path, which should resolve the
      problem.
      
      Testing: In the unpatched kernel I can panic the system within 1 minute by
      doing
      
      while (true) do insmod dummy_module.ko; rmmod dummy_module.ko; done
      
      and
      
      while (true) do cat /proc/modules; done
      
      in separate terminals.
      
      In the patched kernel I was able to run just over one hour without seeing
      any issues.  I also verified the output of panic via sysrq-c and the output
      of /proc/modules looks correct for all three states for the dummy_module.
      
              dummy_module 12661 0 - Unloading 0xffffffffa03a5000 (OE-)
              dummy_module 12661 0 - Live 0xffffffffa03bb000 (OE)
              dummy_module 14015 1 - Loading 0xffffffffa03a5000 (OE+)
      Signed-off-by: Prarit Bhargava <prarit@redhat.com>
      Reviewed-by: Oleg Nesterov <oleg@redhat.com>
      Signed-off-by: Rusty Russell <rusty@rustcorp.com.au>
      Cc: stable@kernel.org
  16. 02 Oct, 2014 1 commit
    • aarch64: filter $x from kallsyms · 6c34f1f5
      Kyle McMartin authored
      Similar to ARM, AArch64 is generating $x and $d syms... which isn't
      terribly helpful when looking at %pF output and the like. Filter those
      out in kallsyms, modpost and when looking at module symbols.
      
      Seems simplest since none of these check EM_ARM anyway, to just add it
      to the strchr used, rather than trying to make things overly
      complicated.
      
      initcall_debug improves:
      dmesg_before.txt: initcall $x+0x0/0x154 [sg] returned 0 after 26331 usecs
      dmesg_after.txt: initcall init_sg+0x0/0x154 [sg] returned 0 after 15461 usecs
      Signed-off-by: Kyle McMartin <kyle@redhat.com>
      Acked-by: Rusty Russell <rusty@rustcorp.com.au>
      Signed-off-by: Catalin Marinas <catalin.marinas@arm.com>
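A strchr()-based filter of the kind described above might look like the following user-space sketch. The function name is hypothetical, and the letter set "axtd" is an assumption: ARM's $a, $t and $d mapping symbols plus the $x that this commit adds.

```c
#include <stdbool.h>
#include <string.h>

/*
 * Return true for mapping symbols such as "$x", "$d" or "$x.42",
 * which should be hidden from symbol lookups.
 */
static bool is_mapping_symbol_sketch(const char *name)
{
	/* Reject "$" on its own: strchr() would otherwise match the
	 * string terminator of the letter set. */
	if (name[0] != '$' || name[1] == '\0')
		return false;

	if (!strchr("axtd", name[1]))
		return false;

	/* Either the bare marker or a marker with a ".suffix". */
	return name[2] == '\0' || name[2] == '.';
}
```

Adding a new marker letter then only means extending the string passed to strchr(), with no per-architecture check.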
  17. 27 Aug, 2014 1 commit
  18. 15 Aug, 2014 1 commit
  19. 27 Jul, 2014 2 commits
  20. 03 Jul, 2014 1 commit
    • crypto: fips - only panic on bad/missing crypto mod signatures · 002c77a4
      Jarod Wilson authored
      Per further discussion with NIST, the requirements for FIPS state that
      we only need to panic the system on failed kernel module signature checks
      for crypto subsystem modules. This moves the fips-mode-only module
      signature check out of the generic module loading code, into the crypto
      subsystem, at points where we can catch both algorithm module loads and
      mode module loads. At the same time, make CONFIG_CRYPTO_FIPS dependent on
      CONFIG_MODULE_SIG, as this is entirely necessary for FIPS mode.
      
      v2: remove extraneous blank line, perform checks in static inline
      function, drop no longer necessary fips.h include.
      
      CC: "David S. Miller" <davem@davemloft.net>
      CC: Rusty Russell <rusty@rustcorp.com.au>
      CC: Stephan Mueller <stephan.mueller@atsec.com>
      Signed-off-by: Jarod Wilson <jarod@redhat.com>
      Acked-by: Neil Horman <nhorman@tuxdriver.com>
      Signed-off-by: Herbert Xu <herbert@gondor.apana.org.au>