1. 25 Oct, 2016 5 commits
    • Charlie Jacobsen's avatar
      Basic lcd module create, run, and destroy. · e0193fa4
      Charlie Jacobsen authored
      This code is ugly, but it's working.
      
      Tested with basic module, and appears to be working
      properly. I will soon incorporate the patched
      modprobe into the kernel tree, and then this code
      will be usable by everyone.
      
      The ipc code is still unimplemented. The only
      hypercall handled is yield. Also note that other
      exit conditions (e.g. external interrupt) have not
      been fully tested.
      
      Overview:
      -- kernel code calls lcd_create_as_module with
         the module's name
      -- lcd_create_as_module loads the module using
         request_lcd_module (request_lcd_module calls
         the patched modprobe to load the module, and
         the patched modprobe calls back into the lcd
         driver via the ioctrl interface to load the
         module)
      -- lcd_create_as_module then finds the loaded
         module, spawns a kernel thread and passes off
         the module to it
      -- the kernel thread initializes the lcd and
         maps the module inside it, then suspends itself
      -- lcd_run_as_module wakes up the kernel thread
         and tells it to run
      -- lcd_delete_as_module stops the kernel thread
         and deletes the module from the host kernel
      
      File-by-file details:
      
      arch/x86/include/asm/lcd-domains-arch.h
      arch/x86/lcd-domains/lcd-domains-arch-tests.c
      arch/x86/lcd-domains/lcd-domains-arch.c
      -- lcd was not running in 64-bit mode, and my
         checks had one subtle bug
      -- fixed %cr3 load to properly load vmcs first
      -- fixed set program counter to use guest virtual
         rather than guest physical address
      
      include/linux/sched.h
      -- added struct lcd to task_struct
      
      include/linux/init_task.h
      -- lcd pointer set to null when task_struct is
         initialized
      
      include/linux/module.h
      kernel/module.c
      -- made init_module and delete_module system calls
         callable from kernel code
      -- available in module.h via do_sys_init_module and
         do_sys_delete_module
      -- simply moved the majority of the guts of the
         system calls into a non-system call, exported
         routine
      -- take an extra flag, for_lcd; when set, the init
         code skips over running (and deallocating) the
         module's init code, and the delete code skips
         over running the module exit
      -- system calls from user code set for_lcd = 0; this
         ensures existing code still works
      
      include/linux/kmod.h
      kernel/kmod.c
      kernel/sysctl.c
      -- changed __request_module to __do_request_module; takes
         one extra argument, for_lcd
      -- __request_module   ==>  __do_request_module with for_lcd = 0
      -- request_lcd_module ==>  __do_request_module with for_lcd = 1
      -- call_modprobe conditionally uses lcd_modprobe_path, the path
         to a patched modprobe accessible via sysfs
      
      include/lcd-domains/lcd-domains.h
      -- added lcd status enum; see source code doc
      -- three routines for creating/running/destroying
         lcd's that use modules; see source code doc
      
      include/uapi/linux/lcd-domains.h
      -- added interface defns for patched modprobe to call into
         lcd driver for module init; lcd driver loads
         module (via slightly refactored module.c code) on behalf
         of modprobe
      
      virt/lcd-domains/lcd-domains.c
      -- implementation of routines for modules inside lcd's
      -- implementation of module init / delete for lcd's
         (uses patched module.c code)
      
      virt/lcd-domains/Kconfig
      virt/lcd-domains/Makefile
      virt/lcd-domains/lcd-module-load-test.c
      virt/lcd-domains/lcd-tests.c
      -- added test module for lcd module code
      -- test runs automatically when lcd module is inserted
      e0193fa4
    • Charles Jacobsen's avatar
      Fixed build system for lcd, and most compiler errors. · 7c05c7a0
      Charles Jacobsen authored
      Two components to the lcd build now:
      -- arch/x86/lcd/Makefile: for building x86 lcd code
      -- virt/lcd/Makefile: for building arch-indep lcd code
      
      Modified the build system just slightly for a cleaner
      build:
      -- virt/ directory treated like ipc/, usr/, etc. directories
      -- added Kconfig and Makefile to virt/, mirroring drivers/
      -- updated top-level Makefile to include virt/ as vmlinux
         directory / dependency, so build system will recur into
         virt/
      -- updated arch/x86/Kconfig to include virt/Kconfig, so it
         will be included as a menu item
      -- updated arch/x86/Kbuild to include arch/x86/lcd/
      
      Removed old capabilities code in cap/.
      
      Removed lcd syscall.
      
      Temporarily turned off build for drivers/lcd.
      
      Fixed most bugs in lcd-vmx (still need to do lcd_vmx_exit).
      -- minor naming issues in lcd-vmx.h
      -- straight copy over of vmx_disable_intercepts_for_msr,
         but with more doc
      -- removed VMX_EPT_INDIVIDUAL_ADDR macro from vmx.h (where
         did this come from? it's not documented in the intel manual,
         nor is it used in kvm)
      7c05c7a0
    • Anton Burtsev's avatar
      ioctl interface to LCD · 552bd976
      Anton Burtsev authored
      552bd976
    • Anton Burtsev's avatar
      Move things around · 0ec257ef
      Anton Burtsev authored
      Not ideal. Some problems:
      
        - The arch dependent and independent parts are not completely
          separated.
      
        - The module builds in arch/x86/lcd and this is a bit strange
      
        - I can't build with LCD buildin, but build works if I build it
          as a module
      
        - There is still a bunch of old warrnings from Weibin's code
      0ec257ef
    • Sarah Spall's avatar
  2. 03 Aug, 2016 3 commits
    • Jessica Yu's avatar
      modules: add ro_after_init support · 444d13ff
      Jessica Yu authored
      Add ro_after_init support for modules by adding a new page-aligned section
      in the module layout (after rodata) for ro_after_init data and enabling RO
      protection for that section after module init runs.
      Signed-off-by: default avatarJessica Yu <jeyu@redhat.com>
      Acked-by: default avatarKees Cook <keescook@chromium.org>
      Signed-off-by: default avatarRusty Russell <rusty@rustcorp.com.au>
      444d13ff
    • Prarit Bhargava's avatar
      modules: Add kernel parameter to blacklist modules · be7de5f9
      Prarit Bhargava authored
      Blacklisting a module in linux has long been a problem.  The current
      procedure is to use rd.blacklist=module_name, however, that doesn't
      cover the case after the initramfs and before a boot prompt (where one
      is supposed to use /etc/modprobe.d/blacklist.conf to blacklist
      runtime loading). Using rd.shell to get an early prompt is hit-or-miss,
      and doesn't cover all situations AFAICT.
      
      This patch adds this functionality of permanently blacklisting a module
      by its name via the kernel parameter module_blacklist=module_name.
      
      [v2]: Rusty, use core_param() instead of __setup() which simplifies
      things.
      
      [v3]: Rusty, undo wreckage from strsep()
      
      [v4]: Rusty, simpler version of blacklisted()
      Signed-off-by: default avatarPrarit Bhargava <prarit@redhat.com>
      Cc: Jonathan Corbet <corbet@lwn.net>
      Cc: Rusty Russell <rusty@rustcorp.com.au>
      Cc: linux-doc@vger.kernel.org
      Signed-off-by: default avatarRusty Russell <rusty@rustcorp.com.au>
      be7de5f9
    • Steven Rostedt's avatar
      module: Do a WARN_ON_ONCE() for assert module mutex not held · 9502514f
      Steven Rostedt authored
      When running with lockdep enabled, I triggered the WARN_ON() in the
      module code that asserts when module_mutex or rcu_read_lock_sched are
      not held. The issue I have is that this can also be called from the
      dump_stack() code, causing us to enter an infinite loop...
      
       ------------[ cut here ]------------
       WARNING: CPU: 1 PID: 0 at kernel/module.c:268 module_assert_mutex_or_preempt+0x3c/0x3e
       Modules linked in: ip6t_REJECT nf_reject_ipv6 nf_conntrack_ipv6
       CPU: 1 PID: 0 Comm: swapper/1 Not tainted 4.7.0-rc3-test-00013-g501c2375 #14
       Hardware name: MSI MS-7823/CSM-H87M-G43 (MS-7823), BIOS V1.6 02/22/2014
        ffff880215e8fa70 ffff880215e8fa70 ffffffff812fc8e3 0000000000000000
        ffffffff81d3e55b ffff880215e8fac0 ffffffff8104fc88 ffffffff8104fcab
        0000000915e88300 0000000000000046 ffffffffa019b29a 0000000000000001
       Call Trace:
        [<ffffffff812fc8e3>] dump_stack+0x67/0x90
        [<ffffffff8104fc88>] __warn+0xcb/0xe9
        [<ffffffff8104fcab>] ? warn_slowpath_null+0x5/0x1f
       ------------[ cut here ]------------
       WARNING: CPU: 1 PID: 0 at kernel/module.c:268 module_assert_mutex_or_preempt+0x3c/0x3e
       Modules linked in: ip6t_REJECT nf_reject_ipv6 nf_conntrack_ipv6
       CPU: 1 PID: 0 Comm: swapper/1 Not tainted 4.7.0-rc3-test-00013-g501c2375 #14
       Hardware name: MSI MS-7823/CSM-H87M-G43 (MS-7823), BIOS V1.6 02/22/2014
        ffff880215e8f7a0 ffff880215e8f7a0 ffffffff812fc8e3 0000000000000000
        ffffffff81d3e55b ffff880215e8f7f0 ffffffff8104fc88 ffffffff8104fcab
        0000000915e88300 0000000000000046 ffffffffa019b29a 0000000000000001
       Call Trace:
        [<ffffffff812fc8e3>] dump_stack+0x67/0x90
        [<ffffffff8104fc88>] __warn+0xcb/0xe9
        [<ffffffff8104fcab>] ? warn_slowpath_null+0x5/0x1f
       ------------[ cut here ]------------
       WARNING: CPU: 1 PID: 0 at kernel/module.c:268 module_assert_mutex_or_preempt+0x3c/0x3e
       Modules linked in: ip6t_REJECT nf_reject_ipv6 nf_conntrack_ipv6
       CPU: 1 PID: 0 Comm: swapper/1 Not tainted 4.7.0-rc3-test-00013-g501c2375 #14
       Hardware name: MSI MS-7823/CSM-H87M-G43 (MS-7823), BIOS V1.6 02/22/2014
        ffff880215e8f4d0 ffff880215e8f4d0 ffffffff812fc8e3 0000000000000000
        ffffffff81d3e55b ffff880215e8f520 ffffffff8104fc88 ffffffff8104fcab
        0000000915e88300 0000000000000046 ffffffffa019b29a 0000000000000001
       Call Trace:
        [<ffffffff812fc8e3>] dump_stack+0x67/0x90
        [<ffffffff8104fc88>] __warn+0xcb/0xe9
        [<ffffffff8104fcab>] ? warn_slowpath_null+0x5/0x1f
       ------------[ cut here ]------------
       WARNING: CPU: 1 PID: 0 at kernel/module.c:268 module_assert_mutex_or_preempt+0x3c/0x3e
      [...]
      
      Which gives us rather useless information. Worse yet, there's some race
      that causes this, and I seldom trigger it, so I have no idea what
      happened.
      
      This would not be an issue if that warning was a WARN_ON_ONCE().
      Signed-off-by: default avatarSteven Rostedt <rostedt@goodmis.org>
      Signed-off-by: default avatarRusty Russell <rusty@rustcorp.com.au>
      9502514f
  3. 02 Aug, 2016 1 commit
  4. 26 Jul, 2016 4 commits
  5. 01 Apr, 2016 1 commit
    • Jessica Yu's avatar
      module: preserve Elf information for livepatch modules · 1ce15ef4
      Jessica Yu authored
      For livepatch modules, copy Elf section, symbol, and string information
      from the load_info struct in the module loader. Persist copies of the
      original symbol table and string table.
      
      Livepatch manages its own relocation sections in order to reuse module
      loader code to write relocations. Livepatch modules must preserve Elf
      information such as section indices in order to apply livepatch relocation
      sections using the module loader's apply_relocate_add() function.
      
      In order to apply livepatch relocation sections, livepatch modules must
      keep a complete copy of their original symbol table in memory. Normally, a
      stripped down copy of a module's symbol table (containing only "core"
      symbols) is made available through module->core_symtab. But for livepatch
      modules, the symbol table copied into memory on module load must be exactly
      the same as the symbol table produced when the patch module was compiled.
      This is because the relocations in each livepatch relocation section refer
      to their respective symbols with their symbol indices, and the original
      symbol indices (and thus the symtab ordering) must be preserved in order
      for apply_relocate_add() to find the right symbol.
      Signed-off-by: default avatarJessica Yu <jeyu@redhat.com>
      Reviewed-by: default avatarMiroslav Benes <mbenes@suse.cz>
      Acked-by: default avatarJosh Poimboeuf <jpoimboe@redhat.com>
      Acked-by: default avatarRusty Russell <rusty@rustcorp.com.au>
      Reviewed-by: default avatarRusty Russell <rusty@rustcorp.com.au>
      Signed-off-by: default avatarJiri Kosina <jkosina@suse.cz>
      1ce15ef4
  6. 17 Mar, 2016 2 commits
  7. 21 Feb, 2016 1 commit
    • Mimi Zohar's avatar
      module: replace copy_module_from_fd with kernel version · a1db7420
      Mimi Zohar authored
      Replace copy_module_from_fd() with kernel_read_file_from_fd().
      
      Although none of the upstreamed LSMs define a kernel_module_from_file
      hook, IMA is called, based on policy, to prevent unsigned kernel modules
      from being loaded by the original kernel module syscall and to
      measure/appraise signed kernel modules.
      
      The security function security_kernel_module_from_file() was called prior
      to reading a kernel module.  Preventing unsigned kernel modules from being
      loaded by the original kernel module syscall remains on the pre-read
      kernel_read_file() security hook.  Instead of reading the kernel module
      twice, once for measuring/appraising and again for loading the kernel
      module, the signature validation is moved to the kernel_post_read_file()
      security hook.
      
      This patch removes the security_kernel_module_from_file() hook and security
      call.
      Signed-off-by: default avatarMimi Zohar <zohar@linux.vnet.ibm.com>
      Acked-by: default avatarKees Cook <keescook@chromium.org>
      Acked-by: default avatarLuis R. Rodriguez <mcgrof@kernel.org>
      Cc: Rusty Russell <rusty@rustcorp.com.au>
      a1db7420
  8. 17 Feb, 2016 1 commit
    • Jessica Yu's avatar
      ftrace/module: remove ftrace module notifier · 7dcd182b
      Jessica Yu authored
      Remove the ftrace module notifier in favor of directly calling
      ftrace_module_enable() and ftrace_release_mod() in the module loader.
      Hard-coding the function calls directly in the module loader removes
      dependence on the module notifier call chain and provides better
      visibility and control over what gets called when, which is important
      to kernel utilities such as livepatch.
      
      This fixes a notifier ordering issue in which the ftrace module notifier
      (and hence ftrace_module_enable()) for coming modules was being called
      after klp_module_notify(), which caused livepatch modules to initialize
      incorrectly. This patch removes dependence on the module notifier call
      chain in favor of hard coding the corresponding function calls in the
      module loader. This ensures that ftrace and livepatch code get called in
      the correct order on patch module load and unload.
      
      Fixes: 5156dca3 ("ftrace: Fix the race between ftrace and insmod")
      Signed-off-by: default avatarJessica Yu <jeyu@redhat.com>
      Reviewed-by: default avatarSteven Rostedt <rostedt@goodmis.org>
      Reviewed-by: default avatarPetr Mladek <pmladek@suse.cz>
      Acked-by: default avatarRusty Russell <rusty@rustcorp.com.au>
      Reviewed-by: default avatarJosh Poimboeuf <jpoimboe@redhat.com>
      Reviewed-by: default avatarMiroslav Benes <mbenes@suse.cz>
      Signed-off-by: default avatarJiri Kosina <jkosina@suse.cz>
      7dcd182b
  9. 02 Feb, 2016 3 commits
    • Rusty Russell's avatar
      modules: fix longstanding /proc/kallsyms vs module insertion race. · 8244062e
      Rusty Russell authored
      For CONFIG_KALLSYMS, we keep two symbol tables and two string tables.
      There's one full copy, marked SHF_ALLOC and laid out at the end of the
      module's init section.  There's also a cut-down version that only
      contains core symbols and strings, and lives in the module's core
      section.
      
      After module init (and before we free the module memory), we switch
      the mod->symtab, mod->num_symtab and mod->strtab to point to the core
      versions.  We do this under the module_mutex.
      
      However, kallsyms doesn't take the module_mutex: it uses
      preempt_disable() and rcu tricks to walk through the modules, because
      it's used in the oops path.  It's also used in /proc/kallsyms.
      There's nothing atomic about the change of these variables, so we can
      get the old (larger!) num_symtab and the new symtab pointer; in fact
      this is what I saw when trying to reproduce.
      
      By grouping these variables together, we can use a
      carefully-dereferenced pointer to ensure we always get one or the
      other (the free of the module init section is already done in an RCU
      callback, so that's safe).  We allocate the init one at the end of the
      module init section, and keep the core one inside the struct module
      itself (it could also have been allocated at the end of the module
      core, but that's probably overkill).
      Reported-by: default avatarWeilong Chen <chenweilong@huawei.com>
      Bugzilla: https://bugzilla.kernel.org/show_bug.cgi?id=111541
      Cc: stable@kernel.org
      Signed-off-by: default avatarRusty Russell <rusty@rustcorp.com.au>
      8244062e
    • Rusty Russell's avatar
      module: wrapper for symbol name. · 2e7bac53
      Rusty Russell authored
      This trivial wrapper adds clarity and makes the following patch
      smaller.
      
      Cc: stable@kernel.org
      Signed-off-by: default avatarRusty Russell <rusty@rustcorp.com.au>
      2e7bac53
    • Luis R. Rodriguez's avatar
      modules: fix modparam async_probe request · 4355efbd
      Luis R. Rodriguez authored
      Commit f2411da7 ("driver-core: add driver module
      asynchronous probe support") added async probe support,
      in two forms:
      
        * in-kernel driver specification annotation
        * generic async_probe module parameter (modprobe foo async_probe)
      
      To support the generic kernel parameter parse_args() was
      extended via commit ecc86170 ("module: add extra
      argument for parse_params() callback") however commit
      failed to f2411da7 failed to add the required argument.
      
      This causes a crash then whenever async_probe generic
      module parameter is used. This was overlooked when the
      form in which in-kernel async probe support was reworked
      a bit... Fix this as originally intended.
      
      Cc: Hannes Reinecke <hare@suse.de>
      Cc: Dmitry Torokhov <dmitry.torokhov@gmail.com>
      Cc: stable@vger.kernel.org (4.2+)
      Signed-off-by: default avatarLuis R. Rodriguez <mcgrof@suse.com>
      Signed-off-by: Rusty Russell <rusty@rustcorp.com.au> [minimized]
      4355efbd
  10. 07 Jan, 2016 1 commit
  11. 04 Dec, 2015 4 commits
    • Miroslav Benes's avatar
      module: keep percpu symbols in module's symtab · e0224418
      Miroslav Benes authored
      Currently, percpu symbols from .data..percpu ELF section of a module are
      not copied over and stored in final symtab array of struct module.
      Consequently such symbol cannot be returned via kallsyms API (for
      example kallsyms_lookup_name). This can be especially confusing when the
      percpu symbol is exported. Only its __ksymtab et al. are present in its
      symtab.
      
      The culprit is in layout_and_allocate() function where SHF_ALLOC flag is
      dropped for .data..percpu section. There is in fact no need to copy the
      section to final struct module, because kernel module loader allocates
      extra percpu section by itself. Unfortunately only symbols from
      SHF_ALLOC sections are copied due to a check in is_core_symbol().
      
      The patch changes is_core_symbol() function to copy over also percpu
      symbols (their st_shndx points to .data..percpu ELF section). We do it
      only if CONFIG_KALLSYMS_ALL is set to be consistent with the rest of the
      function (ELF section is SHF_ALLOC but !SHF_EXECINSTR). Finally
      elf_type() returns type 'a' for a percpu symbol because its address is
      absolute.
      Signed-off-by: default avatarMiroslav Benes <mbenes@suse.cz>
      Signed-off-by: default avatarRusty Russell <rusty@rustcorp.com.au>
      Signed-off-by: default avatarJiri Kosina <jkosina@suse.cz>
      e0224418
    • Rusty Russell's avatar
      module: clean up RO/NX handling. · 85c898db
      Rusty Russell authored
      Modules have three sections: text, rodata and writable data.  The code
      handled the case where these overlapped, however they never can:
      debug_align() ensures they are always page-aligned.
      
      This is why we got away with manually traversing the pages in
      set_all_modules_text_rw() without rounding.
      
      We create three helper functions: frob_text(), frob_rodata() and
      frob_writable_data().  We then call these explicitly at every point,
      so it's clear what we're doing.
      
      We also expose module_enable_ro() and module_disable_ro() for
      livepatch to use.
      Reviewed-by: default avatarJosh Poimboeuf <jpoimboe@redhat.com>
      Signed-off-by: default avatarRusty Russell <rusty@rustcorp.com.au>
      Signed-off-by: default avatarJiri Kosina <jkosina@suse.cz>
      85c898db
    • Rusty Russell's avatar
      module: use a structure to encapsulate layout. · 7523e4dc
      Rusty Russell authored
      Makes it easier to handle init vs core cleanly, though the change is
      fairly invasive across random architectures.
      
      It simplifies the rbtree code immediately, however, while keeping the
      core data together in the same cachline (now iff the rbtree code is
      enabled).
      Acked-by: default avatarPeter Zijlstra <peterz@infradead.org>
      Reviewed-by: default avatarJosh Poimboeuf <jpoimboe@redhat.com>
      Signed-off-by: default avatarRusty Russell <rusty@rustcorp.com.au>
      Signed-off-by: default avatarJiri Kosina <jkosina@suse.cz>
      7523e4dc
    • Josh Poimboeuf's avatar
      module: Use the same logic for setting and unsetting RO/NX · 20ef10c1
      Josh Poimboeuf authored
      When setting a module's RO and NX permissions, set_section_ro_nx() is
      used, but when clearing them, unset_module_{init,core}_ro_nx() are used.
      The unset functions don't have the same checks the set function has for
      partial page protections.  It's probably harmless, but it's still
      confusingly asymmetrical.
      
      Instead, use the same logic to do both.  Also add some new
      set_module_{init,core}_ro_nx() helper functions for more symmetry with
      the unset functions.
      Signed-off-by: default avatarJosh Poimboeuf <jpoimboe@redhat.com>
      Signed-off-by: default avatarRusty Russell <rusty@rustcorp.com.au>
      Signed-off-by: default avatarJiri Kosina <jkosina@suse.cz>
      20ef10c1
  12. 23 Aug, 2015 1 commit
    • Peter Zijlstra's avatar
      module: Fix locking in symbol_put_addr() · 275d7d44
      Peter Zijlstra authored
      Poma (on the way to another bug) reported an assertion triggering:
      
        [<ffffffff81150529>] module_assert_mutex_or_preempt+0x49/0x90
        [<ffffffff81150822>] __module_address+0x32/0x150
        [<ffffffff81150956>] __module_text_address+0x16/0x70
        [<ffffffff81150f19>] symbol_put_addr+0x29/0x40
        [<ffffffffa04b77ad>] dvb_frontend_detach+0x7d/0x90 [dvb_core]
      
      Laura Abbott <labbott@redhat.com> produced a patch which lead us to
      inspect symbol_put_addr(). This function has a comment claiming it
      doesn't need to disable preemption around the module lookup
      because it holds a reference to the module it wants to find, which
      therefore cannot go away.
      
      This is wrong (and a false optimization too, preempt_disable() is really
      rather cheap, and I doubt any of this is on uber critical paths,
      otherwise it would've retained a pointer to the actual module anyway and
      avoided the second lookup).
      
      While its true that the module cannot go away while we hold a reference
      on it, the data structure we do the lookup in very much _CAN_ change
      while we do the lookup. Therefore fix the comment and add the
      required preempt_disable().
      Reported-by: default avatarpoma <pomidorabelisima@gmail.com>
      Signed-off-by: default avatarPeter Zijlstra (Intel) <peterz@infradead.org>
      Signed-off-by: default avatarRusty Russell <rusty@rustcorp.com.au>
      Fixes: a6e6abd5 ("module: remove module_text_address()")
      Cc: stable@kernel.org
      275d7d44
  13. 28 Jul, 2015 1 commit
  14. 08 Jul, 2015 1 commit
  15. 27 Jun, 2015 1 commit
  16. 22 Jun, 2015 1 commit
    • Dan Streetman's avatar
      module: add per-module param_lock · b51d23e4
      Dan Streetman authored
      Add a "param_lock" mutex to each module, and update params.c to use
      the correct built-in or module mutex while locking kernel params.
      Remove the kparam_block_sysfs_r/w() macros, replace them with direct
      calls to kernel_param_[un]lock(module).
      
      The kernel param code currently uses a single mutex to protect
      modification of any and all kernel params.  While this generally works,
      there is one specific problem with it; a module callback function
      cannot safely load another module, i.e. with request_module() or even
      with indirect calls such as crypto_has_alg().  If the module to be
      loaded has any of its params configured (e.g. with a /etc/modprobe.d/*
      config file), then the attempt will result in a deadlock between the
      first module param callback waiting for modprobe, and modprobe trying to
      lock the single kernel param mutex to set the new module's param.
      
      This fixes that by using per-module mutexes, so that each individual module
      is protected against concurrent changes in its own kernel params, but is
      not blocked by changes to other module params.  All built-in modules
      continue to use the built-in mutex, since they will always be loaded at
      runtime and references (e.g. request_module(), crypto_has_alg()) to them
      will never cause load-time param changing.
      
      This also simplifies the interface used by modules to block sysfs access
      to their params; while there are currently functions to block and unblock
      sysfs param access which are split up by read and write and expect a single
      kernel param to be passed, their actual operation is identical and applies
      to all params, not just the one passed to them; they simply lock and unlock
      the global param mutex.  They are replaced with direct calls to
      kernel_param_[un]lock(THIS_MODULE), which locks THIS_MODULE's param_lock, or
      if the module is built-in, it locks the built-in mutex.
      Suggested-by: default avatarRusty Russell <rusty@rustcorp.com.au>
      Signed-off-by: default avatarDan Streetman <ddstreet@ieee.org>
      Signed-off-by: default avatarRusty Russell <rusty@rustcorp.com.au>
      b51d23e4
  17. 27 May, 2015 8 commits
    • Luis R. Rodriguez's avatar
      kernel/module.c: avoid ifdefs for sig_enforce declaration · 6727bb9c
      Luis R. Rodriguez authored
      There's no need to require an ifdef over the declaration
      of sig_enforce as IS_ENABLED() can be used. While at it,
      there's no harm in exposing this kernel parameter outside of
      CONFIG_MODULE_SIG as it'd be a no-op on non module sig
      kernels.
      
      Now, technically we should in theory be able to remove
      the #ifdef'ery over the declaration of the module parameter
      as we are also trusting the bool_enable_only code for
      CONFIG_MODULE_SIG kernels but for now remain paranoid
      and keep it.
      
      With time if no one can put a bullet through bool_enable_only
      and if there are no technical requirements over not exposing
      CONFIG_MODULE_SIG_FORCE with the measures in place by
      bool_enable_only we could remove this last ifdef.
      
      Cc: Rusty Russell <rusty@rustcorp.com.au>
      Cc: Andrew Morton <akpm@linux-foundation.org>
      Cc: Kees Cook <keescook@chromium.org>
      Cc: Tejun Heo <tj@kernel.org>
      Cc: Ingo Molnar <mingo@kernel.org>
      Cc: linux-kernel@vger.kernel.org
      Cc: cocci@systeme.lip6.fr
      Signed-off-by: default avatarLuis R. Rodriguez <mcgrof@suse.com>
      Signed-off-by: default avatarRusty Russell <rusty@rustcorp.com.au>
      6727bb9c
    • Luis R. Rodriguez's avatar
      kernel/params.c: generalize bool_enable_only · d19f05d8
      Luis R. Rodriguez authored
      This takes out the bool_enable_only implementation from
      the module loading code and generalizes it so that others
      can make use of it.
      
      Cc: Rusty Russell <rusty@rustcorp.com.au>
      Cc: Jani Nikula <jani.nikula@intel.com>
      Cc: Andrew Morton <akpm@linux-foundation.org>
      Cc: Kees Cook <keescook@chromium.org>
      Cc: Tejun Heo <tj@kernel.org>
      Cc: Ingo Molnar <mingo@kernel.org>
      Cc: linux-kernel@vger.kernel.org
      Cc: cocci@systeme.lip6.fr
      Signed-off-by: default avatarLuis R. Rodriguez <mcgrof@suse.com>
      Signed-off-by: default avatarRusty Russell <rusty@rustcorp.com.au>
      d19f05d8
    • Luis R. Rodriguez's avatar
      kernel/module.c: use generic module param operaters for sig_enforce · 05f408dd
      Luis R. Rodriguez authored
      We're directly checking and modifying sig_enforce when needed instead
      of using the generic helpers. This prevents us from generalizing this
      helper so that others can use it. Use indirect helpers to allow us
      to generalize this code a bit and to make it a bit more clear what
      this is doing.
      
      Cc: Rusty Russell <rusty@rustcorp.com.au>
      Cc: Jani Nikula <jani.nikula@intel.com>
      Cc: Andrew Morton <akpm@linux-foundation.org>
      Cc: Kees Cook <keescook@chromium.org>
      Cc: Tejun Heo <tj@kernel.org>
      Cc: Ingo Molnar <mingo@kernel.org>
      Cc: linux-kernel@vger.kernel.org
      Cc: cocci@systeme.lip6.fr
      Signed-off-by: default avatarLuis R. Rodriguez <mcgrof@suse.com>
      Signed-off-by: default avatarRusty Russell <rusty@rustcorp.com.au>
      05f408dd
    • Peter Zijlstra's avatar
      module: Rework module_addr_{min,max} · 4f666546
      Peter Zijlstra authored
      __module_address() does an initial bound check before doing the
      {list/tree} iteration to find the actual module. The bound variables
      are nowhere near the mod_tree cacheline, in fact they're nowhere near
      one another.
      
      module_addr_min lives in .data while module_addr_max lives in .bss
      (smarty pants GCC thinks the explicit 0 assignment is a mistake).
      
      Rectify this by moving the two variables into a structure together
      with the latch_tree_root to guarantee they all share the same
      cacheline and avoid hitting two extra cachelines for the lookup.
      
      While reworking the bounds code, move the bound update from allocation
      to insertion time, this avoids updating the bounds for a few error
      paths.
      
      Cc: Rusty Russell <rusty@rustcorp.com.au>
      Signed-off-by: default avatarPeter Zijlstra (Intel) <peterz@infradead.org>
      Signed-off-by: default avatarRusty Russell <rusty@rustcorp.com.au>
      4f666546
    • Peter Zijlstra's avatar
      module: Use __module_address() for module_address_lookup() · b7df4d1b
      Peter Zijlstra authored
      Use the generic __module_address() addr to struct module lookup
      instead of open coding it once more.
      
      Cc: Rusty Russell <rusty@rustcorp.com.au>
      Signed-off-by: default avatarPeter Zijlstra (Intel) <peterz@infradead.org>
      Signed-off-by: default avatarRusty Russell <rusty@rustcorp.com.au>
      b7df4d1b
    • Peter Zijlstra's avatar
      module: Make the mod_tree stuff conditional on PERF_EVENTS || TRACING · 6c9692e2
      Peter Zijlstra authored
      Andrew worried about the overhead on small systems; only use the fancy
      code when either perf or tracing is enabled.
      
      Cc: Rusty Russell <rusty@rustcorp.com.au>
      Cc: Steven Rostedt <rostedt@goodmis.org>
      Requested-by: default avatarAndrew Morton <akpm@linux-foundation.org>
      Signed-off-by: default avatarPeter Zijlstra (Intel) <peterz@infradead.org>
      Signed-off-by: default avatarRusty Russell <rusty@rustcorp.com.au>
      6c9692e2
    • Peter Zijlstra's avatar
      module: Optimize __module_address() using a latched RB-tree · 93c2e105
      Peter Zijlstra authored
      Currently __module_address() is using a linear search through all
      modules in order to find the module corresponding to the provided
      address. With a lot of modules this can take a lot of time.
      
      One of the users of this is kernel_text_address() which is employed
      in many stack unwinders; which in turn are used by perf-callchain and
      ftrace (possibly from NMI context).
      
      So by optimizing __module_address() we optimize many stack unwinders
      which are used by both perf and tracing in performance sensitive code.
      
      Cc: Rusty Russell <rusty@rustcorp.com.au>
      Cc: Steven Rostedt <rostedt@goodmis.org>
      Cc: Mathieu Desnoyers <mathieu.desnoyers@efficios.com>
      Cc: Oleg Nesterov <oleg@redhat.com>
      Cc: "Paul E. McKenney" <paulmck@linux.vnet.ibm.com>
      Signed-off-by: default avatarPeter Zijlstra (Intel) <peterz@infradead.org>
      Signed-off-by: default avatarRusty Russell <rusty@rustcorp.com.au>
      93c2e105
    • Peter Zijlstra's avatar
      module: Sanitize RCU usage and locking · 0be964be
      Peter Zijlstra authored
      Currently the RCU usage in module is an inconsistent mess of RCU and
      RCU-sched, this is broken for CONFIG_PREEMPT where synchronize_rcu()
      does not imply synchronize_sched().
      
      Most usage sites use preempt_{dis,en}able() which is RCU-sched, but
      (most of) the modification sites use synchronize_rcu(). With the
      exception of the module bug list, which actually uses RCU.
      
      Convert everything over to RCU-sched.
      
      Furthermore add lockdep asserts to all sites, because it's not at all
      clear to me the required locking is observed, esp. on exported
      functions.
      
      Cc: Rusty Russell <rusty@rustcorp.com.au>
      Acked-by: default avatar"Paul E. McKenney" <paulmck@linux.vnet.ibm.com>
      Signed-off-by: default avatarPeter Zijlstra (Intel) <peterz@infradead.org>
      Signed-off-by: default avatarRusty Russell <rusty@rustcorp.com.au>
      0be964be
  18. 26 May, 2015 1 commit
    • Peter Zijlstra's avatar
      module: Annotate module version magic · 926a59b1
      Peter Zijlstra authored
      Due to the new lockdep checks in the coming patch, we go:
      
      [    9.759380] ------------[ cut here ]------------
      [    9.759389] WARNING: CPU: 31 PID: 597 at ../kernel/module.c:216 each_symbol_section+0x121/0x130()
      [    9.759391] Modules linked in:
      [    9.759393] CPU: 31 PID: 597 Comm: modprobe Not tainted 4.0.0-rc1+ #65
      [    9.759393] Hardware name: Intel Corporation S2600GZ/S2600GZ, BIOS SE5C600.86B.02.02.0002.122320131210 12/23/2013
      [    9.759396]  ffffffff817d8676 ffff880424567ca8 ffffffff8157e98b 0000000000000001
      [    9.759398]  0000000000000000 ffff880424567ce8 ffffffff8105fbc7 ffff880424567cd8
      [    9.759400]  0000000000000000 ffffffff810ec160 ffff880424567d40 0000000000000000
      [    9.759400] Call Trace:
      [    9.759407]  [<ffffffff8157e98b>] dump_stack+0x4f/0x7b
      [    9.759410]  [<ffffffff8105fbc7>] warn_slowpath_common+0x97/0xe0
      [    9.759412]  [<ffffffff810ec160>] ? section_objs+0x60/0x60
      [    9.759414]  [<ffffffff8105fc2a>] warn_slowpath_null+0x1a/0x20
      [    9.759415]  [<ffffffff810ed9c1>] each_symbol_section+0x121/0x130
      [    9.759417]  [<ffffffff810eda01>] find_symbol+0x31/0x70
      [    9.759420]  [<ffffffff810ef5bf>] load_module+0x20f/0x2660
      [    9.759422]  [<ffffffff8104ef10>] ? __do_page_fault+0x190/0x4e0
      [    9.759426]  [<ffffffff815880ec>] ? retint_restore_args+0x13/0x13
      [    9.759427]  [<ffffffff815880ec>] ? retint_restore_args+0x13/0x13
      [    9.759433]  [<ffffffff810ae73d>] ? trace_hardirqs_on_caller+0x11d/0x1e0
      [    9.759437]  [<ffffffff812fcc0e>] ? trace_hardirqs_on_thunk+0x3a/0x3f
      [    9.759439]  [<ffffffff815880ec>] ? retint_restore_args+0x13/0x13
      [    9.759441]  [<ffffffff810f1ade>] SyS_init_module+0xce/0x100
      [    9.759443]  [<ffffffff81587429>] system_call_fastpath+0x12/0x17
      [    9.759445] ---[ end trace 9294429076a9c644 ]---
      
      As per the comment this site should be fine, but lets wrap it in
      preempt_disable() anyhow to placate lockdep.
      
      Cc: Rusty Russell <rusty@rustcorp.com.au>
      Acked-by: default avatarPaul E. McKenney <paulmck@linux.vnet.ibm.com>
      Signed-off-by: default avatarPeter Zijlstra (Intel) <peterz@infradead.org>
      Signed-off-by: default avatarRusty Russell <rusty@rustcorp.com.au>
      926a59b1