1. 17 Aug, 2010 1 commit
    • Make do_execve() take a const filename pointer · d7627467
      David Howells authored
      Make do_execve() take a const filename pointer so that kernel_execve() compiles
      correctly on ARM:
      arch/arm/kernel/sys_arm.c:88: warning: passing argument 1 of 'do_execve' discards qualifiers from pointer target type
      This also requires the argv and envp arguments to be consted twice, once for
      the pointer array and once for the strings the array points to.  This is
      because do_execve() passes a pointer to the filename (now const) to
      copy_strings_kernel().  A simpler alternative would be to cast the filename
      pointer in do_execve() when it's passed to copy_strings_kernel().
      do_execve() may not change any of the strings it is passed as part of the argv
      or envp lists as some of them are in .rodata, so marking these strings as
      const should be fine.
      Further, kernel_execve() and sys_execve() need to be changed to match.
      This has been test built on x86_64, frv, arm and mips.
      Signed-off-by: David Howells <dhowells@redhat.com>
      Tested-by: Ralf Baechle <ralf@linux-mips.org>
      Acked-by: Russell King <rmk+kernel@arm.linux.org.uk>
      Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>
  2. 27 May, 2010 8 commits
    • call_usermodehelper: UMH_WAIT_EXEC ignores kernel_thread() failure · 04b1c384
      Oleg Nesterov authored
      UMH_WAIT_EXEC should report the error if kernel_thread() fails, like
      UMH_WAIT_PROC does.
      Signed-off-by: Oleg Nesterov <oleg@redhat.com>
      Signed-off-by: Andrew Morton <akpm@linux-foundation.org>
      Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>
    • call_usermodehelper: simplify/fix UMH_NO_WAIT case · d47419cd
      Oleg Nesterov authored
      __call_usermodehelper(UMH_NO_WAIT) has 2 problems:
      	- if kernel_thread() fails, call_usermodehelper_freeinfo()
      	  is not called.
      	- for unknown reason UMH_NO_WAIT has UMH_WAIT_PROC logic,
      	  we spawn yet another thread which waits until the user
      	  mode application exits.
      Change the UMH_NO_WAIT code to use ____call_usermodehelper() instead of
      wait_for_helper(), and do call_usermodehelper_freeinfo() unconditionally.
      We can rely on CLONE_VFORK, do_fork(CLONE_VFORK) until the child exits or
      With or without this patch UMH_NO_WAIT does not report the error if
      kernel_thread() fails, this is correct since the caller doesn't wait for
      Signed-off-by: Oleg Nesterov <oleg@redhat.com>
      Signed-off-by: Andrew Morton <akpm@linux-foundation.org>
      Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>
    • wait_for_helper: SIGCHLD from user-space can lead to use-after-free · 7d642242
      Oleg Nesterov authored
      1. wait_for_helper() calls allow_signal(SIGCHLD) to ensure the child
         can't autoreap itself.
         However, this means that a spurious SIGCHLD from user-space can
         set TIF_SIGPENDING and:
         	- kernel_thread() or sys_wait4() can fail due to signal_pending()
         	- worse, wait4() can fail before ____call_usermodehelper() execs
         	  or exits. In this case the caller may kfree(subprocess_info)
         	  while the child still uses this memory.
         Change the code to use SIG_DFL instead of magic "(void __user *)2"
         set by allow_signal(). This means that SIGCHLD won't be delivered,
         yet the child won't autoreap itself.
         The problem is minor, only root can send a signal to this kthread.
      2. If sys_wait4(&ret) fails it doesn't populate "ret", in this case
         wait_for_helper() reports a random value from uninitialized var.
         With this patch sys_wait4() should never fail, but still it makes
         sense to initialize ret = -ECHILD so that the caller can notice
         the problem.
      Signed-off-by: Oleg Nesterov <oleg@redhat.com>
      Acked-by: Neil Horman <nhorman@tuxdriver.com>
      Signed-off-by: Andrew Morton <akpm@linux-foundation.org>
      Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>
    • call_usermodehelper: no need to unblock signals · 363da402
      Oleg Nesterov authored
      ____call_usermodehelper() correctly calls flush_signal_handlers() to set
      SIG_DFL, but sigemptyset(->blocked) and recalc_sigpending() are not needed.
      This kthread was forked by workqueue thread, all signals must be unblocked
      and ignored, no pending signal is possible.
      Signed-off-by: Oleg Nesterov <oleg@redhat.com>
      Signed-off-by: Andrew Morton <akpm@linux-foundation.org>
      Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>
    • umh: creds: kill subprocess_info->cred logic · c70a626d
      Oleg Nesterov authored
      Now that nobody ever changes subprocess_info->cred we can kill this member
      and related code.  ____call_usermodehelper() always runs in the context of
      freshly forked kernel thread, it has the proper ->cred copied from its
      parent kthread, keventd.
      Signed-off-by: Oleg Nesterov <oleg@redhat.com>
      Acked-by: Neil Horman <nhorman@tuxdriver.com>
      Acked-by: David Howells <dhowells@redhat.com>
      Signed-off-by: Andrew Morton <akpm@linux-foundation.org>
      Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>
    • umh: creds: convert call_usermodehelper_keys() to use subprocess_info->init() · 685bfd2c
      Oleg Nesterov authored
      call_usermodehelper_keys() uses call_usermodehelper_setkeys() to change
      subprocess_info->cred in advance.  Now that we have info->init() we can
      change this code to set tgcred->session_keyring in context of execing
      kernel thread.
      Note: since currently call_usermodehelper_keys() is never called with
      UMH_NO_WAIT, call_usermodehelper_keys()->key_get() and umh_keys_cleanup()
      are not really needed, we could rely on install_session_keyring_to_cred()
      which does key_get() on success.
      Signed-off-by: Oleg Nesterov <oleg@redhat.com>
      Acked-by: Neil Horman <nhorman@tuxdriver.com>
      Acked-by: David Howells <dhowells@redhat.com>
      Signed-off-by: Andrew Morton <akpm@linux-foundation.org>
      Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>
    • exec: replace call_usermodehelper_pipe with use of umh init function and resolve limit · 898b374a
      Neil Horman authored
      The first patch in this series introduced an init function to the
      call_usermodehelper api so that processes could be customized by caller.
      This patch takes advantage of that fact, by customizing the helper in
      do_coredump to create the pipe and set its core limit to one (for our
      recursion check).  This lets us clean up the previous ugliness in the
      usermodehelper internals and factor call_usermodehelper out entirely.
      While I'm at it, we can also modify the helper setup to look for a core
      limit value of 1 rather than zero for our recursion check.
      Signed-off-by: Neil Horman <nhorman@tuxdriver.com>
      Reviewed-by: Oleg Nesterov <oleg@redhat.com>
      Cc: Andi Kleen <andi@firstfloor.org>
      Signed-off-by: Andrew Morton <akpm@linux-foundation.org>
      Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>
    • kmod: add init function to usermodehelper · a06a4dc3
      Neil Horman authored
      About 6 months ago, I made a set of changes to how the core-dump-to-a-pipe
      feature in the kernel works.  We had reports of several races, including
      some reports of apps bypassing our recursion check so that a process that
      was forked as part of a core_pattern setup could infinitely crash and
      refork until the system crashed.
      We fixed those by improving our recursion checks.  The new check basically
      refuses to fork a process if its core limit is zero, which works well.
      Unfortunately, I've been getting grief from maintainers of user space
      programs that are inserted as the forked process of core_pattern.  They
      contend that in order for their programs (such as abrt and apport) to
      work, all the running processes in a system must have their core limits
      set to a non-zero value, to which I say 'yes'.  I did this by design, and
      think that's the right way to do things.
      But I've been asked to ease this burden on user space enough times that I
      thought I would take a look at it.  The first suggestion was to make the
      recursion check fail on a non-zero 'special' number, like one.  That way
      the core collector process could set its core size ulimit to 1, and enable
      the kernel's recursion detection.  This isn't a bad idea on the surface,
      but I don't like it since it's opt-in, in that if a program like abrt or
      apport has a bug and fails to set such a core limit, we're left with a
      recursively crashing system again.
      So I've come up with this.  What I've done is modify the
      call_usermodehelper api such that an extra parameter is added, a function
      pointer which will be called by the user helper task, after it forks, but
      before it exec's the required process.  This will give the caller the
      opportunity to get a call back in the processes context, allowing it to do
      whatever it needs to do to the process in the kernel prior to exec-ing the
      user space code.  In the case of do_coredump, this callback is used to set
      the core ulimit of the helper process to 1.  This eliminates the opt-in
      problem that I had above, as it allows the ulimit for core sizes to be set
      to the value of 1, which is what the recursion check looks for in
      This patch:
      Create new function call_usermodehelper_fns() and allow it to assign both
      an init and cleanup function, as well as arbitrary data.
      The init function is called from the context of the forked process and
      allows for customization of the helper process prior to calling exec.  Its
      return code gates the continuation of the process, or causes its exit.
      Also add an arbitrary data pointer to the subprocess_info struct allowing
      for data to be passed from the caller to the new process, and the
      subsequent cleanup process.
      Also, use this patch to clean up the cleanup function.  It currently takes
      an argp and envp pointer for freeing, which is ugly.  Let's instead just
      make the subprocess_info structure public, and pass that to the cleanup
      and init routines.
      Signed-off-by: Neil Horman <nhorman@tuxdriver.com>
      Reviewed-by: Oleg Nesterov <oleg@redhat.com>
      Cc: Andi Kleen <andi@firstfloor.org>
      Signed-off-by: Andrew Morton <akpm@linux-foundation.org>
      Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>
  3. 11 Jan, 2010 1 commit
    • kmod: fix resource leak in call_usermodehelper_pipe() · 8767ba27
      Masami Hiramatsu authored
      Fix resource (write-pipe file) leak in call_usermodehelper_pipe().
      When call_usermodehelper_exec() fails, write-pipe file is opened and
      call_usermodehelper_pipe() just returns an error.  Since it is hard for
      caller to determine whether the error occurred when opening the pipe or
      executing the helper, the caller cannot close the pipe by themselves.
      I've found this resource leak when testing coredump.  You can check how
      the resource leaks as below;
      $ echo "|nocommand" > /proc/sys/kernel/core_pattern
      $ ulimit -c unlimited
      $ while [ 1 ]; do ./segv; done &> /dev/null &
      $ cat /proc/meminfo (<- repeat it)
      where segv.c is;
      int main () {
              char *p = 0;
              *p = 1;
      }
      This patch closes write-pipe file if call_usermodehelper_exec() failed.
      Signed-off-by: Masami Hiramatsu <mhiramat@redhat.com>
      Cc: Rusty Russell <rusty@rustcorp.com.au>
      Signed-off-by: Andrew Morton <akpm@linux-foundation.org>
      Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>
  4. 09 Nov, 2009 1 commit
    • security: report the module name to security_module_request · dd8dbf2e
      Eric Paris authored
      For SELinux to do better filtering in userspace we send the name of the
      module along with the AVC denial when a program is denied module_request.
      Example output:
      type=SYSCALL msg=audit(11/03/2009 10:59:43.510:9) : arch=x86_64 syscall=write success=yes exit=2 a0=3 a1=7fc28c0d56c0 a2=2 a3=7fffca0d7440 items=0 ppid=1727 pid=1729 auid=unset uid=root gid=root euid=root suid=root fsuid=root egid=root sgid=root fsgid=root tty=(none) ses=unset comm=rpc.nfsd exe=/usr/sbin/rpc.nfsd subj=system_u:system_r:nfsd_t:s0 key=(null)
      type=AVC msg=audit(11/03/2009 10:59:43.510:9) : avc:  denied  { module_request } for  pid=1729 comm=rpc.nfsd kmod="net-pf-10" scontext=system_u:system_r:nfsd_t:s0 tcontext=system_u:system_r:kernel_t:s0 tclass=system
      Signed-off-by: Eric Paris <eparis@redhat.com>
      Signed-off-by: James Morris <jmorris@namei.org>
  5. 23 Sep, 2009 2 commits
    • Revert "kmod: fix race in usermodehelper code" · 95e0d86b
      Sebastian Andrzej Siewior authored
      This reverts commit c02e3f36 ("kmod: fix race in usermodehelper code")
      The patch is wrong.  UMH_WAIT_EXEC is called with VFORK, which ensures
      that the child finishes prior to returning back to the parent.  No race.
      In fact, the patch makes it even worse because it does the thing it
      claims not to do:
       - It calls ->complete() on UMH_WAIT_EXEC
       - the complete() callback may deallocate subinfo as seen in the
         following call chain:
          [<c009f904>] (__link_path_walk+0x20/0xeb4) from [<c00a094c>] (path_walk+0x48/0x94)
          [<c00a094c>] (path_walk+0x48/0x94) from [<c00a0a34>] (do_path_lookup+0x24/0x4c)
          [<c00a0a34>] (do_path_lookup+0x24/0x4c) from [<c00a158c>] (do_filp_open+0xa4/0x83c)
          [<c00a158c>] (do_filp_open+0xa4/0x83c) from [<c009ba90>] (open_exec+0x24/0xe0)
          [<c009ba90>] (open_exec+0x24/0xe0) from [<c009bfa8>] (do_execve+0x7c/0x2e4)
          [<c009bfa8>] (do_execve+0x7c/0x2e4) from [<c0026a80>] (kernel_execve+0x34/0x80)
          [<c0026a80>] (kernel_execve+0x34/0x80) from [<c004b514>] (____call_usermodehelper+0x130/0x148)
          [<c004b514>] (____call_usermodehelper+0x130/0x148) from [<c0024858>] (kernel_thread_exit+0x0/0x8)
         and the path pointer was NULL.  Good that ARM's kernel_execve()
         doesn't check the pointer for NULL or else I wouldn't notice it.
      The only race there might be is with UMH_NO_WAIT but it is too late for
      me to investigate it now.  UMH_WAIT_PROC could probably also use VFORK
      and we could save one exec.  So the only race I see is with UMH_NO_WAIT
      and recent scheduler changes where the child does not always run first
      might have triggered something here but as I said, it is late....
      Signed-off-by: Sebastian Andrzej Siewior <sebastian@breakpoint.cc>
      Acked-by: Neil Horman <nhorman@tuxdriver.com>
      Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>
    • kmod: fix race in usermodehelper code · c02e3f36
      Neil Horman authored
      The user mode helper code has a race in it.  call_usermodehelper_exec()
      takes an allocated subprocess_info structure, which it passes to a
      workqueue, and then passes it to a kernel thread which it creates, after
      which it calls complete to signal to the caller of
      call_usermodehelper_exec() that it can free the subprocess_info struct.
      But since we use that structure in the created thread, we can't call
      complete from __call_usermodehelper(), which is where we create the kernel
      thread.  We need to call complete() from within the kernel thread and then
      not use subprocess_info afterward in the case of UMH_WAIT_EXEC.  Tested
      successfully by me.
      Signed-off-by: Neil Horman <nhorman@tuxdriver.com>
      Cc: Rusty Russell <rusty@rustcorp.com.au>
      Signed-off-by: Andrew Morton <akpm@linux-foundation.org>
      Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>
  6. 02 Sep, 2009 1 commit
    • CRED: Add some configurable debugging [try #6] · e0e81739
      David Howells authored
      Add a config option (CONFIG_DEBUG_CREDENTIALS) to turn on some debug checking
      for credential management.  The additional code keeps track of the number of
      pointers from task_structs to any given cred struct, and checks to see that
      this number never exceeds the usage count of the cred struct (which includes
      all references, not just those from task_structs).
      Furthermore, if SELinux is enabled, the code also checks that the security
      pointer in the cred struct is never seen to be invalid.
      This attempts to catch the bug whereby inode_has_perm() faults in an nfsd
      kernel thread on seeing cred->security be a NULL pointer (it appears that the
      credential struct has been previously released):
      	http://www.kerneloops.org/oops.php?number=252883
      Signed-off-by: David Howells <dhowells@redhat.com>
      Signed-off-by: James Morris <jmorris@namei.org>
  7. 17 Aug, 2009 1 commit
    • tracing/events: Add module tracepoints · 7ead8b83
      Li Zefan authored
      Add trace points to trace module_load, module_free, module_get,
      module_put and module_request, and use trace_event facility to
      get the trace output.
      Here's the sample output:
              | |       |          |         |
          <...>-42    [000]     1.758380: module_request: fb0 wait=1 call_site=fb_open
          <...>-60    [000]     3.269403: module_load: scsi_wait_scan
          <...>-60    [000]     3.269432: module_put: scsi_wait_scan call_site=sys_init_module refcnt=0
          <...>-61    [001]     3.273168: module_free: scsi_wait_scan
          <...>-1021  [000]    13.836081: module_load: sunrpc
          <...>-1021  [000]    13.840589: module_put: sunrpc call_site=sys_init_module refcnt=-1
          <...>-1027  [000]    13.848098: module_get: sunrpc call_site=try_module_get refcnt=0
          <...>-1027  [000]    13.848308: module_get: sunrpc call_site=get_filesystem refcnt=1
          <...>-1027  [000]    13.848692: module_put: sunrpc call_site=put_filesystem refcnt=0
       modprobe-2587  [001]  1088.437213: module_load: trace_events_sample F
       modprobe-2587  [001]  1088.437786: module_put: trace_events_sample call_site=sys_init_module refcnt=0
      - the taints flag can be 'F', 'C' and/or 'P' if mod->taints != 0
      - the module refcnt is percpu, so it can be negative in a
        specific cpu
      Signed-off-by: Li Zefan <lizf@cn.fujitsu.com>
      Acked-by: Rusty Russell <rusty@rustcorp.com.au>
      Cc: Steven Rostedt <rostedt@goodmis.org>
      Cc: Frederic Weisbecker <fweisbec@gmail.com>
      Cc: Rusty Russell <rusty@rustcorp.com.au>
      LKML-Reference: <4A891B3C.5030608@cn.fujitsu.com>
      Signed-off-by: Ingo Molnar <mingo@elte.hu>
  8. 13 Aug, 2009 1 commit
  9. 08 Jul, 2009 1 commit
  10. 26 May, 2009 1 commit
  11. 30 Mar, 2009 2 commits
    • module: create a request_module_nowait() · acae0515
      Arjan van de Ven authored
      There seems to be a common pattern in the kernel where drivers want to
      call request_module() from inside a module_init() function. Currently
      this would deadlock.
      As a result, several drivers go through hoops like scheduling things via
      kevent, or creating custom work queues (because kevent can deadlock on them).
      This patch changes this to use a request_module_nowait() function macro instead,
      which just fires the modprobe off but doesn't wait for it, and thus avoids the
      original deadlock entirely.
      On my laptop this already results in one less kernel thread running..
      (Includes Jiri's patch to use enum umh_wait)
      Signed-off-by: Arjan van de Ven <arjan@linux.intel.com>
      Signed-off-by: Rusty Russell <rusty@rustcorp.com.au> (bool-ified)
      Cc: Jiri Slaby <jirislaby@gmail.com>
    • cpumask: remove dangerous CPU_MASK_ALL_PTR, &CPU_MASK_ALL · 1a2142af
      Rusty Russell authored
      Impact: cleanup
      (Thanks to Al Viro for reminding me of this, via Ingo)
      CPU_MASK_ALL is the (deprecated) "all bits set" cpumask, defined as so:
      	#define CPU_MASK_ALL (cpumask_t) { { ... } }
      Taking the address of such a temporary is questionable at best,
      unfortunately 321a8e9d (cpumask: add CPU_MASK_ALL_PTR macro) added
      	#define CPU_MASK_ALL_PTR (&CPU_MASK_ALL)
      Which formalizes this practice.  One day gcc could bite us over this
      usage (though we seem to have gotten away with it so far).
      So replace everywhere which used &CPU_MASK_ALL or CPU_MASK_ALL_PTR
      with the modern "cpu_all_mask" (a real const struct cpumask *).
      Signed-off-by: Rusty Russell <rusty@rustcorp.com.au>
      Acked-by: Ingo Molnar <mingo@elte.hu>
      Reported-by: Al Viro <viro@zeniv.linux.org.uk>
      Cc: Mike Travis <travis@sgi.com>
  12. 06 Jan, 2009 1 commit
  13. 13 Nov, 2008 2 commits
    • CRED: Inaugurate COW credentials · d84f4f99
      David Howells authored
      Inaugurate copy-on-write credentials management.  This uses RCU to manage the
      credentials pointer in the task_struct with respect to accesses by other tasks.
      A process may only modify its own credentials, and so does not need locking to
      access or modify its own credentials.
      A mutex (cred_replace_mutex) is added to the task_struct to control the effect
      of PTRACE_ATTACHED on credential calculations, particularly with respect to
      With this patch, the contents of an active credentials struct may not be
      changed directly; rather a new set of credentials must be prepared, modified
      and committed using something like the following sequence of events:
      	struct cred *new = prepare_creds();
      	int ret = blah(new);
      	if (ret < 0) {
      		abort_creds(new);
      		return ret;
      	}
      	return commit_creds(new);
      There are some exceptions to this rule: the keyrings pointed to by the active
      credentials may be instantiated - keyrings violate the COW rule as managing
      COW keyrings is tricky, given that it is possible for a task to directly alter
      the keys in a keyring in use by another task.
      To help enforce this, various pointers to sets of credentials, such as those in
      the task_struct, are declared const.  The purpose of this is compile-time
      discouragement of altering credentials through those pointers.  Once a set of
      credentials has been made public through one of these pointers, it may not be
      modified, except under special circumstances:
        (1) Its reference count may be incremented and decremented.
        (2) The keyrings to which it points may be modified, but not replaced.
      The only safe way to modify anything else is to create a replacement and commit
      using the functions described in Documentation/credentials.txt (which will be
      added by a later patch).
      This patch and the preceding patches have been tested with the LTP SELinux
      This patch makes several logical sets of alteration:
       (1) execve().
           This now prepares and commits credentials in various places in the
           security code rather than altering the current creds directly.
       (2) Temporary credential overrides.
           do_coredump() and sys_faccessat() now prepare their own credentials and
           temporarily override the ones currently on the acting thread, whilst
           preventing interference from other threads by holding cred_replace_mutex
           on the thread being dumped.
           This will be replaced in a future patch by something that hands down the
           credentials directly to the functions being called, rather than altering
           the task's objective credentials.
       (3) LSM interface.
           A number of functions have been changed, added or removed:
           (*) security_capset_check(), ->capset_check()
           (*) security_capset_set(), ->capset_set()
           	 Removed in favour of security_capset().
           (*) security_capset(), ->capset()
           	 New.  This is passed a pointer to the new creds, a pointer to the old
           	 creds and the proposed capability sets.  It should fill in the new
           	 creds or return an error.  All pointers, barring the pointer to the
           	 new creds, are now const.
           (*) security_bprm_apply_creds(), ->bprm_apply_creds()
           	 Changed; now returns a value, which will cause the process to be
           	 killed if it's an error.
           (*) security_task_alloc(), ->task_alloc_security()
           	 Removed in favour of security_prepare_creds().
           (*) security_cred_free(), ->cred_free()
           	 New.  Free security data attached to cred->security.
           (*) security_prepare_creds(), ->cred_prepare()
           	 New. Duplicate any security data attached to cred->security.
           (*) security_commit_creds(), ->cred_commit()
           	 New. Apply any security effects for the upcoming installation of new
           	 security by commit_creds().
           (*) security_task_post_setuid(), ->task_post_setuid()
           	 Removed in favour of security_task_fix_setuid().
           (*) security_task_fix_setuid(), ->task_fix_setuid()
           	 Fix up the proposed new credentials for setuid().  This is used by
           	 cap_set_fix_setuid() to implicitly adjust capabilities in line with
           	 setuid() changes.  Changes are made to the new credentials, rather
           	 than the task itself as in security_task_post_setuid().
           (*) security_task_reparent_to_init(), ->task_reparent_to_init()
           	 Removed.  Instead the task being reparented to init is referred
           	 directly to init's credentials.
      	 NOTE!  This results in the loss of some state: SELinux's osid no
      	 longer records the sid of the thread that forked it.
           (*) security_key_alloc(), ->key_alloc()
           (*) security_key_permission(), ->key_permission()
           	 Changed.  These now take cred pointers rather than task pointers to
           	 refer to the security context.
       (4) sys_capset().
           This has been simplified and uses less locking.  The LSM functions it
           calls have been merged.
       (5) reparent_to_kthreadd().
           This gives the current thread the same credentials as init by simply using
           commit_thread() to point that way.
       (6) __sigqueue_alloc() and switch_uid()
           __sigqueue_alloc() can't stop the target task from changing its creds
           beneath it, so this function gets a reference to the currently applicable
           user_struct which it then passes into the sigqueue struct it returns if
           switch_uid() is now called from commit_creds(), and possibly should be
           folded into that.  commit_creds() should take care of protecting
       (7) [sg]et[ug]id() and co and [sg]et_current_groups.
           The set functions now all use prepare_creds(), commit_creds() and
           abort_creds() to build and check a new set of credentials before applying
           security_task_set[ug]id() is called inside the prepared section.  This
           guarantees that nothing else will affect the creds until we've finished.
           The calling of set_dumpable() has been moved into commit_creds().
           Much of the functionality of set_user() has been moved into
           The get functions all simply access the data directly.
       (8) security_task_prctl() and cap_task_prctl().
           security_task_prctl() has been modified to return -ENOSYS if it doesn't
           want to handle a function, or otherwise return the return value directly
           rather than through an argument.
           Additionally, cap_task_prctl() now prepares a new set of credentials, even
           if it doesn't end up using it.
       (9) Keyrings.
           A number of changes have been made to the keyrings code:
           (a) switch_uid_keyring(), copy_keys(), exit_keys() and suid_keys() have
           	 all been dropped and built in to the credentials functions directly.
           	 They may want separating out again later.
           (b) key_alloc() and search_process_keyrings() now take a cred pointer
           	 rather than a task pointer to specify the security context.
           (c) copy_creds() gives a new thread within the same thread group a new
           	 thread keyring if its parent had one, otherwise it discards the thread
           (d) The authorisation key now points directly to the credentials to extend
           	 the search into rather than pointing to the task that carries them.
           (e) Installing thread, process or session keyrings causes a new set of
           	 credentials to be created, even though it's not strictly necessary for
           	 process or session keyrings (they're shared).
      (10) Usermode helper.
           The usermode helper code now carries a cred struct pointer in its
           subprocess_info struct instead of a new session keyring pointer.  This set
           of credentials is derived from init_cred and installed on the new process
           after it has been cloned.
           call_usermodehelper_setup() allocates the new credentials and
           call_usermodehelper_freeinfo() discards them if they haven't been used.  A
           special cred function (prepare_usermodeinfo_creds()) is provided
           specifically for call_usermodehelper_setup() to call.
           call_usermodehelper_setkeys() adjusts the credentials to sport the
           supplied keyring as the new session keyring.
      (11) SELinux.
           SELinux has a number of changes, in addition to those to support the LSM
           interface changes mentioned above:
           (a) selinux_setprocattr() no longer does its check for whether the
           	 current ptracer can access processes with the new SID inside the lock
           	 that covers getting the ptracer's SID.  Whilst this lock ensures that
           	 the check is done with the ptracer pinned, the result is only valid
           	 until the lock is released, so there's no point doing it inside the
      (12) is_single_threaded().
           This function has been extracted from selinux_setprocattr() and put into
           a file of its own in the lib/ directory as join_session_keyring() now
           wants to use it too.
           The code in SELinux just checked to see whether a task shared mm_structs
           with other tasks (CLONE_VM), but that isn't good enough.  We really want
           to know if they're part of the same thread group (CLONE_THREAD).
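           The difference between the two questions can be sketched in a small
           userspace model.  Everything here (struct demo_task, the demo_*
           helpers, the integer standing in for the mm_struct pointer) is
           hypothetical and only illustrates the distinction; it is not the
           code that was moved into lib/.

```c
#include <assert.h>
#include <stddef.h>

/* Hypothetical, simplified task record; not the kernel's task_struct. */
struct demo_task {
    int pid;
    int tgid;   /* equal for tasks that were created with CLONE_THREAD */
    int mm_id;  /* stands in for the mm_struct pointer (shared via CLONE_VM) */
};

/* The weaker check: does any other task share our address space? */
int demo_shares_mm(const struct demo_task *t,
                   const struct demo_task *all, size_t n)
{
    assert(t != NULL);
    for (size_t i = 0; i < n; ++i)
        if (all[i].pid != t->pid && all[i].mm_id == t->mm_id)
            return 1;
    return 0;
}

/* The stricter question: is any other task in our thread group? */
int demo_has_thread_sibling(const struct demo_task *t,
                            const struct demo_task *all, size_t n)
{
    assert(t != NULL);
    for (size_t i = 0; i < n; ++i)
        if (all[i].pid != t->pid && all[i].tgid == t->tgid)
            return 1;
    return 0;
}
```

           In this model, two tasks that share an mm but sit in different
           thread groups answer "yes" to the first question and "no" to the
           second.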
      (13) nfsd.
           The NFS server daemon now has to use the COW credentials to set the
           credentials it is going to use.  It really needs to pass the credentials
           down to the functions it calls, but it can't do that until other patches
           in this series have been applied.
      Signed-off-by: David Howells <dhowells@redhat.com>
      Acked-by: James Morris <jmorris@namei.org>
      Signed-off-by: James Morris <jmorris@namei.org>
    • David Howells's avatar
      KEYS: Alter use of key instantiation link-to-keyring argument · 8bbf4976
      David Howells authored
      Alter the use of the key instantiation and negation functions' link-to-keyring
      arguments.  Currently this specifies a keyring in the target process to link
      the key into, creating the keyring if it doesn't exist.  This, however, can be
      a problem for copy-on-write credentials as it means that the instantiating
      process can alter the credentials of the requesting process.
      This patch alters the behaviour such that:
       (1) If keyctl_instantiate_key() or keyctl_negate_key() are given a specific
           keyring by ID (ringid >= 0), then that keyring will be used.
       (2) If keyctl_instantiate_key() or keyctl_negate_key() are given one of the
           special constants that refer to the requesting process's keyrings
           (KEY_SPEC_*_KEYRING, all <= 0), then:
           (a) If sys_request_key() was given a keyring to use (destringid) then the
           	 key will be attached to that keyring.
           (b) If sys_request_key() was given a NULL keyring, then the key being
           	 instantiated will be attached to the default keyring as set by
       (3) No extra link will be made.
      Decision point (1) follows current behaviour, and allows those instantiators
      who've searched for a specifically named keyring in the requestor's keyring so
      as to partition the keys by type to still have their named keyrings.
      Decision point (2) allows the requestor to make sure that the key or keys that
      get produced by request_key() go where they want, whilst allowing the
      instantiator to request that the key is retained.  This is mainly useful for
      situations where the instantiator makes a secondary request, the key for which
      should be retained by the initial requestor:
      	+-----------+        +--------------+        +--------------+
      	|           |        |              |        |              |
      	| Requestor |------->| Instantiator |------->| Instantiator |
      	|           |        |              |        |              |
      	+-----------+        +--------------+        +--------------+
      	           request_key()           request_key()
      This might be useful, for example, in Kerberos, where the requestor requests a
      ticket, and then the ticket instantiator requests the TGT, which someone else
      then has to go and fetch.  The TGT, however, should be retained in the
      keyrings of the requestor, not the first instantiator.  To make this explicit
      an extra special keyring constant is also added.
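      The three decision points can be read as a small selection function.
      The sketch below is a userspace model only: struct demo_keyring,
      demo_pick_dest_keyring() and its parameters are hypothetical names, and
      treating ringid == 0 as the "no extra link" case is an assumption made
      to keep the three cases disjoint.

```c
#include <assert.h>
#include <stddef.h>

/* Hypothetical stand-in for a keyring; not the kernel's struct key. */
struct demo_keyring {
    const char *name;
};

/*
 * Choose the keyring that a newly instantiated key is linked into.
 *   by_id        - keyring the instantiator named explicitly (ringid > 0)
 *   request_dest - keyring passed to the original request, or NULL
 *   default_ring - the requestor's default destination keyring
 * Returns NULL when no extra link should be made.
 */
const struct demo_keyring *
demo_pick_dest_keyring(long ringid,
                       const struct demo_keyring *by_id,
                       const struct demo_keyring *request_dest,
                       const struct demo_keyring *default_ring)
{
    if (ringid > 0) {
        /* (1) a specific keyring was given by ID */
        assert(by_id != NULL);
        return by_id;
    }
    if (ringid < 0) {
        /* (2) a special constant: (a) the request's keyring, else (b) the default */
        return request_dest != NULL ? request_dest : default_ring;
    }
    /* (3) assumed here to be ringid == 0: no extra link */
    return NULL;
}
```

      The point of case (2) is that the requestor, not the instantiator,
      decides where the key ends up.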
      Signed-off-by: David Howells <dhowells@redhat.com>
      Reviewed-by: James Morris <jmorris@namei.org>
      Signed-off-by: James Morris <jmorris@namei.org>
  14. 16 Oct, 2008 2 commits
  15. 25 Jul, 2008 1 commit
  16. 24 Jul, 2008 1 commit
    • Ulrich Drepper's avatar
      flag parameters: NONBLOCK in pipe · be61a86d
      Ulrich Drepper authored
      This patch adds O_NONBLOCK support to pipe2.  It is minimally more involved
      than the patches for eventfd et al. but still trivial.  The interfaces of the
      create_write_pipe and create_read_pipe helper functions were changed, and
      their one other caller was updated to match.
      The following test must be adjusted for architectures other than x86 and
      x86-64 and in case the syscall numbers changed.
      #include <fcntl.h>
      #include <stdio.h>
      #include <unistd.h>
      #include <sys/syscall.h>
      #ifndef __NR_pipe2
      # ifdef __x86_64__
      #  define __NR_pipe2 293
      # elif defined __i386__
      #  define __NR_pipe2 331
      # else
      #  error "need __NR_pipe2"
      # endif
      #endif
      int
      main (void)
      {
        int fds[2];
        /* Without flags, neither end may be non-blocking.  */
        if (syscall (__NR_pipe2, fds, 0) == -1)
          {
            puts ("pipe2(0) failed");
            return 1;
          }
        for (int i = 0; i < 2; ++i)
          {
            int fl = fcntl (fds[i], F_GETFL);
            if (fl == -1)
              {
                puts ("fcntl failed");
                return 1;
              }
            if (fl & O_NONBLOCK)
              {
                printf ("pipe2(0) set non-blocking mode for fds[%d]\n", i);
                return 1;
              }
            close (fds[i]);
          }
        /* With O_NONBLOCK, both ends must be non-blocking.  */
        if (syscall (__NR_pipe2, fds, O_NONBLOCK) == -1)
          {
            puts ("pipe2(O_NONBLOCK) failed");
            return 1;
          }
        for (int i = 0; i < 2; ++i)
          {
            int fl = fcntl (fds[i], F_GETFL);
            if (fl == -1)
              {
                puts ("fcntl failed");
                return 1;
              }
            if ((fl & O_NONBLOCK) == 0)
              {
                printf ("pipe2(O_NONBLOCK) does not set non-blocking mode for fds[%d]\n", i);
                return 1;
              }
            close (fds[i]);
          }
        puts ("OK");
        return 0;
      }
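      For comparison, a rough sketch of the two-step sequence that
      pipe2(O_NONBLOCK) folds into a single call: create the pipe with plain
      pipe() and then set O_NONBLOCK on each end with fcntl().  The helper
      name demo_pipe_nonblock_legacy() is hypothetical; only POSIX calls are
      used.

```c
#include <assert.h>
#include <fcntl.h>
#include <stddef.h>
#include <unistd.h>

/*
 * Create a pipe and switch both ends to non-blocking mode by hand.
 * Returns 0 on success; on failure returns -1 and leaves no descriptor open.
 */
int demo_pipe_nonblock_legacy(int fds[2])
{
    assert(fds != NULL);
    if (pipe(fds) == -1)
        return -1;
    for (int i = 0; i < 2; ++i) {
        int fl = fcntl(fds[i], F_GETFL);
        if (fl == -1 || fcntl(fds[i], F_SETFL, fl | O_NONBLOCK) == -1) {
            close(fds[0]);
            close(fds[1]);
            return -1;
        }
    }
    return 0;
}
```

      With the flag parameter, the four extra fcntl() calls go away and the
      descriptors never exist in blocking mode.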
      Signed-off-by: Ulrich Drepper <drepper@redhat.com>
      Acked-by: Davide Libenzi <davidel@xmailserver.org>
      Cc: Michael Kerrisk <mtk.manpages@googlemail.com>
      Signed-off-by: Andrew Morton <akpm@linux-foundation.org>
      Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>
  17. 22 Jul, 2008 1 commit
  18. 01 May, 2008 1 commit
  19. 19 Apr, 2008 1 commit
    • Mike Travis's avatar
      generic: use new set_cpus_allowed_ptr function · f70316da
      Mike Travis authored
        * Use new set_cpus_allowed_ptr() function added by previous patch,
          which, instead of passing the "newly allowed cpus" cpumask_t arg
          by value, passes it by pointer:
          -int set_cpus_allowed(struct task_struct *p, cpumask_t new_mask)
          +int set_cpus_allowed_ptr(struct task_struct *p, const cpumask_t *new_mask)
        * Modify CPU_MASK_ALL
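      Why the calling convention matters can be seen in a userspace sketch.
      The demo_* names and the 4096-CPU mask size are assumptions chosen for
      illustration; they are not the kernel's definitions.  With a mask this
      large, the by-value form copies the whole structure onto the stack on
      every call, while the pointer form passes a single machine word.

```c
#include <assert.h>
#include <limits.h>
#include <string.h>

#define DEMO_NR_CPUS        4096  /* assumed CPU count for the illustration */
#define DEMO_BITS_PER_LONG  (sizeof(unsigned long) * CHAR_BIT)
#define DEMO_MASK_LONGS \
    ((DEMO_NR_CPUS + DEMO_BITS_PER_LONG - 1) / DEMO_BITS_PER_LONG)

/* Hypothetical bitmap type with one bit per possible CPU. */
typedef struct {
    unsigned long bits[DEMO_MASK_LONGS];
} demo_cpumask_t;

struct demo_task {
    demo_cpumask_t cpus_allowed;
};

/* Old style: the entire mask is copied into the callee's argument. */
int demo_set_cpus_allowed(struct demo_task *p, demo_cpumask_t new_mask)
{
    assert(p != NULL);
    p->cpus_allowed = new_mask;
    return 0;
}

/* New style: only a pointer crosses the call boundary. */
int demo_set_cpus_allowed_ptr(struct demo_task *p,
                              const demo_cpumask_t *new_mask)
{
    assert(p != NULL && new_mask != NULL);
    memcpy(&p->cpus_allowed, new_mask, sizeof(*new_mask));
    return 0;
}
```

      Both functions leave the task in the same state; only the cost of the
      call differs.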
      Depends on:
      	[sched-devel]: sched: add new set_cpus_allowed_ptr function
      Signed-off-by: Mike Travis <travis@sgi.com>
      Signed-off-by: Ingo Molnar <mingo@elte.hu>
  20. 14 Feb, 2008 1 commit
  21. 17 Jan, 2008 1 commit
  22. 11 Sep, 2007 1 commit
    • Michael Ellerman's avatar
      Restore call_usermodehelper_pipe() behaviour · 3210f0ec
      Michael Ellerman authored
      The semantics of call_usermodehelper_pipe() used to be that it would fork
      the helper, and wait for the kernel thread to be started.  This was
      implemented by setting sub_info.wait to 0 (implicitly), and doing a
      As part of the cleanup done in 0ab4dc92,
      call_usermodehelper_pipe() was changed to pass 1 as the value for wait to
      This is equivalent to setting sub_info.wait to 1, which is a change from
      the previous behaviour.  Using 1 instead of 0 causes
      __call_usermodehelper() to start the kernel thread running
      wait_for_helper(), rather than directly calling ____call_usermodehelper().
      The end result is that the calling kernel code blocks until the user mode
      helper finishes.  As the helper is expecting input on stdin, and now no one
      is writing anything, everything locks up (observed in do_coredump).
      The fix is to change the 1 to UMH_WAIT_EXEC (aka 0), indicating that we
      want to wait for the kernel thread to be started, but not for the helper to
      Signed-off-by: Michael Ellerman <michael@ellerman.id.au>
      Acked-by: Andi Kleen <ak@suse.de>
      Signed-off-by: Andrew Morton <akpm@linux-foundation.org>
      Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>
  23. 26 Jul, 2007 1 commit
  24. 19 Jul, 2007 2 commits
  25. 18 Jul, 2007 2 commits
    • Jeremy Fitzhardinge's avatar
      usermodehelper: Tidy up waiting · 86313c48
      Jeremy Fitzhardinge authored
      Rather than using a tri-state integer for the wait flag in
      call_usermodehelper_exec, define a proper enum, and use that.  I've
      preserved the integer values so that any callers I've missed should
      still work OK.
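      A sketch of what such an enum might look like.  The names UMH_NO_WAIT,
      UMH_WAIT_EXEC and UMH_WAIT_PROC appear elsewhere in this log, and 0 for
      UMH_WAIT_EXEC is stated there too; the values -1 and 1 for the other two
      are assumptions here, as are the demo_* helpers.

```c
#include <assert.h>

/* Values chosen to match the old tri-state integer (partly assumed). */
enum umh_wait {
    UMH_NO_WAIT   = -1,  /* assumed: do not wait at all */
    UMH_WAIT_EXEC =  0,  /* wait for the exec, but not for the process */
    UMH_WAIT_PROC =  1,  /* assumed: wait for the process to complete */
};

/* Does the caller block until the helper has been started? */
int demo_waits_for_exec(int wait)
{
    assert(wait >= UMH_NO_WAIT && wait <= UMH_WAIT_PROC);
    return wait != UMH_NO_WAIT;
}

/* Does the caller block until the helper has exited? */
int demo_waits_for_exit(int wait)
{
    assert(wait >= UMH_NO_WAIT && wait <= UMH_WAIT_PROC);
    return wait == UMH_WAIT_PROC;
}
```

      Because the helpers take a plain int, a caller that still passes a bare
      0 or 1 behaves exactly as one that passes the named constant.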
      Signed-off-by: Jeremy Fitzhardinge <jeremy@xensource.com>
      Cc: James Bottomley <James.Bottomley@HansenPartnership.com>
      Cc: Randy Dunlap <randy.dunlap@oracle.com>
      Cc: Christoph Hellwig <hch@infradead.org>
      Cc: Andi Kleen <ak@suse.de>
      Cc: Paul Mackerras <paulus@samba.org>
      Cc: Johannes Berg <johannes@sipsolutions.net>
      Cc: Ralf Baechle <ralf@linux-mips.org>
      Cc: Bjorn Helgaas <bjorn.helgaas@hp.com>
      Cc: Joel Becker <joel.becker@oracle.com>
      Cc: Tony Luck <tony.luck@intel.com>
      Cc: Kay Sievers <kay.sievers@vrfy.org>
      Cc: Srivatsa Vaddagiri <vatsa@in.ibm.com>
      Cc: Oleg Nesterov <oleg@tv-sign.ru>
      Cc: David Howells <dhowells@redhat.com>
    • Jeremy Fitzhardinge's avatar
      usermodehelper: split setup from execution · 0ab4dc92
      Jeremy Fitzhardinge authored
      Rather than having hundreds of variations of call_usermodehelper for
      various pieces of usermode state which could be set up, split the
      info allocation and initialization from the actual process execution.
      This means the general pattern becomes:
       info = call_usermodehelper_setup(path, argv, envp); /* basic state */
       call_usermodehelper_<SET EXTRA STATE>(info, stuff...);	/* extra state */
       call_usermodehelper_exec(info, wait);	/* run process and free info */
      This patch introduces wrappers for all the existing calling styles for
      call_usermodehelper_*, but folds their implementations into one.
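      The shape of the split can be modelled in userspace.  In the sketch
      below every name (struct demo_subprocess_info, demo_setup(),
      demo_setkeys(), demo_exec()) is hypothetical, and demo_exec() only
      validates the path instead of starting a process; the point is the
      ownership pattern, where setup allocates, optional setters adjust, and
      exec consumes and frees the info.

```c
#include <assert.h>
#include <stdlib.h>

/* Hypothetical model of the per-helper state. */
struct demo_subprocess_info {
    const char *path;
    char **argv;
    char **envp;
    const void *session_keyring;  /* optional extra state */
};

/* Step 1: allocate the info and record the basic state. */
struct demo_subprocess_info *
demo_setup(const char *path, char **argv, char **envp)
{
    struct demo_subprocess_info *info = calloc(1, sizeof(*info));
    if (info == NULL)
        return NULL;
    info->path = path;
    info->argv = argv;
    info->envp = envp;
    return info;
}

/* Step 2 (optional): attach one piece of extra state. */
void demo_setkeys(struct demo_subprocess_info *info, const void *keyring)
{
    assert(info != NULL);
    info->session_keyring = keyring;
}

/* Step 3: "run" the helper and free the info, whatever the outcome. */
int demo_exec(struct demo_subprocess_info *info, int wait)
{
    int ret;

    (void)wait;  /* the wait mode is ignored in this model */
    if (info == NULL)
        return -1;
    /* Model only: accept absolute paths, start nothing. */
    ret = (info->path != NULL && info->path[0] == '/') ? 0 : -1;
    free(info);
    return ret;
}
```

      A new kind of extra state then needs only one more setter, not another
      variant of the whole call.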
      Signed-off-by: Jeremy Fitzhardinge <jeremy@xensource.com>
      Cc: Andi Kleen <ak@suse.de>
      Cc: Rusty Russell <rusty@rustcorp.com.au>
      Cc: David Howells <dhowells@redhat.com>
      Cc: Björn Steinbrink <B.Steinbrink@gmx.de>
      Cc: Randy Dunlap <randy.dunlap@oracle.com>
  26. 09 May, 2007 2 commits