From 2e4a707269a409950c3f315010c20f9719c594e2 Mon Sep 17 00:00:00 2001
From: Pavel Emelyanov <xemul@openvz.org>
Date: Thu, 18 Oct 2007 23:40:01 -0700
Subject: [PATCH] pid namespaces: move exit_task_namespaces()

Make task release its namespaces after it has reparented all his children to
child_reaper, but before it notifies its parent about its death.

The reason to release namespaces after reparenting is that when task exits it
may send a signal to its parent (SIGCHLD), but if the parent has already
exited its namespaces there will be no way to decide what pid to dever to him
- parent can be from different namespace.

The reason to release namespace before notifying the parent it that when task
sends a SIGCHLD to parent it can call wait() on this taks and release it.  But
releasing the mnt namespace implies dropping of all the mounts in the mnt
namespace and NFS expects the task to have valid sighand pointer.

Thanks to Oleg for pointing out some races that can apear and helping with
patches and fixes.

Signed-off-by: Pavel Emelyanov <xemul@openvz.org>
Cc: Oleg Nesterov <oleg@tv-sign.ru>
Cc: Sukadev Bhattiprolu <sukadev@us.ibm.com>
Cc: Paul Menage <menage@google.com>
Cc: "Eric W. Biederman" <ebiederm@xmission.com>
Signed-off-by: Andrew Morton <akpm@linux-foundation.org>
Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>
---
 kernel/exit.c | 2 +-
 1 file changed, 1 insertion(+), 1 deletion(-)

diff --git a/kernel/exit.c b/kernel/exit.c
index 3f2182ccf187..df2eee9c68ce 100644
--- a/kernel/exit.c
+++ b/kernel/exit.c
@@ -774,6 +774,7 @@ static void exit_notify(struct task_struct *tsk)
 	 *	jobs, send them a SIGHUP and then a SIGCONT.  (POSIX 3.2.2.2)
 	 */
 	forget_original_parent(tsk);
+	exit_task_namespaces(tsk);
 
 	write_lock_irq(&tasklist_lock);
 	/*
@@ -984,7 +985,6 @@ fastcall NORET_TYPE void do_exit(long code)
 		module_put(tsk->binfmt->module);
 
 	proc_exit_connector(tsk);
-	exit_task_namespaces(tsk);
 	exit_notify(tsk);
 #ifdef CONFIG_NUMA
 	mpol_free(tsk->mempolicy);
-- 
GitLab