1. 02 Aug, 2016 1 commit
  2. 23 Jun, 2016 1 commit
  3. 13 Dec, 2014 1 commit
    • ipc/msg: increase MSGMNI, remove scaling · 0050ee05
      Manfred Spraul authored
      SysV can be abused to allocate locked kernel memory.  For most systems, a
      small limit doesn't make sense; see the discussion regarding SHMMAX.
      
      Therefore: increase MSGMNI to the maximum supported.
      
      And: If we ignore the risk of locking too much memory, then an automatic
      scaling of MSGMNI doesn't make sense.  Therefore the logic can be removed.
      
      The code preserves auto_msgmni to avoid breaking any user space applications
      that expect that the value exists.
      
      Notes:
      1) If an administrator must limit the memory allocations, then he can set
      MSGMNI as necessary.
      
      Or he can disable sysv entirely (as e.g. done by Android).
      
      2) MSGMAX and MSGMNB are intentionally not increased, as these values are used
      to control latency vs. throughput:
      If MSGMNB is large, then msgsnd() just returns and more messages can be queued
      before a task switch to a task that calls msgrcv() is forced.
      
      [akpm@linux-foundation.org: coding-style fixes]
      Signed-off-by: Manfred Spraul <manfred@colorfullife.com>
      Cc: Davidlohr Bueso <dave@stgolabs.net>
      Cc: Rafael Aquini <aquini@redhat.com>
      Signed-off-by: Andrew Morton <akpm@linux-foundation.org>
      Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>
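A minimal sketch, assuming a Linux system with glibc, of how user space could read the limits this commit discusses (MSGMNI, MSGMAX, MSGMNB) for the caller's IPC namespace. `read_msg_limits` is a hypothetical helper name; it wraps the Linux-specific `msgctl(IPC_INFO)` call.

```c
#define _GNU_SOURCE
#include <sys/types.h>
#include <sys/ipc.h>
#include <sys/msg.h>

/*
 * Hypothetical helper: fill *out with the message queue limits of the
 * caller's IPC namespace. Returns 0 on success, -1 if the call fails
 * (for example, when SysV IPC is not available).
 */
int read_msg_limits(struct msginfo *out)
{
	/* IPC_INFO is Linux-specific; the msqid argument is ignored. */
	if (msgctl(0, IPC_INFO, (struct msqid_ds *)out) == -1)
		return -1;
	return 0;
}
```

On success, `out->msgmni`, `out->msgmax` and `out->msgmnb` hold the three limits, so an administrator's tool could compare them against what it wrote to the corresponding sysctls.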
  4. 04 Dec, 2014 5 commits
  5. 29 Jul, 2014 1 commit
    • namespaces: Use task_lock and not rcu to protect nsproxy · 728dba3a
      Eric W. Biederman authored
      The synchronous synchronize_rcu in switch_task_namespaces makes setns
      a sufficiently expensive system call that people have complained.
      
      Upon inspection, nsproxy no longer needs rcu protection for remote reads.
      Remote reads are rare.  So optimize for same-process reads and writes
      by switching to task_lock instead.
      
      This yields a simpler to understand lock, and a faster setns system call.
      
      In particular this fixes a performance regression observed
      by Rafael David Tinoco <rafael.tinoco@canonical.com>.
      
      This is effectively a revert of Pavel Emelyanov's commit
      cf7b708c Make access to task's nsproxy lighter
      from 2007.  The race this originally fixed no longer exists as
      do_notify_parent uses task_active_pid_ns(parent) instead of
      parent->nsproxy.
      Signed-off-by: "Eric W. Biederman" <ebiederm@xmission.com>
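The locking scheme described in this commit can be sketched in user space as follows. This is an illustration under stated assumptions, not the kernel code: the `*_sketch` types and function names are hypothetical, a pthread mutex stands in for task_lock, and C11 atomics stand in for the kernel's reference count.

```c
#include <pthread.h>
#include <stdatomic.h>
#include <stdlib.h>

/* Hypothetical stand-ins for struct nsproxy and struct task_struct. */
struct nsproxy_sketch {
	atomic_int count;	/* reference count */
	int id;			/* payload, only for illustration */
};

struct task_sketch {
	pthread_mutex_t lock;	/* plays the role of task_lock() */
	struct nsproxy_sketch *nsproxy;
};

struct nsproxy_sketch *nsproxy_alloc(int id)
{
	struct nsproxy_sketch *ns = malloc(sizeof(*ns));

	if (!ns)
		return NULL;
	atomic_init(&ns->count, 1);
	ns->id = id;
	return ns;
}

void nsproxy_put(struct nsproxy_sketch *ns)
{
	/* Free on the transition from 1 to 0. */
	if (atomic_fetch_sub(&ns->count, 1) == 1)
		free(ns);
}

/* Remote read: pin the nsproxy while holding the owner's lock. */
struct nsproxy_sketch *nsproxy_get_remote(struct task_sketch *t)
{
	struct nsproxy_sketch *ns;

	pthread_mutex_lock(&t->lock);
	ns = t->nsproxy;
	if (ns)
		atomic_fetch_add(&ns->count, 1);
	pthread_mutex_unlock(&t->lock);
	return ns;
}

/* Writer: swap under the lock, then drop the old reference at once. */
void switch_namespaces_sketch(struct task_sketch *t,
			      struct nsproxy_sketch *new_ns)
{
	struct nsproxy_sketch *old;

	pthread_mutex_lock(&t->lock);
	old = t->nsproxy;
	t->nsproxy = new_ns;
	pthread_mutex_unlock(&t->lock);

	if (old)
		nsproxy_put(old);	/* no grace period to wait for */
}
```

Because a remote reader takes its reference while holding the lock, the writer never has to wait for readers to finish before releasing the old object, which is the cost the commit removes from setns.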
  6. 11 Sep, 2013 2 commits
  7. 31 Aug, 2013 1 commit
  8. 01 May, 2013 1 commit
  9. 14 Dec, 2012 1 commit
    • userns: Require CAP_SYS_ADMIN for most uses of setns. · 5e4a0847
      Eric W. Biederman authored
      Andy Lutomirski <luto@amacapital.net> found a nasty little bug in
      the permissions of setns.  With unprivileged user namespaces it
      became possible to create new namespaces without privilege.
      
      However the setns calls were relaxed to only require CAP_SYS_ADMIN in
      the user namespace of the target namespace.
      
      Which made the following nasty sequence possible.
      
      pid = clone(CLONE_NEWUSER | CLONE_NEWNS);
      if (pid == 0) { /* child */
      	system("mount --bind /home/me/passwd /etc/passwd");
      }
      else if (pid != 0) { /* parent */
      	char path[PATH_MAX];
      	snprintf(path, sizeof(path), "/proc/%d/ns/mnt", pid);
      	fd = open(path, O_RDONLY);
      	setns(fd, 0);
      	system("su -");
      }
      
      Prevent this possibility by requiring CAP_SYS_ADMIN
      in the current user namespace when joining all but the user namespace.
      Acked-by: Serge Hallyn <serge.hallyn@canonical.com>
      Signed-off-by: "Eric W. Biederman" <ebiederm@xmission.com>
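The permission rule before and after this fix can be modelled with two predicates. This is a simplified sketch, not the kernel's capability code: `struct setns_caller` and both function names are hypothetical, and the two booleans stand for the outcome of the respective CAP_SYS_ADMIN checks.

```c
#include <stdbool.h>

/* Hypothetical model of the caller's capabilities; not a kernel API. */
struct setns_caller {
	bool admin_in_target_userns;	/* CAP_SYS_ADMIN in the user namespace
					   that owns the target namespace */
	bool admin_in_current_userns;	/* CAP_SYS_ADMIN in the caller's own
					   user namespace */
};

/* Rule before the fix: only the target's user namespace is consulted. */
bool may_setns_before(const struct setns_caller *c)
{
	return c->admin_in_target_userns;
}

/* Rule after the fix, for every namespace type except the user namespace. */
bool may_setns_after(const struct setns_caller *c)
{
	return c->admin_in_target_userns && c->admin_in_current_userns;
}
```

In the sequence shown in the commit message, the unprivileged parent is privileged over the child's new user namespace but not in its own, so the first predicate admits it and the second rejects it.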
  10. 20 Nov, 2012 3 commits
  11. 07 Apr, 2012 1 commit
  12. 10 May, 2011 1 commit
  13. 25 Mar, 2011 1 commit
  14. 23 Mar, 2011 2 commits
  15. 18 Jun, 2009 3 commits
  16. 07 Apr, 2009 2 commits
    • namespaces: ipc namespaces: implement support for posix msqueues · 7eafd7c7
      Serge E. Hallyn authored
      Implement multiple mounts of the mqueue file system, and link it to usage
      of CLONE_NEWIPC.
      
      Each ipc ns has a corresponding mqueuefs superblock.  When a user does
      clone(CLONE_NEWIPC) or unshare(CLONE_NEWIPC), the unshare will cause an
      internal mount of a new mqueuefs sb linked to the new ipc ns.
      
      When a user does 'mount -t mqueue mqueue /dev/mqueue', he mounts the
      mqueuefs superblock.
      
      Posix message queues can be worked with both through the mq_* system calls
      (see mq_overview(7)), and through the VFS through the mqueue mount.  Any
      usage of mq_open() and friends will work with the acting task's ipc
      namespace.  Any actions through the VFS will work with the mqueuefs in
      which the file was created.  So if a user doesn't remount mqueuefs after
      unshare(CLONE_NEWIPC), mq_open("/ab") will not be reflected in "ls
      /dev/mqueue".
      
      If task a mounts mqueue for ipc_ns:1, then clones task b with a new ipcns,
      ipcns:2, and then task a is the last task in ipc_ns:1 to exit, then (1)
      ipc_ns:1 will be freed, (2) its superblock will live on until task b
      umounts the corresponding mqueuefs, and vfs actions will continue to
      succeed, but (3) sb->s_fs_info will be NULL for the sb corresponding to
      the deceased ipc_ns:1.
      
      To make this happen, we must protect the ipc reference count when
      
      a) a task exits and drops its ipcns->count, since it might be dropping
         it to 0 and freeing the ipcns
      
      b) a task accesses the ipcns through its mqueuefs interface, since it
         bumps the ipcns refcount and might race with the last task in the ipcns
         exiting.
      
      So the kref is changed to an atomic_t so we can use
      atomic_dec_and_lock(&ns->count,mq_lock), and every access to the ipcns
      through ns = mqueuefs_sb->s_fs_info is protected by the same lock.
      Signed-off-by: Cedric Le Goater <clg@fr.ibm.com>
      Signed-off-by: Serge E. Hallyn <serue@us.ibm.com>
      Cc: Alexey Dobriyan <adobriyan@gmail.com>
      Signed-off-by: Andrew Morton <akpm@linux-foundation.org>
      Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>
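The reference-count protection described in points a) and b) of this commit can be sketched in user space like this. It is an analogue under stated assumptions, not the kernel implementation: all `*_sketch` names are hypothetical, a pthread mutex stands in for mq_lock, and `dec_and_lock_sketch` imitates the behaviour of atomic_dec_and_lock with C11 atomics.

```c
#include <pthread.h>
#include <stdatomic.h>
#include <stdlib.h>

struct mq_ns_sketch {
	atomic_int count;		/* references held by tasks and lookups */
};

struct mq_sb_sketch {
	struct mq_ns_sketch *s_fs_info;	/* NULL once the namespace is gone */
};

pthread_mutex_t mq_lock_sketch = PTHREAD_MUTEX_INITIALIZER;

/*
 * Decrement *count. Returns 1 with the lock held if the count reached
 * zero, otherwise returns 0 with the lock not held.
 */
int dec_and_lock_sketch(atomic_int *count, pthread_mutex_t *lock)
{
	int old = atomic_load(count);

	/* Fast path: not the last reference, so no lock is needed. */
	while (old > 1) {
		if (atomic_compare_exchange_weak(count, &old, old - 1))
			return 0;
	}

	/* Possibly the last reference: decide under the lock. */
	pthread_mutex_lock(lock);
	if (atomic_fetch_sub(count, 1) == 1)
		return 1;
	pthread_mutex_unlock(lock);
	return 0;
}

/* VFS-side access: pin the namespace behind a superblock, if still alive. */
struct mq_ns_sketch *get_ns_from_sb(struct mq_sb_sketch *sb)
{
	struct mq_ns_sketch *ns;

	pthread_mutex_lock(&mq_lock_sketch);
	ns = sb->s_fs_info;
	if (ns)
		atomic_fetch_add(&ns->count, 1);
	pthread_mutex_unlock(&mq_lock_sketch);
	return ns;
}

/* Drop a reference; the last one detaches the superblock and frees. */
void put_ns(struct mq_ns_sketch *ns, struct mq_sb_sketch *sb)
{
	if (dec_and_lock_sketch(&ns->count, &mq_lock_sketch)) {
		sb->s_fs_info = NULL;
		pthread_mutex_unlock(&mq_lock_sketch);
		free(ns);
	}
}
```

The drop to zero and the clearing of `s_fs_info` happen under the same lock that lookups take, so a lookup either finds a live namespace and pins it, or finds NULL.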
    • namespaces: mqueue ns: move mqueue_mnt into struct ipc_namespace · 614b84cf
      Serge E. Hallyn authored
      Move mqueue vfsmount plus a few tunables into the ipc_namespace struct.
      The CONFIG_IPC_NS boolean and the ipc_namespace struct will serve both the
      posix message queue namespaces and the SYSV ipc namespaces.
      
      The sysctl code will be fixed separately in patch 3.  After just this
      patch, making a change to posix mqueue tunables always changes the values
      in the initial ipc namespace.
      Signed-off-by: Cedric Le Goater <clg@fr.ibm.com>
      Signed-off-by: Serge E. Hallyn <serue@us.ibm.com>
      Cc: Alexey Dobriyan <adobriyan@gmail.com>
      Signed-off-by: Andrew Morton <akpm@linux-foundation.org>
      Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>
  17. 29 Apr, 2008 3 commits
    • ipc: recompute msgmni on ipc namespace creation/removal · e2c284d8
      Nadia Derbey authored
      Introduce a notification mechanism that aims at recomputing msgmni each time
      an ipc namespace is created or removed.
      
      The ipc namespace notifier chain already defined for memory hotplug management
      is used for that purpose too.
      
      Each time a new ipc namespace is allocated or an existing ipc namespace is
      removed, the ipcns notifier chain is notified.  The callback routine for each
      registered ipc namespace is then activated in order to recompute msgmni for
      that namespace.
      Signed-off-by: Nadia Derbey <Nadia.Derbey@bull.net>
      Cc: Yasunori Goto <y-goto@jp.fujitsu.com>
      Cc: Matt Helsley <matthltc@us.ibm.com>
      Cc: Mingming Cao <cmm@us.ibm.com>
      Cc: Pierre Peiffer <pierre.peiffer@bull.net>
      Signed-off-by: Andrew Morton <akpm@linux-foundation.org>
      Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>
    • ipc: recompute msgmni on memory add / remove · b6b337ad
      Nadia Derbey authored
      Introduce the registration of a callback routine that recomputes msg_ctlmni
      upon memory add / remove.
      
      A single notifier block is registered in the hotplug memory chain for all the
      ipc namespaces.
      
      Since the ipc namespaces are not linked together, they have their own
      notification chain: one notifier_block is defined per ipc namespace.
      
      Each time an ipc namespace is created (removed) it registers (unregisters) its
      notifier block in (from) the ipcns chain.  The callback routine registered in
      the memory chain invokes the ipcns notifier chain with the IPCNS_LOWMEM event.
      Each callback routine registered in the ipcns chain, in turn, recomputes
      msgmni for the owning namespace.
      Signed-off-by: Nadia Derbey <Nadia.Derbey@bull.net>
      Cc: Yasunori Goto <y-goto@jp.fujitsu.com>
      Cc: Matt Helsley <matthltc@us.ibm.com>
      Cc: Mingming Cao <cmm@us.ibm.com>
      Cc: Pierre Peiffer <pierre.peiffer@bull.net>
      Signed-off-by: Andrew Morton <akpm@linux-foundation.org>
      Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>
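The two-level notification described in this commit (one memory callback fanning out to one notifier block per ipc namespace) can be sketched as a simple linked-list chain. All names ending in `_sketch`, the event value, and the `recomputed` counter are hypothetical; the kernel's own notifier API is not used here.

```c
#include <stddef.h>

#define IPCNS_LOWMEM_SKETCH 1UL	/* hypothetical event value */

struct notifier_sketch {
	void (*call)(struct notifier_sketch *nb, unsigned long event);
	struct notifier_sketch *next;
};

struct chain_sketch {
	struct notifier_sketch *head;
};

struct ns_sketch {
	struct notifier_sketch nb;	/* one block per namespace */
	unsigned int recomputed;	/* counts msgmni recomputations */
};

/* Called when a namespace is created. */
void chain_register(struct chain_sketch *c, struct notifier_sketch *nb)
{
	nb->next = c->head;
	c->head = nb;
}

/* Called when a namespace is removed. */
void chain_unregister(struct chain_sketch *c, struct notifier_sketch *nb)
{
	struct notifier_sketch **pp;

	for (pp = &c->head; *pp; pp = &(*pp)->next) {
		if (*pp == nb) {
			*pp = nb->next;
			return;
		}
	}
}

/* Invoke every registered callback; returns how many were called. */
int chain_call(struct chain_sketch *c, unsigned long event)
{
	struct notifier_sketch *nb;
	int n = 0;

	for (nb = c->head; nb; nb = nb->next, n++)
		nb->call(nb, event);
	return n;
}

/* Per-namespace callback: recover the owning namespace and recompute. */
void ipcns_callback(struct notifier_sketch *nb, unsigned long event)
{
	struct ns_sketch *ns =
		(struct ns_sketch *)((char *)nb - offsetof(struct ns_sketch, nb));

	if (event == IPCNS_LOWMEM_SKETCH)
		ns->recomputed++;	/* stands in for recomputing msgmni */
}

/* What the single memory-hotplug callback would do. */
int memory_changed(struct chain_sketch *ipcns_chain)
{
	return chain_call(ipcns_chain, IPCNS_LOWMEM_SKETCH);
}
```

Keeping a separate chain for the namespaces means the memory hotplug chain holds only one entry regardless of how many ipc namespaces exist.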
    • ipc: scale msgmni to the number of ipc namespaces · 4d89dc6a
      Nadia Derbey authored
      Since all the namespaces see the same amount of memory (the total one) this
      patch introduces a new variable that counts the ipc namespaces and divides
      msg_ctlmni by this counter.
      Signed-off-by: Nadia Derbey <Nadia.Derbey@bull.net>
      Cc: Yasunori Goto <y-goto@jp.fujitsu.com>
      Cc: Matt Helsley <matthltc@us.ibm.com>
      Cc: Mingming Cao <cmm@us.ibm.com>
      Cc: Pierre Peiffer <pierre.peiffer@bull.net>
      Signed-off-by: Andrew Morton <akpm@linux-foundation.org>
      Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>
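A rough sketch of what dividing a memory-derived msgmni by the number of ipc namespaces could look like. The commit message only states that msg_ctlmni is divided by the namespace counter; the constants, the memory-based starting value, and the clamping below are illustrative assumptions, and `scaled_msgmni` is a hypothetical name.

```c
/* Illustrative constants; assumptions, not taken from the commit message. */
#define MSGMNI_FLOOR_SKETCH	16UL	/* lower bound on msgmni */
#define IPCMNI_CAP_SKETCH	32768UL	/* upper bound on ids per IPC type */
#define MSGMNB_SKETCH		16384UL	/* max bytes per queue */
#define MSG_MEM_SCALE_SKETCH	32UL	/* queues may use 1/32 of memory */

unsigned long scaled_msgmni(unsigned long long mem_bytes,
			    unsigned long nr_ipc_ns)
{
	unsigned long long allowed;

	if (nr_ipc_ns == 0)
		nr_ipc_ns = 1;	/* guard against division by zero */

	/* Number of full-size queues that fit in the memory share... */
	allowed = mem_bytes / MSG_MEM_SCALE_SKETCH / MSGMNB_SKETCH;
	/* ...split evenly across all ipc namespaces. */
	allowed /= nr_ipc_ns;

	if (allowed < MSGMNI_FLOOR_SKETCH)
		return MSGMNI_FLOOR_SKETCH;
	if (allowed > IPCMNI_CAP_SKETCH / nr_ipc_ns)
		return IPCMNI_CAP_SKETCH / nr_ipc_ns;
	return (unsigned long)allowed;
}
```

Under these assumed constants, 1 GiB of memory yields 2048 queues for a single namespace and 1024 each for two, which shows why every namespace creation or removal changes the value for all the others.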
  18. 08 Feb, 2008 3 commits
    • IPC: consolidate sem_exit_ns(), msg_exit_ns() and shm_exit_ns() · 01b8b07a
      Pierre Peiffer authored
      sem_exit_ns(), msg_exit_ns() and shm_exit_ns() are all called when an
      ipc_namespace is released to free all ipcs of each type.  But in fact, they
      do the same thing: they loop around all ipcs to free them individually by
      calling a specific routine.
      
      This patch proposes to consolidate this by introducing a common function,
      free_ipcs(), that does the job.  The specific routine to call on each
      individual ipcs is passed as parameter.  For this, these ipc-specific
      'free' routines are reworked to take a generic 'struct ipc_perm' as
      parameter.
      Signed-off-by: Pierre Peiffer <pierre.peiffer@bull.net>
      Cc: Cedric Le Goater <clg@fr.ibm.com>
      Cc: Pavel Emelyanov <xemul@openvz.org>
      Cc: Nadia Derbey <Nadia.Derbey@bull.net>
      Signed-off-by: Andrew Morton <akpm@linux-foundation.org>
      Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>
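The consolidation described in this commit, one loop plus a per-type callback that takes the generic permission structure, might look roughly like the following. The `*_sketch` types, the array-based id table, and the return value are hypothetical simplifications chosen for illustration.

```c
#include <stddef.h>
#include <stdlib.h>

/* Hypothetical stand-in for the generic permission header. */
struct ipc_perm_sketch {
	int id;
};

/* Type-specific objects embed the generic header as their first member. */
struct msg_queue_sketch {
	struct ipc_perm_sketch perm;
	unsigned long qbytes;
};

struct ids_sketch {
	struct ipc_perm_sketch **entries;
	size_t nr;
};

/* One loop for all three IPC types; free_one is the per-type routine. */
size_t free_ipcs_sketch(struct ids_sketch *ids,
			void (*free_one)(struct ipc_perm_sketch *))
{
	size_t i, freed = 0;

	for (i = 0; i < ids->nr; i++) {
		if (!ids->entries[i])
			continue;
		free_one(ids->entries[i]);
		ids->entries[i] = NULL;
		freed++;
	}
	return freed;
}

/* Per-type routine, reworked to accept the generic header. */
void freeque_sketch(struct ipc_perm_sketch *perm)
{
	/* Valid because perm is the first member of msg_queue_sketch. */
	struct msg_queue_sketch *q = (struct msg_queue_sketch *)perm;

	free(q);
}
```

A semaphore or shared-memory variant would only need its own `free_one` routine; the loop itself stays shared.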
    • IPC: make struct ipc_ids static in ipc_namespace · ed2ddbf8
      Pierre Peiffer authored
      Each ipc_namespace contains a table of 3 pointers to struct ipc_ids (one
      each for msg, sem and shm; the structure is used to store all ipcs).  These
      'struct ipc_ids' are dynamically allocated for each ipc_namespace, as is the
      ipc_namespace itself (for the init namespace, they are initialized with
      pointers to static variables instead).
      
      This is so for historical reasons: before idr was used to store the ipcs,
      they were stored in tables of variable length, depending on the maximum
      number of ipcs allowed.  Now, these 'struct ipc_ids' have a fixed size.  As
      they are allocated in any case for each new ipc_namespace, no memory is
      saved by allocating them separately from the struct ipc_namespace.
      
      This patch proposes to make this table static in the struct ipc_namespace.
      Thus, we can allocate everything at once and get rid of all the code needed
      to allocate and free these ipc_ids separately.
      Signed-off-by: Pierre Peiffer <pierre.peiffer@bull.net>
      Acked-by: Cedric Le Goater <clg@fr.ibm.com>
      Cc: Pavel Emelyanov <xemul@openvz.org>
      Cc: Nadia Derbey <Nadia.Derbey@bull.net>
      Signed-off-by: Andrew Morton <akpm@linux-foundation.org>
      Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>
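The layout change in this commit can be illustrated by contrasting the two structure shapes. The structure contents, the index macros, and `ns_alloc_embedded` are hypothetical placeholders; only the pointer-table versus embedded-table distinction comes from the commit message.

```c
#include <stdlib.h>

#define IPC_SEM_IDS_SKETCH 0
#define IPC_MSG_IDS_SKETCH 1
#define IPC_SHM_IDS_SKETCH 2

struct ipc_ids_sketch {
	int in_use;	/* fixed-size bookkeeping; real fields omitted */
};

/* Before: three pointers, each needing its own allocation and free. */
struct ns_with_pointers {
	struct ipc_ids_sketch *ids[3];
};

/* After: the table lives inside the namespace structure. */
struct ns_with_embedded {
	struct ipc_ids_sketch ids[3];
};

struct ns_with_embedded *ns_alloc_embedded(void)
{
	/* One zeroing allocation covers the namespace and all three tables. */
	return calloc(1, sizeof(struct ns_with_embedded));
}
```

With the embedded form, a single `free()` of the namespace releases everything, and the error paths for three separate allocations disappear.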
    • namespaces: move the IPC namespace under IPC_NS option · ae5e1b22
      Pavel Emelyanov authored
      Currently the IPC namespace management code is spread over the ipc/*.c files.
      I moved this code into ipc/namespace.c file which is compiled out when needed.
      
      The linux/ipc_namespace.h file is used to store the prototypes of the
      functions in namespace.c and the stubs for NAMESPACES=n case.  This is done
      so, because the stub for copy_ipc_namespace requires the knowledge of the
      CLONE_NEWIPC flag, which is in sched.h.  But the linux/ipc.h file itself is
      included in many .c files via the sys.h->sem.h sequence, so adding
      sched.h to it would make all these .c files depend on sched.h, which is not
      good.  On the other hand, knowledge of the namespaces code is required
      in only 4 .c files.
      
      Besides, this patch compiles out some auxiliary functions from ipc/sem.c,
      msg.c and shm.c files.  It turned out that moving these functions into
      namespace.c is not that easy because they use many other calls and macros
      from the original file.  Moving them would make this patch complicated.  On
      the other hand all these functions can be consolidated, so I will send a
      separate patch doing this a bit later.
      Signed-off-by: Pavel Emelyanov <xemul@openvz.org>
      Acked-by: Serge Hallyn <serue@us.ibm.com>
      Cc: Cedric Le Goater <clg@fr.ibm.com>
      Cc: "Eric W. Biederman" <ebiederm@xmission.com>
      Cc: Herbert Poetzl <herbert@13thfloor.at>
      Cc: Kirill Korotaev <dev@sw.ru>
      Cc: Sukadev Bhattiprolu <sukadev@us.ibm.com>
      Signed-off-by: Andrew Morton <akpm@linux-foundation.org>
      Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>
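The header arrangement described in this commit, prototypes when the option is enabled and an inline stub otherwise, can be sketched in user space as below. `CONFIG_IPC_NS_SKETCH`, `struct ipc_ns_sketch` and `copy_ipcs_sketch` are hypothetical names, the fallback value for CLONE_NEWIPC is included only so the example builds without extra feature macros, and `errno` plus a NULL return stand in for whatever error convention the kernel code uses.

```c
#define _GNU_SOURCE
#include <errno.h>
#include <sched.h>	/* CLONE_NEWIPC: the reason the stub needs sched.h */
#include <stddef.h>

#ifndef CLONE_NEWIPC
#define CLONE_NEWIPC 0x08000000	/* fallback if the libc header hides it */
#endif

struct ipc_ns_sketch {
	int id;
};

#ifdef CONFIG_IPC_NS_SKETCH
/* The real implementation would live in a separate namespace.c. */
struct ipc_ns_sketch *copy_ipcs_sketch(unsigned long flags,
				       struct ipc_ns_sketch *ns);
#else
/* Stub: namespaces compiled out, so a new one cannot be requested. */
static inline struct ipc_ns_sketch *copy_ipcs_sketch(unsigned long flags,
						     struct ipc_ns_sketch *ns)
{
	if (flags & CLONE_NEWIPC) {
		errno = EINVAL;
		return NULL;
	}
	return ns;	/* keep sharing the existing namespace */
}
#endif
```

Because the stub tests CLONE_NEWIPC, whichever header holds it must pull in sched.h, which is the motivation the commit gives for keeping it out of the widely included linux/ipc.h.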