    kernfs: do not account ino_ida allocations to memcg · 499611ed
    Vladimir Davydov authored
    root->ino_ida is used for kernfs inode number allocations. Since IDA has
    a layered structure, different IDs can reside on the same layer, and each
    layer is currently accounted to some memory cgroup. The problem is that
    each kmem cache of a memory cgroup has its own directory on sysfs (under
    /sys/kernel/slab/<cache-name>/cgroup). If the inode number of such a
    directory or any file in it gets allocated from a layer accounted to the
    cgroup which the cache is created for, the cgroup will get pinned for
    good, because one has to free all kmem allocations accounted to a cgroup
    in order to release it and destroy all its kmem caches. Therefore, we
    must not account layers of ino_ida to any memory cgroup.
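The circular dependency described above can be illustrated with a small user-space toy model. This is only a sketch under stated assumptions, not kernel code: every name in it (toy_ida, toy_layer, toy_cgroup, toy_scenario, and the tiny IDS_PER_LAYER value) is hypothetical and does not correspond to the real IDA or memcg implementation. It models one rule from the text: a cgroup can be released only once nothing is charged to it, while the inode number of its cache directory is freed only as part of that release.

```c
#include <assert.h>
#include <stdbool.h>
#include <stddef.h>
#include <string.h>

#define IDS_PER_LAYER 4   /* toy value, chosen only for illustration */
#define MAX_LAYERS    8

struct toy_cgroup {
	int charged_layers;       /* layers currently accounted to this cgroup */
};

struct toy_layer {
	bool present;             /* layer memory is allocated */
	unsigned int bitmap;      /* one bit per ID in use */
	struct toy_cgroup *owner; /* cgroup charged for the layer, or NULL */
};

struct toy_ida {
	bool account;             /* false models "do not account layers" */
	struct toy_layer layers[MAX_LAYERS];
};

/* Allocate the lowest free ID; a new layer is charged to `cur` if accounting is on. */
int toy_ida_get(struct toy_ida *ida, struct toy_cgroup *cur)
{
	for (int l = 0; l < MAX_LAYERS; l++) {
		struct toy_layer *layer = &ida->layers[l];

		if (!layer->present) {
			layer->present = true;
			layer->bitmap = 0;
			layer->owner = ida->account ? cur : NULL;
			if (layer->owner)
				layer->owner->charged_layers++;
		}
		for (int bit = 0; bit < IDS_PER_LAYER; bit++) {
			if (!(layer->bitmap & (1u << bit))) {
				layer->bitmap |= 1u << bit;
				return l * IDS_PER_LAYER + bit;
			}
		}
	}
	return -1; /* no free ID left */
}

/* Free an ID; the layer is freed and uncharged once it holds no IDs. */
void toy_ida_put(struct toy_ida *ida, int id)
{
	assert(id >= 0 && id < IDS_PER_LAYER * MAX_LAYERS);

	struct toy_layer *layer = &ida->layers[id / IDS_PER_LAYER];

	layer->bitmap &= ~(1u << (id % IDS_PER_LAYER));
	if (layer->bitmap == 0) {
		if (layer->owner)
			layer->owner->charged_layers--;
		layer->owner = NULL;
		layer->present = false;
	}
}

/*
 * Release succeeds only if nothing is charged to the cgroup. The cache
 * directory's inode number is freed only as part of a successful release.
 */
bool toy_cgroup_try_release(struct toy_ida *ida, struct toy_cgroup *cg,
			    int cache_dir_ino)
{
	if (cg->charged_layers != 0)
		return false; /* pinned by the layer holding its own inode number */
	toy_ida_put(ida, cache_dir_ino);
	return true;
}

/* Returns true if the cgroup can be released after creating its cache directory. */
bool toy_scenario(bool account)
{
	struct toy_ida ida;
	struct toy_cgroup cg = { 0 };

	memset(&ida, 0, sizeof(ida));
	ida.account = account;

	int ino = toy_ida_get(&ida, &cg); /* inode number for the cache directory */

	return toy_cgroup_try_release(&ida, &cg, ino);
}
```

In this model, toy_scenario(true) leaves the cgroup unreleasable because the layer holding its own directory's inode number is charged to it, whereas toy_scenario(false) lets the release proceed.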
    Since per-net init operations may create new sysfs entries directly
    (e.g. the lo device) or indirectly (nf_conntrack creates a new kmem cache
    for each namespace, which, in turn, creates new sysfs entries), an easy
    way to reproduce this issue is to create network namespace(s) from
    inside a kmem-active memory cgroup.
    Signed-off-by: Vladimir Davydov <vdavydov@parallels.com>
    Acked-by: Tejun Heo <tj@kernel.org>
    Cc: Johannes Weiner <hannes@cmpxchg.org>
    Cc: Michal Hocko <mhocko@suse.cz>
    Cc: Christoph Lameter <cl@linux.com>
    Cc: Pekka Enberg <penberg@kernel.org>
    Cc: David Rientjes <rientjes@google.com>
    Cc: Joonsoo Kim <iamjoonsoo.kim@lge.com>
    Cc: Greg Thelen <gthelen@google.com>
    Cc: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
    Cc: <stable@vger.kernel.org>	[4.0.x]
    Signed-off-by: Andrew Morton <akpm@linux-foundation.org>
    Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>