    mm: filemap: don't plant shadow entries without radix tree node · 7d5d3b13
    Johannes Weiner authored
    commit d3798ae8c6f3767c726403c2ca6ecc317752c9dd upstream.
    When the underflow checks were added to workingset_node_shadow_dec(),
    they triggered immediately:
      kernel BUG at ./include/linux/swap.h:276!
      invalid opcode: 0000 [#1] SMP
      Modules linked in:
      CPU: 0 PID: 20929 Comm: blkid Not tainted 4.8.0-rc8-00087-gbe67d60b #1
      Hardware name: System manufacturer System Product Name/Z170-K, BIOS 1803 05/06/2016
      task: ffff8faa93ecd940 task.stack: ffff8faa7f478000
      RIP: page_cache_tree_insert+0xf1/0x100
      Call Trace:
      Code: 03 00 48 8b 5d d8 65 48 33 1c 25 28 00 00 00 44 89 e8 75 19 48 83 c4 18 5b 41 5c 41 5d 41 5e 5d c3 0f 0b 41 bd ef ff ff ff eb d7 <0f> 0b e8 88 68 ef ff 0f 1f 84 00
      RIP  page_cache_tree_insert+0xf1/0x100
    This is a long-standing bug in the way shadow entries are accounted in
    the radix tree nodes. The shrinker needs to know when radix tree nodes
    contain only shadow entries, no pages, so node->count is split in half
    to count shadows in the upper bits and pages in the lower bits.
    Unfortunately, the radix tree implementation doesn't know of this and
    assumes all entries are in node->count. When there is a shadow entry
    directly in root->rnode and the tree is later extended, the radix tree
    implementation will copy that entry into the new node and bump its
    node->count, i.e. increase the page count bits. Once the shadow gets
    removed and we subtract from the upper counter, node->count underflows
    and triggers the warning. Afterwards, without node->count reaching 0
    again, the radix tree node is leaked.
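
    The accounting mismatch described above can be replayed in a small
    user-space model. This is only a sketch: the model_* names and the
    shift value of 7 are assumptions for illustration, not the kernel's
    definitions.

```c
#include <stdbool.h>

/* Hypothetical model: pages in the low bits, shadows in the high bits.
 * The shift value is an assumption chosen for illustration. */
#define MODEL_COUNT_SHIFT 7u
#define MODEL_COUNT_MASK  ((1u << MODEL_COUNT_SHIFT) - 1u)

struct model_node {
	unsigned int count;
};

static inline unsigned int model_pages(const struct model_node *n)
{
	return n->count & MODEL_COUNT_MASK;
}

static inline unsigned int model_shadows(const struct model_node *n)
{
	return n->count >> MODEL_COUNT_SHIFT;
}

/* Proper accounting of a shadow entry: bump the upper counter. */
static inline void model_shadow_inc(struct model_node *n)
{
	n->count += 1u << MODEL_COUNT_SHIFT;
}

/* Generic tree code copying an entry into a new node: it knows nothing
 * about the split and simply bumps count, i.e. the page bits. */
static inline void model_tree_extend(struct model_node *n)
{
	n->count++;
}

/* Checked decrement: returns false where the real code would hit the
 * underflow check, and leaves count untouched in that case. */
static inline bool model_shadow_dec(struct model_node *n)
{
	if (model_shadows(n) == 0)
		return false;
	n->count -= 1u << MODEL_COUNT_SHIFT;
	return true;
}
```

    In this model, calling model_tree_extend() on an empty node and then
    model_shadow_dec() fails the check, and count stays at 1, which
    mirrors the node never reaching 0 and being leaked.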
    Limit shadow entries to when we have actual radix tree nodes and can
    count them properly. That means we lose the ability to detect refaults
    from files that had only the first page faulted in at eviction time.
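
    The rule can be sketched as a single decision at eviction time. The
    helper below is hypothetical and only illustrates the idea; it is not
    the code changed by this patch.

```c
#include <stdbool.h>
#include <stddef.h>

/* Hypothetical helper: pick what to leave in the slot when a page is
 * evicted. A shadow entry is only kept if a radix tree node exists to
 * account for it; an entry that would sit directly in the root is
 * dropped instead. */
static inline void *model_eviction_entry(bool have_node, void *shadow)
{
	return have_node ? shadow : NULL;
}
```

    With have_node false, which is the single-page file case, the slot
    is simply cleared and no refault information is retained.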
    Fixes: 449dd698 ("mm: keep page cache radix tree nodes in check")
    Signed-off-by: Johannes Weiner <hannes@cmpxchg.org>
    Reported-and-tested-by: Linus Torvalds <torvalds@linux-foundation.org>
    Reviewed-by: Jan Kara <jack@suse.cz>
    Cc: Andrew Morton <akpm@linux-foundation.org>
    Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>
    Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
