    IB/hfi1: Prevent NULL pointer dereferences in caching code · f19bd643
    Mitko Haralanov authored
    
    
    There is a potential kernel crash when the MMU notifier calls the
    invalidation routines in the hfi1 pinned page caching code for sdma.
    
    The invalidation routine can call the remove callback for the
    node, which in turn dereferences the current task_struct to get
    a pointer to the mm_struct. However, that pointer can be NULL,
    for example when the notifier runs in a kernel thread such as
    kswapd, resulting in the following backtrace:
    
        BUG: unable to handle kernel NULL pointer dereference at 00000000000000a8
        IP: [<ffffffffa041f75a>] sdma_rb_remove+0xaa/0x100 [hfi1]
        15
        task: ffff88085e66e080 ti: ffff88085c244000 task.ti: ffff88085c244000
        RIP: 0010:[<ffffffffa041f75a>]  [<ffffffffa041f75a>] sdma_rb_remove+0xaa/0x100 [hfi1]
        RSP: 0000:ffff88085c245878  EFLAGS: 00010002
        RAX: 0000000000000000 RBX: ffff88105b9bbd40 RCX: ffffea003931a830
        RDX: 0000000000000004 RSI: ffff88105754a9c0 RDI: ffff88105754a9c0
        RBP: ffff88085c245890 R08: ffff88105b9bbd70 R09: 00000000fffffffb
        R10: ffff88105b9bbd58 R11: 0000000000000013 R12: ffff88105754a9c0
        R13: 0000000000000001 R14: 0000000000000001 R15: ffff88105b9bbd40
        FS:  0000000000000000(0000) GS:ffff88107ef40000(0000) knlGS:0000000000000000
        CS:  0010 DS: 0000 ES: 0000 CR0: 0000000080050033
        CR2: 00000000000000a8 CR3: 0000000001a0b000 CR4: 00000000001407e0
        Stack:
         ffff88105b9bbd40 ffff88080ec481a8 ffff88080ec481b8 ffff88085c2458c0
         ffffffffa03fa00e ffff88080ec48190 ffff88080ed9cd00 0000000001024000
         0000000000000000 ffff88085c245920 ffffffffa03fa0e7 0000000000000282
        Call Trace:
         [<ffffffffa03fa00e>] __mmu_rb_remove.isra.5+0x5e/0x70 [hfi1]
         [<ffffffffa03fa0e7>] mmu_notifier_mem_invalidate+0xc7/0xf0 [hfi1]
         [<ffffffffa03fa143>] mmu_notifier_page+0x13/0x20 [hfi1]
         [<ffffffff81156dd0>] __mmu_notifier_invalidate_page+0x50/0x70
         [<ffffffff81140bbb>] try_to_unmap_one+0x20b/0x470
         [<ffffffff81141ee7>] try_to_unmap_anon+0xa7/0x120
         [<ffffffff81141fad>] try_to_unmap+0x4d/0x60
         [<ffffffff8111fd7b>] shrink_page_list+0x2eb/0x9d0
         [<ffffffff81120ab3>] shrink_inactive_list+0x243/0x490
         [<ffffffff81121491>] shrink_lruvec+0x4c1/0x640
         [<ffffffff81121641>] shrink_zone+0x31/0x100
         [<ffffffff81121b0f>] kswapd_shrink_zone.constprop.62+0xef/0x1c0
         [<ffffffff811229e3>] kswapd+0x403/0x7e0
         [<ffffffff811225e0>] ? shrink_all_memory+0xf0/0xf0
         [<ffffffff81068ac0>] kthread+0xc0/0xd0
         [<ffffffff81068a00>] ? insert_kthread_work+0x40/0x40
         [<ffffffff814ff8ec>] ret_from_fork+0x7c/0xb0
         [<ffffffff81068a00>] ? insert_kthread_work+0x40/0x40
    
    To correct this, the mm_struct passed to us by the MMU notifier is
    used (which is what should have been done to begin with). This avoids
    the broken dereferences and ensures that the correct mm_struct is used.
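
    The difference between the two approaches can be sketched in plain
    userspace C. This is only an illustration, not the driver code: the
    struct definitions are minimal stand-ins for the kernel types, and
    cache_node, remove_using_current() and remove_using_notifier_mm()
    are hypothetical names. The first helper returns -1 at the point
    where the kernel would have oopsed.

```c
#include <assert.h>
#include <stddef.h>

/* Userspace stand-ins for the kernel types; only the fields needed here. */
struct mm_struct {
    long pinned_vm;              /* count of pinned pages */
};

struct task_struct {
    struct mm_struct *mm;        /* NULL for kernel threads such as kswapd */
};

/* Hypothetical cache node: records how many pages it pinned. */
struct cache_node {
    long npages;
};

/*
 * Pattern being removed: find the mm through whichever task is running.
 * Returns -1 where an unchecked dereference would have crashed.
 */
static int remove_using_current(struct task_struct *running,
                                struct cache_node *node)
{
    struct mm_struct *mm = running->mm;

    if (mm == NULL)
        return -1;               /* illustrative guard, added for this sketch */
    mm->pinned_vm -= node->npages;
    return 0;
}

/*
 * Pattern being adopted: the caller passes the mm it was notified about,
 * so the identity of the running task no longer matters.
 */
static int remove_using_notifier_mm(struct mm_struct *mm,
                                    struct cache_node *node)
{
    mm->pinned_vm -= node->npages;
    return 0;
}
```

    With a task whose mm is NULL, the first helper fails without
    touching the accounting, while the second updates the mm_struct
    it was handed regardless of which task is running.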
    
    Reviewed-by: Dennis Dalessandro <dennis.dalessandro@intel.com>
    Reviewed-by: Dean Luick <dean.luick@intel.com>
    Signed-off-by: Mitko Haralanov <mitko.haralanov@intel.com>
    Signed-off-by: Doug Ledford <dledford@redhat.com>