    mm: make mmap_sem for write waits killable for mm syscalls · dc0ef0df
    Michal Hocko authored
    This is follow-up work for the oom_reaper [1].  As the async OOM killing
    depends on mmap_sem for read, we would really appreciate it if a holder
    for write didn't stand in the way.  This patchset changes many of the
    down_write calls to be killable to help those cases when the writer is
    blocked waiting for readers to release the lock, and so helps
    __oom_reap_task to process the oom victim.
    
    Most of the patches are really trivial because the lock is held from
    shallow syscall paths where we can return EINTR trivially and allow the
    current task to die (note that EINTR will never get to userspace as
    the task has a fatal signal pending).  Others seem to be easy as well, as
    the callers already handle fatal errors, bail out and return to
    userspace, which should be sufficient to handle the failure gracefully.
    I am not familiar with all those code paths so a deeper review is really
    appreciated.
    
    As this work touches several areas which are not directly connected, I
    have tried to keep the CC list as small as possible, and people who I
    believed would be familiar are CCed only on the specific patches (all
    should have received the cover letter though).
    
    This patchset is based on linux-next and it depends on
    down_write_killable for rw_semaphores, which got merged into the tip
    locking/rwsem branch and is merged into this next tree.  I guess it
    would be easiest to route these patches via mmotm because of the
    dependency on the tip tree, but if the respective maintainers prefer
    another way I have no objections.
    
    I haven't covered all the down_write(&mm->mmap_sem) instances here:
    
      $ git grep "down_write(.*\<mmap_sem\>)" next/master | wc -l
      98
      $ git grep "down_write(.*\<mmap_sem\>)" | wc -l
      62
    
    I have tried to cover those which should be relatively easy to review in
    this series because this alone should be a nice improvement.  Other
    places can be changed on top.
    
    [0] http://lkml.kernel.org/r/1456752417-9626-1-git-send-email-mhocko@kernel.org
    [1] http://lkml.kernel.org/r/1452094975-551-1-git-send-email-mhocko@kernel.org
    [2] http://lkml.kernel.org/r/1456750705-7141-1-git-send-email-mhocko@kernel.org
    
    
    
    This patch (of 18):
    
    This is the first step in making mmap_sem write waiters killable.  It
    focuses on the trivial ones, which take the lock early after
    entering the syscall and do not change any state before that.
    
    Therefore it is very easy to change them to use down_write_killable and
    immediately return with -EINTR.  This will allow the waiter to die
    without blocking the mmap_sem, which might be required to make forward
    progress.  E.g. the oom reaper will need the lock for reading to
    dismantle the OOM victim's address space.
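    
    The pattern described above can be sketched in plain userspace C.  This
    is only an illustration, not kernel code: every mock_* type and function
    below is a hypothetical stand-in, and the mock lock merely models "the
    waiter eventually gets the lock unless it is killed while waiting".

```c
#include <errno.h>
#include <stdbool.h>

/* Hypothetical userspace stand-ins; these are NOT the kernel definitions. */
struct mock_rwsem { bool contended; bool write_locked; };
struct mock_mm    { struct mock_rwsem mmap_sem; int map_count; };
struct mock_task  { bool fatal_signal_pending; };

/*
 * Mock of a killable write acquisition: returns 0 with the lock held,
 * or -EINTR if the lock is contended and the waiter has a fatal signal
 * pending.  On -EINTR the lock is NOT held.
 */
static int mock_down_write_killable(struct mock_rwsem *sem,
                                    const struct mock_task *tsk)
{
    if (sem->contended && tsk->fatal_signal_pending)
        return -EINTR;
    sem->write_locked = true;
    return 0;
}

static void mock_up_write(struct mock_rwsem *sem)
{
    sem->write_locked = false;
}

/*
 * Shape of a "trivial" syscall path: the lock is taken first and no
 * state has been changed yet, so bailing out with -EINTR needs no undo.
 */
static long mock_syscall(struct mock_mm *mm, const struct mock_task *tsk)
{
    if (mock_down_write_killable(&mm->mmap_sem, tsk))
        return -EINTR;

    mm->map_count++;            /* stand-in for the work done under the lock */
    mock_up_write(&mm->mmap_sem);
    return 0;
}
```

    Calling mock_syscall() for a killed task on a contended lock returns
    -EINTR and leaves the mock mm untouched; in every other case the work is
    done under the lock and 0 is returned.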
    
    The only tricky function in this patch is vm_mmap_pgoff, which has many
    call sites via vm_mmap.  To reduce the risk, keep vm_mmap with the
    original non-killable semantics for now.
    
    vm_munmap callers do not bother checking the return value, so open code
    it into the munmap syscall path for now, for simplicity.
    
    Signed-off-by: Michal Hocko <mhocko@suse.com>
    Acked-by: Vlastimil Babka <vbabka@suse.cz>
    Cc: Mel Gorman <mgorman@suse.de>
    Cc: "Kirill A. Shutemov" <kirill.shutemov@linux.intel.com>
    Cc: Konstantin Khlebnikov <koct9i@gmail.com>
    Cc: Hugh Dickins <hughd@google.com>
    Cc: Andrea Arcangeli <aarcange@redhat.com>
    Cc: David Rientjes <rientjes@google.com>
    Cc: Dave Hansen <dave.hansen@linux.intel.com>
    Cc: Johannes Weiner <hannes@cmpxchg.org>
    Signed-off-by: Andrew Morton <akpm@linux-foundation.org>
    Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>