1. 01 Sep, 2016 1 commit
  2. 19 May, 2016 2 commits
    • tmpfs: preliminary minor tidyups · 75edd345
      Hugh Dickins authored
      Make a few cleanups in mm/shmem.c, before going on to complicate it.
      
      shmem_alloc_page() will become more complicated: we can't afford to
      have that complication duplicated between a CONFIG_NUMA version and a
      !CONFIG_NUMA version, so rearrange the #ifdef'ery there to yield a
      single shmem_swapin() and a single shmem_alloc_page().
      
      Yes, it's a shame to inflict the horrid pseudo-vma on non-NUMA
      configurations, but eliminating it is a larger cleanup: I have an
      alloc_pages_mpol() patchset not yet ready - mpol handling is subtle and
      bug-prone, and changed yet again since my last version.
      
      Move __SetPageLocked, __SetPageSwapBacked from shmem_getpage_gfp() to
      shmem_alloc_page(): that SwapBacked flag will be useful in future, to
      help to distinguish different cases appropriately.
      
      And the SGP_DIRTY variant of SGP_CACHE is hard to understand and of
      little use (IIRC it dates back to when shmem_getpage() returned the page
      unlocked): kill it and do the necessary in shmem_file_read_iter().
      
      But an arm64 build then complained that info may be uninitialized (where
      shmem_getpage_gfp() deletes a freshly alloced page beyond eof), and
      advancing to an "sgp <= SGP_CACHE" test jogged it back to reality.
      Signed-off-by: Hugh Dickins <hughd@google.com>
      Cc: "Kirill A. Shutemov" <kirill.shutemov@linux.intel.com>
      Cc: Andrea Arcangeli <aarcange@redhat.com>
      Cc: Andres Lagar-Cavilla <andreslc@google.com>
      Cc: Yang Shi <yang.shi@linaro.org>
      Cc: Ning Qu <quning@gmail.com>
      Cc: Mel Gorman <mgorman@techsingularity.net>
      Cc: Konstantin Khlebnikov <koct9i@gmail.com>
      Signed-off-by: Andrew Morton <akpm@linux-foundation.org>
      Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>
    • mm/mempolicy.c: vma_migratable() can return bool · 4ee815be
      Yaowei Bai authored
      Make vma_migratable() return bool due to this particular function only
      using either one or zero as its return value.
      Signed-off-by: Yaowei Bai <baiyaowei@cmss.chinamobile.com>
      Signed-off-by: Andrew Morton <akpm@linux-foundation.org>
      Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>
  3. 14 Jan, 2016 1 commit
    • mm/mempolicy.c: convert the shared_policy lock to a rwlock · 4a8c7bb5
      Nathan Zimmer authored
      When running the SPECint_rate gcc on some very large boxes it was
      noticed that the system was spending lots of time in
      mpol_shared_policy_lookup().  The gamess benchmark can also show it, and
      it is what I mostly used to chase down the issue since I found its setup
      to be easier.
      
      To be clear the binaries were on tmpfs because of disk I/O requirements.
      We then used text replication to avoid icache misses and having all the
      copies banging on the memory where the instruction code resides.  This
      results in us hitting a bottleneck in mpol_shared_policy_lookup() since
      lookup is serialised by the shared_policy lock.
      
      I have only reproduced this on very large (3k+ cores) boxes.  The
      problem starts showing up at just a few hundred ranks getting worse
      until it threatens to livelock once it gets large enough.  For example
      on the gamess benchmark at 128 ranks this area consumes only ~1% of
      time, at 512 ranks it consumes nearly 13%, and at 2k ranks it is over
      90%.
      
      To alleviate the contention in this area I converted the spinlock to an
      rwlock.  This allows a large number of lookups to happen simultaneously.
      The results were quite good, reducing this consumption at max ranks to
      around 2%.
      
      [akpm@linux-foundation.org: tidy up code comments]
      Signed-off-by: Nathan Zimmer <nzimmer@sgi.com>
      Acked-by: David Rientjes <rientjes@google.com>
      Acked-by: Vlastimil Babka <vbabka@suse.cz>
      Cc: Nadia Yvette Chambers <nyc@holomorphy.com>
      Cc: Naoya Horiguchi <n-horiguchi@ah.jp.nec.com>
      Cc: Mel Gorman <mgorman@suse.de>
      Cc: "Aneesh Kumar K.V" <aneesh.kumar@linux.vnet.ibm.com>
      Signed-off-by: Andrew Morton <akpm@linux-foundation.org>
      Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>
  4. 09 Oct, 2014 3 commits
    • mempolicy: unexport get_vma_policy() and remove its "task" arg · dd6eecb9
      Oleg Nesterov authored
      - get_vma_policy(task) is not safe if task != current, remove this
        argument.
      
      - get_vma_policy() no longer has callers outside of mempolicy.c,
        make it static.
      Signed-off-by: Oleg Nesterov <oleg@redhat.com>
      Cc: KAMEZAWA Hiroyuki <kamezawa.hiroyu@jp.fujitsu.com>
      Cc: David Rientjes <rientjes@google.com>
      Cc: KOSAKI Motohiro <kosaki.motohiro@jp.fujitsu.com>
      Cc: Alexander Viro <viro@zeniv.linux.org.uk>
      Cc: Cyrill Gorcunov <gorcunov@openvz.org>
      Cc: "Eric W. Biederman" <ebiederm@xmission.com>
      Cc: "Kirill A. Shutemov" <kirill.shutemov@linux.intel.com>
      Cc: Peter Zijlstra <peterz@infradead.org>
      Cc: Hugh Dickins <hughd@google.com>
      Cc: Andi Kleen <andi@firstfloor.org>
      Cc: Naoya Horiguchi <n-horiguchi@ah.jp.nec.com>
      Signed-off-by: Andrew Morton <akpm@linux-foundation.org>
      Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>
    • mempolicy: introduce __get_vma_policy(), export get_task_policy() · 74d2c3a0
      Oleg Nesterov authored
      Extract the code which looks for vma's policy from get_vma_policy()
      into the new helper, __get_vma_policy(). Export get_task_policy().
      Signed-off-by: Oleg Nesterov <oleg@redhat.com>
      Cc: KAMEZAWA Hiroyuki <kamezawa.hiroyu@jp.fujitsu.com>
      Cc: David Rientjes <rientjes@google.com>
      Cc: KOSAKI Motohiro <kosaki.motohiro@jp.fujitsu.com>
      Cc: Alexander Viro <viro@zeniv.linux.org.uk>
      Cc: Cyrill Gorcunov <gorcunov@openvz.org>
      Cc: "Eric W. Biederman" <ebiederm@xmission.com>
      Cc: "Kirill A. Shutemov" <kirill.shutemov@linux.intel.com>
      Cc: Peter Zijlstra <peterz@infradead.org>
      Cc: Hugh Dickins <hughd@google.com>
      Cc: Andi Kleen <andi@firstfloor.org>
      Cc: Naoya Horiguchi <n-horiguchi@ah.jp.nec.com>
      Signed-off-by: Andrew Morton <akpm@linux-foundation.org>
      Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>
    • mempolicy: remove the "task" arg of vma_policy_mof() and simplify it · 6b6482bb
      Oleg Nesterov authored
      1. vma_policy_mof(task) is simply not safe unless task == current,
         it can race with do_exit()->mpol_put(). Remove this arg and update
         its single caller.
      
      2. vma can not be NULL, remove this check and simplify the code.
      Signed-off-by: Oleg Nesterov <oleg@redhat.com>
      Cc: KAMEZAWA Hiroyuki <kamezawa.hiroyu@jp.fujitsu.com>
      Cc: David Rientjes <rientjes@google.com>
      Cc: KOSAKI Motohiro <kosaki.motohiro@jp.fujitsu.com>
      Cc: Alexander Viro <viro@zeniv.linux.org.uk>
      Cc: Cyrill Gorcunov <gorcunov@openvz.org>
      Cc: "Eric W. Biederman" <ebiederm@xmission.com>
      Cc: "Kirill A. Shutemov" <kirill.shutemov@linux.intel.com>
      Cc: Peter Zijlstra <peterz@infradead.org>
      Cc: Hugh Dickins <hughd@google.com>
      Cc: Andi Kleen <andi@firstfloor.org>
      Cc: Naoya Horiguchi <n-horiguchi@ah.jp.nec.com>
      Signed-off-by: Andrew Morton <akpm@linux-foundation.org>
      Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>
  5. 04 Jun, 2014 1 commit
  6. 07 Apr, 2014 2 commits
    • mm, mempolicy: remove per-process flag · f0432d15
      David Rientjes authored
      PF_MEMPOLICY is an unnecessary optimization for CONFIG_SLAB users.
      There's no significant performance degradation to checking
      current->mempolicy rather than current->flags & PF_MEMPOLICY in the
      allocation path, especially since this is considered unlikely().
      
      Running TCP_RR with netperf-2.4.5 through localhost on 16 cpu machine with
      64GB of memory and without a mempolicy:
      
      	threads		before		after
      	16		1249409		1244487
      	32		1281786		1246783
      	48		1239175		1239138
      	64		1244642		1241841
      	80		1244346		1248918
      	96		1266436		1254316
      	112		1307398		1312135
      	128		1327607		1326502
      
      Per-process flags are a scarce resource so we should free them up whenever
      possible and make them available.  We'll be using it shortly for memcg oom
      reserves.
      Signed-off-by: David Rientjes <rientjes@google.com>
      Cc: Johannes Weiner <hannes@cmpxchg.org>
      Cc: Michal Hocko <mhocko@suse.cz>
      Cc: KAMEZAWA Hiroyuki <kamezawa.hiroyu@jp.fujitsu.com>
      Cc: Christoph Lameter <cl@linux-foundation.org>
      Cc: Pekka Enberg <penberg@kernel.org>
      Cc: Tejun Heo <tj@kernel.org>
      Cc: Mel Gorman <mgorman@suse.de>
      Cc: Oleg Nesterov <oleg@redhat.com>
      Cc: Rik van Riel <riel@redhat.com>
      Cc: Jianguo Wu <wujianguo@huawei.com>
      Cc: Tim Hockin <thockin@google.com>
      Cc: Christoph Lameter <cl@linux.com>
      Signed-off-by: Andrew Morton <akpm@linux-foundation.org>
      Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>
    • mm, mempolicy: rename slab_node for clarity · 2a389610
      David Rientjes authored
      slab_node() is actually a mempolicy function, so rename it to
      mempolicy_slab_node() to make it clearer that it is used for processes with
      mempolicies.
      
      At the same time, cleanup its code by saving numa_mem_id() in a local
      variable (since we require a node with memory, not just any node) and
      remove an obsolete comment that assumes the mempolicy is actually passed
      into the function.
      Signed-off-by: David Rientjes <rientjes@google.com>
      Acked-by: Christoph Lameter <cl@linux.com>
      Cc: Johannes Weiner <hannes@cmpxchg.org>
      Cc: Michal Hocko <mhocko@suse.cz>
      Cc: KAMEZAWA Hiroyuki <kamezawa.hiroyu@jp.fujitsu.com>
      Cc: Christoph Lameter <cl@linux-foundation.org>
      Cc: Pekka Enberg <penberg@kernel.org>
      Cc: Tejun Heo <tj@kernel.org>
      Cc: Mel Gorman <mgorman@suse.de>
      Cc: Oleg Nesterov <oleg@redhat.com>
      Cc: Rik van Riel <riel@redhat.com>
      Cc: Jianguo Wu <wujianguo@huawei.com>
      Cc: Tim Hockin <thockin@google.com>
      Signed-off-by: Andrew Morton <akpm@linux-foundation.org>
      Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>
  7. 21 Jan, 2014 1 commit
  8. 12 Nov, 2013 1 commit
    • mm, mempolicy: make mpol_to_str robust and always succeed · 948927ee
      David Rientjes authored
      mpol_to_str() should not fail.  Currently, it either fails because the
      string buffer is too small or because a string hasn't been defined for a
      mempolicy mode.
      
      If a new mempolicy mode is introduced and no string is defined for it,
      just warn and return "unknown".
      
      If the buffer is too small, just truncate the string and return, the
      same behavior as snprintf().
      
      This also fixes a bug where there was no NULL-byte termination when doing
      *p++ = '=' and *p++ = ':' and maxlen has been reached.
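
The "never fail, truncate like snprintf()" behaviour can be illustrated with a small user-space sketch. The mode table and `sketch_mpol_to_str()` below are hypothetical and only mirror the behaviour described above; they are not the kernel's `mpol_to_str()`.

```c
#include <assert.h>
#include <stdio.h>
#include <string.h>

/* Hypothetical mode table; the real kernel has its own modes and names. */
enum sketch_mode { SK_DEFAULT, SK_PREFER, SK_BIND, SK_INTERLEAVE, SK_MAX };

static const char *const sketch_mode_names[SK_MAX] = {
    [SK_DEFAULT] = "default",
    [SK_PREFER] = "prefer",
    [SK_BIND] = "bind",
    [SK_INTERLEAVE] = "interleave",
};

/*
 * Format "mode" or "mode:nodelist" into buf. Never fails: an unknown mode
 * prints as "unknown", and output that does not fit is truncated but still
 * NUL-terminated, which snprintf() guarantees when maxlen > 0.
 */
void sketch_mpol_to_str(char *buf, size_t maxlen, int mode,
                        const char *nodelist)
{
    const char *name = "unknown";

    assert(buf != NULL && maxlen > 0);
    if (mode >= 0 && mode < SK_MAX)
        name = sketch_mode_names[mode];

    if (nodelist && *nodelist)
        snprintf(buf, maxlen, "%s:%s", name, nodelist);
    else
        snprintf(buf, maxlen, "%s", name);
}
```

Letting snprintf() write the separator avoids the hand-rolled `*p++` stores that could run past the point where a terminator still fits.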
      Signed-off-by: David Rientjes <rientjes@google.com>
      Cc: KOSAKI Motohiro <kosaki.motohiro@jp.fujitsu.com>
      Cc: Chen Gang <gang.chen@asianux.com>
      Cc: Rik van Riel <riel@redhat.com>
      Cc: Dave Jones <davej@redhat.com>
      Signed-off-by: Andrew Morton <akpm@linux-foundation.org>
      Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>
  9. 09 Oct, 2013 1 commit
  10. 11 Sep, 2013 2 commits
  11. 02 Jan, 2013 2 commits
    • mm: mempolicy: Convert shared_policy mutex to spinlock · 42288fe3
      Mel Gorman authored
      Sasha was fuzzing with trinity and reported the following problem:
      
        BUG: sleeping function called from invalid context at kernel/mutex.c:269
        in_atomic(): 1, irqs_disabled(): 0, pid: 6361, name: trinity-main
        2 locks held by trinity-main/6361:
         #0:  (&mm->mmap_sem){++++++}, at: [<ffffffff810aa314>] __do_page_fault+0x1e4/0x4f0
         #1:  (&(&mm->page_table_lock)->rlock){+.+...}, at: [<ffffffff8122f017>] handle_pte_fault+0x3f7/0x6a0
        Pid: 6361, comm: trinity-main Tainted: G        W
        3.7.0-rc2-next-20121024-sasha-00001-gd95ef01-dirty #74
        Call Trace:
          __might_sleep+0x1c3/0x1e0
          mutex_lock_nested+0x29/0x50
          mpol_shared_policy_lookup+0x2e/0x90
          shmem_get_policy+0x2e/0x30
          get_vma_policy+0x5a/0xa0
          mpol_misplaced+0x41/0x1d0
          handle_pte_fault+0x465/0x6a0
      
      This was triggered by a different version of automatic NUMA balancing
      but in theory the current version is vulnerable to the same problem.
      
      do_numa_page
        -> numa_migrate_prep
          -> mpol_misplaced
            -> get_vma_policy
              -> shmem_get_policy
      
      It's very unlikely this will happen as shared pages are not marked
      pte_numa -- see the page_mapcount() check in change_pte_range() -- but
      it is possible.
      
      To address this, this patch restores sp->lock as originally implemented
      by Kosaki Motohiro.  In the path where get_vma_policy() is called, it
      should not be calling sp_alloc() so it is not necessary to treat the PTL
      specially.
      Signed-off-by: KOSAKI Motohiro <kosaki.motohiro@jp.fujitsu.com>
      Tested-by: KOSAKI Motohiro <kosaki.motohiro@jp.fujitsu.com>
      Signed-off-by: Mel Gorman <mgorman@suse.de>
      Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>
    • mempolicy: remove arg from mpol_parse_str, mpol_to_str · a7a88b23
      Hugh Dickins authored
      Remove the unused argument (formerly no_context) from mpol_parse_str()
      and from mpol_to_str().
      Signed-off-by: Hugh Dickins <hughd@google.com>
      Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>
  12. 11 Dec, 2012 1 commit
    • mm: mempolicy: Check for misplaced page · 771fb4d8
      Lee Schermerhorn authored
      This patch provides a new function to test whether a page resides
      on a node that is appropriate for the mempolicy for the vma and
      address where the page is supposed to be mapped.  This involves
      looking up the node where the page belongs.  So, the function
      returns that node so that it may be used to allocate the page
      without consulting the policy again.
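
A minimal sketch of such a check, under simplifying assumptions: nodemasks are 64-bit bitmaps, only three policy modes exist, and the function returns -1 when the page is already well placed and the target node otherwise. All names ending in `_sketch` are hypothetical; the kernel's actual node computation is more involved.

```c
#include <assert.h>
#include <stdint.h>

enum mp_mode_sketch { MP_PREFERRED, MP_BIND, MP_INTERLEAVE };

struct mp_policy_sketch {
    enum mp_mode_sketch mode;
    int preferred_node;  /* used by MP_PREFERRED */
    uint64_t nodes;      /* bitmap used by MP_BIND and MP_INTERLEAVE */
};

/* Return the index of the n-th (0-based) set bit in nodes, or -1. */
int nth_node_sketch(uint64_t nodes, unsigned int n)
{
    for (int node = 0; node < 64; node++) {
        if (nodes & (UINT64_C(1) << node)) {
            if (n == 0)
                return node;
            n--;
        }
    }
    return -1;
}

/* Count the set bits in nodes. */
unsigned int node_weight_sketch(uint64_t nodes)
{
    unsigned int w = 0;

    for (; nodes; nodes &= nodes - 1)
        w++;
    return w;
}

/* Return -1 if page_node satisfies the policy, else the node it should be on. */
int misplaced_sketch(const struct mp_policy_sketch *pol, int page_node,
                     unsigned long page_index)
{
    int target = page_node;

    switch (pol->mode) {
    case MP_PREFERRED:
        target = pol->preferred_node;
        break;
    case MP_BIND:
        assert(pol->nodes != 0);
        if (pol->nodes & (UINT64_C(1) << page_node))
            return -1;  /* any node in the mask is acceptable */
        target = nth_node_sketch(pol->nodes, 0);
        break;
    case MP_INTERLEAVE:
        assert(pol->nodes != 0);
        target = nth_node_sketch(pol->nodes,
                                 page_index % node_weight_sketch(pol->nodes));
        break;
    }
    return target == page_node ? -1 : target;
}
```

Because the target node comes back to the caller, a later migration step can allocate on it directly instead of walking the policy a second time.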
      
      A subsequent patch will call this function from the fault path.
      Because of this, I don't want to go ahead and allocate the page, e.g.,
      via alloc_page_vma() only to have to free it if it has the correct
      policy.  So, I just mimic the alloc_page_vma() node computation
      logic--sort of.
      
      Note:  we could use this function to implement a MPOL_MF_STRICT
      behavior when migrating pages to match mbind() mempolicy--e.g.,
      to ensure that pages in an interleaved range are reinterleaved
      rather than left where they are when they reside on any page in
      the interleave nodemask.
      Signed-off-by: Lee Schermerhorn <lee.schermerhorn@hp.com>
      Reviewed-by: Rik van Riel <riel@redhat.com>
      Cc: Andrew Morton <akpm@linux-foundation.org>
      Cc: Linus Torvalds <torvalds@linux-foundation.org>
      [ Added MPOL_F_LAZY to trigger migrate-on-fault;
        simplified code now that we don't have to bother
        with special crap for interleaved ]
      Signed-off-by: Peter Zijlstra <a.p.zijlstra@chello.nl>
      Signed-off-by: Ingo Molnar <mingo@kernel.org>
      Signed-off-by: Mel Gorman <mgorman@suse.de>
  13. 06 Dec, 2012 1 commit
    • tmpfs: fix shared mempolicy leak · 18a2f371
      Mel Gorman authored
      This fixes a regression in 3.7-rc, which has since gone into stable.
      
      Commit 00442ad0 ("mempolicy: fix a memory corruption by refcount
      imbalance in alloc_pages_vma()") changed get_vma_policy() to raise the
      refcount on a shmem shared mempolicy; whereas shmem_alloc_page() went
      on expecting alloc_page_vma() to drop the refcount it had acquired.
      This deserves a rework: but for now fix the leak in shmem_alloc_page().
      
      Hugh: shmem_swapin() did not need a fix, but surely it's clearer to use
      the same refcounting there as in shmem_alloc_page(), delete its onstack
      mempolicy, and the strange mpol_cond_copy() and __mpol_cond_copy() -
      those were invented to let swapin_readahead() make an unknown number of
      calls to alloc_pages_vma() with one mempolicy; but since 00442ad0,
      alloc_pages_vma() has kept refcount in balance, so now no problem.
      Reported-and-tested-by: Tommi Rantala <tt.rantala@gmail.com>
      Signed-off-by: Mel Gorman <mgorman@suse.de>
      Signed-off-by: Hugh Dickins <hughd@google.com>
      Cc: stable@vger.kernel.org
      Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>
  14. 13 Oct, 2012 1 commit
  15. 09 Oct, 2012 2 commits
    • mempolicy: fix a race in shared_policy_replace() · b22d127a
      Mel Gorman authored
      shared_policy_replace() use of sp_alloc() is unsafe.  1) sp_node cannot
      be dereferenced if sp->lock is not held and 2) another thread can modify
      sp_node between spin_unlock for allocating a new sp node and next
      spin_lock.  The bug was introduced before 2.6.12-rc2.
      
      Kosaki's original patch for this problem was to allocate an sp node and
      policy within shared_policy_replace and initialise it when the lock is
      reacquired.  I was not keen on this approach because it partially
      duplicates sp_alloc().  As the paths where sp->lock is taken are not that
      performance critical this patch converts sp->lock to sp->mutex so it can
      sleep when calling sp_alloc().
      
      [kosaki.motohiro@jp.fujitsu.com: Original patch]
      Signed-off-by: Mel Gorman <mgorman@suse.de>
      Acked-by: KOSAKI Motohiro <kosaki.motohiro@jp.fujitsu.com>
      Reviewed-by: Christoph Lameter <cl@linux.com>
      Cc: Josh Boyer <jwboyer@gmail.com>
      Signed-off-by: Andrew Morton <akpm@linux-foundation.org>
      Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>
    • mm: kill vma flag VM_RESERVED and mm->reserved_vm counter · 314e51b9
      Konstantin Khlebnikov authored
      A long time ago, in v2.4, VM_RESERVED kept the swapout process off the VMA;
      it has since lost its original meaning but still has some effects:
      
       | effect                 | alternative flags
      -+------------------------+---------------------------------------------
      1| account as reserved_vm | VM_IO
      2| skip in core dump      | VM_IO, VM_DONTDUMP
      3| do not merge or expand | VM_IO, VM_DONTEXPAND, VM_HUGETLB, VM_PFNMAP
      4| do not mlock           | VM_IO, VM_DONTEXPAND, VM_HUGETLB, VM_PFNMAP
      
      This patch removes the reserved_vm counter from mm_struct.  Nobody seems to
      care about it: it is not exported to userspace directly, it only
      reduces the total_vm shown in proc.
      
      Thus VM_RESERVED can be replaced with VM_IO or pair VM_DONTEXPAND | VM_DONTDUMP.
      
      remap_pfn_range() and io_remap_pfn_range() set VM_IO|VM_DONTEXPAND|VM_DONTDUMP.
      remap_vmalloc_range() set VM_DONTEXPAND | VM_DONTDUMP.
      
      [akpm@linux-foundation.org: drivers/vfio/pci/vfio_pci.c fixup]
      Signed-off-by: Konstantin Khlebnikov <khlebnikov@openvz.org>
      Cc: Alexander Viro <viro@zeniv.linux.org.uk>
      Cc: Carsten Otte <cotte@de.ibm.com>
      Cc: Chris Metcalf <cmetcalf@tilera.com>
      Cc: Cyrill Gorcunov <gorcunov@openvz.org>
      Cc: Eric Paris <eparis@redhat.com>
      Cc: H. Peter Anvin <hpa@zytor.com>
      Cc: Hugh Dickins <hughd@google.com>
      Cc: Ingo Molnar <mingo@redhat.com>
      Cc: James Morris <james.l.morris@oracle.com>
      Cc: Jason Baron <jbaron@redhat.com>
      Cc: Kentaro Takeda <takedakn@nttdata.co.jp>
      Cc: Matt Helsley <matthltc@us.ibm.com>
      Cc: Nick Piggin <npiggin@kernel.dk>
      Cc: Oleg Nesterov <oleg@redhat.com>
      Cc: Peter Zijlstra <a.p.zijlstra@chello.nl>
      Cc: Robert Richter <robert.richter@amd.com>
      Cc: Suresh Siddha <suresh.b.siddha@intel.com>
      Cc: Tetsuo Handa <penguin-kernel@I-love.SAKURA.ne.jp>
      Cc: Venkatesh Pallipadi <venki@google.com>
      Acked-by: Linus Torvalds <torvalds@linux-foundation.org>
      Signed-off-by: Andrew Morton <akpm@linux-foundation.org>
      Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>
  16. 20 Jun, 2012 1 commit
  17. 29 May, 2012 1 commit
  18. 10 Jan, 2012 1 commit
  19. 25 May, 2011 2 commits
    • mm: declare mpol_to_str() when CONFIG_TMPFS=n · 13057efb
      Stephen Wilson authored
      When CONFIG_TMPFS=n mpol_to_str() is not declared in mempolicy.h.
      However, in the NUMA case, the definition is always compiled.
      
      Since it is not strictly true that tmpfs is the only client, and since the
      symbol was always lurking around anyways, export mpol_to_str()
      unconditionally.  Furthermore, this will allow us to move show_numa_map()
      out of mempolicy.c and into the procfs subsystem.
      Signed-off-by: Stephen Wilson <wilsons@start.ca>
      Cc: KOSAKI Motohiro <kosaki.motohiro@jp.fujitsu.com>
      Cc: Hugh Dickins <hughd@google.com>
      Cc: David Rientjes <rientjes@google.com>
      Cc: Lee Schermerhorn <lee.schermerhorn@hp.com>
      Cc: Alexey Dobriyan <adobriyan@gmail.com>
      Cc: Christoph Lameter <cl@linux-foundation.org>
      Cc: Randy Dunlap <rdunlap@xenotime.net>
      Signed-off-by: Andrew Morton <akpm@linux-foundation.org>
      Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>
    • mm: export get_vma_policy() · d98f6cb6
      Stephen Wilson authored
      In commit 48fce342 ("mempolicies: unexport get_vma_policy()")
      get_vma_policy() was marked static as all clients were local to
      mempolicy.c.
      
      However, the decision to generate /proc/pid/numa_maps in the numa memory
      policy code and outside the procfs subsystem introduces an artificial
      interdependency between the two systems.  Exporting get_vma_policy() once
      again is the first step to clean up this interdependency.
      Signed-off-by: Stephen Wilson <wilsons@start.ca>
      Reviewed-by: KOSAKI Motohiro <kosaki.motohiro@jp.fujitsu.com>
      Cc: Hugh Dickins <hughd@google.com>
      Cc: David Rientjes <rientjes@google.com>
      Cc: Lee Schermerhorn <lee.schermerhorn@hp.com>
      Cc: Alexey Dobriyan <adobriyan@gmail.com>
      Cc: Christoph Lameter <cl@linux-foundation.org>
      Signed-off-by: Andrew Morton <akpm@linux-foundation.org>
      Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>
  20. 09 Aug, 2010 1 commit
  21. 25 May, 2010 1 commit
    • mempolicy: restructure rebinding-mempolicy functions · 708c1bbc
      Miao Xie authored
      Nick Piggin reported that the allocator may see an empty nodemask when
      changing cpuset's mems[1].  It happens only on kernels that do not do
      atomic nodemask_t stores (MAX_NUMNODES > BITS_PER_LONG).
      
      But I found that there is also a problem on kernels that can do atomic
      nodemask_t stores.  The allocator can't find a node to allocate a page
      from while cpuset's mems is being changed, even though there is a lot of
      free memory.  The reason is as follows:
      
      (mpol: mempolicy)
      	task1			task1's mpol	task2
      	alloc page		1
      	  alloc on node0? NO	1
      				1		change mems from 1 to 0
      				1		rebind task1's mpol
      				0-1		  set new bits
      				0	  	  clear disallowed bits
      	  alloc on node1? NO	0
      	  ...
      	can't alloc page
      	  goto oom
      
      I can reproduce it with the attached program using the following steps:
      
      # mkdir /dev/cpuset
      # mount -t cpuset cpuset /dev/cpuset
      # mkdir /dev/cpuset/1
      # echo `cat /dev/cpuset/cpus` > /dev/cpuset/1/cpus
      # echo `cat /dev/cpuset/mems` > /dev/cpuset/1/mems
      # echo $$ > /dev/cpuset/1/tasks
      # numactl --membind=`cat /dev/cpuset/mems` ./cpuset_mem_hog <nr_tasks> &
         <nr_tasks> = max(nr_cpus - 1, 1)
      # killall -s SIGUSR1 cpuset_mem_hog
      # ./change_mems.sh
      
      Several hours later, OOM will happen even though there is a lot of free
      memory.
      
      This patchset fixes this problem by expanding the node range first (setting
      the newly allowed bits) and shrinking it lazily (clearing the newly
      disallowed bits).  We use a variable to tell the write-side task that the
      read-side task is reading the nodemask, and the write-side task clears the
      newly disallowed nodes after the read-side task finishes the current memory
      allocation.
      
      This patch:
      
      To fix the case where no node is available to allocate memory, when we
      update the mempolicy and mems_allowed we expand the set of nodes first (set
      all the newly allowed nodes) and shrink the set of nodes lazily (clear the
      disallowed nodes).  But the mempolicy's rebind functions may break the
      expansion.
      
      So we restructure the mempolicy's rebind functions and split the rebind
      work to two steps, just like the update of cpuset's mems: The 1st step:
      expand the set of the mempolicy's nodes.  The 2nd step: shrink the set of
      the mempolicy's nodes.  It is used when there is no real lock to protect
      the mempolicy in the read-side.  Otherwise we can do rebind work at once.
      
      In order to implement it, we define
      
      	enum mpol_rebind_step {
      		MPOL_REBIND_ONCE,
      		MPOL_REBIND_STEP1,
      		MPOL_REBIND_STEP2,
      		MPOL_REBIND_NSTEP,
      	};
      
      If the mempolicy needn't be updated by two steps, we can pass
      MPOL_REBIND_ONCE to the rebind functions.  Or we can pass
      MPOL_REBIND_STEP1 to do the first step of the rebind work and pass
      MPOL_REBIND_STEP2 to do the second step work.
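
The expand-then-shrink idea can be sketched with a plain bitmap. This is a simplified model, assuming a 64-bit nodemask and ignoring the node remapping the real rebind functions perform; `rebind_sketch()` and its enum are hypothetical names that only echo the steps described above.

```c
#include <assert.h>
#include <stdint.h>

/* Nodemask modelled as a 64-bit bitmap: bit n set means node n is allowed. */
typedef uint64_t nodemask_sketch;

enum rebind_step_sketch { REBIND_ONCE, REBIND_STEP1, REBIND_STEP2 };

/*
 * Move *mask from its current node set toward newmask.
 * STEP1 only adds the newly allowed nodes, STEP2 drops the ones no longer
 * allowed, and ONCE does both at once (safe only when readers are locked out).
 */
void rebind_sketch(nodemask_sketch *mask, nodemask_sketch newmask,
                   enum rebind_step_sketch step)
{
    assert(newmask != 0);  /* an empty target set would leave no node */

    switch (step) {
    case REBIND_STEP1:
        *mask |= newmask;   /* expand: old and new nodes are both usable */
        break;
    case REBIND_STEP2:
        *mask &= newmask;   /* shrink: clear the disallowed nodes */
        break;
    case REBIND_ONCE:
        *mask = newmask;
        break;
    }
}
```

Between the two steps a lockless reader always sees a non-empty mask that contains every node of the new set, which is the property the two-step rebind is meant to preserve.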
      
      Besides that, there may be a long time between these two steps, and we
      have to release the lock that protects the mempolicy and mems_allowed.
      When we take the lock again, we must check whether the current mempolicy is
      under rebinding (the first step has been done) or not, because the task may
      allocate a new mempolicy while we don't hold the lock.  So we define the
      following flag to identify it:
      
      #define MPOL_F_REBINDING (1 << 2)
      
      The new functions will be used in the next patch.
      Signed-off-by: Miao Xie <miaox@cn.fujitsu.com>
      Cc: David Rientjes <rientjes@google.com>
      Cc: Nick Piggin <npiggin@suse.de>
      Cc: Paul Menage <menage@google.com>
      Cc: Lee Schermerhorn <lee.schermerhorn@hp.com>
      Cc: Hugh Dickins <hugh.dickins@tiscali.co.uk>
      Cc: Ravikiran Thirumalai <kiran@scalex86.org>
      Cc: KOSAKI Motohiro <kosaki.motohiro@jp.fujitsu.com>
      Cc: Christoph Lameter <cl@linux-foundation.org>
      Cc: Andi Kleen <andi@firstfloor.org>
      Signed-off-by: Andrew Morton <akpm@linux-foundation.org>
      Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>
  22. 15 Dec, 2009 1 commit
    • hugetlb: derive huge pages nodes allowed from task mempolicy · 06808b08
      Lee Schermerhorn authored
      This patch derives a "nodes_allowed" node mask from the numa mempolicy of
      the task modifying the number of persistent huge pages to control the
      allocation, freeing and adjusting of surplus huge pages when the pool page
      count is modified via the new sysctl or sysfs attribute
      "nr_hugepages_mempolicy".  The nodes_allowed mask is derived as follows:
      
      * For "default" [NULL] task mempolicy, a NULL nodemask_t pointer
        is produced.  This will cause the hugetlb subsystem to use
        node_online_map as the "nodes_allowed".  This preserves the
        behavior before this patch.
      * For "preferred" mempolicy, including explicit local allocation,
        a nodemask with the single preferred node will be produced.
        "local" policy will NOT track any internode migrations of the
        task adjusting nr_hugepages.
      * For "bind" and "interleave" policy, the mempolicy's nodemask
        will be used.
      * Other than to inform the construction of the nodes_allowed node
        mask, the actual mempolicy mode is ignored.  That is, all modes
        behave like interleave over the resulting nodes_allowed mask
        with no "fallback".
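
The derivation rules in the list above can be sketched as a small function. This is an illustrative model only, assuming a 64-bit bitmap for the node mask; `hp_policy_sketch` and `nodes_allowed_sketch()` are hypothetical names, not the hugetlb or mempolicy API.

```c
#include <assert.h>
#include <stdbool.h>
#include <stddef.h>
#include <stdint.h>

enum hp_mode_sketch { HP_PREFERRED, HP_BIND, HP_INTERLEAVE };

struct hp_policy_sketch {
    enum hp_mode_sketch mode;
    int preferred_node;   /* used by HP_PREFERRED; -1 means "local" */
    uint64_t nodes;       /* used by HP_BIND and HP_INTERLEAVE */
};

/*
 * Fill *allowed from the task policy and return true, or return false for
 * the default (NULL) policy, in which case the caller falls back to all
 * online nodes. local_node is the node the caller is running on.
 */
bool nodes_allowed_sketch(const struct hp_policy_sketch *pol, int local_node,
                          uint64_t *allowed)
{
    assert(allowed != NULL);
    if (pol == NULL)
        return false;

    switch (pol->mode) {
    case HP_PREFERRED: {
        /* "local" is resolved once, here; later task migration is ignored. */
        int node = pol->preferred_node < 0 ? local_node : pol->preferred_node;

        assert(node >= 0 && node < 64);
        *allowed = UINT64_C(1) << node;
        break;
    }
    case HP_BIND:
    case HP_INTERLEAVE:
        *allowed = pol->nodes;  /* the mode itself is otherwise ignored */
        break;
    }
    return true;
}
```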
      
      See the updated documentation [next patch] for more information
      about the implications of this patch.
      
      Examples:
      
      Starting with:
      
      	Node 0 HugePages_Total:     0
      	Node 1 HugePages_Total:     0
      	Node 2 HugePages_Total:     0
      	Node 3 HugePages_Total:     0
      
      Default behavior [with or without this patch] balances persistent
      hugepage allocation across nodes [with sufficient contiguous memory]:
      
      	sysctl vm.nr_hugepages[_mempolicy]=32
      
      yields:
      
      	Node 0 HugePages_Total:     8
      	Node 1 HugePages_Total:     8
      	Node 2 HugePages_Total:     8
      	Node 3 HugePages_Total:     8
      
      Of course, we only have nr_hugepages_mempolicy with the patch,
      but with default mempolicy, nr_hugepages_mempolicy behaves the
      same as nr_hugepages.
      
      Applying mempolicy--e.g., with numactl [using '-m' a.k.a.
      '--membind' because it allows multiple nodes to be specified
      and it's easy to type]--we can allocate huge pages on
      individual nodes or sets of nodes.  So, starting from the
      condition above, with 8 huge pages per node, add 8 more to
      node 2 using:
      
      	numactl -m 2 sysctl vm.nr_hugepages_mempolicy=40
      
      This yields:
      
      	Node 0 HugePages_Total:     8
      	Node 1 HugePages_Total:     8
      	Node 2 HugePages_Total:    16
      	Node 3 HugePages_Total:     8
      
      The incremental 8 huge pages were restricted to node 2 by the
      specified mempolicy.
      
      Similarly, we can use mempolicy to free persistent huge pages
      from specified nodes:
      
      	numactl -m 0,1 sysctl vm.nr_hugepages_mempolicy=32
      
      yields:
      
      	Node 0 HugePages_Total:     4
      	Node 1 HugePages_Total:     4
      	Node 2 HugePages_Total:    16
      	Node 3 HugePages_Total:     8
      
      The 8 huge pages freed were balanced over nodes 0 and 1.
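
      The arithmetic in these examples can be reproduced with a small
      user-space model.  This is only a sketch, not the kernel's hugetlb
      code: toy_set_nr_hugepages(), NR_NODES and the bitmask encoding of
      nodes_allowed are names assumed here for illustration, and the
      round-robin starting node is chosen arbitrarily.

```c
#include <assert.h>

#define NR_NODES 4   /* assumed node count, matching the example above */

/*
 * Toy model: adjust per-node persistent huge page counts until their sum
 * equals `target`, touching only nodes whose bit is set in `allowed`
 * (bit n = node n).  Pages are added or freed one at a time, round-robin
 * over the allowed nodes, with no fallback to other nodes.
 */
static void toy_set_nr_hugepages(long pages[NR_NODES], unsigned allowed,
                                 long target)
{
    long total = 0, avail = 0;
    int node = 0;

    assert(allowed != 0 && allowed < (1u << NR_NODES));
    assert(target >= 0);

    for (int n = 0; n < NR_NODES; n++) {
        total += pages[n];
        if (allowed & (1u << n))
            avail += pages[n];
    }

    /* Pages on nodes outside the mask cannot be freed by this call. */
    if (target < total - avail)
        target = total - avail;

    while (total != target) {
        if (allowed & (1u << node)) {
            if (total < target) {
                pages[node]++;
                total++;
            } else if (pages[node] > 0) {
                pages[node]--;
                total--;
            }
        }
        node = (node + 1) % NR_NODES;
    }
}
```

      Calling it with mask 0xF and target 32, then mask (1 << 2) and
      target 40, then mask 0x3 and target 32 walks through the three
      states listed above.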
      
      [rientjes@google.com: accommodate reworked NODEMASK_ALLOC]
      Signed-off-by: David Rientjes <rientjes@google.com>
      Signed-off-by: Lee Schermerhorn <lee.schermerhorn@hp.com>
      Acked-by: Mel Gorman <mel@csn.ul.ie>
      Reviewed-by: Andi Kleen <andi@firstfloor.org>
      Cc: KAMEZAWA Hiroyuki <kamezawa.hiroyu@jp.fujitsu.com>
      Cc: Randy Dunlap <randy.dunlap@oracle.com>
      Cc: Nishanth Aravamudan <nacc@us.ibm.com>
      Cc: Adam Litke <agl@us.ibm.com>
      Cc: Andy Whitcroft <apw@canonical.com>
      Cc: Eric Whitney <eric.whitney@hp.com>
      Cc: Christoph Lameter <cl@linux-foundation.org>
      Signed-off-by: Andrew Morton <akpm@linux-foundation.org>
      Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>
      06808b08
  23. 24 Jul, 2008 1 commit
  24. 28 Apr, 2008 9 commits
    • mempolicy: use struct mempolicy pointer in shmem_sb_info · 71fe804b
      Lee Schermerhorn authored
      This patch replaces the mempolicy mode, mode_flags, and nodemask in the
      shmem_sb_info struct with a struct mempolicy pointer, initialized to NULL.
      This removes dependency on the details of mempolicy from shmem.c and hugetlbfs
      inode.c and simplifies the interfaces.
      
      mpol_parse_str() in mempolicy.c is changed to return, via a pointer to a
      pointer arg, a struct mempolicy pointer on success.  For MPOL_DEFAULT, the
      returned pointer is NULL.  Further, mpol_parse_str() now takes a 'no_context'
      argument that causes the input nodemask to be stored in the w.user_nodemask of
      the created mempolicy for use when the mempolicy is installed in a tmpfs inode
      shared policy tree.  At that time, any cpuset contextualization is applied to
      the original input nodemask.  This preserves the previous behavior where the
      input nodemask was stored in the superblock.  We can think of the returned
      mempolicy as "context free".
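
      The calling convention described above (status as the return value,
      policy through a pointer-to-pointer, NULL for the default policy) can
      be sketched in user space as follows.  This is an illustration under
      assumptions, not mpol_parse_str() itself: the toy_* names, the struct
      layout and the mode-name table are invented here, and node ranges
      such as "0-3" are deliberately not handled.

```c
#include <assert.h>
#include <stdlib.h>
#include <string.h>

enum toy_mode { TOY_DEFAULT, TOY_PREFERRED, TOY_BIND, TOY_INTERLEAVE };

struct toy_mempolicy {
    enum toy_mode mode;
    unsigned long user_nodemask;  /* nodes exactly as written by the user */
};

static const char *const toy_mode_names[] = {
    "default", "prefer", "bind", "interleave",
};

/*
 * Parse "<mode>" or "<mode>:<node>,<node>,...".
 * Returns 0 on success and -1 on error.  On success *mpol is NULL for
 * "default"; otherwise it points to a malloc()ed "context free" policy
 * that the caller must free().
 */
static int toy_parse_mpol(const char *str, struct toy_mempolicy **mpol)
{
    const char *colon = strchr(str, ':');
    size_t len = colon ? (size_t)(colon - str) : strlen(str);
    unsigned long mask = 0;
    int mode = -1;

    assert(mpol != NULL);

    for (int i = 0; i < 4; i++)
        if (strlen(toy_mode_names[i]) == len &&
            strncmp(str, toy_mode_names[i], len) == 0)
            mode = i;
    if (mode < 0)
        return -1;

    if (colon) {
        const char *p = colon + 1;
        for (;;) {
            char *end;
            unsigned long node = strtoul(p, &end, 10);

            if (end == p || node >= 8 * sizeof(mask))
                return -1;
            mask |= 1UL << node;
            if (*end == '\0')
                break;
            if (*end != ',')
                return -1;
            p = end + 1;
        }
    }

    if (mode == TOY_DEFAULT) {
        if (colon)
            return -1;          /* default takes no node list */
        *mpol = NULL;
        return 0;
    }
    if ((mode == TOY_BIND || mode == TOY_INTERLEAVE) && mask == 0)
        return -1;              /* these modes need at least one node */

    *mpol = malloc(sizeof(**mpol));
    if (*mpol == NULL)
        return -1;
    (*mpol)->mode = (enum toy_mode)mode;
    (*mpol)->user_nodemask = mask;
    return 0;
}
```

      Keeping the raw user nodemask in the returned object is what lets a
      later step apply cpuset constraints at install time rather than at
      parse time.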
      
      Because mpol_parse_str() is now calling mpol_new(), we can remove from
      mpol_to_str() the semantic checks that mpol_new() already performs.
      
      Add 'no_context' parameter to mpol_to_str() to specify that it should format
      the nodemask in w.user_nodemask for 'bind' and 'interleave' policies.
      
      Change mpol_shared_policy_init() to take a pointer to a "context free" struct
      mempolicy and to create a new, "contextualized" mempolicy using the mode,
      mode_flags and user_nodemask from the input mempolicy.
      
        Note: we know that the mempolicy passed to mpol_to_str() or
        mpol_shared_policy_init() from a tmpfs superblock is "context free".  This
        is currently the only instance thereof.  However, if we found more uses for
        this concept, and introduced any ambiguity as to whether a mempolicy was
        context free or not, we could add another internal mode flag to identify
        context free mempolicies.  Then, we could remove the 'no_context' argument
        from mpol_to_str().
      
      Added shmem_get_sbmpol() to return a reference counted superblock mempolicy,
      if one exists, to pass to mpol_shared_policy_init().  We must add the
      reference under the sb stat_lock to prevent races with replacement of the mpol
      by remount.  This reference is removed in mpol_shared_policy_init().
      
      [akpm@linux-foundation.org: build fix]
      [akpm@linux-foundation.org: another build fix]
      [akpm@linux-foundation.org: yet another build fix]
      Signed-off-by: Lee Schermerhorn <lee.schermerhorn@hp.com>
      Cc: Christoph Lameter <clameter@sgi.com>
      Cc: David Rientjes <rientjes@google.com>
      Cc: Mel Gorman <mel@csn.ul.ie>
      Cc: Andi Kleen <ak@suse.de>
      Signed-off-by: Andrew Morton <akpm@linux-foundation.org>
      Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>
    • mempolicy: rework shmem mpol parsing and display · 095f1fc4
      Lee Schermerhorn authored
      mm/shmem.c currently contains functions to parse and display memory policy
      strings for the tmpfs 'mpol' mount option.  Move this to mm/mempolicy.c with
      the rest of the mempolicy support.  With subsequent patches, we'll be able to
      remove knowledge of the details [mode, flags, policy, ...] completely from
      shmem.c.
      
      1) replace shmem_parse_mpol() in mm/shmem.c with mpol_parse_str() in
         mm/mempolicy.c.  Rework to use the policy_types[] array [used by
         mpol_to_str()] to look up mode by name.
      
      2) use mpol_to_str() to format policy for shmem_show_mpol().  mpol_to_str()
         expects a pointer to a struct mempolicy, so temporarily construct one.
         This will be replaced with a reference to a struct mempolicy in the tmpfs
         superblock in a subsequent patch.
      
         Note 1: I changed mpol_to_str() to use a colon ':' rather than an equal
         sign '=' as the nodemask delimiter to match mpol_parse_str() and the
         tmpfs/shmem mpol mount option formatting that now uses mpol_to_str().  This
         is a user visible change to numa_maps, but then the addition of the mode
         flags already changed the display.  It makes sense to me to have the mounts
         and numa_maps display the policy in the same format.  However, if anyone
         objects strongly, I can pass the desired nodemask delimiter as an arg to
         mpol_to_str().
      
         Note 2: Like show_numa_map(), I don't check the return code from
         mpol_to_str().  I do use a longer buffer than the one provided by
         show_numa_map(), which seems to have sufficed so far.
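
      As a rough illustration of the "mode:nodes" display format and of
      what checking the return code against the buffer size could look
      like, here is a user-space sketch.  toy_mpol_to_str() and its
      signature are assumptions for this example; unlike the kernel
      formatter it prints every node individually instead of using
      ranges.

```c
#include <assert.h>
#include <stdio.h>
#include <string.h>

/*
 * Format "<mode>" or "<mode>:<n>,<n>,..." into buf, using ':' as the
 * delimiter between the mode name and the node list.
 * Returns the length of the string, or -1 if buf is too small.
 */
static int toy_mpol_to_str(char *buf, size_t len, const char *mode,
                           unsigned long nodes)
{
    size_t used;
    char sep = ':';
    int n;

    assert(buf != NULL && mode != NULL);

    n = snprintf(buf, len, "%s", mode);
    if (n < 0 || (size_t)n >= len)
        return -1;
    used = (size_t)n;

    for (unsigned node = 0; node < 8 * sizeof(nodes); node++) {
        if (!(nodes & (1UL << node)))
            continue;
        n = snprintf(buf + used, len - used, "%c%u", sep, node);
        if (n < 0 || (size_t)n >= len - used)
            return -1;      /* output would have been truncated */
        used += (size_t)n;
        sep = ',';          /* later nodes are comma-separated */
    }
    return (int)used;
}
```

      A caller that ignores the return value, as the notes above describe,
      is relying on the buffer always being large enough.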
      Signed-off-by: Lee Schermerhorn <lee.schermerhorn@hp.com>
      Cc: Christoph Lameter <clameter@sgi.com>
      Cc: David Rientjes <rientjes@google.com>
      Cc: Mel Gorman <mel@csn.ul.ie>
      Cc: Andi Kleen <ak@suse.de>
      Signed-off-by: Andrew Morton <akpm@linux-foundation.org>
      Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>
    • mempolicy: use MPOL_F_LOCAL to Indicate Preferred Local Policy · fc36b8d3
      Lee Schermerhorn authored
      Now that we're using "preferred local" policy for system default, we need to
      make this as fast as possible.  Because of the variable size of the mempolicy
      structure [based on size of nodemasks], the preferred_node may be in a
      different cacheline from the mode.  This can result in accessing an extra
      cacheline in the normal case of system default policy.  Suspect this is the
      cause of an observed 2-3% slowdown in page fault testing relative to kernel
      without this patch series.
      
      To alleviate this, use an internal mode flag, MPOL_F_LOCAL in the mempolicy
      flags member which is guaranteed [?] to be in the same cacheline as the mode
      itself.
      
      Verified that reworked mempolicy now performs slightly better on 25-rc8-mm1
      for both anon and shmem segments with system default and vma [preferred local]
      policy.
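
      The cacheline argument can be made concrete with a toy layout.  The
      struct below is not the kernel's struct mempolicy: the field order,
      TOY_MAX_NODES, the 64-byte line size and the flag value are all
      assumptions, and the comparison further assumes the object starts
      on a cacheline boundary.

```c
#include <assert.h>
#include <stddef.h>

#define TOY_CACHELINE  64         /* assumed cacheline size in bytes */
#define TOY_MAX_NODES  1024       /* assumed; makes the nodemask 128 bytes */
#define TOY_F_LOCAL    (1u << 1)  /* illustrative bit value */

struct toy_nodemask {
    unsigned long bits[TOY_MAX_NODES / (8 * sizeof(unsigned long))];
};

struct toy_mempolicy {
    int refcnt;
    unsigned short mode;
    unsigned short flags;            /* sits right next to mode */
    struct toy_nodemask cpuset_mems; /* large member pushes later fields away */
    short preferred_node;
};

/* True if two field offsets fall in the same cacheline of an aligned object. */
static int same_cacheline(size_t a, size_t b)
{
    return a / TOY_CACHELINE == b / TOY_CACHELINE;
}

/* "Preferred local" test that reads only the flags word, not preferred_node. */
static int toy_is_local(const struct toy_mempolicy *pol)
{
    assert(pol != NULL);
    return (pol->flags & TOY_F_LOCAL) != 0;
}
```

      With this layout, mode and flags share a line while preferred_node
      lands two lines later, so a check based on flags alone avoids the
      extra access.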
      Signed-off-by: Lee Schermerhorn <lee.schermerhorn@hp.com>
      Cc: Christoph Lameter <clameter@sgi.com>
      Cc: David Rientjes <rientjes@google.com>
      Cc: Mel Gorman <mel@csn.ul.ie>
      Cc: Andi Kleen <ak@suse.de>
      Signed-off-by: Andrew Morton <akpm@linux-foundation.org>
      Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>
    • mempolicy: rework mempolicy Reference Counting [yet again] · 52cd3b07
      Lee Schermerhorn authored
      After further discussion with Christoph Lameter, it has become clear that my
      earlier attempts to clean up the mempolicy reference counting were a bit of
      overkill in some areas, resulting in superfluous ref/unref in what are usually
      fast paths.  In other areas, further inspection reveals that I botched the
      unref for interleave policies.
      
      A separate patch, suitable for upstream/stable trees, fixes up the known
      errors in the previous attempt to fix reference counting.
      
      This patch reworks the memory policy reference counting and, one hopes,
      simplifies the code.  Maybe I'll get it right this time.
      
      See the update to the numa_memory_policy.txt document for a discussion of
      memory policy reference counting that motivates this patch.
      
      Summary:
      
      Lookup of mempolicy, based on (vma, address) need only add a reference for
      shared policy, and we need only unref the policy when finished for shared
      policies.  So, this patch backs out all of the unneeded extra reference
      counting added by my previous attempt.  It then unrefs only shared policies
      when we're finished with them, using the mpol_cond_put() [conditional put]
      helper function introduced by this patch.
      
      Note that shmem_swapin() calls read_swap_cache_async() with a dummy vma
      containing just the policy.  read_swap_cache_async() can call alloc_page_vma()
      multiple times, so we can't let alloc_page_vma() unref the shared policy in
      this case.  To avoid this, we make a copy of any non-null shared policy and
      remove the MPOL_F_SHARED flag from the copy.  This copy occurs before reading
      a page [or multiple pages] from swap, so the overhead should not be an issue
      here.
      
      I introduced a new static inline function "mpol_cond_copy()" to copy the
      shared policy to an on-stack policy and remove the flags that would require a
      conditional free.  The current implementation of mpol_cond_copy() assumes that
      the struct mempolicy contains no pointers to dynamically allocated structures
      that must be duplicated or reference counted during copy.
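
      A user-space sketch of the two helpers may make the rule easier to
      follow.  It is a simplified model under assumptions: the toy_* names
      and struct are invented, the refcount is a plain int rather than an
      atomic, and toy_frees exists only so the behaviour can be observed.

```c
#include <assert.h>
#include <stdlib.h>

#define TOY_F_SHARED (1u << 0)   /* illustrative bit value */

struct toy_mempolicy {
    int refcnt;                  /* plain int here; not thread safe */
    unsigned short mode;
    unsigned short flags;
};

static int toy_frees;            /* counts frees, for demonstration only */

/* Drop one reference; free the policy when the last one goes away. */
static void toy_put(struct toy_mempolicy *pol)
{
    if (pol && --pol->refcnt == 0) {
        free(pol);
        toy_frees++;
    }
}

/* Lookup of a shared policy hands back an extra reference. */
static struct toy_mempolicy *toy_shared_lookup(struct toy_mempolicy *pol)
{
    assert(pol == NULL || (pol->flags & TOY_F_SHARED));
    if (pol)
        pol->refcnt++;
    return pol;
}

/* Conditional put: only shared policies carry a lookup reference. */
static void toy_cond_put(struct toy_mempolicy *pol)
{
    if (pol && (pol->flags & TOY_F_SHARED))
        toy_put(pol);
}

/*
 * Conditional copy: for a shared policy, copy it into caller-provided
 * storage, clear the shared flag on the copy and drop the lookup
 * reference right away.  The copy can then be passed to code that calls
 * toy_cond_put() any number of times without touching the original.
 */
static struct toy_mempolicy *toy_cond_copy(struct toy_mempolicy *to,
                                           struct toy_mempolicy *from)
{
    if (!from || !(from->flags & TOY_F_SHARED))
        return from;
    *to = *from;
    to->flags = (unsigned short)(to->flags & ~TOY_F_SHARED);
    toy_put(from);
    return to;
}
```

      The plain struct assignment is only safe because the toy policy holds
      no pointers, which mirrors the assumption stated above for the real
      helper.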
      Signed-off-by: Lee Schermerhorn <lee.schermerhorn@hp.com>
      Cc: Christoph Lameter <clameter@sgi.com>
      Cc: David Rientjes <rientjes@google.com>
      Cc: Mel Gorman <mel@csn.ul.ie>
      Cc: Andi Kleen <ak@suse.de>
      Signed-off-by: Andrew Morton <akpm@linux-foundation.org>
      Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>
    • mempolicy: mark shared policies for unref · aab0b102
      Lee Schermerhorn authored
      As part of yet another rework of mempolicy reference counting, we want to be
      able to identify shared policies efficiently, because they have an extra ref
      taken on lookup that needs to be removed when we're finished using the policy.
      
        Note:  the extra ref is required because the policies are
        shared between tasks/processes and can be changed/freed
        by one task while another task is using them--e.g., for
        page allocation.
      
      Building on David Rientjes' mempolicy "mode flags" enhancement, this patch
      indicates a "shared" policy by setting a new MPOL_F_SHARED flag in the flags
      member of the struct mempolicy added by David.  MPOL_F_SHARED, and any future
      "internal mode flags" are reserved from bit zero up, as they will never be
      passed in the upper bits of the mode argument of a mempolicy API.
      
      I set the MPOL_F_SHARED flag when the policy is installed in the shared policy
      rb-tree.  Don't need/want to clear the flag when removing from the tree as the
      mempolicy is freed [unref'd] internally to the sp_delete() function.  However,
      a task could hold another reference on this mempolicy from a prior lookup.  We
      need the MPOL_F_SHARED flag to stay put so that any tasks holding a ref will
      unref, eventually freeing, the mempolicy.
      
      A later patch in this series will introduce a function to conditionally unref
      [mpol_free] a policy.  The MPOL_F_SHARED flag is one reason [currently the
      only reason] to unref/free a policy via the conditional free.
      Signed-off-by: Lee Schermerhorn <lee.schermerhorn@hp.com>
      Cc: Christoph Lameter <clameter@sgi.com>
      Cc: David Rientjes <rientjes@google.com>
      Cc: Mel Gorman <mel@csn.ul.ie>
      Cc: Andi Kleen <ak@suse.de>
      Signed-off-by: Andrew Morton <akpm@linux-foundation.org>
      Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>
    • mempolicy: rename struct mempolicy 'policy' member to 'mode' · 45c4745a
      Lee Schermerhorn authored
      The terms 'policy' and 'mode' are both used in various places to describe the
      semantics of the value stored in the 'policy' member of struct mempolicy.
      Furthermore, the term 'policy' is used to refer to that member, to the entire
      struct mempolicy and to the more abstract concept of the tuple consisting of a
      "mode" and an optional node or set of nodes.  Recently, we have added "mode
      flags" that are passed in the upper bits of the 'mode' [or sometimes,
      'policy'] member of the numa APIs.
      
      I'd like to resolve this confusion, which perhaps only exists in my mind, by
      renaming the 'policy' member to 'mode' throughout, and fixing up the
      Documentation.  Man pages will be updated separately.
      Signed-off-by: Lee Schermerhorn <lee.schermerhorn@hp.com>
      Cc: Christoph Lameter <clameter@sgi.com>
      Cc: David Rientjes <rientjes@google.com>
      Cc: Mel Gorman <mel@csn.ul.ie>
      Cc: Andi Kleen <ak@suse.de>
      Signed-off-by: Andrew Morton <akpm@linux-foundation.org>
      Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>
    • mempolicy: rename mpol_copy to mpol_dup · 846a16bf
      Lee Schermerhorn authored
      This patch renames mpol_copy() to mpol_dup() because, well, that's what it
      does.  Like, e.g., strdup() for strings, mpol_dup() takes a pointer to an
      existing mempolicy, allocates a new one and copies the contents.
      
      In a later patch, I want to use the name mpol_copy() to copy the contents from
      one mempolicy to another like, e.g., strcpy() does for strings.
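
      The strdup()/strcpy() analogy maps onto two small helpers.  The
      sketch below uses an invented toy struct and toy_* names, so it
      shows the naming distinction only and not the kernel functions.

```c
#include <assert.h>
#include <stdlib.h>

struct toy_mempolicy {
    int refcnt;
    unsigned short mode;
    unsigned long nodes;
};

/*
 * Like strdup(): allocate a new object and copy the contents into it.
 * The duplicate starts with its own single reference.  Returns NULL if
 * the allocation fails; the caller owns the result.
 */
static struct toy_mempolicy *toy_mpol_dup(const struct toy_mempolicy *old)
{
    struct toy_mempolicy *copy;

    assert(old != NULL);
    copy = malloc(sizeof(*copy));
    if (copy == NULL)
        return NULL;
    *copy = *old;
    copy->refcnt = 1;
    return copy;
}

/*
 * Like strcpy(): overwrite storage the caller already has.  Nothing is
 * allocated, so this cannot fail.
 */
static struct toy_mempolicy *toy_mpol_copy(struct toy_mempolicy *dst,
                                           const struct toy_mempolicy *src)
{
    assert(dst != NULL && src != NULL);
    *dst = *src;
    return dst;
}
```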
      Signed-off-by: Lee Schermerhorn <lee.schermerhorn@hp.com>
      Cc: Christoph Lameter <clameter@sgi.com>
      Cc: David Rientjes <rientjes@google.com>
      Cc: Mel Gorman <mel@csn.ul.ie>
      Cc: Andi Kleen <ak@suse.de>
      Signed-off-by: Andrew Morton <akpm@linux-foundation.org>
      Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>
    • mempolicy: rename mpol_free to mpol_put · f0be3d32
      Lee Schermerhorn authored
      This is a change that was requested some time ago by Mel Gorman.  Makes sense
      to me, so here it is.
      
      Note: I retain the name "mpol_free_shared_policy()" because it actually does
      free the shared_policy, which is NOT a reference counted object.  However, ...
      
      The mempolicy object[s] referenced by the shared_policy are reference counted,
      so mpol_put() is used to release the reference held by the shared_policy.  The
      mempolicy might not be freed at this time, because some task attached to the
      shared object associated with the shared policy may be in the process of
      allocating a page based on the mempolicy.  In that case, the task performing
      the allocation will hold a reference on the mempolicy, obtained via
      mpol_shared_policy_lookup().  The mempolicy will be freed when all tasks
      holding such a reference have called mpol_put() for the mempolicy.
      Signed-off-by: Lee Schermerhorn <lee.schermerhorn@hp.com>
      Cc: Christoph Lameter <clameter@sgi.com>
      Cc: David Rientjes <rientjes@google.com>
      Cc: Mel Gorman <mel@csn.ul.ie>
      Cc: Andi Kleen <ak@suse.de>
      Signed-off-by: Andrew Morton <akpm@linux-foundation.org>
      Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>
    • mempolicy: small header file cleanup · 3842b46d
      David Rientjes authored
      Removes forward definition of vm_area_struct in linux/mempolicy.h.  We already
      get it from the linux/slab.h -> linux/gfp.h include.
      
      Removes the unused mpol_set_vma_default() macro from linux/mempolicy.h.
      
      Removes the extern definition of default_policy since it is only referenced,
      as it should be, in mm/mempolicy.c.
      
      Cc: Paul Jackson <pj@sgi.com>
      Cc: Christoph Lameter <clameter@sgi.com>
      Cc: Lee Schermerhorn <Lee.Schermerhorn@hp.com>
      Cc: Andi Kleen <ak@suse.de>
      Signed-off-by: David Rientjes <rientjes@google.com>
      Signed-off-by: Andrew Morton <akpm@linux-foundation.org>
      Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>