1. 03 Jul, 2013 1 commit
    • Jiang Liu's avatar
      mm: accurately calculate zone->managed_pages for highmem zones · 7b4b2a0d
      Jiang Liu authored
      Commit "mm: introduce new field 'managed_pages' to struct zone" assumes
      that all highmem pages will be freed into the buddy system by function
      mem_init().  But that's not always true, some architectures may reserve
      some highmem pages during boot.  For example PPC may allocate highmem
      pages for giagant HugeTLB pages, and several architectures have code to
      check PageReserved flag to exclude highmem pages allocated during boot
      when freeing highmem pages into the buddy system.
      So treat highmem pages in the same way as normal pages, that is to:
      1) reset zone->managed_pages to zero in mem_init().
      2) recalculate managed_pages when freeing pages into the buddy system.
      Signed-off-by: default avatarJiang Liu <jiang.liu@huawei.com>
      Cc: "H. Peter Anvin" <hpa@zytor.com>
      Cc: Tejun Heo <tj@kernel.org>
      Cc: Joonsoo Kim <js1304@gmail.com>
      Cc: Yinghai Lu <yinghai@kernel.org>
      Cc: Mel Gorman <mel@csn.ul.ie>
      Cc: Minchan Kim <minchan@kernel.org>
      Cc: Kamezawa Hiroyuki <kamezawa.hiroyu@jp.fujitsu.com>
      Cc: Marek Szyprowski <m.szyprowski@samsung.com>
      Cc: "Michael S. Tsirkin" <mst@redhat.com>
      Cc: <sworddragon2@aol.com>
      Cc: Arnd Bergmann <arnd@arndb.de>
      Cc: Catalin Marinas <catalin.marinas@arm.com>
      Cc: Chris Metcalf <cmetcalf@tilera.com>
      Cc: David Howells <dhowells@redhat.com>
      Cc: Geert Uytterhoeven <geert@linux-m68k.org>
      Cc: Ingo Molnar <mingo@redhat.com>
      Cc: Jeremy Fitzhardinge <jeremy@goop.org>
      Cc: Jianguo Wu <wujianguo@huawei.com>
      Cc: Konrad Rzeszutek Wilk <konrad.wilk@oracle.com>
      Cc: Michel Lespinasse <walken@google.com>
      Cc: Rik van Riel <riel@redhat.com>
      Cc: Rusty Russell <rusty@rustcorp.com.au>
      Cc: Tang Chen <tangchen@cn.fujitsu.com>
      Cc: Thomas Gleixner <tglx@linutronix.de>
      Cc: Wen Congyang <wency@cn.fujitsu.com>
      Cc: Will Deacon <will.deacon@arm.com>
      Cc: Yasuaki Ishimatsu <isimatu.yasuaki@jp.fujitsu.com>
      Cc: Russell King <rmk@arm.linux.org.uk>
      Signed-off-by: default avatarAndrew Morton <akpm@linux-foundation.org>
      Signed-off-by: default avatarLinus Torvalds <torvalds@linux-foundation.org>
  2. 29 Apr, 2013 1 commit
  3. 23 Feb, 2013 1 commit
    • Wen Congyang's avatar
      memory-hotplug: common APIs to support page tables hot-remove · ae9aae9e
      Wen Congyang authored
      When memory is removed, the corresponding pagetables should alse be
      removed.  This patch introduces some common APIs to support vmemmap
      pagetable and x86_64 architecture direct mapping pagetable removing.
      All pages of virtual mapping in removed memory cannot be freed if some
      pages used as PGD/PUD include not only removed memory but also other
      memory.  So this patch uses the following way to check whether a page
      can be freed or not.
      1) When removing memory, the page structs of the removed memory are
         filled with 0FD.
      2) All page structs are filled with 0xFD on PT/PMD, PT/PMD can be
         cleared.  In this case, the page used as PT/PMD can be freed.
      For direct mapping pages, update direct_pages_count[level] when we freed
      their pagetables.  And do not free the pages again because they were
      freed when offlining.
      For vmemmap pages, free the pages and their pagetables.
      For larger pages, do not split them into smaller ones because there is
      no way to know if the larger page has been split.  As a result, there is
      no way to decide when to split.  We deal the larger pages in the
      following way:
      1) For direct mapped pages, all the pages were freed when they were
         offlined.  And since menmory offline is done section by section, all
         the memory ranges being removed are aligned to PAGE_SIZE.  So only need
         to deal with unaligned pages when freeing vmemmap pages.
      2) For vmemmap pages being used to store page_struct, if part of the
         larger page is still in use, just fill the unused part with 0xFD.  And
         when the whole page is fulfilled with 0xFD, then free the larger page.
      [akpm@linux-foundation.org: fix typo in comment]
      [tangchen@cn.fujitsu.com: do not calculate direct mapping pages when freeing vmemmap pagetables]
      [tangchen@cn.fujitsu.com: do not free direct mapping pages twice]
      [tangchen@cn.fujitsu.com: do not free page split from hugepage one by one]
      [tangchen@cn.fujitsu.com: do not split pages when freeing pagetable pages]
      [akpm@linux-foundation.org: use pmd_page_vaddr()]
      [akpm@linux-foundation.org: fix used-uninitialised bug]
      Signed-off-by: default avatarYasuaki Ishimatsu <isimatu.yasuaki@jp.fujitsu.com>
      Signed-off-by: default avatarJianguo Wu <wujianguo@huawei.com>
      Signed-off-by: default avatarWen Congyang <wency@cn.fujitsu.com>
      Signed-off-by: default avatarTang Chen <tangchen@cn.fujitsu.com>
      Cc: KOSAKI Motohiro <kosaki.motohiro@jp.fujitsu.com>
      Cc: Jiang Liu <jiang.liu@huawei.com>
      Cc: Kamezawa Hiroyuki <kamezawa.hiroyu@jp.fujitsu.com>
      Cc: Lai Jiangshan <laijs@cn.fujitsu.com>
      Cc: Wu Jianguo <wujianguo@huawei.com>
      Cc: Ingo Molnar <mingo@elte.hu>
      Cc: Thomas Gleixner <tglx@linutronix.de>
      Cc: "H. Peter Anvin" <hpa@zytor.com>
      Signed-off-by: default avatarAndrew Morton <akpm@linux-foundation.org>
      Signed-off-by: default avatarLinus Torvalds <torvalds@linux-foundation.org>
  4. 29 Jan, 2013 1 commit
  5. 12 Dec, 2012 1 commit
  6. 11 Dec, 2012 1 commit
    • Joonsoo Kim's avatar
      bootmem: fix wrong call parameter for free_bootmem() · 81df9bff
      Joonsoo Kim authored
      It is strange that alloc_bootmem() returns a virtual address and
      free_bootmem() requires a physical address.  Anyway, free_bootmem()'s
      first parameter should be physical address.
      There are some call sites for free_bootmem() with virtual address.  So fix
      [akpm@linux-foundation.org: improve free_bootmem() and free_bootmem_pate() documentation]
      Signed-off-by: default avatarJoonsoo Kim <js1304@gmail.com>
      Cc: Haavard Skinnemoen <hskinnemoen@gmail.com>
      Cc: Hans-Christian Egtvedt <egtvedt@samfundet.no>
      Cc: Johannes Weiner <hannes@cmpxchg.org>
      Cc: FUJITA Tomonori <fujita.tomonori@lab.ntt.co.jp>
      Cc: Benjamin Herrenschmidt <benh@kernel.crashing.org>
      Signed-off-by: default avatarAndrew Morton <akpm@linux-foundation.org>
      Signed-off-by: default avatarLinus Torvalds <torvalds@linux-foundation.org>
  7. 11 Jul, 2012 1 commit
    • Yinghai Lu's avatar
      mm: sparse: fix usemap allocation above node descriptor section · 99ab7b19
      Yinghai Lu authored
      After commit f5bf18fa ("bootmem/sparsemem: remove limit constraint
      in alloc_bootmem_section"), usemap allocations may easily be placed
      outside the optimal section that holds the node descriptor, even if
      there is space available in that section.  This results in unnecessary
      hotplug dependencies that need to have the node unplugged before the
      section holding the usemap.
      The reason is that the bootmem allocator doesn't guarantee a linear
      search starting from the passed allocation goal but may start out at a
      much higher address absent an upper limit.
      Fix this by trying the allocation with the limit at the section end,
      then retry without if that fails.  This keeps the fix from f5bf18fa
      of not panicking if the allocation does not fit in the section, but
      still makes sure to try to stay within the section at first.
      Signed-off-by: default avatarYinghai Lu <yinghai@kernel.org>
      Signed-off-by: default avatarJohannes Weiner <hannes@cmpxchg.org>
      Cc: <stable@vger.kernel.org>	[3.3.x, 3.4.x]
      Signed-off-by: default avatarAndrew Morton <akpm@linux-foundation.org>
      Signed-off-by: default avatarLinus Torvalds <torvalds@linux-foundation.org>
  8. 29 May, 2012 1 commit
  9. 23 May, 2012 1 commit
  10. 14 Jul, 2011 1 commit
  11. 25 May, 2011 1 commit
  12. 11 May, 2011 1 commit
    • Yinghai Lu's avatar
      mm: use alloc_bootmem_node_nopanic() on really needed path · 8f389a99
      Yinghai Lu authored
      Stefan found nobootmem does not work on his system that has only 8M of
      RAM.  This causes an early panic:
        BIOS-provided physical RAM map:
         BIOS-88: 0000000000000000 - 000000000009f000 (usable)
         BIOS-88: 0000000000100000 - 0000000000840000 (usable)
        bootconsole [earlyser0] enabled
        Notice: NX (Execute Disable) protection missing in CPU or disabled in BIOS!
        DMI not present or invalid.
        last_pfn = 0x840 max_arch_pfn = 0x100000
        init_memory_mapping: 0000000000000000-0000000000840000
        8MB LOWMEM available.
          mapped low ram: 0 - 00840000
          low ram: 0 - 00840000
        Zone PFN ranges:
          DMA      0x00000001 -> 0x00001000
          Normal   empty
        Movable zone start PFN for each node
        early_node_map[2] active PFN ranges
            0: 0x00000001 -> 0x0000009f
            0: 0x00000100 -> 0x00000840
        BUG: Int 6: CR2 (null)
             EDI c034663c  ESI (null)  EBP c0329f38  ESP c0329ef4
             EBX c0346380  EDX 00000006  ECX ffffffff  EAX fffffff4
             err (null)  EIP c0353191   CS c0320060  flg 00010082
        Stack: (null) c030c533 000007cd (null) c030c533 00000001 (null) (null)
               00000003 0000083f 00000018 00000002 00000002 c0329f6c c03534d6 (null)
               (null) 00000100 00000840 (null) c0329f64 00000001 00001000 (null)
        Pid: 0, comm: swapper Not tainted 2.6.36 #5
        Call Trace:
         [<c02e3707>] ? 0xc02e3707
         [<c035e6e5>] 0xc035e6e5
         [<c0353191>] ? 0xc0353191
         [<c03534d6>] 0xc03534d6
         [<c034f1cd>] 0xc034f1cd
         [<c034a824>] 0xc034a824
         [<c03513cb>] ? 0xc03513cb
         [<c0349432>] 0xc0349432
         [<c0349066>] 0xc0349066
      It turns out that we should ignore the low limit of 16M.
      Use alloc_bootmem_node_nopanic() in this case.
      [akpm@linux-foundation.org: less mess]
      Signed-off-by: default avatarYinghai LU <yinghai@kernel.org>
      Reported-by: default avatarStefan Hellermann <stefan@the2masters.de>
      Tested-by: default avatarStefan Hellermann <stefan@the2masters.de>
      Cc: Ingo Molnar <mingo@elte.hu>
      Cc: "H. Peter Anvin" <hpa@linux.intel.com>
      Cc: Thomas Gleixner <tglx@linutronix.de>
      Cc: <stable@kernel.org>		[2.6.34+]
      Signed-off-by: default avatarAndrew Morton <akpm@linux-foundation.org>
      Signed-off-by: default avatarLinus Torvalds <torvalds@linux-foundation.org>
  13. 23 Mar, 2011 1 commit
    • Olaf Hering's avatar
      crash_dump: export is_kdump_kernel to modules, consolidate elfcorehdr_addr,... · 93a72052
      Olaf Hering authored
      crash_dump: export is_kdump_kernel to modules, consolidate elfcorehdr_addr, setup_elfcorehdr and saved_max_pfn
      The Xen PV drivers in a crashed HVM guest can not connect to the dom0
      backend drivers because both frontend and backend drivers are still in
      connected state.  To run the connection reset function only in case of a
      crashdump, the is_kdump_kernel() function needs to be available for the PV
      driver modules.
      Consolidate elfcorehdr_addr, setup_elfcorehdr and saved_max_pfn into
      kernel/crash_dump.c Also export elfcorehdr_addr to make is_kdump_kernel()
      usable for modules.
      Leave 'elfcorehdr' as early_param().  This changes powerpc from __setup()
      to early_param().  It adds an address range check from x86 also on ia64
      and powerpc.
      [akpm@linux-foundation.org: additional #includes]
      [akpm@linux-foundation.org: remove elfcorehdr_addr export]
      [akpm@linux-foundation.org: fix for Tejun's mm/nobootmem.c changes]
      Signed-off-by: default avatarOlaf Hering <olaf@aepfle.de>
      Cc: Russell King <rmk@arm.linux.org.uk>
      Cc: "Luck, Tony" <tony.luck@intel.com>
      Cc: Benjamin Herrenschmidt <benh@kernel.crashing.org>
      Cc: Paul Mundt <lethal@linux-sh.org>
      Cc: Ingo Molnar <mingo@elte.hu>
      Cc: Thomas Gleixner <tglx@linutronix.de>
      Cc: "H. Peter Anvin" <hpa@zytor.com>
      Cc: Johannes Weiner <hannes@cmpxchg.org>
      Signed-off-by: default avatarAndrew Morton <akpm@linux-foundation.org>
      Signed-off-by: default avatarLinus Torvalds <torvalds@linux-foundation.org>
  14. 13 Dec, 2010 1 commit
  15. 12 Feb, 2010 1 commit
  16. 10 Nov, 2009 1 commit
    • FUJITA Tomonori's avatar
      bootmem: Add free_bootmem_late() · 9f993ac3
      FUJITA Tomonori authored
      Add a new function for freeing bootmem after the bootmem
      allocator has been released and the unreserved pages given to
      the page allocator.
      This allows us to reserve bootmem and then release it if we
      later discover it was not needed.
      ( This new API will be used by the swiotlb code to recover
        a significant amount of RAM (64MB). )
      Signed-off-by: default avatarFUJITA Tomonori <fujita.tomonori@lab.ntt.co.jp>
      Acked-by: default avatarPekka Enberg <penberg@cs.helsinki.fi>
      Cc: chrisw@sous-sol.org
      Cc: dwmw2@infradead.org
      Cc: joerg.roedel@amd.com
      Cc: muli@il.ibm.com
      Cc: hannes@cmpxchg.org
      Cc: tj@kernel.org
      Cc: akpm@linux-foundation.org
      Cc: Linus Torvalds <torvalds@linux-foundation.org>
      LKML-Reference: <1257849980-22640-7-git-send-email-fujita.tomonori@lab.ntt.co.jp>
      Signed-off-by: default avatarIngo Molnar <mingo@elte.hu>
  17. 22 Sep, 2009 1 commit
  18. 01 Apr, 2009 1 commit
  19. 23 Feb, 2009 2 commits
    • Tejun Heo's avatar
      bootmem: reorder interface functions and add a missing one · 2d0aae41
      Tejun Heo authored
      Impact: cleanup and addition of missing interface wrapper
      The interface functions in bootmem.h was ordered in not so orderly
      manner.  Reorder them such that
      * functions allocating the same area group together -
        ie. alloc_bootmem group and alloc_bootmem_low group.
      * functions w/o node parameter come before the ones w/ node parameter.
      * nopanic variants are immediately below their panicky counterparts.
      While at it, add alloc_bootmem_pages_node_nopanic() which was missing.
      Signed-off-by: default avatarTejun Heo <tj@kernel.org>
      Cc: Johannes Weiner <hannes@saeurebad.de>
    • Tejun Heo's avatar
      bootmem: clean up arch-specific bootmem wrapping · c1329375
      Tejun Heo authored
      Impact: cleaner and consistent bootmem wrapping
      By setting CONFIG_HAVE_ARCH_BOOTMEM_NODE, archs can define
      arch-specific wrappers for bootmem allocation.  However, this is done
      a bit strangely in that only the high level convenience macros can be
      changed while lower level, but still exported, interface functions
      can't be wrapped.  This not only is messy but also leads to strange
      situation where alloc_bootmem() does what the arch wants it to do but
      the equivalent __alloc_bootmem() call doesn't although they should be
      able to be used interchangeably.
      This patch updates bootmem such that archs can override / wrap the
      backend function - alloc_bootmem_core() instead of the highlevel
      interface functions to allow simpler and consistent wrapping.  Also,
      Signed-off-by: default avatarTejun Heo <tj@kernel.org>
      Cc: Johannes Weiner <hannes@saeurebad.de>
  20. 12 Aug, 2008 1 commit
  21. 25 Jul, 2008 1 commit
  22. 24 Jul, 2008 6 commits
  23. 08 Jul, 2008 1 commit
  24. 21 Jun, 2008 1 commit
  25. 28 Apr, 2008 1 commit
  26. 07 Feb, 2008 1 commit
    • Bernhard Walle's avatar
      Introduce flags for reserve_bootmem() · 72a7fe39
      Bernhard Walle authored
      This patchset adds a flags variable to reserve_bootmem() and uses the
      BOOTMEM_EXCLUSIVE flag in crashkernel reservation code to detect collisions
      between crashkernel area and already used memory.
      This patch:
      Change the reserve_bootmem() function to accept a new flag BOOTMEM_EXCLUSIVE.
      If that flag is set, the function returns with -EBUSY if the memory already
      has been reserved in the past.  This is to avoid conflicts.
      Because that code runs before SMP initialisation, there's no race condition
      inside reserve_bootmem_core().
      [akpm@linux-foundation.org: coding-style fixes]
      [akpm@linux-foundation.org: fix powerpc build]
      Signed-off-by: default avatarBernhard Walle <bwalle@suse.de>
      Cc: <linux-arch@vger.kernel.org>
      Cc: "Eric W. Biederman" <ebiederm@xmission.com>
      Cc: Vivek Goyal <vgoyal@in.ibm.com>
      Signed-off-by: default avatarAndrew Morton <akpm@linux-foundation.org>
      Signed-off-by: default avatarLinus Torvalds <torvalds@linux-foundation.org>
  27. 29 Oct, 2007 1 commit
  28. 01 Jun, 2007 1 commit
  29. 02 May, 2007 1 commit
    • Ravikiran G Thirumalai's avatar
      [PATCH] x86-64: Set HASHDIST_DEFAULT to 1 for x86_64 NUMA · e073ae1b
      Ravikiran G Thirumalai authored
      Enable system hashtable memory to be distributed among nodes on x86_64 NUMA
      Forcing the kernel to use node interleaved vmalloc instead of bootmem for
      the system hashtable memory (alloc_large_system_hash) reduces the memory
      imbalance on node 0 by around 40MB on a 8 node x86_64 NUMA box:
      Before the following patch, on bootup of a 8 node box:
      Node 0 MemTotal:      3407488 kB
      Node 0 MemFree:       3206296 kB
      Node 0 MemUsed:        201192 kB
      Node 0 Active:           7012 kB
      Node 0 Inactive:          512 kB
      Node 0 Dirty:               0 kB
      Node 0 Writeback:           0 kB
      Node 0 FilePages:        1912 kB
      Node 0 Mapped:            420 kB
      Node 0 AnonPages:        5612 kB
      Node 0 PageTables:        468 kB
      Node 0 NFS_Unstable:        0 kB
      Node 0 Bounce:              0 kB
      Node 0 Slab:             5408 kB
      Node 0 SReclaimable:      644 kB
      Node 0 SUnreclaim:       4764 kB
      After the patch (or using hashdist=1 on the kernel command line):
      Node 0 MemTotal:      3407488 kB
      Node 0 MemFree:       3247608 kB
      Node 0 MemUsed:        159880 kB
      Node 0 Active:           3012 kB
      Node 0 Inactive:          616 kB
      Node 0 Dirty:               0 kB
      Node 0 Writeback:           0 kB
      Node 0 FilePages:        2424 kB
      Node 0 Mapped:            380 kB
      Node 0 AnonPages:        1200 kB
      Node 0 PageTables:        396 kB
      Node 0 NFS_Unstable:        0 kB
      Node 0 Bounce:              0 kB
      Node 0 Slab:             6304 kB
      Node 0 SReclaimable:     1596 kB
      Node 0 SUnreclaim:       4708 kB
      I guess it is a good idea to keep HASHDIST_DEFAULT "on" for x86_64 NUMA
      since x86_64 has no dearth of vmalloc space?  Or maybe enable hash
      distribution for all 64bit NUMA arches?  The following patch does it only
      for x86_64.
      I ran a HPC MPI benchmark -- 'Ansys wingsolid', which takes up quite a bit of
      memory and uses up tlb entries.  This was on a 4 way, 2 socket
      Tyan AMD box (non vsmp), with 8G total memory (4G pernode).
      The results with and without hash distribution are:
      1. Vanilla - runtime of 1188.000s
      2. With hashdist=1 runtime of 1154.000s
      Oprofile output for the duration of run is:
      1. Vanilla:
      PU: AMD64 processors, speed 2411.16 MHz (estimated)
      Counted L1_AND_L2_DTLB_MISSES events (L1 and L2 DTLB misses) with a unit
      mask of 0x00 (No unit mask) count 500
      samples  %        app name                 symbol name
      163054    6.5513  libansys1.so             MultiFront::decompose(int, int,
      Elemset *, int *, int, int, int)
      162061    6.5114  libansys3.so             blockSaxpy6L_fd
      162042    6.5107  libansys3.so             blockInnerProduct6L_fd
      156286    6.2794  libansys3.so             maxb33_
      87879     3.5309  libansys1.so             elmatrixmultpcg_
      84857     3.4095  libansys4.so             saxpy_pcg
      58637     2.3560  libansys4.so             .st4560
      46612     1.8728  libansys4.so             .st4282
      43043     1.7294  vmlinux-t                copy_user_generic_string
      41326     1.6604  libansys3.so             blockSaxpyBackSolve6L_fd
      41288     1.6589  libansys3.so             blockInnerProductBackSolve6L_fd
      2. With hashdist=1
      CPU: AMD64 processors, speed 2411.13 MHz (estimated)
      Counted L1_AND_L2_DTLB_MISSES events (L1 and L2 DTLB misses) with a unit
      mask of 0x00 (No unit mask) count 500
      samples  %        app name                 symbol name
      162993    6.9814  libansys1.so             MultiFront::decompose(int, int,
      Elemset *, int *, int, int, int)
      160799    6.8874  libansys3.so             blockInnerProduct6L_fd
      160459    6.8729  libansys3.so             blockSaxpy6L_fd
      156018    6.6826  libansys3.so             maxb33_
      84700     3.6279  libansys4.so             saxpy_pcg
      83434     3.5737  libansys1.so             elmatrixmultpcg_
      58074     2.4875  libansys4.so             .st4560
      46000     1.9703  libansys4.so             .st4282
      41166     1.7632  libansys3.so             blockSaxpyBackSolve6L_fd
      41033     1.7575  libansys3.so             blockInnerProductBackSolve6L_fd
      35762     1.5318  libansys1.so             inner_product_sub
      35591     1.5245  libansys1.so             inner_product_sub2
      28259     1.2104  libansys4.so             addVectors
      Signed-off-by: default avatarPravin B. Shelar <pravin.shelar@calsoftinc.com>
      Signed-off-by: default avatarRavikiran Thirumalai <kiran@scalex86.org>
      Signed-off-by: default avatarShai Fultheim <shai@scalex86.org>
      Signed-off-by: default avatarAndi Kleen <ak@suse.de>
      Acked-by: default avatarChristoph Lameter <clameter@engr.sgi.com>
      Cc: Andi Kleen <ak@suse.de>
      Signed-off-by: default avatarAndrew Morton <akpm@linux-foundation.org>
  30. 22 Mar, 2007 1 commit
  31. 07 Dec, 2006 1 commit
  32. 26 Sep, 2006 3 commits