Commit 4d59a751 authored by Yasuaki Ishimatsu's avatar Yasuaki Ishimatsu Committed by Linus Torvalds

x86: get pg_data_t's memory from other node

During the implementation of SRAT support, we met a problem.  In
setup_arch(), we have the following call series:

 1) memblock is ready;
 2) some functions use memblock to allocate memory;
 3) parse ACPI tables, such as SRAT.

Before 3), we don't know which memory is hotpluggable, and as a result,
we cannot prevent memblock from allocating hotpluggable memory.  So, in
2), there could be some hotpluggable memory allocated by memblock.

Now, we are trying to parse SRAT earlier, before memblock is ready.  But
I think we need more investigation on this topic.  So in this v5, I
dropped all the SRAT support, and v5 is just the same as v3, and it is
based on 3.8-rc3.

As we planned, we will support getting info from SRAT without users'
participation at last.  And we will post another patch-set to do so.

And also, I think for now, we can add this boot option as the first step
of supporting movable node.  Since Linux cannot migrate the direct
mapped pages, the only way for now is to limit the whole node containing
only movable memory.

Using SRAT is one way.  But even if we can use SRAT, users still need an
interface to enable/disable this functionality if they don't want to
loose their NUMA performance.  So I think, a user interface is always
needed.

For now, users can disable this functionality by not specifying the boot
option.  Later, we will post SRAT support, and add another option value
"movablecore_map=acpi" to using SRAT.

This patch:

If system can create movable node which all memory of the node is
allocated as ZONE_MOVABLE, setup_node_data() cannot allocate memory for
the node's pg_data_t.  So, use memblock_alloc_try_nid() instead of
memblock_alloc_nid() to retry when the first allocation fails.
Signed-off-by: default avatarYasuaki Ishimatsu <isimatu.yasuaki@jp.fujitsu.com>
Signed-off-by: default avatarLai Jiangshan <laijs@cn.fujitsu.com>
Signed-off-by: default avatarTang Chen <tangchen@cn.fujitsu.com>
Signed-off-by: default avatarJiang Liu <jiang.liu@huawei.com>
Cc: Wu Jianguo <wujianguo@huawei.com>
Cc: Wen Congyang <wency@cn.fujitsu.com>
Cc: Mel Gorman <mel@csn.ul.ie>
Cc: Ingo Molnar <mingo@elte.hu>
Cc: "H. Peter Anvin" <hpa@zytor.com>
Cc: Thomas Gleixner <tglx@linutronix.de>
Signed-off-by: default avatarAndrew Morton <akpm@linux-foundation.org>
Signed-off-by: default avatarLinus Torvalds <torvalds@linux-foundation.org>
parent aa00d89c
......@@ -213,10 +213,9 @@ static void __init setup_node_data(int nid, u64 start, u64 end)
* Allocate node data. Try node-local memory and then any node.
* Never allocate in DMA zone.
*/
nd_pa = memblock_alloc_nid(nd_size, SMP_CACHE_BYTES, nid);
nd_pa = memblock_alloc_try_nid(nd_size, SMP_CACHE_BYTES, nid);
if (!nd_pa) {
pr_err("Cannot find %zu bytes in node %d\n",
nd_size, nid);
pr_err("Cannot find %zu bytes in any node\n", nd_size);
return;
}
nd = __va(nd_pa);
......
Markdown is supported
0% or .
You are about to add 0 people to the discussion. Proceed with caution.
Finish editing this message first!
Please register or to comment