Commit 89df5958 authored by Charlie Jacobsen's avatar Charlie Jacobsen Committed by Vikram Narayanan

liblcd-v2: Fix kliblcd mapping, address -> cptr translation.

I needed to clarify this part of the liblcd interface and ensure
I have it right.

Non-isolated code *must* invoke lcd_map_phys, lcd_map_virt, or
_lcd_mmap on memory objects it either volunteers or is granted.
This gives kliblcd an opportunity to insert the memory object
into one of the two resource trees used for address -> cptr
translation.

I changed the names of the two trees to clarify what is stored
in them - contiguous vs. non-contiguous. (This is only a kliblcd
internal thing.) It doesn't matter whether you map contiguous memory
via lcd_map_phys or lcd_map_virt; it always goes in the contiguous
resource tree. Similarly, non-contiguous (e.g., vmalloc) memory
always goes in the non-contiguous resource tree.
parent 7fe276a9
......@@ -476,6 +476,22 @@ unsigned long __lcd_memory_object_size(struct lcd_memory_object *mo);
* This is just start + size - 1.
*/
unsigned long __lcd_memory_object_last(struct lcd_memory_object *mo);
/**
* __lcd_memory_object_is_contiguous -- Returns non-zero if memory is
* contiguous in host's physical mem
* @mo: the memory object
*
* For things like vmalloc memory, this returns 0 (for non-contiguous).
*/
int __lcd_memory_object_is_contiguous(struct lcd_memory_object *mo);
/**
* __lcd_memory_object_is_ram -- Returns non-zero if the memory object is RAM
* (i.e., not device memory)
*/
int __lcd_memory_object_is_ram(struct lcd_memory_object *mo);
/**
* __lcd_memory_object_hva -- Return host virtual address of start of mem obj
*/
hva_t __lcd_memory_object_hva(struct lcd_memory_object *mo);
/**
* __lcd_insert_memory_object -- Insert memory object into LCD's cspace, and
* into the global memory interval tree
......
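As a rough sketch, the contiguity predicate can be pictured as a dispatch on the memory-object sub-type (the sub-type names appear in this diff; the enum values and the dispatch itself are assumptions for illustration):

```c
#include <assert.h>

/* Sub-type IDs as named in the diff; the values are assumptions. */
enum lcd_mo_sub_type {
	LCD_MICROKERNEL_TYPE_ID_PAGE,
	LCD_MICROKERNEL_TYPE_ID_VOLUNTEERED_PAGE,
	LCD_MICROKERNEL_TYPE_ID_VOLUNTEERED_DEV_MEM,
	LCD_MICROKERNEL_TYPE_ID_VMALLOC_MEM,
	LCD_MICROKERNEL_TYPE_ID_VOLUNTEERED_VMALLOC_MEM,
};

struct lcd_memory_object { enum lcd_mo_sub_type sub_type; };

/* Pages and device memory are physically contiguous; vmalloc is not. */
static int mo_is_contiguous(const struct lcd_memory_object *mo)
{
	switch (mo->sub_type) {
	case LCD_MICROKERNEL_TYPE_ID_VMALLOC_MEM:
	case LCD_MICROKERNEL_TYPE_ID_VOLUNTEERED_VMALLOC_MEM:
		return 0;
	default:
		return 1;
	}
}
```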
......@@ -81,22 +81,42 @@ int _lcd_vmalloc(unsigned int order, cptr_t *slot_out);
* @mo: cptr to memory object capability (e.g., pages)
* @base: starting guest physical address where memory should be mapped
*
* Purpose: Maps the memory referred to by the @mo cptr_t in the caller's
* guest physical address space, starting at @base. The semantics differ
* slightly depending on the environment.
*
* In either environment, this call (or a higher-level counterpart, like
* lcd_map_phys) is necessary before you try to do address -> cptr
* translation via lcd_phys_to_cptr (even for non-isolated code).
*
* Non-isolated code notes
* -----------------------
*
* For non-isolated code, all physical memory is already accessible. This
* function ignores @base and just inserts the memory object into the
* caller's internal physical address space resource tree.
*
* If @mo refers to vmalloc memory, this function fails. You should use
* lcd_map_virt for vmalloc memory. (See lcd_map_virt's documentation
* for why.)
*
* (Unlike isolated code, some memory objects - vmalloc memory in
* particular - can be discontiguous in the host's physical address
* space. This is why, internally, kliblcd needs to use two resource
* trees - one for contiguous physical memory, the other for physically
* discontiguous but virtually contiguous memory.)
*
* Isolated code notes
* -------------------
*
* For isolated code, maps memory object at @base. If the memory
* object capability has already been used to map the memory
* object, this call fails. (The microkernel only allows the
* memory object to be mapped once per capability that the LCD
* has.)
*
* Furthermore, if @base already has an existing mapping, or @mo won't
* fit, the microkernel will reject the mapping.
*/
int _lcd_mmap(cptr_t mo, gpa_t base);
......@@ -104,19 +124,34 @@ int _lcd_mmap(cptr_t mo, gpa_t base);
* _lcd_munmap -- Low-level unmapping, calls into microkernel
* @mo: cptr to memory object capability
*
* Purpose: Unmaps the memory referred to by the @mo cptr_t in the caller's
* guest physical address space. If the memory object isn't mapped,
* silently fails / does nothing.
*
* You may wonder: why don't we pass just the guest physical address
* that we want unmapped? Answer: the microkernel needs to keep track
* of where memory objects are mapped (so that if rights are revoked,
* it knows how to unmap them).
*
* The semantics are a bit different depending on the environment.
*
* Non-isolated code notes
* -----------------------
*
* If the caller is a non-isolated thread, this will remove
* the memory object from the internal resource tree used for
* address -> cptr translation. If you skip this call, later
* address -> cptr translations will return stale results (this
* only affects the caller, not other kLCDs).
*
* However, no "unmapping" is actually done (memory objects are
* always mapped in the host).
*
* Isolated code notes
* -------------------
*
* For isolated code, this will actually unmap in the physical address
* space, and remove from the internal resource tree used for
* address -> cptr translation.
*/
void _lcd_munmap(cptr_t mo);
......@@ -183,36 +218,47 @@ void lcd_vfree(void *ptr);
* @order: there are 2^order pages to map
* @base_out: out param, guest physical address where pages were mapped
*
* IMPORTANT: This should not be used for ioremap-like functionality. This
* function will not set up the memory types / cache behavior properly for
* device memory.
*
* Map pages in a free part of the physical address space. Returns
* the (guest) physical address where the pages were mapped. (This is
* kind of like kmap, but for the guest physical address space.)
*
* If @pages has already been mapped, the microkernel will reject the
* map request, and this call will fail. (While the check isn't strictly
* necessary for non-isolated code, the same rule is enforced there, in
* order to mirror the semantics of isolated code and to prevent
* duplicate inserts into the internal resource tree.)
*
* Non-isolated code notes
* -----------------------
*
* For non-isolated code, guest physical == host physical (by convention),
* and the pages are already technically accessible to the caller (but the
* caller probably won't know the physical address of the pages - they just
* have the opaque cptr_t for the pages). So, for non-isolated code, this
* function won't do any mapping.
*
* However, it *will* add the memory object to the caller's internal
* physical address space resource tree. So later, you will be able
* to invoke lcd_phys_to_cptr and look up the pages cptr (i.e., do
* address -> cptr translation).
*
* Note that since guest physical == host physical for non-isolated
* code, the guest physical address returned is actually the host
* physical address of the pages.
*
* Isolated code notes
* -------------------
*
* If @order is not the true order of @pages, you may get odd behavior.
* (Internally, @order is going to be used to find a free spot in the
* address space.)
*/
int lcd_map_phys(cptr_t pages, unsigned int order, gpa_t *base_out);
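The "reject a second map of the same memory object" rule above can be modeled with a toy insert that refuses duplicates (names and error values are illustrative, not the real kliblcd API):

```c
#include <assert.h>
#include <stddef.h>

typedef unsigned long cptr_t;

struct entry { cptr_t cptr; int used; };

/* Toy "resource tree": reject a second insert of the same cptr,
 * mirroring the one-mapping-per-capability rule. */
static int insert_mo(struct entry *tree, size_t n, cptr_t c)
{
	size_t i, free_slot = n;
	for (i = 0; i < n; i++) {
		if (tree[i].used && tree[i].cptr == c)
			return -1; /* already mapped: call fails */
		if (!tree[i].used && free_slot == n)
			free_slot = i;
	}
	if (free_slot == n)
		return -2; /* no room */
	tree[free_slot].cptr = c;
	tree[free_slot].used = 1;
	return 0;
}
```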
/**
* lcd_map_virt -- Map pages in a free part of the physical and virtual
* address spaces (does both for you)
* @pages: cptr to pages memory object
* @order: there are 2^order pages
* @gva_out: out param, where the memory was mapped in the virtual address
* space
*
......@@ -220,36 +266,36 @@ int lcd_map_phys(cptr_t pages, unsigned int order, gpa_t *base_out);
* It will not set up memory types for io mem (e.g., marking memory as
* uncacheable).
*
* This is similar to lcd_map_phys, but also maps the memory in the
* caller's virtual address space.
*
* Non-isolated code notes
* -----------------------
*
* For non-isolated code, at least on x86_64, all physical memory
* is already mapped, so this function won't do any extra virtual
* mapping.
*
* It *will* add the memory object to the caller's internal resource
* tree, so you will be able to invoke lcd_virt_to_cptr.
*
* Further note that guest virtual == host virtual, by convention -
* so the guest virtual address returned is actually a host virtual
* address.
*
* NOTE: You *must* use this function, not lcd_map_phys, to "map"
* vmalloc memory. vmalloc memory is not contiguous in a non-isolated
* thread's physical address space, so it cannot go in the contiguous
* resource tree that lcd_map_phys uses (the internal lookup logic
* would break).
*
* Isolated code notes
* -------------------
*
* If @order doesn't match the true order of the pages memory object,
* you may get weird behavior.
*/
int lcd_map_virt(cptr_t pages, unsigned int order, gva_t *gva_out);
/**
* lcd_unmap_phys -- Unmap pages from physical address space
* @base: guest physical address where pages are mapped
......@@ -269,18 +315,11 @@ void lcd_unmap_phys(gpa_t base, unsigned int order);
*
* Unmap memory from caller's virtual address space. (This is like kunmap.)
*
* Note: For non-isolated code, again, this won't unmap anything, but it
* will update some internal data structures, so it's important you call
* it.
*/
void lcd_unmap_virt(gva_t base, unsigned int order);
/* "VOLUNTEERING" PAGES ---------------------------------------- */
......
......@@ -13,9 +13,8 @@
/* RESOURCE TREES -------------------------------------------------- */
/*
* There are two trees: One for contiguous memory (RAM and device memory)
* and one for non-contiguous memory (vmalloc).
*
* For now, resource trees are per-thread (these are not the same
* thing as the global memory interval tree). So, we don't use any
......@@ -23,12 +22,10 @@
*
* We also expect non-isolated code to not be too tricky. For example,
* we don't expect it to insert host memory as RAM and VMALLOC memory
* simultaneously. Non-isolated code is trusted after all ...
*/
#define LCD_RESOURCE_TREE_CONTIGUOUS 0
#define LCD_RESOURCE_TREE_NON_CONTIGUOUS 1
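Under the renamed indices, choosing a tree reduces to the contiguity test; a toy illustration (index values from the defines above, helper name hypothetical):

```c
#include <assert.h>

#define LCD_RESOURCE_TREE_CONTIGUOUS 0
#define LCD_RESOURCE_TREE_NON_CONTIGUOUS 1

/* is_contiguous is 1 for RAM/device memory, 0 for vmalloc-backed
 * objects (stand-in for __lcd_memory_object_is_contiguous). */
static int tree_index(int is_contiguous)
{
	return is_contiguous ? LCD_RESOURCE_TREE_CONTIGUOUS
			     : LCD_RESOURCE_TREE_NON_CONTIGUOUS;
}
```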
int lcd_alloc_init_resource_tree(struct lcd_resource_tree **t_out)
{
......@@ -120,29 +117,16 @@ static int mo_insert_in_trees(struct task_struct *t,
struct lcd_memory_object *mo,
cptr_t mo_cptr)
{
if (__lcd_memory_object_is_contiguous(mo))
return mo_insert_in_tree(
t->lcd_resource_trees[LCD_RESOURCE_TREE_CONTIGUOUS],
mo,
mo_cptr);
else
return mo_insert_in_tree(
t->lcd_resource_trees[LCD_RESOURCE_TREE_NON_CONTIGUOUS],
mo,
mo_cptr);
}
static int mo_in_tree(struct lcd_resource_tree *t,
......@@ -159,26 +143,14 @@ static int mo_in_tree(struct lcd_resource_tree *t,
static int mo_in_trees(struct task_struct *t, struct lcd_memory_object *mo)
{
if (__lcd_memory_object_is_contiguous(mo))
return mo_in_tree(
t->lcd_resource_trees[LCD_RESOURCE_TREE_CONTIGUOUS],
mo);
else
return mo_in_tree(
t->lcd_resource_trees[LCD_RESOURCE_TREE_NON_CONTIGUOUS],
mo);
}
static void mo_remove_from_tree(struct lcd_resource_tree *tree,
......@@ -206,26 +178,14 @@ static void mo_remove_from_tree(struct lcd_resource_tree *tree,
static void mo_remove_from_trees(struct task_struct *t,
struct lcd_memory_object *mo)
{
if (__lcd_memory_object_is_contiguous(mo))
mo_remove_from_tree(
t->lcd_resource_trees[LCD_RESOURCE_TREE_CONTIGUOUS],
mo);
else
mo_remove_from_tree(
t->lcd_resource_trees[LCD_RESOURCE_TREE_NON_CONTIGUOUS],
mo);
}
/* LOW-LEVEL PAGE ALLOC -------------------------------------------------- */
......@@ -327,7 +287,7 @@ fail1:
/* LOW-LEVEL MAP -------------------------------------------------- */
static int do_map_phys(struct lcd *lcd, struct lcd_memory_object *mo,
struct cnode *cnode, cptr_t mo_cptr, gpa_t base)
{
int ret;
......@@ -369,43 +329,16 @@ fail1:
int _lcd_mmap(cptr_t mo_cptr, gpa_t base)
{
gpa_t unused;
/*
 * Ignore gpa arg - non-isolated code cannot change physical
 * mappings
 *
 * We cheat and use lcd_map_phys since all it does is add
 * the memory object to the physical address space resource
 * tree. For non-isolated, @order is ignored.
 */
return lcd_map_phys(mo_cptr, 0, &unused);
}
static void do_phys_unmap(struct lcd *lcd, struct lcd_memory_object *mo,
struct cnode *mo_cnode)
{
/*
......@@ -436,7 +369,7 @@ void _lcd_munmap(cptr_t mo_cptr)
/*
* Do the unmap
*/
do_phys_unmap(current->lcd, mo, cnode);
/*
* Release locks
*/
......@@ -675,15 +608,22 @@ int lcd_map_phys(cptr_t pages, unsigned int order, gpa_t *base_out)
LIBLCD_ERR("internal error: mem lookup failed");
goto fail1;
}
/*
* Ensure the memory object is for contiguous memory
*/
if (!__lcd_memory_object_is_contiguous(mo)) {
LIBLCD_ERR("memory object is not contiguous; use lcd_map_virt instead");
ret = -EINVAL;
goto fail2;
}
/*
* "Map" the pages (adds pages to proper resource tree)
*/
ret = __lcd_mmap(current->lcd, mo, cnode, pages);
if (ret) {
LIBLCD_ERR("error mapping pages in resource tree");
goto fail3;
}
p = mo->object;
/* guest physical == host physical for non-isolated */
*base_out = __gpa(hpa_val(va2hpa(page_address(p))));
......@@ -694,64 +634,62 @@ int lcd_map_phys(cptr_t pages, unsigned int order, gpa_t *base_out)
return 0;
fail3:
fail2:
__lcd_put_memory_object(current->lcd, cnode, mo);
fail1:
return ret;
}
int lcd_map_virt(cptr_t pages, unsigned int order, gva_t *gva_out)
{
int ret;
struct cnode *cnode;
struct lcd_memory_object *mo;
/*
 * Ignore order
 *
 * Look up memory object so we can get the virtual address.
 */
ret = __lcd_get_memory_object(current->lcd, pages, &cnode, &mo);
if (ret) {
LIBLCD_ERR("internal error: mem lookup failed");
goto fail1;
}
/*
 * Ensure this is RAM mem (lcd_map_virt doesn't do ioremap)
 */
if (!__lcd_memory_object_is_ram(mo)) {
LIBLCD_ERR("cannot use lcd_map_virt for dev mem");
ret = -EINVAL;
goto fail2;
}
/*
 * "Map" the memory (adds pages to proper resource tree)
 */
ret = __lcd_mmap(current->lcd, mo, cnode, pages);
if (ret) {
LIBLCD_ERR("error mapping pages in resource tree");
goto fail3;
}
/* guest virtual == host virtual for non-isolated */
*gva_out = __gva(hva_val(__lcd_memory_object_hva(mo)));
/*
* Release memory object
*/
__lcd_put_memory_object(current->lcd, cnode, mo);
return 0;
fail3:
fail2:
__lcd_put_memory_object(current->lcd, cnode, mo);
fail1:
return ret;
}
void lcd_unmap_phys(gpa_t base, unsigned int order)
{
int ret;
......@@ -776,34 +714,24 @@ void lcd_unmap_phys(gpa_t base, unsigned int order)
void lcd_unmap_virt(gva_t base, unsigned int order)
{
int ret;
cptr_t mo_cptr;
unsigned long mo_size;
/*
 * No real unmapping needs to be done, but we need to
 * update the resource tree.
 *
 * Look up cptr for virtual memory
 */
ret = lcd_virt_to_cptr(base, &mo_cptr, &mo_size);
if (ret) {
LIBLCD_ERR("virt not mapped?");
return;
}
/*
 * Remove memory object from resource tree
 */
_lcd_munmap(mo_cptr);
}
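The new lcd_unmap_virt flow - look up the cptr for the virtual address, then remove the memory object from the tree - can be modeled in miniature (hypothetical stand-ins, not the real API):

```c
#include <assert.h>
#include <stddef.h>

typedef unsigned long cptr_t;

/* Toy non-contiguous resource tree: virtual ranges -> cptrs. */
struct vrange { unsigned long start, last; cptr_t cptr; int live; };

static int virt_to_cptr(const struct vrange *t, size_t n,
			unsigned long va, cptr_t *out)
{
	size_t i;
	for (i = 0; i < n; i++) {
		if (t[i].live && va >= t[i].start && va <= t[i].last) {
			*out = t[i].cptr;
			return 0;
		}
	}
	return -1;
}

/* Mirror of the new lcd_unmap_virt: look up, then drop from the tree. */
static void unmap_virt(struct vrange *t, size_t n, unsigned long va)
{
	cptr_t c;
	size_t i;
	if (virt_to_cptr(t, n, va, &c))
		return; /* "virt not mapped?" - silently do nothing */
	for (i = 0; i < n; i++)
		if (t[i].live && t[i].cptr == c)
			t[i].live = 0;
}
```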