1. 26 Oct, 2016 40 commits
    • host-resource-trees: Volunteer host memory. · ac556704
      Charlie Jacobsen authored and Vikram Narayanan committed
      When non-isolated code wants to "volunteer" host memory to the
      microkernel's capability system, it invokes lcd_volunteer_pages,
      lcd_volunteer_dev_mem, or lcd_volunteer_vmalloc_mem, depending on
      the type of memory.
      
      Internally, these check to see if the memory has already been
      volunteered (so we don't get duplicates -- this is checked via the
      global memory interval tree). If not, it inserts the memory into
      the caller's cspace. The caller can subsequently share the memory
      with e.g. an isolated LCD via the capability mediated interfaces.
      
      I had support for this before, but there was no checking for
      duplicate inserts (and this is a real problem with the pmfs example
      for string sharing): Non-isolated code has no way of knowing
      (without implementing data structures on its own) whether it inserted
      host memory already or not, or whether some other non-isolated
      code has.
      
      Furthermore, now we have full support for address -> cptr translation
      in the non-isolated side. This is also needed for the pmfs example
      with string sharing: before, the non-isolated code just always
      inserted memory every time to share strings, even if this led
      to duplicate inserts.
      
      I think this is one of the "friction points" of embedding a
      capability system inside a kernel: translation from host objects
      to capabilities and back. For some objects, you can just embed
      the cptr in the object itself (our "container structs"). But for
      some things -- like memory -- it's not so easy. (For device memory,
      the host kernel doesn't use a struct page to represent it. So we're
      faced with creating our own giant array of data structures to
      represent each page of device memory, and embedding the cptr in
      that. Or instead -- as I have done -- use a data structure like
      a tree to do a reverse lookup.)
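
      A minimal sketch of the duplicate check described above, assuming a
      small fixed table stands in for the global memory interval tree. The
      names mem_range, range_lookup, and volunteer_mem are hypothetical and
      are not the real lcd_volunteer_* interfaces. Any overlap with a tracked
      range is treated as "already volunteered", which is a simplification.

```c
#include <assert.h>
#include <stddef.h>

/* Hypothetical stand-in for a cptr: just an integer handle. */
typedef unsigned long cptr_t;

/* One tracked range of host memory and the cptr that names it. */
struct mem_range {
    unsigned long start;  /* first byte of the range */
    unsigned long last;   /* last byte of the range (inclusive) */
    cptr_t cptr;
};

#define MAX_RANGES 16
struct mem_range ranges[MAX_RANGES];  /* stand-in for the interval tree */
size_t nr_ranges;
cptr_t next_cptr = 1;

/* Return the tracked range that overlaps [start, last], or NULL. */
struct mem_range *range_lookup(unsigned long start, unsigned long last)
{
    for (size_t i = 0; i < nr_ranges; i++)
        if (ranges[i].start <= last && start <= ranges[i].last)
            return &ranges[i];
    return NULL;
}

/*
 * Volunteer [start, start + size). If the memory is already tracked,
 * hand back the existing cptr instead of inserting a duplicate.
 * Returns 0 on success, -1 if the table is full.
 */
int volunteer_mem(unsigned long start, unsigned long size, cptr_t *out)
{
    assert(size > 0);
    unsigned long last = start + size - 1;
    struct mem_range *r = range_lookup(start, last);

    if (r) {                      /* already volunteered: reuse its cptr */
        *out = r->cptr;
        return 0;
    }
    if (nr_ranges == MAX_RANGES)
        return -1;
    ranges[nr_ranges].start = start;
    ranges[nr_ranges].last = last;
    ranges[nr_ranges].cptr = next_cptr;
    nr_ranges++;
    *out = next_cptr++;
    return 0;
}
```

      Volunteering the same range twice returns the same cptr both times,
      so callers do not need their own bookkeeping to avoid duplicates.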
    • host-resource-trees: Address to cptr translation using the trees. · 6b11c0b1
      Charlie Jacobsen authored and Vikram Narayanan committed
    • host-resource-trees: Update higher level alloc, map, unmap for trees. · 647e7c9b
      Charlie Jacobsen authored and Vikram Narayanan committed
      To mirror what will happen inside an isolated container, when
      code does a higher-level page alloc, it expects the pages to
      be mapped in its virtual and physical address spaces. It also
      expects to be able to do address -> cptr translations. This means
      we need to insert the page memory object into the resource tree
      during page alloc (and remove it when the pages are freed).
      
      Similarly, for higher-level map/unmap, we need to update the resource
      trees - as this is what will happen inside an isolated LCD, and
      non-isolated code expects to be able to do address -> cptr
      translation.
      
      Almost done with this boring but important feature ...
    • host-resource-trees: Add/remove resource nodes from trees in kliblcd. · d39579d0
      Charlie Jacobsen authored and Vikram Narayanan committed
      Motivation: An LCD needs to keep track of address -> cptr
      correspondences. The resource trees fulfill that role. Each
      kLCD has two resource trees: one for physical memory (RAM, dev mem,
      etc.) and one for vmalloc memory (non-contiguous physical
      memory that is contiguous in the virtual address space).
      
      To mirror isolated code, when the kLCD maps/unmaps a memory
      object in host physical, we update its resource trees. (Of course,
      we don't bother / can't modify the host's physical mappings, so
      this is all that happens.) It gives kliblcd a chance to update
      the trees.
    • host-resource-trees: Remove memory object from tree in cap delete. · 659122b2
      Charlie Jacobsen authored and Vikram Narayanan committed
      When the last capability to a memory object goes away, the object
      will no longer be in the microkernel's capability system. Remove
      it from the memory interval tree.
    • host-resource-trees: Page alloc inserts into memory interval tree. · ccbce7bb
      Charlie Jacobsen authored and Vikram Narayanan committed
      Microkernel's page allocation code will insert the memory object
      that represents the allocated pages into the interval tree.
      
      Fixes interval tree insertion to use memory object range (start
      and last).
      
      Adds locks for tree traversal and per-node locks.
    • Fix a few syntax errors in VT-x code for PAT. · 3c82ef28
      Charles Jacobsen authored and Vikram Narayanan committed
    • host-resource-trees: Add global memory interval tree. · ad373505
      Charlie Jacobsen authored and Vikram Narayanan committed
      This is for tracking what host memory is in the microkernel's
      capability system. This prevents us from inserting the same memory
      multiple times into the capability system.
    • host-resource-trees: Clean up internal header, documentation. · e43ec0a6
      Charlie Jacobsen authored and Vikram Narayanan committed
      Clean up microkernel's module init/exit file, add init/exits for
      each subsystem.
    • Fix compilation error · 4acf9cbb
      Abhiram Balasubramanian authored and Vikram Narayanan committed
      
      
      - moved dependencies accordingly
      Signed-off-by: Abhiram Balasubramanian <abhiram@cs.utah.edu>
    • Moved LCD_TEST_MODS_PATH to the Makefile · 4e643c28
      Anton Burtsev authored and Vikram Narayanan committed
      -- this way we don't have to change include/lcd-domains/types.h
    • Charlie Jacobsen's avatar
      c04aa003
    • libcap-integration: Add support for sharing vmalloc memory. · b9325169
      Charlie Jacobsen authored and Vikram Narayanan committed
      We were doing this before (memory for module bits is vmalloc'd),
      but we used a capability per page. This leads to a handful of
      cptr's we have to keep track of, rather than just a couple.
      Moreover, when I tune page allocation inside the LCD, I may
      have to vmalloc memory (if the required chunks are big
      enough).
      
      So, one vmalloc allocation should be represented with a
      single capability. (We sacrifice some access control granularity.)
    • Add ioremap support for lcds · be3834c3
      Abhiram Balasubramanian authored and Vikram Narayanan committed
      
      
      - introduce a new memory space as a part of GPA and GVA
      - set PAT memory type to UC so that effective memory type becomes UC
      
      NOTE - implementation needs to be tested with Charlie's revamped code
      Signed-off-by: Abhiram Balasubramanian <abhiram@cs.utah.edu>
    • libcap-integration: Resource trees partially integrated into kliblcd. · be7b3caa
      Charlie Jacobsen authored and Vikram Narayanan committed
      Enter/exit code sets up/tears down the thread's tree.
      
      Fixed a few spotted bugs in allocation and tree code.
    • Sarah Spall's avatar
    • generalized-allocator: Fix inconsistent block order and size use. · 8af0a335
      Charlie Jacobsen authored and Vikram Narayanan committed
    • generalized-allocator: Clean up interface and documentation. · 8ee3bfac
      Charlie Jacobsen authored and Vikram Narayanan committed
      Few more details to sort out, bugs caught after thinking through
      documentation.
    • Charlie Jacobsen's avatar
      0ae93fbc
    • generalized-allocator: Add code for embedded metadata allocation. · 267f0dfa
      Charlie Jacobsen authored and Vikram Narayanan committed
      Adds code for allocating and initializing the page blocks that
      contain the embedded metadata.
      
      Most of the core logic is done now.
    • generalized-allocator: Fix free lists initialization. · 16b7e2c6
      Charlie Jacobsen authored and Vikram Narayanan committed
    • generalized-allocator: Free/unmap backing memory when we free a chunk. · d8f1d605
      Charlie Jacobsen authored and Vikram Narayanan committed
      When we free a chunk of page blocks of the right order, we should
      free/unmap the backing memory. This is the other part of the
      demand paging equation (and will also be used for other scenarios,
      like ioremap).
    • generalized-allocator: Add page allocation code. · cc85efe7
      Charlie Jacobsen authored and Vikram Narayanan committed
      Adapts from Linux page allocator again. Still need to modify slightly
      for internal alloc in the init code.
      
      Almost there ...
    • generalized-allocator: Add free page code. · 63293b17
      Charlie Jacobsen authored and Vikram Narayanan committed
      Borrows from Linux's buddy allocator code in mm/page_alloc.c,
      in __free_one_page.
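
      As an illustration of the buddy-merging idea that the free path
      borrows, here is a minimal sketch. It is not the code from this
      commit: the per-order boolean table, NR_ORDERS, find_buddy, and
      free_block are hypothetical names, and real free lists are replaced
      by a simple "is this block free" flag.

```c
#include <assert.h>
#include <stdbool.h>

#define NR_ORDERS 4                          /* hypothetical: orders 0..3 */
#define NR_PAGES  (1UL << (NR_ORDERS - 1))   /* 8 pages under management */

/*
 * is_free[order][idx] is true when the block of 2^order pages that
 * starts at page index idx is free at that order.
 */
bool is_free[NR_ORDERS][NR_PAGES];

/* The buddy of a block differs from it only in bit `order` of the index. */
unsigned long find_buddy(unsigned long idx, unsigned int order)
{
    return idx ^ (1UL << order);
}

/*
 * Free the block at idx, merging with its buddy for as long as the buddy
 * is free at the same order. Returns the order of the final merged block.
 */
unsigned int free_block(unsigned long idx, unsigned int order)
{
    assert(order < NR_ORDERS && idx < NR_PAGES);
    assert((idx & ((1UL << order) - 1)) == 0);   /* idx aligned to order */

    while (order < NR_ORDERS - 1) {
        unsigned long buddy = find_buddy(idx, order);

        if (!is_free[order][buddy])
            break;
        is_free[order][buddy] = false;  /* take the buddy off its list */
        idx &= buddy;                   /* merged block starts at the lower index */
        order++;
    }
    is_free[order][idx] = true;
    return order;
}
```

      Freeing page 0 and then page 1 leaves a single free order-1 block at
      index 0 rather than two separate order-0 blocks.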
    • generalized-allocator: Add allocator init. · 473d6bab
      Charlie Jacobsen authored and Vikram Narayanan committed
      Most of the initialization is ready. I just need to implement
      some of the core routines for alloc/free that initialization
      depends on.
    • generalized-allocator: Add code for metadata allocation. · b24caca1
      Charlie Jacobsen authored and Vikram Narayanan committed
      I'm no longer providing the option of embedding resource nodes in
      the metadata. The caller can just reserve a static array of the
      appropriate size (shouldn't be too big in common cases). Makes
      the metadata size calculation simpler. And the caller will most likely
      need to do tuning regardless.
    • generalized-allocator: Add page allocator metadata size calculation. · 058edf87
      Charlie Jacobsen authored and Vikram Narayanan committed
      This is needed so that the page allocator user knows how much memory
      is needed to set it up. It's also needed for the metadata embedding
      trick.
    • insert containers into dstore and do error checking · f5b2a2be
      Sarah Spall authored and Vikram Narayanan committed
    • generalized-allocator: Start LCD page allocator. · cac85870
      Charlie Jacobsen authored and Vikram Narayanan committed
      Move old LCD code into different folder. Using lcd-domains folder
      now to mirror kernel source.
    • generalized-allocator: Add resource tree implementation. · 6c0c9fc0
      Charlie Jacobsen authored and Vikram Narayanan committed
      This is a thin wrapper around Linux's interval tree.
    • generalized-allocator: Sketch out data structures and interfaces. · 4d114c0e
      Charlie Jacobsen authored and Vikram Narayanan committed
      I'm introducing two new data structures: a page allocator and
      a resource tree.
      
      page allocator motivation: We need some kind of data structure for
      tracking regions of the guest physical address space. For example,
      you may want to dedicate the portion of physical address space from
      (1 << 20) -- (16 << 20) (from the first megabyte to the sixteenth
      megabyte) for ioremap's. Someone will say: give me 16 pages of
      uncacheable addresses I can use to ioremap this device memory; the
      page allocator will find 16 free pages of guest physical address.
      
      resource tree motivation/what it is: This data structure is an
      interval tree used to resolve a guest physical address to the
      cptr/capability for the memory object that is mapped at that
      address. For example, you may need the cptr for the page that
      backs a certain guest physical address so that you can share or
      free the page (you need the cptr for the page because the microkernel
      interface only uses cptr's).
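
      A rough sketch of the address -> cptr lookup, assuming a sorted array
      of non-overlapping ranges stands in for the interval tree. The names
      resource_node and addr_to_cptr are hypothetical, chosen only for
      this illustration.

```c
#include <assert.h>
#include <stddef.h>

/* Hypothetical stand-in for a cptr: just an integer handle. */
typedef unsigned long cptr_t;

/* One mapped memory object: the address range it covers and its cptr. */
struct resource_node {
    unsigned long start;  /* first address covered */
    unsigned long last;   /* last address covered (inclusive) */
    cptr_t cptr;
};

/*
 * Resolve addr to the cptr of the memory object mapped there.
 * `nodes` must be sorted by start and must not overlap.
 * Returns 0 and fills *out on a hit, -1 if nothing is mapped at addr.
 */
int addr_to_cptr(const struct resource_node *nodes, size_t n,
                 unsigned long addr, cptr_t *out)
{
    size_t lo = 0, hi = n;

    while (lo < hi) {
        size_t mid = lo + (hi - lo) / 2;

        if (addr < nodes[mid].start) {
            hi = mid;
        } else if (addr > nodes[mid].last) {
            lo = mid + 1;
        } else {
            *out = nodes[mid].cptr;
            return 0;
        }
    }
    return -1;
}
```

      Any address inside a mapped object resolves to that object's cptr,
      and an address in an unmapped gap fails cleanly, which is the
      behavior share and free paths would rely on.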
      
      page allocator planned implementation:
      
      I plan to adapt the buddy allocator algorithm from Linux. After
      reviewing the code, I found the algorithm to be simple enough that this
      is realistic. In addition, the allocator will provide a means for
      doing "microkernel page allocs" in a more coarse-grained fashion. For
      example, the page allocator will call out to the microkernel to get
      chunks of 1 MB machine pages, and then allocate from that at page
      (4 KB) granularity. This means fewer VM exits. (Right now, every page
      alloc/free results in a VM exit; the page allocator calls out to the
      microkernel for every page alloc/free; it doesn't try to do
      coarse-grained alloc/frees and then track those bigger chunks.)
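
      To make the VM-exit saving concrete, here is a small sketch under
      stated assumptions: microkernel_alloc_chunk is a hypothetical stand-in
      for the call out to the microkernel, a counter stands in for VM exits,
      and a simple bump pointer replaces the buddy algorithm.

```c
#include <assert.h>

#define PAGE_SIZE        4096UL
#define CHUNK_SIZE       (1UL << 20)                 /* 1 MB per microkernel call */
#define PAGES_PER_CHUNK  (CHUNK_SIZE / PAGE_SIZE)    /* 256 pages */

unsigned long nr_vm_exits;                 /* simulated calls out to the microkernel */
unsigned long next_chunk_base = 1UL << 20; /* hypothetical start of the region */

/* Stand-in for the expensive call that exits to the microkernel. */
unsigned long microkernel_alloc_chunk(void)
{
    unsigned long base = next_chunk_base;

    next_chunk_base += CHUNK_SIZE;
    nr_vm_exits++;
    return base;
}

unsigned long cur_base;    /* base of the chunk currently being carved up */
unsigned long pages_left;  /* pages not yet handed out from that chunk */

/* Hand out one 4 KB page, calling out only when the chunk is used up. */
unsigned long alloc_page(void)
{
    unsigned long addr;

    if (pages_left == 0) {
        cur_base = microkernel_alloc_chunk();
        pages_left = PAGES_PER_CHUNK;
    }
    addr = cur_base + (PAGES_PER_CHUNK - pages_left) * PAGE_SIZE;
    pages_left--;
    return addr;
}
```

      With 1 MB chunks, 256 page allocations cost one simulated exit
      instead of 256. The 257th allocation triggers the second.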
      
      I also plan to allow the page allocator to "embed" its metadata in the
      address space that it is managing, to cover some heap bootstrap issues.
      (This embedding won't work for some use cases, like tracking uncacheable
      memory - we wouldn't want to embed the RAM that contains the page
      allocator metadata inside the address space region and make it
      uncacheable.)
      
      Finally, I plan to allow the page allocator to use varying granularity
      for "microkernel allocs" (if applicable) and allocs for higher levels
      (e.g., page allocator allocates 4 MB chunks from the microkernel, but allows
      higher level code inside an LCD to alloc at 4 KB granularity).
      
      The page allocator data structure (there can be multiple instances)
      will be used exclusively inside an LCD for guest physical address
      space management.
      
      resource tree planned implementation:
      
      I plan to re-use the interval tree data structure in Linux. Google
      developed a nice API. (It replaced the priority tree that was once
      used in the vma code.)
      
      The resource tree will be used in isolated and non-isolated environments
      (physical address -> cptr translation is needed in both).
      
      Some alternatives/discussion:
      
      I could use an easier bitmap first-fit algorithm for page allocation,
      but this is slow (this is what we use now). I wondered if the majority
      of page allocs will be on the control path, and that we may be able to
      tolerate this inefficiency (and all data path operations will involve just
      ring buffer operations on shared memory that is set up beforehand). But
      I suspect this won't be the case. There could be some slab allocations that
      happen on the data path for internal data; if the slab needs to shrink
      or grow, this may trigger a call into the page allocator, which could
      be slow (if it triggered a VM exit, a call on the data path could be
      bloated to 2000 cycles). Maybe this is not true and my concerns are
      unfounded.
      
      It may also seem silly to have multiple page allocator instances inside
      an LCD; why not just one allocator that manages the entire address
      space? First, some of the dynamic regions are independent of each other:
      The heap region and the ioremap region are for different purposes; having
      a single allocator for both regions might be complex and error prone.
      Second, you wouldn't want one allocator to track the entire
      address space since it's huge (the amount of allocator metadata could
      be enormous, depending on the design). My intent is to abstract over
      common needs from all regions (tracking free guest physical address space)
      and provide some hooks for specific cases.
      
      An alternative to the resource tree is to just use a giant array of
      cptr's, one slot for each page (this is what we do now for the heap
      inside an LCD). You would then translate a physical
      address to an offset into the array to get the cptr for the resource
      that contains that physical address (e.g. a page). There are a couple
      issues with this: First, the array could be huge (for a region of
      16 GB at 4 KB granularity, the array would be 32 MB). Second, even
      if this is tolerable inside an LCD, the non-isolated code needs the
      same functionality (address -> cptr translation), and setting up a
      giant array for the entire host physical address space is obviously
      dumb (and would need to be done *per thread* since each thread uses
      its own cspace). A tree handles the sparsity a lot better.
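
      The 32 MB figure above can be checked with a few lines of arithmetic.
      This sketch assumes each cptr slot is 8 bytes, which the message does
      not state; cptr_array_bytes is a hypothetical helper name.

```c
#include <assert.h>

/*
 * Size in bytes of a flat array holding one cptr slot per page.
 * cptr_bytes is an assumption supplied by the caller (8 in the
 * example from the commit message).
 */
unsigned long long cptr_array_bytes(unsigned long long region_bytes,
                                    unsigned long long page_bytes,
                                    unsigned long long cptr_bytes)
{
    assert(page_bytes > 0);
    /* number of pages in the region, times the size of one slot */
    return (region_bytes / page_bytes) * cptr_bytes;
}
```

      For a 16 GB region with 4 KB pages, that is 4194304 slots, and at an
      assumed 8 bytes per slot the array comes to 32 MB, per thread.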
      
      Finally, it is worth considering how KVM handles backing guest physical
      memory with real host/machine pages. I believe they use some
      sophisticated demand paging triggered by EPT faults, and all of this
      is hooked into the host's page cache. This seems too scary and
      complex for our humble microkernel (that we want to keep simple).
      
      I hope you enjoyed this giant commit message.
    • libcap-integration: Re-factor basic lcd create code in kliblcd. · 0ca33457
      Charlie Jacobsen authored and Vikram Narayanan committed
      Creating empty LCDs, configuring them, running them.
    • added indentation to c ast. · f51d50bc
      Sarah Spall authored and Vikram Narayanan committed
    • libcap-integration: Re-factor some easy misc kliblcd code. · 508b682d
      Charlie Jacobsen authored and Vikram Narayanan committed
      Boot info, caps and cptrs, printk, and sync ipc.
    • libcap-integration: Re-factor kliblcd page alloc and mapping. · fc7f2e63
      Charlie Jacobsen authored and Vikram Narayanan committed
      Mostly complete except for the bits that need the rb tree I'm
      planning to put in place for translating physical addresses
      to cptr's.
      
      This may seem like silly refactoring, but it's cleaning up
      and unifying a bunch of crap (including the more recent
      feature for passing strings back and forth).
    • libcap-integration: Re-factor kliblcd enter-exit code. · 36fbb83d
      Charlie Jacobsen authored and Vikram Narayanan committed
    • libcap-integration: Finish liblcd v2 headers for now. · 007e3f0c
      Charlie Jacobsen authored and Vikram Narayanan committed
        -- cap.h: delete, revoke; you may wonder: why do we need this
                  if we have libcap? Answer: an LCD needs to have a way
                  to modify *its own* cspace, rather than cspaces it
                  manages
        -- console.h: lcd_printk and friends, moved into new file with
                      few changes
        -- enter_exit.h: lcd_enter, exit, etc., moved into new file with
                         few changes
    • libcap-integration: Add some headers for liblcd v2. · 7dadddf3
      Charlie Jacobsen authored and Vikram Narayanan committed
      I wanted to do this first before re-factoring kliblcd, so I know
      what I need to do.
      
      This is a step toward unifying the old isolated and non-isolated
      interfaces. The semantics of each function will be a bit different
      depending on the execution context.
      
        -- address_spaces.h: from old types.h, with few changes
        -- boot_info.h: bootstrap page data; from old types.h; small
                        changes to struct
        -- create.h: LCD and kLCD creation; from old kliblcd.h; doc and
                     interface cleaned up
        -- mem.h: unified memory interface; coalesces functions from old
                  liblcd.h and kliblcd.h
        -- sync_ipc.h: unifies ipc and utcb headers
        -- syscall.h: same as before
      
      Removes old capability and data store crap.
      
      Also, fixes small bug for edge case in cap types.
    • libcap-integration: Simplify types.h header. · d125987a
      Charlie Jacobsen authored and Vikram Narayanan committed
      Removes cptr and capability crap. Boot info and LCD address
      spaces will be moved to separate header(s).
    • libcap-integration: Start simultaneous interface refactoring. · 09650d18
      Charlie Jacobsen authored and Vikram Narayanan committed