David Johnson authored
This means that the personality can emulate breakpoints and single steps in userspace threads on behalf of the generic OS Process overlay driver. Thus, the OS Process driver can now be stacked atop the KVM/QEMU driver with real breakpoints. Previously, it could be stacked atop KVM/QEMU, but without userspace breakpoints, so it wasn't especially useful for active debugging.

Until now, the OS Process driver only had full breakpoint support atop the Xen driver, because the Xen driver supported hacked hypervisors that would divert userspace debug exceptions to the driver (as well as kernel debug exceptions). Hacking QEMU's GDB stub to do this was undesirable and infeasible.

The QEMU/KVM support was built through the GDB driver, because QEMU provides a GDB server stub, but supporting userspace breakpoints in an overlay driver atop our GDB driver was infeasible due to the "shared page" breakpoint issue. The shared page issue occurs when we place a breakpoint at a userspace function's virtual address, and that same memory page is later mmap'd into another process. Our overlay driver might be attached only to the first process and not the second, but because the page is shared, the second process can hit the breakpoint too, and we have to emulate breakpoint handling for it.

We did this in a complicated way for Xen: we hacked the hypervisor to siphon debug exceptions for kernel and userspace, and then the OS Process overlay driver would just insert its breakpoints at *physical* addresses. This allowed the underlying Xen driver to recognize that a legitimate exception had occurred, and that it had to "handle" the exception even though no overlay driver was attached. Since we could not do this for the QEMU/KVM GDB server stub (and because we're tired of having people install our hacked Xen), we now support the next obvious thing!
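To make the shared page case concrete, here is a toy sketch in C. It is not Stackdb code: every name in it (toy_mapping, toy_phys_bps, toy_v2p, and so on) is invented for illustration, and the "page table" is a single hypothetical virtual-page-to-frame entry per process. It only shows why keying breakpoints by *physical* address lets the base driver recognize a hit from a process that has no overlay attached.

```c
#include <assert.h>
#include <stdbool.h>
#include <stddef.h>
#include <stdint.h>

#define TOY_PAGE_SHIFT 12
#define TOY_PAGE_MASK ((UINT64_C(1) << TOY_PAGE_SHIFT) - 1)
#define TOY_MAX_BPS 8

/* Hypothetical one-entry "page table": one virtual page -> one physical frame. */
struct toy_mapping {
    uint64_t vpage;   /* virtual page number */
    uint64_t pframe;  /* physical frame number */
};

/* Hypothetical breakpoint table keyed by physical address. */
struct toy_phys_bps {
    uint64_t paddr[TOY_MAX_BPS];
    size_t count;
};

/* Translate vaddr through the toy mapping; returns false if it is unmapped. */
static bool toy_v2p(const struct toy_mapping *m, uint64_t vaddr, uint64_t *paddr)
{
    assert(m != NULL && paddr != NULL);
    if ((vaddr >> TOY_PAGE_SHIFT) != m->vpage)
        return false;
    *paddr = (m->pframe << TOY_PAGE_SHIFT) | (vaddr & TOY_PAGE_MASK);
    return true;
}

/* Record a breakpoint at a physical address; returns false if the table is full. */
static bool toy_bp_insert(struct toy_phys_bps *t, uint64_t paddr)
{
    assert(t != NULL);
    if (t->count >= TOY_MAX_BPS)
        return false;
    t->paddr[t->count++] = paddr;
    return true;
}

/* Is there a breakpoint at this physical address, whoever faulted on it? */
static bool toy_bp_lookup(const struct toy_phys_bps *t, uint64_t paddr)
{
    assert(t != NULL);
    for (size_t i = 0; i < t->count; i++)
        if (t->paddr[i] == paddr)
            return true;
    return false;
}
```

If two processes map the same frame at different virtual pages, a breakpoint inserted through the first process's address is found when the second process's faulting address is translated and looked up, which is the situation the base driver has to handle.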
The Linux OS personality now places probes atop the kernel's do_int3 and do_debug interrupt handlers if the underlying hypervisor cannot process those userspace-generated interrupts (as our hacked Xen can and does). When the probes are hit, the personality checks whether the overlay OS Process driver had placed a physical page breakpoint at the faulting address, and whether an overlay target is attached to the faulting thread. If so, it notifies the overlay to handle the exception (and similarly for single stepping).

If the OS Process driver handles the exception, the personality *aborts* the interrupt handler with a return value of 0, to emulate success. And the OS Process driver *has* handled the exception: it resets the saved RIP on the interrupt stack, removes the breakpoint instruction, and sets EF_TF in RFLAGS on the interrupt stack. Thus, when the IRET happens in the kernel to return to userspace, the process single steps the instruction as it should. Then the personality has an immediate debug trap to handle in the kernel's do_debug function, and it is handled in the same manner to emulate the single step.

The personality calls the (newly renamed) target_os_emulate_(bp|ss)_handler functions to handle the shared page breakpoint cases just like the Xen driver handled them; it's just slightly different because the personality now sets up the emulated single step. The base driver (in this case, the QEMU/KVM/GDB base driver) can effectively only step at the kernel level.

However, we still support the "old way", where the Xen driver supports hacked hypervisors that siphon userspace debug exceptions. This code should also support unmodified Xens, *if* the user passes the --hypervisor-ignores-userspace-exceptions argument (or sets it in the Xen config struct), but I haven't tried it.

Other notes:

* I had to add explicit support for x86_64 interrupt stacks. These are interesting; there are 5 per-cpu stacks for different interrupts.
  So what I do now is check whether the RSP is within the per-task kernel stack pages (2 pages); if it is, we assume the stack is 2 pages. If not, we assume we're on an alternate 1-page stack. This can break for debug exceptions, because debug exception stacks are 2 pages! But since we are overloading the do_int3 and do_debug handlers, *our drivers* won't break it, because the stack won't have flowed onto the second page. The second page is probably mostly for handling nested debug exceptions, or in-kernel KGDB, that kind of thing.

* I also hadn't supported writing back userspace regs for the current thread, if the thread was a userspace thread in the kernel. That's now there.

* There are some memmod tweaks, to allow memmods that don't actually *write* their changes. In other words, they exist to track modifications made in some other way, like through a GDB server stub writing a breakpoint! You see, Stackdb needs to track those GDB breakpoint states in order to do things like insert arbitrary code at a breakpoint. Stackdb, and particularly this commit, allows functions to be "aborted", meaning that instead of executing the real instruction at a breakpoint, we insert a RET instruction, fix the RSP to garbage-collect the frame, and then single step. This commit relies on that support to hook the interrupt handlers, but then "obviate" them, since we've handled the exception. There is new code in the probe stuff too to handle this case, and it's all because it seemed to me in early debugging that the QEMU GDB server stub got kinda ticked off when I tried to overwrite its breakpoints temporarily. This did seem weird to me; maybe it was a side effect of a separate bug. Anyway, now this situation is actually modeled correctly: the GDB driver yanks out the GDB breakpoint before inserting the return code, and then puts it back afterwards. And of course it's appropriately abstracted.
* The OS Process driver now indirects its single steps through the underlying OS personality, via target_os_thread_singlestep. This is the magic that allows us to support both 1) hypervisors that siphon off debug exceptions, like our hacked Xen; and 2) hypervisors/stubs that do not siphon them off, like QEMU's GDB stub. In situation 1, we single step the base driver directly; in situation 2, we set up the userspace thread state to execute a single step, by modifying RFLAGS to set the EF_TF flag.
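For readers who want the emulated breakpoint and single step sequence spelled out, here is a toy sketch in C. It is not the Stackdb implementation: the toy_* types and functions are invented for illustration, the "interrupt stack" is reduced to a saved RIP and RFLAGS, and the text page is a plain byte buffer. EF_TF is the x86 trap flag (bit 8 of RFLAGS), and int3 is the one-byte 0xCC opcode, so the saved RIP after the trap points one byte past the breakpoint. In the real code, a successful emulation is followed by aborting do_int3 or do_debug with a return value of 0.

```c
#include <assert.h>
#include <stdbool.h>
#include <stddef.h>
#include <stdint.h>

#define EF_TF UINT64_C(0x100)  /* x86 trap flag: bit 8 of RFLAGS */
#define INT3_OPCODE 0xCC       /* one-byte breakpoint instruction */

/* Hypothetical view of the userspace state saved on the interrupt stack. */
struct toy_frame {
    uint64_t rip;
    uint64_t rflags;
};

/* Hypothetical breakpoint record: address, displaced byte, step state. */
struct toy_bp {
    uint64_t addr;
    uint8_t orig_byte;
    bool stepping;
};

/* Hypothetical text page backing the breakpoint. */
struct toy_text {
    uint64_t base;
    uint8_t *bytes;
    size_t len;
};

/* Emulate the int3 hit: put the original byte back, rewind the saved RIP,
 * and set EF_TF so the IRET back to userspace steps one instruction. */
static bool toy_emulate_bp(struct toy_frame *f, struct toy_bp *bp, struct toy_text *t)
{
    assert(f != NULL && bp != NULL && t != NULL);
    if (f->rip != bp->addr + 1)
        return false;  /* not our breakpoint; let the kernel handle it */
    if (bp->addr < t->base || bp->addr - t->base >= t->len)
        return false;
    t->bytes[bp->addr - t->base] = bp->orig_byte;
    f->rip = bp->addr;
    f->rflags |= EF_TF;
    bp->stepping = true;
    return true;
}

/* Emulate the follow-up debug trap: re-arm the breakpoint and clear EF_TF. */
static bool toy_emulate_ss(struct toy_frame *f, struct toy_bp *bp, struct toy_text *t)
{
    assert(f != NULL && bp != NULL && t != NULL);
    if (!bp->stepping)
        return false;
    t->bytes[bp->addr - t->base] = INT3_OPCODE;
    f->rflags &= ~EF_TF;
    bp->stepping = false;
    return true;
}

enum toy_ss_path { TOY_SS_BASE_DRIVER, TOY_SS_EMULATED };

/* Choose the single step path: step the base driver directly when the
 * hypervisor diverts userspace exceptions, else set EF_TF in the saved state. */
static enum toy_ss_path toy_thread_singlestep(bool hv_diverts_user_exceptions,
                                              struct toy_frame *f)
{
    assert(f != NULL);
    if (hv_diverts_user_exceptions)
        return TOY_SS_BASE_DRIVER;
    f->rflags |= EF_TF;
    return TOY_SS_EMULATED;
}
```

The last function mirrors the two situations in the final note above; how the real target_os_thread_singlestep decides between them is not shown here.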
71753d5e