- Jun 30, 2009
-
-
Davide Libenzi authored
Change the eventfd interface to decouple the eventfd memory context from the file pointer instance. Without such a change, there is no clean, race-free way to handle the POLLHUP event sent when the last instance of the file* goes away. Also, the internal eventfd APIs now use the eventfd context instead of the file*. This patch is required by KVM's IRQfd code, which is still under development. Signed-off-by:
Davide Libenzi <davidel@xmailserver.org> Cc: Gregory Haskins <ghaskins@novell.com> Cc: Rusty Russell <rusty@rustcorp.com.au> Cc: Benjamin LaHaise <bcrl@kvack.org> Cc: Avi Kivity <avi@redhat.com> Signed-off-by:
Andrew Morton <akpm@linux-foundation.org> Signed-off-by:
Linus Torvalds <torvalds@linux-foundation.org>
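The decoupling described above can be pictured as a reference-counted context that outlives the file handle. The following is a minimal single-threaded userspace sketch, not the kernel code: every `demo_*` name is hypothetical, and a real implementation would need atomic reference counting.

```c
#include <assert.h>
#include <stdlib.h>

/* Hypothetical stand-in for a context that can outlive its file handle. */
struct demo_ctx {
    int refcount;        /* number of holders: file handle plus other users */
    unsigned long count; /* the event counter itself */
    int hung_up;         /* set once the file-handle holder has gone away */
};

/* Create a context holding one reference, owned by the file handle. */
static struct demo_ctx *demo_ctx_new(void)
{
    struct demo_ctx *ctx = calloc(1, sizeof(*ctx));

    if (ctx)
        ctx->refcount = 1;
    return ctx;
}

/* Take an extra reference for a user that does not hold the file. */
static struct demo_ctx *demo_ctx_get(struct demo_ctx *ctx)
{
    assert(ctx->refcount > 0);
    ctx->refcount++;
    return ctx;
}

/* Drop a reference. Returns 1 if this call freed the context, else 0. */
static int demo_ctx_put(struct demo_ctx *ctx)
{
    assert(ctx->refcount > 0);
    if (--ctx->refcount == 0) {
        free(ctx);
        return 1;
    }
    return 0;
}

/* File release: flag the hang-up for remaining holders, then drop the
 * file's own reference. The memory survives while others hold it. */
static int demo_file_release(struct demo_ctx *ctx)
{
    ctx->hung_up = 1;
    return demo_ctx_put(ctx);
}
```

Because the remaining holder keeps its own reference, it can still observe the hang-up flag after the file has been released, which is the property the commit message is after.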
-
- Jun 12, 2009
-
-
Rusty Russell authored
We no longer need an efficient mechanism to force the Guest back into host userspace, as each device is serviced without bothering the main Guest process (aka. the Launcher). Signed-off-by:
Rusty Russell <rusty@rustcorp.com.au>
-
Rusty Russell authored
Currently, when a Guest wants to perform I/O it calls LHCALL_NOTIFY with an address: the main Launcher process returns with this address, and figures out what device to run. A far nicer model is to let processes bind an eventfd to an address: if we find one, we simply signal the eventfd. Signed-off-by:
Rusty Russell <rusty@rustcorp.com.au> Cc: Davide Libenzi <davidel@xmailserver.org>
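The "bind an eventfd to an address" model can be sketched in userspace with the Linux `eventfd(2)` call. This is only an illustration under stated assumptions: the `demo_*` names, the fixed-size table, and the return-value convention are hypothetical and are not lguest's interface.

```c
#include <assert.h>
#include <stdint.h>
#include <sys/eventfd.h>
#include <unistd.h>

#define DEMO_MAX_BINDINGS 8

/* One (address, eventfd) pair; hypothetical layout. */
struct demo_binding {
    unsigned long addr;
    int fd;
};

struct demo_map {
    struct demo_binding b[DEMO_MAX_BINDINGS];
    unsigned int num;
};

/* Bind an eventfd to a notify address. Returns 0, or -1 if the table is full. */
static int demo_bind(struct demo_map *map, unsigned long addr, int fd)
{
    if (map->num >= DEMO_MAX_BINDINGS)
        return -1;
    map->b[map->num].addr = addr;
    map->b[map->num].fd = fd;
    map->num++;
    return 0;
}

/* Handle a notify on addr. Returns 1 if a bound eventfd was signalled,
 * 0 if nothing is bound (the caller would fall back to the old path),
 * -1 if the write failed. */
static int demo_notify(const struct demo_map *map, unsigned long addr)
{
    uint64_t one = 1;
    unsigned int i;

    for (i = 0; i < map->num; i++) {
        if (map->b[i].addr != addr)
            continue;
        /* Writing an 8-byte value adds it to the eventfd counter. */
        if (write(map->b[i].fd, &one, sizeof(one)) != (ssize_t)sizeof(one))
            return -1;
        return 1;
    }
    return 0;
}
```

A device thread would then block in `read()` or `poll()` on its own eventfd instead of being dispatched by the main process.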
-
Rusty Russell authored
We currently only allow the Launcher process to send interrupts, but as we already send interrupts from the hrtimer, it's a simple matter of extracting that code into a common set_interrupt routine. As we switch to a thread per virtqueue, this avoids a bottleneck through the main Launcher process. Signed-off-by:
Rusty Russell <rusty@rustcorp.com.au>
-
Rusty Russell authored
1) j wasn't initialized in setup_pagetables, so they weren't set up for me, causing immediate guest crashes. 2) gpte_addr should not re-read the pmd from the Guest, and especially should not BUG_ON() based on the value. If we ever supported SMP guests, they could trigger that. And the Launcher could also trigger it (though currently root-only). Signed-off-by:
Rusty Russell <rusty@rustcorp.com.au>
-
Matias Zabaljauregui authored
This version requires that host and guest have the same PAE status. NX cap is not offered to the guest, yet. Signed-off-by:
Matias Zabaljauregui <zabaljauregui@gmail.com> Signed-off-by:
Rusty Russell <rusty@rustcorp.com.au>
-
Matias Zabaljauregui authored
Replace the LHCALL_SET_PMD hypercall name with LHCALL_SET_PGD. (That's really what it is, and the confusion gets worse with PAE support.) Signed-off-by:
Matias Zabaljauregui <zabaljauregui@gmail.com> Signed-off-by:
Rusty Russell <rusty@rustcorp.com.au> Reported-by:
Jeremy Fitzhardinge <jeremy@goop.org>
-
Matias Zabaljauregui authored
Some cleanups, and replace direct assignment with native_set_* macros, which properly handle 64-bit entries when PAE is activated. Signed-off-by:
Matias Zabaljauregui <zabaljauregui@gmail.com> Signed-off-by:
Rusty Russell <rusty@rustcorp.com.au>
-
Matias Zabaljauregui authored
Map switcher with executable page table entries. (This bug didn't matter before PAE and hence NX support -- RR) Signed-off-by:
Matias Zabaljauregui <zabaljauregui@gmail.com> Signed-off-by:
Rusty Russell <rusty@rustcorp.com.au>
-
Matias Zabaljauregui authored
If GDT_ENTRIES were ever > 256, this could become a problem. Signed-off-by: Matias Zabaljauregui <zabaljauregui at gmail.com> Signed-off-by:
Rusty Russell <rusty@rustcorp.com.au>
-
Roel Kluin authored
Do not go beyond ARRAY_SIZE of cpu->arch.gdt Signed-off-by:
Roel Kluin <roel.kluin@gmail.com> Signed-off-by:
Rusty Russell <rusty@rustcorp.com.au>
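A bounds check against the array size is the usual way to stay inside a fixed table. A common form of this bug is an off-by-one comparison (`>` where `>=` is needed), though the message above does not say which form the fix took. The sketch below uses hypothetical `demo_*` names and an arbitrary table size.

```c
#include <assert.h>
#include <stddef.h>

#define DEMO_ARRAY_SIZE(a) (sizeof(a) / sizeof((a)[0]))
#define DEMO_GDT_ENTRIES 32 /* arbitrary size for illustration */

struct demo_desc {
    unsigned int a, b;
};

struct demo_cpu {
    struct demo_desc gdt[DEMO_GDT_ENTRIES];
};

/* Store one entry. Returns 0 on success, -1 if num is out of range.
 * Valid indices are 0 .. size-1, so the check must be >=, not >. */
static int demo_load_gdt_entry(struct demo_cpu *cpu, unsigned int num,
                               unsigned int lo, unsigned int hi)
{
    if (num >= DEMO_ARRAY_SIZE(cpu->gdt))
        return -1;
    cpu->gdt[num].a = lo;
    cpu->gdt[num].b = hi;
    return 0;
}
```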
-
Rusty Russell authored
lguest never checked for pending interrupts when enabling interrupts, and things still worked. However, it makes a significant difference to TCP performance, so it's time we fixed it by introducing a pending_irq flag and checking it on irq_restore and irq_enable. These two routines are now too big to patch into the 8/10 bytes patch space, so we drop that code.

Note: The high latency on interrupt delivery had a very curious effect: once everything else was optimized, networking without GSO was faster than networking with GSO, since more interrupts were sent and hence a greater chance of one getting through to the Guest!

Note2: (Almost) Closing the same loophole for iret doesn't have any measurable effect, so I'm leaving that patch for the moment.

Before:
1GB tcpblast Guest->Host: 30.7 seconds
1GB tcpblast Guest->Host (no GSO): 76.0 seconds

After:
1GB tcpblast Guest->Host: 6.8 seconds
1GB tcpblast Guest->Host (no GSO): 27.8 seconds

Signed-off-by:
Rusty Russell <rusty@rustcorp.com.au>
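The pending_irq idea can be modelled in a few lines: the host leaves a note when it cannot deliver, and the guest checks that note whenever it re-enables interrupts. This is a simplified single-threaded model, not the lguest code; all `demo_*` names and fields are hypothetical.

```c
#include <assert.h>

struct demo_guest {
    int irq_enabled; /* the guest's virtual interrupt flag */
    int pending_irq; /* set by the host when delivery had to be deferred */
    int delivered;   /* counts interrupts actually taken */
};

/* Host side: deliver at once if possible, otherwise leave a note. */
static void demo_host_raise(struct demo_guest *g)
{
    if (g->irq_enabled)
        g->delivered++;
    else
        g->pending_irq = 1;
}

/* Stand-in for trapping to the host so it can inject the interrupt. */
static void demo_take_pending(struct demo_guest *g)
{
    g->pending_irq = 0;
    g->delivered++;
}

static void demo_irq_disable(struct demo_guest *g)
{
    g->irq_enabled = 0;
}

/* Enabling interrupts now also checks for deferred work. */
static void demo_irq_enable(struct demo_guest *g)
{
    g->irq_enabled = 1;
    if (g->pending_irq)
        demo_take_pending(g);
}

/* Restoring flags only checks when the restored state is "enabled". */
static void demo_irq_restore(struct demo_guest *g, int flags)
{
    g->irq_enabled = flags;
    if (flags && g->pending_irq)
        demo_take_pending(g);
}
```

Without the two checks, a deferred interrupt would wait until the next unrelated guest exit, which matches the latency effect the message describes.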
-
Rusty Russell authored
When the Guest does the LHCALL_HALT hypercall, we go to sleep, expecting that a timer or the Waker will wake_up_process() us. But we do it in a stupid way, leaving a classic missing wakeup race. So split maybe_do_interrupt() into interrupt_pending() and try_deliver_interrupt(), and check interrupt_pending() and the "break_out" flag before calling schedule. Signed-off-by:
Rusty Russell <rusty@rustcorp.com.au>
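The usual cure for a missing-wakeup race is an ordering rule: mark yourself as sleeping first, then re-check the wake conditions, and only then sleep. The sketch below models just that ordering in single-threaded C. The `demo_*` names are hypothetical and there is no real scheduler here.

```c
#include <assert.h>

enum demo_state { DEMO_RUNNING, DEMO_INTERRUPTIBLE };

struct demo_cpu {
    enum demo_state state;
    int irq_pending; /* an interrupt is waiting for the guest */
    int break_out;   /* someone asked us to return to userspace */
};

/* Returns 1 if the caller may go on to sleep, 0 if it must keep running. */
static int demo_prepare_halt(struct demo_cpu *cpu)
{
    /* Mark ourselves sleeping first: a waker that runs after this point
     * flips the state back, so its wakeup cannot be lost. */
    cpu->state = DEMO_INTERRUPTIBLE;

    /* Only now check the wake conditions. Anything that arrived before
     * the state change is caught here. */
    if (cpu->irq_pending || cpu->break_out) {
        cpu->state = DEMO_RUNNING;
        return 0;
    }
    return 1;
}
```

Checking the conditions before changing state would leave a window in which a wakeup arrives, finds the task still running, and is lost.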
-
Rusty Russell authored
The Launcher could be inside the Guest on another CPU; wake_up_process will do nothing because it is "running". kick_process will knock it back into our kernel in this case, otherwise we'll miss it until the next guest exit. Signed-off-by:
Rusty Russell <rusty@rustcorp.com.au>
-
Michael S. Tsirkin authored
This replaces find_vq/del_vq with find_vqs/del_vqs virtio operations, and updates all drivers. This is needed for MSI support, because MSI needs to know the total number of vectors upfront. Signed-off-by:
Michael S. Tsirkin <mst@redhat.com> Signed-off-by: Rusty Russell <rusty@rustcorp.com.au> (+ lguest/9p compile fixes)
-
Rusty Russell authored
Add a linked list of all virtqueues for a virtio device: this helps for debugging and is also needed for an upcoming interface change. Also, add a "name" field for clearer debug messages. Signed-off-by:
Rusty Russell <rusty@rustcorp.com.au>
-
- May 26, 2009
-
-
Rusty Russell authored
When KVM is loaded, and hence VT set up, the vmcall instruction in an lguest guest causes a #GP, not #UD. Signed-off-by:
Rusty Russell <rusty@rustcorp.com.au> Signed-off-by:
Linus Torvalds <torvalds@linux-foundation.org>
-
- Apr 19, 2009
-
-
Rusty Russell authored
Fixes guest crash 'lguest: bad read address 0x4800000 len 256'.
The new per-cpu allocator ends up handing a non-linear address to write_gdt_entry. We do __pa() on it, and hand it to the host, which kills us. I've long wanted to make the hypercall "LOAD_GDT_ENTRY" to match the IDT code, but had no pressing reason until now. Signed-off-by:
Rusty Russell <rusty@rustcorp.com.au> Cc: lguest@ozlabs.org
-
Matias Zabaljauregui authored
Typical message: 'lguest: unhandled trap 6 at 0x418726 (0x0)'
vmlinux guests were broken by 4cd8b5e2 'lguest: use KVM hypercalls', which rewrites guest text from KVM hypercalls to trap 31. The Launcher mmaps the kernel image. The Guest executes and immediately faults in the first text page (read-only). Then it hits a hypercall, and we rewrite that hypercall, causing a copy-on-write. But the Guest pagetables still refer to the old page: we fault again, but as Host we see the hypercall already rewritten, and pass the fault back to the Guest. The Guest hasn't set up an IDT yet, so we kill it. This doesn't happen with bzImages: they unpack themselves and so the text pages are already read-write. Signed-off-by:
Rusty Russell <rusty@rustcorp.com.au> Tested-by:
Patrick McHardy <kaber@trash.net>
-
- Mar 30, 2009
-
-
Matias Zabaljauregui authored
Impact: clean up
Rusty told me, some time ago, that he had become a fan of "bool". So, here are some replacements. Signed-off-by: Matias Zabaljauregui <zabaljauregui at gmail.com> Signed-off-by:
Rusty Russell <rusty@rustcorp.com.au>
-
Matias Zabaljauregui authored
Impact: cleanup
This patch allows us to use KVM hypercalls. Signed-off-by: Matias Zabaljauregui <zabaljauregui at gmail.com> Signed-off-by:
Rusty Russell <rusty@rustcorp.com.au>
-
Rusty Russell authored
Impact: fix crash on misbehaving guest
gpte_addr() contains a BUG_ON(), insisting that the present flag is set. We need to return before we call it if that isn't the case. Signed-off-by:
Rusty Russell <rusty@rustcorp.com.au> Cc: stable@kernel.org
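The fix pattern is to test the present bit in the caller, so the asserting helper is never reached with a bad entry. This is a minimal sketch assuming 32-bit non-PAE paging (4-byte entries, 10-bit table index); the `demo_*` names are hypothetical and `assert()` stands in for `BUG_ON()`.

```c
#include <assert.h>
#include <stdint.h>

#define DEMO_PAGE_PRESENT 0x1u
#define DEMO_PAGE_MASK    0xFFFFF000u

/* Address of the page-table entry for vaddr, given the top-level entry.
 * Insists on the present flag, much like a helper guarded by BUG_ON(). */
static uint32_t demo_pte_addr(uint32_t pgd_entry, uint32_t vaddr)
{
    assert(pgd_entry & DEMO_PAGE_PRESENT);
    return (pgd_entry & DEMO_PAGE_MASK)
           + ((vaddr >> 12) & 0x3FF) * sizeof(uint32_t);
}

/* Caller-side guard: bail out before the helper can trip its assertion.
 * Returns 0 and stores the address, or -1 if the entry is not present. */
static int demo_lookup(uint32_t pgd_entry, uint32_t vaddr, uint32_t *out)
{
    if (!(pgd_entry & DEMO_PAGE_PRESENT))
        return -1;
    *out = demo_pte_addr(pgd_entry, vaddr);
    return 0;
}
```

The point is that a value controlled by the guest must produce an error return, never a host-side assertion failure.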
-
- Mar 08, 2009
-
-
Rusty Russell authored
Impact: remove lots of lguest boot WARN_ON() when CONFIG_SPARSE_IRQ=y
We now need to call irq_to_desc_alloc_cpu() before set_irq_chip_and_handler_name(), but we can't do that from init_IRQ (no kmalloc available). So do it as we use interrupts instead. Also means we only alloc for irqs we use, which was the intent of CONFIG_SPARSE_IRQ anyway. Signed-off-by:
Rusty Russell <rusty@rustcorp.com.au> Cc: Ingo Molnar <mingo@redhat.com>
-
- Feb 22, 2009
-
-
Ingo Molnar authored
Impact: remove unused/broken code

The Voyager subarch last built successfully on the v2.6.26 kernel and has been stale since then and does not build on the v2.6.27, v2.6.28 and v2.6.29-rc5 kernels. No actual users beyond the maintainer reported this breakage. Patches were sent and most of the fixes were accepted but the discussion around how to do a few remaining issues cleanly fizzled out with no resolution and the code remained broken.

In the v2.6.30 x86 tree development cycle 32-bit subarch support has been reworked and removed - and the Voyager code, beyond the build problems already known, needs serious and significant changes and probably a rewrite to support it. CONFIG_X86_VOYAGER has been marked BROKEN then. The maintainer has been notified but no patches have been sent so far to fix it.

While all other subarchs have been converted to the new scheme, voyager is still broken. We'd prefer to receive patches which clean up the current situation in a constructive way, but even in case of removal there is no obstacle to add that support back after the issues have been sorted out in a mutually acceptable fashion. So remove this inactive code for now. Signed-off-by:
Ingo Molnar <mingo@elte.hu>
-
- Jan 29, 2009
-
-
Mark Wallis authored
Fix a memory leak identified by Rusty Russell during LCA09 by kfree'ing the lg object instead of just clearing it when the launcher closes. Signed-off-by:
Mark Wallis <mwallis@serialmonkey.com> Signed-off-by:
Rusty Russell <rusty@rustcorp.com.au>
-
Atsushi SAKAI authored
3 points:
lguest_asm.S => i386_head.S
LHCALL_BREAK => LHREQ_BREAK
perferred => preferred
Signed-off-by:
Atsushi SAKAI <sakaia@jp.fujitsu.com> Signed-off-by:
Rusty Russell <rusty@rustcorp.com.au>
-
- Jan 06, 2009
-
-
Mark McLoughlin authored
We shouldn't be statically allocating the root device object, so dynamically allocate it using root_device_register() instead. Signed-off-by:
Mark McLoughlin <markmc@redhat.com> Acked-by:
Rusty Russell <rusty@rustcorp.com.au> Signed-off-by:
Greg Kroah-Hartman <gregkh@suse.de>
-
- Dec 29, 2008
-
-
Mark McLoughlin authored
bus_id is gradually being removed, so use dev_name() instead. Signed-off-by:
Mark McLoughlin <markmc@redhat.com> Cc: Kay Sievers <kay.sievers@vrfy.org> Cc: Greg Kroah-Hartman <gregkh@suse.de> Signed-off-by:
Rusty Russell <rusty@rustcorp.com.au>
-
Matias Zabaljauregui authored
This patch moves the initial guest page table creation code to the host, so the launcher keeps working with PAE enabled configs. Signed-off-by:
Matias Zabaljauregui <zabaljauregui@gmail.com> Signed-off-by:
Rusty Russell <rusty@rustcorp.com.au>
-
Rusty Russell authored
This allows each virtio user to hand in the alignment appropriate to their virtio_ring structures. Signed-off-by:
Rusty Russell <rusty@rustcorp.com.au> Acked-by:
Christian Borntraeger <borntraeger@de.ibm.com>
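How a caller-supplied alignment affects ring layout can be shown with a small size calculation: the first part of the ring is padded up to the alignment before the second part begins. The sketch follows the general shape of a virtio ring (descriptor table plus available ring, padding, then used ring), but the element sizes and the `demo_*` names are assumptions for illustration, not taken from this commit.

```c
#include <assert.h>
#include <stddef.h>

/* Illustrative element sizes; assumptions, not taken from the commit. */
#define DEMO_DESC_SIZE      16u
#define DEMO_USED_ELEM_SIZE 8u
#define DEMO_U16_SIZE       2u

/* Round n up to a multiple of align, which must be a power of two. */
static size_t demo_align_up(size_t n, size_t align)
{
    assert(align != 0 && (align & (align - 1)) == 0);
    return (n + align - 1) & ~(align - 1);
}

/* Total bytes for a ring of num entries with the given alignment. */
static size_t demo_ring_size(unsigned int num, size_t align)
{
    /* Descriptor table, then a ring with two 16-bit header fields. */
    size_t first = (size_t)DEMO_DESC_SIZE * num
                   + (size_t)DEMO_U16_SIZE * (2 + num);
    /* Second ring: two 16-bit header fields plus one element per entry. */
    size_t second = (size_t)DEMO_U16_SIZE * 2
                    + (size_t)DEMO_USED_ELEM_SIZE * num;

    return demo_align_up(first, align) + second;
}
```

With these sizes, an 8-entry ring needs 4164 bytes at page alignment but only 260 bytes at 64-byte alignment, which is why letting each user pick its own alignment can matter.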
-
Rusty Russell authored
This doesn't really matter, since lguest is i386 only at the moment, but we could actually choose a different value. (lguest doesn't have a guaranteed ABI). Signed-off-by:
Rusty Russell <rusty@rustcorp.com.au>
-
- Dec 23, 2008
-
-
Yinghai Lu authored
Impact: fix lguest, clean up
32-bit lguest used used_vectors to record vectors, but that model of allocating vectors changed and got broken after we changed vector allocation to a per_cpu array. Try to enable that for 64-bit, and the array is used for all vectors that are not managed by the vector_irq per_cpu array. Also kill system_vectors[], which is now a duplicate of the used_vectors bitmap. [ merged in cpus4096 due to io_apic.c cpumask changes. ] [ -v2, fix build failure ] Signed-off-by:
Yinghai Lu <yinghai@kernel.org> Signed-off-by:
Ingo Molnar <mingo@elte.hu> Signed-off-by:
Ingo Molnar <mingo@elte.hu>
-
- Aug 25, 2008
-
-
Rusty Russell authored
Signed-off-by:
Rusty Russell <rusty@rustcorp.com.au>
-
- Aug 12, 2008
-
-
Rusty Russell authored
Using a simple page table thrashing program I measure a slight improvement. The program creates five processes. Each touches 1000 pages then schedules the next process. We repeat this 1000 times. As lguest only caches 4 cr3 values, this rebuilds a lot of shadow page tables requiring virt->phys mappings. Before: 5.93 seconds After: 5.40 seconds (Counts of slow vs fastpath in this usage are 6092 and 2852462 respectively.) And more importantly for lguest, the code is simpler. Signed-off-by:
Rusty Russell <rusty@rustcorp.com.au>
-
- Jul 28, 2008
-
-
Andrew Morton authored
To support my little make-x86-bitops-use-proper-typechecking projectlet. Cc: Thomas Gleixner <tglx@linutronix.de> Cc: Andrea Arcangeli <andrea@qumranet.com> Signed-off-by:
Andrew Morton <akpm@linux-foundation.org> Acked-by:
Ingo Molnar <mingo@elte.hu> Signed-off-by:
Rusty Russell <rusty@rustcorp.com.au>
-
Johannes Weiner authored
map_switcher allocates the array, unmap_switcher has to free it accordingly. Signed-off-by:
Johannes Weiner <hannes@saeurebad.de> Signed-off-by:
Rusty Russell <rusty@rustcorp.com.au>
-
Rusty Russell authored
Ron Minnich noticed that guest userspace gets a GPF when it tries to int3: we need to copy the privilege level from the guest-supplied IDT to the real IDT. int3 is the only common case where guest userspace expects to invoke an interrupt, so that's the symptom of failing to do this. Signed-off-by:
Rusty Russell <rusty@rustcorp.com.au>
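On x86, the descriptor privilege level (DPL) of an IDT gate occupies bits 13-14 of the descriptor's high 32-bit word, and it decides whether userspace may invoke the vector with `int`. Copying it across can be sketched as a mask-and-merge. The `demo_*` names and the sample flag values are illustrative and are not the lguest code.

```c
#include <assert.h>
#include <stdint.h>

#define DEMO_GATE_DPL_MASK 0x6000u /* bits 13-14 of the high word */

/* Extract the privilege level (0-3) from a gate's high word. */
static unsigned int demo_gate_dpl(uint32_t hi)
{
    return (hi >> 13) & 3u;
}

/* Build the real gate's high word: keep the host's choices for every
 * other bit, but take the privilege level from the guest's entry. */
static uint32_t demo_merge_dpl(uint32_t host_hi, uint32_t guest_hi)
{
    return (host_hi & ~DEMO_GATE_DPL_MASK) | (guest_hi & DEMO_GATE_DPL_MASK);
}
```

A gate left at DPL 0 makes `int3` from guest userspace raise a general protection fault, which is the symptom reported above.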
-
- Jul 24, 2008
-
-
Rusty Russell authored
To prepare for virtio_ring transport feature bits, hook in a call in all the users to manipulate them. This currently just clears all the bits, since it doesn't understand any features. Signed-off-by:
Rusty Russell <rusty@rustcorp.com.au>
-
Rusty Russell authored
Rather than explicitly handing the features to the lower-level, we just hand the virtio_device and have it set the features. This makes it clear that it has the chance to manipulate the features of the device at this point (and that all feature negotiation is already done). Signed-off-by:
Rusty Russell <rusty@rustcorp.com.au>
-
- Jul 10, 2008
-
-
Ingo Molnar authored
Remove leftover traces of various VISWS-related Kconfig specials. Signed-off-by:
Ingo Molnar <mingo@elte.hu>
-