1. 21 Nov, 2014 1 commit
  2. 20 Nov, 2014 1 commit
  3. 18 Nov, 2014 1 commit
  4. 17 Nov, 2014 5 commits
  5. 14 Nov, 2014 11 commits
    • xen_disk: fix unmapping of persistent grants · 2f01dfac
      Roger Pau Monne authored
      This patch fixes two issues with persistent grants and the disk PV backend
      (Qdisk):
      
       - Keep track of memory regions where persistent grants have been mapped
         since we need to unmap them as a whole. It is not possible to unmap a
         single grant if it has been batch-mapped. A new check has also been added
         to make sure persistent grants are only used if the whole mapped region
         can be persistently mapped in the batch_maps case.
       - Unmap persistent grants before switching to the closed state, so the
         frontend can also free them.
      Signed-off-by: Roger Pau Monné <roger.pau@citrix.com>
      Reported-by: George Dunlap <george.dunlap@eu.citrix.com>
      Cc: Stefano Stabellini <stefano.stabellini@eu.citrix.com>
      Cc: Kevin Wolf <kwolf@redhat.com>
      Cc: Stefan Hajnoczi <stefanha@redhat.com>
      Cc: George Dunlap <george.dunlap@eu.citrix.com>
      Cc: Konrad Rzeszutek Wilk <konrad.wilk@oracle.com>
    • pc: piix4_pm: init legacy PCI hotplug when running on Xen · 91ab2ed7
      Igor Mammedov authored
      If the user starts QEMU with "-machine pc,accel=xen", the
      compat property in xenfv won't work, which causes the error:
      "Unsupported bus. Bus doesn't have property 'acpi-pcihp-bsel' set"
      when a PCI device is added with -device on the QEMU CLI.
      
      From: Igor Mammedov <imammedo@redhat.com>
      
      In case of Xen instead of using compat property, just use the fact
      that xen doesn't use QEMU's fw_cfg/acpi tables to switch piix4_pm
      into legacy PCI hotplug mode when Xen is enabled.
      Signed-off-by: Igor Mammedov <imammedo@redhat.com>
      Signed-off-by: Li Liang <liang.z.li@intel.com>
      Signed-off-by: Stefano Stabellini <stefano.stabellini@eu.citrix.com>
      Acked-by: Paolo Bonzini <pbonzini@redhat.com>
    • ahci: factor out FIS decomposition from handle_cmd · 107f0d46
      John Snow authored
      In order to make handle_cmd more readable at the macro level,
      the details of how to decompose particular types of FIS packets
      are left to helper functions.
      
      In our case, the only type of FIS packet we currently expect to
      see is a Register H2D FIS packet, but the gory details of its
      decomposition are of no particular interest in handle_cmd.
      
      This patch keeps the receipt of FIS packets and the decomposition
      thereof separated into two different functions.
      Signed-off-by: John Snow <jsnow@redhat.com>
      Reviewed-by: Paolo Bonzini <pbonzini@redhat.com>
      Message-id: 1415058979-16604-6-git-send-email-jsnow@redhat.com
      Signed-off-by: Stefan Hajnoczi <stefanha@redhat.com>
    • ahci: Check cmd_fis[1] more explicitly · 102e5625
      John Snow authored
      Instead of checking for a known byte, inspect the
      fields of this byte explicitly to produce more meaningful
      error messages and improve the readability of this section.
      Signed-off-by: John Snow <jsnow@redhat.com>
      Reviewed-by: Paolo Bonzini <pbonzini@redhat.com>
      Message-id: 1415058979-16604-5-git-send-email-jsnow@redhat.com
      Signed-off-by: Stefan Hajnoczi <stefanha@redhat.com>
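
      As an illustration of inspecting the fields of cmd_fis[1] rather than
      comparing the byte against one known value, the following is a minimal
      sketch. It assumes the usual Register H2D layout for byte 1 (port
      multiplier port in bits 3:0, reserved bits 6:4, C bit in bit 7). The
      macro, enum and function names are hypothetical and are not taken from
      the patch.

```c
#include <assert.h>
#include <stdint.h>

/* Byte 1 of a Register H2D FIS. All names here are hypothetical. */
#define FIS_B1_PMP_MASK   0x0f  /* bits 3:0, Port Multiplier Port */
#define FIS_B1_RSVD_MASK  0x70  /* bits 6:4, reserved */
#define FIS_B1_C_BIT      0x80  /* bit 7, set when the Command register is updated */

typedef enum {
    FIS_B1_COMMAND,       /* C bit set: process the Command register */
    FIS_B1_CONTROL,       /* C bit clear: process the Control register */
    FIS_B1_BAD_PMP,       /* nonzero port multiplier port (unsupported in this sketch) */
    FIS_B1_BAD_RESERVED   /* reserved bits are set */
} FisByte1Kind;

/* Classify byte 1 field by field so each failure can get its own message. */
static FisByte1Kind classify_fis_byte1(uint8_t b)
{
    if (b & FIS_B1_PMP_MASK) {
        return FIS_B1_BAD_PMP;
    }
    if (b & FIS_B1_RSVD_MASK) {
        return FIS_B1_BAD_RESERVED;
    }
    return (b & FIS_B1_C_BIT) ? FIS_B1_COMMAND : FIS_B1_CONTROL;
}
```

      Returning a distinct value per field lets the caller print a specific
      diagnostic instead of a single generic "unexpected byte" error.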
    • ahci: Reorder error cases in handle_cmd · 36ab3c34
      John Snow authored
      Error checking in ahci's handle_cmd is re-ordered so that we
      initialize as few things as possible before we've done our
      sanity checking. This simplifies returning from this call
      in case of an error.
      
      A check to make sure the DMA memory map succeeds with the
      correct size is also added, and the debug print of the
      command fis is cleaned up with its size corrected.
      Signed-off-by: John Snow <jsnow@redhat.com>
      Reviewed-by: Paolo Bonzini <pbonzini@redhat.com>
      Message-id: 1415058979-16604-4-git-send-email-jsnow@redhat.com
      Signed-off-by: Stefan Hajnoczi <stefanha@redhat.com>
    • ahci: Fix FIS decomposition · 1cbdd968
      John Snow authored
      This patch introduces a few changes to how FIS packets are
      deciphered in the AHCI virtual device. The summary of
      changes can be grouped into two pieces:
      
      [A] Changes to how we apply a preliminary sieve to FISes,
      [B] Changes in how we internalize a decomposed FIS.
      
      == Changes to how we apply a preliminary sieve to FISes ==
      
      (1) Packets may now either update the Control register or
          the Command register, but not both. This is according
          to the SATA 3.2 specification which states:
          "...the device either initiates processing of the command
          indicated in the Command register or initiates processing
          of the control request indicated [...] depending on the
          state of the C bit in the FIS."
      
          See SATA 3.2 section 10.5.5.4, "Reception" in the 10.5.5
          "Register Host to Device FIS" section.
      
          This change accounts for the first two regions of change
          within the diff. All other changes belong to the following
          changes.
      
      == Changes in how we internalize a decomposed FIS ==
      
      (2) Instead of trying to extract the sector number out of the
          FIS from bytes 4-10 and setting it with ide_set_sector,
          we set the appropriate IDEState registers and trust that
          ide_get_sector can retrieve the correct sector later.
      
          By "constructing" the sector for use with ide_set_sector,
          we are duplicating the mechanisms of ide_get_sector.
          This change makes the FIS decomposition more obvious.
      
          SATA 3.2 as a specification does not make the legacy
          register mapping with respect to the D2H FIS obvious.
          However, SATA 3.2 section 10.5.5.1 "Register Host to
          Device FIS layout" describes all of the "cmd_fis"
          bytes:
      
          0 - FIS Type (0x27)
          1 - Port Multiplier Port and Command Update flag
          2 - ATA Command
          3 - Features_Low
          4 - LBA 7:0
          5 - LBA 15:8
          6 - LBA 23:16
          7 - Device, AKA "Drive Select."
          8 - LBA 31:24
          9 - LBA 39:32
          10 - LBA 47:40
          11 - Features_High
          12 - Count Low
          13 - Count High
          14 - ICC
          15 - Control
          16-19 - Auxiliary (for NCQ, defined per-command)
      
          Most of these registers map to existing IDEState registers
          in obvious ways, especially features, select, hob_features,
          and nsector (count). ICC is reserved in older specifications
          but is not supported in our implementation, and remains
          unused here. The Control register is not valid for a command
          that is trying to update the command register and is to be
          considered reserved at this point.
      
          What is not obvious is the LBA register mappings, but SATA 1.0
          can help inform us of legacy device support; see SATA 1.0 section
          8.5.2 "Register - Host to Device."
      
          LBA 7:0   - Sector Number    (sector)
          LBA 15:8  - Cyl Low          (lcyl)
          LBA 23:16 - Cyl High         (hcyl)
          LBA 31:24 - Sector Num Exp.  (hob_sector)
          LBA 39:32 - Cyl Low Exp.     (hob_lcyl)
          LBA 47:40 - Cyl High Exp.    (hob_hcyl)
      
          These mappings help guide which registers the FIS should be decomposed
          into/towards for CHS, LBA28 and LBA48 commands.
      
          As a note: The prior confusion that can be seen in the documentation
          arises from the fact that CHS and LBA28 commands use the low nybble
          of the drive select register to store LBA 27:24, whereas LBA48 commands
          use the hob_sector, hob_lcyl and hob_hcyl registers as explained above.
      
          The decomposition as it stands now will correctly decompose CHS, LBA28
          and LBA48 commands into their appropriate registers where the core
          IDE/ATAPI layers can deal with them correctly.
      
          See the below point for more information.
      
      (3) We save cmd_fis[7] as ide_state->select, which informs
          decisions about if we are using LBA or CHS.
          This corrects a bug in AHCI wherein we attempt to set and/or
          retrieve the sector number by using ide_set_sector and
          ide_get_sector, which depend on the select register to
          determine if we are using LBA or CHS.
      
          Without this adjustment, LBA48 read/writes are currently
          broken. Thanks to Eniac Zheng @ HP for pointing this out.
      
      (4) Save cmd_fis[11] as ide_state->hob_feature, as defined in SATA 3.2.
      
      (5) For several ATA commands, the sector count register set to 0
          is a magic number that means 256 sectors. For LBA48 commands,
          this means 65,536 sectors. We drop the magic sector correction
          here, and trust the ide core layer to handle the conversion
          appropriately, in ide_cmd_lba48_transform(). As it stands,
          the current AHCI code is only compliant with LBA28 commands.
          By simply removing the magic, it will work with LBA28 and LBA48.
      
      (6) We expand FIS decomposition to include both ATAPI and IDE devices.
          We leave the logic of determining if the fields are valid or not
          to the respective layers.
      
          This change intends to make it clearer that AHCI is only a
          composition mechanism for the FIS packets: the meaning of
          the registers is best left to the implementation layers for
          those devices.
      
      (7) Forcefully setting the feature, hcyl and lcyl registers for ATAPI
          commands is removed.
          - The hcyl and lcyl magic present here is valid at boot only,
            and should not be overridden for every PACKET command.
          - The feature register is defined as valid for the PACKET command,
            so we should not suppress it. The ATAPI layer does not even
            currently depend on or require 0x01 as mandatory.
      Signed-off-by: John Snow <jsnow@redhat.com>
      Reviewed-by: Paolo Bonzini <pbonzini@redhat.com>
      Message-id: 1415058979-16604-3-git-send-email-jsnow@redhat.com
      Signed-off-by: Stefan Hajnoczi <stefanha@redhat.com>
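
      The byte-to-register mapping described in this commit message can be
      sketched roughly as follows. This is not the patch itself: the Regs
      struct is a hypothetical stand-in for the IDE register state, the
      nsector/hob_nsector names for Count Low/High are an assumption, and the
      helper functions exist only to show how LBA28, LBA48 and the "0 means
      256 or 65,536" count rule fall out of the decomposed registers.

```c
#include <assert.h>
#include <stdint.h>

/* Hypothetical stand-in for the IDE register state. */
typedef struct {
    uint8_t feature, hob_feature;
    uint8_t sector, lcyl, hcyl;             /* LBA 7:0, 15:8, 23:16 */
    uint8_t hob_sector, hob_lcyl, hob_hcyl; /* LBA 31:24, 39:32, 47:40 */
    uint8_t select;                         /* Device, AKA "Drive Select" */
    uint8_t nsector, hob_nsector;           /* Count Low, Count High */
} Regs;

/* Copy Register H2D FIS bytes into registers without interpreting them. */
static void decompose_h2d_fis(const uint8_t *fis, Regs *r)
{
    assert(fis[0] == 0x27);   /* FIS Type: Register Host to Device */
    r->feature     = fis[3];
    r->sector      = fis[4];
    r->lcyl        = fis[5];
    r->hcyl        = fis[6];
    r->select      = fis[7];
    r->hob_sector  = fis[8];
    r->hob_lcyl    = fis[9];
    r->hob_hcyl    = fis[10];
    r->hob_feature = fis[11];
    r->nsector     = fis[12];
    r->hob_nsector = fis[13];
}

/* LBA48: all six LBA bytes come from the sector/cylinder registers. */
static uint64_t lba48_of(const Regs *r)
{
    return ((uint64_t)r->hob_hcyl << 40) | ((uint64_t)r->hob_lcyl << 32) |
           ((uint64_t)r->hob_sector << 24) | ((uint64_t)r->hcyl << 16) |
           ((uint64_t)r->lcyl << 8) | r->sector;
}

/* LBA28: bits 27:24 live in the low nybble of the select register. */
static uint32_t lba28_of(const Regs *r)
{
    return ((uint32_t)(r->select & 0x0f) << 24) | ((uint32_t)r->hcyl << 16) |
           ((uint32_t)r->lcyl << 8) | r->sector;
}

/* A count of 0 means 256 sectors (LBA28) or 65,536 sectors (LBA48). */
static uint32_t sector_count(const Regs *r, int lba48)
{
    uint32_t n = r->nsector;
    if (lba48) {
        n |= (uint32_t)r->hob_nsector << 8;
        return n ? n : 65536;
    }
    return n ? n : 256;
}
```

      Keeping decompose_h2d_fis free of interpretation mirrors the stated
      intent that AHCI only moves bytes into registers, while the meaning of
      those registers is decided by the device layer.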
    • ahci: add is_ncq predicate helper · 72a065db
      John Snow authored
      A small helper to determine which S/ATA commands
      are destined to be routed to the NCQ pathways.
      
      This references SATA 3.2 section 13.6,
      Native Command Queueing. See sections 13.6.4,
      13.6.5, 13.6.6, 13.6.7 and 13.6.8 for all
      SATA commands considered to be part of the
      NCQ feature set. This is summarized in a small
      list in section 13.6.3.1 and again in 13.6.3.2.
      
      Not all of these NCQ commands are currently supported,
      so the error pathways are adjusted slightly to be more
      informative in the case they are encountered.
      Signed-off-by: John Snow <jsnow@redhat.com>
      Reviewed-by: Paolo Bonzini <pbonzini@redhat.com>
      Message-id: 1415058979-16604-2-git-send-email-jsnow@redhat.com
      Signed-off-by: Stefan Hajnoczi <stefanha@redhat.com>
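
      A predicate of this kind might look like the sketch below. The opcode
      values are the standard ATA opcodes for the NCQ command set. The enum
      and function names are hypothetical, and the helper in the patch may be
      written differently.

```c
#include <assert.h>
#include <stdbool.h>
#include <stdint.h>

/* ATA opcodes of the NCQ feature set; identifier names are hypothetical. */
enum {
    CMD_READ_FPDMA_QUEUED    = 0x60,
    CMD_WRITE_FPDMA_QUEUED   = 0x61,
    CMD_NCQ_NON_DATA         = 0x63,
    CMD_SEND_FPDMA_QUEUED    = 0x64,
    CMD_RECEIVE_FPDMA_QUEUED = 0x65
};

/* True when the command should be routed to the NCQ pathways. */
static bool is_ncq_cmd(uint8_t ata_cmd)
{
    switch (ata_cmd) {
    case CMD_READ_FPDMA_QUEUED:
    case CMD_WRITE_FPDMA_QUEUED:
    case CMD_NCQ_NON_DATA:
    case CMD_SEND_FPDMA_QUEUED:
    case CMD_RECEIVE_FPDMA_QUEUED:
        return true;
    default:
        return false;
    }
}
```

      A caller can test is_ncq_cmd first and then report "NCQ command not
      supported" separately from "unknown command", which is the kind of
      more informative error path the message describes.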
    • ide: Correct handling of malformed/short PRDTs · 3251bdcf
      John Snow authored
      This impacts both BMDMA and AHCI HBA interfaces for IDE.
      Currently, we confuse the difference between a PRDT having
      "0 bytes" and a PRDT having "0 complete sectors."
      
      When we receive an incomplete sector, inconsistent error checking
      leads to an infinite loop wherein the call succeeds, but it
      didn't give us enough bytes -- leading us to re-call the
      DMA chain over and over again. This leads to, in the BMDMA case,
      leaked memory for short PRDTs, and infinite loops and resource
      usage in the AHCI case.
      
      The .prepare_buf() callback is reworked to return the number of
      bytes that it successfully prepared. 0 is a valid, non-error
      answer that means the table was empty and described no bytes.
      -1 indicates an error.
      
      Our current implementation uses the io_buffer in IDEState to
      ultimately describe the size of a prepared scatter-gather list.
      Even though the AHCI PRDT/SGList can be as large as 256GiB, the
      AHCI command header limits transactions to just 4GiB. ATA8-ACS3,
      however, defines the largest transaction to be an LBA48 command
      that transfers 65,536 sectors. With a 512 byte sector size, this
      is just 32MiB.
      
      Since our current state structures use the int type to describe
      the size of the buffer, and this state is migrated as int32, we
      are limited to describing 2GiB buffer sizes unless we change the
      migration protocol.
      
      For this reason, this patch begins to unify the assertions in the
      IDE pathways that the scatter-gather list provided by either the
      AHCI PRDT or the PCI BMDMA PRDs can only describe, at a maximum,
      2GiB. This should be resilient enough unless we need a sector
      size that exceeds 32KiB.
      
      Further, the likelihood of any guest operating system actually
      attempting to transfer this much data in a single operation is
      very slim.
      
      To this end, the IDEState variables have been updated to more
      explicitly clarify our maximum supported size. Callers to the
      prepare_buf callback have been reworked to understand the new
      return code, and all versions of the prepare_buf callback have
      been adjusted accordingly.
      
      Lastly, the ahci_populate_sglist helper, relied upon by the
      AHCI implementation of .prepare_buf(), as well as the PCI
      implementation of the callback, have had overflow assertions
      added to help make clear the reasoning behind the various
      type changes.
      
      [Added %d -> %"PRId64" fix John sent because off_pos changed from int to
      int64_t.
      --Stefan]
      Signed-off-by: John Snow <jsnow@redhat.com>
      Reviewed-by: Paolo Bonzini <pbonzini@redhat.com>
      Message-id: 1414785819-26209-4-git-send-email-jsnow@redhat.com
      Signed-off-by: Stefan Hajnoczi <stefanha@redhat.com>
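
      The return convention described above (byte count on success, 0 for an
      empty table, -1 on error) and the 2GiB ceiling can be illustrated with
      a small sketch. The Prd struct and both function names are hypothetical,
      and the real callbacks operate on device state rather than a plain
      array.

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>

/* Hypothetical, simplified physical region descriptor. */
typedef struct {
    uint64_t addr;  /* guest address of the region */
    uint32_t len;   /* region length in bytes */
} Prd;

/*
 * Sum the bytes described by a table.
 * Returns the total, 0 for an empty table, or -1 when the total would not
 * fit in an int32 (the limit imposed by the migrated state).
 */
static int32_t prd_total_bytes(const Prd *table, size_t n)
{
    int64_t total = 0;
    for (size_t i = 0; i < n; i++) {
        total += table[i].len;
        if (total > INT32_MAX) {
            return -1;
        }
    }
    return (int32_t)total;
}

/* Whole sectors in a byte count; trailing partial sectors are not counted. */
static int32_t complete_sectors(int32_t bytes, int32_t sector_size)
{
    assert(bytes >= 0 && sector_size > 0);
    return bytes / sector_size;
}
```

      With this split, a table of 768 bytes reports 768 bytes but only one
      complete 512-byte sector, so the caller can tell "short" apart from
      "empty" instead of retrying forever. The arithmetic in the message also
      checks out: 65,536 sectors of 512 bytes is 32MiB, and 65,536 sectors of
      32KiB is exactly 2GiB.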
    • ahci: unify sglist preparation · bef1301a
      John Snow authored
      The intent of this patch is to further unify the creation and
      deletion of the sglist used for all AHCI transfers, including
      emulated PIO, ATAPI R/W, and native DMA R/W.
      
      By replacing ahci_start_transfer's call to ahci_populate_sglist
      with ahci_dma_prepare_buf, we reduce the number of direct calls
      where we manipulate the scatter-gather list in the AHCI code.
      
      To make this switch, the constant "0" passed as an offset
      in ahci_dma_prepare_buf is adjusted to use io_buffer_offset.
      
      For DMA pathways, this has no effect: io_buffer_offset is always
      updated to 0 at the beginning of a DMA transfer loop regardless.
      DMA pathways through ide_dma_cb() update the io_buffer_offset
      accordingly, and for circumstances where we might make several
      trips through this loop, this may actually correct a design flaw.
      
      For PIO pathways, the newly updated ahci_dma_prepare_buf will
      now prepare the sglist at the correct offset. It will also set
      io_buffer_size, but this is not used in the cmd_read_pio or
      cmd_write_pio pathways.
      Signed-off-by: John Snow <jsnow@redhat.com>
      Reviewed-by: Paolo Bonzini <pbonzini@redhat.com>
      Message-id: 1414785819-26209-3-git-send-email-jsnow@redhat.com
      Signed-off-by: Stefan Hajnoczi <stefanha@redhat.com>
    • ide: repair PIO transfers for cases where nsector > 1 · 36334faf
      John Snow authored
      Currently, for emulated PIO transfers through the AHCI device,
      any attempt made to request more than a single sector's worth
      of data will result in the same sector being transferred over
      and over.
      
      For example, if we request 8 sectors via PIO READ SECTORS, the
      AHCI device will give us the same sector eight times.
      
      This patch adds offset tracking into the PIO pathways so that
      we can fulfill these requests appropriately.
      Signed-off-by: John Snow <jsnow@redhat.com>
      Reviewed-by: Paolo Bonzini <pbonzini@redhat.com>
      Message-id: 1414785819-26209-2-git-send-email-jsnow@redhat.com
      Signed-off-by: Stefan Hajnoczi <stefanha@redhat.com>
    • ahci: Fix byte count regression for ATAPI/PIO · a395f3fa
      John Snow authored
      This patch fixes a regression caused by commit
      659142ec.
      The problem occurs when we wish to return early
      from the ahci_start_transfer function, but are now
      updating the transferred byte count in the AHCI
      command header via ahci_commit_buf.
      
      This will cause problems in the Windows 8 installer.
      
      Don't update the byte count in the command header
      for the transmission of ATAPI packets: These commands
      will distort the final byte count of the actual data
      payload.
      
      The call to ahci_commit_buf remains in the "out"
      portion of the call in order to clean up the sglist.
      The byte count is maintained by forcing size to be 0.
      Signed-off-by: John Snow <jsnow@redhat.com>
      Signed-off-by: Stefan Hajnoczi <stefanha@redhat.com>
  6. 13 Nov, 2014 4 commits
  7. 12 Nov, 2014 3 commits
  8. 11 Nov, 2014 3 commits
  9. 10 Nov, 2014 1 commit
  10. 07 Nov, 2014 2 commits
    • virtio-scsi: work around bug in old BIOSes · 55783a55
      Paolo Bonzini authored
      Old BIOSes left some padding by mistake after the req_size/resp_size.
      New QEMU does not like it, thinking it is a bidirectional command.
      
      As a workaround, we can check if the ANY_LAYOUT bit is set; if not, we
      always consider the first buffer as the virtio-scsi request/response,
      because, back when QEMU did not support ANY_LAYOUT, it expected the
      payload to start at the second element of the iovec.
      
      This can show up during migration.
      
      Cc: qemu-stable@nongnu.org
      Signed-off-by: Paolo Bonzini <pbonzini@redhat.com>
    • esp-pci: fixup deadlock with linux · c3543fb5
      Hannes Reinecke authored
      A linux guest will be issuing messages:
      
      [   32.124042] DC390: Deadlock in DataIn_0: DMA aborted unfinished: 000000 bytes remain!!
      [   32.126348] DC390: DataIn_0: DMA State: 0
      
      and the HBA will fail to work properly.
      The reason is that the emulation is not setting the
      'DMA transfer done' status correctly.
      Signed-off-by: Hannes Reinecke <hare@suse.de>
      Cc: qemu-stable@nongnu.org
      Signed-off-by: Paolo Bonzini <pbonzini@redhat.com>
  11. 06 Nov, 2014 1 commit
    • virtio-serial: avoid crash when port has no name · 7eb73114
      Marc-André Lureau authored
      It seems "name" is not mandatory, and the following command line (based
      on one generated by current libvirt) will crash qemu at start:
      
      qemu-system-x86_64 \
          -device virtio-serial-pci \
          -device virtserialport,name=foo \
          -device virtconsole
      
      Program received signal SIGSEGV, Segmentation fault.
      __strcmp_ssse3 () at ../sysdeps/x86_64/strcmp.S:210
      210        movlpd    (%rsi), %xmm2
      Missing separate debuginfos, use: debuginfo-install
      python-libs-2.7.5-13.fc20.x86_64
      (gdb) bt
       #0  __strcmp_ssse3 () at ../sysdeps/x86_64/strcmp.S:210
       #1  0x000055555566bdc6 in find_port_by_name (name=0x0) at /home/elmarco/src/qemu/hw/char/virtio-serial-bus.c:67
      Signed-off-by: Marc-André Lureau <marcandre.lureau@gmail.com>
      Reviewed-by: Amos Kong <akong@redhat.com>
      Signed-off-by: Amit Shah <amit.shah@redhat.com>
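
      The backtrace shows strcmp being reached with a NULL name. A NULL-safe
      lookup can be sketched as below. The Port struct and the function name
      are hypothetical, and this is one possible way to guard the comparison,
      not necessarily the approach the patch takes.

```c
#include <assert.h>
#include <stddef.h>
#include <string.h>

/* Hypothetical minimal port record; a port may have no name at all. */
typedef struct {
    const char *name;
} Port;

/* Find a port by name, treating a missing name on either side as no match. */
static const Port *find_port_by_name_sketch(const Port *ports, size_t n,
                                            const char *name)
{
    if (name == NULL) {
        return NULL;  /* nothing to look up, and strcmp(NULL, ...) is undefined */
    }
    for (size_t i = 0; i < n; i++) {
        if (ports[i].name != NULL && strcmp(ports[i].name, name) == 0) {
            return &ports[i];
        }
    }
    return NULL;
}
```

      Both checks matter in the scenario from the command line above: the
      virtconsole port has no name to search for, and any unnamed port
      already on the bus must be skipped during the scan.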
  12. 05 Nov, 2014 3 commits
  13. 04 Nov, 2014 4 commits
    • spapr: Allow dynamic creation of PHB · 9e3f9733
      Alexander Graf authored
      Now that we finally check for presence of dangling sysbus devices, make check
      started complaining that the sPAPR PHB is one such device.
      
      However, it really isn't. The spapr PHB is not really a traditional sysbus
      device, but much more a special spapr pv device which is already able to get
      created dynamically.
      
      Move spapr to its own dynamic sysbus check handling and allow PHB devices to
      get allocated dynamically.
      Signed-off-by: Alexander Graf <agraf@suse.de>
    • e500: Add support for eTSEC in device tree · fdfb7f2c
      Alexander Graf authored
      This patch adds support to expose eTSEC devices in the dynamically created
      guest facing device tree. This allows us to expose eTSEC devices into guests
      without changes in the machine file.
      
      Because we can now tell the guest about eTSEC devices this patch allows the
      user to specify eTSEC devices via -device at all.
      Signed-off-by: Alexander Graf <agraf@suse.de>
    • PPC: e500: Support dynamically spawned sysbus devices · f7087343
      Alexander Graf authored
      For e500 our approach to supporting dynamically spawned sysbus devices is to
      create a simple bus from the guest's point of view within which we map those
      devices dynamically.
      
      We allocate memory regions always within the "platform" hole in address
      space and map IRQs to predetermined IRQ lines that are reserved for platform
      device usage.
      
      This maps really nicely into device tree logic, so we can just tell the
      guest about our virtual simple bus in device tree as well.
      Signed-off-by: Alexander Graf <agraf@suse.de>
    • sysbus: Add new platform bus helper device · 7634fe3c
      Alexander Graf authored
      We need to support spawning of sysbus devices dynamically via the command line.
      The easiest way to represent these dynamically spawned devices in the guest's
      memory and IRQ layout is by preallocating some space for dynamic sysbus devices.
      
      This is what the "platform bus" device does. It is a sysbus device that exports
      a configurably sized MMIO region and a configurable number of IRQ lines. When
      this device encounters sysbus devices that have been dynamically created and not
      manually wired up, it dynamically connects them to its own pool of resources.
      
      The machine model can then loop through all of these devices and create a guest
      configuration (device tree) to make them visible to the guest.
      Signed-off-by: Alexander Graf <agraf@suse.de>
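
      The idea of handing out slices of a preallocated MMIO window and a
      fixed set of IRQ lines can be sketched as follows. This is a deliberately
      simplified bump allocator with hypothetical names (PlatformPool,
      pool_map_mmio, pool_connect_irq). The actual device may track free
      resources differently, for example to honour alignment or to allow
      release.

```c
#include <assert.h>
#include <stdint.h>

#define POOL_NO_SPACE UINT64_MAX

/* Hypothetical resource pool owned by a platform-bus-like device. */
typedef struct {
    uint64_t mmio_size;  /* size of the preallocated MMIO window */
    uint64_t mmio_next;  /* next free offset inside the window */
    unsigned num_irqs;   /* number of IRQ lines reserved for dynamic devices */
    unsigned irq_next;   /* next unassigned IRQ line */
} PlatformPool;

/* Reserve `size` bytes; returns the offset in the window or POOL_NO_SPACE. */
static uint64_t pool_map_mmio(PlatformPool *p, uint64_t size)
{
    assert(size > 0);
    if (size > p->mmio_size - p->mmio_next) {
        return POOL_NO_SPACE;
    }
    uint64_t offset = p->mmio_next;
    p->mmio_next += size;
    return offset;
}

/* Reserve one IRQ line; returns its index or -1 when none are left. */
static int pool_connect_irq(PlatformPool *p)
{
    if (p->irq_next >= p->num_irqs) {
        return -1;
    }
    return (int)p->irq_next++;
}
```

      Because every dynamically created device ends up with an offset and an
      IRQ index relative to the pool, a machine model can later walk the
      devices and emit matching device tree nodes for the guest.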