1. 27 Jul, 2010 3 commits
    • [SCSI] mptsas: fix hangs caused by ATA pass-through · 2a1b7e57
      Ryan Kuester authored
      I may have an explanation for the LSI 1068 HBA hangs provoked by ATA
      pass-through commands, in particular by smartctl.
      First, my version of the symptoms.  On an LSI SAS1068E B3 HBA running firmware, with SATA disks, and with smartd running, I'm seeing
      occasional task, bus, and host resets, some of which lead to hard faults of
      the HBA requiring a reboot.  Abusively looping the smartctl command,
          # while true; do smartctl -a /dev/sdb > /dev/null; done
      dramatically increases the frequency of these failures to nearly one per
      minute.  A high IO load through the HBA while looping smartctl seems to
      increase the chance of a full SCSI host reset or a non-recoverable hang.
      I reduced what smartctl was doing down to a simple test case which
      causes the hang with a single IO when pointed at the sd interface.  See
      the code at the bottom of this e-mail.  It uses an SG_IO ioctl to issue
      a single pass-through ATA identify device command.  If the buffer
      userspace gives for the read data has certain alignments, the task is
      issued to the HBA but the HBA fails to respond.  If run against the sg
      interface, neither the test code nor smartctl causes a hang.
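      The test code from the original e-mail is not reproduced in this log.
      As a rough sketch only, a single ATA IDENTIFY DEVICE pass-through via
      SG_IO might be set up as below.  The function names, the 5 second
      timeout, and the choice of the 16-byte ATA PASS-THROUGH CDB are
      assumptions made for illustration, not the author's code.  To probe
      alignment, the caller would pass a pointer offset into a larger
      buffer.  Issuing this against an affected HBA may hang it.

```c
#include <string.h>
#include <sys/ioctl.h>
#include <scsi/sg.h>

#define ATA_PASS_THROUGH_16  0x85  /* SCSI opcode: ATA PASS-THROUGH (16) */
#define ATA_IDENTIFY_DEVICE  0xEC  /* ATA command: IDENTIFY DEVICE */
#define IDENTIFY_DATA_LEN    512   /* IDENTIFY returns one 512-byte block */

/* Hypothetical helper: fill a 16-byte CDB for a PIO data-in IDENTIFY. */
void build_identify_cdb(unsigned char cdb[16])
{
    memset(cdb, 0, 16);
    cdb[0]  = ATA_PASS_THROUGH_16;
    cdb[1]  = 4 << 1;              /* protocol 4: PIO data-in */
    cdb[2]  = 0x0e;                /* t_dir=from device, byte_block=1,
                                      t_length=2 (count in sector count) */
    cdb[6]  = 1;                   /* sector count: one block */
    cdb[14] = ATA_IDENTIFY_DEVICE;
}

/*
 * Hypothetical helper: issue the command on an open sd or sg descriptor.
 * buf must have room for IDENTIFY_DATA_LEN bytes; its alignment is what
 * the commit message is about.  Returns the ioctl() result.
 */
int issue_identify(int fd, void *buf, unsigned char *sense,
                   unsigned char sense_len)
{
    unsigned char cdb[16];
    sg_io_hdr_t hdr;

    build_identify_cdb(cdb);

    memset(&hdr, 0, sizeof(hdr));
    hdr.interface_id    = 'S';
    hdr.dxfer_direction = SG_DXFER_FROM_DEV;
    hdr.cmd_len         = sizeof(cdb);
    hdr.cmdp            = cdb;
    hdr.dxfer_len       = IDENTIFY_DATA_LEN;
    hdr.dxferp          = buf;
    hdr.mx_sb_len       = sense_len;
    hdr.sbp             = sense;
    hdr.timeout         = 5000;    /* milliseconds (assumed value) */

    return ioctl(fd, SG_IO, &hdr);
}
```

      Pointing fd at /dev/sdX exercises the direct-IO path described in
      this message, while /dev/sgX goes through sg's own page-aligned
      buffer unless direct IO is requested.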
      sd and sg handle the SG_IO ioctl slightly differently.  Unless you
      specifically set a flag to do direct IO, sg passes a buffer of its own,
      which is page-aligned, to the block layer and later copies the result
      into the userspace buffer regardless of its alignment.  sd, on the other
      hand, always does direct IO unless the userspace buffer fails an
      alignment test at block/blk-map.c line 57, in which case a page-aligned
      buffer is created and used for the transfer.
      The alignment test currently checks for word-alignment, the default
      setup by scsi_lib.c; therefore, userspace buffers of almost any
      alignment are given directly to the HBA as DMA targets.  The LSI 1068
      hardware doesn't seem to like at least a couple of the alignments which
      cross a page boundary (see the test code below).  Curiously, many
      page-boundary-crossing alignments do work just fine.
      So, either the hardware has a bug handling certain alignments or the
      hardware has a stricter alignment requirement than the driver is
      advertising.  If stricter alignment is required, then in no case should
      misaligned buffers from userspace be allowed through without being
      bounced or at least causing an error to be returned.
      It seems the mptsas driver could use blk_queue_dma_alignment() to advertise
      a stricter alignment requirement.  If it does, sd does the right thing and
      bounces misaligned buffers (see block/blk-map.c line 57).  The following
      patch to 2.6.34-rc5 makes my symptoms go away.  I'm sure this is the wrong
      place for this code, but it gets my idea across.
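      The effect of advertising a stricter alignment can be modelled in
      userspace.  The sketch below is a simplified stand-in for the
      block-layer alignment test, not the kernel code itself: the function
      names are hypothetical, and the 512-byte mask is an assumed value
      chosen for illustration, since the message does not state which
      alignment the patch uses.  Masks follow the "alignment minus one"
      convention that blk_queue_dma_alignment() takes.

```c
#include <stdbool.h>
#include <stddef.h>
#include <stdint.h>

#define WORD_ALIGN_MASK    0x3u   /* word alignment, the default described above */
#define STRICT_ALIGN_MASK  511u   /* assumption: 512-byte alignment, illustration only */

/*
 * Simplified model of the alignment test: a user buffer is mapped for
 * direct IO only if both its address and its length are aligned to the
 * mask; otherwise the transfer is bounced through a page-aligned buffer.
 */
bool buffer_is_aligned(uintptr_t addr, size_t len, unsigned int mask)
{
    return ((addr | len) & mask) == 0;
}

/* True if the byte range [addr, addr + len) spans more than one page. */
bool crosses_page(uintptr_t addr, size_t len, size_t page_size)
{
    if (len == 0)
        return false;
    return (addr / page_size) != ((addr + len - 1) / page_size);
}
```

      With the word mask, a 512-byte buffer at address 0x10e04 passes the
      test and is handed to the HBA even though it crosses a 4096-byte page
      boundary.  With the stricter mask the same buffer fails the test and
      would be bounced, and any 512-byte buffer that does pass can no longer
      straddle a page.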
      Acked-by: "Desai, Kashyap" <Kashyap.Desai@lsi.com>
      Signed-off-by: James Bottomley <James.Bottomley@suse.de>
    • [SCSI] mpt2sas: DIF Type 2 Protection Support · d334aa79
      Eric Moore authored
      Adding DIF Type 2 protection support, as well as turning on 32-byte CDBs
      and setting the CDB length for CDBs larger than 16 bytes in the
      SCSI_IO->control parameter.
      Signed-off-by: Martin Petersen <martin.petersen@oracle.com>
      Signed-off-by: Eric Moore <eric.moore@lsi.com>
      Signed-off-by: James Bottomley <James.Bottomley@suse.de>
    • [SCSI] ibmvscsi: Fix oops when an interrupt is pending during probe · 8f83d768
      Anton Blanchard authored
      A driver needs to be ready to take an interrupt as soon as it registers
      an interrupt handler. I noticed the following oops when testing kdump:
      ipr: IBM Power RAID SCSI Device Driver version: 2.5.0 (February 11, 2010)
      ibmvscsi 30000002: SRP_VERSION: 16.a
      ibmvscsi 30000002: SRP_VERSION: 16.a
      Unable to handle kernel paging request for data at address 0x00000000
      pc: c000000004085e34: .tasklet_action+0xf4/0x1dc
      c000000004086fe4 .__do_softirq+0x16c/0x2c0
      c00000000403138c .call_do_softirq+0x14/0x24
      c00000000400ee14 .do_softirq+0xa0/0x104
      c00000000408690c .irq_exit+0x70/0xd0
      c00000000400f190 .do_IRQ+0x214/0x2a8
      c000000004004804 hardware_interrupt_entry+0x1c/0x98
      --- Exception: 501 (Hardware Interrupt) at c00000000400c544 .raw_local_irq_restore+0x48/0x54
      c00000000465d2a8 ._raw_spin_unlock_irqrestore+0x74/0xa0
      c0000000040e7f00 .__setup_irq+0x2ec/0x3f0
      c0000000040e8198 .request_threaded_irq+0x194/0x22c
      c00000000446d854 .rpavscsi_init_crq_queue+0x284/0x3f0
      c00000000446c764 .ibmvscsi_probe+0x688/0x710
      c00000000402903c .vio_bus_probe+0x37c/0x3e4
      c000000004403f10 .driver_probe_device+0xec/0x1b8
      c000000004404088 .__driver_attach+0xac/0xf4
      c000000004403184 .bus_for_each_dev+0x98/0x104
      c000000004403c98 .driver_attach+0x40/0x60
      c0000000044026f0 .bus_add_driver+0x154/0x324
      c0000000044045d0 .driver_register+0xe8/0x1ac
      c00000000402b2a8 .vio_register_driver+0x54/0x74
      c000000004933ea4 .ibmvscsi_module_init+0x80/0xc0
      c000000004009834 .do_one_initcall+0x98/0x1d8
      c0000000049005b4 .kernel_init+0x27c/0x33c
      c000000004031550 .kernel_thread+0x54/0x70
      srp_task needs to be set up before request_irq. The patch below fixes the oops.
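      The ordering problem can be illustrated with a small userspace model.
      This is not the driver code: every name below (fake_host,
      fake_request_irq, probe_buggy, probe_fixed) is hypothetical, and a
      plain function pointer stands in for the tasklet.  The model only
      shows why the handler's dependencies must exist before the handler
      is registered when an interrupt may already be pending.

```c
#include <stdbool.h>
#include <stddef.h>

typedef void (*task_fn)(void *data);

/* Hypothetical stand-in for the per-host driver state. */
struct fake_host {
    task_fn srp_task;   /* stands in for the tasklet; NULL until set up */
    int handled;        /* number of times the task has run */
};

/* The deferred work: just count invocations. */
void count_task(void *data)
{
    (*(int *)data)++;
}

/*
 * Model of the interrupt handler.  Returns false where the real kernel
 * would dereference an uninitialised task and oops.
 */
bool fake_irq_handler(struct fake_host *h)
{
    if (h->srp_task == NULL)
        return false;
    h->srp_task(&h->handled);
    return true;
}

/* Model of registration: a pending interrupt fires immediately. */
bool fake_request_irq(struct fake_host *h, bool irq_pending)
{
    return irq_pending ? fake_irq_handler(h) : true;
}

/* Buggy order: register the handler, then set up the task. */
bool probe_buggy(struct fake_host *h, bool irq_pending)
{
    bool ok = fake_request_irq(h, irq_pending);
    h->srp_task = count_task;
    return ok;
}

/* Fixed order: set up the task, then register the handler. */
bool probe_fixed(struct fake_host *h, bool irq_pending)
{
    h->srp_task = count_task;
    return fake_request_irq(h, irq_pending);
}
```

      In this model the buggy order only fails when an interrupt is already
      pending at registration time, which matches the kdump scenario in
      which the oops was observed.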
      Signed-off-by: Anton Blanchard <anton@samba.org>
      Acked-by: Brian King <brking@linux.vnet.ibm.com>
      Signed-off-by: James Bottomley <James.Bottomley@suse.de>
  2. 21 Jul, 2010 4 commits
  3. 11 Jun, 2010 33 commits