1. 11 May, 2009 1 commit
    • block: add rq->resid_len · c3a4d78c
      Tejun Heo authored
      rq->data_len served two purposes - the length of data buffer on issue
      and the residual count on completion.  This duality creates some
      problems.
      First of all, block layer and low level drivers can't really determine
      what rq->data_len contains while a request is executing.  It could be
      the total request length or it could be anything else one of the
      lower layers is using to keep track of residual count.  This
      complicates things because blk_rq_bytes() and thus
      [__]blk_end_request_all() rely on rq->data_len for PC commands.
      Drivers which want to report residual count should first cache the
      total request length, update rq->data_len and then complete the
      request with the cached data length.
      Secondly, it makes requests default to reporting full residual count,
      i.e. reporting that no data transfer occurred.  The residual count is
      an exception not the norm; however, the driver should clear
      rq->data_len to zero to signify the normal cases while leaving it
      alone means no data transfer occurred at all.  This reverse default
      behavior complicates code unnecessarily and renders block PC on some
      drivers (ide-tape/floppy) unusable.
      This patch adds rq->resid_len which is used only for residual count.
      While at it, remove the now unnecessary blk_rq_bytes() caching in
      ide_pc_intr() as rq->data_len is not changed anymore.
      Boaz	: spotted missing conversion in osd
      Sergei	: spotted too early conversion to blk_rq_bytes() in ide-tape
      [ Impact: cleanup residual count handling, report 0 resid by default ]
      Signed-off-by: Tejun Heo <tj@kernel.org>
      Cc: James Bottomley <James.Bottomley@HansenPartnership.com>
      Cc: Bartlomiej Zolnierkiewicz <bzolnier@gmail.com>
      Cc: Borislav Petkov <petkovbb@googlemail.com>
      Cc: Sergei Shtylyov <sshtylyov@ru.mvista.com>
      Cc: Mike Miller <mike.miller@hp.com>
      Cc: Eric Moore <Eric.Moore@lsi.com>
      Cc: Alan Stern <stern@rowland.harvard.edu>
      Cc: FUJITA Tomonori <fujita.tomonori@lab.ntt.co.jp>
      Cc: Doug Gilbert <dgilbert@interlog.com>
      Cc: Darrick J. Wong <djwong@us.ibm.com>
      Cc: Pete Zaitcev <zaitcev@redhat.com>
      Cc: Boaz Harrosh <bharrosh@panasas.com>
      Signed-off-by: Jens Axboe <jens.axboe@oracle.com>
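The separation described in this commit can be sketched roughly as follows. This is a minimal illustration, not kernel code: `mock_request` and the `mock_rq_*` helpers are hypothetical names, and only the two fields discussed in the commit message (`data_len`, `resid_len`) are modelled.

```c
/* Hypothetical stand-in for struct request; only the two length fields
 * discussed in the commit message are modelled. */
struct mock_request {
	unsigned int data_len;   /* buffer length at issue; left untouched afterwards */
	unsigned int resid_len;  /* bytes NOT transferred; zero unless reported */
};

/* Prepare a request: the residual defaults to zero (full transfer). */
static void mock_rq_init(struct mock_request *rq, unsigned int len)
{
	rq->data_len = len;
	rq->resid_len = 0;
}

/* Driver-side completion: a residual is recorded only for a short transfer. */
static void mock_rq_complete(struct mock_request *rq, unsigned int transferred)
{
	if (transferred < rq->data_len)
		rq->resid_len = rq->data_len - transferred;
}

/* Bytes actually transferred, as an issuer might compute it afterwards. */
static unsigned int mock_rq_transferred(const struct mock_request *rq)
{
	return rq->data_len - rq->resid_len;
}
```

Because `data_len` is never overwritten in this sketch, a driver does not need to cache the total length before reporting a residual, which is the simplification the commit message argues for.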
  2. 27 Apr, 2009 2 commits
    • block: kill rq->data · 731ec497
      Tejun Heo authored
      Now that all block request data transfer is done via bio, rq->data
      isn't used.  Kill it.
      While at it, make the roles of rq->special and buffer clear.
      [ Impact: drop now unnecessary field from struct request ]
      Signed-off-by: Tejun Heo <tj@kernel.org>
      Cc: Boaz Harrosh <bharrosh@panasas.com>
    • block: implement and use [__]blk_end_request_all() · 40cbbb78
      Tejun Heo authored
      There are many [__]blk_end_request() call sites which call it with
      full request length and expect full completion.  Many of them ensure
      that the request actually completes by doing BUG_ON() the return
      value, which is awkward and error-prone.
      This patch adds [__]blk_end_request_all() which takes @rq and @error
      and fully completes the request.  BUG_ON() is added to ensure that
      this actually happens.
      Most conversions are simple but there are a few noteworthy ones.
      * cdrom/viocd: viocd_end_request() replaced with direct calls to
      * s390/block/dasd: dasd_end_request() replaced with direct calls to
      * s390/char/tape_block: tapeblock_end_request() replaced with direct
        calls to blk_end_request_all().
      [ Impact: cleanup ]
      Signed-off-by: Tejun Heo <tj@kernel.org>
      Cc: Russell King <rmk@arm.linux.org.uk>
      Cc: Stephen Rothwell <sfr@canb.auug.org.au>
      Cc: Mike Miller <mike.miller@hp.com>
      Cc: Martin Schwidefsky <schwidefsky@de.ibm.com>
      Cc: Jeff Garzik <jgarzik@pobox.com>
      Cc: Rusty Russell <rusty@rustcorp.com.au>
      Cc: Jeremy Fitzhardinge <jeremy@xensource.com>
      Cc: Alex Dubov <oakad@yahoo.com>
      Cc: James Bottomley <James.Bottomley@HansenPartnership.com>
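The helper pattern this commit describes can be sketched as below. This is an illustration under stated assumptions rather than the kernel implementation: `mock_rq`, `mock_end_request()` and `mock_end_request_all()` are hypothetical names, the return convention (true while bytes remain pending) is an assumption of the sketch, and `assert()` stands in for `BUG_ON()`.

```c
#include <assert.h>
#include <stdbool.h>

/* Hypothetical request: tracks only how many bytes are still pending. */
struct mock_rq {
	unsigned int bytes_left;
	int error;
};

/* Models a partial-completion call: completes up to nr_bytes and returns
 * true if the request still has bytes pending afterwards. */
static bool mock_end_request(struct mock_rq *rq, int error, unsigned int nr_bytes)
{
	if (nr_bytes > rq->bytes_left)
		nr_bytes = rq->bytes_left;
	rq->bytes_left -= nr_bytes;
	rq->error = error;
	return rq->bytes_left != 0;
}

/* Models the "_all" helper: takes only the request and the error, completes
 * whatever is left, and checks that nothing remains pending. */
static void mock_end_request_all(struct mock_rq *rq, int error)
{
	bool pending = mock_end_request(rq, error, rq->bytes_left);

	assert(!pending); /* stands in for BUG_ON(pending) */
}
```

Call sites that previously passed the full length and then checked the return value themselves would, in this model, collapse to a single `mock_end_request_all(rq, error)` call.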
  3. 03 Apr, 2009 1 commit
    • [SCSI] fix recovered error handling · a9bddd74
      James Bottomley authored
      We have a problem with recovered error handling in that any command
      which goes down as BLOCK_PC but which returns a sense code of RECOVERED
      ERROR gets completed with -EIO.  For actual SG_IO commands, this doesn't
      matter at all, since the error return code gets dropped in favour of
      req->errors which contains the SCSI completion code.
      However, if this command is part of the block system, then it will pay
      attention to the returned error code.  In particular, if a SYNCHRONIZE
      CACHE from a barrier command completes with RECOVERED ERROR, the
      resulting -EIO on the barrier causes block to error the request and
      return it to the filesystem.  Fix this by converting the -EIO for
      recovered error to zero, plus remove the printing of this from sd and sr
      so the message isn't double printed.
      Signed-off-by: James Bottomley <James.Bottomley@HansenPartnership.com>
  4. 12 Mar, 2009 1 commit
  5. 21 Feb, 2009 1 commit
  6. 07 Jan, 2009 1 commit
    • [SCSI] scsi_lib: fix DID_RESET status problems · 79ed2429
      James Bottomley authored
      Andrew Vasquez said:
      > There's a problem that is causing commands returned by the LLD with
      > a DID_RESET status to be reissued with cleared cmd->sdb data which
      > in our tests are manifesting in firmware detected overruns.  Here's
      > a snippet of a READ_10 scsi_cmnd upon completion by the storage
      The problem is caused by:
      commit b60af5b0
      Author: Alan Stern <stern@rowland.harvard.edu>
      Date:   Mon Nov 3 15:56:47 2008 -0500
          [SCSI] simplify scsi_io_completion()
      Because scsi_release_buffers() is called before commands that go
      through the ACTION_RETRY and ACTION_DELAYED_RETRY legs are requeued.
      However, they're not re-prepared, so nothing ever reallocates the
      buffer resources to them.  Fix this by releasing the buffers only if
      we're not going to go down these legs (but still call
      scsi_release_buffers() on all other legs, including two in
      scsi_end_request(); this latter needs a special version,
      __scsi_release_buffers(), because the final one can be called after
      the request has been freed, so the bidi test in
      scsi_release_buffers(), which touches the request, has to be skipped).
      Reported-by: Andrew Vasquez <andrew.vasquez@qlogic.com>
      Signed-off-by: James Bottomley <James.Bottomley@HansenPartnership.com>
  7. 05 Jan, 2009 2 commits
    • [SCSI] Fix error handling for DIF/DIX · 3e695f89
      Martin K. Petersen authored
      commit b60af5b0
      Author: Alan Stern <stern@rowland.harvard.edu>
      Date:   Mon Nov 3 15:56:47 2008 -0500
          [SCSI] simplify scsi_io_completion()
      broke DIX error handling.  Also, we are now using EILSEQ to indicate
      integrity errors to the upper layers (as opposed to regular EIO
      failures).  This allows filesystems to inspect buffers and decide
      whether to retry the I/O.  Update scsi_io_completion() accordingly.
      Signed-off-by: Martin K. Petersen <martin.petersen@oracle.com>
      Signed-off-by: James Bottomley <James.Bottomley@HansenPartnership.com>
    • [SCSI] scsi_lib: don't decrement busy counters when inserting commands · 4f5299ac
      James Bottomley authored
      A bug was introduced by
      commit b60af5b0
      Author: Alan Stern <stern@rowland.harvard.edu>
      Date:   Mon Nov 3 15:56:47 2008 -0500
          [SCSI] simplify scsi_io_completion()
      because the simplification uses scsi_queue_insert().  The problem with
      this function is that it expects to be called from the completion path
      while the command is still outstanding, so it decrements the device
      and host busy counts to do the requeue.  The problem is that
      scsi_io_completion() is a path executed well after these counts have
      *already* been decremented, leading to a double decrement if the
      command goes down any error path leading to ACTION_DELAYED_RETRY.
      The fix is to allow a private function __scsi_queue_insert() with a
      flag to say whether the busy counters should be decremented.  This is
      made static to scsi_lib.c to discourage other use.
      Reported-by: Mike Christie <michaelc@cs.wisc.edu>
      Signed-off-by: James Bottomley <James.Bottomley@HansenPartnership.com>
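The double decrement and the flag-based fix can be sketched as follows. This is only a model of the idea: `mock_dev`, `mock_queue_insert_flagged()` and `mock_queue_insert()` are hypothetical names, and the counters are plain integers rather than the kernel's device and host state.

```c
/* Hypothetical device state: busy counters plus a count of requeues. */
struct mock_dev {
	int device_busy;
	int host_busy;
	int requeued;
};

/* Private helper (modelled on the flag described in the commit message):
 * the unbusy flag says whether the busy counters should be decremented
 * as part of the requeue. */
static void mock_queue_insert_flagged(struct mock_dev *dev, int unbusy)
{
	if (unbusy) {
		dev->device_busy--;
		dev->host_busy--;
	}
	dev->requeued++;
}

/* Public entry point: the command is still outstanding, so the counters
 * must be dropped here. */
static void mock_queue_insert(struct mock_dev *dev)
{
	mock_queue_insert_flagged(dev, 1);
}
```

In this model, a completion path that has already dropped both counters requeues with `mock_queue_insert_flagged(dev, 0)`, so the counts stay at their already-decremented values instead of going negative.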
  8. 02 Jan, 2009 1 commit
  9. 29 Dec, 2008 2 commits
    • [SCSI] add residual argument to scsi_execute and scsi_execute_req · f4f4e47e
      FUJITA Tomonori authored
      scsi_execute() and scsi_execute_req() discard the residual length
      information. Some callers need it. This adds an optional residual
      argument to scsi_execute() and scsi_execute_req().
      Signed-off-by: FUJITA Tomonori <fujita.tomonori@lab.ntt.co.jp>
      Signed-off-by: James Bottomley <James.Bottomley@HansenPartnership.com>
    • [SCSI] simplify scsi_io_completion() · b60af5b0
      Alan Stern authored
      This patch (as1142b) consolidates a lot of repetitious code in
      scsi_io_completion().  It also fixes a few comments.  Most
      importantly, however, it clearly distinguishes among the three sorts
      of retries that can be done when a command fails to complete:
        * Unprepare the request and resubmit it, so that a new
          command will be created for it.
        * Requeue the request directly so that it will be retried
          immediately using the same command.
        * Requeue the request so that it will be retried following
          a short delay.
        * Complete the remainder of the request with an I/O error.
      [jejb: Updates
           1. For several error conditions, we would now print the sense twice
              in slightly different ways, so unify the location of sense
           2. I added more descriptions to actual failure conditions for
              better debugging
           3. according to spec, ABORTED_COMMAND is supposed to be retried
              (except on DIF failure).  Our old behaviour of erroring it looks
              to be a bug.
           4. I'd prefer not to default initialise the action variable because
              that ensures that every leg of the error handler has an
              associated action and the compiler will warn if someone later
              accidentally misses one or removes one.
      Signed-off-by: Alan Stern <stern@rowland.harvard.edu>
      Signed-off-by: James Bottomley <James.Bottomley@HansenPartnership.com>
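The four dispositions listed above, together with the point about not default-initialising the action variable, can be sketched like this. The enum and condition names are hypothetical (prefixed `MOCK_`), and the mapping from conditions to actions is an assumption made for illustration, apart from the aborted-command handling, which follows update 3 in the commit message.

```c
/* Hypothetical names for the four dispositions in the commit message. */
enum mock_action {
	MOCK_ACTION_FAIL,          /* complete the remainder with an I/O error */
	MOCK_ACTION_REPREP,        /* unprepare and resubmit; a new command is built */
	MOCK_ACTION_RETRY,         /* requeue directly, reusing the same command */
	MOCK_ACTION_DELAYED_RETRY  /* requeue, retried after a short delay */
};

/* Hypothetical failure conditions, purely for illustration. */
enum mock_cond {
	MOCK_COND_ABORTED_COMMAND,
	MOCK_COND_ABORTED_COMMAND_DIF,
	MOCK_COND_DEVICE_BECOMING_READY,
	MOCK_COND_NEEDS_NEW_COMMAND,
	MOCK_COND_OTHER
};

static enum mock_action mock_pick_action(enum mock_cond cond)
{
	/* Left uninitialised on purpose: every leg has to assign it, and a
	 * compiler with uninitialised-variable warnings can flag a leg that
	 * forgets to. */
	enum mock_action action;

	switch (cond) {
	case MOCK_COND_ABORTED_COMMAND:
		action = MOCK_ACTION_RETRY;         /* retried, per update 3 */
		break;
	case MOCK_COND_ABORTED_COMMAND_DIF:
		action = MOCK_ACTION_FAIL;          /* the DIF exception */
		break;
	case MOCK_COND_DEVICE_BECOMING_READY:
		action = MOCK_ACTION_DELAYED_RETRY; /* assumed mapping */
		break;
	case MOCK_COND_NEEDS_NEW_COMMAND:
		action = MOCK_ACTION_REPREP;        /* assumed mapping */
		break;
	default:
		action = MOCK_ACTION_FAIL;
		break;
	}
	return action;
}
```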
  10. 13 Dec, 2008 1 commit
  11. 16 Nov, 2008 1 commit
  12. 23 Oct, 2008 3 commits
  13. 13 Oct, 2008 2 commits
    • [SCSI] modify scsi to handle new fail fast flags. · 4a27446f
      Mike Christie authored
      This checks the errors that scsi-ml determined were retryable
      and returns whether we should fast fail the request based on its
      fail fast flags.
      Without the patch, drivers like lpfc, qla2xxx and fcoe would return
      DID_ERROR for what they determine is a temporary communication problem.
      There is no loss of connectivity at that time and the driver thinks
      that it would be fast to retry at the driver level. SCSI-ml will however
      see fast fail on the request and DID_ERROR and will fast fail the io.
      This will then cause dm-multipath to fail the path and possibly switch
      target controllers when we should be retrying at the scsi layer.
      We were also fast failing device errors to dm-multipath when,
      unless the scsi_dh modules think otherwise, we want to retry at
      the scsi layer, because multipath can only retry the IO like scsi
      should have done. multipath is a little dumber though because it
      does not know what the error was for and assumes that it should fail
      the paths.
      Signed-off-by: Mike Christie <michaelc@cs.wisc.edu>
      Signed-off-by: James Bottomley <James.Bottomley@HansenPartnership.com>
    • [SCSI] Add helper code so transport classes/driver can control queueing (v3) · f0c0a376
      Mike Christie authored
      SCSI-ml manages the queueing limits for the device and host, but
      does not do so at the target level. However, something similar
      can come in useful when a driver is transitioning a transport object to
      the blocked state, because at that time we do not want to queue
      io and we do not want the queuecommand to be called again.
      The patch adds code similar to the existing SCSI_ML_*BUSY handlers.
      You can now return SCSI_MLQUEUE_TARGET_BUSY when we hit
      a transport level queueing issue like the hw cannot allocate some
      resource at the iscsi session/connection level, or the target has temporarily
      closed or shrunk the queueing window, or if we are transitioning
      to the blocked state.
      bnx2i, when they rework their firmware according to netdev
      developers' requests, will also need to be able to limit queueing at this
      level. bnx2i will hook into libiscsi, but will allocate a scsi host per
      netdevice/hba, so unlike pure software iscsi/iser which is allocating
      a host per session, it cannot set the scsi_host->can_queue and return
      SCSI_MLQUEUE_HOST_BUSY to reflect queueing limits on the transport.
      The iscsi class/driver can also set a scsi_target->can_queue value which
      reflects the max commands the driver/class can support. For iscsi this
      reflects the number of commands we can support for each session due to
      session/connection hw limits, driver limits, and to also reflect the
      session/target's queueing window.
      v1 - initial patch.
      v2 - Fix scsi_run_queue handling of multiple blocked targets.
      Previously we would break from the main loop if a device was added back on
      the starved list. We now run over the list and check if any target is
      v3 - Rediff for scsi-misc.
      Signed-off-by: Mike Christie <michaelc@cs.wisc.edu>
      Signed-off-by: James Bottomley <James.Bottomley@HansenPartnership.com>
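A target-level queueing limit of the kind described here can be sketched as below. This is a simplified model, not the scsi-ml code: `mock_target` and its helpers are hypothetical, and treating a `can_queue` of zero as "no limit" is an assumption of the sketch.

```c
#include <stdbool.h>

/* Hypothetical per-target state: a limit and a count of in-flight commands. */
struct mock_target {
	int can_queue;    /* max commands in flight; 0 means no limit (assumed) */
	int target_busy;  /* commands currently in flight */
};

/* Is the target able to accept another command right now? */
static bool mock_target_queue_ready(const struct mock_target *t)
{
	if (t->can_queue > 0 && t->target_busy >= t->can_queue)
		return false;
	return true;
}

/* Try to dispatch one command; false models a "target busy" result that
 * the caller would requeue instead of failing. */
static bool mock_dispatch(struct mock_target *t)
{
	if (!mock_target_queue_ready(t))
		return false;
	t->target_busy++;
	return true;
}

/* A command finished: free up one slot on the target. */
static void mock_complete(struct mock_target *t)
{
	if (t->target_busy > 0)
		t->target_busy--;
}
```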
  14. 09 Oct, 2008 1 commit
  15. 03 Oct, 2008 1 commit
  16. 23 Sep, 2008 1 commit
    • [SCSI] Fix hang with split requests · 44ea91c5
      James Bottomley authored
      Sometimes, particularly for USB devices with the last sector bug,
      requests get completed in chunks.  There's a bug in this in that if
      one of the chunks gets an error, we complete that chunk with an error
      but never move on to the remaining ones, leading to the request
      hanging (because it's not fully completed).
      Fix this by completing all remaining chunks if an error is encountered.
      Cc: Alan Stern <stern@rowland.harvard.edu>
      Signed-off-by: James Bottomley <James.Bottomley@HansenPartnership.com>
  17. 27 Jul, 2008 1 commit
  18. 26 Jul, 2008 4 commits
  19. 12 Jul, 2008 2 commits
  20. 06 Jul, 2008 1 commit
  21. 05 Jun, 2008 1 commit
  22. 02 May, 2008 2 commits
    • [SCSI] add support for variable length extended commands · db4742dd
      Boaz Harrosh authored
      Add support for variable-length, extended, and vendor specific
      CDBs to scsi-ml. It is now possible for initiators and ULD's
      to issue these types of commands. LLDs need not change much.
      All they need is to raise the .max_cmd_len to the longest command
      they support (see iscsi patch).
      - clean-up some code paths that did not expect commands to be
        larger than 16, and change cmd_len members' type to short as
        char is not enough.
      Signed-off-by: Boaz Harrosh <bharrosh@panasas.com>
      Signed-off-by: Benny Halevy <bhalevy@panasas.com>
      Signed-off-by: James Bottomley <James.Bottomley@HansenPartnership.com>
    • [SCSI] Let scsi_cmnd->cmnd use request->cmd buffer · 64a87b24
      Boaz Harrosh authored
       - struct scsi_cmnd had a 16-byte command buffer of its own.
         This is an unnecessary duplication and copy of request's
         cmd. It is probably a leftover from the time that scsi_cmnd
         could function without a request attached. So clean that up.
       - Once the above is done, a few places, apart from scsi-ml, needed
         adjustments due to changing the data type of scsi_cmnd->cmnd.
       - Lots of drivers still use MAX_COMMAND_SIZE. So I have left
         that #define but equate it to BLK_MAX_CDB. The way I see it,
         as reflected in the patch below:
         MAX_COMMAND_SIZE - means: The longest fixed-length (*) SCSI CDB
                            as per the SCSI standard and is not related
                            to the implementation.
         BLK_MAX_CDB.     - The allocated space at the request level
       - I have audited all ISA drivers and made sure none use ->cmnd in a DMA
         Operation. Same audit was done by Andi Kleen.
      (*)fixed-length here means commands whose size can be determined
         by their opcode and the CDB does not carry a length specifier, (unlike
         the VARIABLE_LENGTH_CMD(0x7f) command). This is actually not exactly
         true and the SCSI standard also defines extended commands and
         vendor specific commands that can be bigger than 16 bytes. The kernel
         will support these using the same infrastructure used for VARLEN CDB's.
         So in effect MAX_COMMAND_SIZE means the largest command that
         scsi-ml supports without ULDs specifying a cmd_len.
      Signed-off-by: Boaz Harrosh <bharrosh@panasas.com>
      Signed-off-by: James Bottomley <James.Bottomley@HansenPartnership.com>
  23. 29 Apr, 2008 1 commit
  24. 07 Apr, 2008 2 commits
  25. 19 Mar, 2008 1 commit
    • [SCSI] fix media change events for polled devices · 4d1566ed
      Kay Sievers authored
        a341cd0f (SCSI: add asynchronous event notification API)
        285e9670 (sr,sd: send media state change modification events)
      by introducing an event filter, which is removed here, to make
      events, we are depending on, happen again.
      Fix this by removing the event filter.  It's pretty much broken at the
      moment, since a user can't set it (the attribute being read only).  A
      proper fix will be to make the event discriminator distinguish between
      AN and Polled media change events.
      Cc: David Zeuthen <david@fubar.dk>
      Cc: kristen accardi <kaccardi@gmail.com>
      Cc: Jeff Garzik <jeff@garzik.org>
      Signed-off-by: Kay Sievers <kay.sievers@vrfy.org>
      Signed-off-by: James Bottomley <James.Bottomley@HansenPartnership.com>
  26. 19 Feb, 2008 1 commit
    • block: add request->raw_data_len · 6b00769f
      Tejun Heo authored
      With padding and draining moved into it, block layer now may extend
      requests as directed by queue parameters, so now a request has two
      sizes - the original request size and the extended size which matches
      the size of area pointed to by bios and later by sgs.  The latter size
      is what lower layers are primarily interested in when allocating,
      filling up DMA tables and setting up the controller.
      Both padding and draining extend the data area to accommodate
      controller characteristics.  As any controller which speaks SCSI can
      handle underflows, feeding larger data area is safe.
      So, this patch makes the primary data length field, request->data_len,
      indicate the size of the full data area and adds a separate length field,
      request->raw_data_len, for the unmodified request size.  The latter is
      used to report to higher layer (userland) and where the original
      request size should be fed to the controller or device.
      Signed-off-by: Tejun Heo <htejun@gmail.com>
      Cc: James Bottomley <James.Bottomley@HansenPartnership.com>
      Signed-off-by: Jens Axboe <jens.axboe@oracle.com>
  27. 07 Feb, 2008 1 commit
    • [SCSI] fix BUG when sum(scatterlist) > bufflen · 4d2de3a5
      Tony Battersby authored
      When sending a SCSI command to a tape drive via the SCSI Generic (sg)
      driver, if the command has a data transfer length more than
      scatter_elem_sz (32 KB default) and not a multiple of 512, then I either
      hit BUG_ON(!valid_dma_direction(direction)) in dma_unmap_sg() or else
      the command never completes (depending on the LLDD).
      When constructing scatterlists, the sg driver rounds up the scatterlist
      element sizes to be a multiple of 512.  This can result in
      sum(scatterlist lengths) > bufflen.  In this case, scsi_req_map_sg()
      incorrectly sets bio->bi_size to sum(scatterlist lengths) rather than to
      bufflen.  When the command completes, req_bio_endio() detects that
      bio->bi_size != 0, and so it doesn't call bio_endio().  This causes the
      command to be resubmitted, resulting in BUG_ON or the command never
      completing.
      This patch makes scsi_req_map_sg() set bio->bi_size to bufflen rather
      than to sum(scatterlist lengths), which fixes the problem.
      Signed-off-by: Tony Battersby <tonyb@cybernetics.com>
      Acked-by: Mike Christie <michaelc@cs.wisc.edu>
      Signed-off-by: James Bottomley <James.Bottomley@HansenPartnership.com>
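The arithmetic behind this bug can be worked through with a small sketch. The helpers below are hypothetical and only approximate how elements are sized (each element is at most `elem_sz` bytes and is rounded up to a multiple of 512); they are not the sg driver's code.

```c
#include <stdbool.h>

/* Round a length up to the next multiple of 512. */
static unsigned int mock_round_up_512(unsigned int len)
{
	return (len + 511u) & ~511u;
}

/* Sum of element lengths when bufflen is split into elements of at most
 * elem_sz bytes, each rounded up to a multiple of 512 (assumed model). */
static unsigned int mock_sg_total(unsigned int bufflen, unsigned int elem_sz)
{
	unsigned int total = 0;

	if (elem_sz == 0)
		return mock_round_up_512(bufflen);

	while (bufflen > 0) {
		unsigned int chunk = bufflen < elem_sz ? bufflen : elem_sz;

		total += mock_round_up_512(chunk);
		bufflen -= chunk;
	}
	return total;
}

/* Models the completion check: the bio counts as finished only when the
 * completed byte count consumes its whole recorded size. */
static bool mock_bio_finished(unsigned int bi_size, unsigned int bytes_completed)
{
	return bi_size == bytes_completed;
}
```

For a 32868-byte transfer with 32 KB elements, the element lengths sum to 32768 + 512 = 33280. Recording 33280 as the bio size leaves 412 bytes outstanding after the device completes 32868 bytes, whereas recording `bufflen` lets the completion check succeed.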
  28. 05 Feb, 2008 1 commit
    • iommu sg merging: call dma_set_seg_boundary in __scsi_alloc_queue() · 99c84dbd
      FUJITA Tomonori authored
      This is a one-line patch to add the following to __scsi_alloc_queue():
      dma_set_seg_boundary(dev, shost->dma_boundary);
      This is the simplest approach but the result looks odd,
      __scsi_alloc_queue() does:
      blk_queue_segment_boundary(q, shost->dma_boundary);
      dma_set_seg_boundary(dev, shost->dma_boundary);
      blk_queue_max_segment_size(q, dma_get_max_seg_size(dev));
      I think that it would be better to set up segment boundary in the same
      way as we did for the maximum segment size. That is, removing
      shost->dma_boundary and LLDs call pci_set_dma_seg_boundary (or its
      Then __scsi_alloc_queue() can set up both limits in the same way:
      blk_queue_segment_boundary(q, dma_get_seg_boundary(dev));
      blk_queue_max_segment_size(q, dma_get_max_seg_size(dev));
      killing dma_boundary in scsi_host_template needs a large patch for
      libata (dma_boundary is used by only libata and sym53c8xx). I'll send
      a patch to do that if it is acceptable. James and Jeff?
      Signed-off-by: FUJITA Tomonori <fujita.tomonori@lab.ntt.co.jp>
      Cc: James Bottomley <James.Bottomley@steeleye.com>
      Cc: Jens Axboe <jens.axboe@oracle.com>
      Cc: Greg KH <greg@kroah.com>
      Cc: Jeff Garzik <jeff@garzik.org>
      Signed-off-by: Andrew Morton <akpm@linux-foundation.org>
      Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>