1. 13 Oct, 2008 7 commits
    • [SCSI] iscsi class: fix endpoint id handling · 21536062
      Mike Christie authored
      Some endpoint code was using unsigned int and some
      was using uint64_t. This converts it all to uint64_t.
      Signed-off-by: Mike Christie <michaelc@cs.wisc.edu>
      Signed-off-by: James Bottomley <James.Bottomley@HansenPartnership.com>
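To illustrate why mixing the two widths is a problem, here is a minimal userspace sketch. It is not the kernel code: the function names are hypothetical, and it assumes `unsigned int` is narrower than 64 bits, as on common Linux targets.

```c
#include <assert.h>
#include <stdint.h>

/* Hypothetical "before" path: the endpoint id passes through unsigned int. */
unsigned int narrow_endpoint_id(uint64_t handle)
{
    /* The sketch assumes unsigned int is narrower than uint64_t. */
    assert(sizeof(unsigned int) < sizeof(uint64_t));
    /* The conversion keeps only the low-order bits of the handle. */
    return (unsigned int)handle;
}

/* Hypothetical "after" path: every layer carries the full 64-bit value. */
uint64_t wide_endpoint_id(uint64_t handle)
{
    return handle;
}

/* Returns 1 when two distinct handles look identical after narrowing. */
int ids_collide_narrow(uint64_t a, uint64_t b)
{
    return narrow_endpoint_id(a) == narrow_endpoint_id(b);
}

/* Returns 1 only when the two 64-bit handles are really equal. */
int ids_collide_wide(uint64_t a, uint64_t b)
{
    return wide_endpoint_id(a) == wide_endpoint_id(b);
}
```

Two handles that differ only in their upper bits collide in the narrow path but stay distinct once everything uses uint64_t.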
    • [SCSI] libiscsi: Support drivers initiating session removal · e5bd7b54
      Mike Christie authored
      If the driver knows when hardware is removed, as with cxgb3i,
      bnx2i, qla4xxx and iser, then we will want to remove the sessions/devices
      that are bound to that device before removing the host.
      
      cxgb3i and in the future bnx2i will remove the host and that will
      remove all the sessions on the hba. iser can call iscsi_kill_session
      when it gets an event that indicates that a hca is removed.
      And when qla4xxx is hooked in to the lib (it is only hooked into
      the class right now) it can call iscsi remove host like the
      partial offload card drivers.
      Signed-off-by: Mike Christie <michaelc@cs.wisc.edu>
      Signed-off-by: James Bottomley <James.Bottomley@HansenPartnership.com>
    • [SCSI] block: separate failfast into multiple bits. · 6000a368
      Mike Christie authored
      Multipath is best at handling transport errors. If it gets a device
      error then there is not much the multipath layer can do. It will just
      access the same device but from a different path.
      
      This patch breaks up failfast into device, transport and driver errors.
      The multipath layers (md and dm multipath) only ask the lower levels to
      fast fail transport errors. The other user of failfast, read ahead, will
      ask to fast fail on all errors.
      
      Note that blk_noretry_request will return true if any failfast bit
      is set. This allows drivers that do not support the multipath failfast
      bits to continue to fail on any failfast error like before. Drivers
      like scsi that are able to fail fast specific errors can check
      for the specific fail fast type. In the next patch I will convert
      scsi.
      Signed-off-by: Mike Christie <michaelc@cs.wisc.edu>
      Cc: Jens Axboe <jens.axboe@oracle.com>
      Signed-off-by: James Bottomley <James.Bottomley@HansenPartnership.com>
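The split described above can be sketched in plain C. This is a userspace model, not the block layer code: the flag names, the `error_class` enum and both helper functions are hypothetical stand-ins for the three failfast classes and for the "any bit set" check the message attributes to blk_noretry_request.

```c
#include <assert.h>
#include <stdbool.h>

/* Hypothetical flag names modelled on the three failfast classes. */
enum failfast_bits {
    FAILFAST_DEV       = 1 << 0,  /* device errors */
    FAILFAST_TRANSPORT = 1 << 1,  /* transport errors */
    FAILFAST_DRIVER    = 1 << 2   /* driver errors */
};
#define FAILFAST_MASK (FAILFAST_DEV | FAILFAST_TRANSPORT | FAILFAST_DRIVER)

enum error_class { ERR_DEVICE, ERR_TRANSPORT, ERR_DRIVER };

/* What md/dm multipath would request: only transport errors fail fast. */
unsigned int multipath_flags(void) { return FAILFAST_TRANSPORT; }

/* What read ahead would request: fail fast on every error class. */
unsigned int readahead_flags(void) { return FAILFAST_MASK; }

/* Coarse check for drivers that cannot tell the classes apart:
 * true if any failfast bit is set, as before the split. */
bool noretry_any(unsigned int flags)
{
    return (flags & FAILFAST_MASK) != 0;
}

/* Fine-grained check for drivers that know which class of error hit. */
bool noretry_for(unsigned int flags, enum error_class err)
{
    switch (err) {
    case ERR_DEVICE:    return (flags & FAILFAST_DEV) != 0;
    case ERR_TRANSPORT: return (flags & FAILFAST_TRANSPORT) != 0;
    case ERR_DRIVER:    return (flags & FAILFAST_DRIVER) != 0;
    }
    assert(!"unknown error class");
    return false;
}
```

With these definitions a multipath request fails fast on a transport error but is still retried on a device error, while a read ahead request fails fast on both.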
    • [SCSI] fc class: Add support for new transport errors · f46e307d
      Mike Christie authored
      If the target is blocked and fast io fail tmo has not fired
      then we requeue with DID_TRANSPORT_DISRUPTED. Once that
      tmo fires we fail with DID_TRANSPORT_FAILFAST.
      
      v2
      - separate from
      "fc class: unblock target after calling terminate callback"
      to make it easier to review.
      - Add JamesS's ack from list.
      v1
      - initial patch
      Signed-off-by: Mike Christie <michaelc@cs.wisc.edu>
      Acked-by: James Smart <James.Smart@emulex.com>
      Signed-off-by: James Bottomley <James.Bottomley@HansenPartnership.com>
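A rough sketch of the state-to-result mapping this message describes, written as a standalone model rather than the fc class code. The struct, its fields and `model_chkready` are hypothetical, and the numeric host byte values are quoted from memory of the SCSI headers, so treat them as assumptions.

```c
#include <assert.h>
#include <stdbool.h>
#include <stddef.h>

/* Host byte values as recalled from the SCSI headers (assumptions). */
#define DID_OK                  0x00
#define DID_TRANSPORT_DISRUPTED 0x0e
#define DID_TRANSPORT_FAILFAST  0x0f

/* Hypothetical, simplified view of a remote port. */
struct rport_model {
    bool blocked;             /* target is currently blocked */
    bool fast_io_fail_fired;  /* fast io fail tmo has expired */
};

/* The host byte occupies bits 16..23 of the SCSI result word. */
int make_result(int host_byte)
{
    return host_byte << 16;
}

/* Pick the result a command would be returned with in each state. */
int model_chkready(const struct rport_model *rp)
{
    assert(rp != NULL);
    if (rp->fast_io_fail_fired)
        return make_result(DID_TRANSPORT_FAILFAST);  /* fail the IO */
    if (rp->blocked)
        return make_result(DID_TRANSPORT_DISRUPTED); /* requeue the IO */
    return make_result(DID_OK);
}
```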
    • [SCSI] scsi: add transport host byte errors (v3) · a4dfaa6f
      Mike Christie authored
      Currently, if there is a transport problem the iscsi drivers will return
      outstanding commands (commands being executed by the driver/fw/hw) with
      DID_BUS_BUSY and block the session so no new commands can be queued.
      Commands that are caught between the failure handling and blocking are
      failed with DID_IMM_RETRY or one of the scsi ml queuecommand return values.
      When the recovery_timeout fires, the iscsi drivers then fail IO with
      DID_NO_CONNECT.
      
      For fcp, some drivers will fail some outstanding IO (disk but possibly not
      tape) with DID_BUS_BUSY or DID_ERROR or some other value that causes a retry
      and hits the scsi_error.c failfast check, block the rport, and commands
      caught in the race are failed with DID_IMM_RETRY. Other drivers may
      hold onto all IO and wait for the terminate_rport_io or dev_loss_tmo_callbk
      to be called.
      
      The following patches attempt to unify what upper layers will see, so
      that drivers like multipath can make a good guess. This relies on
      drivers being hooked into their transport class.
      
      This first patch just defines two new host byte errors so drivers can
      return the same value for when a rport/session is blocked and for
      when the fast_io_fail_tmo fires.
      
      The idea is that if the LLD/class detects a problem and is going to block
      a rport/session, then if the LLD wants or must return the command to scsi-ml,
      then it can return it with DID_TRANSPORT_DISRUPTED. This will requeue
      the IO into the same scsi queue it came from, until the fast io fail timer
      fires and the class decides what to do.
      
      When using multipath and the fast_io_fail_tmo fires then the class
      can fail commands with DID_TRANSPORT_FAILFAST or drivers can use
      DID_TRANSPORT_FAILFAST in their terminate_rport_io callbacks or
      the equivalent in iscsi if we ever implement more advanced recovery methods.
      An LLD, like lpfc, could continue to return DID_ERROR and then it will hit
      the normal failfast path, so drivers do not have to be fully ported to
      work better. The point of the patches is that upper layers will
      not see a failure that could be recovered from while the rport/session is
      blocked until fast_io_fail_tmo/recovery_timeout fires.
      
      V3
      Remove some comments.
      V2
      Fixed patch/diff errors and renamed DID_TRANSPORT_BLOCKED to
      DID_TRANSPORT_DISRUPTED.
      V1
      initial patch.
      Signed-off-by: Mike Christie <michaelc@cs.wisc.edu>
      Signed-off-by: James Bottomley <James.Bottomley@HansenPartnership.com>
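As a hedged illustration of how a mid-layer might treat the two new host bytes, the sketch below follows the behaviour described in this message only. It is not scsi_error.c: the `disposition` enum and `decide` function are hypothetical, and the numeric values are recalled from the SCSI headers and should be read as assumptions.

```c
#include <assert.h>
#include <stdbool.h>

/* Host byte values as recalled from the SCSI headers (assumptions). */
#define DID_OK                  0x00
#define DID_ERROR               0x07
#define DID_TRANSPORT_DISRUPTED 0x0e
#define DID_TRANSPORT_FAILFAST  0x0f

/* Hypothetical outcomes for a returned command. */
enum disposition { DISP_COMPLETE, DISP_REQUEUE, DISP_RETRY, DISP_FAIL };

/* Extract the host byte from bits 16..23 of the result word. */
int host_byte_of(int result)
{
    return (result >> 16) & 0xff;
}

/* failfast: whether the request asked to fail fast on this error class. */
enum disposition decide(int result, bool failfast)
{
    assert(result >= 0); /* results are non-negative in this model */
    switch (host_byte_of(result)) {
    case DID_OK:
        return DISP_COMPLETE;
    case DID_TRANSPORT_DISRUPTED:
        /* rport/session is blocked: put the IO back on its queue so
         * upper layers do not see a recoverable failure. */
        return DISP_REQUEUE;
    case DID_TRANSPORT_FAILFAST:
        /* fast_io_fail_tmo fired: fail so multipath can switch paths. */
        return DISP_FAIL;
    case DID_ERROR:
        /* unported drivers keep using the normal failfast path. */
        return failfast ? DISP_FAIL : DISP_RETRY;
    default:
        return DISP_FAIL;
    }
}
```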
    • [SCSI] fc class: unblock target after calling terminate callback (take 2) · fff9d40c
      Mike Christie authored
      When we block a rport and the driver implements the terminate
      callback, we will quickly fail IO that was running. However,
      IO that was in the scsi_device/block queue sits there until
      the dev_loss_tmo fires. This can make it look like IO is
      lost, because new IO will get executed while the IO stuck in
      the blocked queue sits there for some time longer.
      
      With this patch when the fast io fail tmo fires, we will
      fail the blocked IO and any new IO. This patch also allows
      all drivers to partially support the fast io fail tmo. If the
      terminate io callback is not implemented, we will still fail blocked
      IO and any new IO, so multipath can handle that.
      
      This patch also allows the fc and iscsi classes to implement the
      same behavior. The timers are just unfortunately named differently.
      
      This patch also fixes the problem where drivers were unblocking
      the target in their terminate callback, which was needed for
      rport removal, but for fast io fail timeout it would cause
      IO to bounce around the scsi/block layer and the LLD queuecommand.
      And for drivers that could have IO stuck but did not have
      a terminate callback, the unblock calls in the class will fix
      them.
      
      v2.
      - fix up bit setting style to meet JamesS's pref.
      - Broke out new host byte error changes to make it easier to read.
      - added JamesS's ack from list.
      v1
      - initial patch
      Signed-off-by: Mike Christie <michaelc@cs.wisc.edu>
      Acked-by: James Smart <James.Smart@emulex.com>
      Signed-off-by: James Bottomley <James.Bottomley@HansenPartnership.com>
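The behaviour after the fast io fail tmo fires can be modelled with a few counters. This is only a sketch of the flow described in this message: `port_model`, its fields and all three functions are hypothetical, and real IO is replaced by simple counts.

```c
#include <assert.h>
#include <stdbool.h>
#include <stddef.h>

/* Hypothetical model of a blocked port; IO is tracked as plain counts. */
struct port_model {
    int running;     /* IO owned by the driver/hw */
    int queued;      /* IO parked in the blocked scsi_device/block queue */
    int failed;      /* IO returned to upper layers as failed */
    bool fast_fail;  /* set once the fast io fail tmo has fired */
    void (*terminate_io)(struct port_model *p); /* optional driver callback */
};

/* Example driver callback: fail everything the driver is holding. */
void driver_terminate(struct port_model *p)
{
    p->failed += p->running;
    p->running = 0;
}

/* Timer handler: call the callback if there is one, then let the class
 * flush the blocked queue so that IO is failed as well. */
void fast_io_fail_fired(struct port_model *p)
{
    assert(p != NULL);
    p->fast_fail = true;
    if (p->terminate_io != NULL)
        p->terminate_io(p);
    p->failed += p->queued;
    p->queued = 0;
}

/* New IO: parked while blocked, failed at once after the timer fired.
 * Returns true if the IO was accepted for later execution. */
bool submit_io(struct port_model *p)
{
    assert(p != NULL);
    if (p->fast_fail) {
        p->failed++;
        return false;
    }
    p->queued++;
    return true;
}
```

A driver without a terminate callback still gets its blocked and new IO failed in this model, which is the partial support the message refers to. IO the driver itself is holding stays where it is.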
    • [SCSI] Add helper code so transport classes/driver can control queueing (v3) · f0c0a376
      Mike Christie authored
      SCSI-ml manages the queueing limits for the device and host, but
      does not do so at the target level. However, something similar
      can come in useful when a driver is transitioning a transport object to
      the blocked state, because at that time we do not want to queue
      io and we do not want the queuecommand to be called again.
      
      The patch adds code similar to the existing SCSI_ML_*BUSY handlers.
      You can now return SCSI_MLQUEUE_TARGET_BUSY when we hit
      a transport level queueing issue like the hw cannot allocate some
      resource at the iscsi session/connection level, or the target has temporarily
      closed or shrunk the queueing window, or if we are transitioning
      to the blocked state.
      
      bnx2i, when they rework their firmware according to netdev
      developers' requests, will also need to be able to limit queueing at this
      level. bnx2i will hook into libiscsi, but will allocate a scsi host per
      netdevice/hba, so unlike pure software iscsi/iser which is allocating
      a host per session, it cannot set the scsi_host->can_queue and return
      SCSI_MLQUEUE_HOST_BUSY to reflect queueing limits on the transport.
      
      The iscsi class/driver can also set a scsi_target->can_queue value which
      reflects the max commands the driver/class can support. For iscsi this
      reflects the number of commands we can support for each session due to
      session/connection hw limits, driver limits, and to also reflect the
      session/target's queueing window.
      
      Changes:
      v1 - initial patch.
      v2 - Fix scsi_run_queue handling of multiple blocked targets.
      Previously we would break from the main loop if a device was added back on
      the starved list. We now run over the list and check if any target is
      blocked.
      v3 - Rediff for scsi-misc.
      Signed-off-by: Mike Christie <michaelc@cs.wisc.edu>
      Signed-off-by: James Bottomley <James.Bottomley@HansenPartnership.com>
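A minimal sketch of target-level queue limiting, assuming a per-target busy counter checked against a can_queue value. It is a userspace model, not the SCSI-ml code: `target_model` and both functions are hypothetical, the limit check and the driver decision are merged into one function for brevity, and the numeric value of SCSI_MLQUEUE_TARGET_BUSY is recalled from the SCSI headers and should be treated as an assumption.

```c
#include <assert.h>
#include <stdbool.h>
#include <stddef.h>

/* Return code value as recalled from the SCSI headers (assumption). */
#define SCSI_MLQUEUE_TARGET_BUSY 0x1058

/* Hypothetical per-target accounting. */
struct target_model {
    int can_queue;    /* max outstanding commands; 0 means no limit here */
    int target_busy;  /* commands currently outstanding */
    bool blocking;    /* transport object is transitioning to blocked */
};

/* Returns 0 if the command was accepted, or SCSI_MLQUEUE_TARGET_BUSY
 * if it should be held back at the target level. */
int target_queuecommand(struct target_model *t)
{
    assert(t != NULL);
    if (t->blocking)
        return SCSI_MLQUEUE_TARGET_BUSY;
    if (t->can_queue > 0 && t->target_busy >= t->can_queue)
        return SCSI_MLQUEUE_TARGET_BUSY;
    t->target_busy++;
    return 0;
}

/* Command completion frees one slot in the target's window. */
void target_complete(struct target_model *t)
{
    assert(t != NULL && t->target_busy > 0);
    t->target_busy--;
}
```

Because the limit lives on the target rather than the host, a driver that shares one host across many sessions can throttle a single session without stalling the others.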
  2. 12 Oct, 2008 4 commits
  3. 11 Oct, 2008 24 commits
  4. 10 Oct, 2008 5 commits