1. 13 Oct, 2008 6 commits
    • [SCSI] iscsi class: fix endpoint id handling · 21536062
      Mike Christie authored
      Some endpoint code was using unsigned int and some
      was using uint64_t. This converts it all to uint64_t.
      Signed-off-by: Mike Christie <michaelc@cs.wisc.edu>
      Signed-off-by: James Bottomley <James.Bottomley@HansenPartnership.com>
    • [SCSI] libiscsi: Support drivers initiating session removal · e5bd7b54
      Mike Christie authored
      If the driver knows when hardware is removed like with cxgb3i,
      bnx2i, qla4xxx and iser then we will want to remove the sessions/devices
      that are bound to that device before removing the host.
      
      cxgb3i and in the future bnx2i will remove the host and that will
      remove all the sessions on the hba. iser can call iscsi_kill_session
      when it gets an event that indicates that a hca is removed.
      And when qla4xxx is hooked in to the lib (it is only hooked into
      the class right now) it can call iscsi remove host like the
      partial offload card drivers.
      Signed-off-by: Mike Christie <michaelc@cs.wisc.edu>
      Signed-off-by: James Bottomley <James.Bottomley@HansenPartnership.com>
    • [SCSI] fc class: Add support for new transport errors · f46e307d
      Mike Christie authored
      If the target is blocked and fast io fail tmo has not fired
      then we requeue with DID_TRANSPORT_DISRUPTED. Once that
      tmo fires we fail with DID_TRANSPORT_FAILFAST.
      
      v2
      - separate from
      "fc class: unblock target after calling terminate callback"
      to make it easier to review.
      - Add JamesS's ack from list.
      v1
      - initial patch
      Signed-off-by: Mike Christie <michaelc@cs.wisc.edu>
      Acked-by: James Smart <James.Smart@emulex.com>
      Signed-off-by: James Bottomley <James.Bottomley@HansenPartnership.com>
    • [SCSI] scsi: add transport host byte errors (v3) · a4dfaa6f
      Mike Christie authored
      Currently, if there is a transport problem the iscsi drivers will return
      outstanding commands (commands being executed by the driver/fw/hw) with
      DID_BUS_BUSY and block the session so no new commands can be queued.
      Commands that are caught between the failure handling and blocking are
      failed with DID_IMM_RETRY or one of the scsi ml queuecommand return values.
      When the recovery_timeout fires, the iscsi drivers then fail IO with
      DID_NO_CONNECT.
      
      For fcp, some drivers will fail some outstanding IO (disk but possibly not
      tape) with DID_BUS_BUSY or DID_ERROR or some other value that causes a retry
      and hits the scsi_error.c failfast check, block the rport, and commands
      caught in the race are failed with DID_IMM_RETRY. Other drivers may
      hold onto all IO and wait for the terminate_rport_io or dev_loss_tmo_callbk
      to be called.
      
      The following patches attempt to unify what upper layers will see so that
      drivers like multipath can make a good guess. This relies on drivers being
      hooked into their transport class.
      
      This first patch just defines two new host byte errors so drivers can
      return the same value for when a rport/session is blocked and for
      when the fast_io_fail_tmo fires.
      
      The idea is that if the LLD/class detects a problem and is going to block
      a rport/session, then if the LLD wants or must return the command to scsi-ml,
      then it can return it with DID_TRANSPORT_DISRUPTED. This will requeue
      the IO into the same scsi queue it came from, until the fast io fail timer
      fires and the class decides what to do.
      
      When using multipath and the fast_io_fail_tmo fires then the class
      can fail commands with DID_TRANSPORT_FAILFAST or drivers can use
      DID_TRANSPORT_FAILFAST in their terminate_rport_io callbacks or
      the equivalent in iscsi if we ever implement more advanced recovery methods.
      An LLD, like lpfc, could continue to return DID_ERROR and then it will hit
      the normal failfast path, so drivers do not have to be fully ported to
      work better. The point of the patches is that upper layers will
      not see a failure that could be recovered from while the rport/session is
      blocked until fast_io_fail_tmo/recovery_timeout fires.
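
      The choice between the two new host bytes can be sketched in user-space C. This is a minimal model, not the kernel code: the enum values are placeholders rather than the kernel's numeric codes, and `struct transport_obj` and `transport_host_byte()` are hypothetical names introduced only for illustration.

```c
#include <stdbool.h>

/* Illustrative stand-ins; the real host byte codes live in the kernel's SCSI headers. */
enum host_byte {
    HB_OK,
    HB_TRANSPORT_DISRUPTED, /* stands in for DID_TRANSPORT_DISRUPTED */
    HB_TRANSPORT_FAILFAST,  /* stands in for DID_TRANSPORT_FAILFAST */
};

/* Hypothetical view of an rport/session as a transport class might track it. */
struct transport_obj {
    bool blocked;            /* rport/session is currently blocked */
    bool fast_io_fail_fired; /* fast_io_fail_tmo or recovery_timeout has expired */
};

/* Pick the host byte for a command the LLD has to hand back to scsi-ml. */
static enum host_byte transport_host_byte(const struct transport_obj *t)
{
    if (t->fast_io_fail_fired)
        return HB_TRANSPORT_FAILFAST;  /* fail upward so multipath can switch paths */
    if (t->blocked)
        return HB_TRANSPORT_DISRUPTED; /* requeue and wait for the timer */
    return HB_OK;
}
```

      In this model a blocked object keeps returning the "disrupted" code until the timer flag is set, after which every command gets the "failfast" code.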
      
      V3
      Remove some comments.
      V2
      Fixed patch/diff errors and renamed DID_TRANSPORT_BLOCKED to
      DID_TRANSPORT_DISRUPTED.
      V1
      initial patch.
      Signed-off-by: Mike Christie <michaelc@cs.wisc.edu>
      Signed-off-by: James Bottomley <James.Bottomley@HansenPartnership.com>
    • [SCSI] fc class: unblock target after calling terminate callback (take 2) · fff9d40c
      Mike Christie authored
      When we block a rport and the driver implements the terminate
      callback we will quickly fail IO that was running. However
      IO that was in the scsi_device/block queue sits there until
      the dev_loss_tmo fires, and this can make it look like IO is
      lost because new IO will get executed but that IO stuck in
      the blocked queue sits there for some time longer.
      
      With this patch when the fast io fail tmo fires, we will
      fail the blocked IO and any new IO. This patch also allows
      all drivers to partially support the fast io fail tmo. If the
      terminate io callback is not implemented, we will still fail blocked
      IO and any new IO, so multipath can handle that.
      
      This patch also allows the fc and iscsi classes to implement the
      same behavior. The timers are just unfortunately named differently.
      
      This patch also fixes the problem where drivers were unblocking
      the target in their terminate callback, which was needed for
      rport removal, but for fast io fail timeout it would cause
      IO to bounce around the scsi/block layer and the LLD queuecommand.
      And for drivers that could have IO stuck but did not have
      a terminate callback the unblock calls in the class will fix
      them.
      
      v2.
      - fix up bit setting style to meet JamesS's pref.
      - Broke out new host byte error changes to make it easier to read.
      - added JamesS's ack from list.
      v1
      - initial patch
      Signed-off-by: Mike Christie <michaelc@cs.wisc.edu>
      Acked-by: James Smart <James.Smart@emulex.com>
      Signed-off-by: James Bottomley <James.Bottomley@HansenPartnership.com>
    • [SCSI] Add helper code so transport classes/driver can control queueing (v3) · f0c0a376
      Mike Christie authored
      SCSI-ml manages the queueing limits for the device and host, but
      does not do so at the target level. However, something similar
      can come in useful when a driver is transitioning a transport object to
      the blocked state, because at that time we do not want to queue
      io and we do not want the queuecommand to be called again.
      
      The patch adds code similar to the existing SCSI_MLQUEUE_*_BUSY handlers.
      You can now return SCSI_MLQUEUE_TARGET_BUSY when we hit
      a transport level queueing issue like the hw cannot allocate some
      resource at the iscsi session/connection level, or the target has temporarily
      closed or shrunk the queueing window, or if we are transitioning
      to the blocked state.
      
      bnx2i, when they rework their firmware according to netdev
      developers' requests, will also need to be able to limit queueing at this
      level. bnx2i will hook into libiscsi, but will allocate a scsi host per
      netdevice/hba, so unlike pure software iscsi/iser which is allocating
      a host per session, it cannot set the scsi_host->can_queue and return
      SCSI_MLQUEUE_HOST_BUSY to reflect queueing limits on the transport.
      
      The iscsi class/driver can also set a scsi_target->can_queue value which
      reflects the max commands the driver/class can support. For iscsi this
      reflects the number of commands we can support for each session due to
      session/connection hw limits, driver limits, and to also reflect the
      session/target's queueing window.
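
      A target-level queueing limit of this kind can be modelled with a simple in-flight counter. The sketch below is user-space C under stated assumptions: `struct target_model`, `target_try_queue()` and `target_complete()` are hypothetical names, `QUEUE_TARGET_BUSY` merely stands in for SCSI_MLQUEUE_TARGET_BUSY, and treating a `can_queue` of 0 as "no limit" is a choice made for this sketch.

```c
/* Result of trying to queue one command to a target. */
enum queue_result {
    QUEUE_OK,
    QUEUE_TARGET_BUSY, /* stands in for SCSI_MLQUEUE_TARGET_BUSY */
};

/* Hypothetical per-target bookkeeping. */
struct target_model {
    unsigned int can_queue; /* max commands in flight; 0 means no limit in this sketch */
    unsigned int busy;      /* commands currently in flight */
    int blocked;            /* nonzero while moving to the blocked state */
};

/* Admit a command only if the target is not blocked and has room. */
static enum queue_result target_try_queue(struct target_model *t)
{
    if (t->blocked)
        return QUEUE_TARGET_BUSY;
    if (t->can_queue && t->busy >= t->can_queue)
        return QUEUE_TARGET_BUSY;
    t->busy++;
    return QUEUE_OK;
}

/* Account for a completed command, freeing one slot. */
static void target_complete(struct target_model *t)
{
    if (t->busy)
        t->busy--;
}
```

      The point of the model is that the limit is tracked per target, so one saturated session does not force the whole host to report busy.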
      
      Changes:
      v1 - initial patch.
      v2 - Fix scsi_run_queue handling of multiple blocked targets.
      Previously we would break from the main loop if a device was added back on
      the starved list. We now run over the list and check if any target is
      blocked.
      v3 - Rediff for scsi-misc.
      Signed-off-by: Mike Christie <michaelc@cs.wisc.edu>
      Signed-off-by: James Bottomley <James.Bottomley@HansenPartnership.com>
  2. 09 Oct, 2008 1 commit
  3. 03 Oct, 2008 4 commits
    • [SCSI] fc_transport: Add an API to allow an LLD to create vports · a30c3f69
      Andrew Vasquez authored
      There's already a fc_vport_terminate() call exported by
      the transport.  This patch adds a symmetric call to the API to allow
      an NPIV-capable LLD to instantiate vports sans user intervention.
      
      Additional comments/updates:
      
         Re: scsi_fc_transport.txt
           Add a function prototype for fc_vport_terminate similar to what's
           done for fc_vport_create
      
         Re: fc_vport_create
           I recommend we pass the channel number in fc_vport_create rather
           than fixing it at zero.
      
           Also, ids->vport_type should be set to FC_PORTTYPE_NPIV prior to
           calling fc_vport_create. The comment is also meaningless.
      
      Added-by and
      Signed-off-by: James Smart <james.smart@emulex.com>
      Signed-off-by: Andrew Vasquez <andrew.vasquez@qlogic.com>
      Signed-off-by: James Bottomley <James.Bottomley@HansenPartnership.com>
    • [SCSI] Update the SCSI state model to allow blocking in the created state · 6f4267e3
      James Bottomley authored
      Brian King <brking@linux.vnet.ibm.com> reported that fibre channel
      devices can oops during scanning if their ports block (because the
      device goes from CREATED -> BLOCK -> RUNNING rather than CREATED ->
      BLOCK -> CREATED).
      
      Fix this by adding a new state: CREATED_BLOCK which can only transition
      back to CREATED and disallow the CREATED -> BLOCK transition.  Now both
      the created and blocked states that the mid-layer recognises can include
      CREATED_BLOCK.
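
      The revised transitions can be sketched as a small user-space state machine. This is a simplified model, assuming only the four states named in the text; the enum and the function names are hypothetical and are not the mid-layer's identifiers.

```c
#include <stdbool.h>

/* Only the states the commit text mentions; the real state enum has more. */
enum sdev_state_model {
    ST_CREATED,
    ST_RUNNING,
    ST_BLOCK,
    ST_CREATED_BLOCK,
};

/* Blocking a device that is still being created parks it in CREATED_BLOCK. */
static enum sdev_state_model state_block(enum sdev_state_model s)
{
    switch (s) {
    case ST_CREATED: return ST_CREATED_BLOCK; /* CREATED -> BLOCK is disallowed */
    case ST_RUNNING: return ST_BLOCK;
    default:         return s;                /* already blocked: no change */
    }
}

/* Unblocking returns the device to the state it was blocked from. */
static enum sdev_state_model state_unblock(enum sdev_state_model s)
{
    switch (s) {
    case ST_CREATED_BLOCK: return ST_CREATED; /* never jumps straight to RUNNING */
    case ST_BLOCK:         return ST_RUNNING;
    default:               return s;
    }
}

/* CREATED_BLOCK counts as both "created" and "blocked". */
static bool state_is_created(enum sdev_state_model s)
{
    return s == ST_CREATED || s == ST_CREATED_BLOCK;
}

static bool state_is_blocked(enum sdev_state_model s)
{
    return s == ST_BLOCK || s == ST_CREATED_BLOCK;
}
```

      With this model a port that blocks during scanning goes CREATED -> CREATED_BLOCK -> CREATED, so the scan code never sees a half-created device in the running state.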
      Signed-off-by: James Bottomley <James.Bottomley@HansenPartnership.com>
    • [SCSI] add inline functions for recognising created and blocked states · 0f1d87a2
      James Bottomley authored
      The created and blocked states are very shortly going to correspond to
      mixed sdev_state states.
      Signed-off-by: James Bottomley <James.Bottomley@HansenPartnership.com>
    • [SCSI] scsi_netlink: Add transport and LLD recieve and event support · 22447be7
      James Smart authored
      This patch adds scsi netlink receive and event support for transport
      and scsi LLDD's.  It is a reimplementation of the patch posted last
      week by David Somayajulu.
      http://marc.info/?l=linux-scsi&m=121745486221819&w=2
      
      There are a few things done differently:
      
      - Transport support is included
      
      - Event delivery is included
      
      - The vendor message is now its own unique message type, considered
        part of the generic "SCSI Transport".
      
      - LLDD entry points are now registered rather than included in the
        scsi_host_template.
      
        Background: When I started to implement the event handler via template,
        I had to either: muck up scsi_add_host and scsi_remove_host;  or have
        the event handler search all possible shosts. Neither was acceptable.
        Moving to a registration solves this, and also limits the scope of
        the changes to something that could be backported to a distro without
        breaking an already-released-distro kabi. However, I admit it isn't
        as elegant, as the passing of the LLDD host template in the
        registration and the complexity around dynamic add/remove shows.
      
      - The receive path was augmented to require a unique identifier for
        the LLDD before the message was allowed to be handed off to the
        driver. Given how quickly very fatal errors occur if there's msg
        mismatches (which I saw in testing my own tools :), I believe this
        to be a very good thing. The id plays off the vendor id scheme already
        introduced for the vendor unique event messages used by FC.
        Additionally, the id is used as the basis of the registration/deregistration.
      
      - Send assist functions, for both the transport and LLDDs are included.
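
      The idea of registering drivers by a unique id and refusing to deliver messages whose id matches no registered driver can be sketched as follows. This is a user-space illustration only: the table, the function names, the handler signature and the 64-bit id width are all assumptions made for the example, not the scsi_netlink interface.

```c
#include <stddef.h>
#include <stdint.h>

#define MAX_DRIVERS 8

/* Hypothetical receive callback: returns a driver-defined result. */
typedef int (*msg_handler)(const void *payload, size_t len);

struct driver_slot {
    uint64_t id;         /* unique driver identifier */
    msg_handler handler;
    int used;
};

static struct driver_slot drivers[MAX_DRIVERS];

/* Register a handler under an id; fails on duplicates or a full table. */
static int driver_register(uint64_t id, msg_handler h)
{
    int free_slot = -1;

    if (!h)
        return -1;
    for (int i = 0; i < MAX_DRIVERS; i++) {
        if (drivers[i].used && drivers[i].id == id)
            return -1;
        if (!drivers[i].used && free_slot < 0)
            free_slot = i;
    }
    if (free_slot < 0)
        return -1;
    drivers[free_slot] = (struct driver_slot){ .id = id, .handler = h, .used = 1 };
    return 0;
}

/* The same id that registered the driver removes it again. */
static int driver_unregister(uint64_t id)
{
    for (int i = 0; i < MAX_DRIVERS; i++) {
        if (drivers[i].used && drivers[i].id == id) {
            drivers[i].used = 0;
            return 0;
        }
    }
    return -1;
}

/* Hand a message to its driver only if the id matches a registration. */
static int deliver(uint64_t id, const void *payload, size_t len)
{
    for (int i = 0; i < MAX_DRIVERS; i++) {
        if (drivers[i].used && drivers[i].id == id)
            return drivers[i].handler(payload, len);
    }
    return -1; /* unknown id: drop instead of guessing a recipient */
}

/* Example handler that just reports the payload length. */
static int example_handler(const void *payload, size_t len)
{
    (void)payload;
    return (int)len;
}
```

      Rejecting unknown ids before any driver code runs is what guards against the message mismatches described above.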
      
      [fujita.tomonori@lab.ntt.co.jp: fix missing cast]
      Signed-off-by: James Smart <james.smart@emulex.com>
      Signed-off-by: James Bottomley <James.Bottomley@HansenPartnership.com>
  4. 29 Aug, 2008 1 commit
  5. 06 Aug, 2008 1 commit
  6. 05 Aug, 2008 1 commit
  7. 04 Aug, 2008 1 commit
  8. 27 Jul, 2008 1 commit
    • [SCSI] extend the last_sector_bug flag to cover more sectors · 2b142900
      Alan Jenkins authored
      The last_sector_bug flag was added to work around a bug in certain usb
      cardreaders, where they would crash if a multiple sector read included the
      last sector. The original implementation avoids this by e.g. splitting an 8
      sector read which includes the last sector into a 7 sector read, and a single
      sector read for the last sector.  The flag is enabled for all USB devices.
      
      This revealed a second bug in other usb cardreaders, which crash when they
      get a multiple sector read which stops 1 sector short of the last sector.
      Affected hardware includes the Kingston "MobileLite" external USB cardreader
      and the internal USB cardreader on the Asus EeePC.
      
      Extend the last_sector_bug workaround to ensure that any access which touches
      the last 8 hardware sectors of the device is a single sector long.  Requests
      are shrunk as necessary to meet this constraint.
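
      The shrinking rule can be illustrated with a small helper working in hardware-sector units. This is a sketch of the described constraint, not the sd driver's code: `clamp_request()` is a hypothetical name, and the handling of devices smaller than the margin is an assumption of the example.

```c
#include <stdint.h>

#define LAST_BUGGY_SECTORS 8 /* safety margin taken from the description above */

/*
 * Return the sector count to actually issue for a request of `count`
 * sectors starting at `start` on a device of `capacity` sectors.
 */
static uint32_t clamp_request(uint64_t capacity, uint64_t start, uint32_t count)
{
    /* First sector of the protected region (0 if the device is tiny). */
    uint64_t threshold = capacity > LAST_BUGGY_SECTORS
                             ? capacity - LAST_BUGGY_SECTORS
                             : 0;

    /* Single-sector requests and requests ending before the region pass. */
    if (count <= 1 || start + count <= threshold)
        return count;

    /* Request straddles the boundary: stop right at the threshold. */
    if (start < threshold)
        return (uint32_t)(threshold - start);

    /* Request starts inside the region: one sector at a time. */
    return 1;
}
```

      For a 1000-sector device, an 8-sector read starting at sector 990 is cut to 2 sectors, and the remaining sectors are then read one at a time by subsequent requests.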
      
      This gives us a safety margin against potential unknown or future bugs
      affecting multi-sector access to the end of the device.  The two known bugs
      only affect the last 2 sectors.  However, they suggest that these devices
      are prone to fencepost errors and that multi-sector access to the end of the
      device is not well tested.  Popular OS's use multi-sector accesses, but they
      rarely read the last few sectors.  Linux (with udev & vol_id) automatically
      reads sectors from the end of the device on insertion.  It is assumed that
      single sector accesses are more thoroughly tested during development.
      Signed-off-by: Alan Jenkins <alan-jenkins@tuffmail.co.uk>
      Tested-by: Alan Jenkins <alan-jenkins@tuffmail.co.uk>
      Signed-off-by: James Bottomley <James.Bottomley@HansenPartnership.com>
  9. 26 Jul, 2008 9 commits
  10. 21 Jul, 2008 1 commit
  11. 15 Jul, 2008 1 commit
  12. 14 Jul, 2008 1 commit
    • scsi: sd: optionally set power condition in START STOP UNIT · d2886ea3
      Stefan Richter authored
      Adds a new scsi_device flag, start_stop_pwr_cond:  If enabled, the sd
      driver will not send plain START STOP UNIT commands but ones with the
      power condition field set to 3 (standby) or 1 (active) respectively.
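
      How the flag changes the command can be shown by building the 6-byte CDB in user space. This is a minimal sketch assuming the usual START STOP UNIT layout (opcode 0x1b, START in bit 0 of byte 4, POWER CONDITION in bits 7..4 of byte 4); `build_start_stop()` is a hypothetical helper, not the sd driver's function.

```c
#include <stdint.h>
#include <string.h>

#define START_STOP_OPCODE 0x1b

/*
 * Fill a 6-byte START STOP UNIT CDB.
 * start:        nonzero to start the unit, zero to stop it
 * use_pwr_cond: nonzero to also set the power condition field
 */
static void build_start_stop(uint8_t cdb[6], int start, int use_pwr_cond)
{
    memset(cdb, 0, 6);
    cdb[0] = START_STOP_OPCODE;

    if (start)
        cdb[4] |= 0x01; /* START bit */

    if (use_pwr_cond) {
        /* 1 = active when starting, 3 = standby when stopping */
        cdb[4] |= (uint8_t)((start ? 1 : 3) << 4);
    }
}
```

      With the flag off the command is the plain form (power condition 0); with it on, a stop request carries standby and a start request carries active.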
      
      Some FireWire disk firmwares do not stop the motor if power condition is
      zero.  Or worse, they become unresponsive after a START STOP UNIT with
      power condition = 0 and start = 0.
      
      http://lkml.org/lkml/2008/4/29/704
      
      This patch only adds the necessary code to sd_mod but doesn't activate
      it.  Follow-up patches to the FireWire drivers will add detection of
      affected devices and enable the code for them.
      
      I did not add power condition values to scsi_error.c::scsi_eh_try_stu()
      for now.  The three firmwares which suffer from above mentioned problems
      do not need START STOP UNIT in the error handler, and they are not
      adversely affected by START STOP UNIT with power condition = 0 and start
      = 1 (like scsi_eh_try_stu() sends it if scsi_device.allow_restart is
      enabled).
      Signed-off-by: Stefan Richter <stefanr@s5r6.in-berlin.de>
      Tested-by: Tino Keitel <tino.keitel@gmx.de>
  13. 12 Jul, 2008 12 commits