  12 Sep, 2016 (4 commits)
    • nvme-rdma: add back dependency on CONFIG_BLOCK · 2cfe199c
      Arnd Bergmann authored
      A recent change removed the dependency on BLK_DEV_NVME, which implies
      the dependency on PCI and BLOCK. We don't need CONFIG_PCI, but without
      CONFIG_BLOCK we get tons of build errors, e.g.
      
      In file included from drivers/nvme/host/core.c:16:0:
      linux/blk-mq.h:182:33: error: 'struct gendisk' declared inside parameter list will not be visible outside of this definition or declaration [-Werror]
      drivers/nvme/host/core.c: In function 'nvme_setup_rw':
      drivers/nvme/host/core.c:295:21: error: implicit declaration of function 'rq_data_dir' [-Werror=implicit-function-declaration]
      drivers/nvme/host/nvme.h: In function 'nvme_map_len':
      drivers/nvme/host/nvme.h:217:6: error: implicit declaration of function 'req_op' [-Werror=implicit-function-declaration]
      drivers/nvme/host/scsi.c: In function 'nvme_trans_bdev_limits_page':
      drivers/nvme/host/scsi.c:768:85: error: implicit declaration of function 'queue_max_hw_sectors' [-Werror=implicit-function-declaration]
      
      This adds back the specific CONFIG_BLOCK dependency to avoid broken
      configurations.
      Signed-off-by: Arnd Bergmann <arnd@arndb.de>
      Fixes: aa719874 ("nvme: fabrics drivers don't need the nvme-pci driver")
      Signed-off-by: Sagi Grimberg <sagi@grimberg.me>
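The fix above amounts to a one-line Kconfig change. A minimal sketch of what the restored dependency might look like in drivers/nvme/host/Kconfig; the select lines are assumptions about the surrounding option, not quoted from the patch:

```kconfig
config NVME_RDMA
	tristate "NVM Express over Fabrics RDMA host driver"
	# Without BLOCK, blk-mq types such as struct gendisk are missing
	# and core.c/nvme.h/scsi.c fail to build as shown above.
	depends on INFINIBAND && BLOCK
	select NVME_CORE
	select NVME_FABRICS
	select SG_POOL
```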
    • nvme-rdma: fix null pointer dereference on req->mr · 1bda18de
      Colin Ian King authored
      If there is an error on req->mr, req->mr is set to NULL; however, the
      following statement then sets req->mr->need_inval, causing a null
      pointer dereference.  Fix this by bailing out to label 'out',
      returning immediately and skipping over the offending dereference.
      
      Fixes: f5b7b559 ("nvme-rdma: Get rid of duplicate variable")
      Signed-off-by: Colin Ian King <colin.king@canonical.com>
      Reviewed-by: Christoph Hellwig <hch@lst.de>
      Signed-off-by: Sagi Grimberg <sagi@grimberg.me>
    • nvme-rdma: use ib_client API to detect device removal · e87a911f
      Steve Wise authored
      Change nvme-rdma to use the IB Client API to detect device removal.
      This has the wonderful benefit of being able to blow away all the
      ib/rdma_cm resources for the device being removed.  No craziness about
      not destroying the cm_id handling the event.  No deadlocks due to broken
      iw_cm/rdma_cm/iwarp dependencies.  And no need to have a bound cm_id
      around during controller recovery/reconnect to catch device removal
      events.
      
      We don't use the device_add aspect of the ib_client service since we only
      want to create resources for an IB device if we have a target utilizing
      that device.
      Reviewed-by: Christoph Hellwig <hch@lst.de>
      Signed-off-by: Steve Wise <swise@opengridcomputing.com>
      Signed-off-by: Sagi Grimberg <sagi@grimberg.me>
    • nvme-rdma: add DELETING queue flag · e89ca58f
      Sagi Grimberg authored
      When we get a surprise disconnect from the target we queue a periodic
      reconnect (which is the sane thing to do...).
      
      We only move the queues out of CONNECTED when we retry to reconnect
      (after 10 seconds in the default case), but we stop the blk queues
      immediately so we are not bothered with traffic from now on. If
      delete() kicks in during this period, the queues are still in the
      CONNECTED state.
      
      Part of the delete sequence is trying to issue a ctrl shutdown if the
      admin queue is CONNECTED (which it is!). This request is issued but
      gets stuck in blk-mq waiting for the queues to start again, which
      might be what is preventing us from making forward progress...
      
      The patch separates the queue flags into CONNECTED and DELETING. Now
      we move out of CONNECTED as soon as error recovery kicks in (before
      stopping the queues), and DELETING is set when we start the queue
      deletion.
      Signed-off-by: Sagi Grimberg <sagi@grimberg.me>
  11 Sep, 2016 (1 commit)
    • nvme: make NVME_RDMA depend on BLOCK · bd0b841f
      Linus Torvalds authored
      Commit aa719874 ("nvme: fabrics drivers don't need the nvme-pci
      driver") removed the dependency on BLK_DEV_NVME, but the code does
      depend on the block layer (which used to be an implicit dependency
      through BLK_DEV_NVME).
      
      Otherwise you get various errors from the kbuild test robot's random
      config testing when it happens to hit a configuration with BLOCK
      device support disabled.
      
      Cc: Christoph Hellwig <hch@lst.de>
      Cc: Jay Freyensee <james_p_freyensee@linux.intel.com>
      Cc: Sagi Grimberg <sagi@grimberg.me>
      Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>
  11 Aug, 2016 (1 commit)
    • nvme: Suspend all queues before deletion · c21377f8
      Gabriel Krisman Bertazi authored
      When nvme_delete_queue fails in the first pass of the
      nvme_disable_io_queues() loop, we return early, failing to suspend all
      of the IO queues.  Later, on the nvme_pci_disable path, this causes us
      to disable MSI without actually having freed all the IRQs, which
      triggers the BUG_ON in free_msi_irqs(), as shown below.
      
      This patch refactors nvme_disable_io_queues to suspend all queues before
      submitting delete queue commands.  This way, we ensure that we
      have at least returned every IRQ before continuing with the removal
      path.
      
      [  487.529200] kernel BUG at ../drivers/pci/msi.c:368!
      cpu 0x46: Vector: 700 (Program Check) at [c0000078c5b83650]
          pc: c000000000627a50: free_msi_irqs+0x90/0x200
          lr: c000000000627a40: free_msi_irqs+0x80/0x200
          sp: c0000078c5b838d0
         msr: 9000000100029033
        current = 0xc0000078c5b40000
        paca    = 0xc000000002bd7600   softe: 0        irq_happened: 0x01
          pid   = 1376, comm = kworker/70:1H
      kernel BUG at ../drivers/pci/msi.c:368!
      Linux version 4.7.0.mainline+ (root@iod76) (gcc version 5.3.1 20160413
      (Ubuntu/IBM 5.3.1-14ubuntu2.1) ) #104 SMP Fri Jul 29 09:20:17 CDT 2016
      enter ? for help
      [c0000078c5b83920] d0000000363b0cd8 nvme_dev_disable+0x208/0x4f0 [nvme]
      [c0000078c5b83a10] d0000000363b12a4 nvme_timeout+0xe4/0x250 [nvme]
      [c0000078c5b83ad0] c0000000005690e4 blk_mq_rq_timed_out+0x64/0x110
      [c0000078c5b83b40] c00000000056c930 bt_for_each+0x160/0x170
      [c0000078c5b83bb0] c00000000056d928 blk_mq_queue_tag_busy_iter+0x78/0x110
      [c0000078c5b83c00] c0000000005675d8 blk_mq_timeout_work+0xd8/0x1b0
      [c0000078c5b83c50] c0000000000e8cf0 process_one_work+0x1e0/0x590
      [c0000078c5b83ce0] c0000000000e9148 worker_thread+0xa8/0x660
      [c0000078c5b83d80] c0000000000f2090 kthread+0x110/0x130
      [c0000078c5b83e30] c0000000000095f0 ret_from_kernel_thread+0x5c/0x6c
      Signed-off-by: Gabriel Krisman Bertazi <krisman@linux.vnet.ibm.com>
      Cc: Brian King <brking@linux.vnet.ibm.com>
      Cc: Keith Busch <keith.busch@intel.com>
      Cc: linux-nvme@lists.infradead.org
      Signed-off-by: Jens Axboe <axboe@fb.com>