1. 07 Aug, 2014 1 commit
  2. 01 Aug, 2014 1 commit
  3. 24 Jul, 2014 1 commit
    • direct-io: fix uninitialized warning in do_direct_IO() · 6fcc5420
      Boaz Harrosh authored
      The following warnings:
      
        fs/direct-io.c: In function ‘__blockdev_direct_IO’:
        fs/direct-io.c:1011:12: warning: ‘to’ may be used uninitialized in this function [-Wmaybe-uninitialized]
        fs/direct-io.c:913:16: note: ‘to’ was declared here
        fs/direct-io.c:1011:12: warning: ‘from’ may be used uninitialized in this function [-Wmaybe-uninitialized]
        fs/direct-io.c:913:10: note: ‘from’ was declared here
      
      are false positive because dio_get_page() either fails, or sets both
      'from' and 'to'.
      
      Paul Bolle said ...
      Maybe it's better to move initializing "to" and "from" out of
      dio_get_page(). That _might_ make it easier for both the reader and
      the compiler to understand what's going on. Something like this:
      
      Christoph Hellwig said ...
      The fix of moving the code definitely looks nicer, while I think
      uninitialized_var is a horrible wart that won't get anywhere near my code.
      
      Boaz Harrosh: I agree with Christoph and Paul
      Signed-off-by: Boaz Harrosh <boaz@plexistor.com>
      Signed-off-by: Christoph Hellwig <hch@lst.de>
      6fcc5420
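The restructuring Paul Bolle suggests can be sketched in userspace C. This is only an illustration under stated assumptions, not the kernel patch: `struct fake_page`, `get_page_stub()`, `bytes_in_page()` and `PAGE_SZ` are hypothetical names. The helper either fails or returns a page, and the caller assigns `from` and `to` itself, so both the reader and the compiler can see that they are set on the only path that uses them.

```c
#include <assert.h>
#include <stddef.h>

#define PAGE_SZ 4096u /* hypothetical page size for this sketch */

struct fake_page { unsigned char data[PAGE_SZ]; };

/* Hypothetical helper: returns a page, or NULL on failure.
 * It does not write 'from'/'to' through out-parameters. */
static struct fake_page *get_page_stub(struct fake_page *pool, size_t avail)
{
    return avail ? pool : NULL;
}

/* Returns the number of bytes to process in the page,
 * or -1 if no page could be obtained. */
static long bytes_in_page(struct fake_page *pool, size_t avail,
                          unsigned head, unsigned tail)
{
    struct fake_page *page = get_page_stub(pool, avail);
    unsigned from, to;

    if (!page)
        return -1;

    assert(head <= tail && tail <= PAGE_SZ);

    /* Initialized here, in the caller, right before use. */
    from = head;
    to = tail;

    return (long)(to - from);
}
```

Because the assignments sit directly after the failure check, no annotation such as uninitialized_var() is needed to silence the warning in a layout like this.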
  4. 06 May, 2014 5 commits
  5. 03 Apr, 2014 1 commit
  6. 09 Feb, 2014 1 commit
  7. 23 Nov, 2013 1 commit
    • block: Abstract out bvec iterator · 4f024f37
      Kent Overstreet authored
      Immutable biovecs are going to require an explicit iterator. To
      implement immutable bvecs, a later patch is going to add a bi_bvec_done
      member to this struct; for now, this patch effectively just renames
      things.
      Signed-off-by: Kent Overstreet <kmo@daterainc.com>
      Cc: Jens Axboe <axboe@kernel.dk>
      Cc: Geert Uytterhoeven <geert@linux-m68k.org>
      Cc: Benjamin Herrenschmidt <benh@kernel.crashing.org>
      Cc: Paul Mackerras <paulus@samba.org>
      Cc: "Ed L. Cashin" <ecashin@coraid.com>
      Cc: Nick Piggin <npiggin@kernel.dk>
      Cc: Lars Ellenberg <drbd-dev@lists.linbit.com>
      Cc: Jiri Kosina <jkosina@suse.cz>
      Cc: Matthew Wilcox <willy@linux.intel.com>
      Cc: Geoff Levand <geoff@infradead.org>
      Cc: Yehuda Sadeh <yehuda@inktank.com>
      Cc: Sage Weil <sage@inktank.com>
      Cc: Alex Elder <elder@inktank.com>
      Cc: ceph-devel@vger.kernel.org
      Cc: Joshua Morris <josh.h.morris@us.ibm.com>
      Cc: Philip Kelleher <pjk1939@linux.vnet.ibm.com>
      Cc: Rusty Russell <rusty@rustcorp.com.au>
      Cc: "Michael S. Tsirkin" <mst@redhat.com>
      Cc: Konrad Rzeszutek Wilk <konrad.wilk@oracle.com>
      Cc: Jeremy Fitzhardinge <jeremy@goop.org>
      Cc: Neil Brown <neilb@suse.de>
      Cc: Alasdair Kergon <agk@redhat.com>
      Cc: Mike Snitzer <snitzer@redhat.com>
      Cc: dm-devel@redhat.com
      Cc: Martin Schwidefsky <schwidefsky@de.ibm.com>
      Cc: Heiko Carstens <heiko.carstens@de.ibm.com>
      Cc: linux390@de.ibm.com
      Cc: Boaz Harrosh <bharrosh@panasas.com>
      Cc: Benny Halevy <bhalevy@tonian.com>
      Cc: "James E.J. Bottomley" <JBottomley@parallels.com>
      Cc: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
      Cc: "Nicholas A. Bellinger" <nab@linux-iscsi.org>
      Cc: Alexander Viro <viro@zeniv.linux.org.uk>
      Cc: Chris Mason <chris.mason@fusionio.com>
      Cc: "Theodore Ts'o" <tytso@mit.edu>
      Cc: Andreas Dilger <adilger.kernel@dilger.ca>
      Cc: Jaegeuk Kim <jaegeuk.kim@samsung.com>
      Cc: Steven Whitehouse <swhiteho@redhat.com>
      Cc: Dave Kleikamp <shaggy@kernel.org>
      Cc: Joern Engel <joern@logfs.org>
      Cc: Prasad Joshi <prasadjoshi.linux@gmail.com>
      Cc: Trond Myklebust <Trond.Myklebust@netapp.com>
      Cc: KONISHI Ryusuke <konishi.ryusuke@lab.ntt.co.jp>
      Cc: Mark Fasheh <mfasheh@suse.com>
      Cc: Joel Becker <jlbec@evilplan.org>
      Cc: Ben Myers <bpm@sgi.com>
      Cc: xfs@oss.sgi.com
      Cc: Steven Rostedt <rostedt@goodmis.org>
      Cc: Frederic Weisbecker <fweisbec@gmail.com>
      Cc: Ingo Molnar <mingo@redhat.com>
      Cc: Len Brown <len.brown@intel.com>
      Cc: Pavel Machek <pavel@ucw.cz>
      Cc: "Rafael J. Wysocki" <rjw@sisk.pl>
      Cc: Herton Ronaldo Krzesinski <herton.krzesinski@canonical.com>
      Cc: Ben Hutchings <ben@decadent.org.uk>
      Cc: Andrew Morton <akpm@linux-foundation.org>
      Cc: Guo Chao <yan@linux.vnet.ibm.com>
      Cc: Tejun Heo <tj@kernel.org>
      Cc: Asai Thambi S P <asamymuthupa@micron.com>
      Cc: Selvan Mani <smani@micron.com>
      Cc: Sam Bradshaw <sbradshaw@micron.com>
      Cc: Wei Yongjun <yongjun_wei@trendmicro.com.cn>
      Cc: "Roger Pau Monné" <roger.pau@citrix.com>
      Cc: Jan Beulich <jbeulich@suse.com>
      Cc: Stefano Stabellini <stefano.stabellini@eu.citrix.com>
      Cc: Ian Campbell <Ian.Campbell@citrix.com>
      Cc: Sebastian Ott <sebott@linux.vnet.ibm.com>
      Cc: Christian Borntraeger <borntraeger@de.ibm.com>
      Cc: Minchan Kim <minchan@kernel.org>
      Cc: Jiang Liu <jiang.liu@huawei.com>
      Cc: Nitin Gupta <ngupta@vflare.org>
      Cc: Jerome Marchand <jmarchand@redhat.com>
      Cc: Joe Perches <joe@perches.com>
      Cc: Peng Tao <tao.peng@emc.com>
      Cc: Andy Adamson <andros@netapp.com>
      Cc: fanchaoting <fanchaoting@cn.fujitsu.com>
      Cc: Jie Liu <jeff.liu@oracle.com>
      Cc: Sunil Mushran <sunil.mushran@gmail.com>
      Cc: "Martin K. Petersen" <martin.petersen@oracle.com>
      Cc: Namjae Jeon <namjae.jeon@samsung.com>
      Cc: Pankaj Kumar <pankaj.km@samsung.com>
      Cc: Dan Magenheimer <dan.magenheimer@oracle.com>
      Cc: Mel Gorman <mgorman@suse.de>
      4f024f37
  8. 09 Sep, 2013 1 commit
  9. 04 Sep, 2013 2 commits
    • direct-io: Handle O_(D)SYNC AIO · 02afc27f
      Christoph Hellwig authored
      Call generic_write_sync() from the deferred I/O completion handler if
      O_DSYNC is set for a write request.  Also make sure various callers
      don't call generic_write_sync if the direct I/O code returns
      -EIOCBQUEUED.
      
      Based on an earlier patch from Jan Kara <jack@suse.cz> with updates from
      Jeff Moyer <jmoyer@redhat.com> and Darrick J. Wong <darrick.wong@oracle.com>.
      Signed-off-by: Christoph Hellwig <hch@lst.de>
      Signed-off-by: Jan Kara <jack@suse.cz>
      Signed-off-by: Al Viro <viro@zeniv.linux.org.uk>
      02afc27f
    • direct-io: Implement generic deferred AIO completions · 7b7a8665
      Christoph Hellwig authored
      Add support to the core direct-io code to defer AIO completions to user
      context using a workqueue.  This replaces opencoded and less efficient
      code in XFS and ext4 (we save a memory allocation for each direct IO)
      and will be needed to properly support O_(D)SYNC for AIO.
      
      The communication between the filesystem and the direct I/O code requires
      a new buffer head flag, which is a bit ugly but not avoidable until the
      direct I/O code stops abusing the buffer_head structure for communicating
      with the filesystems.
      
      Currently this creates a per-superblock unbound workqueue for these
      completions, which is taken from an earlier patch by Jan Kara.  I'm
      not really convinced about this use and would prefer a "normal" global
      workqueue with a high concurrency limit, but this needs further discussion.
      
      JK: Fixed ext4 part, dynamic allocation of the workqueue.
      Signed-off-by: Christoph Hellwig <hch@lst.de>
      Signed-off-by: Jan Kara <jack@suse.cz>
      Signed-off-by: Al Viro <viro@zeniv.linux.org.uk>
      7b7a8665
  10. 07 May, 2013 1 commit
  11. 29 Apr, 2013 2 commits
  12. 23 Mar, 2013 1 commit
    • block: Convert some code to bio_for_each_segment_all() · cb34e057
      Kent Overstreet authored
      More prep work for immutable bvecs:
      
      A few places in the code were either open coding or using the wrong
      version - fix.
      
      After we introduce the bvec iter, it'll no longer be possible to modify
      the biovec through bio_for_each_segment_all() - it doesn't increment a
      pointer to the current bvec, you pass in a struct bio_vec (not a
      pointer) which is updated with what the current biovec would be (taking
      into account bi_bvec_done and bi_size).
      
      So because of that it's more worthwhile to be consistent about
      bio_for_each_segment()/bio_for_each_segment_all() usage.
      Signed-off-by: Kent Overstreet <koverstreet@google.com>
      CC: Jens Axboe <axboe@kernel.dk>
      CC: NeilBrown <neilb@suse.de>
      CC: Alasdair Kergon <agk@redhat.com>
      CC: dm-devel@redhat.com
      CC: Alexander Viro <viro@zeniv.linux.org.uk>
      cb34e057
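The by-value iteration described in this commit message can be modelled in plain C. This is a rough sketch, not the kernel macros: `struct seg`, `struct seg_iter`, `iter_current()` and `remaining_bytes()` are hypothetical stand-ins for the bio_vec and iterator types. Each step computes a copy of the current segment, adjusted for bytes already consumed, so writes to that copy never reach the underlying array.

```c
#include <assert.h>
#include <stddef.h>

/* Simplified stand-ins for the kernel types; field names are illustrative. */
struct seg { unsigned len; unsigned offset; };

struct seg_iter {
    size_t idx;     /* index of the current segment */
    unsigned done;  /* bytes already consumed in that segment */
};

/* Build the current segment as a copy, adjusted for 'done'. */
static struct seg iter_current(const struct seg *vec, struct seg_iter it)
{
    struct seg cur = vec[it.idx];

    assert(it.done <= cur.len);
    cur.offset += it.done;
    cur.len -= it.done;
    return cur;
}

/* Sum the remaining bytes by walking by-value copies. */
static unsigned remaining_bytes(const struct seg *vec, size_t n,
                                struct seg_iter it)
{
    unsigned total = 0;

    for (; it.idx < n; it.idx++, it.done = 0) {
        struct seg cur = iter_current(vec, it);

        total += cur.len;
        cur.len = 0; /* changes only the local copy, not vec[] */
    }
    return total;
}
```

Code that really needs to modify the vector has to walk it with a pointer instead, which is why being consistent about the two iteration styles matters.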
  13. 22 Feb, 2013 1 commit
  14. 29 Nov, 2012 1 commit
    • direct-io: don't read inode->i_blkbits multiple times · ab73857e
      Linus Torvalds authored
      Since directio can work on a raw block device, and the block size of the
      device can change under it, we need to do the same thing that
      fs/buffer.c now does: read the block size a single time, using
      ACCESS_ONCE().
      
      Reading it multiple times can get different results, which will then
      confuse the code because it actually encodes the i_blksize in
      relationship to the underlying logical blocksize.
      Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>
      ab73857e
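The read-once idea can be approximated in userspace as shown here. This is a sketch under assumptions: the `ACCESS_ONCE()` macro below is a local approximation built on the GCC/Clang `__typeof__` extension, and `struct fake_inode` and `is_block_aligned()` are hypothetical. The block size bits are loaded a single time into a local, and every later calculation uses that local, so a concurrent change cannot produce two different values within one call.

```c
#include <assert.h>

/* Userspace approximation: the volatile access makes the compiler
 * emit exactly one load for this expression. */
#define ACCESS_ONCE(x) (*(volatile __typeof__(x) *)&(x))

struct fake_inode { unsigned i_blkbits; }; /* hypothetical stand-in */

/* Returns 1 if 'offset' is aligned to the inode's block size. */
static int is_block_aligned(struct fake_inode *inode,
                            unsigned long long offset)
{
    /* Single read; all derived values come from this local copy. */
    unsigned blkbits = ACCESS_ONCE(inode->i_blkbits);
    unsigned long long mask;

    assert(blkbits < 32);
    mask = (1ULL << blkbits) - 1;
    return (offset & mask) == 0;
}
```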
  15. 09 Aug, 2012 1 commit
  16. 14 Jul, 2012 1 commit
  17. 31 May, 2012 1 commit
  18. 23 Feb, 2012 1 commit
    • Restore direct_io / truncate locking API · 37fbf4bf
      Anton Altaparmakov authored
      With kernel 3.1, Christoph removed i_alloc_sem and replaced it with
      calls (namely inode_dio_wait() and inode_dio_done()) which are
      EXPORT_SYMBOL_GPL() thus they cannot be used by non-GPL file systems and
      further inode_dio_wait() was pushed from notify_change() into the file
      system ->setattr() method but no non-GPL file system can make this call.
      
      That means non-GPL file systems cannot exist any more unless they do not
      use any VFS functionality related to reading/writing as far as I can
      tell or at least as long as they want to implement direct i/o.
      
      Both Linus and Al (and others) have said on LKML that this breakage of
      the VFS API should not have happened and that the change was simply
      missed as it was not documented in the change logs of the patches that
      did those changes.
      
      This patch changes the two function exports in question to be
      EXPORT_SYMBOL() thus restoring the VFS API as it used to be - accessible
      for all modules.
      
      Christoph, who introduced the two functions and exported them GPL-only
      is CC-ed on this patch to give him the opportunity to object to the
      symbols being changed in this manner if he did indeed intend them to be
      GPL-only and does not want them to become available to all modules.
      Signed-off-by: Anton Altaparmakov <anton@tuxera.com>
      CC: Christoph Hellwig <hch@infradead.org>
      Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>
      37fbf4bf
  19. 12 Jan, 2012 2 commits
    • dio: optimize cache misses in the submission path · 65dd2aa9
      Andi Kleen authored
      Some investigation of a transaction processing workload showed that a
      major consumer of cycles in __blockdev_direct_IO is the cache miss while
      accessing the block size.  This is because it has to walk the chain from
      block_dev to gendisk to queue.
      
      The block size is needed early on to check alignment and sizes.  It's only
      done if the check for the inode block size fails.  But the costly block
      device state is unconditionally fetched.
      
      - Reorganize the code to only fetch block dev state when actually
        needed.
      
      Then do a prefetch on the block dev early on in the direct IO path.  This
      is worth it, because there is substantial code run before we actually
      touch the block dev now.
      
      - I also added some unlikelies to make it clear to the compiler that block
        device fetch code is not normally executed.
      
      This gave a small, but measurable improvement on a large database
      benchmark (about 0.3%)
      
      [akpm@linux-foundation.org: coding-style fixes]
      [sfr@canb.auug.org.au: using prefetch requires including prefetch.h]
      Signed-off-by: Andi Kleen <ak@linux.intel.com>
      Cc: Jeff Moyer <jmoyer@redhat.com>
      Cc: Jens Axboe <axboe@kernel.dk>
      Cc: Christoph Hellwig <hch@lst.de>
      Signed-off-by: Stephen Rothwell <sfr@canb.auug.org.au>
      Signed-off-by: Andrew Morton <akpm@linux-foundation.org>
      Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>
      65dd2aa9
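The three techniques in this commit message (lazy fetch, early prefetch, unlikely hints) can be sketched in userspace with GCC/Clang builtins. This is an assumption-laden illustration, not the kernel code: `struct fake_bdev`, `struct fake_inode` and `check_alignment()` are hypothetical, and `unlikely()` is defined locally on top of `__builtin_expect`. The block device state is only dereferenced when the cheap inode-level check fails.

```c
#include <assert.h>
#include <stddef.h>

#define unlikely(x) __builtin_expect(!!(x), 0)

/* Hypothetical stand-ins for the kernel structures. */
struct fake_bdev { unsigned logical_blkbits; };
struct fake_inode { unsigned i_blkbits; struct fake_bdev *bdev; };

/* Returns 1 if 'offset' is acceptably aligned, 0 otherwise. */
static int check_alignment(const struct fake_inode *inode,
                           unsigned long long offset)
{
    unsigned long long mask;

    assert(inode->i_blkbits < 32);

    /* Hint only: start pulling the device state into cache early. */
    __builtin_prefetch(inode->bdev);

    mask = (1ULL << inode->i_blkbits) - 1;
    if (unlikely(offset & mask)) {
        /* Slow path: only now touch the block device state. */
        if (!inode->bdev)
            return 0;
        mask = (1ULL << inode->bdev->logical_blkbits) - 1;
        if (offset & mask)
            return 0;
    }
    return 1;
}
```

Whether a prefetch like this pays off depends on how much work runs between the hint and the first real access, which matches the commit's note that the gain was small but measurable.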
    • fs/direct-io.c: calculate fs_count correctly in get_more_blocks() · ae55e1aa
      Tao Ma authored
      In get_more_blocks(), we use dio_count to calculate fs_count and do some
      tricky things to increase fs_count if dio_count isn't aligned.  But
      actually it still has some corner cases that can't be covered.  See the
      following example:
      
      	dio_write foo -s 1024 -w 4096
      
      (direct write 4096 bytes at offset 1024).  The same goes if the offset
      isn't aligned to fs_blocksize.
      
      In this case, the old calculation counts fs_count to be 1, but actually we
      will write into 2 different blocks (if fs_blocksize=4096).  The old code
      just works, since it will call get_block twice (and may have to allocate
      and create extents twice for filesystems like ext4).  So we'd better call
      get_block just once with the proper fs_count.
      Signed-off-by: Tao Ma <boyu.mt@taobao.com>
      Cc: "Theodore Ts'o" <tytso@mit.edu>
      Cc: Christoph Hellwig <hch@lst.de>
      Cc: Al Viro <viro@zeniv.linux.org.uk>
      Signed-off-by: Andrew Morton <akpm@linux-foundation.org>
      Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>
      ae55e1aa
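The arithmetic in the example above can be checked with a small helper. This is a sketch, not the code in get_more_blocks(): `fs_block_count()` is a hypothetical function that counts filesystem blocks from the first and last byte of the request, which handles an unaligned starting offset as well as an unaligned length.

```c
#include <assert.h>

/* Number of filesystem blocks touched by 'len' bytes at 'offset',
 * for a block size of (1 << blkbits) bytes. */
static unsigned long long fs_block_count(unsigned long long offset,
                                         unsigned long long len,
                                         unsigned blkbits)
{
    unsigned long long first, last;

    assert(len > 0 && blkbits < 32);

    first = offset >> blkbits;             /* block holding the first byte */
    last = (offset + len - 1) >> blkbits;  /* block holding the last byte */
    return last - first + 1;
}
```

For the case in the commit message, `fs_block_count(1024, 4096, 12)` covers bytes 1024 through 5119, which lie in blocks 0 and 1, so the result is 2 rather than the 1 the old calculation produced.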
  20. 28 Oct, 2011 7 commits
  21. 26 Jul, 2011 1 commit
  22. 20 Jul, 2011 4 commits
    • fs: move inode_dio_done to the end_io handler · 72c5052d
      Christoph Hellwig authored
      For filesystems that delay their end_io processing we should keep our
      i_dio_count until the processing is done.  Enable this by moving
      the inode_dio_done call to the end_io handler if one exists.  Note that
      the actual move to the workqueue for ext4 and XFS is not done in
      this patch yet, but left to the filesystem maintainers.  At least
      for XFS it's not needed yet either as XFS has an internal equivalent
      to i_dio_count.
      Signed-off-by: Christoph Hellwig <hch@lst.de>
      Signed-off-by: Al Viro <viro@zeniv.linux.org.uk>
      72c5052d
    • fs: always maintain i_dio_count · df2d6f26
      Christoph Hellwig authored
      Maintain i_dio_count for all filesystems, not just those using DIO_LOCKING.
      This allows these filesystems to also protect truncate against direct I/O requests
      by using common code.  Right now the only non-DIO_LOCKING filesystem that
      appears to do so is XFS, which uses an opencoded variant of the i_dio_count
      scheme.
      
      Behaviour doesn't change for filesystems never calling inode_dio_wait.
      For ext4 behaviour changes when using the dioread_nonlock option, which
      previously was missing any protection between truncate and direct I/O reads.
      For ocfs2 the handcrafted i_dio_count manipulations are replaced with
      the common code now enabled.
      Signed-off-by: Christoph Hellwig <hch@lst.de>
      Signed-off-by: Al Viro <viro@zeniv.linux.org.uk>
      df2d6f26
    • fs: kill i_alloc_sem · bd5fe6c5
      Christoph Hellwig authored
      i_alloc_sem is a rather special rw_semaphore.  It's the last one that may
      be released by a non-owner, and its write side is always mirrored by
      real exclusion.  Its intended use is to wait for all pending direct I/O
      requests to finish before starting a truncate.
      
      Replace it with a hand-grown construct:
      
       - exclusion for truncates is already guaranteed by i_mutex, so it can
         simply fall away
       - the reader side is replaced by an i_dio_count member in struct inode
         that counts the number of pending direct I/O requests.  Truncate can't
         proceed as long as it's non-zero
       - when i_dio_count reaches zero we wake up a pending truncate using
         wake_up_bit on a new bit in i_flags
       - new references to i_dio_count can't appear while we are waiting for
         it to read zero because the direct I/O count always needs i_mutex
         (or an equivalent like XFS's i_iolock) for starting a new operation.
      
      This scheme is much simpler, and saves the space of a spinlock_t and a
      struct list_head in struct inode (typically 160 bits on a non-debug 64-bit
      system).
      Signed-off-by: Christoph Hellwig <hch@lst.de>
      Signed-off-by: Al Viro <viro@zeniv.linux.org.uk>
      bd5fe6c5
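The counting scheme in this commit message can be modelled with C11 atomics. This is a single-threaded sketch under assumptions, not the kernel implementation: `struct fake_inode`, `dio_begin()`, `dio_done()` and `truncate_may_proceed()` are hypothetical, and the wake-up is reduced to a boolean return value where the kernel would use wake_up_bit. The counter tracks pending direct I/O requests, and the request that drops it to zero is the one that would wake a waiting truncate.

```c
#include <assert.h>
#include <stdatomic.h>
#include <stdbool.h>

/* Hypothetical stand-in for the relevant part of struct inode. */
struct fake_inode { atomic_int i_dio_count; };

/* Called when a direct I/O request starts (with the inode lock held
 * in the real scheme, so it cannot race with a waiting truncate). */
static void dio_begin(struct fake_inode *inode)
{
    atomic_fetch_add(&inode->i_dio_count, 1);
}

/* Called when a request completes. Returns true if this was the last
 * pending request, i.e. a waiting truncate should be woken. */
static bool dio_done(struct fake_inode *inode)
{
    int old = atomic_fetch_sub(&inode->i_dio_count, 1);

    assert(old > 0);
    return old == 1;
}

/* Truncate may only proceed once no direct I/O is pending. */
static bool truncate_may_proceed(struct fake_inode *inode)
{
    return atomic_load(&inode->i_dio_count) == 0;
}
```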
    • fs: simplify handling of zero sized reads in __blockdev_direct_IO · f9b5570d
      Christoph Hellwig authored
      Reject zero sized reads as soon as we know our I/O length, and don't
      bother with locks or allocations that might have to be cleaned up
      otherwise.
      Signed-off-by: Christoph Hellwig <hch@lst.de>
      Signed-off-by: Al Viro <viro@zeniv.linux.org.uk>
      f9b5570d
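The early-exit idea can be illustrated with POSIX `struct iovec`. This is a hedged sketch, not the code in __blockdev_direct_IO: `total_len()` and `is_empty_read()` are hypothetical helpers. The I/O length is computed first, and a zero-length read is recognized before any locks are taken or memory is allocated, so there is nothing to undo.

```c
#include <assert.h>
#include <stddef.h>
#include <sys/uio.h>

/* Sum the lengths of all segments in the request. */
static size_t total_len(const struct iovec *iov, int nr_segs)
{
    size_t len = 0;
    int i;

    assert(nr_segs >= 0);
    for (i = 0; i < nr_segs; i++)
        len += iov[i].iov_len;
    return len;
}

/* Returns 1 if the request is a read of zero bytes and can complete
 * immediately with result 0, before any locking or allocation. */
static int is_empty_read(int is_write, const struct iovec *iov, int nr_segs)
{
    return !is_write && total_len(iov, nr_segs) == 0;
}
```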
  23. 10 Mar, 2011 2 commits
    • block: kill off REQ_UNPLUG · 721a9602
      Jens Axboe authored
      With the plugging now being explicitly controlled by the
      submitter, callers need not pass down unplugging hints
      to the block layer. If they want to unplug, it's because they
      manually plugged on their own - in which case, they should just
      unplug at will.
      Signed-off-by: Jens Axboe <jaxboe@fusionio.com>
      721a9602
    • block: remove per-queue plugging · 7eaceacc
      Jens Axboe authored
      Code has been converted over to the new explicit on-stack plugging,
      and delay users have been converted to use the new API for that.
      So let's kill off the old plugging along with aops->sync_page().
      Signed-off-by: Jens Axboe <jaxboe@fusionio.com>
      7eaceacc