1. 30 Apr, 2008 5 commits
    • fuse: fix max i/o size calculation · e5d9a0df
      Miklos Szeredi authored
      
      
      Fix a bug that Werner Baumann reported: fuse can send a bigger write request
      than the maximum specified.  This only affected direct_io operation.
      
      In addition set a sane minimum for the max_read and max_write tunables, so I/O
      always makes some progress.
      
      Signed-off-by: Miklos Szeredi <mszeredi@suse.cz>
      Signed-off-by: Andrew Morton <akpm@linux-foundation.org>
      Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>
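
The size limiting described in the commit above (e5d9a0df) can be illustrated with a small userspace sketch. It is not the kernel code: the `sketch_` names are hypothetical, and the 4096-byte floor is an assumption, since the commit message only says the minimum is "sane". The point is that each request is capped at the tunable, and that a floor on the tunable guarantees the transfer loop terminates.

```c
#include <stddef.h>

/* Hypothetical floor for illustration; the commit message does not give the value. */
#define SKETCH_MIN_IO 4096u

/* Raise a mount-time tunable to the floor so every request can carry data. */
static unsigned sketch_sanitize_max_io(unsigned requested)
{
    return requested < SKETCH_MIN_IO ? SKETCH_MIN_IO : requested;
}

/* Size of the next request: never more than max_io, never more than what is left. */
static size_t sketch_next_chunk(size_t remaining, unsigned max_io)
{
    return remaining < max_io ? remaining : max_io;
}

/* Number of requests needed to transfer `total` bytes with a raw tunable value. */
static size_t sketch_count_requests(size_t total, unsigned raw_max_io)
{
    unsigned max_io = sketch_sanitize_max_io(raw_max_io);
    size_t requests = 0;

    while (total > 0) {
        /* max_io >= SKETCH_MIN_IO > 0, so each pass removes at least one byte. */
        total -= sketch_next_chunk(total, max_io);
        requests++;
    }
    return requests;
}
```

Without the floor, a tunable of 0 would make `sketch_next_chunk()` return 0 forever and the loop would never finish, which is the "always makes some progress" concern in the message.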
    • fuse: update file size on short read · 5c5c5e51
      Miklos Szeredi authored
      
      
      If the READ request returned a short count, then either
      
        - cached size is incorrect
        - filesystem is buggy, as short reads are only allowed on EOF
      
      So assume that the size is wrong and refresh it, so that cached read() doesn't
      zero fill the missing chunk.
      
      Signed-off-by: Miklos Szeredi <mszeredi@suse.cz>
      Signed-off-by: Andrew Morton <akpm@linux-foundation.org>
      Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>
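
The short-read handling in the commit above (5c5c5e51) can be sketched in userspace as follows. This is an illustration only: the struct and function names are hypothetical, and it assumes that a short read marks end of file at `pos + returned`, which is one plausible way to "refresh" the size and is not confirmed by the commit message.

```c
#include <stddef.h>
#include <stdint.h>

/* Hypothetical stand-in for the cached per-file state. */
struct sketch_file {
    uint64_t cached_size;
};

/*
 * Called after a READ of `requested` bytes at offset `pos` returned
 * `returned` bytes.  A short count is taken to mean EOF, so a larger
 * cached size is shrunk.  Returns 1 if the cached size was changed.
 */
static int sketch_after_read(struct sketch_file *f, uint64_t pos,
                             size_t requested, size_t returned)
{
    uint64_t end = pos + returned;

    if (returned < requested && end < f->cached_size) {
        f->cached_size = end;
        return 1;
    }
    return 0;
}
```

With the size corrected, a later cached read stops at the new end of file instead of zero-filling the range the filesystem never returned.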
    • fuse: implement perform_write · ea9b9907
      Nick Piggin authored
      
      
      Introduce fuse_perform_write.  With fusexmp (a passthrough filesystem), large
      (1MB) writes into a backing tmpfs filesystem are sped up by almost 4 times
      (256MB/s vs 71MB/s).
      
      [mszeredi@suse.cz]:
      
       - split into smaller functions
       - testing
       - duplicate generic_file_aio_write(), so that there's no need to add a
         new ->perform_write() a_op.  Comment from hch.
      
      Signed-off-by: Nick Piggin <npiggin@suse.de>
      Signed-off-by: Miklos Szeredi <mszeredi@suse.cz>
      Cc: Christoph Hellwig <hch@infradead.org>
      Signed-off-by: Andrew Morton <akpm@linux-foundation.org>
      Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>
    • fuse: clean up setting i_size in write · 854512ec
      Miklos Szeredi authored
      
      
      Extract common code for setting i_size in write functions into a common
      helper.
      
      Signed-off-by: Miklos Szeredi <mszeredi@suse.cz>
      Signed-off-by: Andrew Morton <akpm@linux-foundation.org>
      Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>
    • fuse: support writable mmap · 3be5a52b
      Miklos Szeredi authored
      
      
      Quoting Linus (3 years ago, FUSE inclusion discussions):
      
        "User-space filesystems are hard to get right. I'd claim that they
         are almost impossible, unless you limit them somehow (shared
         writable mappings are the nastiest part - if you don't have those,
         you can reasonably limit your problems by limiting the number of
         dirty pages you accept through normal "write()" calls)."
      
      Instead of attempting the impossible, I've just waited for the dirty page
      accounting infrastructure to materialize (thanks to Peter Zijlstra and
      others).  This nicely solved the biggest problem: limiting the number of pages
      used for write caching.
      
      Some small details remained, however, which this largish patch attempts to
      address.  It provides a page writeback implementation for fuse, which is
      completely safe against VM related deadlocks.  Performance may not be very
      good for certain usage patterns, but generally it should be acceptable.
      
      It has been tested extensively with fsx-linux and bash-shared-mapping.
      
      Fuse page writeback design
      --------------------------
      
      fuse_writepage() allocates a new temporary page with GFP_NOFS|__GFP_HIGHMEM.
      It copies the contents of the original page, and queues a WRITE request to the
      userspace filesystem using this temp page.
      
      The writeback is finished instantly from the MM's point of view: the page is
      removed from the radix trees, and the PageDirty and PageWriteback flags are
      cleared.
      
      For the duration of the actual write, the NR_WRITEBACK_TEMP counter is
      incremented.  The per-bdi writeback count is not decremented until the actual
      write completes.
      
      On dirtying the page, fuse waits for a previous write to finish before
      proceeding.  This ensures that only one temporary page is in use at a
      time for any given cached page.
      
      This approach is wasteful in both memory and CPU bandwidth, so why is this
      complication needed?
      
      The basic problem is that there can be no guarantee about the time in which
      the userspace filesystem will complete a write.  It may be buggy or even
      malicious, and fail to complete WRITE requests.  We don't want unrelated parts
      of the system to grind to a halt in such cases.
      
      Also a filesystem may need additional resources (particularly memory) to
      complete a WRITE request.  There's a great danger of a deadlock if that
      allocation may wait for the writepage to finish.
      
      Currently there are several cases where the kernel can block on page
      writeback:
      
        - allocation order is larger than PAGE_ALLOC_COSTLY_ORDER
        - page migration
        - throttle_vm_writeout (through NR_WRITEBACK)
        - sync(2)
      
      Of course in some cases (fsync, msync) we explicitly want to allow blocking.
      So for these cases new code has to be added to fuse, since the VM is not
      tracking writeback pages for us any more.
      
      As an extra safety measure, the maximum dirty ratio allocated to a single
      fuse filesystem is set to 1% by default.  This way one (or several) buggy or
      malicious fuse filesystems cannot slow down the rest of the system by hogging
      dirty memory.
      
      With appropriate privileges, this limit can be raised through
      '/sys/class/bdi/<bdi>/max_ratio'.
      
      Signed-off-by: Miklos Szeredi <mszeredi@suse.cz>
      Cc: Peter Zijlstra <a.p.zijlstra@chello.nl>
      Signed-off-by: Andrew Morton <akpm@linux-foundation.org>
      Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>
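
The page writeback design described in the commit above (3be5a52b) can be modelled with a single-threaded userspace sketch. Every name here is hypothetical and nothing in it is kernel API: `malloc()` stands in for the temporary page allocation, a plain counter stands in for NR_WRITEBACK_TEMP, and "waiting" for a previous write is modelled as a function that refuses to proceed while a copy is still in flight.

```c
#include <stdbool.h>
#include <stdlib.h>
#include <string.h>

#define SKETCH_PAGE_SIZE 4096

/* Hypothetical model of one cached page. */
struct sketch_page {
    unsigned char data[SKETCH_PAGE_SIZE];
    bool dirty;
    unsigned char *temp;    /* in-flight copy, NULL when no write is pending */
};

/* Hypothetical stand-in for the NR_WRITEBACK_TEMP counter. */
struct sketch_stats {
    long writeback_temp;
};

/* Copy the page, "queue" the copy, and finish writeback on the original at once. */
static bool sketch_writepage(struct sketch_page *p, struct sketch_stats *st)
{
    unsigned char *copy;

    if (p->temp != NULL)
        return false;               /* at most one temporary page per cached page */
    copy = malloc(SKETCH_PAGE_SIZE);
    if (copy == NULL)
        return false;
    memcpy(copy, p->data, SKETCH_PAGE_SIZE);
    p->temp = copy;                 /* the WRITE request would use this copy */
    p->dirty = false;               /* from the MM's point of view the page is clean */
    st->writeback_temp++;           /* accounted until the real write completes */
    return true;
}

/* Userspace filesystem has answered the WRITE request. */
static void sketch_write_done(struct sketch_page *p, struct sketch_stats *st)
{
    if (p->temp == NULL)
        return;
    free(p->temp);
    p->temp = NULL;
    st->writeback_temp--;
}

/* Dirtying must wait for a pending write; here it just reports failure instead. */
static bool sketch_try_dirty(struct sketch_page *p, unsigned char value)
{
    if (p->temp != NULL)
        return false;
    memset(p->data, value, SKETCH_PAGE_SIZE);
    p->dirty = true;
    return true;
}
```

The sketch shows why a stalled userspace filesystem cannot hold the original page hostage: only the copy and the counter stay pinned until `sketch_write_done()` runs, while the cached page itself is clean immediately after `sketch_writepage()`.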
  2. 06 Feb, 2008 1 commit
  3. 29 Nov, 2007 2 commits
  4. 14 Nov, 2007 1 commit
  5. 18 Oct, 2007 6 commits
  6. 17 Oct, 2007 4 commits
  7. 16 Oct, 2007 1 commit
  8. 10 Jul, 2007 1 commit
  9. 23 May, 2007 1 commit
  10. 21 May, 2007 1 commit
    • Detach sched.h from mm.h · e8edc6e0
      Alexey Dobriyan authored
      
      
      The first thing mm.h does is include sched.h, solely for the can_do_mlock()
      inline function, which dereferences "current". By dealing with can_do_mlock(),
      mm.h can be detached from sched.h, which is good. See below for why.
      
      This patch
      a) removes unconditional inclusion of sched.h from mm.h
      b) makes can_do_mlock() a normal function in mm/mlock.c
      c) exports can_do_mlock() to not break compilation
      d) adds sched.h inclusions back to files that were getting it indirectly.
      e) adds less bloated headers to some files (asm/signal.h, jiffies.h) that were
         getting them indirectly
      
      Net result is:
      a) mm.h users would get less code to open, read, preprocess, parse, ... if
         they don't need sched.h
      b) sched.h stops being a dependency for a significant number of files:
         on x86_64 allmodconfig, touching sched.h results in a recompile of 4083
         files; after the patch it's only 3744 (-8.3%).
      
      Cross-compile tested on
      
      	all arm defconfigs, all mips defconfigs, all powerpc defconfigs,
      	alpha alpha-up
      	arm
      	i386 i386-up i386-defconfig i386-allnoconfig
      	ia64 ia64-up
      	m68k
      	mips
      	parisc parisc-up
      	powerpc powerpc-up
      	s390 s390-up
      	sparc sparc-up
      	sparc64 sparc64-up
      	um-x86_64
      	x86_64 x86_64-up x86_64-defconfig x86_64-allnoconfig
      
      as well as my two usual configs.
      
      Signed-off-by: Alexey Dobriyan <adobriyan@gmail.com>
      Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>
  11. 06 May, 2007 1 commit
  12. 11 Feb, 2007 1 commit
  13. 21 Dec, 2006 1 commit
  14. 08 Dec, 2006 1 commit
  15. 07 Dec, 2006 1 commit
  16. 03 Nov, 2006 1 commit
  17. 17 Oct, 2006 1 commit
    • [PATCH] fuse: fix hang on SMP · 9ffbb916
      Miklos Szeredi authored
      
      
      Fuse didn't always call i_size_write() with i_mutex held which caused rare
      hangs on SMP/32bit.  This bug has been present since fuse-2.2, well before
      being merged into mainline.
      
      The simplest solution is to protect i_size_write() with the per-connection
      spinlock.  Using i_mutex for this purpose would require some restructuring of
      the code and I'm not even sure it's always safe to acquire i_mutex in all
      places i_size needs to be set.
      
      Since most of vmtruncate is already duplicated for other reasons, duplicate
      the remaining part as well, making all i_size_write() calls internal to fuse.
      
      Using i_size_write() was unnecessary in fuse_init_inode(), since this function
      is only called on a newly created locked inode.
      
      Reported by a few people over the years, but special thanks to Dana Henriksen
      who was persistent enough in helping me debug it.
      
      Signed-off-by: Miklos Szeredi <miklos@szeredi.hu>
      Signed-off-by: Andrew Morton <akpm@osdl.org>
      Signed-off-by: Linus Torvalds <torvalds@osdl.org>
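
Some background on why the commit above (9ffbb916) needs a lock around i_size_write(): on 32-bit SMP kernels a 64-bit size cannot be stored in one instruction, so it is guarded by a sequence counter, and writers must be serialized or the counter can be left odd and readers spin forever. The following is a userspace sketch of that pattern using C11 atomics. All names are hypothetical, the `atomic_flag` merely stands in for the per-connection spinlock, and none of this is the kernel's actual implementation.

```c
#include <stdatomic.h>
#include <stdint.h>

/* Hypothetical model of an inode whose 64-bit size is stored as two halves. */
struct sketch_inode {
    atomic_flag lock;           /* stand-in for the per-connection spinlock */
    atomic_uint seq;            /* odd while an update is in progress */
    _Atomic uint32_t size_lo;
    _Atomic uint32_t size_hi;
};

static void sketch_inode_init(struct sketch_inode *in)
{
    atomic_flag_clear(&in->lock);
    atomic_init(&in->seq, 0u);
    atomic_init(&in->size_lo, 0u);
    atomic_init(&in->size_hi, 0u);
}

/* Writers take the lock so two of them can never interleave counter updates. */
static void sketch_size_write(struct sketch_inode *in, uint64_t size)
{
    unsigned s;

    while (atomic_flag_test_and_set_explicit(&in->lock, memory_order_acquire))
        ;                       /* spin until the lock is free */
    s = atomic_load_explicit(&in->seq, memory_order_relaxed);
    atomic_store_explicit(&in->seq, s + 1u, memory_order_relaxed);
    atomic_thread_fence(memory_order_release);
    atomic_store_explicit(&in->size_lo, (uint32_t)size, memory_order_relaxed);
    atomic_store_explicit(&in->size_hi, (uint32_t)(size >> 32), memory_order_relaxed);
    atomic_store_explicit(&in->seq, s + 2u, memory_order_release);
    atomic_flag_clear_explicit(&in->lock, memory_order_release);
}

/* Readers retry until they see the same even counter before and after. */
static uint64_t sketch_size_read(struct sketch_inode *in)
{
    unsigned before, after;
    uint32_t lo, hi;

    do {
        before = atomic_load_explicit(&in->seq, memory_order_acquire);
        lo = atomic_load_explicit(&in->size_lo, memory_order_relaxed);
        hi = atomic_load_explicit(&in->size_hi, memory_order_relaxed);
        atomic_thread_fence(memory_order_acquire);
        after = atomic_load_explicit(&in->seq, memory_order_relaxed);
    } while ((before & 1u) != 0u || before != after);

    return ((uint64_t)hi << 32) | lo;
}
```

If the lock were dropped from `sketch_size_write()`, two concurrent writers could both read the same counter value and both store `s + 1`, then `s + 2`, in an interleaving that leaves the counter odd; every later reader would then loop indefinitely, which matches the rare hang the commit describes.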
  18. 01 Oct, 2006 1 commit
  19. 14 Aug, 2006 1 commit
  20. 28 Jun, 2006 1 commit
  21. 25 Jun, 2006 5 commits
  22. 23 Jun, 2006 1 commit
    • [PATCH] vfs: add lock owner argument to flush operation · 75e1fcc0
      Miklos Szeredi authored
      
      
      Pass the POSIX lock owner ID to the flush operation.
      
      This is useful for filesystems which don't want to store any locking state
      in inode->i_flock but want to handle locking/unlocking POSIX locks
      internally.  FUSE is one such filesystem, but I think it is possible that
      some network filesystems would need this as well.
      
      Also add a flag to indicate that a POSIX locking request was generated by
      close(), so filesystems using the above feature won't send an extra locking
      request in this case.
      
      Signed-off-by: Miklos Szeredi <miklos@szeredi.hu>
      Cc: Trond Myklebust <trond.myklebust@fys.uio.no>
      Cc: Al Viro <viro@zeniv.linux.org.uk>
      Signed-off-by: Andrew Morton <akpm@osdl.org>
      Signed-off-by: Linus Torvalds <torvalds@osdl.org>
  23. 11 Apr, 2006 1 commit