1. 06 Aug, 2015 1 commit
    • fs, file table: reinit files_stat.max_files after deferred memory initialisation · 4248b0da
      Mel Gorman authored
      Dave Hansen reported the following:
      
      	My laptop has been behaving strangely with 4.2-rc2.  Once I log
      	in to my X session, I start getting all kinds of strange errors
      	from applications and see this in my dmesg:
      
              	VFS: file-max limit 8192 reached
      
      The problem is that the file-max is calculated before memory is fully
      initialised and miscalculates how much memory the kernel is using.  This
      patch recalculates file-max after deferred memory initialisation.  Note
      that using memory hotplug infrastructure would not have avoided this
      problem as the value is not recalculated after memory hot-add.
      
      4.1:             files_stat.max_files = 6582781
      4.2-rc2:         files_stat.max_files = 8192
      4.2-rc2 patched: files_stat.max_files = 6562467
      
      There are small differences between the patched kernel and 4.1, but not
      enough to matter.

      Signed-off-by: Mel Gorman <mgorman@suse.de>
      Reported-by: Dave Hansen <dave.hansen@intel.com>
      Cc: Nicolai Stange <nicstange@gmail.com>
      Cc: Dave Hansen <dave.hansen@intel.com>
      Cc: Alex Ng <alexng@microsoft.com>
      Cc: Fengguang Wu <fengguang.wu@intel.com>
      Cc: Peter Zijlstra (Intel) <peterz@infradead.org>
      Signed-off-by: Andrew Morton <akpm@linux-foundation.org>
      Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>
  2. 23 Jun, 2015 1 commit
  3. 11 Apr, 2015 1 commit
  4. 12 Oct, 2014 1 commit
  5. 07 Sep, 2014 1 commit
    • percpu_counter: add @gfp to percpu_counter_init() · 908c7f19
      Tejun Heo authored
      Percpu allocator now supports allocation mask.  Add @gfp to
      percpu_counter_init() so that !GFP_KERNEL allocation masks can be used
      with percpu_counters too.
      
      We could have left percpu_counter_init() alone and added
      percpu_counter_init_gfp(); however, the number of users isn't that
      high and introducing _gfp variants to all percpu data structures would
      be quite ugly, so let's just do the conversion.  This is the one with
      the most users.  Other percpu data structures are a lot easier to
      convert.
      
      This patch doesn't make any functional difference.
      Signed-off-by: Tejun Heo <tj@kernel.org>
      Acked-by: Jan Kara <jack@suse.cz>
      Acked-by: "David S. Miller" <davem@davemloft.net>
      Cc: x86@kernel.org
      Cc: Jens Axboe <axboe@kernel.dk>
      Cc: "Theodore Ts'o" <tytso@mit.edu>
      Cc: Alexander Viro <viro@zeniv.linux.org.uk>
      Cc: Andrew Morton <akpm@linux-foundation.org>
  6. 06 Jun, 2014 1 commit
  7. 06 May, 2014 2 commits
    • new methods: ->read_iter() and ->write_iter() · 293bc982
      Al Viro authored
      Beginning to introduce those.  Just the callers for now, and it's
      clumsier than it'll eventually become; once we finish converting
      aio_read and aio_write instances, the things will get nicer.
      
      For now, these guys are in parallel to ->aio_read() and ->aio_write();
      they take iocb and iov_iter, with everything in iov_iter already
      validated.  File offset is passed in iocb->ki_pos, iov/nr_segs -
      in iov_iter.
      
      Main concerns in that series are stack footprint and ability to
      split the damn thing cleanly.
      
      [fix from Peter Ujfalusi <peter.ujfalusi@ti.com> folded]
      Signed-off-by: Al Viro <viro@zeniv.linux.org.uk>
    • replace checking for ->read/->aio_read presence with check in ->f_mode · 7f7f25e8
      Al Viro authored
      Since we are about to introduce new methods (read_iter/write_iter), the
      tests in a bunch of places would have to grow inconveniently.  Check
      once (at open() time) and store results in ->f_mode as FMODE_CAN_READ
      and FMODE_CAN_WRITE resp.  It might end up being a temporary measure -
      once everything switches from ->aio_{read,write} to ->{read,write}_iter
      it might make sense to return to open-coded checks.  We'll see...
      Signed-off-by: Al Viro <viro@zeniv.linux.org.uk>
  8. 01 Apr, 2014 4 commits
  9. 31 Mar, 2014 1 commit
  10. 10 Mar, 2014 1 commit
    • vfs: atomic f_pos accesses as per POSIX · 9c225f26
      Linus Torvalds authored
      Our write() system call has always been atomic in the sense that you get
      the expected thread-safe contiguous write, but we haven't actually
      guaranteed that concurrent writes are serialized wrt f_pos accesses, so
      threads (or processes) that share a file descriptor and use "write()"
      concurrently would quite likely overwrite each other's data.
      
      This violates POSIX.1-2008/SUSv4 Section XSI 2.9.7 that says:
      
       "2.9.7 Thread Interactions with Regular File Operations
      
        All of the following functions shall be atomic with respect to each
        other in the effects specified in POSIX.1-2008 when they operate on
        regular files or symbolic links: [...]"
      
      and one of the effects is the file position update.
      
      This unprotected file position behavior is not new behavior, and nobody
      has ever cared.  Until now.  Yongzhi Pan reported unexpected behavior to
      Michael Kerrisk that was due to this.
      
      This resolves the issue with a f_pos-specific lock that is taken by
      read/write/lseek on file descriptors that may be shared across threads
      or processes.
      Reported-by: Yongzhi Pan <panyongzhi@gmail.com>
      Reported-by: Michael Kerrisk <mtk.manpages@gmail.com>
      Cc: Al Viro <viro@zeniv.linux.org.uk>
      Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>
      Signed-off-by: Al Viro <viro@zeniv.linux.org.uk>
  11. 08 Nov, 2013 1 commit
  12. 24 Oct, 2013 1 commit
  13. 20 Oct, 2013 1 commit
    • nfsd regression since delayed fput() · c7314d74
      Al Viro authored
      Background: nfsd v[23] had throughput regression since delayed fput
      went in; every read or write ends up doing fput() and we get a pair
      of extra context switches out of that (plus quite a bit of work
      in queue_work itself, apparently).  Use of schedule_delayed_work()
      gives it a chance to accumulate a bit before we do __fput() on all
      of them.  I'm not too happy about that solution, but... on at least
      one real-world setup it reverts about 10% throughput loss we got from
      switch to delayed fput.
      Signed-off-by: Al Viro <viro@zeniv.linux.org.uk>
  14. 11 Sep, 2013 1 commit
  15. 03 Sep, 2013 1 commit
  16. 13 Jul, 2013 2 commits
  17. 29 Jun, 2013 1 commit
  18. 14 Jun, 2013 1 commit
    • fput: task_work_add() can fail if the caller has passed exit_task_work() · e7b2c406
      Oleg Nesterov authored
      fput() assumes that it can't be called after exit_task_work() but
      this is not true; for example, free_ipc_ns()->shm_destroy() can do
      this. In this case fput() silently leaks the file.
      
      Change it to fall back to delayed_fput_work if task_work_add() fails.
      The patch looks complicated but it is not; it changes the code from
      
      	if (PF_KTHREAD) {
      		schedule_work(...);
      		return;
      	}
      	task_work_add(...)
      
      to
      	if (!PF_KTHREAD) {
      		if (!task_work_add(...))
      			return;
      		/* fallback */
      	}
      	schedule_work(...);
      
      As for shm_destroy() in particular, we could make another fix but I
      think this change makes sense anyway. There could be another similar
      user; it is not safe to assume that task_work_add() can't fail.
      Reported-by: Andrey Vagin <avagin@openvz.org>
      Signed-off-by: Oleg Nesterov <oleg@redhat.com>
      Signed-off-by: Al Viro <viro@zeniv.linux.org.uk>
  19. 01 Mar, 2013 1 commit
  20. 22 Feb, 2013 3 commits
  21. 20 Dec, 2012 1 commit
    • fs: Fix imbalance in freeze protection in mark_files_ro() · 72651cac
      Jan Kara authored
      File descriptors (even those for writing) do not hold freeze protection.
      Thus mark_files_ro() must call __mnt_drop_write() to only drop protection
      against remount read-only. Calling mnt_drop_write_file() as we do now
      results in:
      
      [ BUG: bad unlock balance detected! ]
      3.7.0-rc6-00028-g88e75b6 #101 Not tainted
      -------------------------------------
      kworker/1:2/79 is trying to release lock (sb_writers) at:
      [<ffffffff811b33b4>] mnt_drop_write+0x24/0x30
      but there are no more locks to release!
      Reported-by: Zdenek Kabelac <zkabelac@redhat.com>
      CC: stable@vger.kernel.org
      Signed-off-by: Jan Kara <jack@suse.cz>
      Signed-off-by: Al Viro <viro@zeniv.linux.org.uk>
  22. 09 Oct, 2012 1 commit
  23. 26 Sep, 2012 1 commit
  24. 07 Sep, 2012 1 commit
  25. 30 Jul, 2012 1 commit
  26. 29 Jul, 2012 1 commit
  27. 22 Jul, 2012 1 commit
    • switch fput to task_work_add · 4a9d4b02
      Al Viro authored
      ... and schedule_work() for interrupt/kernel_thread callers
      (and yes, now it *is* OK to call from interrupt).
      
      We are guaranteed that __fput() will be done before we return
      to userland (or exit).  Note that for fput() from a kernel
      thread we get an async behaviour; it's almost always OK, but
      sometimes you might need to have __fput() completed before
      you do anything else.  There are two mechanisms for that -
      a general barrier (flush_delayed_fput()) and explicit
      __fput_sync().  Both should be used with care (as was the
      case for fput() from kernel threads all along).  See comments
      in fs/file_table.c for details.
      Signed-off-by: Al Viro <viro@zeniv.linux.org.uk>
  28. 14 Jul, 2012 1 commit
  29. 29 May, 2012 2 commits
    • brlocks/lglocks: API cleanups · 962830df
      Andi Kleen authored
      lglocks and brlocks are currently generated with some complicated macros
      in lglock.h.  But there's no reason to not just use common utility
      functions and put all the data into a common data structure.
      
      In preparation, this patch changes the API to look more like normal
      function calls with pointers, not magic macros.
      
      The patch is rather large because I move over all users in one go to keep
      it bisectable.  This impacts the VFS somewhat in terms of lines changed.
      But no actual behaviour change.
      
      [akpm@linux-foundation.org: checkpatch fixes]
      Signed-off-by: Andi Kleen <ak@linux.intel.com>
      Cc: Al Viro <viro@zeniv.linux.org.uk>
      Cc: Rusty Russell <rusty@rustcorp.com.au>
      Signed-off-by: Andrew Morton <akpm@linux-foundation.org>
      Signed-off-by: Rusty Russell <rusty@rustcorp.com.au>
      Signed-off-by: Al Viro <viro@zeniv.linux.org.uk>
    • brlocks/lglocks: turn into functions · eea62f83
      Andi Kleen authored
      lglocks and brlocks are currently generated with some complicated macros
      in lglock.h.  But there's no reason to not just use common utility
      functions and put all the data into a common data structure.
      
      Since there are at least two users it makes sense to share this code in a
      library.  This is also easier to maintain than a macro forest.
      
      This will also make it possible later to dynamically allocate lglocks and
      to use them in modules (both would still need some additional, but now
      straightforward, code).
      
      [akpm@linux-foundation.org: checkpatch fixes]
      Signed-off-by: Andi Kleen <ak@linux.intel.com>
      Cc: Al Viro <viro@zeniv.linux.org.uk>
      Cc: Rusty Russell <rusty@rustcorp.com.au>
      Signed-off-by: Andrew Morton <akpm@linux-foundation.org>
      Signed-off-by: Rusty Russell <rusty@rustcorp.com.au>
      Signed-off-by: Al Viro <viro@zeniv.linux.org.uk>
  30. 20 Mar, 2012 1 commit
  31. 06 Jan, 2012 1 commit
  32. 26 Jul, 2011 1 commit