1. 16 Oct, 2007 3 commits
    • fs: introduce write_begin, write_end, and perform_write aops · afddba49
      Nick Piggin authored
      These are intended to replace prepare_write and commit_write with more
      flexible alternatives that are also able to avoid the buffered write
      deadlock problems efficiently (which prepare_write is unable to do).
      
      [mark.fasheh@oracle.com: API design contributions, code review and fixes]
      [akpm@linux-foundation.org: various fixes]
      [dmonakhov@sw.ru: new aop block_write_begin fix]
      Signed-off-by: Nick Piggin <npiggin@suse.de>
      Signed-off-by: Mark Fasheh <mark.fasheh@oracle.com>
      Signed-off-by: Dmitriy Monakhov <dmonakhov@openvz.org>
      Signed-off-by: Andrew Morton <akpm@linux-foundation.org>
      Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>
    • mm: fix pagecache write deadlocks · 08291429
      Nick Piggin authored
      Modify the core write() code so that it won't take a pagefault while holding a
      lock on the pagecache page. There are a number of different deadlocks possible
      if we try to do such a thing:
      
      1.  generic_buffered_write
      2.   lock_page
      3.    prepare_write
      4.     unlock_page+vmtruncate
      5.     copy_from_user
      6.      mmap_sem(r)
      7.       handle_mm_fault
      8.        lock_page (filemap_nopage)
      9.    commit_write
      10.  unlock_page
      
      a. sys_munmap / sys_mlock / others
      b.  mmap_sem(w)
      c.   make_pages_present
      d.    get_user_pages
      e.     handle_mm_fault
      f.      lock_page (filemap_nopage)
      
      2,8	- recursive deadlock if page is same
      2,8;2,8	- ABBA deadlock if page is different
      2,6;b,f	- ABBA deadlock if page is same
      
      The solution is as follows:
      1.  If we find the destination page is uptodate, continue as normal, but use
          atomic usercopies which do not take pagefaults and do not zero the uncopied
          tail of the destination. The destination is already uptodate, so we can
          commit_write the full length even if there was a partial copy: it does not
          matter that the tail was not modified, because if it is dirtied and written
          back to disk it will not cause any problems (uptodate *means* that the
          destination page is as new or newer than the copy on disk).
      
      1a. The above requires that fault_in_pages_readable correctly returns access
          information, because atomic usercopies cannot distinguish non-present
          pages in a readable mapping from the lack of a readable mapping.
      
      2.  If we find the destination page is non uptodate, unlock it (this could be
          made slightly more optimal), then allocate a temporary page to copy the
          source data into. Relock the destination page and continue with the copy.
          However, instead of a usercopy (which might take a fault), copy the data
          from the pinned temporary page via the kernel address space.
      
      (also, rename maxlen to seglen, because it was confusing)
      
      This increases the CPU/memory copy cost by almost 50% on the affected
      workloads. That will be solved by introducing a new set of pagecache write
      aops in a subsequent patch.
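
      The two copy paths described in the solution above can be sketched as a
      small userspace model. This is only an illustration under stated
      assumptions, not the kernel code from this commit: struct sketch_page,
      sketch_copy_atomic, sketch_write_page and the "resident" parameter are all
      hypothetical names, locking is reduced to comments, and the temporary-page
      copy is modelled as always completing.

```c
#include <assert.h>
#include <stddef.h>
#include <string.h>

#define SKETCH_PAGE_SIZE 4096

/* Hypothetical stand-in for a pagecache page; not the kernel's struct page. */
struct sketch_page {
    int uptodate;                           /* models the uptodate state */
    unsigned char data[SKETCH_PAGE_SIZE];
};

/* copied: user bytes actually consumed; committed: length passed to commit. */
struct sketch_result {
    size_t copied;
    size_t committed;
};

/*
 * Models an atomic usercopy: it never faults, so it stops after `resident`
 * bytes (the part of the source assumed to be present in memory). The
 * uncopied tail of dst is left untouched, not zeroed.
 */
static size_t sketch_copy_atomic(unsigned char *dst, const unsigned char *src,
                                 size_t len, size_t resident)
{
    size_t n = len < resident ? len : resident;
    memcpy(dst, src, n);
    return n;
}

static struct sketch_result sketch_write_page(struct sketch_page *page,
                                              size_t offset,
                                              const unsigned char *src,
                                              size_t len, size_t resident)
{
    struct sketch_result r = { 0, 0 };
    unsigned char tmp[SKETCH_PAGE_SIZE];

    if (offset > SKETCH_PAGE_SIZE || len > SKETCH_PAGE_SIZE - offset)
        return r;                           /* write would not fit in the page */

    if (page->uptodate) {
        /* Case 1: page stays locked; copy atomically, commit the full length. */
        r.copied = sketch_copy_atomic(page->data + offset, src, len, resident);
        r.committed = len;                  /* stale tail is still valid data */
        return r;
    }

    /*
     * Case 2: the page would be unlocked here, so this copy may fault safely.
     * Assumption: the copy into the temporary buffer always completes.
     */
    memcpy(tmp, src, len);
    /* Page relocked: copy from the pinned temporary buffer, which cannot fault. */
    memcpy(page->data + offset, tmp, len);
    r.copied = len;
    r.committed = len;
    return r;
}
```

      In case 2 the data is copied twice (source to temporary buffer, then
      temporary buffer to page), which is the extra copy cost the message
      mentions for the affected workloads.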
      Signed-off-by: Nick Piggin <npiggin@suse.de>
      Signed-off-by: Andrew Morton <akpm@linux-foundation.org>
      Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>
    • filemap: convert some unsigned long to pgoff_t · 57f6b96c
      Fengguang Wu authored
      Convert some 'unsigned long' to pgoff_t.
      Signed-off-by: Fengguang Wu <wfg@mail.ustc.edu.cn>
      Cc: Rusty Russell <rusty@rustcorp.com.au>
      Signed-off-by: Andrew Morton <akpm@linux-foundation.org>
      Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>
  2. 08 May, 2007 1 commit
  3. 07 May, 2007 1 commit
  4. 09 Feb, 2007 1 commit
  5. 28 Oct, 2006 1 commit
  6. 26 Sep, 2006 1 commit
    • [PATCH] mm: non syncing lock_page() · db37648c
      Nick Piggin authored
      lock_page needs the caller to have a reference on the page->mapping inode
      due to sync_page, ergo set_page_dirty_lock is obviously buggy according to
      its comments.
      
      Solve it by introducing a new lock_page_nosync which does not do a sync_page.
      
      akpm: unpleasant solution to an unpleasant problem.  If it goes wrong it could
      cause great slowdowns while the lock_page() caller waits for kblockd to
      perform the unplug.  And if a filesystem has special sync_page() requirements
      (none presently do), permanent hangs are possible.
      
      otoh, set_page_dirty_lock() is usually (always?) called against userspace
      pages.  They are always up-to-date, so there shouldn't be any pending read I/O
      against these pages.
      Signed-off-by: Nick Piggin <npiggin@suse.de>
      Signed-off-by: Andrew Morton <akpm@osdl.org>
      Signed-off-by: Linus Torvalds <torvalds@osdl.org>
  7. 30 Jun, 2006 1 commit
  8. 23 Jun, 2006 1 commit
  9. 27 Apr, 2006 1 commit
  10. 31 Mar, 2006 1 commit
  11. 24 Mar, 2006 1 commit
    • [PATCH] cpuset memory spread page cache implementation and hooks · 44110fe3
      Paul Jackson authored
      Change the page cache allocation calls to support cpuset memory spreading.
      
      See the previous patch, cpuset_mem_spread, for an explanation of cpuset memory
      spreading.
      
      On systems without cpusets configured in the kernel, this is no change.
      
      On systems with cpusets configured in the kernel, but the "memory_spread"
      cpuset option not enabled for the current task's cpuset, this adds a call to
      a cpuset routine and a failed bit test of the processor state flag
      PF_SPREAD_PAGE.
      
      On tasks in cpusets with "memory_spread" enabled, this adds a call to a cpuset
      routine that computes which of the task's mems_allowed nodes should be
      preferred for this allocation.
      
      If memory spreading applies to a particular allocation, then any other NUMA
      mempolicy does not apply.
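
      The decision described above can be sketched as a userspace model. It is a
      hedged illustration, not the kernel implementation: struct sketch_task,
      SKETCH_PF_SPREAD_PAGE (including its bit value), sketch_spread_node and
      sketch_pick_node are hypothetical, and the round-robin choice among
      allowed nodes is an assumption about how the preferred node is computed.

```c
#include <assert.h>

#define SKETCH_PF_SPREAD_PAGE 0x1u   /* hypothetical bit value for the task flag */
#define SKETCH_MAX_NODES      32

/* Hypothetical per-task state; not the kernel's task_struct. */
struct sketch_task {
    unsigned int flags;          /* holds SKETCH_PF_SPREAD_PAGE when spreading */
    unsigned int mems_allowed;   /* bitmask: bit n set means node n is allowed */
    int rotor;                   /* last node handed out by the spreader */
};

/*
 * Assumed policy: walk the allowed nodes round-robin, starting after the
 * node picked last time. Returns -1 if no node is allowed.
 */
static int sketch_spread_node(struct sketch_task *t)
{
    for (int i = 1; i <= SKETCH_MAX_NODES; i++) {
        int node = (t->rotor + i) % SKETCH_MAX_NODES;
        if (t->mems_allowed & (1u << node)) {
            t->rotor = node;
            return node;
        }
    }
    return -1;
}

/*
 * Picks the node for one page cache allocation. `default_node` stands for
 * whatever the normal allocation policy would have chosen; it is ignored
 * when spreading applies, matching "any other NUMA mempolicy does not apply".
 */
static int sketch_pick_node(struct sketch_task *t, int default_node)
{
    if (t->flags & SKETCH_PF_SPREAD_PAGE) {   /* the cheap bit test */
        int n = sketch_spread_node(t);
        if (n >= 0)
            return n;
    }
    return default_node;
}
```

      With spreading disabled the only added cost in this model is the flag
      test, which mirrors the "failed bit test" case in the message.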
      Signed-off-by: Paul Jackson <pj@sgi.com>
      Signed-off-by: Andrew Morton <akpm@osdl.org>
      Signed-off-by: Linus Torvalds <torvalds@osdl.org>
  12. 13 Nov, 2005 1 commit
  13. 28 Oct, 2005 2 commits
  14. 08 Oct, 2005 1 commit
  15. 21 Jun, 2005 1 commit
    • [PATCH] VM: add __GFP_NORECLAIM · 0c35bbad
      Martin Hicks authored
      When using the early zone reclaim, it was noticed that allocating new pages
      that should be spread across the whole system caused eviction of local pages.
      
      This adds a new GFP flag to prevent early reclaim from happening during
      certain allocation attempts.  The example that is implemented here is for page
      cache pages.  We want page cache pages to be spread across the whole system,
      and we don't want page cache pages to evict other pages to get local memory.
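
      The effect of the flag can be sketched with a simplified zone-fallback
      model. This is an assumption-laden illustration rather than the
      allocator's real logic: struct sketch_zone, SKETCH_GFP_NORECLAIM (and its
      bit value) and sketch_alloc_zone are hypothetical, and early reclaim is
      reduced to a boolean that records whether local pages would be evicted.

```c
#include <assert.h>
#include <stdbool.h>

#define SKETCH_GFP_NORECLAIM 0x1u   /* hypothetical bit value for the flag */

/* Hypothetical zone description; zones are listed with the local one first. */
struct sketch_zone {
    int node;
    unsigned long free_pages;
};

/*
 * Returns the index of the zone that satisfies the allocation, or -1.
 * *reclaimed is set when early reclaim would evict pages from a full zone
 * instead of falling back to the next zone.
 */
static int sketch_alloc_zone(const struct sketch_zone *zones, int nzones,
                             unsigned int gfp, bool *reclaimed)
{
    *reclaimed = false;
    for (int i = 0; i < nzones; i++) {
        if (zones[i].free_pages > 0)
            return i;                       /* free memory, no reclaim needed */
        if (!(gfp & SKETCH_GFP_NORECLAIM)) {
            *reclaimed = true;              /* evict local pages to stay local */
            return i;
        }
        /* Flag set: skip early reclaim and try the next zone instead. */
    }
    return -1;
}
```

      In this model a page cache allocation would pass the flag, so a full
      local zone leads to a remote allocation rather than to eviction of local
      pages.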
      Signed-off-by: Martin Hicks <mort@sgi.com>
      Signed-off-by: Andrew Morton <akpm@osdl.org>
      Signed-off-by: Linus Torvalds <torvalds@osdl.org>
  16. 16 Apr, 2005 1 commit
    • Linux-2.6.12-rc2 · 1da177e4
      Linus Torvalds authored
      Initial git repository build. I'm not bothering with the full history,
      even though we have it. We can create a separate "historical" git
      archive of that later if we want to, and in the meantime it's about
      3.2GB when imported into git - space that would just make the early
      git days unnecessarily complicated, when we don't have a lot of good
      infrastructure for it.
      
      Let it rip!