1. 28 Jun, 2006 1 commit
  2. 27 Jun, 2006 1 commit
  3. 23 Jun, 2006 1 commit
    • [PATCH] Kill PF_SYNCWRITE flag · b31dc66a
      Jens Axboe authored
      A process flag to indicate whether we are doing sync io is incredibly
      ugly. It also causes performance problems when one does a lot of async
      io and then proceeds to sync it. Part of the io will go out as async,
      and the other part as sync. This causes a disconnect between the
      previously submitted io and the synced io. For io schedulers such as CFQ,
      this will cause us lost merges and suboptimal behaviour in scheduling.
      Remove PF_SYNCWRITE completely from the fsync/msync paths, and let
      the O_DIRECT path just directly indicate that the writes are sync
      by using WRITE_SYNC instead.
      Signed-off-by: Jens Axboe <axboe@suse.de>
  4. 27 Mar, 2006 1 commit
  5. 26 Mar, 2006 5 commits
  6. 25 Mar, 2006 1 commit
  7. 24 Mar, 2006 4 commits
  8. 23 Mar, 2006 1 commit
  9. 22 Mar, 2006 1 commit
    • [PATCH] page migration reorg · b20a3503
      Christoph Lameter authored
      Centralize the page migration functions in anticipation of additional
      tinkering.  Creates a new file mm/migrate.c
      1. Extract buffer_migrate_page() from fs/buffer.c
      2. Extract central migration code from vmscan.c
      3. Extract some components from mempolicy.c
      4. Export pageout() and remove_from_swap() from vmscan.c
      5. Make it possible to configure NUMA systems without page migration
         and non-NUMA systems with page migration.
      I had to do some #ifdeffing in mempolicy.c that may need a cleanup.
      Signed-off-by: Christoph Lameter <clameter@sgi.com>
      Signed-off-by: Andrew Morton <akpm@osdl.org>
      Signed-off-by: Linus Torvalds <torvalds@osdl.org>
  10. 14 Mar, 2006 1 commit
  11. 03 Feb, 2006 1 commit
  12. 01 Feb, 2006 2 commits
  13. 16 Jan, 2006 1 commit
  14. 14 Jan, 2006 1 commit
  15. 11 Jan, 2006 1 commit
  16. 09 Jan, 2006 1 commit
  17. 08 Jan, 2006 3 commits
    • [PATCH] fix possible PAGE_CACHE_SHIFT overflows · 54b21a79
      Andrew Morton authored
      We've had two instances recently of overflows when doing
      	64_bit_value = (32_bit_value << PAGE_CACHE_SHIFT)
      I did a tree-wide grep of `<<.*PAGE_CACHE_SHIFT' and this is the result.
      - afs_rxfs_fetch_descriptor.offset is of type off_t, which seems broken.
      - jfs and jffs are limited to 4GB anyway.
      - reiserfs map_block_for_writepage() takes an unsigned long for the block -
        it should take sector_t.  (It'll fail for huge filesystems with
      - cramfs_read() needs to use sector_t (I think cramfs is busted on large
        filesystems anyway)
      - affs is limited in file size anyway.
      - I generally didn't fix 32-bit overflows in directory operations.
      - arm's __flush_dcache_page() is peculiar.  What if the page lies beyond 4G?
      - gss_wrap_req_priv() needs checking (snd_buf->page_base)
      Cc: Oleg Drokin <green@linuxhacker.ru>
      Cc: David Howells <dhowells@redhat.com>
      Cc: David Woodhouse <dwmw2@infradead.org>
      Cc: <reiserfs-dev@namesys.com>
      Cc: Christoph Hellwig <hch@lst.de>
      Cc: Anton Altaparmakov <aia21@cantab.net>
      Cc: Jeff Dike <jdike@addtoit.com>
      Cc: Paolo 'Blaisorblade' Giarrusso <blaisorblade@yahoo.it>
      Cc: Roman Zippel <zippel@linux-m68k.org>
      Cc: <linux-fsdevel@vger.kernel.org>
      Cc: Miklos Szeredi <miklos@szeredi.hu>
      Cc: Russell King <rmk@arm.linux.org.uk>
      Cc: Trond Myklebust <trond.myklebust@fys.uio.no>
      Cc: Neil Brown <neilb@cse.unsw.edu.au>
      Signed-off-by: Andrew Morton <akpm@osdl.org>
      Signed-off-by: Linus Torvalds <torvalds@osdl.org>
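The overflow pattern named in the PAGE_CACHE_SHIFT commit above can be reproduced in user space. The following is a minimal sketch, not kernel code: the shift value of 12 (4 KiB pages) and both function names are assumptions made for illustration.

```c
#include <stdint.h>

/* Assumed value for illustration: 4 KiB pages. */
#define PAGE_CACHE_SHIFT 12

/*
 * Buggy pattern: the shift is evaluated in 32 bits, so the high bits
 * are lost before the result is widened to 64 bits.
 */
uint64_t offset_truncated(uint32_t index)
{
    return index << PAGE_CACHE_SHIFT;
}

/* Fixed pattern: widen to 64 bits first, then shift. */
uint64_t offset_widened(uint32_t index)
{
    return (uint64_t)index << PAGE_CACHE_SHIFT;
}
```

With a page index of 0x00100000 (an offset of exactly 4 GiB), the first function wraps to 0 while the second yields 0x100000000.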
    • [PATCH] Fix and add EXPORT_SYMBOL(filemap_write_and_wait) · 28fd1298
      OGAWA Hirofumi authored
      This patch adds EXPORT_SYMBOL(filemap_write_and_wait) and uses it.
      See mm/filemap.c:
      It also changes filemap_write_and_wait() and filemap_write_and_wait_range().
      The current filemap_write_and_wait() doesn't wait if filemap_fdatawrite()
      returns an error.  However, even if filemap_fdatawrite() returned an
      error, it may have submitted some of the data pages to the device
      (e.g. in the case of -ENOSPC).
      Andrew Morton writes,
      If filemap_fdatawrite() returns an error, this might be due to some
      I/O problem: dead disk, unplugged cable, etc.  Given the generally
      crappy quality of the kernel's handling of such exceptions, there's a
      good chance that the filemap_fdatawait() will get stuck in D state
      So, this patch doesn't wait if filemap_fdatawrite() returns -EIO.
      Trond, could you please review the nfs part?  In particular, I'm not sure
      whether nfs must use "filemap_fdatawrite(inode->i_mapping) == 0" or not.
      Acked-by: Trond Myklebust <trond.myklebust@fys.uio.no>
      Signed-off-by: Andrew Morton <akpm@osdl.org>
      Signed-off-by: Linus Torvalds <torvalds@osdl.org>
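The wait policy described in the filemap_write_and_wait() commit above (wait after any write error except -EIO, and keep the first error) can be modelled in user space. This is a hedged sketch only: struct fake_mapping, fake_fdatawrite(), fake_fdatawait() and write_and_wait() are hypothetical stand-ins, not the kernel's functions.

```c
#include <errno.h>

/* Hypothetical model of an address space; not the kernel structure. */
struct fake_mapping {
    int nrpages;       /* number of cached pages */
    int write_result;  /* what the fake write-out step returns */
    int wait_result;   /* what the fake wait step returns */
    int waited;        /* set to 1 once the wait step has run */
};

/* Stand-in for filemap_fdatawrite(). */
static int fake_fdatawrite(struct fake_mapping *m)
{
    return m->write_result;
}

/* Stand-in for filemap_fdatawait(). */
static int fake_fdatawait(struct fake_mapping *m)
{
    m->waited = 1;
    return m->wait_result;
}

int write_and_wait(struct fake_mapping *m)
{
    int err = 0;

    if (m->nrpages) {
        err = fake_fdatawrite(m);
        /*
         * Pages may be in flight even after an error such as -ENOSPC,
         * so wait for them.  -EIO is skipped because waiting on a dead
         * device could block indefinitely.
         */
        if (err != -EIO) {
            int err2 = fake_fdatawait(m);
            if (!err)
                err = err2;  /* report the first error seen */
        }
    }
    return err;
}
```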
    • [PATCH] fat: support a truncate() for expanding size (generic_cont_expand) · 05eb0b51
      OGAWA Hirofumi authored
      This patch changes generic_cont_expand(), in order to share the code
      with fatfs.
        - Use vmtruncate() if ->prepare_write() returns an error.
      Even if ->prepare_write() returns an error, it may already have added some
      blocks.  So, this truncates blocks outside of ->i_size by vmtruncate().
        - Add generic_cont_expand_simple().
      The generic_cont_expand_simple() assumes that ->prepare_write() can handle
      the block boundary.  With this, we don't need to care about the extra byte.
      And for expanding a file size by truncate(), fatfs uses the
      added generic_cont_expand_simple().
      Signed-off-by: OGAWA Hirofumi <hirofumi@mail.parknet.co.jp>
      Signed-off-by: Andrew Morton <akpm@osdl.org>
      Signed-off-by: Linus Torvalds <torvalds@osdl.org>
  18. 07 Nov, 2005 1 commit
  19. 30 Oct, 2005 2 commits
    • [PATCH] __bread oops fix · a3e713b5
      Andrew Morton authored
      If a filesystem passes an idiotic blocksize into bread(), __getblk_slow() will
      warn and will return NULL.  We have a report (from Hubert Tonneau
      <hubert.tonneau@fullpliant.org>) of isofs_fill_super() doing this (passing in
      a silly block size) against an unplugged CDROM drive.
      But a couple of __getblk_slow() callers forgot to check for the NULL bh, hence
      Signed-off-by: Andrew Morton <akpm@osdl.org>
      Signed-off-by: Linus Torvalds <torvalds@osdl.org>
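The class of bug in the __bread commit above is a caller dereferencing a buffer pointer that a lower layer returned as NULL. A minimal user-space sketch follows; struct fake_bh, fake_getblk(), fake_bread() and the size limits are all assumptions for illustration, not the kernel's actual checks.

```c
#include <stddef.h>

/* Hypothetical buffer descriptor. */
struct fake_bh {
    int uptodate;
};

/*
 * Stand-in for the block lookup: refuses a nonsensical block size and
 * returns NULL.  The limits (512..4096, power of two) are assumed.
 */
static struct fake_bh *fake_getblk(struct fake_bh *storage, unsigned size)
{
    if (size < 512 || size > 4096 || (size & (size - 1)))
        return NULL;
    return storage;
}

struct fake_bh *fake_bread(struct fake_bh *storage, unsigned size)
{
    struct fake_bh *bh = fake_getblk(storage, size);

    /* The NULL check comes first; only then is the buffer touched. */
    if (bh && !bh->uptodate)
        bh->uptodate = 1;  /* models reading the block from disk */
    return bh;
}
```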
    • [PATCH] ext3: Fix unmapped buffers in transaction's lists · aaa4059b
      Jan Kara authored
      Fix the problem (BUG 4964) with unmapped buffers in transaction's
      t_sync_data list.  The problem is we need to call filesystem's own
      invalidatepage() from block_write_full_page().
      block_write_full_page() must call filesystem's invalidatepage().  Otherwise
      following nasty race can happen:
         proc 1                                        proc 2
         ------                                        ------
      - write some new data to 'offset'
        => bh gets to the transactions data list
                                                    - starts truncate
                                                      => i_size set to new size
      - mpage_writepages()
        - ext3_ordered_writepage() to 'offset'
          - block_write_full_page()
            - page->index > end_index+1
              - block_invalidatepage()
                - discard_buffer()
                  - clear_buffer_mapped()
      - commit triggers and finds unmapped buffer - BOOM!
      Signed-off-by: Jan Kara <jack@suse.cz>
      Signed-off-by: Andrew Morton <akpm@osdl.org>
      Signed-off-by: Linus Torvalds <torvalds@osdl.org>
  20. 29 Oct, 2005 1 commit
    • [PATCH] mm: split page table lock · 4c21e2f2
      Hugh Dickins authored
      Christoph Lameter demonstrated very poor scalability on the SGI 512-way, with
      a many-threaded application which concurrently initializes different parts of
      a large anonymous area.
      This patch corrects that, by using a separate spinlock per page table page, to
      guard the page table entries in that page, instead of using the mm's single
      page_table_lock.  (But even then, page_table_lock is still used to guard page
      table allocation, and anon_vma allocation.)
      In this implementation, the spinlock is tucked inside the struct page of the
      page table page: with a BUILD_BUG_ON in case it overflows - which it would in
      the case of 32-bit PA-RISC with spinlock debugging enabled.
      Splitting the lock is not quite for free: another cacheline access.  Ideally,
      I suppose we would use split ptlock only for multi-threaded processes on
      multi-cpu machines; but deciding that dynamically would have its own costs.
      So for now enable it by config, at some number of cpus - since the Kconfig
      language doesn't support inequalities, let preprocessor compare that with
      NR_CPUS.  But I don't think it's worth being user-configurable: for good
      testing of both split and unsplit configs, split now at 4 cpus, and perhaps
      change that to 8 later.
      There is a benefit even for singly threaded processes: kswapd can be attacking
      one part of the mm while another part is busy faulting.
      Signed-off-by: Hugh Dickins <hugh@veritas.com>
      Signed-off-by: Andrew Morton <akpm@osdl.org>
      Signed-off-by: Linus Torvalds <torvalds@osdl.org>
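The split page table lock commit above describes two ideas: one lock per page table page instead of one per mm, and a build-time choice made by letting the preprocessor compare a threshold with NR_CPUS. The sketch below models both in user space. The types, the lock representation, the NR_CPUS value of 8 and the name lock_for_page() are assumptions; only the threshold of 4 comes from the commit text.

```c
/* Assumed build-time values for illustration. */
#define NR_CPUS 8
#define SPLIT_PTLOCK_CPUS 4  /* "split now at 4 cpus" */

/* The preprocessor does the comparison that Kconfig cannot express. */
#if NR_CPUS >= SPLIT_PTLOCK_CPUS
#define USE_SPLIT_PTLOCK 1
#else
#define USE_SPLIT_PTLOCK 0
#endif

/* Hypothetical lock; a real implementation would use a spinlock. */
struct fake_lock {
    int locked;
};

/* Models a page table page with its own embedded lock. */
struct fake_pt_page {
    struct fake_lock ptl;
    long entries[4];
};

/* Models the mm, which keeps the single coarse lock. */
struct fake_mm {
    struct fake_lock page_table_lock;
    struct fake_pt_page pages[2];
};

/* Returns the lock that guards the entries in the given page. */
struct fake_lock *lock_for_page(struct fake_mm *mm, struct fake_pt_page *page)
{
#if USE_SPLIT_PTLOCK
    (void)mm;
    return &page->ptl;            /* fine-grained: one lock per page */
#else
    (void)page;
    return &mm->page_table_lock;  /* coarse: one lock for the whole mm */
#endif
}
```

With the split enabled, two threads working on different page table pages take different locks and no longer contend.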
  21. 28 Oct, 2005 2 commits
    • [PATCH] gfp_t: fs/* · 27496a8c
      Al Viro authored
       - ->releasepage() annotated (s/int/gfp_t/), instances updated
       - missing gfp_t in fs/* added
       - fixed misannotation from the original sweep caught by bitwise checks:
         XFS used __nocast both for gfp_t and for flags used by XFS allocator.
         The latter left with unsigned int __nocast; we might want to add a
         different type for those but for now let's leave them alone.  That,
         BTW, is a case when __nocast use had been actively confusing - it had
         been used in the same code for two different and similar types, with
         no way to catch misuses.  Switch of gfp_t to bitwise had caught that
      One tricky bit is left alone to be dealt with later - mapping->flags is
      a mix of gfp_t and error indications.  Left alone for now.
      Signed-off-by: Al Viro <viro@zeniv.linux.org.uk>
      Signed-off-by: Linus Torvalds <torvalds@osdl.org>
    • [PATCH] gfp_t: infrastructure · af4ca457
      Al Viro authored
      Beginning of gfp_t annotations:
       - -Wbitwise added to CHECKFLAGS
       - old __bitwise renamed to __bitwise__
       - __bitwise defined to either __bitwise__ or nothing, depending on
         __CHECK_ENDIAN__ being defined
       - gfp_t switched from __nocast to __bitwise__
       - force cast to gfp_t added to __GFP_... constants
       - new helper - gfp_zone(); extracts zone bits out of gfp_t value and casts
         the result to int
      Signed-off-by: Al Viro <viro@zeniv.linux.org.uk>
      Signed-off-by: Linus Torvalds <torvalds@osdl.org>
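The annotation scheme listed in the gfp_t infrastructure commit above can be sketched as follows. This is an illustration under assumptions, not the kernel headers: the MY_GFP_* names and their values are invented, and the attributes only take effect when the code is run through the sparse checker, which defines __CHECKER__. An ordinary compiler sees plain integers.

```c
/* Under sparse the type becomes a distinct "bitwise" type. */
#ifdef __CHECKER__
#define __bitwise__ __attribute__((bitwise))
#define __force     __attribute__((force))
#else
#define __bitwise__
#define __force
#endif

/* __bitwise is only active when endianness checking is requested. */
#ifdef __CHECK_ENDIAN__
#define __bitwise __bitwise__
#else
#define __bitwise
#endif

typedef unsigned int __bitwise__ gfp_t;

/* Illustrative flag values (assumed); each constant needs a force cast. */
#define MY_GFP_DMA      ((__force gfp_t)0x01u)
#define MY_GFP_HIGHMEM  ((__force gfp_t)0x02u)
#define MY_GFP_WAIT     ((__force gfp_t)0x10u)
#define MY_GFP_ZONEMASK 0x03

/* Extracts the zone bits from a gfp_t value and casts the result to int. */
static inline int gfp_zone(gfp_t gfp)
{
    return MY_GFP_ZONEMASK & (__force int)gfp;
}
```

Under the checker, passing a plain int where a gfp_t is expected, or mixing gfp_t with another flag type, produces a warning; that is the class of misuse the fs/* sweep reports catching in XFS.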
  22. 08 Oct, 2005 1 commit
  23. 10 Sep, 2005 1 commit
    • [PATCH] spinlock consolidation · fb1c8f93
      Ingo Molnar authored
      This patch (written by me and also containing many suggestions of Arjan van
      de Ven) does a major cleanup of the spinlock code.  It does the following:
       - consolidates and enhances the spinlock/rwlock debugging code
       - simplifies the asm/spinlock.h files
       - encapsulates the raw spinlock type and moves generic spinlock
         features (such as ->break_lock) into the generic code.
       - cleans up the spinlock code hierarchy to get rid of the spaghetti.
      Most notably there's now only a single variant of the debugging code,
      located in lib/spinlock_debug.c.  (previously we had one SMP debugging
      variant per architecture, plus a separate generic one for UP builds)
      Also, I've enhanced the rwlock debugging facility; it will now track
      write-owners.  There is new spinlock-owner/CPU-tracking on SMP builds too.
      All locks have lockup detection now, which will work for both soft and hard
      spin/rwlock lockups.
      The arch-level include files now only contain the minimally necessary
      subset of the spinlock code - all the rest that can be generalized now
      lives in the generic headers:
       include/asm-i386/spinlock_types.h       |   16
       include/asm-x86_64/spinlock_types.h     |   16
      I have also split up the various spinlock variants into separate files,
      making it easier to see which does what. The new layout is:
         SMP                         |  UP
         asm/spinlock_types_smp.h    |  linux/spinlock_types_up.h
         linux/spinlock_types.h      |  linux/spinlock_types.h
         asm/spinlock_smp.h          |  linux/spinlock_up.h
         linux/spinlock_api_smp.h    |  linux/spinlock_api_up.h
         linux/spinlock.h            |  linux/spinlock.h
       * here's the role of the various spinlock/rwlock related include files:
       * on SMP builds:
       *  asm/spinlock_types.h: contains the raw_spinlock_t/raw_rwlock_t and the
       *                        initializers
       *  linux/spinlock_types.h:
       *                        defines the generic type and initializers
       *  asm/spinlock.h:       contains the __raw_spin_*()/etc. lowlevel
       *                        implementations, mostly inline assembly code
       *   (also included on UP-debug builds:)
       *  linux/spinlock_api_smp.h:
       *                        contains the prototypes for the _spin_*() APIs.
       *  linux/spinlock.h:     builds the final spin_*() APIs.
       * on UP builds:
       *  linux/spinlock_types_up.h:
       *                        contains the generic, simplified UP spinlock type.
       *                        (which is an empty structure on non-debug builds)
       *  linux/spinlock_types.h:
       *                        defines the generic type and initializers
       *  linux/spinlock_up.h:
       *                        contains the __raw_spin_*()/etc. version of UP
       *                        builds. (which are NOPs on non-debug, non-preempt
       *                        builds)
       *   (included on UP-non-debug builds:)
       *  linux/spinlock_api_up.h:
       *                        builds the _spin_*() APIs.
       *  linux/spinlock.h:     builds the final spin_*() APIs.
      All SMP and UP architectures are converted by this patch.
      arm, i386, ia64, ppc, ppc64, s390/s390x and x64 were build-tested via
      crosscompilers.  m32r, mips, sh and sparc have not been tested yet, but should
      be mostly fine.
      From: Grant Grundler <grundler@parisc-linux.org>
        Booted and lightly tested on a500-44 (64-bit, SMP kernel, dual CPU).
        Builds 32-bit SMP kernel (not booted or tested).  I did not try to build
        non-SMP kernels.  That should be trivial to fix up later if necessary.
        I converted bit ops atomic_hash lock to raw_spinlock_t.  Doing so avoids
        some ugly nesting of linux/*.h and asm/*.h files.  Those particular locks
        are well tested and contained entirely inside arch specific code.  I do NOT
        expect any new issues to arise with them.
        If someone does ever need to use debug/metrics with them, then they will
        need to unravel this hairball between spinlocks, atomic ops, and bit ops
        that exist only because parisc has exactly one atomic instruction: LDCW
        (load and clear word).
      From: "Luck, Tony" <tony.luck@intel.com>
         ia64 fix
      Signed-off-by: Ingo Molnar <mingo@elte.hu>
      Signed-off-by: Arjan van de Ven <arjanv@infradead.org>
      Signed-off-by: Grant Grundler <grundler@parisc-linux.org>
      Cc: Matthew Wilcox <willy@debian.org>
      Signed-off-by: Hirokazu Takata <takata@linux-m32r.org>
      Signed-off-by: Mikael Pettersson <mikpe@csd.uu.se>
      Signed-off-by: Benoit Boissinot <benoit.boissinot@ens-lyon.org>
      Signed-off-by: Andrew Morton <akpm@osdl.org>
      Signed-off-by: Linus Torvalds <torvalds@osdl.org>
  24. 07 Sep, 2005 2 commits
  25. 07 Jul, 2005 1 commit
  26. 28 Jun, 2005 1 commit
  27. 23 Jun, 2005 1 commit
    • [PATCH] Bug in error recovery in fs/buffer.c::__block_prepare_write() · 152becd2
      Anton Altaparmakov authored
      fs/buffer.c::__block_prepare_write() has broken error recovery.  It calls
      the get_block() callback with "create = 1" and if that succeeds it
      immediately clears buffer_new on the just allocated buffer (which has
      buffer_new set).
      The bug is that if an error occurs and get_block() returns != 0, we break
      from this loop and go into recovery code.  This code has this comment:
      	/* Error case: */
      	/*
      	 * Zero out any newly allocated blocks to avoid exposing stale
      	 * data.  If BH_New is set, we know that the block was newly
      	 * allocated in the above loop.
      	 */
      So the intent is obviously good in that it wants to clear just allocated
      and hence not zeroed buffers.  However the code recognises allocated
      buffers by checking for buffer_new being set.
      Unfortunately __block_prepare_write() as discussed above already cleared
      buffer_new on all allocated buffers thus no buffers will be cleared during
      error recovery and old data will be leaked.
      The simplest way I can see to fix this is to make the current recovery code
      work by _not_ clearing buffer_new after calling get_block() in
      We cannot safely allow buffer_new buffers to "leak out" of
      __block_prepare_write(), thus we simply do a quick loop over the buffers
      clearing buffer_new on each of them if it is set just before returning
      "success" from __block_prepare_write().
      Signed-off-by: Anton Altaparmakov <aia21@cantab.net>
      Signed-off-by: Andrew Morton <akpm@osdl.org>
      Signed-off-by: Linus Torvalds <torvalds@osdl.org>
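The fix described in the __block_prepare_write() commit above keeps the "new" flag set during the allocation loop so that error recovery can find the buffers to zero, and clears the flag only just before returning success. A simplified user-space model of that control flow is sketched below. struct fake_bh, fake_get_block(), prepare_write() and the buffer sizes are hypothetical, and the real function does considerably more.

```c
#include <string.h>

#define NBUF  4
#define BUFSZ 8

/* Hypothetical buffer: is_new models the buffer_new flag. */
struct fake_bh {
    int is_new;
    char data[BUFSZ];  /* may hold stale contents */
};

/* Stand-in for get_block(): allocates buffer i, or fails at fail_at. */
static int fake_get_block(struct fake_bh *bh, int i, int fail_at)
{
    if (i == fail_at)
        return -1;
    bh->is_new = 1;
    return 0;
}

int prepare_write(struct fake_bh bufs[NBUF], int fail_at)
{
    int i, err = 0;

    for (i = 0; i < NBUF; i++) {
        err = fake_get_block(&bufs[i], i, fail_at);
        if (err)
            break;
        /* is_new is deliberately NOT cleared here. */
    }

    if (err) {
        /* Error case: zero only the buffers allocated in the loop. */
        for (i = 0; i < NBUF; i++) {
            if (bufs[i].is_new) {
                memset(bufs[i].data, 0, BUFSZ);
                bufs[i].is_new = 0;
            }
        }
        return err;
    }

    /* Success: clear the flag so it does not leak out to callers. */
    for (i = 0; i < NBUF; i++)
        bufs[i].is_new = 0;
    return 0;
}
```

Had the flag been cleared inside the first loop, as in the buggy version the commit describes, the recovery loop would find nothing to zero and the stale contents would remain visible.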