1. 09 Nov, 2012 10 commits
  2. 08 Nov, 2012 30 commits
    • Merge tag 'pinctrl-for-v3.7-rc5' of git://git.kernel.org/pub/scm/linux/kernel/git/linusw/linux-pinctrl · a186d25d
      Linus Torvalds authored
      
      Pull pinctrl fixes from Linus Walleij:
      
       - A set of SPEAr pinctrl fixes that recently arrived
      
       - A fixup for the Samsung/Exynos Kconfig deps
      
      * tag 'pinctrl-for-v3.7-rc5' of git://git.kernel.org/pub/scm/linux/kernel/git/linusw/linux-pinctrl:
        pinctrl: samsung and exynos need to depend on OF && GPIOLIB
        pinctrl: SPEAr1340: Add clcd sleep mode pin configuration
        pinctrl: SPEAr1340: Make DDR reset & clock pads as gpio
        pinctrl: SPEAr1310: add register entries for enabling pad direction
        pinctrl: SPEAr1310: Separate out pci pins from pcie_sata pin group
        pinctrl: SPEAr1310: Fix value of PERIP_CFG reigster and MCIF_SEL_SHIFT
        pinctrl: SPEAr1310: fix clcd high resolution pin group name
        pinctrl: SPEAr320: Correct pad mux entries for rmii/smii
        pinctrl: SPEAr3xx: correct register space to configure pwm
        pinctrl: SPEAr: Don't update all non muxreg bits on pinctrl_disable
    • Merge branch 'for-linus' of git://git.kernel.org/pub/scm/linux/kernel/git/s390/linux · 4ad48bb7
      Linus Torvalds authored
      Pull s390 fixes from Martin Schwidefsky:
       "A couple of bug fixes.  I keep the fingers crossed that we now got
        transparent huge pages ready for prime time."
      
      * 'for-linus' of git://git.kernel.org/pub/scm/linux/kernel/git/s390/linux:
        s390/cio: fix length calculation in idset.c
        s390/sclp: fix addressing mode clobber
        s390: Move css limits from drivers/s390/cio/ to include/asm/.
        s390/thp: respect page protection in pmd_none() and pmd_present()
        s390/mm: use pmd_large() instead of pmd_huge()
        s390/cio: suppress 2nd path verification during resume
    • Merge branch 'for-linus' of git://git.kernel.org/pub/scm/linux/kernel/git/jikos/hid · 976bacef
      Linus Torvalds authored
      Pull HID fix from Jiri Kosina:
       "This reverts a patch that causes regression in binding between HID
        devices and drivers during device unplug/replug cycle."
      
      * 'for-linus' of git://git.kernel.org/pub/scm/linux/kernel/git/jikos/hid:
        HID: hidraw: put old deallocation mechanism in place
    • Merge branch 'akpm' (Fixes from Andrew) · ce6d841e
      Linus Torvalds authored
      Merge misc fixes from Andrew Morton:
       "Five fixes"
      
      * emailed patches from Andrew Morton <akpm@linux-foundation.org>: (5 patches)
        h8300: add missing L1_CACHE_SHIFT
        mm: bugfix: set current->reclaim_state to NULL while returning from kswapd()
        fanotify: fix missing break
        revert "epoll: support for disabling items, and a self-test app"
        checkpatch: improve network block comment style checking
    • Merge branch 'drm-fixes' of git://people.freedesktop.org/~airlied/linux · c0cba03b
      Linus Torvalds authored
      Pull drm fixes from Dave Airlie:
       "Just radeon and nouveau, mostly regressions fixers, and a couple of
        radeon register checker fixes."
      
      * 'drm-fixes' of git://people.freedesktop.org/~airlied/linux:
        drm/nouveau: fix acpi edid retrieval
        drm/nvc0/disp: fix regression in vblank semaphore release
        drm/nv40/mpeg: fix context handling
        drm/nv40/graph: fix typo in type names
        drm/nv41/vm: fix typo in type name
        drm/radeon/si: add some missing regs to the VM reg checker
        drm/radeon/cayman: add some missing regs to the VM reg checker
        drm/radeon/dce3: switch back to old pll allocation order for discrete
    • Merge tag 'fixes-for-linus' of git://git.kernel.org/pub/scm/linux/kernel/git/rusty/linux · cdfe1565
      Linus Torvalds authored
      Pull virtio and module fixes from Rusty Russell:
       "YA module signing build tweak, and two cc'd to stable."
      
      * tag 'fixes-for-linus' of git://git.kernel.org/pub/scm/linux/kernel/git/rusty/linux:
        virtio: Don't access index after unregister.
        modules: don't break modules_install on external modules with no key.
        module: fix out-by-one error in kallsyms
    • Merge tag 'for-linus-v3.7-rc5' of git://oss.sgi.com/xfs/xfs · a601e637
      Linus Torvalds authored
      Pull xfs bugfixes from Ben Myers:
      
       - fix for large transactions spanning multiple iclog buffers
      
       - zero the allocation_args structure on the stack before using it to
         determine whether to use a worker for allocation
      
       - move allocation stack switch to xfs_bmapi_allocate in order to
         prevent deadlock on AGF buffers
      
       - growfs no longer reads in garbage for new secondary superblocks
      
       - silence a build warning
      
       - ensure that invalid buffers never get written to disk while on free
         list
      
       - don't vmap inode cluster buffers during free
      
       - fix buffer shutdown reference count mismatch
      
       - fix reading of wrapped log data
      
      * tag 'for-linus-v3.7-rc5' of git://oss.sgi.com/xfs/xfs:
        xfs: fix reading of wrapped log data
        xfs: fix buffer shudown reference count mismatch
        xfs: don't vmap inode cluster buffers during free
        xfs: invalidate allocbt blocks moved to the free list
        xfs: silence uninitialised f.file warning.
        xfs: growfs: don't read garbage for new secondary superblocks
        xfs: move allocation stack switch up to xfs_bmapi_allocate
        xfs: introduce XFS_BMAPI_STACK_SWITCH
        xfs: zero allocation_args on the kernel stack
        xfs: only update the last_sync_lsn when a transaction completes
    • h8300: add missing L1_CACHE_SHIFT · 6893f567
      Fengguang Wu authored
      Fix the build error
      
        lib/atomic64.c: In function 'lock_addr':
        lib/atomic64.c:40:11: error: 'L1_CACHE_SHIFT' undeclared (first use in this function)
        lib/atomic64.c:40:11: note: each undeclared identifier is reported only once for each function it appears in
      Signed-off-by: Fengguang Wu <fengguang.wu@intel.com>
      Cc: Yoshinori Sato <ysato@users.sourceforge.jp>
      Signed-off-by: Andrew Morton <akpm@linux-foundation.org>
      Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>
    • mm: bugfix: set current->reclaim_state to NULL while returning from kswapd() · b0a8cc58
      Takamori Yamaguchi authored
      In kswapd(), set current->reclaim_state to NULL before returning, as
      current->reclaim_state holds a reference to a variable on kswapd()'s stack.
      
      In rare cases, while returning from kswapd() during memory offlining,
      __free_slab() and freepages() can access the dangling pointer of
      current->reclaim_state.
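The hazard described above can be illustrated outside the kernel. What follows is a minimal user-space sketch, not the kernel code: `struct task_sketch`, `struct reclaim_state_sketch`, and `worker()` are hypothetical stand-ins for `current`, `struct reclaim_state`, and `kswapd()`.

```c
#include <assert.h>
#include <stddef.h>

/* Hypothetical stand-in for the kernel's struct reclaim_state. */
struct reclaim_state_sketch {
	unsigned long reclaimed_slab;
};

/* Hypothetical stand-in for the per-task structure behind `current`. */
struct task_sketch {
	struct reclaim_state_sketch *reclaim_state;
};

/*
 * Models a thread function that publishes a pointer to one of its own
 * stack variables and clears it again before returning.
 * Returns the number of pages it pretended to reclaim.
 */
unsigned long worker(struct task_sketch *tsk, unsigned long pages)
{
	struct reclaim_state_sketch rs = { .reclaimed_slab = 0 };
	unsigned long done;

	assert(tsk != NULL);

	tsk->reclaim_state = &rs;	/* pointer into this stack frame */
	tsk->reclaim_state->reclaimed_slab += pages;
	done = rs.reclaimed_slab;

	/*
	 * Clear the pointer while `rs` is still alive, so that nothing
	 * running after the return can follow it into a dead frame.
	 */
	tsk->reclaim_state = NULL;
	return done;
}
```

A caller that inspects `tsk->reclaim_state` after `worker()` returns sees NULL rather than an address inside a stack frame that no longer exists.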
      Signed-off-by: Takamori Yamaguchi <takamori.yamaguchi@jp.sony.com>
      Signed-off-by: Aaditya Kumar <aaditya.kumar@ap.sony.com>
      Acked-by: David Rientjes <rientjes@google.com>
      Cc: <stable@vger.kernel.org>
      Signed-off-by: Andrew Morton <akpm@linux-foundation.org>
      Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>
    • fanotify: fix missing break · 848561d3
      Eric Paris authored
      Anders Blomdell noted in 2010 that Fanotify lost events and provided a
      test case.  Eric Paris confirmed it was a bug and posted a fix to the
      list
      
        https://groups.google.com/forum/?fromgroups=#!topic/linux.kernel/RrJfTfyW2BE
      
      but never applied it.  Repeated attempts over time to get him to
      apply it have never received a reply.
      
      So apply it anyway.
      Signed-off-by: Alan Cox <alan@linux.intel.com>
      Reported-by: Anders Blomdell <anders.blomdell@control.lth.se>
      Cc: Eric Paris <eparis@redhat.com>
      Cc: <stable@vger.kernel.org>
      Signed-off-by: Andrew Morton <akpm@linux-foundation.org>
      Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>
    • revert "epoll: support for disabling items, and a self-test app" · a80a6b85
      Andrew Morton authored
      Revert commit 03a7beb5 ("epoll: support for disabling items, and a
      self-test app") pending resolution of the issues identified by Michael
      Kerrisk, copied below.
      
      We'll revisit this for 3.8.
      
      : I've taken a look at this patch as it currently stands in 3.7-rc1, and
      : done a bit of testing. (By the way, the test program
      : tools/testing/selftests/epoll/test_epoll.c does not compile...)
      :
      : There are one or two places where the behavior seems a little strange,
      : so I have a question or two at the end of this mail. But other than
      : that, I want to check my understanding so that the interface can be
      : correctly documented.
      :
      : Just to go through my understanding, the problem is the following
      : scenario in a multithreaded application:
      :
      : 1. Multiple threads are performing epoll_wait() operations,
      :    and maintaining a user-space cache that contains information
      :    corresponding to each file descriptor being monitored by
      :    epoll_wait().
      :
      : 2. At some point, a thread wants to delete (EPOLL_CTL_DEL)
      :    a file descriptor from the epoll interest list, and
      :    delete the corresponding record from the user-space cache.
      :
      : 3. The problem with (2) is that some other thread may have
      :    previously done an epoll_wait() that retrieved information
      :    about the fd in question, and may be in the middle of using
      :    information in the cache that relates to that fd. Thus,
      :    there is a potential race.
      :
      : 4. The race can't be solved purely in user space, because doing
      :    so would require applying a mutex across the epoll_wait()
      :    call, which would of course blow thread concurrency.
      :
      : Right?
      :
      : Your solution is the EPOLL_CTL_DISABLE operation. I want to
      : confirm my understanding about how to use this flag, since
      : the description that has accompanied the patches so far
      : has been a bit sparse.
      :
      : 0. In the scenario you're concerned about, deleting a file
      :    descriptor means (safely) doing the following:
      :    (a) Deleting the file descriptor from the epoll interest list
      :        using EPOLL_CTL_DEL
      :    (b) Deleting the corresponding record in the user-space cache
      :
      : 1. It's only meaningful to use this EPOLL_CTL_DISABLE in
      :    conjunction with EPOLLONESHOT.
      :
      : 2. Using EPOLL_CTL_DISABLE without using EPOLLONESHOT in
      :    conjunction is a logical error.
      :
      : 3. The correct way to code multithreaded applications using
      :    EPOLL_CTL_DISABLE and EPOLLONESHOT is as follows:
      :
      :    a. All EPOLL_CTL_ADD and EPOLL_CTL_MOD operations should
      :       specify EPOLLONESHOT.
      :
      :    b. When a thread wants to delete a file descriptor, it
      :       should do the following:
      :
      :       [1] Call epoll_ctl(EPOLL_CTL_DISABLE)
      :       [2] If the return status from epoll_ctl(EPOLL_CTL_DISABLE)
      :           was zero, then the file descriptor can be safely
      :           deleted by the thread that made this call.
      :       [3] If the epoll_ctl(EPOLL_CTL_DISABLE) fails with EBUSY,
      :           then the descriptor is in use. In this case, the calling
      :           thread should set a flag in the user-space cache to
      :           indicate that the thread that is using the descriptor
      :           should perform the deletion operation.
      :
      : Is all of the above correct?
      :
      : The implementation depends on checking on whether
      : (events & ~EP_PRIVATE_BITS) == 0
      : This relies on the fact that EPOLL_CTL_ADD and EPOLL_CTL_MOD always
      : set EPOLLHUP and EPOLLERR in the 'events' mask, and EPOLLONESHOT
      : causes those flags (as well as all others in ~EP_PRIVATE_BITS) to be
      : cleared.
      :
      : A corollary to the previous paragraph is that using EPOLL_CTL_DISABLE
      : is only useful in conjunction with EPOLLONESHOT. However, as things
      : stand, one can use EPOLL_CTL_DISABLE on a file descriptor that does
      : not have EPOLLONESHOT set in 'events'. This results in the following
      : (slightly surprising) behavior:
      :
      : (a) The first call to epoll_ctl(EPOLL_CTL_DISABLE) returns 0
      :     (the indicator that the file descriptor can be safely deleted).
      : (b) The next call to epoll_ctl(EPOLL_CTL_DISABLE) fails with EBUSY.
      :
      : This doesn't seem particularly useful, and in fact is probably an
      : indication that the user made a logic error: they should only be using
      : epoll_ctl(EPOLL_CTL_DISABLE) on a file descriptor for which
      : EPOLLONESHOT was set in 'events'. If that is correct, then would it
      : not make sense to return an error to user space for this case?
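The mask test discussed in the quoted mail can be modelled in user space. This is a sketch under stated assumptions, not the reverted kernel code: the `SK_` names are hypothetical, the bit values are those of the Linux epoll UAPI headers, `SK_PRIVATE_BITS` is assumed to mirror the kernel-internal `EP_PRIVATE_BITS`, and the disarming step in `sk_disable()` is an assumption chosen to reproduce the zero-then-EBUSY sequence in (a) and (b).

```c
#include <assert.h>
#include <stdbool.h>
#include <stddef.h>

/* Event bits, using the values from the Linux epoll UAPI headers. */
#define SK_EPOLLIN       0x001u
#define SK_EPOLLERR      0x008u
#define SK_EPOLLHUP      0x010u
#define SK_EPOLLWAKEUP   (1u << 29)
#define SK_EPOLLONESHOT  (1u << 30)
#define SK_EPOLLET       (1u << 31)

/* Assumed to mirror the kernel-internal EP_PRIVATE_BITS mask. */
#define SK_PRIVATE_BITS  (SK_EPOLLWAKEUP | SK_EPOLLONESHOT | SK_EPOLLET)

/* ADD/MOD always add EPOLLERR and EPOLLHUP to the requested mask. */
unsigned int sk_arm(unsigned int requested)
{
	return requested | SK_EPOLLERR | SK_EPOLLHUP;
}

/* Delivering an event on a one-shot item clears every non-private bit. */
unsigned int sk_deliver(unsigned int events)
{
	if (events & SK_EPOLLONESHOT)
		return events & SK_PRIVATE_BITS;
	return events;
}

/* The test quoted above: true when no non-private bit is left. */
bool sk_is_disabled(unsigned int events)
{
	return (events & ~SK_PRIVATE_BITS) == 0;
}

/*
 * Model of the disable operation: returns 0 and disarms the item if it
 * was still armed, or -1 (standing in for EBUSY) if it was already
 * disarmed.
 */
int sk_disable(unsigned int *events)
{
	assert(events != NULL);

	if (sk_is_disabled(*events))
		return -1;
	*events &= SK_PRIVATE_BITS;
	return 0;
}
```

In this model, an item armed without the one-shot bit yields 0 on the first `sk_disable()` call and -1 on the second, which is the behavior the mail calls slightly surprising.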
      
      Cc: Michael Kerrisk <mtk.manpages@gmail.com>
      Cc: "Paton J. Lewis" <palewis@adobe.com>
      Signed-off-by: Andrew Morton <akpm@linux-foundation.org>
      Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>
    • checkpatch: improve network block comment style checking · c24f9f19
      Joe Perches authored
      Some comment styles in net and drivers/net are flagged inappropriately.
      
      Avoid proclaiming inline comments like:
      	int a = b;	/* some comment */
      and block comments like:
      	/*********************
      	 * some comment
      	 ********************/
      are defective.
      
      Tested with
      $ cat drivers/net/t.c
      /* foo */
      
      /*
       * foo
       */
      
      /* foo
       */
      
      /* foo
       * bar */
      
      /****************************
       * some long block comment
       ***************************/
      
      struct foo {
      	int bar;	/* another test */
      };
      $
      Signed-off-by: Joe Perches <joe@perches.com>
      Reported-by: Larry Finger <Larry.Finger@lwfinger.net>
      Cc: David Miller <davem@davemloft.net>
      Cc: Stephen Hemminger <shemminger@vyatta.com>
      Signed-off-by: Andrew Morton <akpm@linux-foundation.org>
      Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>
    • Merge branch 'drm-nouveau-fixes' of git://anongit.freedesktop.org/git/nouveau/linux-2.6 into drm-fixes · 4a48ed23
      Dave Airlie authored
      
      just some misc regression fixes and typo fixes.
      
      * 'drm-nouveau-fixes' of git://anongit.freedesktop.org/git/nouveau/linux-2.6:
        drm/nouveau: fix acpi edid retrieval
        drm/nvc0/disp: fix regression in vblank semaphore release
        drm/nv40/mpeg: fix context handling
        drm/nv40/graph: fix typo in type names
        drm/nv41/vm: fix typo in type name
    • virtio: Don't access index after unregister. · 237242bd
      Cornelia Huck authored
      Virtio wants to release used indices after the corresponding
      virtio device has been unregistered. However, virtio does not
      hold an extra reference, giving up its last reference with
      device_unregister(), making accessing dev->index afterwards
      invalid.
      
      I actually saw problems when testing my (not-yet-merged)
      virtio-ccw code:
      
      - device_add virtio-net,id=xxx
      -> creates device virtio<n> with n>0
      
      - device_del xxx
      -> deletes virtio<n>, but calls ida_simple_remove with an
         index of 0
      
      - device_add virtio-net,id=xxx
      -> tries to add virtio0, which is still in use...
      
      So let's save the index we want to release before calling
      device_unregister().
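The ordering fix described above can be sketched in user space. This is an illustration only, not the virtio code: `struct dev_sketch`, `register_dev()`, `unregister_dev()`, and `release_index()` are hypothetical stand-ins for the device, its registration, `device_unregister()`, and `ida_simple_remove()`.

```c
#include <assert.h>
#include <stdlib.h>

/* Hypothetical stand-in for a registered device with an allocated index. */
struct dev_sketch {
	unsigned int index;
};

/* Records the last index handed back, standing in for ida_simple_remove(). */
static unsigned int last_released;

static void release_index(unsigned int index)
{
	last_released = index;
}

/* Stand-in for device_unregister(): drops the last reference and frees. */
static void unregister_dev(struct dev_sketch *dev)
{
	free(dev);
}

struct dev_sketch *register_dev(unsigned int index)
{
	struct dev_sketch *dev = malloc(sizeof(*dev));

	if (dev)
		dev->index = index;
	return dev;
}

/* Save the index first; `dev` must not be touched after unregister_dev(). */
unsigned int remove_dev(struct dev_sketch *dev)
{
	unsigned int index;

	assert(dev != NULL);
	index = dev->index;	/* read while the object is still valid */

	unregister_dev(dev);
	release_index(index);	/* uses the saved copy, not dev->index */
	return last_released;
}
```

Reading `dev->index` after `unregister_dev()` would be a use after free in this model, which corresponds to the stale index of 0 seen in the report.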
      Signed-off-by: Cornelia Huck <cornelia.huck@de.ibm.com>
      Acked-by: Sjur Brændeland <sjur.brandeland@stericsson.com>
      Cc: stable@kernel.org
      Signed-off-by: Rusty Russell <rusty@rustcorp.com.au>
    • drm/nouveau: fix acpi edid retrieval · df285500
      Maarten Lankhorst authored
      Commit c0077061 accidentally inverted the logic for nouveau_acpi_edid,
      causing it to only show a connector as connected when the edid could not
      be retrieved with acpi.
      Signed-off-by: Maarten Lankhorst <maarten.lankhorst@canonical.com>
      Signed-off-by: Ben Skeggs <bskeggs@redhat.com>
    • drm/nv40/mpeg: fix context handling · 7707b701
      Marcin Slusarz authored
      It slipped in thanks to typeless API.
      Signed-off-by: Marcin Slusarz <marcin.slusarz@gmail.com>
      Signed-off-by: Ben Skeggs <bskeggs@redhat.com>
    • drm/nv40/graph: fix typo in type names · a4dd4ec2
      Marcin Slusarz authored
      nv04_graph_priv / nv04_graph_chan are not defined in this context...
      Signed-off-by: Marcin Slusarz <marcin.slusarz@gmail.com>
      Signed-off-by: Ben Skeggs <bskeggs@redhat.com>
    • drm/nv41/vm: fix typo in type name · 479dd567
      Marcin Slusarz authored
      It's a miracle it compiles at all - nv04_vm_priv does not exist
      anywhere in the tree.
      Signed-off-by: Marcin Slusarz <marcin.slusarz@gmail.com>
      Signed-off-by: Ben Skeggs <bskeggs@redhat.com>
    • Merge branch 'drm-fixes-3.7' of git://people.freedesktop.org/~agd5f/linux into drm-fixes · 022d1a29
      Dave Airlie authored
      Just some minor fixes for VM reg check and a regression fix for dce3 plls
      
      * 'drm-fixes-3.7' of git://people.freedesktop.org/~agd5f/linux:
        drm/radeon/si: add some missing regs to the VM reg checker
        drm/radeon/cayman: add some missing regs to the VM reg checker
        drm/radeon/dce3: switch back to old pll allocation order for discrete
    • xfs: fix reading of wrapped log data · 6ce377af
      Dave Chinner authored
      Commit 44396476 ("xfs: reset buffer pointers before freeing them") in
      3.0-rc1 introduced a regression when recovering log buffers that
      wrapped around the end of the log. The second part of the log buffer at
      the start of the physical log was being read into the header buffer
      rather than the data buffer, and hence recovery was seeing garbage
      in the data buffer when it got to the region of the log buffer that
      was incorrectly read.
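A wrapped read of this kind can be sketched as a two-part copy from a circular buffer. This is a simplified user-space illustration, not the XFS recovery code; `read_wrapped()` and its parameters are hypothetical names. The point it shows is that the second part, read from the physical start of the log, must land in the data buffer directly after the first part.

```c
#include <assert.h>
#include <stddef.h>
#include <string.h>

/*
 * Copy `len` bytes that start at offset `start` in a circular log of
 * `log_size` bytes into `data`. If the region runs past the physical end
 * of the log, the remainder is read from offset 0 and appended to `data`.
 * Returns the number of bytes that came from the wrapped second part.
 */
size_t read_wrapped(const unsigned char *logbuf, size_t log_size,
		    size_t start, size_t len, unsigned char *data)
{
	size_t first;

	assert(start < log_size && len <= log_size);
	first = log_size - start;	/* bytes before the physical end */

	if (first >= len) {
		memcpy(data, logbuf + start, len);	/* no wrap */
		return 0;
	}

	memcpy(data, logbuf + start, first);
	/* Second part: the destination is the data buffer, offset by `first`. */
	memcpy(data + first, logbuf, len - first);
	return len - first;
}
```

Sending the second copy to any other destination leaves the tail of `data` unfilled, which matches the garbage that recovery saw in the regression described above.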
      
      Cc: <stable@vger.kernel.org> # 3.0.x, 3.2.x, 3.4.x 3.6.x
      Reported-by: Torsten Kaiser <just.for.lkml@googlemail.com>
      Signed-off-by: Dave Chinner <dchinner@redhat.com>
      Reviewed-by: Christoph Hellwig <hch@lst.de>
      Reviewed-by: Mark Tinguely <tinguely@sgi.com>
      Signed-off-by: Ben Myers <bpm@sgi.com>
    • xfs: fix buffer shudown reference count mismatch · 03b1293e
      Dave Chinner authored
      When we shut down the filesystem, we have to unpin and free all the
      buffers currently active in the CIL. To do this we unpin and remove
      them in one operation as a result of a failed iclogbuf write. For
      buffers, we do this removal via a simulated IO completion after
      marking the buffer stale.
      
      At the time we do this, we have two references to the buffer - the
      active LRU reference and the buf log item.  The LRU reference is
      removed by marking the buffer stale, and the active CIL reference is
      removed by the xfs_buf_iodone() callback that is run by
      xfs_buf_do_callbacks() during ioend processing (via the bp->b_iodone
      callback).
      
      However, ioend processing requires one more reference - that of the
      IO that it is completing. We don't have this reference, so we free
      the buffer prematurely and use it after it is freed. For buffers
      marked with XBF_ASYNC, this leads to assert failures in
      xfs_buf_rele() on debug kernels because the b_hold count is zero.
      
      Fix this by making sure we take the necessary IO reference before
      starting IO completion processing on the stale buffer, and set the
      XBF_ASYNC flag to ensure that IO completion processing removes all
      the active references from the buffer to ensure it is fully torn
      down.
      
      Cc: <stable@vger.kernel.org>
      Signed-off-by: Dave Chinner <dchinner@redhat.com>
      Reviewed-by: Mark Tinguely <tinguely@sgi.com>
      Signed-off-by: Ben Myers <bpm@sgi.com>
    • xfs: don't vmap inode cluster buffers during free · 4b62acfe
      Dave Chinner authored
      Inode buffers do not need to be mapped as inodes are read or written
      directly from/to the pages underlying the buffer. This fixes a
      regression introduced by commit 611c9946 ("xfs: make XBF_MAPPED the
      default behaviour").
      Signed-off-by: Dave Chinner <dchinner@redhat.com>
      Reviewed-by: Christoph Hellwig <hch@lst.de>
      Reviewed-by: Mark Tinguely <tinguely@sgi.com>
      Signed-off-by: Ben Myers <bpm@sgi.com>
    • xfs: invalidate allocbt blocks moved to the free list · ca250b1b
      Dave Chinner authored
      When we free a block from the alloc btree, we move it to the
      freelist held in the AGFL and mark it busy in the busy extent tree.
      This typically happens when we merge btree blocks.
      
      Once the transaction is committed and checkpointed, the block can
      remain on the free list for an indefinite amount of time.  Now, this
      isn't the end of the world at this point - if the free list is
      shortened, the buffer is invalidated in the transaction that moves
      it back to free space. If the buffer is allocated as metadata from
      the free list, then all the modifications get logged, and we have
      no issues, either. And if it gets allocated as userdata direct from
      the freelist, it gets invalidated and so will never get written.
      
      However, during the time it sits on the free list, pressure on the
      log can cause the AIL to be pushed and the buffer that covers the
      block gets pushed for write. IOWs, we end up writing a freed
      metadata block to disk. Again, this isn't the end of the world
      because we know from the above we are only writing to free space.
      
      The problem, however, is for validation callbacks. If the block was
      an old btree root block, then the level of the block is going to be
      higher than the current tree root, and so will fail validation.
      There may be other inconsistencies in the block as well, and
      currently we don't care because the block is in free space. Shutting
      down the filesystem because a freed block doesn't pass write
      validation, OTOH, is rather unfriendly.
      
      So, make sure we always invalidate buffers as they move from the
      free space trees to the free list so that we guarantee they never
      get written to disk while on the free list.
      Signed-off-by: Dave Chinner <dchinner@redhat.com>
      Reviewed-by: Christoph Hellwig <hch@lst.de>
      Reviewed-by: Phil White <pwhite@sgi.com>
      Reviewed-by: Mark Tinguely <tinguely@sgi.com>
      Signed-off-by: Ben Myers <bpm@sgi.com>
    • xfs: silence uninitialised f.file warning. · 1e7acbb7
      Dave Chinner authored
      Uninitialised variable build warning introduced by 2903ff01 ("switch
      simple cases of fget_light to fdget"), gcc is not smart enough to
      work out that the variable is not used uninitialised, and the commit
      removed the initialisation at declaration that the old variable had.
      Signed-off-by: Dave Chinner <dchinner@redhat.com>
      Reviewed-by: Christoph Hellwig <hch@lst.de>
      Reviewed-by: Mark Tinguely <tinguely@sgi.com>
      Signed-off-by: Ben Myers <bpm@sgi.com>
    • xfs: growfs: don't read garbage for new secondary superblocks · eaef8543
      Dave Chinner authored
      When updating new secondary superblocks in a growfs operation, the
      superblock buffer is read from the newly grown region of the
      underlying device. This is not guaranteed to be zero, so violates
      the underlying assumption that the unused parts of superblocks are
      zero filled. Get a new buffer for these secondary superblocks to
      ensure that the unused regions are zero filled correctly.
      Signed-off-by: Dave Chinner <dchinner@redhat.com>
      Reviewed-by: Carlos Maiolino <cmaiolino@redhat.com>
      Signed-off-by: Ben Myers <bpm@sgi.com>
    • xfs: move allocation stack switch up to xfs_bmapi_allocate · 1f3c785c
      Dave Chinner authored
      Switching stacks at xfs_alloc_vextent() can cause deadlocks when we
      run out of worker threads on the allocation workqueue. This can
      occur because xfs_bmap_btalloc can make multiple calls to
      xfs_alloc_vextent() and even if xfs_alloc_vextent() fails it can
      return with the AGF locked in the current allocation transaction.
      
      If we then need to make another allocation, and all the allocation
      worker contexts are exhausted because they are blocked waiting for
      the AGF lock, the holder of the AGF cannot get its xfs_alloc_vextent()
      work completed to release the AGF.  Hence allocation effectively
      deadlocks.
      
      To avoid this, move the stack switch one layer up to
      xfs_bmapi_allocate() so that all of the allocation attempts in a
      single switched stack transaction occur in a single worker context.
      This avoids the problem of an allocation being blocked waiting for
      a worker thread whilst holding the AGF.
      Signed-off-by: Dave Chinner <dchinner@redhat.com>
      Reviewed-by: Mark Tinguely <tinguely@sgi.com>
      Signed-off-by: Ben Myers <bpm@sgi.com>
    • xfs: introduce XFS_BMAPI_STACK_SWITCH · 326c0355
      Dave Chinner authored
      Certain allocation paths through xfs_bmapi_write() are in situations
      where we have limited stack available. These are almost always in
      the buffered IO writeback path when converting delayed allocation
      extents to real extents.
      
      The current stack switch occurs for userdata allocations, which
      means we also do stack switches for preallocation, direct IO and
      unwritten extent conversion, even though these call chains have never
      been implicated in a stack overrun.
      
      Hence, let's target just the single stack overrun offender for stack
      switches. To do that, introduce a XFS_BMAPI_STACK_SWITCH flag that
      the caller can pass xfs_bmapi_write() to indicate it should switch
      stacks if it needs to do allocation.
      Signed-off-by: Dave Chinner <dchinner@redhat.com>
      Reviewed-by: Mark Tinguely <tinguely@sgi.com>
      Signed-off-by: Ben Myers <bpm@sgi.com>
    • xfs: zero allocation_args on the kernel stack · 408cc4e9
      Mark Tinguely authored
      Zero the kernel stack space that makes up the xfs_alloc_arg structures.
      Signed-off-by: Mark Tinguely <tinguely@sgi.com>
      Reviewed-by: Ben Myers <bpm@sgi.com>
      Signed-off-by: Ben Myers <bpm@sgi.com>
    • xfs: only update the last_sync_lsn when a transaction completes · 7e9620f2
      Dave Chinner authored
      The log write code stamps each iclog with the current tail LSN in
      the iclog header so that recovery knows where to find the tail of
      the log once it has found the head. Normally this is taken from the
      first item on the AIL - the log item that corresponds to the oldest
      active item in the log.
      
      The problem is that when the AIL is empty, the tail lsn is derived
      from the l_last_sync_lsn, which is the LSN of the last iclog to
      be written to the log. In most cases this doesn't happen, because
      the AIL is rarely empty on an active filesystem. However, when it
      does, it opens up an interesting case when the transaction being
      committed to the iclog spans multiple iclogs.
      
      That is, the first iclog is stamped with the l_last_sync_lsn, and IO
      is issued. Then the next iclog is setup, the changes copied into the
      iclog (takes some time), and then the l_last_sync_lsn is stamped
      into the header and IO is issued. This is still the same
      transaction, so the tail lsn of both iclogs must be the same for log
      recovery to find the entire transaction to be able to replay it.
      
      The problem arises in that the iclog buffer IO completion updates
      the l_last_sync_lsn with its own LSN. Therefore, if the first iclog
      completes its IO before the second iclog is filled and has the tail
      lsn stamped in it, it will stamp the LSN of the first iclog into
      its tail lsn field. If the system fails at this point, log recovery
      will not see a complete transaction, so the transaction will not be
      replayed.
      
      The fix is simple - the l_last_sync_lsn is updated when an iclog
      buffer IO completes, and this is incorrect. The l_last_sync_lsn
      should be updated when a transaction is completed by an iclog buffer
      IO. That is, only iclog buffers that have transaction commit
      callbacks attached to them should update the l_last_sync_lsn. This
      means that the last_sync_lsn will only move forward when a commit
      record is written, not in the middle of a large transaction that is
      rolling through multiple iclog buffers.
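The rule stated above can be modelled with two small functions. This is a simplified sketch, not the XFS log code: `struct log_sketch`, `struct iclog_sketch`, `stamp_tail()`, and `io_done()` are hypothetical names, and the model assumes the AIL is empty so that the tail LSN always comes from `last_sync_lsn`.

```c
#include <assert.h>
#include <stdbool.h>
#include <stddef.h>
#include <stdint.h>

/* Hypothetical, simplified stand-ins for the log and an in-core log buffer. */
struct log_sketch {
	uint64_t last_sync_lsn;
};

struct iclog_sketch {
	uint64_t lsn;		/* LSN of this buffer */
	uint64_t tail_lsn;	/* tail LSN stamped into the header */
	bool has_commit_cb;	/* a transaction commit completes here */
};

/* Stamp the header at write time; assumes the AIL is empty. */
void stamp_tail(const struct log_sketch *lg, struct iclog_sketch *ic)
{
	assert(lg != NULL && ic != NULL);
	ic->tail_lsn = lg->last_sync_lsn;
}

/* IO completion: advance last_sync_lsn only for commit-carrying buffers. */
void io_done(struct log_sketch *lg, const struct iclog_sketch *ic)
{
	assert(lg != NULL && ic != NULL);
	if (ic->has_commit_cb)
		lg->last_sync_lsn = ic->lsn;
}
```

With this rule, a first buffer without a commit callback can complete its IO before the second buffer is stamped, and both headers still carry the same tail LSN.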
      Signed-off-by: Dave Chinner <dchinner@redhat.com>
      Reviewed-by: Mark Tinguely <tinguely@sgi.com>
      Reviewed-by: Christoph Hellwig <hch@lst.de>
      Signed-off-by: Ben Myers <bpm@sgi.com>