1. 08 Jul, 2016 1 commit
  2. 07 Jul, 2016 1 commit
  3. 05 Jul, 2016 2 commits
  4. 03 Jul, 2016 1 commit
  5. 01 Jul, 2016 1 commit
    • Miklos Szeredi's avatar
      locks: use file_inode() · 6343a212
      Miklos Szeredi authored
      (Another one for the f_path debacle.)
      
      ltp fcntl33 testcase caused an Oops in selinux_file_send_sigiotask.
      
      The reason is that generic_add_lease() used filp->f_path.dentry->inode
      while all the others use file_inode().  This makes a difference for files
      opened on overlayfs since the former will point to the overlay inode the
      latter to the underlying inode.
      
      So generic_add_lease() added the lease to the overlay inode and
      generic_delete_lease() removed it from the underlying inode.  When the file
      was released the lease remained on the overlay inode's lock list, resulting
      in use after free.
      Reported-by: default avatarEryu Guan <eguan@redhat.com>
      Fixes: 4bacc9c9 ("overlayfs: Make f_path always point to the overlay and f_inode to the underlay")
      Cc: <stable@vger.kernel.org>
      Signed-off-by: default avatarMiklos Szeredi <mszeredi@redhat.com>
      Reviewed-by: default avatarJeff Layton <jlayton@redhat.com>
      Signed-off-by: default avatarJ. Bruce Fields <bfields@redhat.com>
      6343a212
  6. 30 Jun, 2016 6 commits
    • Andrey Ulanov's avatar
      namespace: update event counter when umounting a deleted dentry · e06b933e
      Andrey Ulanov authored
      - m_start() in fs/namespace.c expects that ns->event is incremented each
        time a mount added or removed from ns->list.
      - umount_tree() removes items from the list but does not increment event
        counter, expecting that it's done before the function is called.
      - There are some codepaths that call umount_tree() without updating
        "event" counter. e.g. from __detach_mounts().
      - When this happens m_start may reuse a cached mount structure that no
        longer belongs to ns->list (i.e. use after free which usually leads
        to infinite loop).
      
      This change fixes the above problem by incrementing global event counter
      before invoking umount_tree().
      
      Change-Id: I622c8e84dcb9fb63542372c5dbf0178ee86bb589
      Cc: stable@vger.kernel.org
      Signed-off-by: default avatarAndrey Ulanov <andreyu@google.com>
      Signed-off-by: default avatarAl Viro <viro@zeniv.linux.org.uk>
      e06b933e
    • Miklos Szeredi's avatar
      9p: use file_dentry() · b403f0e3
      Miklos Szeredi authored
      v9fs may be used as lower layer of overlayfs and accessing f_path.dentry
      can lead to a crash.  In this case it's a NULL pointer dereference in
      p9_fid_create().
      
      Fix by replacing direct access of file->f_path.dentry with the
      file_dentry() accessor, which will always return a native object.
      Reported-by: default avatarAlessio Igor Bogani <alessioigorbogani@gmail.com>
      Signed-off-by: default avatarMiklos Szeredi <mszeredi@redhat.com>
      Tested-by: default avatarAlessio Igor Bogani <alessioigorbogani@gmail.com>
      Fixes: 4bacc9c9 ("overlayfs: Make f_path always point to the overlay and f_inode to the underlay")
      Cc: <stable@vger.kernel.org>
      Signed-off-by: default avatarAl Viro <viro@zeniv.linux.org.uk>
      b403f0e3
    • Scott Mayhew's avatar
      lockd: unregister notifier blocks if the service fails to come up completely · cb7d224f
      Scott Mayhew authored
      If the lockd service fails to start up then we need to be sure that the
      notifier blocks are not registered, otherwise a subsequent start of the
      service could cause the same notifier to be registered twice, leading to
      soft lockups.
      Signed-off-by: default avatarScott Mayhew <smayhew@redhat.com>
      Cc: stable@vger.kernel.org
      Fixes: 0751ddf7 "lockd: Register callbacks on the inetaddr_chain..."
      Signed-off-by: default avatarJ. Bruce Fields <bfields@redhat.com>
      cb7d224f
    • Tahsin Erdogan's avatar
      writeback: inode cgroup wb switch should not call ihold() · 74524955
      Tahsin Erdogan authored
      Asynchronous wb switching of inodes takes an additional ref count on an
      inode to make sure inode remains valid until switchover is completed.
      
      However, anyone calling ihold() must already have a ref count on inode,
      but in this case inode->i_count may already be zero:
      
      ------------[ cut here ]------------
      WARNING: CPU: 1 PID: 917 at fs/inode.c:397 ihold+0x2b/0x30
      CPU: 1 PID: 917 Comm: kworker/u4:5 Not tainted 4.7.0-rc2+ #49
      Hardware name: QEMU Standard PC (i440FX + PIIX, 1996), BIOS Bochs
      01/01/2011
      Workqueue: writeback wb_workfn (flush-8:16)
       0000000000000000 ffff88007ca0fb58 ffffffff805990af 0000000000000000
       0000000000000000 ffff88007ca0fb98 ffffffff80268702 0000018d000004e2
       ffff88007cef40e8 ffff88007c9b89a8 ffff880079e3a740 0000000000000003
      Call Trace:
       [<ffffffff805990af>] dump_stack+0x4d/0x6e
       [<ffffffff80268702>] __warn+0xc2/0xe0
       [<ffffffff802687d8>] warn_slowpath_null+0x18/0x20
       [<ffffffff8035b4ab>] ihold+0x2b/0x30
       [<ffffffff80367ecc>] inode_switch_wbs+0x11c/0x180
       [<ffffffff80369110>] wbc_detach_inode+0x170/0x1a0
       [<ffffffff80369abc>] writeback_sb_inodes+0x21c/0x530
       [<ffffffff80369f7e>] wb_writeback+0xee/0x1e0
       [<ffffffff8036a147>] wb_workfn+0xd7/0x280
       [<ffffffff80287531>] ? try_to_wake_up+0x1b1/0x2b0
       [<ffffffff8027bb09>] process_one_work+0x129/0x300
       [<ffffffff8027be06>] worker_thread+0x126/0x480
       [<ffffffff8098cde7>] ? __schedule+0x1c7/0x561
       [<ffffffff8027bce0>] ? process_one_work+0x300/0x300
       [<ffffffff80280ff4>] kthread+0xc4/0xe0
       [<ffffffff80335578>] ? kfree+0xc8/0x100
       [<ffffffff809903cf>] ret_from_fork+0x1f/0x40
       [<ffffffff80280f30>] ? __kthread_parkme+0x70/0x70
      ---[ end trace aaefd2fd9f306bc4 ]---
      Signed-off-by: default avatarTahsin Erdogan <tahsin@google.com>
      Acked-by: default avatarTejun Heo <tj@kernel.org>
      Reviewed-by: default avatarJan Kara <jack@suse.cz>
      Signed-off-by: default avatarJens Axboe <axboe@fb.com>
      74524955
    • Miklos Szeredi's avatar
      fuse: serialize dirops by default · 5c672ab3
      Miklos Szeredi authored
      Negotiate with userspace filesystems whether they support parallel readdir
      and lookup.  Disable parallelism by default for fear of breaking fuse
      filesystems.
      Signed-off-by: default avatarMiklos Szeredi <mszeredi@redhat.com>
      Fixes: 9902af79 ("parallel lookups: actual switch to rwsem")
      Fixes: d9b3dbdc ("fuse: switch to ->iterate_shared()")
      5c672ab3
    • Marek Vasut's avatar
      configfs: Remove ppos increment in configfs_write_bin_file · f8608985
      Marek Vasut authored
      The simple_write_to_buffer() already increments the @ppos on success,
      see fs/libfs.c simple_write_to_buffer() comment:
      
      "
      On success, the number of bytes written is returned and the offset @ppos
      advanced by this number, or negative value is returned on error.
      "
      
      If the configfs_write_bin_file() is invoked with @count smaller than the
      total length of the written binary file, it will be invoked multiple times.
      Since configfs_write_bin_file() increments @ppos on success, after calling
      simple_write_to_buffer(), the @ppos is incremented twice.
      
      Subsequent invocation of configfs_write_bin_file() will result in the next
      piece of data being written to the offset twice as long as the length of
      the previous write, thus creating buffer with "holes" in it.
      
      The simple testcase using DTO follows:
        $ mkdir /sys/kernel/config/device-tree/overlays/1
        $ dd bs=1 if=foo.dtbo of=/sys/kernel/config/device-tree/overlays/1/dtbo
      Without this patch, the testcase will result in twice as big buffer in the
      kernel, which is then passed to the cfs_overlay_item_dtbo_write() .
      Signed-off-by: default avatarMarek Vasut <marex@denx.de>
      Cc: Geert Uytterhoeven <geert+renesas@glider.be>
      Cc: Christoph Hellwig <hch@lst.de>
      Cc: Pantelis Antoniou <pantelis.antoniou@konsulko.com>
      f8608985
  7. 29 Jun, 2016 2 commits
    • Miklos Szeredi's avatar
      ovl: get_write_access() in truncate · 03bea604
      Miklos Szeredi authored
      When truncating a file we should check write access on the underlying
      inode.  And we should do so on the lower file as well (before copy-up) for
      consistency.
      
      Original patch and test case by Aihua Zhang.
      
       - - >o >o - - test.c - - >o >o - -
      #include <stdio.h>
      #include <errno.h>
      #include <unistd.h>
      
      int main(int argc, char *argv[])
      {
      	int ret;
      
      	ret = truncate(argv[0], 4096);
      	if (ret != -1) {
      		fprintf(stderr, "truncate(argv[0]) should have failed\n");
      		return 1;
      	}
      	if (errno != ETXTBSY) {
      		perror("truncate(argv[0])");
      		return 1;
      	}
      
      	return 0;
      }
       - - >o >o - - >o >o - - >o >o - -
      Reported-by: default avatarAihua Zhang <zhangaihua1@huawei.com>
      Signed-off-by: default avatarMiklos Szeredi <mszeredi@redhat.com>
      Cc: <stable@vger.kernel.org>
      03bea604
    • Miklos Szeredi's avatar
      ovl: fix dentry leak for default_permissions · a4859d75
      Miklos Szeredi authored
      When using the 'default_permissions' mount option, ovl_permission() on
      non-directories was missing a dput(alias), resulting in "BUG Dentry still
      in use".
      Signed-off-by: default avatarMiklos Szeredi <mszeredi@redhat.com>
      Fixes: 8d3095f4 ("ovl: default permissions")
      Cc: <stable@vger.kernel.org> # v4.5+
      a4859d75
  8. 28 Jun, 2016 1 commit
    • Trond Myklebust's avatar
      NFS: Fix another OPEN_DOWNGRADE bug · e547f262
      Trond Myklebust authored
      Olga Kornievskaia reports that the following test fails to trigger
      an OPEN_DOWNGRADE on the wire, and only triggers the final CLOSE.
      
      	fd0 = open(foo, RDRW)   -- should be open on the wire for "both"
      	fd1 = open(foo, RDONLY)  -- should be open on the wire for "read"
      	close(fd0) -- should trigger an open_downgrade
      	read(fd1)
      	close(fd1)
      
      The issue is that we're missing a check for whether or not the current
      state transitioned from an O_RDWR state as opposed to having transitioned
      from a combination of O_RDONLY and O_WRONLY.
      Reported-by: default avatarOlga Kornievskaia <aglo@umich.edu>
      Fixes: cd9288ff ("NFSv4: Fix another bug in the close/open_downgrade code")
      Cc: stable@vger.kernel.org # 2.6.33+
      Signed-off-by: default avatarTrond Myklebust <trond.myklebust@primarydata.com>
      Signed-off-by: default avatarAnna Schumaker <Anna.Schumaker@Netapp.com>
      e547f262
  9. 27 Jun, 2016 2 commits
    • Eric Sandeen's avatar
      dax: fix offset overflow in dax_io · 02395435
      Eric Sandeen authored
      This isn't functionally apparent for some reason, but
      when we test io at extreme offsets at the end of the loff_t
      rang, such as in fstests xfs/071, the calculation of
      "max" in dax_io() can be wrong due to pos + size overflowing.
      
      For example,
      
      # xfs_io -c "pwrite 9223372036854771712 512" /mnt/test/file
      
      enters dax_io with:
      
      start 0x7ffffffffffff000
      end   0x7ffffffffffff200
      
      and the rounded up "size" variable is 0x1000.  This yields:
      
      pos + size 0x8000000000000000 (overflows loff_t)
             end 0x7ffffffffffff200
      
      Due to the overflow, the min() function picks the wrong
      value for the "max" variable, and when we send (max - pos)
      into i.e. copy_from_iter_pmem() it is also the wrong value.
      
      This somehow(tm) gets magically absorbed without incident,
      probably because iter->count is correct.  But it seems best
      to fix it up properly by comparing the two values as
      unsigned.
      Signed-off-by: default avatarEric Sandeen <sandeen@redhat.com>
      Signed-off-by: default avatarDan Williams <dan.j.williams@intel.com>
      02395435
    • Al Viro's avatar
      make nfs_atomic_open() call d_drop() on all ->open_context() errors. · d20cb71d
      Al Viro authored
      In "NFSv4: Move dentry instantiation into the NFSv4-specific atomic open code"
      unconditional d_drop() after the ->open_context() had been removed.  It had
      been correct for success cases (there ->open_context() itself had been doing
      dcache manipulations), but not for error ones.  Only one of those (ENOENT)
      got a compensatory d_drop() added in that commit, but in fact it should've
      been done for all errors.  As it is, the case of O_CREAT non-exclusive open
      on a hashed negative dentry racing with e.g. symlink creation from another
      client ended up with ->open_context() getting an error and proceeding to
      call nfs_lookup().  On a hashed dentry, which would've instantly triggered
      BUG_ON() in d_materialise_unique() (or, these days, its equivalent in
      d_splice_alias()).
      
      Cc: stable@vger.kernel.org # v3.10+
      Tested-by: default avatarOleg Drokin <green@linuxhacker.ru>
      Signed-off-by: default avatarAl Viro <viro@zeniv.linux.org.uk>
      Signed-off-by: default avatarTrond Myklebust <trond.myklebust@primarydata.com>
      Signed-off-by: default avatarAnna Schumaker <Anna.Schumaker@Netapp.com>
      d20cb71d
  10. 25 Jun, 2016 1 commit
    • Omar Sandoval's avatar
      Btrfs: fix ->iterate_shared() by upgrading i_rwsem for delayed nodes · 02dbfc99
      Omar Sandoval authored
      Commit fe742fd4 ("Revert "btrfs: switch to ->iterate_shared()"")
      backed out the conversion to ->iterate_shared() for Btrfs because the
      delayed inode handling in btrfs_real_readdir() is racy. However, we can
      still do readdir in parallel if there are no delayed nodes.
      
      This is a temporary fix which upgrades the shared inode lock to an
      exclusive lock only when we have delayed items until we come up with a
      more complete solution. While we're here, rename the
      btrfs_{get,put}_delayed_items functions to make it very clear that
      they're just for readdir.
      
      Tested with xfstests and by doing a parallel kernel build:
      
      	while make tinyconfig && make -j4 && git clean dqfx; do
      		:
      	done
      
      along with a bunch of parallel finds in another shell:
      
      	while true; do
      		for ((i=0; i<4; i++)); do
      			find . >/dev/null &
      		done
      		wait
      	done
      Signed-off-by: default avatarOmar Sandoval <osandov@fb.com>
      Signed-off-by: default avatarDavid Sterba <dsterba@suse.com>
      Signed-off-by: default avatarChris Mason <clm@fb.com>
      02dbfc99
  11. 24 Jun, 2016 21 commits
  12. 23 Jun, 2016 1 commit