1. 16 Nov, 2015 1 commit
  2. 13 Nov, 2015 9 commits
  3. 12 Nov, 2015 1 commit
    • Dan Williams's avatar
      dax: fix __dax_pmd_fault crash · 152d7bd8
      Dan Williams authored
      Since 4.3 introduced devm_memremap_pages() the pfns handled by DAX may
      optionally have a struct page backing.  When a mapped pfn reaches
      vmf_insert_pfn_pmd() it fails with a crash signature like the following:
      
       kernel BUG at mm/huge_memory.c:905!
       [..]
       Call Trace:
        [<ffffffff812a73ba>] __dax_pmd_fault+0x2ea/0x5b0
        [<ffffffffa01a4182>] xfs_filemap_pmd_fault+0x92/0x150 [xfs]
        [<ffffffff811fbe02>] handle_mm_fault+0x312/0x1b50
      
      Fix this by falling back to 4K mappings in the pfn_valid() case.  Longer
      term, vmf_insert_pfn_pmd() needs to grow support for architectures that
      can provide a 'pmd_special' capability.
      
      Cc: <stable@vger.kernel.org>
      Cc: Andrew Morton <akpm@linux-foundation.org>
      Reported-by: default avatarRoss Zwisler <ross.zwisler@linux.intel.com>
      Signed-off-by: default avatarDan Williams <dan.j.williams@intel.com>
      152d7bd8
  4. 11 Nov, 2015 15 commits
  5. 10 Nov, 2015 13 commits
    • Zhao Lei's avatar
      btrfs: Use fs_info directly in btrfs_delete_unused_bgs · d5f2e33b
      Zhao Lei authored
      No need to use root->fs_info in btrfs_delete_unused_bgs(),
      use fs_info directly instead.
      Signed-off-by: default avatarZhao Lei <zhaolei@cn.fujitsu.com>
      Signed-off-by: default avatarChris Mason <clm@fb.com>
      d5f2e33b
    • Zhao Lei's avatar
      btrfs: Fix lost-data-profile caused by balance bg · 2c9fe835
      Zhao Lei authored
      Reproduce:
       (In integration-4.3 branch)
      
       TEST_DEV=(/dev/vdg /dev/vdh)
       TEST_DIR=/mnt/tmp
      
       umount "$TEST_DEV" >/dev/null
       mkfs.btrfs -f -d raid1 "${TEST_DEV[@]}"
      
       mount -o nospace_cache "$TEST_DEV" "$TEST_DIR"
       btrfs balance start -dusage=0 $TEST_DIR
       btrfs filesystem usage $TEST_DIR
      
       dd if=/dev/zero of="$TEST_DIR"/file count=100
       btrfs filesystem usage $TEST_DIR
      
      Result:
       We can see "no data chunk" in first "btrfs filesystem usage":
       # btrfs filesystem usage $TEST_DIR
       Overall:
          ...
       Metadata,single: Size:8.00MiB, Used:0.00B
          /dev/vdg        8.00MiB
       Metadata,RAID1: Size:122.88MiB, Used:112.00KiB
          /dev/vdg      122.88MiB
          /dev/vdh      122.88MiB
       System,single: Size:4.00MiB, Used:0.00B
          /dev/vdg        4.00MiB
       System,RAID1: Size:8.00MiB, Used:16.00KiB
          /dev/vdg        8.00MiB
          /dev/vdh        8.00MiB
       Unallocated:
          /dev/vdg        1.06GiB
          /dev/vdh        1.07GiB
      
       And "data chunks changed from raid1 to single" in second
       "btrfs filesystem usage":
       # btrfs filesystem usage $TEST_DIR
       Overall:
          ...
       Data,single: Size:256.00MiB, Used:0.00B
          /dev/vdh      256.00MiB
       Metadata,single: Size:8.00MiB, Used:0.00B
          /dev/vdg        8.00MiB
       Metadata,RAID1: Size:122.88MiB, Used:112.00KiB
          /dev/vdg      122.88MiB
          /dev/vdh      122.88MiB
       System,single: Size:4.00MiB, Used:0.00B
          /dev/vdg        4.00MiB
       System,RAID1: Size:8.00MiB, Used:16.00KiB
          /dev/vdg        8.00MiB
          /dev/vdh        8.00MiB
       Unallocated:
          /dev/vdg        1.06GiB
          /dev/vdh      841.92MiB
      
      Reason:
       btrfs balance delete last data chunk in case of no data in
       the filesystem, then we can see "no data chunk" by "fi usage"
       command.
      
       And when we do write operation to fs, the only available data
       profile is 0x0, result is all new chunks are allocated single type.
      
      Fix:
       Allocate a data chunk explicitly to ensure we don't lose the
       raid profile for data.
      
      Test:
       Test by above script, and confirmed the logic by debug output.
      Reviewed-by: default avatarFilipe Manana <fdmanana@suse.com>
      Signed-off-by: default avatarZhao Lei <zhaolei@cn.fujitsu.com>
      Signed-off-by: default avatarChris Mason <clm@fb.com>
      2c9fe835
    • Zhao Lei's avatar
      btrfs: Fix lost-data-profile caused by auto removing bg · aefbe9a6
      Zhao Lei authored
      Reproduce:
       (In integration-4.3 branch)
      
       TEST_DEV=(/dev/vdg /dev/vdh)
       TEST_DIR=/mnt/tmp
      
       umount "$TEST_DEV" >/dev/null
       mkfs.btrfs -f -d raid1 "${TEST_DEV[@]}"
      
       mount -o nospace_cache "$TEST_DEV" "$TEST_DIR"
       umount "$TEST_DEV"
      
       mount -o nospace_cache "$TEST_DEV" "$TEST_DIR"
       btrfs filesystem usage $TEST_DIR
      
      We can see the data chunk changed from raid1 to single:
       # btrfs filesystem usage $TEST_DIR
       Data,single: Size:8.00MiB, Used:0.00B
          /dev/vdg        8.00MiB
       #
      
      Reason:
       When a empty filesystem mount with -o nospace_cache, the last
       data blockgroup will be auto-removed in umount.
      
       Then if we mount it again, there is no data chunk in the
       filesystem, so the only available data profile is 0x0, result
       is all new chunks are created as single type.
      
      Fix:
       Don't auto-delete last blockgroup for a raid type.
      
      Test:
       Test by above script, and confirmed the logic by debug output.
      Reviewed-by: default avatarFilipe Manana <fdmanana@suse.com>
      Signed-off-by: default avatarZhao Lei <zhaolei@cn.fujitsu.com>
      Signed-off-by: default avatarChris Mason <clm@fb.com>
      aefbe9a6
    • Zhao Lei's avatar
      btrfs: Remove len argument from scrub_find_csum · 3b5753ec
      Zhao Lei authored
      It is useless.
      Signed-off-by: default avatarZhao Lei <zhaolei@cn.fujitsu.com>
      Signed-off-by: default avatarChris Mason <clm@fb.com>
      3b5753ec
    • Zhao Lei's avatar
      btrfs: Reduce unnecessary arguments in scrub_recheck_block · affe4a5a
      Zhao Lei authored
      We don't need pass so many arguments for recheck sblock now,
      this patch cleans them.
      Signed-off-by: default avatarZhao Lei <zhaolei@cn.fujitsu.com>
      Signed-off-by: default avatarChris Mason <clm@fb.com>
      affe4a5a
    • Zhao Lei's avatar
      btrfs: Use scrub_checksum_data and scrub_checksum_tree_block for scrub_recheck_block_checksum · ba7cf988
      Zhao Lei authored
      We can use existing scrub_checksum_data() and scrub_checksum_tree_block()
      for scrub_recheck_block_checksum(), instead of write duplicated code.
      Signed-off-by: default avatarZhao Lei <zhaolei@cn.fujitsu.com>
      Signed-off-by: default avatarChris Mason <clm@fb.com>
      ba7cf988
    • Zhao Lei's avatar
      btrfs: Reset sblock->xxx_error stats before calling scrub_recheck_block_checksum · 772d233f
      Zhao Lei authored
      We should reset sblock->xxx_error stats before calling
      scrub_recheck_block_checksum().
      
      Current code run correctly because all sblock are allocated by
      k[cz]alloc(), and the error stats are not got changed.
      Signed-off-by: default avatarZhao Lei <zhaolei@cn.fujitsu.com>
      Signed-off-by: default avatarChris Mason <clm@fb.com>
      772d233f
    • Zhao Lei's avatar
      btrfs: scrub: setup all fields for sblock_to_check · 4734b7ed
      Zhao Lei authored
      scrub_setup_recheck_block() isn't setup all necessary fields for
      sblock_to_check because history reason.
      
      So current code need more arguments in severial functions,
      and more local variables, just to passing these lacked values to
      necessary place.
      
      This patch setup above fields to sblock_to_check in
      scrub_setup_recheck_block(), for:
      1: more cleanup for function arg, local variable
      2: to make sblock_to_check complete, then we can use sblock_to_check
         without concern about some uninitialized member.
      Signed-off-by: default avatarZhao Lei <zhaolei@cn.fujitsu.com>
      Signed-off-by: default avatarChris Mason <clm@fb.com>
      4734b7ed
    • Zhao Lei's avatar
      btrfs: scrub: set error stats when tree block spanning stripes · 9799d2c3
      Zhao Lei authored
      It is better to show error stats to user when we found tree block
      spanning stripes.
      
      On a btrfs created by old version of btrfs-convert:
      Before patch:
        # btrfs scrub start -B /dev/vdh
        scrub done for 8b342d35-2904-41ab-b3cb-2f929709cf47
                scrub started at Tue Aug 25 21:19:09 2015 and finished after 00:00:00
                total bytes scrubbed: 53.54MiB with 0 errors
        # dmesg
        ...
        [  128.711434] BTRFS error (device vdh): scrub: tree block 27054080 spanning stripes, ignored. logical=27000832
        [  128.712744] BTRFS error (device vdh): scrub: tree block 27054080 spanning stripes, ignored. logical=27066368
        ...
      
      After patch:
        # btrfs scrub start -B /dev/vdh
        scrub done for ff7f844b-7a4e-4b1a-88a9-8252ab25be1b
                scrub started at Tue Aug 25 21:42:29 2015 and finished after 00:00:00
                total bytes scrubbed: 53.60MiB with 2 errors
                error details:
                corrected errors: 0, uncorrectable errors: 2, unverified errors: 0
        ERROR: There are uncorrectable errors.
        # dmesg
        ...omit...
        #
      Signed-off-by: default avatarZhao Lei <zhaolei@cn.fujitsu.com>
      Signed-off-by: default avatarChris Mason <clm@fb.com>
      9799d2c3
    • Jens Axboe's avatar
      direct-io: be sure to assign dio->bio_bdev for both paths · c1c53460
      Jens Axboe authored
      btrfs sets ->submit_io(), and we failed to set the block dev for
      that path. That resulted in a potential NULL dereference when
      we later wait for IO in dio_await_one().
      Reported-by: default avatarkernel test robot <ying.huang@linux.intel.com>
      Signed-off-by: default avatarJens Axboe <axboe@fb.com>
      c1c53460
    • Andrew Elble's avatar
      nfsd: fix race with open / open upgrade stateids · 7fc0564e
      Andrew Elble authored
      We observed multiple open stateids on the server for files that
      seemingly should have been closed.
      
      nfsd4_process_open2() tests for the existence of a preexisting
      stateid. If one is not found, the locks are dropped and a new
      one is created. The problem is that init_open_stateid(), which
      is also responsible for hashing the newly initialized stateid,
      doesn't check to see if another open has raced in and created
      a matching stateid. This fix is to enable init_open_stateid() to
      return the matching stateid and have nfsd4_process_open2()
      swap to that stateid and switch to the open upgrade path.
      In testing this patch, coverage to the newly created
      path indicates that the race was indeed happening.
      Signed-off-by: default avatarAndrew Elble <aweits@rit.edu>
      Reviewed-by: default avatarJeff Layton <jlayton@poochiereds.net>
      Signed-off-by: default avatarJ. Bruce Fields <bfields@redhat.com>
      7fc0564e
    • Andrew Elble's avatar
      nfsd: eliminate sending duplicate and repeated delegations · 34ed9872
      Andrew Elble authored
      We've observed the nfsd server in a state where there are
      multiple delegations on the same nfs4_file for the same client.
      The nfs client does attempt to DELEGRETURN these when they are presented to
      it - but apparently under some (unknown) circumstances the client does not
      manage to return all of them. This leads to the eventual
      attempt to CB_RECALL more than one delegation with the same nfs
      filehandle to the same client. The first recall will succeed, but the
      next recall will fail with NFS4ERR_BADHANDLE. This leads to the server
      having delegations on cl_revoked that the client has no way to FREE
      or DELEGRETURN, with resulting inability to recover. The state manager
      on the server will continually assert SEQ4_STATUS_RECALLABLE_STATE_REVOKED,
      and the state manager on the client will be looping unable to satisfy
      the server.
      
      List discussion also reports a race between OPEN and DELEGRETURN that
      will be avoided by only sending the delegation once to the
      client. This is also logically in accordance with RFC5561 9.1.1 and 10.2.
      
      So, let's:
      
      1.) Not hand out duplicate delegations.
      2.) Only send them to the client once.
      
      RFC 5561:
      
      9.1.1:
      "Delegations and layouts, on the other hand, are not associated with a
      specific owner but are associated with the client as a whole
      (identified by a client ID)."
      
      10.2:
      "...the stateid for a delegation is associated with a client ID and may be
      used on behalf of all the open-owners for the given client.  A
      delegation is made to the client as a whole and not to any specific
      process or thread of control within it."
      Reported-by: default avatarEric Meddaugh <etmsys@rit.edu>
      Cc: Trond Myklebust <trond.myklebust@primarydata.com>
      Cc: Olga Kornievskaia <aglo@umich.edu>
      Signed-off-by: default avatarAndrew Elble <aweits@rit.edu>
      Cc: stable@vger.kernel.org
      Signed-off-by: default avatarJ. Bruce Fields <bfields@redhat.com>
      34ed9872
    • Jeff Layton's avatar
      nfsd: remove recurring workqueue job to clean DRC · 3e80dbcd
      Jeff Layton authored
      We have a shrinker, we clean out the cache when nfsd is shut down, and
      prune the chains on each request. A recurring workqueue job seems like
      unnecessary overhead. Just remove it.
      Signed-off-by: default avatarJeff Layton <jeff.layton@primarydata.com>
      Signed-off-by: default avatarJ. Bruce Fields <bfields@redhat.com>
      3e80dbcd
  6. 09 Nov, 2015 1 commit
    • Rich Felker's avatar
      fs/binfmt_elf_fdpic.c: provide NOMMU loader for regular ELF binaries · 1bde925d
      Rich Felker authored
      The ELF binary loader in binfmt_elf.c requires an MMU, making it
      impossible to use regular ELF binaries on NOMMU archs.  However, the FDPIC
      ELF loader in binfmt_elf_fdpic.c is fully capable as a loader for plain
      ELF, which requires constant displacements between LOAD segments, since it
      already supports FDPIC ELF files flagged as needing constant displacement.
      
      This patch adjusts the FDPIC ELF loader to accept non-FDPIC ELF files on
      NOMMU archs.  They are treated identically to FDPIC ELF files with the
      constant-displacement flag bit set, except for personality, which must
      match the ABI of the program being loaded; the PER_LINUX_FDPIC personality
      controls how the kernel interprets function pointers passed to sigaction.
      
      Files that do not set a stack size requirement explicitly are given a
      default stack size (matching the amount of committed stack the normal ELF
      loader for MMU archs would give them) rather than being rejected; this is
      necessary because plain ELF files generally do not declare stack
      requirements in theit program headers.
      
      Only ET_DYN (PIE) format ELF files are supported, since loading at a fixed
      virtual address is not possible on NOMMU.
      
      This patch was developed and tested on J2 (SH2-compatible) but should
      be usable immediately on all archs where binfmt_elf_fdpic is
      available. Moreover, by providing dummy definitions of the
      elf_check_fdpic() and elf_check_const_displacement() macros for archs
      which lack an FDPIC ABI, it should be possible to enable building of
      binfmt_elf_fdpic on all other NOMMU archs and thereby give them ELF
      binary support, but I have not yet tested this.
      
      The motivation for using binfmt_elf_fdpic.c rather than adapting
      binfmt_elf.c to NOMMU is that the former already has all the necessary
      code to work properly on NOMMU and has already received widespread
      real-world use and testing. I hope this is not controversial.
      
      I'm not really happy with having to unset the FDPIC_FUNCPTRS
      personality bit when loading non-FDPIC ELF. This bit should really
      reset automatically on execve, since otherwise, executing non-ELF
      binaries (e.g. bFLT) from an FDPIC process will leave the personality
      in the wrong state and severely break signal handling. But that's a
      separate, existing bug and I don't know the right place to fix it.
      Signed-off-by: default avatarRich Felker <dalias@libc.org>
      Acked-by: default avatarGreg Ungerer <gerg@uclinux.org>
      Cc: Paul Gortmaker <paul.gortmaker@windriver.com>
      Cc: Matt Mackall <mpm@selenic.com>
      Cc: David Woodhouse <dwmw2@infradead.org>
      Cc: Alexander Viro <viro@zeniv.linux.org.uk>
      Cc: David Howells <dhowells@redhat.com>
      Cc: Oleg Endo <oleg.endo@t-online.de>
      Signed-off-by: default avatarAndrew Morton <akpm@linux-foundation.org>
      Signed-off-by: default avatarLinus Torvalds <torvalds@linux-foundation.org>
      1bde925d