Commit c4875e5b authored by Markus Armbruster's avatar Markus Armbruster Committed by Max Reitz

raw-posix: SEEK_HOLE suffices, get rid of FIEMAP

Commit 5500316d (May 2012) implemented raw_co_is_allocated() as
follows:

1. If defined(CONFIG_FIEMAP), use the FS_IOC_FIEMAP ioctl

2. Else if defined(SEEK_HOLE) && defined(SEEK_DATA), use lseek()

3. Else pretend there are no holes

Later on, raw_co_is_allocated() was generalized to
raw_co_get_block_status().

Commit 4f11aa8a (May 2014) changed it to try the three methods in order
until success, because "there may be implementations which support
[SEEK_HOLE/SEEK_DATA] but not [FIEMAP] (e.g., NFSv4.2) as well as vice
versa."

Unfortunately, we used FIEMAP incorrectly: we lacked FIEMAP_FLAG_SYNC.
Commit 38c4d0ae (Sep 2014) added it.  Because that's a significant
speed hit, the next commit 7c159037 put SEEK_HOLE/SEEK_DATA first.

As you see, the obvious use of FIEMAP is wrong, and the correct use is
slow.  I guess this puts it somewhere between -7 "The obvious use is
wrong" and -10 "It's impossible to get right" on Rusty Russel's Hard
to Misuse scale[*].

"Fortunately", the FIEMAP code is used only when

* SEEK_HOLE/SEEK_DATA aren't defined, but CONFIG_FIEMAP is

  Uncommon.  SEEK_HOLE had no XFS implementation between 2011 (when it
  was introduced for ext4 and btrfs) and 2012.

* SEEK_HOLE/SEEK_DATA and CONFIG_FIEMAP are defined, but lseek() fails

  Unlikely.

Thus, the FIEMAP code executes rarely.  Makes it a nice hidey-hole for
bugs.  Worse, bugs hiding there can theoretically bite even on a host
that has SEEK_HOLE/SEEK_DATA.

I don't want to worry about this crap, not even theoretically.  Get
rid of it.

[*] http://ozlabs.org/~rusty/index.cgi/tech/2008-04-01.htmlSigned-off-by: default avatarMarkus Armbruster <armbru@redhat.com>
Reviewed-by: default avatarMax Reitz <mreitz@redhat.com>
Reviewed-by: default avatarEric Blake <eblake@redhat.com>
Signed-off-by: default avatarMax Reitz <mreitz@redhat.com>
parent be2ebc6d
......@@ -60,9 +60,6 @@
#define FS_NOCOW_FL 0x00800000 /* Do not cow file */
#endif
#endif
#ifdef CONFIG_FIEMAP
#include <linux/fiemap.h>
#endif
#ifdef CONFIG_FALLOCATE_PUNCH_HOLE
#include <linux/falloc.h>
#endif
......@@ -151,9 +148,6 @@ typedef struct BDRVRawState {
bool has_write_zeroes:1;
bool discard_zeroes:1;
bool needs_alignment;
#ifdef CONFIG_FIEMAP
bool skip_fiemap;
#endif
} BDRVRawState;
typedef struct BDRVRawReopenState {
......@@ -1481,52 +1475,6 @@ out:
return result;
}
static int try_fiemap(BlockDriverState *bs, off_t start, off_t *data,
off_t *hole, int nb_sectors)
{
#ifdef CONFIG_FIEMAP
BDRVRawState *s = bs->opaque;
int ret = 0;
struct {
struct fiemap fm;
struct fiemap_extent fe;
} f;
if (s->skip_fiemap) {
return -ENOTSUP;
}
f.fm.fm_start = start;
f.fm.fm_length = (int64_t)nb_sectors * BDRV_SECTOR_SIZE;
f.fm.fm_flags = FIEMAP_FLAG_SYNC;
f.fm.fm_extent_count = 1;
f.fm.fm_reserved = 0;
if (ioctl(s->fd, FS_IOC_FIEMAP, &f) == -1) {
s->skip_fiemap = true;
return -errno;
}
if (f.fm.fm_mapped_extents == 0) {
/* No extents found, data is beyond f.fm.fm_start + f.fm.fm_length.
* f.fm.fm_start + f.fm.fm_length must be clamped to the file size!
*/
off_t length = lseek(s->fd, 0, SEEK_END);
*hole = f.fm.fm_start;
*data = MIN(f.fm.fm_start + f.fm.fm_length, length);
} else {
*data = f.fe.fe_logical;
*hole = f.fe.fe_logical + f.fe.fe_length;
if (f.fe.fe_flags & FIEMAP_EXTENT_UNWRITTEN) {
ret |= BDRV_BLOCK_ZERO;
}
}
return ret;
#else
return -ENOTSUP;
#endif
}
static int try_seek_hole(BlockDriverState *bs, off_t start, off_t *data,
off_t *hole)
{
......@@ -1593,13 +1541,10 @@ static int64_t coroutine_fn raw_co_get_block_status(BlockDriverState *bs,
ret = try_seek_hole(bs, start, &data, &hole);
if (ret < 0) {
ret = try_fiemap(bs, start, &data, &hole, nb_sectors);
if (ret < 0) {
/* Assume everything is allocated. */
data = 0;
hole = start + nb_sectors * BDRV_SECTOR_SIZE;
ret = 0;
}
/* Assume everything is allocated. */
data = 0;
hole = start + nb_sectors * BDRV_SECTOR_SIZE;
ret = 0;
}
assert(ret >= 0);
......
Markdown is supported
0% or
You are about to add 0 people to the discussion. Proceed with caution.
Finish editing this message first!
Please register or to comment