Skip to content
  • Chris Mason's avatar
    Btrfs: rework allocation clustering · fa9c0d79
    Chris Mason authored
    
    
    Because btrfs is copy-on-write, we end up picking new locations for
    blocks very often.  This makes it fairly difficult to maintain perfect
    read patterns over time, but we can at least do some optimizations
    for writes.
    
    This is done today by remembering the last place we allocated and
    trying to find a free space hole big enough to hold more than just one
    allocation.  The end result is that we tend to write sequentially to
    the drive.
    
    This happens all the time for metadata and it happens for data
    when mounted -o ssd.  But, the way we record it is fairly racey
    and it tends to fragment the free space over time because we are trying
    to allocate fairly large areas at once.
    
    This commit gets rid of the races by adding a free space cluster object
    with dedicated locking to make sure that only one process at a time
    is out replacing the cluster.
    
    The free space fragmentation is somewhat solved by allowing a cluster
    to be comprised of smaller free space extents.  This part definitely
    adds some CPU time to the cluster allocations, but it allows the allocator
    to consume the small holes left behind by cow.
    
    Signed-off-by: default avatarChris Mason <chris.mason@oracle.com>
    fa9c0d79