    xfs: don't use vfs writeback for pure metadata modifications · dcd79a14
    Dave Chinner authored
    
    
    Under heavy multi-way parallel create workloads, the VFS struggles
    to write back all the inodes that have been changed in age order.
    The bdi flusher thread becomes CPU bound, spending 85% of its time
    in the VFS code, mostly traversing the superblock dirty inode list
    to separate dirty inodes old enough to flush.
    
    We already keep an index of all metadata changes in age order - in
    the AIL - and continued log pressure will do age ordered writeback
    without any extra overhead at all. If there is no pressure on the
    log, the xfssyncd will periodically write back metadata in ascending
    disk address offset order, which is very efficient.
    
    Hence we can stop marking VFS inodes dirty during transaction commit
    or when changing timestamps during transactions. This will limit the
    inodes on the superblock dirty list to those containing data or
    unlogged metadata changes.
    
    However, the timestamp changes are slightly more complex than this -
    there are a couple of places that do unlogged updates of the
    timestamps, and the VFS needs to be informed of these. Hence add a
    new function xfs_trans_ichgtime() for transactional changes,
    and leave xfs_ichgtime() for the non-transactional changes.
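
The split described above can be illustrated with a small user-space
model. This is only a sketch, not the kernel code: the toy_inode
structure, the TOY_ICHGTIME_* flags and both toy_* functions are
hypothetical stand-ins, and the two booleans merely model "on the
superblock dirty list" and "tracked by the log".

```c
#include <assert.h>
#include <stdbool.h>
#include <stddef.h>

/* Hypothetical flags selecting which timestamps to update. */
#define TOY_ICHGTIME_MOD 0x1 /* modification time */
#define TOY_ICHGTIME_CHG 0x2 /* inode change time */

/* Toy stand-in for an inode; not the kernel's struct xfs_inode. */
struct toy_inode {
    long mtime;
    long ctime;
    bool vfs_dirty; /* models: on the superblock dirty inode list */
    bool logged;    /* models: change is tracked by the log (AIL) */
};

/* Shared helper: update only the timestamps selected by flags. */
void toy_set_times(struct toy_inode *ip, int flags, long now)
{
    assert(ip != NULL);
    if (flags & TOY_ICHGTIME_MOD)
        ip->mtime = now;
    if (flags & TOY_ICHGTIME_CHG)
        ip->ctime = now;
}

/*
 * Non-transactional update: nothing in the log records the change,
 * so the VFS has to be told the inode is dirty.
 */
void toy_ichgtime(struct toy_inode *ip, int flags, long now)
{
    toy_set_times(ip, flags, now);
    ip->vfs_dirty = true;
}

/*
 * Transactional update: the change is logged, so age-ordered
 * writeback comes from the log and the VFS is not involved.
 */
void toy_trans_ichgtime(struct toy_inode *ip, int flags, long now)
{
    toy_set_times(ip, flags, now);
    ip->logged = true;
}
```

In this model, only callers of toy_ichgtime() ever put an inode on
the VFS dirty list; callers of toy_trans_ichgtime() rely on the log
alone.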
    
    Signed-off-by: Dave Chinner <dchinner@redhat.com>
    Reviewed-by: Alex Elder <aelder@sgi.com>
    Reviewed-by: Christoph Hellwig <hch@lst.de>