    block: do not merge requests without consulting with io scheduler · 72ef799b
    Tahsin Erdogan authored
    Before merging a bio into an existing request, the io scheduler is called
    to get its approval first. However, requests that come from a plug
    flush may get merged by the block layer without consulting the io
    scheduler.
    
    In the case of CFQ, this can cause fairness problems. For instance, if a
    request gets merged into a low weight cgroup's request, the high weight
    cgroup now depends on the low weight cgroup to get scheduled. If the high
    weight cgroup needs that io request to complete before submitting more
    requests, then it will also lose its timeslice.
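
The idea of asking the scheduler before a request-to-request merge can be sketched with a toy model. This is not kernel code: the request encoding `group:start_kb:len_kb` and the helper names `allow_merge`, `adjacent` and `try_merge` are hypothetical, and the veto policy (merge only within the same cgroup) is an assumption chosen for illustration.

```shell
#!/bin/bash
# Toy model only. A request is encoded as "group:start_kb:len_kb".

# allow_merge stands in for the scheduler's veto. Assumed policy:
# approve a merge only when both requests belong to the same group.
allow_merge() {
  local a_group=${1%%:*} b_group=${2%%:*}
  [ "$a_group" = "$b_group" ]
}

# adjacent succeeds when request b starts exactly where request a ends.
adjacent() {
  local a_start a_len b_start
  IFS=: read -r _ a_start a_len <<< "$1"
  IFS=: read -r _ b_start _ <<< "$2"
  [ $((a_start + a_len)) -eq "$b_start" ]
}

# try_merge prints the merged request if the requests are adjacent and
# the scheduler approves; otherwise it prints both requests unchanged.
try_merge() {
  local a=$1 b=$2
  if adjacent "$a" "$b" && allow_merge "$a" "$b"; then
    local g s l bl
    IFS=: read -r g s l <<< "$a"
    IFS=: read -r _ _ bl <<< "$b"
    echo "$g:$s:$((l + bl))"
  else
    echo "$a $b"
  fi
}

try_merge g1:0:64 g2:64:64    # adjacent, different groups: not merged
try_merge g2:64:64 g2:128:64  # adjacent, same group: merged
```

Without the `allow_merge` check, the first call would fold the g2 request into the g1 request, which is the situation described above.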
    
    The following script demonstrates the problem. Group g1 has a low weight;
    g2 and g3 have equal high weights, but g2's requests are adjacent to g1's
    requests, so they are subject to merging. Due to these merges, g2 gets
    poor disk time allocation.
    
    cat > cfq-merge-repro.sh << "EOF"
    #!/bin/bash
    set -e
    
    IO_ROOT=/mnt-cgroup/io
    
    mkdir -p $IO_ROOT
    
    if ! mount | grep -qw $IO_ROOT; then
      mount -t cgroup none -oblkio $IO_ROOT
    fi
    
    cd $IO_ROOT
    
    for i in g1 g2 g3; do
      if [ -d $i ]; then
        rmdir $i
      fi
    done
    
    mkdir g1 && echo 10 > g1/blkio.weight
    mkdir g2 && echo 495 > g2/blkio.weight
    mkdir g3 && echo 495 > g3/blkio.weight
    
    RUNTIME=10
    
    (echo $BASHPID > g1/cgroup.procs &&
     fio --readonly --name name1 --filename /dev/sdb \
         --rw read --size 64k --bs 64k --time_based \
         --runtime=$RUNTIME --offset=0k &> /dev/null)&
    
    (echo $BASHPID > g2/cgroup.procs &&
     fio --readonly --name name1 --filename /dev/sdb \
         --rw read --size 64k --bs 64k --time_based \
         --runtime=$RUNTIME --offset=64k &> /dev/null)&
    
    (echo $BASHPID > g3/cgroup.procs &&
     fio --readonly --name name1 --filename /dev/sdb \
         --rw read --size 64k --bs 64k --time_based \
         --runtime=$RUNTIME --offset=256k &> /dev/null)&
    
    sleep $((RUNTIME+1))
    
    for i in g1 g2 g3; do
      echo ---- $i ----
      cat $i/blkio.time
    done
    
    EOF
    # ./cfq-merge-repro.sh
    ---- g1 ----
    8:16 162
    ---- g2 ----
    8:16 165
    ---- g3 ----
    8:16 686
    
    After applying the patch:
    
    # ./cfq-merge-repro.sh
    ---- g1 ----
    8:16 90
    ---- g2 ----
    8:16 445
    ---- g3 ----
    8:16 471
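
To make the two result sets easier to compare, the blkio.time values can be turned into percentage shares and set against the shares implied by the configured weights. The following is a small sketch; the helper name `share_pct` is hypothetical, and the inputs are the numbers printed above.

```shell
#!/bin/bash
# share_pct prints each argument as a percentage of the sum of all
# arguments, rounded to one decimal place and separated by spaces.
share_pct() {
  echo "$@" | awk '{
    total = 0
    for (i = 1; i <= NF; i++) total += $i
    for (i = 1; i <= NF; i++) printf "%s%.1f", (i > 1 ? " " : ""), 100 * $i / total
    printf "\n"
  }'
}

share_pct 10 495 495    # configured weights for g1 g2 g3
share_pct 162 165 686   # blkio.time before the patch
share_pct 90 445 471    # blkio.time after the patch
```

By weight, g2 and g3 should each get about 49.5% of the disk time. Before the patch g2 gets about 16.3% against g3's 67.7%; after the patch they get about 44.2% and 46.8%.
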
    Signed-off-by: Tahsin Erdogan <tahsin@google.com>
    Signed-off-by: Jens Axboe <axboe@fb.com>