Skip to content
  • Wu Fengguang's avatar
    writeback: per task dirty rate limit · 9d823e8f
    Wu Fengguang authored
    
    
    Add two fields to task_struct.
    
    1) account dirtied pages in the individual tasks, for accuracy
    2) per-task balance_dirty_pages() call intervals, for flexibility
    
    The balance_dirty_pages() call interval (ie. nr_dirtied_pause) will
    scale near-sqrt to the safety gap between dirty pages and threshold.
    
    The main problem of per-task nr_dirtied is, if 1k+ tasks start dirtying
    pages at exactly the same time, each task will be assigned a large
    initial nr_dirtied_pause, so that the dirty threshold will be exceeded
    long before each task reached its nr_dirtied_pause and hence call
    balance_dirty_pages().
    
    The solution is to watch for the number of pages dirtied on each CPU in
    between the calls into balance_dirty_pages(). If it exceeds ratelimit_pages
    (3% dirty threshold), force call balance_dirty_pages() for a chance to
    set bdi->dirty_exceeded. In normal situations, this safeguarding
    condition is not expected to trigger at all.
    
    On the sqrt in dirty_poll_interval():
    
    It will serve as an initial guess when dirty pages are still in the
    freerun area.
    
    When dirty pages are floating inside the dirty control scope [freerun,
    limit], a followup patch will use some refined dirty poll interval to
    get the desired pause time.
    
       thresh-dirty (MB)    sqrt
    		   1      16
    		   2      22
    		   4      32
    		   8      45
    		  16      64
    		  32      90
    		  64     128
    		 128     181
    		 256     256
    		 512     362
    		1024     512
    
    The above table means, given 1MB (or 1GB) gap and the dd tasks polling
    balance_dirty_pages() on every 16 (or 512) pages, the dirty limit won't
    be exceeded as long as there are less than 16 (or 512) concurrent dd's.
    
    So sqrt naturally leads to less overheads and more safe concurrent tasks
    for large memory servers, which have large (thresh-freerun) gaps.
    
    peter: keep the per-CPU ratelimit for safeguarding the 1k+ tasks case
    
    CC: Peter Zijlstra <a.p.zijlstra@chello.nl>
    Reviewed-by: default avatarAndrea Righi <andrea@betterlinux.com>
    Signed-off-by: default avatarWu Fengguang <fengguang.wu@intel.com>
    9d823e8f