    drm/radeon/kms: fence cleanup + more reliable GPU lockup detection V4 · 225758d8
    Jerome Glisse authored
    
    
    This patch cleans up the fence code. It drops the timeout field of
    the fence, as the time to complete each IB is unpredictable and
    shouldn't be bounded.
    
    The fence cleanup leads to improved GPU lockup detection. This
    patch introduces a callback that allows ASIC-specific tests for
    lockup detection. In this patch the CP is used as a first indicator
    of GPU lockup: if the CP doesn't make progress for 1 second, we
    assume we are facing a GPU lockup.
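
The CP-progress test described above can be sketched in plain userspace C.
This is only an illustration of the idea, not the code from this patch: the
struct, the function names and the millisecond timestamps (standing in for
jiffies) are all hypothetical.

```c
#include <stdbool.h>
#include <stdint.h>

/* Hypothetical names for illustration; not the driver's real structures. */
struct cp_lockup_state {
    uint32_t last_rptr; /* CP read pointer when progress was last seen */
    uint64_t last_ms;   /* time (ms) when progress was last seen */
};

#define CP_LOCKUP_TIMEOUT_MS 1000u /* "no progress for 1 second" */

/* Record the current CP position as the new reference point. */
static void cp_lockup_reset(struct cp_lockup_state *s, uint32_t rptr,
                            uint64_t now_ms)
{
    s->last_rptr = rptr;
    s->last_ms = now_ms;
}

/*
 * Return true if the CP read pointer has not moved for at least
 * CP_LOCKUP_TIMEOUT_MS. Any movement counts as progress and restarts
 * the timeout.
 */
static bool cp_is_lockup(struct cp_lockup_state *s, uint32_t rptr,
                         uint64_t now_ms)
{
    if (rptr != s->last_rptr) {
        cp_lockup_reset(s, rptr, now_ms);
        return false;
    }
    return (now_ms - s->last_ms) >= CP_LOCKUP_TIMEOUT_MS;
}
```

The caller has to pass a freshly read CP read pointer; a stale value would
look like a stalled CP and could produce a false positive.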
    
    Fences can take a long time to be signaled, so to avoid the overhead
    of testing for GPU lockup too frequently we query the lockup callback
    only every 500 msec. There are plenty of code comments explaining the
    design and choices inside the code.
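
A minimal sketch of such a wait loop, assuming a simulated clock and a
simulated fence so that it runs in userspace. All names here (sim_fence,
fence_wait, FENCE_POLL_MS) are hypothetical and the real driver sleeps on a
wait queue instead of advancing a counter.

```c
#include <stdbool.h>
#include <stdint.h>

#define FENCE_POLL_MS     500u  /* interval between lockup queries */
#define LOCKUP_TIMEOUT_MS 1000u /* CP inactivity treated as a lockup */

/* Simulated hardware state; UINT64_MAX means "never happens". */
struct sim_fence {
    uint64_t signal_at_ms; /* time at which the fence signals */
    uint64_t cp_stall_ms;  /* time at which the CP stops making progress */
};

static bool sim_signaled(const struct sim_fence *f, uint64_t now_ms)
{
    return now_ms >= f->signal_at_ms;
}

static bool sim_is_lockup(const struct sim_fence *f, uint64_t now_ms)
{
    return now_ms >= f->cp_stall_ms &&
           now_ms - f->cp_stall_ms >= LOCKUP_TIMEOUT_MS;
}

/*
 * Wait for the fence. The lockup test runs only when a full polling
 * interval elapsed without the fence signaling.
 * Returns 0 when the fence signaled, -1 when a lockup was detected.
 */
static int fence_wait(const struct sim_fence *f, uint64_t *now_ms,
                      unsigned *lockup_checks)
{
    while (!sim_signaled(f, *now_ms)) {
        *now_ms += FENCE_POLL_MS; /* stands in for sleeping up to 500 ms */
        if (sim_signaled(f, *now_ms))
            break;
        (*lockup_checks)++;
        if (sim_is_lockup(f, *now_ms))
            return -1;
    }
    return 0;
}
```

With this structure a fence that signals within one polling interval never
triggers the lockup test, while a stalled CP is queried twice before the 1 s
timeout is reached.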
    
    This has been tested mostly on R3XX/R5XX hw. On a normally running
    desktop (compiz, firefox, quake3 running) the lockup callback wasn't
    called a single time (1 hour session). Also tested by forcing a GPU
    lockup; the lockup was reported after the 1s CP activity timeout.
    
    V2 switch to a 500ms timeout so the GPU lockup callback gets called at
       least 2 times in less than 2sec.
    V3 store last jiffies in the fence struct so on ERESTART, EBUSY we keep
       track of how long we have already waited for a given fence
    V4 make sure we have an up-to-date CP read pointer so we don't get
       false positives
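
One possible reading of the V3 change, sketched in userspace C: the time at
which waiting began is kept in the fence itself, so a wait that is
interrupted and retried continues counting from the original start rather
than from zero. The struct and function names are hypothetical and
milliseconds stand in for jiffies.

```c
#include <stdbool.h>
#include <stdint.h>

/* Hypothetical fence; only the fields needed for the illustration. */
struct sketch_fence {
    bool     wait_started;  /* has any wait on this fence begun yet? */
    uint64_t wait_start_ms; /* time of the first wait attempt */
};

/*
 * Return how long this fence has been waited on in total. The start time
 * is recorded on the first call only, so later calls (for example after
 * an interrupted wait is restarted) keep the accumulated time.
 */
static uint64_t fence_waited_ms(struct sketch_fence *f, uint64_t now_ms)
{
    if (!f->wait_started) {
        f->wait_started = true;
        f->wait_start_ms = now_ms;
    }
    return now_ms - f->wait_start_ms;
}
```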
    
    Signed-off-by: Jerome Glisse <jglisse@redhat.com>
    Signed-off-by: Dave Airlie <airlied@redhat.com>