    uvesafb,vesafb: create WC or WB PAT-entries · 803a4e14
    Thomas Schlichter authored
    
    
    With a PAT-enabled kernel, the uvesafb and vesafb drivers create
    uncached-minus PAT entries for the framebuffer memory because they use
    plain ioremap() (not the *_cache or *_wc variants). When the framebuffer
    memory intersects with the video RAM used by Xorg, the complete video RAM
    is mapped uncached-minus, which results in a severe performance penalty.
    
    Here are the correct MTRR entries created by uvesafb:
    schlicht@netbook:~$ cat /proc/mtrr
    reg00: base=0x000000000 ( 0MB), size= 2048MB, count=1: write-back
    reg01: base=0x06ff00000 ( 1791MB), size= 1MB, count=1: uncachable
    reg02: base=0x070000000 ( 1792MB), size= 256MB, count=1: uncachable
    reg03: base=0x0d0000000 ( 3328MB), size= 16MB, count=1: write-combining
    
    And here are the problematic PAT entries:
    schlicht@netbook:~$ sudo cat /sys/kernel/debug/x86/pat_memtype_list
    PAT memtype list:
    write-back @ 0x0-0x1000
    uncached-minus @ 0x6fedd000-0x6fee3000
    uncached-minus @ 0x6fee2000-0x6fee3000
    uncached-minus @ 0x6fee2000-0x6fee3000
    uncached-minus @ 0x6fee2000-0x6fee3000
    uncached-minus @ 0x6fee2000-0x6fee3000
    uncached-minus @ 0x6fee2000-0x6fee3000
    uncached-minus @ 0x6fee2000-0x6fee3000
    uncached-minus @ 0x6fee2000-0x6fee3000
    uncached-minus @ 0x6fee3000-0x6fee4000
    uncached-minus @ 0x6fee3000-0x6fee4000
    uncached-minus @ 0x6fee3000-0x6fee4000
    uncached-minus @ 0xd0000000-0xe0000000 <-- created by xserver-xorg
    uncached-minus @ 0xd0000000-0xd1194000 <-- created by uvesafb
    uncached-minus @ 0xf4000000-0xf4009000
    uncached-minus @ 0xf4200000-0xf4400000
    uncached-minus @ 0xf5000000-0xf5010000
    uncached-minus @ 0xf5100000-0xf5104000
    uncached-minus @ 0xf5400000-0xf5404000
    uncached-minus @ 0xf5404000-0xf5405000
    uncached-minus @ 0xf5404000-0xf5405000
    uncached-minus @ 0xfed00000-0xfed01000
    
    Therefore I created the attached patch for uvesafb which uses ioremap_wc() to
    create the correct PAT entries, as shown below:
    schlicht@netbook:~$ sudo cat /sys/kernel/debug/x86/pat_memtype_list
    PAT memtype list:
    write-back @ 0x0-0x1000
    uncached-minus @ 0x6fedd000-0x6fee3000
    uncached-minus @ 0x6fee2000-0x6fee3000
    uncached-minus @ 0x6fee2000-0x6fee3000
    uncached-minus @ 0x6fee2000-0x6fee3000
    uncached-minus @ 0x6fee2000-0x6fee3000
    uncached-minus @ 0x6fee2000-0x6fee3000
    uncached-minus @ 0x6fee2000-0x6fee3000
    uncached-minus @ 0x6fee2000-0x6fee3000
    uncached-minus @ 0x6fee3000-0x6fee4000
    uncached-minus @ 0x6fee3000-0x6fee4000
    uncached-minus @ 0x6fee3000-0x6fee4000
    write-combining @ 0xd0000000-0xe0000000
    write-combining @ 0xd0000000-0xd1194000
    uncached-minus @ 0xf4000000-0xf4009000
    uncached-minus @ 0xf4200000-0xf4400000
    uncached-minus @ 0xf5000000-0xf5010000
    uncached-minus @ 0xf5100000-0xf5104000
    uncached-minus @ 0xf5400000-0xf5404000
    uncached-minus @ 0xf5404000-0xf5405000
    uncached-minus @ 0xf5404000-0xf5405000
    uncached-minus @ 0xfed00000-0xfed01000
    
    This results in a performance gain, objectively measurable with e.g.
    x11perf -comppixwin10 -comppixwin100 -comppixwin500:
    1: x11perf_xaa.log
    2: x11perf_xaa_patched.log
    
           1                2 Operation
    -------- ---------------- -----------------
    124000.0 202000.0 ( 1.63) Composite 10x10 from pixmap to window
      3340.0  24400.0 ( 7.31) Composite 100x100 from pixmap to window
       131.0   1150.0 ( 8.78) Composite 500x500 from pixmap to window
    
    The performance gain grows with pixmap size and is most pronounced when
    compositing larger pixmaps to a window.
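
The ratios in parentheses in the table above are the patched rate divided by the unpatched rate. A minimal sketch of that arithmetic follows; the function names are hypothetical and are not part of x11perf or of the patch.

```c
#include <stdio.h>

/* Ratio of the patched rate to the unpatched rate (both in operations/sec). */
double speedup(double before, double after)
{
    return after / before;
}

/*
 * Writes the ratio into buf with two decimals, the precision used in the
 * comparison table. Returns the snprintf() result.
 * Hypothetical helper, for illustration only.
 */
int format_speedup(char *buf, size_t len, double before, double after)
{
    return snprintf(buf, len, "%.2f", speedup(before, after));
}
```

For the 100x100 row, format_speedup(buf, sizeof(buf), 3340.0, 24400.0) reproduces the 7.31 shown in the table.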
    
    The patches replace the ioremap() call with the variant matching the mtrr
    parameter. To create "write-back" PAT entries, ioremap_cache() must be
    called after the MTRR entries are created, and the ioremap_cache() region
    must fit completely into the MTRR region. This is why the MTRR region size
    is now rounded up to the next power of two.
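
The selection and rounding logic described above can be modelled in user space as follows. This is only a sketch under stated assumptions, not the driver code: the enum values are hypothetical stand-ins for the mtrr parameter, and the helper names are invented for illustration. Only ioremap(), ioremap_wc() and ioremap_cache() are names taken from this message.

```c
#include <stdbool.h>
#include <stdint.h>

/* Hypothetical stand-ins for the caching modes selected by the mtrr parameter. */
enum cache_mode { MODE_UNCACHED_MINUS, MODE_WRITE_COMBINING, MODE_WRITE_BACK };

/* Name of the ioremap variant that yields the requested PAT memory type. */
const char *ioremap_variant(enum cache_mode mode)
{
    switch (mode) {
    case MODE_WRITE_COMBINING: return "ioremap_wc";
    case MODE_WRITE_BACK:      return "ioremap_cache";
    default:                   return "ioremap";
    }
}

/*
 * Round size up to the next power of two; a power of two is returned
 * unchanged and 0 yields 1. Returns 0 if the result would not fit in 64 bits.
 */
uint64_t round_up_pow2(uint64_t size)
{
    uint64_t p = 1;

    if (size > (UINT64_C(1) << 63))
        return 0;
    while (p < size)
        p <<= 1;
    return p;
}

/* True if [start, start + len) lies entirely inside [base, base + size). */
bool region_fits(uint64_t base, uint64_t size, uint64_t start, uint64_t len)
{
    return start >= base && len <= size && start - base <= size - len;
}
```

Applied to the uvesafb range from the listings above (0xd0000000-0xd1194000, a length of 0x1194000 bytes), round_up_pow2() gives 0x2000000, and region_fits() confirms that a region of that size at the same base contains the whole mapping, whereas a 16 MB region would not.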
    
    Signed-off-by: Thomas Schlichter <thomas.schlichter@web.de>
    Signed-off-by: Paul Mundt <lethal@linux-sh.org>