    tracing: Use NUMA allocation for per-cpu ring buffer pages · 7ea59064
    Vaibhav Nagarnaik authored
    The tracing ring buffer is a group of per-cpu ring buffers where
    allocation and logging are done on a per-cpu basis. The events that are
    generated on a particular CPU are logged in the corresponding buffer.
    This is to provide wait-free writes between CPUs and good NUMA node
    locality while accessing the ring buffer.
    
    However, the allocation routines consider NUMA locality only for buffer
    page metadata and not for the actual buffer page. This causes the pages
    to be allocated on the NUMA node local to the CPU where the allocation
    routine is running at the time.
    
    This patch fixes the problem by using a NUMA node specific allocation
    routine so that the pages are allocated from a NUMA node local to the
    logging CPU.
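
The difference between the two behaviours can be sketched with a small user-space model. This is not kernel code and not the patch itself: the four-CPU, two-node topology and every function name below are hypothetical, chosen only for illustration. In the kernel, pairing `cpu_to_node()` with a node-aware allocator such as `alloc_pages_node()` would be one way to express the same idea; the message above does not say which routine the patch actually uses.

```c
#include <assert.h>

/* Hypothetical machine for illustration: 4 CPUs spread over 2 NUMA nodes. */
#define MODEL_NR_CPUS 4

/* Assumed topology: CPUs 0-1 sit on node 0, CPUs 2-3 on node 1. */
static const int model_cpu_node_map[MODEL_NR_CPUS] = { 0, 0, 1, 1 };

/* User-space stand-in for a CPU-to-node lookup. */
int model_cpu_to_node(int cpu)
{
    assert(cpu >= 0 && cpu < MODEL_NR_CPUS);
    return model_cpu_node_map[cpu];
}

/*
 * Behaviour described as the problem: the buffer page lands on the node
 * of whichever CPU happens to be running the allocation routine.
 */
int page_node_before(int running_cpu, int logging_cpu)
{
    (void)logging_cpu; /* ignored, which is the bug being modelled */
    return model_cpu_to_node(running_cpu);
}

/*
 * Behaviour described as the fix: the page is requested from the node
 * that is local to the CPU that will log into the buffer.
 */
int page_node_after(int running_cpu, int logging_cpu)
{
    (void)running_cpu; /* where the allocation runs no longer matters */
    return model_cpu_to_node(logging_cpu);
}

/*
 * Count the per-cpu buffers whose page ends up on a remote node when
 * all allocations are performed from running_cpu under a given policy.
 */
int count_remote_buffers(int running_cpu, int (*policy)(int, int))
{
    int remote = 0;

    for (int cpu = 0; cpu < MODEL_NR_CPUS; cpu++) {
        if (policy(running_cpu, cpu) != model_cpu_to_node(cpu))
            remote++;
    }
    return remote;
}
```

In this model, allocating every buffer from CPU 0 leaves the buffers for CPUs 2 and 3 on a remote node under the old policy, and none under the new one.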
    
    I tested with the getuid_microbench from autotest. It is a simple binary
    that calls getuid() in a loop and measures the average time for the
    syscall to complete. The following command was used to test:
    $ getuid_microbench 1000000
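
A benchmark of the kind described above might look roughly like the following. This is a minimal sketch, not the autotest source: the function name `avg_getuid_ns` and the choice of `CLOCK_MONOTONIC` are assumptions made for illustration.

```c
#define _POSIX_C_SOURCE 200809L

#include <assert.h>
#include <time.h>
#include <unistd.h>

/*
 * Call getuid() `iterations` times and return the average wall-clock
 * time per call in nanoseconds. Hypothetical helper, not autotest code.
 */
double avg_getuid_ns(long iterations)
{
    struct timespec start, end;
    double elapsed_ns;

    assert(iterations > 0);

    clock_gettime(CLOCK_MONOTONIC, &start);
    for (long i = 0; i < iterations; i++)
        (void)getuid(); /* result is discarded; only the call cost matters */
    clock_gettime(CLOCK_MONOTONIC, &end);

    /* Convert the two timestamps into a single nanosecond difference. */
    elapsed_ns = (double)(end.tv_sec - start.tv_sec) * 1e9
               + (double)(end.tv_nsec - start.tv_nsec);

    return elapsed_ns / (double)iterations;
}
```

Calling `avg_getuid_ns(1000000)` would correspond to the iteration count passed on the command line above. Running it once with tracing enabled and once without gives a rough per-syscall tracing overhead.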
    
    Compared th...