    [IA64] fix fls() · 821376bf
    David Mosberger-Tang authored
    
    
    The ia64 version of fls() never worked as intended (the bit numbering
    was off by 1 and fls(0) was undefined).  This patch fixes the problem
    by using a popcnt-based fls(), which on McKinley-derived cores is
    slightly faster than both ia64_fls() and generic_fls().  The resulting
    code, however, is bigger (7-8 bundles instead of about 3 bundles).
    Also switch ia64_popcnt() to __builtin_popcountl() for GCC v3.4 or
    newer since the compiler can predicate that and schedule it better.
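
A minimal sketch of what a popcnt-based fls() can look like, for illustration only and not taken from the patch itself. The function name fls_popcount_sketch is hypothetical, and the sketch assumes a 32-bit unsigned int and a GCC-compatible compiler that provides __builtin_popcount(). The idea is to copy the highest set bit into every lower position and then count the set bits, which gives 1-based bit numbering and a defined result of 0 for an input of 0.

```c
#include <assert.h>

/*
 * Hypothetical illustration of a popcount-based "find last set".
 * Assumes unsigned int is 32 bits wide.
 * Returns the 1-based position of the most significant set bit,
 * or 0 when x is 0.
 */
static int fls_popcount_sketch(unsigned int x)
{
	/* Smear the highest set bit into all lower bit positions. */
	x |= x >> 1;
	x |= x >> 2;
	x |= x >> 4;
	x |= x >> 8;
	x |= x >> 16;

	/* The number of set bits now equals the index of the top bit. */
	return __builtin_popcount(x);
}

/* Small self-check covering the two cases the commit message calls out. */
static void fls_popcount_sketch_check(void)
{
	assert(fls_popcount_sketch(0) == 0);            /* fls(0) is defined */
	assert(fls_popcount_sketch(1) == 1);            /* numbering starts at 1 */
	assert(fls_popcount_sketch(0x80000000u) == 32); /* top bit */
}
```

The five shift-and-or steps plus the population count are consistent with the larger code size the message mentions, in exchange for avoiding a special case for zero.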
    
    Thanks to Simon Derr and Matt Mackall for tracking down this bug.
    
    Signed-off-by: David Mosberger-Tang <davidm@hpl.hp.com>
    Signed-off-by: Tony Luck <tony.luck@intel.com>