• Dave Hansen's avatar
    mm/core, x86/mm/pkeys: Add execute-only protection keys support · 62b5f7d0
    Dave Hansen authored
    Protection keys provide new page-based protection in hardware.
    But, they have an interesting attribute: they only affect data
    accesses and never affect instruction fetches.  That means that
    if we set up some memory which is set as "access-disabled" via
    protection keys, we can still execute from it.
    
    This patch uses protection keys to set up mappings to do just that.
    If a user calls:
    
    	mmap(..., PROT_EXEC);
    or
    	mprotect(ptr, sz, PROT_EXEC);
    
    (note PROT_EXEC-only without PROT_READ/WRITE), the kernel will
    notice this, and set a special protection key on the memory.  It
    also sets the appropriate bits in the Protection Keys User Rights
    (PKRU) register so that the memory becomes unreadable and
    unwritable.
    
    I haven't found any userspace that does this today.  With this
    facility in place, we expect userspace to move to use it
    eventually.  Userspace _could_ start doing this today.  Any
    PROT_EXEC calls get converted to PROT_READ inside the kernel, and
    would transparently be upgraded to "true" PROT_EXEC with this
    code.  IOW, userspace never has to do any PROT_EXEC runtime
    detection.
    
    This feature provides enhanced protection against leaking
    executable memory contents.  This helps thwart attacks which are
    attempting to find ROP gadgets on the fly.
    
    But, the security provided by this approach is not comprehensive.
    The PKRU register which controls access permissions is a normal
    user register writable from unprivileged userspace.  An attacker
    who can execute the 'wrpkru' instruction can easily disable the
    protection provided by this feature.
    
    The protection key that is used for execute-only support is
    permanently dedicated at compile time.  This is fine for now
    because there is currently no API to set a protection key other
    than this one.
    
    Despite there being a constant PKRU value across the entire
    system, we do not set it unless this feature is in use in a
    process.  That is to preserve the PKRU XSAVE 'init state',
    which can lead to faster context switches.
    
    PKRU *is* a user register and the kernel is modifying it.  That
    means that code doing:
    
    	pkru = rdpkru()
    	pkru |= 0x100;
    	mmap(..., PROT_EXEC);
    	wrpkru(pkru);
    
    could lose the bits in PKRU that enforce execute-only
    permissions.  To avoid this, we suggest avoiding ever calling
    mmap() or mprotect() when the PKRU value is expected to be
    unstable.
    Signed-off-by: 's avatarDave Hansen <dave.hansen@linux.intel.com>
    Reviewed-by: 's avatarThomas Gleixner <tglx@linutronix.de>
    Cc: Andrea Arcangeli <aarcange@redhat.com>
    Cc: Andrew Morton <akpm@linux-foundation.org>
    Cc: Andy Lutomirski <luto@amacapital.net>
    Cc: Andy Lutomirski <luto@kernel.org>
    Cc: Aneesh Kumar K.V <aneesh.kumar@linux.vnet.ibm.com>
    Cc: Borislav Petkov <bp@alien8.de>
    Cc: Borislav Petkov <bp@suse.de>
    Cc: Brian Gerst <brgerst@gmail.com>
    Cc: Chen Gang <gang.chen.5i5j@gmail.com>
    Cc: Dan Williams <dan.j.williams@intel.com>
    Cc: Dave Chinner <david@fromorbit.com>
    Cc: Dave Hansen <dave@sr71.net>
    Cc: David Hildenbrand <dahi@linux.vnet.ibm.com>
    Cc: Denys Vlasenko <dvlasenk@redhat.com>
    Cc: H. Peter Anvin <hpa@zytor.com>
    Cc: Kees Cook <keescook@chromium.org>
    Cc: Kirill A. Shutemov <kirill.shutemov@linux.intel.com>
    Cc: Konstantin Khlebnikov <koct9i@gmail.com>
    Cc: Linus Torvalds <torvalds@linux-foundation.org>
    Cc: Mel Gorman <mgorman@suse.de>
    Cc: Oleg Nesterov <oleg@redhat.com>
    Cc: Peter Zijlstra <peterz@infradead.org>
    Cc: Piotr Kwapulinski <kwapulinski.piotr@gmail.com>
    Cc: Rik van Riel <riel@redhat.com>
    Cc: Stephen Smalley <sds@tycho.nsa.gov>
    Cc: Vladimir Murzin <vladimir.murzin@arm.com>
    Cc: Will Deacon <will.deacon@arm.com>
    Cc: keescook@google.com
    Cc: linux-kernel@vger.kernel.org
    Cc: linux-mm@kvack.org
    Link: http://lkml.kernel.org/r/20160212210240.CB4BB5CA@viggo.jf.intel.comSigned-off-by: 's avatarIngo Molnar <mingo@kernel.org>
    62b5f7d0
pkeys.h 766 Bytes