1. 28 Apr, 2016 2 commits
  2. 25 Apr, 2016 2 commits
  3. 21 Apr, 2016 2 commits
  4. 19 Apr, 2016 3 commits
  5. 15 Apr, 2016 4 commits
    • arm64: Implement ptep_set_access_flags() for hardware AF/DBM · 66dbd6e6
      Catalin Marinas authored
      When hardware updates of the access and dirty states are enabled, the
      default ptep_set_access_flags() implementation based on calling
      set_pte_at() directly is potentially racy. This triggers the "racy dirty
      state clearing" warning in set_pte_at() because an existing writable PTE
      is overridden with a clean entry.
      
      There are two main scenarios for this situation:
      
      1. The CPU getting an access fault does not support hardware updates of
         the access/dirty flags. However, a different agent in the system
         (e.g. SMMU) can do this, therefore overriding a writable entry with a
         clean one could potentially lose the automatically updated dirty
         status.
      
      2. A more complex situation is possible when all CPUs support hardware
         AF/DBM:
      
         a) Initial state: shareable + writable vma and pte_none(pte)
         b) Read fault taken by two threads of the same process on different
            CPUs
         c) CPU0 takes the mmap_sem and proceeds to handling the fault. It
            eventually reaches do_set_pte() which sets a writable + clean pte.
            CPU0 releases the mmap_sem
         d) CPU1 acquires the mmap_sem and proceeds to handle_pte_fault(). The
            pte entry it reads is present, writable and clean and it continues
            to pte_mkyoung()
         e) CPU1 calls ptep_set_access_flags()
      
         If between (d) and (e) the hardware (another CPU) updates the dirty
         state (clears PTE_RDONLY), CPU1 will override the PTE_RDONLY bit,
         marking the entry clean again.
      
      This patch implements an arm64-specific ptep_set_access_flags() function
      to perform an atomic update of the PTE flags.
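
The race in scenario 2 comes down to a read-modify-write on the PTE that is not atomic. As a rough user-space sketch (not the arm64 implementation, which operates on real page tables), the following C11 model contrasts overwriting the entry from a stale snapshot with merging the access flag through a compare-and-exchange loop. The bit positions and the names set_access_flags_racy(), set_access_flags_atomic() and hw_mark_dirty() are assumptions made up for this illustration.

```c
#include <assert.h>
#include <stdatomic.h>
#include <stdint.h>

/* Illustrative bit positions, chosen for this model only. */
#define PTE_VALID   (UINT64_C(1) << 0)
#define PTE_RDONLY  (UINT64_C(1) << 7)   /* set while the entry is clean */
#define PTE_AF      (UINT64_C(1) << 10)  /* access flag ("young") */
#define PTE_WRITE   (UINT64_C(1) << 51)  /* writable: hardware may clear PTE_RDONLY */

/* Racy variant: overwrite the entry with a value built from a stale snapshot. */
static void set_access_flags_racy(_Atomic uint64_t *ptep, uint64_t snapshot)
{
	atomic_store(ptep, snapshot | PTE_AF);
}

/* Atomic variant: merge PTE_AF into whatever value the entry holds right now. */
static void set_access_flags_atomic(_Atomic uint64_t *ptep)
{
	uint64_t old = atomic_load(ptep);

	/* On failure, the compare-exchange reloads 'old', so the retry sees fresh bits. */
	while (!atomic_compare_exchange_weak(ptep, &old, old | PTE_AF))
		;
}

/* Stand-in for a hardware dirty-state update on a writable entry. */
static void hw_mark_dirty(_Atomic uint64_t *ptep)
{
	assert(atomic_load(ptep) & PTE_WRITE);
	atomic_fetch_and(ptep, ~PTE_RDONLY);
}
```

In this model, if hw_mark_dirty() runs between taking the snapshot and the racy store, PTE_RDONLY is set again and the dirty state is lost. The compare-and-exchange variant leaves it cleared.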
      
      Fixes: 2f4b829c ("arm64: Add support for hardware updates of the access and dirty pte bits")
      Signed-off-by: Catalin Marinas <catalin.marinas@arm.com>
      Reported-by: Ming Lei <tom.leiming@gmail.com>
      Tested-by: Julien Grall <julien.grall@arm.com>
      Cc: Will Deacon <will.deacon@arm.com>
      Cc: <stable@vger.kernel.org> # 4.3+
      [will: reworded comment]
      Signed-off-by: Will Deacon <will.deacon@arm.com>
    • arm64, numa: Add NUMA support for arm64 platforms. · 1a2db300
      Ganapatrao Kulkarni authored
      
      Attempt to get the memory and CPU NUMA node via of_numa.  If that
      fails, default to the dummy NUMA node and map all memory and CPUs to
      node 0.

      Tested-by: Shannon Zhao <shannon.zhao@linaro.org>
      Reviewed-by: Robert Richter <rrichter@cavium.com>
      Signed-off-by: Ganapatrao Kulkarni <gkulkarni@caviumnetworks.com>
      Signed-off-by: David Daney <david.daney@cavium.com>
      Signed-off-by: Will Deacon <will.deacon@arm.com>
    • arm64: Move unflatten_device_tree() call earlier. · 3194ac6e
      David Daney authored
      
      In order to extract NUMA information from the device tree, we need to
      have the tree in its unflattened form.
      
      Move the call to bootmem_init() in the tail of paging_init() into
      setup_arch, and adjust header files so that its declaration is
      visible.
      
      Move the unflatten_device_tree() call between the calls to
      paging_init() and bootmem_init().  Follow on patches add NUMA handling
      to bootmem_init().

      Signed-off-by: David Daney <david.daney@cavium.com>
      Signed-off-by: Will Deacon <will.deacon@arm.com>
    • arm64: Add cpu_panic_kernel helper · 17eebd1a
      Suzuki K Poulose authored
      
      During the activation of a secondary CPU, we may detect serious
      configuration issues and hence request to crash the kernel. We
      currently do this for the CPU ASID bits check. We will also need it
      for handling mismatched exception levels for the CPUs with VHE. Hence,
      add a helper that does the same, so that it can be reused.
      
      Cc: Mark Rutland <mark.rutland@arm.com>
      Cc: Will Deacon <will.deacon@arm.com>
      Cc: Catalin Marinas <catalin.marinas@arm.com>
      Cc: Marc Zyngier <marc.zyngier@arm.com>
      Signed-off-by: Suzuki K Poulose <suzuki.poulose@arm.com>
      Signed-off-by: Will Deacon <will.deacon@arm.com>
  6. 14 Apr, 2016 7 commits
    • arm64: mm: Add trace_irqflags annotations to do_debug_exception() · 6afedcd2
      James Morse authored
      
      With CONFIG_PROVE_LOCKING, CONFIG_DEBUG_LOCKDEP and CONFIG_TRACE_IRQFLAGS
      enabled, lockdep will compare current->hardirqs_enabled with the flags from
      local_irq_save().
      
      When a debug exception occurs, interrupts are disabled in entry.S, but
      lockdep isn't told, resulting in:
      DEBUG_LOCKS_WARN_ON(current->hardirqs_enabled)
      ------------[ cut here ]------------
      WARNING: at ../kernel/locking/lockdep.c:3523
      Modules linked in:
      CPU: 3 PID: 1752 Comm: perf Not tainted 4.5.0-rc4+ #2204
      Hardware name: ARM Juno development board (r1) (DT)
      task: ffffffc974868000 ti: ffffffc975f40000 task.ti: ffffffc975f40000
      PC is at check_flags.part.35+0x17c/0x184
      LR is at check_flags.part.35+0x17c/0x184
      pc : [<ffffff80080fc93c>] lr : [<ffffff80080fc93c>] pstate: 600003c5
      [...]
      ---[ end trace 74631f9305ef5020 ]---
      Call trace:
      [<ffffff80080fc93c>] check_flags.part.35+0x17c/0x184
      [<ffffff80080ffe30>] lock_acquire+0xa8/0xc4
      [<ffffff8008093038>] breakpoint_handler+0x118/0x288
      [<ffffff8008082434>] do_debug_exception+0x3c/0xa8
      [<ffffff80080854b4>] el1_dbg+0x18/0x6c
      [<ffffff80081e82f4>] do_filp_open+0x64/0xdc
      [<ffffff80081d6e60>] do_sys_open+0x140/0x204
      [<ffffff80081d6f58>] SyS_openat+0x10/0x18
      [<ffffff8008085d30>] el0_svc_naked+0x24/0x28
      possible reason: unannotated irqs-off.
      irq event stamp: 65857
      hardirqs last  enabled at (65857): [<ffffff80081fb1c0>] lookup_mnt+0xf4/0x1b4
      hardirqs last disabled at (65856): [<ffffff80081fb188>] lookup_mnt+0xbc/0x1b4
      softirqs last  enabled at (65790): [<ffffff80080bdca4>] __do_softirq+0x1f8/0x290
      softirqs last disabled at (65757): [<ffffff80080be038>] irq_exit+0x9c/0xd0
      
      This patch adds the annotations to do_debug_exception(), while trying not
      to call trace_hardirqs_off() if el1_dbg() interrupted a task that already
      had irqs disabled.

      Signed-off-by: James Morse <james.morse@arm.com>
      Signed-off-by: Will Deacon <will.deacon@arm.com>
    • arm64: cover the .head.text section in the .text segment mapping · 7eb90f2f
      Ard Biesheuvel authored
      
      Keeping .head.text out of the .text mapping buys us very little: its actual
      payload is only 4 KB, most of which is padding, but the page alignment may
      add up to 2 MB (in case of CONFIG_DEBUG_ALIGN_RODATA=y) of additional
      padding to the uncompressed kernel Image.
      
      Also, on 4 KB granule kernels, the 4 KB misalignment of .text forces us to
      map the adjacent 56 KB of code without the PTE_CONT attribute, and since
      this region contains things like the vector table and the GIC interrupt
      handling entry point, this region is likely to benefit from the reduced TLB
      pressure that results from PTE_CONT mappings.
      
      So remove the alignment between the .head.text and .text sections, and use
      the [_text, _etext) rather than the [_stext, _etext) interval for mapping
      the .text segment.

      Signed-off-by: Ard Biesheuvel <ard.biesheuvel@linaro.org>
      Signed-off-by: Will Deacon <will.deacon@arm.com>
    • arm64: use 'segment' rather than 'chunk' to describe mapped kernel regions · 2c09ec06
      Ard Biesheuvel authored
      
      Replace the poorly defined term chunk with segment, which is a term that is
      already used by the ELF spec to describe contiguous mappings with the same
      permission attributes of statically allocated ranges of an executable.

      Acked-by: Mark Rutland <mark.rutland@arm.com>
      Signed-off-by: Ard Biesheuvel <ard.biesheuvel@linaro.org>
      Signed-off-by: Will Deacon <will.deacon@arm.com>
    • arm64: mm: move vmemmap region right below the linear region · 3e1907d5
      Ard Biesheuvel authored
      
      This moves the vmemmap region right below PAGE_OFFSET, aka the start
      of the linear region, and redefines its size to be a power of two.
      Due to the placement of PAGE_OFFSET in the middle of the address space,
      whose size is a power of two as well, this guarantees that virt to
      page conversions and vice versa can be implemented efficiently, by
      masking and shifting rather than ordinary arithmetic.
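
To illustrate why power-of-two sizes and alignment make these conversions cheap, here is a small user-space model. It is only a sketch under stated assumptions: a 48-bit VA space, 4 KB pages, a 64-byte struct page, and page structs indexed directly by page offset into the linear region. The constants and the helper names virt_to_page_model() and page_to_virt_model() are hypothetical and are not the kernel's definitions.

```c
#include <assert.h>
#include <stdint.h>

/* All constants are assumptions for this model (48-bit VA, 4 KB pages). */
#define VA_BITS            48
#define PAGE_SHIFT         12
#define STRUCT_PAGE_SHIFT  6   /* pretend sizeof(struct page) == 64 */

#define PAGE_OFFSET    (UINT64_MAX << (VA_BITS - 1))
#define LINEAR_SIZE    (UINT64_C(1) << (VA_BITS - 1))
#define VMEMMAP_SIZE   (UINT64_C(1) << (VA_BITS - 1 - PAGE_SHIFT + STRUCT_PAGE_SHIFT))
#define VMEMMAP_START  (PAGE_OFFSET - VMEMMAP_SIZE)

/* Linear-map address -> address of its page struct, using only mask/shift/or. */
static uint64_t virt_to_page_model(uint64_t vaddr)
{
	assert(vaddr >= PAGE_OFFSET);
	uint64_t off = vaddr & (LINEAR_SIZE - 1);

	return VMEMMAP_START | ((off >> PAGE_SHIFT) << STRUCT_PAGE_SHIFT);
}

/* Page struct address -> linear-map address of the page it describes. */
static uint64_t page_to_virt_model(uint64_t page)
{
	assert(page >= VMEMMAP_START && page < PAGE_OFFSET);
	uint64_t idx = (page & (VMEMMAP_SIZE - 1)) >> STRUCT_PAGE_SHIFT;

	return PAGE_OFFSET | (idx << PAGE_SHIFT);
}
```

Because VMEMMAP_START is a multiple of VMEMMAP_SIZE in this model, the base can be combined with the offset using a bitwise OR instead of an addition.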

      Signed-off-by: Ard Biesheuvel <ard.biesheuvel@linaro.org>
      Signed-off-by: Will Deacon <will.deacon@arm.com>
    • arm64: mm: free __init memory via the linear mapping · d386825c
      Ard Biesheuvel authored
      
      The implementation of free_initmem_default() expects __init_begin
      and __init_end to be covered by the linear mapping, which is no
      longer the case. So open code it instead, using addresses that are
      explicitly translated from kernel virtual to linear virtual.

      Signed-off-by: Ard Biesheuvel <ard.biesheuvel@linaro.org>
      Signed-off-by: Will Deacon <will.deacon@arm.com>
    • arm64: add the initrd region to the linear mapping explicitly · 177e15f0
      Ard Biesheuvel authored
      
      Instead of going out of our way to relocate the initrd if it turns out
      to occupy memory that is not covered by the linear mapping, just add the
      initrd to the linear mapping. This puts the burden on the bootloader to
      pass initrd= and mem= options that are mutually consistent.
      
      Note that, since the placement of the linear region in the PA space is
      also dependent on the placement of the kernel Image, which may reside
      anywhere in memory, we may still end up with a situation where the initrd
      and the kernel Image are simply too far apart to be covered by the linear
      region.
      
      Since we now leave it up to the bootloader to pass the initrd in memory
      that is guaranteed to be accessible by the kernel, add a mention of this to
      the arm64 boot protocol specification as well.

      Signed-off-by: Ard Biesheuvel <ard.biesheuvel@linaro.org>
      Signed-off-by: Will Deacon <will.deacon@arm.com>
    • arm64/mm: ensure memstart_addr remains sufficiently aligned · 2958987f
      Ard Biesheuvel authored
      
      After choosing memstart_addr to be the highest multiple of
      ARM64_MEMSTART_ALIGN less than or equal to the first usable physical memory
      address, we clip the memblocks to the maximum size of the linear region.
      Since the kernel may be high up in memory, we take care not to clip the
      kernel itself, which means we have to clip some memory from the bottom if
      this occurs, to ensure that the distance between the first and the last
      usable physical memory address can be covered by the linear region.
      
      However, we fail to update memstart_addr if this clipping from the bottom
      occurs, which means that we may still end up with virtual addresses that
      wrap into the userland range. So increment memstart_addr as appropriate to
      prevent this from happening.
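
The adjustment described above can be modelled with plain integer arithmetic. The following is a minimal sketch, not the kernel code: choose_memstart() and its parameters are hypothetical names, RAM is treated as one contiguous range, and the memblock clipping itself is left out.

```c
#include <assert.h>
#include <stdint.h>

static uint64_t round_down_pow2(uint64_t x, uint64_t align)
{
	assert(align && (align & (align - 1)) == 0);
	return x & ~(align - 1);
}

static uint64_t round_up_pow2(uint64_t x, uint64_t align)
{
	assert(align && (align & (align - 1)) == 0);
	return (x + align - 1) & ~(align - 1);
}

/*
 * Hypothetical helper: pick the physical base of the linear region for RAM
 * spanning [ram_start, ram_end), given the size of the linear region and the
 * alignment required of the base.
 */
static uint64_t choose_memstart(uint64_t ram_start, uint64_t ram_end,
				uint64_t linear_size, uint64_t align)
{
	uint64_t memstart = round_down_pow2(ram_start, align);

	/* RAM reaches past the top of the linear region: move the base up. */
	if (ram_end - memstart > linear_size)
		memstart = round_up_pow2(ram_end - linear_size, align);

	return memstart;
}
```

Rounding up rather than down keeps the base aligned while ensuring that ram_end minus the base never exceeds the linear region size, so no address wraps below the start of the region.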

      Signed-off-by: Ard Biesheuvel <ard.biesheuvel@linaro.org>
      Signed-off-by: Will Deacon <will.deacon@arm.com>
  7. 24 Mar, 2016 2 commits
    • arm64: mm: allow preemption in copy_to_user_page · 691b1e2e
      Mark Rutland authored
      
      Currently we disable preemption in copy_to_user_page, a behaviour that
      we inherited from the 32-bit arm code. This was necessary for older
      cores without broadcast data cache maintenance, and ensured that cache
      lines were dirtied and cleaned by the same CPU. On these systems dirty
      cache line migration was not possible, so this was sufficient to
      guarantee coherency.
      
      On contemporary systems, cache coherence protocols permit (dirty) cache
      lines to migrate between CPUs as a result of speculation, prefetching,
      and other behaviours. To account for this, in ARMv8 data cache
      maintenance operations are broadcast and affect all data caches in the
      domain associated with the VA (i.e. ISH for kernel and user mappings).
      
      In __switch_to we ensure that tasks can be safely migrated in the middle
      of a maintenance sequence, using a dsb(ish) to ensure prior explicit
      memory accesses are observed and cache maintenance operations are
      completed before a task can be run on another CPU.
      
      Given the above, it is not necessary to disable preemption in
      copy_to_user_page. This patch removes the preempt_{disable,enable}
      calls, permitting preemption.

      Signed-off-by: Mark Rutland <mark.rutland@arm.com>
      Cc: Will Deacon <will.deacon@arm.com>
      Signed-off-by: Catalin Marinas <catalin.marinas@arm.com>
    • arm64: consistently use p?d_set_huge · c661cb1c
      Mark Rutland authored
      Commit 324420bf ("arm64: add support for ioremap() block
      mappings") added new p?d_set_huge functions which do the hard work to
      generate and set a correct block entry.
      
      These differ from open-coded huge page creation in the early page table
      code by explicitly setting the P?D_TYPE_SECT bits (which are implicitly
      retained by mk_sect_prot() for any valid prot), but are otherwise
      identical (and cannot fail on arm64).
      
      For simplicity and consistency, make use of these in the initial page
      table creation code.

      Signed-off-by: Mark Rutland <mark.rutland@arm.com>
      Cc: Ard Biesheuvel <ard.biesheuvel@linaro.org>
      Cc: Will Deacon <will.deacon@arm.com>
      Signed-off-by: Catalin Marinas <catalin.marinas@arm.com>
  8. 21 Mar, 2016 1 commit
  9. 17 Mar, 2016 2 commits
  10. 11 Mar, 2016 2 commits
    • arm64: kasan: Fix zero shadow mapping overriding kernel image shadow · 2776e0e8
      Catalin Marinas authored
      With the 16KB and 64KB page size configurations, SWAPPER_BLOCK_SIZE is
      PAGE_SIZE and ARM64_SWAPPER_USES_SECTION_MAPS is 0. Since
      kimg_shadow_end is not page aligned (_end shifted by
      KASAN_SHADOW_SCALE_SHIFT), the edges of previously mapped kernel image
      shadow via vmemmap_populate() may be overridden by subsequent calls to
      kasan_populate_zero_shadow(), leading to kernel panics like below:
      
      ------------------------------------------------------------------------------
      Unable to handle kernel paging request at virtual address fffffc100135068c
      pgd = fffffc8009ac0000
      [fffffc100135068c] *pgd=00000009ffee0003, *pud=00000009ffee0003, *pmd=00000009ffee0003, *pte=00e0000081a00793
      Internal error: Oops: 9600004f [#1] PREEMPT SMP
      Modules linked in:
      CPU: 0 PID: 1 Comm: swapper/0 Not tainted 4.5.0-rc4+ #1984
      Hardware name: Juno (DT)
      task: fffffe09001a0000 ti: fffffe0900200000 task.ti: fffffe0900200000
      PC is at __memset+0x4c/0x200
      LR is at kasan_unpoison_shadow+0x34/0x50
      pc : [<fffffc800846f1cc>] lr : [<fffffc800821ff54>] pstate: 00000245
      sp : fffffe0900203db0
      x29: fffffe0900203db0 x28: 0000000000000000
      x27: 0000000000000000 x26: 0000000000000000
      x25: fffffc80099b69d0 x24: 0000000000000001
      x23: 0000000000000000 x22: 0000000000002000
      x21: dffffc8000000000 x20: 1fffff9001350a8c
      x19: 0000000000002000 x18: 0000000000000008
      x17: 0000000000000147 x16: ffffffffffffffff
      x15: 79746972100e041d x14: ffffff0000000000
      x13: ffff000000000000 x12: 0000000000000000
      x11: 0101010101010101 x10: 1fffffc11c000000
      x9 : 0000000000000000 x8 : fffffc100135068c
      x7 : 0000000000000000 x6 : 000000000000003f
      x5 : 0000000000000040 x4 : 0000000000000004
      x3 : fffffc100134f651 x2 : 0000000000000400
      x1 : 0000000000000000 x0 : fffffc100135068c
      
      Process swapper/0 (pid: 1, stack limit = 0xfffffe0900200020)
      Call trace:
      [<fffffc800846f1cc>] __memset+0x4c/0x200
      [<fffffc8008220044>] __asan_register_globals+0x5c/0xb0
      [<fffffc8008a09d34>] _GLOBAL__sub_I_65535_1_sunrpc_cache_lookup+0x1c/0x28
      [<fffffc8008f20d28>] kernel_init_freeable+0x104/0x274
      [<fffffc80089e1948>] kernel_init+0x10/0xf8
      [<fffffc8008093a00>] ret_from_fork+0x10/0x50
      ------------------------------------------------------------------------------
      
      This patch aligns kimg_shadow_start and kimg_shadow_end to
      SWAPPER_BLOCK_SIZE in all configurations.
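
As a rough illustration of the alignment step, the sketch below computes a shadow range for an image and widens it to whole blocks. It assumes a scale shift of 3 (one shadow byte per 8 bytes of memory). The kimg_shadow() helper, the shadow_range struct, and the shadow_offset and block_size parameters are hypothetical names for this model, not the kernel's.

```c
#include <assert.h>
#include <stdint.h>

#define KASAN_SHADOW_SCALE_SHIFT 3   /* assumed: one shadow byte covers 8 bytes */

struct shadow_range {
	uint64_t start;
	uint64_t end;
};

/* Shadow range for [img_start, img_end), widened to whole blocks. */
static struct shadow_range kimg_shadow(uint64_t img_start, uint64_t img_end,
				       uint64_t shadow_offset, uint64_t block_size)
{
	struct shadow_range r;

	assert(block_size && (block_size & (block_size - 1)) == 0);

	r.start = (img_start >> KASAN_SHADOW_SCALE_SHIFT) + shadow_offset;
	r.end = (img_end >> KASAN_SHADOW_SCALE_SHIFT) + shadow_offset;

	r.start &= ~(block_size - 1);                          /* round down */
	r.end = (r.end + block_size - 1) & ~(block_size - 1);  /* round up */
	return r;
}
```

With both edges on block boundaries, a later pass that fills the remaining shadow in whole blocks starts exactly where the image shadow ends and does not overlap it.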
      
      Fixes: f9040773 ("arm64: move kernel image to base of vmalloc area")
      Signed-off-by: Catalin Marinas <catalin.marinas@arm.com>
      Acked-by: Mark Rutland <mark.rutland@arm.com>
      Acked-by: Ard Biesheuvel <ard.biesheuvel@linaro.org>
    • arm64: kasan: Use actual memory node when populating the kernel image shadow · 2f76969f
      Catalin Marinas authored
      With the 16KB or 64KB page configurations, the generic
      vmemmap_populate() implementation warns on potential offnode
      page_structs via vmemmap_verify() because the arm64 kasan_init() passes
      NUMA_NO_NODE instead of the actual node for the kernel image memory.
      
      Fixes: f9040773 ("arm64: move kernel image to base of vmalloc area")
      Signed-off-by: Catalin Marinas <catalin.marinas@arm.com>
      Reported-by: James Morse <james.morse@arm.com>
      Acked-by: Ard Biesheuvel <ard.biesheuvel@linaro.org>
      Acked-by: Mark Rutland <mark.rutland@arm.com>
  11. 09 Mar, 2016 1 commit
    • arm64: hugetlb: partial revert of 66b3923a · ff792584
      Will Deacon authored
      Commit 66b3923a ("arm64: hugetlb: add support for PTE contiguous bit")
      introduced support for huge pages using the contiguous bit in the PTE
      as opposed to block mappings, which may be slightly unwieldy (512M) in
      64k page configurations.
      
      Unfortunately, this support has resulted in some late regressions when
      running the libhugetlbfs test suite with 64k pages and CONFIG_DEBUG_VM
      as a result of a BUG:
      
       | readback (2M: 64):	------------[ cut here ]------------
       | kernel BUG at fs/hugetlbfs/inode.c:446!
       | Internal error: Oops - BUG: 0 [#1] SMP
       | Modules linked in:
       | CPU: 7 PID: 1448 Comm: readback Not tainted 4.5.0-rc7 #148
       | Hardware name: linux,dummy-virt (DT)
       | task: fffffe0040964b00 ti: fffffe00c2668000 task.ti: fffffe00c2668000
       | PC is at remove_inode_hugepages+0x44c/0x480
       | LR is at remove_inode_hugepages+0x264/0x480
      
      Rather than revert the entire patch, simply avoid advertising the
      contiguous huge page sizes for now while people are actively working on
      a fix. This patch can then be reverted once things have been sorted out.
      
      Cc: David Woods <dwoods@ezchip.com>
      Reported-by: Steve Capper <steve.capper@arm.com>
      Signed-off-by: Will Deacon <will.deacon@arm.com>
  12. 04 Mar, 2016 1 commit
    • arm64: make mrs_s prefixing implicit in read_cpuid · 1cc6ed90
      Mark Rutland authored
      Commit 0f54b14e ("arm64: cpufeature: Change read_cpuid() to use
      sysreg's mrs_s macro") changed read_cpuid to require a SYS_ prefix on
      register names, to allow manual assembly of registers unknown by the
      toolchain, using tables in sysreg.h.
      
      This interacts poorly with commit 42b55734 ("efi/arm64: Check
      for h/w support before booting a >4 KB granular kernel"), which is
      currently queued via the tip tree, and uses read_cpuid without a SYS_
      prefix. Due to this, a build of next-20160304 fails if EFI and 64K pages
      are selected.
      
      To avoid this issue when trees are merged, move the required SYS_
      prefixing into read_cpuid, and revert all of the updated callsites to
      pass plain register names. This effectively reverts the bulk of commit
      0f54b14e.
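
Moving the prefix into the macro is a matter of token pasting. The user-space sketch below shows the idea only: the encodings, the returned values and read_sysreg_model() are invented stand-ins, whereas the real read_cpuid assembles an mrs_s instruction from the tables in sysreg.h.

```c
#include <assert.h>
#include <stdint.h>

/* Hypothetical encodings standing in for the tables in sysreg.h. */
#define SYS_MIDR_EL1          0x1000u
#define SYS_ID_AA64MMFR0_EL1  0x2000u

/* Stand-in for the register read: returns a made-up value per encoding. */
static uint64_t read_sysreg_model(unsigned int encoding)
{
	assert(encoding == SYS_MIDR_EL1 || encoding == SYS_ID_AA64MMFR0_EL1);
	return encoding == SYS_MIDR_EL1 ? 0x410fd030u : 0x1122u;
}

/* Callers pass the plain register name; the macro pastes on the SYS_ prefix. */
#define read_cpuid(reg) read_sysreg_model(SYS_ ## reg)
```

With this shape, a caller written as read_cpuid(MIDR_EL1) builds whether or not it was authored before the SYS_ convention existed, which is what avoids the cross-tree build failure.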

      Signed-off-by: Mark Rutland <mark.rutland@arm.com>
      Cc: James Morse <james.morse@arm.com>
      Signed-off-by: Catalin Marinas <catalin.marinas@arm.com>
  13. 02 Mar, 2016 1 commit
  14. 29 Feb, 2016 2 commits
  15. 27 Feb, 2016 1 commit
    • mm: ASLR: use get_random_long() · 5ef11c35
      Daniel Cashman authored
      
      Replace calls to get_random_int() followed by a cast to (unsigned long)
      with calls to get_random_long().  Also address a shifting bug which, in
      the case of x86, removed the entropy mask for mmap_rnd_bits values > 31
      bits.
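
A small model shows why the width of both the random source and the mask matters. This is only a sketch with hypothetical helper names (rnd_mask(), offset_from_int(), offset_from_long()). It assumes the bug class is a mask built with a 32-bit shift, which the code below avoids by shifting a 64-bit constant.

```c
#include <assert.h>
#include <stdint.h>

/* Mask selecting the low 'bits' bits, computed in 64-bit arithmetic. */
static uint64_t rnd_mask(unsigned int bits)
{
	assert(bits > 0 && bits < 64);
	return (UINT64_C(1) << bits) - 1;
}

/* Offset built from a 32-bit random value widened to 64 bits. */
static uint64_t offset_from_int(uint32_t rnd, unsigned int bits)
{
	return (uint64_t)rnd & rnd_mask(bits);
}

/* Offset built from a full 64-bit random value. */
static uint64_t offset_from_long(uint64_t rnd, unsigned int bits)
{
	return rnd & rnd_mask(bits);
}
```

Widening a 32-bit value leaves the upper bits zero, so offset_from_int() can never produce more than 32 random bits however many are requested, while offset_from_long() can fill the whole mask.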

      Signed-off-by: Daniel Cashman <dcashman@android.com>
      Acked-by: Kees Cook <keescook@chromium.org>
      Cc: "Theodore Ts'o" <tytso@mit.edu>
      Cc: Arnd Bergmann <arnd@arndb.de>
      Cc: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
      Cc: Catalin Marinas <catalin.marinas@arm.com>
      Cc: Will Deacon <will.deacon@arm.com>
      Cc: Ralf Baechle <ralf@linux-mips.org>
      Cc: Benjamin Herrenschmidt <benh@kernel.crashing.org>
      Cc: Paul Mackerras <paulus@samba.org>
      Cc: Michael Ellerman <mpe@ellerman.id.au>
      Cc: David S. Miller <davem@davemloft.net>
      Cc: Thomas Gleixner <tglx@linutronix.de>
      Cc: Ingo Molnar <mingo@redhat.com>
      Cc: H. Peter Anvin <hpa@zytor.com>
      Cc: Al Viro <viro@zeniv.linux.org.uk>
      Cc: Nick Kralevich <nnk@google.com>
      Cc: Jeff Vander Stoep <jeffv@google.com>
      Cc: Mark Salyzyn <salyzyn@android.com>
      Signed-off-by: Andrew Morton <akpm@linux-foundation.org>
      Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>
  16. 26 Feb, 2016 5 commits
  17. 25 Feb, 2016 2 commits