1. 03 Jun, 2016 5 commits
    • Mark Rutland's avatar
      arm64: fix alignment when RANDOMIZE_TEXT_OFFSET is enabled · aed7eb83
      Mark Rutland authored
      With ARM64_64K_PAGES and RANDOMIZE_TEXT_OFFSET enabled, we hit the
      following issue on the boot:
      
      kernel BUG at arch/arm64/mm/mmu.c:480!
      Internal error: Oops - BUG: 0 [#1] PREEMPT SMP
      Modules linked in:
      CPU: 0 PID: 0 Comm: swapper Not tainted 4.6.0 #310
      Hardware name: ARM Juno development board (r2) (DT)
      task: ffff000008d58a80 ti: ffff000008d30000 task.ti: ffff000008d30000
      PC is at map_kernel_segment+0x44/0xb0
      LR is at paging_init+0x84/0x5b0
      pc : [<ffff000008c450b4>] lr : [<ffff000008c451a4>] pstate: 600002c5
      
      Call trace:
      [<ffff000008c450b4>] map_kernel_segment+0x44/0xb0
      [<ffff000008c451a4>] paging_init+0x84/0x5b0
      [<ffff000008c42728>] setup_arch+0x198/0x534
      [<ffff000008c40848>] start_kernel+0x70/0x388
      [<ffff000008c401bc>] __primary_switched+0x30/0x74
      
      Commit 7eb90f2f ("arm64: cover the .head.text section in the .text
      segment mapping") removed the alignment between the .head.text and .text
      sections, and used the _text rather than the _stext interval for mapping
      the .text segment.
      
      Prior to this commit _stext was always section aligned and didn't cause
      any issue even when RANDOMIZE_TEXT_OFFSET was enabled. Since that
      alignment has been removed and _text is used to map the .text segment,
      we need ensure _text is always page aligned when RANDOMIZE_TEXT_OFFSET
      is enabled.
      
      This patch adds logic to TEXT_OFFSET fuzzing to ensure that the offset
      is always aligned to the kernel page size. To ensure this, we rely on
      the PAGE_SHIFT being available via Kconfig.
      Signed-off-by: default avatarMark Rutland <mark.rutland@arm.com>
      Reported-by: default avatarSudeep Holla <sudeep.holla@arm.com>
      Cc: Ard Biesheuvel <ard.biesheuvel@linaro.org>
      Cc: Catalin Marinas <catalin.marinas@arm.com>
      Cc: Will Deacon <will.deacon@arm.com>
      Fixes: 7eb90f2f ("arm64: cover the .head.text section in the .text segment mapping")
      Signed-off-by: default avatarWill Deacon <will.deacon@arm.com>
      aed7eb83
    • Mark Rutland's avatar
      arm64: move {PAGE,CONT}_SHIFT into Kconfig · 030c4d24
      Mark Rutland authored
      In some cases (e.g. the awk for CONFIG_RANDOMIZE_TEXT_OFFSET) we would
      like to make use of PAGE_SHIFT outside of code that can include the
      usual header files.
      
      Add a new CONFIG_ARM64_PAGE_SHIFT for this, likewise with
      ARM64_CONT_SHIFT for consistency.
      Signed-off-by: default avatarMark Rutland <mark.rutland@arm.com>
      Cc: Ard Biesheuvel <ard.biesheuvel@linaro.org>
      Cc: Catalin Marinas <catalin.marinas@arm.com>
      Cc: Sudeep Holla <sudeep.holla@arm.com>
      Cc: Will Deacon <will.deacon@arm.com>
      Signed-off-by: default avatarWill Deacon <will.deacon@arm.com>
      030c4d24
    • Mark Rutland's avatar
      arm64: mm: dump: log span level · 48dd73c5
      Mark Rutland authored
      The page table dump code logs spans of entries at the same level
      (pgd/pud/pmd/pte) which have the same attributes. While we log the
      (decoded) attributes, we don't log the level, which leaves the output
      ambiguous and/or confusing in some cases.
      
      For example:
      
      0xffff800800000000-0xffff800980000000           6G       RW NX SHD AF        BLK UXN MEM/NORMAL
      
      If using 4K pages, this may describe a span of 6 1G block entries at the
      PGD/PUD level, or 3072 2M block entries at the PMD level.
      
      This patch adds the page table level to each output line, removing this
      ambiguity. For the example above, this will produce:
      
      0xffffffc800000000-0xffffffc980000000           6G PUD       RW NX SHD AF        BLK UXN MEM/NORMAL
      
      When 3 level tables are in use, and we use the asm-generic/nopud.h
      definitions, the dump code treats each entry in the PGD as a 1 element
      table at the PUD level, and logs spans as being PUDs, which can be
      confusing. To counteract this, the "PUD" mnemonic is replaced with "PGD"
      when CONFIG_PGTABLE_LEVELS <= 3. Likewise for "PMD" when
      CONFIG_PGTABLE_LEVELS <= 2.
      Signed-off-by: default avatarMark Rutland <mark.rutland@arm.com>
      Cc: Catalin Marinas <catalin.marinas@arm.com>
      Cc: Huang Shijie <shijie.huang@arm.com>
      Cc: Laura Abbott <labbott@fedoraproject.org>
      Cc: Steve Capper <steve.capper@arm.com>
      Cc: Will Deacon <will.deacon@arm.com>
      Signed-off-by: default avatarWill Deacon <will.deacon@arm.com>
      48dd73c5
    • Mark Rutland's avatar
      arm64: update stale PAGE_OFFSET comment · a13e3a5b
      Mark Rutland authored
      Commit ab893fb9 ("arm64: introduce KIMAGE_VADDR as the virtual
      base of the kernel region") logically split KIMAGE_VADDR from
      PAGE_OFFSET, and since commit f9040773 ("arm64: move kernel
      image to base of vmalloc area") the two have been distinct values.
      
      Unfortunately, neither commit updated the comment above these
      definitions, which now erroneously states that PAGE_OFFSET is the start
      of the kernel image rather than the start of the linear mapping.
      
      This patch fixes said comment, and introduces an explanation of
      KIMAGE_VADDR.
      Signed-off-by: default avatarMark Rutland <mark.rutland@arm.com>
      Cc: Will Deacon <will.deacon@arm.com>
      Cc: Catalin Marinas <catalin.marinas@arm.com>
      Cc: Marc Zyngier <marc.zyngier@arm.com>
      Signed-off-by: default avatarWill Deacon <will.deacon@arm.com>
      a13e3a5b
    • Mark Rutland's avatar
      arm64: report CPU number in bad_mode · 8051f4d1
      Mark Rutland authored
      If we take an exception we don't expect (e.g. SError), we report this in
      the bad_mode handler with pr_crit. Depending on the configured log
      level, we may or may not log additional information in functions called
      subsequently. Notably, the messages in dump_stack (including the CPU
      number) are printed with KERN_DEFAULT and may not appear.
      
      Some exceptions have an IMPLEMENTATION DEFINED ESR_ELx.ISS encoding, and
      knowing the CPU number is crucial to correctly decode them. To ensure
      that this is always possible, we should log the CPU number along with
      the ESR_ELx value, so we are not reliant on subsequent logs or
      additional printk configuration options.
      
      This patch logs the CPU number in bad_mode such that it is possible for
      a developer to decode these exceptions, provided access to sufficient
      documentation.
      Signed-off-by: default avatarMark Rutland <mark.rutland@arm.com>
      Reported-by: default avatarAl Grant <Al.Grant@arm.com>
      Cc: Catalin Marinas <catalin.marinas@arm.com>
      Cc: Dave Martin <dave.martin@arm.com>
      Cc: Robin Murphy <robin.murphy@arm.com>
      Cc: Will Deacon <will.deacon@arm.com>
      Signed-off-by: default avatarWill Deacon <will.deacon@arm.com>
      8051f4d1
  2. 01 Jun, 2016 1 commit
  3. 31 May, 2016 8 commits
  4. 23 May, 2016 1 commit
  5. 20 May, 2016 6 commits
    • Jiri Slaby's avatar
      exit_thread: remove empty bodies · 5f56a5df
      Jiri Slaby authored
      Define HAVE_EXIT_THREAD for archs which want to do something in
      exit_thread. For others, let's define exit_thread as an empty inline.
      
      This is a cleanup before we change the prototype of exit_thread to
      accept a task parameter.
      
      [akpm@linux-foundation.org: fix mips]
      Signed-off-by: default avatarJiri Slaby <jslaby@suse.cz>
      Cc: "David S. Miller" <davem@davemloft.net>
      Cc: "H. Peter Anvin" <hpa@zytor.com>
      Cc: "James E.J. Bottomley" <jejb@parisc-linux.org>
      Cc: Aurelien Jacquiot <a-jacquiot@ti.com>
      Cc: Benjamin Herrenschmidt <benh@kernel.crashing.org>
      Cc: Catalin Marinas <catalin.marinas@arm.com>
      Cc: Chen Liqin <liqin.linux@gmail.com>
      Cc: Chris Metcalf <cmetcalf@mellanox.com>
      Cc: Chris Zankel <chris@zankel.net>
      Cc: David Howells <dhowells@redhat.com>
      Cc: Fenghua Yu <fenghua.yu@intel.com>
      Cc: Geert Uytterhoeven <geert@linux-m68k.org>
      Cc: Guan Xuetao <gxt@mprc.pku.edu.cn>
      Cc: Haavard Skinnemoen <hskinnemoen@gmail.com>
      Cc: Hans-Christian Egtvedt <egtvedt@samfundet.no>
      Cc: Heiko Carstens <heiko.carstens@de.ibm.com>
      Cc: Helge Deller <deller@gmx.de>
      Cc: Ingo Molnar <mingo@redhat.com>
      Cc: Ivan Kokshaysky <ink@jurassic.park.msu.ru>
      Cc: James Hogan <james.hogan@imgtec.com>
      Cc: Jeff Dike <jdike@addtoit.com>
      Cc: Jesper Nilsson <jesper.nilsson@axis.com>
      Cc: Jiri Slaby <jslaby@suse.cz>
      Cc: Jonas Bonn <jonas@southpole.se>
      Cc: Koichi Yasutake <yasutake.koichi@jp.panasonic.com>
      Cc: Lennox Wu <lennox.wu@gmail.com>
      Cc: Ley Foon Tan <lftan@altera.com>
      Cc: Mark Salter <msalter@redhat.com>
      Cc: Martin Schwidefsky <schwidefsky@de.ibm.com>
      Cc: Matt Turner <mattst88@gmail.com>
      Cc: Max Filippov <jcmvbkbc@gmail.com>
      Cc: Michael Ellerman <mpe@ellerman.id.au>
      Cc: Michal Simek <monstr@monstr.eu>
      Cc: Mikael Starvik <starvik@axis.com>
      Cc: Paul Mackerras <paulus@samba.org>
      Cc: Peter Zijlstra <peterz@infradead.org>
      Cc: Ralf Baechle <ralf@linux-mips.org>
      Cc: Rich Felker <dalias@libc.org>
      Cc: Richard Henderson <rth@twiddle.net>
      Cc: Richard Kuo <rkuo@codeaurora.org>
      Cc: Richard Weinberger <richard@nod.at>
      Cc: Russell King <linux@arm.linux.org.uk>
      Cc: Steven Miao <realmz6@gmail.com>
      Cc: Thomas Gleixner <tglx@linutronix.de>
      Cc: Tony Luck <tony.luck@intel.com>
      Cc: Vineet Gupta <vgupta@synopsys.com>
      Cc: Will Deacon <will.deacon@arm.com>
      Cc: Yoshinori Sato <ysato@users.sourceforge.jp>
      Signed-off-by: default avatarAndrew Morton <akpm@linux-foundation.org>
      Signed-off-by: default avatarLinus Torvalds <torvalds@linux-foundation.org>
      5f56a5df
    • Christoffer Dall's avatar
      KVM: arm/arm64: vgic-new: Synchronize changes to active state · 35a2d585
      Christoffer Dall authored
      When modifying the active state of an interrupt via the MMIO interface,
      we should ensure that the write has the intended effect.
      
      If a guest sets an interrupt to active, but that interrupt is already
      flushed into a list register on a running VCPU, then that VCPU will
      write the active state back into the struct vgic_irq upon returning from
      the guest and syncing its state.  This is a non-benign race, because the
      guest can observe that an interrupt is not active, and it can have a
      reasonable expectations that other VCPUs will not ack any IRQs, and then
      set the state to active, and expect it to stay that way.  Currently we
      are not honoring this case.
      
      Thefore, change both the SACTIVE and CACTIVE mmio handlers to stop the
      world, change the irq state, potentially queue the irq if we're setting
      it to active, and then continue.
      
      We take this chance to slightly optimize these functions by not stopping
      the world when touching private interrupts where there is inherently no
      possible race.
      Signed-off-by: default avatarChristoffer Dall <christoffer.dall@linaro.org>
      35a2d585
    • Andre Przywara's avatar
      KVM: arm/arm64: vgic-new: enable build · efffe55a
      Andre Przywara authored
      Now that the new VGIC implementation has reached feature parity with
      the old one, add the new files to the build system and add a Kconfig
      option to switch between the two versions.
      We set the default to the new version to get maximum test coverage,
      in case people experience problems they can switch back to the old
      behaviour if needed.
      Signed-off-by: default avatarAndre Przywara <andre.przywara@arm.com>
      Acked-by: default avatarChristoffer Dall <christoffer.dall@linaro.org>
      efffe55a
    • Christoffer Dall's avatar
      KVM: arm/arm64: Provide functionality to pause and resume a guest · b13216cf
      Christoffer Dall authored
      For some rare corner cases in our VGIC emulation later we have to stop
      the guest to make sure the VGIC state is consistent.
      Provide the necessary framework to pause and resume a guest.
      Signed-off-by: default avatarChristoffer Dall <christoffer.dall@linaro.org>
      Signed-off-by: default avatarAndre Przywara <andre.przywara@arm.com>
      b13216cf
    • Christoffer Dall's avatar
      KVM: arm/arm64: Export mmio_read/write_bus · d5a5a0ef
      Christoffer Dall authored
      Rename mmio_{read,write}_bus to kvm_mmio_{read,write}_bus and export
      them out of mmio.c.
      This will be needed later for the new VGIC implementation.
      Signed-off-by: default avatarChristoffer Dall <christoffer.dall@linaro.org>
      Signed-off-by: default avatarAndre Przywara <andre.przywara@arm.com>
      Acked-by: default avatarMarc Zyngier <marc.zyngier@arm.com>
      Reviewed-by: default avatarAndre Przywara <andre.przywara@arm.com>
      d5a5a0ef
    • Matt Evans's avatar
      kvm: arm64: Fix EC field in inject_abt64 · e4fe9e7d
      Matt Evans authored
      The EC field of the constructed ESR is conditionally modified by ORing in
      ESR_ELx_EC_DABT_LOW for a data abort.  However, ESR_ELx_EC_SHIFT is missing
      from this condition.
      Signed-off-by: default avatarMatt Evans <matt.evans@arm.com>
      Acked-by: default avatarMarc Zyngier <marc.zyngier@arm.com>
      Cc: <stable@vger.kernel.org>
      Signed-off-by: default avatarChristoffer Dall <christoffer.dall@linaro.org>
      e4fe9e7d
  6. 19 May, 2016 2 commits
    • Hugh Dickins's avatar
      arch: fix has_transparent_hugepage() · fd8cfd30
      Hugh Dickins authored
      I've just discovered that the useful-sounding has_transparent_hugepage()
      is actually an architecture-dependent minefield: on some arches it only
      builds if CONFIG_TRANSPARENT_HUGEPAGE=y, on others it's also there when
      not, but on some of those (arm and arm64) it then gives the wrong
      answer; and on mips alone it's marked __init, which would crash if
      called later (but so far it has not been called later).
      
      Straighten this out: make it available to all configs, with a sensible
      default in asm-generic/pgtable.h, removing its definitions from those
      arches (arc, arm, arm64, sparc, tile) which are served by the default,
      adding #define has_transparent_hugepage has_transparent_hugepage to
      those (mips, powerpc, s390, x86) which need to override the default at
      runtime, and removing the __init from mips (but maybe that kind of code
      should be avoided after init: set a static variable the first time it's
      called).
      Signed-off-by: default avatarHugh Dickins <hughd@google.com>
      Cc: "Kirill A. Shutemov" <kirill.shutemov@linux.intel.com>
      Cc: Andrea Arcangeli <aarcange@redhat.com>
      Cc: Andres Lagar-Cavilla <andreslc@google.com>
      Cc: Yang Shi <yang.shi@linaro.org>
      Cc: Ning Qu <quning@gmail.com>
      Cc: Mel Gorman <mgorman@techsingularity.net>
      Cc: Konstantin Khlebnikov <koct9i@gmail.com>
      Acked-by: default avatarDavid S. Miller <davem@davemloft.net>
      Acked-by: Vineet Gupta <vgupta@synopsys.com>		[arch/arc]
      Acked-by: Gerald Schaefer <gerald.schaefer@de.ibm.com>	[arch/s390]
      Acked-by: default avatarIngo Molnar <mingo@kernel.org>
      Signed-off-by: default avatarAndrew Morton <akpm@linux-foundation.org>
      Signed-off-by: default avatarLinus Torvalds <torvalds@linux-foundation.org>
      fd8cfd30
    • Vaishali Thakkar's avatar
      arm64: mm: use hugetlb_bad_size() · d77e20ce
      Vaishali Thakkar authored
      Update setup_hugepagesz() to call hugetlb_bad_size() when unsupported
      hugepage size is found.
      Signed-off-by: default avatarVaishali Thakkar <vaishali.thakkar@oracle.com>
      Reviewed-by: default avatarMike Kravetz <mike.kravetz@oracle.com>
      Reviewed-by: default avatarNaoya Horiguchi <n-horiguchi@ah.jp.nec.com>
      Acked-by: default avatarMichal Hocko <mhocko@suse.com>
      Cc: Hillf Danton <hillf.zj@alibaba-inc.com>
      Cc: Yaowei Bai <baiyaowei@cmss.chinamobile.com>
      Cc: Dominik Dingel <dingel@linux.vnet.ibm.com>
      Cc: Kirill A. Shutemov <kirill.shutemov@linux.intel.com>
      Cc: Paul Gortmaker <paul.gortmaker@windriver.com>
      Cc: Dave Hansen <dave.hansen@linux.intel.com>
      Signed-off-by: default avatarAndrew Morton <akpm@linux-foundation.org>
      Signed-off-by: default avatarLinus Torvalds <torvalds@linux-foundation.org>
      d77e20ce
  7. 17 May, 2016 1 commit
  8. 16 May, 2016 6 commits
  9. 14 May, 2016 2 commits
  10. 13 May, 2016 1 commit
    • Christian Borntraeger's avatar
      KVM: halt_polling: provide a way to qualify wakeups during poll · 3491caf2
      Christian Borntraeger authored
      Some wakeups should not be considered a sucessful poll. For example on
      s390 I/O interrupts are usually floating, which means that _ALL_ CPUs
      would be considered runnable - letting all vCPUs poll all the time for
      transactional like workload, even if one vCPU would be enough.
      This can result in huge CPU usage for large guests.
      This patch lets architectures provide a way to qualify wakeups if they
      should be considered a good/bad wakeups in regard to polls.
      
      For s390 the implementation will fence of halt polling for anything but
      known good, single vCPU events. The s390 implementation for floating
      interrupts does a wakeup for one vCPU, but the interrupt will be delivered
      by whatever CPU checks first for a pending interrupt. We prefer the
      woken up CPU by marking the poll of this CPU as "good" poll.
      This code will also mark several other wakeup reasons like IPI or
      expired timers as "good". This will of course also mark some events as
      not sucessful. As  KVM on z runs always as a 2nd level hypervisor,
      we prefer to not poll, unless we are really sure, though.
      
      This patch successfully limits the CPU usage for cases like uperf 1byte
      transactional ping pong workload or wakeup heavy workload like OLTP
      while still providing a proper speedup.
      
      This also introduced a new vcpu stat "halt_poll_no_tuning" that marks
      wakeups that are considered not good for polling.
      Signed-off-by: default avatarChristian Borntraeger <borntraeger@de.ibm.com>
      Acked-by: Radim Krčmář <rkrcmar@redhat.com> (for an earlier version)
      Cc: David Matlack <dmatlack@google.com>
      Cc: Wanpeng Li <kernellwp@gmail.com>
      [Rename config symbol. - Paolo]
      Signed-off-by: default avatarPaolo Bonzini <pbonzini@redhat.com>
      3491caf2
  11. 12 May, 2016 1 commit
  12. 11 May, 2016 3 commits
    • Kees Cook's avatar
      arm64: kernel: Fix incorrect brk randomization · 61462c8a
      Kees Cook authored
      This fixes two issues with the arm64 brk randomziation. First, the
      STACK_RND_MASK was being used incorrectly. The original code was:
      
      	unsigned long range_end = base + (STACK_RND_MASK << PAGE_SHIFT) + 1;
      
      STACK_RND_MASK is 0x7ff (32-bit) or 0x3ffff (64-bit), with 4K pages where
      PAGE_SHIFT is 12:
      
      	#define STACK_RND_MASK	(test_thread_flag(TIF_32BIT) ? \
      						0x7ff >> (PAGE_SHIFT - 12) : \
      						0x3ffff >> (PAGE_SHIFT - 12))
      
      This means the resulting offset from base would be 0x7ff0001 or 0x3ffff0001,
      which is wrong since it creates an unaligned end address. It was likely
      intended to be:
      
      	unsigned long range_end = base + ((STACK_RND_MASK + 1) << PAGE_SHIFT)
      
      Which would result in offsets of 0x800000 (32-bit) and 0x40000000 (64-bit).
      
      However, even this corrected 32-bit compat offset (0x00800000) is much
      smaller than native ARM's brk randomization value (0x02000000):
      
      	unsigned long arch_randomize_brk(struct mm_struct *mm)
      	{
      	        unsigned long range_end = mm->brk + 0x02000000;
      	        return randomize_range(mm->brk, range_end, 0) ? : mm->brk;
      	}
      
      So, instead of basing arm64's brk randomization on mistaken STACK_RND_MASK
      calculations, just use specific corrected values for compat (0x2000000)
      and native arm64 (0x40000000).
      Reviewed-by: default avatarJon Medhurst <tixy@linaro.org>
      Signed-off-by: default avatarKees Cook <keescook@chromium.org>
      [will: use is_compat_task() as suggested by tixy]
      Signed-off-by: default avatarWill Deacon <will.deacon@arm.com>
      61462c8a
    • Julien Grall's avatar
      arm64: cpuinfo: Missing NULL terminator in compat_hwcap_str · f228b494
      Julien Grall authored
      The loop that browses the array compat_hwcap_str will stop when a NULL
      is encountered, however NULL is missing at the end of array. This will
      lead to overrun until a NULL is found somewhere in the following memory.
      In reality, this works out because the compat_hwcap2_str array tends to
      follow immediately in memory, and that *is* terminated correctly.
      Furthermore, the unsigned int compat_elf_hwcap is checked before
      printing each capability, so we end up doing the right thing because
      the size of the two arrays is less than 32. Still, this is an obvious
      mistake and should be fixed.
      
      Note for backporting: commit 12d11817 ("arm64: Move
      /proc/cpuinfo handling code") moved this code in v4.4. Prior to that
      commit, the same change should be made in arch/arm64/kernel/setup.c.
      
      Fixes: 44b82b77 "arm64: Fix up /proc/cpuinfo"
      Cc: <stable@vger.kernel.org> # v3.19+ (but see note above prior to v4.4)
      Signed-off-by: default avatarJulien Grall <julien.grall@arm.com>
      Signed-off-by: default avatarWill Deacon <will.deacon@arm.com>
      f228b494
    • Suzuki K Poulose's avatar
      arm64: secondary_start_kernel: Remove unnecessary barrier · 99aa0362
      Suzuki K Poulose authored
      Remove the unnecessary smp_wmb(), which was added to make sure
      that the update_cpu_boot_status() completes before we mark the
      CPU online. But update_cpu_boot_status() already has dsb() (required
      for the failing CPUs) to ensure the correct behavior.
      
      Cc: Catalin Marinas <catalin.marinas@arm.com>
      Acked-by: default avatarMark Rutland <mark.rutland@arm.com>
      Reported-by: default avatarDennis Chen <dennis.chen@arm.com>
      Signed-off-by: default avatarSuzuki K Poulose <suzuki.poulose@arm.com>
      Signed-off-by: default avatarWill Deacon <will.deacon@arm.com>
      99aa0362
  13. 10 May, 2016 3 commits