1. 21 Nov, 2005 1 commit
  2. 17 Nov, 2005 2 commits
    • [IA64] polish comments for tlb fault handler in ivt.S · e8aabc47
      Chen, Kenneth W authored
      
      
      Polish the comments in the vhpt_miss and nested_dtlb_miss handlers
      specifically.  I think it's better to refer to each page table
      level by name rather than by number, i.e., use pgd, pud, pmd, and
      pte instead of referring to them as L1, L2, L3, etc.  Along the
      way, remove some magic numbers in the comments, like:
      "PTA + (((IFA(61,63) << 7) | IFA(33,39))*8)".  No code change at
      all, pure comment update.  Feel free to shoot anything you have,
      darts or tomahawk cruise missile.  I will duck behind a bunker ;-)
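
      To show why such an expression reads as a magic number, here is a
      minimal C sketch that only decodes the arithmetic of the quoted
      comment.  It assumes IFA(a,b) means bits a through b (inclusive) of
      the faulting address; the helper names bits() and magic_entry_addr()
      are hypothetical and do not appear in ivt.S.

```c
#include <stdint.h>

/* Extract bits lo..hi (inclusive) of v; assumes the field is narrower than 64 bits. */
static uint64_t bits(uint64_t v, unsigned lo, unsigned hi)
{
    uint64_t mask = (1ULL << (hi - lo + 1)) - 1;
    return (v >> lo) & mask;
}

/*
 * Literal decoding of the quoted comment:
 *   PTA + (((IFA(61,63) << 7) | IFA(33,39)) * 8)
 * Assumption: IFA(a,b) is an inclusive bit range of the faulting address.
 */
static uint64_t magic_entry_addr(uint64_t pta, uint64_t ifa)
{
    uint64_t high = bits(ifa, 61, 63);  /* 3-bit field */
    uint64_t low  = bits(ifa, 33, 39);  /* 7-bit field */
    return pta + (((high << 7) | low) * 8);
}
```

      The bit positions carry no meaning of their own, which is the kind
      of comment the patch replaces with named levels.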
      Signed-off-by: Ken Chen <kenneth.w.chen@intel.com>
      Acked-by: Robin Holt <holt@sgi.com>
      Signed-off-by: Tony Luck <tony.luck@intel.com>
    • [IA64] 4 level page table bug fix in vhpt_miss · fedb25fa
      Chen, Kenneth W authored
      
      
      From source code inspection, I think there is a bug in the
      vhpt_miss handler with 4 level page tables.  In the code path that
      rechecks the page table entry against the previously read value
      after tlb insertion, the *pte value in register r18 was overwritten
      with the value newly read from the pud pointer, rendering the check
      of the new *pte against the previous *pte completely wrong.  The
      bug is not fatal, though: the penalty is to purge the entry and
      retry.  For functional correctness, it should still be fixed.  The
      fix is to use a different register so the new *pud doesn't trash
      *pte.  (btw, the comment in the cmp statement is wrong as well,
      which I will address in the next patch).
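
      The clobbering can be illustrated with a toy C model.  This is a
      sketch under stated assumptions, not the ivt.S assembly: the struct
      walk type, both function names, and the register name r19 are
      hypothetical; only r18 comes from the description above.  A return
      value of true means the recheck passed and no purge is needed.

```c
#include <stdbool.h>
#include <stdint.h>

/* Values read through the pud and pte pointers at one point in time. */
struct walk {
    uint64_t pud;
    uint64_t pte;
};

/* Buggy variant: the re-read *pud reuses the variable holding the old *pte. */
static bool recheck_buggy(const struct walk *before, const struct walk *now)
{
    uint64_t r18 = before->pte;  /* *pte saved before the TLB insert */

    r18 = now->pud;              /* BUG: new *pud overwrites the saved *pte */
    if (r18 != before->pud)
        return false;
    return now->pte == r18;      /* compares the new *pte against *pud */
}

/* Fixed variant: the re-read *pud goes to a different (hypothetical) register. */
static bool recheck_fixed(const struct walk *before, const struct walk *now)
{
    uint64_t r18 = before->pte;  /* saved *pte stays intact */
    uint64_t r19 = now->pud;

    if (r19 != before->pud)
        return false;
    return now->pte == r18;      /* compares the new *pte against the old *pte */
}
```

      With unchanged page tables the buggy variant still reports a
      mismatch whenever *pud and *pte differ, which matches the described
      penalty of a needless purge and retry.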
      Signed-off-by: Ken Chen <kenneth.w.chen@intel.com>
      Signed-off-by: Tony Luck <tony.luck@intel.com>
  3. 15 Nov, 2005 1 commit
    • [PATCH] ia64: cpu_idle performance bug fix · 1e185b97
      Chen, Kenneth W authored
      Our performance validation on 2.6.15-rc1 caught a disastrous performance
      regression on ia64 with netperf (-98%) and volanomark (-58%) compared to
      the previous kernel version, 2.6.14-git7.  See the following chart
      (result groups 1 & 2).
      
        http://kernel-perf.sourceforge.net/results.machine_id=26.html
      
      We have root caused it to commit 64c7c8f8
      
      
      
      This changeset broke the ia64 task resched notification.  In
      sched.c:resched_task(), a reschedule IPI is conditioned upon
      TIF_POLLING_NRFLAG.  However, the above changeset unconditionally sets
      the polling thread flag for idle tasks regardless of whether
      pal_halt_light is in use or not.  As a result, the resched IPI is not
      sent from resched_task().  And since the default behavior on ia64 is to
      use pal_halt_light, we end up delaying the reschedule of the task until
      the next timer tick, thus causing the performance regression.
      
      This fixes the performance bug.  I'm glad our performance suite is
      turning up bad performance bugs like this in time.
      Signed-off-by: Ken Chen <kenneth.w.chen@intel.com>
      Signed-off-by: Linus Torvalds <torvalds@osdl.org>
  4. 14 Nov, 2005 1 commit
  5. 11 Nov, 2005 2 commits
    • [IA64-SGI] set altix preferred console · ff51224c
      Mark Maule authored
      
      
      Fix the default VGA console on SN platforms.  Since SN firmware does not pass
      enough ACPI information to identify VGA cards and the associated legacy IO/MEM
      addresses, we rely on the EFI PCDP table.  Since the linux pcdp driver is
      optional (and overridden if console= directives are used), SN duplicates a
      portion of the pcdp scan code to identify whether there is a usable console VGA
      adapter.  Additionally, duplicate the necessary pcdp-related structs to avoid
      dragging drivers/pcdp.h into a more public location.
      Signed-off-by: Mark Maule <maule@sgi.com>
      Signed-off-by: Tony Luck <tony.luck@intel.com>
    • [IA64] 4-level page tables · 837cd0bd
      Robin Holt authored
      
      
      This patch introduces 4-level page tables to ia64.  I have run
      some benchmarks and found nothing interesting.  Performance has
      consistently fallen within the noise range.
      
      It also introduces a config option (setting the default to 3
      levels).  The config option prevents having 4 level page
      tables with 64k base page size.
      Signed-off-by: Robin Holt <holt@sgi.com>
      Signed-off-by: Tony Luck <tony.luck@intel.com>
  6. 10 Nov, 2005 2 commits
  7. 09 Nov, 2005 3 commits
    • [PATCH] sched: resched and cpu_idle rework · 64c7c8f8
      Nick Piggin authored
      
      
      Make some changes to the NEED_RESCHED and POLLING_NRFLAG to reduce
      confusion, and make their semantics rigid.  Improves efficiency of
      resched_task and some cpu_idle routines.
      
      * In resched_task:
      - TIF_NEED_RESCHED is only cleared with the task's runqueue lock held,
        and as we hold it during resched_task, then there is no need for an
        atomic test and set there. The only other time this should be set is
        when the task's quantum expires, in the timer interrupt - this is
        protected against because the rq lock is irq-safe.
      
      - If TIF_NEED_RESCHED is set, then we don't need to do anything. It
        won't get unset until the task gets schedule()d off.
      
      - If we are running on the same CPU as the task we resched, then set
        TIF_NEED_RESCHED and no further action is required.
      
      - If we are running on another CPU, and TIF_POLLING_NRFLAG is *not* set
        after TIF_NEED_RESCHED has been set, then we need to send an IPI.
      
      Using these rules, we are able to remove the test and set operation in
      resched_task, and make clear the previously vague semantics of
      POLLING_NRFLAG.
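
      The resched_task rules above can be sketched as a small decision
      function.  This is a simplified model under stated assumptions, not
      the code in sched.c: struct task_model, enum resched_action and
      resched_task_model() are hypothetical names, the two booleans stand
      in for TIF_NEED_RESCHED and TIF_POLLING_NRFLAG, and the runqueue
      lock is assumed to be held by the caller.

```c
#include <stdbool.h>

enum resched_action {
    RESCHED_NOTHING,       /* flag was already set */
    RESCHED_FLAG_ONLY,     /* flag set, no IPI needed */
    RESCHED_FLAG_AND_IPI   /* flag set and an IPI must be sent */
};

struct task_model {
    bool need_resched;  /* stands in for TIF_NEED_RESCHED */
    bool polling;       /* stands in for TIF_POLLING_NRFLAG */
    int cpu;            /* CPU the task is running on */
};

/* Caller is assumed to hold the task's runqueue lock. */
static enum resched_action resched_task_model(struct task_model *t, int this_cpu)
{
    if (t->need_resched)
        return RESCHED_NOTHING;       /* already marked */

    t->need_resched = true;           /* plain store: the lock is held */

    if (t->cpu == this_cpu)
        return RESCHED_FLAG_ONLY;     /* same CPU: no further action */

    if (!t->polling)
        return RESCHED_FLAG_AND_IPI;  /* remote CPU is not polling */

    return RESCHED_FLAG_ONLY;         /* remote CPU polls and will see the flag */
}
```

      In this model, a remote task whose polling flag is set never gets an
      IPI, so the flag must only be set while the CPU really is polling.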
      
      * In idle routines:
      - Enter cpu_idle with preempt disabled. When the need_resched() condition
        becomes true, explicitly call schedule(). This makes things a bit clearer
        (IMO), but haven't updated all architectures yet.
      
      - Many do a test and clear of TIF_NEED_RESCHED for some reason. According
        to the resched_task rules, this isn't needed (and actually breaks the
        assumption that TIF_NEED_RESCHED is only cleared with the runqueue lock
        held). So remove that. Generally one less locked memory op when switching
        to the idle thread.
      
      - Many idle routines clear TIF_POLLING_NRFLAG, and only set it in the
        innermost polling idle loops. The above resched_task semantics allow it
        to remain set until just before the final need_resched() check that
        precedes a halt requiring interrupt wakeup.
      
        Many idle routines simply never enter such a halt, and so POLLING_NRFLAG
        can be always left set, completely eliminating resched IPIs when rescheduling
        the idle task.
      
        POLLING_NRFLAG width can be increased, to reduce the chance of resched IPIs.
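
      The ordering described for idle routines that do halt can be
      sketched with C11 atomics.  This is an illustrative model, not any
      architecture's cpu_idle: struct idle_model and idle_may_halt() are
      hypothetical names, and the default sequentially consistent atomics
      stand in for whatever barriers the real code uses.

```c
#include <stdatomic.h>
#include <stdbool.h>

struct idle_model {
    atomic_bool need_resched;  /* stands in for TIF_NEED_RESCHED */
    atomic_bool polling;       /* stands in for TIF_POLLING_NRFLAG */
};

/*
 * Returns true if the CPU may enter a halt that needs an interrupt to wake.
 * The polling flag is cleared first, then need_resched is checked one last
 * time; a waker that saw the flag still set has already stored need_resched.
 */
static bool idle_may_halt(struct idle_model *m)
{
    atomic_store(&m->polling, false);      /* from here on, wakers must send an IPI */

    if (atomic_load(&m->need_resched)) {
        atomic_store(&m->polling, true);   /* back to polling; caller should schedule */
        return false;
    }
    return true;
}
```

      A routine that never enters such a halt would skip this step and
      leave the polling flag set, as the message notes.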
      Signed-off-by: Nick Piggin <npiggin@suse.de>
      Cc: Ingo Molnar <mingo@elte.hu>
      Cc: Con Kolivas <kernel@kolivas.org>
      Signed-off-by: Andrew Morton <akpm@osdl.org>
      Signed-off-by: Linus Torvalds <torvalds@osdl.org>
    • [PATCH] sched: disable preempt in idle tasks · 5bfb5d69
      Nick Piggin authored
      
      
      Run idle threads with preempt disabled.
      
      Also corrected a bug in arm26's cpu_idle (make it actually call schedule()).
      How did it ever work before?
      
      Might fix the CPU hotplugging hang which Nigel Cunningham noted.
      
      We think the bug hits if the idle thread is preempted after checking
      need_resched() and before going to sleep, and the CPU is then offlined.
      
      After calling stop_machine_run, the CPU eventually returns from preemption and
      into the idle thread and goes to sleep.  The CPU will continue executing
      previous idle and have no chance to call play_dead.
      
      By disabling preemption until we are ready to explicitly schedule, this bug is
      fixed and the idle threads generally become more robust.
      
      From: alexs <ashepard@u.washington.edu>
      
        PPC build fix
      
      From: Yoichi Yuasa <yuasa@hh.iij4u.or.jp>
      
        MIPS build fix
      Signed-off-by: Nick Piggin <npiggin@suse.de>
      Signed-off-by: Yoichi Yuasa <yuasa@hh.iij4u.or.jp>
      Signed-off-by: Andrew Morton <akpm@osdl.org>
      Signed-off-by: Linus Torvalds <torvalds@osdl.org>
    • [PATCH] remove ioctl32_handler_t · 7e4c54a2
      Christoph Hellwig authored
      
      
      Some architectures define and use this type in their compat_ioctl code, but
      all of them can easily use the identical ioctl_trans_handler_t type that is
      defined in common code.
      Signed-off-by: Christoph Hellwig <hch@lst.de>
      Cc: Andi Kleen <ak@muc.de>
      Signed-off-by: Andrew Morton <akpm@osdl.org>
      Signed-off-by: Linus Torvalds <torvalds@osdl.org>
  8. 08 Nov, 2005 7 commits
  9. 07 Nov, 2005 9 commits
  10. 03 Nov, 2005 2 commits
  11. 31 Oct, 2005 1 commit
  12. 30 Oct, 2005 3 commits
  13. 29 Oct, 2005 5 commits
    • [PATCH] memory hotplug locking: node_size_lock · 208d54e5
      Dave Hansen authored
      
      
      pgdat->node_size_lock is basically only needed in one place in the normal
      code: show_mem(), which is the arch-specific sysrq-m printing function.
      
      Strictly speaking, the architectures not doing memory hotplug do not need this
      locking in show_mem().  However, they are all included for completeness.  This
      should also make any future consolidation of all of the implementations a
      little more straightforward.
      
      This lock is also held in the sparsemem code during a memory removal, as
      sections are invalidated.  This is the place where pfn_valid() is made false
      for a memory area that's being removed.  The lock is only required when doing
      pfn_valid() operations on memory for which the caller does not already hold a
      reference on the page, such as in show_mem().
      Signed-off-by: Dave Hansen <haveblue@us.ibm.com>
      Signed-off-by: Andrew Morton <akpm@osdl.org>
      Signed-off-by: Linus Torvalds <torvalds@osdl.org>
    • [PATCH] mm: flush_tlb_range outside ptlock · 663b97f7
      Hugh Dickins authored
      
      
      There was one small but very significant change in the previous patch:
      mprotect's flush_tlb_range fell outside the page_table_lock: as it is in 2.4,
      but that doesn't prove it safe in 2.6.
      
      On some architectures flush_tlb_range comes to the same as flush_tlb_mm, which
      has always been called from outside page_table_lock in dup_mmap, and is so
      proved safe.  Others required a deeper audit: I could find no reliance on
      page_table_lock in any; but in ia64 and parisc found some code which looks a
      bit as if it might want preemption disabled.  That won't do any actual harm,
      so pending a decision from the maintainers, disable preemption there.
      
      Remove comments on page_table_lock from flush_tlb_mm, flush_tlb_range and
      flush_tlb_page entries in cachetlb.txt: they were rather misleading (what
      generic code does is different from what usually happens), the rules are now
      changing, and it's not yet clear where we'll end up (will the generic
      tlb_flush_mmu happen always under lock?  never under lock?  or sometimes under
      and sometimes not?).
      Signed-off-by: Hugh Dickins <hugh@veritas.com>
      Signed-off-by: Andrew Morton <akpm@osdl.org>
      Signed-off-by: Linus Torvalds <torvalds@osdl.org>
    • [PATCH] mm: init_mm without ptlock · 872fec16
      Hugh Dickins authored
      
      
      First step in pushing down the page_table_lock.  init_mm.page_table_lock has
      been used throughout the architectures (usually for ioremap): not to serialize
      kernel address space allocation (that's usually vmlist_lock), but because
      pud_alloc, pmd_alloc and pte_alloc_kernel expect the caller to hold it.
      
      Reverse that: don't lock or unlock init_mm.page_table_lock in any of the
      architectures; instead rely on pud_alloc,pmd_alloc,pte_alloc_kernel to take
      and drop it when allocating a new one, to check lest a racing task already
      did.  Similarly no page_table_lock in vmalloc's map_vm_area.
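
      The "take and drop it when allocating a new one, to check lest a
      racing task already did" pattern can be sketched in userspace C.
      This is a minimal illustration under stated assumptions, not the
      kernel's __pud_alloc or __pmd_alloc: struct upper_entry, table_lock
      and alloc_lower_table() are hypothetical, a pthread mutex stands in
      for page_table_lock, and allocating before taking the lock is a
      choice made for this sketch.

```c
#include <pthread.h>
#include <stdlib.h>

/* One slot in an upper-level table, pointing at a lower-level table. */
struct upper_entry {
    void *table;
};

/* Stands in for page_table_lock in this model. */
static pthread_mutex_t table_lock = PTHREAD_MUTEX_INITIALIZER;

/* The caller does not hold the lock; it is taken only to install the table. */
static void *alloc_lower_table(struct upper_entry *e, size_t size)
{
    void *new_table;

    if (e->table)
        return e->table;          /* already populated, nothing to allocate */

    new_table = calloc(1, size);  /* allocate without holding the lock */
    if (!new_table)
        return NULL;

    pthread_mutex_lock(&table_lock);
    if (e->table)
        free(new_table);          /* a racing task already installed one */
    else
        e->table = new_table;
    pthread_mutex_unlock(&table_lock);

    return e->table;
}
```

      Because the lock is taken inside the helper, a caller that also
      held it would deadlock in this model; the kernel patch guards
      against a similar mismatch by changing the pte_alloc_kernel
      signature so that mixed sources fail to build.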
      
      Some temporary ugliness in __pud_alloc and __pmd_alloc: since they also handle
      user mms, which are converted only by a later patch, for now they have to lock
      differently according to whether or not it's init_mm.
      
      If sources get muddled, there's a danger that an arch source taking
      init_mm.page_table_lock will be mixed with common source also taking it (or
      neither take it).  So break the rules and make another change, which should
      break the build for such a mismatch: remove the redundant mm arg from
      pte_alloc_kernel (ppc64 scrapped its distinct ioremap_mm in 2.6.13).
      
      Exceptions: arm26 used pte_alloc_kernel on user mm, now pte_alloc_map; ia64
      used pte_alloc_map on init_mm, now pte_alloc_kernel; parisc had bad args to
      pmd_alloc and pte_alloc_kernel in unused USE_HPPA_IOREMAP code; ppc64
      map_io_page forgot to unlock on failure; ppc mmu_mapin_ram and ppc64 im_free
      took page_table_lock for no good reason.
      Signed-off-by: Hugh Dickins <hugh@veritas.com>
      Signed-off-by: Andrew Morton <akpm@osdl.org>
      Signed-off-by: Linus Torvalds <torvalds@osdl.org>
    • [PATCH] mm: ia64 use expand_upwards · 46dea3d0
      Hugh Dickins authored
      
      
      ia64 has an expand_backing_store function for growing its Register Backing Store
      vma upwards.  But more complete code for this purpose is found in the
      CONFIG_STACK_GROWSUP part of mm/mmap.c.  Uglify its #ifdefs further to provide
      expand_upwards for ia64 as well as expand_stack for parisc.
      
      The Register Backing Store vma should be marked VM_ACCOUNT.  Implement the
      intention of growing it only a page at a time, instead of passing an address
      outside of the vma to handle_mm_fault, with unknown consequences.
      Signed-off-by: Hugh Dickins <hugh@veritas.com>
      Signed-off-by: Andrew Morton <akpm@osdl.org>
      Signed-off-by: Linus Torvalds <torvalds@osdl.org>
    • [PATCH] mm: vm_stat_account unshackled · ab50b8ed
      Hugh Dickins authored
      
      
      The original vm_stat_account has fallen into disuse, with only one user, and
      only one user of vm_stat_unaccount.  It's easier to keep track if we convert
      them all to __vm_stat_account, then free it from its __shackles.
      Signed-off-by: Hugh Dickins <hugh@veritas.com>
      Signed-off-by: Andrew Morton <akpm@osdl.org>
      Signed-off-by: Linus Torvalds <torvalds@osdl.org>
  14. 28 Oct, 2005 1 commit