- Jul 24, 2009
Ming Lei authored
Add BFS statistics to the existing lockdep stats.

Signed-off-by: Ming Lei <tom.leiming@gmail.com>
Signed-off-by: Peter Zijlstra <a.p.zijlstra@chello.nl>
LKML-Reference: <1246201486-7308-10-git-send-email-tom.leiming@gmail.com>
Signed-off-by: Ingo Molnar <mingo@elte.hu>

Ming Lei authored
Also account the BFS memory usage.

Signed-off-by: Ming Lei <tom.leiming@gmail.com>
[ fix build for !PROVE_LOCKING ]
Signed-off-by: Peter Zijlstra <a.p.zijlstra@chello.nl>
LKML-Reference: <1246201486-7308-9-git-send-email-tom.leiming@gmail.com>
Signed-off-by: Ingo Molnar <mingo@elte.hu>

Ming Lei authored
Implement lockdep_count_{for,back}ward using BFS.

Signed-off-by: Ming Lei <tom.leiming@gmail.com>
Signed-off-by: Peter Zijlstra <a.p.zijlstra@chello.nl>
LKML-Reference: <1246201486-7308-8-git-send-email-tom.leiming@gmail.com>
Signed-off-by: Ingo Molnar <mingo@elte.hu>

Ming Lei authored
Since the shortest path through the lock dependency graph can now be obtained by BFS, print that shortest path via print_shortest_lock_dependencies().

Signed-off-by: Ming Lei <tom.leiming@gmail.com>
Signed-off-by: Peter Zijlstra <a.p.zijlstra@chello.nl>
LKML-Reference: <1246201486-7308-7-git-send-email-tom.leiming@gmail.com>
Signed-off-by: Ingo Molnar <mingo@elte.hu>

Ming Lei authored
This patch uses BFS to implement find_usage_*wards(), which were originally written using DFS.

Signed-off-by: Ming Lei <tom.leiming@gmail.com>
Signed-off-by: Peter Zijlstra <a.p.zijlstra@chello.nl>
LKML-Reference: <1246201486-7308-6-git-send-email-tom.leiming@gmail.com>
Signed-off-by: Ingo Molnar <mingo@elte.hu>

Ming Lei authored
This patch uses BFS to implement check_noncircular() and prints the shortest circle found, if one exists.

Signed-off-by: Ming Lei <tom.leiming@gmail.com>
Signed-off-by: Peter Zijlstra <a.p.zijlstra@chello.nl>
LKML-Reference: <1246201486-7308-5-git-send-email-tom.leiming@gmail.com>
Signed-off-by: Ingo Molnar <mingo@elte.hu>

Ming Lei authored
1) Introduce match() to BFS in order to make it usable for matching different patterns.
2) Rename some functions to make their names more suitable.

Signed-off-by: Ming Lei <tom.leiming@gmail.com>
Signed-off-by: Peter Zijlstra <a.p.zijlstra@chello.nl>
LKML-Reference: <1246201486-7308-4-git-send-email-tom.leiming@gmail.com>
Signed-off-by: Ingo Molnar <mingo@elte.hu>

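As a rough illustration of the match() idea, here is a generic predicate-driven BFS; the node layout, queue bound, and all names are invented for this sketch and are not lockdep's actual types.

```c
#include <stdbool.h>
#include <stddef.h>

/* Hypothetical graph node; lockdep's real structures differ. */
struct node {
    struct node **neighbors;   /* NULL-terminated adjacency list */
    bool visited;
};

/* Caller-supplied predicate: one BFS core, many search patterns. */
typedef bool (*match_fn)(struct node *entry, void *data);

static struct node *bfs_search(struct node *root, match_fn match, void *data)
{
    struct node *queue[64];    /* bounded queue, ample for the sketch */
    size_t head = 0, tail = 0;

    queue[tail++] = root;
    root->visited = true;

    while (head != tail) {
        struct node *cur = queue[head++];

        if (match(cur, data))
            return cur;        /* first hit is a nearest match */

        for (struct node **n = cur->neighbors; *n; n++) {
            if (!(*n)->visited) {
                (*n)->visited = true;
                queue[tail++] = *n;
            }
        }
    }
    return NULL;               /* nothing reachable satisfies match() */
}
```
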
Ming Lei authored
1) Replace %MAX_CIRCULAR_QUE_SIZE with &(MAX_CIRCULAR_QUE_SIZE-1), since MAX_CIRCULAR_QUE_SIZE is defined as a power of 2.
2) Use a bitmap to mark whether a lock has been visited during BFS, so it can be cleared quickly; we may search the graph many times.

Signed-off-by: Ming Lei <tom.leiming@gmail.com>
Signed-off-by: Peter Zijlstra <a.p.zijlstra@chello.nl>
LKML-Reference: <1246201486-7308-3-git-send-email-tom.leiming@gmail.com>
Signed-off-by: Ingo Molnar <mingo@elte.hu>

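A small standalone sketch of both tricks, with hypothetical names and sizes: AND-ing with size minus one replaces the modulo when the size is a power of 2, and the visited-lock bitmap is reset with a single memset between searches.

```c
#include <string.h>

#define MAX_CIRCULAR_QUE_SIZE 4096U                 /* must be a power of 2 */
#define QUE_MASK              (MAX_CIRCULAR_QUE_SIZE - 1)

static unsigned int que[MAX_CIRCULAR_QUE_SIZE];
static unsigned int head, tail;

static void que_enqueue(unsigned int elem)
{
    /* instead of que[tail % MAX_CIRCULAR_QUE_SIZE] = elem; */
    que[tail++ & QUE_MASK] = elem;
}

static unsigned int que_dequeue(void)
{
    return que[head++ & QUE_MASK];
}

/* One bit per lock index: cheap to set, and the whole map is wiped
 * with one memset before the graph is searched again. */
#define BITS_PER_LONG_T (8 * sizeof(unsigned long))
static unsigned long accessed[MAX_CIRCULAR_QUE_SIZE / BITS_PER_LONG_T];

static void mark_accessed(unsigned int idx)
{
    accessed[idx / BITS_PER_LONG_T] |= 1UL << (idx % BITS_PER_LONG_T);
}

static void clear_accessed(void)
{
    memset(accessed, 0, sizeof(accessed));
}
```
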
Ming Lei authored
Currently lockdep prints the first circle detected, if one exists, when acquiring a new (next) lock. This patch prints the shortest path from the next lock to be acquired back to the previously held lock if a circle is found.

The patch still uses the current method to detect the circle; once the circle is found, a breadth-first search algorithm is used to compute the shortest path from the next lock to the previous lock in the forward lock dependency graph. Printing the shortest path shortens the reported dependency chain and makes troubleshooting possible circular locking easier.

Signed-off-by: Ming Lei <tom.leiming@gmail.com>
Signed-off-by: Peter Zijlstra <a.p.zijlstra@chello.nl>
LKML-Reference: <1246201486-7308-2-git-send-email-tom.leiming@gmail.com>
Signed-off-by: Ingo Molnar <mingo@elte.hu>

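A minimal sketch of the path-recovery idea behind such printing, using simplified stand-ins rather than lockdep's real structures: each node discovered by the BFS records its predecessor, and since BFS visits nodes in order of increasing distance, walking that chain from the target yields a shortest path.

```c
#include <stdio.h>

/* Simplified stand-in for a lock dependency node. */
struct dep_node {
    struct dep_node *parent;   /* BFS predecessor, NULL at the source */
    const char *name;
};

static void print_shortest_path(const struct dep_node *target)
{
    /* BFS guarantees the parent chain is a minimum-length path */
    for (const struct dep_node *n = target; n; n = n->parent)
        printf("%s%s", n->name, n->parent ? " <- " : "\n");
}
```
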
- Jul 22, 2009
Bruno Premont authored
Since genirq: Delegate irq affinity setting to the irq thread (591d2fb0), compilation with CONFIG_SMP=n fails with the following error:

  /usr/src/linux-2.6/kernel/irq/manage.c: In function 'irq_thread_check_affinity':
  /usr/src/linux-2.6/kernel/irq/manage.c:475: error: 'struct irq_desc' has no member named 'affinity'
  make[4]: *** [kernel/irq/manage.o] Error 1

That commit adds a new function, irq_thread_check_affinity(), which uses struct irq_desc.affinity, a member that is only available for CONFIG_SMP=y. Move that function under #ifdef CONFIG_SMP.

[ tglx@brownpaperbag: compile and boot tested on UP and SMP ]

Signed-off-by: Bruno Premont <bonbons@linux-vserver.org>
LKML-Reference: <20090722222232.2eb3e1c4@neptune.home>
Signed-off-by: Thomas Gleixner <tglx@linutronix.de>

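The shape of such a fix is the usual SMP/UP split, sketched here with the function body elided: the real function compiles only under CONFIG_SMP, and UP builds get an empty inline stub.

```c
#ifdef CONFIG_SMP
static void irq_thread_check_affinity(struct irq_desc *desc,
                                      struct irqaction *action)
{
    /* body touching desc->affinity elided; that member only exists
     * when CONFIG_SMP=y */
}
#else
static inline void irq_thread_check_affinity(struct irq_desc *desc,
                                             struct irqaction *action) { }
#endif
```
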
Arjan van de Ven authored
the "reserved" field was not initialized to zero, resulting in 4 bytes of stack data leaking to userspace.... Signed-off-by:
Arjan van de Ven <arjan@linux.intel.com> Acked-by:
Peter Zijlstra <a.p.zijlstra@chello.nl> Signed-off-by:
Linus Torvalds <torvalds@linux-foundation.org>
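The general pattern behind this class of fix, sketched with a hypothetical event struct: zero the whole record before filling in the fields, so reserved and padding bytes can never carry stale stack contents to userspace.

```c
#include <string.h>

/* Hypothetical record that gets copied out to userspace. */
struct sample_event {
    unsigned long long time;
    unsigned int pid;
    unsigned int reserved;        /* the kind of field that was leaking */
};

static void fill_event(struct sample_event *ev,
                       unsigned long long now, unsigned int pid)
{
    memset(ev, 0, sizeof(*ev));   /* zero everything, reserved included */
    ev->time = now;
    ev->pid  = pid;
    /* ev->reserved is guaranteed 0 rather than old stack bytes */
}
```
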
Anton Blanchard authored
Right now we only print PERF_EVENT_THROTTLE + 1 (i.e. PERF_EVENT_UNTHROTTLE). Fix this to print both throttle and unthrottle events.

Signed-off-by: Anton Blanchard <anton@samba.org>
Signed-off-by: Peter Zijlstra <a.p.zijlstra@chello.nl>
LKML-Reference: <20090722130546.GE9029@kryten>

Peter Zijlstra authored
Anton noted that for inherited counters the counter-id as provided by PERF_SAMPLE_ID isn't mappable to the id found through PERF_RECORD_ID, because each inherited counter gets its own id. His suggestion was to always return the parent counter id, since that is the primary counter id as exposed.

However, these inherited counters have a unique identifier so that events like PERF_EVENT_PERIOD and PERF_EVENT_THROTTLE can be specific about which counter gets modified, which is important when trying to normalize the sample streams.

This patch removes PERF_EVENT_PERIOD in favour of PERF_SAMPLE_PERIOD, which is more useful anyway, since changing periods became a lot more common than initially thought, rendering PERF_EVENT_PERIOD the less useful solution (also, PERF_SAMPLE_PERIOD reports the more accurate value, since it reports the value used to trigger the overflow, whereas PERF_EVENT_PERIOD simply reports the requested period change, which might only take effect on the next cycle).

This still leaves us PERF_EVENT_THROTTLE to consider, but since that _should_ be a rare occurrence, and linking it to a primary id is the most useful bit for diagnosing the problem, we introduce a PERF_SAMPLE_STREAM_ID for those few cases where the full reconstruction is important.

[ Does change the ABI a little, but I see no other way out. ]

Suggested-by: Anton Blanchard <anton@samba.org>
Signed-off-by: Peter Zijlstra <a.p.zijlstra@chello.nl>
LKML-Reference: <1248095846.15751.8781.camel@twins>

Peter Zijlstra authored
Per example of Arjan's patch, I went through and found a few more.

Signed-off-by: Peter Zijlstra <a.p.zijlstra@chello.nl>

Arjan van de Ven authored
the "reserved" field was not initialized to zero, resulting in 4 bytes of stack data leaking to userspace.... Signed-off-by:
Arjan van de Ven <arjan@linux.intel.com> Signed-off-by:
Peter Zijlstra <a.p.zijlstra@chello.nl>
Peter Zijlstra authored
Commit ca109491 (hrtimer: removing all ur callback modes) moved all hrtimer callbacks into hard interrupt context when high resolution timers are active. That breaks code which relied on the assumption that the callback happens in softirq context.

Provide a generic infrastructure which combines tasklets and hrtimers together to provide an in-softirq hrtimer experience.

Signed-off-by: Peter Zijlstra <a.p.zijlstra@chello.nl>
Cc: torvalds@linux-foundation.org
Cc: kaber@trash.net
Cc: David Miller <davem@davemloft.net>
LKML-Reference: <1248265724.27058.1366.camel@twins>
Signed-off-by: Thomas Gleixner <tglx@linutronix.de>

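A rough usage sketch of the combined API this patch introduces; treat the exact signatures as approximate. The hrtimer still fires in hardirq context, but it only schedules the tasklet, so the user's callback runs in softirq context again.

```c
#include <linux/interrupt.h>
#include <linux/hrtimer.h>
#include <linux/ktime.h>

static struct tasklet_hrtimer my_timer;

/* Invoked from the tasklet, i.e. softirq context, which is what the
 * pre-ca109491 code assumed. */
static enum hrtimer_restart my_timer_fn(struct hrtimer *t)
{
    /* do softirq-safe work here */
    return HRTIMER_NORESTART;
}

static void my_timer_setup(void)
{
    tasklet_hrtimer_init(&my_timer, my_timer_fn,
                         CLOCK_MONOTONIC, HRTIMER_MODE_REL);
    /* fire once, one second from now */
    tasklet_hrtimer_start(&my_timer, ktime_set(1, 0), HRTIMER_MODE_REL);
}
```
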
- Jul 21, 2009
Thomas Gleixner authored
irq_set_thread_affinity() calls set_cpus_allowed_ptr(), which might sleep, but irq_set_thread_affinity() is called with desc->lock held and can be called from hard interrupt context as well. The code has another bug: it does not hold a ref on the task struct, as required by set_cpus_allowed_ptr().

Just set the IRQTF_AFFINITY bit in action->thread_flags; the next time the thread runs, it migrates itself. This solves all of the above problems nicely. Add kerneldoc to irq_set_thread_affinity() while at it.

Signed-off-by: Thomas Gleixner <tglx@linutronix.de>
LKML-Reference: <new-submission>

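A simplified sketch of the resulting split, with locking and bookkeeping omitted: the path that may run in hardirq context merely sets a flag, and the irq thread, which runs in process context where sleeping is allowed, performs the migration itself.

```c
/* hardirq-safe side: just record that a migration is wanted */
static void irq_set_thread_affinity(struct irq_desc *desc,
                                    const struct cpumask *cpumask)
{
    set_bit(IRQTF_AFFINITY, &desc->action->thread_flags);
}

/* irq thread side: process context, set_cpus_allowed_ptr() may sleep */
static void irq_thread_check_affinity(struct irq_desc *desc,
                                      struct irqaction *action)
{
    if (test_and_clear_bit(IRQTF_AFFINITY, &action->thread_flags))
        set_cpus_allowed_ptr(current, desc->affinity);
}
```
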
- Jul 19, 2009
Thomas Gleixner authored
Writing a zero-length string to sys/.../current_clocksource will cause a NULL pointer dereference if the clock events system is in one-shot (highres or nohz) mode.

Pointed-out-by: Dan Carpenter <error27@gmail.com>
LKML-Reference: <alpine.DEB.2.00.0907191545580.12306@bicker>
Signed-off-by: Thomas Gleixner <tglx@linutronix.de>

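An illustrative guard for this class of bug, not the literal upstream diff (the buffer size is an assumption): reject the empty write before any clocksource-name lookup can end up dereferencing NULL.

```c
static ssize_t current_clocksource_store(const char *buf, size_t count)
{
    char name[32];                /* assumed size for the sketch */

    if (!count)
        return -EINVAL;           /* zero-length write: nothing to parse */
    if (count >= sizeof(name))
        return -EINVAL;

    memcpy(name, buf, count);
    name[count] = '\0';
    /* ... strip a trailing '\n', look up and switch the clocksource ... */
    return count;
}
```
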
- Jul 18, 2009
Pavel Roskin authored
timer->expires may be uninitialized, so check timer_pending() before touching timer->expires to pacify kmemcheck.

Signed-off-by: Pavel Roskin <proski@gnu.org>
LKML-Reference: <20090718204602.5191.360.stgit@mj.roinet.com>
Signed-off-by: Thomas Gleixner <tglx@linutronix.de>

Thomas Gleixner authored
Commit e3c8ca83 (sched: do not count frozen tasks toward load) broke the nr_uninterruptible accounting on freeze/thaw. On freeze the task is excluded from accounting with a check for (task->flags & PF_FROZEN), but that flag is cleared before the task is thawed. So while we prevent the task in TASK_UNINTERRUPTIBLE state from being accounted to nr_uninterruptible on freeze, we still decrement nr_uninterruptible on thaw.

Use a separate flag which is handled by the freezing task itself. Set it before calling the scheduler with TASK_UNINTERRUPTIBLE state and clear it after we return from the frozen state.

Cc: <stable@kernel.org>
Signed-off-by: Thomas Gleixner <tglx@linutronix.de>

Thomas Gleixner authored
The new load average code clears rq->calc_load_active on CPU_ONLINE. That's wrong, as the newly onlined CPU might already have received a scheduler tick and accounted the delta against the stale value from the time we offlined the CPU. Clear the value when we clean up the dead CPU instead.

Also move the update of the calc_load_update time for the newly onlined CPU to CPU_UP_PREPARE, to avoid the CPU playing catch-up with the stale update time value.

Signed-off-by: Thomas Gleixner <tglx@linutronix.de>

Anton Blanchard authored
Right now we don't output vfork events. Even though we should always see an exec after a vfork, we may get perfcounter samples between the vfork and the exec. These samples can lead to some confusion when parsing perfcounter data.

To keep things consistent we should always log a fork event. It will result in a little more log data, but is less confusing for trace parsing tools.

Signed-off-by: Anton Blanchard <anton@samba.org>
Signed-off-by: Peter Zijlstra <a.p.zijlstra@chello.nl>
LKML-Reference: <20090716104817.589309391@samba.org>
Signed-off-by: Ingo Molnar <mingo@elte.hu>

Anton Blanchard authored
There are a few places we are leaking tiny amounts of kernel memory to userspace. This happens when writing out strings, because we always align the end to 64 bits.

To avoid this we should always use an appropriately sized temporary buffer and ensure it is zeroed. Since d_path assembles the string from the end of the buffer backwards, we need to add 64 bits after the buffer to allow for alignment. We also need to copy arch_vma_name to the temporary buffer, because if we use it directly we may end up copying to userspace a number of bytes after the end of the string constant.

Signed-off-by: Anton Blanchard <anton@samba.org>
Signed-off-by: Peter Zijlstra <a.p.zijlstra@chello.nl>
LKML-Reference: <20090716104817.273972048@samba.org>
Signed-off-by: Ingo Molnar <mingo@elte.hu>

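A sketch of the described pattern (allocation style and names are assumptions): the scratch buffer is zeroed up front, so the bytes later used to round the record up to a 64-bit boundary can never contain old stack data, and d_path() gets a u64 of slack because it assembles the string from the end of the buffer backwards.

```c
static int emit_path_string(struct file *file)
{
    char *buf;
    const char *name;

    /* zeroed buffer with a u64 of slack past PATH_MAX for alignment */
    buf = kzalloc(PATH_MAX + sizeof(u64), GFP_KERNEL);
    if (!buf)
        return -ENOMEM;

    name = d_path(&file->f_path, buf, PATH_MAX);
    /* ... copy strlen(name) rounded up to a u64 multiple into the
     * output record; the rounding bytes are guaranteed zero ... */

    kfree(buf);
    return 0;
}
```
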
Fabio Checconi authored
I spotted two sites that didn't take vruntime wrap-around into account. Fix these by creating a comparison helper that does so.

Signed-off-by: Fabio Checconi <fabio@gandalf.sssup.it>
Signed-off-by: Peter Zijlstra <a.p.zijlstra@chello.nl>
LKML-Reference: <new-submission>
Signed-off-by: Ingo Molnar <mingo@elte.hu>

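Such a helper is classically a signed comparison of the difference, along these lines: as long as two vruntimes are within 2^63 of each other, the s64 subtraction yields their true ordering even after the u64 counter wraps.

```c
static inline int entity_before(struct sched_entity *a,
                                struct sched_entity *b)
{
    /* a->vruntime < b->vruntime would misorder across wrap-around */
    return (s64)(a->vruntime - b->vruntime) < 0;
}
```
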
- Jul 16, 2009
Xiao Guangrong authored
ftrace_trace_onoff_callback() will return an error even if we do the right operation, for example:

  # echo _spin_*:traceon:10 > set_ftrace_filter
  -bash: echo: write error: Invalid argument
  # cat set_ftrace_filter
  #### all functions enabled ####
  _spin_trylock_bh:traceon:count=10
  _spin_unlock_irq:traceon:count=10
  _spin_unlock_bh:traceon:count=10
  _spin_lock_irq:traceon:count=10
  _spin_unlock:traceon:count=10
  _spin_trylock:traceon:count=10
  _spin_unlock_irqrestore:traceon:count=10
  _spin_lock_irqsave:traceon:count=10
  _spin_lock_bh:traceon:count=10
  _spin_lock:traceon:count=10

We want to set _spin_*:traceon:10 in set_ftrace_filter; it complains with "Invalid argument", but the operation is successful.

This is because ftrace_process_regex() returns the number of functions that matched the pattern. If the number is not 0, this value is returned by ftrace_regex_write(), whereas we want to return the number of bytes virtually written. Also, the file offset pointer is not updated in this case.

If the number of matched functions is lower than the number of bytes written by the user, this results in a reprocessing of the string given by the user with a smaller size, leading to a malformed ftrace regex and then a -EINVAL being returned.

So, this patch fixes it by returning 0 if no error occurred. The fix also applies to 2.6.30.

Signed-off-by: Xiao Guangrong <xiaoguangrong@cn.fujitsu.com>
Reviewed-by: Li Zefan <lizf@cn.fujitsu.com>
Cc: stable@kernel.org
Signed-off-by: Frederic Weisbecker <fweisbec@gmail.com>

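For reference, the contract being restored looks roughly like this in any ->write handler; process_user_command() is a hypothetical stand-in for the regex processing:

```c
static ssize_t my_filter_write(struct file *file, const char __user *ubuf,
                               size_t cnt, loff_t *ppos)
{
    int err = process_user_command(ubuf, cnt);   /* hypothetical helper */

    if (err < 0)
        return err;   /* genuine failure: propagate the errno */

    *ppos += cnt;     /* advance the file offset */
    return cnt;       /* all input consumed, even if zero rules matched */
}
```
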
- Jul 13, 2009
Steven Rostedt authored
The per-cpu variable stat is freed if we fail to allocate a name on start up. This was because stat was dynamically allocated in the initial design; it has since become a static per-cpu variable, but the free on error was not removed. Also add the __init annotation to the function this is in.

[ Impact: prevent possible memory corruption on low mem at boot up ]

Signed-off-by: Steven Rostedt <rostedt@goodmis.org>
Signed-off-by: Ingo Molnar <mingo@elte.hu>

Chris Wilson authored
Fix a missed rename in EVENT_PROFILE support so that it gets built and allows tracepoint tracing from the 'perf' tool. Fix a typo in the (never before built & enabled) portion of perf_counter.c as well, and update that code to the attr.config changes too.

Signed-off-by: Chris Wilson <chris@chris-wilson.co.uk>
Cc: Ben Gamari <bgamari.foss@gmail.com>
Cc: Jason Baron <jbaron@redhat.com>
Cc: Frederic Weisbecker <fweisbec@gmail.com>
Cc: Mike Galbraith <efault@gmx.de>
Cc: Peter Zijlstra <a.p.zijlstra@chello.nl>
Cc: Paul Mackerras <paulus@samba.org>
Cc: Steven Rostedt <rostedt@goodmis.org>
LKML-Reference: <1246869094-21237-1-git-send-email-chris@chris-wilson.co.uk>
Signed-off-by: Ingo Molnar <mingo@elte.hu>

- Jul 12, 2009
Alexey Dobriyan authored
* Remove smp_lock.h from files which don't need it (including some headers!)
* Add smp_lock.h to files which do need it
* Make the smp_lock.h include conditional in hardirq.h; it's needed only for one kernel_locked() usage, which is under CONFIG_PREEMPT (see the sketch below)

This will make hardirq.h inclusion cheaper for every PREEMPT=n config (which includes allmodconfig/allyesconfig, BTW).

Signed-off-by: Alexey Dobriyan <adobriyan@gmail.com>
Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>

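A sketch of the conditional include described in the third bullet; the surrounding macro is an assumption about how kernel_locked() gets folded into hardirq.h:

```c
/* hardirq.h, roughly: PREEMPT=n builds never see smp_lock.h at all */
#ifdef CONFIG_PREEMPT
# include <linux/smp_lock.h>
# define PREEMPT_INATOMIC_BASE kernel_locked()
#else
# define PREEMPT_INATOMIC_BASE 0
#endif
```
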
- Jul 11, 2009
Sonny Rao authored
get_futex_key() can infinitely loop if it is called on a virtual address that is within a huge page but not aligned to the beginning of that page. The call to get_user_pages_fast will return the struct page for a sub-page within the huge page and the check for page->mapping will always fail. The fix is to call compound_head on the page before checking that it's mapped.

Signed-off-by: Sonny Rao <sonnyrao@us.ibm.com>
Acked-by: Thomas Gleixner <tglx@linutronix.de>
Cc: stable@kernel.org
Cc: anton@samba.org
Cc: rajamony@us.ibm.com
Cc: speight@us.ibm.com
Cc: mstephen@us.ibm.com
Cc: grimm@us.ibm.com
Cc: mikey@ozlabs.au.ibm.com
LKML-Reference: <20090710231313.GA23572@us.ibm.com>
Signed-off-by: Ingo Molnar <mingo@elte.hu>

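The essence of that fix, simplified (locking and error paths abbreviated): tail pages of a huge page have no ->mapping of their own, so normalize to the head page before the check.

```c
page = compound_head(page);   /* no-op for ordinary base pages */
if (!page->mapping) {
    put_page(page);
    goto again;               /* retry instead of spinning forever */
}
```
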
Paul Turner authored
One of the isolation modifications for SCHED_IDLE is the unitization of sleeper credit. However, the check for this assumes that the sched_entity we're placing always belongs to a task. This is potentially not true with group scheduling and leaves us rummaging randomly when we try to pull the policy.

Signed-off-by: Paul Turner <pjt@google.com>
Cc: peterz@infradead.org
LKML-Reference: <alpine.DEB.1.00.0907101649570.29914@kitami.corp.google.com>
Signed-off-by: Ingo Molnar <mingo@elte.hu>

- Jul 10, 2009
Peter Zijlstra authored
Optimize cond_resched() by removing one conditional. Currently cond_resched() checks system_state == SYSTEM_RUNNING in order to avoid scheduling before the scheduler is running. We can, however, as per Matt's suggestion, use PREEMPT_ACTIVE to accomplish the very same.

Suggested-by: Matt Mackall <mpm@selenic.com>
Signed-off-by: Peter Zijlstra <a.p.zijlstra@chello.nl>
Acked-by: Matt Mackall <mpm@selenic.com>
Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>

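A sketch of the resulting check, assuming early-boot preempt counts keep PREEMPT_ACTIVE set until the scheduler is up, so one bit test subsumes the old system_state comparison:

```c
static inline int should_resched(void)
{
    /* false both when preemption is off and before the scheduler runs */
    return need_resched() && !(preempt_count() & PREEMPT_ACTIVE);
}
```
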
Thomas Gleixner authored
The timer migration expiry check should prevent the migration of a timer to another CPU when the timer expires before the next event is scheduled on the other CPU. Migrating the timer might delay it because we can not reprogram the clock event device on the other CPU. But the code implementing that check has two flaws:

- for !HIGHRES, the check compares the expiry value with the clock event device's expiry value, which is wrong for CLOCK_REALTIME based timers.
- the check is racy. It holds the hrtimer base lock of the target CPU, but the clock event device expiry value can be modified nevertheless, e.g. by a timer interrupt firing.

The !HIGHRES case is easy to fix, as we can enqueue the timer on the CPU which was selected by the load balancer. It runs the idle balancing code once per jiffy anyway. So the maximum delay for the timer is the same as when we keep the tick on the current CPU going.

In the HIGHRES case we can get the next expiry value from the hrtimer cpu_base of the target CPU and serialize the update with the cpu_base lock. This moves the lock section in hrtimer_interrupt() so we can set next_event to KTIME_MAX while we are handling the expired timers and set it to the next expiry value after we handled the timers under the base lock. While the expired timers are processed, timer migration is blocked because the expiry time of the timer is always <= KTIME_MAX.

Also remove the now useless clockevents_get_next_event() function.

Signed-off-by: Thomas Gleixner <tglx@linutronix.de>

Thomas Gleixner authored
The timer migration code needs to check whether the expiry time of the timer is before the programmed clock event expiry time when the timer is enqueued on another CPU, because we can not reprogram the timer device on the other CPU. The current logic checks the expiry time even if we enqueue on the current CPU, when nohz_get_load_balancer() returns the current CPU. This might lead to an endless loop in the expiry check code when the expiry time of the timer is before the currently programmed next event.

Check whether nohz_get_load_balancer() returns the current CPU and skip the expiry check if this is the case.

The bug was triggered from the networking code. The patch fixes the regression http://bugzilla.kernel.org/show_bug.cgi?id=13738 (Soft-Lockup/Race in networking in 2.6.31-rc1+195).

Cc: Arun Bharadwaj <arun@linux.vnet.ibm.com>
Tested-by: Joao Correia <joaomiguelcorreia@gmail.com>
Tested-by: Andres Freund <andres@anarazel.de>
Signed-off-by: Thomas Gleixner <tglx@linutronix.de>

Fabio Checconi authored
init_rt_rq() initializes only rq->rt.pushable_tasks, and not the pushable_tasks field of the passed rt_rq. The plist is not used uninitialized, since the only pushable_tasks plists used are the ones of root rt_rqs; however, reinitializing the list on every group creation corrupts the root plist, losing its previous contents.

Signed-off-by: Fabio Checconi <fabio@gandalf.sssup.it>
Signed-off-by: Peter Zijlstra <a.p.zijlstra@chello.nl>
LKML-Reference: <20090615185638.GK21741@gandalf.sssup.it>
CC: Gregory Haskins <ghaskins@novell.com>
Signed-off-by: Ingo Molnar <mingo@elte.hu>

Lucas De Marchi authored
The sched_stat fields are currently not reset upon fork. Ingo's recent commit 6c594c21 did reset nr_migrations, but it didn't reset any of the others. This patch resets all sched_stat fields on fork.

Signed-off-by: Lucas De Marchi <lucas.de.marchi@gmail.com>
Signed-off-by: Peter Zijlstra <a.p.zijlstra@chello.nl>
LKML-Reference: <193b0f820907090457s7a3662f4gcdecdc22fcae857b@mail.gmail.com>
Signed-off-by: Ingo Molnar <mingo@elte.hu>

Peter Zijlstra authored
Fixes an easily triggerable BUG() when setting process affinities. Make sure to count the number of migratable tasks in the same place: the root rt_rq. Otherwise the number doesn't make sense and we'll hit the BUG in set_cpus_allowed_rt().

Also, make sure we only count tasks, not groups (this is probably already taken care of by the fact that rt_se->nr_cpus_allowed will be 0 for groups, but be more explicit).

Tested-by: Thomas Gleixner <tglx@linutronix.de>
CC: stable@kernel.org
Signed-off-by: Peter Zijlstra <a.p.zijlstra@chello.nl>
Acked-by: Gregory Haskins <ghaskins@novell.com>
LKML-Reference: <1247067476.9777.57.camel@twins>
Signed-off-by: Ingo Molnar <mingo@elte.hu>

Peter Zijlstra authored
Instead of open coding the unclone context thingy, put it in a common function.

Signed-off-by: Peter Zijlstra <a.p.zijlstra@chello.nl>
LKML-Reference: <new-submission>
Signed-off-by: Ingo Molnar <mingo@elte.hu>

- Jul 09, 2009
Catalin Marinas authored
kmemleak_alloc() calls were added in some places where alloc_bootmem was called. Since kmemleak now tracks bootmem allocations, these explicit calls should be removed.

Signed-off-by: Catalin Marinas <catalin.marinas@arm.com>
Cc: Ingo Molnar <mingo@elte.hu>
Acked-by: Pekka Enberg <penberg@cs.helsinki.fi>

- Jul 08, 2009
Joe Perches authored
Commit 5fd29d6c ("printk: clean up handling of log-levels and newlines") changed printk semantics. printk lines with multiple KERN_<level> prefixes are no longer emitted as before the patch; <level> is now included in the output on each additional use.

Remove all uses of multiple KERN_<level>s in formats.

Signed-off-by: Joe Perches <joe@perches.com>
Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>

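An illustrative before/after under the new semantics: only the first KERN_<level> in a format acts as a prefix, and any later one ends up as literal bytes in the output.

```c
/* bad: the second KERN_INFO is now printed literally */
printk(KERN_INFO "line one\n" KERN_INFO "line two\n");

/* good: one level marker per printk call */
printk(KERN_INFO "line one\n");
printk(KERN_INFO "line two\n");
```
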
Alexey Dobriyan authored
Fix various silly problems wrt mnt_namespace.h:

- exit_mnt_ns() isn't used, remove it
- having done that, the sched.h and nsproxy.h inclusions aren't needed
- the mount.h inclusion was needed for vfsmount_lock, but no longer
- remove the mnt_namespace.h inclusion from files which don't use anything from mnt_namespace.h

Signed-off-by: Alexey Dobriyan <adobriyan@gmail.com>
Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>
