- Sep 28, 2010
-
-
Lucas De Marchi authored
Signed-off-by:
Len Brown <len.brown@intel.com>
-
- Aug 09, 2010
-
-
Ai Li authored
On some SoC chips, HW resources may be in use during any particular idle period. As a consequence, the cpuidle states that the SoC is safe to enter can change from idle period to idle period. In addition, the latency and threshold of each cpuidle state can vary, depending on the operating condition when the CPU becomes idle, e.g. the current cpu frequency, the current state of the HW blocks, etc. cpuidle core and the menu governor, in the current form, are geared towards cpuidle states that are static, i.e. the availabiltiy of the states, their latencies, their thresholds are non-changing during run time. cpuidle does not provide any hook that cpuidle drivers can use to adjust those values on the fly for the current idle period before the menu governor selects the target cpuidle state. This patch extends cpuidle core and the menu governor to handle states that are dynamic. There are three additions in the patch and the patch maintains backwards-compatibility with existing cpuidle drivers. 1) add prepare() to struct cpuidle_device. A cpuidle driver can hook into the callback and cpuidle will call prepare() before calling the governor's select function. The callback gives the cpuidle driver a chance to update the dynamic information of the cpuidle states for the current idle period, e.g. state availability, latencies, thresholds, power values, etc. 2) add CPUIDLE_FLAG_IGNORE as one of the state flags. In the prepare() function, a cpuidle driver can set/clear the flag to indicate to the menu governor whether a cpuidle state should be ignored, i.e. not available, during the current idle period. 3) add power_specified bit to struct cpuidle_device. The menu governor currently assumes that the cpuidle states are arranged in the order of increasing latency, threshold, and power savings. This is true or can be made true for static states. Once the state parameters are dynamic, the latencies, thresholds, and power savings for the cpuidle states can increase or decrease by different amounts from idle period to idle period. So the assumption of increasing latency, threshold, and power savings from Cn to C(n+1) can no longer be guaranteed. It can be straightforward to calculate the power consumption of each available state and to specify it in power_usage for the idle period. Using the power_usage fields, the menu governor then selects the state that has the lowest power consumption and that still satisfies all other critieria. The power_specified bit defaults to 0. For existing cpuidle drivers, cpuidle detects that power_specified is 0 and fills in a dummy set of power_usage values. Signed-off-by:
Ai Li <aili@codeaurora.org> Cc: Len Brown <len.brown@intel.com> Acked-by:
Arjan van de Ven <arjan@linux.intel.com> Cc: Ingo Molnar <mingo@elte.hu> Cc: Venkatesh Pallipadi <venki@google.com> Signed-off-by:
Andrew Morton <akpm@linux-foundation.org> Signed-off-by:
Linus Torvalds <torvalds@linux-foundation.org>
-
- Aug 03, 2010
-
-
Thomas Renninger authored
and fix the broken case if a core's frequency depends on others. trace_power_frequency was only implemented in a rather ungeneric way in acpi-cpufreq driver's target() function only. -> Move the call to trace_power_frequency to cpufreq.c:cpufreq_notify_transition() where CPUFREQ_POSTCHANGE notifier is triggered. This will support power frequency tracing by all cpufreq drivers trace_power_frequency did not trace frequency changes correctly when the userspace governor was used or when CPU cores' frequency depend on each other. -> Moving this into the CPUFREQ_POSTCHANGE notifier and pass the cpu which gets switched automatically fixes this. Robert Schoene provided some important fixes on top of my initial quick shot version which are integrated in this patch: - Forgot some changes in power_end trace (TP_printk/variable names) - Variable dummy in power_end must now be cpu_id - Use static 64 bit variable instead of unsigned int for cpu_id Signed-off-by:
Thomas Renninger <trenn@suse.de> CC: davej@redhat.com CC: arjan@infradead.org CC: linux-kernel@vger.kernel.org CC: robert.schoene@tu-dresden.de Tested-by:
<robert.schoene@tu-dresden.de> Signed-off-by:
Dave Jones <davej@redhat.com>
-
- Jul 22, 2010
-
-
Thomas Renninger authored
and fix the broken case if a core's frequency depends on others. trace_power_frequency was only implemented in a rather ungeneric way in acpi-cpufreq driver's target() function only. -> Move the call to trace_power_frequency to cpufreq.c:cpufreq_notify_transition() where CPUFREQ_POSTCHANGE notifier is triggered. This will support power frequency tracing by all cpufreq drivers. trace_power_frequency did not trace frequency changes correctly when the userspace governor was used or when CPU cores' frequency depend on each other. -> Moving this into the CPUFREQ_POSTCHANGE notifier and pass the cpu which gets switched automatically fixes this. Robert Schoene provided some important fixes on top of my initial quick shot version which are integrated in this patch: - Forgot some changes in power_end trace (TP_printk/variable names) - Variable dummy in power_end must now be cpu_id - Use static 64 bit variable instead of unsigned int for cpu_id [akpm@linux-foundation.org: build fix] Signed-off-by:
Thomas Renninger <trenn@suse.de> Cc: davej@codemonkey.org.uk Signed-off-by:
Ingo Molnar <mingo@elte.hu> Cc: Dave Jones <davej@codemonkey.org.uk> Acked-by:
Arjan van de Ven <arjan@infradead.org> Cc: Robert Schoene <robert.schoene@tu-dresden.de> Tested-by:
Robert Schoene <robert.schoene@tu-dresden.de> Signed-off-by:
Andrew Morton <akpm@linux-foundation.org>
-
- Jul 01, 2010
-
-
Peter Zijlstra authored
Commit 0224cf4c (sched: Intoduce get_cpu_iowait_time_us()) broke things by not making sure preemption was indeed disabled by the callers of nr_iowait_cpu() which took the iowait value of the current cpu. This resulted in a heap of preempt warnings. Cure this by making nr_iowait_cpu() take a cpu number and fix up the callers to pass in the right number. Signed-off-by:
Peter Zijlstra <a.p.zijlstra@chello.nl> Cc: Arjan van de Ven <arjan@infradead.org> Cc: Sergey Senozhatsky <sergey.senozhatsky@gmail.com> Cc: Rafael J. Wysocki <rjw@sisk.pl> Cc: Maxim Levitsky <maximlevitsky@gmail.com> Cc: Len Brown <len.brown@intel.com> Cc: Pavel Machek <pavel@ucw.cz> Cc: Jiri Slaby <jslaby@suse.cz> Cc: linux-pm@lists.linux-foundation.org LKML-Reference: <1277968037.1868.120.camel@laptop> Signed-off-by:
Ingo Molnar <mingo@elte.hu>
-
- May 27, 2010
-
-
Len Brown authored
cpuidle_register_driver() sets cpuidle_curr_driver cpuidle_unregister_driver() clears cpuidle_curr_driver We should't expose cpuidle_curr_driver to potential modification except via these interfaces. So make it static and create cpuidle_get_driver() to observe it. Signed-off-by:
Len Brown <len.brown@intel.com>
-
Len Brown authored
Assure that cpuidle_unregister_driver() will not clobber the registered driver if unregistered by somebody else. Signed-off-by:
Len Brown <len.brown@intel.com>
-
- May 25, 2010
-
-
Arjan van de Ven authored
Currently, the menu governor uses the (corrected) next timer as key item for predicting the idle duration. It turns out that there are specific cases where this breaks down: There are cases where we have a very repetitive pattern of idle durations, where the idle period is pretty much the same, for reasons completely unrelated to the next timer event. Examples of such repeating patterns are network loads with irq mitigation, the mouse moving but in theory also the wifi beacons. This patch adds a relatively simple detector for such repeating patterns, where the standard deviation of the last 8 idle periods is compared to a threshold. With this extra predictor in place, measurements show that the DECAY factor can now be increased (the decaying average will now decay slower) to get an even more stable result. [arjan@infradead.org: fix bug identified by Frank] Signed-off-by:
Arjan van de Ven <arjan@linux.intel.com> Cc: Corrado Zoccolo <czoccolo@gmail.com> Cc: Frank Rowand <frank.rowand@am.sony.com> Signed-off-by:
Andrew Morton <akpm@linux-foundation.org> Signed-off-by:
Linus Torvalds <torvalds@linux-foundation.org>
-
- May 10, 2010
-
-
Mark Gross authored
This patch changes the string based list management to a handle base implementation to help with the hot path use of pm-qos, it also renames much of the API to use "request" as opposed to "requirement" that was used in the initial implementation. I did this because request more accurately represents what it actually does. Also, I added a string based ABI for users wanting to use a string interface. So if the user writes 0xDDDDDDDD formatted hex it will be accepted by the interface. (someone asked me for it and I don't think it hurts anything.) This patch updates some documentation input I got from Randy. Signed-off-by:
markgross <mgross@linux.intel.com> Signed-off-by:
Rafael J. Wysocki <rjw@sisk.pl>
-
- May 09, 2010
-
-
Arjan van de Ven authored
commit 672917dc ("cpuidle: menu governor: reduce latency on exit") added an optimization, where the analysis on the past idle period moved from the end of idle, to the beginning of the new idle. Unfortunately, this optimization had a bug where it zeroed one key variable for new use, that is needed for the analysis. The fix is simple, zero the variable after doing the work from the previous idle. During the audit of the code that found this issue, another issue was also found; the ->measured_us data structure member is never set, a local variable is always used instead. Signed-off-by:
Arjan van de Ven <arjan@linux.intel.com> Cc: Corrado Zoccolo <czoccolo@gmail.com> Cc: stable@kernel.org Signed-off-by:
Linus Torvalds <torvalds@linux-foundation.org>
-
- Mar 30, 2010
-
-
Tejun Heo authored
include cleanup: Update gfp.h and slab.h includes to prepare for breaking implicit slab.h inclusion from percpu.h percpu.h is included by sched.h and module.h and thus ends up being included when building most .c files. percpu.h includes slab.h which in turn includes gfp.h making everything defined by the two files universally available and complicating inclusion dependencies. percpu.h -> slab.h dependency is about to be removed. Prepare for this change by updating users of gfp and slab facilities include those headers directly instead of assuming availability. As this conversion needs to touch large number of source files, the following script is used as the basis of conversion. http://userweb.kernel.org/~tj/misc/slabh-sweep.py The script does the followings. * Scan files for gfp and slab usages and update includes such that only the necessary includes are there. ie. if only gfp is used, gfp.h, if slab is used, slab.h. * When the script inserts a new include, it looks at the include blocks and try to put the new include such that its order conforms to its surrounding. It's put in the include block which contains core kernel includes, in the same order that the rest are ordered - alphabetical, Christmas tree, rev-Xmas-tree or at the end if there doesn't seem to be any matching order. * If the script can't find a place to put a new include (mostly because the file doesn't have fitting include block), it prints out an error message indicating which .h file needs to be added to the file. The conversion was done in the following steps. 1. The initial automatic conversion of all .c files updated slightly over 4000 files, deleting around 700 includes and adding ~480 gfp.h and ~3000 slab.h inclusions. The script emitted errors for ~400 files. 2. Each error was manually checked. Some didn't need the inclusion, some needed manual addition while adding it to implementation .h or embedding .c file was more appropriate for others. This step added inclusions to around 150 files. 3. The script was run again and the output was compared to the edits from #2 to make sure no file was left behind. 4. Several build tests were done and a couple of problems were fixed. e.g. lib/decompress_*.c used malloc/free() wrappers around slab APIs requiring slab.h to be added manually. 5. The script was run on all .h files but without automatically editing them as sprinkling gfp.h and slab.h inclusions around .h files could easily lead to inclusion dependency hell. Most gfp.h inclusion directives were ignored as stuff from gfp.h was usually wildly available and often used in preprocessor macros. Each slab.h inclusion directive was examined and added manually as necessary. 6. percpu.h was updated not to include slab.h. 7. Build test were done on the following configurations and failures were fixed. CONFIG_GCOV_KERNEL was turned off for all tests (as my distributed build env didn't work with gcov compiles) and a few more options had to be turned off depending on archs to make things build (like ipr on powerpc/64 which failed due to missing writeq). * x86 and x86_64 UP and SMP allmodconfig and a custom test config. * powerpc and powerpc64 SMP allmodconfig * sparc and sparc64 SMP allmodconfig * ia64 SMP allmodconfig * s390 SMP allmodconfig * alpha SMP allmodconfig * um on x86_64 SMP allmodconfig 8. percpu.h modifications were reverted so that it could be applied as a separate patch and serve as bisection point. Given the fact that I had only a couple of failures from tests on step 6, I'm fairly confident about the coverage of this conversion patch. If there is a breakage, it's likely to be something in one of the arch headers which should be easily discoverable easily on most builds of the specific arch. Signed-off-by:
Tejun Heo <tj@kernel.org> Guess-its-ok-by:
Christoph Lameter <cl@linux-foundation.org> Cc: Ingo Molnar <mingo@redhat.com> Cc: Lee Schermerhorn <Lee.Schermerhorn@hp.com>
-
- Mar 07, 2010
-
-
Emese Revfy authored
Constify struct sysfs_ops. This is part of the ops structure constification effort started by Arjan van de Ven et al. Benefits of this constification: * prevents modification of data that is shared (referenced) by many other structure instances at runtime * detects/prevents accidental (but not intentional) modification attempts on archs that enforce read-only kernel data at runtime * potentially better optimized code as the compiler can assume that the const data cannot be changed * the compiler/linker move const data into .rodata and therefore exclude them from false sharing Signed-off-by:
Emese Revfy <re.emese@gmail.com> Acked-by:
David Teigland <teigland@redhat.com> Acked-by:
Matt Domsch <Matt_Domsch@dell.com> Acked-by:
Maciej Sosnowski <maciej.sosnowski@intel.com> Acked-by:
Hans J. Koch <hjk@linutronix.de> Acked-by:
Pekka Enberg <penberg@cs.helsinki.fi> Acked-by:
Jens Axboe <jens.axboe@oracle.com> Acked-by:
Stephen Hemminger <shemminger@vyatta.com> Signed-off-by:
Greg Kroah-Hartman <gregkh@suse.de>
-
Andi Kleen authored
Passing the attribute to the low level IO functions allows all kinds of cleanups, by sharing low level IO code without requiring an own function for every piece of data. Also drivers can extend the attributes with own data fields and use that in the low level function. Similar to sysdev_attributes and normal attributes. This is a tree-wide sweep, converting everything in one go. No functional changes in this patch other than passing the new argument everywhere. Tested on x86, the non x86 parts are uncompiled. Signed-off-by:
Andi Kleen <ak@linux.intel.com> Signed-off-by:
Greg Kroah-Hartman <gregkh@suse.de>
-
- Mar 06, 2010
-
-
Richard Kennedy authored
Reorder struct menu_device to remove 8 bytes of padding on 64 bit builds. Size drops from 136 to 128 bytes, so possibly needing one fewer cache lines. Signed-off-by:
Richard Kennedy <richard@rsk.demon.co.uk> Cc: Arjan van de Ven <arjan@linux.intel.com> Signed-off-by:
Andrew Morton <akpm@linux-foundation.org> Signed-off-by:
Linus Torvalds <torvalds@linux-foundation.org>
-
- Jan 11, 2010
-
-
Stephen Hemminger authored
menu: use proper 64 bit math The new menu governor is incorrectly doing a 64 bit divide. Compile tested only Signed-off-by:
Stephen Hemminger <shemminger@vyatta.com> Cc: Arjan van de Ven <arjan@linux.intel.com> Cc: Len Brown <len.brown@intel.com> Cc: Venkatesh Pallipadi <venkatesh.pallipadi@intel.com> Cc: <stable@kernel.org> Signed-off-by:
Andrew Morton <akpm@linux-foundation.org> Signed-off-by:
Linus Torvalds <torvalds@linux-foundation.org>
-
- Dec 15, 2009
-
-
Julia Lawall authored
It does not seem possible that ldev can be NULL, so drop the unnecessary test. If ldev can somehow be NULL, then the initialization of last_idx should be moved below the test. A simplified version of the semantic match that detects this problem is as follows (http://coccinelle.lip6.fr/ ): // <smpl> @match exists@ expression x, E; identifier fld; @@ * x->fld ... when != \(x = E\|&x\) * x == NULL // </smpl> Signed-off-by:
Julia Lawall <julia@diku.dk> Acked-by:
Arjan van de Ven <arjan@linux.intel.com> Cc: Ingo Molnar <mingo@elte.hu> Cc: Venkatesh Pallipadi <venkatesh.pallipadi@intel.com> Signed-off-by:
Andrew Morton <akpm@linux-foundation.org> Signed-off-by:
Linus Torvalds <torvalds@linux-foundation.org>
-
- Nov 09, 2009
-
-
Uwe Kleine-König authored
This patch was generated by git grep -E -i -l '[Aa]quire' | xargs -r perl -p -i -e 's/([Aa])quire/$1cquire/' and the cumsumed was found by checking the diff for aquire. Signed-off-by:
Uwe Kleine-Knig <u.kleine-koenig@pengutronix.de> Signed-off-by:
Jiri Kosina <jkosina@suse.cz>
-
- Oct 29, 2009
-
-
Kevin Hilman authored
In the case where cpuidle_idle_call() returns before changing state due to a need_resched(), it was returning with IRQs disabled. The idle path assumes that the platform specific idle code returns with interrupts enabled (although this too is undocumented AFAICT) and on ARM we have a WARN_ON(!(irqs_disabled()) when returning from the idle loop, so the user-visible effects were only a warning since interrupts were eventually re-enabled later. On x86, this same problem exists, but there is no WARN_ON() to detect it. As on ARM, the interrupts are eventually re-enabled, so I'm not sure of any actual bugs triggered by this. It's primarily a correctness/consistency fix. This patch ensures IRQs are (re)enabled before returning. Reported-by:
Hemanth V <hemanthv@ti.com> Signed-off-by:
Kevin Hilman <khilman@deeprootsystems.com> Cc: Arjan van de Ven <arjan@linux.intel.com> Cc: Len Brown <len.brown@intel.com> Cc: Venkatesh Pallipadi <venkatesh.pallipadi@intel.com> Cc: Ingo Molnar <mingo@elte.hu> Cc: "Rafael J. Wysocki" <rjw@sisk.pl> Tested-by:
Martin Michlmayr <tbm@cyrius.com> Cc: <stable@kernel.org> [2.6.31.x] Signed-off-by:
Andrew Morton <akpm@linux-foundation.org> Signed-off-by:
Linus Torvalds <torvalds@linux-foundation.org>
-
- Sep 22, 2009
-
-
Corrado Zoccolo authored
Move the state residency accounting and statistics computation off the hot exit path. On exit, the need to recompute statistics is recorded, and new statistics will be computed when menu_select is called again. The expected effect is to reduce processor wakeup latency from sleep (C-states). We are speaking of few hundreds of cycles reduction out of a several microseconds latency (determined by the hardware transition), so it is difficult to measure. Signed-off-by:
Corrado Zoccolo <czoccolo@gmail.com> Cc: Venkatesh Pallipadi <venkatesh.pallipadi@intel.com> Cc: Len Brown <len.brown@intel.com> Cc: Adam Belay <abelay@novell.com Acked-by:
Arjan van de Ven <arjan@linux.intel.com> Signed-off-by:
Andrew Morton <akpm@linux-foundation.org> Signed-off-by:
Linus Torvalds <torvalds@linux-foundation.org>
-
Arjan van de Ven authored
Fix the menu idle governor which balances power savings, energy efficiency and performance impact. The reason for a reworked governor is that there have been serious performance issues reported with the existing code on Nehalem server systems. To show this I'm sure Andrew wants to see benchmark results: (benchmark is "fio", "no cstates" is using "idle=poll") no cstates current linux new algorithm 1 disk 107 Mb/s 85 Mb/s 105 Mb/s 2 disks 215 Mb/s 123 Mb/s 209 Mb/s 12 disks 590 Mb/s 320 Mb/s 585 Mb/s In various power benchmark measurements, no degredation was found by our measurement&diagnostics team. Obviously a small percentage more power was used in the "fio" benchmark, due to the much higher performance. While it would be a novel idea to describe the new algorithm in this commit message, I cheaped out and described it in comments in the code instead. [changes since first post: spelling fixes from akpm, review feedback, folded menu-tng into menu.c] Signed-off-by:
Arjan van de Ven <arjan@linux.intel.com> Cc: Venkatesh Pallipadi <venkatesh.pallipadi@intel.com> Cc: Len Brown <lenb@kernel.org> Cc: Ingo Molnar <mingo@elte.hu> Cc: Peter Zijlstra <a.p.zijlstra@chello.nl> Cc: Yanmin Zhang <yanmin_zhang@linux.intel.com> Acked-by:
Ingo Molnar <mingo@elte.hu> Signed-off-by:
Andrew Morton <akpm@linux-foundation.org> Signed-off-by:
Andrew Morton <akpm@linux-foundation.org> Signed-off-by:
Linus Torvalds <torvalds@linux-foundation.org>
-
- Sep 19, 2009
-
-
Arjan van de Ven authored
The "end of a C state" trace point currently happens before the code runs that corrects the TSC for having stopped during idle. The result of this is that the timestamp of the end-of-C-state event is garbage on cpus where the TSC stops during idle. This patch moves the end point of the C state to after the timekeeping engine of the kernel has been corrected. Signed-off-by:
Arjan van de Ven <arjan@linux.intel.com> Cc: Len Brown <len.brown@intel.com> Cc: fweisbec@gmail.com Cc: peterz@infradead.org Cc: Paul Mackerras <paulus@samba.org> LKML-Reference: <20090919133533.139c2a46@infradead.org> Signed-off-by:
Ingo Molnar <mingo@elte.hu>
-
- Dec 30, 2008
-
-
Pallipadi, Venkatesh authored
Add decaying history of predicted idle time, instead of using the last early wakeup. This logic helps menu governor do better job of predicting idle time. With this change, we also measured noticable (~8%) power savings on a DP server system with CPUs supporting deep C states, when system was lightly loaded. There was no change to power or perf on other load conditions. Signed-off-by:
Venkatesh Pallipadi <venkatesh.pallipadi@intel.com> Signed-off-by:
Len Brown <len.brown@intel.com>
-
- Nov 09, 2008
-
-
Arjan van de Ven authored
It's showing up as regressions; disabling it very likely just papers over an underlying issue, but time is running out for 2.6.28, lets get back to this for 2.6.29 Fixes: #11826 and #11893 Signed-off-by:
Arjan van de Ven <arjan@linux.intel.com> Signed-off-by:
Linus Torvalds <torvalds@linux-foundation.org>
-
- Oct 16, 2008
-
-
Venkatesh Pallipadi authored
http://bugzilla.kernel.org/show_bug.cgi?id=11345 Signed-off-by:
Venkatesh Pallipadi <venkatesh.pallipadi@intel.com> Signed-off-by:
Len Brown <len.brown@intel.com>
-
Venkatesh Pallipadi authored
cpuidle accounts the idle time for the C-state it was trying to enter and not to the actual state that the driver eventually entered. The driver may select a different state than the one chosen by cpuidle due to constraints like bus-mastering, etc. Change the time acounting code to look at the dev->last_state after returning from target_state->enter(). Driver can modify dev->last_state internally, inside the enter routine to reflect the actual C-state entered. Signed-off-by:
Venkatesh Pallipadi <venkatesh.pallipadi@intel.com> Tested-by:
Kevin Hilman <khilman@deeprootsystems.com> Signed-off-by:
Len Brown <len.brown@intel.com>
-
- Sep 11, 2008
-
-
Arjan van de Ven authored
As part of going idle, we already look at the time of the next timer event to determine which C-state to select etc. This patch adds functionality that causes the timers that are past their soft expire time, to fire at this time, before we calculate the next wakeup time. This functionality will thus avoid wakeups by running timers before going idle rather than specially waking up for it. Signed-off-by:
Arjan van de Ven <arjan@linux.intel.com>
-
- Aug 15, 2008
-
-
venkatesh.pallipadi@intel.com authored
ladder governor only honored latency requirement when promoting C-states. Instead. it should check for latency requirement on each idle call, and demote to appropriate C-state when there is a latency requirement change. Signed-off-by:
Venkatesh Pallipadi <venkatesh.pallipadi@intel.com> Signed-off-by:
Andi Kleen <ak@linux.intel.com>
-
venkatesh.pallipadi@intel.com authored
There is a bug in menu governor where we have if (data->elapsed_us < data->elapsed_us + measured_us) with measured_us already having elapsed_us added in tickless case here unsigned int measured_us = cpuidle_get_last_residency(dev) + data->elapsed_us; Also, it should be last_residency, not measured_us, that need to be used to do comparing and distinguish between expected & non-expected events. Refactor menu_reflect() to fix these two problems. Signed-off-by:
Venkatesh Pallipadi <venkatesh.pallipadi@intel.com> Signed-off-by:
Wei Gang <gang.wei@intel.com> Signed-off-by:
Andi Kleen <ak@linux.intel.com>
-
venkatesh.pallipadi@intel.com authored
poll_idle was added to CPUIDLE, just as a low latency idle handler, to be used in cases when user desires CPUs not to enter any idle state at all. It was supposed to be a run time idle=poll option to the user. But, it was indeed getting used during normal menu and ladder governor default case, with no special user setting (Reported by Linus Torvalds). Change below ensures that poll_idle will not be used unless user explicitly asks pm_qos infrastructure for zero latency requirement. Signed-off-by:
Venkatesh Pallipadi <venkatesh.pallipadi@intel.com> Signed-off-by:
Andi Kleen <ak@linux.intel.com>
-
- Aug 12, 2008
-
-
Rabin Vincent authored
These attributes are really sysdev class attributes. The incorrect definition leads to an oops because of recent changes which make sysdev attributes use a different prototype. Based on Andi's f718cd4a ("sched: make scheduler sysfs attributes sysdev class devices") Reported-by:
Eric Sesterhenn <snakebyte@gmx.de> Signed-off-by:
Rabin Vincent <rabin@rab.in> Acked-by:
Andi Kleen <ak@linux.intel.com> Cc: "Li, Shaohua" <shaohua.li@intel.com> Signed-off-by:
Andrew Morton <akpm@linux-foundation.org> Signed-off-by:
Linus Torvalds <torvalds@linux-foundation.org>
-
- Jul 28, 2008
-
-
Thomas Gleixner authored
pm_idle_save resp. pm_idle_old can be NULL when the restore code in acpi_processor_cst_has_changed() resp. cpuidle_uninstall_idle_handler() is called. This can set pm_idle unconditinally to NULL, which causes the kernel to panic when calling pm_idle in the x86 idle code. This was covered by an extra check for !pm_idle in the x86 idle code, which was removed during the x86 idle code refactoring. Instead of restoring the pm_idle check in the x86 code prevent the acpi/cpuidle code to set pm_idle to NULL. Reported by: Dhaval Giani http://lkml.org/lkml/2008/7/2/309 Based on a debug patch from Ingo Molnar Signed-off-by:
Thomas Gleixner <tglx@linutronix.de> Signed-off-by:
Linus Torvalds <torvalds@linux-foundation.org>
-
- Jul 21, 2008
-
-
Andi Kleen authored
This allow to dynamically generate attributes and share show/store functions between attributes. Right now most attributes are generated by special macros and lots of duplicated code. With the attribute passed it's instead possible to attach some data to the attribute and then use that in shared low level functions to do different things. I need this for the dynamically generated bank attributes in the x86 machine check code, but it'll allow some further cleanups. I converted all users in tree to the new show/store prototype. It's a single huge patch to avoid unbisectable sections. Runtime tested: x86-32, x86-64 Compiled only: ia64, powerpc Not compile tested/only grep converted: sh, arm, avr32 Signed-off-by:
Andi Kleen <ak@linux.intel.com> Signed-off-by:
Greg Kroah-Hartman <gregkh@suse.de>
-
- Jun 26, 2008
-
-
Jens Axboe authored
It's never used and the comments refer to nonatomic and retry interchangably. So get rid of it. Acked-by:
Jeremy Fitzhardinge <jeremy.fitzhardinge@citrix.com> Signed-off-by:
Jens Axboe <jens.axboe@oracle.com>
-
- Jun 11, 2008
-
-
Venkatesh Pallipadi authored
cpuidle and acpi driver interaction bug with the way cpuidle_register_driver() is called. Due to this bug, there will be oops on AC<->DC on some systems, where they support C-states in one DC and not in AC. The current code does ON BOOT: Look at CST and other C-state info to see whether more than C1 is supported. If it is, then acpi processor_idle does a cpuidle_register_driver() call, which internally enables the device. ON CST change notification (AC<->DC) and on suspend-resume: acpi driver temporarily disables device, updates the device with any new C-states, and reenables the device. The problem is is on boot, there are no C2, C3 states supported and we skip the register. Later on AC<->DC, we may get a CST notification and we try to reevaluate CST and enabled the device, without actually registering it. This causes breakage as we try to create /sys fs sub directory, without the parent directory which is created at register time. Thanks to Sanjeev for reporting the problem here. http://bugzilla.kernel.org/show_bug.cgi?id=10394 Signed-off-by:
Venkatesh Pallipadi <venkatesh.pallipadi@intel.com> Signed-off-by:
Len Brown <len.brown@intel.com>
-
- Mar 25, 2008
-
-
Venki Pallipadi authored
commit 9b12e18c 'ACPI: cpuidle: Support C1 idle time accounting' was implicated in a 100% C0 idle regression. http://bugzilla.kernel.org/show_bug.cgi?id=10076 It pointed out a potential problem where the menu governor may get confused by the C-state residency time from poll idle or C1 idle, where this timing info is not accurate. This inaccuracy is due to interrupts being handled before we account for C-state exit. Do not mark TIME_VALID for CO poll state. Mark C1 time as valid only with the MWAIT (CSTATE_FFH) entry method. This makes governors use the timing information only when it is correct and eliminates any wrong policy decisions that may result from invalid timing information. Signed-off-by:
Venkatesh Pallipadi <venkatesh.pallipadi@intel.com> Signed-off-by:
Len Brown <len.brown@intel.com>
-
Yi Yang authored
cpuidle C-state sysfs node time and usage are very easy to overflow because they are all of unsigned int type, time will overflow within about two hours, usage will take longer time to overflow, but they are increasing for ever. This patch will convert them to unsigned long long. Signed-off-by:
Yi Yang <yi.y.yang@intel.com> Acked-by:
Venkatesh Pallipadi <venkatesh.pallipadi@intel.com> Signed-off-by:
Len Brown <len.brown@intel.com>
-
- Feb 13, 2008
-
-
Venkatesh Pallipadi authored
Add a new sysfs entry under cpuidle states. desc - can be used by driver to communicate to userspace any specific information about the state. This helps in identifying the exact hardware C-states behind the ACPI C-state definition. Idea is to export this through powertop, which will help to map the C-state reported by powertop to actual hardware C-state. Signed-off-by:
Venkatesh Pallipadi <venkatesh.pallipadi@intel.com> Signed-off-by:
Len Brown <len.brown@intel.com>
-
- Feb 09, 2008
-
-
Venki Pallipadi authored
The last posted version of this patch gave compile error on IA64. So, here goes yet another rewrite of the patch. Convert cpu_idle_wait() to cpuidle_kick_cpus() which is SMP-only, and gives error on non supported CPU. Changes from last patch sent by Kevin: Moved the definition of kick_cpus back to cpuidle.c from cpuidle.h: * Having it in .h gives #error on archs which includes the header file without actually having CPU_IDLE configured. To make it work in .h, we need one more #ifdef around that code which makes it messy. * Also, the function is only called from one file. So, it can be in declared statically in .c rather than making it available to everyone who includes the .h file. Signed-off-by:
Venkatesh Pallipadi <venkatesh.pallipadi@intel.com> Signed-off-by:
Kevin Hilman <khilman@mvista.com> Signed-off-by:
Len Brown <len.brown@intel.com>
-
- Feb 07, 2008
-
-
venkatesh.pallipadi@intel.com authored
Add a default poll idle state with 0 latency. Provides an option to users to use poll_idle by using 0 as the latency requirement. Signed-off-by:
Venkatesh Pallipadi <venkatesh.pallipadi@intel.com> Signed-off-by:
Len Brown <len.brown@intel.com>