
The PowerClamp driver

By Jonathan Corbet
December 5, 2012
The kernel's power management subsystem has become increasingly effective over recent years, to the point that our CPU power management is said to be second to none. But, while the kernel endeavors to minimize the power consumed by a given workload, it lacks mechanisms to put an overall limit on the amount of power consumed. The recently-announced PowerClamp driver by Jacob Pan and Arjan van de Ven is intended to change that situation on Intel processors.

Most users will never want to use PowerClamp. As a general rule, when one has purchased hardware with a given computational capability, one wants that full capability to be available when needed. But there are situations where it makes sense to run a system below its full speed. Data centers have power-consumption and cooling constraints that can argue against running all systems flat-out all the time. Even the owner of an individual laptop or handheld system may wish to ensure that its operating temperature does not exceed a given value; an overly hot laptop can be uncomfortable to work with, even if it is still working within its specified temperature range. So there can be value in telling the system to run slower at times.

The PowerClamp driver allows the system administrator to set a desired idle percentage by way of a sysfs attribute. That percentage is capped at 50% in the current implementation. Once a percentage has been set, the kernel monitors the actual idle time for each processor in the system. Should a processor's idle time fall below the desired idle percentage, a special kernel thread (called kidle_inject/N, where N is the number of the CPU to which the thread is assigned) is created to take corrective action.
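As a rough illustration of what setting that attribute could look like from user space, the sketch below assumes the driver appears as a thermal cooling device under /sys/class/thermal with type, cur_state, and max_state files (the layout shown in the cooling_device13 listing quoted in the comments). The function names are invented for this example, and writing cur_state on a real system requires root.

```python
from pathlib import Path

# Assumed location of the thermal cooling devices.
THERMAL_ROOT = Path("/sys/class/thermal")


def find_powerclamp(root=THERMAL_ROOT):
    """Return the cooling-device directory whose type is intel_powerclamp, or None."""
    for dev in sorted(Path(root).glob("cooling_device*")):
        try:
            if (dev / "type").read_text().strip() == "intel_powerclamp":
                return dev
        except OSError:
            # Device without a readable type file; skip it.
            continue
    return None


def set_idle_percent(dev, percent):
    """Write the desired idle percentage, clamped to [0, max_state]; return the value written."""
    max_state = int((dev / "max_state").read_text())
    percent = max(0, min(int(percent), max_state))
    (dev / "cur_state").write_text(str(percent))
    return percent


if __name__ == "__main__":
    device = find_powerclamp()
    if device is None:
        print("no intel_powerclamp cooling device found")
    else:
        print("requested idle percentage:", set_idle_percent(device, 25))
```

Clamping against max_state rather than a hard-coded 50 keeps the sketch in step with whatever limit the running driver reports.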

That thread operates as a high-priority realtime process, so it is able to respond quickly when needed. Its job is relatively simple: look at the amount of idle time on its assigned CPU and calculate the difference from the desired idle time. Then, periodically, the thread will run, disable the clock tick, and force the CPU into a sleep state for the required amount of time. The sleeping is done for a given number of jiffies, so the sleep states tend to be relatively long — a necessary condition for an effective reduction in power usage.
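The arithmetic behind that step can be sketched as follows. This only illustrates the description above and is not the driver's actual code; the function name, the percentage-based interface, and the choice to round up are all assumptions.

```python
def forced_idle_jiffies(target_pct, measured_pct, window_jiffies):
    """Jiffies of forced sleep needed in one window to close the idle shortfall.

    target_pct and measured_pct are idle percentages (0-100); window_jiffies is
    the length of the control window. All three are hypothetical parameters.
    """
    shortfall = max(0, target_pct - measured_pct)
    # Round up so that a small shortfall still yields at least one jiffy of sleep.
    return (shortfall * window_jiffies + 99) // 100


if __name__ == "__main__":
    # Target 25% idle, CPU is naturally idle 10% of the time, 100-jiffy window.
    print(forced_idle_jiffies(25, 10, 100))
```

A CPU that is already idle more than the target needs no injected sleep at all, which is why the shortfall is floored at zero.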

Naturally, the PowerClamp thread will continue to monitor actual idle time as it operates, adjusting the amount of forced sleep time as needed. It also monitors the amount of desired sleep time that is lost to interrupts. Interrupts remain enabled during the forced sleep, so they can bring the processor back to an operational state before the PowerClamp driver would have otherwise done so. Over time, the amount of sleep time lost in this manner is tracked; the driver will then attempt to compensate by increasing the amount of forced sleep time to try to pull the CPU back to the original idle time target.
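One way to picture that compensation is as a small feedback term added to the target. The sketch below is a simplified model under stated assumptions: the class name is hypothetical, idle time is expressed in whole percentage points, and capping the compensated request at the same 50% limit is a guess rather than something the patch is known to do.

```python
class IdleCompensator:
    """Track sleep time lost to interrupts and pad the next request accordingly."""

    def __init__(self, target_pct, max_pct=50):
        self.target = target_pct
        self.max_pct = max_pct  # assumed ceiling on the compensated request
        self.extra = 0          # accumulated compensation, in percentage points

    def update(self, achieved_pct):
        """Feed in the idle percentage actually achieved; return the next request."""
        error = self.target - achieved_pct
        # Grow the padding when sleep was lost, shrink it when we overshoot,
        # and keep the total request within [target, max_pct].
        self.extra = max(0, min(self.extra + error, self.max_pct - self.target))
        return self.target + self.extra


if __name__ == "__main__":
    comp = IdleCompensator(20)
    for achieved in (15, 20, 25):
        print(achieved, "->", comp.update(achieved))
```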

By itself, PowerClamp can come close to achieving the desired level of idle time on a system with a changing workload. Often, though, the real goal is not idle time as such; instead, the purpose is to keep the system within a given level of power consumption or a set of thermal limits. Doing that will require the implementation of additional logic in user space. By monitoring the parameter of interest, a user-space process can implement a control loop that adjusts the desired level of idle time as needed. The PowerClamp driver can respond relatively quickly to those changes, giving the control process an effective tool for the management of the amount of power used by the system.
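A user-space control loop of the kind described here might look like the sketch below. It is a deliberately simple integrating controller, not the authors' tool: the function names, the gain value, and the sensor and writer callbacks are all assumptions made for illustration. On a real system read_sensor would read a temperature or power figure, and write_idle would write the percentage to the driver's sysfs attribute.

```python
import time


def next_idle_percent(current_pct, measured, setpoint, gain=2.0, max_pct=50):
    """Raise the idle percentage when the measurement is above the setpoint, lower it when below."""
    proposed = current_pct + gain * (measured - setpoint)
    return int(max(0, min(max_pct, round(proposed))))


def control_loop(read_sensor, write_idle, setpoint, steps, interval_s=1.0, sleep=time.sleep):
    """Run the controller for a fixed number of steps; return the last idle percentage."""
    idle = 0
    for _ in range(steps):
        idle = next_idle_percent(idle, read_sensor(), setpoint)
        write_idle(idle)
        sleep(interval_s)
    return idle


if __name__ == "__main__":
    # Dry run with canned temperature readings (degrees C) instead of real hardware.
    readings = iter([75, 75, 65])
    control_loop(lambda: next(readings), print, setpoint=70, steps=3, interval_s=0)
```

Passing the sensor and writer in as callables keeps the control logic testable without touching hardware; the 50% ceiling mirrors the cap mentioned in the article.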

The driver has been through a couple of revisions with little in the way of substantive comments. This patch poses a relatively small risk to the system, since it does not do anything if the feature is not in use. It could thus conceivably be ready for merging as soon as the 3.8 development cycle. Some more information can be found in the documentation file included with the patch.



The PowerClamp driver

Posted Dec 6, 2012 11:00 UTC (Thu) by ras (subscriber, #33059) [Link]

> The kernel's power management subsystem has become increasingly effective over recent years, to the point that our CPU power management is said to be second to none.

OK, now I am confused.

I always put the MacBook's phenomenal battery life down to excellent software power management, and by implication the Linux ecosystem had a bit of catching up to do. Admittedly this is based purely on the observation that the hardware is nearly identical. For example, the MacBook Pro's battery capacity is approx 210kJ, a Dell's is 180kJ. The other components (CPU, graphics card, memory, disk drive) are all the same. Yet the Mac will last 10 hours, and the Dell 4.

Can someone with a clue tell me what is really going on?

The PowerClamp driver

Posted Dec 6, 2012 11:49 UTC (Thu) by hummassa (guest, #307) [Link]

My MacBook also has a considerably long battery life *as long as I don't do anything funny with it* (like transcoding movies or compiling a large piece of software like gcc...)

The PowerClamp driver

Posted Dec 6, 2012 13:56 UTC (Thu) by Jonno (subscriber, #49613) [Link]

> Can someone with a clue tell me what is really going on?
Linux kernel CPU power management currently beats all others, but the CPU isn't the only thing drawing power. GPU, PCIe, and HDD power management are all areas where Linux could do significantly better. And Linux userspace is typically less power-management aware than that of Mac OS X, resulting in the Mac simply doing less work while on battery...

The PowerClamp driver

Posted Dec 7, 2012 1:59 UTC (Fri) by idupree (guest, #71169) [Link]

Yeah, alas. I'd hoped it was a "[mechanism] to put an overall limit on the amount of power consumed" but it isn't. I was looking at certain external batteries[*] for travel which have a maximum output rating of 4.2A. My laptop (at 19V DC input) is capable of using more than 4.2A (80 W). The power brick it came with has max 6.32A; from my measurements with some of CPU/GPU/screen at full force, this higher capacity is sometimes necessary. If my system tried to draw more power at any moment, bad things might happen (I'm not sure how bad. Battery damage? Laptop shutdown?). I wouldn't buy that battery without a way to ensure the laptop's total power usage was below the limit -- which is a hard problem which this article does not appear to go anywhere near solving.

[*] http://zikko-store.com/product_view.php?id=7

The PowerClamp driver

Posted Dec 12, 2012 16:40 UTC (Wed) by arjan (subscriber, #36785) [Link]

powerclamp is a building block towards a solution that you describe.
Note that if you need a "near instant" limit, a kernel level solution isn't going to work, you need something much much faster responding.
But if you can deal with "we need over <some hundreds of milliseconds> the average to be below X", and you can measure X.. this driver is what a small (userspace) control agent can use to actually impact the current consumption

The PowerClamp driver

Posted Dec 12, 2012 16:37 UTC (Wed) by arjan (subscriber, #36785) [Link]

Note that this is NOT about getting a longer battery life.
In fact, pretty much all tricks to put a limit on temperature/power consumption like this end up costing you battery life.

This is about limiting either temperature or current use (on laptops temperature matters, in data centers current matters, but also temperature) due to external constraints.

We have a simple userspace app for example that can control a laptop temperature to just below the point that the fan would come on.
(it's very much prototype code at this point.. we're working on getting it more usable than on the one machine we ran it on).

The PowerClamp driver

Posted Dec 12, 2012 16:44 UTC (Wed) by corbet (editor, #1) [Link]

I'm confused...the word "battery" does not appear in the article. Instead I talk about things like temperature regulation. Was something not clear?

The PowerClamp driver

Posted Dec 12, 2012 16:55 UTC (Wed) by arjan (subscriber, #36785) [Link]

the article is clear to me, but I can see people thinking this is for "saving power" (see the first comments)...

The PowerClamp driver

Posted Dec 13, 2012 15:59 UTC (Thu) by redden0t8 (guest, #72783) [Link]

I'm kind of confused... how does reduced power consumption cost you battery life?

Is it because a given workload will take longer to complete, and therefore take more total power by the time it's done?

If that's the case, then it's actually workload dependent. Specifically, I'm thinking of playing retro games that insist on taking 100% of the CPU. They needlessly waste power to achieve much greater than 60 fps, even with the CPU clocked to the lowest scaling frequency. Would PowerClamp not increase battery life in this scenario?

The PowerClamp driver

Posted Mar 7, 2013 21:57 UTC (Thu) by pjones (subscriber, #31722) [Link]

energy = power * time. If you deliberately hobble the workload to use less power, it takes more time, but also operates less efficiently. This means you'll use more energy.

The PowerClamp driver

Posted Mar 7, 2013 22:53 UTC (Thu) by dgm (subscriber, #49227) [Link]

> If you deliberately hobble the workload to use less power, it takes more time

Not if your work depends on external events, that is, IO. And it all depends on the characteristics of the system. If you can do it 10% slower at half the power consumption, it may very well be worth the wait.

The PowerClamp driver

Posted Mar 7, 2013 23:30 UTC (Thu) by dlang (guest, #313) [Link]

and remember that even memory access is an external event that is significantly slower than the CPU, so if what you are doing is a small amount of computation on a large amount of memory, you may be able to do it in the same amount of time with a much slower CPU.

The PowerClamp driver

Posted Mar 11, 2013 13:06 UTC (Mon) by ssam (guest, #46587) [Link]

If I am running a process that is limited by memory bandwidth, won't the CPU spend lots of time idle, and so be reducing the power consumption already? Can the CPU sleep while it is waiting for something to be fetched from RAM?

The PowerClamp driver

Posted Mar 11, 2013 20:28 UTC (Mon) by dlang (guest, #313) [Link]

not really, it takes a long time for the processor to go to sleep and wake up again, long enough that it's frequently better for the processor to idle at high speed rather than go into sleep mode.

The PowerClamp driver

Posted Apr 29, 2013 17:04 UTC (Mon) by aidenn0 (guest, #90668) [Link]

This sounds like something I could really use; I have no A/C where I live, and my server tends to overheat if I compile code on it in August; I could pretty easily hook into the thermal info from user-space to clamp down the CPU time when it gets too hot.

The PowerClamp driver

Posted Jun 25, 2013 21:22 UTC (Tue) by daniexia (guest, #91596) [Link]

Hi,

I have a problem with seeing the effect of powerclamp. First, I enabled the intel powerclamp driver in my menuconfig. Then, I built a new 3.9.7 kernel. After rebooting, I can see the module as /sys/class/thermal/cooling_device13. When I grep it, it shows:

cur_state:-1
max_state:50
type:intel_powerclamp

The cur_state =-1, which means invalid state. Why is this invalid state?

Thanks.

The PowerClamp driver

Posted Jun 26, 2013 16:57 UTC (Wed) by daniexia (guest, #91596) [Link]

solved, after enabling package C state in BIOS.


Copyright © 2012, Eklektix, Inc.
This article may be redistributed under the terms of the Creative Commons CC BY-SA 4.0 license
Comments and public postings are copyrighted by their creators.
Linux is a registered trademark of Linus Torvalds