Fixing control groups

By Jonathan Corbet
February 28, 2012
Control groups are one of those features that kernel developers love to hate. It is not that hard to find developers complaining about control groups and even threatening to rip them out of the kernel some dark night when nobody is paying attention. But it is much less common to see explicit discussion around the aspects of control groups that these developers find offensive or what might be done to improve the situation. A recent linux-kernel discussion may have made some progress in that direction, though.

The control group mechanism is really just a way for a system administrator to partition processes into one or more hierarchies of groups. There can be multiple hierarchies, and a given process can be placed into more than one of them at any given time. Associated with control groups is the concept of "controllers," which apply some sort of policy to a specific control group hierarchy. The group scheduling controller allows control groups to contend against each other for CPU time, limiting the extent to which one group can deprive another of time in the processor. The memory controller places limits on how much memory and swap can be consumed by any given group. The network priority controller, merged for 3.3, allows an administrator to give different groups better or worse access to network interfaces. And so on.
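On a running system, a process's place in each hierarchy is listed in /proc/<pid>/cgroup, one line per hierarchy, in the form hierarchy-ID:controller-list:path. What follows is a minimal Python sketch of a parser for that format. The function name, the Membership type, and the sample text are illustrative assumptions, not output captured from a real machine.

```python
from typing import List, NamedTuple


class Membership(NamedTuple):
    """One line of /proc/<pid>/cgroup: a single hierarchy and the group in it."""
    hierarchy_id: int
    controllers: List[str]
    path: str


def parse_proc_cgroup(text: str) -> List[Membership]:
    """Parse the contents of /proc/<pid>/cgroup into one entry per hierarchy."""
    entries = []
    for line in text.splitlines():
        line = line.strip()
        if not line:
            continue
        # Split at most twice so that colons inside the path are preserved.
        hid, controllers, path = line.split(":", 2)
        names = controllers.split(",") if controllers else []
        entries.append(Membership(int(hid), names, path))
    return entries


# Hypothetical sample: one process that belongs to two separate hierarchies.
SAMPLE = "2:cpu:/group_a\n1:net_prio:/group_b\n"

for entry in parse_proc_cgroup(SAMPLE):
    print(entry.hierarchy_id, entry.controllers, entry.path)
```

Each entry pairs one hierarchy with the controllers attached to it and the group the process occupies there, which is the "more than one hierarchy at a time" arrangement described above.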

Tejun Heo started the discussion with a lengthy message describing his complaints with the control group mechanism and some thoughts on how things could be fixed. According to Tejun, allowing the existence of multiple process hierarchies was a mistake. The idea behind multiple hierarchies is that they allow different policies to be applied using different criteria. The documentation that was merged along with the original control group code gives an example with two distinct hierarchies that could be implemented on a university system:

  • One hierarchy would categorize processes based on the role of their owner; there could be a group for students, one for faculty, and one for system staff. Available CPU time could then be divided between the different types of users depending on the system policy; professors would be isolated somewhat from student activity, but, naturally, the system staff would get the lion's share.

  • A second hierarchy would be based on the program being executed by any given process. Web browsers would go into one group while, say, bittorrent clients could be put into another. The available network bandwidth could then be split according to the administrator's view of each class of application.
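The university example can be modeled in a few lines of Python. This is a toy sketch, not the kernel's data structures: the Hierarchy class, the hierarchy names by_role and by_app, the controller labels, and pid 4242 are all hypothetical.

```python
from dataclasses import dataclass, field
from typing import Dict


@dataclass
class Hierarchy:
    """One independent tree of groups; a process sits in exactly one group."""
    name: str
    controller: str
    membership: Dict[int, str] = field(default_factory=dict)  # pid -> group

    def place(self, pid: int, group: str) -> None:
        # Placing a process replaces its previous group in this hierarchy only.
        self.membership[pid] = group


by_role = Hierarchy("by_role", controller="cpu")   # students / faculty / staff
by_app = Hierarchy("by_app", controller="net")     # browsers / bittorrent


def groups_of(pid: int) -> Dict[str, str]:
    """Return the group the process occupies in each hierarchy."""
    return {h.name: h.membership[pid]
            for h in (by_role, by_app) if pid in h.membership}


# Hypothetical pid 4242: a student running a web browser.
by_role.place(4242, "students")
by_app.place(4242, "browsers")
print(groups_of(4242))
```

The point of the sketch is that the two classifications are independent: moving the process between groups in by_role leaves its place in by_app untouched.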

On their face, multiple hierarchies provide a useful level of flexibility for administrators to define all kinds of policies. In practice, though, they complicate the code and create some interesting issues. As Tejun points out, controllers can only be assigned to one hierarchy. For controllers implementing resource allocation policies, this restriction makes some sense; otherwise, processes would likely be subjected to conflicting policies when placed in multiple hierarchies. But there are controllers that exist for other purposes; the "freezer" controller simply freezes the processes found in a control group, allowing them to be checkpointed or otherwise operated upon. There is no reason why this kind of feature could not be available in any hierarchy, but making that possible would complicate the control group implementation significantly.
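The one-hierarchy-per-controller restriction can be illustrated with a small Python sketch. The Registry class, the ControllerBusy exception, and the hierarchy names are hypothetical; the sketch only models the rule and does not reflect how the kernel implements it.

```python
from typing import Dict


class ControllerBusy(Exception):
    """Raised when a controller is already attached to another hierarchy."""


class Registry:
    """Tracks which hierarchy each controller is attached to."""

    def __init__(self) -> None:
        self.owner: Dict[str, str] = {}  # controller name -> hierarchy name

    def attach(self, controller: str, hierarchy: str) -> None:
        current = self.owner.get(controller)
        if current is not None and current != hierarchy:
            raise ControllerBusy(
                f"{controller} is already attached to {current}")
        self.owner[controller] = hierarchy


registry = Registry()
registry.attach("memory", "by_role")
registry.attach("freezer", "by_role")

# The freezer applies no resource policy, yet the same rule blocks it here.
try:
    registry.attach("freezer", "by_app")
    freezer_in_both = True
except ControllerBusy:
    freezer_in_both = False

print(freezer_in_both)
```

In this model, a process that should be freezable has to be grouped for that purpose in whichever hierarchy already holds the freezer, whatever the other hierarchies look like.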

The real complaint with multiple hierarchies, though, is that few developers seem to see the feature as being useful in actual, deployed systems. It is not clear that it is being used. Tejun suggests that this feature could be phased out, perhaps with a painfully long deprecation period. In the end, Tejun said, the control group hierarchy could disappear as an independent structure, and control groups could just be overlaid onto the existing process hierarchy. Some others disagree, though; Peter Zijlstra said "I rather like being able to assign tasks to cgroups of my making without having to mirror that in the process hierarchy." So the ability to have a control group hierarchy that differs from the process hierarchy may not go away, even if the multiple-hierarchy feature does eventually become deprecated.

A related problem that Tejun raised is that different controllers treat the control group hierarchy differently. In particular, a number of controllers seem to have gone through an evolutionary path where the initial implementation does not recognize nested control groups but, instead, simply flattens the hierarchy. Later updates may add full hierarchical support. The block I/O controller, for example, only finished the job with hierarchical support last year; others still have not done it. Making the system work properly, Tejun said, requires getting all of the controllers to treat the hierarchy in the same way.
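The difference between a flattened and a fully hierarchical treatment can be sketched in Python. The group tree, the limit values, and both function names below are hypothetical, and real controllers differ in their details; the sketch only contrasts the two interpretations of nesting.

```python
from typing import Dict, Optional

# Hypothetical tree: group -> parent group (None marks the root).
PARENT: Dict[str, Optional[str]] = {
    "root": None,
    "students": "root",
    "students/lab": "students",
}

# Hypothetical per-group limits in arbitrary units; absent means unlimited.
LIMIT: Dict[str, int] = {"students": 100, "students/lab": 500}

UNLIMITED = float("inf")


def flat_limit(group: str) -> float:
    """Flattened view: only the group's own setting counts."""
    return LIMIT.get(group, UNLIMITED)


def hierarchical_limit(group: str) -> float:
    """Hierarchical view: the tightest limit on the path to the root wins."""
    limit = UNLIMITED
    node: Optional[str] = group
    while node is not None:
        limit = min(limit, LIMIT.get(node, UNLIMITED))
        node = PARENT[node]
    return limit


print(flat_limit("students/lab"), hierarchical_limit("students/lab"))
```

Under the flattened view, the nested group escapes its parent's tighter limit; under the hierarchical view it cannot. An administrator who mixes controllers of both kinds in one tree gets different semantics from the same group layout, which is the inconsistency Tejun describes.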

In general, the controllers have been the source of a lot of grumbling over the years. They tend to be implemented in a way that minimizes their intrusiveness on systems where they are not used - for good reason - but that leads to poor integration overall. The memory controller, for example, created its own shadow page tracking system, leading to a resource-intensive mess that was only cleaned up for the 3.3 release. The hugetlb controller is not integrated with the memory controller, and, as of 3.3, we have two independent network controllers. As the number of small controllers continues to grow (there is, for example, a proposed timer slack controller out there), things can only get more chaotic.

Fixing the controllers requires, probably more than anything else, a person to take on the role of overall control group maintainer. Tejun and Li Zefan are credited with that role in the MAINTAINERS file, but it is still true that, for the most part, control groups have nobody watching over the whole system, so changes tend to be made by way of a number of different subsystems. It is an administrative problem in the end; it should be amenable to solution.

Fixing control groups overall could be harder, especially if the elimination of the multiple-hierarchy feature is to be contemplated. That, of course, is a user-space ABI change; making it happen would take years, if it is possible at all. Tejun suggests "herding people to use a unified hierarchy", along with a determined effort to make all of the controllers handle nesting properly. At some point, the kernel could start to emit a warning when multiple hierarchies are used. Eventually, if nobody complains, the feature could go away.

Of course, if nobody is actually using multiple hierarchies, things could happen a lot more quickly. Nobody entered the discussion to say that they needed multiple hierarchies, but, then, it was a discussion among kernel developers and not end users. If there are people out there using the multiple hierarchy feature, it might not be a bad idea to make their use case known. Any such use cases could shape the future evolution of the control group mechanism; a perceived lack of such use cases could have a very different effect.


Multiple hierarchies?

Posted Mar 1, 2012 12:21 UTC (Thu) by smurf (subscriber, #17840)

Just look at systemd. It uses control groups to organize its daemons.
(Which IMHO makes a lot of sense, practically speaking.)

While I haven't looked at it in any detail, I do NOT think that every existing controller's semantics can be made to conform to systemd's needs.

Fixing control groups

Posted Mar 1, 2012 16:56 UTC (Thu) by Cyberax (✭ supporter ✭, #52523)

I'm using multiple control groups and I like it. For example, it makes perfect sense to categorize processes based on network policy (i.e. processes that can create connections, that can create listening sockets, etc.) with a completely parallel tree maintained by systemd.

That's a great feature and I'd be disappointed if it went away.

Fixing control groups

Posted Mar 2, 2012 22:00 UTC (Fri) by cmccabe (guest, #60281)

Removing multiple-hierarchy support would be an amazingly bad idea. It would force system administrators to put all their tasks into one giant rigid hierarchy, removing all flexibility. It would also make a lot of configurations impossible.

For example, let's say I have a simple system with 3 processes: httpd, sshd, and xeyes.

I want the following constraints:
httpd: limit network, limit memory, don't freeze
sshd: do freeze
xeyes: limit memory, do freeze

Assuming there are three cgroups-- network, memory, and freezer-- how can I create a single hierarchy that will do what I want? Unless I am somehow misinterpreting what is being proposed here, this would be impossible if multiple-hierarchy support was removed.

(And yes, I know I could use the old rlimit stuff. But there are presumably other things that cgroups can do that rlimit can't.)

Fixing control groups

Posted Mar 6, 2012 14:18 UTC (Tue) by jflasch (guest, #5699)

LXC containers use control groups as the fence that keeps a container's processes inside the container; this gives the best performance of any kind of virtual machine within a machine.

Containers are still new and not many people are using them right now, but of the virtual systems (KVM, VMware, VirtualBox), this one is the most promising, so it should be mentioned as a heavy user of control groups.


Copyright © 2012, Eklektix, Inc.
This article may be redistributed under the terms of the Creative Commons CC BY-SA 4.0 license
Comments and public postings are copyrighted by their creators.
Linux is a registered trademark of Linus Torvalds