The Linux kernel feature of cgroups (Control Groups) is being increasingly adopted for running applications in multi-tenant environments. Many projects (e.g., Docker and CoreOS) rely on cgroups to limit resources such as CPU and memory. Ensuring high performance for the applications running in cgroups is very important for business-critical computing environments.
At LinkedIn, we have been using cgroups to build our own containerization product, LPS (LinkedIn Platform as a Service), and to investigate the impact of resource-limiting policies on application performance. This post presents our findings on how memory pressure affects the performance of applications in cgroups. We have found that cgroups do not fully isolate resources, but rather limit resource usage so that applications running in memory-limited cgroups do not starve other cgroups.
When there is memory pressure in the system, various issues can significantly affect the performance of the applications running in cgroups. Specifically: (1) memory is not reserved for cgroups (as it is with virtual machines); (2) page cache used by applications is counted towards a cgroup’s memory limit (so anonymous memory usage can steal page cache from the same cgroup); and (3) the OS can steal memory (both anonymous memory and page cache) from cgroups if necessary (because the root cgroup is unbounded). In this post, we’ll also provide a set of recommendations for addressing these issues.
Introduction
Cgroups (Control Groups) provide kernel mechanisms to limit the resource usage of different applications. These resources include memory, CPU, and disk I/O. Among these, memory is one of the resources with the greatest impact on application performance.
On Linux, a root cgroup serves as the base of the cgroup hierarchy. Multiple non-root cgroups (i.e., regular cgroups) can be deployed, each bounded by a fixed memory limit. A process can be explicitly assigned to a regular cgroup; any process that is not (e.g., sshd) is managed by the root cgroup.
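As a concrete illustration, a regular cgroup can be created and populated from the shell under the cgroup v1 memory controller. This is a minimal sketch: the cgroup name app1, the 10GB limit, and the mount point are assumptions for illustration, not values from this post.

```shell
# Create a regular cgroup under the cgroup v1 memory controller
# (the mount point /sys/fs/cgroup/memory may differ by distribution).
mkdir /sys/fs/cgroup/memory/app1

# Cap the cgroup's total memory usage (anonymous memory + page cache).
echo 10G > /sys/fs/cgroup/memory/app1/memory.limit_in_bytes

# Move a running process into the cgroup; processes never assigned to a
# regular cgroup remain in the unbounded root cgroup.
echo "$PID" > /sys/fs/cgroup/memory/app1/cgroup.procs
```

These writes require root privileges; on systemd-based distributions, a cgroup manager may own this hierarchy instead.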
Though cgroups do a decent job of limiting the memory usage of each regular cgroup, our experience with cgroups v1 (introduced in Linux kernel 2.6.24; the newer v2 appeared in kernel 4.5) shows that applications running in cgroups can perform poorly in certain memory-pressured scenarios.
We studied cgroups’ performance under various types of memory pressure and found several potential performance pitfalls. If not controlled carefully, these performance problems can significantly affect applications running in cgroups. We also propose recommendations to address these pitfalls.
Background
Before moving on to the performance issues, we’ll use the following diagram to present some background information. A regular cgroup’s memory usage includes anonymous (i.e., user space) memory (such as memory requested via malloc()) and page cache. A cgroup’s total memory usage is capped by the memory limit configured for it. The root cgroup’s memory, however, is unbounded.
Each cgroup can have its own swappiness setting (a value of 0 disables swapping for that cgroup, while a nonzero value enables it), but all cgroups share the same swap space configured by the OS. Similarly, though each cgroup can use page cache, all page cache belongs to a single kernel space and is maintained by the OS.
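For example, the per-cgroup swappiness setting and the shared OS-level swap configuration can be inspected as follows (paths assume the cgroup v1 memory controller; app1 is a hypothetical cgroup name):

```shell
# Disable swapping for this cgroup only; the OS-wide vm.swappiness
# setting is unaffected.
echo 0 > /sys/fs/cgroup/memory/app1/memory.swappiness

# All cgroups draw from the same swap devices configured by the OS.
swapon --show
cat /proc/sys/vm/swappiness
```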

Performance pitfalls
Memory pressure in either the root cgroup or a regular cgroup may degrade the performance of applications in other cgroups. For instance, application startup can be much slower if the OS first has to free up memory in order to satisfy the application’s memory request.
Experiment setup
For each of the pitfalls listed below, we conducted experiments to quantify the performance impact. The setup is as follows: the machine runs RHEL 7 with Linux kernel 3.10.0-327.10.1.el7.x86_64 and 64GB of physical RAM. The hardware is dual-socket with a total of 24 virtual cores (hyper-threading enabled). OS-level swapping is enabled (swappiness=1) with 16GB of total swap space, while swapping in all cgroups is disabled by setting swappiness=0.
The workload used to request anonymous memory is a JVM application that keeps allocating and deallocating objects. The performance metrics we consider include the cgroup’s statistics (such as page cache, swap, and RSS size) and the statistics reported by the OS free utility (such as swap and page cache size).
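These metrics can be collected with standard tools; a sketch, assuming a v1 cgroup named app1:

```shell
# Per-cgroup accounting: page cache, RSS, and swap usage in bytes.
grep -E '^(cache|rss|swap) ' /sys/fs/cgroup/memory/app1/memory.stat

# System-wide view: free memory, swap, and page cache ("buff/cache").
free -m
```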
1. Memory is not reserved for cgroups (as with virtual machines)
A cgroup only imposes an upper limit on the memory usage of the applications inside it; it does not reserve memory for them. Memory is allocated on demand, and applications deployed in cgroups still compete for free memory from the OS.
One implication of this is that when a cgroup later requests more memory (still within its memory limit), the OS must allocate that memory at that time. If the OS does not have enough free memory, it has to reclaim memory from the page cache or from anonymous memory, depending on the swapping setup of the OS (i.e., the swappiness value and swap space).
Because of this, memory reclamation by the OS could be a performance killer, affecting the performance of other cgroups.
Experiments
We started by ensuring that a regular cgroup’s memory usage had not reached its limit, and that the process running in the cgroup was requesting more memory. If the OS does not have enough free memory, it must reclaim page cache to satisfy the cgroup’s request. If the reclaimed page cache is dirty, the OS needs to write the dirty pages back to disk before providing the memory to the cgroup, which is slow when the backing storage is an HDD. The process requesting memory therefore has to wait, and so experiences degraded performance.
Under these conditions, the application requesting 16GB of anonymous memory takes about 20 seconds to obtain it. During this startup period, the application’s performance is close to zero. During normal runtime, the degree of degradation varies with the amount of dirty page cache written back and the amount of memory requested.
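One way to observe this reclamation while such an experiment runs is to watch the system-wide dirty and writeback counters; this is an illustrative sketch, not the exact procedure used in our tests:

```shell
# Dirty pages still to be written back, and pages currently under
# writeback, in kB; both spike while the OS reclaims dirty page cache.
grep -E '^(Dirty|Writeback):' /proc/meminfo

# Sample the counters every second during the allocation phase.
watch -n 1 "grep -E '^(Dirty|Writeback):' /proc/meminfo"
```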
2. Page cache usage by apps is counted towards a cgroup’s memory limit, and anonymous memory usage can steal page cache for the same cgroup
A cgroup’s memory limit (e.g., 10GB) covers all memory used by the processes running in it: both the anonymous memory and the page cache of the cgroup are counted towards the limit. In particular, when an application running in a cgroup reads or writes files, the corresponding page cache allocated by the OS counts against the cgroup’s memory limit.
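This accounting can be seen directly: reading a large file from inside a cgroup increases that cgroup’s cache counter. A sketch assuming cgroup v1, the libcgroup cgexec tool, and a hypothetical cgroup app1 and file path:

```shell
# Page cache charged to the cgroup before the read, in bytes.
grep '^cache ' /sys/fs/cgroup/memory/app1/memory.stat

# Read a large file from within the cgroup; the resulting page cache is
# counted against app1's memory limit, not the root cgroup's.
cgexec -g memory:app1 cat /path/to/large/file > /dev/null

# The cache counter grows by roughly the file size (up to the limit).
grep '^cache ' /sys/fs/cgroup/memory/app1/memory.stat
```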
For some applications, this starvation of page cache (and the correspondingly low page cache hit rate) has two effects: it degrades the application’s performance and increases the workload on the root disk drive, and it could also severely