The Linux kernel feature of cgroups (Control Groups) is being increasingly adopted for running applications in multi-tenant environments. Many projects (e.g., Docker and CoreOS) rely on cgroups to limit resources such as CPU and memory. Ensuring high performance for the applications running in cgroups is very important for business-critical computing environments.
At LinkedIn, we have been using cgroups to build our own containerization product, LPS (LinkedIn Platform as a Service), and to investigate the impact of resource-limiting policies on application performance. This post presents our findings on how memory pressure affects the performance of applications in cgroups. We have found that cgroups do not fully isolate resources, but rather limit resource usage so that applications running in memory-limited cgroups do not starve other cgroups.
When there is memory pressure in the system, various issues can significantly affect the performance of the applications running in cgroups. Specifically: (1) memory is not reserved for cgroups (as it is with virtual machines); (2) page cache used by applications is counted towards a cgroup’s memory limit (so anonymous memory usage can steal page cache from the same cgroup); and (3) the OS can steal memory (both anonymous memory and page cache) from cgroups if necessary (because the root cgroup is unbounded). In this post, we’ll also provide a set of recommendations for addressing these issues.
Introduction
Cgroups (Control Groups) provide kernel mechanisms to limit the resource usage of different applications. These resources include memory, CPU, and disk I/O. Among these, memory is one of the resources with the greatest impact on application performance.
On Linux, a root cgroup serves as the base of the cgroup hierarchy. Multiple non-root cgroups (i.e., regular cgroups) can be deployed, each bounded by a fixed memory limit. A process can be explicitly assigned to a regular cgroup; any process that is not (e.g., sshd) is managed by the root cgroup.
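As a concrete illustration, a regular cgroup can be created and populated from the shell under the cgroup v1 memory controller. This is a minimal sketch: the cgroup name app1, the 10GB limit, and the mount point are assumptions for illustration, not values from this post.

```shell
# Create a regular cgroup under the cgroup v1 memory controller
# (the mount point /sys/fs/cgroup/memory may differ by distribution).
mkdir /sys/fs/cgroup/memory/app1

# Cap the cgroup's total memory usage (anonymous memory + page cache).
echo 10G > /sys/fs/cgroup/memory/app1/memory.limit_in_bytes

# Move a running process into the cgroup; processes never assigned to a
# regular cgroup remain in the unbounded root cgroup.
echo "$PID" > /sys/fs/cgroup/memory/app1/cgroup.procs
```

These writes require root privileges; on systemd-based distributions, a cgroup manager may own this hierarchy instead.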
Though cgroups do a decent job of limiting the memory usage of each regular cgroup, our experience with cgroups v1 (introduced in Linux kernel 2.6.24; the newer v2 appeared in kernel 4.5) shows that applications running in cgroups can perform poorly in certain memory-pressured scenarios.
We studied cgroups’ performance under various types of memory pressure and found several potential performance pitfalls. If not controlled carefully, these performance problems can significantly affect applications running in cgroups. We also propose recommendations to address these pitfalls.
Background
Before moving on to the performance issues, we’ll use the following diagram to present some background information. A regular cgroup’s memory usage includes anonymous (i.e., user space) memory (such as memory requested via malloc()) and page cache. A cgroup’s total memory usage is capped by the memory limit configured for it. The root cgroup’s memory, however, is unbounded.
Each cgroup can have its own swappiness setting (a value of 0 disables swapping for that cgroup, while a nonzero value enables it), but all cgroups share the same swap space configured by the OS. Similarly, though each cgroup can use page cache, all page cache belongs to a single kernel space and is maintained by the OS.
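For example, the per-cgroup swappiness setting and the shared OS-level swap configuration can be inspected as follows (paths assume the cgroup v1 memory controller; app1 is a hypothetical cgroup name):

```shell
# Disable swapping for this cgroup only; the OS-wide vm.swappiness
# setting is unaffected.
echo 0 > /sys/fs/cgroup/memory/app1/memory.swappiness

# All cgroups draw from the same swap devices configured by the OS.
swapon --show
cat /proc/sys/vm/swappiness
```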

Performance pitfalls
Memory pressure in either the root cgroup or a regular cgroup may degrade the performance of applications in other cgroups. For instance, application startup can be much slower if the OS first has to free up memory in order to satisfy the application’s memory request.
Experiment setup
For each of the pitfalls listed below, we conducted experiments to quantify the performance impact. The setup is as follows: the machine runs RHEL 7 with Linux kernel 3.10.0-327.10.1.el7.x86_64 and 64GB of physical RAM. The hardware is dual-socket with a total of 24 virtual cores (hyper-threading enabled). OS-level swapping is enabled (swappiness=1) with 16GB of total swap space, while swapping in all cgroups is disabled by setting swappiness=0.
The workload used to request anonymous memory is a JVM application that keeps allocating and deallocating objects. The performance metrics we consider include the cgroup’s statistics (such as page cache, swap, and RSS size) and the statistics reported by the OS free utility (such as swap and page cache size).
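These metrics can be collected with standard tools; a sketch, assuming a v1 cgroup named app1:

```shell
# Per-cgroup accounting: page cache, RSS, and swap usage in bytes.
grep -E '^(cache|rss|swap) ' /sys/fs/cgroup/memory/app1/memory.stat

# System-wide view: free memory, swap, and page cache ("buff/cache").
free -m
```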
1. Memory is not reserved for cgroups (as with virtual machines)
A cgroup only imposes an upper limit on the memory usage of the applications inside it; it does not reserve memory for them. Memory is allocated on demand, and applications deployed in cgroups still compete for free memory from the OS.
One implication of this is that when a cgroup later requests more memory (still within its memory limit), the OS must allocate that memory at that time. If the OS does not have enough free memory, it has to reclaim memory from the page cache or from anonymous memory, depending on the swapping setup of the OS (i.e., the swappiness value and swap space).
Because of this, memory reclamation by the OS could be a performance killer, affecting the performance of other cgroups.
Experiments
We started by ensuring that a regular cgroup’s memory usage had not reached its limit, and that the process running in the cgroup was requesting more memory. If the OS does not have enough free memory, it must reclaim page cache to satisfy the cgroup’s request. If the reclaimed page cache is dirty, the OS needs to write the dirty pages back to disk before providing the memory to the cgroup, which is slow when the backing storage is an HDD. The process requesting memory therefore has to wait, and so experiences degraded performance.
Under these conditions, the application requesting 16GB of anonymous memory takes about 20 seconds to obtain it. During this startup period, the application’s performance is close to zero. During normal runtime, the degree of degradation varies with the amount of dirty page cache written back and the amount of memory requested.
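One way to observe this reclamation while such an experiment runs is to watch the system-wide dirty and writeback counters; this is an illustrative sketch, not the exact procedure used in our tests:

```shell
# Dirty pages still to be written back, and pages currently under
# writeback, in kB; both spike while the OS reclaims dirty page cache.
grep -E '^(Dirty|Writeback):' /proc/meminfo

# Sample the counters every second during the allocation phase.
watch -n 1 "grep -E '^(Dirty|Writeback):' /proc/meminfo"
```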
2. Page cache usage by apps is counted towards a cgroup’s memory limit, and anonymous memory usage can steal page cache for the same cgroup
A cgroup’s memory limit (e.g., 10GB) covers all memory used by the processes running in it: both the anonymous memory and the page cache of the cgroup are counted towards the limit. In particular, when an application running in a cgroup reads or writes files, the corresponding page cache allocated by the OS counts against the cgroup’s memory limit.
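This accounting can be seen directly: reading a large file from inside a cgroup increases that cgroup’s cache counter. A sketch assuming cgroup v1, the libcgroup cgexec tool, and a hypothetical cgroup app1 and file path:

```shell
# Page cache charged to the cgroup before the read, in bytes.
grep '^cache ' /sys/fs/cgroup/memory/app1/memory.stat

# Read a large file from within the cgroup; the resulting page cache is
# counted against app1's memory limit, not the root cgroup's.
cgexec -g memory:app1 cat /path/to/large/file > /dev/null

# The cache counter grows by roughly the file size (up to the limit).
grep '^cache ' /sys/fs/cgroup/memory/app1/memory.stat
```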
For some applications, this starvation of page cache (and the correspondingly low page cache hit rate) has two effects: it degrades the application’s performance and increases the workload on the root disk drive, and it could also severely