Linux cgroups-based solutions (e.g., Docker, CoreOS) are increasingly being used to host multiple applications on the same host. We have been using cgroups at LinkedIn to build our own containerization product called LPS (LinkedIn Platform as a Service) and to investigate the impact of resource-limiting policies on application performance. This post presents our findings on how CPU scheduling affects the performance of Java applications in cgroups. We found that Java applications can have more and longer application pauses when using CFS (Completely Fair Scheduler) in conjunction with CFS Bandwidth Control quotas. During these pauses, the application is not responding to user requests, making this a severe performance problem that we need to understand and address.
These increased pauses are caused by the interaction between the JVM's GC (Garbage Collection) mechanisms and CFS scheduling. In CFS, a cgroup is assigned a certain CPU quota (i.e., cfs_quota), which can be quickly drained by the JVM GC's multi-threaded activity, causing the application to be throttled. For example, the following may occur:
If an application aggressively uses its CPU quota in a scheduling period, then the application is throttled (no more CPU is given) and stops responding for the remaining duration of the scheduling period.
The multi-threaded JVM GC makes the problem much worse, because the cfs_quota is counted across all threads of the application; as a result, the CPU quota may be used up even more quickly. JVM GC has many concurrent phases that are non-STW (stop the world), but running them also consumes the cfs_quota faster, which in practice can make the entire application STW. A rough back-of-the-envelope calculation follows this list.
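To make this concrete, here is a rough, illustrative calculation using the numbers from the setup described later in this post (a 100ms CFS period, a three-core quota, two application threads, and 18 parallel GC threads): the cgroup's budget is 300ms of CPU time per 100ms period, so if roughly 20 runnable threads all execute at once, that budget can be exhausted after only about 300/20 = 15ms of wall-clock time, leaving the application throttled for the remaining ~85ms of the period.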
In this post, we’ll share our findings after investigating this issue and our recommendations about CFS/JVM tuning to mitigate the negative impact. Specifically:
Sufficient CPU quota should be assigned to the cgroup that hosts the Java application; and,
JVM GC threads should be appropriately tuned down to mitigate the pauses.
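In practice, the second recommendation usually means capping the -XX:ParallelGCThreads and -XX:ConcGCThreads JVM flags. For reference, the sketch below reproduces the commonly documented HotSpot default heuristics; the exact formulas are our assumption, inferred to match the thread counts reported later in this post for a 24-core machine (18 parallel and 5 concurrent GC threads).

```java
public class GcThreadDefaults {
    /**
     * Approximation of HotSpot's default ParallelGCThreads heuristic:
     * all cores up to 8, then 5/8 of the cores beyond 8 (our assumption).
     */
    static int parallelGcThreads(int cpus) {
        return cpus <= 8 ? cpus : 8 + (cpus - 8) * 5 / 8;
    }

    /** Approximation of the default ConcGCThreads heuristic: (ParallelGCThreads + 3) / 4. */
    static int concGcThreads(int cpus) {
        return (parallelGcThreads(cpus) + 3) / 4;
    }

    public static void main(String[] args) {
        int cpus = 24; // the test machine's core count
        System.out.println("parallel GC threads: " + parallelGcThreads(cpus));   // prints 18
        System.out.println("concurrent GC threads: " + concGcThreads(cpus));     // prints 5
        // To tune these down explicitly, pass -XX:ParallelGCThreads=<n> and
        // -XX:ConcGCThreads=<n> on the java command line.
    }
}
```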
Linux cgroups background
Linux cgroups (Control Groups) are used to limit various types of resource usage by applications. For the CPU resource, the CPU subsystem schedules CPU access to cgroups, and CFS is one of the two supported schedulers. CFS is a proportional-share scheduler that allocates CPU access to cgroups based on the cgroups' weights.
For the RHEL7 (Red Hat Enterprise Linux 7) machines we use, there are multiple tunables. Two CFS tunables are used for ceiling enforcement, i.e., limiting the CPU resources used by cgroups: cpu.cfs_period_us and cpu.cfs_quota_us, both in microseconds. cpu.cfs_period_us specifies the CFS period, the enforcement interval for which the quota is applied, and cpu.cfs_quota_us specifies the quota the cgroup can use in each CFS period. cpu.cfs_quota_us essentially sets the hard limit (i.e., ceiling) on the CPU resource: a cgroup (along with its processes) is only allowed to occupy the CPU cores for the time duration specified in cpu.cfs_quota_us in each period. Hence, to give a cgroup N cores, cpu.cfs_quota_us is set to N times cpu.cfs_period_us.
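As an illustration, the following minimal sketch sets such a ceiling by writing the two tunables directly. It assumes a cgroup v1 hierarchy mounted at /sys/fs/cgroup/cpu and an existing cgroup named "app" (both are assumptions; adjust for your environment), and it must run with sufficient privileges.

```java
import java.io.IOException;
import java.nio.file.Files;
import java.nio.file.Path;

public class CfsQuotaSetter {
    // Assumed cgroup v1 mount point and an existing cgroup named "app".
    private static final Path CGROUP = Path.of("/sys/fs/cgroup/cpu/app");

    /** Gives the cgroup a ceiling of `cores` CPUs by setting quota = cores * period. */
    static void setCeiling(int cores, long periodUs) throws IOException {
        // cpu.cfs_period_us: length of each CFS enforcement interval (microseconds).
        Files.writeString(CGROUP.resolve("cpu.cfs_period_us"), Long.toString(periodUs));
        // cpu.cfs_quota_us: CPU time the cgroup may use per period (microseconds).
        Files.writeString(CGROUP.resolve("cpu.cfs_quota_us"), Long.toString(cores * periodUs));
    }

    public static void main(String[] args) throws IOException {
        // Example: 3 cores with the default 100ms period -> quota of 300,000us.
        setCeiling(3, 100_000);
    }
}
```

With the default 100ms period, a three-core ceiling therefore corresponds to a quota of 300,000 microseconds per period, which is the configuration used in our experiments below.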
Workload and setup
For our analysis, we created a synthetic Java application for testing CFS behavior. This Java application simply keeps allocating objects on the Java heap. After the number of allocated objects reaches a certain threshold, a portion of them are released. There are two application threads, each independently performing object allocation and object release. The time taken by each object allocation is recorded as the allocation latency. The source code for this synthetic Java application is on GitHub.
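The actual implementation is in the GitHub repository linked above; as a rough approximation only (the names and constants below are ours, not the published code), one allocation thread might look like the following sketch:

```java
import java.util.ArrayList;
import java.util.List;

/** Simplified sketch of one allocation thread in the synthetic workload. */
public class AllocationWorker implements Runnable {
    // Hypothetical tuning constants; the real values live in the GitHub project.
    private static final int RELEASE_THRESHOLD = 1_000_000; // objects held before releasing
    private static final double RELEASE_FRACTION = 0.5;     // portion released at the threshold

    private final List<byte[]> liveObjects = new ArrayList<>();

    @Override
    public void run() {
        while (!Thread.currentThread().isInterrupted()) {
            long start = System.nanoTime();
            liveObjects.add(new byte[128]);              // allocate one object on the heap
            long latencyNs = System.nanoTime() - start;  // allocation latency (logged in the real workload)

            if (liveObjects.size() >= RELEASE_THRESHOLD) {
                // Release a portion of the objects so they become garbage for the GC.
                liveObjects.subList(0, (int) (RELEASE_THRESHOLD * RELEASE_FRACTION)).clear();
            }
        }
    }

    public static void main(String[] args) {
        // Two independent application threads, as in the experiments.
        new Thread(new AllocationWorker(), "alloc-1").start();
        new Thread(new AllocationWorker(), "alloc-2").start();
    }
}
```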
The performance metrics we considered include: (1) application throughput (the object allocation rate); (2) object allocation latencies; (3) the cgroup's statistics, including CPU usage, nr_throttled (the number of CFS periods in which the cgroup was throttled), and throttled_time (the total throttled time).
The machine we used is RHEL7 with the 3.10.0-327 kernel, and it has 24 HT (hyper-threading) cores. The CPU resources were limited using CFS ceiling enforcement. By default, the cgroup hosting the Java application was assigned a quota of three cores, considering the fact that there were two application threads plus GC activity. In later tests we also varied the number of cores assigned in order to gain additional insights. The cfs_period by default was 100ms. Each run of the workload took 20 minutes (1,200 seconds), so with a cfs_period of 100ms, there were 12,000 CFS periods in each run.
Investigation of a large application pause
We'll start with a detailed analysis of a particular application pause in order to shed light on the reasons behind the pause.
Application stop
At the time of 22:57:34, both application threads stop for about three seconds (i.e., 2,917ms and 2,916ms).
JVM GC STW
To understand what caused the three-second application freeze, we first examined the JVM GC log. We found that at 22:57:37.771, a STW (stop-the-world) GC pause occurred, lasting about 0.12 seconds. Compared with the three-second application stop, this 0.12-second GC pause leaves about 2.88 seconds (i.e., 3 − 0.12) unexplained; therefore, there must be some other reason for the pause.
CFS throttling
We suspected that the extra application pause was caused by CFS throttling of the cgroup. We examined the cgroup's statistics by gathering the various reported cgroup statistics for every second that the application was running, and found the metric "throttled_time" to be of particular interest: it reports the accumulated total time (in nanoseconds) for which entities of the cgroup have been throttled. We noticed that, while the application was frozen, "throttled_time" began to accumulate at 22:57:33, and over the frozen period the increase (i.e., the difference) in "throttled_time" was about 5.28 seconds. We believe that this throttling contributed to the application freeze.
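These statistics come from the cgroup's cpu.stat file, which exposes nr_periods, nr_throttled, and throttled_time. As a minimal sketch of how such per-second samples can be gathered (again assuming a cgroup v1 hierarchy mounted at /sys/fs/cgroup/cpu and a cgroup named "app"), one could poll the file like this:

```java
import java.io.IOException;
import java.nio.file.Files;
import java.nio.file.Path;

public class ThrottleMonitor {
    // Assumed cgroup v1 path; cpu.stat contains nr_periods, nr_throttled, throttled_time.
    private static final Path CPU_STAT = Path.of("/sys/fs/cgroup/cpu/app/cpu.stat");

    public static void main(String[] args) throws IOException, InterruptedException {
        long lastThrottledNs = readField("throttled_time");
        while (true) {
            Thread.sleep(1000);
            long nowThrottledNs = readField("throttled_time");
            // Per-second increase in throttled time, converted to milliseconds.
            System.out.printf("throttled in last second: %d ms%n",
                    (nowThrottledNs - lastThrottledNs) / 1_000_000);
            lastThrottledNs = nowThrottledNs;
        }
    }

    /** Reads one numeric field (e.g., "throttled_time" or "nr_throttled") from cpu.stat. */
    private static long readField(String name) throws IOException {
        for (String line : Files.readAllLines(CPU_STAT)) {
            if (line.startsWith(name + " ")) {
                return Long.parseLong(line.substring(name.length() + 1).trim());
            }
        }
        throw new IllegalStateException(name + " not found in " + CPU_STAT);
    }
}
```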
JVM GC threads
We have found that some CFS scheduling periods show substantial "throttled_time," which we believe is caused by the (multiple) JVM GC threads. Briefly, when GC starts, the JVM invokes multiple GC threads to do the work, using internal formulas to decide on the number of GC threads. Specifically, for a machine with 24 cores, the number of parallel GC threads is 18, and the number of concurrent GC threads is 5. Because of these large numbers of threads, the cgroup's CPU quota is quickly used up, causing all application threads (including GC threads) to be paused.
How does the CFS scheduler cause the application pause?
The CFS scheduler can lead to long application pauses.