The Emacs dumper dispute

Welcome to LWN.net

The following subscription-only content has been made available to you by an LWN subscriber. Thousands of subscribers depend on LWN for the best news from the linux and free software communities. If you enjoy this article, please consider accepting the trial offer on the right. Thank you for visiting LWN.net!

Free trial subscription

Try LWN for free for 1 month: no payment or credit card required. Activate your trial subscription now and see why thousands of readers subscribe to LWN.net.

By Jonathan Corbet

November 30, 2016

The Emacs editor is, at its core, a C program, but much of the editor's functionality is actually implemented in its special "Elisp" dialect of Lisp. Starting the editor requires loading a great deal of Elisp code and initializing its state, a process that can take a long time. To avoid making users wait for this process, Emacs has long used a scheme whereby the Elisp code is loaded once and a memory image is written to disk; starting Emacs becomes a matter of reading the memory image back in, which is a much faster process. Supporting this "dumping" functionality (also known as "unexec") has never been easy; beyond the technical challenges, it now appears that it may lead to a significant split within the Emacs community.

Ascovered here in January, the Emacs dumping (and "undumping") mechanism has long depended on some low-level hooks in the GNU C Library's memory allocation subsystem. The Glibc developers would like to modernize and improve this code, improving the library overall but removing the hooks that Emacs depends upon. At the end of the January discussions, the Emacs developers had decided to move to a workaround implementation that allowed the dumper to continue to work in the absence of Glibc support.

What nobody realized at the time is that the loss of the Glibc hooks, which happened in October for the 2.25 release (expected in February 2017), would affect existing Emacs releases in a surprising way. In particular, they fall back to an older interface called "ralloc", which does not perform well at all. The result iswell summarized by Emacs co-maintainer Eli Zaretskii in October:

Based on what we've learned the hard way during the last couple of weeks, I'd say that all the Emacs versions before 25.2 (including 25.1) will be unstable on such GNU systems to the degree of making them almost unusable.

"Unstable" is the sort of behavior that users of text editors normally go well out of their way to avoid; it's also the sort of thing that could give vi a definitive advantage in the interminable editor wars. So something clearly needs to be done to make the Emacs dumping facility more stable and, preferably, more maintainable going forward. What that "something" would be is unclear, and the posting of a possible solution appears to have simply muddied the waters further.

That solution comes in the form of the "portable dumper" patch from Daniel Colascione. This patch is not small; it adds over 4,500 lines of code to Emacs and it is not yet complete. Rather than try to capture the state of the C library's memory-allocation subsystem, it simply marshals and saves the set of Elisp objects known to the editor. The file format is designed for performance and, in some settings at least, Emacs can start by simply mapping the file into memory and initializing a set of pointers.

Colascione describes the result this way:

The point of this gargantuan patch is that we can rip out our unexec implementations and replace them with loading a data file that contains an Emacs heap image. There are no dependencies on executable rewriting, disabling ASLR, or saving and restoring internal malloc state. This system works with fully position-independent executables and with any malloc implementation.

It also, he says, matches the startup performance of the current "unexec" system to within 100ms, and he has not yet had the time to collect a bunch of low-hanging optimization fruit. In other words, it seems like an interesting solution to the problem, but a patch of this size is always going to generate some discussion.

Some of that discussion focused on how this dumper works when address-space layout randomization (ASLR) is in use. Current Emacs binaries must disable ASLR entirely, thus losing the security benefits that ASLR is meant to provide. The new dumper does not require disabling ASLR, but it does contain an optimization that can be applied if the dump file can be successfully mapped at a specific address: most of the data therein can be used directly from the mapped image, without the need to allocate storage for and copy it. That should speed the startup process considerably, at the cost of always mapping the dump image at the same location.

Paul Eggertworried about the potential security implications of losing ASLR protection for the bulk of the editor's data. Colascioneresponded that, since no part of the data image is marked executable, there is little risk of attackers running code from there. But, as Eggertpointed out, that view overlooks an important detail: that memory is full of Elisp bytecode that is executed in the editor itself, and which can do just about anything an attacker might want. So, if this approach is adopted, the fixed-location mapping might have to be turned off, at least by default.

There is, however, a bigger disagreement involving Zaretskii, whodescribed this work as " a wrong direction. " His objection, in short, is that this patch adds a lot of low-level complexity, implemented inC, that will be a maintenance burden going forward. That is,he said, a threat to the future of the project:

The number of people aboard who can matter-of-factly hack the Emacs internals on the C level is consistently going down, and is already so small they can be counted on one hand. We must make Emacs depend less on people from this small and diminishing group, if we want the development pace increased or even kept at its current level. To me, that means keep as many features out of low-level C, and limit futzing with C-level internals of Lisp objects and the Lisp interpreter to the absolute minimum.

It makes sense to put thought into the maintainability of the code base and how it can be evolved to attract more developers. It is not entirely clear, though, that C programmers are actually a dying breed ― or that the long-term supply of Elisp developers is more certain. In any case, the Emacs community needs to fix the startup problem; those who oppose the portable dumper solution presumably have something else in mind.

Zaretskii's preferred solution would be to make the Elisp loader faster, to the point that it can be used to read Elisp code directly at startup time. That is a solution that others might like to see as well, but it has one significant shortcoming: no code toward that goal exists, and there are no signs that anybody is working in that area. Colascione's solution, instead, does exist and has an interested developer behind it. In almost any development project, working code and ongoing maintenance carries a lot of weight.

Zaretskii feels strongly enough about this issue that he has thr

The Emacs dumper dispute

Trending Articles

《沈冰自述——我和周永康的故事》全本

Moog - Subsequent 25

出售: 林憶蓮•回來愛的身邊 (東芝1A1頭版)

筆記 - 使用 PowerShell 清除停用 AD 帳號與 OU

df-dferh-01 中国区 Android 安装 Google Play Store 后报错的解决办法

「一棒接一棒、棒棒強棒」108學年度家長會長交接典禮

吸烟与MBTI类型判断捷径 (豆瓣 INFJ的奇幻之旅小组)

acermark龍璿國際展出多款包裝設備

枋寮北勢寮隆山宮睽違12年再辦迎王祭典

日本女优有村千佳COS集锦：狂三&黑白岩&亚丝娜&绫波丽

有遇到过这个问题么。/jsb-videoplayer.js not found, possible missing file.

MAS v2.8 magicgenius 汉化版 - 11.11更新

出售: Monster Cable Interlink Reference 2

福建佛教人士望云和尚(林斌)的九仙禅寺被强行收走，望云妈妈被赶出寺庙

R 语言中的OpenBLAS*和英特尔® 数学核心函数库的性能比较

[转载]煞貢、直星、人專吉日\金神七煞歌

HAKERS哈克士戶外 12月8~14日廠拍

OBS Studio 23.2.1 免安裝中文版 - 免費網路實況廣播軟體實況主必備軟體取代Fraps

<請教>行駛中安卓機會重新開機

Udp2raw-tunnel 及其一键安装脚本