Copying someone else's performance tweak

This all started as an issue in the Rust bug tracker. Actually, as far as it concerns me, it all started with a thread about CI being too slow, but I digress.

The story goes like this: some LLVM people learned that their test suite was bottlenecking on a big global lock, because their CLI processes were loading a Windows DLL that implicitly registered them as GUI apps (even though they never created a window). Fixing it should only require reimplementing a parser, because the official Microsoft implementation of the parser in question is inseparably bundled with the GUI library that imposes so much overhead. All it needed was someone who had a convenient Windows box to test on. Like, me.


Reimplementing the Windows C runtime command line argument parser

First thing I learned: don't just read the documentation. If you want to rewrite a part of Windows, you should mine the actual implementation.

You've probably heard that you can write three quotes in a row to type a single quote in Windows. For example, "this """is""" the same thing" will give the same result as "this \"is\" the same thing". What you may not have heard is that "this"" stuff produces the same result as "this\"" stuff, while I would've expected it to produce "this stuff". My implementation, and probably the Microsoft one too, implements the three-quotes feature by doing this:

    start quote run
    |
    |     end quote run
    |     |
    |     |literal quote
    |     ||
    |     ||start quote run
    |     |||
    "this """is""" the same thing"
               |||               |
               ||start quote run end quote run
               ||
               |literal quote
               |
               end quote run

In other words, if a quote run is immediately followed by another quote mark, emit a literal quote. It makes sense in this case, but the result is that "this""thing" produces ["thisthing"] on Linux and ["this\"thing"] on Windows.
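The quote-run rule above can be sketched in a few lines of Rust. This is a minimal illustration of the rule as described here, not the code that landed in the PR and not Microsoft's implementation: the function name `split_quotes_only` and its flags are made up for this example, it ignores backslashes entirely, and it treats only spaces and tabs as argument separators.

```rust
/// Hypothetical helper for illustration: splits a command line into arguments,
/// handling only double quotes (no backslash handling at all).
fn split_quotes_only(line: &str) -> Vec<String> {
    let mut args = Vec::new();
    let mut cur = String::new();
    let mut in_quotes = false;
    // True only when the previous character was a quote that closed a quote run.
    let mut just_closed = false;
    // True once the current argument has seen any character, so that "" still
    // yields an (empty) argument.
    let mut started = false;

    for c in line.chars() {
        match c {
            // A quote right after a closing quote is emitted literally.
            '"' if just_closed => {
                cur.push('"');
                just_closed = false;
            }
            // Any other quote opens or closes a quote run.
            '"' => {
                just_closed = in_quotes;
                in_quotes = !in_quotes;
                started = true;
            }
            // Whitespace outside a quote run ends the current argument.
            ' ' | '\t' if !in_quotes => {
                if started {
                    args.push(std::mem::take(&mut cur));
                    started = false;
                }
                just_closed = false;
            }
            // Everything else (including whitespace inside quotes) is literal.
            _ => {
                cur.push(c);
                started = true;
                just_closed = false;
            }
        }
    }
    if started {
        args.push(cur);
    }
    args
}
```

With this sketch, the input "this""thing" comes out as the single argument this"thing, and "this"" stuff comes out as the two arguments this" and stuff, matching the behavior described above.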

Unlike the behavior with quotes, Microsoft's documentation about backslashes is actually pretty honest, and it doesn't leave out any details like that:

https://docs.microsoft.com/en-us/previous-versions//17w5ykft(v=vs.85)

Backslashes are interpreted literally, unless they immediately precede a double quotation mark.

If an even number of backslashes is followed by a double quotation mark, one backslash is placed in the argv array for every pair of backslashes, and the double quotation mark is interpreted as a string delimiter.

If an odd number of backslashes is followed by a double quotation mark, one backslash is placed in the argv array for every pair of backslashes, and the double quotation mark is "escaped" by the remaining backslash, causing a literal double quotation mark (") to be placed in argv.
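The three backslash rules quoted above boil down to counting backslashes and deciding what to do only once the next character is known. Here is a rough Rust sketch of just that part. The name `apply_backslash_rules` is invented for this example, and the sketch simply drops delimiter quotes instead of tracking quote runs or splitting arguments, so it is not a full parser.

```rust
/// Hypothetical helper for illustration: applies only the backslash rules to
/// `input`. Delimiter quotes are dropped; escaped quotes are kept literally.
fn apply_backslash_rules(input: &str) -> String {
    let mut out = String::new();
    // Number of consecutive backslashes seen but not yet emitted.
    let mut backslashes = 0usize;

    for c in input.chars() {
        match c {
            '\\' => backslashes += 1,
            '"' => {
                // One backslash is emitted for every pair of backslashes.
                out.extend(std::iter::repeat('\\').take(backslashes / 2));
                if backslashes % 2 == 1 {
                    // Odd count: the leftover backslash escapes the quote.
                    out.push('"');
                }
                // Even count: the quote is a string delimiter, dropped here.
                backslashes = 0;
            }
            _ => {
                // Not followed by a quote: backslashes are literal.
                out.extend(std::iter::repeat('\\').take(backslashes));
                backslashes = 0;
                out.push(c);
            }
        }
    }
    // Trailing backslashes are not followed by a quote either.
    out.extend(std::iter::repeat('\\').take(backslashes));
    out
}
```

For example, two backslashes followed by a letter stay as two backslashes, two backslashes followed by a quote become one backslash plus a delimiter, and three backslashes followed by a quote become one backslash plus a literal quote.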

You can't put quote marks in Windows filenames with Explorer, so it's fine.

As for whitespace, it's a bit more special than they claim. A command line starts with "the command name", which is supposed to be a valid Windows executable path (just like it would be on UNIX, really), so it is parsed completely differently from the rest of the arguments. You can't put quotes in command names at all; if your command name starts with a quote, it will end at the next quote, no matter what else you do. And if you don't quote the command name, then it will end at any ASCII control character. Newline, vertical tab, bell, doesn't matter; even though the rest of the arguments treat them literally, they delimit the command name from the rest of the arguments. It took me far too long to figure that out; thanks to @pitdicker, who linked me to where the ReactOS people documented that fact.
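As a rough illustration of how different the command-name rule is, here is a small Rust sketch. The function name `split_command_name` is made up for this example, and the unquoted case assumes that "ASCII control character" means every character up to and including the space character; that reading is an assumption of the sketch, not something verified against Windows here.

```rust
/// Hypothetical helper for illustration: splits a command line into the
/// command name and whatever follows it.
fn split_command_name(line: &str) -> (&str, &str) {
    if let Some(after_quote) = line.strip_prefix('"') {
        // Quoted name: it ends at the very next quote. Backslashes and
        // doubled quotes get no special treatment here.
        match after_quote.find('"') {
            Some(end) => (&after_quote[..end], &after_quote[end + 1..]),
            None => (after_quote, ""),
        }
    } else {
        // Unquoted name: it ends at the first space or control character
        // (assumption: anything at or below ' ' counts).
        match line.find(|c: char| c <= ' ') {
            Some(end) => (&line[..end], &line[end..]),
            None => (line, ""),
        }
    }
}
```

Under these assumptions, a command line that starts with "a\"b" c gives the command name a\ and leaves b" c for the regular argument parser, because the backslash does not escape the quote in this position. A newline directly after an unquoted name ends the name the same way a space would.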

Testing all possible four-ASCII-character-long command lines, while slow as heck, is a good way to find bugs in that parser. Exhaustively testing a meaningful chunk of your inputs is a good idea.
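The enumeration side of such a test is simple enough to sketch in Rust. This is only an assumption about how one might structure it, not the harness actually used for the PR; `for_each_line` is a hypothetical helper that walks every string of a given length over a given alphabet, odometer-style, and hands each one to a callback.

```rust
/// Hypothetical helper for illustration: calls `visit` once for every string
/// of exactly `len` characters drawn from `alphabet`, in odometer order.
fn for_each_line<F: FnMut(&str)>(alphabet: &[char], len: usize, mut visit: F) {
    if alphabet.is_empty() {
        return; // nothing to enumerate
    }
    // digits[i] is the index into `alphabet` of the character at position i.
    let mut digits = vec![0usize; len];
    let mut line = String::with_capacity(len);
    loop {
        line.clear();
        line.extend(digits.iter().map(|&d| alphabet[d]));
        visit(&line);

        // Advance the odometer, carrying from the rightmost position.
        let mut pos = len;
        loop {
            if pos == 0 {
                return; // every position wrapped around: all strings visited
            }
            pos -= 1;
            digits[pos] += 1;
            if digits[pos] < alphabet.len() {
                break;
            }
            digits[pos] = 0;
        }
    }
}
```

Inside the callback you would run both your own parser and a reference parser on the same string and compare the results. With all 128 ASCII characters and a length of four, that is 128^4 (about 268 million) command lines, which is why a callback is used here instead of collecting everything into a vector first.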

The results

After fixing all that and landing the PR, it didn't measurably help at all. Next step: test some hypotheses to prove that it wasn't all a waste of time.

Is rustc still pulling in GDI? No.

    micha@DESKTOP-IIQA1VP MINGW64 ~/IdeaProjects/rust (master)
    $ ldd build/x86_64-pc-windows-msvc/stage1/bin/rustc.exe | grep -i ntdll
    ntdll.dll => /c/WINDOWS/SYSTEM32/ntdll.dll (0x7ffbede90000)

    micha@DESKTOP-IIQA1VP MINGW64 ~/IdeaProjects/rust (master)
    $ ldd build/x86_64-pc-windows-msvc/stage1/bin/rustc.exe | grep -i gdi

    micha@DESKTOP-IIQA1VP MINGW64 ~/IdeaProjects/rust (master)
    $

As a last-ditch check, I made sure that compiletest, the part of Rust's test suite that spawns lots of processes and that I expected to improve, actually supported parallelism. Parallel runs get turned off in limited cases, but concurrent run-pass tests are supported on Windows.

So what's the moral of the story?

A concurrency bottleneck that locks up someone's 24-core Windows workstation might not have a notable effect on a test suite running on a 2-core virtual machine. The fact that my parser is built for correctness and safety more than speed probably doesn't help either.

However, one of the other commenters in that issue does mention that only 24% of their CPU is being used by the Rust test suite, so there is a bottleneck somewhere in here, just not GDI object registration.

