Hardware

While I do most of my work in an office, from time to time I work on code changes to Chromium at home. With the recent deprecation of Jumbo Builds, building the browser on my cheap 2016-era Dell XPS 8900 (i7-6700K) went from unpleasant to impractical. While I pondered buying a high-end Threadripper, I couldn't justify the cost, especially given those chips' relatively weak performance on lightly-threaded workloads (basically, everything other than compilation).

The introduction of the moderately-priced (nominally $750) 16-core Ryzen 3950X hit the sweet spot, so I plunked down my credit card and ordered a new machine from a system builder. Disappointingly, it took almost two months to arrive in a working state, but things seem to be good now.

The AMD Ryzen 3950X has 16 cores, each running two threads, and sustains around 3.95 GHz when all of them are fully loaded; it's cooled by a CyberPowerPC DeepCool Castle 360EX liquid cooler. An Intel Optane 905P 480GB system drive holds the OS, compilers, and Chromium code. The key advantage of the Optane over more affordable SSDs is its much higher random read rate (roughly 4x that of the Samsung 970 Pro I originally planned to use).

Following the Chromium build instructions, I configured my environment and set up a 32-bit component build with reduced symbols:

is_component_build = true
enable_nacl = false
target_cpu = "x86"
blink_symbol_level = 0
symbol_level = 1
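
With those arguments saved to the output directory's args.gn, kicking off the build looks something like this (the output directory name here is my own choice; autoninja picks a reasonable local parallelism automatically):

gn gen out\x86_component
autoninja -C out\x86_component chrome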

Atop Windows 10 1909, I disabled Windows Defender entirely, and didn’t do anything too taxing with the PC while the build was underway.

Ultimately, a clean build of the “chrome” target took just under 53 minutes, achieving 33.3x parallelism.

While this isn't a fast result by any stretch of the imagination, it's still faster than the non-jumbo build times I saw back when I worked at Google in 2016/2017 and built Chrome on a $6000, 48-thread Xeon workstation, and this machine cost around half as much.

Cloud Compilation

When I first joined Google, I learned about the seemingly magical engineering systems available to Googlers, quickly followed by the crushing revelation that most of those magic tools were not available to those of us working on the Chromium open-source project.

The one significant exception was that Google Chrome engineers had access to a distributed build system called "Goma" which allowed compiling Chrome using servers in the Google cloud. My queries around the team suggested that only a minority of engineers took advantage of it, partly because (at the time) it didn't generate very debuggable Windows builds. Nevertheless, I eventually gave it a shot and found that it cut perhaps five minutes off my forty-five minute jumbo build times on my Xeon workstation. I rationalized the modest improvement by concluding that the build must not be very parallelizable, and that because I worked remotely from Austin, build artifacts from the Goma cloud had much further to travel to reach me than to reach my colleagues in Mountain View.

Given the complexity of the configuration, I stopped using Goma, and spent perhaps half of my tenure on Chrome with forty-five minute build times[1]. Then, one day I needed to do some development on my MacBook, and I figured its puny specs would benefit from Goma in a way my Xeon workstation never would. So I went back to read the Goma documentation and found a different reference than the one I'd seen originally. This one mentioned a "-j" command-line argument, previously unknown to me, that tells the build system how many cloud cores to use.

This new, better documentation noted that by default the build system simply matches your local core count, but that when using Goma you should instead demand ~20x your local core count, so -j 960 for my 48-thread workstation. With one command-line argument, my typical compiles dropped from 45 minutes to around 6.
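
Concretely, the only change is the -j argument passed to ninja (the output directory name here is illustrative):

# Default: parallelism is derived from the local core count.
ninja -C out\Default chrome

# With Goma: demand ~20x the local core count.
ninja -j 960 -C out\Default chrome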

::suitable_meme_of_wonder_and_fury::

Returning to Edge

I returned to Microsoft as a Program Manager on the Edge team in mid-2018, unaware that replatforming atop Chromium was even a possibility until the day before I started. Just before I began, a lead sent me a 27-page PDF file containing the Edge-on-Chromium proposal. "What do you think?" he asked. I had a lot of thoughts (mostly of the form "OMG, yes!"), but one thing I told everyone who would listen was that we would never be able to keep up without a cloud-compilation system akin to Goma. The Google team had recently open-sourced the Goma client, but hadn't yet open-sourced the cloud server component. I figured the Edge team had years of engineering work ahead of us to replicate that piece.

When an engineer on the team announced two weeks later that he had “MSGoma” building Chromium using an Azure cloud backend, it was the first strong sign that this crazy bet could actually pay off.

And pay off it has. While I still build locally from time to time, I typically build Chromium using MSGoma from my late 2018 Lenovo X1 Extreme laptop, with build times hovering just over ten minutes. Cloud compilation is a game changer.

The Chrome team has since released a Goma Server implementation, and several other major Chromium contributors are using distributed build systems of their own design.

I haven’t yet tried using MSGoma from my new Ryzen workstation, but I’ve been told that the Optane drive is especially helpful when performing distributed builds, due to the high incidence of small random reads.

-Eric

[1] This experience recalled a much earlier one: my family moving to Michigan shortly after I turned 11. Our new house featured a huge yard. My dad bought a self-propelled lawn mower and my brother and I took turns mowing the yard weekly. The self-propelled mower was perhaps fifteen pounds heavier than our last mower, and the self-propelling system didn’t really seem to do much of anything.

After two years of weekly mows from my brother and me, my dad took a turn mowing. He pushed the lawn mower perhaps five feet before he said "That isn't right," reached under the control panel, and flipped a switch. My brother and I watched in amazement and dismay as the mower began pulling him across the yard.

Moral of the story: Knowledge is power.

Debugging a Bluescreen with WinDbg

I recently bought a Dell XPS 8900 desktop system with Windows 10. It ran okay for a while, but after enabling Hyper-V, every few minutes the system would freeze for a few seconds and then reboot with no explanation. Looking at the Event Viewer's Windows Logs > System revealed that the system had bugchecked (blue screened):

[Screenshot: Event Viewer showing the bugcheck event]

Bugcheck 0x1a indicates a problem with "Memory Management".
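
If you'd rather not click through Event Viewer, the same record can be pulled with PowerShell; on my machine, the bugcheck is logged to the System log with source "BugCheck" (event ID 1001):

# Show the most recent bugcheck record from the System event log.
Get-EventLog -LogName System -Source BugCheck -Newest 1 | Format-List TimeGenerated, Message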

Run WinDbg as Administrator. File > Open Crash Dump:

[Screenshot: WinDbg's File > Open Crash Dump menu]

Open C:\Windows\memory.dmp. Wait for symbols to download:

[Screenshot: WinDbg reports "Debuggee not connected" while symbols download]
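
Incidentally, the File > Open Crash Dump step can also be done from an (elevated) command line; WinDbg's -z switch specifies a dump file to open:

windbg -z C:\Windows\memory.dmp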

If symbols aren’t downloaded automatically, try typing .symfix and then .reload in the command prompt at the bottom.

[Screenshot: WinDbg suggests "Use !analyze -v"]

Then, follow the tool's advice and run !analyze -v to have the debugger analyze the crash. WinDbg presents a surprisingly readable explanation:

[Screenshot: !analyze -v output attributing the memory corruption to a driver]

So a driver’s at fault, but which one?

[Screenshot: stack trace implicating the WiFi driver]

It looks like bcmwl63a, for which symbols aren't loaded, one clue that this isn't Microsoft's code. Let's find out more about it using lm vm bcmwl63a:

[Screenshot: lm vm output listing the WiFi driver's image path]
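
Collected in one place, the whole investigation comes down to a handful of commands once the dump is open ($$ starts a comment in WinDbg; the module name is specific to this particular crash):

.symfix         $$ point the symbol path at the Microsoft public symbol server
.reload         $$ reload symbols using the new path
!analyze -v     $$ run the automated crash analysis
lm vm bcmwl63a  $$ verbose module info, including the driver's on-disk path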

Pop over to the listed path to examine the file’s properties, and see that it’s the WiFi driver:

[Screenshot: driver file properties identifying the Broadcom WiFi driver]

The Dell 1560 802.11ac card is the same type as found in my Dell XPS 13” notebook PC, where it was responsible for a flurry of bluescreens last year. The driver appears to have improved (the XPS 13 doesn’t crash anymore), but it looks like some corner cases got missed, likely related to the Hyper-V virtual networking code. Rather than waiting for an updated driver, the experts on Twitter suggested I simply upgrade to the Intel 7265 and install the latest Intel PROSet wireless driver. At $20 on Amazon, this seemed like a fine approach.

The upgrade was straightforward and would've taken less than five minutes, except that one of the nearly microscopic antenna sockets broke off as I removed the Dell card's antenna cables:

[Photo: the broken antenna socket]

I used a needle to remove the broken pieces from the antenna’s connector before it would fit on the new card’s socket. After connecting the antenna, the new card easily slid into the slot and Windows recognized it on next boot. I used Device Manager to ensure the drivers loaded for the new card’s Bluetooth support, and installed the latest PROSet driver. Everything’s been working great since.

While WinDbg is one of the more inscrutable tools I use, it worked great in this situation and would point even a novice in the right direction.


-Eric