x22i Treadmill Review

I love my treadmill, but two years in, I cannot recommend it.

On New Year’s Day 2022 I bought a NordicTrack x22i Incline Trainer (a treadmill that supports a 40% incline and 6% decline) with the aim of getting in shape to hike Kilimanjaro. I was successful on both counts, losing 50 pounds in six months and summiting Kilimanjaro with my brother in mid-2023. Between its arrival on January 24, 2022 and today, I’ve run ~1780 miles on it.

The Good

Most people I talk to about running complain about how awful treadmills are, describing them as “dreadmills” and horribly boring. While I’m not an outdoor runner, I’m sympathetic to their criticism, but it doesn’t resonate with me at all.

The iFit video training series is awesome for me. I’m inspired to get on the treadmill to see what’s next on its 22″ screen (which feels larger). I’ve had the chance to walk, run, and hike all over the world: South America, Hawaii, Japan, Italy, Africa, Europe, Antarctica, and all over the US. I’ve run races I’ll likely never get to run in the real world, including races (mostly marathons) in Hawaii, London, Boston, Jackson Hole, New York, Chicago, Tanzania, and more I’ve probably forgotten. I’ve probably run the Kilimanjaro Half Marathon a dozen times at this point, and I’m currently working my way through a “Kilimanjaro Summit” hiking series, partially retracing my steps up the Western Approach. Along the way, I’ve learned lots of training tips, some phrases in foreign languages, and the history of many interesting places.

The treadmill hardware is pretty nice — the shock absorption of the deck is excellent and I’ve managed not to destroy my knees despite running thousands of miles. Running on pavement in the real world leaves me considerably more sore.

While iFit has a variety of annoyances (there are not nearly enough 10Ks or half marathons, and they don’t add new “hard” workouts fast enough), there’s no question in my mind that the iFit training classes are to thank for the success I’ve had in getting in shape.

The Bad

There are many inexpensive treadmills out there, and most of them don’t seem very sturdy or likely to support a serious and regular running habit.

I was serious about my goals and figured that I should spend enough to ensure that my treadmill would last and never give me a technical excuse not to run. Still, the cost ended up being pretty intimidating: ~$3800 up front and nearly $2000 in later expenses.

x22i Treadmill (On Sale): $3170
Delivery and “White Glove” Assembly: $299
Sales Tax: $286
NordicTrack Heart Rate monitor arm band: $100
iFit Video Training Subscription renewal (Years 2-3): $600
20-Amp dedicated circuit: $970
Extended warranty (years 2-5): ~$300
Total 3-year cost for the x22i: $5725

Fortunately, Microsoft’s employee fitness program grants $1500 a year. I put the first year’s benefit toward the treadmill, and the following year’s paid for the subscription content renewal with $900 left over to defray the cost of the Kilimanjaro hike.

The Ugly

Unfortunately, my treadmill has been an escalating source of hassles from the very beginning. The assembly folks failed to fully screw in a few screws (they were sticking so far out that I assumed they used the wrong ones) and they cracked one of the water bottle holders. I complained to the NordicTrack folks and they refunded me the delivery/setup fee and within a few weeks came out to replace the broken water bottle holder.

Throughout the first year, my treadmill frequently tripped the circuit breaker; much to my surprise, the abrupt loss of power never resulted in me crashing into the front handrails, no matter how fast I was going. The treadmill was on a shared 15A circuit, and while it was never supposed to approach that level of current draw, it clearly did. Sometimes, the trigger was obvious (someone turning on the toaster in the kitchen) while other times the treadmill was the only thing running. Eventually I hooked up a Kill-A-Watt meter and found that it could peak at 16-17 amps when starting or changing the incline, well above what it was supposed to consume, but within the technical specs. I eventually spent the money to get a dedicated 20A circuit, and was angry to discover that the breaker was still periodically tripping. After months of annoyance and research, I discovered that treadmills are infamous for tripping the “Arc Fault Circuit Interrupter” (AFCI) breakers that are now required by Texas building code. Since having the electrician swap the AFCI breaker for the “old” type, I don’t think it has tripped again.

After all of the electrical problems, I invested in the extended warranty when it was offered, and I’m glad I did. Somewhere around the one-year mark, my treadmill started making a loud banging noise. I looked closer and realized that two screws had broken off the bottom of the left and right rails, and I assumed that was the source of the noise. Alas, removing the rails didn’t stop the banging, nor did having them replaced. Over the course of several months, techs came out to replace the side rails, idler roller, drive roller, belt, belt guide, and cushions. As of November 2023, the treadmill no longer makes a banging sound, but it’s not nearly as quiet as it once was, and I expect that I’ll probably need more service/parts within a few more months.

UPDATE: In October 2024, at around 2000 miles, the steel of the frame cracked where it holds the motor to the frame. It took two weeks for the technician to come verify that it was, in fact, broken and unfixable. Fortunately, the frame warranty is the longest one, at ten years. A few days later, I was offered either a replacement or a $3170 credit towards a new one. I spent a few days pondering whether to just buy another x22i, add $1500 of my own money for an x24, or get a non-incline 2450. The repair guy suggested that the x22i, with its motor at the back, is an especially unreliable model. :( Ultimately, I decided to get a new x22i, ending up out $380 for shipping, assembly, and removal of the old one. Fingers crossed that the new one holds up better, or that if it does fail, it’s the frame again.

Closing Thoughts

From a cost/hassle point-of-view, I would be much better off getting a membership to the gym a half-mile down the block. I suspect, however, that much of my success with regular running comes from the fact that the treadmill lives between my bedroom and my home office, and it beckons to me every morning on my “commute.” The hassle of getting in the car, needing to dress in more than a pair of sweaty shorts, etc, would give me a lot of excuses to “nope” out of regular runs.

When I was first shopping for a treadmill, someone teased me and suggested that I make sure it had a good bar for hanging clothes on, since that’s probably the most common job for home treadmills. I managed to avoid that trap, and I’ve fallen in love with my treadmill despite its many flaws.

I don’t know whether other treadmills at a similar price point are of higher quality, or whether spending even more would give better results, but it almost doesn’t matter at this point — the iFit video content is the best part of my treadmill, and I don’t think any other ecosystem (e.g. Peloton) is comparable.

-Eric

PS: If I end up replacing my treadmill in a few years, I might get a “regular” treadmill rather than an Incline Trainer, because I don’t use the steep inclines very often and I think that capability adds quite a bit of weight and perhaps some additional components that could fail?

A Cold and Slow 3M Half

My second run of the 3M Half Marathon was Sunday January 21, 2024. My first half-marathon last year was cold (starting at 38F), but this year’s was slated to be even colder (33F) and I was nervous.

For dinner on Saturday night, I had a HelloFresh meal of meatballs and mashed potatoes, and I went to bed around 9:45pm. I set an alarm for 6, but I woke up around 5:15 am and lingered in bed until 5:30. I drank a cup of coffee right away and then had a productive trip to the bathroom. I ate a banana and had another cup of coffee while I prepped my gear and got dressed.

I put on my new Under Armour leggings and shorts with the number that I’d attached the night before. I packed an additional running shirt in case it was cold enough to double-layer, something I’d never done before, but the forecast called for 33 degrees, five degrees colder than last year’s cold run. I also put on a pair of $3 disposable cotton gloves that I’d picked up at the packet pickup expo the day before. I wore new Balega socks and my trusty orange Hokas (my new ones aren’t quite broken in yet).

My water bottle’s pouch would hold my car key, an iPhone SE to provide tunes streamed to one Bluetooth earpiece (the other died months ago) and snacks: a pack of Gu gummies and some Jelly Belly Energy beans (which I ended up liking most).

I left the house around 6:45 for the 7:30 am race. While waiting at a light in the parking traffic I concluded that I definitely was going to need that second shirt, so I put it on under my trusty Decker Challenge shirt that brought me to the top of Kilimanjaro.

By 7:22 I had parked and was waiting in a long line for a porta-potty near the start, debating whether or not I should just skip it and go find my pace group. Ultimately, the race began just before I had a turn, although it was nice to dispose of that second cup of coffee. Alas, I was forced to start with the 2:40 pacers. I started my Fitbit a minute or two before my group made it across the starting line. Meanwhile, my second watch (an ancient Timex I found somewhere) crashed when I tried to start it as I crossed the starting line.

I spent the first mile passing folks and by the second mile marker I’d reached the 2:05 pace group. For the next few miles, the 2:00 pace group was in sight in the distance, but I never caught up to them, despite my hope of running most of the race with the 1:55 pace group. I consoled myself that I’d probably crossed the start line two minutes after the 2:00 group, so my dream of finishing in under 2 hours was probably still possible.

Around mile 5, my energy started to flag, but shortly thereafter an 8yo in a tutu running ahead of me guilted me into realizing that this wasn’t as hard as I was making it out to be. I passed her in about half a mile, grateful for the boost.

Shortly after mile 6, I discarded my gloves which had served me well. By this point I was taking short walking breaks and had concluded that I was unlikely to set any PRs.

Miles 6 through 9 were full of signs. My favorite was the 3yo boy holding a sign that said “This seems like a lot of work for a banana” – I told him it was the best one I’d seen. I groaned a bit at some of the signs held by twenty-somethings; one woman’s read “Find a cute butt and follow it” while another proclaimed: “Wow, that looks really long and hard!” Like last year, around mile 9 I stopped for a pee break although this time it was almost nothing… I had sipped under 16 ounces on the entire run.

By mile 7, my torso was starting to get a bit warm in my double shirts, but in another few miles the breeze had picked up and I was glad that I had them. Sheesh, it was chilly.

Amazingly, nothing hurt. My feet felt good. My legs felt good. My throat and lungs felt fine. My lips, wearing a swipe of chapstick, were fine. My thighs and chest, coated in BodyGlide, were not chafing anywhere. I didn’t have any weird aches in my arms or back. The closest thing I had to any pain was the bottom of my nose, which was getting chapped in the cold.

This year, I was anticipating the two hills downtown, and while I can’t say that I ran up them, they were much less demoralizing than last year. As the finish line approached, it had been a few miles since I’d seen my last pacer, but I figured I was somewhere in the 2:13-2:18 range. I idly hoped I’d still beat my time from last year’s slow Galveston Half.

Ultimately, I crossed the finish with a chip time of 2:09:24, a 9:52/m pace.

I wish Fitbit made it easier to trim their data to the actual run portion of the race :)

This year, I made sure not to blow by the volunteers passing out the medals just over the finish line.

Tired but feeling like I could’ve easily run a few more miles at a slow pace, I hopped the bus back to the starting point. I grabbed my phone and a jacket from the car and walked a mile to the bagel shop to get a coffee and celebratory breakfast sandwich… the day felt even colder. Back home on the couch after a long warm shower, I signed up for next year’s race.

Next month is the Galveston Half Marathon. I hope to run the race with the 2:00 pacer the whole way, but I’ll settle for beating 2:09, five minutes faster than last year’s effort.

The Blind Doorkeeper Problem, or, Why Enclaves are Tricky

There are many strategies for protecting a secret on a client device, but most of them are doomed. Still, as with any long-standing problem, security experts have chipped away at its edges over the years, and over the last decade there’s been growing interest in using enclaves as a means to protect secrets from attackers.

Background: Enclaves

The basic idea of an enclave is simple: you can put something into an enclave, but never take it out. For example, this means you can put the private key of a public/private key pair into the enclave so that it cannot be stolen. Whenever you need to decrypt a message, you simply pass the ciphertext into the enclave and the private key is used to decrypt the message and return the plaintext, crucially without the private key ever leaving the enclave.
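To make the contract concrete, here’s a tiny Python sketch of the idea. Everything about it is illustrative: the class name is hypothetical, and XOR stands in for real cryptography. The point is the shape of the API, not the crypto.

```python
class Enclave:
    """Toy model of an enclave: the key goes in at creation and never comes out."""

    def __init__(self, key: bytes):
        self.__key = key  # stands in for key material sealed in enclave memory

    def decrypt(self, ciphertext: bytes) -> bytes:
        # XOR stands in for real cryptography; only the plaintext leaves the enclave.
        return bytes(c ^ k for c, k in zip(ciphertext, self.__key))

    def encrypt(self, plaintext: bytes) -> bytes:
        return bytes(p ^ k for p, k in zip(plaintext, self.__key))


enclave = Enclave(bytes(range(1, 33)))            # the key is sealed inside
ciphertext = enclave.encrypt(b"attack at dawn")
print(enclave.decrypt(ciphertext))                # b'attack at dawn'
```

Note that no method ever returns the key: callers can use it, but they can’t extract it. That asymmetry is the whole promise of an enclave.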

There are several types of enclaves, but on modern systems, most are backed by hardware — either a TPM chip, a custom security processor, or features in mainstream CPUs that enable enclave-style isolation within a general-purpose CPU. Windows exposes a CreateEnclave API to allow running code inside an enclave, backed by virtualization-based security (VBS) features in modern processors. The general concept behind Virtual Secure Mode is simple: code running at the normal Virtual Trust Level (VTL0) cannot read or write memory “inside” the enclave, which runs its code at VTL1. Even the highly-privileged OS Kernel code running at VTL0 cannot spy on the content of a VTL1 enclave.

DLLs loaded into an enclave must be signed by a specific class of certificate (provided by Azure Trusted Signing) and the code’s signature and integrity are validated before it is loaded into the enclave. After the privileged code is loaded into the enclave, it has access to all of the memory of the current process (both untrusted VTL0 and privileged VTL1 memory). In-enclave code cannot load most libraries and thus can only call a tiny set of external library functions, mostly related to cryptography.

Security researchers spend a lot of time trying to attack enclaves for the same reason that robbers try to rob banks: because that’s where the valuables are. At this point, most enclaves offer pretty solid security guarantees– attacking hardware is usually quite difficult which makes many attacks impractically expensive or unreliable.

However, it’s important to recognize that enclaves are far from a panacea, and the limits of the protection provided by an enclave are quite subtle.

A Metaphor

Imagine a real-world protection problem: You don’t want anyone to get into your apartment, so you lock the door when you leave. However, you’re in the habit of leaving your keys on the bar when you’re out for drinks and bad guys keep swiping them and entering your apartment. Some especially annoying bad guys don’t just enter themselves, they also make copies of your key and share it with their bad-guy brethren to use at their leisure.

You hit on the following solution: you change your apartment’s lock, making only one key. You hire a doorkeeper to hold the key for you, and he wears it on a chain around his neck, never letting it leave his person. Every time you need to get in your apartment, you ask the doorkeeper to let you in and he unlocks the door for you.

No one other than the doorkeeper ever touches the key, so there’s no way for a bad guy to steal or copy the key.

Is this solution secure?

Well, no. The problem is that you never gave your doorkeeper instructions on who is allowed to tell him to unlock the door, so he’ll open it for anyone who asks. Your carefully-designed system is perfectly effective in protecting the key but utterly fails in achieving the actual protection goal: protecting the contents of your apartment.

What does this have to do with enclaves?

Sometimes, security engineers get confused about their goals, and believe that their objective is to keep the private key secret. Keeping the private key secret is simply an annoying requirement in service of the real goal: ensuring that messages can be decrypted/encrypted only by the one legitimate owner of the key. The enclave serves to prevent that key from being stolen, but preventing the key from being abused is a different thing altogether.

Consider, for example, the case of locally-running malware. The malware can’t steal the enclaved key, but it doesn’t need to! It can just hand a message to the code running inside the enclave and say “Decrypt this, please and thank you.” The code inside the enclave dutifully does as it’s asked and returns the plaintext out to the malware. Similarly, the attacker can tell the enclave “Encrypt this message with the key” and the code inside the enclave does as directed. The key remains a secret from the malware, but the crypto system has been completely compromised, with the attacker able to decrypt and encrypt messages of his choice.
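A few lines of illustrative Python make the problem stark. The enclave here is a toy stand-in (XOR in place of real cryptography), but the lesson is real: the attacker never needs the key, only a way to ask the enclave to use it.

```python
class Enclave:
    """Toy enclave: the key is sealed inside and only decrypt() is exposed."""

    def __init__(self, key: bytes):
        self.__key = key

    def decrypt(self, ciphertext: bytes) -> bytes:
        return bytes(c ^ k for c, k in zip(ciphertext, self.__key))


def malware(enclave: Enclave, stolen_ciphertext: bytes) -> bytes:
    # The attacker never sees the key, and never needs to: the enclave is a
    # blind doorkeeper that decrypts for anyone who asks.
    return enclave.decrypt(stolen_ciphertext)


secret_key = bytes([0x5A] * 32)
enclave = Enclave(secret_key)                        # the key can't be stolen...
ciphertext = bytes(b ^ 0x5A for b in b"wire $1M to Eve")
print(malware(enclave, ciphertext))                  # ...but it can be used freely
```

The key remains perfectly confidential, and the system is perfectly compromised.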

So, what can we do about this? It’s natural to think: “Ah, we’ll just sign/encrypt messages from the app into the enclave and the code inside the enclave will validate that the calls are legitimate!” but a moment later you’ll remember: “Ah, but how do we protect that app’s key?” and we’re back where we started. Oops.

Another idea is that the code inside the enclave will examine the running process and determine whether the app/caller is the expected legitimate app. Unfortunately, this is extremely difficult. While the VTL1 code can read all of the app’s VTL0 memory, to confidently determine that the host app is legitimate would require something like groveling through all of the executable pages in the process’ memory, hashing them, and comparing them to a “known good” value. If the process contains any unexpected code, it may be compromised. Even if you could successfully implement this process snapshot hash check, an attacker could probably exploit a race condition to circumvent the check, and you’d be forever bedeviled by false positives caused by non-malicious code injection from accessibility utilities or security tools.

In general, any security checks from inside the enclave that look at memory in VTL0 are potentially subject to a TOCTOU (time-of-check/time-of-use) attack — an attacker can change any values at any time unless they have been copied into VTL1 memory. Microsoft Security wrote an important summary of things your code needs to do in order to program securely with enclaves.
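The standard defense is snapshot-then-check: copy the untrusted buffer into enclave-private memory, validate the copy, and operate only on the copy. Here’s a hypothetical sketch of the pattern, with plain Python objects standing in for VTL0 (the shared `bytearray`) and VTL1 (the immutable snapshot); the `"OK:"` header policy is invented for illustration.

```python
def validate(buf: bytes) -> bool:
    # Hypothetical policy check: only messages bearing an "OK:" header are allowed.
    return buf.startswith(b"OK:")


def handle_unsafe(shared: bytearray) -> bytes:
    if validate(bytes(shared)):   # time-of-check: reads attacker-writable memory
        # ...an attacker thread can rewrite `shared` right here...
        return bytes(shared)      # time-of-use: may no longer be what was checked
    return b""


def handle_safe(shared: bytearray) -> bytes:
    snapshot = bytes(shared)      # copy into enclave-private ("VTL1") memory first
    if validate(snapshot):        # check the private copy...
        return snapshot           # ...and use that same copy
    return b""


shared = bytearray(b"OK:launch")
result = handle_safe(shared)
shared[:] = b"EVIL:oops"          # a later rewrite can't affect the snapshot
print(result)                     # b'OK:launch'
```

The safe version costs a copy, but it guarantees that the bytes you validated are the bytes you act on.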

Another idea would be to prompt the user: the code inside the enclave could pop up an unspoofable dialog asking “Hey, would you like me to sign this message [data] with your key?” Unfortunately, in the Windows model this isn’t possible — code running inside an enclave can’t show any UI, and even if it could, there’s nothing that would prevent such a confirmation UI from being redressed by VTL0 code. Rats.

Conclusion

Before you reach for an enclave, consider your full threat model and whether using the enclave would meaningfully mitigate any threats.

For example, token binding uses an enclave to render cookie theft useless– while token binding doesn’t mitigate the threat of locally-running malware abusing cookies via a sock-puppet browser, it does mitigate the threat of cookie theft via XSS/cookie leaks. Furthermore, it complicates the lives of malicious insiders using tightly locked-down corporate PCs at financial services firms– those PCs are heavily monitored and audited, so forcing an attacker to abuse the key on-device is a significant improvement in security posture. (The upcoming Device Bound Session Credentials feature has similar properties and may be more deployable than token binding was.)

A similar scenario involves Hardware Security Modules (HSMs) used in code-signing and certificate issuance scenarios: while the overall security goal is to prevent misuse of the key, preventing the attacker from egressing the key improves the threat model because it allows other components of the overall system (auditing, alerting, AV, EDR/XDR, etc) to combat attackers attempting to abuse the unexportable key.

-Eric

PS: A great discussion of hardware-backed enclaves on popular phones can be read here.

PPS: It turns out I wrote a much shorter version of this post (with a similar metaphor) previously in response to a bug report.

Coding at Google

I wrote this a few years back, but I’ve had occasion to cite it yet again when explaining why engineering at Google was awesome. To avoid it getting eaten by the bit bucket, I’m publishing it here.

Background: From January 2016 to May 2018, I was a Senior SWE on the Chrome Enamel Security team.

Google culture prioritizes developer productivity and code velocity. The internal development environment has been described (by myself and others) as “borderline magic.” Google’s developer focus carried over to the Chrome organization even though Chrome is a client codebase whose open-source nature means that it cannot depend upon Google-internal tooling and infrastructure.

I recounted the following experience after starting at Google:

When an engineer first joins Google, they start with a week or two of technical training on the Google infrastructure. I’ve worked in software development for nearly two decades, and I’ve never even dreamed of the development environment Google engineers get to use. I felt like Charlie Bucket on his tour of Willy Wonka’s Chocolate Factory—astonished by the amazing and unbelievable goodies available at any turn. The computing infrastructure was something out of Star Trek, the development tools were slick and amazing, the process was jaw-dropping.

While I was doing a “hello world” coding exercise in Google’s environment, a former colleague from the IE team pinged me on Hangouts chat, probably because he’d seen my tweets about feeling like an imposter as a SWE.  He sent me a link to click, which I did. Code from Google’s core advertising engine appeared in my browser in a web app IDE. Google’s engineers have access to nearly all of the code across the whole company. This alone was astonishing—in contrast, I’d initially joined the IE team so I could get access to the networking code to figure out why the Office Online team’s website wasn’t working.

“Neat, I can see everything!” I typed back. “Push the Analyze button” he instructed. I did, and some sort of automated analyzer emitted a report identifying a few dozen performance bugs in the code. “Wow, that’s amazing!” I gushed. “Now, push the Fix button” he instructed. “Uh, this isn’t some sort of security red team exercise, right?” I asked. He assured me that it wasn’t. I pushed the button. The code changed to fix some unnecessary object copies. “Amazing!” I effused. “Click Submit” he instructed. I did, and watched as the system compiled the code in the cloud, determined which tests to run, and ran them.

Later that afternoon, an owner of the code in the affected folder typed LGTM (Googlers approve changes by typing the acronym for Looks Good To Me) on the change list I had submitted, and my change was live in production later that day. I was, in a word, gobsmacked. That night, I searched the entire codebase for misuse of an IE cache control token and proposed fixes for the instances I found.

-Me, 2017

The development tooling and build/test infrastructure at Google enable fearless commits—even a novice can make contributions into the codebase without breaking anything—and if something does break, culturally, it’s not that novice’s fault: instead, everyone agrees that the fault lies with the environment – usually either an incomplete presubmit check or missing test automation for some corner case. Regressing CLs (changelists) can be quickly and easily reverted and resubmitted with the error corrected. Relatedly, Google invests heavily in blameless post-mortems for any problem that meaningfully impacts customer experience or metrics. Beyond investing in researching and authoring the post-mortem in a timely fashion, post-mortems are broadly reviewed, and preventative action items identified therein are fixed with priority.

Google makes it easy to get started and contribute. When ramping up into a new space, the new engineer is pointed to a Wiki or other easily-updated source of step-by-step instructions for configuring their development environment. This set of instructions is expected to be current, and if the reader encounters any problems or changes, they’re expected to improve the document for the next reader (“Leave it better than you found it”). If needed, there’s usually a script or other provisioning tool used to help get the right packages/tools/dependencies installed, and again, if the user encounters any problems, the expectation is that they’ll either file a bug or commit the fix to the script.

Similarly, any ongoing Process is expected to have a “Playbook” that explains how to perform the process – for example, Chrome’s HSTS Preload list is compiled into the Chrome codebase from snapshots of data exported from HSTSPreload.org. There’s a “Playbook” document that explains the relevant scripts to run, when to run them, and how to diagnose and fix any problems. This Playbook is updated whenever any aspect of the process changes as a part of whatever checkin changes the process tooling.

As a relatively recent update, the Chromium project now offers a very lightweight contribution experience that can be run entirely in a web browser, which mimics the Google internal development environment (Cider IDE with Borg compiler backend).

Mono-repo, no team/feature branches. Google internally uses a mono-repo into which almost all code (with few exceptions, including Chrome) is checked in, and the permissions allow any engineer anywhere in the company to read it, dramatically simplifying both direct code reuse and finding expertise in a given topic. Because Chrome is an open-source project, it uses its own mono-repo containing approximately 25 million lines of code. Chrome does not, in general, use shared branches for feature development; branches are used only for releases (e.g. Canary is forked in order to create the Dev branch, and there are firm rules about cherry-picking from Main into those branches).

An individual developer will locally create branches for each fix they’re working on, but those branches are almost never seen by anyone else; their CL is merged to HEAD, at which point everyone can see it. As a consequence, landing non-trivial changes, especially in areas where others are merging, often results in many commits and a sort of “chess game” where you have to anticipate where the code will be moving as you place your pieces. This strongly encourages developers to land code in many small CLs that coax the project toward the desired end-state, each with matching automated tests to ensure that you’re protected against anyone else landing a change that regresses your code. Those tests end up defending your code for years to come.

Because all work is done in Main, there’s little in the way of cross-team latency, because you need not wait for an RI/FI (reverse-integrate, forward-integrate) to bring features around to/from other branches.

Cloud build. Google uses cloud build infrastructure (Borg/Goma) to build its projects, so developers can work on relatively puny workstations but compile with hundreds to thousands of cores. A clean build of Chrome for Windows that took 46 minutes on a 48-thread Xeon workstation would take 6 minutes on 960 Goma cores, and most engineers are not doing clean builds very often.

This Cloud build infrastructure is heavily leveraged throughout the engineering system—it means that when an engineer puts a changelist up for review, the code is compiled for five to ten different platforms in parallel in the background and then the entire automated test suite is run (“Tryjob”) such that the engineer can find any errors before another engineer even begins their code review. Similarly, artifacts from each landed CL’s compilation are archived such that there’s a complete history of the project’s binaries, which enables automated tooling to pinpoint regressions (performance via perfbots, security via ClusterFuzz, reliability via their version of Watson) and engineers to quickly bisect other types of regressions.

Great code search/blame. Google’s Code Search features are extremely fast and, thanks to the View-All monorepo and lack of branches, it’s very easy to quickly find code from anywhere in the company. Cross-references work correctly, so things like “Find References” will properly find all callers of a specific function rather than just doing a string search for that name. Viewing Git history and blame is integrated, so it’s quick and easy to see how code evolved over time.

24-hour Code Review Culture. Google’s engineering team has a general SLA of 24 hours on code review. The tools help you find appropriate reviewers, and the automation helps ensure that your CL is in the best possible shape (proper linting, formatting, all tests pass, code coverage %s did not decline) before another human needs to look at it. The fast and simple review tools help reviewers concentrate on the task at hand, and the fact that almost all CLs are small/tiny by Microsoft standards helps keep reviews moving quickly. Similarly, Google’s worldwide engineering culture means that it’s often easy to submit a CL at the end of the day Pacific time and then respond to review feedback received overnight from engineers in Japan or Germany.

Opinionated and Enforced Coding Standards. Google has coding standards documents for each language (e.g. C++) that are opinionated and carefully revised after broad and deep discussions among practitioners interested in participating. These coding standards are, to the extent possible, enforced by automated tooling to ensure that all code is written to the standard, and these standards are shared across teams by default, with any per-project exceptions (e.g. Chrome’s C++) treated as an overlay.

Easily Discovered Area Interest/Ownership. Google has an extremely good internal “People Directory” – it allows you to search for any employee based on tags/keywords, so you can very quickly find other folks in the company who own a particular area. Think “Dr Whom/Who+” with 100ms page-load-times, backed by a work culture where folks keep their own areas of ownership and interest up-to-date, both because it’s simple and because if they fail to do so, they’re going to keep getting questions about things they no longer own. Similarly, the OWNERS system within the codebases is up-to-date because it is used to enforce OWNERS review of changes, so after you find a piece of code, it’s easy to find both who wrote it (fast git blame) and who’s responsible for it today. Company/Division/Team/Individual OKRs are all globally visible, so it’s easy to figure out what is important to a given level of the organization, no matter how remote.

Simple/fast bug trackers. Google’s bug tracker tools are simple, load extremely quickly, and allow filing/finding bugs against anything very quickly. There’s a single internal tracker for most of Google, and a public tracker (crbug.com) for the Chromium OSS project.

Simple/fast telemetry/data science tools. Google’s equivalent of Watson is extremely fast and has code to automatically generate stack information, hit counts, recent checkins near the top-of-stack functions, etc. Google’s equivalent of SQM/OCV is extremely fast and enables viewing of histograms and answering questions like “What percentage of page loads result in this behavior” without learning a query language, getting complicated data access permissions, or suffering slow page loads. These tools enable easy creation of “notifications/subscriptions” so developers interested in an area can get a “chirp” email if a metric moves meaningfully.

Sheriffs and Rotations. Most recurring processes (e.g. bug triage) have both a Sheriff and a Deputy and Google has tools for automatically managing “rotations” so that the load is spread throughout the team. For some expensive roles (e.g. a “Build Sheriff”) the developer’s primary responsibility while sheriff becomes the process in question and their normal development work is deferred until their rotation ends; the rotation tool shows the schedule for the next few months, so it is relatively easy to plan for this disruption in your productivity.

Intranet Search that doesn’t suck. While Google tries to get many important design docs and so forth into the repo directly, there’s still a bunch of documentation and other things on assorted wikis, Google Docs, etc. As you might guess, Google has an internal search engine for this non-public content that works quite well, in contrast to other places I’ve worked.