One of my final projects on the Chrome team was writing an internal document outlining Best Practices for Secure URL Display. Yesterday, it got checked into the public Chromium repro, so if this is a topic that interests you, please have a look!
Update: The October 2018 Cumulative Security Update (KB4462919) brings the RS5 Cookie Control changes described below to Windows 10 RS2, RS3, and RS4.
Cookies are one of the most crucial features in the web platform, and large swaths of the web don’t work properly without them. Unfortunately, cookies are also one of the primary mechanisms that trackers and ad networks utilize to follow users around the web, potentially impacting users’ privacy. To that end, browsers have offered cookie controls for over twenty years.
Back in 2010, I wrote a summary of Internet Explorer’s Cookie Controls. IE’s cookie controls were very granular and quite powerful. The basic settings were augmented with P3P, a once-promising feature that allowed sites to advertise their privacy practices and browsers to automatically enforce users’ preferences against cookies. Unfortunately, major sites created fraudulent P3P statements, regulators failed to act, and the entire (complicated) system collapsed. P3P was removed from IE11 on Windows 10 and never implemented in Microsoft Edge.
Instead, Edge offers a very simple cookie control in the Privacy and Security section of the settings. Under the Cookies option, you have three choices: Don’t block cookies (the default), Block all cookies, and Block only third party cookies:
This simple setting hides a bunch of subtlety that this post will explore.
Cookie => Cookie-Like
For the October 2018 update (aka “Redstone Five” aka “RS5”) we’ve made some important changes to Edge’s Cookie control.
The biggest of the changes is that Edge now matches other browsers, and uses the cookie controls to restrict cookie-like storage mechanisms, including localStorage, sessionStorage, indexedDB, Cache API, and ServiceWorkers. Each of these features can behave much like a cookie, with a similar potential impact on users’ privacy.
While we didn’t change the UI, it would be accurate to change it to:
This change improves privacy and can even improve site compatibility. During our testing, we were surprised to discover that some website flows fail if the browser blocks only 3rd party cookies without also blocking 3rd-party localStorage. This change brings Edge in line with other browsers with minor exceptions. For example, in Firefox 62, when 3rd-party site data is blocked, sessionStorage is still permitted in a 3rd-party context. In Edge RS5 and Chrome, 3rd party sessionStorage is blocked if the user blocks 3rd-party cookies.
Block Setting and Sending
Another subtlety exists because of the ambiguous terminology “third-party cookie.” A cookie is just a cookie– it belongs to a site (eTLD+1). Where the “party” comes into play is the context where the cookie was set and when it is sent.
In the web platform, unless a browser implements restrictions:
- A cookie set in a first-party context will be sent to a first-party context
- A cookie set in a first-party context will be sent to a third-party context
- A cookie set in a third-party context will be sent to a first party context
- A cookie set in a third-party context will be sent to a third-party context
For instance, in this sample page, if the IFRAME and IMG both set a cookie, these cookies are set in a third-party context:
- If the user subsequently visits domain2.com, the cookie set by that 3rd-Party IFRAME will now be sent to the domain2.com server in a 1st-Party context.
- If the user subsequently visits domain3.com, the cookie set by that 3rd-Party IMG will now be sent to the domain3.com server in a 1st-Party context.
Historically, Edge and IE’s “Block 3rd party cookies” options controlled only whether a cookie could be set from a 3rd party context, but did not impact whether a cookie initially set in a 1st party context would be sent to a 3rd party context.
As of Edge RS5, setting “Block only 3rd party cookies” will now also block cookies that were set in a 1st party context from being sent in a 3rd-party context. This change is in line with the behavior of other browsers.
Edge Controls Impacted By Zones
With the move from Internet Explorer to Edge, the Windows Security Zones architecture was largely left by the wayside.
However, cookie controls are one of a small number of exceptions to this; Edge applies the cookie restrictions only in the Internet Zone, the zone almost all sites fall into (outside of users on corporate networks).
Perhaps surprisingly, cookie-like features and the document.cookie getter are restricted, even in the Intranet and Trusted zones.
Chrome and Firefox do not take Windows Security Zones into account when applying cookie policies.
I’ve updated my old “Cookies” test page with new storage test cases. You can set your browser’s privacy controls:
…then visit the test page to see how the browser limits features from 3rd-party contexts. You can use the Swap button on the page to swap 1st-party and 3rd-party contexts to see how restrictions have been applied. You should see that the latest versions of Chrome, Firefox, and Edge all behave pretty much the same way.
One interesting exception is that when configured to Block 3rd-party Cookies, Edge still allows 3rd-party contexts to delete their own cookies. (This is used by federated logout pages, for instance). Chrome does not allow deletion in this scenario– the attempt to delete cookies is ignored.
Appendix: Chromium Audit
In the course of our site-compatibility investigations, I had a look at Chromium’s behavior with regard to their cookie controls. In Chromium, Blink asks the host application for permission to use various storages, and these chokepoints check:
…which is sensitive to the various “Block Cookies” settings.
Mojo messages come up through renderer_host/chrome_render_message_filter.cc, gating access to
Additionally, ChromeContentBrowserClient gates
Elsewhere, IsCookieAccessAllowed is used to limit:
- Flash Storage (PP_FLASHLSORESTRICTIONS_BLOCK)
- Client Hints
Of these, Edge does not support WebSQL, FileSystem, SharedWorker, or Client Hints.
Yesterday, I started looking a site compatibility bug where a page’s layout is intermittently busted. Popping open the F12 Tools on the failing page, we see that a stylesheet is getting blocked because it lacks a CORS Access-Control-Allow-Origin response header:
We see that the client demands the header because the LINK element that references it includes a crossorigin=anonymous directive:
crossorigin="anonymous" href="//s.axs.com/axs/css/90a6f65.css?4.0.1194" type="text/css" />
Aside: It’s not clear why the site is using this directive. CORS is required to use SubResource Integrity, but this resource does not include an integrity attribute. Perhaps the goal was to save bandwidth by not sending cookies to the “s” (static content) domain?
In any case, the result is that the stylesheet sometimes fails to load as you navigate back and forward.
Looking at the network traffic, we find that the static content domain is trying to follow the best practice Include Vary: Origin when using CORS for access control.
Unfortunately, it’s doing so in a subtly incorrect way, which you can see when diffing two request/response pairs for the stylesheet:
As you can see in the diff, the Origin token is added only to the response’s Vary directive when the request specifies an Origin header. If the request doesn’t specify an Origin, the server returns a response that lacks the Access-Control-* headers and also omits the Vary: Origin header.
That’s a problem. If the browser has the variant without the Access-Control directives in its cache, it will reuse that variant in response to a subsequent request… regardless of whether or not the subsequent request has an Origin header.
The rule here is simple: If your server makes a decision about what to return based on a what’s in a HTTP header, you need to include that header name in your Vary, even if the request didn’t include that header.
CORS protocol and HTTP caches
If CORS protocol requirements are more complicated than setting `Access-Control-Allow-Origin` to * or a static origin, `Vary` is to be used.
In particular, consider what happens if `Vary` is not used and a server is configured to send `Access-Control-Allow-Origin` for a certain resource only in response to a CORS request. When a user agent receives a response to a non-CORS request for that resource (for example, as the result of a navigation request), the response will lack `Access-Control-Allow-Origin` and the user agent will cache that response. Then, if the user agent subsequently encounters a CORS request for the resource, it will use that cached response from the previous non-CORS request, without `Access-Control-Allow-Origin`.
But if `Vary: Origin` is used in the same scenario described above, it will cause the user agent to fetch a response that includes `Access-Control-Allow-Origin`, rather than using the cached response from the previous non-CORS request that lacks `Access-Control-Allow-Origin`.
However, if `Access-Control-Allow-Origin` is set to * or a static origin for a particular resource, then configure the server to always send `Access-Control-Allow-Origin` in responses for the resource — for non-CORS requests as well as CORS requests — and do not use `Vary`.
I’ve been working on web security for a long time at this point, and spending most of my time looking at all of the bad stuff happening on the web can get pretty demoralizing. Fortunately, there’s also a lot of amazing stuff on the web that periodically reminds me of what an amazing tool it can be.
For instance, this afternoon a friend posted the following picture:
Now, I’ve loved mysteries for a long time, and this one seemed like it ought to be an easy one with all of the magical tech we have at our disposal these days.
I first gave it a shot with the Google Translate app on my phone, which frequently surprises me with the things it can do. However, while it can do translations via photo, that feature requires that you first specify the source language, and I don’t know it yet.
I gave this nice article on recognizing character sets a skim, but no perfect match leapt out at me, although it did eliminate some of my guesses. The answer’s actually in there, had I done more than skim it.
Fortunately, I remembered an amazing website that lets you draw a shape and it’ll tell you what Unicode characters you might be writing. I chose the most distinctive character I could and scrawled it in the box, and the first half of the puzzle fell into place:
After that, it was a simple matter of clicking through to the Georgian character set block and verifying that all of the characters were present. I then copied the text over to Google Translate, which reports that ელდარ translates to Eldar.
There’s a ton of amazing stuff out there on the web. Don’t let all the bad stuff get you down.
Recently, a developer asked me how to enable Brotli content-compression support in FiddlerCore applications, so that APIs like oSession.GetResponseBodyAsString() work properly when the entity body has been compressed using brotli.
Right now, support requires two steps:
- Put brotli.exe (installed by Fiddler or off Github) into a Tools subfolder of the folder containing your application’s executable.
- Ensure that the Environment.SpecialFolder.MyDocuments folder exists and contains a FiddlerCore subfolder (e.g. C:\users\username\documents\FiddlerCore).
Step #1 allows FiddlerCore to find brotli.exe. Alternatively, you can set the fiddler.config.path.Tools preference to override the folder.
Step #2 allows FiddlerCore to create necessary temporary files. Sadly, this folder cannot presently be overridden [Bug].
One day, Fiddler might not need brotli.exe any longer, as Brotli compression is making its way into the framework.
“Happy Holidays” David said as he poked his head into my office, handing me an unwrapped holiday card featuring a kitten in a Santa hat. As I took it, I nearly dropped a small white envelope that dropped out from inside. The inscription in the card read simply “Best wishes, David – 2010.”
“Uh, thanks, you too!” I replied, both surprised and a bit uncomfortable that my colleague had gotten me a card. We were friendly but not friends. We’d only worked together a few times over the prior year, and it would’ve never occurred to me to get him anything for Christmas. He seemed like an archetypal geek, so we shared an interest in technology and science fiction, but we didn’t hang out or anything like that. Fortunately, he had a stack of cards in his hands, so it wasn’t like I’d been singled out or anything. Hoping he hadn’t gotten me anything fancy, I asked “What’s this?” as I flipped over the rigid envelope. In small print near the flap, it read “Do not open until Christmas 2017.”
His eyes twinkled and he grinned mischievously, an expression I’d never seen from him before. “Just a tiny gift. Well, maybe sort of a test. It’s very important that I give it to you now. But if you open it before 2017, it’ll be the crummiest gift you ever got. If you can wait, maybe it’ll be pretty nice.”
I furrowed my brow. “So, like some sort of Savings bond thing?” I asked, thinking back to the bonds I’d gotten from far-off relatives as a little kid… I’d recently stopped dragging them around from apartment to apartment and taken them to the bank to collect the princely sum of $261, two decades after I’d ungratefully wished they were some action figures or a book instead.
David smiled. “Sure, sorta. Don’t lose it. Don’t get it wet. And don’t open it early!”
“Uh, I won’t… Thanks?” I promised, and with that, David disappeared into Rob’s office next door to start his spiel all over.
Weird. Well, I’d already bought my direct reports gift cards for the IPic movie theatre and I had one extra left over… I’ll put that in a New Year’s card for David and leave it in his office sometime next week, I resolved. I tossed the card and envelope into the stack of RFC printouts on my bookshelf and went back to the email I was writing, hoping to get everything squared away before leaving for a short Christmas vacation.
The envelope sat on my bookshelf undisturbed, buried in an ever growing pile of paper. I might’ve remembered it the following year, but David had left the company that summer, off to do a volunteer tour with the Peace Corps before joining some startup out in San Francisco. Still buried in a pile of paper when I left the company two years later, the card made its way into an unsorted moving box labeled “office stuff” as we moved across the country to Texas. It then sat quietly in the box in my garage for five more years.
In the summer of 2017, I finally got around to digging through the garage, trashing what I could in an effort to make way for the growing proliferation of tricycles, big wheels, wagons, bikes, and pool toys that our two Texas-born children had collected. I spent a quiet Saturday afternoon in July mired in nostalgia, poring through boxes full of old books and papers and remembering a life before kids and so many responsibilities.
When I eventually uncovered the kitten card in the pile, I snorted and stretched to toss it in the “Recycle” pile before I remembered the weird little envelope. Sure enough, it was still inside, forgotten and untouched for the better part of a decade. I ignored the admonition of the faded green warning and tore it open, long-forgotten curiosity mounting.
The envelope contained a pile of papers folded in thirds. The outermost of these was a cleanly cut sheet of wax paper of the sort that was used for sandwiches back before ziplock bags took over the world. Odd, I mused as I discarded it. The next was a folded sheet of thick white cardstock, taped closed, which bore a short paragraph printed in Christmas-colored ink. It read simply:
Is it Christmas 2017 yet?
If so, happy holidays! Enjoy your present.
If not, please google ‘Marshmallow experiment’ and wait patiently.
You’ve been warned.
I paused as the marshmallow reference tickled something in my memory … some university lecture I’d forgotten long ago? Mildly annoyed, I dragged out my phone and searched as instructed. Oh yeah, that Stanford study about delayed gratification. It’s not like anyone will ever know… I mused as I put down my phone and started to peel back the tape holding the cardstock shut.
The door to the garage opened. “Nap’s over, Nate’s up.” my wife called, and I tossed the letter back in the box, eagerly grabbing the large pile of recycling I’d generated to show her my progress. Dancing around, building lego cars, and wrestling with my two kids, I completely forgot about David’s weird little present for another few months.
In December, off work for a two week winter holiday vacation, I resolved to finish cleaning out the garage. Thirty minutes into the job, I came across the card. I tore open the cardstock. Inside lay a single laser-printed page with a letter printed on one side. The opening paragraph read:
Happy holidays, friend!
I hope you’ve waited patiently for this small gift and aren’t too annoyed at the oddity of its presentation, but it’s all for a purpose.
My accountant has instructed me to make it very clear that this is a GIFT, granted freely without any restrictions, from me to you on December 17th, 2010. It has an approximate market value of $250. Relax– it only cost me about 8 bucks, and it may well be worthless by the time you open it. (Market value is only possibly important for tax purposes)
Beneath the card was a black and white picture, one inch square, full of smaller squares, the sort of bar code you’ll find on the back of shampoo bottles.
The letter continued and I read on, confused but intrigued.
This is a QR code containing the private key for a digital wallet containing bitcoin. Bitcoins are a virtual currency that I’ve been playing with this year, and I thought it would be a lark to give some out as a present. (I’m not clever at picking presents– growing up, I got a new leather wallet every year for Christmas.)
Perhaps in 2017 bitcoins have become worthless (that seems like the most likely outcome), but I have a hunch that perhaps they’ll continue to appreciate over the years. If you’ve been patient, maybe it’s enough to buy you a nicer present now.
If so, happy holidays! If not, I hope my folly brings you at least a chuckle. :)
My heart started to pound in my chest. The page concluded with David’s signature, preceded by 7 hand-scrawled characters.
Hands shaking with the instant weight of the letter, I dropped it and the universe entered slow motion as the paper fluttered to the ground.
Four weeks ago, emailed notice of a free massage credit revealed that I’ve been at Google for a year. Time flies when you’re drinking from a firehose.
When I mentioned my anniversary, friends and colleagues from other companies asked what I’ve learned while working on Chrome over the last year. This rambling post is an attempt to answer that question.
While I started at Google just over a year ago, I haven’t actually worked there for a full year yet. My second son (Nate) was born a few weeks early, arriving ten workdays after my first day of work.
I took full advantage of Google’s very generous twelve weeks of paternity leave, taking a few weeks after we brought Nate home, and the balance as spring turned to summer. In a year, we went from having an enormous infant to an enormous toddler who’s taking his first steps and trying to emulate everything his 3 year-old brother (Noah) does.
I mention this because it’s had a huge impact on my work over the last year—much more than I’d naively expected.
When Noah was born, I’d been at Telerik for almost a year, and I’d been hacking on Fiddler alone for nearly a decade. I took a short paternity leave, and my coding hours shifted somewhat (I started writing code late at night between bottle feeds), but otherwise my work wasn’t significantly impacted.
As I pondered joining Google Chrome’s security team, I expected pretty much the same—a bit less sleep, a bit of scheduling awkwardness, but I figured things would fall into a good routine in a few months.
Things turned out somewhat differently.
Perhaps sensing that my life had become too easy, fate decided that 2016 was the year I’d get sick. Constantly. (Our theory is that Noah was bringing home germs from pre-school; he got sick a bunch too, but recovered quickly each time.) I was sick more days in 2016 than I was in the prior decade, including a month-long illness in the spring. That ended with a bout of pneumonia that concluded with a doctor-mandated seven days away from the office. As I coughed my brains out on the sofa at home, I derived some consolation in thinking about Google’s generous life insurance package. But for the most part, my illnesses were minor—enough to keep me awake at night and coughing all day, but otherwise able to work.
Mathematically, you might expect two kids to be twice as much work as one, but in our experience, it hasn’t worked out that way. Instead, it varies between 80% (when the kids happily play together) to 400% (when they’re colliding like atoms in a runaway nuclear reactor). Thanks to my wife’s heroic efforts, we found a workable daytime routine. The nights, however, have been unexpectedly difficult. Big brother Noah is at an age where he usually sleeps through the night, but he’s sure to wake me up every morning at 6:30am sharp. Fortunately, Nate has been a pretty good sleeper, but even now, at just over a year old, he usually still wakes up and requires attention twice a night or so.
I can’t remember the last time I had eight hours of sleep in a row. And that’s been extremely challenging… because I can’t remember much else either. Learning new things when you don’t remember them the next day is a brutal, frustrating process.
When Noah was a baby, I could simply sleep in after a long night. Even if I didn’t get enough sleep, it wouldn’t really matter—I’d been coding in C# on Fiddler for a decade, and deadlines were few and far between. If all else failed, I’d just avoid working on any especially gnarly code and spend the day handling support requests, updating graphics, or doing other simple and straightforward grunt work from my backlog.
Things are much different on Chrome.
When I first started talking to the Chrome Security team about coming aboard, it was for a role on the Developer Advocacy team. I’d be driving HTTPS adoption across the web and working with big sites to unblock their migrations in any way I could. I’d already been doing the first half of that for fun (delivering talks at conferences like Codemash and Velocity), and I’d previously spent eight years as a Security Program Manager for the Internet Explorer team. I had tons of relevant experience. Easy peasy.
I interviewed for the Developer Advocate role. The hiring committee kicked back my packet and said I should interview as a Technical Program Manager instead.
I interviewed as a Technical Program Manager. The hiring committee kicked back my packet and said I should interview as a Developer Advocate instead.
The Chrome team resolved the deadlock by hiring me as a Senior Software Engineer (SWE).
I was initially very nervous about this, having not written any significant C++ code in over a decade—except for one in-place replacement of IE9’s caching logic which I’d coded as a PM because I couldn’t find a developer to do the work. But eventually I started believing in my own pep talk: “I mean, how hard could it be, right? I’ve been troubleshooting code in web browsers for almost two decades now. I’m not a complete dummy. I’ll ramp up. It’ll be rough, but it’ll work out. Hell, I started writing Fiddler not knowing either C# nor HTTP, and that turned out pretty good. I’ll buy some books and get caught up. There’s no way that Google would have just hired me as a C++ developer without asking me any C++ coding questions if it wasn’t going to all be okay. Right? Right?!?”
I knew I had a lot to learn, and fast, but it took me a while to realize just how much else I didn’t know.
Google’s primary development platform is Linux, an OS that I would install every few years, play with for a day, then forget about. My new laptop was a Mac, a platform I’d used a bit more, but still one for which I was about a twentieth as proficient as I was on Windows. The Chrome Windows team made a half-hearted attempt to get me to join their merry band, but warned me honestly that some of the tooling wasn’t quite as good as it was on Linux and it’d probably be harder for me to get help. So I tried to avoid Windows for the first few months, ordering a puny Windows machine that took around four times longer to build Chrome than my obscenely powerful Linux box (with its 48 logical cores). After a few months, I gave up on trying to avoid Windows and started using it as my primary platform. I was more productive, but incredibly slow builds remained a problem for a few months. Everyone told me to just order another obscenely powerful box to put next to my Linux one, but it felt wrong to have hardware at my desk that collectively cost more than my first car—especially when, at Microsoft, I bought all my own hardware. I eventually mentioned my cost/productivity dilemma to a manager, who noted I was getting paid a Google engineer’s salary and then politely asked me if I was just really terrible at math. I ordered a beastly Windows machine and now my builds scream. (To the extent that any C++ builds can scream, of course. At Telerik, I was horrified when a full build of Fiddler slowed to a full 5 seconds on my puny Windows machine; my typical Chrome build today still takes about 15 minutes.)
Beyond learning different operating systems, I’d never used Google’s apps before (Docs/Sheets/Slides); luckily, I found these easy to pick up, although I still haven’t fully figured out how Google Drive file organization works. Google Docs, in particular, is so good that I’ve pretty much given up on Microsoft Word (which headed downhill after the 2010 version). Google Keep is a low-powered alternative to OneNote (which is, as far as I can tell, banned because it syncs to Microsoft servers) and I haven’t managed to get it to work well for my needs. Google Plus still hasn’t figured out how to support pasting of images via CTRL+V, a baffling limitation for something meant to compete in the space… hell, even Microsoft Yammer supports that, for gods sake. The only real downside to the web apps is that tab/window management on modern browsers is still a very much unsolved problem (but more on that in a bit).
But these speedbumps all pale in comparison to Gmail. Oh, Gmail. As a program manager at Microsoft, pretty much your entire life is in your inbox. After twelve years with Outlook and Exchange, switching to Gmail was a train wreck. “What do you mean, there aren’t folders? How do I mark this message as low priority? Where’s the button to format text with strikethrough? What do you mean, I can’t drag an email to my calendar? What the hell does this Archive thing do? Where’s that message I was just looking at? Hell, where did my Gmail tab even go—it got lost in a pile of sixty other tabs across four top-level Chrome windows. WTH??? How does anyone get anything done?”
Communication and Remote Work
While Telerik had an office in Austin, I didn’t interact with other employees very often, and when I did they were usually in other offices. I thought I had a handle on remote work, but I really didn’t. Working with a remote team on a daily basis is just different.
With communication happening over mail, IRC, Hangouts, bugs, document markup comments, GVC (video conferencing), G+, and discussion lists, it was often hard to figure out which mechanisms to use, let alone which recipients to target. Undocumented pitfalls abounded (many discussion groups were essentially abandoned while others were unexpectedly broad; turning on chat history was deemed a “no-no” for document retention reasons).
It often it took a bit of research to even understand who various communication participants were and how they related to the projects at hand.
After years of email culture at Microsoft, I grew accustomed to a particular style of email, and Google’s is just different. Mail threads were long, with frequent additions of new recipients and many terse remarks. Many times, I’d reply privately to someone on a side thread, with a clarifying question, or suggesting a counterpoint to something they said. The response was often “Hey, this just went to me. Mind adding on the main thread?”
I’m working remotely, with peers around the world, so real-time communication with my team is essential. Some Chrome subteams use Hangouts, but the Security team largely uses IRC.
Now, I’ve been chatting with people online since BBSes were a thing (I’ve got a five digit ICQ number somewhere), but my knowledge of IRC was limited to the fact that it was a common way of taking over suckers’ machines with buffer overflows in the ‘90s. My new teammates tried to explain how to IRC repeatedly: “Oh, it’s easy, you just get this console IRC client. No, no, you don’t run it on your own workstation, that’d be crazy. You wouldn’t have history! You provision a persistent remote VM on a machine in Google’s cloud, then SSH to that, then you run screens and then you run your IRC client in that. Easy peasy.”
Getting onto IRC remained on my “TODO” list for five months before I finally said “F- it”, installed HexChat on my Windows box, disabled automatic sleep, and called it done. It’s worked fairly well.
Google Developer Tooling
When an engineer first joins Google, they start with a week or two of technical training on the Google infrastructure. I’ve worked in software development for nearly two decades, and I’ve never even dreamed of the development environment Google engineers get to use. I felt like Charlie Bucket on his tour of Willa Wonka’s Chocolate Factory—astonished by the amazing and unbelievable goodies available at any turn. The computing infrastructure was something out of Star Trek, the development tools were slick and amazing, the process was jaw-dropping.
While I was doing a “hello world” coding exercise in Google’s environment, a former colleague from the IE team pinged me on Hangouts chat, probably because he’d seen my tweets about feeling like an imposter as a SWE. He sent me a link to click, which I did. Code from Google’s core advertising engine appeared in my browser. Google’s engineers have access to nearly all of the code across the whole company. This alone was astonishing—in contrast, I’d initially joined the IE team so I could get access to the networking code to figure out why the Office Online team’s website wasn’t working. “Neat, I can see everything!” I typed back. “Push the Analyze button” he instructed. I did, and some sort of automated analyzer emitted a report identifying a few dozen performance bugs in the code. “Wow, that’s amazing!” I gushed. “Now, push the Fix button” he instructed. “Uh, this isn’t some sort of security red team exercise, right?” I asked. He assured me that it wasn’t. I pushed the button. The code changed to fix some unnecessary object copies. “Amazing!” I effused. “Click Submit” he instructed. I did, and watched as the system compiled the code in the cloud, determined which tests to run, and ran them. Later that afternoon, an owner of the code in the affected folder typed LGTM (Googlers approve changes by typing the acronym for Looks Good To Me) on the change list I had submitted, and my change was live in production later that day. I was, in a word, gobsmacked. That night, I searched the entire codebase for misuse of an IE cache control token and proposed fixes for the instances I found. I also narcissistically searched for my own name and found a bunch of references to blog posts I’d written about assorted web development topics.
Unfortunately for Chrome Engineers, the introduction to Google’s infrastructure is followed by a major letdown—because Chromium is open-source, the Chrome team itself doesn’t get to take advantage of most of Google’s internal goodies. Development of Chrome instead resembles C++ development at most major companies, albeit with an automatically deployed toolchain and enhancements like a web-based code review tool and some super-useful scripts. The most amazing of these is called bisect-builds, and it allows a developer to very quickly discover what build of Chrome introduced a particular bug. You just give it a “known good” build number and a “known bad” build number and it automatically downloads and runs the minimal number of builds to perform a binary search for the build that introduced a given bug:
Firefox has a similar system, but I’d’ve killed for something like this back when I was reproducing and reducing bugs in IE. While it’s easy to understand how the system functions, it works so well that it feels like magic. Other useful scripts include the presubmit checks that run on each change list before you submit them for code review—they find and flag various style violations and other problems.
Compilation itself typically uses a local compiler; on Windows, we use the MSVC command line compiler from Visual Studio 2015 Update 3, although work is underway to switch over to Clang. Compilation and linking all of Chrome takes quite some time, although on my new beastly dev boxes it’s not too bad. Googlers do have one special perk—we can use Goma (a distributed compiler system that runs on Google’s amazing internal cloud) but I haven’t taken advantage of that so far.
For bug tracking, Chrome recently moved to Monorail, a straightforward web-based bug tracking system. It works fairly well, although it is somewhat more cumbersome than it needs to be and would be much improved with a few tweaks. Monorail is open-source, but I haven’t committed to it myself yet.
For code review, Chrome presently uses Rietveld, a web-based system, but this is slated to change to a different system called Gerrit in the near(ish) future. Like Monorail, it’s pretty straightforward although it would benefit from some minor usability tweaks; I committed one trivial change myself, but the pending migration to a different system means that it isn’t likely to see further improvements.
As an open-source project, Chromium has quite a bit of public documentation for developers, including Design Documents. Unfortunately, Chrome moves so fast that many of the design documents are out-of-date, and it’s not always obvious what’s current and what was replaced long ago. The team does value engineers’ investment in the documents, however, and various efforts are underway to update the documents and reduce Chrome’s overall architectural complexity. I expect these will be ongoing battles forever, just like in any significant active project.
What I’ve Done
“That’s all well and good,” my reader asks, “but what have you done in the last year?”
I Wrote Some Code
My first check in to Chrome landed in February; it was a simple adjustment to limit Public-Key-Pins to 60 days. Assorted other checkins trickled in through the spring before I went on paternity leave. The most fun fix I did cleaned up a tiny UX glitch that sat unnoticed in Chrome for almost a decade; it was mostly interesting because it was a minor thing that I’d tripped over for years, including back in IE. (The root cause was arguably that MSDN documentation about DWM lied; I fixed the bug in Chrome, sent the fix to IE, and asked MSDN to fix their docs).
I fixed a number of minor security bugs, and lately I’ve been working on UX issues related to Chrome’s HTTPS user-experience. Back in 2005, I wrote a blog post complaining about websites using HTTPS incorrectly, and now, just over a decade later, Chrome and Firefox are launching UI changes to warn users when a site is collecting sensitive information on pages which are Not Secure; I’m delighted to have a small part in those changes.
Having written a handful of Internet Explorer Extensions in the past, I was excited to discover the joy of writing Chrome extensions. Chrome extensions are fun, simple, and powerful, and there’s none of the complexity and crashes of COM.
My first and most significant extension is the moarTLS Analyzer– it’s related to my HTTPS work at Google and it’s proven very useful in discovering sites that could improve their security. I blogged about it and the process of developing it last year.
Because I run several different Chrome instances on my PC (and they update daily or weekly), I found myself constantly needing to look up the Chrome version number for bug reports and the like. I wrote a tiny extension that shows the version number in a button on the toolbar (so it’s captured in screenshots too!):
More than once, I spent an hour or so trying to reproduce and reduce a bug that had been filed against Chrome. When I found out the cause, I’d jubilently add my notes to the issue in the Monorail bug tracker, click “Save changes” and discover that someone more familiar with the space had beaten me to the punch and figured it out while I’d had the bug open on my screen. Adding an “Issue has been updated” alert to the bug tracker itself seemed like the right way to go, but it would require some changes that I wasn’t able to commit on my own. So, instead I built an extension that provides such alerts within the page until the feature can be added to the tracker itself.
Each of these extensions was a joy to write.
I Filed Some Bugs
I’m a diligent self-hoster, and I run Chrome Canary builds on all of my devices. I submit crash reports and file bugs with as much information as I can. My proudest moment was in helping narrow down a bizarre and intermittent problem users had with Chrome on Windows 10, where Chrome tabs would crash on every startup until you rebooted the OS. My blog post explains the full story, and encourages others to file bugs as they encounter them.
I Triaged More Bugs
I’ve been developing software for Windows for just over two decades, and inevitably I’ve learned quite a bit about it, including the undocumented bits. That’s given me a leg up in understanding bugs in the Windows code. Some of the most fun include issues in Drag and Drop, like this gem of a bug that means that you can’t drop files from Chrome to most applications in Windows. More meaningful bugs relate to problems with Windows’ Mark-of-the-Web security feature (about which I’ve blogged about several times).
I Took Sheriff Rotations
Google teams have the notion of sheriffs—a rotating assignment that ensures that important tasks (like triaging incoming security bugs) always has a defined owner, without overwhelming any single person. Each Sheriff has a term of ~1 week where they take on additional duties beyond their day-to-day coding, designing, testing, etc.
The Sheriff system has some real benefits—perhaps the most important of which is creating a broad swath of people experienced and qualified in making triage decisions around security vulnerabilities. The alternative is to leave such tasks to a single owner, rapidly increasing their bus factor and thus the risk to the project. (I know this from first-hand experience. After IE8 shipped, I was on my way out the door to join another team. Then IE’s Security PM left, leaving a gaping hole that I felt obliged to stay around to fill. It worked out okay for me and the team, but it was tense all around.)
I’m on two sheriff rotations: Enamel (my subteam) and the broader Chrome Security Sheriff.
The Enamel rotation’s tasks are akin to what I used to do as a Program Manager at Microsoft—triage incoming bugs, respond to questions in the Help Forums, and generally act as a point of contact for my immediate team.
In contrast, the Security Sheriff rotation is more work, and somewhat more exciting. The Security Sheriff’s duties include triaging all bugs of type “Security”, assigning priority, severity, and finding an owner for each. Most security bugs are automatically reported by our fuzzers (a tireless robot army!), but we also get reports from the public and from Chrome team members and Project Zero too.
At Microsoft, incoming security bug reports were first received and evaluated by the Microsoft Security Response Center (MSRC); valid reports were passed along to the IE team after some level of analysis and reproduction was undertaken. In general, all communication was done through MSRC, and the turnaround cycle on bugs was typically on the order of weeks to months.
In contrast, anyone can file a security bug against Chrome, and every week lots of people do. One reason for that is that Chrome has a Vulnerability Rewards program which pays out up to $100K for reports of vulnerabilities in Chrome and Chrome OS. Chrome paid out just under $1M USD in bounties last year. This is an awesome incentive for researchers to responsibly disclose bugs directly to us, and the bounties are much higher than those of nearly any other project.
In his “Hacker Quantified Security” talk at the O’Reilly Security conference, HackerOne CTO and Cofounder Alex Rice showed the following chart of bounty payout size for vulnerabilities when explaining why he was using a Chromebook. Apologies for the blurry photo, but the line at the top shows Chrome OS, with the 90th percentile line miles below as severity rises to Critical:
With a top bounty of $100000 for an exploit or exploit chain that fully compromises a Chromebook, researchers are much more likely to send their bugs to us than to try to find a buyer on the black market.
Bug bounties are great, except when they’re not. Unfortunately, many filers don’t bother to read the Chrome Security FAQ which explains what constitutes a security vulnerability and the great many things that do not. Nearly every week, we have at least one person (and often more) file a bug noting “I can use the Developer Tools to read my own password out of a webpage. Can I have a bounty?” or “If I install malware on my PC, I can see what happens inside Chrome” or variations of these.
Because we take security bug reports very seriously, we often spend a lot of time on what seem like garbage filings to verify that there’s not just some sort of communication problem. This exposes one downside of the sheriff process—the lack of continuity from week to week.
In the fall, we had one bug reporter file a new issue every week that was just a collection of security related terms (XSS! CSRF! UAF! EoP! Dangling Pointer! Script Injection!) lightly wrapped in prose, including screenshots, snippets from websites, console output from developer tools, and the like. Each week, the sheriff would investigate, ask for more information, and engage in a fruitless back and forth with the filer trying to figure out what claim was being made. Eventually I caught on to what was happening and started monitoring the sheriff’s queue, triaging the new findings directly and sparing the sheriff of the week. But even today we still catch folks who lookup old bug reports (usually Won’t Fixed issues), copy/paste the content into new bugs, and file them into the queue. It’s frustrating, but coming from a closed bug database, I’d choose the openness of the Chrome bug database every time.
Getting ready for my first Sherriff rotation, I started watching the incoming queue a few months earlier and felt ready for my first rotation in September. Day One was quiet, with a few small issues found by fuzzers and one or two junk reports from the public which I triaged away with pointers to the “Why isn’t a vulnerability” entries in the Security FAQ. I spent the rest of the day writing a fix for a lower-priority security bug that had been filed a month before. A pretty successful day, I thought.
Day Two was more interesting. Scanning the queue, I saw a few more fuzzer issues and one external report whose text started with “Here is a Chrome OS exploit chain.” The report was about two pages long, and had a forty-two page PDF attachment explaining the four exploits the finder had used to take over a fully-patched Chromebook.
Watching Luke’s X-wing take out the Death Star in Star Wars was no more exciting than reading the PDF’s tale of how a single byte memory overwrite in the DNS resolver code could weave its way through the many-layered security features of the Chromebook and achieve a full compromise. It was like the most amazing magic trick you’ve ever seen.
I hopped over to IRC. “So, do we see full compromises of Chrome OS every week?” I asked innocently.
“No. Why?” came the reply from several corners. I pasted in the bug link and a few moments later the replies started flowing in “OMG. Amazing!” Even guys from Project Zero were impressed, and they’re magicians who build exploits like this (usually for other products) all the time. The researcher had found one small bug and a variety of neglected components that were thought to be unreachable and put together a deadly chain.
The first patches were out for code review that evening, and by the next day, we’d reached out to the open-source owner of the DNS component with the 1-byte overwrite bug so he could release patches for the other projects using his code. Within a few days, fixes to other components landed and had been ported to all of the supported versions of Chrome OS. Two weeks later, the Chrome Vulnerability rewards team added the reward-100000 tag, the only bug so far to be so marked. Four weeks after that, I had to hold my tongue when Alex mentioned that “no one’s ever claimed that $100000 bounty” during his “Hacker Quantified Security” talk. Just under 90 days from filing, the bug was unrestricted and made available for public viewing.
The remainder of my first Sheriff rotation was considerably less exciting, although still interesting. I spent some time looking through the components the researcher had abused in his exploit chain and filed a few bugs. Ultimately, the most risky component he used was removed entirely.
Outreach and Blogging
Beyond working on the Enamel team (focused on Chrome’s security UI surface), I also work on the “MoarTLS” project, designed to help encourage and assist the web as a whole in moving to HTTPS. This takes a number of forms—I help maintain the HTTPS on Top Sites Report Card, I do consultations and HTTPS Audits with major sites as they enable HTTPS on their sites. I discover, reduce, and file bugs on Chrome’s and other browsers’ support of features like Upgrade-Insecure-Requests. I publish a running list of articles on why and how sites should enable TLS. I hassle teams all over Google (and the web in general) to enable HTTPS on every single hyperlink they emit. I responsibly disclosed security bugs in a number of products and sites, including a vulnerability in Hillary Clinton’s fundraising emails. I worked to send a notification to many many many thousands of sites collecting user information non-securely, warning them of the UI changes in Chrome 56.
When I applied to Google for the Developer Advocate role, I expected I’d be delivering public talks constantly, but as a SWE I’ve only given a few talks, including my Migrating to HTTPS talk at the first O’Reilly Security Conference. I had a lot of fun at that conference, catching up with old friends from the security community (mostly ex-Microsofties). I also went to my first Chrome Dev Summit, where I didn’t have a public talk (my colleagues did) but I did get to talk to some major companies about deploying HTTPS.
I also blogged quite a bit. At Microsoft, I started blogging because I got tired of repeating myself, and because our Exchange server and document retention policies had started making it hard or impossible to find old responses—I figured “Well, if I publish everything on the web, Google will find it, and Internet Archive will back it up.”
I’ve kept blogging since leaving Microsoft, and I’m happy that I have even though my reader count numbers are much lower than they were at Microsoft. I’ve managed to mostly avoid trouble, although my posts are not entirely uncontroversial. At Microsoft, they wouldn’t let me publish this post (because it was too frank); in my first month at Google, I got a phone call at home (during the first portion of my paternity leave) from a Google Director complaining that I’d written something that was too harsh about a change Microsoft had made. But for the most part, my blogging seems not to ruffle too many feathers.
- Food at Google is generally really good; I’m at a satellite office in Austin, so the selection is much smaller than on the main campuses, but the rotating menu is fairly broad and always has at least three major options. And the breakfasts! I gained about 15 pounds in my first few months, but my pneumonia took it off and I’ve restrained my intake since I came back.
- At Microsoft, I always sneered at companies offering free food (“I’m an adult professional. I can pay for my lunch.”), but it’s definitely convenient to not have to hassle with payments. And until the government closes the loophole, it’s a way to increase employees’ compensation without getting taxed.
- For the first three months, I was impressed and slightly annoyed that all of the snack options in Google’s micro-kitchens are healthy (e.g. fruit)—probably a good thing, since I sit about twenty feet from one. Then I saw someone open a drawer and pull out some M&Ms, and I learned the secret—all of the junk food is in drawers. The selection is impressive and ranges from the popular to the high end.
- Google makes heavy use of the “open-office concept.” I think this makes sense for some teams, but it’s not at all awesome for me. I’d gladly take a 10% salary cut for a private office. I doubt I’m alone.
- Coworkers at Google range from very smart to insanely off-the-scales-smart. Yet, almost all of them are humble, approachable, and kind.
- Google, like Microsoft, offers gift matching for charities. This is an awesome perk, and one I aim to max out every year. I’m awed by people who go far beyond that.
- Window Management – I mentioned earlier that one downside of web-based tools is that it’s hard to even find the right tab when I’ve got dozens of open tabs that I’m flipping between. The Quick Tabs extension is one great mitigation; it shows your tabs in a searchable, most-recently-used list in a convenient dropdown:
Another trick that I learned just this month is that you can instruct Chrome to open a site in “App” mode, where it runs in its own top-level window (with no other tabs), showing the site’s icon as the icon in the Windows taskbar. It’s easy:
On Windows, run chrome.exe –app=https://mail.google.com
While on OS X, run open -n -b com.google.Chrome –args –app=’https://news.google.com‘
Tip: The easy way to create a shortcut to the current page in app mode is to click the Chrome Menu > More Tools > Create Shortcut and tick the Open as Window checkbox.
I now have SlickRun MagicWords set up for mail, calendar, and my other critical applications.
So, that’s it for year one @Google. I’m feeling lucky as I head into year two!