Objectively, the best cat

On March 15, 2009 we put my cat Jill (Jillian, Jilly, Jilly Bean, Jillkin) to sleep at an emergency vet. We didn’t know for sure why her kidneys failed, but it was sudden and unexpected. At times Jill could be grumpy and stubborn, but she was usually friendly, curious, and she was always very loved. I lamented at the time that it was “never going to be time to feed the monster again,” and we would always miss her. I didn’t expect to have another pet for a long time.

But then life intervened.

A Friday night two weeks later, my fiancée and I were waiting for a movie to start at the Regal theater next to the Crossroads mall in Bellevue. We saw that the Petco store adjacent was hosting some cats for adoption. We had no plans to get a new pet, especially not so soon, but we took a look anyway. I liked the look of a big cat (“Yoda”) snoring lazily in his cage, but Jane preferred a tiny black cat with a shaved belly and a curious face. After looking for a few minutes, we left for our movie.

In our seats, waiting for the previews to end, Jane remarked “She’s black and white. We could call her Domino.” and I spent the rest of subsequent movie impatiently waiting for it to end: the name was so perfect I couldn’t imagine living in a world where it wasn’t hers and she ours. Unfortunately, the pet store closed before our movie ended, but we were back bright and early the next day to adopt her.

A street kitten, under-weight, under-furred and under-age (she was allegedly a year old but plainly was only a few months) she took to her new home immediately; her fur grew in and she eventually stopped chasing her own tail.

Domino was playful and alert, and always pushing boundaries. We let her out a few times, but she kept getting hurt, one time returning with teeth marks on both sides of her little ribs suggesting that a dog had gotten its jaws around her entire body. After that, she was confined to our new deck– as soon as she stepped off, we’d pick her up and put her back on it. She clued in and began keeping a single paw arched back to the bottom step of the deck, her other three triumphantly planted in the grass. It was soon a common sight, one so comical that it was ingrained in my memory to the point that I never bothered to take a picture. I’d’ve never believed if I hadn’t seen it.

Unlike many cats who are standoffish or aloof, Domino was empathic– if we ever cried or even sounded sad, she’d immediately appear and start rubbing up against us, a furry friend without advice, just love. We bought her a fancy bed, but she never wanted to sleep in it– she always snuggled up to us instead.

When we moved from Seattle to Austin in 2012, we decided to drive rather than fly to make the trip easier on her– drugging her for a flight seemed cruel and unnecessary. A day into the trip, we realized what an error we’d made… she spent the first two days of the trip hiding in the back of the car, climbing up to the dashboard to peer out, then, terrified at America whizzing by at 75mph, jumping into the back or a lap to lay back down. This process repeated every few minutes until she’d fall asleep for short naps.

By the third day, she was plainly miserable; when we finally rolled into our new driveway, the first order of business was to let her out to explore her new back yard. Her backyard explorations were rare after that– we’d seen coyotes and snakes in the neighborhood, and between those threats and the blazing Texas summers, she was happy enough as an indoor cat. An occasional foray into the backyard to nibble some grass or roll around on the patio seemed to keep her happy.

We grew up together.

In 2013, we nervously brought our first son home, worried about how Domino might react. She kept her distance. In fact, if anything, she seemed terrified, avoiding both his crib and her food bowl. Her odd behavior went on for a few days until I finally figured out what had happened.

Tied to newborn Noah’s car seat floated a congratulatory Mylar balloon, and having never seen one, Domino was terrified of it. I foolishly made the mistake of trying to carry her over to it to show her that it was nothing to be afraid of. She clawed her way out of my arms, over my shoulder, and down my back, peeing the whole way. Later that afternoon, Noah peed all over me as I changed his diaper, and I joked that it was “pee on Dad day.”

After we got rid of the balloon, Domino was the perfect big sister, tolerating a toddler who would occasionally pull her tail or chase her around, never lifting a paw against him no matter how rough he wanted to play. Her patience continued with our second son, and it paid off– as they grew up, the boys were always happy to snuggle, pet her, and build increasingly elaborate cat forts out of whatever cardboard boxes they could get their hands on. She humored them and snuggled them every chance she got.

But she snuggled with all of us, including sleepy mom and dad.

When my wife asked me to move out a year ago, there wasn’t much debate — our cat was coming with me. Domino kept me company through the roughest pandemic year alone in a new home. Two or three other cats in my new neighborhood roamed free and Domino noisily defended her turf through the windows. As the weather cooled down, she started keeping me company on the porch, and then demanding to go out between dinner and bedtime. A month ago, she went out late and didn’t come back until first light, banging on my bedroom window and demanding breakfast.

So I didn’t think too much about it as I acceded to her demands and let her out the back door at around 11pm the night of November 9th, 2020. “Be good. Don’t get in trouble!” I admonished as always. At 12:30am, I checked for her, but she wasn’t around, so I reluctantly went to bed.

I woke up a few times in the night and checked for her, but she wasn’t around. When I woke up at 8am, I was officially worried. By the afternoon, I was papering the neighborhood with posters, updating the contact info for her microchip, and learning about all of the web resources for reuniting lost pets (some of which are pretty cool).

Going to bed alone that night with her still at large was hard. And it hasn’t gotten any easier over the last week, no matter how many posters I’ve put up, or web postings I’ve scoured.

o

Sounds like a great cat, huh? Well, you haven’t heard the real story yet.

Back when Noah was four, his right knee started swelling. We assumed it was bruising from a fall, but he didn’t complain about pain and the swelling didn’t go down in weeks. A series of doctors’ visits, scans, and a fully-anesthetized MRI led to a diagnosis of PVNS, a non-cancerous tumor that nonetheless would require major surgery on both the front and back of Noah’s knee. The expert doctor at Dell Children’s Hospital assured us that it was as clear a case as he’d ever seen, and we needed to do the surgery soon to ensure that Noah’s leg would grow normally. I pushed for doing the operation right away, but Jane suggested we do it later in the fall so that Noah would be able to recover without the temptations of the swimming pool and other summer fun.

Around this time we noticed Domino had developed a large growth (about the size of a large marble) in her neck, a recurrence of a problem (keratin buildup?) she’d had previously. We hoped she’d get better on her own again this time, as we had bigger worries now.

Throughout August and September, Noah would watch cartoons on the couch, and Domino would lay her swollen chin on his swollen knee. When it came time to go back to the hospital for Noah’s operation, the doctor was gobsmacked– the mass was completely gone. He had no explanation– he’d never seen anything like it. The doctor asked us to do an MRI in an attempt to figure out what had happened, but after confirming that Noah wouldn’t need surgery and an MRI wouldn’t change anything for his diagnosis, we declined. We were overwhelmed with relief. Domino’s growth disappeared too.

I’m not saying my cat cured my son (and vice-versa), but our best doctors don’t seem to be in a position to say that’s not what happened either.

I miss my friend.

Simply Making Simple Fixes Simple for Chromium

Google recently introduced a cool web-based editing tool for Chromium source code, a very stripped down version of the Willy Wonka tooling Googlers get to use for non-Chromium projects.

I’ve used this tool to submit two trivial change lists (CLs, aka PRs) to Chromium, but I was curious about whether this new feature would work for something not completely trivial.

Let’s try it.

First, find an Available GoodFirstBug. Say, this one to remove an expired flag. Let’s search for the flag. Not too many hits:

Let’s click into the first hit to open it. Click Edit Code at the top right:

A web-based code editor opens:

Remove the &kOverlayNewLayout line and the function that references it later in the file. Use the navigation panel on the left to open the .h file corresponding and remove the declaration of the feature.

Open about_flags.cc and remove the lines referencing the flag:

Note that this will create orphan variables for the flag_descriptions that we need to go delete too. Go delete those from the flag_descriptions.cc and flag_descriptions.h files.

At this point, your Pending Changes pane contains all of the files with all of the hits that came up in the search.

If we think we’re done, we can go hit the Create Change button to actually create the change for review.

However, suspiciously, we haven’t actually found anything that respects the Feature we removed. If we instead search for OverlayNewLayout we see an additional hit appears:

Having worked on stuff like this before, I know that there’s some magical marshalling of feature values from C++ into Java-land (Android’s UI layer). So we need to go eliminate the use of the Java-side feature. Search for OVERLAY_NEW_LAYOUT.

Oof. There are 8 more files: 7 are Java and one is a similarly-named variable in a CC file. (It’s a variable in C++ code that set based on the feature’s state from the Java-side based on a resource ID. Sneaky sneaky).

Worse, one of the early hits is this one:

… So we also need to remove this function entirely and update all of its callers to not check it. Fortunately, only two more files have references, but clearing up some of those implies that we might have resources that will become unused:

When we’re done pulling all the threads, the change has grown pretty big. Click Create change to put all of our changes together into a change list:

In the dialog that appears, enter a description of the changelist and click Create:

After a minute or so, a notice will appear:

Sadly, it’s not a hyperlink, but if you look down at the bottom of the editor, you’ll find a direct link to the Chromium Code Review site, where you can find the created change. After the CL is prepared, you can add reviewers for a code review, ask anyone with Try Permissions to run your change through the bots, and otherwise follow the process for contributing to Chromium.

If you need to modify a change after you’ve created a CL, you can do so using the Edit link on the Code Review Site. Push the button:

… then use the new commands down near the file list to modify (or add) files in the CL:

A few observations from this process:

1. A lot of Chromium GoodFirstBug‘s are much more complicated than they appear. This one required a followup CL to remove 400 more lines!

2. This new web-editing workflow is great for trivial fixes, but very limited. Most importantly, if you need to use any advanced tools (e.g. source formatting, manual code generators, presubmit checks, etc), you’ll need to download your CL from the code review site to use the tools on it.

3. Multi-file changes are challenging. It would be awesome to be able to just click “Edit Code” from the Chrome Source Viewer on multiple files and have them all open in the same “in-progress” change set, but it doesn’t seem to work like that (yet?). You must manually find all files after the first one in the editor’s sidebar.

I’m excited to see how Chromium’s Edit Code feature evolves.

-Eric

Client Certificates and Logout

Back in May, I wrote about Client Certificate Authentication, a mechanism that allows websites to strongly validate the identity of their visitors using certificates presented by the visitor’s browser.

One significant limitation for client certificate authentication is that there is no standards-based mechanism for a user to “log out” of a site that uses that auth mechanism.

Background on Auth Caching

When a user picks a client certificate to authenticate to a server, that choice (an {Origin, ClientCert} tuple) is remembered for the lifetime of the browsing session. Any subsequent CertificateRequest challenge from that origin will automatically be answered by the browser using the previously-selected certificate.

The Auth Caching behavior is necessary for at least two reasons:

  • User-Experience: Client Cert authentication is a connection-oriented authentication protocol– each connection to the server can challenge the user for a client certificate, and the browser re-prompting the user for a new certificate for each new connection would be incredibly annoying.
  • Predictability: Even worse, if the user accidentally picked a different certificate for subsequent connections, the browser would end up in the very confusing state whereby requests might exhibit different behavior based on which reused connection happened to be pulled from the browser’s connection reuse pool when each request is sent.

Unfortunately, tying the Auth cache’s certificate selection to the browsing session means that if a user has multiple certificates, selecting a different certificate for subsequent use on the site (either because the user chose the “wrong” certificate the first time, or because they would like to “switch their identity”) is not generally straightforward.

The only option that works reliably across browsers is to restart the browser entirely (closing all of your other tabs and web applications).

Log Out Approaches

To address the user desire to “Log out without restarting,” Internet Explorer offered a “Clear SSL State” button buried inside the Internet Control Panel:

… and websites could invoke the same functionality with a web-accessible JavaScript call:

 document.execCommand("ClearAuthenticationCache", false);

Notably, the button and the API both clear all Client Certificates for all sites running in the current session, meaning users cannot log out of just a single site using the API.

The ClearAuthenticationCache API (a blunt hammer with many side-effects) was never standardized, so it was not supported in other browsers.

Instead, the standardized Clear Site Data (MDN) offers a mechanism for websites to programmatically clear per-origin data including Auth Caches (both HTTP Authentication and Client Certificate Authentication). Unlike the IE ClearAuthenticationCache call, Clear-Site-Data is per-origin, so calling it does not blow every site’s data away.

Unfortunately, it’s not very useful for logging users out of Client Cert auth’d sites, for a few reasons:

  1. Websites cannot clear only Auth caches– the Auth caches are instead cleared when the website indicates that it would like cookies cleared.
  2. Websites cannot clear only Session data– when you request that cookies be cleared, it clears not only the Auth caches and session cookies, but also clears the origin’s persistent cookies.
  3. Clearing the Auth Cache alone isn’t enough– the browser also must clear/close any authenticated connections to the origin from the reuse pool, because otherwise a request might reuse a connection previously authenticated with a client certificate.

    The complexity and side-effects of this requirement mean that Chromium has not implemented the clearing of the Auth cache when Clear-Site-Data is called.

Chromium also does not offer any UI mechanism to clear auth caches, meaning that users must restart their browser to clear the auth cache and select different credentials.

Sessions vs. Profiles

Beyond the Clear SSL State button and ClearAuthenticationCache API, Internet Explorer offered users a New Session command that allows a user to open a new browser window that runs in a different Session than the original — the new browser window’s Session does not share Session state (the auth cache, session cookies, sessionStorage) with the original Session. The new Session does however share persistent state (persistent cookies, localStorage, the HTTP Cache, etc), so the two Sessions are not fully isolated from one another.

In contrast, Chromium does not offer such a “New Session” feature. Users can have a maximum of two active Sessions per profile– their “main” Session, and one Private Mode (“Incognito”) session. If a user wishes to have more than two Sessions active at a time, they must load the site using multiple Browser Profiles.

By many measures, this design is “better”– using a different Profile means that all state is isolated between the instances, so you won’t have localStorage, indexedDB data, persistent cookies, or HTTP cache information cross-contaminating your different browser sessions. You’ll also have a handy Profile Avatar in your browser UI to remind you of which account you’re supposed to be using for each window.

However, to an end-user, using different profiles might be less convenient, because profiles isolate all state, not just web platform state. That means that Favorites, Extensions, History and other useful application state are not shared between the two Profiles.

Can anything be done?

There are a variety of investments we might consider to address this use case:

  1. Enhance Chromium’s Clear Site Data feature to match the spec. (The fact that this got punted in Chromium suggests that it’s a hard problem)
  2. Offer the user an explicit mechanism in the browser UI to clear an origin’s Auth cache and empty its connection-reuse pool. (Users will have to learn the new mechanic)
  3. Try to be clever and clear the Auth cache when the user closes the last top-level browser tab to an origin. This is not quite as crazy as it sounds; there have been some discussions of this approach to address the problem of undead session cookies. (Tricky to implement this without undesirable side effects)
  4. Implement something like Firefox Containers to allow one Browser Profile to contain multiple isolated Web Platform-level sessions/profiles. (Expensive, Complicated, Users will have to learn the new mechanic)
  5. The clever and simple solution for this longstanding problem that I simply haven’t thought of yet.

Given the relative obscurity of this scenario, I’m hoping #5 turns up. :)

-Eric

Bonus Content: PINs

One additional behavioral change between IE/Edge Legacy and the new Edge concerns SmartCards whose private keys are protected by a PIN.

Windows typically caches a user-entered PIN on a per-process basis.

In the old model, each tab’s content process contained its own instance of the network stack which was responsible for selection of the certificate (and obtaining the user’s PIN, if required). This means each individual tab would prompt the user for their SmartCard PIN.

In the new model, it’s the browser process that accesses the SmartCard to obtain a selected client certificate, which means that the user is prompted for their PIN only once per browsing session.

Web “Sessions” in Private Mode

I’ve written about Private Browsing Mode a lot previously, and I’ve written a bit about the behavior of “Session restore” previously, but one topic I haven’t covered is how “Sessions” work while in Private mode.

Session Sharing

Historically, one of the top-reported Private Mode issues was that users unexpectedly found that opening a new Private window showed that they were already logged into some site they had used earlier.

From one example issue: The typical explanation when users report issues like this is that the user has multiple Incognito windows open and does not realize that fact. Incognito windows (perhaps surprisingly) are not isolated from one another, and closing one Incognito window does not end the Incognito session. The “background” Incognito window hangs on to all of the login tokens and when you open a new Incognito window, all of those tokens remain available in that new window. Only when all Incognito windows are closed is the session ended and the login tokens expired.

To address this, back in 2018 Chromium implemented an Incognito Window Counter to help the user understand when there are multiple windows in a single Incognito Browsing Session:

If there’s a number in parentheses after the “InPrivate” text, it means that there are more InPrivate windows open in the background. You’ll need to close all of them to end the InPrivate session.

Can We Isolate Each Private Window?

> Cant we just provide a real InPrivate window every time one is
> opened regardless of whether one is open or not already?? 

Offering “N-isolated InPrivate Windows” (Issue 1024731) is an occasionally-requested feature, but satisfying it would have some tricky subtleties.

In theory, yes, software’s just bits and we can code them any way we want—IE8 exposed an explicit “New Session” command, for instance. So we could make it such that every invocation of “New InPrivate Window” creates a new and isolated Web Session.

Implementing “N-isolated InPrivate windows” has two significant hurdles:

  1. Code – Chromium is presently designed with the idea that there’s a maximum of one InPrivate session per profile. We’d have to carefully trace through every use of the active Profile to create new partitions supporting “N-isolated InPrivate Sessions”, and figure out how UX features like tearing tabs out from an isolated window ought to behave.
  2. UX – Users might not really want every InPrivate window to be isolated from every other InPrivate window. For instance, if you have a website InPrivate and it opens a popup, you probably need that popup to be in the same Session as the parent window, or the flow (script access, any login cookies, etc) is going to break. In a world of “N-isolated InPrivate Sessions”, if you don’t bind the Session tightly to the window, you need to find some way to allow the user to distinguish which windows belong to which Sessions (to ensure, for instance, that closing the last InPrivate window in that isolated Session cleans up exactly the expected state).

Now, in a world of tabbed browsing where most site-initiated popups are automatically created as new tabs in the same window, perhaps we could just punt on the hard UX challenges and decree that every top-level InPrivate Window is isolated to only itself (and e.g. forbid tearing tabs out of that window). It’s hard to say whether the total cost of such a feature would justify the user-perceived benefit.


Rather than using InPrivate for all scenarios, you might also choose to create extra “ephemeral” profiles that throw away all cookies/cache/credentials/etc every time they’re closed, or you can use the existing-by-default “Guest” account for the same purpose.

-E

PS: Long ago, I built a Web Sessions test page in case you’d like to explore the behavior described in this post.

Images Keeping You Awake?

A Microsoft Edge user recently complained that her screensaver was no longer activating after the expected delay, and she thought that this might be related to her browser.

It was, in a way.

To troubleshoot issues where your PC’s screensaver and power-saving options aren’t working correctly, you can use the Power Config command line tool. From an command prompt running as Administrator, run powercfg /requests to see the list of applications requesting that your device keep the display active.

In this case, we see that MSEdge.exe and Teams.exe have active Display Requests. These requests tell Windows that the application wants the screen to remain active and unlocked, usually to display important content (in this case, an ongoing video call and a video playing on a visible tab in Edge, respectively).

After ending my Teams call, only the Edge lock remains. But I don’t think I’m playing any video. What’s up with that?

In Chrome or Edge, you can visit chrome://media-internals to see the list of video content that’s currently playing:

In this case, we see that my Twitter feed is playing back a video. Using the F12 Developer Tools to investigate, we see that (for performance reasons) Twitter serves their animated “GIFs” using MP4 video files:

… and this playback is what’s preventing my screen from going to sleep. Unfortunately, at present there does not appear to be any mechanism for a video tag to indicate that it does not contain important content that requires the display remain active. One proposal is that, for performance reasons, browsers should allow image elements to use video sources <img src="a.mp4" /> and render them as they render animated GIF/PNG today (e.g. no playback controls, allow display to sleep).

Unfortunately, for an end-user there’s not a good workaround for the problem, short of directing Windows to ignore Display Requests from the browser entirely.

-Eric

PS: Beyond media playback, another browser feature that can keep your screen alive is the Screen Wake Lock API.

File Downloads will allow the screen to turn off, but Chromium will request that the system itself stay awake so that the download will not be interrupted in the middle:

WebRTC connections and file uploads also set an Execution wake lock.

Debugging Browsers – Tools and Techniques

Last update: Sept 30, 2020

Earlier this year, I shared a post on how you can become an expert on web browsers from the comfort of your desk… or anywhere else you have an internet connection. In that post, I mostly covered how to search through the source, review issue reports, and find design documentation. I also provided a long list of browser experts you might consider following on Twitter.

In today’s post, I’d like to give a quick summary of some of the tools and techniques I use for diagnosing browser problems.

The Importance of Observation

Specs lie. Code misleads. Observation reveals what’s actually going on– not what the PM designed, or the Dev intended.

In many cases, the fastest route to troubleshoot problems is to observe exactly what is happening on the network, disk, or screen and only then start looking at code and specs to figure out why.

Built-in Tools

The F12 Developer Tools (just hit F12) are tremendously useful for determining why a given website behaves in a certain way. In many cases, the DevTools Console will flag an observed problem with a helpful error message. I don’t know of a great tutorial, but there are likely some on YouTube.

Chromium’s NetLogs (see chrome://net-export ) contain tons of detailed information about almost every aspect of networking, as well as other useful diagnostic data (the user’s enabled extensions, field trial experimental settings, etc). You can analyze NetLogs using a variety of free tools.

Chromium Tracing (see chrome://tracing ) allows you to diagnose performance issues in Chromium-based browsers using extremely in-depth tracelog data. Analysis of these logs isn’t for the faint-of-heart.

Chromium Logging (using the --enable-logging command line argument) is useful in diagnosing a number of internal issues in Chromium subsystems. Collecting and analyzing these logs is non-trivial, but is sometimes the fastest way to root-cause tricky problems. See the following resources:

Chromium includes over 20 Internals Pages that allow you to view detailed information about media playback, data sync, and other features. Visit chrome://chrome-urls and search for -internals to see the list.

Browser Extensions

The VisBug Chrome Extension – Easily manipulate any page layout, directly in your browser.

postMessage Debugger – This extension prints messages sent with postMessage to the console.

Extension source viewer – View the source of browser extensions directly from the Web Store listing.

Cross-Platform Tools

WireShark allows packet-level analysis of network traffic. This can be useful in rare cases where a network bug depends on the exact packet size and timing.

The Fiddler Web Debugger allows import of NetLog and HAR traffic captures, and enables losslessly capturing requests and responses from any browser. While Fiddler Classic is (effectively) Windows-only, Fiddler Everywhere runs on Windows, Mac, and Linux.

Windows Tools

If you need to watch file or registry key creation/read/write/deletion, or thread/process creations and exits, then Sysinternals Process Monitor has got your back. For instance, this helped us easily root cause a bug where launching a Chromium-based browser would delete a file owned by Chrome.

If you want to explore information about process sandboxing, startup parameters, Job limits, etc, then Sysinternals Process Explorer is the tool to use. For instance, this helped us track down a problem where a browser window was unexpectedly appearing. The user simply closed all browser instances, then waited for an unexpected browser to appear. Then, they looked at the process tree to see what application started it. For instance, in this case, Edge was launched by sr.exe:

If you need to debug a scenario involving drag/drop or copy/paste, you can use ClipSpy (binary only) or NirSoft InsideClipboard.

Bisecting

Bisecting is the process of making a repeated set of observations to determine the build in which a problem appeared (or disappeared). From there, you can easily assign bugs to the right owners for rapid fixes.

See the Bisect Regressions section of this post for details on how to use Chromium’s bisect-builds.py script (which does not require you to build Chromium or download all of its tools and source code) to bisect problems. Here’s another bisection case-study.

I’m sure there are a hundred great tools I’ve omitted. This post will grow over time. If you’ve got a suggestion for a great diagnostic tool, share with us!

-Eric

Local Data Encryption in Chromium

Back in February, I wrote about browser password managers and mentioned that it’s important to understand the threat model when deciding how to implement features and their security protections.

Generally speaking, “keeping secrets from yourself” is a fool’s errand, so it’s a waste of time and effort to encrypt data if you have to store the decryption key in a place that’s accessible to the attacker. That’s one reason why physically local attacks and machines infected with malware are generally outside the browser’s threat model: if an attacker has access to the keys, using encryption isn’t going to protect your data.

Web browsers store a variety of highly sensitive data, including passwords and cookies (often containing authentication tokens functionally equivalent to passwords). When storing this extra-sensitive data, Chromium encrypts it using AES256, storing the encryption key in an OS storage area. This feature is called local data encryption. Not all of the browser’s data stores use encryption– for instance, the browser cache does not. If your device is at risk of theft, you should be using your OS’ full-disk encryption feature, e.g. BitLocker on Windows.

The profile’s encryption key is protected by OSCrypt: On Windows, the OS Storage area is DPAPI; on Mac, it’s the Keychain; on Linux, it’s Gnome Keyring or KWallet.

Notably, all of these storage areas encrypt the AES256 key using a key accessible to (some or all1) processes running as the user– this means that if your PC is infected with malware, the bad guys can get decrypted access to the browser’s storage areas.

However, that’s not to say that Local Data Encryption is entirely without value– for instance, I recently came across a misconfigured web server that allowed any visitor to explore the server owner’s profile (e.g. c:\users\sally), including their Chrome profile folder. Because the browser key in the profile is encrypted using a key stored outside of the Chrome profile, their most sensitive data remained encrypted.

Similarly, if a laptop isn’t protected with Full Disk Encryption, Local Data Encryption will make a thief’s life harder.

Tradeoffs?

Okay, so Local Data Encryption might be useful. What are the downsides?

The obvious tradeoff is simple and mild: There’s always a performance cost to encrypting and decrypting data. However, AES256 is extremely fast (>1GB/sec) on modern hardware, and the data size of cookies and credentials is relatively small.

The bigger risk is complexity: If something goes wrong with either of the keys (the browser’s key that encrypts the data or the OS’s key that encrypts the browser’s key), then the user’s cookie and credential data will be unrecoverable. The user will be forced to re-log into every website and re-store all of the credentials in their password manager (or recover their credentials from the cloud using the browser’s sync feature).

Unfortunately, I seem to be a magnet for such problems.

On Mac, Edge recently had a problem where the browser would fail to get the browser key from the OS keychain. The browser would offer to wipe the keychain (losing all of your data), but ignoring the error message and restarting would typically correct the problem. A fix for that bug was recently issued.

On Windows, DPAPI failures are typically silent– your data disappears with nary a message box.

When I first rejoined Microsoft in 2018, a bug in AAD meant that my OS DPAPI key was corrupted, causing Chromium-based browsers to cause lsass to spin a CPU core forever when they launched. Troubleshooting this problem required months of effort.

More recently, we’ve heard from some users on Windows 10 that Edge and Chrome forget their data frequently (and similar effects are seen in other DPAPI-using applications).

Users in this state who visit chrome://histograms/OSCrypt in Chrome or Edge in the browser session where they first notice their sensitive data has gone missing will see an entry inside OSCrypt.Win.KeyDecryptionError with a value of -2146893813 (NTE_BAD_KEY_STATE), indicating that the OS API was unable to use the currently logged-in user’s credentials to decrypt the browser’s encryption key:

Fortunately (for us, not for him), this problem hit one of the best engineers in the world, and he was able to develop a solid theory of the root cause of the problem. If you find your system in this state, try running the following command in PowerShell:

Get-ScheduledTask | foreach { If (([xml](Export-ScheduledTask -TaskName $_.TaskName -TaskPath $_.TaskPath)).GetElementsByTagName("LogonType").'#text' -eq "S4U") { $_.TaskName } }

This will list off any scheduled tasks using the S4U feature suspected of causing the incorrect DPAPI credentials:

The Windows Crypto team is actively investigating this issue, and hopefully we’ll have a fix soon.

-Eric

1 The question of which processes can ask the OS to decrypt the browser’s key is a somewhat interesting one. On Windows, Chromium’s use of DPAPI’s CryptProtectData allows any process running as the user to make the request; there’s no attempt to use additional entropy to do “better” encryption, largely because there’s nowhere safe to store that additional entropy. On modern Windows, there are some other mechanisms that might provide somewhat more isolation than raw CryptProtectData, but full-trust malware is always going to be able to find a way to get at the data.

On Mac, the Keychain protection restricts access to data such that it’s not accessible to every process running as the user, but this doesn’t mean the data is immune from malware. Malware must instead use Chrome as a sock-puppet, having it perform all of the data decryption tasks, driving it via extensibility interfaces or other mechanisms.

The overall threat model against local attackers is further complicated by the mechanisms and constraints of process isolation: for instance, if an Admin process and dump the memory of a user-level process, or inject threads into that process, malware can also steal the data after the browser has decrypted it.

Mobile platforms (iOS/Android) tend to have the strongest story here, with more robust process isolation, code-signing requirements, hardware-backed secure enclaves, etc.

Web Debugging: Watching Element Changes

Recently, I was debugging a regression where I wanted to watch change’s in an element’s property at runtime. Specifically, I wanted to watch the URL change when I select different colors in Tesla’s customizer. By using the Inspect Element tool, I can find the relevant image in the tree, and then when I pick a different color in the page, the Developer Tools briefly highlight the changes to the image’s attributes:

Unfortunately, you might notice that the value in the xlink:href property contains a ... in the middle of it, making it difficult to see what’s changed. I noted that the context menu offers a handy “Break on” submenu to break execution whenever the node changes:

…but I lamented that there’s no Watch attribute command to log the changing URLs to the console. Mozillian April King offered a helpful snippet that provides this functionality.

After selecting the image (which points Console variable $0 at the element), type the following in the Console:

new MutationObserver(i => console.log(i[0].target.attributes['xlink:href'])).observe($0,
{ attributes: ['xlink:href']});

This simple snippet creates a MutationObserver to watch the selected element’s xlink:href attribute, and every time it changes, the callback writes the current attribute value to the console:

Cool, huh?

Thanks, April!

-Eric

Browser Memory Limits

Web browsers are notorious for being memory hogs, but this can be a bit misleading– in most cases, the memory used by the loaded pages accounts for the majority of memory consumption.

Unfortunately, some pages are not very good stewards of the system’s memory. One particularly common problem is memory leaks– a site establishes a fetch() connection to retrieve data from an endless stream of data coming from some webservice, then subsequently tries to hold onto the ever-growing response data forever.

Sandbox Limits

In Chromium-based browsers on Windows1 and Linux, a sandboxed 64-bit process’ memory consumption is bounded by a limit on the Windows Job object holding the process. For the renderer processes that load pages and run JavaScript, the limit was 4gb back in 2017 but now it can be as high as 16gb :

int64_t physical_memory = base::SysInfo::AmountOfPhysicalMemory();
    ...
    if (physical_memory > 16 * GB) {
      memory_limit = 16 * GB;
    } else if (physical_memory > 8 * GB) {
      memory_limit = 8 * GB;
    }

If the tab crashes and the error page shows SBOX_FATAL_MEMORY_EXCEEDED, it means that the tab used more memory than permitted for the sandboxed process2.

The sandbox limits are so high that exceeding them is almost always an indication of a memory leak or JavaScript error on the part of the site.

Running out of memory

Beyond hitting the sandbox limits, a process can simply run out of memory– if it asks for memory from the OS and the OS says “Sorry, nope“, the process will typically crash.

If the tab crashes, the error code will be rendered in the page:

Or, that’s what happens in the ideal case, at least.

If your system is truly out of free memory, all sorts of things are likely to fail– random processes around the system will likely fall over, and the critical top-level Browser process itself might crash, blowing away all of your tabs.

In my case, the crash reporter itself crashes, leading to this unfriendly dialog:

To make these sorts of catastrophic crashes less likely, allow Windows to manage the size of your page file.

Turning off OS page file as I had in the screenshot above means that when your last block of physical memory is exhausted, rather than slowing down, random processes on your system will fall over.

32bit Processes and Fragmentation

Notably, no sandbox limit is set for a 32bit browser instance; on 32-bit Windows, a 32bit process can almost always only allocate 2gb (std::numeric_limits::max() == 2147483647) before crashing with an OOM. For a 32-bit process running on 64bit Windows, a process compiled as LargeAddressAware (like Chromium) can allocate up to 4GB.

32-bit processes also often encounter another problem– even if you haven’t reached the 2gb process limit, it’s often hard to allocate more than a few hundred megabytes of contiguous memory because of address space fragmentation. If you encounter an “Out of Memory” error in a process that doesn’t seem to be using very much memory, visit chrome://version to ensure that you aren’t using a 32 bit browser.

Test Page and Tooling

I’ve built a Memory Use test page that allows you to use gobs of memory to see how your browser (and Operating System) reacts. Note that memory accounting is complicated and sneaky: ArrayBuffers aren’t considered JavaScript memory, and on Mac Chromium, they’re not backed by “real memory” until used.

You can use the Browser Task Manager (hit Shift+Esc on Windows or use Window > Task Manager on Mac) to see how much memory your tabs and browser extensions are using:

You can also use the Memory tab in the F12 Developer tools to peek at heap memory usage. Click the Take Snapshot button to get a peek at where memory is being used (and potentially wasted):

Using lots of memory isn’t necessarily bad– memory not being used is memory that’s going to waste. But you should always ensure that your web application isn’t holding onto data that it will never need again.

Memory: Use it, but don’t abuse it.

-Eric

1 Due to platform limitations, Chromium on OS X does not limit the sandbox size.

2 The error code isn’t fully reliable; Chrome’s test code notes:

// On 64-bit, the Job object should terminate the renderer on an OOM.
// However, if the system is low on memory already, then the allocator
// might just return a normal OOM before hitting the Job limit.

Alex Gough from Chromium provided the following breakdown of other memory-related limits:

  • Renderer
    • Per-process (Job, Rlimit) ~ as above
    • V8 Isolate 4GB (reservation)
    • V8 code range 128 MiB
    • Wasm Memory 4 GiB kSpecMaxWasmMaximumMemoryPages
    • Wasm Code Size 2 GiB kMaxWasmCodeMemory
    • Single partition_alloc allocation INT_MAX
    • Per-slab allocation limit from allocator shims: 2 GiB.
  • GPU
    • Windows, Linux: 64 GiB (less if less system memory)
  • Network
    • Windows: 16 GiB
  • Other Utilities
    • Windows: 16 GiB via default policies
  • Browser
    • None

Web-to-App Communication: The Native Messaging API

One of the most powerful mechanisms for Web-to-App communication is to use an extension that utilizes the NativeMessaging API. The NativeMessaging API allows an extension running inside the browser to exchange messages with a native-code “Host” executable running outside of the browser sandbox. That Host executable runs with the full privileges of the current user account, meaning that it can show UI, make network connections, read/write to any files to which the user has access, call privileged APIs, etc.

The NativeMessaging approach requires installing both a native executable and a browser extension (e.g. from the Chrome or Edge Web Store). Web pages cannot themselves communicate directly with a NativeMessaging host, they must use message passing APIs to communicate from the web page to the Extension, which then uses NativeMessaging to communicate with the executable running outside of the browser. This restriction adds implementation complexity, but is considerably safer than historical approaches like ActiveX controls with elevated brokers.

Implementing the Host Executable

The browser launches1 the Host executable in response to requests from an extension. On Windows, two command-line arguments are passed to the executable: the origin of the extension, and the browser’s HWND.

From the native code executable’s point-of-view, messages are received and sent using simple standard I/O streams. Messages are serialized using UTF8-encoded JSON preceded by a 32bit unsigned length in native byte order. Messages to the Host are capped at 4GB, and responses from the Host returned to the extension are capped at 1MB.

Hosts can be implemented using pretty much any language that supports standard I/O streams; using a cross-platform language like Go or Rust is probably a good choice if you aim to run on Windows, Mac, and Linux.

Avoid A Footgun
Be sure to set your streams to binary mode, or you might miscompute the data length prefix if the data contains CR/LF characters, causing Chromium to think your message was malformed.

_setmode(_fileno(stdin), _O_BINARY);
_setmode(_fileno(stdout), _O_BINARY);

Open-source examples of Native Messaging hosts abound; you can find some on GitHub, e.g. by searching for allowed_origins. For instance, here’s a simple one written in C#.

Registering the Host

The Native Messaging Host (typically installed by a downloaded executable or MSI installer) describes itself using a JSON manifest file that specifies the Extensions allowed to invoke it. For instance, say I wanted to add a NativeMessaging host that would allow my browser extension to file a bug in a local Microsoft Access database. The registration for the extension might look like this:

{
  "name": "com.bayden.moarTLSNative",
  "description": "MoarTLS Bug Filer",
  "path": "C:\\Program Files\\MoarTLSBugFiler\\native_messaging_host_for_bug_filing.exe",
  "type": "stdio",
  "allowed_origins": ["chrome-extension://emojohianibcocnaiionilkabjlppkjc/"]
}

On Windows, the host’s manifest is referenced in the registry, inside \Software\Microsoft\Edge\NativeMessagingHosts\ under either the HKLM or HKCU hive. By default, a reference in HKCU overrides a HKLM reference. For compatibility reasons (enabling Chrome Web Store extensions to work with Edge), Microsoft Edge will also check for NativeMessagingHosts registered within the Google Chrome registry key or file path:

On other platforms, the manifest is placed in a well-known path.

User-Level vs. System-Level Registration

Writing to HKLM (a so-called “System Level install”) requires that the installer run with Administrator permissions, so many extensions prefer to register within HKCU so that Admin permissions are not required for installation. However, there are two downsides to “User Level” registration:

  1. It requires every user account on a shared system to run the installer
  2. It does not work for some Enterprise configurations.

The latter requires some explanation. The Microsoft Edge team (and various other external organizations) publish “Security Baseline” documents that give Enterprises and other organizations advice about best practices for securely deploying web browsers.

One element in the Microsoft Edge team’s baseline recommends that enterprises policy-disable support for “user-level” Native Messaging host executables. This policy directive helps ensure that native code executables that run outside of the browser sandbox were properly vetted and installed by the organization (and have not been installed by a rogue end-user, for instance). The specific mechanism of enforcement is that a browser with this policy set will refuse to load a NativeMessagingHost unless it is registered in the HKLM hive of the registry; HKCU-registered hosts are simply ignored.

In order for Enterprises to deploy browser extensions that utilize NativeMessaging with NativeMessagingUserLevelHosts policy-disabled, such extension installers must offer the option to register the messaging host in HKLM. Those System-level installers will then require Admin-elevation to run, so it’s probably worthwhile to offer either two installers (one for User-level installs and one for System-level installs) or a single installer that elevates to install to HKLM if requested.

Calling the NativeMessaging Host

From the JavaScript extension platform point-of-view, messages are sent using a simple postMessage API.

To communicate with a Native Messaging host, the extension must include the nativeMessaging permission in its manifest. After doing so, it can send a message to the Host like so:

var port = chrome.runtime.connectNative('com.bayden.moarTLSNative');
port.postMessage({ url: activeTabs[0].url });

When the connectNative call executes, the browser launches the native_messaging_host_for_bug_filer.exe executable referenced in the manifest. The subsequent postMessage call results in writing the message data to the process’ stdin I/O stream. If the process responds, port‘s onMessage handler fires, or if the process disconnects, the onDisconnect handler is invoked.


NativeMessaging is a remarkably powerful primitive for bi-directional communication with native apps. Please use it carefully– escaping the browser’s sandbox means that careless implementations might result in serious security vulnerabilities.

-Eric

1 As of Chromium 87, the way the executable is invoked on Windows is rather convoluted (cmd.exe is used as a proxy) and it may fail for some users. Avoid using any interesting characters (e.g. & and @) in the path to your Host executable.