Captive Portals

When you join a public WiFi network, sometimes you’ll notice that you have to accept “Terms of Use” or provide a password or payment to use the network. Your browser opens or navigates to a page that shows the network’s legal terms or web log on form, you fill it out, and you’re on your way. Ideally.

How does this all work?

Wikipedia has a nice article about Captive Portals, but let’s talk about the lower-level mechanics.

Operating Systems’ Portal Detection

When a new network connection is established, Windows will send a background HTTP request to www.msftconnecttest.com/connecttest.txt. If the result is a HTTP/200 but the response body doesn’t match the string the server is known to always send in reply (“Microsoft Connect Test“), the OS will launch a web browser to the non-secure HTTP URL www.msftconnecttest.com/redirect. The expectation is that if the user is behind a Captive Portal, the WiFi router will intercept these HTTP requests and respond with a redirect to a page that will allow the user to log on to the network. After the user completes the ritual, the WiFi router stores the MAC Address of the device’s network card to avoid repeating the dance on every subsequent connection.

This probing functionality is a part of the Network Connectivity Status Indicator feature of Windows, which will also ensure that the WiFi icon in your task bar indicates if the current connection does not yet have access to the Internet at large. Beyond this active probing behavior, NCSI also has a passive polling behavior that watches the behavior of other network APIs to detect the network state.

Other Windows applications can detect the Captive Portal State using the Network List Manager API, which indicates NLM_INTERNET_CONNECTIVITY_WEBHIJACK when Windows noticed that the active probe was hijacked by the network. Enterprises can reconfigure the behavior of the NCSI feature using registry keys or Group Policy.

On MacOS computers, the OS offers a very similar active probe: a non-secure probe to http://captive.apple.com is expected to always reply with (“Success“).

Edge Portal Detection

Chromium includes its own Captive Portal detection logic whereby a probe URL is expected to return a HTTP/204 No Content response.

Edge specifies a probe url of http://edge-http.microsoft.com/captiveportal/generate_204

Chrome uses the probe URL http://www.gstatic.com/generate_204.

Avoiding HTTPS

Some Captive Portals perform their interception by returning a forged server address when the client attempts a DNS lookup. However, DNS hijacking is not possible if DNS-over-HTTPS (DoH) is in use. To mitigate this, the detector bypasses DoH when resolving the probe URL’s hostname.

Similarly, note that all of the probe URLs specify non-secure http://. If a probe URL started with https://, the WiFi router would not be able to successfully hijack it. HTTPS is explicitly designed to prevent a Monster-in-the-Middle (MiTM) like a WiFi router from changing any of the traffic, using cryptography and digital signatures to protect the traffic from modification. If a hijack tries to redirect a request to a different location, the browser will show a Certificate Error page that indicates that either the router’s certificate is not trusted, or that the certificate the router used to encrypt its response does not have a URL address that matches the expected website (e.g. edge-http.microsoft.com).

This means, among other things, that new browser features that upgrade non-secure HTTP requests to HTTPS must not attempt to upgrade the probe requests, because doing so will prevent successful hijacking. To that end, Edge’s Automatic HTTPS feature includes a set of exclusions:

kAutomaticHttpsNeverUpgradeList {
    "msftconnecttest.com, edge.microsoft.com, "
    "neverssl.com, edge-http.microsoft.com" };

Unfortunately, this exclusion list alone isn’t always enough. Consider the case where a WiFi router hijacks the request for edge-http.microsoft.com and redirects it to http://captiveportal.net/accept_terms. The browser might try to upgrade that navigation request (which targets a hostname not on the exclusion list) to HTTPS. If the portal’s server doesn’t support HTTPS, the user will either encounter a Connection Refused error or an Untrusted Certificate error.

If a user does happen to try to navigate to a HTTPS address before authenticating to the portal, and the router tries to hijack the secure request, Chromium detects this condition and replaces the normal certificate error page with a page suggesting that the user must first satisfy the demands of the Captive Portal:

For years, this friendly design had a flaw– if the actual captive portal server specified a HTTPS log on URL but that log on URL sent an invalid certificate, there was no way for the user to specify “I don’t care about the untrusted certificate, continue anyway!” I fixed that shortcoming in Chromium v101, such that the special “Connect to Wi-Fi” page is not shown if the certificate error appears on the tab shown for Captive Portal login.

-Eric

Extending Fiddler’s ImageView

Fiddler’s ImageView Inspector offers a lot of powerful functionality for inspecting images and discovering ways to shrink an image’s byte-weight without impacting its quality.

Less well-known is the fact that the ImageView Inspector is very extensible, such that you can add new tools to it very simply. To do so, simply download any required executables and add registry entries pointing at them.

For instance, consider Guetzli, the JPEG-optimizer from the compression experts at Google. It aims to shrink JPEG images by 20-30% without impacting quality. The tool is delivered as a command-line executable that accepts an image’s path as input, generating a new file containing the optimized image. If you pass no arguments at all, you get the following help text:

To integrate this tool into Fiddler, simply:

  1. Download the executable (x64 or x86 as appropriate) to a suitable folder.
  2. Run regedit.exe and navigate to HKEY_CURRENT_USER\Software\Microsoft\Fiddler2\ImagesMenuExt\
  3. Create a new key with the caption of the menu item you’d like to add. (Optionally, add a & character before the accelerator key.)
  4. Create Command, Parameters and Types values of type REG_SZ. Read on for details of what to put in each.

Alternatively, you could use Notepad to edit a AddCommand.reg script and then double-click the file to update your registry:

Windows Registry Editor Version 5.00

[HKEY_CURRENT_USER\Software\Microsoft\Fiddler2\ImagesMenuExt\&JPEGGuetzli]
"Command"="C:\\YOURPATHToTools\\guetzli_windows_x86-64.exe"
"Parameters"="{in} {out:jpg}"
"Types"="image/jpeg"

When you’re done, the registry key should look something like:

After you’ve set up your new command, when you right-click on a JPEG in the ImageView, you’ll see your new menu item in the Tools submenu:

When you run the command, the tool will run and a new entry will be added to the Web Sessions list, containing the now optimized image:

Registry Values

The required Command entry points to the location of the executable on disk.

The optional Parameters entry specifies the parameters to pass to the tool. The Parameters entry supports two tokens, {in} and {out}. The {in} token is replaced with the full path to the temporary file Fiddler uses to store the raw image from the ImageView Inspector before running the tool. The {out} token is replaced with the filepath Fiddler requests the tool write its output to. If you want the output file to have a particular extension, you can specify it after a colon; for example {out:jpg} generates a filename ending in .jpg. If you omit the Parameters value, a default of {in} is used.

The optional Types parameter limits the MIME types on which your tool is offered. For example, if your tool only analyzes .png files, you can specify image/png. You can specify multiple types by separating them with a comma, space, or semicolon.

The optional Options value enables you to specify either <stdout> or <stderr> (or both) and Fiddler will collect any textual output from the tool and show it in a message box. For instance, the default To WebP Lossless command sets the <stderr> option, and after the tool finishes, a dialog box is shown:

Inspect and optimize all the things!

-Eric

“Batteries-Included” vs “Bloated”

Fundamentals are invisible. Features are controversial.

One of the few common complaints against Microsoft Edge is that “It’s bloated– there’s too much stuff in it!

A big philosophical question for designers of popular software concerns whether the product should include features that might not be useful for everyone or even a majority of users. There are strong arguments on both sides of this issue, and in this post, I’ll explore the concerns, the counterpoints, and share some thoughts on how software designers should think about this tradeoff.

But first, a few stories

  1. I started working in Microsoft Office back in 1999, on the team that was to eventually ship SharePoint. Every few months in the early 2000s, a startup would appear, promising a new office software package that’s “just the 10% of Microsoft Office that people actually use.” All of these products failed (for various reasons), but they all failed in part because their development and marketing teams failed to recognize a key fact: Yes, the vast majority of customers use less than 10% of the features of the Microsoft Office suite, but it’s a different 10% for each customer.
  2. I started building the Fiddler Web Debugger in 2003, as a way for my team (the Office Clip Art team) to debug client/server traffic between the Microsoft Clip Art client app and the Design Gallery Live webservice that hosted the bulk of the clipart Microsoft made available. I had no particular ambitions to build a general purpose debugger, but I had a problem: I needed to offer a simple way to filter the list of web requests based on their URLs or other criteria, but I really didn’t want to futz with building a complicated filter UI with dozens of comboboxes and text fields.

    I mused “If only I could let the user write their queries in code and then Fiddler would just run that!” And then I realized I could do exactly that, by embedding the JScript.NET engine into Fiddler. I did so, and folks from all over the company started using this as an extensibility mechanism that went far beyond my original plans. As I started getting more feature requests from folks interested in tailoring Fiddler to their own needs, I figured “Why not just allow developers to write their own features in .NET?” So I built in a simplistic extensibility model that allowed adding new features and tabs all over. Over a few short years, a niche tool morphed into a wildly extensible debugger used by millions of developers around the world.
  3. The original 2008 release of the Chrome browser was very limited in terms of features, as the team was heavily focused on performance, security, and simplicity. But one feature that seemed to get a lot of consideration early on was support for Mouse Gestures; some folks on the team loved gestures, but there was recognition that it was unlikely to be a broadly-used feature. Ultimately, the Chrome team decided not to implement mouse gestures, instead leaving the problem space to browser extensions.

    Years later, after Chrome became my primary browser, I lost my beloved IE Mouse Gestures extension, so I started hunting for a replacement. I found one that seemed to work okay, but because I run Fiddler constantly, I soon noticed that every time I invoked a gesture, it sent the current page’s URL and other sensitive data off to some server in China. Appalled at the hidden impact to my privacy and security, I reported the extension to the Chrome Web Store team and uninstalled it. The extension was delisted from the web store.

    Some time later, now back on the Edge team, a new lead joined and excitedly recommended we all try out a great Mouse Gestures extension for Chromium. I was disappointed to discover it was the same extension that had been removed previously, now with a slightly more complete Privacy Policy, and now using HTTPS when leaking users’ URLs to its own servers.

With these background stories in hand, let’s look at the tradeoffs.

Bloated!”

There are three common classes of complaint from folks who point at the long list of features Edge has added over upstream Chromium and furiously charge “It’s bloated!“:

  1. User Experience complexity
  2. Security/reliability
  3. Performance

UX Complexity

Usually when you add a feature to the browser, you add new menu items, hotkeys, support articles, group policies, and other user-visible infrastructure to support that feature. If you’re not careful, it’s easy to accidentally break a user’s longstanding workflows or muscle memory.

One of the Windows 7 Design Principles was “Change is bad, unless it’s great!” and that’s a great truth to keep in mind– entropy accumulates, and if you’re not careful, you can easily make the product worse. Users don’t like it when you move their cheese.

A natural response to this concern is to design new features to be unobtrusive, by leaving them off-by-default, or hiding them away in context menus, or otherwise keeping them out of the way. But now we’ve got a problem– if users don’t even know about your feature, why bother building it? If potential customers don’t know that your unique and valuable features exist, why would they start using your product instead of sticking with the market leader, even if that leader has been stagnant for years?

Worse still, many startups and experiments are essentially “graded” based on the number of monthly or daily active users (MAU or DAU)– if a feature isn’t getting used, it gets axed or deprioritized, and the team behind it is reallocated to a more promising area. Users cannot use a feature if they haven’t discovered it. As a consequence, in an organization that lacks powerful oversight there’s a serious risk of tragedy, whereby your product becomes a sea of banners and popups each begging the user for attention. Users don’t like it when they think you’re distracting them from the cheese they’ve been enjoying.

Security/reliability risk

Engineers and enthusiasts know that software is, inescapably, never free of errors, and intuitively it seems that every additional line of code in a product is another potential source of crashes or security vulnerabilities.

If software has an average of, say, two errors per thousand lines of code, adding a million lines of new feature code mathmatically suggests there are now two thousand more bugs that the user might suffer.

If users have to “pay” for features they’re not using, this feels like a bad deal.

Performance

Unlike features, Performance is one of the “Universal Goods” in software– no user anywhere has ever complained that “My app runs too fast!” (with the possible exception of gamers trying to run retro games from the 1990s on 4ghz CPUs).

However, we users also note that, even as our hardware has gotten thousands of times faster over the decades, our software doesn’t seem to have gotten much faster at all. Much like our worry about new features introducing code defects, we also worry that introducing new features will make the product as a whole slower, with higher CPU, memory, or storage requirements.

Each of these three buckets of concerns is important; keep them in mind as we consider the other side.

Batteries Included!

We use software to accomplish tasks, and features are the mechanism that software exposes to help us accomplish our tasks.

We might imagine that ideal software would offer exactly and only the features we need, but this is impractical. Oftentimes, we may not recognize the full scope of our own needs, and even if we do, most software must appeal to broad audiences to be viable (e.g. the “10% of Microsoft Office” problem). And beyond that, our needs often change over time, such that we no longer need some features but do need other features we didn’t use previously.

One school of thought suggests that the product team should build a very lightweight app with very few features, each of which is used by almost everyone. Features that will be used by fewer users are instead relegated to implementation via an extensibility model, and users can cobble together their own perfect app atop the base.

There’s a lot of appeal in such a system– with less code running, surely the product must be more secure, more performant, more reliable, and less complex. Right?

Unfortunately, that’s not necessarily the case. Extension models are extremely hard to get right, because until you build all of the extensions, you’re not sure that the model is correct or complete. If you need to change the model, you may need to change all of the extensions (e.g. witness the painful transition from Chromium’s Manifest v2 to Manifest v3).

Building features atop an extension model sometimes entails major performance bugs, because data must flow through more layers and interfaces, and if needed events aren’t exposed, you may need to poll for updates. Individual extensions with common needs may have to do redundant work (e.g. each extension scanning the full text of each loaded page, rather than the browser itself scanning the whole thing just once).

As we saw with the Mouse Gestures story above, allowing third-party extensions carries along with it a huge amount of complexity related to security risk and misaligned incentives. In an adversarial ecosystem where good and bad actors both participate, you must invest heavily in security and anti-abuse mechanisms.

Finally, regression testing and prevention gets much more challenging when important features are relegated to extensions. Product changes that break extensions won’t block the commit queue, and the combinatorics involved in testing with arbitrary combinations of extensions quickly hockey sticks upward to infinity.

Extensions also introduce complexity in the management and update experience, and users might miss out on great functionality because they never discovered extension exists to address a need they have (or didn’t even realize they have). You’d probably be surprised by the low percentage of users that have any browser extensions installed at all.

With Fiddler, I originally took the “Platform” approach where each extra feature was its own extension. Users would download Fiddler, then download four or five other packages after/if they realized such valuable functionality existed. Over time, I realized that nobody was happy with my Ikea-style assemble-your-own debugger, so I started shipping a “Mondo build” of Fiddler that just included everything.

Extensions, while useful, are no panacea.

Principles

These days, I’ve come around to the idea that we should include as many awesome features as we can, but we should follow some key principles:

  • To the extent possible, features must be “pay to play.” If a user isn’t using a given feature, it should not impact performance, security, or reliability. Even small regressions quickly add up.
  • Don’t abuse integration to avoid clean layering and architecture. Just because your feature’s implementation can go poke down into the bowels of the engine doesn’t mean it should.
  • Respect users and carefully manage UX complexity. Remember, “change is bad unless it’s great.” Invest in systems that enable you to only advertise new features to the right users at the right time.
  • Remove failed experiments. If you ship a feature and find that it’s not meeting your goals, pull it out. If you must accommodate a niche audience that fell in love, consider whether an extension might meet their needs.
  • Find ways to measure, market, and prioritize investments in Fundamentals. Features usually hog all the glory, but Fundamentals ensure those features have the opportunity to shine.

-Eric

Chromium Startup

This morning, a Microsoft Edge customer contacted support to ask how they could launch a URL in a browser window at a particular size. I responded that they could simply use the --window-size="800,600" command line argument. The customer quickly complained that this only seemed to work if they also specified a non-default path in the --user-data-dir command line argument, such that the URL opened in a different profile.

As I mentioned back in my post about Edge Command Line Arguments, most command-line arguments are ignored if there’s already a running instance of Edge, and this is one of them. Even if you also pass the --new-window argument, Edge simply opens the URL you’ve supplied inside a new window with the same size and location as the original window.

Now, in many cases, this is a reasonable limitation. Many of the command line arguments you can pass into Chromium have a global impact, and having inbound arguments change the behavior of an already-running browser instance would introduce an impractical level of complexity to the code and user experience. Similarly, there’s an assumption in the Chromium code that only one browser instance will interact with a single profile folder at one time, so we could not simply allow multiple browser instances with different behaviors to use a single profile in parallel.

In this particular case, it feels reasonable that if a user passes both --new-window and either --window-size or --window-position (or all three), the resulting window will have the expected dimensions, even if there was already one or more browser windows open in the browsing session. Because the arguments do not impact more than the newly-created window, there’s none of the compexity of trying to change the behavior of any other part of the already-running browser. I filed a bug suggesting that we ought to look at enabling this.

In the course of investigating this bug, I had the opportunity to learn a bit more about how Chromium handles the invocation of a URL when there’s already a running instance. When it first starts, the new browser process searches for an existing hidden message window of Class Chrome_MessageWindow with a Caption matching the user-data-dir of the new process.

If it fails to find one, it creates a new hidden messaging window (using a mutex to combat race conditions) with the correct caption for any future processes to find.

However, if the code did find an existing messaging window, there’s a call to a AttemptToNotifyRunningChrome function that sends a WM_COPYDATA message to pass along the command line from the (soon-to-exit) new process.

In the unlikely event that the existing process fails to accept the message (e.g. because it is hung), the user will be prompted to kill the existing process so that the new process can handle the navigation.

This code is surprisingly simple, and feels very familiar– the startup code inside Fiddler is almost identical except it’s implemented in C#.

In our customer scenario, we see that the existing browser instance correctly gets the command line from the new process:

[11824:4812:0615/092252.689:startup_browser_creator.cc(1391)] ProcessCommandLineAlreadyRunning "C:\src\c\src\out\default\chrome.exe" --new-window --window-size=400,510 --window-position=123,34 --flag-switches-begin --flag-switches-end example2.com

And shortly after that, there’s the expected call to the UpdateWindowBoundsAndShowStateFromCommandLine function that sets the window size and position from any arguments passed on the command line.

The stack trace of that call looks like

-chrome::internal::UpdateWindowBoundsAndShowStateFromCommandLine
-chrome::GetSavedWindowBoundsAndShowState
-BrowserView::GetSavedWindowPlacement

Unfortunately, when we look at the GetSavedWindowBoundsAndShowState function we see the problem:

void GetSavedWindowBoundsAndShowState(const Browser* browser,
                                      gfx::Rect* bounds,
                                      ui::WindowShowState* show_state) {
  //...
  const base::CommandLine& parsed_command_line =
      *base::CommandLine::ForCurrentProcess();

internal::UpdateWindowBoundsAndShowStateFromCommandLine(parsed_command_line, bounds, show_state);
}

As you can see, the call passes the command line string representing the current (preexisting) process, rather than the command line that was passed from the newly started process. So, the new window ends up with the same size and position information from the original window.

To fix this, we’ll need to restructure the calls such that when we’re handling a command line passed to us from another process through ProcessCommandLineAlreadyRunning, we use the WM_COPYDATA-passed command line when setting the window size and position.

-Eric

Microsoft Edge Tips and Tricks

Last Updated: June 3, 2022. The intent of this post is to capture a list of non-obvious features of the browser that might be useful to you.


Q: How do I find the tab playing audio? It’s cool that Microsoft Edge shows the volume icon in the tab playing music and I can click to mute it:

…but what if I have a bunch of Edge windows? I have to go into each window to find the icon?

A: The Ctrl+Shift+A hotkey is your friend. It will show your open tabs to allow you to search across them, and those playing audio/video are listed in a group at the top:


Q: How can I move a few tabs out of the current window?

A: You can simply drag the tab’s button/title out of the tab strip to move it to a new window. Less obviously, you can Ctrl+Click *multiple* tabs and drag your selections out into a new window (unselected tabs temporarily dim). Use Shift+Click if you’d prefer to select a range of tabs.


Q: How can I duplicate a tab?

A: Hit Ctrl+Shift+K or use the “Duplicate Tab” command on the tabstrip’s context menu to duplicate the current tab.

Less obviously, you can Ctrl+Click the back or forward arrow buttons to open the previous or next entry in the history in a new tab, or you can Shift+Click the buttons to open the page in a new window.


Q: How can I get back a tab I accidentally closed?

A: Hit Ctrl+Shift+T or use the “Reopen closed” option on the tabstrip’s context menu shown on right-click.

You also might be interested in the “Ask before closing a window with multiple tabs” option available inside the edge://settings page:


Q: On a desktop mouse, is middle-click useful for anything?

A: Middle-click a link to open it in a new tab. Middle-click a tab title button to close that tab rather than hunting for its [x] icon.


Q: How can I easily open a given site in a different profile?

A: You can right-click a link in a page and choose “Open as” to open that link in a different profile:

If you already have the desired page open, you can right-click the tab title button and choose “Move Tab To” and pick the desired profile:

Move the current tab to a different profile

You can also use the options at edge://settings/profiles/multiProfileSettings to open particular sites using a particular profile, useful for splitting your “Work Sites” from your “Life Sites” and your “Ephemeral sites“.

Open AzDo in the Work Profile, and Hotmail in my Personal Profile

Q: How can I make any site act more like an “App” with its own window that isn’t cluttered with other tabs?

A: You can use the --app=url command line argument to give a any site its own standalone window that does not mix with your other sites. For example, if you run msedge.exe --app=https://outlook.live.com, the result looks like this:

This works great with command launchers like SlickRun, because you can then just type e.g. Mail to launch the standalone web app.


You might also enjoy this collection of not-so-frequently-asked questions about Edge.

Losing your cookies

My browser lost its cookies” has long been one of the most longstanding Support complaints in the history of browsers. Unfortunately, the reason that it has been such a longstanding issue is that it’s not the result of a single problem, and if the problem is intermittent (as it often is), troubleshooting the root cause may be non-trivial.

In this post, I’ll explain a bit about how cookies are stored, and what might cause them to go missing.

Background: Session vs. Persistent

Before we get started, it’s important to distinguish between “Session” cookies and “Persistent” cookies. Session cookies are meant to go away when your browser session ends (e.g. you close the last window/tab) while Persistent cookies are meant to exist until you either manually remove them or a server-specified expiration date is reached.

In a default configuration, closing your browser is meant to discard all of your session cookies, unless you happen to have the browser’s “Continue where I left off” option set, in which case Session cookies are expected to live forever.

Chromium stores its persistent cookies as encrypted entries in a SQLLite database file named Cookies, found within the profile folder:

Unfortunately, as a user it’s not trivial to understand whether a given cookie was marked Persistent and expected to survive browser shutdown. You have to look at either the 🔒 > Cookies > Cookies in use dialog:

…or the F12 Developer Tools’ Application tab:

If the cookie in question was set as a Persistent cookie, but it disappeared unexpectedly, there could be any of a number of root causes. Let’s take a look.

Culprit: User Settings

The most common causes of cookies going missing are not exactly a “bug” at all– the cookies are gone because the user either configured the browser to regularly clear cookies, or instructed it to do so on a one-time basis.

Microsoft Edge can be configured to clear selected data every time the browser is closed using the option at edge://settings/clearBrowsingDataOnClose:

This is a powerful feature useful for creating an “Ephemeral Profile“, but if you set this option on your main profile and forget you did so, it can be very frustrating. You can also use edge://settings/content/cookies to clear cookies on exit for specific sites:

That same page also contains an option to block all cookies in a 3rd party context, which can have surprising consequences if you set the option and forget it.

Culprit: User Actions

By hitting Ctrl+Shift+Delete or using the Clear browsing data button inside about://settings, you can clear either all of your cookies, or any cookie set within a particular date range:

Notably, the claim this will clear your data across all your synced devices does not apply to cookies. Because cookies are not synced/roamed, deleting them on one computer will not delete them on another.

You can also use the 🔒 > Cookies > Cookies in use dialog to remove cookies for just the currently loaded site, or use edge://settings/siteData to clear cookies for an individual site:

Culprit: Multiple Browsers/Profiles/Channels

For me personally, the #1 thing that causes me to “lose” cookies is the fact that I use 14 different browser profiles (8 in Edge, 4 in Chrome, 2 in Firefox) on a regular basis. Oftentimes when I find myself annoyed that my browser isn’t logged in, or otherwise is “missing” an expected cookie, the real reason is that I’m in a different browser/channel/profile than the last time I used the site in question. When I switch to the same browser+channel+profile I used last time, the site “magically” works as expected again.

Culprit: 3rd-Party Utilities

Sometimes, a user loses their cookies unexpectedly because they’ve run (or their computer automatically ran) any of a number of 3rd-party Privacy/Security/Cleanup utilities that include as a feature deletion of browser cookies. Because these tools run outside of the browser context, they can delete the cookies without the browser’s awareness or control.

Culprit: Website Changes

Websites can change their behavior from minute to minute (especially when they perform “A/B testing” to try out new fixes. Sometimes, a website will, knowingly or unknowingly, cause a cookie to be discarded or ignored when new website code is rolled out. Websites can even cause all of their cookies to be deleted (e.g. when a user clicks “Logout”) using the ClearSiteData API.

(Unlikely) Culprit: Lost Decryption Key

As I mentioned earlier, Chromium encrypts cookies (and passwords) using an encryption key that is stored in the browser profile, and which is itself encrypted using a key stored by the OS user account.

If the key in the Chromium profile is deleted or corrupted, Chromium will not be able to obtain the AES key required to decrypt passwords and cookies and will behave as if those data stores were empty.

Similarly, the same loss is possible if the OS user account’s encryption key is lost or reset. This should be very rare, but prior to Windows 10 20H2 there was a bug where the key could be lost if the S4U task scheduler was in use. Similarly, a Windows bug is currently under investigation for machines with a 3rd-Party credential provider installed. Finally, if the user’s password is forgotten and “Reset”, the account’s encryption key is reinitialized resulting in the “irreversible loss of information” mentioned at the top of the confirmation dialog box:

(Unlikely) Culprit: Data Corruption

Chromium has made significant technical investments in protecting the integrity of data and I’m not aware of any significant issues where cookie data was lost due to browser crashes or the like.

Having said that, the browser is ultimately at the mercy of the integrity of the data it reads off disk. If that data is corrupted due to e.g. disk or memory failure, cookie data can be irretrievably lost, and the problem can recur until the failing hardware is replaced.

Troubleshooting Steps

  1. Check which browser/channel/profile you’re using and ensure that its the one you used last time.
  2. Visit about://settings/clearBrowsingDataOnClose to ensure that you’re not configured to delete cookies on exit.
  3. Visit about://settings/siteData to see whether all cookies were lost, or just some went missing.
  4. Visit about://settings/content/cookies to see whether third-party cookies are allowed, and whether you have any rules to clear a site’s cookies on exit.
  5. If you use the browser’s Password Manager, see whether your saved passwords went missing too, by visiting about://settings/passwords.

    If they also went missing, you may have a problem with Local Data Encryption keys. Check whether any error values appear inside the about:histograms/OSCrypt page right after noticing missing data.
  6. Visit Cookie/Storage test page, pick a color from the dropdown, and restart your browser fully. See whether the persistentCookie and localStorage values reflect the color previously chosen.
  7. Use the Windows Memory Diagnostic in your Start Menu to check for any memory errors. Use the Properties > Tools > Check for Errors option to check the C: drive for errors:

-Eric

Thoughts on Impact

In this post, I talk a lot about Microsoft, but it’s not only applicable to Microsoft.

It’s once again “Connect Season” at Microsoft, a biannual-ish period when Microsoft employees are tasked with filling out a document about their core priorities, key deliverables, and accomplishments over the past year, concluding with a look ahead to their goals for the next six months to a year.

The Connect document is then used as a conversation-starter between the employee and their manager. While this process is officially no longer coupled to the “Rewards” process, it’s pretty obviously related.

One of the key tasks in both Connects and Rewards processes is evaluating your impact— that is to say, what changed in the world thanks to your work?

We try to break impact down into three rings: 1) Your own accomplishments, 2) How you built upon the ideas/work of others, and 3) How you contributed to others’ impact. The value of #1 and #3 is pretty obvious, but #2 is just as important– Microsoft now strives to act as “One Microsoft”. This represents a significant improvement over what was once described to prospective employees as “A set of competing fiefdoms, perpetually at war” and drawn eloquently by Manu Cornet:

Old Microsoft, before “Building Upon Others” culture

By explicitly valuing the impact of building atop the work of others, duplicate effort is reduced, and collaboration enhances the final result for everyone.

While these rings of impact can seem quite abstract, they seem to me to be a reasonable framing for a useful conversation, whether you’re a new Level 59 PM, or a grizzled L67 Principal Software Engineer.

The challenge, of course, is that measurement of impact is often not at all easy.

Measures and Metrics

When writing the “Looking back” portion of your Connect, you want to capture the impact you achieved, but what’s the best way to measure and express that?

Obviously, numbers are great, if you can get them. However, even if you can get numbers, there are so many to choose from, and sometimes they’re unintentionally or intentionally misleading. Still, numbers are often treated as the “gold standard” for measuring impact, and you should try to think about how you might get some. Ideally, there will be some numbers which can be readily computed for a given period. For instance, my most recent Connect noted:

While this provides a cheap snapshot of impact, there’s a ton of nuance hiding there. For example, my prior Connect noted:

Does this mean that I was less than half as impactful this period vs. the last? I don’t think so, but you’d have to dig into the details behind the numbers to really know for sure.

Another popular metric is the number of users of your feature or product, because this number, assuming appropriate telemetry, is easy to compute. For example, most teams measure the number of “Daily Active Users” (DAU) or “Monthly Active Users” (MAU).

While I had very limited success in getting Microsoft to recognize the value of my work on my side project (the Fiddler Web Debugger), one thing that helped a bit was when our internal “grassroots innovation” platform (“The Garage”) added a simple web service where developers could track usage of any tool they built. I was gobsmacked to discover that Fiddler was used by over 35000 people at Microsoft, then more than one out of every three employees in the entire company.

Hard numbers bolstered anecdotal stories (e.g. the time when Microsoft’s CTO/CSA called me at my desk to help him debug something and I was about to guide him into installing Fiddler only to have him inform me that he “used it all the time.”)

When Fiddler was later being scouted for acquisition by devtool companies, I quickly learned that they weren’t particularly interested in the code — they were interested in the numbers: how many downloads (14K/day), how many daily active users, and any numbers that might reveal what were users were doing with it (enterprise software developers monetize better than gem-farming web gamers).

A few years prior, my manager had walked into my office and noted “As great as you make Fiddler, no matter how many features you add or how well you build them, nothing you do will ever have as much impact as you have on Internet Explorer.” And there’s a truth to that– while Fiddler probably peaked at single-digit millions of users, IE peaked at over a billion. When I rewrote IE’s caching logic, the performance savings were measured in minutes individually and lifetimes in aggregate.

Unfortunately, there’s a significant risk to making “Feature Usage” a measure of impact– it means that there’s a strong incentive for every feature owner to introduce/nag/cajole as many people as possible into using a feature. This often manifests as “First Run” ads, in-product popups, etc. Your product risks suffering a tragedy of the commons effect whereby every feature team is individually incentivized to maximize user exposure to their feature, regardless of appropriateness or the impact to users’ satisfaction with the product as a whole.

When a measure becomes a target, it ceases to be a good measure.

Goodhart’s Law

To demonstrate business impact, the most powerful metric is your impact on profitability, measured in dollars. Sadly, this metric is often extremely difficult to calculate: distinguishing the revenue impact of a single individual’s work on a massive product is typically either wildly speculative or very imprecise. However, once in a great while there’s a clear measure: My clearest win was nearly twenty years ago, and remains on my resume today:

Saving $156,000 a year in costs (while dramatically improving user-experience– a much harder metric to measure) at a time when I was earning around half of that sum was an incredibly compelling feather in my cap. (As an aside, perhaps my favorite example of this ever was found in the OKRs of the inventor of Brotli compression, who computed the annual bandwidth savings for Google and then converted that dollar figure into the corresponding numbers of engineers based on their yearly cost. “Brotli is worth <x> engineers, every year, forever.”)

Encouraging employees to evaluate their Profit Impact is also somewhat risky– oftentimes, engineers are not interested in the business side of the work they do and consider it somewhat unseemly — “I’m here to make the web safe for my family, not make a quick buck for a MegaCorp.” Even for engineers who consciously acknowledge the deal (“I recognize that the only reason we get to spend hundreds of millions of dollars building this great product we give away for free is because it makes the company more money somewhere“) it can be very uncomfortable to try to generate a precise profitability figure– engineers like accuracy and precision, and even with training in business analysis, calculation of profit impact is usually wildly speculative. You usually end up with a SWAG (silly wild-ass guess) and the fervent hope that no one will poke on your methodology too hard.

A significant competitive advantage held by the most successful software companies is that they don’t need to bother their engineers with the business specifics. “Build the best software you can, and the business will take care of itself” is a simple and compelling message for artisans working for wealthy patrons. And it’s a good deal for the leading browser business: when the product at the top of your funnel costs you 9 digits per year and brings in 12 digits worth of revenue, you can afford not to demand the artisans think too much about anything so mundane as money.

Storytelling

Of course, numbers aren’t the only way to demonstrate impact. Another way is to tell stories about colleagues you’ve rescued, customers you’ve delighted, problems you’ve solved and disasters you’ve averted.

Stories are powerful, engaging, and usually more interesting to share than dry metrics. Unfortunately, they’re often harder to collect (customers and partners are often busy and it can feel awkward to ask for quotes/feedback about impact). Over the course of a long review period, they’re also sometimes hard to even remember. Starting in 2016, I got in the habit of writing “Snippets”, a running text log of what I’ve worked on each day. Google had internal tooling for this (mostly for aggregating and publishing snippets to your team) but nowadays I just have a snippets.txt file on my desktop.

Both Google and Microsoft have an employee “Kudos” tool that allows other employees to send the employee (and their manager) praise about their work, which is useful for both visibility as well as record-keeping (since you can look back at your kudos for years later). I also keep a Kudos folder in Outlook to save (generally unsolicited) feedback from customers and partners on the impact of my work. One thing I’ve heard some departing Microsoft employees note (and experienced myself) is that they often don’t hear about the breadth of their impact until the replies to their farewell emails start pouring in. When I left Microsoft in 2012, I got some extremely kind notes from folks that I never expected to even know my name (some not even Fiddler users!).

Even when recounting an impact story, you should enhance it with numbers if you can. “I worked late to fix a regression for a Fortune 500 customer” is a story– “…and my fix unblocked deployment of Edge as the default browser to 30000 seats” is a story with impact.

A challenge with storytelling as an approach to demonstrating impact is that our most interesting stories tend to involve frantic, heroic, and extraordinary efforts or demonstrations of uncommon brilliance, but the reality is that oftentimes the impact of our labor is greater when competently performing the workaday tasks that head off the need for such story-worthy events. As I recently commented on Twitter:

We have to take care not to incentivize behaviors that result in great stories of “heroic firefighting” while neglecting the quiet work that obviates the need for firefighting in the first place. But quantifying the impact of the fire marshal can be difficult– how do you estimate the cost of a conflagration that didn’t happen? (This is one argument for investing in post-mortems— it allows for ballpark estimates of historical recovery costs, helps everyone understand that prevention is a comparative bargain).

My most recent Connect praised me as having done “a great job of being our last line of defense” which I found quite frustrating– while I do get a lot of visibility for fixing customer problems that have no clear owners, my most valuable efforts are in helping ensure that we fix problems before customers even experience them.

Speed

Related to this is the relationship of speed to impact— the sooner you make a course-correcting adjustment, the smaller that adjustment needs to be. Flag an issue in the design of the feature and you don’t have to update the code. Catch a bug in the code before it ships and no customer will notice. Find a bug in Canary before it reaches Beta and developers will not need to cherry-pick the fix to another branch. Fix a regression in Beta before it’s promoted to Stable and you reduce the potential customer impact by very close to 100%.

Similarly, any investment in tools, systems, and processes to tighten the feedback loop will have broad impact across the entire product. Checking in a fix to a customer-reported bug quickly only delights if that customer can benefit from that fix quickly.

Unfortunately, because speed reduces effort (a faster fix is cheaper), it’s too easy to fall into the trap of thinking that it had lower impact.

Effort != Impact

A key point arises here– impact is not a direct function of effort, it is just one of the inputs into the equation.

A friend once lamented his promotion from Level 63 to 64, noting “It’s awful. I can’t work any harder!” and while I’ve felt the same way, we also both know that even the highest-levelled employees don’t have more than 24 hours in their day, and most of them retain some semblance of work/life balance.

We’re not evaluated on our effort, but on our impact. Carefully selecting the right problems to attack, having useful ideas/subject matter expertise, working with the right colleagues, and just being lucky all have a role to play.

Senior Levels

At junior levels, the expectation is that your manager will assign you appropriate work to allow you to demonstrate impact commensurate with your level. If for some reason something out of your control happens (a PM’s developer leaves the team, so their spec is shelved) the employee’s “opportunity for impact” is deemed to be lower and taken into account in evaluations.

As you progress into the Senior band and beyond, however, “opportunity for impact” is implicitly deemed unlimited. The higher you rise, the greater the expectation that you will yourself figure out what work will have the highest impact, then go do that work. If there’s a blocker (e.g. a partner team declines to do needed work), you’re responsible for figuring out how to overcome that blocker.

Amid the Principal band, I’ve found it challenging to try to predict where the greatest opportunity for impact lies. For the first two years back at Microsoft, I was unexpectedly impactful, as (then) the only person on the team to have ever worked as a Chromium Developer– I was able to help the entire Edge team ramp up on the codebase, tooling, and systems. I then spent a year or so as an Enterprise Fixer, helping identify and fix deployment blockers preventing large companies from adopting Edge. Throughout, I’ve continued to contribute fixes to Chromium, investigate problems, blog extensively, and try to help build a great engineering culture. Many of these investments receive and warrant no immediate recognition– I think of them as seeds I’m optimistically planting in the hopes that one day they’ll bear fruit.

Many times I will take on an investigation or fix for a small customer, both in the hope that I’m also solving something for a large customer who just hasn’t noticed yet, and because there’s an immediate satisfaction in helping out an individual even if the process appears unscalable. While it cannot possibly be the most efficient method of maximizing impact, just helping, over and over, can bypass analysis paralysis or impactless naval gazing.

Do all the good you can,
By all the means you can,
In all the ways you can,
In all the places you can,
At all the times you can,
To all the people you can,
As long as ever you can.

https://quoteinvestigator.com/2016/09/24/all-good/

Learning is an Investment

Taking time to learn new technologies, skills, or even your own codebase does not typically have an immediate impact on the organization. But it’s an investment in the future, and it can pay off unexpectedly, or fairly reliably, depending on what you choose to learn.

The challenge with spending your time learning is that there is an infinite amount to learn, and learning, in and of itself, rarely has any impact at all, unless you contribute back to the learning resource. It is only upon putting your learning to use that the potential value turns into impact.

Teaching is an Investment

Similarly, sharing what you’ve learned is an investment– you’re betting that the overall value to the recipient or the organization will exceed the value of your preparation and presentation of information.

But beyond the value in teaching others, teaching is a great way to learn a topic more fully yourself, as gaps in your understanding are exposed, and the most dangerous class of ignorance (“Things you know that just ain’t so“) are pointed out to you. This is a major reason I blog and otherwise “Do it in public.

Do things that Scale (and things that don’t)

When planting seeds, you don’t know which might bear fruit. Spending an hour mentoring a peer may have no obvious impact right away. But perhaps you will learn something to improve your own impact, or perhaps years down the line your mentee will pass along a cool role opportunity in another division. You never know.

Still, if you seek to maximize your impact, finding opportunities to scale is crucial. If fixing a problem has impact X, documenting the fix or creating a process to ensure it never happens again may have impact 10X. Debugging a developer’s error may have impact Y, while writing a tool to allow any developer to diagnose their own errors may have impact 1000Y. Showing a colleague how to do something may have impact Z, while writing a blog post, a book, or giving a conference talk may have impact 100Z.

Personally, I like to start by helping in ways that don’t scale, because it allows me to learn enough about the problem space to maximize the value of my contribution. It also allows me to discover whether my solution even needs to scale, and if it does, how best to build that scalable solution.

Management

As you move up the ranks, one popular way to increase your impact is to become a manager. As a manager, you are, in effect, deemed partly responsible for the output of your team, and naturally the impact of a team is higher than that of an individual.

Unfortunately, measuring your personal contribution to the team’s output remains challenging– if you’re managing a team of star performers, would they continue to be star performers without you overhead? On the other hand, if you’re leading a team of chronic underachievers, the team’s impact will be low, and there are limits to both the speed and scope of a manager’s ability to turn things around.

As a manager, your impact remains very much subject to the macro-environment– your team of high performers might have high attrition because you’re a lousy manager, or in spite of you being a great manager (because your team’s mission isn’t aligned with the employee’s values, because your compensation budget isn’t competitive with the industry, etc).

Beyond measuring your own impact, you’re now responsible for the impact of your employees– assigning or guiding them toward the highest impact opportunities, and evaluating the impact of the outcomes. Crucially, you’re also responsible for explaining each employee’s impact to other leaders as a part of calibrating rewards across the team. Perhaps unfortunately for everyone, this activity is usually opaque to individual contributors (who are literally not in the room), leaving your ICs unable to determine how effectively you advocated on their behalf, beyond looking at their compensation changes.

Impact Alignment

One difficult challenge is that, “One Microsoft” aside, employee headcount and budgets are assigned by team. With the exception of some cross-division teams, most of your impact only “counts” for rewards if it accrues to your immediate peers, designated partner teams, or customers.

It is very hard to get rewarded for impact outside of that area, even if it’s unquestionably valuable to the organization as a whole.

Around 2009 or so, my manager walked into my office and irreverently asked “You’re an idiot, you know that right?” I conceded that was probably true, but asked “Sure, but why specifically?” He beckoned me over to the window and pointed down at the parking lot. “See that red Ferrari down there?” I nodded. He concluded “As soon as you thought of Fiddler, you should’ve quit, built it, and had Microsoft buy you out. Then you’d be driving that instead of a 2002 Corolla.” I laughed and noted “I’m no Mark Russinovich, and Microsoft clearly doesn’t want Fiddler anyway.” But this was a problem of organizational alignment, not value– Microsoft was using Fiddler extremely broadly and very intensely, but because it was not well-aligned with any particular product team, it received almost no official support. I’d offered it to Visual Studio, who made some vague mention of “investing in this area in some future version” and were never heard from again. I offered to write an article for MSDN Magazine, who rejected me on the grounds that the tool was “Not a Microsoft product” and thus not within their charter, despite its broad use exclusively by developers on Windows. Over the years, several leads strongly implied that my work on Fiddler was evidence that I could be “working harder at my day job.”

Nevertheless, in 2007, I won an Engineering Excellence award for Fiddler, for which I got a photo with Bill Gates, a crystal spike, an official letter of recognition, and $5000 for a morale event for my “team.” Lacking a team, I went on a Mediterranean cruise with my girlfriend.

Of course, there have been many non-official rewards for years of effort (niche fame, job opportunities, friendships) but because of this lack of alignment with my team’s ownership areas, even broad impact was hard for Microsoft to reward.

Karma

Our CEO once famously got in trouble for suggesting that employees passed over for promotion should be patient and “karma” would make things right. While the timing and venue for this observation were not ideal, it’s an idea that has been around at the company for decades. Expressed differently, reality has a way of being discovered eventually, and if you’re passed over for a deserved promotion, it’s likely to get fixed in the next cycle. In the other direction, one of the most painful things that can happen is a premature promotion, whereby you go from being a solid performer with level-appropriate impact to underachieving against higher expectations.

I spent six long years in the PM2 band before we had new leaders who joined the team and recognized the technical impact I’d been delivering for years; I was promoted from level 62 to 63 in five months.

In hindsight, I was too passive in evaluating and explaining my impact to leaders during those long years, and I probably could have made my case for greater rewards earlier if I’d spent a bit more energy on doing so. I had a pretty dismissive attitude toward “career management” and while I thought I was making things easier on my managers, the net impact was nearly disastrous– quitting in disgust because “they just don’t get it.”

How do you [maximize|measure|explain] your impact?

-Eric

Bonus Content – Career Advice

Back in 2015, I gave a talk about what I’d learned in my career thus far. You can find the recording and deck of Lucking In on GitHub; most of the key points ultimately come down to maximizing impact.

For career-advice contrast, I also enjoyed Moxie Marlinspike’s post.

Unexpectedly HTTPS?

While I’m a firm believer that every site should be using HTTPS, sadly, not every site is yet doing so. Looking at Chrome data, today around 92% of navigations are HTTPS:

…and the pages loaded account for around 95% of browsing time:

Browsers are working hard to get these numbers up, by locking down non-secure HTTP permissions, blocking mixed content downloads, and by attempting to get the user to a secure version of a site if possible (upgrading mixed content, and upgrading navigations).

Chrome and Edge have adopted different strategies for navigation upgrades:

Chrome

In Chrome, if you don’t type a protocol in the address bar, will try HTTPS first and if a response isn’t received in three seconds, it will race a HTTP request. There’s an option to require HTTPS:

When this option is set, attempting to navigate to a site that does not support HTTP results in a warning interstitial:

Edge

In Edge, we are experimenting with an “Automatic HTTPS” feature to upgrade navigations (even if http:// was specified) to use HTTPS.

The feature defaults to a list-based upgrade approach, whereby we deliver a component containing sites believed to be compatible with TLS. The list data is stored on disk, but is unfortunately not readily human-readable due to its encoding (for high-performance read operations):

Alternatively, if Always switch is specified, all requests are upgraded from HTTP to HTTPS unless:

  • The URL’s hostname is dotless (e.g. http://intranet, http://localhost)
  • The URL targets a non-default port (http://example.com:8080)
  • The hostname is included on a hardcoded exemption list containing just a handful of HTTP-only hostnames that are used by features or users to authenticate to Captive Portal interceptors. kAutomaticHttpsNeverUpgradeList = {"http://msftconnecttest.com", "http://edge.microsoft.com", "http://neverssl.com", "edge-http.microsoft.com};
  • The user has previously opted-out of HTTPS upgrade for the host by clicking the link on the connection failure error page.

Diagnostics

Beyond the browser-specific features, browsers might end up on a HTTPS site even when the user specified a http:// url because:

  • The site is on the HSTS Preload list (including preloaded TLDs)
  • The site was previously visited over HTTPS and returned a Strict-Transport-Security header to opt-in to HSTS
  • The site was previously visited over HTTP and returned a cacheable HTTP/3xx redirect to the HTTPS page

In some cases, such upgrades might be unexpected or problematic, but figuring out the root cause might not be entirely trivial, particularly if an end-user is reporting the problem and you do not have access to their computer.

Local Diagnostics

You can use the Network tab of the F12 Developer Tools to see whether a cached redirect response is responsible for an HTTPS upgrade.

You can see whether Edge’s Automatic HTTPS feature upgraded a request to HTTPS by looking at the F12 Console tab:

To see if HSTS is responsible for an upgrade, on the impacted client, visit about://net-internals/#hsts and enter the domain in the box and click Query. Look at the upgrade_mode values:

If the static_upgrade_mode value shows FORCE_HTTPS, the site is included in the HSTS preload list. If FORCE_HTTPS is specified in dynamic_upgrade_mode, the site sent a Strict-Transport-Security opt-in header.

You can clear out dynamic_upgrade_mode entries by using the Cached images and files: All time option in the Clear Browsing Data dialog box:

Remote Diagnostics

If you don’t have direct access to the client, you can ask the user to collect a NetLog capture to analyze. The NetLog will show HTTPS upgrades from HSTS and from previously cached responses.

You can see a HSTS Upgrade by using the search box to look for either TRANSPORT_SECURITY_STATE_SHOULD_UPGRADE_TO_SSL (which will appear for all URLRequests with a true or false value) or for reason = "HSTS" which will find the internal redirect to upgrade to HTTPS:

Unfortunately, at the moment there’s no clear signal that a request was upgraded by Edge’s Automatic HTTPS feature, because the rewrite of the URL happens above the network stack.

Please help secure the web by moving all sites to HTTPS!

-Eric

Chromium Internals: PAK Files

Web browsers are made up of much more than the native code (mostly compiled C++) that makes up their .exe and .dll files. A significant portion of the browser’s functionality (and bulk) is what we’d call “resources”, which include things like:

  • Images (at two resolutions, regular and “high-DPI”)
  • Localized UI Strings
  • HTML, JavaScript, and CSS used in Settings, DevTools and other features
  • UI Theme information
  • Other textual resources, like credits

In ancient times, this resource data was compiled directly into resource segments of Windows DLL files, but many years ago Chromium introduced a new format, called .pak files, to hold resource data. The browser loads resource streams out of the appropriate PAK files chosen at runtime (based on the user’s locale and screen resolution) and uses the data to populate the UI of the browser. PAK files are updated as a part of every build of the browser, because every change to any resource requires rebuilding the file.

High-DPI

Over the years, devices were released with ever-higher resolution displays, and software started needing to scale resources up so that they remain visible to human eyes and tappable by human fingers.

Scaling non-vector images up has a performance cost and can make them look fuzzy, so Chromium now includes two copies of each bitmap image, one in the 100_percent resource file, and a double-resolution version in the 200_percent resource file. The browser selects the appropriate version at runtime based on the current device’s display density.

Exploring PAK Files

You can find the browser’s resource files within Chrome/Edge’s Application folder:

Unfortunately for the curious, PAK is a binary format which is not easily human readable. Beyond munging many independent resources into a single file, the format relies upon GZIP or Brotli compression to shrink the data streams embedded inside the file. Occasionally, someone goofs and forgets to enable compression on a resource, bloating the file but leaving the plaintext easy to read in a hex-editor:

If you want a better look inside of a PAK file, you can use the unpack.bat tool, but this tool does not yet support decompressing brotli-compressed data (because Brotli support was added to PAK relatively recently). If you need to see a brotli-compressed resource, use unpack.bat to get the raw file. Then strip 8 bytes off the front of the file and use brotli.exe to decompress the data.

brotli.exe --decompress --in extracted.br --out plain.txt --verbose

Despite the availability of more efficient image formats (e.g. WebP), many browser bitmap resources are still stored as PNG files. My PNGDistill tool offers a GROVEL mode that allows extracting all of the embedded PNGs out of any binary file, including PAK:

You can then run PNGDistill on the extracted PNGs and discover that our current PNG compression efficiency inside resources.pak is just 94%, with 146K of extra size due to suboptimal compression.

Fortunately, the PNGs are almost all properly-stripped of metadata with the exception of this cute little guy, whose birthday is recorded in a PNG comment:

Have fun looking under the hood!

-Eric

Real-World Running

Yesterday, I ran my first 10K in “the real world”, my first real world run in a long time. Almost eleven years ago I ran a 5K, and 3.5 years ago I ran a non-competitive 5 miler on Thanksgiving.

I was a bit worried about how my treadmill training would map to real world running, but signed up for the Austin Capitol 10K as a forcing function to keep working out.

tl;dr: it went pretty well: it didn’t go as well as I’d hoped, nor as poorly as I feared. I beat my previous 10K time by about four minutes and finished just inside the top third of my age group.

What went well:

  • The weather was great: Sixties to low seventies. It was sunny, but early enough that it wasn’t too hot. I put my bib number on my shorts because I expected I’d have to take my shirt off to use as a towel, but I didn’t have sweat pouring into my eyes until the very end.
  • Because it wasn’t hot, my little SpeedDraw water bottle lasted me through the whole race.
  • While I expect my knees are likely to be my eventual downfall, I didn’t have any knee pain at all during the run.
  • No blisters or chafing, with new Balega socks and Brooks Ricochet 3 on my feet and Body Glide for my chest.
  • After running on the perfectly flat and almost perfectly predictable treadmill for three months, I expected I was going to trip or slide in gravel when running on a real road. While I definitely had to pay a lot more attention to my foot placement, I didn’t slip at all.
  • Sprinting at the finish felt great.
  • A friend ran with me and helped keep me motivated.
  • The drive to and parking for the race was easy.

Could’ve been better/worse:

  • Two miles in, my stomach started threatening to get rid of the prior night’s dinner; the feeling eventually faded, but I worried for the rest of the race.
  • As a little kid, I had horrible allergies, but since moving to Texas they’re mostly non-existent. On Saturday, my nose started running a bit and didn’t stop all weekend. Could’ve been a disaster, but it was a mild annoyance at most.
  • I woke up at 4:30am, 90 minutes before my alarm, and couldn’t get back to sleep. Still, I got 5.5 hours of good sleep.

What didn’t go so well:

  • Pacing. Running on a treadmill is trivial– you go as fast as the treadmill, and you always know exactly how fast that is. I wore my FitBit Sense watch, but because I had to keep my eyes on the road, I barely looked at it, and it never seemed to be showing what I wanted to know.

    I missed seeing the 1 mile marker, and by the time the 2 mile marker came along I was feeling nervous and demotivated. My watch had me believing that I was far behind my desired pace (I wasn’t, my first mile was my fastest and at my target pace, despite the obstacles). I didn’t get into my “groove” until the fourth mile of the race, but by then I wasn’t able to maintain it.

    I spend most of my running on the treadmill around 145bpm “Cardio” heart-rate zone (even when running intervals), but I spent most of this race in “Peak”, averaging 165bpm. More importantly, when I do peak on the treadmill, it’s usually been at the very end of the workout, and I get to cool down slowly with a long walk after. I need more practice ramping down my effort while still running at a more manageable pace.
  • Crowds. There were 15000 participants, and when I signed up, I expected I’d probably walk most of the course. As a consequence, I got assigned to one of the late-starting corrals with other slow folks, and I had to spend the first mile or so dodging around them. Zigzagging added an extra tenth of a mile to my race.
  • Hills. Running hills in the real world is a lot harder than doing it on the treadmill. In the real world, you have to find ways to run around the walls of walkers; while I’d hoped I’d find this motivational, seeing so many folks just walking briskly enticed me to join them. It didn’t help that some jerk at the top of the hill at 1.5mi assured everyone that it was the last hill of the race– while I didn’t really believe him, there was a part of me that very much wanted to, and I was quite annoyed when we hit the bigger hill later.
  • Forgettability. Likely related to the fact that I spent almost all of my time watching the road underfoot, an hour after the race was over, I struggled to remember almost anything about it. From capitol views to bands, I know there were things to see, but have no real memories of them. I’m spoiled by running on the treadmill “in” Costa Rica, although I don’t remember a ton of that either. Running definitely turns off my brain.
  • Poor Decisions. After the race, I wanted to keep moving, so I walked to a coffee shop I’d passed on my drive into the city. While my legs and cardio had no problem with this, at the end of the four-mile post-race walk, my feet (particularly my left foot arch) were quite unhappy.

Lest you come away with the wrong conclusion, I’m already trying to find another workable 10K sometime soon. That might be hard with the Texas summer barrelling toward us.

-Eric

Notice: As an Amazon Associate, I earn from qualifying purchases from Amazon links in this post.