
Captive Portals

When you join a public WiFi network, sometimes you’ll notice that you have to accept “Terms of Use” or provide a password or payment to use the network. Your browser opens or navigates to a page that shows the network’s legal terms or web log on form, you fill it out, and you’re on your way. Ideally.

How does this all work?

Wikipedia has a nice article about Captive Portals, but let’s talk about the lower-level mechanics.

Operating Systems’ Portal Detection

When a new network connection is established, Windows will send a background HTTP request to www.msftconnecttest.com/connecttest.txt. If the result is an HTTP/200 but the response body doesn’t match the string the server is known to always send in reply (“Microsoft Connect Test”), the OS will launch a web browser to the non-secure HTTP URL www.msftconnecttest.com/redirect. The expectation is that if the user is behind a Captive Portal, the WiFi router will intercept these HTTP requests and respond with a redirect to a page that will allow the user to log on to the network. After the user completes the ritual, the WiFi router stores the MAC Address of the device’s network card to avoid repeating the dance on every subsequent connection.
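The decision described above can be sketched in a few lines of Python. This is only an illustration of the logic, not how Windows implements it: the function name classify_ncsi_probe, the state labels, and the handling of redirects and failed requests are assumptions made for the example. Only the expected body string comes from the description above.

```python
# Body the real server is described as always returning.
EXPECTED_BODY = "Microsoft Connect Test"


def classify_ncsi_probe(status, body):
    """Classify the result of the connecttest.txt probe.

    status is the HTTP status code, or None if the request failed outright.
    The returned labels are hypothetical names used only in this sketch.
    """
    if status is None:
        # Assumption: a failed request is treated as "no Internet access".
        return "no-internet"
    if status == 200 and body == EXPECTED_BODY:
        return "internet"
    # A 200 with an unexpected body, or a redirect, suggests that something
    # between the client and the real server rewrote the response.
    if status == 200 or 300 <= status < 400:
        return "captive-portal"
    return "unknown"
```

In this sketch, a portal that answers the probe with its own logon page (a 200 with different content) and a portal that answers with a redirect are both reported as "captive-portal".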

This probing functionality is a part of the Network Connectivity Status Indicator feature of Windows, which will also ensure that the WiFi icon in your task bar indicates if the current connection does not yet have access to the Internet at large. Beyond this active probing behavior, NCSI also has a passive polling behavior that watches the behavior of other network APIs to detect the network state.

Other Windows applications can detect the Captive Portal State using the Network List Manager API, which indicates NLM_INTERNET_CONNECTIVITY_WEBHIJACK when Windows noticed that the active probe was hijacked by the network. Enterprises can reconfigure the behavior of the NCSI feature using registry keys or Group Policy.

On macOS computers, the OS offers a very similar active probe: a non-secure request to http://captive.apple.com is expected to always reply with “Success”.

Edge Portal Detection

Chromium includes its own Captive Portal detection logic whereby a probe URL is expected to return an HTTP/204 No Content response.

Edge specifies a probe URL of http://edge-http.microsoft.com/captiveportal/generate_204.

Chrome uses the probe URL http://www.gstatic.com/generate_204.
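A rough Python sketch of a 204-style probe follows. It is not Chromium’s code: the function names, the state labels, and the exact mapping of non-204 responses are simplifying assumptions for illustration. The one deliberate design choice is that redirects are not followed, so that a portal’s 3xx response is visible to the caller.

```python
import urllib.error
import urllib.request


class _NoRedirect(urllib.request.HTTPRedirectHandler):
    """Refuse to follow redirects so a portal's 3xx response is surfaced."""

    def redirect_request(self, req, fp, code, msg, headers, newurl):
        return None


def classify_204_probe(status):
    """Map a probe status code (or None on failure) to a hypothetical state."""
    if status is None:
        return "offline"
    if status == 204:
        return "online"
    # Assumption: any other 2xx or 3xx answer means the probe was hijacked.
    if 200 <= status < 400:
        return "behind-portal"
    return "unknown"


def run_probe(url, timeout=5.0):
    """Send the probe request and classify the outcome."""
    opener = urllib.request.build_opener(_NoRedirect)
    try:
        with opener.open(url, timeout=timeout) as response:
            status = response.status
    except urllib.error.HTTPError as err:
        # Unfollowed redirects and 4xx/5xx responses arrive here.
        status = err.code
    except OSError:
        # DNS failure, refused connection, timeout, and so on.
        status = None
    return classify_204_probe(status)
```

Calling run_probe("http://www.gstatic.com/generate_204") on an open network would be expected to yield "online"; behind a portal that redirects the probe, the same call would yield "behind-portal".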

Avoiding HTTPS

Some Captive Portals perform their interception by returning a forged server address when the client attempts a DNS lookup. However, DNS hijacking is not possible if DNS-over-HTTPS (DoH) is in use. To mitigate this, the detector bypasses DoH when resolving the probe URL’s hostname.

Similarly, note that all of the probe URLs specify non-secure http://. If a probe URL started with https://, the WiFi router would not be able to successfully hijack it. HTTPS is explicitly designed to prevent a Monster-in-the-Middle (MiTM) like a WiFi router from changing any of the traffic, using cryptography and digital signatures to protect the traffic from modification. If a hijack tries to redirect a request to a different location, the browser will show a Certificate Error page that indicates that either the router’s certificate is not trusted, or that the certificate the router used to encrypt its response does not have a URL address that matches the expected website (e.g. edge-http.microsoft.com).

This means, among other things, that new browser features that upgrade non-secure HTTP requests to HTTPS must not attempt to upgrade the probe requests, because doing so will prevent successful hijacking. To that end, Edge’s Automatic HTTPS feature includes a set of exclusions:

kAutomaticHttpsNeverUpgradeList {
    "msftconnecttest.com, edge.microsoft.com, "
    "neverssl.com, edge-http.microsoft.com" };

Unfortunately, this exclusion list alone isn’t always enough. Consider the case where a WiFi router hijacks the request for edge-http.microsoft.com and redirects it to http://captiveportal.net/accept_terms. The browser might try to upgrade that navigation request (which targets a hostname not on the exclusion list) to HTTPS. If the portal’s server doesn’t support HTTPS, the user will either encounter a Connection Refused error or an Untrusted Certificate error.
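To make the gap concrete, here is a small Python sketch of an upgrade decision driven by an exclusion list. It is an assumption-laden illustration rather than Edge’s implementation: the function names are hypothetical, and matching subdomains of each entry is a guess made so that www.msftconnecttest.com is covered.

```python
from urllib.parse import urlsplit

# Hostnames taken from the exclusion list shown above.
NEVER_UPGRADE = (
    "msftconnecttest.com",
    "edge.microsoft.com",
    "neverssl.com",
    "edge-http.microsoft.com",
)


def is_excluded(host):
    """True if host equals a listed entry or is a subdomain of one (assumed)."""
    host = host.lower().rstrip(".")
    return any(host == entry or host.endswith("." + entry)
               for entry in NEVER_UPGRADE)


def maybe_upgrade(url):
    """Return the URL a hypothetical upgrade feature would navigate to."""
    parts = urlsplit(url)
    if parts.scheme != "http" or not parts.hostname:
        return url
    if is_excluded(parts.hostname):
        return url
    return parts._replace(scheme="https").geturl()
```

With this sketch, the probe URL http://edge-http.microsoft.com/captiveportal/generate_204 is left alone, but the portal’s own http://captiveportal.net/accept_terms is rewritten to https://, which is exactly the case that breaks when the portal’s server does not support HTTPS.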

If a user does happen to try to navigate to an HTTPS address before authenticating to the portal, and the router tries to hijack the secure request, Chromium detects this condition and replaces the normal certificate error page with a page suggesting that the user must first satisfy the demands of the Captive Portal:

For years, this friendly design had a flaw– if the actual captive portal server specified an HTTPS log on URL but that log on URL sent an invalid certificate, there was no way for the user to specify “I don’t care about the untrusted certificate, continue anyway!” I fixed that shortcoming in Chromium v101, such that the special “Connect to Wi-Fi” page is not shown if the certificate error appears on the tab shown for Captive Portal login.

-Eric

Extending Fiddler’s ImageView

Fiddler’s ImageView Inspector offers a lot of powerful functionality for inspecting images and discovering ways to shrink an image’s byte-weight without impacting its quality.

Less well-known is the fact that the ImageView Inspector is very extensible, such that you can add new tools to it very simply. To do so, simply download any required executables and add registry entries pointing at them.

For instance, consider Guetzli, the JPEG-optimizer from the compression experts at Google. It aims to shrink JPEG images by 20-30% without impacting quality. The tool is delivered as a command-line executable that accepts an image’s path as input, generating a new file containing the optimized image. If you pass no arguments at all, you get the following help text:

To integrate this tool into Fiddler, simply:

Download the executable (x64 or x86 as appropriate) to a suitable folder.
Run regedit.exe and navigate to HKEY_CURRENT_USER\Software\Microsoft\Fiddler2\ImagesMenuExt\
Create a new key with the caption of the menu item you’d like to add. (Optionally, add a & character before the accelerator key.)
Create Command, Parameters and Types values of type REG_SZ. Read on for details of what to put in each.

Alternatively, you could use Notepad to edit an AddCommand.reg script and then double-click the file to update your registry:

Windows Registry Editor Version 5.00

[HKEY_CURRENT_USER\Software\Microsoft\Fiddler2\ImagesMenuExt\&JPEGGuetzli]
"Command"="C:\\YOURPATHToTools\\guetzli_windows_x86-64.exe"
"Parameters"="{in} {out:jpg}"
"Types"="image/jpeg"

When you’re done, the registry key should look something like:

After you’ve set up your new command, when you right-click on a JPEG in the ImageView, you’ll see your new menu item in the Tools submenu:

When you run the command, the tool will run and a new entry will be added to the Web Sessions list, containing the now optimized image:

Registry Values

The required Command entry points to the location of the executable on disk.

The optional Parameters entry specifies the parameters to pass to the tool. The Parameters entry supports two tokens, {in} and {out}. The {in} token is replaced with the full path to the temporary file Fiddler uses to store the raw image from the ImageView Inspector before running the tool. The {out} token is replaced with the filepath Fiddler requests the tool write its output to. If you want the output file to have a particular extension, you can specify it after a colon; for example {out:jpg} generates a filename ending in .jpg. If you omit the Parameters value, a default of {in} is used.
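The token substitution described above can be approximated with the following Python sketch. It is not Fiddler’s code: the function name, the way the output filename is generated, and the absence of any quoting around paths are assumptions chosen to keep the example short.

```python
import os
import re
import tempfile
import uuid

# Matches {out} and {out:ext}; the extension is captured if present.
_OUT_TOKEN = re.compile(r"\{out(?::([A-Za-z0-9]+))?\}")


def expand_parameters(template, in_path, out_dir=None):
    """Expand {in} and {out[:ext]} tokens.

    Returns (expanded_parameters, out_path); out_path is None when the
    template contains no {out} token. Naming the output file with a random
    identifier is an assumption of this sketch.
    """
    if not template:
        template = "{in}"  # default when the Parameters value is omitted
    out_dir = out_dir or tempfile.gettempdir()
    out_path = None

    def replace_out(match):
        nonlocal out_path
        if out_path is None:
            ext = match.group(1)
            name = uuid.uuid4().hex + ("." + ext if ext else "")
            out_path = os.path.join(out_dir, name)
        return out_path

    expanded = _OUT_TOKEN.sub(replace_out, template)
    expanded = expanded.replace("{in}", in_path)
    return expanded, out_path
```

For the Guetzli entry above, expand_parameters("{in} {out:jpg}", input_path) would produce the input path followed by a generated path ending in .jpg, which is the argument pair the tool expects.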

The optional Types parameter limits the MIME types on which your tool is offered. For example, if your tool only analyzes .png files, you can specify image/png. You can specify multiple types by separating them with a comma, space, or semicolon.
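As a rough illustration of how a Types value might be interpreted, the Python sketch below splits the value on commas, spaces, and semicolons. The function names are hypothetical, and two behaviors are assumptions: that an omitted Types value means the tool is offered for every image, and that the comparison ignores case.

```python
import re


def parse_types(types_value):
    """Split a Types value on commas, semicolons, or whitespace.

    Returns None when no restriction is configured (assumed behavior).
    """
    if not types_value:
        return None
    return {t.lower() for t in re.split(r"[,;\s]+", types_value) if t}


def tool_applies(types_value, mime_type):
    """True if a tool with this Types value should be offered for mime_type."""
    allowed = parse_types(types_value)
    return allowed is None or mime_type.lower() in allowed
```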

The optional Options value enables you to specify either <stdout> or <stderr> (or both) and Fiddler will collect any textual output from the tool and show it in a message box. For instance, the default To WebP Lossless command sets the <stderr> option, and after the tool finishes, a dialog box is shown:

Inspect and optimize all the things!

-Eric

“Batteries-Included” vs “Bloated”

Fundamentals are invisible. Features are controversial.

One of the few common complaints against Microsoft Edge is that “It’s bloated– there’s too much stuff in it!”

A big philosophical question for designers of popular software concerns whether the product should include features that might not be useful for everyone or even a majority of users. There are strong arguments on both sides of this issue, and in this post, I’ll explore the concerns, the counterpoints, and share some thoughts on how software designers should think about this tradeoff.

But first, a few stories

I started working in Microsoft Office back in 1999, on the team that was to eventually ship SharePoint. Every few months in the early 2000s, a startup would appear, promising a new office software package that’s “just the 10% of Microsoft Office that people actually use.” All of these products failed (for various reasons), but they all failed in part because their development and marketing teams failed to recognize a key fact: Yes, the vast majority of customers use less than 10% of the features of the Microsoft Office suite, but it’s a different 10% for each customer.
I started building the Fiddler Web Debugger in 2003, as a way for my team (the Office Clip Art team) to debug client/server traffic between the Microsoft Clip Art client app and the Design Gallery Live webservice that hosted the bulk of the clipart Microsoft made available. I had no particular ambitions to build a general purpose debugger, but I had a problem: I needed to offer a simple way to filter the list of web requests based on their URLs or other criteria, but I really didn’t want to futz with building a complicated filter UI with dozens of comboboxes and text fields.

I mused “If only I could let the user write their queries in code and then Fiddler would just run that!” And then I realized I could do exactly that, by embedding the JScript.NET engine into Fiddler. I did so, and folks from all over the company started using this as an extensibility mechanism that went far beyond my original plans. As I started getting more feature requests from folks interested in tailoring Fiddler to their own needs, I figured “Why not just allow developers to write their own features in .NET?” So I built in a simplistic extensibility model that allowed adding new features and tabs all over. Over a few short years, a niche tool morphed into a wildly extensible debugger used by millions of developers around the world.
The original 2008 release of the Chrome browser was very limited in terms of features, as the team was heavily focused on performance, security, and simplicity. But one feature that seemed to get a lot of consideration early on was support for Mouse Gestures; some folks on the team loved gestures, but there was recognition that it was unlikely to be a broadly-used feature. Ultimately, the Chrome team decided not to implement mouse gestures, instead leaving the problem space to browser extensions.

Years later, after Chrome became my primary browser, I lost my beloved IE Mouse Gestures extension, so I started hunting for a replacement. I found one that seemed to work okay, but because I run Fiddler constantly, I soon noticed that every time I invoked a gesture, it sent the current page’s URL and other sensitive data off to some server in China. Appalled at the hidden impact to my privacy and security, I reported the extension to the Chrome Web Store team and uninstalled it. The extension was delisted from the web store.

Some time later, now back on the Edge team, a new lead joined and excitedly recommended we all try out a great Mouse Gestures extension for Chromium. I was disappointed to discover it was the same extension that had been removed previously, now with a slightly more complete Privacy Policy, and now using HTTPS when leaking users’ URLs to its own servers. (Update: As of 2023, Edge has built-in Mouse Gestures. Hooray!)

With these background stories in hand, let’s look at the tradeoffs.

“Bloated!”

There are three common classes of complaint from folks who point at the long list of features Edge has added over upstream Chromium and furiously charge “It’s bloated!“:

User Experience complexity
Security/reliability
Performance

UX Complexity

Usually when you add a feature to the browser, you add new menu items, hotkeys, support articles, group policies, and other user-visible infrastructure to support that feature. If you’re not careful, it’s easy to accidentally break a user’s longstanding workflows or muscle memory.

One of the Windows 7 Design Principles was “Change is bad, unless it’s great!” and that’s a great truth to keep in mind– entropy accumulates, and if you’re not careful, you can easily make the product worse. Users don’t like it when you move their cheese.

A natural response to this concern is to design new features to be unobtrusive, by leaving them off-by-default, or hiding them away in context menus, or otherwise keeping them out of the way. But now we’ve got a problem– if users don’t even know about your feature, why bother building it? If potential customers don’t know that your unique and valuable features exist, why would they start using your product instead of sticking with the market leader, even if that leader has been stagnant for years?

Worse still, many startups and experiments are essentially “graded” based on the number of monthly or daily active users (MAU or DAU)– if a feature isn’t getting used, it gets axed or deprioritized, and the team behind it is reallocated to a more promising area. Users cannot use a feature if they haven’t discovered it. As a consequence, in an organization that lacks powerful oversight there’s a serious risk of tragedy, whereby your product becomes a sea of banners and popups each begging the user for attention. Users don’t like it when they think you’re distracting them from the cheese they’ve been enjoying.

Security/reliability risk

Engineers and enthusiasts know that software is, inescapably, never free of errors, and intuitively it seems that every additional line of code in a product is another potential source of crashes or security vulnerabilities.

If software has an average of, say, two errors per thousand lines of code, adding a million lines of new feature code mathematically suggests there are now two thousand more bugs that the user might suffer.

If users have to “pay” for features they’re not using, this feels like a bad deal.

Performance

Unlike features, Performance is one of the “Universal Goods” in software– no user anywhere has ever complained that “My app runs too fast!” (with the possible exception of today’s gamers trying to use their 4GHz CPUs to run retro games from the 1990s).

However, we users also note that, even as our hardware has gotten thousands of times faster over the decades, our software doesn’t seem to have gotten much faster at all. Much like our worry about new features introducing code defects, we also worry that introducing new features will make the product as a whole slower, with higher CPU, memory, or storage requirements.

Each of these three buckets of concerns is important; keep them in mind as we consider the other side.

“Batteries Included!”

We use software to accomplish tasks, and features are the mechanism that software exposes to help us accomplish our tasks.

We might imagine that ideal software would offer exactly and only the features we need, but this is impractical. Oftentimes, we may not recognize the full scope of our own needs, and even if we do, most software must appeal to broad audiences to be viable (e.g. the “10% of Microsoft Office” problem). And beyond that, our needs often change over time, such that we no longer need some features but do need other features we didn’t use previously.

One school of thought suggests that the product team should build a very lightweight app with very few features, each of which is used by almost everyone. Features that will be used by fewer users are instead relegated to implementation via an extensibility model, and users can cobble together their own perfect app atop the base.

There’s a lot of appeal in such a system– with less code running, surely the product must be more secure, more performant, more reliable, and less complex. Right?

Unfortunately, that’s not necessarily the case. Extension models are extremely hard to get right, because until you build all of the extensions, you’re not sure that the model is correct or complete. If you need to change the model, you may need to change all of the extensions (e.g. witness the painful transition from Chromium’s Manifest v2 to Manifest v3).

Building features atop an extension model sometimes entails major performance bugs, because data must flow through more layers and interfaces, and if needed events aren’t exposed, you may need to poll for updates. Individual extensions with common needs may have to do redundant work (e.g. each extension scanning the full text of each loaded page, rather than the browser itself scanning the whole thing just once).

As we saw with the Mouse Gestures story above, allowing third-party extensions carries along with it a huge amount of complexity related to security risk and misaligned incentives. In an adversarial ecosystem where good and bad actors both participate, you must invest heavily in security and anti-abuse mechanisms.

Finally, regression testing and prevention gets much more challenging when important features are relegated to extensions. Product changes that break extensions won’t block the commit queue, and the combinatorics involved in testing with arbitrary combinations of extensions quickly hockey sticks upward to infinity.

Extensions also introduce complexity in the management and update experience, and users might miss out on great functionality because they never discovered that an extension exists to address a need they have (or didn’t even realize they have). You’d probably be surprised by the low percentage of users that have any browser extensions installed at all.

With Fiddler, I originally took the “Platform” approach where each extra feature was its own extension. Users would download Fiddler, then download four or five other packages after/if they realized such valuable functionality existed. Over time, I realized that nobody was happy with my Ikea-style assemble-your-own debugger, so I started shipping a “Mondo build” of Fiddler that just included everything.

Extensions, while useful, are no panacea.

Principles

These days, I’ve come around to the idea that we should include as many awesome features as we can, but we should follow some key principles:

To the extent possible, features must be “pay to play.” If a user isn’t using a given feature, it should not impact performance, security, or reliability. Even small regressions quickly add up.
Don’t abuse integration to avoid clean layering and architecture. Just because your feature’s implementation can go poke down into the bowels of the engine doesn’t mean it should.
Respect users and carefully manage UX complexity. Remember, “change is bad unless it’s great.” Invest in systems that enable you to only advertise new features to the right users at the right time.
Remove failed experiments. If you ship a feature and find that it’s not meeting your goals, pull it out. If you must accommodate a niche audience that fell in love, consider whether an extension might meet their needs.
Find ways to measure, market, and prioritize investments in Fundamentals. Features usually hog all the glory, but Fundamentals ensure those features have the opportunity to shine.

-Eric

Chromium Startup

This morning, a Microsoft Edge customer contacted support to ask how they could launch a URL in a browser window at a particular size. I responded that they could simply use the --window-size="800,600" command line argument. The customer quickly complained that this only seemed to work if they also specified a non-default path in the --user-data-dir command line argument, such that the URL opened in a different profile.

As I mentioned back in my post about Edge Command Line Arguments, most command-line arguments are ignored if there’s already a running instance of Edge, and this is one of them. Even if you also pass the --new-window argument, Edge simply opens the URL you’ve supplied inside a new window with the same size and location as the original window.
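For scripted launches, the workaround the customer found can be expressed as a small Python helper. This is a sketch under assumptions: the helper name is hypothetical, msedge.exe is assumed to be resolvable on the PATH, and the profile folder in the usage note is a placeholder. The switches themselves are the ones discussed above.

```python
def build_edge_command(url, width, height, user_data_dir=None, exe="msedge.exe"):
    """Build an argument list that opens url in a window of the given size.

    Supplying user_data_dir forces a separate browser instance, which is
    what makes --window-size take effect when Edge is already running.
    """
    args = [exe, "--new-window", f"--window-size={width},{height}"]
    if user_data_dir:
        args.append(f"--user-data-dir={user_data_dir}")
    args.append(url)
    return args
```

The resulting list can be handed to subprocess.Popen, for example subprocess.Popen(build_edge_command("https://example.com", 800, 600, r"C:\temp\sized-profile")). Keep in mind that the window then runs in a different profile, with none of the user’s usual cookies or settings.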

Now, in many cases, this is a reasonable limitation. Many of the command line arguments you can pass into Chromium have a global impact, and having inbound arguments change the behavior of an already-running browser instance would introduce an impractical level of complexity to the code and user experience. Similarly, there’s an assumption in the Chromium code that only one browser instance will interact with a single profile folder at one time, so we could not simply allow multiple browser instances with different behaviors to use a single profile in parallel.

In this particular case, it feels reasonable that if a user passes both --new-window and either --window-size or --window-position (or all three), the resulting window will have the expected dimensions, even if there were already one or more browser windows open in the browsing session. Because the arguments do not impact more than the newly-created window, there’s none of the complexity of trying to change the behavior of any other part of the already-running browser. I filed a bug suggesting that we ought to look at enabling this.

In the course of investigating this bug, I had the opportunity to learn a bit more about how Chromium handles the invocation of a URL when there’s already a running instance. When it first starts, the new browser process searches for an existing hidden message window of Class Chrome_MessageWindow with a Caption matching the user-data-dir of the new process.

If it fails to find one, it creates a new hidden messaging window (using a mutex to combat race conditions) with the correct caption for any future processes to find.

However, if the code did find an existing messaging window, there’s a call to an AttemptToNotifyRunningChrome function that sends a WM_COPYDATA message to pass along the command line from the (soon-to-exit) new process.

In the unlikely event that the existing process fails to accept the message (e.g. because it is hung), the user will be prompted to kill the existing process so that the new process can handle the navigation.
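The handoff described above can be modelled in Python without touching any Windows APIs. This is a platform-neutral simulation, not a port of Chromium’s code: a dictionary keyed by user-data-dir stands in for finding a hidden window by caption, a method call stands in for the WM_COPYDATA message, and every name below is hypothetical.

```python
class RunningInstance:
    """Stand-in for an already-running browser process."""

    def __init__(self, user_data_dir, hung=False):
        self.user_data_dir = user_data_dir
        self.hung = hung
        self.received = []  # command lines forwarded by later processes

    def handle_command_line(self, argv):
        """Accept a forwarded command line; a hung instance never responds."""
        if self.hung:
            return False
        self.received.append(list(argv))
        return True


def start_browser(user_data_dir, argv, windows):
    """Simulate startup; windows maps user-data-dir to its RunningInstance."""
    existing = windows.get(user_data_dir)
    if existing is None:
        # No message window for this profile: this process becomes the owner.
        windows[user_data_dir] = RunningInstance(user_data_dir)
        return "started"
    if existing.handle_command_line(argv):
        # The existing instance took the command line; the new process exits.
        return "forwarded"
    # The existing instance did not respond; the user would be prompted.
    return "prompt-user"
```

Launching twice with the same user-data-dir yields "started" and then "forwarded", while a different user-data-dir yields a second independent "started", which matches the customer’s observation that a non-default profile path made their arguments take effect.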

This code is surprisingly simple, and feels very familiar– the startup code inside Fiddler is almost identical except it’s implemented in C#.

In our customer scenario, we see that the existing browser instance correctly gets the command line from the new process:

[11824:4812:0615/092252.689:startup_browser_creator.cc(1391)] ProcessCommandLineAlreadyRunning "C:\src\c\src\out\default\chrome.exe" --new-window --window-size=400,510 --window-position=123,34 --flag-switches-begin --flag-switches-end example2.com

And shortly after that, there’s the expected call to the UpdateWindowBoundsAndShowStateFromCommandLine function that sets the window size and position from any arguments passed on the command line.

The stack trace of that call looks like:

-chrome::internal::UpdateWindowBoundsAndShowStateFromCommandLine
-chrome::GetSavedWindowBoundsAndShowState
-BrowserView::GetSavedWindowPlacement

Unfortunately, when we look at the GetSavedWindowBoundsAndShowState function we see the problem:

void GetSavedWindowBoundsAndShowState(const Browser* browser,
                                      gfx::Rect* bounds,
                                      ui::WindowShowState* show_state) {
  //...
  const base::CommandLine& parsed_command_line =
      *base::CommandLine::ForCurrentProcess();

  internal::UpdateWindowBoundsAndShowStateFromCommandLine(parsed_command_line,
                                                          bounds, show_state);
}

As you can see, the call passes the command line string representing the current (preexisting) process, rather than the command line that was passed from the newly started process. So, the new window ends up with the same size and position information from the original window.

To fix this, we’ll need to restructure the calls such that when we’re handling a command line passed to us from another process through ProcessCommandLineAlreadyRunning, we use the WM_COPYDATA-passed command line when setting the window size and position.
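The shape of that fix can be illustrated in Python. This is a simplified model rather than the Chromium change itself: bounds are a plain (x, y, width, height) tuple, the function names are hypothetical, and the only idea being shown is that the command line becomes an explicit input instead of always being read from the current process.

```python
def _parse_pair(value):
    """Parse "400,510" (optionally quoted) into two ints, or return None."""
    try:
        first, second = value.strip('"').split(",")
        return int(first), int(second)
    except ValueError:
        return None


def bounds_from_command_line(argv, saved_bounds):
    """Overlay --window-size / --window-position from argv on saved bounds."""
    x, y, width, height = saved_bounds
    for arg in argv:
        name, _, value = arg.partition("=")
        pair = _parse_pair(value)
        if pair is None:
            continue
        if name == "--window-size":
            width, height = pair
        elif name == "--window-position":
            x, y = pair
    return (x, y, width, height)


def new_window_bounds(saved_bounds, own_argv, forwarded_argv=None):
    """Prefer the command line forwarded by a second process, if there is one."""
    argv = forwarded_argv if forwarded_argv is not None else own_argv
    return bounds_from_command_line(argv, saved_bounds)
```

In this model, the buggy behavior corresponds to always calling bounds_from_command_line with own_argv, so the forwarded --window-size and --window-position values are never consulted.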

-Eric

Microsoft Edge Tips and Tricks

Last Updated: June 3, 2022. The intent of this post is to capture a list of non-obvious features of the browser that might be useful to you.

Q: How do I find the tab playing audio? It’s cool that Microsoft Edge shows the volume icon in the tab playing music and I can click to mute it:

…but what if I have a bunch of Edge windows? I have to go into each window to find the icon?

A: The Ctrl+Shift+A hotkey is your friend. It will show your open tabs to allow you to search across them, and those playing audio/video are listed in a group at the top:

Q: How can I move a few tabs out of the current window?

A: You can simply drag the tab’s button/title out of the tab strip to move it to a new window. Less obviously, you can Ctrl+Click *multiple* tabs and drag your selections out into a new window (unselected tabs temporarily dim). Use Shift+Click if you’d prefer to select a range of tabs.

Q: How can I duplicate a tab?

A: Hit Ctrl+Shift+K or use the “Duplicate Tab” command on the tabstrip’s context menu to duplicate the current tab. If you have a middle-mouse button, middle-click the Refresh button.

Less obviously, you can Ctrl+Click the back or forward arrow buttons to open the previous or next entry in the history in a new tab, or you can Shift+Click the buttons to open the page in a new window.

Q: How can I get back a tab I accidentally closed?

A: Hit Ctrl+Shift+T or use the “Reopen closed” option on the tabstrip’s context menu shown on right-click.

You also might be interested in the “Ask before closing a window with multiple tabs” option available inside the edge://settings page:

Q: On a desktop mouse, is middle-click useful for anything?

A: Middle-click a link to open it in a new tab. Middle-click a tab title button to close that tab rather than hunting for its [x] icon. Middle-click the refresh button to duplicate the tab.

Q: How can I easily open a given site in a different profile?

A: You can right-click a link in a page and choose “Open as” to open that link in a different profile:

If you already have the desired page open, you can right-click the tab title button and choose “Move Tab To” and pick the desired profile:

Move the current tab to a different profile

You can also use the options at edge://settings/profiles/multiProfileSettings to open particular sites using a particular profile, useful for splitting your “Work Sites” from your “Life Sites” and your “Ephemeral sites“.

Open AzDo in the Work Profile, and Hotmail in my Personal Profile

Q: How can I make any site act more like an “App” with its own window that isn’t cluttered with other tabs?

A: You can use the --app=url command line argument to give any site its own standalone window that does not mix with your other sites. For example, if you run msedge.exe --app=https://outlook.live.com, the result looks like this:

This works great with command launchers like SlickRun, because you can then just type e.g. Mail to launch the standalone web app.

You might also enjoy this collection of not-so-frequently-asked questions about Edge.
