Analyzing Network Traffic Logs (NetLog JSON)

Previously, I’ve described how to capture a network traffic log from Microsoft Edge, Google Chrome, and applications based on Chromium or Electron.

In this post, I aim to catalog some guidance for looking at these logs to help find the root cause of captured problems and otherwise make sense of the data collected.

Last Update: April 24, 2020

I expect to update this post over time as I continue to gain experience in analyzing network logs.

Choose A Viewer – Fiddler or Catapult

After you’ve collected the net-export-log.json file using the about:net-export page in the browser, you’ll need to decide how to analyze it.

The NetLog file format consists of a JSON-encoded stream of event objects that are logged as interesting things happen in the network layer. At the start of the file there are dictionaries mapping integer IDs to symbolic constants, followed by event objects that make use of those IDs. As a consequence, it’s very rare that a human will be able to read anything interesting from a NetLog.json file using just a plaintext editor or even a JSON parser.
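To make that concrete, here’s a minimal Python sketch (my own illustration, not an official tool) that loads a capture and tallies the most common event types. It assumes the top-level constants/events layout and the logEventTypes table as they appear in current captures, and that logging was stopped cleanly so the file is complete, valid JSON:

import json
from collections import Counter

# Load the NetLog capture; a log from a crashed browser may be
# truncated (missing its closing brackets) and will fail to parse.
with open("net-export-log.json", "r", encoding="utf-8") as f:
    log = json.load(f)

# constants["logEventTypes"] maps symbolic names to integer IDs;
# invert it so we can print each event's type by name.
event_names = {v: k for k, v in log["constants"]["logEventTypes"].items()}

# Tally the ten most common event types in the capture.
counts = Counter(event_names.get(e["type"], e["type"]) for e in log["events"])
for name, count in counts.most_common(10):
    print(f"{count:6}  {name}")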

The most common (by far) approach to reading NetLogs is to use the Catapult NetLog Viewer, an HTML/JavaScript application which loads the JSON file and parses it into a much more readable set of events.

An alternative approach is to use the NetLog Importer for Telerik Fiddler.

Importing NetLogs to Fiddler

For Windows users who are familiar with Fiddler, the NetLog Importer extension for Fiddler is easy-to-use and it enables you to quickly visualize HTTP/HTTPS requests and responses. The steps are easy:

  1. Install the NetLog Importer,
  2. Open Fiddler, ideally in Viewer mode (fiddler.exe -viewer)
  3. Click File > Import > NetLog JSON
  4. Select the JSON file to import

In seconds, all of the HTTP/HTTPS traffic found in the capture will be presented for your review. If the log was compressed before it was sent to you, the importer will automatically extract the first JSON file from a chosen .ZIP or .GZ file, saving you a step.

In addition to the requests and responses parsed from the log, there are a number of pseudo-Sessions with a fake host of NETLOG that represent metadata extracted from the log.

These pseudo-sessions include:

  • RAW_JSON contains the raw constants and event data. You probably will never want to examine this view.
  • CAPTURE_INFO contains basic data about the date/time of the capture, what browser and OS version were used, and the command line arguments to the browser.
  • ENABLED_EXTENSIONS contains the list of extensions that are enabled in this browser instance. This entry will be missing if the log was captured using the --log-net-log command line argument.
  • URL_REQUESTS contains a dictionary mapping every event related to URL_REQUEST back to the URL Requests to which it belongs. This provides a different view of the events that were used in the parsing of the Web Sessions added to the traffic list.
  • SECURE_SOCKETS contains a list of all of the HTTPS sockets that were established for network requests, including the certificates sent by the server and the parameters requested of any client certificates. The server certificates can be viewed by saving the contents of a -----BEGIN CERTIFICATE----- entry (through its matching -----END CERTIFICATE----- line) to a file named something.cer. Alternatively, select the line, hit CTRL+C, click Edit > Paste As Sessions, select the Certificates Inspector and press its Content Certificates button.

You can then use Fiddler’s UI to examine each of the Web Sessions.

Limitations

The NetLog format currently does not store request body bytes, so those will always be missing (e.g. on POST requests).

Unless the Include Raw Bytes option was selected by the user collecting the capture, all of the response bytes will be missing as well. Fiddler will show a “dropped” notice when the body bytes are missing.

If the user did not select the Include Cookies and Credentials option, any Cookie or Authorization headers will be stripped down to help protect private data.

Scenario: Finding URLs

You can use Fiddler’s full text search feature to look for URLs of interest if the traffic capture includes raw bytes. Otherwise, you can search the Request URLs and headers alone.

On any session, you can use Fiddler’s “P” keystroke (or the Select > Parent Request context menu command) to attempt to walk back to the request’s creator (e.g. referring HTML page).

You can find the traffic_annotation value that reflects why a resource was requested by looking for the X-Netlog-Traffic_Annotation Session Flag.

Scenario: Cookie Issues

If Fiddler sees that cookies were not set or sent due to features like SameSiteByDefault cookies, it will make a note of that in the Session using a pseudo $NETLOG-CookieNotSent or $NETLOG-CookieNotSet header on the request or response.

Closing Notes

If you’re interested in learning more about this extension, see the announcement blog post and the open-source code.

While the Fiddler Importer is very convenient for analyzing many types of problems, for others, you need to go deeper and look at the raw events in the log using the Catapult Viewer.


Viewing NetLogs with the Catapult Viewer

Opening NetLogs with the Catapult NetLog Viewer is even simpler:

  1. Navigate to the web viewer
  2. Select the JSON file to view

If you find yourself opening NetLogs routinely, you might consider using a shortcut to launch the Viewer in an “App Mode” browser instance: msedge.exe --app=https://netlog-viewer.appspot.com/#import

The App Mode instance is a standalone window which doesn’t contain tabs or other UI.

Note that the Catapult Viewer is a standalone HTML application. If you like, you can save it as a .HTML file on your local computer and use it even when completely disconnected from the Internet. The only advantage to loading it from appspot.com is that the version hosted there is updated from time to time.

Along the left side of the window are tabs that offer different views of the data; most of the action takes place on the Events tab.

Tips

If the problem only exists on one browser instance, check the Command Line parameters and Active Field trials sections on the Import tab to see if there’s an experimental flag that may have caused the breakage. Similarly, check the Modules tab to see if there are any browser extensions that might explain the problem.

Each URL Request has a traffic_annotation value which is a hash you can look up in annotations.xml. That annotation will help you find what part of Chromium generated the network request.

Most requests from web content will carry the blink_resource_loader annotation, navigations will have navigation_url_loader, and requests from features running in the browser process are likely to have other sources.

Scenario: DNS Issues

Look at the DNS tab on the left, and HOST_RESOLVER_IMPL_JOB entries in the Events tab.

One interesting fact: The DNS Error page performs an asynchronous probe to see whether the configured DNS provider is working generally. The Error page also has automatic retry logic; you’ll see a duplicate URL_REQUEST sent shortly after the failed one, with the VALIDATE_CACHE load flag added to it. In this way, you might see a DNS_PROBE_FINISHED_NXDOMAIN error magically disappear if the user’s router’s DNS flakes.

Scenario: Cookie Issues

Look for COOKIE_INCLUSION_STATUS events for details about each candidate cookie that was considered for sending or setting on a given URL Request. In particular, watch for cookies that were excluded due to SameSite or similar problems.

Scenario: HTTPS Handshaking Issues

Look for SSL_CONNECT_JOB entries. Look at the raw TLS messages on the SOCKET entries. Read the HTTPS Certificate Issues section next.

Scenario: HTTPS Certificate Issues

Note: While NetLogs are great for capturing certs, you can also get the site’s certificate from the browser’s certificate error page.

When an HTTPS connection is established, the server sends one or more certificates representing the “end-entity” (web server) and (optionally) intermediate certificates that chain back to a trusted CA. The browser must use these certificates to try to build a “chain” back to a certificate “root” in the browser/OS trust store. This process of chain building is very complicated.

NetLogs include both the certificates explicitly sent by the server as well as the full chain of certificates it selected when building a chain back to the trusted root (if any).

The NetLog includes the certificates received for each HTTPS connection in the base64-encoded SSL_CERTIFICATES_RECEIVED events, and you can look at CERT_VERIFIER_JOB entries to see Chromium’s analysis of the trust chain. If you see ocsp_response entries, it means that the server stapled OCSP responses on the connection. is_issued_by_known_root indicates whether the chain terminates in a “public” PKI root (e.g. a public CA); if it’s false it means the trust root was a private CA installed on the PC. A verification result of Invalid means that the platform certificate verifier (on Windows, CAPI2) returned an error that didn’t map to one of the Chromium error statuses.

To view the certificates, copy/paste each certificate (including the -----BEGIN CERTIFICATE----- and -----END CERTIFICATE----- lines) out to a text file, name it log.cer, and use the OS certificate viewer to view it. Or you can copy the certificate blocks to your clipboard, and in Fiddler, click Edit > Paste As Sessions, select the Certificates Inspector, and press its Content Certificates button.
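If a log contains many certificates, the copy/paste approach gets tedious. Here’s a rough Python sketch of automating the extraction; it simply scans the raw JSON text for PEM delimiters (the input and output file names are, of course, yours to choose):

import re

with open("net-export-log.json", "r", encoding="utf-8") as f:
    text = f.read()

# PEM blocks appear in the log inside JSON strings, with newlines
# escaped as \n; match each BEGIN/END pair, delimiters included.
pem = re.compile(r"-----BEGIN CERTIFICATE-----.*?-----END CERTIFICATE-----",
                 re.DOTALL)

for i, match in enumerate(pem.finditer(text)):
    block = match.group(0).replace("\\n", "\n")  # unescape JSON newlines
    with open(f"cert_{i}.cer", "w") as out:
        out.write(block + "\n")
    print(f"Wrote cert_{i}.cer")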

If you’ve got Chromium’s repo, you can instead use the script at src/net/tools/print_certificates.py to decode the certificates. There’s also a cert_verify_tool in the Chromium source you might build and try. On macOS, using the security tool’s verify-cert command to check the cert and dump-trust-settings to check the state of the Root Trust Store might be useful.

In some cases, running the certificate through an analyzer like https://crt.sh/lintcert can flag relevant problems.

Scenario: Authentication Issues

Look for HTTP_AUTH_CONTROLLER events, and responses with the status codes 401, 403, and 407.

  • If the HTTP_AUTH_CONTROLLER does not show AUTH_HANDLER_INIT, it indicates that the handler did not initialize, for instance because the scheme was unknown or disabled.

Scenario: Debugging Proxy Configuration Issues

See Debugging the behavior of Proxy Configuration Scripts.


Got a great NetLog debugging tip I should include here? Please leave a comment and teach me!

-Eric

Debugging Proxy Configuration Scripts in the new Edge

I’ve written about Browser Proxy Configuration a few times over the years, and I’m delighted that Chromium has accurate & up-to-date documentation for its proxy support.

One thing I’d like to call out is that Microsoft Edge’s new Chromium foundation introduces a convenient new feature for debugging the behavior of Proxy AutoConfiguration (PAC) scripts.

To use it, simply add alert() calls to your PAC script, like so:

alert("!!!!!!!!! PAC script start parse !!!!!!!!");
function FindProxyForURL(url, host) {
  alert("Got request for (" + url + " with host: " + host + ")");
  return "PROXY 127.0.0.1:8888";
}
alert("!!!!!!!!! PAC script done parse !!!!!!!!");

Then, collect a NetLog trace from the browser:

msedge.exe --log-net-log=C:\temp\logFull.json --net-log-capture-mode=IncludeSocketBytes

…and reproduce the problem.

Save the NetLog JSON file and reload it into the NetLog viewer. Search in the Events tab for PAC_JAVASCRIPT_ALERT events.

Even without adding new alert() calls, you can also look for HTTP_STREAM_JOB_CONTROLLER_PROXY_SERVER_RESOLVED events to see what proxy the proxy resolution process determined should be used.

One limitation of the current logging is that if the V8 Proxy Resolver process crashes (e.g. because Citrix injected a DLL into it), there’s no mention of that crash in the NetLog; it will just show DIRECT. Until the logging is enhanced, users can hit SHIFT+ESC to launch the browser’s task manager and check to see whether the utility process is alive.

Try using the System Resolver

In some cases (e.g. when using DirectAccess), you might want to try using Windows’ proxy resolution code rather than the code within Chromium.

The --winhttp-proxy-resolver command line argument will direct Chrome/Edge to call out to Windows’ WinHTTP Proxy Service for PAC processing.

Differences in WPAD/PAC Processing

  • The WinHTTP Proxy Service caches proxy authentication credentials (in CredMan) and automatically reuses them across apps and browser launches; Chromium does not.
  • The WinHTTP Proxy Service caches WPAD determination across process launches. Chromium does not and will need to redetect the proxy each time the browser reopens.
  • Internet Explorer/WinINET/Edge Legacy call the PAC script’s FindProxyForURLEx function (introduced to unlock IPv6 support), if present, and FindProxyForURL if not.
  • Chrome/Edge/Firefox only call the FindProxyForURL function and do not call the Ex version.
  • Internet Explorer/WinINET/Edge Legacy expose a getClientVersion API that is not defined in other PAC environments.
  • Chrome/Edge may return different results than IE/WinINET/EdgeLegacy from the myIpAddress function when connected to a VPN.
  • Edge 79+/Chrome do not allow loading a PAC script from a file:// URI. (IE allowed this long ago, but it hasn’t been supported in a long time.) If you want to try testing a PAC script locally in Chromium-based browsers, you can encode the whole script into a data: URL and use that (see the sketch after this list). IE/Edge Legacy do not support this mechanism.
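Here’s a sketch of generating such a data: URL from a local PAC file; the file name is hypothetical, and --proxy-pac-url is Chromium’s command-line switch for supplying a PAC location:

import base64

with open("proxy.pac", "rb") as f:
    pac = f.read()

# PAC scripts use the application/x-ns-proxy-autoconfig MIME type.
url = ("data:application/x-ns-proxy-autoconfig;base64," +
       base64.b64encode(pac).decode("ascii"))
print(url)  # pass the result via msedge.exe --proxy-pac-url=<url>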

Notes for Other Browsers

  • Prior to Windows 8, IE showed PAC alert() notices in a modal dialog box. It no longer does so and alert() is a no-op.
  • Firefox shows alert() messages in the Browser Console (hit Ctrl+Shift+J); note that Firefox’s Browser Console is not the Web Console where web pages’ console.log statements are shown.

-Eric

Celebrating Fifteen Years

While lately I’ve been endlessly streaming the latest news with horrified fascination, this morning my calendar unexpectedly popped up a reminder set over a year ago… Today is the fifteenth anniversary of my big-league blogging debut on the Internet Explorer Team’s blog.

My first post there, “A HTTP Detective Story” remains one of my favorites. Sadly, its topic feels all too familiar: a website took a dependency upon a browser quirk, the Referer and User-Agent headers were critical elements of the repro, and I used Fiddler to root-cause the problem. I’ve learned (and shared) so much more about these topics over the last fifteen years, and I appreciate the readers who’ve followed me from the IEBlog to IEInternals (237 posts) to Telerik’s Fiddler blog (41 posts) to my book, to this, my newest blog, TextSlashPlain (189 posts and counting!).

Here’s to learning and sharing for the next 15 years!

With gratitude,

-Eric

Enigma Conference 2020 – Browser Privacy Panel

Brave, Mozilla Firefox, Google Chrome and Microsoft Edge presented on our current privacy work at the Enigma 2020 conference in late January. The talks were mostly high-level, but there were a few feature-level slides for each browser.

My ~10 minute presentation on Microsoft Edge was first, followed by Firefox, Chrome, and Brave.

At the 40-minute mark, a 35-minute Q&A session starts, first with questions from the panel moderator, followed by questions from the audience.

“Can I… in the new Edge?”

This post is intended to collect a random set of questions I’ve been asked multiple times about the new Chromium-based Edge. I’ll add to it over time. I wouldn’t call this a FAQ because these questions, while repeated, are not frequently asked.

Last Update: Sept 30, 2020

Can I get a list of all of the command line arguments for Edge?

Unfortunately, we do not currently publish the list of command line arguments, although in principle we could use the same tool Chromium does to parse our source and generate a listing.

In general, our command-line arguments are the same as Chromium’s, with the exception of marketing names (e.g. Chrome uses --incognito while msedge.exe uses --inprivate) and restricted words (sometimes Edge replaces blacklist with denylist and whitelist with allowlist). Note: The peter.sh list linked here is automatically generated; check the Last automated update occurred on text to verify that it was updated recently.

Can I block my employees from accessing the edge://flags page?

You can add “edge://flags” to the URLBlocklist if desired. Generally, we don’t recommend using this policy to block edge://* pages as doing so could have unexpected consequences.

Note that, even if you block access to edge://flags, a user is still able to modify the JSON data storage file backing that page: %LocalAppData%\Microsoft\Edge\User Data\Local State using Notepad or any other text editor.

Similarly, a user might specify command line arguments when launching msedge.exe to change a wide variety of settings.

Can I disable certain ciphers, like 3DES, in the new Edge?

The new Edge does not use SChannel, so none of the prior SChannel cipher configuration policies or settings have any effect on the new Edge.

Group Policy may be used to configure the new Edge’s SSLVersionMin (which does impact available cipher suites, but doesn’t disable all of the ciphers considered “Weak” by SSLLabs.com’s test).

Chromium explicitly made a design/philosophical choice (see this and this) not to support disabling individual cipher suites via policy. Ciphersuites in the new Edge may be disabled using a command-line flag:

msedge.exe --cipher-suite-denylist=0x000a https://ssllabs.com
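(The hex value is the cipher suite’s number from the IANA TLS registry; 0x000a identifies TLS_RSA_WITH_3DES_EDE_CBC_SHA, the 3DES suite.)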

A few other notes:

  • The cipher suite in use is selected by the server from the list offered by the client. So if an organization is worried about ciphers used within their organization, they can simply direct their servers to only negotiate cipher suites acceptable to them.
  • The Chrome team has begun experimenting with disabling some weaker/older ciphersuites; see https://crbug.com/658905
  • If an Enterprise has configured IE Mode, the IE Mode tab’s HTTPS implementation is still controlled by Internet Explorer / Windows / SChannel policy, not the new Edge Chromium policies.
  • If TLS/1.3 is enabled, you cannot use the cipher-suite-denylist to disable ciphers 0x1301, 0x1302, and 0x1303. TLS1.3 spec: “A TLS-compliant application MUST implement the TLS_AES_128_GCM_SHA256 [GCM] cipher suite and SHOULD implement the TLS_AES_256_GCM_SHA384 [GCM] and TLS_CHACHA20_POLY1305_SHA256 [RFC8439] cipher suites (see Appendix B.4).”

Can I use TLS/1.3 in the new Edge?

TLS/1.3 is supported natively within the new Chromium-based Edge on all platforms.

Chromium-based Edge does not rely upon OS support for TLS. Windows’ IE 11 and Legacy Edge do not yet support TLS/1.3, but are expected to support TLS/1.3 in a future Windows 10 release.

For the time being, enabling both TLS/1.3 and TLS/1.2 is a best practice for servers.

Can I turn off TLS/1.3 in the new Edge?

You can set the SSLVersionMax command line argument, but the associated Group Policy was removed in Chrome 75.

msedge.exe --ssl-version-max=tls1.2 https://ssllabs.com

Can Extensions be installed automatically?

Enterprises can make extensions install automatically and prevent users from disabling them using the ExtensionInstallForcelist policy. Admins can also install extensions (but allow users to disable them) using the ExtensionSettings policy with the installation_mode set to normal_installed.

Here are the details to install extensions directly via the Windows Registry. Please note that if you want to install extensions from the Chrome Web Store, you must provide the Chrome store ID and update URL: https://clients2.google.com/service/update2/crx.
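As a sketch, force-installing a Web Store extension for Edge via the registry might look like the following; the policy path matches the documented location, but the 32-character extension ID here is just a placeholder:

reg add "HKLM\SOFTWARE\Policies\Microsoft\Edge\ExtensionInstallForcelist" /v 1 /t REG_SZ /d "aaaabbbbccccddddeeeeffffgggghhhh;https://clients2.google.com/service/update2/crx"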

Can specific file types be set to auto-open? Can I change my mind?

After downloading a file, you can click the “…” menu next to the download item and choose “Always open files of this type” from the context menu.

This option is not available for all file types (e.g. file types deemed dangerous cannot be auto-opened).

One challenge with this UI is that after you set this option, the download bar will not be shown for this file type any longer, leaving you no way to untick the “Always open files of this type” menu item.

The secret to changing your mind is to visit edge://settings/downloads and click the Clear all button next to the File types which are opened automatically after downloading list. There is no way to clear just one file type from the list short of editing the profile’s Preferences JSON directly.

Presently, no Group Policy is available to force file types (except PDF) to open automatically, but this is a common enterprise request. The Master Preferences file can be configured with this option, but those defaults are only used when creating new browser profiles, and users may change them.

Can I go directly to single-word (Intranet) sites without doing a search first?

Can I avoid doing a search, with a notification bar asking “Did you mean to go to http://payroll?” In Internet Explorer, there was a “Go to an intranet site for a single word entry in the Address bar” checkbox in the Advanced Settings.

Use the GoToIntranetSiteForSingleWordEntryInAddressBar policy to change the default behavior.

Bypassing AppProtocol Prompts

Starting in Microsoft Edge 77 (and Chrome 77), the prompt shown when launching an AppProtocol from the browser was changed to remove the “Always allow” checkbox. That change was made, in large part, because this prompt is the only thing standing between every arbitrary site on the Internet (loaded inside your browser’s sandbox) and a full-trust application on your computer (running outside of the browser’s sandbox). See the blog post for more details on why AppProtocols are so scary.

After Edge 77, when you try to launch a Microsoft Teams meeting, for instance, you’ll see a prompt asking permission to open the application.

Unfortunately, there’s a downside to this security improvement.

The same prompt that protects users from malicious content on https://BadGuy.example also shows every single time the legitimate Microsoft Teams website tries to open its related application. Users complain that the security prompt feels redundant, and IT departments have howled that they’ll have to retrain users and field helpdesk calls.

In Edge 82.0.425.0 Canary, a new flag was added, and in Edge 84 it was enabled by default.

The prompt now includes a new checkbox: “Always allow <hostname> to open links of this type in the associated app”.

By storing exemptions on a per-site, per-scheme basis, attack surface is significantly reduced, because only sites you’ve specifically allowed in the past are permitted to bypass the prompt.

This change is also available in other browsers based on Chromium 84+.

Some notes on this change:

  • Exemptions are stored on a per-scheme, per-origin basis (e.g. “Allow teams: from https://teams.microsoft.com”), so if multiple origins use the same scheme, you’ll need to exempt each one.
  • Stored exemptions are origin specific: https://site.example and https://www.site.example and http://site.example are all different origins.
  • Stored exemptions are only available for secure origins (basically: HTTPS, HTTP-to-Localhost, and FILE).
  • This checkbox can be disabled using Group Policy.
  • Starting in Edge 85, a new Group Policy allows an admin to preapprove exemption pairs (including non-secure origins) on behalf of their users.
  • In Edge 86 and later, you can see user-granted and Group Policy pushed exemptions by navigating to edge://settings/content/applicationLinks in the browser. You can also remove user-granted exemptions in this page.
  • To clear all user-granted exemptions (in any version), you may use the “Cookies and other site data” checkbox in the Clear Browsing Data dialog box. Note that you can set the time range to anything you like; all Origin+Scheme exemptions will be cleared.

You can experiment with this feature using the AppProtocol test page.

-Eric

Browser Password Managers: Threat Models

All major browsers have a built-in password manager. So we should use them, right?

I Do

  • I use my browser’s password manager because it’s convenient: with sync, I get all of my passwords on all of my devices.
  • This convenience means that I can use a different password for every website, improving my security.
  • This convenience means that my passwords can be long and hard to type, because I never have to do so.
  • This means that I don’t even know my own passwords for many sites, and because I can rely on my password manager to only fill my passwords on the sites to which they belong, I cannot succumb to a phishing attack.

Should You?

The easy answer is “Yes, use your browser’s password manager!”

The more nuanced answer begins: “Tell me about your threat model?”

As when evaluating almost any security feature, my threat model might not match your threat model, and as a consequence, our security choices might be different.

Here are the most relevant questions to consider when thinking about whether you should use a password manager:

  • Is a password manager available for your platform(s)?
  • What sort of attackers are you worried about?
  • What sort of websites do you log into?
  • Do you select strong, unique passwords?
  • Are your accounts protected with 2FA?
  • What sort of attacks are most likely?
  • What sort of attacks are possible?
  • How do you protect your devices?
  • What’s your personal tolerance for inconvenience?
  • Are you confident in the security of your password manager’s vendor?
  • If you sync passwords, are you confident in the security of the design of the sync system?

The answers to these questions might change your decisions about whether to use a password manager, and if so, whether you want to use the built-in password manager or use a password manager provided by a third-party.

For instance, if you’re sharing a Windows/Mac OS login account with someone you don’t trust, you should stop. If you cannot or don’t want to, you should not use a password manager, because there are trivial ways for a local user to steal your passwords one at a time and simple ways to steal them all at once. Of course, even if you’re not using a password manager, a co-user can simply use a keylogger to steal your passwords one-by-one as you type them.

Lock (Win+L) your computer when you’re not using it.

While browser passwords are encrypted on disk, they’re encrypted using a key available to any process on your PC, including any locally-running malware. Even if passwords are encrypted in a “vault” by a master key, they’ll be decrypted when loaded in the browser’s memory space and can be harvested after you unlock the vault. Locally-running malware is particularly dire if your threat model includes the possibility of a worm running rampant within your enterprise: it could infect all of your employees’ machines and steal all of their passwords in bulk in seconds. (Yes, dear reader, I know that you’re thinking of clever mechanisms to mitigate these sorts of attacks. I assure you I can defeat every practical idea you have. It’s a fundamental law of computing.)

Concern about instantaneous bulk egress of credentials has led the authors of security configuration guidance to recommend disabling browser password managers. For instance, the Edge Security Baseline and the Chrome STIG both suggest preventing users from using the password manager. (I personally think this is a poor tradeoff that increases the risk of individual users getting phished, but I don’t write the configuration guidance.)

Some tech elites advocate for using a 3rd-party password manager, and some users really like them. Most 3rd-party password managers are designed with broader feature sets to satisfy alternative threat models (including using master passwords to help protect against limited local attackers). Many also include additional conveniences like automatic generation of strong passwords and roaming of passwords to mobile platforms and apps. On the other hand, many external password manager applications are themselves a source of security vulnerabilities, and these products often end up growing extremely complicated due to the “Checkbox Wars” endemic to the security products industry.

Parting Advice

Passwords are a poor security mechanism, and should be phased out wherever possible.

When that’s not yet possible (because you don’t control the website): choose strong passwords, use a password manager if it satisfies your threat model, and enable 2FA if available (especially on your email accounts to which password recovery emails are sent).

-Eric

Demystifying Browsers

Last update: Sept 30, 2020

I started building browser extensions more than 22 years ago, and I started building browsers directly just over 16 years ago. At this point, I think it’s fair to say that I’m entering the grizzled veteran phase of my career.

With the Edge team continuing to grow with bright young minds from college and industry, I’m increasingly often asked “Where do I learn about browsers?” and I haven’t had a ready answer for that question.

This post aims to answer it.

First, a few prerequisites for developing expertise in browsers:

  1. Curiosity. While browsers are more complicated than ever, there are also better resources than ever to learn how they work. All major browsers are now based on open-source code, and if you’re curious, you no longer need to join a secret priesthood to discover how they operate under the hood.
  2. Willingness to Experiment. Considering how complex browsers are (and because they’re so diverse, across platforms, makers, and versions), it’s often easiest to definitively answer questions about how browsers work by trying things, rather than reading an explainer (possibly outdated or a map that doesn’t match the terrain) or reading the code (often complex and potentially misleading). Build test cases and try them in each browser to see what happens. When you encounter surprising behavior, let your curiosity guide you into figuring it out. Browsers contain no magic, but plenty of butterfly effects.
  3. Doggedness. I’ve been doing this for over half of my life, and I’m still learning daily. While historical knowledge will serve you well, things are changing in this space every day, and keeping up is an endless challenge. And it’s often fun.

Now, how do you apply these prerequisites and grow to become a master of browsers? Read on.

Fundamental Understanding

Over the years, a variety of broad resources have been developed that will give you a good foundation in the fundamentals of how browsers work. Taking advantage of these will help you more effectively explore and learn on your own.

  • First, I recommend reading the Chrome Comic Book. This short, 38-page comic book from comics legend Scott McCloud was published alongside the first version of Google Chrome back in 2008. It clearly and simply explains many of the core concepts behind modern browsers as application platforms.
  • HTML5Rocks has a great introduction into How Browsers Work. This is a lengthy and detailed introduction into how browsers turn HTML and CSS into what you see on the screen. Read this article and you’ll understand more about this topic than 90% of web developers.
  • The folks at Google have created a fantastic four-part illustrated series about how modern browsers work, Inside look at modern web browsers, covering the browser’s architecture, Navigation, the Rendering Engine, and Input and Compositing, as a part of their Web Fundamentals site.
  • Mozilla wrote a fantastic cartoon introduction to WebAssembly, explaining the basics behind this new technology; there’s tons of other invaluable content on Mozilla Hacks.
  • The Chromium Chronicle is a monthly series geared specifically to the Chromium developers who build the browser.
  • Web Developers should check out Web.Dev, a great source of articles on building fast and secure websites.

Books

If you prefer to learn from books, I can only recommend a few. Sadly, there are few on browsers themselves (largely because they tend to evolve too quickly), but there are good books on web technologies.

Tools

One of the best ways to examine what’s going on with browsers is simply to watch them with tools as you use your favorite websites.

Update: I’ve written a whole post on Browser-Debugging Tools.

Use the Source, Leia

The fact that all of the major browsers are built atop open-source projects is a wonderful thing. No longer do you need to be a reverse-engineering ninja with a low-level debugger to figure out how things are meant to work (although sometimes such approaches can still be super-valuable).

Source code locations:

Navigating the Code

While simply perusing a browser’s source code might give you a good feel for the project, browsers tend to be enormous. Chromium is over 10 million lines of code, for example.

If you need to find something in particular, one often effective way to find it easily is to search for a string shown in the browser UI near the feature of interest. (Or, if you’re searching for a DOM function name or HTML attribute name, try searching for that.) We might call this method string chasing.

By way of example, today I encountered an unexpected behavior in the handling of the “Go to <url>” command on Chromium’s context menu.

So, to find the code that implements this feature, I first try searching for that string, but there are a gazillion hits, which makes it hard to find what I need. So I instead search for a string that’s elsewhere in the context menu, and find only one hit in the Chromium “grd” (resources) file.

When I go look at that grd file, I quickly find the identifier I’m really looking for just below my search result.

So, we now know that we’re looking for usages of IDS_CONTENT_CONTEXT_GOTOURL, probably in a .CC file, and we find that almost immediately.

From here, we see that the menu item has the command identifier IDC_CONTENT_CONTEXT_GOTOURL, which we can then continue to chase down through the source until we find the code that handles the command. That command makes use of a variable selection_navigation_url_, which is filled elsewhere by some pretty complicated logic.
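If you prefer working in a local checkout, the same chase works from the command line; for example, run from Chromium’s src directory:

git grep -n "IDC_CONTENT_CONTEXT_GOTOURL" -- "*.cc"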

After you gain experience in the Chromium code, you might learn “Oh, yeah, all of the context menu stuff is easy to find, it’s in the renderer_context_menu directory” and limit your searches to that area, but after four years of working on Chrome, I still usually start my searches broadly.

Optional: Compile the code

If you’d actually like to compile the code of a major browser, things are a bit more involved, but if you follow the configuration instructions to the letter, your first build will succeed. Back in 2015, Monica Dinculescu created an amazing illustrated guide to contributing to Chromium, and in 2020, Marcos Cáceres wrote a thorough explainer about building a feature in Firefox.
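For Chromium on Windows, for example, the flow looks roughly like this (assuming depot_tools is installed and the prerequisites from the official instructions are in place):

fetch chromium
cd src
gn gen out/Default
autoninja -C out/Default chrome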

You can compile Chromium or Firefox on a mid-range machine from 2016, but it will take quite a long time. A beefy PC will speed things up a bunch, but until we have cloud compilers available to the public, it’s always going to be pretty slow.

This guide on compiling WebKit suggests that a web platform-only build on a MacBook takes only about 20 minutes; you can run the resulting platform in a minibrowser or instruct Safari to use it.

Look at their Bugs

All browsers except Microsoft Edge have a public bug tracker where you can search for known issues and file new bugs if you encounter them.

  • Firefox – Firefox Bugzilla
  • WebKit – WebKit Bugzilla
  • Chromium – CRBug
  • Microsoft Edge – Platform bugs that are inherited from Chromium are tracked using CRBug. Sadly, at present there is no public tracker for bugs that reproduce only in Edge. Bugs reported by the “Feedback” button are tracked internally by Microsoft.
  • Brave – on GitHub
  • HTML5 Specification – on GitHub

Here are instructions for filing a great browser bug.

Binge on Online Video

The Chrome team has an excellent set of educational content used to train new and long-time Chrome engineers. Titled Chrome University, it is periodically updated. Here’s the Chrome Security 101 course, for instance.

Blogs to Read

  • This One – I write mostly about browsers.
  • My (archived) IEInternals – I started writing this blog because it was the only reliable way for me to find my notes from investigations and troubleshooting years later.
  • Cloudflare’s – Cloudflare is a $5B company whose primary product is their amazing blog. I understand they also run a CDN on the side to generate interesting topics for their blog to talk about.
  • Nasko Oskov’s – Nasko is an engineer on the Chrome Security team and writes mostly about security topics.
  • Chris Palmer’s – Chris is an engineer on the Chrome Security team and writes about secure design.
  • Adam Langley’s – Google’s expert cryptographer
  • Bruce Dawson’s – Bruce is a Chrome Engineer who posts lots of interesting information about debugging and performance troubleshooting, especially on Windows.
  • Anne van Kesteren’s – Anne works on the HTML5 spec.
  • Mark Nottingham’s – Mark co-chairs the HTTP and QUIC working groups

Specific Posts of Interest

People to Follow

I’ve doubtless forgotten some, see who I follow.

The Business of Browsers

Public data reveals each point of market share in the browser market is worth at least $100,000,000 USD annually (most directly in the form of payments from the browser’s configured search engine).

Remembering this fact will help you understand many other things, from how browsers pay their large teams of expensive software engineers, to how they manage to give browsers away for free, to why certain features behave the way that they do.

Extra Resources

Browsers are hugely complicated beasts, and tons of fun. If the resources above leave you feeling both overwhelmed and excited, maybe you should become a browser builder.

Want to change the world? Come join the new Microsoft Edge team today!

-Eric

App-to-Web Communication: Launching Web Apps

In recent posts, I’ve explored mechanisms to communicate from web content to local (native) apps, and I explained how web apps can use the HTML5 registerProtocolHandler API to allow launching them from either local apps or other websites.

In today’s post, we’ll explore how local apps can launch web apps in the browser.

It’s Simple…

In most cases, it’s trivial for an app to launch a web app and send data to it. The app simply invokes the operating system’s “launch” API and passes it the desired URL for the web app.

Any data to be communicated to the web app is passed in the URL query string or the fragment component of the URL.

On Windows, such an invocation might look like this:

ShellExecute(hwnd, "open", "https://bayden.com/echo.aspx?DataTo=Pass#GoesHere", 0, 0, SW_SHOW);

Calling this API results in the user’s default browser being opened and a new tab navigated to the target URL.

This same simple approach works great on most operating systems and with virtually any browser a user might have configured as their default.

…Unless It’s Not

Unfortunately, this well-lit path adjoins a complexity cliff: if your scenario has requirements beyond the basic [Launch the default browser to this URL], things get much more challenging. The problem is that there is no API contract that provides a richer feature set and works across different browsers.

For instance, consider the case where you’d like your app to direct the browser to POST a form to a target server. Today, popular operating systems have no such concept: they know how to open a browser by passing it a URL, but they expose no API that says “Open the user’s browser to the following URL, sending the navigation request via the HTTP POST method and containing the following POST body data.”

Over the years, a few workarounds have been used (e.g. see StackOverflow1 and StackOverflow2).

For instance, if the target webservice simply requires a HTTP POST and you cannot change it, your app could launch the browser to a webpage you control, passing the required data in the querystring component of a HTTP GET. Your web server could then reformat the data into the required POST body format and either proxy that request (server-side) to the target webservice, or it could return a web page with an auto-submitting form element with a method of POST and an action attribute pointed at the target webservice. The user’s browser will submit the form, posting the data to the target server.

Similarly, a more common approach involves having the app write a local HTML file in a temporary folder, then direct the Operating System to open that file using the appropriate API (again ShellExecute, in the case of Windows). Presuming that the user’s default HTML handler is also their default HTTPS protocol handler, opening the file will result in the default browser opening, and the HTML/script in the file will automatically submit the included form element to the target server. This “bounce through a local temporary form” approach has the advantage of making it possible to submit sizable amounts of data to the server (e.g. the contents of a local file), unlike using a GET request’s size-limited querystring.
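Here’s a minimal Python sketch of the “bounce through a local temporary form” approach; the target URL and field names are hypothetical:

import os
import tempfile

# Write a temporary HTML page containing an auto-submitting POST form.
html = """<html><body onload="document.forms[0].submit()">
<form method="POST" action="https://example.com/target">
<input type="hidden" name="data" value="PayloadGoesHere">
</form></body></html>"""

fd, path = tempfile.mkstemp(suffix=".html")
with os.fdopen(fd, "w") as f:
    f.write(html)

# Windows-only: open the file with the default .html handler
# (the equivalent of ShellExecute), which should be the browser.
os.startfile(path)
# Remember to delete the temporary file later!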

Caveats:

  • Unfortunately it is generally not possible to construct a HTML form that will submit a data field that exactly matches what you would get when sending an <input type=file> control. If the web service demands a format that was generated by a file upload control, you may not be able to emulate that.
  • Some browsers will not run JavaScript in local files by default.
  • Don’t forget to delete the temporary file!

If your scenario requires uploading files, an alternative approach is to:

  1. Upload the files directly from your app to a web service
  2. Have that web service return a secret token associated with the upload
  3. Have your app spawn a browser with a GET request whose querystring contains that secret token
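A quick sketch of that flow in Python; every URL and field name here is hypothetical, and requests is the popular third-party HTTP library:

import webbrowser
import requests

# 1. Upload the file directly from the app to the web service.
with open("report.pdf", "rb") as f:
    resp = requests.post("https://example.com/api/upload", files={"file": f})

# 2. The service returns a secret token associated with the upload.
token = resp.json()["token"]

# 3. Spawn the default browser with the token in the querystring.
webbrowser.open("https://example.com/review?token=" + token)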

Browser-Specific Approaches

Back in the Windows 7 days, the IE8 team created a very cool feature called Accelerators that would allow users to invoke web services in their browser from any other application. Interestingly, the API contract supported web services that required POST requests.

Because there was no API in Windows that supported launching the default browser with anything other than a URL, a different approach was needed. A browser that wished to participate as a handler for accelerators could implement a IOpenServiceActivityOutputContext::Navigate function which was expected to launch the browser and pass the data. The example implementation provided by our documentation called into Internet Explorer’s Navigate2() COM API, which accepted as a parameter the POST body to be sent in the navigation. As far as I know, no other browser ever implemented IOpenServiceActivityOutputContext.

These days, Accelerators are long dead, and no one should be using Internet Explorer anymore. In the intervening years, no browser-agnostic mechanism to transfer a POST request from an app to a browser has been created.

Perhaps the closest we’ve come is the W3C’s WebDriver Standard, designed for automated testing of websites across arbitrary browsers. Unfortunately, at present, there’s still no way for mainstream apps to take a dependency on WebDriver to deliver a reliable browser-agnostic solution enabling rich transfers from a local app to a web app. Similarly, Puppeteer can be used for some web automation scenarios in Chrome or Edge, and the new Microsoft Playwright enables automated testing in Chromium, WebKit, and Firefox.

Future Possibilities

While the current picture is bleak, the future is a bit brighter. That’s because a major goal of browsers’ investment in Progressive Web Apps is to make them rich enough to take the place of native apps. Today’s native apps have very rich mechanisms for passing data and files to one another and PWAs will need such capabilities in order to achieve their goals.

Perhaps one day, not too far in the future, your OS and your browser (regardless of vendor) will better interoperate.

-Eric

Microsoft’s Three Browsers

It’s an interesting time. Microsoft now maintains three different web browsers:

  • Internet Explorer 11
  • Microsoft Edge Legacy (Spartan, v18 and below)
  • Chromium-based Microsoft Edge (v79+)

If you’re using Internet Explorer 11, you should stop; sometimes, this is easier said than done.

If you’re using Legacy Microsoft Edge, you should upgrade to the new Microsoft Edge which is better in almost every way. When you install the Stable version of the new Microsoft Edge (either by downloading it or eventually by using WindowsUpdate), it will replace your existing Legacy Edge with the new version.

Update: Microsoft has announced that Edge Legacy will fall out of support on March 9th, 2021.

What if I still need to test in Edge Legacy?

If you’re a web developer and need to keep testing your sites and services in the legacy Microsoft Edge, you’ll need to set a registry key to prevent the Edge installer from removing the entry points to the old Edge.

Simply import this registry script before the new Edge is installed. When the AllowSxS key is set to 1, the new Edge installer will keep the old entry point, renaming it to “Microsoft Edge Legacy”.
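If you no longer have the script handy, the equivalent one-liner is something like the following; note that the exact policy path here is from memory, so verify it against the official documentation before relying on it:

reg add "HKLM\SOFTWARE\Policies\Microsoft\EdgeUpdate" /v AllowSxS /t REG_DWORD /d 1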

Thereafter, you can use both versions of Edge on the same PC.

If you didn’t have this registry key set and your legacy Edge entry points have disappeared when you installed the new Edge, you can use the Add or Remove Programs applet in the system control panel to uninstall the new Edge, then set the registry key, then reinstall the new Edge.

Note: If you’re a Web Developer, you should also be testing in the Edge Beta or Edge Dev builds because these will allow you to see the changes coming to Edge before your users do. These builds install side-by-side (replacing no browser) and can be installed from https://MicrosoftEdgeInsider.com.

What if my company has sites that only work in Internet Explorer?

In order to help speed migration to the new Microsoft Edge, the new browser offers an Internet Explorer Mode feature when running on Windows. IE Mode allows IT administrators to configure PCs running Windows 7, 8.1, and 10 such that specified sites will load inside a browser tab that uses the Internet Explorer 11 rendering engine.

  • IE Mode is not designed for or available to consumers.
  • Because IE Mode relies upon the IE11 binaries on the current machine, it is not available in Edge for MacOS, iOS, or Android.
  • IE Mode tabs run inside the legacy security sandbox (weaker than the regular Edge sandbox) and ActiveX controls like Silverlight are available to web pages.
  • IE Mode does not share a cache, cookies, or web storage with Microsoft Edge, so scenarios that depend upon using these storage mechanisms in a cross-site+cross-engine context will not work correctly. IT administrators should carefully set their policies such that user flows occur within a single engine.
  • Most Edge browser extensions will not work on IE Mode tabs: extensions which only look at the tab’s URL should work, but extensions which try to view or modify the page content will not function correctly.

In an ideal world, users will migrate to the latest version of Microsoft Edge as quickly as possible, and enjoy a faster, more compatible, more reliable browser. Nevertheless, Microsoft will continue to patch both Legacy Edge and Internet Explorer 11 according to their existing support lifecycle.

-Eric