
The Web Platform offers a great deal of power, and unfortunately evil websites go to great lengths to abuse it. One of the weakest (but simplest to implement) protections against such abuse is to block actions that were not preceded by a “User Gesture.” Such gestures (sometimes more precisely called User Activations) include a variety of simple actions, from clicking the mouse to typing a key; each interpreted as “The user tried to do something in this web content.”

A single user gesture can unlock any of a surprisingly wide array of privileged (“gated”) actions:

  • Allow a popup window to open
  • Allow an Application Protocol to be invoked
  • Allow an OnBeforeUnload dialog box to show
  • Allow the Vibration API to vibrate the device
  • Allow script to take the window fullscreen
  • Allow the password manager to fill the username/password into the page in a way that JavaScript can see them
  • Allow the page to prompt the user for a file to upload
  • Impact the behavior of file downloads (e.g. prompting)
  • and many more

So, when you see a site show a UI like this:

…chances are good that what they’re really trying to do is trick you into performing a gesture (a mouse click) so they can perform a privileged action: in this case, opening a popup ad in a new tab.

Some gestures are considered “consumable”, meaning that a single user action allows only one privileged action; subsequent privileged actions require another gesture.
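For example, here’s a minimal sketch of both gating and consumption (the URLs are placeholders, and exact popup-blocker behavior varies by browser and settings):

// Blocked: no user gesture has occurred, so the popup blocker intervenes.
window.open("https://example.com/ad");

document.addEventListener("click", () => {
  // Allowed: this code runs while handling a user gesture.
  window.open("https://example.com/popup1");

  // Often blocked: the activation was consumed by the first popup, so a
  // second privileged action generally requires another gesture.
  window.open("https://example.com/popup2");

  // Where supported, the UserActivation API lets script query this state.
  console.log(navigator.userActivation && navigator.userActivation.isActive);
});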

Unfortunately, even this weak protection is subject to both false positives (an unwanted granting of privilege) and false negatives (an action is unexpectedly blocked).

You can learn more about this topic (and the complexity of dealing with nested frames, etc) in the original Chromium User Activation v2 spec, and the User-Activation section of HTML5.

-Eric

For the first few years of the web, developers pretty much coded whatever they thought was cool and shipped it. Specifications, if written at all, were an afterthought.

Then, for the next two decades, spec authors drafted increasingly elaborate specifications with optional features and extensibility points meant to be used to enable future work.

Unfortunately, browser and server developers often only implemented enough of the specs to ensure interoperability, and rarely tested that their code worked properly in the face of features and data allowed by the specs but not implemented in the popular clients.

Over the years, web builders started to notice that specs’ extensibility points were rusting shut: if a new or upgraded client tried to make use of a new feature, or otherwise change what it sent in a way the specs allowed, existing servers would fail when they encountered the new values.

In light of this, spec authors came up with a clever idea: clients should send random dummy values allowed by the spec, causing spec-non-compliant servers (those that fail to properly ignore unrecognized values) to break immediately. This concept is called GREASE (with the backronym “Generate Random Extensions And Sustain Extensibility”), and was first implemented for the values sent by the TLS handshake. When connecting to servers, clients would claim to support new ciphersuites and handshake extensions, and intolerant servers would fail. Users would holler, and engineers could follow up with the broken site’s authors and developers about how to fix their code. To avoid “breaking the web” too broadly, GREASE is typically enabled experimentally at first, in Canary and Dev channels. Only after the scope of the breakages is better understood does the change get enabled for most users.
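To make the idea concrete, here’s an illustrative sketch (not Chromium’s actual implementation) of how a client might pick one of the sixteen reserved GREASE values that RFC 8701 defines for TLS cipher suites and extensions:

// RFC 8701 reserves values of the form 0x0A0A, 0x1A1A, ... 0xFAFA; a client
// advertises one at random alongside its real values. A compliant server
// must ignore it; an intolerant server fails immediately.
function randomGreaseValue() {
  const n = Math.floor(Math.random() * 16); // random high nibble, 0..15
  return (n << 12) | (n << 4) | 0x0A0A;     // e.g. 0x3A3A
}

console.log("0x" + randomGreaseValue().toString(16).toUpperCase());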

GREASE has proven such a success for TLS handshakes that the idea has started to appear in new places. Last week, the Chromium project turned on GREASE for HTTP2 in Canary/Dev for 50% of users, causing connection failures to many popular sites, including some run by Microsoft. These sites will need to be fixed in order to properly load in the new builds of Chromium.

// Enable "greasing" HTTP/2, that is, sending SETTINGS parameters with
// reserved identifiers and sending frames of reserved types, respectively.
// If greasing Frame types, an HTTP/2 frame with a reserved frame type will
// be sent after every HEADERS and SETTINGS frame. The same frame will be
// sent out on all connections to prevent the retry logic from hiding broken
// servers.
NETWORK_SWITCH(kHttp2GreaseSettings, "http2-grease-settings")
NETWORK_SWITCH(kHttp2GreaseFrameType, "http2-grease-frame-type")

One interesting consequence of sending GREASE H2 Frames is that it requires moving the END_STREAM flag (recorded as fin=true in the netlog) from the HTTP2_SESSION_SEND_HEADERS frame into an empty (size=0) HTTP2_SESSION_SEND_DATA frame; unfortunately, the intervening GREASE Frame is not presently recorded in the netlog.

You can try H2 GREASE in Chrome Stable using command line flags that enable GREASE settings values and GREASE frames respectively:

chrome.exe --http2-grease-settings bing.com
chrome.exe --http2-grease-frame-type bing.com

Alternatively, you can disable the experiment in Dev/Canary:

chrome.exe --disable-features=Http2Grease

GREASE is baked into the new HTTP3 protocol (Cloudflare does it by default) and the new UA Client Hints specification (where it’s blowing up a fair number of sites). I expect we’ll see GREASE-like mechanisms appearing in most new web specs where there are optional or extensible features.

-Eric

Previously, I’ve described how to capture a network traffic log from Microsoft Edge, Google Chrome, and applications based on Chromium or Electron.

In this post, I aim to catalog some guidance for looking at these logs to help find the root cause of captured problems and otherwise make sense of the data collected.

Last Update: April 24, 2020

I expect to update this post over time as I continue to gain experience in analyzing network logs.

Choose A Viewer – Fiddler or Catapult

After you’ve collected the net-export-log.json file using the about:net-export page in the browser, you’ll need to decide how to analyze it.

The NetLog file format consists of a JSON-encoded stream of event objects that are logged as interesting things happen in the network layer. At the start of the file there are dictionaries mapping integer IDs to symbolic constants, followed by event objects that make use of those IDs. As a consequence, it’s very rare that a human will be able to read anything interesting from a NetLog.json file using just a plaintext editor or even a JSON parser.
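As an illustration of that structure, here’s a small Node.js sketch (assuming a complete capture named net-export-log.json with the usual top-level constants and events keys) that maps each event’s numeric type back to its symbolic name:

// Load the capture. A log cut short by a crash may be missing its closing
// brackets and need repair before JSON.parse will accept it.
const fs = require("fs");
const log = JSON.parse(fs.readFileSync("net-export-log.json", "utf8"));

// constants.logEventTypes maps symbolic names to integer IDs; invert it.
const typeNames = {};
for (const [name, id] of Object.entries(log.constants.logEventTypes)) {
  typeNames[id] = name;
}

// Print a human-readable line per event.
for (const e of log.events) {
  console.log(e.time, typeNames[e.type] || e.type, JSON.stringify(e.params || {}));
}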

The most common (by far) approach to reading NetLogs is to use the Catapult NetLog Viewer, an HTML/JavaScript application which loads the JSON file and parses it into a much more readable set of events.

An alternative approach is to use the NetLog Importer for Telerik Fiddler.

Importing NetLogs to Fiddler

For Windows users who are familiar with Fiddler, the NetLog Importer extension for Fiddler is easy to use and enables you to quickly visualize HTTP/HTTPS requests and responses. The steps are simple:

  1. Install the NetLog Importer,
  2. Open Fiddler, ideally in Viewer mode (fiddler.exe -viewer)
  3. Click File > Import > NetLog JSON
  4. Select the JSON file to import

In seconds, all of the HTTP/HTTPS traffic found in the capture will be presented for your review. If the log was compressed before it was sent to you, the importer will automatically extract the first JSON file from a chosen .ZIP or .GZ file, saving you a step.

In addition to the requests and responses parsed from the log, the importer adds a number of pseudo-Sessions with a fake host of NETLOG that represent metadata extracted from the log. These pseudo-sessions include:

  • RAW_JSON contains the raw constants and event data. You probably will never want to examine this view.
  • CAPTURE_INFO contains basic data about the date/time of the capture, what browser and OS version were used, and the command line arguments to the browser.
  • ENABLED_EXTENSIONS contains the list of extensions that are enabled in this browser instance. This entry will be missing if the log was captured using the --log-net-log command line argument.
  • URL_REQUESTS contains a dictionary mapping every event related to URL_REQUEST back to the URL Requests to which it belongs. This provides a different view of the events that were used in the parsing of the Web Sessions added to the traffic list.
  • SECURE_SOCKETS contains a list of all of the HTTPS sockets that were established for network requests, including the certificates sent by the server and the parameters requested of any client certificates. The server certificates can be viewed by saving the contents of a -----BEGIN CERTIFICATE----- entry to a file named something.cer (see the sketch after this list). Alternatively, select the line, hit CTRL+C, click Edit > Paste As Sessions, select the Certificates Inspector, and press its Content Certificates button.
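For instance, here’s a small Node.js sketch (assuming you’ve pasted the SECURE_SOCKETS text into a file named secure_sockets.txt) that splits each PEM block out into its own .cer file:

// Extract every PEM certificate block and write each one to certN.cer so
// it can be opened in the OS certificate viewer.
const fs = require("fs");
const text = fs.readFileSync("secure_sockets.txt", "utf8");
const blocks = text.match(
  /-----BEGIN CERTIFICATE-----[\s\S]*?-----END CERTIFICATE-----/g) || [];
blocks.forEach((pem, i) => fs.writeFileSync(`cert${i}.cer`, pem + "\n"));
console.log(`Wrote ${blocks.length} certificate file(s).`);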

You can then use Fiddler’s UI to examine each of the Web Sessions.

Limitations

The NetLog format currently does not store request body bytes, so those will always be missing (e.g. on POST requests).

Unless the Include Raw Bytes option was selected by the user collecting the capture, all of the response bytes will be missing as well. Fiddler will show a “dropped” notice when the body bytes are missing:

If the user did not select the Include Cookies and Credentials option, any Cookie or Authorization headers will be stripped down to help protect private data:

Scenario: Finding URLs

You can use Fiddler’s full text search feature to look for URLs of interest if the traffic capture includes raw bytes. Otherwise, you can search the Request URLs and headers alone.

On any session, you can use Fiddler’s “P” keystroke (or the Select > Parent Request context menu command) to attempt to walk back to the request’s creator (e.g. referring HTML page).

You can find the traffic_annotation value that reflects why a resource was requested by examining the X-Netlog-Traffic_Annotation Session Flag.

Scenario: Cookie Issues

If Fiddler sees that cookies were not set or sent due to features like SameSiteByDefault cookies, it will make a note of that in the Session using a pseudo $NETLOG-CookieNotSent or $NETLOG-CookieNotSet header on the request or response:

Closing Notes

If you’re interested in learning more about this extension, see the announcement blog post and the open-source code.

While the Fiddler Importer is very convenient for analyzing many types of problems, for others, you need to go deeper and look at the raw events in the log using the Catapult Viewer.


Viewing NetLogs with the Catapult Viewer

Opening NetLogs with the Catapult NetLog Viewer is even simpler:

  1. Navigate to the web viewer
  2. Select the JSON file to view

If you find yourself opening NetLogs routinely, you might consider using a shortcut to launch the Viewer in an “App Mode” browser instance: msedge.exe --app=https://netlog-viewer.appspot.com/#import

The App Mode instance is a standalone window which doesn’t contain tabs or other UI:

Note that the Catapult Viewer is a standalone HTML application. If you like, you can save it as a .HTML file on your local computer and use it even when completely disconnected from the Internet. The only advantage to loading it from appspot.com is that the version hosted there is updated from time to time.

Along the left side of the window are tabs that offer different views of the data– most of the action takes place on the Events tab.

Tips

If the problem only exists on one browser instance, check the Command Line parameters and Active Field trials sections on the Import tab to see if there’s an experimental flag that may have caused the breakage. Similarly, check the Modules tab to see if there are any browser extensions that might explain the problem.

Each URL Request has a traffic_annotation value which is a hash you can look up in annotations.xml. That annotation will help you find what part of Chromium generated the network request:

Most requests created by web content will carry the blink_resource_loader annotation, navigations will have navigation_url_loader, and requests from features running in the browser process are likely to have other sources.

Scenario: DNS Issues

Look at the DNS tab on the left, and HOST_RESOLVER_IMPL_JOB entries in the Events tab.

One interesting fact: The DNS Error page performs an asynchronous probe to see whether the configured DNS provider is working generally. The Error page also has automatic retry logic; you’ll see a duplicate URL_REQUEST sent shortly after the failed one, with the VALIDATE_CACHE load flag added to it. In this way, you might see a DNS_PROBE_FINISHED_NXDOMAIN error magically disappear if the user’s router’s DNS flakes.

Scenario: Cookie Issues

Look for COOKIE_INCLUSION_STATUS events for details about each candidate cookie that was considered for sending or setting on a given URL Request. In particular, watch for cookies that were excluded due to SameSite or similar problems.

Scenario: HTTPS Handshaking Issues

Look for SSL_CONNECT_JOB and CERT_VERIFIER_JOB entries. Look at the raw TLS messages on the SOCKET entries.

Scenario: HTTPS Certificate Issues

Note: While NetLogs are great for capturing certs, you can also get the site’s certificate from the browser’s certificate error page.

The NetLog includes the certificates used for each HTTPS connection in the base64-encoded SSL_CERTIFICATES_RECEIVED events.

You can simply copy/paste each certificate (from -----BEGIN CERTIFICATE----- to -----END CERTIFICATE-----) into a text file, name it log.cer, and use the OS certificate viewer to view it. Or you can use Fiddler’s Inspector, as noted above.

If you’ve got Chromium’s repo, you can instead use the script at \src\net\tools\print_certificates.py to decode the certificates. There’s also a cert_verify_tool in the Chromium source you might build and try. For Mac, using verify-cert to check the cert and dump-trust-settings to check the state of the Root Trust Store might be useful.

In some cases, running the certificate through an analyzer like https://crt.sh/lintcert can flag relevant problems.

Scenario: Authentication Issues

Look for HTTP_AUTH_CONTROLLER events, and responses with the status codes 401, 403, and 407.

For instance, you might find that the authentication fails with ERR_INVALID_AUTH_CREDENTIALS unless you enable the browser’s DisableAuthNegotiateCnameLookup policy (Kerberos has long been very tricky).

Scenario: Debugging Proxy Configuration Issues

See Debugging the behavior of Proxy Configuration Scripts.


Got a great NetLog debugging tip I should include here? Please leave a comment and teach me!

-Eric

Last update: June 18, 2020

I started building browser extensions more than 22 years ago, and I started building browsers directly just over 16 years ago. At this point, I think it’s fair to say that I’m entering the grizzled veteran phase of my career.

With the Edge team continuing to grow with bright young minds from college and industry, I’m increasingly often asked “Where do I learn about browsers?” and I haven’t had a ready answer for that question.

This post aims to answer it.

First, a few prerequisites for developing expertise in browsers:

  1. Curiosity. While browsers are more complicated than ever, there are also better resources than ever to learn how they work. All major browsers are now based on open-source code, and if you’re curious, you no longer need to join a secret priesthood to discover how they operate under the hood.
  2. Willingness to Experiment. Considering how complex browsers are (and because they’re so diverse, across platforms, maker, and version), it’s often easiest to definitively answer questions about how browsers work by trying things, rather than reading an explainer (possibly outdated or a map that doesn’t match the terrain) or reading the code (often complex and potentially misleading). Build test cases and try them in each browser to see what happens. When you encounter surprising behavior, let your curiosity guide you into figuring it out. Browsers contain no magic, but plenty of butterfly effects.
  3. Doggedness. I’ve been doing this for half of my life, and I’m still learning daily. While historical knowledge will serve you well, things are changing in this space every day, and keeping up is an endless challenge. And it’s often fun.

Now, how do you apply these prerequisites and grow to become a master of browsers? Read on.

Fundamental Understanding

Over the years, a variety of broad resources have been developed that will give you a good foundation in the fundamentals of how browsers work. Taking advantage of these will help you more effectively explore and learn on your own.

  • First, I recommend reading the Chrome Comic Book. This short, 38-page comic book from legend Scott McCloud was published alongside the first version of Google Chrome back in 2008. It clearly and simply explains many of the core concepts behind modern browsers as application platforms.
  • HTML5Rocks has a great introduction to How Browsers Work. This is a lengthy and detailed introduction to how browsers turn HTML and CSS into what you see on the screen. Read this article and you’ll understand more about this topic than 90% of web developers.
  • The folks at Google have created a fantastic four-part illustrated series about how modern browsers work (Inside look at modern web browsers: Navigation, the Rendering Engine, and Input and Compositing) as a part of their Web Fundamentals site.
  • Mozilla wrote a fantastic cartoon introduction to WebAssembly, explaining the basics behind this new technology; there’s tons of other invaluable content on Mozilla Hacks.
  • The Chromium Chronicle is a monthly series geared specifically to the Chromium developers who build the browser.
  • Web Developers should check out Web.Dev, a great source of articles on building fast and secure websites.

Books

If you prefer to learn from books, I can only recommend a few. Sadly, there are few on browsers themselves (largely because they tend to evolve too quickly), but there are good books on web technologies.

Tools

One of the best ways to examine what’s going on inside browsers is simply to use tools to watch them at work as you use your favorite websites.

  • Built-in DevTools (just hit F12!) – They’re amazingly powerful. I don’t know of a great tutorial, but there are likely some on YouTube.
  • Telerik Fiddler – See what requests hit the network and what they contain.
  • The VisBug Chrome Extension – Easily manipulate any page layout, directly in your browser.

Use the Source, Leia

The fact that all of the major browsers are built atop open-source projects is a wonderful thing. No longer do you need to be a reverse-engineering ninja with a low-level debugger to figure out how things are meant to work (although sometimes such approaches can still be super-valuable).

Source code locations:

Navigating the Code

While simply perusing a browser’s source code might give you a good feel for the project, browsers tend to be enormous. Chromium is over 10 million lines of code, for example.

If you need to find something in particular, one effective way to locate it is to search for a string shown in the browser UI near the feature of interest. (Or, if you’re searching for a DOM function name or HTML attribute name, try searching for that.) We might call this method string chasing.

By way of example, today I encountered an unexpected behavior in the handling of the “Go to <url>” command on Chromium’s context menu:

So, to find the code that implements this feature, I first try searching for that string:

…but there are a gazillion hits, which makes it hard to find what I need. So I instead search for a string that’s elsewhere in the context menu, and find only one hit in the Chromium “grd” (resources) file:

When I go look at that grd file, I quickly find the identifier I’m really looking for just below my search result:

So, we now know that we’re looking for usages of IDS_CONTENT_CONTEXT_GOTOURL, probably in a .CC file, and we find that almost immediately:

From here, we see that the menu item has the command identifier IDC_CONTENT_CONTEXT_GOTOURL, which we can then continue to chase down through the source until we find the code that handles the command. That command makes use of a variable selection_navigation_url_, which is filled elsewhere by some pretty complicated logic.

After you gain experience in the Chromium code, you might learn “Oh, yeah, all of the context menu stuff is easy to find, it’s in the renderer_context_menu directory” and limit your searches to that area, but after four years of working on Chrome, I still usually start my searches broadly.

Optional: Compile the code

If you’d actually like to compile the code of a major browser, things are a bit more involved, but if you follow the configuration instructions to the letter, your first build will succeed. Back in 2015, Monica Dinculescu created an amazing illustrated guide to contributing to Chromium.

You can compile Chromium or Firefox on a mid-range machine from 2016, but it will take quite a long time. A beefy PC will speed things up a bunch, but until we have cloud compilers available to the public, it’s always going to be pretty slow.

Look at their Bugs

All browsers except Microsoft Edge have a public bug tracker where you can search for known issues and file new bugs if you encounter them.

  • Firefox – Firefox Bugzilla
  • WebKit – WebKit Bugzilla
  • Chromium – CRBug
  • Microsoft Edge – Platform bugs that are inherited from Chromium are tracked using CRBug. Sadly, at present there is no public tracker for bugs that reproduce only in Edge. Bugs reported by the “Feedback” button are tracked internally by Microsoft.
  • Brave – on GitHub
  • HTML5 Specification – on GitHub

Here are instructions for filing a great browser bug.

Blogs to Read

  • This One – I write mostly about browsers.
  • My (archived) IEInternals – I started writing this blog because it was the only reliable way for me to find my notes from investigations and troubleshooting years later.
  • Cloudflare’s – Cloudflare is a $5B company whose primary product is their amazing blog. I understand they also run a CDN on the side to generate interesting topics for their blog to talk about.
  • Nasko Oskov’s – Nasko is an engineer on the Chrome Security team and writes mostly about security topics.
  • Chris Palmer’s – Chris is an engineer on the Chrome Security team and writes about secure design.
  • Adam Langley’s – Google’s expert cryptographer
  • Bruce Dawson’s – Bruce is a Chrome Engineer who posts lots of interesting information about debugging and performance troubleshooting, especially on Windows.
  • Anne van Kesteren’s – Anne works on the HTML5 spec.
  • Mark Nottingham’s – Mark co-chairs the HTTP and QUIC working groups

Specific Posts of Interest

People to Follow

I’ve doubtless forgotten some, see who I follow.

The Business of Browsers

Public data reveals each point of marketshare in the browser market is worth at least $100,000,000 USD annually (most directly in the form of payments from the browser’s configured search engine).

Remembering this fact will help you understand many other things, from how browsers pay their large teams of expensive software engineers, to how they manage to give browsers away for free, to why certain features behave the way that they do.

Extra Resources

Browsers are hugely complicated beasts, and tons of fun. If the resources above leave you feeling both overwhelmed and excited, maybe you should become a browser builder.

Want to change the world? Come join the new Microsoft Edge team today!

-Eric

In recent posts, I’ve explored mechanisms to communicate from web content to local (native) apps, and I explained how web apps can use the HTML5 registerProtocolHandler API to allow launching them from either local apps or other websites.

In today’s post, we’ll explore how local apps can launch web apps in the browser.

It’s Simple…

In most cases, it’s trivial for an app to launch a web app and send data to it. The app simply invokes the operating system’s “launch” API and passes it the desired URL for the web app.

Any data to be communicated to the web app is passed in the URL query string or the fragment component of the URL.

On Windows, such an invocation might look like this:

ShellExecute(hwnd, "open", "https://bayden.com/echo.aspx?DataTo=Pass#GoesHere", 0, 0, SW_SHOW);

Calling this API results in the user’s default browser being opened and a new tab navigated to the target URL.

This same simple approach works great on most operating systems and with virtually any browser a user might have configured as their default.

…Unless It’s Not

Unfortunately, this well-lit path adjoins a complexity cliff: if your scenario has requirements beyond the basic [Launch the default browser to this URL], things get much more challenging. The problem is that there is no API contract that provides a richer feature set and works across different browsers.

For instance, consider the case where you’d like your app to direct the browser to POST a form to a target server. Today, popular operating systems have no such concept: they know how to open a browser by passing it a URL, but they expose no API that says “Open the user’s browser to the following URL, sending the navigation request via the HTTP POST method and containing the following POST body data.”

Over the years, a few workarounds have been used (e.g. see StackOverflow1 and StackOverflow2).

For instance, if the target webservice simply requires an HTTP POST and you cannot change it, your app could launch the browser to a webpage you control, passing the required data in the querystring component of an HTTP GET. Your web server could then reformat the data into the required POST body format and either proxy that request (server-side) to the target webservice, or return a web page with an auto-submitting form element with a method of POST and an action attribute pointed at the target webservice. The user’s browser will submit the form, posting the data to the target server.

Similarly, a more common approach involves having the app write a local HTML file in a temporary folder, then direct the operating system to open that file using the appropriate API (again ShellExecute, in the case of Windows). Presuming that the user’s default HTML handler is also their default HTTPS protocol handler, opening the file will result in the default browser opening, and the HTML/script in the file will automatically submit the included form element to the target server. This “bounce through a local temporary form” approach has the advantage of making it possible to submit sizable amounts of data to the server (e.g. the contents of a local file), unlike using a GET request’s size-limited querystring.
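As a sketch of that “bounce through a local temporary form” approach, the temporary HTML file written by the native app might look something like this (the target URL and field names are placeholders):

<!-- temp-post.html: written by the app, opened via ShellExecute, then deleted -->
<html>
<body onload="document.forms[0].submit()">
  <form method="POST" action="https://example.com/target-webservice">
    <input type="hidden" name="payload" value="data the app wants to POST">
  </form>
</body>
</html>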

Caveats:

  • Unfortunately it is generally not possible to construct a HTML form that will submit a data field that exactly matches what you would get when sending an <input type=file> control. If the web service demands a format that was generated by a file upload control, you may not be able to emulate that.
  • Some browsers will not run JavaScript in local files by default.
  • Don’t forget to delete the temporary file!

If your scenario requires uploading files, an alternative approach is to:

  1. Upload the files directly from your app to a web service
  2. Have that web service return a secret token associated with the upload
  3. Have your app spawn a browser with a GET request whose querystring contains that secret token

Browser-Specific Approaches

Back in the Windows 7 days, the IE8 team created a very cool feature called Accelerators that would allow users to invoke web services in their browser from any other application. Interestingly, the API contract supported web services that required POST requests.

Because there was no API in Windows that supported launching the default browser with anything other than a URL, a different approach was needed. A browser that wished to participate as a handler for Accelerators could implement an IOpenServiceActivityOutputContext::Navigate function which was expected to launch the browser and pass the data. The example implementation provided by our documentation called into Internet Explorer’s Navigate2() COM API, which accepted as a parameter the POST body to be sent in the navigation. As far as I know, no other browser ever implemented IOpenServiceActivityOutputContext.

These days, Accelerators are long dead, and no one should be using Internet Explorer anymore. In the intervening years, no browser-agnostic mechanism to transfer a POST request from an app to a browser has been created.

Perhaps the closest we’ve come is the W3C’s WebDriver Standard, designed for automated testing of websites across arbitrary browsers. Unfortunately, at present, there’s still no way for mainstream apps to take a dependency on WebDriver to deliver a reliable browser-agnostic solution enabling rich transfers from a local app to a web app. Similarly, Puppeteer can be used for some web automation scenarios in Chrome or Edge, and the new Microsoft Playwright enables automated testing in Chromium, WebKit, and Firefox.

Future Possibilities

While the current picture is bleak, the future is a bit brighter. That’s because a major goal of browsers’ investment in Progressive Web Apps is to make them rich enough to take the place of native apps. Today’s native apps have very rich mechanisms for passing data and files to one another and PWAs will need such capabilities in order to achieve their goals.

Perhaps one day, not too far in the future, your OS and your browser (regardless of vendor) will better interoperate.

-Eric

While I do most of my work in an office, from time to time I work on code changes to Chromium at home. With the recent deprecation of Jumbo Builds, building the browser on my cheap 2016-era Dell XPS 8900 (i7-6700K) went from unpleasant to impractical. While I pondered buying a high-end Threadripper, I couldn’t justify the high cost, especially given the limited performance characteristics for low-thread workloads (basically, everything other than compilation).

The introduction of the moderately-priced (nominally $750), 16 Core Ryzen 3950X hit the sweet spot, so I plunked down my credit card and got a new machine from a system builder. Disappointingly, it took almost two months to arrive in a working state, but things seem to be good now.

The AMD Ryzen 3950X has 16 cores with two threads each, and runs at around 3.95GHz when they’re all fully loaded; it’s cooled by a CyberPowerPC DeepCool Castle 360EX liquid cooler. An Intel Optane 905P 480GB system drive holds the OS, compilers, and Chromium code. The key advantage of the Optane over more affordable SSDs is that it has a much higher random read rate (~400% as fast as the Samsung 970 Pro I originally planned to use):

Following the Chromium build instructions, I configured my environment and set up a 32bit component build with reduced symbols:

is_component_build = true
enable_nacl = false
target_cpu = "x86"
blink_symbol_level = 0
symbol_level = 1

Atop Windows 10 1909, I disabled Windows Defender entirely, and didn’t do anything too taxing with the PC while the build was underway.

Ultimately, a clean build of the “chrome” target took just under 53 minutes, achieving 33.3x parallelism.

While this isn’t a fast result by any stretch of the imagination, it’s still faster than my non-jumbo local build times back when I worked at Google in 2016/2017 and used a $6000, 48-thread Xeon workstation to build Chrome, at somewhere around half of the cost.

Cloud Compilation

When I first joined Google, I learned about the seemingly magical engineering systems available to Googlers, quickly followed by the crushing revelation that most of those magic tools were not available to those of us working on the Chromium open-source project.

The one significant exception was that Google Chrome engineers had access to a distributed build system called “Goma” which would allow compiling Chrome using servers in the Google cloud. My queries around the team suggested that only a minority of engineers took advantage of it, partly because (at the time) it didn’t generate very debuggable Windows builds. Nevertheless, I eventually gave it a shot and found that it cut perhaps five minutes off my forty-five minute jumbo build times on my Xeon workstation. I rationalized this by concluding that the build must not be very parallelizable, and that because I worked remotely from Austin, any build artifacts from the Goma cloud would be much farther from me than from my colleagues in Mountain View.

Given the complexity of the configuration, I stopped using Goma, and spent perhaps half of my tenure on Chrome with forty-five minute build times[1]. Then, one day I needed to do some development on my MacBook, and I figured its puny specs would benefit from Goma in a way my Xeon workstation never would. So I went back to read the Goma documentation and found a different reference than the one I’d seen originally. This one mentioned a “-j” command line argument, previously unknown to me, that tells the build system how many cloud cores to use.

This new, better documentation noted that by default the build system would just match your local core count, but when using Goma you should instead demand ~20x your local core count: -j 960 for my workstation. With one command line argument, my typical compiles dropped from 45 minutes to around 6.
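In practice, that meant invoking the build with an explicit job count, something like the following (the output directory name is whatever you configured):

ninja -C out\Default chrome -j 960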

::suitable_meme_of_wonder_and_fury::

Returning to Edge

I returned to Microsoft as a Program Manager on the Edge team in mid-2018, unaware that replatforming atop Chromium was even a possibility until the day before I started. Just before I began, a lead sent me a 27-page PDF file containing the Edge-on-Chromium proposal. “What do you think?” he asked. I had a lot of thoughts (mostly of the form “OMG, yes!”) but one thing I told everyone who would listen is that we would never be able to keep up without a cloud-compilation system akin to Goma. The Google team had recently open-sourced the Goma client, but hadn’t yet open-sourced the cloud server component. I figured the Edge team had engineering-years worth of work ahead of us to replicate that piece.

When an engineer on the team announced two weeks later that he had “MSGoma” building Chromium using an Azure cloud backend, it was the first strong sign that this crazy bet could actually pay off.

And pay off it has. While I still build locally from time to time, I typically build Chromium using MSGoma from my late 2018 Lenovo X1 Extreme laptop, with build times hovering just over ten minutes. Cloud compilation is a game changer.

The Chrome team has since released a Goma Server implementation, and several other major Chromium contributors are using distributed build systems of their own design.

I haven’t yet tried using MSGoma from my new Ryzen workstation, but I’ve been told that the Optane drive is especially helpful when performing distributed builds, due to the high incidence of small random reads.

-Eric

[1] This experience recalled a much earlier one: my family moving to Michigan shortly after I turned 11. Our new house featured a huge yard. My dad bought a self-propelled lawn mower and my brother and I took turns mowing the yard weekly. The self-propelled mower was perhaps fifteen pounds heavier than our last mower, and the self-propelling system didn’t really seem to do much of anything.

After two years of weekly mows from my brother and I, my dad took a turn mowing. He pushed the lawn mower perhaps five feet before he said “That isn’t right,” reached under the control panel and flipped a switch. My brother and I watched in amazement and dismay as the mower began pulling him across the yard.

Moral of the story: Knowledge is power.

Problems in accessing websites can often be found and fixed if the network traffic between the browser and the website is captured as the problem occurs. This short post explains how to capture such logs.

Capturing Network Traffic Logs

If someone asked you to read this post, chances are good that you were asked to capture a web traffic log to track down a bug in a website or your web browser.

Fortunately, in Google Chrome or the new Microsoft Edge (version 76+), capturing traffic is simple:

  1. Optional but helpful: Close all browser tabs but one.
  2. Navigate the tab to chrome://net-export
  3. In the UI that appears, press the Start Logging to Disk button.
  4. Choose a filename to save the traffic to. Tip: Pick a location you can easily find later, like your Desktop.
  5. Reproduce the networking problem in a new tab. If you close or navigate the //net-export tab, the logging will stop automatically.
  6. After reproducing the problem, press the Stop Logging button.
  7. Share the Net-Export-Log.json file with whomever will be looking at it. Optional: If the resulting file is very large, you can compress it to a ZIP file.
Network Capture UI

Privacy-Impacting Options

In some cases, especially when you’re dealing with a problem logging into a website, you may need to set either the Include cookies and credentials or Include raw bytes options before you click the Start Logging button.

Note that there are important security & privacy implications to selecting these options: if you do so, your capture file will almost certainly contain private data that would allow a bad actor to steal your accounts or perform other malicious actions. Share the capture only with a person you trust and do not post it on the Internet in a public forum.

Tutorial Video

If you’re more of a visual learner, here’s a short video demonstrating the traffic capture process.

In a followup post, I explore how developers can analyze captured traffic.

-Eric

Appendix A: Capture on Startup

In rare cases, you may need to capture network data early (e.g. to capture proxy script downloads and the like). To do that, close Edge, then run

msedge.exe --log-net-log=C:\some_path\some_file_name.json --net-log-capture-mode=IncludeSocketBytes

Note: This approach also works for Electron JS applications like Microsoft Teams:

%LOCALAPPDATA%\Microsoft\Teams\current\Teams.exe --log-net-log=C:\temp\TeamsNetLog.json

I suspect that this is only going to capture the network traffic from the Chromium layer of Electron apps (e.g. web requests from the nodeJS side will not be captured) but it still may be very useful.


If you’ve built an application using the old Web Browser Control (mshtml, aka Internet Explorer), you might notice that by default it does not support HTTP/2. For instance, a trivial WebOC host loading Akamai’s HTTP2 test page:

When your program is running on any build of Windows 10, you can set a Feature Control Key with your process’ name to opt-in to using HTTP/2.

For applications running at the OS-native bitness, write the key here:

HKEY_LOCAL_MACHINE\SOFTWARE\Microsoft\Internet Explorer\Main\FeatureControl\FEATURE_WEBOC_ENABLE_HTTP2

For 32-bit applications running on 64-bit Windows, write it here:

HKEY_LOCAL_MACHINE\SOFTWARE\WOW6432Node\Microsoft\Internet Explorer\Main\FeatureControl\FEATURE_WEBOC_ENABLE_HTTP2

Like so:
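(For illustration, MyApp.exe here is a placeholder for your process’s executable name; from an elevated command prompt, the 32-bit-on-64-bit case might be written like this:)

reg add "HKLM\SOFTWARE\WOW6432Node\Microsoft\Internet Explorer\Main\FeatureControl\FEATURE_WEBOC_ENABLE_HTTP2" /v MyApp.exe /t REG_DWORD /d 1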

After you make this change and restart the application, it will use HTTP/2 if the server and network path supports it.

Using HTTP/2 should help improve performance and can even resolve functional bugs.

-Eric

Update: Windows’ Internet Control Panel has a “HTTP2” checkbox, but it only controls web platform apps (IE and Legacy Edge), and unfortunately, the setting does not work properly for AppContainer/LowIL processes, which enable HTTP2 by default. This means that the checkbox, as of Windows 10 version 1909, is pretty much useless for its intended purpose (as only Intranet Zone sites outside of Protected Mode run at MediumIL).

A bug has been filed.

Update: Users of the new Chromium-based Edge browser can launch an instance with HTTP2 disabled using the disable-http2 command line argument, e.g. msedge.exe --disable-http2.

I’m not aware of a straightforward way to disable HTTP2 for the new Chromium-Edge-based WebView2 control, which has HTTP2 enabled by default.

While there are many different ways for servers to stream data to clients, the Server-sent Events / EventSource Interface is one of the simplest. Your code simply creates an EventSource and then subscribes to its onmessage callback:
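(A minimal illustration; the endpoint path is a placeholder.)

// Open a long-lived connection to a server-sent-events endpoint.
const source = new EventSource("/my-event-stream");
source.onmessage = (event) => {
  console.log("Got data:", event.data);
};
source.onerror = () => console.log("Stream error; the browser will retry.");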

Implementing the server side is almost as simple: your handler just prefaces each piece of data it wants to send to the client with the string data: and ends it with a double line-ending (\n\n). Easy peasy. You can see the API in action in this simple demo.
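Here’s a simple Node.js sketch of such a handler (the port and payload are arbitrary):

// Minimal server-sent-events endpoint: declare the text/event-stream type,
// then write "data: ..." lines, each terminated by a blank line.
const http = require("http");
http.createServer((req, res) => {
  res.writeHead(200, {
    "Content-Type": "text/event-stream",
    "Cache-Control": "no-cache",
  });
  let n = 0;
  const timer = setInterval(() => res.write(`data: tick ${n++}\n\n`), 1000);
  req.on("close", () => clearInterval(timer));
}).listen(8080);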

I’ve long been sad that we didn’t manage to get this API into Internet Explorer or the Legacy Edge browser. While many polyfills for the API exist, I was happy that we finally have EventSource in the new Edge.

Yay! \o/

Alas, I wouldn’t be writing this post if I hadn’t learned something new yesterday.

Last week, a customer reached out to complain that the new Edge and Chrome didn’t work well with their webmail application. After they used the webmail site for some indeterminate amount of time, they noticed that its performance slowed to a crawl: switching between messages would take tens of seconds or longer, and the problem reproduced regardless of the speed of the network. The only way to reliably resolve the problem was to either close the tabs they’d opened from the main app (e.g. the individual email messages could be opened in their own tabs) or to restart the browser entirely.

As the networking PM, I was called in over video conference to figure out what was going wrong. I instructed the user to open the F12 Developer Tools and we looked at the network console together. Each time the user clicked on a message, new requests were created and sat in the (pending) state for a long time, meaning that the requests were getting queued and weren’t even going to the network promptly.

But why? Diagnosing this remotely wasn’t going to be trivial, so I had the user generate a Network Export log that I could examine later.

In examining the log using the online viewer, the problem became immediately clear. On the Sockets tab, the webmail’s server showed 19 requests in the Pending state, and 6 Active connections to the server, none of which were idle. The fact that there were six connections strongly suggested that the server was using HTTP/1.1 rather than HTTP/2, and a quick look at the HTTP/2 tab confirmed it. Looking at the Events tab, we see five outstanding URLRequests to a URL that strongly suggests that it’s being used as an EventSource:

Each of these sockets is in the READING_RESPONSE state, and each has returned just ten bytes of body data to its EventSource. The web application uses one EventSource per instance of the app, and the user had five tabs open to the app.

And now everything falls into place. Browsers limit themselves to 6 concurrent connections per server. When the server supports HTTP/2, browsers typically need just one connection because HTTP/2 supports multiplexing many (typically 100) streams onto a single connection. HTTP/1.1 doesn’t afford that luxury, so every long-lived connection used by a page decrements the available connections by one. So, for this user, all of their network traffic was going down a single HTTP/1.1 connection, and because HTTP/1.1 doesn’t allow multiplexing, it means that every action in the UI was blocked on a very narrow head-of-line-blocking pipe.

Looking in the Chrome bug tracker, we find this core problem (“SSE connections can starve other requests”) resolved “By Design” six years ago.

Now, I’m always skeptical when reading old bugs, because many issues are fixed over time, and it’s often the case that an old resolution is no longer accurate in the modern world. So I built a simple repro script for Meddler. The script returns one of four responses:

  • An HTML page that consumes an EventSource
  • An HTML page containing 15 frames pointed at the previous HTML page
  • An event source endpoint (text/event-stream)
  • A JPEG file (to test whether connection limits apply across both EventSources and other downloads)

And sure enough, when we load the page we see that only six frames are getting events from the EventSource, and the images that are supposed to load at the bottom of the frames never load at all:

Similarly, if we attempt to load the page in another tab, we find that it doesn’t even load, with a status message of “Waiting for available socket…”

The web app owners should definitely enable HTTP/2 on their server, which will make this problem disappear for almost all of their users.

However, even HTTP/2 is not a panacea, because the user might be behind a “break-and-inspect” proxy that downgrades connections to HTTP/1.1, or the browser might conceivably limit parallel requests on HTTP/2 connections for slow networks. As noted in the By Design issue, a web app that depends on EventSource in multiple tabs might use a BroadcastChannel or a SharedWorker to share a single EventSource connection with all of the tabs of the web application.
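A rough sketch of that sharing approach (ignoring leader election and reconnection for brevity; the channel name, endpoint, and updateUI function are made up):

// In one designated "leader" tab: own the single EventSource and
// rebroadcast each message to every other tab of the app.
const channel = new BroadcastChannel("mail-events");
const source = new EventSource("/my-event-stream");
source.onmessage = (event) => channel.postMessage(event.data);

// In every other tab: listen on the channel instead of opening another
// EventSource (and consuming another precious HTTP/1.1 connection).
const listener = new BroadcastChannel("mail-events");
listener.onmessage = (event) => updateUI(event.data); // app-specific handler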

Alternatively, swapping an EventSource architecture for one based on WebSocket (even one that exposes itself as an EventSource polyfill) will also likely resolve the problem. That’s because, even if the client or server doesn’t support routing WebSockets over HTTP/2, the WebSockets-per-host limit is 255 in Chromium and 200 in Firefox.

Stay responsive out there!

-Eric