One of my final projects on the Chrome team was writing an internal document outlining Best Practices for Secure URL Display. Yesterday, it got checked into the public Chromium repo, so if this is a topic that interests you, please have a look!
In Windows 10 RS5 (aka the “October 2018 Update”), the venerable XSS Filter first introduced in 2008 with IE8 was removed from Microsoft Edge. The XSS Filter debuted in a time before Content Security Policy as a part of a basket of new mitigations designed to counter the growing exploitation of cross-site scripting attacks, joining older features like HTTPOnly cookies and the sandbox attribute for IFRAMEs.
The XSS Filter feature was a difficult one to land– only through the sheer brilliance and dogged persistence of its creator (David Ross) did the IE team accept the proposal that a client-side filtering approach could be effective with a reasonable false positive rate and good-enough performance to ship on-by-default. The filter was carefully tuned, firing only on cross-site navigation, and in need of frequent updates as security researchers inside and outside the company found tricks to bypass it. One of the most significant technical challenges for the filter concerned how it was layered into the page download pipeline, intercepting documents as they were received as raw text from the network. The filter relied on evaluating dynamically-generated regular expressions to look for potentially executable markup in the response body that could have been reflected from the request URL or POST body. Evaluating the regular expressions could prove to be extremely expensive in degenerate cases (multiple seconds of CPU time in the worst cases) and required ongoing tweaks to keep the performance costs in check.
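To make the reflection-matching idea concrete, here is a deliberately simplified sketch in Python. It is not the IE implementation, whose heuristics were far more involved; the function name looks_reflected and the single "<script" heuristic are assumptions made purely for illustration. It builds a regular expression from each request parameter and checks whether that markup shows up in the response body.

```python
import re
from urllib.parse import urlsplit, parse_qsl

# Hypothetical heuristic: only values containing a script tag are treated
# as "potentially executable". A real filter would consider far more.
SUSPICIOUS = re.compile(r"<\s*script", re.IGNORECASE)


def looks_reflected(request_url: str, response_body: str) -> bool:
    """Return True if a suspicious query-string value also appears in the response."""
    query = urlsplit(request_url).query
    for _name, value in parse_qsl(query, keep_blank_values=True):
        if not SUSPICIOUS.search(value):
            continue
        # Build a pattern dynamically from the request data, tolerating
        # whitespace differences between the request and the response.
        parts = [re.escape(part) for part in value.split()]
        pattern = re.compile(r"\s+".join(parts), re.IGNORECASE)
        if pattern.search(response_body):
            return True
    return False
```

Even in this toy form, the cost problem is visible: a pattern is compiled and evaluated per request value, against the entire response text.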
In 2010, the Chrome team shipped their similar XSS Auditor feature, which had the luxury of injecting its detection logic after the HTML parser runs, detecting and blocking reflections as they entered the script engine. By throttling closer to the point of vulnerability, its performance and accuracy are significantly improved over the XSS Filter.
Unfortunately, no matter how you implement it, clientside XSS filtration is inherently limited– of the four classes of XSS Attack, only one is potentially mitigated by clientside XSS filtration. Attackers have the luxury of tuning their attacks to bypass filters before they deploy them to the world, and the relatively slow ship cycles of browsers (6 weeks for Chrome, and at least a few months for IE of the era) meant that bypasses remained exploitable for a long time.
False positives were an ever-present concern– this meant that the filters had to be somewhat conservative, leading to false-negative bypasses (e.g. multi-stage exploits that performed a same-site navigation) and pronouncements that certain attack patterns were simply out-of-scope (e.g. attacks encoded in anything but the most popular encoding formats).
Early attempts to mitigate the impact of false positives (by default, neutering exploits rather than blocking navigation entirely) proved bypassable and later were abused to introduce XSS exploits in sites that would otherwise be free of exploit (!!!). As a consequence, browsers were forced to offer options that would allow a site to block navigation upon detection of a reflection, or disable the XSS filter entirely.
Surprisingly, even in the ideal case, best-of-class XSS filters can introduce information disclosure exploits into sites that are free of XSS vulnerabilities. XSS filters work by matching attacker-controlled request data to text in a victim response page, which may be cross-origin. Clientside filters cannot really determine whether a given string from the request was truly reflected into the response, or whether the string is naturally present in the response. This shortcoming creates the possibility that such a filter may be abused by an attacker to determine the content of a cross-origin page, a violation of Same Origin Policy. In a canonical attack, the attacker frames a victim page with a string of interest in it, then attempts to determine that string by making a series of successive guesses until it detects blocking by the XSS filter. For instance, xoSubframe.contentWindow.length exposes the count of subframes of a frame, even cross-origin. If the XSS filter blocks the loading of a frame, its subframe count is zero and the attacker can conclude that their guess was correct.
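The guessing loop described above can be modeled in a few lines of Python. This is a toy simulation, not browser code: filter_blocks stands in for a filter that cannot tell a true reflection from text that was already in the page, and both function names and the digits-only alphabet are assumptions for illustration.

```python
import string


def filter_blocks(request_value: str, response_body: str) -> bool:
    """Toy filter: blocks whenever the request data also appears in the response."""
    return request_value in response_body


def guess_secret(response_body: str, prefix: str, length: int,
                 alphabet: str = string.digits) -> str:
    """Recover up to `length` characters after `prefix` from the block signal alone."""
    known = ""
    for _ in range(length):
        for ch in alphabet:
            # In a browser, the observable signal would be the framed page's
            # subframe count dropping to zero when the filter blocks the load.
            if filter_blocks(prefix + known + ch, response_body):
                known += ch
                break
        else:
            break  # no candidate matched; stop guessing
    return known
```

The attacker never reads the page; each guess leaks one yes/no answer, and that is enough to recover the string one character at a time.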
In Windows 10 RS4 (April 2018 update), Edge shipped its implementation of the Fetch standard, which redefines how the browser downloads content for page loads. As a part of this massive architectural shift, a regression was introduced in Edge’s XSS Filter that caused it to incorrectly determine whether a navigation was cross-origin. As a result, the XSS Filter began running its logic on same-origin navigations and skipping processing of cross-origin navigations, leading to a predictable flood of bug reports.
In the process of triaging these reports and working to address the regression, we concluded that the XSS Filter had long been on the wrong side of the cost/benefit equation and we elected to remove the XSS Filter from Edge entirely, matching Firefox (which never shipped a filter to begin with).
We encourage sites that are concerned about XSS attacks to use the client-side platform features available to them (Content-Security-Policy, HTTPOnly cookies, sandboxing) and the server-side patterns and frameworks that are designed to mitigate script injection attacks.
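As a rough illustration of the client-side features mentioned above, here is a minimal Python sketch of the response headers a server might send. The specific policy string, the cookie name, and the function name defensive_headers are assumptions; a real policy has to be tuned to the site.

```python
def defensive_headers(session_id: str) -> list:
    """Return (name, value) header pairs that limit the impact of script injection."""
    return [
        # Only allow scripts from the site's own origin; forbid plugins.
        ("Content-Security-Policy",
         "default-src 'self'; script-src 'self'; object-src 'none'"),
        # HttpOnly keeps the session cookie out of reach of injected script.
        ("Set-Cookie",
         f"session={session_id}; HttpOnly; Secure; SameSite=Lax; Path=/"),
    ]
```

Sandboxing is applied in markup rather than in these headers, via the sandbox attribute on the IFRAME that hosts untrusted content.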
My oldest supported Windows application is a launcher app named SlickRun, and it’s ~24 years old this year. I haven’t done much to maintain it over the last few years, although it’s now available in 64-bit and runs great on Windows 10. (Thanks go to Embarcadero, who now offer a free “Community” edition of Delphi, the language/platform I ported SlickRun to circa 1994).
I still fix bugs in SlickRun from time to time, and as I was playing with Rust a few days ago I was reminded of one of the oldest limitations in my code– if you update your system’s %PATH% variable, those changes aren’t seen by applications/consoles spawned by SlickRun until you restart it. It’s particularly annoying because it’s so unexpected– users expect that command consoles launched by Win+R,cmd.exe,Enter will behave the same way as Win+Q,cmd,Enter, but the former consoles have the updated %PATH% while the latter do not.
While ShellExecute() sounds like it’s an API that causes the shell (aka Explorer) to execute something, in fact it does nothing of the sort.
Updating the Environment Block
The root cause of the “outdated path” problem is that processes launched via ShellExecute inherit the environment variables of their spawning process, and those environment variables (typically) are assigned as the process launches and never touched again. Because SlickRun starts with Windows, the %PATH% when it starts is the %PATH% that every process it launches inherits. (You can easily view a process’ environment block using the Properties > Environment tab in Process Explorer).
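The inheritance behavior is easy to see from any language. Here is a small Python sketch (the variable name DEMO_VAR and the helper child_sees are made up for the demo): a child process gets a copy of whatever the parent's environment holds at the moment it is spawned.

```python
import os
import subprocess
import sys


def child_sees(name: str) -> str:
    """Spawn a child process and return the value it sees for variable `name`."""
    code = f"import os; print(os.environ.get({name!r}, ''))"
    result = subprocess.run([sys.executable, "-c", code],
                            capture_output=True, text=True, check=True)
    return result.stdout.strip()


os.environ["DEMO_VAR"] = "old"
print(child_sees("DEMO_VAR"))  # the child inherits the parent's current value

os.environ["DEMO_VAR"] = "new"
print(child_sees("DEMO_VAR"))  # only children spawned after the change see it
```

A long-running launcher is the parent in this picture: unless it refreshes its own variables, every child keeps getting the values from the day the launcher started.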
So, how does Explorer detect the change? That part I figured out ages ago– after updating an environment variable, the System Properties > Environment Variables Control Panel UI (or the SetX.exe console tool) broadcast a WM_SETTINGCHANGE message to all top-level windows with an lparam containing the string “Environment”. I could easily add code to SlickRun to detect that the variables had changed, but for decades I didn’t really know what to do next… I didn’t know how to read the updated variables (without doing something hacky like restarting the process) nor ensure that they were passed to the applications spawned by ShellExecute.
Yesterday, I got fed up and started Googling. A few posts on StackOverflow mentioned a promising-sounding function, RegenerateUserEnvironment. And while that function appears to be undocumented, there’s an amazing issue filed in an open-source tracker that explains exactly how Windows Explorer uses this function– basically, just wait for the WM_SETTINGCHANGE event, then call the API. RegenerateUserEnvironment will replace the calling process’ current environment block with the latest values.
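For illustration only, here is roughly what that looks like through Python's ctypes. SlickRun itself is written in Delphi, so this is not its code. Because the function is undocumented, the signature used here (a BOOL-returning export from shell32.dll that takes a pointer for the previous environment and a flag to apply the new one) is an assumption based on what is commonly reported, and it could change without notice.

```python
import ctypes
import sys

WM_SETTINGCHANGE = 0x001A  # documented Win32 window message


def is_environment_change(msg: int, lparam_text: str) -> bool:
    """True when a window message announces that environment variables changed."""
    return msg == WM_SETTINGCHANGE and lparam_text == "Environment"


def regenerate_environment() -> bool:
    """Ask Windows to rebuild this process's environment block.

    ASSUMPTION: undocumented API; signature taken to be
    BOOL RegenerateUserEnvironment(void **prevEnv, BOOL setCurrent).
    """
    if sys.platform != "win32":
        return False  # nothing to do on other platforms
    shell32 = ctypes.WinDLL("shell32")
    func = shell32.RegenerateUserEnvironment
    func.argtypes = [ctypes.POINTER(ctypes.c_void_p), ctypes.c_int]
    func.restype = ctypes.c_int
    prev_env = ctypes.c_void_p()
    return bool(func(ctypes.byref(prev_env), 1))
```

A window procedure would call regenerate_environment() whenever is_environment_change() returns True. Note that Python's own os.environ mapping is a snapshot taken at startup, so it would not reflect the regenerated block.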
Launching at Medium Integrity
While we’re on the topic of executing applications “like the shell”, another scenario came up twelve years ago when Windows Vista was first introduced. The SlickRun installer, written in NSIS, launches SlickRun when installation completes. Unfortunately, the installer runs with Admin rights (High integrity), which means that, by default, all of the programs it launches inherit that integrity. For SlickRun, this is especially bad because it means that any programs that it, in turn, launches during that first session (e.g. your browser!) will run at High integrity too. Not good.
While you can easily use the “Runas” verb to ShellExecute to launch a High integrity application from a Medium integrity application, there (depressingly) isn’t a way to do the opposite. For years, the official recommendation was to do some fancy coding to clone Explorer’s tokens and use those. Unfortunately, this is quite complicated to implement, especially within a NSIS script.
As it turns out, however, there’s a trivial workaround which works quite well– while ShellExecute doesn’t run things as the shell, applications can easily get Explorer to launch anything they like at Explorer’s integrity. The trick is to simply invoke explorer.exe and pass the filename to be executed as the first command line argument:
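A minimal sketch of that invocation, written in Python rather than the NSIS script the installer actually uses. The helper names are hypothetical, and the code assumes explorer.exe can be found on the search path.

```python
import subprocess
import sys


def explorer_launch_command(target_path: str) -> list:
    """Build the command line: explorer.exe with the target as its first argument."""
    return ["explorer.exe", target_path]


def launch_via_explorer(target_path: str) -> None:
    """Have Explorer start the target so it runs at Explorer's integrity level."""
    if sys.platform != "win32":
        raise OSError("explorer.exe is only available on Windows")
    subprocess.Popen(explorer_launch_command(target_path))
```

An elevated installer would call something like launch_via_explorer(r"C:\Program Files\SlickRun\sr.exe"), where the path is only a placeholder for wherever the application was installed.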
While this approach isn’t technically supported, I expect it is likely to continue to work for the foreseeable future.
It’s depressing that together these tricks have taken me almost twenty years to discover, but I’m happy that I have. I hope they help you out.
More recently, I wrote about why Content-Type headers matter for same-origin-policy enforcement.
I’ve just read a great paper on cross-origin infoleaks and current/future mitigations. If you’re interested in browser security, it’s definitely worth a read.
In order to be eligible for the HSTS Preload list, your site must usually serve a Strict-Transport-Security header with an includeSubdomains directive.
Unfortunately, some sites do not follow the best practices recommended and instead just set a one-year preload header with includeSubdomains and then immediately request addition to the HSTS Preload list. The result is that any problems will likely be discovered too late to be rapidly fixed– removals from the preload list may take months.
In running the HSTS preload list, we’ve seen two common mistakes for sites using includeSubdomains:
Mistake: Forgetting Intranet Hosts
Some sites are set up with a public site (example.com) and an internal site only accessible inside the firewall (app.corp.example.com). When includeSubdomains is set, all sites underneath the specified domain must be accessible over HTTPS, including in this case app.corp.example.com. Some corporations have different teams building internal and external applications, and must take care that the security directives applied to the registrable domain are compatible with all of the internal sites running beneath it in the DNS hierarchy. Following the best practices of staged rollout (with gradually escalating max-age directives) will help uncover problems before you brick your internal sites and applications.
Of course, you absolutely should be using HTTPS on all of your internal sites as well, but HTTPS deployments are typically smoother when they haven’t been forced by Priority-Zero downtime.
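A staged rollout can be sketched as a simple sequence of header values. The specific stage durations below are illustrative assumptions, not a prescription, and the helper names are made up. The directive is spelled includeSubDomains here as in the HSTS specification; directive names are case-insensitive.

```python
# Illustrative ramp-up stages in seconds; the exact values are an assumption.
STAGES = [300, 604800, 2592000, 31536000]  # 5 minutes, 1 week, 30 days, 1 year


def hsts_header(max_age: int, include_subdomains: bool = False,
                preload: bool = False) -> str:
    """Build a Strict-Transport-Security header value."""
    parts = [f"max-age={max_age}"]
    if include_subdomains:
        parts.append("includeSubDomains")
    if preload:
        parts.append("preload")
    return "; ".join(parts)


def rollout_plan() -> list:
    """Header values to deploy in order, adding `preload` only at the very end."""
    plan = [hsts_header(age, include_subdomains=True) for age in STAGES]
    plan.append(hsts_header(STAGES[-1], include_subdomains=True, preload=True))
    return plan
```

Each stage should stay deployed long enough for broken subdomains to surface. With a five-minute max-age, a mistake expires within minutes of removing the header; with a one-year preloaded policy, it does not.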
Mistake: Forgetting Delegated Hosts
Some popular sites and services use third party companies for advertising, analytics, and marketing purposes. To simplify deployments, they’ll delegate handling of a subdomain under their main domain to the vendor providing the service. For instance, www.example.com will point to the company’s own servers, while the hostname mail-tracking.example.com will point to Experian or Marketo servers. An HSTS rule with includeSubdomains applied to example.com will also apply to those delegated domains. If your service provider has not enabled HTTPS support on their servers, all requests to those domains will fail when upgraded to HTTPS. You may need to change service providers entirely in order to unbrick your marketing emails!
Of course, you absolutely should be using HTTPS on all of your third-party apps as well, but HTTPS deployments are typically smoother when they haven’t been forced by Priority-Zero downtime.
If you do find yourself in the unfortunate situation of having preloaded a domain whose subdomains were not quite ready, you can apply for removal from the preload list, but, as noted previously, the removal can be expected to take a very long time to propagate. For cases where you only have a few domains out of compliance, you should be able to quickly move them to HTTPS. You might also consider putting an HTTPS front-end out in front of your server (e.g. Cloudflare’s free Flexible SSL option) to allow it to be accessed over HTTPS before the backend server is secured.
Deploy safely out there!
Just over 5 years ago, I wrote a blog post titled “Misbehaving HTTPS Servers Impair TLS 1.1 and TLS 1.2.”
In that post, I noted that enabling versions 1.1 and 1.2 of the TLS protocol in IE would cause some sites to load more slowly, or fail to load at all. Sites that failed to load were sending TCP/IP RST packets when receiving a ClientHello message that indicated support for TLS 1.1 or 1.2; sites that loaded more slowly relied on the fact that the browser would retry with an earlier protocol version if the server had sent a TCP/IP FIN instead.
TLS version fallbacks were an ugly but practical hack– they allowed browsers to enable stronger protocol versions before some popular servers were compatible. But version fallback incurs real costs:
- security – a MITM attacker can trigger fallback to the weakest supported protocol
- performance – retrying handshakes takes time
- complexity – falling back only in the right circumstances, creating new connections as needed
- compatibility – not all clients are willing or able to fall back (e.g. Fiddler would never fall back)
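The fallback dance, and why it is a security problem, can be modeled in a few lines of Python. This is a simulation under stated assumptions rather than real TLS code: handshake is a stand-in callable that reports whether a connection offering a given version succeeded, and the server and attacker functions are hypothetical.

```python
from typing import Callable, Optional, Sequence

VERSIONS = ("TLS 1.2", "TLS 1.1", "TLS 1.0")  # strongest first


def connect_with_fallback(handshake: Callable[[str], bool],
                          versions: Sequence[str] = VERSIONS) -> Optional[str]:
    """Try each version in order; return the first that succeeds, else None."""
    for version in versions:
        # Every failed attempt costs a fresh connection and handshake.
        if handshake(version):
            return version
    return None


def intolerant_server(offered: str) -> bool:
    """Hypothetical buggy server that rejects any ClientHello newer than TLS 1.0."""
    return offered == "TLS 1.0"


def mitm(real_server: Callable[[str], bool]) -> Callable[[str], bool]:
    """An on-path attacker fails the stronger attempts to force a downgrade."""
    def tampered(offered: str) -> bool:
        if offered != "TLS 1.0":
            return False  # attacker resets or closes the connection
        return real_server(offered)
    return tampered
```

Against a perfectly modern server, connect_with_fallback(mitm(server)) still ends up on TLS 1.0, because the client cannot distinguish an attacker's reset from a buggy server's.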
Fortunately, server compatibility with TLS 1.1 and 1.2 has improved a great deal over the last five years, and browsers have begun to remove their fallbacks; first fallback to SSL 3 was disabled and now Firefox 37+ and Chrome 50+ have removed fallback entirely.
In the rare event that you encounter a site that needs fallback, you’ll see a message like this, in Chrome:
or in Firefox:
Currently, both Internet Explorer and Edge fall back; first a TLS 1.2 handshake is attempted:
and after it fails (the server sends a TCP/IP FIN), a TLS 1.0 attempt is made:
This attempt succeeds and the site loads in IE. If you analyze the affected site using SSLLabs, you can see that it has a lot of problems; the key one for this case is in the middle:
This is repeated later in the analysis:
The analyzer finds that the server refuses not only TLS 1.2 but also the upcoming TLS 1.3.
Unfortunately, as an end-user, there’s not much you can safely do here, short of contacting the site owners and asking them to update their server software to support modern standards. Fortunately, this problem is rare– the Chrome team found that only 0.0017% of TLS connections triggered fallbacks, and this tiny number is probably artificially high (since a spurious connection problem will trigger fallback).
HTTPS only works if you use it.
Coinbase is an online bitcoin exchange backed by $106M in venture capital investment. They’ve got a strong HTTPS security posture, including the latest ciphers, a 4096-bit RSA key, and advanced features like browser-preloaded HSTS and HPKP.
SSLLabs grades Coinbase’s HTTPS deployment an A+:
This is a well-secured site with a professional security team.
Here’s the email they just sent me:
Let’s run the MoarTLS Analyzer on that:
That’s right… every hyperlink in this email is non-secure and any click can be intercepted and sent anywhere by a network-based attacker.
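The same kind of check is easy to automate for your own outgoing email templates. Here is a minimal Python sketch using only the standard library; it is not how MoarTLS is implemented, and the names InsecureLinkFinder and find_insecure_links are made up for this example.

```python
from html.parser import HTMLParser


class InsecureLinkFinder(HTMLParser):
    """Collect the href of every <a> element that points at plain http://."""

    def __init__(self):
        super().__init__()
        self.insecure = []

    def handle_starttag(self, tag, attrs):
        if tag != "a":
            return
        for name, value in attrs:
            if name == "href" and value and value.strip().lower().startswith("http://"):
                self.insecure.append(value)


def find_insecure_links(html: str) -> list:
    """Return the non-secure link targets found in an HTML document."""
    finder = InsecureLinkFinder()
    finder.feed(html)
    finder.close()
    return finder.insecure
```

Running find_insecure_links over a template before it is sent, and failing the build when the result is non-empty, would catch this class of mistake early.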
Sadly, Coinbase is far from alone in snatching security defeat from the jaws of victory; my #HTTPSFAIL folder includes a lot of other big names:
It doesn’t matter how well you secure your castle if you won’t help your visitors get to it securely. Use HTTPS everywhere.
Update: I filed a bug with Coinbase on HackerOne. Their security team says that they “agree” that these links should be HTTPS, but the problem is Mailchimp (their email vendor) and they can’t fix it. Mailchimp offers a security vulnerability reporting form, delivered exclusively over HTTP:
Coinbase isn’t the first service whose security is bypassed because their emails are sent with non-secure links; the Brave browser download announcements suffered the same problem.