As your browser navigates from page to page, servers are informed of the URL from where you’ve come from using the Referer HTTP header1; the document.referrer DOM property reveals the same information to JavaScript.

Similarly, as the browser downloads the resources (images, styles, JavaScript) within webpages, the Referer header on the request allows the resource’s server to determine which page is requesting the resource.

The Referrer is omitted in some cases, including:

  • When the user navigates via some mechanism other than a link in the page (e.g. choosing a bookmark or using the address box)
  • When navigating from HTTPS pages to HTTP pages
  • When navigating from a resource served by a protocol other than HTTP(S)
  • When the page opts-out (details in a moment)

Usefulness

The Referrer mechanism can be very useful, because it helps a site owner understand from where their traffic is originating. For instance, WordPress automatically generates this dashboard which shows me where my blog gets its visitors:

BlogStats

I can see not only which Search Engines send me the most users, but also which specific posts on Reddit are driving traffic my way.

Privacy Implications

Unfortunately, this default behavior has a significant impact on privacy, because it can potentially leak private and important information.

Imagine, for example, that you’re reviewing a document your mergers and acquisitions department has authored, with the URL https://contoso.com/​Q4/PotentialAcquisitionTargetsUpTo5M.docx. Within that document, there might have a link to https://fabrikam.com​/financialdisclosures.htm. If you were to click that link, the navigation request to Fabrikam’s server will contain the full URL of the document that led you there, potentially revealing information that your firm would’ve preferred to keep quiet.

Similarly, your search queries might contain something you don’t mind Bing knowing (“Am I required to disclose a disease before signing up for HumongousInsurance.com?”) but that you didn’t want to immediately reveal to the site where you’re looking for answers.

If your web-based email reader puts your email address in the URL, or includes the subject of the current email, links you click in that email might be leaking information you wish to keep private.

The list goes on and on.

Referrer Policy

Websites have always had ways to avoid leaking information to navigation targets, usually involving nonstandard navigation mechanisms (e.g. meta refresh) or by wrapping all links so that they go through an innocuous page (e.g. https://example.net/offsitelink.aspx).

However, these mechanisms were non-standard, cumbersome, and would not control the referrer information sent when downloading resources embedded in pages. To address these limitations, Referrer Policy was developed and implemented by most browsers2.

CanIUseRP

Referrer Policy allows a website to control what information is sent in Referer headers and exposed to the document.referrer property. As noted in the spec, the policy can be specified in several ways:

  • Via the Referrer-Policy HTTP response header.
  • Via a meta element with a name of referrer.
  • Via a referrerpolicy content attribute on an aareaimgiframe, or link element.
  • Via the noreferrer link relation on an aarea, or link element.
  • Implicitly, via inheritance.

The policy can be any of the following:

  • no-referrer – Do not send a Referer.
  • unsafe-url – Send the full URL (lacking only auth info and fragment), even on navigations from HTTPS to HTTP.
  • no-referrer-when-downgrade – Don’t send the Referer when navigating from HTTPS to HTTP. [The longstanding default behavior of browsers.]
  • strict-origin-when-cross-origin – For a same-origin navigation, send the URL. For a cross-origin navigation, send only the Origin of the referring page. Send nothing when navigating from HTTPS to HTTP. [Spoiler alert: The new default.]
  • origin-when-cross-origin For a same-origin navigation, send the URL. For a cross-origin navigation, send only the Origin of the referring page. Send the Referer even when navigating from HTTPS to HTTP.
  • same-origin – Send the Referer only for same-origin navigations.
  • origin – Send only the Origin of the referring page.
  • strict-origin – Send only the Origin of the referring page; send nothing when navigating from HTTPS to HTTP.
  • empty string – Inherit, or use the default

As you can see, there are quite a few policies. That’s partly due to the strict- variations which prevent leaking even the origin information on HTTPS->HTTP navigations.

Improving Defaults

With this background out of the way, the Chromium team has announced that they plan to change the default Referrer Policy from no-referrer-when-downgrade to strict-origin-when-cross-origin. This means that cross-origin navigations will no longer reveal path or query string information, significantly reducing the possibility of unexpected leaks.

As with other big privacy changes, this change is slated to ship in v80, the code has been in for five years and you can enable it in Chrome 78+ and Edge 78+:

  1. Visit chrome://flags/#reduced-referrer-granularity
  2. Set the feature to Enabled
  3. Restart your browser

Flag

I’ve published a few toy test cases for playing with Referrer Policy here.

As noted in their Intent To Implement, the Chrome team are not the first to make changes here. As of Firefox 70 (Oct 2019), the default referrer policy is set to strict-origin-when-cross-origin, but only for requests to known-tracking domains, OR while in Private mode. In Safari ITP, all cross-site HTTP referrers and all cross-site document.referrers are downgraded to origin. Brave forges the Referer (sending the Origin of the target, not the source) when loading of cross-origin resources.

Understand the Limits

Note that this new default is “opt-out”– a page can still choose to send unrestricted referral URLs if it chooses. As an author, I selfishly hope that sites like Reddit and Hacker News might do so.

Also note that this new default does not in any way limit JavaScript’s access to the current page‘s URL. If your page at https://contoso.com/SuperSecretDoc.aspx includes a tracking script:

script

… the HTTPS request for track.js will send Referer: https://contoso.com/, but when the script runs, it will have access to the full URL of its execution context (https://contoso.com/SuperSecretDoc.aspx) via the window.location.href property.

Test Your Sites

If you’re a web developer, you should test your sites in this new configuration and update them if anything is unexpectedly broken. If you want the browser to behave as it used to, you can use any of the policy-specification mechanisms to request no-referrer-when-downgrade behavior for either an entire page or individual links.

Or, you might pick an even stricter policy (e.g. same-origin) if you want to prevent even the origin information from leaking out on a cross-site basis. You might consider using this on your Intranet, for instance, to help prevent the hostnames of your Intranet servers from being sent out to public Internet sites.

Stay private out there!

-Eric

1 The misspelling of the HTTP header name is a historical accident which was never corrected.

2 Notably, Safari, IE11, and versions of Edge 18 and below only supported an older draft of the Referrer policy spec, with tokens never (matching no-referrer), always (matching unsafe-url), origin (unchanged) and default (matching no-referrer-when-downgrade). Edge 18 supported origin-when-cross-origin, but only for resource subdownloads.

For security reasons, Edge 76+ and Chrome block navigation1 to file:// URLs from non-file:// urls.

If a browser user clicks on a file:// link on an https-delivered webpage, nothing visibly happens. If you open the the Developer Tools console, you’ll see a note: “Not allowed to load local resource: file://host/whatever”.

In contrast, Edge18 (like Internet Explorer before it) allowed pages in your Intranet Zone to navigate to URLs that use the file:// url protocol; only pages in the Internet Zone were blocked from such navigations2.

No option to disable this navigation blocking is available in Chrome or Edge 76+.

What’s the Risk?

The most obvious problem is that the way file:// retrieves content can result in privacy and security problems. Pulling remote resources over file:// can leak your user account information and a hash of your password to the remote site. What makes this extra horrific is that if you log into Windows using an MSA account, the bad guy gets both your global userinfo AND a hash he can try to crack.

Beyond the data leakage risks related to remote file retrieval, other vulnerabilities related to opening local files also exist. Navigating to a local file might result in that file opening in a handler application in a dangerous or unexpected way. The Same Origin Policy for file URIs is poorly defined and inconsistent across browsers, which can result in security problems.

Workaround: IE Mode

Enterprise administrators can configure sites that must navigate to file:// urls to open in IE mode. Like legacy IE itself, IE mode pages in the Intranet zone can navigate to file urls.

Workaround: Extensions

Unfortunately, the extension API chrome.webNavigation.onBeforeNavigate does not fire for file:// links that are blocked in HTTP/HTTPS pages, which makes working around this blocking this via an extension difficult.

One could write an extension that uses a Content Script to rewrite all file:// hyperlinks to an Application Protocol handler (e.g. file-after-prompt://) that will launch a confirmation dialog before opening the targeted URL via ShellExecute or explorer /select,”file://whatever”, but this would entail rewriting the extension rewriting every page which has non-zero performance implications. It also wouldn’t fix up any non-link file navigations (e.g. JavaScript that calls window.location.href=”file://whatever”).

Similarly, the Enable Local File Links extension simply adds a click event listener to every page loaded in the browser. If the listener observes the user clicking on a link to a file:// URL, it cancels the click and directs the extension’s background page to perform the navigation to the target URL, bypassing the security restriction by using the extension’s (privileged) ability to navigate to file URLs. But this extension will not help if the page attempts to use JavaScript to navigate to a file URI, and it exposes you to the security risks described above.

 

Necessary but not sufficient

Unfortunately, blocking file:// uris in the browser is a good start, but it’s not complete. There are myriad formats which have the ability to hit the network for file URIs, ranging from Office documents, to emails, to media files, to PDF, MHT, SCF files, etc, and most of them will do so without confirmation.

In an enterprise, the expectation is that the Organization will block outbound SMB at the firewall. When I was working for Chrome and reported this issue to Google’s internal security department, they assured me that this is what they did. I then proved that they were not, in fact, correctly blocking outbound SMB for all employees, and they spun up a response team to go fix their broken firewall rules. In a home environment, the user’s router may or may not block the outbound request.

Various policies are available, but I get the sense that they’re not broadly used.

 

-Eric

This post covers navigating to file:// uris. Another question which occasionally comes up is “how can I embed a subresource like an image or a script from a file:// URI into my https-served page.” This, you also cannot do, for similar security/privacy reasons. And that’s probably a good thing.

2 Interestingly, some alarmist researchers didn’t realize that this was happening on a per-zone basis, and asserted that IE/Edge would directly leak your credentials from any Internet web page. This is not correct. It is further incorrect in old Edge (Spartan) because Internet-zone web pages run in Internet AppContainers, which lack the Enterprise Authentication permission, which means that they don’t even have your credentials.

The Chrome team is embarking on a clever and bold plan to change the recipe for cookies. It’s one of the most consequential changes to the web platform in almost a decade, but with any luck, users won’t notice anything has changed.

But if you’re a web developer, you should start testing your sites and services now to help ensure a smooth transition.

What’s this all about?

As originally designed, cookies were very simple. When a browser made a request to a website, that website could return a tiny piece of text, called a cookie, to the browser. When the web browser subsequently requested any resource from that website, the cookie string would be echoed back to the server that first sent it.

Simple, right?

A bit too simple, as it turns out.

Browser designers have spent the last two decades trying to clear up the mess that this one simple feature causes, and alternatives might never gain adoption.

There are two major classes of problem with the design of cookies: Privacy, and Security.

Privacy

The top privacy problem is that cookies are sent every time a request is made for a resource, even if that request is made from a completely different context. So, if you visit A.example.com, that page might request a tracking pixel from ad.doubleclick.net. This tracking pixel might set a cookie. The tracking pixel’s cookie is called a third party cookie because it was set by a domain unrelated to the page itself.

If you later visit B.textslashplain.com, which also contains a tracking pixel from ad.doubleclick.net, the tracking pixel’s cookie set on your visit to A.example.com is sent to ad.doubleclick.net, and now that tracker knows that you’ve visited both sites. As you browse more and more sites that contain a tracking pixel from the same provider, that provider can build up a very complete profile of the sites you like to visit, and use that information to target ads to you, sell the data to a data aggregation company, etc.

Today, Brave blocks 3rd party cookies by default, while Safari’s ITP feature does something more intricate. Firefox and the new Edge have “Tracking Prevention” features that block 3rd-party cookies from known trackers.

Most browsers offer a setting to turn off ALL third party cookies, and older versions of Internet Explorer used to P3P block cookies that did not promise to abide by reasonable privacy protections. However, almost all users leave 3rd party cookies enabled, and enough sites sent fraudulent P3P declarations that the P3P support was ripped out of the only browser that supported it.

Security

The security problems with cookies are a bit more subtle.

In most cases, after you log in, a site will store your identity in a cookie, such that you don’t have to reenter your password on every page, or retap your security key every time you do anything. An authentication token is stored in a cookie, and each request you make to a site carries that cookie and token.

The problem is that this creates the possibility of a cross site request forgery attack, in which an attacker carefully crafts a request to a website to which you are logged in. When you visit the attacker’s site (say, to read a news article or view an image link posted to your social media feed), the attacker’s page instructs your browser to send its malicious request (“Transfer $1000 from me to @badguy”) to the victim site where you are logged in (e.g. https://bank.example).

Normally, such a request would be ignored or responded to by a demand for credentials, but because your browser is already logged in to bank.example and because your browser made the request, the server receives the cookie containing your authentication token and deems the request legitimate. You’ve been robbed! This class of attack is called the confused deputy attack.

Now, there are myriad ways to protect against this problem, but they all require careful work on the part of web application developers, and a long history of exploits shows that failing to protect against CSRF is a common mistake.

Other Privacy and Security Problems

I’ve described the biggest and most prominent of the security and privacy problems with the design of cookies, but those are just the tip of the iceberg. There are many other problems, including:

  • Because cookies are sent on plaintext HTTP requests, a passive network observer can watch your cookies as they’re sent over the network and correlate requests back to a single client. The NSA famously abused this vulnerability.
  • Cookies are sent to servers which respond with data that might allow Cross Site Leaks or violations of Same Origin Policy. For instance, an attacker might include in their page several guesses as to your identity. If they guess right (e.g. your cookie matches the URL identity), the images loads and now the attacker site knows exactly who you are:
    WordPressCorruptsMarkup
  • Cookies can be trivially stolen in a XSS Attack if the HTTPOnly attribute was not set.
  • Cookies can be trivially leaked if the server forget to set the Secure attribute and the site isn’t on the HSTS preload list.
  • Same Origin Policy blocks reading of cross-origin resources, but this depends on the integrity of the browser sandbox. Attacks like Spectre weaken this guarantee. Ambient authentication (like cross-origin cookies) weakens Cross-Origin-Read-Blocking‘s ability to prevent a compromised renderer from stealing data.

Cookie Design Improvements

Over the years, browsers have introduced more and more features and toggles to help lock down cookies, from the Secure and HTTPOnly attributes to Cookie Prefixes (nee Magic Named Cookies).

Perhaps the most promising improvement is a feature called SameSite cookies, by which cookies can opt-in to being sent only in a first-party context:

Set-Cookie: __HostAuth=F123ABCA; SameSite=Strict; secure; httponly;

This helps protect your site against CSRF attacks and helps mitigate leakage of the user’s identity in a cross-site context.

There’s a nice SameSite cookie explainer (with pictures!).

While broadly supported by browsers, the SameSite directive isn’t getting used everywhere it should be.

Big Changes are Coming!

So the Chrome folks plan to change that.

In Chrome 80 and later, cookies will default to SameSite=Lax. This means that cookies will automatically be sent only in a first party context unless they opt-out by explicitly setting a directive of None:

Set-Cookie: ACookieAvailableCrossSite; SameSite=None; secure; httponly


This change is small in size, and huge in scope. It has huge implications for any site that expects its cookies to be used in a cross-origin context.

 

What’s the Immediate Good?

In one fell swoop, many websites will get more secure. Sites that were previously vulnerable to CSRF and cross-site leak attacks will be protected from attack in the most popular browser.

Privacy improves, because setting and sending of 3rd party cookies are blocked-by-default:

SubframeCookiesBlocked

Who’s on board?

We plan to match this change in default for the new Edge browser (release timeline TBD). However, there’s no plan to match this change for Internet Explorer (please stop using it!) or the old Edge (v18 and earlier).

Per Chromium’s Intent-to-Implement announcement, Firefox is looking into matching the change, although they’ve joking-not-joking suggested that they’re going to let Chrome lead the charge (and bear the brunt of the compatibility impact) before turning the feature on by default.

Per the I2I, Safari has not yet weighed in on this change. Safari’s ITP feature already imposes many interesting restrictions on cross-site cookies.

What Can Go Wrong?

If users visit a site that expects its cookies to be available but the cookies are missing, users might get a confusing error message suggesting that they toggle a setting that won’t help:

teamserror

Or, the site might just redirect between its identity provider and itself forever.

These sorts of problems happen on sites that use Federated Identity providers that depend on accessing cookies from 3rd-party subframes:

federatedid-1

To fix this, the identity provider site will either need to set SameSite=None on its cookies, or will need to use a browser storage feature (e.g. localStorage) that is not impacted by this change. Please note that other browser features do impact the availability of DOM Storage, so it’s not a silver bullet.

Rollout Plan

The Chrome team has set an ambitious timeline which calls for turning this feature on-by-default for Chrome 80, slated for stable release on February 4th, 2020. Chrome’s rollout plan includes enabling the new default on an experimental basis in Chrome 79 pre-release [Beta/Dev/Stable] channels for machines which are not externally managed — attempting to defer walking into the compatibility minefield of enterprise intranets. Presently, this exclusion does not apply to machines that are domain-joined via AAD due to a limitation in Chromium.

The Chrome team also announced two enterprise policies for Chrome 80 that will allow admins to opt-out of the new default entirely, or to opt-out only specific sites. Edge expects to offer these same policies.

Developer Tooling

Developers who wish to enable the SameSite-by-Default feature locally for testing purposes can do so by visiting chrome://flags and searching for SameSite:

flags

Set the SameSite by default cookies feature to Enabled and restart the browser.

You can view the cookies used by the current page using the Application tab of the Developer Tools; the column at the far right shows the declared SameSite attribute:

F12Tab

The Chrome team have enabled logging in the Developer Tools to notify web developers that cookie behavior is changing. Visit chrome://flags/cookie-deprecation-messages to ensure that the warnings are enabled:

CookieMessages

If you then explore my test page, you can see the notices from the tools:

DebugInfo

The Cookies subtab of the Network tab picked up a new checkbox “show filtered out request cookies” which allows you to see (in yellow) which cookies were not sent for the selected request due to SameSite rules:

CookiesTab

Cookies restricted by SameSite rules are also logged in NetLog captures (issue 1005217).

Problems and Accommodations

While the vast majority of cookie scenarios will continue to work as expected, compatibility breaks are inevitable. Unfortunately, some of these breaks might not be trivially fixed by adding the SameSite=None attribute.

For instance, older versions of Safari treated SameSite=None as SameSite=Strict, which means that servers must avoid sending the None token to Safari 12.

The .NET Framework’s cookie writer used to simply omit the SameSite attribute when the SameSiteMode was “None.” Changing this will require the affected sites to update their framework to a version with the patch.

Early in our investigations, we found another problem related to how SameSite impacts cookies sent while navigating.

Specifically, over a million sites first set an anti-CSRF cookie on themselves, then redirect to a federated Login provider, then the Login provider POSTs the login information back to the site. That initial anti-CSRF cookie is only meant to be used in a first party context. Crucially, however, SameSite cookies are not sent on navigations if the navigations use the HTTP POST verb. Making the anti-CSRF cookies SameSite=Lax by default breaks this scenario and thus breaks tons of websites.

AntiCSRF

The Two Minute Mitigation

Demanding these security cookies be set to SameSite=None would be both onerous (many more sites would need to change) and misleading (because these cookies are really only meant to go to a 1st party context).

To address this breakage, the new default was adjusted to allow a SameSite-Lax-by-Default cookie to be sent on a subsequent POST requests for two minutes, significantly reducing the breakage without giving up all of the security benefit of the change.

Note that the 2 minute mitigation might not be enough for login scenarios that take longer than 2 minutes. For instance, consider the case where the login flow is happening in a background tab, or you have to fetch your Security Key, or a child must get a parent’s permission, etc. Sites that wish to handle scenarios like this will need to store a copy of their anti-CSRF token elsewhere (e.g. sessionStorage seems appropriate).

The Chrome team plans to eventually remove the 2 minute mitigation entirely.

Compat Landmine: document.cookie

In Firefox and Safari, the document.cookie DOM property matches the Cookie header, including omission of cookies that were restricted by SameSite navigation rules.

In contrast, in Chrome and Edge, SameSite cookies that are omitted from the Cookie header are still included in the document.cookie collection following a cross-origin navigation. I’ve been convinced that this actually makes more sense, although the reasoning is subtle [issue].

To the extent possible, you should set your cookies with the httponly attribute, so that they’re not available to JavaScript and this compatibility difference is irrelevant.

What’s Next?

Assuming the rollout happens on schedule, users will get more security and more privacy. But that’s just the start.

The next step is to combat the “non-secure-cookies-are-trackable” attack mentioned previously. To prevent non-secure cross-site cookies being used by network observers to follow users around the web, SameSite=None cookies will be blocked if set without the Secure attribute. Chrome’s timeline for enabling this change by default seems squishier, but ChromeStatus claims it is also slated for Chrome 80.

After that, the next step is to combat the inevitable abuse by trackers.

Because trackers can simply opt their own cookies out of restrictions by setting SameSite=None, trackers will do so. But this isn’t is bad as you think– by forcing sites to explicitly declare each cookie for which cross-site use is intended, browsers can then focus extra love and attention around such cookies.

If we peek at Chrome’s flags page today, we see an interesting hint about the Chrome team’s plans:

SSNoneSwitch

Enabling that option enables EnableRemovingAllThirdPartyCookies which adds a new Remove Third-Party Cookies button to the All cookies and site data page. When clicked, it pops the following dialog:Delete3rdPartyButton

The HandleRemoveThirdParty() function invoked by the Clear button clears not only the cookies from those domains, but also all of the site data for the sites on which those cookies existed. This provides a strong disincentive for sites to opt-in to SameSite=None cookies unless they really need to.

(Disclaimer: Chrome might never launch the “Clear third-party cookies” button, but it’s in the code today).

You can expect that browser designers will soon dream up new and interesting remediations for cookies marked for access across multiple sites. For instance, browsers could limit the lifetime of those cookies to days or hours, or even a single browser session (tricky).

We live in interesting times!

Update: Chrome pushed back experimenting with this feature from Chrome 78 to Chrome 79.

-Eric

PS: If you’re a Fiddler user, you can use this script to easily visualize which cookies are being set with which SameSite values.

Cookie-dough hero photo by Pam Menegakis on Unsplash.

For a small number of users of Chromium-based browsers (including Chrome and the new Microsoft Edge) on Windows 10, after updating to 78.0.3875.0, every new tab crashes immediately when the browser starts.

Impacted users can open as many new tabs as they like, but each will instantly crash:

EveryTabCrashes

EdgeHavingAProblem

What’s going wrong? This problem relates to a security/reliability improvement made to Chromium’s sandboxing. Chromium runs each of the tabs (and extensions) within locked down (“sandboxed”) processes:

JAIL

In Chrome 78, a change was made to prevent 3rd-party code from injecting itself into these sandboxed processes. 3rd-party code is a top source of browser reliability and performance problems, and it has been a longstanding goal for browser vendors to get this code out of the web platform engine.

This new feature relies on setting a Windows 10 Process Mitigation policy that instructs the OS loader to refuse to load binaries that aren’t signed by Microsoft. Edge 13 enabled this mitigation in 2015, and the Chromium change brings parity to the new Edge 78+. Notably, Chrome’s own DLLs aren’t signed by Microsoft so they are specially exempted by the Chromium sandboxing code.

Unfortunately, the impact of this change is that the renderer is killed (resulting in the “Aw snap” page) if any disallowed DLL attempts to load, for instance, if your antivirus software attempts to inject its DLLs into the renderer processes. For example, Symantec Endpoint Protection versions before 14.2 are known to trigger this problem.

If you encounter this problem, you should follow the following steps:

Update any security software you have to the latest version.

Other than malware, security software is the other likely cause of code being unexpectedly injected into the renderers.

Temporarily disable the new protection

You can temporarily launch the browser without this sandbox feature to verify that it’s the source of the crashes.

  1. Close all browser instances (verify that there are no hidden chrome.exe or msedge.exe processes using Task Manager)
  2. Use Windows+R to launch the browser with the command line override:
msedge.exe --disable-features=RendererCodeIntegrity
chrome.exe --disable-features=RendererCodeIntegrity

Ensure that the tab processes work properly when code integrity checks are disabled.

If so, you’ve proven that code integrity checks are causing the crashes.

Hunt down the culprit

Visit chrome://conflicts#R to show the list of modules loaded by the client. Look for any files that are not Signed By Microsoft or Google.

If you see any, they are suspects. (There will likely be a few listed as “Shell Extension”s; e.g. 7-Zip.dll, that do not cause this problem)– check for an R in the Process types column to find modules loading in the Renderers.

You should install any available updates for any of your suspects to see if doing so fixes the problem.

Use Enterprise Policy to disable the new protection

If needed, IT Adminstrators can disable the new protection using the RendererCodeIntegrity policy for Chrome and Edge. You should outreach to the software vendors responsible for the problematic applications and request that they update them.

Other possible causes

Note that it’s possible that you could have a PC that encounters symptoms like this (all subprocesses crash) but not a result of the new code integrity check.

  • For instance, Chromium once had an obscure bug in its sandboxing code that caused all sandboxes to crash depending on the random memory mapping of Address Space Layout Randomization.
  • Similarly, Chrome and Edge still have an active bug where all renderers crash on startup if the PC has AppLocker enabled and the browser is launched elevated (as Administrator).

-Eric

Background

Typically, if you want your website to send a document to a client application, you simply send the file as a download. Your server indicates that a file should be treated as a download in one of a few simple ways:

  • Specifying a nonwebby type in the Content-Type response header.
  • Sending a Content-Disposition: attachment; filename=whatever.ext response header.
  • Setting a download attribute on the hyperlink pointing to the file.

These approaches are well-supported across browsers (via headers for decades, via the download attribute anywhere but IE since 2016).

The Trouble with Plain Downloads

However, there’s a downside to traditional downloads — unless the file itself contains the URL from which the download originated, the client application will not typically know where the file originated, which can be a problem for:

  • Security – “Do I trust the source of this file?
  • Functionality – “If the user makes a change to this file, to where should I save changes back?“, and
  • Performance – “If the user already had a copy of this 60mb slide deck, maybe skip downloading it again over our expensive trans-Pacific link?

Maybe AppProtocols?

Rather than sending a file download, a solution developer might instead just invoke a target application using an App Protocol. For instance, the Microsoft Office clients might support a syntax like:

ms-word:ofe|u|https://example.com/docx.docx

…which directs Microsoft Word to download the document from example.com.

However, the AppProtocol approach has a shortcoming– if the user doesn’t happen to have Microsoft Word installed, the protocol handler will fail to launch and either nothing will happen or the user may get a potentially confusing error message. That brokenness will occur even if they happen to have another client (e.g. WordPad) that could handle the document.

DirectInvoke

To address these shortcomings, we need a way to instruct the browser: “Download this file, unless the client’s handler application would prefer to just get its URL.”

While a poorly-documented precursor technology existed as early as the 1990s, Windows 8 reintroduced this feature as DirectInvoke. When a client application registers itself indicating that it supports receiving URLs rather than local filenames, and when the server indicates that it would like to DirectInvoke the application using the X-MS-InvokeApp response header:

DirectInvoke

…then the download stream is aborted and the user is instead presented with a confirmation prompt:

UIPrompt

If the user accepts the prompt, the handler application is launched, passing the URL to the web content.

Now, for certain types, the server doesn’t even need to ask for DirectInvoke behavior via the X-MS-InvokeApp header. The FTA_AlwaysUseDirectInvoke bit can be set in the type’s EditFlags registry value. The bit is documented on MSDN as: 

FTA_AlwaysUseDirectInvoke 0x00400000
Introduced in Windows 8. Ensures that the verbs for the file type are invoked with a URL instead of a downloaded version of the file. Use this flag only if you’ve registered the file type’s verb to support DirectInvoke through the SupportedProtocols or UseUrl registration.

Microsoft’s ClickOnce deployment technology makes use of the FTA_AlwaysUseDirectInvoke flag.

A sample registry script for a type that should always be DirectInvoke’d might look like this:

To test it, first set up the registry, install a handler to C:\Windows, and then click this example link

TraditionalVsDI.png

Caveats

In order for this architecture to work reliably, you need to ensure a few things.

App Should Handle Traditional Files

First, your application needs to have some reasonable experience if the content is handled as a traditional download, as it would be using Chrome or Firefox, or on a non-Windows operating system.

By way of example, it’s usually possible to construct a ClickOnce manifest that works correctly after download. Similarly, Office applications work fine with regular files, although the user must take care to reupload the files after making any edits.

App Should Avoid Depending On Browser State

If your download flow requires a cookie, the client application will not have access to that cookie and the download will fail. The client application probably will not be able to prompt the user to login to otherwise retrieve the file.

If your download flow requires HTTP Authentication or HTTPS Client Certificate Authentication, the client application might work (if it supports NTLM/Negotiate) or it might not (e.g. if the server requires Digest Auth and the client cannot show a credential prompt.

App Should Ensure URL Support

Many client applications have limits in the sorts of URLs that they can support. For instance, the latest version of Microsoft Excel cannot handle a URL longer than 260 characters. If a .xlsx download from SharePoint site attempts to DirectInvoke, Excel will launch and complain that it cannot retrieve the file.

App Should Ensure Network Protocol Support

Similarly, if the client app registers for DirectInvoke of HTTPS URLs, you should ensure that it supports the same protocols as the browser. If a server requires a protocol version (e.g. TLS/1.2) that the client hasn’t yet enabled (say it only enables TLS/1.0), then the download will fail.

Server Must Not Send |Content-Disposition: attachment|

As noted in the documentation, a Content-Disposition: attachment response header takes precedence over DirectInvoke behavior. If a server specifies attachment, DirectInvoke will not be used.

Note: If you wish to use a Content-Disposition header to name the file, you can do so using Content-Disposition: inline; filename=”fuzzle.fuzzle”

Conclusion

As you can see, there’s quite a long list of caveats around using the DirectInvoke WebToApp communication scheme, but it’s still a useful option for some scenarios.

In future posts, I’ll continue to explore some other alternatives for Web-to-App communication.

-Eric

Note: The current version of Edge 79 does not yet support FTA_AlwaysUseDirectInvoke. We expect to fix this in the future.

Note: I have a few test cases.

 

As we’ve been working to replatform the new Microsoft Edge browser atop Chromium, one interesting outcome has been early exposure to a lot more bugs in Chromium. Rapidly root-causing these regressions (bugs in scenarios that used to work correctly) has been a high-priority activity to help ensure Edge users have a good experience with our browser.

Stabilization via Channels

Edge’s code stabilizes as it flows through release channels, from the developer’s-only ToT/HEAD (Tip-of-tree, the latest commit in the source repository) to the Canary Channel build (updated daily) to the Dev Channel updated weekly, to the Beta Channel (updated a few times over its six week lifetime) to our eventual, as-yet-unreleased Stable Channel.

Until recently, Microsoft Edge was only available in Canary and Dev channels, which meant that any bugs landed in Chromium would almost immediately impact almost all users of Edge. Even as we added a Beta channel, we still found users reporting issues that “reproduce in all Edge builds, but not in Chrome.

As it turns out, most of the “not in Chrome” comparisons turn out to mean that the problems are “not repro in Chrome Stable.” And that’s because the either the regressions simply haven’t made it to Stable yet, or because the regressions are hidden behind Feature Flags that are not enabled for Chrome’s stable channel.

A common example of this is LayoutNG, a major update to the layout engine used in Chromium. This redesigned engine is more flexible than its predecessor and allows the layout engineers to more easily add the latest layout features to the browser as new standards are finalized. Unfortunately, changing any major component of the browser is almost certain to lead to regressions, especially in corner cases. Google had enabled LayoutNG by default in the code for Chrome 76 (and Edge picked up the change), but then subsequently used the Feature Flag to disable LayoutNG for the stable channel three days before Chrome 76 shipped. As a result, the new LayoutNG engine is on-by-default for Chrome Beta, Dev, and Canary and Edge Beta, Dev, and Canary.

The key difference is that Edge doesn’t yet have a public stable channel to which any bug-impacted users can retreat. Therefore, reproducing, isolating, and fixing regressions as quickly as possible is important for Edge engineers.

Isolating Regressions

When we receive a report of a bug that reproduces in Microsoft Edge, we follow a set of steps for figuring out what’s going on.

Check Latest Canary

The first step is checking whether it reproduces in our Edge Canary builds.

If not, then it’s likely the bug was already found and fixed. We can either tell our users to sit tight and wait for the next update, or we can search on CRBug.com to figure out when exactly the fix went in and inform the user specifically when a fix will reach them.

Check Upstream

If the problem still reproduces in the latest Edge Canary build, we next try to reproduce the problem in the equivalent build of Chrome or plain-vanilla Chromium.

If Not Repro in Chromium?

If the problem doesn’t not reproduce in Chrome, this implies that the problem was caused by Microsoft’s modifications to the code in our internal branches. Alternatively, it might also be the case that the problem is hidden in upstream Chrome behind an experimental flag, so sometimes we must go spelunking into the browser’s Feature Flag configuration by visiting:

chrome://version/?show-variations-cmd 

The Command-Line Variations section of that page reveals the names of the experiments that are enabled/disabled. Launching the browser with a modified version of the command line enables launching Chrome in a different configuration2.

If the issue really is unique to Edge, we can use git blame and similar techniques on our code to see where we might have caused the problem.

If Repro in Chromium?

If the problem does reproduce in Chrome or Chromium, that’s strong evidence that we’ve inherited the problem from upstream.

Sanity Check: Does it Repro in Firefox?

If the problem isn’t a crashing bug or some other obviously incorrect behavior, I will first check the site’s behavior in the latest Firefox nightly build, on the off chance that the browsers are behaving correctly and the site’s markup or JavaScript is actually incorrect.

Try tweaking common Flags

Depending on the area where the problem occurs, a quick next step is to try toggling Feature Flags that seem relevant on the chrome://flags page. For instance, if the problem is in layout, try setting chrome://flags/#enable-layout-ng to Disabled. If the problem seems to be related to the network, try toggling chrome://flags/#network-service-in-process, and so on.

Understanding whether the problem can be impacted by flags enables us to more quickly find its root cause, and provides us an option to quickly mitigate the problem for our users (by adjusting the flag remotely from our experimental configuration server).

Bisecting Regressions

The gold standard for root-causing regressions is finding the specific commit (aka “change list” aka “pull request” aka “patch”) that introduced the problem. When we determine which commit caused a problem, we not only know exactly what builds the problem affects, we also know what engineer committed the breaking change, and we know what exactly code was changed– often we can quickly spot the bug within the changed files.

Fortunately, Chromium engineers trying to find regressing commits have a super power, known as bisect-builds.py. Using this script is simple: You specify the known-bad Chromium version and a guesstimate of the last known good version. The script then automates the binary search of the builds between the two positions, requiring the user to grade each trial as “Good” or “Bad”, until the introduction of the regression is isolated.

The simplicity of this tool belies its magic— or, at least, the power of the log2 math that underlies it. OmahaProxy informs me that the current Chrome tree contains 692,278 commits. log2(692278) is 19.4, which means that we should be able to isolate any regressing change in history in just 20 trials, taking a few minutes at most1. And it’s rare that you’d want to even try to bisect all of history– between stable and ToT we see ~27250 commits, so we should be able to find any regression within such a range in just 15 trials.

On CrBug.com, regressions awaiting a bisect are tagged with the label Needs-Bisect, and a small set of engineers try to burn down the backlog every day. But running bisects is so easy that it’s usually best to just do it yourself, and Edge engineers are doing so constantly.

One advantage available to Googlers but not the broader Chromium community is the ability to do what’s called a “Per-Revision Bisect.” Inside Google, Chrome builds a new build for every single change (so they can unambiguously flag any change that impacts performance), but not all of these builds are public. Publicly, Google provides the Chromium Continuous Build Archive, an archive of builds that are built approximately every one to fifteen commits. That means that when we do bisects, we often don’t get back a single commit, but instead a regression range. We then must look at all of the commits within that range to figure out which was the culprit. In my experience, this is rarely a problem– most commits in the range obviously have nothing to do with the problem (“Change the icon for some far off feature”), so there are rarely more than one or two obvious suspects.

The Edge team does not currently expose our Edge build archives for bisects, but fortunately almost all of our web platform work is contributed upstream, so bisecting against Chromium is almost always effective.

Recent Example

Last Thursday, we started to get reports of a “missing certificate” problem in Microsoft Edge, whereby the browser wasn’t showing the expected Lock icon for HTTPS pages that didn’t contain any mixed content:

The certificate is missing

While the lock was missing for some users, it was present for others. After also reproducing the issue in Chrome itself, I filed a bug upstream and began investigating.

Back when I worked on the Chrome Security team, we saw a bunch of bugs that manifested like this one that were caused by refactoring in Chrome’s navigation code. Users could hit these bugs in cases where they navigated back/forward rapidly or performed an undo tab close operation, or sites used the HTML5 History API. In all of these cases, the high-level issue is that the page’s certificate is missing in the security_state: either the ssl_status on the NavigationHandle is missing or it contains the wrong information.

This issue, however, didn’t seem to involve navigations, but instead was hit as the site was loaded and thus it called to mind a more recent regression from back in March, where sites that used AppCache were missing their lock icon. That issue involved a major refactoring to use the new Network Service.

One fact that immediately jumped out at me about the sites first reported to hit this new problem is that they all use ServiceWorker (e.g. Hotmail, Gmail, MS Teams, Twitter). Like AppCache, ServiceWorker allows the browser to avoid hitting the network in response to a fetch. As with AppCache, that characteristic means that the the browser must somehow have the “original certificate” for that response from the network so it can set that certificate in the security_state when it’s needed.

But where does that certificate live?

Chromium stores the certificate for a given HTTPS response after the end of the cache entry, so it should be available whenever the cached resource is used3. A quick disk search revealed that ServiceWorker stores its scripts inside the folder:

%localappdata%\Microsoft\Edge SxS\User Data\Default\Service Worker\ScriptCache

Comparing the contents of the cache file on a “good” and a “bad” PC, we see that the certificate information is missing in the cache file for a machine that reproduces the problem:

The serialized certificate chain is present in the “Good” case

So, why is that certificate missing? I didn’t know.

I performed a bisect three times, and each time I ended up with the same range of a dozen commits, only one of which had anything to do with anything to do with caching, and that commit was for AppCache, not ServiceWorker.

More damning for my bisect suspect was the fact that this suspect commit landed (in 78.0.3890) the day after the build (3889) upon which the reproducing Edge build was based. I spent a bunch of time figuring out whether this could be the off-by-one issue in Edge build numbering before convincing myself that no, it couldn’t be: build number misalignment just means that Edge 3889 might not contain everything that’s in Chrome 3889.

Unless an Edge Engineer had cherry-picked the regressing commit into our 3889 (unlikely), the suspect couldn’t be the culprit.

Edge 3889 doesn’t include all of the commits in Chromium 3889.

I posted my research into the bug at 10:39 PM on Friday and forty minutes later a Chrome engineer casually revealed something I didn’t realize: Chrome uses two different codepaths for fetching a ServiceWorker script– a “new_script_loader” and an “updated_script_loader.”

And instantly, everything fell into place. The reason the repro wasn’t perfectly reliable (for both users and my bisect attempts) was that it only happened after a ServiceWorker script updates.

  • If the ServiceWorker in the user’s cache is missing, it is downloaded by the new_script_loader and the certificate is stored.
  • If the ServiceWorker script is present and is unchanged on the server, the lock shows just fine.
  • But if the ServiceWorker in the user’s cache is present but outdated, the updated_script_loader downloads the new script… and omits the certificate chain. The lock icon disappears until the user clears their cache or performs a hard (CTRL+F5) refresh, at which point the lock remains until the next script update.

With this new information in hand, building a reliable reduced repro case was easy– I just ripped out the guts of one of my existing PWAs and configured it so that it updated itself every five seconds. That way, on nearly every load, the cached ServiceWorker would be deemed outdated and the script redownloaded.

With this repro, we can kick off our bisect thusly:

python tools/bisect-builds.py -a win -g 681094 -b 690908 --verify-range --use-local-cache -- --no-first-run --user-data-dir=/temp https://webdbg.com/apps/alwaysoutdated/

… and grade each build based on whether the lock disappears on refresh:

Grading each build based on whether the lock disappears

Within a few minutes, we’ve identified the regression range:

A culprit is found

In this case, the regression range contains just one commit— one that turns on the new ServiceWorker update check code. This confirms the Chromium engineer’s theory and that this problem is almost identical to the prior AppCache bug. In both cases, the problem is that the download request passed kURLLoadOptionNone and that prevented the certificate from being stored in the HttpResponseInfo serialized to the cache file. Changing the flag to kURLLoadOptionSendSSLInfoWithResponse results in the retrieval and storage of the ssl_info, including the certificate.

The fix was quick and straightforward; it will be available in Chrome 78.0.3902 and the following build of Edge based on that Chromium version. Notably, because the bug is caused by failure to put data in a cache file, the lock will remain missing even in later builds until either the ServiceWorker script updates again, or you hard refresh the page once.

-Eric

1 By way of comparison, when I last bisected an issue in Internet Explorer, circa 2012, it was an extraordinarily painful two-day affair.

2 You can use the command line arguments in the variations page (starting at --force-fieldtrials=) to force the bisect builds to use the same variations.

Chromium also has a bisect-variations script which you can use to help narrow down which of the dozens of active experiments is causing a problem.

If all else fails, you can also reset Chrome’s field trial configuration using chrome.exe --reset-variation-state to see if the repro disappears.

3 Aside: Back in the days of Internet Explorer, WinINET had no way to preserve the certificate, so it always bypassed the cache for the first request to a HTTPS server so that the browser process would have a certificate available for its subsequent display needs.


Note: This post is part of a series about Web-to-App Communication techniques.

Just over eight years ago, I wrote my last blog post about App Protocols, a class of URL schemes that typically1 open another program on your computer instead of returning data to the web browser. 

App Protocols2 are both simple and powerful, allowing client app developers to easily enable the invocation of their apps from a website. For instance, ms-screenclip is a simple app protocol built into Windows 10 that kicks off the process of taking a screenshot:

    ms-screenclip:?delayInSeconds=2

When the user invokes this url, the handler waits two seconds, then launches its UI to collect a screenshot. Notably, App Protocols are fire-and-forgetmeaning that the handler has no direct way to return data back to the browser that invoked the protocol.

The power and simplicity of App Protocols comes at a cost. They are the easiest route out of browser sandboxes and are thus terrifying, especially because this exploit vector is stable and available in every browser from legacy IE to the very latest versions of Chrome/Firefox/Edge/Safari.

What’s the Security Risk?

A number of issues make App Protocols especially risky from a security point-of-view.

Careless App Implementation

The primary security problem is that most App Protocols were designed to address a particular scenario (e.g. a “Meet Now” page on a videoconferencing vendor’s website should launch the videoconferencing client) and they were not designed with the expectation that the app could be exposed to potentially dangerous data from the web at large.

We’ve seen apps where the app will silently reconfigure itself (e.g. sending your outbound mail to a different server) based on parameters in the URL it receives. We’ve seen apps where the app will immediately create or delete files without first confirming the irreversible operation with the user. We’ve seen apps that assumed they’d never get more than 255 characters in their URLs and had buffer-overflows leading to Remote Code Execution when that limit was exceeded. The list goes on and on.

Poor API Contract

In most cases3, App Protocols are implemented as a simple mapping between the protocol scheme (e.g. “alert”) and a shell command, e.g. 

AlertProtocol

When the protocol is invoked by the browser, it simply bundles up the URL and passes it on the command line to the target application. The app doesn’t get any information about the caller (e.g. “What browser or app invoked this?“, “What origin invoked this?“, etc) and thus it must make any decisions solely on the basis of the received URL.

Until recently, there was an even bigger problem here, related to the encoding of the URL. Browsers, including Chrome, Edge, and IE, were willing to include bare spaces and quotation marks in the URL argument, meaning that an app could launch with a command line like:

alert.exe "alert:This is an Evil URL" --DoSomethingDangerous --Ignore=This"

The app’s code saw the –DoSomethingDangerous “argument”, failed to recognize it as a part of the URL, and invoked dangerous functionality. This attack led to remote code execution bugs in App Protocol handlers many times over the years. 

Chrome began %-escaping spaces and quotation marks8 back in Chrome 64, and Edge 18 followed suit in Windows 10 RS5.

You can see how your browser behaves using the links on this test page.

Future Opportunity: A richer API contract that allows an App Protocol handler to determine how specifically it was invoked would allow it to better protect itself from unexpected callers. Moving the App Protocol URL data from the command line to somewhere else (e.g. stdin) might help reduce the possibility of parsing errors.

Sandbox

The application that handles the protocol typically runs outside of the browser’s sandbox. This means that a security vulnerability in the app can be exploited to steal or corrupt any data the user can access, install malware, etc. If the browser is running Elevated (at Administrator), any App Protocol handlers it invokes are launched Elevated; this is part of UAC’s design.

Because most apps are written in native code, the result is that most protocol handlers end up in the DOOM! portion of this diagram:

RuleOfTwo

Prompting

In most cases, the only4 thing standing between the user and disaster is a permission prompt.

In Internet Explorer, the prompt looked like this:

IEPermission
As you can see, the dialog provides a bunch of context about what’s happening, including the raw URL, the name of the handler, and a remark that allowing the launch may harm the computer.

Such information is lacking in more modern browsers, including Firefox:

FirefoxPermission

…and Edge/Chrome:

ChromePermission

Browser UI designers (reasonably) argue that the vast majority of users are poorly equipped to make trust decisions, even with the information in the IE prompt, and so modern UI has been greatly simplified5

Unfortunately, lost to evolution is the crucial indication to the user that they’re even making a trust decision at all.

Eliminating Prompts

Making matters more dangerous, everyone hates security prompts that interrupt non-malicious scenarios. A common user request can be summarized as “Prompt me only when something bad is going to happen. In fact, in those cases, don’t even prompt me, just protect me.

Unfortunately, the browser cannot know whether a given App Protocol invocation is good or evil, so it delegates control of prompting in two ways:

In Internet Explorer and Edge (version<=18), the browser respects a per-protocol WarnOnOpen boolean in the registry, such that the App itself may tell the browser: “No worries, let anyone launch me, no prompt needed.

In Firefox, Chrome, and Edge (version >= 76), the browser instead allows the user to suppress further prompts with a “Remember my choice” checkbox. If the user selects this option, the protocol will silently launch in the future without the browser first asking permission.

Notably Edge/Chrome version 77.0.3864 removed the “Always open these types of links in the associated app” checkbox. The stated reason is found in Chrome issue #982341:

No obvious way to undo “Always open these types of links” decision for External Protocols.

We realized in a conversation around issue 951540 that we don’t have settings UI
that allows users to reconsider decisions they’ve made around external protocol
support. Until we work that out, and make longer-term decisions about the
permission model around the feature generally, we should stop making the problem
worse by removing that checkbox from the UI.

A user who had ticked the “Always open” box has no way to later review that decision6, and no obvious way to reverse it. Almost no one figured out that using the “Clear Browsing Data > Cookies and other site data” dialog box option directs Chrome to delete all previous “Always open” decisions for the user’s profile. 

Particularly confusing is that the “Always open” decision wasn’t made on a per-site basis– it applies to every site visited by the user in that browser profile.

UpdateAn upcoming Enterprise policy will allow administrators to restore the checkbox.

Future Opportunity: Much of the risk inherent in open-without-prompting behavior comes from the site that any random site (http://evil.example.com) can abuse ambient permission to launch the protocol handler. If browsers changed the option to “Always allow this site to open this protocol”, the risk would be significantly reduced, and a user could reasonably safely allow, e.g. https://teams.microsoft.com to open the msteams protocol without a prompt.

Alternatively, perhaps the Registry-based provisioning of a protocol handler should explicitly list the sites allowed to launch the protocol, akin to the SiteLock protection for legacy ActiveX controls.

For some schemes7 , Chrome will not even show a prompt because the protocol is included on a built-in allow or deny list.

Some security folks have argued that browsers should not provide any mechanism for skipping the permission prompt. Unfortunately, there’s evidence to suggest that such a firm stance might result in vendors avoiding the prompt by choosing even riskier architectures for Web-to-App communication. More on this in a future post.

Zero-Day Defense

Even when a zero day vulnerability in an App Protocol handler is getting exploited in the wild (e.g. this one), browsers have few defenses available to protect users. 

Unlike, say, file downloads, where the browser has multiple mechanisms to protect users against newly-discovered threats (e.g. file type policies and SmartScreen/SafeBrowsing), browsers do not presently have rapid update mechanisms that can block access to known vulnerable App Protocol handlers.

Future Opportunity: Use SafeBrowsing/SmartScreen or a file-type-policies style Component Update to supply the client with a list of known-vulnerable protocol handlers. If a page attempts to invoke such a protocol, either block it entirely or strongly caution the user.

To improve the experience even further, the blocklist could contain version information such that blocking/additional warnings would only be shown if the version of the handler app is earlier than the version number of the app containing the fix. 

Antivirus programs typically do monitor all calls to ShellExecute and could conceivably protect against malicious invocation of app protocol handlers, but I am not aware of any having ever done so.

Privacy Concerns Prevent Protocol Detection

One of the most common challenges for developers who want to use App Protocols for Web-to-App communication is that the web platform does not expose the list of available protocol handlers to JavaScript. This is primarily a privacy consideration: exposing the list of protocol handlers to the web would expose a significant amount of fingerprintable entropy and might even reveal things about the user’s interests and beliefs (e.g. a ConservativeNews App or a LGBTQ App might expose a protocol handler for app-to-app communication).

Internet Explorer and Edge <= 18 supply a non-standard JavaScript function msLaunchUri that allows a web page to detect that a user didn’t have a protocol handler installed, but this function is not available in other browsers.

UX When a Protocol Isn’t Installed

Browser behavior varies if the user attempts to invoke a link with a scheme for which no protocol handler is registered.

Firefox shows an error page: FirefoxNotInstalled

On Windows 8 and later, IE and Edge<=18 show a prompt that offers to take the user to the Microsoft store to search for a protocol handler for the target scheme:

Win10NotInstalled

Unfortunately, this search is rarely fruitful because most apps are not available in the Microsoft Store.

Interestingly, Chrome and Edge76+ show nothing at all when attempting to invoke a link for which no protocol handler is installed. Surprisingly, there’s no notice even in the Developer Tools console; a particularly thorough debugger will only see a “(canceled)” request in the DevTool’s Network tab.

Upcoming change – Require HTTPS to Invoke

Chrome is looking at requiring that a page be served over HTTPS in order for it to invoke an application protocol.

 

In future posts, I’ll explore some other alternatives for Web-to-App communication.

-Eric


Notes

1 In some browsers, it’s possible to register web-based handlers for “AppProtocols” (e.g. maps: and mailto: might go to Google Maps and GMail respectively), but this is relatively uncommon.

2 Within Chromium, App Protocols are called “External Protocols.”

3 There are other ways to handle protocols, including COM and the Windows 10 App Model’s URI Activation mechanism but these are uncommon.

4 As an anti-abuse mechanism, the browser may require a user-gesture (e.g. a mouse click) before attempting to launch an App Protocol, and may throttle invocations to avoid spamming the user with an infinite stream of prompts.

5 Chrome’s prompt used to look much like IE’s.

6 Short of opening the Preferences for the profile in Notepad or another text editor. E.g. after choosing “Always open” for Microsoft Teams and Skype for Business, the JSON file %localappdata%\Microsoft\Edge SxS\user Data\default\preferences contains

“protocol_handler”:{“excluded_schemes”:{“msteams”:false, “ms-sfb”: false}}

To see the list in IE/Edge<=18, you can run a registry query to find protocols with WarnOnOpen set to 0:

reg query "HKCU\SOFTWARE\Microsoft\Internet Explorer\ProtocolExecute" /s
reg query "HKLM\SOFTWARE\Microsoft\Internet Explorer\ProtocolExecute" /s


7 Hardcoded schemes:

kDeniedSchemes[] = {“afp”,”data”,”disk”,”disks”,”file”,”hcp”,”ie.http”,”javascript”,”ms-help”,”nntp”,”res”,”shell”,”vbscript”,”view-source”,”vnd.ms.radio”}
kAllowedSchemes[] = {“mailto”, “news”, “snews”};

The EscapeExternalHandlerValue function:

// Escapes characters in text suitable for use as an external protocol handler command. // We %XX everything except alphanumerics and -_.!~*'() and the restricted // characters (;/?:@&=+$,#[]) and a valid percent escape sequence (%XX). EscapeExternalHandlerValue()