Improving Privacy by Limiting Referrers

As your browser navigates from page to page, servers are informed of the URL from where you’ve come from using the Referer HTTP header1; the document.referrer DOM property reveals the same information to JavaScript.

Similarly, as the browser downloads the resources (images, styles, JavaScript) within webpages, the Referer header on the request allows the resource’s server to determine which page is requesting the resource.

The Referrer is omitted in some cases, including:

  • When the user navigates via some mechanism other than a link in the page (e.g. choosing a bookmark or using the address box)
  • When navigating from HTTPS pages to HTTP pages
  • When navigating from a resource served by a protocol other than HTTP(S)
  • When the page opts-out (details in a moment)

Usefulness

The Referrer mechanism can be very useful, because it helps a site owner understand from where their traffic is originating. For instance, WordPress automatically generates this dashboard which shows me where my blog gets its visitors:

BlogStats

I can see not only which Search Engines send me the most users, but also which specific posts on Reddit are driving traffic my way.

Privacy Implications

Unfortunately, this default behavior has a significant impact on privacy, because it can potentially leak private and important information.

Imagine, for example, that you’re reviewing a document your mergers and acquisitions department has authored, with the URL https://contoso.com/​Q4/PotentialAcquisitionTargetsUpTo5M.docx. Within that document, there might have a link to https://fabrikam.com​/financialdisclosures.htm. If you were to click that link, the navigation request to Fabrikam’s server will contain the full URL of the document that led you there, potentially revealing information that your firm would’ve preferred to keep quiet.

Similarly, your search queries might contain something you don’t mind Bing knowing (“Am I required to disclose a disease before signing up for HumongousInsurance.com?”) but that you didn’t want to immediately reveal to the site where you’re looking for answers.

If your web-based email reader puts your email address in the URL, or includes the subject of the current email, links you click in that email might be leaking information you wish to keep private.

The list goes on and on.

Referrer Policy

Websites have always had ways to avoid leaking information to navigation targets, usually involving nonstandard navigation mechanisms (e.g. meta refresh) or by wrapping all links so that they go through an innocuous page (e.g. https://example.net/offsitelink.aspx).

However, these mechanisms were non-standard, cumbersome, and would not control the referrer information sent when downloading resources embedded in pages. To address these limitations, Referrer Policy was developed and implemented by most browsers2.

CanIUseRP

Referrer Policy allows a website to control what information is sent in Referer headers and exposed to the document.referrer property. As noted in the spec, the policy can be specified in several ways:

  • Via the Referrer-Policy HTTP response header.
  • Via a meta element with a name of referrer.
  • Via a referrerpolicy content attribute on an aareaimgiframe, or link element.
  • Via the noreferrer link relation on an aarea, or link element.
  • Implicitly, via inheritance.

The policy can be any of the following:

  • no-referrer – Do not send a Referer.
  • unsafe-url – Send the full URL (lacking only auth info and fragment), even on navigations from HTTPS to HTTP.
  • no-referrer-when-downgrade – Don’t send the Referer when navigating from HTTPS to HTTP. [The longstanding default behavior of browsers.]
  • strict-origin-when-cross-origin – For a same-origin navigation, send the URL. For a cross-origin navigation, send only the Origin of the referring page. Send nothing when navigating from HTTPS to HTTP. [Spoiler alert: The new default.]
  • origin-when-cross-origin For a same-origin navigation, send the URL. For a cross-origin navigation, send only the Origin of the referring page. Send the Referer even when navigating from HTTPS to HTTP.
  • same-origin – Send the Referer only for same-origin navigations.
  • origin – Send only the Origin of the referring page.
  • strict-origin – Send only the Origin of the referring page; send nothing when navigating from HTTPS to HTTP.
  • empty string – Inherit, or use the default

As you can see, there are quite a few policies. That’s partly due to the strict- variations which prevent leaking even the origin information on HTTPS->HTTP navigations.

Improving Defaults

With this background out of the way, the Chromium team has announced that they plan to change the default Referrer Policy from no-referrer-when-downgrade to strict-origin-when-cross-origin. This means that cross-origin navigations will no longer reveal path or query string information, significantly reducing the possibility of unexpected leaks.

As with other big privacy changes, this change is slated to ship in v80, the code has been in for five years and you can enable it in Chrome 78+ and Edge 78+:

  1. Visit chrome://flags/#reduced-referrer-granularity
  2. Set the feature to Enabled
  3. Restart your browser

Flag

I’ve published a few toy test cases for playing with Referrer Policy here.

As noted in their Intent To Implement, the Chrome team are not the first to make changes here. As of Firefox 70 (Oct 2019), the default referrer policy is set to strict-origin-when-cross-origin, but only for requests to known-tracking domains, OR while in Private mode. In Safari ITP, all cross-site HTTP referrers and all cross-site document.referrers are downgraded to origin. Brave forges the Referer (sending the Origin of the target, not the source) when loading of cross-origin resources.

Understand the Limits

Note that this new default is “opt-out”– a page can still choose to send unrestricted referral URLs if it chooses. As an author, I selfishly hope that sites like Reddit and Hacker News might do so.

Also note that this new default does not in any way limit JavaScript’s access to the current page‘s URL. If your page at https://contoso.com/SuperSecretDoc.aspx includes a tracking script:

script

… the HTTPS request for track.js will send Referer: https://contoso.com/, but when the script runs, it will have access to the full URL of its execution context (https://contoso.com/SuperSecretDoc.aspx) via the window.location.href property.

Test Your Sites

If you’re a web developer, you should test your sites in this new configuration and update them if anything is unexpectedly broken. If you want the browser to behave as it used to, you can use any of the policy-specification mechanisms to request no-referrer-when-downgrade behavior for either an entire page or individual links.

Or, you might pick an even stricter policy (e.g. same-origin) if you want to prevent even the origin information from leaking out on a cross-site basis. You might consider using this on your Intranet, for instance, to help prevent the hostnames of your Intranet servers from being sent out to public Internet sites.

Stay private out there!

-Eric

1 The misspelling of the HTTP header name is a historical accident which was never corrected.

2 Notably, Safari, IE11, and versions of Edge 18 and below only supported an older draft of the Referrer policy spec, with tokens never (matching no-referrer), always (matching unsafe-url), origin (unchanged) and default (matching no-referrer-when-downgrade). Edge 18 supported origin-when-cross-origin, but only for resource subdownloads.

2 thoughts on “Improving Privacy by Limiting Referrers

  1. By origin I assume the domain (and sub domain?) will be kept, in most cases this should be fine for tracking even from reddit and such. You do loose info on which reddit article is the most popular origin url though. But as you said, reddit may add exceptions for article pages (or rather urls within an article). And wordpress can easily add it as a optional toggle in their admin interface.

    1. Yes, an origin is the Scheme+FullyQualifiedDomainName. But this isn’t going to tell me where on Reddit mentioned my article, so I’d have to go there to search for it instead of getting an easy way to go see.

      By way of example, a recent blog post cited my SameSite Cookies article, and when I went to read that blog, I found it significantly mischaracterized things. Because the full Referer was present, I could go talk to the author and clarify things.

Leave a Reply

Fill in your details below or click an icon to log in:

WordPress.com Logo

You are commenting using your WordPress.com account. Log Out /  Change )

Google photo

You are commenting using your Google account. Log Out /  Change )

Twitter picture

You are commenting using your Twitter account. Log Out /  Change )

Facebook photo

You are commenting using your Facebook account. Log Out /  Change )

Connecting to %s