MoarTLS: Non-Secure Download Blocking

With little fanfare, an important security change has arrived on the web. Now, all major browsers (except Safari) block non-secure downloads from a secure page.

Browser VersionBehavior
Edge 94+Block with right-click “Keep” button
Chrome 94Block Silently
FirefoxBlock with “Allow download” button
Brave 1.30.89Block Silently
Opera 79.0.4143.72Block Silently
Safari 15Allow
Vivaldi 4.3.2439.44Allow
Major Browser Behavior

You can test your browser’s behavior with this test page. In Edge 94, the block looks like this:

By right-clicking on the “can’t be downloaded securely” item, you can choose to continue the download anyway.

Firefox offers a very similar user-experience, although somewhat confusingly, they prompt for permission to save the file before blocking it:

Firefox 93 Blocking UX

The Chromium team started rolling out this protection last year. Over time, the Chrome block turned into a silent block (arguably confusing for users) where the only indication of an attempted/blocked download is a notice in the Developer Tools console:

End-User Override

Within Chrome or Edge, a user may use the Permissions UI to enable a secure site to download non-secure resources without blocking.

IT Administrator Override

The InsecureContentAllowedForUrls policy allows an IT administrator to exempt site from mixed content blocking. List the origins that are allowed to request non-secure content/downloads (list the source page’s origin, not the target resource’s origin):

InsecureContentallowedForUrls Policy

Discovering non-secure links

My MoarTLS browser extension makes it simple for you to see whether any of the links (including download links) on your page are non-secure:

… however, note that this tool only flags links that are directly non-secure– if the link goes to HTTPS but then subsequently redirects to (or through) HTTP, the tool won’t notice, but the browser blocker will.

The fact that the browser blocks the download if any URL used in a download’s source redirect chain is non-secure can lead to confusing UI whenever only a single URL is shown to the user. For instance, this download was blocked because the source page referred to HTTP but the request was subsequently redirected to the HTTPS URL shown:

Fixing HTTP Links

The first step in avoiding mixed content download blocking is to ensure that all of your resources are available over HTTPS; if a download isn’t available over HTTPS, updating the source page’s download link’s url to point to https isn’t going to work.

The second step to avoiding blocking is to change all of the download links from HTTP to HTTPS.

Unfortunately, this might be much easier said than done– you might have hundreds of pages with hundreds of links.

What to do?

One approach is to use automation to rewrite links, either as a one-time job, or as a dynamic rewrite. When I was first building my test page above, I couldn’t figure out why it wasn’t working. It took a good 15 minutes to realize that I’d configured Cloudflare to automatically rewrite HTTP links to HTTPS. (In the Cloudflare Control panel, select SSL/TLS > Edge Certificates and enable Automatic HTTPS Rewrites.)

Content-Security-Policy offers an Upgrade-Insecure-Requests (UIR) directive that upgrades all of a page’s embedded resource URLs from HTTP to HTTPS. This is a great approach for fixing mixed content bugs without doing a lot of work on every page. Unfortunately, file downloads are typically treated as “Navigation requests”, which means that a UIR rule on https://example.com will upgrade http://example.com/dl/somefile but UIR will not upgrade https://othersite.example.com/dl/somefile because it is not same-origin to the original page. Bummer.

You might hope that just putting your download site on the Strict-Transport-Security (HSTS) pre-load list might fix things because doing so ensures that your site is always contacted over HTTPS. Unfortunately, for historical reasons, HSTS is evaluated after mixed content blocking and so this approach does not work. But the Chromium team is considering whether blocking should be bypassed if non-secure requests in the download flow were upgraded to HTTPS via HSTS or the browser’s “Always use HTTPS” setting such that every URL that actually hit the network was secure.

Browser Bugs

Today, if you try to close the browser without explicitly aborting the blocked download, Edge complains at you. I’ve filed feedback.

Accessibility (UIA) Troubleshooting

Chromium-based browsers offer a number of accessibility-related features. When you visit about:accessibility, you can see more about the state of these features (similarly, you can find the states in about:histograms/Accessibility.ModeFlag). You can enable features via the Accessibility page, or by You can also passing the command line argument --force-renderer-accessibility into the browser.

In some cases, you may be surprised to find some of the accessibility features enabled even when you have not manually enabled them:

This can happen when the browser detects interactions from a UI Automation tool; such API calls are mostly from accessibility tools like screen-readers. However, some features like Windows 10’s Text Cursor Indicator:

…are implemented using UIA, and when this feature is enabled, the browser enables the corresponding accessibility features.

Back in the spring, some improvements to the Accessibility code caused a series of regressions that would result in crashes, hangs, and memory exhaustion when a loading many pages, including YouTube:

These regressions were impactful for many customers who didn’t expect to be running the impacted code. Fortunately, the problems were quickly fixed.

Unfortunately for end-users, tracking down how Accessibility features got enabled on their browsers is presently used to be non-trivial. (Update: read on)

For Microsoft Edge users running Windows 10 version 20H1 or later, visiting about:histograms/UIA will show a truncated hash of the process name of the UIA client:

The value shown is the Integer representation of the first four bytes of the SHA-1 Hash of the process name. Some common values include:

Truncated HashProcess Name
612857738EoAExperiences.exe (Win10 Text Cursor Indicator feature)
1759000766TextExpander.exe
319076627Narrator.exe
427450884Snagit32.exe
592588840Magnify.exe
3897604127magnify.exe
494880639nvda.exe

Unfortunately, there’s no easy way to go from a truncated hash back to the original string; hashes are one-way. (The only way to do it is brute force — start with a list of possible strings and hash them all to find a match).

A simple PowerShell script mostly written by Artem Pronichkin allows you to get the hashes of all of your running processes, which you can then compare to the reported value:

In the future, we hope to streamline this experience somewhat, but for now, it’s an annoyingly geeky scavenger hunt.

Update: In Edge 93+, the edge://accessibility page includes a simple “Show Client Info” button. Easy peasy.

Practical Time Machines

Many “emergency” situations in our modern world would’ve been easy to fix had they been foreseen in advance. If only we’d known what was going to happen, the badness could’ve easily been prevented.

Unfortunately, when problems are discovered only “as they happen” in production, everyone must race to minimize the damage and put out the fires, dramatically increasing the cost and fallout of otherwise-trivial problems.

Turn your Emergencies into WorkItems

Background: You Are A Time Traveler

Yes, you, dear reader. You travel through time. A second has passed since you started reading this paragraph. You are never getting that second back (sorry!).

For all of its complexities, time (or at least our perception of it) is the most reliable thing in the universe. With every passing second, the universe ticks forward in time.

So, now we’ve established that time travel is not only possible, but guaranteed. Unfortunately, this sort of time travel isn’t very useful for our purposes. It’s not useful because everyone around us is also traveling through time at the same rate. If we’re doing a great job of paying attention, perhaps we’ll notice a problem before everyone else and maybe we’ll be able to fix it before anyone else notices, but this is still the definition of an emergency– we’re in a race against the clock. It’s exhausting, and we’ll usually lose.

If we had a way to send messages back to our past, perhaps disasters could be averted or great fortunes could be won. Alas, time is an arrow, and that arrow only points to the future. Unfortunately, changing the past is out.

So, we’ll need to change the present based on what happens in the future.

If only we had some sort of magical time machine to let us see what was going to go wrong before it actually does. Seems like an overused trope for TV shows and so many of my favorite movies, right?

What would you say if I told you that you’re surrounded by working time machines, you just have to use them?

The Future is Different

Before I tell you how to use or build time machines that let you glimpse the future, let’s restate our goal– we want to prevent future emergencies by discovering and fixing them in advance.

A key reason we’re so often surprised by emergency problems is that we expect the tools we use successfully today to continue to work tomorrow. This is a fundamental mistake.

It’s probably not true that Einstein said “The definition of insanity is doing the same thing over and over again, but expecting different results,” but it’s absolutely true that I have saidThe definition of insanity is doing the same thing over and over again, and expecting the same results.

That’s because we can’t ever truly “do the same thing” over and over again: before we were doing it then, but when we do it again, we’re doing it now. We’re in a changed universe.

The only constant is change, and the present is simply different than than the past; similarly, the future will be different than the present in innumerable ways. Many of those differences will be unpredictable, but fortunately some are predictable at high confidence.

For example, our software and services in the future will:

  1. Always be executed at a later date and time
  2. Probably bear a higher version number
  3. Probably contain many new features and bugfixes

Given this knowledge, many of our software emergencies are entirely predictable:

  • Certificates, access tokens, cookies, caches, and time bombs will expire
  • Version parsers will get confused by or reject unexpectedly large numbers
  • Workarounds, hacks, and quirks will stop working

Computers have a rather primitive understanding of the real Universe; they largely believe whatever inputs we give them. By controlling the inputs, we control their Universe.

Practical Time Machine – The Clock

Your computer can travel to the future even more easily than a DeLorean. The UI isn’t quite as cool, but it doesn’t require plutonium. Simply open your system control panel, turn off the “Set time automatically” option, and click the button to adjust the system date/time to your target.

Almost immediately, all hell will break loose. Microsoft Teams and most other applications will go “offline.”

Webpages will stop loading:

Signed programs might start getting reported as “Unknown publisher”:

…and on and on.

Now, as a time machine, this one’s a bit buggy. It shows what is likely to happen in the future if nothing else changes. But some things will change. Most web servers are now using the ACME protocol to automatically renew certificates shortly in advance of their expiration, so setting your clock five years into the future is likely to turn up a lot of false positives. However, setting your clock five days into the future is probably reasonable – you’ll get enough advance warning to figure out why a certificate is expiring, what the implications are, and deploy a new certificate.

Automated renewal doesn’t solve all problems for certificates though– for instance, it doesn’t fix up certificates embedded in client software, and it typically only applies to the end-entity certificate, not intermediates. Back in 2019, Mozilla accidentally let a certificate expire, breaking all extensions. This week, an intermediate certificate used by LetsEncrypt expired, leading to a trail of broken apps, sites, and devices.

In the final screenshot above, Windows shows the “Unknown publisher” warning because the software’s signature lacks a timestamp.

Without an Authenticode timestamp, when the signer’s certificate expires, all of the signatures from that certificate are deemed invalid. Good luck replacing every binary on every user’s system.

Beyond certificates expiring, you’re likely to notice other issues as well– trial software will expire, Strict-Transport-Security preloads will stop working, etc. For each of these issues, you will need to investigate and decide to take action.

Note: Depending on what you’re trying to do, you might not need to change the entire system clock; you might be able to change the clock for just one app.

Practical Time Machine – Version Lies

One of the most common predictable emergencies is when a scenario begins to fail because a software’s version number gets “too high.”

This happened so commonly in Windows client software that Microsoft changed the GetVersionEx function to start lying to applications unless they declare (via manifest) that they can handle the truth. Similarly, Windows 11 will advertise itself as version 10.0 to websites.

The Chromium team, anticipating that Chrome’s upcoming release of 100 is likely to break websites, built-in a simple flag to allow folks to test with the new 100 version number in the browser’s User-Agent string. Here’s an example of a problem (actually two) found when using the new “Force Version 100” flag:

As you can see, the website incorrectly parses the User-Agent string and concludes that Chrome is outdated, pushing the user to install another browser using icons circa 2009. And closer to home, my Show Chrome Version extension truncates the string 100 to 10 when rendering the information in the toolbar button. Oops.

Even if your browser vendor doesn’t offer a simple “Fake the future” flag, you can typically override the User-Agent string via a command-line argument (--user-agent), DevTools Command (Network Conditions), extension, or simple FiddlerScript rule.

Practical Time Machine – Channels

Most state-of-the-art software these days is available in multiple channels— a current version (commonly called “Release” or “Stable”) and future versions (often called “Pre-Release”, “Beta”, “Alpha”, “Dev”, “Canary”, “Preview”, or “Nightly”).

All major web browsers, in particular, follow this paradigm.

Using Pre-Release versions of browsers is, by far, the best way to learn what changes are coming in the future that might break your sites and services. There’s no higher-fidelity way (short of that plutonium-powered DeLorean) to find out what bugfixes, new features, and regressions (gulp) are headed for your user-base than to simply try the code that everyone will be running in a few weeks or months.

This option is convenient, low-friction, and is highly recommended for developers, IT, and a small “ring” of every Enterprise’s everyday users.

I cannot count how many times we (Chromium or Microsoft) got feedback from a Beta/Dev/Canary user of the browser that we used to fix a problem before it broke anyone in the Stable channel. By way of example– two days before Chrome 88 branched for Stable release, an Enterprise reported that their app was broken in Beta. We quickly investigated and isolated the regression in handling of XSLTs. We fixed it within a day. Unfortunately, a different department in the same Enterprise did not deploy a Beta ring, and a regression in their workflow wasn’t discovered until 88 Stable released, impacting everyone in the department.

Q: How do I “switch channels” for the new Microsoft Edge browser?
A: In addition to Stable, there are 3  channels: Beta, Dev, and Canary, and you can install them all on the same device at the same time, and choose which you want to run at any time. They are all kept up to date automatically. Any of them can be set as your default browser. Also, if you enable sync, then your data will sync between installed channels on the same device (provided you log in to each with the same ID). To get all the channels, go to microsoftedgeinsider.com/download.

What about Extended Stable?

Google Chrome and Microsoft Edge recently introduced support for an Extended Stable version, which is a bit like a time-machine that keeps your users “stuck in the past”, allowing them to run an (even numbered) “Extended Stable” release for up to four weeks beyond the release date of the subsequent (odd numbered) “Stable” channel. This is an interesting variant of the time machine approach– you’re hoping that the extra four weeks will allow any problems in the “Stable” release to be found and fixed before upgrading to the next release. But it’s not a panacea. Say you’ve got your users on Edge 94 Extended Stable. They skip Edge 95 entirely, and four weeks later are upgraded to Edge 96 Extended Stable. While any problems in Edge 95 may have been mitigated before your users update to Edge 96, only problems found in the pre-release channels of Edge 96 will get fixed before everyone moves up to 96.

Read Dispatches from the Future

Beyond running pre-release channels yourself, you can also take advantage of published documentation about what changes are coming in the future. Ranging from Edge’s Site Impacting Changes list to our Roadmap to the Chrome Platform Status page, you can see lists of top changes that are heading to Stable in the future. While documentation is useful, I still strongly recommend running Beta/Dev channels in your environment– our documentation covers the top changes we made on purpose— it doesn’t cover accidental bugs or the thousands of other changes made to each release that we didn’t expect to break anyone.

Some Enterprises have asked “Can’t you just give us the full changelist for each release?!?” and I’ve been put in the unfortunate and dangerous position of saying “I know enough about you and your business to know what you’re asking for isn’t what you really want.”

The list changes in each new Stable release would run to hundreds of pages. A change like “Enable TableNG“, which completely replaced the table-layout engine, would be both extremely difficult to describe, and you as an IT Manager would be poorly equipped to understand whether it would impact your sites — the only practical approach is to test your sites and apps in a pre-release change.

I promise.

There are other ways to discover what will happen in the future. Run your browser with the F12 Developer Tools open and watch the Console and Issues tabs for upcoming changes and deprecation notices:

Before deploying new features like Content-Security-Policy on your website, use the report_only mode to collect issue reports from real-world users, before enabling enforcement of those policies.

In many cases, sites quietly broadcast when something will break– for instance, a server’s certificate contains its expiration date. You can write rules in FiddlerScript to warn you on every HTTPS response whose certificate is soon to expire.

Calls To Action

  • If you’re responsible for using or deploying software, use the time machines at your disposal.
  • If you’re responsible for designing or building software, build new time machines for everyone to use.
  • If you’ve read this far, use the comments section below to remind me of real-world time machines I’ve forgotten to mention.

You’re a time traveler. Act like one!

-Eric

Determining OS Platform Version

In general, you should not care what Operating System visitors are using to visit your website. If you attempt to be clever, you will often get it wrong and cause problems that are an annoyance for users and a hassle for me to debug.

So avoid trying to be nosy/clever if at all possible.

That being said, some websites want to be able to distinguish Windows versions for whatever reason; I’m an engineer, not a cop.

The typical path to get this information from Windows-based browsers is to look at the dreaded User-Agent string, a wretched hive of lies, scum, and villainy.

However, this approach falls down with Windows 11, which reports itself as Windows NT 10.0 (almost certainly for compatibility reasons); in general, browsers are moving toward freezing the User-Agent string to limit passive fingerprinting. The modern mechanism for learning more information about the client is called Client Hints.

If you want to be able to distinguish between Windows 10 and Windows 11, starting around October 2021 in Microsoft Edge 95 (and v95 of most Chromium browsers), the Sec-Ch-UA-Platform-Version Client Hint is the way to go.

Available for request in v95, this value will indicate the UniversalApiContract “API Level” of the Windows platform. The mapping of values to version is

Windows VersionSec-CH-UA-Platform-Version
Windows 7 | 8 | 8.10.0.0
Windows 10 15071.0.0
Windows 10 15112.0.0
Windows 10 16073.0.0
Windows 10 17034.0.0
Windows 10 17095.0.0
Windows 10 18036.0.0
Windows 10 18097.0.0
Windows 10 1903 | 19098.0.0
Windows 10 2004 | 20H2 | 21H110.0.0
Windows 11 Previews13.0.0 | 14.0.0
Windows 11 Release15.0.0

You can request this Client Hint from JavaScript thusly:

navigator.userAgentData.getHighEntropyValues(['platformVersion'])
.then(uapv => { console.log(uapv.platformVersion); });

Note that the userAgentData API requires that your page be a secure context (Served by ~https or on localhost).

Or you can request it be sent to your server as a HTTP request header on subsequent requests by sending a response header:

Response.AddHeader("Accept-CH", "Sec-CH-UA-Platform-Version");
Response.AddHeader("Accept-CH-Lifetime", "86400");

As you may have noticed, the 0.0.0 value is shared across all pre-Windows 10 versions of Windows, so you’ll need to continue to use the User-Agent string to distinguish between Windows 7, 8.0, and 8.1 platforms.

As far as I know, Firefox does not plan to implement this Client Hint.

Bonus: Platform Architecture and Bitness

Distributors of native applications often would like to know the bitness and architecure of the client platform to ensure that they serve the correct installer for their native application. The sec-ch-ua-bitness and sec-ch-ua-arch hints are useful for this purpose.

Unfortunately, there’s presently a bug in Chromium whereby ARM64 devices like the Surface Pro X will return x86 as the architecture.

The Edge-proprietary window.external.getHostEnvironmentValue() API returns the truth {"os-architecture":"ARM64"} but this API is non-standard and you should not use it.

-Eric

PS: Native client apps should generally not check the registry key directly, instead it should query the appropriate “Is this API level supported” API. HKLM\SOFTWARE\Microsoft\WindowsRuntime\WellKnownContracts\Windows.Foundation.UniversalApiContract

Leaky Abstractions

In the late 1990s, the Windows Shell and Internet Explorer teams introduced a bunch of brilliant and intricate designs that allowed extension of the shell and the browser to handle scenarios beyond what those built by Microsoft itself. For instance, Internet Explorer supported the notion of pluggable protocols (“What if some protocol, say, FTPS, becomes as important as HTTP?”) and the Windows Shell offered an extremely flexible set of abstractions for browsing of namespaces, enabling third parties to build browsable “folders” not backed by the file system– everything from WebDAV (“your HTTP-server is a folder“) to CAB Folders (“your CAB archive is a folder“). As a PM on the clipart team in 2004, after I built a .NET-based application to browse clipart from the Office web services, I next sketched out an initial design for a Windows Shell extension that would make it look like Microsoft’s enormous web-based clipart archive were installed in a local folder on your system.

Perhaps the most popular (or infamous) example of a shell namespace extension is the Compressed Folders extension, which handles the exploration of ZIP files. First introduced in the Windows 98 Plus Pack and later included with Windows Me+ directly, Compressed Folders allows billions of Windows users to interact with ZIP files without downloading third-party software. Perhaps surprisingly, the feature was itself was acquired from two third-parties — Microsoft acquired the Explorer integration from Dave Plummer’s “side project”, while a company called InnerMedia claims credit for the “DynaZIP” engine underneath.

Unfortunately, the code hasn’t really been updated in a while. A long while. The timestamp in the module claims it was last updated on Valentine’s Day 1998, and while I suspect there may’ve been a fix here or there since then (and one feature, extract-only Unicode filename support), it’s no secret that the code is, as Raymond Chen says: “stuck at the turn of the century.” That means that it doesn’t support “modern” features like AES encryption, and its performance (runtime, compression ratio) is known to be dramatically inferior to modern 3rd-party implementations.

So, why hasn’t it been updated? Well, “if it aint broke, don’t fix it” accounts for part of the thinking– the ZIP Folders implementation has survived in Windows for 23 years without the howling of customers becoming unbearable, so there’s some evidence that users are happy enough.

Unfortunately, there are degenerate cases where the ZIP Folders support really is broken. I ran across one of those yesterday. I had seen an interesting Twitter thread about hex editors that offer annotation (useful for exploring file formats) and decided to try a few out (I decided I like ReHex best). But in the process, I downloaded the portable version of ImHex and tried to move it to my Tools folder.

I did so by double-clicking the 11.5mb ZIP to open it. I then hit CTRL+A to select all of the files within, then crucially (spoiler alert) CTRL+X to cut the files to my clipboard.

I then created a new subfolder in my C:\Tools folder and hit CTRL+V to paste. And here’s where everything went off the rails– Windows spent well over a minute showing “Calculating…” with no visible progress beyond the creation of a single subfolder with a single 5k file within:

Huh? I knew that the ZIP engine beneath ZIP Folders wasn’t well-optimized, but I’d never seen anything this bad before. After waiting a few more minutes, another file extracted, this one 6.5 mb:

This is bananas. I opened Task Manager, but nothing seemed to be using up much of my 12 thread CPU, my 64gb of memory, or my NVMe SSD. Finally, I opened up SysInternals’ Process Monitor to try to see what was going on, and the root cause of the problem was quickly seen.

After some small reads from the end of the file (where the ZIP file keeps its index), the entire 11 million byte file was being read from disk a single byte at a time:

Looking more closely, I realized that the reads were almost all a single byte, but every now and then, after a specific 1 byte read, a 15 byte read was issued:

What’s at those interesting offsets (330, 337)? The byte 0x50, aka the letter P.

Having written some trivial ZIP-recovery code in the past, I know what’s special about the character P in ZIP files– it’s the first byte of the ZIP format’s block markers, each of which start with 0x50 0x4B. So what’s plainly happening here is that the code is reading the file from start to finish looking for a particular block, 16 bytes in size. Each time it hits a P, it looks at the next 15 bytes to see if they match the desired signature, and if not, it continues scanning byte-by-byte, looking for the next P.

Is there something special about this particular ZIP file? Yes.

The ZIP Format consists of a series of file records, followed by a list (“Central Directory”) of those file records.

Each file record has its own “local file header” which contains information about the file, including its size, compressed size, and CRC-32; the same metadata is repeated in the Central Directory.

However, the ZIP format allows the local file headers to omit this metadata and instead write it as a “trailer” after each individual file’s DEFLATE-compressed data, a capability that is useful when streaming compression– you cannot know the final compressed size for each file until you’ve actually finished compressing its data. Most ZIP files probably don’t make use of this option, but my example download does. (The developer reports that this ZIP file was created by the GitHub CI.)

You can see the CRC and sizes are 0‘d in the header and instead appear immediately following the signature 0x08074b50 (Data Descriptor), just before the next file’s local header:

The 0x08 bit in the General Purpose flag indicates this option; users of 7-Zip can find it mentioned as Descriptor in the entry’s Characteristics column:

Based on the read size (1+15 bytes), I assume the code is groveling for the Data Descriptor blocks. Why it does that (vs. just reading the same data from the Central Directory), I do not know.

Making matters worse, this “read the file, byte by byte” crawl through the file doesn’t just happen once– it happens at least once for every file extracted. Making matters worse, this data is being read with ReadFile rather than fread() meaning that there’s no caching in userspace, requiring we go to the kernel for every byte read.

Eventually, after watching about 85 million single byte reads, Process Monitor hangs:

After restarting and configuring Process Monitor with Symbols, we can examine the one-byte reads and get a hint of what’s going on:

The GetSomeBytes function is getting hammered with calls passing a single byte buffer, in a tight loop inside the readzipfile function. But look down the stack and the root cause of the mess becomes clear– this is happening because after each file is “moved” from the ZIP to the target folder, the ZIP file must be updated to remove the file that was “moved.” This deletion process is inherently not fast (because it results in shuffling all of the subsequent bytes of the file and updating the index), and as implemented in the readzipfile function (with its one-byte read buffer) it is atrociously slow.

Back up in my repro steps, note that I hit CTRL+X to “Cut” the files, resulting in a Move operation. Had I instead hit CTRL+C to “Copy” the files, resulting in a Copy operation, the ZIP folder would not have performed a delete operation as each file was extracted. The time required to unpack the ZIP file drops from over thirty minutes to four seconds. For perspective, 7-Zip unpacks the file in under a quarter of a second, although it cheats a little.

And here’s where the abstraction leaks— from a user’s point-of-view, copying files out of a ZIP file (then deleting the ZIP) vs. moving the files from a ZIP file seems like it shouldn’t be very different. Unfortunately, the abstraction fails to fully paper over the reality that deleting from certain ZIP files is an extremely slow operation, while deleting a file from a disk is usually trivial. As a consequence, the Compressed Folder abstraction works well for tiny ZIPs, but fails for the larger ZIP files that are becoming increasingly common.

While it’s relatively easy to think of ways to dramatically improve the performance of this scenario, precedent suggests that the code in Windows is unlikely to be improved anytime soon. Perhaps for its 25th Anniversary? 🤞

– Eric

Offline NetLog Viewing

A while back, I explained how you can use Telerik Fiddler or the Catapult NetLog Viewer to analyze a network log captured from Microsoft Edge, Google Chrome, or another Chromium or Electron-based application.

While Fiddler is a native app that runs locally, the Catapult NetLog Viewer is a JavaScript application that runs in your browser. Because NetLogs can contain sensitive data, some users have worried about the privacy of the viewer– what if someday it started leaking sensitive data from logs, either unintentionally or maliciously?

Fortunately, the NetLog Viewer is a self-contained single page application that doesn’t need a network connection to run. You can use it entirely offline, either from a Virtual Machine with no network connection, or from a browser instance configured to override all network requests.

Your first step is to get an copy of the viewer as a file. You can do that by right-clicking this link and choosing “Save Link As”. Save the HTML file somewhere locally, e.g. C:\temp\NetLogView.html on Windows.

If you want to run it from a disconnected VM, simply copy the file into such a VM and you’re good to go.

If, however, you want the convenience of running the viewer from your Internet-connected PC without worrying about leaks, you can run it from a browser instance that won’t make network connections.

After saving the file, open it in a new browser window thusly:

msedge.exe --user-data-dir=C:\temp\profile --inprivate --host-rules="MAP * 0.0.0.0" --app=C:\temp\NetLogView.html

The command line arguments bear some explanation. In reverse order:

  • The app argument instructs Edge to open the supplied file with a minimal browser UI, as if it were a native app.
  • The host-rules argument tells the browser instance to direct all network requests to an IP address of 0.0.0.0. On Windows, such requests instantly fail. On Mac/Linux, the null IP points back at your own PC.
  • The inprivate argument directs the browser to discard all storage after the app exits (since it’s not needed). For Chrome, use --incognito instead.
  • The user-data-dir instructs the browser to use a temporary browser profile (which prevents the app’s window from being merged into an existing browser process, such that the host-rules argument would’ve been ignored.)


While none of this is strictly necessary (the NetLog Viewer doesn’t leak data), it’s always nice to be able to discard attack surface wherever possible.

-Eric

Download Blocking by File Type

I’ve previously spoken about the magic of the File Type Policies component — a mechanism that allows files to be classified by their level of “dangerousness”, such that harmless files (e.g. .txt files) can be downloaded freely, whilst potentially-dangerous files (e.g. .dll files) are subjected to a higher degree of vetting and a more security-conscious user-experience.

File Type Danger Level

Microsoft Edge inherits its file type policies from the upstream Chromium browser; you can view the current contents of the list here, and documentation of its format here.

Within the list, you’ll see that each type has a danger_level, which is one of three values: DANGEROUS, NOT_DANGEROUS, or ALLOW_ON_USER_GESTURE.

The first two are simple: NOT_DANGEROUS means Safe to download and open, even if the download was accidental. No additional warnings are necessary. DANGEROUS means Always warn the user that this file may harm their computer. Let users continue or discard the file. If [SmartScreen or Safe Browsing] returns a SAFE verdict, still warn the user before saving the file.

The third setting, ALLOW_ON_USER_GESTURE1 is more subtle. Such files are potentially dangerous, but likely harmless if the user is familiar with download site and if the download was intentional. Microsoft Edge will allow such downloads to proceed automatically if two conditions are both met:

  1. User Gesture: There is a user gesture associated with the network request that initiated the download (e.g. the user clicked a link to the download).
  2. Familiar Initiator: There is a recorded prior visit to the referring origin prior to the most recent midnight (i.e. yesterday or earlier). Such a visit implies that the user has at least some history of visiting the site that kicked off the download.

The download will also proceed automatically if the user explicitly initiated a download by using the Save link as context menu command, entered directly into the browser’s address bar the download’s URL, or if Microsoft Defender SmartScreen (in Edge, or Google Safe Browsing in Chrome) indicates that the file is known safe.

Update: Starting in version 91, Microsoft Edge will join Google Chrome in interrupting downloads that lack the required gesture.

User Experience for Downloads Lacking Gestures

Within Google Chrome, a download lacking a required gesture shows explicit buttons to allow the user to decide whether to proceed with the download or abandon it:

In contrast, Microsoft Edge states that the download “was blocked”, although the same options, titled Keep and Delete, are available from the … menu on the download item.

UPDATE: Edge 95+ has been updated with an interruption UX more like Chrome’s, in order to better reflect that the user may continue to download the file.

If you visit edge://downloads, you’ll see the same options:

Enterprise Controls

While users are somewhat unlikely to encounter download interruptions for sites they use every day, they might encounter them for legitimate downloads on sites that they use rarely or in sites that hit “Corner Cases” described in a section below.

To help streamline the user-experience for Enterprises, a Group Policy is available.

Enterprises can use ExemptDomainFileTypePairsFromFileTypeDownloadWarnings to specify the filetypes that are allowed to download from specific sites without interruption.

[{"file_extension":"xml","domains":["contoso.com", "woodgrovebank.com"]},
{"file_extension":"msg", "domains": ["*"]}]

If the SmartScreenForTrustedDownloadsEnabled (or equivalent policy for Chrome) is set to 0 (disabled), and the file download’s URL is Trusted (on Windows, in the Local Machine, Intranet, or Trusted zone) then the download will proceed without interruption (even without a gesture), regardless of danger_level. (Aside: This seems a bit strange, but feels more logical if you pretend that the file type warnings are a part of SmartScreen).

File Types Requiring a Gesture

The latest file types policies are published in the Chromium source code. As of May 2021, file types with a danger_level of ALLOW_ON_USER_GESTURE on at least one OS platform include:
accda, accdb, accde, accdr, action, ad, ade, adp, apk, app, application, appref-ms, as, asp, asx, bas, bash, bat, caction, cdr, cer, chi, chm, cmd, com, command, configprofile, cpgz, cpi, cpl, crt, crx, csh, dart, dc42, deb, definition, der, desktop, dex, diskcopy42, dmg, dmgpart, dvdr, dylib, efi, eml, exe, fon, fxp, hlp, htt, img, imgpart, inf, ins, internetconnect, inx, isp, isu, job, js, jse, ksh, lnk, mad, maf, mag, mam, maq, mar, mas, mat, mau, mav, maw, mda, mdb, mde, mdt, mdw, mdz, mht, mhtml, mmc, mobileconfig, mpkg, msc, msg, msh, msh1, msh1xml, msh2, msh2xml, mshxml, msi, msp, mst, ndif, networkconnect, ocx, ops, out, oxt, paf, partial, pax, pcd, pet, pif, pkg, pl, plg, prf, prg, ps1, ps1xml, ps2, ps2xml, psc1, psc2, pst, pup, py, pyc, pyo, pyw, rb, reg, rels, rgs, rpm, run, scr, sct, search-ms, service, settingcontent-ms, sh, shar, shb, shs, slk, slp, smi, sparsebundle, sparseimage, svg, tcsh, toast, u3p, udif, vb, vbe, vbs, vbscript, vdx, vsd, vsdm, vsdx, vsmacros, vss, vssm, vssx, vst, vstm, vstx, vsw, vsx, vtx, wflow, workflow, ws, wsc, wsf, wsh, xip, xml, xnk, xrm-ms, xsd, xsl

Other Fields in the File Type Policies

  • You’ll also note that some file types have an auto_open_hint which controls whether the user may configure that type of file to open automatically when the download completes.
  • File type settings sometimes vary depending on the client OS platform (an .exe is not dangerous on a Mac, while an .applescript is harmless on Windows). The platform attribute of an entry specifies on which OS the danger_level applies.
  • The max_file_size_to_analyze field controls how big of a file (.zip, .rar, etc) the browser will be willing to unpack to scan it for dangerous content.

Group Policies

DownloadRestrictions is a policy that makes a complicated browser behavior even more complicated. When you set DownloadRestrictions to 1, Edge won’t just interrupt the download, it will block it.

Enterprises can use ExemptDomainFileTypePairsFromFileTypeDownloadWarnings to specify the filetypes that are allowed to download from specific sites without blocking.

Corner Cases

  • If you put referrerpolicy=”no-referrer” on your download link (or otherwise suppress referrers), the Familiar Initiator check fails.
  • Prior to v94, if you initiate the download by dynamically creating an A element with a download attribute, then click it from JavaScript, User Gesture check fails.

As of August 2021, Microsoft Outlook Web Access’ email attachment file downloads encounter both of these issues.

Test cases for these conditions can be found here. (Note that you’ll have to have visited webdbg.com yesterday or earlier for the familiarity check to pass).


1 ALLOW_ON_USER_GESTURE_AND_FAMILIAR_INITIATOR would be the accurate name for the setting

Per-Site Permissions in Edge

Last year, I wrote about how the new Microsoft Edge browser mostly ignores Security Zones (except in very rare circumstances) to configure security and permissions decisions. Instead, in Chromium per-site permissions are controlled by settings and policies expressed using a simple syntax with limited wildcarding support.

Settings Page’s Site Permissions and Group Policy

Internet Explorer offered around 88 URLAction permissions, but the majority (62) of these settings have no equivalent; for instance, there are a dozen that control various features of ActiveX controls, a technology that does not exist in the new Edge.

Unfortunately, there’s no document mapping the old URLActions to the new equivalents (if any) available within the new Edge. 

When users open chrome://settings/content/siteDetails?site=https://example.com, they’ll find a long list of configuration switches and lists for various permissions. Users rarely use the Settings Page directly, instead making choices using various widgets and toggles in the Page Info dropdown (which appears when you click the lock) or via various prompts or buttons at the right-edge of the address bar/omnibox.

Enterprises can use Group Policy to provision site lists for individual policies that control the browser’s behavior. To find these policies, simply open the Edge Group Policy documentation and search for ForUrls to find the policies that allow and block behavior based on the loaded site’s URL. I recently wrote a post about Chromium’s URL Filter syntax, which doesn’t always work like one might expect. Most of the relevant settings are listed within the Group Policy for Content Settings.

There are also a number of policies whose names contain Default that control the default behavior for a given setting.

Here’s a list of Site Settings with information about their policies and behavior:

As you can see, some of these settings are very obscure (WebSerial, WebMIDI) while others will almost never be changed away from their defaults (Images).

-Eric

Specifying Per-Site Policy with Chromium’s URL Filter Format

Chromium-based browsers like Microsoft Edge make very limited use of Windows Security Zones. Instead, most permissions and features that offer administrators per-site configuration via policy rely on lists of rules in the URL Filter Format.

Filters are expressed in a syntax that is similar to other matching rules, but different enough to cause confusion. For instance, consider a URLBlocklist rule expressed as follows:

These filters don’t work as expected. The HTTPS rule should not have a trailing * in the path component (it won’t match anything), while the data: rule requires a trailing * to function.

The syntax has a few other oddities as well:

  • A leading dot before the host means exactly match; the filter .example.com matches just https://example.com but not https://www.example.com
  • You may specify a path prefix (example.com/foo) but you must not include a wildcard * anywhere in the path
  • You may specify wildcards in a query (example.com?bar=*); you may omit the path to have the querystring checked on all pages, or include a path to only check the querystring on pages within the path.
  • A rule of blob:* doesn’t seem to match blob URLs, while a rule of data:* does seem to match all data URLs.

Unfortunately, there’s not a great debugger for figuring out the proper syntax. You can use the chrome://policy page to see whether Chrome finds any glaring error in the policy:

…but short of testing your policy there’s not a great way to verify it does what you hope.

Q: The problem of special-URLs

There are a variety of special URLs (particularly blob and data) that do not directly express a hostname– instead, the URL exists within a security context that is not included in the URL itself. This can cause problems for Policies if the code implementing the policy does not check the URL of the security context and looks only at the blob/data URL directly. A system administrator might set a policy for downloads from https://example.com/download, but if download page on that site uses a script-generated file download (e.g. a blob), the policy check might overlook the rule for example.com because it checks just the blob: URL.

An example bug can be found here.

Q: Can filters match on a site’s IP?

The Permissions system’s “Site Lists” feature does not support specifying an IP-range for allow and block lists. Wildcards are not supported.

It does support specification of individual IP literals, but such rules are only respected if the user navigates to the site using said literal (e.g. http://127.0.0.1/). If a hostname is used (http://localhost), the IP Literal rule will not be respected even though the resolved IP of the host matches the filter-listed IP.

Q: Can filters match just dotless hostnames?

Not today, no. You must individually list each desired hostname, e.g. (https://payroll, https://stock, https://who, etc).

Chromium’s URL Filter Format is convenient if your intranet is structured under one private domain (e.g. *.intranet.example.com) but is much less convenient if your Intranet uses dotless hostnames (http://example) or many disjoint private domains.

The ability to match only hostnames not containing dots would be convenient to accommodate the old IE behavior whereby Windows would map dotless hostnames to the Local Intranet Zone by default. (To my surprise, there’s been no significant demand for this capability in the first year of Edge’s existence, so perhaps corporate intranets are no longer using dotless hostnames very much?)

References