Security: The Impact of Time

Two years ago, I wrote a long post about the importance of time, and how practical time machines can help reduce emergencies to more mundane work items. Today, we revisit the same topic, with a focus on the security impact of time.

Races

In many ways, the story of modern security is a story about races, with Attackers locked in an endless sprint to outpace Defenders. That’s especially true in the case of phishing and malware — attackers need to get their attack payloads to victims faster than defenders can recognize those payloads and block them.

Defenders work as fast as possible to minimize the time-to-block, but there are three key hurdles that they must leap to get there, each of which takes time:

  • Detection (false-negative reports from users, telemetry, or other sources)
  • Analysis/Grading (manual or automated steps to confirm phishing/malware, determine the scope/rollup of blocks, and reduce the likelihood of false positives)
  • Blocking (authoring and deployment of rules to block malicious content)

Attackers use cloaking and other techniques to try to slow down the process of detection and grading.

Google’s SafeBrowsing team recently claimed that 60% of phishing sites exist for under 10 minutes, giving defenders precious little time to react before the attackers have moved on to a new location and new victims.

Time-to-Check

Another key concern for security is the amount of time a security check requires, and this leads to many design constraints. Users are not willing to accept security solutions that make poor tradeoffs between usability and security, and slow performance can gravely impact usability. Security software must be able to return a verdict (e.g. block or allow) as quickly as possible to keep the user productive.

One common approach to dealing with performance challenges is to make checks asynchronous, such that the user’s activity isn’t blocked until/unless the security check reports a detection of malice. In some cases, this is a smart tradeoff– for example, Microsoft SmartScreen performs asynchronous reputation checks for phishing sites. While it may take a second or two for the online reputation check to complete, a potential victim is safely navigated to the block page quickly enough that they don’t have time to type sensitive data (e.g. passwords or credit card info) into the malicious site.

Unfortunately, asynchronous checks aren’t always appropriate. In the case of anti-malware scans or Network Protection checks, we cannot safely allow the malware to run or the malicious connection to be established because the damage could be done before the reputation check completes. So, in most cases, these checks need to be synchronous, which means that they pause the user’s requested activity until the check completes. This creates difficult performance constraints for security software, which otherwise might use relatively slow analysis mechanisms (e.g. “detonation” via sandboxed emulation) to detect malicious content.

These factors mean that security solutions must adopt designs that can meet the time requirements of the user-scenario. Consider Microsoft SmartScreen and Defender Network Protection, two features which are similar except in their time constraints. While SmartScreen performs many URL Reputation checks against the cloud, Network Protection cannot afford such checks. Instead, it consults a locally-cached and frequently updated bloom filter of potentially malicious sites. Only if there’s a match against the bloom filter is the time-expense of a (comparatively slow) online reputation check incurred.
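To make that design concrete, here’s a rough Python sketch of the pattern — a toy bloom filter, not Defender’s actual implementation, and slow_online_reputation_check is a made-up placeholder for the cloud lookup:

    import hashlib

    class TinyBloom:
        """A toy Bloom filter: k hash positions over an m-bit array."""
        def __init__(self, m_bits: int = 1 << 20, k: int = 4):
            self.m, self.k = m_bits, k
            self.bits = bytearray(m_bits // 8)

        def _positions(self, item: str):
            for i in range(self.k):
                digest = hashlib.sha256(f"{i}:{item}".encode()).digest()
                yield int.from_bytes(digest[:8], "big") % self.m

        def add(self, item: str):
            for p in self._positions(item):
                self.bits[p // 8] |= 1 << (p % 8)

        def maybe_contains(self, item: str) -> bool:
            return all(self.bits[p // 8] & (1 << (p % 8)) for p in self._positions(item))

    def slow_online_reputation_check(hostname: str) -> str:
        # Placeholder for the (comparatively slow) cloud reputation service call.
        return "block"

    local_filter = TinyBloom()
    local_filter.add("evil.example")       # in practice, refreshed frequently from the cloud

    def check_host(hostname: str) -> str:
        if not local_filter.maybe_contains(hostname):
            return "allow"                 # definitely not on the list; zero network cost
        # Rare path: a bloom filter can return false positives, so the online check
        # confirms (or clears) the match before anything is blocked.
        return slow_online_reputation_check(hostname)

    print(check_host("example.com"))       # "allow" without any network round-trip
    print(check_host("evil.example"))      # filter hit, so the online check decides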

DDoS

Most Distributed Denial of Service (DDoS) attacks (and many Denial of Service attacks in general) are based on attackers sending a huge flurry of requests in a short period of time, forcing legitimate services offline because there’s too much work arriving in too little time. If the attack weren’t concentrated within a short period, it wouldn’t work — the victim service could simply gradually scale up to cover the additional load without much impact.

Unfortunately for defenders, improving performance for legitimate scenarios can result in making attacks more efficient for attackers. A great recent example of this is the “HTTP/2 Rapid Reset” attack. This attack leverages features of the HTTP/2 protocol that were designed to significantly improve performance:

  • Tiny request sizes
  • The ability to layer dozens to hundreds of requests in a single TCP/IP packet
  • The ability to rapidly abandon/cancel requests

… to generate massive load on the recipient CDN or web server.
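To see just how cheap those protocol features make each request, here’s an illustrative Python snippet using the hyper-h2 library. It performs no network I/O at all — it just serializes frames locally — and the byte counts are approximate:

    import h2.config
    import h2.connection

    conn = h2.connection.H2Connection(config=h2.config.H2Configuration(client_side=True))
    conn.initiate_connection()
    conn.data_to_send()                          # discard the connection preface + SETTINGS

    headers = [(":method", "GET"), (":scheme", "https"),
               (":authority", "example.com"), (":path", "/")]

    for stream_id in range(1, 100, 2):           # client-initiated streams use odd IDs
        conn.send_headers(stream_id, headers, end_stream=True)   # open a request...
        conn.reset_stream(stream_id)                             # ...and cancel it immediately

    print(len(conn.data_to_send()), "bytes for ~50 request+cancel pairs")

Thanks to HPACK header compression, dozens of these open-and-cancel pairs fit in roughly one packet, yet the server typically allocates state and begins work on each request before it ever sees the cancellation.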

Crypto

Fireproof safes are rated by how many minutes they can protect their contents in the event of a fire. Most encryption algorithms are based on a similar concept: they aim to protect data longer than the attack lasts.

Given enough time, an attacker can brute force the encryption and discover the secret key that allows them to unscramble protected data. Crypto algorithms are designed to ensure that brute force attacks take impractically long amounts of time (decades or more).

For example, passwords shouldn’t be hashed with fast hash functions (e.g. SHA-256), but instead with functions (Argon2, PBKDF2, bcrypt) that are deliberately designed to be slow.
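A minimal Python sketch of the contrast, using PBKDF2 from the standard library as the stand-in for a deliberately slow function (Argon2 and bcrypt require third-party packages); the iteration count is the knob that buys time against brute force:

    import hashlib, os

    password = b"hunter2"
    salt = os.urandom(16)

    # Fast hash: great for file fingerprints, terrible for passwords
    # (an attacker can try billions of guesses per second on GPUs).
    fast = hashlib.sha256(salt + password).hexdigest()

    # Deliberately slow key derivation: each guess now costs ~600,000 iterations.
    slow = hashlib.pbkdf2_hmac("sha256", password, salt, 600_000)

    print(fast)
    print(slow.hex())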

Even then, defense-in-depth means that we strive to use crypto designs like forward secrecy to protect captured traffic well into the future.

Time Locks

An interesting use of time is “time locks,” whereby time itself is an important component of the protection. A business may use a “time lock vault,” which aims to reduce the period during which the vault is vulnerable. A Time-based One-Time Password (TOTP), often used as a second authentication factor, uses a clock to generate a one-time password that is valid only for a short period.
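As a sketch of how the clock participates, here’s a bare-bones RFC 6238 TOTP generator in Python (real deployments should use a vetted library; the secret below is the canonical demo value, not anything sensitive):

    import base64, hashlib, hmac, struct, time

    def totp(secret_b32: str, period: int = 30, digits: int = 6) -> str:
        key = base64.b32decode(secret_b32, casefold=True)
        counter = int(time.time()) // period           # the clock is the shared input
        digest = hmac.new(key, struct.pack(">Q", counter), hashlib.sha1).digest()
        offset = digest[-1] & 0x0F
        code = struct.unpack(">I", digest[offset:offset + 4])[0] & 0x7FFFFFFF
        return str(code % 10 ** digits).zfill(digits)

    print(totp("JBSWY3DPEHPK3PXP"))    # valid only for the current 30-second window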

Many lotteries and casino games depend upon the arrow of time to foil attacks — an attacker cannot steal the winning numbers because the drawing happens in the future and we don’t know of any way to communicate even simple digits to the past.

Cloaking Badness

Defenders routinely run sandboxed security tests on code (e.g. potential malware, potentially-malicious browser extensions, etc.) to attempt to determine whether it’s malicious. This process is called detonation. Attackers know that defenders do this and sometimes include logic that attempts to hide bad behavior for longer than a sandboxed analysis is expected to run. Analysis frameworks respond with tricks like advancing the clock, firing timers faster than real time, etc., in an attempt to trick malware into revealing its misbehavior more quickly. It’s a cat-and-mouse game, and unfortunately, attackers have a distinct advantage, as there are any number of tricks they could use to defer misbehavior until later.

Trust vs. Age

Because defenders strive to kill attacks quickly, there’s a security benefit to treating younger entities (domains, files) with increased suspicion: older entities are less likely to be malicious yet still undetected.

For example, an attacker preparing a spear-phishing campaign is likely to register their attack domain (e.g. contoso-auth.xyz) and request its certificate as shortly before sending lures to victims as possible, lest the signals of the coming attack become visible before the attack even begins: various monitors watch for new domain-name registrations and for new entries in the Certificate Transparency logs, and these can flag an attack site before it’s even online. Attackers strive to reduce this window of exposure as much as possible to maximize the effective lifetime of their attacks.

To that end, Microsoft Defender’s Web Content Filtering feature offers administrators the option of blocking navigation to domains less than 30 days old, imposing hard constraints on attackers who hope to perform a rapid sneak attack.
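A policy like that reduces to a very small check once you have the domain’s registration date (which, in practice, would come from WHOIS/RDAP data or a reputation feed); the function and threshold below are illustrative, not Defender’s implementation:

    from datetime import datetime, timezone

    def should_block_young_domain(created: datetime, min_age_days: int = 30) -> bool:
        """Block navigations to domains registered less than min_age_days ago."""
        age = datetime.now(timezone.utc) - created
        return age.days < min_age_days

    print(should_block_young_domain(datetime(2024, 1, 1, tzinfo=timezone.utc)))   # False: old domain
    print(should_block_young_domain(datetime.now(timezone.utc)))                  # True: brand new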

Time is fundamental.

-Eric

Beware: URLs are Pointers to Mutable Entities

Folks often like to think of a URL as an entity that can be evaluated: “Is it harmless, or is it malicious?” In particular, vendors of security products tend to lump URLs in with other IoCs (indicators of compromise) like the hash of a known-malicious file, a malicious/compromised digital certificate, or a known-malicious IP address.

Unfortunately, these classes of IoCs are very different in nature. A file’s hash never changes– you can hash a file every second from now until eternity, and every single time you’ll get the same value. A file’s content cannot change without its hash changing, and as a consequence, a “harmless” file can never1 become “malicious” and vice-versa. Note that when we talk about a file here, we’re talking about a specific series of bytes in a particular order, stored anywhere. We’re not talking about a file path, like C:\users\eric\desktop\file.txt which could contain any arbitrary data.
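A tiny Python demonstration of why a content hash is such a stable indicator: the same bytes always produce the same digest, and changing even a single bit produces a completely different one.

    import hashlib

    content = b"MZ\x90\x00 totally legitimate bytes"
    print(hashlib.sha256(content).hexdigest())         # same bytes, same hash, forever

    tweaked = bytearray(content)
    tweaked[0] ^= 1                                     # flip a single bit...
    print(hashlib.sha256(bytes(tweaked)).hexdigest())   # ...and it's a different file with a different hash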

In contrast to a file hash, a network address like an IP or a URL can trivially change from “harmless” to “dangerous” and vice-versa. That’s because, as we saw when we explored the problems with IP reputation, an IP is just a pointer, and a URL is just a pointer-to-a-pointer. The hostname component of a URL is looked up in DNS, and that results in an IP address to which the client makes a network connection. The DNS lookup can return[1] a different IP address every time, and the target server can switch from down-the-block to around-the-world in a millisecond. But that’s just the first pointer. After the client connects to the target server, that server gets to decide how to interpret the client’s request and may choose to return[2] different content every time:
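The first level of that indirection is easy to see in a couple of lines of Python: each resolution of the hostname may yield different addresses, and whatever answers at those addresses is free to return different content to every request (that second half can’t be shown this simply, since it’s entirely up to the server):

    import socket

    # Resolve the same hostname twice; the answers may differ across time,
    # location, and resolver, and the server behind them can change at will.
    for attempt in range(2):
        infos = socket.getaddrinfo("example.com", 443, proto=socket.IPPROTO_TCP)
        print(sorted({info[4][0] for info in infos}))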

Because the entities pointed at by a pointer can change, a given URL might change from harmless to malicious over time (e.g. a bad guy acquires a domain after its registration expires). But even more surprisingly, a URL can be both harmless and malicious at the same time dependent upon who’s requesting it (e.g. an attacker can “cloak” their attack to return malicious content to targeted victims while serving harmless content to others).

(Aside: A server can even serve a constant response that behaves differently when loaded on each client).

Implications & Attacks

Recently, searching for youtube on Google would result in the first link on the page being a “sponsored” link that looked like this:

If an unsuspecting user clicked on the link, they were taken to a tech scam site that would try to take over the screen and convince the user that they needed to telephone the attacker:

How on earth was this possible? Were the attackers using some sort of fancy Unicode spoofing to make it look like a YouTube link to a human but not to Google’s security checks? Was there a bug on YouTube’s website?

No, nothing so fancy. Attackers simply took advantage of the fact that URLs are pointers to mutable entities.

What almost certainly happened here is that the attacker placed an ad order and pointed it at a redirector that redirected to some page on YouTube. Google’s ad-vetting system checked the URL’s destination to ensure that it really pointed at YouTube, then began serving the ad. The attacker then updated the redirector to point at their scam site, a classic “time-of-check, time-of-use” vulnerability2.

Browser redirect chain upon link click

Because of how the web platform’s security model works, Google’s ability to detect this sort of chicanery is limited– after the user’s browser leaves the googleadservices.com server, Google’s ad engine does not know where the user will end up, and cannot3 know that the next redirector is now sending the user to an attack site.

Now, unfortunately, things are actually a bit worse than I’ve let on so far.

If you’re a “security sensitive” user, you might look at the browser’s status bubble to see where a link goes before you click on it. In this case, the browser claims that the link is pointed at tv.youtube.com:

Our exploration of this attack started at the URL, but there’s actually another level of indirection before that: a link (<a> element) is itself a pointer-to-a-pointer-to-a-pointer. Through JavaScript manipulation, the URL to which a link in a page points can change[0] in the middle of you clicking on it!

And that’s in fact what happens here: Google’s Search results page puts a “placeholder” URL into the <A> until the user clicks on it, at which point the URL changes to the “real” URL:

Now, malicious sites have always been able to spoof the status bubble, but browser vendors expected that “well-behaved sites” wouldn’t do that.

Unfortunately, that expectation turns out to be incorrect.

In this case, showing the “real” URL in the status bubble probably wouldn’t add any protection for our hypothetical “security-conscious” user — all of the links on the Google results page go through some sort of redirector. For example, the legitimate (non-sponsored) search result for YouTube shows www.youtube.com:

…but when clicked, it changes to an inscrutable redirector URL:

… so our theoretical user has no ready way to understand where they’ll end up when clicking on it anyway.

– Eric

1 All absolute statements are incorrect 😂. While a file’s content can’t change, files are typically processed by other code, and that code can change, meaning that a given file can go from harmless to dangerous or vice-versa.

2 It’s entirely possible that Google periodically revalidates the target destination of advertisements, and rather than doing a one-time-switcheroo, the attacker instead cloaked their redirector such that Google graders ended up on YouTube while victims ended up at the tech-scam. There’s some discussion of a similar vector (“tracking templates”).

3 If a user is using Chrome, Google at large could conceivably figure out that the ad was malicious, especially if the redirector ends up landing on a malicious page known by Google Safe Browsing. The SafeBrowsing code integrated into Chrome can “look back” at the redirect chain to determine how a user was lured to a site.

Email Etiquette: Avoid BCC’ing large distribution lists

While Microsoft corporate culture has evolved over the years, and the last twenty years have seen the introduction of new mass communication mechanisms like Yammer and Teams, we remain an email-heavy company. Many product teams have related “Selfhost” or “Discussions” aliases (aka “Discussion Lists” or DLs) to which thousands of employees subscribe so they can ask questions and keep an eye on topics relevant to the product.

As a consequence, many employees like me get hundreds or thousands of emails every day, the majority of which I don’t read beyond the subject line. To keep things manageable, I keep a long list of email sorting rules in Outlook designed to sort inbound mail into folders based on whether it was sent directly to me, to a particular alias, etc.

One such rule, which sorts mail from the Edge browser Selfhost alias, looks like this:

Generally, this approach works great. However, almost every day there are one or more messages sent by well-meaning employees that drop into my inbox, often beginning with something like:

[BCC’ing the large alias to reduce noise.]

Bill, I’ll take this issue offline and work with you directly to investigate.

Don’t be this guy.

Please, I’m begging you with all of my heart, do not do this!

Despite your best intentions (reducing noise for others), you’re instead dramatically amplifying the prominence of your message.

When you move large DLs to BCC, you break recipients’ email sorting rules, increasing the prominence of your email by dropping it into thousands of employees’ inboxes instead of the folders to which it would ordinarily be neatly sorted.

Taking an issue “offline” also often has the side-effect of hiding information from everyone else on the alias, and getting information is usually why they joined the alias in the first place!

Instead, please do this:

I’ll investigate this issue with Bill directly, and after we figure out what’s going on, I’ll reply back to the alias with our findings.

Email to Alias

Hey, Bill– Can you try <a,b,c> and send me the log files collected? I’ll take a look at them and figure out what’s going on so I can reply back to the Selfhost alias with the fix timeline and any workarounds.

Email sent only to Bill

Thanks for your help in saving attention!

-Eric


Postscript: There is a mechanism you can use in a rule to detect that you’ve been BCC’d on a message. If you’ve received a message due to a BCC, there will be a message header added:

X-MS-Exchange-Organization-Recipient-P2-Type: Bcc

Allegedly, you can put a rule checking for that header ahead of any other sorting rules, and Exchange will then try to put BCC’d replies into the same folder as the message being replied to.

I’m going to give this a try, but everything above still stands as this is NOT a trick most users are familiar with.

Update: It didn’t seem to work.

Fiddler Web Debugger Turns 20

Twenty years ago (!!?!), Fiddler saw its first official release. I still run Fiddler for some task or another almost every working day. I still run my version (Fiddler Classic), although some of the newer tools in the Fiddler Universe are compelling for non-Windows platforms.

I presented some slides in a birthday celebration that Progress Telerik streamed online this morning. Those slides might not make sense without the audio, but most of the content is in a presentation I gave at the Codemash conference, and I’ve shared the slides and audio from that talk.

For decades, Fiddler has been a huge part of not only my professional life, but my personal life as well, so this milestone is a bittersweet one. I could write pages, but I won’t. Maybe some day.

Happy birthday, Fiddler, and thank you to everyone who has joined me on the journey over the decades!

-Eric

Security Tradeoffs: Privacy

In a recent post, I explored some of the tradeoffs engineers must make when evaluating the security properties of a given design. In this post, we explore an interesting tradeoff between Security and Privacy in the analysis of web traffic.

Many different security features and products attempt to protect web browsers from malicious sites by evaluating the target site’s URL and blocking access to the site if a reputation service deems the target site malicious: for example, if the site is known to host malware, or perform phishing attacks against victims.

When a web browser directly integrates with a site reputation service, building such a feature is relatively simple: the browser introduces a “throttle” into its navigation or networking stack, whereby URLs are evaluated for their safety, and a negative result causes the browser to block the request or navigate to a warning page. For example, Google Chrome and Firefox both integrate the Google Safe Browsing service, while Microsoft Edge integrates the Microsoft Defender SmartScreen service. In this case, the privacy implications are limited — URL reputation checks might result in the security provider knowing what URLs are visited, but if you can’t trust the vendor of your web browser, your threat model has bigger (insurmountable) problems.
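In pseudocode-ish Python, such a throttle boils down to something like the sketch below. The verdicts, hostnames, and lookup are invented for illustration; real services like Safe Browsing use privacy-preserving hash-prefix lookups rather than a plain set of hostnames:

    from enum import Enum
    from urllib.parse import urlparse

    class Verdict(Enum):
        ALLOW = 0
        BLOCK = 1

    KNOWN_BAD_HOSTS = {"evil.example"}         # stand-in for the reputation service

    def lookup_reputation(url: str) -> Verdict:
        host = urlparse(url).hostname or ""
        return Verdict.BLOCK if host in KNOWN_BAD_HOSTS else Verdict.ALLOW

    def navigation_throttle(url: str) -> str:
        """Consulted before a navigation commits; may swap in a warning page."""
        if lookup_reputation(url) is Verdict.BLOCK:
            return "about:blocked-warning"     # hypothetical interstitial URL
        return url                             # proceed with the requested navigation

    print(navigation_throttle("https://example.com/"))
    print(navigation_throttle("https://evil.example/login"))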

However, beyond direct integration into a browser, there are other architectures.

One choice is to install a security addon (like Microsoft Defender for Chrome or Netcraft) into your browser. In that case, the security addon can collect the URLs of navigations and downloads using the browser’s extension API events and perform its task as if its functionality were directly integrated into the browser. Similarly, Apple offers a new platform API that allows client applications to report URLs they plan to fetch to an extensible lookup service, through which a security provider can indicate whether a given request should be blocked.

In the loosest coupling, a security provider might not integrate into a web browser or client directly at all, instead providing its security at another architectural layer. For example, a provider might watch unencrypted outbound DNS lookups from the PC and block the resolution of hostnames that are known to be malicious. Or, a provider might watch outbound network connections and block those where the URL or hostname is known to be malicious. This sort of inspection might be achieved by altering the OS networking stack, or by plugging into an OS firewall or similar layer.

A common choice is to configure all clients to route their traffic through a proxy or VPN that breaks TLS as a MitM (e.g. Entra Internet Access) and provides the security software access to unencrypted traffic.

The decision of where to integrate security software is a tradeoff between context and generality. The higher you are in the stack, the more context you have (e.g. you may wish to only check URLs for “active” content while ignoring images), while the lower you are in the stack, the more general your solution is (it is less likely to be inadvertently bypassed).

The advantage of integrating security software deep in the OS networking layer is that it can then protect any clients that use the OS networking stack, even if those clients aren’t widely known, and even if the client was written long after your security software was developed. By way of example, Microsoft Defender’s Network Protection and Web Content Filtering features rely upon the Windows Filtering Platform to inspect and block network connections.

For unencrypted HTTP requests, inspecting traffic at the network layer is easy — the URL is found directly in the headers on the outbound request, and because the HTTP traffic is unencrypted plaintext, parsing it is trivial. Unfortunately for security vendors, the story for encrypted HTTPS traffic is much more complicated — the whole point of HTTPS is to prevent a network intermediary (including a firewall, even on the same machine) from being able to read the traffic, including the URL.

To spy on HTTPS from the networking level, a security vendor might reconfigure clients to leak encryption keys, or it might rely on a longstanding limitation in HTTPS to sniff the hostname from the Client Hello message that is sent when the client is setting up the encrypted HTTPS channel with the server.

Security products based on sniffing network traffic (DNS or server connections) are increasingly encountering a sea change: web browser vendors are concerned about improving privacy, and have introduced a number of features to further constrain the ability of a network observer to view what the browser is doing. For example, DNS-over-HTTPS (DoH) means that browsers will no longer send unencrypted hostnames to the DNS server when performing lookups. Instead, a secure HTTPS connection is established to the DNS server, and all requests and responses are encrypted to hide them from network observers.
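For a feel for the mechanics, here’s a minimal Python query against Google Public DNS’s DoH service using its JSON API (browsers use the RFC 8484 wire format instead, but the privacy property is the same: an on-path observer sees only an HTTPS connection to the resolver, not the name being looked up):

    import json, urllib.request

    url = "https://dns.google/resolve?name=example.com&type=A"
    with urllib.request.urlopen(url) as resp:     # the query and answer ride inside HTTPS
        reply = json.load(resp)

    print([answer["data"] for answer in reply.get("Answer", [])])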

Additionally, a new feature called Encrypted Client Hello (ECH) means that network observers can no longer spy on the first few packets of a TLS connection to see which server hostname was requested. For example, in loading the https://tls-ech.dev test page with ECH enabled, the client sends a “decoy” Server Name Indication (SNI) of public.tls-ech.dev rather than the real server name:

When a browser looks up a domain in DNS, the HTTPS Resource Record declares if ECH should be used and if so, the decoy SNI value to send in the OuterHello. You can see this record using a DNS viewer like https://dns.google:

The ECH data is base64-encoded in the record; you can decode it to read the decoy SNI value.
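As a rough sketch — assuming the resolver renders the record in SVCB presentation format, that the list holds a single draft-13+ ECHConfig, and with no error handling — you can pull the decoy name out in a few lines of Python:

    import base64, json, struct, urllib.request

    def ech_public_name(ech_b64: str) -> str:
        """Extract public_name from an ECHConfigList (draft-13+ wire format)."""
        buf = base64.b64decode(ech_b64)
        pos = 2 + 2 + 2                    # list length, ECHConfig version, ECHConfig length
        pos += 1 + 2                       # HpkeKeyConfig: config_id + kem_id
        (pk_len,) = struct.unpack_from(">H", buf, pos); pos += 2 + pk_len   # public_key
        (cs_len,) = struct.unpack_from(">H", buf, pos); pos += 2 + cs_len   # cipher_suites
        pos += 1                           # maximum_name_length
        name_len = buf[pos]; pos += 1
        return buf[pos:pos + name_len].decode("ascii")

    # Fetch the HTTPS (type 65) record for the ECH test site and find its ech= parameter.
    with urllib.request.urlopen("https://dns.google/resolve?name=tls-ech.dev&type=HTTPS") as resp:
        record = json.load(resp)["Answer"][0]["data"]
    ech_value = next(t.split("=", 1)[1].strip('"') for t in record.split() if t.startswith("ech="))
    print(ech_public_name(ech_value))      # expected: public.tls-ech.dev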

Both Chromium and Firefox have support for ECH. Originally, both DoH and ECH were disabled-by-default in Chrome, Edge, and Firefox for “managed” devices but it appears this may have changed. A test site for ECH can be used to see whether your browser is using ECH, and you can follow these steps to see if a server supports ECH.

A growing percentage of servers support HTTP3/QUIC, an encrypted protocol that runs over UDP. QUIC’s use of “initial encryption” precludes trivial extraction of the server’s name from the UDP packet.

Blinded by these privacy improvements, a security product running at the network level may be forced to fall back to IP reputation alone, which suffers many challenges.

Unfortunately, this is a direct tradeoff between security and privacy: security products that aren’t directly integrated into the browser cannot perform their function unless these privacy features are disabled, but disabling those privacy improvements means that any network-based observer could learn more about what sites users are visiting.

Browser Policies

Browsers offer policies that allow network administrators to make their own tradeoffs by disabling the privacy changes that blind their security software.

Chrome

Edge

Firefox

Safari

  • Current versions of Safari reportedly do not offer a policy to disable QUIC. You may be successful in blocking QUIC traffic at the firewall level (e.g. block UDP traffic to remote port 443).

Stay safe out there!

-Eric

PS: It’s worth understanding the threat scenario here– In this scenario, the network security component inspecting traffic is looking at content from a legitimate application, not from malware. Malware can easily avoid inspection when communicating with its command-and-control (C2) servers by using non-HTTPS protocols, or by routing its traffic through proxies or other communication channel platforms like Telegram, Cloudflare tunnels, or the like.

PPS: Integrating into the browser itself may also be necessary to block sites in situations where a site doesn’t require the network in order to load.

PPPS: Sniffing the SNI is additionally challenging in several other cases:

1) If the browser uses QUIC, the HTTP/3 traffic isn’t going over a traditional TLS-over-TCP channel at all.

2) If the browser is using a VPN service (or equivalent), then the web traffic’s routing information is encrypted before reaching the network stack, and the ClientHello is encrypted and thus the SNI cannot be read.

3) If the browser is connected to a Secure Web Proxy (e.g. TLS-to-the-proxy itself, like the Edge Secure Network feature) then the traffic’s routing information is encrypted before reaching the network stack and ClientHello is encrypted and thus the SNI cannot be read.

4) If the browser uses HTTP/2 connection coalescing, the browser might reuse a single HTTP/2 connection across multiple different origins. For example, if your admin blocks sports.yahoo.com, you can get around that block by first visiting www.yahoo.com, which creates an unblocked HTTP/2 connection that can later be reused by requests to the sports.yahoo.com origin.

While an enterprise browser policy is available to disable QUIC, there is not yet a policy to disallow H2 coalescing in Chromium-based browsers. In Firefox, this can be controlled via the network.http.http2.coalesce-hostnames preference.