Security: The Impact of Time

Two years ago, I wrote a long post about the importance of time, and how practical time machines can help reduce emergencies to more mundane work items. Today, we revisit the same topic, with a focus on the security impact of time.

Races

In many ways, the story of modern security is a story about races, with Attackers locked in an endless sprint to outpace Defenders. That’s especially true in the case of phishing and malware — attackers need to get their attack payloads to victims faster than defenders can recognize those payloads and block them.

Defenders work as fast as possible to minimize the time-to-block, but there are three key hurdles that they must leap to get there, each of which takes time:

  • Detection (false-negative reports from users, telemetry, or other sources)
  • Analysis/Grading (manual or automated steps to confirm phishing/malware, determine the scope/rollup of blocks, and reduce the likelihood of false positives)
  • Blocking (authoring and deployment of rules to block malicious content)

Attackers use cloaking and other techniques to try to slow down the process of detection and grading.

Google’s Safe Browsing team recently claimed that 60% of phishing sites exist for under 10 minutes, giving defenders precious little time to react before the attackers have moved to a new location and new victims.

Time-to-Check

Another key concern for security is the amount of time a security check requires, and this leads to many design constraints. Users are not willing to accept security solutions that make poor tradeoffs between usability and security, and slow performance can gravely impact usability. Security software must be able to return a verdict (e.g. block or allow) as quickly as possible to keep the user productive.

One common approach to dealing with performance challenges is to make checks asynchronous, such that the user’s activity isn’t blocked until/unless the security check reports a detection of malice. In some cases, this is a smart tradeoff. For example, Microsoft SmartScreen performs asynchronous reputation checks for phishing sites. While it may take a second or two for the online reputation check to complete, a potential victim is safely navigated to the block page quickly enough that they don’t have the time to type sensitive data (e.g. passwords or credit card info) into the malicious site.

Unfortunately, asynchronous checks aren’t always appropriate. In the case of anti-malware scans or Network Protection checks, we cannot safely allow the malware to run or the malicious connection to be established because the damage could be done before the reputation check completes. So, in most cases, these checks need to be synchronous, which means that they pause the user’s requested activity until the check completes. This creates difficult performance constraints for security software, which otherwise might use relatively slow analysis mechanisms (e.g. “detonation” via sandboxed emulation) to detect malicious content.
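To make the difference concrete, here is a minimal sketch of the two orderings. Everything in it is hypothetical: the URLs, the KNOWN_BAD set, and the simulated 50 ms lookup stand in for a real reputation service, and this is not how SmartScreen or Network Protection is actually implemented.

```python
import asyncio

# Hypothetical blocklist standing in for a cloud reputation service.
KNOWN_BAD = {"http://bad.example/"}


async def reputation_check(url: str) -> bool:
    """Simulate a slow online lookup; returns True if the URL is malicious."""
    await asyncio.sleep(0.05)
    return url in KNOWN_BAD


async def navigate_async(url: str, events: list) -> None:
    """Asynchronous check: show the page first, swap in a block page later."""
    check = asyncio.create_task(reputation_check(url))
    events.append("page shown")  # The user is not kept waiting.
    if await check:
        events.append("block page shown")


async def navigate_sync(url: str, events: list) -> None:
    """Synchronous check: nothing happens until the verdict arrives."""
    if await reputation_check(url):
        events.append("blocked before load")
        return
    events.append("page shown")
```

In the asynchronous version the bad page is briefly visible before the block page replaces it, which is tolerable for phishing but not for malware that does its damage the moment it runs.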

These factors mean that security solutions must adopt designs that can meet the time requirements of the user scenario. Consider Microsoft SmartScreen and Defender Network Protection, two features which are similar except in their time constraints. While SmartScreen performs many URL reputation checks against the cloud, Network Protection cannot afford such checks. Instead, it consults a locally cached and frequently updated Bloom filter of potentially malicious sites. Only if there’s a match against the Bloom filter is the time expense of a (comparatively slow) online reputation check incurred.
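The general pattern of a fast local filter in front of a slow authoritative check can be sketched as follows. This is a toy illustration only: the filter size, hash construction, and the check_host and online_lookup names are assumptions for the example, not Defender’s actual design.

```python
import hashlib


class BloomFilter:
    """Tiny Bloom filter: may give false positives, never false negatives."""

    def __init__(self, size_bits: int = 8192, num_hashes: int = 4):
        self.size_bits = size_bits
        self.num_hashes = num_hashes
        self.bits = bytearray(size_bits // 8 + 1)

    def _positions(self, item: str):
        # Derive num_hashes bit positions by salting SHA-256 with an index.
        for i in range(self.num_hashes):
            digest = hashlib.sha256(f"{i}:{item}".encode("utf-8")).digest()
            yield int.from_bytes(digest[:8], "big") % self.size_bits

    def add(self, item: str) -> None:
        for pos in self._positions(item):
            self.bits[pos // 8] |= 1 << (pos % 8)

    def might_contain(self, item: str) -> bool:
        return all(
            self.bits[pos // 8] & (1 << (pos % 8))
            for pos in self._positions(item)
        )


def check_host(host: str, local_filter: BloomFilter, online_lookup) -> str:
    """Return 'allow' or 'block', calling the slow lookup only on a filter hit."""
    if not local_filter.might_contain(host):
        return "allow"  # Fast path: the host is definitely not on the list.
    # Possible match (or a false positive): pay for the authoritative check.
    return "block" if online_lookup(host) else "allow"
```

Because a Bloom filter never produces false negatives, a miss can be trusted immediately; only the small fraction of hosts that hit the filter pay for the network round trip.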

DDoS

Most Distributed Denial of Service (DDoS) attacks (and many Denial of Service attacks in general) are based on attackers sending a huge flurry of requests in a short period of time, forcing legitimate services offline because there’s too much work arriving in too little time. If the attack weren’t concentrated within a short period, it wouldn’t work: the victim service could gradually scale up to cover the additional load without much impact.

Unfortunately for defenders, improving performance for legitimate scenarios can result in making attacks more efficient for attackers. A great recent example of this is the “HTTP/2 Rapid Reset” attack. This attack leverages features of the HTTP/2 protocol that were designed to significantly improve performance:

  • Tiny request sizes
  • The ability to layer dozens to hundreds of requests in a single TCP/IP packet
  • The ability to rapidly abandon/cancel requests

… to generate massive load on the recipient CDN or web server.

Crypto

Fireproof safes are rated by how many minutes they can protect their contents in the event of a fire. Most encryption algorithms are based on a similar concept: they aim to protect data longer than the attack lasts.

Given enough time, an attacker can brute force the encryption and discover the secret key that allows them to unscramble protected data. Crypto algorithms are designed to ensure that brute force attacks take impractically long amounts of time (decades or more).
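A rough back-of-the-envelope calculation shows why key length matters so much. The guess rate below (one trillion keys per second) is an assumed figure chosen purely for illustration, and the function covers only the worst case of trying every possible key.

```python
SECONDS_PER_YEAR = 365 * 24 * 60 * 60


def years_to_exhaust(key_bits: int, guesses_per_second: float) -> float:
    """Worst-case years needed to try every key of the given length."""
    return (2 ** key_bits) / guesses_per_second / SECONDS_PER_YEAR


ASSUMED_RATE = 1e12  # Hypothetical attacker speed, in guesses per second.

for bits in (56, 128):
    print(f"{bits}-bit key: {years_to_exhaust(bits, ASSUMED_RATE):.3g} years")
```

At that assumed rate, a 56-bit keyspace is exhausted in under a day, while a 128-bit keyspace would take vastly longer than the age of the universe. Each additional bit doubles the time required.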

For example, password hashes shouldn’t be stored using fast hashes (e.g. SHA-256), but instead using hashes (Argon2, PBKDF2, bcrypt) that are deliberately designed to be slower.
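As a minimal sketch of the slower approach, the Python standard library exposes PBKDF2 via hashlib.pbkdf2_hmac. The iteration count here is an illustrative assumption rather than a recommendation (consult current guidance before choosing one), and the hash_password and verify_password helpers are hypothetical names.

```python
import hashlib
import hmac
import os
from typing import Optional, Tuple

# Illustrative work factor only; higher values slow attackers (and logins).
ITERATIONS = 200_000


def hash_password(password: str, salt: Optional[bytes] = None) -> Tuple[bytes, bytes]:
    """Derive a slow, salted hash; returns (salt, derived_key)."""
    if salt is None:
        salt = os.urandom(16)  # Unique random salt per password.
    derived = hashlib.pbkdf2_hmac(
        "sha256", password.encode("utf-8"), salt, ITERATIONS
    )
    return salt, derived


def verify_password(password: str, salt: bytes, expected: bytes) -> bool:
    """Recompute the hash and compare in constant time."""
    _, derived = hash_password(password, salt)
    return hmac.compare_digest(derived, expected)
```

A single login pays the cost of the iterations once, but an attacker guessing billions of candidate passwords pays it for every guess.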

Even then, defense-in-depth means that we strive to use crypto designs like forward secrecy to protect captured traffic well into the future.

Cloaking Badness

Defenders routinely run sandboxed security tests on code (e.g. potential malware, potentially malicious browser extensions, etc.) to attempt to determine whether it’s malicious. This process is called detonation. Attackers know that defenders do this and sometimes include logic that attempts to hide bad behavior for longer than a sandboxed analysis is expected to run. Analysis frameworks respond with tricks like advancing the clock or firing timers faster than they would fire in the real world, in an attempt to trick malware into revealing its misbehavior more quickly. It’s a cat-and-mouse game, and unfortunately, attackers have a distinct advantage, as there are any number of tricks they could use to defer misbehavior until later.
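The clock-advancing trick can be illustrated with a toy virtual clock. Real detonation sandboxes intercept operating system timing APIs; the VirtualClock class and sample_under_test function below are hypothetical stand-ins that only show the idea.

```python
class VirtualClock:
    """A fake clock whose sleep() returns instantly but advances time."""

    def __init__(self, start: float = 0.0):
        self._now = start

    def time(self) -> float:
        return self._now

    def sleep(self, seconds: float) -> None:
        # Instead of actually waiting, jump the clock forward.
        self._now += max(0.0, seconds)


def sample_under_test(clock: VirtualClock, log: list) -> None:
    """Stand-in for code that hides its behavior behind a long delay."""
    clock.sleep(3600)  # "Wait" an hour, hoping the analysis gives up first.
    log.append("deferred behavior ran")
```

With the virtual clock, the hour-long delay costs the sandbox nothing, so the deferred behavior shows up immediately. A sample that instead waits for something the sandbox cannot fake, such as a specific user action, would defeat this particular trick.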

Trust vs. Age

Because defenders strive to kill attacks quickly, there’s a security benefit to treating younger entities (domains, files) with increased suspicion, as older entities are less likely to be malicious without already having been detected.

For example, an attacker trying to launch a spear-phishing campaign is likely to register their attack domain (e.g. contoso-auth.xyz) and request its certificate as shortly as possible before sending their lures to victims, lest the signals of the coming attack become visible before the attack even begins. In particular, there are various monitors that watch for new domain name registrations and for new entries in the Certificate Transparency logs, and these can flag an attack site before it’s even online. Attackers strive to reduce the window of exposure as much as possible to maximize the effective lifetime of their attacks.

To that end, Microsoft Defender’s Web Content Filtering feature offers administrators the option of blocking navigation to domains less than 30 days old, imposing hard constraints on attackers who hope to perform a rapid sneak attack.
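A minimum-age policy of this kind boils down to a simple date comparison. The sketch below assumes the registration date has already been obtained from some source such as a registration data lookup; the is_too_new helper is a hypothetical name and says nothing about how Web Content Filtering is actually implemented.

```python
from datetime import datetime, timedelta, timezone
from typing import Optional

MIN_DOMAIN_AGE = timedelta(days=30)


def is_too_new(registered_at: datetime, now: Optional[datetime] = None) -> bool:
    """Return True if the domain is younger than the minimum allowed age."""
    if now is None:
        now = datetime.now(timezone.utc)
    return (now - registered_at) < MIN_DOMAIN_AGE


# Usage: a domain registered five days before the check would be blocked.
check_time = datetime(2024, 6, 1, tzinfo=timezone.utc)
print(is_too_new(check_time - timedelta(days=5), now=check_time))
```

The policy costs legitimate new sites some reach, but it forces an attacker to leave their domain exposed to monitors for a month before it becomes usable.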

Time is fundamental.

-Eric

Published by ericlaw

Impatient optimist. Dad. Author/speaker. Created Fiddler & SlickRun. PM @ Microsoft 2001-2012, and 2018-, working on Office, IE, and Edge. Now a GPM for Microsoft Defender. My words are my own, I do not speak for any other entity.
