text/plain

An Improbable Recovery

Way back on May 11th of 2022, I was visiting my team (Edge browser) for the week in Redmond, Washington. On Wednesday night, I left my ThinkPad X1 Extreme laptop in a work area on the 4th floor of the office when I went out for drinks with friends.

After dinner, I decided not to head back to the office to grab my laptop and bag to bring them back to the hotel. When I arrived at the office the next morning, my backpack (containing $80 in cash) and charger were still there, but my laptop was nowhere to be found. An exterior door in a nearby stairwell had been failing to latch but hadn’t been reported to the maintenance team.

I muddled through the rest of the week before heading home to Austin. Losing my X1 was a huge loss for me, because a bunch of irreplaceable files on it were not backed up to the cloud, including tax return data files and a birthday video some friends had arranged from Cameo. Some code updates to my tools hadn’t yet been uploaded to GitHub, and I lost access to a variety of notes and other information that wasn’t copied elsewhere. One consolation was that I knew that I’d enabled BitLocker Drive Encryption on the main disk, so the company’s data was safe.

For decades I’ve been worried about losing files through human error. Unfortunately, I only backed up to external drives every month or two, instead relying on more frequent backups by copying to dated folders to a second SSD on the same machine. Because COVID WFH meant that my laptop rarely left my home office, my entire machine getting stolen just wasn’t a significant part of my threat model. Oops.

I held out some hope that Microsoft Campus Security had video of the thief, and indeed they did. They supplied it to the Redmond Police department. Apparently, the thief was a “known transient in the area” and for a few weeks I held out hope that might mean I would see my laptop again. As the months went by, I lost hope. Because the laptop was my personal machine in BYOD mode (I’ve long used exclusively my own hardware for work) Microsoft wasn’t going to buy me a new one. Depression and frustration about the situation doubtless contributed in part to my leaving the Edge team later that summer.

The next time I gave my missing ThinkPad any real thought was the following April, when I was doing my taxes and had to painstakingly reenter data and make my best guesses based on my recollections of the past year’s now-missing data. It was very frustrating, and I’m sure I paid a few hundred dollars more than I needed to, having lost carryover deductions.

The last time I really thought about my lost ThinkPad was two years ago, when someone stole my son’s iPad out of my car parked in front of my house. I immediately assumed it was gone forever, and rolled my eyes that Apple’s “Find My” was so dumb that it still showed the tablet outside my house. And not even directly in the driveway of my house, but like 300 feet away. And that location never updated in the following days. Annoyed, I finally grudgingly decided to meander down the block to the tiny grove of trees marked by the blue dot and … sure enough, the iPad was sitting there in the grass under a tree. The thieves were smart enough to realize that they weren’t going to be able to use or sell the iPad without its passcode. Amazing.

Years passed.

And then…

Improbably, two weeks ago I received the following email:

I clicked the link, and sure enough, there it was, my laptop with my name still on the screen:

The listing was posted 60 miles away from Redmond. I contacted the seller and in chats over the following week, more of the story came out. Alexandra bids on abandoned storage units, and a recent acquisition had a pile of laptops in it, including mine. She mentioned that the container also had another ThinkPad, which unlike mine, bore a “Microsoft Asset” sticker on it. (I grabbed that Asset number and told her that our Corporate Security team would be in touch.)

She shipped my laptop back to me (somewhat tenuously packed in just an envelope full of bubble-wrap) and it arrived this afternoon. The fun stickers that once adorned the face are gone without a trace, but it’s unquestionably my device.

I’m a little afraid to turn it on — I’m pretty sure when I do, it’s going to auto-connect to my WiFi, realize that it’s been marked stolen, and get automatically wiped. Still, even in the unfortunate event that such a wipe happens, I might still end up formatting and reinstalling Windows 11 on it — the X1 Extreme Gen1 was the last ThinkPad that I really liked (my more recent buys have been “meh“), and it probably has at least a bit of useful life left in it.

And there’s something kinda magical in having my long wandering machine finally, improbably, find its way home, one thousand two hundred sixteen days later….

Lessons

If you see something, say something. I never would’ve gotten this laptop back if Howard didn’t reach out. It never would’ve been stolen to start with had the defective door been reported. (This advice goes for software bugs too!)
Encrypt your disk. I would’ve been very very uncomfortable with my tax returns and other personal data floating around in the wide world.
Regularly back up your data. Cloud backups are convenient, local offline backups provide additional assurance. Don’t forget your BitLocker keys!

-Eric

PostScript

Postscript: It looks like Microsoft Corp got rid of my device’s BitLocker disk encryption recovery key years ago, and I couldn’t find a copy of the recovery key file anywhere on my many external devices. (Consider backing yours up right now!)

But I randomly searched my other machines for text files containing the word “BitLocker” and found the X1’s key in an old SlickRun Jot file backed up from another laptop in 2022. I turned off WIFI in the X1 BIOS and successfully booted to my Windows desktop. It was a bit disorienting at first because the colors were all nuts — apparently, the last thing I was working on that fateful night in 2022 was a high-contrast bug. :)

Unfortunately, it looks like one component didn’t survive over the years — my beloved TrackPoint (the little red eraser-looking input device) only moves the cursor side-to-side, so I have to rely on the touch pad. But still, not bad at all.

AI Injection Attacks

A hot infosec topic these days is “How can we prevent abuse of AI agents?”

While AI introduces awesome new capabilities, it also entails an enormous set of risks from the obvious and mundane to the esoteric and elaborate.

As a browser security person, I’m most often asked about indirect prompt injection attacks, whereby a client’s AI (e.g. in-browser or on device) is tasked with interacting with content from the Internet. The threat here is that the AI Agent might mistakenly treat the web content it interacts with as instructions from the Agent’s user, and so hypnotized, fall under the control of the author of that web content. Malicious web content could then direct the Agent (now a confused deputy) to undertake unsafe actions like sharing private data about the user, performing transactions using that user’s wallet, etc.

Nothing New Under the Sun

Injection attacks can be found all over the cybersecurity landscape.

The most obvious example is found in memory safety vulnerabilities, whereby an attacker overflows a content data buffer and that data is incorrectly treated as code. That vulnerability roots back to a fundamental design choice in common computing architectures: the “Von Neumann Architecture,” whereby code and data are comingled in the memory of the system. While convenient for many reasons, it gave rise to an entire class of attacks that would’ve been prevented by the “Harvard Architecture” whereby the data and instructions would be plainly distinct. One of the major developments of 20 years ago– Data Execution Prevention / No eXecute (DEP/NX) was a processor feature that would more clearly delineate data and code in an attempt to prevent this mistake. And the list of “alphabet soup” mitigations has only grown over the years.

Injection attacks go far beyond low-level CPU architecture and are found all over, including in the Web Platform, which adopted a Von Neumann-style design in which the static text of web pages is comingled with inline scripting code, giving rise to the ever-present threat of Cross-Site Scripting. And here again, we ended up with protection features like the XSS Filter (IE), XSS Auditor (Chrome) and opt-in features to put the genie back in the bottle (e.g. Content Security Policy) by preventing content and script from mingling in dangerous ways.

I’ll confess that I don’t understand nearly enough about how LLM AIs operate to understand whether the “Harvard Architecture” is even possible for an LLM, but from the questions I’m getting, it clearly is not the common architecture.

What Can Be Done?

In a world where AI is subject to injection attacks, what can we do about it?

One approach would be to ensure that the Agent cannot load “unsafe” web content. Since I work on SmartScreen, a reputation service for blocking access to known-unsafe sites, I’m often asked whether we could just block Agent from accessing bad sites just as we would for a regular human browser user. And yes, we should and do, but this is wildly insufficient: SmartScreen blocks sites found to be phishing, distributing malware, or conducting tech scams, but the set of bad sites grows by the second, and it’s very unlikely that a site conducting a prompt injection attack would even be recognized today.

If blocking bad sites doesn’t work, maybe we could allow only “known good” sites? This too is problematic. There’s no concept of a “trustworthy sites list” per-se. The closest SmartScreen has is a “Top traffic” list, but that just reflects a list of high-traffic sites that are considered to be unlikely sources of the specific types of malicious threats SmartScreen addresses (e.g. phishing, malware, tech scams). And it’s worse than that — many “known good” sites contain untrusted content like user-generated comments/posts, ads, snippets of text from other websites, etc. A “known good” site that allows untrusted 3rd-party content would represent a potential source of a prompt injection attack.

Finally, another risk-limiting design might be to limit the Agent’s capabilities, either requiring constant approval from a supervising human, or by employing heavy sandboxing whereby the Agent operates from an isolated VM that does not have access to any user-information or ambient authority. So neutered, a hypnotised Agent could not cause much damage.

Unfortunately, any Agent that’s running in a sandbox doesn’t have access to resources (e.g. the user’s data or credentials) that are critical for achieving compelling scenarios (“Book a table for two at a nice restaurant, order flowers, and email an calendar reminder to my wife“), such that a sandboxed Agent may be much less compelling to an everyday human.

Aside: Game Theory

Despite the many security risks introduced by Agentic AI, product teams are racing ahead to integrate more and more capable Agent functionality into their products, (including completing purchase transactions).

AI companies are racing toward ever-more-empowered Agents because everyone is scared that one of the other AI companies is gonna come out with some less cautious product and that more powerful, less restricted product is gonna win the market. So we end up in the situation with the US at the end of the 1950s, whereby the Russians had 4 working ICBMs but the United States had convinced ourselves they had 1000. In response to this fear, the US itself built a thousand ICBMs, so the Russians then built a thousand ICBMs, and so on, until we both basically bankrupted the world over the next few decades.

2025 Summer Vacation

The boys and I went to Maryland for the first half of August to visit family and check out some roller coasters. They hit Kings Dominion, Busch Gardens, Six Flags America (final season), and Hershey Park. We also hiked up Old Rag mountain, visited Tree Trekkers, and rafted the lower-Yough in Ohiopyle State Park.

We also caught the movie Sketch, which I enjoyed more than most of us; it dared to boldly answer the question “What if we made a horror movie for pre-teens”?

A friend we met on the ground at Old Rag. The kids assumed it was a lost Croc Jibbitz

Nate and I waited 2.5 hours for Fahrenheit 451. His verdict? Not even close to being worth it.

Tree Trekkers is always a hit. Nate and I maximized our time on the zip lines.

Of all of our adventures, the rafting trip was the most fun. After a boring Middle-Yough experience last year, this year the boys and I were thrown from the raft shortly after the trip started, and this was followed by an intense storm that dumped on us, with lightning and trees coming down. Around halfway through, the weather cleared up. While the remainder of the expedition was still fun, Noah mused “It was more fun in the storm.”

Beyond the adventures, we played games, ate Grandma’s spaghetti, and Grandpa’s cinnamon rolls and omelets.

Security Product Efficacy

I’ve written about security products previously, laying out the framing that security products combine sensors and throttles with threat intelligence to provide protection against threats.

As a product engineer, I spend most of my time thinking about how to improve sensors and throttles to enhance protection, but those components only provide value if the threat intelligence can effectively recognize data from the sensors and tell the throttles to block dangerous actions.

A common goal for the threat intelligence team is to measure the quality of their intel because understanding the current quality is critical to improving it.

Efficacy is the measure of the false negatives (how many threats were missed) and false positives (how many innocuous files or behaviors were incorrectly blocked). Any security product can trivially have a 0% false negative rate (by blocking everything) or a 0% false positive rate (by blocking nothing). The challenge of the threat intelligence is in minimizing both false negatives and false positives.

Unfortunately, if you think about it for a moment the big problem in measuring efficacy leaps to mind: It’s kinda impossible.

Why?

Think about it: It’s like having a kid take a math test, and then asking that kid to immediately go back and grade his own test without first giving him an answer key or teaching him more math. When he wrote down his answers, he did his best to provide the answer he thought was correct. If you immediately ask him again, nothing has changed — he doesn’t have any more information than he had before, so he still thinks all of his answers are correct.

And the true situation is actually much more difficult than that analogy — arithmetic problems don’t try to hide their answers (cloaking), and their answers stay constant over time, whereas many kinds of threat are only “active” for brief slices of time (e.g. a compromised domain serving a phishing attack until it’s cleaned up).

There’s no such thing as an answer key for threat recognition, so what are we to do? Well, there are some obvious approaches for grading TI for false negatives:

Wisdom of the Crowd – Evaluate the entity through all available TI products (e.g. on VirusTotal) and use that to benchmark against the community consensus.
Look back later – Oftentimes threats are not detected immediately, but become are discovered later after broader exposure in the ecosystem. If we keep copies of the evaluated artifacts and evaluate them days or weeks later, we may get a better understanding of false negatives.
Sampling – Laboriously evaluating a small sample of specimens by an expert human grader who, for example, detonates the file, disassembles the code and audits every line to come up with an accurate verdict.
Corpus Analysis – Feed a collection of known-bad files into the engine and see how many it detects.

Each of these strategies is inherently imperfect:

the “Wisdom of the Crowd” only works for threats known to your competitors
“look back later” only works when the threat was ever recognized by anyone and remains active
sampling is extremely expensive, and fails when a threat is inactive (e.g. a command-and-control channel no longer exists)
Corpus analysis only evaluates “known bad” files and often contains files that have been rendered harmless by the passage of time (e.g. attempting to exploit vulnerabilities in software that was patched decades ago).

Even after you pick a strategy, or combination of strategies for grading, you’re still not done. Are you counting false positives/negatives by unique artifacts (e.g. the number of files that are incorrectly blocked or allowed), or by individual encounters (the number of times an incorrect outcome occurs)?

Incorrectly blocking a thousand unique files once each isn’t usually as impactful to the ecosystem as blocking a single file incorrectly a million times.

This matters because of the base rate: the vast majority of files (and behaviors) are non-malicious, while malicious files and behaviors are rare. The base rate means that a FN rate of 1% would be reasonably good for security software, while a FP rate of 1% would be disastrously undeployable.

Finally, it’s important to recognize that false positives and false negatives differ in terms of impact. For example:

A false negative might allow an attacker to take over a device, losing it forever.
A false positive might prevent a user from accomplishing an crucial task, making their device useless to them.

Customers acquire security software with the expectation that it will prevent bad things from happening; blocking a legitimate file or action is “a bad thing.” If the TI false positive rate is significant, users will lose trust in the protection and disable security features or override blocks. It’s very hard to keep selling fire extinguishers when they periodically burst into flame and burn down the building where they’re deployed.

-Eric

Family Safety Content Filtering

Microsoft Family Safety is a feature of Windows that allows parents to control their children’s access to apps and content in Windows. The feature is tied to the user accounts of the parent(s) and child(ren).

When I visit https://family.microsoft.com and log in with my personal Microsoft Account, I’m presented with the following view:

The “Nate” account is my 9yo’s account. Clicking it reviews a set of tabs which contain options about what parental controls to enable.

Within the Settings link, there’s a simple dialog of options:

Within the tabs of the main page, parents can set an overall screen time limit:

Parents can configure which apps the child may use, how long they may use them:

…and so on. Parents can also lock out a device for the remainder of the day:

On the Edge tab, parents can enable Filter inappropriate websites to apply parental filtering inside the Edge browser.

(Unlike Microsoft Defender for Endpoint’s Web Content Filtering, there are no individual categories to choose from– it’s all-or-nothing).

As with SmartScreen protection, Family Safety filtering is integrated directly into the Microsoft Edge browser. If the user visits a prohibited site, the navigation is blocked and a permission screen is shown instead:

If the parent responds to the request by allowing the site:

…the child may revisit that site in the future.

Blocking Third-Party Browsers

Importantly, Family Safety offers no filtering in third-party browsers (mostly because doing so is very difficult), so enabling Web Filtering will block third party browsers by default.

The blocking of third-party browsers is done in a somewhat unusual way. The Parental Controls Windows service watches as new browser windows appear:

…and if the process backing a window is that of a known browser (e.g. chrome.exe) the process is killed within a few hundred milliseconds (causing its windows to vanish).

After blocking, the child is then (intended to be) presented with the following dialog:

If the child presses “Ask to use”, a request is sent to the Family Safety portal, and the child is shown the same dialog they would see if they tried to use an application longer than a time limit set by the parent:

The parent(s) will receive an email:

…and the portal gives the parent simple options to allow access:

Some Bugs

For a rather long time, there was a bug where Family Safety failed to correctly enforce the block on third party browsers. That bug was fixed in early June and blocking of third party browsers was restored. This led to some panicked posts in forums like Microsoft Support and Reddit complaining that something weird had happened.

In many cases, the problem was relatively mild (“Hey, I didn’t change anything, but now I’m seeing this new permission prompt. What??”) and could be easily fixed by the parent by either turning off Web Filtering or by allowing Chrome to run:

How Parents Can Adjust Settings:

Go to https://familysafety.microsoft.com or open the Family Safety mobile app.

1. Select the child.

2. To allow other browsers:

· Disable “Filter inappropriate websites” under the Edge tab, or

· Go to Windows tab → Apps & Games → unblock Chrome.

Note that settings changes will take a minute or so to propagate to the client.

Pretty straightforward.

What’s less straightforward, however, is that there currently exists a second bug: If Activity reporting is disabled on the Windows tab for the child account:

…then the browser window is blown away without showing the permission request prompt:

This is obviously not good, especially in situations where users had been successfully using Chrome for months without any problem.

This issue has been acknowledged by the Family Safety team who will build the fix. For now, parents can workaround the issue by either opting out of web filtering or helping their children use the supported browser instead.

-Eric