Attack Techniques: Open Redirectors, CAPTCHAs, Site Proxies, and IPFS, oh my

The average phishing site doesn’t live very long: think hours rather than days or weeks. Attackers use a variety of techniques to try to stay ahead of the Defenders who work tirelessly to break their attack chains and protect the public.

Defenders have several opportunities to interfere with attackers:

  • Email scanners can detect Lure emails and either block them entirely, or warn the user (e.g. Microsoft SafeLinks) if they click on a link in an email that leads to a malicious site. These email scanners might check embedded URLs by directly checking URL Reputation Services, or they might use Detonators, automated bots which try to navigate a virtual machine to the URLs contained within a Lure email to determine whether the user will end up on a malicious site.
  • Browsers themselves use URL Reputation Services (Microsoft SmartScreen, Google SafeBrowsing) to block navigations to URLs that have been reported as maliciously Requesting the victim’s credentials and/or Recording those stolen credentials.
  • Browser extensions (e.g. NetCraft, Suspicious Site Reporter) can warn the user if the site they’re visiting is suspicious in some way (newly registered, bad reputation, hosted in a “dodgy neighborhood”, etc).
  • Defenders can work with Certificate Authorities to revoke the HTTPS certificates of malicious sites (alas, this no longer works very well).
  • Defenders and Authorities work with web infrastructure providers (hosting companies, CDNs, domain registration authorities, etc) to take down malicious sites.
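To make the URL-reputation step above concrete, here’s a toy sketch of a blocklist check. The host names and the in-memory blocklist are hypothetical; real services like SmartScreen and SafeBrowsing consult constantly-updated server-side lists (often via hashed-prefix lookups), not a static set like this:

```python
from urllib.parse import urlsplit

# Hypothetical blocklist for illustration only; real URL Reputation
# Services maintain constantly-updated server-side lists.
BLOCKED_HOSTS = {"evil.example", "phish.example.net"}

def reputation_verdict(url: str) -> str:
    """Return 'block' if the URL's host (or a subdomain of it) is listed."""
    host = (urlsplit(url).hostname or "").lower()
    if any(host == h or host.endswith("." + h) for h in BLOCKED_HOSTS):
        return "block"
    return "allow"
```

The key property attackers exploit is visible even in this toy: only hosts that have already been reported end up on the list, so a freshly registered domain sails through.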

Each of these represents a weak link for attackers, and they can improve their odds by avoiding them as much as possible. For example, phishers can try to avoid URL Reputation services’ blocking entirely by sending Lures that trick users into completing their victimization over the phone. Or, they can try to limit their exposure to URL Reputation services by using the Lure to serve the credential Request from the victim’s own computer, so that only the URL that Records the stolen credentials is a candidate for blocking.

To make their Lure emails’ URLs less suspicious to mail scanners, some phishers will not include a URL that points directly at the credential Request page, instead pointing at a Redirect URL. In some cases, that redirector is provided by a legitimate service, like Google or LinkedIn:

That first Redirect URL might itself link to another Redirect service; in some cases, a Cloaking Redirector might be used which tries to determine whether the visitor is a real person (potential victim) or a security scanning bot (Defender). If the Cloaking Redirector believes it’s got a real bite, it sends the visitor on to the Credential Request page; if not, it instead sends the bot to some other innocuous page (the Google and Microsoft homepages are common choices). Cloaking is a common strategy used by attackers to try to keep their sites live for longer.
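The decision logic inside a Cloaking Redirector might look something like this sketch. The heuristics, User-Agent hints, IP prefixes, and destination URLs here are all hypothetical, chosen only to illustrate the bot-vs-human fork described above:

```python
# Illustrative Cloaking Redirector logic (all names and heuristics are
# hypothetical): likely bots get an innocuous decoy page, while likely
# humans are forwarded to the next hop in the attack chain.
BOT_UA_HINTS = ("bot", "crawl", "spider", "preview", "scan")
DATACENTER_PREFIXES = ("40.", "64.233.")  # hypothetical scanner IP ranges

def choose_destination(user_agent: str, client_ip: str) -> str:
    ua = user_agent.lower()
    if any(hint in ua for hint in BOT_UA_HINTS):
        return "https://www.microsoft.com/"    # innocuous decoy
    if client_ip.startswith(DATACENTER_PREFIXES):
        return "https://www.google.com/"       # innocuous decoy
    return "https://next-hop.example/request"  # credential Request page
```

This is why Detonators increasingly use realistic browser User-Agents and residential-looking egress IPs: the cheap signals above are exactly what cloakers key on.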

Redirectors can also complicate the phish-reporting process: a user reporting a phishing site might not report the original URL, so when the credential Request page starts getting blocked, the attacker can just update the Redirect URL used in their lure to point to a new Request page.

Before showing the user the credential Request, an attacker might ask the user to complete a CAPTCHA or similar proof. Now, you might naturally wonder “Why would an attacker ever put a hurdle in the way of the victim on their merry way to give up their secrets?” And the answer is simple: While CAPTCHAs make things slightly harder for human victims, they make things significantly harder for the Defender’s Detonators — if an automated security scanner can’t get to the final URL, it cannot evaluate its phishiness.

After the user has been successfully lured to a credential collection page, the attacker bears some risk: the would-be victim might report the phish to URL reputation services. To mitigate that risk, the attacker might rely on cloaking techniques, so that graders cannot “see” the phishing attack when they check the false negative report.

Similarly, the would-be victim might themselves report the URL directly to the phisher’s web host, who often has no idea that they’re facilitating a criminal enterprise.

To avoid getting their sites taken offline by hosting providers, attackers may split their attack across multiple servers, with the credential Request happening at one URL, and the user’s stolen data sent to be Recorded on another domain entirely. That way, if only the Request URL is taken down, the attacker can still collect their plunder from the other domain.

Proxy-Type Services

An attack I saw today utilized several of these techniques all at once. The attacker sent a lure with a URL pointing to a Google-owned translate.goog domain. That URL was itself just acting as a proxy for a Cloudflare IPFS gateway. IPFS is a new-ish technology that’s not supported by most browsers yet, but it has a huge benefit to attackers in that Authorities have no good way to “take down” content served via IPFS, although there’s a bad bits list.

To enable the attack page to be reachable by normal users’ browsers (which don’t natively support IPFS), the attackers supply a URL to a Cloudflare IPFS gateway, a special webservice that allows browsers to retrieve IPFS content using plain-old HTTPS. In this case, neither Google nor Cloudflare recognizes that they’re facilitating the attack, as neither of them is really acting as a “Web server” in any traditional sense.
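The rewriting that makes this work is mechanical: a path-style IPFS gateway serves content addressed by a CID at `https://<gateway>/ipfs/<CID>/<path>`. The sketch below (with a made-up CID) shows why blocking one gateway accomplishes little — the same content is reachable through any other gateway host without republishing anything:

```python
# A path-style IPFS gateway exposes content-addressed data over plain
# HTTPS as https://<gateway>/ipfs/<CID>/<path>. Rewriting an ipfs:// URL
# for a different gateway host requires no change to the content itself.
def to_gateway_url(ipfs_url: str, gateway_host: str) -> str:
    if not ipfs_url.startswith("ipfs://"):
        raise ValueError("expected an ipfs:// URL")
    cid_and_path = ipfs_url[len("ipfs://"):]
    return f"https://{gateway_host}/ipfs/{cid_and_path}"
```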

Even if Google Translate and Cloudflare eventually do block the malicious URLs, the attacker can easily pick a different proxy service and a different IPFS gateway, without even having to republish their attack elsewhere on IPFS. The design of IPFS makes it harder to ever discover who’s behind the malicious page.

Now, storing data back to IPFS is a somewhat harder challenge for attackers, so this phishing site uses a different server for that purpose. The “KikiCard” URL used by the attackers receives POST requests with victims’ credentials, stores those credentials into a database for the attacker, and then redirects the user to some generic error page on Microsoft.com. In most cases, victims will never even see the “KikiCard” URL anywhere, making it much less likely to be reported.

Google SafeBrowsing is now blocking the KikiCard host as malicious, but it’s still online with a valid certificate.

Usually, without more research, I couldn’t tell you whether a domain like this has always been owned by attackers, or whether an attacker simply hacked into an innocent web server and started using it for nefarious purposes. In this case, however, a quick search shows that it was flagged as a Recorder of stolen credentials going back to July 2022, not long after it got its first HTTPS certificate.

Open-Framer Vulnerabilities

Another attack variant abuses a legitimate site much like open-redirector and proxy-type service abuse. In this variant, a website has a vulnerability where an attacker is able to control the URL of a subframe of a page on that website. In contrast to a similar cross-site scripting attack, in this attack the attacker’s content is isolated by the web platform into a frame, ensuring that it cannot steal the user’s cookies or run script against content outside the frame in the containing page. However, because browsers do not indicate which parts of the page are inside subframes, users will naturally assume that the content in the frame legitimately originates from the parent page. An attacker can abuse this trust and prompt the user to perform an unsafe action in the frame (e.g. entering credentials or downloading files), and the user is likely to make their trust decision based on the URL of the unwitting outer page.
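The website-side fix for an Open-Framer hole is strict validation of the frame URL parameter. This sketch (with a hypothetical allowlist of trusted hosts) shows the shape of it — accept only URLs the page explicitly trusts, and fall back to something inert otherwise:

```python
from urllib.parse import urlsplit

# Defensive sketch (the allowlist is hypothetical): a page that builds an
# <iframe> src from a query parameter should only frame URLs it explicitly
# trusts; otherwise it becomes an "open framer" for attacker content.
ALLOWED_FRAME_HOSTS = {"help.example.com", "player.example.com"}

def safe_frame_url(candidate: str, fallback: str = "about:blank") -> str:
    parts = urlsplit(candidate)
    if parts.scheme == "https" and parts.hostname in ALLOWED_FRAME_HOSTS:
        return candidate
    return fallback  # refuse to frame untrusted content
```

Note that checking the scheme matters too: a `javascript:` or `data:` URL in the parameter is at least as dangerous as an attacker-controlled `https:` one.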

Trivia: These days, this attack requires a vulnerability in the website (e.g. accepting an arbitrary URL in a parameter), but decades ago it was a bug in the Internet Explorer web platform (any page could navigate any subframe contained inside a different page). This attack got the evocative nickname Super-Spoof Deluxe.

-Eric

Slow Seaside Half

After my first real-world half marathon in January, I ended up signing up for the 2024 race, but I also quickly decided that I didn’t want to wait a full year to give it another shot. A day or so later, I signed up for the Galveston Island Half Marathon at the end of February, with the hope that a similarly flat course would give me a shot at beating my Austin finishing time.

Alas, it wasn’t to be, although I’m still glad I ran it.

The weather forecast bounced around a bit in the final weeks leading up to the race, with rain predicted for a while, but race morning ultimately proved to be free of precipitation but extremely humid.

I woke up for half an hour at 3:15am, which wasn’t ideal, but I didn’t feel very tired. This time, I had a productive trip to the bathroom before leaving the house, and managed to squeeze in a final coffee disposal in the porta-potties just before the start.

In pre-race prep, I’d added more “peppy” music to my playlist, and configured my watch for easier visibility, although infuriatingly, I couldn’t coax it to tell me the time of day or total elapsed time: for my next run, I’m going to wear two watches.

The course started on Stewart Beach…

…heading north before looping back and passing by the starting area around 9.5 miles later:

Unfortunately, this run was hard. I never found my rhythm and ended up in my Peak heart rate zone almost immediately; after mile three, I was regularly dropping down to walks.

I ended up not needing my sunglasses (or sunscreen), and it was kinda nice to run alongside the foggy beach and surf. That said, I needed water or Gatorade at almost every aid stop and I think I pumped out more sweat than on any other run.

My pace for the first six miles was considerably slower than expected (8:34), and it only got worse from there:

The middle miles of the race were hard. While nothing hurt for more than a second or two (a budding blister made its presence known, but it wasn’t either a surprise or bothersome), nothing felt very good either. I again found myself lost in unhappy thoughts and worries (mostly loneliness) and never managed to “zone out” and just run like I do on the treadmill.

When the finish line was finally in sight, I started sprinting; my knees instantly warned me that this wasn’t going to last, but otherwise it felt great to finally be moving.

I crossed the line fourteen minutes slower than my Austin Half, happy to be done:

After a shower back at the AirBnB, friends and I went to the Galveston Island Brewing taproom and sampled their beers. After a few hours, I walked over to the beach to enjoy the sun and warm weather (the fog had dissipated).

“Math Is Hard” Double IPA. (Or was it a quad, since I had two? :)

By the end of the day, I’d walked almost 6 additional miles, crossing over 35,000 steps for the day.

The long-sleeve race shirt was pretty nice, and the logo was the same one used for the finisher’s medal.

Unfortunately, landscapers with a mower destroyed the back window of my car while it was parked at the AirBnB, but I managed to get it back to Austin without the shattered glass completely falling out.

I’m looking forward to some recovery treadmill runs for the next two months before the Capital 10K in April. I had a relaxed 8 mile run this morning and it felt great.

Update: I’ve signed up for the Feb 25, 2024 Galveston Half.

-Eric

Q: “Remember this Device, Doesn’t?!?”

Q: Many websites offer a checkbox to “Remember this device” or “Remember me” but it often doesn’t seem to work. For example, this option on AT&T’s website shown when prompting for a 2FA code:

…doesn’t seem to work. What’s up with that?

A: Unfortunately, there’s no easy answer here. There is no browser standard for how to implement a feature like this, so different websites implement it differently.

Virtually all of these systems are dependent upon storing some sort of long-lived token within one of the browser’s storage areas (cookies, DOM storage, IndexedDB, etc). Anything which interferes with your browser’s storage areas can interfere with the long-lived token:

  • Depending upon how the site is coded, privacy features like Edge’s Tracking Prevention might interfere with storage of the token to begin with.
  • There are many different features and operations that can cause one or more storage items to subsequently become inaccessible: for example, privacy controls, 3rd-party utilities, user actions, use of multiple browser channels, and so on. (Please see the blog post for a more comprehensive list.)

Even if the token is successfully stored by the website and is available on later site loads, the server might choose to ignore it.

  • Some sites will ignore a cached token if the visitor appears to be coming from a significantly different geographic location, e.g. because you’ve either moved your laptop or enabled a VPN.
  • Some sites will ignore a cached token if some element of the user’s environment changes: for instance, if the browser’s configured languages are different than when the token was stored.
  • We encountered one site whose auth flow broke if the browser’s User-Agent string changed; this site broke when we tried to fix a compatibility issue by automatically overriding the User-Agent value.
  • Some sites will expire a cached token after a certain (often undocumented) timeframe.
  • Some sites will expire a cached token if some other security setting in the account is changed, or if there are signs that the account’s login is under brute-force attack.
  • Some sites simply change how they work over time. For example, Fidelity recently sent an email to customers with 2FA announcing that they’ll no longer respect a “remember this device” option:
  • Some sites will expire a cached token if some other risk heuristic triggers (e.g. a user begins logging in at an unusual time of day, etc).
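Server-side checks like the ones listed above might be sketched as follows. The policy values and field names here are hypothetical; the point is that even a token the browser stored and returned perfectly can still be ignored for any of these reasons:

```python
from datetime import datetime, timedelta

# Sketch of server-side "remember this device" validation (the policy
# values are hypothetical). Returning False forces the 2FA prompt even
# though the browser faithfully stored and sent back the token.
TOKEN_LIFETIME = timedelta(days=30)

def honor_remembered_device(token_issued: datetime, now: datetime,
                            stored_env: dict, current_env: dict) -> bool:
    if now - token_issued > TOKEN_LIFETIME:
        return False                      # token expired
    if stored_env["country"] != current_env["country"]:
        return False                      # e.g. travel, or a VPN turned on
    if stored_env["languages"] != current_env["languages"]:
        return False                      # browser environment changed
    if current_env.get("account_flagged"):
        return False                      # risk heuristics / brute force
    return True                           # skip the 2FA prompt
```

From the user’s point of view, every one of these branches looks identical: the checkbox “didn’t work.”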

Debugging

Debugging problems like this is often non-trivial, but you might try things like:

  • Watch the F12 Developer Tools’ console to look for any notes about storage being blocked by a browser privacy feature, or a JavaScript exception.
  • See if the “Remember me” behavior works on a second login from the same browser instance.
  • See if the “Remember me” behavior works after restarting the browser.
  • See if the “Remember me” behavior works properly in a different browser or channel.
  • Poke through the F12 Developer Tools’ Application tab to see what sorts of Storage the site’s login flow is writing.

Attack Techniques: Blended Attacks via Telephone

Last month, we looked at a technique where a phisher serves his attack from the user’s own computer so that anti-phishing code like SmartScreen and SafeBrowsing do not have a meaningful URL to block.

Another approach for conducting an attack like this is to send a lure which demands that the victim complete the attack out-of-band using a telephone. Because the data theft is not conducted over the web, URL reputation systems don’t have anything to block.

Here’s an example of such a scam, which falsely claims that the user was charged $400 for one of the free programs already on their PC:

The attacker hopes that the user, upon seeing this charge, will call the phone number within the email and get tricked into supplying sensitive information. This particular scam’s phone number is routed to a call center purporting to be “Microsoft Support.” Unfortunately, some legitimate companies (like PayPal) can be abused to send fake invoices on behalf of the attackers, so your email program may trust the sender.

Pretty much any service that offers email notifications with attacker-supplied text will get abused. For example, an attacker can configure Microsoft Azure to spam arbitrary email addresses with status alerts that have attacker-controlled text, like so:

Microsoft Azure-generated phone phishing lure

The advantage to the attacker here is that the email was sent by Microsoft’s legitimate email servers, which act as an unwitting accomplice to the fraud. The user can mark the email as fraudulent, but since it came from a legitimate service, doing so is unlikely to help much.

Another common form of the phone attack is called a tech support scam, and involves an ad or website that attempts to convince the user that their computer has a problem:

Evidence suggests that some email services have gotten wise to telephone-backed scams: because the phone number need only be read by a human, attackers may try to evade detection and blocking by encoding their phone numbers using non-digit characters or irregular formatting, as in this lure:

…or by embedding the phone number inside an image, like this lure:
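A scanner trying to catch the “non-digit characters” trick might normalize the text before matching against known-scam numbers. This sketch (the digit-word list is illustrative, not exhaustive, and real scanners are far more thorough) maps spelled-out digits and strips separators:

```python
import re

# Sketch of normalization a mail scanner might apply before matching
# known-scam phone numbers (the word list is illustrative, not exhaustive).
DIGIT_WORDS = {"zero": "0", "one": "1", "two": "2", "three": "3",
               "four": "4", "five": "5", "six": "6", "seven": "7",
               "eight": "8", "nine": "9", "o": "0"}

def normalize_phone(text: str) -> str:
    tokens = re.split(r"[^a-z0-9]+", text.lower())
    digits = "".join(DIGIT_WORDS.get(t, t) for t in tokens if t)
    return re.sub(r"[^0-9]", "", digits)
```

Of course, this does nothing against the image-embedding variant, which is why defenders also turn to OCR and sender-reputation signals.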

Unfortunately, relatively few phones offer any mechanism for warning the user when they’re calling a known-scam number — Google’s “Scam Likely” warnings only seem to show on the Pixel for inbound calls. As with traditional phishing attacks, bad actors can usually switch their infrastructure (rental call centers, Twilio VoIP, etc) easily after they are blocked.

Stay safe out there!

-Eric

PS: Sometimes this attack technique is lumped in with vishing, but I tend to think of vishing as an attack in which the initial lure arrives via a phone call or voicemail.

A New Era: PM -> SWE

tl;dr: As of last week, I am now a Software Engineer at Microsoft.

My path to becoming a Program Manager at Microsoft was both unforeseen (by me) and entirely conventional. Until my early teens, my plan was to be this guy:

I went to Space Camp and Space Academy, and spent years devouring endless books about NASA history, space flight, and jet planes. I spent hours “playing” on a realistic (not graphically, but in terms of slow pacing and technical accuracy) Space Shuttle simulator, until I could land the shuttle on instruments alone.

Over time, however, three factors conspired to change my course.

  • First was my realization that my few peers interested in space flight were all interested in space — stars and planets and the science, while I really only cared about the technology of getting there and surviving.
  • Second was the discovery of a Catch-22: While astronaut pilots don’t have to have perfect vision, they were required to have thousands of hours of experience flying jets, which practically required being a military jet pilot, which did require perfect uncorrected vision. My distance vision has been ~20/40 for most of my life.
  • Finally, I’d started getting more and more interested in playing around with computers. I began writing “choose-your-own adventure” games in GW-BASIC starting around age 8 or so, and continued coding in school on Apple II (AppleBasic) and PCs (Logo, Pascal).

Shortly after my 15th birthday, I spent a full summer job’s earnings (~$3000 at $4.75/hr) on my first personal PC (Comtrade Pentium 90 PC with 8 megs of RAM, 730mb HDD, 4X CDROM, 15.7″ monitor, bought over the telephone from an ad in Computer Shopper magazine) and I started writing apps in Turbo Pascal, VB3 (bought for $50 on 5.25″ floppies at the annual “Computer show” at the Frederick Fairgrounds), and eventually Delphi 1 ($100 at Babbages in the mall). By my late teens, I was spending ten or more (sometimes much more) hours a week writing code, and after my senior year, I got my first programming job building custom Windows apps in Delphi for a small development shop at almost 4x minimum wage.

After high school, I majored in Computer Science at the University of Maryland, and while I largely didn’t like it (too much theory, too little practice), I had already seen that software development was a pretty solid career choice. In my sophomore year, on a whim (with the promise of free pizza) I went to a Microsoft recruiting talk on campus delivered by Philip Su, a recent University of Maryland graduate who had joined Microsoft as a developer. Philip was a school legend, having written UMD’s web-based course planning system (a CGI written in C++ talking to the mainframe and spitting out HTML) that allowed you to specify constraints like “I need this many credits, these specific classes, and otherwise do not want to attend class before 11am on any day.” After Philip’s awesome talk, I went from being mildly interested in Microsoft to very excited at the prospect of getting an internship. I dropped off my resume, chatted briefly with Philip, and crossed my fingers.

I got a callback for a short interview at the campus career center a short time later. I didn’t really know what to expect, but figured my best bet was to show off the code I’d built so far. I put together a small binder of screenshots and explanations of tools I’d built in Delphi, including SlickRun, DigitalMC, and Logbook, a journaling program. Each of these was a “scratch my own itch” type of app where my goal was to use technology to solve a problem. In each app, I tried to build cool features, not implement fancy algorithms from scratch. Digital MC used several different libraries (text-to-speech, MP3 playback) and Logbook used an existing database engine.

My campus interviewer was a Microsoft developer in his early thirties (in hindsight, he may well have been younger) who looked a bit weary after a morning full of 15 minute interviews. After quick introductions, he asked which of the engineering roles I’d be most interested in applying for.

I told him that I thought I’d be a fine fit for any of the roles, although I was most interested in the SDE (Software Development Engineer) and PM (Program Manager) roles, and was interested in what he thought. I handed over the binder and walked him through the projects I’d built— as I explained SlickRun, his eyes lit up and he was clearly excited about it. “Have you ever shown this to Microsoft?” he asked excitedly. “I guess I just did?” I replied, wondering what exactly he meant— it wasn’t as if Microsoft toured the country looking for interesting bits of code. I asked him for advice on whether I should go for the PM or SDE role and he noted that Microsoft was looking for SDE interns with experience building 5000 line C and C++ programs. At that point, I’d built several large applications, but all were in Delphi’s Object Pascal. The only C and C++ I’d written was for class projects, and none of those had yet cracked a thousand lines. This made the decision easy— I’d submit my resume as a PM-candidate, a decision with far-ranging and long-lasting consequences. Not long after, I flew to Redmond for a day of on-site interviews with two teams in Office and got offers from both.

During my first Office summer internship in 1999, I ramped up on a new technology (devouring the first books on XML), wrote up competitive reports on the first web-based collaboration software, and played with the nascent API for our team’s “Office Web Server (OWS)” product (eventually renamed SharePoint Team Services). I attended a bunch of training classes, read a bunch of product specs, read a pile of usability books, and generally immersed myself in learning what it meant to be a Program Manager at Microsoft. At the time, the role was hand-wavingly defined as “The person who does everything but code and test.” Qualifications were similarly open, with recruiters told to look for candidates with “A passion for using technology to solve problems.”

I returned to the same team the following summer– by this point, the product was in much more defined form, and I was paired with an Intern Developer and Intern Tester (a “feature trio”) to build a feature. Over the course of the summer, I learned that the primary tasks for most PMs were writing feature design specifications, shepherding them through implementation, triaging bugs found in the implementation, and getting ready for release.

SharePoint was a product based on the idea of Lists (lists of documents, lists of links, lists of contacts, etc) and my intern trio was tasked with adding a feature whereby a SharePoint user could create a list based on pre-built templates with appropriate fields (e.g. the Contact list would have fields for email address, phone number, office address, etc, etc). I wrote the spec for how the feature should look, and for the packaging format that would define each template. I also wrote (in Delphi) a generator/packager app to allow a content team (initially me) to build template files in the correct format. Our dev intern (Brandon) wrote the C++ code that would run inside SharePoint to ingest the package and call the appropriate APIs to create the new list. Our tester (Matt?) made sure it all worked. We finished our feature before the 12 week internship was up, and I considered it an unqualified success.

Offered a full-time job after the internship, I went back to Redmond for a perfunctory day of interviews with the team and was greatly annoyed to learn that our internship’s Template feature had been unceremoniously cut from the release. That outcome, as well as the lack of challenging interview questions from the team, led to me surprising everyone (including myself) by deciding to switch teams. I chose to join the Office Update team, then responsible for all of the Office web sites.

During my senior year back at UMD, I had a work/study internship as a web developer at The Motley Fool, and wrote a primitive OS in C++ for CS412. After finally crossing that “5000 lines of C++” threshold that Microsoft was looking for, I still didn’t seriously consider moving over to SDE. I was already “in” as a PM, and from my internship, it felt like there was a greater opportunity for impact as a PM vs. SDE — most of the SDE interns only owned a tiny piece of a product because it took a ton of work (ensuring accessibility, globalization, localization, performance, security, etc, etc) to deliver that tiny piece. As a PM, I’d be able to direct the work of several developers and focus on maximizing the value of their work for our users. To be honest, being a 21 year-old PM felt a bit like using a “cheat code”– when I’d interviewed at IBM they were super-confused at my resume because at Big Blue, a PM was a grizzled developer who’d “moved up” after a decade of coding. But at Microsoft, I’d get to start there.

The Office Update team went through a reorg before I started, so in June 2001, I started on the Office Assistance and Worldwide Services (AWS) team, as the PM owner of the clipart website and as the team’s Security PM. I spent the three years on Office writing feature specs, triaging bugs, and generally doing “everything but writing code.”

Except… well, I wrote a lot of code. I wrote “Rip Art Gallery,” a tool for abusing the Office website’s API to download clipart without requiring an Office app, and wrote a proof-of-concept ActiveX control for a new feature. I wrote the Clip of the Day tool, to allow Content team to generate the XML manifests of which clip to feature in which locales, on each day for the upcoming months. I wrote webserver log analysis tools. I wrote TamperIE, a tool designed to exploit websites that failed to validate request data, and accidentally leaked it to the world.

Outside of work, I wrote a popular popup blocker (and a less popular one), continued to update SlickRun, maintained DigitalMC and Logbook, created MezerTools, wrote some simple IE Extensions, wrote some simple Delphi libraries (including two for CD-R burning), started building the Fiddler Web Debugger and Meddler, and otherwise acted like a developer. Nearly all of my code was written in Delphi, C#, or JavaScript, with my only C++ development being tiny tweaks to the Internet JunkBuster Proxy to convert it into a bare-bones HTTP traffic logger.

Every few months, my manager would ask “Are you sure you’re not a developer?” and I would demur and explain that I simply loved being a PM. Privately, I also worried that I might lose interest in my many side projects if I started writing code for work.

By the fall of 2004, I decided to move on from Office and join the Internet Explorer team. The newly reconstituted browser team was rapidly growing, and they were hungrier for SDEs than PMs, so the devs on my interview loop were eager to get me to jump disciplines. Unwilling to change both teams and roles at the same time, I remained a PM. Internet Explorer offered more opportunity to become a technical PM though, and I rapidly leaned into it, owning both the new consolidated URL (CURL) class as well as much of the networking and network security areas.

I also immediately embarked upon my barely secret mission — to figure out what bugs in Internet Explorer were responsible for the problem where the Office Clip-of-the-Day wasn’t reliably changing every day. (My futile queries to the skeleton IE team were how I encountered the “Want to change the world? Join the new IE team today” recruiting pitch). With my newly granted source code access permissions, I printed out the code for the WinINET network stack and read it at night with a red pen in hand. While I was not a C++ developer, I was reasonably competent as a C++ reader, and I flagged nearly a hundred bugs, including six different issues that would’ve caused the Clip-of-the-Day to fail to change.

When I’d first joined the IE team, my manager suggested that I find someone else to take over development of Fiddler, because I’d “be too busy.” “We’ll see” I replied, cockily thinking “Your entire test team are all going to be running Fiddler pretty soon.” I was right. I continued to spend tens of hours a week writing Fiddler code, late into the night and on weekends, and its audience grew and grew. In 2007, it won the Engineering Excellence award and I got a handshake from Bill Gates and $5000 to spend on a morale event. While Fiddler dominated my coding time, I still maintained SlickRun and built a few one-off utilities, including an ActiveX control that earned me a $500 steak dinner with friends at Daniel’s Broiler, and an IE extension that won me $3000 in furniture from Pottery Barn and Crate&Barrel. Perhaps my most lucrative win came when a new hire was assigned to “officially productize” a simple web app I’d written to generate IE Search Providers; we started dating and were married three years later.

After several years languishing in the PM2 level band, I finally broke into the Senior PM band on the recognition of my technical contributions. I could go toe-to-toe with the developers in triage conversations, often knowing the code as well as they did, and I built many reduced reproductions for bugs, sometimes explaining exactly what lines of code were at fault.

Toward the end of IE9, I was deeply interested in improving network performance, but I lamented that the dev team couldn’t muster the resources to fix a dozen performance bugs in the network cache code. As I explained the changes needed and how impactful they could be, one of our developers (Ed Praitis) listened thoughtfully and then quietly noted: “It seems like you understand this stuff pretty well. Why don’t you just fix it yourself?”

I chuckled until I saw he was serious. “But I’m a PM!” I protested, “we don’t check-in code. At least, nothing like this.”

“I’ll review it for you if you want,” he offered. And this was just the push I needed. Within a few weeks, I checked in my fixes, and it was the work I was most proud of in over a decade at the company… helping save hundreds of millions of users untold billions of seconds in downloading pages. Around that time, I also offered up a small change to the WinINET code to make it work better with Fiddler, and to my surprise (and amusement) that team accepted it.

After a decade, I’d started to get a bit burned out on the PM role, and fresh off the excitement of landing actual shipping product code, I pondered whether I could take the pay hit of down-leveling to become a junior SDE. Instead, team turnover intervened, and I became a PM Lead, with my four reports owning IE’s Security, Privacy, Reliability, Telemetry, Extensibility, and Process Model features. Despite my rather untraditional PM background, I was, apparently, going to continue my career in a PM Leadership role.

And then, I got an email. A developer tools company was interested in acquiring Fiddler, and I, looking at a full plate with “a real job,” a new wife, and plans for a baby within a few years, decided that the booming Fiddler project deserved a full-time team. I was deep into negotiations to sell Fiddler outright when a phone call from a second interested party upended everything. Telerik not only wanted to buy Fiddler, they also wanted me to come work on Fiddler for them in Austin, Texas. The financial terms were more generous, and the lower cost-of-living in Texas meant that we’d only need one income. After a blissful March visit and negotiations over the summer, I signed the papers and my wife and I both gave notice at Microsoft.

At Telerik, my job title in the address book fluctuated as the company grew and evolved, and I never paid it much attention– whether it was “Principal Software Engineer” or “Product Manager” or something else, I considered myself “Fiddler Product Owner” and I did all the jobs, from coding to user research to support to design to testing. Once in a while, I’d consult on Telerik’s other products, but I never wrote any meaningful code for them.

Alas, after two years and a big pre-IPO layoff of nearly everyone else in the building, I was no longer feeling stable at Telerik and I applied for a Developer Advocate role on the Chrome Security team in 2015. Google is amazeballs at many things, but hiring is not one of them. I completed the Developer Advocate interview loop but their hiring committee came back and suggested that I should be a Technical Program Manager. I did a TPM interview loop, but their hiring committee came back and suggested I should be a Developer Advocate. The lead of Chrome Security decided to resolve the deadlock by hiring me as a Senior SWE (Software Engineer), for which she had sole authority. Since I’d be reporting directly to her, she assured me, my actual duties would be unchanged and my address book title would make no difference. With significant trepidation (I always worried about anything “off book”), I agreed.

I had a very strange ramp-up at Google, with paternity leave after my second son was born in week 2, and a subsequent long bout with pneumonia. Within a few months of starting, a reorganization meant that I’d now start reporting to a new manager. “My new boss knows that I’m not really a SWE and I’m really this special unicorn, right?!?” I asked my director, and was assured the answer was “Yes.” I then went to confirm with my new boss: “You know I’m not a SWE, right? I’ve only written like two files of C++ in the last fifteen years. I’m really this special unicorn DevAdvocate.” She responded “Well, um, I don’t actually have any special unicorn jobs on my team. I do have a SWE job, however, and you do have a Senior SWE title, so we should see if it’s a good fit, right?”

As a father of now two and provider for a single-income family, I didn’t see a lot of options. I looked into down-leveling so my skills matched my role, but Google HR indicated that wasn’t an option, both because they didn’t allow down-leveling and because they didn’t allow remote employees below the Senior level. I spent a total of two and a half years barely keeping my head above water, landing 94 changelists in Chromium and learning a ton. I joked without joking that I was the worst developer in Chrome. While there was much to admire about how Google builds products, I lamented the lack of Microsoft-style PMs and always wondered how much more efficient the team would’ve been with a proper complement of Program Managers.

In 2018, when I saw that one of my former direct reports was now a Group Program Manager at Microsoft, I asked for a job and was delighted to learn that remote work was now possible at the “new Microsoft.” I came back as a Principal Program Manager, and twice ended up acting as an interim lead for a few months as the team turned over. As a PM on the “Web Platform” and as one of the only Edge employees with any experience in Chromium, I got to remain hyper-technical, spending the majority of my time reading specs, guiding designs, explaining engineering systems, reading code, reducing repros, and root-causing problems.

As the team ramped up on Chromium, Microsoft as a whole began a journey to redefine the Program Management role, eventually splitting the role into Product Management (PdM) and Technical Program Management (TPM) to match Google. It was not a graceful process, and many of us felt a great deal of angst at the change. The 2012 book How Google Tests Software had presaged Microsoft’s earlier messy implosion of its Software Test Engineer role, and now it seemed that Microsoft was looking to continue its Googlification and eventually phase out the PM role entirely.

Throughout 2021, I found myself hunting for useful work to do. I spent almost a year as an “enterprise fixer”, landing 168 changelists in Chromium — most of them quite small, and targeted at unblocking enterprises from deploying the new Edge. I again pondered down-leveling to switch disciplines, with perhaps even higher stakes, having ceded half my net worth in a divorce and with the stock market suffering wild gyrations daily.

Finally, in 2022, I took a leap, leaving the Edge team to rejoin old friends and colleagues on the Microsoft Security team responsible for SmartScreen and other security features across products. I spent a few months ramping up into the new technologies, looking at active attacks, and reviewing the code the team had built so far. I kept the “Principal Product Manager” title as a placeholder, with the promise of a reclassification to “Architect” at some point in the future, a spiffy-sounding title that feels like a good fit to encompass the sorts of contributions I like to make.

In conversations with my lead last week, we agreed that “PM” was no longer a good fit for the work I’ll be doing in the coming years, so as of Friday, I’m now a “Principal SWE Manager.” While I don’t think any title has ever been a particularly good fit for the breadth of work I do, I’m excited to try this one on.

-Eric

PostScript: After six months as a SWE (including modernizing this code), I was given a team of PMs and became a Group Product Manager for the Protection team within Defender. That role lasted for a year before my team was merged into another and I ended up back as an IC PM. 🤷 The only constant is change.

Appendix: So, What Did PMs Do, Anyway?

When I first published this post, I felt unsatisfied because I think most folks who weren’t at Microsoft in the late 1990s and early 2000s probably don’t have a clear idea of what the Microsoft PMs of my era actually did. That’s partly because PM was a fairly broad title covering a lot of different activities, and partly because not every PM performed every type of task.

Generally, however, a model PM would do many of the following things:

  • Research and deeply understand customer problems.
  • Analyze and deeply understand current competitive solutions.
  • Brainstorm approaches to fix those problems and validate the proposals. Doing this effectively requires a comprehensive understanding of the capabilities of available technology (both hardware and software).
  • Design great experiences to delight customers. In high-visibility flows, PMs will often have the help of dedicated writers, graphic designers, and usability researchers. However, those resources are often very limited, so a PM should be prepared to put together a shippable design without subject-matter-expert help, and obtain feedback to improve the design before the product ships.
  • Make good tradeoffs and build consensus: whether it’s prioritizing feature investments, triaging bugs, or figuring out what dinner to order for folks staying late at the office.
  • Communicate effectively, both narrowly (1:1 emails, small group meetings, etc) and broadly (blog posts, standards bodies, conference talks). This often involves translating between the varying jargon and interests of different audiences.
  • Reduce Ambiguity. Even when a decision hasn’t yet been made or there’s not enough data, PMs work to ensure that everyone (dev, test, support, leadership, partner teams, etc) is on the same page about both the plan and the known unknowns.
  • Be the Scribe. Any decision that has been made should be recorded (along with supporting data). Outstanding action items should be recorded and driven to closure.

None of these tasks are forbidden to Software Engineers, of course, but SWEs are expected to be world-class experts in writing code, a huge domain and a full-time job all its own.