Cruising Alaska (An Alaska Brews Cruise)

I lived in the Seattle area for nearly 12 years, and one of my regrets is that I never took advantage of any of the Alaskan cruises that conveniently leave from Pier 91 a few miles out of downtown. Getting to Alaska from Austin is more of a hassle, but I figured I’d pair it with a visit to work and friends, so I booked Royal Caribbean’s “Endicott Arm & Dawes Glacier Cruise”, departing Seattle on September 16th. While there were a lot of moving parts (two rental cars, two hotel stays, a workday, friend visits, mandatory COVID testing, Canadian entry paperwork), nearly everything went according to plan… and yet almost nothing was as I’d expected. My expedition mate for this voyage was Clint, one of my two oldest friends– we’ve been going on adventures together since high school.

We started with the flight to Seattle, an early morning departure on Alaska Airlines, paid for entirely with points I’ve accumulated over twenty years (thank goodness their mileage plan’s points never expire– I earned almost all of them over a decade ago). I drove to the office, visited with folks on my new team, and we headed out to lunch at Matador, an old favorite in downtown Redmond. After work, Clint and I met up with Chris, one of my good friends from way back in Office days (circa 2002-2004)– we sampled some of the beers at Black Raven in Redmond. The following morning, I walked over to the Peet’s Coffee in Redmond, another old favorite, where I had started writing the Fiddler book.

After coffee and free breakfast at the hotel, and a mandatory COVID test supervised online, we headed over to Seattle, dropped off our rental car at the Space Needle, and took a quick Lyft out to Pier 91 and our boat, the Ovation of the Seas. It was big. Too big, arguably– it doesn’t look like a boat so much as an apartment building afloat. (I really liked the Adventure of the Seas, my vessel for my first two Royal cruises) I was excited to see the ship, but first we had to get through an annoyingly long queue. I’d read some posts about the boarding process in Seattle, so I thought I was prepared, but what I wasn’t prepared for was the paper handed out at the front of the line… it turned out that our glacier cruise wasn’t going to be a glacier cruise after all. Boo!

Since I didn’t have any particular expectations for the glacier viewing, I was mostly just annoyed– the daylight hours of any spot on earth have been calculable for hundreds of years, so none of this should have been surprising to the planners. (A few days later, the Captain did a little presentation and mentioned that on the prior cruise, fog meant that their approach to the glacier was aborted three miles out, so no one really got to see much. Fog, at least, seems a less predictable phenomenon than daylight.)

No matter, we were here, COVID free, and going to board the boat. We’d packed wisely and headed to the Windjammer buffet dining room for lunch and snacks while our luggage was loaded onto the ship. At 2PM, we got access to our room. It was nice, although they hadn’t yet split the twin beds and it was tight compared to the junior suite I’d shared with the kids on the Adventure of the Seas in March.

The balcony was a good size, although given the weather forecast (rainy and low 50s) I wasn’t sure how much I’d be using it, even with the cozy blanket I’d packed, and apple cider and hot chocolate packets I’d brought to use in the room’s kettle.

Ultimately, our balcony was mostly home to my sweaty workout clothes after my one run in the ship’s gym. Unlike on the Caribbean cruise, they didn’t dry out. :)

As we waited for our 4PM departure, we were treated to some beautiful views of Seattle and Puget Sound:

The ship was in great shape and nicely decorated, although we were quickly reminded about how inadequate the elevators are (slow, crowded) and this was even more of an issue on the enormous Ovation. We ended up climbing a lot of stairs over the week between our home (cabin 8690) on Deck 8, the shows and main dining room on 3, and topside at 14. Fortunately, the stairwells were decorated with some fun art to break up the monotony:

One of the most visible features on the Ovation of the Seas is its “North Star” observation pod which extends on an arm up to 300 feet above sea level.

I didn’t want to miss it, so we ended up booking one of the first slots, going up before we’d even undocked.

Ultimately, it mostly ended up being a good way to see the whole ship– 300 feet sounds like a lot, but when you’re miles away from any points of interest, it doesn’t make much of a difference. (It probably would’ve been great late in the trip if I’d been excited about whale watching)

After unpacking, dinner, and the “Welcome aboard” comedy show, we watched a movie (The 355) in the open air on the top deck (chilly!) and went to bed.

Our first full day was a Day at Sea, where I explored the ship, read a book, enjoyed the food, and generally relaxed. The ship was well-designed for this itinerary– while the kids’ water features were limited (the kids would’ve been very disappointed in the tiny water slides), there was an arena where you could ride bumper cars, roller skate, or play dodgeball; a small climbing wall; a small iFly indoor skydiving tube; and a ping-pong and XBOX gaming lounge (although more than half of the consoles were broken. Sad).

For the grownups, there was an amazing solarium with hot tubs, lounge chairs, and little snuggle pod couches:

Two of my favorite spots were the two “bridge extension rooftops” that extended across the bow:

These allowed a look back at the rest of the ship; our cabin was somewhere around the orange arrow:

Throughout the cruise, I spent quite a bit of time walking laps on the top deck, passing by some really impressive decorations:

Dinner in the dining room was “Formal Night” so we dressed up in our best. Unlike the dining room in the Adventure of the Seas (a wide-open three-story beauty), our main dining room on the Ovation felt dark and claustrophobic, despite (or perhaps partly because of) mirrors mounted in the ceiling. (The Ovation splits its “main dining room” into four single-story areas). Our waiter seemed extremely stressed for the entire cruise, and all of our interactions felt extremely awkward.

After dinner, we saw the first big song-and-dance show, the Vegas-style “Live, Love, Legs.” The ability to see a great live show is one of my favorite things in the world and I ended up watching it twice, first from the balcony at 8pm and then from the front row at 10pm. The performers were super-talented, and it was awesome to get to see the show from good seats.

When I woke up early the next morning, I was excited to grab breakfast and get my first-ever glimpse of Alaska. I grabbed breakfast at the buffet and walked out the doors to the patio bracing for the cold… but it was only chilly at worst. While undeniably beautiful, everything looked a bit like, well, everywhere else in the Pacific Northwest.

Ah well. After breakfast, I was excited to get out and explore Ketchikan, Alaska’s “First City”:

Now, it’s worth explaining here that I didn’t really have a plan, per se, for any port on this cruise. While the idea of buying the ship’s expensive “unlimited drinks” package (making this a “booze cruise”) sounded depressing and risky, the notion of doing a “brews cruise”, hitting the breweries in each port-of-call, sounded like a lot more fun.

Besides, by the time I had started looking into booking excursions for this trip, most were sold out, all were obscenely expensive (hundreds of dollars per person for most of them) and the weather was supposed to be awful anyway. So, I was excited to get out to discover whatever there was to see.

As we got off the ship, we were handed the little “Here are some shops you should check out” brochure that had a tiny map. On the map was a mention of hiking trails, so we set out in that direction. We walked a few miles on the road along the water until we reached the Ferry Terminal (oops, too far) and turned around to head back to the trailhead at the University of Alaska Southeast.

After a pretty but short hike, with some lovely overlooks:

…we were unceremoniously dumped back out on an (admittedly beautiful) back road and we walked back to the city, past the beautiful Ketchikan Public Library and the less-beautiful Ketchikan jail.

Back in town, we grabbed coffees and pondered our next move. Lunch? We headed to a local fisherman’s bar, where we didn’t find anything interesting to eat or on tap, but I got to enjoy an old favorite in its home port:

Nothing in town seemed like a “Can’t miss” for lunch, so we decided to pop back onto the boat to try the Halibut and Chips at the “Fish and Ships” restaurant atop the ship. Frustratingly, they didn’t have the Halibut (and wouldn’t for the entire trip, despite it remaining on their digital menu screen, grrr) so we settled for plain cod.

“50s and raining? Naw 70s and sunny!”

We then got back off the boat to find more beer. We ended up at a fantastic bar (Asylum) which had a huge selection on tap, including “Island Ale“, an instant favorite that I subsequently failed to find again for the rest of the trip :( .

We enjoyed our drinks with some pickle popcorn on a nice sunny patio with a view out over the water. Alas, our ship’s 4pm departure drew near and we stumbled happily back to the boat. I chilled with my book on the top deck and didn’t even notice as we started pulling away.

After dinner, I spent some time reading alone on deck.

The next morning, I woke up early and headed down to breakfast. The fog over the water gave everything an otherworldly quality and I enjoyed a second cup of coffee walking the deck as we pulled into Juneau.

After disembarking, we immediately booked seats on a bus out to the Mendenhall Glacier, a short trip away. We spotted a half-dozen bald eagles (“Golf ball heads”) along the road, mostly watching us from the top of lampposts. The tour guide pointed out the local McDonald’s, noting that it was the only one that some local rural folks would see on rare trips to “the big city”.

Now, I’ll confess here that I had made it 43 years on this rock called Earth under the misimpression that a glacier is just an especially big iceberg, which turns out not to be the case at all. So, I was a bit surprised and disappointed, but nevertheless agreed that it was a beautiful sight. We hiked out to the base of the 377-foot Nugget Waterfalls at the right of this picture:

…and posed along the way with some ice that had taken hundreds of years to reach this shore:

I even carefully selected an icecube to bring home to the kids as a souvenir:

After a few hours, we’d walked all of the shorter trails and rain threatened, so we boarded the bus back to town.

In the city, we took our bus driver’s advice for a good spot for Halibut and Chips (crazy expensive at $30 a plate: not bad, but not worth it either), bought some postcards to send home, and went in search of a brewery. We started at Devil’s Club Brewing, a nice-looking spot with some interesting (somewhat exotic) beers.

After a flight and another pint of our favorites, we mailed my postcards and found a more traditional bar where I had a hazy IPA and Clint paired a Guinness with an Alaskan Duck Fart.

We then headed back to the ship for dinner, deciding at the last minute to walk a half mile up the coastline to where a famous whale fountain had been installed in a park a few years ago. It was worth the walk, although it looked considerably less lifelike in person. :)

The Fountain
View from the park

After dinner and with hours to kill before Ovation’s 10PM departure, the neon “Alaskan Brewing” sign at the taproom next to the boat beckoned and we decided to head off for another drink.

View from the taproom

After sitting for almost ten minutes without a waitress in sight, we left to find a more fruitful taproom. (As we walked out to the street, we realized that we’d entered the back of the place and that’s probably why there was no service). We ended up at the cozy taproom (they had a cat!) for Barnaby Brewing, one of my favorites of the entire trip, and I enjoyed several delicious selections.

We closed the place down (admittedly, at 8pm) and headed back to the ship.

We had an early 7am arrival at our final Alaskan destination, Skagway, but because of some damage to the dock we had to use tenders (small boats) to reach the shore. On past cruises, this has been very cumbersome, but given the short distance, enormous tenders, and lack of competition for slots, it turned out to be trivial.

Again, I had no plan for what we might do in Skagway. It seemed like the most popular excursions involved getting on a train and riding it around, a prospect I found less than exciting. Fortunately, Google Maps reconnaissance indicated not one but two breweries in this tiny town.

We started by walking from one end of the city to the other, and grabbing a “Honey Bear Latte” at a cute little coffee shop (which was, unsurprisingly, flooded with tourists).

We bought a few souvenirs (shirts and a hat) then found our way to the Skagway Brewing Company, where we had a pint before heading upstairs for another lunch of Halibut and Chips (again, crazy expensive, and again, not really worth the price).

We then headed over to Klondike Brewing Company for a few tasty drinks:

… and then shuttled back to the boat before Ovation’s 6pm departure. The rain held off, and I ended up lounging on deck as we shoved off.

That night, the show was “Pixels”, a singing/dancing/multimedia spectacle in the “270 Lounge” at the back of the ship. It was a short show, and while entertaining, I didn’t enjoy it nearly as much as the other shows.

The next day was the second “Sea Day” with no ports-of-call, so I headed to the gym in the morning to run up an appetite– we were slated to have lunch at the steakhouse. Running was hard– I ended up splitting my 10K into two 5Ks with a few laps on the deck in the middle. My knees have been threatening me for the last few weeks, and the treadmills in the gym weren’t in great shape. I’ve also grown accustomed to running with multiple big fans pointed directly at me, and the ship felt hot and claustrophobic by comparison.

Lunch was, alas, a miss. Through some sort of scheduling mixup, our lunch was actually a “Taste of Royal” tasting tour, where we sat in the fancy “Wonderland” restaurant and had one plate from each of the “premium” eateries on ship. So, rather than a giant steak, we had a fancy spritzer drink, a tiny fish course, a tiny risotto dish, a tiny steak, and a small piece of fried cheesecake. It was tasty, but not what I’d run six miles for.

We putzed around all afternoon, had dinner, and watched a talented singer (Ana Alvaredo) covering popular songs at the onboard pub, Amber and Oak. But the big event of the day was the night’s show in the main theater, The Beautiful Dream. It was, in a word, spectacular. The costumes were amazing. The song choices (a mix of 80s/90s) were perfect. The singing and dancing were powerful. The plot (a father of two loses his wife and must find a way to carry on) was perhaps a bit too on the nose.

I was blown away and resolved that I must make more of an effort to see live theater. After seeing it close up at the 8pm showing, I went back to sit in the balcony at the 10pm showing to take it all in.

I went to bed glowing… this show alone was worth the trip.

Our final full day featured Victoria, but with a slated arrival time of 5pm, we had a day to fill on the boat first. I spent a few hours in hot tubs while most of the passengers were below decks.

Given our evening arrival (and sundown a scant 135 minutes later) I worried that it might not be worth even getting off the boat. In particular, I assumed that getting cleared off the ship and out of the port would be a hassle based on a blog from June, but it was the opposite– nobody checked our passports, vaccination status, arrival forms, or anything else. We all just walked off the boat and through the “Welcome to Victoria” building.

After a short walk along the coastline, we found ourselves in the middle of plenty to do. We quickly found the amazing Refuge Tap Room, where I got two beers and a flight, including a delicious apricot wheat. After drinks, we stopped for a quick, tasty, and calorie-laden poutine and then headed back to the ship.

I had a lot more fun in Victoria than I expected.

We arrived and disembarked in Seattle the following morning. We grabbed fancy Eggnog Lattes at Victor’s Coffee Company and took a long walk in one of my favorite parks (Marymoor) before lunch at one of my favorite spots (Ooba Tooba), then went out for drinks at Black Raven with Nick. We finished the day with Thai Ginger for dinner.

On Saturday, we went to visit our friends Anson and Rachel in Bothell, then checked out the taproom for Mac & Jacks (my favorite beer). After a few drinks there, Zouhir introduced us to Chicago Pastrami in Issaquah, where I had an amazing Reuben and delicious pistachio ice cream.

After I posted the M&J pictures online, everyone said we had to try out Postdoc Brewing just down the street. So, we did the following day before we headed to the airport.

Our trip back was uneventful; we got to the airport super-early after reading horror stories of three-hour security lines at SeaTac, but I breezed through the TSA Pre line in less than fifteen minutes. We had plenty of time to get one last Mac & Jacks at the Africa Lounge, my favorite way to depart Seattle.

Our flight landed around midnight Austin time, and I eagerly tumbled into bed around 1:30 on Monday morning.

All in all, it was an amazing trip that I didn’t fully appreciate. I’ve got a lot on my mind.

Miscellaneous notes:

  • Seven days is too long for me to cruise without kids or a significant other to get me out of my head.
  • Having cell service on the trip made cruising feel very different. While it was convenient to post photos and hunt breweries ahead of time, it really changed the vibe for the worse.
  • Ships can be too big.
  • Cruise-ship comedians aren’t very funny unless you’re drinking.
  • Back home, I miss the fancy desserts.

-Eric

HTTPS Goofs: Forgetting the Bare Domain

As I mentioned, the top failure of HTTPS is failing to use it, and that’s particularly common for inbound links sent via email, in newsletters, and the like.

Unfortunately, there’s another common case, whereby the user simply types your bare domain name (example.com) in the browser’s address bar without specifying https:// first.

For decades, many server operators simply had an HTTP listener sitting at that bare domain, redirecting http://example.com to https://www.example.com, changing from insecure HTTP to secure HTTPS and redirecting from the apex (base) domain to the www subdomain.
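
As a concrete illustration, here’s a minimal sketch of that legacy pattern as a Node.js listener (example.com is a placeholder; real deployments typically do this in the web server or load balancer configuration rather than a hand-rolled script):

// Plain-HTTP listener on the apex domain: upgrade the scheme and hop to www.
const http = require('http');

http.createServer((req, res) => {
  // 301: permanently redirect http://example.com/<path> to https://www.example.com/<path>
  res.writeHead(301, { Location: 'https://www.example.com' + req.url });
  res.end();
}).listen(80);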

However, providing HTTPS support on your www subdomain isn’t really enough: you must also support HTTPS on your apex domain. Unfortunately, several major domains, including delta.com and royalcaribbean.com, do not have HTTPS support for the apex domain, only the www subdomain. This shortcoming causes two problems:

  1. It means you cannot meet the submission requirements to HSTS-Preload your domain (see the example header after this list). HSTS preloading ensures that non-secure requests are never sent, protecting your site from a variety of attacks.
  2. Users who try to visit your bare domain over HTTPS will have a poor experience.
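
To satisfy the preload requirements mentioned in the first point, the apex domain must (among other things) serve a Strict-Transport-Security header over HTTPS along these lines before you submit it at hstspreload.org:

Strict-Transport-Security: max-age=31536000; includeSubDomains; preload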

This second problem is only getting more common.

Browsers are working hard to shift all traffic over to HTTPS, adding new features to default to HTTPS for user-typed URLs (or optionally even all URLs). For some sites, like delta.com, the attempt to navigate to HTTPS on the apex domain will very slowly time out:

…while for other sites on CDNs like Akamai (who do not seem to support HTTPS for free), the user gets a baffling and scary error message because the CDN returns a generic certificate that does not match the target site:

It’s frustrating to me that Akamai even offers a “shoot self in foot” option for their customers when their competitors like Cloudflare give HTTPS away, even to sites on their free tier who don’t pay them anything.

Ideally, sites and CDNs will correct their misconfigurations, helping keep users secure and avoiding confusing errors.

On the browser developer side, it’s kinda fun to brainstorm what the browser might do here, although I haven’t seen any great ideas yet. For example, as I noted back in 2017, the browser used to include a “magic” feature whereby if user went to https://www.example.com but the cert only contained example.com, the user would be silently redirected to https://example.com to avoid a certificate error. You could imagine that the browser could introduce a similar feature here, or we could ship with a list of broken sites like Delta and Royal Caribbean and help the user recover from the site’s configuration error. Unfortunately, most of these approaches don’t meet a cost/benefit bar, so they remain unimplemented.

Please ensure that your apex domain loads properly over HTTPS!

-Eric

Best Practice: Post-Mortems

I’ve written a bit about working at Google in the past. Google does a lot of things right, and other companies would benefit by following their example.

At Google, one of the technical practices that I thought was both essential and very well done was the “post-mortem”– whenever they hit a significant problem, after putting out the fires and getting everything running again, they’d get the engineers closest to the problem to spend a day or two investigating the root cause of the issue and writing up their findings for everyone to read. The visibility of post-mortems meant that even a lowly browser engineer could go read in-depth content about how a live service went down for a day (“We didn’t think about what would happen if the data center caught on fire during the migration“), or the comic tale about what happens when a catering order for 1000 donuts is misunderstood as an order for 1000 dozen donuts. Some post-mortems are even made public.

The aim was a “blameless” post-mortem (nobody got in trouble for the results) where the goal was to identify the true root causes (not just the immediately precipitating errors) and file bugs to eradicate those causes and prevent recurrence of not just the same problem, but all similar problems in the future. As a part of the process, they’d calculate out exactly how much the problem ended up costing in direct dollars (lost revenue, damage, etc). 

Bugs filed from post-mortems got worked on with priority– there was solid evidence showing the real danger of leaving things unfixed, and no one wanted to get burned by the same root causes twice. 

A key technique in the post-mortem was following the “Five Whys” paradigm (famously introduced at Toyota) for finding root causes, in which the participants would start at the immediate issue and then probe further toward the root causes by asking “And why did that happen?” (The downtime was caused because the database ran out of space and the code didn’t notice. Why? Because there was no test for that case. Why? Because the test environment ran on different hardware with a mock database that couldn’t run out of space. Why? Because it was deemed too difficult to test on production-class hardware. Why? Because we haven’t prioritized building a parallel test environment. Why? Because it’s expensive and we didn’t think it was necessary. Now we know better). 

The post-mortems were serious affairs — mandatory, well-funded (engineering time is expensive), and broadly reviewed — all of them published on an intranet portal for anyone in the company to learn from. They were tremendously effective — fixes for the root causes were prioritized based on cost and impact and rapidly addressed. I don’t think Google could have become a trillion-dollar company without them.

Many companies’ engineering cultures have adopted post-mortems in theory— but if your culture isn’t willing to expect, fund, recognize, and respect them, they become yet another source of overhead and another exhausting checkbox to tick.

Badware Techniques: Notification Spam

I tried visiting an old colleague’s long-expired blog today, just to see what would happen. I got redirected here:

Wat? What is this even talking about? There’s no “Allow” link or button anywhere.

The clue is that tiny bell with a red X in the omnibox– this site tried to ask for permission to spam me with notifications forevermore. The site hopes that I don’t understand the permission prompt, that I will assume this is one of the billions of CAPTCHAs on today’s web, and that I will simply click “Allow”.
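
For the curious, the machinery behind the prompt is trivial; a page like this one typically runs something along these lines as soon as it loads (a sketch of the pattern, not this site’s actual code):

// Ask for the Notification permission immediately, hoping for a reflexive "Allow" click.
if ('Notification' in window && Notification.permission === 'default') {
  Notification.requestPermission().then((result) => {
    if (result === 'granted') {
      // From here on, the page can push OS-level notifications whenever it likes.
      new Notification('You won!', { body: 'Click here to claim your prize...' });
    }
  });
}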

However, in this case, Edge said “Naw, we’re not even going to bother showing the prompt for this site” and suppressed it by default.

The resulting experience isn’t an awesome one for the user, but there’s not a ton the browser can do about that in general– websites can always lie to visitors, and the browser’s ability to do anything reasonable in response is limited. The truly bad outcome (a continuous flood of spam notifications appearing inside the OS, leading the user to wonder whether they’ve been hacked for weeks afterward) has been averted because the user never sees the “Shoot self in foot” option.

This “Quieter Notifications” behavior can be found in Edge Settings; you can use the other toggle to turn off Notification permission requests entirely:

edge://settings/content/notifications screenshot

Today, there’s no “Report this site is trying to trick users” feature. The existing menu command ... > Help and Feedback > Report Unsafe Site is used only to report sites that distribute malware or conduct phishing attacks, for blocking with SmartScreen.

Edge’s Super-Res Image Enhancement

One interesting feature that the Edge team is experimenting with this summer is called “SuperRes” or “Enhance Images.” This feature allows Microsoft Edge to use a Microsoft-built AI/ML service to enhance the quality of images shown within the browser. You can learn more about how the images are enhanced (and see some examples) in the Turing SuperRes blog post.

Currently, only a tiny fraction of Stable channel users and a much larger fraction of Dev/Canary channel users have the feature enabled by field trial flags. If the feature is enabled, you’ll have an option to enable/disable it inside edge://settings:

Users of the latest builds will also see a “HD” icon appear in the omnibox. When clicked, it opens a configuration balloon that allows you to control the feature:

As seen in the blog post, this feature can meaningfully enhance the quality of many photographs, but the model is not yet perfect. One limitation is that it tends not to work as well for PNG screenshots, which sometimes get pink fuzzies:

“Pink Fuzzies” JPEG Artifacts

… and today I filed a bug because it seems like the feature does not handle ICCv4 color profiles correctly.

Green tint due to failed color profile handling

If you encounter failed enhancements like this, please report them to the Edge team using the … > Help and Feedback > Send Feedback tool so the team can help improve the model.

On-the-Wire

Using Fiddler, you can see the image enhancement requests that flow out to the Turing Service in the cloud:

Inspecting each response from the server takes a little bit of effort because the response image is encapsulated within a Protocol Buffer wrapper:

Because of the wrapper, Fiddler’s ImageView will not be able to render the image by default:

Fortunately, the response image is near the top of the buffer, so you can simply focus the Web Session, hit F2 to unlock it for editing, and use the HexView inspector to delete the prefix bytes:

…then hit F2 to commit the changes to the response. You can then use the ImageView inspector to render the enhanced image, skipping over the remainder of the bytes in the protocol buffer (see the “bytes after final chunk” warning on the left):

Stay sharp out there!

-Eric

QuickFix: Trivial Chrome Extensions

Almost a decade before I released the first version of Fiddler, I started work on my first app that survives to this day, SlickRun. SlickRun is a floating command line that can launch any app on your PC, as well as launch web applications and perform other simple, useful tricks, like showing battery, CPU usage, countdowns to upcoming events, and so forth:

SlickRun allows you to come up with memorable commands (called MagicWords) for any operation, so you can type whatever’s natural to you (e.g. bugs/edge launches the Edge bug tracker).

One of my favorite MagicWords, beloved for decades now, is goto. It launches your browser to the best match for any web search:

For example, I can type goto download fiddler and my browser will launch and go to the Fiddler download page (as found by an “I’m Feeling Lucky” search on Google) without any further effort on my part.

Unfortunately, back in 2020 (presumably for anti-abuse reasons), Google started interrupting their “I’m Feeling Lucky” experience with a confirmation page that requires the user to acknowledge that they’re going to a different website:

… and this makes the goto user flow much less magical. I grumbled about Google’s change at the time, without much hope that it would ever be fixed.

Last week, while vegging on some video in another tab, I typed out a trivial little browser extension which does the simplest possible thing: When it sees this page appear as the first or second navigation in the browser, it auto-clicks the continue link. It does so by instructing the browser to inject a trivial content script into the target page:

"content_scripts": [
    {
      "matches": ["https://www.google.com/url?*"],
      "js": ["content-script.js"]
    }

…and that injected script clicks the link:

// On the Redirect Notice page, click the first link.
if (window.history.length<=2) {
  document.links[0].click(); 
}
else {
  console.log(`Skipping auto-continue, because history.length == ${window.history.length}`);
}
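
For completeness, the whole extension is just that content script plus a manifest; assuming Manifest V3, the full manifest looks roughly like this (the name and version are arbitrary):

{
  "manifest_version": 3,
  "name": "Auto-continue Redirect Notice",
  "version": "1.0",
  "content_scripts": [
    {
      "matches": ["https://www.google.com/url?*"],
      "js": ["content-script.js"]
    }
  ]
}

You can load the folder containing those two files via edge://extensions (or chrome://extensions) with Developer Mode enabled and “Load unpacked.”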

This whole thing took me under 10 minutes to build, and it still delights me every time.

-Eric

Passkeys – Syncable WebAuthN credentials

Passwords have lousy security properties, and if you try to use them securely (long, complicated, and different for every site), they often have horrible usability as well. Over the decades, the industry has slowly tried to shore up passwords’ security with multi-factor authentication (e.g. one-time codes via SMS, ToTP authenticators, etc) and usability improvements (e.g. password managers), but these mechanisms are often clunky and have limited impact on phishing attacks.

The Web Authentication API (WebAuthN) offers a way out — cryptographically secure credentials that cannot be phished and need not be remembered by a human. But the user-experience for WebAuthN has historically been a bit clunky, and adoption by websites has been slow.

That’s all set to change.

Passkeys, built atop the existing WebAuthN standards, offers a much slicker experience, with enhanced usability and support across three major ecosystems: Google, Apple, and Microsoft. It will work in your desktop browser (Chrome, Safari, or Edge), as well as on your mobile phone (iPhone or Android, in both web apps and native apps).

Passkeys offers the sort of usability improvement that finally makes it practical for sites to seize the security improvement from retiring passwords entirely (or treating password-based logins with extreme suspicion).
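
Under the hood, it’s still the WebAuthN navigator.credentials API. As a rough sketch (the relying-party details, user info, and challenge below are placeholders; a real site gets the challenge from its server and posts the resulting credential back), creating a passkey looks something like:

// Sketch: register a passkey (a discoverable WebAuthn credential).
async function registerPasskey(challengeFromServer) {
  const credential = await navigator.credentials.create({
    publicKey: {
      challenge: challengeFromServer,                  // server-issued random bytes
      rp: { id: 'example.com', name: 'Example Site' },
      user: {
        id: new TextEncoder().encode('user-1234'),     // stable, opaque user handle
        name: 'user@example.com',
        displayName: 'Example User'
      },
      pubKeyCredParams: [{ type: 'public-key', alg: -7 }],  // ES256
      authenticatorSelection: {
        residentKey: 'required',          // "discoverable" credential == passkey
        userVerification: 'preferred'
      }
    }
  });
  // Send credential.rawId and credential.response.* to the server to finish registration.
  return credential;
}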

PMs from Google and Microsoft put together an awesome (and short!) demo video for the User Experience across devices which you can see over on YouTube.

I’m super-excited about this evolution and hope we’ll see major adoption as quickly as possible. Stay secure out there!

-Eric

Bonus Content: A PassKeys Podcast featuring Google Cryptographer Adam Langley, IMO one of the smartest humans alive.

Understanding Browser Channels

Edge channel logs

Microsoft Edge (and upstream Chrome) is available in four different Channels: Stable, Beta, Dev, and Canary. The vast majority of Edge users run on the Stable Channel, but the three pre-Stable channels can be downloaded easily from microsoftedgeinsider.com. You can keep them around for testing if you like, or join the cool kids and set one as your “daily driver” default browser.

Release Schedule

The Stable channel receives a major update every four weeks (Official Docs), Beta channel more often than that (irregularly), Dev channel aims for one update per week, and Canary channel aims for one update per day.

While Stable only receives a major version update every four weeks, in reality it will usually be updated several times during its four-week lifespan. These are called respins, and they contain security fixes and high-impact functionality fixes. (The Extended Stable channel for Enterprises updates only every eight weeks, skipping every odd-numbered release).

Similarly, some Edge features are delivered via components, and those can be updated for any channel at any time.

Why Use a Pre-Stable Channel?

The main reason to use Beta, Dev, or even Canary as your “daily driver” is because these channels (sometimes referred to collectively as “pre-release channels”) are a practical time machine. They allow you to see what will happen in the future, as the code from the pre-release channels flows from Canary to Dev to Beta and eventually Stable.

For a web developer, Enterprise IT department, or ISV building software that interacts with browsers, this time machine is invaluable– a problem found in a pre-release channel can be fixed before it becomes a work-blocking emergency during the Stable rollout.

For Edge and the Chromium project, self-hosting of pre-release channels is hugely important, because it allows us to discover problematic code before billions of users are running it. Even if an issue isn’t found by a hand-authored customer bug report submission, engineers can discover many regressions using telemetry and automatic crash reporting (“Watson”).

What If Something Does Go Wrong?

As is implied in the naming, pre-Stable channels are, well, less Stable than the Stable channel. Bugs, sometimes serious, are to be expected.

To address this, you should always have at least two Edge channels configured for use– the “fast” channel (Dev or Canary) and a slower channel (Beta or Stable).

If there’s a blocking bug in the version you’re using as your fast channel, temporarily “retreat” from your fast to slow channel. To make this less painful, configure your browser profile in both channels to sync information using a single MSA or AAD account. That way, when you move from fast to slow and back again, all of your most important information (see edge://settings/profiles/sync for data types) is available in the browser you’re using.

Understanding Code Flow

In general, the idea is that Edge developers check in their code to the internal Main branch. Code from Microsoft employees is joined by code pulled by the “pump” from the upstream Chromium project, with various sheriffs working around the clock to fix any merge conflicts between the upstream code pumped in and the code Microsoft engineers have added.

Every day, the Edge build team picks a cut-off point, compiles an optimized release build, runs it through an automated test gauntlet, and if the resulting build runs passably (e.g. the browser boots and can view some web pages without crashing), that build is blessed as the Canary and released to the public. Note that the quality of Canary might well be comically low (the browser might render entirely in purple, or have menu items that crash the browser entirely) but still be deemed acceptable for release.

The Canary channel, jokes aside, is named after the practice of bringing birds into mining tunnels deep underground. If a miner’s canary falls over dead, the miners know that the tunnel is contaminated by odorless but deadly carbon monoxide and they can run for fresh air immediately. (Compared to humans, canaries are much more sensitive to carbon monoxide and die at a much lower dose.) Grim metaphors aside, the Canary channel serves the same purpose– to discover crashes and problems before “regular” users encounter them. Firefox avoids etymological confusion and names its latest channel “Nightly.”

Every week or so, the Edge build team selects one of the week’s Canary releases and “promotes” it to the Dev branch. The selected build is intended to be one of the more reliable Canaries, with fewer major problems than we’d accept for any given Canary, but sometimes we’ll pick a build with a major problem that wasn’t yet noticed. When it goes out to the broader Dev population, Microsoft will often fix it in the next Canary build, but folks on the busted Dev build might have to wait a few days for the next Canary to Dev promotion. It’s for this reason that I run Canary as my daily driver rather than Dev.

Notably for Canary and Dev, the Edge team does not try to exactly match any given upstream Canary or Dev release. Sometimes, we’ll skip a Dev or Canary release when we don’t have a good build, or sometimes we’ll ship one when upstream does not. This means that sometimes (due to pump latency, “sometimes” is nearly “always”) an Edge Canary might have slightly different code than the same day’s Chrome Canary. Furthermore, due to how our code pump works, Edge Canary can even have slightly different code than Chromium’s for the exact same Chrome version number.

In contrast, for Stable, we aim to match upstream Chrome, and work hard to ensure that Version N of Edge has the same upstream changelists as the matching Version N of Chrome/Chromium. This means that anytime upstream ships or respins a new version of Stable, we will ship or respin in very short order.

In some cases, upstream Chromium engineers or Microsoft engineers might “cherry-pick” a fix into the Dev, Beta, or Stable branches to get it out to those more stable branches faster than the normal code-flow promotion. This is done sparingly, as it entails both effort and risk, but it’s a useful capability. If Chrome cherry-picks a fix into its Stable channel and respins, the Edge team does the same as quickly as possible. (This is important because many cherry-picks are fixes for 0-day exploits.)

Code Differences

As mentioned previously, the goal is that faster-updating channels reflect the exact same code as will soon flow into the more-stable, slower-updating channels. If you see a bug in Canary version N, that bug will end up in Stable version N unless it’s reported and fixed first. Other than a different icon and a mention on the edge://version page, it’s often hard to tell which channel is even being used.

However, it’s not quite true that the same build will behave the same way as it flows through the channels. A feature can be coded so that it works differently depending upon the channel.

For example, Edge has a “Domain Actions” feature to accommodate certain websites that won’t load properly unless sent a specific User-Agent header. When you visit a site on the list, Edge will apply a UA-string spoof to make the site work. You can see the list on edge://compat/useragent:

However, this Domain Actions list is applied only in Edge Stable and Beta channels and is not used in Edge Dev and Canary.

Edge rolls out features via a Controlled Feature Rollout process (I’ve written about it previously). The Experimental Configuration Server typically configures the “Feature Enabled” rate in pre-release channels (Canary and Dev in particular) to be much higher (e.g. 50% of Canary/Dev users will have a feature enabled, while 5% of Beta and 1% of Stable users will get it).

Similarly, there exist several “experimental” Extension APIs that are only available for use in the Dev and Canary channels. There are also some UI bubbles (e.g. warning the user about side-loaded “developer-mode” extensions) that are shown only in the Stable channel.

Chrome and Edge offer a UX to become the default browser, but this option isn’t shown in the Canary channel.

Individual features can also take channel into account to allow developer overrides and the like, but such feature overrides tend to be rather niche.

Thanks for helping improve the experience for everyone by self-hosting pre-Stable channels!

-Eric

Certificate Revocation in Microsoft Edge

When you visit a HTTPS site, the server must present a certificate, signed by a trusted third-party (a Certificate Authority, aka CA), vouching for the identity of the bearer. The certificate contains an expiration date, and is considered valid until that date arrives. But what if the CA later realizes that it issued the certificate in error? Or what if the server’s private key (corresponding to the public key in the certificate) is accidentally revealed?

Enter certificate revocation. Revocation allows the trusted third-party to indicate to the client that a particular certificate should no longer be considered valid, even if it’s unexpired.

There are several techniques to implement revocation checking, and each has privacy, reliability, and performance considerations. Back in 2011, I wrote a long post about how Internet Explorer handles certificate revocation checks.

Back in 2018, the Microsoft Edge team decided to match Chrome’s behavior by not performing online OCSP or CRL checks for most certificates by default.

Wait, What? Why?

The basic arguments are that HTTPS certificate revocation checks:

  • Impair performance (tens of milliseconds to tens of seconds in latency)
  • Impair privacy (CAs could log what you’re checking and know where you went)
  • Are too unreliable to hard-fail (too many false positives on downtime or network glitches)
  • Are useless against most threats when soft-fail (because an active MITM can block the check)

For more context about why Chrome stopped using online certificate revocation checks many years ago, see these posts from the Chromium team explaining their thinking:

Note: Revocation checks still happen

Chromium still performs online OCSP/CRL checks for Extended Validation certificates only, in soft-fail mode. If the check fails (e.g. offline OCSP responder) the certificate is just treated as a regular TLS certificate without the EV treatment. Users are very unlikely to ever notice because the EV treatment, now buried deep in the security UX, is virtually invisible. Notably, however, there is a performance penalty– if your Enterprise blackholes or slowly blocks access to a major CA’s OCSP responder, TLS connections from Chromium will be 🐢 very slow. Update: Chromium has announced that v106+ will no longer perform revocation checks for EV certificates.

Even without online revocation checks, Chromium performs offline checks in two ways.

  1. It calls the Windows Certificate API (CAPI) with an “offline only” flag, such that revocation checks consult previously-cached CRLs (e.g. if Windows had previously retrieved a CRL), and certificate distrust entries deployed by Microsoft.
  2. It plugs into CAPI an implementation of CRLSets, a Google/Microsoft deployed list of popular certificates that should be deemed revoked.

On Windows, Chromium uses the CAPI stack to perform revocation checks. I would expect this check to behave identically to the Internet Explorer check (which also relies on the Windows CAPI stack). Specifically, I don’t see any attempt to set dwUrlRetrievalTimeout away from the default. How CAPI2 certificate revocation works. Sometimes it’s useful to enable CAPI2 diagnostics.

CRLSets are updated via the Component Updater; if the PC isn’t ever on the Internet (e.g. an air-gapped network), the CRLSet will only be updated when a new version of the browser is deployed. (Of course, in an environment without access to the internet at large, revocation checking is even less useful.)

After Chromium moves to use its own built-in verifier, it will perform certificate revocation checks using its own revocation checker. Today, that checker supports only HTTP-sourced CRLs (the CAPI checker also supports HTTPS, LDAP, and FILE).

Group Policy Options

Chromium (and thus Edge and Chrome) support two Group Policies that control the behavior of revocation checking.

The EnableOnlineRevocationChecks policy enables soft-fail revocation checking for certificates. If the certificate does not contain revocation information, the certificate is deemed valid. If the revocation check does not complete (e.g. inaccessible CA), the certificate is deemed valid. If the certificate revocation check successfully returns that the certificate was revoked, the certificate is deemed invalid.

The RequireOnlineRevocationChecksForLocalAnchors policy allows hard-fail revocation checking for certificates that chain to a private anchor. A “private anchor” is not a public Certificate Authority, but rather something like the Enterprise root your company deployed to its PCs, for either its internal sites or its Monster-in-the-Middle (MITM) network traffic inspection proxy. If the certificate does not contain revocation information, the certificate is deemed invalid. If the revocation check does not complete (e.g. inaccessible CA), the certificate is deemed invalid. If the certificate revocation check successfully returns that the certificate was revoked, the certificate is deemed invalid.
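
On Windows, these are deployed like any other Chromium policy (via Group Policy, Intune, or a registry value). For example, assuming the usual Edge policy location in the registry, enabling the soft-fail policy would look something like:

HKEY_LOCAL_MACHINE\SOFTWARE\Policies\Microsoft\Edge

EnableOnlineRevocationChecks = 1 (REG_DWORD)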

Other browsers

Note: This section may be outdated!

Here’s an old survey of cross-browser revocation behavior.

By default, Firefox still queries OCSP servers for certificates that have a validity lifetime over 10 days. If you wish, you can require hard-fail OCSP checking by navigating to about:config and toggling security.OCSP.require to true. See this wiki for more details. Mozilla also distributes a CRLSet-like list of intermediates that should no longer be trusted, called OneCRL.

For the now-defunct Internet Explorer, you can set a Feature Control registry DWORD to convert the usual soft-fail into a slightly-less-soft fail:

HKEY_CURRENT_USER\SOFTWARE\Microsoft\Internet Explorer\Main\FeatureControl\FEATURE_WARN_ON_SEC_CERT_REV_FAILED

iexplore.exe=1

Edge Legacy did not have any option for non-silent failure for revocation checks.

New Recipes for 3rd Party Cookies

For privacy reasons, the web platform is moving away from supporting 3rd-party cookies, first with lockdowns, and eventually with removal of support in the second half of 2024 (pushed back from the originally-announced late 2023).

Background: What Does “3rd-Party” Mean?

A 3rd-party cookie is one that is set or sent from a 3rd-party context on a web page.

A 3rd-party context is a frame or resource whose registrable domain (sometimes called eTLD+1) differs from that of the top-level page. This is sometimes called “cross-site.” For example, consider a page served by domain1.com that embeds frames or images from domain2.com and domain3.com:

domain2.com and domain3.com are cross-site 3rd-parties to the parent page served by domain1.com. (In contrast, a resource from sub.domain1.com is cross-origin, but same-site/1st Party to domain1.com).

Importantly, frames or images[1] from domain2.com and domain3.com cannot see or modify the cookies in domain1.com‘s cookie jar, and script running at domain1.com cannot see or set cookies for the embedded domain2.com or domain3.com contexts.

Background: Existing Restrictions

Q: Why do privacy advocates worry about 3rd-party cookies?
A: Because they are a simple way to track a given user’s browsing across the web.

Say a bunch of unrelated sites include ads from an advertising server. A 3rd-party cookie set on the content from the ad will allow that ad server to identify the set of sites that the user has visited. For example, say the user visits pages on runnersworld.com, startrek.com, and gas-x.com, each of which carries an ad from the same ad server:

The advertiser, instead of simply knowing that their ad is running on Star Trek’s website, is also able to know that this specific user has previously visited sites related to running and a medication, and can thus target its advertisements in a way that the visitor may deem a violation of their privacy.

For this reason, browsers have supported controls on 3rd-party cookies for decades, but they were typically off-by-default or trivially bypassed.

More recently, browsers have started introducing on-by-default controls and restrictions, including the 2020 change that makes all cookies SameSite=Lax by default.

However, none of these restrictions will go as far as browsers will go in the future.

A Full Menu of Replacements

In order to support scenarios that have been built atop 3rd-party cookies for multiple decades, new patterns and technologies will be needed.

The Easy Recipe: CHIPS

In 2020, cookies were made SameSite=Lax by default, blocking cookies from being set and sent in 3rd-party contexts. The workaround for web developers who still needed cookies in 3rd-party contexts was simple: when setting a cookie, adding the attribute SameSite=None (along with Secure) disables the new behavior and allows the cookie to be set and sent freely. Over the course of the last two years, most sites that cared about their cookies began sending the attribute.

The CHIPS proposal (“Cookies having independent partitioned state”) offers a new but more limited escape hatch– a developer may opt-in to partitioning their cookie so that it’s no longer a “3rd party cookie”, it’s instead a partitioned cookie. A partitioned cookie set in the context of domain3.com embedded inside runnersworld.com will not be visible in the context domain3.com embedded inside startrek.com. Similarly, setting the cookie in the context domain3.com embedded inside gas-x.com will have no impact on the cookie’s value in the other two pages. If the user visits domain3.com as a top-level browser navigation, the cookies that were set on that origin’s subframes in the context of other top-level pages remain inaccessible.

Using the new Partitioned attribute is simple; just add it to your Set-Cookie header like so:

Set-Cookie: __Host-id=4d5e6; Partitioned; SameSite=None; Secure; Path=/;

Support for CHIPS is expected to be broad, across all major browsers.

I was initially a bit skeptical about requiring authors to explicitly specify the new attribute– why not just treat all cookies in 3rd-party contexts as partitioned? I eventually came around to the arguments that an explicit declaration is desirable. As it stands, legacy applications already needed to be updated with a SameSite=None declaration, so we probably wouldn’t keep any unmaintained legacy apps working if we didn’t require the attribute.

The Explicit Recipe: The Storage Access API

The Storage Access API allows a website to request permission to use storage in a 3rd party context. Microsoft Edge joined Safari and Firefox with support for this API in 2020 as a mechanism for mitigating the impact of the browser’s Tracking Prevention feature.

The Storage Access API has a lot going for it, but lack of universal support from major browsers means that it’s not currently a slam-dunk.
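
As a sketch of the developer-facing side, a cross-site iframe that needs its unpartitioned cookies might do something like this (the request generally must be made from a user gesture, and the browser may show a prompt or apply its own heuristics):

// Inside the embedded (3rd-party) frame:
async function ensureCookieAccess() {
  if (await document.hasStorageAccess()) {
    return true;                             // we already have access to our cookies
  }
  try {
    await document.requestStorageAccess();   // typically requires a user gesture
    return true;
  } catch (e) {
    return false;                            // the user or the browser declined
  }
}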

A Niche Recipe: First Party Sets

In some cases, the fact that cookies are treated as “3rd-party” represents a technical limitation rather than a legal or organizational one. For example, Microsoft owns xbox.com, office.com, and teams.microsoft.com, but these origins do not today share a common eTLD+1, meaning that pages from these sites are treated as cross-site 3rd-parties to one another. The First Party Sets proposal would allow sites owned and operated by a single-entity to be treated as first-party when it comes to privacy features.

Originally, a new cookie attribute, SameParty, would allow a site to request inclusion of a cookie when the cross-origin sub-resource’s context is in the same First Party Set as the top-level origin, but a recent proposal removes that attribute.

The Authentication Recipe: FedCM API

As I explained three years ago, authentication is an important use-case for 3rd-party cookies, but it’s hampered by browser restrictions on 3P cookies. The Federated Credential Management API proposes that browsers and websites work together to imbue the browser with awareness and control of the user’s login state on participating websites. As noted in Google’s explainer:

We expect FedCM to be useful to you only if all these conditions apply:

  1. You’re an identity provider (IdP).
  2. You’re affected by the third-party cookie phase out.
  3. Your Relying Parties are third-parties.

FedCM is a big, complex, and important specification that aims to solve exclusively authentication scenarios.
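
For the relying party, the proposed API surface is comparatively small; at the time of writing, a sign-in flow looks roughly like this sketch (the configURL and clientId values are placeholders published by the IdP, and the API shape may still change):

// Sketch: ask the browser to mediate federated sign-in via FedCM.
async function signInWithFedCM() {
  const credential = await navigator.credentials.get({
    identity: {
      providers: [{
        configURL: 'https://idp.example/fedcm.json',
        clientId: 'my-relying-party-client-id',
        nonce: 'server-generated-nonce'
      }]
    }
  });
  // The returned token is sent to the relying party's server, which validates it with the IdP.
  return credential.token;
}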

Complexity Abounds

The move away from supporting 3rd-party cookies has huge implications for how websites are built. Maintaining compatibility for desirable scenarios while meaningfully breaking support for undesirable scenarios (trackers) is inherently extremely challenging– I equate it to trying to swap out an airliner’s engines while the plane is full of passengers and in-flight.

Combinatorics

As we add multiple new approaches to address the removal of 3P cookies, we must carefully reason about how they all interact. Specifications need to define how the behavior of CHIPS, First-Party-Sets, and the Storage Access API all intersect, for example, and web developers must account for cases where a browser may support only some of the new features.

Cookies Aren’t The Only Type of Storage

Another complexity is that cookies aren’t the only form of storage– IndexedDB, localStorage, sessionStorage, and various other cookie-like storages all exist in the web platform. Limiting only cookies without accounting for other forms of storage wouldn’t get us to where we want to be on privacy.

That said, cookies are one of the more interesting forms of storage when it comes to privacy, as they

  1. are sent to the server before the page loads,
  2. operate without JavaScript enabled,
  3. operate in cases like <img> elements where no script-execution context exists
  4. etc.

Cookies Are Special

Another interesting aspect of migrating scenarios away from cookies is that we lose some of the neat features that have been added over the years.

One such feature is the HTTPOnly declaration, which prevents a cookie from being accessible to JavaScript. This feature was designed to blunt the impact of a cross-site scripting attack — if script injected into a compromised page cannot read a cookie, that cookie cannot be leaked out to a remote attacker. The attacker is forced to abuse the XSS’d page immediately (“a sock-puppet browser”) limiting the sorts of attacks that can be undertaken. Some identity providers demand that their authentication tokens be carried only via HTTPOnly cookies, and if an authentication token must be available to JavaScript directly, the provider mints that token with a much shorter validity lifetime (e.g. one hour instead of one week).
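
For example, an identity provider might mint its week-long session cookie along these lines (the name and value are illustrative):

Set-Cookie: __Host-session=0a1b2c3d; Secure; HttpOnly; Path=/; SameSite=Lax; Max-Age=604800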

Another cookie feature is TLS Token Binding, an obscure capability that attempts to prevent token theft attacks from compromised PCs. If malware or a malicious insider steals Token-bound cookie data directly from a PC, that cookie data will not work from another device because the private key material used to authenticate the cookies cannot be exported off of the compromised client device. (This non-exportability property is typically enforced by security hardware like a TPM.) While Token binding provides a powerful and unique capability for cookies, for various reasons the feature is not broadly supported.

Deprecating 3rd-Party Cookies is Not a Panacea

Unfortunately, getting rid of 3rd-party cookies doesn’t mean that we’ll be rid of tracking. There are many different ways to track a user, ranging from the obvious (they’re logged in to your site, they have a unique IP address) to the obscure (various fingerprinting mechanisms). But getting rid of 3rd-party cookies is a valuable step as browser makers work to engineer a privacy sandbox into the platform.

It’s a fascinating time in the web platform privacy space, and I can’t wait to see how this all works out.

-Eric

[1] Interestingly, if domain1.com includes a <script> element pointed at a resource from domain2.com or domain3.com, that script will run inside domain1.com‘s context, such that calls to the document.cookie DOM property will return the cookies for domain1.com, not the domain that served the script. But that’s not important for our discussion here.