Meddler

While there are many different ways for servers to stream data to clients, the Server-sent Events / EventSource Interface is one of the simplest. Your code simply creates an EventSource and then subscribes to its onmessage callback:

Implementing the server side is almost as simple: your handler just prefaces each piece of data it wants to send to the client with the string data: and ends it with a double line-ending (\n\n). Easy peasy. You can see the API in action in this simple demo.

I’ve long been sad that we didn’t manage to get this API into Internet Explorer or the Legacy Edge browser. While many polyfills for the API exist, I was happy that we finally have EventSource in the new Edge.

Yay! \o/

Alas, I wouldn’t be writing this post if I hadn’t learned something new yesterday.

Last week, a customer reached out to complain that the new Edge and Chrome didn’t work well with their webmail application. After they used the webmail site for a some indeterminate amount time, they noticed that its performance slowed to a crawl– switching between messages would take tens of seconds or longer, and the problem reproduced regardless of the speed of the network. The only way to reliably resolve the problem was to either close the tabs they’d opened from the main app (e.g. the individual email messages could be opened in their own tabs) or to restart the browser entirely.

As the networking PM, I was called in to figure out what was going wrong over video conference. I instructed the user to open the F12 Developer Tools and we looked at the network console together. Each time the user clicked on a message, new requests were created and sat in the (pending) state for a long time, meaning that the requests were getting queued and weren’t even going to the network promptly.

But why? Diagnosing this remotely wasn’t going to be trivial, so I had the user generate a Network Export log that I could examine later.

In examining the log using the online viewer, the problem became immediately clear. On the Sockets tab, the webmail’s server showed 19 requests in the Pending state, and 6 Active connections to the server, none of which were idle. The fact that there were six connections strongly suggested that the server was using HTTP/1.1 rather than HTTP/2, and a quick look at the HTTP/2 tab confirmed it. Looking at the Events tab, we see five outstanding URLRequests to a URL that strongly suggests that it’s being used as an EventSource:

Each of these sockets is in the READING_RESPONSE state, and each has returned just ten bytes of body data to each EventSource. The web application is using one EventSource instance of the app, and the user has five tabs open to the app.

And now everything falls into place. Browsers limit themselves to 6 concurrent connections per server. When the server supports HTTP/2, browsers typically need just one connection because HTTP/2 supports multiplexing many (typically 100) streams onto a single connection. HTTP/1.1 doesn’t afford that luxury, so every long-lived connection used by a page decrements the available connections by one. So, for this user, all of their network traffic was going down a single HTTP/1.1 connection, and because HTTP/1.1 doesn’t allow multiplexing, it means that every action in the UI was blocked on a very narrow head-of-line-blocking pipe.

Looking in the Chrome bug tracker, we find this core problem (“SSE connections can starve other requests”) resolved “By Design” six years ago.

Now, I’m always skeptical when reading old bugs, because many issues are fixed over time, and it’s often the case that an old resolution is no longer accurate in the modern world. So I built a simple repro script for Meddler. The script returns one of four responses:

  • An HTML page that consumes an EventSource
  • An HTML page containing 15 frames pointed at the previous HTML page
  • An event source endpoint (text/event-stream)
  • A JPEG file (to test whether connection limits apply across both EventSources and other downloads)

And sure enough, when we load the page we see that only six frames are getting events from the EventSource, and the images that are supposed to load at the bottom of the frames never load at all:

Similarly, if we attempt to load the page in another tab, we find that it doesn’t even load, with a status message of “Waiting for available socket…”

The web app owners should definitely enable HTTP/2 on their server, which will make this problem disappear for almost all of their users.

However, even HTTP/2 is not a panacea, because the user might be behind a “break-and-inspect” proxy that downgrades connections to HTTP/1.1, or the browser might conceivably limit parallel requests on HTTP/2 connections for slow networks. As noted in the By Design issue, a server depending on EventSource in multiple tabs might use a BroadcastChannel or a SharedWorker to share a single EventSource connection with all of the tabs of the web application.

Alternatively, swapping an EventSource architecture with one based on WebSocket (even one that exposes itself as a EventSource polyfill) will also likely resolve the problem. That’s because, even if the client or server doesn’t support routing WebSockets over HTTP/2, the WebSockets-Per-Host limit is 255 in Chromium and 200 in Firefox.

Stay responsive out there!

-Eric

Many classic Windows APIs accept a pointer to a byte buffer and a pointer to an integer indicating the size of the buffer. If the buffer is large enough to hold the data returned from the API, the buffer is filled and the API returns S_OK. If the buffer supplied is not large enough to hold all of the data, the API instead returns ERROR_INSUFFICIENT_BUFFER, updating the supplied integer with the length of the buffer required. The client is expected to reallocate a new buffer of the specified size and call the API again with the new buffer and length.

For example, the InternetGetCookieEx function, used to query the WinINET networking stack for cookies for a given URL, is one such API. The GetExtendedTcpTable function, used to map sockets to processes, is another.

The advantage of APIs with this form is that you can call the API with a reasonably-sized stack buffer and avoid the cost of a heap allocation unless the stack buffer happens to be too small.

In the case of Internet Explorer and Edge, the document.cookie DOM API getter’s implementation first calls the InternetGetCookieEx API with a 1024 WCHAR buffer. If the buffer is big enough, the cookie string is then immediately returned to the page.

However, if ERROR_INSUFFICIENT_BUFFER is returned instead (and if the size needed is 10240 characters (MAX_COOKIE_LEN) or fewer), the API will allocate a new buffer on the heap and call the API again. If the API succeeds, the cookie string is returned to the page, otherwise if any error is returned, an empty string is returned to the page.

Wait. Do you see the problem here?

It’s tempting to conclude that the document.cookie API doesn’t need to be thread-safe–JavaScript that touches the DOM runs in one thread, the UI thread. But cookies are a form of data storage that is available across multiple threads and processes. For instance, subdownload network requests for the page’s resources can be manipulating the cookie store in parallel, and if I happen to have multiple tabs or windows open to the same site, they’ll be interacting with the same cookie jar.

So, consider following scenario: The document.cookie implementation calls InternetGetCookieEx but gets back ERROR_INSUFFICIENT_BUFFER with a required size of 1200 bytes. The implementation dutifully allocates a 1200 byte buffer, but before it gets the chance to call InternetGetCookieEx again, an image on the page sets a new 4 byte cookie which WinINET puts in the cookie jar. Now, when InternetGetCookieEx is called again, it again returns ERROR_INSUFFICIENT_BUFFER because the required buffer is now 1204 characters. Because document.cookie isn’t using any sort of loop-until-success, it returns an empty cookie string.

Now, this is all fast native code (C/C++), so surely this sort of thing is just theoretical… it can’t really happen on a fast computer, right?

Around ten years ago, I showed how you can use Meddler to easily generate a lot of web traffic for testing browsers. Meddler is a simple web server that has a simple GUI code editor slapped on the front (most developers would use node.js or Go for such tasks). I quickly threw together a tiny little MeddlerScript which exercises cookies by loading cookie-setting images in a loop and monitoring the document.cookie API to see if it ever returns an empty string.

Boy, does it ever. On my i7 machines, it usually only takes a few seconds to run into the buggy case where document.cookie returns an empty string.

Failure

I haven’t gone back to check the history, but I suspect this IE/Edge bug is at least fifteen years old.

After confirming this bug, it felt strangely familiar, as if I’d hit this landmine before. Then, as I was writing this post, I realized when… Back in 2011, I shared the C# code Fiddler uses for mapping a socket to a process. That code relies on the GetExtendedTcpTable API, which has the same reallocate-then-reinvoke design. Fortunately, I’d fixed the bug a few weeks later in Fiddler, but it looks like I never updated my blog post (sorry about that).

-Eric

PS: Unrelated, but one more pitfall to be aware of: InternetGetCookieExW has a truly bizarre shape, in that the lpdwSize argument is a pointer to a count of wide characters, but if ERROR_INSUFFICIENT_BUFFER is returned, the size argument is set to the count of bytes required.

PPS: As of Windows 10 RS3, Edge (and IE) support 180 cookies per domain to match Chrome, but the network stack will skip setting or sending individual cookies with a value over 5120 bytes.