Improving Native Message Host Reliability on Windows

Last Update: Mar 22, 2023

Previously, I’ve written about Chromium’s Native Messaging functionality that allows a browser extension to talk to a process running outside of the browser’s sandbox, and I shared a Native Messaging Debugger I wrote that allows you to monitor (and even tamper with) the communication channels between the browser extension and the Host App.

Obscure Problems on Windows

Native Messaging is a powerful capability, and a common choice for building extensions that need to interact with the rest of the system. However, over the years, users have reported a trail of bugs related to how the feature is implemented on Windows. While these bugs are typically only seen in uncommon configurations, they could break Native Messaging entirely for some users.

Some examples include:

  • crbug/335558 – Ampersand in Host’s path prevents launching
  • crbug/387228 – Broken if %comspec% not pointed at cmd.exe
  • crbug/387233 – Broken when cmd.exe is disabled or set to RUNASADMIN

While the details of each of these issues differ, they all have the same root cause: on Windows, Chromium did not launch Native Message Hosts directly. Instead, it launched cmd.exe (Windows’ console command prompt) and directed it to launch the target Host.

This approach provided two benefits: it enabled developers to implement Hosts using languages like Python, whose scripts are not directly executable in Windows, and it enabled support for Windows XP, where the APIs did not allow Chromium to easily set up the communication channel between the browser and the Native Host.

Unfortunately, the cmd-in-the-middle design meant that anything that prevented cmd.exe from running (387233, 387228) or that prevented it from starting the Host (335558) would cause the flow to fail. While these configurations tend to be uncommon (which is why the problems have existed for ten years), they also tend to be very hard to recognize and diagnose, and the impacted customers often have little recourse short of abandoning the extension platform.
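To make two of those failure conditions concrete, here is a minimal sketch of a pre-flight check that a support tool might run before blaming the extension. It is written in Python for brevity and only covers the two conditions that can be read from strings (an ampersand in the Host's path, and %comspec% pointing somewhere other than cmd.exe). The helper name `legacy_launch_risks` is hypothetical, and detecting a disabled or RUNASADMIN cmd.exe is deliberately left out.

```python
import ntpath


def legacy_launch_risks(host_path, comspec):
    """List reasons the old cmd.exe-based launch might fail (illustrative only)."""
    risks = []
    # crbug/335558: an ampersand in the Host's path prevented launching.
    if "&" in host_path:
        risks.append("host path contains an ampersand: " + host_path)
    # crbug/387228: the launch broke if %comspec% was not pointed at cmd.exe.
    if ntpath.basename(comspec).lower() != "cmd.exe":
        risks.append("%comspec% does not point at cmd.exe: " + comspec)
    return risks
```

An empty list means neither of these two conditions applies; it does not prove that the old launch flow would have succeeded.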

The Fix

So, over a few nights and weekends, I landed a changelist in Chromium to improve this scenario for version 113.0.5656 and later. This change means that Chrome, Edge, and other Chromium-derived browsers will now directly invoke any Native Host that is a Windows Executable (.exe) rather than going through cmd.exe.

Native Hosts that are not implemented by executables (e.g. Python scripts or the like) will continue to use the old codepath.
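The split between the two codepaths can be pictured with a tiny sketch. This is not Chromium's code: `launch_mode` is a hypothetical helper, and the case-insensitive extension comparison is an assumption on my part.

```python
import ntpath


def launch_mode(host_path):
    """Return "direct" for .exe Hosts, otherwise "cmd.exe" (illustrative only)."""
    _, extension = ntpath.splitext(host_path)
    # Assumption: the extension is compared without regard to case.
    return "direct" if extension.lower() == ".exe" else "cmd.exe"
```

Under this sketch, a Host registered as a .bat wrapper or a script stays on the old path, which matters for the behavior changes described in the rest of this post.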

I’ve got my fingers crossed that effectively no one even notices this change, with the exception of those unfortunate users who were encountering the old bugs. However, this change also fixes two other bugs that were caused by the cmd-in-the-middle flow, and those fixes could cause problems if your Windows executable does not follow the expected behavior for Native Hosts.

(In)Visibility

When Chromium launches a native host, it sets a start_hidden flag to prevent any UI from popping up from the host. That flag prevents the proxy cmd.exe’s UI window (conhost.exe) from appearing on the screen. This start_hidden flag means that console-based (subsystem:console) Windows applications remain invisible during native-messaging communications. However, the start_hidden flag didn’t flow through to non-console applications (e.g. subsystem:Windows), like my Native Messaging Debugger application, which is built atop C#’s WinForms and meant to be seen by the user.

The new Direct Launch for Executables flow changes this: Windows .exe files are now started hidden, meaning that they’re not visible to the user by default. Surprisingly, this might not be obvious to the application’s code; for example, checking frmMain.Visible in my WinForms startup code still returned true even though the window was not visible.

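Fixing this in my Host was simple: I just explicitly call ShowWindow in the form’s Load event handler:

// Requires: using System; using System.Runtime.InteropServices;
private const int SW_SHOW = 5;
[DllImport("user32.dll")]
private static extern bool ShowWindow(IntPtr hWnd, int nCmdShow);

// Pass the handle as an IntPtr so it is not truncated in a 64-bit process.
ShowWindow(this.Handle, SW_SHOW);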

The Terminator

When a Native Host is no longer needed, because the Extension’s sendNativeMessage() got a reply from the Host, or the disconnect() method was called (either explicitly or during garbage collection) on the port returned from connectNative(), Chromium shuts down the Native Host connection. First, it closes the stdin and stdout pipes that it established to communicate with the new process. Then, it checks whether the new process has exited itself (typical), and if not, sets up a timer to call Windows’ TerminateProcess() two seconds later if the Host is still running.
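A Host that wants to exit on its own, well inside that two-second window, can treat end-of-file on stdin as its cue to leave. Below is a minimal sketch of that pattern, assuming the documented Native Messaging framing (a 4-byte length in native byte order followed by UTF-8 JSON). It is in Python for brevity, so a script like this would itself still be launched through the old cmd.exe codepath; the same loop shape applies to an .exe Host. The names `EOF`, `read_message`, `write_message`, and `run_host` are illustrative, not part of any API.

```python
import io
import json
import struct

EOF = object()  # sentinel, so a JSON null message is not mistaken for a closed pipe


def read_message(stream):
    """Return the next decoded message, or EOF once the pipe has closed."""
    header = stream.read(4)
    if len(header) < 4:
        return EOF  # the browser closed its end of the pipe
    (length,) = struct.unpack("=I", header)  # native byte order, 4 bytes
    body = stream.read(length)
    if len(body) < length:
        return EOF  # pipe closed in the middle of a message
    return json.loads(body.decode("utf-8"))


def write_message(stream, message):
    """Frame one message and write it to the given binary stream."""
    body = json.dumps(message).encode("utf-8")
    stream.write(struct.pack("=I", len(body)) + body)
    stream.flush()


def run_host(stdin, stdout):
    """Echo each message back until stdin closes; return how many were handled."""
    handled = 0
    while True:
        message = read_message(stdin)
        if message is EOF:
            return handled  # returning lets the process exit promptly
        write_message(stdout, {"echo": message})
        handled += 1
```

A real Host would call `run_host(sys.stdin.buffer, sys.stdout.buffer)` from its entry point and then simply return, leaving the STDIO handles for the operating system to close at process exit.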

In the cmd.exe flow, this process termination was effectively a no-op, and a Host that did not self-terminate was always left running. You can see this with my Native Messaging Debugger app — the pipes close, but the UI remains alive.

In the new direct launch flow, the Host is reliably terminated two seconds after the pipes disconnect, if-and-only-if Chromium is still running. (If the Host was disconnected because Chromium itself is exiting entirely, the Host’s pipes are detached but the process is not terminated… likely an existing bug where Chromium’s two-second callback is aborted during shutdown.)

While this is the intended design (preventing process leaks), unfortunately I’m not aware of an easy way for a Host that doesn’t want to exit to keep itself alive. Unlike many shutdown-type events, Windows does not allow a process to “decline” termination… it’s just there one moment and gone the next¹. A Windows process can use a DACL trick to deny Process Terminate rights to non-elevated applications that obtain a handle to the process, but unfortunately this isn’t sufficient here, because Chromium gets a handle without this restriction as it launches the Host, before the Host process has a chance to protect itself. If your App truly needs to outlive the browser itself, you could either launch it via a .bat file (which keeps it on the old cmd.exe codepath), or you could have your Native Host itself be a stub that acts as an IPC proxy to the rest of your App.

If you encounter a scenario that visibly changes with this new flow in Chromium browsers v113.0.5656 and later, please let me know ASAP!

Update: Mar 22, 2023: The developer of a popular extension found another behavior change in the new codepath that caused their extension to unexpectedly stop working. It’s a subtle issue, and hopefully theirs is the only one that will hit it.

What happened?

With the new launch flow, if your Host has an outstanding Read() on the Standard Input (stdin) handle and you attempt to close that handle:

// Don't do this!
CloseHandle(GetStdHandle(STD_INPUT_HANDLE));

…that function will now block until the Read() operation completes. If you issue this CloseHandle() call on the UI thread, your Host will hang until Chromium gets around to terminating the process, which could cause problems if your Host expected to perform any other cleanup after disconnecting.

The best fix for this issue is to simply not call CloseHandle(), because you don’t need to! All three STDIO handles will be correctly closed when your process exits in a few seconds anyway, so there’s no need to manually close the handle yourself.

If you really want to manually close the handle, you can first call CancelIoEx(GetStdHandle(STD_INPUT_HANDLE), NULL); before calling CloseHandle(), but to reiterate, there’s really no good reason to bother closing the handle yourself.

-Eric

¹ It took me some time to actually figure out what was happening here. My Native Messaging Debugger app started disappearing after the pipes closed, and I didn’t know why. I assumed that an unhandled exception must be silently crashing my app. I finally figured out what was happening using the awesome Silent Process Exit debugger option inside gflags.

Published by ericlaw

Impatient optimist. Dad. Author/speaker. Created Fiddler & SlickRun. PM @ Microsoft 2001-2012, and 2018-2022, working on Office, IE, and Edge. Now a SWE on Microsoft Defender Web Protection. My words are my own, I do not speak for any other entity.

One thought on “Improving Native Message Host Reliability on Windows”

  1. Wow, I definitely need to check this on work computers now. OWA S/MIME has always been problematic on DoD computers since the giant move to M365. Cmd.exe tends to be disabled as a baseline config, so most people give up using OWA S/MIME and just use the desktop app. Thanks for another great post (and the Chromium update)!
