troubleshooting

Previously, I’ve described how to capture a network traffic log from Microsoft Edge, Google Chrome, and applications based on Chromium or Electron.

In this post, I aim to catalog some guidance for looking at these logs to help find the root cause of captured problems and otherwise make sense of the data collected.

Last Update: April 24, 2020I expect to update this post over time as I continue to gain experience in analyzing network logs.

Choose A Viewer – Fiddler or Catapult

After you’ve collected the net-export-log.json file using the about:net-export page in the browser, you’ll need to decide how to analyze it.

The NetLog file format consists of a JSON-encoded stream of event objects that are logged as interesting things happen in the network layer. At the start of the file there are dictionaries mapping integer IDs to symbolic constants, followed by event objects that make use of those IDs. As a consequence, it’s very rare that a human will be able to read anything interesting from a NetLog.json file using just a plaintext editor or even a JSON parser.

The most common (by far) approach to reading NetLogs is to use the Catapult NetLog Viewer, a HTML/JavaScript application which loads the JSON file and parses it into a much more readable set of events.

An alternative approach is to use the NetLog Importer for Telerik Fiddler.

Importing NetLogs to Fiddler

For Windows users who are familiar with Fiddler, the NetLog Importer extension for Fiddler is easy-to-use and it enables you to quickly visualize HTTP/HTTPS requests and responses. The steps are easy:

  1. Install the NetLog Importer,
  2. Open Fiddler, ideally in Viewer mode fiddler.exe -viewer
  3. Click File > Import > NetLog JSON
  4. Select the JSON file to import

In seconds, all of the HTTP/HTTPS traffic found in the capture will be presented for your review. If the log was compressed before it was sent to you, the importer will automatically extract the first JSON file from a chosen .ZIP or .GZ file, saving you a step.

In addition to the requests and responses parsed from the log, there are a number of pseudo-Sessions with a fake host of NETLOG that represent metadata extracted from the log:

These pseudo-sessions include:

  • RAW_JSON contains the raw constants and event data. You probably will never want to examine this view.
  • CAPTURE_INFO contains basic data about the date/time of the capture, what browser and OS version were used, and the command line arguments to the browser.
  • ENABLED_EXTENSIONS contains the list of extensions that are enabled in this browser instance. This entry will be missing if the log was captured using the –log-net-log command line argument.
  • URL_REQUESTS contains a dictionary mapping every event related to URL_REQUEST back to the URL Requests to which it belongs. This provides a different view of the events that were used in the parsing of the Web Sessions added to the traffic list.
  • SECURE_SOCKETS contains a list of all of the HTTPS sockets that were established for network requests, including the certificates sent by the server and the parameters requested of any client certificates. The Server certificates can be viewed by saving the contents of a –BEGIN CERTIFICATE– entry to a file named something.cer. Alternatively, select the line, hit CTRL+C, click Edit > Paste As Sessions, select the Certificates Inspector and press its Content Certificates button.

You can then use Fiddler’s UI to examine each of the Web Sessions.

Limitations

The NetLog format currently does not store request body bytes, so those will always be missing (e.g. on POST requests).

Unless the Include Raw Bytes option was selected by the user collecting the capture, all of the response bytes will be missing as well. Fiddler will show a “dropped” notice when the body bytes are missing:

If the user did not select the Include Cookies and Credentials option, any Cookie or Authorization headers will be stripped down to help protect private data:

Scenario: Finding URLs

You can use Fiddler’s full text search feature to look for URLs of interest if the traffic capture includes raw bytes. Otherwise, you can search the Request URLs and headers alone.

On any session, you can use Fiddler’s “P” keystroke (or the Select > Parent Request context menu command) to attempt to walk back to the request’s creator (e.g. referring HTML page).

You can look for the traffic_annotation value that reflects why a resource was requested by looking for the X-Netlog-Traffic_Annotation Session Flag.

Scenario: Cookie Issues

If Fiddler sees that cookies were not set or sent due to features like SameSiteByDefault cookies, it will make a note of that in the Session using a psuedo $NETLOG-CookieNotSent or $NETLOG-CookieNotSet header on the request or response:

Closing Notes

If you’re interested in learning more about this extension, see announcement blog post and the open-source code.

While the Fiddler Importer is very convenient for analyzing many types of problems, for others, you need to go deeper and look at the raw events in the log using the Catapult Viewer.


Viewing NetLogs with the Catapult Viewer

Opening NetLogs with the Catapult NetLog Viewer is even simpler:

  1. Navigate to the web viewer
  2. Select the JSON file to view

If you find yourself opening NetLogs routinely, you might consider using a shortcut to launch the Viewer in an “App Mode” browser instance: msedge.exe --app=https://netlog-viewer.appspot.com/#import

The App Mode instance is a standalone window which doesn’t contain tabs or other UI:

Note that the Catapult Viewer is a standalone HTML application. If you like, you can save it as a .HTML file on your local computer and use it even when completely disconnected from the Internet. The only advantage to loading it from appspot.com is that the version hosted there is updated from time to time.

Along the left side of the window are tabs that offer different views of the data– most of the action takes place on the Events tab.

Tips

If the problem only exists on one browser instance, check the Command Line parameters and Active Field trials sections on the Import tab to see if there’s an experimental flag that may have caused the breakage. Similarly, check the Modules tab to see if there are any browser extensions that might explain the problem.

Each URL Request has a traffic_annotation value which is a hash you can look up in annotations.xml. That annotation will help you find what part of Chromium generated the network request:

Most requests generated by web content will be generated by the blink_resource_loader, navigations will have navigation_url_loader and requests from features running in the browser process are likely to have other sources.

Scenario: DNS Issues

Look at the DNS tab on the left, and HOST_RESOLVER_IMPL_JOB entries in the Events tab.

One interesting fact: The DNS Error page performs an asynchronous probe to see whether the configured DNS provider is working generally. The Error page also has automatic retry logic; you’ll see a duplicate URL_REQUEST sent shortly after the failed one, with the VALIDATE_CACHE load flag added to it. In this way, you might see a DNS_PROBE_FINISHED_NXDOMAIN error magically disappear if the user’s router’s DNS flakes.

Scenario: Cookie Issues

Look for COOKIE_INCLUSION_STATUS events for details about each candidate cookie that was considered for sending or setting on a given URL Request. In particular, watch for cookies that were excluded due to SameSite or similar problems.

Scenario: HTTPS Handshaking Issues

Look for SSL_CONNECT_JOB and CERT_VERIFIER_JOB entries. Look at the raw TLS messages on the SOCKET entries.

Scenario: HTTPS Certificate Issues

Note: While NetLogs are great for capturing certs, you can also get the site’s certificate from the browser’s certificate error page.

The NetLog includes the certificates used for each HTTPS connection in the base64-encoded SSL_CERTIFICATES_RECEIVED events.

You can just copy paste each certificate (including ---BEGIN to END---) out to a text file, name it log.cer, and use the OS certificate viewer to view it. Or you can use Fiddler’s Inspector, as noted above.

If you’ve got Chromium’s repo, you can instead use the script at \src\net\tools\print_certificates.py to decode the certificates. There’s also a cert_verify_tool in the Chromium source you might build and try. For Mac, using verify-cert to check the cert and dump-trust-settings to check the state of the Root Trust Store might be useful.

In some cases, running the certificate through an analyzer like https://crt.sh/lintcert can flag relevant problems.

Scenario: Authentication Issues

Look for HTTP_AUTH_CONTROLLER events, and responses with the status codes 401, 403, and 407.

For instance, you might find that the authentication fails with ERR_INVALID_AUTH_CREDENTIALS unless you enable the browser’s DisableAuthNegotiateCnameLookup policy (Kerberos has long been very tricky).

Scenario: Debugging Proxy Configuration Issues

See Debugging the behavior of Proxy Configuration Scripts.


Got a great NetLog debugging tip I should include here? Please leave a comment and teach me!

-Eric

While lately I’ve been endlessly streaming the latest news with horrified fascination, this morning my calendar unexpectedly popped up a reminder set over a year ago… Today is the fifteenth anniversary of my big-league blogging debut on the Internet Explorer Team’s blog.

My first post there, “A HTTP Detective Story” remains one of my favorites. Sadly, its topic feels all too familiar: a website took a dependency upon a browser quirk, the Referer and User-Agent headers were critical elements of the repro, and I used Fiddler to root-cause the problem. I’ve learned (and shared) so much more about these topics over the last fifteen years, and I appreciate the readers who’ve followed me from the IEBlog to IEInternals (237 posts) to Telerik’s Fiddler blog (41 posts) to my book, to this, my newest blog, TextSlashPlain (189 posts and counting!).

Here’s to learning and sharing for the next 15 years!

With gratitude,

-Eric

For a small number of users of Chromium-based browsers (including Chrome and the new Microsoft Edge) on Windows 10, after updating to 78.0.3875.0, every new tab crashes immediately when the browser starts.

Impacted users can open as many new tabs as they like, but each will instantly crash:

EveryTabCrashes

EdgeHavingAProblem

As of Chrome 81.0.3992, the page will show the string Error Code: STATUS_INVALID_IMAGE_HASH.

What’s going wrong?

This problem relates to a security/reliability improvement made to Chromium’s sandboxing. Chromium runs each of the tabs (and extensions) within locked down (“sandboxed”) processes:

JAIL

In Chrome 78, a change was made to prevent 3rd-party code from injecting itself into these sandboxed processes. 3rd-party code is a top source of browser reliability and performance problems, and it has been a longstanding goal for browser vendors to get this code out of the web platform engine.

This new feature relies on setting a Windows 10 Process Mitigation policy that instructs the OS loader to refuse to load binaries that aren’t signed by Microsoft. Edge 13 enabled this mitigation in 2015, and the Chromium change brings parity to the new Edge 78+. Notably, Chrome’s own DLLs aren’t signed by Microsoft so they are specially exempted by the Chromium sandboxing code.

Unfortunately, the impact of this change is that the renderer is killed (resulting in the “Aw snap” page) if any disallowed DLL attempts to load, for instance, if your antivirus software attempts to inject its DLLs into the renderer processes. For example, Symantec Endpoint Protection versions before 14.2 are known to trigger this problem.

If you encounter this problem, you should follow the following steps:

Update any security software you have to the latest version.

Other than malware, security software is the other likely cause of code being unexpectedly injected into the renderers.

Temporarily disable the new protection

You can temporarily launch the browser without this sandbox feature to verify that it’s the source of the crashes.

  1. Close all browser instances (verify that there are no hidden chrome.exe or msedge.exe processes using Task Manager)
  2. Use Windows+R to launch the browser with the command line override:
  msedge.exe --disable-features=RendererCodeIntegrity

or

  chrome.exe --disable-features=RendererCodeIntegrity

Ensure that the tab processes work properly when code integrity checks are disabled.

If so, you’ve proven that code integrity checks are causing the crashes.

Hunt down the culprit

Navigate your browser to the URL chrome://conflicts#R to show the list of modules loaded by the client. Look for any files that are not Signed By Microsoft or Google.

If you see any, they are suspects. (There will likely be a few listed as Shell Extensions; e.g. 7-Zip.dll, that do not cause this problem)– check for an R in the Process types column to find modules loading in the Renderers.

You should install any available updates for any of your suspects to see if doing so fixes the problem.

Check the Event Log

The Windows Event Log will contain information about modules denied loading. Open Event Viewer. Expand Applications and Services Logs > Microsoft > Windows > CodeIntegrity > Operational and look for events with ID 3033. The detail information will indicate the name and location of the DLL that caused the crash:CodeIntegrity

Optional: Use Enterprise Policy to disable the new protection

If needed, IT Adminstrators can disable the new protection using the RendererCodeIntegrity policy for Chrome and Edge. You should outreach to the software vendors responsible for the problematic applications and request that they update them.

Other possible causes

Note that it’s possible that you could have a PC that encounters symptoms like this (all subprocesses crash) but not a result of the new code integrity check. In such cases, the Error Code on the crash page will be something other than STATUS_INVALID_IMAGE_HASH.

  • For instance, Chromium once had an obscure bug in its sandboxing code that caused all sandboxes to crash depending on the random memory mapping of Address Space Layout Randomization.
  • Similarly, Chrome and Edge still have an active bug where all renderers crash on startup if the PC has AppLocker enabled and the browser is launched elevated (as Administrator).

-Eric

Sometimes a site will not load by default but it works just fine in InPrivate mode or when loaded in a different browser profile. In many such cases, this means there’s a bug in the website where they’ve set a cookie but fail to load when that cookie is sent back.

This might happen, for instance, if a site set a ton of cookies over time but the server has a request length limit. After the cookies build up through the course of normal browsing, the 16k header limit is exceeded and the server rejects all further requests by dropping the connection or showing a message like Bad Request – Request too long, Bad Request – The size of the request headers is too long, or ERR_TOO_MANY_REDIRECTS. This problem is unfortunately common for large web properties, including Microsoft.com.

Fortunately, it’s often easy to fix this problem in the new Edge (and Chrome).

Delete Cookies for the Current Site

On the error page, click the icon next to the address bar and see whether there are Cookies in use:

ClearSiteCookies1

If so, click the item to open the Cookies in use screen. In the box that appears, select each server name and click the Remove button at the bottom to remove the cookies set for that server:

ClearSiteCookies2

After you remove all of the cookies, click the Done button and try reloading the page.

-Eric

PS: If you have a server hitting this problem, use fewer/smaller cookies, and reconfigure your server to accept larger headers. Reconfiguration might be required to use your server if your server uses Active Directory authentication and your visitors have large AD tokens.

Sometimes, when you try to load a HTTPS address in Chrome, instead of the expected page, you get a scary warning, like this one:

image

Chrome has found a problem with the security of the connection and has blocked loading the page to protect your information.

In a lot of cases, if you’re just surfing around, the easiest thing to do is just find a different page to visit. But what happens if this happens on an important site that you really need to see? You shouldn’t just “click through” the error, because this could put your device or information at risk.

In some cases, clicking the ADVANCED link might explain more about the problem. For instance, in this example, the error message says that the site is sending the wrong certificate; you might try finding a different link to the site using your favorite search engine.

image

Or, in this case, Chrome explains that the certificate has expired, and asks you to verify that your computer clock’s Date and Time are set correctly:

image

You can see the specific error code in the middle of the text:

image

Some types of errors are a bit more confusing. For instance, NET::ERR_CERT_AUTHORITY_INVALID means that the site’s certificate didn’t come from a company that your computer is configured to trust.

image

Google Internet Authority G3?

If the root certificate is from Google Internet Authority G3, see this article.

Errors Everywhere?

What happens if you start encountering errors like this on every HTTPS page that you visit, even major sites like https://google.com?

In such cases, this often means that you have some software on your device or network that is interfering with your secure connections. Sometimes this software is well-meaning (e.g. anti-virus software, ad-blockers, parental control filters), and sometimes it’s malicious (adware, malware, etc). But even buggy well-meaning software can break your secure connections.

If you know what software is intercepting your traffic (e.g. your antivirus) consider updating it or contacting the vendor.

Getting Help

If you don’t know what to do, you may be able to get help in the Chrome Help Forum. When you ask for help, please include the following information:

  • The error code (e.g. NET::ERR_CERT_AUTHORITY_INVALID).
    • To help the right people find your issue, consider adding this to the title of your posting.
  • What version of Chrome you’re using. Visit chrome://version in your browser to see the version number
  • The type of device and network (e.g. “I’m using a laptop on wifi on my school’s network.”)
  • The error diagnostic information.

You can get diagnostic information by clicking or tapping directly on the text of the error code: image. When you do so, a bunch of new text will appear in the page:

image

You should select all of the text:

image

…then hit CTRL+C (or Command ⌘+C on Mac) to copy the text to your clipboard. You can then paste the text into your post. The “PEM encoded chain” information will allow engineers to see exactly what certificate the server sent to your computer, which might shed light on what specifically is interfering with your secure connections.

With any luck, we’ll be able to help you figure out how to surf securely again in no time!

 

-Eric

I’ve made changes to the latest versions of Fiddler to improve the performance of certificate creation, and to avoid problems with new certificate validation logic coming to Chrome and Firefox. The biggest of the Fiddler changes is that CertEnroll is now the default certificate generator on Windows 7 and later.

Unfortunately, this change can cause problems for users who have previously trusted the Fiddler root certificate; the browser may show an error message like NET::ERR_CERT_AUTHORITY_INVALID or The certificate was not issued by a trusted certificate authority.

Please perform the following steps to recreate the Fiddler root certificate:

Fiddler 4.6.1.5+

  1. Click Tools > Fiddler Options.
  2. Click the HTTPS tab.
  3. Ensure that the text says Certificates generated by CertEnroll engine.
  4. Click Actions > Reset Certificates. This may take a minute.
  5. Accept all prompts

Fiddler 4.6.1.4 and earlier

  1. Click Tools > Fiddler Options.
  2. Click the HTTPS tab
  3. Uncheck the Decrypt HTTPS traffic checkbox
  4. Click the Remove Interception Certificates button. This may take a minute.
  5. Accept all of the prompts that appear (e.g. Do you want to delete these certificates, etc)
  6. (Optional) Click the Fiddler.DefaultCertificateProvider link and verify that the dropdown is set to CertEnroll
  7. Exit and restart Fiddler
  8. Click Tools > Fiddler Options.
  9. Click the HTTPS tab
  10. Re-check the Decrypt HTTPS traffic checkbox
  11. Accept all of the prompts that appear (e.g. Do you want to trust this root certificate)

image

If you are using Fiddler to capture secure traffic from a mobile device or Firefox, you will need to remove the old Fiddler root certificate from that device (or Firefox) and install the newly-generated Fiddler certificate.

I apologize for the inconvenience, but I believe that the new certificate generator will help ensure smooth debugging with current and future clients.

-Eric Lawrence