META CHARSET

Someone complained that a Japanese page is garbled in Edge/Chrome, but renders with the correct characters in Firefox and IE.

The problem is that Chromium is using an unexpected character set to interpret the response in the HTML Parser. That happens because the server doesn’t send a proper character set directive. To avoid problems like this and improve performance, document authors should specify the character set in the HTTP response headers:

Content-Type: text/html; charset=utf-8
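As a quick way to check whether a given Content-Type value actually carries a charset, here is a minimal Python sketch. The helper name `header_charset` is hypothetical, and it leans on the standard library's email header parsing as a convenience; it is not how a browser parses the header.

```python
from email.message import Message


def header_charset(content_type: str):
    """Return the lowercased charset parameter of a Content-Type value, or None.

    Hypothetical helper for illustration only; it reuses the stdlib
    MIME parameter parser rather than any browser's logic.
    """
    msg = Message()
    msg["Content-Type"] = content_type
    return msg.get_content_charset()


if __name__ == "__main__":
    print(header_charset("text/html; charset=utf-8"))
    print(header_charset("text/html"))
```

A response whose header yields `None` here is leaving the browser to sniff the encoding from the body.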

If a charset isn’t specified in the headers, Chrome looks for a META CHARSET declaration within the response body text (“note the absurdity of encoding the character encoding in the document that you’re trying to decode”). The HTML5 spec demands that documents be encoded in UTF-8, and that the charset declaration, if any, appears within the first 1024 bytes of the response.

Chrome checks the full head for a character-set directive, and if it doesn’t find one, ensures that it’s looked through at least 1024 bytes before giving up.
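To see where a declaration sits in a response, a rough Python sketch like the one below can help. It only models the 1024-byte rule from the spec, not Chrome's head-scanning behavior, and the regex is a simplification (it would, for example, also match a meta tag inside a comment). The names `find_meta_charset` and `declared_within_limit` are made up for this example.

```python
import re

# Simplified pattern: a <meta ...> tag containing charset=<label>.
# Covers both <meta charset="..."> and the http-equiv/content form.
META_CHARSET = re.compile(
    rb'<meta\b[^>]*?\bcharset\s*=\s*["\']?\s*([A-Za-z0-9_.:-]+)[^>]*>',
    re.IGNORECASE,
)


def find_meta_charset(body: bytes):
    """Return (start, end, label) of the first charset declaration, or None."""
    m = META_CHARSET.search(body)
    if m is None:
        return None
    return m.start(), m.end(), m.group(1).decode("ascii").lower()


def declared_within_limit(body: bytes, limit: int = 1024) -> bool:
    """True if the whole declaring tag ends within the first `limit` bytes."""
    found = find_meta_charset(body)
    return found is not None and found[1] <= limit


if __name__ == "__main__":
    early = b'<html><head><meta charset="utf-8"><title>t</title></head></html>'
    late = b'<html><head><!-- ' + b"x" * 1500 + b' --><meta charset="utf-8">'
    print(find_meta_charset(early), declared_within_limit(early))
    print(find_meta_charset(late), declared_within_limit(late))
```

Running it against the raw bytes of a saved response shows at a glance whether the declaration lands early enough.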

Unfortunately, this site accidentally includes a div tag up in the head (ending the head section prematurely), and buries the charset declaration 1479 bytes into the response, past the point where Chrome stops looking.

To avoid problems like this:

  1. Specify the CHARSET in the Content-Type response header.
  2. Ensure META CHARSET appears as the first element of your HEAD.
  3. To avoid problems in legacy browsers, write it as utf-8 rather than as utf8.
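The second and third points can be checked mechanically. The Python sketch below uses the standard library's `html.parser` to see whether the first element inside the head is a META CHARSET spelled utf-8. The class and function names are invented for this example, and Python's parser is more forgiving than a browser's, so treat it as a lint rather than a faithful model.

```python
from html.parser import HTMLParser


class FirstHeadElement(HTMLParser):
    """Records the first start tag seen inside <head> (illustrative only)."""

    def __init__(self):
        super().__init__()
        self.in_head = False
        self.first = None  # (tag, attrs dict) of the first element in <head>

    def handle_starttag(self, tag, attrs):
        if tag == "head":
            self.in_head = True
        elif self.in_head and self.first is None:
            self.first = (tag, dict(attrs))

    def handle_endtag(self, tag):
        if tag == "head":
            self.in_head = False


def charset_is_first_in_head(html: str) -> bool:
    """True if <head> starts with <meta charset="utf-8"> (case-insensitive)."""
    parser = FirstHeadElement()
    parser.feed(html)
    if parser.first is None:
        return False
    tag, attrs = parser.first
    return tag == "meta" and (attrs.get("charset") or "").lower() == "utf-8"


if __name__ == "__main__":
    good = '<html><head><meta charset="utf-8"><title>x</title></head></html>'
    bad = '<html><head><title>x</title><meta charset="utf-8"></head></html>'
    print(charset_is_first_in_head(good), charset_is_first_in_head(bad))
```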

Get your <HEAD> in order!

-Eric
