Compression Context

ZIP is a great format—it’s extremely broadly deployed, relatively simple, and supports a wide variety of use-cases pretty well. ZIP is the underlying format beneath Java (.jar) Archives, Office (docx/xlsx/pptx) files, Fiddler (.saz) Session Archive ZIP files, and many more.

Even though some features (Unicode filenames, AES encryption, advanced compression engines) aren’t supported by all clients (particularly Windows Explorer), basic support for ZIP is omnipresent. There are even solid implementations in JavaScript (optionally utilizing asm.js), and discussion of adding related primitives directly to the browser.

I learned a fair amount about the ZIP format when building ZIP repair features in Fiddler’s SAZ file loader. Perhaps the most interesting finding is that each individual file within a ZIP is compressed on its own, without any context from files already added to the ZIP. This means, for instance, that you can easily remove files from within a ZIP file without recompressing anything—you need only delete the removed entries and recompute the index. However, this independence also means that redundancy across files does nothing for the compression ratio; only redundancy within each individual file helps.
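To see the effect for yourself, here’s a minimal sketch using Python’s zipfile and lzma modules; the “Session” content below is made up, but the gap between a per-file context and a shared (“solid”) context is dramatic either way:

    import io, zipfile, lzma

    # A made-up stand-in for 1000 near-identical Sessions
    session = (b"GET /api/items?page=1 HTTP/1.1\r\n"
               b"Host: example.com\r\n"
               b"User-Agent: Fiddler-Demo\r\n"
               b"Accept: application/json\r\n\r\n")
    files = {f"raw/{i:04d}_c.txt": session for i in range(1000)}

    # ZIP: every file gets its own DEFLATE context
    buf = io.BytesIO()
    with zipfile.ZipFile(buf, "w", zipfile.ZIP_DEFLATED) as z:
        for name, data in files.items():
            z.writestr(name, data)

    # "Solid" compression: one LZMA context shared across all of the files
    solid = lzma.compress(b"".join(files.values()))

    print(f"per-file context (ZIP): {len(buf.getvalue()):,} bytes")
    print(f"shared context (LZMA):  {len(solid):,} bytes")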

This limitation can be striking in cases like Fiddler where there may be a lot of repeated data across multiple Sessions. In the extreme case, consider a SAZ file with 1000 near-identical Sessions. When that data is compressed to a SAZ, it is 187 megabytes. If the data were instead compressed with 7-Zip, which shares a compression context across embedded files, the output would be 99.85% smaller!

ZIP almost 650x larger than 7z

In most cases, of course, the Session data is not identical, but Sessions on the whole tend to contain a lot of redundancy, particularly when you consider HTTP headers and the Session metadata XML files.

The takeaway here is that when you look at compression, the compression context is very important to the resulting compression ratio. This fact rears its head in a number of other interesting places:

  • brotli compression achieves high compression ratios in part by using a 16 megabyte sliding window, as compared to the 32kb window used by nearly all DEFLATE implementations. This means that brotli content can “point back” much further in the already-compressed data stream when repeated data is encountered.
  • brotli also benefits by pre-seeding the compression context with a 122kb static dictionary of common web content; this means that even the first bytes of a brotli-compressed response can benefit from existing context.
  • SDCH compression achieves high compression ratios by supplying a carefully-crafted dictionary of strings that results in an “ideal” compression context, so that later content can simply refer to the previously-calculated dictionary entries.
  • Adding context introduces a key tradeoff, however: the larger a compression context grows, the more memory a compressor and decompressor tend to require.
  • While HTTP/2 reduces the need to concatenate CSS and JS files by reducing the performance cost of individual web requests, HTTP compression contexts are per-resource. That means larger files tend to yield higher compression ratios. See “Bundling Improves Compression.”
  • Compression contexts can introduce information disclosure security vulnerabilities if a context is shared between “secret” and “untrusted” content. See also CRIME, BREACH, and HPACK. In these attacks, the bad guy takes advantage of the fact that if his “guess” matches some secret string earlier in the compression context, it will compress better (smaller) than if his guess is wrong. This attack can be combated by isolating compression contexts by trust level, or by introducing random padding to frustrate size analysis; the sketch below illustrates the size-oracle principle.
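Here’s a toy illustration of that size-oracle principle using Python’s zlib; the secret and the guesses are, of course, made up:

    import zlib

    secret = b"Cookie: sessionid=7f3a9c1e"  # the "secret" sharing the compression context

    def observed_size(guess: bytes) -> int:
        # Attacker-controlled bytes compressed in the same context as the secret
        return len(zlib.compress(b"GET /?q=" + guess + b"\r\n" + secret, 9))

    print(observed_size(b"sessionid=7f3a9c1e"))  # correct guess: output is smaller
    print(observed_size(b"sessionid=1a2b3c4d"))  # wrong guess: output is larger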

 

Ready to learn a lot more about compression on the web? Check out my earlier Compressing the Web blog, my Brotli blog, and the excellent Google Developers Compressor Head video series.

-Eric


Automatically Evaluating Compressibility

Fiddler’s Transformer tab has long been a simple way to examine the HTTP compression of web assets, especially as new compression engines (like Zopfli) and compression formats (like Brotli) arose. However, the one-Session-at-a-time design of the Transformer tab makes it cumbersome to evaluate the compressibility of an entire page or series of pages.

Introducing Compressibility

Compressibility is a new Fiddler 4 add-on1 which allows you to easily find opportunities for compression savings across your entire site. Each resource dropped on the compressibility tab is recompressed using several compression algorithms and formats, and the resulting file sizes are recorded:

Compressibility tab

You can select multiple resources to see the aggregate savings:

Total savings text

WebP savings are only computed for PNG and JPEG images; Zopfli savings for PNG files are computed by using the PNGDistill tool rather than just using Zopfli directly. Zopfli is usable by all browsers (as it is only a high-efficiency encoder for Deflate) while WebP is supported only by Chrome and Opera. Brotli is available in Chrome and Firefox, but limited to use from HTTPS origins.
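If you’d like to approximate the same sort of comparison outside of Fiddler, here’s a minimal sketch using Python’s zlib and the Brotli bindings (assuming the PyPI “Brotli” package is installed; the filenames are hypothetical):

    import zlib
    import brotli  # assumption: the "Brotli" bindings from PyPI

    def sizes(path):
        data = open(path, "rb").read()
        return len(data), len(zlib.compress(data, 9)), len(brotli.compress(data, quality=11))

    totals = [0, 0, 0]
    for path in ("site.css", "app.js", "index.html"):  # hypothetical assets saved from a page
        raw, deflated, brotlied = sizes(path)
        totals = [t + n for t, n in zip(totals, (raw, deflated, brotlied))]
        print(f"{path}: {raw:,} raw / {deflated:,} deflate / {brotlied:,} brotli")

    print(f"aggregate: {totals[0]:,} raw / {totals[1]:,} deflate / {totals[2]:,} brotli")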

Download the Addon…

To show the Compressibility tab, simply install the add-on, restart Fiddler, and choose Compressibility from the View > Tabs menu2.

View > Tabs > Compressibility menu screenshot

The extension also adds ToWebP Lossless and ToWebP Lossy commands to the ImageView Inspector’s context menu:

ImagesMenuExt

I hope you find this new addon useful; please send me your feedback so I can enhance it in future updates!

-Eric

1 Note: Compressibility requires Fiddler 4, because there’s really no good reason to use Fiddler 2 any longer, and Fiddler 4 resolves a number of problems and offers extension developers the ability to utilize newer framework classes.

2 If you love Compressibility so much that you want it to be shown in the list of tabs by default, type prefs set extensions.Compressibility.AlwaysOn true in Fiddler’s QuickExec box and hit enter.


Getting Started with Profile Guided Optimization

For the convenience of the Windows developer community, I periodically compile the Zopfli and Brotli compressors from source, building for Win32 and code-signing the binaries (Interested? Get Zopfli.exe and Brotli.exe). After announcing the latest build on Twitter, I got an interesting question in reply:

Do you even PGO?

While I try to use the latest compiler (VS2015 U1), I’ve never used PGO with C++ myself. Profile guided optimization requires that you first compile a special instrumented binary that you run against a training set of data. The generated profiling data is fed into the compiler and it compiles an optimized binary based on the observed execution of the code, tuning the hottest paths for speed.

As with any technology-adoption question, I wondered: 1) Is using PGO hard? and 2) Will it noticeably improve performance?

Spoiler alert: The answers are “No” and “Yes.”

I started by skimming this old blog about PGO in Visual Studio; it looks pretty simple.

Optimizing a compressor with PGO is pretty straightforward. Unlike a GUI application with thousands of different operations, a compressor really only does one thing—compress.

I created a folder with files that I felt reasonably represent the types of data that I’ll be compressing with Zopfli (eight files captured via Fiddler). I could’ve experimented using a broader sample, but this seemed like a fine corpus of data with which to begin.

Click Build > Profile Guided Optimization > Instrument to generate an instrumented binary:

Build > Profile Guided Optimization > Instrument

Right-click the project in the Solution Explorer pane and choose Debugging under the Configuration Properties category. Edit the Command Arguments to specify the training scenario. Zopfli accepts a list of files to compress, so we simply list all eight:

Edit Command arguments

Close the dialog and click Build > Profile Guided Optimization > Run Instrumented/Optimized Application to run our application and generate profiling data:

Run Instrumented/Optimized Application

The scenario then runs; it takes a bit of extra time due to the cost of the profiling instructions in the instrumented binary. After it completes, a new file (Zopfli!1.pgc) is written to the \Release\ folder; if we’d run the application multiple times to train different scenarios, Zopfli!2.pgc, Zopfli!3.pgc, etc would be present as well.

Finally, click Build > Profile Guided Optimization > Optimize to generate a new build using the profiling data to select paths for optimization. You can see the effect of the profiling database on the Build in the Output window:

Build output shows optimizations

Now your executable has been optimized.

Pretty simple, right?

Proper benchmarking is an entire field unto itself, but let’s do the simplest thing that could possibly work to check the effectiveness of the optimizations:

Script runs optimized and unoptimized
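In case the screenshot is hard to read, the script just times both binaries against the same corpus; a rough Python equivalent (the paths here are hypothetical) might look like:

    import glob, subprocess, time

    corpus = glob.glob(r"C:\pgo-corpus\*")  # hypothetical folder holding the eight training files

    def time_compressor(exe: str) -> float:
        start = time.perf_counter()
        subprocess.run([exe] + corpus, check=True)  # Zopfli accepts a list of files to compress
        return time.perf_counter() - start

    for exe in (r"Release\zopfli-original.exe", r"Release\zopfli-pgo.exe"):
        print(f"{exe}: {time_compressor(exe):.1f} seconds")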

We run the script a few times and see that the original unoptimized binary takes ~64 seconds to compress the corpus and the optimized binary takes ~46 seconds, a savings of almost 30%.

ZopFli PGO vs non PGO

You should run the same benchmark against a new set of data, just to ensure that your changes yield similar improvements (or at least no regression!) given different input data. A few runs of my PNGDistill tool (which uses Zopfli internally) show improvements of 10% to 25% when using the optimized compressor.

Pretty cool, right?

-Eric Lawrence


Fiddler and Brotli

Regular readers of my blog know how excited I am about Google’s new open compression engine, Brotli. Support is in Firefox 44 Nightly already and is expected for other browsers. Because Brotli is used inside the WOFF2 font format, many clients already have the decompression code and just need to expose it in a new way.

Yesterday’s Fiddler release supports Brotli. To expose Brotli inside Fiddler 4.6.0.5+, download and place brotli.exe inside Fiddler’s %ProgramFiles(x86)%\Fiddler2\Tools\ subfolder and restart to see it appear on the Transformer tab:

image

After installation, Content-Encoding: br is also supported by Fiddler’s Decode toolbar option, as well as the encoding Infobar:

image

Test servers are starting to come online for Brotli, particularly now that Google has released a new Brotli module for the nginx web server. For now, you can try downloading this image:

Brotli didn't decode

…that is served with brotli encoding. If your browser supports brotli, it will render a star; if not, the image will appear broken unless you set the Decode option in Fiddler’s toolbar so Fiddler will decode the image before the browser sees it.

-Eric Lawrence


WebP–What Isn’t Google Telling Us?

Beyond their awesome work on Zopfli and Brotli, Google has brought their expertise in compression to bear on video and image formats. One of the most interesting of these efforts is WebP, an image format designed to replace the aging JPEG (lossy) and PNG (lossless) image formats.

WebP offers more efficient compression mechanisms than both PNG and JPEG, as you can see in this comparison of a few PNG files on Google’s top sites vs. WebP-Lossless versions that are pixel-for-pixel identical:

size table

You can see these savings everywhere, from Google’s homepage logo, which is 3918 bytes (29%) smaller, to Google applications’ image sprites (59% smaller!) to advertisements served by Google’s ad network (18% smaller). These compression savings are much greater than those provided by Zopfli, which is constrained by compatibility with the legacy PNG format.

As an additional benefit, WebP files don’t contain the sort of metadata bloat found in PNG, JPEG, and GIF.

So, the bandwidth and cache-size savings are obvious.

While the format is currently only supported in Chrome and Opera, web servers can easily serve WebP only to clients that request it via the Accept header:

Fiddler screenshot showing WebP in use

This approach to WebP adoption is in use today by major sites like the Washington Post.
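Here’s a minimal sketch of that negotiation using Python’s http.server (not production code; the asset names are hypothetical):

    import os
    from http.server import BaseHTTPRequestHandler, HTTPServer

    class ImageHandler(BaseHTTPRequestHandler):
        def do_GET(self):
            # Serve the WebP variant only when the client advertises support for it
            if "image/webp" in self.headers.get("Accept", "") and os.path.exists("logo.webp"):
                path, ctype = "logo.webp", "image/webp"
            else:
                path, ctype = "logo.png", "image/png"
            body = open(path, "rb").read()
            self.send_response(200)
            self.send_header("Content-Type", ctype)
            self.send_header("Vary", "Accept")  # caches must key on the Accept header
            self.send_header("Content-Length", str(len(body)))
            self.end_headers()
            self.wfile.write(body)

    HTTPServer(("", 8080), ImageHandler).serve_forever()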

Yet Google’s own properties largely continue to serve PNG and JPEG, even to WebP-capable browsers. Google invented the format, so it’s not a case of “not-invented-here.”

The non-adoption of their own format leads to a troubling question—is there something about WebP that Google isn’t telling us? Surely there must be a good reason that Google’s own properties aren’t reaping the benefits of the format they’ve invented?

Update: Alex Russell retorts “uh, we use webp in TONS of places.”

-Eric Lawrence

PS: WebP Status Tracking links for Firefox and IE/Edge


Brotli

Regular readers of my blog know how much I love Zopfli, Google’s compression engine that often shrinks output by 5% or better when compared to the popular zlib engine. The beauty of Zopfli is that its output is compatible with all of the billions of existing DEFLATE encoders deployed worldwide, making its use an easy choice for any static content.

But imagine for a moment what compression ratios we could achieve if we weren’t limited by compatibility with existing decoders. If we could add a new compression engine to the web, what might it look like?

The Brotli compression engine, co-written by Jyrki Alakuijala (inventor of Zopfli), provides one answer. Brotli combines the LZ77 and Huffman algorithms of DEFLATE with a larger sliding window (up to 16mb1 vs. DEFLATE’s 32kb) and context modeling; the specification also calls for a 122kb static dictionary.
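Here’s a minimal sketch of what that larger window buys you, again assuming the PyPI “Brotli” bindings: a 1mb block repeated at a distance far beyond DEFLATE’s reach is easy prey for Brotli.

    import os, zlib
    import brotli  # assumption: the "Brotli" bindings from PyPI

    block = os.urandom(1024 * 1024)  # 1mb of incompressible data
    data = block + block             # the repeat lies ~1mb back, far outside DEFLATE's 32kb window

    print("deflate:", len(zlib.compress(data, 9)))                       # roughly 2mb: the repeat is out of reach
    print("brotli: ", len(brotli.compress(data, quality=11, lgwin=24)))  # roughly 1mb: the window covers the repeat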

Brotli In Browsers

Today, Brotli is the compression engine behind the newish WOFF2 font format, providing savings of approximately 25% over WOFF 1.0 fonts compressed with Zopfli. Not content to rest on their laurels, Google has announced their Intent to Implement Brotli as a general purpose HTTP Content-Encoding, allowing web developers to use it to compress script, stylesheets, svg, xml, and the like. Firefox beat Google to the finish and shipped Brotli support in the Firefox 44 Dev build.

Probably HTTPS only

Past attempts to add new compression algorithms (bzip2 and SDCH) have demonstrated that a non-trivial number of intermediaries (proxies, gateway scanners) fail when Content-Encodings other than GZIP and DEFLATE are specified, so Brotli will probably only be supported over HTTPS connections, where intermediaries are less likely to interfere.

    Accept-Encoding: br, gzip, deflate, sdch

Results

Facebook investigated Brotli and found it would save about 17% of CSS bytes and 20% of JavaScript bytes (compared with Zopfli). When run on the CSS and JavaScript from the Alexa top-300k, Brotli saved 12% of CSS bytes and 9% of JavaScript bytes when compared to gzip.

Running a few simple tests with Fiddler, I saw great results with Brotli:

    Content-Encoding: br

jQueryMobileMin.js
image

Microsoft homepage:
image

A random giant XML documentation file:
image

Microsoft Word Online WordEditor.js
image

Microsoft Word Online WordEditor.Wac.TellMeModel.js
image

Cloudflare’s blog post on Brotli includes some benchmarks too.

Brotli is optimized for decompression speed. When compressing, Brotli is slower than zlib’s deflate, but considerably faster than zopfli, lzma, and bzip2. Given 1gb of extremely compressible content, Brotli finished compressing it to 3339 bytes after 301 seconds of CPU time; after 8040 seconds of CPU time, zopfli.exe crashed when a memory allocation failed.

Running Brotli.exe

To make things simpler for Windows users, I’ve built the latest release (v0.3) from GitHub for Win32 using Visual Studio 2015. You can download the Authenticode-signed Windows Brotli.exe from my site.

To compress a file, specify the input and output filenames:

  • --in filename
  • --out compressed_filename

… and optionally specify any of the following arguments:

  • --quality n
  • --force
  • --verbose

The quality parameter controls the compression-speed vs. compression-ratio tradeoff; the higher the quality, the slower but denser the compression. The supported range is 0 to 11, and 11 is the default.

The force parameter instructs Brotli to overwrite the output file if it already exists.

The verbose parameter instructs Brotli to display its compression speed in megabytes per second upon completion.
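For example, to compress a (hypothetical) file at maximum quality, overwriting any existing output and reporting throughput:

    brotli.exe --in wordeditor.js --out wordeditor.js.br --quality 11 --force --verbose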

To decompress a file, use the --decompress parameter and specify the input and output filenames:

  • --in compressed_filename
  • --out filename

… and optionally specify the --verbose parameter to instruct Brotli to display its decompression speed in megabytes per second upon completion.
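And to expand that same (hypothetical) file again:

    brotli.exe --decompress --in wordeditor.js.br --out wordeditor.js --verbose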

If you’d like to expose Brotli inside Fiddler 4.6.0.5+, place brotli.exe inside Fiddler’s \Tools\ subfolder and restart to see it appear on the Transformer tab:

image

Tracking Brotli

If you’d like to follow along:

Alas, the Brotli Discussion forum is currently empty.

Assorted Further Investigations

1. Someone needs to register the brotli token in the IANA registry (although Google’s SDCH and Microsoft’s Xpress aren’t listed there either).

2. Implementers should consider protections against “brotli bombing” DoS attacks. Brotli’s high compression ratio makes attacks even cheaper for the bad guys. A trivial test of compressing a file containing all 0s shows that Brotli can achieve a compression ratio of at least 386516:1, meaning that 1389 bytes of compressed data can blow up to 512MB when uncompressed. In contrast, DEFLATE has a maximum compression ratio approaching 1032 to 1, so an attacker would need to send 375 times as much data over the network to achieve a similar result. That being said, even DEFLATE can result in a denial-of-service, as with this 5.8mb PNG file that can require allocation of up to 141GB of memory. (A minimal sketch of one protection appears after this list.)

3. Brotli’s use in WOFF2 means that browsers have already taken on its attack surface. However, not all attack surface is created equal; WOFF2 fonts can be decoded inside a very restricted sandbox. When Chrome 1.0 shipped, I was surprised to learn its HTTP decompressors were in a full-trust process; it turns out that is still the case today, which makes fuzzing against decompressors very interesting to an attacker.

4. Brotli’s static dictionary was generated from a broad corpus of content, but considering the most likely use cases (static files), it may not be optimal for this use. At this point, it’s probably too late to change it.

5. When used as a Content-Encoding, will brotli be used “bare” or in some framing format (e.g. with a trailing CRC and size marker)? Will it have magic bytes that will allow sniffing? (Per @mcmanusducksong, Firefox is going with a bare stream and no magics. boo)

6. While not terribly relevant to my scenarios, it turns out Google builds a lot of compression engines I’d never heard of, e.g. Snappy and Gipfeli. When compression speed is more important than ratio, they’re worth a look.

7. Brotli makes the most sense for pre-compression of static content; to that end, someone needs to xcopy the http_gzip_static module for nginx and make a few tweaks to create a new http_brotli_static module. While the nginx team may eventually release one, Google released a brotli module that supports both dynamic and static compression.
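Regarding #2 above, here’s a minimal sketch of one such protection, shown with DEFLATE via Python’s zlib since the principle is the same for any decompressor: cap how much output you’re willing to produce.

    import zlib

    MAX_OUTPUT = 10 * 1024 * 1024  # refuse to inflate beyond 10mb

    def safe_inflate(compressed: bytes) -> bytes:
        d = zlib.decompressobj()
        out = d.decompress(compressed, MAX_OUTPUT)
        if d.unconsumed_tail:  # hit the cap with compressed input left over
            raise ValueError("refusing to decompress further: possible decompression bomb")
        return out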

1 While Brotli can use a 16mb window, for performance reasons it appears that constraining the window to 4mb is the plan for most scenarios.


Zopfli All The Things

I’ve written about Zopfli quite a bit in the past, and even wrote a tool to apply it to PNG files. For fun, I had a look at one of the most optimized pages in the world: Google.com, through the lens of Zopfli.

Here are the basic resources delivered by the Google homepage:

Zopfli WhatIf

This breakdown shows that Google isn’t optimizing their own content with the compressor they wrote. The Savings column shows the number of bytes saved by using Zopfli over whatever Google used to compress the asset. In an ideal world, using the default settings, Google could save up to 16.5k, almost 5% of the bytes transferred, by using Zopfli.

I’ve color-coded the column based on how practical I believe the savings to be—the green numbers are the static images where there’s no question the size benefit could be realized. The yellow numbers are cases where script files are compressed; given the complicated query string parameters, I’m betting these scripts are dynamically generated and the compression cost of Zopfli might not be reasonable. The red number is the homepage itself, which probably isn’t reasonable to compress with Zopfli, as it is certainly generated dynamically.

So, most likely the savings of a practical Zopfli deployment on the homepage would be about 3.7kb; savings are much greater on other pages and on other sites.

More interesting, however, is the Google API CDN, which hosts scripts for other sites; optimizing these would take a minute or two at most and make every site that uses them faster.

Zopfli savings

Use Zopfli; give the tubes a little bit more room.

-Eric

PS: You may already have zopfli.exe on your system; Fiddler installs a copy to its \Tools\ subfolder!
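If you’d like to give it a spin, the simplest invocation just names the file; by default, Zopfli writes a gzip-compatible copy alongside the original with a .gz extension:

    zopfli.exe script.js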
