browsers, perf, reviews

Going Offline with ServiceWorker

In the IE8 era, I had a brief stint as an architect on the IE team, trying to figure out a coherent strategy and a deployable set of technologies that would allow web developers to build offline-capable web applications. A few of those ideas turned into features, several turned into unimplemented patents, and a few went nowhere at all.

A decade later, it’s clear that ServiceWorker is going to be the core engine beneath all future top-tier web applications. ServiceWorker brings the power of Fiddler’s AutoResponder and FiddlerScript features to JavaScript code running directly within the user’s browser. Designed by real web developers for real web developers, it delivers on scenarios that previously required native applications. And browser support is looking great:

ServiceWorker

As I started looking at ServiceWorker, I was concerned about its complexity but I was delighted to discover a straightforward, very approachable reference on designing a ServiceWorker-backed application: Going Offline by Jeremy Keith. The book is short (I’m busy), direct (“Here’s a problem, here’s how to solve it“), opinionated in the best way (landmine-avoiding “Do this“), and humorous without being confusing. As anyone who has received unsolicited (or solicited) feedback from me about their book knows, I’m an extremely picky reader, and I have no significant complaints on this one. Highly recommended.

Unfortunately, the book isn’t available at list price on Amazon, but buying directly from the publisher is straightforward. The ebook is $11.00 and the paperback+ebook bundle is $28.80+shipping.

-Eric

dev, perf

Finding Image Bloat In Binary Files

I’ve previously talked about using PNGDistill to optimize batches of images, but in today’s quick post, I’d like to show how you can use the tool to check whether images in your software binaries are well optimized.

For instance, consider Chrome. Chrome uses a lot of PNGs, all mashed together into a single resources.pak file. Tip: Search files for the string IEND to find embedded PNG files.
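If you just want to spot embedded images before reaching for the full workflow below, a few lines of Python can do the scan; here is a minimal sketch (the file path is a placeholder, and it only finds PNG streams stored verbatim in the binary):

# Minimal sketch: list embedded PNGs by scanning for the PNG signature and
# the IEND chunk that terminates each image.
import re
import sys

PNG_SIGNATURE = b"\x89PNG\r\n\x1a\n"

def find_embedded_pngs(data: bytes) -> list[bytes]:
    pngs = []
    for match in re.finditer(re.escape(PNG_SIGNATURE), data):
        start = match.start()
        iend = data.find(b"IEND", start)
        if iend != -1:
            pngs.append(data[start:iend + 8])  # "IEND" + 4-byte CRC ends the file
    return pngs

if __name__ == "__main__":
    blob = open(sys.argv[1], "rb").read()  # e.g. resources.pak
    print(f"Found {len(find_embedded_pngs(blob))} embedded PNG(s)")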

With Fiddler installed, go to a command prompt and enter the following commands:

REM Work on a copy of Chrome's resource bundle in a scratch folder
cd "%USERPROFILE%\AppData\Local\Google\Chrome SxS\Application\60.0.3079.0"
mkdir temp
copy resources.pak temp
cd temp
REM The "grovel" command extracts the embedded PNGs into the current folder
"C:\Program Files (x86)\Fiddler2\tools\PngDistill.exe" resources.pak grovel
REM Run PngDistill over each extracted PNG, logging results to PNGDistill.LOG
for /f "delims=|" %f in ('dir /b *.png') do "c:\program files (x86)\fiddler2\tools\pngdistill" "%f" log

You now have a PNGDistill.LOG file showing the results. Open it in a CSV viewer like Excel or Google Sheets. You can see that Chrome is pretty well-optimized, with under 3% bloat.

image

Let’s take a look at Brave, which uses electron_resources.pak:

image

Brave does even better! Firefox has images in a few different files; I found a bunch in a file named omni.ja:

image

The picture gets less rosy elsewhere though. Microsoft’s MFC140u.dll’s images carry 7% bloat:

image

Windows’ Shell32.dll uses poor compression:

image

Windows’ ImageRes.dll has over 5 megabytes of bloat (nearly 20% of its image weight):

image

And Windows 10’s ApplicationFrame.dll is well-compressed, but the images have nearly 87% metadata bloat:

image

Does ImageBloat Matter?

Well, yes, it does. Even when software isn’t distributed by webpages, image bloat still takes up precious space on your disk (which might be limited in the case of an SSD) and it burns cycles and memory to process or discard unneeded metadata.

Optimize your images. Make it automatic via your build process and test your binaries to make sure it’s working as expected.

-Eric

PS: Rafael Rivera wrote a graphical tool for finding metadata bloat in binaries; check it out.

PPS: I ran PNGDistill against all of the PNGs embedded in EXE/DLLs in the Windows\System32 folder. 33MB of bloat * 270M devices = 8.9 petabytes of wasted storage for imagebloat in System32 alone. Raw Data:

dev, perf

Compression Context

ZIP is a great format—it’s extremely broadly deployed, relatively simple, and supports a wide variety of use-cases pretty well. ZIP is the underlying format beneath Java (.jar) Archives, Office (docx/xlsx/pptx) files, Fiddler (.saz) Session Archive ZIP files, and many more.

Even though some features (Unicode filenames, AES encryption, advanced compression engines) aren’t supported by all clients (particularly Windows Explorer), basic support for ZIP is omnipresent. There are even solid implementations in JavaScript (optionally utilizing asm.js), and discussion of adding related primitives directly to the browser.

I learned a fair amount about the ZIP format when building ZIP repair features in Fiddler’s SAZ file loader. Perhaps the most interesting finding is that each individual file within a ZIP is compressed on its own, without any context from files already added to the ZIP. This means, for instance, that you can easily remove files from within a ZIP file without recompressing anything—you need only delete the removed entries and recompute the index. However, this limitation also means that if the data you’re compressing contains a lot of interfile redundancy, the compression ratio does not improve as it would if there were intrafile redundancy.
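To see how much shared context can matter, here is a toy illustration using Python’s zlib (DEFLATE) on synthetic, near-identical blobs; it is not the SAZ or 7-Zip code, just the general idea:

# Compress near-identical blobs separately (as ZIP does, one DEFLATE stream
# per entry) vs. as one shared stream that can exploit interfile redundancy.
import zlib

session = b"GET /api/items HTTP/1.1\r\nHost: example.com\r\nAccept: */*\r\n\r\n" * 20
sessions = [session for _ in range(1000)]  # 1000 near-identical "files"

separate = sum(len(zlib.compress(s, 9)) for s in sessions)  # no shared context
together = len(zlib.compress(b"".join(sessions), 9))        # one context for all

print(f"1000 separate streams: {separate:,} bytes")
print(f"one shared stream:     {together:,} bytes")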

This limitation can be striking in cases like Fiddler where there may be a lot of repeated data across multiple Sessions. In the extreme case, consider a SAZ file with 1000 near-identical Sessions. When that data is compressed to a SAZ, it is 187 megabytes. If the data were instead compressed with 7-Zip, which shares a compression context across embedded files, the output is 99.85% smaller!

ZIP almost 650x larger than 7z

In most cases, of course, the Session data is not identical, but Sessions on the whole tend to contain a lot of redundancy, particularly when you consider HTTP headers and the Session metadata XML files.

The takeaway here is that when you look at compression, the compression context is very important to the resulting compression ratio. This fact rears its head in a number of other interesting places:

  • brotli compression achieves high compression ratios in part by using a 16 megabyte sliding window, as compared to the 32kb window used by nearly all DEFLATE implementations. This means that brotli content can “point back” much further in the already-compressed data stream when repeated data is encountered.
  • brotli also benefits by pre-seeding the compression context with a 122kb static dictionary of common web content; this means that even the first bytes of a brotli-compressed response can benefit from existing context.
  • SDCH compression achieves high compression ratios by supplying a carefully-crafted dictionary of strings that result in an “ideal” compression context, so that later content can simply refer to the previously-calculated dictionary entries.
  • Adding context introduces a key tradeoff, however, as the larger a compression context grows, the more memory a compressor and decompressor tend to require.
  • While HTTP/2 reduces the need to concatenate CSS and JS files by reducing the performance cost of individual web requests, HTTP compression contexts are per-resource. That means larger files tend to yield higher compression ratios. See “Bundling Improves Compression.”
  • Compression contexts can introduce information disclosure security vulnerabilities if a context is shared between “secret” and “untrusted” content. See also CRIME, BREACH, and HPACK. In these attacks, the bad guy takes advantage of the fact that if his “guess” matches some secret string earlier in the compression context, it will compress better (smaller) than if his guess is wrong. This attack can be combatted by isolating compression contexts by trust level, or by introducing random padding to frustrate size analysis. A toy demonstration of this size side channel appears in the sketch after this list.
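Here is that toy demonstration, using Python’s zlib; the “secret” and “page” are made up, and real attacks are far more involved, but the size signal is the same:

# When attacker-controlled text shares a compression context with a secret,
# a correct guess compresses smaller than an incorrect one.
import zlib

SECRET = b"sessionid=7f3a9c1e2b"

def body_size(guess: bytes) -> int:
    page = b"<html>" + SECRET + b"<p>q=" + guess + b"</p></html>"
    return len(zlib.compress(page, 9))

matching = body_size(b"sessionid=7f3a9c1e2b")  # correct guess
wrong    = body_size(b"sessionid=XXXXXXXXXX")  # wrong guess, same length

print(f"correct guess: {matching} bytes")  # typically a few bytes smaller
print(f"wrong guess:   {wrong} bytes")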

 

Ready to learn a lot more about compression on the web? Check out my earlier Compressing the Web blog, my Brotli blog, and the excellent Google Developers CompressorHead video series.

-Eric

browsers, fiddler, perf

Automatically Evaluating Compressibility

Fiddler’s Transformer tab has long been a simple way to examine the use of HTTP compression of web assets, especially as new compression engines (like Zopfli) and compression formats (like Brotli) arose. However, the one-Session-at-a-time design of the Transformer tab makes it cumbersome to evaluate the compressibility of an entire page or series of pages.

Introducing Compressibility

Compressibility is a new Fiddler 4 add-on¹ which allows you to easily find opportunities for compression savings across your entire site. Each resource dropped on the Compressibility tab is recompressed using several compression algorithms and formats, and the resulting file sizes are recorded:

Compressibility tab

You can select multiple resources to see the aggregate savings:

Total savings text

WebP savings are only computed for PNG and JPEG images; Zopfli savings for PNG files are computed by using the PNGDistill tool rather than just using Zopfli directly. Zopfli is usable by all browsers (as it is only a high-efficiency encoder for Deflate) while WebP is supported only by Chrome and Opera. Brotli is available in Chrome and Firefox, but limited to use from HTTPS origins.
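If you are curious what such a report boils down to, here is a minimal Python sketch of the same idea; it is not the add-on’s code, and the brotli module is a third-party assumption (pip install brotli):

# Report how a single file responds to DEFLATE at two levels and to Brotli.
import sys
import zlib

def report(path: str) -> None:
    data = open(path, "rb").read()
    print(f"original:     {len(data):>10,} bytes")
    print(f"deflate (L6): {len(zlib.compress(data, 6)):>10,} bytes")
    print(f"deflate (L9): {len(zlib.compress(data, 9)):>10,} bytes")
    try:
        import brotli  # optional third-party package
        print(f"brotli:       {len(brotli.compress(data)):>10,} bytes")
    except ImportError:
        print("brotli:       (module not installed)")

if __name__ == "__main__":
    report(sys.argv[1])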

Download the Addon…

Update: Telerik updated Fiddler to install per-user. To use this extension, copy the files from C:\program files (x86)\Fiddler to %userprofile%\appdata\local\programs\Fiddler.

To show the Compressibility tab, simply install the add-on, restart Fiddler, and choose Compressibility from the View > Tabs menu².

View > Tabs > Compressibility menu screenshot

The extension also adds ToWebP Lossless and ToWebP Lossy commands to the ImageView Inspector’s context menu:

ImagesMenuExt

I hope you find this new addon useful; please send me your feedback so I can enhance it in future updates!

-Eric

¹ Note: Compressibility requires Fiddler 4, because there’s really no good reason to use Fiddler 2 any longer, and Fiddler 4 resolves a number of problems and offers extension developers the ability to utilize newer framework classes.

² If you love Compressibility so much that you want it to be shown in the list of tabs by default, type prefs set extensions.Compressibility.AlwaysOn true in Fiddler’s QuickExec box and hit Enter.

dev, perf

Getting Started with Profile Guided Optimization

For the convenience of the Windows developer community, I periodically compile the Zopfli and Brotli compressors from source, building for Win32 and code-signing the binaries (Interested? Get Zopfli.exe and Brotli.exe). After announcing the latest build on Twitter, I got an interesting question in reply:

Do you even PGO?

While I try to use the latest compiler (VS2015 U1), I’ve never used PGO with C++ myself. Profile-guided optimization requires that you first compile a special instrumented binary that you run against a training set of data. The generated profiling data is then fed back into the compiler, which compiles an optimized binary based on the observed execution of the code, tuning the hottest paths for speed.

As with any technology-adoption question, I wondered: 1) Is using PGO hard? and 2) Will it noticeably improve performance?

Spoiler alert: The answers are “No” and “Yes.”

I started by skimming this old blog about PGO in Visual Studio; it looks pretty simple.

Optimizing a compressor with PGO is pretty straightforward. Unlike a GUI application with thousands of different operations, a compressor really only does one thing—compress.

I created a folder with files that I felt reasonably represent the types of data that I’ll be compressing with Zopfli (eight files captured via Fiddler). I could’ve experimented using a broader sample, but this seemed like a fine corpus of data with which to begin.

Click Build > Profile Guided Optimization > Instrument to generate an instrumented binary:

Build > Profile Guided Optimization > Instrument

Right-click the project in the Solution Explorer pane and choose Debugging under the Configuration Properties category. Edit the Command Arguments to specify the training scenario. Zopfli accepts a list of files to compress, so we simply list all eight:

Edit Command arguments

Close the dialog and click Build > Profile Guided Optimization > Run Instrumented/Optimized Application to run our application and generate profiling data:

Run Instrumented/Optimized Application

The scenario then runs; it takes a bit of extra time due to the cost of the profiling instructions in the instrumented binary. After it completes, a new file (Zopfli!1.pgc) is written to the \Release\ folder; if we’d run the application multiple times to train different scenarios, Zopfli!2.pgc, Zopfli!3.pgc, etc would be present as well.

Finally, click Build > Profile Guided Optimization > Optimize to generate a new build using the profiling data to select paths for optimization. You can see the effect of the profiling database on the Build in the Output window:

Build output shows optimizations

Now your executable has been optimized.

Pretty simple, right?

Proper benchmarking is an entire field itself, but let’s do the simplest thing that could possibly work to check the effectiveness of the optimizations:

Script runs optimized and unoptimized

We run the script a few times and see that the original unoptimized binary takes ~64 seconds to compress the corpus and the optimized binary takes ~46 seconds, a savings of almost 30%.

Zopfli PGO vs non-PGO
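If you want to reproduce this kind of measurement, the harness can be as small as the following sketch; the binary names and corpus folder are placeholders, not the exact script from the screenshot:

# Time an unoptimized vs. PGO-optimized Zopfli build over the same corpus.
import glob
import subprocess
import time

CORPUS = glob.glob(r"C:\temp\corpus\*")  # the files to compress
BINARIES = {
    "baseline": r"C:\builds\zopfli_baseline.exe",  # placeholder paths
    "pgo":      r"C:\builds\zopfli_pgo.exe",
}

for name, exe in BINARIES.items():
    start = time.perf_counter()
    subprocess.run([exe, *CORPUS], check=True)  # Zopfli accepts a list of files
    print(f"{name}: {time.perf_counter() - start:.1f} s")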

You should run the same benchmark against a new set of data, just to ensure that your changes yield similar improvements (or at least no regression!) given different input data. A few runs of my PNGDistill tool (which uses Zopfli internally) show improvements of 10% to 25% when using the optimized compressor.

Pretty cool, right?

-Eric Lawrence

fiddler, perf

Fiddler and Brotli

Regular readers of my blog know how excited I am about Google’s new open compression engine, Brotli. Support is in Firefox 44 Nightly already and is expected for other browsers. Because Brotli is used inside the WOFF2 font format, many clients already have the compression code and just need to expose it in a new way.

Yesterday’s Fiddler release supports Brotli. To expose Brotli inside Fiddler 4.6.0.5+, download and place brotli.exe inside Fiddler’s %ProgramFiles(x86)%\Fiddler2\Tools\ subfolder and restart Fiddler to see it appear on the Transformer tab:

image

After installation, Content-Encoding: br is also supported by Fiddler’s Decode toolbar option, as well as the encoding Infobar:

image

Test servers are starting to come online for Brotli, particularly now that Google has released a new Brotli module for the nginx web server. For now, you can try downloading this image:

Brotli didn't decode

…that is served with brotli encoding. If your browser supports brotli, it will render a star; if not, the image will appear broken unless you set the Decode option in Fiddler’s toolbar so Fiddler will decode the image before the browser sees it.
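You can also probe a server from the command line. This minimal sketch uses only the Python standard library (the URL is a placeholder); it asks for brotli and reports what the server actually sent:

# Request a resource with Accept-Encoding: br and report the result.
# Note: urllib does not decode brotli, so the body stays compressed.
import urllib.request

URL = "https://example.com/image.png"  # substitute a resource to probe

req = urllib.request.Request(URL, headers={"Accept-Encoding": "br"})
with urllib.request.urlopen(req) as resp:
    encoding = resp.headers.get("Content-Encoding", "(none)")
    body = resp.read()

print(f"Content-Encoding: {encoding}")
print(f"Bytes on the wire: {len(body)}")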

-Eric Lawrence

perf

WebP–What Isn’t Google Telling Us?

Beyond their awesome work on Zopfli and Brotli, Google has brought their expertise in compression to bear on video and image formats. One of the most interesting of these efforts is WebP, an image format designed to replace the aging JPEG (lossy) and PNG (lossless) image formats.

WebP offers more efficient compression mechanisms than both PNG and JPEG, as you can see in this comparison of a few PNG files on Google’s top sites vs. WebP-Lossless versions that are pixel-for-pixel identical:

size table

You can see these savings everywhere, from Google’s homepage logo, which is 3918 bytes (29%) smaller, to Google applications’ image sprites (59% smaller!) to advertisements served by Google’s ad network (18% smaller). These compression savings are much greater than those provided by Zopfli, which is constrained by compatibility with the legacy PNG format.
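If you want to reproduce this sort of comparison yourself, a pixel-for-pixel identical lossless WebP is easy to generate; here is a minimal sketch using the Pillow imaging library (a third-party assumption, and not how the numbers above were produced):

# Re-encode a PNG as lossless WebP and compare the file sizes.
import os
import sys
from PIL import Image  # pip install Pillow

src = sys.argv[1]  # e.g. logo.png
dst = os.path.splitext(src)[0] + ".webp"

Image.open(src).save(dst, "WEBP", lossless=True)

png_size, webp_size = os.path.getsize(src), os.path.getsize(dst)
print(f"PNG:  {png_size:,} bytes")
print(f"WebP: {webp_size:,} bytes ({100 * (png_size - webp_size) / png_size:.1f}% smaller)")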

As an additional benefit, WebP files don’t contain the sort of metadata bloat found in PNG, JPEG, and GIF.

So, the bandwidth and cache-size savings are obvious.

While the format is currently only supported in Chrome and Opera, web servers can easily serve WebP only to clients that advertise support for it via the Accept header:

Fiddler screenshot showing WebP in use

This approach to WebP adoption is in use today by major sites like the Washington Post.
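Server-side, the negotiation boils down to checking the request’s Accept header. Here is a minimal sketch as a generic WSGI handler; the file names are placeholders and this is not any particular site’s implementation:

# Serve a WebP variant only when the client advertises support for it.
from wsgiref.simple_server import make_server

def choose_image(accept_header: str) -> tuple[str, str]:
    if "image/webp" in accept_header:
        return "logo.webp", "image/webp"
    return "logo.png", "image/png"

def app(environ, start_response):
    path, content_type = choose_image(environ.get("HTTP_ACCEPT", ""))
    with open(path, "rb") as f:
        body = f.read()
    # Vary: Accept keeps caches from serving WebP to clients that didn't ask.
    start_response("200 OK", [("Content-Type", content_type), ("Vary", "Accept")])
    return [body]

if __name__ == "__main__":
    make_server("", 8000, app).serve_forever()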

And yet, as the comparison above shows, Google’s own properties are largely serving PNG and JPEG to browsers that could accept WebP. Google invented the format, so it’s not a case of “not-invented-here.”

The non-adoption of their own format leads to a troubling question—is there something about WebP that Google isn’t telling us? Surely there must be a good reason that Google’s own properties aren’t reaping the benefits of the format they’ve invented?

Update: Alex Russell retorts “uh, we use webp in TONS of places.”

-Eric Lawrence

PS: WebP Status Tracking links for Firefox and IE/Edge
