
Getting Started with Profile Guided Optimization

For the convenience of the Windows developer community, I periodically compile the Zopfli and Brotli compressors from source, building for Win32 and code-signing the binaries (Interested? Get Zopfli.exe and Brotli.exe). After announcing the latest build on Twitter, I got an interesting question in reply:

Do you even PGO?

While I try to use the latest compiler (VS2015 U1), I’ve never used PGO with C++ myself. Profile guided optimization (PGO) requires that you first compile a special instrumented binary and run it against a training set of data. The generated profiling data is then fed back into the compiler, which builds an optimized binary based on the observed execution of the code, tuning the hottest paths for speed.

As with any technology-adoption question, I wondered: (1) Is using PGO hard? and (2) Will it noticeably improve performance?

Spoiler alert: The answers are “No” and “Yes.”

I started by skimming this old blog post about PGO in Visual Studio; it looks pretty simple.

Optimizing a compressor with PGO is pretty straightforward. Unlike a GUI application with thousands of different operations, a compressor really only does one thing—compress.

I created a folder with files that I felt reasonably represent the types of data that I’ll be compressing with Zopfli (eight files captured via Fiddler). I could’ve experimented using a broader sample, but this seemed like a fine corpus of data with which to begin.

Click Build > Profile Guided Optimization > Instrument to generate an instrumented binary:

[Screenshot: Build > Profile Guided Optimization > Instrument]

Right-click the project in the Solution Explorer pane, choose Properties, and select Debugging under the Configuration Properties category. Edit the Command Arguments to specify the training scenario. Zopfli accepts a list of files to compress, so we simply list all eight:

[Screenshot: Edit Command Arguments]

Close the dialog and click Build > Profile Guided Optimization > Run Instrumented/Optimized Application to run our application and generate profiling data:

[Screenshot: Run Instrumented/Optimized Application]

The scenario then runs; it takes a bit of extra time due to the cost of the profiling instructions in the instrumented binary. After it completes, a new file (Zopfli!1.pgc) is written to the \Release\ folder; if we’d run the application multiple times to train different scenarios, Zopfli!2.pgc, Zopfli!3.pgc, etc., would be present as well.

Finally, click Build > Profile Guided Optimization > Optimize to generate a new build using the profiling data to select paths for optimization. You can see the effect of the profiling database on the Build in the Output window:

[Screenshot: Build output shows optimizations]

Now your executable has been optimized.

Pretty simple, right?

Proper benchmarking is an entire field in itself, but let’s do the simplest thing that could possibly work to check the effectiveness of the optimizations:

[Screenshot: Script runs optimized and unoptimized binaries]

We run the script a few times and see that the original unoptimized binary takes ~64 seconds to compress the corpus and the optimized binary takes ~46 seconds, a savings of about 28%.
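The benchmark script itself is not reproduced in the text, so here is a minimal sketch of a comparable harness in Python. The function names and the binary names in the usage note are hypothetical, and the approach (median wall-clock time over a few runs) is an assumption about a reasonable quick check, not the script actually used.

```python
import statistics
import subprocess
import time


def time_command(args, runs=3):
    """Run a command `runs` times and return the median wall-clock seconds."""
    samples = []
    for _ in range(runs):
        start = time.perf_counter()
        # Discard the tool's own output so console I/O does not skew the timing.
        subprocess.run(args, check=True,
                       stdout=subprocess.DEVNULL, stderr=subprocess.DEVNULL)
        samples.append(time.perf_counter() - start)
    return statistics.median(samples)


def percent_saved(baseline_seconds, optimized_seconds):
    """Time saved by the optimized build, as a percentage of the baseline."""
    return 100.0 * (baseline_seconds - optimized_seconds) / baseline_seconds
```

For example, `time_command(["zopfli_original.exe", "a.htm", "b.js"])` and `time_command(["zopfli_pgo.exe", "a.htm", "b.js"])` (hypothetical file names) give the two timings to pass to `percent_saved`. With the figures reported above, `percent_saved(64, 46)` comes out to roughly 28.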

[Screenshot: Zopfli PGO vs. non-PGO timings]

You should run the same benchmark against a new set of data, just to ensure that your changes yield similar improvements (or at least no regression!) given different input data. A few runs of my PNGDistill tool (which uses Zopfli internally) show improvements of 10% to 25% when using the optimized compressor.

Pretty cool, right?

-Eric Lawrence


Photoshop and Save For Web

Adobe recently announced that “Save for Web” in Photoshop is a “legacy feature” which won’t be improved. I decided to have a look at Adobe Photoshop CC (2015.0.0 Release 20150529.r88 x64) to see the impact of its many different “save” commands on the resulting file size.

First, I created a trivial 20×20 image and drew a red dot in the middle of it.

Next, I performed the naïve File > Save As > PNG operation. The output is a 16,723 byte PNG file, 97% of which is Adobe metadata:


If I instead use File > Export > Quick Export as PNG, the result is a 571 byte PNG that can be shrunk by 35 bytes to 536 using the Zopfli compressor:


If I instead click File > Export > Export As > PNG, the default size is 608 bytes:


If I untick the “Transparency” checkbox:


…the file grows to 662 bytes. Interestingly, however, when I retick that same box, the file now shrinks down to 571 bytes. A quick investigation shows that unticking and reticking the box silently changes the PNG from a 48-color palette image to an RGB/A image, which is smaller in the case of this small image.

If I use the new File > Generate Assets checkbox and name my layer “reddot.png” the automatically-saved PNG file in the PSD’s subfolder is the 608 byte version.

If I choose File > Export > Save for Web (Legacy) and choose to save a PNG-24 file:


… Photoshop reassures me that I’ve made good choices:


… But it’s lying. The information at the bottom left doesn’t account for the 935 bytes of useless metadata embedded in the image:


I need to change the Metadata dropdown to None:


…to get Adobe to omit most of the metadata, although it still wastes 37 bytes of your file advertising Adobe’s product. If you now distill the file, you can save those 37 bytes and pick up a 29 byte improvement in compression for a final file size of 411 bytes.

So, as you can see, Adobe Photoshop can save this simple image in sizes ranging from 477 bytes to a whopping 16,723 bytes. The Adobe overhead isn’t “fixed” and can be much larger: a 207K PNG file on Adobe’s website has 132K of metadata in it, while a 49.1K PNG file on Microsoft’s website contains 48.9K of Adobe metadata.
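To check figures like these on your own files, you can tally how many bytes each PNG chunk type occupies. The sketch below relies only on the PNG chunk layout (8-byte signature, then length, type, data, and CRC for each chunk). The function names are hypothetical, and counting every non-critical chunk as overhead is a simplifying assumption: some ancillary chunks, such as transparency or gamma information, do affect rendering.

```python
import struct
from collections import Counter

PNG_SIGNATURE = b"\x89PNG\r\n\x1a\n"
CRITICAL_CHUNKS = {b"IHDR", b"PLTE", b"IDAT", b"IEND"}


def iter_chunks(png_bytes):
    """Yield (chunk_type, data_length) for each chunk in a PNG byte string."""
    if png_bytes[:8] != PNG_SIGNATURE:
        raise ValueError("not a PNG file")
    offset = 8
    while offset + 8 <= len(png_bytes):
        length, chunk_type = struct.unpack(">I4s", png_bytes[offset:offset + 8])
        yield chunk_type, length
        offset += 12 + length  # 4 (length) + 4 (type) + data + 4 (CRC)


def ancillary_share(png_bytes):
    """Return (bytes per chunk type, fraction of the file in non-critical chunks)."""
    totals = Counter()
    for chunk_type, length in iter_chunks(png_bytes):
        totals[chunk_type] += 12 + length
    ancillary = sum(n for t, n in totals.items() if t not in CRITICAL_CHUNKS)
    return totals, ancillary / len(png_bytes)
```

Calling `ancillary_share(open("image.png", "rb").read())` on a file of your own (the file name is a placeholder) shows which chunk types dominate the file.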

Lessons Learned:

  1. Learn how to use your tools.
  2. Expect your tools to lie to you.
  3. Use an optimizer.



Optimize PNGs using PngDistill

Unfortunately, many PNG image generators opt for minimum compression time, failing to achieve maximum compression. Even worse, the most popular PNG generation tools often include huge amounts of unnecessary metadata that can bloat images by thousands of percent!

Fiddler now includes PngDistill, a simple tool that removes unnecessary metadata chunks and recompresses PNG image data streams using the Zopfli compression engine. Zopfli-based recompression often shaves 10% or more from the size of PNG files. You can access the PngDistill tool from the context menu of Fiddler’s ImageView inspector:



While it is well-integrated into Fiddler, PngDistill, which is installed to the C:\program files (x86)\Fiddler2\Tools folder, only requires PngDistill.exe (a .NET application) and zopfli.exe to run; you can use these tools without using Fiddler.

To run PngDistill against an entire local folder of images, you can do so from the command prompt:

   for /f "delims=|" %f in ('dir /b *.png') do PngDistill "%f" replace

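This command runs PngDistill on every image in the current folder, replacing any image for which the distillation process saved bytes. (If you put the command in a batch file rather than typing it at the prompt, double the percent signs: %%f.) You can then update the images on your server with the optimized images.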
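If you would rather drive the same loop from a script, the following is a rough Python equivalent. It assumes PngDistill is reachable on your PATH and reuses the `<file> replace` argument form from the command above; the function names are hypothetical.

```python
import subprocess
from pathlib import Path


def distill_commands(folder, exe="PngDistill"):
    """Build one `PngDistill <file> replace` command per PNG directly in `folder`."""
    pngs = sorted(p for p in Path(folder).iterdir()
                  if p.is_file() and p.suffix.lower() == ".png")
    return [[exe, str(p), "replace"] for p in pngs]


def distill_folder(folder, exe="PngDistill"):
    """Run PngDistill on each PNG in `folder`, continuing past failures."""
    for command in distill_commands(folder, exe):
        subprocess.run(command, check=False)
```

Keeping the command construction separate from the execution makes it easy to print the commands first and confirm which files would be touched before running `distill_folder(".")`.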

Running PngDistill.exe without any arguments will show the usage instructions:



  • The “Minify-then-compress” Best Practice applies to PNGs. While large fields of empty pixels compress really well, the browser must decompress those fields back into memory. So, if you’re building a sprite image with all of your site’s icons, don’t leave a huge empty area in it.
  • More advanced optimizations for PNG files are available using filters, color depth reduction, etc. PngDistill does not undertake these optimizations as its goal is to be 100% safe for automation, with no possibility of a user-visible change in the pixels of the image.
  • PngDistill partially supports .ICO files. Icon files may contain embedded PNGs; when run on a .ICO, PngDistill will extract the PNGs and save them externally; you will need to rebuild the .ICO file with the new PNG file(s).
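To illustrate the metadata-removal half of what PngDistill is described as doing, here is a minimal sketch that drops textual chunks from a PNG without touching the pixel data. This is not PngDistill's implementation: the function name is hypothetical, and limiting removal to the tEXt, zTXt, and iTXt chunk types is an assumption chosen because those chunks carry text rather than anything that affects rendering. It does not recompress the image data.

```python
import struct

PNG_SIGNATURE = b"\x89PNG\r\n\x1a\n"
# Assumption: only textual chunks are treated as removable metadata here.
TEXT_CHUNKS = {b"tEXt", b"zTXt", b"iTXt"}


def strip_text_chunks(png_bytes):
    """Return a copy of the PNG with tEXt/zTXt/iTXt chunks removed."""
    if png_bytes[:8] != PNG_SIGNATURE:
        raise ValueError("not a PNG file")
    kept = [PNG_SIGNATURE]
    offset = 8
    while offset + 12 <= len(png_bytes):
        (length,) = struct.unpack(">I", png_bytes[offset:offset + 4])
        chunk_type = png_bytes[offset + 4:offset + 8]
        end = offset + 12 + length  # 4 (length) + 4 (type) + data + 4 (CRC)
        if chunk_type not in TEXT_CHUNKS:
            kept.append(png_bytes[offset:end])  # copied verbatim, CRC included
        offset = end
    return b"".join(kept)
```

Because every retained chunk is copied byte for byte, the decoded pixels are unchanged, which matches the "no user-visible change" goal stated above.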