HTTP Object Compression Efficiency Analyzer


Introduction

This tool uses a simple mathematical approach to determine whether HTTP compression should be enabled for the URL submitted. This is a common concern among webmasters because compressing an object to reduce its size costs both processing time and real (wall-clock) time. Simply put, the time spent compressing an object before transmitting it may outweigh the time saved by reducing its transfer size.

The approach is straightforward. First, the URL-referenced object is downloaded uncompressed. Then, after a user-configurable pause (in seconds), the object is downloaded again with gzip compression requested (a method supported by the vast majority of browsers). Measurements (in microseconds and in bytes) are taken at each stage of the transaction using 14 decimal digits of precision. Compression ratios and effective transfer rates are also calculated. The download with the lower overall transaction time wins.
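
The same two-pass comparison can be sketched in a few lines of PHP with the curl extension. This is only an illustration of the idea, not the code behind this tool; the URL, the pause length, and the output format are placeholder assumptions.

<?php
// A minimal sketch: fetch the object twice over separate TCP connections,
// first without compression, then requesting gzip.
$url   = 'http://www.example.com/example.html'; // placeholder URL
$pause = 5;                                      // placeholder pause, in seconds

function timed_fetch($url, $request_gzip) {
    $ch = curl_init($url);
    curl_setopt($ch, CURLOPT_RETURNTRANSFER, true);
    curl_setopt($ch, CURLOPT_FOLLOWLOCATION, true);  // follow redirects...
    curl_setopt($ch, CURLOPT_MAXREDIRS, 2);          // ...but no more than two
    if ($request_gzip) {
        // Setting the header ourselves keeps the body as the raw gzip bytes,
        // so strlen() below reflects the size actually sent over the wire.
        curl_setopt($ch, CURLOPT_HTTPHEADER, array('Accept-Encoding: gzip'));
    }
    $start   = microtime(true);
    $body    = curl_exec($ch);
    $elapsed = microtime(true) - $start;             // wall-clock seconds
    curl_close($ch);
    return array($elapsed, strlen($body));
}

list($t_plain, $b_plain) = timed_fetch($url, false); // uncompressed pass
sleep($pause);                                       // user-configurable pause
list($t_gzip,  $b_gzip)  = timed_fetch($url, true);  // gzip pass

printf("uncompressed: %d bytes in %.6f s\n", $b_plain, $t_plain);
printf("gzip:         %d bytes in %.6f s\n", $b_gzip, $t_gzip);
printf("compression ratio: %.2f\n", $b_plain / max($b_gzip, 1));
printf("winner: %s\n", $t_gzip < $t_plain ? "compressed" : "uncompressed");
?>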

Considerations

  • This methodology is reasonable because clients customarily tell web servers, at the time they request an object, whether they will accept compressed content (via the Accept-Encoding request header). The servers may then respond with the object either compressed or not, depending on their capabilities, configuration, and the file type requested (see the example exchange after this list).
  • Each request made by this application constitutes a new TCP connect().
  • A DNS lookup on the target is performed and cached prior to testing.
  • This application will follow up to 2 HTTP redirects.
  • This method does not determine whether a complete page (including all of its referenced objects) is worth compressing. To determine that, every object referenced in the marked-up content (including content imported by scripts, etc.) would have to be recursively downloaded and compared using this methodology, and other factors would have to be considered, such as the next item below...
  • Many web servers allow webmasters to conditionally-deflate objects based on their MIME-type.
  • Depending on several factors (current server load, object type and size, and server caching), repeated tests may yield wildly different results.
  • Filesystem caching, web acceleration, and web content caching: If you're using web acceleration products or services, note that their numbers may surprise you. For example, some may actually serve compressed content faster than uncompressed content.
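
For reference, the negotiation described in the first item above looks roughly like this hypothetical, abbreviated exchange (the host name is a placeholder and most headers are omitted):

GET /example.html HTTP/1.1
Host: www.example.com
Accept-Encoding: gzip

HTTP/1.1 200 OK
Content-Type: text/html
Content-Encoding: gzip
Vary: Accept-Encoding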

Conclusions

  • Most web servers are used to transmit many different types of content. Depending on the "type" of content being transmitted, compression may be desirable.
  • We suggest conditionally compressing content based on MIME-type if your server software supports this in a quick and reliable fashion. This is an important distinction from enabling compression globally: many common web objects (such as JPEGs and other image types) are already compressed, so their compression cost-to-benefit ratio compares poorly with that of other, more compression-friendly objects.
  • Compression-friendly file types are: text/html, text/plain, text/xml, and text/css
  • As a general rule, the best candidates for compression are large (5 KB or larger) "text"-oriented files. Smaller files don't yield enough savings to matter, and large files don't benefit if they're already compressed in some way.

Common Deficiencies of other Tests

Many alternative tools exist that perform a similar analysis, and many of them are flawed. These are some of the reasons why:

  • A DNS lookup may not be performed and cached before the connections are made. When it isn't, the second measurement will be artificially faster because the lookup is already cached from the first test (see the sketch after this list).
  • Many tests perform only one download, requesting compressed content. Once the payload arrives, they decompress the content and estimate how long an "uncompressed" download would have taken based on the transfer rate calculated during the compressed download. This approach has many problems; for the sake of brevity, we'll only list a couple:
    • There is a first-byte download penalty associated with compression that doesn't apply to objects that are not being compressed.
    • Servers have different implementations of compression. When compression is enabled, some of the other mechanics of the software operate differently, causing different results.
  • Other tests that do perform multiple downloads often reuse the same socket/TCP connection. Because of this, the second measurement will be artificially faster, since it skips the three-way handshake (3WHS) needed to establish a new TCP connection.
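
If you are building a test like this yourself with PHP's curl extension, one way to avoid the DNS and connection-reuse biases above is sketched below. The curl options are real libcurl options (CURLOPT_RESOLVE needs a reasonably recent PHP/libcurl), but the host name and URL are placeholders.

<?php
// Sketch: force a brand-new TCP connection for each measurement and pin the
// host name to a pre-resolved address so DNS work never skews a timed pass.
$host = 'www.example.com';                      // placeholder host
$ip   = gethostbyname($host);                   // resolve (and cache) up front

$ch = curl_init("http://$host/example.html");
curl_setopt($ch, CURLOPT_RETURNTRANSFER, true);
curl_setopt($ch, CURLOPT_FRESH_CONNECT, true);  // do not reuse an existing connection
curl_setopt($ch, CURLOPT_FORBID_REUSE, true);   // do not keep this one around either
// Hand curl the answer directly so no resolver lookup happens during the transfer.
curl_setopt($ch, CURLOPT_RESOLVE, array("$host:80:$ip"));
$body = curl_exec($ch);
curl_close($ch);
?>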

Configuring Conditional Compression in your Web Server

In Apache, a simple approach to accomplishing conditional compression is to use the "mod_deflate" module. Once this is enabled in your configuration, you can activate it within a VirtualHost directive simply by adding a line similar to the following:

AddOutputFilterByType DEFLATE text/html text/plain text/xml text/css
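
Put together, a minimal sketch of how this might look in an Apache 2 configuration follows; the module path, ServerName, and DocumentRoot are placeholders, and the LoadModule line assumes mod_deflate was built as a shared module.

# Load the module once, globally (skip this if mod_deflate is already loaded).
LoadModule deflate_module modules/mod_deflate.so

<VirtualHost *:80>
    ServerName www.example.com
    DocumentRoot /var/www/html
    # Compress only the "text"-oriented MIME types discussed above.
    AddOutputFilterByType DEFLATE text/html text/plain text/xml text/css
</VirtualHost>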

If you're using PHP and it was compiled with the zlib module, compression can be enabled in several ways. You can add the following configuration values to the web server configuration, either through .htaccess files or in VirtualHost-type configuration:

php_flag zlib.output_compression on
php_value zlib.output_compression_level 2
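
If you would rather enable this globally, a sketch of the equivalent php.ini entries (the same two directives in ini syntax) looks like this:

zlib.output_compression = On
zlib.output_compression_level = 2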

You can also enable compression using PHP's output buffering features (we do not recommend this) by adding this line to the beginning of your pages:

<?php ob_start("ob_gzhandler"); ?>

Read more about PHP's Zlib compression.

Instructions to enable HTTP compression in Microsoft IIS 6 on specific directories.