How To: Optimize Your Apache Site with Mod Deflate

April 23, 2009 · 45 comments

The title of this post might be a little cryptic to those not familiar with the Apache webserver, but this post is a sort of followup to Paul Buchheit’s recent post “Make your site faster and cheaper to operate in one easy step” as well as a response to a recent Skribit suggestion. The step he’s referring to is getting your web server to utilize gzip encoding.

Check to see if your site is gzipped with gzipcheck.

Paul Buchheit goes over the reasons why you should use gzip encoding, from 4-to-1 compression of HTML files to the reduced costs of serving smaller files. However, he doesn’t cover how to actually get it running on your site or blog.

Using these numbers, we can estimate that it would cost $1.88 to gzip 1TB of data on Amazon EC2, and $174 to transfer 1TB of data. If you instead compress your data (and get 4-to-1 compression, which is not unusual for html), the bandwidth will only cost $43.52
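The arithmetic in that quote can be reproduced with a quick Python sketch. The $0.17 per GB transfer price is an assumption inferred from the $174 figure (taking 1TB as 1024GB); the $1.88 compression cost and the 4-to-1 ratio come straight from the quote.

```python
# Rough reproduction of the bandwidth numbers quoted above.
# ASSUMPTION: $0.17/GB transfer price and 1 TB = 1024 GB, inferred from the quoted $174.
PRICE_PER_GB = 0.17
GB_PER_TB = 1024
GZIP_CPU_COST_PER_TB = 1.88   # from the quote
COMPRESSION_RATIO = 4         # 4-to-1, from the quote

uncompressed_cost = PRICE_PER_GB * GB_PER_TB
compressed_cost = uncompressed_cost / COMPRESSION_RATIO
net_savings = uncompressed_cost - compressed_cost - GZIP_CPU_COST_PER_TB

print(f"uncompressed transfer: ${uncompressed_cost:.2f}")
print(f"compressed transfer:   ${compressed_cost:.2f}")
print(f"net savings per TB:    ${net_savings:.2f}")
```

Swap in your own host's bandwidth price and a compression ratio measured on your own pages to get a figure that applies to your site.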

There are a myriad of server software setups but I’ll address one of the most popular HTTP web servers: Apache.

So just a run-through of why you should consider enabling mod_deflate:

  • Enabling gzip compression reduces file sizes at the expense of slightly increased CPU utilization (which I find to be negligible).
  • Smaller files mean less bandwidth used and faster transfers, so the client gets the page sooner and your server can move on to serving the next client.

Notice

This article might not work for you without some tinkering. Apache install locations can vary by your setup or that of your webhost. For the purposes of this article, I am using a Media Temple (dv) server which has a CentOS and Plesk setup with Apache installed in /etc/httpd.

Enter mod_deflate

As defined by Apache documentation, the deflate module “provides the DEFLATE output filter that allows output from your server to be compressed before being sent to the client over the network.” In other words, it compresses files on the fly without you having to compress individual files on your own. However, this becomes a problem if Apache ends up compressing files that are already compressed, such as archives or the JPEG and PNG images in your blog posts: the server burns CPU time for little or no reduction in size. For that reason, it’s important that mod_deflate is configured properly.

Configuring

First we need to load up the actual mod_deflate.so module. If you are using Apache 2, then you likely already have mod_deflate installed. Just to be sure though, go to your Apache httpd.conf file (/etc/httpd/conf/httpd.conf for me) and place the following line if it’s not already there:

LoadModule deflate_module modules/mod_deflate.so

Look for the section of httpd.conf where the other LoadModule lines are listed and place the line above anywhere within it.

The next step is telling mod_deflate how to work. Instead of working with httpd.conf, we will want to place the upcoming lines in the appropriate vhost.conf file, if your server uses a vhost configuration. For example, I created my vhost file in /var/www/vhosts/paulstamatiou.com/conf/vhost.conf.

If you’re not sure, you can put it in httpd.conf, but the custom mod_deflate logging I created won’t work due to this issue outlined by Apache documentation:

If CustomLog or ErrorLog directives are placed inside a <VirtualHost> section, all requests or errors for that virtual host will be logged only to the specified file. Any virtual host which does not have logging directives will still have its requests sent to the main server logs.

We’ll start by placing these lines in the appropriate vhost.conf file to configure mod_deflate:

<IfModule mod_deflate.c>
   SetOutputFilter DEFLATE

   # example of how to compress ONLY html, plain text and xml
   # AddOutputFilterByType DEFLATE text/plain text/html text/xml

   # Don't compress binaries
   SetEnvIfNoCase Request_URI \.(?:exe|t?gz|zip|iso|tar|bz2|sit|rar)$ no-gzip dont-vary

   # Don't compress images (jpe?g already covers both jpg and jpeg)
   SetEnvIfNoCase Request_URI \.(?:gif|jpe?g|ico|png)$ no-gzip dont-vary

   # Don't compress PDFs
   SetEnvIfNoCase Request_URI \.pdf$ no-gzip dont-vary

   # Don't compress flash files (only relevant if you host your own videos)
   SetEnvIfNoCase Request_URI \.flv$ no-gzip dont-vary

   # Netscape 4.X has some problems
   BrowserMatch ^Mozilla/4 gzip-only-text/html

   # Netscape 4.06-4.08 have some more problems
   BrowserMatch ^Mozilla/4\.0[678] no-gzip

   # MSIE masquerades as Netscape, but it is fine
   BrowserMatch \bMSIE !no-gzip !gzip-only-text/html

   # Make sure proxies don't deliver the wrong content
   Header append Vary User-Agent env=!dont-vary

   # Setup custom deflate log
   DeflateFilterNote Input instr
   DeflateFilterNote Output outstr
   DeflateFilterNote Ratio ratio
   LogFormat '"%r" %{outstr}n/%{instr}n %{ratio}n%%' DEFLATE
   CustomLog logs/deflate_log DEFLATE
</IfModule>

Before you save the file, I’ll explain what these lines do. There are two ways of setting up deflate filtering:

  • allowing ONLY certain types of files (AddOutputFilterByType), or
  • allowing ALL except certain file extensions (SetEnvIfNoCase)

If you aren’t sure what types of files you’re serving, it’s a safe bet to use AddOutputFilterByType and only compress a few known file types. Otherwise, keep those lines as is and change which file types you don’t want compressed with the SetEnvIfNoCase lines. I have my server set up to exclude common image file types, PDFs, FLVs and common binaries, so this will likely be fine for your uses as well.
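To get a feel for which requests the SetEnvIfNoCase lines would exclude, the same patterns can be tried out in Python. This is only a rough sketch: Python's re module stands in for Apache's regex engine, the would_skip_gzip helper and the sample paths are made up for illustration, and the BrowserMatch rules are ignored entirely.

```python
import re

# Patterns equivalent to the SetEnvIfNoCase lines above; re.IGNORECASE mimics "NoCase".
NO_GZIP_PATTERNS = [
    re.compile(r"\.(?:exe|t?gz|zip|iso|tar|bz2|sit|rar)$", re.IGNORECASE),  # binaries
    re.compile(r"\.(?:gif|jpe?g|ico|png)$", re.IGNORECASE),                 # images
    re.compile(r"\.pdf$", re.IGNORECASE),                                   # PDFs
    re.compile(r"\.flv$", re.IGNORECASE),                                   # flash video
]

def would_skip_gzip(request_uri):
    """Hypothetical helper: True if any exclusion pattern matches the request path."""
    return any(pattern.search(request_uri) for pattern in NO_GZIP_PATTERNS)

# Made-up sample paths, just to show the matching behaviour.
for path in ["/index.html", "/wp-content/style.css", "/images/Photo.JPG", "/files/backup.tgz"]:
    print(path, "-> skipped" if would_skip_gzip(path) else "-> compressed")
```

Anything that doesn't match one of the patterns, such as HTML, CSS and JavaScript, falls through to the DEFLATE filter.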

As for those BrowserMatch lines, they are recommended client compression exclusions outlined by Apache documentation, but if you ask me I doubt you really have to worry about breaking Netscape 4 users’ experiences.

The last bit of those lines sets up a custom log for mod_deflate. While not necessary, I find it to be one of the more interesting things you can do with mod_deflate. The log shows each HTTP request along with the response size before and after compression and the resulting ratio. If you’re so inclined, you can do cool things like run through your logs with a Perl script and find out how much bandwidth you’ve been saving each month by using mod_deflate.
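As a rough sketch of that idea, here is the log crunching done in Python rather than Perl. It assumes the exact LogFormat shown above ("%r" output/input ratio%), and the sample lines are invented for illustration rather than taken from a real log. Responses that were not compressed show up with dashes instead of numbers, so they are skipped.

```python
import re

# Matches lines written by: LogFormat '"%r" %{outstr}n/%{instr}n %{ratio}n%%' DEFLATE
LINE_RE = re.compile(r'^"(?P<request>.*)" (?P<out>\d+|-)/(?P<inp>\d+|-) (?P<ratio>\d+|-)%$')

def summarize(lines):
    """Return (bytes_before, bytes_after, bytes_saved) for all compressed responses."""
    total_in = total_out = 0
    for line in lines:
        match = LINE_RE.match(line.strip())
        if not match or match.group("out") == "-" or match.group("inp") == "-":
            continue  # malformed line or a response that was not compressed
        total_in += int(match.group("inp"))
        total_out += int(match.group("out"))
    return total_in, total_out, total_in - total_out

# Invented sample lines in the format above (not real log data).
sample = [
    '"GET / HTTP/1.1" 5000/20000 25%',
    '"GET /style.css HTTP/1.1" 1000/4000 25%',
    '"GET /logo.png HTTP/1.1" -/- -%',
]
before, after, saved = summarize(sample)
print(f"before: {before} bytes, after: {after} bytes, saved: {saved} bytes")
```

To run it against a real log, pass an open file instead of the sample list, for example summarize(open("/etc/httpd/logs/deflate_log")), adjusting the path to wherever your CustomLog line writes.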

Example deflate log snippet. (sudo tail -f /etc/httpd/logs/deflate_log)

Save that file when you’re done tinkering and restart Apache for the changes to take effect.

/etc/init.d/httpd restart

Visit gzipcheck to make sure Apache accepted the changes and is serving up compressed files! From there, you can tinker with some more interesting mod_deflate configurations. For example, if you have a beefy server you can set a higher compression level with DeflateCompressionLevel and save even more bandwidth.
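If you would rather check from a script than from a website, a small Python sketch along these lines should do the job. The function names are my own, and it simply asks for gzip via the Accept-Encoding request header and looks at the Content-Encoding response header.

```python
import urllib.request

def is_gzipped(headers):
    """Return True if the response headers report gzip content encoding."""
    encoding = headers.get("Content-Encoding", "") or ""
    return "gzip" in encoding.lower()

def fetch_headers(url):
    """Request url while advertising gzip support and return the response headers."""
    request = urllib.request.Request(url, headers={"Accept-Encoding": "gzip"})
    with urllib.request.urlopen(request) as response:
        return response.headers
```

Calling is_gzipped(fetch_headers("http://example.com/")) with your own URL in place of the placeholder should return True once mod_deflate is active for that page.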

Tip of the Iceberg

This post was meant to highlight an easy way to speed up your site and save money, but mod_deflate is not the be-all and end-all site optimization trick. There are tons of ways to speed up your site from both the server side of things and by optimizing the actual website itself. If you want to read up on some Apache tuning tips, O’Reilly has some good books worth checking out.

Do you use any compression tool like mod_deflate with your current server setup? What else do you do to ensure your server and site run efficiently?

Update: Grzegorz Daniluk suggests an alternate BrowserMatch setup to avoid compressing files for Internet Explorer 6 as it sometimes has an issue with compressed files. However, I did some testing on my own and wasn’t able to reproduce any issues where a gzipped site loaded a blank page in IE6.

{ 42 comments }

1 Chris Morrell April 23, 2009 at 4:27 am

Would have thought gzip compression would be enabled by default. Browsers have been supporting it for years and it is one of those “duh” things you’d think would be enabled across the web.

2 Paul Freet April 23, 2009 at 7:12 am

Thanks for the link to the test site. I had a configuration error and some of my pages were not being compressed. That was a great help.

One suggestion. To restart your web server, you should be using the wonderful apachectl script.

sudo /usr/sbin/apachectl restart

BTW, you do sudo, don’t you?? :-)

3 mike April 23, 2009 at 11:19 am

The /etc/init.d/httpd script uses apachectl. The init.d scripts are generally the best way to start and stop services because they should be designed for your specific OS configuration.

4 mike April 23, 2009 at 12:38 pm

On further inspection, it looks like you should do a graceful restart instead of just a restart. That will make it use the apachectl command.

5 Craig Hughes April 23, 2009 at 3:21 pm

You probably want to do a reload in this instance not a restart. No need to piss off any visitors who happen to be on your site at the time just for a simple config change…

6 Paul Stamatiou April 23, 2009 at 3:45 pm

That’s one way to look at it. I just want to get the changes up quickly so I can view them without having to question whether I’m looking at the old changes or the new ones.

7 Paul Stamatiou April 23, 2009 at 3:49 pm

Yeah I use sudo but I left it out of this article because last time I had a similar command with sudo in it and about 3 linux gurus called me out saying it wasn’t necessary. It might be necessary in this case depending on your setup. ;D

8 ST April 23, 2009 at 8:39 am

One thing. On some (most?) shared hosting CPU and memory usage are the bottleneck, not bandwidth or response time. Using mod_deflate will only make this worse. In my case, I keep running up against the resource limits of Media Temple’s DV service but bandwidth and response time are not so much an issue. Enabling mod_deflate will make this worse. The question is, how much worse exactly and is that enough to make this not worth doing?

9 wales April 23, 2009 at 11:01 am

Like ST says – best thing is to test the site (at a couple of times during the day if you’re on shared hosting), I found the YSlow plugin to Firebug to be handy for this.

Here’s another optimisation you can make, serve some framework files from Google, eg JQuery ( http://alicious.com/2009/site-optimizing-reducing-bandwidth-with-htaccess/ )

10 Paul Stamatiou April 23, 2009 at 5:12 pm

Wow that’s interesting. I’ve never had *any* resource issues with my (dv) rage. That being said my system load average is always under 0.35-0.40 with normal traffic. Are you pushing tons of traffic?

11 ST April 23, 2009 at 6:34 pm

That’s the thing. We don’t have THAT much traffic and aren’t serving up huge things like video. I have optimized the hosting (turned off mail servers we don’t use, turned off SpamAssassin) and the servers to death (opcode cache, misc. Apache tweaks, MySQL configuration), have made a slew of minor page load optimizations (like moving JS to the bottom of the file, minifying CSS and JS) and set expires headers, and I still cannot really see why we are running into the problems we are (those being QoS black zone alerts about kmemsize mostly, but also some yellow zone alerts about tcpsndbuf).

When I profiled the site with kmemcache I found that our only real suck was the advertising software we run, but there isn’t much we can do about that and it wasn’t THAT bad anyway. It is a bit frustrating (but kind of fun too, I love messing around in unix land).

12 Jeniffer April 23, 2009 at 9:21 am

Thanks for the guide!

13 Grzegorz Daniluk April 23, 2009 at 10:18 am

There is a serious problem with this configuration. IE6 has a bug that sometimes occurs when an HTML page and a JS file are both compressed. As a result, IE6 sometimes displays a blank screen.

So it is safer to disable JS/CSS compression for IE6. The line below enables compression only for newer MS browsers.

BrowserMatch \bMSIE\s(7|8) !no-gzip !gzip-only-text/html

14 mike April 23, 2009 at 11:23 am

Seriously? I suggest you stop supporting IE6 and force the people from the stone age to upgrade.

15 Paul Stamatiou April 23, 2009 at 3:47 pm

I thought I remembered an IE6 issue with compression – so I opened up my VMware Fusion Windows XP install with IE6 and did a bunch of tests on my site and everything loaded perfectly, so I didn’t bother to do any further research. :-)

Do you have any links to documentation about this that I can read? Thanks.

16 Grzegorz Daniluk April 24, 2009 at 10:27 am

I also tested my configuration on IE6 and everything worked fine. Then we got reports from our clients that our web application was broken. So I guess in 99% of cases the bug doesn’t occur.

Fortunately we were able to replicate the problem when loading one PHP file. BTW, after clicking reload on the PHP-generated page, it rendered fine. To find the solution I had to roll back all the changes I had made to the code and the Apache configuration, and worked out that IE6 doesn’t like compressed JS files.

Here are other people discussing a similar problem.
http://www.robertswarthout.com/2007/05/ie-6-apache-mod_deflate-blank-pages/

17 Rarst April 23, 2009 at 11:24 am

I couldn’t make deflate work on my hosting (I am on simple shared one) so after much googling and trying different ways I ended up using zlib compression from PHP for html, css and js.

18 Craig Hughes April 23, 2009 at 3:25 pm

# Don’t compress binaries
SetEnvIfNoCase Request_URI \.(?:exe|t?gz|zip|iso|tar|bz2|sit|rar)$ no-gzip dont-vary

This looked wrong to me — why would you not want to compress EXE files, which are notoriously often bloated with giant piles of repeated content which makes for great compressibility. So I thought maybe that might no longer be true. I ran a test, on a more or less clean install of Windows 7 Ultimate in a vmware image:

#######
[craig@puck:/Volumes/Untitled 1]$ find 'Program Files' Windows -name '*.exe' -print0 | xargs -0 cat | wc -c
cat: Windows/Panther/setup.exe: Is a directory
300999635
[craig@puck:/Volumes/Untitled 1]$ find 'Program Files' Windows -name '*.exe' -print0 | xargs -0 gzip -9c | wc -c
gzip: Windows/Panther/setup.exe is a directory -- ignored
148891342
[craig@puck:/Volumes/Untitled 1]$ dc
2 k 300999635 148891342 - 300999635 / p q
.50
########

50% savings on what are often large files is not to be sneezed at!

19 Craig Hughes April 23, 2009 at 3:26 pm

You almost certainly want to compress .tar and .iso also!

20 Paul Stamatiou April 23, 2009 at 5:14 pm

Thanks for sharing your thoughts Craig. I guess I’m just running this setup based on some limited experience I had where a mod_deflate gzip’d exe file didn’t work properly on the other end. That being said I don’t think I’m serving any exe’s on my blog so I haven’t run into any big needs to do this. Your #’s are persuasive though.

21 Craig Hughes April 23, 2009 at 6:59 pm

It didn’t really occur to me that there might be some kind of bug where there’s a different codepath for downloading an EXE vs any other kind of file, such that it wouldn’t work… But I guess not occurring to me that there’d be a stupid bug like that is one of the pitfalls of no longer using Windows much :)

22 Craig Hughes April 23, 2009 at 7:05 pm

Oh one other thing. While you’re configuring mod_deflate, you should configure mod_expires too. For stuff like image files and the like, you can save a lot of downloads of repeated static content (like icons, background images, etc) by setting an expires tag for some time in the future:

ExpiresActive On
ExpiresByType image/gif "access plus 1 month"
ExpiresByType image/png "access plus 1 month"
ExpiresByType image/jpg "access plus 1 month"
ExpiresByType image/jpeg "access plus 1 month"
ExpiresByType application/pdf "access plus 1 month"
ExpiresByType audio/x-wav "access plus 1 month"
ExpiresByType audio/mpeg "access plus 1 month"
ExpiresByType video/mpeg "access plus 1 month"
ExpiresByType video/mp4 "access plus 1 month"
ExpiresByType video/quicktime "access plus 1 month"
ExpiresByType video/x-ms-wmv "access plus 1 month"

ExpiresByType text/css "access plus 1 hour"

You don’t seem to be using mod_expires right now:

[craig@puck:~/code/simulationcraft-svn]$ curl --compressed -I http://paulstamatiou.com/wp-content/plugins/backtype-connect/images/twitter-16.gif
HTTP/1.1 200 OK
Date: Thu, 23 Apr 2009 23:05:18 GMT
Server: Apache/2.2.3 (CentOS)
Last-Modified: Mon, 20 Apr 2009 20:01:20 GMT
ETag: "7ab001e-3fd-fe724400"
Accept-Ranges: bytes
Content-Length: 1021
Connection: close
Content-Type: image/gif

23 Jeremy Ricketts April 23, 2009 at 10:45 pm

Perfect. This blog is so entertaining AND often useful. A rare thing for blogs these days.

24 Carlton Bale April 24, 2009 at 1:30 pm

If you’re on a shared server or want a simpler solution, you can enable PHP gzip compression by adding the following command to the .htaccess file:

# Note: command to enable PHP compression
php_value output_handler ob_gzhandler
# End PHP compression

(ob_gzhandler must be enabled by your host, but this is pretty common.)

25 Dakotah April 24, 2009 at 2:16 pm

I have the same MT (dv) setup and when I try to restart it tells me:

“DeflateFilterNote not allowed here”

I had to comment it out :-( It’s a shame because I want to (at least for a short period) see all the stats. I searched Google for that error and found no real resolutions either.

26 ST April 24, 2009 at 2:26 pm

Well that only affects the stats, at least you will still gain the benefit of deflate.

27 Michael Whalen April 27, 2009 at 6:52 pm

I’m surprised that there isn’t a single mention of nginx anywhere on this entire page. I guess you’re stuck with Apache on mediatemple… unfortunately =(

28 Brett May 1, 2009 at 6:15 am

@Paul

You can also use this URL to test that your site is gzipped correctly:

http://whatsmyip.org/mod_gzip_test/

(mt) also has a KB on the same topic but your configuration is much better than theirs IMO. I got everything to work if I placed the code into the httpd.conf, I’ll sort out why vhost.conf is not working some other time. Thanks for another awesome article!

29 Issac Kelly May 4, 2009 at 7:10 pm

Good Post, I was looking over it, and in your AddOutputFilterByType line, you didn’t compress JS or CSS

I added these three mime types to mine:
text/css text/javascript application/x-javascript

30 Paul Stamatiou May 4, 2009 at 9:43 pm

Issac – actually that line is just there as an example of how you could do it that way using AddOutputFilterByType. You’ll notice that it is actually commented out. I ended up using the “allow everything EXCEPT” route with the SetEnvIfNoCase lines.

31 Issac Kelly May 12, 2009 at 11:09 am

Right on Paul

32 David December 16, 2009 at 2:04 am

How do you add to the httpd.conf file for the mod_deflate settings as above, but, to apply for only ONE domain, not to all domains that Apache serves? I want to test compression on just my ski prices site Ski Candy.
