On Being a Website Performance Junkie

When it comes to high performance, speed and optimization, I’m there. I have written several articles about this subject in the past, including 5 Ways to Speed Up Your Site, How To: Optimize Your CSS Even More and a brief look into image maps to reduce HTTP calls and bandwidth usage. However, similar to modifying a car, there is always something more that can be done.

Road Atlanta Track
Cars cool down and await heading onto the track at Road Atlanta.

First off, let’s talk about what I have done before. With the last redesign I went through my WordPress theme and gutted it almost entirely: removing features I don’t use, rewriting some PHP to be less server-taxing and compressing CSS with gzip via PHP. I actually don’t cache pages, since the WordPress plugin that handles page caching requires that pages not be compressed. My server can handle non-cached pages just fine and I thought the gains of compressing XHTML markup were substantial - something like 35kB uncompressed vs 8kB compressed. It boils down to a trade-off: a cached page is served quickly but is larger to download, while a compressed page is served slightly slower but downloads faster.
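To get a feel for the size difference gzip makes on markup, here is a minimal Python sketch. It is only an illustration: the site itself compresses via PHP, and the sample page below is a made-up stand-in that is far more repetitive than real markup, so its ratio will look better than the 35kB vs 8kB quoted above.

```python
import gzip

def gzip_savings(markup: str, level: int = 6) -> tuple[int, int]:
    """Return (raw_bytes, gzipped_bytes) for a page of markup."""
    raw = markup.encode("utf-8")
    return len(raw), len(gzip.compress(raw, compresslevel=level))

# Hypothetical, highly repetitive markup standing in for a real blog page.
sample_page = '<div class="post"><h2>Title</h2><p>Some post text.</p></div>\n' * 500

raw_size, gz_size = gzip_savings(sample_page)
print(f"uncompressed: {raw_size} bytes, gzipped: {gz_size} bytes")
```

Running the same function over the saved HTML of a real page gives a more honest number for your own site.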

MySQL Woes

Not caching pages does lead to inevitable MySQL slowdown though. Your server’s database, MySQL, is many times slower than PHP, and PHP ends up waiting on MySQL to process things like posts and sidebar items such as the popular and latest posts. While that is still an issue for me, I have MySQL 4 caching set up at the server level, or so my friends at Media Temple tell me.

This brings me to my next point, web statistics tracking. I have been using a locally-installed web stats application called Mint for 2 years. It has incredible expandability with add-on modules called peppers, but Mint succumbs to the MySQL slowdown issue mentioned above. Every page tracked by Mint includes a small JavaScript file which holds up the entire page load, and the delay depends on how quickly your server executes MySQL queries and how many peppers are installed.

As such I have been using Google Analytics for roughly 2 weeks and plan on using it full time at the end of this month. Analytics works the same way as Mint, by including a JavaScript tracking file with each page load, but Google hosts the JavaScript and their servers are a tad faster and more optimized than yours. So what does this mean in numbers? In my testing with the Firebug net tool, Mint consumed approximately 200 to 300ms of loading time (with crazy peaks of > 500ms at times), whereas Google Analytics only used around 40 - 60ms. That was with me using only basic Mint peppers - Visits, Referrers and Pages. I imagine your numbers would be much higher if you use more peppers, owing to the additional JavaScript included with each page load and the MySQL issues.

You might be thinking something along the lines of “you’re worried about 200 damn ms?” Yeah, I am. When my site loads in 700ms on my Internet connection, 200ms is a big deal.
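To put those figures in proportion, here is a quick back-of-the-envelope calculation in Python. It assumes the 700ms page load is the total that includes the tracking script, which the numbers above don't strictly confirm, so treat the percentages as rough.

```python
def share_of_load(script_ms: float, page_ms: float) -> float:
    """Fraction of the total page load time spent on one script."""
    return script_ms / page_ms

page_ms = 700          # page load time quoted above
mint_ms = (200, 300)   # typical Mint range from the Firebug test
ga_ms = (40, 60)       # typical Google Analytics range

for name, (low, high) in (("Mint", mint_ms), ("Google Analytics", ga_ms)):
    low_share = share_of_load(low, page_ms)
    high_share = share_of_load(high, page_ms)
    print(f"{name}: {low_share:.0%} to {high_share:.0%} of a {page_ms} ms load")
```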

S3

Amazon’s super-affordable and super-fast storage solution S3 is nothing new. I have talked about it before, Jeff has talked about it before, and Jeremy has talked about it before, along with the rest of the Internet. While I have been using S3 to back up personal files for months, I never ventured into using it on my site. That changed last night when I decided to experiment with hosting frequently-served images, such as my logo, mugshot and sidebar images, on Amazon S3.

S3 Firebug PSTAM.com

I decided not to host every image, such as those in articles, on S3 like Jeff Atwood did because that seemed like too much trouble and would break my writing workflow of uploading images directly inside the WordPress Admin panel. However, for people paying for bandwidth, S3 is an excellent solution to off-load images and pay a fraction of the bandwidth cost. Amazon’s servers are fast, which will definitely be noticeable if you’re on a low-level or shared host. For me though, the difference in latency and download times between images hosted locally and on S3 was rather negligible.

Then why am I still doing it? The answer is twofold. By setting up a static file host like S3, which will generally have lower latency, greater maximum throughput and the ability to cope with more requests per second than your host, you give your server a chance to keep up with high traffic levels. Also, by utilizing more than one hostname (i.e., yoursite.com is a hostname, yourbucket.s3.amazonaws.com is another hostname) to serve your content you increase your effective bandwidth, since browsers limit the number of simultaneous connections per hostname. This helps especially if HTTP keepalives are enabled.
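The hostname argument can be sketched as a toy model in Python. Everything here is an assumption made for illustration: files are all the same size, the browser opens two connections per hostname (the limit common in browsers at the time), downloads proceed in neat waves, and each hostname costs one DNS lookup performed one after another. Real browsers are messier.

```python
import math

def download_time_ms(num_files: int, per_file_ms: float,
                     hostnames: int, conns_per_host: int = 2,
                     dns_lookup_ms: float = 0.0) -> float:
    """Toy estimate: files download in waves of (hostnames * conns_per_host)."""
    parallel = hostnames * conns_per_host
    waves = math.ceil(num_files / parallel)
    # Assumption: every hostname needs its own DNS lookup, done sequentially.
    return waves * per_file_ms + hostnames * dns_lookup_ms

# Hypothetical page: 12 static files at 50 ms each, 20 ms per DNS lookup.
for hosts in (1, 2, 6, 12):
    total = download_time_ms(12, 50, hosts, dns_lookup_ms=20)
    print(f"{hosts:>2} hostname(s): {total:.0f} ms")
```

In this model a second hostname helps a lot, while piling on many more eventually costs more in DNS lookups than it saves in parallel downloads.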

Firebug - Mint and S3
The Firebug net tool helping my point. Time spans from left to right.

Of course there are some stipulations with that last part. Spreading your content/media across slow hostnames won’t help your case, and there is a point at which the latency from the DNS lookup needed for each additional hostname outweighs the gains. Amazon S3 is a prime example of a speedy hostname to throw static files on. If your users have HTTP pipelining enabled, they’ll see even greater benefits. A browser with HTTP pipelining enabled sends several requests over a single keepalive connection without waiting for each response, so the responses come back as one uninterrupted stream. Without pipelining, the browser must wait for each response to arrive before sending the next request on that connection, so every file costs a full round trip.
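As a rough illustration of what pipelining saves, here is a deliberately simplified latency model in Python. It assumes a single persistent connection, ignores transfer time and server processing entirely, and uses made-up numbers, so it only shows the shape of the saving.

```python
def fetch_time_ms(num_requests: int, rtt_ms: float, pipelined: bool) -> float:
    """Toy latency model for one persistent connection; ignores transfer time."""
    if pipelined:
        # Requests are sent back to back, so roughly one round trip is paid.
        return rtt_ms
    # Each request waits for the previous response before it is sent.
    return num_requests * rtt_ms

# Hypothetical case: 10 small files over a connection with a 50 ms round trip.
print("without pipelining:", fetch_time_ms(10, 50, pipelined=False), "ms")
print("with pipelining:   ", fetch_time_ms(10, 50, pipelined=True), "ms")
```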

HTTP Pipelining

If Firefox is your primary browser, enabling HTTP pipelining is a simple process.

S3 101

To avoid the inevitable questions about how to get started with using S3 as a file server, I’ll give you a brief walkthrough. Assuming you have an S3 account and some way of connecting to it (I prefer S3fox), create a new bucket. Buckets can be thought of as root-level directories of your S3 account. Bucket names are global, so you’ll have to pick a unique one, just as you have to pick a unique user name on social networking sites.

In my case, my bucket name is “stammy”. To access a file within that bucket, the URL syntax is the following:

http://[bucket_name].s3.amazonaws.com/

As such, if I wanted to access a file “logo.png” inside of the “stammy” bucket, it would be found at

http://stammy.s3.amazonaws.com/logo.png

.. but not yet. You need to configure the Access Control List (ACL) which lets S3 know what is safe to give public permission to. In this case, we need to give everyone read access to “logo.png”. The method of editing the ACL varies by how you are interfacing with S3, either by way of application or your own code, but with S3fox all you need to do is fire up a dialog box by right-clicking on a file and selecting “Edit ACL”.
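The URL pattern above is easy to capture in a small helper. This is a minimal Python sketch; the function name s3_url is made up for illustration, and it only builds the address. It does not upload anything or change the ACL, so the file still has to be made publicly readable before the URL works.

```python
from urllib.parse import quote

def s3_url(bucket: str, key: str) -> str:
    """Build the public URL for a file (key) inside an S3 bucket."""
    # quote() escapes spaces and similar characters but leaves "/" alone.
    return f"http://{bucket}.s3.amazonaws.com/{quote(key)}"

print(s3_url("stammy", "logo.png"))
```

Keeping the bucket name in one place like this makes it painless to switch buckets later without hunting through theme files for hard-coded URLs.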

Overall

By moving to Google Analytics and off-loading commonly-served files to S3, my homepage will (I have not removed Mint yet) load in around 500ms based on my testing on a 5Mbit line. What’s next? I would love to come up with a bulletproof way of caching pages as I’ve had trouble with WP-Cache in the past. I am considering hosting my favicon on S3 after reading about a guy whose favicon consumed 27GB of bandwidth in a month, albeit a 70kB favicon! Also, I am in the process of experimenting with the gains of spreading files across hostnames more thoroughly - gzip-compressed, local CSS versus a non-compressed S3-hosted CSS file. For the most part I am still an amateur with web optimization but I always get a little smirk when I can eliminate a few milliseconds here and there. As they say in car racing, for every 100 pounds shaved off your car’s weight, your quarter-mile (1320 foot) time will drop by 0.1 seconds. In other words, the small things add up.
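For a sense of scale on that favicon story, here is the arithmetic as a short Python sketch. It assumes decimal units (1GB = 1000^3 bytes, 1kB = 1000 bytes), a 30-day month, and that every request transferred the whole file, none of which the original account spells out.

```python
def requests_for_bandwidth(total_bytes: float, file_bytes: float) -> float:
    """How many full downloads of one file add up to a bandwidth total."""
    return total_bytes / file_bytes

GB = 1000 ** 3  # assumption: decimal gigabytes
KB = 1000       # assumption: decimal kilobytes

monthly = requests_for_bandwidth(27 * GB, 70 * KB)
print(f"{monthly:,.0f} favicon requests per month, about {monthly / 30:,.0f} per day")
```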

43 Comments

  1. Nice tutorial. Thanks for all the great content. Safe Journey to SJ and enjoy WordCamp.

  2. Great write-up Paul. One thing to keep in mind when making the switch from Mint to Analytics is that Google’s stats are delayed a few hours… whereas Mint is a real-time stats solution.

    I know Mint slows things down for us, but I keep it installed and check it incessantly throughout the day. I like seeing who is linking my site right now.

  3. Yeah, I will miss the live referrers feature. As for stats, did you know you can track stats per hour with Analytics - I use an OS X dashboard widget called Dashalytics. However, I think it relies on the older Analytics version which will get phased out today/17th, so the widget will change.

  4. Paul,

    When you’re really looking to shave off a little load time, consider combining your images into one. Google realized this would save users time:
    http://www.google.com/images/nav_logo3.png

    Then just use the CSS background-position property to control which part of the image is shown.

    Looking forward to seeing you this week.

    Phil

  5. Well written - shame that I went with Mint about a month ago. I have used Google Analytics since day 1 though.

    You tweeted the other day about GA ‘missing’ a lot of your traffic though?

  6. @Chris - it seems that was a fluke, all is well now and all of the data that wasn’t reported is now there.

  7. Phil, yeah I talked about image maps in an old post http://paulstamatiou.com/2006/04/16/customizing-k2-part-4/

    As for using it myself, I hadn’t thought about using it with my current design - I only used it in an old design with a footer image with a bunch of logos linked to their appropriate things. Although that was just one big image that was all displayed, I should definitely combine some of my smaller images into one. Cool to see how many images they stick in there though. =)

    BTW, I think I’m hitting up Google campus on Friday, I’ll give you a ring.

  8. great post, man. it’s amazing to think that just a little while ago, you still had this site running off remnants of K2 code and what not.

    I’m embarrassed to admit that you lost me around the client / server diagram, but I’m learning.

    no detraction from your post, phil’s google image map comment was probably the most mind blowing thing I learned today. Can’t believe I never did a safari activity check on a google search page.

    p.s. you’ll break 5k someday, haha.

  9. It won’t help you much, but anyone who is self-hosting a site and wants a solid caching mechanism should look into setting up a reverse proxy. Basically, you set up a proxy that serves up all of your web docs. If a document does not exist in the proxy’s cache, or is out of date, the proxy will forward the request to your web server (typically listening on a different port) and store the result so that it can handle future requests without the web server’s help. Reverse proxies can typically handle compression, too. Google for Perlbal or Squid for more info.

  10. Hey paul I been monitoring your site (MT)and a few others (bluehost, dreamhost etc) using moitor.us and various other tools. I found your response time(MT Grid) a bit slow. I eliminated sites hosted on dedicated accounts. I found that people hosted on bluehost seemed to have much better response time. I will do a write up shortly when I get back home from office. what are your thoughts on this?

  11. Also make sure to try out CSS compress:

    http://dev.wp-plugins.org/wiki/css-compress

    upload, activate, done, works like a charm.

  12. We should meet up in SF this week and work on this. WordPress performance tweaking is my specialty.

    Instead of S3, I like to set clients up with a CacheFly account and then do an automated rsync of all image files on cron. Then I rewrite the image URLs in entries/postmeta on the fly. So they just upload to WP as before, but images are loaded from an external server.

    WP-Cache is a must-have. It can give you 100-1000x more HTML throughput.

  13. That was a good post, thanks.

  14. @Mark - WP-Cache is janky for me and when someone leaves a comment it doesn’t update for some reason and people end up leaving the same comment 3 times. =/ But the rest of what you said sounded nice. I take it you also consult for companies with high-traffic WP installs?

  15. Paul,

    This is super info, and timely for me since I just moved to a new domain and want to ensure I optimize as best I can.

    Currently I am using the widget feature within WordPress 2.2.1 for one widget only: archives … everything else is hard-coded into my sidebar.php

    Would I be better off just doing away with widgets altogether, to the point of even removing the function call for registering them (within the theme’s functions.php file, for example), or are widgets not that much of a loading concern?

  16. @Bruce - PHP is so much faster than MySQL in processing time that it is fairly negligible, however it is always good to bypass extra code right?

  17. Thanks, Paul. I appreciate the insights very much!

  18. Very interesting, I’ll try to put some static content on S3. We use it a lot for our Web-apps, but never thought about putting static files there.

  19. Excellent article Paul…i find it interesting in the mint/google analytics speed…i just installed mint on my site and find it a lot more user friendly than the google analytics. It is still crazy it causes that much delay than google that is off your server…

  20. I didn’t mean to be so blunt about it. On smaller sites, Mint is great in terms of features and live stats reporting, it just doesn’t appear to scale well.

  21. Do you mind telling us how much your s3 bill is for hosting those small images?

  22. PHP is so much faster than MySQL in processing time that it is fairly negligible, however it is always good to bypass extra code right?

    Not always (is PHP so much faster than MySQL). In fact, since WordPress 2.1 and up have posts queries that remain static (

    WHERE post_status=’publish’

    as opposed to the old WHERE post_date_gmt ), hosts who enable MySQL qcache will probably make PHP the bottleneck.

    For instance, on my personal blog, I just got “Dynamic: 0.329 seconds | 10 queries (0.050 seconds)” — so MySQL is only 13% of the page generation time.

    Things that can really kill you are (a) showing all your categories in the sidebar (particularly if you have a huge category tree), (b) listing recent posts (in a non-cached way), (c) poorly coded plugins that do a lot of pushups on every load when they could cache stuff and update when the info changes.

  23. Nice post! I never thought of removing the comments from CSS to speed up load time.

  24. Just yesterday the company I work for found some problems with IE6.0’s support of gzip. Probably not much of a concern here as I doubt many people are running IE6 when coming to this site (and a non-updated versions of IE6 at that!). Something to consider. Maybe.

    Side note: This post was killer.

  25. Also related are domain name limitations (ie., simultaneous downloads from the same domain - varies per browser), and future “expires” headers for caching (eg. a file which is valid until 2010 should be aggressively cached.) Y! has published a few articles via the YUI blog on performance, recommended reading.

  26. OK folks, as a noob to this stuff, you have really gotten my interest up.

    How does one go about measuring these times that you are talking about? I’d like to see what factors are influencing my loading. I use bluehost.

  27. I’ve been running WP Cache on my site for awhile, and I’ve never had any problem with comments showing up. If for whatever reason the caches aren’t clearing on-comment, you can just tell WP-Cache not to cache article pages.

    You might also look into writing a simple little caching php script using output buffers. I made something like that before I started using WP cache, and i just modified the comment submit handler to also clear the caching system i wrote.

    Finally, what about stored MySQL procedures? If you’re running the same queries for populating the homepage, archives, etc you might be able to shave a few ms using stored procs.

    Great writeup, I learned quite a bit.

  28. Great write up, really useful.
    You seem to have good experience on web stat tools.
    I have been using GoStats.com , they provide a log size of 1000 visitors, have u used any tool with a greater log size?

  29. For my latest work in progress site, bandwidth is a problem. Now I am thinking about getting a S3 account. I am just worried I’ll use so much bandwidth that I end up paying more because of S3.

  30. I switched all my sites stats from Mint to Performancing Metrics (http://pmetrics.performancing.com/) a few weeks ago and am very pleased with it, as it has tons of features that neither Mint nor Google Analytics (which I also tried but didn’t like) have.

    The only problem I have, which I haven’t solved is that I have been using Mint for two years, and all the historical data for visitors is only recorded with Mint. So unless I can transfer the data to my new monitoring system, I have to keep both running at the same time or no longer have comprehensive up-to-date historical stats.

    Has anyone found a solution to this irritating problem?

  31. I’ve noticed my site load times slow down a little when I have Google Analytics running. I decided to take it out just for that reason.

  32. Very informative, thanks.

  33. Great stuff. I do performance testing of web sites for a living. ;-)

  1. [...] Stamatiou on improving web site performance with S3. - nothing too new here, but he does a nice job of explaining [...]

  2. [...] On Being a Website Performance Junkie - PaulStamatiou.com Interesting stuff. I’m not quite this obsessive, but I probably should be. (tags: webdev) [...]

  3. [...] So how do you make a really smart “tree-state-button”? Swapping images on things dynamically with CSS or JavaScript brings some problems. When you hover the mouse over the button for the first time, for example, the image for the “hover” state has to be fetched and there is a delay. The same goes for when you click the button. If you click too quickly you never even get to see the button’s pressed state. This can of course be avoided by preloading the button’s three images. However, that means two unnecessary HTTP requests, which is a great pain for the true “Performance Junkie”. [...]

  4. [...] Junkie or speaking for myself, a Wordpress Performance Junkie Blogger. One of his latest posts, On Being a Website Performance Junkie got me thinking. As i’m new to Amazon S3 i got excited from the idea of moving files like [...]

  5. [...] Setting up an S3 account was easy. Learning how to use it took a bit of effort. My starting point was some of super-geek Paul Stamatiou’s material, starting with his how-to on being a website performance junkie. [...]

  6. [...] improvements far outweigh the compatibility issues that may arise. If you’re a real performance junkie you may be interested in a simple hack to enable gzip with [...]

  7. [...] important) response time for the sites they hosted over MT. In fact I even commented on this on paul stamatiou’s blog 2 months back and wanted to do a blog post about it but never got down to doing [...]


Copyright © 2005 - 2008 PaulStamatiou.com