More about Performance Tuning

Posted by Stuart Herbert on January 31st, 2008 in Toolbox.

Mike Willbanks recently wrote a good article about performance tuning. There’s some good advice in there, and I thought it’d be a good idea to quickly add a bit more detail about the separate approaches that Mike raises.

Mike recommended using APC for bytecode caching. APC’s pretty good, but just be aware that APC isn’t compatible with Sara’s excellent runkit extension. Xcache is, but some versions of Zend Optimizer refuse to run if they detect Xcache has been loaded. (Btw, Zend Optimizer is worth looking at, but because of the way Zend compile it, it can affect overall scalability. I haven’t sat down yet and worked out whether Zend Optimizer’s performance improvements make up for the cost of how the Linux kernel has to load it into Apache. I touched on the issues with how things are compiled last year, but haven’t followed it up yet with any definitive figures on scalability.)

If you don’t need Zend Platform’s download server (which rocks), then XCache + Zend Optimizer + Memcache out-performs Zend Platform substantially, and costs a lot less too ;) Zend Platform also isn’t compatible with runkit. It’d be great to see runkit supported better by accelerators.

Memcache is best suited to storing smaller pieces of data. If you’re using it to cache whole XHTML pages, the larger pages sometimes don’t fit into Memcache, and need to be cached on disk instead. (Always cache onto local disk, never NFS.) Memcache divides the memory allocated to it into buckets of different sizes for performance reasons, and there are far more small buckets than there are large buckets. You can edit the Memcache source code and change the size of the largest bucket before recompiling.
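As a rough illustration of the "small items in memory, big pages on local disk" split, here is a minimal sketch in Python (the same logic ports straight to PHP). The names DictCache, store_page and load_page are hypothetical, DictCache only stands in for a real Memcache client, and the size threshold is an assumption you would tune to your own Memcache build.

```python
import hashlib
import os

# Assumption: stay a little under the largest bucket so the key and item
# overhead still fit. Adjust this to match your own Memcache build.
MAX_ITEM_BYTES = 1000 * 1000


class DictCache:
    """Hypothetical stand-in for a Memcache client (set/get only)."""

    def __init__(self):
        self._items = {}

    def set(self, key, value):
        self._items[key] = value

    def get(self, key):
        return self._items.get(key)


def _disk_path(cache_dir, key):
    # Hash the key so any URL or identifier maps to a safe file name.
    digest = hashlib.sha1(key.encode("utf-8")).hexdigest()
    return os.path.join(cache_dir, digest)


def store_page(cache, cache_dir, key, page_bytes):
    """Store small pages in the memory cache, oversized ones on local disk."""
    if len(page_bytes) <= MAX_ITEM_BYTES:
        cache.set(key, page_bytes)
        return "memory"
    with open(_disk_path(cache_dir, key), "wb") as handle:
        handle.write(page_bytes)
    return "disk"


def load_page(cache, cache_dir, key):
    """Return the cached page, or None on a miss in both places."""
    page = cache.get(key)
    if page is not None:
        return page
    try:
        with open(_disk_path(cache_dir, key), "rb") as handle:
            return handle.read()
    except FileNotFoundError:
        return None
```

cache_dir should point at a directory on local disk, for the reason given above. The sketch ignores expiry and cleanup of the disk files, which a real cache would need.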

The GZIP trick Mike mentions just isn’t safe with IE6. There are copies of IE out there that fail to decompress the content correctly, alas :( I remember reading a stat that about 1% of copies of IE had this bug, but I don’t have the link to hand. I have seen copies of IE with this bug myself. There’s nothing more frustrating than looking at two copies of IE, both reporting exactly the same version number, where one copes with GZIPed data whilst the other doesn’t :( It’s possible that the widespread adoption of IE7 has “fixed” a lot of these buggy IE copies.

I’d recommend placing more emphasis on the 304 Not Modified response, and also on making sure that your code is architected to send it back as quickly as possible, before doing any of the expensive work of building the page. It not only improves per-page performance, but reduces per-page memory usage, and substantially improves scalability. Getting this right can make a huge difference, especially for sites where users normally view more than one page per visit. And make sure the metadata you use to work out whether or not you can send back Not Modified is fine-grained enough :)
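To make "as quickly as possible" concrete, here is a minimal sketch in Python of checking the browser's validators before any page-building work happens. The function name conditional_response and its arguments are hypothetical; it assumes you already have a cheap way to get the page's last-modified timestamp and an ETag, and it simplifies If-None-Match to a single exact match.

```python
from email.utils import formatdate, parsedate_to_datetime


def conditional_response(request_headers, last_modified_ts, etag, render_body):
    """Return (status, headers, body), skipping the render when the
    browser's copy is still current."""
    validators = {
        "ETag": etag,
        "Last-Modified": formatdate(last_modified_ts, usegmt=True),
    }

    # If the browser sent an ETag, it decides the outcome on its own.
    # Simplification: real requests may carry a list or weak validators.
    client_etag = request_headers.get("If-None-Match")
    if client_etag is not None:
        if client_etag == etag:
            return 304, validators, b""
    else:
        since = request_headers.get("If-Modified-Since")
        if since:
            try:
                since_ts = parsedate_to_datetime(since).timestamp()
            except (TypeError, ValueError):
                since_ts = None  # unparseable date: send the full page
            if since_ts is not None and int(last_modified_ts) <= since_ts:
                return 304, validators, b""

    # Only now do the expensive work of building the page.
    return 200, validators, render_body()
```

The point of passing render_body as a callable is that templates, database queries and the like never run on the 304 path, which is where the memory and scalability savings come from.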

Also, on the subject of Not Modified … don’t take it for granted that Apache is getting this right for your static files. Off the top of my head, I can’t remember which Apache module disables this (I think it was mod_include, but I could be wrong), but check the HTTP traffic to make sure your site isn’t re-sending static files when it doesn’t need to.
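One way to check the HTTP traffic without a packet sniffer is to replay what a browser does: fetch the file, then ask for it again with the validators the server handed out. A minimal sketch in Python, using only the standard library; the function name revalidates is hypothetical, and the host, port and path are whatever you want to test.

```python
import http.client


def revalidates(host, port, path):
    """Fetch path once, then repeat the request with the validators the
    server gave us. Returns True if the second answer is 304 Not Modified."""
    conn = http.client.HTTPConnection(host, port, timeout=10)
    try:
        conn.request("GET", path)
        first = conn.getresponse()
        first.read()  # drain the body before sending the next request

        conditional = {}
        if first.getheader("Last-Modified"):
            conditional["If-Modified-Since"] = first.getheader("Last-Modified")
        if first.getheader("ETag"):
            conditional["If-None-Match"] = first.getheader("ETag")
        if not conditional:
            # No validators at all: a browser has nothing to revalidate with.
            return False

        conn.request("GET", path, headers=conditional)
        second = conn.getresponse()
        second.read()
        return second.status == 304
    finally:
        conn.close()
```

If this returns False for a plain image or stylesheet, the server is sending the whole file on every request, and that is worth tracking down in the Apache config.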

With SQL queries of the form “SELECT … FROM table WHERE primaryKey IN ( … )”, be aware that the maximum size of the IN list varies from database server to database server, and it doesn’t take all that big a list before you run into portability problems.
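One portable way around the limit is to split a long key list into several smaller IN lists and merge the results. A minimal sketch in Python, using SQLite purely for illustration; the items table, its id and name columns, and the batch size of 500 are all assumptions, so pick a batch size that is safe for the strictest database you need to support.

```python
import sqlite3


def fetch_by_ids(conn, ids, batch_size=500):
    """Fetch rows for many primary keys using several bounded IN lists."""
    rows = []
    ids = list(ids)
    for start in range(0, len(ids), batch_size):
        batch = ids[start:start + batch_size]
        # One placeholder per key keeps the query parameterised.
        placeholders = ", ".join("?" for _ in batch)
        query = f"SELECT id, name FROM items WHERE id IN ({placeholders})"
        rows.extend(conn.execute(query, batch).fetchall())
    return rows
```

The trade-off is a few extra round trips to the database in exchange for a query shape that behaves the same everywhere.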

One important thing Mike didn’t touch on was about separating out static files onto a separate box. Apache + mod_php doesn’t serve static files very efficiently. With static files on a separate box, you can recompile Apache to use the “worker” MPM, which serves static files substantially better, or you can use an alternative web server such as lighttpd.

There are plenty of other things you can do to optimise PHP on servers, such as tuning Apache to prevent swapping, tuning the Linux TCP/IP stack to reduce connection failures at peak times, and moving your database off onto a separate box. I’m going to go into these in a lot more detail at a later date.

Finally, xdebug is a fantastic tool for profiling your code and telling you where you have inefficient loops and whatnot. It takes the guesswork out of finding bottlenecks!

15 Comments

  1. Mike Willbanks says:
    January 31st, 2008 at 10:48 pm

    Stuart,
    Great job on picking up where I left off! :)

    You hit many of the edge cases that I wanted to put in, but I decided to keep mine as an overview and leave some of the edge cases out to save on writing time.

    I had about 5 or 6 more topics that I thought about covering, however, I might leave that for another -10 degree (F) day.

  2. mibus says:
    February 1st, 2008 at 12:36 am

    I believe you can use the worker MPM with Apache alongside PHP, as long as you’re using (eg.) FastCGI rather than mod_php.

  3. gaetano says:
    February 1st, 2008 at 9:36 am

    About IE6 and decoding gzipped content: it might be a different issue, but when I have seen it happen the problem was actually introduced by the proxy (MS ISA server) upgrading the browser request from HTTP 1.0 to 1.1 and forgetting to inflate the responses received before passing them back to the browser. IE does not think that HTTP 1.0 responses can be deflated, so it does not reinflate them either. The quick workaround was to set ‘use http 1.1 w proxies’ on.

  4. Stu says:
    February 1st, 2008 at 10:18 am

    @mike: thanks!

    @mibus: I don’t recommend FastCGI. The last time I tested it, it wasn’t stable under heavy load. I’ll make a point to look at the different ways to run PHP as part of my Web Platform series.

    @gaetano: I’ve seen the GZIP problem happen with no proxy servers involved :( I’m hoping that this month’s forced upgrade to IE7 will sort out nearly all the remaining copies of IE6 that suffer from this bug.

  5. Andrew says:
    February 1st, 2008 at 12:25 pm

    Win XP up to SP1 will exhibit the JavaScript and CSS decompression bugs. (There are actually several different failure modes.) One can use the user agent matching of the Apache 2 gzip module to not gzip CSS and JS for those versions of IE6 while still compressing it for the WinXP SP1+ version of IE6.

    YMMV, but at this point, about 8% of our traffic is from WinXP pre-SP1 boxes.

  6. Trophaeum says:
    February 2nd, 2008 at 12:31 pm

    Sadly it seems as though many of the things posted in these are still things that haven’t been tested in a long time. I am yet to find a single issue with apache2 using mod_deflate with IE6, and that includes the fact that it’s compressing css and js files with it as well… retest things people, there is no permanent perfect solution!

    As for fastcgi stability, I have a server sustaining over 120req/sec 24/7 and it’s perfectly stable with event MPM apache2, php 5.2.5 fastcgi, mod_deflate, xcache, memcached…

  7. Anthony Ferrara says:
    February 3rd, 2008 at 6:25 pm

    I agree with Trophaeum. I currently run a FCgi powered php site that’s seen over 900 requests/sec (Single dual xeon server with 2gb ram). Never had a problem (but I’m also not using Apache). About Zend Optimizer, all testing I’ve done shows it is absolutely not worth it. While it may make a very slight improvement in load times for specific scripts, it does slow down others (based on my testing).

    I like APC, but Xcache is the best out there. It’s compatible with FastCGI, light, and FAST. As for size limitations for memcache, don’t forget that xcache can store userland vars as well…

    As far as serving static content, Apache + mod_php does not serve static content well. Apache (or lighttpd), with php as FastCGI serve static content quite well…

    I really question the numbers on IE6 pre SP1 installs… I run a few very high traffic sites, and IE6 makes up an average of 25% of traffic. So, does that mean 1/3 of all IE 6 users are still using < SP1?

    Moving the DB to a separate box is only a fix SOMETIMES. It will only help if you are disc, CPU or memory bound. Don’t forget that running a DB locally has VERY little network overhead, but running it on a separate box adds latency. While the latency is very small, it adds up on every query. Plus, semi large result sets will tie up network bandwidth (a 100kb result will take about 8ms to transfer, ignoring latency, over a 100mbit line). That adds up quickly…

    I notice you mention using Lighttpd for static content. Why not use it for PHP as well (as a FastCGI dispatcher)? It’s MUCH simpler, easier to maintain, and cheaper (only one server needed). In all the tests I have done between Apache and Lighttpd, Lighttpd beats Apache hands down (and in some very significantly). For example, mod_php supported 200 req/sec at 100% CPU. Lighttpd+Fcgi supported 2000 req/sec of the same script at 50% CPU…

    Just my comments here..

  8. Mike Willbanks says:
    February 10th, 2008 at 5:52 am

    @Anthony
    The difference between Apache and Lighttpd is really not that large. You can certainly performance tune apache, taking out many of the default modules to make it perform at a much higher rate. One of the reasons for running 2 web servers on a local box is to have 1 for sending out non-dynamic content and one for processing, in this case, PHP.

    This is where squid is extremely helpful if you want the port to be blind to the external user. You have squid sit in front of your web servers, caching certain items and then passing the request on to apache or lighttpd for its respective items.

  9. Hacking Ajax | Links: PHP Performance says:
    February 13th, 2008 at 8:02 pm

    [...] an excellent collection of web app performance tips. The loop example alone is worth the read. Then Stuart Herbert picks it up with some thoughts on XCache + Zend Optimizer + [...]

  10. TuxLives says:
    February 21st, 2008 at 5:51 pm

    I have yet to see a good comment/test around reducing include calls and function overhead.

    I.e., by including all functions in one file, you make only one call to the filesystem to include it, but you are now instantiating all those functions (which you may or may not need).

    I’d love to know what is the crossover point?

  11. Stuart Herbert's Blog: More about Performance Tuning by Joe McLaughlin’s Blog says:
    February 27th, 2008 at 7:45 am

    [...] off of a previous article from Mike Willbanks, Stuart Herbert has posted some of his own thoughts on tuning and tweaking your applications for the best performance you can get out of them. [...]

  12. Jones says:
    February 26th, 2009 at 10:29 am

    How do you go about “tuning Apache to prevent swapping”?

  13. WEB Server Performansını Nasıl Arttırırım? -2 | Outlier says:
    July 15th, 2010 at 8:39 pm

    [...] http://blog.stuartherbert.com/php/2008/01/31/more-about-performance-tuning/ [...]

  14. WEB Server Performansını Nasıl Arttırırım? -3 | Outlier says:
    July 16th, 2010 at 10:20 pm

    [...] http://blog.stuartherbert.com/php/2008/01/31/more-about-performance-tuning/ [...]

  15. WEB Server Performansını Nasıl Arttırırım? -4 | Outlier says:
    July 17th, 2010 at 10:59 pm

    [...] http://blog.stuartherbert.com/php/2008/01/31/more-about-performance-tuning/ [...]