I thought I would update everyone on the ClassicKidsTv site. The launch I posted about was short-lived: after around 24 hours the site couldn’t cope with the sheer number of simultaneous hits it was getting and started crashing continually. Thankfully Bytemark’s SMS alerts kept me up to date, and my new Qwerty SPV phone allowed me to SSH in from anywhere to reboot the box or restart Apache. After two days I got extremely frustrated with the crashes and realised we had some major problems, so I took the wiki down, leaving only the forum up along with a note on the homepage redirecting visitors.
As a temporary solution I uploaded the old site onto the new box and reconfigured Apache to point visitors to the old design. This solved the immediate problem and we were at least back online, although I still needed a fix to get the MediaWiki site back up and running. At this point I had three options. Firstly, I could reinstate the old site permanently and forget about the wiki, although this would mean losing the weeks of work that Orbling, Em and I put in to get the site built, not to mention all the work I put in getting the box set up and learning MediaWiki, MySQL and Apache to a fairly expert level, so really not an option! Secondly, I could convert the wiki pages into a static HTML design, but this would require coming up with a new design and losing the community editing functionality. The last (and preferred) option was to figure out what was wrong with MediaWiki and fix it!
I did some research, including observing a crash as it happened and rooting through the system logs, and it turned out there were a number of reasons for the crashes. Firstly, the machine was having to do a lot of processing because of MediaWiki. For the first time the site was running from a purely dynamic system, whereby a user requests a document and Apache, MySQL and MediaWiki all work together to build the page on the fly and send it to the user. This kind of setup is employed by many sites and content management systems such as Drupal, Moodle, Joomla and WordPress, and it does have its advantages: pages can be updated by anyone, dynamic pages can adjust very fast to changes and, best of all, many users can collaborate on the same project. There are some major disadvantages too, mainly speed. Page builds are very slow on this kind of system due to all the co-operation needed between the applications. On a MediaWiki website, each time a page is requested Apache passes the request to MediaWiki, MediaWiki pulls the page content from the MySQL database and builds the page, and Apache sends the completed page out. This is in contrast to the old system, where static pages were requested and sent quickly by Apache with no other involvement needed.
Secondly, the Apache server can only support a limited number of simultaneous connections, and while a page is taking a long time to build it occupies a connection slot. As time goes on more and more slots are used up, and if the load gets too high Apache will eventually run out of RAM and connections and crash the box. The extent of this was revealed when I adjusted the maxservers parameter in apache.conf. The default configuration allowed for a maximum of 20 clients on Apache, which is quite a lot when you consider that most requests on a web server will be fulfilled in a few seconds. Normally you shouldn’t make this number too high as each Apache process uses a lot of RAM, but I knew it was worth a test anyway. I adjusted the maxservers setting to 40, then 80, both of which filled up almost as fast as 20 slots did, so I knew that Apache alone could not handle the load the visitors were demanding. I was also limited by the small amount of RAM I have on the rented box, which can be upgraded, although at a cost I cannot really afford.
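To put rough numbers on why going from 20 to 40 to 80 slots made so little difference, here is a minimal Python sketch using Little's law (average requests in progress = arrival rate × time per request). The traffic figures and the function names are made up for illustration; they are not measurements from the site.

```python
def slots_in_use(requests_per_second: float, seconds_per_page: float) -> float:
    """Little's law: average number of requests being served at once."""
    return requests_per_second * seconds_per_page


def is_saturated(requests_per_second: float, seconds_per_page: float,
                 max_clients: int) -> bool:
    """True when average demand meets or exceeds the available slots."""
    return slots_in_use(requests_per_second, seconds_per_page) >= max_clients


# Hypothetical figures, chosen only to illustrate the effect.
rate = 30.0        # page requests arriving per second
slow_build = 5.0   # seconds to build one dynamic page under load

for max_clients in (20, 40, 80):
    demand = slots_in_use(rate, slow_build)
    full = is_saturated(rate, slow_build, max_clients)
    print(f"{max_clients} slots: demand {demand:.0f}, saturated={full}")
```

With these invented numbers the demand works out at 150 slots, so 20, 40 and 80 all fill up. Adding slots only delays the point where they run out; bringing the time per page down is what gets demand back under the limit.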
A clever solution was suggested to me by my good friend Karl, in the form of a guide to using MediaWiki with the Squid proxy. Wikipedia itself uses multiple Squid servers, so it is certainly a tried and tested method. The solution involves using a reverse proxy server to request the dynamic pages and cache them so they are already built when the next visitor requests them. A normal proxy server works by caching web pages viewed by users on a network, so that when any other user on the network requests the same page it is served from the on-site proxy cache. That way the page does not need to be re-fetched from the web, which saves time and money. Proxy servers can also be used in a similar way by ISPs, and some networks use them to filter out objectionable content in the same way a firewall filters out dangerous network traffic.
Using the free, open source Squid proxy as a reverse proxy allows a previously dynamic page to be cached on the web server itself in a static form. It works a little something like this. The server is set up so that Apache can no longer be seen directly by the outside world: Apache still runs on port 80, although only locally on the machine. Squid is installed and set up to run on the external port 80 (the one internet users connect to), and it is configured so that certain pages are not cached, mainly forum posts and wiki edit pages, neither of which needs to be cached. So now, when a page is requested by a user for the first time, Squid sends the request to the web server. Apache and MySQL build the page and it is sent through to Squid, which keeps a copy and sends it out on port 80 to the visitor. The next time the same page is requested, Squid checks its cache to see if the page has already been cached (it also checks to see if the page has changed, and if so it grabs a new copy to send out). If it finds the page in its cache it retrieves it and sends it through to the visitor, completely cutting out any involvement by Apache or the database.
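The request flow described above can be sketched as a toy model in Python. This is not how Squid is implemented or configured; the class, the `is_cacheable` rule and the example paths are all hypothetical, and the "has the page changed" check is simplified to an explicit `purge` call.

```python
from typing import Callable, Dict


def is_cacheable(path: str) -> bool:
    """Hypothetical rule: skip forum pages and wiki edit pages."""
    return not path.startswith("/forum") and "action=edit" not in path


class ReverseProxyCache:
    """Toy stand-in for Squid sitting in front of a slow backend."""

    def __init__(self, backend: Callable[[str], str]) -> None:
        self.backend = backend            # builds a page (Apache + MySQL here)
        self.cache: Dict[str, str] = {}   # path -> finished page
        self.backend_hits = 0             # how often the backend did the work

    def get(self, path: str) -> str:
        if is_cacheable(path) and path in self.cache:
            return self.cache[path]       # served without touching the backend
        self.backend_hits += 1
        page = self.backend(path)
        if is_cacheable(path):
            self.cache[path] = page
        return page

    def purge(self, path: str) -> None:
        """Drop a stored page, for example after it has been edited."""
        self.cache.pop(path, None)


def build_page(path: str) -> str:
    """Stand-in for the expensive dynamic page build."""
    return f"<html>content of {path}</html>"


proxy = ReverseProxyCache(build_page)
proxy.get("/wiki/Main_Page")   # first request: built by the backend
proxy.get("/wiki/Main_Page")   # second request: served from the cache
proxy.get("/forum/topic-1")    # never cached, always goes to the backend
print(proxy.backend_hits)      # 2
```

Three requests arrive but the backend only does the work twice; on a busy wiki where the same few pages are requested over and over, that difference is where the saving comes from.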
After some testing on a clone of the machine and then moving the configuration over to the live box, this setup has worked excellently for us. We have more RAM to play around with and much faster page loading times, and we have had no downtime in two weeks. It is an ingenious solution, so I really thought it should be shared here for everyone to appreciate how well it works! If anyone has any comments or questions please let me know!