Simple load balancing with PHP & MySQL

I run a number of sites on a lot of different hardware configurations. When a site becomes too slow, you have a few options:

  • buy a new, faster server
  • buy more servers
  • fix the software (optimize; better/more caching)

Usually we host sites built on the Java/J2EE platform; these sites are made for medium-sized to (very) big companies and are all built to last and to be scalable. Clustering/failover/load balancing are options built into these systems by default. The main products my employer sells support clustering out of the box.

Now enter my hobby projects, all of them in PHP and written in a few weekends and nights. One of them got a bit out of hand. With 75-100 concurrent users, my single-CPU, cheapo hardware took 20 seconds to render an average page. The site was entirely uncached, loaded with features (the most feature-rich dating platform in the world, I dare say) and written badly.

This site is running on very cheap (< Euro 300) hardware (Celeron 2.4 with 512 MB of memory and 2 × 40 GB IDE HDs); these boxes are homemade with the cheapest materials money can buy. It is hardly possible to do it cheaper than that. Why did I choose this? Because it is actually a free (as in beer) site, so I don't want to spend a lot of money on hardware/hosting/anything.

I first fixed the code to be slightly more efficient and, at the same time, started to build in caching. You must know: dating sites have caching ‘challenges’, because users are continuously doing searches and changing stuff while logged in. So I had to make a cache-per-person for most of the pages. After these changes the site could hold out, on the same hardware, to 400-500 concurrent users. After a while, the number of concurrent users began to go over 500, and later even over 600. The pages, at night, took 20 seconds again to load 🙁

The site was completely unprepared for load balancing, in any way. So the next step I took was separating the web part and the database part and putting them on different servers. You have to know that the web part, with all user photos, currently weighs about 4 gigabytes (with more than 30.000 members), and the database is 3 gigabytes. I separated it onto 2 servers with the same homemade crappy hardware described above.
This trick was great; suddenly the site could handle up to 700-800 users with ease.
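The cache-per-person approach mentioned above can be sketched roughly like this: the cache key includes the user id, so each logged-in member gets a private cached copy of a page. This is a minimal file-based sketch; the function names and cache path are illustrative, not the site's actual code.

```php
<?php
// Hypothetical per-user page cache. The key includes the user id because
// every logged-in member sees personalized pages and search results.
function user_cache_key($user_id, $page, $params) {
    return md5($user_id . '|' . $page . '|' . serialize($params));
}

// Return the cached HTML if it exists and is fresh enough, else false.
function cache_get($key, $max_age, $dir = '/tmp/pagecache') {
    $file = $dir . '/' . $key;
    if (file_exists($file) && (time() - filemtime($file)) < $max_age) {
        return file_get_contents($file);
    }
    return false;
}

// Store rendered HTML under the per-user key.
function cache_put($key, $html, $dir = '/tmp/pagecache') {
    if (!is_dir($dir)) {
        mkdir($dir, 0777, true);
    }
    file_put_contents($dir . '/' . $key, $html);
}
```

The downside, of course, is that a per-person cache multiplies the cache size by the number of active users, which is why it only stretches the hardware so far.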

But of course, it was still growing, and the 800-concurrent mark was passed 4 weeks ago. More users made the site slower and slower and slower and…

The problem seemed to be high traffic on both the web frontend and the database. The web frontend could easily be fixed: I would add another box (or even more) and give them DNS names www1, www2, www3, etc. Then the index page on the main domain would throw a user to one of those boxes randomly:

$server[] = "www1";
$server[] = "www2";
$server[] = "www3";
$r = rand(0, sizeof($server) - 1);  // rand() is inclusive on both ends
header("Location: " . $server[$r]);

This is the real lame-brain load balancing; better load balancing would be, of course, using a hardware load balancer or BIND DNS round-robin. But I don't have the money for hardware, and for the software option I would need the domain to be managed by my own DNS; it isn't, and the provider it is at doesn't know what load balancing is. So I will move it eventually, but for now this works fine.
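A software-only approximation of round-robin, still without needing control over DNS, would be to keep a shared counter and cycle through the frontends instead of picking one at random. This is just a sketch of that idea (the counter file path is illustrative), not what the site actually runs:

```php
<?php
// Sketch: software round-robin over the frontends, using a counter file
// as shared state between requests instead of DNS round-robin.
function pick_frontend($servers, $counter_file = '/tmp/rr_counter') {
    $n = (int) @file_get_contents($counter_file);  // missing file reads as 0
    $choice = $servers[$n % count($servers)];
    file_put_contents($counter_file, $n + 1);
    return $choice;
}
```

Unlike the random pick, this spreads users evenly even with small numbers of requests, at the cost of a tiny bit of disk I/O per redirect.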

This helped a bit, but of course it overloaded the single MySQL database servicing all these frontends…

Balancing MySQL is a bit harder. Actually, it is much harder. You need to be able to replicate MySQL to different servers, and this is difficult: MySQL only supports master-slave replication, which is quite limiting; you cannot write to the slaves. Also, you need to enable binary logging for this replication, which costs I/O and fills up your hard disk space fast.
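For reference, this is roughly what MySQL master-slave replication requires in my.cnf (the server ids and log name are example values):

```ini
# master my.cnf -- binary logging must be on, and this is what eats
# I/O and disk space
[mysqld]
server-id = 1
log-bin   = mysql-bin

# slave my.cnf -- each slave needs a unique server-id; the slave is then
# pointed at the master with a CHANGE MASTER TO statement
[mysqld]
server-id = 2
```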

So I decided to do this differently. I added another box (running 4 boxes now) and rewrote the software, changing all references to mysql_query to _mysql_query. I tested using:

function _mysql_query($qry) {
    return mysql_query($qry);
}

Once that worked, I filled in the method:

$master_db  = make_mysql_connection(MASTER_DB);
$slave_db[] = make_mysql_connection(SLAVE_DB1);

function _mysql_query($qry) {
    global $master_db;
    global $slave_db;

    $cmd  = strtolower(substr($qry, 0, 6));
    $conn = $master_db;
    if ($cmd == "select") {
        // Reads: send roughly 80% to the slave, the rest to the master.
        $choice = rand(0, 10);
        if ($choice > 1 && $slave_db[0]) {
            $conn = $slave_db[0];
        }
    } else {
        // Writes: replay the query on every slave as well as the master.
        for ($i = 0; $i < sizeof($slave_db); $i++) {
            if ($slave_db[$i]) {
                mysql_query($qry, $slave_db[$i]);
            }
        }
    }
    return mysql_query($qry, $conn);
}

This way it does all writes to the master and to every slave, and it splits reads between the master and the slave at a given random percentage.
Flaws: I still need to fix this function to round-robin reads between X slaves, and to mark a slave as failed (after which you have to rebuild it with a DB dump from the master; I hope to figure something out for that last problem too).
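The planned round-robin-with-failover read selection could look something like this. This is only a sketch under my own assumptions (the connection handles are stand-ins, and `false` marks a failed slave), not code the site runs yet:

```php
<?php
// Sketch: round-robin reads over N slaves, skipping slaves marked as
// failed (represented here as false). $next carries the rotation state
// between calls by reference.
function pick_read_connection($master, $slaves, &$next) {
    $n = count($slaves);
    for ($i = 0; $i < $n; $i++) {
        $idx = ($next + $i) % $n;
        if ($slaves[$idx] !== false) {      // skip dead slaves
            $next = ($idx + 1) % $n;        // continue rotation after this one
            return $slaves[$idx];
        }
    }
    return $master;  // every slave is down: fall back to the master
}
```

Writes would still go to the master and be replayed on all live slaves, as in the function above; only the read path changes.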

The current system can pack a much larger punch, and it is actually quite scalable, cheap, and very easy to build and set up, even in environments where you don't have control over the MySQL and Apache installs.
