A more stable MySQL with HAProxy
If you administer a MySQL server chances are you’ve seen it go astray. A some point in time for no apparent reason, maybe for a traffic spike, the number of running queries starts snowballing until MySQL eats 100% of your CPU grinding your whole server to a halt. In the context of this article let’s just call this pattern The Snowball Effect.
The Snowball Effect can be caused by slow queries and locking issues especially on MyISAM tables. After you’ve done your homework and optimized your queries, using better indexes and tweaking MySQL configuration, there’s still a little detail, one that doesn’t get noticed: you reduced the probability of triggering The Snowball Effect. It can still happen if the load on database ever increases. For some applications and datasets you may never be able to optimize enough to prevent the Snowball Effect from happening.
So what to do? We can really prevent the Snowball Effect only by limiting the concurrent number of queries executed by MySQL. Unfortunately MySQL cannot queue connections that exceed a configured maximum. It is black or white: a connection can be accepted and have its queries immediately executed or it can be rejected, causing an error. You can never be sure a sudden connection spike won’t start the chain reaction that will ultimately hang your server.
Sure, you can control the maximum concurrent connections at the application level using common connection pooling techniques. Except that many times you cannot work at that level as a system administrator and many platforms, notably PHP, due their stateless (aka start-all-over-again) nature, just can’t do it.
Enter HAProxy. HAProxy is an incredibly versatile and powerful TCP/HTTP Load Balancer. We’ll use a tiny amount of its countless features. Using it as a simple TCP proxy sitting between our webserver and MySQL, we can take advantage of its neat queuing system and finally gain control of the maximum number of simultaneous connections that will reach our MySQL server. Let’s see how.
Get HAProxy from your distribution repositories, it is surely in there. Once installed, edit your
haproxy.cfg (which is probably located in
global log 127.0.0.1 local0 log 127.0.0.1 local1 notice maxconn 4096 user haproxy group haproxy daemon defaults log global mode http retries 3 option redispatch maxconn 4096 contimeout 50000 clitimeout 50000 srvtimeout 50000 listen MySQL localhost:3305 mode tcp balance roundrobin server mysql_slave localhost:3306 maxconn 20 listen admin_page localhost:3307 mode http balance roundrobin stats uri /
The important bits are the last two
listen groups. The
MySQL one establishes the TCP proxy on port 3305 allowing only 20 simultaneous connections. The
admin_page stuff configures HAProxy’s web control panel on port 3307. Don’t set
maxconn value too high, especially if your db doesn’t span multiple disks. At the end of the day your disk has just one head and it’s good at doing one read at the time. Values higher than one just compensate for the filesystem cache and the MySQL query cache that cause many connections to never hit the disk.
Now change your application configuration so that it connects to HAProxy instead of MySQL. For example if you have a WordPress blog set
wp-config.php. It is important to use
127.0.0.1 and NOT
localhost. By using
localhost most MySQL drivers will connect via a UNIX socket and not through the TCP loopback interface, ignoring the port number and bypassing HAProxy!
This setup assumes a small number of queries per connection. Never use persistent connections with this setup.
The complexity of your architecture doesn’t really matter here. You can have dozens of servers, but the point still remains. Put a HAProxy in front of every single MySQL instance. Used as described above it provides a functionality that should be built inside MySQL.
You can use MySQL Administrator or mtop to monitor the connections on your MySQL and actually see that they get capped when they reach the configured maximum value. Now sit back and enjoy a more stable MySQL.
Very good blog post and tips. Thanks for sharing. I use haproxy in my webserver for a while and indeed it helps reducing the mysql load.
Thank you, Markos. Yes, HAProxy is definitely an invaluable tool.
Thanks for the info
Hey Flavio, this is a nice article with good tips. I wanted to add that it’s important not to use persistent connections on this setup, but you already took care of that, which is perfect. I’ve added a link to your article from the haproxy page. You may want to check the other article from
Alex Williams about setting up a failover cluster for mysql. I think that both of your experiences could be merged into a best practice to cover various types of expectations and optimizations.
Thanks for replying here and for linking this post, Willy. Here’s a link to Alex’ post for those who may be interested: http://www.alexwilliams.ca/blog/2009/08/10/using-haproxy-for-mysql-failover-and-redundancy/
Not quite follow the statement: Never use persistent connections with this setup. What’s the difference in the case of persistent connections?
@kc In the setup described above, HAProxy will allow only a limited number of connections to MySQL. If a connection stays open without actually executing queries, you’ll quickly run out of available connections. Moreover you don’t get the queuing effect, defying the whole idea of this setup.
Seems to me that the cost of opening new connections would be expensive enough for high-volume applications as to make a tool like this unusable. I would hope that the proxy would be able to handle idle (persistent) connections well enough that it would differentiate them from temporary and active connections. One of the reasons for using a proxy like this or Tungsten is to make sure that the database server appears to be completely available without needing to change the application. Unfortunately, the approach taken approach breaks any existing applications that rely on persistent connections.
Duh – I forgot to add – by telling me that I can only have twenty simultaneous queries running at once, that’s effectively limiting the number of rows / queries I can process per second even if those queries are executing very quickly (partly because of the time it takes to set up and tear down a connection). My main concern with looking at this software is for the HA piece. If I need a truly a HA database, then I’ll look at something like MySQL Cluster. The problem with MySQL Cluster, however, is that it only protects the data, not the SQL servers hosting the data. In that case, you’ll still need some type of proxy that can handle sensing where to shove requests in the event that one of the SQL nodes goes off-line.
[…] A more stable MySQL with HAProxy Enter HAProxy. HAProxy is an incredibly versatile and powerful TCP/HTTP Load Balancer. We’ll use a tiny amount of its countless features. Using it as a simple TCP proxy sitting between our webserver and MySQL, we can take advantage of its neat queuing system and finally gain control of the maximum number of simultaneous connections that will reach our MySQL server. Let’s see how. (tags: webbuilding) […]
I never know haproxy can do this，just like cache or queue。
Sorry I can not understand why we just don’t use max_connections option in my.cnf?
At any given time only one query can execute per connection, so in example if max_connections is set to 20 it is the same as maxconn 20 in here?
Can you please explain the difference?
Please it is very import to me to understand.
So in example if we have 100 connections request and maxconn is set to 20, the HAProxy will let 20 connections through and queue remaining ones that will hopefully let through when traffic spike is over?
Thank you very much.
Yes, exactly. The whole point is that HAProxy creates a “queue”. MySQL max_connections just discards exceeding connections by sending an error, it does not queue them.
Thank you very much for the explanation!
I am new to HAProxy. Please can you advise on how to test it. I am running HAProxy on two ubuntu machines connected to the mysql database. I do not want to use any webserver for this.
Change your storage engine on your mysql table to INNODB then implement mysql-proxy with master and slave configuration. Mysql-proxy can slipt your traffic where Master receives write type queries and slave receives read type queries.
No change to your application needed.