HAProxy is frequently used as a software load balancer in the MySQL world. Peter Boros, in a past post, explained how to set it up with Percona XtraDB Cluster (PXC) so that it only sends queries to available nodes. The same approach can be used in a regular master-slave setup to spread the read load across multiple slaves. However, with MySQL replication, another factor comes into play: replication lag. In this case the approach mentioned for Percona XtraDB Cluster does not work that well, as the check we presented only returns ‘up’ or ‘down’. We would like to be able to tune the weight of a replica inside HAProxy depending on its replication lag. This is what we will do in this post using HAProxy 1.5.

Agent checks in HAProxy

HAProxy 1.5 allows us to run an agent check, which is a check that can be added to a regular health check. The benefit of agent checks is that the return value can be ‘up’ or ‘down’, but also a weight.

What is an agent? It is simply a program that accepts TCP connections on a given port. Suppose we want to run an agent on a MySQL server that will:

  • Mark the server as down in HAProxy if replication is not working
  • Set the weight to 100% if the replication lag is < 10s
  • Set the weight to 50% if the replication lag is >= 10s and < 60s
  • Set the weight to 5% in all other situations

We can use a script like this:
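To make the rules above concrete, here is a minimal sketch of the weight logic in Python (the agent discussed in this post and in the comments is a PHP script, so this is an illustration, not that script). The function name `agent_reply` is hypothetical, and the reply format, a status word optionally followed by a percentage and terminated by a newline, is an assumption based on the agent-check description above.

```python
from typing import Optional


def agent_reply(seconds_behind: Optional[int]) -> str:
    """Map replication lag (in seconds) to a one-line agent-check reply.

    None stands for a NULL Seconds_Behind_Master, which MySQL reports
    when replication is not running.
    """
    if seconds_behind is None:
        # Replication is broken: ask HAProxy to mark the server as down.
        return "down\n"
    if seconds_behind < 10:
        return "up 100%\n"
    if seconds_behind < 60:
        return "up 50%\n"
    # Any larger lag: keep the server in rotation with a tiny weight.
    return "up 5%\n"
```

The thresholds are the ones from the list above; adjust them to whatever lag your application can tolerate.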

If you want the script to be accessible from port 6789 and connect to a MySQL instance running on port 3306, run:
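For readers who want to see the moving parts, the listening side of an agent can be sketched in a few lines of Python. This is not the PHP agent from this post: `make_agent_server` and the `get_reply` callback are hypothetical names, and the callback is assumed to return a complete reply line such as `"up 100%\n"`.

```python
import socketserver


class ReusableTCPServer(socketserver.TCPServer):
    # Allow quick restarts of the agent without "address already in use".
    allow_reuse_address = True


def make_agent_server(host, port, get_reply):
    """Build a TCP server that answers every connection with get_reply().

    get_reply is a zero-argument callable returning the reply line;
    in a real agent it would query MySQL for the replication lag.
    """

    class Handler(socketserver.BaseRequestHandler):
        def handle(self):
            # Send the reply; the server closes the connection afterwards.
            self.request.sendall(get_reply().encode("ascii"))

    return ReusableTCPServer((host, port), Handler)


# Example (blocks forever, so it is left commented out here):
# make_agent_server("0.0.0.0", 6789, lambda: "up 100%\n").serve_forever()
```

Binding to `0.0.0.0` rather than `127.0.0.1` matters when HAProxy runs on a different host than the MySQL server, since HAProxy must be able to reach the agent port over the network.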

You will also need a dedicated MySQL user:

When the agent is started, you can check that it is working properly:
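A quick way to test an agent is to connect to its port and read what it sends back, which is what a telnet session does by hand. The following Python sketch does the same programmatically; `read_agent` is a hypothetical helper, and it assumes the agent closes the connection after sending its reply.

```python
import socket


def read_agent(host, port, timeout=2.0):
    """Connect to an agent and return its reply without the trailing newline."""
    with socket.create_connection((host, port), timeout=timeout) as conn:
        chunks = []
        while True:
            data = conn.recv(1024)
            if not data:
                # The agent closed the connection: the reply is complete.
                break
            chunks.append(data)
    return b"".join(chunks).decode("ascii").strip()


# Example, assuming an agent is listening locally on port 6789:
# print(read_agent("127.0.0.1", 6789))
```

Run it from the HAProxy host as well as from the database host: a reply that only arrives locally usually points to a bind address or firewall problem.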

Assuming HAProxy runs locally on the app server, that 2 replicas are available (192.168.10.2 and 192.168.10.3), and that the application sends all reads to port 3307, you will define a frontend and a backend in your HAProxy configuration like this:
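As a rough sketch of what such a configuration can look like, the fragment below combines the addresses and ports mentioned in this post with the `server` options that readers quote in the comments. The frontend name `read_only-front` is made up for illustration, and the timing values (`inter 1000 rise 1 fall 1`) are taken from a reader's configuration rather than being recommendations.

```
frontend read_only-front
    bind *:3307
    mode tcp
    default_backend read_only-back

backend read_only-back
    mode tcp
    balance leastconn
    # "check" is the regular health check, "agent-check" adds the agent on port 6789
    server slave1 192.168.10.2:3306 weight 100 check agent-check agent-port 6789 inter 1000 rise 1 fall 1 on-marked-down shutdown-sessions
    server slave2 192.168.10.3:3306 weight 100 check agent-check agent-port 6789 inter 1000 rise 1 fall 1 on-marked-down shutdown-sessions
```

The `weight 100` setting is the baseline that the percentage returned by the agent is applied to, so a reply of 50% leaves the server with an effective weight of 50.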

Demo

Now that everything is set up, let’s see how HAProxy can dynamically change the weight of the servers depending on the replication lag.

No lag

Slave1 lagging

Slave2 down

Conclusion

Agent checks are a nice addition in HAProxy 1.5. The setup presented above is a bit simplistic though: for instance, if HAProxy fails to connect to the agent, it will not mark the corresponding server as down. It is therefore recommended to keep a regular health check along with the agent check.

Astute readers will also notice that in this configuration, if replication is broken on all nodes, HAProxy will stop sending reads. This may not be the best solution. Possible options are to stop the agent and mark the servers as UP using the stats socket, or to add the master as a backup server.

And as a final note, you can edit the code of the agent so that replication lag is measured with Percona Toolkit’s pt-heartbeat instead of Seconds_Behind_Master.

14 Comments
Jan Mara

Hi Stephane,

thanks for sharing 🙂 I’ve corrected 2 typos and the localhost binding (if you bind the agent socket on 127.0.0.1, the check only works if HAProxy is running on the node itself).

And a second change for the Debian/Ubuntu Users who do not need the User/Password setting. Here is the gist: https://gist.github.com/jmara/8035c07a86ff111465d9/revisions

LeDistordu

Hi, thanks for this script. I just get one notice:

PHP Notice: Undefined offset: 0 in /usr/bin/haproxy_checkgalera on line 33

wtarreau

Hi Stéphane,

I just discovered your post now, it’s very instructive for those who want to learn more about the possibilities of the agent check, so thanks for sharing this. I’ve added a link from the haproxy home page.

Jackey Lin

The script agent.php has a few problems. I’ve corrected it as follows:

# Script Name: agent.php
<?php
// Simple socket server
// See http://php.net/manual/en/function.stream-socket-server.php
$port = $argv[1];
$mysql_port = $argv[2];
$mysql = "/usr/bin/mysql";
$user = 'haproxy';
$password = 'haproxy_pwd';
$query = "SHOW SLAVE STATUS";
function set_weight($lag){
# Write your own rules here
if ($lag == 'NULL'){
return "down";
}
else if ($lag = 10 && $lag

Jackey Lin

It seems my previous post was truncated by the maximum allowed comment size. Let me point out the parts of Stephane’s agent.php that need to change.

1. Change
<!--?php
to
<?php

2. Change
$cmd = "$mysql -h127.0.0.1 -u$user -p$password -P$mysql_port -Ee "$query" | grep Seconds_Behind_Master | cut -d ':' -f2 | tr -d ' '";
exec("$cmd",$lag);

to

$cmd = "$mysql -h127.0.0.1 -u$user -p$password -P$mysql_port -e $query | grep Seconds_Behind_Master | cut -d ':' -f2 | tr -d ' '";
exec("$cmd",$lag);

MLBR

What is the recommended way to ensure that the PHP script is running and the socket is open? It would be unfortunate if the agent script failed for some reason and falsely reported the slaves as being down.

SDF

I am having a heck of a time getting HA proxy to accept this configuration. I keep getting an error of
server slave1 only supports options ‘backup’, ‘cookie’, ‘redir’, ‘observer’, ‘on-error’, ‘error-limit’, ‘check’, ‘disabled’, ‘track’, ‘id’, ‘inter’, ‘fastinter’, ‘downinter’, ‘rise’, ‘fall’, ‘addr’, ‘port’, ‘source’, ‘minconn’, ‘maxconn’, ‘maxqueue’, ‘slowstart’ and ‘weight’.

with a line of
server slave1 xxx.xxx.xxx.xxx weight 100 check agent-check agent-port 6789 inter 1000 rise 1 fall 1 on-marked-down shutdown-sessions

SDF

Well, that was just it! It was the version that apt installed on Ubuntu 14.04 originally. I removed it and installed 1.6.3 for my tests.
On another note, I want to debug the agent-check: everything checks out over telnet from the proxy server, but when I stop the slave, the status does not change on the HAProxy side. The telnet request does return down when I stop replication.
By the way, echo "show stat" | socat stdio /run/haproxy/admin.sock | cut -d ',' -f1,2,18,19 returns nothing.
My config:
global
    log /dev/log local0
    log 127.0.0.1 local1 notice
    maxconn 4096
    user haproxy
    group haproxy
    daemon
defaults
    log global
    # mode http
    # option httplog
    option dontlognull
    retries 3
    option redispatch
    maxconn 2000
    timeout connect 5000
    timeout client 50000
    timeout server 50000
listen stats
    bind *:1936
    mode http
    stats enable
    stats hide-version
    stats realm Haproxy\ Statistics
    stats uri /
    stats auth XXX
    stats admin if TRUE

listen read_only-back
    bind *:3306
    mode tcp
    option tcplog
    log global
    balance leastconn
    server slave1 xxx.xxx.xxx weight 100 check agent-check agent-port 6789 inter 1000 rise 1 fall 1 on-marked-down shutdown-sessions
    server slave2 xxx.xxx.xxx weight 100 check agent-check agent-port 6789 inter 1000 rise 1 fall 1 on-marked-down shutdown-sessions
    server slave3 xxx.xxx.xxx weight 100 check agent-check agent-port 6789 inter 1000 rise 1 fall 1 on-marked-down shutdown-sessions

Thanks

neob

I can’t get this to work. I was able to set up the agent, but HAProxy does not seem to talk to the agent port.
I tried shutting down a slave, but the backend server still shows as up.
Hopefully more detailed guidelines will be provided.

fredleeflang

Interesting idea and quite helpful to me to get started with. I think forking a mysql program from PHP every time is a bit expensive though and slows down things a lot. It’s also not necessary since PHP has its own quite good SQL support.

A small challenge was how to get the values from SHOW SLAVE STATUS into PHP. I figured that out, and the script now runs in the background, connects to the database only once, and sends the SHOW SLAVE STATUS query over the existing DB connection every time HAProxy’s agent-check connects. This makes it a whole lot faster.

Дмитрий Окунев

Hi! Can you share your script?

udhayan

Thanks Stephane. How can we have the agent check along with the regular check? Can HAProxy validate both the regular check and the agent check, and fail over only if both checks are satisfied?