Set Up a Galera Cluster for MySQL/MariaDB on Ubuntu/Debian and CentOS/RHEL

Install and configure a two-node Galera Cluster for MySQL/MariaDB

Introduction

Galera Cluster enables building a highly available MySQL/MariaDB database cluster without compromising data consistency. By replicating data across nodes in real time, Galera provides continuous uptime even when individual nodes fail.

In this tutorial, we will walk through the steps to configure a two-node Galera cluster on Ubuntu/Debian and CentOS/RHEL.

Prerequisites

Before we begin, ensure you have:

  • 2 servers or VMs with Ubuntu 18.04+/Debian 9+ or CentOS 7+/RHEL 7+ installed.
  • At least 2GB RAM and 2 vCPU cores on each node.
  • Root access to servers via SSH.
  • MariaDB (10.3+) or MySQL (5.7+) with Galera support; we install mariadb-server in the steps below.
  • Basic firewall rules configured to allow traffic on ports 3306 (MySQL client), 4567 (Galera replication), 4568 (IST) and 4444 (SST); example commands follow this list.
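
As a reference, one way to open these ports is shown below, assuming ufw on Ubuntu/Debian and firewalld on CentOS/RHEL (adapt to whatever firewall you actually use; UDP 4567 is only needed for multicast replication):

# Ubuntu/Debian (ufw)
$ ufw allow 3306,4444,4567,4568/tcp
$ ufw allow 4567/udp

# CentOS/RHEL (firewalld)
$ firewall-cmd --permanent --add-port={3306,4444,4567,4568}/tcp
$ firewall-cmd --permanent --add-port=4567/udp
$ firewall-cmd --reload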

We will be executing most commands as the root user or with sudo privileges.

We begin by installing the Galera Cluster software and other utilities needed for proper functioning of the cluster.

On Ubuntu/Debian

Update apt repository indexes and install the required packages:

$ apt update
$ apt install galera-4 mariadb-server rsync socat python3-mysql.connector

This installs:

  • galera-4 – the Galera 4 replication library (wsrep provider)
  • mariadb-server – the MariaDB database server (MySQL-compatible)
  • rsync – used for State Snapshot Transfers (SST)
  • socat – network relay utility, used by some SST methods and for connectivity checks
  • python3-mysql.connector – MySQL/MariaDB connector for Python (optional, for scripting)

On CentOS/RHEL

Install the EPEL repository, which provides some of the required packages:

$ yum install epel-release

Now install Galera and dependencies:

$ yum install galera mariadb-server rsync socat mysql-connector-python

This installs:

  • galera – the Galera replication library (wsrep provider)
  • mariadb-server – the MariaDB database server
  • rsync – used for State Snapshot Transfers (SST)
  • socat – network relay utility, used by some SST methods
  • mysql-connector-python – MySQL/MariaDB connector for Python (optional)

Galera Cluster is now ready to be configured on both nodes.
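
Before editing any configuration, it is worth confirming where the Galera provider library was installed, since the path differs between distributions and the configuration below has to point at it:

$ find /usr/lib /usr/lib64 -name libgalera_smm.so 2>/dev/null
/usr/lib64/galera-4/libgalera_smm.so

On CentOS/RHEL this is typically /usr/lib64/galera-4/libgalera_smm.so (or /usr/lib64/galera/ for Galera 3), while on Ubuntu/Debian it usually lives under /usr/lib/galera-4/.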

Configuring MySQL for Galera

For MySQL to use Galera, we need to configure some options in my.cnf.

Open the config file. On CentOS/RHEL this is /etc/my.cnf; on Ubuntu/Debian the server settings usually live in /etc/mysql/mariadb.conf.d/50-server.cnf (or can be added to /etc/mysql/my.cnf):

$ nano /etc/my.cnf

Add the following under [mysqld]:

binlog_format=ROW
default-storage-engine=innodb
innodb_autoinc_lock_mode=2
bind-address=0.0.0.0

# Galera settings
wsrep_on=ON
# On Ubuntu/Debian the provider path is /usr/lib/galera-4/libgalera_smm.so
wsrep_provider=/usr/lib64/galera-4/libgalera_smm.so
wsrep_cluster_name='my_galera_cluster'
wsrep_cluster_address="gcomm://192.168.1.101,192.168.1.102"
wsrep_sst_method=rsync

# On node 1 only
wsrep_node_address='192.168.1.101'
wsrep_node_name='node1'

# On node 2 only
wsrep_node_address='192.168.1.102'
wsrep_node_name='node2'

The wsrep_cluster_address setting lists the addresses of all cluster nodes (IPs or resolvable hostnames).

wsrep_node_address and wsrep_node_name must be unique on every server.

Save and close the file after making changes.

Do this on both servers, keeping only that node's own wsrep_node_address and wsrep_node_name lines and substituting your own IPs and names.

This configures MySQL to use the Galera plugin for replication.

Starting the Galera Cluster

With configuration in place, we are ready to bootstrap the cluster.

The first node must be bootstrapped rather than started normally, so that it forms a new cluster instead of trying to join an existing one. On node1 only, run:

$ galera_new_cluster
# equivalent to starting the service with --wsrep-new-cluster

This will initialize the Galera cluster with node1 as its first member. Nodes that join later are started with the usual systemctl start mariadb (or mysql).

Check MySQL status and wsrep variables:

$ mysql -u root -e "SHOW STATUS LIKE '%wsrep%';"

Example output:

+------------------------------+-------------------------------------------------+
| Variable_name                | Value                                           |
+------------------------------+-------------------------------------------------+
| wsrep_local_state_uuid       | af2a75b4-9e1c-11ed-9838-be4b133a6b15            |
| wsrep_protocol_version       | 7                                               |   
| wsrep_last_committed         | 0                                               |
| wsrep_replicated             | 0                                               |
| wsrep_replicated_bytes       | 0                                               |
| wsrep_repl_keys              | 0                                               |
| wsrep_repl_keys_bytes        | 0                                               |
| wsrep_repl_data_bytes        | 0                                               |
| wsrep_repl_other_bytes       | 0                                               |
| wsrep_received               | 1                                               |
| wsrep_received_bytes         | 119                                             | 
| wsrep_local_commits          | 0                                               |  
| wsrep_local_cert_failures    | 0                                               |
| wsrep_local_replays          | 0                                               |
| wsrep_local_send_queue       | 0                                               |
| wsrep_local_send_queue_avg   | 0.000000                                        |        
| wsrep_local_recv_queue       | 0                                               |
| wsrep_local_recv_queue_avg   | 0.000000                                        |  
| wsrep_local_cached_downto    | 18446744073709551615                            |
| wsrep_flow_control_paused_ns | 0                                               |
| wsrep_flow_control_sent      | 0                                               |
| wsrep_flow_control_recv      | 0                                               |
+------------------------------+-------------------------------------------------+ 

This confirms Galera is operational.
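
At this point the cluster has only one member, which can be confirmed directly (expected value at this stage):

$ mysql -u root -e "SHOW STATUS LIKE 'wsrep_cluster_size';"
+--------------------+-------+
| Variable_name      | Value |
+--------------------+-------+
| wsrep_cluster_size | 1     |
+--------------------+-------+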

Now start MySQL on the second node to join it to the cluster:

$ systemctl start mysql
# or
$ systemctl start mariadb

Verify it joined successfully:

$ mysql -u root -e "SHOW STATUS LIKE '%wsrep%';"

We should now see the cluster size as 2:

+--------------------+-------+
| Variable_name      | Value |
+--------------------+-------+
| wsrep_cluster_size | 2     |
+--------------------+-------+

Additionally, run mysql -u root -e "SHOW STATUS LIKE '%wsrep%';" on node1 again; cluster-wide values such as wsrep_cluster_size and wsrep_cluster_state_uuid should now match on both nodes.

Our two-node Galera cluster is ready!

Testing Cluster Operation

Let’s test that replication between the two nodes works as expected.

On node1, create a test database and insert some data (note that Galera works best when every table has an explicit primary key):

mysql> CREATE DATABASE cluster_test;
mysql> USE cluster_test;
mysql> CREATE TABLE test (id INT PRIMARY KEY, message VARCHAR(20));
mysql> INSERT INTO test VALUES (1, 'Hello Galera');
mysql> SELECT * FROM test;
+------+----------------+ 
| id   | message        |
+------+----------------+
|    1 | Hello Galera   |
+------+----------------+ 

Now check the same table contents from node2:

mysql> USE cluster_test; 
mysql> SELECT * FROM test;
+------+----------------+
| id   | message        |
+------+----------------+
|    1 | Hello Galera   |
+------+----------------+

The row replicated from node1 to node2 as expected.
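
Since Galera is multi-master, writes can originate on either node. As a quick additional check, insert a second (purely illustrative) row on node2 and read it back on node1:

On node2:

mysql> INSERT INTO cluster_test.test VALUES (2, 'From node2');

On node1:

mysql> SELECT * FROM cluster_test.test WHERE id = 2;
+------+----------------+
| id   | message        |
+------+----------------+
|    2 | From node2     |
+------+----------------+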

Let’s also test shutdown recovery. Stop MySQL on node1:

$ systemctl stop mysql
# or
$ systemctl stop mariadb  

On node2, connect to MySQL and verify queries still work:

mysql> USE cluster_test;
mysql> SELECT * FROM test;
+------+----------------+ 
| id   | message        |
+------+----------------+
|    1 | Hello Galera   |
+------+----------------+

Node2 remains operational despite node1 shutdown since all data is replicated.
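
While node1 is down, node2 still accepts writes. As an optional check (another illustrative row), insert one now so we can verify it reaches node1 after it rejoins:

mysql> INSERT INTO cluster_test.test VALUES (3, 'while node1 down');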

Bring node1 back up:

$ systemctl start mysql
# or 
$ systemctl start mariadb

It will sync again with node2 automatically. Run SHOW STATUS LIKE '%wsrep%'; on both nodes to confirm values match.
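
If you inserted the optional row on node2 while node1 was offline, it should now be visible on node1 as well, showing that node1 caught up (via IST or a full SST) when it rejoined:

mysql> SELECT * FROM cluster_test.test WHERE id = 3;
+------+------------------+
| id   | message          |
+------+------------------+
|    3 | while node1 down |
+------+------------------+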

This demonstrates the high availability provided by Galera!

Cluster Monitoring & Management

Now that we have a working Galera cluster, let’s look at some tips for monitoring and managing it.

Checking Cluster Status

The quickest high-level check is to query the cluster-wide wsrep status variables on any node:

$ mysql -u root -e "SHOW STATUS WHERE Variable_name IN ('wsrep_cluster_size','wsrep_cluster_status','wsrep_local_state_comment','wsrep_ready');"
+---------------------------+---------+
| Variable_name             | Value   |
+---------------------------+---------+
| wsrep_cluster_size        | 2       |
| wsrep_cluster_status      | Primary |
| wsrep_local_state_comment | Synced  |
| wsrep_ready               | ON      |
+---------------------------+---------+

A healthy node reports wsrep_cluster_status as Primary, wsrep_local_state_comment as Synced, and wsrep_ready as ON; the same values should be seen on both nodes.

Monitoring Node Status

Use mysqladmin to check the Galera status variables on each node (these are status variables, so use extended-status rather than variables):

$ mysqladmin -uroot -p -h192.168.1.101 extended-status | grep wsrep_cluster
| wsrep_cluster_conf_id | 25 |
| wsrep_cluster_size | 2 |
| wsrep_cluster_state_uuid | af2a75b4-9e1c-11ed-9838-be4b133a6b15 |
| wsrep_cluster_status | Primary |
$ mysqladmin -uroot -p -h192.168.1.102 extended-status | grep wsrep_cluster
| wsrep_cluster_conf_id | 25 |
| wsrep_cluster_size | 2 |
| wsrep_cluster_state_uuid | af2a75b4-9e1c-11ed-9838-be4b133a6b15 |
| wsrep_cluster_status | Primary |

The cluster size, state UUID and cluster status should match on all nodes.
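
A small loop makes it easy to compare these values across nodes from one machine (a sketch; it assumes the root account can connect remotely, as in the mysqladmin examples above, and will prompt for the password once per host):

$ for h in 192.168.1.101 192.168.1.102; do echo "== $h =="; mysql -h "$h" -uroot -p -e "SHOW STATUS WHERE Variable_name IN ('wsrep_cluster_size','wsrep_cluster_state_uuid','wsrep_cluster_status')"; done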

Checking SST Transfer Status

When a new node joins, State Snapshot Transfer (SST) is used to sync data to it.

Check the joining node's state UUID and current state with:

$ mysql -e "SHOW STATUS LIKE 'wsrep_local_state_uuid'"
+----------------------------------+--------------------------------------+
| Variable_name                    | Value                                |  
+----------------------------------+--------------------------------------+
| wsrep_local_state_uuid           | af2a75b4-9e1c-11ed-9838-be4b133a6b15 |
+----------------------------------+--------------------------------------+
$ mysql -e "SHOW STATUS LIKE 'wsrep_local_state_comment'"
+-----------------------------------+---------+ 
| Variable_name                     | Value   |
+-----------------------------------+---------+
| wsrep_local_state_comment         | Synced  |
+-----------------------------------+---------+

While an SST is in progress, wsrep_local_state_comment on the joining node shows a joining state (and the donor shows Donor/Desynced) rather than Synced; it returns to Synced once the transfer completes.

Checking Recovery Status

When a node rejoins after disconnect, status can be checked with:

mysql> SHOW STATUS WHERE `Variable_name` LIKE 'wsrep_%';

Look for wsrep_local_state_comment values such as Joining, Joined, or Donor/Desynced while a node is catching up; it returns to Synced once recovery completes.
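
To follow the state in real time during a recovery, something like the following works (a simple sketch using the watch utility):

$ watch -n 2 "mysql -u root -e \"SHOW STATUS LIKE 'wsrep_local_state_comment'\""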

This way various stages of cluster syncs and recovery can be tracked.

Monitoring Cluster Size

To check the number of nodes in the cluster:

mysql> SHOW STATUS LIKE 'wsrep_cluster_size'; 
+--------------------+-------+
| Variable_name      | Value |
+--------------------+-------+
| wsrep_cluster_size | 2     |
+--------------------+-------+

The value should match the expected number of nodes.

We can also use the clustercheck script, where available (it ships with Percona XtraDB Cluster and similar distributions, or can be installed separately), to monitor node health:

$ clustercheck

It returns an HTTP 200 response when the local node is synced and a 503 when it is not, which makes it convenient to wire up behind xinetd or an HAProxy health check.

Checking Node Consistency

The cluster status should be verified to ensure all nodes contain the same data.

Comparing the wsrep_local_state_uuid value between nodes is a quick consistency check:

$ mysql -e "SHOW STATUS LIKE 'wsrep_local_state_uuid'\G" -h192.168.1.101
   *************************** 1. row ***************************
   Variable_name: wsrep_local_state_uuid
        Value: af2a75b4-9e1c-11ed-9838-be4b133a6b15
$ mysql -e "SHOW STATUS LIKE 'wsrep_local_state_uuid'\G" -h192.168.1.102   
   *************************** 1. row ***************************
   Variable_name: wsrep_local_state_uuid
        Value: af2a75b4-9e1c-11ed-9838-be4b133a6b15

If the UUID matches on both nodes, they share the same cluster state; a mismatch means a node has diverged and needs to rejoin the cluster.

Checking Connection Status

Use socat to check TCP connection status between nodes:

$ socat - TCP:192.168.1.101:4567
$ socat - TCP:192.168.1.102:4567

If the port is reachable, socat connects and waits silently for input (press Ctrl+C to exit); a closed port produces a connection refused error. This confirms TCP port 4567 is open between the nodes for Galera replication traffic.
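
To check all of the required cluster ports at once, a small loop using bash's built-in /dev/tcp redirection works too (a sketch, assuming bash; adjust the target IP as needed):

$ for p in 3306 4444 4567 4568; do (echo > /dev/tcp/192.168.1.102/$p) 2>/dev/null && echo "port $p open" || echo "port $p closed"; done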

We can also use MySQL to check connection status:

mysql> SHOW STATUS WHERE `Variable_name` LIKE '%connect%';
+-----------------------------+-------+
| Variable_name               | Value |
+-----------------------------+-------+
| Aborted_connects            | 0     |
| Max_used_connections        | 2     |
| Slave_connections           | 0     |
+-----------------------------+-------+

Monitor open connection count to detect issues.
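
For the current number of open client connections on a node, Threads_connected gives a quick reading (the value shown is only illustrative):

$ mysql -u root -e "SHOW STATUS LIKE 'Threads_connected';"
+-------------------+-------+
| Variable_name     | Value |
+-------------------+-------+
| Threads_connected | 2     |
+-------------------+-------+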

This provides insight into overall cluster health and connectivity.

Tracking Node History

A node's membership history is useful when troubleshooting or analyzing events. The cluster configuration ID increments every time the cluster membership changes (a node or arbitrator joins or leaves):

mysql> SHOW STATUS LIKE 'wsrep_cluster_conf_id';
+-----------------------+-------+
| Variable_name         | Value |
+-----------------------+-------+
| wsrep_cluster_conf_id | 25    |
+-----------------------+-------+

A value that keeps climbing points to nodes repeatedly joining and leaving the cluster. The detailed history of those events (state transfers, donor selection, evictions) is written to the MySQL/MariaDB error log, where Galera records every membership change.

This provides insight into cluster activity over time.

Using these status variables and commands, the Galera cluster can be monitored for proper operation. Issues like node disconnects, replication lag, or consistency loss can be rapidly detected and debugged as well.

Configuring Galera Arbitrator (optional)

For a two-node cluster, we should set up a Galera Arbitrator to avoid split-brain scenarios. The arbitrator (garbd) is a lightweight process that joins the cluster as a voting member without storing any data; it provides the third vote needed for quorum, so the cluster can determine which node should continue operating during a network split.

On a third server, install just the galera-4 or galera package; on some distributions the arbitrator ships in a separate galera-arbitrator package. No database server is needed on this host.

Configure the arbitrator in /etc/default/garb (Ubuntu/Debian) or /etc/sysconfig/garb (CentOS/RHEL):

GALERA_NODES="192.168.1.101:4567 192.168.1.102:4567"
GALERA_GROUP="my_galera_cluster"

Start the arbitrator via its service unit (named garb or garbd depending on packaging), or run the daemon directly:

$ systemctl start garb
# or
$ garbd -a gcomm://192.168.1.101,192.168.1.102 -g my_galera_cluster -d

Check the arbitrator's log output (for example journalctl -u garb) to ensure it connected to the cluster successfully.

The arbitrator now participates in quorum calculations: if the two database nodes lose contact with each other, the side that can still reach the arbitrator retains quorum and keeps serving traffic, preventing a split brain and the data divergence that would come with it.
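
Once the arbitrator has joined, the cluster size reported on the database nodes should increase by one, since garbd counts as a member for quorum purposes even though it stores no data:

mysql> SHOW STATUS LIKE 'wsrep_cluster_size';
+--------------------+-------+
| Variable_name      | Value |
+--------------------+-------+
| wsrep_cluster_size | 3     |
+--------------------+-------+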

Conclusion

In this guide, we covered how to install, configure and monitor a two-node Galera cluster on Ubuntu/Debian and CentOS/RHEL, with practical examples along the way.

Key takeaways include:

  • Galera enables building multi-master MySQL clusters for high availability.
  • Real-time row based replication provides consistency.
  • Nodes can be dynamically added and removed.
  • Automatic node joining, state transfers and quorum handling.
  • Monitoring key status variables like wsrep metrics.
  • Galera arbitrator prevents split brain scenarios.

A two-node Galera cluster serves well for reducing downtime and providing redundancy for many applications. Additional nodes can be introduced seamlessly later if needed.

Using this tutorial as a guide, you can now deploy highly available MySQL clusters with Galera on Ubuntu/Debian and CentOS/RHEL.
