Configuring VRRP for a BareMetal server cluster using Keepalived
With VRRP (Virtual Router Redundancy Protocol), two or more routers are grouped into a single virtual router that acts as the default gateway for the associated network segments, which provides fault tolerance. VRRP creates a virtual IP address shared among the grouped routers to increase gateway availability.
This tutorial provides an example of setting up a high-availability proxy server configuration on BareMetal servers, with symmetric proxying across two or more HAProxy
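To make the idea of a shared virtual IP address more concrete, here is a minimal sketch of how a VRRP group decides which router holds the address. It is a simplified model, not Keepalived's implementation: the `Node` class, the router names, the priorities, and the addresses are all hypothetical, and the only rules modeled are "highest priority wins" and the tie-break on the higher primary IP address.

```python
from dataclasses import dataclass, replace
from ipaddress import IPv4Address
from typing import List, Optional


@dataclass(frozen=True)
class Node:
    name: str
    priority: int          # higher value is preferred
    address: IPv4Address   # primary IP, used only to break priority ties
    alive: bool = True


def elect_master(nodes: List[Node]) -> Optional[Node]:
    """Return the node that should hold the virtual IP, or None if all are down."""
    candidates = [n for n in nodes if n.alive]
    if not candidates:
        return None
    # Highest priority wins; on a tie, the higher primary IP address wins.
    return max(candidates, key=lambda n: (n.priority, n.address))


# Hypothetical two-router group.
router_a = Node("router-a", 100, IPv4Address("192.0.2.2"))
router_b = Node("router-b", 90, IPv4Address("192.0.2.3"))

print(elect_master([router_a, router_b]).name)
# Simulate a failure of router-a: the remaining router takes over.
print(elect_master([replace(router_a, alive=False), router_b]).name)
```

In a real group, the routers reach the same decision by exchanging periodic advertisements rather than by calling a shared function.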
Solution architecture
In the ru-central1-m availability zone, you will set up an environment of two private subnets, subnet-m3 and subnet-m4, created in the ru-central1-m3 and ru-central1-m4 server pools, respectively. You will group these subnets into a virtual network segment (VRF) named vrrp-vrf.
In subnet-m3, you will create two BareMetal servers, master-server-m3 and backup-server-m3, which will have the MASTER and BACKUP roles, respectively, in the VRRP group. On these two servers, you will run Keepalived and use it to set up a virtual IP address for the server group in the ru-central1-m3 pool.
In subnet-m4 of the ru-central1-m4 server pool, you will create a BareMetal server named client-server-m4, which will serve as a client when using the virtual IP address created in the ru-central1-m3 pool.
This solution demonstrates an isolated client VRF segment with OSI L3 routing between the ru-central1-m3 and ru-central1-m4 server pools, as well as the L2 operation of broadcast VRRP within the ru-central1-m3 server pool.
Note
At OSI L2, broadcasting works only within one server pool and only for a group of servers in the same network.
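The distinction between L2 and L3 connectivity in this architecture can be expressed as a small model. This is only an illustrative sketch based on the description above: the `Server` class and the `reachability` function are hypothetical and do not correspond to any Yandex Cloud API.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class Server:
    name: str
    pool: str
    subnet: str
    vrf: str


def reachability(a: Server, b: Server) -> str:
    """Classify how two servers can talk to each other in this model."""
    # Broadcast (and therefore VRRP) needs the same pool and the same subnet.
    if a.pool == b.pool and a.subnet == b.subnet:
        return "L2"
    # Subnets grouped into the same VRF are routed to each other.
    if a.vrf == b.vrf:
        return "L3"
    return "none"


master = Server("master-server-m3", "ru-central1-m3", "subnet-m3", "vrrp-vrf")
backup = Server("backup-server-m3", "ru-central1-m3", "subnet-m3", "vrrp-vrf")
client = Server("client-server-m4", "ru-central1-m4", "subnet-m4", "vrrp-vrf")

print(reachability(master, backup))  # VRRP peers share an L2 segment
print(reachability(client, master))  # the client reaches the group over L3
```

This is why both VRRP nodes are placed in subnet-m3, while the client can sit in a different pool.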
To configure a fault-tolerant cluster of BareMetal servers using VRRP:
- Get your cloud ready.
- Create a virtual routing and forwarding segment.
- Create private subnets.
- Lease BareMetal servers.
- Configure Keepalived on the servers of the ru-central1-m3 pool.
- Test the solution.
See also How to cancel server lease.
Getting started
Sign up for Yandex Cloud and create a billing account:
- Navigate to the management console and log in to Yandex Cloud or create a new account.
- On the Yandex Cloud Billing page, make sure you have a billing account linked and it has the ACTIVE or TRIAL_ACTIVE status. If you do not have a billing account, create one and link a cloud to it.
If you have an active billing account, you can navigate to the cloud page
Learn more about clouds and folders here.
Required paid resources
The cost of this solution includes the BareMetal server lease fee (see Yandex BareMetal pricing).
Create a virtual routing and forwarding segment
To enable OSI L3 communication between private subnets, group them into a virtual network segment (VRF).
Create a new VRF segment:
- In the management console, select the folder where you are going to create your infrastructure.
- In the list of services, select BareMetal.
- In the left-hand panel, select VRF and click Create VRF.
- In the Name field, name your VRF segment: vrrp-vrf.
- Click Create VRF.
Create private subnets
Create two private subnets in different server pools and add them to your VRF segment:
- In the management console, select the folder where you are deploying your infrastructure.
- In the list of services, select BareMetal.
- In the left-hand panel, select Private subnets and click Create subnet.
- In the Pool field, select the ru-central1-m3 server pool.
- In the Name field, enter a name for the subnet: subnet-m3.
- Enable IP addressing and routing.
- In the Virtual network segment (VRF) field, select vrrp-vrf.
- In the CIDR field, specify 172.28.1.0/24.
- Click Create subnet.
- Similarly, create a private subnet named subnet-m4 in the ru-central1-m4 server pool with the 172.28.2.0/24 CIDR.
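Before entering CIDR ranges in the console, it can help to confirm that the address plan is consistent. The sketch below uses Python's standard `ipaddress` module to check that the two ranges from the steps above do not overlap, on the assumption that subnets routed within one VRF should have distinct ranges. The `find_overlaps` helper is a hypothetical name introduced for this example.

```python
from ipaddress import ip_network

# CIDR blocks from the steps above.
subnets = {
    "subnet-m3": ip_network("172.28.1.0/24"),
    "subnet-m4": ip_network("172.28.2.0/24"),
}


def find_overlaps(plan):
    """Return pairs of subnet names whose CIDR ranges overlap."""
    names = sorted(plan)
    return [
        (a, b)
        for i, a in enumerate(names)
        for b in names[i + 1:]
        if plan[a].overlaps(plan[b])
    ]


print(find_overlaps(subnets))  # → []
for name, net in subnets.items():
    # num_addresses counts every address in the block,
    # including the network and broadcast addresses.
    print(name, net, net.num_addresses)
```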
Lease BareMetal servers
- In the management console, select the folder where you are deploying your infrastructure.
- In the list of services, select BareMetal and click Lease server.
- Under Configuration, click the Pool filter and select the ru-central1-m3 server pool.
- Under Configuration, select the appropriate server configuration.
- (Optional) Under Disk, configure disk partitioning:
  - Click Configure disk layout.
  - Specify the partitioning parameters. To create a new partition, click Add partition. To build RAID arrays and configure disk partitions yourself, click Remove RAID.
  - Click Save.
- Under Image, select the Ubuntu 24.04 image.
- In the Lease duration field, select a lease period: 1 day, 1 month, 3 months, 6 months, or 1 year.
  When this period expires, the server lease will automatically be renewed for the same period. You cannot terminate the lease during the specified lease period, but you can refuse to extend the server lease further.
- Under Private network, in the Private subnet field, select the subnet-m3 subnet you created earlier.
- Under Public network, select From ephemeral subnet in the Public address field.
- Under Access:
  - In the Password field, select one of the following options to create a root password:
    - To generate a new root password, select New password and click Generate.
      Warning
      This option requires you to maintain password security. Save the password you generated in a secure location. Yandex Cloud does not store it, and you will not be able to retrieve it once the server is deployed.
    - To use the root password saved in a Yandex Lockbox secret, select Lockbox secret.
      In the Name, Version, and Key fields, select the secret containing your password, its version, and its key, respectively.
      If you do not have a Yandex Lockbox secret, click Create to create it.
      Choose the Custom secret type to specify a custom password or Generated to generate a password automatically.
  - In the Public SSH key field, select the SSH key saved in your organization user profile.
    If there are no SSH keys in your profile or you want to add a new key:
    - Click Add key.
    - Enter a name for the SSH key.
    - Select one of the following:
      - Enter manually: Paste the contents of the public SSH key. You need to create an SSH key pair on your own.
      - Load from file: Upload the public part of the SSH key. You need to create an SSH key pair on your own.
      - Generate key: Automatically create an SSH key pair.
      When adding a new SSH key, an archive containing the key pair will be created and downloaded. In Linux or macOS-based operating systems, unpack the archive to the /home/<user_name>/.ssh directory. In Windows, unpack the archive to the C:\Users\<user_name>/.ssh directory. You do not need to additionally enter the public key in the management console.
    - Click Add.
    The system will add the SSH key to your organization user profile. If the organization has disabled the ability for users to add SSH keys to their profiles, the added public SSH key will only be saved in the user profile inside the newly created resource.
- Under Server information, in the Name field, enter the server name: master-server-m3.
- Click Lease server.
- Similarly, lease two more servers: one named backup-server-m3 with subnet-m3 in the ru-central1-m3 server pool and another one named client-server-m4 with subnet-m4 in the ru-central1-m4 server pool.
On the page with a list of BareMetal servers that opens, you will see information about all the servers you created. In the Public address field of the table, copy the server public IP addresses, as you will need them to connect to the servers over SSH.
Note
Getting servers ready and installing operating systems on them may take up to 45 minutes. The servers will have the Provisioning status during this time. After OS installation is complete, the server status will change to Ready.
Configure Keepalived on the servers of the ru-central1-m3 pool
You will now install, configure, and run Keepalived on the servers in the ru-central1-m3 pool.
Follow the steps below to configure both servers, master-server-m3 and backup-server-m3.
- Connect to the server over SSH by using the server's public IP address you saved in the previous step.
- Install Keepalived by running this command:
      sudo apt update && sudo apt install keepalived -y
- View a list of the server's network interfaces:
      ip a
  Result:
      ...
      5: etx2: <BROADCAST,MULTICAST,UP,LOWER_UP> mtu 1500 qdisc mq state UP group default qlen 1000
          link/ether 00:02:c9:35:fd:31 brd ff:ff:ff:ff:ff:ff
          altname enp6s0d1
          inet 172.28.1.2/24 metric 100 brd 172.28.1.255 scope global dynamic etx2
             valid_lft 3512sec preferred_lft 3512sec
          inet6 fe80::202:c9ff:fe35:fd31/64 scope link
             valid_lft forever preferred_lft forever
  In the command output, find an interface with an IP address in the 172.28.1.0/24 range allocated for subnet-m3. In the example above, such an interface has the etx2 ID. You will need the interface ID in later steps to configure Keepalived.
- Create a Keepalived configuration file:
      sudo nano /etc/keepalived/keepalived.conf
- Add the following configuration to the file:
  Master
      vrrp_instance M3_1 {
          state MASTER
          interface etx2
          virtual_router_id 51
          priority 100
          advert_int 1
          authentication {
              auth_type PASS
              auth_pass hGoVjTjSYQq3Epm
          }
          virtual_ipaddress {
              172.28.1.254
          }
          preempt
      }
  Backup
      vrrp_instance M3_2 {
          state BACKUP
          interface etx2
          virtual_router_id 51
          priority 90
          advert_int 1
          authentication {
              auth_type PASS
              auth_pass hGoVjTjSYQq3Epm
          }
          virtual_ipaddress {
              172.28.1.254
          }
          preempt
      }
  Where:
  - vrrp_instance: Virtual router name:
    - M3_1 for the server with the MASTER role.
    - M3_2 for the server with the BACKUP role.
  - state: Initial server state, MASTER or BACKUP.
  - interface: ID of the network interface where the virtual IP address will be used. In the example above, it is etx2.
  - virtual_router_id: Unique VRRP ID for the group of virtual routers. This value must be the same for all servers in the group.
  - priority: Priority that determines the master and backup nodes. The node with the highest priority becomes the master. In this example, the master node uses 100 and the backup node uses 90.
  - advert_int: Interval between state announcements in seconds.
  - authentication: Section with authentication settings to provide security. Its contents must be the same for all servers in the group. Keepalived uses only the first eight characters of auth_pass and logs a truncation notice for longer values.
  - virtual_ipaddress: Virtual IP address the current node will manage. Make sure your virtual IP address meets the following requirements:
    - It belongs to the CIDR range allocated for the virtual subnet where you created the server group.
    - It is unused.
    - It is the same for all servers in the group.
  - preempt: Enables the server to change its state to MASTER if it has a higher priority than the current master in the group.
- Restart Keepalived:
      sudo systemctl restart keepalived.service
- View Keepalived logs to make sure the service is running:
      sudo journalctl -u keepalived.service
  Result:
  Master
      systemd[1]: keepalived.service - Keepalive Daemon (LVS and VRRP) was skipped because of an unmet condition check (ConditionFileNotEmpty=/etc/keepalived/keepalived.conf).
      systemd[1]: Starting keepalived.service - Keepalive Daemon (LVS and VRRP)...
      Keepalived[4045]: Starting Keepalived v2.2.8 (04/04,2023), git commit v2.2.7-154-g292b299e+
      Keepalived[4045]: Running on Linux 6.8.0-53-generic #55-Ubuntu SMP PREEMPT_DYNAMIC Fri Jan 17 15:37:52 UTC 2025 (built for Linux 6.8.0)
      Keepalived[4045]: Command line: '/usr/sbin/keepalived' '--dont-fork'
      Keepalived[4045]: Configuration file /etc/keepalived/keepalived.conf
      Keepalived[4045]: NOTICE: setting config option max_auto_priority should result in better keepalived performance
      Keepalived[4045]: Starting VRRP child process, pid=4046
      Keepalived_vrrp[4046]: (/etc/keepalived/keepalived.conf: Line 10) Truncating auth_pass to 8 characters
      Keepalived[4045]: Startup complete
      systemd[1]: Started keepalived.service - Keepalive Daemon (LVS and VRRP).
      Keepalived_vrrp[4046]: (M3_1) Entering BACKUP STATE (init)
      Keepalived_vrrp[4046]: (M3_1) Entering MASTER STATE
  Backup
      systemd[1]: keepalived.service - Keepalive Daemon (LVS and VRRP) was skipped because of an unmet condition check (ConditionFileNotEmpty=/etc/keepalived/keepalived.conf).
      systemd[1]: Starting keepalived.service - Keepalive Daemon (LVS and VRRP)...
      Keepalived[2751]: Starting Keepalived v2.2.8 (04/04,2023), git commit v2.2.7-154-g292b299e+
      Keepalived[2751]: Running on Linux 6.8.0-53-generic #55-Ubuntu SMP PREEMPT_DYNAMIC Fri Jan 17 15:37:52 UTC 2025 (built for Linux 6.8.0)
      Keepalived[2751]: Command line: '/usr/sbin/keepalived' '--dont-fork'
      Keepalived[2751]: Configuration file /etc/keepalived/keepalived.conf
      Keepalived[2751]: NOTICE: setting config option max_auto_priority should result in better keepalived performance
      Keepalived[2751]: Starting VRRP child process, pid=2752
      Keepalived_vrrp[2752]: (/etc/keepalived/keepalived.conf: Line 10) Truncating auth_pass to 8 characters
      Keepalived[2751]: Startup complete
      Keepalived_vrrp[2752]: (M3_2) Entering BACKUP STATE (init)
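Because the two configuration files differ only in the instance name, state, and priority, they can be generated from one template. The sketch below is one possible way to do that, not part of the original tutorial: `render` and `check_vip` are hypothetical helper names, and `changeme` is a placeholder password. `check_vip` covers the "belongs to the subnet CIDR" requirement and adds an extra sanity check against the network and broadcast addresses. Whether the address is unused cannot be verified offline.

```python
from ipaddress import ip_address, ip_network

TEMPLATE = """\
vrrp_instance {name} {{
    state {state}
    interface {interface}
    virtual_router_id {vrid}
    priority {priority}
    advert_int 1
    authentication {{
        auth_type PASS
        auth_pass {auth_pass}
    }}
    virtual_ipaddress {{
        {vip}
    }}
    preempt
}}
"""


def check_vip(vip: str, cidr: str) -> None:
    """Raise ValueError if the virtual IP cannot be used in the given subnet."""
    net = ip_network(cidr)
    addr = ip_address(vip)
    if addr not in net:
        raise ValueError(f"{vip} is outside {cidr}")
    # Extra sanity check, not listed in the tutorial: skip reserved addresses.
    if addr in (net.network_address, net.broadcast_address):
        raise ValueError(f"{vip} is the network or broadcast address of {cidr}")


def render(name, state, priority, *, interface, vip, cidr, auth_pass, vrid=51):
    """Return keepalived.conf text for one node of the VRRP group."""
    check_vip(vip, cidr)
    return TEMPLATE.format(
        name=name, state=state, priority=priority, interface=interface,
        vrid=vrid, auth_pass=auth_pass, vip=vip,
    )


# Settings shared by every node in the group; "changeme" is a placeholder.
shared = dict(interface="etx2", vip="172.28.1.254",
              cidr="172.28.1.0/24", auth_pass="changeme")

print(render("M3_1", "MASTER", 100, **shared))
print(render("M3_2", "BACKUP", 90, **shared))
```

Keeping the shared values in one dictionary makes it harder to end up with a mismatched virtual_router_id, password, or virtual IP address between the nodes.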
Test the solution
- Make sure the virtual IP address was added to the network interface of the server with the MASTER role:
  - Connect to master-server-m3 over SSH.
  - View the configuration of the network interface assigned to subnet-m3:
        ip a
    Result:
        ...
        5: etx2: <BROADCAST,MULTICAST,UP,LOWER_UP> mtu 1500 qdisc mq state UP group default qlen 1000
            link/ether 00:02:c9:35:fd:31 brd ff:ff:ff:ff:ff:ff
            altname enp6s0d1
            inet 172.28.1.2/24 metric 100 brd 172.28.1.255 scope global dynamic etx2
               valid_lft 3575sec preferred_lft 3575sec
            inet 172.28.1.254/32 scope global etx2
               valid_lft forever preferred_lft forever
            inet6 fe80::202:c9ff:fe35:fd31/64 scope link
               valid_lft forever preferred_lft forever
    The network interface received an additional virtual IP address specified in the Keepalived settings: 172.28.1.254/32.
- Send ICMP requests from subnet-m4 to make sure the virtual IP address in subnet-m3 is available:
  - Connect to client-server-m4 over SSH.
  - Run this command:
        ping 172.28.1.254 -s 1024 -c 5
    Result:
        PING 172.28.1.254 (172.28.1.254) 1024(1052) bytes of data.
        1032 bytes from 172.28.1.254: icmp_seq=1 ttl=62 time=0.211 ms
        1032 bytes from 172.28.1.254: icmp_seq=2 ttl=62 time=0.242 ms
        1032 bytes from 172.28.1.254: icmp_seq=3 ttl=62 time=0.264 ms
        1032 bytes from 172.28.1.254: icmp_seq=4 ttl=62 time=0.312 ms
        1032 bytes from 172.28.1.254: icmp_seq=5 ttl=62 time=0.273 ms
        --- 172.28.1.254 ping statistics ---
        5 packets transmitted, 5 received, 0% packet loss, time 4117ms
        rtt min/avg/max/mdev = 0.211/0.260/0.312/0.033 ms
    The command sends five ICMP requests with a 1,024-byte payload. All packets were successfully delivered.
- Make sure Keepalived failover works properly:
  - Connect to client-server-m4 over SSH.
  - In a separate terminal window, connect to master-server-m3 over SSH.
    Arrange the terminal windows so you can see both at the same time.
  - In the terminal window with the client-server-m4 session, run ping once again without a retry limit:
        ping 172.28.1.254 -s 1024
    While ping is running, stop Keepalived in the terminal window with the open master-server-m3 session by running this command:
        sudo systemctl stop keepalived
    When it stops, watch the terminal window with the client-server-m4 session. If the virtual IP address was shared successfully, ICMP requests should switch to the backup host almost seamlessly without interrupting the running ping command.
    Note
    A minor loss of 1 to 3 packets is acceptable. It may happen while the timer for electing a new group MASTER triggers and the system reassigns the virtual IP address.
    Result:
        PING 172.28.1.254 (172.28.1.254) 1024(1052) bytes of data.
        1032 bytes from 172.28.1.254: icmp_seq=1 ttl=62 time=0.249 ms
        ...
        1032 bytes from 172.28.1.254: icmp_seq=56 ttl=62 time=0.224 ms
        1032 bytes from 172.28.1.254: icmp_seq=57 ttl=62 time=0.314 ms
        1032 bytes from 172.28.1.254: icmp_seq=58 ttl=62 time=0.278 ms
        ^C
        --- 172.28.1.254 ping statistics ---
        58 packets transmitted, 55 received, 5.17241% packet loss, time 58368ms
        rtt min/avg/max/mdev = 0.185/0.271/0.326/0.035 ms
  - In the terminal window with the open master-server-m3 session, start Keepalived again using this command:
        sudo systemctl start keepalived
- Check Keepalived logs on the server with the BACKUP role:
  - Connect to backup-server-m3 over SSH.
  - View Keepalived logs:
        sudo journalctl -u keepalived.service
    Result:
        ...
        # Logging the transition to MASTER as Keepalived stops on the original master node
        Feb 19 07:08:07 backup-server-m3 Keepalived_vrrp[2752]: (M3_2) Entering MASTER STATE
        # Logging the transition to BACKUP when resuming Keepalived on the original master node.
        Feb 19 07:08:31 backup-server-m3 Keepalived_vrrp[2752]: (M3_2) Master received advert from 172.28.1.2 with higher priority 100, ours 90
        Feb 19 07:08:31 backup-server-m3 Keepalived_vrrp[2752]: (M3_2) Entering BACKUP STATE
        ...
    As you can see from the service log and comments, backup-server-m3 was promoted to the master node after Keepalived stopped on master-server-m3. After Keepalived resumed on master-server-m3, that server reclaimed its master role, and backup-server-m3 became the backup node again.
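If you want to check the failover test automatically rather than by eye, the ping summary line can be parsed. The sketch below is a minimal example that assumes the summary format shown in the output above. The function names and the `max_lost` default of 3, taken from the note about acceptable loss, are choices made for this illustration.

```python
import re

# Matches the summary line printed by ping, e.g.
# "58 packets transmitted, 55 received, 5.17241% packet loss, time 58368ms"
SUMMARY = re.compile(
    r"(?P<sent>\d+) packets transmitted, (?P<received>\d+) received"
)


def lost_packets(ping_output: str) -> int:
    """Return the number of ICMP requests that got no reply."""
    match = SUMMARY.search(ping_output)
    if match is None:
        raise ValueError("no ping summary line found")
    return int(match["sent"]) - int(match["received"])


def failover_ok(ping_output: str, max_lost: int = 3) -> bool:
    """Treat the failover as successful if at most max_lost packets were lost."""
    return lost_packets(ping_output) <= max_lost


sample = "58 packets transmitted, 55 received, 5.17241% packet loss, time 58368ms"
print(lost_packets(sample), failover_ok(sample))  # → 3 True
```

To use it on a real run, save the ping output to a file on client-server-m4 and pass the file contents to `failover_ok`.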
How to cancel server lease
You cannot delete BareMetal servers. Instead, you can choose not to renew their lease.
To stop paying for the resources you created, cancel the lease of the BareMetal servers you created earlier.