Configuring VRRP for a BareMetal server cluster
Note
Yandex BareMetal is at the Preview stage.
VRRP
To implement fault tolerance, two or more routers are grouped into a single virtual router acting as the default gateway for the associated network segments. VRRP enables creating a virtual IP address which is shared among the grouped routers to increase the gateway availability.
This tutorial provides an example of setting up a high-availability proxy server configuration on BareMetal servers, where proxying functions are configured symmetrically on two or more HAProxy
Solution architecture
In the ru-central1-m availability zone, you will set up an environment of two private subnets, subnet-m3 and subnet-m4, created in the ru-central1-m3 and ru-central1-m4 server pools, respectively. You will group these subnets into a virtual routing and forwarding (VRF) network segment named vrrp-vrf.
In subnet-m3, you will create two BareMetal servers, master-server-m3 and backup-server-m3, which will have the MASTER and BACKUP roles, respectively, in the VRRP group. On these two servers, you will run Keepalived and use it to set up a virtual IP address for the server group in the ru-central1-m3 pool.
In subnet-m4 of the ru-central1-m4 server pool, you will create a BareMetal server named client-server-m4, which will serve as a client when using the virtual IP address created in the ru-central1-m3 pool.
This solution demonstrates end to end how an isolated client VRF provides OSI L3 connectivity between the ru-central1-m3 and ru-central1-m4 server pools, as well as how broadcast-based VRRP operates at the L2 level in the ru-central1-m3 server pool.
Note
At L2 of the OSI network model, broadcasting works only within one server pool and only for a group of servers in the same network.
To configure a fault-tolerant cluster of BareMetal servers using VRRP:
- Get your cloud ready.
- Create a virtual network segment.
- Create private subnets.
- Lease BareMetal servers.
- Configure Keepalived on the servers of the ru-central1-m3 pool.
- Test the solution.
If you no longer need the resources you created, delete them.
Getting started
Sign up for Yandex Cloud and create a billing account:
- Go to the management console and log in to Yandex Cloud or create an account if you do not have one yet.
- On the Yandex Cloud Billing page, make sure you have a billing account linked and it has the ACTIVE or TRIAL_ACTIVE status. If you do not have a billing account, create one.
If you have an active billing account, you can go to the cloud page
Learn more about clouds and folders.
Create a virtual network segment
To link several private subnets at the L3 level of the OSI network model, you need to group them into a virtual routing and forwarding (VRF) network segment.
Create a new VRF:
- In the management console, select the folder to create your infrastructure in.
- From the list of services, select BareMetal.
- In the left-hand panel, select VRF and click Create VRF.
- In the Name field, enter a name for the VRF: vrrp-vrf.
- Click Create VRF.
Create private subnets
Create two private subnets in different server pools and add them to the VRF you created earlier:
- In the management console, select the folder to create your infrastructure in.
- From the list of services, select BareMetal.
- In the left-hand panel, select Private subnets and click Create subnet.
- In the Pool field, select the ru-central1-m3 server pool.
- In the Name field, enter a name for the subnet: subnet-m3.
- Enable Routing settings.
- In the Virtual network segment (VRF) field, select the previously created VRF, vrrp-vrf.
- In the CIDR field, specify 172.28.1.0/24.
- Click Create subnet.
- Similarly, create a private subnet named subnet-m4 in the ru-central1-m4 server pool with the 172.28.2.0/24 CIDR.
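The two subnets above use the 172.28.1.0/24 and 172.28.2.0/24 ranges, and the virtual IP address configured later in this tutorial must fall inside the first one. As a sanity check of the addressing plan, the following is a minimal sketch in plain bash. The function names `ip_to_int` and `ip_in_cidr` are hypothetical helpers introduced only for this illustration; they are not part of any Yandex Cloud tooling.

```shell
# Convert a dotted-quad IPv4 address to a 32-bit integer.
ip_to_int() {
  local a b c d
  IFS=. read -r a b c d <<< "$1"
  echo $(( (a << 24) | (b << 16) | (c << 8) | d ))
}

# Return 0 if address $1 lies inside CIDR $2 (for example, 172.28.1.0/24).
ip_in_cidr() {
  local ip=$1 net=${2%/*} bits=${2#*/}
  # Build the network mask from the prefix length.
  local mask=$(( (0xFFFFFFFF << (32 - bits)) & 0xFFFFFFFF ))
  (( ( $(ip_to_int "$ip") & mask ) == ( $(ip_to_int "$net") & mask ) ))
}
```

For example, `ip_in_cidr 172.28.1.254 172.28.1.0/24 && echo "inside subnet-m3"` prints the message, while the same address checked against 172.28.2.0/24 does not.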
Lease BareMetal servers
- In the management console, select the folder to create your infrastructure in.
- In the list of services, select BareMetal and click Lease server.
- In the Pool field, select the ru-central1-m3 server pool.
- Under Configuration, select the appropriate server configuration.
- (Optional) Under Disk, configure disk partitioning:
  - Click Configure disk layout.
  - Specify the partitioning parameters. To create a new partition, click Add partition.
    Note
    To build RAID arrays and configure disk partitions yourself, click Remove RAID.
  - Click Save.
- Under Image, select the Ubuntu 24.04 image.
- Under Lease conditions, select the period you want to lease the server for. When this period expires, the server lease will be automatically renewed for the same period.
- Under Network settings:
  - In the Private subnet field, select the subnet-m3 subnet you created earlier.
  - In the Public address field, select Automatically.
- Under Access:
  - Next to the Password field, click Generate to generate a password for the root user.
    Warning
    Save the password in a safe place. Yandex Cloud does not store this password, and you will not be able to view it once you lease the server.
  - In the Public SSH key field, select the SSH key saved in your organization user profile.
    If there are no saved SSH keys in your profile, or you want to add a new key:
    - Click Add key.
    - Enter a name for the SSH key.
    - Upload or paste the contents of the public key file. You need to create a key pair for the SSH connection to a server yourself.
    - Click Add.
    The SSH key will be added to your organization user profile.
    If adding SSH keys by users to their profiles is disabled in the organization, the public SSH key will be saved only to the new BareMetal server's user profile.
- Under Server information, in the Name field, enter a name for the server: master-server-m3.
- Click Lease server.
- Similarly, lease two more servers: one named backup-server-m3 in the ru-central1-m3 server pool and another one named client-server-m4 with the subnet-m4 subnet in the ru-central1-m4 server pool.
On the page with a list of BareMetal servers that opens, you will see information about all the servers you created. In the Public address field of the table, copy the server public IP addresses as you will need them to connect to the servers over SSH.
Note
Getting servers ready and installing operating systems on them may take up to 45 minutes. The servers will have the Provisioning status during this time. After OS installation is complete, the server status will change to Ready.
Configure Keepalived on the servers of the ru-central1-m3 pool
At this step, you will install, configure, and run Keepalived on the servers of the ru-central1-m3 pool.
Follow the steps below to configure both servers, master-server-m3 and backup-server-m3.
- Connect to the server over SSH by using the server's public IP address you saved in the previous step.
- Install Keepalived by running this command:
  sudo apt update && sudo apt install keepalived -y
- View the list of the server's network interfaces:
  ip a
  Result:
  ...
  5: etx2: <BROADCAST,MULTICAST,UP,LOWER_UP> mtu 1500 qdisc mq state UP group default qlen 1000
      link/ether 00:02:c9:35:fd:31 brd ff:ff:ff:ff:ff:ff
      altname enp6s0d1
      inet 172.28.1.2/24 metric 100 brd 172.28.1.255 scope global dynamic etx2
         valid_lft 3512sec preferred_lft 3512sec
      inet6 fe80::202:c9ff:fe35:fd31/64 scope link
         valid_lft forever preferred_lft forever
  In the command output, find an interface with an IP address in the 172.28.1.0/24 range allocated for the private subnet named subnet-m3. In the example above, such an interface has the etx2 ID. You will need the interface ID in later steps to configure Keepalived.
- Create a Keepalived configuration file:
  sudo nano /etc/keepalived/keepalived.conf
- Add the following configuration into the file you created:
  Master:
  vrrp_instance M3_1 {
      state MASTER
      interface etx2
      virtual_router_id 51
      priority 100
      advert_int 1
      authentication {
          auth_type PASS
          auth_pass hGoVjTjSYQq3Epm
      }
      virtual_ipaddress {
          172.28.1.254
      }
      preempt
  }
  Backup:
  vrrp_instance M3_2 {
      state BACKUP
      interface etx2
      virtual_router_id 51
      priority 90
      advert_int 1
      authentication {
          auth_type PASS
          auth_pass hGoVjTjSYQq3Epm
      }
      virtual_ipaddress {
          172.28.1.254
      }
      preempt
  }
  Where:
  - vrrp_instance: Virtual router name:
    - M3_1 for the server with the MASTER role.
    - M3_2 for the server with the BACKUP role.
  - state: Server state, MASTER or BACKUP.
  - interface: ID of the network interface where the virtual IP address will be used. In the example above, it is etx2.
  - virtual_router_id: Unique VRRP ID for the group of virtual routers. This value must be the same for all servers in the group.
  - priority: Priority that determines the master and backup nodes. Set a server's priority to 100 to make it the master node or to 90 to make it the backup one.
  - advert_int: Interval between state announcements in seconds.
  - authentication: Section with authentication settings to provide security. Contents of this section must be the same for all servers in a group.
  - virtual_ipaddress: Virtual IP address that the current node will manage. Virtual IP address requirements:
    - It must belong to the CIDR range allocated for the private subnet where you created the server group.
    - It must be unused.
    - All servers in the group must have the same address.
  - preempt: Enables the server to change its state to MASTER if it has a higher priority than the current master in the group.
- Restart Keepalived:
  sudo systemctl restart keepalived.service
- View Keepalived logs to make sure the service is running:
  sudo journalctl -u keepalived.service
  Result:
  Master:
  systemd[1]: keepalived.service - Keepalive Daemon (LVS and VRRP) was skipped because of an unmet condition check (ConditionFileNotEmpty=/etc/keepalived/keepalived.conf).
  systemd[1]: Starting keepalived.service - Keepalive Daemon (LVS and VRRP)...
  Keepalived[4045]: Starting Keepalived v2.2.8 (04/04,2023), git commit v2.2.7-154-g292b299e+
  Keepalived[4045]: Running on Linux 6.8.0-53-generic #55-Ubuntu SMP PREEMPT_DYNAMIC Fri Jan 17 15:37:52 UTC 2025 (built for Linux 6.8.0)
  Keepalived[4045]: Command line: '/usr/sbin/keepalived' '--dont-fork'
  Keepalived[4045]: Configuration file /etc/keepalived/keepalived.conf
  Keepalived[4045]: NOTICE: setting config option max_auto_priority should result in better keepalived performance
  Keepalived[4045]: Starting VRRP child process, pid=4046
  Keepalived_vrrp[4046]: (/etc/keepalived/keepalived.conf: Line 10) Truncating auth_pass to 8 characters
  Keepalived[4045]: Startup complete
  systemd[1]: Started keepalived.service - Keepalive Daemon (LVS and VRRP).
  Keepalived_vrrp[4046]: (M3_1) Entering BACKUP STATE (init)
  Keepalived_vrrp[4046]: (M3_1) Entering MASTER STATE
  Backup:
  systemd[1]: keepalived.service - Keepalive Daemon (LVS and VRRP) was skipped because of an unmet condition check (ConditionFileNotEmpty=/etc/keepalived/keepalived.conf).
  systemd[1]: Starting keepalived.service - Keepalive Daemon (LVS and VRRP)...
  Keepalived[2751]: Starting Keepalived v2.2.8 (04/04,2023), git commit v2.2.7-154-g292b299e+
  Keepalived[2751]: Running on Linux 6.8.0-53-generic #55-Ubuntu SMP PREEMPT_DYNAMIC Fri Jan 17 15:37:52 UTC 2025 (built for Linux 6.8.0)
  Keepalived[2751]: Command line: '/usr/sbin/keepalived' '--dont-fork'
  Keepalived[2751]: Configuration file /etc/keepalived/keepalived.conf
  Keepalived[2751]: NOTICE: setting config option max_auto_priority should result in better keepalived performance
  Keepalived[2751]: Starting VRRP child process, pid=2752
  Keepalived_vrrp[2752]: (/etc/keepalived/keepalived.conf: Line 10) Truncating auth_pass to 8 characters
  Keepalived[2751]: Startup complete
  Keepalived_vrrp[2752]: (M3_2) Entering BACKUP STATE (init)
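The master and backup configurations above differ only in the instance name, state, and priority, and both depend on the interface found in the `ip a` output. The sketch below shows one way to derive both files from the same template instead of editing them by hand. The helper names `find_iface` and `render_keepalived_conf` are hypothetical, and the password is passed as an argument rather than hard-coded. Since the logs above report that auth_pass is truncated to 8 characters, a password of at most 8 characters may be a sensible choice here.

```shell
# Print the name of the first interface whose IPv4 address starts with
# prefix $1. Expects the output of `ip -o -4 addr show` on stdin.
find_iface() {
  awk -v prefix="$1" '$3 == "inet" && index($4, prefix) == 1 { print $2; exit }'
}

# Print a keepalived.conf for the given role to stdout.
# Arguments: ROLE (MASTER or BACKUP), interface name, auth password.
render_keepalived_conf() {
  local role=$1 iface=$2 pass=$3 name priority
  case "$role" in
    MASTER) name=M3_1; priority=100 ;;
    BACKUP) name=M3_2; priority=90 ;;
    *) echo "role must be MASTER or BACKUP" >&2; return 1 ;;
  esac
  cat <<EOF
vrrp_instance $name {
    state $role
    interface $iface
    virtual_router_id 51
    priority $priority
    advert_int 1
    authentication {
        auth_type PASS
        auth_pass $pass
    }
    virtual_ipaddress {
        172.28.1.254
    }
    preempt
}
EOF
}
```

On master-server-m3, this could be used as `iface=$(ip -o -4 addr show | find_iface 172.28.1.)` followed by `render_keepalived_conf MASTER "$iface" "$PASS" | sudo tee /etc/keepalived/keepalived.conf`, where `PASS` is a variable you set yourself.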
Test the solution
- Make sure the virtual IP address was added to the network interface of the server with the MASTER role:
  - Connect to master-server-m3 over SSH.
  - View the configuration of the network interface assigned to the subnet-m3 private subnet:
    ip a
    Result:
    ...
    5: etx2: <BROADCAST,MULTICAST,UP,LOWER_UP> mtu 1500 qdisc mq state UP group default qlen 1000
        link/ether 00:02:c9:35:fd:31 brd ff:ff:ff:ff:ff:ff
        altname enp6s0d1
        inet 172.28.1.2/24 metric 100 brd 172.28.1.255 scope global dynamic etx2
           valid_lft 3575sec preferred_lft 3575sec
        inet 172.28.1.254/32 scope global etx2
           valid_lft forever preferred_lft forever
        inet6 fe80::202:c9ff:fe35:fd31/64 scope link
           valid_lft forever preferred_lft forever
    The network interface received an additional virtual IP address specified in Keepalived settings: 172.28.1.254/32.
- Send ICMP requests from the subnet-m4 private subnet to make sure the virtual IP address in the subnet-m3 private subnet is available:
  - Connect to client-server-m4 over SSH.
  - Run this command:
    ping 172.28.1.254 -s 1024 -c 5
    Result:
    PING 172.28.1.254 (172.28.1.254) 1024(1052) bytes of data.
    1032 bytes from 172.28.1.254: icmp_seq=1 ttl=62 time=0.211 ms
    1032 bytes from 172.28.1.254: icmp_seq=2 ttl=62 time=0.242 ms
    1032 bytes from 172.28.1.254: icmp_seq=3 ttl=62 time=0.264 ms
    1032 bytes from 172.28.1.254: icmp_seq=4 ttl=62 time=0.312 ms
    1032 bytes from 172.28.1.254: icmp_seq=5 ttl=62 time=0.273 ms
    --- 172.28.1.254 ping statistics ---
    5 packets transmitted, 5 received, 0% packet loss, time 4117ms
    rtt min/avg/max/mdev = 0.211/0.260/0.312/0.033 ms
    The command you have used sends and receives packets of an increased size. All packets were delivered in full.
- Make sure Keepalived failover works correctly:
  - Connect to client-server-m4 over SSH.
  - In a separate terminal window, connect to master-server-m3 over SSH.
    Arrange the terminal windows so that you can see the contents of both at the same time.
  - In the terminal window with the client-server-m4 session, run ping once again without a retry limit:
    ping 172.28.1.254 -s 1024
    While ping is running, stop Keepalived in the terminal window with the open master-server-m3 session by running this command:
    sudo systemctl stop keepalived
    When it stops, observe the terminal window with the client-server-m4 session. If the virtual IP address was shared successfully, ICMP requests should switch to the backup host almost seamlessly without interrupting the running ping command.
    Note
    A minor loss of 1 to 3 packets is acceptable. It may happen while the timer for electing a new group MASTER runs and the virtual IP address is being assigned to that server.
    Result:
    PING 172.28.1.254 (172.28.1.254) 1024(1052) bytes of data.
    1032 bytes from 172.28.1.254: icmp_seq=1 ttl=62 time=0.249 ms
    ...
    1032 bytes from 172.28.1.254: icmp_seq=56 ttl=62 time=0.224 ms
    1032 bytes from 172.28.1.254: icmp_seq=57 ttl=62 time=0.314 ms
    1032 bytes from 172.28.1.254: icmp_seq=58 ttl=62 time=0.278 ms
    ^C
    --- 172.28.1.254 ping statistics ---
    58 packets transmitted, 55 received, 5.17241% packet loss, time 58368ms
    rtt min/avg/max/mdev = 0.185/0.271/0.326/0.035 ms
  - In the terminal window with the master-server-m3 session, start Keepalived again using this command:
    sudo systemctl start keepalived
- Check Keepalived logs on the server with the BACKUP role:
  - Connect to backup-server-m3 over SSH.
  - View Keepalived logs:
    sudo journalctl -u keepalived.service
    Result:
    ...
    # Logging the transition to MASTER when Keepalived stopped on the initial master node
    Feb 19 07:08:07 backup-server-m3 Keepalived_vrrp[2752]: (M3_2) Entering MASTER STATE
    # Logging the transition to BACKUP when resuming Keepalived on the initial master node.
    Feb 19 07:08:31 backup-server-m3 Keepalived_vrrp[2752]: (M3_2) Master received advert from 172.28.1.2 with higher priority 100, ours 90
    Feb 19 07:08:31 backup-server-m3 Keepalived_vrrp[2752]: (M3_2) Entering BACKUP STATE
    ...
    As you can see from the service log and comments, backup-server-m3 was promoted to the master node when Keepalived was stopped on master-server-m3. After resuming Keepalived on master-server-m3, the server reclaimed its master role and backup-server-m3, again, became the backup node.
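The failover checks above rely on reading the ping summary and the Keepalived journal by eye. The following minimal sketch shows how the same two checks could be scripted. The function names `lost_packets`, `failover_ok`, and `vrrp_transitions` are hypothetical helpers, and the default tolerance of 3 packets simply mirrors the note in this tutorial.

```shell
# Print the number of lost packets from a ping summary read on stdin,
# for example: "58 packets transmitted, 55 received, ...".
lost_packets() {
  awk '/packets transmitted/ { print $1 - $4; exit }'
}

# Succeed if the loss reported on stdin is within the tolerance
# given as $1 (defaults to 3 packets).
failover_ok() {
  local lost
  lost=$(lost_packets)
  [ -n "$lost" ] && [ "$lost" -le "${1:-3}" ]
}

# Print only the VRRP state transitions from Keepalived log lines on stdin.
vrrp_transitions() {
  grep -oE '\([A-Za-z0-9_]+\) Entering [A-Z]+ STATE'
}
```

For example, on client-server-m4 you might pipe the output of `ping 172.28.1.254 -s 1024 -c 60` into `failover_ok`, and on backup-server-m3 pipe `sudo journalctl -u keepalived.service` into `vrrp_transitions` to list the MASTER and BACKUP transitions in order.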
How to delete the resources you created
You cannot delete a BareMetal server. Instead, you can cancel the server lease.
To stop paying for the resources you created, cancel the lease of the BareMetal servers you created earlier.