Load balancers
A load balancer is used to receive incoming traffic and transmit it to the backend endpoints. Requests are routed based on the settings of load balancer listeners. Settings for transmitting traffic to backends are configured in backend groups.
The load balancer stores a list of endpoints that accept traffic and terminates TLS encryption before sending the traffic to backends. The load balancer supports modern TLS versions (TLSv1.2 and TLSv1.3) and encryption methods. If the load balancer is going to serve multiple domains, you can configure individual certificates and HTTP routers for each domain by using the TLS SNI mechanism.
The L7 load balancer supports IETF
For convenience and security, you can use the load balancer in combination with Yandex Certificate Manager to store your TLS certificates. You can also use Yandex Monitoring services to monitor request processing.
Security groups
When creating a load balancer, you need to specify security groups as they contain rules the load balancer uses to receive incoming traffic and send it to backend VMs. Each VM also has security groups assigned to it.
For the load balancer to work correctly:
- The load balancer security groups must allow:
  - Receiving external incoming traffic on the ports specified in the listener, e.g., for HTTP(S) traffic: TCP connections on ports 80 and 443 from any address (CIDR: 0.0.0.0/0).
  - Receiving incoming traffic for health checks of load balancer nodes in different availability zones: TCP connections on port 30080 with the Load balancer healthchecks source.
  - Sending traffic to backend VMs, i.e., VMs whose IP addresses are included in target groups. For example, any outgoing connections to internal VM addresses (any protocol, full port range, CIDR: <VM_internal_IP_address>/32).

  Note
  Security group rules must specify IP ranges in CIDR format. You cannot assign a security group that references a different security group.

- Backend VM security groups must allow incoming traffic from the load balancer on the ports specified in the backend groups, e.g., any incoming connections from subnets hosting the load balancer or from at least one of its security groups.
For information on how to configure security groups for the ingress controller and Gateway API, see Configuring security groups for Managed Service for Kubernetes Application Load Balancer tools.
Load balancer location and its internal IP addresses
When creating a load balancer, specify a network and subnets in the availability zones. Those are the subnets where the load balancer's nodes will be hosted. Application backends will receive traffic from the load balancer nodes in these subnets.
Alert
If all backends with health checks enabled in an availability zone fail those checks, traffic will no longer route to that zone, even if functional backends without health checks remain.
We recommend configuring health checks for all backends.
See below to learn what subnet sizes are recommended for load balancers.
Internal IP addresses
The load balancer reserves internal IP addresses in the specified networks and assigns addresses to its nodes. These addresses are used for communication between the load balancer nodes and backends. Node IP addresses are shown in the list of internal IP addresses.
To correctly distribute the load across backends, add a permission for incoming traffic from subnets where the load balancer nodes are located:
- Get the CIDR of each subnet the load balancer nodes are using.
- To enable nodes to freely communicate with backends, add these CIDRs to the list of allowed sources.
For example, the load balancer uses subnets with CIDRs 10.0.1.0/24 and 10.0.2.0/24, and backends receive traffic on port 8080. In this case, you will need two rules to allow traffic from the load balancer nodes:

| Traffic direction | Port range | Protocol | Source | CIDR blocks |
| --- | --- | --- | --- | --- |
| Inbound | 8080 | TCP | CIDR | 10.0.1.0/24 |
| Inbound | 8080 | TCP | CIDR | 10.0.2.0/24 |
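The two steps above can be sketched in Python with the standard `ipaddress` module. This is only an illustration: the function names and the dictionary layout of a rule are hypothetical and do not correspond to any Yandex Cloud API.

```python
from ipaddress import ip_address, ip_network


def backend_ingress_rules(lb_subnets, port):
    """Build one inbound TCP rule per load balancer subnet (hypothetical rule format)."""
    return [
        {"direction": "Inbound", "port": port, "protocol": "TCP", "cidr": str(ip_network(cidr))}
        for cidr in lb_subnets
    ]


def source_allowed(rules, source_ip, port):
    """Return True if any rule admits a TCP connection from source_ip to port."""
    addr = ip_address(source_ip)
    return any(rule["port"] == port and addr in ip_network(rule["cidr"]) for rule in rules)


# Subnets and port from the example above.
rules = backend_ingress_rules(["10.0.1.0/24", "10.0.2.0/24"], 8080)
for rule in rules:
    print(rule)

print(source_allowed(rules, "10.0.2.15", 8080))  # True: address is inside 10.0.2.0/24
print(source_allowed(rules, "10.0.3.15", 8080))  # False: no rule covers 10.0.3.0/24
```

A check like this can help confirm that every load balancer subnet is covered before the rules are applied to the backend VMs.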
Autoscaling and resource units
An internal group of VM instances called resource units is created in each availability zone of the load balancer.
One resource unit supports these peak performance thresholds:
- 1,000 requests per second (RPS).
- 4,000 concurrently active connections.
- 300 new connections per second.
- 22 MB (176 Mbit) of traffic per second (covers both incoming and outgoing traffic).
The system automatically scales the resource unit group based on the load balancer node’s external workload. The system calculates the group size to ensure resource unit utilization remains below specified thresholds.
As an example, let's take a look at the following load:
- 6,000 RPS.
- 29,000 concurrently active connections.
- 750 new connections per second.
- 20 MB of traffic per second.
This workload requires eight resource units, which is the largest of the per-metric estimates, each rounded up to a whole unit:
- 6,000 / 1,000 = 6 resource units for 6,000 RPS.
- 29,000 / 4,000 = 7.25, rounded up to 8 resource units for 29,000 active connections.
- 750 / 300 = 2.5, rounded up to 3 resource units for 750 new connections per second.
- 20 / 22 = 0.9090..., rounded up to 1 resource unit for 20 MB/s of traffic.
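The same arithmetic can be written as a short Python sketch. The thresholds come from the list of per-unit limits above; the dictionary keys and the function name are made up for illustration, and the sketch ignores the default minimum of resource units per zone.

```python
import math

# Peak thresholds of one resource unit, as listed above.
RU_LIMITS = {
    "rps": 1000,
    "active_connections": 4000,
    "new_connections_per_second": 300,
    "traffic_mb_per_second": 22,
}


def required_resource_units(load):
    """Round each metric up to whole units and take the largest estimate."""
    per_metric = {name: math.ceil(load[name] / limit) for name, limit in RU_LIMITS.items()}
    return max(per_metric.values()), per_metric


load = {
    "rps": 6000,
    "active_connections": 29000,
    "new_connections_per_second": 750,
    "traffic_mb_per_second": 20,
}
total, per_metric = required_resource_units(load)
print(per_metric)
# {'rps': 6, 'active_connections': 8, 'new_connections_per_second': 3, 'traffic_mb_per_second': 1}
print(total)  # 8
```

The busiest metric, here the number of active connections, determines the group size.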
By default, the minimum number of resource units per availability zone is 2. You can increase it in the autoscaling settings. For more information, see below.
The number of resource units affects the cost of using the load balancer. For more information, see Yandex Application Load Balancer pricing policy.
Tip
Application Load Balancer uses Yandex Compute Cloud instance groups as load balancer resource units. See the description of instance groups during a zonal incident and our mitigation guidelines.
Autoscaling settings
In the load balancer settings, you can specify the following:
- Minimum number of resource units per availability zone

  If you expect higher loads on the load balancer, you can increase the minimum number of resource units per zone in advance so that you do not have to wait for the group to scale up with the load.
  The default minimum is 2. You cannot set a minimum lower than two resource units per zone.

- Maximum total number of resource units

  The cost of using the load balancer depends on the number of its resource units (see the relevant pricing policy). By default, the number is unlimited. You can set a limit to control your expenses.
  If the specified limit is too low for the actual load on the load balancer, it may run incorrectly.
  Make sure this value is no less than the number of load balancer availability zones multiplied by the minimum number of resource units per zone.
You can set autoscaling for a group of resource units of your load balancer when creating or updating it.
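The constraints on these two settings can be checked with a few lines of Python. This is a minimal sketch: the function and parameter names are hypothetical and are not part of the Application Load Balancer API.

```python
def validate_autoscaling(zone_count, min_per_zone=2, max_total=None):
    """Return a list of problems with the settings; an empty list means they are consistent."""
    errors = []
    if min_per_zone < 2:
        errors.append("minimum per zone must be at least 2")
    # max_total=None models the default, where the total is unlimited.
    if max_total is not None and max_total < zone_count * min_per_zone:
        errors.append(f"maximum total must be at least {zone_count * min_per_zone}")
    return errors


print(validate_autoscaling(3, min_per_zone=2, max_total=6))   # []
print(validate_autoscaling(3, min_per_zone=4, max_total=10))  # ['maximum total must be at least 12']
```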
Recommended subnet sizes
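For Application Load Balancer to provide load balancer availability as specified in the service level agreement, each subnet must have enough free internal IP addresses: at least two per resource unit, as the following example illustrates.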
For example, if a load balancer uses eight resource units in each availability zone, as in the example above, each subnet should have at least 8 × 2 = 16 free addresses. We recommend that load balancer subnets be at least /27 in size.
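The sizing arithmetic can be sketched as follows. The factor of two addresses per resource unit is taken from the 8 × 2 = 16 example; the function names are illustrative only.

```python
from ipaddress import ip_network

ADDRESSES_PER_RESOURCE_UNIT = 2  # taken from the 8 x 2 = 16 example above


def free_addresses_needed(resource_units_per_zone):
    """Free internal addresses a subnet should have for the given number of resource units."""
    return resource_units_per_zone * ADDRESSES_PER_RESOURCE_UNIT


def block_size(cidr):
    """Total number of addresses in a CIDR block (an upper bound on free addresses)."""
    return ip_network(cidr).num_addresses


print(free_addresses_needed(8))   # 16
print(block_size("10.0.1.0/27"))  # 32
```

The block size is only an upper bound: addresses that are reserved or already used by other resources in the subnet reduce the number that is actually free.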
Zonal shift mechanism
A load balancer can be temporarily disabled in one or more availability zones. You can disable zones manually or allow the system to disable them automatically during incidents. Automatic disabling is performed by Yandex Cloud staff if a zone is unavailable. Once the zone is operational again, it is re-enabled, and your traffic is distributed evenly again.
Once a zone is disabled, external traffic will no longer be sent to the load balancer nodes in that availability zone. However, the load balancer nodes in other availability zones will continue sending traffic to backends in the disabled zone, if allowed by the locality-aware routing settings.
When manually disabling an availability zone, you can set a time from 1 minute to 72 hours, after which the zone will automatically return to work and the load balancer settings will be updated without your intervention. If no time is set, the zone will be off until you re-enable it manually.
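The allowed range for the timer can be expressed with `datetime.timedelta`. This is a small sketch; the function name is hypothetical, and `None` is used here to model the case where no time is set.

```python
from datetime import timedelta

MIN_SHIFT = timedelta(minutes=1)
MAX_SHIFT = timedelta(hours=72)


def check_shift_duration(duration):
    """Return True if the duration is acceptable for a manual zonal shift.

    None means no timer: the zone stays disabled until it is re-enabled manually.
    """
    if duration is None:
        return True
    return MIN_SHIFT <= duration <= MAX_SHIFT


print(check_shift_duration(timedelta(hours=2)))   # True
print(check_shift_duration(timedelta(hours=96)))  # False
print(check_shift_duration(None))                 # True
```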
By manually disabling an availability zone, you can:
- Reduce its traffic load during localized issues, e.g., after a faulty app release on your backend or an incident caused by high traffic or misconfiguration. This helps prevent service disruption or quickly restore app functionality for your users.
- Test the resilience of your load balancer and traffic failover mechanisms. This way, you can proactively identify potential weaknesses, apply fixes, and optimize your load balancer settings in advance.
Listener
The listener determines the ports, addresses, and protocols the load balancer will accept traffic on.
Some incoming ports, such as port 22, are reserved for service purposes and you cannot use them.
Request routing to backend groups depends on the listener type:
- HTTP: The load balancer accepts HTTP or HTTPS requests and distributes them across backend groups based on the rules set in HTTP routers, or redirects HTTP requests to HTTPS. Backend groups receiving traffic must have the HTTP or gRPC type. For HTTP listeners, Yandex Monitoring calculates and displays the request statistics.
- Stream: The load balancer accepts incoming traffic via unencrypted or encrypted TCP connections and routes it to Stream backend groups. For Stream listeners, the system does not collect statistics on individual HTTP requests, so Monitoring does not display their error or access metrics.
If encrypted traffic is accepted, the main listener and optional SNI listeners are set up for the load balancer. In each SNI listener, the domain name specified as Server Name Indication
Tip
Some browsers reuse TLS connections with the same IP address if a connection certificate contains the necessary domain name. In this case, no new SNI match is selected and traffic may be routed to an inappropriate HTTP router. Use different certificates in different SNI listeners and the main listener. To distribute traffic across the domain names of a single certificate, set up HTTP router virtual hosts.
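As a rough illustration of SNI-based selection, the sketch below maps a server name to an HTTP router. It assumes exact, case-insensitive matching and a fallback to the main listener's router when no SNI listener matches; both are assumptions made for this sketch, and the router names are invented.

```python
def select_router(server_name, sni_routes, default_router):
    """Pick the router of the SNI listener matching server_name, else the main listener's router."""
    if server_name is None:
        # The client sent no SNI extension (assumed to fall back to the main listener).
        return default_router
    return sni_routes.get(server_name.lower(), default_router)


# Hypothetical SNI listeners: domain name -> HTTP router.
sni_routes = {
    "shop.example.com": "shop-router",
    "api.example.com": "api-router",
}

print(select_router("api.example.com", sni_routes, "main-router"))    # api-router
print(select_router("other.example.com", sni_routes, "main-router"))  # main-router
```

The selection runs once per TLS handshake, which is why a reused connection, as described in the tip above, keeps the router chosen for the first domain.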
One load balancer can serve both regular and encrypted traffic on different ports and have public and internal IP addresses on different listeners.
Example
The listener can accept HTTP traffic on port 80 and redirect traffic to HTTPS port 443. The listener gets an HTTP request from a client and returns a response with HTTP code 302. Further requests will be accepted at port 443 via HTTPS.
If an HTTPS listener is used, specify a certificate from Certificate Manager that will be used to terminate TLS connections.
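The redirect from the example can be sketched as a function that builds the response for a plain HTTP request. The function name is hypothetical, and the sketch assumes the Host header holds a domain name or IPv4 address, optionally followed by a port.

```python
def redirect_to_https(host, path):
    """Return (status, headers) that redirect an HTTP request to HTTPS on the default port 443."""
    hostname = host.split(":")[0]  # drop an explicit port such as ":80"
    return 302, {"Location": f"https://{hostname}{path}"}


status, headers = redirect_to_https("example.com:80", "/index.html")
print(status, headers["Location"])  # 302 https://example.com/index.html
```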
Use cases
- Setting up virtual hosting
- Creating a load balancer with DDoS protection
- Creating an L7 Application Load Balancer with a Yandex Smart Web Security profile
- Migrating from an ALB ingress controller to Gwin
- Migrating services from an NLB to an L7 ALB to enable Yandex Smart Web Security protection
- Configuring a Yandex Application Load Balancer using an ingress controller
- Fault-tolerant website with load balancing via Yandex Application Load Balancer using the management console
- Health checking applications in a Yandex Managed Service for Kubernetes cluster via a Yandex Application Load Balancer
- Writing load balancer logs to PostgreSQL
- Configuring Yandex Application Load Balancer logging via an ingress controller