Load balancers
A load balancer is used to receive incoming traffic and transmit it to the backend endpoints. Requests are routed based on the settings of load balancer listeners. Settings for transmitting traffic to backends are configured in backend groups.
The load balancer stores a list of endpoints that accept traffic and terminates TLS encryption before sending the traffic to backends. The load balancer supports modern TLS versions (TLSv1.2, TLSv1.3) and encryption methods. If the load balancer is going to serve multiple domains, you can configure individual certificates and HTTP routers for each domain by using the TLS SNI mechanism.
For convenience and security, you can use the load balancer in combination with Yandex Certificate Manager to store your TLS certificates. You can also use Yandex Monitoring services to monitor request processing.
Security groups
When creating a load balancer, you need to specify security groups as they contain rules the load balancer uses to receive incoming traffic and send it to backend VMs. Each VM also has security groups assigned to it.
For the load balancer to work correctly:
- The load balancer security groups must allow:
  - Receiving external incoming traffic on the ports specified in the listener, e.g., for HTTP(S) traffic: TCP connections on ports 80 and 443 from any address (CIDR: 0.0.0.0/0).
  - Receiving incoming traffic to health check load balancer nodes in different availability zones: TCP connections on port 30080 with the Load balancer healthchecks source.
  - Sending traffic to backend VMs, i.e., VMs whose IP addresses are included in target groups. For example, any outgoing connections to internal VM addresses (any protocol, full port range, CIDR: <VM_internal_IP_address>/32).
- Backend VM security groups must allow incoming traffic from the load balancer on the ports specified in the backend groups, e.g., any incoming connections from subnets hosting the load balancer or from at least one of its security groups.
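The load balancer rules listed above can be modeled as plain data to check what they permit. The following is a minimal sketch, not a Yandex Cloud API: the `Rule` class, the `permits` and `allowed` helpers, the `loadbalancer_healthchecks` label, and the VM address 10.128.0.5 are all hypothetical names chosen for illustration.

```python
from dataclasses import dataclass
from ipaddress import ip_address, ip_network


@dataclass(frozen=True)
class Rule:
    direction: str  # "ingress" or "egress"
    protocol: str   # "tcp" or "any"
    ports: range    # allowed ports, e.g. range(80, 81) for port 80 only
    peer: str       # a CIDR, or a named source (hypothetical label)


def permits(rule, direction, protocol, port, peer):
    """Return True if a single rule allows the described connection."""
    if rule.direction != direction:
        return False
    if rule.protocol not in ("any", protocol):
        return False
    if port not in rule.ports:
        return False
    if "/" in rule.peer:  # the rule peer is a CIDR
        try:
            return ip_address(peer) in ip_network(rule.peer)
        except ValueError:  # the connection peer is a named source, not an IP
            return False
    return rule.peer == peer


def allowed(rules, direction, protocol, port, peer):
    """A connection is allowed if at least one rule permits it."""
    return any(permits(r, direction, protocol, port, peer) for r in rules)


# Rules mirroring the list above; 10.128.0.5 is a placeholder VM address.
ALB_RULES = [
    Rule("ingress", "tcp", range(80, 81), "0.0.0.0/0"),
    Rule("ingress", "tcp", range(443, 444), "0.0.0.0/0"),
    Rule("ingress", "tcp", range(30080, 30081), "loadbalancer_healthchecks"),
    Rule("egress", "any", range(0, 65536), "10.128.0.5/32"),
]

print(allowed(ALB_RULES, "ingress", "tcp", 443, "203.0.113.7"))
print(allowed(ALB_RULES, "ingress", "tcp", 22, "203.0.113.7"))
```

The first call prints `True` because HTTPS traffic from any address is allowed. The second prints `False` because no rule opens port 22.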
For information on how to configure security groups for the Ingress controller and Gateway API, see Configuring security groups for Application Load Balancer tools for Managed Service for Kubernetes.
Load balancer location
When creating a load balancer, specify a network and subnets in the availability zones. Those are the subnets where the load balancer's nodes will be hosted. Application backends will receive traffic from the load balancer nodes in these subnets.
Alert
If all backends in an availability zone with health checks enabled fail the checks, traffic will stop routing to the zone, even if there are working backends without health checks in the zone.
We recommend configuring health checks for all backends.
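The rule in this alert can be expressed as a small predicate. This is an illustrative sketch only: the `Backend` type and the `zone_receives_traffic` function are hypothetical. The behavior for a zone where no backend has a health check is not described here, so the sketch assumes such a zone keeps receiving traffic.

```python
from dataclasses import dataclass
from typing import Optional


@dataclass
class Backend:
    name: str
    # True/False is the health check result; None means no health check is configured.
    healthy: Optional[bool]


def zone_receives_traffic(backends):
    """Decide whether traffic is routed to a zone, following the alert above."""
    checked = [b for b in backends if b.healthy is not None]
    if not checked:
        # Assumption (not stated in the text): without any health checks,
        # the zone receives traffic as long as it has backends.
        return len(backends) > 0
    # If every health-checked backend fails, the zone is dropped,
    # even when backends without health checks are still working.
    return any(b.healthy for b in checked)


zone = [Backend("vm-1", False), Backend("vm-2", None)]
print(zone_receives_traffic(zone))  # False
```

Here `vm-2` may be working, but it has no health check, so it cannot keep the zone in rotation once `vm-1` fails.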
See below to learn what subnet sizes are recommended for load balancers.
You can disable the load balancer in the selected availability zones. In this case, external traffic will no longer be sent to the load balancer nodes in these availability zones. However, the load balancer nodes in other availability zones will continue delivering traffic to backends in the availability zones where the load balancer was disabled, if this is allowed by the locality aware routing settings.
Autoscaling and resource units
An internal group of VM instances called resource units is created in each availability zone of the load balancer.
One resource unit is designed for the following maximum indicator values:
- 1000 requests per second (RPS).
- 4000 concurrently active connections.
- 300 new connections per second.
- 22 MB (176 Mbit) of traffic per second.
A group of resource units is automatically scaled depending on the external load on load balancer nodes. The group size is calculated so that the load per unit does not exceed the threshold values.
As an example, let's take a look at the following load:
- 6000 RPS.
- 29000 concurrently active connections.
- 750 new connections per second.
- 20 MB of traffic per second.
This load requires eight resource units, which is the largest of the values calculated for each indicator:
- 6000 / 1000 = 6 is the number of resource units designed for 6000 RPS.
- 29000 / 4000 = 7.25, rounded up to 8, is the number of resource units designed for 29000 active connections.
- 750 / 300 = 2.5, rounded up to 3, is the number of resource units designed for 750 new connections per second.
- 20 / 22 = 0.9090..., rounded up to 1, is the number of resource units designed for 20 MB of traffic per second.
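The calculation above can be sketched in Python. This only illustrates the arithmetic: the function and dictionary names are hypothetical and are not part of any Application Load Balancer API. The per-unit limits are the ones listed earlier in this section.

```python
import math

# Maximum load one resource unit is designed for (from the list above).
UNIT_LIMITS = {
    "rps": 1000,
    "active_connections": 4000,
    "new_connections_per_second": 300,
    "traffic_mb_per_second": 22,
}


def units_per_metric(load):
    """Resource units each indicator needs on its own, rounded up."""
    return {
        metric: math.ceil(load[metric] / limit)
        for metric, limit in UNIT_LIMITS.items()
    }


def required_resource_units(load):
    """The group must be large enough for the most demanding indicator."""
    return max(units_per_metric(load).values())


example_load = {
    "rps": 6000,
    "active_connections": 29000,
    "new_connections_per_second": 750,
    "traffic_mb_per_second": 20,
}

print(units_per_metric(example_load))
print(required_resource_units(example_load))  # 8
```

The number of active connections is the limiting indicator in this example, so it determines the group size.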
By default, the minimum number of resource units per availability zone is 2. You can increase it in the autoscaling settings. For more information, see below.
The number of resource units affects the cost of using the load balancer. For more information, see Yandex Application Load Balancer pricing policy.
Autoscaling settings
In the load balancer settings, you can specify the following:
- Minimum number of resource units per availability zone

  If you expect higher loads on the load balancer, you can increase the minimum number of resource units per zone in advance to avoid waiting for it to increase following the load.
  The default minimum is 2. You cannot set a minimum value below 2.

- Maximum total number of resource units

  The cost of using the load balancer depends on the number of its resource units (see the relevant pricing policy). By default, this number is unlimited. You can set a limit to control your expenses.
  If the specified limit is too low for the actual load on the load balancer, it may run incorrectly.
  Make sure the value is greater than or equal to the number of load balancer availability zones multiplied by the minimum number of resource units per zone.
You can set autoscaling for a group of resource units of your load balancer when creating or updating it.
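The constraints on these two settings can be checked before applying them. The sketch below is an assumption-laden illustration: `validate_autoscaling` and its parameter names are hypothetical and do not correspond to actual API fields.

```python
MIN_UNITS_FLOOR = 2  # the minimum per zone cannot be set below 2


def validate_autoscaling(zone_count, min_units_per_zone=2, max_total_units=None):
    """Return a list of problems; an empty list means the settings are consistent.

    max_total_units=None stands for the default, unlimited maximum.
    """
    problems = []
    if min_units_per_zone < MIN_UNITS_FLOOR:
        problems.append(
            f"minimum per zone is {min_units_per_zone}, must be at least {MIN_UNITS_FLOOR}"
        )
    if max_total_units is not None:
        required = zone_count * min_units_per_zone
        if max_total_units < required:
            problems.append(
                f"maximum total is {max_total_units}, must be at least {required}"
            )
    return problems


# Three availability zones with the default minimum need a limit of at least 6.
print(validate_autoscaling(zone_count=3, max_total_units=5))
print(validate_autoscaling(zone_count=3, max_total_units=6))  # []
```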
Recommended subnet sizes
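For Application Load Balancer to provide load balancer availability as specified in the service level agreement, each of its subnets must have enough available IP addresses for the load balancer's resource units.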
For instance, if a load balancer uses eight resource units in each availability zone as shown in this example, each subnet should have at least 8 × 2 = 16 addresses available. For each load balancer, we recommend specifying subnets with the size of at least /27.
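The sizing arithmetic above can be checked with Python's standard `ipaddress` module. This is a rough sketch under stated assumptions: the factor of two addresses per resource unit is taken from the 8 × 2 = 16 calculation, the function names are hypothetical, and the check counts all addresses in the subnet rather than the ones that are actually free.

```python
import ipaddress

ADDRESSES_PER_UNIT = 2  # taken from the 8 x 2 = 16 calculation above


def addresses_needed(resource_units):
    """Addresses a subnet should have available for the given number of units."""
    return resource_units * ADDRESSES_PER_UNIT


def subnet_is_large_enough(cidr, resource_units):
    """Compare the total subnet size with the addresses needed.

    Simplification: this ignores addresses that are reserved or already
    used by other resources, so treat a True result as a lower bound.
    """
    subnet = ipaddress.ip_network(cidr)
    return subnet.num_addresses >= addresses_needed(resource_units)


print(addresses_needed(8))                          # 16
print(subnet_is_large_enough("10.0.0.0/27", 8))     # True: a /27 has 32 addresses
print(subnet_is_large_enough("10.0.0.0/29", 8))     # False: a /29 has 8 addresses
```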
Listener
The listener determines the ports, addresses, and protocols the load balancer will accept traffic on.
Some incoming ports, such as port 22, are reserved for service purposes and you cannot use them.
Request routing to backend groups depends on the listener type:
- HTTP: The load balancer accepts HTTP or HTTPS requests and distributes them across backend groups based on the rules set in HTTP routers, or redirects HTTP requests to HTTPS. Backend groups receiving traffic must have the HTTP or gRPC type.
- Stream: The load balancer accepts incoming traffic via unencrypted or encrypted TCP connections and routes it to Stream backend groups.
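If encrypted traffic is accepted, the main listener and optional SNI listeners are set up for the load balancer. In each SNI listener, the domain name specified as Server Name Indication (SNI) is mapped to a TLS certificate and to an HTTP router (for HTTP listeners) or a backend group (for Stream listeners).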
Tip
Some browsers reuse TLS connections with the same IP address if a connection certificate contains the necessary domain name. In this case, no new SNI match is selected and traffic may be routed to an inappropriate HTTP router. Use different certificates in different SNI listeners and the main listener. To distribute traffic across the domain names of a single certificate, set up HTTP router virtual hosts.
One load balancer can serve both regular and encrypted traffic on different ports and have public and internal IP addresses on different listeners.
Example
The listener can accept HTTP traffic on port 80 and redirect traffic to HTTPS port 443. The listener gets an HTTP request from a client and returns a response with HTTP code 302. Further requests will be accepted at port 443 via HTTPS.
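The redirect in this example amounts to a 302 response whose `Location` header points at the same host and path over HTTPS. The following sketch shows that response being built. It is not how the load balancer is implemented, and `redirect_to_https` is a hypothetical helper.

```python
def redirect_to_https(host_header, path, https_port=443):
    """Build the status code and headers of an HTTP-to-HTTPS redirect.

    Assumes host_header is a hostname or IPv4 address with an optional port,
    e.g. "example.com:80"; IPv6 literals are not handled in this sketch.
    """
    hostname = host_header.split(":")[0]
    # Port 443 is the HTTPS default, so it is omitted from the URL.
    authority = hostname if https_port == 443 else f"{hostname}:{https_port}"
    return 302, {"Location": f"https://{authority}{path}"}


status, headers = redirect_to_https("example.com:80", "/catalog?page=2")
print(status, headers["Location"])  # 302 https://example.com/catalog?page=2
```

A client that follows the redirect sends its next request to port 443 over HTTPS, where the HTTPS listener accepts it.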
If an HTTPS listener is used, specify a certificate from Certificate Manager that will be used to terminate TLS connections.
Statistics
Load balancer statistics are automatically logged in the Yandex Monitoring metrics. The following dashboards and measures are available:
-
HTTP statistics:
- RPS: Number of load balancer requests per second.
- 4XX, 5XX: Number of load balancer responses containing HTTP codes 4XX and 5XX and the corresponding gRPC codes per second.
- Request size: Total volume of load balancer requests per second.
- Response size: Total volume of load balancer responses per second.
- Latency: Response delay (the time between the load balancer receiving the first byte of a request and sending the last byte of the response), 50th to 99th percentiles.
-
Capacity statistics:
- Active connections: Number of active connections.
- Connections per second: Number of connections per second.
- Requests per second: Number of requests per second.
- Bytes per second: Amount of data handled per second.
For a full list of metrics delivered to Yandex Monitoring, see the reference.
Application Load Balancer has aggregate load balancer statistics available. In Monitoring, you can view statistics itemized by the resources associated with the load balancer (HTTP routers, virtual hosts, routes, and the like) as well as create alerts.
For instructions on viewing statistics, see Viewing L7 load balancer statistics.
Logging
You can configure the delivery of load balancer logs to a Yandex Cloud Logging log group.
For more information on how to view logs, see Viewing L7 load balancer logs.
A full list of logged parameters is provided in the log reference.
You can also send load balancer logs to a PostgreSQL DB.
Rules for discarding logs
Writing and storing logs in Cloud Logging is charged based on the service pricing rules. To log less data, add rules for discarding logs.
Possible rules:
Rule | Value
HTTP codes |
HTTP code classes |
gRPC codes |