Tips for using Managed Service for Kubernetes
Use these recommendations for your PRODUCTION applications that require:
- High availability and fault tolerance.
- Load scaling.
- Resource isolation.
Tip
Test the proposed strategies in a test environment before migrating them to PRODUCTION.
High availability and fault tolerance
- Use the REGULAR or STABLE release channel.
  Tip
  Use the RAPID release channel for test environments to test Kubernetes and Managed Service for Kubernetes updates sooner.
- Control cluster and node group updates. Either disable automatic updates and perform them manually, or schedule the update time outside active usage hours so that your applications stay available when they are needed most.
- Configure podDisruptionBudget policies to minimize service downtime during updates.
- Select the highly available master type located in three zones. Kubernetes services will remain available in the event of an availability zone level failure. The Managed Service for Kubernetes Service Level Agreement applies to the configuration with a highly available master located in three zones.
- Allocate sufficient compute resources (CPU, RAM) to the master and nodes.
- Minimize or eliminate oversubscription of resources on the nodes, especially of RAM.
- Configure correct health checks for load balancers.
- To make your cluster more robust, create node groups with automatic scaling in multiple availability zones.
Tip
Managed Service for Kubernetes uses Yandex Compute Cloud VM groups as cluster node groups. See the description of how instance groups behave during a zonal incident and our mitigation guidelines.
- Deploy your Deployment and StatefulSet type services in multiple instances in different availability zones. Use Pod Topology Spread Constraints and pod anti-affinity (AntiAffinity) strategies to ensure high service availability and efficient usage of Kubernetes cluster resources.
  Use the label combinations below for all strategies:
  - topology.kubernetes.io/zone to keep the services available in the event of an availability zone failure.
  - kubernetes.io/hostname to keep the services available in the event of a cluster node failure.
Warning
Autoscaling resources during an availability zone failure takes time. Always use the labels listed above to distribute pods across different nodes and availability zones so that your applications keep working properly.
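The manifest below is a minimal sketch of how the podDisruptionBudget, topology spread, and anti-affinity recommendations in this section could be combined. The application name web, the image, the replica count, and the minAvailable value are hypothetical placeholders, not values prescribed by Managed Service for Kubernetes; adjust them to your workload.

```yaml
# Sketch only: "web", the image, and all numeric values are assumptions.
apiVersion: policy/v1
kind: PodDisruptionBudget
metadata:
  name: web-pdb
spec:
  minAvailable: 2              # keep at least 2 pods up during voluntary disruptions such as node updates
  selector:
    matchLabels:
      app: web
---
apiVersion: apps/v1
kind: Deployment
metadata:
  name: web
spec:
  replicas: 3
  selector:
    matchLabels:
      app: web
  template:
    metadata:
      labels:
        app: web
    spec:
      # Spread pods evenly across availability zones.
      topologySpreadConstraints:
        - maxSkew: 1
          topologyKey: topology.kubernetes.io/zone
          whenUnsatisfiable: ScheduleAnyway   # soft rule; DoNotSchedule would make it a hard requirement
          labelSelector:
            matchLabels:
              app: web
      # Prefer placing pods of the same app on different nodes.
      affinity:
        podAntiAffinity:
          preferredDuringSchedulingIgnoredDuringExecution:
            - weight: 100
              podAffinityTerm:
                topologyKey: kubernetes.io/hostname
                labelSelector:
                  matchLabels:
                    app: web
      containers:
        - name: web
          image: nginx:1.27    # placeholder image
```

The soft variants (ScheduleAnyway and preferredDuringScheduling...) are chosen here so that pods can still be scheduled when a zone or node is unavailable. Stricter variants give stronger placement guarantees but may leave pods pending during an incident.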
Load scaling
Use these recommendations if the load on your Managed Service for Kubernetes cluster is constantly growing:
- To reduce the load on the Kubernetes DNS, use NodeLocal DNS. If the cluster contains over 50 nodes, use automatic DNS scaling.
- To reduce horizontal traffic within the cluster, use a network load balancer and the externalTrafficPolicy: Local rule where possible.
- Consider node storage requirements in advance:
  - Review disk limits for Yandex Compute Cloud.
  - Load test your disk subsystem in a test environment.
  - To reduce latency at high IOPS, use non-replicated disks.
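As an illustration of the externalTrafficPolicy: Local recommendation above, here is a minimal sketch of a LoadBalancer Service. The Service name, selector, and port numbers are hypothetical. Any provider-specific annotations your load balancer may need are deliberately omitted.

```yaml
# Sketch only: the name, selector, and ports are assumptions.
apiVersion: v1
kind: Service
metadata:
  name: web
spec:
  type: LoadBalancer
  externalTrafficPolicy: Local   # deliver traffic only to pods on the node that received it
  selector:
    app: web
  ports:
    - port: 80                   # port exposed by the load balancer
      targetPort: 8080           # port the application container listens on
```

With the Local policy, a node forwards traffic only to pods running on that same node, which avoids an extra hop between nodes and preserves the client source IP. Nodes without a matching pod do not receive traffic, so this works best when pods are spread across nodes.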
Network load balancer
The network load balancer distributes incoming traffic across target resources (VMs). A listener with a public IP address enables the balancer to process internet traffic, while a listener with a private IP address handles internal traffic. The load balancer uses health checks to test target resource availability.
Yandex Cloud implements the NLB Zone Shift mechanism, where you can mark the load balancer with a special flag. If a partial failure in an availability zone goes undetected by health checks, Yandex Cloud support will disable the compromised zone for this balancer.
To test your application during an availability zone failure, see this scenario.
Learn more about network load balancers.
Application load balancer
The application load balancer is based on the network load balancer, but it can route traffic to any private IP address, e.g., the IP address of a resource outside the cloud network. Traffic is routed through intermediate VMs acting as reverse proxies.
In an application load balancer, you can manually disable an availability zone with partial failure.
Learn more about application load balancers.
Isolating resources
Follow these recommendations for applications that use shared Kubernetes cluster resources.
Adjust limits and requests for all the cluster services:
---
...
containers:
  ...
  resources:
    limits:
      cpu: 250m
      memory: 128Mi
    requests:
      cpu: 100m
      memory: 64Mi
...
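Specify vCPU in thousandths of a core (for example, 250m) and RAM in megabytes (the Mi suffix in the example denotes mebibytes). The service will not exceed the vCPU or RAM limits specified in limits. Setting requests allows you to autoscale cluster nodes, because scheduling and node autoscaling decisions are based on the resources that pods request.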
To manage pod resources automatically, configure Kubernetes policies:
- Quality of Service for Pods to create pods of different availability classes.
- Limit Ranges to set limits at the namespace level.
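A namespace-level policy of this kind could look like the sketch below. The object name, the namespace, and the max values are hypothetical. The default and defaultRequest values simply reuse the numbers from the container example earlier in this section.

```yaml
# Sketch only: the name, namespace, and max values are assumptions.
apiVersion: v1
kind: LimitRange
metadata:
  name: default-limits
  namespace: my-namespace
spec:
  limits:
    - type: Container
      default:                 # limits applied to containers that do not set their own
        cpu: 250m
        memory: 128Mi
      defaultRequest:          # requests applied to containers that do not set their own
        cpu: 100m
        memory: 64Mi
      max:                     # upper bound any container in the namespace may ask for
        cpu: "1"
        memory: 512Mi
```

Containers that inherit these defaults have requests lower than their limits, so their pods fall into the Burstable Quality of Service class. A pod gets the Guaranteed class only when every container sets requests equal to limits for both CPU and memory.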
Monitoring and escalation
Monitoring and alerts are key tools for ensuring fault tolerance.
- Set up metric monitoring and create alerts to track the status of the master, nodes, pods, and persistent volumes.
- Configure escalation policies for alerts.
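The source does not name a monitoring stack. As one possible illustration, the sketch below assumes Prometheus with kube-state-metrics and kubelet metrics available in the cluster. The group name, alert names, thresholds, and severity labels are all hypothetical and should be tuned to your own workloads.

```yaml
# Sketch only: assumes Prometheus, kube-state-metrics, and kubelet volume metrics.
# Alert names, thresholds, and severities are illustrative assumptions.
groups:
  - name: kubernetes-availability
    rules:
      - alert: NodeNotReady
        expr: kube_node_status_condition{condition="Ready",status="true"} == 0
        for: 5m
        labels:
          severity: critical
        annotations:
          summary: "Node {{ $labels.node }} has not been Ready for 5 minutes"
      - alert: PodRestartingFrequently
        expr: increase(kube_pod_container_status_restarts_total[15m]) > 3
        for: 5m
        labels:
          severity: warning
        annotations:
          summary: "Container {{ $labels.container }} in pod {{ $labels.pod }} keeps restarting"
      - alert: PersistentVolumeAlmostFull
        expr: kubelet_volume_stats_available_bytes / kubelet_volume_stats_capacity_bytes < 0.1
        for: 10m
        labels:
          severity: warning
        annotations:
          summary: "Volume {{ $labels.persistentvolumeclaim }} has less than 10% free space"
```

The severity label is one common way to drive escalation: an alert router can send warning alerts to a chat channel and page the on-call engineer for critical ones.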