Horizontal application scaling in a cluster
Managed Service for Kubernetes supports several types of autoscaling. In this article, you will learn how to configure cluster autoscaling using a combination of Cluster Autoscaler and Horizontal Pod Autoscaler:

- Scaling based on CPU utilization.
- Scaling based on the number of application requests.

If you no longer need the resources you created, delete them.
Warning
While the tutorial scenarios are running, the total number of nodes in the groups may increase to 6. Make sure you have sufficient folder resources to follow the steps in this tutorial.
Getting started
- If you do not have the Yandex Cloud command line interface yet, install and initialize it.

  The folder specified in the CLI profile is used by default. You can specify a different folder using the --folder-name or --folder-id parameter. (A CLI sketch of these preparatory steps is provided after this list.)
- Create service accounts for the master and node groups and assign roles to them:

  - sa-k8s-master: service account for cluster management. Roles:
    - k8s.clusters.agent: To manage a Kubernetes cluster.
    - load-balancer.admin: To manage a network load balancer.
  - sa-k8s-nodes: service account for node group management. Roles:
    - container-registry.images.puller: For pulling images from Yandex Container Registry.
- Create a network named k8s-network to host your cluster. When creating the network, select the Create subnets option.
- Create security groups for the Managed Service for Kubernetes cluster and its node groups.

  Warning

  The configuration of security groups determines the performance and availability of the cluster and the services and applications running in it.
- Create a symmetric encryption key with the following settings:

  - Name: k8s-symetric-key.
  - Encryption algorithm: AES-128.
  - Rotation period, days: 365.
- Create a Managed Service for Kubernetes cluster with the following settings:

  - Service account for resources: sa-k8s-master.
  - Service account for nodes: sa-k8s-nodes.
  - Encryption key: k8s-symetric-key.
  - Release channel: RAPID.
  - Public address: Auto.
  - Type of master: Regional.
  - Cloud network: k8s-network.
  - Security groups: Select the previously created security groups containing the rules for service traffic and Kubernetes API access.
  - Enable tunnel mode: Enabled.
- Create two groups of nodes, in the ru-central1-a and ru-central1-b availability zones, with the following settings:

  - Under Scaling:
    - Type: Automatic.
    - Minimum number of nodes: 1.
    - Maximum number of nodes: 3.
    - Initial number of nodes: 1.
  - Under Network settings:
    - Public address: Auto.
    - Security groups: Select the previously created security groups containing the rules for service traffic, connection to the services from the internet, and connection to nodes over SSH.
    - Location: ru-central1-a or ru-central1-b.
- Install kubectl and configure it to work with the created cluster.
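The preparatory steps above use the management console. If you prefer the CLI, below is a minimal sketch of roughly equivalent yc commands; the folder ID is a placeholder, the flag names reflect the current yc CLI and may need adjusting, and subnets and security groups are not covered by this sketch.

# Folder to create the resources in (placeholder ID, replace with your own)
FOLDER_ID="<your-folder-id>"

# Service accounts for the cluster and the node groups
yc iam service-account create --name sa-k8s-master --folder-id $FOLDER_ID
yc iam service-account create --name sa-k8s-nodes --folder-id $FOLDER_ID

# Assign the roles listed above on the folder
MASTER_SA_ID=$(yc iam service-account get sa-k8s-master --folder-id $FOLDER_ID --format json | jq -r .id)
NODES_SA_ID=$(yc iam service-account get sa-k8s-nodes --folder-id $FOLDER_ID --format json | jq -r .id)
yc resource-manager folder add-access-binding $FOLDER_ID \
  --role k8s.clusters.agent --subject serviceAccount:$MASTER_SA_ID
yc resource-manager folder add-access-binding $FOLDER_ID \
  --role load-balancer.admin --subject serviceAccount:$MASTER_SA_ID
yc resource-manager folder add-access-binding $FOLDER_ID \
  --role container-registry.images.puller --subject serviceAccount:$NODES_SA_ID

# Cloud network and the symmetric encryption key
yc vpc network create --name k8s-network --folder-id $FOLDER_ID
yc kms symmetric-key create --name k8s-symetric-key \
  --default-algorithm aes-128 --rotation-period 8760h --folder-id $FOLDER_ID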
Scaling based on CPU utilization
In this section, you will learn to configure cluster autoscaling based on CPU load.
- Create a file named k8s-autoscale-CPU.yaml containing the settings for a test application, a load balancer, and Horizontal Pod Autoscaler:

---
### Deployment
apiVersion: apps/v1
kind: Deployment
metadata:
  name: nginx
  labels:
    app: nginx
spec:
  replicas: 1
  selector:
    matchLabels:
      app: nginx
  template:
    metadata:
      name: nginx
      labels:
        app: nginx
    spec:
      containers:
        - name: nginx
          image: registry.k8s.io/hpa-example
          resources:
            requests:
              memory: "256Mi"
              cpu: "500m"
            limits:
              memory: "500Mi"
              cpu: "1"
---
### Service
apiVersion: v1
kind: Service
metadata:
  name: nginx
spec:
  selector:
    app: nginx
  ports:
    - protocol: TCP
      port: 80
      targetPort: 80
  type: LoadBalancer
---
### HPA
apiVersion: autoscaling/v1
kind: HorizontalPodAutoscaler
metadata:
  name: nginx
spec:
  scaleTargetRef:
    apiVersion: apps/v1
    kind: Deployment
    name: nginx
  minReplicas: 1
  maxReplicas: 10
  targetCPUUtilizationPercentage: 20
- Create objects:
kubectl apply -f k8s-autoscale-CPU.yaml
- In a separate window, start tracking the load on the Kubernetes components:
watch kubectl get pod,svc,hpa,nodes -o wide
- Run a process to simulate the workload:

URL=$(kubectl get service nginx -o json \
  | jq -r '.status.loadBalancer.ingress[0].ip') && \
while true; do wget -q -O- http://$URL; done

Tip

To increase the load and speed up the scenario, run several such processes in separate windows (a sketch for this, together with commands to check the scaling state, is provided after this list).
Note

Over the next few minutes, Horizontal Pod Autoscaler will increase the number of pods on the nodes as a result of the growing CPU usage. As soon as the existing cluster resources become insufficient to satisfy the requests values, Cluster Autoscaler will increase the number of nodes in the groups.

- Stop simulating the workload. Over the next few minutes, the number of nodes and pods will drop back to the initial state.
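While the load is running, you can check the scaling behavior from another terminal. The following sketch is illustrative: the nginx names come from k8s-autoscale-CPU.yaml above, and kubectl top assumes the cluster's metrics pipeline (metrics-server) is available.

# Optional: launch several load generators at once to speed up scaling
URL=$(kubectl get service nginx -o json | jq -r '.status.loadBalancer.ingress[0].ip')
for i in 1 2 3; do
  ( while true; do wget -q -O- http://$URL >/dev/null; done ) &
done

# Spot-check the autoscaler, pods, and nodes
kubectl get hpa nginx        # current CPU utilization vs. the 20% target
kubectl describe hpa nginx   # recent scaling events and conditions
kubectl top pod              # per-pod CPU usage
kubectl get nodes            # node count should grow up to 3 per group

# Stop the background load generators started above
kill %1 %2 %3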
Scaling based on the number of application requests
In this section, you will learn to configure cluster autoscaling based on the number of application requests (Requests Per Second, RPS).
Runtime algorithm
- An Ingress controller transmits information on the number of application requests to the Prometheus monitoring system.
- Prometheus generates and publishes the nginx_ingress_controller_requests_per_second metric for the number of application requests per second.

  To create this metric, the following rule has been added to the Prometheus configuration file values-prom.yaml:

rules:
  groups:
    - name: Ingress
      rules:
        - record: nginx_ingress_controller_requests_per_second
          expr: rate(nginx_ingress_controller_requests[2m])
- Based on this metric, the autoscaling tools update the number of pods and nodes.
Installing objects
- Clone the GitHub repository containing the up-to-date configuration files:

git clone https://github.com/yandex-cloud-examples/yc-mk8s-autoscaling-solution.git && \
cd yc-mk8s-autoscaling-solution
- Add the Helm repositories with the Ingress controller and the Prometheus monitoring system:

helm repo add ingress-nginx https://kubernetes.github.io/ingress-nginx && \
helm repo add prometheus-community https://prometheus-community.github.io/helm-charts && \
helm repo update
- Install the Ingress controller:

helm upgrade \
  --install rps ingress-nginx/ingress-nginx \
  --values values-ingr.yaml
- Install Prometheus:

helm upgrade \
  --install prometheus prometheus-community/prometheus \
  --values values-prom.yaml
- Install a Prometheus adapter that will deliver the Prometheus metrics to the autoscaling tools:

helm upgrade \
  --install prometheus-adapter prometheus-community/prometheus-adapter \
  --values values-prom-ad.yaml
- Create a test application, an Ingress rule, and Horizontal Pod Autoscaler:

kubectl apply -f k8s_autoscale-RPS.yaml

  Once the objects are created, Prometheus will add a new metric called nginx_ingress_controller_requests_per_second. Prometheus will not start computing this metric until traffic passes through the Ingress controller.
- Make a few test requests against the Ingress controller:

URL=$(kubectl get service rps-ingress-nginx-controller -o json \
  | jq -r '.status.loadBalancer.ingress[0].ip') && \
curl --header "Host: nginx.example.com" http://$URL
- Make sure that the nginx_ingress_controller_requests_per_second metric is available:

kubectl get --raw /apis/custom.metrics.k8s.io/v1beta1 | jq . | \
  grep ingresses.networking.k8s.io/nginx_ingress_controller_requests_per_second

  Result:

"name": "ingresses.networking.k8s.io/nginx_ingress_controller_requests_per_second",
Testing autoscaling
- In a separate window, start tracking the load on the Kubernetes components:
watch kubectl get pod,svc,hpa,nodes -o wide
- Run a process to simulate the workload:

URL=$(kubectl get service rps-ingress-nginx-controller -o json \
  | jq -r '.status.loadBalancer.ingress[0].ip') && \
while true; do curl --header "Host: nginx.example.com" http://$URL; done
Note

Over the next several minutes, Horizontal Pod Autoscaler will increase the number of pods on the nodes as a result of the growing number of application requests. As soon as the existing cluster resources become insufficient to satisfy the requests values, Cluster Autoscaler will increase the number of nodes in the groups.

- Stop simulating the workload. Over the next few minutes, the number of nodes and pods will drop back to the initial state.
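While the load loop is running, you can confirm that the autoscaler is reading the RPS metric. The commands below are generic kubectl checks and do not assume any object names; the actual HPA and Ingress names are defined in k8s_autoscale-RPS.yaml.

# List HPA objects with the current and target values of their metrics
kubectl get hpa

# Show which metric each HPA reads and its recent scaling events
kubectl describe hpa

# Watch nodes being added as the node groups scale out
kubectl get nodes -w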
Delete the resources you created
Delete the resources you no longer need to avoid paying for them:
- Delete the Kubernetes cluster.
- If static public IP addresses were used for cluster and node access, release and delete them.
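If you prefer the CLI, a minimal cleanup sketch is shown below; the cluster name and address ID are placeholders to replace with your own values.

# Delete the Managed Service for Kubernetes cluster (placeholder name)
yc managed-kubernetes cluster delete <cluster-name>

# List reserved static public IP addresses and delete the ones you no longer need
yc vpc address list
yc vpc address delete <address-id>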