© 2025 Direct Cursus Technology L.L.C.


Vertical scaling of an application in a cluster

Written by
Yandex Cloud
Updated at November 21, 2025
  • Required paid resources
  • Getting started
  • Create Vertical Pod Autoscaler and a test application
  • Test Vertical Pod Autoscaler
  • Delete the resources you created

Managed Service for Kubernetes supports several types of autoscaling. In this tutorial, you will learn how to set up automatic pod resource management with Vertical Pod Autoscaler:

  • Create Vertical Pod Autoscaler and a test application.
  • Test Vertical Pod Autoscaler.

If you no longer need the resources you created, delete them.

Required paid resources

The support cost for this solution includes:

  • Fee for using the master and outgoing traffic in a Managed Service for Kubernetes cluster (see Managed Service for Kubernetes pricing).
  • Fee for using computing resources, OS, and storage in cluster nodes (VMs) (see Compute Cloud pricing).
  • Fee for a public IP address for cluster nodes (see Virtual Private Cloud pricing).

Getting started

  1. If you do not have the Yandex Cloud CLI installed yet, install and initialize it.

    By default, the CLI uses the folder specified when creating the profile. To change the default folder, use the yc config set folder-id <folder_ID> command. You can also set a different folder for any specific command using the --folder-name or --folder-id parameter.

  2. Create security groups for the Managed Service for Kubernetes cluster and its node groups.

    Warning

    The configuration of security groups determines the performance and availability of the cluster and the services and applications running in it.

  3. Create a Managed Service for Kubernetes cluster. Use these settings:

    • Use the security groups you created earlier.
    • If you will use the cluster only within the Yandex Cloud internal network, it does not need a public IP address. To enable internet access to your cluster, assign it a public IP address.
  4. Create a node group. Use these settings:

    • Use the security groups you created earlier.
    • To enable internet access for your node group (e.g., for Docker image pulls), assign it a public IP address.
  5. Install kubectl and configure it to work with the new cluster.

    If a cluster has no public IP address assigned and kubectl is configured via the cluster's private IP address, run kubectl commands on a Yandex Cloud VM that is in the same network as the cluster.

  6. Install Vertical Pod Autoscaler from the kubernetes/autoscaler repository:

    cd /tmp && \
      git clone https://github.com/kubernetes/autoscaler.git && \
      cd autoscaler/vertical-pod-autoscaler/hack && \
      ./vpa-up.sh
    

Create Vertical Pod Autoscaler and a test application

  1. Create a file named app.yaml with the nginx test application and load balancer settings:

    app.yaml
    ---
    ### Deployment
    apiVersion: apps/v1
    kind: Deployment
    metadata:
      name: nginx
      labels:
        app: nginx
    spec:
      replicas: 1
      selector:
        matchLabels:
          app: nginx
      template:
        metadata:
          name: nginx
          labels:
            app: nginx
        spec:
          containers:
            - name: nginx
              image: registry.k8s.io/hpa-example
              resources:
                requests:
                  memory: "256Mi"
                  cpu: "500m"
                limits:
                  memory: "500Mi"
                  cpu: "1"
    ---
    ### Service
    apiVersion: v1
    kind: Service
    metadata:
      name: nginx
    spec:
      selector:
        app: nginx
      ports:
        - protocol: TCP
          port: 80
          targetPort: 80
      type: LoadBalancer
    
  2. Create a file named vpa.yaml with Vertical Pod Autoscaler configuration:

    vpa.yaml
    ---
    apiVersion: autoscaling.k8s.io/v1
    kind: VerticalPodAutoscaler
    metadata:
      name: nginx
    spec:
      targetRef:
        apiVersion: "apps/v1"
        kind:       Deployment
        name:       nginx
      updatePolicy:
        updateMode:  "Auto"
        minReplicas: 1
    
  3. Create the objects:

    kubectl apply -f app.yaml && \
    kubectl apply -f vpa.yaml
    
  4. Make sure the Vertical Pod Autoscaler and nginx pods are in the Running state:

    kubectl get pods -n kube-system | grep vpa && \
    kubectl get pods | grep nginx
    

    Result:

    vpa-admission-controller-58********-qmxtv  1/1  Running  0  44h
    vpa-recommender-67********-jqvgt           1/1  Running  0  44h
    vpa-updater-64********-xqsts               1/1  Running  0  44h
    nginx-6c********-62j7w                     1/1  Running  0  42h
    
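With updateMode set to Auto, Vertical Pod Autoscaler may evict pods to apply new resource requests. If you want to keep its recommendations within known bounds, the VerticalPodAutoscaler API also accepts an optional resourcePolicy section. The sketch below extends the vpa.yaml from this tutorial; the minAllowed and maxAllowed values are illustrative assumptions, not values from this tutorial:

```yaml
---
apiVersion: autoscaling.k8s.io/v1
kind: VerticalPodAutoscaler
metadata:
  name: nginx
spec:
  targetRef:
    apiVersion: "apps/v1"
    kind:       Deployment
    name:       nginx
  updatePolicy:
    updateMode:  "Auto"
    minReplicas: 1
  # Optional: constrain the recommendations per container.
  # These bounds are example values, not tested in this tutorial.
  resourcePolicy:
    containerPolicies:
      - containerName: nginx
        minAllowed:
          cpu:    100m
          memory: 128Mi
        maxAllowed:
          cpu:    "2"
          memory: 1Gi
```

Apply it with kubectl apply -f vpa.yaml, as in step 3 above.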

Test Vertical Pod Autoscaler

To test Vertical Pod Autoscaler, you will simulate a workload on the nginx application.

  1. Review the recommendations provided by Vertical Pod Autoscaler prior to simulating the workload:

    kubectl describe vpa nginx
    

    Note the low Cpu values in the Status.Recommendation.Container Recommendations metrics:

    Name:         nginx
    Namespace:    default
    Labels:       <none>
    Annotations:  <none>
    API Version:  autoscaling.k8s.io/v1
    Kind:         VerticalPodAutoscaler
    ...
    Status:
      Conditions:
        Last Transition Time:  2022-03-18T08:02:04Z
        Status:                True
        Type:                  RecommendationProvided
      Recommendation:
        Container Recommendations:
          Container Name:  nginx
          Lower Bound:
            Cpu:     25m
            Memory:  262144k
          Target:
            Cpu:     25m
            Memory:  262144k
          Uncapped Target:
            Cpu:     25m
            Memory:  262144k
          Upper Bound:
            Cpu:     25m
            Memory:  262144k
    
  2. Make sure Vertical Pod Autoscaler is managing the nginx pod resources:

    kubectl get pod <nginx_pod_name> --output yaml
    

    Result:

    apiVersion: v1
    kind: Pod
    metadata:
      annotations:
        vpaObservedContainers: nginx
        vpaUpdates: 'Pod resources updated by nginx: container 0: cpu request, memory
          request, cpu limit, memory limit'
    ...
    spec:
      containers:
      ...
        name: nginx
        resources:
          limits:
            cpu: 50m
            memory: 500000Ki
          requests:
            cpu: 25m
            memory: 262144k
    
  3. In a separate terminal window, run the following command to simulate a workload:

    URL=$(kubectl get service nginx -o json \
      | jq -r '.status.loadBalancer.ingress[0].ip') && \
      while true; do wget -q -O- http://$URL; done
    

    Tip

    To increase the load and speed up the scenario, run multiple simulations in separate windows.

    Note

    If the resource is unavailable at the specified URL, make sure that the security groups for the Managed Service for Kubernetes cluster and its node groups are configured correctly. If any rule is missing, add it.

  4. Wait a few minutes and review the recommendation provided by Vertical Pod Autoscaler after simulating the workload:

    kubectl describe vpa nginx
    

    Vertical Pod Autoscaler allocated additional resources to the pods as the workload increased. Note the increased Cpu values in the Status.Recommendation.Container Recommendations metrics:

    Name:         nginx
    Namespace:    default
    Labels:       <none>
    Annotations:  <none>
    API Version:  autoscaling.k8s.io/v1
    Kind:         VerticalPodAutoscaler
    ...
    Status:
  Conditions:
        Last Transition Time:  2022-03-18T08:02:04Z
        Status:                True
        Type:                  RecommendationProvided
      Recommendation:
        Container Recommendations:
          Container Name:  nginx
          Lower Bound:
            Cpu:     25m
            Memory:  262144k
          Target:
            Cpu:     410m
            Memory:  262144k
          Uncapped Target:
            Cpu:     410m
            Memory:  262144k
          Upper Bound:
            Cpu:     28897m
            Memory:  1431232100
    
  5. Stop simulating the workload. Within a few minutes, the Status.Recommendation.Container Recommendations metrics will return to their initial values.
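The recommendation and pod outputs above use standard Kubernetes resource quantities: the m suffix denotes millicores (thousandths of a CPU core), and the k suffix denotes kilobytes (1000 bytes), unlike Ki and Mi, which are powers of 1024. A quick local conversion of the values seen above:

```shell
# Convert the VPA quantities from the output above into familiar units.
cpu_millicores=25        # "25m" = 25/1000 of a core
memory_kilobytes=262144  # "262144k" = 262144 * 1000 bytes

echo "CPU: $(awk "BEGIN { print $cpu_millicores / 1000 }") cores"       # → CPU: 0.025 cores
echo "Memory: $(( memory_kilobytes * 1000 / 1024 / 1024 )) MiB"         # → Memory: 250 MiB
```

So the initial recommendation of 25m CPU and 262144k memory corresponds to 0.025 cores and 250 MiB per pod.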

Delete the resources you created

Delete the resources you no longer need to avoid paying for them:

  1. Delete the Kubernetes cluster.
  2. If you used static public IP addresses to access your cluster or nodes, release and delete them.
