FAQ about Managed Service for Kubernetes

Written by Yandex Cloud. Improved by Dmitry A. Updated on November 21, 2025.
  • General questions
  • Data storage
  • Autoscaling
  • Setup and updates
  • Resources
  • Logs
  • Troubleshooting

General questions

  • What services are available in Managed Service for Kubernetes clusters by default?

  • Which Kubernetes CLI (kubectl) version do I need to install for comprehensive cluster management?

  • Can Yandex Cloud restore the health of my cluster if I configure it incorrectly?

  • Who will be monitoring the cluster health?

  • How quickly does Yandex Cloud address vulnerabilities discovered in the security system? What should I do if an attacker has taken advantage of a vulnerability and my data is compromised?

  • Can I connect to a cluster node via OS Login, similar to a Yandex Cloud VM?

Data storage

  • What should I consider when using disk storage for a database, such as MySQL® or PostgreSQL, deployed in a Kubernetes cluster?

  • How do I connect a pod to managed Yandex Cloud databases?

  • How do I correctly attach a persistent volume to a container?

  • What types of volumes does Managed Service for Kubernetes support?

Autoscaling

  • Why does my cluster have N nodes and not scale down?

  • In an autoscaling group, the number of nodes never scales down to one, even when there is no load

  • Why does the node group fail to scale down after the pod deletion?

  • Why does autoscaling fail to trigger even though the number of nodes is below the minimum or above the maximum?

  • Why do Terminated pods remain in my cluster?

  • Is Horizontal Pod Autoscaler supported?

Setup and updates

  • What should I do if I lose some of my data during a Kubernetes version upgrade?

  • Can I configure a backup for a Kubernetes cluster?

  • Will my resources be unavailable while Kubernetes is going through a version upgrade?

  • Can I upgrade my Managed Service for Kubernetes cluster in one step?

  • Is the Container Network Interface plugin upgraded together with the Managed Service for Kubernetes cluster?

  • Can I send you a YAML configuration file so that you apply it to my cluster?

  • Can you install Web UI Dashboard, Rook, and other tools?

  • What should I do if volumes fail to attach after a Kubernetes upgrade?

Resources

  • What resources do I need to maintain a Kubernetes cluster with a group of, say, three nodes?

  • Can I change resources for each node in a Kubernetes cluster?

  • Who handles Kubernetes cluster scaling?

Logs

  • How can I monitor the Managed Service for Kubernetes cluster state?

  • Can I access the logs of my activity in Yandex Cloud services?

  • Can I save logs myself?

  • Can I use Yandex Cloud Logging for viewing logs?

Troubleshooting

  • Error creating a cluster in a different folder's cloud network

  • Namespace fails to delete and remains Terminating

  • I am using Yandex Network Load Balancer together with an ingress controller. Why are some of my cluster's nodes UNHEALTHY?

  • Why does the newly created PersistentVolumeClaim remain Pending?

  • Why does my Managed Service for Kubernetes cluster fail to start after I update its node configuration?

  • Error updating ingress controller certificate

  • Why is DNS resolution not working in my cluster?

  • Creating a node group with the CLI results in a parameter conflict. How do I fix it?

  • Error connecting to a cluster using kubectl

  • Errors connecting to a node over SSH

  • How do I provide internet access to my Managed Service for Kubernetes cluster nodes?

  • Why can't I choose Docker as the container runtime?

  • Error connecting a GitLab repository to Argo CD

  • Traffic loss when deploying app updates in a cluster with Yandex Application Load Balancer

  • System time displayed incorrectly in the Linux console, as well as in container and Managed Service for Kubernetes cluster pod logs

  • What should I do if I deleted my Yandex Network Load Balancer or its target groups that were automatically created for a LoadBalancer service?

General questions

What services are available in Managed Service for Kubernetes clusters by default?

The following services are available by default:

  • Metrics Server for aggregating data on resource usage in a Kubernetes cluster.
  • Kubernetes plugin for CoreDNS for name resolution in a cluster.
  • DaemonSet supporting CSI plugins to work with persistent volumes (PersistentVolume).

Which Kubernetes CLI (kubectl) version do I need to install for comprehensive cluster management?

We recommend using the latest official version of kubectl to avoid compatibility issues.

Can Yandex Cloud restore the health of my cluster if I configure it incorrectly?

The master is fully managed by Yandex Cloud, so you cannot damage it. If you have issues with Kubernetes cluster components, contact support.

Who will be monitoring the cluster health?

Yandex Cloud. We monitor your cluster for file system corruption, kernel deadlocks, loss of internet connectivity, and Kubernetes component issues. We are also developing a solution that automatically restores faulty components.

How quickly does Yandex Cloud address vulnerabilities discovered in the security system? What should I do if an attacker has taken advantage of a vulnerability and my data is compromised?

Yandex Cloud services, images and master configuration undergo various security and compliance checks from the start.

Users can set the update frequency depending on their workloads and cluster configuration. It is important to consider attack vectors and the vulnerabilities of applications deployed in the Kubernetes cluster. Such factors as network security policies between applications, Docker container vulnerabilities, and improper container runtime configuration in the cluster may affect application security.

Can I connect to a cluster node via OS Login, similar to a Yandex Cloud VM?

Yes, you can. To do this, follow this guide.

Data storage

What should I consider when using disk storage for a database, such as MySQL® or PostgreSQL, deployed in a Kubernetes cluster?

If you deploy a database in a Kubernetes cluster, use a StatefulSet controller. That said, we do not recommend running stateful services with persistent volumes in Kubernetes: for the databases of stateful applications, use Yandex Cloud managed databases instead, e.g., Managed Service for MySQL® or Managed Service for PostgreSQL.
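
If you do deploy a database in the cluster anyway, a minimal sketch of the StatefulSet pattern with a per-replica persistent volume looks like this (the name, image, and storage size are illustrative):

apiVersion: apps/v1
kind: StatefulSet
metadata:
  name: postgres                  # illustrative name
spec:
  serviceName: postgres
  replicas: 1
  selector:
    matchLabels:
      app: postgres
  template:
    metadata:
      labels:
        app: postgres
    spec:
      containers:
        - name: postgres
          image: postgres:16      # placeholder image
          volumeMounts:
            - name: data
              mountPath: /var/lib/postgresql/data
  volumeClaimTemplates:
    - metadata:
        name: data
      spec:
        accessModes: ["ReadWriteOnce"]
        resources:
          requests:
            storage: 10Gi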

How do I connect a pod to managed Yandex Cloud databases?

To connect to a Yandex Cloud managed database in the same network, specify its host name and FQDN.

To make a database certificate available inside a pod, use Secret or ConfigMap objects.
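
For example, assuming the cluster CA certificate has been stored in a Secret named pg-root-cert (e.g., created from the downloaded root.crt file), a pod can mount it and take the database host FQDN from an environment variable; all names below are illustrative:

apiVersion: v1
kind: Pod
metadata:
  name: db-client                       # illustrative name
spec:
  containers:
    - name: app
      image: nginx:1.25                 # placeholder image
      env:
        - name: DB_HOST
          value: <database_host_FQDN>   # FQDN of the managed database host
      volumeMounts:
        - name: db-ca
          mountPath: /etc/ssl/db        # the CA certificate becomes available under this path
          readOnly: true
  volumes:
    - name: db-ca
      secret:
        secretName: pg-root-cert        # assumed Secret holding the downloaded root.crt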

How do I correctly attach a persistent volume to a container?

You can choose how to attach Compute Cloud disks depending on your use case:

  • If you need Kubernetes to automatically provision a PersistentVolume and configure a new disk, create a pod with a dynamically provisioned volume (see the sketch after this list).
  • To use existing Compute Cloud volumes, create a pod with a statically provisioned volume.

For more information, see Working with persistent volumes.
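
For the dynamic option, a minimal sketch of a PVC plus a pod that mounts it (the names are illustrative; yc-network-hdd is assumed to be one of the storage classes available in the cluster, so check the available classes first):

apiVersion: v1
kind: PersistentVolumeClaim
metadata:
  name: app-data                       # illustrative name
spec:
  accessModes: ["ReadWriteOnce"]
  storageClassName: yc-network-hdd     # assumed storage class
  resources:
    requests:
      storage: 4Gi
---
apiVersion: v1
kind: Pod
metadata:
  name: app
spec:
  containers:
    - name: app
      image: nginx:1.25                # placeholder image
      volumeMounts:
        - name: data
          mountPath: /data
  volumes:
    - name: data
      persistentVolumeClaim:
        claimName: app-data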

What types of volumes does Managed Service for Kubernetes support?

Managed Service for Kubernetes supports temporary (Volume) and persistent (PersistentVolume) volumes. For more information, see Volume.
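
For example, a temporary volume of the emptyDir type lives only as long as the pod that uses it; a minimal sketch with illustrative names:

apiVersion: v1
kind: Pod
metadata:
  name: scratch-pod
spec:
  containers:
    - name: app
      image: nginx:1.25          # placeholder image
      volumeMounts:
        - name: scratch
          mountPath: /tmp/scratch
  volumes:
    - name: scratch
      emptyDir: {}               # ephemeral storage, deleted together with the pod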

Autoscaling

Why does my cluster have N nodes and not scale down?

Autoscaling does not stop nodes running pods that cannot be evicted. The following pods prevent scale-down:

  • Pods with a PodDisruptionBudget that restricts their eviction.
  • Pods in the kube-system namespace:
    • Those not managed by a DaemonSet controller.
    • Those without a PodDisruptionBudget or those with a PodDisruptionBudget restricting their eviction.
  • Pods not managed by a replication controller, such as ReplicaSet, Deployment, or StatefulSet.
  • Pods with local-storage.
  • Pods that cannot be scheduled anywhere due to restrictions, e.g., due to insufficient resources or lack of nodes matching the affinity or anti-affinity selectors.
  • Pods annotated with "cluster-autoscaler.kubernetes.io/safe-to-evict": "false".

Note

You can evict kube-system pods, pods with local-storage, and pods without a replication controller. To do this, set "safe-to-evict": "true":

kubectl annotate pod <pod_name> cluster-autoscaler.kubernetes.io/safe-to-evict=true
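
Note that an annotation applied to a running pod this way is lost when the pod is recreated. To keep it, you can set the annotation in the controller's pod template instead; a minimal sketch, with illustrative names and image:

apiVersion: apps/v1
kind: Deployment
metadata:
  name: worker                    # illustrative name
spec:
  replicas: 2
  selector:
    matchLabels:
      app: worker
  template:
    metadata:
      labels:
        app: worker
      annotations:
        cluster-autoscaler.kubernetes.io/safe-to-evict: "true"   # lets the autoscaler evict these pods
    spec:
      containers:
        - name: worker
          image: nginx:1.25       # placeholder image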

Other possible causes include:

  • The node group has already reached its minimum size.

  • The node has been idle for less than 10 minutes.

  • The node group was scaled up in the last 10 minutes.

  • There was a failed attempt to scale down the node group in the last three minutes.

  • There was an unsuccessful attempt to stop a certain node. In this case, the next attempt occurs in 5 minutes.

  • The node is annotated to prevent it from being stopped during downscaling: "cluster-autoscaler.kubernetes.io/scale-down-disabled": "true". You can add or remove the annotation using kubectl.

    Check the node for annotations:

    kubectl describe node <node_name> | grep scale-down-disabled
    

    Result:

    Annotations:        cluster-autoscaler.kubernetes.io/scale-down-disabled: true
    

    Set the annotation:

    kubectl annotate node <node_name> cluster-autoscaler.kubernetes.io/scale-down-disabled=true
    

    You can remove the annotation by running the kubectl command with -:

    kubectl annotate node <node_name> cluster-autoscaler.kubernetes.io/scale-down-disabled-
    

In an autoscaling group, the number of nodes never scales down to one, even when there is no load

In a Managed Service for Kubernetes cluster, the kube-dns-autoscaler app decides on the number of CoreDNS replicas. If the preventSinglePointFailure parameter in the kube-dns-autoscaler configuration is set to true and there is more than one node in the group, the minimum number of CoreDNS replicas is two. In this case, the Cluster Autoscaler cannot scale down the number of nodes in the cluster below that of CoreDNS pods.

Learn more about DNS scaling based on the cluster size here.

Solution:

  1. Disable the protection setting that limits the minimum number of CoreDNS replicas to two. To do this, set the preventSinglePointFailure parameter to false in the kube-dns-autoscaler ConfigMap (see the ConfigMap sketch after this list).

  2. Enable the kube-dns-autoscaler pod eviction by adding the safe-to-evict annotation to the Deployment:

    kubectl patch deployment kube-dns-autoscaler -n kube-system \
      --type merge \
      -p '{"spec":{"template":{"metadata":{"annotations":{"cluster-autoscaler.kubernetes.io/safe-to-evict":"true"}}}}}'
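
For step 1, the parameter lives in the kube-dns-autoscaler ConfigMap in the kube-system namespace. A minimal sketch, assuming the autoscaler uses the linear mode of cluster-proportional-autoscaler (keep your existing numeric values; the ones below are illustrative):

apiVersion: v1
kind: ConfigMap
metadata:
  name: kube-dns-autoscaler
  namespace: kube-system
data:
  linear: |-
    {"coresPerReplica":256,"nodesPerReplica":16,"preventSinglePointFailure":false,"min":1}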
    

Why does the node group fail to scale down after the pod deletion?

Scale-down is not immediate: an underutilized node is deleted only after it has been idle for 10 minutes.

Why does autoscaling fail to trigger even though the number of nodes is below the minimum or above the maximum?

Autoscaling will not violate the preset limits, but Managed Service for Kubernetes does not explicitly enforce them: upscaling happens only when there are unschedulable pods.

Why do Terminated pods remain in my cluster?

This happens because the Pod garbage collector (PodGC) fails to timely clean up these pods during autoscaling. For more information, see Deleting terminated pods.

To get answers to other questions about autoscaling, see Kubernetes FAQ.

Is Horizontal Pod Autoscaler supported?

Yes, Managed Service for Kubernetes supports Horizontal Pod Autoscaler.
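
A minimal sketch of a Horizontal Pod Autoscaler manifest that scales an assumed Deployment named app on CPU utilization (the Metrics Server available by default provides the required metrics; names and thresholds are illustrative):

apiVersion: autoscaling/v2
kind: HorizontalPodAutoscaler
metadata:
  name: app
spec:
  scaleTargetRef:
    apiVersion: apps/v1
    kind: Deployment
    name: app              # assumed Deployment name
  minReplicas: 2
  maxReplicas: 10
  metrics:
    - type: Resource
      resource:
        name: cpu
        target:
          type: Utilization
          averageUtilization: 70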

Setup and updates

What should I do if I lose some of my data during a Kubernetes version upgrade?

Your data will not get lost as Managed Service for Kubernetes creates a data backup prior to upgrading the Kubernetes version. You can manually configure cluster backup in Yandex Object Storage. We also recommend using the application's native features to back up your databases.

Can I configure a backup for a Kubernetes cluster?

Yandex Cloud provides secure storage and replication for data in Managed Service for Kubernetes clusters. However, you can back up data from your Managed Service for Kubernetes cluster node groups at any time and store it in Object Storage or other types of storage.

For more information, see Managed Service for Kubernetes cluster backups in Object Storage.

Will my resources be unavailable while Kubernetes is going through a version upgrade?

When a master is going through an upgrade, control plane resources will experience downtime. For this reason, operations like creating or deleting a Managed Service for Kubernetes node group will be unavailable. Still, the application will continue serving user requests.

If max_expansion is greater than zero, Managed Service for Kubernetes creates new nodes when upgrading node groups. The system moves all workloads to the new nodes and deletes the old nodes. The downtime in this case equals the time it takes for a pod to restart when moved to the new Managed Service for Kubernetes nodes.

Can I upgrade my Managed Service for Kubernetes cluster in one step?

It depends on the source and target versions of the Managed Service for Kubernetes cluster upgrade. In one step, you can only upgrade your Managed Service for Kubernetes cluster to the next minor version from the current one. Upgrading to newer versions is done in multiple steps, e.g., 1.19 → 1.20 → 1.21. For more information, see Updating a cluster.

If you want to skip intermediate versions, create a Managed Service for Kubernetes cluster of the version you need and migrate the workloads from the old cluster to the new one.

Is the Container Network Interface plugin upgraded together with the Managed Service for Kubernetes cluster?

Yes. If you are using the Calico or Cilium controller, it is upgraded along with your Managed Service for Kubernetes cluster. To upgrade your Managed Service for Kubernetes cluster, do one of the following:

  • Create a Managed Service for Kubernetes cluster of the version you need and migrate the workloads from the old cluster to the new one.
  • Upgrade your Managed Service for Kubernetes cluster manually.

To get timely Managed Service for Kubernetes cluster version upgrades, set up auto upgrading.

Can I send you a YAML configuration file so that you apply it to my cluster?

No. You can use a kubeconfig file to apply the YAML file with cluster configuration on your own.

Can you install Web UI Dashboard, Rook, and other tools?

No. You can install all the required tools on your own.

What should I do if volumes fail to attach after a Kubernetes upgrade?

If you get the following error after a Kubernetes upgrade:

AttachVolume.Attach failed for volume "pvc":
Attach timeout for volume yadp-k8s-volumes/pvc

Update the s3-CSI driver to the latest version.

Resources

What resources do I need to maintain a Kubernetes cluster with a group of, say, three nodes?

Each node needs resources to run the Kubernetes components that allow the node to work as part of the Kubernetes cluster. For more information, see Dynamic resource allocation.

Can I change resources for each node in a Kubernetes cluster?

You can only change resources for a node group. You can create groups with different configurations in a single Kubernetes cluster and spread them across multiple availability zones. For more information, see Updating a Managed Service for Kubernetes node group.

Who handles Kubernetes cluster scaling?

In Managed Service for Kubernetes, you can enable cluster autoscaling.

Logs

How can I monitor the Managed Service for Kubernetes cluster state?

Get cluster statistics. You can find the description of the available cluster metrics in this reference.

Can I get logs of my operations in Yandex Cloud?

Yes, you can request information about operations with your resources from Yandex Cloud logs. Do it by contacting support.

Can I save logs myself?

For log collection and storage, use Fluent Bit.

Can I use Yandex Cloud Logging for viewing logs?

Yes, you can. To do this, set up sending logs to Cloud Logging when creating or updating a Managed Service for Kubernetes cluster. This setting is only available in the CLI, Terraform, and API.

Troubleshooting

This section describes typical issues you may encounter while using Managed Service for Kubernetes and gives troubleshooting recommendations.

Error creating a cluster in a different folder's cloud network

Error message:

Permission denied

This error occurs when the resource service account has no required roles in the folder that contains the cloud network selected when creating the cluster.

To create a Managed Service for Kubernetes cluster in a cloud network of another folder, assign the resource service account the following roles in that folder:

  • vpc.privateAdmin
  • vpc.user

To use a public IP address, also assign the vpc.publicAdmin role.

Namespace fails to delete and remains Terminating

This issue occurs when your namespace contains stuck resources that the namespace controller cannot delete.

To fix it, delete the stuck resources manually.

CLI

If you do not have the Yandex Cloud CLI installed yet, install and initialize it.

By default, the CLI uses the folder specified when creating the profile. To change the default folder, use the yc config set folder-id <folder_ID> command. You can also set a different folder for any specific command using the --folder-name or --folder-id parameter.

  1. Connect to the Managed Service for Kubernetes cluster.

  2. Get the list of resources remaining in the namespace:

    kubectl api-resources --verbs=list --namespaced --output=name \
      | xargs --max-args=1 kubectl get --show-kind \
      --ignore-not-found --namespace=<namespace>
    
  3. Delete the listed resources:

    kubectl delete <resource_type> <resource_name> --namespace=<namespace>
    

If the namespace is still in the Terminating status and cannot be deleted, delete it forcibly using finalizer:

  1. Run a local proxy to the Kubernetes API:

    kubectl proxy
    
  2. Delete the namespace:

    kubectl get namespace <namespace> --output=json \
      | jq '.spec = {"finalizers":[]}' > temp.json && \
    curl --insecure --header "Content-Type: application/json" \
      --request PUT --data-binary @temp.json \
      127.0.0.1:8001/api/v1/namespaces/<namespace>/finalize
    

We do not recommend deleting the namespace with the Terminating status using finalizer right away, as this may cause the stuck resources to remain in your Managed Service for Kubernetes cluster.

I am using Yandex Network Load Balancer together with an ingress controller. Why are some of my cluster's nodes UNHEALTHY?

This is normal behavior for a load balancer with External Traffic Policy: Local enabled. Only the Managed Service for Kubernetes nodes whose pods are ready to handle user traffic get the HEALTHY status. All other nodes are labeled as UNHEALTHY.

To check the policy type of a load balancer created using a LoadBalancer service, run this command:

kubectl describe svc <LoadBalancer_service_name> \
| grep 'External Traffic Policy'

For more information, see Parameters of a LoadBalancer service.
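
For reference, the policy is part of the Service manifest itself; a minimal sketch with illustrative names and ports:

apiVersion: v1
kind: Service
metadata:
  name: app
spec:
  type: LoadBalancer
  externalTrafficPolicy: Local   # only nodes with ready pods pass the load balancer health check
  selector:
    app: app
  ports:
    - port: 80
      targetPort: 8080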

Why does the newly created PersistentVolumeClaim remain Pending?

This is normal for a PersistentVolumeClaim (PVC). The newly created PVC remains Pending until you create a pod that will use it.

To change the PVC status from Pending to Bound:

  1. View the PVC details:

    kubectl describe pvc <PVC_name> \
      --namespace=<namespace>
    

    Where --namespace is the namespace containing the PVC.

    The waiting for first consumer to be created before binding message means that the PVC is awaiting pod creation.

  2. Create a pod for this PVC.
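
For step 2, a minimal consumer pod is enough for the PVC to get bound; the pod name, image, and mount path below are illustrative:

apiVersion: v1
kind: Pod
metadata:
  name: pvc-consumer            # illustrative name
  namespace: <namespace>
spec:
  containers:
    - name: app
      image: nginx:1.25         # placeholder image
      volumeMounts:
        - name: data
          mountPath: /data
  volumes:
    - name: data
      persistentVolumeClaim:
        claimName: <PVC_name>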

Why does my Managed Service for Kubernetes cluster fail to start after I update its node configuration?

Make sure the new configuration of Managed Service for Kubernetes nodes is within the quota:

CLI

If you do not have the Yandex Cloud CLI installed yet, install and initialize it.

By default, the CLI uses the folder specified when creating the profile. To change the default folder, use the yc config set folder-id <folder_ID> command. You can also set a different folder for any specific command using the --folder-name or --folder-id parameter.

To run diagnostics for your Managed Service for Kubernetes cluster nodes:

  1. Connect to the Managed Service for Kubernetes cluster.

  2. Check the state of Managed Service for Kubernetes nodes:

    yc managed-kubernetes cluster list-nodes <cluster_ID>
    

    A message saying that the limit of Managed Service for Kubernetes cluster resources has been exceeded appears in the first column of the command output. Here is an example:

    +--------------------------------+-----------------+------------------+-------------+--------------+
    |         CLOUD INSTANCE         | KUBERNETES NODE |     RESOURCES    |     DISK    |    STATUS    |
    +--------------------------------+-----------------+------------------+-------------+--------------+
    | fhmil14sdienhr5uh89no          |                 | 2 100% core(s),  | 64.0 GB hdd | PROVISIONING |
    | CREATING_INSTANCE              |                 | 4.0 GB of memory |             |              |
    | [RESOURCE_EXHAUSTED] The limit |                 |                  |             |              |
    | on total size of network-hdd   |                 |                  |             |              |
    | disks has exceeded.,           |                 |                  |             |              |
    | [RESOURCE_EXHAUSTED] The limit |                 |                  |             |              |
    | on total size of network-hdd   |                 |                  |             |              |
    | disks has exceeded.            |                 |                  |             |              |
    +--------------------------------+-----------------+------------------+-------------+--------------+
    

To start your Managed Service for Kubernetes cluster, increase the quotas.

After changing the node subnet mask in the cluster settings, the number of pods per node is not as expected

Solution: Recreate the node group.

Error updating ingress controller certificate

Error message:

ERROR controller-runtime.manager.controller.ingressgroup Reconciler error
{"name": "some-prod", "namespace": , "error": "rpc error: code = InvalidArgument
desc = Validation error:\nlistener_specs[1].tls.sni_handlers[2].handler.certificate_ids:
Number of elements must be less than or equal to 1"}

The error occurs if different certificates are specified for the same ingress controller handler.

Solution: Edit and apply the ingress controller specifications so that each handler has only one certificate.
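
As a generic Kubernetes illustration of this constraint, each host group in spec.tls should reference exactly one certificate secret (the hosts, secret names, and backend below are illustrative; the exact TLS options supported by your ingress controller are described in its own documentation):

apiVersion: networking.k8s.io/v1
kind: Ingress
metadata:
  name: app                               # illustrative name
spec:
  tls:
    - hosts:
        - app.example.com
      secretName: app-example-com-cert    # exactly one certificate per handler
    - hosts:
        - api.example.com
      secretName: api-example-com-cert
  defaultBackend:
    service:
      name: app                           # illustrative backend service
      port:
        number: 80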

Why is DNS resolution not working in my cluster?

A Managed Service for Kubernetes cluster may fail to resolve internal and external DNS requests for several reasons. To fix the issue:

  1. Check the version of your Managed Service for Kubernetes cluster and node groups.
  2. Make sure CoreDNS is up and running.
  3. Make sure your Managed Service for Kubernetes cluster has enough CPU resources available.
  4. Set up autoscaling.
  5. Set up local DNS caching.
Check the version of your cluster and node groups
  1. Get the list of current Kubernetes versions:

    yc managed-kubernetes list-versions
    
  2. Get the Managed Service for Kubernetes cluster version:

    yc managed-kubernetes cluster get <cluster_name_or_ID> | grep version:
    

    You can get the Managed Service for Kubernetes cluster ID and name with the list of clusters in the folder.

  3. Get the Managed Service for Kubernetes node group version:

    yc managed-kubernetes node-group get <node_group_name_or_ID> | grep version:
    

    You can get the Managed Service for Kubernetes node group ID and name with the list of node groups in the cluster.

  4. If the versions of your Managed Service for Kubernetes cluster and node groups are not on the list of current Kubernetes versions, upgrade them.

Make sure CoreDNS is up and running

Get the list of CoreDNS pods and their statuses:

kubectl get pods -n kube-system -l k8s-app=kube-dns -o wide

Make sure all pods have the Running status.

Make sure your cluster has enough CPU resources available
  1. Navigate to the folder dashboard and select Managed Service for Kubernetes.
  2. Click the name of the Managed Service for Kubernetes cluster you need and select the Node manager tab.
  3. Go to the Nodes tab and click the name of any Managed Service for Kubernetes node.
  4. Navigate to the Monitoring tab.
  5. Make sure that, in the CPU, [cores] chart, the used CPU values have not reached the total available CPU values. Check this for each Managed Service for Kubernetes cluster node.
Set up autoscaling

Set up DNS autoscaling based on the Managed Service for Kubernetes cluster size.

Set up local DNS caching

Set up NodeLocal DNS Cache. For optimal settings, install NodeLocal DNS Cache from Yandex Cloud Marketplace.

Creating a node group with the CLI results in a parameter conflict. How do I fix it?

Check whether you are specifying the --location, --network-interface, and --public-ip parameters in the same command. Providing them together causes the following errors:

  • For the --location and --public-ip or --location and --network-interface pairs:

    ERROR: rpc error: code = InvalidArgument desc = Validation error:
    allocation_policy.locations[0].subnet_id: can't use "allocation_policy.locations[0].subnet_id" together with "node_template.network_interface_specs"
    
  • For the --network-interface and --public-ip pair:

    ERROR: flag --public-ip cannot be used together with --network-interface. Use '--network-interface' option 'nat' to get public address
    

Make sure you only provide one of the three parameters in a command. It is enough to specify the location of a Managed Service for Kubernetes node group either in --location or in --network-interface.

To grant internet access to Managed Service for Kubernetes cluster nodes, do one of the following:

  • Assign a public IP address to the cluster nodes, specifying --network-interface ipv4-address=nat or --network-interface ipv6-address=nat.
  • Enable access to Managed Service for Kubernetes nodes from the internet after creating a node group.

Error connecting to a cluster using kubectl

Error message:

ERROR: cluster has empty endpoint

This error occurs if you try to connect to a cluster with no public IP address and get kubectl credentials for a public IP address using this command:

yc managed-kubernetes cluster \
   get-credentials <cluster_name_or_ID> \
   --external

To connect to the cluster's private IP address from a VM in the same network, get kubectl credentials using this command:

yc managed-kubernetes cluster \
   get-credentials <cluster_name_or_ID> \
   --internal

If you need to connect to a cluster from the internet, recreate the cluster and assign it a public IP address.

Errors connecting to a node over SSH

Error messages:

Permission denied (publickey,password)
Too many authentication failures

The following situations cause errors when connecting to a Managed Service for Kubernetes node:

  • No public SSH key is added to the Managed Service for Kubernetes node group metadata.

    Solution: Update the Managed Service for Kubernetes node group keys.

  • An invalid public SSH key is added to the Managed Service for Kubernetes node group metadata.

    Solution: Change the format of the public key file to the appropriate one and update the Managed Service for Kubernetes node group keys.

  • No private SSH key is added to an authentication agent (ssh-agent).

    Solution: Add a private key by running the ssh-add <path_to_private_key_file> command.

How do I provide internet access to my Managed Service for Kubernetes cluster nodes?

If Managed Service for Kubernetes cluster nodes have no internet access, the following error occurs when trying to connect to the internet:

Failed to pull image "cr.yandex/***": rpc error: code = Unknown desc = Error response from daemon: Get https://cr.yandex/v2/: net/http: request canceled while waiting for connection (Client.Timeout exceeded while awaiting headers)

You can provide internet access to your Managed Service for Kubernetes cluster nodes in several ways:

  • Set up a NAT gateway or NAT instance. With static routing in place, traffic will go through a gateway or a separate NAT instance.
  • Assign a public IP address to your Managed Service for Kubernetes node group.

Note

If you assign public IP addresses to your cluster nodes and then configure a NAT gateway or NAT instance, internet access via the public IP addresses will be disabled. For more information, see our Yandex Virtual Private Cloud article.

Why can't I choose Docker as the container runtime?

Clusters running Kubernetes 1.24 or higher do not support the Docker container runtime. Containerd is the only available runtime.

Error connecting a GitLab repository to Argo CD

Error message:

FATA[0000] rpc error: code = Unknown desc = error testing repository connectivity: authorization failed

This error occurs if access to GitLab over HTTP(S) is disabled.

Solution: Enable HTTP(S) access. To do this:

  1. In GitLab, in the left-hand panel, select Admin → Settings → General.
  2. Under Visibility and access controls, find the Enabled Git access protocols setting.
  3. In the list, select the item which allows access over HTTP(S).

For more information, see this GitLab guide.

Traffic loss when deploying app updates in a cluster with Yandex Application Load Balancer

When your app traffic is managed by an Application Load Balancer and the load balancer's ingress controller traffic policy is set to externalTrafficPolicy: Local, the app processes requests on the same node they were delivered to by the load balancer. There is no traffic flow between nodes.

The default health check monitors the status of the node, not of the application. Therefore, Application Load Balancer traffic may go to a node where there is no application running. When you deploy a new app version in a cluster, the Application Load Balancer ingress controller requests the load balancer to update the backend group configuration. It takes at least 30 seconds to process the request, and the app may not receive any user traffic during that time.

To prevent this, we recommend setting up backend health checks on your Application Load Balancer. With health checks, the load balancer timely spots unavailable backends and reroutes traffic to healthy backends. Once the application is updated, traffic will again be distributed across all backends.

For more information, see Tips for configuring Yandex Application Load Balancer health checks and Annotations (metadata.annotations).
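
On the application side, a standard Kubernetes readiness probe complements the load balancer health checks by keeping a pod out of the service endpoints until it can actually serve requests. A minimal sketch, assuming the app exposes a /healthz endpoint on port 8080 (all names and the image are illustrative):

apiVersion: apps/v1
kind: Deployment
metadata:
  name: app
spec:
  replicas: 3
  selector:
    matchLabels:
      app: app
  template:
    metadata:
      labels:
        app: app
    spec:
      containers:
        - name: app
          image: cr.yandex/<registry_ID>/app:latest   # placeholder image
          ports:
            - containerPort: 8080
          readinessProbe:
            httpGet:
              path: /healthz        # assumed health endpoint
              port: 8080
            initialDelaySeconds: 5
            periodSeconds: 5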

System time displayed incorrectly on nodes, as well as in container and Managed Service for Kubernetes cluster pod logs

Managed Service for Kubernetes cluster time may not match the time of other resources, such as VMs, if they use different time synchronization sources. For example, a Managed Service for Kubernetes cluster synchronizes with a time server (by default), whereas a VM synchronizes with a private or public NTP server.

Solution: Set up Managed Service for Kubernetes cluster time synchronization with your private NTP server. To do this:

  1. Specify the NTP server addresses in the DHCP settings of the master subnets.

    Management console
    CLI
    Terraform
    API
    1. Navigate to the folder dashboard and select Managed Service for Kubernetes.
    2. Click the name of the Kubernetes cluster.
    3. Under Master configuration, click the subnet name.
    4. Click Edit in the top-right corner.
    5. In the window that opens, expand the DHCP settings section.
    6. Click Add and specify the IP address of your NTP server.
    7. Click Save changes.

    If you do not have the Yandex Cloud CLI installed yet, install and initialize it.

    By default, the CLI uses the folder specified when creating the profile. To change the default folder, use the yc config set folder-id <folder_ID> command. You can also set a different folder for any specific command using the --folder-name or --folder-id parameter.

    1. See the description of the CLI command for updating subnet settings:

      yc vpc subnet update --help
      
    2. Run the subnet command, specifying the NTP server IP address in the --ntp-server parameter:

      yc vpc subnet update <subnet_ID> --ntp-server <server_address>
      

    Tip

    To find out the IDs of the subnets containing the cluster, get detailed information about the cluster.

    1. In the Terraform configuration file, change the cluster subnet description. Add the dhcp_options section (if missing) with the ntp_servers parameter specifying the IP address of your NTP server:

      ...
      resource "yandex_vpc_subnet" "lab-subnet-a" {
        ...
        v4_cidr_blocks = ["<IPv4_address>"]
        network_id     = "<network_ID>"
        ...
        dhcp_options {
          ntp_servers = ["<IPv4_address>"]
          ...
        }
      }
      ...
      

      For more information about the yandex_vpc_subnet settings, see this Terraform provider article.

    2. Apply the changes:

      1. In the terminal, go to the directory where you edited the configuration file.

      2. Make sure the configuration file is correct using this command:

        terraform validate
        

        If the configuration is correct, you will get this message:

        Success! The configuration is valid.
        
      3. Run this command:

        terraform plan
        

        You will see a detailed list of resources. No changes will be made at this step. If the configuration contains any errors, Terraform will show them.

      4. Apply the changes:

        terraform apply
        
      5. Type yes and press Enter to confirm the changes.

      Terraform will update all required resources. You can check the subnet update using the management console or this CLI command:

      yc vpc subnet get <subnet_name>
      

    Use the update method for the Subnet resource and provide the following in the request:

    • NTP server IP address in the dhcpOptions.ntpServers parameter.
    • dhcpOptions.ntpServers parameter to update in the updateMask parameter.

    Tip

    To find out the IDs of the subnets containing the cluster, get detailed information about the cluster.

    Warning

    For a highly available master hosted across three availability zones, you need to update each of the three subnets.

  2. Enable connections from the cluster to NTP servers.

    Create a rule in the security group of the cluster and node groups:

    • Port range: 123. If your NTP server uses a port other than 123, specify that port.
    • Protocol: UDP.
    • Destination name: CIDR.
    • CIDR blocks: <NTP_server_IP_address>/32. For a master hosted across three availability zones, specify three blocks: <NTP_server_IP_address_in_subnet1>/32, <NTP_server_IP_address_in_subnet2>/32, and <NTP_server_IP_address_in_subnet3>/32.
  3. Update the network settings in the cluster node group using one of the following methods:

    • Connect to each node in the group over SSH or via OS Login and run the sudo dhclient -v -r && sudo dhclient command.
    • Reboot the group nodes at any convenient time.

    Warning

    Updating network settings may cause the services within the cluster to become unavailable for a few minutes.

What should I do if I deleted my Yandex Network Load Balancer or its target groups that were automatically created for a LoadBalancer service?

You cannot manually restore a Network Load Balancer or its target groups. Recreate your LoadBalancer service. This will automatically create a load balancer and target groups.
