© 2025 Direct Cursus Technology L.L.C.

Monitoring a Yandex Managed Service for Kubernetes cluster with Prometheus and Grafana

Written by
Yandex Cloud
Improved by
Dmitry A.
Updated at August 11, 2025

Managed Service for Kubernetes enables you to upload cluster object metrics to monitoring systems.

In this article, you will learn how to set up the Prometheus metrics collection system and the Grafana visualization system in a Managed Service for Kubernetes cluster. You will also install the Trickster caching proxy to speed up metric transfer.

To set up the Managed Service for Kubernetes cluster monitoring system:

  • Install Prometheus.
  • Install the Trickster caching proxy.
  • Install Grafana.
  • Set up and check Grafana.

If you no longer need the resources you created, delete them.

Required paid resources

The cost of supporting this solution includes:

  • Fee for using the master and outgoing traffic in a Managed Service for Kubernetes cluster (see Managed Service for Kubernetes pricing).
  • Fee for using computing resources, OS, and storage in cluster nodes (VMs) (see Compute Cloud pricing).
  • Fee for the public IP address assigned to cluster nodes (see Virtual Private Cloud pricing).

Getting started

  1. Create security groups for the Managed Service for Kubernetes cluster and its node groups.

    Warning

    The configuration of security groups determines the performance and availability of the cluster and the services and applications running in it.

  2. Create a Managed Service for Kubernetes cluster and a node group in any suitable configuration with internet access and the security groups prepared earlier.

  3. Install kubectl and configure it to work with the new cluster.

  4. Install Helm v3.8.0 or higher.
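
The Helm version requirement above can be checked from the command line. The following is a minimal sketch, not part of the original procedure: the helper names version_ge and check_helm are hypothetical, and the comparison assumes a sort utility that supports version sorting (-V), as GNU sort does.

```shell
#!/usr/bin/env bash
# Succeeds if version $1 is greater than or equal to version $2.
# Assumption: `sort -V` (version sort) is available. A leading "v" is stripped.
version_ge() {
  local have="${1#v}" want="${2#v}"
  # The smaller version sorts first; if that is $want, then $have >= $want.
  [ "$(printf '%s\n%s\n' "$want" "$have" | sort -V | head -n 1)" = "$want" ]
}

# Hypothetical helper: compares the installed Helm client with the 3.8.0 minimum.
check_helm() {
  local current
  current="$(helm version --template '{{.Version}}')" || return 1
  if version_ge "$current" "3.8.0"; then
    echo "Helm $current is recent enough"
  else
    echo "Helm $current is older than 3.8.0" >&2
    return 1
  fi
}
```

After sourcing the snippet, run check_helm. A non-zero exit status means Helm is missing or needs an upgrade before you continue.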

Install Prometheus

The Prometheus monitoring system scans Managed Service for Kubernetes cluster objects and collects their metrics into its own database. The collected metrics are available within the Managed Service for Kubernetes cluster over HTTP.

  1. Add a repository containing the Prometheus distribution:

    helm repo add prometheus-community https://prometheus-community.github.io/helm-charts && \
    helm repo update
    
  2. Install Prometheus:

    helm install my-prom prometheus-community/prometheus
    
  3. Make sure that all pods have entered the Running state:

    kubectl get pods -l "app.kubernetes.io/instance=my-prom"
    

    Result:

    NAME                                              READY  STATUS   RESTARTS  AGE
    my-prom-prometheus-alertmanager-7b********-xt6ws  2/2    Running  0         81s
    my-prom-prometheus-node-exporter-*****            1/1    Running  0         81s
    my-prom-prometheus-pushgateway-69********-swrfb   1/1    Running  0         81s
    my-prom-prometheus-server-7b********-m4v78        2/2    Running  0         81s
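
Instead of reading the pod table by eye, you can let a script check the STATUS column. The sketch below assumes the default kubectl get pods table layout shown above (header row first, STATUS in the third column). The function name all_pods_running is hypothetical.

```shell
#!/usr/bin/env bash
# Reads `kubectl get pods` table output on stdin and succeeds only if at least
# one pod is listed and every listed pod has the status "Running".
# Prints the names of pods that are in any other state.
all_pods_running() {
  awk '
    NR == 1 { next }                      # skip the header row
    NF == 0 { next }                      # skip blank lines
    { seen++ }                            # count listed pods
    $3 != "Running" { print $1; bad++ }   # report pods that are not Running
    END { if (seen == 0 || bad > 0) exit 1 }
  '
}
```

For example: kubectl get pods -l "app.kubernetes.io/instance=my-prom" | all_pods_running. As an alternative that needs no parsing, kubectl wait --for=condition=Ready pod -l "app.kubernetes.io/instance=my-prom" --timeout=120s blocks until the pods are ready or the timeout expires.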
    

Install the Trickster caching proxy

The Trickster caching proxy speeds up reading from the Prometheus database, which lets Grafana display metrics in near real time and reduces the load on Prometheus.

  1. Create a configuration file named trickster.yaml that contains Trickster settings:

    trickster.yaml
     apiVersion: v1
     kind: PersistentVolumeClaim
     metadata:
       name: trickster-pvc
     spec:
       accessModes:
         - ReadWriteOnce
       storageClassName: yc-network-hdd
       resources:
         requests:
           storage: 15Gi
     
     ---
     apiVersion: v1
     kind: ConfigMap
     metadata:
       name: trickster-conf
       labels:
         name: trickster-conf
     
     data:
       trickster.conf: |-
         [frontend]
         listen_port = 8480
         tls_listener = false
         connections_limit = 0
         [logging]
         log_level = "info"
     
         [caching]
         cache_type = "filesystem"
         filesystem_path = "/tmp/trickster"
     
         [proxy]
         origin = "default"
     
         [origins.default]
         origin_type = "prometheus"
         origin_url = "http://my-prom-prometheus-server:80"
         is_default = true
     
         [metrics]
         listen_port = 8481
         listen_address = ""
     
         [health]
         listen_port = 8481
         listen_address = ""
     
         [telemetry]
         prometheus_metrics = false
     
         [logging.profiler]
         enabled = false
         port = 6060
     
     ---
     apiVersion: apps/v1
     kind: Deployment
     metadata:
       name: trickster
       labels:
         app: trickster
     spec:
       replicas: 1
       selector:
         matchLabels:
           app: trickster
       template:
         metadata:
           labels:
             app: trickster
         spec:
           containers:
             - name: trickster
               image: tricksterproxy/trickster:1.1
               imagePullPolicy: IfNotPresent
               args:
                 - -config
                 - /etc/trickster/trickster.conf
               ports:
                 - name: http
                   containerPort: 8480
                   protocol: TCP
                 - name: metrics
                   containerPort: 8481
                   protocol: TCP
               volumeMounts:
                 - name: config-volume
                   mountPath: /etc/trickster
                   readOnly: true
                 - name: cache-volume
                   mountPath: /tmp/trickster
               env:
                 - name: NAMESPACE
                   valueFrom:
                     fieldRef:
                       fieldPath: metadata.namespace
           volumes:
             - name: config-volume
               configMap:
                 name: trickster-conf
                 items:
                   - key: trickster.conf
                     path: trickster.conf
             - name: cache-volume
               persistentVolumeClaim:
                 claimName: trickster-pvc
     
     ---
     apiVersion: v1
     kind: Service
     metadata:
       annotations:
         prometheus.io/scrape: "true"
         prometheus.io/port: "8481"
         prometheus.io/path: "/metrics"
       name: trickster
     spec:
       ports:
         - name: http
           port: 8480
           targetPort: http
         - name: metrics
           port: 8481
           targetPort: metrics
       selector:
         app: trickster
    

    You can change the size of the storage allocated to the caching proxy. Specify the storage size you need in the PersistentVolumeClaim.spec.resources.requests.storage parameter.

  2. Install Trickster:

    kubectl apply -f trickster.yaml
    
  3. Make sure the Trickster pod has entered the Running state:

    kubectl get pods -l "app=trickster"
    

The caching proxy is available in the Managed Service for Kubernetes cluster at http://trickster:8480. Grafana will use this URL to collect metrics.
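
The short host name trickster resolves only from pods in the same namespace as the Service. The manifests above do not set a namespace, so every object lands in the current namespace of your kubectl context. If Grafana or another client runs elsewhere, it needs the fully qualified Service name. The sketch below builds both forms. The function name service_url is hypothetical, and the cluster.local DNS domain is an assumption (it is the Kubernetes default, but a cluster can be configured differently).

```shell
#!/usr/bin/env bash
# Builds the cluster-internal URL of a Service.
# Usage: service_url NAME PORT [NAMESPACE]
# Assumption: the cluster uses the default DNS domain "cluster.local".
service_url() {
  local name="$1" port="$2" namespace="${3:-}"
  if [ -z "$namespace" ]; then
    # Short form: resolves only from pods in the same namespace as the Service.
    echo "http://${name}:${port}"
  else
    # Fully qualified form: resolves from any namespace in the cluster.
    echo "http://${name}.${namespace}.svc.cluster.local:${port}"
  fi
}
```

For example, service_url trickster 8480 default prints the address to use from a pod outside the default namespace. The same reasoning applies to the origin_url value in trickster.conf, which refers to the Prometheus server by its short name.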

Install Grafana

Deploying the application creates the following:

  • Deployment of the Grafana application.
  • PersistentVolumeClaim to reserve internal storage.
  • Service of the LoadBalancer type to enable network access to the Grafana management console.

To install Grafana:

  1. Create a configuration file named grafana.yaml.

    grafana.yaml
    ---
    apiVersion: v1
    kind: PersistentVolumeClaim
    metadata:
      name: grafana-pvc
    spec:
      accessModes:
        - ReadWriteOnce
      resources:
        requests:
          storage: 1Gi
    ---
    apiVersion: apps/v1
    kind: Deployment
    metadata:
      labels:
        app: grafana
      name: grafana
    spec:
      selector:
        matchLabels:
          app: grafana
      template:
        metadata:
          labels:
            app: grafana
        spec:
          securityContext:
            fsGroup: 472
            supplementalGroups:
              - 0
          containers:
            - name: grafana
              image: grafana/grafana:latest
              imagePullPolicy: IfNotPresent
              ports:
                - containerPort: 3000
                  name: http-grafana
                  protocol: TCP
              readinessProbe:
                failureThreshold: 3
                httpGet:
                  path: /robots.txt
                  port: 3000
                  scheme: HTTP
                initialDelaySeconds: 10
                periodSeconds: 30
                successThreshold: 1
                timeoutSeconds: 2
              livenessProbe:
                failureThreshold: 3
                initialDelaySeconds: 30
                periodSeconds: 10
                successThreshold: 1
                tcpSocket:
                  port: 3000
                timeoutSeconds: 1
              resources:
                requests:
                  cpu: 250m
                  memory: 750Mi
              volumeMounts:
                - mountPath: /var/lib/grafana
                  name: grafana-pv
          volumes:
            - name: grafana-pv
              persistentVolumeClaim:
                claimName: grafana-pvc
    ---
    apiVersion: v1
    kind: Service
    metadata:
      name: grafana
    spec:
      ports:
        - port: 3000
          protocol: TCP
          targetPort: http-grafana
      selector:
        app: grafana
      sessionAffinity: None
      type: LoadBalancer
    

    If required, change:

    • Storage size allocated for Grafana in the spec.resources.requests.storage parameter for kind: PersistentVolumeClaim.
    • Computing resources allocated to the Grafana pod in the spec.template.spec.containers[].resources parameters for kind: Deployment.

  2. Install Grafana:

    kubectl apply -f grafana.yaml
    
  3. Make sure the Grafana pod has entered the Running state:

    kubectl get pods -l "app=grafana"
    

Set up and check Grafana

  1. Find the address where Grafana is available and go to it:

    export GRAFANA_IP=$(kubectl get service/grafana -o jsonpath='{.status.loadBalancer.ingress[0].ip}') && \
    export GRAFANA_PORT=$(kubectl get service/grafana -o jsonpath='{.spec.ports[0].port}') && \
    echo http://$GRAFANA_IP:$GRAFANA_PORT
    

    Note

    If the resource is unavailable at the specified URL, make sure that the security groups for the Managed Service for Kubernetes cluster and its node groups are configured correctly. If any rule is missing, add it.

  2. In the browser window that opens, log in with the default username and password (admin/admin), then set a new password for the admin user.

  3. Add a data source with the Prometheus type and the following settings:

    • Name: Prometheus
    • URL: http://trickster:8480

  4. Click Save & test and make sure the data source connected successfully (the Data source is working message appears).

  5. Import the Kubernetes Deployment Statefulset Daemonset metrics dashboard containing the basic Kubernetes metrics. Specify the dashboard ID (8588) when importing.

    Tip

    To check the scenario, you can use any suitable dashboard from the Grafana catalog.

  6. Open the dashboard and make sure that Grafana receives metrics from the Managed Service for Kubernetes cluster.
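
The command in step 1 can print an address without an IP if it runs before the load balancer has been provisioned, because the Service reports no ingress IP until then. The following sketch polls the same jsonpath expression until a value appears. The function name wait_for_lb_ip and the default retry settings are hypothetical choices, not part of the original procedure.

```shell
#!/usr/bin/env bash
# Polls a LoadBalancer Service until it reports an external IP, then prints it.
# Usage: wait_for_lb_ip SERVICE [ATTEMPTS] [DELAY_SECONDS]
# The defaults (30 attempts, 5 seconds apart) are illustrative values.
wait_for_lb_ip() {
  local service="$1" attempts="${2:-30}" delay="${3:-5}" ip="" i=0
  while [ "$i" -lt "$attempts" ]; do
    # Same jsonpath expression as in step 1; empty until the IP is assigned.
    ip="$(kubectl get "service/${service}" \
      -o jsonpath='{.status.loadBalancer.ingress[0].ip}' 2>/dev/null)"
    if [ -n "$ip" ]; then
      echo "$ip"
      return 0
    fi
    i=$((i + 1))
    sleep "$delay"
  done
  echo "No external IP for service/${service} after ${attempts} attempts" >&2
  return 1
}
```

For example: GRAFANA_IP="$(wait_for_lb_ip grafana)" && echo "http://$GRAFANA_IP:3000". If the function times out, check the security group configuration as described in the note under step 1.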

Delete the resources you created

Delete the resources you no longer need to avoid paying for them:

  1. Delete the Managed Service for Kubernetes cluster.
  2. Delete the Managed Service for Kubernetes cluster's public static IP address if you reserved one.
  3. Delete the disk created for the Trickster storage. To identify it, run the kubectl describe pvc trickster-pvc command while the cluster still exists, that is, before step 1: the value in the Volume field matches the label in the disk description.
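
The Volume field shown by kubectl describe pvc corresponds to the spec.volumeName field of the claim, so the value can also be read directly and saved for later. A minimal sketch, with the hypothetical helper name pvc_volume_name:

```shell
#!/usr/bin/env bash
# Prints the name of the PersistentVolume bound to a PersistentVolumeClaim.
# This is the value that `kubectl describe pvc` shows in the Volume field.
pvc_volume_name() {
  local claim="$1" volume
  volume="$(kubectl get pvc "$claim" -o jsonpath='{.spec.volumeName}')" || return 1
  if [ -z "$volume" ]; then
    # An empty value means the claim has not been bound to a volume.
    echo "PVC ${claim} is not bound to a volume" >&2
    return 1
  fi
  echo "$volume"
}
```

For example, pvc_volume_name trickster-pvc prints the value to look for in the disk description.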
