Monitoring a Yandex Managed Service for Kubernetes cluster with Prometheus and Grafana
Managed Service for Kubernetes enables you to upload cluster object metrics to monitoring systems.
In this article, you will learn how to set up the Prometheus monitoring system and the Grafana visualization tool for a Managed Service for Kubernetes cluster, with the Trickster caching proxy in between.
To set up the Managed Service for Kubernetes cluster monitoring system:
- Install Prometheus.
- Install the Trickster caching proxy.
- Install Grafana.
- Set up and check Grafana.
If you no longer need the resources you created, delete them.
Required paid resources
The support cost includes:
- Fee for using the master and outgoing traffic in a Managed Service for Kubernetes cluster (see Managed Service for Kubernetes pricing).
- Fee for using computing resources, OS, and storage in cluster nodes (VMs) (see Compute Cloud pricing).
- Fee for the public IP address assigned to cluster nodes (see Virtual Private Cloud pricing).
Getting started
- Create security groups for the Managed Service for Kubernetes cluster and its node groups.
  Warning
  The configuration of security groups determines the performance and availability of the cluster and the services and applications running in it.
- Create a Managed Service for Kubernetes cluster and a node group in any suitable configuration with internet access and the security groups prepared earlier.
- Install kubectl and configure it to work with the new cluster.
- Install Helm v3.8.0 or higher (a quick version check is shown right after this list).
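If you are not sure which Helm version is installed, you can check it with the standard Helm command:

# Print the installed Helm client version; it must be v3.8.0 or higher.
helm version --short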
Install Prometheus
The Prometheus monitoring system scans Managed Service for Kubernetes cluster objects and collects their metrics into its own database. The collected metrics are available within the Managed Service for Kubernetes cluster over HTTP.
- Add a repository containing the Prometheus distribution:

helm repo add prometheus-community https://prometheus-community.github.io/helm-charts && \
helm repo update

- Install Prometheus:

helm install my-prom prometheus-community/prometheus

- Make sure that all pods have entered the Running state:

kubectl get pods -l "app.kubernetes.io/instance=my-prom"

Result:

NAME                                                READY   STATUS    RESTARTS   AGE
my-prom-prometheus-alertmanager-7b********-xt6ws    2/2     Running   0          81s
my-prom-prometheus-node-exporter-*****              1/1     Running   0          81s
my-prom-prometheus-pushgateway-69********-swrfb     1/1     Running   0          81s
my-prom-prometheus-server-7b********-m4v78          2/2     Running   0          81s
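Optionally, you can confirm that the Prometheus HTTP API responds before connecting Trickster. This is a minimal sketch, assuming the release name my-prom from the step above; the server Service is then named my-prom-prometheus-server, the same name referenced in the Trickster configuration in the next section:

# Forward the Prometheus server Service to a local port.
kubectl port-forward service/my-prom-prometheus-server 9090:80

# In another terminal, query the Prometheus HTTP API for the built-in "up" metric.
curl 'http://localhost:9090/api/v1/query?query=up'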
Install the Trickster caching proxy
The Trickster caching proxy speeds up reading from the Prometheus database, which lets you view Managed Service for Kubernetes cluster metrics in near real time.
- Create a configuration file named trickster.yaml that contains Trickster settings:

apiVersion: v1
kind: PersistentVolumeClaim
metadata:
  name: trickster-pvc
spec:
  accessModes:
    - ReadWriteOnce
  storageClassName: yc-network-hdd
  resources:
    requests:
      storage: 15Gi
---
apiVersion: v1
kind: ConfigMap
metadata:
  name: trickster-conf
  labels:
    name: trickster-conf
data:
  trickster.conf: |-
    [frontend]
    listen_port = 8480
    tls_listener = false
    connections_limit = 0
    [logging]
    log_level = "info"
    [caching]
    cache_type = "filesystem"
    filesystem_path = "/tmp/trickster"
    [proxy]
    origin = "default"
    [origins.default]
    origin_type = "prometheus"
    origin_url = "http://my-prom-prometheus-server:80"
    is_default = true
    [metrics]
    listen_port = 8481
    listen_address = ""
    [health]
    listen_port = 8481
    listen_address = ""
    [telemetry]
    prometheus_metrics = false
    [logging.profiler]
    enabled = false
    port = 6060
---
apiVersion: apps/v1
kind: Deployment
metadata:
  name: trickster
  labels:
    app: trickster
spec:
  replicas: 1
  selector:
    matchLabels:
      app: trickster
  template:
    metadata:
      labels:
        app: trickster
    spec:
      containers:
        - name: trickster
          image: tricksterproxy/trickster:1.1
          imagePullPolicy: IfNotPresent
          args:
            - -config
            - /etc/trickster/trickster.conf
          ports:
            - name: http
              containerPort: 8480
              protocol: TCP
            - name: metrics
              containerPort: 8481
              protocol: TCP
          volumeMounts:
            - name: config-volume
              mountPath: /etc/trickster
              readOnly: true
            - name: cache-volume
              mountPath: /tmp/trickster
          env:
            - name: NAMESPACE
              valueFrom:
                fieldRef:
                  fieldPath: metadata.namespace
      volumes:
        - name: config-volume
          configMap:
            name: trickster-conf
            items:
              - key: trickster.conf
                path: trickster.conf
        - name: cache-volume
          persistentVolumeClaim:
            claimName: trickster-pvc
---
apiVersion: v1
kind: Service
metadata:
  annotations:
    prometheus.io/scrape: "true"
    prometheus.io/port: "8481"
    prometheus.io/path: "/metrics"
  name: trickster
spec:
  ports:
    - name: http
      port: 8480
      targetPort: http
    - name: metrics
      port: 8481
      targetPort: metrics
  selector:
    app: trickster

You can change the size of the storage allocated to the caching proxy. Specify the storage size you need in the PersistentVolumeClaim.spec.resources.requests.storage parameter.
- Install Trickster:

kubectl apply -f trickster.yaml
- Make sure the Trickster pod has entered the Running state:

kubectl get pods -l "app=trickster"
The caching proxy is available in the Managed Service for Kubernetes cluster at http://trickster:8480. Grafana will use this URL to collect metrics.
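Optionally, you can check the proxy from inside the cluster before setting up Grafana. This is a minimal sketch that starts a temporary pod (the curlimages/curl image is an assumption; any image with curl works) and sends a Prometheus API query through Trickster, which exposes the same query API on port 8480:

# Start a temporary pod and query the Prometheus API through the Trickster proxy;
# the pod is removed automatically when the command exits.
kubectl run curl-test --rm -it --restart=Never --image=curlimages/curl -- \
  curl 'http://trickster:8480/api/v1/query?query=up'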
Install Grafana
When deploying the application, the following will be created:
- Deployment of the Grafana application.
- PersistentVolumeClaim to reserve internal storage.
- Service of the LoadBalancer type to enable network access to the Grafana management console.
To install Grafana:
- Create a configuration file named grafana.yaml:

---
apiVersion: v1
kind: PersistentVolumeClaim
metadata:
  name: grafana-pvc
spec:
  accessModes:
    - ReadWriteOnce
  resources:
    requests:
      storage: 1Gi
---
apiVersion: apps/v1
kind: Deployment
metadata:
  labels:
    app: grafana
  name: grafana
spec:
  selector:
    matchLabels:
      app: grafana
  template:
    metadata:
      labels:
        app: grafana
    spec:
      securityContext:
        fsGroup: 472
        supplementalGroups:
          - 0
      containers:
        - name: grafana
          image: grafana/grafana:latest
          imagePullPolicy: IfNotPresent
          ports:
            - containerPort: 3000
              name: http-grafana
              protocol: TCP
          readinessProbe:
            failureThreshold: 3
            httpGet:
              path: /robots.txt
              port: 3000
              scheme: HTTP
            initialDelaySeconds: 10
            periodSeconds: 30
            successThreshold: 1
            timeoutSeconds: 2
          livenessProbe:
            failureThreshold: 3
            initialDelaySeconds: 30
            periodSeconds: 10
            successThreshold: 1
            tcpSocket:
              port: 3000
            timeoutSeconds: 1
          resources:
            requests:
              cpu: 250m
              memory: 750Mi
          volumeMounts:
            - mountPath: /var/lib/grafana
              name: grafana-pv
      volumes:
        - name: grafana-pv
          persistentVolumeClaim:
            claimName: grafana-pvc
---
apiVersion: v1
kind: Service
metadata:
  name: grafana
spec:
  ports:
    - port: 3000
      protocol: TCP
      targetPort: http-grafana
  selector:
    app: grafana
  sessionAffinity: None
  type: LoadBalancer

If required, change:
  - Storage size allocated for Grafana in the spec.resources.requests.storage parameter for kind: PersistentVolumeClaim.
  - Computing resources allocated to the Grafana pod in the spec.containers.resources parameters for kind: Deployment.
- Install Grafana:

kubectl apply -f grafana.yaml

- Make sure the Grafana pod has entered the Running state:

kubectl get pods -l "app=grafana"
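The LoadBalancer Service may take a few minutes to obtain an external address. If you want to wait for it before moving on to the next section, a minimal sketch:

# Watch the grafana Service until the EXTERNAL-IP column is populated
# (press Ctrl+C to stop watching).
kubectl get service grafana --watch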
Set up and check Grafana
- Find the address where Grafana is available and go to it:

export GRAFANA_IP=$(kubectl get service/grafana -o jsonpath='{.status.loadBalancer.ingress[0].ip}') && \
export GRAFANA_PORT=$(kubectl get service/grafana -o jsonpath='{.spec.ports[0].port}') && \
echo http://$GRAFANA_IP:$GRAFANA_PORT
- In the browser window that opens, enter admin as both the username and password, then set a new password for the admin user.
- Add a data source with the Prometheus type and the following settings (for an API-based alternative, see the sketch after this list):
  - Name: Prometheus.
  - URL: http://trickster:8480.
- Click Save & test and make sure that the data source was successfully connected (Data source is working).
- Import the Kubernetes Deployment Statefulset Daemonset metrics dashboard containing the basic Kubernetes metrics. Specify the dashboard ID (8588) when importing.
  Tip
  To check the scenario, you can use any suitable dashboard from the Grafana catalog.
- Open the dashboard and make sure that Grafana receives metrics from the Managed Service for Kubernetes cluster.
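As an alternative to the UI steps above, the same Prometheus data source can be created through the Grafana HTTP API. This is a minimal sketch; it reuses the GRAFANA_IP and GRAFANA_PORT variables exported earlier, and <new_password> stands for the admin password you set:

# Create a Prometheus data source that reads metrics through the Trickster proxy.
curl -X POST "http://$GRAFANA_IP:$GRAFANA_PORT/api/datasources" \
  -u "admin:<new_password>" \
  -H "Content-Type: application/json" \
  -d '{"name": "Prometheus", "type": "prometheus", "url": "http://trickster:8480", "access": "proxy"}'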
Delete the resources you created
Delete the resources you no longer need to avoid paying for them:
- Delete the Managed Service for Kubernetes cluster.
- Delete the Managed Service for Kubernetes cluster's public static IP address if you had reserved one.
- Delete the disk created for the trickster storage. You can find it by the label in the disk description: the label will match the value of the Volume field in the kubectl describe pvc trickster-pvc output (run this command before deleting the cluster).
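A minimal sketch for looking up the name of the PersistentVolume backing the Trickster cache; run it while the cluster still exists, since the disk you need to delete carries a label with this value:

# Print the name of the PersistentVolume bound to the Trickster cache PVC;
# this matches the Volume field shown by `kubectl describe pvc trickster-pvc`.
kubectl get pvc trickster-pvc -o jsonpath='{.spec.volumeName}'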