Monitoring

Updated at April 8, 2026

Stackland lets you set up monitoring for the cluster and platform components with Prometheus, Loki, Fluent Bit, Grafana, and Alertmanager.

Prometheus: Tool for collecting metrics from the cluster and platform components.
Loki: Centralized log aggregation system.
Fluent Bit: Log shipper.
Grafana: Interface for viewing and visualizing metrics and logs.
Alertmanager: Tool for handling alerts and sending notifications about issues.

You can extend monitoring capabilities by adding data sources and plugins to work with them.

Grafana interface

The Grafana interface is available at https://grafana.sys.<cluster domain>. To sign in, click Sign in with Stackland Auth.

The Loki and Prometheus data sources are connected to Grafana by default. To add new data sources or check the connected ones, go to the Data sources page under Connections. To see which metrics and logs are collected in the cluster, open the Explore tab and select Metrics or Logs.

Access management

A user's global role in Grafana depends on their role in the cluster. Members of the stackland-cluster-admins group get the administrator role in Grafana. Members of the stackland-cluster-editors group get the editor role. Users who belong to neither group get the viewer role.
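The mapping above can be sketched as a small function. This is an illustration only, not Stackland's implementation: the function name and the lowercase role labels are hypothetical, and the rule that the administrator group wins when a user belongs to both groups is an assumption the text does not state.

```python
# Group names come from the text above. The function name and the role
# labels are illustrative, not Stackland or Grafana identifiers.
ADMIN_GROUP = "stackland-cluster-admins"
EDITOR_GROUP = "stackland-cluster-editors"


def grafana_role(groups):
    """Return the global Grafana role implied by a user's cluster groups."""
    members = set(groups)
    # Assumption: the administrator group takes precedence over the editor group.
    if ADMIN_GROUP in members:
        return "admin"
    if EDITOR_GROUP in members:
        return "editor"
    # Users in neither group fall back to the viewer role.
    return "viewer"
```

For example, grafana_role(["stackland-cluster-editors"]) returns "editor", and an empty group list returns "viewer".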

After you sign in, you can manage permissions for individual Grafana resources and individual users. For details, see the Grafana documentation on access management.

Metric dashboards

You can find ready-to-use dashboards on the Dashboards tab: the stackland-monitoring folder contains dashboards with cluster metrics. You can find dashboards with platform component metrics in other folders, e.g., in stackland-managed-postgres.

In addition to the ready-to-use dashboards, you can create your own and add your app metrics to them. For more information about creating dashboards, see Creating a dashboard.

Configuration

General format

apiVersion: stackland.yandex.cloud/v1alpha1
kind: MonitoringConfig
metadata: ...
status:
  datasourceConfigured: true
  grafanaReady: true
  message: Grafana is ready
  observedGeneration: 1
spec:
  enabled: true
  settings:
    alertmanager:
      enabled: true
      ingressEnabled: true
      resources:
        requests:
          cpu: 50m
          memory: 200Mi
    grafana:
      enabled: true
      resources:
        limits:
          cpu: 500m
          memory: 1Gi
        requests:
          cpu: 100m
          memory: 256Mi
    grafanaOperator:
      enabled: true
      resources:
        limits:
          cpu: 500m
          memory: 512Mi
        requests:
          cpu: 100m
          memory: 128Mi
    prometheus:
      enabled: true
      ingressEnabled: true
      resources:
        limits:
          memory: 2Gi
        requests:
          cpu: 100m
          memory: 400Mi
      retention: 10d

Monitoring component status

status:
  datasourceConfigured: true
  grafanaReady: true
  message: Grafana is ready
  observedGeneration: 1

datasourceConfigured: Prometheus and Loki are connected to Grafana.
grafanaReady: Grafana is ready for use.
message: Grafana status message.
observedGeneration: Active configuration version.
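One way to interpret these fields in a script is sketched below. It assumes the status block has already been loaded into a Python dict (for example, from the resource's YAML); the function name is hypothetical, and treating a missing field as "not ready" is an assumption.

```python
def monitoring_ready(status):
    """Return (ready, reason) for a MonitoringConfig status mapping."""
    # Both flags must be true; a missing field is treated as not ready (assumption).
    missing = [
        field
        for field in ("datasourceConfigured", "grafanaReady")
        if not status.get(field, False)
    ]
    if missing:
        return False, "not ready: " + ", ".join(missing)
    # When everything is ready, pass through the status message as the reason.
    return True, status.get("message", "")
```

With the status shown above, the function returns (True, "Grafana is ready").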

Alertmanager

alertmanager:
  enabled: true
  ingressEnabled: true
  resources:
    requests:
      cpu: 50m
      memory: 200Mi

enabled: Enables Alertmanager.
ingressEnabled: Enables access to Alertmanager via Ingress.
resources: CPU and memory requests and limits for Alertmanager.

Grafana

grafana:
  enabled: true
  resources:
    limits:
      cpu: 500m
      memory: 1Gi
    requests:
      cpu: 100m
      memory: 256Mi

enabled: Enables Grafana.
resources: CPU and memory requests and limits for Grafana.

Grafana Operator

grafanaOperator:
  enabled: true
  resources:
    limits:
      cpu: 500m
      memory: 512Mi
    requests:
      cpu: 100m
      memory: 128Mi

enabled: Enables Grafana Operator.
resources: CPU and memory requests and limits for Grafana Operator.

Prometheus

prometheus:
  enabled: true
  ingressEnabled: true
  resources:
    limits:
      memory: 2Gi
    requests:
      cpu: 100m
      memory: 400Mi
  retention: 10d

enabled: Enables Prometheus.
ingressEnabled: Enables access to the Prometheus web UI via Ingress.
resources: CPU and memory requests and limits for Prometheus.
retention: How long metrics are stored before they are deleted.
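To make the retention value concrete, the sketch below converts a string such as 10d into a time span. It assumes the field uses Prometheus-style duration strings, which the text does not confirm, and it handles single-unit values only; the function name is hypothetical.

```python
import re
from datetime import timedelta

# Single-unit durations only (for example "10d" or "36h"); compound values
# such as "1d12h" are out of scope for this sketch.
_UNITS = {
    "s": timedelta(seconds=1),
    "m": timedelta(minutes=1),
    "h": timedelta(hours=1),
    "d": timedelta(days=1),
    "w": timedelta(weeks=1),
}


def parse_retention(value):
    """Convert a retention string like "10d" into a timedelta."""
    match = re.fullmatch(r"(\d+)([smhdw])", value)
    if match is None:
        raise ValueError(f"unsupported retention value: {value!r}")
    amount, unit = match.groups()
    return int(amount) * _UNITS[unit]
```

With the value from the example above, parse_retention("10d") equals timedelta(days=10).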
