© 2026 Direct Cursus Technology L.L.C.

OpenSearch cluster and host state monitoring

Written by
Yandex Cloud
Updated at May 27, 2026
  • Cluster state monitoring
  • Monitoring the state of hosts
  • Monitoring the state of host groups
  • Setting up alerts in Yandex Monitoring
  • Cluster state and status
    • Cluster states
    • Cluster statuses

Cluster and host state data is available in the management console. You can view it on the Monitoring tab of the cluster management page or in Yandex Monitoring.

Diagnostic information about the cluster state is presented as charts.

Charts are updated every 15 seconds.

Note

Charts automatically use the most appropriate unit multiples (MB, GB, and so on).

You can configure alerts in Yandex Monitoring to receive notifications about cluster failures. In Yandex Monitoring, there are two alert thresholds: Warning and Alarm. If the specified threshold is exceeded, you will receive alerts via the configured notification channels.

Cluster state monitoring

To view detailed information on the health state of a Managed Service for OpenSearch cluster:

Management console
  1. In the management console, navigate to the folder page.

  2. Navigate to Managed Service for OpenSearch.

  3. Click the cluster name and open the Monitoring tab.

    The page displays the following charts:

    • Under Cluster state:

      • Health status: Cluster health and technical condition:

        • 0 (red): Cluster is unhealthy or partially functional. At least one of the primary shards is not available. If the cluster responds to queries, search results will be incomplete.
        • 1 (yellow): Cluster is functional, but at least one shard replica is unavailable. Search results are complete, but the cluster's operation will be disrupted if more shards become unavailable.
        • 2 (green): Cluster is healthy. All cluster shards are available.
      • Current master: FQDN of one of the hosts with the MANAGER role.

      • Nodes: Total number of hosts in the cluster (excluding Dashboards hosts) and the number of hosts with the DATA role.

      • Pending tasks: Number of enqueued tasks.

    • Under Indices and load info:

      • Top indices by size: Largest indexes in terms of occupied storage space and their size (in bytes).

      • Active shards: Number of active primary shards and the total number of active shards in the cluster.

      • Search rate: Number of search queries per second, per host.

      • Top indices by docs count: Indexes with the largest document count and the number of documents in them.

      • Other shards: Number of inactive shards in each of the following states:

        • Delayed unassigned: Host assignment is delayed.
        • Unassigned: No host is assigned.
        • Unassigned Primary: No host is assigned (primary shards only).
        • Relocating: Shards are being moved to another host.
        • Initializing: Shards are initializing.
      • Indexing rate: Indexing speed for each host (operations per second).

    • Under Indices segments info:

      • Total indices segments per host: Number of index segments for each host.
    • Under Latest backup info:

      • Backup size: Size of the latest backup:

        • backup_total_size: Total size of all indexes in the backup.
        • backup_incremental_size: Size of the indexes included in the backup increment.
        • backup_free_space_required: Storage size required to restore a cluster from a backup.
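
As a small illustration, the Health status values listed above can be decoded in a script that processes exported chart data. This is only a sketch: the `describe_health` function and the `HEALTH_STATUS` dictionary are hypothetical names and not part of any Yandex Cloud API. Only the numeric codes and their meanings come from the list above.

```python
# Hypothetical helper: maps the numeric Health status chart value
# (0, 1, or 2, as listed above) to a color and a short summary.
HEALTH_STATUS = {
    0: ("red", "at least one primary shard is unavailable"),
    1: ("yellow", "at least one shard replica is unavailable"),
    2: ("green", "all cluster shards are available"),
}


def describe_health(value: int) -> str:
    """Return a readable description for a Health status value."""
    if value not in HEALTH_STATUS:
        raise ValueError(f"unexpected health status value: {value!r}")
    color, summary = HEALTH_STATUS[value]
    return f"{color}: {summary}"


if __name__ == "__main__":
    for code in (0, 1, 2):
        print(code, describe_health(code))
```

Raising an error for any other value keeps an unexpected reading from being silently treated as healthy.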

Note

To get started with Monitoring metrics, dashboards, or alerts, click Open in Monium in the top panel.

Monitoring the state of hosts

To view detailed information on the state of individual Managed Service for OpenSearch hosts:

  1. In the management console, navigate to the folder page.
  2. Navigate to Managed Service for OpenSearch.
  3. Click the cluster name and open the Hosts tab.
  4. Select the Monitoring tab.
  5. Select the host from the drop-down list.

This page displays charts showing the workload of individual cluster hosts. The set of charts depends on the host type:

MANAGER
  • Process CPU: Processor core workload generated by the JVM OpenSearch process.
  • Memory usage: Amount of RAM used, in bytes.
  • JVM heap: Use of JVM heap memory, in bytes.
  • Disk space usage percent: Percentage of the disk space used.
  • Management thread pool: Number of cluster management requests.
  • Generic thread pool: Number of requests for running general operations.
  • Thread pool queued: Number of enqueued requests.
  • Thread pool rejected: Number of rejected requests.
  • Under Disk Metrics Details:
    • Disk write latency (percentiles): Disk write time, in percentiles.
    • Disk write bytes: Average and maximum disk write rate.
    • Disk write operations: Average and maximum number of write operations per second.
    • Disk read latency (percentiles): Disk read time, in percentiles.
    • Disk read bytes: Average and maximum disk read rate.
    • Disk read operations: Average and maximum number of read operations per second.
    • Disk write throttler latency (percentiles): Write delay introduced by exceeding disk quota, percentiles.
    • Disk read throttler latency (percentiles): Read delay introduced by exceeding disk quota, percentiles.
    • Disk used quota: Disk operation quota usage.

DATA
  • Process CPU: Processor core workload generated by the JVM OpenSearch process.
  • Memory usage: Use of RAM, in bytes.
  • JVM heap percent: Percentage of the JVM heap memory used.
  • Disk space usage percent: Percentage of the disk space used.
  • Shards count: Number of index shards.
  • Primary shards count: Number of primary index shards.
  • Open file descriptors: Number of open file descriptors.
  • Indexing rate: Indexing speed (operations per second).
  • Search queries: Number of search queries per second.
  • Write thread pool: Requests for indexing, deleting, or updating documents.
  • Average query time: Average query execution time.
  • Average indexing time: Average time spent on document indexing.
  • Thread pool queued: Number of enqueued requests.
  • Thread pool rejected: Number of rejected requests.
  • Merging time: Time spent to merge the documents.
  • Under Disk Metrics Details:
    • Disk write latency (percentiles): Disk write time, in percentiles.
    • Disk write bytes: Average and maximum disk write rate.
    • Disk write operations: Average and maximum number of write operations per second.
    • Disk read latency (percentiles): Disk read time, in percentiles.
    • Disk read bytes: Average and maximum disk read rate.
    • Disk read operations: Average and maximum number of read operations per second.
    • Disk write throttler latency (percentiles): Write delay introduced by exceeding disk quota, percentiles.
    • Disk read throttler latency (percentiles): Read delay introduced by exceeding disk quota, percentiles.
    • Disk used quota: Disk operation quota usage.

DASHBOARDS
  • Is Alive: Status that shows the host is available.
  • Requests Total: Total number of host requests.
  • Process CPU: Processor core workload generated by the JVM OpenSearch process.
  • Memory usage: Use of RAM, in bytes.
  • Disk read/write bytes: Speed of disk operations, in bytes per second.
  • Disk IOPS: Number of disk operations per second.
  • Network Packets: Network packet exchange rate, in packets per second.
  • Network bytes: Speed of network data exchange, in bytes per second.

Monitoring the state of host groups

To view detailed information on the state of a Managed Service for OpenSearch host group:

  1. In the management console, navigate to the folder page.
  2. Navigate to Managed Service for OpenSearch.
  3. Click the cluster name and open the Node groups tab.
  4. Select the Monitoring tab.
  5. Select the host group from the drop-down list.

This page displays the charts showing workloads of a cluster host group. The list depends on the type of hosts in the group and matches the charts shown for individual hosts.

Setting up alerts in Yandex Monitoring

Management console
  1. In the management console, select the folder containing the cluster where you want to set up alerts.

  2. Navigate to the Monitoring service.

  3. Under Service dashboards, select:

    • Managed Service for OpenSearch to configure cluster alerts.
    • Managed Service for OpenSearch — Dashboards to configure alerts for hosts with the DASHBOARDS role.
    • Managed Service for OpenSearch — Data to configure alerts for hosts with the DATA role.
    • Managed Service for OpenSearch — Manager to configure alerts for hosts with the MANAGER role.
  4. In the chart you need, open its menu and select Create alert.

  5. If the chart displays multiple metrics, select the data query for the relevant metric and click Continue. To learn more about the query language, see this Yandex Monitoring article.

  6. Set the Alarm and Warning alert thresholds.

  7. Click Create alert.

To have other cluster health indicators monitored automatically:

Management console
  1. Create an alert.
  2. Add a status metric.
  3. In the alert parameters, set the alert thresholds.

Recommended threshold values for selected metrics:

| Metric | Designation | Formula | Thresholds (Alarm, Warning) |
|---|---|---|---|
| Cluster status | opensearch_status | bottom_last(1) | Alarm: equal to 0; Warning: equal to 1 |
| Number of unassigned shards | opensearch_unassigned_shards | top_last(1) | greater than 0 |
| Number of shards being relocated | opensearch_relocating_shards | top_last(1) | greater than 0 |
| Number of initializing shards | opensearch_initializing_shards | top_last(1) | greater than 0 |
| Number of delayed unassigned shards | opensearch_delayed_unassigned_shards | top_last(1) | greater than 0 |
| JVM heap memory used | opensearch_jvm_mem_heap_used_percent | top_last(1) | Over 90% of host RAM |
| Storage space used | opensearch_fs_total_used_percent | top_last(1) | Alarm: over 90% of the storage size; Warning: over 85% of the storage size |
| Using the JVM long-lived object pool | opensearch_jvm_mem_heap_pressure | top_last(1) | Alarm: over 90% of host RAM; Warning: over 75% of host RAM |
| Storage space used | disk.used_bytes | - | Alarm: 90% of the storage size; Warning: 80% of the storage size |
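
To show how a Warning and an Alarm threshold split a metric's range, here is a minimal sketch that classifies a value using two of the rows above (`opensearch_fs_total_used_percent` and `opensearch_status`). Yandex Monitoring evaluates alerts on its own side. The function names and the "ALARM", "WARNING", and "OK" labels below are illustrative assumptions and not a Monitoring API.

```python
from typing import Callable

# Illustrative only: Monitoring evaluates alerts itself; this sketch just
# mirrors the threshold logic for two rows of the table above.


def evaluate(
    value: float,
    is_alarm: Callable[[float], bool],
    is_warning: Callable[[float], bool],
) -> str:
    """Return "ALARM", "WARNING", or "OK"; the Alarm condition is checked first."""
    if is_alarm(value):
        return "ALARM"
    if is_warning(value):
        return "WARNING"
    return "OK"


def storage_level(used_percent: float) -> str:
    # opensearch_fs_total_used_percent: Alarm over 90%, Warning over 85%.
    return evaluate(used_percent, lambda v: v > 90, lambda v: v > 85)


def cluster_status_level(status: int) -> str:
    # opensearch_status: Alarm when equal to 0, Warning when equal to 1.
    return evaluate(status, lambda v: v == 0, lambda v: v == 1)
```

Checking the Alarm condition first means a value that satisfies both conditions is reported at the more severe level.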

For the disk.used_bytes metric, the Alarm and Warning thresholds are only set in bytes. For example, the recommended values for a 100 GB disk are as follows:

  • Alarm: 96636764160 bytes (90%).
  • Warning: 85899345920 bytes (80%).
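
The byte values above can be reproduced with a short calculation. The sketch below assumes, as the 100 GB example implies, that 1 GB is counted as 2^30 bytes. The function name `disk_thresholds` is hypothetical.

```python
# Assumption taken from the example above: 1 GB = 2**30 bytes.
BYTES_PER_GB = 1024 ** 3


def disk_thresholds(disk_size_gb: int, alarm_percent: int = 90, warning_percent: int = 80) -> dict:
    """Return Alarm and Warning thresholds for disk.used_bytes, in bytes."""
    total_bytes = disk_size_gb * BYTES_PER_GB
    return {
        "alarm": total_bytes * alarm_percent // 100,
        "warning": total_bytes * warning_percent // 100,
    }


if __name__ == "__main__":
    print(disk_thresholds(100))
    # {'alarm': 96636764160, 'warning': 85899345920}
```

Integer arithmetic is used so that the result can be pasted into the threshold field without rounding artifacts.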

You can view the current storage size and RAM of the hosts in the detailed information about the cluster. For a complete list of supported metrics, see this Monitoring guide.

Cluster state and status

The State of a cluster shows the health of its hosts, while the Status shows whether the cluster is started, stopped, or at an intermediate stage.

To check the cluster state and status:

Management console
  1. In the management console, navigate to the folder page.
  2. Navigate to Managed Service for OpenSearch.
  3. In the cluster row, hover over the indicator in the Availability column.

CLI

If you do not have the Yandex Cloud CLI yet, install and initialize it.

The folder used by default is the one specified when creating the CLI profile. To change the default folder, use the yc config set folder-id <folder_ID> command. You can also specify a different folder for any command using --folder-name or --folder-id. If you access a resource by its name, the search will be limited to the default folder. If you access a resource by its ID, the search will be global, i.e., through all folders based on access permissions.

To find out the state and status of a cluster, get information about it:

    yc managed-opensearch cluster get <cluster_name_or_ID>

You will see the cluster state in the health parameter and the cluster status in the status parameter.

You can get the cluster name and ID with the list of clusters in the folder.

REST API
  1. Get an IAM token for API authentication and put it into an environment variable:

    export IAM_TOKEN="<IAM_token>"

  2. Call the Cluster.Get method, e.g., via the following cURL request:

    curl \
        --request GET \
        --header "Authorization: Bearer $IAM_TOKEN" \
        --url 'https://mdb.api.cloud.yandex.net/managed-opensearch/v1/clusters/<cluster_ID>'

    You can get the cluster ID with the list of clusters in the folder.

  3. Check the server response to make sure your request was successful.

    You will see the cluster health and status in the health and status parameters, respectively.

gRPC API
  1. Get an IAM token for API authentication and put it into an environment variable:

    export IAM_TOKEN="<IAM_token>"

  2. Clone the cloudapi repository:

    cd ~/ && git clone --depth=1 https://github.com/yandex-cloud/cloudapi

    Below, we assume that the repository contents reside in the ~/cloudapi/ directory.

  3. Call the ClusterService.Get method, e.g., via the following gRPCurl request:

    grpcurl \
        -format json \
        -import-path ~/cloudapi/ \
        -import-path ~/cloudapi/third_party/googleapis/ \
        -proto ~/cloudapi/yandex/cloud/mdb/opensearch/v1/cluster_service.proto \
        -rpc-header "Authorization: Bearer $IAM_TOKEN" \
        -d '{
              "cluster_id": "<cluster_ID>"
            }' \
        mdb.api.cloud.yandex.net:443 \
        yandex.cloud.mdb.opensearch.v1.ClusterService.Get

    You can get the cluster ID with the list of clusters in the folder.

  4. Check the server response to make sure your request was successful.

    You will see the cluster health and status in the health and status parameters, respectively.
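
For scripted checks, the REST API request shown above could be issued from Python instead of cURL. The following is a minimal sketch using only the standard library. It reuses the endpoint URL and the Authorization header from the cURL example and assumes the response is a JSON object with top-level health and status fields, as described above. The function names are hypothetical.

```python
import json
import urllib.request
from typing import Optional, Tuple

# Endpoint taken from the cURL example above.
API_URL = "https://mdb.api.cloud.yandex.net/managed-opensearch/v1/clusters/{cluster_id}"


def build_request(cluster_id: str, iam_token: str) -> urllib.request.Request:
    """Build the GET request for the Cluster.Get method."""
    return urllib.request.Request(
        API_URL.format(cluster_id=cluster_id),
        headers={"Authorization": f"Bearer {iam_token}"},
        method="GET",
    )


def extract_health_and_status(body: str) -> Tuple[Optional[str], Optional[str]]:
    """Pull the health and status fields out of a JSON response body."""
    cluster = json.loads(body)
    return cluster.get("health"), cluster.get("status")


def get_health_and_status(cluster_id: str, iam_token: str, timeout: float = 10.0):
    """Send the request and return (health, status); raises on HTTP errors."""
    request = build_request(cluster_id, iam_token)
    with urllib.request.urlopen(request, timeout=timeout) as response:
        return extract_health_and_status(response.read().decode("utf-8"))
```

A call such as `get_health_and_status("<cluster_ID>", os.environ["IAM_TOKEN"])` would then return a pair like the health and status values described in the tables that follow. Building the request and parsing the body in separate functions makes it possible to check both without network access.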

Cluster states

| State | Description | Suggested actions |
|---|---|---|
| ALIVE | Cluster is operating normally. | No action is required. |
| DEGRADED | Cluster is not running at its full capacity: the state of at least one of the hosts is other than ALIVE. | Run the diagnostics: go to the Hosts tab and see which hosts are not working; go to the Operations tab and make sure all operations are completed; make sure the cluster is not under maintenance. If you cannot find the cause yourself, contact support. |
| DEAD | The cluster is down: none of its hosts are running. | Make a support request stating the following: cluster ID; IDs of the last operations performed on it; time the cluster entered the DEAD state according to the availability charts. |
| UNKNOWN | Cluster state is unknown. | Make a support request stating the following: cluster ID; IDs of the last operations performed on it; time the cluster entered the UNKNOWN state according to the availability charts. |

Cluster statuses

| Status | Description | Suggested actions |
|---|---|---|
| CREATING | Preparing for the first start | Wait a while and get started. The time it takes to create a cluster depends on the host class. |
| RUNNING | The cluster is operating normally | No action is required. |
| STOPPING | The cluster is stopping | After a while, the cluster status will switch to STOPPED and the cluster will be disabled. No action is required. |
| STOPPED | The cluster is stopped | Start the cluster to get it running again. |
| STARTING | Starting the cluster that was stopped earlier | After a while, the cluster status will switch to RUNNING. Wait a while and get started. |
| UPDATING | Updating the cluster's configuration | Once the update is complete, the cluster will get the status it had prior to the update: RUNNING or STOPPED. |
| ERROR | Error when performing an operation with the cluster or during a maintenance window | If the cluster remains in this status for a long time, contact support. You can see whether a cluster is available by its status. |
| STATUS_UNKNOWN | The cluster is unable to determine its status | If the cluster remains in this status for a long time, contact support. |
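
Several of the suggested actions above amount to waiting for an intermediate status to pass. A script could automate that wait with a simple polling loop, sketched below. This is an illustration and not an official client: `fetch_status` is a hypothetical callable that returns the current status string (for example, by calling the API), and treating CREATING, STARTING, STOPPING, and UPDATING as the intermediate statuses is an assumption based on the descriptions above.

```python
import time
from typing import Callable

# Assumption: these are the intermediate statuses worth waiting on.
TRANSITIONAL = {"CREATING", "STARTING", "STOPPING", "UPDATING"}


def wait_for_status(
    fetch_status: Callable[[], str],
    target: str = "RUNNING",
    attempts: int = 30,
    delay_seconds: float = 10.0,
    sleep: Callable[[float], None] = time.sleep,
) -> str:
    """Poll until the target status is reached, a non-intermediate status
    appears, or the number of attempts is used up. Returns the last status."""
    status = fetch_status()
    for _ in range(attempts - 1):
        if status == target or status not in TRANSITIONAL:
            break
        sleep(delay_seconds)
        status = fetch_status()
    return status
```

The loop stops early on statuses such as ERROR or STATUS_UNKNOWN, since waiting longer is not the suggested action for them. The caller compares the returned value with the target to decide what to do next.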
