Managing disk space in a Managed Service for Apache Kafka® cluster
When the storage usage exceeds 97%, the host automatically switches to read-only mode. To prevent issues with topic writes, use one of the following methods:
- Set up alerts in Yandex Monitoring to monitor storage usage.
- Increase the storage size to automatically disable read-only mode.
- Set up automatic storage expansion to prevent the disk from running out of free space and the host from switching to read-only mode.
Setting up alerts in Yandex Monitoring
- In the management console, navigate to the relevant folder.
- Go to Monitoring.
- Create an alert with the following settings:
  - Metrics: Configure the following metric settings:
    - Cloud.
    - Folder.
    - Managed Service for Kafka.
    - Managed Service for Apache Kafka® cluster ID. You can get the ID with the list of clusters in the folder.
    - `disk.free_bytes` label.
  - Alert condition: Set the condition on free disk space that triggers the alert:
    - Aggregation function: `Minimum` (the metric's minimum value over the period).
    - Comparison function: `Less than or equals`.
    - Warning: `95` (95% of the storage size).
    - Alarm: `90` (90% of the storage size; see the byte-conversion sketch after this procedure).
    - Evaluation window: Preferred metric update period.
    - Evaluation delay: Preferred backward time shift, in seconds. It keeps the alert from triggering when multiple metrics are specified and collected at different intervals. To learn more about the evaluation delay, see this Yandex Monitoring guide.
  - Notifications: Add the notification channel you created earlier.
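If you enter the Warning and Alarm thresholds as raw metric values, they must be in the same unit as `disk.free_bytes`, i.e., bytes. Below is a minimal conversion sketch, assuming a hypothetical 100 GB storage size; adjust the value to your cluster:

```bash
# Hypothetical conversion of the recommended thresholds into byte values for the
# disk.free_bytes metric (assumes the alert thresholds are entered in bytes).
STORAGE_GB=100                                        # example storage size
STORAGE_BYTES=$((STORAGE_GB * 1024 * 1024 * 1024))
echo "Warning: $((STORAGE_BYTES * 95 / 100)) bytes"   # 95% of the storage size
echo "Alarm:   $((STORAGE_BYTES * 90 / 100)) bytes"   # 90% of the storage size
```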
Increasing your storage size
Make sure the cloud has enough quota to increase the storage size. To check this, open the cloud's Quotas page.
To increase your cluster storage size:
- In the management console, navigate to the relevant folder.
- Go to Managed Service for Kafka.
- In the cluster row, click ⋮ and select Edit.
- Edit the settings under Storage.

  You cannot change the disk type for an Apache Kafka® cluster once the cluster is created.

- Click Save.
If you do not have the Yandex Cloud CLI installed yet, install and initialize it.
By default, the CLI uses the folder specified when creating the profile. To change the default folder, use the yc config set folder-id <folder_ID> command. You can also set a different folder for any specific command using the --folder-name or --folder-id parameter.
To increase your host storage size:
- See the description of the CLI command for updating a cluster:

  ```bash
  yc managed-kafka cluster update --help
  ```

- To resize your broker host storage, run this command (a concrete example follows the note below):

  ```bash
  yc managed-kafka cluster update <cluster_name_or_ID> \
    --disk-size <storage_size>
  ```

  If you specify no size units, gigabytes are used.

- To resize ZooKeeper host storage, run this command:

  ```bash
  yc managed-kafka cluster update <cluster_name_or_ID> \
    --zookeeper-disk-size <disk_size>
  ```

  If you specify no size units, gigabytes are used.
You cannot change the disk type for an Apache Kafka® cluster once the cluster is created.
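For illustration, here is a possible invocation of the broker resize command above; the cluster name `my-kafka` and the 200 GB target are hypothetical, and the value is interpreted as gigabytes because no units are specified:

```bash
# Resize the broker host storage of a hypothetical cluster named "my-kafka" to 200 GB.
yc managed-kafka cluster update my-kafka \
  --disk-size 200
```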
To increase your cluster storage size:
- Open the current Terraform configuration file describing your infrastructure.

  Learn how to create this file in Creating a cluster.

- In the Managed Service for Apache Kafka® cluster description, change the `disk_size` value in the `kafka.resources` and `zookeeper.resources` sections for Apache Kafka® and ZooKeeper hosts, respectively:

  ```hcl
  resource "yandex_mdb_kafka_cluster" "<cluster_name>" {
    ...
    kafka {
      resources {
        disk_size = <storage_size_in_GB>
        ...
      }
      ...
    }
    zookeeper {
      resources {
        disk_size = <storage_size_in_GB>
        ...
      }
    }
  }
  ```

  You cannot change the disk type for an Apache Kafka® cluster once the cluster is created.
- Make sure the settings are correct.

  - In the command line, navigate to the directory that contains the current Terraform configuration files defining the infrastructure.

  - Run this command:

    ```bash
    terraform validate
    ```

    Terraform will show any errors found in your configuration files.
- Confirm updating the resources.

  - Run this command to view the planned changes:

    ```bash
    terraform plan
    ```

    If you described the configuration correctly, the terminal will display a list of the resources to update and their parameters. This is a verification step that does not apply changes to your resources.

  - If everything looks correct, apply the changes:

    - Run this command:

      ```bash
      terraform apply
      ```

    - Confirm updating the resources.

    - Wait for the operation to complete.
For more information, see this Terraform provider guide.
Timeouts
The Terraform provider limits the time for all operations with the Managed Service for Apache Kafka® cluster to 60 minutes.
Operations exceeding the timeout are aborted.
How do I change these limits?
Add the timeouts section to your cluster description, such as the following:
resource "yandex_mdb_kafka_cluster" "<cluster_name>" {
...
timeouts {
create = "1h30m" # 1 hour 30 minutes
update = "2h" # 2 hours
delete = "30m" # 30 minutes
}
}
- Get an IAM token for API authentication and put it into an environment variable:

  ```bash
  export IAM_TOKEN="<IAM_token>"
  ```

- Call the Cluster.update method, e.g., via the following cURL request:

  Warning

  The API method will assign default values to all the parameters of the object you are modifying unless you explicitly provide them in your request. To avoid this, list the settings you want to change in the `updateMask` parameter as a single comma-separated string.

  ```bash
  curl \
    --request PATCH \
    --header "Authorization: Bearer $IAM_TOKEN" \
    --header "Content-Type: application/json" \
    --url 'https://mdb.api.cloud.yandex.net/managed-kafka/v1/clusters/<cluster_ID>' \
    --data '{
              "updateMask": "configSpec.kafka.resources.diskSize,configSpec.zookeeper.resources.diskSize",
              "configSpec": {
                "kafka": {
                  "resources": {
                    "diskSize": "<storage_size_in_bytes>"
                  }
                },
                "zookeeper": {
                  "resources": {
                    "diskSize": "<storage_size_in_bytes>"
                  }
                }
              }
            }'
  ```

  Where:

  - `updateMask`: Comma-separated string of settings you want to update.

    Specify the relevant parameters:

    - `configSpec.kafka.resources.diskSize`: To resize the broker host storage.
    - `configSpec.zookeeper.resources.diskSize`: To resize the ZooKeeper host storage. Use only for Apache Kafka® 3.5 clusters.

  - `configSpec.kafka.resources.diskSize`: Broker host storage size, in bytes.
  - `configSpec.zookeeper.resources.diskSize`: ZooKeeper host storage size, in bytes. Use only for Apache Kafka® 3.5 clusters.

  You can get the cluster ID with the list of clusters in the folder.

- Check the server response to make sure your request was successful.
- Get an IAM token for API authentication and put it into an environment variable:

  ```bash
  export IAM_TOKEN="<IAM_token>"
  ```

- Clone the cloudapi repository:

  ```bash
  cd ~/ && git clone --depth=1 https://github.com/yandex-cloud/cloudapi
  ```

  Below, we assume that the repository contents reside in the `~/cloudapi/` directory.

- Call the ClusterService/Update method, e.g., via the following gRPCurl request:

  Warning

  The API method will assign default values to all the parameters of the object you are modifying unless you explicitly provide them in your request. To avoid this, list the settings you want to change in the `update_mask` parameter as an array of `paths[]` strings.

  Format for listing settings:

  ```json
  "update_mask": {
      "paths": [
          "<setting_1>",
          "<setting_2>",
          ...
          "<setting_N>"
      ]
  }
  ```

  ```bash
  grpcurl \
    -format json \
    -import-path ~/cloudapi/ \
    -import-path ~/cloudapi/third_party/googleapis/ \
    -proto ~/cloudapi/yandex/cloud/mdb/kafka/v1/cluster_service.proto \
    -rpc-header "Authorization: Bearer $IAM_TOKEN" \
    -d '{
          "cluster_id": "<cluster_ID>",
          "update_mask": {
            "paths": [
              "config_spec.kafka.resources.disk_size",
              "config_spec.zookeeper.resources.disk_size"
            ]
          },
          "config_spec": {
            "kafka": {
              "resources": {
                "disk_size": "<storage_size_in_bytes>"
              }
            },
            "zookeeper": {
              "resources": {
                "disk_size": "<storage_size_in_bytes>"
              }
            }
          }
        }' \
    mdb.api.cloud.yandex.net:443 \
    yandex.cloud.mdb.kafka.v1.ClusterService.Update
  ```

  Where:

  - `update_mask`: List of settings you want to update as an array of `paths[]` strings.

    Specify the relevant parameters:

    - `config_spec.kafka.resources.disk_size`: To resize the broker host storage.
    - `config_spec.zookeeper.resources.disk_size`: To resize the ZooKeeper host storage. Use only for Apache Kafka® 3.5 clusters.

  - `config_spec.kafka.resources.disk_size`: Broker host storage size, in bytes.
  - `config_spec.zookeeper.resources.disk_size`: ZooKeeper host storage size, in bytes. Use only for Apache Kafka® 3.5 clusters.

  You can get the cluster ID with the list of clusters in the folder.

- Check the server response to make sure your request was successful.
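Both the REST and gRPC requests above take the storage size in bytes (`diskSize` / `disk_size`). A minimal conversion sketch for a hypothetical 200 GB target:

```bash
# Convert a hypothetical 200 GB target into the byte value expected by the API.
echo $((200 * 1024 * 1024 * 1024))    # 214748364800
```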
Setting up automatic storage expansion
Set up automatic storage expansion to prevent the cluster from running out of disk space and the hosts from switching to read-only mode.
Make sure the cloud has enough quota to increase the storage size. To check this, open the cloud's Quotas page.
Warning
- You cannot reduce the storage size.
- When using local disks (`local-ssd`), cluster hosts will be unavailable while the storage is being resized.
- In the management console, navigate to the relevant folder.
- Go to Managed Service for Kafka.
- In the cluster row, click ⋮ and select Edit.
- Under Automatic increase of storage size, set the storage utilization thresholds that will trigger storage expansion when reached:

  - In the Increase size field, select one or both thresholds:
    - In the maintenance window when full at more than: Scheduled expansion threshold. When reached, the storage expands during the next maintenance window.
    - Immediately when full at more than: Immediate expansion threshold. When reached, the storage expands immediately.
  - Specify a threshold value (as a percentage of the total storage size). If you select both thresholds, make sure the immediate expansion threshold is higher than the scheduled one.
  - Set the Maximum storage size.
If you do not have the Yandex Cloud CLI installed yet, install and initialize it.
By default, the CLI uses the folder specified when creating the profile. To change the default folder, use the yc config set folder-id <folder_ID> command. You can also set a different folder for any specific command using the --folder-name or --folder-id parameter.
To set up automatic storage expansion:
- See the description of the CLI command for updating a cluster:

  ```bash
  yc managed-kafka cluster update --help
  ```

- Run this command, specifying the maximum storage size and the conditions for expansion (a filled-in example follows the parameter list):

  ```bash
  yc managed-kafka cluster update <cluster_ID_or_name> \
    --disk-size-autoscaling planned-usage-threshold=<scheduled_expansion_percentage>,emergency-usage-threshold=<immediate_expansion_percentage>,disk-size-limit=<maximum_storage_size_in_bytes>
  ```

  Where:

  - `planned-usage-threshold`: Storage usage percentage to trigger a storage expansion during the next maintenance window.

    Use a value between `0` and `100`%. The default value is `0`, i.e., automatic expansion is disabled.

    If you set this condition, configure the maintenance window schedule.

  - `emergency-usage-threshold`: Storage usage percentage to trigger an immediate storage expansion.

    Use a value between `0` and `100`%. The default value is `0`, i.e., automatic expansion is disabled. The value of this setting must be greater than or equal to `planned-usage-threshold`.

  - `disk-size-limit`: Maximum storage size, in bytes, that can be set when storage usage reaches one of the specified thresholds.

    If you set it to `0`, automatic storage expansion will be disabled.
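For illustration, here is a possible invocation of the command above with hypothetical values: a cluster named `my-kafka`, scheduled expansion at 80% usage, immediate expansion at 90%, and a 500 GB maximum:

```bash
# Hypothetical example: expand in the maintenance window at 80% usage, immediately at 90%,
# up to a 500 GB limit (500 * 1024^3 = 536870912000 bytes).
yc managed-kafka cluster update my-kafka \
  --disk-size-autoscaling planned-usage-threshold=80,emergency-usage-threshold=90,disk-size-limit=536870912000
```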
- Get an IAM token for API authentication and put it into an environment variable:

  ```bash
  export IAM_TOKEN="<IAM_token>"
  ```

- Call the Cluster.update method, e.g., via the following cURL request:

  Warning

  The API method will assign default values to all the parameters of the object you are modifying unless you explicitly provide them in your request. To avoid this, list the settings you want to change in the `updateMask` parameter as a single comma-separated string.

  ```bash
  curl \
    --request PATCH \
    --header "Authorization: Bearer $IAM_TOKEN" \
    --header "Content-Type: application/json" \
    --url 'https://mdb.api.cloud.yandex.net/managed-kafka/v1/clusters/<cluster_ID>' \
    --data '{
              "updateMask": "configSpec.diskSizeAutoscaling.plannedUsageThreshold,configSpec.diskSizeAutoscaling.emergencyUsageThreshold,configSpec.diskSizeAutoscaling.diskSizeLimit",
              "configSpec": {
                "diskSizeAutoscaling": {
                  "plannedUsageThreshold": "<scheduled_expansion_percentage>",
                  "emergencyUsageThreshold": "<immediate_expansion_percentage>",
                  "diskSizeLimit": "<maximum_storage_size_in_bytes>"
                }
              }
            }'
  ```

  Where:

  - `updateMask`: Comma-separated string of settings you want to update.

    Specify the relevant parameters:

    - `configSpec.diskSizeAutoscaling.plannedUsageThreshold`: To change the storage usage percentage that triggers a scheduled expansion.
    - `configSpec.diskSizeAutoscaling.emergencyUsageThreshold`: To change the storage usage percentage that triggers a non-scheduled expansion.
    - `configSpec.diskSizeAutoscaling.diskSizeLimit`: To change the maximum storage size during automatic expansion.

  - `plannedUsageThreshold`: Storage usage percentage to trigger a storage expansion during the next maintenance window.

    Use a value between `0` and `100`%. The default value is `0`, i.e., automatic expansion is disabled.

    If you set this condition, configure the maintenance window schedule.

  - `emergencyUsageThreshold`: Storage usage percentage to trigger an immediate storage expansion.

    Use a value between `0` and `100`%. The default value is `0`, i.e., automatic expansion is disabled. The value of this setting must be greater than or equal to `plannedUsageThreshold`.

  - `diskSizeLimit`: Maximum storage size, in bytes, that can be set when storage usage reaches one of the specified thresholds.

  You can get the cluster ID with the list of clusters in the folder.

- Check the server response to make sure your request was successful.
- Get an IAM token for API authentication and put it into an environment variable:

  ```bash
  export IAM_TOKEN="<IAM_token>"
  ```

- Clone the cloudapi repository:

  ```bash
  cd ~/ && git clone --depth=1 https://github.com/yandex-cloud/cloudapi
  ```

  Below, we assume that the repository contents reside in the `~/cloudapi/` directory.

- Call the ClusterService/Update method, e.g., via the following gRPCurl request:

  Warning

  The API method will assign default values to all the parameters of the object you are modifying unless you explicitly provide them in your request. To avoid this, list the settings you want to change in the `update_mask` parameter as an array of `paths[]` strings.

  Format for listing settings:

  ```json
  "update_mask": {
      "paths": [
          "<setting_1>",
          "<setting_2>",
          ...
          "<setting_N>"
      ]
  }
  ```

  ```bash
  grpcurl \
    -format json \
    -import-path ~/cloudapi/ \
    -import-path ~/cloudapi/third_party/googleapis/ \
    -proto ~/cloudapi/yandex/cloud/mdb/kafka/v1/cluster_service.proto \
    -rpc-header "Authorization: Bearer $IAM_TOKEN" \
    -d '{
          "cluster_id": "<cluster_ID>",
          "update_mask": {
            "paths": [
              "config_spec.disk_size_autoscaling.planned_usage_threshold",
              "config_spec.disk_size_autoscaling.emergency_usage_threshold",
              "config_spec.disk_size_autoscaling.disk_size_limit"
            ]
          },
          "config_spec": {
            "disk_size_autoscaling": {
              "planned_usage_threshold": "<scheduled_expansion_percentage>",
              "emergency_usage_threshold": "<immediate_expansion_percentage>",
              "disk_size_limit": "<maximum_storage_size_in_bytes>"
            }
          }
        }' \
    mdb.api.cloud.yandex.net:443 \
    yandex.cloud.mdb.kafka.v1.ClusterService.Update
  ```

  Where:

  - `update_mask`: List of settings you want to update as an array of `paths[]` strings.

    Specify the relevant parameters:

    - `config_spec.disk_size_autoscaling.planned_usage_threshold`: To change the storage usage percentage that triggers a scheduled expansion.
    - `config_spec.disk_size_autoscaling.emergency_usage_threshold`: To change the storage usage percentage that triggers a non-scheduled expansion.
    - `config_spec.disk_size_autoscaling.disk_size_limit`: To change the maximum storage size during automatic expansion.

  - `planned_usage_threshold`: Storage usage percentage to trigger a storage expansion during the next maintenance window.

    Use a value between `0` and `100`%. The default value is `0`, i.e., automatic expansion is disabled.

    If you set this condition, configure the maintenance window schedule.

  - `emergency_usage_threshold`: Storage usage percentage to trigger an immediate storage expansion.

    Use a value between `0` and `100`%. The default value is `0`, i.e., automatic expansion is disabled. The value of this setting must be greater than or equal to `planned_usage_threshold`.

  - `disk_size_limit`: Maximum storage size, in bytes, that can be set when storage usage reaches one of the specified thresholds.

  You can get the cluster ID with the list of clusters in the folder.

- Check the server response to make sure your request was successful.
Upon reaching the specified threshold, the storage expands differently depending on the disk type:
- For network HDDs and SSDs: by the higher of the two values, 20 GB or 20% of the current disk size.
- For non-replicated SSDs: by 93 GB.
- For local SSDs, in a cluster based on:
  - Intel Cascade Lake: by 100 GB.
  - Intel Ice Lake: by 368 GB.
If the threshold is reached again, the storage will be automatically expanded until it reaches the specified maximum. After that, you can set a new maximum storage size manually.
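As a worked example of the rule for network disks above, here is a sketch that computes the next size for an assumed 512 GB network SSD:

```bash
# The storage grows by the larger of 20 GB or 20% of the current size (network HDD/SSD rule).
CURRENT_GB=512                                        # assumed current disk size
TWENTY_PERCENT=$((CURRENT_GB * 20 / 100))             # 102 GB (integer arithmetic)
STEP=$(( TWENTY_PERCENT > 20 ? TWENTY_PERCENT : 20 ))
echo "Next size: $((CURRENT_GB + STEP)) GB"           # 614 GB
```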
Warning
- You cannot reduce the storage size.
- When using local disks (`local-ssd`), cluster hosts will be unavailable while the storage is being resized.