Managing disk space in a Managed Service for Apache Kafka® cluster
When the storage is more than 97% full, the host automatically switches to read-only mode. To avoid issues with writing to the cluster, use one of the following methods:
- Set up alerts in Yandex Monitoring to monitor storage utilization.
- Increase the storage size to automatically disable the read-only mode.
Set up alerts in Yandex Monitoring
- Go to the folder page and select Monitoring.
- Create an alert with the following properties:
  - Metrics: Set the following metric parameters:
    - Cloud
    - Folder
    - Managed Service for Kafka service
    - Managed Service for Kafka cluster ID

      You can get the cluster ID with a list of clusters in the folder.
    - disk.free_bytes label
  - Alert condition: Set the condition on free disk space that triggers the alert:
    - Aggregation function: Minimum (the minimum metric value for the period).
    - Comparison function: Less than or equals.
    - Warning: 95 (95% of the storage size).
    - Alarm: 90 (90% of the storage size).
    - Evaluation window: Preferred metric update period.
    - Evaluation delay: Preferred time shift backward, in seconds. It keeps the alert from triggering spuriously when multiple metrics are specified and collected at different intervals. To learn more about the evaluation delay, see the Yandex Monitoring documentation.
  - Notifications: Add the previously created notification channel.
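The alert above compares raw disk.free_bytes values, so the recommended percentages have to be expressed in bytes of the cluster's total storage. A minimal sketch of that arithmetic (the helper name and the example storage size are illustrative, not part of the service):

```python
def threshold_bytes(total_storage_bytes: int, percent: float) -> int:
    """Convert a percentage of the total storage size into a byte
    threshold for the disk.free_bytes alert condition."""
    return int(total_storage_bytes * percent / 100)

# Example: a 100 GB storage with the recommended 95% / 90% thresholds.
total = 100 * 1024**3
warning = threshold_bytes(total, 95)  # Warning threshold, in bytes
alarm = threshold_bytes(total, 90)    # Alarm threshold, in bytes
```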
Increasing storage size
Make sure the cloud has enough quota to increase the storage size: open the cloud's Quotas page and check the available space.
To increase the cluster storage size:
- Go to the folder page and select Managed Service for Kafka.
- In the cluster row, click the actions icon, then select Edit.
- Edit the settings in the Storage section.

  You cannot change the disk type for an Apache Kafka® cluster once you create it.
- Click Save.
If you do not have the Yandex Cloud command line interface yet, install and initialize it.

The folder specified in the CLI profile is used by default. You can specify a different folder using the --folder-name or --folder-id parameter.
To increase the hosts' storage size:
- View the description of the update cluster CLI command:

  ```bash
  yc managed-kafka cluster update --help
  ```
- To change the broker host storage size, run the command:

  ```bash
  yc managed-kafka cluster update <cluster_name_or_ID> \
    --disk-size <storage_size>
  ```

  If no size units are specified, gigabytes are used.
- To change the ZooKeeper host storage size, run the command:

  ```bash
  yc managed-kafka cluster update <cluster_name_or_ID> \
    --zookeeper-disk-size <disk_size>
  ```

  If no size units are specified, gigabytes are used.
You cannot change the disk type for an Apache Kafka® cluster once you create it.
To increase the cluster storage size:
- Open the current Terraform configuration file with an infrastructure plan.

  For more information about creating this file, see Creating clusters.
- In the Managed Service for Apache Kafka® cluster description, change the value of the disk_size parameter in the kafka.resources and zookeeper.resources blocks for Apache Kafka® and ZooKeeper hosts, respectively:

  ```hcl
  resource "yandex_mdb_kafka_cluster" "<cluster_name>" {
    ...
    kafka {
      resources {
        disk_size = <storage_size_in_GB>
        ...
      }
      ...
    }
    zookeeper {
      resources {
        disk_size = <storage_size_in_GB>
        ...
      }
    }
  }
  ```
You cannot change the disk type for an Apache Kafka® cluster once you create it.
- Make sure the settings are correct.
  - Using the command line, navigate to the folder that contains the up-to-date Terraform configuration files with an infrastructure plan.
  - Run the command:

    ```bash
    terraform validate
    ```

    If there are errors in the configuration files, Terraform will point them out.
- Confirm updating the resources.
  - Run the command to view the planned changes:

    ```bash
    terraform plan
    ```

    If the resource configuration descriptions are correct, the terminal will display a list of the resources to modify and their parameters. This is a test step: no resources are updated.
  - If you are happy with the planned changes, apply them:
    - Run the command:

      ```bash
      terraform apply
      ```
    - Confirm the update of resources.
    - Wait for the operation to complete.
For more information, see the Terraform provider documentation.
Time limits
The Terraform provider limits the amount of time for all Managed Service for Apache Kafka® cluster operations to complete to 60 minutes.
Operations exceeding the set timeout are interrupted.
How do I change these limits?
Add the timeouts block to the cluster description, for example:

```hcl
resource "yandex_mdb_kafka_cluster" "<cluster_name>" {
  ...
  timeouts {
    create = "1h30m" # 1 hour 30 minutes
    update = "2h"    # 2 hours
    delete = "30m"   # 30 minutes
  }
}
```
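The timeout values use Go-style duration strings. A quick way to sanity-check what a value like 1h30m amounts to (this small parser is an illustration, not part of the Terraform provider):

```python
import re

def parse_duration(s: str) -> int:
    """Parse a simple Go-style duration ("1h30m", "2h", "30m")
    into a number of seconds."""
    units = {"h": 3600, "m": 60, "s": 1}
    parts = re.findall(r"(\d+)([hms])", s)
    # Reject strings with leftover characters the pattern did not consume.
    if not parts or "".join(n + u for n, u in parts) != s:
        raise ValueError(f"unsupported duration: {s!r}")
    return sum(int(n) * units[u] for n, u in parts)

print(parse_duration("1h30m"))  # 5400 seconds
```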
To increase the cluster storage size, use the update REST API method for the Cluster resource or the ClusterService/Update gRPC API call and provide the following in the request:
- Cluster ID in the clusterId parameter. To find out the cluster ID, get a list of clusters in the folder.
- New storage settings in the configSpec.kafka.resources parameter (configSpec.zookeeper.resources for ZooKeeper hosts).
- List of settings to update in the updateMask parameter.
You cannot change the disk type for an Apache Kafka® cluster once you create it.
Warning
The API method will assign default values to all the parameters of the object you are modifying unless you explicitly provide them in your request. To avoid this, list the settings you want to change in the updateMask parameter as a single comma-separated string.
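The updateMask rule above can be sketched as a request-body builder. The updateMask and configSpec paths come from this section; the inner diskSize field name, and passing int64 values as strings, are assumptions to verify against the API reference (the cluster ID goes into the request URL, not the body):

```python
import json

def build_update_request(disk_size_bytes: int, zookeeper: bool = False) -> dict:
    """Build an update request body that changes only the storage size.

    Listing the field in updateMask keeps every other cluster
    setting at its current value instead of resetting it.
    """
    section = "zookeeper" if zookeeper else "kafka"
    return {
        "updateMask": f"configSpec.{section}.resources.diskSize",
        "configSpec": {
            # int64 values are conventionally passed as strings in JSON APIs.
            section: {"resources": {"diskSize": str(disk_size_bytes)}},
        },
    }

print(json.dumps(build_update_request(200 * 1024**3), indent=2))
```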
Setting up automatic increase of storage size
To prevent the cluster disk space from running out, set up automatic storage increase.
Make sure the cloud has enough quota to increase the storage size: open the cloud's Quotas page and check the available space.
Warning
- You cannot decrease the storage size.
- While resizing the storage, cluster hosts will be unavailable.
- Go to the folder page and select Managed Service for Kafka.
- In the cluster row, click the actions icon, then select Edit.
- Under Automatic increase of storage size, set the storage utilization thresholds that trigger an increase in storage size when reached:
  - In the Increase size field, select one or both thresholds:
    - In the maintenance window when full at more than: Scheduled increase threshold. When reached, the storage size increases during the next maintenance window.
    - Immediately when full at more than: Immediate increase threshold. When reached, the storage size increases immediately.
  - Specify a threshold value as a percentage of the total storage size. If you select both thresholds, make sure the immediate increase threshold is higher than the scheduled one.
  - Set Maximum storage size.
If you do not have the Yandex Cloud command line interface yet, install and initialize it.

The folder specified in the CLI profile is used by default. You can specify a different folder using the --folder-name or --folder-id parameter.
To set up automatic increase of storage size:
- View the description of the update cluster CLI command:

  ```bash
  yc managed-kafka cluster update --help
  ```
- Set the maximum storage size and the conditions for its increase in the update cluster command:

  ```bash
  yc managed-kafka cluster update <cluster_ID_or_name> \
    --disk-size-autoscaling planned-usage-threshold=<scheduled_increase_percentage>,emergency-usage-threshold=<immediate_increase_percentage>,disk-size-limit=<maximum_storage_size_in_bytes>
  ```
  Where:
  - planned-usage-threshold: Storage utilization percentage that triggers a storage increase in the next maintenance window.

    Use a percentage value between 0 and 100. The default value is 0 (automatic increase is disabled). If you set this parameter, configure the maintenance window schedule.
  - emergency-usage-threshold: Storage utilization percentage that triggers an immediate storage increase.

    Use a percentage value between 0 and 100. The default value is 0 (automatic increase is disabled). This parameter value must be greater than or equal to planned-usage-threshold.
  - disk-size-limit: Maximum storage size, in bytes, that can be set when utilization reaches one of the specified percentages.

    If the value is 0, automatic increase of storage size is disabled.
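The constraints on these parameters can be sketched as a validation helper. The rules are taken from this section; treating a threshold of 0 as "trigger disabled" when comparing the two thresholds is an interpretation, not documented behavior:

```python
def validate_autoscaling(planned: int, emergency: int, limit_bytes: int) -> None:
    """Check disk-size-autoscaling settings against the documented rules."""
    for name, value in (("planned-usage-threshold", planned),
                        ("emergency-usage-threshold", emergency)):
        if not 0 <= value <= 100:
            raise ValueError(f"{name} must be between 0 and 100, got {value}")
    # The immediate threshold must not be lower than the scheduled one
    # (compared only when both triggers are enabled -- an assumption).
    if planned and emergency and emergency < planned:
        raise ValueError("emergency-usage-threshold must be greater than "
                         "or equal to planned-usage-threshold")
    if limit_bytes == 0 and (planned or emergency):
        raise ValueError("disk-size-limit is 0: automatic increase is disabled")

validate_autoscaling(planned=80, emergency=90, limit_bytes=400 * 1024**3)  # passes
```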
To enable automatic increase of storage size, use the update REST API method for the Cluster resource or the ClusterService/Update gRPC API call and provide the following in the request:
- Storage utilization percentage that triggers a storage increase in the next maintenance window, in the configSpec.diskSizeAutoscaling.plannedUsageThreshold parameter.

  Use a value between 0 and 100%. The default value is 0 (automatic increase is disabled). If you set this parameter, configure the maintenance window schedule.
- Storage utilization percentage that triggers an immediate storage increase, in the configSpec.diskSizeAutoscaling.emergencyUsageThreshold parameter.

  Use a value between 0 and 100%. The default value is 0 (automatic increase is disabled). This parameter value must be greater than or equal to configSpec.diskSizeAutoscaling.plannedUsageThreshold.
- Maximum storage size, in bytes, that can be set when utilization reaches one of the specified percentages, in the configSpec.diskSizeAutoscaling.diskSizeLimit parameter.
If the specified threshold is reached, the storage size may increase by a different amount depending on the disk type:
- For network HDDs and SSDs: by the higher of the two values, 20 GB or 20% of the current disk size.
- For non-replicated SSDs: by 93 GB.
- For local SSDs, depending on the cluster platform:
  - Intel Cascade Lake: by 100 GB.
  - Intel Ice Lake: by 368 GB.
If the threshold is reached again, the storage size will be automatically increased until it reaches the specified maximum. After that, you can specify a new maximum storage size manually.
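The increment rules above can be sketched as a helper that predicts the storage size after one automatic increase step. The disk type identifiers used as keys here are illustrative (the platform-specific local SSD cases are folded into separate keys), so adapt them before relying on this:

```python
def next_disk_size(disk_type: str, current_bytes: int, limit_bytes: int) -> int:
    """Compute the storage size after one automatic increase step,
    capping the result at the configured maximum (disk-size-limit)."""
    gb = 1024**3
    if disk_type in ("network-hdd", "network-ssd"):
        # The higher of 20 GB or 20% of the current disk size.
        step = max(20 * gb, current_bytes // 5)
    elif disk_type == "network-ssd-nonreplicated":
        step = 93 * gb
    elif disk_type == "local-ssd-cascade-lake":  # illustrative key
        step = 100 * gb
    elif disk_type == "local-ssd-ice-lake":      # illustrative key
        step = 368 * gb
    else:
        raise ValueError(f"unknown disk type: {disk_type}")
    return min(current_bytes + step, limit_bytes)
```

For example, a 50 GB network HDD grows to 70 GB (the 20 GB floor applies), while a 200 GB network SSD grows by 20% to 240 GB.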
Warning
- You cannot decrease the storage size.
- While resizing the storage, cluster hosts will be unavailable.