Yandex project
© 2025 Yandex.Cloud LLC

Scaling policy

Written by
Yandex Cloud
Updated at April 18, 2025

When creating an instance group, you can choose how to increase and decrease the number of instances in the group.

The policy is defined in the scale_policy key in the YAML file.

fixed_scale

The fixed_scale key defines a fixed-size instance group. The group size is defined in the size key. You can create a group with the required number of instances within the available quotas and limits.

Here is what a YAML file entry may look like:

...
scale_policy:
  fixed_scale:
    size: 3
...

Where:

Key Value
fixed_scale Fixed-size instance group
size* Number of VM instances in the group.
Acceptable values: 0 to 100.

* This is a required field.

auto_scale

The auto_scale key defines an automatically scalable instance group. The initial size of the group is defined in the initial_size key. You can create a group with the required number of instances within the available quotas and limits.

The VM instance group will be scaled based on the specified metrics: CPU utilization (the cpu_utilization_rule key) and/or Yandex Monitoring metrics. If multiple metrics are specified in the file, the largest estimated VM instance group size is used.

Here is what a YAML file entry may look like:

scale_policy:
  auto_scale:
    auto_scale_type: REGIONAL
    initial_size: 5
    max_size: 15
    min_zone_size: 3
    measurement_duration: 60s
    warmup_duration: 60s
    stabilization_duration: 120s
    cpu_utilization_rule:
      utilization_target: 75
    custom_rules:
    - rule_type: WORKLOAD
      metric_type: GAUGE
      metric_name: queue.messages.stored_count
      labels:
        queue: dj6000000002********
      target: 5

Where:

Key Value
auto_scale Automatically scaled instance group
auto_scale_type Type of automatic scaling.
The possible values are:
  • ZONAL (zonal scaling): For each availability zone, its own average scaling metric value and required number of instances are calculated.
  • REGIONAL (regional scaling): The metric value and the number of instances are calculated for the entire group.
The default value is ZONAL.
initial_size* Initial number of instances in the group.
The values range from 1 to 100.
max_size Maximum number of VM instances in the group.
Acceptable values: 0 to 100.
min_zone_size Minimum number of VM instances per availability zone.
Acceptable values: 0 to 100.
measurement_duration Utilization measurement period. The value of each metric is computed as the average of all measurements taken during this period. If this average exceeds the target scaling metric value, Instance Groups will increase the number of VM instances in the group.
The acceptable values range from 60 to 600 seconds. The default value is 60 seconds.
warmup_duration Instance warmup period. This is a period of time following startup during which the traffic is routed to the VM, while the values of metrics from this VM are not used to scale the group. The average values of the group metrics are used instead.
The acceptable values range from 0 to 600 seconds. The default value is zero seconds.
stabilization_duration Stabilization period. After the number of VM instances increases, the group size does not decrease until the stabilization period ends, even if the average scaling metric value drops below the target level.
The acceptable values range from 60 to 1,800 seconds.
cpu_utilization_rule Enables scaling based on the average CPU utilization in the instance group. The target is set in the utilization_target key.
utilization_target Target CPU utilization to be maintained by Instance Groups.
If the average CPU utilization is below the target value, Instance Groups will reduce the number of instances until the number reaches min_zone_size in each availability zone.
If the average CPU utilization is above the target value, Instance Groups will create instances until the group size reaches max_size.
The values range from 10 to 100.
custom_rules List of metrics from Yandex Monitoring for automatic scaling. It can include up to three metrics.
rule_type Metric rule type:
  • UTILIZATION: For metrics describing resource utilization per VM instance.
  • WORKLOAD: For metrics describing total workload on all VM instances.
For more information, see Monitoring metrics.
metric_type Type of metric:
  • GAUGE: The metric reflects the value at a particular point in time.
  • COUNTER: The metric grows monotonically over time.
For more information, see Monitoring metrics.
metric_name Name of the metric in Monitoring.
labels Metric labels from Monitoring.
target Target metric value by which Instance Groups calculates the number of required VM instances. For more information, see Monitoring metrics.

* This is a required field.
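
As a further illustration of the keys in the table, here is a sketch of a zonal policy that combines the CPU rule with a per-instance custom metric. The metric name app.memory_used_percent and the service label are hypothetical placeholders, not real Monitoring metrics, and the numeric values are arbitrary picks from within the documented ranges.

```yaml
scale_policy:
  auto_scale:
    auto_scale_type: ZONAL          # documented default, spelled out for clarity
    initial_size: 3                 # required; 1 to 100
    max_size: 9                     # 0 to 100
    min_zone_size: 1                # 0 to 100
    measurement_duration: 60s       # 60 to 600 seconds
    warmup_duration: 120s           # 0 to 600 seconds
    stabilization_duration: 300s    # 60 to 1,800 seconds
    cpu_utilization_rule:
      utilization_target: 60        # 10 to 100
    custom_rules:                   # up to three metrics
    - rule_type: UTILIZATION        # metric describes utilization per VM instance
      metric_type: GAUGE            # value at a particular point in time
      metric_name: app.memory_used_percent   # hypothetical metric name
      labels:
        service: custom             # hypothetical label
      target: 70
```

With both rules present, the group would be sized by whichever rule yields the larger estimated number of instances. Because the type is ZONAL, that estimate is calculated separately for each availability zone.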

test_auto_scale

The test_auto_scale key defines a fixed-size instance group with autoscaling testing enabled. The Monitoring tab charts present recommendations on how much to increase or decrease the number of instances depending on the selected metric value; the actual number of instances always remains equal to the size key value. You can create a group with the required number of instances within the available quotas and limits.

scale_policy:
  fixed_scale:
    size: 5
  test_auto_scale:
    initial_size: 5
    max_size: 15
    min_zone_size: 3
    measurement_duration: 60s
    warmup_duration: 60s
    stabilization_duration: 120s
    cpu_utilization_rule:
      utilization_target: 75

For test_auto_scale, use the same keys as for auto_scale.
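
Since test_auto_scale accepts the same keys as auto_scale, a test policy can presumably carry custom rules as well. The sketch below reuses the queue metric from the auto_scale example; the values are illustrative only, and the masked queue ID is left as it appears in that example.

```yaml
scale_policy:
  fixed_scale:
    size: 5                         # actual group size; stays fixed during testing
  test_auto_scale:
    initial_size: 5
    max_size: 15
    min_zone_size: 3
    measurement_duration: 60s       # 60 to 600 seconds
    warmup_duration: 60s
    stabilization_duration: 120s
    custom_rules:
    - rule_type: WORKLOAD           # metric describes total workload on all VMs
      metric_type: GAUGE
      metric_name: queue.messages.stored_count
      labels:
        queue: dj6000000002********
      target: 5
```

In this setup the Monitoring tab charts would show the recommended size based on the queue metric, while the group itself stays at five instances.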

Use cases

  • Running an autoscaling instance group

See also

  • Recovery policy
  • Allocation policy
  • Deployment policy
