Scaling types
Select a scaling type for each new instance group. This setting determines how the number of instances in the group is controlled: automatically or manually.
Note
If you pause an instance group's processes (the PAUSED status), the group will not be scaled.
Manually scaled groups
You can create fixed-size instance groups and manage their size manually based on your current computing needs.
Automatically scaled groups
When creating an automatically scaled instance group, you specify the target metric value, while the service continuously re-adjusts the number of instances:
- If the average metric value exceeds the target, Instance Groups will create new instances in the group.
- If the average value drops so low that it would stay at or below the target even in a smaller group, Instance Groups will delete the unnecessary instances.
This is done to ensure that the average metric value within the same availability zone or the entire group (depending on the automatic scaling type) does not differ much from the target value.
For example, let's assume there are 4 instances in an availability zone with an average metric value of 70 and a target value of 80. Instance Groups will not reduce the group size because, if an instance is deleted, the average value will exceed the target value: 4 × 70 / 3 = 93.3. When the average value drops to 60, Instance Groups will delete one instance because the average value will then not exceed the target value: 4 × 60 / 3 = 80.
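The check from this example can be sketched in a few lines of Python. This is only an illustration of the arithmetic, not the service's actual implementation. The function name and the rule that the last remaining instance is never removed are assumptions made for the sketch.

```python
def can_remove_instance(instance_count: int, average: float, target: float) -> bool:
    """Return True if the average would stay at or below the target
    after the same total load is spread over one fewer instance."""
    if instance_count <= 1:
        # Assumption for this sketch: never remove the last instance.
        return False
    projected_average = instance_count * average / (instance_count - 1)
    return projected_average <= target


print(can_remove_instance(4, 70, 80))  # False: 4 * 70 / 3 is about 93.3
print(can_remove_instance(4, 60, 80))  # True: 4 * 60 / 3 equals 80
```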
If multiple metrics are specified in the settings, the largest estimated instance group size is used.
For automatically scaled groups, you need to specify common scaling settings and metric settings.
Type of automatic scaling
Instance Groups can adjust the number of instances separately in each availability zone specified in the group settings or in the entire instance group:
- With zonal scaling, Instance Groups will calculate the average scaling metric value and required number of instances separately for each availability zone. This type of automatic scaling is used by default.
- With regional scaling, the metric value and the number of instances are calculated for the entire group. To change the group's autoscaling type to regional, provide the auto_scale scaling policy with the auto_scale_type: REGIONAL key.
General settings
To make scaling less sensitive to short-term changes, you can configure the following in Instance Groups:
- Stabilization period: After the number of VMs increases, the group size will not decrease until the stabilization period ends, even if the average metric value has dropped low enough.
- Warm-up period: Period after a VM starts during which the following values from that VM are not used:
  - CPU utilization.
  - Monitoring metric values that are applied according to the UTILIZATION rule.
  Average metric values for the group are used instead.
- Utilization measurement period: Metric value will be calculated as an average of all measurements taken during the specified period.
  For example, the CPU load may rise to 100% in one second and then drop to 10%. To ignore such surges, Instance Groups will use average values for the specified period, such as one minute.
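To illustrate how averaging over a measurement period smooths out a short surge, here is a minimal Python sketch. The function name, the (timestamp, value) sample format, and the 10-second sampling interval are assumptions for the example. The service's actual sampling and averaging may differ.

```python
def average_over_period(samples, now, period_seconds):
    """Average all measurements taken within the last `period_seconds`.

    `samples` is a list of (timestamp_in_seconds, value) pairs.
    """
    recent = [value for ts, value in samples if now - period_seconds < ts <= now]
    if not recent:
        raise ValueError("no measurements in the specified period")
    return sum(recent) / len(recent)


# CPU load sampled every 10 seconds, with one short surge to 100%.
samples = [(0, 10), (10, 10), (20, 100), (30, 10), (40, 10), (50, 10)]
print(average_over_period(samples, now=50, period_seconds=60))  # 25.0
```

Averaged over one minute, the single 100% reading contributes only a sixth of the result, so the value used for scaling stays far below the surge.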
You can also set limits on the number of instances per group:
- Maximum group size: Instance Groups will not create more instances if a group already contains this many.
- Minimum size in a single availability zone: Instance Groups will not delete instances from an availability zone if there are only this many instances in the zone.
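The stabilization period and the size limits can be read as guards applied to the calculated group size. The sketch below shows one possible interpretation for a group located in a single availability zone. The function and parameter names are hypothetical, and how the service combines these settings internally is an assumption.

```python
def next_group_size(current, required, seconds_since_last_increase,
                    stabilization_seconds, max_size, min_zone_size):
    """Apply the stabilization period and size limits to a calculated size."""
    if required > current:
        # Do not create instances beyond the maximum group size.
        return max(current, min(required, max_size))
    if required < current:
        # Do not shrink the group until the stabilization period has ended.
        if seconds_since_last_increase < stabilization_seconds:
            return current
        # Do not delete instances below the per-zone minimum.
        return min(current, max(required, min_zone_size))
    return current


print(next_group_size(4, 6, 0, 300, max_size=5, min_zone_size=1))    # 5
print(next_group_size(4, 2, 120, 300, max_size=5, min_zone_size=1))  # 4
print(next_group_size(4, 2, 600, 300, max_size=5, min_zone_size=3))  # 3
```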
Metrics for autoscaling
You can use the following metrics for autoscaling:
CPU utilization
Instance Groups can control the group size to keep average CPU utilization at the target level. The average CPU utilization is calculated across the instances of each availability zone separately or across the entire group (for the zonal or regional scaling type, respectively).
Here is what Instance Groups will do outside the stabilization period:
- Calculate the average CPU utilization during the specified measurement period for each instance, except those that are still warming up. The load is measured several times per minute on every instance.
- Use the obtained values to calculate the average load for each availability zone or across the entire group.
  For example, let's assume there is a group of four instances located in one availability zone. One of the instances is still warming up, while the others are under 90%, 75%, and 85% workload on average during the measurement period. Average zone load: (90 + 75 + 85) / 3 = 83.3%.
- Obtain the total load, i.e., multiply the resulting average load by the total number of instances.
  In our example, it is 83.3 × 4 = 333.2%.
- Divide the total load by the target load level to obtain the number of instances required (the result is rounded up).
  Say, for example, the target level is 75%. This means that you need 333.2 / 75 = 4.44 ~ 5 instances. Since the group currently has four instances, Instance Groups will create one more.
Once the number of instances is calculated and changed (if required), Instance Groups will start calculating the average load again.
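The calculation steps above can be sketched in Python as follows. The function name and argument layout are assumptions for illustration. The code only reproduces the arithmetic of the example and makes no claim about the service's internals.

```python
import math


def required_instances(measured_loads, total_instances, target_load):
    """Estimate the required number of instances from CPU utilization.

    `measured_loads` holds the average load (in percent) of each instance
    that is not warming up; `total_instances` counts all instances,
    including those still warming up.
    """
    average_load = sum(measured_loads) / len(measured_loads)
    total_load = average_load * total_instances
    return math.ceil(total_load / target_load)


# Four instances, one of them warming up, target load of 75%.
print(required_instances([90, 75, 85], total_instances=4, target_load=75))  # 5
```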
Monitoring metrics
You can use up to three Monitoring metrics for automatic scaling in Instance Groups. To read the metrics, the service account linked to the instance group needs the monitoring.viewer role or higher.
When using monitoring metrics, specify the following in Instance Groups:
- Metric name you specified in Monitoring.
- Labels you specified in Monitoring:
  - (Optional) folder_id: Folder ID. By default, it is the ID of the folder the group belongs to.
  - (Optional) service: Service ID. The default value is custom. You can use a label to specify service metrics, e.g., service with the compute value for Compute Cloud.
  You will also need to specify the other labels for this metric.
- Metric type that affects how Instance Groups will calculate the average metric value:
  - GAUGE: Used for metrics displaying the metric value at a specific point in time, e.g., the number of requests per second to a server running on an instance. Instance Groups computes the average metric value for the specified averaging period.
  - COUNTER: Used for metrics that grow monotonically over time, e.g., the total number of requests to a server running on an instance. Instance Groups calculates the metric's average increment for the specified averaging period.
- Metric rule type:
  - UTILIZATION: Metric displays resource consumption by a single instance.
    The number of instances per availability zone or in the entire group (for zonal or regional scaling, respectively) based on the UTILIZATION metric is calculated in the same way as the number of instances based on CPU load.
    When delivered to Monitoring, the UTILIZATION metric must have the instance_id label.
  - WORKLOAD: Metric displays the total load on all instances in one availability zone or the entire group (for zonal or regional scaling type, respectively).
    To calculate the number of instances per availability zone or in the entire group by the WORKLOAD metric, the average metric value is divided by the target value, and the result is rounded up.
    For example, let's assume there are two instances in an availability zone. The metric shows the total number of requests per second (RPS) to all instances. If the target metric value is 200, then with an average value of 450, Instance Groups will increase the number of instances in the availability zone to three: 450 / 200 = 2.25 ~ 3 instances.
    The metric value is also calculated and used during the instance warm-up period specified in the general settings.
    If zonal scaling is applied to the group, the WORKLOAD metric must have the zone_id label when delivered to Monitoring.
- Target metric value based on which Instance Groups calculates the required number of instances. For UTILIZATION metrics, the target value is the desired resource consumption level for each instance; for WORKLOAD metrics, it is the maximum allowed workload for each instance.
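As a rough illustration of how the metric type and the WORKLOAD rule fit together, consider the Python sketch below. All function names are hypothetical. Reading "average increment" as the mean difference between consecutive measurements is an assumption, and the service may define it differently.

```python
import math


def gauge_average(values):
    """GAUGE: average of the point-in-time values over the averaging period."""
    return sum(values) / len(values)


def counter_average_increment(values):
    """COUNTER: average increment between consecutive measurements
    (assumed interpretation for this sketch)."""
    increments = [later - earlier for earlier, later in zip(values, values[1:])]
    return sum(increments) / len(increments)


def workload_instances(average_value, target_value):
    """WORKLOAD rule: divide the average total load by the target and round up."""
    return math.ceil(average_value / target_value)


print(gauge_average([400, 450, 500]))                # 450.0
print(counter_average_increment([1000, 1450, 1900])) # 450.0
print(workload_instances(450, 200))                  # 3
```

With a target of 200 RPS per instance and an average total of 450 RPS, the sketch yields three instances, matching the example in the WORKLOAD description.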
Calculating the average metric value
The average metric value is calculated using an exponential moving average.
The normal average is calculated using the following formula:
$$\frac{\int_0^T f(t)\,dt}{T},$$
where $f(t)$ is the function of the metric in the $[0, T]$ time range.
There may be spikes in metric values throughout the time range. The normal average is calculated without regard to when the spike occurred, whether closer to the beginning or end of the time range. This can cause a VM group to scale excessively and increase resource costs.
To account for the time of a metric spike, an exponential moving average is used:
$$\frac{\int_0^T f(t)\,e^{kt}\,dt}{\int_0^T e^{kt}\,dt},$$
where $e^{kt}$, $k > 0$, is the weight function allowing you to assign larger weights to the function values towards the end of the segment, i.e., closer to the current time.
The factor $k$ depends on how long the metric is measured, i.e., on $T$, the time of measuring the metric in seconds.
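The difference between the normal average and an exponentially weighted one can be seen in a small discrete sketch. The exponential weight form, the evenly spaced samples, and the factor value of 0.5 are assumptions chosen for illustration. They are not the values used by the service.

```python
import math


def plain_average(values):
    """Normal average: every measurement has the same weight."""
    return sum(values) / len(values)


def exp_weighted_average(values, factor):
    """Weighted average with weights growing exponentially over time.

    Measurements are assumed to be evenly spaced at t = 0, 1, ..., n - 1,
    so later (more recent) values get larger weights.
    """
    weights = [math.exp(factor * t) for t in range(len(values))]
    return sum(v * w for v, w in zip(values, weights)) / sum(weights)


early_spike = [100, 10, 10, 10, 10, 10]
late_spike = [10, 10, 10, 10, 10, 100]

# The normal average cannot tell the two series apart.
print(plain_average(early_spike), plain_average(late_spike))

# The weighted average reacts much more strongly to the recent spike.
print(exp_weighted_average(early_spike, 0.5))
print(exp_weighted_average(late_spike, 0.5))
```

A spike that happened long ago barely affects the weighted result, while a recent spike dominates it.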
Use cases
- Running an autoscaling instance group
- Autoscaling an instance group to process messages in Yandex Message Queue