Scaling types
When creating an instance group, you need to choose its scaling type, which determines whether the number of instances in the group changes automatically or is managed manually.
Note
If you pause processes (switch them to the PAUSED status) in an instance group, it will not scale up.
Manually scaled groups
You can create fixed-size instance groups and manage their size manually based on your current computing needs.
Automatically scaled groups
When creating an automatically scaled instance group, you specify the target metric value, while the service continuously re-adjusts the number of instances:
- If the average metric value exceeds the target, Instance Groups will create new instances in the group.
- If the average metric value drops so low that it would stay at or below the target even with fewer instances in the group, Instance Groups will delete the unnecessary instances.
This is done to ensure that the average metric value within the same availability zone or the entire group (depending on the automatic scaling type) does not differ much from the target value.
For example, let's assume there are four instances in an availability zone with an average metric value of 70 and a target value of 80. Instance Groups will not reduce the group size, because with one instance deleted, the average value would exceed the target: 4 × 70 / 3 ≈ 93.3. When the average value drops to 60, Instance Groups will delete one instance, since the resulting average would not exceed the target: 4 × 60 / 3 = 80.
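The scale-down check in this example can be sketched as follows. This is a minimal illustration rather than the service's actual implementation: the function name can_remove_instance is hypothetical, and the sketch assumes the total load is redistributed evenly across the remaining instances.

```python
def can_remove_instance(instance_count: int, average: float, target: float) -> bool:
    """Return True if the average would not exceed the target with one instance fewer.

    Assumption for illustration: the total load (instance_count * average)
    is spread evenly over the remaining instances.
    """
    if instance_count <= 1:
        # Assumption of this sketch: never shrink below one instance.
        return False
    projected_average = instance_count * average / (instance_count - 1)
    return projected_average <= target


print(can_remove_instance(4, 70, 80))  # False: 4 * 70 / 3 is about 93.3, above the target
print(can_remove_instance(4, 60, 80))  # True: 4 * 60 / 3 = 80, not above the target
```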
If multiple metrics are specified in the settings, the largest estimated instance group size is used.
For automatically scaled groups, you need to specify common scaling settings and metric settings.
Type of automatic scaling
Instance Groups can adjust the number of instances separately in each availability zone specified in the group settings or in the entire instance group:
- With zonal scaling, Instance Groups will calculate the average metric value for scaling and required number of instances separately for each availability zone. This type of automatic scaling is used by default.
- With regional scaling, the metric value and the number of instances are calculated for the entire group. To change the group auto scaling type to regional, specify the auto_scale scaling policy with the auto_scale_type: REGIONAL key.
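To illustrate the difference between the two scaling types, here is a small sketch that averages the same per-instance values per zone and across the whole group. The zone names and metric values are made up for the example, and the code only mirrors the averaging described above, not the service internals.

```python
from collections import defaultdict
from statistics import mean

# Hypothetical per-instance metric values, tagged with made-up zone names.
instances = [("zone-a", 90), ("zone-a", 70), ("zone-b", 30), ("zone-b", 50)]

# Zonal scaling: one average per availability zone.
values_by_zone = defaultdict(list)
for zone, value in instances:
    values_by_zone[zone].append(value)
zonal_averages = {zone: mean(values) for zone, values in values_by_zone.items()}

# Regional scaling: a single average for the entire group.
regional_average = mean(value for _, value in instances)

print(zonal_averages)    # one value per zone
print(regional_average)  # one value for the group
```

With zonal scaling, the heavily loaded zone may grow while the other one stays as it is; with regional scaling, only the group-wide average is compared with the target.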
General settings
To make scaling less sensitive to short-term fluctuations, you can configure the following in Instance Groups:
- Stabilization period: After the number of VMs increases, the group size will not decrease until the stabilization period ends, even if the average metric value has dropped low enough.
- Warm-up period: Period after a VM starts during which the following values from this VM are not used:
  - CPU utilization.
  - Monitoring metric values that are applied according to the UTILIZATION rule.
  Average metric values for the group will be used instead.
- Utilization measurement period: Metric value will be calculated as an average of all measurements taken during the specified period.
  For example, the CPU load may rise to 100% for one second and then drop to 10%. To ignore such surges, Instance Groups will use average values for the specified period, such as one minute.
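The effect of the measurement period can be seen in a short sketch. The per-second sampling below is an assumption made to match the example; the actual sampling frequency of the service may differ.

```python
from statistics import mean

# Hypothetical per-second CPU samples over a one-minute measurement period:
# a one-second surge to 100%, then 10% for the remaining 59 seconds.
samples = [100.0] + [10.0] * 59

average = mean(samples)
print(f"peak: {max(samples)}%, average over the period: {average}%")
```

The averaged value stays close to the 10% baseline, so a single surge like this one would not push the metric above a typical target.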
You can also set limits on the number of instances per group:
- Maximum group size: Instance Groups will not create more instances if a group already contains this many.
- Minimum size in a single availability zone: Instance Groups will not delete instances from an availability zone if there are only this many instances in the zone.
Metrics for automatic scaling
You can use the following metrics for automatic scaling:
CPU utilization
Instance Groups can control the group size to keep the average CPU utilization at the target level. The average CPU utilization is calculated across instances, either separately for each availability zone or for the entire group (for the zonal or regional scaling type, respectively).
Here is what Instance Groups will do outside the stabilization period:
- Calculate the average CPU utilization during the specified measurement period for each instance, except those that are still warming up. The load is measured several times per minute on every instance.
- Use the obtained values to calculate the average load for each availability zone or across the entire group.
  For example, let's assume there is a group of four instances located in one availability zone. One of the instances is still starting, while the others show an average load of 90%, 75%, and 85% during the measurement period. The average load across the zone is: (90 + 75 + 85) / 3 ≈ 83.3%
- Obtain the total load, i.e., multiply the resulting average load by the total number of instances.
  In our example, it is 250 / 3 × 4 ≈ 333.3%
- Divide the total load by the target load level to obtain the number of instances required (the result is rounded up).
  Say, for example, the target level is 75%. This means that you need 333.3 / 75 ≈ 4.44, rounded up to 5 instances. Since the group currently has four instances, Instance Groups will create one more.
Once the number of instances is calculated and changed (if required), Instance Groups will start calculating the average load again.
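The calculation above can be condensed into a few lines. This is a sketch of the arithmetic only; the function name required_instances and its parameters are illustrative and not part of any Instance Groups API.

```python
import math


def required_instances(loads, total_instances, target):
    """Estimate the required group size from per-instance average CPU loads (in percent).

    loads: averages for the instances that have finished warming up.
    total_instances: all instances in the zone or group, including warming-up ones.
    target: target CPU utilization in percent.
    """
    average_load = sum(loads) / len(loads)       # average across measured instances
    total_load = average_load * total_instances  # total load for the zone or group
    return math.ceil(total_load / target)        # divide by the target and round up


# Four instances, one still warming up, target level of 75%.
print(required_instances([90, 75, 85], total_instances=4, target=75))  # 5
```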
Monitoring metrics
You can use up to three Monitoring metrics for automatic scaling in Instance Groups. To read the metrics, the service account linked to the instance group needs at least the monitoring.viewer role.
When using monitoring metrics, specify the following in Instance Groups:
- Metric name you specified in Monitoring.
- Labels you specified in Monitoring:
  - (optional) folder_id: ID of the folder. By default, it is the ID of the folder the group belongs to.
  - (optional) service: ID of the service. The default value is custom. Labels can be used to specify service metrics, such as service with the compute value for Compute Cloud.
  You will also need to specify other labels for this metric.
- Metric type that affects how Instance Groups will calculate the average metric value:
  - GAUGE: Used for metrics that show the metric value at a specific point in time, such as the number of requests per second to a server running on an instance. Instance Groups computes the average metric value for the specified averaging period.
  - COUNTER: Used for metrics that grow uniformly over time, such as the total number of requests to a server running on an instance. Instance Groups calculates the average metric growth for the specified averaging period.
- Metric rule type:
  - UTILIZATION: Metric will show resource consumption by a single instance.
    The number of instances per availability zone or in the entire group (for the zonal or regional scaling type, respectively) by the UTILIZATION metric is calculated in the same way as the number of instances by CPU utilization.
    When delivered in Monitoring, the UTILIZATION metric must have the instance_id label.
  - WORKLOAD: Metric will show the total workload on all instances in a single availability zone or the entire group (for the zonal or regional scaling type, respectively).
    To calculate the number of instances per availability zone or in the entire group by the WORKLOAD metric, the average metric value is divided by the target value, and the result is rounded up.
    For example, let's assume there are two instances in an availability zone. The metric shows the total number of requests per second (RPS) to all instances. If the target metric value is 200, then, with an average value of 450, Instance Groups will increase the number of instances in the availability zone to three: 450 / 200 = 2.25, rounded up to 3 instances.
    The metric value is also calculated and used during the instance warm-up period specified in the general settings.
    If zonal scaling is applied to the group, when delivered in Monitoring, the WORKLOAD metric must have the zone_id label.
- Target metric value by which Instance Groups calculates the required number of VM instances. For UTILIZATION metrics, the target value is the required level of resource consumption by each instance. For WORKLOAD metrics, it is the maximum allowed workload on each instance.
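The WORKLOAD rule described above reduces to a single division rounded up. The sketch below reproduces the RPS example; the function name workload_instances is hypothetical.

```python
import math


def workload_instances(average_value: float, target: float) -> int:
    """Instances needed for a WORKLOAD metric: average value / target, rounded up."""
    return math.ceil(average_value / target)


# Total of 450 RPS across the zone, target of 200 RPS per instance.
print(workload_instances(450, 200))  # 3
```

Unlike the UTILIZATION calculation, the current number of instances does not enter the formula, because the metric already reports the total workload.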
Calculating the average metric value
The average metric value is calculated using an exponential moving average.
The normal average is calculated using the following formula:
$$\bar{f} = \frac{1}{t_2 - t_1} \int_{t_1}^{t_2} f(t)\,dt$$
where $f(t)$ is the function of the metric in the $[t_1, t_2]$ time range.
There may be spikes in metric values over the entire time range. The normal average is calculated without regard to when the spike occurred — whether closer to the beginning or end of the time range. This can cause a VM group to scale excessively and increase the costs of resources.
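To account for the time of a metric spike, an exponential moving average is used:
$$\bar{f}_w = \frac{\int_{t_1}^{t_2} f(t)\,w(t)\,dt}{\int_{t_1}^{t_2} w(t)\,dt}$$
where $w(t) = e^{kt}$, $k > 0$, is the weight function over the $[t_1, t_2]$ time range, allowing you to assign larger weights to the values of the metric function $f(t)$ at the end of the segment, i.e., closer to the current time.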
The factor depends on how long the metric is measured and is calculated using this formula:
where is the time of measuring the metric in seconds, .
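The difference between the two averages can be illustrated with a discrete sketch. Everything here is an assumption made for illustration: the weights take the exponential form exp(k · t) with time scaled to the range from 0 to 1, the value of k is arbitrary rather than the one the service derives from the measurement time, and the sample series are made up.

```python
import math


def plain_average(values):
    """Ordinary mean: every sample counts equally, regardless of when it occurred."""
    return sum(values) / len(values)


def exp_weighted_average(values, k):
    """Weighted mean with weights exp(k * t), k > 0, so later samples count more.

    t is the sample position scaled to [0, 1]; requires at least two samples.
    k is an illustrative parameter, not the factor used by the service.
    """
    n = len(values)
    weights = [math.exp(k * i / (n - 1)) for i in range(n)]
    return sum(v * w for v, w in zip(values, weights)) / sum(weights)


# The same spike placed at the start and at the end of the time range.
early_spike = [100, 10, 10, 10, 10, 10]
late_spike = [10, 10, 10, 10, 10, 100]

k = 3.0  # assumed value, for illustration only
print(plain_average(early_spike), plain_average(late_spike))
print(exp_weighted_average(early_spike, k), exp_weighted_average(late_spike, k))
```

The plain average is identical for both series, while the weighted average is noticeably lower for the early spike and higher for the late one, which is the behavior the weight function is meant to provide.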