Yandex Compute Cloud instance groups during a zonal incident
A zonal incident is a Yandex Cloud infrastructure failure affecting a single availability zone within a region. Such incidents may impact the availability of resources in the affected zone as well as the behavior of multizonal resources, e.g., instance groups.
The allocation policy distinguishes between these two instance group types:
- Zonal: All instances in the group are located in one availability zone.
- Multizonal: Instances in the group are distributed across multiple availability zones.
Instance group behavior during a zonal incident depends on the group type and on whether the group has instances in the incident zone:
- Zonal group with instances outside the incident zone
- Zonal group with instances in the incident zone
- Multi-zonal group with instances outside the incident zone
- Multi-zonal group with instances in the incident zone
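The four cases above can be told apart from two inputs: the zones used by the group's allocation policy and the zone where the incident occurred. The following Python sketch illustrates this. The enum and function names are hypothetical and are not part of any Yandex Cloud SDK; the zone lists are assumed to come from your own inventory.

```python
from enum import Enum


class Scenario(Enum):
    # Hypothetical labels mirroring the four cases listed above.
    ZONAL_OUTSIDE = "zonal group with instances outside the incident zone"
    ZONAL_INSIDE = "zonal group with instances in the incident zone"
    MULTIZONAL_OUTSIDE = "multi-zonal group with instances outside the incident zone"
    MULTIZONAL_INSIDE = "multi-zonal group with instances in the incident zone"


def classify_group(group_zones, incident_zone):
    """Return the scenario for a group, given its zones and the incident zone."""
    zones = set(group_zones)
    if not zones:
        raise ValueError("an instance group must use at least one zone")

    affected = incident_zone in zones
    if len(zones) == 1:
        # One zone in the allocation policy means a zonal group.
        return Scenario.ZONAL_INSIDE if affected else Scenario.ZONAL_OUTSIDE
    # Two or more zones mean a multizonal group.
    return Scenario.MULTIZONAL_INSIDE if affected else Scenario.MULTIZONAL_OUTSIDE


if __name__ == "__main__":
    scenario = classify_group(["ru-central1-b", "ru-central1-d"], "ru-central1-a")
    print(scenario.value)
```

Running such a check over all your groups at the start of an incident is one way to decide which groups need attention first.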
Also, zonal incidents affect the behavior of services running on instance groups.
Warning
In case of a zonal incident, you need to take steps on your own to mitigate its adverse effects.
After the incident, instance groups are automatically fully recovered. If you modified the instance group configuration prior to the incident, it will be restored once the incident is over.
Zonal group with instances outside the incident zone
Zonal incidents have no impact on zonal instance groups located outside the incident zone.
Zonal group with instances in the incident zone
Warning
In case of a zonal incident, a zonal instance group located in the incident zone may be fully unavailable.
Once the incident is over, the group will be fully recovered. If the instance group was modified during the incident, e.g., the number of instances or their template was updated, these updates will automatically apply after the incident.
Tip
During an incident, create a new instance group in a healthy availability zone.
Also, to ensure fault tolerance, we recommend hosting instance groups in all availability zones.
Multi-zonal group with instances outside the incident zone
Zonal incidents have no impact on multizonal instance groups whose instances are all located in unaffected zones.
For example, this applies if the incident only affected the ru-central1-a zone and the group’s instances are located in the ru-central1-b and ru-central1-d zones.
Tip
During an incident, avoid changing the instance group’s allocation policy in a way that places instances in the affected zone.
Multi-zonal group with instances in the incident zone
Warning
In case of a zonal incident, all instances located in the incident zone may be unavailable.
Zonal incidents affect instance group behavior in the incident zone. For the most part, such an instance group will continue to work correctly in the other zones. For more information, see Multi-zonal Yandex Compute Cloud instance group with instances in the incident zone.
Yandex Cloud services running on instance groups
Some Yandex Cloud services base their resources on instance groups, e.g.:
- Yandex Application Load Balancer uses instance groups as L7 load balancer resource units.
- Yandex Managed Service for Kubernetes uses instance groups as cluster node groups.
- Yandex Managed Service for YDB uses instance groups as cluster nodes in dedicated mode.
Tip
To mitigate the effects of a zonal incident for such services, we recommend the same measures as when working with instance groups directly.
To compensate for capacity lost in the affected zone, set the maximum number of nodes or resource units high enough to include a performance buffer.
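One possible way to estimate such a buffer, assuming that nodes or resource units are spread evenly across zones and that you want to withstand the loss of exactly one zone, is to size each zone so that the remaining zones still cover the required capacity. The sketch below is illustrative arithmetic only: the function name and parameters are hypothetical, and it does not call any Yandex Cloud API.

```python
import math


def size_with_zone_buffer(required_units, zone_count):
    """Return (units_per_zone, total_units) that tolerate losing one zone.

    Assumption: units are distributed evenly across zones, and at most
    one zone is unavailable at a time.
    """
    if required_units < 1:
        raise ValueError("required_units must be at least 1")
    if zone_count < 2:
        raise ValueError("at least two zones are needed to tolerate a zone loss")

    # The surviving (zone_count - 1) zones must carry the full required load.
    units_per_zone = math.ceil(required_units / (zone_count - 1))
    return units_per_zone, units_per_zone * zone_count


if __name__ == "__main__":
    per_zone, total = size_with_zone_buffer(required_units=6, zone_count=3)
    print(f"per zone: {per_zone}, total: {total}")
```

For example, under these assumptions a workload that needs 6 nodes across 3 zones would be sized at 3 nodes per zone (9 in total), so that 6 nodes remain if one zone is lost.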