High availability of a Managed Service for MySQL® cluster
This article describes the cluster settings referred to in the service level agreement (SLA).
Number and placement of cluster hosts
A cluster may consist of one or more hosts.
A single-host cluster does not provide high availability. If the master host fails, your cluster will be unavailable for reading and writing until the host is fully recovered.
A cluster with two or more hosts located in different availability zones is tolerant of a single zone failure. For production environments with strict requirements for availability and performance, we recommend using clusters with three or more hosts: if one zone fails, redundancy is maintained and there is no significant drop in performance.
Note
A host with a manually set replication source does not count toward the minimum number of hosts required to ensure high availability of a cluster.
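The host placement rules above can be sketched as a small check. This is only an illustration: the `Host` class, its fields, and the zone names are hypothetical and are not part of any service API.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class Host:
    """Hypothetical description of a cluster host (not a service API type)."""
    name: str
    zone: str
    # True if the replication source for this host was set manually.
    manual_replication_source: bool = False


def counted_hosts(hosts):
    """Return the hosts that count toward high availability."""
    return [h for h in hosts if not h.manual_replication_source]


def tolerates_zone_failure(hosts):
    """True if two or more counted hosts span at least two availability zones."""
    counted = counted_hosts(hosts)
    zones = {h.zone for h in counted}
    return len(counted) >= 2 and len(zones) >= 2


# Example with made-up zone names: the second host has a manually set
# replication source, so only one host counts and the check fails.
cluster = [Host("host-1", "zone-a"), Host("host-2", "zone-b", True)]
print(tolerates_zone_failure(cluster))
```

The check covers only the two-host minimum; it does not encode the recommendation to use three or more hosts in production.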
Replication and master failover settings
High availability is achieved through replication and master failover, which work as follows:
- Clusters use a mechanism for automatic selection of and failover to a new master. If the master host fails, one of its replicas becomes the new master. You can also select a new master and switch to it manually.
- For any replica, you can manually select a host as the replication source. Such a replica will not be involved in the master selection and failover mechanism.
- If you enable public access for the master host, you must also enable it for the replicas; otherwise, the cluster will become unavailable following master failover.
- Using the current master's FQDN simplifies application development; however, your cluster will be temporarily unavailable while switching to a new master. To switch to a new master quickly, implement detection of the new master on the application side.
- Managed Service for MySQL® clusters use semi-sync replication: by default, the master waits for a transaction to be completed on at least one replica. You can increase the minimum number of replicas that must confirm a transaction using the MySQL® Rpl semi sync master wait for slave count setting. We recommend setting Rpl semi sync master wait for slave count to at least the maximum number of hosts per zone, not including hosts with a manually selected replication source. Each transaction will then be confirmed by at least one replica in another availability zone, so that no transactions are lost even if an entire zone fails.
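As a rough illustration of the Rpl semi sync master wait for slave count recommendation, the sketch below computes the maximum number of hosts per zone while skipping hosts with a manually selected replication source. The function name and the `(zone, manual_source)` tuple layout are assumptions made for this example only.

```python
from collections import Counter


def recommended_wait_for_slave_count(hosts):
    """Return the suggested minimum for Rpl semi sync master wait for slave count.

    hosts: iterable of (zone, manual_source) tuples, where manual_source is
    True for a host with a manually selected replication source.
    """
    # Count only hosts that replicate from the automatically selected master.
    per_zone = Counter(zone for zone, manual_source in hosts if not manual_source)
    # Never go below 1, the default stated in the text.
    return max(1, max(per_zone.values(), default=1))


# Made-up layout: two hosts in zone-a, one each in zone-b and zone-c.
layout = [("zone-a", False), ("zone-a", False), ("zone-b", False), ("zone-c", False)]
print(recommended_wait_for_slave_count(layout))  # prints 2
```

With the value 2 in this layout, a master in zone-a has only one replica in its own zone, so at least one of the two confirmations must come from another zone.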
Available storage space
If the database storage reaches 95% capacity, the cluster will switch to read-only mode. To keep your cluster writable, regularly check its Disk usage chart or create an alert with the disk.used_bytes metric.
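The threshold logic behind such an alert might look like the following sketch. The 95% limit comes from the text above; the 80% warning level, the function name, and the status strings are assumptions chosen for illustration, and the byte values would come from whatever monitoring source you use.

```python
READ_ONLY_RATIO = 0.95  # from the text: the cluster turns read-only at 95% capacity


def disk_status(used_bytes, total_bytes, warn_ratio=0.80):
    """Classify disk usage; warn_ratio is an assumed early-warning level."""
    if total_bytes <= 0:
        raise ValueError("total_bytes must be positive")
    ratio = used_bytes / total_bytes
    if ratio >= READ_ONLY_RATIO:
        return "read-only"
    if ratio >= warn_ratio:
        return "warning"
    return "ok"


# Example: 85 GiB used out of 100 GiB crosses the assumed warning level.
gib = 1024 ** 3
print(disk_status(85 * gib, 100 * gib))  # prints warning
```

Setting the warning level well below 95% leaves time to free up or add storage before the cluster stops accepting writes.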
Maintenance settings
During cluster maintenance and MySQL® version updates, read performance may temporarily drop and the cluster may become temporarily unavailable for writing. Consider the expected load on your cluster when deciding on:
- Maintenance window.
- Time for the MySQL® version upgrade.
Other settings
The following settings may also affect cluster availability:
- Backup settings.
- Storage disk type you selected.
- Host classes.
- Quotas and limits.
- Configuring security groups.
- MySQL® Max connections and Sync binlog settings.