High availability of a Managed Service for Apache Kafka® cluster
High availability of a Managed Service for Apache Kafka® cluster depends on the number and placement of its hosts, topic settings, and other cluster parameters.
Number and placement of cluster hosts
The Service Level Agreement (SLA)
Single-host cluster
A single-host cluster is the cheapest and easiest to operate. We recommend using it for test clusters or for production applications where high cluster availability is not critical.
Here is why a single-host cluster is not a high-availability solution:
- If the broker host VM fails, your cluster will be unavailable for reading and writing until the VM is fully recovered.
- If the host switches to read-only mode, your cluster will be unavailable for writing until you manually increase the storage size.
Two-host cluster
For a cluster with two broker hosts, the maximum topic replication factor is two, and the SLA does not apply to such clusters.
Compared to a single-host cluster, a cluster with two hosts offers the following advantages:
- At the application level, you can balance data reads and writes between the two broker hosts, which can improve cluster throughput.
- You can replicate topic partitions if the topic has a replication factor of 2. This ensures availability if one of the cluster hosts fails.
To ensure high availability of your cluster under the SLA, you can increase the number of broker hosts.
Cluster with three or more hosts
A cluster with three or more hosts offers reliable storage and continuous data availability if each of the three availability zones has at least one broker host. Such a cluster meets the high availability criteria and is subject to the SLA.
To qualify for high availability under the SLA, your cluster topics must have the following parameters:
- Replication factor: 3
- Minimum number of in-sync replicas: 2
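As a rough illustration of how these two values work together, the sketch below computes how many replicas a topic can lose before writes that require acknowledgement from in-sync replicas start failing. The function names are hypothetical and are not part of the service API; the arithmetic assumes the standard Apache Kafka® behavior where such writes are rejected once fewer in-sync replicas remain than the configured minimum.

```python
def tolerated_replica_failures(replication_factor: int, min_insync_replicas: int) -> int:
    """Return how many replicas can be lost while acks=all writes still succeed."""
    if min_insync_replicas < 1 or replication_factor < min_insync_replicas:
        raise ValueError("expected 1 <= min_insync_replicas <= replication_factor")
    return replication_factor - min_insync_replicas


def meets_ha_topic_settings(replication_factor: int, min_insync_replicas: int) -> bool:
    """Check a topic against the two values listed above (3 and 2)."""
    return replication_factor == 3 and min_insync_replicas == 2


# A topic with the recommended settings keeps accepting writes with one replica down.
recommended = tolerated_replica_failures(replication_factor=3, min_insync_replicas=2)
```

With a replication factor of 3 and a minimum of 2 in-sync replicas, one broker host can be down without blocking writes; setting both values to 3 would leave no room for a failure.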
We also recommend specifying the acks=all parameter in the producer configuration. In this case, writing a message to a topic is considered successful only after Apache Kafka® gets a write confirmation from as many broker hosts as specified in the Minimum number of in-sync replicas parameter. For more information, see the Apache Kafka® documentation.
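A minimal sketch of such a producer configuration is shown below as a plain Python dictionary. The keys `bootstrap.servers` and `acks` are standard Apache Kafka® producer properties; the broker addresses, the port, and the helper name `build_producer_config` are placeholders assumed for illustration, and connection security settings are omitted.

```python
# Hypothetical broker addresses: replace them with your cluster's host names and port.
BOOTSTRAP_SERVERS = ",".join(
    [
        "broker-1.example.net:9092",
        "broker-2.example.net:9092",
        "broker-3.example.net:9092",
    ]
)


def build_producer_config(bootstrap_servers: str) -> dict:
    """Build a producer configuration that waits for in-sync replica confirmation."""
    return {
        "bootstrap.servers": bootstrap_servers,
        # Treat a write as successful only after the in-sync replicas confirm it.
        "acks": "all",
    }


producer_config = build_producer_config(BOOTSTRAP_SERVERS)
```

A dictionary of this shape is the form that dictionary-configured clients, such as the `confluent-kafka` Python package, accept when creating a producer; other clients may use different option names for the same settings.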
Cluster availability during maintenance
The table below shows how cluster availability during maintenance depends on the number of broker hosts.
| Cluster topology | Availability during maintenance |
|---|---|
| One host | Completely unavailable during reboots or updates. |
| Two hosts | Only the rebooting host is temporarily unavailable. When upgrading the Apache Kafka® version, cluster topics are unavailable if their replication factor is 1. |
| Three hosts | Only the rebooting host is temporarily unavailable. |
Consider the expected load on your cluster when selecting the maintenance window.
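To see why the replication factor matters during maintenance, the following sketch simulates rebooting broker hosts one at a time and reports which partitions would be left without enough live replicas. It is a simplified model, not the service's actual maintenance procedure: the function name, the partition names, and the replica assignments are all hypothetical.

```python
from typing import Dict, List, Set


def partitions_affected_by_rolling_reboot(
    assignments: Dict[str, List[int]], min_available: int = 1
) -> Set[str]:
    """Return partitions that drop below min_available live replicas
    at some point while each broker host is rebooted in turn."""
    brokers = {broker for replicas in assignments.values() for broker in replicas}
    affected = set()
    for rebooting in sorted(brokers):
        for partition, replicas in assignments.items():
            alive = [broker for broker in replicas if broker != rebooting]
            if len(alive) < min_available:
                affected.add(partition)
    return affected


# Two hosts, replication factor 1: each partition lives on a single host.
unreplicated = {"events-0": [1], "events-1": [2]}
# Two hosts, replication factor 2: each partition has a copy on both hosts.
replicated = {"orders-0": [1, 2], "orders-1": [2, 1]}

offline_unreplicated = partitions_affected_by_rolling_reboot(unreplicated)
offline_replicated = partitions_affected_by_rolling_reboot(replicated)
```

In this model, every unreplicated partition becomes unavailable while its only host reboots, whereas the replicated partitions always keep one live copy. Passing `min_available=2` models writes that need two in-sync replicas, which a two-host cluster cannot sustain while one host is down.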
Other settings
High availability of your cluster also depends on the following:
- Storage disk type you selected.
- Host classes.
- Quotas and limits.