Coordination services in Managed Service for ClickHouse®
A coordination service manages consistency across cluster nodes, provides data replication, and runs distributed DDL queries.
Managed Service for ClickHouse® offers these coordination services:
- Built-in ClickHouse® Keeper.
- ClickHouse® Keeper running on separate hosts.
- ZooKeeper.
Warning
After you select a coordination service, you cannot change or disable it. Hosts of both ClickHouse® Keeper and ZooKeeper count toward your cloud resource quota.
Selecting a coordination service
The choice between the ClickHouse® Keeper services depends on the use case:
- Built-in ClickHouse® Keeper runs on ClickHouse® hosts and is suited to testing and low-load applications.
- ClickHouse® Keeper deployed on separate hosts is well suited to stable releases and high-load applications.
ZooKeeper also works well for stable releases and high-load applications. However, ZooKeeper is a legacy solution, so we recommend using ClickHouse® Keeper on separate hosts instead. Going forward, ZooKeeper will no longer be available for creating new clusters.
ClickHouse® Keeper
ClickHouse® Keeper is a ClickHouse® solution that ensures the consistency of data reads and writes. ClickHouse® Keeper implements a ZooKeeper-compatible client-server protocol, so you can use any standard ZooKeeper client to work with it. However, snapshots, logs, and the ClickHouse® Keeper inter-server protocol are not compatible with ZooKeeper, so ClickHouse® Keeper and ZooKeeper hosts cannot be used in the same cluster.
For more information about ClickHouse® Keeper, see the ClickHouse® documentation.
In Managed Service for ClickHouse®, the ClickHouse® Keeper coordination service is now available in the following modes:
- Embedded ClickHouse Keeper: ClickHouse® Keeper runs on ClickHouse® hosts. For replication, the cluster must consist of three or more ClickHouse® hosts.
- ClickHouse Keeper (on separate hosts): ClickHouse® Keeper runs on separate hosts. For replication, the cluster must consist of two or more ClickHouse® hosts and include three or five ClickHouse® Keeper hosts. This mode is used by default in the management console when you create a cluster with two or more ClickHouse® hosts per shard or configure the coordination service.
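The replication requirements of the two modes can be written as a small pre-flight check. This is only an illustrative sketch: the function name `replication_topology_ok` and the mode labels `"embedded"` and `"separate"` are hypothetical and are not part of any Managed Service for ClickHouse® API.

```python
def replication_topology_ok(mode: str, clickhouse_hosts: int, keeper_hosts: int = 0) -> bool:
    """Check host counts against the replication requirements of each Keeper mode.

    `mode` uses hypothetical labels: "embedded" for Keeper running on the
    ClickHouse hosts, "separate" for Keeper running on dedicated hosts.
    """
    if mode == "embedded":
        # Keeper shares the ClickHouse hosts, so three or more of them are required.
        return clickhouse_hosts >= 3
    if mode == "separate":
        # Two or more ClickHouse hosts plus exactly three or five Keeper hosts.
        return clickhouse_hosts >= 2 and keeper_hosts in (3, 5)
    raise ValueError(f"unknown mode: {mode!r}")
```

For example, `replication_topology_ok("separate", 2, 3)` passes the check, while `replication_topology_ok("embedded", 2)` does not, because the embedded mode needs a third ClickHouse® host.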
You can turn on the ClickHouse® Keeper coordination service:
- When creating a cluster.
- When updating a cluster that was created without a coordination service.
Once ClickHouse® Keeper is turned on, you cannot turn it off.
ZooKeeper
ZooKeeper is one of the first open-source coordination services. Unlike ClickHouse® Keeper, it guarantees consistency only for data writes, not for reads.
For more information about ZooKeeper, see the ZooKeeper documentation.
ZooKeeper runs on separate hosts. For successful replication, your Managed Service for ClickHouse® cluster must have three or five ZooKeeper hosts.
ZooKeeper is used by default when you create a Managed Service for ClickHouse® cluster in the Yandex Cloud CLI, Terraform, or API.
If you create a cluster with two or more ClickHouse® hosts per shard and the cluster network has subnets in each availability zone, the system automatically adds three ZooKeeper hosts, one in each subnet. If only some availability zones have subnets, specify the ZooKeeper host settings manually.
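The automatic placement rule can be modeled roughly as follows. This is a sketch under stated assumptions, not the service's actual logic: the function name `auto_zookeeper_subnets`, the zone names, and the subnet IDs are all hypothetical, and the network is assumed to span three availability zones so that "one host in each subnet" yields three hosts.

```python
from typing import Dict, List, Optional


def auto_zookeeper_subnets(
    hosts_per_shard: int,
    subnet_by_zone: Dict[str, Optional[str]],
) -> List[str]:
    """Return the subnets that would each receive one ZooKeeper host.

    Raises ValueError when some zone has no subnet, which is the case where
    the ZooKeeper host settings have to be specified manually.
    """
    if hosts_per_shard < 2:
        # The automatic rule applies only with two or more hosts per shard.
        return []
    missing = sorted(zone for zone, subnet in subnet_by_zone.items() if not subnet)
    if missing:
        raise ValueError(f"no subnet in zones {missing}: specify ZooKeeper hosts manually")
    # One ZooKeeper host per subnet, in a stable zone order.
    return [subnet_by_zone[zone] for zone in sorted(subnet_by_zone)]


# Hypothetical zone names and subnet IDs, used only for illustration.
zones = {"zone-a": "subnet-1", "zone-b": "subnet-2", "zone-c": "subnet-3"}
print(auto_zookeeper_subnets(2, zones))
```

Raising an error for a partially covered network mirrors the text above: the caller has to supply ZooKeeper host settings explicitly instead of relying on the default placement.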
The minimum number of cores per ZooKeeper host depends on the total number of cores on ClickHouse® hosts:
| Total number of ClickHouse® host cores | Minimum number of cores per ZooKeeper host |
|---|---|
| Less than 48 | 2 |
| 48 or more | 4 |
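The table reduces to a single threshold, which can be expressed as a helper for sizing scripts. The function name `min_zookeeper_cores` is hypothetical; only the threshold of 48 cores and the values 2 and 4 come from the table.

```python
def min_zookeeper_cores(total_clickhouse_cores: int) -> int:
    """Return the minimum number of cores per ZooKeeper host.

    `total_clickhouse_cores` is the sum of cores across all ClickHouse hosts.
    """
    if total_clickhouse_cores < 0:
        raise ValueError("total_clickhouse_cores must be non-negative")
    # Fewer than 48 total cores: 2 cores per ZooKeeper host; otherwise 4.
    return 2 if total_clickhouse_cores < 48 else 4
```

For instance, a cluster of six ClickHouse® hosts with 8 cores each has 48 cores in total, so each ZooKeeper host needs at least 4 cores.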
You can change the ZooKeeper host class and storage size when updating cluster settings. However, you cannot change the ZooKeeper settings or connect directly to its hosts.
You can turn on the ZooKeeper coordination service:
- When creating a cluster.
- When updating a cluster that was created without a coordination service.
Once ZooKeeper is turned on, you cannot turn it off.
ClickHouse® is a registered trademark of ClickHouse, Inc.