
In this article:

  • ClickHouse® Keeper
  • ZooKeeper
  • Replicated tables

Replication in Managed Service for ClickHouse®

Written by
Yandex Cloud
Updated on April 9, 2025

In ClickHouse®, replication is performed only if the cluster meets both of these conditions:

  • At least one shard contains two or more hosts.
  • A host coordination service is set up.

A Managed Service for ClickHouse® cluster with enabled replication is fault-tolerant. In such a cluster, you can create replicated tables.

With Managed Service for ClickHouse®, you can use one of the following tools to coordinate hosts and distribute queries among them:

  • ClickHouse® Keeper
  • ZooKeeper (default)

ClickHouse® Keeper

Note

This feature is at the Preview stage. Access to ClickHouse® Keeper is available on request. Contact technical support or your account manager.

ClickHouse® Keeper is a service for data replication and running distributed DDL queries; it implements the ZooKeeper-compatible client-server protocol. Unlike ZooKeeper, ClickHouse® Keeper does not require separate hosts for its operation and runs on ClickHouse® hosts. You can enable ClickHouse® Keeper support only when creating a cluster.

ClickHouse® Keeper has the following limitations:

  • You can only create clusters of three or more hosts.
  • ClickHouse® Keeper support cannot be enabled or disabled after cluster creation.
  • You cannot switch a cluster that uses ZooKeeper hosts to ClickHouse® Keeper.
  • To migrate a ClickHouse® Keeper host to a different availability zone, contact support.

You can learn more about ClickHouse® Keeper in the ClickHouse® documentation.
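Since ClickHouse® Keeper implements the ZooKeeper protocol, you can inspect the coordination tree from any cluster host through the standard system.zookeeper table. A minimal check (the exact contents depend on your cluster; this is only an illustration):

SELECT name, path
FROM system.zookeeper
WHERE path = '/';

This lists the top-level coordination nodes; replicated tables appear under the paths you specify in their ReplicatedMergeTree parameters.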

ZooKeeper

ZooKeeper is a coordination tool you can use to distribute queries among ClickHouse® hosts. For successful replication, a Managed Service for ClickHouse® cluster must have three or five ZooKeeper hosts.

If your cluster consists of one ClickHouse® host or several single-host shards and was originally created without ClickHouse® Keeper support, you must enable fault tolerance for the cluster before adding new hosts. To do this, add three or five ZooKeeper hosts to the cluster. If the cluster already has ZooKeeper hosts, you can add ClickHouse® hosts to any shards.

If you are creating a cluster with two or more ClickHouse® hosts per shard, three ZooKeeper hosts will be added to the cluster automatically. You can only set up their configuration at creation time. Keep the following in mind:

  • If a cluster in the virtual network has subnets in each availability zone, a ZooKeeper host is automatically added to each subnet if you do not explicitly specify the settings for such hosts. You can explicitly specify three ZooKeeper hosts and their settings when creating a cluster, if required.

  • If a cluster in the virtual network has subnets only in certain availability zones, you need to explicitly specify three ZooKeeper hosts and their settings when creating a cluster.

  • If you did not specify any subnets for these hosts, Managed Service for ClickHouse® will automatically distribute them among the subnets of the network the ClickHouse® cluster is connected to.

The minimum number of cores per ZooKeeper host depends on the total number of cores on ClickHouse® hosts:

Total ClickHouse® host cores    Minimum cores per ZooKeeper host
Less than 48                    2
48 or more                      4

You can change ZooKeeper host class and storage size when updating cluster settings. You cannot change ZooKeeper settings or connect to such hosts.

Warning

ZooKeeper hosts, if any, are included when calculating resource usage and cluster cost.

Replicated tables

ClickHouse® supports automatic replication only for tables based on the ReplicatedMergeTree engine. To enable replication, create the tables on each host separately or use a distributed DDL query.

Warning

We recommend creating replicated tables on all cluster hosts. Otherwise, you may lose data when restoring a cluster from a backup or migrating cluster hosts to a different availability zone.

To create a ReplicatedMergeTree table on a specific ClickHouse® host, run the following query:

CREATE TABLE db_01.table_01 (
    log_date Date,
    user_name String
)
ENGINE = ReplicatedMergeTree('/table_01', '{replica}')
PARTITION BY log_date
ORDER BY (log_date, user_name);

Where:

  • db_01: Database name.
  • table_01: Table name.
  • /table_01: Path to the table in ZooKeeper or ClickHouse® Keeper, which must start with a forward slash /.
  • {replica}: Host ID macro substitution.
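The {replica} macro is substituted with a host-specific value when the table is created. To see the macros defined on a given host, you can query the system.macros table (a standard ClickHouse® system table):

SELECT macro, substitution FROM system.macros;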

To create replicated tables on all cluster hosts, send a distributed DDL request:

CREATE TABLE db_01.table_01 ON CLUSTER '{cluster}' (
    log_date Date,
    user_name String
)
ENGINE = ReplicatedMergeTree('/table_01', '{replica}')
PARTITION BY log_date
ORDER BY (log_date, user_name);

The '{cluster}' argument will be automatically resolved to the ClickHouse® cluster ID.
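To check that a table is replicating correctly, you can query the system.replicas system table on any host; for example, for the table from the queries above:

SELECT database, table, is_leader, total_replicas, active_replicas
FROM system.replicas
WHERE table = 'table_01';

If active_replicas is lower than total_replicas, some replicas are unavailable or lagging.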

To learn how to manage the interaction between replicated and distributed tables in a ClickHouse® cluster, see Sharding.

ClickHouse® is a registered trademark of ClickHouse, Inc.

© 2025 Direct Cursus Technology L.L.C.