ClickHouse Keeper

Updated December 8, 2025

ClickHouse Keeper is a comprehensive replacement for Apache ZooKeeper, designed specifically for coordinating and managing ClickHouse clusters. The solution provides full compatibility with ZooKeeper’s client protocol and uses the same data model while offering significant improvements in performance and functionality.

Key advantages of ClickHouse Keeper:

Simplified Setup and Operation:

  • Implemented in C++ instead of Java, providing more efficient resource utilization
  • Can run both embedded in ClickHouse and in standalone mode
  • Uses less memory for the same volume of data compared to ZooKeeper

Optimized Data Storage:

  • Snapshots and logs consume much less disk space due to better compression
  • No limit on the default packet and node data size (ZooKeeper has a 1 MB limit)
  • No ZXID overflow issue (in ZooKeeper, it forces a restart every 2 billion transactions)

Enhanced Performance and Reliability:

  • Faster recovery after network partitions due to the use of a better distributed consensus protocol
  • Provides the same consistency guarantees as ZooKeeper: linearizable writes plus strict ordering of operations inside the same session
  • Optionally provides linearizable reads via quorum_reads setting
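
As an illustration of the quorum_reads option: in Keeper's configuration this setting normally sits in the coordination_settings section under keeper_server, and the effective value can be read back with the conf four-letter command. The sketch below is a helper written for this page, not part of the product. The function name keeper_conf_value is hypothetical, nc (netcat) is assumed to be installed, port 9181 is the client port used in the deployment steps on this page, and conf is assumed to be allowed by four_letter_word_white_list.

```shell
# Hypothetical helper (not shipped with Keeper): print the value of one
# setting from the output of Keeper's "conf" four-letter command.
# Usage: keeper_conf_value KEY HOST [PORT]
keeper_conf_value() (
    key="$1"
    host="$2"
    port="${3:-9181}"   # assumption: the client port used on this page
    # "conf" replies with KEY=VALUE lines; keep only the requested key.
    echo conf | nc -w 3 "$host" "$port" 2>/dev/null |
        awk -F= -v k="$key" '$1 == k { print $2 }'
)
```

For example, running keeper_conf_value quorum_reads 127.0.0.1 on the Keeper host is expected to print false until linearizable reads are enabled.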

Core Coordination Functions:

ClickHouse Keeper performs critical functions within the ClickHouse ecosystem:

  • Provides the coordination system for data replication in self-managed shared-nothing ClickHouse clusters
  • Enables automatic insert deduplication for replicated tables of the MergeTree engine family based on block hash-sums
  • Provides consensus for part names and for assigning part merges and mutations to specific cluster nodes
  • Powers the KeeperMap table engine which allows using Keeper as a consistent key-value store
  • Tracks consumed files in the S3Queue table engine
  • Stores all metadata for the Replicated Database engine
  • Coordinates backups with the ON CLUSTER clause
  • Serves as storage for user-defined functions and access control information
  • Used as a shared central store for all metadata in ClickHouse Cloud
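
To make the replication role concrete, the following sketch prints the DDL for a replicated table whose coordination path lives in Keeper. It assumes a ClickHouse server that is already connected to Keeper. The function name replicated_table_ddl, the table columns, and the /clickhouse/tables/ path prefix are illustrative choices, not requirements of the product.

```shell
# Hypothetical helper: print a CREATE TABLE statement for a replicated table.
# Usage: replicated_table_ddl TABLE_NAME REPLICA_NAME
replicated_table_ddl() {
    cat <<EOF
CREATE TABLE IF NOT EXISTS $1
(
    id UInt64,
    value String
)
-- First argument: the table's path in Keeper; second: this replica's name.
ENGINE = ReplicatedMergeTree('/clickhouse/tables/$1', '$2')
ORDER BY id;
EOF
}
```

The output can be piped to the client, for example replicated_table_ddl events replica_1 | clickhouse-client. Afterwards, a query such as SELECT name FROM system.zookeeper WHERE path = '/clickhouse/tables/events' should list the nodes that ClickHouse created in Keeper for the table.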

ClickHouse Keeper represents a reliable and efficient solution for distributed system coordination, specifically optimized for ClickHouse workloads.

Deployment instructions

Note: in a production environment, do not use the address 0.0.0.0 for listen_host, because it makes Keeper accept connections on all network interfaces.

  1. Following the instructions, create a virtual machine (KEEPER-VM) with a public address from the clickhouse-keeper product in Yandex Cloud Marketplace.

  2. After the virtual machine starts, log in to KEEPER-VM via SSH. To do this, use security groups to open access to KEEPER-VM on port 22, and also on port 9181 for the later connection from TEST-VM.

  3. On the KEEPER-VM, run sudo nano /etc/clickhouse-keeper/keeper_config.xml and add the listen_host line directly after the opening <clickhouse> tag:

    <clickhouse>
    
    <listen_host>0.0.0.0</listen_host>
    
  4. Check that Keeper is running: sudo systemctl status clickhouse-keeper

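  5. If Keeper is not running, start it: sudo systemctl start clickhouse-keeper. If it was already running when you edited the configuration, restart it with sudo systemctl restart clickhouse-keeper to make sure the new listen_host value is applied.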

  6. Create another virtual machine (TEST-VM) for testing clickhouse-keeper, and connect to it via SSH as well.

  7. Install the ClickHouse server and client on the TEST-VM (https://clickhouse.com/docs/install/debian_ubuntu). Before installing, use security groups to open access to this VM for incoming TCP traffic on port 80.

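  8. On the TEST-VM, run sudo nano /etc/clickhouse-server/config.d/zookeeper.xml and add the following content, replacing the host value with the public IP address of the KEEPER-VM: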

    <clickhouse>
        <zookeeper>
            <node>
                <host>KEEPER-VM public IP</host>
                <port>9181</port>
            </node>
        </zookeeper>
    </clickhouse>
    
  9. Execute sudo systemctl start clickhouse-server on the TEST-VM

  10. On the TEST-VM, run clickhouse-client and then execute the following query:

    SELECT
        hostname() AS clickhouse_server,
        zookeeperSessionUptime() AS keeper_session_seconds
    
  11. On the KEEPER-VM, check that sudo ss -anp | grep :9181 shows a connection with the TEST-VM

If step 11 succeeds, your clickhouse-server on the TEST-VM is connected to clickhouse-keeper on the KEEPER-VM.
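
The check in step 11 can also be run from the TEST-VM with Keeper's four-letter commands, which are sent over the same client port. The following is a minimal sketch under stated assumptions: the function names keeper_is_ok and keeper_state are hypothetical, nc (netcat) is assumed to be installed, and the ruok and mntr commands are assumed to be allowed by four_letter_word_white_list.

```shell
# Hypothetical helpers for probing Keeper from another host.

# Succeeds only if Keeper answers "imok" to the "ruok" command.
# Usage: keeper_is_ok HOST [PORT]
keeper_is_ok() (
    reply="$(echo ruok | nc -w 3 "$1" "${2:-9181}" 2>/dev/null)"
    [ "$reply" = "imok" ]
)

# Prints the zk_server_state value from the "mntr" output.
# Usage: keeper_state HOST [PORT]
keeper_state() (
    echo mntr | nc -w 3 "$1" "${2:-9181}" 2>/dev/null |
        awk '$1 == "zk_server_state" { print $2 }'
)
```

With the functions loaded into the shell, keeper_is_ok followed by the public IP address of the KEEPER-VM returns a zero exit status when Keeper is reachable and healthy, and keeper_state with the same argument prints the role the node reports.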

Billing type: Free
Type: Virtual Machine
Category: Databases
Publisher: Yandex Cloud

Use cases
  • Data replication coordination in ClickHouse clusters
  • Automatic insert deduplication for MergeTree replicated tables
  • Task scheduling queue implementation using KeeperMap table engine
  • Exactly-once delivery guarantees in Kafka Connect Sink
  • Consumed file tracking in S3Queue table engine
  • Metadata storage for Replicated Database engine
  • Cluster backup coordination
  • User-defined functions and access control storage
  • Central metadata store for ClickHouse Cloud
Technical support

Yandex Cloud technical support is available 24/7 to respond to requests. The types of requests available and their response time depend on your pricing plan. You can enable paid support in the management console. Learn more about requesting technical support.

Yandex Cloud does not provide technical support for this product. If you have any issues, please refer to the developer’s information resources.

Product IDs
image_id: fd87hpei8uevemfgj91u
family_id: clickhouse-keeper
Product composition
Software    Version
Ubuntu      24.04
Terms
By using this product you agree to the Yandex Cloud Marketplace Terms of Service