Yandex project
© 2025 Yandex.Cloud LLC
Yandex Managed Service for ClickHouse®

Questions about ClickHouse®

Written by
Yandex Cloud
Updated at January 23, 2025
  • Why should I use ClickHouse® in Managed Service for ClickHouse® rather than my own VM-based installation?

  • When should I use ClickHouse® instead of PostgreSQL?

  • How do I upload data to ClickHouse®?

  • How do I upload a massive amount of data to ClickHouse®?

  • What will happen to a cluster if one of its nodes fails?

  • Can I deploy a ClickHouse® database cluster in multiple availability zones?

  • How does replication work for ClickHouse®?

  • Why does a ClickHouse® cluster take up 3 hosts more than it should?

  • How do I delete data in ClickHouse® based on TTL?

  • Can I use JSON data for tables in ClickHouse®?

  • Why is the cluster slow even though the computing resources are not used fully?

Why should I use ClickHouse® in Managed Service for ClickHouse® rather than my own VM-based installation?

Managed Service for ClickHouse® automates routine database maintenance:

  • Quick DB deployment with the necessary available resources.

  • Data backup.

  • Regular software updates.

  • DB cluster failover.

  • Database usage monitoring and statistics.

When should I use ClickHouse® instead of PostgreSQL?

ClickHouse® is designed primarily for online analytical processing (OLAP), so it is optimized for adding and reading data rather than for frequently updating or deleting individual rows. In other cases, PostgreSQL is probably more convenient.

How do I upload data to ClickHouse®?

Use the INSERT statement described in the ClickHouse® documentation.
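
As a minimal sketch of the statement in use, the queries below create a small table and add two rows to it. The `visits` table and its columns are hypothetical and serve only as an illustration.

```sql
-- Hypothetical table used only for illustration.
CREATE TABLE IF NOT EXISTS visits
(
    visit_date Date,
    user_id    UInt64,
    url        String
)
ENGINE = MergeTree
ORDER BY (visit_date, user_id);

-- Add several rows with a single INSERT rather than one INSERT per row.
INSERT INTO visits (visit_date, user_id, url) VALUES
    ('2025-01-20', 1, 'https://example.com/'),
    ('2025-01-20', 2, 'https://example.com/pricing');
```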

How do I upload a massive amount of data to ClickHouse®?

Use the CLI, which compresses data efficiently during transmission. We recommend sending no more than one INSERT command per second.

Data transfer from physical media is not yet supported.
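
One way to stay within the recommended INSERT frequency is to load a whole file as a single batch. The sketch below assumes it is run in clickhouse-client, that a hypothetical `visits` table already exists, and that a gzip-compressed CSV file named `visits.csv.gz` with matching columns is available on the client machine.

```sql
-- Run in clickhouse-client. The file name and table are hypothetical.
-- The whole file is sent as one INSERT instead of many small ones.
INSERT INTO visits
FROM INFILE 'visits.csv.gz'
COMPRESSION 'gzip'
FORMAT CSV;
```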

What will happen to a cluster if one of its nodes fails?

DB clusters consist of at least two replicas, so the cluster will continue working if one of its nodes fails.

Data may be lost only if a node with a non-replicated table fails.
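
To see which tables are exposed to this risk, you can list the MergeTree-family tables that do not use a replicated engine. This is only a sketch: it inspects the host you are connected to and relies on the standard `system.tables` table.

```sql
-- Lists MergeTree-family tables on the current host that are not replicated.
SELECT database, name, engine
FROM system.tables
WHERE engine LIKE '%MergeTree'
  AND engine NOT LIKE 'Replicated%'
  AND database NOT IN ('system', 'INFORMATION_SCHEMA', 'information_schema');
```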

Can I deploy a ClickHouse® database cluster in multiple availability zones?

Yes, you can. A database cluster may consist of hosts residing in different availability zones or even regions.

How does replication work for ClickHouse®?

Managed Service for ClickHouse® clusters replicate data using either ClickHouse® Keeper or ZooKeeper. With ClickHouse® Keeper, no additional settings are required: replication and fault tolerance are enabled by default. With ZooKeeper, a separate ZooKeeper cluster of at least three hosts is created for each ClickHouse® cluster.

Yandex Cloud users cannot access or configure ZooKeeper.
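
Replication applies to tables created with a replicated engine. The sketch below shows what such a table might look like. The table name, the path in the coordination service, and the `{cluster}`, `{shard}`, and `{replica}` macros are assumptions, so check which macros are defined on your hosts before using it.

```sql
-- Hypothetical replicated table. Assumes the {cluster}, {shard}, and {replica}
-- macros are defined on every host of the cluster.
CREATE TABLE visits_replicated ON CLUSTER '{cluster}'
(
    visit_date Date,
    user_id    UInt64,
    url        String
)
ENGINE = ReplicatedMergeTree('/clickhouse/tables/{shard}/visits_replicated', '{replica}')
ORDER BY (visit_date, user_id);
```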

Why does a ClickHouse® cluster take up 3 hosts more than it should?

When you create a ClickHouse® cluster with two or more hosts and ClickHouse® Keeper support is not enabled, Managed Service for ClickHouse® automatically creates a cluster of three ZooKeeper hosts to manage replication and fault tolerance. These hosts count toward the consumed cloud resource quota and the cluster cost. By default, ZooKeeper hosts are created with the minimum host class.

For more information about using ZooKeeper, see the ClickHouse® documentation.

How do I delete data in ClickHouse® based on TTL?

Data is deleted based on TTL either in entire data parts or during merges, rather than row by row.

Deleting entire data parts is more efficient and uses fewer server resources. However, it requires the values of the TTL expression and the partitioning key to be the same, or at least of the same order of magnitude, for all rows in a data part, so that all of its rows expire at about the same time.

Deletions during merges use more resources and are carried out in regular background merges or in unscheduled merges. Merge frequency depends on the merge_with_ttl_timeout parameter. This parameter is set at table creation and defines the minimum time in seconds before a repeat merge that processes data with an expired TTL. The default is 14400 seconds (4 hours).

We recommend configuring TTL processing so that obsolete data is always deleted in entire data parts. To do this, set ttl_only_drop_parts to true when creating tables.
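
The settings described above might be applied as in the sketch below. The `events_ttl` table, its columns, and the three-month retention period are hypothetical. Partitioning by month keeps rows that expire at about the same time in the same data parts.

```sql
-- Hypothetical table: rows expire three months after event_date.
CREATE TABLE events_ttl
(
    event_date Date,
    payload    String
)
ENGINE = MergeTree
PARTITION BY toYYYYMM(event_date)
ORDER BY event_date
TTL event_date + INTERVAL 3 MONTH
SETTINGS
    ttl_only_drop_parts = 1,        -- delete expired data only in entire parts
    merge_with_ttl_timeout = 14400; -- default value, shown here explicitly

-- An unscheduled merge can be requested manually; it may be resource-intensive.
OPTIMIZE TABLE events_ttl FINAL;
```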

Can I use JSON data for tables in ClickHouse®?

Yes, you can. However, JSON is currently an experimental data type in ClickHouse®. To allow creating tables of this type, run this query:

SET allow_experimental_object_type=1;

Note

SET queries are not supported when connecting to a cluster through the management console. To run such a query, use a different cluster connection method, e.g., through clickhouse-client.

Make sure you have the latest client version installed.

For more information, see the ClickHouse® documentation.
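
As a rough sketch, a session that uses this data type might look as follows. The `json_demo` table and its contents are hypothetical. The column is declared as `Object('json')`, which is assumed here to be the type that `allow_experimental_object_type` enables; whether the shorter `JSON` name refers to the same type depends on your ClickHouse® version.

```sql
-- Run through clickhouse-client, not the management console.
SET allow_experimental_object_type = 1;

-- Hypothetical table with a single JSON column.
CREATE TABLE json_demo
(
    data Object('json')
)
ENGINE = MergeTree
ORDER BY tuple();

INSERT INTO json_demo VALUES ('{"user": {"id": 1, "name": "alice"}}');

-- Nested JSON fields are read as subcolumns.
SELECT data.user.id, data.user.name FROM json_demo;
```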

Why is the cluster slow even though the computing resources are not used fully?

Your storage may have insufficient maximum IOPS and bandwidth to process the current number of requests. In this case, throttling occurs, which degrades the performance of the entire cluster.

The maximum IOPS and bandwidth values increase by a fixed value when the storage size increases by a certain step. The step and increment values depend on the disk type:

Disk type                                    | Step, GB | Max IOPS increase (read/write) | Max bandwidth increase (read/write), MB/s
network-hdd                                  | 256      | 300/300                        | 30/30
network-ssd                                  | 32       | 1,000/1,000                    | 15/15
network-ssd-nonreplicated, network-ssd-io-m3 | 93       | 28,000/5,600                   | 110/82

To increase the maximum IOPS and bandwidth values and make throttling less likely, increase the storage size when you update your cluster.

If you are using the network-hdd storage type, consider switching to network-ssd or network-ssd-nonreplicated by restoring the cluster from a backup.
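
As a worked example of the step arithmetic, the query below estimates the limits for a hypothetical 256 GB network-ssd disk, using a step of 32 GB with 1,000 IOPS and 15 MB/s added per step. It assumes the limits grow linearly from zero and ignores any upper cap per disk, neither of which is stated here, so treat the result as an estimate.

```sql
-- Assumed example: 256 GB network-ssd disk, step 32 GB,
-- +1,000 IOPS and +15 MB/s per step.
SELECT
    intDiv(256, 32) AS steps,
    steps * 1000    AS max_iops,
    steps * 15      AS max_bandwidth_mb_s;
-- steps = 8, max_iops = 8000, max_bandwidth_mb_s = 120
```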

ClickHouse® is a registered trademark of ClickHouse, Inc.
