Asynchronously replicating data from Yandex Metrica to ClickHouse® using Yandex Data Transfer
Note
You can transfer data from a Yandex Metrica source if you have Metrica Pro.
Data Transfer enables you to transfer metrics from Yandex Metrica to ClickHouse®. This opens up the following options:
- Processing data with ClickHouse® tools.
- Streaming data from ClickHouse® to other locations.
- Visualizing data with Yandex DataLens or other services.
To transfer data:
- Set up and activate the transfer.
- Test your transfer.
If you no longer need the resources you created, delete them.
Required paid resources
- Managed Service for ClickHouse® cluster: Use of computing resources allocated to hosts, storage and backup size (see Managed Service for ClickHouse® pricing).
- Public IP addresses if public access is enabled for cluster hosts (see Virtual Private Cloud pricing).
- Metrica Pro.
Getting started
Set up the infrastructure:
- Select an existing tracking tag in Yandex Metrica or create and install a new one.
- Create a target Managed Service for ClickHouse® cluster with your preferred configuration.
Set up and activate the transfer
- Create an endpoint for the Metrica source. Hits and sessions are transferred as separate tables.
- Create a target endpoint with the following settings:
  - Database type: ClickHouse
  - Endpoint parameters → Connection type: Managed cluster. Select your target cluster from the list and specify its connection settings.
- Create a Replication-type transfer configured to use the new endpoints.
- Activate the transfer.
Transfers process only the latest data, leaving historical data untouched. If you deactivate and then reactivate the transfer:
- Data collected by the Yandex Metrica tracking tag while the transfer was deactivated will not be migrated.
- Depending on the cleanup policy selected in the target endpoint, existing data tables will be:
  - Drop: Deleted along with all data and recreated with the same names.
  - Truncate: Truncated while preserving their schemas.
  - Don't cleanup: Left intact for future data writing.
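In ClickHouse® terms, the first two cleanup policies have roughly the same effect as the statements below. This is only an illustrative sketch of the behavior described above, not the commands Data Transfer actually runs; the angle-bracket placeholders follow the ones used in this tutorial and must be replaced with real names.

```sql
-- Drop: the table is removed together with all its data;
-- a table with the same name is then created again.
DROP TABLE IF EXISTS <ClickHouse®_database_name>.<name_of_hit_or_visit_table>;

-- Truncate: all rows are removed, but the table definition stays in place.
TRUNCATE TABLE IF EXISTS <ClickHouse®_database_name>.<name_of_hit_or_visit_table>;

-- Don't cleanup: the tables are left as they are and keep receiving new data.
```

If the existing rows matter, it may be worth checking which policy the target endpoint uses before reactivating the transfer.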
Test your transfer
- Wait for the transfer status to change to Replicating.
- Make sure the metric data from Yandex Metrica has been transferred to the Managed Service for ClickHouse® database:
  - Connect to the cluster via clickhouse-client.
  - Verify that the hit and session tables have appeared in the database:

    SELECT table FROM system.tables WHERE database = '<ClickHouse®_database_name>'

    Result:

    ┌─table───────────────────────┐
    │ hits_dt...                  │
    │ visits_dt...                │
    └─────────────────────────────┘

  - Check whether the hit and session tables contain the relevant metric data:

    SELECT * FROM <name_of_hit_or_visit_table>
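Beyond listing the tables, a few extra queries can help confirm that data is actually arriving. The sketch below relies only on standard ClickHouse® features (the total_rows column of system.tables, count(), and LIMIT). The placeholders are the same as in the queries above, and adding LIMIT is a suggestion to avoid fetching an entire table, not part of the original steps.

```sql
-- Row counts as reported by system.tables for every table in the target database.
-- total_rows may be NULL for engines that cannot report it quickly.
SELECT name, total_rows
FROM system.tables
WHERE database = '<ClickHouse®_database_name>';

-- Exact row count for a single table. If the site is receiving traffic,
-- rerunning this later should return a larger number while the transfer is Replicating.
SELECT count() FROM <name_of_hit_or_visit_table>;

-- Preview a handful of rows instead of selecting the whole table.
SELECT * FROM <name_of_hit_or_visit_table> LIMIT 10;
```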
Delete the resources you created
Note
Before deleting the created resources, deactivate the transfer.
To stop paying for resources you no longer need, delete them:
- Delete the transfer.
- Delete the source and target endpoints.
- Delete the Managed Service for ClickHouse® cluster.
- Delete the Yandex Metrica tracking tag from your Yandex Metrica Pro account.
ClickHouse® is a registered trademark of ClickHouse, Inc.