Asynchronously replicating data from Yandex Metrica to ClickHouse® using Yandex Data Transfer
Note
You can transfer data from a Yandex Metrica source if the Metrica Pro
With Data Transfer, you can transfer data from a Yandex Metrica
- Processing data with ClickHouse® tools.
- Streaming data from ClickHouse® to other locations.
- Visualizing data using Yandex DataLens or other services.
To transfer data:
- Prepare and activate the transfer.
- Test the transfer.
If you no longer need the resources you created, delete them.
Getting started
Prepare the infrastructure:
- Select a Yandex Metrica tag or create and install a new one.
- Create a Managed Service for ClickHouse® target cluster with any suitable configuration.
Prepare and activate the transfer
- Create an endpoint for the Metrica source. Hits and sessions are transferred as separate tables.
- Create an endpoint for the target:
  - Database type: ClickHouse
  - Endpoint parameters → Connection type: Managed cluster
    Select a target cluster from the list and specify its connection settings.
- Create a transfer of the Replication type that will use the created endpoints.
- Activate your transfer.
A transfer only moves the current data and does not affect historical data. If you deactivate and activate the transfer again:
- Data obtained by the Yandex Metrica tag before the transfer was disabled will not be transferred.
- Depending on the cleanup policy selected in the target endpoint, the existing data tables will be:
  - Drop: Deleted along with the data and created again with the same names.
  - Truncate: Purged of existing data without removing the tables and schemas.
  - Don't cleanup: Used for further data writes.
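The difference between the three cleanup policies can be illustrated with a small toy model. This is only a sketch of the behavior described above, not how Data Transfer is implemented: the `Table` class, its `recreated` counter, and the `reactivate` function are hypothetical names introduced for illustration.

```python
from dataclasses import dataclass, field


@dataclass
class Table:
    """Toy stand-in for a target table (hypothetical, for illustration only)."""
    rows: list = field(default_factory=list)
    recreated: int = 0  # how many times the table was dropped and created again


def reactivate(tables, policy):
    """Return the state of the target tables after the transfer is activated again."""
    result = {}
    for name, table in tables.items():
        if policy == "Drop":
            # The table is deleted with its data and created again under the same name.
            result[name] = Table(rows=[], recreated=table.recreated + 1)
        elif policy == "Truncate":
            # The data is purged, but the table itself stays in place.
            result[name] = Table(rows=[], recreated=table.recreated)
        elif policy == "Don't cleanup":
            # Existing rows are kept; further writes go to the same table.
            result[name] = Table(rows=list(table.rows), recreated=table.recreated)
        else:
            raise ValueError(f"unknown cleanup policy: {policy!r}")
    return result
```

In this model, only Don't cleanup preserves rows written before reactivation, while Drop and Truncate both leave the tables empty and differ only in whether the table itself is recreated.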
Test the transfer
- Wait for the transfer status to change to Replicating.
- Make sure the data from the Yandex Metrica tag has been moved to the Managed Service for ClickHouse® database:
  - Connect to the cluster using clickhouse-client.
  - Check whether the hit and session tables have appeared in the database:
    SELECT table FROM system.tables WHERE database = '<ClickHouse®_database_name>'
    Result:
    ┌─table───────────────────────┐
    │ hits_dt...                  │
    │ visits_dt...                │
    └─────────────────────────────┘
  - Check whether the hit and session tables contain data from the tag:
    SELECT * FROM <name_of_hit_or_session_table>
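If you run these checks repeatedly, it may help to generate the queries from the database and table names. The sketch below only builds query strings; it does not connect to the cluster. All function names are hypothetical, the `hits_` and `visits_` prefixes are taken from the sample result above, and the `LIMIT` and `count()` variants are additions that avoid reading a whole table when only a quick check is needed.

```python
def quote_literal(value: str) -> str:
    """Quote a value as a ClickHouse string literal (backslash escaping)."""
    escaped = value.replace("\\", "\\\\").replace("'", "\\'")
    return f"'{escaped}'"


def quote_identifier(name: str) -> str:
    """Quote a database or table name with backticks."""
    escaped = name.replace("\\", "\\\\").replace("`", "\\`")
    return f"`{escaped}`"


def list_tables_query(database: str) -> str:
    """Query that lists the tables of the target database."""
    return f"SELECT table FROM system.tables WHERE database = {quote_literal(database)}"


def sample_rows_query(database: str, table: str, limit: int = 10) -> str:
    """Query that reads a few rows from a hit or session table."""
    target = f"{quote_identifier(database)}.{quote_identifier(table)}"
    return f"SELECT * FROM {target} LIMIT {int(limit)}"


def row_count_query(database: str, table: str) -> str:
    """Query that counts the rows in a hit or session table."""
    target = f"{quote_identifier(database)}.{quote_identifier(table)}"
    return f"SELECT count() FROM {target}"


def has_hit_and_session_tables(table_names) -> bool:
    """Check that at least one hit table and one session table are present."""
    names = list(table_names)
    has_hits = any(name.startswith("hits_") for name in names)
    has_visits = any(name.startswith("visits_") for name in names)
    return has_hits and has_visits
```

For example, `row_count_query("metrica_db", "hits_example")` (both names are placeholders) produces a query you can paste into clickhouse-client; a count that grows between two runs suggests that replication is delivering new data.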
Delete the resources you created
Note
Before deleting the created resources, deactivate the transfer.
Some resources are not free of charge. To avoid paying for them, delete the resources you no longer need:
- Delete the transfer.
- Delete endpoints for both the source and target.
- Delete the Managed Service for ClickHouse® cluster.
- Delete the Yandex Metrica tag from your Yandex Metrica Pro account.
ClickHouse® is a registered trademark of ClickHouse, Inc.