Loading data from Yandex Metrica to a ClickHouse® data mart
Note
You can transfer data from a Yandex Metrica source if the Metrica Pro
With Data Transfer, you can transfer data from a Yandex Metrica
- Processing data with ClickHouse® tools.
- Streaming data from ClickHouse® to other locations.
- Visualizing data using Yandex DataLens or other services.
To transfer data:
- Set up and activate the transfer.
- Test your transfer.
If you no longer need the resources you created, delete them.
Required paid resources
The support cost includes:
- Managed Service for ClickHouse® cluster fee: Using computing resources allocated to hosts (including ZooKeeper hosts) and disk space (see Managed Service for ClickHouse® pricing).
- Fee for using public IP addresses if public access is enabled for cluster hosts (see Virtual Private Cloud pricing).
- Metrica Pro fee.
Getting started
Set up your infrastructure:
- Select a Yandex Metrica tag or create and install a new one.
- Create a Managed Service for ClickHouse® target cluster in any suitable configuration.
Set up and activate the transfer
- Create an endpoint for the Metrica source. Hits and sessions are transferred as separate tables.
- Create an endpoint for the target:
  - Database type: ClickHouse
  - Endpoint parameters → Connection type: Managed cluster
    Select a target cluster from the list and specify its connection settings.
- Create a transfer of the Replication type that will use the endpoints you created.
- Activate your transfer.
A transfer only moves the current data and does not affect historical data. If you deactivate and activate the transfer again:
- Data obtained by the Yandex Metrica tag before the transfer was disabled will not be transferred.
- Depending on the cleanup policy selected in the target endpoint, the existing data tables will be:
  - Drop: Deleted along with the data and created again with the same names.
  - Truncate: Purged of existing data without removing the tables and schemas.
  - Don't cleanup: Used for further data writes.
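The difference between the three cleanup policies can be easier to see in a toy model. The sketch below is only an illustration of the behavior described in the list above, not how Data Transfer is implemented: the `Table` class, the `apply_cleanup_policy` function, and the table name `hits` are all hypothetical, and the "Drop" branch simply reuses the old column list where a real transfer would recreate the table from the source schema.

```python
from dataclasses import dataclass, field


@dataclass
class Table:
    """Toy stand-in for a target table: a column list plus the rows written so far."""
    columns: tuple
    rows: list = field(default_factory=list)


def apply_cleanup_policy(tables, policy):
    """Return the target tables as they would look after the transfer is reactivated.

    `tables` maps a table name to a Table. `policy` is one of the three
    policy names from the list above. Illustrative model only.
    """
    if policy == "Drop":
        # Tables are deleted with their data, then created again under the same names.
        return {name: Table(columns=t.columns) for name, t in tables.items()}
    if policy == "Truncate":
        # Tables and schemas stay in place; only the existing rows are removed.
        for t in tables.values():
            t.rows.clear()
        return tables
    if policy == "Don't cleanup":
        # Nothing is removed; further writes land next to the existing rows.
        return tables
    raise ValueError(f"unknown cleanup policy: {policy!r}")


# Hypothetical target with one table that already holds two rows.
target = {"hits": Table(columns=("id",), rows=[(1,), (2,)])}
after = apply_cleanup_policy(target, "Don't cleanup")
print(len(after["hits"].rows))  # 2
```

In this model, "Drop" and "Truncate" both leave the tables empty; the difference is that "Drop" replaces the table objects while "Truncate" keeps the original ones.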
Test your transfer
- Wait until the transfer status switches to Replicating.
- Make sure the data from the Yandex Metrica tag has been moved to the Managed Service for ClickHouse® database:
  - Connect to the cluster using clickhouse-client.
  - Check whether the hit and session tables have appeared in the database:

    ```sql
    SELECT table FROM system.tables
    WHERE database = '<ClickHouse®_database_name>'
    ```

    Result:

    ```text
    ┌─table───────────────────────┐
    │ hits_dt...                  │
    │ visits_dt...                │
    └─────────────────────────────┘
    ```

  - Check whether the hit and session tables contain data from the tag:

    ```sql
    SELECT * FROM <name_of_hit_or_visit_table>
    ```
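If you run this check repeatedly, the table-name test can be scripted. The following is a minimal sketch, assuming you already have the table names returned by the `system.tables` query as a list of strings. The function name `missing_table_kinds` is hypothetical, and the `hits_` and `visits_` prefixes are inferred from the truncated sample result above, so the full table names in your database will differ.

```python
def missing_table_kinds(table_names, prefixes=("hits_", "visits_")):
    """Return the prefixes for which no table name was found.

    `table_names` is an iterable of strings, for example the lines of the
    query result. The default prefixes are an assumption based on the
    sample output, not a documented naming rule.
    """
    # Ignore blank lines and surrounding whitespace from copied output.
    names = [n.strip() for n in table_names if n.strip()]
    return [p for p in prefixes if not any(n.startswith(p) for n in names)]


# Hypothetical table names; an empty result means both kinds are present.
print(missing_table_kinds(["hits_example", "visits_example"]))  # []
```

An empty list means that at least one hit table and one session table exist; any prefix that is returned points to a table kind that has not appeared yet.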
Delete the resources you created
Note
Before deleting the created resources, deactivate the transfer.
Some resources are not free of charge. To avoid paying for them, delete the resources you no longer need:
- Delete the transfer.
- Delete the endpoints for both the source and target.
- Delete the Managed Service for ClickHouse® cluster.
- Delete the Yandex Metrica tag from your Yandex Metrica Pro account.
ClickHouse® is a registered trademark of ClickHouse, Inc.