Connecting to Yandex Managed Service for ClickHouse®
With a directed acyclic graph (DAG), you can configure a connection to a database in a Yandex Managed Service for ClickHouse® cluster. Data for connecting to the DB is stored in Yandex Lockbox and automatically substituted into the graph.
Getting started
-
Create a Managed Service for ClickHouse® cluster with the following parameters:
- DB name:
default
- Username:
admin
- Password:
admin-password
- DB name:
-
Issue the
lockbox.payloadViewer
role to your service account.You do not have to assign the
lockbox.payloadViewer
role for the whole folder. It is enough to assign it for a specific Yandex Lockbox secret once you create it.
Create a Yandex Lockbox secret
Create a Yandex Lockbox secret named airflow/connections/ch1
, containing json.dumps(conn)
, where conn
is a connection to a Managed Service for ClickHouse® cluster. As conn_type
, specify SQLite
or another SQL DB.
conn = {
"conn_type": "SQLite",
"host": "<ClickHouse®_cluster_host_FQDN>",
"port": 9440,
"schema": "default",
"login": "admin",
"password": "admin-password",
"extra": {
"secure": True,
}
}
The secret will store the data to connect to the database in the Managed Service for ClickHouse® cluster.
For more information on how to find out the FQDN of a ClickHouse® cluster host, see FQDNs of ClickHouse® hosts.
Prepare the DAG file and run the graph
-
Create a local file named
clickhouse.py
and copy the following script to it:from airflow.decorators import dag, task from airflow_clickhouse_plugin.hooks.clickhouse import ClickHouseHook @dag(schedule=None) def clickhouse(): @task def query_clickhouse(): ch_hook = ClickHouseHook(clickhouse_conn_id="ch1") result = ch_hook.execute('select 1;') print(f'query result: {result}') query_clickhouse() clickhouse()
-
Upload the
clickhouse.py
DAG file to the bucket you created earlier. This will automatically create a graph with the same name in the Apache Airflow™ web interface. -
Make sure a new graph named
clickhouse
has appeared in the DAGs section.It may take a few minutes to upload a DAG file from the bucket.
-
To run the graph, click
in the line with its name.
Check the result
To check the result in the Apache Airflow™ web interface:
- In the DAGs section, open the
clickhouse
graph. - Go to the Graph section.
- Select the query_clickhouse job.
- Go to Logs.
- Make sure the logs contain the
query result: 1
string. This means the query was successful.