Storing Apache Airflow™ connections and variables in Yandex Lockbox
When working with Yandex Managed Service for Apache Airflow™, you can use Yandex Lockbox to store artifacts used in DAG files: connections, variables, and configuration data. Yandex Lockbox integrates with Managed Service for Apache Airflow™ via the Yandex Lockbox Secret Backend.
Below, we consider a directed acyclic graph (DAG) that runs the `SELECT 1;` SQL query against a database in a Yandex Managed Service for PostgreSQL cluster. The data for connecting to the database is stored in Yandex Lockbox and automatically substituted into the graph.
To use configuration data from a Yandex Lockbox secret in the graph:
- Set up your infrastructure.
- Create a Yandex Lockbox secret.
- Prepare the DAG file and run the graph.
- Check the result.
If you no longer need the resources you created, delete them.
Required paid resources
The support cost for this solution includes:
- Managed Service for PostgreSQL cluster fee: Computing resources and disk space (see Managed Service for PostgreSQL pricing).
- Managed Service for Apache Airflow™ cluster fee: Computing resources and disk space (see Apache Airflow™ pricing).
- Object Storage bucket fee: Storing data and performing operations with it (see Object Storage pricing).
- Fee for using a Yandex Lockbox secret (see Yandex Lockbox pricing).
- Fee for using public IP addresses if public access is enabled for cluster hosts (see Virtual Private Cloud pricing).
Set up your infrastructure
- Create a service account named `airflow-sa` with the following roles:
  - `managed-airflow.integrationProvider`
  - `lockbox.payloadViewer`

  You do not have to assign the `lockbox.payloadViewer` role for the whole folder: it is enough to assign it for the specific Yandex Lockbox secret once you create it.

- Create an Object Storage bucket in any configuration.

- Edit the ACL of the new bucket to give the `READ` permission to the `airflow-sa` service account.

- Create a Managed Service for Apache Airflow™ cluster with the following parameters:
  - Service account: `airflow-sa`.
  - Bucket name: Name of the new bucket.
  - Use Lockbox Secret Backend: Make sure to enable this option.

- Create a Managed Service for PostgreSQL cluster with the following parameters (see the connectivity check sketch after this list):
  - DB name: `db1`
  - Username: `user1`
  - Password: `user1-password`
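Optionally, you can confirm that the database accepts these credentials before they go into Lockbox. Below is a minimal sketch, assuming psycopg2 is installed, the cluster host is reachable from your machine, and its TLS settings allow `sslmode=require`; the host FQDN is a placeholder:

```python
# Hypothetical connectivity check (not part of the tutorial): verifies that the
# db1/user1 credentials work before they are stored in the Lockbox secret.
import psycopg2

conn = psycopg2.connect(
    host="<PostgreSQL_cluster_host_FQDN>",  # placeholder, as elsewhere in this tutorial
    port=6432,                              # Managed Service for PostgreSQL port
    dbname="db1",
    user="user1",
    password="user1-password",
    sslmode="require",                      # public hosts typically require TLS
)
with conn, conn.cursor() as cur:
    cur.execute("SELECT 1;")
    print(cur.fetchone())  # expected output: (1,)
conn.close()
```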
Create a Yandex Lockbox secret
For the Apache Airflow™ cluster to work correctly, your Yandex Lockbox secret's name must have this format: `airflow/<artifact_type>/<artifact_ID>`, where:
- `<artifact_type>`: Type of the artifact to store in the secret. The following types are available:
  - `connections`: Connections.
  - `variables`: Variables.
  - `config`: Configuration data.
- `<artifact_ID>`: ID to use to access the artifact in Apache Airflow™.
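To illustrate the mapping, here is a minimal sketch of how such artifact IDs are accessed from DAG code; it only runs inside the Apache Airflow™ cluster, and the `airflow/variables/my_var` secret is a hypothetical example, not part of this tutorial:

```python
# Sketch: how Lockbox secret names map to artifact IDs in DAG code.
# airflow/connections/pg   -> connection ID "pg" (used via conn_id parameters)
# airflow/variables/my_var -> Variable.get("my_var")  (hypothetical example)
from airflow.hooks.base import BaseHook
from airflow.models import Variable

pg_conn = BaseHook.get_connection("pg")  # resolved through the Lockbox secrets backend
my_var = Variable.get("my_var")          # would read the airflow/variables/my_var secret
```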
Create a Yandex Lockbox secret with the following parameters:
- Name: `airflow/connections/pg`
- Secret type: Custom
- Key: `airflow/connections/pg`
- Value: Select Text and specify the following contents:

  ```json
  {
    "conn_type": "postgres",
    "host": "<PostgreSQL_cluster_host_FQDN>",
    "port": 6432,
    "schema": "db1",
    "login": "user1",
    "password": "user1-password"
  }
  ```
The secret will store the data to connect to the database in the Managed Service for PostgreSQL cluster.
For more information on how to get the FQDN of a PostgreSQL cluster host, see the documentation.
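As a quick sanity check of the stored connection, a task could also read it through a hook rather than an operator. Here is a minimal sketch (a hypothetical addition, not part of this tutorial) that only works inside the Apache Airflow™ cluster, where the Lockbox secrets backend resolves the `pg` connection ID:

```python
# Hypothetical sketch: read the airflow/connections/pg secret through PostgresHook
# and run the same query. Intended to be called from a task (for example, a
# PythonOperator) inside the Managed Service for Apache Airflow™ cluster.
from airflow.providers.postgres.hooks.postgres import PostgresHook

def check_pg_connection() -> None:
    hook = PostgresHook(postgres_conn_id="pg")  # connection built from the Lockbox secret
    row = hook.get_first("SELECT 1;")           # expected result: (1,)
    print("Query result:", row)
```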
Prepare the DAG file and run the graph
- Create a local file named `test_lockbox_connection.py` and copy the following script to it:

  ```python
  from airflow import DAG
  from airflow.providers.postgres.operators.postgres import PostgresOperator
  from datetime import datetime

  with DAG(
      dag_id='test_lockbox_connection',
      start_date=datetime(2024, 4, 19),
      schedule="@once",
      catchup=False,
  ) as dag:
      check_conn = PostgresOperator(
          task_id="check_conn",
          postgres_conn_id='pg',
          sql="SELECT 1;",
      )
  ```

- Upload the `test_lockbox_connection.py` DAG file to the bucket you created earlier (one possible way is shown in the sketch after this list). This will automatically create a graph with the same name in the Apache Airflow™ web interface.

- Make sure a new graph named `test_lockbox_connection` has appeared in the DAGs section.

  It may take a few minutes to upload a DAG file from the bucket.

- To run the graph, click the run button in the line with its name.
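If you prefer to upload the DAG file programmatically rather than through the management console, here is a minimal sketch using the S3-compatible Object Storage API; it assumes you have boto3 installed and a static access key for a service account with write access to the bucket (the key values and bucket name are placeholders):

```python
# Sketch (one possible approach): upload the DAG file to the Object Storage bucket
# over its S3-compatible API using boto3 and a static access key.
import boto3

s3 = boto3.client(
    "s3",
    endpoint_url="https://storage.yandexcloud.net",  # Object Storage endpoint
    aws_access_key_id="<static_key_ID>",             # placeholder
    aws_secret_access_key="<static_key_secret>",     # placeholder
)
s3.upload_file(
    Filename="test_lockbox_connection.py",
    Bucket="<bucket_name>",            # placeholder
    Key="test_lockbox_connection.py",  # adjust if your cluster expects DAGs in a subfolder
)
```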
Check the result
To check the result in the Apache Airflow™ web interface:
- In the DAGs section, open the `test_lockbox_connection` graph.
- Go to the Graph section.
- Select the `check_conn` job.
- Go to Logs.
- Make sure the logs contain the `Rows affected: 1` string. This means the query was successful.
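You can also check the run state outside the web interface. Here is a rough sketch using the Apache Airflow™ REST API; the webserver URL and credentials are placeholders, and it assumes the API is reachable from your machine and accepts basic authentication:

```python
# Sketch (assumption): list the runs of test_lockbox_connection via the Airflow
# REST API and print their states; "success" means the graph completed.
import requests

AIRFLOW_URL = "https://<airflow_webserver_host>"  # placeholder
AUTH = ("admin", "<admin_password>")              # placeholder credentials

resp = requests.get(
    f"{AIRFLOW_URL}/api/v1/dags/test_lockbox_connection/dagRuns",
    auth=AUTH,
    timeout=30,
)
resp.raise_for_status()
for run in resp.json()["dag_runs"]:
    print(run["dag_run_id"], run["state"])
```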
Delete the resources you created
Some resources are not free of charge. Delete the resources you no longer need to avoid paying for them: