Connecting to a Yandex Object Storage bucket with a bucket policy
Updated at December 24, 2024
In Yandex Managed Service for Apache Airflow™, you can work with a Yandex Object Storage bucket that has access policies configured. The bucket is accessed from a separate DNS zone through an internal load balancer that distributes traffic among NAT instances. The connection is shown in the diagram below. An Apache Airflow™ cluster is used in place of the `test-s3-vm` VM.
Getting started
- Create a network infrastructure to access the Object Storage bucket as shown in the diagram above. For information on how to create such an infrastructure, see this tutorial.
- Test the new infrastructure.
- To connect to the bucket you created from Apache Airflow™, edit the bucket access policy: in the `Action` parameter, specify the operations allowed for Apache Airflow™, `s3:GetObject` and `s3:ListBucket`. Then apply the changes with the `terraform apply` command.
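
To illustrate the policy change in the last step, here is a minimal Python sketch that narrows the `Action` value of a policy document to the two operations named above. It assumes the policy uses the AWS-compatible JSON layout (`Version`, `Statement`, `Effect`, `Principal`, `Action`, `Resource`). The `existing_policy` value, the `my-bucket` name, and the `restrict_actions` helper are hypothetical and stand in for your own configuration.

```python
import copy
import json

# Operations this guide allows for Apache Airflow™.
AIRFLOW_ACTIONS = ["s3:GetObject", "s3:ListBucket"]


def restrict_actions(policy: dict, actions: list[str]) -> dict:
    """Return a copy of the policy with every statement's Action replaced."""
    updated = copy.deepcopy(policy)  # leave the caller's policy untouched
    for statement in updated["Statement"]:
        statement["Action"] = list(actions)
    return updated


# Hypothetical stand-in for the policy created in the tutorial.
# Keep the Principal and any Condition block from your own policy as they are.
existing_policy = {
    "Version": "2012-10-17",
    "Statement": [
        {
            "Effect": "Allow",
            "Principal": "*",
            "Action": "*",
            "Resource": [
                "arn:aws:s3:::my-bucket",    # the bucket itself
                "arn:aws:s3:::my-bucket/*",  # objects in the bucket
            ],
        }
    ],
}

updated_policy = restrict_actions(existing_policy, AIRFLOW_ACTIONS)
print(json.dumps(updated_policy, indent=2))
```

The printed JSON only shows the shape of the change. In this guide the policy is managed through Terraform, so you would make the equivalent edit to the `Action` value in your Terraform configuration before running `terraform apply`.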
Prepare the Apache Airflow™ cluster
- Create a service account named `my-account` with the `vpc.user` and `managed-airflow.integrationProvider` roles.
- Grant the `READ` permission for the bucket you created earlier to the `my-account` service account.
- Create an Apache Airflow™ cluster and specify the `my-account` service account in it.
Test the connection
To test the connection to the Object Storage bucket, upload a DAG file to the bucket. The DAG should then appear in the Apache Airflow™ web interface.