Networking in Managed Service for Apache Airflow™
When creating a cluster, you can specify the following network settings:
- Availability zones that can host cluster components.
- Network and subnets within it. Subnets correspond to the selected availability zones. There are certain requirements for subnets, because a cluster allocates special network addresses.
- Security groups to allow strictly defined outgoing traffic.
Cluster network addresses
A Managed Service for Apache Airflow™ cluster allocates special network addresses in its subnets. It uses them to establish connections to Yandex Cloud resources (the connections are specified in directed acyclic graphs (DAGs)). For example, you can set up a connection to a database in a Yandex Managed Service for PostgreSQL cluster.
The allocated network addresses are internal: the cluster connects to Yandex Cloud resources within the internal network. If you need to grant the cluster access to internet resources, configure a NAT gateway. When configuring it, link the route table with the NAT gateway to all the Managed Service for Apache Airflow™ cluster subnets.
The cluster allocates dynamic network addresses. These may change, e.g., during maintenance. Therefore, you should use cluster subnet ranges instead of particular addresses, e.g., in your on-premise firewall settings.
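As a rough illustration of matching on subnet ranges rather than on individual addresses, the sketch below checks whether a source IP belongs to any of the cluster subnets. The CIDR values and the `is_from_cluster` helper are hypothetical and stand in for your own subnet ranges and firewall logic.

```python
import ipaddress

# Hypothetical CIDR ranges of the subnets dedicated to the cluster.
# Replace them with the actual ranges of your cluster subnets.
CLUSTER_SUBNETS = [
    ipaddress.ip_network("10.1.0.0/24"),
    ipaddress.ip_network("10.2.0.0/24"),
]


def is_from_cluster(source_ip: str) -> bool:
    """Return True if source_ip falls within any cluster subnet range."""
    addr = ipaddress.ip_address(source_ip)
    return any(addr in net for net in CLUSTER_SUBNETS)
```

A rule written this way keeps working when the cluster's dynamic addresses change, as long as the new addresses stay within the same subnets.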
Connecting a cluster to your on-premise resources
You can set up network connectivity between a Managed Service for Apache Airflow™ cluster in the cloud and your on-premise resources so that the cluster can connect to them.
Limit access to your on-premise resources using a firewall. To allow traffic only from an Apache Airflow™ cluster, create separate subnets for it and specify their ranges in the firewall settings. Do not place any other resources in the subnets you created.
For more information about setting up such network connectivity, see the Yandex Cloud Interconnect documentation.
Requirements for cluster subnets
Each Managed Service for Apache Airflow™ cluster subnet must meet the following conditions:
- The cluster network range does not overlap with 10.248.0.0/13, the address range of the auxiliary network in which Yandex Cloud manages the Managed Service for Apache Airflow™ cluster components.
  The cluster network range combines the ranges of all subnets in this network, including subnets not assigned to the cluster. For example, if the cluster is in subnet-a, while the network also contains subnet-b and subnet-d, none of these subnets can have a range overlapping with 10.248.0.0/13.
  If this condition is not met, you will get an error when creating the cluster.
  This requirement also applies to your on-premise networks. From an Apache Airflow™ cluster, you will not be able to connect to resources with IPs from the 10.248.0.0/13 range.
- The subnet range includes at least 2 × N vacant IP addresses, where N is the total number of instances of all components. For example, if the cluster consists of two web servers, three schedulers, five workers, and one Triggerer service, then N = 11, and the subnet must have at least 22 vacant addresses.
  This is the number of addresses you need for the cluster's special network addresses. If there are not enough vacant addresses in the subnet, the cluster will not be able to operate properly.
  To figure out the number of vacant IP addresses in the subnet, calculate its size from its mask and then find out how many addresses are occupied. However, as the number of occupied IP addresses may vary, it is better to select a large enough subnet.
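The two subnet conditions can be pre-checked with a short script before you create a cluster. This is a minimal sketch using Python's standard `ipaddress` module. The function names, the subnet CIDRs in the usage example, and the occupied-address count are assumptions for illustration. The size derived from the mask counts every address in the range, so the `occupied` value you pass in should include any addresses that cannot be allocated.

```python
import ipaddress

# Address range of the auxiliary network named in the requirements above.
RESERVED = ipaddress.ip_network("10.248.0.0/13")


def overlapping_subnets(network_subnets: dict[str, str]) -> list[str]:
    """Return names of subnets whose range overlaps 10.248.0.0/13.

    Pass every subnet of the network, not only the cluster's own subnets.
    """
    return [
        name
        for name, cidr in network_subnets.items()
        if ipaddress.ip_network(cidr).overlaps(RESERVED)
    ]


def required_addresses(web_servers: int, schedulers: int,
                       workers: int, triggerers: int) -> int:
    """Return 2 x N, where N is the total number of component instances."""
    return 2 * (web_servers + schedulers + workers + triggerers)


def has_enough_vacant(cidr: str, occupied: int, required: int) -> bool:
    """Compare the subnet size (from its mask) minus occupied addresses."""
    size = ipaddress.ip_network(cidr).num_addresses
    return size - occupied >= required


# Hypothetical network layout and the component counts from the example above.
subnets = {
    "subnet-a": "10.128.0.0/24",
    "subnet-b": "10.250.0.0/16",
    "subnet-d": "192.168.0.0/24",
}
print(overlapping_subnets(subnets))          # subnets violating the first condition
needed = required_addresses(2, 3, 5, 1)      # N = 11, so 22 addresses
print(has_enough_vacant("10.128.0.0/24", occupied=10, required=needed))
```

Because the number of occupied addresses can change over time, a passing check is only a snapshot; choosing a subnet with generous headroom remains the safer option.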
Security groups
Security group settings do not affect how a Managed Service for Apache Airflow™ cluster operates; they apply only to outgoing connections from the cluster.
Because security groups limit only the cluster's outgoing traffic, there is no need to set rules for incoming traffic. Outgoing traffic rules allow the cluster to connect only to the specified resources. Security group settings affect neither access to the Apache Airflow™ web interface nor incoming traffic, which reaches only the cluster's web server.
If you do not assign a security group to the Managed Service for Apache Airflow™ cluster, it will automatically be assigned the default security group of the cluster network. When the network is created, its default security group allows all traffic.
Tip
When connecting to a Yandex Cloud resource from the cloud network of a Managed Service for Apache Airflow™ cluster, also set up security groups for the resource the cluster is connecting to.