Creating an Apache Airflow™ cluster

Written by Yandex Cloud
Updated on May 5, 2025
  • Roles for creating a cluster
  • Creating a cluster

Every Managed Service for Apache Airflow™ cluster consists of a set of Apache Airflow™ components, each of which can be represented in multiple instances. The instances may reside in different availability zones.

Roles for creating a cluster

To create a Managed Service for Apache Airflow™ cluster, your Yandex Cloud account needs the following roles:

  • managed-airflow.editor: To create a cluster.
  • vpc.user: To use the cluster network.
  • iam.serviceAccounts.user: To link a service account to a cluster.

Make sure to assign the managed-airflow.integrationProvider role to the cluster's service account. The cluster will thus get the permissions it needs to work with user resources. For more information, see Impersonation.

For more information about assigning roles, see the Yandex Identity and Access Management documentation.
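For example, here is a minimal sketch of assigning two of these roles with the YC CLI (the folder, user, and service account IDs are placeholders):

    # Allows your account to create clusters in the folder.
    yc resource-manager folder add-access-binding <folder_name_or_ID> \
       --role managed-airflow.editor \
       --subject userAccount:<your_user_account_ID>

    # Lets the cluster's service account work with user resources.
    yc resource-manager folder add-access-binding <folder_name_or_ID> \
       --role managed-airflow.integrationProvider \
       --subject serviceAccount:<service_account_ID>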

Creating a cluster

Management console
  1. In the management console, select the folder where you want to create a cluster.

  2. Select Managed Service for Apache Airflow™.

  3. Click Create a cluster.

  4. Under Basic parameters:

    1. Enter a name for the cluster. The name must be unique within the folder.
    2. (Optional) Enter a cluster description.
    3. (Optional) Create labels:
      1. Click Add label.
      2. Enter a label in key: value format.
      3. Press Enter.
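      For example, a label in this format could be env: production.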
  5. Under Access settings:

    • Set a password for the admin user. The password must be at least 8 characters long and contain at least:

      • One uppercase letter
      • One lowercase letter
      • One number
      • One special character

      Note

      Save the password locally or memorize it. The service does not show passwords after the cluster is created.

    • Select an existing service account or create a new one.

      Make sure to assign the managed-airflow.integrationProvider role to the service account.

  6. Under Network settings, select:

    • Availability zones for the cluster.

    • Cloud network.

    • Subnet in each of the selected availability zones.

      To ensure that only an Apache Airflow™ cluster can connect to your resources, create separate subnets for the cluster and do not host any other resources in those subnets.

      Make sure your subnets meet the following requirements:

      • For each subnet in the cluster network (including those not assigned to the cluster), the IP address range does not overlap with that of the 10.248.0.0/13 auxiliary subnet.
      • The IP address range of each cluster subnet includes at least 2 × N vacant addresses, where N is the total number of instances of all components: web server, scheduler, workers, and triggerer service.
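      For example, a cluster with one web server instance, one scheduler instance, up to three workers, and one triggerer instance has N = 6, so each cluster subnet needs at least 12 vacant addresses.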

      For more information, see Requirements for cluster subnets.

    • Security group for the cluster network traffic.

      Security group settings do not affect access to the Apache Airflow™ web interface.

  7. Set the number of instances and a computing resource configuration for the Managed Service for Apache Airflow™ components:

    • Web server

    • Scheduler

    • Workers

      Note

      If the task queue is empty, the number of workers stays at the minimum value. As tasks arrive, the number of workers grows up to the maximum value.

    • (Optional) Triggerer services

  8. (Optional) Under Dependencies, specify pip and deb package names to install additional libraries and applications in the cluster to run DAG files.

    To specify multiple packages, click Add.

    If required, you can set version restrictions for the installed packages, for example:

    pandas==2.0.2
    scikit-learn>=1.0.0
    clickhouse-driver~=0.2.0
    

    The package name format and version are defined by the install command: pip install for pip packages and apt install for deb packages.

    Warning

    To install pip and deb packages from public repositories, specify a network with configured egress NAT under Network settings.

  9. Under DAG file storage, select a bucket or create a new one. This bucket will store DAG files.

    Make sure to grant the READ permission for this bucket to the cluster service account; one way to do this is sketched below.
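    A sketch of granting read access, assuming folder-wide read via the storage.viewer role fits your setup (a stricter per-bucket ACL can be configured in the management console instead):

    yc resource-manager folder add-access-binding <folder_name_or_ID> \
       --role storage.viewer \
       --subject serviceAccount:<service_account_ID>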

  10. (Optional) Under Advanced settings, enable cluster deletion protection.

  11. (Optional) Under Airflow configuration:

    • Specify additional Apache Airflow™ properties, e.g., api.maximum_page_limit as a key and 150 as its value.

      Fill in the fields manually or import the settings from a configuration file (see configuration file example).

    • Enable the Use Lockbox Secret Backend option to use secrets in Yandex Lockbox to store Apache Airflow™ configuration data, variables, and connection parameters.

      To extract the required information from the secret, the cluster service account must have the lockbox.payloadViewer role.

      You can assign this role either at the level of the whole folder or at the level of an individual secret, as in the sketch below.
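      For example, a sketch of assigning the role on an individual secret with the YC CLI (the secret and service account IDs are placeholders):

      yc lockbox secret add-access-binding <secret_ID> \
         --role lockbox.payloadViewer \
         --subject serviceAccount:<service_account_ID>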

  12. (Optional) Under Log settings, enable logging. Logs generated by Apache Airflow™ components will be sent to Yandex Cloud Logging. Set logging parameters:

    • In the Destination field, specify the logging destination:

      • Folder: Select the folder. Logs will be written to the default log group for this folder.
      • Log group: Select a custom log group or create a new one.
    • Select the minimum logging level: TRACE, DEBUG, INFO (default), WARN, ERROR, or FATAL.

  13. Click Create.

CLI

If you do not have the Yandex Cloud CLI yet, install and initialize it.

The folder specified when creating the CLI profile is used by default. To change the default folder, use the yc config set folder-id <folder_ID> command. You can specify a different folder using the --folder-name or --folder-id parameter.

To create a Managed Service for Apache Airflow™ cluster:

  1. View the description of the CLI command to create a cluster:

    yc managed-airflow cluster create --help
    
  2. Specify cluster parameters in the create command (the list of supported parameters in the example is not exhaustive):

    yc managed-airflow cluster create \
       --name <cluster_name> \
       --description <cluster_description> \
       --labels <label_list> \
       --admin-password <administrator_password> \
       --service-account-id <service_account_ID> \
       --subnet-ids <subnet_IDs> \
       --security-group-ids <security_group_IDs> \
       --webserver count=<number_of_instances>,`
                  `resource-preset-id=<resource_ID> \
       --scheduler count=<number_of_instances>,`
                  `resource-preset-id=<resource_ID> \
       --worker min-count=<minimum_number_of_instances>,`
               `max-count=<maximum_number_of_instances>,`
               `resource-preset-id=<resource_ID> \
       --triggerer count=<number_of_instances>,`
                  `resource-preset-id=<resource_ID> \
       --deb-packages <list_of_deb_packages> \
       --pip-packages <list_of_pip_packages> \
       --dags-bucket <bucket_name> \
       --deletion-protection \
       --lockbox-secrets-backend \
       --airflow-config <list_of_properties> \
       --log-enabled \
       --log-folder-id <folder_ID> \
       --log-min-level <logging_level>
    

    Where:

    • --name: Cluster name.

    • --description: Cluster description.

    • --labels: List of labels. Provide labels in <key>=<value> format.

    • --admin-password: Admin user password. The password must be at least 8 characters long and contain at least:

      • One uppercase letter
      • One lowercase letter
      • One number
      • One special character
    • --service-account-id: Service account ID.

    • --subnet-ids: List of subnet IDs.

      To ensure that only an Apache Airflow™ cluster can connect to your resources, create separate subnets for the cluster and do not host any other resources in those subnets.

      Make sure your subnets meet the following requirements:

      • For each subnet in the cluster network (including those not assigned to the cluster), the IP address range does not overlap with that of the 10.248.0.0/13 auxiliary subnet.
      • The IP address range of each cluster subnet includes at least 2 × N vacant addresses, where N is the total number of instances of all components: web server, scheduler, workers, and triggerer service.

      For more information, see Requirements for cluster subnets.

    • --security-group-ids: List of security group IDs.

    • --webserver, --scheduler, --worker, --triggerer: Managed Service for Apache Airflow™ component configuration:

      • count: Number of instances in the cluster for the web server, scheduler, and Triggerer.

      • min-count, max-count: Minimum and maximum number of instances in the cluster for the worker.

      • resource-preset-id: ID of the computing resources of the web server, scheduler, worker, and Triggerer. The possible values are:

        • c1-m2: 1 vCPU, 2 GB RAM
        • c1-m4: 1 vCPU, 4 GB RAM
        • c2-m4: 2 vCPUs, 4 GB RAM
        • c2-m8: 2 vCPUs, 8 GB RAM
        • c4-m8: 4 vCPUs, 8 GB RAM
        • c4-m16: 4 vCPUs, 16 GB RAM
        • c8-m16: 8 vCPUs, 16 GB RAM
        • c8-m32: 8 vCPUs, 32 GB RAM
    • --deb-packages, --pip-packages: Lists of deb and pip packages enabling you to install additional libraries and applications in the cluster for running DAG files:

      If required, you can set version restrictions for the installed packages, for example:

      --pip-packages "pandas==2.0.2,scikit-learn>=1.0.0,clickhouse-driver~=0.2.0"
      

      The package name format and version are defined by the install command: pip install for pip packages and apt install for deb packages.

    • --dags-bucket: Name of the bucket to store DAG files in.

    • --deletion-protection: Enables cluster protection against accidental deletion.

      Even if it is enabled, one can still connect to the cluster manually and delete the data.

    • --lockbox-secrets-backend: Enables using secrets in Yandex Lockbox to store Apache Airflow™ configuration data, variables, and connection parameters.

    • --airflow-config: Additional Apache Airflow™ properties. Provide them in <configuration_section>.<key>=<value> format, such as the following:

      --airflow-config core.load_examples=False
      
    • Logging parameters:

      • --log-enabled: Enables logging. Logs generated by Apache Airflow™ will be sent to Yandex Cloud Logging.

      • --log-folder-id: Folder ID. Logs will be written to the default log group for this folder.

      • --log-group-id: Custom log group ID. Logs will be written to this group.

        Specify one of the two parameters: --log-folder-id or --log-group-id.

      • --log-min-level: Minimum logging level. Possible values: TRACE, DEBUG, INFO (default), WARN, ERROR, and FATAL.

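    For example, a minimal create command could look like this (the values are hypothetical and use only the flags described above):

    yc managed-airflow cluster create \
       --name airflow-demo \
       --admin-password <administrator_password> \
       --service-account-id <service_account_ID> \
       --subnet-ids <subnet_ID> \
       --webserver count=1,resource-preset-id=c1-m4 \
       --scheduler count=1,resource-preset-id=c1-m4 \
       --worker min-count=1,max-count=2,resource-preset-id=c2-m4 \
       --dags-bucket <bucket_name>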

Terraform

With Terraform, you can quickly create a cloud infrastructure in Yandex Cloud and manage it using configuration files. These files store the infrastructure description written in HashiCorp Configuration Language (HCL). If you change the configuration files, Terraform automatically detects which part of your configuration is already deployed, and what should be added or removed.

Terraform is distributed under the Business Source License. The Yandex Cloud provider for Terraform is distributed under the MPL-2.0 license.

For more information about the provider resources, see the documentation on the Terraform website or mirror website.

If you do not have Terraform yet, install it and configure its Yandex Cloud provider.

To create a Managed Service for Apache Airflow™ cluster:

  1. In the configuration file, describe the resources you are creating:

    • Managed Service for Apache Airflow™ cluster: Cluster description.

    • Network: Description of the cloud network where a cluster will be located. If you already have a suitable network, you don't have to describe it again.

    • Subnets: Description of the subnets to connect the cluster hosts to. If you already have suitable subnets, you don't have to describe them again.

    Here is the configuration file example:

    resource "yandex_airflow_cluster" "<cluster_name>" {
      name        = "<cluster_name>"
      description = "<cluster_description>"
    
      labels = { <label_list> }
    
      admin_password     = "<administrator_password>"
      service_account_id = "<service_account_ID>"
      subnet_ids         = ["<list_of_subnet_IDs>"]
      security_group_ids = ["<list_of_security_group_IDs>"]
    
      webserver = {
        count              = <number_of_instances>
        resource_preset_id = "<resource_ID>"
      }
    
      scheduler = {
        count              = <number_of_instances>
        resource_preset_id = "<resource_ID>"
      }
    
      worker = {
        min_count          = <minimum_number_of_instances>
        max_count          = <maximum_number_of_instances>
        resource_preset_id = "<resource_ID>"
      }
    
      triggerer = {
        count              = <number_of_instances>
        resource_preset_id = "<resource_ID>"
      }
    
      pip_packages = ["<list_of_pip_packages>"]
      deb_packages = ["<list_of_deb_packages>"]
    
      code_sync = {
        s3 = {
          bucket = "<bucket_name>"
        }
      }
    
      deletion_protection = <deletion_protection>
    
      lockbox_secrets_backend = {
        enabled = <usage_of_secrets>
      }
    
      airflow_config = {
        <configuration_section> = {
          <key> = "<value>"
        }
      }
    
      logging = {
        enabled   = <use_of_logging>
        folder_id = "<folder_ID>"
        min_level = "<logging_level>"
      }
    }
    
    resource "yandex_vpc_network" "<network_name>" { name = "<network_name>" }
    
    resource "yandex_vpc_subnet" "<subnet_name>" {
      name           = "<subnet_name>"
      zone           = "<availability_zone>"
      network_id     = "<network_ID>"
      v4_cidr_blocks = ["<range>"]
    }
    

    Where:

    • name: Cluster name.

    • description: Cluster description.

    • labels: List of labels. Provide labels in <key> = "<value>" format.

    • admin_password: Admin user password. The password must be at least 8 characters long and contain at least:

      • One uppercase letter
      • One lowercase letter
      • One number
      • One special character
    • service_account_id: Service account ID.

    • subnet_ids: List of subnet IDs.

      To ensure that only an Apache Airflow™ cluster can connect to your resources, create separate subnets for the cluster and do not host any other resources in those subnets.

      Make sure your subnets meet the following requirements:

      • For each subnet in the cluster network (including those not assigned to the cluster), the IP address range does not overlap with that of the 10.248.0.0/13 auxiliary subnet.
      • The IP address range of each cluster subnet includes at least 2 × N vacant addresses, where N is the total number of instances of all components: web server, scheduler, workers, and triggerer service.

      For more information, see Requirements for cluster subnets.

    • security_group_ids: List of security group IDs.

    • webserver, scheduler, worker, triggerer: Managed Service for Apache Airflow™ component configuration:

      • count: Number of instances in the cluster for the web server, scheduler, and Triggerer.

      • min_count, max_count: Minimum and maximum number of instances in the cluster for the worker.

      • resource_preset_id: ID of the computing resources of the web server, scheduler, worker, and Triggerer. The possible values are:

        • c1-m2: 1 vCPU, 2 GB RAM
        • c1-m4: 1 vCPU, 4 GB RAM
        • c2-m4: 2 vCPUs, 4 GB RAM
        • c2-m8: 2 vCPUs, 8 GB RAM
        • c4-m8: 4 vCPUs, 8 GB RAM
        • c4-m16: 4 vCPUs, 16 GB RAM
        • c8-m16: 8 vCPUs, 16 GB RAM
        • c8-m32: 8 vCPUs, 32 GB RAM
    • deb_packages, pip_packages: Lists of deb and pip packages enabling you to install additional libraries and applications in the cluster for running DAG files:

      If required, you can set version restrictions for the installed packages, for example:

      pip_packages = ["pandas==2.0.2","scikit-learn>=1.0.0","clickhouse-driver~=0.2.0"]
      

      The package name format and version are defined by the install command: pip install for pip packages and apt install for deb packages.

    • code_sync.s3.bucket: Name of the bucket to store DAG files in.

    • deletion_protection: Enables cluster protection against accidental deletion. The possible values are true or false.

      Even if it is enabled, one can still connect to the cluster manually and delete the data.

    • lockbox_secrets_backend.enabled: Enables using secrets in Yandex Lockbox to store Apache Airflow™ configuration data, variables, and connection parameters. The possible values are true or false.

    • airflow_config: Additional Apache Airflow™ properties, e.g., core for configuration section, load_examples for key, and False for value.

    • logging: Logging parameters:

      • enabled: Enables logging. Logs generated by Apache Airflow™ components will be sent to Yandex Cloud Logging. The possible values are true or false.

      • folder_id: Folder ID. Logs will be written to the default log group for this folder.

      • log_group_id: Custom log group ID. Logs will be written to this group.

        Specify one of the two parameters: folder_id or log_group_id.

      • min_level: Minimum logging level. Possible values: TRACE, DEBUG, INFO (default), WARN, ERROR, and FATAL.


  2. Make sure the settings are correct.

    1. In the command line, navigate to the directory that contains the current Terraform configuration files defining the infrastructure.

    2. Run this command:

      terraform validate
      

      Terraform will show any errors found in your configuration files.

  3. Create a Managed Service for Apache Airflow™ cluster.

    1. Run this command to view the planned changes:

      terraform plan
      

      If you described the configuration correctly, the terminal will display a list of the resources to update and their parameters. This is a verification step that does not apply changes to your resources.

    2. If everything looks correct, apply the changes:

      1. Run this command:

        terraform apply
        
      2. Confirm updating the resources.

      3. Wait for the operation to complete.

    This will create all resources you need in the specified folder. You can check the new resources and their settings using the management console.
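    A quick check from the command line, assuming the YC CLI is installed and configured (a sketch, not the only way):

    yc managed-airflow cluster list --folder-id <folder_ID>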

For more information, see the Terraform provider documentation.

REST API

  1. Get an IAM token for API authentication and put it into the environment variable:

    export IAM_TOKEN="<IAM_token>"
    
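    If you have the YC CLI installed, one common way to obtain the token is:

    export IAM_TOKEN=$(yc iam create-token)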
  2. Create a file named body.json and add the following contents to it:

    {
      "folderId": "<folder_ID>",
      "name": "<cluster_name>",
      "description": "<cluster_description>",
      "labels": { <label_list> },
      "config": {
        "versionId": "<Apache Airflow™>_version",
        "airflow": {
          "config": { <list_of_properties> }
        },
        "webserver": {
          "count": "<number_of_instances>",
          "resources": {
            "resourcePresetId": "<resource_ID>"
          }
        },
        "scheduler": {
          "count": "<number_of_instances>",
          "resources": {
            "resourcePresetId": "<resource_ID>"
          }
        },
        "triggerer": {
          "count": "<number_of_instances>",
          "resources": {
            "resourcePresetId": "<resource_ID>"
          }
        },
        "worker": {
          "minCount": "<minimum_number_of_instances>",
          "maxCount": "<maximum_number_of_instances>",
          "resources": {
            "resourcePresetId": "<resource_ID>"
          }
        },
        "dependencies": {
          "pipPackages": [ <list_of_pip_packages> ],
          "debPackages": [ <list_of_deb_packages> ]
        },
        "lockbox": {
          "enabled": <use_of_logging>
        }
      },
      "network": {
        "subnetIds": [ <list_of_subnet_IDs> ],
        "securityGroupIds": [ <list_of_security_group_IDs> ]
      },
      "codeSync": {
        "s3": {
          "bucket": "<bucket_name>"
        }
      },
      "deletionProtection": <deletion_protection>,
      "serviceAccountId": "<service_account_ID>",
      "logging": {
        "enabled": <use_of_logging>,
        "minLevel": "<logging_level>",
        "folderId": "<folder_ID>"
      },
      "adminPassword": "<administrator_password>"
    }
    

    Where:

    • folderId: Folder ID. You can request it with the list of folders in the cloud.

    • name: Cluster name.

    • description: Cluster description.

    • labels: List of labels. Provide labels in "<key>": "<value>" format.

    • config: Cluster configuration:

      • versionId: Apache Airflow™ version.

      • airflow.config: Additional Apache Airflow™ properties. Provide them in "<configuration_section>.<key>": "<value>" format, for example:

        "airflow": {
          "config": {
            "core.load_examples": "False"
          }
        }
        
      • webserver, scheduler, triggerer, worker: Managed Service for Apache Airflow™ component configuration:

        • count: Number of instances in the cluster for the web server, scheduler, and Triggerer.

        • minCount, maxCount: Minimum and maximum number of instances in the cluster for the worker.

        • resources.resourcePresetId: ID of the computing resources of the web server, scheduler, worker, and Triggerer. The possible values are:

          • c1-m2: 1 vCPU, 2 GB RAM
          • c1-m4: 1 vCPU, 4 GB RAM
          • c2-m4: 2 vCPUs, 4 GB RAM
          • c2-m8: 2 vCPUs, 8 GB RAM
          • c4-m8: 4 vCPUs, 8 GB RAM
          • c4-m16: 4 vCPUs, 16 GB RAM
          • c8-m16: 8 vCPUs, 16 GB RAM
          • c8-m32: 8 vCPUs, 32 GB RAM
      • dependencies: Lists of packages enabling you to install additional libraries and applications for running DAG files in the cluster:

        • pipPackages: List of pip packages.
        • debPackages: List of deb packages.

        If required, you can set version restrictions for the installed packages, for example:

        "dependencies": {
          "pipPackages": [
            "pandas==2.0.2",
            "scikit-learn>=1.0.0",
            "clickhouse-driver~=0.2.0"
          ]
        }
        

        The package name format and version are defined by the install command: pip install for pip packages and apt install for deb packages.

      • lockbox.enabled: Enables using secrets in Yandex Lockbox to store Apache Airflow™ configuration data, variables, and connection parameters. The possible values are true or false.

    • network: Network settings:

      • subnetIds: List of subnet IDs.

        To ensure that only an Apache Airflow™ cluster can connect to your resources, create separate subnets for the cluster and do not host any other resources in those subnets.

        Make sure your subnets meet the following requirements:

        • For each subnet in the cluster network (including those not assigned to the cluster), the IP address range does not overlap with that of the 10.248.0.0/13 auxiliary subnet.
        • The IP address range of each cluster subnet includes at least 2 × N vacant addresses, where N is the total number of instances of all components: web server, scheduler, workers, and triggerer service.

        For more information, see Requirements for cluster subnets.

      • securityGroupIds: List of security group IDs.

    • codeSync.s3.bucket: Name of the bucket to store DAG files in.

    • deletionProtection: Enables cluster protection against accidental deletion. The possible values are true or false.

      Even if it is enabled, one can still connect to the cluster manually and delete the data.

    • serviceAccountId: Service account ID.

    • logging: Logging parameters:

      • enabled: Enables logging. Logs generated by Apache Airflow™ components will be sent to Yandex Cloud Logging. The possible values are true or false.

      • minLevel: Minimum logging level. Possible values: TRACE, DEBUG, INFO, WARN, ERROR, and FATAL.

      • folderId: Folder ID. Logs will be written to the default log group for this folder.

      • logGroupId: Custom log group ID. Logs will be written to this group.

        Specify either folderId or logGroupId.

    • adminPassword: Admin user password. The password must be at least 8 characters long and contain at least:

      • One uppercase letter
      • One lowercase letter
      • One number
      • One special character

      Note

      Save the password locally or memorize it. The service does not show passwords after the cluster is created.

  3. Use the Cluster.create method and send the following request, e.g., via cURL:

    curl \
        --request POST \
        --header "Authorization: Bearer $IAM_TOKEN" \
        --url 'https://airflow.api.cloud.yandex.net/managed-airflow/v1/clusters' \
        --data '@body.json'
    
  4. View the server response to make sure the request was successful.
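    The response contains an Operation object. If you use the YC CLI, a sketch for checking the operation status (the operation ID is taken from the response):

    yc operation get <operation_ID>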

gRPC API

  1. Get an IAM token for API authentication and put it into the environment variable:

    export IAM_TOKEN="<IAM_token>"
    
  2. Clone the cloudapi repository:

    cd ~/ && git clone --depth=1 https://github.com/yandex-cloud/cloudapi
    

    Below, we assume the repository contents are stored in the ~/cloudapi/ directory.

  3. Create a file named body.json and add the following contents to it:

    {
      "folder_id": "<folder_ID>",
      "name": "<cluster_name>",
      "description": "<cluster_description>",
      "labels": { <label_list> },
      "config": {
        "version_id": "<Apache Airflow™>_version",
        "airflow": {
          "config": { <list_of_properties> }
        },
        "webserver": {
          "count": "<number_of_instances>",
          "resources": {
            "resource_preset_id": "<resource_ID>"
          }
        },
        "scheduler": {
          "count": "<number_of_instances>",
          "resources": {
            "resource_preset_id": "<resource_ID>"
          }
        },
        "triggerer": {
          "count": "<number_of_instances>",
          "resources": {
            "resource_preset_id": "<resource_ID>"
          }
        },
        "worker": {
          "min_count": "<minimum_number_of_instances>",
          "max_count": "<maximum_number_of_instances>",
          "resources": {
            "resource_preset_id": "<resource_ID>"
          }
        },
        "dependencies": {
          "pip_packages": [ <list_of_pip_packages> ],
          "deb_packages": [ <list_of_deb_packages> ]
        },
        "lockbox": {
          "enabled": <use_of_logging>
        }
      },
      "network": {
        "subnet_ids": [ <list_of_subnet_IDs> ],
        "security_group_ids": [ <list_of_security_group_IDs> ]
      },
      "code_sync": {
        "s3": {
          "bucket": "<bucket_name>"
        }
      },
      "deletion_protection": <deletion_protection>,
      "service_account_id": "<service_account_ID>",
      "logging": {
        "enabled": <use_of_logging>,
        "min_level": "<logging_level>",
        "folder_id": "<folder_ID>"
      },
      "admin_password": "<administrator_password>"
    }
    

    Where:

    • folder_id: Folder ID. You can request it with the list of folders in the cloud.

    • name: Cluster name.

    • description: Cluster description.

    • labels: List of labels. Provide labels in "<key>": "<value>" format.

    • config: Cluster configuration:

      • version_id: Apache Airflow™ version.

      • airflow.config: Additional Apache Airflow™ properties. Provide them in "<configuration_section>.<key>": "<value>" format, for example:

        "airflow": {
          "config": {
            "core.load_examples": "False"
          }
        }
        
      • webserver, scheduler, triggerer, worker: Managed Service for Apache Airflow™ component configuration:

        • count: Number of instances in the cluster for the web server, scheduler, and Triggerer.

        • min_count, max_count: Minimum and maximum number of instances in the cluster for the worker.

        • resources.resource_preset_id: ID of the computing resources of the web server, scheduler, worker, and Triggerer. The possible values are:

          • c1-m2: 1 vCPU, 2 GB RAM
          • c1-m4: 1 vCPU, 4 GB RAM
          • c2-m4: 2 vCPUs, 4 GB RAM
          • c2-m8: 2 vCPUs, 8 GB RAM
          • c4-m8: 4 vCPUs, 8 GB RAM
          • c4-m16: 4 vCPUs, 16 GB RAM
          • c8-m16: 8 vCPUs, 16 GB RAM
          • c8-m32: 8 vCPUs, 32 GB RAM
      • dependencies: Lists of packages enabling you to install additional libraries and applications for running DAG files in the cluster:

        • pip_packages: List of pip packages.
        • deb_packages: List of deb packages.

        If required, you can set version restrictions for the installed packages, for example:

        "dependencies": {
          "pip_packages": [
            "pandas==2.0.2",
            "scikit-learn>=1.0.0",
            "clickhouse-driver~=0.2.0"
          ]
        }
        

        The package name format and version are defined by the install command: pip install for pip packages and apt install for deb packages.

      • lockbox.enabled: Enables using secrets in Yandex Lockbox to store Apache Airflow™ configuration data, variables, and connection parameters. The possible values are true or false.

    • network: Network settings:

      • subnet_ids: List of subnet IDs.

        To ensure that only an Apache Airflow™ cluster can connect to your resources, create separate subnets for the cluster and do not host any other resources in those subnets.

        Make sure your subnets meet the following requirements:

        • For each subnet in the cluster network (including those not assigned to the cluster), the IP address range does not overlap with that of the 10.248.0.0/13 auxiliary subnet.
        • The IP address range of each cluster subnet includes at least 2 × N vacant addresses, where N is the total number of instances of all components: web server, scheduler, workers, and triggerer service.

        For more information, see Requirements for cluster subnets.

      • security_group_ids: List of security group IDs.

    • code_sync.s3.bucket: Name of the bucket to store DAG files in.

    • deletion_protection: Enables cluster protection against accidental deletion. The possible values are true or false.

      Even if it is enabled, one can still connect to the cluster manually and delete the data.

    • service_account_id: Service account ID.

    • logging: Logging parameters:

      • enabled: Enables logging. Logs generated by Apache Airflow™ components will be sent to Yandex Cloud Logging. The possible values are true or false.

      • min_level: Minimum logging level. Possible values: TRACE, DEBUG, INFO, WARN, ERROR, and FATAL.

      • folder_id: Folder ID. Logs will be written to the default log group for this folder.

      • log_group_id: Custom log group ID. Logs will be written to this group.

        Specify either folder_id or log_group_id.

    • admin_password: Admin user password. The password must be at least 8 characters long and contain at least:

      • One uppercase letter
      • One lowercase letter
      • One number
      • One special character

      Note

      Save the password locally or memorize it. The service does not show passwords after the cluster is created.

  4. Use the ClusterService/Create call and send the following request, e.g., via gRPCurl:

    grpcurl \
        -format json \
        -import-path ~/cloudapi/ \
        -import-path ~/cloudapi/third_party/googleapis/ \
        -proto ~/cloudapi/yandex/cloud/airflow/v1/cluster_service.proto \
        -rpc-header "Authorization: Bearer $IAM_TOKEN" \
        -d @ \
        airflow.api.cloud.yandex.net:443 \
        yandex.cloud.airflow.v1.ClusterService.Create \
        < body.json
    
  5. View the server response to make sure the request was successful.
