A Yandex project
© 2025 Yandex.Cloud LLC
Yandex Data Processing API, REST: Cluster.Create

Article created by Yandex Cloud
Updated on April 24, 2025
  • HTTP request
  • Body parameters
  • CreateClusterConfigSpec
  • HadoopConfig
  • InitializationAction
  • CreateSubclusterConfigSpec
  • Resources
  • AutoscalingConfig
  • Response
  • CreateClusterMetadata
  • Status
  • Cluster
  • Monitoring
  • ClusterConfig
  • HadoopConfig
  • InitializationAction

Creates a cluster in the specified folder.

HTTP request

POST https://dataproc.api.cloud.yandex.net/dataproc/v1/clusters

Body parameters

{
  "folderId": "string",
  "name": "string",
  "description": "string",
  "labels": "object",
  "configSpec": {
    "versionId": "string",
    "hadoop": {
      "services": [
        "string"
      ],
      "properties": "object",
      "sshPublicKeys": [
        "string"
      ],
      "initializationActions": [
        {
          "uri": "string",
          "args": [
            "string"
          ],
          "timeout": "string"
        }
      ],
      "osloginEnabled": "boolean"
    },
    "subclustersSpec": [
      {
        "name": "string",
        "role": "string",
        "resources": {
          "resourcePresetId": "string",
          "diskTypeId": "string",
          "diskSize": "string"
        },
        "subnetId": "string",
        "hostsCount": "string",
        "assignPublicIp": "boolean",
        "autoscalingConfig": {
          "maxHostsCount": "string",
          "preemptible": "boolean",
          "measurementDuration": "string",
          "warmupDuration": "string",
          "stabilizationDuration": "string",
          "cpuUtilizationTarget": "string",
          "decommissionTimeout": "string"
        }
      }
    ]
  },
  "zoneId": "string",
  "serviceAccountId": "string",
  "bucket": "string",
  "uiProxy": "boolean",
  "securityGroupIds": [
    "string"
  ],
  "hostGroupIds": [
    "string"
  ],
  "deletionProtection": "boolean",
  "logGroupId": "string",
  "environment": "string"
}
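
As a minimal sketch of how the request above might be assembled, the following Python snippet builds (but does not send) the POST request using only the standard library. The `Authorization: Bearer` header with an IAM token is an assumption about authentication that this section does not describe, the helper name `build_create_cluster_request` is hypothetical, and every ID is a placeholder.

```python
import json
import urllib.request

API_URL = "https://dataproc.api.cloud.yandex.net/dataproc/v1/clusters"


def build_create_cluster_request(iam_token: str, body: dict) -> urllib.request.Request:
    """Build (but do not send) the Cluster.Create POST request."""
    # Top-level fields marked "Required field" in this article's field table.
    required = ("folderId", "configSpec", "zoneId", "serviceAccountId")
    missing = [name for name in required if name not in body]
    if missing:
        raise ValueError(f"missing required fields: {', '.join(missing)}")
    return urllib.request.Request(
        API_URL,
        data=json.dumps(body).encode("utf-8"),
        method="POST",
        headers={
            "Content-Type": "application/json",
            # Assumption: bearer-token authentication with an IAM token.
            "Authorization": f"Bearer {iam_token}",
        },
    )


# Placeholder values only; replace them with real IDs before sending.
example_body = {
    "folderId": "<folder-id>",
    "name": "example-cluster",
    "zoneId": "<zone-id>",
    "serviceAccountId": "<service-account-id>",
    "configSpec": {
        "versionId": "<image-version>",
        "subclustersSpec": [],
    },
}

request = build_create_cluster_request("<iam-token>", example_body)
# To actually send the request: urllib.request.urlopen(request)
```

Keeping request construction separate from sending makes the payload easy to inspect or log before any cluster is created.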

Field

Description

folderId

string

Required field. ID of the folder to create a cluster in.

To get the folder ID, make a yandex.cloud.resourcemanager.v1.FolderService.List request.

name

string

Name of the cluster. The name must be unique within the folder.
The name can't be changed after the Yandex Data Processing cluster is created.

description

string

Description of the cluster.

labels

object (map<string, string>)

Cluster labels as key:value pairs.

configSpec

CreateClusterConfigSpec

Required field. Configuration and resources for hosts that should be created with the cluster.

zoneId

string

Required field. ID of the availability zone where the cluster should be placed.

To get the list of available zones, make a yandex.cloud.compute.v1.ZoneService.List request.

serviceAccountId

string

Required field. ID of the service account to be used by the Yandex Data Processing manager agent.

bucket

string

Name of the Object Storage bucket to use for Yandex Data Processing jobs.

uiProxy

boolean

Enable UI Proxy feature.

securityGroupIds[]

string

User security groups.

hostGroupIds[]

string

Host groups on which to place the cluster's VMs.

deletionProtection

boolean

Deletion protection inhibits deletion of the cluster.

logGroupId

string

ID of the Cloud Logging log group to write logs to. If not set, logs will not be sent to the logging service.

environment

enum (Environment)

Environment of the cluster.

  • ENVIRONMENT_UNSPECIFIED
  • PRODUCTION
  • PRESTABLE

CreateClusterConfigSpec

Field

Description

versionId

string

Version of the image for cluster provisioning.

All available versions are listed in the documentation.

hadoop

HadoopConfig

Yandex Data Processing specific options.

subclustersSpec[]

CreateSubclusterConfigSpec

Specification for creating subclusters.

HadoopConfig

Hadoop configuration that describes services installed in a cluster,
their properties and settings.

Field

Description

services[]

enum (Service)

Set of services used in the cluster (if empty, the default set is used).

  • SERVICE_UNSPECIFIED
  • HDFS
  • YARN
  • MAPREDUCE
  • HIVE
  • TEZ
  • ZOOKEEPER
  • HBASE
  • SQOOP
  • FLUME
  • SPARK
  • ZEPPELIN
  • OOZIE
  • LIVY

properties

object (map<string, string>)

Properties set for all hosts in *-site.xml configurations. The key should indicate
the service and the property.

For example, use the key 'hdfs:dfs.replication' to set the dfs.replication property
in the file /etc/hadoop/conf/hdfs-site.xml.

sshPublicKeys[]

string

List of public SSH keys for accessing cluster hosts.

initializationActions[]

InitializationAction

Set of initialization actions.

osloginEnabled

boolean

Whether OS Login is enabled on cluster nodes.
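
The `properties` field described above uses keys of the form `service:property`. A small sketch of how such a map might be assembled from nested settings is shown below; the helper names are hypothetical, and the replication value `2` is only an illustration.

```python
def make_property_key(service: str, prop: str) -> str:
    """Join a service prefix and a property name as '<service>:<property>'."""
    if not service or ":" in service:
        raise ValueError("service prefix must be non-empty and contain no ':'")
    if not prop:
        raise ValueError("property name must be non-empty")
    return f"{service}:{prop}"


def build_properties(settings: dict) -> dict:
    """Flatten {service: {property: value}} into a map<string, string>."""
    properties = {}
    for service, props in settings.items():
        for prop, value in props.items():
            # Values are stringified because the field is map<string, string>.
            properties[make_property_key(service, prop)] = str(value)
    return properties


properties = build_properties({"hdfs": {"dfs.replication": 2}})
print(properties)  # {'hdfs:dfs.replication': '2'}
```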

InitializationAction

Field

Description

uri

string

URI of the executable file.

args[]

string

Arguments to the initialization action.

timeout

string (int64)

Execution timeout.

CreateSubclusterConfigSpec

Field

Description

name

string

Name of the subcluster.

role

enum (Role)

Required field. Role of the subcluster in the Yandex Data Processing cluster.

  • ROLE_UNSPECIFIED
  • MASTERNODE: The subcluster fulfills the master role.
    Master can run the following services, depending on the requested components:
    • HDFS: Namenode, Secondary Namenode
    • YARN: ResourceManager, Timeline Server
    • HBase Master
    • Hive: Server, Metastore, HCatalog
    • Spark History Server
    • Zeppelin
    • ZooKeeper
  • DATANODE: The subcluster is a DATANODE in a Yandex Data Processing cluster.
    DATANODE can run the following services, depending on the requested components:
    • HDFS DataNode
    • YARN NodeManager
    • HBase RegionServer
    • Spark libraries
  • COMPUTENODE: The subcluster is a COMPUTENODE in a Yandex Data Processing cluster.
    COMPUTENODE can run the following services, depending on the requested components:
    • YARN NodeManager
    • Spark libraries

resources

Resources

Required field. Resource configuration for hosts in the subcluster.

subnetId

string

Required field. ID of the VPC subnet used for hosts in the subcluster.

hostsCount

string (int64)

Number of hosts in the subcluster.

assignPublicIp

boolean

Assign public IP addresses to all hosts in the subcluster.

autoscalingConfig

AutoscalingConfig

Configuration for instance-group-based subclusters.
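
To illustrate the required fields and role values listed above, here is a hedged sketch of a client-side check for one `subclustersSpec` entry. The function name is hypothetical, and treating `ROLE_UNSPECIFIED` as invalid is an assumption of this sketch rather than documented server behavior.

```python
# Role values from the enum above; ROLE_UNSPECIFIED is rejected by assumption.
ROLES = {"MASTERNODE", "DATANODE", "COMPUTENODE"}
# Fields marked "Required field" in the CreateSubclusterConfigSpec table.
REQUIRED_FIELDS = ("role", "resources", "subnetId")


def validate_subcluster_spec(spec: dict) -> list:
    """Return a list of problems found in one subclustersSpec entry."""
    problems = [
        f"missing required field: {name}"
        for name in REQUIRED_FIELDS
        if name not in spec
    ]
    role = spec.get("role")
    if role is not None and role not in ROLES:
        problems.append(f"unknown role: {role}")
    return problems


spec = {
    "name": "data",
    "role": "DATANODE",
    "resources": {"resourcePresetId": "<preset-id>"},
    "subnetId": "<subnet-id>",
    "hostsCount": "2",
}
print(validate_subcluster_spec(spec))  # []
```

A check like this only catches obvious mistakes before the request is sent; the service remains the authority on what it accepts.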

Resources

Field

Description

resourcePresetId

string

ID of the resource preset for computational resources available to a host (CPU, memory etc.).
All available presets are listed in the documentation.

diskTypeId

string

Type of the storage environment for the host.
Possible values:

  • network-hdd - network HDD drive,
  • network-ssd - network SSD drive.

diskSize

string (int64)

Volume of the storage available to a host, in bytes.
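
Because `diskSize` is an int64 byte count carried as a string, a small conversion helper can reduce mistakes. The sketch below assumes sizes are chosen in GiB (1 GiB = 1024³ bytes); the helper name and the preset placeholder are hypothetical.

```python
GIB = 1024 ** 3  # bytes in one gibibyte


def gib_to_int64_string(size_gib: int) -> str:
    """Convert a size in GiB to the string-encoded byte count used by diskSize."""
    if size_gib <= 0:
        raise ValueError("disk size must be positive")
    return str(size_gib * GIB)


resources = {
    "resourcePresetId": "<preset-id>",
    "diskTypeId": "network-ssd",
    "diskSize": gib_to_int64_string(128),
}
print(resources["diskSize"])  # 137438953472
```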

AutoscalingConfig

Field

Description

maxHostsCount

string (int64)

Upper limit for the total number of instances in the subcluster.

preemptible

boolean

Preemptible instances are stopped at least once every 24 hours, and can be stopped at any time
if their resources are needed by Compute.
For more information, see Preemptible Virtual Machines.

measurementDuration

string (duration)

Required field. Time in seconds allotted for averaging metrics.

warmupDuration

string (duration)

The warmup time of the instance in seconds. During this time,
traffic is sent to the instance, but instance metrics are not collected.

stabilizationDuration

string (duration)

Minimum amount of time in seconds allotted for monitoring before
Instance Groups can reduce the number of instances in the group.
During this time, the group size doesn't decrease, even if the new metric values
indicate that it should.

cpuUtilizationTarget

string

Defines an autoscaling rule based on the average CPU utilization of the instance group.

decommissionTimeout

string (int64)

Timeout, in seconds, for gracefully decommissioning nodes during downscaling. Default value: 120.
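
The sketch below shows one way an `autoscalingConfig` object might be built. It assumes that `string (duration)` fields follow the common protobuf JSON form of seconds with an `s` suffix (for example `"60s"`), which this article does not state explicitly. The helper names and the sample values are hypothetical.

```python
from typing import Optional


def duration_string(seconds: int) -> str:
    """Format whole seconds as '<n>s' (assumed protobuf JSON duration form)."""
    if seconds < 0:
        raise ValueError("duration must not be negative")
    return f"{seconds}s"


def build_autoscaling_config(max_hosts: int, measurement_s: int,
                             warmup_s: Optional[int] = None,
                             stabilization_s: Optional[int] = None,
                             decommission_timeout_s: int = 120) -> dict:
    """Assemble autoscalingConfig; measurementDuration is the required field."""
    config = {
        # int64 fields are carried as strings in the JSON body.
        "maxHostsCount": str(max_hosts),
        "measurementDuration": duration_string(measurement_s),
        "decommissionTimeout": str(decommission_timeout_s),
    }
    # Optional durations are included only when the caller sets them.
    if warmup_s is not None:
        config["warmupDuration"] = duration_string(warmup_s)
    if stabilization_s is not None:
        config["stabilizationDuration"] = duration_string(stabilization_s)
    return config


config = build_autoscaling_config(max_hosts=10, measurement_s=60, warmup_s=120)
print(config["measurementDuration"])  # 60s
```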

Response

HTTP Code: 200 - OK

{
  "id": "string",
  "description": "string",
  "createdAt": "string",
  "createdBy": "string",
  "modifiedAt": "string",
  "done": "boolean",
  "metadata": {
    "clusterId": "string"
  },
  // Includes only one of the fields `error`, `response`
  "error": {
    "code": "integer",
    "message": "string",
    "details": [
      "object"
    ]
  },
  "response": {
    "id": "string",
    "folderId": "string",
    "createdAt": "string",
    "name": "string",
    "description": "string",
    "labels": "object",
    "monitoring": [
      {
        "name": "string",
        "description": "string",
        "link": "string"
      }
    ],
    "config": {
      "versionId": "string",
      "hadoop": {
        "services": [
          "string"
        ],
        "properties": "object",
        "sshPublicKeys": [
          "string"
        ],
        "initializationActions": [
          {
            "uri": "string",
            "args": [
              "string"
            ],
            "timeout": "string"
          }
        ],
        "osloginEnabled": "boolean"
      }
    },
    "health": "string",
    "status": "string",
    "zoneId": "string",
    "serviceAccountId": "string",
    "bucket": "string",
    "uiProxy": "boolean",
    "securityGroupIds": [
      "string"
    ],
    "hostGroupIds": [
      "string"
    ],
    "deletionProtection": "boolean",
    "logGroupId": "string",
    "environment": "string"
  }
  // end of the list of possible fields
}

An Operation resource. For more information, see Operation.

Field

Description

id

string

ID of the operation.

description

string

Description of the operation. 0-256 characters long.

createdAt

string (date-time)

Creation timestamp.

String in RFC3339 text format. The range of possible values is from
0001-01-01T00:00:00Z to 9999-12-31T23:59:59.999999999Z, i.e. from 0 to 9 digits for fractions of a second.

To work with values in this field, use the APIs described in the
Protocol Buffers reference.
In some languages, built-in datetime utilities do not support nanosecond precision (9 digits).

createdBy

string

ID of the user or service account who initiated the operation.

modifiedAt

string (date-time)

The time when the Operation resource was last modified.

String in RFC3339 text format. The range of possible values is from
0001-01-01T00:00:00Z to 9999-12-31T23:59:59.999999999Z, i.e. from 0 to 9 digits for fractions of a second.

To work with values in this field, use the APIs described in the
Protocol Buffers reference.
In some languages, built-in datetime utilities do not support nanosecond precision (9 digits).

done

boolean

If the value is false, it means the operation is still in progress.
If true, the operation is completed, and either error or response is available.

metadata

CreateClusterMetadata

Service-specific metadata associated with the operation.
It typically contains the ID of the target resource that the operation is performed on.
Any method that returns a long-running operation should document the metadata type, if any.

error

Status

The error result of the operation in case of failure or cancellation.

Includes only one of the fields error, response.

The operation result.
If done == false and there was no failure detected, neither error nor response is set.
If done == false and there was a failure detected, error is set.
If done == true, exactly one of error or response is set.

response

Cluster

The normal response of the operation in case of success.
If the original method returns no data on success, such as Delete,
the response is google.protobuf.Empty.
If the original method is the standard Create/Update,
the response should be the target resource of the operation.
Any method that returns a long-running operation should document the response type, if any.

Includes only one of the fields error, response.

The operation result.
If done == false and there was no failure detected, neither error nor response is set.
If done == false and there was a failure detected, error is set.
If done == true, exactly one of error or response is set.
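
The `done`, `error`, and `response` rules above suggest a simple polling loop. The following is a sketch under stated assumptions: `fetch` is a hypothetical caller-supplied function that returns the current Operation as a dict (how it retrieves the operation is outside this article), and the interval and poll limit are arbitrary defaults.

```python
import time
from typing import Callable


class OperationFailed(Exception):
    """Raised when an operation reports an error."""


def wait_for_operation(fetch: Callable[[str], dict], operation_id: str,
                       interval: float = 5.0, max_polls: int = 120) -> dict:
    """Poll until the operation yields a response, an error, or polls run out."""
    for attempt in range(max_polls):
        operation = fetch(operation_id)
        # Per the rules above, error may be set even while done is false.
        if "error" in operation:
            err = operation["error"]
            raise OperationFailed(f"code {err.get('code')}: {err.get('message')}")
        if operation.get("done"):
            # done == true without error means response holds the Cluster.
            return operation["response"]
        if attempt + 1 < max_polls:
            time.sleep(interval)
    raise TimeoutError(f"operation {operation_id} not done after {max_polls} polls")


# Usage with a stand-in fetch function that simulates two polls.
states = iter([
    {"id": "op1", "done": False},
    {"id": "op1", "done": True, "response": {"id": "cluster1"}},
])
cluster = wait_for_operation(lambda _id: next(states), "op1", interval=0)
print(cluster)  # {'id': 'cluster1'}
```

Passing `fetch` in as a parameter keeps the loop independent of any particular HTTP client and makes it easy to exercise with canned data.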

CreateClusterMetadata

Field

Description

clusterId

string

ID of the cluster that is being created.

Status

The error result of the operation in case of failure or cancellation.

Field

Description

code

integer (int32)

Error code. An enum value of google.rpc.Code.

message

string

An error message.

details[]

object

A list of messages that carry the error details.

Cluster

A Yandex Data Processing cluster. For details about the concept, see documentation.

Field

Description

id

string

ID of the cluster. Generated at creation time.

folderId

string

ID of the folder that the cluster belongs to.

createdAt

string (date-time)

Creation timestamp.

String in RFC3339 text format. The range of possible values is from
0001-01-01T00:00:00Z to 9999-12-31T23:59:59.999999999Z, i.e. from 0 to 9 digits for fractions of a second.

To work with values in this field, use the APIs described in the
Protocol Buffers reference.
In some languages, built-in datetime utilities do not support nanosecond precision (9 digits).

name

string

Name of the cluster. The name is unique within the folder.

description

string

Description of the cluster.

labels

object (map<string, string>)

Cluster labels as key:value pairs.

monitoring[]

Monitoring

Monitoring systems relevant to the cluster.

config

ClusterConfig

Configuration of the cluster.

health

enum (Health)

Aggregated cluster health.

  • HEALTH_UNKNOWN: Object is in unknown state (we have no data).
  • ALIVE: Object is alive and well (for example, all hosts of the cluster are alive).
  • DEAD: Object is inoperable (it cannot perform any of its essential functions).
  • DEGRADED: Object is partially alive (it can perform some of its essential functions).

status

enum (Status)

Cluster status.

  • STATUS_UNKNOWN: Cluster state is unknown.
  • CREATING: Cluster is being created.
  • RUNNING: Cluster is running normally.
  • ERROR: Cluster encountered a problem and cannot operate.
  • STOPPING: Cluster is stopping.
  • STOPPED: Cluster stopped.
  • STARTING: Cluster is starting.

zoneId

string

ID of the availability zone where the cluster resides.

serviceAccountId

string

ID of service account for the Yandex Data Processing manager agent.

bucket

string

Object Storage bucket to be used for Yandex Data Processing jobs that are run in the cluster.

uiProxy

boolean

Whether UI Proxy feature is enabled.

securityGroupIds[]

string

User security groups.

hostGroupIds[]

string

Host groups hosting VMs of the cluster.

deletionProtection

boolean

Deletion protection inhibits deletion of the cluster.

logGroupId

string

ID of the Cloud Logging log group to write logs to. If not set, the default log group for the folder will be used.
To prevent logs from being sent to the cloud, set the cluster property dataproc:disable_cloud_logging = true.

environment

enum (Environment)

Environment of the cluster.

  • ENVIRONMENT_UNSPECIFIED
  • PRODUCTION
  • PRESTABLE
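
The `createdAt` field above (like the Operation timestamps) is an RFC3339 string with up to nine fractional digits, which Python's `datetime` cannot hold in full. A minimal sketch of one workaround is to truncate to microseconds, as below. It assumes timestamps end in `Z`, as in the range examples above, and the function name is hypothetical.

```python
import re
from datetime import datetime, timezone

# Date and time, optional fraction of 1 to 9 digits, then a literal "Z".
_TIMESTAMP = re.compile(
    r"^(\d{4}-\d{2}-\d{2}T\d{2}:\d{2}:\d{2})(?:\.(\d{1,9}))?Z$"
)


def parse_timestamp(value: str) -> datetime:
    """Parse a UTC RFC3339 timestamp, truncating nanoseconds to microseconds."""
    match = _TIMESTAMP.match(value)
    if match is None:
        raise ValueError(f"unsupported timestamp: {value!r}")
    base, fraction = match.groups()
    # Pad to 9 digits, then keep the first 6: datetime stores microseconds only.
    micros = int((fraction or "").ljust(9, "0")[:6])
    parsed = datetime.strptime(base, "%Y-%m-%dT%H:%M:%S")
    return parsed.replace(microsecond=micros, tzinfo=timezone.utc)


created = parse_timestamp("2025-04-24T10:15:30.123456789Z")
print(created.isoformat())  # 2025-04-24T10:15:30.123456+00:00
```

Truncation loses the last three digits, so code that needs exact nanoseconds would have to keep the fraction separately.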

Monitoring

Metadata of a monitoring system for a Yandex Data Processing cluster.

Field

Description

name

string

Name of the monitoring system.

description

string

Description of the monitoring system.

link

string

Link to the monitoring system.

ClusterConfig

Field

Description

versionId

string

Image version for cluster provisioning.
All available versions are listed in the documentation.

hadoop

HadoopConfig

Yandex Data Processing specific configuration options.

HadoopConfig

Hadoop configuration that describes services installed in a cluster,
their properties and settings.

Field

Description

services[]

enum (Service)

Set of services used in the cluster (if empty, the default set is used).

  • SERVICE_UNSPECIFIED
  • HDFS
  • YARN
  • MAPREDUCE
  • HIVE
  • TEZ
  • ZOOKEEPER
  • HBASE
  • SQOOP
  • FLUME
  • SPARK
  • ZEPPELIN
  • OOZIE
  • LIVY

properties

object (map<string, string>)

Properties set for all hosts in *-site.xml configurations. The key should indicate
the service and the property.

For example, use the key 'hdfs:dfs.replication' to set the dfs.replication property
in the file /etc/hadoop/conf/hdfs-site.xml.

sshPublicKeys[]

string

List of public SSH keys for accessing cluster hosts.

initializationActions[]

InitializationAction

Set of initialization actions.

osloginEnabled

boolean

Whether OS Login is enabled on cluster nodes.

InitializationAction

Field

Description

uri

string

URI of the executable file.

args[]

string

Arguments to the initialization action.

timeout

string (int64)

Execution timeout.
