Data Proc API, REST: Cluster.create
Creates a cluster in the specified folder.
HTTP request
POST https://dataproc.api.cloud.yandex.net/dataproc/v1/clusters
Body parameters
{
"folderId": "string",
"name": "string",
"description": "string",
"labels": "object",
"configSpec": {
"versionId": "string",
"hadoop": {
"services": [
"string"
],
"properties": "object",
"sshPublicKeys": [
"string"
],
"initializationActions": [
{
"uri": "string",
"args": [
"string"
],
"timeout": "string"
}
]
},
"subclustersSpec": [
{
"name": "string",
"role": "string",
"resources": {
"resourcePresetId": "string",
"diskTypeId": "string",
"diskSize": "string"
},
"subnetId": "string",
"hostsCount": "string",
"assignPublicIp": true,
"autoscalingConfig": {
"maxHostsCount": "string",
"preemptible": true,
"measurementDuration": "string",
"warmupDuration": "string",
"stabilizationDuration": "string",
"cpuUtilizationTarget": "number",
"decommissionTimeout": "string"
}
}
]
},
"zoneId": "string",
"serviceAccountId": "string",
"bucket": "string",
"uiProxy": true,
"securityGroupIds": [
"string"
],
"hostGroupIds": [
"string"
],
"deletionProtection": true,
"logGroupId": "string"
}
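As a minimal sketch of how a body matching this schema might be assembled, the Python below builds the required fields and prepares (but does not send) the POST request. The helper names `subcluster_spec`, `cluster_body`, and `build_request` are hypothetical, and the `Authorization: Bearer` header is an assumption about the authentication scheme rather than something this page states. Fields typed as string (int64) are serialized as strings, following the schema above.

```python
import json
import urllib.request

API_URL = "https://dataproc.api.cloud.yandex.net/dataproc/v1/clusters"


def subcluster_spec(name, role, preset_id, disk_type_id, disk_size_bytes,
                    subnet_id, hosts_count):
    """Build one subclustersSpec entry; int64 fields are sent as strings."""
    return {
        "name": name,
        "role": role,
        "resources": {
            "resourcePresetId": preset_id,
            "diskTypeId": disk_type_id,
            "diskSize": str(disk_size_bytes),  # string (int64), in bytes
        },
        "subnetId": subnet_id,
        "hostsCount": str(hosts_count),  # string (int64), minimum 1
    }


def cluster_body(folder_id, zone_id, service_account_id, subclusters, name=None):
    """Assemble a request body from the required fields plus an optional name."""
    body = {
        "folderId": folder_id,
        "zoneId": zone_id,
        "serviceAccountId": service_account_id,
        "configSpec": {"subclustersSpec": subclusters},
    }
    if name is not None:
        body["name"] = name
    return body


def build_request(body, iam_token):
    """Prepare the POST request without sending it."""
    return urllib.request.Request(
        API_URL,
        data=json.dumps(body).encode("utf-8"),
        headers={
            "Content-Type": "application/json",
            # Assumption: bearer-token authorization; check the service's auth docs.
            "Authorization": f"Bearer {iam_token}",
        },
        method="POST",
    )
```

Passing the returned object to `urllib.request.urlopen` would send the request; the sketch stops short of that so it can be inspected offline.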
Field | Description |
---|---|
folderId | string Required. ID of the folder to create a cluster in. To get a folder ID make a list request. The maximum string length in characters is 50. |
name | string Name of the cluster. The name must be unique within the folder. The name can't be changed after the Data Proc cluster is created. Value must match the regular expression |
description | string Description of the cluster. The maximum string length in characters is 256. |
labels | object Cluster labels as key:value pairs. No more than 64 per resource. The string length in characters for each key must be 1-63. Each key must match the regular expression |
configSpec | object Required. Configuration and resources for hosts that should be created with the cluster. |
configSpec.versionId | string Version of the image for cluster provisioning. All available versions are listed in the documentation. |
configSpec.hadoop | object Data Proc specific options. Hadoop configuration that describes services installed in a cluster, their properties and settings. |
configSpec.hadoop.services[] | string Set of services used in the cluster (if empty, the default set is used). |
configSpec.hadoop.properties | object Properties set for all hosts. For example, use the key 'hdfs:dfs.replication' to set the 'dfs.replication' property for the 'hdfs' service. |
configSpec.hadoop.sshPublicKeys[] | string List of public SSH keys for accessing cluster hosts. |
configSpec.hadoop.initializationActions[] | object Set of initialization actions. |
configSpec.hadoop.initializationActions[].uri | string URI of the executable file. |
configSpec.hadoop.initializationActions[].args[] | string Arguments to the initialization action. |
configSpec.hadoop.initializationActions[].timeout | string (int64) Execution timeout. |
configSpec.subclustersSpec[] | object Specification for creating subclusters. |
configSpec.subclustersSpec[].name | string Name of the subcluster. Value must match the regular expression |
configSpec.subclustersSpec[].role | string Required. Role of the subcluster in the Data Proc cluster. |
configSpec.subclustersSpec[].resources | object Required. Resource configuration for hosts in the subcluster. |
configSpec.subclustersSpec[].resources.resourcePresetId | string ID of the resource preset for computational resources available to a host (CPU, memory etc.). All available presets are listed in the documentation. |
configSpec.subclustersSpec[].resources.diskTypeId | string Type of the storage environment for the host. Possible values: |
configSpec.subclustersSpec[].resources.diskSize | string (int64) Volume of the storage available to a host, in bytes. |
configSpec.subclustersSpec[].subnetId | string Required. ID of the VPC subnet used for hosts in the subcluster. The maximum string length in characters is 50. |
configSpec.subclustersSpec[].hostsCount | string (int64) Number of hosts in the subcluster. The minimum value is 1. |
configSpec.subclustersSpec[].assignPublicIp | boolean (boolean) Assign public IP addresses to all hosts in the subcluster. |
configSpec.subclustersSpec[].autoscalingConfig | object Configuration for instance group based subclusters. |
configSpec.subclustersSpec[].autoscalingConfig.maxHostsCount | string (int64) Upper limit for the total number of instances in the subcluster. Acceptable values are 1 to 100, inclusive. |
configSpec.subclustersSpec[].autoscalingConfig.preemptible | boolean (boolean) Preemptible instances are stopped at least once every 24 hours, and can be stopped at any time if their resources are needed by Compute. For more information, see Preemptible Virtual Machines. |
configSpec.subclustersSpec[].autoscalingConfig.measurementDuration | string Required. Time in seconds allotted for averaging metrics. Acceptable values are 60 seconds to 600 seconds, inclusive. |
configSpec.subclustersSpec[].autoscalingConfig.warmupDuration | string The warmup time of the instance in seconds. During this time, traffic is sent to the instance, but instance metrics are not collected. The maximum value is 600 seconds. |
configSpec.subclustersSpec[].autoscalingConfig.stabilizationDuration | string Minimum amount of time in seconds allotted for monitoring before Instance Groups can reduce the number of instances in the group. During this time, the group size doesn't decrease, even if the new metric values indicate that it should. Acceptable values are 60 seconds to 1800 seconds, inclusive. |
configSpec.subclustersSpec[].autoscalingConfig.cpuUtilizationTarget | number (double) Defines an autoscaling rule based on the average CPU utilization of the instance group. Acceptable values are 0 to 100, inclusive. |
configSpec.subclustersSpec[].autoscalingConfig.decommissionTimeout | string (int64) Timeout, in seconds, to gracefully decommission nodes during downscaling. Default value: 120. Acceptable values are 0 to 86400, inclusive. |
zoneId | string Required. ID of the availability zone where the cluster should be placed. To get the list of available zones make a list request. The maximum string length in characters is 50. |
serviceAccountId | string Required. ID of the service account to be used by the Data Proc manager agent. |
bucket | string Name of the Object Storage bucket to use for Data Proc jobs. |
uiProxy | boolean (boolean) Enable UI Proxy feature. |
securityGroupIds[] | string User security groups. |
hostGroupIds[] | string Host groups to place VMs of cluster on. |
deletionProtection | boolean (boolean) Deletion protection inhibits deletion of the cluster. |
logGroupId | string ID of the Cloud Logging log group to write logs to. If not set, logs are not sent to the logging service. |
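The numeric ranges listed for autoscalingConfig can be checked on the client before a request is sent. The following is a minimal sketch with a hypothetical helper, `validate_autoscaling`. It assumes values arrive either as numbers or as strings with an optional trailing "s" (as in the protobuf JSON duration format), and it assumes a lower bound of 0 for warmupDuration, since the table only states a maximum.

```python
def validate_autoscaling(cfg):
    """Return a list of violations of the documented autoscalingConfig ranges."""
    errors = []

    def check(field, low, high, required=False):
        if field not in cfg:
            if required:
                errors.append(f"{field} is required")
            return
        # Accept 60, "60", or "60s" (the trailing "s" is an assumption).
        value = float(str(cfg[field]).rstrip("s"))
        if not low <= value <= high:
            errors.append(f"{field}={value:g} is outside [{low}, {high}]")

    check("maxHostsCount", 1, 100)
    check("measurementDuration", 60, 600, required=True)
    check("warmupDuration", 0, 600)  # lower bound of 0 is assumed
    check("stabilizationDuration", 60, 1800)
    check("cpuUtilizationTarget", 0, 100)
    check("decommissionTimeout", 0, 86400)
    return errors
```

An empty list means the configuration is within the documented limits; the server remains the authority on what it accepts.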
Response
HTTP Code: 200 - OK
{
"id": "string",
"description": "string",
"createdAt": "string",
"createdBy": "string",
"modifiedAt": "string",
"done": true,
"metadata": "object",
// includes only one of the fields `error`, `response`
"error": {
"code": "integer",
"message": "string",
"details": [
"object"
]
},
"response": "object",
// end of the list of possible fields
}
An Operation resource. For more information, see Operation.
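Because the Operation carries only one of `error` and `response`, client code usually branches on those fields. The sketch below uses a hypothetical helper, `operation_result`, and assumes that `done` being true signals a completed operation; it works on an already-parsed JSON dictionary and makes no network calls.

```python
class OperationFailed(Exception):
    """Raised when a completed operation carries an `error` field."""


def operation_result(operation):
    """Return the response of a finished operation, or None if still running."""
    if not operation.get("done", False):
        return None  # caller should poll again later
    if "error" in operation:
        err = operation["error"]
        raise OperationFailed(err.get("code"), err.get("message"))
    return operation.get("response")
```

Returning None for an unfinished operation keeps polling logic in the caller, which can decide how long to wait between attempts.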
Field | Description |
---|---|
id | string ID of the operation. |
description | string Description of the operation. 0-256 characters long. |
createdAt | string (date-time) Creation timestamp. String in RFC3339 text format. The range of possible values is from To work with values in this field, use the APIs described in the Protocol Buffers reference. In some languages, built-in datetime utilities do not support nanosecond precision (9 digits). |
createdBy | string ID of the user or service account who initiated the operation. |
modifiedAt | string (date-time) The time when the Operation resource was last modified. String in RFC3339 text format. The range of possible values is from To work with values in this field, use the APIs described in the Protocol Buffers reference. In some languages, built-in datetime utilities do not support nanosecond precision (9 digits). |
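done | boolean (boolean) If the value is `false`, the operation is still in progress. If `true`, the operation is completed, and either `error` or `response` is available. |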
metadata | object Service-specific metadata associated with the operation. It typically contains the ID of the target resource that the operation is performed on. Any method that returns a long-running operation should document the metadata type, if any. |
error | object The error result of the operation in case of failure or cancellation. Includes only one of the fields `error`, `response`. |
error.code | integer (int32) Error code. An enum value of google.rpc.Code. |
error.message | string An error message. |
error.details[] | object A list of messages that carry the error details. |
response | object The normal response of the operation in case of success. Includes only one of the fields `error`, `response`. If the original method returns no data on success, such as Delete, the response is google.protobuf.Empty. If the original method is the standard Create/Update, the response should be the target resource of the operation. Any method that returns a long-running operation should document the response type, if any. |
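The note on nanosecond precision applies to Python, whose `datetime` stores microseconds only. One possible workaround, sketched here with a hypothetical helper `parse_timestamp`, is to truncate the fractional part to six digits before parsing. The sketch assumes timestamps use an uppercase `T` separator and either `Z` or a numeric offset, and it discards the last three digits rather than rounding.

```python
import re
from datetime import datetime, timezone

_RFC3339 = re.compile(
    r"^(\d{4}-\d{2}-\d{2}T\d{2}:\d{2}:\d{2})(?:\.(\d{1,9}))?(Z|[+-]\d{2}:\d{2})$"
)


def parse_timestamp(text):
    """Parse an RFC3339 timestamp, truncating the fraction to microseconds."""
    match = _RFC3339.match(text)
    if match is None:
        raise ValueError(f"not an RFC3339 timestamp: {text!r}")
    base, fraction, offset = match.groups()
    # Pad short fractions and cut long ones so exactly 6 digits remain.
    micros = (fraction or "").ljust(6, "0")[:6]
    if offset == "Z":
        offset = "+00:00"
    return datetime.strptime(f"{base}.{micros}{offset}", "%Y-%m-%dT%H:%M:%S.%f%z")
```

If the full nine digits matter, keeping the original string alongside the parsed value avoids losing them.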