Data Proc API, gRPC: ClusterService.Create
Creates a cluster in the specified folder.
gRPC request
rpc Create (CreateClusterRequest) returns (operation.Operation)
CreateClusterRequest
{
  "folderId": "string",
  "name": "string",
  "description": "string",
  "labels": "string",
  "configSpec": {
    "versionId": "string",
    "hadoop": {
      "services": [
        "Service"
      ],
      "properties": "string",
      "sshPublicKeys": [
        "string"
      ],
      "initializationActions": [
        {
          "uri": "string",
          "args": [
            "string"
          ],
          "timeout": "int64"
        }
      ]
    },
    "subclustersSpec": [
      {
        "name": "string",
        "role": "Role",
        "resources": {
          "resourcePresetId": "string",
          "diskTypeId": "string",
          "diskSize": "int64"
        },
        "subnetId": "string",
        "hostsCount": "int64",
        "assignPublicIp": "bool",
        "autoscalingConfig": {
          "maxHostsCount": "int64",
          "preemptible": "bool",
          "measurementDuration": "google.protobuf.Duration",
          "warmupDuration": "google.protobuf.Duration",
          "stabilizationDuration": "google.protobuf.Duration",
          "cpuUtilizationTarget": "double",
          "decommissionTimeout": "int64"
        }
      }
    ]
  },
  "zoneId": "string",
  "serviceAccountId": "string",
  "bucket": "string",
  "uiProxy": "bool",
  "securityGroupIds": [
    "string"
  ],
  "hostGroupIds": [
    "string"
  ],
  "deletionProtection": "bool",
  "logGroupId": "string"
}
Field | Description
folderId | string Required field. ID of the folder to create a cluster in. To get a folder ID, make a yandex.cloud.resourcemanager.v1.FolderService.List request.
name | string Name of the cluster. The name must be unique within the folder.
description | string Description of the cluster.
labels | string Cluster labels as key:value pairs.
configSpec | CreateClusterConfigSpec Required field. Configuration and resources for hosts that should be created with the cluster.
zoneId | string Required field. ID of the availability zone where the cluster should be placed. To get the list of available zones, make a yandex.cloud.compute.v1.ZoneService.List request.
serviceAccountId | string Required field. ID of the service account to be used by the Data Proc manager agent.
bucket | string Name of the Object Storage bucket to use for Data Proc jobs.
uiProxy | bool Enables the UI Proxy feature.
securityGroupIds[] | string User security groups.
hostGroupIds[] | string Host groups to place the cluster's VMs on.
deletionProtection | bool Deletion protection inhibits deletion of the cluster.
logGroupId | string ID of the Cloud Logging log group to write logs to. If not set, logs are not sent to the logging service.
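As a rough illustration of the table above, the sketch below assembles a request as a plain Python dict and checks the four top-level fields marked "Required field". It does not call the service; every ID and name in it is a hypothetical placeholder, and the helper `missing_required` is not part of any SDK.

```python
import json

# Top-level fields marked "Required field" in the CreateClusterRequest table.
REQUIRED_FIELDS = ("folderId", "configSpec", "zoneId", "serviceAccountId")


def missing_required(request: dict) -> list:
    """Return the names of required top-level fields that are absent or empty."""
    return [name for name in REQUIRED_FIELDS if not request.get(name)]


# All IDs and names below are hypothetical placeholders.
request = {
    "folderId": "example-folder-id",
    "name": "example-cluster",
    "description": "Cluster created from a sketch",
    "configSpec": {
        "hadoop": {"sshPublicKeys": ["ssh-ed25519 EXAMPLEKEY user@example"]},
        "subclustersSpec": [],
    },
    "zoneId": "example-zone",
    "serviceAccountId": "example-service-account-id",
    "deletionProtection": False,
}

problems = missing_required(request)
if problems:
    raise ValueError(f"missing required fields: {problems}")
print(json.dumps(request, indent=2))
```

A check like this only catches omissions before the call is made; the service remains the authority on what it accepts.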
CreateClusterConfigSpec
Field | Description
versionId | string Version of the image for cluster provisioning. All available versions are listed in the documentation.
hadoop | HadoopConfig Data Proc specific options.
subclustersSpec[] | CreateSubclusterConfigSpec Specification for creating subclusters.
HadoopConfig
Hadoop configuration that describes services installed in a cluster,
their properties and settings.
Field | Description
services[] | enum Service Set of services used in the cluster (if empty, the default set is used).
properties | string Properties set for all hosts in *-site.xml configurations. For example, use the key 'hdfs:dfs.replication' to set the dfs.replication property.
sshPublicKeys[] | string List of public SSH keys for accessing cluster hosts.
initializationActions[] | InitializationAction Set of initialization actions.
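The one example key given for properties, 'hdfs:dfs.replication', suggests a service:property layout. The sketch below splits such a key at the first colon. That rule is inferred from this single example and is an assumption, not a documented parsing rule; the helper name and the value "2" are hypothetical.

```python
def split_property_key(key: str) -> tuple:
    """Split a 'service:property' key into its two parts at the first colon."""
    service, sep, prop = key.partition(":")
    if not sep or not service or not prop:
        raise ValueError(f"expected 'service:property', got {key!r}")
    return service, prop


# The value "2" is a hypothetical setting used only for illustration.
properties = {"hdfs:dfs.replication": "2"}

for key, value in properties.items():
    service, prop = split_property_key(key)
    print(f"service={service} property={prop} value={value}")
```

Splitting at the first colon keeps dots, and any later colons, inside the property name.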
InitializationAction
Field | Description
uri | string URI of the executable file.
args[] | string Arguments to the initialization action.
timeout | int64 Execution timeout.
CreateSubclusterConfigSpec
Field | Description
name | string Name of the subcluster.
role | enum Role Required field. Role of the subcluster in the Data Proc cluster.
resources | Resources Required field. Resource configuration for hosts in the subcluster.
subnetId | string Required field. ID of the VPC subnet used for hosts in the subcluster.
hostsCount | int64 Number of hosts in the subcluster.
assignPublicIp | bool Assign public IP addresses to all hosts in the subcluster.
autoscalingConfig | AutoscalingConfig Configuration for instance group based subclusters.
Resources
Field | Description
resourcePresetId | string ID of the resource preset for computational resources available to a host (CPU, memory, etc.).
diskTypeId | string Type of the storage environment for the host.
diskSize | int64 Volume of the storage available to a host, in bytes.
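Because diskSize is given in bytes, a small conversion helper avoids hand-typed values with many digits. The sketch below assumes disk sizes are chosen in binary gibibytes (1 GiB = 1024^3 bytes); the preset and disk type IDs are hypothetical placeholders.

```python
GIB = 1024 ** 3  # bytes in one gibibyte


def gib_to_bytes(gib: int) -> int:
    """Convert a size in GiB to the byte count expected by diskSize."""
    if gib <= 0:
        raise ValueError("disk size must be positive")
    return gib * GIB


# Hypothetical IDs; only the diskSize arithmetic is the point here.
resources = {
    "resourcePresetId": "example-preset",
    "diskTypeId": "example-disk-type",
    "diskSize": gib_to_bytes(128),
}

print(resources["diskSize"])  # 137438953472
```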
AutoscalingConfig
Field | Description
maxHostsCount | int64 Upper limit for the total number of instances in the subcluster.
preemptible | bool Preemptible instances are stopped at least once every 24 hours, and can be stopped at any time.
measurementDuration | google.protobuf.Duration Required field. Time in seconds allotted for averaging metrics.
warmupDuration | google.protobuf.Duration The warmup time of the instance in seconds. During this time, traffic is sent to the instance, but instance metrics are not collected.
stabilizationDuration | google.protobuf.Duration Minimum amount of time in seconds allotted for monitoring before Instance Groups can reduce the number of instances in the group.
cpuUtilizationTarget | double Defines an autoscaling rule based on the average CPU utilization of the instance group.
decommissionTimeout | int64 Timeout, in seconds, for gracefully decommissioning nodes during downscaling. Default value: 120.
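The three duration fields are google.protobuf.Duration values, while decommissionTimeout is a plain int64 number of seconds. The sketch below builds an AutoscalingConfig as a dict, writing durations in the proto3 JSON form for Duration (a decimal number of seconds followed by "s"). The helper names and all numeric values are hypothetical; only the default of 120 for decommissionTimeout comes from the table.

```python
def duration_json(seconds: int) -> str:
    """Format whole seconds in the proto3 JSON form of google.protobuf.Duration."""
    if seconds < 0:
        raise ValueError("a negative duration makes no sense for these fields")
    return f"{seconds}s"


def autoscaling_config(max_hosts, measurement_s, cpu_target,
                       warmup_s=None, stabilization_s=None,
                       decommission_timeout_s=120, preemptible=False):
    """Build an AutoscalingConfig dict; measurementDuration is the required field."""
    config = {
        "maxHostsCount": max_hosts,
        "preemptible": preemptible,
        "measurementDuration": duration_json(measurement_s),
        "cpuUtilizationTarget": cpu_target,
        "decommissionTimeout": decommission_timeout_s,
    }
    # Optional durations are left out entirely when not provided.
    if warmup_s is not None:
        config["warmupDuration"] = duration_json(warmup_s)
    if stabilization_s is not None:
        config["stabilizationDuration"] = duration_json(stabilization_s)
    return config


# Hypothetical values chosen only for illustration.
config = autoscaling_config(max_hosts=10, measurement_s=60,
                            cpu_target=70.0, warmup_s=120)
print(config)
```

When using generated protobuf classes instead of JSON, these fields would be Duration messages rather than strings.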
operation.Operation
{
  "id": "string",
  "description": "string",
  "createdAt": "google.protobuf.Timestamp",
  "createdBy": "string",
  "modifiedAt": "google.protobuf.Timestamp",
  "done": "bool",
  "metadata": {
    "clusterId": "string"
  },
  // Includes only one of the fields `error`, `response`
  "error": "google.rpc.Status",
  "response": {
    "id": "string",
    "folderId": "string",
    "createdAt": "google.protobuf.Timestamp",
    "name": "string",
    "description": "string",
    "labels": "string",
    "monitoring": [
      {
        "name": "string",
        "description": "string",
        "link": "string"
      }
    ],
    "config": {
      "versionId": "string",
      "hadoop": {
        "services": [
          "Service"
        ],
        "properties": "string",
        "sshPublicKeys": [
          "string"
        ],
        "initializationActions": [
          {
            "uri": "string",
            "args": [
              "string"
            ],
            "timeout": "int64"
          }
        ]
      }
    },
    "health": "Health",
    "status": "Status",
    "zoneId": "string",
    "serviceAccountId": "string",
    "bucket": "string",
    "uiProxy": "bool",
    "securityGroupIds": [
      "string"
    ],
    "hostGroupIds": [
      "string"
    ],
    "deletionProtection": "bool",
    "logGroupId": "string"
  }
  // end of the list of possible fields
}
An Operation resource. For more information, see Operation.
Field | Description
id | string ID of the operation.
description | string Description of the operation. 0-256 characters long.
createdAt | google.protobuf.Timestamp Creation timestamp.
createdBy | string ID of the user or service account who initiated the operation.
modifiedAt | google.protobuf.Timestamp The time when the Operation resource was last modified.
done | bool If the value is false, the operation is still in progress. If true, the operation is completed, and either error or response is available.
metadata | CreateClusterMetadata Service-specific metadata associated with the operation.
error | google.rpc.Status The error result of the operation in case of failure or cancellation. Includes only one of the fields error, response. The operation result.
response | Cluster The normal response of the operation in case of success. Includes only one of the fields error, response. The operation result.
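Since an Operation carries only one of error and response, client code usually branches on done first and then on which of the two is present. The sketch below works on an Operation already decoded into a dict. It assumes that a false or missing done means the operation is still running; `OperationFailed` and `operation_result` are hypothetical names, not part of any SDK.

```python
class OperationFailed(Exception):
    """Raised when a finished operation carries an error instead of a response."""


def operation_result(op: dict):
    """Return the response of a finished operation, or None while it is running."""
    if not op.get("done"):
        return None  # assumed to mean "still in progress"
    if "error" in op:
        raise OperationFailed(op["error"])
    return op.get("response")


# Hypothetical decoded operations.
running = {"id": "op-1", "done": False, "metadata": {"clusterId": "c-1"}}
finished = {"id": "op-1", "done": True, "response": {"id": "c-1", "name": "example"}}

print(operation_result(running))
print(operation_result(finished)["id"])
```

A caller would typically re-fetch the operation at intervals until this function returns a value or raises.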
CreateClusterMetadata
Field | Description
clusterId | string ID of the cluster that is being created.
Cluster
A Data Proc cluster. For details about the concept, see documentation.
Field | Description
id | string ID of the cluster. Generated at creation time.
folderId | string ID of the folder that the cluster belongs to.
createdAt | google.protobuf.Timestamp Creation timestamp.
name | string Name of the cluster. The name is unique within the folder.
description | string Description of the cluster.
labels | string Cluster labels as key:value pairs.
monitoring[] | Monitoring Monitoring systems relevant to the cluster.
config | ClusterConfig Configuration of the cluster.
health | enum Health Aggregated cluster health.
status | enum Status Cluster status.
zoneId | string ID of the availability zone where the cluster resides.
serviceAccountId | string ID of the service account for the Data Proc manager agent.
bucket | string Object Storage bucket to be used for Data Proc jobs that are run in the cluster.
uiProxy | bool Whether the UI Proxy feature is enabled.
securityGroupIds[] | string User security groups.
hostGroupIds[] | string Host groups hosting VMs of the cluster.
deletionProtection | bool Deletion protection inhibits deletion of the cluster.
logGroupId | string ID of the Cloud Logging log group to write logs to. If not set, the default log group for the folder is used.
Monitoring
Metadata of a monitoring system for a Data Proc cluster.
Field | Description
name | string Name of the monitoring system.
description | string Description of the monitoring system.
link | string Link to the monitoring system.
ClusterConfig
Field | Description
versionId | string Image version for cluster provisioning.
hadoop | HadoopConfig Data Proc specific configuration options.
HadoopConfig
Hadoop configuration that describes services installed in a cluster,
their properties and settings.
Field | Description
services[] | enum Service Set of services used in the cluster (if empty, the default set is used).
properties | string Properties set for all hosts in *-site.xml configurations. For example, use the key 'hdfs:dfs.replication' to set the dfs.replication property.
sshPublicKeys[] | string List of public SSH keys for accessing cluster hosts.
initializationActions[] | InitializationAction Set of initialization actions.
InitializationAction
Field | Description
uri | string URI of the executable file.
args[] | string Arguments to the initialization action.
timeout | int64 Execution timeout.