Creating a GPU cluster
Note
By default, the cloud has a zero quota for creating GPU clusters. To change the quota
This section explains how to create GPU clusters. Currently, GPU clusters can only be created in the ru-central1-a
availability zone.
After creating a cluster, you can add VMs from the same availability zone to it.
If you do not have the Yandex Cloud command line interface yet, install and initialize it.
The folder specified in the CLI profile is used by default. You can specify a different folder using the --folder-name or --folder-id parameter.
- View the description of the CLI command for creating a GPU cluster:
  yc compute gpu-cluster create --help
  Note that you can currently create GPU clusters with the InfiniBand connection type only.
- Create a GPU cluster in the default availability zone:
  yc compute gpu-cluster create --interconnect-type infiniband
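The create command above can be wrapped in a small script that sets the cluster name and zone explicitly. This is a minimal sketch, not an official recipe: the --name and --zone flags are assumptions to confirm against the --help output, my-gpu-cluster is a hypothetical name, and the run helper and DRY_RUN variable are illustrative only. By default the script only prints the command it would execute.

```shell
#!/bin/sh
# Sketch: create a GPU cluster with an explicit name and zone.
# Assumption: --name and --zone are accepted by `yc compute gpu-cluster create`;
# check `yc compute gpu-cluster create --help` before relying on them.

# Print the command when DRY_RUN=1 (the default), execute it otherwise.
run() {
  if [ "${DRY_RUN:-1}" = "1" ]; then
    printf '%s\n' "$*"
  else
    "$@"
  fi
}

# $1: cluster name (hypothetical), $2: availability zone
create_gpu_cluster() {
  run yc compute gpu-cluster create \
    --name "$1" \
    --zone "$2" \
    --interconnect-type infiniband
}

create_gpu_cluster "my-gpu-cluster" "ru-central1-a"
```

Setting DRY_RUN=0 before running the script executes the command instead of printing it.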
If you don't have Terraform, install it and configure the Yandex Cloud provider.
- In the Terraform configuration file, describe the parameters of the resource to create:
  provider "yandex" {
    zone = "ru-central1-a"
  }

  resource "yandex_compute_gpu_cluster" "default" {
    name              = "<GPU_cluster_name>"
    interconnect_type = "<connection_type>"
    zone              = "ru-central1-a"

    labels = {
      <key_of_label_1> = "<value_of_label_1>"
      <key_of_label_2> = "<value_of_label_2>"
    }
  }
  Where:
  - name: GPU cluster name. This is a required parameter.
  - interconnect_type: Type of connection. You can currently create GPU clusters with the infiniband connection type only. This is a required parameter.
  - labels: Resource label in <key> = "<value>" format. This is an optional parameter.
  For more information about the yandex_compute_gpu_cluster resource parameters, see the Terraform provider documentation.
- Create resources:
  - In the terminal, change to the folder where you edited the configuration file.
  - Make sure the configuration file is correct using the command:
    terraform validate
    If the configuration is correct, the following message is returned:
    Success! The configuration is valid.
  - Run the command:
    terraform plan
    The terminal will display a list of resources with parameters. No changes are made at this step. If the configuration contains errors, Terraform will point them out.
  - Apply the configuration changes:
    terraform apply
  - Confirm the changes: type yes in the terminal and press Enter.
This will create a GPU cluster in the specified folder. You can check the new GPU cluster and its configuration using the management console or this CLI command:
yc compute gpu-cluster get <GPU_cluster_name>
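The Terraform steps and the final check can be chained so that each command runs only if the previous one succeeded. The following is a rough sketch under stated assumptions: the run helper, the DRY_RUN variable, and the deploy_gpu_cluster function are illustrative names that do not come from the Yandex Cloud tooling, and my-gpu-cluster is a hypothetical cluster name. By default the script only prints the commands in order.

```shell
#!/bin/sh
# Sketch: validate, plan, apply, then inspect the resulting GPU cluster.
# Run it from the folder that contains the Terraform configuration file.

# Print the command when DRY_RUN=1 (the default), execute it otherwise.
run() {
  if [ "${DRY_RUN:-1}" = "1" ]; then
    printf '%s\n' "$*"
  else
    "$@"
  fi
}

# $1: name of the GPU cluster declared in the configuration (hypothetical)
deploy_gpu_cluster() {
  # `&&` stops the chain at the first failing step.
  run terraform validate &&
    run terraform plan &&
    run terraform apply &&
    run yc compute gpu-cluster get "$1"
}

deploy_gpu_cluster "my-gpu-cluster"
```

With DRY_RUN=0, terraform apply still asks for the yes confirmation described above, so the script remains interactive at that step.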