Creating a GPU cluster
Note
This section explains how to create a GPU cluster. Currently, GPU clusters can only be created in the ru-central1-a availability zone.
After creating a cluster, you can add VMs from the same availability zone to it.
If you do not have the Yandex Cloud CLI yet, install and initialize it.
The folder specified in the CLI profile is used by default. You can specify a different folder through the --folder-name or --folder-id parameter.
-
See the description of the CLI command for creating a GPU cluster:
yc compute gpu-cluster create --help
Note that currently, you can only create GPU clusters with the infiniband interconnect type.
-
Create a GPU cluster in the default availability zone:
yc compute gpu-cluster create --interconnect-type infiniband
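As a minimal sketch, the create command can be wrapped in a small shell function that targets an explicit folder instead of the one from the CLI profile. Only the --interconnect-type and --folder-id parameters come from this page; the create_gpu_cluster function name, the DRY_RUN variable, and the <folder_ID> placeholder are hypothetical and used here for illustration only.

```shell
#!/bin/sh
# Sketch: create a GPU cluster in an explicit folder.
# create_gpu_cluster and DRY_RUN are illustrative names, not part of the yc CLI.

create_gpu_cluster() {
  folder_id="$1"   # ID of the target folder

  # Assemble the command from the parameters described on this page.
  set -- yc compute gpu-cluster create \
    --interconnect-type infiniband \
    --folder-id "$folder_id"

  if [ "${DRY_RUN:-0}" = "1" ]; then
    # Preview mode: print the command instead of running it.
    echo "$*"
  else
    "$@"
  fi
}

# Preview the command without contacting the cloud API.
DRY_RUN=1
create_gpu_cluster "<folder_ID>"
```

With DRY_RUN set to 1, the function only prints the assembled command, which makes it possible to review the folder and interconnect type before creating anything. Unset DRY_RUN to run the command for real.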
If you don't have Terraform, install it and configure the Yandex Cloud provider.
-
In the Terraform configuration file, define the parameters of the resource you want to create:
provider "yandex" {
  zone = "ru-central1-a"
}

resource "yandex_compute_gpu_cluster" "default" {
  name              = "<GPU_cluster_name>"
  interconnect_type = "<interconnect_type>"
  zone              = "ru-central1-a"

  labels = {
    <label_1_key> = "<label_1_value>"
    <label_2_key> = "<label_2_value>"
  }
}
Where:
name: GPU cluster name. This is a required parameter.
interconnect_type: Interconnect type. Currently, you can only create GPU clusters with the infiniband interconnect type. This is a required parameter.
labels: Resource labels in <key> = "<value>" format. This is an optional parameter.
For more information about the yandex_compute_gpu_cluster resource properties, see the Terraform provider documentation.
-
Create the resources:
-
In the terminal, change to the folder where you edited the configuration file.
-
Make sure the configuration file is correct using the command:
terraform validate
If the configuration is correct, the following message is returned:
Success! The configuration is valid.
-
Run the command:
terraform plan
The terminal will display a list of resources with parameters. No changes are made at this step. If the configuration contains errors, Terraform will point them out.
-
Apply the configuration changes:
terraform apply
-
Confirm the changes: type yes in the terminal and press Enter.
This will create a GPU cluster in the specified folder. You can check the new GPU cluster and its settings using the management console or this CLI command:
yc compute gpu-cluster get <GPU_cluster_name>
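The Terraform steps above, followed by the final check, can be chained into one script that stops at the first failing command. This is a sketch under stated assumptions rather than a prescribed workflow: every command comes from this page, while the run_step and deploy_gpu_cluster function names and the DRY_RUN variable are hypothetical helpers introduced for illustration.

```shell
#!/bin/sh
# Sketch: run the Terraform steps in order, then check the new cluster.
# run_step, deploy_gpu_cluster, and DRY_RUN are illustrative names.

run_step() {
  if [ "${DRY_RUN:-0}" = "1" ]; then
    echo "$*"   # preview mode: print the command only
  else
    "$@"
  fi
}

# The function body runs in a subshell, so the cd does not
# change the caller's working directory.
deploy_gpu_cluster() (
  config_dir="$1"     # folder with the Terraform configuration file
  cluster_name="$2"   # value of "name" in the configuration

  cd "$config_dir" &&
  run_step terraform validate &&
  run_step terraform plan &&
  # terraform apply asks for confirmation; type yes to proceed.
  run_step terraform apply &&
  run_step yc compute gpu-cluster get "$cluster_name"
)

# Preview the sequence without changing any resources.
DRY_RUN=1
deploy_gpu_cluster . "<GPU_cluster_name>"
```

Because the steps are joined with &&, a failed terraform validate or terraform plan prevents terraform apply from running. The interactive confirmation from terraform apply is kept on purpose, so the script still requires typing yes.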