Creating a Trino cluster
Note
The service is at the Preview stage.
Each Managed Service for Trino cluster consists of a set of Trino components: a coordinator and workers, which can run as multiple instances.
Roles for creating a cluster
To create a Managed Service for Trino cluster, your Yandex Cloud account needs the following roles:
- managed-trino.admin: To create a cluster.
- vpc.user: To use the cluster network.
- iam.serviceAccounts.user: To link a service account to the cluster.
Make sure to assign the managed-trino.integrationProvider and storage.editor roles to the cluster service account. This gives the cluster the permissions it needs to work with user resources.
For more information about assigning roles, see the Yandex Identity and Access Management documentation.
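The role requirements above can be expressed as a simple pre-flight check. The sketch below is only an illustration: the role names come from this section, while the `missing_roles` helper and the sample sets of granted roles are hypothetical. How you obtain the roles actually granted is out of scope here, and the check does not account for broader roles that may already include these permissions.

```python
# Roles listed in this section for the user account that creates the cluster.
REQUIRED_ACCOUNT_ROLES = {
    "managed-trino.admin",
    "vpc.user",
    "iam.serviceAccounts.user",
}

# Roles listed in this section for the cluster service account.
REQUIRED_SERVICE_ACCOUNT_ROLES = {
    "managed-trino.integrationProvider",
    "storage.editor",
}


def missing_roles(granted, required):
    """Return the required roles absent from `granted`, sorted by name.

    Hypothetical helper: it compares role names literally and knows nothing
    about role inheritance.
    """
    return sorted(set(required) - set(granted))


if __name__ == "__main__":
    # Invented example: an account that only has vpc.user so far.
    print(missing_roles({"vpc.user"}, REQUIRED_ACCOUNT_ROLES))
```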
Creating a cluster
Management console
-
In the management console, select the folder where you want to create a Managed Service for Trino cluster.
-
Select Managed Service for Trino.
-
Click Create cluster.
-
Under Basic parameters:
-
Specify a name for the cluster. The name must be unique within the folder.
-
(Optional) Enter a cluster description.
-
Optionally, create labels:
- Click Add label.
- Enter a label in key: value format.
- Press Enter.
-
Select an existing service account or create a new one.
Make sure to assign the managed-trino.integrationProvider and storage.editor roles to the service account.
-
-
Under Network settings, select a network, subnet, and security group for the cluster.
-
Configure the coordinator and workers.
-
Under Catalogs, add the required Trino catalogs. You can do this either when creating the cluster or later.
-
Specify a name for the catalog. The name must be unique within the cloud.
-
Select Connector type.
-
Under Catalog settings, set the parameters depending on the selected type:
-
For Hive, Iceberg, and Delta Lake connectors:
- URI to connect to the Metastore cluster in this format: thrift://<IP_address>:<port>.
- File storage: Select the file storage type, either Yandex Object Storage or External storage. For external storage, specify the following settings:
  - ID of the AWS-compatible static access key.
  - Secret key of the AWS-compatible static access key.
  - File storage endpoint, such as storage.yandexcloud.net.
  - File storage region, such as ru-central1.
-
For PostgreSQL and ClickHouse® connectors:
- URL to connect to a cluster in this format: jdbc:<DBMS>://<host_address>:<port>/<DB_name>, where DBMS is postgresql or clickhouse.
- Username to connect to the cluster.
- User password.
-
TPC-H and TPC-DS connectors provide access to test data and do not require configuration.
-
-
Optionally, specify additional catalog settings in key:value format.
-
-
Under Advanced settings:
-
Optionally, enable cluster deletion protection.
-
Optionally, configure logging:
- Enable the Write logs setting.
- Select where the logs will be stored:
  - Folder: Select a folder from the list.
  - Group: Select a log group from the list or create a new one.
- Select Min. logging level from the list.
-
-
Click Create.
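The connection strings entered in the connector settings follow two fixed formats: thrift://<IP_address>:<port> for the Metastore and jdbc:<DBMS>://<host_address>:<port>/<DB_name> for PostgreSQL and ClickHouse®. As a minimal sketch, they could be assembled and sanity-checked before being pasted into the form. The function names, the port check, and the addresses in the usage lines are assumptions made for illustration and are not part of the service.

```python
# DBMS values accepted in the JDBC URL, as listed in the connector settings.
JDBC_DBMS = ("postgresql", "clickhouse")


def _check_port(port):
    """Reject anything that is not a valid TCP port number."""
    if not isinstance(port, int) or not 0 < port < 65536:
        raise ValueError(f"invalid port: {port!r}")


def metastore_uri(ip_address, port):
    """Build a Metastore URI in the thrift://<IP_address>:<port> format."""
    _check_port(port)
    return f"thrift://{ip_address}:{port}"


def jdbc_url(dbms, host_address, port, db_name):
    """Build a URL in the jdbc:<DBMS>://<host_address>:<port>/<DB_name> format."""
    if dbms not in JDBC_DBMS:
        raise ValueError(f"DBMS must be one of {JDBC_DBMS}, got {dbms!r}")
    _check_port(port)
    return f"jdbc:{dbms}://{host_address}:{port}/{db_name}"


if __name__ == "__main__":
    # Hypothetical addresses and ports, used only to show the output shape.
    print(metastore_uri("10.0.0.5", 9083))
    print(jdbc_url("postgresql", "pg.example.internal", 5432, "db1"))
```

Requiring an explicit port avoids guessing defaults, which differ between deployments.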
REST API
-
Get an IAM token for API authentication and put it into the environment variable:
export IAM_TOKEN="<IAM_token>"
-
Create a file named body.json and add the following contents to it:
Note
This example does not use all available parameters. For a list of all parameters, see the API documentation.
{
  "folderId": "<folder_ID>",
  "name": "<cluster_name>",
  "description": "<cluster_description>",
  "labels": { <label_list> },
  "trino": {
    "coordinatorConfig": {
      "resources": {
        "resourcePresetId": "<resource_ID>"
      }
    },
    "workerConfig": {
      "resources": {
        "resourcePresetId": "<resource_ID>"
      },
      "scalePolicy": {
        "autoScale": {
          "minCount": "<minimum_number_of_instances>",
          "maxCount": "<maximum_number_of_instances>"
        }
      }
    }
  },
  "network": {
    "subnetIds": [ <list_of_subnet_IDs> ],
    "securityGroupIds": [ <list_of_security_group_IDs> ]
  },
  "deletionProtection": "<deletion_protection>",
  "serviceAccountId": "<service_account_ID>",
  "logging": {
    "enabled": "<use_of_logging>",
    "folderId": "<folder_ID>",
    "minLevel": "<logging_level>"
  }
}
Where:
- folderId: Folder ID. You can request it with the list of folders in the cloud.
- name: Cluster name.
- description: Cluster description.
- labels: List of labels. Provide labels in "<key>": "<value>" format.
- trino: Configuration of Trino cluster components.
  - coordinatorConfig: Coordinator configuration.
    - resources.resourcePresetId: ID of the coordinator’s computing resources. The possible values are:
      - c4-m16: 4 vCPUs, 16 GB RAM
      - c8-m32: 8 vCPUs, 32 GB RAM
  - workerConfig: Worker configuration.
    - resources.resourcePresetId: ID of the worker’s computing resources. The possible values are:
      - c4-m16: 4 vCPUs, 16 GB RAM
      - c8-m32: 8 vCPUs, 32 GB RAM
    - scalePolicy: Worker scaling policy:
      - fixedScale: Fixed scaling policy.
        - count: Number of workers.
      - autoScale: Automatic scaling policy.
        - minCount: Minimum number of workers.
        - maxCount: Maximum number of workers.
      Specify one of the two parameters: fixedScale or autoScale.
- network: Network settings:
  - subnetIds: List of subnet IDs.
  - securityGroupIds: List of security group IDs.
- deletionProtection: Enables cluster protection against accidental deletion. The possible values are true or false.
  Even if it is enabled, one can still connect to the cluster manually and delete it.
- serviceAccountId: Service account ID.
- logging: Logging parameters:
  - enabled: Enables logging. Logs generated by Trino components will be sent to Yandex Cloud Logging. The possible values are true or false.
  - minLevel: Minimum logging level. Possible values: TRACE, DEBUG, INFO, WARN, ERROR, and FATAL.
  - folderId: Folder ID. Logs will be written to the default log group for this folder.
  - logGroupId: Custom log group ID. Logs will be written to this group.
  Specify one of the two parameters: folderId or logGroupId.
-
Use the Cluster.create method and send the following request, e.g., via cURL:
curl \
  --request POST \
  --header "Authorization: Bearer $IAM_TOKEN" \
  --url 'https://trino.api.cloud.yandex.net/managed-trino/v1/clusters' \
  --data '@body.json'
-
View the server response to make sure the request was successful.
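The REST steps can also be scripted. The following is a minimal sketch, assuming only the Python standard library, that assembles the request body, enforces the two "specify one of the two parameters" rules (fixedScale or autoScale, folderId or logGroupId), and prepares the POST request without sending it. The function names and argument layout are hypothetical. The Content-Type header is an assumption that does not appear in the cURL example, and counts are passed as JSON numbers rather than the quoted placeholders used in body.json.

```python
import json
import urllib.request

API_URL = "https://trino.api.cloud.yandex.net/managed-trino/v1/clusters"
LOG_LEVELS = ("TRACE", "DEBUG", "INFO", "WARN", "ERROR", "FATAL")


def scale_policy(fixed_count=None, min_count=None, max_count=None):
    """Return a scalePolicy object holding exactly one of fixedScale or autoScale."""
    wants_auto = min_count is not None or max_count is not None
    if (fixed_count is not None) == wants_auto:
        raise ValueError("specify either fixed_count or min_count/max_count")
    if fixed_count is not None:
        return {"fixedScale": {"count": fixed_count}}
    if min_count is None or max_count is None or min_count > max_count:
        raise ValueError("autoScale needs both bounds with min_count <= max_count")
    return {"autoScale": {"minCount": min_count, "maxCount": max_count}}


def logging_config(min_level, folder_id=None, log_group_id=None):
    """Return a logging object with exactly one of folderId or logGroupId."""
    if min_level not in LOG_LEVELS:
        raise ValueError(f"min_level must be one of {LOG_LEVELS}")
    if (folder_id is None) == (log_group_id is None):
        raise ValueError("specify exactly one of folder_id or log_group_id")
    if folder_id is not None:
        target = {"folderId": folder_id}
    else:
        target = {"logGroupId": log_group_id}
    return {"enabled": True, "minLevel": min_level, **target}


def build_body(folder_id, name, service_account_id, subnet_ids, security_group_ids,
               coordinator_preset, worker_preset, policy, logging=None,
               description="", labels=None, deletion_protection=False):
    """Assemble the same structure as body.json from plain Python values."""
    body = {
        "folderId": folder_id,
        "name": name,
        "description": description,
        "labels": labels or {},
        "trino": {
            "coordinatorConfig": {"resources": {"resourcePresetId": coordinator_preset}},
            "workerConfig": {
                "resources": {"resourcePresetId": worker_preset},
                "scalePolicy": policy,
            },
        },
        "network": {
            "subnetIds": list(subnet_ids),
            "securityGroupIds": list(security_group_ids),
        },
        "deletionProtection": deletion_protection,
        "serviceAccountId": service_account_id,
    }
    if logging is not None:
        body["logging"] = logging
    return body


def build_request(body, iam_token):
    """Prepare (but do not send) the POST request shown in the cURL example."""
    return urllib.request.Request(
        API_URL,
        data=json.dumps(body).encode("utf-8"),
        method="POST",
        headers={
            "Authorization": f"Bearer {iam_token}",
            # Assumption: an explicit JSON content type; not in the cURL example.
            "Content-Type": "application/json",
        },
    )
```

To send the request, pass the result of `build_request` to `urllib.request.urlopen` and read the response, which corresponds to the last step above.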
gRPC API
-
Get an IAM token for API authentication and put it into the environment variable:
export IAM_TOKEN="<IAM_token>"
-
Clone the cloudapi repository:
cd ~/ && git clone --depth=1 https://github.com/yandex-cloud/cloudapi
Below, we assume the repository contents are stored in the ~/cloudapi/ directory.
Create a file named body.json and add the following contents to it:
Note
This example does not use all available parameters. For a list of all parameters, see the API documentation.
{
  "folder_id": "<folder_ID>",
  "name": "<cluster_name>",
  "description": "<cluster_description>",
  "labels": { <label_list> },
  "trino": {
    "coordinator_config": {
      "resources": {
        "resource_preset_id": "<resource_ID>"
      }
    },
    "worker_config": {
      "resources": {
        "resource_preset_id": "<resource_ID>"
      },
      "scale_policy": {
        "auto_scale": {
          "min_count": "<minimum_number_of_instances>",
          "max_count": "<maximum_number_of_instances>"
        }
      }
    }
  },
  "network": {
    "subnet_ids": [ <list_of_subnet_IDs> ],
    "security_group_ids": [ <list_of_security_group_IDs> ]
  },
  "deletion_protection": "<deletion_protection>",
  "service_account_id": "<service_account_ID>",
  "logging": {
    "enabled": "<use_of_logging>",
    "folder_id": "<folder_ID>",
    "min_level": "<logging_level>"
  }
}
Where:
- folder_id: Folder ID. You can request it with the list of folders in the cloud.
- name: Cluster name.
- description: Cluster description.
- labels: List of labels. Provide labels in "<key>": "<value>" format.
- trino: Configuration of Trino cluster components.
  - coordinator_config: Coordinator configuration.
    - resources.resource_preset_id: ID of the coordinator’s computing resources. The possible values are:
      - c4-m16: 4 vCPUs, 16 GB RAM
      - c8-m32: 8 vCPUs, 32 GB RAM
  - worker_config: Worker configuration.
    - resources.resource_preset_id: ID of the worker’s computing resources. The possible values are:
      - c4-m16: 4 vCPUs, 16 GB RAM
      - c8-m32: 8 vCPUs, 32 GB RAM
    - scale_policy: Worker scaling policy:
      - fixed_scale: Fixed scaling policy.
        - count: Number of workers.
      - auto_scale: Automatic scaling policy.
        - min_count: Minimum number of workers.
        - max_count: Maximum number of workers.
      Specify one of the two parameters: fixed_scale or auto_scale.
- network: Network settings:
  - subnet_ids: List of subnet IDs.
  - security_group_ids: List of security group IDs.
- deletion_protection: Enables cluster protection against accidental deletion. The possible values are true or false.
  Even if it is enabled, one can still connect to the cluster manually and delete it.
- service_account_id: Service account ID.
- logging: Logging parameters:
  - enabled: Enables logging. Logs generated by Trino components will be sent to Yandex Cloud Logging. The possible values are true or false.
  - min_level: Minimum logging level. Possible values: TRACE, DEBUG, INFO, WARN, ERROR, and FATAL.
  - folder_id: Folder ID. Logs will be written to the default log group for this folder.
  - log_group_id: Custom log group ID. Logs will be written to this group.
  Specify one of the two parameters: folder_id or log_group_id.
-
Use the ClusterService/Create call and send the following request, e.g., via gRPCurl:
grpcurl \
  -format json \
  -import-path ~/cloudapi/ \
  -import-path ~/cloudapi/third_party/googleapis/ \
  -proto ~/cloudapi/yandex/cloud/trino/v1/cluster_service.proto \
  -rpc-header "Authorization: Bearer $IAM_TOKEN" \
  -d @ \
  trino.api.cloud.yandex.net:443 \
  yandex.cloud.trino.v1.ClusterService.Create \
  < body.json
-
View the server response to make sure the request was successful.
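The REST and gRPC request bodies shown on this page differ only in key naming: camelCase for REST and snake_case for gRPC. If you maintain a single body and want to derive the other form, a conversion along the following lines may help. This is a sketch with hypothetical helper names. It leaves the contents of labels untouched on the assumption that label keys are user data rather than API field names.

```python
import re


def camel_to_snake(name):
    """Convert a camelCase field name such as resourcePresetId to snake_case."""
    return re.sub(r"(?<!^)(?=[A-Z])", "_", name).lower()


def to_grpc_body(value, parent_key=None):
    """Recursively rename dictionary keys from camelCase to snake_case.

    The dictionary stored under "labels" is copied as-is, because its keys
    are chosen by the user and are not API field names.
    """
    if isinstance(value, dict):
        if parent_key == "labels":
            return dict(value)
        return {camel_to_snake(k): to_grpc_body(v, k) for k, v in value.items()}
    if isinstance(value, list):
        return [to_grpc_body(item, parent_key) for item in value]
    return value


if __name__ == "__main__":
    # Invented values, used only to show the renaming.
    rest_body = {
        "folderId": "f1",
        "labels": {"costCenter": "a1"},
        "trino": {"workerConfig": {"scalePolicy": {"fixedScale": {"count": 2}}}},
    }
    print(to_grpc_body(rest_body))
```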