Getting started with Yandex Managed Service for Trino
To get started:
- Create a Managed Service for Trino cluster.
- Install the Trino CLI.
- Connect to Trino.
- Send queries to the Yandex Managed Service for PostgreSQL cluster through Trino.
- Check the queries in the Managed Service for PostgreSQL cluster.
Getting started
-
Go to the management console
and log in to Yandex Cloud or sign up if not signed up yet. -
If you do not have a folder yet, create one:
-
In the management console
, select the appropriate cloud from the list on the left. -
At the top right, click Create folder.
-
Give your folder a name. The naming requirements are as follows:
- It must be 2 to 63 characters long.
- It may contain lowercase Latin letters, numbers, and hyphens.
- It must start with a letter and cannot end with a hyphen.
-
Optionally, specify the description for your folder.
-
Select Create a default network. This will create a network with subnets in each availability zone. Within this network, you will also have a default security group, inside which all network traffic will be allowed.
-
Click Create.
-
-
Assign the following roles to your Yandex Cloud account:
managed-trino.admin
: To create a cluster.- vpc.user: To use the cluster network.
- iam.serviceAccounts.user: To link a service account to the cluster.
Note
If you are unable to manage roles, contact your cloud or organization administrator.
-
Create a service account with the
managed-trino.integrationProvider
andstorage.editor
roles. -
Create a Yandex Managed Service for PostgreSQL cluster Trino will work with.
Create a Managed Service for Trino cluster
-
In the management console, select the folder where you want to create a Managed Service for Trino cluster.
-
Select Managed Service for Trino.
-
Click Create cluster.
-
Specify a name for the cluster.
-
In the Service account field, select the previously created service account.
-
Under Network settings, select a network, subnet, and security group for the cluster.
-
Configure the coordinator and workers.
-
Under Catalogs, add folders:
-
For the Managed Service for PostgreSQL cluster, with the following properties:
-
Catalog name:
test
-
Connector type:
PostgreSQL
-
URL:
jdbc:postgresql://<FQDN_of_host_of_Managed_Service_for_Postgresql>:6432/<DB_name>?ssl=true&sslmode=verify-full
To learn how to get the FQDN of a host in a Managed Service for PostgreSQL cluster, see this guide.
-
User name: Username in the Managed Service for PostgreSQL cluster.
-
Password: User password in the Managed Service for PostgreSQL cluster.
-
-
For test data, with the following properties:
- Catalog name:
data
. - Connector type:
TPC-H
.
- Catalog name:
You can add folders both when you are creating a cluster and after you have done so.
-
-
Optionally, under Advanced settings, configure logging:
- Enable the Write logs setting.
- Select where the logs will be stored:
- Folder: Select a folder from the list.
- Group: Select a log group from the list or create a new one.
- Select Min. logging level from the list.
-
Click Create.
-
Wait until the cluster is ready: its status on the Managed Service for Trino dashboard will change to Running and its state, to Alive. This may take some time.
Install the Trino CLI
-
Download the Trino CLI from the Apache Maven central repository to the
trino
file:wget -O trino \ https://repo.maven.apache.org/maven2/io/trino/trino-cli/<version_number>/trino-cli-<version_number>-executable.jar
The available versions are listed here
. -
Convert the file you downloaded to an executable:
sudo chmod +x trino
-
Make sure Trino is installed successfully. Do it by requesting Trino version:
./trino --version
The response will look as follows:
Trino CLI <version_number>
Connect to the Managed Service for Trino cluster
-
Create an IAM token and put it to the
TRINO_PASSWORD
environment variable:export TRINO_PASSWORD=$(yc iam create-token)
This IAM token in
TRINO_PASSWORD
will be your password to the Managed Service for Trino cluster. To enable it, specify the--password
flag upon connection. -
Connect to the Managed Service for Trino cluster:
./trino <coordinator_URL> --user iam --password
You can copy the coordinator URL and paste it to the Coordinator field on the Trino overview page in the management console
.
Send queries to the Managed Service for PostgreSQL cluster through Trino
-
Connect to the Trino cluster through the Trino CLI.
To follow the steps below, use the Trino CLI.
-
Create a new schema in the Managed Service for PostgreSQL cluster:
CREATE SCHEMA IF NOT EXISTS test.tpch_postgresql;
-
Create a table:
CREATE TABLE IF NOT EXISTS test.tpch_postgresql.customers ( name VARCHAR, phone VARCHAR, acctbal DOUBLE );
-
Populate the table with data from the test data folder:
INSERT INTO test.tpch_postgresql.customers SELECT name, phone, acctbal FROM data.sf100000.customer LIMIT 1000000;
-
Get the number of rows in the table:
SELECT COUNT(*) as count FROM test.tpch_postgresql.customers;
Expected response:
count --------- 1000000 (1 row)
Check the queries in the Managed Service for PostgreSQL cluster
To check whether the queries to the Managed Service for PostgreSQL cluster are running correctly, connect to the cluster database and get the number of rows in the customers
table:
SELECT COUNT(*) FROM tpch_postgresql.customers;
Expected response:
count
---------
1000000
(1 row)