© 2025 Direct Cursus Technology L.L.C.
Resource relationships in Yandex Managed Service for Trino

Written by Yandex Cloud
Updated on July 3, 2025
  • Cluster architecture
    • Coordinator
    • Workers
    • Trino catalog
    • Connector
  • Running a query in a Trino cluster

Trino is a high-performance, distributed, massively parallel query engine. With Trino, you can query various data stores and work with data in various formats using standard SQL syntax.

Trino separates the storage and compute layers. Trino itself works only with queries and their results. All data operations are delegated to the external data store you are querying, so you do not need to load data from the store into Trino to run a query. This approach speeds up query processing and, combined with the massively parallel architecture, makes it easier to scale a Managed Service for Trino cluster to fit different workloads.

Cluster architecture

The main entity in Managed Service for Trino is a cluster.

A Trino cluster consists of a coordinator and workers.

Coordinator

The coordinator is the primary data processing node. It receives user queries, plans query execution, distributes tasks among workers, and processes the task results they return.

The coordinator server runs a discovery service that monitors worker availability. If a worker becomes unavailable, the coordinator stops assigning new tasks to it.

A Trino cluster always has exactly one coordinator.

Workers

Workers are processing nodes. They handle tasks assigned by the coordinator, run data operations, and return results to the coordinator. When started, each worker registers itself with the discovery service running on the coordinator server, which makes it available for task assignment. Workers periodically send availability signals to the discovery service. If the discovery service does not receive such a signal within the specified time, no new tasks are assigned to that worker.
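
The registration and availability-signal logic described above can be sketched as a toy model. This is only an illustration of the idea, not Trino's actual discovery service: the class name, the `heartbeat_timeout` parameter, and the worker names are all assumptions made for the example.

```python
from dataclasses import dataclass, field


@dataclass
class DiscoveryService:
    """Toy model of worker tracking; not Trino's real implementation."""

    heartbeat_timeout: float  # seconds; hypothetical parameter name
    last_seen: dict[str, float] = field(default_factory=dict)

    def register(self, worker_id: str, now: float) -> None:
        # A worker announces itself when it starts.
        self.last_seen[worker_id] = now

    def heartbeat(self, worker_id: str, now: float) -> None:
        # A periodic availability signal refreshes the timestamp.
        self.last_seen[worker_id] = now

    def available_workers(self, now: float) -> list[str]:
        # Only workers heard from within the timeout may get new tasks.
        return sorted(
            worker
            for worker, seen in self.last_seen.items()
            if now - seen <= self.heartbeat_timeout
        )


discovery = DiscoveryService(heartbeat_timeout=10.0)
discovery.register("worker-1", now=0.0)
discovery.register("worker-2", now=0.0)
discovery.heartbeat("worker-1", now=8.0)

# worker-2 has been silent for 12 seconds, so only worker-1 is listed.
print(discovery.available_workers(now=12.0))
```

Timestamps are passed in explicitly to keep the sketch deterministic; a real service would read a clock.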

When creating a cluster, you can either set a fixed number of workers (from 1 to 64) or autoscale the number of workers (between 0 and 64) based on workload.
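
As a quick illustration of these limits, the check below validates a worker configuration before a cluster is created. The function names are hypothetical and are not part of any Yandex Cloud API; only the numeric bounds come from the text above.

```python
MAX_WORKERS = 64  # upper bound stated in the text above


def check_fixed(count: int) -> bool:
    """A fixed-size cluster needs between 1 and 64 workers."""
    return 1 <= count <= MAX_WORKERS


def check_autoscale(min_count: int, max_count: int) -> bool:
    """An autoscaled cluster may scale between 0 and 64 workers."""
    return 0 <= min_count <= max_count <= MAX_WORKERS


print(check_fixed(0))          # a fixed cluster cannot have zero workers
print(check_autoscale(0, 64))  # autoscaling may go down to zero
```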

Trino catalog

The coordinator and workers can access data sources through catalogs.

A catalog is a set of parameters describing a connection to a data source. You can create one or more catalogs in a Managed Service for Trino cluster. Trino supports working with data from multiple catalogs within a single query.

Each catalog describes only one data source. The data source type is determined by the selected connector.
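
To make the catalog idea concrete, here is a small sketch that models two catalogs and builds a single query spanning both. Trino addresses tables as `catalog.schema.table`; the catalog names (`sales_pg`, `events_ch`), schemas, tables, and columns below are invented for illustration, and the `Catalog` class is not a Trino or Yandex Cloud API.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class Catalog:
    name: str       # catalog name (hypothetical values below)
    connector: str  # connector type, which determines the data source type

    def table(self, schema: str, table: str) -> str:
        # Fully qualified table name: catalog.schema.table
        return f"{self.name}.{schema}.{table}"


orders_pg = Catalog(name="sales_pg", connector="postgresql")
events_ch = Catalog(name="events_ch", connector="clickhouse")

# One SQL statement that joins data from two different catalogs.
query = (
    "SELECT o.order_id, count(*) AS clicks\n"
    f"FROM {orders_pg.table('public', 'orders')} AS o\n"
    f"JOIN {events_ch.table('web', 'clicks')} AS c ON c.order_id = o.order_id\n"
    "GROUP BY o.order_id"
)
print(query)
```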

Connector

A connector is an interface for accessing a specific type of data source. Connectors present data from the source as an abstract table that workers can query. This table supports the same workflow for all data sources, regardless of the specific requirements each source may have.
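
The "same workflow for every source" idea can be sketched with a common interface and two unrelated back ends. This is a simplified analogy, not Trino's connector SPI: the `Connector` base class, its `scan` method, and both implementations are hypothetical.

```python
from abc import ABC, abstractmethod
from typing import Iterator


class Connector(ABC):
    """Hypothetical interface: every source is exposed as rows of a table."""

    @abstractmethod
    def scan(self, table: str) -> Iterator[dict]:
        ...


class InMemoryConnector(Connector):
    def __init__(self, tables: dict[str, list[dict]]):
        self._tables = tables

    def scan(self, table: str) -> Iterator[dict]:
        yield from self._tables[table]


class CsvTextConnector(Connector):
    def __init__(self, files: dict[str, str]):
        self._files = files

    def scan(self, table: str) -> Iterator[dict]:
        # First line holds column names; remaining lines hold values.
        header, *lines = self._files[table].strip().splitlines()
        columns = header.split(",")
        for line in lines:
            yield dict(zip(columns, line.split(",")))


def count_rows(connector: Connector, table: str) -> int:
    # The caller does not need to know which kind of source it is reading.
    return sum(1 for _ in connector.scan(table))


mem = InMemoryConnector({"t": [{"a": 1}, {"a": 2}]})
csv_source = CsvTextConnector({"t": "a,b\n1,2\n3,4\n5,6"})
print(count_rows(mem, "t"), count_rows(csv_source, "t"))
```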

In Managed Service for Trino, the following connectors are available:

  • ClickHouse
  • Delta Lake
  • Hive
  • Iceberg
  • Oracle (Preview)
  • PostgreSQL
  • MS SQL Server (Preview)
  • TPC-DS
  • TPC-H

You select a connector when creating a Trino catalog.

Running a query in a Trino cluster

The user works with the Trino cluster via a client, e.g., the Trino CLI. The client sends queries to the coordinator and shows their results.

Running a query in a Trino cluster involves these stages:

  1. The coordinator receives a query from the client as an SQL statement.

  2. The coordinator plans the query execution stages and converts the query into a series of related tasks, which it distributes among workers.

  3. Workers query the data sources, process the incoming data, exchange intermediate task results, and submit the results of all tasks to the coordinator.

  4. The coordinator collects task results from workers, generates the final result, and sends it to the client, which outputs the query result to the user.
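
The four stages above can be sketched as a toy scatter-gather program. It assumes a query equivalent to summing one column and a data source that is already divided into parts; the names `SPLITS`, `worker_task`, and `coordinator` are illustrative, and threads stand in for worker nodes.

```python
from concurrent.futures import ThreadPoolExecutor

# Hypothetical data source, already divided into parts.
SPLITS = [[3, 1, 4], [1, 5, 9], [2, 6]]


def worker_task(split: list[int]) -> int:
    # Stage 3: a worker reads its part of the source and
    # computes a partial result.
    return sum(split)


def coordinator(splits: list[list[int]], workers: int) -> int:
    # Stage 2: create one task per part and distribute the tasks.
    with ThreadPoolExecutor(max_workers=workers) as pool:
        partials = list(pool.map(worker_task, splits))
    # Stage 4: combine the partial results into the final result.
    return sum(partials)


# Stage 1: the call below stands in for a client submitting
# a query such as SELECT sum(x) FROM source.
print(coordinator(SPLITS, workers=2))  # prints 31
```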

Workers interact with each other and with the coordinator via the REST API. Additionally, workers can exchange intermediate data through Exchange Manager, which serves as temporary data storage. This way, if a worker fails, its active task can be completed on a different worker using the intermediate data from Exchange Manager.
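
The recovery idea can be illustrated with a deliberately simplified sketch: a task saves its intermediate state to an external store, and a second worker resumes from that state after the first one fails. This does not reproduce how Exchange Manager works internally; the dictionary standing in for the storage, the `run_task` function, and the `fail_at` parameter are assumptions made for the example.

```python
class WorkerFailure(Exception):
    """Raised to simulate a worker going down mid-task."""


def run_task(values, exchange, task_id, fail_at=None):
    """Sum values, saving progress to the exchange store after each item.

    exchange maps task_id -> (next_index, running_total).
    """
    start, total = exchange.get(task_id, (0, 0))
    for i in range(start, len(values)):
        if fail_at is not None and i == fail_at:
            raise WorkerFailure(f"worker failed at item {i}")
        total += values[i]
        exchange[task_id] = (i + 1, total)  # save intermediate state
    return total


exchange = {}  # stands in for temporary external storage
values = [10, 20, 30, 40]

try:
    run_task(values, exchange, "task-1", fail_at=2)  # first worker fails
except WorkerFailure:
    pass

# A different worker resumes from the saved intermediate data
# instead of starting over.
result = run_task(values, exchange, "task-1")
print(result)  # prints 100
```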
