© 2025 Direct Cursus Technology L.L.C.
Resource relationships in Yandex Managed Service for Trino

Written by
Yandex Cloud
Updated on May 5, 2025
  • Cluster architecture
    • Coordinator
    • Workers
    • Catalog
    • Connector
  • Running a query in a Trino cluster

Trino is a high-performance, distributed, massively parallel query processing system. With Trino, you can query data in various formats and systems using standard SQL syntax.

Trino separates the storage and compute layers. It works only with queries and their results and does not store data itself. All data operations are delegated to the data source you are querying, so you do not need to pre-load data from your storage into Trino before running a query. As a result, queries are processed faster, and Trino does not need additional data storage. Together with the massively parallel architecture, this approach makes it easier to scale a Managed Service for Trino cluster for various tasks.

Cluster architecture

The main entity in Managed Service for Trino is a cluster.

A Trino cluster consists of the coordinator and workers.

Coordinator

The coordinator is the primary data processing node. It receives user queries, plans query execution, distributes tasks among workers, and processes the results they return.

The coordinator server runs a discovery service that monitors worker availability. If a worker becomes unavailable, the coordinator stops assigning new tasks to it.

A Trino cluster always has exactly one coordinator.

Workers

Workers are processing nodes. They receive tasks from the coordinator, run data operations, and return results to the coordinator. On startup, each worker registers with the discovery service running on the coordinator server and thus becomes available for task assignment. Workers periodically send availability signals to the discovery service. If the discovery service does not receive a signal from a worker within the specified time, no new tasks are assigned to that worker.
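The registration and availability-signal logic described above can be sketched as a toy model. This is not Trino's discovery service: the class name, method names, and the 10-second timeout are assumptions made for illustration only.

```python
from dataclasses import dataclass, field


@dataclass
class DiscoveryRegistry:
    """Toy model of a discovery service that tracks worker signals."""

    heartbeat_timeout: float  # seconds of silence tolerated (assumed value)
    last_seen: dict[str, float] = field(default_factory=dict)

    def register(self, worker_id: str, now: float) -> None:
        # A worker announces itself on startup.
        self.last_seen[worker_id] = now

    def heartbeat(self, worker_id: str, now: float) -> None:
        # Periodic availability signal from a registered worker.
        self.last_seen[worker_id] = now

    def available_workers(self, now: float) -> list[str]:
        # Only workers heard from within the timeout get new tasks.
        return sorted(
            worker_id
            for worker_id, seen in self.last_seen.items()
            if now - seen <= self.heartbeat_timeout
        )


registry = DiscoveryRegistry(heartbeat_timeout=10.0)
registry.register("worker-1", now=0.0)
registry.register("worker-2", now=0.0)
print(registry.available_workers(now=5.0))   # ['worker-1', 'worker-2']

registry.heartbeat("worker-1", now=8.0)      # worker-2 stays silent
print(registry.available_workers(now=12.0))  # ['worker-1']
```

Timestamps are passed in explicitly so the example is deterministic; a real service would read a clock.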

When creating a cluster, you can either set a fixed number of workers (from 1 to 64) or configure automatic scaling of workers (between 0 and 64) based on workload.
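As a rough illustration of these limits, the check below validates a worker setup before a cluster is created. The function and parameter names are hypothetical and do not correspond to fields of the Managed Service for Trino API; the requirement that the minimum not exceed the maximum is also an assumption.

```python
from typing import Optional

MAX_WORKERS = 64  # upper bound stated in the text


def validate_worker_config(
    fixed: Optional[int] = None,
    min_workers: Optional[int] = None,
    max_workers: Optional[int] = None,
) -> str:
    """Return 'fixed' or 'autoscaling' for a valid setup, else raise ValueError."""
    if fixed is not None:
        if min_workers is not None or max_workers is not None:
            raise ValueError("choose either a fixed count or autoscaling bounds")
        if not 1 <= fixed <= MAX_WORKERS:
            raise ValueError("fixed worker count must be between 1 and 64")
        return "fixed"
    if min_workers is None or max_workers is None:
        raise ValueError("autoscaling needs both a minimum and a maximum")
    # Assumption: the lower bound may not exceed the upper bound.
    if not 0 <= min_workers <= max_workers <= MAX_WORKERS:
        raise ValueError("autoscaling bounds must satisfy 0 <= min <= max <= 64")
    return "autoscaling"


print(validate_worker_config(fixed=4))                       # fixed
print(validate_worker_config(min_workers=0, max_workers=8))  # autoscaling
```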

Catalog

The coordinator and workers can access data sources through catalogs.

A catalog is a set of parameters describing a connection to a data source. In a Managed Service for Trino cluster, you can create one or more catalogs. Trino supports working with data from multiple catalogs within a single query.

Each catalog describes only one data source. The data source type is determined by the selected connector.
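The relationship between catalogs, connectors, and tables in a query can be sketched as follows. Trino addresses tables with fully qualified names of the form catalog.schema.table; the catalog names, connection properties, and helper functions below are hypothetical, and the parser does not handle quoted identifiers that contain dots.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class Catalog:
    name: str         # how the catalog is referenced in SQL
    connector: str    # determines the data source type
    properties: dict  # connection parameters (illustrative only)


def split_table_name(qualified_name: str) -> tuple[str, str, str]:
    # Expect exactly catalog.schema.table.
    parts = qualified_name.split(".")
    if len(parts) != 3:
        raise ValueError(f"expected catalog.schema.table, got {qualified_name!r}")
    return parts[0], parts[1], parts[2]


def connectors_for(tables: list[str], catalogs: list[Catalog]) -> dict[str, str]:
    # Map each referenced table to the connector of its catalog.
    by_name = {catalog.name: catalog for catalog in catalogs}
    result = {}
    for table in tables:
        catalog_name, _schema, _table = split_table_name(table)
        if catalog_name not in by_name:
            raise KeyError(f"unknown catalog: {catalog_name}")
        result[table] = by_name[catalog_name].connector
    return result


catalogs = [
    Catalog("sales_pg", "PostgreSQL", {"host": "pg.example.internal"}),
    Catalog("lake", "Iceberg", {"warehouse": "s3://example-bucket/warehouse"}),
]

# Two tables that a single query could join across catalogs.
print(connectors_for(["sales_pg.public.orders", "lake.analytics.events"], catalogs))
```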

Connector

A connector is an interface for accessing a specific type of data source. A connector presents data from the source as an abstract table that the workers can query. This abstraction allows you to work uniformly with all data sources regardless of their specific requirements.
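The idea of a uniform table abstraction over different sources can be illustrated with a small sketch. This is not Trino's connector interface; the `Connector` class, its `scan` method, and both implementations are invented for this example.

```python
import csv
import io
from abc import ABC, abstractmethod
from typing import Iterator


class Connector(ABC):
    """Hypothetical interface: every source exposes its tables as rows."""

    @abstractmethod
    def scan(self, table: str) -> Iterator[dict]:
        ...


class InMemoryConnector(Connector):
    def __init__(self, tables: dict[str, list[dict]]):
        self._tables = tables

    def scan(self, table: str) -> Iterator[dict]:
        yield from self._tables[table]


class CsvTextConnector(Connector):
    def __init__(self, files: dict[str, str]):
        self._files = files  # table name -> CSV text with a header row

    def scan(self, table: str) -> Iterator[dict]:
        yield from csv.DictReader(io.StringIO(self._files[table]))


def count_rows(connector: Connector, table: str) -> int:
    # The caller does not need to know which kind of source it reads.
    return sum(1 for _ in connector.scan(table))


memory_source = InMemoryConnector({"orders": [{"id": 1}, {"id": 2}]})
csv_source = CsvTextConnector({"orders": "id\n1\n2\n3\n"})
print(count_rows(memory_source, "orders"))  # 2
print(count_rows(csv_source, "orders"))     # 3
```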

In Managed Service for Trino, the following connectors are available:

  • ClickHouse
  • Delta Lake
  • Hive
  • Iceberg
  • PostgreSQL
  • TPC-DS
  • TPC-H

Running a query in a Trino cluster

Users interact with a Trino cluster through the coordinator, connecting to it with a client such as the Trino CLI. The client sends queries to Trino and displays their results. For details, see the documentation on connecting to a Trino cluster.

Running a query in a Trino cluster involves these stages:

  1. The coordinator receives a query from the client as an SQL statement.

  2. The coordinator plans the query execution stages and converts the query into a series of related tasks distributed among workers.

  3. The workers query the data sources, process the received information, exchange intermediate task results, and send the final results to the coordinator.

  4. The coordinator collects task results from workers, generates the final result, and sends it to the client, which outputs the query result to the user.
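The stages above can be sketched as a simple distribute-and-merge computation. This is a heavily simplified model, not Trino's planner: workers are represented by threads, the "query" is an average over a list of numbers, and the exchange of intermediate results between workers is not modeled. All function names are hypothetical.

```python
from concurrent.futures import ThreadPoolExecutor
from typing import Optional


def plan_tasks(rows: list[float], worker_count: int) -> list[list[float]]:
    # Stage 2: the coordinator splits the work into one task per worker.
    return [rows[i::worker_count] for i in range(worker_count)]


def run_task(task_rows: list[float]) -> tuple[float, int]:
    # Stage 3: a worker computes a partial result (sum and row count).
    return sum(task_rows), len(task_rows)


def run_query(rows: list[float], worker_count: int) -> Optional[float]:
    # Stage 1: the coordinator receives the query (here: "average of rows").
    tasks = plan_tasks(rows, worker_count)
    with ThreadPoolExecutor(max_workers=worker_count) as pool:
        partials = list(pool.map(run_task, tasks))
    # Stage 4: the coordinator merges partial results into the final one.
    total = sum(partial_sum for partial_sum, _ in partials)
    count = sum(partial_count for _, partial_count in partials)
    return total / count if count else None


print(run_query([10, 20, 30, 40, 50], worker_count=2))  # 30.0
```

Workers return a sum and a count rather than a local average so that the coordinator can combine uneven partitions correctly.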

Workers interact with each other and with the coordinator via the REST API. Additionally, workers can exchange intermediate results through Exchange Manager, which serves as temporary data storage. This way, if a worker fails, its active process can be completed on a different worker using the intermediate data from Exchange Manager.
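The recovery idea can be sketched as follows, assuming a task whose intermediate data is written to temporary storage before the final step runs. A plain dictionary stands in for Exchange Manager, and the function names and the failure flag are hypothetical; the real mechanism is considerably more involved.

```python
class WorkerFailed(Exception):
    """Raised when a simulated worker cannot finish its step."""


def produce(exchange: dict, task_id: str, rows: list[int]) -> None:
    # Intermediate results are stored outside any single worker.
    exchange[task_id] = [row * 2 for row in rows]


def consume(exchange: dict, task_id: str, healthy: bool) -> int:
    # The final step reads intermediate data from the shared storage.
    if not healthy:
        raise WorkerFailed(task_id)
    return sum(exchange[task_id])


def run_with_retry(task_id: str, rows: list[int], workers: list[tuple[str, bool]]):
    exchange: dict = {}
    produce(exchange, task_id, rows)
    for worker_name, healthy in workers:
        try:
            # A replacement worker reuses the stored data, so the
            # intermediate step is not recomputed after a failure.
            return worker_name, consume(exchange, task_id, healthy)
        except WorkerFailed:
            continue
    raise RuntimeError("no healthy worker left")


result = run_with_retry("task-1", [1, 2, 3], [("worker-1", False), ("worker-2", True)])
print(result)  # ('worker-2', 12)
```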
