Yandex project
© 2025 Yandex.Cloud LLC


Reading data from Object Storage using Query connections

Written by
Yandex Cloud
Updated at March 6, 2025
  • Setting up a connection
  • Data model
    • Data schema description
    • Automatic output of a data schema
    • Data path formats
  • Example of reading data using connections

When working with Yandex Object Storage, connections are a convenient way to prototype queries and to set up initial access to data.

Sample query for reading data:

SELECT
    *
FROM
    object_storage.`*.tsv`
WITH
(
    format=tsv_with_names,
    SCHEMA
    (
        `timestamp` Uint32,
        action String
    )
);

Setting up a connection

To create a connection to Object Storage:

  1. In the management console, select the folder where you want to create a connection.

  2. In the list of services, select Yandex Query.

  3. In the left-hand panel, go to the Connections tab.

  4. Click Create new.

  5. Specify the connection parameters:

    1. Under General parameters:

      • Name: Name of the connection to Object Storage.
      • Type: Object Storage.
    2. Under Connection type parameters:

      • Bucket auth: Select Public or Private depending on the read permissions set for the objects in the bucket.

        For a public bucket, enter a name in the Bucket field.
        For a private bucket, select:

        • Cloud and Folder where the data source is located.

        • Select a bucket or create a new one.

        • Select or create a service account with the storage.viewer role to be used to access the data.

          To use a service account, the iam.serviceAccounts.user role is required.

  6. Click Create.

Data model

Object Storage stores data as binary files. To read data, use the following SQL statement:

SELECT
    <expression>
FROM
    <connection>.<path>
WITH(
    FORMAT = "<data_format>",
    COMPRESSION = "<compression_format>",
    SCHEMA = (<schema_description>))
WHERE
    <filter>;

Where:

  • <connection>: Name of the storage connection.
  • <path>: Path to a file or files in the bucket. Wildcard characters (*) are supported.
  • <data_format>: Data format in the files.
  • <compression_format>: File compression format.
  • <schema_description>: Data schema description in the files.
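
As an illustration, the template above might be filled in as follows. This is a minimal sketch, not a reference query: the connection name, the path, and the field names are hypothetical, "gzip" is assumed to be an accepted compression value, and the SCHEMA clause follows the form used in this article's sample queries.

```sql
-- Hypothetical connection (connection) and path (folder/*.csv.gz).
-- "gzip" is an assumed COMPRESSION value; check the list of supported algorithms.
SELECT
    Manufacturer,
    Model,
    Price
FROM
    connection.`folder/*.csv.gz`
WITH(
    FORMAT = "csv_with_names",
    COMPRESSION = "gzip",
    SCHEMA
    (
        Year Int32,
        Manufacturer String,
        Model String,
        Price Double
    ))
WHERE
    Year >= 2020;
```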

Data schema description

A data schema description specifies the following for each field:

  • Field name
  • Field type
  • Attribute indicating a required field

For example, the following data schema describes a required field named Year of the Int32 type:

Year Int32 NOT NULL

If a field is marked as required (NOT NULL) but is missing from the file being processed, the query fails with an error. If a field is marked as optional (NULL) and is missing from the file being processed, no error occurs and the field takes the NULL value. The NULL keyword can be omitted for optional fields.
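
The sketch below shows how required and optional fields might be combined in one schema. The connection name, path, and field names are hypothetical and reuse the ones from this article's sample query. Price carries an explicit NULL and Model omits it, to show both spellings of an optional field.

```sql
-- Hypothetical connection and file; only the SCHEMA block is the point here.
SELECT
    *
FROM
    connection.`folder/filename.csv`
WITH(
    FORMAT = "csv_with_names",
    SCHEMA
    (
        Year Int32 NOT NULL,          -- required: the query fails if the field is missing
        Manufacturer String NOT NULL, -- required
        Model String,                 -- optional: the NULL keyword is omitted
        Price Double NULL             -- optional: takes NULL if the field is missing
    )
);
```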

Automatic output of a data schema

Yandex Query can output (infer) a data schema automatically for all data formats except raw and json_as_string. This is convenient when a schema contains a large number of fields: instead of entering them manually, use the WITH_INFER parameter:

SELECT
    <expression>
FROM
    <connection>.<path>
WITH(
    FORMAT = "<data_format>",
    COMPRESSION = "<compression_format>",
    WITH_INFER="true")
WHERE
    <filter>;

Where:

  • <connection>: Name of the storage connection.
  • <path>: Path to a file or files in the bucket. Wildcard characters (*) are supported.
  • <data_format>: Data format in the files.
  • <compression_format>: File compression format.

With this query, field names and types are determined automatically.
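
For instance, a query with schema inference might look like the sketch below. The connection name and path are hypothetical. COMPRESSION is left out on the assumption that the file is not compressed, as in the sample queries in this article.

```sql
-- Hypothetical connection and file; no SCHEMA block is given,
-- so field names and types are expected to come from inference.
SELECT
    *
FROM
    connection.`folder/filename.csv`
WITH(
    FORMAT = "csv_with_names",
    WITH_INFER = "true");
```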

Data path formats

Yandex Query supports the following paths to data:

| Path format | Description | Example |
| --- | --- | --- |
| Path that ends with / | Folder path | Path /a/ locates the entire contents of the folder, e.g., /a/b/c/d/1.txt and /a/b/2.csv |
| Path that contains the * macro substitution character | Any files nested in the path | Path /a/*.csv locates files in folders at any depth, e.g., /a/b/c/1.csv, /a/2.csv, and /a/b/c/d/e/f/g/2.csv |
| Path that neither ends with / nor contains macro substitution characters | Path to an individual file | Path /a/b.csv locates a specific file: /a/b.csv |
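
The three path formats could be used in queries roughly as sketched below. The connection name, the logs folder, the file names, and the schema are all hypothetical. The paths are written without a leading slash, as in this article's sample queries, which is an assumption about how table paths map to query paths.

```sql
-- Folder path: every file under the hypothetical folder logs/.
SELECT * FROM connection.`logs/`
WITH(FORMAT = "csv_with_names", SCHEMA (Year Int32, Model String));

-- Path with *: all .csv files nested under logs/.
SELECT * FROM connection.`logs/*.csv`
WITH(FORMAT = "csv_with_names", SCHEMA (Year Int32, Model String));

-- Path to an individual file.
SELECT * FROM connection.`logs/2024.csv`
WITH(FORMAT = "csv_with_names", SCHEMA (Year Int32, Model String));
```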

Example of reading data using connections

Sample query for reading data from Object Storage:

SELECT
    *
FROM
    connection.`folder/filename.csv`
WITH(
    format='csv_with_names',
    SCHEMA
    (
        Year int,
        Manufacturer String,
        Model String,
        Price Double
    )
);

Where:

  • connection: Name of the connection to Object Storage.
  • folder/filename.csv: Path to the file in the Object Storage bucket.
  • SCHEMA: Data schema description in the file.
