Yandex Cloud
Search
Contact UsGet started
  • Blog
  • Pricing
  • Documentation
  • All Services
  • System Status
    • Featured
    • Infrastructure & Network
    • Data Platform
    • Containers
    • Developer tools
    • Serverless
    • Security
    • Monitoring & Resources
    • ML & AI
    • Business tools
  • All Solutions
    • By industry
    • By use case
    • Economics and Pricing
    • Security
    • Technical Support
    • Customer Stories
    • Start testing with double trial credits
    • Cloud credits to scale your IT product
    • Gateway to Russia
    • Cloud for Startups
    • Education and Science
    • Yandex Cloud Partner program
  • Blog
  • Pricing
  • Documentation
© 2025 Direct Cursus Technology L.L.C.
Yandex Query
    • Overview
    • Terms and definitions
    • Quotas and limits
    • Query processing
      • Description
      • Federated queries
      • Data partitioning
      • Partition projection
    • Unified analysis of streaming and analytical data
    • Backups
  • Access management
  • Pricing policy
  • Integration
  • Audit Trails events
  • FAQ

In this article:

  • Syntax
  • Description of fields
  • Field of the integer type
  • Field of the enum type
  • Field of the date type
  • Path templates
  • Use cases
  1. Concepts
  2. Batch processing
  3. Partition projection

Partition projection

Written by
Yandex Cloud
Updated at March 7, 2025
  • Syntax
  • Description of fields
    • Field of the integer type
    • Field of the enum type
    • Field of the date type
  • Path templates
  • Use cases

Partitioning allows you to prompt Yandex Query which rules should be used for storing data in Yandex Object Storage.

Let's assume the data in Yandex Object Storage is stored using the following folder structure:

year=2021
    month=01
    month=02
    month=03
year=2022
    month=01

When running the query below, Yandex Query does the following:

  1. Gets a full list of subfolders within '/'.
  2. For each subfolder, tries processing the subfolder name in "year=<DIGITS>" format.
  3. For each "year=<DIGITS>" subfolder, gets a list of all subfolders in "month=<DIGITS>" format.
  4. Processes the data that was read.
SELECT
    *
FROM
    objectstorage.'/'
    WITH
    (
        Schema =
        (
            data String,
            year Int,
            month Int
        ),
        partitioned_by =
        (
            year,
            month
        )
    )
WHERE
    year=2021
    AND month=02

This means that when working with partitioned data, the contents of Yandex Object Storage get fully listed, which may be a lengthy process for large buckets.

To optimize operations with large amounts of data, use extended partitioning. This mode does not involve Yandex Object Storage folder scanning. Instead, all paths are determined in advance and data is only accessed using them.

To enable extended partitioning, set the rules for operations via a special parameter called "projection". This parameter describes the rules for storing data in Yandex Object Storage folders.

SyntaxSyntax

Extended partitioning is called "partition projection" and set using the projection parameter.

Example of setting partition projection:


$projection =
@@
{
    "projection.enabled" : true,

    "projection.year.type" : "integer",
    "projection.year.min" : 2010,
    "projection.year.max" : 2022,
    "projection.year.interval" : 1,

    "projection.month.type" : "integer",
    "projection.month.min" : 1,
    "projection.month.max" : 12,
    "projection.month.interval" : 1,
    "projection.month.digits" : 2,

    "storage.location.template" : "${year}/${month}"
}
@@;

SELECT 
    * 
FROM 
    <connection>.`/`
WITH 
(
    schema=
    (
        data String, 
        year Int, 
        month Int
    ), 
    partitioned_by=(year, month),
    projection=$projection
)

The example above indicates that there is data for each year and each month from 2010 to 2022 and that bucket data is stored in folders like 2022/12.

In general, partition projection setup is as follows:


$projection = 
@@ 
{
    "projection.enabled" : <partition_projection_availability>,

    "projection.<field_1_name>.type" : "<type>",
    "projection.<field_1_name>...." : "<properties>",

    "projection.<field_2_name>.type" : "<type>",
    "projection.<field_2_name>...." : "<properties>",

    "storage.location.template" : ".../${<field_2_name>}/${<field_1_name>}/..."
}
@@;

SELECT 
    * 
FROM 
    <connection>.<path> 
WITH 
(
    schema=(<fields>, <field_1>, <field_2>), 
    partitioned_by=(<field_1_name>, <field_2_name>),
    projection=$projection
)

Description of fieldsDescription of fields

Field name Field description Acceptable values
projection.enabled Shows if partition projection is enabled true, false
projection.<field_1_name>.type Field data type integer, enum, date
projection.<field_1_name>.XXX Type properties

Field of the integer typeField of the integer type

Used for columns whose values can be represented as integers in the range from -263 to 263-1.

Field name Required Field description Sample value
projection.<field_name>.type Yes Field data type integer
projection.<field_name>.min Yes Defines the minimum allowed value. Set as an integer -100
004
projection.<field_name>.max Yes Defines the maximum allowed value. Set as an integer -10
5000
projection.<field_name>.interval No, defaults to 1 Sets the step between the elements inside a value range. For example, step 3 will produce the following values for the range 2, 10: 2, 5, 8 2
11
projection.<field_name>.digits No, defaults to 0 Sets the number of digits in a number. If the amount of non-zero digits is less than the specified value, zeros are added in front until the specified amount of digits is reached. For example, if the set value is .digits=3 while 2 is transmitted, it is converted to 002. 2
4

Field of the enum typeField of the enum type

Used for columns whose values can be represented as an enumeration.

Field name Required Field description Sample value
projection.<field_name>.type Yes Field data type enum
projection.<field_name>.values Yes Specifies the allowed values separated by a comma. Spaces are not ignored. 1, 2
A,B,C

Field of the date typeField of the date type

Used for columns whose values can be represented as a date.

Field name Required Field description Sample value
projection.<field_name>.type Yes Field data type date
projection.<field_name>.min Yes Defines the minimum allowed date. Values in YYYY-MM-DD format or as an expression with the special NOW macro substitution are allowed. Using the NOW macro substitution, you can perform arithmetic operations:
NOW-3DAYS,
NOW+1MONTH,
NOW-6YEARS,
NOW+4HOURS,
NOW-5MINUTES,
NOW+6SECONDS.
2020-01-01
NOW-5DAYS
NOW+3HOURS
projection.<field_name>.max Yes Defines the maximum allowed date. Values in YYYY-MM-DD format or as an expression with the special NOW macro substitution are allowed. Using the NOW macro substitution, you can perform arithmetic operations:
NOW-3DAYS,
NOW+1MONTH,
NOW-6YEARS,
NOW+4HOURS,
NOW-5MINUTES,
NOW+6SECONDS.
2020-01-01
NOW-5DAYS
NOW+3HOURS
projection.<field_name>.format Yes Date formatting string based on strptime %Y-%m-%d
%D
projection.<field_name>.unit No Time interval units. Acceptable values: DAYS DAYS
projection.<field_name>.interval No, defaults to 1 Sets the step between the elements inside a value range for the unit set in projection.<field_name>.unit. For example, step 15 will produce the following values for the range 2021-02-02, 2021-03-05 with DAYS for unit: 2021-02-17, 2021-03-04 2
6

Path templatesPath templates

Yandex Object Storage bucket data can be stored in folders with any name. Using the storage.location.template setting, you can specify the rules for naming folders with data.

Field name Field description Sample value
storage.location.template Folder name path template. It is set as ${<field_name...>}/${<field_name...>} root/a/${year}/b/${month}/d
${year}/${month}

Use casesUse cases

  • Working with data in Yandex Object Storage

Was the article helpful?

Previous
Data partitioning
Next
Description
© 2025 Direct Cursus Technology L.L.C.