Yandex Data Processing release notes

Written by Yandex Cloud

Updated on February 13, 2026

Q4 2025

Added a new host class: AMD Zen 4.
Added a dedicated IP address for UI Proxy: 158.160.167.170/32.
You can now specify a separate service account to manage VMs of an autoscaling subcluster when creating a cluster, creating an autoscaling cluster, or updating a cluster.
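If you maintain an address allowlist, the dedicated UI Proxy address above can be checked with Python's standard `ipaddress` module. This is a minimal sketch: the helper name `is_ui_proxy_address` is hypothetical, and only the CIDR value comes from these notes.

```python
import ipaddress

# CIDR taken from the release note above; a /32 network contains exactly one address.
UI_PROXY_NETWORK = ipaddress.ip_network("158.160.167.170/32")


def is_ui_proxy_address(address: str) -> bool:
    """Return True if the given IPv4 address belongs to the UI Proxy network."""
    return ipaddress.ip_address(address) in UI_PROXY_NETWORK


if __name__ == "__main__":
    print(is_ui_proxy_address("158.160.167.170"))
    print(is_ui_proxy_address("158.160.167.171"))
```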

Q3 2025

In image 2.2.9 (beta), Apache Spark™ is updated to version 3.5.6.

Q2 2025

Added an OS Login option for cluster creation. Enabling it allows OS Login access to all hosts created in the cluster.

Added support for environment variables:
- HADOOP_HEAPSIZE_MIN and HADOOP_HEAPSIZE_MAX for the hadoop service:
  - hadoop.env:HADOOP_HEAPSIZE_MIN
  - hadoop.env:HADOOP_HEAPSIZE_MAX
- HADOOP_HEAPSIZE for hive (available only for 2.0 images): hive.env:HADOOP_HEAPSIZE.
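The property keys listed above could be assembled into a cluster property map roughly as follows. This is a sketch only: the helper `build_heap_properties` and the example values are hypothetical, and the way the resulting map is passed to the service (CLI, API, or Terraform) is not shown here.

```python
def build_heap_properties(hadoop_min=None, hadoop_max=None, hive_heap=None):
    """Return a dict of cluster properties for the heap-size variables.

    Only the property keys come from the release note; the values are
    whatever the caller supplies and are not validated here.
    """
    candidates = {
        "hadoop.env:HADOOP_HEAPSIZE_MIN": hadoop_min,
        "hadoop.env:HADOOP_HEAPSIZE_MAX": hadoop_max,
        # Per the note above, this key is available only for 2.0 images.
        "hive.env:HADOOP_HEAPSIZE": hive_heap,
    }
    # Keep only the variables the caller actually set, as strings.
    return {key: str(value) for key, value in candidates.items() if value is not None}


if __name__ == "__main__":
    # Placeholder values, not sizing recommendations.
    print(build_heap_properties(hadoop_min=1024, hadoop_max=4096))
```

Unset variables are omitted rather than written as empty strings, so the service defaults stay in effect for them.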

Q1 2025

In 2.2.X images, the Java version was updated to 11.

Q4 2024

Added environment selection (PRODUCTION / PRESTABLE) during cluster creation and modification.
In 2.2.X images, the Python version was updated to 3.11.

Q3 2024

Apache Hive™ Metastore cluster functionality has been integrated into Yandex MetaData Hub. For more information about Apache Hive™ Metastore clusters, see the Yandex MetaData Hub documentation.
In 2.1.X and 2.2.X images, Conda now uses Mamba as its default solver.

Q2 2024

The stable 2.1 image version line is now available. It lets you create clusters with newer runtime versions: Spark 3.3.2 and Hadoop 3.3.2.

Q2 2023

Added support for creating Apache Hive™ Metastore clusters. This feature is currently in Preview.

Q3 2022

Added support for new configuration settings in the DataprocCreateClusterOperator Airflow operator.
Added CPU-optimized host classes with 2 GB of RAM per vCPU. The new configurations are available only for Intel Ice Lake processors.
Published a guide for using initialization scripts to configure GeeseFS.

Q2 2022

Image version 2.1 is now available.
Added support for public internet access across all subcluster types.
Lightweight Spark support is now available starting with image version 2.0.39. You can now create a cluster without data storage subclusters because YARN and SPARK services are no longer dependent on HDFS.
Added support for initialization scripts in the CLI.

Q1 2022

You can now create clusters using non-replicated network drives up to 8 TB in size. Non-replicated drives have a simpler architecture than network SSD storage, resulting in significantly higher performance.
Added support for job cancellation.
Added the build number to the Yandex Data Processing image version.
Spark and PySpark jobs now accept packages, repositories, and exclude_packages parameters. You can use these parameters to download additional dependencies and packages from third-party repositories.
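As an illustration of the three parameters named above, the sketch below collects and sanity-checks dependency settings for a job. It assumes the values follow the conventions of Spark's own `--packages` (`groupId:artifactId:version`) and `--exclude-packages` (`groupId:artifactId`) options. The helper `build_dependency_args` is hypothetical, and so are the example values.

```python
import re

# groupId:artifactId:version, the coordinate form used by Spark's --packages option.
_PACKAGE = re.compile(r"^[^:\s]+:[^:\s]+:[^:\s]+$")
# groupId:artifactId, the form used by Spark's --exclude-packages option.
_EXCLUDE = re.compile(r"^[^:\s]+:[^:\s]+$")


def build_dependency_args(packages=(), repositories=(), exclude_packages=()):
    """Validate dependency settings and return them as a plain dict."""
    for coord in packages:
        if not _PACKAGE.match(coord):
            raise ValueError(f"expected groupId:artifactId:version, got {coord!r}")
    for coord in exclude_packages:
        if not _EXCLUDE.match(coord):
            raise ValueError(f"expected groupId:artifactId, got {coord!r}")
    return {
        "packages": list(packages),
        "repositories": list(repositories),
        "exclude_packages": list(exclude_packages),
    }


if __name__ == "__main__":
    # Example coordinates are illustrative only.
    print(build_dependency_args(
        packages=["org.apache.spark:spark-avro_2.12:3.3.2"],
        repositories=["https://repo1.maven.org/maven2"],
        exclude_packages=["org.slf4j:slf4j-api"],
    ))
```

Validating coordinates before submitting a job surfaces a malformed entry immediately instead of at dependency resolution time on the cluster.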
