Yandex project
© 2025 Yandex.Cloud LLC
Yandex Data Processing


Runtime environment

Written by
Yandex Cloud
Updated at January 29, 2025

When creating a Yandex Data Processing cluster, you can choose the image version that determines component versions.

Below is a list of current and deprecated Yandex Data Processing images. Each image version includes the Python environment managers conda and pip, along with a collection of pre-installed libraries.
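To confirm which library versions a given cluster actually provides, you can query the Python environment directly. The following is a minimal sketch that uses only the standard library; the function name `installed_versions` and the package list are illustrative and not part of Yandex Data Processing. It assumes Python 3.8 or later, where `importlib.metadata` is available.

```python
from importlib import metadata  # standard library since Python 3.8


def installed_versions(packages):
    """Return {package: version or None} for the current Python environment."""
    versions = {}
    for name in packages:
        try:
            versions[name] = metadata.version(name)
        except metadata.PackageNotFoundError:
            versions[name] = None  # the package is not installed
    return versions


if __name__ == "__main__":
    # Illustrative selection of libraries listed in the image tables.
    report = installed_versions(["pandas", "numpy", "pyarrow", "boto3"])
    for name, version in report.items():
        print(f"{name}: {version or 'not installed'}")
```

Running the script on a cluster host lets you compare the reported versions with the tables in this article.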

Yandex Data Processing has no native mechanism for upgrading the image version of an existing cluster. To upgrade your image version, create a new cluster. To keep the version you use up to date, automate the creation and removal of temporary Yandex Data Processing clusters using Yandex Managed Service for Apache Airflow™. To run jobs automatically, you can use Yandex DataSphere in addition to Managed Service for Apache Airflow™.

Environment

When creating a cluster, you can choose one of the following environments:

  • PRODUCTION: For stable versions of your apps.
  • PRESTABLE: For testing purposes. The prestable environment is similar to the production environment and is also covered by the SLA, but it is the first to receive new features, improvements, and bug fixes. Use it to test whether new versions are compatible with your application.

When you create a cluster, the environment determines which image build is selected within the chosen minor version. New image builds become available for use:

  • For PRODUCTION: At least one week after the release.
  • For PRESTABLE: Directly after the release.
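The release timing above can be sketched as a small date calculation. This is an illustration only: the names `MIN_DELAY` and `earliest_use` are hypothetical, and because the delay for PRODUCTION is stated as "at least one week", the result is the earliest possible date, not a guaranteed one.

```python
from datetime import date, timedelta

# Minimum delay between a build's release and its use for new clusters,
# per environment, as stated in the list above.
MIN_DELAY = {
    "PRODUCTION": timedelta(weeks=1),
    "PRESTABLE": timedelta(0),
}


def earliest_use(release_date, environment):
    """Return the earliest date a build can be picked up in the environment."""
    return release_date + MIN_DELAY[environment]


if __name__ == "__main__":
    released = date(2025, 1, 1)  # hypothetical release date
    for env in MIN_DELAY:
        print(env, earliest_use(released, env))
```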

Once stabilized, each minor version maintains backward compatibility. However, for processes that regularly create clusters, we recommend keeping a test configuration in the PRESTABLE environment. This will allow you to detect potential backward compatibility issues earlier.

Once a cluster is created, the environment does not affect its operation. You cannot change an existing cluster's environment.

Current images

Note

Access to image 2.2 is provided on request. Contact support or your account manager.

Components   | Image 2.0 | Image 2.1 [1] | Image 2.2 (beta)

Component versions
Hadoop       | 3.2.2     | 3.3.2         | 3.3.2
Tez          | 0.10.0    | 0.10.1        | —
Hive         | 3.1.2     | —             | —
Zookeeper    | 3.4.14    | —             | —
HBase        | 2.2.7     | —             | —
Oozie        | 5.2.1     | —             | —
Spark        | 3.0.3     | 3.3.2 [2]     | 3.5.0
Zeppelin     | 0.9.0     | 0.10.0        | —
Livy         | 0.8.0     | 0.8.0         | 0.8.0

Versions of Python and machine learning libraries
Python       | 3.8.10    | 3.8.13        | 3.11.10
PyArrow      | 1.0.1     | 4.0.0         | 14.0.2
ipykernel    | 5.3.4     | 5.3.4         | 6.29.5
PyHive       | 0.6.1     | 0.6.1         | 0.7.0
scikit-learn | 0.23.2    | 0.24.1        | 1.5.1
pandas       | 1.1.3     | 1.2.4         | 2.2.2
koalas       | 1.7.0     | 1.8.2         | —
numpy        | 1.19.2    | 1.20.1        | 1.26.4
boto3        | 1.16.7    | 1.16.7        | 1.34.154
IPython      | 7.19.0    | 7.22.0        | 8.27.0
Matplotlib   | 3.2.2     | 3.4.2         | 3.9.2

[1] Stable since 2.1.15.

[2] Spark 3.3.2 is supported in Yandex Data Processing images starting from version 2.1.4. Image versions 2.1.1 to 2.1.3 contain Spark 3.2.1.
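When choosing an image, it can help to check which current images list every component you need. The sketch below copies the component versions from the table above into a dictionary and adds a lookup for the Spark note in the second footnote. The names `IMAGES`, `images_with`, and `spark_in_image_2_1` are hypothetical helpers, not part of any Yandex Cloud API, and the data must be updated by hand if the table changes.

```python
# Component versions copied from the current images table.
# A component missing from a dictionary is not listed for that image.
IMAGES = {
    "2.0": {"Hadoop": "3.2.2", "Tez": "0.10.0", "Hive": "3.1.2",
            "Zookeeper": "3.4.14", "HBase": "2.2.7", "Oozie": "5.2.1",
            "Spark": "3.0.3", "Zeppelin": "0.9.0", "Livy": "0.8.0"},
    "2.1": {"Hadoop": "3.3.2", "Tez": "0.10.1", "Spark": "3.3.2",
            "Zeppelin": "0.10.0", "Livy": "0.8.0"},
    "2.2": {"Hadoop": "3.3.2", "Spark": "3.5.0", "Livy": "0.8.0"},
}


def images_with(components):
    """Return the image versions that list every requested component."""
    return [version for version, listed in IMAGES.items()
            if all(name in listed for name in components)]


def spark_in_image_2_1(build):
    """Return the Spark version for an image 2.1.x build, per the footnote."""
    major, minor, patch = (int(part) for part in build.split("."))
    if (major, minor) != (2, 1):
        raise ValueError("expected a 2.1.x image version")
    if patch >= 4:
        return "3.3.2"
    if 1 <= patch <= 3:
        return "3.2.1"
    # The footnote does not cover other builds, so do not guess.
    raise ValueError(f"Spark version for {build} is not documented")


if __name__ == "__main__":
    print(images_with(["Hive", "Spark"]))
    print(spark_in_image_2_1("2.1.15"))
```

For example, `images_with(["Hive", "Spark"])` returns only image 2.0, because Hive is not listed for images 2.1 and 2.2.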

Deprecated images

Note

These images are deprecated. We recommend using the latest image versions. Existing clusters will continue running, but you will not be able to create new clusters with deprecated versions.

Components   | Image 1.4

Component versions
Hadoop       | 2.10.0
Tez          | 0.9.2
Hive         | 2.3.6
Zookeeper    | 3.4.14
HBase        | 1.3.5
Sqoop        | 1.4.7
Oozie        | 5.2.0
Spark        | 2.4.6
Flume        | 1.9.0
Zeppelin     | 0.8.2
Livy         | 0.7.0

Versions of Python and machine learning libraries
Python       | 3.7.9
PyArrow      | 0.13.0
ipykernel    | 5.1.3
TensorFlow   | 1.15.0
CatBoost     | 0.20.2
PyHive       | 0.6.1
LightGBM     | 2.3.0
XGBoost      | 0.90
scikit-learn | 0.21.3
pandas       | 0.25.3
IPython      | 7.9.0
Matplotlib   | 3.1.1
