Tutorials for Yandex Data Processing
Updated on October 14, 2024
Network settings and cluster maintenance
- Configuring networks for Yandex Data Processing
- Migrating an HDFS Yandex Data Processing cluster to a different availability zone
- Reconfiguring a network connection when recreating a Yandex Data Processing cluster
Working with jobs
Basic examples of working with jobs
- Working with Hive jobs
- Working with MapReduce jobs
- Working with PySpark jobs
- Working with Spark jobs
Advanced examples of working with jobs
- Running Apache Hive jobs
- Launching and managing applications for Spark and PySpark
- Running jobs from remote hosts that are not part of the Yandex Data Processing cluster
Integrating Yandex Data Processing with other services
- Using Yandex Object Storage in Yandex Data Processing
- Importing data from Yandex Object Storage, processing and exporting it to Yandex Managed Service for ClickHouse®
- Mounting Yandex Object Storage buckets to the file system of Yandex Data Processing hosts
- Shared use of tables through Metastore
- Transferring data between Yandex Data Processing clusters using Metastore
- Exchanging data with Yandex Managed Service for ClickHouse®
- Importing data from Yandex Managed Service for MySQL® clusters using Sqoop
- Importing data from Yandex Managed Service for PostgreSQL clusters using Sqoop
- Integrating with Yandex DataSphere
- Working with Apache Kafka® topics using PySpark jobs in Yandex Data Processing
- Automating operations using Yandex Managed Service for Apache Airflow™