Tutorials for Yandex Data Proc
Updated on September 25, 2024
Network settings and cluster maintenance
- Configuring networks for Yandex Data Proc
- Migrating an HDFS Yandex Data Proc cluster to a different availability zone
- Reconfiguring a network connection when recreating a Yandex Data Proc cluster
Working with jobs
Basic examples of working with jobs
- Working with Hive jobs
- Working with MapReduce jobs
- Working with PySpark jobs
- Working with Spark jobs
Advanced examples of working with jobs
- Running Apache Hive jobs
- Launching and managing applications for Spark and PySpark
- Running jobs from remote hosts that are not part of the Yandex Data Proc cluster
Integrating Yandex Data Proc with other services
- Using Yandex Object Storage in Yandex Data Proc
- Importing data from Yandex Object Storage, processing and exporting it to Yandex Managed Service for ClickHouse®
- Mounting Yandex Object Storage buckets to the file system of Yandex Data Proc hosts
- Exchanging data with Yandex Managed Service for ClickHouse®
- Importing data from Yandex Managed Service for MySQL clusters using Sqoop
- Importing data from Yandex Managed Service for PostgreSQL clusters using Sqoop
- Integrating with Yandex DataSphere
- Working with Apache Kafka® topics using PySpark jobs in Yandex Data Proc
- Automating operations using Yandex Managed Service for Apache Airflow™