Transferring data to an Apache Iceberg™ target endpoint

Updated at June 8, 2026

Scenarios for transferring data to Apache Iceberg™
Configuring the data source
Configuring the Apache Iceberg™ target endpoint

Yandex Data Transfer enables you to migrate data to Apache Iceberg™ tables in an Apache Hive™ Metastore cluster and implement various data transfer, processing, and transformation scenarios. To implement a transfer:

Explore possible data transfer scenarios.
Configure one of the supported data sources.
Configure the target endpoint in Yandex Data Transfer.
Create a transfer and start it.
Perform the required operations with the tables and see how the transfer is going.

Scenarios for transferring data to Apache Iceberg™

For a detailed description of possible Yandex Data Transfer scenarios, see Tutorials.

Configuring the data source

Configure one of the supported data sources.

For a complete list of supported sources and targets in Yandex Data Transfer, see Available transfers.

Configuring the Apache Iceberg™ target endpoint

When creating or updating an endpoint, you can define:

Settings for connecting to an Apache Hive™ Metastore cluster.
Configuration settings for a Yandex Object Storage bucket or custom S3-compatible storage.
Additional parameters.

Apache Hive™ Metastore cluster

Warning

To create or edit an endpoint of a managed database, you need the managed-metastore.viewer role or the primitive viewer role for the folder that hosts the cluster of this managed database.

Connection with the cluster specified in Yandex Cloud.

Management console

Apache Hive™ Metastore cluster: ID of the cluster whose folder is used for Apache Iceberg™ tables.
Security groups: Select the cloud network to host the endpoint and security groups for network traffic. This will allow you to apply the specified security group rules to the VMs and clusters in the selected network without changing their settings. For more information, see Networking in Yandex Data Transfer.

Make sure the selected security groups are configured.

Bucket configurations

Yandex Object Storage bucket

Bucket: Name of the bucket to upload source data to.
Service account: Select or create a service account with the storage.uploader role that Data Transfer will use to connect to the bucket.

Custom S3-compatible storage

(Optional) Endpoint: Endpoint for an Amazon S3-compatible service. Leave this field empty to use Amazon S3.
Region: Region to send requests to.
Bucket: Bucket name.
Access Key ID and Secret Access Key: ID and contents of the AWS key used to access a private bucket.

(Optional) Path prefix: Path prefix for writing objects to the bucket.
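The two bucket options take different sets of fields, and a short model can make it easier to see which ones are required. The following is a minimal Python sketch, not part of the Data Transfer API: the class names, field names, and the `missing_fields` and `object_key` helpers are all hypothetical, and the way the path prefix is joined with an object path is an assumption made for illustration only.

```python
from dataclasses import dataclass
from typing import Optional, Union


@dataclass(frozen=True)
class ObjectStorageBucket:
    """Yandex Object Storage bucket option (field names are hypothetical)."""
    bucket: str
    service_account_id: str  # account that holds the storage.uploader role


@dataclass(frozen=True)
class CustomS3Storage:
    """Custom S3-compatible storage option (field names are hypothetical)."""
    region: str
    bucket: str
    access_key_id: str
    secret_access_key: str
    endpoint: Optional[str] = None  # left empty: the Amazon endpoint is used


BucketConfig = Union[ObjectStorageBucket, CustomS3Storage]


def missing_fields(config: BucketConfig) -> list[str]:
    """Return the names of required fields that were left empty."""
    optional = {"endpoint"}
    return [
        name
        for name, value in vars(config).items()
        if name not in optional and not value
    ]


def object_key(path_prefix: str, relative_path: str) -> str:
    """Join the optional path prefix with an object path (assumed behavior)."""
    prefix = path_prefix.strip("/")
    relative = relative_path.lstrip("/")
    return f"{prefix}/{relative}" if prefix else relative
```

For example, `missing_fields(CustomS3Storage(region="", bucket="b", access_key_id="k", secret_access_key=""))` reports the empty region and secret key, while an empty path prefix leaves object paths unchanged.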

Additional settings

Management console

Cleanup policy: Select a way to clean up data in the target database before the transfer:
- DISABLED: Use the existing tables to write new data.
- DROP: Remove all tables involved in the transfer.
  
  Use this option to always transfer the latest version of the table schema to the target database from the source whenever the transfer is activated.
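The difference between the two cleanup policies can be illustrated with a small simulation. This is a sketch under stated assumptions rather than the service's implementation: the `CleanupPolicy` enum and the `tables_after_cleanup` function are hypothetical names, and the code only models which target tables remain after the cleanup step, before any new data is written.

```python
from enum import Enum


class CleanupPolicy(Enum):
    DISABLED = "DISABLED"  # keep existing tables and write new data to them
    DROP = "DROP"          # remove every table involved in the transfer


def tables_after_cleanup(existing, transferred, policy):
    """Return the target tables that remain once the cleanup step has run."""
    if policy is CleanupPolicy.DROP:
        # Only tables that take part in the transfer are removed.
        return set(existing) - set(transferred)
    return set(existing)
```

With `DROP`, the tables involved in the transfer are removed and later recreated from the source, which is why the target picks up the latest schema; tables outside the transfer are left alone in this model.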

After configuring the data source and target, create and start the transfer.
