Transferring data from an OpenSearch source endpoint

Updated at July 15, 2025

Scenarios for transferring data from OpenSearch
Preparing the source database
Configuring the OpenSearch source endpoint
Configuring the data target
Troubleshooting data transfer issues
- Transfer failure
- Document duplication on the target

Yandex Data Transfer enables you to migrate search and analytics data from an OpenSearch database and implement various data transfer, processing, and transformation scenarios. To implement a transfer:

Explore possible data transfer scenarios.
Prepare the OpenSearch database for the transfer.
Set up a source endpoint in Yandex Data Transfer.
Set up one of the supported data targets.
Create a transfer and start it.
Perform required operations with the database and control the transfer.
In case of any issues, use ready-made solutions to resolve them.

Scenarios for transferring data from OpenSearch

Migration: Moving data from one storage system to another. This often means moving a database from an obsolete on-premises installation to a managed cloud one.

For a detailed description of possible Yandex Data Transfer scenarios, see Tutorials.

Preparing the source database

OpenSearch

If you do not plan to use Cloud Interconnect or a VPN to connect to an external cluster, make that cluster accessible from the internet from the IP addresses used by Data Transfer.

For details on connecting your network to external resources, see the corresponding concept article.

Configuring the OpenSearch source endpoint

When creating or updating an endpoint, you can define:

Connection settings for a Yandex Managed Service for OpenSearch cluster or for a custom installation, including one hosted on Yandex Compute Cloud VMs. These parameters are required.
Additional parameters.

Managed Service for OpenSearch cluster

Warning

To create or edit an endpoint of a managed database, you need to have the managed-opensearch.viewer role or the viewer primitive role assigned for the folder where this managed database cluster resides.

Connecting to a cluster by the cluster ID specified in Yandex Cloud.

Management console

Connection type: Select a cluster connection option:
- Self-managed: Allows you to specify connection settings manually.
  
  Select Managed Service for OpenSearch cluster as the installation type and configure these settings:
  - Managed Service for OpenSearch cluster: Select the cluster to connect to.
  - User: Specify the username Data Transfer will use to connect to the cluster.
  - Password: Enter the user's password for the cluster.
- Connection Manager: Allows connecting to the cluster via Yandex Connection Manager:
  - Select the folder with the Managed Service for OpenSearch cluster.
  - Select Managed DB cluster as the installation type and configure these settings:
    - Cluster for Managed DB: Select the cluster to connect to.
    - Connection: Select or create a connection in Connection Manager.
  Warning
  
  To use a connection from Connection Manager, the user must have the connection-manager.user role or higher for this connection.
Security groups: Select the cloud network to host the endpoint and security groups for network traffic.

This lets you apply the specified security group rules to the VMs and clusters in the selected network without changing their settings. For more information, see Networking in Yandex Data Transfer.

Custom installation

Connecting to nodes with explicitly specified network addresses and ports.

Management console

Connection type: Select a database connection option:
- Self-managed: Allows you to specify connection settings manually.
  
  Select Custom installation as the installation type and configure these settings:
  - Data nodes: Click to add a new data node. For each node, specify:
    - Host: IP address or FQDN of the host with the DATA role you need to connect to.
    - Port: Port number Data Transfer will use to connect to the host with the DATA role.
  - SSL: Select this option if a secure SSL connection is used.
  - CA certificate: Upload the certificate file or add its contents as text if you need to encrypt the data to transfer, e.g., for compliance with the PCI DSS requirements.
    
    Warning
    
    If no certificate is added, the transfer may fail with an error.
  - Subnet ID: Select or create a subnet in the required availability zone. The transfer will use this subnet to access the database.
    
    If this field has a value specified for both endpoints, both subnets must be hosted in the same availability zone.
    
    If you do not specify a subnet, you may get an error when activating the transfer.
  - User: Specify the username Data Transfer will use to connect to the database.
  - Password: Enter the user password for access to the database.
- Connection Manager: Allows connecting to the database using Yandex Connection Manager:
  - Select the folder the Connection Manager connection was created in.
  - Select Custom installation as the installation type and configure these settings:
    - Connection: Select or create a connection in Connection Manager.
    - Subnet ID: Select or create a subnet in the required availability zone. The transfer will use this subnet to access the database.
      
      If this field has a value specified for both endpoints, both subnets must be hosted in the same availability zone.
      
      If you do not specify a subnet, you may get an error when activating the transfer.
  Warning
  
  To use a connection from Connection Manager, the user must have the connection-manager.user role or higher for this connection.
Security groups: Select the cloud network to host the endpoint and security groups for network traffic.

This lets you apply the specified security group rules to the VMs and databases in the selected network without reconfiguring them. For more information, see Networking in Yandex Data Transfer.

Additional settings

Management console

Dump an index with type mapping: Select this option to copy the data type mapping from the source to the target before the transfer starts. If the option is disabled and no index schema is set on the target, the data types on the target are detected automatically during the transfer.

Warning

If a source index includes data types that are not supported on the target, enabling this option may cause a transfer run error. In this case, disable the option and create an index schema on the target manually.

Configuring the data target

Configure a target endpoint for one of the supported data targets.

For a complete list of supported sources and targets in Yandex Data Transfer, see Available transfers.

After configuring the data source and target, create and start the transfer.

Troubleshooting data transfer issues

For more troubleshooting tips, see Troubleshooting.

Transfer failure

Error messages:

object field starting or ending with a [.] makes object resolution ambiguous <field_description>

Index -1 out of bounds for length 0

The transfer is aborted because the keys in the documents being transferred are not valid for the OpenSearch target. Invalid keys are empty keys and keys that:

Consist of spaces.
Consist of periods.
Have a period at the beginning or end.
Have two or more periods in a row.
Include periods separated by spaces.
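
To find offending keys before starting a transfer, the rules above can be approximated with a local check. This is a minimal sketch based on one reading of the list, not the validation the service or OpenSearch actually performs; the `isInvalidKey` name is hypothetical.

```go
package main

import (
	"fmt"
	"strings"
)

// isInvalidKey reports whether a document key matches any of the rules
// listed above. It is an illustrative approximation, not the service's code.
func isInvalidKey(key string) bool {
	// Empty keys and keys that consist only of spaces.
	if strings.Trim(key, " ") == "" {
		return true
	}
	// A period at the beginning or end (this also covers keys made only of periods).
	if strings.HasPrefix(key, ".") || strings.HasSuffix(key, ".") {
		return true
	}
	// An empty segment means two periods in a row; a segment made only of
	// spaces means periods separated by spaces.
	for _, segment := range strings.Split(key, ".") {
		if strings.Trim(segment, " ") == "" {
			return true
		}
	}
	return false
}

func main() {
	fmt.Println(isInvalidKey("user.name"))  // prints false
	fmt.Println(isInvalidKey("user..name")) // prints true
	fmt.Println(isInvalidKey("user. .name")) // prints true
}
```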

Solution:

In the target endpoint additional settings, enable Sanitize documents keys and reactivate the transfer.

Document duplication on the target

When repeatedly transferring data, documents get duplicated on the target.

All documents transferred from the same source table go into a single index named <schemaName.tableName> on the target. By default, the target generates document IDs (_id) automatically. As a result, identical documents are assigned different IDs and get duplicated.

There is no duplication if the primary keys are specified in the source table or endpoint conversion rules. Document IDs are then generated at the transfer stage using the primary key values.

Generation is performed as follows:

If the key value contains a period (.), it is escaped with \: some.key --> some\.key.
All the primary key values are joined into a single period-separated string: <some_key1>.<some_key2>.<...>.
The resulting string is escaped with the url.QueryEscape function.
If the length of the resulting string does not exceed 512 characters, it is used as the _id. If it is longer than 512 characters, it is hashed with SHA-1 and the resulting hash is used as the _id.

As a result, documents with the same primary keys will receive the same ID when the data is transferred again, and the document transferred last will overwrite the existing one.

Solution:

Set the primary key for one or more columns in the source table or in the endpoint conversion rules.
Run the transfer.
