Troubleshooting in Data Transfer
This section describes typical problems that may arise during transfer activation or operation, and the relevant solutions.
- Problems that arise when working with Data Transfer
- General
- Data transformation
- API errors
- Network
- ClickHouse®
- Elasticsearch
- MongoDB
- MySQL®
- Object Storage
- OpenSearch
- PostgreSQL
- Yandex Managed Service for YDB
- Yandex Data Streams
- Who to report your problem to
Problems that arise when working with Data Transfer
To detect a problem in time:
- Monitor the transfer state on the Monitoring tab of the transfer management page or in Yandex Monitoring.
- Configure alerts in Yandex Monitoring to receive notifications about transfer failures.
- Request records about your resources from Yandex Cloud service logs.
- Use the Yandex Cloud mobile app to track the state of transfers.
If a Data Transfer operation was disrupted during the transfer, try to localize and analyze the problem. You may find solutions in this article or in other sections of our documentation.
Issue source | Issue | Solution |
---|---|---|
Endpoint | Lack of network accessibility or endpoint access permissions | Check source reading using the following charts: Maximum data transfer delay, Number of source events, and Reads. Check writing to the target using the following charts: Maximum data transfer delay, Number of source events, Number of target events, and Reads. If you can read and write data, check the restrictions on working with the DBMS. Check the requirements for preparing and configuring the endpoint. Look for a ready-made solution to the problem. |
Endpoint or transfer | Lack of physical resources for the transfer or endpoints | If the data can be read and written, check whether there are enough physical resources on these charts: CPU and RAM. Read the guidelines for DBMS diagnostics, for example, for MySQL®, MongoDB, or PostgreSQL. |
Data | Outdated data due to changes in the data schema | See the different data transfer scenarios in the Data Transfer tutorials section. |
Data | Outdated data due to large data volume | Increase the number of workers for parallel copying or replication. Split the tables into several transfers. |
After solving the problem, depending on the status of the transfer, activate it or change the data transfer limits of the running transfer.
General
Long transfer activation
Solution: If a transfer being activated is in the Creating status for a long time, it's not an error. It takes time to create Yandex Compute Cloud resources that are allocated for each transfer individually. For some sources, a database schema is extracted, which also takes time.
Transaction race when running incremental copy
During incremental copy, there may be transaction races, i.e., situations when transactions are completed in a sequence different from the one they were run in. In that case, the transfer may ignore earlier transactions. For more information, see Transaction completion timeout.
Solution: Increase the transaction completion timeout in the transfer settings. The recommended and default value is 15 seconds.
Duplicate rows in a target database
Possible causes:
- The target database contains pre-transfer data. Solution: Delete the data from the target database before activating the transfer, or set the Drop cleanup policy in the target endpoint.
- The target database tables have no primary keys. Solution: Make sure the tables in the target database have primary keys.
Insufficient resources
Error message:
Warn(Activate): Snapshot loading failed:
snapshot tasks failed: main uploader failed:
errors detected on secondary workers: secondary worker #3 failed:
pod instance restarted 5h41m39.092446028s ago
Solution:
If the pod instance restarted error was caused by the transfer VM running out of memory (OOM), consider the following options:
- Decrease the number of threads per worker in the transfer settings. At the same time, you can increase the number of workers to maintain the overall level of sharding (parallel loading) at the copy stage. Since threads share the worker's resources, decreasing the number of threads per worker increases the amount of resources available to each thread and makes it less likely to run out of memory.
- For transfers at the GA stage, you can increase computing resources in the Runtime environment transfer setting. Such transfers are chargeable, so increased computing resources raise your data transfer costs.
- For transfers at the Preview stage, you cannot scale computing resources on your own: contact support or your account manager.
If the pod instance restarted error is not caused by OOM, contact support.
Decrease in transfer speed
Issue:
If the target endpoint cleanup policy is set to Do not cleanup and the transfer has already been activated, i.e., the tables being transferred to the target are not empty, the transfer speed drops significantly due to repeated insert attempts and the errors they cause.
Solution:
Use the Drop or Truncate cleanup policy.
Error creating or editing a managed database endpoint
Error message:
Can't authorize usage of managed <DB_type>, you need permission <get-MDB_permission> to <cluster_folder_ID> folder
To create or edit an endpoint of a managed database, you need the service or primitive viewer role assigned for the folder hosting a cluster of this managed database.
Solution:
Get a service or primitive viewer role to work with the cluster.
Data transformation
String filter for APPEND-ONLY sources is not working
The String filter for APPEND-ONLY sources transformer does not work when performing a transfer.
Possible causes include:
- The column value type specified in the filter does not match the column type in the table. In this case, the transformer is not applied.
  Solution: Specify column values that match the column type in the filtered table.
- The filter includes a string column whose type in the filtered table must be set to UTF8 for sources where a parser explicitly specifies column types (e.g., for YDS). The transformer does not support the STRING column type.
  Solution: Specify the UTF8 type for the string column in the filtered table.
API errors
Error example:
{"code": 13, "message": "internal"}
Solution: Contact support and provide the request_id. If you are using curl for API calls, add the -v flag to facilitate error diagnostics.
Network
No common availability zone
Error message:
Warn(Activate): YC: unable to resolve instance group:
unable to resolve net topology: neither source nor target subnet found:
No common availability zone found for the source and target endpoints:
source zones: [<source_zone_name>], target zones: [<target_zone_name>]
The error occurs if the source and target hosts are within Yandex Cloud but have no common availability zones.
Solution:
- Add a host to a cluster so that the hosts have a common availability zone.
- Set up routing through subnets in one availability zone:
  - Check that the network of the endpoint from availability zone 2 has a subnet in availability zone 1. If not, create one.
  - Change the type of the endpoint from availability zone 2 to On-premise.
  - Specify the subnet from availability zone 1 for this endpoint.
  - As a host, specify the internal IP address (without a port number) of the endpoint hosted in the subnet of availability zone 2.
Overlapping IP address ranges
Error message:
YC: unable to resolve instance group:
unable to resolve net topology: subnet address space collision detected:
subnet <ID_of_subnet_1> [<IP_range_of_subnet_1>]
collides with subnet <ID_of_subnet_2> [<IP_range_of_subnet_2>]
The error occurs if the source and target hosts are in different subnets within Yandex Cloud but have overlapping IP address ranges.
Solution: Create a new target cluster and make sure the subnets of the target and source cluster hosts involved in a transfer are not overlapping by IP address range.
No server connection
There is no connection because the specified subnet has no preconfigured egress NAT gateway.
Error message:
Can't connect to server: Can't ping server:
dial tcp <address_of_an_endpoint's_host>:<port>: connect: connection timed out
A transfer fails if it has one on_premise endpoint and the other endpoint uses a subnet without an egress NAT gateway.
Solution: Disable the endpoint setting that points to this subnet and reactivate the transfer.
Blocking a transfer's IP address
Solution: Allow connections to the transfer through the addresses and ranges used by Data Transfer.
Cluster access error
Text of the error that occurs when creating a transfer:
Cannot retrieve table information from the source database: failed to resolve storage: failed to create a PostgreSQL storage: unable to get master host: unable to create postgres service client: All hosts are unavailable:
Solution: Check whether the cluster is accessible in your subnet.
Most often, the issue is caused by missing required security group rules.
Lack of permissions for subnets or security groups when creating an endpoint
Error message:
Create endpoint failed: rpc error: code = PermissionDenied desc = Failed permission check: No permission to use VPC Security Groups: Permission denied
or
Failed permission check: No permission to use VPC Subnets: Permission denied
Solution: Assign the vpc.user role to the user for the folder the subnet is in.
ClickHouse®
New tables cannot be added
No new tables are added to Snapshot and increment transfers.
Solution:
- Create tables in the target database manually (see the example table definition after this list). For the transfer to work, do the following when creating a table:
  - Add the transfer service fields to it:
    __data_transfer_commit_time timestamp, __data_transfer_delete_time timestamp
  - Use the ReplacingMergeTree engine:
    ENGINE = ReplacingMergeTree
- Create a separate transfer of the Snapshot and increment type and add only the new tables to its list of objects to transfer. You do not need to deactivate the original Snapshot and increment transfer. Activate the new transfer and, once it switches to the Replicating status, deactivate it.
  To add other tables later, put them into the list of objects to transfer of this separate transfer (replacing the previous objects in that list), reactivate it, and, once it switches to the Replicating status, deactivate it.
  Note
  Since two transfers were migrating data simultaneously, the new tables on the target will contain duplicate records. Run the SELECT * FROM <table_name> FINAL query to hide duplicate records or OPTIMIZE TABLE <table_name> to delete them.
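Below is a minimal sketch of such a manually created target table. The database, table, business columns, and ORDER BY key are hypothetical placeholders; only the service fields and the ReplacingMergeTree engine follow the requirements above.

CREATE TABLE db1.new_table
(
    id UInt64,                               -- hypothetical business columns
    value String,
    __data_transfer_commit_time timestamp,   -- service fields required by the transfer
    __data_transfer_delete_time timestamp
)
ENGINE = ReplacingMergeTree
ORDER BY id;

ReplacingMergeTree collapses rows with the same sorting key during merges, which is what the SELECT ... FINAL and OPTIMIZE TABLE queries above rely on.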
Data is not transferred
An attempt to transfer data from a ClickHouse® source fails with this error:
Syntax error: failed at position 25 ('-'): <error_details>. Expected one of: token, Dot, UUID, alias, AS, identifier, FINAL, SAMPLE, INTO OUTFILE, FORMAT, SETTINGS, end of query
Solution:
Yandex Data Transfer cannot transfer a database whose name contains a hyphen. If possible, rename the database.
Unsupported date range
If the migrated data contains dates outside the supported ranges, ClickHouse® returns the following error:
TYPE_ERROR [target]: failed to run (abstract1 source): failed to push items from 0 to 1 in batch:
Push failed: failed to push 1 rows to ClickHouse shard 0:
ClickHouse Push failed: Unable to exec changeItem: clickhouse:
dateTime <field_name> must be between 1900-01-01 00:00:00 and 2262-04-11 23:47:16
Supported date ranges in ClickHouse®:
- For DateTime64 fields: 1900-01-01 to 2299-12-31. For more information, see the ClickHouse® documentation.
- For DateTime fields: 1970-01-01 to 2106-02-07. For more information, see the ClickHouse® documentation.
Solution: use one of the following options:
- Convert all dates in the source DB to a range supported by ClickHouse®.
- In the source endpoint parameters, exclude the table with incorrect dates from the transfer.
- In the transfer parameters, specify the Convert values to string transformer. This will change the field type during the transfer.
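To apply the first option, you first need to locate the rows with out-of-range values in the source. A diagnostic sketch for a SQL source, assuming a hypothetical events table with an event_time column (use the bounds that match the target column type):

SELECT id, event_time
FROM events
WHERE event_time < '1900-01-01 00:00:00'
   OR event_time > '2262-04-11 23:47:16';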
Lack of resources or increasing data latency
You may encounter the following problems when migrating data to a ClickHouse® target:
- The transfer fails with the pod instance restarted error.
- Transfer state monitoring indicates an increasing data latency (the time difference between when records appear in the target and when they appear in the source).
Possible cause:
The write interval specified in the target endpoint settings is too large, which leads to the lack of RAM (OOM) on the transfer VM.
Solution:
In the management console, set the value of the Flush interval target endpoint setting to 10 seconds or less.
In addition, if your transfer type is Snapshot, reactivate it. Transfers of the other types will restart automatically.
Data blocks limit exceeded
When migrating data to a ClickHouse® target, the transfer is interrupted due to an error. Error message:
ERROR Unable to Activate ...
unable to upload tables: unable to upload data objects: unable upload part <table name> ():
unable to start \*clickhouse.HTTPSource event source: failed to push events to destination:
unable to push http batch: <table name>: failed: INSERT INTO ...
You may also get this error:
pod instance restarted
The errors occur when the transfer tries to insert more data blocks into the ClickHouse® target than the max_partitions_per_insert_block setting allows.
Solution: Increase the max_partitions_per_insert_block parameter for the account the transfer uses to connect to the target. For a Managed Service for ClickHouse® target, you can change this parameter in the user settings. For a custom ClickHouse® installation, you can create a settings profile and assign it to the account:
CREATE SETTINGS PROFILE max_partitions
SETTINGS max_partitions_per_insert_block = <setting_value>;

ALTER USER <username> PROFILE 'max_partitions';
Elasticsearch
Transfer failure
Error messages:
object field starting or ending with a [.] makes object resolution ambiguous <field_description>
Index -1 out of bounds for length 0
The transfer is aborted because the keys in the documents being transferred are not valid for the Elasticsearch target. Invalid keys are empty keys and keys that:
- Consist of spaces.
- Consist of periods.
- Have a period at the beginning or end.
- Have two or more periods in a row.
- Include periods separated by spaces.
Solution:
In the target endpoint additional settings, enable Sanitize documents keys and reactivate the transfer.
Document duplication on the target
When repeatedly transferring data, documents get duplicated on the target.
All documents transferred from the same source table fall under the same index named <schemaName.tableName> on the target. In this case, the target automatically generates document IDs (_id) by default. As a result, identical documents are assigned different IDs and get duplicated.
There is no duplication if the primary keys are specified in the source table or endpoint conversion rules. Document IDs are then generated at the transfer stage using the primary key values.
Generation is performed as follows:
- If a key value contains a period (.), it is escaped with \: some.key --> some\.key.
- All the primary key values are converted into a string: <some_key1>.<some_key2>.<...>.
- The resulting string is converted by the url.QueryEscape function.
- If the length of the resulting string does not exceed 512 characters, it is used as the _id. If it is longer than 512 characters, it is hashed with SHA-1 and the resulting hash is used as the _id.
As a result, documents with the same primary keys will receive the same ID when the data is transferred again, and the document transferred last will overwrite the existing one.
Solution:
- Set the primary key for one or more columns in the source table or in the endpoint conversion rules.
- Run the transfer.
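If the source is a relational database, the first step can be done by adding a primary key on the source table. A sketch with hypothetical table and column names (for non-SQL sources, set the keys in the endpoint conversion rules instead):

ALTER TABLE public.documents
    ADD PRIMARY KEY (id);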
MongoDB
Collection key size exceeds 5 MB
Error message:
Warn(replication): Usage of bulk objects in 'database <DB_name>'
breaks change event log, transfer is stopping.
Reason: (Location<item_number>) Tried to create string longer than 16MB.
If the collection key size exceeds 5 MB, transfers of the Replication type crash due to MongoDB internal limits.
Solution: exclude any collections that exceed MongoDB limits from the transfer and reactivate it.
Collection object size exceeds 16 MB
Error message:
Warn(replication): Usage of bulk objects in 'collection '<DB_name>.<collection_name>''
breaks change event log, transfer is stopping.
Reason: (BSONObjectTooLarge) BSONObj size: <object_size> (<object_size_in_hex>) is invalid.
Size must be between 0 and 16793600(16MB).
If the collection object size exceeds 16 MB, transfers of the Replication type crash due to MongoDB internal limits.
Solution: exclude any collections that exceed MongoDB limits from the transfer and reactivate it.
No table found
Error message:
Unable to find any tables
No collections were extracted from the database. The user might be missing permissions for the database used in the transfer.
Solution: For the database to be transferred, grant the readWrite permission to the user the transfer uses to connect to the source.
Error when transferring a sharded cluster
Solution: In the Snapshot settings → Parallel snapshot settings transfer parameter, specify the number of workers equal to the number of collections being transferred.
Error when transferring timeseries collections
Error messages:
Unable to find any tables
Cannot execute mongo activate hook:
Failed in accordance with configuration:
some tables from include list are missing in the source database: [<collection_name>]
The service does not support transfers of Time Series collections.
Solution: exclude any Time Series collections from the transfer and reactivate it.
Unable to recognize an external cluster IP address or FQDN
The transfer fails with the error message:
server selection error: server selection timeout, current topology: { Type: ReplicaSetNoPrimary, Servers: [{ Addr: <unresolved_FQDN>, Type: Unknown, Last error: connection() error occurred during connection handshake: dial tcp: lookup <unresolved_FQDN> on <IP address>: no such host }, ] }"
The transfer error is due to the MongoDB cluster configuration. For example, when unresolved internal names are used in shard descriptions.
Solution:
Make sure the MongoDB cluster is configured so that it returns correctly resolving IP addresses or FQDNs (fully qualified domain names) in response to requests.
Error at data copying stage
The Snapshot and increment type transfer terminates with the following error at the copying stage:
encountered non-recoverable resume token error. Sync cannot be resumed from this state and must be terminated and re-enabled to continue functioning: (ChangeStreamHistoryLost) Resume of change stream was not possible, as the resume point may no longer be in the oplog.
The ChangeStreamHistoryLost error occurs when the total copy time of the MongoDB source cluster data exceeds the operation log (oplog) time window size. You can check the current time window size in the management console: see the Oplog window chart on the cluster monitoring page.
For more information on the oplog, see the MongoDB documentation.
Solution:
- Increase the oplog size (10% of the cluster disk size by default). To increase the oplog size in a Managed Service for MongoDB source cluster, contact technical support. To change the oplog size in a custom source installation, see the MongoDB documentation.
- Enable parallel data copying to speed up the copying stage.
- Limit the list of objects to transfer in the transfer settings.
Once that is done, activate the transfer again.
Source data cannot be sharded
The transfer from a MongoDB source fails with the following error message:
ERROR: Unable to Activate
error: "failed to execute mongo activate hook: Snapshot loading failed: unable to shard upload tables: unable to shard upload (main worker) tables: unable to shard tables for operation ID: unable to split table, err: cannot get delimiters: there are two or more types of objects in the sharding index"
The cannot get delimiters: there are two or more types of objects in the sharding index error means that the source collection's id field contains values of different data types, making the source unsuitable for sharding.
Solution:
In the Snapshot settings → Parallel snapshot settings transfer settings, specify one worker and one stream to disable sharding.
Once that is done, activate the transfer again.
MySQL®
A single transaction's log size exceeds 4 GB
Error message:
Last binlog file <file_name:file_size> is more than 4GB
If the log size of a single transaction exceeds 4 GB, activation of Replication or Snapshot and increment transfers fails due to MySQL® internal limits.
Solution: Reactivate the transfer.
No new tables are added
No new tables are added to Snapshot and increment transfers.
Solution:
- Deactivate and reactivate the transfer.
- Create tables in the target database manually.
- Create a separate transfer of the Snapshot type and add only the newly created tables to it. You do not need to deactivate the original Snapshot and increment transfer.
Error when transferring from AWS RDS for MySQL®
Snapshot and increment and Replication transfers from an Amazon RDS for MySQL® source may fail with an error at the data copying stage.
Error example:
Failed to execute LoadSnapshot:
Cannot load table "name": unable to read rows and push by chunks:
unable to push changes: unable to execute per table push:
error: err: sql: transaction has already been committed or rolled back
rollback err: sql: transaction has already been committed or rolled back
This error is caused by the short MySQL® binary log file retention time in Amazon RDS.
Solution:
Increase the binary log retention time by using the command below:
call mysql.rds_set_configuration('binlog retention hours', <hours>);
The maximum retention time is 168 hours (7 days). The default value is NULL (binary log files are not retained). For more information, see the Amazon RDS documentation.
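To verify the current retention value, you can call the standard Amazon RDS stored procedure (assuming it is available in your RDS instance):

call mysql.rds_show_configuration;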
Error when transferring tables without primary keys
Error message:
Primary key check failed: 14 Tables errors: Table no key columns found
For Replication and Snapshot and increment transfers, tables without primary keys are not transferred.
Solution: Prepare the source as described in Preparing for the transfer.
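To find the tables that cause this error, you can run a diagnostic query like the one below on the MySQL® source (a sketch; the list of excluded system schemas is an assumption):

SELECT t.table_schema, t.table_name
FROM information_schema.tables AS t
LEFT JOIN information_schema.table_constraints AS c
    ON c.table_schema = t.table_schema
    AND c.table_name = t.table_name
    AND c.constraint_type = 'PRIMARY KEY'
WHERE t.table_type = 'BASE TABLE'
    AND t.table_schema NOT IN ('mysql', 'sys', 'information_schema', 'performance_schema')
    AND c.constraint_name IS NULL;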
Error when accessing a binary log
The following error may occur during Snapshot and increment transfers:
Warn(replication): failed to run (abstract1 source):
failed to run canal: failed to start binlog sync:
failed to get canal event: ERROR 1236 (HY000): Could not find first log file name in binary log index file
The error occurs when the binary log files required for replication are not available. This is usually due to exceeding the maximum allowed file size after adding new changes to the binary log. In this case, some of the old log data is deleted.
Solution:
Increase the maximum allowed size for the binary log files in the MySQL® settings using the Mdb preserve binlog bytes parameter. The minimum value is 1073741824 (1 GB), the maximum value is 107374182400 (100 GB), and the default value is 1073741824 (1 GB).
Error when dropping a table under the Drop cleanup policy
Error message:
ERROR: cannot drop table <table_name> because other objects depend on it (SQLSTATE 2BP01)
With the Drop cleanup policy, a transfer drops tables in multiple iterations:
- The transfer tries to drop all tables one by one. Cascade delete is not used, as it may delete tables that are not included in the transfer. If a table cannot be dropped, for example, because other tables reference it with foreign keys, an error occurs, but the transfer continues dropping tables.
- During the next iteration, the transfer tries to drop the remaining tables. If the blocking child tables were dropped in the previous iteration, the table referenced by foreign keys is also dropped. In this case, the error resolves itself while Data Transfer is running and no further action is required.
- If the transfer does not drop any tables during an iteration, table dropping stops. In this case:
  - The transfer continues running if all tables were dropped.
  - The transfer is aborted with an error if some tables remain.
Solution:
- If the child tables are not involved in other transfers and transferring them does not conflict with the transfer goals, add these tables:
  - To the list of included tables in the source endpoint parameters.
  - To the list of objects to transfer in the transfer parameters.
- Delete the blocking child tables in the target database manually.
- Use the Truncate cleanup policy.
- Recreate the target database.
  Warning
  This will lead to losing all data in the database.
Time shift in DATETIME data type when transferring to ClickHouse®
Time is shifted because the source endpoint uses the UTC+0 time zone for DATETIME data. For more information, see Known limitations.
Solution: Apply the appropriate time zone at the target level manually.
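For example, when querying the transferred data in ClickHouse®, you could convert the stored UTC values to the required time zone (a sketch; the table, column, and time zone names are hypothetical):

SELECT
    id,
    toTimeZone(created_at, 'Europe/Moscow') AS created_at_local  -- convert from UTC to the required zone
FROM db1.orders;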
Object Storage
Source data update error
Error message:
Push failed: kind: update not supported
Object Storage only supports inserting new data but does not support updating it. If data is updated at the source, the transfer will fail with the above error.
Solution: Use sources supporting data insertion only or select a target other than Object Storage.
OpenSearch
Transfer failure
Error messages:
object field starting or ending with a [.] makes object resolution ambiguous <field_description>
Index -1 out of bounds for length 0
The transfer is aborted because the keys in the documents being transferred are not valid for the OpenSearch target. Invalid keys are empty keys and keys that:
- Consist of spaces.
- Consist of periods.
- Have a period at the beginning or end.
- Have two or more periods in a row.
- Include periods separated by spaces.
Solution:
In the target endpoint additional settings, enable Sanitize documents keys and reactivate the transfer.
Exceeding the maximum number of fields limit
Error message:
Limit of total fields [<limit_value>] has been exceeded
The transfer is interrupted if the number of columns in the source database exceeds the maximum number of fields allowed in the target OpenSearch indexes.
Solution: Increase the maximum number of fields in the target database using the index.mapping.total_fields.limit parameter.
Document duplication on the target
When repeatedly transferring data, documents get duplicated on the target.
All documents transferred from the same source table fall under the same index named <schemaName.tableName> on the target. In this case, the target automatically generates document IDs (_id) by default. As a result, identical documents are assigned different IDs and get duplicated.
There is no duplication if the primary keys are specified in the source table or endpoint conversion rules. Document IDs are then generated at the transfer stage using the primary key values.
Generation is performed as follows:
- If a key value contains a period (.), it is escaped with \: some.key --> some\.key.
- All the primary key values are converted into a string: <some_key1>.<some_key2>.<...>.
- The resulting string is converted by the url.QueryEscape function.
- If the length of the resulting string does not exceed 512 characters, it is used as the _id. If it is longer than 512 characters, it is hashed with SHA-1 and the resulting hash is used as the _id.
As a result, documents with the same primary keys will receive the same ID when the data is transferred again, and the document transferred last will overwrite the existing one.
Solution:
- Set the primary key for one or more columns in the source table or in the endpoint conversion rules.
- Run the transfer.
Transfer interruption with the can't index document error
Error message:
Push failed: can't index document: got an indexation error
In the audit logs of different services, the details field may contain data of different types. In OpenSearch, the details target field only accepts data of the type that comes first; the rest of the data cannot be accepted due to type incompatibility, and the transfer is interrupted.
Solution: Split the stream so that data from different services goes into different indexes.
To do this, when creating a transfer, under Data transformation:
- Transformer: Select the Table splitter transformer.
- Columns: Set to event_source.
Transfer failure with the mapper_parsing_exception error
Error message:
mapper_parsing_exception failed to parse field [details.tags] of type [text]
The transfer is aborted due to incompatible data types at source and target.
Solution: Move the data to a new OpenSearch index with the details field type changed to flat_object.
- Deactivate the transfer.
- Create a new index in OpenSearch:

  curl \
      --user <OpenSearch_username>:<password> \
      --header 'Content-Type: application/json' \
      --request PUT 'https://<address_of_OpenSearch_host_with_DATA_role>:9200/<new_index_name>/_settings' \
      --data '{"index.mapping.total_fields.limit": 2000}'

- Change the details field type:

  curl \
      --user <OpenSearch_username>:<password> \
      --header 'Content-Type: application/json' \
      --request PUT 'https://<address_of_OpenSearch_host_with_DATA_role>:9200/<new_index_name>/_mapping' \
      --data '
      {
        "properties": {
          "details": { "type": "flat_object" }
        }
      }'

- Move the data from the source index to the new one:

  curl \
      --user <OpenSearch_username>:<password> \
      --header 'Content-Type: application/json' \
      --request POST 'https://<address_of_OpenSearch_host_with_DATA_role>:9200/_reindex' \
      --data '
      {
        "source": { "index": "<source_index_name>" },
        "dest": { "index": "<new_index_name>" }
      }'

- Delete the source index:

  curl \
      --user <OpenSearch_username>:<password> \
      --header 'Content-Type: application/json' \
      --request DELETE 'https://<address_of_OpenSearch_host_with_DATA_role>:9200/<source_index_name>'

- Assign an alias to the new index:

  curl \
      --user <OpenSearch_username>:<password> \
      --header 'Content-Type: application/json' \
      --request POST 'https://<address_of_OpenSearch_host_with_DATA_role>:9200/_aliases' \
      --data '
      {
        "actions": [
          { "add": { "index": "<new_alias_name>", "alias": "<source_alias_name>" } }
        ]
      }'
PostgreSQL
Stopping a transfer's master transaction session
Error message:
Cannot set transaction snapshot:
ERROR: invalid snapshot identifier: "<snapshot_ID>" (SQLSTATE 22023).
Possible causes include:
- The source is running a cron job or other application that periodically terminates sessions that are too long.
- Someone manually terminated the master transaction.
- The source's CPU resources are insufficient to execute a query.
Solution: Depending on the cause, disable the cron job or application that terminates long sessions, or add more CPU resources to the source. After making the changes, reactivate the transfer.
Exceeding the connection time-to-live quota
Yandex Managed Service for PostgreSQL has a connection time-to-live quota of 12 hours.
Solution: If transferring a database requires more time, [change the Session duration timeout setting of the Yandex Managed Service for PostgreSQL cluster](../../../../managed-postgresql/operations/update.md#change-postgresql-config).
Exceeding the number of database connections
PostgreSQL limits the number of user connections to the database. If a transfer exceeds this limit, it will work incorrectly or not at all.
Solution: Increase the allowed number of user connections to the database.
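A minimal sketch of how to check and raise the connection limit on a self-managed PostgreSQL database (for Managed Service for PostgreSQL, change the max_connections cluster setting instead; the value 200 is an arbitrary example):

-- current limit and current usage
SHOW max_connections;
SELECT count(*) AS open_connections FROM pg_stat_activity;

-- raise the limit (requires superuser privileges and a server restart to take effect)
ALTER SYSTEM SET max_connections = 200;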
VIEW transfer error
Error message:
[ERROR] "... _view": no key columns found
Data Transfer cannot replicate new data from views. In PostgreSQL-to-PostgreSQL transfers, only the views listed in the Table filter → Included tables source endpoint parameter are transferred.
Solution: manually exclude all views from the transfer, list them in the Table filter → Included tables source endpoint parameter, and then reactivate the transfer.
Error when adding a table entry by constraint
Solution: prepare the source as described in Preparing for the transfer.
Error when transferring all schema tables
Error message:
Unable to apply DDL of type 'TABLE', name '<schema>'.'<table>', error:
ERROR: schema "<schema name>" does not exist (SQLSTATE 3F000)
A transfer fails if tables of a specific schema are listed as <schema>.*. This is due to the specifics of pg_dump, which is used for the schema transfer: when tables of an entire schema are specified in the <schema>.* format in the Table filter → Included tables source endpoint parameter, the PostgreSQL types from this schema are not extracted, even if the schema contains tables that depend on these types.
Solution: Create PostgreSQL types in the target database manually.
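For example, if the transferred tables depend on a user-defined enum type from that schema, recreate it on the target before activating the transfer. A sketch with hypothetical schema, type, and value names:

CREATE TYPE shop.order_status AS ENUM ('new', 'paid', 'shipped');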
Unable to create objects involving extension functions
Error message:
Unable to apply DDL of type 'TABLE', <object_name>, error:
failed to push non-row item 0 of kind "pg:DDL" in batch:
Push failed: ERROR: function <schema_name>.<function_name>() does not exist
(SQLSTATE 42883)
In Managed Service for PostgreSQL, you cannot install an extension into a custom schema in a target database. This means the transfer will be interrupted if the user installation of Managed Service for PostgreSQL has extensions installed in a custom schema and these extensions are used in the DDLs of objects you are moving.
Solution: Check the DDLs of the objects whose names appear in the error. If these objects can call a function from a custom schema, manually create DDLs in the target that call the functions without specifying a schema. Set the target endpoint cleanup policy to Truncate to prevent the transfer from deleting these objects.
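For instance, if the error names a function from an extension installed in a custom schema, the target DDL could call it without the schema qualifier so that it resolves through the search path. A hypothetical sketch (gen_random_uuid() stands in for whatever function the error reports):

CREATE TABLE events (
    id uuid DEFAULT gen_random_uuid() PRIMARY KEY,  -- no schema qualifier before the function name
    payload text
);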
Low transfer speed
Snapshot or Snapshot and increment transfers from PostgreSQL to PostgreSQL may slow down.
Possible causes:
- Write protocol. Under normal operation, a transfer runs via the fast copy protocol. However, if there is a batch write conflict, it switches to slow row-by-row writes. The more write conflicts, the lower the transfer speed.
  Solution: Set the target endpoint cleanup policy to Drop and exclude other writing processes.
- Parallel table reading. Parallel reading is only available for tables that contain a primary key in serial mode.
  Solution: Set up parallel copy and reactivate the transfer.
Unable to transfer child tables
If a table is partitioned, its child tables are either not transferred or transferred without data.
Solution: set the following source endpoint parameters:
- Select Merge inherited tables in advanced settings.
- In the Included tables field, specify all child tables to transfer data from.
- Make sure the user has access to the child tables.
To speed up the transfer of child tables, set up parallel copy.
Insufficient replication slots in a source database
Error message:
Warn(Activate): failed to create a replication slot "<transfer_ID>" at source:
failed to create a replication slot:
failed to create a replication slot:
ERROR: all replication slots are in use
(SQLSTATE 53400)
Solution: Increase the number of replication slots in the source database (the max_replication_slots setting; the default value is 10).
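A sketch of checking slot usage and raising the limit on a self-managed source (for Managed Service for PostgreSQL, change the corresponding cluster setting instead; the value 20 is an arbitrary example):

-- slots currently in use vs. the configured limit
SELECT count(*) AS used_slots FROM pg_replication_slots;
SHOW max_replication_slots;

-- raise the limit (takes effect after a server restart)
ALTER SYSTEM SET max_replication_slots = 20;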
No data transfer after changing a source endpoint
After adding tables to the List of included tables in the source endpoint parameters, a data transfer restarted and stopped.
Solution:
- Create tables in the target database manually:
  1. Create the new tables with a Primary key and without a Foreign key in the target database.
  2. Add the new tables to the List of included tables in the source endpoint parameters.
  3. Transfer a dump with historical data to the target database.
  4. If there are errors in the logs, fix them depending on the specific error.
  5. If there are no errors but the logs are empty, contact technical support or your account manager to dump goroutines. This may help resume replication without restarting the transfer.
- Deactivate and reactivate the transfer.
- Create a separate transfer of the Snapshot type for the new tables. You do not need to deactivate the original transfer.
Transfer error when changing a master host
The error occurs in Replication or Snapshot and increment transfers because the required Write-Ahead Log (WAL) segments are missing. This happens when the logical replication lag between the current master and a replica exceeds the allowable WAL size on other hosts, so, when switching from the master to the replica, the replication slot cannot sync to the WAL on the new master.
Solution: set a limit in the additional target endpoint parameter Maximum WAL size for the replication slot and reactivate the transfer.
Error when transferring nested transactions
Transfers from PostgreSQL versions below 14 do not support tables with committed transactions that are nested more than 1024 times with replication changes at each nesting level. The degree of nesting depends on the number of nested begin; .. commit; statements.
Solution:
- Use PostgreSQL 14 or higher.
- Exclude transactions with this level of nesting from the transfer.
Error transferring tables with deferred constraints
The error occurs in Replication or Snapshot and increment transfers because Data Transfer does not support updates to tables and transactions with DEFERRABLE constraints. To learn more about deferred constraints, see the PostgreSQL documentation.
Solution: Change the constraint type in such tables to IMMEDIATE and activate the transfer again.
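On the source, a deferrable foreign key can be switched back to an immediate one with a statement like the following (the table and constraint names are hypothetical):

ALTER TABLE orders
    ALTER CONSTRAINT orders_customer_id_fkey NOT DEFERRABLE;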
Could not create a replication slot at the activation step
In the beginning of the transfer, one or more replication slots are created in the source database. Slot creation may fail if another process is competing with the transfer for locks.
Solution:
- Get the PID of the process that competes for locks with the transfer:

  /* Get the PID of the transfer */
  SELECT active_pid
  FROM pg_replication_slots
  WHERE slot_name = '<transfer_ID>';

  /* Get the PID of the locking process */
  SELECT pid, pg_blocking_pids(pid) AS blocked_by
  FROM pg_stat_activity
  WHERE cardinality(pg_blocking_pids(pid)) > 0;

  pid              | blocked_by
  -----------------+------------------------------
  <transfer_PID>   | {<locking_transaction_PID>}
  (1 row)

- Look up the locking query:

  SELECT query, usename
  FROM pg_stat_activity
  WHERE pid = <locking_transaction_PID>;

- (Optional) Stop the transaction with the command:

  SELECT pg_terminate_backend(<locking_transaction_PID>);
Excessive WAL size increase
In the PostgreSQL source database, the size of the Write-Ahead Log (WAL) may grow to dozens of gigabytes because of queries that run for more than five minutes. Such queries block the WAL and prevent it from moving forward and being reset.
You can see the WAL size has gone up when:
- The space used by the source database has increased.
- The Read buffer size chart in Data Transfer monitoring is rising.
Solution:
-
Find the query sessions running for more than five minutes:
SELECT now() - xact_start, query, pid
FROM pg_stat_activity
WHERE (now() - xact_start) > '5 minute'::interval
    AND state != 'idle'
ORDER BY 1 DESC;
-
Terminate the sessions found. Make sure to avoid such queries moving forward.
Error when replicating data from an external source
Error message:
[XX000] ERROR: could not connect to the publisher:
SSL error: certificate verify failed FATAL:
no pg_hba.conf entry for replication connection
from host "<PostgreSQL_host_IP_address>", user "postgres", SSL off
Solution: Prepare the source as described in Preparing for the transfer.
Error when transferring tables without primary keys
Error message:
Primary key check failed: 14 Tables errors: Table no key columns found
For Replication and Snapshot and increment transfers, tables without primary keys are not transferred.
Solution: Prepare the source as described in Preparing for the transfer.
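To find the tables that cause this error, you can run a diagnostic query like the one below on the PostgreSQL source (a sketch; the list of excluded system schemas is an assumption):

SELECT c.relnamespace::regnamespace AS table_schema,
       c.relname AS table_name
FROM pg_class AS c
WHERE c.relkind = 'r'
    AND c.relnamespace::regnamespace::text NOT IN ('pg_catalog', 'information_schema')
    AND NOT EXISTS (
        SELECT 1
        FROM pg_constraint AS con
        WHERE con.conrelid = c.oid
            AND con.contype = 'p'
    );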
Error when dropping a table under the Drop cleanup policy
Error message:
ERROR: cannot drop table <table_name> because other objects depend on it (SQLSTATE 2BP01)
With the Drop cleanup policy, a transfer drops tables in multiple iterations:
- The transfer tries to drop all tables one by one. Cascade delete is not used, as it may delete tables that are not included in the transfer. If a table cannot be dropped, for example, because other tables reference it with foreign keys, an error occurs, but the transfer continues dropping tables.
- During the next iteration, the transfer tries to drop the remaining tables. If the blocking child tables were dropped in the previous iteration, the table referenced by foreign keys is also dropped. In this case, the error resolves itself while Data Transfer is running and no further action is required.
- If the transfer does not drop any tables during an iteration, table dropping stops. In this case:
  - The transfer continues running if all tables were dropped.
  - The transfer is aborted with an error if some tables remain.
Solution:
- If the child tables are not involved in other transfers and transferring them does not conflict with the transfer goals, add these tables:
  - To the list of included tables in the source endpoint parameters.
  - To the list of objects to transfer in the transfer parameters.
- Delete the blocking child tables in the target database manually.
- Use the Truncate cleanup policy.
- Recreate the target database.
  Warning
  This will lead to losing all data in the database.
Yandex Managed Service for YDB
Transfer failure
A Replication or Snapshot and increment transfer is interrupted with an error.
Error message:
/Ydb.PersQueue.V1.PersQueueService/AddReadRule failed: OVERLOADED
Transfers are aborted because the cloud quota on the number of schema transactions per minute has been exceeded.
Solution:
- Increase the Number of schema transactions per minute property in the Managed Service for YDB quotas for the cloud with the required database and reactivate the transfer.
Yandex Data Streams
Transfer failure
A Replication or Snapshot and increment transfer is interrupted with an error.
Error message:
/Ydb.PersQueue.V1.PersQueueService/AddReadRule failed: OVERLOADED
Transfers are aborted because the cloud quota on the number of schema transactions per minute has been exceeded.
Solution:
- Increase the Number of schema transactions per minute property in the Managed Service for YDB quotas for the cloud with the required database and reactivate the transfer.
Cloud Functions redirects
In rare cases, the following error may occur during transfers from Data Streams or Apache Kafka®:
redirect to SOME_URL is requested but no redirects are allowed.
Possible cause:
A Cloud Functions function is configured on the source, and it returns a redirect to another URL instead of data.
Solution:
Such redirects are not allowed for security reasons. Avoid using redirects to Cloud Functions during transfers.
Who to report your problem to
If you tried the above suggestions but the problem persists, contact support.
ClickHouse® is a registered trademark of ClickHouse, Inc.