Automatically copying objects from one Object Storage bucket to another
Configure automatic copying of objects from one Object Storage bucket to another. The objects will be copied by a Cloud Functions function that a trigger invokes whenever a new object is added to the main bucket.
To set up object copy:
- Prepare your cloud.
- Create service accounts.
- Create a static key.
- Create a Yandex Lockbox secret.
- Create Yandex Object Storage buckets.
- Prepare a ZIP archive with the function code.
- Create a Yandex Cloud Functions function.
- Create a trigger.
- Test the function.
If you no longer need the resources you created, delete them.
Prepare your cloud
Sign up for Yandex Cloud and create a billing account:
- Go to the management console and log in to Yandex Cloud or create an account if you do not have one yet.
- On the Yandex Cloud Billing page, make sure you have a billing account linked and that it has the ACTIVE or TRIAL_ACTIVE status. If you do not have a billing account, create one.

If you have an active billing account, you can go to the cloud page.

Learn more about clouds and folders.
Required paid resources
The cost of resources includes:
- Fee for storing data in a bucket (see Yandex Object Storage pricing).
- Fee for the number of function calls, computing resources allocated to executing the function, and outgoing traffic (see Yandex Cloud Functions pricing).
- Secret storage fees (see Yandex Lockbox pricing).
Create service accounts
Create a service account named s3-copy-fn with the storage.uploader, storage.viewer, and lockbox.payloadViewer roles to operate the function, and a service account named s3-copy-trigger with the functions.functionInvoker role to invoke the function.
- In the management console, select the folder where you want to create a service account.
- In the list of services, select Identity and Access Management.
- Click Create service account.
- Enter a name for the service account: s3-copy-fn.
- Click Add role and select the storage.uploader, storage.viewer, and lockbox.payloadViewer roles.
- Click Create.
- Repeat the previous steps to create a service account named s3-copy-trigger with the functions.functionInvoker role. This service account will be used to invoke the function.
If you do not have the Yandex Cloud command line interface yet, install and initialize it.
The folder specified in the CLI profile is used by default. You can specify a different folder using the --folder-name or --folder-id parameter.
- Create a service account named s3-copy-fn:

  yc iam service-account create --name s3-copy-fn

  Result:

  id: nfersamh4sjq********
  folder_id: b1gc1t4cb638********
  created_at: "2023-03-21T10:36:29.726397755Z"
  name: s3-copy-fn

  Save the ID of the s3-copy-fn service account (id) and the ID of the folder where it was created (folder_id).

- Assign the storage.uploader, storage.viewer, and lockbox.payloadViewer roles to the service account:

  yc resource-manager folder add-access-binding <folder_ID> \
    --role storage.uploader \
    --subject serviceAccount:<service_account_ID>

  yc resource-manager folder add-access-binding <folder_ID> \
    --role storage.viewer \
    --subject serviceAccount:<service_account_ID>

  yc resource-manager folder add-access-binding <folder_ID> \
    --role lockbox.payloadViewer \
    --subject serviceAccount:<service_account_ID>

  Result:

  done (1s)
- Create a service account named s3-copy-trigger:

  yc iam service-account create --name s3-copy-trigger

  Save the ID of the s3-copy-trigger service account (id) and the ID of the folder where it was created (folder_id).

- Assign the service account the functions.functionInvoker role for the folder:

  yc resource-manager folder add-access-binding <folder_ID> \
    --role functions.functionInvoker \
    --subject serviceAccount:<service_account_ID>
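Optionally, you can verify the role assignments by listing the folder's access bindings. This check is not part of the original procedure and assumes the folder ID you saved earlier:

  yc resource-manager folder list-access-bindings <folder_ID>

The output should include the storage.uploader, storage.viewer, and lockbox.payloadViewer bindings for the s3-copy-fn service account and, once assigned, the functions.functionInvoker binding for s3-copy-trigger.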
If you don't have Terraform, install it and configure the Yandex Cloud provider.
- In the configuration file, describe the service account parameters:

  // Service account to operate the function
  resource "yandex_iam_service_account" "s3-copy-fn" {
    name      = "s3-copy-fn"
    folder_id = "<folder_ID>"
  }

  resource "yandex_resourcemanager_folder_iam_member" "uploader" {
    folder_id = "<folder_ID>"
    role      = "storage.uploader"
    member    = "serviceAccount:${yandex_iam_service_account.s3-copy-fn.id}"
  }

  resource "yandex_resourcemanager_folder_iam_member" "viewer" {
    folder_id = "<folder_ID>"
    role      = "storage.viewer"
    member    = "serviceAccount:${yandex_iam_service_account.s3-copy-fn.id}"
  }

  resource "yandex_resourcemanager_folder_iam_member" "payloadViewer" {
    folder_id = "<folder_ID>"
    role      = "lockbox.payloadViewer"
    member    = "serviceAccount:${yandex_iam_service_account.s3-copy-fn.id}"
  }

  // Service account to invoke the function
  resource "yandex_iam_service_account" "s3-copy-trigger" {
    name      = "s3-copy-trigger"
    folder_id = "<folder_ID>"
  }

  resource "yandex_resourcemanager_folder_iam_member" "functionInvoker" {
    folder_id = "<folder_ID>"
    role      = "functions.functionInvoker"
    member    = "serviceAccount:${yandex_iam_service_account.s3-copy-trigger.id}"
  }

  Where:
  - name: Service account name. This is a required parameter.
  - folder_id: Folder ID. This is an optional parameter. By default, the value specified in the provider settings is used.
  - role: Role you want to assign.

  For more information about the yandex_iam_service_account resource parameters in Terraform, see the relevant provider documentation.

- Make sure the configuration files are correct.

  - In the command line, go to the folder where you created the configuration file.
  - Run a check using this command:

    terraform plan

    If the configuration is specified correctly, the terminal will display information about the service accounts. If the configuration contains any errors, Terraform will point them out.

- Deploy cloud resources.

  - If the configuration does not contain any errors, run this command:

    terraform apply

  - Confirm creating the service accounts: type yes in the terminal and press Enter.

    This will create the service accounts. You can check the new service accounts using the management console or this CLI command:

    yc iam service-account list
To create a service account, use the create REST API method for the ServiceAccount resource or the ServiceAccountService/Create gRPC API call.
To assign the service account the roles for the folder, use the setAccessBindings method for the ServiceAccount resource or the ServiceAccountService/SetAccessBindings gRPC API call.
Create a static key
Create a static access key for the s3-copy-fn service account.
- In the management console, select the folder containing the service account.
- In the list of services, select Identity and Access Management.
- In the left-hand panel, select Service accounts and select the s3-copy-fn service account.
- In the top panel, click Create new key.
- Select Create static access key.
- Specify the key description and click Create.
- Save the ID and the secret key.
- Run this command:

  yc iam access-key create --service-account-name s3-copy-fn

  Result:

  access_key:
    id: aje6t3vsbj8l********
    service_account_id: ajepg0mjt06s********
    created_at: "2023-03-21T14:37:51Z"
    key_id: 0n8X6WY6S24********
  secret: JyTRFdqw8t1kh2-OJNz4JX5ZTz9Dj1rI********

- Save the ID (key_id) and secret key (secret). You will not be able to get the key value again.
- In the configuration file, describe the key parameters:

  resource "yandex_iam_service_account_static_access_key" "sa-static-key" {
    service_account_id = "<service_account_ID>"
  }

  Where service_account_id is the s3-copy-fn service account ID.

  For more information about the yandex_iam_service_account_static_access_key resource parameters in Terraform, see the relevant provider documentation.

- Make sure the configuration files are correct.

  - In the command line, go to the folder where you created the configuration file.
  - Run a check using this command:

    terraform plan

    If the configuration is described correctly, the terminal will display a list of created resources and their parameters. If the configuration contains any errors, Terraform will point them out.

- Deploy cloud resources.

  - If the configuration does not contain any errors, run this command:

    terraform apply

  - Confirm creating the static access key: type yes in the terminal and press Enter.

    If any errors occur when creating the key, Terraform will indicate them. If the key is successfully created, Terraform will write it to its state but will not show it to the user. The terminal will only display the ID of the created key.

    You can check the new service account key in the management console or using this CLI command:

    yc iam access-key list --service-account-name=s3-copy-fn
To create an access key, use the create REST API method for the AccessKey resource or the AccessKeyService/Create gRPC API call.
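If you are going to follow the AWS CLI steps later in this tutorial (creating the buckets and testing the copy), you can store the new key in your AWS CLI configuration right away. This is an optional sketch that assumes the default profile; ru-central1 is the region typically used with Object Storage:

  aws configure set aws_access_key_id <key_ID>
  aws configure set aws_secret_access_key <secret_key>
  aws configure set region ru-central1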
Create a secret
Create a Yandex Lockbox secret to store your static access key.
- In the management console, select the folder you want to create a secret in.
- In the list of services, select Lockbox.
- Click Create secret.
- In the Name field, specify the secret's name: s3-static-key.
- Under Secret data:
  - Select the Custom secret type.
  - Add the key ID value:
    - In the Key field, put: key_id.
    - In the Value field, specify the key ID you got earlier.
  - Click Add key/value.
  - Add the secret key value:
    - In the Key field, put: secret.
    - In the Value field, specify the secret key value you got earlier.
- Click Create.
To create a secret, run this command:
yc lockbox secret create --name s3-static-key \
--payload "[{'key': 'key_id', 'text_value': '<key_ID>'},{'key': 'secret', 'text_value': '<private_key_value>'}]"
Result:
id: e6q2ad0j9b55********
folder_id: b1gktjk2rg49********
created_at: "2021-11-08T19:23:00.383Z"
name: s3-static-key
status: ACTIVE
current_version:
id: g6q4fn3b6okj********
secret_id: e6e2ei4u9b55********
created_at: "2023-03-21T19:23:00.383Z"
status: ACTIVE
payload_entry_keys:
- key_id
- secret
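To make sure both entries were stored, you can optionally view the secret payload. This verification step is not part of the original procedure; run yc lockbox payload get --help if the syntax differs in your CLI version:

  yc lockbox payload get s3-static-key

The output should contain the key_id and secret entries with the values you provided.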
- In the configuration file, describe the secret parameters:

  resource "yandex_lockbox_secret" "my_secret" {
    name = "s3-static-key"
  }

  resource "yandex_lockbox_secret_version" "my_version" {
    secret_id = yandex_lockbox_secret.my_secret.id
    entries {
      key        = "key_id"
      text_value = "<key_ID>"
    }
    entries {
      key        = "secret"
      text_value = "<private_key_value>"
    }
  }

  Where:
  - name: Secret name
  - key: Key name
  - text_value: Key value

  Note

  We recommend using the yandex_lockbox_secret_version_hashed resource: it stores values in the Terraform state in hashed form. The yandex_lockbox_secret_version resource is still supported.

  For more information about yandex_lockbox_secret_version_hashed, see the relevant provider documentation.

  For more information about the parameters of the resources used in Terraform, see the provider documentation.

- Make sure the configuration files are correct.

  - In the command line, go to the folder where you created the configuration file.
  - Run a check using this command:

    terraform plan

    If the configuration is described correctly, the terminal will display a list of created resources and their parameters. If the configuration contains any errors, Terraform will point them out.

- Deploy cloud resources.

  - If the configuration does not contain any errors, run this command:

    terraform apply

  - Confirm creating the secret: type yes in the terminal and press Enter.
To create a secret, use the create REST API method for the Secret resource or the SecretService/Create gRPC API call.
Create Object Storage buckets
Create two buckets: the main one to store files and the backup one to copy the main bucket's files to.
- In the management console, select the folder you want to create the buckets in.
- In the list of services, select Object Storage.
- Create the main bucket:
  - Click Create bucket.
  - In the Name field, enter a name for the main bucket.
  - In the Object read access, Object listing access, and Read access to settings fields, select Restricted.
  - Click Create bucket.
- Similarly, create a backup bucket.
If you do not have the AWS CLI yet, install and configure it.
Create the main and the backup buckets:
aws --endpoint-url https://storage.yandexcloud.net \
s3 mb s3://<main_bucket_name>
aws --endpoint-url https://storage.yandexcloud.net \
s3 mb s3://<backup_bucket_name>
Result:
make_bucket: <main_bucket_name>
make_bucket: <backup_bucket_name>
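Optionally, list your buckets to confirm both were created (a quick check, not part of the original procedure):

  aws --endpoint-url https://storage.yandexcloud.net s3 ls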
Note

Terraform uses a service account to interact with Object Storage. Assign the required role, e.g., storage.admin, to the service account for the folder where you are going to create the resources.
- Describe the parameters for creating a service account and access key in the configuration file:

  ...
  // Creating a service account
  resource "yandex_iam_service_account" "sa" {
    name = "<service_account_name>"
  }

  // Assigning a role to the service account
  resource "yandex_resourcemanager_folder_iam_member" "sa-admin" {
    folder_id = "<folder_ID>"
    role      = "storage.admin"
    member    = "serviceAccount:${yandex_iam_service_account.sa.id}"
  }

  // Creating a static access key
  resource "yandex_iam_service_account_static_access_key" "sa-static-key" {
    service_account_id = yandex_iam_service_account.sa.id
    description        = "static access key for object storage"
  }

- In the configuration file, describe the parameters of the main and backup buckets:

  resource "yandex_storage_bucket" "main-bucket" {
    access_key = yandex_iam_service_account_static_access_key.sa-static-key.access_key
    secret_key = yandex_iam_service_account_static_access_key.sa-static-key.secret_key
    bucket     = "<main_bucket_name>"
  }

  resource "yandex_storage_bucket" "reserve-bucket" {
    access_key = yandex_iam_service_account_static_access_key.sa-static-key.access_key
    secret_key = yandex_iam_service_account_static_access_key.sa-static-key.secret_key
    bucket     = "<backup_bucket_name>"
  }

  For more information about the yandex_storage_bucket resource, see the Terraform provider documentation.

- Make sure the configuration files are correct.

  - In the command line, go to the folder where you created the configuration file.
  - Run a check using this command:

    terraform plan

    If the configuration is described correctly, the terminal will display a list of created resources and their parameters. If the configuration contains any errors, Terraform will point them out.

- Deploy cloud resources.

  - If the configuration does not contain any errors, run this command:

    terraform apply

  - Confirm creating the buckets: type yes in the terminal and press Enter.
To create a bucket, use the create REST API method for the Bucket resource or the BucketService/Create gRPC API call.
Prepare a ZIP archive with the function code
- Save the following code to a file named handler.sh:

  set -e
  (
    cat | jq -c '.messages[]' | while read message;
    do
      SRC_BUCKET=$(echo "$message" | jq -r .details.bucket_id)
      SRC_OBJECT=$(echo "$message" | jq -r .details.object_id)
      aws --endpoint-url="$S3_ENDPOINT" s3 cp "s3://$SRC_BUCKET/$SRC_OBJECT" "s3://$DST_BUCKET/$SRC_OBJECT"
    done;
  ) 1>&2

  The script reads the trigger message from standard input, takes the source bucket and object from the details.bucket_id and details.object_id fields, and copies the object to the backup bucket (a sample message is shown after this list).
- Add the handler.sh file to a ZIP archive named handler-sh.zip.
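To build the archive from the command line, one option is the standard zip utility, for example:

  zip handler-sh.zip handler.sh

For reference, handler.sh only uses the details.bucket_id and details.object_id fields of each message the trigger passes to it. An abridged example of what the function receives on standard input might look like this (field values are placeholders; see the Object Storage trigger message format in the documentation for the full structure):

  {
    "messages": [
      {
        "details": {
          "bucket_id": "<main_bucket_name>",
          "object_id": "path/to/new-object.txt"
        }
      }
    ]
  }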
Create a function
Create a function that will copy a new object to the backup bucket once you add it to the main bucket.
- In the management console, select the folder where you want to create a function.
- In the list of services, select Cloud Functions.
- Create a function:
  - Click Create function.
  - Specify the function name: copy-function.
  - Click Create.
- Create a function version:
  - Select the Bash runtime environment, disable the Add files with code examples option, and click Continue.
  - Specify the ZIP archive upload method and select the handler-sh.zip archive created in the previous step.
  - Specify the entry point: handler.sh.
  - Under Parameters, specify:
    - Timeout, sec: 600
    - Memory: 128 MB
    - Service account: s3-copy-fn
    - Environment variables:
      - S3_ENDPOINT: https://storage.yandexcloud.net
      - DST_BUCKET: Name of the backup bucket to copy files to
    - Lockbox secrets:
      - AWS_ACCESS_KEY_ID: secret s3-static-key, version latest, secret key key_id
      - AWS_SECRET_ACCESS_KEY: secret s3-static-key, version latest, secret key secret
  - Click Save changes.
- Create a function named copy-function:

  yc serverless function create --name=copy-function

  Result:

  id: b09bhaokchn9********
  folder_id: <folder_ID>
  created_at: "2024-10-21T20:40:03.451Z"
  name: copy-function
  http_invoke_url: https://functions.yandexcloud.net/b09bhaokchn9********
  status: ACTIVE

- Create a version of the copy-function function:

  yc serverless function version create \
    --function-name copy-function \
    --memory=128m \
    --execution-timeout=600s \
    --runtime=bash \
    --entrypoint=handler.sh \
    --service-account-id=<service_account_ID> \
    --environment DST_BUCKET=<backup_bucket_name> \
    --environment S3_ENDPOINT=https://storage.yandexcloud.net \
    --secret name=s3-static-key,key=key_id,environment-variable=AWS_ACCESS_KEY_ID \
    --secret name=s3-static-key,key=secret,environment-variable=AWS_SECRET_ACCESS_KEY \
    --source-path=./handler-sh.zip

  Where:
  - --function-name: Name of the function you are creating a version of.
  - --memory: Amount of RAM.
  - --execution-timeout: Maximum function running time before the timeout is reached.
  - --runtime: Runtime environment.
  - --entrypoint: Entry point.
  - --service-account-id: s3-copy-fn service account ID.
  - --environment: Environment variables.
  - --secret: Secret containing the parts of the static access key.
  - --source-path: Path to the handler-sh.zip archive.

  Result:

  done (1s)
  id: d4e394pt4nhf********
  function_id: d4efnkn79m7n********
  created_at: "2024-10-21T20:41:01.345Z"
  runtime: bash
  entrypoint: handler.sh
  resources:
    memory: "134217728"
  execution_timeout: 600s
  service_account_id: ajelprpohp7r********
  image_size: "4096"
  status: ACTIVE
  tags:
    - $latest
  environment:
    DST_BUCKET: <backup_bucket_name>
    S3_ENDPOINT: https://storage.yandexcloud.net
  secrets:
    - id: e6qo2oprlmgn********
      version_id: e6q6i1qt0ae8********
      key: key_id
      environment_variable: AWS_ACCESS_KEY_ID
    - id: e6qo2oprlmgn********
      version_id: e6q6i1qt0ae8********
      key: secret
      environment_variable: AWS_SECRET_ACCESS_KEY
  log_options:
    folder_id: b1g681qpemb4********
  concurrency: "1"
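Optionally, you can check that the new version was created and is tagged $latest (this verification step is not part of the original procedure):

  yc serverless function version list --function-name copy-function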
- In the configuration file, describe the function parameters:

  resource "yandex_function" "copy-function" {
    name               = "copy-function"
    user_hash          = "first function"
    runtime            = "bash"
    entrypoint         = "handler.sh"
    memory             = "128"
    execution_timeout  = "600"
    service_account_id = "<service_account_ID>"
    environment = {
      DST_BUCKET  = "<backup_bucket_name>"
      S3_ENDPOINT = "https://storage.yandexcloud.net"
    }
    secrets {
      id                   = "<secret_ID>"
      version_id           = "<secret_version_ID>"
      key                  = "key_id"
      environment_variable = "AWS_ACCESS_KEY_ID"
    }
    secrets {
      id                   = "<secret_ID>"
      version_id           = "<secret_version_ID>"
      key                  = "secret"
      environment_variable = "AWS_SECRET_ACCESS_KEY"
    }
    content {
      zip_filename = "./handler-sh.zip"
    }
  }

  Where:
  - name: Function name.
  - user_hash: Any string to identify the function version.
  - runtime: Function runtime environment.
  - entrypoint: Entry point.
  - memory: Amount of memory allocated for the function, in MB.
  - execution_timeout: Function execution timeout.
  - service_account_id: s3-copy-fn service account ID.
  - environment: Environment variables.
  - secrets: Secret containing the parts of the static access key.
  - content: Path to the handler-sh.zip archive with the function source code.

  For more information about the yandex_function resource parameters, see the provider documentation.

- Make sure the configuration files are correct.

  - In the command line, go to the folder where you created the configuration file.
  - Run a check using this command:

    terraform plan

    If the configuration is described correctly, the terminal will display a list of created resources and their parameters. If the configuration contains any errors, Terraform will point them out.

- Deploy cloud resources.

  - If the configuration does not contain any errors, run this command:

    terraform apply

  - Confirm creating the function: type yes in the terminal and press Enter.
To create a function, use the create REST API method for the Function resource or the FunctionService/Create gRPC API call.
To create a function version, use the createVersion REST API method for the Function resource or the FunctionService/CreateVersion gRPC API call.
Create a trigger
Create a trigger for Object Storage that will invoke copy-function when you create a new object in the main bucket.
- In the management console, select the folder where you want to create a trigger.
- In the list of services, select Cloud Functions.
- In the left-hand panel, select Triggers.
- Click Create trigger.
- Under Basic settings:
  - Specify a name for the trigger: bucket-to-bucket-copying.
  - In the Type field, select Object Storage.
  - In the Launched resource field, select Function.
- Under Object Storage settings:
  - In the Bucket field, select the main bucket.
  - In the Event types field, select Create object.
- Under Function settings:
  - In the Function field, select copy-function.
  - In the Service account field, select the s3-copy-trigger service account.
- Click Create trigger.
Run this command:
yc serverless trigger create object-storage \
--name bucket-to-bucket-copying \
--bucket-id <main_bucket_name> \
--events 'create-object' \
--invoke-function-name copy-function \
--invoke-function-service-account-name s3-copy-trigger
Where:
- --name: Trigger name.
- --bucket-id: Name of the main bucket.
- --events: Events activating the trigger.
- --invoke-function-name: Name of the function being invoked.
- --invoke-function-service-account-name: Name of the service account to use for invoking the function.
Result:
id: a1s92agr8mpg********
folder_id: b1g88tflru0e********
created_at: "2024-10-21T21:04:01.866959640Z"
name: bucket-to-bucket-copying
rule:
object_storage:
event_type:
- OBJECT_STORAGE_EVENT_TYPE_CREATE_OBJECT
bucket_id: <main_bucket_name>
batch_settings:
size: "1"
cutoff: 1s
invoke_function:
function_id: d4eofc7n0m03********
function_tag: $latest
service_account_id: aje3932acd0c********
status: ACTIVE
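Optionally, list the triggers in the folder to make sure bucket-to-bucket-copying is active (a quick check, not part of the original procedure):

  yc serverless trigger list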
- In the configuration file, describe the trigger parameters:

  resource "yandex_function_trigger" "my_trigger" {
    name = "bucket-to-bucket-copying"
    object_storage {
      bucket_id = "<main_bucket_name>"
      create    = true
    }
    function {
      id                 = "<function_ID>"
      service_account_id = "<service_account_ID>"
    }
  }

  Where:
  - name: Trigger name.
  - object_storage: Storage parameters:
    - bucket_id: Name of the main bucket.
    - create: The trigger will invoke the function when a new object is created in the storage.
  - function: Settings for the function the trigger will activate:
    - id: copy-function function ID.
    - service_account_id: s3-copy-trigger service account ID.

  For more information about resource parameters in Terraform, see the provider documentation.

- Make sure the configuration files are correct.

  - In the command line, go to the folder where you created the configuration file.
  - Run a check using this command:

    terraform plan

    If the configuration is described correctly, the terminal will display a list of created resources and their parameters. If the configuration contains any errors, Terraform will point them out.

- Deploy cloud resources.

  - If the configuration does not contain any errors, run this command:

    terraform apply

  - Confirm creating the trigger: type yes in the terminal and press Enter.
To create a trigger for Object Storage, use the create method for the Trigger resource or the TriggerService/Create gRPC API call.
Test the function
- In the management console, go to the folder where the main bucket is located.
- In the list of services, select Object Storage.
- Click the name of the main bucket.
- In the top-right corner, click Upload.
- In the window that opens, select the required files and click Open.
- The management console will display all objects selected for upload. Click Upload.
- Refresh the page.
- Go to the backup bucket and make sure it contains the files you added.
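You can also run the same test from the command line using the AWS CLI configured earlier. The sketch below uploads a local file (test-object.txt is just an example name) to the main bucket and then lists the backup bucket; give the function a few seconds to run before checking:

  aws --endpoint-url https://storage.yandexcloud.net \
    s3 cp ./test-object.txt s3://<main_bucket_name>/test-object.txt

  aws --endpoint-url https://storage.yandexcloud.net \
    s3 ls s3://<backup_bucket_name>/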
How to delete the resources you created
To stop paying for the resources you created:
- Delete the objects from the buckets.
- Delete the buckets.
- Delete the bucket-to-bucket-copying trigger.
- Delete the copy-function function.
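If you created the resources with the CLI, the cleanup can also be scripted. This is a sketch assuming the names used in this tutorial; check each command's --help for the exact syntax in your CLI version:

  # Empty and delete the buckets
  aws --endpoint-url https://storage.yandexcloud.net s3 rm s3://<main_bucket_name> --recursive
  aws --endpoint-url https://storage.yandexcloud.net s3 rb s3://<main_bucket_name>
  aws --endpoint-url https://storage.yandexcloud.net s3 rm s3://<backup_bucket_name> --recursive
  aws --endpoint-url https://storage.yandexcloud.net s3 rb s3://<backup_bucket_name>

  # Delete the trigger and the function
  yc serverless trigger delete bucket-to-bucket-copying
  yc serverless function delete copy-function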