Connecting to Object Storage from Virtual Private Cloud
Tip
In addition to the solution described in this article, you can also access Object Storage from cloud network resources without internet access by using a VPC service connection. For more information, see Accessing a bucket using a service connection from VPC.
In Yandex Cloud, you can connect to Yandex Object Storage via its API endpoint, whose FQDN is resolved to a public IP address using DNS.
In this tutorial, we will deploy a Yandex Cloud infrastructure to set up access to Object Storage for resources that are hosted in a VPC cloud network and have no public IP addresses or internet access through a NAT gateway.
While deploying this Yandex Cloud infrastructure, we will create the following resources:
| Name | Description |
|---|---|
| `s3-vpc` | Cloud network with the resources to provide with Object Storage access. For deployment, you can also specify an existing cloud network. |
| `s3-nlb` | Internal network load balancer for Object Storage that accepts TCP traffic with destination port 443 and distributes it across the VM instances in a target group |
| `s3-nat-group` | Load balancer target group of NAT instances |
| `nat-a1-vm`, `nat-a2-vm`, `nat-b1-vm`, `nat-b2-vm` | NAT instances residing in the `ru-central1-a` and `ru-central1-b` availability zones that route traffic to and from Object Storage, translating the source and destination IP addresses |
| `pub-ip-a1`, `pub-ip-a2`, `pub-ip-b1`, `pub-ip-b2` | Public IP addresses of the VMs that the VPC cloud network maps from their internal IP addresses |
| DNS zone and A record | `storage.yandexcloud.net.` internal DNS zone in the `s3-vpc` network with a type `A` resource record mapping the `storage.yandexcloud.net` domain name to the IP address of the internal network load balancer |
| `s3-bucket-<...>` | Bucket in Object Storage |
| `s3-subnet-a`, `s3-subnet-b` | Cloud subnets to host the NAT instances in the `ru-central1-a` and `ru-central1-b` availability zones |
| `test-s3-vm` | VM used to test access to Object Storage |
| `test-s3-subnet-a` | Cloud subnet to host the test VM |
In Cloud DNS, we will create the `storage.yandexcloud.net.` internal DNS zone for the cloud network hosting the resources, along with a type `A` resource record that maps the `storage.yandexcloud.net` Object Storage domain name to the IP address of the internal network load balancer. This record will direct traffic from your cloud resources bound for Object Storage to the internal load balancer, which will in turn distribute it across the NAT instances.
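The Terraform configuration in this tutorial creates the zone and record for you. Purely as an illustration, a roughly equivalent manual setup with the YC CLI might look as follows; the zone name, network ID, and load balancer address are placeholders:

```bash
# Illustration only: the Terraform configuration creates these resources automatically.
# Create a private DNS zone visible to the s3-vpc network and add an A record
# pointing the Object Storage domain name at the internal NLB address.
yc dns zone create \
  --name s3-private-zone \
  --zone storage.yandexcloud.net. \
  --private-visibility-network-ids <s3-vpc_network_ID>

yc dns zone add-records \
  --name s3-private-zone \
  --record "storage.yandexcloud.net. 300 A <internal_NLB_IP_address>"
```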
To deploy the NAT instances, use the NAT instance image based on Ubuntu 22.04 LTS from Cloud Marketplace. A NAT instance translates the source and destination IP addresses to route traffic to the Object Storage public IP address.
By placing the NAT instances in multiple availability zones, you can ensure fault-tolerant access to Object Storage. You can scale the solution for higher workload by adding more NAT instances. Before doing that, consider the internal NLB traffic processing locality.
The Object Storage bucket policy allows bucket operations only from the NAT instance public IP addresses, so only cloud resources that use this solution can access the bucket; you cannot connect to the bucket via the public Object Storage API endpoint. If needed, you can remove this limitation in the Terraform configuration file.
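For reference, a bucket policy with this effect can also be applied manually with the AWS CLI. The sketch below uses placeholder values and is not the exact policy generated by the Terraform configuration:

```bash
# Sketch only: deny bucket operations from any address other than the NAT
# instance public IPs. All names and addresses below are placeholders.
cat > policy.json <<'EOF'
{
  "Version": "2012-10-17",
  "Statement": [
    {
      "Effect": "Deny",
      "Principal": "*",
      "Action": "*",
      "Resource": ["arn:aws:s3:::<bucket_name>", "arn:aws:s3:::<bucket_name>/*"],
      "Condition": {
        "NotIpAddress": {
          "aws:SourceIp": ["<pub-ip-a1>", "<pub-ip-a2>", "<pub-ip-b1>", "<pub-ip-b2>"]
        }
      }
    }
  ]
}
EOF

aws --endpoint-url=https://storage.yandexcloud.net \
    s3api put-bucket-policy --bucket <bucket_name> --policy file://policy.json
```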
Test results for NAT instance throughput
The solution was tested on a single NAT instance with the following configuration:
- Platform: Intel Ice Lake (`standard-v3`)
- Performance level: 100%
- vCPU cores: 2
- RAM: 2 GB
The average Object Storage traffic processing speed was 250 MB/s, both egress and ingress.
The test was performed using the warp benchmarking tool. Command used for the test:
warp get \
--host storage.yandexcloud.net \
--access-key <static_key_ID> \
--secret-key <secret_key> \
--tls \
--bucket <bucket_name> \
--obj.randsize \
--concurrent 20 \
--warp-client <warp_client_IP_addresses>
Tips for production deployment
- When deploying your NAT instances in multiple availability zones, use an even number of VMs so that they are distributed evenly across the availability zones.
- When selecting the number of NAT instances, consider the internal NLB traffic processing locality.
- Once the solution is deployed, reduce the number of NAT instances or update the list of availability zones in the `yc_availability_zones` parameter only during a pre-scheduled maintenance window, as applying these changes may interrupt traffic processing.
- If a NAT instance shows a high `CPU steal time` metric value under increased Object Storage workload, enable a software-accelerated network for that NAT instance.
- By default, you can access buckets in Object Storage via the Yandex Cloud management console. To disable this access option, set `bucket_console_access = false` in the configuration.
- With `bucket_private_access = true`, skipping `mgmt_ip` will cause a bucket access error during Terraform deployment on your workstation.
- If you are using your own DNS server, create the following type `A` resource records in its settings:

  | Name | Type | Value |
  |---|---|---|
  | `storage.yandexcloud.net` | `A` | `<internal_load_balancer_IP_address>` |
  | `<bucket_name>.storage.yandexcloud.net` | `A` | `<internal_load_balancer_IP_address>` |

- Save the `pt_key.pem` private SSH key for accessing the NAT instances to a secure location, or recreate it without using Terraform.
- Once the solution is deployed, SSH access to the NAT instances is disabled. To enable it, add a rule for inbound SSH traffic (`TCP/22`) to the `s3-nat-sg` security group, allowing access only from trusted IP addresses of admin workstations (see the sketch below this list).
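As a rough sketch of that last tip (the rule syntax is illustrative and the CIDR below is a placeholder for your admin workstation address):

```bash
# Sketch: allow inbound SSH (TCP/22) to the NAT instances from a single
# trusted admin workstation. Replace the placeholder CIDR with your own address.
yc vpc security-group update-rules s3-nat-sg \
  --add-rule "direction=ingress,port=22,protocol=tcp,v4-cidrs=[203.0.113.10/32]"
```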
Deployment plan
To deploy the solution and test it:
- Get your cloud ready.
- Set up an environment for deploying the resources.
- Deploy the solution.
- Test the solution.
If you no longer need the resources you created, delete them.
Get your cloud ready
Sign up in Yandex Cloud and create a billing account:
- Navigate to the management console and log in to Yandex Cloud or register a new account.
- On the Yandex Cloud Billing page, make sure you have a billing account linked and its status is `ACTIVE` or `TRIAL_ACTIVE`. If you do not have a billing account, create one and link a cloud to it.

If you have an active billing account, you can navigate to the cloud page.
Learn more about clouds and folders.
Required paid resources
The infrastructure support costs include:
- Fee for using Object Storage (see Yandex Object Storage pricing).
- Fee for using a network load balancer (see Network Load Balancer pricing).
- Fee for continuously running VMs (see Yandex Compute Cloud pricing).
- Fee for public IP addresses and outbound traffic (see Yandex Virtual Private Cloud pricing).
Set up an environment for deploying the resources
- If you do not have the Yandex Cloud CLI yet, install it and sign in as a user (a minimal setup sketch follows the quota table below).
- Check that the cloud has an account with `admin` permissions for the folder where you are deploying the solution.
- Check whether your cloud quotas allow you to deploy the resources for this scenario:
Information about the number of new resources

| Resource | Quantity |
|---|---|
| Virtual machines | 5 |
| VM vCPUs | 10 |
| VM RAM | 10 GB |
| Disks | 5 |
| HDD size | 30 GB |
| SSD size | 40 GB |
| Network load balancer | 1 |
| Load balancer target group | 1 |
| Networks | 1¹ |
| Subnets | 3 |
| Static public IP addresses | 4 |
| Security groups | 1 |
| DNS zone | 1 |
| Bucket | 1 |
| Service accounts | 2 |
| Service account static key | 1 |

¹ If you did not specify the ID of an existing network in `terraform.tfvars`.
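For the CLI setup mentioned in the first item above, a minimal sketch (the commands are interactive and the output is illustrative):

```bash
# Sketch: initialize a YC CLI profile and verify which cloud and folder it points to.
yc init          # interactive sign-in and profile setup
yc config list   # confirm the cloud-id and folder-id of the active profile
```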
Deploy the solution using Terraform
- Clone the `yandex-cloud-examples/yc-s3-private-endpoint` repository to your workstation and go to the `yc-s3-private-endpoint` directory:

  git clone https://github.com/yandex-cloud-examples/yc-s3-private-endpoint.git
  cd yc-s3-private-endpoint
- Set up an environment for Terraform authentication (learn more in Getting started with Terraform):
export YC_TOKEN=$(yc iam create-token)
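Depending on how the Terraform provider is configured, you may also need to pass the cloud and folder IDs through environment variables. A possible sketch that takes the values from your active YC CLI profile:

```bash
# Optional sketch: export the cloud and folder IDs from the active yc profile.
# Only needed if the Terraform configuration expects these environment variables.
export YC_CLOUD_ID=$(yc config get cloud-id)
export YC_FOLDER_ID=$(yc config get folder-id)
```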
- The `variables.tf` file defines the variable parameters of the resources to create. Set your custom variable values in the `terraform.tfvars` file. Refer to the table below to see which parameters you need to update; a sample `terraform.tfvars` sketch follows the table.

  Detailed information about the values to set
| Parameter name | Replace with a custom value | Description | Type | Example |
|---|---|---|---|---|
| `folder_id` | Yes | ID of the folder to host the solution components | `string` | `"b1gentmqf1ve********"` |
| `vpc_id` | — | ID of your cloud network to provide with Object Storage access. If left empty, the system will create a new network. | `string` | `"enp48c1ndilt********"` |
| `yc_availability_zones` | — | List of the availability zones for deploying NAT instances | `list(string)` | `["ru-central1-a", "ru-central1-b"]` |
| `subnet_prefix_list` | — | List of prefixes of cloud subnets to host the NAT instances (one subnet in each availability zone from the `yc_availability_zones` list, in the same order: `ru-central1-a`, `ru-central1-b`, and so on) | `list(string)` | `["10.10.1.0/24", "10.10.2.0/24"]` |
| `nat_instances_count` | — | Number of NAT instances to deploy. We recommend an even number to distribute the instances evenly across the availability zones. | `number` | `4` |
| `bucket_private_access` | — | Limits bucket access to the public IP addresses of the NAT instances. Set it to `true` to apply this limitation or `false` to remove it. | `bool` | `true` |
| `bucket_console_access` | — | Allows or denies bucket access via the Yandex Cloud management console. Set it to `true` to allow this access option or `false` to deny it. Make sure to specify it if `bucket_private_access` is set to `true`. | `bool` | `true` |
| `mgmt_ip` | Yes | Public IP address of the workstation where you are deploying the infrastructure using Terraform. Used to allow bucket operations from your local machine during deployment. Make sure to specify it if `bucket_private_access` is set to `true`. | `string` | `"A.A.A.A"` |
| `trusted_cloud_nets` | Yes | List of aggregated prefixes of cloud subnets allowed to access Object Storage. Used in the inbound traffic rule for the NAT instance security groups. | `list(string)` | `["10.0.0.0/8", "192.168.0.0/16"]` |
| `vm_username` | — | NAT instance and test VM username | `string` | `"admin"` |
| `s3_ip` | No | Object Storage public IP address | `string` | `213.180.193.243` |
| `s3_fqdn` | No | Object Storage domain name | `string` | `storage.yandexcloud.net` |
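As an illustration only, a `terraform.tfvars` file for this configuration might look like the sketch below; every value is a placeholder to replace with your own:

```bash
# Sketch: create terraform.tfvars with placeholder values matching the table above.
cat > terraform.tfvars <<'EOF'
folder_id             = "<folder_ID>"
yc_availability_zones = ["ru-central1-a", "ru-central1-b"]
subnet_prefix_list    = ["10.10.1.0/24", "10.10.2.0/24"]
nat_instances_count   = 4
bucket_private_access = true
bucket_console_access = true
mgmt_ip               = "<workstation_public_IP_address>"
trusted_cloud_nets    = ["10.0.0.0/8", "192.168.0.0/16"]
vm_username           = "admin"
EOF
```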
- Initialize Terraform:
terraform init
- Check the list of new cloud resources:
terraform plan
- Create the resources:
terraform apply
- Once the `terraform apply` process is complete, the command line will show the information you need to connect to the test VM and run Object Storage tests. Later on, you can view this information by running the `terraform output` command:

  Information about the deployed resources
| Name | Description | Value (example) |
|---|---|---|
| `path_for_private_ssh_key` | File with a private key for SSH access to the NAT instances and the test VM | `./pt_key.pem` |
| `vm_username` | NAT instance and test VM username | `admin` |
| `test_vm_password` | Test VM `admin` user password | `v3RCqU****` |
| `s3_bucket_name` | Object Storage bucket name | `s3-bucket-<...>` |
| `s3_nlb_ip_address` | Internal load balancer IP address | `10.10.1.100` |
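With these outputs in hand, you can, for example, connect to a NAT instance over SSH once inbound TCP/22 access is allowed as described in the production tips above. A rough sketch with a placeholder NAT instance address:

```bash
# Sketch: use the Terraform outputs to connect to a NAT instance over SSH.
# Requires an inbound TCP/22 rule in the s3-nat-sg security group.
terraform output                                # list all deployment outputs
VM_USER=$(terraform output -raw vm_username)    # e.g. admin
ssh -i ./pt_key.pem "$VM_USER"@<NAT_instance_public_IP_address>
```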
Test the solution
- In the management console, navigate to the folder with the resources you created.
- Select Compute Cloud.
- Select `test-s3-vm` from the list of VMs.
- Navigate to the Serial console tab.
- Click Connect.
- Enter the `admin` username and the password from the `terraform output test_vm_password` command output (without quotation marks).
- Run this command:
dig storage.yandexcloud.net
- Check the DNS server response and make sure the Object Storage domain name resolves to the IP address of the internal load balancer. The command output will show a type `A` resource record as follows:

  ;; ANSWER SECTION:
  storage.yandexcloud.net. 300 IN A 10.10.1.100
- Get an object from the bucket in Object Storage using the AWS CLI. The bucket name is taken from an environment variable set on the test VM.

  aws --endpoint-url=https://storage.yandexcloud.net \
    s3 cp s3://$BUCKET/s3_test_file.txt s3_test_file.txt
Result:
download: s3://<bucket_name>/s3_test_file.txt to ./s3_test_file.txt
- You can run a number of additional commands to test Object Storage. The bucket name is taken from an environment variable set on the test VM.

  Upload the downloaded test file to the bucket under a different name:

  aws --endpoint-url=https://storage.yandexcloud.net \
    s3 cp s3_test_file.txt s3://$BUCKET/textfile.txt
Result:
upload: ./s3_test_file.txt to s3://<bucket_name>/textfile.txt
Get a list of objects in the bucket:
aws --endpoint-url=https://storage.yandexcloud.net \
  s3 ls --recursive s3://$BUCKET
Result:
2023-08-16 18:24:05         53 s3_test_file.txt
2023-08-16 18:41:39         53 textfile.txt
Delete the object you uploaded to the bucket:
aws --endpoint-url=https://storage.yandexcloud.net \
  s3 rm s3://$BUCKET/textfile.txt
Result:
delete: s3://<bucket_name>/textfile.txt
Delete the resources you created
To delete the resources created with Terraform:
- In the terminal window, go to the directory containing the infrastructure plan.

  Warning

  Make sure the directory has no Terraform manifests with the resources you want to keep. Terraform deletes all resources that were created using the manifests in the current directory.
- Delete the resources:
  - Run this command:

    terraform destroy

  - Confirm deleting the resources and wait for the operation to complete.
All the resources described in the Terraform manifests will be deleted.