Connecting to Object Storage from Virtual Private Cloud
Tip
In addition to the solution described in this article, you can also access Object Storage from cloud network resources without internet access by using the VPC service connection. For more information, see Accessing a bucket using a service connection from VPC.
In Yandex Cloud, you can connect to Yandex Object Storage via the appropriate API endpoint whose FQDN is then translated to a public IP using the DNS service.
This article describes how to deploy a cloud infrastructure in Yandex Cloud to set up access to Object Storage for resources that are hosted in a VPC cloud network and have no public IPs or access to the internet through a NAT gateway.
After the solution is deployed in Yandex Cloud, the following resources will be created:
Name | Description
---|---
`s3-vpc` | Cloud network with the resources for which access to Object Storage is set up. You can also specify an existing cloud network for deployment.
`s3-nlb` | Internal network load balancer that accepts traffic bound for Object Storage. The load balancer accepts TCP traffic with destination port 443 and distributes it across the resources (VMs) in a target group.
`s3-nat-group` | Load balancer target group with the VM instances that have the NAT function enabled.
`nat-a1-vm`, `nat-a2-vm`, `nat-b1-vm`, `nat-b2-vm` | NAT instances in the `ru-central1-a` and `ru-central1-b` availability zones that route traffic to Object Storage and back, translating the IP addresses of traffic sources and destinations.
`pub-ip-a1`, `pub-ip-a2`, `pub-ip-b1`, `pub-ip-b2` | Public IPs of the NAT instances to which the VPC cloud network translates their internal IPs.
DNS zone and A record | `storage.yandexcloud.net.` internal DNS zone in the `s3-vpc` network with a type `A` resource record that maps the `storage.yandexcloud.net` domain name to the IP address of the internal network load balancer.
`s3-bucket-<...>` | Bucket in Object Storage.
`s3-subnet-a`, `s3-subnet-b` | Cloud subnets hosting the NAT instances in the `ru-central1-a` and `ru-central1-b` availability zones.
`test-s3-vm` | Test VM used to verify access to Object Storage.
`test-s3-subnet-a` | Cloud subnet hosting the test VM.
For the cloud network hosting the resources, create the `storage.yandexcloud.net.` internal DNS zone in Cloud DNS with a type `A` resource record that maps the `storage.yandexcloud.net` domain name of Object Storage to the IP address of the internal network load balancer. With this record, traffic from the cloud resources to Object Storage is routed to the internal load balancer, which distributes the load across the NAT instances.
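If you manage such a zone by hand rather than through the solution's Terraform code, a minimal sketch with the yc CLI could look as follows. The zone name is arbitrary, the network ID and load balancer IP are placeholders, and the flag syntax is based on the `yc dns zone` command group, so verify it with `yc dns zone create --help`:

```bash
# Create an internal DNS zone visible only from the cloud network
# (placeholder network ID):
yc dns zone create \
  --name s3-private-zone \
  --zone storage.yandexcloud.net. \
  --private-visibility network-ids=<network_ID>

# Map the Object Storage domain name to the internal load balancer IP
# (placeholder address):
yc dns zone add-records \
  --name s3-private-zone \
  --record "storage.yandexcloud.net. 300 A <internal_load_balancer_IP>"
```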
To deploy the NAT instances, the solution uses the NAT instance based on Ubuntu 22.04 LTS image from Cloud Marketplace. It translates the source and destination IP addresses required to route traffic to the Object Storage public IP and back.
By placing the NAT instances in multiple availability zones, you can ensure fault-tolerant access to Object Storage. By increasing the number of NAT instances, you can scale the solution up if the workload increases. When calculating the number of NAT instances, consider the locality of traffic handling by the internal load balancer.
The Object Storage access policy allows actions with the bucket only from the public IPs of the NAT instances, so bucket access is granted only to the cloud resources using this solution; you cannot connect to the bucket directly via the public API endpoint. If required, you can disable this restriction in the Terraform configuration file.
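For illustration only, a bucket policy of this kind could look roughly like the sketch below. This is an assumption about the shape of the policy the solution's Terraform code generates; the bucket name and IP addresses are placeholders:

```bash
# Hypothetical bucket policy allowing actions only from the NAT instance IPs:
cat > policy.json <<'EOF'
{
  "Version": "2012-10-17",
  "Statement": [
    {
      "Effect": "Allow",
      "Principal": "*",
      "Action": "*",
      "Resource": [
        "arn:aws:s3:::<bucket_name>",
        "arn:aws:s3:::<bucket_name>/*"
      ],
      "Condition": {
        "IpAddress": {
          "aws:SourceIp": ["<pub-ip-a1>", "<pub-ip-a2>", "<pub-ip-b1>", "<pub-ip-b2>"]
        }
      }
    }
  ]
}
EOF

# Apply the policy via the S3-compatible API:
aws --endpoint-url=https://storage.yandexcloud.net \
    s3api put-bucket-policy --bucket <bucket_name> --policy file://policy.json
```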
Test results for NAT instance throughput
The solution was tested on a single NAT instance with the following configuration:
- Platform: Intel Ice Lake (`standard-v3`)
- Performance level: 100%
- vCPU cores: 2
- RAM: 2 GB
The average Object Storage traffic processing speed was 250 MB/s, both egress and ingress.
The test was performed using the warp tool. The `warp` command used for the test had the following parameters:
warp get \
--host storage.yandexcloud.net \
--access-key <static_key_ID> \
--secret-key <secret_key> \
--tls \
--bucket <bucket_name> \
--obj.randsize \
--concurrent 20 \
--warp-client <warp_client_IP_addresses>
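If you want to reproduce the test, note that `warp` in this mode drives remote load generators: start `warp client` on each host listed in `--warp-client` before launching the command above. The default listening port mentioned in the comment is an assumption based on warp's documentation:

```bash
# On each load-generator host, start warp in client mode; it waits for
# instructions from the coordinating warp process (by default on TCP port 7761):
warp client
```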
Tips for deployment in the production environment
- When deploying your NAT instances in multiple availability zones, set an even number of VMs to distribute them evenly across the availability zones.
- When selecting the number of NAT instances, consider the locality of traffic handling by the internal load balancer.
- Once the solution is deployed, reduce the number of NAT instances or update the list of availability zones in the `yc_availability_zones` parameter only during a pre-scheduled time window, as traffic handling may be interrupted while the changes are being applied.
- If a NAT instance demonstrates a high `CPU steal time` metric value as the Object Storage workload goes up, we recommend enabling a software-accelerated network for that NAT instance.
- By default, buckets in Object Storage can be accessed via the Yandex Cloud management console. You can revoke this permission using the `bucket_console_access = false` parameter.
- If you omit `mgmt_ip` when `bucket_private_access = true`, solution deployment using Terraform on a workstation will fail with a bucket access error.
- If you are using your own DNS server, create type `A` resource records in its settings in the following format:

  Name | Type | Value
  ---|---|---
  `storage.yandexcloud.net` | `A` | `<internal_load_balancer_IP_address>`
  `<bucket_name>.storage.yandexcloud.net` | `A` | `<internal_load_balancer_IP_address>`

- Save the `pt_key.pem` private SSH key used to connect to the NAT instances to a secure location, or recreate it separately from Terraform.
- Once the solution is deployed, SSH access to the NAT instances is disabled. To enable access to the NAT instances over SSH, add a rule for incoming SSH traffic (`TCP/22`) to the `s3-nat-sg` security group that allows access only from specific IP addresses of admin workstations; see the sketch after this list.
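A minimal sketch of such a rule using the yc CLI, assuming the security group keeps the `s3-nat-sg` name from the solution; the workstation address is a placeholder, and the rule spec syntax may vary between CLI versions (check `yc vpc security-group update-rules --help`):

```bash
# Allow inbound SSH (TCP/22) to the NAT instances from a single admin
# workstation only (placeholder address):
yc vpc security-group update-rules s3-nat-sg \
  --add-rule "direction=ingress,port=22,protocol=tcp,v4-cidrs=[<admin_workstation_IP>/32]"
```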
Deployment plan
To deploy the solution and test it:
- Prepare your cloud.
- Prepare an environment for deploying the resources.
- Deploy the solution.
- Test the solution.
If you no longer need the resources you created, delete them.
Prepare your cloud
Sign up for Yandex Cloud and create a billing account:
- Go to the management console and log in to Yandex Cloud or create an account if you do not have one yet.
- On the Yandex Cloud Billing page, make sure you have a billing account linked and that it has the `ACTIVE` or `TRIAL_ACTIVE` status. If you do not have a billing account, create one.

If you have an active billing account, you can go to the cloud page.
Learn more about clouds and folders.
Required paid resources
The infrastructure support costs include:
- Fee for using Object Storage (see Yandex Object Storage pricing).
- Fee for using a network load balancer (see Network Load Balancer pricing).
- Fee for continuously running VMs (see Yandex Compute Cloud pricing).
- Fee for using public IP addresses and outgoing traffic (see Yandex Virtual Private Cloud pricing).
Prepare an environment for deploying the resources
- If you do not have the Yandex Cloud command line interface yet, install it and sign in as a user.
- Check that there is an account in the Yandex Cloud cloud with `admin` permissions for the folder the solution is being deployed in.
- Check the cloud quotas to make sure you can deploy your resources in this use case:

  Information about resources to be created

  Resource | Amount
  ---|---
  Virtual machines | 5
  VM vCPUs | 10
  VM RAM | 10 GB
  Disks | 5
  HDD size | 30 GB
  SSD size | 40 GB
  Network load balancer | 1
  Target group for the load balancer | 1
  Networks | 1¹
  Subnets | 3
  Static public IP addresses | 4
  Security groups | 1
  DNS zone | 1
  Bucket | 1
  Service accounts | 2
  Static key for the service account | 1

  ¹ If the user did not specify the ID of an existing network in `terraform.tfvars`.
Deploy the solution using Terraform
- Clone the `yandex-cloud-examples/yc-s3-private-endpoint` repository to your workstation and go to the `yc-s3-private-endpoint` folder:

  git clone https://github.com/yandex-cloud-examples/yc-s3-private-endpoint.git
  cd yc-s3-private-endpoint
- Set up an environment for authentication in Terraform (for more information, see Getting started with Terraform):

  export YC_TOKEN=$(yc iam create-token)
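Depending on how the provider block in the example is configured, you may also need to pass your cloud and folder IDs; the Yandex Cloud Terraform provider reads them from environment variables (placeholder values below):

```bash
# Optional: set these if the provider configuration does not define them.
export YC_CLOUD_ID=<cloud_ID>
export YC_FOLDER_ID=<folder_ID>
```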
- Variable parameters of the resources to create are defined in the `variables.tf` file. Insert your custom variable values into the `terraform.tfvars` file. The parameters you must update are marked in the table below; a sample file follows the table.

  Detailed information about values to set

  Parameter name | Replace with a custom value | Description | Type | Example
  ---|---|---|---|---
  `folder_id` | Yes | ID of the folder to host the solution components | `string` | `"b1gentmqf1ve********"`
  `vpc_id` | — | ID of the cloud network access to Object Storage is set up for. If omitted, a new network will be created. | `string` | `"enp48c1ndilt********"`
  `yc_availability_zones` | — | List of availability zones for deploying the NAT instances | `list(string)` | `["ru-central1-a", "ru-central1-b"]`
  `subnet_prefix_list` | — | List of prefixes of the cloud subnets to host the NAT instances (one subnet in each availability zone from the `yc_availability_zones` list; list the prefixes in the zone order: `ru-central1-a`, `ru-central1-b`, and so on) | `list(string)` | `["10.10.1.0/24", "10.10.2.0/24"]`
  `nat_instances_count` | — | Number of NAT instances to deploy. We recommend setting an even number to evenly distribute the instances across the availability zones. | `number` | `4`
  `bucket_private_access` | — | Only allow bucket access from the public IPs of the NAT instances. If `true`, access is limited. To remove the limit, set `false`. | `bool` | `true`
  `bucket_console_access` | — | Allow bucket access via the Yandex Cloud management console. If `true`, access is allowed. To deny access, set `false`. This parameter is mandatory if `bucket_private_access` is set to `true`. | `bool` | `true`
  `mgmt_ip` | Yes | Public IP of the workstation where you are deploying the infrastructure using Terraform. It is used to allow your workstation to perform actions with the bucket during deployment. This parameter is mandatory if `bucket_private_access` is set to `true`. | `string` | `"A.A.A.A"`
  `trusted_cloud_nets` | Yes | List of aggregated prefixes of the cloud subnets that Object Storage access is allowed for. It is used in the ingress rule of the security groups for the NAT instances. | `list(string)` | `["10.0.0.0/8", "192.168.0.0/16"]`
  `vm_username` | — | User name for the NAT instances and the test VM | `string` | `"admin"`
  `s3_ip` | No | Object Storage public IP address | `string` | `213.180.193.243`
  `s3_fqdn` | No | Object Storage domain name | `string` | `storage.yandexcloud.net`
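For orientation, a sample `terraform.tfvars` might look like the sketch below; every value is a placeholder to replace with your own, and optional parameters you leave out keep their defaults from `variables.tf`:

```bash
# Hypothetical example; adjust all values before deploying.
cat > terraform.tfvars <<'EOF'
folder_id             = "b1gentmqf1ve********"
yc_availability_zones = ["ru-central1-a", "ru-central1-b"]
subnet_prefix_list    = ["10.10.1.0/24", "10.10.2.0/24"]
nat_instances_count   = 4
bucket_private_access = true
bucket_console_access = true
mgmt_ip               = "<your_workstation_public_IP>"
trusted_cloud_nets    = ["10.0.0.0/8", "192.168.0.0/16"]
vm_username           = "admin"
EOF
```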
- Initialize Terraform:

  terraform init

- Check the list of cloud resources you are about to create:

  terraform plan

- Create the resources:

  terraform apply
- Once the `terraform apply` process is completed, the command line will output the information required for connecting to the test VM and running test operations with Object Storage. Later on, you can view this information by running the `terraform output` command (see the note after the table):

  Information about resources deployed

  Name | Description | Sample value
  ---|---|---
  `path_for_private_ssh_key` | File with the private key used to connect to the NAT instances and the test VM over SSH | `./pt_key.pem`
  `vm_username` | User name for the NAT instances and the test VM | `admin`
  `test_vm_password` | Password of the `admin` user for the test VM | `v3RCqU****`
  `s3_bucket_name` | Bucket name in Object Storage | `s3-bucket-<...>`
  `s3_nlb_ip_address` | IP address of the internal load balancer | `10.10.1.100`
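To print a single value from this list, pass the output name to `terraform output`; the `-raw` flag strips the surrounding quotation marks:

```bash
# For example, print the test VM password without quotation marks:
terraform output -raw test_vm_password
```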
Test the solution
- In the management console, go to the folder where the resources were created.
- Select Compute Cloud.
- Select `test-s3-vm` from the list of VM instances.
- Go to the Serial console tab.
- Click Connect.
- Enter the `admin` username and the password from the `terraform output test_vm_password` command output (without quotation marks).
- Run this command:
dig storage.yandexcloud.net
- Make sure the Object Storage domain name in the DNS server response matches the IP address of the internal load balancer. The output of the type `A` resource record looks as follows:

  ;; ANSWER SECTION:
  storage.yandexcloud.net. 300 IN A 10.10.1.100
- Get an object from the bucket in Object Storage using the AWS CLI. The bucket name is fetched from the test VM environment variable:

  aws --endpoint-url=https://storage.yandexcloud.net \
      s3 cp s3://$BUCKET/s3_test_file.txt s3_test_file.txt
Result:
download: s3://<bucket_name>/s3_test_file.txt to ./s3_test_file.txt
- You can additionally run a number of commands to test Object Storage. The bucket name is fetched from the test VM environment variable.

  Upload the downloaded test file to the bucket under a different name:

  aws --endpoint-url=https://storage.yandexcloud.net \
      s3 cp s3_test_file.txt s3://$BUCKET/textfile.txt

  Result:

  upload: ./s3_test_file.txt to s3://<bucket_name>/textfile.txt

  Get a list of objects in the bucket:

  aws --endpoint-url=https://storage.yandexcloud.net \
      s3 ls --recursive s3://$BUCKET

  Result:

  2023-08-16 18:24:05         53 s3_test_file.txt
  2023-08-16 18:41:39         53 textfile.txt

  Delete the object you uploaded to the bucket:

  aws --endpoint-url=https://storage.yandexcloud.net \
      s3 rm s3://$BUCKET/textfile.txt

  Result:

  delete: s3://<bucket_name>/textfile.txt
Delete the resources you created
To delete the resources you created using Terraform, run the `terraform destroy` command.
Warning
Terraform will permanently delete all the resources that were created while deploying the solution.