Connecting to Object Storage from Virtual Private Cloud
Tip
In addition to the solution described in this article, you can also access Object Storage from cloud network resources without internet access by using a VPC service connection. For more information, see Accessing a bucket using a service connection from VPC.
In Yandex Cloud, you connect to Yandex Object Storage via its API endpoint, whose FQDN resolves to a public IP address through DNS.
In this tutorial, we will deploy a Yandex Cloud infrastructure to set up access to Object Storage for resources that are hosted in a VPC cloud network and have no public IP addresses or internet access through a NAT gateway.
While deploying this Yandex Cloud infrastructure, we will create the following resources:
| Name | Description |
|---|---|
| `s3-vpc` | Cloud network with the resources to provide with Object Storage access. For deployment, you can also specify an existing cloud network. |
| `s3-nlb` | Internal network load balancer for Object Storage that accepts TCP traffic with destination port 443 and distributes it across the VM instances in a target group. |
| `s3-nat-group` | Load balancer target group of NAT instances. |
| `nat-a1-vm`, `nat-a2-vm`, `nat-b1-vm`, `nat-b2-vm` | NAT instances residing in the `ru-central1-a` and `ru-central1-b` availability zones that route traffic to and from Object Storage, translating the source and destination IP addresses. |
| `pub-ip-a1`, `pub-ip-a2`, `pub-ip-b1`, `pub-ip-b2` | Public IP addresses of the VMs that the VPC cloud network maps from their internal IP addresses. |
| DNS zone and A record | `storage.yandexcloud.net.` internal DNS zone in the `s3-vpc` network with a type `A` resource record mapping the `storage.yandexcloud.net` domain name to the IP address of the internal network load balancer. |
| `s3-bucket-<...>` | Bucket in Object Storage. |
| `s3-subnet-a`, `s3-subnet-b` | Cloud subnets hosting the NAT instances in the `ru-central1-a` and `ru-central1-b` availability zones. |
| `test-s3-vm` | VM used to test access to Object Storage. |
| `test-s3-subnet-a` | Cloud subnet hosting the test VM. |
For the cloud network hosting the resources, we will create, in Cloud DNS, the storage.yandexcloud.net. internal DNS zone with a type A resource record that maps the storage.yandexcloud.net domain name of Object Storage to the IP address of the internal network load balancer. This record directs traffic from your cloud resources destined for Object Storage to the internal load balancer, which in turn distributes it across the NAT instances.
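The Terraform configuration creates this zone and record for you. Purely for illustration, a roughly equivalent manual setup with the yc CLI might look like the sketch below; the zone name, network ID, and load balancer address are placeholders, and the exact flag syntax should be checked against `yc dns zone create --help`.

```bash
# Hedged sketch: a manual equivalent of what the Terraform configuration sets up.
yc dns zone create \
  --name s3-private \
  --zone storage.yandexcloud.net. \
  --private-visibility=true \
  --network-ids <cloud_network_ID>

# Point the Object Storage domain name at the internal network load balancer.
yc dns zone add-records \
  --name s3-private \
  --record "storage.yandexcloud.net. 300 A <internal_load_balancer_IP_address>"
```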
To deploy the NAT instances, use this Ubuntu 22.04 LTS-based image from Cloud Marketplace. A NAT instance translates the source and destination IP addresses to route traffic to the public IP address of Object Storage.
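The Marketplace image ships with its own preconfigured rules. As an illustration of the idea only, destination and source NAT for HTTPS traffic could be expressed with iptables roughly as follows; the actual image configuration may differ.

```bash
# Illustrative only: how DNAT/SNAT of Object Storage traffic could look with iptables.
# Rewrite the destination of incoming TCP/443 traffic to the Object Storage public IP.
iptables -t nat -A PREROUTING -p tcp --dport 443 \
  -j DNAT --to-destination 213.180.193.243

# Rewrite the source address on the way out, so return traffic from
# Object Storage comes back to this NAT instance.
iptables -t nat -A POSTROUTING -p tcp -d 213.180.193.243 --dport 443 \
  -j MASQUERADE

# Make sure the kernel forwards packets between interfaces.
sysctl -w net.ipv4.ip_forward=1
```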
By placing the NAT instances in multiple availability zones, you can ensure fault-tolerant access to Object Storage. You can scale the solution for higher workload by adding more NAT instances. Before doing that, consider the internal NLB traffic processing locality.
The Object Storage bucket policy only allows bucket operations from the public IP addresses of the NAT instances. Only the cloud resources that use this solution can access the bucket; you cannot connect to the bucket via the public API endpoint of Object Storage. If needed, you can remove this limitation in the Terraform configuration file.
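The policy itself is created by the Terraform configuration in the repository. A simplified, hypothetical sketch of what such an IP-based restriction can look like is shown below; the bucket name and addresses are placeholders, not values generated by this deployment.

```bash
# Hypothetical sketch of an IP-restricted bucket policy (AWS-compatible syntax).
cat > policy.json <<'EOF'
{
  "Version": "2012-10-17",
  "Statement": [
    {
      "Effect": "Deny",
      "Principal": "*",
      "Action": "*",
      "Resource": ["arn:aws:s3:::<bucket_name>", "arn:aws:s3:::<bucket_name>/*"],
      "Condition": {
        "NotIpAddress": {
          "aws:SourceIp": ["<pub-ip-a1>", "<pub-ip-a2>", "<pub-ip-b1>", "<pub-ip-b2>", "<mgmt_ip>"]
        }
      }
    }
  ]
}
EOF

# A policy like this could be applied manually with the AWS CLI,
# but in this solution Terraform manages it for you.
aws --endpoint-url=https://storage.yandexcloud.net \
    s3api put-bucket-policy --bucket <bucket_name> --policy file://policy.json
```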
Test results for NAT instance throughput
The solution was tested on a single NAT instance with the following configuration:
- Platform: Intel Ice Lake (`standard-v3`)
- Performance level: 100%
- vCPU cores: 2
- RAM: 2 GB
The average Object Storage traffic processing speed was 250 MB/s, both egress and ingress.
The test was performed using warp. Here is the command used for the test:

```bash
warp get \
  --host storage.yandexcloud.net \
  --access-key <static_key_ID> \
  --secret-key <secret_key> \
  --tls \
  --bucket <bucket_name> \
  --obj.randsize \
  --concurrent 20 \
  --warp-client <warp_client_IP_addresses>
```
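warp is MinIO's S3 benchmarking tool. If you want to reproduce the test, one common way to build it from source is shown below; this is an assumption about tooling, not part of the deployed solution, so check the project's README for the recommended installation method.

```bash
# One possible way to install warp, assuming a Go toolchain is available.
go install github.com/minio/warp@latest
```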
Tips for production deployment
- When deploying your NAT instances in multiple availability zones, use an even number of VMs to distribute them evenly across the availability zones.
- When selecting the number of NAT instances, consider the internal NLB traffic processing locality.
- Once the solution is deployed, reduce the number of NAT instances or update the list of availability zones in the `yc_availability_zones` parameter only during a pre-scheduled maintenance window, as applying these changes may cause interruptions in traffic processing.
- If a NAT instance shows a high `CPU steal time` metric value under increased Object Storage workload, enable a software-accelerated network for that NAT instance.
- By default, you can access buckets in Object Storage via the Yandex Cloud management console. To disable this access option, set `bucket_console_access = false` in the configuration.
- With `bucket_private_access = true`, omitting `mgmt_ip` will cause a bucket access error during the Terraform deployment from your workstation.
- If you are using your own DNS server, create the following type `A` resource records in its settings:

  | Name | Type | Value |
  |---|---|---|
  | `storage.yandexcloud.net` | `A` | `<internal_load_balancer_IP_address>` |
  | `<bucket_name>.storage.yandexcloud.net` | `A` | `<internal_load_balancer_IP_address>` |

- Save the `pt_key.pem` private SSH key for accessing the NAT instances to a secure location, or recreate it without using Terraform.
- Once the solution is deployed, SSH access to the NAT instances is disabled. To enable it, add a rule for inbound SSH traffic (`TCP/22`) to the `s3-nat-sg` security group, allowing access only from trusted IP addresses of admin workstations (see the sketch after this list).
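For example, such a rule could be added with the CLI; the security group name comes from this solution, while the CIDR is a placeholder for your admin workstation address, so verify the flag syntax against `yc vpc security-group update-rules --help`.

```bash
# Hedged sketch: allow inbound SSH to the NAT instances from a trusted admin address only.
yc vpc security-group update-rules s3-nat-sg \
  --add-rule "direction=ingress,port=22,protocol=tcp,v4-cidrs=[<admin_workstation_IP>/32]"
```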
Deployment plan
To deploy the solution and test it:
- Get your cloud ready.
- Set up an environment for deploying the resources.
- Deploy the solution.
- Test the solution.
If you no longer need the resources you created, delete them.
Get your cloud ready
Sign up for Yandex Cloud and create a billing account:
- Navigate to the management console and log in to Yandex Cloud or create a new account.
- On the Yandex Cloud Billing page, make sure you have a billing account linked and it has the `ACTIVE` or `TRIAL_ACTIVE` status. If you do not have a billing account, create one and link a cloud to it.

If you have an active billing account, you can navigate to the cloud page.
Learn more about clouds and folders here.
Required paid resources
The infrastructure support costs include:
- Fee for using Object Storage (see Yandex Object Storage pricing).
- Fee for using a network load balancer (see Network Load Balancer pricing).
- Fee for continuously running VMs (see Yandex Compute Cloud pricing).
- Fee for public IP addresses and outbound traffic (see Yandex Virtual Private Cloud pricing).
Set up an environment for deploying the resources
- If you do not have the Yandex Cloud CLI yet, install it and sign in as a user.
- Make sure you have an account in the Yandex Cloud cloud with `admin` permissions for the folder where you are deploying the solution (an optional way to check this is shown after this list).
- Check whether your cloud quotas allow you to deploy the resources for this scenario:

  Information about the number of new resources

  | Resource | Quantity |
  |---|---|
  | Virtual machines | 5 |
  | VM vCPUs | 10 |
  | VM RAM | 10 GB |
  | Disks | 5 |
  | HDD size | 30 GB |
  | SSD size | 40 GB |
  | Network load balancer | 1 |
  | Load balancer target group | 1 |
  | Networks | 1¹ |
  | Subnets | 3 |
  | Static public IP addresses | 4 |
  | Security groups | 1 |
  | DNS zone | 1 |
  | Bucket | 1 |
  | Service accounts | 2 |
  | Service account static key | 1 |

  ¹ If you did not specify the ID of an existing network in `terraform.tfvars`.
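As a quick optional sanity check before deploying, you can confirm which folder your CLI profile points to and review its access bindings; this is an illustration using standard yc commands, not a step from the repository.

```bash
# Show the active CLI profile, including the default cloud and folder IDs.
yc config list

# List access bindings for the target folder to confirm your account has the admin role.
yc resource-manager folder list-access-bindings <folder_name_or_ID>
```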
Deploy the solution using Terraform
- Clone the `yandex-cloud-examples/yc-s3-private-endpoint` repository to your workstation and go to the `yc-s3-private-endpoint` directory:

  ```bash
  git clone https://github.com/yandex-cloud-examples/yc-s3-private-endpoint.git
  cd yc-s3-private-endpoint
  ```

- Set up an environment for Terraform authentication (learn more in Getting started with Terraform):

  ```bash
  export YC_TOKEN=$(yc iam create-token)
  ```
- The `variables.tf` file defines the variable parameters of the resources to create. Set your custom variable values in the `terraform.tfvars` file. Refer to the table below to see which parameters you need to update. For an illustrative `terraform.tfvars` example, see the sketch after this list.

  Detailed information about values to set

  | Parameter name | Replace with a custom value | Description | Type | Example |
  |---|---|---|---|---|
  | `folder_id` | Yes | ID of the folder to host the solution components | `string` | `"b1gentmqf1ve********"` |
  | `vpc_id` | — | ID of your cloud network to provide with Object Storage access. If left empty, the system will create a new network. | `string` | `"enp48c1ndilt********"` |
  | `yc_availability_zones` | — | List of the availability zones for deploying NAT instances | `list(string)` | `["ru-central1-a", "ru-central1-b"]` |
  | `subnet_prefix_list` | — | List of prefixes of cloud subnets to host the NAT instances (one subnet per availability zone from the `yc_availability_zones` list, in the following order: `ru-central1-a`, `ru-central1-b`, and so on) | `list(string)` | `["10.10.1.0/24", "10.10.2.0/24"]` |
  | `nat_instances_count` | — | Number of NAT instances to deploy. We recommend an even number to distribute the instances evenly across the availability zones. | `number` | `4` |
  | `bucket_private_access` | — | Limits bucket access to the public IP addresses of the NAT instances. Set it to `true` to apply this limitation or `false` to remove it. | `bool` | `true` |
  | `bucket_console_access` | — | Allows or denies bucket access via the Yandex Cloud management console. Set it to `true` to enable this access option or `false` to disable it. Make sure to specify it if `bucket_private_access` is set to `true`. | `bool` | `true` |
  | `mgmt_ip` | Yes | Public IP address of the workstation where you are deploying the infrastructure using Terraform. It allows bucket operations from your local machine during the Terraform deployment. Make sure to specify this parameter if `bucket_private_access` is set to `true`. | `string` | `"A.A.A.A"` |
  | `trusted_cloud_nets` | Yes | List of aggregated prefixes of cloud subnets allowed to access Object Storage. It is used in the inbound traffic rule of the NAT instance security groups. | `list(string)` | `["10.0.0.0/8", "192.168.0.0/16"]` |
  | `vm_username` | — | NAT instance and test VM username | `string` | `"admin"` |
  | `s3_ip` | No | Object Storage public IP address | `string` | `213.180.193.243` |
  | `s3_fqdn` | No | Object Storage domain name | `string` | `storage.yandexcloud.net` |
- Initialize Terraform:

  ```bash
  terraform init
  ```

- Check the list of new cloud resources:

  ```bash
  terraform plan
  ```

- Create the resources:

  ```bash
  terraform apply
  ```
- Once the `terraform apply` process is complete, the command line will show the information you need to connect to the test VM and run Object Storage tests. Later on, you can view this info by running the `terraform output` command:

  Information about the deployed resources

  | Name | Description | Value (example) |
  |---|---|---|
  | `path_for_private_ssh_key` | File with a private key for SSH access to the NAT instances and the test VM | `./pt_key.pem` |
  | `vm_username` | NAT instance and test VM username | `admin` |
  | `test_vm_password` | Test VM `admin` password | `v3RCqU****` |
  | `s3_bucket_name` | Object Storage bucket name | `s3-bucket-<...>` |
  | `s3_nlb_ip_address` | Internal load balancer IP address | `10.10.1.100` |
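For reference, a `terraform.tfvars` file populated per the parameter table above might look like the sketch below; every value is an illustrative placeholder, not output of this guide.

```bash
# Illustrative terraform.tfvars contents; replace all placeholders with your own values.
cat > terraform.tfvars <<'EOF'
folder_id             = "<folder_ID>"
vpc_id                = ""            # leave empty to create the s3-vpc network
yc_availability_zones = ["ru-central1-a", "ru-central1-b"]
subnet_prefix_list    = ["10.10.1.0/24", "10.10.2.0/24"]
nat_instances_count   = 4
bucket_private_access = true
bucket_console_access = true
mgmt_ip               = "<workstation_public_IP>"
trusted_cloud_nets    = ["10.0.0.0/8", "192.168.0.0/16"]
vm_username           = "admin"
EOF
```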
Test the solution
- In the management console, navigate to the folder with the resources you created.
- Select Compute Cloud.
- Select `test-s3-vm` from the list of VMs.
- Navigate to the Serial console tab.
- Click Connect.
- Enter the `admin` username and the password from the `terraform output test_vm_password` command output (without quotation marks).
- Run this command:

  ```bash
  dig storage.yandexcloud.net
  ```

- Check the DNS server response and make sure the Object Storage domain name resolves to the IP address of the internal load balancer. The command output will show type `A` resource records as follows:

  ```text
  ;; ANSWER SECTION:
  storage.yandexcloud.net. 300 IN A 10.10.1.100
  ```

- Get an object from the bucket in Object Storage using the AWS CLI. The bucket name is fetched from the test VM environment variable.

  ```bash
  aws --endpoint-url=https://storage.yandexcloud.net \
      s3 cp s3://$BUCKET/s3_test_file.txt s3_test_file.txt
  ```

  Result:

  ```text
  download: s3://<bucket_name>/s3_test_file.txt to ./s3_test_file.txt
  ```

- You can additionally run a number of commands to test Object Storage. The bucket name is fetched from the test VM environment variable. An optional negative check of the bucket policy is shown after this list.

  Upload the downloaded test file to the bucket under a different name:

  ```bash
  aws --endpoint-url=https://storage.yandexcloud.net \
      s3 cp s3_test_file.txt s3://$BUCKET/textfile.txt
  ```

  Result:

  ```text
  upload: ./s3_test_file.txt to s3://<bucket_name>/textfile.txt
  ```

  Get the list of objects in the bucket:

  ```bash
  aws --endpoint-url=https://storage.yandexcloud.net \
      s3 ls --recursive s3://$BUCKET
  ```

  Result:

  ```text
  2023-08-16 18:24:05         53 s3_test_file.txt
  2023-08-16 18:41:39         53 textfile.txt
  ```

  Delete the object you uploaded to the bucket:

  ```bash
  aws --endpoint-url=https://storage.yandexcloud.net \
      s3 rm s3://$BUCKET/textfile.txt
  ```

  Result:

  ```text
  delete: s3://<bucket_name>/textfile.txt
  ```
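Optionally, if you set `bucket_private_access = true`, you can also confirm that direct public access to the bucket is rejected by running a request from a machine outside the trusted networks (for example, your laptop, if its address is not listed in `mgmt_ip`). This check is an assumption based on the bucket policy described earlier, not part of the repository's test procedure.

```bash
# Expected to fail with an AccessDenied error when run from an address that is
# neither a NAT instance public IP nor the mgmt_ip address.
aws --endpoint-url=https://storage.yandexcloud.net \
    s3 ls s3://<bucket_name>
```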
Delete the resources you created
To delete the resources created with Terraform:
- In the terminal window, go to the directory containing the infrastructure plan.

  Warning

  Make sure the directory has no Terraform manifests with the resources you want to keep. Terraform deletes all resources that were created using the manifests in the current directory.

- Delete the resources:

  - Run this command:

    ```bash
    terraform destroy
    ```

  - Confirm deleting the resources and wait for the operation to complete.

  All the resources described in the Terraform manifests will be deleted. You can optionally verify the cleanup as shown after this list.
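As an optional follow-up, and purely as an illustration with standard yc commands, you can make sure nothing created by this guide is left behind:

```bash
# Optional check: these lists should no longer contain the resources from this guide.
yc compute instance list
yc vpc network list
yc dns zone list
```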