Pre-configuring a Yandex MPP Analytics for PostgreSQL cluster connection
A Yandex MPP Analytics for PostgreSQL cluster can be used to deploy a Greenplum® or Apache Cloudberry™ DBMS. Both DBMSs are based on PostgreSQL, so you need the same tools as for PostgreSQL to connect to them.
You can only connect to a Yandex MPP Analytics for PostgreSQL cluster through the primary master host. To identify host roles, get a list of hosts in the cluster.
You can connect to a cluster:
- From Yandex Cloud virtual machines located in the same cloud network. SSL is not required when connecting from these virtual machines to hosts without public access.
- Over the internet, if you configured public access for your cluster. You can connect over the internet in the following ways:
  - Use an SSL connection.
  - Use IAM authentication.
Configuring security groups
You can assign one or more security groups to a Yandex MPP Analytics for PostgreSQL cluster. To connect to a cluster, security groups must include rules allowing traffic on port 6432 from certain IP addresses or other security groups.
Note
A security group assigned to a cluster controls traffic between the cluster and other cloud or external resources. You do not need to configure interaction between cluster hosts, as it is controlled by a separate system security group.
Rule settings depend on the connection method you select:
Over the internet:
- For incoming traffic:
  - Port range: 6432.
  - Protocol: TCP.
  - Source: CIDR.
  - CIDR blocks: Range of addresses to connect from.
- For outgoing traffic:
  - Port range: 0-65535.
  - Protocol: Any.
  - Destination name: CIDR.
  - CIDR blocks: 0.0.0.0/0.
  This rule enables Yandex MPP Analytics for PostgreSQL to use external data sources, e.g., PXF or GPFDIST.
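The incoming rule for port 6432 could also be added from the command line. The following is a minimal sketch, assuming the Yandex Cloud CLI (`yc`) is installed and that its `vpc security-group update-rules` command accepts an `--add-rule` specification in this form. The security group name `mpp-sg`, the CIDR `203.0.113.0/24`, and the helper `ingress_rule` are hypothetical and only serve the illustration. The script prints the command instead of running it, so you can review it first.

```shell
#!/usr/bin/env bash
# Sketch: compose an ingress rule for TCP port 6432 (assumed yc rule syntax).

# Build the rule specification for a given source CIDR.
ingress_rule() {
  local cidr="$1"
  printf 'direction=ingress,port=6432,protocol=tcp,v4-cidrs=[%s]' "$cidr"
}

SG_NAME="mpp-sg"              # hypothetical security group name
CLIENT_CIDR="203.0.113.0/24"  # placeholder: range of addresses to connect from

# Dry run: print the command rather than executing it.
echo yc vpc security-group update-rules "$SG_NAME" \
  --add-rule "$(ingress_rule "$CLIENT_CIDR")"
```

Keeping the source range as narrow as possible limits who can reach the master host on port 6432.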
From a Yandex Cloud VM:
- Add the following rules to the cluster security group:
  - For incoming traffic:
    - Port range: 6432.
    - Protocol: TCP.
    - Source: Security group.
    - Security group: If your cluster and VM share the same security group, select Current. Otherwise, specify the VM security group.
  - For outgoing traffic:
    - Port range: 0-65535.
    - Protocol: Any.
    - Destination name: CIDR.
    - CIDR blocks: 0.0.0.0/0.
    This rule enables Yandex MPP Analytics for PostgreSQL to use external data sources, e.g., PXF or GPFDIST.
- Configure the VM security group to allow connections to the VM as well as traffic between the VM and the cluster hosts:
  - For incoming traffic:
    - Port range: 22.
    - Protocol: TCP.
    - Source: CIDR.
    - CIDR blocks: Range of addresses to connect from.
    This rule allows VM connections over SSH.
  - For outgoing traffic:
    - Port range: 0-65535.
    - Protocol: Any.
    - Destination name: CIDR.
    - CIDR blocks: 0.0.0.0/0.
    This rule permits all outgoing traffic, allowing you to install any necessary certificates and tools on your VM.
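Once the rules are in place, you may want to confirm from the VM that port 6432 on the master host is reachable. The following is a rough sketch that relies on bash's built-in `/dev/tcp` pseudo-device and the coreutils `timeout` command. The function name `check_port` and the host name `master.invalid` are placeholders, not values from this guide.

```shell
#!/usr/bin/env bash
# Sketch: test whether a TCP port accepts connections, using bash's /dev/tcp.

check_port() {
  local host="$1" port="$2"
  # Try to open a TCP connection in a child bash; give up after 3 seconds.
  # Host and port are passed as arguments ($0 and $1) to avoid quoting issues.
  timeout 3 bash -c 'exec 3<>"/dev/tcp/$0/$1"' "$host" "$port" 2>/dev/null
}

MASTER_FQDN="${MASTER_FQDN:-master.invalid}"  # placeholder: primary master host FQDN

if check_port "$MASTER_FQDN" 6432; then
  echo "port 6432 is reachable"
else
  echo "port 6432 is not reachable; review the security group rules"
fi
```

A successful check only shows that the TCP port is open; it does not validate credentials or the SSL setup.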
Security groups for Yandex Managed Service for Trino
The connector uses the GPFDIST protocol for connection to Managed Service for Trino:
- Managed Service for Trino coordinators and workers send queries to the Yandex MPP Analytics for PostgreSQL master over TCP port 6432.
- Yandex MPP Analytics for PostgreSQL segments forward data to Managed Service for Trino workers over the GPFDIST TCP port, e.g., 31111.
To ensure a secure connection to Managed Service for Trino, we recommend that you configure security groups in Yandex MPP Analytics for PostgreSQL and, optionally, in Managed Service for Trino.
If Yandex MPP Analytics for PostgreSQL interacts with other clusters or entities inside the user network, you need to separately configure security group rules for any such clusters or entities.
Yandex MPP Analytics for PostgreSQL side setup
For incoming traffic:
- Rule for internal Yandex MPP Analytics for PostgreSQL cluster traffic:
  - Port range: 0-65535.
  - Protocol: Any.
  - Source: Security group.
  - Security group: Current.
- Rule for connections from a Managed Service for Trino cluster:
  - Port range: 6432.
  - Protocol: TCP.
  - Source: Security group.
  - Security group: Specify the Managed Service for Trino cluster security group.
For outgoing traffic:
- Rule for internal Yandex MPP Analytics for PostgreSQL cluster traffic:
  - Port range: 0-65535.
  - Protocol: Any.
  - Destination name: Security group.
  - Security group: Current.
- Rule for connections to a Managed Service for Trino cluster:
  - Port range: 30078-30085.
  - Protocol: TCP.
  - Destination name: Security group.
  - Security group: Specify the Managed Service for Trino cluster security group.
Managed Service for Trino side setup
To configure security group rules in Managed Service for Trino, invert the Yandex MPP Analytics for PostgreSQL rule settings. Setting up rules for a Managed Service for Trino cluster is optional, but this provides added security for your cluster.
For incoming traffic:
- Rule for receiving data from Yandex MPP Analytics for PostgreSQL segments:
  - Port range: 30078-30085.
  - Protocol: TCP.
  - Source: Security group.
  - Security group: Specify the Yandex MPP Analytics for PostgreSQL cluster security group.
For outgoing traffic:
- Rule for connections to a Yandex MPP Analytics for PostgreSQL master:
  - Port range: 6432.
  - Protocol: TCP.
  - Destination name: Security group.
  - Security group: Specify the Yandex MPP Analytics for PostgreSQL cluster security group.
Obtaining an SSL certificate
To use an SSL connection, get a certificate.
On Linux or macOS:
mkdir -p ~/.postgresql && \
wget "https://storage.yandexcloud.net/cloud-certs/CA.pem" \
     --output-document ~/.postgresql/root.crt && \
chmod 0655 ~/.postgresql/root.crt
The certificate will be saved to the ~/.postgresql/root.crt file.
On Windows (PowerShell):
mkdir $HOME\.postgresql; curl.exe -o $HOME\.postgresql\root.crt https://storage.yandexcloud.net/cloud-certs/CA.pem
The certificate will be saved to the $HOME\.postgresql\root.crt file.
Corporate policies and antivirus software can block the download of certificates. For more information, see FAQ.
To use graphical IDEs, save the certificate to a local folder and specify its path in the connection settings.
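With the certificate saved, a connection from the command line might look like the sketch below. It assumes the `psql` client is installed. The helper `build_conninfo`, the host name `master.example.internal`, the database `postgres`, and the user `user1` are placeholders rather than values from this guide. The script prints the command instead of running it.

```shell
#!/usr/bin/env bash
# Sketch: build a libpq connection string that verifies the server certificate.

build_conninfo() {
  local host="$1" db="$2" user="$3"
  # sslmode=verify-full makes libpq check the server certificate and host name;
  # on Linux and macOS the root certificate is read from ~/.postgresql/root.crt
  # by default.
  printf 'host=%s port=6432 sslmode=verify-full dbname=%s user=%s' \
    "$host" "$db" "$user"
}

# Placeholders: substitute the primary master host FQDN, database, and user.
CONNINFO="$(build_conninfo "master.example.internal" "postgres" "user1")"

# Dry run: print the command for review; remove "echo" to actually connect.
echo psql "\"$CONNINFO\""
```

For a host without public access, reached from a VM in the same cloud network, `sslmode=disable` could be used instead, since SSL is not required in that case.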
What's next
- Get the FQDN of the host you want to connect to.
- Connect to the cluster from a graphical IDE, pgAdmin 4 or Docker container.
- Integrate the cluster connection into your application code.
Greenplum® and Greenplum Database® are registered trademarks or trademarks of Broadcom Inc. in the United States and/or other countries.
Apache® and Apache Cloudberry™ are registered trademarks or trademarks of the Apache Software Foundation in the United States and/or other countries.