Creating a Greenplum® cluster
A Yandex MPP Analytics for PostgreSQL cluster consists of master hosts that get client queries and segment hosts that process and store data.
For more information, see Resource relationships.
Creating a cluster
To create a Yandex MPP Analytics for PostgreSQL cluster, you need the vpc.user and managed-greenplum.editor roles or higher. For more information on assigning roles, see this Identity and Access Management guide.
To create a Yandex MPP Analytics for PostgreSQL cluster:
-
In the management console, select the folder where you want to create a database cluster.
-
Select Yandex MPP Analytics for PostgreSQL.
-
Click Create cluster.
-
Enter a name for the cluster. It must be unique within the folder.
-
Optionally, enter a description for the cluster.
-
Select the environment where you want to create your cluster (you cannot change the environment once the cluster is created):
- PRODUCTION: For stable versions of your apps.
- PRESTABLE: For testing purposes. The prestable environment is similar to the production environment and likewise covered by an SLA, but it is the first to get new features, improvements, and bug fixes. In the prestable environment, you can test the new versions for compatibility with your application.
-
Select the Greenplum® version.
-
Optionally, select groups of dedicated hosts to place master hosts or segment hosts on dedicated hosts. You can assign groups to one of the two Greenplum® host types or to both of them at once.
You must first create a group of dedicated hosts in Yandex Compute Cloud.
You cannot edit this setting after you create a cluster.
If you use dedicated hosts, the cluster cost is the sum of the Yandex Compute Cloud charge for computing resources and the Yandex MPP Analytics for PostgreSQL markup.
-
Under Network settings, select:
-
Cloud network to host the cluster.
-
Security groups for the cluster's network traffic. You may need to additionally set up security groups to be able to connect to the cluster.
-
Availability zone and subnet to host the cluster. To create a new subnet, click Create subnet in the list of subnets.
Warning
You will not be able to change the availability zone selected for cluster deployment.
For clusters with hosts residing in the ru-central1-d availability zone, local SSD storage is not available if using Intel Cascade Lake.
-
The Public access option to enable connecting to the cluster from the internet.
-
-
Optionally, enable Hybrid storage.
You cannot disable hybrid storage after you save your cluster settings.
When hybrid storage is enabled, you can use the Yezzey extension to move part of your AO and AOCO tables from the cluster storage to a cold storage, and vice versa.
Cold storage is a convenient option if you need to store your table data for a long time without using it much. This will make data storage less costly.
Note
This feature is at the Preview stage and is free of charge.
-
Specify the admin user credentials. This special user is required for managing the cluster and cannot be deleted. For more information, see Users and roles.
-
Username may contain Latin letters, numbers, hyphens, and underscores, but cannot start with a hyphen. It must be from 1 to 32 characters long.
Note
Such names as admin, gpadmin, mdb_admin, mdb_replication, monitor, none, postgres, public, and repl are reserved for Yandex MPP Analytics for PostgreSQL. You cannot create users with these names.
-
The Password must be from 8 to 128 characters long.
-
-
Specify additional cluster settings, if required:
-
Backup start time (UTC): Time interval during which the cluster backup starts. Time is specified in 24-hour UTC format. The default time is 22:00 - 23:00 UTC.
-
Maintenance window: Maintenance window settings:
- To enable maintenance at any time, select arbitrary (default).
- To specify the preferred maintenance start time, select by schedule and specify the desired day of the week and UTC hour. For example, you can choose a time when the cluster is least loaded.
Maintenance operations are carried out both on enabled and disabled clusters. They may include updating the DBMS, applying patches, and so on.
-
Service account: Select an existing service account for accessing Yandex Cloud services or create a new one.
-
Write logs: Enables logging of cluster operations. You pay for log storage according to the Yandex Cloud Logging pricing policy. For logging, assign the logging.writer role to the service account you selected.
If you enable this option, configure the logging settings:
-
Specify the logging destination:
- Folder: Log to the default log group for the selected folder.
- Group: Log either to a new log group or one selected from the list.
-
Select the logs you need:
- Command Center logs: Enables Command Center logs.
- Greenplum logs: Enables Greenplum® logging. Use Log min messages under DBMS settings to specify the logging level.
-
-
DataLens access: Allows you to analyze cluster data in Yandex DataLens.
-
The Yandex Query access option enables you to run YQL queries from Yandex Query to a managed database in Yandex MPP Analytics for PostgreSQL.
-
WebSQL access: This option allows sending queries to cluster databases using Yandex WebSQL.
-
Deletion protection: Manages cluster protection against accidental deletion.
Even with deletion protection enabled, one can still connect to the cluster manually and delete the data.
-
-
Optionally, configure the operating mode and connection pooler settings under Connection pooler:
- Mode: SESSION (session mode) or TRANSACTION (transaction mode, default).
- Size: Maximum number of client connections. The default value is 0 (not limited).
- Client Idle Timeout: Idle timeout for a client connection (in seconds). Default: 28800.
-
Optionally, under Managing background processes, edit the settings for routine maintenance operations:
- Start time (UTC): VACUUM start time. The default value is 19:00 UTC. Once the VACUUM operation is completed, the ANALYZE operation starts.
- VACUUM timeout: Maximum VACUUM execution time, in seconds. Valid values: from 7,200 to 86,399, with 36,000 by default. As soon as this period expires, VACUUM will be forced to terminate.
- ANALYZE timeout: Maximum ANALYZE execution time, in seconds. Valid values: from 7,200 to 86,399, with 36,000 by default. As soon as this period expires, the ANALYZE operation will be forced to terminate.
The combined VACUUM and ANALYZE execution time may not exceed 24 hours.
-
Specify the master host properties on the Master tab. For the recommended configuration, see Calculating the cluster configuration.
-
Host class: Defines the technical properties of the VMs on which the cluster's master hosts will be deployed.
-
Under Storage, select the disk type and specify its size. The available disk types depend on the selected host class.
Warning
- You cannot change disk type after you create a cluster.
- You cannot decrease the storage size.
- While resizing the storage, cluster hosts will be unavailable.
-
-
Specify the properties of segment hosts on the Segment tab. For the recommended configuration, see Calculating the cluster configuration.
-
Number of segment hosts.
-
Number of segments per host. The maximum value of this setting depends on the host class.
The segment host class and the number of segments per host affect the maximum amount of memory allocated to each Greenplum® server process. If you select a host class with small RAM and specify a large number of segments, an error may occur.
-
Host class: Defines the technical properties of the VMs on which the cluster's segment hosts will be deployed.
-
Under Storage, select the disk type and specify its size. The available disk types depend on the selected host class.
Warning
- You cannot change disk type after you create a cluster.
- You cannot decrease the storage size.
- While resizing the storage, cluster hosts will be unavailable.
- Select the storage size.
-
-
If required, configure the cluster-level DBMS settings.
-
Click Create.
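The admin username and password rules from the procedure above can be pre-checked in a script before you submit them. Below is a minimal bash sketch under that assumption: the function names validate_gp_username and validate_gp_password are hypothetical helpers, not part of any Yandex Cloud tool, and the rules are simply copied from this guide.

```shell
#!/usr/bin/env bash
# Hypothetical helpers: pre-check admin credentials against the rules in this guide.

# Names this guide lists as reserved for the service.
RESERVED_NAMES="admin gpadmin mdb_admin mdb_replication monitor none postgres public repl"

validate_gp_username() {
  local LC_ALL=C            # keep [A-Za-z] limited to Latin letters
  local name="$1" reserved
  # 1 to 32 characters: Latin letters, digits, hyphens, underscores; no leading hyphen.
  local re='^[A-Za-z0-9_][A-Za-z0-9_-]{0,31}$'
  [[ "$name" =~ $re ]] || return 1
  # Reject reserved names.
  for reserved in $RESERVED_NAMES; do
    [[ "$name" == "$reserved" ]] && return 1
  done
  return 0
}

validate_gp_password() {
  local len=${#1}
  # 8 to 128 characters.
  (( len >= 8 && len <= 128 ))
}
```

For example, `validate_gp_username user1 && validate_gp_password user1user1` succeeds, while `validate_gp_username gpadmin` fails because the name is reserved.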
If you do not have the Yandex Cloud CLI installed yet, install and initialize it.
By default, the CLI uses the folder specified when creating the profile. To change the default folder, use the yc config set folder-id <folder_ID> command. You can also set a different folder for any specific command using the --folder-name or --folder-id parameter.
To create a Yandex MPP Analytics for PostgreSQL cluster:
-
Check whether the folder has any subnets for the cluster hosts:
yc vpc subnet list
If your folder has no subnets, create them in VPC.
-
View the description of the CLI command for creating a cluster:
yc managed-greenplum cluster create --help
-
In this command, specify the cluster properties (our example does not use all available parameters):
yc managed-greenplum cluster create <cluster_name> \
  --greenplum-version=<Greenplum_version> \
  --environment=<environment> \
  --network-name=<network_name> \
  --user-name=<username> \
  --user-password=<user_password> \
  --master-config resource-id=<host_class>,`
                 `disk-size=<storage_size_in_GB>,`
                 `disk-type=<network-hdd|network-ssd|network-ssd-nonreplicated|local-ssd> \
  --segment-config resource-id=<host_class>,`
                  `disk-size=<storage_size_in_GB>,`
                  `disk-type=<network-ssd-nonreplicated|local-ssd> \
  --zone-id=<availability_zone> \
  --subnet-id=<subnet_ID> \
  --assign-public-ip=<enable_public_access_to_cluster_hosts> \
  --security-group-ids=<list_of_security_group_IDs> \
  --deletion-protection
Note
The cluster name must be unique within a folder. It may contain Latin letters, numbers, hyphens, and underscores. The name may be up to 63 characters long.
Where:
- --greenplum-version: Greenplum® version, 6.19.
- --environment: Environment:
  - PRODUCTION: For stable versions of your apps.
  - PRESTABLE: For testing purposes. The prestable environment is similar to the production environment and likewise covered by an SLA, but it is the first to get new features, improvements, and bug fixes. In the prestable environment, you can test the new versions for compatibility with your application.
- --network-name: Network name.
- --user-name: Username. It may contain Latin letters, numbers, hyphens, and underscores, and must start with a letter, number, or underscore. It must be from 1 to 32 characters long.
- --user-password: Password. It must be from 8 to 128 characters long.
- --master-config and --segment-config: Master and segment host configuration:
  - resource-id: Host class.
    The segment host class and the number of segments per host affect the maximum amount of memory allocated to each Greenplum® server process. If you select a host class with small RAM and specify a large number of segments, an error may occur.
  - disk-size: Storage size in GB.
  - disk-type: Disk type:
    - network-hdd (for master hosts only)
    - network-ssd (for master hosts only)
    - local-ssd
    - network-ssd-nonreplicated
- --zone-id: Availability zone.
- --subnet-id: Subnet ID. You need to specify the ID if the selected availability zone has two or more subnets.
- --assign-public-ip: Flag used if public access to the hosts is required, true or false.
- --security-group-ids: List of security group IDs.
- --deletion-protection: Cluster protection from accidental deletion, true or false.
  Even with deletion protection enabled, one can still connect to the cluster manually and delete the data.
-
To set the start time for the backup, provide the required value in HH:MM:SS format under --backup-window-start:
yc managed-greenplum cluster create <cluster_name> \
  ...
  --backup-window-start=<backup_start_time>
-
Optionally, to create a cluster based on dedicated host groups, specify their IDs as a comma-separated list in the --master-host-group-ids and --segment-host-group-ids parameters:
yc managed-greenplum cluster create <cluster_name> \
  ...
  --master-host-group-ids=<IDs_of_dedicated_host_groups_for_master_hosts> \
  --segment-host-group-ids=<IDs_of_dedicated_host_groups_for_segment_hosts>
You can assign groups to one of the two Greenplum® host types or to both of them at once.
You must first create a group of dedicated hosts in Yandex Compute Cloud.
You cannot edit this setting after you create a cluster.
If you use dedicated hosts, the cluster cost is the sum of the Yandex Compute Cloud charge for computing resources and the Yandex MPP Analytics for PostgreSQL markup.
-
To set up a maintenance window (including for disabled clusters), provide the relevant value in the --maintenance-window parameter when creating your cluster:
yc managed-greenplum cluster create <cluster_name> \
  ...
  --maintenance-window type=<maintenance_type>,`
                      `day=<day_of_week>,`
                      `hour=<hour>
Where type is the maintenance type:
- anytime: At any time (default).
- weekly: On a schedule. For this value, also specify the following:
  - day: Day of week, i.e., MON, TUE, WED, THU, FRI, SAT, or SUN.
  - hour: Hour of day (UTC), from 1 to 24.
-
To allow access to the cluster from different services, provide the true value in the relevant parameters when creating the cluster:
yc managed-greenplum cluster create <cluster_name> \
  ...
  --datalens-access=<allow_access_from_DataLens> \
  --yandexquery-access=<allow_access_from_Yandex_Query> \
  --websql-access=<allow_access_from_WebSQL>
Available services:
- --datalens-access: Yandex DataLens
- --yandexquery-access: Yandex Query
- --websql-access: Yandex WebSQL
-
To enable transferring logs to Yandex Cloud Logging, specify the following parameters when creating the cluster:
yc managed-greenplum cluster create <cluster_name> \
  ...
  --service-account <service_account_ID> \
  --log-enabled \
  --log-command-center-enabled \
  --log-greenplum-enabled \
  --log-pooler-enabled \
  --log-folder-id <folder_ID>
Where:
- --service-account: Service account ID.
- --log-enabled: Enables log transfer. Required for other flags responsible for transferring specific logs, e.g., --log-greenplum-enabled.
- --log-command-center-enabled: Transferring Command Center logs.
- --log-greenplum-enabled: Transferring Greenplum® logs.
- --log-pooler-enabled: Transferring connection pooler logs.
- --log-folder-id: Specify the ID of the folder whose log group you want to use.
- --log-group-id: ID of the log group to write logs to.
Specify either --log-folder-id or --log-group-id.
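The value of --maintenance-window described above is a small comma-separated string, so it can be assembled and checked before it reaches the CLI. The bash sketch below assumes only the type, day, and hour values listed in this guide; the function name maintenance_window_arg is a hypothetical helper.

```shell
#!/usr/bin/env bash
# Hypothetical helper: build the value for --maintenance-window and
# reject days or hours outside the ranges listed in this guide.
maintenance_window_arg() {
  local type="$1" day="${2:-}" hour="${3:-}"
  local re='^[0-9]+$'
  case "$type" in
    anytime)
      echo "type=anytime"
      ;;
    weekly)
      case "$day" in
        MON|TUE|WED|THU|FRI|SAT|SUN) ;;
        *) echo "invalid day: $day" >&2; return 1 ;;
      esac
      # 10# forces base 10 so that values like 08 are not read as octal.
      if ! [[ "$hour" =~ $re ]] || (( 10#$hour < 1 || 10#$hour > 24 )); then
        echo "invalid hour: $hour" >&2
        return 1
      fi
      echo "type=weekly,day=${day},hour=$((10#$hour))"
      ;;
    *)
      echo "invalid type: $type" >&2
      return 1
      ;;
  esac
}
```

You could then pass the result as `--maintenance-window "$(maintenance_window_arg weekly MON 3)"` in the create command.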
-
With Terraform
Terraform is distributed under the Business Source License
For more information about the provider resources, see the relevant documentation on the Terraform
To create a Yandex MPP Analytics for PostgreSQL cluster:
-
In the command line, navigate to the directory that contains the Terraform configuration files with the infrastructure plan. If there is no such directory, create one.
-
If you do not have Terraform yet, install it and configure the Yandex Cloud provider.
-
Create a configuration file describing the cloud network and subnets.
The cluster is hosted on a cloud network. If you already have a suitable network, you do not need to describe it again.
Cluster hosts are located on the selected cloud network's subnets. If you already have suitable subnets, you do not need to describe them again.
Below is a sample structure of a configuration file describing a single-subnet cloud network:
resource "yandex_vpc_network" "<network_name_in_Terraform>" { name = "<network_name>" }

resource "yandex_vpc_subnet" "<subnet_name_in_Terraform>" {
  name           = "<subnet_name>"
  zone           = "<availability_zone>"
  network_id     = yandex_vpc_network.<network_name_in_Terraform>.id
  v4_cidr_blocks = ["<subnet>"]
}
-
Create a configuration file with a description of the cluster and its hosts.
Here is an example of the configuration file structure:
resource "yandex_mdb_greenplum_cluster" "<cluster_name_in_Terraform>" {
  name                = "<cluster_name>"
  environment         = "<environment>"
  network_id          = yandex_vpc_network.<network_name_in_Terraform>.id
  zone                = "<availability_zone>"
  subnet_id           = yandex_vpc_subnet.<subnet_name_in_Terraform>.id
  assign_public_ip    = <enable_public_access_to_cluster_hosts>
  deletion_protection = <protect_cluster_from_deletion>
  version             = "<Greenplum_version>"
  master_host_count   = <number_of_master_hosts>
  segment_host_count  = <number_of_segment_hosts>
  segment_in_host     = <number_of_segments_per_host>

  master_subcluster {
    resources {
      resource_preset_id = "<host_class>"
      disk_size          = <storage_size_in_GB>
      disk_type_id       = "<disk_type>"
    }
  }

  segment_subcluster {
    resources {
      resource_preset_id = "<host_class>"
      disk_size          = <storage_size_in_GB>
      disk_type_id       = "<disk_type>"
    }
  }

  access {
    data_lens    = <allow_access_from_DataLens>
    yandex_query = <allow_access_from_Yandex_Query>
  }

  user_name     = "<username>"
  user_password = "<password>"

  security_group_ids = ["<list_of_security_group_IDs>"]
}
Where:
- assign_public_ip: Public access to cluster hosts, true or false.
- deletion_protection: Cluster protection from accidental deletion, true or false.
  Even with deletion protection enabled, one can still connect to the cluster manually and delete the data.
- version: Greenplum® version.
- master_host_count: Number of master hosts, 2.
- segment_host_count: Number of segment hosts, between 2 and 32.
- segment_in_host: Number of segments per host. The maximum value of this setting depends on the host class.
  The segment host class and the number of segments per host affect the maximum amount of memory allocated to each Greenplum® server process. If you select a host class with small RAM and specify a large number of segments, an error may occur.
- access.data_lens: Access to the cluster from Yandex DataLens, true or false.
- access.yandex_query: Access to the cluster from Yandex Query, true or false.
For more information about the resources you can create with Terraform, see this provider article.
-
-
Optionally, specify dedicated host groups to place master or segment hosts on dedicated hosts:
resource "yandex_mdb_greenplum_cluster" "<cluster_name_in_Terraform>" {
  ...
  master_host_group_ids  = [<IDs_of_dedicated_host_groups_for_master_hosts>]
  segment_host_group_ids = [<IDs_of_dedicated_host_groups_for_segment_hosts>]
  ...
}
You can assign groups to one of the two Greenplum® host types or to both of them at once.
You must first create a group of dedicated hosts in Yandex Compute Cloud.
You cannot edit this setting after you create a cluster.
If you use dedicated hosts, the cluster cost is the sum of the Yandex Compute Cloud charge for computing resources and the Yandex MPP Analytics for PostgreSQL markup.
-
To enable transferring logs to Yandex Cloud Logging, specify the following parameters:
resource "yandex_mdb_greenplum_cluster" "<cluster_name_in_Terraform>" {
  ...
  service_account_id = "<service_account_ID>"

  logging {
    enabled                = <enable_transferring_logs>
    command_center_enabled = <transfer_Yandex_Command_Center_logs>
    greenplum_enabled      = <transfer_Greenplum®_logs>
    pooler_enabled         = <transfer_connection_pooler_logs>
    folder_id              = "<folder_ID>"
  }
}
Where:
- service_account_id: Service account ID.
- logging: Log transfer settings:
  - enabled: Manages log transfer, true or false. To enable parameters responsible for transferring specific logs, provide the true value.
  - command_center_enabled: Transferring Command Center logs, true or false.
  - greenplum_enabled: Transferring Greenplum® logs, true or false.
  - pooler_enabled: Transferring connection pooler logs, true or false.
  - folder_id: Specify the ID of the folder whose log group you want to use.
  - log_group_id: ID of the log group to write logs to.
  Specify either folder_id or log_group_id.
-
-
-
Make sure the Terraform configuration files are correct:
-
In the command line, navigate to the directory that contains the current Terraform configuration files defining the infrastructure.
-
Run this command:
terraform validate
Terraform will show any errors found in your configuration files.
-
-
Create your cluster:
-
Run this command to view the planned changes:
terraform plan
If you described the configuration correctly, the terminal will display a list of the resources to update and their parameters. This is a verification step that does not apply changes to your resources.
-
If everything looks correct, apply the changes:
-
Run this command:
terraform apply
-
Confirm updating the resources.
-
Wait for the operation to complete.
-
All the required resources will be created in the specified folder. You can check resource availability and their settings in the management console.
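The validate, plan, and apply steps above can be chained so that a failed check stops the run before anything is applied. Here is a minimal bash sketch of that idea; tf_create_cluster is a hypothetical wrapper name, and terraform apply still asks for interactive confirmation as described in the steps.

```shell
#!/usr/bin/env bash
# Hypothetical wrapper: run the three Terraform steps from this guide in order
# and stop at the first one that fails.
tf_create_cluster() {
  local dir="${1:-.}"   # directory with the Terraform configuration files
  (
    cd "$dir" || exit 1
    terraform validate || exit 1   # syntax and consistency check
    terraform plan || exit 1       # preview only, applies nothing
    terraform apply                # prompts for confirmation before creating resources
  )
}
```

Calling `tf_create_cluster ~/gp-config` runs the sequence in that directory; the path here is only an example. The subshell keeps the `cd` from changing your current directory.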
-
Get an IAM token for API authentication and set it as an environment variable:
export IAM_TOKEN="<IAM_token>"
-
Create a file named body.json and paste the following code into it:
{
  "folderId": "<folder_ID>",
  "name": "<cluster_name>",
  "environment": "<environment>",
  "config": {
    "version": "<Greenplum®_version>",
    "access": {
      "dataLens": <allow_access_from_DataLens>,
      "yandexQuery": <allow_access_from_Yandex_Query>
    },
    "zoneId": "<availability_zone>",
    "subnetId": "<subnet_ID>",
    "assignPublicIp": <enable_public_access_to_cluster_hosts>
  },
  "masterConfig": {
    "resources": {
      "resourcePresetId": "<host_class>",
      "diskSize": "<storage_size_in_bytes>",
      "diskTypeId": "<disk_type>"
    }
  },
  "segmentConfig": {
    "resources": {
      "resourcePresetId": "<host_class>",
      "diskSize": "<storage_size_in_bytes>",
      "diskTypeId": "<disk_type>"
    }
  },
  "masterHostCount": "<number_of_master_hosts>",
  "segmentHostCount": "<number_of_segment_hosts>",
  "segmentInHost": "<number_of_segments_per_host>",
  "userName": "<username>",
  "userPassword": "<user_password>",
  "networkId": "<network_ID>",
  "securityGroupIds": [
    "<security_group_1_ID>",
    "<security_group_2_ID>",
    ...
    "<security_group_N_ID>"
  ],
  "deletionProtection": <protect_cluster_from_deletion>,
  "configSpec": {
    "pool": {
      "mode": "<operation_mode>",
      "size": "<number_of_client_connections>",
      "clientIdleTimeout": "<client_timeout>"
    }
  },
  "cloudStorage": {
    "enable": <use_hybrid_storage>
  },
  "masterHostGroupIds": [
    "string"
  ],
  "segmentHostGroupIds": [
    "string"
  ],
  "serviceAccountId": "<service_account_ID>",
  "logging": {
    "enabled": "<enable_transferring_logs>",
    "commandCenterEnabled": "<transfer_Yandex_Command_Center_logs>",
    "greenplumEnabled": "<transfer_Greenplum®_logs>",
    "poolerEnabled": "<transfer_connection_pooler_logs>",
    "folderId": "<folder_ID>"
  }
}
Where:
- folderId: Folder ID. You can get it with the list of folders in the cloud.
- name: Cluster name.
- environment: Cluster environment, PRODUCTION or PRESTABLE.
- config: Cluster settings:
  - version: Greenplum® version.
  - access: Cluster settings for access to the following Yandex Cloud services:
    - dataLens: Yandex DataLens, true or false.
    - yandexQuery: Yandex Query, true or false.
  - zoneId: Availability zone.
  - subnetId: Subnet ID.
  - assignPublicIp: Public access to cluster hosts, true or false.
- masterConfig.resources, segmentConfig.resources: Master and segment host configuration in the cluster:
  - resourcePresetId: Host class.
  - diskSize: Disk size, in bytes.
  - diskTypeId: Disk type.
- masterHostCount: Number of master hosts, 1 or 2.
- segmentHostCount: Number of segment hosts, from 2 to 32.
- segmentInHost: Number of segments per host. The maximum value of this setting depends on the host class.
  The segment host class and the number of segments per host affect the maximum amount of memory allocated to each Greenplum® server process. If you select a host class with small RAM and specify a large number of segments, an error may occur.
- userName: Username.
- userPassword: User password.
- networkId: ID of the network the cluster will be in.
- securityGroupIds: Security group IDs.
- deletionProtection: Cluster protection from accidental deletion, true or false.
  Even with deletion protection enabled, one can still connect to the cluster manually and delete the data.
- configSpec.pool: Connection pooler settings:
  - mode: Operation mode, SESSION or TRANSACTION.
  - size: Maximum number of client connections.
  - clientIdleTimeout: Idle timeout for a client connection (in seconds).
- cloudStorage.enable: Use of hybrid storage in clusters with Greenplum® 6.25 or higher. Set it to true to enable the Yandex Cloud Yezzey extension in your cluster. This extension is used to export AO and AOCO tables from disks within the Yandex MPP Analytics for PostgreSQL cluster to a cold storage in Yandex Object Storage. The data is then stored in a service bucket in compressed and encrypted form. This is a more cost-efficient storage method.
  You cannot disable hybrid storage after you save your cluster settings.
  Note
  This feature is at the Preview stage and is free of charge.
- masterHostGroupIds and segmentHostGroupIds: Optionally, IDs of dedicated host groups for master and segment hosts.
  You must first create a group of dedicated hosts in Yandex Compute Cloud.
  You cannot edit this setting after you create a cluster.
  If you use dedicated hosts, the cluster cost is the sum of the Yandex Compute Cloud charge for computing resources and the Yandex MPP Analytics for PostgreSQL markup.
- serviceAccountId: Service account ID.
- logging: Settings for transferring logs to Yandex Cloud Logging:
  - enabled: Manages log transfer, true or false. To enable parameters responsible for transferring specific logs, provide the true value.
  - commandCenterEnabled: Transferring Command Center logs, true or false.
  - greenplumEnabled: Transferring Greenplum® logs, true or false.
  - poolerEnabled: Transferring connection pooler logs, true or false.
  - folderId: Specify the ID of the folder whose log group you want to use.
  - logGroupId: ID of the log group to write logs to.
  Specify either folderId or logGroupId.
-
-
-
Call the Cluster.Create method, e.g., via the following cURL request:
curl \
  --request POST \
  --header "Authorization: Bearer $IAM_TOKEN" \
  --header "Content-Type: application/json" \
  --url 'https://mdb.api.cloud.yandex.net/managed-greenplum/v1/clusters' \
  --data "@body.json"
-
Check the server response to make sure your request was successful.
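The request body above expects diskSize in bytes, while the CLI and Terraform examples in this guide give storage size in GB. The bash sketch below shows one way to convert, assuming 1 GB = 2^30 bytes; this guide does not state the multiplier, so treat it as an assumption and confirm it against the API reference. The function name gb_to_bytes is hypothetical.

```shell
#!/usr/bin/env bash
# Hypothetical helper: convert a whole number of GB to bytes for the diskSize field.
# ASSUMPTION: 1 GB = 1024 * 1024 * 1024 bytes; the guide itself does not say.
gb_to_bytes() {
  local gb="$1"
  local re='^[0-9]+$'
  if ! [[ "$gb" =~ $re ]]; then
    echo "not a whole number: $gb" >&2
    return 1
  fi
  # 10# forces base 10 so that a leading zero is not read as octal.
  echo $(( 10#$gb * 1024 * 1024 * 1024 ))
}
```

Under that assumption, `gb_to_bytes 100` prints 107374182400, which you would paste in place of `<storage_size_in_bytes>`.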
-
Get an IAM token for API authentication and save it as an environment variable:
export IAM_TOKEN="<IAM_token>"
-
Clone the cloudapi repository:
cd ~/ && git clone --depth=1 https://github.com/yandex-cloud/cloudapi
Below, we assume the repository contents are stored in the ~/cloudapi/ directory.
-
Create a file named body.json and paste the following code into it:
{
  "folder_id": "<folder_ID>",
  "name": "<cluster_name>",
  "environment": "<environment>",
  "config": {
    "version": "<Greenplum®_version>",
    "access": {
      "data_lens": <allow_access_from_DataLens>,
      "yandex_query": <allow_access_from_Yandex_Query>
    },
    "zone_id": "<availability_zone>",
    "subnet_id": "<subnet_ID>",
    "assign_public_ip": <enable_public_access_to_cluster_hosts>
  },
  "master_config": {
    "resources": {
      "resource_preset_id": "<host_class>",
      "disk_size": "<storage_size_in_bytes>",
      "disk_type_id": "<disk_type>"
    }
  },
  "segment_config": {
    "resources": {
      "resource_preset_id": "<host_class>",
      "disk_size": "<storage_size_in_bytes>",
      "disk_type_id": "<disk_type>"
    }
  },
  "master_host_count": "<number_of_master_hosts>",
  "segment_host_count": "<number_of_segment_hosts>",
  "segment_in_host": "<number_of_segments_per_host>",
  "user_name": "<username>",
  "user_password": "<user_password>",
  "network_id": "<network_ID>",
  "security_group_ids": [
    "<security_group_1_ID>",
    "<security_group_2_ID>",
    ...
    "<security_group_N_ID>"
  ],
  "deletion_protection": <protect_cluster_from_deletion>,
  "config_spec": {
    "pool": {
      "mode": "<operation_mode>",
      "size": "<number_of_client_connections>",
      "client_idle_timeout": "<client_timeout>"
    }
  },
  "cloud_storage": {
    "enable": <use_hybrid_storage>
  },
  "master_host_group_ids": [
    "string"
  ],
  "segment_host_group_ids": [
    "string"
  ],
  "service_account_id": "<service_account_ID>",
  "logging": {
    "enabled": "<enable_transferring_logs>",
    "command_center_enabled": "<transfer_Yandex_Command_Center_logs>",
    "greenplum_enabled": "<transfer_Greenplum®_logs>",
    "pooler_enabled": "<transfer_connection_pooler_logs>",
    "folder_id": "<folder_ID>"
  }
}
Where:
- folder_id: Folder ID. You can request it with the list of folders in the cloud.
- name: Cluster name.
- environment: Cluster environment, PRODUCTION or PRESTABLE.
- config: Cluster settings:
  - version: Greenplum® version.
  - access: Cluster settings for access to the following Yandex Cloud services:
    - data_lens: Yandex DataLens, true or false.
    - yandex_query: Yandex Query, true or false.
  - zone_id: Availability zone.
  - subnet_id: Subnet ID.
  - assign_public_ip: Public access to cluster hosts, true or false.
- master_config.resources, segment_config.resources: Master and segment host configuration in the cluster:
  - resource_preset_id: Host class.
  - disk_size: Disk size, in bytes.
  - disk_type_id: Disk type.
- master_host_count: Number of master hosts, 1 or 2.
- segment_host_count: Number of segment hosts, from 2 to 32.
- segment_in_host: Number of segments per host. The maximum value of this setting depends on the host class.
  The segment host class and the number of segments per host affect the maximum amount of memory allocated to each Greenplum® server process. If you select a host class with small RAM and specify a large number of segments, an error may occur.
- user_name: Username.
- user_password: User password.
- network_id: ID of the network the cluster will be in.
- security_group_ids: Security group IDs.
- deletion_protection: Cluster protection from accidental deletion, true or false.
  Even with deletion protection enabled, one can still connect to the cluster manually and delete the data.
- config_spec.pool: Connection pooler settings:
  - mode: Operation mode, SESSION or TRANSACTION.
  - size: Maximum number of client connections.
  - client_idle_timeout: Idle timeout for a client connection (in seconds).
- cloud_storage.enable: Use of hybrid storage in clusters with Greenplum® 6.25 or higher. Set it to true to enable the Yandex Cloud Yezzey extension in a cluster. This extension is used to export AO and AOCO tables from disks within the Yandex MPP Analytics for PostgreSQL cluster to a cold storage in Yandex Object Storage. The data is then stored in a service bucket in compressed and encrypted form. This is a more cost-efficient storage method.
  You cannot disable hybrid storage after you save your cluster settings.
  Note
  This feature is at the Preview stage and is free of charge.
- master_host_group_ids and segment_host_group_ids: Optionally, IDs of dedicated host groups for master and segment hosts.
  You must first create a group of dedicated hosts in Yandex Compute Cloud.
  You cannot edit this setting after you create a cluster.
  If you use dedicated hosts, the cluster cost is the sum of the Yandex Compute Cloud charge for computing resources and the Yandex MPP Analytics for PostgreSQL markup.
- service_account_id: Service account ID.
- logging: Settings for transferring logs to Yandex Cloud Logging:
  - enabled: Manages log transfer, true or false. To enable parameters responsible for transferring specific logs, provide the true value.
  - command_center_enabled: Transferring Command Center logs, true or false.
  - greenplum_enabled: Transferring Greenplum® logs, true or false.
  - pooler_enabled: Transferring connection pooler logs, true or false.
  - folder_id: Specify the ID of the folder whose log group you want to use.
  - log_group_id: ID of the log group to write logs to.
  Specify either folder_id or log_group_id.
-
-
-
Use the ClusterService.Create call and send the following request, e.g., via gRPCurl:
grpcurl \
  -format json \
  -import-path ~/cloudapi/ \
  -import-path ~/cloudapi/third_party/googleapis/ \
  -proto ~/cloudapi/yandex/cloud/mdb/greenplum/v1/cluster_service.proto \
  -rpc-header "Authorization: Bearer $IAM_TOKEN" \
  -d @ \
  mdb.api.cloud.yandex.net:443 \
  yandex.cloud.mdb.greenplum.v1.ClusterService.Create \
  < body.json
-
View the server response to make sure your request was successful.
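Because body.json starts out as a template full of `<...>` placeholders, it is easy to send a request with one still unfilled. The bash sketch below lists any remaining placeholders with their line numbers; find_placeholders is a hypothetical helper name, and the pattern is a heuristic that would also flag a legitimate value containing text in angle brackets.

```shell
#!/usr/bin/env bash
# Hypothetical helper: list template placeholders such as <cluster_name>
# that are still present in a request body file.
find_placeholders() {
  local file="$1"
  # -n prints the line number, -o prints only the matching part.
  # Exit status is 0 if at least one placeholder is found, 1 if none.
  grep -noE '<[^<>" ]+>' "$file"
}
```

For instance, `if find_placeholders body.json; then echo "fill these in first" >&2; fi` prints each leftover placeholder before you run the request.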
Creating a cluster copy
You can create a Greenplum® cluster with the settings of another one created earlier. To do so, import the Greenplum® source cluster's configuration to Terraform. This way, you can either create an identical copy or use the configuration you imported as the baseline and modify it as needed. Importing a configuration is a good idea when a source Greenplum® cluster has a lot of settings and you need to create a similar one.
To create a Greenplum® cluster copy:
-
If you do not have Terraform yet, install it.
-
Get the authentication credentials. You can add them to environment variables or specify them later in the provider configuration file.
-
Configure and initialize a provider. You do not need to create a provider configuration file manually: you can download it.
-
Place the configuration file in a separate working directory and specify the parameter values. If you did not add the authentication credentials to environment variables, specify them in the configuration file.
-
In the same working directory, place a .tf file with the following contents:
resource "yandex_mdb_greenplum_cluster" "old" { }
-
Write the ID of the source Greenplum® cluster to the environment variable:
export GREENPLUM_CLUSTER_ID=<cluster_ID>
You can request the ID with the list of clusters in the folder.
-
Import the settings of the source Greenplum® cluster into the Terraform configuration:
terraform import yandex_mdb_greenplum_cluster.old ${GREENPLUM_CLUSTER_ID}
-
Get the imported configuration:
terraform show
-
Copy it from the terminal and paste it into the .tf file.
-
Place the file in the new imported-cluster directory.
-
Edit the copied configuration so that you can create a new cluster from it:
- Specify the new cluster name in the resource string and the name parameter.
- Delete the created_at, health, id, status, master_hosts, and segment_hosts parameters.
- Add the user_password parameter.
- If the maintenance_window section has type = "ANYTIME", delete the hour parameter.
- Optionally, make further changes if you need a customized configuration.
-
Get the authentication credentials in the imported-cluster directory.
-
In the same directory, configure and initialize the provider. To avoid creating a configuration file with the provider settings manually, download it.
-
Place the configuration file in the imported-cluster directory and specify the parameter values. If you did not add the authentication credentials to environment variables, specify them in the configuration file.
-
Validate your Terraform configuration files:
terraform validate
Terraform will show any errors found in your configuration files.
-
Create the required infrastructure:
-
Run this command to view the planned changes:
terraform plan
If you described the configuration correctly, the terminal will display a list of the resources to update and their parameters. This is a verification step that does not apply changes to your resources.
-
If everything looks correct, apply the changes:
-
Run this command:
terraform apply
-
Confirm updating the resources.
-
Wait for the operation to complete.
-
All the required resources will be created in the specified folder. You can check resource availability and their settings in the management console.
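Removing the single-line read-only parameters named in the editing step above (created_at, health, id, status) can be partly scripted. The bash sketch below is one possible filter, assuming `terraform show` indents top-level attributes by exactly four spaces; strip_computed_attrs is a hypothetical helper name. The multi-line master_hosts and segment_hosts blocks are not handled and still need to be deleted by hand.

```shell
#!/usr/bin/env bash
# Hypothetical helper: drop top-level read-only attributes from `terraform show`
# output read on stdin and write the rest to stdout.
# ASSUMPTION: top-level attributes are indented by exactly four spaces,
# so nested attributes and names like network_id are left untouched.
strip_computed_attrs() {
  grep -vE '^ {4}(created_at|health|id|status)[[:space:]]*='
}
```

A possible use is `terraform show | strip_computed_attrs > imported-cluster/cluster.tf`, where the output file name is only an example; review the result before running terraform plan.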
Examples
Creating a cluster
Create a Yandex MPP Analytics for PostgreSQL cluster with the following test specifications:
- Name: gp-cluster
- Version: 6.19
- Environment: PRODUCTION
- Network: default
- User: user1
- Password: user1user1
- Master and segment hosts:
  - Class: s2.medium
  - With 100 GB of local SSD (local-ssd) storage
- Availability zone: ru-central1-a, subnet: b0rcctk2rvtr8efcch64
- With public access to hosts
- Security group: enp6saqnq4ie244g67sb
- Deletion protection: Enabled
Run this command:
yc managed-greenplum cluster create \
--name=gp-cluster \
--greenplum-version=6.19 \
--environment=PRODUCTION \
--network-name=default \
--user-name=user1 \
--user-password=user1user1 \
--master-config resource-id=s2.medium,`
`disk-size=100,`
`disk-type=local-ssd \
--segment-config resource-id=s2.medium,`
`disk-size=100,`
`disk-type=local-ssd \
--zone-id=ru-central1-a \
--subnet-id=b0rcctk2rvtr8efcch64 \
--assign-public-ip=true \
--security-group-ids=enp6saqnq4ie244g67sb \
--deletion-protection
Greenplum® and Greenplum Database® are registered trademarks or trademarks of Broadcom Inc. in the United States and/or other countries.