FAQ about Managed Service for Greenplum®
General questions
What is Managed Service for Greenplum®?
Managed Service for Greenplum® is a service that helps you create, operate, and scale Greenplum® databases in a cloud infrastructure.
With Managed Service for Greenplum®, you can:
- Create a database with the required performance characteristics.
- Scale the processing power and storage dedicated to your databases as needed.
- Get database logs.
Managed Service for Greenplum® takes on time-consuming Greenplum® infrastructure administration tasks:
- Monitors resource usage.
- Automatically creates DB backups.
- Provides fault tolerance through automatic failover to backup replicas.
- Keeps database software updated.
You interact with database clusters in Managed Service for Greenplum® the same way you interact with regular databases in your local infrastructure. This allows you to manage internal database settings to meet your app requirements.
What part of database management and maintenance is Managed Service for Greenplum® responsible for?
When creating clusters, Managed Service for Greenplum® allocates resources, installs the DBMS, and creates databases.
For the created and running databases, Managed Service for Greenplum® automatically creates backups and applies fixes and updates to the DBMS.
Managed Service for Greenplum® also provides data replication between database hosts (both inside and between availability zones) and automatically switches the load over to a backup replica in the event of a failure.
Which tasks are best addressed using Managed Service for Greenplum®, and which using VMs with databases?
Yandex Cloud offers two ways to work with databases:
- Managed Service for Greenplum® allows you to operate template databases with no need to worry about administration.
- Yandex Compute Cloud virtual machines allow you to create and configure your own databases. This approach allows you to use any database management systems, access databases via SSH, and so on.
How do I get started with Managed Service for Greenplum®?
Managed Service for Greenplum® is available to any registered Yandex Cloud user.
To create a database cluster in Managed Service for Greenplum®, you must define its characteristics:
- Host class (performance characteristics, such as CPUs, RAM, etc.).
- Storage size (reserved in full when you create the cluster).
- Network your cluster will be connected to.
- Number of hosts for a cluster and the cluster availability zone.
For a detailed guide, see Creating a cluster.
What happens when a new DBMS version is released?
The database software is updated when new minor versions are released. Owners of the affected DB clusters are notified of expected work times and DB availability in advance.
What happens when a DBMS version becomes deprecated?
One month after the database version becomes deprecated, Managed Service for Greenplum® automatically sends email notifications to the owners of DB clusters created with this version.
New hosts can no longer be created using deprecated DBMS versions. Database clusters are automatically upgraded to the next supported version: seven days after notification for minor versions and one month for major versions. Deprecated major versions are upgraded even if you disabled automatic updates.
Does the service meet the requirements under Russian Federation Federal Law No. 152-FZ on Personal Data?
Yes, it does. You can read the full security audit conclusion.
Can I get logs of my operations with services?
Yes, you can request log records about your resources from Yandex Cloud services. For more information, see Data requests.
Connection
Can I connect to the DB via SSH and get superuser permissions?
No, you cannot connect via SSH, nor can you get superuser permissions. This restriction protects security and cluster fault tolerance, because direct changes inside hosts can render them completely inoperable. However, you can connect to the DB as an admin user with the mdb_admin role. Its privileges match those of the superuser. For more information, see The mdb_admin role instead of a superuser.
How can I access a running DB host?
You can connect to Managed Service for Greenplum® databases using standard DBMS methods.
Learn more about connecting to clusters.
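Because Greenplum® uses the PostgreSQL wire protocol, "standard DBMS methods" usually means any PostgreSQL-compatible client. The sketch below only builds a libpq-style connection URI that such a client could accept. The host name, the default database name, port 6432, and the sslmode value are assumptions for illustration, not values taken from this FAQ; check the cluster connection guide for the real ones.

```python
from urllib.parse import quote


def build_conn_uri(host, user, password, dbname="postgres",
                   port=6432, sslmode="verify-full"):
    """Build a libpq-style connection URI.

    All defaults here (dbname, port, sslmode) are illustrative assumptions.
    User, password, and database name are percent-encoded so that special
    characters such as '@' or '/' do not break the URI.
    """
    return (
        f"postgresql://{quote(user, safe='')}:{quote(password, safe='')}"
        f"@{host}:{port}/{quote(dbname, safe='')}?sslmode={sslmode}"
    )


# Hypothetical host and credentials, for illustration only.
uri = build_conn_uri("gp-master.example.internal", "alice", "p@ss/word")
print(uri)
```

The resulting string can be passed to a PostgreSQL client library or to psql; keeping the password out of source code (for example, in an environment variable) is the safer choice in practice.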
How do I set up user authentication?
You can set up user authentication in Managed Service for Greenplum® using rules.
For more information, see User authentication.
Backups
When are backups performed? Is a DB cluster available during backup?
The backup window is an interval during which a full daily backup of the DB cluster is performed. You can configure a backup window when creating or editing a cluster.
Clusters remain fully accessible during backups.
Is DB host backup enabled by default?
Yes, backup is enabled by default. For Greenplum®, a full backup is performed every day, saving all DB cluster transaction logs. The first and every second automatic backups are full backups of all databases. Other backups are incremental and store only the data that has changed since the previous backup to save space.
Automatically created backups of an existing cluster are kept for seven days, whereas those created manually are stored indefinitely. Once the cluster is deleted, all its backups are kept for seven days.
Can I run Managed Service for Greenplum® cluster backups manually?
Yes, Managed Service for Greenplum® supports manually running a cluster backup.
Can I select other resources when restoring a cluster from a backup?
Yes, with the following restrictions:
- The total number of segments must be the same as in the source cluster.
- The disk size per segment in the new cluster must be at least as large as in the source cluster.
Example
The source cluster has four segment hosts, each containing four segments. The total number of segments is 16. When restoring the cluster, you can choose two segment hosts with eight segments per host, so that the total number of segments remains 16.
To ensure that the disk size per segment does not decrease, the disk size in each segment host must at least double.
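The two restrictions can be checked with simple arithmetic. The following is a minimal sketch, assuming that the disk of a segment host is shared evenly among its segments; the function name and parameters are hypothetical and are not part of any service API.

```python
def can_restore(src_hosts, src_segs_per_host, src_disk_gb_per_host,
                dst_hosts, dst_segs_per_host, dst_disk_gb_per_host):
    """Return True if the target layout satisfies both restore restrictions."""
    # Restriction 1: the total number of segments must stay the same.
    same_total = (src_hosts * src_segs_per_host
                  == dst_hosts * dst_segs_per_host)

    # Restriction 2: disk per segment must not shrink.
    # dst_disk / dst_segs >= src_disk / src_segs, rewritten with
    # cross-multiplication to avoid floating-point division.
    enough_disk = (dst_disk_gb_per_host * src_segs_per_host
                   >= src_disk_gb_per_host * dst_segs_per_host)

    return same_total and enough_disk


# The example above: 4 hosts x 4 segments -> 2 hosts x 8 segments.
# With a hypothetical 100 GB per source host, each target host needs 200 GB.
print(can_restore(4, 4, 100, 2, 8, 200))  # disk per host doubled
print(can_restore(4, 4, 100, 2, 8, 150))  # disk per segment would shrink
```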
Updating a cluster
How can I change the computing resources and storage size for a database cluster?
You can change computing resources and storage size in the management console. All you need to do is choose a different host class for the required cluster.
The cluster characteristics change within 30 minutes. During this period, other maintenance activities, such as installing updates, may also be performed on the cluster.
Managed Service for Greenplum® clusters and hosts
What is a database host and database cluster?
A database host is an isolated database environment in the cloud infrastructure with dedicated computing resources and reserved data storage.
A database cluster is one or more database hosts between which replication can be configured.
How many DB hosts can a cluster contain?
A Managed Service for Greenplum® cluster includes a minimum of 4 hosts:
- 2 master hosts.
- 2 segment hosts.
You can increase the number of segment hosts up to 32.
For more information, see Quotas and limits.
How many clusters can you create in a single cloud?
For more information on MDB technical and organizational limitations, see Quotas and limits.
How are DB clusters maintained?
Maintenance in Managed Service for Greenplum® implies:
- Automatic installation of DBMS updates and fixes for your database hosts.
- Changes to the host class and storage size.
- Other Managed Service for Greenplum® maintenance activities.
For more information, see Maintenance.
How do you calculate usage cost for a database host?
In Managed Service for Greenplum®, the usage cost is calculated based on the following parameters:
- Selected host class.
- Size of the storage reserved for the database host.
- Size of the database cluster backups. Backup space in the amount of the reserved storage is free of charge. Backup storage that exceeds this size is charged at special rates.
- Number of hours of database host operation. Partial hours are rounded to an integer value. You can find the cost per hour of operation for each host class in Pricing policy.
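The free backup allowance described above can be expressed as a one-line calculation. This is a sketch only: it assumes that the allowance is compared against the total reserved storage in the same units, and it does not model rates or hour rounding, which the pricing policy defines.

```python
def billable_backup_gb(reserved_storage_gb: float, backup_size_gb: float) -> float:
    """Return the part of the backup size that exceeds the free allowance.

    Backup space up to the reserved storage size is free of charge, so only
    the excess is billable. Never returns a negative number.
    """
    return max(0.0, backup_size_gb - reserved_storage_gb)


# Hypothetical sizes, for illustration only.
print(billable_backup_gb(100, 80))   # backups fit within the free allowance
print(billable_backup_gb(100, 150))  # only the excess is billable
```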
Why is the cluster slow even though the computing resources are not used fully?
The maximum storage IOPS and bandwidth values may be insufficient to handle the current number of requests. In that case, throttling is triggered and the performance of the entire cluster degrades.
The maximum IOPS and bandwidth values increase by a fixed value when the storage size increases by a certain step. The step and increment values depend on the disk type:
Disk type | Step, GB | Max IOPS increase (read/write) | Max bandwidth increase (read/write), MB/s |
---|---|---|---|
network-hdd | 256 | 300/300 | 30/30 |
network-ssd | 32 | 1,000/1,000 | 15/15 |
network-ssd-nonreplicated | 93 | 28,000/5,600 | 110/82 |
To increase the maximum IOPS and bandwidth values and make throttling less likely, increase the storage size when you update your cluster.
If you are using the network-hdd storage type, consider switching to network-ssd or network-ssd-nonreplicated by restoring the cluster from a backup.
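To get a rough feel for how storage size affects the throttling limits, the step and increment values from the table can be turned into a small estimate. This is a sketch under stated assumptions: limits grow linearly, only whole steps count, and no upper cap is applied. The service may apply per-disk maximums that this FAQ does not list, so treat the result as an approximation.

```python
# Values copied from the table: step in GB, IOPS increase (read, write),
# bandwidth increase in MB/s (read, write).
DISK_TYPES = {
    "network-hdd": (256, (300, 300), (30, 30)),
    "network-ssd": (32, (1000, 1000), (15, 15)),
    "network-ssd-nonreplicated": (93, (28000, 5600), (110, 82)),
}


def estimated_limits(disk_type: str, size_gb: int) -> dict:
    """Estimate max IOPS and bandwidth for a disk of the given size.

    Assumption: every whole step of storage adds one fixed increment,
    and there is no upper cap.
    """
    step_gb, iops, bandwidth = DISK_TYPES[disk_type]
    steps = size_gb // step_gb  # only whole steps are counted
    return {
        "read_iops": steps * iops[0],
        "write_iops": steps * iops[1],
        "read_mbps": steps * bandwidth[0],
        "write_mbps": steps * bandwidth[1],
    }


print(estimated_limits("network-hdd", 512))
print(estimated_limits("network-ssd", 512))
```

Comparing the two calls for the same size shows why moving from network-hdd to network-ssd can raise the IOPS limit considerably.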
Working with external tables
How are user credentials transmitted when working with external tables?
When working with external tables using the PXF protocol, user credentials are provided as plain text. Therefore, such credentials are only available to the administrator user with the mdb_admin role. Other users have no access to the credentials for security reasons.
Monitoring
What metrics and processes can be tracked using monitoring?
For all DBMS types, you can track:
- CPU, memory, network, or disk usage, in absolute terms.
- Memory, network, or disk usage as a percentage of the set limits for the corresponding cluster host class.
- Amount of data in the DB cluster and the remaining free space in the data storage.
For DB hosts, you can track metrics specific to the corresponding type of DBMS. For example, for Greenplum®, you can track:
- Average query execution time.
- Number of requests per second.
- Number of errors in logs.
You can monitor with a minimum resolution of 5 seconds.
For more information about monitoring, see Monitoring cluster and host state.
What is the retention period for logs?
Cluster logs are stored for 30 days.
Greenplum® and Greenplum Database® are registered trademarks or trademarks of VMware, Inc. in the United States and/or other countries.