© 2025 Direct Cursus Technology L.L.C.

In this article:

  • Settings depending on the storage size
  • Cluster-level DBMS settings
  • External S3 data source settings
  • External JDBC data source settings
  • External HDFS data source settings
  • External Hive data source settings

Greenplum® settings

Written by
Yandex Cloud
Updated at May 5, 2025

For Managed Service for Greenplum® clusters, you can configure Greenplum® settings. Some settings are configured at the cluster level, while others are configured at the level of external data sources, such as S3, JDBC, HDFS, and Hive.

The label next to the setting name helps determine which interface is used to set the value of this setting: the management console, CLI, API, SQL, or Terraform. The All interfaces label means that all of the above interfaces are supported.

Depending on the interface you select, the same setting will be represented differently. For example, Max connections in the management console is the same as:

  • max_connections in the gRPC API
  • maxConnections in the REST API
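
As a rough illustration of this naming pattern, the sketch below derives both spellings from a management console label. The helper names are hypothetical, and the conversion rule is inferred only from the Max connections example above, so actual field names should be checked against the API reference.

```python
def to_snake_case(label: str) -> str:
    """Console label -> gRPC-style name, e.g. 'Max connections' -> 'max_connections'."""
    return "_".join(word.lower() for word in label.split())


def to_camel_case(label: str) -> str:
    """Console label -> REST-style name, e.g. 'Max connections' -> 'maxConnections'."""
    words = [word.lower() for word in label.split()]
    if not words:
        return ""
    return words[0] + "".join(word.capitalize() for word in words[1:])


if __name__ == "__main__":
    for label in ("Max connections", "Max prepared transactions"):
        print(label, "|", to_snake_case(label), "|", to_camel_case(label))
```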

Settings depending on the storage size

The values of some Greenplum® settings may be automatically adjusted when you change the storage size:

  • If the values were not specified or are not suitable for the new size, the default settings for this size will apply.
  • If the settings you specified manually are suitable for the new size, they will be preserved.

The settings that depend on the storage size are:

  • Gp workfile limit per segment
  • Max slot wal keep size

Cluster-level DBMS settings

The following settings are available:

  • Gp add column inherits table setting Management console Terraform API

    This setting controls whether to apply the data compression parameters (compresstype, compresslevel, and blocksize) specified for the AOCO table when adding a column.

    By default, the setting is disabled, i.e., the table’s data compression parameters are ignored.

    For more information, see the relevant Greenplum® documentation.

  • Gp workfile compression Management console Terraform API

    This setting determines whether temporary files created on the disk during a hash join or hash aggregation will be compressed.

    By default, it is disabled, i.e., temporary files are not compressed.

    For more information, see the relevant Greenplum® documentation.

    Warning

    Changing this setting will cause the cluster hosts to restart one at a time.

  • Gp workfile limit per query Management console Terraform API

    The maximum amount of disk space (in bytes) the temporary files of an active query can occupy in every segment.

    The maximum value is 1099511627776 (1 TB), the minimum value is 0 (unlimited amount), and the default value is 0.

    For more information, see the relevant Greenplum® documentation.

  • Gp workfile limit files per query Management console Terraform API

    The maximum number of temporary files the service creates in a segment to process a single query. If the limit is exceeded, the query will be canceled.

    The maximum value is 100000, the minimum value is 0 (unlimited number of temporary files), and the default value is 10000.

    For more information, see the relevant Greenplum® documentation.

  • Gp workfile limit per segment Management console Terraform API

    The maximum amount of disk space (in bytes) the temporary files of all active queries can occupy in every segment.

    The maximum value is 1099511627776 (1 TB), the minimum value is 0 (unlimited amount). The default value depends on the segment host storage size and is calculated by the formula:

    0.1 × <segment_host_storage_size> / <number_of_segments_per_host>
    

    For more information, see the relevant Greenplum® documentation.
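
The default formula above can be worked through with a short sketch. The function name is hypothetical, sizes are assumed to be in bytes, and rounding down to a whole byte is an assumption; the source does not say whether the result is additionally capped by the 1 TB maximum, so no cap is applied here.

```python
def default_workfile_limit_per_segment(segment_host_storage_size: int,
                                       segments_per_host: int) -> int:
    """0.1 * <segment_host_storage_size> / <number_of_segments_per_host>, in bytes."""
    if segments_per_host <= 0:
        raise ValueError("segments_per_host must be positive")
    # 0.1 * size / n is the same as size / (10 * n); integer division avoids
    # floating-point error (rounding down is an assumption).
    return segment_host_storage_size // (10 * segments_per_host)


if __name__ == "__main__":
    GIB = 2 ** 30
    # Hypothetical host: 100 GiB of storage shared by 4 segments.
    print(default_workfile_limit_per_segment(100 * GIB, 4))
```

For the hypothetical host in the example, each segment gets a default of 2.5 GiB for temporary files.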

  • Log connections Management console

    This setting controls whether to log a string detailing each successful connection to the Greenplum® server.

    The setting is disabled by default (no logging).

    For more information, see the Greenplum® documentation.

  • Log disconnections Management console

    This setting controls whether to log session completion. If the setting is enabled, after each completed client session, a string with the session duration is output to the log.

    The setting is disabled by default (no logging).

    For more information, see the Greenplum® documentation.

  • Log error verbosity Management console

    This setting controls the amount of detail written to the Greenplum® log for each message. Log detail levels in ascending order of verbosity:

    • terse
    • default (default value)
    • verbose

    For more information, see the Greenplum® documentation.

  • Log hostname Management console

    This setting controls whether to output the host name of the Greenplum® database master server to the connection log. If the setting is enabled, the IP address and host name are logged. If the setting is disabled, only the IP address is logged.

    This setting is disabled by default.

    For more information, see the Greenplum® documentation.

  • Log min duration statement Management console

    This setting specifies the minimum command duration required to log the command (in milliseconds).

    If set to 0, the runtime of all statements is logged.

    The minimum value is -1 (disables runtime logging); the maximum value is 2147483647; the default value is -1.

    For more information, see the Greenplum® documentation.

  • Log min messages Management console

    This setting defines the logging level in Greenplum®. All messages of the selected severity level (or higher) are logged. Possible values (in ascending order of severity): DEBUG5, DEBUG4, DEBUG3, DEBUG2, DEBUG1, INFO, NOTICE, WARNING, ERROR, LOG, FATAL, and PANIC.

    The default value is WARNING. This means all the messages with the following severity levels will be logged: WARNING, ERROR, LOG, FATAL, and PANIC.

    To disable logging of most messages, select PANIC.

    For more information, see the relevant Greenplum® documentation.
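
The threshold rule described above can be sketched as follows. The severity order is copied from the list above; the function names are hypothetical and only model which messages would pass the filter, not how Greenplum® implements it.

```python
# Severity levels in ascending order, as listed above.
LEVELS = ["DEBUG5", "DEBUG4", "DEBUG3", "DEBUG2", "DEBUG1", "INFO",
          "NOTICE", "WARNING", "ERROR", "LOG", "FATAL", "PANIC"]


def logged_levels(log_min_messages: str = "WARNING") -> list:
    """Return every level at or above the configured threshold."""
    return LEVELS[LEVELS.index(log_min_messages):]


def is_logged(message_level: str, log_min_messages: str = "WARNING") -> bool:
    """True if a message of this level passes the threshold."""
    return LEVELS.index(message_level) >= LEVELS.index(log_min_messages)


if __name__ == "__main__":
    print(logged_levels("WARNING"))
    print(is_logged("NOTICE"))
```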

  • Log statement Management console Terraform API

    Filter for SQL commands that will be written to the Greenplum® log:

    • NONE: Filter is disabled, no SQL commands are logged.
    • DDL: Logs SQL commands used to change data structure definitions (such as CREATE, ALTER, DROP etc.).
    • MOD: Logs the DDL commands and commands allowing you to modify data (INSERT, UPDATE, DELETE, TRUNCATE, and COPY FROM).
    • ALL: Logs all SQL commands.

    The default value is DDL.

    The PREPARE and EXPLAIN ANALYZE expressions are also logged if they contain the relevant types of commands.

    For more information, see the Greenplum® documentation.
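
A minimal sketch of how the four filter values nest, assuming each command has already been classified as "ddl", "mod", or "other". The category labels and the function name are hypothetical; the classification of real SQL text is left out.

```python
# Which command categories each Log statement value covers, per the list above.
LOGGED_CATEGORIES = {
    "NONE": set(),
    "DDL": {"ddl"},
    "MOD": {"ddl", "mod"},
    "ALL": {"ddl", "mod", "other"},
}


def is_statement_logged(category: str, log_statement: str = "DDL") -> bool:
    """True if a command of the given category is written to the log."""
    return category in LOGGED_CATEGORIES[log_statement]


if __name__ == "__main__":
    for value in ("NONE", "DDL", "MOD", "ALL"):
        print(value, sorted(LOGGED_CATEGORIES[value]))
```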

  • Log statement stats Management console

    This setting controls whether to log query statistics (parsing, planning, execution).

    The setting is disabled by default (no logging).

    For more information, see the Greenplum® documentation.

  • Master shared buffers Management console

    The amount of memory the Greenplum® master host uses for shared memory buffers (in bytes).

    The minimum value is 1048576 (1 MB). The default value is 134217728 (128 MB).

    The maximum value is calculated using the following formula:

    min(<master_host_storage_size> / 4, 8 * <size_of_DB_data>)
    

    For more information, see the relevant Greenplum® documentation.

    Warning

    Changing this setting will cause the cluster hosts to restart one at a time.

  • Max connections Management console Terraform API

    The maximum number of concurrent connections to the master host.

    The maximum value is 1000, the minimum value is 50, and the default value is 350. For segment hosts, this value is automatically multiplied by five.

    If you increase this value, we recommend increasing Max prepared transactions as well.

    If you update this setting, both the master and segment hosts are checked for at least 20 MB of available RAM per connection. If this condition is not met, an error occurs.

    For more information, see the relevant Greenplum® documentation.
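
The RAM check described above can be estimated in advance with a sketch like the one below. It assumes that 1 MB is 1,048,576 bytes, that the check on segment hosts uses the value multiplied by five, and that the RAM figures passed in are per host; none of these details are stated in the source, and the function name is hypothetical.

```python
MB = 1024 * 1024
RAM_PER_CONNECTION = 20 * MB   # stated above
SEGMENT_MULTIPLIER = 5         # stated above
MIN_CONNECTIONS, MAX_CONNECTIONS = 50, 1000


def max_connections_problems(max_connections: int,
                             master_ram: int,
                             segment_ram: int) -> list:
    """Return a list of reasons why the value would likely be rejected."""
    problems = []
    if not MIN_CONNECTIONS <= max_connections <= MAX_CONNECTIONS:
        problems.append("value outside the 50..1000 range")
    if master_ram < max_connections * RAM_PER_CONNECTION:
        problems.append("not enough RAM on the master host")
    # Assumption: segment hosts are checked against the multiplied value.
    if segment_ram < max_connections * SEGMENT_MULTIPLIER * RAM_PER_CONNECTION:
        problems.append("not enough RAM on segment hosts")
    return problems


if __name__ == "__main__":
    # Hypothetical hosts: 8 GiB of RAM on the master, 64 GiB on each segment host.
    print(max_connections_problems(350, 8192 * MB, 65536 * MB))
    print(max_connections_problems(1000, 8192 * MB, 65536 * MB))
```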

  • Max prepared transactions Management console Terraform API

    The maximum number of transactions that can be in the prepared state at the same time.

    The maximum value is 10000, the minimum value is 350, and the default value is 350. The values for master hosts and segment hosts are the same.

    We recommend choosing a value higher than Max connections.

    For more information, see the relevant Greenplum® documentation.

  • Max slot wal keep size Management console Terraform API

    The maximum write-ahead log (WAL) file size in bytes allowed for replication.

    The minimum value is 0 (no logging), and the maximum value is 214748364800 (200 GB). The default value depends on the segment host storage size and is calculated by the formula:

    0.1 × <segment_host_storage_size> / <number_of_segments_per_host>
    

    For more information, see the relevant Greenplum® documentation.

  • Max statement mem Management console Terraform API

    The maximum amount of memory (in bytes) allocated for query processing.

    The minimum value is 134217728 (128 MB), the maximum value is 1099511627776 (1 TB), and the default value is 2097152000 (2,000 MB).

    For more information, see the Greenplum® documentation.

  • Segment shared buffers Management console

    The amount of memory the Greenplum® segment hosts use for shared memory buffers (in bytes).

    The minimum value is 1048576 (1 MB). The default value is 134217728 (128 MB).

    The maximum value is calculated using the following formula:

    min(<segment_host_storage_size> / (4 * <number_of_segments_per_host>), 8 * <size_of_DB_data>)
    

    For more information, see the relevant Greenplum® documentation.

    Warning

    Changing this setting will cause the cluster hosts to restart one at a time.
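
The upper bounds for Master shared buffers and Segment shared buffers can be computed from the two formulas above. This is a sketch with hypothetical function names; sizes are assumed to be in bytes, and rounding down is an assumption.

```python
MIN_SHARED_BUFFERS = 1_048_576  # 1 MB, the minimum stated for both settings


def max_master_shared_buffers(master_host_storage_size: int,
                              db_data_size: int) -> int:
    """min(<master_host_storage_size> / 4, 8 * <size_of_DB_data>)"""
    return min(master_host_storage_size // 4, 8 * db_data_size)


def max_segment_shared_buffers(segment_host_storage_size: int,
                               segments_per_host: int,
                               db_data_size: int) -> int:
    """min(<segment_host_storage_size> / (4 * <segments_per_host>), 8 * <size_of_DB_data>)"""
    return min(segment_host_storage_size // (4 * segments_per_host),
               8 * db_data_size)


def is_valid_shared_buffers(value: int, upper_bound: int) -> bool:
    """True if the value lies between the 1 MB minimum and the computed maximum."""
    return MIN_SHARED_BUFFERS <= value <= upper_bound


if __name__ == "__main__":
    GIB = 2 ** 30
    # Hypothetical cluster: 100 GiB hosts, 4 segments per host, 1 GiB of data.
    print(max_master_shared_buffers(100 * GIB, 1 * GIB))
    print(max_segment_shared_buffers(100 * GIB, 4, 1 * GIB))
```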

External S3 data source settings

The following settings are available:

  • Access Key Management console CLI API

    S3 storage public access key.

    For more information, see the relevant Greenplum® documentation.

  • Secret Key Management console CLI API

    S3 storage secret access key.

    For more information, see the relevant Greenplum® documentation.

  • Fast Upload Management console CLI API

    This setting controls fast uploading of large files to S3 storage. If disabled, PXF generates files on the disk before sending them to S3 storage. If enabled, PXF generates files in RAM (if RAM capacity is reached, it writes them to disk).

    Fast upload is enabled by default.

    For more information, see the relevant Greenplum® documentation.

  • Endpoint Management console CLI API

    S3 storage address. For Yandex Object Storage, the address is storage.yandexcloud.net. This is the default value.

    For more information, see the relevant Greenplum® documentation.

External JDBC data source settings

The following settings are available:

  • Driver Management console CLI API

    Java class of the JDBC driver. The possible values are:

    • com.simba.athena.jdbc.Driver
    • com.clickhouse.jdbc.ClickHouseDriver
    • com.ibm.as400.access.AS400JDBCDriver
    • com.microsoft.sqlserver.jdbc.SQLServerDriver
    • com.mysql.cj.jdbc.Driver
    • org.postgresql.Driver
    • oracle.jdbc.driver.OracleDriver
    • net.snowflake.client.jdbc.SnowflakeDriver
    • io.trino.jdbc.TrinoDriver

    For more information, see the relevant Greenplum® documentation.

  • Url Management console CLI API

    Database URL. Examples:

    • jdbc:mysql://mysqlhost:3306/testdb: For a local MySQL® DB.
    • jdbc:postgresql://c-<cluster_ID>.rw.mdb.yandexcloud.net:6432/db1: For a Yandex Managed Service for PostgreSQL cluster. The address contains a special FQDN of the master host in the cluster.
    • jdbc:oracle:thin:@host.example:1521:orcl: For an Oracle DB.

    For more information, see the relevant Greenplum® documentation.
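
The Managed Service for PostgreSQL URL shown above can be assembled from a cluster ID, as in this sketch. The function name and the sample cluster ID are hypothetical; the host pattern and port 6432 are taken from the example above.

```python
def managed_postgresql_jdbc_url(cluster_id: str, database: str,
                                port: int = 6432) -> str:
    """Build a JDBC URL that points at the special FQDN of the cluster's master host."""
    if not cluster_id or not database:
        raise ValueError("cluster_id and database must be non-empty")
    return f"jdbc:postgresql://c-{cluster_id}.rw.mdb.yandexcloud.net:{port}/{database}"


if __name__ == "__main__":
    # "abc123" is a placeholder, not a real cluster ID.
    print(managed_postgresql_jdbc_url("abc123", "db1"))
```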

  • User Management console CLI API

    DB owner username.

    For more information, see the relevant Greenplum® documentation.

  • Password Management console CLI API

    DB user password.

    For more information, see the relevant Greenplum® documentation.

  • Statement Batch Size Management console CLI API

    Number of rows in a batch for reading from an external table.

    The default value is 100.

    For more information, see the relevant Greenplum® documentation.

  • Statement Fetch Size Management console CLI API

    Number of rows to buffer when reading from an external table.

    The default value is 1000.

    For more information, see the relevant Greenplum® documentation.

  • Statement Query Timeout Management console CLI API

    Time (in seconds) the JDBC driver waits for a read or write operation to complete.

    The default value is 60.

    For more information, see the relevant Greenplum® documentation.

  • Pool Enabled Management console CLI API

    This setting determines whether the JDBC connection pool is used. It is enabled by default.

    For more information, see the relevant Greenplum® documentation.

  • Pool Maximum Size Management console CLI API

    Maximum number of database server connections.

    The default value is 5.

    For more information, see the relevant Greenplum® documentation.

  • Pool Connection Timeout Management console CLI API

    Maximum time (in milliseconds) to wait for a connection from the pool.

    The default value is 30000.

    For more information, see the relevant Greenplum® documentation.

  • Pool Idle Timeout Management console CLI API

    Maximum time (in milliseconds) before an inactive connection is considered idle.

    The default value is 30000.

    For more information, see the relevant Greenplum® documentation.

  • Pool Minimum Idle Management console CLI API

    Minimum number of idle connections in the pool.

    The default value is 0.

    For more information, see the relevant Greenplum® documentation.
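
For reference, the statement and pool defaults listed above can be collected in one place. The class and field names below are hypothetical and do not correspond to actual API or CLI parameter names; the consistency check (minimum idle connections not exceeding the pool size) is an assumption rather than a documented rule.

```python
from dataclasses import dataclass


@dataclass
class JdbcSourceSettings:
    statement_batch_size: int = 100        # rows per batch
    statement_fetch_size: int = 1000       # rows to buffer
    statement_query_timeout: int = 60      # seconds
    pool_enabled: bool = True
    pool_maximum_size: int = 5             # connections
    pool_connection_timeout: int = 30000   # milliseconds
    pool_idle_timeout: int = 30000         # milliseconds
    pool_minimum_idle: int = 0             # connections

    def problems(self) -> list:
        """Return human-readable issues found by simple sanity checks."""
        found = []
        if self.pool_enabled and self.pool_minimum_idle > self.pool_maximum_size:
            # Assumed rule: idle connections cannot exceed the pool size.
            found.append("pool_minimum_idle exceeds pool_maximum_size")
        if self.statement_query_timeout < 0:
            found.append("statement_query_timeout is negative")
        return found


if __name__ == "__main__":
    print(JdbcSourceSettings().problems())
    print(JdbcSourceSettings(pool_minimum_idle=10).problems())
```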

External HDFS data source settings

The following settings are available:

  • Core Management console API

    Settings of the file system and security rules.

    For more information, see the Apache Hadoop documentation.

    • Default Fs

      URI that defines the HDFS file system.

    • Security Auth To Local

      Rules for mapping Kerberos principals to user accounts of the operating system.

  • Kerberos Management console API

    Settings of the Kerberos network authentication protocol.

    For more information, see the relevant Greenplum® documentation.

    • Enable

      This setting determines whether the Kerberos authentication server is used. By default, it is not used.

    • Primary

      Host of the KDC (Key Distribution Center) main server.

    • Realm

      Kerberos realm for a Greenplum® database.

    • Kdc Servers

      Hosts of KDC servers.

    • Admin server

      Host of the administration server. This is usually the main Kerberos server.

    • Default domain

      Domain that is used to expand host names when translating Kerberos 4 service principals to Kerberos 5 service principals (e.g., when converting rcmd.hostname to host/hostname.domain).

    • Keytab Base64

      Base64-encoded keytab file contents.
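
A value for Keytab Base64 can be produced with the Python standard library, as sketched below. The function names and the file path in the usage line are hypothetical; the sketch assumes standard Base64 without line breaks, which the source does not specify.

```python
import base64
from pathlib import Path


def encode_keytab(raw: bytes) -> str:
    """Encode raw keytab bytes as a single-line Base64 string."""
    return base64.b64encode(raw).decode("ascii")


def encode_keytab_file(path: str) -> str:
    """Read a keytab file and return its Base64-encoded contents."""
    return encode_keytab(Path(path).read_bytes())


if __name__ == "__main__":
    # Two sample bytes stand in for real keytab contents.
    print(encode_keytab(b"\x05\x02"))
    # Usage with a real file (path is a placeholder):
    # print(encode_keytab_file("/path/to/service.keytab"))
```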

  • User Impersonation Management console API

    This setting determines whether you can authenticate in an external file storage or DBMS on behalf of a Greenplum® user.

    By default, such authentication is prohibited.

    For more information, see the relevant Greenplum® documentation.

  • Username Management console API

    Username that is used to connect to an external file storage or DBMS if user impersonation is disabled.

    For more information, see the relevant Greenplum® documentation.

  • Sasl Connection Retries Management console API

    Maximum number of retry attempts by PXF to request a SASL connection if the GSS initiate failed error occurs.

    The default value is 5.

    For more information, see the relevant Greenplum® documentation.

  • ZK Hosts Management console API

    Hosts of ZooKeeper servers. The values are specified in <address>:<port> format.

    For more information, see the Apache Hadoop documentation.
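
Values in the <address>:<port> format can be checked before submitting them, as in this sketch. The function name and the sample host name are hypothetical, and the port range check (1 to 65535) is a general TCP constraint rather than something stated in the source.

```python
def parse_zk_host(value: str) -> tuple:
    """Split '<address>:<port>' into (address, port) or raise ValueError."""
    address, separator, port = value.rpartition(":")
    if not separator or not address or not port.isdigit():
        raise ValueError(f"expected <address>:<port>, got {value!r}")
    port_number = int(port)
    if not 1 <= port_number <= 65535:
        raise ValueError(f"port out of range in {value!r}")
    return address, port_number


if __name__ == "__main__":
    # Placeholder host name; use the addresses of your own ZooKeeper servers.
    print(parse_zk_host("zk1.example.net:2181"))
```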

  • Dfs Management console API

    Distributed file system settings.

    For more information, see the Apache Hadoop documentation.

    • Ha Automatic Failover Enabled

      This setting determines whether automatic failover for high availability of the file system is enabled. It is enabled by default.

    • Block Access Token Enabled

      This setting determines whether access tokens are used. By default, tokens are verified when connecting to datanodes.

    • Use Datanode Hostname

      This setting determines whether datanode host names are used when connecting to the relevant nodes. Host names are used by default.

    • Nameservices

      List of logical names of HDFS services. You can specify any names separating them by commas.

  • Yarn Management console API

    Settings for the ResourceManager service, which tracks resources within a cluster and schedules running apps, such as MapReduce jobs.

    For more information, see the Apache Hadoop documentation.

    • Resourcemanager Ha Enabled

      This setting determines whether high availability for ResourceManager is enabled. It is enabled by default.

    • Resourcemanager Ha Auto Failover Enabled

      This setting determines whether automatic failover to a different resource is enabled if the active service fails or becomes unresponsive. Automatic failover is enabled by default only if Resourcemanager Ha Enabled is enabled.

    • Resourcemanager Ha Auto Failover Embedded

      This setting determines whether to use the embedded ActiveStandbyElector method for selecting the active service. If the current active service fails or becomes unresponsive, ActiveStandbyElector designates another ResourceManager service as active, assuming the managing role.

      It is enabled by default only if the Resourcemanager Ha Enabled and Resourcemanager Ha Auto Failover Enabled settings are enabled.

    • Resourcemanager Cluster Id

      Cluster ID. It is used to prevent the ResourceManager service from becoming active for another cluster.
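
The way the Yarn defaults depend on each other can be summarized in a short sketch. The function and key names are hypothetical; `None` stands for "not set explicitly", and the logic only restates the default rules given above.

```python
from typing import Optional


def yarn_effective_settings(ha_enabled: Optional[bool] = None,
                            auto_failover_enabled: Optional[bool] = None,
                            auto_failover_embedded: Optional[bool] = None) -> dict:
    """Resolve unset values to the defaults described above."""
    if ha_enabled is None:
        ha_enabled = True  # enabled by default
    if auto_failover_enabled is None:
        # Enabled by default only if high availability is enabled.
        auto_failover_enabled = ha_enabled
    if auto_failover_embedded is None:
        # Enabled by default only if both settings above are enabled.
        auto_failover_embedded = ha_enabled and auto_failover_enabled
    return {
        "ha_enabled": ha_enabled,
        "auto_failover_enabled": auto_failover_enabled,
        "auto_failover_embedded": auto_failover_embedded,
    }


if __name__ == "__main__":
    print(yarn_effective_settings())
    print(yarn_effective_settings(ha_enabled=False))
```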

External Hive data source settings

The following settings are available:

  • Core Management console API

    Settings of the file system and security rules.

    For more information, see the Apache Hadoop documentation.

    • Default Fs

      URI that defines the HDFS file system.

    • Security Auth To Local

      Rules for mapping Kerberos principals to user accounts of the operating system.

  • Kerberos Management console API

    Settings of the Kerberos network authentication protocol.

    For more information, see the relevant Greenplum® documentation.

    • Enable

      This setting determines whether the Kerberos authentication server is used. By default, it is not used.

    • Primary

      Host of the KDC (Key Distribution Center) main server.

    • Realm

      Kerberos realm for a Greenplum® database.

    • Kdc Servers

      Hosts of KDC servers.

    • Admin server

      Host of the administration server. This is usually the main Kerberos server.

    • Default domain

      Domain that is used to expand host names when translating Kerberos 4 service principals to Kerberos 5 service principals (e.g., when converting rcmd.hostname to host/hostname.domain).

    • Keytab Base64

      Base64-encoded keytab file contents.

  • User Impersonation Management console API

    This setting determines whether you can authenticate in an external file storage or DBMS on behalf of a Greenplum® user.

    By default, such authentication is prohibited.

    For more information, see the relevant Greenplum® documentation.

  • Username Management console API

    Username that is used to connect to an external file storage or DBMS if user impersonation is disabled.

    For more information, see the relevant Greenplum® documentation.

  • Sasl Connection Retries Management console API

    Maximum number of retry attempts by PXF to request a SASL connection if the GSS initiate failed error occurs.

    The default value is 5.

    For more information, see the relevant Greenplum® documentation.

  • ZK Hosts Management console API

    Hosts of ZooKeeper servers. The values are specified in <address>:<port> format.

    For more information, see the Apache Hadoop documentation.

  • Ppd Management console API

    This setting determines whether predicate pushdown is enabled for external table queries. Enabled by default.

    For more information, see the relevant Greenplum® documentation.

  • Metastore Uris Management console API

    List of comma-separated URIs. To request metadata, the external DBMS connects to Metastore using one of these URIs.

  • Metastore Kerberos Principal Management console API

    Service principal for the Metastore Thrift server.

  • Auth Kerberos Principal Management console API

    Kerberos server principal.
