Greenplum® settings
For Yandex MPP Analytics for PostgreSQL clusters, you can configure Greenplum® settings. Some settings are configured at the cluster level, while others are configured at the level of external data sources, such as S3, JDBC, HDFS, or Hive.
The label next to the setting name helps determine which interface is used to set the value of this setting: the management console, CLI, API, SQL, or Terraform. The All interfaces label means that all of the above interfaces are supported.
Depending on the selected interface, the same setting is represented differently. For example, max_connections in the management console is the same as:
- `max_connections` in the gRPC API
- `maxConnections` in the REST API
Settings depending on the storage size
The values of some Greenplum® settings may be automatically adjusted when you change the storage size:
- If the values were not specified or are not suitable for the new size, the default settings for this size will apply.
- If the settings you specified manually are suitable for the new size, they will be preserved.
The settings that depend on the storage size are:
Cluster-level DBMS settings
This section contains information about Greenplum® configuration properties users can edit by themselves, as well as important default properties that cannot be edited by users.
The list of properties partially duplicates the one found in the official documentation.
The important distinctions of this list of parameters are the following:
- Within the service, the possible (boundary) values are different for some properties.
- Within the service, the default values are different for some properties.
- Within the service, there are some open-gpdb build-specific properties.
Parameter contexts
The parameter context determines the level at which, and the interface through which, a parameter can be set.
For Greenplum®, the following parameter contexts are used:
| Context | Allows the parameter to be set via SET | Requires restart | Description | Interface |
|---|---|---|---|---|
| user | Yes | No | You can set these parameters for the cluster or for a session using the `SET` command. Any user can change such a parameter for their own session. Cluster-level changes affect existing sessions only if no session-local value was set with `SET`. | Management console, CLI, Terraform, API |
| backend | Yes * | No | These parameters can be set for the cluster or for a particular session via the connection request packet, e.g., using the `PGOPTIONS` environment variable in libpq. | Management console, CLI, Terraform, API |
| sighup | No | No | These parameters can only be set for the cluster. | Management console, CLI, Terraform, API |
| superuser | No | No | These parameters can only be set for the cluster. | Management console, CLI, Terraform, API |
| postmaster | No | Yes | These parameters can only be set for the cluster. | Management console, CLI, Terraform, API |
* Any user can set these parameters for their own session; however, once the session has started, they can no longer be changed. New values set at the cluster level apply only to sessions started after the change was applied.
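As an illustration of the contexts above, a user-context parameter can be changed for the current session with `SET`, while a backend-context parameter must be supplied at connection time. This is a sketch; the parameter choices are examples:

```sql
-- user context: any role may change the value for its own session.
SET gp_autostats_mode = on_change;   -- session-local override
SHOW gp_autostats_mode;              -- verify the effective value
RESET gp_autostats_mode;             -- return to the cluster-level setting

-- backend context: must be supplied in the connection request packet,
-- e.g. via libpq's PGOPTIONS before starting psql:
--   PGOPTIONS="-c log_connections=on" psql -h <master_host> -d postgres
```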
Parameters the user can manage globally
The parameters listed below can be managed globally by the user at the cluster level via the management console, YC CLI, Terraform, and API. Some of these parameters can also be managed locally at the session, user, or database level, depending on the parameter context.
gp_add_column_inherits_table_setting
| Available in version | Type | Acceptable values | Default value | Context |
|---|---|---|---|---|
| 6.25 and higher | Boolean | true (on), false (off) | false (off) | user |
When adding a column to an append-optimized column-oriented table (append-optimized) using the ALTER TABLE command, this parameter determines whether the column inherits the table's data compression settings (compresstype, compresslevel, and blocksize).
By default, this parameter is off (false/off): the table's data compression settings are ignored when adding a column. If the parameter is on (true/on), the table's compression settings are applied.
When creating an append-optimized column-oriented table, you can set the table data compression parameters (compresstype, compresslevel, and blocksize) in the WITH clause.
When adding a column, the DBMS sets each data compression parameter based on one of the following values (in order of priority):
- The data compression setting specified in the `ENCODING` clause of the `ALTER TABLE` command.
- If this server configuration parameter is enabled (`true`/`on`), the data compression setting specified in the `WITH` clause when the table was created. Otherwise, the table data compression setting is ignored.
- The data compression setting specified in the gp_default_storage_options server configuration parameter.
- The default data compression setting.
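A minimal sketch of this behavior (table and column names are hypothetical):

```sql
-- Create a column-oriented append-optimized table with zlib compression.
CREATE TABLE sales (id int, amount numeric)
WITH (appendoptimized=true, orientation=column,
      compresstype=zlib, compresslevel=5)
DISTRIBUTED BY (id);

-- With the parameter enabled, the new column inherits the table's
-- zlib/level-5 settings from the WITH clause:
SET gp_add_column_inherits_table_setting = on;
ALTER TABLE sales ADD COLUMN discount numeric;

-- An explicit ENCODING clause always wins, regardless of the parameter:
ALTER TABLE sales ADD COLUMN note text
    ENCODING (compresstype=zstd, compresslevel=1);
```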
For more information, see this Greenplum® guide
gp_autostats_mode
| Available in version | Acceptable values | Default value | Context |
|---|---|---|---|
| 6.25 and higher | none, on_change, on_no_stats | on_no_stats | user |
Specifies the mode of starting automatic statistics collection using ANALYZE.
The on_no_stats option starts statistics collection for CREATE TABLE AS SELECT, INSERT, or COPY operations in any tables that do not have any statistics yet.
The on_change option starts statistics collection only if the number of affected rows exceeds the threshold value set by the gp_autostats_on_change_threshold parameter.
Operations that can start automatic statistics collection with on_change:
- `CREATE TABLE AS SELECT`
- `UPDATE`
- `DELETE`
- `INSERT`
- `COPY`
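For example, a session could enable threshold-based collection like this (the threshold value is illustrative):

```sql
SET gp_autostats_mode = on_change;
SET gp_autostats_on_change_threshold = 1000000;

-- This INSERT triggers an automatic ANALYZE only if it affects
-- more than 1,000,000 rows:
INSERT INTO target_table SELECT * FROM staging_table;
```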
Note
For partitioned tables, automatic statistics collection does not start if data is inserted through the top-level parent table. It does start if data is inserted directly into a leaf table storing the partition data (statistics are collected for that table only).
For more information, see this Greenplum® guide
gp_autostats_on_change_threshold
| Available in version | Type | Acceptable values | Default value | Context |
|---|---|---|---|---|
| 6.25 and higher | Integer | from 0 to 2147483647 | 2147483647 | user |
Specifies the threshold value for automatic statistics collection when the gp_autostats_mode parameter is set to on_change. If a table operation affects the number of rows that exceeds this threshold value, ANALYZE is performed to collect the statistics for the table.
For more information, see this Greenplum® guide
gp_cached_segworkers_threshold
| Available in version | Type | Acceptable values | Default value | Context |
|---|---|---|---|---|
| 6.25 and higher | Integer | from 1 to 10 | 5 | user |
When a user starts a database session and issues a query, the system creates groups of worker processes (gangs) on each segment to do the work. When the work is complete, the worker processes on the segments are terminated, except for a certain number that are kept cached. This parameter sets that number.
A lower value allows saving system resources on the segment hosts, yet a higher value can improve performance in scenarios where many complex queries are sent consecutively.
For more information, see this Greenplum® guide
gp_enable_global_deadlock_detector
| Available in version | Type | Acceptable values | Default value | Context |
|---|---|---|---|---|
| 6.25 and higher | Boolean | true (on), false (off) | true (on) | postmaster |
Turns the global deadlock detector (Global Deadlock Detector) on or off. It manages concurrent UPDATE and DELETE operations on heap tables to improve performance.
If the global deadlock detector is off, the database performs concurrent update and delete operations on a heap table one at a time. If it is on, concurrent updates are allowed; the detector identifies deadlocks and resolves them by terminating one or more backend processes associated with the "youngest" transactions.
For more information, see this Greenplum® guide
gp_enable_zstd_memory_accounting
Note
This is an open-gpdb build-specific parameter.
| Available in version | Type | Acceptable values | Default value | Context |
|---|---|---|---|---|
| 6.28 and higher | Boolean | true (on), false (off) | true (on) | user |
Controls the ZSTD memory accounting feature. Enabling it (true/on) prevents ZSTD from exceeding the memory available to the resource manager (a group or a queue) by moving ZSTD to a separate zstd_context memory context. This reduces the probability of an uncontrolled cluster crash due to out-of-memory errors.
gp_global_deadlock_detector_period
| Available in version | Type | Acceptable values | Default value | Context |
|---|---|---|---|---|
| 6.25 and higher | Integer | from 5 to 2147483647 | 120 | sighup |
Specifies the run interval (in seconds) of the Global Deadlock Detector background worker process (see the gp_enable_global_deadlock_detector parameter).
For more information, see this Greenplum® guide
gp_max_plan_size
| Available in version | Type | Acceptable values | Default value | Context |
|---|---|---|---|---|
| 6.25 and higher | Integer | from 0 to 2147483647 | 0 | superuser |
Sets the maximum allowed total uncompressed size of a query execution plan, multiplied by the number of Motion operators (slices) in the plan. If the query plan size exceeds the specified value, the query is canceled and an error is returned. A value of 0 means the plan size is not monitored. Measured in bytes. This value must be a multiple of 1 MB.
For more information, see this Greenplum® guide
gp_max_slices
| Available in version | Type | Acceptable values | Default value | Context |
|---|---|---|---|---|
| 6.25 and higher | Integer | from 10 to 100 | 100 | user |
Specifies the maximum number of slices a query can generate. If a query generates more slices than specified, the database returns an error and does not execute the query.
Executing a query that generates a massive number of slices may compromise database performance. For example, a query containing the UNION or UNION ALL statements for several complex views can generate a massive number of slices. To view slice statistics for a query, run the EXPLAIN command.
Warning
The service-level acceptable values are different from the DBMS-level acceptable values. This is done to reduce the probability of an uncontrolled cluster crash due to out-of-memory (OOM) errors caused by attempting to execute a "non-optimal" query. You can bypass this restriction by overriding the parameter at the session, user, or database level.
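For instance, a session that legitimately needs a complex query could inspect its plan and raise the limit locally (object names and values below are illustrative):

```sql
-- Inspect how many slices the plan produces:
EXPLAIN SELECT * FROM complex_view_a UNION ALL SELECT * FROM complex_view_b;

-- Raise the limit for this session only, above the service-level cap:
SET gp_max_slices = 250;

-- Or persist the override for one database:
ALTER DATABASE analytics SET gp_max_slices = 250;
```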
For more information, see this Greenplum® guide
gp_resource_group_memory_limit
| Available in version | Type | Acceptable values | Default value | Context |
|---|---|---|---|---|
| 6.25 and higher | Real | from 0.1 to 0.8 | 0.8 | postmaster |
Note
This configuration parameter applies only when group-based resource management is activated.
It sets the maximum percentage of system memory resources that can be allocated to resource groups per database segment node. Decreasing this parameter may be justified if you need to allocate more resources to auxiliary components residing on the database segment node (e.g., PXF).
For more information, see this Greenplum® guide
gp_vmem_protect_segworker_cache_limit
| Available in version | Type | Acceptable values | Default value | Context |
|---|---|---|---|---|
| 6.25 and higher | Real | from 0 to 4096 | 500 | postmaster |
If the query execution process consumes more than the specified amount of memory, the process will not be cached for use in subsequent queries after it is over. In systems with a large number of connections or idle processes, you can reduce this value to free up more memory on the segments. Measured in megabytes.
For more information, see this Greenplum® guide
gp_workfile_compression
| Available in version | Type | Acceptable values | Default value | Context |
|---|---|---|---|---|
| 6.25 and higher | Boolean | true (on), false (off) | false (off) | user |
Specifies whether to compress the temporary files created when uploading to disk during hashing operations (aggregation or join). In some cases, compression can help avoid overloading the disk I/O subsystem.
For more information, see this Greenplum® guide
Warning
Changing this setting will cause the cluster hosts to restart one at a time.
gp_workfile_limit_files_per_query
| Available in version | Type | Acceptable values | Default value | Context |
|---|---|---|---|---|
| 6.25 and higher | Integer | from 0 to 100000 | 10000 | user |
Sets the maximum number of temporary spill files (workfiles) allowed per query on each segment. Spill files are created when a query requires more memory than it is allocated.
If the limit is exceeded, the current query is terminated. A value of 0 means the number of spill files is unlimited.
For more information, see this Greenplum® guide
gp_workfile_limit_per_query
| Available in version | Type | Acceptable values | Default value | Context |
|---|---|---|---|---|
| 6.25 and higher | Integer | from 0 to 1099511627776 | 0 | user |
Sets the maximum disk space an individual query can use to create temporary workfiles (workfiles) on each segment. A value of 0 means there is no limit. Specified in bytes. This value must be a multiple of 1 MB.
For more information, see this Greenplum® guide
gp_workfile_limit_per_segment
| Available in version | Type | Acceptable values | Default value | Context |
|---|---|---|---|---|
| 6.25 and higher | Integer | from 0 to the segment disk size / 2 | It is calculated dynamically based on the disk size on the segment | postmaster |
Sets the maximum total disk space that can be used by all ongoing queries to create temporary workfiles (workfiles) on each segment. A value of 0 means there is no limit.
The default value depends on the segment host storage size and is calculated by the formula:
0.1 × <segment_host_storage_size> / <number_of_segments_per_host>
For more information, see this Greenplum® guide
idle_in_transaction_session_timeout
Note
This is an open-gpdb build-specific parameter.
| Available in version | Type | Acceptable values | Default value | Context |
|---|---|---|---|---|
| 6.25 and higher | Integer | from 0 to 2147483647 | 0 | user |
Terminates any session that has been idle (i.e., awaiting a client request) within an open transaction for longer than the specified time. Measured in milliseconds. A value of 0 disables the timeout.
You can use this parameter to ensure that idle sessions do not hold locks for too long. Even when no significant locks are held, an open transaction prevents the removal of recently dead tuples that are visible only to that transaction, and it takes up a slot in the resource group (the concurrency parameter of the resource group).
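For example, to drop sessions that sit idle in a transaction for more than a minute (a sketch; the value is illustrative):

```sql
-- 60,000 ms = 1 minute; applies to the current session.
SET idle_in_transaction_session_timeout = 60000;

-- A value of 0 would disable the timeout again:
-- SET idle_in_transaction_session_timeout = 0;
```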
lock_timeout
| Available in version | Type | Acceptable values | Default value | Context |
|---|---|---|---|---|
| 6.25 and higher | Integer | from 0 to 2147483647 | 0 | user |
Interrupts any query that waits longer than the specified number of milliseconds for a lock on a table, index, row, or other database object. The time limit applies separately to each lock acquisition attempt. This limit applies both to explicit lock requests (e.g., LOCK TABLE or SELECT FOR UPDATE) and implicitly acquired locks.
If log_min_error_statement is set to ERROR or lower, the database logs the request that has exceeded the allocated time. A value of 0 disables the lock timeout monitoring.
Unlike statement_timeout, this timeout can only occur while waiting for locks. Note that if statement_timeout is non-zero, there is no point in setting lock_timeout to an equal or greater value, because the statement timeout will always trigger first.
The database uses the deadlock_timeout and gp_global_deadlock_detector_period parameters to initiate local and global deadlock detection. Note that if lock_timeout is enabled and set to a value smaller than these deadlock detection timeouts, the database will terminate the query before a deadlock check is ever run for that session.
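A sketch of a sensible combination, keeping lock_timeout below statement_timeout so that lock waits fail fast while long-running work is still bounded (the table name is hypothetical):

```sql
SET statement_timeout = '30s';  -- cap total query execution time
SET lock_timeout = '2s';        -- but give up on a lock much sooner

BEGIN;
-- Fails after ~2 s with a lock-timeout error if another session
-- already holds a conflicting lock on the table:
LOCK TABLE sales IN ACCESS EXCLUSIVE MODE;
COMMIT;
```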
For more information, see this Greenplum® guide
log_connections
| Available in version | Type | Acceptable values | Default value | Context |
|---|---|---|---|---|
| 6.25 and higher | Boolean | true (on), false (off) | false (off) | backend |
Adds to the server log a line with detailed info on each successful connection. Some client programs, e.g., psql, attempt to connect twice to figure out if a password is required; therefore, repeated "connection received" messages do not always indicate a problem.
For more information, see this Greenplum® guide
log_disconnections
| Available in version | Type | Acceptable values | Default value | Context |
|---|---|---|---|---|
| 6.25 and higher | Boolean | true (on), false (off) | false (off) | backend |
Adds to the server log a line that marks the end of the client session, including the duration of the session.
For more information, see this Greenplum® guide
log_error_verbosity
| Available in version | Acceptable values | Default value | Context |
|---|---|---|---|
| 6.25 and higher | terse, default, verbose | default | superuser |
Controls the level of detail of the server log entries logged for each message.
For more information, see this Greenplum® guide
log_hostname
| Available in version | Type | Acceptable values | Default value | Context |
|---|---|---|---|---|
| 6.25 and higher | Boolean | true (on), false (off) | true (on) | sighup |
By default, connection log entries only show the IP address of the connecting host. With this option on, both the IP address and name of the connecting host will be logged to the master log.
For more information, see this Greenplum® guide
log_min_duration_statement
| Available in version | Type | Acceptable values | Default value | Context |
|---|---|---|---|---|
| 6.25 and higher | Integer | from -1 to 2147483647 | -1 | superuser |
Logs a query and its duration in a single line if the query duration is greater than or equal to the specified number of milliseconds. A value of 0 logs all queries and their durations.
A value of -1 disables this function. For example, if you set it to 250, it will log all SQL queries that take 250 ms or longer.
Enabling this option can help to identify non-optimized queries in your applications.
For more information, see this Greenplum® guide
log_min_messages
| Available in version | Acceptable values | Default value | Context |
|---|---|---|---|
| 6.25 and higher | debug5, debug4, debug3, debug2, debug1, info, notice, warning, log, error, fatal, panic | warning | superuser |
Controls which message levels to send to the server log. Each level includes all the following levels. The lower the level is down the list, the fewer messages will go to the log.
For more information, see this Greenplum® guide
log_statement
| Available in version | Acceptable values | Default value | Context |
|---|---|---|---|
| 6.25 and higher | none, ddl, mod, all | ddl | superuser |
Controls which SQL queries to log. The ddl level logs all data definition commands, such as CREATE, ALTER, and DROP. The mod level logs all ddl-level statements, plus INSERT, UPDATE, DELETE, TRUNCATE, and COPY FROM. PREPARE and EXPLAIN ANALYZE statements are also logged if they contain a command of the relevant type.
For more information, see this Greenplum® guide
log_statement_stats
| Available in version | Type | Acceptable values | Default value | Context |
|---|---|---|---|---|
| 6.25 and higher | Boolean | true (on), false (off) | false (off) | superuser |
Logs general performance statistics for the parser, planner, and query executor to the server log for each query. It is used as a basic profiling tool.
For more information, see this Greenplum® guide
master_shared_buffers
Note
The DBMS itself has only one parameter: shared_buffers. On the service side, there is a separation into shared_buffers for masters and segments. The setting for segments is called segment_shared_buffers.
| Available in version | Type | Acceptable values | Default value | Context |
|---|---|---|---|---|
| 6.25 and higher | Integer | from 1048576 to the calculated value | 134217728 (128 MB) | postmaster |
Sets the amount of memory the master instance uses for shared memory buffers. This value must be at least 1 MB and must not exceed the value of the following expression (in particular, this depends on max_connections):
max(0.24 × available RAM on master, 16384 × max_connections)
Measured in bytes.
For more information, see this Greenplum® guide
Warning
Changing this setting will cause the cluster hosts to restart one at a time.
max_connections
| Available in version | Type | Acceptable values | Default value | Context |
|---|---|---|---|---|
| 6.25 and higher | Integer | from 50 to 1,000 | 200 | postmaster |
Maximum number of concurrent connections to the database cluster. When increasing this parameter, you should also increase max_prepared_transactions. Increasing this parameter may cause the database to request more shared memory (see master_shared_buffers and segment_shared_buffers).
For segments, the max_connections value is calculated automatically using the following formula:
5 × max_connections (for master)
If you update this setting, both the master and segment hosts will be checked to have at least 20 MB of available RAM per connection. If this condition is not met, this error occurs.
For more information, see this Greenplum® guide
max_locks_per_transaction
| Available in version | Type | Acceptable values | Default value | Context |
|---|---|---|---|---|
| 6.25 and higher | Integer | from 128 to 2,048 | 128 | postmaster |
The shared lock table is built to be able to describe locks for max_locks_per_transaction × (max_connections + max_prepared_transactions) objects. Therefore, not more than this number of different objects can be locked at the same time. This is not a hard limit on the number of locks set by a single transaction but rather the maximum average value. You may need to increase this value if you have clients that access multiple different tables within a single transaction.
For more information, see this Greenplum® guide
max_prepared_transactions
| Available in version | Type | Acceptable values | Default value | Context |
|---|---|---|---|---|
| 6.25 and higher | Integer | from 50 to 10,000 | 200 | postmaster |
Sets the maximum number of transactions that can be in prepared state simultaneously. The database uses prepared transactions internally to ensure data integrity across segments. This value must be at least as large as max_connections on the master.
For more information, see this Greenplum® guide
max_slot_wal_keep_size
| Available in version | Type | Acceptable values | Default value | Context |
|---|---|---|---|---|
| 6.25 and higher | Integer | from 0 to 214748364800 | Calculated | sighup |
Sets the maximum size in bytes of write-ahead log (WAL) files on disk for each segment instance that can be reserved when the database transfers data to a mirror segment or a standby master to maintain synchronization with its corresponding primary segment instance or master.
If the files exceed the set maximum size, they will be released and become available for deletion. As a result, the mirror or standby instance may lose the ability to continue replication because required WAL files have been deleted.
If the active primary instances have a non-standard value set for max_slot_wal_keep_size, full and incremental recovery of their mirrors may become impossible. Depending on concurrent workload on the primary instance during full recovery, the recovery process may terminate with an error due to missing WAL files.
For more information, see this Greenplum® guide
max_statement_mem
| Available in version | Type | Acceptable values | Default value | Context |
|---|---|---|---|---|
| 6.25 and higher | Integer | from 134217728 to 1099511627776 | 2097152000 (2000 MB) | superuser |
Sets the maximum memory limit per query. This helps prevent out-of-memory errors on the segment host during query processing due to excessively high statement_mem.
When modifying the max_statement_mem and statement_mem parameters, change max_statement_mem first.
For more information, see this Greenplum® guide
runaway_detector_activation_percent
| Available in version | Type | Acceptable values | Default value | Context |
|---|---|---|---|---|
| 6.25 and higher | Integer | from 0 to 100 | 90 | postmaster |
For queries managed by resource queues or resource groups, decides when the database must terminate ongoing queries based on the amount of memory they are using. A value of 100 disables automatic termination of queries based on memory utilization percentage. Either the resource queue management method or the resource group management method can be active; both methods cannot be active at the same time.
The gp_resource_manager server configuration parameter determines which method is active.

When resource queues are enabled, this parameter sets the vmem memory utilization percentage at which queries are terminated. If the vmem utilization percentage for a segment exceeds the specified value, the database begins terminating queries managed by resource queues, starting with the query using the most memory, until the vmem utilization percentage drops below the specified value. For example, if the vmem memory size is set to 10 GB and this parameter is 90 (90%), the database starts terminating queries as soon as vmem utilization exceeds 9 GB.

When resource groups are enabled, this parameter sets the utilization percentage of the shared global memory of resource groups. Resource groups share a global memory pool if the sum of the MEMORY_LIMIT values configured for all resource groups is less than 100%. For example, if you have three resource groups with MEMORY_LIMIT values of 10%, 20%, and 30%, the shared global memory is 40% = 100% − (10% + 20% + 30%). If the utilization percentage of the shared global memory exceeds the specified value, the database begins terminating queries managed by resource groups with the vmtracker memory auditor configured (e.g., admin_group and default_group), starting with the query consuming the most memory, until the utilization percentage drops below the specified value. For example, with 10 GB of shared global memory and this parameter set to 90 (90%), the database starts terminating queries as soon as 9 GB of shared global memory has been used.
For more information, see this Greenplum® guide
segment_shared_buffers
Note
The DBMS itself has only one parameter: shared_buffers. On the service side, there is a separation into shared_buffers for masters and segments. The setting for masters is called master_shared_buffers.
| Available in version | Type | Acceptable values | Default value | Context |
|---|---|---|---|---|
| 6.25 and higher | Integer | from 1048576 to the calculated value | 134217728 (128 MB) | postmaster |
Sets the amount of memory the segment instance uses for shared memory buffers. This parameter value must be at least 1 MB and must not exceed the value of the following expression (in particular, this depends on max_connections):
max(0.2 × <RAM available on the segment> / <number of segments per host>, 16384 × 5 × max_connections)
For more information, see this Greenplum® guide
Warning
Changing this setting will cause the cluster hosts to restart one at a time.
Parameters the user can manage locally
This section describes parameters that the user can manage only at the session, user, or database level. These parameters cannot be managed via the management console, YC CLI, or API.
gp_default_storage_options
| Available in version | Acceptable values | Default value | Context |
|---|---|---|---|
| 6.25 and higher | appendoptimized: true, false; blocksize: [8192; 2097152]; checksum: true, false; compresstype: zlib, zstd, quicklz, rle_type, none; compresslevel: [0; 19]; orientation: row, column | appendonly=false, blocksize=32768, checksum=true, compresstype=none, orientation=row | user |
Sets default values for the following table storage options when you create a table using CREATE TABLE:
- `appendoptimized`: use the `appendoptimized=value` syntax to specify the append-optimized table storage type. The `appendoptimized` parameter is a shorthand for the deprecated `appendonly` storage parameter. The database stores the `appendonly` value in the catalog and displays it when listing the storage parameters of append-optimized tables.
- `blocksize`
- `checksum`
- `compresstype`
- `compresslevel`
- `orientation`
Multiple storage parameters are specified as a list separated by commas. You can set storage parameters using this parameter instead of specifying them in the WITH clause of the CREATE TABLE command. Storage parameters specified in the CREATE TABLE command override the values set by this parameter.
Not all combinations of storage parameter values are allowed. If the storage parameters you specify are not allowed, you get an error. You can find the information about the table storage parameters in the description of the CREATE TABLE command.
You can set default values for database and user. If the server configuration parameter is set at different levels, the table storage parameter values are prioritized as follows when the user logs in to the database and creates a table (from highest to lowest):
- Values specified in the `CREATE TABLE` command with the `WITH` or `ENCODING` clause.
- The gp_default_storage_options value set for the user using the `ALTER ROLE...SET` command.
- The gp_default_storage_options value set for the database using the `ALTER DATABASE...SET` command.
- The gp_default_storage_options value set for the entire cluster (the default).
The parameter is not cumulative. For example, if the parameter sets appendoptimized and compresstype for the database, and the user uses it to set orientation, the appendoptimized and compresstype values set at the database level will be ignored.
In this example, the ALTER DATABASE command sets the default table storage parameters orientation and compresstype for the mytest database:
ALTER DATABASE mytest SET gp_default_storage_options = 'orientation=column, compresstype=rle_type'
To create an append-optimized columnar format table with RLE compression in the mytest database, it is enough to state appendoptimized=TRUE in the WITH clause.
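Continuing that example, a table created afterwards only needs appendoptimized=TRUE in its WITH clause; the remaining storage options come from the database-level default (the table name is illustrative):

```sql
-- orientation=column and compresstype=rle_type are taken from the
-- gp_default_storage_options value set on the mytest database:
CREATE TABLE mytest_events (id int, payload text)
WITH (appendoptimized=TRUE)
DISTRIBUTED BY (id);
```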
For more information, see this Greenplum® guide
statement_mem
| Available in version | Type | Acceptable values | Default value | Context |
|---|---|---|---|---|
| 6.25 and higher | Integer | from 1000 to 2147483647 | 128000 (125 MB) | user |
Allocates segment host memory per query. The amount of memory allocated via this parameter cannot exceed max_statement_mem or the memory limit of the resource queue or resource group used to submit the query. If a query requires additional memory, it spills to temporary files on disk.
For more information, see this Greenplum® guide
statement_timeout
| Available in version | Type | Acceptable values | Default value | Context |
|---|---|---|---|---|
| 6.25 and higher | Integer | from 0 to 2147483647 | 0 | user |
Terminates any query whose execution takes longer than the specified number of milliseconds. A value of 0 disables the limit.
For more information, see this Greenplum® guide
Parameters not available to the user
This section presents parameters that cannot be managed by the user but are referenced by other parameters in this guide.
deadlock_timeout
| Available in version | Type | Acceptable values | Default value | Context |
|---|---|---|---|---|
| 6.25 and higher | Integer | from 1 to 214748364 | 1000 | superuser |
The time to wait on a lock before checking for a deadlock. You may need to increase this value on a heavily loaded server. Ideally, this setting should exceed the typical transaction time to increase the probability that a lock will be released before the waiting process decides to run a deadlock check.
For more information, see this Greenplum® guide
log_min_error_statement
| Available in version | Acceptable values | Default value | Context |
|---|---|---|---|
| 6.25 and higher | debug5, debug4, debug3, debug2, debug1, info, notice, warning, error, fatal, panic | error | superuser |
Controls which failed SQL statements are logged: any SQL query causing an error of the specified level or higher is written to the log. To effectively turn off logging of failed queries, set this parameter to panic.
For more information, see this Greenplum® guide.
gp_resource_manager
| Available in version | Acceptable values | Default value | Context |
|---|---|---|---|
| 6.25 and higher | group, queue | group | postmaster |
Specifies the resource management method currently active in the cluster. The resource group method is used by default.
For more information, see this Greenplum® guide.
External S3 data source settings
The following settings are available:
- Access Key

  Management console, CLI, API

  S3 storage public access key.

  For more information, see this Greenplum® guide.

- Secret Key

  Management console, CLI, API

  S3 storage secret access key.

  For more information, see this Greenplum® guide.

- Fast Upload

  Management console, CLI, API

  This setting controls fast uploading of large files to S3 storage. If disabled, PXF generates files on disk before sending them to S3 storage. If enabled, PXF generates files in RAM and writes them to disk only when the RAM limit is reached.

  Fast upload is enabled by default.

  For more information, see this Greenplum® guide.

- Endpoint

  Management console, CLI, API

  S3 storage address. The default value, storage.yandexcloud.net, is the address of Yandex Object Storage.

  For more information, see this Greenplum® guide.
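Once an S3 source is configured, queries reference it through a named PXF server. A minimal sketch, where the bucket name, path, column list, and server name my_s3 are assumptions for illustration:

```sql
-- Read CSV objects from an S3 bucket through the PXF server named "my_s3"
CREATE EXTERNAL TABLE ext_s3_sales (id int, amount numeric)
LOCATION ('pxf://my-bucket/sales/?PROFILE=s3:text&SERVER=my_s3')
FORMAT 'TEXT' (DELIMITER ',');
```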
External JDBC data source settings
The following settings are available:
- Driver

  Management console, CLI, API

  JDBC driver class in Java. The possible values are:

  - com.simba.athena.jdbc.Driver
  - com.clickhouse.jdbc.ClickHouseDriver
  - com.ibm.as400.access.AS400JDBCDriver
  - com.microsoft.sqlserver.jdbc.SQLServerDriver
  - com.mysql.cj.jdbc.Driver
  - org.postgresql.Driver
  - oracle.jdbc.driver.OracleDriver
  - net.snowflake.client.jdbc.SnowflakeDriver
  - io.trino.jdbc.TrinoDriver

  For more information, see this Greenplum® guide.

- Url

  Management console, CLI, API

  Database URL. Examples:

  - jdbc:mysql://mysqlhost:3306/testdb: for a local MySQL® DB.
  - jdbc:postgresql://c-<cluster_ID>.rw.mdb.yandexcloud.net:6432/db1: for a Yandex Managed Service for PostgreSQL cluster. The address contains a special FQDN of the master host in the cluster.
  - jdbc:oracle:thin:@host.example:1521:orcl: for an Oracle DB.

  For more information, see this Greenplum® guide.

- User

  Management console, CLI, API

  DB owner username.

  For more information, see this Greenplum® guide.

- Password

  Management console, CLI, API

  DB user password.

  For more information, see this Greenplum® guide.

- Statement Batch Size

  Management console, CLI, API

  Number of rows in a batch for reading from an external table.

  The default value is 100.

  For more information, see this Greenplum® guide.

- Statement Fetch Size

  Management console, CLI, API

  Number of rows to buffer when reading from an external table.

  The default value is 1000.

  For more information, see this Greenplum® guide.

- Statement Query Timeout

  Management console, CLI, API

  Time (in seconds) the JDBC driver waits for a read or write operation to complete.

  The default value is 60.

  For more information, see this Greenplum® guide.

- Pool Enabled

  Management console, CLI, API

  This setting determines whether the JDBC connection pool is used. It is enabled by default.

  For more information, see this Greenplum® guide.

- Pool Maximum Size

  Management console, CLI, API

  Maximum number of database server connections.

  The default value is 5.

  For more information, see this Greenplum® guide.

- Pool Connection Timeout

  Management console, CLI, API

  Maximum time (in milliseconds) to wait for a connection from the pool.

  The default value is 30000.

  For more information, see this Greenplum® guide.

- Pool Idle Timeout

  Management console, CLI, API

  Maximum time (in milliseconds) a connection can remain inactive in the pool before it is considered idle and closed.

  The default value is 30000.

  For more information, see this Greenplum® guide.

- Pool Minimum Idle

  Management console, CLI, API

  Minimum number of idle connections in the pool.

  The default value is 0.

  For more information, see this Greenplum® guide.
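A configured JDBC source is likewise referenced through a named PXF server. A minimal sketch, where the remote table, column list, and server name my_jdbc are assumptions for illustration:

```sql
-- Query the table public.orders in the external DBMS through
-- the PXF server named "my_jdbc"
CREATE EXTERNAL TABLE ext_orders (id int, amount numeric)
LOCATION ('pxf://public.orders?PROFILE=jdbc&SERVER=my_jdbc')
FORMAT 'CUSTOM' (FORMATTER='pxfwritable_import');
```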
External HDFS data source settings
The following settings are available:
- Core

  Management console, API

  Settings of the file system and security rules.

  For more information, see the Apache Hadoop documentation.

  - Default Fs

    URI that defines the HDFS file system.

  - Security Auth To Local

    Rules for mapping Kerberos principals to operating system user accounts.

- Kerberos

  Management console, API

  Settings of the Kerberos network authentication protocol.

  For more information, see the relevant Greenplum® documentation.

  - Enable

    Determines whether the Kerberos authentication server is used. It is not used by default.

  - Primary

    Host of the main KDC (Key Distribution Center) server.

  - Realm

    Kerberos realm for a Greenplum® database.

  - Kdc Servers

    Hosts of KDC servers.

  - Admin server

    Host of the administration server. This is usually the main Kerberos server.

  - Default domain

    Domain used to expand host names when translating Kerberos 4 service principals into Kerberos 5 service principals (e.g., when converting rcmd.hostname to host/hostname.domain).

  - Keytab Base64

    Base64-encoded contents of the keytab file.

- User Impersonation

  Management console, API

  Determines whether you can authenticate in an external file storage or DBMS on behalf of a Greenplum® user.

  By default, such authentication is prohibited.

  For more information, see the relevant Greenplum® documentation.

- Username

  Management console, API

  Username used to connect to an external file storage or DBMS if user impersonation is disabled.

  For more information, see the relevant Greenplum® documentation.

- Sasl Connection Retries

  Management console, API

  Maximum number of times PXF retries a SASL connection request after a GSS initiate failed error.

  The default value is 5.

  For more information, see the relevant Greenplum® documentation.

- ZK Hosts

  Management console, API

  Hosts of ZooKeeper servers. The values are specified in <address>:<port> format.

  For more information, see the Apache Hadoop documentation.

- Dfs

  Management console, API

  Distributed file system settings.

  For more information, see the Apache Hadoop documentation.

  - Ha Automatic Failover Enabled

    Determines whether automatic failover for high availability of the file system is enabled. It is enabled by default.

  - Block Access Token Enabled

    Determines whether access tokens are used. By default, tokens are verified when connecting to datanodes.

  - Use Datanode Hostname

    Determines whether datanode host names are used when connecting to the relevant nodes. They are used by default.

  - Nameservices

    List of logical names of HDFS services. You can specify any names, separating them with commas.

- Yarn

  Management console, API

  Settings for the ResourceManager service, which tracks cluster resources and schedules running applications, such as MapReduce jobs.

  For more information, see the Apache Hadoop documentation.

  - Resourcemanager Ha Enabled

    Determines whether high availability for ResourceManager is enabled. It is enabled by default.

  - Resourcemanager Ha Auto Failover Enabled

    Determines whether automatic failover to another ResourceManager is enabled if the active service fails or becomes unresponsive. Automatic failover is enabled by default only if Resourcemanager Ha Enabled is enabled.

  - Resourcemanager Ha Auto Failover Embedded

    Determines whether the embedded ActiveStandbyElector method is used to select the active service. If the current active service fails or becomes unresponsive, ActiveStandbyElector designates another ResourceManager service as active, and it assumes the managing role.

    It is enabled by default only if the Resourcemanager Ha Enabled and Resourcemanager Ha Auto Failover Enabled settings are both enabled.

  - Resourcemanager Cluster Id

    Cluster ID. It prevents a ResourceManager service from becoming active for a different cluster.
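A configured HDFS source can then be read through a named PXF server. A minimal sketch, where the directory path, column list, and server name my_hdfs are assumptions for illustration:

```sql
-- Read delimited text files from an HDFS directory through
-- the PXF server named "my_hdfs"
CREATE EXTERNAL TABLE ext_logs (ts timestamp, message text)
LOCATION ('pxf://user/gpadmin/logs?PROFILE=hdfs:text&SERVER=my_hdfs')
FORMAT 'TEXT' (DELIMITER ',');
```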
External Hive data source settings
The following settings are available:
- Core

  Management console, API

  Settings of the file system and security rules.

  For more information, see the Apache Hadoop documentation.

  - Default Fs

    URI that defines the HDFS file system.

  - Security Auth To Local

    Rules for mapping Kerberos principals to operating system user accounts.

- Kerberos

  Management console, API

  Settings of the Kerberos network authentication protocol.

  For more information, see the relevant Greenplum® documentation.

  - Enable

    Determines whether the Kerberos authentication server is used. It is not used by default.

  - Primary

    Host of the main KDC (Key Distribution Center) server.

  - Realm

    Kerberos realm for a Greenplum® database.

  - Kdc Servers

    Hosts of KDC servers.

  - Admin server

    Host of the administration server. This is usually the main Kerberos server.

  - Default domain

    Domain used to expand host names when translating Kerberos 4 service principals into Kerberos 5 service principals (e.g., when converting rcmd.hostname to host/hostname.domain).

  - Keytab Base64

    Base64-encoded contents of the keytab file.

- User Impersonation

  Management console, API

  Determines whether you can authenticate in an external file storage or DBMS on behalf of a Greenplum® user.

  By default, such authentication is prohibited.

  For more information, see the relevant Greenplum® documentation.

- Username

  Management console, API

  Username used to connect to an external file storage or DBMS if user impersonation is disabled.

  For more information, see the relevant Greenplum® documentation.

- Sasl Connection Retries

  Management console, API

  Maximum number of times PXF retries a SASL connection request after a GSS initiate failed error.

  The default value is 5.

  For more information, see the relevant Greenplum® documentation.

- ZK Hosts

  Management console, API

  Hosts of ZooKeeper servers. The values are specified in <address>:<port> format.

  For more information, see the Apache Hadoop documentation.

- Ppd

  Management console, API

  Determines whether predicate pushdown is enabled for external table queries. It is enabled by default.

  For more information, see this Greenplum® guide.

- Metastore Uris

  Management console, API

  List of comma-separated URIs. To request metadata, the external DBMS connects to Metastore using one of these URIs.

- Metastore Kerberos Principal

  Management console, API

  Service principal for the Metastore Thrift server.

- Auth Kerberos Principal

  Management console, API

  Kerberos server principal.
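A configured Hive source is also queried through a named PXF server. A minimal sketch, where the Hive table, column list, and server name my_hive are assumptions for illustration:

```sql
-- Query the Hive table default.visits through the PXF server named "my_hive"
CREATE EXTERNAL TABLE ext_visits (user_id int, url text)
LOCATION ('pxf://default.visits?PROFILE=hive&SERVER=my_hive')
FORMAT 'CUSTOM' (FORMATTER='pxfwritable_import');
```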