Managing connections in Yandex MPP Analytics for PostgreSQL
Yandex MPP Analytics for PostgreSQL allocates a separate process for each established connection. With numerous client connections, it creates multiple processes and manages distributed data structures. This may cause a lack of computing resources affecting the DBMS performance.
To address the lack of resources issue, a connection pooler is added before the Yandex MPP Analytics for PostgreSQL cluster. The pooler manages connections to allow large numbers of clients to connect to the DBMS without degrading performance. A relatively small number of re-usable connections are maintained between the connection pooler and the DBMS. After the client is disconnected, the connection is returned to the pool and can be reused by the same or a new client.
This deployment method complicates the administration because the servers hosting the connection pooler are added to the DBMS infrastructure.
The Yandex MPP Analytics for PostgreSQL architecture has a built-in connection pooler: Odyssey
Odyssey supports two modes of connection management:
-
Session mode:
In this mode, the client connection is established at the first query to the database and maintained until the client terminates the session. This connection can then be used by that or any other client. This approach allows you to efficiently process multiple repeat client connections to the DBMS (for example, when starting applications that access databases).
This mode is less productive than transaction mode.
-
Transaction mode (default):
In this mode, the client connection is established at the first query to the database and maintained until the transaction ends. This connection can then be used by that or any other client. This approach allows maintaining few server connections between the connection pooler and Yandex MPP Analytics for PostgreSQL hosts when there are many client connections.
The transaction mode provides high performance and allows you to load the DBMS as efficiently as possible. However, this mode is not supported by all Greenplum® and Apache Cloudberry™ clients and does not allow the following:
- Temporary tables
, cursors , and advisory locks that exist beyond a single transaction. - Prepared statements
.
Note
To create a prepared statement in Yandex MPP Analytics for PostgreSQL, use the DBMS driver features. Creating a prepared statement using the
PREPARESQL query is not supported. - Temporary tables
You can change the connection pooler mode after the cluster is created.
Odyssey features
Integrating Yandex MPP Analytics for PostgreSQL with the Odyssey connection pooler has several advantages, e.g., compared to the PgBouncer
| Comparison criterion | Odyssey | PgBouncer |
|---|---|---|
| Resource usage | Yandex MPP Analytics for PostgreSQL clusters are less prone to running out of computing resources with multiple client connections. This is because of asynchronous multithreading built into the Odyssey architecture. This is especially important if most client connections to the DBMS use SSL/TLS. | PgBouncer uses a single-threaded architecture. This may lead to problems with resource consumption and scalability under high load. |
| Supporting client connections | Odyssey strives to keep the client connection alive as long as possible after the transaction ends in order to re-use it if this client returns with a new transaction. | PgBouncer seeks to return such kind of connection to the pool as quickly as possible. |
| Error handling | A Yandex MPP Analytics for PostgreSQL cluster provides improved error handling capabilities. This ensures that DBMS side errors can reach the client application unchanged. | PgBouncer hides DBMS error messages. As a result, all errors look like a PgBouncer connection error to the client. |
In addition, thanks to its integration with Odyssey, a Yandex MPP Analytics for PostgreSQL cluster:
-
Supports numerous client connections without affecting the DBMS performance.
-
Requires no extra connection pooler configuration effort or additional infrastructure for it to operate.
-
Allow you to limit the number of concurrent cluster connections.
-
Support advanced transaction pooling, such as automatic operation cancel and transaction rollback when a client connection is broken.
-
Provides detailed logging of all events that occur. Each client connection also gets a unique ID, which helps track the entire process of establishing a connection.
Tip
If you have issues connecting to a cluster, contact support
. To have your issue resolved faster, provide the full text of the error message, including the connection ID.
Greenplum® and Greenplum Database® are registered trademarks or trademarks of Broadcom Inc. in the United States and/or other countries.
Apache® and Apache Cloudberry™ are registered trademarks or trademarks of the Apache Software Foundation in the United States and/or other countries.