Working with Managed Service for Greenplum® databases
This section provides the basic information about working with Managed Service for Greenplum®.
To work with a Managed Service for Greenplum® database, follow these steps:
- Create a connection containing your database connection credentials.
- Run a query to the database.
Example of a query for reading data from Managed Service for Greenplum®:
SELECT * FROM greenplum_mdb_connection.my_table
Where:
greenplum_mdb_connection
: Name of the DB connection you created.my_table
: Name of the table in the database.
Setting up a connection
To create a connection to Managed Service for Greenplum®:
-
In the management console
, select the folder where you want to create a connection. -
In the list of services, select Yandex Query.
-
In the left-hand panel, go to the Connections tab.
-
Click
Create new. -
Specify the connection parameters:
-
Under General parameters:
- Name: Name of the connection to Managed Service for Greenplum®.
- Type:
Managed Service for Greenplum
.
-
Under Connection type parameters:
-
Cluster: Select an existing Managed Service for Greenplum® cluster or create a new one.
-
Service account: Select an existing Managed Service for Greenplum® service account, or create a new one with the
managed-greenplum.viewer
role, and use it to connect toManaged Service for Greenplum®
clusters.To use a service account, the
iam.serviceAccounts.user
role is required. -
Database: Select the database you will use when working with the Greenplum® cluster.
-
Schema: Specify the namespace
to use when working with the Greenplum® database. -
Login: Username to use when connecting to Greenplum® databases.
-
Password: User password to use when connecting to Greenplum® databases.
-
-
-
Click Create.
You need a service account to detect Managed Service for Greenplum® cluster connection points inside Yandex Cloud. You need a separate login/password pair to access data.
Warning
First, allow network access from Yandex Query to Managed Service for Greenplum® clusters. To do this, enable Access from Yandex Query in the settings of the database to which you are connecting.
Query syntax
Here is the SQL query format used to access Greenplum®:
SELECT * FROM <connection>.<table_name>
Where:
<connection>
: Name of the DB connection you created.<table_name>
: Name of the table in the database.
Limitations
Some limitations apply when working with Greenplum® clusters.
Limitations:
-
External sources are only available for data reading through
SELECT
requests. Yandex Query does not currently support requests that modify tables in external sources. - Data is read in a single stream via the master host, i.e., the coordinator host
of a Greenplum® cluster. Currently, massive parallel reading is not supported. - YQ uses the Yandex Managed Service for YDB type system
. However, the ranges of acceptable values for types used in YDB for date and time operations (Date
,Datetime
, andTimestamp
) often turn out to be insufficiently wide to cover the values of the relevant Greenplum® types (date
andtimestamp
).
Therefore, YQ returns date and time values read from Greenplum® as plain strings (theOptional<Utf8>
type) in ISO-8601 format.
Filter pushdown
Yandex Query can transfer processing of request parts to the data source system. This means that filter expressions are provided through Yandex Query directly to the database for processing; these are typically the query conditions specified in WHERE
. This kind of processing is called filter pushdown
.
Filter pushdown is possible when using:
Description | Example |
---|---|
Filters of the IS NULL /IS NOT NULL type |
WHERE column1 IS NULL or WHERE column1 IS NOT NULL |
OR , NOT , AND logical conditions |
WHERE column IS NULL OR column2 is NOT NULL |
= , <> , < , <= , > , >= comparison conditions with other columns or constants |
WHERE column3 > column4 OR column5 <= 10 |
Supported data types for filter pushdown:
Data type Yandex Query |
---|
BOOL |
INT8 |
UINT8 |
INT16 |
UINT16 |
INT32 |
UINT32 |
INT64 |
UINT64 |
FLOAT |
DOUBLE |
Supported data types
In a Greenplum® DB, the optionality of column values (whether or not the column can contain NULL
values) does not depend on the type system. The NOT NULL
constraint for each column is implemented by means of the attnotnull
attribute in the pg_attributeNULL
values and the YQ type system will represent them as optional
The table below shows how Greenplum® and Yandex Query types map. All other data types except those listed are not supported.
Data type Greenplum® | Data type Yandex Query | Notes |
---|---|---|
boolean |
Optional<Bool> |
|
smallint |
Optional<Int16> |
|
int2 |
Optional<Int16> |
|
integer |
Optional<Int32> |
|
int |
Optional<Int32> |
|
int4 |
Optional<Int32> |
|
serial |
Optional<Int32> |
|
serial4 |
Optional<Int32> |
|
bigint |
Optional<Int64> |
|
int8 |
Optional<Int64> |
|
bigserial |
Optional<Int64> |
|
serial8 |
Optional<Int64> |
|
real |
Optional<Float> |
|
float4 |
Optional<Float> |
|
double precision |
Optional<Double> |
|
float8 |
Optional<Double> |
|
date |
Optional<Utf8> |
|
timestamp |
Optional<Utf8> |
|
bytea |
Optional<String> |
|
character |
Optional<Utf8> |
Default sorting rules apply; the string is padded with spaces to the required length. |
character varying |
Optional<Utf8> |
Default sorting rules apply. |
text |
Optional<Utf8> |
Default sorting rules apply. |
json |
Optional<Json> |