Creating an external JDBC data source
In Yandex MPP Analytics for PostgreSQL, you can use the following as an external data source with the JDBC connection type:
- ClickHouse®
- MySQL®
- Oracle
- PostgreSQL
- SQL Server
- Trino
This list contains managed Yandex Cloud DBs and third-party DBs.
Create an external data source
- Open the folder dashboard
. - Navigate to Yandex MPP Analytics for PostgreSQL.
- Open the page of the Greenplum® cluster in question.
- In the left-hand panel, select
PXF. - Click Create data source.
- Select the
JDBCconnection type. - Enter a source name.
- Configure at least one optional setting.
- Click Create.
After you create an external data source, create an external table.
If you do not have the Yandex Cloud CLI yet, install and initialize it.
The folder used by default is the one specified when creating the CLI profile. To change the default folder, use the yc config set folder-id <folder_ID> command. You can also specify a different folder for any command using --folder-name or --folder-id. If you access a resource by its name, the search will be limited to the default folder. If you access a resource by its ID, the search will be global, i.e., through all folders based on access permissions.
To create an external JDBC data source:
-
View the description of the CLI command for creating a data source:
yc managed-greenplum pxf-datasource create jdbc --help -
Configure the data source:
yc managed-greenplum pxf-datasource create jdbc <external_data_source_name> \ --cluster-id=<cluster_ID> \ --driver=<driver_address> \ --url=<database_URL> \ --user=<username> \ --password=<password>Where:
-
cluster-id: Cluster ID. You can get it with the list of clusters in the folder. -
driver: JDBC driver class in Java. The possible values are:com.clickhouse.jdbc.ClickHouseDrivercom.ibm.as400.access.AS400JDBCDrivercom.microsoft.sqlserver.jdbc.SQLServerDrivercom.mysql.cj.jdbc.Driverorg.postgresql.Driveroracle.jdbc.driver.OracleDrivernet.snowflake.client.jdbc.SnowflakeDriverio.trino.jdbc.TrinoDriver
-
url: Database URL. Examples:jdbc:mysql://mysqlhost:3306/testdb: For a local MySQL® database.jdbc:postgresql://c-<cluster_ID>.rw.mdb.yandexcloud.net:6432/db1: For a Yandex Managed Service for PostgreSQL cluster. The address contains a special FQDN of the master host in the cluster.jdbc:oracle:thin:@host.example:1521:orcl: For an Oracle database.
-
user: DB owner username. -
password: DB user password.
You can also configure advanced settings.
-
After you create an external data source, create an external table.
-
Get an IAM token for API authentication and put it into an environment variable:
export IAM_TOKEN="<IAM_token>" -
Call the PXFDatasource.Create method, e.g., via the following cURL
request:curl \ --request POST \ --header "Authorization: Bearer $IAM_TOKEN" \ --header "Content-Type: application/json" \ --url 'https://mdb.api.cloud.yandex.net/managed-greenplum/v1/clusters/<cluster_ID>/pxf_datasources' \ --data '{ "datasource": { "name": "<external_data_source_name>", "jdbc": { "driver": "<driver_address>", "url": "<database_URL>", "user": "<user_login>", "password": "<user_password>", ... } } }'Where:
name: External data source name.jdbc: External data source settings. Configure at least one optional setting.
You can get the cluster ID with the list of clusters in the folder.
-
Check the server response to make sure your request was successful.
After you create an external data source, create an external table.
-
Get an IAM token for API authentication and put it into an environment variable:
export IAM_TOKEN="<IAM_token>" -
Clone the cloudapi
repository:cd ~/ && git clone --depth=1 https://github.com/yandex-cloud/cloudapiBelow, we assume that the repository contents reside in the
~/cloudapi/directory. -
Call the PXFDatasourceService.Create method, e.g., via the following gRPCurl
request:grpcurl \ -format json \ -import-path ~/cloudapi/ \ -import-path ~/cloudapi/third_party/googleapis/ \ -proto ~/cloudapi/yandex/cloud/mdb/greenplum/v1/pxf_service.proto \ -rpc-header "Authorization: Bearer $IAM_TOKEN" \ -d '{ "cluster_id": "<cluster_ID>" "datasource": { "name": "<external_data_source_name>", "jdbc": { "driver": "<driver_address>", "url": "<database_URL>", "user": "<user_login>", "password": "<user_password>", ... } } }' \ mdb.api.cloud.yandex.net:443 \ yandex.cloud.mdb.greenplum.v1.PXFDatasourceService.CreateWhere:
name: External data source name.jdbc: External data source settings. Configure at least one optional setting.
You can get the cluster ID with the list of clusters in the folder.
-
Check the server response to make sure your request was successful.
After you create an external data source, create an external table.
This method works well for Yandex MPP Analytics for PostgreSQL that uses Apache Cloudberry™
To create an external data source and an external table using SQL, follow these steps:
-
Create an external data source:
CREATE SERVER pgserver FOREIGN DATA WRAPPER jdbc_pxf_fdw OPTIONS ( CONFIG 'default', JDBC_DRIVER 'org.postgresql.Driver', DB_URL 'jdbc:postgresql://host:5432/db', USER '<username>', PASS '<password>' ); -
Create a mapping between a local user and a user in the external data source:
CREATE USER MAPPING FOR CURRENT_USER SERVER "pgserver"; -
Create an external table:
CREATE FOREIGN TABLE <table_name> (<column_name> <data_type> [, ...]) SERVER "pgserver";
Greenplum® and Greenplum Database® are registered trademarks or trademarks of Broadcom Inc. in the United States and/or other countries.
ClickHouse® is a registered trademark of ClickHouse, Inc