Working with PXF
The Greenplum® Platform Extension Framework (PXF)
Suppose you have a table with sales data spanning several years. The data falls into three categories by age:
- Hot data from the last few months, stored in MySQL®.
- Warm data from the last few years, stored in Greenplum®.
- Cold data from earlier periods, stored in S3.
The colder the data, the less often it is accessed.
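The split by age can be sketched as a small routing function. This is only an illustration: the text says "last few months" and "last few years" without giving exact boundaries, so the 90-day and 3-year cut-offs and the `storage_tier` name are assumptions.

```python
from datetime import date, timedelta

# Hypothetical cut-offs; the text does not define exact boundaries.
HOT_WINDOW = timedelta(days=90)        # "last few months"
WARM_WINDOW = timedelta(days=3 * 365)  # "last few years"


def storage_tier(sale_date: date, today: date) -> str:
    """Return the system assumed to hold a sales row with the given date."""
    age = today - sale_date
    if age <= HOT_WINDOW:
        return "MySQL"
    if age <= WARM_WINDOW:
        return "Greenplum"
    return "S3"
```

With external tables in place, a query against Greenplum® can reach all three tiers without the client needing this kind of routing logic.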
To distribute data across multiple DBMSs while keeping it accessible, you can use PXF to create external tables: special Greenplum® objects that reference tables, buckets, or files in external sources. This section explains how to create external tables that reference external DBMSs.
For such tables, you can specify the external data source settings directly in the SQL query. Alternatively, you can create a source in Yandex MPP Analytics for PostgreSQL with the required settings and reference that source in the SQL query.
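The two options can be compared with a minimal sketch that assembles the `CREATE EXTERNAL TABLE` statement for a JDBC source. The helper `pxf_external_table_ddl`, the table and column names, the source name `mysql_hot`, the driver class, and the connection placeholders are all assumptions made for illustration; check the option names and values against the documentation for your source type.

```python
def pxf_external_table_ddl(table, columns, remote_table, options):
    """Build a CREATE READABLE EXTERNAL TABLE statement for a PXF JDBC source.

    Values are inserted verbatim: this sketch does no quoting or escaping,
    so it must not be fed untrusted input.
    """
    column_list = ", ".join(f"{name} {sql_type}" for name, sql_type in columns)
    params = {"PROFILE": "jdbc", **options}
    query = "&".join(f"{key}={value}" for key, value in params.items())
    return (
        f"CREATE READABLE EXTERNAL TABLE {table} ({column_list})\n"
        f"LOCATION ('pxf://{remote_table}?{query}')\n"
        f"FORMAT 'CUSTOM' (FORMATTER='pxf_import');"
    )


columns = [("id", "int"), ("amount", "numeric")]  # hypothetical columns

# Option 1: connection settings passed inline in the SQL query (placeholders).
inline_ddl = pxf_external_table_ddl(
    "sales_hot",
    columns,
    "sales",
    {
        "JDBC_DRIVER": "com.mysql.cj.jdbc.Driver",  # assumed driver class
        "DB_URL": "jdbc:mysql://<host>:3306/<database>",
        "USER": "<user>",
        "PASS": "<password>",
    },
)

# Option 2: settings stored in a previously created source, referenced by name.
source_ddl = pxf_external_table_ddl(
    "sales_hot", columns, "sales", {"SERVER": "mysql_hot"}
)

print(inline_ddl)
print(source_ddl)
```

Referencing a named source keeps credentials out of the table definition, which is usually preferable once more than one external table uses the same connection.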
Getting started
- In the Greenplum® cluster's subnet, set up a NAT gateway and link a routing table.
- In the same subnet, create a security group that allows all incoming traffic from any address and all outgoing traffic to any address.
Get started with external tables using PXF
- Add a data source to Yandex MPP Analytics for PostgreSQL. The steps for adding a source depend on the source connection type.
- Create an external table using PXF.
- Optionally, update the default PXF settings.
Greenplum® and Greenplum Database® are registered trademarks or trademarks of Broadcom Inc. in the United States and/or other countries.