Yandex Cloud
Yandex project
© 2025 Yandex.Cloud LLC
Yandex Managed Service for Greenplum®

Working with PXF

Written by Yandex Cloud
Updated on May 5, 2025

The Greenplum® Platform Extension Framework (PXF) protocol is used to access data in external databases.

Let's say there is a table with sales data spanning several years. The data falls into three categories by age:

  • Hot data for the last few months stored in MySQL®.
  • Warm data for the last few years stored in Greenplum®.
  • Cold data for a longer period stored in S3.

The colder the data, the less often it is accessed.
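The tiering above can be pictured as a simple routing rule based on the age of a record. The following is a minimal sketch only: the cutoffs `HOT_WINDOW` and `WARM_WINDOW` and the function `storage_tier` are hypothetical, since the text says no more than "last few months" and "last few years".

```python
from datetime import date, timedelta

# Hypothetical cutoffs, chosen only for illustration.
HOT_WINDOW = timedelta(days=90)        # "last few months"
WARM_WINDOW = timedelta(days=3 * 365)  # "last few years"


def storage_tier(sale_date, today):
    """Return the store that would hold a sale made on sale_date."""
    age = today - sale_date
    if age <= HOT_WINDOW:
        return "MySQL"      # hot data
    if age <= WARM_WINDOW:
        return "Greenplum"  # warm data
    return "S3"             # cold data


today = date(2025, 5, 5)
for sale_date in (date(2025, 4, 1), date(2023, 6, 1), date(2015, 1, 1)):
    print(sale_date, storage_tier(sale_date, today))
```

In a real deployment, the boundaries would follow from how often each period is actually queried.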

To distribute data across multiple DBMSs and still be able to access it, you can use PXF to create external tables, i.e., special objects in Greenplum® that reference tables, buckets, or files in external sources. This section explains how to create external tables that reference external DBMSs.

For such tables, you can specify external data source settings in the SQL query. Alternatively, you can create a source in Managed Service for Greenplum® with the settings you need and provide that source in the SQL query.
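To illustrate the two options, here is a minimal sketch that assembles a `CREATE EXTERNAL TABLE` statement as a string, following the general PXF form `pxf://<path>?PROFILE=<profile>[&SERVER=<source>][&<option>=<value>]`. The helper functions, the table and column names, the host, the driver class, and the source name `mysql_hot` are all assumptions made for illustration and should be checked against your own cluster and source settings. The helpers do no escaping, so they are not suitable for untrusted input.

```python
def pxf_location(data_path, profile, server=None, options=None):
    """Build a pxf:// LOCATION URI from a path, a profile, and optional settings."""
    params = [("PROFILE", profile)]
    if server is not None:
        # Refer to a source that was created beforehand and holds the settings.
        params.append(("SERVER", server))
    # Any remaining settings are passed inline as KEY=VALUE pairs.
    params.extend((options or {}).items())
    query = "&".join(f"{key}={value}" for key, value in params)
    return f"pxf://{data_path}?{query}"


def create_external_table_sql(table, columns, location):
    """Return a CREATE READABLE EXTERNAL TABLE statement for a PXF location."""
    cols = ", ".join(f"{name} {sql_type}" for name, sql_type in columns)
    return (
        f"CREATE READABLE EXTERNAL TABLE {table} ({cols})\n"
        f"LOCATION ('{location}')\n"
        "FORMAT 'CUSTOM' (FORMATTER='pxfwritable_import');"
    )


columns = [("id", "int"), ("amount", "numeric")]  # hypothetical columns

# Option 1: source settings given directly in the SQL query (hypothetical values).
inline = pxf_location(
    "sales_hot",
    "jdbc",
    options={
        "JDBC_DRIVER": "com.mysql.cj.jdbc.Driver",
        "DB_URL": "jdbc:mysql://mysql.example.internal:3306/shop",
        "USER": "reader",
    },
)

# Option 2: settings kept in a source named "mysql_hot" (hypothetical name).
named = pxf_location("sales_hot", "jdbc", server="mysql_hot")

print(create_external_table_sql("sales_hot_ext", columns, inline))
print(create_external_table_sql("sales_hot_ext", columns, named))
```

The second form keeps connection details out of the SQL text, which is usually preferable when the same source is used by several tables.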

Getting started

  1. In the Managed Service for Greenplum® cluster subnet, set up a NAT gateway and link a routing table.
  2. In the same subnet, create a security group allowing all incoming and outgoing traffic from all addresses.

Get started with external tables using PXF

  1. Add a data source to Managed Service for Greenplum®. The steps needed to add a source depend on the source connection type:
    • S3
    • JDBC
    • HDFS
    • Hive
  2. Create an external table using PXF.
  3. (Optional) Update the default PXF settings.

Greenplum® and Greenplum Database® are registered trademarks or trademarks of VMware, Inc. in the United States and/or other countries.
