Yandex Cloud
Search
Discuss with expertTry it for free
  • Customer Stories
  • Documentation
  • Blog
  • All Services
  • System Status
  • Marketplace
    • Featured
    • Infrastructure & Network
    • Data Platform
    • AI for business
    • Security
    • DevOps tools
    • Serverless
    • Monitoring & Resources
  • All Solutions
    • By industry
    • By use case
    • Economics and Pricing
    • Security
    • Technical Support
    • Start testing with double trial credits
    • Cloud credits to scale your IT product
    • Gateway to Russia
    • Cloud for Startups
    • Center for Technologies and Society
    • Yandex Cloud Partner program
    • Price calculator
    • Pricing plans
  • Customer Stories
  • Documentation
  • Blog
© 2026 Direct Cursus Technology L.L.C.
Yandex MPP Analytics for PostgreSQL
  • Getting started
    • All guides
      • Overview
      • Creating an external table
      • Editing PXF settings
    • Connecting to an external file server (gpfdist)
    • Auxiliary utilities
  • Access management
  • Pricing policy
  • Terraform reference
  • Monitoring metrics
  • Audit Trails events
  • Public materials
  • Release notes

In this article:

  • Getting started
  • Creating a foreign table
  • Examples of creating foreign tables
  • Creating an external table
  • Examples of creating external tables
  1. Step-by-step guides
  2. Working with PXF
  3. Creating an external table

Creating an external table using PXF

Written by
Yandex Cloud
Updated at June 3, 2026
  • Getting started
  • Creating a foreign table
    • Examples of creating foreign tables
  • Creating an external table
  • Examples of creating external tables

Yandex MPP Analytics for PostgreSQL allows creating external tables to access data in external databases. In clusters with the Apache Cloudberry™ DBMS, in addition to external tables, you can also create foreign tables using the FDW mechanism. Such tables provide access to the same external DBMSs as external ones. However, foreign tables support both read and write operations within a single table, unlike external tables which are strictly read-only or write-only.

The FDW mechanism supports the following external data types:

  • s3_pxf_fdw
  • jdbc_pxf_fdw
  • hdfs_pxf_fdw
  • hive_pxf_fdw

External tables are created using the CREATE EXTERNAL TABLE SQL query, whereas foreign tables using CREATE FOREIGN TABLE.

Getting startedGetting started

  1. In the Yandex MPP Analytics for PostgreSQL cluster's subnet, set up a NAT gateway and link a routing table.

  2. In the Yandex MPP Analytics for PostgreSQL cluster network, create a security group that allows all incoming and outgoing traffic from all addresses.

  3. Create an external data source. The steps you need to follow depend on the source connection type:

    • S3
    • JDBC
    • HDFS
    • Hive

Creating a foreign tableCreating a foreign table

Syntax of an SQL query to create a foreign table:

CREATE FOREIGN TABLE <table_name>
  (<column_name> <data_type> [, ...])
  SERVER "<local_source_name>"
  OPTIONS (
    resource '<data_path_or_table_name>'
  );

Where:

  • <table_name>: Name of the external table in the Yandex MPP Analytics for PostgreSQL cluster.
  • <column_name>: Column name.
  • <data_type>: Column data type. It must match the type of the corresponding column in the external DBMS.
  • <local_source_name>: Local data source name.
  • <path_to_data_or_table_name>: Path to data or table name on the external source.

Examples of creating foreign tablesExamples of creating foreign tables

ClickHouse®
MySQL®
PostgreSQL
Object Storage
  1. Create a Yandex Managed Service for ClickHouse® cluster with the chuser username.

  2. Add rules allowing all incoming and outgoing traffic to the security group of the Managed Service for ClickHouse® cluster.

  3. Create a table in ClickHouse® and populate it with data:

    1. Connect to the ClickHouse® DB using clickhouse-client.

    2. Create a table named test:

      CREATE TABLE test (
        a INT,
        b INT
      ) ENGINE = Memory;
      
    3. Add data:

      INSERT INTO test (a, b)
      VALUES (1, 11), (2, 22);
      
  4. Get access to external data in Yandex MPP Analytics for PostgreSQL:

    1. Connect to the database in Yandex MPP Analytics for PostgreSQL.

    2. Create an external data source:

      CREATE SERVER "chserver"
        FOREIGN DATA WRAPPER jdbc_pxf_fdw
        OPTIONS (
          jdbc_driver 'com.clickhouse.jdbc.ClickHouseDriver',
          db_url 'jdbc:clickhouse:http://c-<cluster_ID>.rw.mdb.yandexcloud.net:8123/<DB_name>',
          user 'chuser',
          pass '<password>'
        );
      

      Where db_url is a special FQDN always pointing to an available Managed Service for ClickHouse® cluster host.

      You can get the cluster ID with the list of clusters in the folder.

      If public access is enabled for ClickHouse® hosts, use an encrypted connection when creating an external table. Do it by specifying SSL parameters; use the https protocol and port 8443.

      CREATE SERVER "chserver"
        FOREIGN DATA WRAPPER jdbc_pxf_fdw
        OPTIONS (
          ssl 'true',
          sslmode 'strict',
          sslrootcert '/etc/greenplum/ssl/allCAs.pem',
          jdbc_driver 'com.clickhouse.jdbc.ClickHouseDriver',
          db_url 'jdbc:clickhouse:https://c-<cluster_ID>.rw.mdb.yandexcloud.net:8443/<DB_name>',
          user 'chuser',
          pass '<password>'
        );
      
    3. Create a mapping between a local user and a user in the external data source:

      CREATE USER MAPPING FOR CURRENT_USER
        SERVER "chserver";
      
    4. Create an external table named fdw_ch to reference the test table in the ClickHouse® cluster:

      CREATE FOREIGN TABLE fdw_ch (
        a INT,
        b INT
      )
      SERVER "chserver"
      OPTIONS (
        resource 'test'
      );
      
    5. Query the external table for data:

      SELECT * FROM fdw_ch;
      
    6. Populate the external table with data:

      INSERT INTO fdw_ch (a, b)
      VALUES (3, 33);
      
  1. Create a Yandex Managed Service for MySQL® cluster with public access to its hosts and mysqluser for username.

  2. Add rules allowing all incoming and outgoing traffic to the security group of the Managed Service for MySQL® cluster.

  3. Create a table in MySQL® and populate it with data:

    1. Connect to the MySQL® DB using mysql.

    2. Create a table named test:

      CREATE TABLE test (
        a INT,
        b INT
      );
      
    3. Add data:

      INSERT INTO test (a, b)
      VALUES (1, 11), (2, 22);
      
  4. Get access to external data in Yandex MPP Analytics for PostgreSQL:

    1. Connect to the database in Yandex MPP Analytics for PostgreSQL.

    2. Create an external data source:

      CREATE SERVER "myserver"
        FOREIGN DATA WRAPPER jdbc_pxf_fdw
        OPTIONS (
          jdbc_driver 'com.mysql.jdbc.Driver',
          db_url 'jdbc:mysql://c-<cluster_ID>.rw.mdb.yandexcloud.net:3306/<DB_name>',
          user 'mysqluser',
          pass '<password>'
        );
      

      db_url is a special FQDN always pointing to the current master host in the Managed Service for MySQL® cluster.

      You can get the cluster ID with the list of clusters in the folder.

    3. Create a mapping between a local user and a user in the external data source:

      CREATE USER MAPPING FOR CURRENT_USER
        SERVER "myserver";
      
    4. Create an external table named fdw_mysql to reference the test table in the MySQL® cluster:

      CREATE FOREIGN TABLE fdw_mysql (
        a INT,
        b INT
      )
      SERVER "myserver"
      OPTIONS (
        resource 'test'
      );
      
    5. Query the external table for data:

      SELECT * FROM fdw_mysql;
      
    6. Populate the external table with data:

      INSERT INTO fdw_mysql (a, b)
      VALUES (3, 33);
      
  1. Create a Yandex Managed Service for PostgreSQL cluster with public access to its hosts and pguser for username.

  2. Add rules allowing all incoming and outgoing traffic to the security group of the Managed Service for PostgreSQL cluster.

  3. Create a table in PostgreSQL and populate it with data:

    1. Connect to the PostgreSQL DB using psql.

    2. Create a table named test:

      CREATE TABLE public.test (
        a INT,
        b INT
      );
      
    3. Add data:

      INSERT INTO public.test (a, b)
      VALUES (1, 11), (2, 22);
      
  4. Get access to external data in Yandex MPP Analytics for PostgreSQL:

    1. Connect to the database in Yandex MPP Analytics for PostgreSQL.

    2. Create an external data source:

      CREATE SERVER "pgserver"
        FOREIGN DATA WRAPPER jdbc_pxf_fdw
        OPTIONS (
          jdbc_driver 'org.postgresql.Driver',
          db_url 'jdbc:postgresql://c-<cluster_ID>.rw.mdb.yandexcloud.net:6432/<DB_name>',
          user 'pguser',
          pass '<password>'
        );
      

      db_url is a special FQDN always pointing to the current master host in the Managed Service for PostgreSQL cluster.

      You can get the cluster ID with the list of clusters in the folder.

    3. Create a mapping between a local user and a user in the external data source:

      CREATE USER MAPPING FOR CURRENT_USER
        SERVER "pgserver";
      
    4. Create an external table named fdw_pg to reference the public.test table in the PostgreSQL cluster:

      CREATE FOREIGN TABLE fdw_pg (
        a INT,
        b INT
      )
      SERVER "pgserver"
      OPTIONS (
        resource 'public.test'
      );
      
    5. Query the external table for data:

      SELECT * FROM fdw_pg;
      
    6. Populate the external table with data:

      INSERT INTO fdw_pg (a, b)
      VALUES (3, 33);
      
  1. Create a service account named fdw-agent and assign it the storage.editor role to access the Object Storage bucket.

  2. Create an Object Storage bucket with restricted access.

  3. Grant fdw-agent the READ and WRITE permissions for the new bucket.

  4. Create a static access key.

  5. Prepare a test file and upload it to the bucket.

    1. Create a test file named test.csv on your local machine:

      1,11
      2,22
      
    2. Upload the test.csv file to the bucket.

  6. Get access to external data in Yandex MPP Analytics for PostgreSQL:

    1. Connect to the database in Yandex MPP Analytics for PostgreSQL.

    2. Create an external data source:

      CREATE SERVER "objserver"
        FOREIGN DATA WRAPPER s3_pxf_fdw
        OPTIONS (
          accesskey '<static_access_key_ID>',
          secretkey '<secret_access_key>',
          endpoint 'storage.yandexcloud.net'
        );
      
    3. Create a mapping between a local user and a user in the external data source:

      CREATE USER MAPPING FOR CURRENT_USER
        SERVER "objserver";
      
    4. Create an external table named fdw_s3 to reference the test.csv table in the bucket:

      CREATE FOREIGN TABLE fdw_s3 (
        a INT,
        b INT
      )
      SERVER "objserver"
      OPTIONS (
        resource '<bucket_name>/test.csv',
        format 'csv'
      );
      
    5. Query the external table for data:

      SELECT * FROM fdw_s3;
      

Tip

You can provide connection parameters either when creating the data source and when creating the external table.

Providing connection parameters when creating the external table
CREATE SERVER "<local_source_name>"
  FOREIGN DATA WRAPPER jdbc_pxf_fdw;

CREATE USER MAPPING FOR CURRENT_USER
  SERVER "<local_source_name>";

CREATE FOREIGN TABLE <table_name>
  (<column_name> <data_type> [, ...])
  SERVER "<local_source_name>"
  OPTIONS (
    resource '<data_path_or_table_name>',
    jdbc_driver '<JDBC_driver_class_name>',
    db_url 'jdbc:<DBMS_type>://<cluster_FQDN>:<port>/<database_name>',
    user '<username>',
    pass '<password>'
  );

Creating an external tableCreating an external table

Syntax of an SQL query to create an external table:

CREATE [WRITABLE] EXTERNAL TABLE <table_name>
       (<column_name> <data_type> [, ...])
       LOCATION('pxf://<data_path_or_table_name>?PROFILE=<profile_name>&SERVER=<source_name>')
       FORMAT '[TEXT|CSV|CUSTOM]';

Where:

  • <table_name>: Name of the external table you are creating in the Yandex MPP Analytics for PostgreSQL cluster.

  • <column_name>: Column name.

  • <data_type>: Column data type. It must match the column data type in the external DBMS table.

  • <data_path_or_table_name>: External object name, see examples of external tables.

  • PROFILE: Standard interface to an external DBMS (profile), e.g., JDBC. The list of possible values depends on the connection type:

    • S3
    • JDBC
    • HDFS and Hive
  • SERVER: Name of the external PXF data source.

    Instead of SERVER, you can provide parameters defining the external data source configuration. These depend on the source connection type. For more information, see the Greenplum® PFX guide and examples of creating external tables.

The WRITABLE option allows writing data to an external object. To read data from an external object, create an external table with the READABLE option.

Examples of creating external tablesExamples of creating external tables

ClickHouse®
MySQL®
PostgreSQL
Object Storage
  1. Create a Yandex Managed Service for ClickHouse® cluster with the chuser username.

  2. In the cluster subnet, set up a NAT gateway and create a security group allowing all incoming and outgoing traffic from all addresses.

  3. Optionally, create an external JDBC data source with the following properties:

    • Name: chserver

    • Driver: com.clickhouse.jdbc.ClickHouseDriver

    • Url: jdbc:clickhouse:http://c-<cluster_ID>.rw.mdb.yandexcloud.net:8123/db1, where:

      • c-<cluster_id>.rw.mdb.yandexcloud.net is a special FQDN always pointing to an available Managed Service for ClickHouse® cluster host.
      • 8123: Port for connection to the Managed Service for ClickHouse® cluster.
      • db1: DB name in the Managed Service for ClickHouse® cluster.
    • User: chuser

    You can get the cluster ID with the list of clusters in the folder.

    With no data source created, you will need to provide the source connection properties in the SQL query for creating an external table.

  4. Connect to the ClickHouse® DB using clickhouse-client.

  5. Create a test table and populate it with data:

    CREATE TABLE test (id int) ENGINE = Memory;
    
    INSERT INTO test VALUES (1);
    
  6. Connect to the database.

  7. Create an external table named pxf_ch to reference the test table in the ClickHouse® cluster. The SQL query depends on whether you previously created an external data source:

    • With a data source created:

      CREATE READABLE EXTERNAL TABLE pxf_ch(id int)
      LOCATION ('pxf://test?PROFILE=JDBC&SERVER=chserver')
      FORMAT 'CUSTOM' (FORMATTER='pxfwritable_import');
      
    • With no data source created:

      CREATE READABLE EXTERNAL TABLE pxf_ch(id int)
      LOCATION ('pxf://test?PROFILE=JDBC&JDBC_DRIVER=com.clickhouse.jdbc.ClickHouseDriver&DB_URL=jdbc:clickhouse:http://c-<cluster_ID>.rw.mdb.yandexcloud.net:8123/db1&USER=chuser')
      FORMAT 'CUSTOM' (FORMATTER='pxfwritable_import');
      

    If public access is enabled for ClickHouse® hosts, use an encrypted connection when creating an external table. To do this, specify SSL parameters in your query:

    • With a data source created:

      CREATE READABLE EXTERNAL TABLE pxf_ch(id int)
      LOCATION ('pxf://test?PROFILE=JDBC&SERVER=chserver&ssl=true&sslmode=strict&sslrootcert=/etc/greenplum/ssl/allCAs.pem')
      FORMAT 'CUSTOM' (FORMATTER='pxfwritable_import');
      
    • With no data source created:

      CREATE READABLE EXTERNAL TABLE pxf_ch(id int)
      LOCATION ('pxf://test?PROFILE=JDBC&JDBC_DRIVER=com.clickhouse.jdbc.ClickHouseDriver&DB_URL=jdbc:clickhouse:https://c-<cluster_ID>.rw.mdb.yandexcloud.net:8443/db1&USER=chuser&ssl=true&sslmode=strict&sslrootcert=/etc/greenplum/ssl/allCAs.pem')
      FORMAT 'CUSTOM' (FORMATTER='pxfwritable_import');
      
  8. Query the external table for data:

    SELECT * FROM pxf_ch;
    

    Result:

    test_pxf=> SELECT * FROM pxf_ch;
     id
    ----
      1
    (1 row)
    
  1. Create a Yandex Managed Service for MySQL® cluster with the following settings:

    • Username: mysqluser.
    • In the host settings, select the Public access option.
  2. In the cluster subnet, set up a NAT gateway and create a security group allowing all incoming and outgoing traffic from all addresses.

  3. Optionally, create an external JDBC data source with the following properties:

    • Name: mysqlserver

    • Driver: com.mysql.jdbc.Driver

    • Url: jdbc:mysql://c-<cluster_ID>.rw.mdb.yandexcloud.net:3306/db1, where:

      • c-<cluster_ID>.rw.mdb.yandexcloud.net is a special FQDN always pointing to the current master host in the Managed Service for MySQL® cluster.
      • 3306: Port for connection to the Managed Service for MySQL® cluster.
      • db1: DB name in the Managed Service for MySQL® cluster.
    • User: mysqluser

    You can get the cluster ID with the list of clusters in the folder.

    With no data source created, you will need to provide the source connection properties in the SQL query for creating an external table.

  4. Connect to the MySQL® DB using mysql.

  5. Create a test table and populate it with data:

    CREATE TABLE test (a INT, b INT);
    
    INSERT INTO test VALUES (1, 11), (2, 22);
    
  6. Connect to the database.

  7. Create an external table named pxf_mysql to reference the test table in the MySQL® cluster. The SQL query depends on whether you previously created an external data source:

    • With a data source created:

      CREATE READABLE EXTERNAL TABLE pxf_mysql(a int, b int)
      LOCATION ('pxf://test?PROFILE=JDBC&SERVER=mysqlserver')
      FORMAT 'CUSTOM' (FORMATTER='pxfwritable_import');
      
    • With no data source created:

      CREATE READABLE EXTERNAL TABLE pxf_mysql(a int, b int)
      LOCATION ('pxf://test?PROFILE=JDBC&JDBC_DRIVER=com.mysql.jdbc.Driver&DB_URL=jdbc:mysql://c-<cluster_ID>.rw.mdb.yandexcloud.net:3306/db1&USER=mysqluser')
      FORMAT 'CUSTOM' (FORMATTER='pxfwritable_import');
      
  8. Query the external table for data:

    SELECT * FROM pxf_mysql;
    

    Result:

    test_pxf=> SELECT * FROM pxf_mysql;
    a | b
    --+----
    1 | 11
    2 | 22
    (2 rows)
    
  1. Create a Yandex Managed Service for PostgreSQL cluster with the following settings:

    • Username: pguser
    • In the host settings, select the Public access option.
  2. In the cluster subnet, set up a NAT gateway and create a security group allowing all incoming and outgoing traffic from all addresses.

  3. Optionally, create an external JDBC data source with the following properties:

    • Name: pgserver

    • Driver: org.postgresql.Driver

    • Url: jdbc:postgresql://c-<cluster_ID>.rw.mdb.yandexcloud.net:6432/db1, where:

      • c-<cluster_ID>.rw.mdb.yandexcloud.net is a special FQDN always pointing to the current master host in the Managed Service for PostgreSQL cluster.
      • 6432: Port for connection to the Managed Service for PostgreSQL cluster.
      • db1: DB name in the Managed Service for PostgreSQL cluster.
    • User: pguser

    You can get the cluster ID with the list of clusters in the folder.

    With no data source created, you will need to provide the source connection properties in the SQL query for creating an external table.

  4. Connect to the PostgreSQL DB using psql.

  5. Create a test table and populate it with data:

    CREATE TABLE public.test ("a" INT,"b" INT);
    
    INSERT INTO public.test VALUES (1, 11), (2, 22);
    
  6. Connect to the database.

  7. Create an external table named pxf_pg to reference the public.test table in the PostgreSQL cluster. The SQL query depends on whether you previously created an external data source:

    • With a data source created:

      CREATE READABLE EXTERNAL TABLE pxf_pg(a int, b int)
      LOCATION ('pxf://public.test?PROFILE=JDBC&SERVER=pgserver')
      FORMAT 'CUSTOM' (FORMATTER='pxfwritable_import');
      
    • With no data source created:

      CREATE READABLE EXTERNAL TABLE pxf_pg(a int, b int)
      LOCATION ('pxf://public.test?PROFILE=JDBC&JDBC_DRIVER=org.postgresql.Driver&DB_URL=jdbc:postgresql://c-<cluster_ID>.rw.mdb.yandexcloud.net:6432/db1&USER=pguser')
      FORMAT 'CUSTOM' (FORMATTER='pxfwritable_import');
      
  8. Query the external table for data:

    SELECT * FROM pxf_pg;
    

    Result:

    test_pxf=> SELECT * FROM pxf_pg;
    a | b
    --+----
    1 | 11
    2 | 22
    (2 rows)
    
  1. In the cluster subnet, set up a NAT gateway and create a security group allowing all incoming and outgoing traffic from all addresses.

  2. Create an Object Storage bucket with restricted access.

  3. Create a static access key.

  4. Optionally, create an external S3 data source with the following properties:

    • Name: objserver.
    • Access Key: ID of the static access key created earlier.
    • Secret Key: Secret key created earlier together with the static access key.
    • Endpoint: storage.yandexcloud.net.

    With no data source created, you will need to provide the source connection properties in the SQL query for creating an external table.

  5. Create a test file named test.csv on your local machine:

    1,111
    2,222
    
  6. Upload your test file to the bucket.

    Files you are uploading to the bucket must not start with . or _. Such files are considered hidden, and PXF does not read data from them.

  7. Connect to the database.

  8. To read data from an Object Storage bucket:

    1. Create an external table named pxf_s3_read to reference the bucket. The SQL query depends on whether you previously created an external data source:

      • With a data source created:

        CREATE READABLE EXTERNAL TABLE pxf_s3_read(a int, b int)
        LOCATION ('pxf://<bucket_name>/test.csv?PROFILE=s3:text&SERVER=objserver')
        FORMAT 'CSV';
        
      • With no data source created:

        CREATE READABLE EXTERNAL TABLE pxf_s3_read(a int, b int)
        LOCATION ('pxf://<bucket_name>/test.csv?PROFILE=s3:text&accesskey=<key_ID>&secretkey=<secret_key>&endpoint=storage.yandexcloud.net')
        FORMAT 'CSV';
        
    2. Query the external table for data:

      SELECT * FROM pxf_s3_read;
      

      Result:

      test_pxf=> SELECT * FROM pxf_s3_read;
      a | b
      ---+----
      1 | 111
      2 | 222
      (2 rows)
      
  9. To write data to an Object Storage bucket:

    1. Create an external table named pxf_s3_write with the WRITABLE option. You do not need to specify a file name when creating the table:

      • With a data source created:

        CREATE WRITABLE EXTERNAL TABLE pxf_s3_write(a int, b int)
        LOCATION ('pxf://<bucket_name>/?PROFILE=s3:text&SERVER=objserver')
        FORMAT 'CSV';
        
      • With no data source created:

        CREATE WRITABLE EXTERNAL TABLE pxf_s3_write(a int, b int)
        LOCATION ('pxf://<bucket_name>/?PROFILE=s3:text&accesskey=<key_ID>&secretkey=<secret_key>&endpoint=storage.yandexcloud.net')
        FORMAT 'CSV';
        
    2. Populate the table with data:

      INSERT INTO pxf_s3_write VALUES (3,333);
      
      INSERT 0 1
      
    3. Make sure that a new object has been created in the bucket.

Note

To create an external table from Object Storage, you can use the S3 protocol and transmit your static key properties using a file located on a web server. For more information, see this tutorial.

Greenplum® and Greenplum Database® are registered trademarks or trademarks of Broadcom Inc. in the United States and/or other countries.

ClickHouse® is a registered trademark of ClickHouse, Inc.

Was the article helpful?

Previous
Hive
Next
Editing PXF settings
© 2026 Direct Cursus Technology L.L.C.