© 2026 Direct Cursus Technology L.L.C.

Creating an external S3 data source

Written by
Yandex Cloud
Updated at May 7, 2026

In Yandex MPP Analytics for PostgreSQL, you can use Yandex Object Storage or a third-party S3 service as an external data source with the S3 connection type.

To get started, create a static access key. You will need to specify its ID and secret key in the data source settings.

Create an external data source

You can create the data source using the management console, CLI, REST API, or gRPC API.

To create an external S3 data source in the management console:

  1. Open the folder dashboard.

  2. Navigate to Yandex MPP Analytics for PostgreSQL.

  3. Open the page of the Greenplum® cluster in question.

  4. In the left-hand panel, select PXF.

  5. Click Create data source.

  6. Select the S3 connection type.

  7. Enter a source name.

  8. Configure at least one optional setting:

    • Specify the static access key ID in the Access Key field and its secret key in the Secret Key field.

      Learn more about static access keys.

    • Select Fast Upload to enable fast upload of large files to S3 storage.

      This option is enabled by default.

      When fast upload is enabled, PXF generates files in RAM and writes them to disk only if it runs out of RAM. When fast upload is disabled, PXF generates files on disk.

    • In the Endpoint field, enter the S3 storage address.

      The default value is storage.yandexcloud.net for Object Storage.

  9. Click Create.

If you do not have the Yandex Cloud CLI yet, install and initialize it.

The folder used by default is the one specified when creating the CLI profile. To change the default folder, use the yc config set folder-id <folder_ID> command. You can also specify a different folder for any command using --folder-name or --folder-id. If you access a resource by its name, the search is limited to the default folder. If you access a resource by its ID, the search is global: it covers all folders you have access permissions for.

To create an external S3 data source using the CLI:

  1. View the description of the CLI command for creating a data source:

    yc managed-greenplum pxf-datasource create s3 --help
    
  2. Create the data source:

    yc managed-greenplum pxf-datasource create s3 <external_data_source_name> \
       --cluster-id=<cluster_ID> \
       --access-key=<static_key_ID> \
       --secret-key=<secret_part_of_static_key> \
       --endpoint=<S3_storage_address> \
       --fast-upload=<fast_upload>
    

    Where:

    • cluster-id: Cluster ID. You can get it from the list of clusters in the folder.
    • access-key, secret-key: ID and secret key of the static access key.
    • endpoint: S3 storage address. The value for Object Storage is storage.yandexcloud.net. This is the default value.
    • fast-upload: Fast upload of large files to S3 storage. The possible values are:
      • true: Default value. PXF generates files in RAM (if out of RAM, it writes them to disk).
      • false: PXF generates files on disk.
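
As a minimal sketch, the command above can be wrapped in a shell function that reads the static key from environment variables and applies the documented defaults. The function name and the S3_ACCESS_KEY, S3_SECRET_KEY, S3_ENDPOINT, S3_FAST_UPLOAD, and DRY_RUN variables are assumptions made for this example; they are not part of the CLI. By default, the function only prints the command with the secret masked.

```shell
# Hypothetical wrapper around the documented `yc` command.
# All variable names below are assumptions for this sketch.
create_s3_datasource() (
  name="$1"
  cluster_id="$2"

  # Both parts of the static access key are required by this wrapper.
  if [ -z "${S3_ACCESS_KEY:-}" ] || [ -z "${S3_SECRET_KEY:-}" ]; then
    echo "S3_ACCESS_KEY and S3_SECRET_KEY must be set" >&2
    exit 1
  fi

  # Defaults taken from the parameter descriptions above.
  endpoint="${S3_ENDPOINT:-storage.yandexcloud.net}"
  fast_upload="${S3_FAST_UPLOAD:-true}"

  if [ "${DRY_RUN:-1}" = "1" ]; then
    # Dry run: print the command with the secret masked instead of running it.
    printf 'yc managed-greenplum pxf-datasource create s3 %s --cluster-id=%s --access-key=%s --secret-key=*** --endpoint=%s --fast-upload=%s\n' \
      "$name" "$cluster_id" "$S3_ACCESS_KEY" "$endpoint" "$fast_upload"
  else
    yc managed-greenplum pxf-datasource create s3 "$name" \
      --cluster-id="$cluster_id" \
      --access-key="$S3_ACCESS_KEY" \
      --secret-key="$S3_SECRET_KEY" \
      --endpoint="$endpoint" \
      --fast-upload="$fast_upload"
  fi
)
```

To run the real command, export the two key variables and call the function with DRY_RUN=0, e.g., DRY_RUN=0 create_s3_datasource <external_data_source_name> <cluster_ID>.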

To create an external S3 data source using the REST API:

  1. Get an IAM token for API authentication and put it into an environment variable:

    export IAM_TOKEN="<IAM_token>"
    
  2. Call the PXFDatasource.Create method, e.g., via the following cURL request:

    curl \
        --request POST \
        --header "Authorization: Bearer $IAM_TOKEN" \
        --header "Content-Type: application/json" \
        --url 'https://mdb.api.cloud.yandex.net/managed-greenplum/v1/clusters/<cluster_ID>/pxf_datasources' \
        --data '{
                  "datasource": {
                    "name": "<external_data_source_name>",
                    "s3": {
                      "accessKey": "<static_key_ID>",
                      "secretKey": "<secret_part_of_static_key>",
                      "fastUpload": <fast_upload>,
                      "endpoint": "<S3_storage_address>"
                    }
                  }
                }'
    

    Where:

    • name: External data source name.

    • s3: External data source settings:

      • accessKey, secretKey: ID and secret key of the static access key.

      • fastUpload: Fast upload of large files to S3 storage. The possible values are:

        • true: Default value. PXF generates files in RAM (if out of RAM, it writes them to disk).
        • false: PXF generates files on disk.
      • endpoint: S3 storage address. The value for Object Storage is storage.yandexcloud.net. This is the default value.

    You can get the cluster ID from the list of clusters in the folder.

  3. View the server response to make sure your request was successful.
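
One way to check the response in a script is to look at the HTTP status code. The sketch below uses curl's --write-out option to capture the code and treats any 2xx value as success. The function names and the response.json file name are assumptions made for this example.

```shell
# Returns 0 for a 2xx HTTP status code and 1 for anything else.
is_success_status() {
  case "$1" in
    2[0-9][0-9]) return 0 ;;
    *) return 1 ;;
  esac
}

# Hypothetical wrapper: passes its arguments to curl, saves the response
# body to response.json (an assumed file name), and reports the status code.
checked_request() {
  status="$(curl --silent --output response.json --write-out '%{http_code}' "$@")"
  if is_success_status "$status"; then
    echo "Request accepted (HTTP $status); see response.json"
  else
    echo "Request failed (HTTP $status); see response.json" >&2
    return 1
  fi
}
```

To use it, call checked_request with the same options you would pass to curl in step 2.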

To create an external S3 data source using the gRPC API:

  1. Get an IAM token for API authentication and put it into an environment variable:

    export IAM_TOKEN="<IAM_token>"
    
  2. Clone the cloudapi repository:

    cd ~/ && git clone --depth=1 https://github.com/yandex-cloud/cloudapi
    

    Below, we assume that the repository contents reside in the ~/cloudapi/ directory.

  3. Call the PXFDatasourceService.Create method, e.g., via the following gRPCurl request:

    grpcurl \
        -format json \
        -import-path ~/cloudapi/ \
        -import-path ~/cloudapi/third_party/googleapis/ \
        -proto ~/cloudapi/yandex/cloud/mdb/greenplum/v1/pxf_service.proto \
        -rpc-header "Authorization: Bearer $IAM_TOKEN" \
        -d '{
              "cluster_id": "<cluster_ID>",
              "datasource": {
                "name": "<external_data_source_name>",
                "s3": {
                  "access_key": "<static_key_ID>",
                  "secret_key": "<secret_part_of_static_key>",
                  "fast_upload": <fast_upload>,
                  "endpoint": "<S3_storage_address>"
                }
              }
            }' \
        mdb.api.cloud.yandex.net:443 \
        yandex.cloud.mdb.greenplum.v1.PXFDatasourceService.Create
    

    Where:

    • name: External data source name.

    • s3: External data source settings:

      • access_key, secret_key: ID and secret key of the static access key.

      • fast_upload: Fast upload of large files to S3 storage. The possible values are:

        • true: Default value. PXF generates files in RAM (if out of RAM, it writes them to disk).
        • false: PXF generates files on disk.
      • endpoint: S3 storage address. The value for Object Storage is storage.yandexcloud.net. This is the default value.

    You can get the cluster ID from the list of clusters in the folder.

  4. Check the server response to make sure your request was successful.
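
Hand-edited JSON is easy to break, so the request body can also be generated. The sketch below is a hypothetical helper (the function name is an assumption) that prints the body in the shape used in step 3 and rejects any fast_upload value other than true or false, so the field is always emitted as a JSON boolean. It does not escape quotes or backslashes in the values, so it only suits simple identifiers.

```shell
# Hypothetical helper: prints the Create request body for grpcurl's -d option.
# Arguments: $1 cluster ID, $2 data source name, $3 static key ID,
#            $4 secret part of the static key, $5 fast_upload, $6 endpoint.
build_create_payload() {
  # fast_upload is written without quotes, so only true or false is allowed.
  case "$5" in
    true|false) ;;
    *) echo "fast_upload must be true or false" >&2; return 1 ;;
  esac
  printf '{"cluster_id":"%s","datasource":{"name":"%s","s3":{"access_key":"%s","secret_key":"%s","fast_upload":%s,"endpoint":"%s"}}}\n' \
    "$1" "$2" "$3" "$4" "$5" "$6"
}
```

The output can then be passed to the request from step 3 as -d "$(build_create_payload <cluster_ID> <external_data_source_name> <static_key_ID> <secret_part_of_static_key> true storage.yandexcloud.net)".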

After you create an external data source, create an external table.

Greenplum® and Greenplum Database® are registered trademarks or trademarks of Broadcom Inc. in the United States and/or other countries.
