Creating an external HDFS data source
In Yandex MPP Analytics for PostgreSQL, you can use HDFS from Yandex Data Processing or from a third-party HDFS service as an external data source with the HDFS connection type.
Create an external data source
- Go to the folder page and select Yandex MPP Analytics for PostgreSQL.
- Open the page of the Yandex MPP Analytics for PostgreSQL cluster you need.
- In the left-hand panel, select PXF.
- Click Create data source.
- Select the HDFS connection type.
- Enter a source name.
- Configure at least one optional setting.
- Click Create.
 
- Get an IAM token for API authentication and put it into the environment variable:

      export IAM_TOKEN="<IAM_token>"

- Use the PXFDatasource.Create method and make a request, e.g., via cURL:

      curl \
          --request POST \
          --header "Authorization: Bearer $IAM_TOKEN" \
          --header "Content-Type: application/json" \
          --url 'https://mdb.api.cloud.yandex.net/managed-greenplum/v1/clusters/<cluster_ID>/pxf_datasources' \
          --data '{
                    "datasource": {
                      "name": "<external_data_source_name>",
                      "hdfs": {
                        "core": {
                          "defaultFs": "<storage_type>"
                        },
                        ...
                      }
                    }
                  }'

  Where:

  - name: External data source name.
  - hdfs: External data source settings. Configure at least one optional setting.

  You can get the cluster ID with a list of clusters in the folder.

- View the server response to make sure the request was successful.
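The final step, checking the server response, could be scripted roughly as follows. This is a minimal sketch, not part of the official procedure: the function names `is_success_status` and `create_pxf_datasource`, the request body file, and the response file are hypothetical. It assumes the API signals success with a 2xx HTTP status code. The cURL options used (`--silent`, `--output`, `--write-out '%{http_code}'`, `--data @file`) are standard.

```shell
# Hypothetical helper: succeeds only for 2xx HTTP status codes.
is_success_status() {
  case "$1" in
    2[0-9][0-9]) return 0 ;;
    *) return 1 ;;
  esac
}

# Hypothetical wrapper around the request shown above.
# $1: cluster ID
# $2: file containing the JSON request body
# $3: file to save the server response to
create_pxf_datasource() {
  # --write-out prints only the HTTP status code; the body goes to "$3".
  http_status=$(curl --silent \
    --output "$3" \
    --write-out '%{http_code}' \
    --request POST \
    --header "Authorization: Bearer $IAM_TOKEN" \
    --header "Content-Type: application/json" \
    --url "https://mdb.api.cloud.yandex.net/managed-greenplum/v1/clusters/$1/pxf_datasources" \
    --data "@$2")

  if is_success_status "$http_status"; then
    echo "Request accepted (HTTP $http_status)"
  else
    echo "Request failed (HTTP $http_status), see $3" >&2
    return 1
  fi
}
```

For example, `create_pxf_datasource "<cluster_ID>" body.json response.json` would send the body stored in `body.json` and leave the server response in `response.json` for inspection.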
 
- Get an IAM token for API authentication and put it into the environment variable:

      export IAM_TOKEN="<IAM_token>"

- Clone the cloudapi repository:

      cd ~/ && git clone --depth=1 https://github.com/yandex-cloud/cloudapi

  Below, we assume the repository contents are stored in the ~/cloudapi/ directory.

- Use the PXFDatasourceService.Create call and make a request, e.g., via gRPCurl:

      grpcurl \
          -format json \
          -import-path ~/cloudapi/ \
          -import-path ~/cloudapi/third_party/googleapis/ \
          -proto ~/cloudapi/yandex/cloud/mdb/greenplum/v1/pxf_service.proto \
          -rpc-header "Authorization: Bearer $IAM_TOKEN" \
          -d '{
                "cluster_id": "<cluster_ID>",
                "datasource": {
                  "name": "<external_data_source_name>",
                  "hdfs": {
                    "core": {
                      "default_fs": "<storage_type>"
                    },
                    ...
                  }
                }
              }' \
          mdb.api.cloud.yandex.net:443 \
          yandex.cloud.mdb.greenplum.v1.PXFDatasourceService.Create

  Where:

  - name: External data source name.
  - hdfs: External data source settings. Configure at least one optional setting.

  You can get the cluster ID with a list of clusters in the folder.

- View the server response to make sure the request was successful.
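When the gRPC call runs from a script, its outcome can also be checked through the command's exit status. The sketch below is illustrative only: `run_and_report` is a hypothetical helper, and it assumes gRPCurl exits with a non-zero status when the call fails.

```shell
# Hypothetical helper: runs the command passed as arguments and reports
# whether it succeeded, based on its exit status.
run_and_report() {
  if "$@"; then
    echo "Request succeeded"
  else
    rc=$?  # exit status of the command that just failed
    echo "Request failed with exit status $rc" >&2
    return "$rc"
  fi
}
```

To use it, prefix the full command from the step above with the helper name, e.g., `run_and_report grpcurl ...`, keeping all the original arguments unchanged.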
 
Greenplum® and Greenplum Database® are registered trademarks or trademarks of Broadcom Inc. in the United States and/or other countries.