Yandex Cloud
Search
Contact UsTry it for free
  • Customer Stories
  • Documentation
  • Blog
  • All Services
  • System Status
  • Marketplace
    • Featured
    • Infrastructure & Network
    • Data Platform
    • AI for business
    • Security
    • DevOps tools
    • Serverless
    • Monitoring & Resources
  • All Solutions
    • By industry
    • By use case
    • Economics and Pricing
    • Security
    • Technical Support
    • Start testing with double trial credits
    • Cloud credits to scale your IT product
    • Gateway to Russia
    • Cloud for Startups
    • Center for Technologies and Society
    • Yandex Cloud Partner program
    • Price calculator
    • Pricing plans
  • Customer Stories
  • Documentation
  • Blog
© 2026 Direct Cursus Technology L.L.C.
Tutorials
    • All tutorials
    • Unassisted deployment of the Apache Kafka® web interface
    • Upgrading a Managed Service for Apache Kafka® cluster to migrate from ZooKeeper to KRaft
    • Migrating a database from a third-party Apache Kafka® cluster to Managed Service for Apache Kafka®
    • Moving data between Managed Service for Apache Kafka® clusters using Data Transfer
    • Delivering data from Managed Service for MySQL® to Managed Service for Apache Kafka® using Data Transfer
    • Delivering data from Managed Service for MySQL® to Managed Service for Apache Kafka® using Debezium
    • Delivering data from Managed Service for PostgreSQL to Managed Service for Apache Kafka® using Data Transfer
    • Delivering data from Managed Service for PostgreSQL to Managed Service for Apache Kafka® using Debezium
    • Delivering data from Managed Service for YDB to Managed Service for Apache Kafka® using Data Transfer
    • Delivering data from Managed Service for Apache Kafka® to Managed Service for ClickHouse® using Data Transfer
    • Delivering data from Managed Service for Apache Kafka® to Yandex MPP Analytics for PostgreSQL using Data Transfer
    • Delivering data from Managed Service for Apache Kafka® to Yandex StoreDoc using Data Transfer
    • Delivering data from Managed Service for Apache Kafka® to Managed Service for MySQL® using Data Transfer
    • Delivering data from Managed Service for Apache Kafka® to Managed Service for OpenSearch using Data Transfer
    • Delivering data from Managed Service for Apache Kafka® to Managed Service for PostgreSQL using Data Transfer
    • Delivering data from Managed Service for Apache Kafka® to Managed Service for YDB using Data Transfer
    • Delivering data from Managed Service for Apache Kafka® to Data Streams using Data Transfer
    • Delivering data from Data Streams to Managed Service for YDB using Data Transfer
    • Delivering data from Data Streams to Managed Service for Apache Kafka® using Data Transfer
    • YDB change data capture and delivery to YDS
    • Configuring Kafka Connect to work with a Managed Service for Apache Kafka® cluster
    • Synchronizing Apache Kafka® topics in Object Storage with no web access
    • Monitoring message loss in an Apache Kafka® topic
    • Automating Query tasks with Managed Service for Apache Airflow™
    • Sending requests to the Yandex Cloud API via the Yandex Cloud Python SDK
    • Configuring an SMTP server to send e-mail notifications
    • Adding data to a ClickHouse® DB
    • Migrating data to Managed Service for ClickHouse® using ClickHouse®
    • Migrating data to Managed Service for ClickHouse® using Data Transfer
    • Delivering data from Managed Service for MySQL® to Managed Service for ClickHouse® using Data Transfer
    • Asynchronously replicating data from PostgreSQL to ClickHouse®
    • Exchanging data between Managed Service for ClickHouse® and Yandex Data Processing
    • Configuring Managed Service for ClickHouse® for Graphite
    • Fetching data from Managed Service for Apache Kafka® to Managed Service for ClickHouse®
    • Fetching data from Managed Service for Apache Kafka® to ksqlDB
    • Fetching data from RabbitMQ to Managed Service for ClickHouse®
    • Saving a data stream from Data Streams to Managed Service for ClickHouse®
    • Asynchronous replication of data from Yandex Metrica to ClickHouse® using Data Transfer
    • Using hybrid storage in Managed Service for ClickHouse®
    • Sharding Managed Service for ClickHouse® tables
    • Loading data from Yandex Direct to a Managed Service for ClickHouse® data mart using Cloud Functions, Object Storage, and Data Transfer
    • Loading data from Object Storage to Managed Service for ClickHouse® using Data Transfer
    • Migrating data from Managed Service for OpenSearch to Managed Service for ClickHouse® with a storage change using Data Transfer
    • Loading data from Managed Service for YDB to Managed Service for ClickHouse® using Data Transfer
    • Yandex Managed Service for ClickHouse® integration with Microsoft SQL Server via ClickHouse® JDBC Bridge
    • Migrating databases from Google BigQuery to Managed Service for ClickHouse®
    • Yandex Managed Service for ClickHouse® integration with Oracle via ClickHouse® JDBC Bridge
    • Configuring Cloud DNS to access a Managed Service for ClickHouse® cluster from other cloud networks
    • Migrating a Yandex Data Processing HDFS cluster to a different availability zone
    • Importing data from Managed Service for MySQL® to Yandex Data Processing using Sqoop
    • Importing data from Managed Service for PostgreSQL to Yandex Data Processing using Sqoop
    • Mounting Object Storage buckets to the file system of Yandex Data Processing hosts
    • Working with Apache Kafka® topics using Yandex Data Processing
    • Automating operations with Yandex Data Processing using Managed Service for Apache Airflow™
    • Shared use of Yandex Data Processing tables through Apache Hive™ Metastore
    • Transferring metadata across Yandex Data Processing clusters using Apache Hive™ Metastore
    • Importing, processing, and exporting Object Storage data to Managed Service for ClickHouse®
    • Migrating collections from a third-party MongoDB cluster to Yandex StoreDoc
    • Migrating data to Yandex StoreDoc
    • Migrating Yandex StoreDoc cluster from version 4.4 to 6.0
    • Sharding Yandex StoreDoc collections
    • Yandex StoreDoc performance analysis and tuning
    • Managed Service for MySQL® performance analysis and tuning
    • Syncing data from a third-party MySQL® cluster to Managed Service for MySQL® using Data Transfer
    • Migrating a database from Managed Service for MySQL® to a third-party MySQL® cluster
    • Migrating a database from Managed Service for MySQL® to Object Storage using Data Transfer
    • Migrating data from Object Storage to Managed Service for MySQL® via Data Transfer
    • Delivering data from Managed Service for MySQL® to Managed Service for Apache Kafka® using Data Transfer
    • Delivering data from Managed Service for MySQL® to Managed Service for Apache Kafka® using Debezium
    • Migrating a database from Managed Service for MySQL® to Managed Service for YDB using Data Transfer
    • MySQL® change data capture and delivery to YDS
    • Migrating data from Managed Service for MySQL® to Managed Service for PostgreSQL using Data Transfer
    • Migrating data from AWS RDS for PostgreSQL to Managed Service for PostgreSQL using Data Transfer
    • Migrating data from Managed Service for MySQL® to Yandex MPP Analytics for PostgreSQL using Data Transfer
    • Configuring an index policy in Managed Service for OpenSearch
    • Configuring a cold storage policy in Managed Service for OpenSearch
    • Migrating data from a third-party OpenSearch cluster to Managed Service for OpenSearch using Data Transfer
    • Loading data from Managed Service for OpenSearch to Object Storage using Data Transfer
    • Migrating data from Managed Service for OpenSearch to Managed Service for YDB using Data Transfer
    • Copying data from Managed Service for OpenSearch to Yandex MPP Analytics for PostgreSQL using Yandex Data Transfer
    • Migrating data from Managed Service for PostgreSQL to Managed Service for OpenSearch using Data Transfer
    • Authenticating a Managed Service for OpenSearch cluster in OpenSearch Dashboards using Keycloak
    • Using the yandex-lemmer plugin in Managed Service for OpenSearch
    • Creating a PostgreSQL cluster for 1C:Enterprise
    • Searching for the Managed Service for PostgreSQL cluster performance issues
    • Managed Service for PostgreSQL performance analysis and tuning
    • Logical replication in PostgreSQL
    • Migrating a database from a third-party PostgreSQL cluster to Managed Service for PostgreSQL
    • Migrating a database from Managed Service for PostgreSQL
    • Migrating a Managed Service for PostgreSQL cluster to a different version
    • Delivering data from Managed Service for PostgreSQL to Managed Service for Apache Kafka® using Data Transfer
    • Delivering data from Managed Service for PostgreSQL to Managed Service for Apache Kafka® using Debezium
    • Delivering data from Managed Service for PostgreSQL to Managed Service for YDB using Data Transfer
    • Migrating a database from Managed Service for PostgreSQL to Object Storage
    • Migrating data from Object Storage to Managed Service for PostgreSQL via Data Transfer
    • PostgreSQL change data capture and delivery to YDS
    • Migrating data from Managed Service for PostgreSQL to Managed Service for MySQL® using Data Transfer
    • Migrating data from Managed Service for PostgreSQL to Managed Service for OpenSearch using Data Transfer
    • Fixing string sorting issues in PostgreSQL after a glibc upgrade
    • Using a Yandex Lockbox secret in a PySpark job to connect to Yandex Managed Service for PostgreSQL
    • Configuring permissions for access to a secret created by Connection Manager for a Managed Service for PostgreSQL user
    • Migrating a database from Greenplum® to ClickHouse®
    • Migrating a database from Greenplum® to PostgreSQL
    • Exporting Greenplum® data to a cold storage in Object Storage
    • Loading data from Object Storage to Yandex MPP Analytics for PostgreSQL using Data Transfer
    • Copying data from Managed Service for OpenSearch to Yandex MPP Analytics for PostgreSQL using Yandex Data Transfer
    • Creating an external table from a Object Storage bucket table using a configuration file
    • Getting data from external sources using named queries in Greenplum®
    • Migrating a database from a third-party Valkey™ cluster to Yandex Managed Service for Valkey™
    • Using a Yandex Managed Service for Valkey™ cluster as a PHP session storage
    • Loading data from Object Storage to Managed Service for YDB using Data Transfer
    • Loading data from Managed Service for YDB to Object Storage using Data Transfer
    • Processing Audit Trails events
    • Processing Cloud Logging logs
    • Processing Debezium CDC streams
    • Analyzing data with Jupyter
    • Processing files with usage details in Yandex Cloud Billing
    • Ingesting data into storage systems
    • Smart log processing
    • Data transfer in microservice architectures
    • Migrating data to Object Storage using Data Transfer
    • Migrating data from a third-party Greenplum® or PostgreSQL cluster to Yandex MPP Analytics for PostgreSQL using Data Transfer
    • Migrating Yandex StoreDoc clusters
    • Migrating MySQL® clusters
    • Migrating to a third-party MySQL® cluster
    • Migrating PostgreSQL clusters
    • Creating a schema registry to deliver data in Debezium CDC format from Apache Kafka®
    • Automating operations using Yandex Managed Service for Apache Airflow™
    • Working with an Object Storage table from a PySpark job
    • Integrating Yandex Managed Service for Apache Spark™ with Apache Hive™ Metastore
    • Running a PySpark job using Yandex Managed Service for Apache Airflow™
    • Using Yandex Object Storage in Yandex Managed Service for Apache Spark™
    • Using a Yandex Lockbox secret in a PySpark job to connect to Yandex Managed Service for PostgreSQL
    • Running a PySpark job in Yandex Managed Service for YTsaurus

In this article:

  • Getting started
  • Required paid resources
  • Set up your infrastructure
  • Configure access permissions for the Managed Service for PostgreSQL user secret
  • Prepare a PySpark job
  • Run the job and check the result
  • Delete the resources you created
  1. Building a data platform
  2. Using a Yandex Lockbox secret in a PySpark job to connect to Yandex Managed Service for PostgreSQL

Using a Yandex Lockbox secret in a PySpark job to connect to Yandex Managed Service for PostgreSQL

Written by
Yandex Cloud
Updated at April 14, 2026
  • Getting started
    • Required paid resources
  • Set up your infrastructure
  • Configure access permissions for the Managed Service for PostgreSQL user secret
  • Prepare a PySpark job
  • Run the job and check the result
  • Delete the resources you created

You can use a Yandex Lockbox secret to connect to a Yandex Managed Service for PostgreSQL cluster from a PySpark job in Yandex Managed Service for Apache Spark™. To do this, grant the Yandex Managed Service for Apache Spark™ cluster service account access to the secret. Yandex Connection Manager creates the secret automatically when creating the Managed Service for PostgreSQL user.

For the PySpark job, we use a Python script stored in a Yandex Object Storage bucket. The script retrieves the user password from the secret and uses it to connect to the Managed Service for PostgreSQL cluster.

To get information from the secret and connect to the Managed Service for PostgreSQL cluster:

  1. Set up your infrastructure.
  2. Configure permissions for access to the user's secret.
  3. Prepare a PySpark job.
  4. Run the job and check the result.

If you no longer need the resources you created, delete them.

Getting startedGetting started

Sign up for Yandex Cloud and create a billing account:

  1. Navigate to the management console and log in to Yandex Cloud or create a new account.
  2. On the Yandex Cloud Billing page, make sure you have a billing account linked and it has the ACTIVE or TRIAL_ACTIVE status. If you do not have a billing account, create one and link a cloud to it.

If you have an active billing account, you can create or select a folder for your infrastructure on the cloud page.

Learn more about clouds and folders here.

Required paid resourcesRequired paid resources

  • Managed Service for Apache Spark™ cluster: computing resources allocated to the hosts (see Managed Service for Apache Spark™ pricing).
  • Managed Service for PostgreSQL cluster: computing resources allocated to hosts, storage and backup size (see Managed Service for PostgreSQL pricing).
  • Public IP addresses if public access is enabled for cluster hosts (see Virtual Private Cloud pricing).
  • Object Storage bucket: use of storage, data operations (see Object Storage pricing).
  • NAT gateway: hourly use of the gateway and its outgoing traffic (see Virtual Private Cloud pricing).

There is no charge for using Yandex Lockbox secrets created with Connection Manager.

Set up your infrastructureSet up your infrastructure

  1. Create a service account named spark-agent and assign it the following roles:

    • managed-spark.integrationProvider: For Yandex Managed Service for Apache Spark™ integration with other services.
    • storage.editor: For access to the PySpark job file in the Object Storage bucket.
  2. Create a bucket.

  3. Grant the READ permission to the spark-agent service account for the new bucket.

  4. Create a cloud network named spark-network.

    This will automatically create three subnets in different availability zones.

  5. In spark-network, create a security group named spark-sg for the Yandex Managed Service for Apache Spark™ cluster to allow outgoing TCP connections on:

    • Port 6432 to connect to PostgreSQL.
    • Port 443 to download Maven packages.
  6. In spark-network, create a security group named pg-sg for the Managed Service for PostgreSQL cluster to allow incoming TCP connections on port 6432.

  7. Create a Yandex Managed Service for Apache Spark™ cluster with the following settings:

    • Service account: spark-agent
    • Network: spark-network
    • Security group: spark-sg
  8. Set up a NAT gateway for the subnet where the Yandex Managed Service for Apache Spark™ cluster was created. The NAT gateway is required to download the PostgreSQL JDBC driver from the Maven repository.

  9. Create a Managed Service for PostgreSQL cluster with the following settings:

    • Network: spark-network
    • Security group: pg-sg

Configure access permissions for the Managed Service for PostgreSQL user secretConfigure access permissions for the Managed Service for PostgreSQL user secret

Management console
CLI
  1. In the management console, select the folder where you created the infrastructure.

  2. Go to Managed Service for PostgreSQL.

  3. Click the name of your cluster and select the Users tab.

  4. Click the connection ID next to the user you need.

    This will take you to the Connection Manager connection page for the selected user.

  5. On the page that opens, click the secret ID.

    This will open the page of the Yandex Lockbox secret storing the PostgreSQL user password.

  6. On the left-hand panel, select Access bindings and click Assign roles.

  7. Select the spark-agent service account you want to grant secret access to.

  8. Click Add role and select lockbox.payloadViewer.

  9. Click Save.

If you do not have the Yandex Cloud CLI yet, install and initialize it.

The folder used by default is the one specified when creating the CLI profile. To change the default folder, use the yc config set folder-id <folder_ID> command. You can also set a different folder for any specific command using the --folder-name or --folder-id options.

To configure access to the Managed Service for PostgreSQL user secret:

  1. Get the secret ID from the Connection Manager connection:

    yc metadata-hub connection-manager connection list \
      --mdb-cluster-id <PostgreSQL_cluster_ID>
    

    You can get the cluster ID with the list of clusters in the folder.

  2. Configure access to the user secret:

    yc lockbox secret add-access-binding <secret_ID> \
      --role lockbox.payloadViewer \
      --service-account-name spark-agent 
    

    The spark-agent service account will get access to the Managed Service for PostgreSQL user secret.

Prepare a PySpark jobPrepare a PySpark job

  1. Create a local file named job-mpg-connection-with-secret.py and paste the following script to it:

    job-mpg-connection-with-secret.py
    import json
    import os
    import urllib.request
    from pyspark.sql import SparkSession
    
    def get_secret(secret_id: str) -> dict:
        """Getting a secret from Yandex Lockbox."""
        # Getting an IAM token via metadata service
        metadata_host = os.environ.get("YC_METADATA_ADDR", "169.254.169.254")
        token_url = f"http://{metadata_host}/computeMetadata/v1/instance/service-accounts/default/token"
    
        req = urllib.request.Request(token_url)
        req.add_header("Metadata-Flavor", "Google")
        with urllib.request.urlopen(req, timeout=10) as resp:
            token = json.loads(resp.read())["access_token"]
    
        # Getting a Yandex Lockbox secret
        secret_url = f"https://payload.lockbox.api.cloud.yandex.net/lockbox/v1/secrets/{secret_id}/payload"
        req = urllib.request.Request(secret_url)
        req.add_header("Authorization", f"Bearer {token}")
        with urllib.request.urlopen(req, timeout=30) as resp:
            payload = json.loads(resp.read())
            return {e["key"]: e["textValue"] for e in payload["entries"]}
    
    
    # Using the secret retrieval function
    secrets = get_secret("<secret_ID>")
    
    # Initializing a Spark session
    spark = SparkSession.builder.appName("db-reader").getOrCreate()
    
    # Connecting to a Managed Service for PostgreSQL cluster
    df = spark.read \
        .format("jdbc") \
        .option("url", f"jdbc:postgresql://<host_FQDN>:6432/<DB_name>") \
        .option("driver", "org.postgresql.Driver") \
        .option("user", "<username>") \
        .option("password", secrets["postgresql_password"]) \
        .option("dbtable", "<table_name>") \
        .load()
    
    # Showing the first 20 rows of the table
    df.show()
    
  2. In the script file, specify the following settings:

    • Secret ID.
    • Host FQDN to connect to the Managed Service for PostgreSQL cluster.
    • Database name.
    • Table name.

Run the job and check the resultRun the job and check the result

Management console
CLI
  1. Upload the job-mpg-connection-with-secret.py file to the bucket.

  2. In the Yandex Managed Service for Apache Spark™ cluster, create a job with the following settings:

    • Job type: PySpark
    • Main python file: s3a://<bucket_name>/job-mpg-connection-with-secret.py
    • Packages: org.postgresql:postgresql:42.7.3
  3. Wait for the PySpark job you created to change its status to Done.

  4. Get job execution logs.

    The job logs will display rows from the table specified in the script.

  1. Upload the job-mpg-connection-with-secret.py file to the bucket:

    yc storage s3api put-object \
      --body <local_file_path> \
      --bucket <bucket_name> \
      --key "job-mpg-connection-with-secret.py"
    
  2. Run the job:

    yc managed-spark job create-pyspark \
      --cluster-id <cluster_ID> \
      --name "<job_name>" \
      --main-python-file-uri s3a://<bucket_name>/job-mpg-connection-with-secret.py \
      --packages org.postgresql:postgresql:42.7.3
    

    You can get the cluster ID with the list of clusters in the folder.

    The command will return the job ID. Save it as you will need it to get job execution logs.

  3. Get job execution logs:

    yc managed-spark job log <job_ID> \
      --cluster-id <cluster_ID>
    

    The job logs will display rows from the table specified in the script.

Delete the resources you createdDelete the resources you created

Some resources are not free of charge. Delete the resources you no longer need to avoid paying for them:

  1. Yandex Managed Service for Apache Spark™ cluster.
  2. Managed Service for PostgreSQL cluster.
  3. Object Storage bucket. Before deleting the bucket, delete all its objects.
  4. NAT gateway.

Was the article helpful?

Previous
Fixing string sorting issues in PostgreSQL after a glibc upgrade
Next
Configuring permissions for access to a secret created by Connection Manager for a Managed Service for PostgreSQL user
© 2026 Direct Cursus Technology L.L.C.