Connecting to a Yandex Data Processing host from graphical IDEs

Written by

Yandex Cloud

Updated at September 25, 2025

You can connect to a Yandex Data Processing cluster using graphical IDEs.

Before connecting:

Connect using graphical IDEs

Connections were tested in the following environment:

Ubuntu 20.04:
- DBeaver: 22.2.4
macOS Monterey 12.7:
- JetBrains DataGrip: 2023.3.4
- DBeaver Community: 24.0.0

To use graphical IDEs, save a certificate to a local folder and specify the path to it in the connection settings.
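
Before entering the certificate path in an IDE, it can help to check that the saved file is where you expect and looks like a certificate. The following is a minimal sketch, not part of the official procedure. It assumes the certificate is PEM-encoded (the source does not state the format), and the function name `find_certificate_problem` is hypothetical.

```python
from pathlib import Path

# Assumption: the downloaded certificate is PEM-encoded text.
PEM_HEADER = "-----BEGIN CERTIFICATE-----"
PEM_FOOTER = "-----END CERTIFICATE-----"


def find_certificate_problem(cert_path):
    """Return a short problem description, or None if the file looks usable.

    This only checks that the file exists and contains PEM markers.
    It does not validate the certificate itself.
    """
    path = Path(cert_path).expanduser()
    if not path.is_file():
        return f"file not found: {path}"
    # errors="replace" keeps binary (for example, DER) files from raising.
    text = path.read_text(encoding="ascii", errors="replace")
    if PEM_HEADER not in text or PEM_FOOTER not in text:
        return "no PEM-encoded certificate found"
    return None
```

If the function returns `None`, the same path can be entered in the IDE's connection settings.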

DataGrip

DBeaver

In DataGrip, create a data source:
1. Select File → New → Data Source → Apache Hive.
  Note
  Select the data source that matches the Yandex Data Processing component you are connecting to:
  - Hive: Select Apache Hive.
  - Spark: Select Apache Spark.
  The list of settings is the same for both data sources.
2. Specify the connection settings on the General tab:
  - Host: FQDN of the cluster master host or its public IP address.
  - If connecting for the first time, click Download to download the connection driver.
3. On the SSH/SSL tab:
  1. Enable the Use SSL setting and specify the SSL connection settings:
    - CA file: Downloaded SSL certificate for the connection.
    - Client key file, Client key password: File with the private key required to connect to the Yandex Data Processing cluster and its password.
  2. Optionally, to connect via a jump host VM, configure the SSH tunnel settings:
    1. Select Use SSH tunnel, create an SSH configuration, and specify these settings:
      - Host: VM IP address.
      - User name: VM username.
      - Private key file, Passphrase: Private key file required to connect to the VM and its passphrase.
    2. Click Test Connection to test the connection to the VM from DataGrip.
    3. Click OK to save the configuration.
4. Click Test Connection. If the connection is successful, you will get the OK connection status and information about the DBMS and driver.
5. Click OK to save the data source.
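
For reference, the host entered on the General tab ends up in a JDBC connection string. The sketch below shows how such a string is typically composed for HiveServer2. It relies on assumptions not stated in this article: the `jdbc:hive2://` scheme is used for both the Hive and Spark Thrift server data sources, the port is the common default `10000`, and the database is `default`. The function name `build_hive_jdbc_url` is hypothetical.

```python
def build_hive_jdbc_url(host, port=10000, database="default"):
    """Compose a HiveServer2-style JDBC URL.

    Assumptions: port 10000 and the "default" database are common
    defaults, not values taken from this article.
    """
    if not host or any(ch.isspace() for ch in host):
        raise ValueError("host must be a non-empty FQDN or IP address")
    if not 0 < int(port) < 65536:
        raise ValueError("port must be in the range 1-65535")
    return f"jdbc:hive2://{host}:{int(port)}/{database}"


# Example with a placeholder host name (not a real cluster).
print(build_hive_jdbc_url("master.example.internal"))
```

Compare the result with the URL that DataGrip shows in the data source settings. If they differ, trust the value shown in the IDE.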

In DBeaver, first download the SSH key to the local machine or VM from which you will connect to the Yandex Data Processing cluster.
Then create a new DB connection:
1. From the Database menu, select New connection.
2. Select a data source from the DB list depending on the configuration of the Yandex Data Processing cluster you are connecting to:
  - If the cluster uses Hive, select Apache Hive.
  - If the cluster uses only Spark and the Thrift server is enabled, select Apache Spark.
  The list of connection settings remains the same regardless of the selected data source.
3. Click Next.
4. On the SSH tab, enable the Use SSH tunnel setting and specify these settings:
  - Host/IP: FQDN of the master host (when connecting via a jump host VM) or its public IP address.
  - Username: Enter the username:
    - For image version 2.0: ubuntu.
    - For image version 1.4: root.
  - Authentication method: Public key.
  - Secret key: Path to the cluster’s private key file.
  - Passphrase: Private key password.
  - Optionally, to connect via a jump host VM, enable the Use jump server setting and specify the settings:
    - Host/IP: Public IP address of the VM for connection.
    - Username: Username for connecting to the VM.
    - Authentication method: Public key.
    - Secret key: Path to the VM’s private key file.
    - Passphrase: Private key password.
5. Click Test Connection ... to check the settings. If the connection is successful, you will see the connection status and information about the DBMS and driver.
6. Click Finish to save the database connection settings.
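
The SSH tunnel settings above correspond roughly to an OpenSSH port-forwarding command, which can be handy for checking the keys and hosts outside the IDE. The following is a sketch under stated assumptions, not the way DBeaver itself opens the tunnel: the forwarded port `10000` is an assumed default that this article does not state, and the function name `build_tunnel_command` and all host names are hypothetical. The username mapping comes from the list above.

```python
import shlex

# Usernames by image version, as listed in the SSH tab settings.
USERNAME_BY_IMAGE_VERSION = {"2.0": "ubuntu", "1.4": "root"}


def build_tunnel_command(master_host, image_version, key_path,
                         local_port=10000, remote_port=10000,
                         jump_host=None, jump_user=None, jump_key_path=None):
    """Build an `ssh` argument list that forwards a local port to the master host.

    Assumption: port 10000 is only a placeholder default.
    """
    user = USERNAME_BY_IMAGE_VERSION[image_version]
    cmd = ["ssh", "-N", "-i", key_path,
           "-L", f"{local_port}:localhost:{remote_port}"]
    if jump_host is not None:
        if jump_user is None or jump_key_path is None:
            raise ValueError("jump_user and jump_key_path are required with jump_host")
        # ProxyCommand lets the jump host use its own key file.
        # Note: shlex.join quotes "~", so pass absolute key paths here.
        proxy = ["ssh", "-i", jump_key_path, "-W", "%h:%p",
                 f"{jump_user}@{jump_host}"]
        cmd += ["-o", "ProxyCommand=" + shlex.join(proxy)]
    cmd.append(f"{user}@{master_host}")
    return cmd


# Example with placeholder values; print the command instead of running it.
print(shlex.join(build_tunnel_command(
    "master.example.internal", "2.0", "/keys/cluster",
    jump_host="203.0.113.10", jump_user="admin", jump_key_path="/keys/vm")))
```

Printing the command rather than executing it keeps the sketch safe to run anywhere. Review the arguments before using them against a real cluster.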
