Configuring access to Object Storage from a ClickHouse® cluster
Managed Service for ClickHouse® supports using Yandex Object Storage to:
- Enable ML models, data format schemas, and your own geobase.
- Process data that is stored in Object Storage if this data is represented in any of the supported ClickHouse® formats.
To access Object Storage bucket data from a cluster, set up password-free access to the bucket using a service account:
- Connect the service account to the cluster.
- Set up access rights.
See also Examples of working with objects.
Before you begin, make sure your account in Yandex Cloud has the iam.serviceAccounts.user role or higher. You will need this role in the following cases:
- To create or modify a cluster and link it to a service account.
- To restore a cluster linked to a service account from its backup.
Connect the service account to the cluster
- When creating or updating a cluster, either select an existing service account or create a new one.
- Make sure that this account is assigned the correct roles from the storage.* role group. If required, assign it the appropriate roles, e.g., storage.viewer and storage.uploader.
Tip
To link Managed Service for ClickHouse® clusters to Object Storage, we recommend using dedicated service accounts. This allows you to work with any buckets, including those to which you cannot or should not allow public access.
Set up access rights
- In the management console, select the folder where the bucket is located. If there is no bucket, create one and populate it with the required data.
- Select Object Storage.
- Set up the bucket ACL or object ACL:
  - In the list of buckets or objects, select the required item and click its menu icon.
  - Click Bucket ACL or Object ACL.
  - In the Select a user drop-down list, specify the service account connected to the cluster.
  - Set the required permissions for the service account from the drop-down list.
  - Click Add, then Save.
Note
If necessary, revoke access from one or more users by clicking Cancel in the appropriate line.
Examples of working with objects
You can get a link to an object in a bucket in the following format:
https://storage.yandexcloud.net/<bucket_name>/<object_name>
You can use it to work with geotags and schemas or to use the s3 table function and the S3 table engine.
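If you assemble object links in a script, the link format above can be sketched as follows. This is a minimal illustration rather than part of the service: the helper name object_url and the bucket name my-bucket are hypothetical, and percent-encoding the object name is an assumption made so that names containing spaces or other special characters still produce a valid URL.

```python
from urllib.parse import quote

# Endpoint taken from the link format described above.
STORAGE_ENDPOINT = "https://storage.yandexcloud.net"


def object_url(bucket_name: str, object_name: str) -> str:
    """Build a link to an object (hypothetical helper, for illustration only)."""
    if not bucket_name or not object_name:
        raise ValueError("bucket_name and object_name must be non-empty")
    # Keep "/" unescaped so that names with prefixes such as "data/table.tsv"
    # stay readable; other special characters are percent-encoded
    # (an assumption of this sketch).
    return f"{STORAGE_ENDPOINT}/{bucket_name}/{quote(object_name, safe='/')}"


# Prints: https://storage.yandexcloud.net/my-bucket/table.tsv
print(object_url("my-bucket", "table.tsv"))
```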
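The S3 table engine is similar to the File engine: it lets you read and write data using SELECT and INSERT queries, with the data stored in Object Storage rather than in the local file system.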
The s3 table function provides the same functionality as the S3 table engine, but you do not need to create a table in advance to use it.
For example, if the Object Storage bucket contains a file named table.tsv that stores table data in TSV format, you can create a table or use a table function to work with this file. First, set up password-free access and get a link to the table.tsv file.
Using the S3 table engine:
- Assign the managed-clickhouse.editor and storage.uploader roles to the service account linked to the Managed Service for ClickHouse® cluster.
- Create a table:
    CREATE TABLE test (n Int32) ENGINE = S3('https://storage.yandexcloud.net/<bucket_name>/table.tsv', 'TSV');
- Run test queries to the table:
    INSERT INTO test VALUES (1);
    SELECT * FROM test;

    ┌─n─┐
    │ 1 │
    └───┘
Using the s3 table function:
- Assign the managed-clickhouse.editor and storage.uploader roles to the service account linked to the Managed Service for ClickHouse® cluster.
- Insert data:
    INSERT INTO FUNCTION s3('https://storage.yandexcloud.net/<bucket_name>/table.tsv', 'TSV', 'n Int32') VALUES (1);
- Run a test query:
    SELECT * FROM s3('https://storage.yandexcloud.net/<bucket_name>/table.tsv', 'TSV', 'n Int32');

    ┌─n─┐
    │ 1 │
    └───┘
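The two examples above differ mainly in where the column structure is declared: with the S3 table engine it comes from the table definition, while the s3 table function receives it as its third argument. The Python sketch below only builds the query text for both variants so that the difference is easy to compare. The helper names and the bucket name my-bucket are hypothetical, and the code does not connect to a cluster.

```python
def sql_literal(value: str) -> str:
    """Quote a string as a SQL literal, escaping backslashes and single quotes."""
    return "'" + value.replace("\\", "\\\\").replace("'", "\\'") + "'"


def s3_engine_ddl(table: str, columns: str, url: str, fmt: str) -> str:
    """CREATE TABLE for the S3 table engine: columns are part of the table."""
    return (
        f"CREATE TABLE {table} ({columns}) "
        f"ENGINE = S3({sql_literal(url)}, {sql_literal(fmt)});"
    )


def s3_function_select(columns: str, url: str, fmt: str) -> str:
    """SELECT via the s3 table function: columns are the third argument."""
    return (
        f"SELECT * FROM s3({sql_literal(url)}, {sql_literal(fmt)}, "
        f"{sql_literal(columns)});"
    )


# Hypothetical bucket name, used for illustration only.
url = "https://storage.yandexcloud.net/my-bucket/table.tsv"
print(s3_engine_ddl("test", "n Int32", url, "TSV"))
print(s3_function_select("n Int32", url, "TSV"))
```

Because no credentials appear in either statement, both rely on the password-free access configured through the service account linked to the cluster.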
ClickHouse® is a registered trademark of ClickHouse, Inc.