Configuring access to Object Storage from a ClickHouse® cluster
Managed Service for ClickHouse® supports using Yandex Object Storage to:
- Enable ML models, data format schemas, and your own geobase.
- Process data that is stored in object storage if this data is represented in any of the supported ClickHouse® formats.
To access Object Storage bucket data from a cluster, set up password-free access to the bucket using a service account:
- Connect a service account to a cluster.
- Set up access rights for the service account.
- Get a link to the bucket object, which you can use to perform operations with the cluster data.
See also Examples of working with objects.
Connecting a service account to a cluster
- When creating or updating a cluster, either select an existing service account or create a new one.
- Make sure that this account is assigned the correct roles from the storage.* role group. If required, assign it the appropriate roles, e.g., storage.viewer and storage.uploader.
Tip
To link Managed Service for ClickHouse® clusters to Object Storage, we recommend using dedicated service accounts. This allows you to work with any buckets, including those to which you cannot or should not allow public access.
Setting up access rights
- In the management console, select the folder where the bucket is located. If there is no bucket, create one and populate it with the required data.
- Select Object Storage.
- Set up the bucket ACL or object ACL:
  - In the list of buckets or objects, select the required element and click its menu icon.
  - Click Bucket ACL or Object ACL.
  - In the Select a user drop-down list, specify the service account connected to the cluster.
  - Click Add.
  - Set the required permissions for the service account from the drop-down list.
  - Click Save.
Note
If necessary, revoke access from one or more users by clicking Cancel in the appropriate line.
Getting a link to an object
To use Managed Service for ClickHouse® to work with data of an object in Object Storage, you need to get a link to this object in the bucket.
A link such as https://storage.yandexcloud.net/<bucket_name>/<object_name>?X-Amz-Algorithm=... should be changed to https://storage.yandexcloud.net/<bucket_name>/<object_name> with all parameters in the query string removed.
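The trimming step can also be scripted. Below is a minimal Python sketch that uses only the standard library; the function name strip_query and the sample bucket and object names are hypothetical and chosen for illustration only.

```python
from urllib.parse import urlsplit, urlunsplit


def strip_query(link: str) -> str:
    """Return the link without its query string and fragment."""
    parts = urlsplit(link)
    # Keep scheme, host, and path; drop everything after "?".
    return urlunsplit((parts.scheme, parts.netloc, parts.path, "", ""))


# Hypothetical signed link: the bucket name, object name, and
# parameter values are placeholders, not real data.
signed = (
    "https://storage.yandexcloud.net/my-bucket/table.tsv"
    "?X-Amz-Algorithm=AWS4-HMAC-SHA256&X-Amz-Expires=3600"
)
print(strip_query(signed))
```

A link that has no query string is returned unchanged, so the helper is safe to apply to any object link.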
Examples of working with objects
You can use object links, such as https://storage.yandexcloud.net/<bucket_name>/<object_name>, to work with geobases and schemas or to use the s3 table function and the S3 table engine.
The S3 table engine is similar to the File engine: it lets you read and write data with standard SQL queries, such as SELECT and INSERT.
The s3 table function provides the same functionality as the S3 table engine, but you do not need to create a table in advance to use it.
For example, if the Object Storage bucket contains a file named table.tsv that stores table data in TSV format, you can create a table or use the table function to work with this file. First, set up password-free access and get a link to the table.tsv file.
Using the S3 table engine:
- Create a table:

    CREATE TABLE test (n Int32) ENGINE = S3('https://storage.yandexcloud.net/<bucket_name>/table.tsv', 'TSV');

- Run test queries to the table:

    INSERT INTO test VALUES (1);
    SELECT * FROM test;

    ┌─n─┐
    │ 1 │
    └───┘

Using the s3 table function:
- Insert data:

    INSERT INTO FUNCTION s3('https://storage.yandexcloud.net/<bucket_name>/table.tsv', 'TSV', 'n Int32') VALUES (1);

- Run a test query:

    SELECT * FROM s3('https://storage.yandexcloud.net/<bucket_name>/table.tsv', 'TSV', 'n Int32');

    ┌─n─┐
    │ 1 │
    └───┘
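If you generate such statements from a script, a small helper can assemble them from the bucket and object names. The Python sketch below only builds query strings and does not connect to a cluster. All function names (object_link, sql_literal, s3_select, s3_create_table) and the bucket name are hypothetical. It assumes that table and column definitions come from trusted input, since only string literals are escaped.

```python
from urllib.parse import quote

# Base address taken from the object links shown in this document.
STORAGE_HOST = "https://storage.yandexcloud.net"


def object_link(bucket: str, key: str) -> str:
    """Build a query-free object link; the key is percent-encoded."""
    return f"{STORAGE_HOST}/{bucket}/{quote(key)}"


def sql_literal(value: str) -> str:
    """Quote a string as a single-quoted SQL string literal."""
    escaped = value.replace("\\", "\\\\").replace("'", "\\'")
    return f"'{escaped}'"


def s3_select(link: str, fmt: str, structure: str) -> str:
    """Build a SELECT statement over the s3 table function."""
    args = ", ".join(sql_literal(a) for a in (link, fmt, structure))
    return f"SELECT * FROM s3({args})"


def s3_create_table(table: str, link: str, fmt: str, columns: str) -> str:
    """Build a CREATE TABLE statement that uses the S3 table engine."""
    return (
        f"CREATE TABLE {table} ({columns}) "
        f"ENGINE = S3({sql_literal(link)}, {sql_literal(fmt)})"
    )


# "my-bucket" is a placeholder bucket name.
link = object_link("my-bucket", "table.tsv")
print(s3_create_table("test", link, "TSV", "n Int32"))
print(s3_select(link, "TSV", "n Int32"))
```

The generated statements have the same shape as the examples above and can be run with any ClickHouse® client.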
ClickHouse® is a registered trademark of ClickHouse, Inc.