Adding your own geobase in Managed Service for ClickHouse®
Geobases in ClickHouse® are text files containing the hierarchy and names of regions. You can add several alternative geobases to ClickHouse® to support different views on which countries regions belong to. For more information, see the ClickHouse® documentation.
To add your own geobase to a ClickHouse® cluster:
- Create a geobase.
- Upload the geobase to Yandex Object Storage.
- Add the geobase to a ClickHouse® cluster.
Creating a geobase
-
Create a file named regions_hierarchy.txt. The file must be in TSV tabular format without headers and with the following columns:
- Region ID (UInt32)
- Parent region ID (UInt32)
- Region type (UInt8):
  - 1: Continent
  - 3: Country
  - 4: Federal district
  - 5: Region
  - 6: City
- Population (UInt32): Optional.
-
To add an alternative hierarchy of regions, create the regions_hierarchy_<suffix>.txt files with the same structure. To use an alternative geobase, pass this suffix when invoking the function. For example:
- regionToCountry(RegionID): Uses the default regions_hierarchy.txt dictionary.
- regionToCountry(RegionID, 'alt'): Uses the dictionary with the alt suffix: regions_hierarchy_alt.txt.
-
Create a file named regions_names.txt. The file must be in TSV tabular format without headers and with the following columns:
- Region ID (UInt32)
- Region name (String): Cannot contain tab or newline characters, even escaped ones.
-
To add region names in other languages to your geobase, create the regions_names_<language_code>.txt files with the same structure. For example, you can create regions_names_en.txt for English and regions_names_tr.txt for Turkish.
-
Create a tar, tar.gz, or zip archive from the geobase files.
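The steps above can be sketched in Python. This is a minimal illustration, not a reference implementation: the sample rows, the use of 0 as the parent ID of a top-level region, the geobase.tar.gz archive name, and placing the files at the archive root are all assumptions made for the example.

```python
import tarfile
from pathlib import Path

# Hypothetical sample data: (region ID, parent region ID, region type, population).
# Using 0 as the parent of the top-level region is an assumption of this sketch.
HIERARCHY = [
    (1, 0, 1, 0),       # type 1: continent
    (2, 1, 3, 0),       # type 3: country
    (3, 2, 6, 500000),  # type 6: city
]
# Hypothetical sample data: (region ID, region name).
NAMES = [(1, "Sample Continent"), (2, "Sample Country"), (3, "Sample City")]


def write_tsv(path: Path, rows) -> None:
    """Write rows as tab-separated lines without a header."""
    with path.open("w", encoding="utf-8", newline="\n") as f:
        for row in rows:
            fields = [str(value) for value in row]
            # Region names must not contain tab or newline characters.
            if any("\t" in field or "\n" in field for field in fields):
                raise ValueError(f"tab or newline in row: {row!r}")
            f.write("\t".join(fields) + "\n")


def build_geobase(out_dir: Path) -> Path:
    """Create the two geobase files and pack them into a tar.gz archive."""
    out_dir.mkdir(parents=True, exist_ok=True)
    write_tsv(out_dir / "regions_hierarchy.txt", HIERARCHY)
    write_tsv(out_dir / "regions_names.txt", NAMES)

    archive = out_dir / "geobase.tar.gz"  # hypothetical archive name
    with tarfile.open(str(archive), "w:gz") as tar:
        for name in ("regions_hierarchy.txt", "regions_names.txt"):
            # Assumption: files are stored at the archive root.
            tar.add(str(out_dir / name), arcname=name)
    return archive
```

Calling build_geobase(Path("geobase")) would write both files to the geobase directory and return the path to the archive. Alternative hierarchies or localized names can be added the same way by writing more files with the suffixed names described above.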
Uploading a geobase to Yandex Object Storage
Managed Service for ClickHouse® only works with publicly readable geobases that are uploaded to Yandex Object Storage:
- To link your service account to the cluster, make sure your Yandex Cloud account has the iam.serviceAccounts.user role or higher.
- Upload the geobase archive to Yandex Object Storage.
- Connect the service account to the cluster. You will use this service account to configure access to the geobase archive.
- Assign the storage.viewer role to the service account.
- In the bucket's ACL, add the READ permission to the service account.
- Get a link to the geobase archive.
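As an illustration of the last step, the sketch below assembles a link to an uploaded archive. It assumes the path-style form https://storage.yandexcloud.net/<bucket>/<key>; the geobase_link helper, the bucket name, and the object key are hypothetical, so check the actual link in your bucket before using it.

```python
from urllib.parse import quote

# Assumption: path-style endpoint of Yandex Object Storage.
ENDPOINT = "https://storage.yandexcloud.net"


def geobase_link(bucket: str, key: str) -> str:
    """Build a link to an object, percent-encoding unsafe characters."""
    # Keep "/" in the key so that nested prefixes stay readable.
    return f"{ENDPOINT}/{quote(bucket, safe='')}/{quote(key, safe='/')}"


# Hypothetical bucket and key names.
print(geobase_link("my-bucket", "geo/geobase.tar.gz"))
```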
Adding the geobase to the ClickHouse® cluster
- In the management console, go to the folder page and select Managed Service for ClickHouse.
- Select the cluster and click Edit in the top panel.
- Under DBMS settings, click Settings.
- In the Geobase uri field, enter a link to the geobase archive in Yandex Object Storage.
If you do not have the Yandex Cloud command line interface yet, install and initialize it.
The folder specified in the CLI profile is used by default. You can specify a different folder using the --folder-name or --folder-id parameter.
To add a geobase:
-
View a description of the update cluster configuration CLI command:
yc managed-clickhouse cluster update-config --help
-
Run the command, providing the link to the geobase archive in the geobase_uri parameter:
yc managed-clickhouse cluster update-config <cluster_name_or_ID> \
  --set geobase_uri="<link_to_geobase_archive_in_Object_Storage>"
You can get the cluster ID and name from the list of clusters in the folder.
-
Open the current Terraform configuration file with an infrastructure plan.
For more information about creating this file, see Creating clusters.
-
In the Managed Service for ClickHouse® cluster settings, add the geobase_uri parameter with the link to the geobase archive in Yandex Object Storage:
resource "yandex_mdb_clickhouse_cluster" "<cluster_name>" {
  ...
  clickhouse {
    config {
      geobase_uri = "<link_to_geobase_archive_in_Object_Storage>"
      ...
    }
    ...
  }
  ...
}
-
Make sure the settings are correct.
-
Using the command line, navigate to the folder that contains the up-to-date Terraform configuration files with an infrastructure plan.
-
Run the command:
terraform validate
If there are errors in the configuration files, Terraform will point to them.
-
-
Confirm updating the resources.
-
Run the command to view planned changes:
terraform plan
If the resource configuration descriptions are correct, the terminal will display a list of the resources to modify and their parameters. This is a test step. No resources are updated.
-
If you are happy with the planned changes, apply them:
-
Run the command:
terraform apply
-
Confirm the update of resources.
-
Wait for the operation to complete.
-
-
For more information, see the Terraform provider documentation.
Time limits
The Terraform provider sets the following timeouts for Managed Service for ClickHouse® cluster operations:
- Creating a cluster, including by restoring one from a backup: 60 minutes.
- Editing a cluster: 90 minutes.
- Deleting a cluster: 30 minutes.
Operations exceeding the set timeout are interrupted.
How do I change these limits?
Add the timeouts block to the cluster description, for example:
resource "yandex_mdb_clickhouse_cluster" "<cluster_name>" {
  ...
  timeouts {
    create = "1h30m" # 1 hour 30 minutes
    update = "2h"    # 2 hours
    delete = "30m"   # 30 minutes
  }
}
-
Get an IAM token for API authentication and put it into the environment variable:
export IAM_TOKEN="<IAM_token>"
-
Use the Cluster.Update method and send the following request, e.g., via cURL:
Warning
The API method will assign default values to all the parameters of the object you are modifying unless you explicitly provide them in your request. To avoid this, list the settings you want to change in the updateMask parameter as a single comma-separated string.
curl \
  --request PATCH \
  --header "Authorization: Bearer $IAM_TOKEN" \
  --header "Content-Type: application/json" \
  --url 'https://mdb.api.cloud.yandex.net/managed-clickhouse/v1/clusters/<cluster_ID>' \
  --data '{
            "updateMask": "configSpec.clickhouse.config.geobaseUri",
            "configSpec": {
              "clickhouse": {
                "config": {
                  "geobaseUri": "<link>"
                }
              }
            }
          }'
Where:
-
updateMask: List of parameters to update as a single string, separated by commas.
Here, only one parameter is specified: configSpec.clickhouse.config.geobaseUri.
-
configSpec.clickhouse.config.geobaseUri: Link to the geobase archive in Object Storage.
You can get the cluster ID from the list of clusters in the folder.
-
-
View the server response to make sure the request was successful.
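For readers who prefer a script to cURL, the same PATCH request could be assembled with the Python standard library. This is a sketch under the assumption that the endpoint and body match the cURL example above; the build_geobase_request helper name is hypothetical, and the request is only constructed here, not sent.

```python
import json
import urllib.request

# Endpoint taken from the cURL example above.
API = "https://mdb.api.cloud.yandex.net/managed-clickhouse/v1/clusters"


def build_geobase_request(cluster_id: str, geobase_uri: str, iam_token: str):
    """Build (but do not send) a PATCH request that updates only geobaseUri."""
    body = {
        # List only the setting being changed, so that other settings
        # are not reset to their defaults.
        "updateMask": "configSpec.clickhouse.config.geobaseUri",
        "configSpec": {"clickhouse": {"config": {"geobaseUri": geobase_uri}}},
    }
    return urllib.request.Request(
        url=f"{API}/{cluster_id}",
        data=json.dumps(body).encode("utf-8"),
        method="PATCH",
        headers={
            "Authorization": f"Bearer {iam_token}",
            "Content-Type": "application/json",
        },
    )
```

To send the request, pass the returned object to urllib.request.urlopen and read the response body to check the result of the operation.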
-
Get an IAM token for API authentication and put it into the environment variable:
export IAM_TOKEN="<IAM_token>"
-
Clone the cloudapi repository:
cd ~/ && git clone --depth=1 https://github.com/yandex-cloud/cloudapi
Below, we assume the repository contents are stored in the ~/cloudapi/ directory.
-
Use the ClusterService.Update call and send the following request, e.g., via gRPCurl:
Warning
The API method will assign default values to all the parameters of the object you are modifying unless you explicitly provide them in your request. To avoid this, list the settings you want to change in the update_mask parameter as an array of paths[] strings.
Format for listing settings:
"update_mask": {
  "paths": [
    "<setting_1>",
    "<setting_2>",
    ...
    "<setting_N>"
  ]
}
grpcurl \
  -format json \
  -import-path ~/cloudapi/ \
  -import-path ~/cloudapi/third_party/googleapis/ \
  -proto ~/cloudapi/yandex/cloud/mdb/clickhouse/v1/cluster_service.proto \
  -rpc-header "Authorization: Bearer $IAM_TOKEN" \
  -d '{
        "cluster_id": "<cluster_ID>",
        "update_mask": {
          "paths": [
            "config_spec.clickhouse.config.geobase_uri"
          ]
        },
        "config_spec": {
          "clickhouse": {
            "config": {
              "geobase_uri": "<link>"
            }
          }
        }
      }' \
  mdb.api.cloud.yandex.net:443 \
  yandex.cloud.mdb.clickhouse.v1.ClusterService.Update
Where:
-
update_mask: List of parameters to update as an array of paths[] strings.
Here, only one parameter is specified: config_spec.clickhouse.config.geobase_uri.
-
config_spec.clickhouse.config.geobase_uri: Link to the geobase archive in Object Storage.
You can get the cluster ID from the list of clusters in the folder.
-
-
View the server response to make sure the request was successful.
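Because omitted settings are reset to defaults, it can help to derive the update_mask paths from the settings actually being sent, so the two never drift apart. The sketch below shows one way to do that; leaf_paths and build_update_payload are hypothetical helper names, and the payload shape is assumed to match the gRPCurl example above.

```python
import json


def leaf_paths(spec: dict, prefix: str = "") -> list:
    """Return dotted paths to every non-dict value in a nested dict."""
    paths = []
    for key, value in spec.items():
        path = f"{prefix}.{key}" if prefix else key
        if isinstance(value, dict):
            paths.extend(leaf_paths(value, path))
        else:
            paths.append(path)
    return paths


def build_update_payload(cluster_id: str, config_spec: dict) -> dict:
    """Build the request body with update_mask listing exactly the sent settings."""
    return {
        "cluster_id": cluster_id,
        "update_mask": {"paths": leaf_paths({"config_spec": config_spec})},
        "config_spec": config_spec,
    }


payload = build_update_payload(
    "<cluster_ID>",
    {"clickhouse": {"config": {"geobase_uri": "<link>"}}},
)
# The resulting JSON string can be passed to grpcurl in the -d parameter.
print(json.dumps(payload, indent=2))
```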
ClickHouse® is a registered trademark of ClickHouse, Inc.