Creating an Apache Hive™ Metastore cluster
To learn more about Apache Hive™ Metastore clusters in Yandex MetaData Hub, see Apache Hive™ Metastore clusters.
Getting started
- To link a service account to an Apache Hive™ Metastore cluster, make sure your Yandex Cloud account has the iam.serviceAccounts.user role or higher.
- Set up a NAT gateway in the subnet the cluster will connect to. The cluster needs it to interact with Yandex Cloud services.
- Assign the managed-metastore.integrationProvider role to the service account. This role enables the cluster to work with Yandex Cloud services, e.g., Yandex Cloud Logging and Yandex Monitoring, under a service account.
  You can also add more roles. Their combination depends on your specific use case. To view the service roles, see the Apache Hive™ Metastore section, and for all available roles, see this reference.
- If you want to save cluster logs to a custom log group, create one.
  For more information, see Transferring cluster logs.
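The role assignment and log group steps above can be sketched with the Yandex Cloud CLI. This is a minimal sketch, not the documented procedure: it assumes the role is granted at the folder level, and the helper names (`run`, `assign_integration_role`, `create_log_group`) and the `DRY_RUN` switch are hypothetical. By default the helpers only print the commands so you can review them first.

```shell
#!/usr/bin/env bash
# Sketch only: helper names and DRY_RUN are illustrative, not part of the yc CLI.
# With DRY_RUN=1 (the default) commands are printed instead of executed.
DRY_RUN="${DRY_RUN:-1}"

run() {
  if [ "$DRY_RUN" = "1" ]; then
    echo "$*"
  else
    "$@"
  fi
}

# Grants managed-metastore.integrationProvider to a service account.
# Assumption: the role is assigned on the folder that will host the cluster.
assign_integration_role() {
  local folder_id="$1" sa_id="$2"
  run yc resource-manager folder add-access-binding "$folder_id" \
    --role managed-metastore.integrationProvider \
    --subject "serviceAccount:$sa_id"
}

# Creates a custom log group for cluster logs.
create_log_group() {
  local name="$1"
  run yc logging group create --name "$name"
}
```

To execute the commands rather than print them, set `DRY_RUN=0` before calling the helpers, e.g., `DRY_RUN=0; assign_integration_role <folder_ID> <service_account_ID>`.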
Create a cluster
- In the management console, select the folder where you want to create a cluster.
- Select Yandex MetaData Hub.
- In the left-hand panel, select Metastore.
- Click Create cluster.
- Enter a name for the cluster. It must be unique within the folder.
- Optionally, enter a description for the cluster.
- Optionally, add Yandex Cloud labels to break resources into logical groups.
- Specify the service account you created earlier.
- Under Network settings, select the network and subnet to host the Apache Hive™ Metastore cluster. Specify the security group you configured previously.
- Optionally, configure logging settings:
  - Enable Write logs.
  - Select where to write cluster logs to:
    - Default log group: Select Folder in the Destination field and specify the folder. Logs will be stored in the selected folder's default log group.
    - Custom log group: Select Log group in the Destination field and specify the log group you created earlier.
  - Select the minimum logging level.
    The execution log will contain logs of this level or higher. The available levels are TRACE, DEBUG, INFO, WARN, ERROR, and FATAL. The default is INFO.
- If required, enable protection of the cluster from accidental deletion by a user.
  Even with deletion protection enabled, one can still connect to the cluster manually and delete the data.
- Click Create.
If you do not have the Yandex Cloud CLI installed yet, install and initialize it.
By default, the CLI uses the folder specified when creating the profile. To change the default folder, use the yc config set folder-id <folder_ID> command. You can also set a different folder for any specific command using the --folder-name or --folder-id parameter.
To create an Apache Hive™ Metastore cluster:
- View the description of the CLI command to create a cluster:
  yc managed-metastore cluster create --help
- Specify the cluster properties in the creation command:
  yc managed-metastore cluster create \
    --name <cluster_name> \
    --description <cluster_description> \
    --labels <label_list> \
    --service-account-id <service_account_ID> \
    --version <Apache_Hive™_Metastore_version> \
    --subnet-ids <subnet_IDs> \
    --security-group-ids <security_group_IDs> \
    --resource-preset-id <ID_of_computing_resources> \
    --maintenance-window type=<maintenance_type>,day=<day_of_week>,hour=<hour> \
    --deletion-protection \
    --log-enabled \
    --log-folder-id <folder_ID> \
    --log-min-level <logging_level>
Where:
- --name: Cluster name.
- --description: Cluster description.
- --labels: List of labels. Provide labels in <key>=<value> format.
- --service-account-id: Service account ID.
- --version: Apache Hive™ Metastore version.
- --subnet-ids: List of subnet IDs.
- --security-group-ids: List of security group IDs.
- --resource-preset-id: Computing resource configuration. The possible values are:
  - c2-m8: 2 vCPUs and 8 GB RAM.
  - c2-m4: 2 vCPUs and 4 GB RAM.
- --maintenance-window: Maintenance window settings (including for disabled clusters), where type is the maintenance type:
  - anytime: At any time (default).
  - weekly: On a schedule. For this value, also specify the following:
    - day: Day of week, i.e., MON, TUE, WED, THU, FRI, SAT, or SUN.
    - hour: Hour of day (UTC), from 1 to 24.
- --deletion-protection: Enables cluster protection against accidental deletion.
- Logging parameters:
  - --log-enabled: Enables logging. Logs generated by Apache Hive™ Metastore components will be sent to Yandex Cloud Logging.
  - --log-folder-id: Folder ID. Logs will be written to the default log group for this folder.
  - --log-group-id: Custom log group ID. Logs will be written to this group.
    Specify one of the two parameters: --log-folder-id or --log-group-id.
  - --log-min-level: Minimum logging level. Possible values: TRACE, DEBUG, INFO (default), WARN, ERROR, and FATAL.
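Two of the parameters above have rules that are easy to get wrong: the --maintenance-window value and the choice between --log-folder-id and --log-group-id. The following is a minimal sketch of helpers that assemble these values and reject input outside the ranges listed above. The function names `maintenance_window` and `logging_flags` are hypothetical and not part of the yc CLI.

```shell
#!/usr/bin/env bash
# Sketch only: these helper functions are illustrative, not part of the yc CLI.

# Prints "type=anytime" or "type=weekly,day=<DAY>,hour=<HOUR>".
# Returns 1 if the type, day, or hour is outside the documented values.
maintenance_window() {
  local type="$1" day="${2:-}" hour="${3:-}"
  case "$type" in
    anytime)
      echo "type=anytime"
      ;;
    weekly)
      case "$day" in
        MON|TUE|WED|THU|FRI|SAT|SUN) ;;
        *) echo "invalid day: $day" >&2; return 1 ;;
      esac
      # The hour must be a whole number from 1 to 24 (UTC).
      case "$hour" in
        ''|*[!0-9]*) echo "invalid hour: $hour" >&2; return 1 ;;
      esac
      if [ "$hour" -lt 1 ] || [ "$hour" -gt 24 ]; then
        echo "invalid hour: $hour" >&2
        return 1
      fi
      echo "type=weekly,day=$day,hour=$hour"
      ;;
    *)
      echo "invalid type: $type" >&2
      return 1
      ;;
  esac
}

# Prints the logging flags with exactly one destination:
#   folder <folder_ID>    -> --log-folder-id
#   group  <log_group_ID> -> --log-group-id
# The minimum level defaults to INFO.
logging_flags() {
  local kind="$1" id="$2" level="${3:-INFO}"
  case "$kind" in
    folder) echo "--log-enabled --log-folder-id $id --log-min-level $level" ;;
    group)  echo "--log-enabled --log-group-id $id --log-min-level $level" ;;
    *) echo "destination must be 'folder' or 'group'" >&2; return 1 ;;
  esac
}
```

For example, `maintenance_window weekly MON 3` prints `type=weekly,day=MON,hour=3`, which you can pass as the --maintenance-window value in the creation command.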
- Get an IAM token for API authentication and save it as an environment variable:
  export IAM_TOKEN="<IAM_token>"
- Create a file named body.json and paste the following code into it:
  {
    "folderId": "<folder_ID>",
    "name": "<cluster_name>",
    "description": "<cluster_description>",
    "labels": { "<label_list>" },
    "deletionProtection": <deletion_protection>,
    "version": "<Apache_Hive™_Metastore_version>",
    "configSpec": {
      "resources": {
        "resourcePresetId": "<resource_configuration_ID>"
      }
    },
    "serviceAccountId": "<service_account_ID>",
    "logging": {
      "enabled": <use_of_logging>,
      "folderId": "<folder_ID>",
      "minLevel": "<logging_level>"
    },
    "network": {
      "subnetIds": [ "<list_of_subnet_IDs>" ],
      "securityGroupIds": [ "<list_of_security_group_IDs>" ]
    },
    "maintenanceWindow": {
      "weeklyMaintenanceWindow": {
        "day": "<day_of_week>",
        "hour": "<hour>"
      }
    }
  }
Where:
- folderId: Folder ID. You can get it with the list of folders in the cloud.
- name: Cluster name.
- description: Cluster description.
- labels: List of labels. Provide labels in "<key>": "<value>" format.
- deletionProtection: Enables cluster protection against accidental deletion. The possible values are true or false.
- version: Apache Hive™ Metastore version.
- configSpec.resources.resourcePresetId: ID of the cluster's computing resources. The possible values are:
  - c2-m8: 2 vCPUs and 8 GB RAM.
  - c2-m4: 2 vCPUs and 4 GB RAM.
- serviceAccountId: Service account ID.
- logging: Logging parameters:
  - enabled: Enables logging. Logs generated by Apache Hive™ Metastore components will be sent to Yandex Cloud Logging. The possible values are true or false.
  - folderId: Folder ID. Logs will be written to the default log group for this folder.
  - logGroupId: Custom log group ID. Logs will be written to this group.
    Specify either folderId or logGroupId.
  - minLevel: Minimum logging level. The possible values are TRACE, DEBUG, INFO, WARN, ERROR, and FATAL.
- network: Network settings:
  - subnetIds: List of subnet IDs.
  - securityGroupIds: List of security group IDs.
- maintenanceWindow: Maintenance window settings (including for disabled clusters). In maintenanceWindow, provide one of the two parameters:
  - anytime: Maintenance can take place at any time.
  - weeklyMaintenanceWindow: Maintenance takes place once a week at the specified time:
    - day: Day of week in DDD format: MON, TUE, WED, THU, FRI, SAT, or SUN.
    - hour: Time of day (UTC) in HH format, from 1 to 24.
- Use the Cluster.Create method and send the following request, e.g., via cURL:
  curl \
    --request POST \
    --header "Authorization: Bearer $IAM_TOKEN" \
    --url 'https://metastore.api.cloud.yandex.net/managed-metastore/v1/clusters' \
    --data '@body.json'
- View the server response to make sure your request was successful.
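One way to check the response without reading it by eye is to capture the HTTP status code alongside the response body. The following is a minimal sketch: the function names `http_ok` and `create_cluster` and the response.json file name are hypothetical, and treating any 2xx status as success is an assumption rather than a documented contract of this API.

```shell
#!/usr/bin/env bash
# Sketch only: function names and response.json are illustrative.

# Succeeds for any 2xx HTTP status code, fails otherwise
# (including 000, which curl reports when no response was received).
http_ok() {
  case "$1" in
    2[0-9][0-9]) return 0 ;;
    *) return 1 ;;
  esac
}

# Sends body.json, saves the response body to response.json,
# and reports whether the request was accepted.
# Requires IAM_TOKEN to be set, as in the steps above.
create_cluster() {
  local status
  status="$(curl --silent \
    --output response.json \
    --write-out '%{http_code}' \
    --request POST \
    --header "Authorization: Bearer $IAM_TOKEN" \
    --url 'https://metastore.api.cloud.yandex.net/managed-metastore/v1/clusters' \
    --data '@body.json')"
  if http_ok "$status"; then
    echo "Request accepted (HTTP $status); see response.json"
  else
    echo "Request failed (HTTP $status); see response.json" >&2
    return 1
  fi
}
```

Call `create_cluster` from the directory that contains body.json. In both outcomes the server's reply is kept in response.json for inspection.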
- Get an IAM token for API authentication and save it as an environment variable:
  export IAM_TOKEN="<IAM_token>"
- Clone the cloudapi repository:
  cd ~/ && git clone --depth=1 https://github.com/yandex-cloud/cloudapi
  Below, we assume the repository contents are stored in the ~/cloudapi/ directory.
- Create a file named body.json and paste the following code into it:
  {
    "folder_id": "<folder_ID>",
    "name": "<cluster_name>",
    "description": "<cluster_description>",
    "labels": "{ <label_list> }",
    "deletion_protection": <deletion_protection>,
    "version": "<Apache_Hive™_Metastore_version>",
    "config_spec": {
      "resources": {
        "resource_preset_id": "<resource_configuration_ID>"
      }
    },
    "service_account_id": "<service_account_ID>",
    "logging": {
      "enabled": <use_of_logging>,
      "folder_id": "<folder_ID>",
      "min_level": "<logging_level>"
    },
    "network": {
      "subnet_ids": [ "<list_of_subnet_IDs>" ],
      "security_group_ids": [ "<list_of_security_group_IDs>" ]
    },
    "maintenance_window": {
      "weekly_maintenance_window": {
        "day": "<day_of_week>",
        "hour": "<hour>"
      }
    }
  }
Where:
- folder_id: Folder ID. You can get it with the list of folders in the cloud.
- name: Cluster name.
- description: Cluster description.
- labels: List of labels. Provide labels in "<key>": "<value>" format.
- deletion_protection: Enables cluster protection against accidental deletion. The possible values are true or false.
- version: Apache Hive™ Metastore version.
- config_spec.resources.resource_preset_id: ID of the cluster's computing resources. The possible values are:
  - c2-m8: 2 vCPUs and 8 GB RAM.
  - c2-m4: 2 vCPUs and 4 GB RAM.
- service_account_id: Service account ID.
- logging: Logging parameters:
  - enabled: Enables logging. Logs generated by Apache Hive™ Metastore components will be sent to Yandex Cloud Logging. The possible values are true or false.
  - folder_id: Folder ID. Logs will be written to the default log group for this folder.
  - log_group_id: Custom log group ID. Logs will be written to this group.
    Specify either folder_id or log_group_id.
  - min_level: Minimum logging level. The possible values are TRACE, DEBUG, INFO, WARN, ERROR, and FATAL.
- network: Network settings:
  - subnet_ids: List of subnet IDs.
  - security_group_ids: List of security group IDs.
- maintenance_window: Maintenance window settings (including for disabled clusters). In maintenance_window, provide one of the two parameters:
  - anytime: Maintenance can take place at any time.
  - weekly_maintenance_window: Maintenance takes place once a week at the specified time:
    - day: Day of week in DDD format: MON, TUE, WED, THU, FRI, SAT, or SUN.
    - hour: Time of day (UTC) in HH format, from 1 to 24.
- Use the ClusterService.Create call and send the following request, e.g., via gRPCurl:
  grpcurl \
    -format json \
    -import-path ~/cloudapi/ \
    -import-path ~/cloudapi/third_party/googleapis/ \
    -proto ~/cloudapi/yandex/cloud/metastore/v1/cluster_service.proto \
    -rpc-header "Authorization: Bearer $IAM_TOKEN" \
    -d @ \
    metastore.api.cloud.yandex.net:443 \
    yandex.cloud.metastore.v1.ClusterService.Create \
    < body.json
- View the server response to make sure your request was successful.
Apache® and Apache Hive™