Troubleshooting in Apache Hive™ Metastore
This section describes issues you may encounter in the service and how to troubleshoot them.
- Error when creating a database
- No permission error when attaching a service account to a cluster
- Hive table lock
Error when creating a database in Apache Hive™ Metastore
The error occurs if you use the following syntax to create a database:
CREATE DATABASE IF NOT EXISTS <DB_name>;
Solution
Apache Hive™ Metastore does not allow creating a database or table without an explicit storage location: the data is not stored in Hive itself but in a Yandex Object Storage bucket linked to a Yandex Data Processing cluster. To create a database, specify its location explicitly:
CREATE DATABASE IF NOT EXISTS <DB_name> LOCATION <DB_location>;
In the LOCATION parameter, specify the path to the bucket and the database in it in the following format: s3a://<bucket_name>/<folder_name>/<DB_name>. Specifying a folder is optional; however, objects will load into a folder faster than into the bucket root.
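The path format above can be sketched in Python as follows. This is only an illustrative helper: the function names `build_location` and `build_create_database` are hypothetical and are not part of any Yandex Cloud or Hive tooling, and the sketch assumes the location is passed to Hive as a single-quoted string literal.

```python
from typing import Optional


def build_location(bucket: str, db_name: str, folder: Optional[str] = None) -> str:
    """Return s3a://<bucket_name>/<folder_name>/<DB_name>; the folder part is optional."""
    parts = [bucket.strip("/")]
    if folder:
        parts.append(folder.strip("/"))
    parts.append(db_name)
    return "s3a://" + "/".join(parts)


def build_create_database(db_name: str, bucket: str, folder: Optional[str] = None) -> str:
    """Return a CREATE DATABASE statement with an explicit LOCATION clause."""
    location = build_location(bucket, db_name, folder)
    # Assumption: the location is written as a single-quoted string literal.
    return f"CREATE DATABASE IF NOT EXISTS {db_name} LOCATION '{location}';"
```

For example, `build_create_database("sales", "my-bucket", "warehouse")` produces a statement whose location is `s3a://my-bucket/warehouse/sales`; the bucket, folder, and database names here are placeholders.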
No permission error when attaching a service account to the cluster
Error message:
ERROR: rpc error: code = PermissionDenied desc = you do not have permission to access the requested service account or service account does not exist
This error occurs when you attach a service account to a cluster while creating or modifying the cluster.
Solution
Assign the iam.serviceAccounts.user role or higher to your Yandex Cloud account.
Hive table lock
When using Apache Hive™ Metastore, a Hive table may get locked, for example, if a script working with the table is interrupted.
To remove the lock, use one of the following:
- Hive Metastore Thrift interface.
- Python script running in the same virtual private cloud (VPC) as Apache Hive™ Metastore.
Removing a lock using a Python script
Warning
Apache Hive™ Metastore is only accessible via a private VPC IP address and does not have a public DNS name. This provides additional security but requires all services connecting to Apache Hive™ Metastore to be in the same VPC or have configured network access.
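Before working with locks, it may help to confirm that the host you are on can actually reach Apache Hive™ Metastore over the network. The following is a minimal sketch that only attempts a TCP connection; the function name `can_reach_metastore` is hypothetical, and port 9083 is assumed because it is the default used by the unlock script in this section.

```python
import socket


def can_reach_metastore(host: str, port: int = 9083, timeout: float = 3.0) -> bool:
    """Return True if a TCP connection to host:port succeeds within the timeout."""
    try:
        # create_connection resolves the host and opens a TCP socket.
        with socket.create_connection((host, port), timeout=timeout):
            return True
    except OSError:
        # Covers refused connections, timeouts, and name resolution failures.
        return False
```

A result of `False` suggests that the VM is outside the VPC or that network access is not configured; a result of `True` only shows that the port is open, not that the Thrift service is healthy.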
To remove the lock:
- Connect to a VM or service that is in the same VPC as Apache Hive™ Metastore.
- Install the dependencies:
pip install click
pip install hive-metastore-client
- Create a file named unlock.py and paste the following script to it:
import click
from hive_metastore_client import HiveMetastoreClient
from thrift_files.libraries.thrift_hive_metastore_client.ttypes import ShowLocksRequest, UnlockRequest


class MetastoreClient:
    def __init__(self, metastore_hostname, metastore_port):
        self.metastore_hostname = metastore_hostname
        self.metastore_port = metastore_port
        self.metastore_client = HiveMetastoreClient(metastore_hostname, metastore_port)

    def show_locks(self, db_name, table):
        with self.metastore_client as metastore_client:
            req = ShowLocksRequest(dbname=db_name, tablename=table)
            return metastore_client.show_locks(req)

    def unlock(self, lock_id):
        with self.metastore_client as metastore_client:
            req = UnlockRequest(lockid=lock_id)
            return metastore_client.unlock(req)


@click.group()
@click.option(
    "--host",
    required=True,
    help="Metastore host",
)
@click.option(
    "--port",
    type=int,
    help="Metastore port",
    default=9083,
)
@click.pass_context
def cli(ctx, host: str, port: int):
    """Hive Metastore CLI."""
    ctx.obj = MetastoreClient(host, port)


@cli.command("show-locks")
@click.argument("db_name", required=True)
@click.argument("table", required=True)
@click.pass_obj
def show_locks(client: MetastoreClient, db_name, table):
    """Show locks for table."""
    result = client.show_locks(db_name, table)
    click.echo(result)


@cli.command("unlock")
@click.argument("lock_id", required=True, type=int)
@click.pass_obj
def unlock(client: MetastoreClient, lock_id):
    """Unlock by lock id."""
    result = client.unlock(lock_id)
    click.echo(result)


if __name__ == "__main__":
    cli()
- To view the list of locks, run this script:
python unlock.py --host <metastore-host> show-locks <db-name> <table-name>
Where:
  - <metastore-host>: Apache Hive™ Metastore private IP address. To learn the IP address:
    - Go to the resource folder page.
    - Go to Yandex MetaData Hub.
    - In the left-hand panel, select Metastore.
  - <db-name>: Database name.
  - <table-name>: Table name.
- To remove the lock, run this script:
python unlock.py --host <metastore-host> unlock <lock-id>
Where:
  - <metastore-host>: Apache Hive™ Metastore private IP address.
  - <lock-id>: Lock ID.
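When a table has several locks, it can be convenient to pull the lock IDs out of the show-locks response instead of reading them from the printed output. The sketch below is illustrative only: the helper name `collect_lock_ids` is hypothetical, and it assumes the response object exposes a `locks` list whose elements carry `lockid` and `user` attributes, as in the Hive Metastore Thrift definition of the show-locks response. Check these field names against the client version you have installed.

```python
from typing import List, Optional


def collect_lock_ids(response, user: Optional[str] = None) -> List[int]:
    """Return lock IDs from a show-locks response, optionally filtered by user.

    Assumption: `response.locks` is a list of elements with `lockid` and
    `user` attributes; a missing or empty list yields an empty result.
    """
    locks = getattr(response, "locks", None) or []
    return [
        lock.lockid
        for lock in locks
        if user is None or getattr(lock, "user", None) == user
    ]
```

Each returned ID can then be passed as `<lock-id>` to the unlock command shown above. Filtering by user is a cautious default when other jobs may hold legitimate locks on the same table.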