Working with logs
Yandex Cloud Logging collects and displays a Yandex Data Processing cluster’s logs.
To monitor events on the cluster and its individual hosts, specify the relevant log group in the cluster settings. You can do this when creating or updating the cluster. If no log group is selected for the cluster, logs are sent to and stored in the default log group of the cluster's folder.
For more information, see Logs.
Viewing log entries
- Open the folder dashboard.
- Go to Yandex Data Processing.
- Click the name of your cluster.
- Under Configuration, click the name of the cluster log group. The Cloud Logging page will open.
- Click the log group row. This will open the cluster logs.
- Optionally, specify the output settings:
  - Message filter. Example queries:
    - Getting the output of a Yandex Data Processing job:
      job_id="<job_ID>"
    - Getting the stdout output for all YARN application containers:
      application_id="<YARN_app_ID>" AND yarn_log_type="stdout"
    - Getting a YARN container's stderr output:
      container_id="<YARN_container_ID>" AND yarn_log_type="stderr"
    - Getting the YARN Resource Manager logs from the cluster's master host:
      hostname="<master_host_FQDN>" AND log_type="hadoop-yarn-resourcemanager"
  - Message logging levels: from TRACE to FATAL.
  - Number of messages per page.
  - Message interval (one of the standard intervals or an ad-hoc one).
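The filter templates above all follow the same pattern: one or more `key="value"` conditions joined with `AND`. As a minimal sketch, the shell function below assembles such an expression from its parts. The name `build_filter` is a hypothetical helper and not part of any Yandex Cloud tooling, and the application ID is a placeholder.

```shell
#!/bin/sh
# Hypothetical helper: join any number of filter conditions with AND.
# Each argument is one complete condition, e.g. 'yarn_log_type="stdout"'.
build_filter() {
  result=""
  for cond in "$@"; do
    if [ -z "$result" ]; then
      result="$cond"
    else
      result="$result AND $cond"
    fi
  done
  printf '%s\n' "$result"
}

# Placeholder value; substitute a real YARN application ID.
app_id="<YARN_app_ID>"

build_filter "application_id=\"$app_id\"" 'yarn_log_type="stdout"'
# prints: application_id="<YARN_app_ID>" AND yarn_log_type="stdout"
```

The printed expression can be pasted into the filter field of the log group page. Building it from separate conditions makes it easier to add or drop a condition without breaking the quoting.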
If you do not have the Yandex Cloud CLI installed yet, install and initialize it.
See the description of the CLI command for getting logs:
yc logging read --help
Examples:
- To get logs of the Yandex Data Processing cluster's HDFS NameNode, run this command:
  yc logging read \
    --group-id=<log_group_ID> \
    --resource-ids=<cluster_ID> \
    --filter=log_type=hadoop-hdfs-namenode
- To get logs for the last two hours from all Yandex Data Processing clusters assigned to a specific log group, run this command:
  yc logging read \
    --group-id=<log_group_ID> \
    --resource-types=dataproc.cluster \
    --since=2h
- To get your cluster's system log over a specific period, run this command:
  yc logging read \
    --group-id <log_group_ID> \
    --resource-ids=<cluster_ID> \
    --filter 'syslog' \
    --since 'YYYY-MM-DDThh:mm:ssZ' \
    --until 'YYYY-MM-DDThh:mm:ssZ'
  Set the logging period in the --since and --until parameters. Time format: YYYY-MM-DDThh:mm:ssZ. Example: 2020-08-10T12:00:00Z. Use the UTC time zone.
- To get a log for metrics sent from a specific host to Yandex Monitoring, run this command:
  yc logging read \
    --group-id <log_group_ID> \
    --resource-ids=<cluster_ID> \
    --filter 'telegraf and hostname="<host_FQDN>"' \
    --since 'YYYY-MM-DDThh:mm:ssZ' \
    --until 'YYYY-MM-DDThh:mm:ssZ'
Note
You can omit the --group-id parameter and specify the log group ID directly.
To get the host FQDN:
1. Open the [folder dashboard](https://console.yandex.cloud).
1. [Go](../../console/operations/select-service.md#select-service) to **Yandex Data Processing**.
1. Click the name of your cluster.
1. Navigate to the **Hosts** tab.
1. Copy the host FQDN.
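The `--since` and `--until` parameters in the examples above expect UTC timestamps in the `YYYY-MM-DDThh:mm:ssZ` format. The following is a minimal sketch of producing such values in a shell script. The function name `to_utc` is a hypothetical helper. The sketch assumes a `date` utility that accepts either the GNU form (`-d @<seconds>`) or the BSD/macOS form (`-r <seconds>`), and it prints the resulting command for review instead of running it.

```shell
#!/bin/sh
# Hypothetical helper: format epoch seconds as YYYY-MM-DDThh:mm:ssZ in UTC.
# Tries the GNU date syntax first, then falls back to the BSD/macOS syntax.
to_utc() {
  date -u -d "@$1" '+%Y-%m-%dT%H:%M:%SZ' 2>/dev/null ||
    date -u -r "$1" '+%Y-%m-%dT%H:%M:%SZ'
}

now=$(date +%s)                           # current time, seconds since the epoch
until_ts=$(to_utc "$now")                 # end of the period: now
since_ts=$(to_utc $((now - 2 * 3600)))    # start of the period: two hours ago

# Print the command with placeholders left in place; nothing is executed here.
echo "yc logging read --group-id <log_group_ID> --resource-ids=<cluster_ID> --since '$since_ts' --until '$until_ts'"
```

Computing both bounds from one `now` value keeps the period exactly two hours long, regardless of how long the script takes to run.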
Disabling sending logs
When creating or updating a cluster, add the dataproc:disable_cloud_logging property set to true.
If you do not have the Yandex Cloud CLI installed yet, install and initialize it.
By default, the CLI uses the folder specified when creating the profile. To change the default folder, use the yc config set folder-id <folder_ID> command. You can also set a different folder for any specific command using the --folder-name or --folder-id parameter.
When creating or updating a cluster, specify dataproc:disable_cloud_logging=true in the --property parameter or provide an empty string ("") instead of the log group ID in the --log-group-id parameter:
yc dataproc cluster create <cluster_name> \
... \
--log-group-id=""
yc dataproc cluster update <cluster_name_or_ID> \
--property dataproc:disable_cloud_logging=true
Storing logs
Log collection and storage are billed according to the Cloud Logging pricing policy. The default log retention period is three days. To update the retention period, edit the log group settings.
For more information about logs, see this Cloud Logging article.