Working with logs
Yandex Cloud Logging collects and displays Yandex Data Processing cluster logs.
To monitor events on the cluster and its individual hosts, specify a log group in the cluster settings. You can do this when creating or updating the cluster. If no log group is selected for the cluster, logs are sent to and stored in the default log group of the cluster's folder.
For more information, see Logs.
Viewing log entries
- Go to the folder page and select Yandex Data Processing.
- Click the cluster name.
- Under Configuration, click the name of the cluster log group. The Cloud Logging page will open.
- Click the row of the log group. This will open the cluster logs.
- (Optional) Specify the output settings:
  - Filter query. Examples:
    - Getting the output of a Yandex Data Processing job:
      job_id="<job_ID>"
    - Getting the stdout output for all YARN application containers:
      application_id="<YARN_application_ID>" AND yarn_log_type="stdout"
    - Getting a YARN container's stderr output:
      container_id="<YARN_container_ID>" AND yarn_log_type="stderr"
    - Getting the YARN Resource Manager service logs from the cluster's master host:
      hostname="<master_host_FQDN>" AND log_type="hadoop-yarn-resourcemanager"
  - Message logging levels: from TRACE to FATAL.
  - Number of messages per page.
  - Message interval (one of the standard intervals or an ad-hoc one).
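The filter queries listed above follow a small number of patterns, so they can be assembled in a script instead of being typed by hand. The sketch below is one possible way to do that in POSIX shell. The function names are illustrative assumptions and are not part of any Yandex Cloud tool; only the field names (job_id, application_id, container_id, yarn_log_type, hostname, log_type) come from the examples above.

```shell
# Sketch: build filter queries like the ones listed above.
# Function names are hypothetical helpers, not part of the yc CLI.

# Filter for the output of one Yandex Data Processing job.
job_filter() {
  printf 'job_id="%s"' "$1"
}

# Filter for one output stream of a YARN application (defaults to stdout).
yarn_app_filter() {
  printf 'application_id="%s" AND yarn_log_type="%s"' "$1" "${2:-stdout}"
}

# Filter for one output stream of a single YARN container (defaults to stderr).
yarn_container_filter() {
  printf 'container_id="%s" AND yarn_log_type="%s"' "$1" "${2:-stderr}"
}

# Filter for the logs of one service on one host.
host_service_filter() {
  printf 'hostname="%s" AND log_type="%s"' "$1" "$2"
}
```

The printed string can be pasted into the filter field of the log view or passed to a command-line tool that accepts the same filter syntax.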
If you do not have the Yandex Cloud command line interface yet, install and initialize it.
View a description of the CLI command to get logs:
yc logging read --help
Examples:
- To get logs of the Yandex Data Processing cluster's HDFS NameNode service, run this command:
  yc logging read \
    --group-id=<log_group_ID> \
    --resource-ids=<cluster_ID> \
    --filter=log_type=hadoop-hdfs-namenode
- To get logs for the last two hours from all Yandex Data Processing clusters assigned to a specific log group, run this command:
  yc logging read \
    --group-id=<log_group_ID> \
    --resource-types=dataproc.cluster \
    --since=2h
- To get your cluster's system log for a specific period, run this command:
  yc logging read \
    --group-id <log_group_ID> \
    --resource-ids=<cluster_ID> \
    --filter 'syslog' \
    --since 'YYYY-MM-DDThh:mm:ssZ' \
    --until 'YYYY-MM-DDThh:mm:ssZ'
  Set the logging period in the --since and --until parameters. Time format: YYYY-MM-DDThh:mm:ssZ. Example: 2020-08-10T12:00:00Z. The time must be specified in UTC.
- To get a log of the metrics sent from a specific host to Yandex Monitoring, run this command:
  yc logging read \
    --group-id <log_group_ID> \
    --resource-ids=<cluster_ID> \
    --filter 'telegraf and hostname="<host_FQDN>"' \
    --since 'YYYY-MM-DDThh:mm:ssZ' \
    --until 'YYYY-MM-DDThh:mm:ssZ'
Note
You can omit the --group-id flag and specify the log group ID directly.
To get the host FQDN:
1. Go to the [folder page](https://console.yandex.cloud) and select **Yandex Data Processing**.
1. Click the cluster name.
1. Go to the **Hosts** tab.
1. Copy the host FQDN.
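The --since and --until parameters used in the CLI examples expect UTC timestamps in the YYYY-MM-DDThh:mm:ssZ format. The sketch below shows one way to generate such values in shell rather than writing them by hand. It assumes a date utility that supports either the GNU (-d @epoch) or the BSD (-r epoch) syntax and the +%s format; the function names are hypothetical.

```shell
# Sketch: produce UTC timestamps for the --since and --until parameters.
# Function names are hypothetical; assumes GNU or BSD date.

# Format a Unix epoch (seconds) as YYYY-MM-DDThh:mm:ssZ in UTC.
# Tries the GNU syntax first, then falls back to the BSD/macOS syntax.
utc_timestamp() {
  date -u -d "@$1" '+%Y-%m-%dT%H:%M:%SZ' 2>/dev/null ||
    date -u -r "$1" '+%Y-%m-%dT%H:%M:%SZ'
}

# Print "<since> <until>" for a window covering the last N hours.
last_hours_window() {
  now=$(date -u '+%s')
  start=$((now - $1 * 3600))
  printf '%s %s\n' "$(utc_timestamp "$start")" "$(utc_timestamp "$now")"
}

# Example (requires the yc CLI; placeholders as in the commands above):
#   set -- $(last_hours_window 2)
#   yc logging read --group-id <log_group_ID> --resource-ids=<cluster_ID> \
#     --since "$1" --until "$2"
```

Computing both ends of the window from a single epoch value keeps the two timestamps consistent with each other.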
Disabling sending logs
When creating or updating a cluster, add the dataproc:disable_cloud_logging property set to true.
If you do not have the Yandex Cloud command line interface yet, install and initialize it.
The folder specified in the CLI profile is used by default. You can specify a different folder using the --folder-name or --folder-id parameter.
When creating or updating a cluster, specify dataproc:disable_cloud_logging=true in the --property parameter or provide an empty string ("") instead of the log group ID in the --log-group-id parameter:
yc dataproc cluster create <cluster_name> \
  ... \
  --log-group-id=""

yc dataproc cluster update <cluster_name_or_ID> \
  --property dataproc:disable_cloud_logging=true
Storing logs
You pay for receiving and storing your logs based on the Cloud Logging pricing policy. The default log retention period is three days. To update the retention period, edit the log group settings.
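As a rough illustration of changing the retention period from a script, the sketch below converts a number of days into an hours-based duration string. The helper name is hypothetical. The yc logging group update command and its --retention-period flag shown in the comment are an assumption that is not confirmed by this page, so check them with yc logging group update --help before relying on them.

```shell
# Sketch: express a retention period given in whole days as hours, e.g. 3 -> 72h.
# retention_duration is a hypothetical helper, not part of the yc CLI.
retention_duration() {
  printf '%dh' "$(( $1 * 24 ))"
}

# Example (requires the yc CLI; the command and flag are assumptions to verify):
#   yc logging group update --id <log_group_ID> \
#     --retention-period "$(retention_duration 7)"
```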
For more information about logs, see the Cloud Logging documentation.