Logs in Yandex Data Processing
Yandex Cloud Logging collects and displays Yandex Data Processing cluster logs. Logs are automatically saved to the log group that is linked to the cluster when you create or modify the cluster. This can be the folder's default log group or a log group pre-configured by the user.
To view the logs of a Yandex Data Processing cluster, go to the cluster log group from the cluster page. Then enter a filter in the Query field:
- Standard filtering parameters:
  - resource_type: Always takes the dataproc.cluster value.
  - resource_id: Cluster ID.
- Additional filtering parameters:
  - hostname: Host FQDN.
  - log_type: Type of entries in cluster logs.
The log group page then displays a histogram of the logs along with their records.
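As a rough illustration, the filtering parameters above can be assembled into a query string programmatically. The sketch below uses a hypothetical helper, build_log_filter, which is not part of any Yandex Cloud SDK. It assumes the Query field accepts conditions of the form key = "value" joined with AND; check the Cloud Logging filter expression reference for the exact syntax before relying on it. The cluster ID in the usage example is a placeholder.

```python
def build_log_filter(cluster_id, hostname=None, log_type=None):
    """Assemble a filter string from the parameters listed above.

    Assumption: conditions are written as key = "value" and joined with AND.
    """
    # resource_type always takes the dataproc.cluster value for these logs.
    conditions = [
        ("resource_type", "dataproc.cluster"),
        ("resource_id", cluster_id),
    ]
    # Optional parameters are added only when the caller provides them.
    if hostname is not None:
        conditions.append(("hostname", hostname))
    if log_type is not None:
        conditions.append(("log_type", log_type))
    return " AND ".join(f'{key} = "{value}"' for key, value in conditions)


if __name__ == "__main__":
    # Placeholder cluster ID; substitute a real one.
    print(build_log_filter("<cluster_ID>", log_type="syslog"))
```

Keeping the two standard parameters mandatory and the additional ones optional mirrors the split in the list above.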
Types of Yandex Data Processing log entries
Cluster component logs
Depending on the subcluster role, the following types of entries are used for component logs:
- All cluster hosts:
  - cloud-init: Yandex Data Processing clusters with image version 2.0 or higher.
  - salt-minion: Service initialization log for the Yandex Data Processing cluster.
  - syslog: System log.
  - telegraf: Log of outgoing Yandex Data Processing cluster metrics sent to Monitoring.
- Master host:
  - flume: Yandex Data Processing clusters with image version below 2.0.
  - hadoop-hdfs-namenode
  - hadoop-hdfs-secondarynamenode
  - hadoop-mapreduce
  - hadoop-yarn-resourcemanager
  - hadoop-yarn-timelineserver
  - hbase-master
  - hbase-rest
  - hbase-thrift
  - hive-metastore
  - hiveserver2
  - hive-webhcat-console: Yandex Data Processing clusters with image version below 2.0.
  - hive-webhcat-console-error: Yandex Data Processing clusters with image version below 2.0.
  - hive-webhcat: Yandex Data Processing clusters with image version below 2.0.
  - knox: Yandex Data Processing clusters with image version below 2.0.
  - livy-out
  - livy-request
  - oozie
  - oozie-audit
  - oozie-error
  - oozie-instrumentation
  - oozie-jetty
  - oozie-jpa
  - oozie-ops
  - postgres
  - sqoop: Yandex Data Processing clusters with image version below 2.0.
  - supervisor: Yandex Data Processing clusters with image version below 2.0.
  - yandex-dataproc-agent
  - zeppelin
  - zookeeper
- Data storage subcluster hosts:
  - hadoop-hdfs-datanode
  - hadoop-yarn-nodemanager
- Data processing subcluster hosts contain hadoop-yarn-nodemanager logs.
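The role and image-version rules above can be captured in a small lookup. The following is only a sketch: the role keys (master, data_storage, data_processing) and the function name component_log_types are hypothetical labels chosen for illustration, and the result lists entry types that may appear for a role, not the components actually installed in a given cluster.

```python
# Entry types the list above limits to certain image versions.
BELOW_2_0 = {
    "flume", "hive-webhcat-console", "hive-webhcat-console-error",
    "hive-webhcat", "knox", "sqoop", "supervisor",
}
FROM_2_0 = {"cloud-init"}

ALL_HOSTS = ["cloud-init", "salt-minion", "syslog", "telegraf"]

# Role keys are hypothetical names for the subcluster roles in the text.
ROLE_LOG_TYPES = {
    "master": [
        "flume", "hadoop-hdfs-namenode", "hadoop-hdfs-secondarynamenode",
        "hadoop-mapreduce", "hadoop-yarn-resourcemanager",
        "hadoop-yarn-timelineserver", "hbase-master", "hbase-rest",
        "hbase-thrift", "hive-metastore", "hiveserver2",
        "hive-webhcat-console", "hive-webhcat-console-error", "hive-webhcat",
        "knox", "livy-out", "livy-request", "oozie", "oozie-audit",
        "oozie-error", "oozie-instrumentation", "oozie-jetty", "oozie-jpa",
        "oozie-ops", "postgres", "sqoop", "supervisor",
        "yandex-dataproc-agent", "zeppelin", "zookeeper",
    ],
    "data_storage": ["hadoop-hdfs-datanode", "hadoop-yarn-nodemanager"],
    "data_processing": ["hadoop-yarn-nodemanager"],
}


def component_log_types(role, image_version):
    """Return entry types that may appear for a role and image version."""
    # Compare only major.minor; "2" is treated as "2.0".
    parts = (image_version.split(".") + ["0"])[:2]
    is_2_0_or_higher = tuple(int(p) for p in parts) >= (2, 0)

    result = []
    for name in ALL_HOSTS + ROLE_LOG_TYPES[role]:
        if name in BELOW_2_0 and is_2_0_or_higher:
            continue  # only listed for images below 2.0
        if name in FROM_2_0 and not is_2_0_or_higher:
            continue  # only listed for images 2.0 or higher
        result.append(name)
    return result


if __name__ == "__main__":
    print(component_log_types("data_processing", "2.1"))
```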
Job logs
The following types of entries are added to job logs:
- Entries of YARN container logs.
  For the entry type, specify containers.
  The entries also have tags:
  - yarn_log_type: Name of the log file YARN saves as a container log. Examples:
    - stdout
    - stderr
    - launch_container.sh
    - prelaunch.out
    - directory.info
  - container_id: YARN container ID. Example: container_1638976919626_0002_01_000001.
  - application_id: YARN application ID. Example: application_1638976919626_0002.
- Log entries of the launching process output. They are saved if the job has been started via the Yandex Data Processing API rather than on cluster hosts.
  For the entry type, specify job_output.
  The entries contain the job_id tag with the job ID created via the Yandex Data Processing API. If the job started but was not completed at the validation stage, the entries contain the application_id tag.
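The example container_id and application_id values above share the same timestamp and application number, which reflects how YARN derives container IDs from the application ID. The sketch below recovers the application ID from a container ID, which can help when correlating container entries with job entries. It assumes the usual YARN layout container_<cluster timestamp>_<application number>_<attempt>_<container number>, with an optional epoch segment (for example e17) after the container_ prefix. The function name application_id_from_container is hypothetical.

```python
import re

# Assumed layout: container_[e<epoch>_]<timestamp>_<app>_<attempt>_<container>
_CONTAINER_ID = re.compile(r"^container_(?:e\d+_)?(\d+)_(\d+)_(\d+)_(\d+)$")


def application_id_from_container(container_id):
    """Derive the YARN application ID from a container ID string."""
    match = _CONTAINER_ID.match(container_id)
    if match is None:
        raise ValueError(f"unrecognized container ID: {container_id!r}")
    cluster_timestamp, app_number = match.group(1), match.group(2)
    # Keep the zero padding of the application number exactly as found.
    return f"application_{cluster_timestamp}_{app_number}"


if __name__ == "__main__":
    print(application_id_from_container("container_1638976919626_0002_01_000001"))
```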