© 2025 Direct Cursus Technology L.L.C.

Logs in Yandex Data Processing

Written by
Yandex Cloud
Updated at November 26, 2024

Yandex Cloud Logging collects and displays Yandex Data Processing cluster logs. The logs are automatically saved to the log group that is linked to the cluster when you create or modify the cluster. This can be the folder's default log group or a log group you configured in advance.

To view the logs of a Yandex Data Processing cluster, open the cluster page and go to the cluster's log group. Then, in the Query field, enter a filter:

  • Standard filtering parameters:

    • resource_type: Always takes the dataproc.cluster value.
    • resource_id: Cluster ID.
  • Additional filtering parameters:

    • hostname: Host FQDN.
    • log_type: Type of entries in cluster logs.

The log group page then displays a histogram of the matching log entries along with the entries themselves.
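
The filtering parameters above can be combined into a single expression for the Query field. Below is a minimal Python sketch that assembles such an expression. The `name = "value"` form and the `AND` operator are assumptions about the Cloud Logging filter syntax, and `build_log_filter` and the sample values are hypothetical names used only for illustration; check the Cloud Logging documentation for the exact syntax before relying on it.

```python
def build_log_filter(cluster_id, hostname=None, log_type=None):
    """Assemble a filter string from the parameters described above.

    Assumption: conditions are written as name = "value" and joined
    with AND. Verify this against the Cloud Logging filter syntax.
    """
    # resource_type always takes the dataproc.cluster value.
    conditions = [
        'resource_type = "dataproc.cluster"',
        f'resource_id = "{cluster_id}"',
    ]
    # hostname and log_type are the optional, additional parameters.
    if hostname is not None:
        conditions.append(f'hostname = "{hostname}"')
    if log_type is not None:
        conditions.append(f'log_type = "{log_type}"')
    return " AND ".join(conditions)


# Hypothetical cluster ID; replace it with the ID of your cluster.
print(build_log_filter("my-cluster-id", log_type="syslog"))
```

Keeping the two standard parameters mandatory and the additional ones optional mirrors the grouping in the list above.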

Types of log entries in Yandex Data Processing

Cluster component logs

Depending on the subcluster role, the following types of entries are used for component logs:

  • All cluster hosts:

    • cloud-init: Yandex Data Processing clusters with image version 2.0 or higher.
    • salt-minion: Service initialization log for the Yandex Data Processing cluster.
    • syslog: System log.
    • telegraf: Log of outgoing Yandex Data Processing cluster metrics sent to Monitoring.
  • Master host:

    • flume: Yandex Data Processing clusters with image version below 2.0.
    • hadoop-hdfs-namenode
    • hadoop-hdfs-secondarynamenode
    • hadoop-mapreduce
    • hadoop-yarn-resourcemanager
    • hadoop-yarn-timelineserver
    • hbase-master
    • hbase-rest
    • hbase-thrift
    • hive-metastore
    • hiveserver2
    • hive-webhcat-console: Yandex Data Processing clusters with image version below 2.0.
    • hive-webhcat-console-error: Yandex Data Processing clusters with image version below 2.0.
    • hive-webhcat: Yandex Data Processing clusters with image version below 2.0.
    • knox: Yandex Data Processing clusters with image version below 2.0.
    • livy-out
    • livy-request
    • oozie
    • oozie-audit
    • oozie-error
    • oozie-instrumentation
    • oozie-jetty
    • oozie-jpa
    • oozie-ops
    • postgres
    • sqoop: Yandex Data Processing clusters with image version below 2.0.
    • supervisor: Yandex Data Processing clusters with image version below 2.0.
    • yandex-dataproc-agent
    • zeppelin
    • zookeeper
  • Data storage subcluster hosts:

    • hadoop-hdfs-datanode
    • hadoop-yarn-nodemanager
  • Data processing subcluster hosts contain hadoop-yarn-nodemanager logs.
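
Several entry types in the list above exist only for certain image versions. As an illustration, the following Python sketch encodes those restrictions so that a script can check whether a given log_type is expected for a cluster image. The function and constant names are hypothetical, and the version parsing assumes image versions are dot-separated numbers such as 1.4 or 2.0.

```python
# Entry types listed above as available only with image versions below 2.0.
BELOW_2_0_ONLY = {
    "flume",
    "hive-webhcat-console",
    "hive-webhcat-console-error",
    "hive-webhcat",
    "knox",
    "sqoop",
    "supervisor",
}

# Entry types listed above as available only with image version 2.0 or higher.
FROM_2_0_ONLY = {"cloud-init"}


def major_minor(image_version):
    """Return (major, minor) from a dot-separated version string."""
    parts = [int(p) for p in image_version.split(".")] + [0]
    return tuple(parts[:2])


def is_log_type_expected(log_type, image_version):
    """Check a log type against the image version restrictions.

    Types without a stated restriction are treated as expected; the
    function does not verify that the type appears in the list at all.
    """
    is_2_0_or_higher = major_minor(image_version) >= (2, 0)
    if log_type in BELOW_2_0_ONLY:
        return not is_2_0_or_higher
    if log_type in FROM_2_0_ONLY:
        return is_2_0_or_higher
    return True


print(is_log_type_expected("knox", "1.4"))
print(is_log_type_expected("cloud-init", "1.4"))
```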

Job logs

The following types of entries are added to job logs:

  • Entries of YARN container logs.

    For the entry type, specify containers.

    The entries also have tags:

    • yarn_log_type: Name of the log file YARN saves as a container log.

      Examples:

      • stdout
      • stderr
      • launch_container.sh
      • prelaunch.out
      • directory.info
    • container_id: YARN container ID. Example: container_1638976919626_0002_01_000001.

    • application_id: YARN application ID. Example: application_1638976919626_0002.

  • Log entries of the launching process output. They are saved if the job has been started via the Yandex Data Processing API rather than on cluster hosts.

    For the entry type, specify job_output.

    The entries contain the job_id tag with the ID of the job created via the Yandex Data Processing API. If the job actually started, that is, it did not end at the validation stage, the entries also contain the application_id tag.
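
The container_id and application_id examples above share the same timestamp and application number, which follows from the way YARN composes container IDs. The Python sketch below derives the application ID from a container ID, which can be handy when correlating containers entries with job_output entries. The function name is hypothetical, and the pattern assumes the usual YARN form `container_[e<epoch>_]<timestamp>_<app>_<attempt>_<container>`.

```python
import re

# Assumed YARN container ID layout, with an optional epoch segment (e.g. e17).
_CONTAINER_ID = re.compile(r"^container_(?:e\d+_)?(\d+)_(\d+)_\d+_\d+$")


def application_id_from_container_id(container_id):
    """Derive the YARN application ID that a container belongs to."""
    match = _CONTAINER_ID.match(container_id)
    if match is None:
        raise ValueError(f"Unexpected container ID format: {container_id!r}")
    cluster_timestamp, app_number = match.groups()
    return f"application_{cluster_timestamp}_{app_number}"


# Uses the example container ID from the list above.
print(application_id_from_container_id("container_1638976919626_0002_01_000001"))
```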
