© 2025 Direct Cursus Technology L.L.C.
Yandex Data Streams

In this article:

  • Benefits
  • Reliability
  • Multiple storage systems
  • Masking data and processing logs
  • Setup

Smart log processing

Written by
Yandex Cloud
Updated on August 15, 2025

Apps generate logs to enable diagnostics. However, these logs alone are not enough for analysis: you need to be able to store and process them in a convenient way. This is why logs go to storage systems, such as Hadoop, ClickHouse®, or specialized cloud systems like Cloud Logging.

Applications do not usually write logs to storage systems directly. Instead, they send them to intermediate log aggregators. These aggregators can capture logs from stdout/stderr, read log files from disk, receive them via syslog or over HTTP, or collect them in many other ways.

After receiving logs, aggregators buffer them and then send them to different targets via plugins. This approach enables app developers to stay focused on coding while delegating log delivery to dedicated systems.
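
The buffer-then-deliver behavior described above can be illustrated with a minimal sketch. It is not how any particular aggregator is implemented: the `BufferingAggregator` class, the `Sink` plugin interface, and the batch size are hypothetical names chosen for this example.

```python
from typing import Callable, List

# Hypothetical plugin interface: a sink is any callable that accepts a batch of log lines.
Sink = Callable[[List[str]], None]


class BufferingAggregator:
    """Collects log lines in memory and flushes them in batches to every sink."""

    def __init__(self, sinks: List[Sink], batch_size: int = 3) -> None:
        self.sinks = sinks
        self.batch_size = batch_size
        self.buffer: List[str] = []

    def collect(self, line: str) -> None:
        """Buffer one log line and flush once the batch is full."""
        self.buffer.append(line)
        if len(self.buffer) >= self.batch_size:
            self.flush()

    def flush(self) -> None:
        """Deliver the buffered lines to all sinks and empty the buffer."""
        if not self.buffer:
            return
        batch, self.buffer = self.buffer, []
        for sink in self.sinks:
            sink(list(batch))  # each sink gets its own copy of the batch


# Two in-memory lists stand in for real delivery targets.
received_a: List[str] = []
received_b: List[str] = []
aggregator = BufferingAggregator([received_a.extend, received_b.extend], batch_size=2)
for line in ["app started", "user logged in", "request failed"]:
    aggregator.collect(line)
aggregator.flush()  # deliver the remaining buffered line
```

The application only calls `collect`; batching and delivery to each target stay inside the aggregator, which is the separation of concerns the paragraph above describes.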

Commonly used log delivery systems include Fluentd, Fluent Bit, and Logstash.

Aggregators can write data directly to storage systems. However, for greater reliability, data first goes to an intermediate buffer (a data streaming bus or message broker), such as Yandex Data Streams, and only from there to storage systems.

Logs often contain too much data or restricted information. You can drop irrelevant data and mask confidential information by adding more processing steps, e.g., in Cloud Functions.

Benefits

Reliability

To increase reliability, you only need to configure the log aggregator to deliver data to the bus as quickly as possible. The bus then stores the data reliably until it is processed and written to storage systems.
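
In practice the aggregator's output plugin handles delivery to the bus, but the idea can be sketched in a few lines. The example below assumes a client object that exposes the Kinesis Data Streams `put_records` call (for instance, one created with an AWS SDK and pointed at the Data Streams endpoint); the helper names `to_records` and `put_with_retry`, the `host` partition key field, and the retry settings are hypothetical.

```python
import json
import time
from typing import Any, Dict, List


def to_records(events: List[Dict[str, Any]], key_field: str = "host") -> List[Dict[str, Any]]:
    """Convert log events into the record shape used by the Kinesis PutRecords call."""
    return [
        {
            "Data": json.dumps(event).encode("utf-8"),
            "PartitionKey": str(event.get(key_field, "default")),
        }
        for event in events
    ]


def put_with_retry(client: Any, stream_name: str, records: List[Dict[str, Any]],
                   max_attempts: int = 3, backoff_seconds: float = 0.0) -> List[Dict[str, Any]]:
    """Send records, resending only those the service reports as failed.

    Returns the records that are still undelivered after max_attempts.
    """
    pending = records
    for attempt in range(max_attempts):
        if not pending:
            break
        response = client.put_records(StreamName=stream_name, Records=pending)
        # Response entries come back in request order; failed ones carry an ErrorCode.
        pending = [
            record
            for record, result in zip(pending, response["Records"])
            if "ErrorCode" in result
        ]
        if pending:
            time.sleep(backoff_seconds * (2 ** attempt))  # simple exponential backoff
    return pending
```

The stream name and endpoint depend on your deployment and are not shown here. Anything the function returns is still undelivered, so the caller should keep it in its own buffer rather than discard it.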

Multiple storage systems

The same logs are often stored in multiple storage systems at once: ClickHouse® handles rapid analytics, while Object Storage handles long-term storage. To implement this without a data bus, you would set up your aggregators to send two data streams: one to ClickHouse® and the other to Object Storage.

Using a data bus makes this easier: you only need to send each log to the data bus once and then run two data transfer processes from it within Yandex Cloud. This solution also lets you add a third storage system, such as Greenplum® or Elasticsearch, at any time.
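
The "write once, read many" idea can be modeled with a toy in-memory bus. This is only an illustration of the pattern, not of how Data Streams or Data Transfer work internally; the `InMemoryBus` class and the consumer names are hypothetical.

```python
from typing import Dict, List


class InMemoryBus:
    """Toy model of a data bus: an append-only log plus one read offset per consumer."""

    def __init__(self) -> None:
        self.log: List[str] = []
        self.offsets: Dict[str, int] = {}

    def publish(self, message: str) -> None:
        """The producer writes each message exactly once."""
        self.log.append(message)

    def subscribe(self, consumer: str) -> None:
        """A new consumer starts from the oldest message still on the bus."""
        self.offsets.setdefault(consumer, 0)

    def poll(self, consumer: str) -> List[str]:
        """Return the messages this consumer has not read yet and advance its offset."""
        start = self.offsets[consumer]
        self.offsets[consumer] = len(self.log)
        return self.log[start:]


bus = InMemoryBus()
bus.subscribe("clickhouse")
bus.subscribe("object-storage")
bus.publish("GET /index 200")
bus.publish("GET /missing 404")
to_clickhouse = bus.poll("clickhouse")
to_object_storage = bus.poll("object-storage")

# A third target is added later without any change on the producer side.
bus.subscribe("third-storage")
to_third = bus.poll("third-storage")
```

Each consumer tracks its own position, so a slow or newly added target does not affect the others. A real bus keeps data only for a limited retention period, which this sketch does not model.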

The multiple storage system approach is very convenient for ensuring compliance with FZ-152, PCI DSS, and other standards that require log retention for at least one year. In this case, the last month's logs go to a quick access storage system, while the rest of the data can be sent to long-term "cold" storage in Object Storage.

Masking data and processing logs

Not all employees should have the same level of access to logs. For example, certain logs may include personal user information that requires restricted access.

You can send logs to Cloud Functions for masking or any additional data processing as needed.

Once processed, the logs can be sent to multiple target systems at once: all employees can be granted access to the logs with masked personal data, while access to the full logs is limited to administrators.
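
As a rough illustration of the masking step, the sketch below replaces email addresses and card-like digit sequences and produces two outputs: a masked copy and the full original. The regular expressions, placeholder strings, and function names are assumptions made for this example. How the function receives records from Data Transfer is not covered here, so only the transformation itself is shown.

```python
import re
from typing import Iterable, List, Tuple

# Illustrative patterns only; real masking rules depend on your log format.
EMAIL_RE = re.compile(r"[A-Za-z0-9._%+-]+@[A-Za-z0-9.-]+\.[A-Za-z]{2,}")
CARD_RE = re.compile(r"\b(?:\d[ -]?){12,15}\d\b")  # 13 to 16 digits, optional separators


def mask_line(line: str) -> str:
    """Replace email addresses and card-like numbers with placeholders."""
    line = EMAIL_RE.sub("<email>", line)
    return CARD_RE.sub("<card>", line)


def split_streams(lines: Iterable[str]) -> Tuple[List[str], List[str]]:
    """Return (masked, full): masked logs for everyone, full logs for administrators."""
    full = list(lines)
    masked = [mask_line(line) for line in full]
    return masked, full


masked, full = split_streams([
    "login user=alice@example.com card 4111 1111 1111 1111",
    "GET /health 200",
])
```

The masked list would go to the target that all employees can read, and the full list to the target restricted to administrators.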

Setup

To configure smart log processing:

  1. Create a data stream in Data Streams.

  2. Set up a log aggregator: Fluentd, Logstash, or any other aggregator that supports the Kinesis Data Streams API.

  3. Configure Yandex Data Transfer to transfer data to the selected storage system.

    For an example of setting up data delivery from Data Streams, see the tutorial on how to save data to ClickHouse®.

  4. Connect any data processing function to Yandex Data Transfer. This GitHub example illustrates the function code.

ClickHouse® is a registered trademark of ClickHouse, Inc.
