Replicating logs to Object Storage using Fluent Bit
Data aggregators enable you to transmit data, such as logs, from VM instances to log monitoring and data storage services.
In this tutorial, you will learn how to automatically replicate VM logs to an Object Storage bucket using Fluent Bit.
The solution described below works as follows:
- Fluent Bit runs on an active VM as a systemd service.
- Fluent Bit collects logs according to its configuration settings and sends them to a Data Streams stream via the Amazon Kinesis Data Streams protocol.
- In the working folder, you set up a Data Transfer transfer that fetches data from the stream and saves it to an Object Storage bucket.
To set up log replication:
- Prepare your cloud.
- Configure the environment.
- Create an Object Storage bucket for storing your logs.
- Create a Data Streams data stream.
- Create a Data Transfer transfer.
- Install Fluent Bit.
- Connect Fluent Bit to a data stream.
- Test sending and receiving data.
If you no longer want to store logs, delete the resources allocated to them.
Prepare your cloud
Sign up for Yandex Cloud and create a billing account:
- Go to the management console and log in to Yandex Cloud or create an account if you do not have one yet.
- On the Yandex Cloud Billing page, make sure you have a billing account linked and it has the `ACTIVE` or `TRIAL_ACTIVE` status. If you do not have a billing account, create one.
If you have an active billing account, you can go to the cloud page.
Learn more about clouds and folders.
Required paid resources
The cost of supporting this log storage solution includes:
- Data stream maintenance fees (see Yandex Data Streams pricing).
- Fees for transmitting data between sources and targets (see Yandex Data Transfer pricing).
- Data storage fees (see Yandex Object Storage pricing).
Configure the environment
- Create a service account, e.g., `logs-sa`, with the `editor` role assigned for the folder.
- Create a static access key for the service account. Save the key ID and the secret key: you will need them to configure the AWS CLI.
- Create a VM from a public Ubuntu 20.04 image. Under Access, specify the service account that you created in the previous step.
- Connect to the VM via SSH.
- Install the AWS CLI utility on the VM.
- Run this command:

  ```bash
  aws configure
  ```

- Enter the following one by one:
  - `AWS Access Key ID [None]:` the service account key ID.
  - `AWS Secret Access Key [None]:` the secret access key of the service account.
  - `Default region name [None]:` the `ru-central1` region.
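The interactive `aws configure` session above produces two small INI files. The sketch below writes equivalent files by hand so you can see exactly what is stored; it uses a scratch directory for illustration (the real files live under `~/.aws`), and the key values are placeholders you must replace with your service account's key:

```shell
# Sketch: the files `aws configure` creates, written to a scratch
# directory for illustration. Real files belong under ~/.aws.
AWS_DIR="$(mktemp -d)/.aws"
mkdir -p "${AWS_DIR}"

# Placeholder key values -- substitute your service account's static key.
cat > "${AWS_DIR}/credentials" <<'EOF'
[default]
aws_access_key_id = <service_account_key_ID>
aws_secret_access_key = <service_account_secret_key>
EOF

cat > "${AWS_DIR}/config" <<'EOF'
[default]
region = ru-central1
EOF

cat "${AWS_DIR}/credentials" "${AWS_DIR}/config"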
Create a bucket
- In the management console, select the folder where you want to create a bucket.
- Select Object Storage.
- Click Create bucket.
- Enter the name of the bucket.
- In the Storage class field, select `Cold`.
- Click Create bucket.
Create a data stream
- In the management console, select the folder to create a data stream in.
- Select Data Streams.
- Click Create stream.
- Specify an existing serverless YDB database or create a new one. If you have created a new database, refresh the database list.
- Enter the data stream name: `logs-stream`.
- Click Create.

Wait for the stream to start. Once the stream is ready for use, its status will change from Creating to Active.
Create a transfer
- In the management console, select the folder where you want to create a transfer.
- Select Data Transfer.
- Create a source endpoint:
  - In the Endpoints tab, click Create endpoint.
  - In the Direction field, select `Source`.
  - Enter the endpoint name, for example, `logs-source`.
  - In the Database type list, select `Yandex Data Streams`.
  - Select the database you specified in the settings of the stream you created earlier.
  - Enter the stream name: `logs-stream`.
  - Select the `logs-sa` service account you created earlier.
  - Under Advanced settings, specify the conversion rules for the `CloudLogging parser` data.
  - Click Create.
- Create a target endpoint:
  - In the Endpoints tab, click Create endpoint.
  - In the Direction field, select `Target`.
  - Enter the endpoint name, for example, `logs-receiver`.
  - In the Database type list, select `Object Storage`.
  - Enter the name of the previously created bucket.
  - Select the previously created `logs-sa` service account.
  - In the Serialization format field, select `JSON`.
  - Click Create.
- Create a transfer:
  - In the Transfers tab, click Create transfer.
  - Enter the transfer name, e.g., `logs-transfer`.
  - Select the previously created source endpoint, `logs-source`.
  - Select the previously created target endpoint, `logs-receiver`.
  - Click Create.
- Click the actions menu next to the created transfer and select Activate.

Wait until the transfer is activated. Once the transfer is ready for use, its status will change from Creating to Replicating.
Install Fluent Bit
Note
This tutorial uses the current Fluent Bit version 1.9.
- To install Fluent Bit on your VM, run this command:

  ```bash
  curl https://raw.githubusercontent.com/fluent/fluent-bit/master/install.sh | sh
  ```

  For more information on how to install Fluent Bit, see the official documentation.

- Start the `fluent-bit` service:

  ```bash
  sudo systemctl start fluent-bit
  ```

- Check the `fluent-bit` service status; it should be active:

  ```bash
  sudo systemctl status fluent-bit
  ```

  The result should include the `active (running)` status and logs of the built-in `cpu` plugin, which Fluent Bit starts collecting by default once the installation is complete:

  ```
  ● fluent-bit.service - Fluent Bit
       Loaded: loaded (/lib/systemd/system/fluent-bit.service; disabled; vendor preset: enabled)
       Active: active (running) since Thu 2022-09-08 10:23:03 UTC; 10s ago
         Docs: https://docs.fluentbit.io/manual/
     Main PID: 1328 (fluent-bit)
        Tasks: 4 (limit: 2310)
       Memory: 2.8M
       CGroup: /system.slice/fluent-bit.service
               └─1328 /opt/fluent-bit/bin/fluent-bit -c //etc/fluent-bit/fluent-bit.conf

  Sep 08 10:23:03 ycl-20 fluent-bit[1328]: [2022/09/08 10:23:03] [ info] [output:stdout:stdout.0] worker #0 started
  Sep 08 10:23:05 ycl-20 fluent-bit[1328]: [0] cpu.local: [1662632584.114661597, {"cpu_p"=>1.000000, "user_p"=>0.000000, >
  Sep 08 10:23:06 ycl-20 fluent-bit[1328]: [0] cpu.local: [1662632585.114797726, {"cpu_p"=>0.000000, "user_p"=>0.000000, >
  ...
  ```
Connect Fluent Bit to the data stream
Note
If you are running a Fluent Bit version below 1.9 that comes with the `td-agent-bit` package, edit the `/etc/td-agent-bit/td-agent-bit.conf` and `/lib/systemd/system/td-agent-bit.service` files instead, and restart the `td-agent-bit` service.
- Open the `/etc/fluent-bit/fluent-bit.conf` file:

  ```bash
  sudo vim /etc/fluent-bit/fluent-bit.conf
  ```
- Add an `OUTPUT` section with the `kinesis_streams` plugin settings:

  ```
  [OUTPUT]
      Name kinesis_streams
      Match *
      region ru-central-1
      stream /<region>/<folder_ID>/<database_ID>/<data_stream_ID>
      endpoint https://yds.serverless.yandexcloud.net
  ```

  Where:
  - `stream`: the Data Streams data stream path.

  For example, your stream path will appear as `/ru-central1/aoeu1kuk2dht********/cc8029jgtuab********/logs-stream` if:
  - `logs-stream`: stream name
  - `ru-central1`: region
  - `aoeu1kuk2dht********`: folder ID
  - `cc8029jgtuab********`: YDB database ID
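  As a quick sanity check, the stream path can be assembled from its parts; this sketch uses the example IDs from this tutorial, which you should replace with your own:

  ```shell
  # Assemble the stream path the kinesis_streams plugin expects.
  # All IDs below are the example values from this tutorial.
  REGION="ru-central1"
  FOLDER_ID="aoeu1kuk2dht********"
  DATABASE_ID="cc8029jgtuab********"
  STREAM_NAME="logs-stream"
  STREAM_PATH="/${REGION}/${FOLDER_ID}/${DATABASE_ID}/${STREAM_NAME}"
  echo "${STREAM_PATH}"
  # → /ru-central1/aoeu1kuk2dht********/cc8029jgtuab********/logs-stream
  ```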
  For more information about the `kinesis_streams` plugin settings, see the official documentation.

- Open the `/lib/systemd/system/fluent-bit.service` file:

  ```bash
  sudo vim /lib/systemd/system/fluent-bit.service
  ```
- To the `SERVICE` section, add the environment variables with the paths to the files containing the access keys:

  ```
  Environment=AWS_CONFIG_FILE=/home/<username>/.aws/config
  Environment=AWS_SHARED_CREDENTIALS_FILE=/home/<username>/.aws/credentials
  ```

  Where `<username>` is the username that you specified in the VM settings.
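  An alternative, not part of the original instructions, is a systemd drop-in override, which keeps your changes separate from the package-managed unit file and survives upgrades. A sketch following the standard systemd override convention:

  ```
  # /etc/systemd/system/fluent-bit.service.d/override.conf
  # Create it with: sudo systemctl edit fluent-bit
  # <username> is the username from the VM settings, as above.
  [Service]
  Environment=AWS_CONFIG_FILE=/home/<username>/.aws/config
  Environment=AWS_SHARED_CREDENTIALS_FILE=/home/<username>/.aws/credentials
  ```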
Restart the
fluent-bit
service:sudo systemctl daemon-reload sudo systemctl restart fluent-bit
- Check the status of the `fluent-bit` service. It should not include any error messages:

  ```bash
  sudo systemctl status fluent-bit
  ```

  Result:

  ```
  Sep 08 16:51:19 ycl-20 fluent-bit[3450]: Fluent Bit v1.9.8
  Sep 08 16:51:19 ycl-20 fluent-bit[3450]: * Copyright (C) 2015-2022 The Fluent Bit Authors
  Sep 08 16:51:19 ycl-20 fluent-bit[3450]: * Fluent Bit is a CNCF sub-project under the umbrella of Fluentd
  Sep 08 16:51:19 ycl-20 fluent-bit[3450]: * https://fluentbit.io
  Sep 08 16:51:19 ycl-20 fluent-bit[3450]: [2022/09/08 16:51:19] [ info] [fluent bit] version=1.9.8, commit=, pid=3450
  Sep 08 16:51:19 ycl-20 fluent-bit[3450]: [2022/09/08 16:51:19] [ info] [storage] version=1.2.0, type=memory-only, sync=normal, checksum=disabled, max_chunks_up=128
  Sep 08 16:51:19 ycl-20 fluent-bit[3450]: [2022/09/08 16:51:19] [ info] [cmetrics] version=0.3.6
  Sep 08 16:51:19 ycl-20 fluent-bit[3450]: [2022/09/08 16:51:19] [ info] [sp] stream processor started
  Sep 08 16:51:19 ycl-20 fluent-bit[3450]: [2022/09/08 16:51:19] [ info] [output:kinesis_streams:kinesis_streams.1] worker #0 started
  Sep 08 16:51:19 ycl-20 fluent-bit[3450]: [2022/09/08 16:51:19] [ info] [output:stdout:stdout.0] worker #0 started
  ```
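To confirm that the credentials and stream path work independently of Fluent Bit, you can optionally push a single test record to the stream with the AWS CLI, which Data Streams accepts over the same Kinesis-compatible endpoint. This sketch prints the command as a dry run (remove the leading `echo` to execute it); the stream path uses the example IDs from this tutorial:

```shell
# Dry run of sending one record straight to the stream, bypassing Fluent Bit.
# Remove the leading `echo` to actually send it.
STREAM_PATH="/ru-central1/aoeu1kuk2dht********/cc8029jgtuab********/logs-stream"
echo aws kinesis put-record \
    --endpoint-url https://yds.serverless.yandexcloud.net \
    --stream-name "${STREAM_PATH}" \
    --partition-key test-key \
    --data '{"message": "test record"}'
```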
Test sending and receiving data
- In the management console, navigate to the folder with the new data stream, transfer, and bucket.
- Select Data Streams.
- Select the `logs-stream` data stream.
- Go to the Monitoring tab and check the stream activity charts.
- Select Data Transfer.
- Select `logs-transfer`.
- Go to the Monitoring tab and check the transfer activity charts.
- Select Object Storage.
- Select the previously created bucket.
- Make sure that you have objects in the bucket. Download and review the log files you got.
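Instead of downloading the objects through the console, you can also fetch them with the AWS CLI, since Object Storage speaks the S3 API. This sketch prints the commands as a dry run (remove the leading `echo`s to execute them); the bucket name is a placeholder for the bucket you created earlier:

```shell
# Dry run of listing and downloading the replicated log objects.
# Remove the leading `echo`s to execute; BUCKET is a placeholder name.
BUCKET="logs-bucket"
ENDPOINT="https://storage.yandexcloud.net"
echo aws s3 ls "s3://${BUCKET}/" --endpoint-url "${ENDPOINT}"
echo aws s3 cp "s3://${BUCKET}/" ./logs --recursive --endpoint-url "${ENDPOINT}"
```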
How to delete the resources you created
Some resources are not free of charge. To avoid paying for them, delete the resources you no longer need: