Setting up alerts
Alerting monitors your metrics with periodic queries, known as alerts, and notifies you when a metric reaches a critical level. When a metric reaches its threshold value, the system sends a notification through the specified channel, e.g., by email or messenger.
To configure alerting:
- Create a notification channel.
- Select metrics and labels for monitoring.
- Create an alert.
Let's consider an example of creating an alert that notifies you when a service becomes unavailable.
The alert will trigger when the number of failed requests reaches 50% or more of the total requests. Such an alert helps detect DDoS attacks or infrastructure failures.
Let's use letters to indicate the number of incoming requests per second:
- A: Total requests.
- B: Failed requests.
Let's set up our alerts:
- Breaches 30% of A: Warning.
- Breaches 50% of A: Alarm (critical level).
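The two thresholds above can be sketched in Python as follows. This only illustrates the arithmetic and is not how Monitoring evaluates alerts internally. The function name classify and the status strings are hypothetical, and treating a period with no traffic as OK is an assumption.

```python
WARNING_PERCENT = 30.0  # B reaches 30% of A
ALARM_PERCENT = 50.0    # B reaches 50% of A (critical level)


def classify(total_rps: float, failed_rps: float) -> str:
    """Return 'OK', 'WARNING', or 'ALARM' for one pair of A and B values."""
    if total_rps <= 0:
        # Assumption: with no incoming requests there is nothing to alert on.
        return "OK"
    failed_percent = 100 * failed_rps / total_rps
    if failed_percent >= ALARM_PERCENT:
        return "ALARM"
    if failed_percent >= WARNING_PERCENT:
        return "WARNING"
    return "OK"


print(classify(200, 40))   # 20% of requests failed
print(classify(200, 70))   # 35% of requests failed
print(classify(200, 120))  # 60% of requests failed
```

The alarm threshold is checked first so that a value above 50% is never reported as a mere warning.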
Creating a notification channel
- In the management console, select the folder on the left.
- Select Monitoring.
- Go to the Notification channels section.
- Click Create channel.
- Specify the channel name, notification method, and recipients.
Note
To get notifications, the user must:
- Have the monitoring.viewer role for the folder with the configured alert.
- In the management console settings:
  - Enable Monitoring.
  - Specify an email address, phone number, and Telegram account or group.
- Click Create.
Selecting metrics for monitoring
- We will get the A value using the load_balancer.requests_count_per_second metric.
- We will get the B value using the load_balancer.requests_count_per_second metric filtered by the code=503 label.
- We will calculate the B to A ratio (in percent) using the 100 * B / A formula and save it as C.
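The calculation of C from the two metrics can be sketched on plain Python lists. This is a minimal illustration, not Monitoring's query engine: the sample values are invented, the helper names are hypothetical, and the handling of points where A is 0 is an assumption (the source does not say what the service does in that case).

```python
import math


def replace_nan(series, value=0.0):
    """Replace missing (NaN) points with a fixed value, like replace_nan(0)."""
    return [value if math.isnan(point) else point for point in series]


def failed_percent(total, failed):
    """Compute C = 100 * B / A point by point."""
    result = []
    for a, b in zip(total, failed):
        # Assumption: a point with no traffic (A = 0) counts as 0% failed.
        result.append(100 * b / a if a else 0.0)
    return result


nan = float("nan")
A = replace_nan([120.0, 100.0, nan, 80.0])  # total requests per second
B = replace_nan([6.0, nan, nan, 48.0])      # requests with code=503
C = failed_percent(A, B)
print(C)
```

Filling the gaps before dividing keeps every point of C defined, which is the same reason the steps below apply replace_nan(0) to both queries.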
Creating an alert
- On the Monitoring home page, click Create alert.
- Name your alert, e.g., unavailable_service.
- Under Alerts config, describe your query to get the A value:
  - Specify the data to collect:
    - service=Application Load Balancer
    - name=load_balancer.requests_count_per_second
    - load_balancer=<load_balancer_name>
  - Set the replace_nan(0) function to replace missing data with 0, ensuring a continuous chart.
- Click Add query.
- Describe your query to get the B value:
  - Specify the data to collect:
    - service=Application Load Balancer
    - name=load_balancer.requests_count_per_second
    - code=503
    - load_balancer=<load_balancer_name>
  - Set the replace_nan(0) function.
- Click Add query.
- Describe your query for C to get the B to A ratio in percent:
  - Switch to text mode to edit the query.
  - Enter 100 * B / A in the query string.
- Under Alert condition, specify:
  - Query to evaluate: C
  - Aggregation function: All values
  - Warning: 30
  - Alarm: 50 (critical level)
  - Evaluation window: 30 seconds
  - Evaluation delay: 15 seconds
- Leave the default values under No data policy.
- Optionally, under Annotations, add the information to log when the alert triggers.
- Under Notifications, add the notification channel.
- Click Create.