Best practices for managing alerts
Configure Alerts with Min and Max Conditions
When creating alerts in Kloudfuse, use the Min and Max conditions carefully. Improper setup may cause alerts to miss critical events or trigger inconsistently.
Steps
- Min condition
- Max condition
Use the Min condition when you want to trigger an alert if any data point in the time range crosses a threshold.
- In the Set Condition section:
  - Select Min as the Trigger when option.
  - Choose the query (for example, Query (a)).
  - Set the comparison rule, such as is above threshold over 5 minutes.
  - Enter the Alert threshold value.
- Expand Configure Warning and Recovery Thresholds if needed.
- Under No data and error handling, select:
  - Evaluate as Zero → if data is missing.
  - Keep Last State → if there is an execution error or timeout.
Use the Max condition when you want to trigger an alert if the highest value in the time range drops below a threshold.
- In the Set Condition section:
  - Select Max as the Trigger when option.
  - Choose the query (for example, Query (a)).
  - Set the comparison rule, such as is below threshold over 10 minutes.
  - Enter the Alert threshold value.
- Expand Configure Warning and Recovery Thresholds if needed.
- Under No data and error handling, select options based on your monitoring needs.
Notes
- Use Min when working with status filters in metrics, and always enable Evaluate as Zero during alert creation. After setup, reset the alert by updating the time range or threshold and disabling Evaluate as Zero if necessary.
- Use Max when working with thresholds that must not drop below a certain level (for example, available nodes).
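To make the difference between Min and Max concrete, here is a minimal Python sketch of how such a condition could be evaluated. It is an illustration only, not Kloudfuse code: the function `evaluate_condition` and all of its parameters are hypothetical, and it assumes the condition reduces the data points in the time range to a single minimum or maximum value before comparing that value with the threshold.

```python
from typing import Sequence


def evaluate_condition(
    points: Sequence[float],
    reducer: str,
    comparison: str,
    threshold: float,
    *,
    evaluate_no_data_as_zero: bool = True,
    query_failed: bool = False,
    last_state: bool = False,
) -> bool:
    """Return True if the (hypothetical) alert should be in the firing state."""
    if query_failed:
        # "Keep Last State": an execution error or timeout changes nothing.
        return last_state
    if not points:
        if not evaluate_no_data_as_zero:
            # Assumption for this sketch: without "Evaluate as Zero",
            # an empty window also keeps the previous state.
            return last_state
        points = [0.0]  # "Evaluate as Zero": treat missing data as 0.

    # Reduce the whole time range to one value.
    if reducer == "min":
        value = min(points)
    elif reducer == "max":
        value = max(points)
    else:
        raise ValueError(f"unknown reducer: {reducer!r}")

    # Compare that single value with the threshold.
    if comparison == "above":
        return value > threshold
    if comparison == "below":
        return value < threshold
    raise ValueError(f"unknown comparison: {comparison!r}")


# Max "is below" 4: fires only when even the highest value is under 4.
print(evaluate_condition([5, 3, 2], "max", "below", 4))  # False
print(evaluate_condition([3, 2, 3], "max", "below", 4))  # True
# Min "is above" 6: fires only when even the lowest value is over 6.
print(evaluate_condition([7, 5, 8], "min", "above", 6))  # False
# No data, evaluated as zero: 0 is below 4, so the alert fires.
print(evaluate_condition([], "max", "below", 4))  # True
```

Under this reading, Max with "is below" suits metrics such as available nodes, because a single healthy data point in the range keeps the alert quiet. Min with "is below" fires as soon as any one data point in the range dips under the threshold.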
Ensure Alerts are meaningful and actionable
Set up alerts that are meaningful and actionable.
Consider the impact of having an alert configuration in your system with the following characteristics:
- Alerts are being constantly triggered.
- Alerts resolve themselves before someone can take action.
- Alerts don’t provide information on what the problem is.
- Alerts are a result of a problem that cannot be addressed by the team.
These situations inevitably result in your team ignoring all alerts, including the important ones. Therefore, exercise good judgement when designing and implementing your alerting framework.
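One way to spot the first two characteristics is to review past alert activity. The sketch below is a rough heuristic in Python, not a Kloudfuse feature: the `AlertEpisode` record, the `find_noisy_alerts` function, and the default limits (more than 5 firings, or most episodes shorter than 2 minutes) are all assumptions chosen for illustration, and the alert history is assumed to have been exported by some other means.

```python
from collections import defaultdict
from dataclasses import dataclass
from datetime import datetime, timedelta


@dataclass
class AlertEpisode:
    """One firing of an alert, from trigger to resolution (hypothetical record)."""
    name: str
    fired_at: datetime
    resolved_at: datetime


def find_noisy_alerts(episodes, max_fires=5, min_duration=timedelta(minutes=2)):
    """Map alert name to the reasons it looks noisy; quiet alerts are omitted."""
    durations_by_name = defaultdict(list)
    for episode in episodes:
        durations_by_name[episode.name].append(episode.resolved_at - episode.fired_at)

    report = {}
    for name, durations in durations_by_name.items():
        reasons = []
        if len(durations) > max_fires:
            reasons.append("fires too often")
        # Count episodes that resolved faster than a person could react.
        short = sum(1 for d in durations if d < min_duration)
        if short * 2 > len(durations):
            reasons.append("usually self-resolves before anyone can act")
        if reasons:
            report[name] = reasons
    return report


start = datetime(2024, 1, 1)
history = [
    # Six 30-second episodes, one every 10 minutes.
    AlertEpisode("flappy", start + timedelta(minutes=10 * i),
                 start + timedelta(minutes=10 * i, seconds=30))
    for i in range(6)
]
# One 30-minute episode.
history.append(AlertEpisode("steady", start, start + timedelta(minutes=30)))

print(find_noisy_alerts(history))
```

Alerts flagged this way are candidates for a longer evaluation window, a different threshold, or removal.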
Add links to the Alert
You can add links to an Alert to associate it with documentation that helps to troubleshoot the issue. Kloudfuse provides several options:
- Link to a Runbook
  Create and maintain instructions on how to proceed when the alert is triggered.
- Link a Dashboard
  To help with troubleshooting, associate a Dashboard with the Alert.
- Link a Panel
  Link the alert to a specific panel in the Dashboard. This is especially useful when working with large dashboards.