Best practices for managing alerts

Follow these recommendations when designing alerts in Kloudfuse to ensure they fire reliably, remain actionable, and provide useful context for troubleshooting.

Configure Alerts with Min and Max Conditions

When creating alerts in Kloudfuse, use the Min, Max, and Last conditions carefully. Improper setup may cause alerts to miss critical events or trigger inconsistently.

Steps

Use Min when the alert should fire only if the minimum of all data points in the evaluation window is above the threshold. Pair Min with the "is above" (>) comparison: if the minimum exceeds the threshold, then every point in the window does too, so an intermittent breach alone does not trigger the alert.

  1. In the Set Condition section:

    • Select Min as the Trigger when option (use this for is above / > semantics).

    • Choose the query (for example, Query (a)).

    • Set the comparison rule, such as is above threshold over 5 minutes.

  2. Enter the Alert threshold value.

  3. Expand Configure Warning and Recovery Thresholds if needed.

  4. Under No data and error handling, select:

    • Evaluate as Zero → if data is missing.

    • Keep Last State → if there is an execution error or timeout.

      Example Min condition alert
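The effect of the no-data and error-handling choices in step 4 can be pictured with a short Python sketch. It only illustrates the semantics described above and is not Kloudfuse's implementation; the function name, parameters, and sample values are hypothetical.

```python
from typing import Optional, Sequence


def evaluate_min_above(
    samples: Optional[Sequence[float]],
    threshold: float,
    last_state: bool,
    query_failed: bool = False,
) -> bool:
    """Return True if the alert should be firing after this evaluation.

    Hypothetical helper for illustration only; not a Kloudfuse API.
    """
    if query_failed:
        # "Keep Last State": an execution error or timeout leaves the
        # alert in whatever state it was in before this evaluation.
        return last_state
    if not samples:
        # "Evaluate as Zero": an empty window is treated as a single 0 value.
        samples = [0.0]
    # Min with "is above": fire only if every point exceeds the threshold.
    return min(samples) > threshold


# A failed query keeps a firing alert firing.
still_firing = evaluate_min_above(None, 80.0, last_state=True, query_failed=True)
# Missing data is evaluated as 0, which is not above 80, so the alert clears.
cleared_on_no_data = evaluate_min_above([], 80.0, last_state=True)
```

Note that with an "is above" rule, evaluating missing data as zero clears the alert, while with an "is below" rule the same setting would trigger it.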

Notes

  • Min: Use when you need is above behavior (minimum > threshold). For example, get alerted when CPU usage stays above 80% for 5 minutes, but not for a brief spike.

  • Max: Use when you need is below behavior (maximum < threshold). For example, get alerted when total requests fall below a threshold, which can happen after a new deployment.

  • Last: Use for point-in-time checks or spike detection when only the most recent sample matters. For example, get alerted immediately when a pod restarts, rather than waiting to see whether it recovers on its own.

  • After initial setup, test the alert by adjusting the time range or threshold. If you temporarily used Evaluate as Zero to tune behavior, disable it once the alert is finalized.
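The difference between Min, Max, and Last can be sketched in Python as follows. This is a simplified model of the conditions described in the notes above, not Kloudfuse's evaluation engine; the function names and sample windows are made up for illustration.

```python
from typing import Sequence


def min_is_above(window: Sequence[float], threshold: float) -> bool:
    # Min + "is above": true only if every point in the window is above the threshold.
    return min(window) > threshold


def max_is_below(window: Sequence[float], threshold: float) -> bool:
    # Max + "is below": true only if every point in the window is below the threshold.
    return max(window) < threshold


def last_is_above(window: Sequence[float], threshold: float) -> bool:
    # Last: only the most recent sample is compared with the threshold.
    return window[-1] > threshold


# Hypothetical CPU samples (percent) over a 5-minute window.
cpu_spike = [40.0, 95.0, 42.0, 41.0, 43.0]
cpu_sustained = [85.0, 91.0, 88.0, 86.0, 90.0]

spike_fires = min_is_above(cpu_spike, 80.0)          # False: one spike is not enough
sustained_fires = min_is_above(cpu_sustained, 80.0)  # True: all points exceed 80

# Hypothetical request counts after a deployment, with a threshold of 20.
low_traffic_fires = max_is_below([12.0, 9.0, 15.0], 20.0)  # True: all points below 20

# Hypothetical pod restart counts; the newest sample alone triggers the alert.
restart_fires = last_is_above([0.0, 0.0, 1.0], 0.0)  # True
```

In this model, Min and Max trade responsiveness for stability because the whole window must breach, while Last reacts to a single sample.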

Ensure Alerts Are Meaningful and Actionable

Set up alerts that are meaningful and actionable.

Consider the impact of having an alert configuration in your system with the following characteristics:

  1. Alerts are being constantly triggered.

  2. Alerts resolve themselves before someone can take action.

  3. Alerts don’t provide information on what the problem is.

  4. Alerts are a result of a problem that cannot be addressed by the team.

These situations inevitably lead your team to ignore all alerts. Therefore, exercise good judgment when designing and implementing your alerting framework.

You can add links to an Alert to associate it with documentation that helps to troubleshoot the issue. Kloudfuse provides several options:

Link to a Runbook

Create and maintain instructions on how to proceed when the alert is triggered.

Link a Dashboard

To help with troubleshooting, associate a Dashboard with the Alert.

Link a Panel

Link the alert to a specific panel in the Dashboard. This is especially useful when working with large dashboards.