Alerts and notifications
To notify you about changes in performance, Uptrace creates alerts and sends you notifications via email, Slack, or PagerDuty.
Uptrace supports the following types of alerts:
- error:new alert when a new error group is created.
- error:recurring alert when an error group reaches 10k/100k/1m occurrences.
- anomaly:span-count alert when there is an anomaly in the number of spans as reported by the
- anomaly:span-errors alert when there is an anomaly in the error rate as reported by the
  span.error_pct is calculated as span.error_count / span.count.
- anomaly:span-duration alert when there is an anomaly in the median span duration as reported by the
- anomaly:metric alert whenever a monitor detects an anomaly in the metric data.
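As a rough illustration of the span.error_pct formula and the 10k/100k/1m occurrence thresholds from the list above, here is a minimal Python sketch. The function names, the handling of a zero span count, and the "largest threshold crossed" logic are assumptions made for illustration; they are not taken from Uptrace's implementation.

```python
from typing import Optional

# Occurrence counts at which error:recurring is described as firing (10k/100k/1m).
RECURRING_THRESHOLDS = (10_000, 100_000, 1_000_000)


def error_pct(error_count: int, span_count: int) -> float:
    """Return span.error_count / span.count as a fraction.

    Returning 0.0 when there are no spans is an assumption of this sketch.
    """
    if span_count == 0:
        return 0.0
    return error_count / span_count


def crossed_threshold(previous: int, current: int) -> Optional[int]:
    """Return the largest threshold crossed when an error group's
    occurrence count grows from `previous` to `current`, or None."""
    crossed = [t for t in RECURRING_THRESHOLDS if previous < t <= current]
    return max(crossed) if crossed else None
```

For example, `error_pct(5, 100)` gives `0.05`, and an error group whose count moves from 9,999 to 10,000 crosses the first threshold.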
For each created alert, Uptrace assigns a severity: minor, major, or critical. If the required conditions are met, the alert severity can be raised to a higher level, for example, from minor to major.
Based on the alert severity, you can send notifications via email, Slack, and PagerDuty, for example:
- Uptrace creates an alert with minor severity and sends a notification via email.
- After some time, if the alert is not resolved, Uptrace raises the alert severity to major and sends a notification via Slack.
- Lastly, Uptrace raises the severity to critical and creates an incident in PagerDuty.
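The escalation example above can be sketched as a small severity-to-channel mapping. This is only an illustrative model: the `Severity` enum, the channel names, and the `escalate` helper are hypothetical and do not correspond to an Uptrace API or configuration format.

```python
from enum import IntEnum


class Severity(IntEnum):
    MINOR = 1
    MAJOR = 2
    CRITICAL = 3


# Assumed routing that mirrors the example:
# minor -> email, major -> Slack, critical -> PagerDuty.
CHANNEL_BY_SEVERITY = {
    Severity.MINOR: "email",
    Severity.MAJOR: "slack",
    Severity.CRITICAL: "pagerduty",
}


def escalate(severity: Severity) -> Severity:
    """Raise the severity by one level, staying at CRITICAL once reached."""
    return Severity(min(severity + 1, Severity.CRITICAL))


def channel_for(severity: Severity) -> str:
    """Look up the notification channel for a given severity."""
    return CHANNEL_BY_SEVERITY[severity]
```

Starting from `Severity.MINOR`, calling `escalate` twice walks through the same email, Slack, PagerDuty sequence described in the list.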
On paid accounts, Uptrace uses an anomaly detector to automatically detect anomalies in span groups. You can also create and monitor metrics using metric monitors.
In both cases, you can configure the anomaly detector to work in automatic mode, or you can manually set fixed bounds, for example, to create an alert when the value is smaller than X or larger than Y.
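A fixed-bounds check of this kind can be sketched in a few lines of Python. The function name and the choice to make either bound optional are assumptions for illustration, not a description of how Uptrace evaluates monitors.

```python
from typing import Optional


def out_of_bounds(
    value: float,
    lower: Optional[float] = None,
    upper: Optional[float] = None,
) -> bool:
    """Return True when `value` is smaller than `lower` (X)
    or larger than `upper` (Y). A bound set to None is ignored."""
    if lower is not None and value < lower:
        return True
    if upper is not None and value > upper:
        return True
    return False
```

With `lower=10` and `upper=100`, a value of 5 or 250 would trigger an alert, while 50 would not.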
In automatic mode, the anomaly detector analyzes existing data to automatically calculate upper and lower bounds. In this mode, the detector supports 3 tolerance levels: low, medium, and high. The low tolerance level is less tolerant and creates more alerts. The high tolerance level is more tolerant and creates fewer alerts.
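To make the effect of tolerance levels concrete, the sketch below derives bounds from historical data as the mean plus or minus k standard deviations. The text does not say how the detector actually computes its bounds, so this formula is a stand-in technique, and the multipliers in `K_BY_TOLERANCE` are invented for the example. It only demonstrates the stated relationship: lower tolerance gives a narrower band and therefore more alerts.

```python
from statistics import mean, pstdev
from typing import Sequence, Tuple

# Assumed multipliers, not Uptrace values: smaller k means a narrower band.
K_BY_TOLERANCE = {"low": 2.0, "medium": 3.0, "high": 4.0}


def auto_bounds(history: Sequence[float], tolerance: str = "medium") -> Tuple[float, float]:
    """Return (lower, upper) bounds as mean -/+ k * population stddev."""
    k = K_BY_TOLERANCE[tolerance]
    center = mean(history)
    spread = pstdev(history)
    return center - k * spread, center + k * spread


history = [10, 12, 11, 9, 13, 10, 11]
low_lower, low_upper = auto_bounds(history, "low")
high_lower, high_upper = auto_bounds(history, "high")
```

With this sample history, a new value of 15 falls outside the low-tolerance band but inside the high-tolerance band, so only the low-tolerance setting would raise an alert.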