diff --git a/assets/flaky-tests/recipes/broken-classification-quarantine.svg b/assets/flaky-tests/recipes/broken-classification-quarantine.svg
new file mode 100644
index 00000000..7fa0ffd9
--- /dev/null
+++ b/assets/flaky-tests/recipes/broken-classification-quarantine.svg
@@ -0,0 +1,67 @@
+ (SVG diagram; text labels only)
+ Title: Classifying a test as broken can un-quarantine it
+ Left panel (Flaky, auto-quarantined): Status: FLAKY; Auto-quarantined; CI passes (failure ignored)
+ Arrow: broken-type monitor fires
+ Right panel (Reclassified as broken): Status: BROKEN; Not a quarantine candidate; CI blocked (failure counts)
+ Caption: Broken tests are not quarantine candidates, so the test drops out of auto-quarantine and its failures block CI again. Manually quarantined tests are unaffected.
diff --git a/assets/flaky-tests/recipes/event-granularity-gap.svg b/assets/flaky-tests/recipes/event-granularity-gap.svg
new file mode 100644
index 00000000..06ec87c3
--- /dev/null
+++ b/assets/flaky-tests/recipes/event-granularity-gap.svg
@@ -0,0 +1,91 @@
+ (SVG diagram; text labels only)
+ Title: Two events, two granularities
+ | | Healthy (starting state) | Monitor A fires (HEALTHY → FLAKY) | Monitor B fires (2nd monitor, still FLAKY) |
+ |---|---|---|---|
+ | Test status | HEALTHY | FLAKY | FLAKY |
+ | v2.test_case.status_changed | · | event sent | no event |
+ | test_case.monitor_status_changed | · | event sent | event sent |
+ Caption: status_changed fires only when the overall status changes, so Monitor B sends nothing. monitor_status_changed fires on every monitor activation, so it catches both.
+ + diff --git a/docs.json b/docs.json index e16adc7d..6fe27a28 100644 --- a/docs.json +++ b/docs.json @@ -302,6 +302,12 @@ "flaky-tests/webhooks/jira-integration" ] }, + { + "group": "Recipes", + "pages": [ + "flaky-tests/recipes/alert-on-test-escalation" + ] + }, { "group": "Agents", "root": "flaky-tests/agents/index", diff --git a/flaky-tests/recipes/alert-on-test-escalation.mdx b/flaky-tests/recipes/alert-on-test-escalation.mdx new file mode 100644 index 00000000..f7e9f9b6 --- /dev/null +++ b/flaky-tests/recipes/alert-on-test-escalation.mdx @@ -0,0 +1,134 @@ +--- +title: "Alert When a Test Escalates" +description: "Send Slack alerts when a test gets worse, not just the first time it's flagged" +og:title: "Alerting on flaky test escalation with Trunk webhooks" +--- +A single "this test is now flaky" alert tells you a test crossed a threshold once. It says nothing about what happens next: the same test failing on more branches, tripping more monitors, or sliding from flaky into a consistently broken regression. For the tests that matter, you want to hear about the escalation, not just the first detection. + +This page wires that up with Trunk webhooks and a Slack transformation. It builds on the [Slack integration guide](../webhooks/slack-integration), so set that connection up first, then come back here to filter it down to escalations. + +## Pick the right event + +The one decision that matters is which event you subscribe to. Two events fire here, at two different granularities. 
+ +| Event | Fires when | Use it to | +|---|---|---| +| [`v2.test_case.status_changed`](../webhooks/index) | The test's **overall health status** transitions between `HEALTHY`, `FLAKY`, and `BROKEN` | Alert on health escalations like `FLAKY` → `BROKEN` | +| [`test_case.monitor_status_changed`](../webhooks/index) | **Any individual monitor** activates or resolves for the test | Alert every time a monitor flags the test, even if its overall status doesn't move | + +That distinction matters. `v2.test_case.status_changed` only fires when the test's combined status changes. If a test is already `FLAKY` and a second monitor starts flagging it, the overall status stays `FLAKY`, so nothing is sent. To catch a test that keeps getting flagged by more monitors over time (the "more than just the first detection" case), subscribe to `test_case.monitor_status_changed` instead. + + + A test goes HEALTHY to FLAKY when Monitor A fires, so both events send. When Monitor B fires while the test is already FLAKY, v2.test_case.status_changed sends nothing while test_case.monitor_status_changed still fires. + + + +Test status priority is **Broken > Flaky > Healthy**. A test flagged by both a broken-type and a flaky-type monitor shows as `BROKEN` until the broken monitor resolves. See [Flake Detection](../detection/) for how the combined status is calculated. + + +## Alert when a test becomes broken + +Use this when consistently failing tests deserve a louder, separate signal than routine flakiness. + +**1. Configure a broken-type monitor.** A test only reaches `BROKEN` status when a [failure rate](../detection/failure-rate-monitor) or [failure count](../detection/failure-count-monitor) monitor with its **Detection type** set to **Broken** is active for it. Set one up if you haven't already. A common pattern is to pair a broken-type monitor (catching consistently failing tests) with a flaky-type monitor (catching intermittent ones). + +**2. 
Filter the transformation to escalations.** In your Slack endpoint's transformation, cancel the webhook unless the status got worse. This example ranks the three statuses and only sends a message when `new_status` is more severe than `previous_status`, so recoveries and resolutions stay quiet: + +```javascript +// Status values are uppercase (HEALTHY, FLAKY, BROKEN), matching the payload. +const SEVERITY = { HEALTHY: 0, FLAKY: 1, BROKEN: 2 }; + +function handler(webhook) { + const { previous_status = "HEALTHY", new_status = "HEALTHY" } = webhook.payload; + + // Only alert when the test got worse, not when it recovered. + if (SEVERITY[new_status] <= SEVERITY[previous_status]) { + webhook.cancel = true; + return webhook; + } + + // summarizeTestCase() is defined in the Slack integration guide. + webhook.payload = summarizeTestCase(webhook.payload); + return webhook; +} +``` + +To alert *only* when a test reaches the broken state, and stay quiet on first-time flaky detections, gate on the new status directly instead: + +```javascript +function handler(webhook) { + if (webhook.payload.new_status !== "BROKEN") { + webhook.cancel = true; + return webhook; + } + + // summarizeTestCase() is defined in the Slack integration guide. + webhook.payload = summarizeTestCase(webhook.payload); + return webhook; +} +``` + +Both snippets replace the `handler` function from the [Slack integration guide](../webhooks/slack-integration#id-2.-customize-your-transformation); keep that guide's `summarizeTestCase` helper in the same transformation so the message body still renders. Its `previous_status → new_status` line makes the escalation obvious in the channel. + +## The quarantine trade-off + +Before you reach for a broken-type monitor, know what it does to quarantine. Classifying a test as broken changes its health status, and auto-quarantine applies only to tests with a **Flaky** status. 
So when a broken-type monitor flags a test that was auto-quarantined as flaky, the test becomes `BROKEN`, drops out of the auto-quarantine set, and its failures start blocking CI again. That is by design, since a broken test is a real regression, not a flake to skip. It also means a broken classification is not a side-effect-free way to get an escalation alert. + +Labels avoid this. A labeling monitor doesn't change health status, so an auto-quarantined test stays quarantined while you still get the activation signal (see [Alert every time a monitor flags a test](#alert-every-time-a-monitor-flags-a-test) below). Manually quarantined tests are unaffected either way. See [Quarantining](../quarantining/) and [Flake Detection](../detection/) for the full composite-status behavior. + + + A flaky, auto-quarantined test with CI passing. A broken-type monitor fires and reclassifies it as BROKEN. Because broken tests are not quarantine candidates, it drops out of auto-quarantine and its failures block CI again. + + +## Alert every time a monitor flags a test + +Use this when you want to know about every detection event on a test, including the ones that don't change its overall status (a second monitor piling on, or a labeling monitor surfacing a new pattern). + +**1. Subscribe to `test_case.monitor_status_changed`.** On your Slack endpoint, enable this event in addition to (or instead of) `v2.test_case.status_changed`. + +**2. Filter to monitor activations.** The event fires on both activation and resolution, so cancel the webhook unless a monitor is becoming active: + +```javascript +function handler(webhook) { + const { monitor } = webhook.payload; + + // Only alert when a monitor starts flagging the test. 
+ if (!monitor || monitor.status !== "active") { + webhook.cancel = true; + return webhook; + } + + webhook.payload = { + blocks: [ + { + type: "header", + text: { type: "plain_text", text: `Monitor active: ${webhook.payload.test_case.name}` }, + }, + { + type: "section", + text: { + type: "mrkdwn", + text: [ + `Monitor type: \`${monitor.type}\``, + `Test Details: ${webhook.payload.test_case.html_url}`, + ].join("\n"), + }, + }, + ], + }; + return webhook; +} +``` + +Because `test_case.monitor_status_changed` fires for every monitor independently, this catches a test that keeps tripping new monitors over time, even while its headline status stays `FLAKY`. The `monitor.type` field tells you which monitor fired, so you can branch on it: route [labeling monitors](../management/test-labels#automatic-labeling-from-monitors) to a triage channel and health classification monitors to your on-call channel. + + +To route by pattern without changing a test's health status, set a monitor's action to **Apply labels**, then branch on `monitor.type` in your transform to send those activations wherever they belong. See [Test Labels](../management/test-labels) for the full setup. + + +## Related + +- [Integration for Slack](../webhooks/slack-integration). The Slack connection these transformations build on. +- [Webhooks](../webhooks/index). The full event catalog and field reference. +- [Flake Detection](../detection/). How monitors classify tests as flaky or broken. +- [Test Labels](../management/test-labels). Apply and route labels with monitors. diff --git a/flaky-tests/webhooks/index.mdx b/flaky-tests/webhooks/index.mdx index 0a64f74e..902e71fb 100644 --- a/flaky-tests/webhooks/index.mdx +++ b/flaky-tests/webhooks/index.mdx @@ -88,6 +88,11 @@ Emitted when an AI-powered flaky test analysis finishes for a test case. You can also find guides for specific examples here: + + +## Alert only when a test gets worse + +By default this connection alerts on every status change. 
If you'd rather hear about a test only when it **escalates**, whether by degrading to broken or by tripping more monitors over time, filter the transformation on the status transition instead of sending every event. + + + Send Slack alerts when a test gets worse, not just the first time it's flagged. + + ## Congratulations! You should now receive notifications in your Slack workspace when a test's status changes. You can further modify your transformation script to customize your messages.
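
Before pasting a customized filter into the transformation, it can help to run it locally against a few sample payloads. The sketch below is one way to do that. It assumes the payload carries `previous_status` and `new_status` with the uppercase values `HEALTHY`, `FLAKY`, and `BROKEN`. The `isEscalation` helper and the sample payloads are hypothetical and exist only for illustration, and the `handler` here omits message formatting so it can run without Slack.

```javascript
// Hypothetical local harness for trying out an escalation filter.
// Assumes payload fields previous_status / new_status with uppercase values.
const SEVERITY = { HEALTHY: 0, FLAKY: 1, BROKEN: 2 };

// Returns true only when the status moved to a strictly more severe value.
// Unknown or missing statuses return false, so they never trigger an alert.
function isEscalation(payload) {
  const before = SEVERITY[payload.previous_status];
  const after = SEVERITY[payload.new_status];
  if (before === undefined || after === undefined) return false;
  return after > before;
}

// Mirrors the cancel-or-send shape of a transformation handler,
// without building a Slack message.
function handler(webhook) {
  if (!isEscalation(webhook.payload)) {
    webhook.cancel = true;
  }
  return webhook;
}

// Illustrative sample transitions (not real webhook data).
const samples = [
  { previous_status: "HEALTHY", new_status: "FLAKY" },
  { previous_status: "FLAKY", new_status: "BROKEN" },
  { previous_status: "BROKEN", new_status: "FLAKY" },
  { previous_status: "FLAKY", new_status: "FLAKY" },
];

for (const payload of samples) {
  const result = handler({ payload });
  const outcome = result.cancel ? "cancelled" : "sent";
  console.log(`${payload.previous_status} -> ${payload.new_status}: ${outcome}`);
}
```

Treating an unrecognized status as "not an escalation" is a deliberate choice in this sketch: a quiet miss is usually preferable to a noisy false alert, but you may prefer the opposite for critical tests.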