docs(flaky-tests): add "Alert When a Test Escalates" webhook recipe #249
Documents the test-escalation user story from the connectors Slack thread: how to get Slack alerts when a test gets worse, not just on first detection.
- New recipe page covering the v2.test_case.status_changed (overall health transitions) vs test_case.monitor_status_changed (per-monitor activations) distinction, with transform snippets for the classify-as-broken and apply-a-label forks.
- Cross-link section in the Slack integration guide.
- Card on the webhooks index + nav entry in docs.json.
All examples stay on the v2 event schema; legacy v1 event not documented.
Co-Authored-By: Claude Opus 4.8 (1M context) <noreply@anthropic.com>
Verification status (2026-06-12): Verified: customers can use this. Ready to publish.
No rollout to wait on. Engineering authors (@TylerJang27, @acatxnamedvirtue) are requested as reviewers for technical-accuracy sign-off before publish.
Code verification (2026-06-15): 8 confirmed / 0 contradicted / 0 ambiguous / 0 unverifiable
All factual claims in the recipe verified against
No contradictions. The new
Source #1: status_changed fires on overall health transitions (confirmed)
File: description: "Emitted when the health status of a test case changes. Test status can transition between HEALTHY, FLAKY, and BROKEN. ...",
Reasoning: Fires on health status changes.
Source #2: status enum and monitor.status value, uppercase (confirmed)
File: export const TestCaseStatusSchema = z.enum(["HEALTHY", "FLAKY", "BROKEN"]);
...
export const MonitorStatusSchema = z.enum(["active", "inactive"]);
Reasoning: Status values are uppercase. The Node harness confirmed the practical trap: lowercasing the comparisons in the
Source #3: broken classification un-quarantines an auto-quarantined test (confirmed)
File: `BROKEN` tests are **not quarantine candidates**. Quarantining is intended for flaky tests that can be safely skipped...
The existing quarantine logic checks for `FLAKY` status only and is not affected by the addition of `BROKEN`.
Reasoning: The eng spec states the existing quarantine logic gates on
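The casing trap noted under Source #2 can be illustrated with a minimal sketch. This is not the recipe's actual snippet: the `SEVERITY` ranking and the `isEscalation` helper are hypothetical, and only the three uppercase status values come from the quoted schema.

```javascript
// Hypothetical severity ranking. The keys mirror the uppercase values of
// TestCaseStatusSchema quoted above; the numeric ranks are an assumption.
const SEVERITY = { HEALTHY: 0, FLAKY: 1, BROKEN: 2 };

// Returns true only when the new status ranks strictly worse than the old one.
function isEscalation(previousStatus, newStatus) {
  const before = SEVERITY[previousStatus];
  const after = SEVERITY[newStatus];
  // An unknown key yields undefined, and any ordering comparison involving
  // undefined is false, so a casing mismatch fails silently instead of throwing.
  return after > before;
}

console.log(isEscalation("HEALTHY", "FLAKY")); // true
console.log(isEscalation("FLAKY", "BROKEN")); // true
console.log(isEscalation("BROKEN", "FLAKY")); // false
console.log(isEscalation("healthy", "flaky")); // false: lowercase never matches
```

Because the lowercase call returns `false` rather than raising an error, a gate built this way would simply stop sending alerts, which is consistent with the "silently breaks gating" observation from the harness.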
Seed a Flaky Tests > Recipes nav group with the escalation-alert page as its first entry. It's a process/pattern doc, not a connector reference, so it reads better as a recipe than under Webhooks. Webhooks keeps a cross-link card. Quarantine-recipes (#59) and monitor-tuning (#53) are the planned next entries.
- git mv flaky-tests/webhooks/ -> flaky-tests/recipes/
- new Recipes group in docs.json after Webhooks
- fixed relative links (./slack-integration, ./index -> ../webhooks/...)
Co-Authored-By: Claude Opus 4.8 (1M context) <noreply@anthropic.com>
Verification status (June 13, 2026): Verified: customers can use this. Ready to publish (currently draft — author's choice).
Verified by Daily Docs Sweep · June 13, 2026
Generated by Claude Code
This is awesome! Two notes:
…ky tests
Addresses Tyler's PR #249 review:
- Clarify the transform snippets are drop-in replacements for the Slack guide's handler and depend on its summarizeTestCase helper staying in the transformation.
- Add a Warning that classifying a test as broken changes its health status, dropping a flaky+auto-quarantined test out of auto-quarantine (broken tests aren't quarantine candidates) so it blocks CI again. Labeling monitors avoid this; manually quarantined tests are unaffected.
- Tie the label Tip to the quarantine tradeoff.
Co-Authored-By: Claude Opus 4.8 (1M context) <noreply@anthropic.com>
Resolves the remaining part of Tyler's PR #249 review (transform validity):
- Inline comment in both status snippets noting summarizeTestCase() lives in the Slack integration guide, so a single-block copy-paste doesn't silently ReferenceError.
- Comment on the SEVERITY map noting status values are uppercase.
Validated with a local Node harness against the real v2 + monitor payloads (16/16): handlers send/cancel correctly, and the casing experiment confirms lowercasing the comparisons silently breaks gating.
Co-Authored-By: Claude Opus 4.8 (1M context) <noreply@anthropic.com>
- Voice/clarity pass on the escalation recipe, remove all em dashes.
- Add two standalone animated SVGs (CSS keyframes, reduced-motion safe):
- event-granularity-gap: HEALTHY->FLAKY->FLAKY across three columns,
showing status_changed stays silent on the second monitor while
monitor_status_changed fires on both.
- broken-classification-quarantine: a broken classification drops a
flaky auto-quarantined test out of quarantine and re-blocks CI.
- Embed both via <Frame> in the recipe.
Transforms validated end to end: trunk2 source, a Node harness (16/16),
and Svix Run Test on a play.svix.com test endpoint.
Co-Authored-By: Claude Opus 4.8 (1M context) <noreply@anthropic.com>
Apply CONTRIBUTING admonition rules to the escalation recipe:
- The quarantine side effect was a two-paragraph <Warning> wrapping what is really core content. The guide forbids wrapping a section in a callout, and a reversible, by-design behavior is not a Warning-grade hazard. Promote it to a '## The quarantine trade-off' section (prose), and move the broken-classification animation into it.
- Trim the label <Tip> so it no longer duplicates that section; it now covers only the optional label-routing mechanics.
Page now has two callouts (Info for background, Tip for an optional path), none stacked or section-wrapping.
Co-Authored-By: Claude Opus 4.8 (1M context) <noreply@anthropic.com>
…n gap diagram
Accuracy pass on the event-granularity diagram. The column-3 event is correct (monitor_status_changed fires on Monitor B's own activation, independent of overall status), but the framing invited a 'why an event if FLAKY to FLAKY?' misread. Sharpen it:
- column 3 sublabel 'already FLAKY' -> '2nd monitor, still FLAKY'
- caption: 'catches both escalations' -> 'fires on every monitor activation, so it catches both' (Monitor A's first detection is not an escalation)
broken-classification diagram audited, accurate, unchanged.
Co-Authored-By: Claude Opus 4.8 (1M context) <noreply@anthropic.com>
What
Seeds a new Flaky Tests → Recipes nav group with its first entry: Alert When a Test Escalates, documenting how to send Slack alerts when a test gets worse — not just the first time it's flagged.
This is a process/pattern doc, not a connector reference, so it reads as a recipe rather than living under Webhooks (which keeps a cross-link card to it). Planned next entries for the group: the in-flight quarantine-recipes (#59) and monitor-tuning (#53) pages.
Driven by the connectors Slack thread (Tyler Jang's Reply 9): document the P2 user story for Slack notifications when a test starts flaking beyond the first detection level, with the classify-as-broken vs apply-a-label fork.
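To make "beyond the first detection level" concrete, here is a small illustrative model of which events fire when two monitors activate in turn on the same test. It is a sketch under stated assumptions, not code from trunk-io/trunk2: the monitor names, the `overallStatus` rule (any active monitor means FLAKY), and the ordering of the two events within one activation are all hypothetical.

```javascript
// Assumed rule for this sketch: a test with at least one active monitor is FLAKY.
function overallStatus(activeMonitors) {
  return activeMonitors.size > 0 ? "FLAKY" : "HEALTHY";
}

// Replays a sequence of monitor activations and lists the events emitted.
function simulate(activations) {
  const active = new Set();
  let status = "HEALTHY";
  const emitted = [];
  for (const monitor of activations) {
    active.add(monitor);
    // Every individual monitor activation produces its own event.
    emitted.push(`test_case.monitor_status_changed (${monitor} active)`);
    const next = overallStatus(active);
    // The overall event is emitted only when health actually transitions.
    if (next !== status) {
      emitted.push(`v2.test_case.status_changed (${status} -> ${next})`);
      status = next;
    }
  }
  return emitted;
}

console.log(simulate(["monitor-a", "monitor-b"]));
// [
//   'test_case.monitor_status_changed (monitor-a active)',
//   'v2.test_case.status_changed (HEALTHY -> FLAKY)',
//   'test_case.monitor_status_changed (monitor-b active)'
// ]
```

The second activation leaves the test FLAKY, so only the per-monitor event appears for it. An alert wired solely to the overall event would miss that escalation.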
The core distinction (verified against trunk-io/trunk2)
- `v2.test_case.status_changed` fires only on overall health transitions (HEALTHY/FLAKY/BROKEN). A second monitor piling onto an already-FLAKY test changes nothing, so no event is sent.
- `test_case.monitor_status_changed` fires per individual monitor activation/resolution: the "more than just the first detection" coverage. This is why `monitor_status_changed` was added to the Slack connector in the thread.
Verified the schemas and fan-out behavior in `ts/packages/flake-detection/src/types.ts`, `ts/apps/flake-detection-side-effects-handler/src/webhook.ts`, `ts/apps/detection-engine-webhook-event-handler/src/enrich.ts`, and `ts/packages/tools/svix-publish-schemas/src/events/`.
Changes
- `flaky-tests/recipes/alert-on-test-escalation.mdx`: event-picker table, classify-as-broken transform snippet (gate on `new_status`), apply-a-label transform snippet (route on `monitor.type` from `monitor_status_changed`).
- `Recipes` nav group in `docs.json`, after Webhooks.
- `webhooks/slack-integration.mdx`: "Alert only when a test gets worse" cross-link section.
- `webhooks/index.mdx`: card for the new recipe.
All examples stay on the v2 event schema. The legacy `test_case.status_changed` (v1) schema is intentionally not documented.
Scope note
The product-side half of the thread — connectors (GitHub Issues, Linear, Jira, Slack) accepting both v1 and v2 events and the updated suggested transforms — was handled by Tyler Beebe. This PR is the docs-side user story only.
Engineering authors
For technical-accuracy review:
Notes
- `../webhooks/index` links in the event table are anchor-less: Mintlify's slug for a backtick-and-dot heading like "### `v2.test_case.status_changed`" is unpredictable. If we want deep links, confirm the real anchors against a Mintlify build.
- `webhooks/index.mdx` documents `monitor.status` as "active or resolved", but the source enum is `active`/`inactive` (`types.ts:173`). Worth a separate fix.
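The `active`/`inactive` mismatch in the second note matters in practice for anyone gating a transform on `monitor.status`. The sketch below shows the shape of such a gate in a Svix-style transformation handler. It is not the recipe's snippet: the payload layout (`payload.monitor` with `type` and `status`), the monitor type names, and the channel names are assumptions made for illustration.

```javascript
// Hypothetical routing table; the monitor types and channels are placeholders.
const CHANNEL_BY_MONITOR_TYPE = {
  example_type_a: "#flaky-escalations",
  example_type_b: "#flaky-triage",
};
const DEFAULT_CHANNEL = "#flaky-tests";

// Assumes the transformation receives a webhook object with `payload` and `cancel`.
function handler(webhook) {
  const monitor = webhook.payload && webhook.payload.monitor;
  // Per the source enum the values are "active" and "inactive". A comparison
  // against "resolved" would never match, so resolutions would not be filtered.
  if (!monitor || monitor.status !== "active") {
    webhook.cancel = true; // skip delivery for resolutions and unexpected payloads
    return webhook;
  }
  const channel = CHANNEL_BY_MONITOR_TYPE[monitor.type] || DEFAULT_CHANNEL;
  webhook.payload = { channel, text: `Monitor ${monitor.type} activated` };
  return webhook;
}

const sent = handler({ payload: { monitor: { type: "example_type_a", status: "active" } } });
console.log(sent.payload.channel); // #flaky-escalations

const skipped = handler({ payload: { monitor: { type: "example_type_a", status: "inactive" } } });
console.log(skipped.cancel); // true
```

Checking for `"active"` and cancelling everything else keeps the gate correct regardless of what the non-active value is called, which sidesteps the docs discrepancy until it is fixed.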