feat(metrics): event-bus by-type breakdown + webhook delivery counters #70

Merged: TeoSlayer merged 1 commit into main from feat/events-webhook-breakdown on Jun 9, 2026

@TeoSlayer

Operators looking at the N-events and N-webhook panels on the dashboard had nothing actionable — the existing counters told them "something happened" / "URL is configured" without surfacing what was actually flowing.

Event bus

Adds pilot_event_bus_publish_by_type_total{event="source.type"} so the publish total breaks down per publisher.event-name (membership.changed, trust.created, server.audit.entry, ...). The bus interface gains SetOnPublishEvent(func(Event)) alongside the existing no-arg SetOnPublish; old callers stay working.

Webhook

Adds pilot_webhook_deliveries_total{result="ok|error|dropped"} + pilot_webhook_last_delivery_unix_seconds. Webhook dispatcher already had atomic counters internally; surfaces them via Store.Stats() + a new lastAttemptUnix stamp updated in post(). Polled at scrape time via WebhookStatsFn (BeaconStatsFn pattern); ok=false → block omitted when no URL is configured.

Tests

  • by-type CounterVec renders one labeled line per source.type
  • webhook stats fn gating: ok=false → omit, ok=true → three result labels + gauge

go build ./... clean. go test ./events ./webhook ./metrics green.

Adds two metric families that operators need in order to see what the bus
and webhook are actually doing. The existing pilot_event_bus_publish_total /
pilot_webhook_configured only tell you "something happened" / "URL is
set" without surfacing what's flowing or whether deliveries are
succeeding.

## Event bus

`pilot_event_bus_publish_by_type_total{event="source.type"}`

Each Publish() now invokes a second hook OnPublishEvent(Event), wired
from server_lifecycle.go to increment a CounterVec labeled by the
publisher's Source + "." + Type. So operators see membership.changed
vs trust.created vs server.audit.entry side by side instead of one
aggregate number.

The bus interface gains SetOnPublishEvent alongside the existing
SetOnPublish; old callers stay working.
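The hook wiring described above could be sketched roughly as follows. This is an illustration under assumptions, not the code from this PR: the `Event`, `Bus`, and `ByType` types are hypothetical stand-ins (the real counter is a CounterVec), and splitting `membership.changed` into Source `membership` and Type `changed` is a guess at how publishers fill those fields.

```go
package main

import (
	"fmt"
	"sync"
)

// Event is a hypothetical stand-in for the bus event type. Only the Source
// and Type fields are taken from the description above.
type Event struct {
	Source string
	Type   string
}

// Bus is a minimal sketch of a bus with two optional publish hooks.
type Bus struct {
	mu             sync.RWMutex
	onPublish      func()      // existing no-arg hook
	onPublishEvent func(Event) // new hook that receives the event
}

func (b *Bus) SetOnPublish(fn func()) {
	b.mu.Lock()
	defer b.mu.Unlock()
	b.onPublish = fn
}

func (b *Bus) SetOnPublishEvent(fn func(Event)) {
	b.mu.Lock()
	defer b.mu.Unlock()
	b.onPublishEvent = fn
}

// Publish runs whichever hooks are set. Subscriber fan-out is omitted.
func (b *Bus) Publish(e Event) {
	b.mu.RLock()
	onPublish, onPublishEvent := b.onPublish, b.onPublishEvent
	b.mu.RUnlock()
	if onPublish != nil {
		onPublish()
	}
	if onPublishEvent != nil {
		onPublishEvent(e)
	}
}

// ByType is a stdlib-only stand-in for a labeled counter vector.
type ByType struct {
	mu     sync.Mutex
	counts map[string]uint64
}

func NewByType() *ByType { return &ByType{counts: make(map[string]uint64)} }

func (c *ByType) Inc(label string) {
	c.mu.Lock()
	defer c.mu.Unlock()
	c.counts[label]++
}

func (c *ByType) Get(label string) uint64 {
	c.mu.Lock()
	defer c.mu.Unlock()
	return c.counts[label]
}

func main() {
	bus := &Bus{}
	byType := NewByType()
	// The label is Source + "." + Type, as described above.
	bus.SetOnPublishEvent(func(e Event) { byType.Inc(e.Source + "." + e.Type) })

	bus.Publish(Event{Source: "membership", Type: "changed"})
	bus.Publish(Event{Source: "membership", Type: "changed"})
	bus.Publish(Event{Source: "trust", Type: "created"})

	fmt.Println("membership.changed:", byType.Get("membership.changed"))
	fmt.Println("trust.created:", byType.Get("trust.created"))
}
```

Keeping the two setters independent is what lets a caller that only registers the no-arg hook continue to work unchanged.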

## Webhook

`pilot_webhook_deliveries_total{result="ok|error|dropped"}`
`pilot_webhook_last_delivery_unix_seconds`

Webhook dispatcher already maintained atomic delivered/failed/dropped
counters. Added accessor Store.Stats() + a lastAttemptUnix stamp set
in post() so /metrics has a freshness signal. The metrics layer polls
via WebhookStatsFn at scrape time (same pattern as BeaconStatsFn) and
omits the block when no URL is configured (ok=false) so a registry
without a webhook doesn't lie about zero deliveries.

## Tests

Two new tests in metrics/zz_metrics_test.go:
- by-type renders one labeled line per source.type pair
- webhook stats fn gating: ok=false → omit; ok=true → all three result
  labels + the gauge render

@TeoSlayer merged commit 88d368d into main on Jun 9, 2026 (1 check passed)
@TeoSlayer deleted the feat/events-webhook-breakdown branch on June 9, 2026 11:59

@codecov

codecov Bot commented Jun 9, 2026

Codecov Report

✅ All modified and coverable lines are covered by tests.