feat(observability): OTel spans for every guardrail gate (closes #161) #167

Merged
initializ-mk merged 1 commit into main from feat/issue-161-guardrail-spans
Jun 15, 2026

Summary

Symmetric to the `guardrail_check` audit emission shipped in #156 / #160: every `Check*` method on `LibraryGuardrailEngine` now opens a `guardrail.<gate>` child span and stamps the same gate / decision / violation metadata operators already see on the audit event.

Operators looking at a trace see "PII was masked here" inline with the LLM and tool spans, no audit-stream pivot required. Block decisions surface as red bars in the trace UI (OTel Error status with the violation summary as the description).

Trace tree

Before:
```
a2a.tasks/send
└─ agent.execute
   ├─ llm.completion
   └─ tool.
```

After:
```
a2a.tasks/send
├─ guardrail.input            ← NEW (CheckInbound)
└─ agent.execute
   ├─ guardrail.context       ← NEW (BeforeLLMCall hook scan)
   ├─ llm.completion
   ├─ guardrail.tool_call     ← NEW (BeforeToolExec hook)
   ├─ tool.
   ├─ guardrail.output        ← NEW (AfterToolExec hook, with tool=)
   └─ guardrail.output        ← NEW (CheckOutbound, no tool=)
```
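
The tree above implies a fixed mapping from each `Check*` method to a span name, with two methods sharing `guardrail.output`. A minimal Go sketch of that mapping follows. The `spanNames` map and `carriesToolName` helper are hypothetical names for illustration, not the PR's code; only the method and span names come from this PR.

```go
package main

import "fmt"

// spanNames maps each Check* method to the span name described in this PR.
// The map is illustrative; the real engine may derive names differently.
var spanNames = map[string]string{
	"CheckInbound":    "guardrail.input",
	"CheckContext":    "guardrail.context",
	"CheckToolCall":   "guardrail.tool_call",
	"CheckToolOutput": "guardrail.output",
	"CheckOutbound":   "guardrail.output",
	"CheckStream":     "guardrail.stream",
}

// carriesToolName reports whether the span for a check is expected to carry
// forge.tool.name. Per the PR, only the tool_call and tool-output paths do,
// which is what separates the two kinds of guardrail.output span.
func carriesToolName(check string) bool {
	return check == "CheckToolCall" || check == "CheckToolOutput"
}

func main() {
	checks := []string{
		"CheckInbound", "CheckContext", "CheckToolCall",
		"CheckToolOutput", "CheckOutbound", "CheckStream",
	}
	for _, check := range checks {
		fmt.Printf("%-16s -> %-20s forge.tool.name=%v\n",
			check, spanNames[check], carriesToolName(check))
	}
}
```

A trace query that wants only model-reply output checks can filter `guardrail.output` spans on the absence of `forge.tool.name`.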

Attribute reference (forge-core/observability/attrs.go)

| Attribute | When set | Source |
| --- | --- | --- |
| `forge.guardrail.gate` | Always | `Result.Gate` (same value as `fields.gate` on the audit event) |
| `forge.guardrail.decision` | Always | `Result.Decision` (`allow` / `mask` / `block` / `warn`) |
| `forge.guardrail.violation_count` | Always | `len(Result.Violations)` |
| `forge.guardrail.type` | When violations present | First violation's `Type` (`pii`, `moderation`, …) |
| `forge.guardrail.category` | When category set | First violation's `Category` (`ssn`, `email`, …) |
| `forge.tool.name` | `tool_call` spans and tool-output `output` spans | The tool the gate fired on |
| `forge.guardrail.evidence` | `capture_content: true` only | Redacted + truncated triggering content |
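
The "When set" column can be read as a small decision procedure. The sketch below is one possible reading of it, using only the standard library. The `Result` and `Violation` structs are simplified stand-ins for the library's types, and `guardrailAttrs` / `statusIsError` are hypothetical helpers, not functions from this PR. The evidence attribute is left out because it goes through the content-capture pipeline.

```go
package main

import "fmt"

// Violation and Result are simplified stand-ins for the guardrail
// library's types; only the fields named in the table are modelled.
type Violation struct {
	Type     string
	Category string
}

type Result struct {
	Gate       string
	Decision   string // allow / mask / block / warn
	Violations []Violation
}

// guardrailAttrs builds the span attributes following the table's rules.
// toolName is empty for spans that are not tied to a tool.
func guardrailAttrs(r Result, toolName string) map[string]any {
	attrs := map[string]any{
		"forge.guardrail.gate":            r.Gate,
		"forge.guardrail.decision":        r.Decision,
		"forge.guardrail.violation_count": len(r.Violations),
	}
	if len(r.Violations) > 0 {
		first := r.Violations[0]
		attrs["forge.guardrail.type"] = first.Type
		if first.Category != "" {
			attrs["forge.guardrail.category"] = first.Category
		}
	}
	if toolName != "" {
		attrs["forge.tool.name"] = toolName
	}
	return attrs
}

// statusIsError mirrors the rule that block decisions surface as an
// OTel Error status on the span.
func statusIsError(r Result) bool {
	return r.Decision == "block"
}

func main() {
	r := Result{
		Gate:       "input",
		Decision:   "mask",
		Violations: []Violation{{Type: "pii", Category: "ssn"}},
	}
	fmt.Println(guardrailAttrs(r, ""))
	fmt.Println("error status:", statusIsError(r))
}
```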

Content-capture parity with #130

The `forge.guardrail.evidence` attribute uses the exact same `PrepareSpanContent(redact, maxBytes)` pipeline as `gen_ai.input.messages` and `forge.tool.args` — same vendor secret-token scrub, same 4 KiB byte cap, same `…[truncated:N]` marker.

Same content rule as the audit event from PR #156:

  • mask → evidence carries the post-mask content (what the LLM actually saw)
  • block / warn → evidence carries the original triggering text (no masked variant produced)
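
To make the content rule concrete, here is a rough sketch of evidence selection followed by a redact-then-truncate step. It is not the real `PrepareSpanContent`. The secret pattern is a made-up placeholder for the vendor token scrub, and reading `N` in the marker as the number of dropped bytes is an assumption. The PR does not say what happens on `allow`, so the sketch emits no evidence in that case.

```go
package main

import (
	"fmt"
	"regexp"
	"unicode/utf8"
)

// maxEvidenceBytes matches the 4 KiB cap mentioned in the PR.
const maxEvidenceBytes = 4 * 1024

// secretPattern is a hypothetical stand-in for the vendor secret-token scrub.
var secretPattern = regexp.MustCompile(`sk-[A-Za-z0-9]{8,}`)

// chooseEvidence picks which text becomes evidence: post-mask content on
// mask, original content on block / warn. Other decisions yield no evidence
// (an assumption; the PR does not cover them).
func chooseEvidence(decision, original, masked string) (string, bool) {
	switch decision {
	case "mask":
		return masked, true
	case "block", "warn":
		return original, true
	default:
		return "", false
	}
}

// prepareEvidence redacts first, then truncates to maxBytes and appends a
// marker. N is taken to be the number of bytes dropped (assumption).
func prepareEvidence(content string, redact bool, maxBytes int) string {
	if redact {
		content = secretPattern.ReplaceAllString(content, "[REDACTED]")
	}
	if len(content) <= maxBytes {
		return content
	}
	cut := maxBytes
	// Back up to a rune boundary so a multi-byte character is never split.
	for cut > 0 && !utf8.RuneStart(content[cut]) {
		cut--
	}
	return fmt.Sprintf("%s…[truncated:%d]", content[:cut], len(content)-cut)
}

func main() {
	text, ok := chooseEvidence("mask", "ssn 123-45-6789", "ssn [SSN]")
	if ok {
		fmt.Println(prepareEvidence(text, true, maxEvidenceBytes))
	}
}
```

Redacting before truncating matters: a secret that straddles the cut would otherwise survive as a partial token that the pattern no longer matches.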

Wiring

  • `LibraryGuardrailEngine` grows a `tracingCfg` field + `WithTracing(cfg)` setter.
  • `BuildGuardrailChecker` gains a `TracingConfig` parameter; `attach()` calls `WithTracing` on every constructed engine.
  • `runner.Start` resolves `TracingConfig` early (it's a pure config-resolution function — no I/O, no provider construction) so the guardrail engine sees it before `NewTracerProvider` runs. The later tracing block reuses the resolved value as `tracingCfg := tracingCfgEarly`.
  • When tracing is disabled, the noop tracer short-circuits and no spans are recorded. `CaptureContent` gates only the evidence attribute; the span-start call itself runs unconditionally in every `Check*` method, which is near-zero cost under the noop tracer.
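
The wiring can be pictured with a small standard-library model: a setter that stores the tracing config, a span-start call that always runs, and a deferred finish so the early-return block path still closes the span. Everything here (`Engine`, `recorder`, `span`, the `block` flag) is a hypothetical stand-in; the real engine uses the OTel tracer and the `startGuardrailSpan` / `finishGuardrailSpan` helpers.

```go
package main

import (
	"errors"
	"fmt"
)

// TracingConfig models only the two switches discussed in the PR.
type TracingConfig struct {
	Enabled        bool
	CaptureContent bool
}

// span and recorder are minimal stand-ins for an OTel span and exporter.
type span struct {
	name  string
	attrs map[string]string
	ended bool
}

type recorder struct{ spans []*span }

type Engine struct {
	tracingCfg TracingConfig
	rec        *recorder
}

// WithTracing stores the resolved config and returns the engine for chaining.
func (e *Engine) WithTracing(cfg TracingConfig) *Engine {
	e.tracingCfg = cfg
	return e
}

// startSpan always returns a span, but records it only when tracing is
// enabled, which models the noop-tracer short-circuit.
func (e *Engine) startSpan(name string) *span {
	s := &span{name: name, attrs: map[string]string{}}
	if e.tracingCfg.Enabled {
		e.rec.spans = append(e.rec.spans, s)
	}
	return s
}

var errBlocked = errors.New("blocked by guardrail")

// CheckInbound opens the span up front and finishes it in a defer, so the
// block-and-return path closes the span just like the allow path.
func (e *Engine) CheckInbound(text string, block bool) error {
	s := e.startSpan("guardrail.input")
	decision := "allow"
	defer func() {
		s.attrs["forge.guardrail.decision"] = decision
		if e.tracingCfg.CaptureContent {
			// The real pipeline redacts and truncates before stamping.
			s.attrs["forge.guardrail.evidence"] = text
		}
		s.ended = true
	}()
	if block {
		decision = "block"
		return errBlocked
	}
	return nil
}

func main() {
	rec := &recorder{}
	e := (&Engine{rec: rec}).WithTracing(TracingConfig{Enabled: true})
	err := e.CheckInbound("hello", true)
	fmt.Println(err, len(rec.spans), rec.spans[0].ended)
}
```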

Files

| File | Change |
| --- | --- |
| `forge-core/observability/attrs.go` | 6 new constants under `forge.guardrail.*` |
| `forge-cli/runtime/guardrails_tracing.go` (new) | `startGuardrailSpan` + `finishGuardrailSpan` helpers |
| `forge-cli/runtime/guardrails_engine.go` | `tracingCfg` field, `WithTracing` method, span open/close at every `Check*` method (deferred finish so block-and-return + library-error paths both close the span correctly) |
| `forge-cli/runtime/guardrails_loader.go` | `BuildGuardrailChecker` signature grows `tracingCfg observability.TracingConfig` |
| `forge-cli/runtime/runner.go` | Resolve `TracingConfig` before `BuildGuardrailChecker`; reuse downstream |
| `forge-cli/runtime/guardrails_tracing_test.go` (new) | 7 tests covering all 5 gates + noop-tracer + capture-content posture |
| `docs/core-concepts/observability-tracing.md` | New "Guardrail spans" section |

Test plan

  • `go test -count=1 ./...` clean in forge-core and forge-cli
  • `golangci-lint run ./...` → 0 issues in both modules
  • `gofmt -w` applied
  • All 7 span tests pass:
    • `TestCheckInbound_OpensInputSpanWithGateAttributes` (CaptureContent off → evidence absent)
    • `TestCheckInbound_CaptureContent_StampsRedactedEvidence` (CaptureContent on → evidence present, raw PII absent)
    • `TestCheckToolCall_OpensToolCallSpanWithToolAttribute` (tool attribute set)
    • `TestCheckOutbound_OpensOutputSpan_NoToolAttribute` (distinguishes from tool-output)
    • `TestCheckContext_OpensContextSpan`
    • `TestCheckStream_OpensStreamSpan` (even though not auto-wired)
    • `TestCheckInbound_NoTracing_NoSpansRecorded` (noop-tracer short-circuit)
  • End-to-end smoke: deploy with OTLP exporter, send PII-bearing message, confirm `guardrail.input` child of `a2a.tasks/send` with `forge.guardrail.gate=input`, `forge.guardrail.decision=mask`, no `forge.guardrail.evidence` (CaptureContent default off)
  • Set `capture_content: true` and verify evidence attribute carries redacted content
  • Trigger an outbound block in enforce mode; confirm `guardrail.output` has OTel `Error` status

Stamping stack now complete

After this merges, every guardrail decision lands on both pipelines symmetrically:

| Pipeline | Field / Attribute |
| --- | --- |
| Audit NDJSON | `event=guardrail_check` with `fields.gate` / `decision` / `type` / `category` / `violation_count` / `evidence` |
| OTel span | `guardrail.<gate>` with `forge.guardrail.gate` / `decision` / `type` / `category` / `violation_count` / `evidence` |

SIEM consumers reading the audit stream and trace consumers reading the OTel pipeline see the same information, in the same shape, with the same content-capture posture — one mental model.

Closes #161

Symmetric to the guardrail_check audit emission shipped in
#156 / #160 — every Check* method on LibraryGuardrailEngine opens
a guardrail.<gate> child span and stamps the same gate / decision
/ violation metadata operators see on the audit event.

Span names map to the library's gate vocabulary:
  - guardrail.input        (CheckInbound → InputGate)
  - guardrail.context      (CheckContext → ContextGate)
  - guardrail.tool_call    (CheckToolCall → ToolCallGate)
  - guardrail.output       (CheckOutbound + CheckToolOutput → OutputGate)
  - guardrail.stream       (CheckStream → StreamGate; not auto-wired)

The CheckOutbound case splits per text part — one
guardrail.output span per OutputGate call so the trace tree
mirrors the part-level iteration cleanly.

Attribute keys (new constants in forge-core/observability/attrs.go):
  - forge.guardrail.gate              (Result.Gate)
  - forge.guardrail.decision          (Result.Decision: allow/mask/block/warn)
  - forge.guardrail.type              (first violation's Type)
  - forge.guardrail.category          (first violation's Category)
  - forge.guardrail.violation_count   (len(Result.Violations))
  - forge.guardrail.evidence          (gated by CaptureContent + Redact)
  - forge.tool.name                   (reused from #130; set on tool_call + tool-output paths)

Block decisions stamp OTel Error status with the violation summary
as the status description — operators see red bars in the trace UI
without custom attribute queries.

forge.guardrail.evidence follows the #130 + #156 content rule
exactly: default off; with CaptureContent on, the mask path
emits post-mask content (matches what the LLM actually saw) and
the block/warn paths emit original content. PrepareSpanContent
runs the same redact-then-truncate pipeline used for
gen_ai.input.messages and forge.tool.args, so the four content
streams share one consistent shape.

Wiring:
  - LibraryGuardrailEngine grows a tracingCfg field + WithTracing
    setter; BuildGuardrailChecker gains a TracingConfig parameter
    and calls WithTracing on every constructed engine.
  - runner.Start resolves TracingConfig early (it's a pure config
    resolution — no I/O) so the guardrail engine sees it before
    NewTracerProvider runs; the later tracing block reuses the
    resolved value.
  - When tracing is disabled, the noop tracer short-circuits and
    no spans are recorded. CaptureContent only controls the
    evidence attribute; the span-start call itself always runs
    (it's cheap when tracing is off).

Tests (forge-cli/runtime/guardrails_tracing_test.go):
  - guardrail.input span lands with gate/decision/violation_count
    attributes; evidence ABSENT when CaptureContent=false
  - evidence PRESENT but raw PII absent when CaptureContent=true
    (post-mask rule)
  - guardrail.tool_call carries forge.tool.name
  - guardrail.output for CheckOutbound has NO tool attribute
    (distinguishes "model reply to user" from tool-result OutputGate
    fires)
  - guardrail.context + guardrail.stream spans land
  - noop-tracer path: no spans recorded

Docs: docs/core-concepts/observability-tracing.md gains a
"Guardrail spans" section under "Span content capture" listing
span names, nesting, attribute reference, and the content-capture
parity note.
initializ-mk merged commit e6a5e4e into main on Jun 15, 2026
10 checks passed