feat(observability): OTel spans for every guardrail gate (closes #161) #167
Merged
Symmetric to the guardrail_check audit emission shipped in #156 / #160: every Check* method on LibraryGuardrailEngine opens a guardrail.<gate> child span and stamps the same gate / decision / violation metadata operators see on the audit event.

Span names map to the library's gate vocabulary:
- guardrail.input (CheckInbound → InputGate)
- guardrail.context (CheckContext → ContextGate)
- guardrail.tool_call (CheckToolCall → ToolCallGate)
- guardrail.output (CheckOutbound + CheckToolOutput → OutputGate)
- guardrail.stream (CheckStream → StreamGate; not auto-wired)

The CheckOutbound case splits per text part: one guardrail.output span per OutputGate call, so the trace tree mirrors the part-level iteration cleanly.

Attribute keys (new constants in forge-core/observability/attrs.go):
- forge.guardrail.gate (Result.Gate)
- forge.guardrail.decision (Result.Decision: allow/mask/block/warn)
- forge.guardrail.type (first violation's Type)
- forge.guardrail.category (first violation's Category)
- forge.guardrail.violation_count (len(Result.Violations))
- forge.guardrail.evidence (gated by CaptureContent + Redact)
- forge.tool.name (reused from #130; set on tool_call + tool-output paths)

Block decisions stamp OTel Error status with the violation summary as the status description, so operators see red bars in the trace UI without custom attribute queries.

forge.guardrail.evidence follows the #130 + #156 content rule exactly: default off; with CaptureContent on, the mask path emits post-mask content (matches what the LLM actually saw) and the block/warn paths emit original content. PrepareSpanContent runs the same redact-then-truncate pipeline used for gen_ai.input.messages and forge.tool.args, so the four content streams share one consistent shape.

Wiring:
- LibraryGuardrailEngine grows a tracingCfg field + WithTracing setter; BuildGuardrailChecker gains a TracingConfig parameter and calls WithTracing on every constructed engine.
- runner.Start resolves TracingConfig early (it's a pure config resolution, no I/O) so the guardrail engine sees it before NewTracerProvider runs; the later tracing block reuses the resolved value.
- When tracing is disabled, the noop tracer short-circuits; spans are not produced at all. CaptureContent only controls the evidence attribute; the span itself is always opened (it's cheap when tracing is off).

Tests (forge-cli/runtime/guardrails_tracing_test.go):
- guardrail.input span lands with gate/decision/violation_count attributes; evidence ABSENT when CaptureContent=false
- evidence PRESENT but raw PII absent when CaptureContent=true (post-mask rule)
- guardrail.tool_call carries forge.tool.name
- guardrail.output for CheckOutbound has NO tool attribute (distinguishes "model reply to user" from tool-result OutputGate fires)
- guardrail.context + guardrail.stream spans land
- noop-tracer path: no spans recorded

Docs: docs/core-concepts/observability-tracing.md gains a "Guardrail spans" section under "Span content capture" listing span names, nesting, attribute reference, and the content-capture parity note.
Summary
Symmetric to the `guardrail_check` audit emission shipped in #156 / #160: every `Check*` method on `LibraryGuardrailEngine` now opens a `guardrail.<gate>` child span and stamps the same gate / decision / violation metadata operators already see on the audit event.
Operators looking at a trace see "PII was masked here" inline with the LLM and tool spans, no audit-stream pivot required. Block decisions surface as red bars in the trace UI (OTel Error status with the violation summary as the description).
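To make the stamping rule concrete, here is a minimal stdlib-only sketch of how a guardrail result could be turned into span attributes and a status. The attribute keys come from this PR; the `Result`, `Violation`, and `spanRecord` types, the `stamp` helper, and the `type: category` summary format are assumptions made for illustration, not the actual forge types or the OTel API.

```go
package main

import "fmt"

// Violation and Result are simplified, hypothetical stand-ins for the
// library's result types.
type Violation struct{ Type, Category string }

type Result struct {
	Gate       string // assumed value shape, e.g. "input"
	Decision   string // allow / mask / block / warn
	Violations []Violation
}

// spanRecord is a stand-in for a span: attributes plus an error status.
type spanRecord struct {
	Attrs       map[string]any
	StatusError bool
	StatusDesc  string
}

// stamp copies the result metadata onto the record and marks block
// decisions as errors so they would show up as failed spans.
func stamp(r Result) spanRecord {
	rec := spanRecord{Attrs: map[string]any{
		"forge.guardrail.gate":            r.Gate,
		"forge.guardrail.decision":        r.Decision,
		"forge.guardrail.violation_count": len(r.Violations),
	}}
	if len(r.Violations) > 0 {
		first := r.Violations[0] // only the first violation is stamped
		rec.Attrs["forge.guardrail.type"] = first.Type
		rec.Attrs["forge.guardrail.category"] = first.Category
	}
	if r.Decision == "block" {
		rec.StatusError = true
		rec.StatusDesc = "blocked"
		if len(r.Violations) > 0 {
			// Assumed summary format; the real description may differ.
			v := r.Violations[0]
			rec.StatusDesc = fmt.Sprintf("%s: %s", v.Type, v.Category)
		}
	}
	return rec
}

func main() {
	rec := stamp(Result{
		Gate:       "input",
		Decision:   "block",
		Violations: []Violation{{Type: "pii", Category: "email"}},
	})
	fmt.Println(rec.StatusError, rec.StatusDesc, rec.Attrs)
}
```

With a real tracer, the same decisions would presumably map onto the span's attribute and status setters; the sketch only shows which fields feed which keys.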
Trace tree
Before:
```
a2a.tasks/send
└─ agent.execute
├─ llm.completion
└─ tool.
```
After:
```
a2a.tasks/send
├─ guardrail.input ← NEW (CheckInbound)
└─ agent.execute
├─ guardrail.context ← NEW (BeforeLLMCall hook scan)
├─ llm.completion
├─ guardrail.tool_call ← NEW (BeforeToolExec hook)
├─ tool.
├─ guardrail.output ← NEW (AfterToolExec hook, with tool=)
└─ guardrail.output ← NEW (CheckOutbound, no tool=)
```
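The tree above can be read as a lookup from `Check*` method to span name, with the tool attribute separating the two `guardrail.output` cases. The sketch below encodes that mapping in plain Go; the `spanFor` helper and the `web_search` tool name are hypothetical and only illustrate the naming rule described in this PR.

```go
package main

import "fmt"

// spanNames mirrors the method-to-span mapping listed in this PR.
var spanNames = map[string]string{
	"CheckInbound":    "guardrail.input",
	"CheckContext":    "guardrail.context",
	"CheckToolCall":   "guardrail.tool_call",
	"CheckOutbound":   "guardrail.output",
	"CheckToolOutput": "guardrail.output",
	"CheckStream":     "guardrail.stream",
}

// spanFor returns the span name for a method plus the tool attribute,
// which is set only on the tool-call and tool-output paths.
func spanFor(method, toolName string) (string, map[string]string, error) {
	name, ok := spanNames[method]
	if !ok {
		return "", nil, fmt.Errorf("unknown guardrail method %q", method)
	}
	attrs := map[string]string{}
	if method == "CheckToolCall" || method == "CheckToolOutput" {
		attrs["forge.tool.name"] = toolName
	}
	return name, attrs, nil
}

func main() {
	for _, m := range []string{"CheckInbound", "CheckToolOutput", "CheckOutbound"} {
		name, attrs, err := spanFor(m, "web_search") // hypothetical tool name
		if err != nil {
			fmt.Println("error:", err)
			continue
		}
		fmt.Println(m, "->", name, attrs)
	}
}
```

Both output paths share one span name, so a trace query that needs only model replies to the user would filter on the absence of `forge.tool.name`.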
Attribute reference (forge-core/observability/attrs.go)
Content-capture parity with #130
The `forge.guardrail.evidence` attribute uses the exact same `PrepareSpanContent(redact, maxBytes)` pipeline as `gen_ai.input.messages` and `forge.tool.args` — same vendor secret-token scrub, same 4 KiB byte cap, same `…[truncated:N]` marker.
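Same content rule as the audit event from PR #156: off by default; with `CaptureContent` on, the mask path emits post-mask content (what the LLM actually saw) and the block and warn paths emit the original content.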
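The evidence rule and the redact-then-truncate step can be sketched as follows. This is not the real `PrepareSpanContent`: the `prepareContent` and `evidenceFor` helpers, the `sk-` secret pattern, the `[REDACTED]` placeholder, the reading of `N` as the number of dropped bytes, and the choice to emit no evidence on `allow` are all assumptions made for illustration.

```go
package main

import (
	"fmt"
	"regexp"
	"unicode/utf8"
)

// secretPattern is a hypothetical stand-in for the vendor secret-token scrub.
var secretPattern = regexp.MustCompile(`sk-[A-Za-z0-9]{8,}`)

// prepareContent redacts first, then truncates to maxBytes on a rune
// boundary and appends a marker with the number of bytes dropped.
func prepareContent(s string, redact bool, maxBytes int) string {
	if redact {
		s = secretPattern.ReplaceAllString(s, "[REDACTED]")
	}
	if maxBytes <= 0 || len(s) <= maxBytes {
		return s
	}
	cut := maxBytes
	// Back up so a multi-byte UTF-8 character is never split.
	for cut > 0 && !utf8.RuneStart(s[cut]) {
		cut--
	}
	return fmt.Sprintf("%s…[truncated:%d]", s[:cut], len(s)-cut)
}

// evidenceFor picks which content becomes evidence: nothing when capture
// is off, post-mask content on mask, original content on block or warn.
func evidenceFor(capture bool, decision, original, masked string) (string, bool) {
	if !capture {
		return "", false
	}
	switch decision {
	case "mask":
		return masked, true
	case "block", "warn":
		return original, true
	}
	return "", false // assumption: allow decisions carry no evidence
}

func main() {
	ev, ok := evidenceFor(true, "mask", "mail alice@example.com", "mail [EMAIL]")
	if ok {
		fmt.Println(prepareContent(ev, true, 4096)) // 4 KiB cap from the PR
	}
	fmt.Println(prepareContent("key sk-abcdef123456 end", true, 4096))
	fmt.Println(prepareContent("abcdefghij", false, 4))
}
```

Redacting before truncating matters in this sketch: a secret that straddles the byte cap is replaced whole rather than cut into a partial token that the pattern would no longer match.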
Wiring
Files
Test plan
Stamping stack now complete
After this merges, every guardrail decision lands on both pipelines symmetrically: the audit stream and the OTel trace pipeline.
SIEM consumers reading the audit stream and trace consumers reading the OTel pipeline see the same information, in the same shape, with the same content-capture posture — one mental model.
Closes #161