feat(audit): stamp entity_id + entity_type on every event (closes #164) #165

Merged
initializ-mk merged 1 commit into main from feat/issue-164-entity-stamp on Jun 15, 2026
@initializ-mk

Contributor

Summary

Adds top-level `entity_id` and `entity_type` fields to every Forge audit event, sourced from `FORGE_AGENT_ID` (or forge.yaml `agent_id`) with `entity_type` hardcoded to `"agent"`. Field names + values match the guardrails library's vocabulary 1:1 so consumers can join the Forge NDJSON stream against the library's MongoDB `GuardrailAuditEvent` collection on `(entity_id, entity_type)` without translation.

Why entity_id + entity_type, not agent_id

The original #164 proposal used `agent_id`. This PR renames it to `entity_id` + `entity_type` for two reasons:

  1. 1:1 column compatibility — The guardrails library's `BasePayload` carries `EntityID` + `EntityType` (constants `agent` / `workflow` / `assistant`). When `FORGE_GUARDRAILS_DB` is set, the library writes `GuardrailAuditEvent` records into MongoDB with these exact column names. Forge using `agent_id` would force every consumer reading both streams to maintain a translation table forever.

  2. Future-proof for non-agent entities — Forge runs agents today, but the library already supports workflows and assistants. Encoding the entity type as a value (not a field name) means adding a second entity type later is an additive value change, not a schema change.
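The join described in reason 1 can be sketched as follows. This is a minimal illustration, assuming both streams have already been decoded into Go values. The `ForgeEvent` and `GuardrailRecord` types, the `Verdict` field, and the `joinByEntity` helper are hypothetical and are not part of Forge or the guardrails library; only the `entity_id` / `entity_type` key names come from this PR.

```go
package main

import "fmt"

// entityKey is the composite join key. Both streams use the same field
// names, so no translation table is needed.
type entityKey struct {
	EntityID   string
	EntityType string
}

// ForgeEvent is a hypothetical, trimmed-down view of a Forge NDJSON row.
type ForgeEvent struct {
	Event      string `json:"event"`
	EntityID   string `json:"entity_id"`
	EntityType string `json:"entity_type"`
}

// GuardrailRecord is a hypothetical view of a GuardrailAuditEvent document.
type GuardrailRecord struct {
	Verdict    string
	EntityID   string
	EntityType string
}

// joinByEntity groups guardrail records under the (entity_id, entity_type)
// pairs that also appear in the Forge event stream.
func joinByEntity(events []ForgeEvent, records []GuardrailRecord) map[entityKey][]GuardrailRecord {
	seen := make(map[entityKey]bool)
	for _, e := range events {
		seen[entityKey{e.EntityID, e.EntityType}] = true
	}
	out := make(map[entityKey][]GuardrailRecord)
	for _, r := range records {
		k := entityKey{r.EntityID, r.EntityType}
		if seen[k] {
			out[k] = append(out[k], r)
		}
	}
	return out
}

func main() {
	events := []ForgeEvent{{Event: "session_start", EntityID: "my-agent", EntityType: "agent"}}
	records := []GuardrailRecord{
		{Verdict: "allow", EntityID: "my-agent", EntityType: "agent"},
		{Verdict: "block", EntityID: "my-agent", EntityType: "workflow"},
	}
	joined := joinByEntity(events, records)
	fmt.Println(len(joined[entityKey{"my-agent", "agent"}]))
}
```

Because the entity type is part of the key, a workflow that happens to share an ID with an agent does not collide with it.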

Precedence

Two layers — no per-request header layer like #157 has, because entity identity is fixed at process startup:

| Layer | Source | Wins when |
|---|---|---|
| 1: Explicit on event | `AuditEvent.EntityID` / `AuditEvent.EntityType` | Always |
| 2: Deployment-time stamp | `FORGE_AGENT_ID` env → `cfg.AgentID` (forge.yaml); `entity_type` hardcoded to `"agent"` | Whenever the higher layer is empty |

If a deployment needs per-request routing, the tenancy layer (`X-Forge-Org-ID` / `X-Forge-Workspace-ID`) from #157 already covers it. Agent identity is a property of the process by definition, so it cannot vary per request.
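The two-layer rule can be sketched in isolation as below. This is an assumption-laden illustration, not the code in `forge-core/runtime/audit.go`: the struct layouts, the unexported field names, and the `stamp` helper are hypothetical, and the real `WithEntity` and `Emit` may differ in signature and behavior.

```go
package main

import "fmt"

// AuditEvent is a trimmed-down stand-in for the real event type.
type AuditEvent struct {
	Event      string
	EntityID   string
	EntityType string
}

// AuditLogger holds the deployment-time stamp (layer 2).
type AuditLogger struct {
	entityID   string
	entityType string
}

// WithEntity installs the static stamp. In this sketch empty arguments are
// ignored, so a partial call such as WithEntity("", id) sets only the ID.
func (l *AuditLogger) WithEntity(entityType, entityID string) {
	if entityType != "" {
		l.entityType = entityType
	}
	if entityID != "" {
		l.entityID = entityID
	}
}

// stamp applies the precedence rule: an explicit value on the event
// (layer 1) always wins; the static stamp fills only empty fields.
func (l *AuditLogger) stamp(ev AuditEvent) AuditEvent {
	if ev.EntityID == "" {
		ev.EntityID = l.entityID
	}
	if ev.EntityType == "" {
		ev.EntityType = l.entityType
	}
	return ev
}

func main() {
	logger := &AuditLogger{}
	logger.WithEntity("agent", "my-agent")

	stamped := logger.stamp(AuditEvent{Event: "session_start"})
	explicit := logger.stamp(AuditEvent{Event: "session_start", EntityID: "other", EntityType: "workflow"})
	fmt.Println(stamped.EntityID, stamped.EntityType)
	fmt.Println(explicit.EntityID, explicit.EntityType)
}
```

Resolving each field independently keeps the rule symmetric with the tenancy stamp: a caller can override one field on an event without having to restate the other.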

Wiring (forge-cli/runtime/runner.go)

Mirrors `BuildGuardrailChecker`'s existing AgentID resolution at `guardrails_loader.go:46-50` — env wins over forge.yaml. Called right after the existing `WithTenancy` stamp so all four tenancy/entity fields land together on every emit:

```go
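// Env wins over forge.yaml, matching BuildGuardrailChecker's resolution.
agentID := os.Getenv("FORGE_AGENT_ID")
if agentID == "" && r.cfg.Config != nil {
	agentID = r.cfg.Config.AgentID
}
// Stamp right after WithTenancy so all four fields land on every emit.
auditLogger.WithEntity("agent", agentID)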
```
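For unit testing, the same env-over-config resolution could be pulled into a small function. The sketch below is illustrative only: `resolveAgentID` and the `Config` stand-in are hypothetical names, and injecting `getenv` is a testing convenience assumed here, not something the PR does.

```go
package main

import (
	"fmt"
	"os"
)

// Config is a hypothetical stand-in for the parsed forge.yaml.
type Config struct {
	AgentID string
}

// resolveAgentID returns the FORGE_AGENT_ID value when set and falls back
// to the forge.yaml agent_id otherwise. getenv is injected so tests do not
// need to mutate the process environment.
func resolveAgentID(getenv func(string) string, cfg *Config) string {
	if id := getenv("FORGE_AGENT_ID"); id != "" {
		return id
	}
	if cfg != nil {
		return cfg.AgentID
	}
	return ""
}

func main() {
	cfg := &Config{AgentID: "from-yaml"}
	fmt.Println(resolveAgentID(os.Getenv, cfg))
}
```

A nil config yields an empty ID rather than a panic, which mirrors the `r.cfg.Config != nil` guard in the snippet above.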

Event shape

Before:
```json
{"ts":"...","event":"session_start","correlation_id":"...","task_id":"...","org_id":"org_x","workspace_id":"ws_y"}
```

After (no behavior change unless `FORGE_AGENT_ID` or forge.yaml `agent_id` is set):
```json
{"ts":"...","event":"session_start","correlation_id":"...","task_id":"...","org_id":"org_x","workspace_id":"ws_y","entity_id":"my-agent","entity_type":"agent"}
```
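The before/after shapes follow from Go's `omitempty` JSON tag, which drops a string field from the output when it is empty. A minimal sketch, using a hypothetical subset of the event fields rather than the real `AuditEvent` definition:

```go
package main

import (
	"encoding/json"
	"fmt"
)

// AuditEvent is a trimmed-down, hypothetical version of the event struct.
type AuditEvent struct {
	Event      string `json:"event"`
	OrgID      string `json:"org_id,omitempty"`
	EntityID   string `json:"entity_id,omitempty"`
	EntityType string `json:"entity_type,omitempty"`
}

// encode marshals an event to a single JSON line.
func encode(ev AuditEvent) string {
	b, err := json.Marshal(ev)
	if err != nil {
		panic(err) // not expected for a struct of plain strings
	}
	return string(b)
}

func main() {
	// No entity stamp: the two new keys are absent from the output.
	fmt.Println(encode(AuditEvent{Event: "session_start", OrgID: "org_x"}))
	// With a stamp: the keys are appended after the existing fields.
	fmt.Println(encode(AuditEvent{Event: "session_start", OrgID: "org_x", EntityID: "my-agent", EntityType: "agent"}))
}
```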

Files

| File | Change |
|---|---|
| `forge-core/runtime/audit.go` | `EntityID` / `EntityType` fields on `AuditEvent` (omitempty); `tenantEntityID` / `tenantEntityType` on `AuditLogger`; `WithEntity(entityType, entityID)` setter; `entityStamp()` internal accessor; stamp pass in `Emit` (no ctx layer needed; `EmitFromContext` reaches `Emit` at the end) |
| `forge-cli/runtime/runner.go` | Read `FORGE_AGENT_ID` / `cfg.AgentID`, call `auditLogger.WithEntity(...)` right after the existing `WithTenancy` call |
| `forge-core/runtime/entity_test.go` (new) | Static stamp on plain Emit, omit when unset, EmitFromContext per-invocation events, explicit value beats stamp, partial WithEntity |
| `docs/security/audit-logging.md` | New Entity stamping subsection with the 1:1 library-join note |
| `docs/security/tenancy.md` | Precedence table split into Tenancy fields + Entity fields subsections; documents the no-header-layer choice |

Test plan

  • `go test ./...` clean in forge-core and forge-cli
  • `golangci-lint run ./...` → 0 issues in both modules
  • `gofmt -w` applied
  • All 5 new unit tests pass: `TestAuditLogger_StaticEntityStampsPlainEmit`, `TestAuditLogger_NoEntityStamp_OmitsFields`, `TestEmitFromContext_StaticEntityStampsPerInvocationEvents`, `TestEmitFromContext_ExplicitEntityValueWins`, `TestAuditLogger_WithEntity_PartialStamp`
  • End-to-end smoke: run with `FORGE_AGENT_ID=aibuilderdemo`, tail audit socket, confirm every event (startup banners + per-invocation rows) carries `"entity_id":"aibuilderdemo","entity_type":"agent"`
  • With `FORGE_GUARDRAILS_DB` set: confirm the library's `GuardrailAuditEvent.entity_id` value matches Forge's NDJSON `entity_id` value

Schema impact

Additive only. Both keys use `omitempty`. Deployments setting neither env nor forge.yaml `agent_id` keep emitting the pre-#164 JSON shape verbatim — no `AuditSchemaVersion` bump.

Stamping stack now complete

After this merges, every audit event carries the full deploy identifier set:

| Field | Source | PR |
|---|---|---|
| `org_id` | env + per-request header | #157 |
| `workspace_id` | env + per-request header | #157 |
| `entity_id` | env + forge.yaml | this PR |
| `entity_type` | hardcoded `"agent"` (future-extensible) | this PR |

SIEM filter: `org_id=X AND workspace_id=Y AND entity_id=Z` uniquely identifies a Forge deploy across the export stream.
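For consumers without a SIEM, the same three-field filter can be applied directly to the NDJSON stream. The sketch below is illustrative; `deployKey` and `matchesDeploy` are hypothetical names introduced here, and only the JSON key names come from the audit event shape.

```go
package main

import (
	"encoding/json"
	"fmt"
)

// deployKey holds the three fields used by the filter.
type deployKey struct {
	OrgID       string `json:"org_id"`
	WorkspaceID string `json:"workspace_id"`
	EntityID    string `json:"entity_id"`
}

// matchesDeploy reports whether one NDJSON line belongs to the given deploy.
// Lines that fail to parse are treated as non-matching.
func matchesDeploy(line []byte, want deployKey) bool {
	var got deployKey
	if err := json.Unmarshal(line, &got); err != nil {
		return false
	}
	return got == want
}

func main() {
	want := deployKey{OrgID: "org_x", WorkspaceID: "ws_y", EntityID: "my-agent"}
	line := []byte(`{"event":"session_start","org_id":"org_x","workspace_id":"ws_y","entity_id":"my-agent","entity_type":"agent"}`)
	fmt.Println(matchesDeploy(line, want))
}
```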

Closes #164

Adds top-level `entity_id` and `entity_type` fields to every
Forge audit event, sourced from FORGE_AGENT_ID / forge.yaml
`agent_id` with `entity_type` hardcoded to "agent".

Field names + values are taken straight from the guardrails
library's BasePayload vocabulary (EntityID, EntityType — "agent"
/ "workflow" / "assistant" constants) so consumers reading both
the Forge NDJSON stream and the library's MongoDB
GuardrailAuditEvent collection can join on
`(entity_id, entity_type)` 1:1 without a translation table.

Two-layer precedence (no per-request override layer — entity
identity is fixed at process startup):

  1. Explicit EntityID/EntityType on the event
  2. AuditLogger.WithEntity static stamp (from env / forge.yaml)

Both fields use omitempty. Deployments not setting agent_id keep
emitting the pre-#164 JSON shape verbatim. No schema bump.

Wiring (forge-cli/runtime/runner.go) mirrors BuildGuardrailChecker's
existing AgentID resolution (guardrails_loader.go:46-50): env wins
over forge.yaml. Called right after the existing WithTenancy stamp
so all four tenancy/entity fields land together on every event,
including startup banners (agent_card_published, policy_loaded,
audit_export_status).

Tests pin: static stamp on plain Emit, no stamp omits both keys,
EmitFromContext per-invocation events carry the stamp alongside
correlation_id, explicit event value beats the static stamp,
partial WithEntity ("", id) installs only EntityID.

Docs:
- docs/security/audit-logging.md gains an Entity stamping section
  with the 1:1 library-join note.
- docs/security/tenancy.md splits the precedence table into
  Tenancy fields + Entity fields subsections; documents the
  no-header-layer choice.

Future-proofs for non-agent entities: when Forge adds workflow
or assistant runtimes, the field name doesn't change — only the
stamped value. Additive value change, not a schema change.
@initializ-mk merged commit fff0452 into main on Jun 15, 2026
10 checks passed
Development

Successfully merging this pull request may close these issues.

Audit events: stamp entity_id + entity_type on every event (mirror of #157 tenancy stamp)
