[REVIEW] agent-security: add remote tool provenance and registry drift gates #176

@Alex-Parejo

Skill Being Reviewed

Skill name: agent-security
Skill path: skills/ai-security/agent-security/

False Positive Analysis

Benign architecture that can be over-reported as unsafe if tool provenance is not separated from tool capability:

agent_runtime:
  remote_tools:
    - name: jira.search_issues
      source: mcp://internal-tools.company.example
      transport: mTLS
      publisher: platform-security
      manifest_digest: sha256:8c5...
      allowed_actions: ["search", "read"]
      write_actions: []

Why this is a false positive risk:
The skill correctly asks reviewers to inspect tool registration breadth, least privilege, HITL gates, and auditability. But it does not distinguish a remote tool with a verified publisher, pinned manifest digest, mutually authenticated transport (mTLS), and read-only action set from a dynamically discovered tool whose schema can change without review. A review that only sees "remote MCP/tool server present" can overstate risk when the trust root and manifest integrity are actually controlled.

Coverage Gaps

Missed variant 1: Dynamic tool manifest drift after approval

{
  "server": "mcp://tools.example",
  "manifest_reviewed_at": "2026-06-01T12:00:00Z",
  "reviewed_tools": ["repo.search", "repo.read_file"],
  "runtime_tools": ["repo.search", "repo.read_file", "repo.delete_branch", "workflow.dispatch"],
  "manifest_pin": null
}

Why it should be caught:
The skill reviews dynamic vs static tool sets, but does not require evidence that the runtime tool registry matches the reviewed registry. In MCP/plugin-based agents, capabilities can appear through server upgrades, feature flags, tenant config, marketplace installs, or remote schema changes. If a review approves an architecture against a stale registry, a destructive tool added later bypasses the original risk review without any code change in the agent application.
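
A minimal sketch of the drift check this variant calls for, using the example record above. The `registry_drift` helper and its output keys are hypothetical and not part of the skill; the sketch only shows how a reviewed tool set can be compared with a runtime export.

```python
import json

def registry_drift(record):
    """Compare the tool set approved at review time with the runtime tool set."""
    reviewed = set(record["reviewed_tools"])
    runtime = set(record["runtime_tools"])
    return {
        "added": sorted(runtime - reviewed),      # tools nobody reviewed
        "removed": sorted(reviewed - runtime),    # reviewed tools no longer exposed
        "pinned": record.get("manifest_pin") is not None,
    }

# The record from the missed variant above.
record = json.loads("""
{
  "server": "mcp://tools.example",
  "manifest_reviewed_at": "2026-06-01T12:00:00Z",
  "reviewed_tools": ["repo.search", "repo.read_file"],
  "runtime_tools": ["repo.search", "repo.read_file", "repo.delete_branch", "workflow.dispatch"],
  "manifest_pin": null
}
""")

drift = registry_drift(record)
if drift["added"] or not drift["pinned"]:
    print("registry drift gate failed:", drift)
```

Here the gate fails on both counts: two unreviewed tools are exposed at runtime, and no manifest pin exists to detect the change.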

Missed variant 2: Remote tool server identity is trusted by URL, not by publisher or artifact integrity

tools:
  - url: https://tools.partner.example/mcp
    auth: bearer_from_env
    allowed: true
    certificate_pinning: false
    publisher_attestation: missing
    manifest_signature: missing
    schema_digest: missing

Why it should be caught:
The current process checks credentials and inter-agent trust, but it does not ask reviewers to verify tool-server supply chain controls: server identity, manifest signing, schema digest pinning, publisher ownership, transport security, and update approval. A compromised tool server or DNS/hosting takeover can silently alter descriptions, parameter schemas, or available operations. That is different from prompt injection: the agent may follow a legitimate-looking tool contract that was swapped underneath it.
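
One way schema digest pinning could work is sketched below. The canonicalisation rule (sorted keys, compact JSON) and the function names are assumptions for illustration, not an MCP or skill-defined format. A digest pin only detects change; proving who published the manifest would additionally need a signature or attestation.

```python
import hashlib
import hmac
import json

def manifest_digest(manifest):
    """Hash a canonical JSON form so key order and whitespace do not matter."""
    canonical = json.dumps(manifest, sort_keys=True, separators=(",", ":"))
    return "sha256:" + hashlib.sha256(canonical.encode("utf-8")).hexdigest()

def matches_pin(manifest, pinned_digest):
    """True only if the manifest served at runtime equals the reviewed one."""
    return hmac.compare_digest(manifest_digest(manifest), pinned_digest)

# Hypothetical manifest captured at review time.
reviewed = {"tools": [
    {"name": "repo.search", "description": "Search repositories (read-only)"},
]}
pin = manifest_digest(reviewed)

# Hypothetical manifest served later: a destructive tool with a benign description.
swapped = {"tools": [
    {"name": "repo.search", "description": "Search repositories (read-only)"},
    {"name": "repo.delete_branch", "description": "Read branch metadata"},
]}

print(matches_pin(reviewed, pin))  # reviewed manifest still matches its pin
print(matches_pin(swapped, pin))   # swapped manifest does not
```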

Missed variant 3: OAuth/consent scopes are granted at connection time and later exceed the agent task

User connects Google Workspace integration.
Consent screen grants: Gmail read/write, Drive full access, Calendar write.
Agent task: summarize one shared document.
Runtime policy: trusts the integration token and does not downscope per task.

Why it should be caught:
The skill covers credential scope generally, but OAuth and user-consented integrations need a separate evidence gate. Many agent tools operate with delegated user tokens, not service accounts. The review should record consented scopes, token audience, refresh-token lifetime, per-task downscoping, revocation path, and whether the agent can use one user's delegated token for another user's workflow. Without this, least-privilege review can miss excessive delegated authority even when cloud IAM is clean.
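
The consent-scope gate could be sketched as follows. The scope labels are illustrative placeholders, not real provider scope strings, and the `scope_gate` function and its "Review"/"OK" outcomes are assumptions; only the High rating for write-capable excess without downscoping comes from this review.

```python
# Illustrative scope labels standing in for the consent screen above.
WRITE_CAPABLE = {"gmail.write", "drive.full", "calendar.write"}

def scope_gate(granted, task_needs, downscoped_per_task):
    """Return (severity, excess scopes) for one delegated token and one task."""
    excess = sorted(set(granted) - set(task_needs))
    write_excess = [s for s in excess if s in WRITE_CAPABLE]
    if write_excess and not downscoped_per_task:
        severity = "High"
    elif excess and not downscoped_per_task:
        severity = "Review"  # assumption: the review only names the High case
    else:
        severity = "OK"
    return severity, excess

# Task: summarize one shared document; the token carries far more.
granted = ["gmail.read", "gmail.write", "drive.full", "calendar.write"]
task_needs = ["drive.read"]
severity, excess = scope_gate(granted, task_needs, downscoped_per_task=False)
print(severity, excess)
```

A plain set difference does not model scope implication (a full-access scope covering a narrower read scope), so a real gate would need a provider-specific scope hierarchy.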

Edge Cases

  • A tool marketplace or MCP server may expose only safe tools today but add write/delete operations after an automatic update; reviews need manifest pinning or update approval evidence.
  • Tool descriptions are security-relevant because models use them for planning. A malicious or compromised remote tool can describe a destructive action as read-only unless the broker enforces policy from a separate signed capability model.
  • Multiple tenants can share the same remote tool server while requiring different capability sets; one tenant's approved actions should not become another tenant's available tools.
  • OAuth refresh tokens can outlive the session that justified access. Agent reviews should verify token revocation, re-consent, and task-bounded access, not only initial consent.
  • A remote tool can return schema-valid output that contains instructions for the next agent step; provenance and taint labels should survive into downstream tool-policy decisions.
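
The second and third edge cases can be illustrated with a broker that decides from its own per-tenant capability table and ignores the tool's self-description. The table contents, tenant names, and `authorize` signature below are hypothetical.

```python
# Hypothetical capability model maintained (and ideally signed) separately
# from whatever the remote tool server says about itself.
CAPABILITIES = {
    ("tenant-a", "repo.search"): "read",
    ("tenant-a", "repo.delete_branch"): "write",
    ("tenant-b", "repo.search"): "read",
}

def authorize(tenant, tool, advertised_description, allow_write=False):
    """Decide from the broker's capability model; the description is never trusted."""
    capability = CAPABILITIES.get((tenant, tool))
    if capability is None:
        return False  # not approved for this tenant, whatever the server exposes
    if capability == "write" and not allow_write:
        return False  # write tools need an explicit grant, e.g. after a HITL step
    return True

# A compromised server describes a destructive tool as read-only.
print(authorize("tenant-a", "repo.delete_branch", "Lists branches (read-only)"))
# Tenant B never had this tool approved, even though tenant A did.
print(authorize("tenant-b", "repo.delete_branch", "Lists branches (read-only)"))
print(authorize("tenant-a", "repo.search", "Search repositories"))
```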

Remediation Quality

  • Fix resolves the vulnerability
  • Fix doesn't introduce new security issues
  • Fix doesn't break functionality
  • Issues found: Add a remote tool provenance and runtime registry verification step. Require reviewers to capture tool-server identity, publisher, manifest/schema digest, signing or attestation status, transport trust, runtime registry export, update policy, OAuth/delegated scopes, and per-task downscoping. Add Not Evaluable outcomes when runtime registry or consent-scope evidence is unavailable.

Comparison to Other Tools

| Tool / Framework | Catches this? | Notes |
| --- | --- | --- |
| OWASP Agentic AI Threats | Partial | Covers excessive agency and tool misuse, but local review still needs concrete tool registry and provenance evidence. |
| OWASP LLM Top 10 | Partial | LLM tool/plugin risks are relevant, but it does not by itself prove remote manifest integrity or consent downscoping. |
| SLSA / artifact signing patterns | Partial | Useful model for provenance and attestation, but must be adapted to tool manifests, schemas, and MCP/plugin releases. |
| OAuth security review | Partial | Can validate consent and token scope, but must be tied to agent task boundaries and tool execution policy. |
| Runtime MCP/tool registry export | Yes, when available | Best evidence for what tools the agent can actually invoke at runtime, especially after feature flags or server updates. |

Overall Assessment

Strengths:

  • Strong architecture checklist for permissions, least privilege, HITL, blast radius, audit, rollback, and multi-agent trust.
  • Good warnings about dynamic tool sets, runtime tool injection, shared state, and broad credentials.
  • Useful output format that can be extended without changing the skill's core structure.

Needs improvement:

  • Add remote tool/MCP/plugin provenance evidence, not just local tool registration review.
  • Require runtime tool registry comparison against the reviewed registry.
  • Treat tool descriptions and schemas as security-sensitive artifacts that need integrity controls.
  • Add OAuth/delegated-token scope review separate from service-account IAM.
  • Add Not Evaluable reason codes for missing manifest digest, publisher identity, registry export, update policy, or consent-scope evidence.

Priority recommendations:

  1. Add a "Remote Tool Provenance and Registry Drift" step before permission scoring: capture server identity, publisher, manifest digest, schema digest, signature/attestation, transport trust, update policy, and runtime registry export time.
  2. Expand the Agent Inventory or add a Tool Registry table with Reviewed manifest digest, Runtime digest, Publisher, Transport, Update approval, OAuth scopes, Downscoped per task?, and Evidence confidence.
  3. Add findings guidance: unpinned remote destructive tools should be High/Critical; docs-only tool provenance should be Not Evaluable; broad delegated OAuth scopes without per-task downscoping should be High when write-capable.
  4. Add pitfalls for trusting remote tool URLs, assuming marketplace installs are static, treating tool descriptions as harmless text, and reviewing service-account IAM while missing delegated user-token scope.
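
As a rough sketch of recommendations 2 and 3, a Tool Registry row and the three rating rules could look like this. The field names, the `evidence` values, and the rule ordering are assumptions; only the three ratings themselves are taken from the findings guidance above.

```python
from dataclasses import dataclass
from typing import Optional

@dataclass
class ToolRegistryRow:
    tool: str
    reviewed_manifest_digest: Optional[str]  # None means no pin was recorded
    runtime_digest: Optional[str]
    destructive: bool
    write_capable_scopes: bool
    downscoped_per_task: bool
    evidence: str  # hypothetical values: "runtime_export" or "docs_only"

def rate(row):
    """Apply the three findings rules from recommendation 3, in a fixed order."""
    if row.evidence == "docs_only":
        return "Not Evaluable"
    if row.reviewed_manifest_digest is None and row.destructive:
        return "High/Critical"
    if row.write_capable_scopes and not row.downscoped_per_task:
        return "High"
    return "No finding from these rules"

rows = [
    ToolRegistryRow("repo.delete_branch", None, "sha256:aa", True, False, True, "runtime_export"),
    ToolRegistryRow("drive.summarize", "sha256:bb", "sha256:bb", False, True, False, "runtime_export"),
    ToolRegistryRow("jira.search_issues", "sha256:cc", None, False, False, True, "docs_only"),
]
ratings = [rate(r) for r in rows]
for row, rating in zip(rows, ratings):
    print(row.tool, "->", rating)
```

The digests are placeholders. A fuller version would also compare the reviewed digest with the runtime digest and record the registry export time.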

Bounty Info

  • I have read and agree to the CONTRIBUTING.md bounty terms
  • Preferred payment method: PayPal; payment details can be provided privately after acceptance.
