feat: large tool result offload by lizradway · Pull Request #2162 · strands-agents/sdk-python

lizradway · 2026-04-20T15:48:13Z

Description

Adds a ContextOffloader plugin that proactively intercepts oversized tool results via AfterToolCallEvent, persists each content block individually to a pluggable storage backend, and replaces the in-context result with a truncated preview plus per-block references. Includes an optional built-in retrieval tool (disabled by default) so the agent can fetch offloaded content on demand, returning each content type in its native format.

Problem: When tool outputs exceed the model's context capacity, SlidingWindowConversationManager reactively truncates the result to its first and last 200 characters, which loses data permanently and wastes a failed API call.

Solution: A Plugin (following the AgentSkills pattern) that operates at tool execution time, before the result enters the conversation. Each content block is stored individually with its content type preserved, enabling type-aware retrieval. Inline guidance in each offloaded result tells the agent to use the preview when possible and to use its available tools to selectively access the data it needs.
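The flow described above can be sketched roughly as follows. This is a minimal illustration rather than the plugin's actual code: `offload_result`, the `store` callable, the character-based thresholds, and the reference and guidance wording are all hypothetical names chosen for the example. It handles only text blocks and measures size in characters to stay self-contained.

```python
from typing import Callable

saved: dict[str, str] = {}  # hypothetical stand-in for a storage backend


def store(content: str, content_type: str) -> str:
    """Persist one block and return a reference to it (illustrative only)."""
    ref = f"ref-{len(saved)}"
    saved[ref] = content
    return ref


def offload_result(
    blocks: list[dict],
    store: Callable[[str, str], str],
    max_chars: int = 200,
    preview_chars: int = 50,
) -> list[dict]:
    """Return blocks unchanged if small, else previews plus per-block references."""
    total = sum(len(block.get("text", "")) for block in blocks)
    if total <= max_chars:
        return blocks  # under the threshold: nothing to offload

    replaced = []
    for block in blocks:
        if "text" not in block:
            replaced.append(block)  # unknown block types pass through unchanged
            continue
        ref = store(block["text"], "text/plain")  # each block is stored individually
        preview = block["text"][:preview_chars]
        replaced.append({"text": f"{preview}\n[offloaded: {ref}]"})

    # Inline guidance (hypothetical wording) travels with the offloaded result.
    replaced.append(
        {"text": "Use the preview when possible; use your available tools to access the full data selectively."}
    )
    return replaced
```

Calling `offload_result([{"text": "x" * 500}], store)` in this sketch yields a 50-character preview with a reference, followed by the guidance block, while the full text lands in `saved`.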

Token-based thresholds: Uses the agent's model.count_tokens() for accurate token estimation (tiktoken when available, chars/4 heuristic fallback). The async hook wraps the tool result as a message for counting. Preview slicing uses tiktoken for exact token-level cuts when available and falls back to a cut at tokens * 4 characters otherwise.
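The fallback behaviour described above might look something like the sketch below. The function names and the `Encoding` protocol are assumptions made for illustration; the only tokenizer methods relied on are `encode` and `decode`, which a tiktoken encoding provides. Rounding up in the chars/4 heuristic is also an assumption, since the description does not say how the estimate is rounded.

```python
import math
from typing import Optional, Protocol


class Encoding(Protocol):
    """The slice of a tokenizer used here; a tiktoken encoding offers both methods."""

    def encode(self, text: str) -> list[int]: ...

    def decode(self, tokens: list[int]) -> str: ...


def estimate_tokens(text: str, encoding: Optional[Encoding] = None) -> int:
    """Count tokens exactly when an encoding is available, else use chars/4."""
    if encoding is not None:
        return len(encoding.encode(text))
    return math.ceil(len(text) / 4)  # heuristic fallback (rounding is an assumption)


def slice_preview(text: str, preview_tokens: int, encoding: Optional[Encoding] = None) -> str:
    """Cut a preview at a token boundary when possible, else at tokens * 4 chars."""
    if encoding is not None:
        return encoding.decode(encoding.encode(text)[:preview_tokens])
    return text[: preview_tokens * 4]
```

Passing the encoding in as a parameter keeps the sketch usable whether or not tiktoken is installed.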

Content type handling:

Type	Behavior
Text	Stored as `text/plain`, replaced with a preview
JSON	Stored as `application/json` (serialized via `json.dumps`), replaced with a preview
Image	Stored in native format (e.g., `image/png`), replaced with placeholder + reference
Document	Stored in native format (e.g., `application/pdf`), replaced with placeholder + reference
Unknown	Passed through unchanged
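As a rough illustration of the table above, the mapping from a content block to a stored content type could be sketched as below. The block shapes (`{"text": ...}`, `{"json": ...}`, and `format` plus `source.bytes` for images and documents) are assumed here, and `classify_block` is a hypothetical helper. Deriving the document MIME type as `application/<format>` is a simplification that happens to work for PDF.

```python
import json
from typing import Optional


def classify_block(block: dict) -> Optional[tuple[str, bytes]]:
    """Return (content_type, payload) for a known block, or None to pass it through."""
    if "text" in block:
        return "text/plain", block["text"].encode("utf-8")
    if "json" in block:
        # JSON is serialized with json.dumps before storage, as in the table.
        return "application/json", json.dumps(block["json"]).encode("utf-8")
    if "image" in block:
        image = block["image"]
        return f"image/{image['format']}", image["source"]["bytes"]
    if "document" in block:
        document = block["document"]
        # Simplification: real documents need a proper format-to-MIME mapping.
        return f"application/{document['format']}", document["source"]["bytes"]
    return None  # unknown block types are left unchanged
```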

Storage backends (required — user must choose one):

InMemoryStorage: no filesystem side effects; content is cleared on process exit. Provides a clear() method for manual cleanup.
FileStorage: persists to disk with a .metadata.json sidecar so content types survive process restarts.
S3Storage: persists to Amazon S3 (follows S3SessionManager patterns); content type is preserved via S3 object metadata.
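To make the pluggable-backend idea concrete, here is a minimal sketch of what a storage protocol and an in-memory implementation might look like. The protocol name, the `store` and `retrieve` signatures, and the use of UUIDs as references are assumptions for illustration and are not taken from the PR; only `clear()` is mentioned in the description.

```python
import threading
import uuid
from typing import Protocol


class OffloadStorage(Protocol):
    """Hypothetical protocol: persist a block and fetch it back with its type."""

    def store(self, content: bytes, content_type: str) -> str: ...

    def retrieve(self, reference: str) -> tuple[bytes, str]: ...


class InMemorySketch:
    """In-memory backend sketch; contents vanish when the process exits."""

    def __init__(self) -> None:
        self._lock = threading.Lock()  # guards the dict for concurrent tool calls
        self._items: dict[str, tuple[bytes, str]] = {}

    def store(self, content: bytes, content_type: str) -> str:
        reference = uuid.uuid4().hex
        with self._lock:
            self._items[reference] = (content, content_type)
        return reference

    def retrieve(self, reference: str) -> tuple[bytes, str]:
        with self._lock:
            return self._items[reference]  # raises KeyError for unknown references

    def clear(self) -> None:
        """Manual cleanup, useful for long-running agents."""
        with self._lock:
            self._items.clear()
```

Keeping the content type next to the payload is what makes type-aware retrieval possible regardless of the backend.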

Built-in retrieval tool (opt-in, disabled by default):

retrieve_offloaded_content — agent can fetch offloaded content by reference
Enabled via include_retrieval_tool=True
Returns content in its native type: text as string, JSON as {"json": ...} block, images as {"image": ...} block, documents as {"document": ...} block
Retrieval results are excluded from re-offloading (prevents circular offload loops)
Disabled by default because once storage defaults to VFS/Sandbox, agents will use shell/grep/SQL tools to navigate offloaded results directly
Inline guidance in each offloaded result adapts: when the tool is enabled, it mentions retrieve_offloaded_content; in either case, it tells the agent to use its available tools
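The native-type return described in the list above could be sketched as follows. `to_native` is a hypothetical helper, and the image and document block shapes are assumed rather than confirmed by the PR text.

```python
import json
from typing import Union


def to_native(content: bytes, content_type: str) -> Union[str, dict]:
    """Rebuild the original content shape from stored bytes and a content type."""
    if content_type == "text/plain":
        return content.decode("utf-8")  # text comes back as a plain string
    if content_type == "application/json":
        return {"json": json.loads(content)}
    subtype = content_type.split("/", 1)[1]
    if content_type.startswith("image/"):
        return {"image": {"format": subtype, "source": {"bytes": content}}}
    # Anything else is treated as a document in this sketch.
    return {"document": {"format": subtype, "source": {"bytes": content}}}
```

Because the stored content type drives the branch, the retrieval side needs no knowledge of which tool produced the result.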

Usage:

from strands import Agent
from strands.vended_plugins.context_offloader import (
    ContextOffloader,
    InMemoryStorage,
)

# In-memory storage — context reduction only, no persistence
agent = Agent(plugins=[
    ContextOffloader(storage=InMemoryStorage())
])

from strands.vended_plugins.context_offloader import (
    ContextOffloader,
    FileStorage,
)

# File storage — persists artifacts to disk, custom thresholds
agent = Agent(plugins=[
    ContextOffloader(
        storage=FileStorage("./artifacts"),
        max_result_tokens=5_000,
        preview_tokens=2_000,
    )
])

from strands.vended_plugins.context_offloader import (
    ContextOffloader,
    S3Storage,
)

# S3 storage with retrieval tool enabled
agent = Agent(plugins=[
    ContextOffloader(
        storage=S3Storage(
            bucket="my-agent-artifacts",
            prefix="tool-results/",
        ),
        include_retrieval_tool=True,
    )
])

Related Issues

Closes #1296

Documentation PR

strands-agents/docs#772

Type of Change

New feature

Testing

How have you tested the change? Verify that the changes do not break functionality or introduce warnings in consuming repositories: agents-docs, agents-tools, agents-cli

I ran hatch run prepare
66 new unit tests covering all storage backends (including metadata sidecar persistence and corruption), all content types, retrieval tool opt-in/opt-out with native type returns, token-based thresholds with mocked count_tokens, inline guidance adaptation, edge cases (threshold boundaries, cancelled tools, storage failures, partial failures, path traversal, thread safety, input validation, circular offload prevention)
Manually tested via demo script confirming: token-based offloading triggers correctly, agent uses preview for summaries, retrieves only when specific data is needed, retrieval returns native content types
All existing tests pass (1710 passed, 10 pre-existing telemetry failures unrelated to this change)
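One of the edge cases listed above, path traversal, is commonly guarded against by resolving the requested path and checking that it stays inside the storage directory. The sketch below shows that general technique; `resolve_artifact_path` is a hypothetical helper and is not claimed to match how FileStorage implements its check.

```python
from pathlib import Path


def resolve_artifact_path(base_dir: str, reference: str) -> Path:
    """Map a reference to a path under base_dir, rejecting escapes such as '../'."""
    base = Path(base_dir).resolve()
    candidate = (base / reference).resolve()
    # is_relative_to requires Python 3.9+; the equality check rejects empty references.
    if candidate == base or not candidate.is_relative_to(base):
        raise ValueError(f"reference escapes storage directory: {reference!r}")
    return candidate
```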

Checklist

I have read the CONTRIBUTING document
I have added any necessary tests that prove my fix is effective or my feature works
I have updated the documentation accordingly
I have added an appropriate example to the documentation to outline the feature, or no new docs are needed
My changes generate no new warnings
Any dependent changes have been merged and published

By submitting this pull request, I confirm that you can use, modify, copy, and redistribute this contribution, under the terms of your choice.

codecov · 2026-04-20T15:54:37Z

Codecov Report

❌ Patch coverage is 95.43726% with 12 lines in your changes missing coverage. Please review.

Files with missing lines	Patch %	Lines
...strands/vended_plugins/context_offloader/plugin.py	94.44%	3 Missing and 5 partials ⚠️
...trands/vended_plugins/context_offloader/storage.py	96.55%	3 Missing and 1 partial ⚠️

github-actions · 2026-04-20T15:56:48Z

Assessment: Comment

Well-designed plugin that solves a real problem — proactive tool result externalization is a clear improvement over reactive truncation. The Plugin/Protocol/storage architecture is clean and follows existing SDK patterns (AgentSkills for plugin design, S3SessionManager for S3 client setup). Test coverage is thorough at 47 tests across all backends and content types.

Review Categories

API Review: This PR introduces 5 new public types (Plugin + Protocol + 3 storage backends). Per the API Bar Raising process, it should carry the needs-api-review label before merge. The API surface is well-documented in the PR description with use cases and code examples.
Input Validation: Constructor parameters max_result_chars and preview_chars lack validation — negative values or preview_chars >= max_result_chars would cause silent misbehavior.
Code Duplication: _image_placeholder is duplicated from SlidingWindowConversationManager — extracting to a shared utility would align with the composability tenet.
Storage Lifecycle: The ExternalizationStorage protocol has no cleanup/deletion mechanism, which could lead to unbounded growth in the in-memory backend for long-running agents.
Documentation Gap: AGENTS.md directory structure needs updating per repository guidelines.
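The input-validation point above could be addressed with a check along these lines. This is a sketch only: `validate_thresholds` and its parameter names are hypothetical, and the exact rules the plugin adopted are not shown in this thread.

```python
def validate_thresholds(max_result: int, preview: int) -> None:
    """Reject threshold combinations that would cause silent misbehavior."""
    if max_result <= 0:
        raise ValueError("max_result must be positive")
    if preview < 0:
        raise ValueError("preview must not be negative")
    if preview >= max_result:
        # A preview as large as the threshold would defeat the offload.
        raise ValueError("preview must be smaller than max_result")
```

Failing fast in the constructor surfaces a misconfiguration at agent setup time instead of at the first oversized tool result.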

Solid implementation overall — the architecture is well thought out and the test suite is comprehensive. 🙏

github-actions · 2026-04-23T18:40:54Z

Assessment: Comment

The move to token-based thresholds is excellent — it directly addresses mkmeral's earlier feedback and is a significant improvement over character heuristics. The integration with model.count_tokens() is clean, and the tiktoken-based preview slicing is a nice touch. The async hook transition is handled correctly with proper test mocking.

Review Items

Private API coupling: _get_encoding is imported from strands.models.model — a private function. Consider inlining the tiktoken logic (4 lines) to avoid fragile cross-module coupling to an internal API.
Unrelated test changes: test_model.py adds tests for _ModelPlugin, context_window_limit, and stateful that appear orthogonal to context offloading. These should be documented or split out.

The token-based approach is the right design. The private API import is the main item to resolve before merge.

github-actions · 2026-04-23T18:42:34Z

Assessment: Approve

No new changes since the previous review. The one open thread (_get_encoding private API import at line 41) still applies — resolving that by inlining the tiktoken logic would remove the only remaining fragile coupling. Everything else is merge-ready.

opieter-aws · 2026-04-23T19:10:23Z

nit: Should the PR title reference offloading for documentation purposes?

mkmeral

get_encoding comment is the main one. otherwise it looks good to me (if review agent also agrees :p )

github-actions · 2026-04-24T14:42:19Z

Assessment: Approve

The opt-in retrieval tool is a good design evolution — making it disabled by default with include_retrieval_tool=True is forward-compatible with VFS/Sandbox while keeping the escape hatch available. The init_agent() approach to strip the auto-discovered tool is clean. Inline guidance adapting based on tool availability is a nice UX touch. System prompt injection removal simplifies the plugin's lifecycle footprint.

No new issues found. This PR is ready to merge.

github-actions · 2026-04-24T16:41:49Z

Assessment: Approve

The self-skip guard refinement is a nice improvement — gating on self._include_retrieval_tool avoids unnecessary string comparison when the tool is disabled, and using self.retrieve_offloaded_content.tool_name instead of a hardcoded "retrieve_offloaded_content" string eliminates a magic constant. Both the enabled-skip and disabled-no-skip test paths are covered.
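The guard described in this review might be sketched as below. The attribute names `_include_retrieval_tool` and `retrieve_offloaded_content.tool_name` come from the review text; the surrounding classes and the `should_skip` method are hypothetical scaffolding added so the example runs on its own.

```python
from dataclasses import dataclass


@dataclass
class RetrievalTool:
    """Stand-in for the decorated tool object, which exposes its own name."""

    tool_name: str = "retrieve_offloaded_content"


class OffloaderSketch:
    def __init__(self, include_retrieval_tool: bool = False) -> None:
        self._include_retrieval_tool = include_retrieval_tool
        self.retrieve_offloaded_content = RetrievalTool()

    def should_skip(self, tool_name: str) -> bool:
        """Skip offloading for the retrieval tool's own results when it is enabled."""
        # The flag is checked first, so the name comparison only runs when needed.
        return (
            self._include_retrieval_tool
            and tool_name == self.retrieve_offloaded_content.tool_name
        )
```

Reading the name from the tool object means a rename of the tool cannot silently break the guard.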

No new issues. This PR is ready to merge.

github-actions Bot added the size/xl label Apr 20, 2026

lizradway temporarily deployed to manual-approval April 20, 2026 15:48 — with GitHub Actions Inactive

lizradway had a problem deploying to auto-approve April 20, 2026 15:48 — with GitHub Actions Failure

lizradway added the needs-api-review Makes changes to the public API surface label Apr 20, 2026

github-actions Bot added the strands-running label Apr 20, 2026

github-actions Bot reviewed Apr 20, 2026

Comment thread src/strands/vended_plugins/result_externalizer/plugin.py Outdated

github-actions Bot reviewed Apr 20, 2026

Comment thread src/strands/vended_plugins/result_externalizer/storage.py Outdated

github-actions Bot reviewed Apr 20, 2026

Comment thread src/strands/vended_plugins/result_externalizer/plugin.py Outdated

This comment was marked as outdated.

github-actions Bot reviewed Apr 20, 2026

Comment thread src/strands/vended_plugins/result_externalizer/plugin.py Outdated

This comment was marked as outdated.

github-actions Bot reviewed Apr 20, 2026

Comment thread src/strands/vended_plugins/context_offloader/storage.py

github-actions Bot reviewed Apr 20, 2026

Comment thread src/strands/vended_plugins/result_externalizer/storage.py Outdated

github-actions Bot reviewed Apr 20, 2026

Comment thread src/strands/vended_plugins/result_externalizer/storage.py Outdated

github-actions Bot removed the strands-running label Apr 20, 2026

This was referenced Apr 20, 2026

design: add Tool Result Externalization proposal strands-agents/docs#762

Open

[FEATURE] Explore eviction strategy for externalized tools #2168

Open

[FEATURE] Explore common storage interface #2171

Open

github-actions Bot added size/xl and removed size/xl labels Apr 20, 2026

lizradway had a problem deploying to auto-approve April 20, 2026 20:33 — with GitHub Actions Failure

lizradway temporarily deployed to manual-approval April 20, 2026 20:33 — with GitHub Actions Inactive

github-actions Bot added the strands-running label Apr 20, 2026

github-actions Bot reviewed Apr 20, 2026

Comment thread AGENTS.md Outdated

github-actions Bot reviewed Apr 20, 2026

Comment thread src/strands/vended_plugins/result_externalizer/plugin.py Outdated

github-actions Bot reviewed Apr 20, 2026

Comment thread src/strands/vended_plugins/result_externalizer/storage.py Outdated

github-actions Bot reviewed Apr 20, 2026

Comment thread src/strands/vended_plugins/result_externalizer/plugin.py Outdated

github-actions Bot reviewed Apr 20, 2026

Comment thread src/strands/vended_plugins/result_externalizer/__init__.py Outdated

lizradway added 8 commits April 23, 2026 10:22

rename to offload, store binary types

d2e59e3

address comments

f75b55c

include metadata sidecar

4d4d2d5

remove retrieval offset/limit

71a77ee

fix infinite loop, return non-text as non-text, update sys prompt to use preview when possible

8da417c

simplify content type assignment

299656b

add metadata sidecar tests

e186d21

fix lint errors

b1ba2df

lizradway mentioned this pull request Apr 23, 2026

docs(decisions): use llm native units strands-agents/docs#777

Merged

5 tasks

opieter-aws previously approved these changes Apr 23, 2026

mkmeral reviewed Apr 23, 2026

lizradway added 2 commits April 23, 2026 12:48

move extensions to main plugin, include tool use prompt in preview message

2c4c841

use tokens as params instead of chars

f279aeb

github-actions Bot reviewed Apr 23, 2026

Comment thread src/strands/vended_plugins/context_offloader/plugin.py

make get_encoding public

d0c1754

opieter-aws previously approved these changes Apr 23, 2026

default to not include tool, remove sys prompt

1563045

mkmeral reviewed Apr 24, 2026

Comment thread src/strands/vended_plugins/context_offloader/plugin.py

Comment thread src/strands/vended_plugins/context_offloader/plugin.py

Comment thread src/strands/models/model.py Outdated

Comment thread src/strands/vended_plugins/context_offloader/plugin.py

make encoding private

4acd639

JackYPCOnline reviewed Apr 24, 2026

Comment thread src/strands/vended_plugins/context_offloader/storage.py

mkmeral reviewed Apr 24, 2026

Comment thread src/strands/vended_plugins/context_offloader/plugin.py

Comment thread src/strands/vended_plugins/context_offloader/plugin.py Outdated

check tool name from spec

4c8691b

mkmeral approved these changes Apr 24, 2026

This was referenced Apr 24, 2026

docs(designs): background tasks strands-agents/docs#780

Open

chore: update style guide for tool spec navigation #2203

Merged
