
fix(anthropic): default max_tokens to the model's output ceiling (#849) #853

Open
tombeckenham wants to merge 2 commits into main from 849-ai-anthropic-max_tokens-defaults-to-1024-silently-truncating-responses-when-caller-doesnt-set-it

Conversation

@tombeckenham tombeckenham commented Jun 26, 2026

Contributor

🎯 Changes

Closes #849.

@tanstack/ai-anthropic's text adapter defaulted the Anthropic max_tokens request field to 1024 when the caller didn't pass one. Anthropic's Messages API requires max_tokens, so the adapter must send some value, but 1024 is far below what the targeted models can produce. Any non-trivial generation (codegen, agentic tool flows, long-form output) was silently truncated mid-stream with stop_reason: "max_tokens", which made the run look like it "did nothing" rather than "ran out of output budget".

  • Model-meta-aware default. When the caller doesn't set modelOptions.max_tokens, default to the resolved model's real max_output_tokens from model-meta.ts (e.g. 64K Sonnet, 128K Opus) via a new getAnthropicDefaultMaxTokens(model). max_tokens is a ceiling, not a reservation (billing is per token generated), so this costs nothing unless the model genuinely produces more. Unrecognized model ids fall back to 64K (the current mainstream Claude tier's ceiling). The shared mapper means structuredOutput benefits too.
  • Truncation warning. When a response stops on stop_reason: "max_tokens" and the caller didn't set max_tokens, the adapter logs a logger.warn with actionable guidance so the truncation isn't silent. Explicit caller caps are left alone.
  • No drift. model-meta.ts is auto-synced by scripts/sync-provider-models.ts. That script now also maintains the new id → max_output_tokens map (new maxOutputTokensMapName config + addToObjectMap helper), so a freshly-synced model resolves to its real ceiling rather than the fallback — staying in lockstep with ANTHROPIC_MODELS.
  • Non-streaming clamp (follow-up fix). The full-ceiling default broke the non-streaming structuredOutput() path: the Anthropic SDK refuses a non-streaming request whose max_tokens could exceed its 10-minute timeout (~21,333 tokens), so every chat({ outputSchema }) on a fallback-path model threw "Streaming is required for operations that may take longer than 10 minutes". getAnthropicDefaultMaxTokens(model, { stream }) now clamps the default to ANTHROPIC_MAX_NONSTREAMING_TOKENS (21K) when stream: false; the streaming chat path keeps the model's full ceiling. An explicit oversized max_tokens still surfaces the SDK's "use streaming" error, which is the correct signal. This was caught by two failing E2E specs (anthropic-structured-usage, structured-output-middleware).
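The defaulting and clamping rules in the bullets above can be sketched roughly as follows. This is a minimal illustration, not the code in model-meta.ts: the map name, the fallback constant name, the model ids, and the exact numeric values (64,000 for "64K", 21,000 for "21K") are assumptions; only `getAnthropicDefaultMaxTokens` and `ANTHROPIC_MAX_NONSTREAMING_TOKENS` are names taken from this PR.

```typescript
// Hypothetical ids and ceilings for illustration only. The real map is
// generated into model-meta.ts by scripts/sync-provider-models.ts.
const MAX_OUTPUT_TOKENS_BY_MODEL = new Map<string, number>([
  ['example-sonnet', 64_000],
  ['example-opus', 128_000],
  ['example-small', 8_192],
])

// Assumed values: the PR text only says "64K" and "21K".
const FALLBACK_MAX_OUTPUT_TOKENS = 64_000
const ANTHROPIC_MAX_NONSTREAMING_TOKENS = 21_000

function getAnthropicDefaultMaxTokens(
  model: string,
  options: { stream?: boolean } = {},
): number {
  // Known models resolve to their own ceiling; unknown ids use the fallback.
  const ceiling =
    MAX_OUTPUT_TOKENS_BY_MODEL.get(model) ?? FALLBACK_MAX_OUTPUT_TOKENS

  // Only an explicit `stream: false` clamps, so the streaming chat path
  // keeps the model's full ceiling.
  if (options.stream === false) {
    return Math.min(ceiling, ANTHROPIC_MAX_NONSTREAMING_TOKENS)
  }
  return ceiling
}
```

Under these assumptions, a model whose ceiling is already below the non-streaming cap is returned unchanged, which matches the "no-clamp-when-already-below" test case listed under Tests.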

This is Anthropic-specific: @tanstack/ai-openai and the other adapters treat token limits as optional, so they have no equivalent required default.

Tests

  • getAnthropicDefaultMaxTokens unit tests (known models, unknown 64K fallback, never-1024; plus the non-streaming clamp, no-clamp-when-already-below, and streaming-unaffected cases).
  • Adapter tests: default now resolves to the model ceiling; truncation warning fires on a defaulted cap and is suppressed when the caller set max_tokens; the non-streaming structured-output request sends stream: false with the clamped max_tokens.
  • E2E suite (anthropic + structured + middleware) passes locally; the two originally-failing structured-output specs are green.

E2E note: the original defaulting signals — the max_tokens value in the request payload and a server-side log warning — aren't surfaceable through aimock's response-based assertions, so the defaulting itself is covered by unit tests. The non-streaming clamp is exercised end-to-end because the SDK guard caused an observable RUN_FINISHED.usage regression.

Docs / skill

  • docs/adapters/anthropic.md (+ docs/config.json updatedAt) and the adapter-configuration skill document the new defaulting behavior, including the non-streaming structured-output clamp.

✅ Checklist

  • I have followed the steps in the Contributing guide.
  • I have tested this code locally with pnpm run test:pr.

🚀 Release Impact

  • This change affects published code, and I have generated a changeset.
  • This change is docs/CI/dev-only (no release).

🤖 Generated with Claude Code

Summary by CodeRabbit

  • New Features

    • Anthropic requests now default max_tokens to the selected model’s metadata-derived output limit when it isn’t provided.
    • Non-streaming structured outputs are automatically capped to avoid SDK timeout issues.
  • Bug Fixes

    • Improves truncation behavior by replacing the prior low hard-coded token default with model-specific limits; unknown models use a safe fallback.
  • Documentation

    • Updated adapter guidance and notes to clarify ceiling behavior and defaulting/truncation warnings.
  • Tests

    • Added/updated coverage for the new defaulting, warning behavior, and non-streaming clamping.

Anthropic's Messages API requires `max_tokens`, so the text adapter must
always send a value. It previously hard-coded `?? 1024` when the caller
didn't pass one, silently truncating any non-trivial generation mid-stream
with `stop_reason: "max_tokens"`.

Now default to the resolved model's real `max_output_tokens` from model-meta
(e.g. 64K Sonnet, 128K Opus), falling back to 64K for unrecognized ids.
`max_tokens` is a ceiling, not a reservation, so this costs nothing extra.
Also log a warning when a response is truncated while using the defaulted
cap, so it isn't silently read as the model "doing nothing"; callers that set
`max_tokens` explicitly are unaffected.
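
The warning rule described above (warn only when the stop reason is `max_tokens` and the cap was defaulted) could look something like the sketch below. The `WarnLogger` and `ResolvedMaxTokens` types, both helper names, and the message wording are hypothetical and exist only for this illustration; the adapter's actual bookkeeping may differ.

```typescript
// Minimal logger shape for this sketch; the adapter's real logger may differ.
interface WarnLogger {
  warn(message: string): void
}

// Hypothetical bookkeeping: records whether the adapter filled in max_tokens.
interface ResolvedMaxTokens {
  value: number
  wasDefaulted: boolean
}

function resolveMaxTokens(
  callerValue: number | undefined,
  modelDefault: number,
): ResolvedMaxTokens {
  return callerValue === undefined
    ? { value: modelDefault, wasDefaulted: true }
    : { value: callerValue, wasDefaulted: false }
}

// Returns true when a warning was logged.
function warnIfTruncatedByDefault(
  stopReason: string | null,
  maxTokens: ResolvedMaxTokens,
  logger: WarnLogger,
): boolean {
  // Explicit caller caps are left alone: truncation there is intentional.
  if (stopReason !== 'max_tokens' || !maxTokens.wasDefaulted) return false

  logger.warn(
    `Response stopped with stop_reason "max_tokens" at the defaulted cap ` +
      `(${maxTokens.value} tokens); the output is likely truncated.`,
  )
  return true
}
```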

The new id -> max_output_tokens map is kept in lockstep with ANTHROPIC_MODELS
by `scripts/sync-provider-models.ts`, so a freshly-synced model resolves to
its real ceiling rather than the fallback.

Co-Authored-By: Claude Opus 4.8 (1M context) <noreply@anthropic.com>

coderabbitai Bot commented Jun 26, 2026

Contributor

📝 Walkthrough

The Anthropic adapter now defaults max_tokens from model metadata instead of 1024, warns when a defaulted cap truncates a stream, and updates the sync script, tests, docs, skill note, and package metadata to match.

Changes

Anthropic max token defaulting

| Layer / File(s) | Summary |
| --- | --- |
| Model ceiling map generation: scripts/sync-provider-models.ts, packages/ai-anthropic/src/model-meta.ts | The sync script generates a runtime map of Anthropic model output ceilings, and model-meta.ts exposes the fallback ceiling constant, the non-streaming cap, and the lookup helper. |
| Adapter default and warning: packages/ai-anthropic/src/adapters/text.ts | The text adapter imports the ceiling helper, uses it for the default request max_tokens, switches structured output to the non-streaming path, and warns during streaming when truncation happens under a caller-unspecified default cap. |
| Validation and docs: packages/ai-anthropic/tests/*, docs/adapters/anthropic.md, packages/ai/skills/ai-core/adapter-configuration/SKILL.md, .changeset/anthropic-max-tokens-default.md, docs/config.json | The adapter tests, model-meta tests, changeset, docs, skill note, and config metadata are updated to describe and verify the new default ceiling behavior and truncation warning. |

Sequence Diagram(s)

sequenceDiagram
  participant Caller
  participant text_ts as text.ts
  participant model_meta_ts as model-meta.ts
  participant Anthropic_Messages_API as Anthropic Messages API
  participant logger
  Caller->>text_ts: omit modelOptions.max_tokens
  text_ts->>model_meta_ts: getAnthropicDefaultMaxTokens(this.model, { stream })
  text_ts->>Anthropic_Messages_API: send max_tokens
  Anthropic_Messages_API->>text_ts: stop_reason = "max_tokens"
  text_ts->>logger: warn when max_tokens was defaulted

Estimated code review effort

🎯 3 (Moderate) | ⏱️ ~25 minutes

Poem

I hop through tokens, bright and neat,
With model ceilings at my feet.
No silent truncation in the night—
I sniff the stream and warn just right. 🐰

🚥 Pre-merge checks | ✅ 5
✅ Passed checks (5 passed)
| Check name | Status | Explanation |
| --- | --- | --- |
| Title check | ✅ Passed | The title clearly and concisely summarizes the main change: Anthropic max_tokens now defaults to the model output ceiling. |
| Description check | ✅ Passed | The PR description matches the template with Changes, Checklist, and Release Impact sections filled in. |
| Linked Issues check | ✅ Passed | The code changes implement the requested model-aware defaulting, truncation warning, and explicit-cap behavior for #849. |
| Out of Scope Changes check | ✅ Passed | The added docs, tests, sync script, and config updates all support the Anthropic max_tokens fix and appear in scope. |
| Docstring Coverage | ✅ Passed | Docstring coverage is 100.00%, which is sufficient. The required threshold is 80.00%. |

Warning

There were issues while running some tools. Please review the errors and either fix the tool's configuration or disable the tool if it's a critical failure.

🔧 markdownlint-cli2 (0.22.1)
.changeset/anthropic-max-tokens-default.md

markdownlint-cli2 wrapper config was not available before execution

docs/adapters/anthropic.md

markdownlint-cli2 wrapper config was not available before execution



@github-actions

github-actions Bot commented Jun 26, 2026

Contributor

🚀 Changeset Version Preview

8 package(s) bumped directly, 4 bumped as dependents.

🟥 Major bumps

| Package | Version | Reason |
| --- | --- | --- |
| @tanstack/ai-react-ui | 0.8.11 → 1.0.0 | Dependent |
| @tanstack/ai-solid-ui | 0.7.10 → 1.0.0 | Dependent |

🟨 Minor bumps

| Package | Version | Reason |
| --- | --- | --- |
| @tanstack/ai-angular | 0.1.11 → 0.2.0 | Changeset |
| @tanstack/ai-client | 0.18.6 → 0.19.0 | Changeset |
| @tanstack/ai-react | 0.15.15 → 0.16.0 | Changeset |
| @tanstack/ai-solid | 0.13.15 → 0.14.0 | Changeset |
| @tanstack/ai-svelte | 0.13.15 → 0.14.0 | Changeset |
| @tanstack/ai-vue | 0.13.15 → 0.14.0 | Changeset |

🟩 Patch bumps

| Package | Version | Reason |
| --- | --- | --- |
| @tanstack/ai-anthropic | 0.15.10 → 0.15.11 | Changeset |
| @tanstack/ai-grok | 0.14.4 → 0.14.5 | Changeset |
| @tanstack/ai-preact | 0.9.15 → 0.9.16 | Dependent |
| @tanstack/ai-vue-ui | 0.2.27 → 0.2.28 | Dependent |

@nx-cloud

nx-cloud Bot commented Jun 26, 2026


View your CI Pipeline Execution ↗ for commit aee6dd1

| Command | Status | Duration | Result |
| --- | --- | --- | --- |
| nx run-many --targets=build --exclude=examples/... | ✅ Succeeded | 8s | View ↗ |

☁️ Nx Cloud last updated this comment at 2026-06-26 09:03:20 UTC

@pkg-pr-new

pkg-pr-new Bot commented Jun 26, 2026

@tanstack/ai

npm i https://pkg.pr.new/@tanstack/ai@853

@tanstack/ai-angular

npm i https://pkg.pr.new/@tanstack/ai-angular@853

@tanstack/ai-anthropic

npm i https://pkg.pr.new/@tanstack/ai-anthropic@853

@tanstack/ai-client

npm i https://pkg.pr.new/@tanstack/ai-client@853

@tanstack/ai-code-mode

npm i https://pkg.pr.new/@tanstack/ai-code-mode@853

@tanstack/ai-code-mode-skills

npm i https://pkg.pr.new/@tanstack/ai-code-mode-skills@853

@tanstack/ai-devtools-core

npm i https://pkg.pr.new/@tanstack/ai-devtools-core@853

@tanstack/ai-elevenlabs

npm i https://pkg.pr.new/@tanstack/ai-elevenlabs@853

@tanstack/ai-event-client

npm i https://pkg.pr.new/@tanstack/ai-event-client@853

@tanstack/ai-fal

npm i https://pkg.pr.new/@tanstack/ai-fal@853

@tanstack/ai-gemini

npm i https://pkg.pr.new/@tanstack/ai-gemini@853

@tanstack/ai-grok

npm i https://pkg.pr.new/@tanstack/ai-grok@853

@tanstack/ai-groq

npm i https://pkg.pr.new/@tanstack/ai-groq@853

@tanstack/ai-isolate-cloudflare

npm i https://pkg.pr.new/@tanstack/ai-isolate-cloudflare@853

@tanstack/ai-isolate-node

npm i https://pkg.pr.new/@tanstack/ai-isolate-node@853

@tanstack/ai-isolate-quickjs

npm i https://pkg.pr.new/@tanstack/ai-isolate-quickjs@853

@tanstack/ai-mcp

npm i https://pkg.pr.new/@tanstack/ai-mcp@853

@tanstack/ai-ollama

npm i https://pkg.pr.new/@tanstack/ai-ollama@853

@tanstack/ai-openai

npm i https://pkg.pr.new/@tanstack/ai-openai@853

@tanstack/ai-openrouter

npm i https://pkg.pr.new/@tanstack/ai-openrouter@853

@tanstack/ai-preact

npm i https://pkg.pr.new/@tanstack/ai-preact@853

@tanstack/ai-react

npm i https://pkg.pr.new/@tanstack/ai-react@853

@tanstack/ai-react-ui

npm i https://pkg.pr.new/@tanstack/ai-react-ui@853

@tanstack/ai-solid

npm i https://pkg.pr.new/@tanstack/ai-solid@853

@tanstack/ai-solid-ui

npm i https://pkg.pr.new/@tanstack/ai-solid-ui@853

@tanstack/ai-svelte

npm i https://pkg.pr.new/@tanstack/ai-svelte@853

@tanstack/ai-utils

npm i https://pkg.pr.new/@tanstack/ai-utils@853

@tanstack/ai-vue

npm i https://pkg.pr.new/@tanstack/ai-vue@853

@tanstack/ai-vue-ui

npm i https://pkg.pr.new/@tanstack/ai-vue-ui@853

@tanstack/openai-base

npm i https://pkg.pr.new/@tanstack/openai-base@853

@tanstack/preact-ai-devtools

npm i https://pkg.pr.new/@tanstack/preact-ai-devtools@853

@tanstack/react-ai-devtools

npm i https://pkg.pr.new/@tanstack/react-ai-devtools@853

@tanstack/solid-ai-devtools

npm i https://pkg.pr.new/@tanstack/solid-ai-devtools@853

commit: aee6dd1

@coderabbitai coderabbitai Bot left a comment

Actionable comments posted: 1

🤖 Prompt for all review comments with AI agents
Verify each finding against current code. Fix only still-valid issues, skip the
rest with a brief reason, keep changes minimal, and validate.

Inline comments:
In `@docs/adapters/anthropic.md`:
- Around line 139-142: The `max_tokens` default section in the Anthropic adapter
docs skips a heading level and breaks the document hierarchy. Update the heading
used for this section in `docs/adapters/anthropic.md` from the current `####`
level to `###` so it sits correctly between `Model Options` and `Thinking
(Extended Thinking)`.

ℹ️ Review info
⚙️ Run configuration

Configuration used: defaults

Review profile: CHILL

Plan: Pro

Run ID: c9a19c83-f27b-450b-a500-404090524534

📥 Commits

Reviewing files that changed from the base of the PR and between 33acdd4 and f468928.

📒 Files selected for processing (9)
  • .changeset/anthropic-max-tokens-default.md
  • docs/adapters/anthropic.md
  • docs/config.json
  • packages/ai-anthropic/src/adapters/text.ts
  • packages/ai-anthropic/src/model-meta.ts
  • packages/ai-anthropic/tests/anthropic-adapter.test.ts
  • packages/ai-anthropic/tests/model-meta.test.ts
  • packages/ai/skills/ai-core/adapter-configuration/SKILL.md
  • scripts/sync-provider-models.ts

Comment on lines +139 to +142
#### `max_tokens` default

Anthropic's Messages API _requires_ `max_tokens` on every request, so the adapter always sends a value. When you don't set `modelOptions.max_tokens`, it defaults to the selected model's full output ceiling (`max_output_tokens` from the model metadata — e.g. 64K for Sonnet, 128K for Opus), falling back to a safe constant for unrecognized models. `max_tokens` is a ceiling, not a reservation — billing is on tokens actually generated — so this default costs nothing extra and avoids the silent mid-response truncation (`stop_reason: "max_tokens"`) that a low default would cause. Set `max_tokens` explicitly only when you want to _cap_ output below the model ceiling. If a response is truncated while using the default cap, the adapter logs a warning (visible with [debug logging](../advanced/debug-logging) enabled).


📐 Maintainability & Code Quality | 🟠 Major | ⚡ Quick win

Fix heading level to maintain document hierarchy.

The new #### max_tokens default heading skips a level. It follows ## Model Options and precedes ### Thinking (Extended Thinking), so it should be ### max_tokens default to increment by one level at a time.

📝 Proposed fix
-#### `max_tokens` default
+### `max_tokens` default
📝 Committable suggestion

‼️ IMPORTANT
Carefully review the code before committing. Ensure that it accurately replaces the highlighted code, contains no missing lines, and has no issues with indentation. Thoroughly test & benchmark the code to ensure it meets the requirements.

Suggested change
#### `max_tokens` default
Anthropic's Messages API _requires_ `max_tokens` on every request, so the adapter always sends a value. When you don't set `modelOptions.max_tokens`, it defaults to the selected model's full output ceiling (`max_output_tokens` from the model metadata — e.g. 64K for Sonnet, 128K for Opus), falling back to a safe constant for unrecognized models. `max_tokens` is a ceiling, not a reservation — billing is on tokens actually generated — so this default costs nothing extra and avoids the silent mid-response truncation (`stop_reason: "max_tokens"`) that a low default would cause. Set `max_tokens` explicitly only when you want to _cap_ output below the model ceiling. If a response is truncated while using the default cap, the adapter logs a warning (visible with [debug logging](../advanced/debug-logging) enabled).
### `max_tokens` default
🧰 Tools
🪛 markdownlint-cli2 (0.22.1)

[warning] 139-139: Heading levels should only increment by one level at a time
Expected: h3; Actual: h4

(MD001, heading-increment)

🤖 Prompt for AI Agents
Verify each finding against current code. Fix only still-valid issues, skip the
rest with a brief reason, keep changes minimal, and validate.

In `@docs/adapters/anthropic.md` around lines 139 - 142, The `max_tokens` default
section in the Anthropic adapter docs skips a heading level and breaks the
document hierarchy. Update the heading used for this section in
`docs/adapters/anthropic.md` from the current `####` level to `###` so it sits
correctly between `Model Options` and `Thinking (Extended Thinking)`.

Source: Linters/SAST tools

…ult (#849)

The #849 default of the model's full output ceiling broke the non-streaming
`structuredOutput()` path: the Anthropic SDK refuses a non-streaming request
whose `max_tokens` could exceed its 10-minute timeout (~21,333 tokens), so
`chat({ outputSchema })` on any fallback-path model threw "Streaming is
required for operations that may take longer than 10 minutes".

`getAnthropicDefaultMaxTokens(model, { stream })` now clamps the default to
`ANTHROPIC_MAX_NONSTREAMING_TOKENS` when `stream: false`; the streaming chat
path keeps the model's full ceiling. An explicit oversized `max_tokens` still
surfaces the SDK's "use streaming" error.
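
The ~21,333 figure is consistent with a guard that scales the expected request time linearly with `max_tokens` and compares it to a 10-minute limit. The sketch below works through that arithmetic under the assumption of 128,000 tokens per hour, a rate chosen here because it reproduces the number quoted above; it is not taken from this PR, and the function names are hypothetical rather than the SDK's own.

```typescript
// Assumption: expected duration scales linearly at 128,000 tokens per hour.
const ASSUMED_TOKENS_PER_HOUR = 128_000
const NONSTREAMING_LIMIT_SECONDS = 10 * 60

// Expected duration of a request that generates `maxTokens` tokens.
function estimatedSeconds(maxTokens: number): number {
  return (60 * 60 * maxTokens) / ASSUMED_TOKENS_PER_HOUR
}

// Largest max_tokens whose estimate stays within the 10-minute limit.
function largestNonStreamingMaxTokens(): number {
  return Math.floor(
    (NONSTREAMING_LIMIT_SECONDS * ASSUMED_TOKENS_PER_HOUR) / (60 * 60),
  )
}

console.log(largestNonStreamingMaxTokens()) // 21333
console.log(estimatedSeconds(64_000)) // 1800 (30 minutes, well over the limit)
```

On this model a 64K default would be estimated at 30 minutes, which is why the unclamped default tripped the guard on the non-streaming path while a cap of about 21K stays under it.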

Co-Authored-By: Claude Opus 4.8 (1M context) <noreply@anthropic.com>

@coderabbitai coderabbitai Bot left a comment

Actionable comments posted: 1

🤖 Prompt for all review comments with AI agents
Verify each finding against current code. Fix only still-valid issues, skip the
rest with a brief reason, keep changes minimal, and validate.

Inline comments:
In `@packages/ai-anthropic/tests/anthropic-adapter.test.ts`:
- Line 5: The imports in anthropic-adapter.test.ts are out of the required order
and will fail the import/order lint rule. Reorder the value import from
model-meta so it comes before the type-only import from adapters/text, keeping
the existing symbols like ANTHROPIC_MAX_NONSTREAMING_TOKENS and the
AnthropicTextProviderOptions type import intact.

ℹ️ Review info
⚙️ Run configuration

Configuration used: defaults

Review profile: CHILL

Plan: Pro

Run ID: cfac1478-f2ab-4ae6-ada8-6fcb471edf34

📥 Commits

Reviewing files that changed from the base of the PR and between f468928 and aee6dd1.

📒 Files selected for processing (6)
  • .changeset/anthropic-max-tokens-default.md
  • docs/adapters/anthropic.md
  • packages/ai-anthropic/src/adapters/text.ts
  • packages/ai-anthropic/src/model-meta.ts
  • packages/ai-anthropic/tests/anthropic-adapter.test.ts
  • packages/ai-anthropic/tests/model-meta.test.ts
✅ Files skipped from review due to trivial changes (2)
  • .changeset/anthropic-max-tokens-default.md
  • docs/adapters/anthropic.md

import { chat, type Tool, type StreamChunk } from '@tanstack/ai'
import { AnthropicTextAdapter } from '../src/adapters/text'
import type { AnthropicTextProviderOptions } from '../src/adapters/text'
import { ANTHROPIC_MAX_NONSTREAMING_TOKENS } from '../src/model-meta'


📐 Maintainability & Code Quality | 🟡 Minor | ⚡ Quick win

Reorder import to satisfy import/order.

ESLint flags this: the value import of ../src/model-meta should precede the type import of ../src/adapters/text. This will fail lint in CI.

🧰 Tools
🪛 ESLint

[error] 5-5: ../src/model-meta import should occur before type import of ../src/adapters/text

(import/order)

🤖 Prompt for AI Agents
Verify each finding against current code. Fix only still-valid issues, skip the
rest with a brief reason, keep changes minimal, and validate.

In `@packages/ai-anthropic/tests/anthropic-adapter.test.ts` at line 5, The imports
in anthropic-adapter.test.ts are out of the required order and will fail the
import/order lint rule. Reorder the value import from model-meta so it comes
before the type-only import from adapters/text, keeping the existing symbols
like ANTHROPIC_MAX_NONSTREAMING_TOKENS and the AnthropicTextProviderOptions type
import intact.

Source: Linters/SAST tools

Development

Successfully merging this pull request may close these issues.

ai-anthropic: max_tokens defaults to 1024, silently truncating responses when caller doesn't set it

1 participant