
fix(bedrock): upgrade default model to Claude Sonnet 4.5#2193

Merged
afarntrog merged 9 commits into strands-agents:main from afarntrog:default_model_id_upgrade
Apr 24, 2026

Conversation

Contributor

@afarntrog afarntrog commented Apr 23, 2026

Description

Motivation

The current SDK default model is broken for most users. Claude Sonnet 4 (us.anthropic.claude-sonnet-4-20250514-v1:0) was marked legacy by Anthropic and is no longer served to Bedrock accounts that haven't actively used it in the last 30 days. Anyone who creates an Agent() without specifying a model — including new AWS accounts, Workshop Studio events, and anyone who simply hasn't called that specific model recently — gets this error:

This Model is marked by provider as Legacy and you have not been actively using
the model in the last 30 days. Please upgrade to an active model on Amazon Bedrock
└ Bedrock region: us-west-2
└ Model id: us.anthropic.claude-sonnet-4-20250514-v1:0

This isn't an edge case — it's the default path for every new user of the SDK.

Public API Changes

The default Bedrock model ID changes from us.anthropic.claude-sonnet-4-20250514-v1:0 to global.anthropic.claude-sonnet-4-6. This uses the Global cross-region inference profile, which means the default works from any supported region without per-region prefix resolution. The global prefix also aligns the Python SDK with the TypeScript SDK default.

No code changes required for users who already specify a model_id. The default-resolution helper was reordered so a caller-supplied model_id still wins over the global default.
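A minimal sketch of the precedence described above, assuming the resolver receives the model config as a plain dict. The function name `resolve_model_id` is hypothetical and stands in for the SDK's internal helper; only the default ID string is taken from this PR.

```python
# Default introduced by this PR (global cross-region inference profile).
DEFAULT_BEDROCK_MODEL_ID = "global.anthropic.claude-sonnet-4-6"


def resolve_model_id(model_config: dict) -> str:
    """Return the caller-supplied model_id if present, else the global default.

    Hypothetical stand-in for the SDK's internal resolution helper.
    """
    if model_config.get("model_id"):
        return model_config["model_id"]
    return DEFAULT_BEDROCK_MODEL_ID


# A caller who pins a model keeps that model.
pinned = resolve_model_id({"model_id": "us.anthropic.claude-sonnet-4-5-20250929-v1:0"})
# A caller who passes no model_id gets the new global default.
defaulted = resolve_model_id({})
```

Checking the caller's value first means the default only applies when nothing was specified, which is why existing users who pass a model_id should see no behavior change.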

(Updated policy for integ tests)

Related Issues

Documentation PR

Type of Change

Bug fix
Breaking change

Testing

How have you tested the change? Verify that the changes do not break functionality or introduce warnings in consuming repositories: agents-docs, agents-tools, agents-cli

  • I ran hatch run prepare

Checklist

  • I have read the CONTRIBUTING document
  • I have added any necessary tests that prove my fix is effective or my feature works
  • I have updated the documentation accordingly
  • I have added an appropriate example to the documentation to outline the feature, or no new docs are needed
  • My changes generate no new warnings
  • Any dependent changes have been merged and published

By submitting this pull request, I confirm that you can use, modify, copy, and redistribute this contribution, under the terms of your choice.

Claude Sonnet 4 was marked legacy by Anthropic and is no longer served
to Bedrock accounts that haven't used it in the last 30 days, breaking
new accounts and Workshop Studio events that rely on the SDK default.

Upgrade the default Bedrock model to
us.anthropic.claude-sonnet-4-5-20250929-v1:0. The same geographic
inference profile prefixes (us, eu, us-gov, apac) are supported, so the
region resolution logic is unchanged.

codecov Bot commented Apr 23, 2026

Codecov Report

✅ All modified and coverable lines are covered by tests.


Align the Python SDK default with the TypeScript SDK by using the
global inference profile for Claude Sonnet 4.6
(global.anthropic.claude-sonnet-4-6). Using the global profile avoids
per-region availability gaps the previous default hit in Workshop
Studio accounts and removes the need for region-prefix resolution at
init time.

Reorder the default-resolution helper so a caller-supplied model_id
still wins over the global default. Drop the now-unreachable
per-region assertions and region-unsupported warnings from the bedrock
tests.

Claude Sonnet 4.6 extends the context window to 1M tokens, so the
prior 400K-token payload no longer triggers overflow in the default
model. Triple the repetition count so the test exercises the
conversation manager's reduction path under the new default.

Different model providers return "dentist" with different casing
("Dentist" vs "dentist"), causing the structured-output conformance
test to flap across all providers. Lowercase the response before
comparing to make the assertion invariant to model-specific casing.

Specifies the model_id for BedrockModel in the output intervention
test to ensure consistent behavior rather than relying on a default.
…n test

Use a specific BedrockModel (Claude Sonnet 4) instead of relying on the
default model to ensure consistent behavior in the tool context injection
integration test.
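The casing fix described in the structured-output commit above amounts to normalizing both sides before comparing. A minimal sketch follows; the helper name `matches_ignoring_case` is hypothetical, and the actual test may simply inline the comparison.

```python
def matches_ignoring_case(actual: str, expected: str) -> bool:
    """Compare two strings after lowercasing both sides.

    Hypothetical helper; it makes the check independent of whether a
    provider returns "Dentist" or "dentist".
    """
    return actual.lower() == expected.lower()


same_word = matches_ignoring_case("Dentist", "dentist")
different_word = matches_ignoring_case("Doctor", "dentist")
```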
Comment thread tests_integ/steering/test_tool_steering.py
Comment thread src/strands/models/bedrock.py
@afarntrog afarntrog requested a review from opieter-aws April 24, 2026 18:38
@afarntrog afarntrog merged commit ce64c3a into strands-agents:main Apr 24, 2026
25 of 28 checks passed
    if model_config.get("model_id"):
        return model_config["model_id"]

    if DEFAULT_BEDROCK_MODEL_ID != _DEFAULT_BEDROCK_MODEL_ID.format("us"):

Issue: With DEFAULT_BEDROCK_MODEL_ID = "global.anthropic.claude-sonnet-4-6" and _DEFAULT_BEDROCK_MODEL_ID = "{}.anthropic.claude-sonnet-4-6", the sentinel check DEFAULT_BEDROCK_MODEL_ID != _DEFAULT_BEDROCK_MODEL_ID.format("us") is always true (since "global..." != "us..."). This means the entire region-prefix fallback path below (lines 1105–1132) — including prefix_inference_map, the unsupported-region warning, and the prefix-based model construction — is now dead code for any user who hasn't monkey-patched DEFAULT_BEDROCK_MODEL_ID.

Suggestion: Either:

  1. Remove the dead code path entirely since global profiles don't need region resolution, or
  2. Add a clear comment above the sentinel check explaining that this path only executes when a user overrides DEFAULT_BEDROCK_MODEL_ID to a region-prefixed value (e.g., strands.models.bedrock.DEFAULT_BEDROCK_MODEL_ID = "us.anthropic.claude-sonnet-4-6"), so future maintainers understand why it exists.

Option 1 would be the cleaner approach since the global prefix makes the entire region-resolution mechanism unnecessary.

    """Test ToolContext functionality with real agent interactions."""

-    agent = Agent(tools=[good_story])
+    model = BedrockModel(model_id="us.anthropic.claude-sonnet-4-20250514-v1:0")

Issue: Same concern as the guardrail test — this pins to the legacy model that is broken for new/inactive accounts. If there's a specific reason this test needs the old model, it should be documented.

Suggestion: Remove the explicit model pin and let the test use the new global default, or use a different non-legacy model.

def test_context_window_overflow():
    messages: Messages = [
-        {"role": "user", "content": [{"text": "Too much text!" * 100000}]},
+        {"role": "user", "content": [{"text": "Too much text!" * 300000}]},

Issue: The multiplier change from 100000 to 300000 is unexplained. Is this because claude-sonnet-4-6 has a larger context window and the old value no longer triggers the overflow?

Suggestion: Add a comment explaining why this value was changed (e.g., the new default model has a larger context window), so future maintainers know how this magic number relates to the model.
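A rough back-of-the-envelope sketch of how the multiplier relates to the context window. The 3.5 characters-per-token ratio is an assumption, chosen only because it makes the old payload line up with the 400K-token figure in the commit message; real tokenization will differ. The 1M-token window is the figure cited in that same commit message.

```python
UNIT = "Too much text!"  # 14 characters per repetition
CHARS_PER_TOKEN = 3.5  # Assumed heuristic, not a tokenizer measurement.
CONTEXT_WINDOW_TOKENS = 1_000_000  # Figure cited in the commit message.


def estimated_tokens(repetitions: int) -> float:
    """Estimate the token count of UNIT repeated `repetitions` times."""
    return len(UNIT) * repetitions / CHARS_PER_TOKEN


old_estimate = estimated_tokens(100_000)  # old multiplier
new_estimate = estimated_tokens(300_000)  # new multiplier

# Under these assumptions the old payload fits in the window and the new one does not.
old_overflows = old_estimate > CONTEXT_WINDOW_TOKENS
new_overflows = new_estimate > CONTEXT_WINDOW_TOKENS
```

An estimate like this could back the explanatory comment the reviewer asks for, so the constant is tied to the model's window rather than left as a bare number.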

Comment thread tests_integ/conftest.py
    "ANTHROPIC_API_KEY",
    "GOOGLE_API_KEY",
-    "MISTRAL_API_KEY",
+    # "MISTRAL_API_KEY",  # will add back once we get a card on file for this.

Issue: Commenting out MISTRAL_API_KEY appears unrelated to the default model upgrade and lacks context for why it's bundled in this PR.

Suggestion: Consider splitting this into a separate commit with its own description, or at minimum add the rationale to the PR description.

@pytest.mark.parametrize("processing_mode", ["sync", "async"])
def test_guardrail_output_intervention(boto_session, bedrock_guardrail, processing_mode):
    bedrock_model = BedrockModel(
        model_id="us.anthropic.claude-sonnet-4-20250514-v1:0",

Issue: This pins the guardrail test to the old legacy model (us.anthropic.claude-sonnet-4-20250514-v1:0) — the very model that motivated this PR because it fails for accounts that haven't used it in 30 days. This will cause CI failures in environments matching that description.

Suggestion: Use the new default model or another non-legacy model. If guardrails require a specific model for compatibility, document why with a comment.

@github-actions

Assessment: Comment

The core change — upgrading from a legacy region-prefixed model to a global inference profile — is well-motivated and straightforward. The reordering to prioritize model_config.model_id over the sentinel check is correct. However, this change creates a significant amount of dead code in the region-resolution logic that should be addressed.

Review Categories
  • Dead code: The global prefix means the sentinel check (DEFAULT != _DEFAULT.format("us")) is always true, making the entire region-prefix fallback path unreachable in normal operation. Consider cleaning this up or documenting the intended monkey-patch use case.
  • Integ test fragility: Two integration tests pin to the old legacy model that this PR is moving away from. These will break in CI environments where the legacy model is unavailable.
  • Scope: Several unrelated changes (Mistral API key, context overflow multiplier, steering test rewrite) are bundled in without explanation — consider separating or documenting in the PR description.

Nice improvement to the tool steering test — making it deterministic rather than LLM-dependent is a much more robust approach.


def test_get_default_model_with_warning_supported_regions_shows_no_warning(captured_warnings):
-    """Test get_model_prefix_with_warning doesn't warn for supported region prefixes."""
+    """Test _get_default_model_with_warning doesn't warn for any region (global profile works everywhere)."""

Issue: This test calls _get_default_model_with_warning for both us-west-2 and eu-west-2, but test_get_default_model_returns_global_inference_profile below already covers multiple regions (including us-east-1 and eu-west-1). These two tests overlap significantly.

Suggestion: Consider consolidating. You could assert that no "does not support" warning is emitted directly inside test_get_default_model_returns_global_inference_profile, which already iterates over regions.
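One way the consolidated check might look, sketched with the standard library only. The function `get_default_model` is a hypothetical stand-in for the SDK helper (its real signature is not shown in this thread), and the region list is taken from the regions named in the comment above.

```python
import warnings

DEFAULT_BEDROCK_MODEL_ID = "global.anthropic.claude-sonnet-4-6"


def get_default_model(region_name: str) -> str:
    """Hypothetical stand-in for the SDK helper; the real signature may differ."""
    return DEFAULT_BEDROCK_MODEL_ID


def check_global_default_without_warning(regions: list[str]) -> None:
    """Assert both the returned ID and the absence of a region warning."""
    for region in regions:
        with warnings.catch_warnings(record=True) as caught:
            warnings.simplefilter("always")  # record every warning raised
            model_id = get_default_model(region)
        assert model_id == DEFAULT_BEDROCK_MODEL_ID
        assert not any("does not support" in str(w.message) for w in caught)


check_global_default_without_warning(["us-west-2", "eu-west-2", "us-east-1", "eu-west-1"])
```

Folding both assertions into one loop keeps a single test responsible for the "global default, no warning" contract across regions.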


4 participants