fix(bedrock): upgrade default model to Claude Sonnet 4.5#2193
afarntrog merged 9 commits into strands-agents:main
Conversation
Claude Sonnet 4 was marked legacy by Anthropic and is no longer served to Bedrock accounts that haven't used it in the last 30 days, breaking new accounts and Workshop Studio events that rely on the SDK default. Upgrade the default Bedrock model to us.anthropic.claude-sonnet-4-5-20250929-v1:0. The same geographic inference profile prefixes (us, eu, us-gov, apac) are supported, so the region resolution logic is unchanged.
Codecov Report: All modified and coverable lines are covered by tests.
Align the Python SDK default with the TypeScript SDK by using the global inference profile for Claude Sonnet 4.6 (global.anthropic.claude-sonnet-4-6). Using the global profile avoids per-region availability gaps the previous default hit in Workshop Studio accounts and removes the need for region-prefix resolution at init time. Reorder the default-resolution helper so a caller-supplied model_id still wins over the global default. Drop the now-unreachable per-region assertions and region-unsupported warnings from the bedrock tests.
Claude Sonnet 4.6 extends the context window to 1M tokens, so the prior 400K-token payload no longer triggers overflow in the default model. Triple the repetition count so the test exercises the conversation manager's reduction path under the new default.
Different model providers return "dentist" with different casing
("Dentist" vs "dentist"), causing the structured-output conformance
test to flap across all providers. Lowercase the response before
comparing to make the assertion invariant to model-specific casing.
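A minimal sketch of the lowercase-before-compare idea this commit describes. The helper name `matches_expected` is hypothetical and is not taken from the SDK's test suite; only the normalization step comes from the commit message.

```python
def matches_expected(actual: str, expected: str) -> bool:
    """Compare two strings while ignoring model-specific casing.

    Hypothetical helper for illustration: both sides are lowercased
    so that "Dentist" and "dentist" are treated as the same answer.
    """
    return actual.lower() == expected.lower()


# Two providers answering with different casing both pass the check.
provider_answers = ["Dentist", "dentist"]
all_match = all(matches_expected(answer, "dentist") for answer in provider_answers)
```

Lowercasing both sides keeps the assertion strict about content while tolerating casing differences between providers.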
Specifies the model_id for BedrockModel in the output intervention test to ensure consistent behavior rather than relying on a default.
…n test Use a specific BedrockModel (Claude Sonnet 4) instead of relying on the default model to ensure consistent behavior in the tool context injection integration test.
    if model_config.get("model_id"):
        return model_config["model_id"]

    if DEFAULT_BEDROCK_MODEL_ID != _DEFAULT_BEDROCK_MODEL_ID.format("us"):
Issue: With DEFAULT_BEDROCK_MODEL_ID = "global.anthropic.claude-sonnet-4-6" and _DEFAULT_BEDROCK_MODEL_ID = "{}.anthropic.claude-sonnet-4-6", the sentinel check DEFAULT_BEDROCK_MODEL_ID != _DEFAULT_BEDROCK_MODEL_ID.format("us") is always true (since "global..." != "us..."). This means the entire region-prefix fallback path below (lines 1105–1132) — including prefix_inference_map, the unsupported-region warning, and the prefix-based model construction — is now dead code for any user who hasn't monkey-patched DEFAULT_BEDROCK_MODEL_ID.
Suggestion: Either:
1. Remove the dead code path entirely, since global profiles don't need region resolution, or
2. Add a clear comment above the sentinel check explaining that this path only executes when a user overrides DEFAULT_BEDROCK_MODEL_ID to a region-prefixed value (e.g., strands.models.bedrock.DEFAULT_BEDROCK_MODEL_ID = "us.anthropic.claude-sonnet-4-6"), so future maintainers understand why it exists.
Option 1 would be the cleaner approach since the global prefix makes the entire region-resolution mechanism unnecessary.
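The point of this comment can be checked with the two constant values it quotes. The sketch below is an illustration under assumptions: `sentinel_indicates_override` and `resolve_model_id` are hypothetical names, and `resolve_model_id` only shows roughly what option 1 could look like, not the SDK's actual helper.

```python
# Constant values as quoted in the review comment.
DEFAULT_BEDROCK_MODEL_ID = "global.anthropic.claude-sonnet-4-6"
_DEFAULT_BEDROCK_MODEL_ID = "{}.anthropic.claude-sonnet-4-6"


def sentinel_indicates_override() -> bool:
    """Mirror of the sentinel check from the diff.

    With the global default, "global..." never equals "us...",
    so this evaluates to True unless the constant is patched.
    """
    return DEFAULT_BEDROCK_MODEL_ID != _DEFAULT_BEDROCK_MODEL_ID.format("us")


def resolve_model_id(model_config: dict) -> str:
    """Hypothetical simplified helper (option 1): no region-prefix logic.

    A caller-supplied model_id wins; otherwise fall back to the global default.
    """
    if model_config.get("model_id"):
        return model_config["model_id"]
    return DEFAULT_BEDROCK_MODEL_ID
```

Under these assumptions the helper has only two outcomes, which is why the region-prefix branch contributes nothing once the default uses the global profile.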
    """Test ToolContext functionality with real agent interactions."""

    agent = Agent(tools=[good_story])
    model = BedrockModel(model_id="us.anthropic.claude-sonnet-4-20250514-v1:0")
Issue: Same concern as the guardrail test — this pins to the legacy model that is broken for new/inactive accounts. If there's a specific reason this test needs the old model, it should be documented.
Suggestion: Remove the explicit model pin and let the test use the new global default, or use a different non-legacy model.
  def test_context_window_overflow():
      messages: Messages = [
-         {"role": "user", "content": [{"text": "Too much text!" * 100000}]},
+         {"role": "user", "content": [{"text": "Too much text!" * 300000}]},
Issue: The multiplier change from 100000 to 300000 is unexplained. Is this because claude-sonnet-4-6 has a larger context window and the old value no longer triggers the overflow?
Suggestion: Add a comment explaining why this value was changed (e.g., the new default model has a larger context window), so future maintainers know how this magic number relates to the model.
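One way to document the magic number is to derive it from named constants. The sketch below is a rough back-of-the-envelope check, not SDK code: the constant names are hypothetical, and the characters-per-token ratio is only inferred from the commit message's description of the old payload as roughly 400K tokens, so the token figures are estimates.

```python
CHUNK = "Too much text!"          # repeated text used in the test payload
OLD_REPEATS = 100_000             # multiplier before this PR
NEW_REPEATS = 300_000             # multiplier after this PR

# Figures taken from the commit message (assumed accurate for this sketch).
OLD_PAYLOAD_TOKENS = 400_000      # "prior 400K-token payload"
NEW_CONTEXT_WINDOW = 1_000_000    # "context window to 1M tokens"

old_chars = len(CHUNK) * OLD_REPEATS
new_chars = len(CHUNK) * NEW_REPEATS

# Implied ratio; real tokenizers vary, so treat this as an estimate only.
chars_per_token = old_chars / OLD_PAYLOAD_TOKENS
estimated_new_tokens = new_chars / chars_per_token

# The old payload fits inside the new window, the tripled one should not.
old_overflows = OLD_PAYLOAD_TOKENS > NEW_CONTEXT_WINDOW
new_overflows = estimated_new_tokens > NEW_CONTEXT_WINDOW
```

A comment next to the multiplier that states this relationship would tell future maintainers to revisit the value whenever the default model's context window changes.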
      "ANTHROPIC_API_KEY",
      "GOOGLE_API_KEY",
-     "MISTRAL_API_KEY",
+     # "MISTRAL_API_KEY",  # will add back once we get a card on file for this.
Issue: Commenting out MISTRAL_API_KEY appears unrelated to the default model upgrade and lacks context for why it's bundled in this PR.
Suggestion: Consider splitting this into a separate commit with its own description, or at minimum add the rationale to the PR description.
@pytest.mark.parametrize("processing_mode", ["sync", "async"])
def test_guardrail_output_intervention(boto_session, bedrock_guardrail, processing_mode):
    bedrock_model = BedrockModel(
        model_id="us.anthropic.claude-sonnet-4-20250514-v1:0",
Issue: This pins the guardrail test to the old legacy model (us.anthropic.claude-sonnet-4-20250514-v1:0) — the very model that motivated this PR because it fails for accounts that haven't used it in 30 days. This will cause CI failures in environments matching that description.
Suggestion: Use the new default model or another non-legacy model. If guardrails require a specific model for compatibility, document why with a comment.
Assessment: Comment
The core change, upgrading from a legacy region-prefixed model to a global inference profile, is well-motivated and straightforward. The reordering to prioritize
Review Categories
Nice improvement to the tool steering test: making it deterministic rather than LLM-dependent is a much more robust approach.
  def test_get_default_model_with_warning_supported_regions_shows_no_warning(captured_warnings):
-     """Test get_model_prefix_with_warning doesn't warn for supported region prefixes."""
+     """Test _get_default_model_with_warning doesn't warn for any region (global profile works everywhere)."""
Issue: This test calls _get_default_model_with_warning for both us-west-2 and eu-west-2, but test_get_default_model_returns_global_inference_profile below already covers multiple regions (including us-east-1 and eu-west-1). These two tests overlap significantly.
Suggestion: Consider consolidating: you could add the assertion that no "does not support" warning is emitted directly into test_get_default_model_returns_global_inference_profile, which already iterates regions.
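A consolidated check along these lines might look like the sketch below. It is self-contained for illustration only: `get_default_model_stub` is a stand-in for the SDK helper (whose real signature is not shown in this conversation), and `check_default_for_regions` is a hypothetical name.

```python
import warnings

GLOBAL_DEFAULT = "global.anthropic.claude-sonnet-4-6"


def get_default_model_stub(region: str) -> str:
    """Stand-in for the SDK helper; the real implementation may differ."""
    return GLOBAL_DEFAULT


def check_default_for_regions(regions: list[str]) -> bool:
    """Assert the global default and the absence of warnings in one loop."""
    for region in regions:
        with warnings.catch_warnings(record=True) as caught:
            warnings.simplefilter("always")  # record every warning raised
            model_id = get_default_model_stub(region)
        # Same region, two assertions: returned ID and no unsupported-region warning.
        assert model_id == GLOBAL_DEFAULT
        assert not any("does not support" in str(w.message) for w in caught)
    return True
```

Folding both assertions into one loop keeps the region list in a single place, which is the overlap the comment points out.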
Description
Motivation
The current SDK default model is broken for most users. Claude Sonnet 4 (us.anthropic.claude-sonnet-4-20250514-v1:0) was marked legacy by Anthropic and is no longer served to Bedrock accounts that haven't actively used it in the last 30 days. Anyone who creates an Agent() without specifying a model (including new AWS accounts, Workshop Studio events, and anyone who simply hasn't called that specific model recently) gets this error:
This isn't an edge case; it's the default path for every new user of the SDK.
Public API Changes
The default Bedrock model ID changes from us.anthropic.claude-sonnet-4-20250514-v1:0 to global.anthropic.claude-sonnet-4-6. This uses the Global cross-region inference profile, which means the default works from any supported region without per-region prefix resolution. The global prefix also aligns the Python SDK with the TypeScript SDK default.
No code changes required for users who already specify a model_id. The default-resolution helper was reordered so a caller-supplied model_id still wins over the global default.
(Updated policy for integ tests)
Related Issues
Documentation PR
Type of Change
Bug fix
Breaking change
Testing
How have you tested the change? Verify that the changes do not break functionality or introduce warnings in consuming repositories: agents-docs, agents-tools, agents-cli
hatch run prepare
Checklist
By submitting this pull request, I confirm that you can use, modify, copy, and redistribute this contribution, under the terms of your choice.