Fix uncaught TypeError on string or non-dict message content#1058
Fix uncaught TypeError on string or non-dict message content#1058adventurelands wants to merge 1 commit into
Conversation
parse_message handled string content for user messages but not assistant messages, so assistant string content was iterated character by character and raised TypeError instead of the documented MessageParseError. A non-dict content block did the same in both branches. Mirror the user branch for assistant and raise MessageParseError on non-dict blocks. Adds a regression test that fails on main and passes with this change. Co-Authored-By: Claude <noreply@anthropic.com>
|
Filed #1064 with a minimal verified repro of the underlying crash (assistant |
|
@ashwin-ant @qing-ant — small fix here closing #1064 (verified repro: an assistant message with |
ashwin-ant
left a comment
There was a problem hiding this comment.
Thanks for the PR! The non-dict block guard is a real fix and I'd like to land that part. A couple of things to sort out first though:
The else raw_content arm actually makes things worse. If content is something weird like None or a bare dict, the ternary now passes it straight through into AssistantMessage.content (which is typed list[ContentBlock]). On main that case blows up immediately with a TypeError inside the parser; with this patch it returns a silently-broken object and the crash moves downstream to whoever iterates msg.content. That's the opposite of what we want here.
The string-content case isn't actually a thing for assistant messages. The CLI emits BetaMessage for assistant turns, and BetaMessage.content is always an array — the string shorthand only exists on the input side (MessageParam). That's why AssistantMessage.content is list[ContentBlock] while UserMessage.content is str | list[...]. So rather than normalizing a string into [TextBlock(...)], non-list assistant content should just raise MessageParseError like the other guards you added.
That also means we don't need the second AssistantMessage(...) return — duplicating those 8 field mappings is a drift hazard. Something like this keeps it to one constructor:
raw_content = data["message"]["content"]
if not isinstance(raw_content, list):
raise MessageParseError(
f"Invalid assistant content (expected list, got {type(raw_content).__name__})",
data,
)
content_blocks: list[ContentBlock] = []
for block in raw_content:
if not isinstance(block, dict):
raise MessageParseError(...)
...Small test cleanup while you're in there:
- Drop the
Place under "tests/"...line from the docstring test_normal_block_lists_unaffectedandtest_user_string_content_still_parsesduplicate existing tests intest_message_parser.py— can go- Swap the
test_assistant_string_content_*tests for one that assertsMessageParseErroron string content - If you don't mind, fold what's left into
class TestMessageParserintests/test_message_parser.pyso it lives with the other parser tests
The isinstance(block, dict) guards and the parametrized error test are good as-is. 👍
Closes #1064.
Small fix: assistant message with string content raises an uncaught TypeError
Thanks for open-sourcing this.
Minor one.
parse_messagehandles stringcontentforusermessages but notfor
assistantmessages, so an assistant message whosecontentis a plainstring gets iterated character by character and raises a
TypeErrorinstead ofthe documented
MessageParseError. A non-dict block in the content list doesthe same in both branches.
The patch makes
assistantmirror theuserbranch (wrap a string as oneTextBlock) and raisesMessageParseErroron a non-dict block. Added a testthat fails on main and passes with the patch. Nothing else touched.
Just figured I would pass it along. Thanks again.
Davis Brief, davisbrief@gmail.com