fix(soniox): prevent stream hang on server-side errors #5677
Open
morix1500 wants to merge 2 commits into livekit:main
When Soniox sent a WebSocket error frame (e.g. "Cannot continue request" with status 503), the recv loop only logged it and never reconnected, which caused the agent process to hang.

Remove the dormant `_reconnect_event` (dead code since the plugin's first commit) and align with the Deepgram pattern: surface in-band error frames, unexpected WS closes, and mid-stream transport failures as `APIError` so the base class `SpeechStream._main_task` retry/backoff policy applies.

- error_code/error_message frames -> raise APIStatusError
- unexpected WS CLOSED/CLOSE/CLOSING -> raise APIStatusError
- mid-stream aiohttp.ClientError from gather -> raise APIConnectionError
- finished frame followed by normal close -> return cleanly (no retry)
- tolerate Soniox control frames without a `tokens` field
- decorate tasks with @utils.log_exceptions for uniform task-level logging
After the server sends `finished` and cleanly closes the WS, the recv task previously returned normally. `asyncio.gather` then waited forever for `_send_audio_task`, which was blocked on an empty `audio_queue`, hanging the entire stream and any consumer iterating over it. Raise a private `_SessionFinished` from the recv task and catch it in `_run` so the surrounding finally block cancels sibling tasks via `gracefully_cancel`, letting `_main_task` finish and `_event_ch` close.
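The hang and its fix can be illustrated with a small, self-contained asyncio sketch. This is not the plugin's code: `send_audio`, `recv_messages`, and `run` are hypothetical stand-ins for `_send_audio_task`, the recv task, and `_run`, and the cancellation loop in `finally` only approximates what `gracefully_cancel` is described as doing.

```python
import asyncio


class _SessionFinished(Exception):
    """Private sentinel: the server sent `finished` and closed cleanly."""


async def send_audio(audio_queue: asyncio.Queue) -> None:
    # Stand-in for the send task: blocks on get() once the queue is empty.
    while True:
        await audio_queue.get()


async def recv_messages(frames: list) -> None:
    # Stand-in for the recv task. Returning normally here would leave
    # gather() waiting on send_audio() forever, so a `finished` frame
    # raises the sentinel instead.
    for frame in frames:
        await asyncio.sleep(0)  # yield, as a real socket read would
        if frame.get("finished"):
            raise _SessionFinished()
    # Assumption for this sketch: running out of frames without `finished`
    # is treated as an unexpected close.
    raise ConnectionError("closed before a finished frame")


async def run(frames: list) -> str:
    audio_queue: asyncio.Queue = asyncio.Queue()
    tasks = [
        asyncio.create_task(send_audio(audio_queue)),
        asyncio.create_task(recv_messages(frames)),
    ]
    try:
        await asyncio.gather(*tasks)
    except _SessionFinished:
        return "finished"  # clean end of session, nothing to retry
    finally:
        # Cancel the sibling that is still blocked, then wait for both tasks.
        for task in tasks:
            task.cancel()
        await asyncio.gather(*tasks, return_exceptions=True)
    return "completed"


if __name__ == "__main__":
    print(asyncio.run(run([{"tokens": []}, {"finished": True}])))
```

Because `asyncio.gather` propagates the first exception to its awaiter right away, the sentinel reaches `run` while `send_audio` is still blocked; the `finally` block is what actually unblocks it.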
When Soniox sent a WebSocket error frame such as `503 - Cannot continue request (code 4)`, the recv loop only logged the error and never reconnected, leaving the agent process hung indefinitely.

Root cause: the plugin defined `_reconnect_event` but never called `_reconnect_event.set()` anywhere in the codebase (dead code since the plugin's first commit), so the reconnection mechanism was effectively disabled.

This PR removes `_reconnect_event` and aligns the plugin with the Deepgram pattern: surface failures as `APIError` so the base class `SpeechStream._main_task` retry/backoff policy handles recovery.

Changes:

- `error_code`/`error_message` frames → `APIStatusError`
- Unexpected WS `CLOSED`/`CLOSE`/`CLOSING` → `APIStatusError`
- Mid-stream `aiohttp.ClientError` from gather → `APIConnectionError`
- Normal completion (`finished` frame followed by clean close) → raise an internal `_SessionFinished` so the surrounding `finally` cancels sibling tasks and `_run` returns cleanly without triggering a retry (prevents the gather hang where `_send_audio_task` blocks on an empty queue)
- Tolerate Soniox control frames without a `tokens` field
- Decorate tasks with `@utils.log_exceptions` for uniform task-level logging
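As a rough illustration of the frame handling listed above, the sketch below classifies a single text frame. It is an assumption-laden stand-in, not the plugin's implementation: `parse_frame` is a hypothetical helper, and the local `APIStatusError` class only mimics the livekit-agents exception of the same name so the example runs without that dependency. The field names `error_code`, `error_message`, `tokens`, and `finished` come from the description above.

```python
import json


class APIStatusError(Exception):
    """Local stand-in for the livekit-agents exception of the same name."""

    def __init__(self, message: str, status_code: int) -> None:
        super().__init__(message)
        self.status_code = status_code


def parse_frame(raw: str):
    """Return (tokens, finished) for one text frame, or raise on an error frame."""
    frame = json.loads(raw)

    # In-band error frame: raise instead of logging, so a caller's
    # retry/backoff policy can react to it.
    if frame.get("error_code") or frame.get("error_message"):
        raise APIStatusError(
            frame.get("error_message") or "unknown error",
            status_code=int(frame.get("error_code") or -1),
        )

    # Control frames may omit `tokens`; treat that as an empty token list.
    tokens = frame.get("tokens", [])
    return tokens, bool(frame.get("finished"))


if __name__ == "__main__":
    print(parse_frame('{"tokens": [{"text": "hi"}]}'))
    print(parse_frame('{"finished": true}'))
```

Raising from the parser keeps the recv loop free of recovery logic; whether a given status should be retried is left to whatever wraps the stream.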