fix(events): SSE health-check watchdog to detect silently-dead streams #123
Open
avfirsov wants to merge 1 commit into
The SSE for-await loop cannot detect a stream that died without throwing: the loop just sits idle waiting for events that never arrive, and the reconnect logic only fires when the stream explicitly ends or errors. A corporate proxy (e.g. cntlm) cutting off long-running connections, a broken pipe that never surfaces an error, or a tight reconnect loop with no successful event delivery can all leave the bot silent indefinitely.

Changes:
- events.ts: track lastSseEventTime (refreshed on every received event) and consecutiveReconnectAttempts (incremented in both reconnect paths, reset on event). Export getLastSseEventTime, getConsecutiveReconnectAttempts, isEventListening, getActiveEventDirectory.
- index.ts: add sseWatchdogTimer that fires every 30s. If we have not seen an event for >30s or have piled up >=5 reconnect attempts while isEventListening() is true, stopEventListening() + restart via ensureEventSubscription(directory). Cleared in createBot() and cleanupBotRuntime().

Co-Authored-By: Claude Opus 4.7 (1M context) <noreply@anthropic.com>
Problem
The SSE for-await loop cannot detect a stream that died without throwing — it sits idle waiting for events that never arrive, and the reconnect logic only fires when the stream explicitly ends or errors. A corporate proxy (e.g. cntlm) cutting off long-running connections, a broken pipe that never surfaces an error, or a tight reconnect loop with no successful event delivery can all leave the bot silent indefinitely.
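To make the failure mode concrete, here is a minimal sketch (not the project's code) of a for-await loop over a stream that stays open but never yields. `silentlyDeadStream` and `observeLoop` are hypothetical names introduced only for illustration; the real SSE client may behave differently in detail.

```typescript
// Hypothetical stand-in for a connection that was cut without surfacing an
// error: next() returns a promise that never settles.
function silentlyDeadStream(): AsyncIterable<string> {
  return {
    [Symbol.asyncIterator]() {
      return {
        next: () => new Promise<IteratorResult<string>>(() => {}),
      };
    },
  };
}

type LoopOutcome = "ended" | "errored" | "still-waiting";

// Runs a consumer loop and reports what happened within waitMs.
async function observeLoop(
  stream: AsyncIterable<string>,
  waitMs: number,
): Promise<LoopOutcome> {
  const loop = (async (): Promise<LoopOutcome> => {
    try {
      for await (const _event of stream) {
        // Events would be handled here.
      }
      return "ended"; // reconnect path 1: stream ended
    } catch {
      return "errored"; // reconnect path 2: stream threw
    }
  })();

  // Resolves only if neither reconnect path was reached in time.
  const timeout = new Promise<LoopOutcome>((resolve) =>
    setTimeout(() => resolve("still-waiting"), waitMs),
  );

  return Promise.race([loop, timeout]);
}
```

With a dead stream, neither the "ended" nor the "errored" branch is ever reached, so reconnect logic that lives only in those branches has nothing to trigger it; something outside the loop has to notice the silence.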
Changes
- `src/opencode/events.ts`: track stream health and expose getters:
  - `lastSseEventTime` is refreshed on every received event;
  - `consecutiveReconnectAttempts` is incremented in both reconnect paths (stream-ended and error) and reset whenever an event arrives;
  - exported getters: `getLastSseEventTime`, `getConsecutiveReconnectAttempts`, `isEventListening`, `getActiveEventDirectory`.
- `src/bot/index.ts`: add a 30-second `sseWatchdogTimer`:
  - when `isEventListening()` is true and either no event has arrived for >30s or ≥5 reconnect attempts have piled up → `stopEventListening()` + `ensureEventSubscription(directory)`;
  - the timer is cleared in `createBot()` and `cleanupBotRuntime()` alongside the existing heartbeat timer.

Test plan
- `npm run build` clean (verified locally)
- Stop `opencode serve` while bot is connected → reconnects pile up → watchdog logs `[SSE Watchdog] Restarting…` → on `opencode serve` restart, subscription resumes without bot restart
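The watchdog condition described under Changes can be sketched roughly as below. This is an illustration under stated assumptions, not the code in this PR: the `SseHealth` shape, `shouldRestartSse`, `startSseWatchdog`, and the constant names are hypothetical, and the `readHealth`/`restart` callbacks stand in for the real getters and for `stopEventListening()` + `ensureEventSubscription(directory)`.

```typescript
// Hypothetical snapshot of the values exposed by the getters in events.ts.
interface SseHealth {
  listening: boolean; // isEventListening()
  lastEventTime: number; // getLastSseEventTime(), assumed ms since epoch
  reconnectAttempts: number; // getConsecutiveReconnectAttempts()
}

const STALE_AFTER_MS = 30_000; // ">30s" without an event
const MAX_RECONNECT_ATTEMPTS = 5; // "≥5" consecutive reconnect attempts

// Pure decision function: restart only while listening, and only if the
// stream is stale or reconnects keep failing to deliver an event.
function shouldRestartSse(health: SseHealth, now: number): boolean {
  if (!health.listening) return false;
  const stale = now - health.lastEventTime > STALE_AFTER_MS;
  const thrashing = health.reconnectAttempts >= MAX_RECONNECT_ATTEMPTS;
  return stale || thrashing;
}

// Periodic check; returns a function that clears the timer, mirroring the
// cleanup in createBot() and cleanupBotRuntime().
function startSseWatchdog(
  readHealth: () => SseHealth,
  restart: () => void,
  intervalMs: number = 30_000,
): () => void {
  const timer = setInterval(() => {
    if (shouldRestartSse(readHealth(), Date.now())) {
      restart();
    }
  }, intervalMs);
  return () => clearInterval(timer);
}
```

Keeping the decision in a pure function makes the thresholds easy to unit-test without timers; the interval wrapper only wires it to the clock.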