fix(auth): add HTTP timeouts to bridge-token and Client endpoints#558
Conversation
`_get_bridge_tokens`, `_poll_read_key`, and `_create_client` all called `requests.*` without a `timeout=` argument. A slow or hung identity server — or a slowloris — would pin the calling thread indefinitely. The existing `poll_timeout` only caps the total polling loop, not individual requests, and the bridge/create-client calls had no cap at all. Add a shared `(connect=10s, read=30s)` pair and thread it through all three call sites. Session: https://session-bc38cc85.poseidon.rapidata.internal/ Co-Authored-By: Claude Opus 4.7 <noreply@anthropic.com> Co-Authored-By: lino <lino@rapidata.ai>
Code ReviewOverviewThis PR correctly identifies and fixes a real reliability issue: all three Strengths
Issues1. Multi-line comment violates project conventions (minor)
The 4-line block on # (connect, read) pair; outer poll_timeout only bounds the loop, not individual requests.
_HTTP_TIMEOUT: Tuple[float, float] = (10.0, 30.0)2. Polling loop can exceed
|
|
Code Review for PR 558 - fix(auth): add HTTP timeouts to bridge-token and Client endpoints |
|
Code Review — fix(auth): add HTTP timeouts to bridge-token and Client endpoints Overview This PR adds a (connect=10s, read=30s) timeout to all three bare requests.* calls in CredentialManager that previously had no timeout protection. The fix is correct and addresses a genuine reliability/security risk. What is good
Issues Bug: a single request timeout terminates the entire polling loop In _poll_read_key, requests.ReadTimeout is a subclass of requests.RequestException, so the existing handler catches it and immediately returns None, aborting all remaining retries. Before this PR a hung server would block forever; after this PR a single slow response during polling kills the whole auth flow. The fix should distinguish transient timeouts from hard errors: Read timeout of 30s is generous for a polling endpoint The /identity/bridge-token GET is a lightweight status check that should return 200 or 202 immediately. A 30s read timeout means a hung server during polling holds the thread for up to 30s per iteration. Consider a tighter read timeout for the polling path (e.g. 5-10s) and reserve the longer value for _create_client which may do heavier server-side work. You could also use two constants: _POLL_TIMEOUT and _CLIENT_TIMEOUT. Minor: URL construction moved inside try unnecessarily String formatting cannot raise requests.RequestException, so moving bridge_endpoint inside the try block in _get_bridge_tokens adds noise without benefit. Not a bug, just a clarity note. Convention notes
Test coverage No tests are added. The polling exception handler bug is exactly what a test would catch — a mock that raises requests.Timeout on the first poll and succeeds on the second would confirm retry behavior. Worth adding given this is the auth critical path. Summary The core fix is correct and needed. The main actionable concern is the polling exception handler: a single request timeout now terminates auth entirely rather than retrying within the remaining poll_timeout window, which could make the auth flow less reliable in practice than it was before this PR. |
Summary
_get_bridge_tokens,_poll_read_key, and_create_clientinCredentialManagerall hit identity endpoints withrequests.*and notimeout=. A slow/hung identity server or a deliberate slowloris holds the calling thread forever.poll_timeoutcaps the total polling loop only — a hang on any individual request still blocks forever. The bridge and create-client calls had no cap whatsoever.Fix
Single
_HTTP_TIMEOUT = (10.0, 30.0)class constant (connect, read), threaded through all three call sites.Test plan
uv run pyright src/rapidata/rapidata_client→ 0 errorsrequests.*call sites in the file now havetimeout=.🔗 Session: https://session-bc38cc85.poseidon.rapidata.internal/