fix: do not quota-backoff OAuth 403 checks#86
Conversation
There was a problem hiding this comment.
No issues found across 3 files
Confidence score: 5/5
- Automated review surfaced no issues in the provided summaries.
- No files require special attention.
Architecture diagram
sequenceDiagram
participant FallbackAccountManager
participant QuotaManager
participant AnthropicAPI as Anthropic API
participant AccountStore as Account Storage
Note over FallbackAccountManager,AccountStore: Quota refresh flow (per account)
FallbackAccountManager->>QuotaManager: refreshQuotaForDueAccounts()
QuotaManager->>AnthropicAPI: quota check request
AnthropicAPI-->>QuotaManager: 403 {error: {type: "permission_error", message: "OAuth not allowed for org"}}
alt 403 error (auth/org-policy)
QuotaManager->>QuotaManager: isAuthError() → true<br/>(CHANGED: now includes 403)
QuotaManager->>QuotaManager: Skip backoff arming – do not record quota failure
QuotaManager-->>FallbackAccountManager: Auth error (403)
FallbackAccountManager->>AccountStore: recordQuotaRefreshError(error)
AccountStore->>AccountStore: isQuotaPolicyAuthError() → true<br/>(NEW: returns early for 403)
AccountStore-->>FallbackAccountManager: No persist (lastQuotaRefreshError not set)
else other error (e.g., 429, 500, network)
QuotaManager->>QuotaManager: isAuthError() → false
QuotaManager->>QuotaManager: Arm backoff / record quota failure
QuotaManager-->>FallbackAccountManager: Quota error
FallbackAccountManager->>AccountStore: recordQuotaRefreshError(error)
AccountStore->>AccountStore: isQuotaPolicyAuthError() → false
AccountStore->>AccountStore: Store lastQuotaRefreshError
AccountStore-->>FallbackAccountManager: Persisted
end
Note over FallbackAccountManager,AccountStore: Regression test validates that 403 does not arm backoff and does not persist lastQuotaRefreshError
| function isQuotaPolicyAuthError(error: unknown) { | ||
| const status = (error as { status?: unknown }).status | ||
| if (status === 403) return true | ||
| return /Claude quota check failed: 403\b/.test(formatErrorMessage(error)) | ||
| } |
There was a problem hiding this comment.
status === 403 matches ClaudeOAuthRefreshError, silently drops refresh backoff
ClaudeOAuthRefreshError is declared with public readonly status: number and public readonly isRefreshError = true. When refreshAccount throws a 403 from the token endpoint (e.g., the org also blocks OAuth token refresh), the error reaches the outer catch at line 2114 and is passed into recordQuotaRefreshError. The new early-return guard fires on status === 403 — before the isRefreshError branch at line 1841 — so recordRefreshError is never called. lastRefreshError stays undefined, refreshBackoffActive stays false, and the manager retries the token refresh on every subsequent cycle, looping until something external changes.
The message-based check already covers all quota-check 403s (the error is always constructed as "Claude quota check failed: 403 — …"), so the status === 403 shortcut is redundant for the intended case and harmful for the refresh-error case. Removing the first branch and relying solely on the regex makes the guard quota-check-specific and fixes the masking.
| test('quota-endpoint 403 does NOT arm quota backoff', async () => { | ||
| const now = 1_000_000 | ||
| const storage = baseStorage() | ||
| storage.accounts.push({ | ||
| id: 'fb-quota-403', | ||
| type: 'oauth', | ||
| access: 'old-access', | ||
| refresh: 'old-refresh', | ||
| expires: now + 24 * 60 * 60_000, | ||
| }) | ||
| await saveAccounts(storage) | ||
|
|
||
| const fetchImpl = mock((input: string | URL | Request) => { | ||
| const url = | ||
| typeof input === 'string' | ||
| ? input | ||
| : input instanceof Request | ||
| ? input.url | ||
| : input.href | ||
| if (url.includes('/v1/oauth/token')) { | ||
| return Promise.resolve( | ||
| new Response( | ||
| JSON.stringify({ | ||
| access_token: 'new-access', | ||
| refresh_token: 'new-refresh', | ||
| expires_in: 3600, | ||
| }), | ||
| { status: 200 }, | ||
| ), | ||
| ) | ||
| } | ||
| return Promise.resolve( | ||
| new Response( | ||
| JSON.stringify({ | ||
| type: 'error', | ||
| error: { | ||
| type: 'permission_error', | ||
| message: | ||
| 'OAuth authentication is currently not allowed for this organization.', | ||
| }, | ||
| }), | ||
| { status: 403 }, | ||
| ), | ||
| ) | ||
| }) as unknown as typeof fetch | ||
|
|
||
| const manager = new FallbackAccountManager({ | ||
| fetchImpl, | ||
| now: () => now, | ||
| }) | ||
|
|
||
| await manager.refreshQuotaForDueAccounts() | ||
|
|
||
| const loaded = await loadAccounts() | ||
| const account = loaded?.accounts.find((a) => a.id === 'fb-quota-403') as | ||
| | OAuthAccount | ||
| | undefined | ||
| expect(account?.lastRefreshError).toBeUndefined() | ||
| expect(account?.lastQuotaRefreshError).toBeUndefined() | ||
| }) |
There was a problem hiding this comment.
Test doesn't cover token-refresh-403 path, which masks the
isRefreshError bug
The new test covers the case where the token refresh endpoint returns 200 and only the quota endpoint returns 403. The problematic scenario — where the token refresh endpoint itself returns 403 (e.g., org policy blocks OAuth entirely, so ClaudeOAuthRefreshError(403, …) is thrown from refreshAccount) — is not exercised. A test for that path would reveal that lastRefreshError is never armed, allowing the manager to loop indefinitely on the same failing account.
Summary
lastQuotaRefreshErrorfor quota-check 403s in fallback account quota refresh.permission_errorresponse: "OAuth authentication is currently not allowed for this organization."Closes #85.
Validation
Result: 707 tests passed.
Need help on this PR? Tag
/codesmithwith what you need. Autofix is disabled.Summary by cubic
Treat OAuth 403s from Claude quota checks as auth/org-policy errors instead of quota backoff, allowing immediate re-auth or fallback. Skip persisting
lastQuotaRefreshErrorfor these 403s and add a regression test for the Anthropicpermission_errorcase.QuotaManager: treatquota check failed: 401|403as auth/policy; do not record quota backoff.accounts: do not setlastQuotaRefreshErrorfor 403 quota-check errors.Written for commit 4114b8b. Summary will update on new commits.
Greptile Summary
This PR fixes two related problems: quota-check 403s (org/policy errors) were previously treated as rate-limit failures and armed quota backoff, and
lastQuotaRefreshErrorwas persisted for those 403s in the fallback manager's periodic refresh loop. The fix addsisQuotaPolicyAuthErrorinaccounts.tsand extendsQuotaManager.isAuthErrorto include 403 alongside 401.quota-manager.ts: TheisAuthErrorregex is extended from401to(401|403), cleanly preventing in-memory quota backoff for org-policy 403s. The message-only check is appropriately specific.accounts.ts:isQuotaPolicyAuthErroris introduced to short-circuitrecordQuotaRefreshError, but its first branch (status === 403) is too broad —ClaudeOAuthRefreshErroralso carriesstatus: number, so a 403 from the token refresh endpoint would hit this guard and silently preventrecordRefreshErrorfrom being called, leaving the refresh backoff permanently unarmed.accounts.test.ts: A regression test covers the quota-endpoint 403 path (token refresh succeeds, quota returns 403). The complementary case — token refresh itself returning 403 — is not tested.Confidence Score: 3/5
Safe to merge for the quota-check 403 case, but the
status === 403shortcut inisQuotaPolicyAuthErrorcan match token-refresh errors too, silently suppressing refresh-backoff recording when the token endpoint returns 403.The
quota-manager.tschange is clean and self-contained. Theaccounts.tsguard has an unintended side-effect:ClaudeOAuthRefreshErrorcarriespublic readonly status: number, so a 403 from the token-refresh endpoint hits the early return before theisRefreshErrorbranch, leavinglastRefreshErrorpermanently unset and the refresh loop unbounded. The test suite covers the happy path for the stated fix but does not exercise the token-refresh-403 variant that exposes the masking.packages/core/src/accounts.ts — specifically the
isQuotaPolicyAuthErrorfunction and its interaction withClaudeOAuthRefreshError.Important Files Changed
isQuotaPolicyAuthErrorguard inrecordQuotaRefreshError; thestatus === 403branch is overly broad and can matchClaudeOAuthRefreshError, silently suppressing refresh-backoff recording when token refresh itself returns 403.isAuthErrorregex from401to `(401lastRefreshErrornorlastQuotaRefreshErroris armed, but does not cover the token-refresh-403 path.Flowchart
%%{init: {'theme': 'neutral'}}%% flowchart TD A[refreshQuotaForDueAccounts] --> B{Token needs refresh?} B -- Yes --> C{refreshBackoffActive?} C -- Yes --> SKIP[continue / skip account] C -- No --> D[refreshAccount → token endpoint] D -- 200 OK --> E[proceed to quota check] D -- "403 ClaudeOAuthRefreshError\n(status=403, isRefreshError=true)" --> CATCH B -- No --> E E --> F[refreshAccountQuota → quota endpoint] F -- 200 OK --> G[update quota, clear errors] F -- "403 quota-check error\n(message='Claude quota check failed: 403')" --> CATCH CATCH[outer catch: recordQuotaRefreshError] --> GUARD{isQuotaPolicyAuthError?} GUARD -- "status===403 matches BOTH error types" --> SKIP2[early return — no backoff armed] GUARD -- No --> ARM[arm lastQuotaRefreshError + recordRefreshError if isRefreshError] SKIP2 -. Bug: ClaudeOAuthRefreshError skips recordRefreshError too .-> LOOP[retry next tick — refresh backoff never armed]%%{init: {'theme': 'base', 'themeVariables': {"darkMode": true, "background": "#0d1117", "primaryColor": "#21262d", "primaryTextColor": "#e6edf3", "primaryBorderColor": "#8b949e", "lineColor": "#8b949e", "textColor": "#e6edf3", "edgeLabelBackground": "#161b22", "actorBkg": "#21262d", "actorBorder": "#8b949e", "actorTextColor": "#e6edf3", "actorLineColor": "#8b949e", "signalColor": "#8b949e", "signalTextColor": "#e6edf3", "noteBkgColor": "#373320", "noteBorderColor": "#d4a72c", "noteTextColor": "#f0e6c0", "labelBoxBkgColor": "#21262d", "labelBoxBorderColor": "#8b949e", "labelTextColor": "#e6edf3", "loopTextColor": "#e6edf3", "activationBkgColor": "#30363d", "activationBorderColor": "#8b949e"}}}%% flowchart TD A[refreshQuotaForDueAccounts] --> B{Token needs refresh?} B -- Yes --> C{refreshBackoffActive?} C -- Yes --> SKIP[continue / skip account] C -- No --> D[refreshAccount → token endpoint] D -- 200 OK --> E[proceed to quota check] D -- "403 ClaudeOAuthRefreshError\n(status=403, isRefreshError=true)" --> CATCH B -- No --> E E --> F[refreshAccountQuota → quota endpoint] F -- 200 OK --> G[update quota, clear errors] F -- "403 quota-check error\n(message='Claude quota check failed: 403')" --> CATCH CATCH[outer catch: recordQuotaRefreshError] --> GUARD{isQuotaPolicyAuthError?} GUARD -- "status===403 matches BOTH error types" --> SKIP2[early return — no backoff armed] GUARD -- No --> ARM[arm lastQuotaRefreshError + recordRefreshError if isRefreshError] SKIP2 -. Bug: ClaudeOAuthRefreshError skips recordRefreshError too .-> LOOP[retry next tick — refresh backoff never armed]Reviews (1): Last reviewed commit: "fix: do not quota-backoff OAuth 403 chec..." | Re-trigger Greptile