Bug
When switching from a large-context model to a smaller one mid-session, the historian does not reduce the context to fit the new model before sending the prompt, resulting in "Input exceeds context window" errors.
Reproduced with: GLM-5.2 (512k context) → GPT-5.5 (272k context), with ~300k input tokens persisted on the old model.
Root cause
The model-change branch in transform.ts:502-547 detects the switch and clears lastContextPercentage / lastInputTokens to 0. This suppresses every reduction path on the same pass:
- Historian trigger (checkCompartmentTrigger): 0% < proactive floor → shouldFire=false
- 95% emergency block (transform-compartment-phase.ts:328): 0% < 95% → no block
- Overflow recovery bump (transform.ts:616-628): needsEmergencyRecovery was just cleared → no bump
The oversized prompt — sized for the old model's window — is sent to the new smaller model and rejected. Recovery only arms on the second pass, after the real overflow error fires the event handler's recordOverflowDetected. The user sees the error on the first request.
The clearing is correct for the small→large direction (where clearing is harmless — there is headroom to spare), and hook-handlers.test.ts:160 only covers that case. The large→small direction is unhandled.
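The suppression described in this section can be illustrated with a simplified in-memory model. This is a minimal sketch, not the code in transform.ts: the SessionMeta shape, the reductionPaths and clearOnModelChange helpers, and the 60% proactive floor are all assumptions made for illustration. Only the 95% threshold and the field names come from the description above.

```typescript
// Hypothetical, simplified model of the three reduction paths.
interface SessionMeta {
  lastContextPercentage: number; // 0-100, measured against the previous model
  lastInputTokens: number;
  needsEmergencyRecovery: boolean;
}

const PROACTIVE_FLOOR_PCT = 60; // assumed value; the real floor is not stated in this issue
const EMERGENCY_BLOCK_PCT = 95; // threshold named in the issue

// Reports which reduction paths would run for the given persisted state.
function reductionPaths(meta: SessionMeta) {
  return {
    historianFires: meta.lastContextPercentage >= PROACTIVE_FLOOR_PCT,
    emergencyBlock: meta.lastContextPercentage >= EMERGENCY_BLOCK_PCT,
    recoveryBump: meta.needsEmergencyRecovery,
  };
}

// Mirrors the current behaviour as described: the usage numbers and the
// recovery flag are cleared when a model change is detected.
function clearOnModelChange(meta: SessionMeta): SessionMeta {
  return {
    ...meta,
    lastContextPercentage: 0,
    lastInputTokens: 0,
    needsEmergencyRecovery: false,
  };
}

// 300k tokens on the old 512k model, about 58%, so nothing had fired yet.
const beforeSwitch: SessionMeta = {
  lastContextPercentage: 58,
  lastInputTokens: 300000,
  needsEmergencyRecovery: false,
};

const afterSwitch = clearOnModelChange(beforeSwitch);
const paths = reductionPaths(afterSwitch);
```

With the state cleared to 0, every field of paths is false, so nothing shrinks the prompt before it reaches the smaller model.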
Fix direction
In the model-change branch (transform.ts:502), before clearing, compare sessionMeta.lastInputTokens (the old model's last measured input tokens) against the new model's context limit (resolveTrustedContextLimit with lastAssistantModel). When oldInputTokens > newContextLimit, call recordOverflowDetected(db, sessionId, newContextLimit, newModelKey) to arm recovery — which makes the existing transform.ts:616-628 bump-to-95% path fire on the same pass, so historian + emergency drops run before the prompt is sent.
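A minimal sketch of that comparison, assuming a simplified in-memory SessionMeta. The onModelChange and armRecovery helpers are hypothetical; armRecovery stands in for recordOverflowDetected, whose real behaviour (persisting through db and sessionId) is not reproduced here, and the new context limit is passed in directly rather than resolved through resolveTrustedContextLimit.

```typescript
interface SessionMeta {
  lastContextPercentage: number;
  lastInputTokens: number;
  needsEmergencyRecovery: boolean;
}

// Hypothetical stand-in for recordOverflowDetected: only sets the flag.
function armRecovery(meta: SessionMeta): void {
  meta.needsEmergencyRecovery = true;
}

// Sketch of the model-change branch with the proposed check added.
function onModelChange(meta: SessionMeta, newContextLimit: number): SessionMeta {
  // Capture the old model's measurement BEFORE it is cleared.
  const oldInputTokens = meta.lastInputTokens;

  // Existing behaviour: reset usage so stale percentages are not reused.
  const next: SessionMeta = {
    lastContextPercentage: 0,
    lastInputTokens: 0,
    needsEmergencyRecovery: false,
  };

  // Proposed addition: if the persisted prompt cannot fit the new window,
  // arm recovery so the existing bump path can run on the same pass.
  if (oldInputTokens > newContextLimit) {
    armRecovery(next);
  }
  return next;
}

const oldState: SessionMeta = {
  lastContextPercentage: 58,
  lastInputTokens: 300000,
  needsEmergencyRecovery: false,
};

const largeToSmall = onModelChange(oldState, 272000); // recovery armed
const smallToLarge = onModelChange(oldState, 512000); // recovery not armed
```

In this sketch the flag is set after the reset so that the clearing step cannot wipe it. Whether the real branch needs the same ordering depends on how the clearing and recordOverflowDetected interact in transform.ts.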
Reproduction
Failing test in PR #187.
The test reproduces the scenario:
- Session at 300k input tokens on GLM-5.2 (58% of 512k — no trigger fired)
- Switch to GPT-5.5 (272k) — 300k is ~110% of the new window
- Asserts that needsEmergencyRecovery === true and that the oversized tool output is dropped
The test currently fails: needsEmergencyRecovery is false and the tool output stays active.
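The percentages in the scenario can be checked with a few lines of arithmetic. This is an illustrative sketch: contextPercentage is a hypothetical helper, and "k" is read as 1,000 tokens, which is an assumption about how the window sizes are quoted.

```typescript
// Share of the context window used by the input, in percent.
function contextPercentage(inputTokens: number, contextLimit: number): number {
  if (contextLimit <= 0) {
    throw new RangeError("contextLimit must be positive");
  }
  return (inputTokens / contextLimit) * 100;
}

const inputTokens = 300000;

// Old model: 512k window. Below any high trigger threshold.
const onOldModel = contextPercentage(inputTokens, 512000);

// New model: 272k window. Above 100%, so the prompt cannot fit.
const onNewModel = contextPercentage(inputTokens, 272000);

console.log(onOldModel.toFixed(1), onNewModel.toFixed(1));
```

The same token count that sat comfortably inside the old window exceeds the new one, which is why the check has to use the new model's limit rather than the stored percentage.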