Commit 364b762

tests/interaction: era-axis machinery for the requirements manifest (#2909)
1 parent 734746a commit 364b762

5 files changed

Lines changed: 1142 additions & 30 deletions

tests/interaction/README.md

Lines changed: 68 additions & 13 deletions
@@ -56,10 +56,12 @@ test body — each directory pins its flavour's true output exactly.
 
 Transport-agnostic tests take the `connect` fixture instead of constructing `Client(server)`
 directly, and therefore run once per transport: over the in-memory transport, over the server's
-real streamable HTTP app driven in-process through the streaming bridge, and over the legacy SSE
-transport the same way. A test connects with `async with connect(server, ...) as client:` and
-asserts the same output on every leg, because the transport is not supposed to change observable
-behaviour. Tests that are tied to one transport do not use the fixture: the wire-recording tests
+real streamable HTTP app driven in-process through the streaming bridge (in both stateful and
+stateless configurations), and over the legacy SSE transport the same way. A test connects with
+`async with connect(server, ...) as client:` and asserts the same output on every leg, because the
+transport is not supposed to change observable behaviour. Requirements that need a server-to-client
+back-channel or persisted session state are carved out of the stateless arm via `arm_exclusions`.
+Tests that are tied to one transport do not use the fixture: the wire-recording tests
 (their seam is the in-memory stream pair), the bare-`ClientSession` lifecycle tests, the
 real-clock timeout tests (the timeout machinery is transport-independent and must not race
 transport latency), and everything under `transports/`, which pins behaviour only observable on
@@ -95,6 +97,14 @@ clients can share one session manager.
 would require real-time waits the suite refuses.
 - **`transports`** names the transports a behaviour applies to; omitted means transport-independent.
 - **`issue`** carries the tracking link for a recorded gap once one is filed.
+- **`note`** carries free-form context that does not fit `divergence` or `deferred`.
+- **`added_in`** / **`removed_in`** bound the spec versions the behaviour exists in, as a half-open
+`[added_in, removed_in)` window.
+- **`supersedes`** / **`superseded_by`** link a retired entry to its replacement; the link is
+bidirectional and both ends must be versioned.
+- **`arm_exclusions`** carve specific `(transport, spec_version)` matrix cells out with a typed
+`ArmExclusionReason`.
+- **`known_failures`** mark specific `(transport, spec_version)` cells as strict xfail.
 
 Tests link themselves to the manifest with a decorator:
 
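The `supersedes` / `superseded_by` rule in the hunk above (links are bidirectional and both ends are versioned) could be checked along the following lines. This is only a minimal sketch: the `Entry` dataclass, the `supersession_errors` helper, the requirement IDs, and the `2099-01-01` version string are all hypothetical, and the suite's real check is not shown in this diff.

```python
from dataclasses import dataclass
from typing import Optional


@dataclass(frozen=True)
class Entry:
    """Hypothetical stand-in for a manifest entry; only the four link/version fields."""

    added_in: Optional[str] = None
    removed_in: Optional[str] = None
    supersedes: Optional[str] = None
    superseded_by: Optional[str] = None


def supersession_errors(manifest: dict[str, Entry]) -> list[str]:
    """Return one message per broken or unversioned supersession link."""
    errors: list[str] = []
    for req_id, entry in manifest.items():
        if entry.superseded_by is not None:
            replacement = manifest.get(entry.superseded_by)
            # The replacement must point back at the retired entry.
            if replacement is None or replacement.supersedes != req_id:
                errors.append(f"{req_id}: superseded_by is not mirrored by supersedes")
            # The retired end of the link must say when it stopped existing.
            if entry.removed_in is None:
                errors.append(f"{req_id}: retired entry lacks removed_in")
        if entry.supersedes is not None:
            retired = manifest.get(entry.supersedes)
            # The retired entry must point forward at its replacement.
            if retired is None or retired.superseded_by != req_id:
                errors.append(f"{req_id}: supersedes is not mirrored by superseded_by")
            # The replacement end of the link must say when it appeared.
            if entry.added_in is None:
                errors.append(f"{req_id}: replacement entry lacks added_in")
    return errors


# Placeholder IDs and version string, purely for illustration.
good = {
    "example/old": Entry(removed_in="2099-01-01", superseded_by="example/new"),
    "example/new": Entry(added_in="2099-01-01", supersedes="example/old"),
}
one_sided = {
    "example/old": Entry(removed_in="2099-01-01", superseded_by="example/new"),
    "example/new": Entry(added_in="2099-01-01"),
}
```

Here `supersession_errors(good)` is empty, while `one_sided` yields a single message because the replacement never points back at the entry it retires.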
@@ -126,15 +136,60 @@ This is also the triage key for any rewrite: a test that fails on the new code p
 divergence note (the rewrite accidentally fixed a known gap — decide whether to keep the fix) or
 it does not (the rewrite broke something that was correct — fix the rewrite).
 
-### When a new spec revision is released
-
-1. Update `SPEC_REVISION` and walk the new revision's changelog.
-2. For each changed interaction, find its requirements (the IDs use the wire method strings the
-changelog speaks in), re-audit the tests against the new text, and update `source` links and
-assertions where behaviour legitimately changed.
-3. New interactions get new requirements and new tests; removed interactions get their
-requirements deleted along with their tests.
-4. A behaviour that is correct under both revisions needs no change beyond the `source` link.
+### Spec versions and the era axis
+
+`SPEC_VERSIONS` in `_requirements.py` is the ordered tuple of protocol revisions the suite
+exercises. `SPEC_BASE_URL` (and `SPEC_2026_BASE_URL`) are pinned literals — not derived from
+`SPEC_VERSIONS` — so growing the active axis never repoints existing `source` links. The
+`connect` fixture fans out over `CONNECTABLE_TRANSPORTS × SPEC_VERSIONS`, but the grid is
+filtered per test:
+`pytest_generate_tests` reads the test's stacked `@requirement` marks and calls `compute_cells()`,
+which intersects the admissible cells across every cited requirement — a cell survives only if
+**all** of the test's requirements admit it.
+
+`streamable-http-stateless` is the fourth connectable transport: the 2025-era unofficial stateless
+mode where each request opens a fresh transport, no session id is issued, and there is no standalone
+GET stream. Requirements that need a server→client back-channel or persisted session state are
+excluded from that arm via `arm_exclusions` (reasons `server-initiated-request` and
+`requires-session`).
+
+What admits or excludes a cell:
+
+- **`added_in` / `removed_in`** gate which spec versions a requirement exists in, as a half-open
+`[added_in, removed_in)` window. A test runs only on versions inside every cited requirement's
+window.
+- **`arm_exclusions`** carve specific `(transport, spec_version)` cells out with a typed
+`ArmExclusionReason`. The reason vocabulary doubles as a re-admission checklist: when the gap
+closes, grep for the reason string to find every cell to re-admit.
+- **`known_failures`** keep a cell in the grid but mark it as a strict xfail — the test runs and
+must fail; an unexpected pass fails the suite.
+- **`TRANSPORT_SPEC_VERSIONS`** era-locks a transport to a subset of spec versions (currently only
+`sse` is locked to `2025-11-25`). A `(transport, version)` cell is dropped if the version is not
+in the transport's entry; transports absent from the map serve every spec version. This is the
+mechanism for cutting an entire transport off from a new revision (or admitting it).
+- **`transports`** is descriptive metadata for the non-`connect` transport-specific suites under
+`transports/` and does **not** drive cell generation. Only `arm_exclusions`, `added_in`,
+`removed_in`, and `TRANSPORT_SPEC_VERSIONS` filter the grid.
+- **`supersedes` / `superseded_by`** link a retired entry to its replacement. `test_coverage.py`
+enforces that links are bidirectional and versioned: the retired entry carries `removed_in`, the
+replacement carries `added_in`.
+
+Node IDs stay `[transport]` while `len(SPEC_VERSIONS) == 1`, so today's test IDs are
+byte-identical to before the era axis existed. They become `[transport-version]` the moment a
+second version is appended to `SPEC_VERSIONS`.
+
+When a new spec revision lands:
+
+1. Append the version string to `SPEC_VERSIONS` (and to the `SpecVersion` `Literal`).
+2. Walk the new revision's changelog.
+3. For each affected requirement: set `removed_in` on retired behaviour, add a new entry with
+`added_in` for its replacement, and link the pair with `supersedes` / `superseded_by`.
+Behaviour that survives unchanged needs nothing beyond a re-audit of its `source` URL.
+4. For requirements that cannot run on the new era's path, add an `arm_exclusions` entry with the
+appropriate `ArmExclusionReason`.
+5. Review `TRANSPORT_SPEC_VERSIONS`: any era-locked transport will not produce cells on the new
+version unless its entry is extended (or removed); add an entry for any transport the new
+revision retires.
 
 ## Writing a test
 
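As a reading aid for the "Spec versions and the era axis" text added in this hunk, the cell filtering and node-ID rule it describes might look roughly like the sketch below. Everything here is an assumption made for illustration: the `Requirement` dataclass, the shape of `arm_exclusions` (a dict keyed by cell), the transport names `in-memory` and `streamable-http`, the placeholder version `2099-01-01`, and the signatures of `admits`, `compute_cells`, and `node_id`. The real `_requirements.py` is not part of the diff shown on this page.

```python
from dataclasses import dataclass, field
from typing import Optional

# Only "2025-11-25", "sse" and "streamable-http-stateless" appear in the README;
# the second version and the other transport names are placeholders.
SPEC_VERSIONS = ("2025-11-25", "2099-01-01")
CONNECTABLE_TRANSPORTS = ("in-memory", "streamable-http", "streamable-http-stateless", "sse")
TRANSPORT_SPEC_VERSIONS = {"sse": ("2025-11-25",)}  # era lock: sse serves one version


@dataclass(frozen=True)
class Requirement:
    """Hypothetical stand-in for a manifest entry."""

    added_in: Optional[str] = None
    removed_in: Optional[str] = None
    # Assumed shape: excluded (transport, spec_version) cell -> reason string.
    arm_exclusions: dict[tuple[str, str], str] = field(default_factory=dict)


def admits(req: Requirement, transport: str, version: str) -> bool:
    """True if this single requirement allows the (transport, version) cell."""
    idx = SPEC_VERSIONS.index(version)
    # Half-open window [added_in, removed_in), ordered by position in SPEC_VERSIONS.
    if req.added_in is not None and idx < SPEC_VERSIONS.index(req.added_in):
        return False
    if req.removed_in is not None and idx >= SPEC_VERSIONS.index(req.removed_in):
        return False
    # Transports absent from the map serve every spec version.
    locked = TRANSPORT_SPEC_VERSIONS.get(transport)
    if locked is not None and version not in locked:
        return False
    return (transport, version) not in req.arm_exclusions


def compute_cells(requirements: list[Requirement]) -> list[tuple[str, str]]:
    """Keep a cell only if every cited requirement admits it."""
    return [
        (transport, version)
        for transport in CONNECTABLE_TRANSPORTS
        for version in SPEC_VERSIONS
        if all(admits(req, transport, version) for req in requirements)
    ]


def node_id(transport: str, version: str) -> str:
    """IDs gain a version suffix only once a second spec version exists."""
    return transport if len(SPEC_VERSIONS) == 1 else f"{transport}-{version}"


retired = Requirement(removed_in="2099-01-01")
needs_session = Requirement(
    arm_exclusions={("streamable-http-stateless", "2025-11-25"): "requires-session"}
)
cells = compute_cells([retired, needs_session])
```

A test citing both `retired` and `needs_session` keeps only the `2025-11-25` cells for the three transports that are not excluded, which is the "all requirements must admit the cell" intersection the README describes.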
tests/interaction/_connect.py

Lines changed: 16 additions & 4 deletions
@@ -9,6 +9,7 @@
 
 from collections.abc import AsyncIterator, Awaitable, Callable, Iterable
 from contextlib import AbstractAsyncContextManager, asynccontextmanager
+from functools import partial
 from typing import Any, Protocol
 
 import httpx
@@ -69,6 +70,7 @@ def __call__(
         message_handler: MessageHandlerFnT | None = None,
         client_info: Implementation | None = None,
         elicitation_callback: ElicitationFnT | None = None,
+        protocol_version: str = LATEST_PROTOCOL_VERSION,
     ) -> AbstractAsyncContextManager[Client]: ...
 
 
@@ -83,6 +85,7 @@ async def connect_in_memory(
     message_handler: MessageHandlerFnT | None = None,
     client_info: Implementation | None = None,
     elicitation_callback: ElicitationFnT | None = None,
+    protocol_version: str = LATEST_PROTOCOL_VERSION,
 ) -> AsyncIterator[Client]:
     """Yield a Client connected to the server over the in-memory transport."""
     async with Client(
@@ -113,13 +116,15 @@ async def connect_over_streamable_http(
     message_handler: MessageHandlerFnT | None = None,
     client_info: Implementation | None = None,
     elicitation_callback: ElicitationFnT | None = None,
+    protocol_version: str = LATEST_PROTOCOL_VERSION,
 ) -> AsyncIterator[Client]:
     """Yield a Client connected to the server's streamable HTTP app, entirely in process.
 
-    With the defaults this is the matrix leg (stateful sessions, SSE responses); the
-    transport-specific tests pass `stateless_http` or `json_response` to select the other
-    server modes, and the resumability tests pass an `event_store` (with `retry_interval=0` so
-    the client's reconnection wait is a no-op).
+    With the defaults this is the matrix leg (stateful sessions, SSE responses); the stateless
+    matrix arm binds `stateless_http=True` (see `connect_over_streamable_http_stateless`);
+    transport-specific tests pass `json_response` to select the other server mode, and the
+    resumability tests pass an `event_store` (with `retry_interval=0` so the client's
+    reconnection wait is a no-op).
     """
     app = server.streamable_http_app(
         stateless_http=stateless_http,
@@ -145,6 +150,12 @@ async def connect_over_streamable_http(
         yield client
 
 
+connect_over_streamable_http_stateless: Connect = partial(connect_over_streamable_http, stateless_http=True)
+"""The streamable-http matrix arm with the server in stateless mode (fresh transport per request,
+no session id, no standalone GET stream). The same shared Server instance backs every request --
+stateless mode does not require a server factory."""
+
+
 @asynccontextmanager
 async def mounted_app(
     server: Server | MCPServer,
@@ -326,6 +337,7 @@ async def connect_over_sse(
     message_handler: MessageHandlerFnT | None = None,
     client_info: Implementation | None = None,
     elicitation_callback: ElicitationFnT | None = None,
+    protocol_version: str = LATEST_PROTOCOL_VERSION,
 ) -> AsyncIterator[Client]:
     """Yield a Client connected to the server's legacy SSE transport, entirely in process."""
     app, _ = build_sse_app(server)
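
The stateless arm in this file is built by binding one keyword with `functools.partial` rather than by writing a second connect function. The sketch below shows that pattern in isolation, under stated assumptions: `connect_over_http` is a hypothetical stand-in that yields a plain dict instead of a real `Client`, and none of the SDK's types are used.

```python
import asyncio
from collections.abc import AsyncIterator
from contextlib import asynccontextmanager
from functools import partial


@asynccontextmanager
async def connect_over_http(server: str, *, stateless_http: bool = False) -> AsyncIterator[dict]:
    """Hypothetical connect factory: yields a description of the connection."""
    yield {"server": server, "stateless_http": stateless_http}


# Binding the keyword once gives a callable with the same call shape as the original,
# so it can sit in a transport matrix next to the unbound function.
connect_over_http_stateless = partial(connect_over_http, stateless_http=True)


async def main() -> dict:
    async with connect_over_http_stateless("demo-server") as client:
        return client


result = asyncio.run(main())
```

One property of `partial` worth keeping in mind: a keyword bound this way is only a default, so a caller that passes `stateless_http=False` explicitly overrides it.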
